[EN] Play a WAV File with the ESP32

This article uses the ESP32 microcontroller's DAC with MicroPython to open a WAV audio file and send its samples to the DAC, which is connected to a speaker as shown in Figure 1. The file must be uncompressed 8-bit mono PCM audio. The sample program targets sampling rates of about 50 kHz, such as 44,100 Hz.

Figure 1 The board used to test the example in this article

WAV file structure

A WAV file starts with a header that identifies the file type and size, followed by a fmt chunk that describes the audio format and a data chunk that holds the samples. The fields of the fmt chunk, followed by the size field of the data chunk, are:

Size (bytes)  Field
4             'fmt '
4             fmt size
2             Audio type
2             Number of channels
4             Sample rate
4             Byte rate
2             Block align
2             Bits per sample
4             Size of data

Reading the file starts with the first 4 bytes, which must be 'RIFF'. If they match, the program reads the 4-byte file size and then the next 4 bytes, which must be 'WAVE'. If both markers match, the file has a WAV header.

The next 4 bytes are the text 'fmt ', which marks the start of the audio format description. The final section, 'data', holds the audio samples that are read and written to the DAC.
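The header checks described above can be sketched in plain Python with the struct module. This is a minimal sketch, not the article's own code: the names parse_wav_header and make_header are hypothetical, and it assumes the simplest layout, in which the 'fmt ' chunk directly follows 'WAVE' and the 'data' chunk directly follows 'fmt ' (44 bytes in total). Real files may carry extra chunks that this sketch does not handle.

```python
import struct

def parse_wav_header(raw):
    """Parse a 44-byte WAV header (assumed layout: RIFF, fmt, data)."""
    if len(raw) < 44:
        raise ValueError("header too short")
    # Bytes 0-11: 'RIFF', file size field, 'WAVE' (little-endian)
    riff, file_size, wave = struct.unpack("<4sI4s", raw[0:12])
    if riff != b"RIFF" or wave != b"WAVE":
        raise ValueError("not a RIFF/WAVE file")
    # Bytes 12-35: the fmt chunk fields listed in the table
    (fmt_id, fmt_size, audio_format, channels, sample_rate,
     byte_rate, block_align, bits) = struct.unpack("<4sIHHIIHH", raw[12:36])
    if fmt_id != b"fmt ":
        raise ValueError("missing 'fmt ' chunk")
    # Bytes 36-43: 'data' marker and the size of the audio data
    data_id, data_size = struct.unpack("<4sI", raw[36:44])
    if data_id != b"data":
        raise ValueError("missing 'data' chunk")
    return {
        "file_size": file_size,
        "audio_format": audio_format,
        "channels": channels,
        "sample_rate": sample_rate,
        "byte_rate": byte_rate,
        "block_align": block_align,
        "bits_per_sample": bits,
        "data_size": data_size,
    }

def make_header(sample_rate, num_samples):
    """Build a synthetic 8-bit mono PCM header for demonstration only."""
    return (struct.pack("<4sI4s", b"RIFF", 36 + num_samples, b"WAVE")
            + struct.pack("<4sIHHIIHH", b"fmt ", 16, 1, 1,
                          sample_rate, sample_rate, 1, 8)
            + struct.pack("<4sI", b"data", num_samples))

info = parse_wav_header(make_header(44100, 1000))
print(info)
```

Raising ValueError on a bad marker stops parsing at the first mismatch, so later fields are never read from a file that is not a WAV file.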

Example code

The example program plays the audio files mono.wav and mono2.wav. The reader must first upload both files to the microcontroller board, as shown in Figure 2.

Figure 2 Files on the ESP32

The MicroPython example code is as follows.

import time
import sys
from machine import DAC, Pin, freq
import gc


dacPin1 = Pin(25) # connected to the speaker
dacPin2 = Pin(26) # connected to adcPin1

dac1 = DAC( dacPin1 )
dac2 = DAC( dacPin2 )

def playWavFile( fName ):
    monoFile = open(fName,"rb")
    mark = monoFile.read(4)
    if (mark != b'RIFF'):
        print("Not a WAV file!")
        monoFile.close()
        return
    fileSize = int.from_bytes(monoFile.read(4),"little")
    print("File size = {} bytes".format(fileSize))
    fileType = monoFile.read(4)
    if (fileType != b'WAVE'):
        print("Not a WAV file!!")
        monoFile.close()
        return

    chunk = monoFile.read(4)
    lengthFormat = 0
    audioFormat = 0
    numChannels = 0
    sampleRate = 0
    byteRate = 0
    blockAlign = 0

    if (chunk == b'fmt '):
        lengthFormat = int.from_bytes(monoFile.read(4),"little")
        audioFormat = int.from_bytes(monoFile.read(2),"little") 
        numChannels = int.from_bytes(monoFile.read(2),"little")
        sampleRate = int.from_bytes(monoFile.read(4),"little")
        byteRate = int.from_bytes(monoFile.read(4),"little") 
        blockAlign = int.from_bytes(monoFile.read(2),"little") 
        bitsPerSample = int.from_bytes(monoFile.read(2),"little")
        print("Length of format data = {}".format(lengthFormat))
        print("Audio's format = {}".format(audioFormat))
        print("Number of channel(s) = {}".format(numChannels))
        print("Sample rate = {}".format(sampleRate))
        print("Byte rate = {}".format(byteRate))
        print("Block align = {}".format(blockAlign))
        print("Bits per sample = {}".format(bitsPerSample))
        minValue = 255
        maxValue = 0
        chunk = monoFile.read(4)
        if (chunk != b'data'):
            print("Not a WAV file!!!!")
            monoFile.close()
            return
        dataSize = int.from_bytes(monoFile.read(4),"little")
        print("Data size = {}".format(dataSize))
        if (bitsPerSample > 8):
            print("Data wider than 8 bits is not supported")
            monoFile.close()
            return
        buffer = monoFile.read(dataSize)
        # find min/max
        for i in range(len(buffer)):
            if (buffer[i] > maxValue):
                maxValue = buffer[i]
            if (buffer[i]<minValue):
                minValue = buffer[i]
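        # normalize (guard against a constant signal to avoid dividing by zero)
        xScale = 255.0/(maxValue-minValue) if maxValue > minValue else 0.0
        # play
        tm = int(1000000/sampleRate)  # approximate sample period in microseconds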
        for i in range(len(buffer)):
            data = int(((buffer[i]-minValue)*xScale))
            dac1.write( data )  
    if (audioFormat != 1):
        print("Non-PCM audio is not supported!!!")
    dac1.write( 0 )
    monoFile.close()
############### main program

While reading the audio data, the code finds the minimum and maximum sample values and uses them to scale the data so that the minimum maps to 0 and the maximum to 255. Each output value is the sample minus the minimum, multiplied by xScale, which places it in the range 0 to 255.
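The scaling step can be tried on its own, away from the board. The sketch below is an illustration under assumptions rather than the article's code: normalize_8bit is a hypothetical helper that applies the same minimum/maximum scaling to a bytes object, and it returns all zeros for a constant signal to avoid dividing by zero.

```python
def normalize_8bit(samples):
    """Stretch 8-bit samples so the minimum maps to 0 and the maximum to 255."""
    if not samples:
        return b""
    lo = min(samples)
    hi = max(samples)
    if hi == lo:
        # Constant signal: nothing to stretch, so output silence.
        return bytes(len(samples))
    scale = 255.0 / (hi - lo)
    # int() truncates toward zero, matching the int(...) call in the listing.
    return bytes(int((s - lo) * scale) for s in samples)

# Sample values chosen so that the scale factor is exactly 5.0.
quiet = bytes([100, 117, 151])
print(list(normalize_8bit(quiet)))
```

Because int() truncates, a scale factor that is not exactly representable as a float can map the maximum sample to 254 instead of 255. For audio output that small error is usually not audible.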

In addition, the variable tm holds an approximate per-sample delay in microseconds, obtained by dividing 1,000,000 by the sample rate. Delaying by this amount between samples brings playback closer to the original timing. In the listing above, tm is only computed; the playback loop does not pass it to a delay call, so the actual playback rate depends on how fast the loop runs.
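The delay calculation can be checked with a few common sample rates. This is a small sketch with hypothetical helper names (sample_period_us, effective_rate_hz). It only computes the values and does not call a delay function, so it also runs on desktop Python. On MicroPython, a value like this would typically be passed to time.sleep_us.

```python
def sample_period_us(sample_rate):
    """Whole microseconds per sample, truncated as in the listing."""
    return int(1000000 / sample_rate)

def effective_rate_hz(period_us):
    """Sample rate that results from waiting period_us between samples."""
    return 1000000 / period_us

for rate in (8000, 22050, 44100):
    period = sample_period_us(rate)
    print(rate, "Hz ->", period, "us ->", round(effective_rate_hz(period)), "Hz")
```

Truncating to whole microseconds makes the period slightly too short: at 44,100 Hz the result is 22 us, which corresponds to roughly 45.5 kHz before any loop overhead is counted.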

When the program runs, it prints the header information of the opened file, as shown in Figure 3, and sends the audio data to the DAC.

Figure 3 Example of the program's output


After this article, readers should be able to open a WAV file and read its data correctly. However, only mono, uncompressed 8-bit files can be sent to the DAC with this program. We hope readers will go on to improve it. Finally, have fun programming.

(C) 2020-2022, By Jarut Busarathid and Danai Jedsadathitikul
Updated 2022-02-05