I think the problem is in the parameters that you pass to sf.read(). You are specifying things like the number of channels or the sampling frequency of the file you are reading, but you should not specify those, because the library determines them from the file itself. In fact, the error message tells you exactly that:
TypeError: Not allowed for existing files (except 'RAW'): samplerate, channels, format, subtype, endian
That is, you cannot specify samplerate, channels, format, subtype or endian when opening an existing file, unless the file is of type RAW (in that case the information is not stored in the file and cannot be detected, so you have to supply it yourself).
I was able to read this sample FLAC file without problems using the following line:
data, samplerate = sf.read('2L-125_stereo-44k-16b_04.flac')
Now, what you get in data is no longer a flat list of values (the samples), but a two-dimensional array: specifically, N rows by two columns, where N is the number of samples in the file. The first column contains the audio of the left channel and the second column contains the audio of the right channel.
Since the result of sf.read() is a numpy array, you can check how many dimensions it has, and the number of rows and columns, with:
>>> data.shape
(4311216, 2)
In my case I have 4311216 rows (samples) and the 2 columns mentioned.
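As a minimal sketch of the shape check above: the helper name describe_audio is hypothetical (it is not part of soundfile), and the arrays are synthetic stand-ins for what sf.read() returns. Note that, by default, sf.read() returns a one-dimensional array for a mono file, so the sketch handles both cases.

```python
import numpy as np


def describe_audio(data):
    """Return (n_frames, n_channels) for an array shaped like sf.read() output.

    Hypothetical helper for illustration; not part of soundfile.
    """
    data = np.asarray(data)
    if data.ndim == 1:
        # Mono files come back as a flat array of samples by default.
        return data.shape[0], 1
    if data.ndim == 2:
        # Multi-channel files: one row per frame, one column per channel.
        return data.shape[0], data.shape[1]
    raise ValueError("expected a 1-D or 2-D array of audio samples")


# Synthetic stand-ins for decoded audio (silence), not real file contents.
stereo = np.zeros((1000, 2))
mono = np.zeros(1000)

print(describe_audio(stereo))
print(describe_audio(mono))
```

If you prefer to always get a two-dimensional array, sf.read() also accepts always_2d=True.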
You cannot compute the spectrogram of a two-dimensional signal directly, but you have alternatives:
Make the spectrogram of a single channel (left or right)
Combine the two channels with some formula so that you get a one-dimensional result. For example, for each pair of samples (left, right) do their average: (left + right) / 2
Let's see examples of code and results in each case:
Separate spectrograms for each channel
It is enough to use numpy's slice notation to keep column 0 or column 1 of the data:
y = data[:, 0]
Pxx, freqs, bins, im = plt.specgram(y, NFFT=512, Fs=samplerate, cmap="inferno")
# ...
plt.show()
[Figure: spectrogram of the left channel]
y = data[:, 1]
Pxx, freqs, bins, im = plt.specgram(y, NFFT=512, Fs=samplerate, cmap="inferno")
# ...
plt.show()
[Figure: spectrogram of the right channel]
Merge the two channels
It suffices to use numpy's sum() function, specifying axis=1 so that the values are added across each row (left + right) rather than down each column. The result is a one-dimensional array with the sum of the channels. We divide it by two and we are done:
y = data.sum(axis=1)/2
Pxx, freqs, bins, im = plt.specgram(y, NFFT=512, Fs=samplerate, cmap="inferno")
# ...
plt.show()
[Figure: spectrogram of the signal converted to "mono"]
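The merging step can also be wrapped in a small function. This is only a sketch: the name to_mono is hypothetical, and it uses mean(axis=1), which for two channels gives the same result as sum(axis=1)/2 and also works for files with more channels. Input that is already one-dimensional is returned unchanged.

```python
import numpy as np


def to_mono(data):
    """Average all channels of an (n_frames, n_channels) array into one.

    Hypothetical helper for illustration; for stereo input this is
    equivalent to data.sum(axis=1) / 2.
    """
    data = np.asarray(data, dtype=float)
    if data.ndim == 1:
        # Already mono: nothing to merge.
        return data
    # Average across the columns of each row (one row per frame).
    return data.mean(axis=1)


# Tiny synthetic example: two frames of (left, right) samples.
frames = np.array([[1.0, 3.0],
                   [0.0, -2.0]])
y = to_mono(frames)
print(y.shape)
```

The resulting y can be passed to plt.specgram() exactly as in the code above.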
Regarding reading an .mp3: unfortunately soundfile may not be able to do it (support depends on the version of the underlying libsndfile library), and historically it has been difficult to find open source libraries that can, because MP3 was covered by patents. If your installation cannot read it, the recommendation is to use a converter (outside Python) to convert the file to a format that soundfile can load, for example ogg. You can have your Python program call the external application to do the conversion, using os.system() for example.
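A minimal sketch of calling an external converter from Python follows. It assumes the ffmpeg command-line tool is installed and on your PATH (any converter would do; ffmpeg is my assumption, not something the question mentions), and it uses subprocess.run() instead of os.system() so that file names are passed as a list of arguments without shell quoting problems. The function names are hypothetical.

```python
import subprocess


def build_convert_command(src, dst, converter="ffmpeg"):
    """Build the argument list for the external converter.

    Assumes an ffmpeg-style command line: "-y" overwrites the output
    file if it exists and "-i" names the input file. The output format
    is chosen by ffmpeg from the extension of dst.
    """
    return [converter, "-y", "-i", src, dst]


def convert_audio(src, dst, converter="ffmpeg"):
    """Run the converter and return the output path.

    check=True raises CalledProcessError if the converter fails.
    """
    subprocess.run(build_convert_command(src, dst, converter), check=True)
    return dst


# Hypothetical usage (requires ffmpeg and an existing song.mp3):
# ogg_path = convert_audio("song.mp3", "song.ogg")
# data, samplerate = sf.read(ogg_path)
```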