Audio over the Short-Time Fourier Transform

William Greenwood - Contact: greenwoodw50 [at] gmail [dot] com

Due to speed requirements, I decided to rewrite this program in Rust. This can be seen here. All further developments will be made there.

Abstract

Audio over STFT uses a camera and a display to allow interaction with a sound. Sound is converted to a lossless spectrogram and displayed on a screen. It can then be captured by the relevant capture device and converted back into sound. The overall process is very slightly lossy because 16-bit integers are compressed into unsigned 8-bit integers.
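To give a feel for the size of that loss, here is a minimal sketch of a 16-bit to 8-bit round trip. It assumes a plain linear mapping, which is an illustration only and not necessarily the scaling this program uses; the function names are hypothetical.

```python
def to_uint8(sample16):
    """Map a signed 16-bit sample (-32768..32767) onto an unsigned byte (0..255)."""
    return (sample16 + 32768) >> 8


def from_uint8(byte):
    """Map a byte back to the centre of the 256-value bucket it represents."""
    return (byte << 8) - 32768 + 128


# Worst-case round-trip error over every possible 16-bit sample.
worst_error = max(abs(from_uint8(to_uint8(s)) - s) for s in range(-32768, 32768))
```

Under this assumed mapping each byte stands for 256 neighbouring sample values, so the recovered sample is never more than 128 steps away from the original.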

Technical details

Audio samples are chunked together and an FFT is performed on each chunk, which converts the samples into an array of complex numbers. Treating each complex number as a vector, its length is the amplitude of the sound and is currently mapped to the value of the colour. Its angle relative to the real axis is the phase of the sound. Phase is usually discarded in a spectrogram, but here it is mapped to the saturation of the colour.

The hue is currently unmapped but could potentially be used as error-correction data or a copy of either phase or amplitude.
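The mapping described above can be sketched roughly as follows. This is an illustration under stated assumptions rather than the program's actual code: a naive DFT stands in for the FFT so the example needs no third-party packages, magnitudes are scaled linearly against the loudest bin (the real program scales in dB), and the names dft and chunk_to_hsv are hypothetical.

```python
import cmath
import math


def dft(samples):
    """Naive O(n^2) DFT, standing in for an FFT to keep the sketch dependency-free."""
    n = len(samples)
    return [
        sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(samples))
        for k in range(n)
    ]


def chunk_to_hsv(samples):
    """Map one chunk of samples to a column of (hue, saturation, value) bytes."""
    spectrum = dft(samples)
    magnitudes = [abs(c) for c in spectrum]
    peak = max(magnitudes) or 1.0  # avoid dividing by zero on a silent chunk
    column = []
    for c, mag in zip(spectrum, magnitudes):
        value = round(255 * mag / peak)  # amplitude -> value (assumed linear scale)
        phase = cmath.phase(c)  # angle from the real axis, in [-pi, pi]
        saturation = round(255 * (phase + math.pi) / (2 * math.pi))  # phase -> saturation
        column.append((0, saturation, value))  # hue is left unmapped
    return column


# One cycle of a sine wave across an 8-sample chunk.
chunk = [math.sin(2 * math.pi * t / 8) for t in range(8)]
column = chunk_to_hsv(chunk)
```

For this test chunk all the energy lands in bin 1 (and its mirror, bin 7), so those two pixels get the full value of 255 while the rest stay dark.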

Current modes

Loop: A looped file is used and played to the default audio output.
File: Displays an audio file and records it to a wave file.
Stream: The default audio input is used and the recovered audio is written to a wave file.

Limitations

The window size of the FFT defines only the lower limit of the recoverable frequency. I have found that the smaller the window size (and thus the width of the displayed spectrogram), the better the recovered audio quality. Unfortunately, after graphing processing time against spectrogram width, it appears that processing time grows exponentially as the width shrinks. Therefore, the sample frequency (lowering which dramatically reduces the processing time) and the spectrogram height have both been tuned to allow us to increase the window width as much as possible.
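As a rough worked example of how the window size sets the lower frequency limit: for a window of N samples at sample rate fs, the FFT bins are spaced fs / N apart, so the lowest non-zero frequency it can resolve is fs / N, while the upper limit (fs / 2) depends only on the sample rate. The numbers below are hypothetical and are not this program's settings.

```python
def frequency_limits(sample_rate_hz, window_size):
    """Return (lowest resolvable frequency, highest representable frequency) in Hz."""
    bin_spacing = sample_rate_hz / window_size  # also the lowest non-DC bin
    nyquist = sample_rate_hz / 2  # set by the sample rate alone
    return bin_spacing, nyquist


# Hypothetical settings: 48 kHz audio with a 1024-sample window.
lowest, highest = frequency_limits(48000, 1024)
```

Halving the window to 512 samples would double the lowest resolvable frequency while leaving the upper limit unchanged.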

Due to this bottleneck, the following things would increase the quality of the recovered data:

Increased computational power
Increased capture quality
Further optimization
Re-writing in a faster language

Current optimization

After profiling the code, steps have been taken to optimize it. Primarily, the result from the FFT (power) is not measured in dB and therefore has to be scaled logarithmically. To convert it to dB, the formula 20log₁₀(power) is used. This (and its respective inverse) is inefficient. As a replacement for the inverse, a lookup table has been used.
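A lookup table works for the inverse because, once the dB value has been squeezed into a byte, only 256 inputs are possible. The sketch below shows the idea under assumptions of my own: MAX_DB, the clamping at 1.0 and the function names are hypothetical and are not taken from the program.

```python
import math

MAX_DB = 96.0  # hypothetical ceiling, roughly the dynamic range of 16-bit audio


def to_byte(magnitude):
    """Forward path: magnitude -> dB via 20*log10 -> byte in 0..255."""
    db = 20 * math.log10(max(magnitude, 1.0))  # clamp so log10 never sees zero
    return min(255, round(255 * db / MAX_DB))


# Inverse path: compute 10 ** (dB / 20) once for each of the 256 possible bytes.
INVERSE_TABLE = [10 ** ((b * MAX_DB / 255) / 20) for b in range(256)]


def from_byte(byte):
    """Recover an approximate magnitude with a single list index, no pow() call."""
    return INVERSE_TABLE[byte]


recovered = from_byte(to_byte(1000.0))
```

With these assumed constants one byte step is about 0.38 dB, so a recovered magnitude lands within a few percent of the original.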

In addition, if the window width is over 1800, I have deemed it more efficient to use a log approximation based on arctan. This is only very loosely accurate, but that does not affect our application much, as the approximation is immediately reversed.

description	Using a camera pointed at a spectrogram as an intermediate stage of processing audio to allow interaction with the sound.
last change	Wed, 13 Nov 2024 20:33:45 +0000 (20:33 +0000)

2024-11-13	will	tidying up main
2024-09-10	will	Changed loop.py to chache spectrograms
2024-09-09	will	Added .venv
2024-09-03	will	Various small changes
2024-09-01	will	Fixed buzzing
2024-08-26	will	Deleted unecaccary files
2024-08-26	will	Added .gitignore *.pyc
2024-08-26	will	Removed README
2024-08-26	will	Resolution and waitKey changes
2024-08-25	will	Optimixation and notation
2024-08-24	will	Added .gitignore
2024-08-24	will	Fixed lookup table system
2024-08-18	will	Fixed camera buffer issue and started on lookup
2024-08-05	will	Added README.html
2024-08-05	will	- Added VideoCapture thread class to continuously captu...
2024-07-31	will	uhhh
...