Well, the short answer is .093 seconds. That’s about the shortest amount of time mathematicians need to generate a full analysis of a sound’s component frequencies.

On an even smaller scale, computers typically store sound at 44,100 samples per second. These samples make up the typical waveform view of sound that most of us are accustomed to seeing. However, each sample only gives information about amplitude (or volume), which is a pale portrait of sound. Sound in the physical world is essentially an unfolding of waves over time. Therefore, when translating from physical to digital, frequency information over time is essential to give a meaningful atomic definition of any sound.
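One plausible reading of the .093-second figure, sketched below: a 4096-sample analysis window (a common FFT size) at 44,100 samples per second lasts almost exactly .093 seconds. The window size here is my assumption, not something the article states.

```python
# Assumption: the .093 s figure corresponds to a 4096-sample analysis
# window (a common FFT size) at the standard 44,100 Hz sample rate.
SAMPLE_RATE = 44100  # samples per second (CD-quality audio)
WINDOW_SIZE = 4096   # samples per analysis window

duration = WINDOW_SIZE / SAMPLE_RATE
print(f"{duration:.3f} seconds")  # prints "0.093 seconds"
```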

Comments

I've been doing spectral music for a pretty long time and have always had reservations about FFTs - they're kind of a solution, but not to the problem. I usually use DFTs or variants thereof (like Goertzel's algorithm) to watch for specific frequencies in a more flexible way.
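For readers curious about the Goertzel algorithm the commenter mentions, here is a minimal sketch in plain Python: it measures the power of a signal at one specific frequency without computing a full FFT. The function name and the test tone are mine, not the commenter's.

```python
import math

def goertzel_power(samples, sample_rate, target_freq):
    """Return the power of `samples` at `target_freq` via the Goertzel algorithm."""
    n = len(samples)
    k = round(n * target_freq / sample_rate)  # nearest DFT bin to the target
    w = 2 * math.pi * k / n
    coeff = 2 * math.cos(w)
    s_prev, s_prev2 = 0.0, 0.0
    for x in samples:              # one multiply-add per sample
        s = x + coeff * s_prev - s_prev2
        s_prev2, s_prev = s_prev, s
    # Squared magnitude of the DFT bin, from the final two filter states
    return s_prev2 ** 2 + s_prev ** 2 - coeff * s_prev * s_prev2

# Usage: a pure 1 kHz tone shows far more power at 1 kHz than at 3 kHz.
sr = 8000
tone = [math.sin(2 * math.pi * 1000 * t / sr) for t in range(200)]
print(goertzel_power(tone, sr, 1000) > goertzel_power(tone, sr, 3000))  # prints "True"
```

Because it watches a single bin, it is handy for "listening" for one expected frequency, which is exactly the chirp-decoding use the commenter describes.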

Folks interested in spectral synthesis and music making should check out my iPhone/iPad apps: Tondo, real-time spectral finger painting; Droneo, for precisely setting musical intervals; and even Banshee, a messaging app that turns text into modem-like chirps that I decode with the aforementioned Goertzel's algorithm. I'm right now working on a polyphonic real-time multitouch granular instrument, which is coming along very nicely. AND I'm also a staffer at WFMU, so you Rhizome readers should probably tune in some time!

Pleasantly receptive while listening to and seeing .093's compact, tiered, dense and very interesting profile.

The use of FFTs for anything relating to audio is quite inappropriate, because Fourier transforms work on a linear frequency scale rather than a log scale. We perceive sound on a log scale, so for example we discern the same number of tones between 40 Hz and 80 Hz as between 10 kHz and 20 kHz (each range is one octave), which might seem counterintuitive because the first range spans only 40 Hz while the second spans 10,000 Hz.

Our ears can discern approximately 4,000 specific frequencies, spaced evenly on a logarithmic scale between roughly 20 Hz and 20 kHz. With about 10 octaves in the range of human hearing, that works out to about 400 discernible frequencies per octave. The FFT, however, knows nothing about octaves and instead uses a fixed linear spacing. A 16K-point FFT (or DFT), for example, gives a resolution of about 1.25 Hz, which in the 40-80 Hz octave yields only about 32 points (whereas our ears can discern 400 points in that octave). The resolution of Fourier transforms is therefore extremely poor in the lower octaves, and the processing overall is very inefficient for audio purposes.
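Taking the commenter's ~1.25 Hz bin spacing at face value, a quick sketch of how unevenly linear FFT bins cover octaves:

```python
# Assumption: ~1.25 Hz per bin, the spacing the comment attributes to a
# 16K-point transform (the exact value depends on the sample rate).
bin_spacing = 1.25  # Hz per FFT bin

for low in (40, 80, 160, 5000, 10000):
    high = 2 * low                     # one octave up from `low`
    bins = (high - low) / bin_spacing  # linear bins landing in that octave
    print(f"{low:>6}-{high:<6} Hz: {bins:>6.0f} bins")
```

The bottom octave (40-80 Hz) gets 32 bins while the top octave (10-20 kHz) gets 8,000, even though both are perceptually the same distance.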

Wavelet Transforms or other transforms which operate on a log scale would be much better choices.
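A minimal sketch of the log-spaced alternative (the constant-Q idea behind wavelet-style analysis), assuming a hypothetical 12 bands per octave over the range of human hearing:

```python
import math

# Assumption: 12 bands per octave (a semitone grid) over 20 Hz - 20 kHz.
BANDS_PER_OCTAVE = 12
f_min, f_max = 20.0, 20000.0

n_octaves = math.log2(f_max / f_min)            # ~10 octaves of hearing range
n_bands = int(n_octaves * BANDS_PER_OCTAVE)
# Center frequencies spaced by a constant ratio, not a constant Hz offset
centers = [f_min * 2 ** (i / BANDS_PER_OCTAVE) for i in range(n_bands + 1)]

print(f"{n_bands + 1} bands, ratio between neighbors: {centers[1] / centers[0]:.4f}")
# prints "120 bands, ratio between neighbors: 1.0595"
```

Every band spans the same musical interval, so the low octaves get the same density of analysis as the high ones, which is the property the comment argues linear FFT bins lack.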