Time Resolution versus Frequency Resolution

Created by Chris Tsanjoures, Modified on Tue, 14 Oct at 4:52 PM by Chris Tsanjoures

A key trade-off when working with discrete Fourier transforms (DFT or FFT) is the inverse relationship between time resolution and frequency resolution – as one gets better the other gets worse. Both are a function of the “time constant” (also called the “time window”) of the measurement. The time constant is simply the time that it takes to record enough samples for a DFT of a given size, at a given sampling rate. Longer time windows provide tighter, more detailed frequency resolution (often more than we want at high frequencies) but at the expense of less detailed time resolution.

Time resolution might be the least of your worries if you are doing a long-term average of a signal or a steady-state measurement of a sound system using a statistically random signal such as pink noise. It can, however, be an important factor when analyzing a dynamic signal such as speech or music, where you may need to see features of the signal that are very closely spaced in time as separate events. For example, if two drum beats occur within the time constant of a single FFT, the resulting spectrum in the frequency domain combines the energy from both into a single value at each frequency. If you needed to see each beat as a separate event, you would need to shorten the time window, which would result in more widely spaced frequency bins.

You can calculate the time constant for an FFT (in seconds) by dividing the FFT size in samples by the sampling rate used to record the time-domain signal (in samples per second). For example, the default FFT size for spectrum measurements in Smaart is 16K (16384) samples. A 16K FFT recorded at 48000 samples/second has a time constant of 0.341 seconds (16384/48000), or 341 milliseconds.
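The arithmetic in the example above can be checked with a few lines of Python. This is only a minimal sketch; the function name fft_time_constant is a hypothetical helper chosen for illustration and is not part of Smaart.

```python
def fft_time_constant(fft_size: int, sample_rate: float) -> float:
    """Return the FFT time window in seconds.

    Time constant = FFT size (samples) / sampling rate (samples per second).
    The function name is illustrative only.
    """
    if fft_size <= 0 or sample_rate <= 0:
        raise ValueError("fft_size and sample_rate must be positive")
    return fft_size / sample_rate


# The 16K FFT at 48 kHz used as the example in the text.
t = fft_time_constant(16384, 48000)
print(f"{t:.3f} s ({t * 1000:.0f} ms)")  # 0.341 s (341 ms)
```

The same function applied to a 44.1 kHz recording gives a slightly longer window (about 372 ms), since the same number of samples takes longer to collect at a lower sampling rate.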

Low frequencies have longer cycle times than high frequencies of course – that’s what makes them low frequencies – so it makes sense that you have to look at a signal over a longer period of time to resolve them. In fact, the lowest frequency that an FFT (or any other kind of DFT) can clearly “see” is 1/T, where T is the FFT time constant in seconds. Using the example of a 16K FFT at 48k sample rate, frequency resolution in that case works out to 2.93 Hz (1/0.341).

If you are familiar with the reciprocal relationship between cycle time and frequency in sine waves (f = 1/t and t = 1/f), you may have spotted the fact that it echoes the relationship between time constant and frequency resolution in an FFT. In fact, the frequency resolution of an FFT is equal to the frequency of a sinewave that cycles exactly once within the FFT time window. All other frequency bins are at integer multiples (harmonics) of that fundamental frequency, and so knowing the time constant also tells you how far apart the frequency bins are.
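As a rough illustration of that relationship, the sketch below computes the bin spacing as 1/T (equivalently, sampling rate divided by FFT size) and lists the first few bin frequencies as integer multiples of it. The function names are hypothetical and chosen only for this example; bin 0 corresponds to 0 Hz (DC).

```python
def frequency_resolution(fft_size: int, sample_rate: float) -> float:
    """Bin spacing in Hz: 1 / T, where T = fft_size / sample_rate."""
    return sample_rate / fft_size


def bin_frequencies(fft_size: int, sample_rate: float, count: int) -> list[float]:
    """Center frequencies of the first `count` bins (bin 0 is DC)."""
    df = frequency_resolution(fft_size, sample_rate)
    return [k * df for k in range(count)]


df = frequency_resolution(16384, 48000)
print(f"{df:.2f} Hz")                     # 2.93 Hz
print(bin_frequencies(16384, 48000, 4))   # [0.0, 2.9296875, 5.859375, 8.7890625]
```

Because every bin sits at a whole-number multiple of the same spacing, the bins are evenly spaced in Hz across the entire frequency range.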

FFT Frequency Resolution shown on a logarithmic frequency scale. Each doubling of FFT size (in samples) halves the spacing between frequency bins, doubling the FFT frequency resolution, and extends its frequency range an octave lower.

In practical terms, given a sampling rate of 44.1k or 48k, Smaart’s 16K default FFT size for spectrum measurements provides very good low-frequency resolution down to the lower reaches of subwoofer frequency ranges, and much greater time resolution than you need for analysis of signals such as pink noise. As regards more dynamic signals such as speech or music, if we recorded 16K FFTs end-to-end for a full minute at 48k sample rate, that works out to just about 176 discrete frames per minute (60 / 0.341 ≈ 176). That would tend to meet or exceed the average tempo for most musical genres, meaning that it provides enough time resolution to see the spectral content of individual notes.

In terms of speech analysis, typical speaking rates for native English speakers range from about 140 to 180 words per minute, or about 200 to 300 syllables per minute, so a 16K FFT can get you words but not syllables. Dropping the FFT size to 8K would double the time resolution to about 352 frames per minute – enough to keep up with insanely fast music or distinguish individual syllables at typical rates of speaking – but does so at the expense of some loss of detail at low frequencies.
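The frame-rate figures above follow directly from the time constant. The sketch below assumes FFT frames recorded end-to-end with no overlap, as in the text; frames_per_minute is a hypothetical helper, and the 300 syllables-per-minute figure is simply the upper end of the range quoted above.

```python
def frames_per_minute(fft_size: int, sample_rate: float) -> float:
    """Number of back-to-back (non-overlapping) FFT frames in 60 seconds."""
    return 60.0 * sample_rate / fft_size


SYLLABLES_PER_MINUTE = 300  # upper end of the typical range quoted in the text

for size in (16384, 8192):
    fpm = frames_per_minute(size, 48000)
    keeps_up = fpm >= SYLLABLES_PER_MINUTE
    print(f"{size:>5} samples: {fpm:.0f} frames/min, "
          f"resolves syllables: {keeps_up}")
# 16384 samples: 176 frames/min, resolves syllables: False
#  8192 samples: 352 frames/min, resolves syllables: True
```

Halving the FFT size doubles the frame rate but also doubles the bin spacing, from about 2.93 Hz to about 5.86 Hz at a 48k sample rate.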

A couple of other trade-offs associated with the length of a DFT or FFT are the computational cost, which grows faster than the transform size itself (roughly in proportion to N log N for an FFT of N samples, and N² for a directly computed DFT), and the issue of excess frequency resolution at high frequencies when linearly spaced DFT data is plotted on a logarithmic frequency scale. In RTA measurements, the use of fractional octave banding effectively nullifies the excess high-frequency resolution issue, and even lower-end computers these days can perform real-time analysis using FFT sizes of 16K or even 32K with relative ease.
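To put a number on the excess high-frequency resolution, the sketch below counts how many linearly spaced FFT bins fall inside a given octave. The helper name bins_in_octave and the two octaves chosen are assumptions made for illustration only.

```python
import math


def bins_in_octave(f_low: float, fft_size: int, sample_rate: float) -> int:
    """Count FFT bins with frequencies in the octave [f_low, 2 * f_low)."""
    df = sample_rate / fft_size            # bin spacing in Hz
    first = math.ceil(f_low / df)          # first bin at or above f_low
    end = math.ceil(2 * f_low / df)        # first bin at or above 2 * f_low
    return end - first


# 16K FFT at 48 kHz: a low octave versus a high octave.
print(bins_in_octave(31.25, 16384, 48000))   # 11
print(bins_in_octave(8000, 16384, 48000))    # 2731
```

Each octave is the same width on a logarithmic frequency axis, yet the high octave here contains a few hundred times as many bins as the low one, which is why banding or smoothing is useful at the top end.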

In transfer function measurements, where computational costs are a bigger problem in general, Smaart’s multi-time-window (MTW) feature attempts to sidestep both problems by using a series of small FFTs at progressively lower sampling rates to deliver approximately 1 Hz resolution at low frequencies without incurring excessively high resolution in the upper octaves. Smoothing the transfer function also helps to clean up excess resolution at high frequencies, and it works for both MTW and measurements that use just a single FFT size.
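The general idea of pairing a small FFT with progressively lower sampling rates can be sketched numerically. The values below (a 1024-sample FFT and a sampling rate halved at each stage) are assumptions picked purely for illustration; they are not Smaart's actual MTW parameters, which this article does not specify.

```python
def decimated_resolutions(fft_size: int, sample_rate: float, stages: int):
    """For each stage, halve the sampling rate and report
    (rate in Hz, bin spacing in Hz, time window in seconds).

    Illustrative only: stage count, FFT size, and the halving
    scheme are assumptions, not a description of Smaart's MTW.
    """
    results = []
    for stage in range(stages):
        rate = sample_rate / (2 ** stage)
        results.append((rate, rate / fft_size, fft_size / rate))
    return results


for rate, df, window in decimated_resolutions(1024, 48000, 7):
    print(f"{rate:8.0f} Hz rate: {df:7.3f} Hz bins, {window * 1000:7.1f} ms window")
```

With these assumed numbers, the same small FFT gives bins about 47 Hz apart at the full sampling rate and under 1 Hz apart after six halvings, at the cost of a much longer time window at the lowest stage.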