STI, STIPA, and Smaart

Created by Jake Bedard, Modified on Mon, 7 Oct, 2024 at 10:19 AM by Jake Bedard

In almost all situations where speech is transmitted between a talker and a listener, the speech is degraded somewhat by the transmission channel and becomes less intelligible to the listener. Contributing factors include:

Signal-to-noise ratio
Psychoacoustic masking effects
Sound Pressure Level
Ambient noise level
Reverberation time (RT60)
Reflections
Loudspeaker frequency response/orientation
Distortion
Background noise
Attenuation over distance/due to obstacles

For Public Address & General Alarm (PAGA), Public Address (PA), Voice Alarm (VA), and other alarm systems, it is critical that messages are not only audible but also intelligible, so that important instructions (for instance, directions to the nearest exit in an emergency) can be understood. This is a primary concern not only in potentially hazardous areas such as industrial sites, factories, and oil/gas platforms, but also in any area with large crowds, such as stadiums, train stations, airports, shopping centers, and hospitals.

If emergency announcements are unintelligible or easily misunderstood due to poor system quality, tragic consequences may result. Therefore, it is essential to properly design and install these systems and to verify their speech intelligibility. System designers and installers may also be subject to legal or contractual requirements for specific speech intelligibility levels.

Standards

The ISO 7240-16 and ISO 7240-19 standards require verification of electroacoustic sound systems used for emergency purposes: a measurable minimum level of speech intelligibility must be demonstrated under realistic conditions. From a regulatory point of view, therefore, speech intelligibility is not a subjective judgment; it can be verified using several methods of varying complexity that are standardized in IEC 60268-16.

The two main variables that affect intelligibility are the signal-to-noise ratio (SNR) and the reverberation time (RT60): good intelligibility requires a high SNR and low reverberation. Ideally, the SNR should be above 15 dB, with the SPL of the message at the listening position between 60 and 80 dBA.
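As a rough illustration of why both variables matter, the sketch below uses two textbook closed-form approximations that are not taken from this article: the modulation reduction caused by steady noise at a given SNR, and the reduction caused by an ideal exponential decay with a given RT60. The function names and example values are hypothetical, and real rooms will deviate from these idealized curves.

```python
import math

def mtf_noise(snr_db: float) -> float:
    """Modulation reduction factor for steady noise at the given SNR (dB)."""
    return 1.0 / (1.0 + 10.0 ** (-snr_db / 10.0))

def mtf_reverb(mod_freq_hz: float, rt60_s: float) -> float:
    """Modulation reduction factor for an ideal exponential decay (assumed model)."""
    return 1.0 / math.sqrt(1.0 + (2.0 * math.pi * mod_freq_hz * rt60_s / 13.8) ** 2)

# Compare a few hypothetical conditions at a 2 Hz modulation frequency.
for snr_db, rt60_s in [(15.0, 0.5), (15.0, 2.0), (0.0, 0.5)]:
    m = mtf_noise(snr_db) * mtf_reverb(2.0, rt60_s)
    print(f"SNR {snr_db:4.1f} dB, RT60 {rt60_s:.1f} s -> m(2 Hz) = {m:.2f}")
```

In this simplified model, either a lower SNR or a longer RT60 shrinks the remaining modulation depth, which is what ultimately lowers the STI.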

More information on measuring reverberation time can be found in this article.

The Speech Transmission Index (STI)

The core concept behind STI is that speech is a carrier wave. This carrier wave is modulated by low-frequency fluctuations as the speaker's mouth and tongue move to form different phonemes, the sounds that create words. These low-frequency modulations, then, are the carriers of any spoken information, and anything that reduces the depth of these modulations will have a negative impact on speech intelligibility. This can include (ambient or electronic) noise, excessive reverberation, distortion, and audible echoes.

The basis for calculating STI is the Modulation Transfer Function (MTF), which compares the depth of modulation in the received signal to that of the transmitted signal, at specified frequencies. The modulation transfer function can be measured directly, using specialized "speech-like" test signals, or calculated indirectly from the Impulse Response (or ETC) of a system under test. It is measured over a range of 7 octaves, from 125 Hz to 8 kHz, at 14 modulation frequencies per band. These modulation frequencies range from 0.63 Hz to 12.5 Hz in 1/3-octave intervals.
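To make the size of the full MTF matrix concrete, here is a minimal sketch that lists the octave bands and the nominal modulation frequencies described above. The variable names are only illustrative.

```python
# Octave-band centre frequencies from 125 Hz to 8 kHz (7 bands).
OCTAVE_BANDS_HZ = [125, 250, 500, 1000, 2000, 4000, 8000]

# Nominal modulation frequencies, 0.63 Hz to 12.5 Hz in 1/3-octave steps (14 values).
MOD_FREQS_HZ = [0.63, 0.8, 1.0, 1.25, 1.6, 2.0, 2.5,
                3.15, 4.0, 5.0, 6.3, 8.0, 10.0, 12.5]

# One modulation transfer value "m" is needed for every (band, modulation frequency) pair.
MTF_GRID = [(band, fm) for band in OCTAVE_BANDS_HZ for fm in MOD_FREQS_HZ]

print(len(MTF_GRID))  # prints 98
```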

The STI quantifies how a system affects the intelligibility of messages passing from a talker (live or recorded) to a listener. It is calculated from the change in modulation of a test signal and yields a value between 0 and 1, where 1 is the most intelligible and 0 the least.
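The path from modulation values to a single 0-to-1 figure can be sketched as follows. This is a deliberately simplified illustration, not the full IEC 60268-16 procedure: each m value is converted to an effective SNR, clipped to ±15 dB, and scaled to a transmission index, and the results are then averaged with equal weight. The standard's octave-band weighting, redundancy correction, auditory masking, and hearing-threshold terms are all omitted, and the function names are hypothetical.

```python
import math

def transmission_index(m: float) -> float:
    """Map one modulation transfer value m to a 0..1 transmission index."""
    m = min(max(m, 1e-6), 1.0 - 1e-6)           # keep the logarithm finite
    snr_db = 10.0 * math.log10(m / (1.0 - m))   # effective signal-to-noise ratio
    snr_db = min(max(snr_db, -15.0), 15.0)      # clip to the +/-15 dB range
    return (snr_db + 15.0) / 30.0

def simplified_sti(mtf_matrix):
    """mtf_matrix holds one list of m values per octave band.

    Simplification: bands are averaged with equal weight.
    """
    band_indices = [
        sum(transmission_index(m) for m in band) / len(band)
        for band in mtf_matrix
    ]
    return sum(band_indices) / len(band_indices)

# Hypothetical example: every m value is 0.5 in all 7 bands and 14 modulation frequencies.
print(round(simplified_sti([[0.5] * 14 for _ in range(7)]), 2))
```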

Band	STI Range	Examples of typical uses
A+	> 0.76	Recording studios
A	0.72 - 0.76	Theaters, speech auditoria, parliaments, courts
B	0.68 - 0.72	Theaters, speech auditoria, parliaments, courts
C	0.64 - 0.68	Teleconference, theaters
D	0.60 - 0.64	Classrooms, concert halls
E	0.56 - 0.60	Concert halls, modern churches
F	0.52 - 0.56	PA in shopping malls, public offices, cathedrals
G	0.48 - 0.52	PA in shopping malls, public offices
H	0.44 - 0.48	PA in difficult acoustic environments
I	0.40 - 0.44	PA in very difficult spaces
J	0.36 - 0.40	Not suitable for PA systems
U	< 0.36	Not suitable for PA systems
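A lookup like the one below can turn a measured STI into its qualification band. It is a sketch that assumes the 0.04-wide bands of IEC 60268-16, with lower limits running from 0.72 for band A down to 0.36 for band J. The handling of values that fall exactly on a boundary is a choice made here, not something stated in this article.

```python
# Lower limits of bands A..J (assumed from the IEC 60268-16 qualification bands).
BAND_LOWER_LIMITS = [
    ("A", 0.72), ("B", 0.68), ("C", 0.64), ("D", 0.60), ("E", 0.56),
    ("F", 0.52), ("G", 0.48), ("H", 0.44), ("I", 0.40), ("J", 0.36),
]

def sti_band(sti: float) -> str:
    """Return the qualification band letter for an STI value between 0 and 1."""
    if not 0.0 <= sti <= 1.0:
        raise ValueError("STI must be between 0 and 1")
    if sti > 0.76:
        return "A+"
    for label, lower_limit in BAND_LOWER_LIMITS:
        if sti >= lower_limit:
            return label
    return "U"

print(sti_band(0.50))  # prints G
```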

The intelligibility of emergency announcements can be quantitatively assessed by determining the STI of the system, which is usually measured in one of two ways: the Full STI testing method or the Speech Transmission Index for Public Address (STIPA) testing method.

Full STI

A full STI measurement requires 98 separate test signals: 14 modulation frequencies in each of the 7 octave bands. This process takes at least 15 minutes to complete. With hundreds of measurements required to assess a large space, it is extremely time-intensive.

STIPA

The STIPA method is a simplified version of Full STI that requires only one test signal: modulated, speech-weighted pink noise with two modulation frequencies in each octave band. This allows a measurement with performance similar to the full STI method to be taken in only 15 seconds. For this reason, STIPA is widely considered a more efficient method of STI measurement and has all but replaced the full STI method.

The STIPA test signal is transmitted through a system (e.g., an airport PA system or from a stage in a concert hall) either acoustically via a calibrated Talkbox or electronically via an .mp3/.wav file sent through a line input. The SPL of the test signal is measured at the listener position(s), and STI is calculated from the change in modulation depth between the transmitted and received signals, represented by a Modulation Transfer Function (MTF).
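The idea of comparing modulation depth before and after transmission can be sketched in a few lines. The example below works directly on a synthetic intensity envelope, so it is an illustration only: a real STIPA analyzer first filters the audio into octave bands and derives the envelope from the squared signal. The function names, the 100 Hz envelope sample rate, and the noise value are all assumptions.

```python
import math

def modulation_depth(intensity, sample_rate_hz, mod_freq_hz):
    """Estimate modulation depth m of an intensity envelope at one modulation frequency.

    Assumes the envelope spans a whole number of modulation periods.
    """
    re = im = total = 0.0
    for n, value in enumerate(intensity):
        phase = 2.0 * math.pi * mod_freq_hz * n / sample_rate_hz
        re += value * math.cos(phase)
        im += value * math.sin(phase)
        total += value
    return 2.0 * math.hypot(re, im) / total

def make_envelope(m, sample_rate_hz, mod_freq_hz, duration_s, noise_intensity=0.0):
    """Synthetic intensity envelope 1 + m*cos(2*pi*f*t), plus a constant noise floor."""
    count = int(round(duration_s * sample_rate_hz))
    return [
        1.0 + m * math.cos(2.0 * math.pi * mod_freq_hz * k / sample_rate_hz) + noise_intensity
        for k in range(count)
    ]

sent = make_envelope(1.0, 100, 2.0, 10.0)                           # fully modulated
received = make_envelope(1.0, 100, 2.0, 10.0, noise_intensity=1.0)  # noise at 0 dB SNR
ratio = modulation_depth(received, 100, 2.0) / modulation_depth(sent, 100, 2.0)
print(round(ratio, 2))  # modulation transfer value at 2 Hz
```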

How Does Smaart Calculate STI?

Smaart can calculate STI from an Impulse Response (IR) measurement, provided that certain conditions are met. In circumstances where the MTF of the System Under Test (SUT) is derived from an IR, it is considered an indirect measurement (as opposed to direct).
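For readers curious how an MTF can come out of an impulse response at all, the sketch below uses the well-known relationship between the squared impulse response and the modulation transfer function: the magnitude of the Fourier transform of the squared IR, normalized by its total energy. It is not Smaart's implementation. Octave-band filtering is omitted, and the synthetic IR (a bare exponential decay with a 1-second RT60 at a hypothetical 1 kHz sample rate) stands in for a real measurement.

```python
import cmath
import math

def mtf_from_ir(ir, sample_rate_hz, mod_freq_hz):
    """Modulation transfer value at one modulation frequency from an impulse response."""
    energy = [sample * sample for sample in ir]
    total = sum(energy)
    spectrum = sum(
        e * cmath.exp(-2j * math.pi * mod_freq_hz * n / sample_rate_hz)
        for n, e in enumerate(energy)
    )
    return abs(spectrum) / total

# Synthetic IR envelope: amplitude falls by 60 dB over rt60_s seconds.
sample_rate_hz = 1000
rt60_s = 1.0
ir = [
    math.exp(-math.log(1000.0) * n / (sample_rate_hz * rt60_s))
    for n in range(3 * sample_rate_hz)
]

for fm in (0.63, 2.0, 12.5):
    print(f"m({fm} Hz) = {mtf_from_ir(ir, sample_rate_hz, fm):.2f}")
```

For this idealized decay, the values fall as the modulation frequency rises, as expected for reverberation.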

Per IEC 60268-16:2020, this process has two key requirements:

It must use a deterministic test signal (such as period-matched noise or a log sweep) without a data window.
The measurement duration (FFT time constant) must be at least 1.6 seconds.

The second requirement raises an issue, however: a Fourier transform with a 1.59-second time constant has its first non-zero frequency bin at 0.63 Hz, which happens to be the lowest modulation frequency evaluated for STI, and its remaining bins are spaced 0.63 Hz apart. Calculating STI from a 1.6-second IR measurement would therefore require interpolating the low modulation frequencies that fall between bins (0.8 Hz and 1 Hz, for example), but the standard does not mention this.

A more practical solution is simply to measure over a longer period. A 5-second DFT time constant, for example, provides a frequency resolution of 0.2 Hz (1/5 s), which places frequency bins much closer to the STI modulation frequencies. Smaart offers a DFT size equating to exactly 5 seconds for sample rates of 44.1 kHz and above.
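The bin-spacing argument can be checked with a little arithmetic. The sketch below, using hypothetical helper names, finds how far each nominal modulation frequency sits from its nearest non-zero DFT bin for a given measurement length, expressed relative to the modulation frequency.

```python
MOD_FREQS_HZ = [0.63, 0.8, 1.0, 1.25, 1.6, 2.0, 2.5,
                3.15, 4.0, 5.0, 6.3, 8.0, 10.0, 12.5]

def worst_relative_offset(window_s: float) -> float:
    """Largest relative distance between a modulation frequency and its nearest bin."""
    spacing_hz = 1.0 / window_s  # DFT bin spacing for this time constant
    offsets = []
    for fm in MOD_FREQS_HZ:
        nearest_bin_hz = max(1, round(fm / spacing_hz)) * spacing_hz  # skip the DC bin
        offsets.append(abs(fm - nearest_bin_hz) / fm)
    return max(offsets)

for window_s in (1.6, 5.0):
    print(f"{window_s} s window: worst offset {worst_relative_offset(window_s):.1%}")
```

With the 1.6-second window, the 1 Hz modulation frequency lands a quarter of its own value away from the nearest bin, while the 5-second window keeps every modulation frequency within a few percent of a bin.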

Calculating STI From an IR in Smaart

Smaart is able to calculate STI from an already captured Impulse Response measurement. A 2-part guide on measuring Impulse Response is available here on the Support portal, with Part 1 available here and Part 2 available here. When testing for Intelligibility, the test signal of choice should be Speech-Weighted, Pseudorandom Pink Noise (available within Smaart's Signal Generator options).

Once you've captured and selected your IR measurement:

1. Make sure the IR measurement is visible in the main window's plot and is the top measurement in the Z-Order if multiple traces are displayed.

2. Click the "All Bands" button at the bottom of the Control Bar to open the "All Bands" table.

3. Click the "Calculate STI" button in the "All Bands" window to open the "Calculate STI" dialog. You will then need to select whether the measurement is "Noiseless" (made in a quiet space) or "Noise Present" (made in the presence of normal ambient background noise levels).

Noiseless measurements are typically made when ambient noise levels are lower than they would be when the system is in use and/or at a higher output level than the normal operating level of the SUT. They require separate estimation of operational speech level and typical in-service ambient noise levels. IR measurements can be made using period-matched pink noise or exponential sweeps (aka log sweeps or "pink" sweeps). The signal-to-noise ratio in each octave band from 125 Hz to 8 kHz must be at least 20 dB. IR measurements may be averaged to improve the signal-to-noise ratio.
Noise Present measurements require the use of a period-matched, speech-weighted noise signal (such as the STIPA test signal) when capturing the IR. They are made at the operational sound level of the system, without averaging, and in the presence of typical in-service ambient noise levels. The measurement signal input of the transfer function signal pair used to capture the IR must first be calibrated for accurate sound level measurement. A guide on Smaart input calibration can be found here.
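Before treating a measurement as Noiseless, it can help to confirm the 20 dB requirement band by band. The check below is a minimal sketch: the function name and the level values are hypothetical, and it assumes you already have per-octave signal and noise levels in dB from a calibrated measurement.

```python
OCTAVE_BANDS_HZ = [125, 250, 500, 1000, 2000, 4000, 8000]

def bands_below_min_snr(signal_db, noise_db, min_snr_db=20.0):
    """Return the octave bands whose signal-to-noise ratio is under min_snr_db."""
    return [
        band for band in OCTAVE_BANDS_HZ
        if signal_db[band] - noise_db[band] < min_snr_db
    ]

# Hypothetical per-band levels in dB.
signal_db = {125: 78, 250: 80, 500: 82, 1000: 81, 2000: 79, 4000: 75, 8000: 70}
noise_db = {125: 62, 250: 55, 500: 50, 1000: 47, 2000: 44, 4000: 40, 8000: 38}

print(bands_below_min_snr(signal_db, noise_db))  # prints [125]
```

Any band returned here would call for a higher output level, a quieter measurement window, or averaging before the IR is used for a Noiseless STI calculation.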

When you select Noise Present (assuming the IR measurement was captured using a calibrated measurement signal input and no averaging), there is nothing else you need to do. Smaart calculates the sound level for the measurement channel as it captures the IR and assumes that typical ambient noise levels are included in the measurement. The STI, STIPA(IR), and equivalent CIS figures, along with an overall qualitative assessment (Excellent, Good, Fair, or Poor) and letter grades, appear in the Results section of the dialog window.

For Noiseless measurements, you first need to provide the operational Speech Level for the SUT as an A-weighted (dBA) sound level. You will also need unweighted noise levels for each octave band that are typical of ambient noise levels present when the system is in normal use. The latter can be imported from a calibrated Smaart RTA measurement by clicking the Import Noise button and selecting the .srf file that you want to use. Alternatively, you can type decibel levels for each octave directly into the table below on the "Noise dB" row. You can also estimate the effect of equalization on the STI figure by entering +/- dB values in the "EQ dB" row of the table. Values in the Results section will update as you make changes.
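Conceptually, the Noise dB and EQ dB rows change the signal-to-noise ratio in each octave band, which in turn scales the noiseless modulation values. The sketch below shows that idea for a single band. It is an assumption-laden illustration, not Smaart's internal calculation: it takes the per-band speech level as already known (deriving band levels from the overall dBA speech level is not shown), and it ignores the masking and hearing-threshold terms in the standard.

```python
def apply_noise_and_eq(m_noiseless, speech_db, noise_db, eq_db=0.0):
    """Scale a noiseless modulation value by the band's signal-to-noise ratio.

    speech_db and noise_db are levels for one octave band; eq_db is a
    boost (+) or cut (-) applied to the speech level in that band.
    """
    snr_db = (speech_db + eq_db) - noise_db
    return m_noiseless / (1.0 + 10.0 ** (-snr_db / 10.0))

# Hypothetical band: noiseless m = 0.8, speech at 70 dB, ambient noise at 60 dB.
print(round(apply_noise_and_eq(0.8, 70.0, 60.0), 3))
print(round(apply_noise_and_eq(0.8, 70.0, 60.0, eq_db=3.0), 3))
```

A positive EQ value raises the band's SNR and recovers some modulation depth, which is why the Results section moves as you edit the table.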

The Clear EQ and Clear Noise buttons reset the corresponding rows of the table to all zeroes.

The Copy button copies MTI, EQ, and Noise values from the table along with the MTF "m" values for all modulation frequencies in all octaves to the operating system's clipboard in tab-delimited ASCII format.

Saving Your Work

When you have made your selections, you can click Save to save your work. If you are working with a new live measurement, Smaart will prompt you for a file name and then write the IR measurement to a .wav file with the STI calculation details included as metadata in the file header. If you are working with an IR measurement already stored as a .wav file, Smaart will rewrite the file with a new header that includes the STI metadata. In that case, you will be asked to confirm that you want to overwrite the existing file.

While the audio sample data in an existing .wav file is unaltered when you save STI calculation details, some programs (such as older/basic audio software) may be unable to open the file after saving. Smaart versions 8.5 and later use the BWAV header specification for IR data files, which most modern audio software should be able to read. It is a good idea to test that assumption (or make a copy of the .wav file) before overwriting.
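If you prefer to script the backup, a small helper such as the following copies the file next to the original before Smaart overwrites it. The helper name and the "_backup" suffix are arbitrary choices for illustration.

```python
import shutil
from pathlib import Path

def backup_wav(path):
    """Copy a .wav file to '<name>_backup.wav' in the same folder and return the new path."""
    source = Path(path)
    target = source.with_name(source.stem + "_backup" + source.suffix)
    shutil.copy2(source, target)  # copy2 also preserves file timestamps
    return target
```

For example, calling backup_wav on a file named "hall_position_1.wav" (a hypothetical name) would create "hall_position_1_backup.wav" alongside it.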