Phonetica was published by Karger Publishers up to and including 2020. If you or your institution subscribed to Phonetica during that period, you might still have access to the full text of this article on the Karger platform if you cannot access it here.
This study was motivated by the prospective role played by brain rhythms in speech perception. The intelligibility – in terms of word error rate – of natural-sounding, synthetically generated sentences was measured using a paradigm that alters speech-energy rhythm over a range of frequencies. The material comprised 96 semantically unpredictable sentences, each approximately 2 s long (6–8 words per sentence), generated by a high-quality text-to-speech (TTS) synthesis engine. The TTS waveform was time-compressed by a factor of 3, creating a signal with a syllable rhythm three times faster than the original, and whose intelligibility is poor (<50% words correct). A waveform with an artificial rhythm was produced by automatically segmenting the time-compressed waveform into consecutive 40-ms fragments, each followed by a silent interval. The parameters varied were the length of the silent interval (0–160 ms) and whether the lengths of silence were equal (‘periodic’) or not (‘aperiodic’). The performance curve (word error rate as a function of mean duration of silence) was U-shaped. The lowest word error rate (i.e., highest intelligibility) occurred when the silence was 80 ms long and inserted periodically. This is also the condition for which word error rate increased when the silence was inserted aperiodically. These data are consistent with a model (TEMPO) in which low-frequency brain rhythms affect the ability to decode the speech signal. In TEMPO, optimum intelligibility is achieved when the syllable rhythm is within the range of the high theta-frequency brain rhythms (6–12 Hz), comparable to the rate at which segments and syllables are articulated in conversational speech.
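The segmentation step described in the abstract (consecutive 40-ms fragments, each followed by a silent interval) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `insert_silences` and `packet_rate_hz`, the plain-list signal representation, and the option of passing one gap length per fragment for the aperiodic case are all assumptions made for illustration. The abstract does not say how the aperiodic gap lengths were chosen, so none is proposed here.

```python
from typing import List, Sequence, Union


def insert_silences(
    samples: Sequence[float],
    sample_rate: int,
    fragment_ms: float = 40.0,
    silence_ms: Union[float, Sequence[float]] = 80.0,
) -> List[float]:
    """Cut `samples` into consecutive fragments and follow each with silence.

    Hypothetical helper: `silence_ms` is either a single duration (periodic
    condition) or one duration per fragment (aperiodic condition).
    """
    frag_len = round(sample_rate * fragment_ms / 1000.0)
    if frag_len <= 0:
        raise ValueError("fragment_ms is too short for this sample rate")

    # Consecutive, non-overlapping fragments; the last one may be shorter.
    fragments = [samples[i:i + frag_len] for i in range(0, len(samples), frag_len)]

    if isinstance(silence_ms, (int, float)):
        gaps = [float(silence_ms)] * len(fragments)
    else:
        gaps = [float(g) for g in silence_ms]
        if len(gaps) != len(fragments):
            raise ValueError("need exactly one silence duration per fragment")

    out: List[float] = []
    for fragment, gap_ms in zip(fragments, gaps):
        out.extend(fragment)
        # Silence is represented as zero-valued samples.
        out.extend([0.0] * round(sample_rate * gap_ms / 1000.0))
    return out


def packet_rate_hz(fragment_ms: float, mean_silence_ms: float) -> float:
    """Repetition rate of the fragment-plus-silence cycle, in Hz."""
    return 1000.0 / (fragment_ms + mean_silence_ms)


if __name__ == "__main__":
    # Toy signal: 120 ms of constant amplitude at a 1 kHz sample rate.
    toy = [1.0] * 120
    periodic = insert_silences(toy, sample_rate=1000, silence_ms=80.0)
    print(len(periodic), round(packet_rate_hz(40.0, 80.0), 2))
```

Under these assumptions, a 40-ms fragment followed by 80 ms of silence gives a 120-ms cycle, that is, a repetition rate of about 8.3 Hz. That value lies inside the 6–12 Hz range the abstract associates with optimum intelligibility.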
© 2009 S. Karger AG, Basel