Speech and helium speech, with a brief introduction to the physics of the voice
This short document gives a very brief description of the source-filter model of voiced
speech. It
uses this to explain some of the most noticeable features of helium speech, which it illustrates with sound files. If this isn't clear, see this more complete Introduction to the physics of speech.
The source-filter model of the vocal tract
The vibration of the vocal folds in the larynx produces a varying air flow which may be treated as a periodic source (A).
(A periodic signal is cyclic: its motion is reproduced after a time interval called its period. One
consequence is that its spectrum is made up of harmonics. Go to
'What is a sound spectrum?' for an introduction.) This source signal
is input to the vocal tract. The tract behaves like a variable filter (B) in that its response is different for different frequencies. It is variable because, by changing the position of your tongue, jaw, etc., you can change that frequency response.
The input signal and the vocal tract, together with the
radiation properties of the mouth, face and external field, produce a sound output (C).
Because the source is harmonic, we can say that the gain of the tract (B) is sampled at multiples of
the fundamental frequency F0. In the case sketched at left below, the resonances R1 and R2 can be determined from
the peaks in the envelope of the sound spectrum. These peaks are called the formants
(F1 and F2). (See What is a formant?)
Note that the detail in the spectrum is easier to see if F0 is low, e.g. for
a low-pitched man's voice (diagram at left), than it is for a child's voice
(shown at right).
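The sampling idea can be sketched numerically. The short Python example below is only an illustration: the filter shape (two peaks at 500 and 1500 Hz, each about 100 Hz wide) and the function names are assumptions made for the sketch, not measurements of a real vocal tract. It samples the toy gain curve at the harmonics of a low and a high F0, then takes the strongest harmonic below 1 kHz as a crude estimate of the first peak.

```python
def tract_gain(f, resonances=(500.0, 1500.0), bandwidth=100.0):
    """Toy filter gain: a sum of bell-shaped peaks (illustrative assumption)."""
    half_width = bandwidth / 2.0
    return sum(1.0 / (1.0 + ((f - r) / half_width) ** 2) for r in resonances)


def sampled_envelope(f0, f_max=3000.0):
    """Return (frequency, gain) pairs at the harmonics of f0 up to f_max."""
    n_harmonics = int(f_max // f0)
    return [(n * f0, tract_gain(n * f0)) for n in range(1, n_harmonics + 1)]


def estimate_first_peak(f0, search_limit=1000.0):
    """Frequency of the strongest harmonic below search_limit."""
    candidates = [(f, g) for f, g in sampled_envelope(f0) if f < search_limit]
    return max(candidates, key=lambda pair: pair[1])[0]


low_voice = sampled_envelope(100.0)    # low-pitched adult voice
high_voice = sampled_envelope(400.0)   # child-like pitch

print(len(low_voice), "sample points at F0 = 100 Hz")
print(len(high_voice), "sample points at F0 = 400 Hz")
print("first peak estimated at", estimate_first_peak(100.0), "Hz (F0 = 100 Hz)")
print("first peak estimated at", estimate_first_peak(400.0), "Hz (F0 = 400 Hz)")
```

With the low F0, a harmonic falls on the 500 Hz peak and the envelope is traced by many points. With the high F0, only a handful of harmonics are available and the nearest one misses the peak by 100 Hz.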
The lowest resonance is determined to a considerable extent by the end effect of your
mouth: if you
lower your jaw, R1 rises. R2 is affected by the jaw position too, but it is primarily affected by the position
of the constriction inside your mouth. Moving your tongue forwards and backwards changes
R2 (and also R1, but to a lesser extent). Maps of (R1,R2) for various accents of English are given on Sounds of World English.
Nearly all information in speech is in the range 200 Hz - 8 kHz. (The telephone carries
only 300 Hz -
4 kHz but speech is reasonably intelligible.) The pitch is determined by the spacing
of harmonics as much as or more than by the fundamental. Thus you can tell the pitch of a man's voice on
the phone even though the fundamental of that signal is not present. Note that the size of the vocal tract
(~170 mm long)
gives resonances around 500 Hz and above. In fact a tube of this length, closed at one end, is a functional model of the vocal
tract for the vowel "er" as in "herd". For this 'neutral' vowel, the first five resonances of the
author's vocal tract are indeed at values of about 500, 1500, 2500, 3500 and 4500 Hz.
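These values can be checked with the standard result for an ideal cylindrical tube closed at one end and open at the other, whose resonances fall at odd multiples of c/(4L). The sketch below assumes a speed of sound of 343 m/s (dry air at about 20 °C) and ignores end corrections; the function name is just for illustration. The air in a real vocal tract is warmer and more humid, so the true figures differ a little.

```python
def closed_open_resonances(length_m, speed_of_sound=343.0, count=5):
    """Resonances f_n = (2n - 1) * c / (4 * L) of a tube closed at one end."""
    return [
        (2 * n - 1) * speed_of_sound / (4.0 * length_m)
        for n in range(1, count + 1)
    ]


resonances = closed_open_resonances(0.170)  # 170 mm tract, assumed c = 343 m/s
for n, f in enumerate(resonances, start=1):
    print(f"R{n}: {f:.0f} Hz")
```

The results come out near 500, 1500, 2500, 3500 and 4500 Hz, spaced by about 1000 Hz, which is the pattern quoted above for the neutral vowel.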
What helium does to speech
You can investigate the model described above by changing the speed of sound. Inhaling helium raises the
frequencies of the resonances, and therefore of the formants they produce (see What is a formant?). As you would expect from the model above, it does not change the pitch, which
is determined by the tension, mass and geometry of the vocal folds, and some other effects. It does,
however, change the timbre. In speech, you may have the illusion that the pitch has risen
because one doesn't think much about pitch when listening to speech. To make it clear, you
can sing with and without a lung containing a substantial fraction of He and listen.
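To put a rough number on this, one can use the ideal-gas expression for the speed of sound. The worked ratio below is a sketch under stated assumptions: pure, dry gases at the same temperature, with textbook values for the adiabatic index (gamma) and molar mass (M), and a tract of fixed length L. None of these numbers comes from the recordings described here.

```latex
\[
c = \sqrt{\frac{\gamma R T}{M}},
\qquad
\frac{c_{\mathrm{He}}}{c_{\mathrm{air}}}
  = \sqrt{\frac{\gamma_{\mathrm{He}}}{\gamma_{\mathrm{air}}}
          \cdot \frac{M_{\mathrm{air}}}{M_{\mathrm{He}}}}
  \approx \sqrt{\frac{1.67}{1.40} \cdot \frac{29.0}{4.0}}
  \approx 2.9
\]
\[
R_n \propto \frac{c}{L}
\quad\Longrightarrow\quad
\frac{R_n^{\mathrm{He}}}{R_n^{\mathrm{air}}} = \frac{c_{\mathrm{He}}}{c_{\mathrm{air}}}
\]
```

So a tract filled with pure helium would have its resonances raised by a factor of nearly three. The speed of sound does not appear in the mechanical properties of the vocal folds that set the pitch, which is why the pitch is (nearly) unchanged. In practice the gas in the tract is a mixture, so the shift is smaller.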
Warnings: He is
suffocating and conducts heat well. After one inhalation of He, breathe air normally for several
minutes. In a gas cylinder, He is under high pressure. Do not inhale directly from a gas cylinder.
Fill a toy balloon and inhale from that.
Okay, having read those warnings, you might not want to try. So I've put the recordings of my experiment below.
The first diagram shows a schematic picture of the spectrum (power vs frequency) for the sound of the voice made with a particular
configuration of the
vocal tract filled with air. The solid line is the spectral envelope; the vertical
lines are the
harmonics of the vibration of the vocal folds. The second diagram shows the effect of filling the tract
with helium, but keeping the tract configuration the same (i.e. trying to pronounce the same sound
as before, but with a throat full of helium). The speed of sound is greater, so the
resonances occur at
higher frequencies, as do the formants they produce: the second formant has now been shifted right off scale in this
diagram. The flesh
in your vocal folds still vibrates at the same* frequency, however, so the harmonics occur at the same frequencies.
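The two diagrams can be imitated with a toy calculation. In the sketch below, everything numerical is an assumption chosen for illustration: the envelope is a sum of simple bell-shaped peaks, the air resonances are put at 500 and 1500 Hz, and the resonances are scaled by a factor of 2, roughly what a helium-rich mixture might give (pure helium would give closer to 3). The harmonics are computed from the same F0 in both cases.

```python
def envelope(f, resonances, bandwidth=100.0):
    """Toy spectral envelope: one bell-shaped peak per resonance (assumed shape)."""
    half_width = bandwidth / 2.0
    return sum(1.0 / (1.0 + ((f - r) / half_width) ** 2) for r in resonances)


def harmonic_powers(f0, resonances, f_max=4000.0):
    """Map each harmonic frequency of f0 to the envelope value at that frequency."""
    count = int(f_max // f0)
    return {n * f0: envelope(n * f0, resonances) for n in range(1, count + 1)}


F0 = 100.0                         # same vocal fold vibration in both cases
AIR_RESONANCES = (500.0, 1500.0)   # illustrative values
SCALE = 2.0                        # assumed ratio of sound speeds for the mixture
HELIUM_RESONANCES = tuple(SCALE * r for r in AIR_RESONANCES)

air = harmonic_powers(F0, AIR_RESONANCES)
helium = harmonic_powers(F0, HELIUM_RESONANCES)

# The harmonics sit at identical frequencies; only their strengths differ.
same_harmonics = list(air) == list(helium)

# Total envelope weight carried by the harmonics at or below 700 Hz.
low_air = sum(p for f, p in air.items() if f <= 700.0)
low_helium = sum(p for f, p in helium.items() if f <= 700.0)

print("same harmonic frequencies:", same_harmonics)
print(f"low-frequency weight in air: {low_air:.2f}")
print(f"low-frequency weight with helium: {low_helium:.2f}")
```

The harmonic frequencies are identical in the two cases, but with the resonances shifted upwards the low harmonics are much weaker, while harmonics near the new peak positions dominate.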
What does this sound like? Obviously the helium makes a big difference to the sound of the voice.
If you do the experiment with someone who has some experience with singing or music (and if s/he doesn't laugh too much on hearing the helium voice),
then the pitch
will be the same in the two cases. The pitch is determined by the frequencies of the harmonics, and
these have not changed*. The speech does, however, sound 'like Donald Duck'. There is less power
at low frequencies so the sound is thin and squeaky. This alteration to the timbre changes the voice
in a spectacular way. Although we can understand whole sentences (using contextual
clues) we find
that individual vowels are very difficult to identify. (By the way, an articulate but otherwise standard duck would have a shorter vocal tract than ours so, even while breathing air, Donald would have resonances at rather higher frequencies than ours.)
* If you keep the muscle tensions the same, that is, the frequencies will not change much. There
could be a small change because the less dense He loads the vocal folds a bit less than
air does, but
this effect is slight. The effect on the resonances is large, however. Its size depends on
how pure the
He in your vocal tract is.
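The dependence on purity can be estimated with an ideal-gas mixing rule. The sketch below is a rough model, not a description of the gas actually in anyone's vocal tract: it treats air as an ideal diatomic gas and helium as an ideal monatomic gas, ignores water vapour and carbon dioxide, and takes a temperature of 20 °C. The helium fraction is a mole fraction, and the function names are made up for this example.

```python
import math

R_GAS = 8.314462618   # molar gas constant, J/(mol K)
M_AIR = 0.02897       # molar mass of dry air, kg/mol
M_HE = 0.004003       # molar mass of helium, kg/mol


def speed_of_sound(helium_fraction, temperature_k=293.15):
    """Ideal-gas speed of sound in a helium-air mixture (mole fraction of He)."""
    x = helium_fraction
    molar_mass = x * M_HE + (1.0 - x) * M_AIR
    # Molar heat capacities: 5R/2 for monatomic He, 7R/2 for diatomic air.
    cp = x * 2.5 * R_GAS + (1.0 - x) * 3.5 * R_GAS
    cv = cp - R_GAS
    gamma = cp / cv
    return math.sqrt(gamma * R_GAS * temperature_k / molar_mass)


def resonance_scale(helium_fraction):
    """Factor by which tract resonances rise, relative to air, at fixed geometry."""
    return speed_of_sound(helium_fraction) / speed_of_sound(0.0)


for x in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(f"He fraction {x:.2f}: resonances scaled by {resonance_scale(x):.2f}")
```

In this model the scaling grows slowly at first and steeply as the mixture approaches pure helium, so a breath that is only half helium shifts the resonances far less than half as much as pure helium would.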
More about voice acoustics
The very brief account above addresses only vowels. Our Introduction to the physics of speech is a much broader introduction. It provides both a simple overview and a rather more detailed account. Throughout, it suggests a range of experiments for the reader to try, none of the others involving helium.
Some explanatory notes