Image: Jordan Hollender
The challenge was clear. Take the telegraph line that
ran across the bottom of the Atlantic Ocean, carrying simple
pulses of electricity over its narrow 300-hertz bandwidth,
and use it to transmit voice conversations. The problem was,
it was 1952, and the technology was rudimentary. Bandwidth
compression of voice signals, called vocoding, had been done
during World War II for radio transmission of speech—but
the quality of the resulting speech was inadequate.
It was up to James Flanagan, then
a graduate student at the Massachusetts Institute of Technology,
in Cambridge, to come up with a better idea. He dismissed
the then-current vocoding technology, which took 10 different
frequencies of the voice signal and converted their varying
amplitudes into analog signals for transmission. Instead,
he went back to the fundamentals: how the elements of the
voice are created and resonate within the human vocal tract.
For different elements, resonances peak at different points
in the spectrum; the frequencies of these points are called
formant frequencies.
In human speech, the frequencies of formants depend on the shape and
motion of the articulators—mouth, jaw, tongue, and lips.
Different combinations of these shapes and motions create
many different formants. Flanagan chose three of those. Then,
by coding the variations in frequency for each formant, he
supplied a more efficient means of representing speech than
the technique used earlier, which coded the amplitudes of
multiple signals at fixed positions of frequency. Although
this formant coding was never implemented on those telegraph
cables, it did set the stage for later advances in bandwidth
conservation.
He also did experiments to determine how accurately the ear detects
errors in such coding, establishing the engineering criteria
for his own and future speech coding technology. In fact,
a short letter to the editor of the Journal of the Acoustical Society of America
(May 1955) describing these experiments is still his most
frequently requested publication.
Flanagan's work with speech coding heralded a series of advances over
the years, including a currently favored technique, linear
predictive coding. In this technique, the value of a speech
signal at each sample point is predicted by a combination
of past samples. This type of coding is used in low-bandwidth
speech communications today—in cellphones, voice mail
systems, and computer-generated speech. Later ideas championed
by Flanagan contributed to the development of modern automatic
speech-recognition systems, audio codecs like MP3, and today's
voice over IP technology.
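
To make the idea of linear prediction concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the method used in any particular product or in Flanagan's own work: the function names are hypothetical, and the weights are estimated with the textbook autocorrelation method and Levinson-Durbin recursion, which the article does not mention. Each sample is predicted as a weighted sum of the samples just before it.

```python
import math

def autocorrelation(signal, max_lag):
    """Return autocorrelation values r[0..max_lag] of the signal."""
    n = len(signal)
    return [sum(signal[i] * signal[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]

def lpc_coefficients(signal, order):
    """Estimate predictor weights a[1..order] (hypothetical helper).

    Uses the Levinson-Durbin recursion on the autocorrelation values,
    an assumed choice of estimation method for this sketch.
    """
    r = autocorrelation(signal, order)
    a = [0.0] * (order + 1)  # a[k] weights the sample k steps in the past
    error = r[0]             # prediction error energy before any prediction
    for i in range(1, order + 1):
        # Reflection coefficient for stage i
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / error
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        error *= (1.0 - k * k)
    return a[1:]

def predict(signal, weights, n):
    """Predict sample n as a weighted sum of the preceding samples."""
    return sum(w * signal[n - k] for k, w in enumerate(weights, start=1))

# Toy input: a pure tone stands in for a short stretch of voiced speech.
tone = [math.sin(1.0 * n) for n in range(2000)]
weights = lpc_coefficients(tone, order=2)
estimate = predict(tone, weights, 1000)
```

For a pure tone, two past samples are enough to predict the next one almost exactly, so only the small set of weights (plus whatever prediction error remains) needs to be sent rather than every sample. Real speech coders typically use more weights and update them frame by frame.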
"Wayne Gretzky once said he played hockey well because he doesn't
go to where the puck is; he skates to where it is going to
be," said Rich Cox, the vice president who directs the IP
and Voice Services Research Lab for AT&T Labs Research
in Florham Park, N.J. "That's Jim. He could see where things
were going to go," even if it would take decades to get there.
For this—his "sustained leadership and outstanding contributions in speech
technology"—Flanagan is being honored with the 2005 IEEE
Medal of Honor.