A better grasp of how a brain interprets the ears' input led to virtual surround, in which two speakers sound like many more
They are out there now, and more are coming: DVD players, television sets, set-top boxes, and receivers that include virtual surround technologies. Sold under brand names such as TruSurround from SRS Labs and N-2-2 from Spatializer, these technologies envelop even listeners who are short on space, wiring, or speakers in sound that appears to come from the front, rear, and sides.
Speakers and amps and wires—oh my!
The idea of experiencing 5.1-channel surround sound in their homes thrills most people until they discover how expensive and complicated it is to set up and adjust the equipment. A system requires five speakers (Left, Center, Right, Surround Left, and Surround Right) and typically also includes a subwoofer (represented by .1). Each must be positioned properly, wired to the main amplifier or receiver, and balanced—daunting tasks for someone who just wants to watch a good movie.
For many people this is the beginning and the end of their surround sound experience. Face it—standard surround sound is cumbersome. Only a true "audiophile" will purchase extra speakers, wiring, and amplifiers and conduct setup tests or adjust the speakers for the best listening possible.
Dolby Laboratories Inc., San Francisco, developed what is now the most prevalent system for delivering 5.1 surround to the home, namely, Dolby Digital. But the company recognized that an alternative to complex multispeaker systems would be required if it wanted to expand the appeal, and hence the market, of a system that relies on five speakers plus a subwoofer to deliver surround sound. Virtual surround, which needs only two speakers to create a surround sound image, was the answer to both Dolby's and the consumer's problems. Virtual surround sound systems mimic actual surround sound by exploiting the way that the auditory system perceives the direction from which a sound is coming. They can create a very realistic surround sound field from only two speakers placed in front, to the left and right, of the listener, as found in standard stereo systems.
The key lies in a so-called virtualizer, developed by companies such as SRS Labs, Spatializer, and Qsound. It turns Dolby Digital's 5.1-channel output into the two virtual surround channels.
Evolution, not revolution
In the late 1920s and early 1930s, researchers, noting that people listened to sound through one speaker but had two ears, began developing what came to be known as stereo.
Around the early 1970s, some people, such as Ben Bauer of CBS and Peter Scheiber, an independent consultant in Bloomington, Ind., mused that stereo was great, but everything seemed to be coming from in front of and between the two speakers. So quad, as it was called, was born. It used four speakers: left and right speakers in front of the listener, as in conventional stereo, and left and right speakers behind the listener to create the sensation of being "surrounded by sound" [see figure]. Quad failed to catch on and quickly died.
While quad was being developed, Dolby Labs was becoming very successful with tape noise reduction systems. Its first systems were designed for professional reel-to-reel units, but then Henry Kloss, founder of Advent Corp., implemented a simplified version of the technology on a cassette deck, giving Dolby an entrée into the vast consumer market.
New formats emerge
In the meantime, the film industry had been struggling with optical sound tracks, in use since the 1930s. The tracks were optically printed along the edge of the film, but were noisy, of poor quality, and degraded a little every time the film was shown. So Dolby applied its noise-reduction and frequency-extension system to film and crossed into the film industry.
Around 1977, when the original Star Wars movie was released, film producers such as George Lucas wondered whether they could deliver something a little more exciting than the simple stereo then standard in movie theaters. They went to Dolby, which had licensed some of the original quad patents and which turned the speaker configuration into a diamond shape rather than a rectangle. Left, Right, Surround Left, and Surround Right became Left, Center, Right, and Surround.
By providing a special encoder, it was possible to merge, or encode, four source channels into the more easily delivered two-channel recording formats, and then play them back in the theater with a decoder that derived four channels again. Thus was born Dolby Stereo. The configuration made it possible to isolate dialog to the center speaker, but the surround channel was monaural instead of stereo, unlike the original quad. Today, virtually every movie is encoded in Dolby Stereo as one of its audio formats; most are also encoded in Dolby Digital 5.1.
During the late 1970s through the early 1980s, the home video formats of the day, VHS and Beta, both started to support high-quality stereo sound tracks with VHS HiFi and Beta HiFi. The Dolby Stereo encoding of the original movie soundtracks was being transferred to these tapes, under a format now badged Dolby Surround. Soon, most home audio-video receivers included Dolby Surround decoders, and later Dolby Pro Logic decoders, which significantly enhanced the perceived separation between the channels. Thus the average consumer could enjoy surround sound at home. Unfortunately, it remained a problem to install.
Later, in the 1990s, when digital penetrated the consumer market in DVD players and digital receivers, Dolby released Dolby Digital, while Digital Theater Systems Inc. (DTS), of Agoura Hills, Calif., released Digital Theater Sound, offering high-quality 5.1 channel performance. Both systems provided, for the first time, five discrete high-quality audio channels (Left, Center, Right, Left Surround, and Right Surround) plus the .1 channel. This bandwidth-limited channel was dubbed LFE for low-frequency effects (like explosions, bombs, and jet takeoffs) and in a typical home theater system is connected to a subwoofer. All the same, consumer reluctance to add speakers, space, and wire remained unchanged.
In the meantime, some folks, mainly those involved with high-end audio, were wondering why stereo, and even surround sound, fell short of sounding like the real thing. Something was amiss, but nobody knew quite what. Around the mid-1980s, Arnold Klayman, who was consulting with Hughes Aircraft Co. on the Boeing 747's public address system, decided to pursue the question in his spare time.
What he discovered should have been obvious: people perceive sound as arising in three-dimensional space, not just in a flat horizontal plane between speakers. We can tell whether a sound is above, below, in front, to the side, or behind us. In fact, many people can resolve the location of a sound to within one degree of azimuth. The next question was, how is that done with only two ears?
It turns out that what the early designers of stereo and quad systems had failed to understand is that locating a sound in space involves more than amplitude, time, and phase differences (the traditional factors that account for stereo localization).
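To see why a time difference alone falls short, consider a minimal sketch that assumes a simplified free-field model in which the ears are two points a fixed distance apart. The ear spacing, the speed of sound, and the function name are illustrative assumptions, not figures from the researchers mentioned here. In this model a source 30 degrees off the front and its mirror image at 150 degrees, toward the rear, produce the same interaural time difference, so the cue by itself cannot separate front from back.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # assumed: air at roughly room temperature
EAR_SPACING_M = 0.18        # assumed, illustrative ear-to-ear distance


def interaural_time_difference(azimuth_deg):
    """Return the arrival-time difference between the ears, in seconds.

    Simplified two-point head model for a distant source. Azimuth is
    measured from straight ahead (0) toward the right ear (90); 180 is
    directly behind the listener.
    """
    path_difference_m = EAR_SPACING_M * math.sin(math.radians(azimuth_deg))
    return path_difference_m / SPEED_OF_SOUND_M_S


front = interaural_time_difference(30.0)
rear = interaural_time_difference(150.0)
print(f"30 degrees (front): {front * 1e6:.1f} microseconds")
print(f"150 degrees (rear): {rear * 1e6:.1f} microseconds")
```

A real head is not two points in free space, which is exactly where the additional cues described next come in.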
"The ear-brain system can determine the direction of a sound source, because of its uncanny ability to [utilize] changes in the frequency distribution of a complex sound as it arrives at the ear canal," said Klayman, now at SRS Labs, in Santa Ana, Calif. "This distribution changes as the sound source changes position relative to the ear, primarily as a result of the outer ear, or pinna, which acts as a baffle and resonator specifically to create the effect."
The brain interprets these subtle variations in frequency distribution as directional clues, rather than as changes in the timbre of the sound source. This permits the ear/brain system to resolve the direction of a sound source in three-dimensional space. Without this capability there could be no surround sound. People would be unable to distinguish between sounds coming from the front and the rear speakers.
These direction-based changes in frequency distribution caused by the pinna, plus others caused by head shadowing and reflections, are now collectively referred to as head-related transfer functions (HRTFs). Klayman measured the frequency response in the ear canal to sounds produced by speakers at various locations. From his measurements, he derived a set of HRTF-based curves that could be used to position sound seemingly outside the field of stereo speakers [see figure, "Tricking the Brain"].
The realism of stereo reproduction was much enhanced by using these curves and filtering the difference portion of a stereo signal, which contains directional and ambient information representing the acoustic space of the original recording. This proprietary process, called SRS (Sound Retrieval System), was the first of the HRTF-based 3-D sound enhancement systems. When SRS is activated, the listener perceives a sound field that extends well beyond the horizontal position of the speakers and curves around the head to an angle of about 180 degrees.
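The general idea of working on the difference portion of a stereo signal can be sketched as follows. This is not the SRS process, whose curves are proprietary; it is a minimal illustration in which a single gain (the hypothetical `difference_gain` parameter) stands in for the frequency-dependent, HRTF-based filtering.

```python
def widen_stereo(left, right, difference_gain=1.5):
    """Split a stereo signal into sum and difference, scale the
    difference, and recombine into left and right outputs.

    left, right: equal-length lists of samples.
    difference_gain: illustrative stand-in for a frequency-dependent
    filter applied to the difference signal.
    """
    out_left, out_right = [], []
    for l_sample, r_sample in zip(left, right):
        common = 0.5 * (l_sample + r_sample)      # sum: shared by both channels
        difference = 0.5 * (l_sample - r_sample)  # directional, ambient content
        difference *= difference_gain
        out_left.append(common + difference)
        out_right.append(common - difference)
    return out_left, out_right


# Content identical in both channels passes through unchanged;
# content present in only one channel is pushed further apart.
print(widen_stereo([1.0, 0.5], [1.0, 0.5], difference_gain=2.0))
print(widen_stereo([1.0], [0.0], difference_gain=2.0))
```

With a gain of 1.0 the function returns its input, which makes the sum-and-difference split easy to verify before any filtering is added.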
Things get discrete
Until the mid-1990s, almost all surround sound was matrixed. That is, the information from all the channels, front and surround, was encoded into two-channel stereo. To distinguish surround-encoded audio from regular stereo, the two channels were labeled Lt and Rt (Left total, Right total). A decoder in the cinema or in the home did the best it could to separate the left, right, center, and surround channels from the Lt/Rt source. This was the basis for Dolby Surround encoding, Dolby Surround decoding, and later Dolby Pro Logic decoding.
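A simplified matrix shows why the decoder could only do "the best it could." The sketch below assumes a plain 4:2:4 matrix with mixing gains of about 0.707 and deliberately leaves out the phase shifts and other processing that real encoders apply, so it illustrates the principle rather than any Dolby implementation. The function names are hypothetical.

```python
import math

G = 1.0 / math.sqrt(2.0)  # mixing gain of about 0.707 (-3 dB)


def encode(left, center, right, surround):
    """Fold four source samples into the two-channel Lt/Rt pair.

    Center goes equally to both channels; surround goes to both with
    opposite polarity. Real encoders also phase-shift the surround.
    """
    lt = left + G * center + G * surround
    rt = right + G * center - G * surround
    return lt, rt


def decode(lt, rt):
    """Passive decode: derive four outputs from sums and differences."""
    return {
        "L": lt,
        "C": G * (lt + rt),
        "R": rt,
        "S": G * (lt - rt),
    }


# A center-only signal comes back at full level in C and not at all
# in S, but it also leaks into L and R at reduced level.
for name, value in decode(*encode(0.0, 1.0, 0.0, 0.0)).items():
    print(name, round(value, 3))
```

The leakage into neighboring channels is inherent in squeezing four channels into two, and it is what steering decoders such as Pro Logic worked to reduce.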
In mid-decade, two discrete digital surround systems were introduced: one by Dolby, first dubbed AC-3 and now called Dolby Digital, and the other by DTS. While both companies originally launched their systems in movie theaters, they each brought advanced forms of their technologies to consumer applications. Both provide 5.1 separate channels of audio. The Dolby system was adopted as the standard for DVD soundtracks worldwide, as well as for digital television, cable, and satellite applications in the United States and elsewhere.
However, the fundamental problem still remained: few people had multispeaker surround systems installed in their homes. The emergence of Dolby Digital as a standard was the primary impetus to the development of virtualizers designed to render surround sound from 5.1 channels through two speakers.
These systems used HRTF techniques similar to SRS to map the 5.1 discrete channels of sound (Left, Center, Right, Left Surround, Right Surround, and low-frequency effects) into virtual space, creating the perception of sound sources to the sides and behind a listener with no physical speakers there. Different techniques were used by other systems as well, but the goal of all the systems was the same—to reproduce a full surround-sound field from the two-speaker systems that people commonly had in their homes.
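In outline, such a mapping filters each channel with a pair of responses, one per ear, and sums the results into two outputs. The sketch below is a minimal illustration under loud assumptions: the impulse responses in `HRIRS` are invented placeholder numbers, not measured HRTF data, the LFE feed is simply split equally, and no commercial virtualizer is claimed to work this way.

```python
def convolve(signal, impulse_response):
    """Direct-form convolution of two lists of samples."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, sample in enumerate(signal):
        for j, tap in enumerate(impulse_response):
            out[i + j] += sample * tap
    return out


def mix_into(accumulator, addition):
    """Add two sample lists, zero-padding the shorter one."""
    length = max(len(accumulator), len(addition))
    a = accumulator + [0.0] * (length - len(accumulator))
    b = addition + [0.0] * (length - len(addition))
    return [x + y for x, y in zip(a, b)]


# Placeholder (left-ear, right-ear) impulse responses per channel.
# These values are invented for illustration only.
HRIRS = {
    "L": ([1.0, 0.0, 0.0], [0.0, 0.4, 0.0]),
    "C": ([0.7, 0.0, 0.0], [0.7, 0.0, 0.0]),
    "R": ([0.0, 0.4, 0.0], [1.0, 0.0, 0.0]),
    "Ls": ([0.6, 0.0, -0.3], [0.0, 0.2, -0.1]),
    "Rs": ([0.0, 0.2, -0.1], [0.6, 0.0, -0.3]),
    "LFE": ([0.5, 0.0, 0.0], [0.5, 0.0, 0.0]),
}


def virtualize(channels):
    """Map named 5.1 channels to two outputs.

    channels: dict from channel name to a list of samples.
    Returns (left_output, right_output).
    """
    out_left, out_right = [], []
    for name, samples in channels.items():
        h_left, h_right = HRIRS[name]
        out_left = mix_into(out_left, convolve(samples, h_left))
        out_right = mix_into(out_right, convolve(samples, h_right))
    return out_left, out_right


# A click in the left surround channel reaches both outputs, each
# shaped by the response assigned to that ear.
print(virtualize({"Ls": [1.0]}))
```

In a real design the placeholder tables would be replaced by responses derived from measurements, which is where the systems differ from one another.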
What is reality?
The first decision that must be made when designing a multichannel virtualizer is to determine exactly what the listener should experience. Should the sensation of listening to multiple speakers in a room, as in a traditional surround sound system, be re-created, or can one go further to create a more engulfing experience?
To list the worst limitations of the traditional multispeaker surround system:
The sound clearly comes from the speakers, not the environment. Attempts to mitigate this by controlling the dispersion pattern, or sound spread, of the speakers, especially the surrounds, do not deceive the listener.
The sound also seems to stick around the perimeter of the room, in a circle defined by the speaker locations, rather than enveloping the listener.
A continuous phantom image, or the illusion of a sound source coming from a location other than the speakers, cannot be created between the front and the surround speakers—that fact has been known since the days of quad. This is because when a sound source is panned between the front and other speakers, only its amplitude changes.
For example, the noise of a rocket or bullet traveling from front to rear in a film is simply getting louder in the target channel and softer in the channel that it is moving away from. Because no correction is made for the continuously varying HRTFs that occur when a sound moves from front to rear or rear to front, and because our ears sit on the sides and not the front and back of our heads, we cannot accurately perceive a mid-lateral image. In other words, people cannot correctly locate sound sources between the front and rear speakers.
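The amplitude-only panning described above can be sketched in a few lines. The constant-power (sine and cosine) law used here is a common choice and an assumption of this example; the point is simply that the pan position changes two gains and nothing else about the sound.

```python
import math


def pan_front_to_rear(position):
    """Return (front_gain, rear_gain) for a pan position.

    position: 0.0 is fully front, 1.0 is fully rear. A constant-power
    law keeps front_gain**2 + rear_gain**2 equal to 1.
    """
    angle = position * math.pi / 2.0
    return math.cos(angle), math.sin(angle)


# As the source "moves" rearward, one gain falls and the other rises.
# No spectral change is applied at any point along the way.
for step in range(5):
    position = step / 4
    front_gain, rear_gain = pan_front_to_rear(position)
    print(f"position {position:.2f}: front {front_gain:.3f}, rear {rear_gain:.3f}")
```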
A virtual system is not subject to these limitations. With no surround speakers, creation of the rear image relies entirely on the listener's hearing system. Because the ear/brain system is stimulated directly by HRTF-based filtering rather than indirectly through reproduction of sound through speakers, the sense of realism and immersion can be greatly increased. Further, in a virtual system a sound source moving between the front and rear channels is subjected to continuously changing HRTFs, which can stimulate the perception of a phantom lateral image. So rather than try simply to simulate a multispeaker environment, a virtualization system can, in many cases, come closer to reality.
Sweet sounds and sweet spots
However, as with anything, there are limitations. There are basically two issues: sound quality and the sweet spot.
In addition to HRTF processing, virtualization systems use a technique called interaural crosstalk cancellation (ICC). Simply put, it attempts to cancel the crosstalk from the right speaker to the left ear and vice versa, by isolating the sounds from the left speaker to the left ear and the right to the right ear. While this can be effective, it normally results in a very narrow sweet spot. (This is the listening location relative to the speakers where the virtualization effect can be properly heard.) In many systems, if listeners simply turn their heads, or move as little as 3 cm to one side or the other, the entire virtual effect can be destroyed. This is not practical in a home theater, especially if more than one person is in the audience. Some systems also use timing and phase manipulation, which further narrows the sweet spot.
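The fragility of crosstalk cancellation can be illustrated at a single frequency. The sketch below assumes a deliberately simple symmetric model: the path from each speaker to the same-side ear has unit gain, and the path to the opposite ear is weaker and longer. The head-shadow gain of 0.7, the path lengths, and the function names are all illustrative assumptions. The canceller is the inverse of the assumed 2-by-2 path matrix, and the code reports how much unwanted signal remains when the real path differs from the assumed one.

```python
import cmath
import math

SPEED_OF_SOUND_M_S = 343.0  # assumed


def cross_path(frequency_hz, extra_path_m, gain=0.7):
    """Complex gain of the speaker-to-opposite-ear path, relative to
    the same-side path. gain is an assumed head-shadow attenuation."""
    delay_s = extra_path_m / SPEED_OF_SOUND_M_S
    return gain * cmath.exp(-2j * math.pi * frequency_hz * delay_s)


def canceller(x):
    """Inverse of the symmetric path matrix [[1, x], [x, 1]]."""
    scale = 1.0 / (1.0 - x * x)
    return [[scale, -x * scale], [-x * scale, scale]]


def residual_crosstalk_db(frequency_hz, design_path_m, actual_path_m):
    """Level at the left ear of the right-channel program relative to
    the left-channel program, in dB, when the canceller was designed
    for one cross-path length and the listener presents another."""
    x_design = cross_path(frequency_hz, design_path_m)
    x_actual = cross_path(frequency_hz, actual_path_m)
    c = canceller(x_design)
    wanted = c[0][0] + x_actual * c[1][0]
    leaked = c[0][1] + x_actual * c[1][1]
    if abs(leaked) == 0.0:
        return float("-inf")  # perfect cancellation
    return 20.0 * math.log10(abs(leaked) / abs(wanted))


# Listener exactly where the design assumed, then with the cross path
# 3 cm longer (an illustrative change, evaluated at 5 kHz).
print(residual_crosstalk_db(5000.0, 0.10, 0.10))
print(residual_crosstalk_db(5000.0, 0.10, 0.13))
```

In this toy model a few centimeters of path change at 5 kHz is enough to turn near-perfect cancellation into leakage comparable in level to the wanted signal, which is consistent with the narrow sweet spot described above.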
With multichannel virtualizers, only two speakers do the job of multispeaker surround systems
It should be pointed out, though, that even multispeaker surround systems have a sweet spot. Sit too near any of the speakers, and the surround effect will be grossly distorted. By relying on HRTF-based processing and judiciously avoiding the use of interaural crosstalk cancellation and time and phase manipulation, a virtualizer can be designed that provides the largest possible sweet spot, does not compromise sound quality, and creates a realistic, immersive sound field. Further, this type of system typically requires less processing power.
What was that?
The first time someone listens to a virtualizer, the experience is bound to be surprising. The only loudspeakers are in front, yet sound can also be heard from the sides and from the rear. Some well-designed systems that use interaural crosstalk cancellation and avoid the sound quality problems mentioned earlier have a stupendous effect. The system designed by the Cooper-Bauck Corp., in Tempe, Ariz., and now marketed by Harman International Industries Inc., of Northridge, Calif., as VMAx, can persuade listeners in the sweet spot that they have entered another acoustic space entirely. Other systems, such as the 3-D positioning of QSurround from Qsound Labs Inc., of Calgary, Canada, can literally make the listener feel someone is creeping up from behind, though this effect is limited to a very small sweet spot.
Home theater virtualizers envelop the listener in the acoustic space and provide interesting experiences, such as creating the perception that airplanes, rockets, and bullets are whizzing right by the listener's ears and zooming off into the "horizon." Sound can appear to come from far behind, even beyond the physical confines of the room. To test this effect, have a subject stand with her back against the rear wall of the room, where she will still perceive sounds coming from behind her.
Surround nirvana for the masses
Under the right conditions, multichannel virtualizers score on two counts. They need only two speakers to create an experience superior to that of standard multispeaker surround systems. (The speakers are placed as they would be for standard stereo, but set flat, parallel to the wall, instead of at an angle.) And they can deliver surround sound to the huge number of consumers who do not have, and are not interested in obtaining, full surround systems.
As virtualizers continue to improve, they should ultimately be heard as on a par with, or even superior to, multispeaker systems. Through direct stimulation of the ear/brain system rather than through intermediate devices, virtualization systems can conjure up a very exciting approximation of reality while preserving the sound quality of the original source.
Tekla S. Perry, Editor
About the Author
ALAN KRAEMER is vice president of engineering for SRS Labs, in Santa Ana, Calif. He was previously president of Sierra Digital Productions, a company involved with the production and engineering of jazz and classical recordings. He is a committed pianist and composer as well as a fanatical road cyclist.
To Probe Further
Greater detail on SRS Labs Inc. technologies is obtainable from http://www.srslabs.com. Information on Henry Kloss may be found at http://www.beacham.com/henry_kloss_907.html
A comprehensive FAQ (frequently asked questions) concerning Dolby Digital is located at http://www.dolby.com/tech/l.br.0011.DDFAQ.html To find out about Digital Theater Sound (DTS), see the site at http://www.dtsonline.com/.