The recording and reproduction of sound have long been a source of fascination for scientists and engineers. The phonograph, invented by Thomas Edison in 1877, was arguably the first device that could both record and reproduce an acoustic signal. Although it was one of the most remarkable inventions of its time, the phonograph did not attempt to convey any spatial characteristics of the recorded sound field; it simply recorded sound and replayed the signal through a single acoustic source. The resulting monophonic sound field could not reproduce the original sound’s spatial variability.
Over the next several decades, researchers made various attempts to replicate the spatial characteristics of a recorded sound field, without much practical progress. In the 1930s, however, Alan Blumlein invented stereo sound. One of his techniques involved recording a sound field with two microphones, one equally sensitive to sound waves from all directions and one with a figure-eight directivity pattern. When the signals from the two microphones are played back over a pair of carefully spaced loudspeakers, a centrally located listener experiences, at least to some extent, the illusion of directional sound.
The invention of ambisonics in the 1970s by Michael Gerzon, Peter Fellgett, and Peter Craven extended Blumlein’s technique. As Franz Zotter and Matthias Frank explain in the opening pages of Ambisonics: A Practical 3D Audio Theory for Recording, Studio Production, Sound Reinforcement, and Virtual Reality, first-order ambisonics allows a recording studio to use four coincident microphones. One microphone is uniformly sensitive and three use figure-eight directivity patterns aligned to the x-, y-, and z-axes of a Cartesian coordinate system. Appropriate processing of those four microphone signals, along with a six-loudspeaker playback system, yields an approximate reconstruction of the directions of arrival of the recorded sound.
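As a rough illustration of the four-microphone idea described above, the following sketch computes what an ideal omnidirectional capsule and three ideal figure-eight capsules would pick up from a single plane-wave source. It is a minimal sketch under stated assumptions, not the book's formulation: the function name `encode_first_order` is hypothetical, the omnidirectional channel is left unscaled (some conventions attenuate it), and angles are taken as azimuth counterclockwise from the x-axis and elevation above the horizontal plane.

```python
import math


def encode_first_order(sample, azimuth_deg, elevation_deg=0.0):
    """Return (w, x, y, z) for one sample of a plane-wave source.

    Assumed convention (illustrative only): w is the unscaled
    omnidirectional signal; x, y, z are ideal figure-eight pickups
    aligned with the Cartesian axes.
    """
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    w = sample                                  # equal sensitivity in all directions
    x = sample * math.cos(az) * math.cos(el)    # figure-eight along the x-axis
    y = sample * math.sin(az) * math.cos(el)    # figure-eight along the y-axis
    z = sample * math.sin(el)                   # figure-eight along the z-axis
    return (w, x, y, z)


# A source on the positive y-axis excites only the w and y channels.
w, x, y, z = encode_first_order(1.0, azimuth_deg=90.0)
```

Each figure-eight channel is simply the source signal weighted by the cosine of the angle between the source direction and that microphone's axis, which is why the four signals together carry the direction of arrival.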
The book’s first chapter concisely describes those microphone techniques and related approaches and provides the reader with a solid framework for understanding the basic concepts behind ambisonics. Chapter 2 covers numerous experiments that capture how well listeners perceive a change in the direction of arrival of sound as the amplitudes of the inputs to the loudspeakers are varied, or “panned” in the terminology of acoustics. Ville Pulkki’s vector-base amplitude panning (VBAP) technique is the subject of chapter 3. It is a straightforward and successful approach to determining the amplitudes of the inputs to arbitrarily arranged loudspeakers in order to generate the illusion of sound coming from a location between the loudspeakers.
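The basic idea behind VBAP can be sketched for the simplest case of two loudspeakers in the horizontal plane. The sketch below is an illustration under assumptions, not the book's or Pulkki's own code: the function name `vbap_pair_gains` is hypothetical, the loudspeakers and source are represented by unit vectors at the given azimuths, and the gains are normalized to constant total power.

```python
import math


def vbap_pair_gains(source_deg, spk1_deg, spk2_deg):
    """Return (g1, g2) so that g1*l1 + g2*l2 points toward the source.

    l1 and l2 are unit vectors toward the two loudspeakers. The gains
    are scaled so that g1**2 + g2**2 == 1 (an assumed normalization).
    """
    def unit(deg):
        a = math.radians(deg)
        return (math.cos(a), math.sin(a))

    p = unit(source_deg)
    l1 = unit(spk1_deg)
    l2 = unit(spk2_deg)

    # Solve the 2x2 system [l1 l2] [g1 g2]^T = p with Cramer's rule.
    det = l1[0] * l2[1] - l1[1] * l2[0]
    if abs(det) < 1e-12:
        raise ValueError("loudspeaker directions must not be collinear")
    g1 = (p[0] * l2[1] - p[1] * l2[0]) / det
    g2 = (l1[0] * p[1] - l1[1] * p[0]) / det

    # Normalize to unit power.
    norm = math.hypot(g1, g2)
    return (g1 / norm, g2 / norm)


# A source midway between loudspeakers at +30 and -30 degrees
# receives equal gains on both.
g1, g2 = vbap_pair_gains(0.0, 30.0, -30.0)
```

A negative gain from this calculation indicates that the source direction lies outside the arc spanned by the pair, in which case a different pair of loudspeakers would normally be chosen.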
The meatiest material, higher-order ambisonics, is covered in chapter 4. Zotter and Frank introduce the reader to the spherical harmonic decomposition of the sound field, which is used to determine the loudspeaker inputs. They also explore the relationship of VBAP to higher-order ambisonics and describe various refinements that can improve the listener’s experience of a sound recording. Subsequent chapters deal with signal flow effects, ambisonic microphone arrays, and compact loudspeaker arrays.
Zotter and Frank include an extremely useful bibliography of research in the field and provide many practical and free software options. The authors also helpfully describe several experiments on where listeners perceive a sound to originate when it is generated by the various recording techniques and their associated panning functions. However, they barely discuss the extent to which various recording strategies are able to replicate the physical properties of the recorded sound field, particularly in the earlier chapters. Chapter 6, on higher-order ambisonic microphones, comes closest to providing some physical insight; it presents the classical problem of a rigid sphere scattering sound waves, shows the steps necessary to reproduce a sampled version of the sound field, and offers some helpful simulations of the resulting pressure distributions.
Ambisonics makes some useful contributions, but the picture is far from complete. There is still room to provide an even deeper understanding of those approaches to sound recording and reproduction. The subject will doubtless continue to fascinate scientists and engineers for some years to come.
Philip Nelson is a professor of acoustics in the Institute of Sound and Vibration Research at the University of Southampton.