The fire at Notre Dame Cathedral in Paris in 2019 and the one at Gran Teatro La Fenice opera hall in Venice in 1996 are reminders of the fragile nature of humanity’s cultural heritage. Fortunately, acoustic measurements, numerical simulations, and digital reconstructions can recover—and to some extent preserve—the sound of humanity’s great architectural sites. What’s more, those techniques provide a way for archaeologists, historians, musicologists, and the general public to experience the lost acoustics of damaged or destroyed places.
A room’s sound
The objective of architectural acoustics is to achieve the best sound quality possible in a space, whether it’s a theater, church, concert hall, or recording studio. The propagation of sound is subject to several factors. We speak of direct sound to represent the propagation path of a sound that reaches listeners without any obstacles in its way. Indoors, the presence of walls changes the direction of the acoustic energy. The new sound paths correspond to different distances and interactions with the architecture.
When a source emits a sound, the result is direct sound and reflections that are picked up by a receiver. The collection of those reflections over time constitutes the room’s acoustic response. When the source stops producing sound, listeners perceive the gradual decay as reverberation, and the time it takes for the sound to fade away is the reverberation time.
Before the formal theory of room acoustics was developed, the ancient Greeks put their experiential knowledge of sound into practice. Amphitheaters, such as the archaeological site of Tindari, Sicily, shown in figure 1, are representations of that work. If a roof and surrounding walls were added to an open-air Greek theater, the effect on listeners would be striking: The acoustic energy would be directed downward but dispersed in time. Reflected sounds would take different paths in the room and reach our ears at different times.
The acoustic quality of a room therefore depends, to first approximation, on the reverberation time, which can vary depending on the room’s construction and decoration materials, the position of the sound source, and the positions of listeners. The reverberation must be adapted to the room’s use. When the voice is central, a short reverberation time is preferred so that words remain intelligible. If the reverberation is too long, actors need to slow their rate of speech to remain understandable.1 Whereas an ordinary living room may reverberate for a fraction of a second, a concert hall’s reverberation time is typically around two seconds, and a cathedral’s can exceed six seconds.
At the end of the 19th century, American physicist Wallace Clement Sabine laid the foundations of architectural acoustics by establishing a formula for calculating the reverberation time based on a room’s volume and the acoustic properties of materials present. Today’s architectural projects are developed using computer-aided design software, which allows engineers and acousticians to model the projects in two and three dimensions—sometimes including animations to provide virtual explorations of a space. Starting with architectural documents that detail the geometric characteristics of the performance hall and assumptions about the acoustic properties of its building materials, acousticians use those models to carry out predictive studies of the sound qualities of the future hall. The studies help them anticipate possible defects and propose modifications to architects. The same kinds of studies are used to understand the past acoustics of historical sites.
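As a rough illustration of Sabine's relationship between volume, absorption, and reverberation time, the sketch below uses the commonly quoted metric form, RT60 ≈ 0.161 V / A, where V is the room volume in cubic meters and A is the total absorption (each surface area multiplied by its absorption coefficient). The hall dimensions and the coefficients are hypothetical values chosen only for the example, not measurements of any real site.

```python
SABINE_CONSTANT = 0.161  # s/m, metric form of Sabine's formula


def sabine_rt60(volume_m3, surfaces):
    """Estimate reverberation time (s) from volume and surface absorption.

    surfaces: iterable of (area_m2, absorption_coefficient) pairs.
    """
    total_absorption = sum(area * alpha for area, alpha in surfaces)
    if total_absorption <= 0:
        raise ValueError("total absorption must be positive")
    return SABINE_CONSTANT * volume_m3 / total_absorption


# Hypothetical 20 m x 15 m x 10 m hall; coefficients are illustrative guesses.
hall_surfaces = [
    (300.0, 0.30),  # floor with seating
    (300.0, 0.10),  # ceiling
    (700.0, 0.05),  # four walls
]
print(f"Estimated RT60: {sabine_rt60(3000.0, hall_surfaces):.2f} s")
```

Swapping in more absorbent materials raises the total absorption and shortens the estimate, which is the kind of what-if comparison a predictive study relies on, although real studies use frequency-dependent coefficients and far more detailed models.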
Scientists have used physical and digital reconstruction methods for decades. But it’s only recently that computational technologies have improved the quality and resolution of acoustic modeling sufficiently for researchers to tackle large-scale and complicated spaces. Sound in properly simulated spaces can be perceptually comparable to actual, on-site recordings.2 Once created, the models can be modified to test acoustic conditions under different architectural configurations, source and listener positions, and use contexts. Acoustic simulations can be a powerful tool for historical studies; they provide researchers with a sensory presentation of sound that had only been available earlier through descriptions.3,4
The transparent nature of acoustics is ideal for studying the layered nature of history in architectural sites. A geometrically accurate 3D model that incorporates the acoustic properties of relevant construction materials allows engineers to predict how the acoustics of a space will evolve as its geometry or materials change over time. In fact, changes that occur through the introduction, modification, or removal of material because of decay, renovation, or natural catastrophe can be incorporated into the model from documented evidence. Acousticians also examine changes in how a site is used in the context of the society’s culture and customs over time.
Most existing approaches to numerical simulation for acoustic-heritage studies use one or more geometric-acoustic modeling techniques.5 In geometric acoustics, sound is assumed to travel in straight lines, similar to a ray of light, and to propagate along paths calculated from its interaction with the three-dimensional model geometry of the environment; see, for example, the numerical simulation in figure 2. The result is a close approximation to the acoustic response of the modeled environment for a given set of conditions. However, results at low frequencies—typically below 500 Hz—are often less accurate than at high frequencies, as geometric-acoustic methods are less able to model the wave-like behavior of sound.
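To make the straight-line assumption concrete, here is a minimal sketch of one geometric-acoustic technique, the image-source method, restricted to first-order reflections in a rectangular room. It is not the method used in any particular study cited here. The room size, the source and receiver positions, and the speed of sound (343 m/s) are all assumptions for illustration, and absorption, scattering, and higher-order reflections are ignored.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at about 20 °C


def distance(p, q):
    """Euclidean distance between two 3D points."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def first_order_arrivals(room, source, receiver):
    """Return sorted (delay_s, label) pairs for the direct path and the
    six first-order wall reflections in a box-shaped room.

    room: (length, width, height) in meters, with one corner at the origin.
    """
    arrivals = [(distance(source, receiver) / SPEED_OF_SOUND, "direct")]
    for axis, name in enumerate("xyz"):
        for wall_position, side in ((0.0, "min"), (room[axis], "max")):
            # Mirror the source across the wall to get its image source.
            image = list(source)
            image[axis] = 2.0 * wall_position - source[axis]
            delay = distance(image, receiver) / SPEED_OF_SOUND
            arrivals.append((delay, f"{name}-{side} wall"))
    return sorted(arrivals)


# Hypothetical 10 m x 8 m x 5 m room.
for delay, label in first_order_arrivals((10.0, 8.0, 5.0),
                                         (2.0, 3.0, 1.5),
                                         (7.0, 4.0, 1.5)):
    print(f"{label:>12}: {delay * 1000:.1f} ms")
```

Each reflection is treated as a straight path from a mirrored copy of the source, so the list of delays is a crude skeleton of the room's acoustic response. Wave effects such as diffraction are absent, which is why such methods lose accuracy at low frequencies.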
The wave-behavior limitation is an area of active research; an alternative approach is to use a numerical method to directly solve the underlying equations of wave motion.6 Although more accurate, such methods are too expensive computationally to offer a complete solution. Computer programs can take hours or weeks to reach final results across the full audio bandwidth for a large complex space. Hence, hybrid methods that combine geometric-acoustic, wave-based, and other statistical approaches are also an area of current research.7
Auralization is the sound equivalent of visualization. The auditory presentation of an acoustical numerical model, through auralization over headphones or speaker arrays, lets users experience a site’s acoustic properties as if they were actually there.8
The acoustics of a space is immersive and—due to the nature of auditory perception—egocentric, or individual, in contrast to the visual perception of an object, which can be viewed from outside. Today’s technologies for creating an acoustical space use ideas and methods from virtual-reality (VR) systems, and are often integrated with visual rendering, as images have been shown to affect auditory perception.9 Two approaches have emerged: One uses dedicated rooms equipped with large loudspeaker arrays and projection screens surrounding the listener, and the other uses VR helmets or head-mounted displays (HMDs) like the one worn in figure 3.
An HMD is equipped with binaural headphones and a device that tracks the position and orientation of the wearer’s head. With that head-tracking functionality, it can often achieve a more stable reproduced soundscape. But for both loudspeakers and HMDs, the first level of realism requires processing three degrees of freedom (DOF)—the movement of the listener’s head around three axes. A higher level of realism can be obtained with six DOF, which accounts for the wider movement of a listener around the virtual space.
To accommodate simple, three-DOF rendering, the acoustic characteristics (either measured or modeled) are typically represented as a higher-order Ambisonic (HOA) multichannel stream. Ambisonics is a hierarchical, spatial audio format that decomposes the sound field into spherical harmonic signals, which are then decoded to the listener’s speaker setup or directly to headphones, with optional head tracking.10 (For some background on the technique, read the book review in Physics Today, June 2020, page 52.) That decomposition is an efficient way to represent the spatial distribution of sound at a fixed point. The higher the order, the greater the number of spherical harmonics, and the more accurate the spatial information.
Today, the balance between realistic reproduction and computational complexity is usually found using third-order Ambisonics, which requires 16 audio channels. For three-DOF video rendering, a simple panoramic camera will do the job, capturing photos or videos in the so-called equirectangular format, shown in figure 4. That type of image can also be easily generated by a computer in the case of purely virtual rendering.
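The channel count quoted above follows from the number of spherical harmonics up to a given order N, which is (N + 1)². The sketch below computes that count and, as a simple example of the encoding idea, pans a single sample into first-order Ambisonics. The channel ordering (W, Y, Z, X), the SN3D-style gains, and the angle convention (azimuth counterclockwise from straight ahead) are assumptions for this example. Other conventions exist, and the function names are hypothetical.

```python
import math


def ambisonic_channel_count(order):
    """Number of spherical-harmonic channels for a given Ambisonic order."""
    if order < 0:
        raise ValueError("order must be non-negative")
    return (order + 1) ** 2


def encode_first_order(sample, azimuth_deg, elevation_deg):
    """Encode one mono sample as first-order channels [W, Y, Z, X].

    Assumes azimuth measured counterclockwise from the front and
    elevation measured upward from the horizontal plane.
    """
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    w = sample                                  # omnidirectional component
    y = sample * math.sin(az) * math.cos(el)    # left-right
    z = sample * math.sin(el)                   # up-down
    x = sample * math.cos(az) * math.cos(el)    # front-back
    return [w, y, z, x]


for order in range(5):
    print(f"order {order}: {ambisonic_channel_count(order)} channels")
print(encode_first_order(1.0, 90.0, 0.0))  # a source directly to the left
```

Higher orders add more directional channels in the same spirit, which is why spatial accuracy and channel count grow together.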
The HOA stream is then “decoded” for either loudspeaker array or headphone reproduction. Similarly, equirectangular images must also be processed for either projector screens surrounding the listener or an HMD screen. The result is a virtual space in which the listener can freely look around in every direction, having the impression of actually being inside the scene. Precise temporal and spatial matching between visual rendering and acoustic rendering is crucial for ensuring consistency between the senses and to avoid nausea from VR sickness.
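Spatial matching between the two senses comes down to agreeing on where a given direction falls in the image. As a hedged sketch, the function below maps a direction to continuous pixel coordinates in an equirectangular image. It assumes the image center corresponds to straight ahead, azimuth increases toward the left, and elevation increases upward. A real system may use a different convention, and the function name is hypothetical.

```python
def direction_to_pixel(azimuth_deg, elevation_deg, width, height):
    """Map a direction to (x, y) coordinates in an equirectangular image.

    Assumes azimuth 0 is the image center and positive azimuth is to the
    left; elevation +90 maps to the top row and -90 to the bottom.
    """
    if not -90.0 <= elevation_deg <= 90.0:
        raise ValueError("elevation must lie between -90 and 90 degrees")
    # Wrap azimuth into [-180, 180) so any input angle is accepted.
    wrapped = (azimuth_deg + 180.0) % 360.0 - 180.0
    x = ((0.5 - wrapped / 360.0) * width) % width
    y = (0.5 - elevation_deg / 180.0) * height
    return x, y


# Hypothetical 4096 x 2048 panorama: where does a source 90 degrees
# to the listener's left appear?
print(direction_to_pixel(90.0, 0.0, 4096, 2048))
```

If the audio renderer and the image projection disagree on such a convention, a sound source and its picture end up in different places, which is exactly the mismatch the rendering chain has to avoid.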
HMDs use head-tracked binaural rendering of the HOA audio stream, which allows the VR system to take advantage of individualized filtering functions that represent the acoustical response of each listener’s head and ears. Those individual filters are called head-related transfer functions, and they can be measured in special labs or numerically simulated from geometric data. The use of such filters in tandem with low-latency head-tracking devices provides such a realistic experience that many listeners cannot distinguish reproduced sounds from real ones.
Six-DOF systems are still experimental, particularly for rendering existing acoustical spaces. They require capturing the sound simultaneously with numerous microphones scattered around the area where virtual listeners can move. Although some laboratories are now attempting that approach, it’s mostly used with computer-simulated acoustical renderings,11 in which software simultaneously computes the sound field at hundreds of different listening points.
For loudspeaker rendering, most systems offer only two DOF for movements—or five DOF in total—allowing listeners to move freely in the room but keeping their heads at the same elevation above the floor. That arrangement is used in the “Museum of Reproduced Sound” in Parma, Italy. Sala Bianca, a room in the museum, has an array of 189 loudspeakers hidden in the walls. In the case of rendering over HMDs, the latest generation of devices can reliably provide six-DOF tracking of position and orientation of a listener’s head. Most applications use software, such as Unity and Unreal Engine, originally made for video games.
Rooms do not sound on their own; they require a sound source. And presentation can have a significant effect on the listener’s experience. The choice of appropriate source material—intermittent coughs from a virtual theater audience, say, or footsteps in a hall—helps put the site in its cultural and societal context; it may also connect the site to its surroundings. But sounds used in reconstructions should be recorded “dry,” with no surrounding acoustic environment. The use of an anechoic room, like that shown in figure 5, achieves that objective by capturing only the direct sound, which can then be injected into the virtual reconstruction.
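Injecting a dry recording into a virtual reconstruction amounts, in its simplest form, to convolving the dry signal with the room's impulse response. The sketch below shows that operation on toy data in plain Python. The impulse response is invented for the example (a direct sound followed by two weaker reflections), and the function name is hypothetical. Practical tools work on full-length recordings with FFT-based convolution and spatial, multichannel responses.

```python
def auralize(dry_signal, impulse_response):
    """Convolve a dry signal with a room impulse response.

    Both inputs are lists of samples at the same sample rate. The result
    is normalized so that its largest absolute value is 1.
    """
    if not dry_signal or not impulse_response:
        raise ValueError("both inputs must contain at least one sample")
    wet = [0.0] * (len(dry_signal) + len(impulse_response) - 1)
    for i, sample in enumerate(dry_signal):
        if sample == 0.0:
            continue  # silent samples contribute nothing
        for j, tap in enumerate(impulse_response):
            wet[i + j] += sample * tap
    peak = max(abs(value) for value in wet)
    return [value / peak for value in wet] if peak > 0 else wet


# Toy data: a single click, and an invented response with a direct sound
# (first tap) and two decaying reflections arriving later.
dry_click = [1.0, 0.0, 0.0, 0.0]
toy_response = [1.0, 0.0, 0.0, 0.6, 0.0, 0.0, 0.0, 0.3]
print(auralize(dry_click, toy_response))
```

Because any reverberation already present in the recording would be convolved with the room as well, the source material needs to be as dry as possible.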
Taking into account the natural behavior of sources, such as the movements of actors or musicians on stage, improves the realism of reconstructions.12 Modern systems are no longer limited to static acoustic sources. Systems that can capture or render sound sources rotating around three axes are readily available. And the next generation of systems will increasingly allow for six DOF, in which the sound sources can also move freely in space.
Reflections on historical reconstructions
Exploring cultural heritage through acoustic digital reconstruction provides historians, musicologists, and others with a perspective not available using more established research methods. Furthermore, it brings a powerful means of communicating and delivering memorable, meaningful, and most importantly, multisensory experiences. The effectiveness of digital reconstruction is evident through the range of projects undertaken throughout Europe—see the box on page 35.
Over the past couple of decades, the techniques of archaeological acoustics have become prevalent in historical research and in exploring the lost acoustic environments of significant but now damaged or destroyed buildings or performance venues. We outline a few recent and ongoing projects in Europe.
In Re-sounding Falkland (https://resoundingfalkland.com/), artists David Chapman and Louise K. Wilson collaborated with the Falkland estate in Scotland in 2010 to explore how sound can be used to understand and interpret the history of existing landscapes. The most significant challenge in the project was to create a three-dimensional model and an auralization of the Temple of Decision, a now-ruined structure on a hill overlooking the estate. Little is known about that 19th-century folly, and the acoustic reconstruction was informed by what the artists discerned from the ruins that remain, the fragments of documented evidence that could be found, and what is known about the construction of similar buildings.4
In France, the ECHO project, spanning the topics of voice, acoustics, and theatrical listening, examined the acoustical evolution of several important theater sites and, between 2013 and 2018, served historians as a tool for testing hypotheses.3 Virtual reconstructions of the acoustics at Abbey St Germain-des-Prés and Notre Dame cathedral were carried out, with the Notre Dame simulations available as virtual concert “fly-throughs” for public demonstrations. See the image above and the video link at http://www.lam.jussieu.fr/Projets/GhostOrchestra.
Bretez is an interdisciplinary project that explores the 18th-century Paris soundscape in a 3D setting, with historically inspired audio and visual reconstructions. Based on historical archives, maps, and other sources, it aims to construct an authentic multisensory immersive environment. See the YouTube video, https://youtu.be/YP__1eHeyo4. Other recent works have used both physical and numerical reconstructions, such as with the prehistoric Lascaux cave.14 The experimental virtual archaeological acoustics (EVAA) project is developing a real-time dynamic simulator for musicians to experience the acoustics of historic performance spaces (see the image on the title page of this article).
In Italy in 1996, the Gran Teatro La Fenice in Venice burned to the ground. Two months beforehand, acoustical measurements had been made of the opera house,15 work that set the stage for a reconstruction project intended to preserve the theater’s original acoustical properties. A few years later, the Waves IR1 project captured the acoustical fingerprints of more than 100 theaters, churches, and caverns all around the world with the aim of preserving their unique acoustical behavior for posterity.16
The ERATO project is another milestone for archaeoacoustic research. Its goal was to analyze and compare the acoustical properties of ancient Greek and Roman theaters.17 The SIPARIO project, at the University of Parma, aims to create real-time acoustical renderings of historical theaters for performers, who can then sing or play an instrument in a virtual environment that re-creates the theater’s visual and acoustic presence.
Generally, auralization is only one particular, static representation of how an environment sounds. It’s a snapshot in time, and the final result depends on the limitations of the recording systems and techniques as much as on the design criteria applied to the project. In the development of a model for any heritage space, the auralization is only as good as the research documenting its history.
Perhaps most importantly, our perception of a particular auralization reflects our own contemporary culture and our own prior experience of sound events. As with many historical conceptualizations, the final results are both created from and perceived through our modern state of mind.
Funding has been provided by the European Union’s Joint Programming Initiative on Cultural Heritage project PHE, The Past Has Ears.13
Brian Katz is a CNRS research director at the Sorbonne University, Institute d’Alembert in Paris, France. Damian Murphy is a professor of sound and music computing in the department of electronic engineering at the University of York in the UK. Angelo Farina is a professor in the department of engineering and architecture at the University of Parma in Italy.