Auralizations have become more prevalent in architectural acoustics. Auralizations in listening tests are typically presented in a uni-modal fashion (audio only). In everyday life, however, one perceives complex multi-modal information. Multi-sensory research has shown that visuals can influence auditory perception, as in the McGurk and ventriloquist effects. Few studies, though, have investigated the influence of visuals on room acoustic perception, and in the majority of previous studies visual cues were represented by photographs, either with or without a visible source. Previously, a virtual reality framework combining a visible animated source in a virtual room with auralizations was conceived, enabling multi-modal assessments. The framework is based on BlenderVR for scene-graph and visual rendering, with MaxMSP handling real-time audio rendering of 3rd-order higher-order Ambisonic (HOA) room impulse responses (RIRs) decoded to head-tracked binaural. CATT-Acoustic TUCT was used to generate the HOA RIRs. Using this framework, two listening tests were carried out: 1) a repeat of a prior audio-only test comparing auralizations with dynamic voice directivity to those with static source orientation, and 2) a test comparing dynamic voice auralizations with visuals that were coherent or incoherent with respect to seating position. Results indicate that judgements of several room acoustic attributes are influenced by the presence of visuals.
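The audio path described above can be sketched as follows. This is a minimal illustration, not the framework's actual DSP: it convolves a dry source signal with first-order Ambisonic RIRs (the framework uses 3rd order; first order keeps the rotation matrix small), rotates the sound field by the listener's head yaw, and decodes to binaural with a crude two-cardioid decoder. The channel ordering, rotation convention, and decoder are illustrative assumptions.

```python
import numpy as np
from scipy.signal import fftconvolve

def render_binaural(dry, rirs, yaw):
    """Sketch of head-tracked Ambisonic-to-binaural rendering.

    dry:  (n,) mono dry source signal
    rirs: (4, m) first-order RIRs, ACN-ordered [W, Y, Z, X] (assumed)
    yaw:  listener head yaw in radians
    Returns (left, right) binaural signals.
    """
    # Convolve the dry signal with each Ambisonic RIR channel.
    w, y, z, x = (fftconvolve(dry, r) for r in rirs)
    # Rotate the horizontal dipole components about the vertical axis
    # to compensate for head orientation (convention is an assumption).
    c, s = np.cos(yaw), np.sin(yaw)
    y_rot = s * x + c * y
    # A real decoder would use HRTF-based virtual loudspeakers; here we
    # use two opposing cardioids aimed at +/-90 degrees for brevity.
    left = 0.5 * (w + y_rot)
    right = 0.5 * (w - y_rot)
    return left, right
```

In the actual framework, this decoding runs in real time in MaxMSP and at 3rd order (16 channels), with head orientation supplied continuously by the tracker.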
