The variation of interaural level difference (ILD) with direction and frequency is particularly complex and convoluted. The purpose of this work was to determine a set of parametric equations that can be used to calculate ILDs continuously at any value of frequency and azimuth in the horizontal plane. They were derived by fitting equations to ILDs derived from the azimuthal-dependence data tabulated by Shaw and Vaillancourt [(1985). J. Acoust. Soc Am. 78, 1120–1123] and assuming left-right symmetry. The equations are shown to fit those data to an overall RMS error less than 0.5 dB.
To complement some current projects on spatial hearing, it was necessary to have a set of closed-form equations from which interaural time and level differences (ITDs and ILDs, respectively) could be calculated continuously for any value of azimuth or frequency in the horizontal plane. For ITDs, it is typical to use the expression (rθ + r sin θ)/c, where r is the radius of the head, θ is the azimuth of the sound source, in radians, and c is the speed of sound (Woodworth, 1938; Blauert, 1997). This expression is derived from a simple geometrical model in which it is assumed that the head is spherical in shape (Duda and Martens, 1998), the two ears are diametrically opposite each other, and the source of sound is sufficiently far away for the wavefronts to be planes. It has proved remarkably popular, even though the equation is not an exact fit to any particular individual's ITDs nor does it take into account the frequency dependence of ITDs. However, the acoustics of ILDs are far more complicated, and to the best of our knowledge there is no published set of expressions for ILDs that allow accurate calculations. The best simple approximation that we know of is that published by van Opstal (2016), which is , where f is the frequency in Hz. This equation captures two overall phenomena pertaining to the shape of the function of ILD across azimuth and frequency: the dependence on azimuth essentially follows the first half (0-π) of a sinusoid, and the overall magnitude of the ILDs increases with frequency.
An ILD is the across-ear difference in spectral magnitude of the head-related transfer function (HRTF). As HRTFs are generally measured experimentally only at discrete angles [see Li and Peissig (2020) for a recent review of measurement methods], there has been substantial work on computational methods to generate continuous values at any angle. These methods include numerical interpolation [e.g., Brinkmann et al. (2015), Gamper (2013), and Grijalva et al. (2017)], acoustic simulations [e.g., Katz (2001), Kreuzer et al. (2009), and Mokhtari et al. (2011)], neural networks (Qi and Tao, 2017), or weighted sums of various forms with, in general, considerable analytic complexity, including diffraction [e.g., Blauert (1997)], Fourier-Bessel series [e.g., Zhang et al. (2009)], principal components [e.g., Kistler and Wightman (1992), Middlebrooks and Green (1992), and Mokhtari et al. (2019)], Slepian functions [e.g., Bates et al. (2015)], spherical harmonics [e.g., Evans et al. (1998)], spherical wavelets (Hu et al., 2019), and from the analysis of a spherical head [Duda and Martens (1998), who also include pseudocode]. Nevertheless, none of these cited papers has published any tables of coefficients or weights that are suitable for direct numerical calculation by their readers.
To resolve this, we set out to develop continuous parametric equations that could be fitted to published ILD data. We derived these from tabulated data reported by Shaw and Vaillancourt (1985) [henceforth S-V], which quantify “self-consistent” families of curves that best fitted the azimuthal-dependence (see below) data from twelve previous studies (Shaw, 1974). These curves were described as “syntheses,” were visually fitted and are internally consistent with smooth progressions, and represent the earlier data as a whole, though being neither true averages across listeners nor the functions of any particular individual or manikin. Shaw (1974) noted that individuals would generally be within 1 dB below 500 Hz but 5 dB or more at about 5 kHz. Though they are not any individual's ILDs, for our purposes this was much outweighed by the convenience of using smoothed, tabulated, and published data. Overall, we set ourselves the pragmatic target of fitting these within 1 dB or less within the range of 200 to 10 000 Hz. We avoided higher frequencies as they can be particularly idiosyncratic, and we could not find a simple set of expressions that were suitable.
We derived ILDs from S-V's data on the “azimuthal dependence” of at-ear level at 33 frequencies from 200 to 10 000 Hz (spaced at 50, 100, 200, and 500 Hz) and 24 azimuths from 0 to 3450 (spaced at 150; see Fig. 1(A)]; we excluded data at the irregularly spaced frequencies of 320, 630, 1250, 2900, 6300 Hz and, as noted, above 10 500 Hz. The azimuthal dependence is the difference in level at the left eardrum between a sound presented at an azimuth θ and a sound presented at 0° (i.e., directly in front of the listener). We then calculated the ILD θ,f as the difference between the azimuthal dependence at θ and at −θ (or, in terms of S-V's tabulated data, at 360°-θ); note that we assumed left-right symmetry in ILDs.
To derive the fitting equations, we started with a “principal” sinusoid dependent on azimuth θ, capturing the overall dependence on azimuth that essentially follows the first half (0-π) of a sinusoid. Next, for some frequencies, the ILD function is front-back asymmetric, in that the ILD at an azimuth of θ is considerably different to the ILD at 180-θ. We represented this by varying asymmetrically the magnitude of the principal sinusoid by a factor dependent on the argument 2θ. Next, for many frequencies, there is a large perturbation, or “dip,” at azimuths around 90°. These were represented by adding a normal distribution dependent on azimuth θ, whose parameters of magnitude, mean and standard deviation were themselves functions of frequency. A second perturbation around 5000 Hz required another normal distribution. In sum, our general equation for ILD as a function of frequency and azimuth was
where Pf is the magnitude of the principal sinusoid at frequency f; Af is the magnitude of the front-back asymmetry; D90,f, μ90,f, and σ90,f are the parameters of the normal distribution accounting for the “dip”; D5k,f, μ5k, and σ5k are the parameters for the normal distribution accounting for the 5-kHz perturbation; and N is the standard normal distribution. We found the best-fitting values of each parameter separately for each value of frequency, using the generalised reduced gradient (GRG) algorithm within Excel's “Solver” tool. Figure 2 shows these values (circles). Their dependencies on frequency are mostly smooth but in some cases are of considerable extent and complexity.
We next needed to determine further equations that would give continuous values of these parameters as a function of frequency. Trial-and-error work demonstrated that we could fit the functions of these parameters across frequency by summing a linear function of f * = log10(f) with four perturbing normal distributions [see Eqs. (2)–(8) below]. The best-fitting values of the parameters of these sums were found by again running the GRG nonlinear algorithm. We set limits of –10 ≤ a ≤ 10, σ ≤ 0.03, and the various values of μ were at least 0.1 log(kHz) apart. The lines in Fig. 2 show these fits.
The parameters that we found are listed in Eqs. (2)–(8) below. The various perturbations to the main functions (p1600, etc.) are indexed by the centre-frequency of their peak, in Hz. These seven equations, together with the model set out in Eq. (1), form our continuous parametric equations for ILDs in the horizontal plane. The units are decibels for Pf, D90,f, D5k,f, and the magnitudes of their normal-distribution perturbations, and the units are log(kHz) for the means and standard-deviations of those normal-distributions, while Af is a dimensionless number. Computational programs are available as supplementary data.1
Figure 1(B) shows the resulting ILDs calculated from these equations. Note that we used a much finer “mesh” of frequency by azimuth in order to check there were no issues or instabilities at intermediate values. Figure 1(C) shows the difference between our calculations and the ILDs derived from the S-V data, using the same y-scale as panels (A) and (B). Figure 1(D) shows the same differences using a finer y-scale of ±1 dB. It can be seen that the difference is mostly less than half a decibel. The overall RMS error, across the entire grid of 13 azimuths by 33 frequencies, was 0.40 dB, with 97% of the values being less than 1 dB. The RMS error was generally slightly smaller at low frequencies than high ones (RMS = 0.3 dB for 0.2–2 kHz but 0.5 dB for 8–10 kHz).
We report here a set of equations that enable ILDs to be calculated for any angle of azimuth up to 10 kHz. The equations are derived from tabulated data reported by Shaw and Vaillancourt (1985) and assume left-right symmetry.
To recap, ILDs are represented first as a sinusoid dependent on azimuth θ, whose magnitude is front-back asymmetric and varies with frequency f. This front-back asymmetry depends on 2θ and also varies with frequency. Two major corrections (or “perturbations”) are then added to the ILDs to represent effects at azimuths around 90° and 5 kHz, which are both modelled as normal distributions dependent on azimuth and frequency. The magnitudes of all of these effects are modelled as linear functions of log(f), corrected by four normal distributions. We appreciate that the equations are somewhat daunting at first sight, and are free of any acoustical theory. Further, though the assumption of left-right symmetry is often made in experimental work—for instance, the popular Gardner and Martin (1995) dataset is symmetric—it is known that HRTFs are not perfectly symmetric [e.g., Zhong et al. (2013)]. An analysis of the CIPIC database of individual HRTFs (Algazi et al., 2001) shows that, in terms of ILD, the asymmetry is on the order of 2 dB up to about 4 kHz and then gradually increases to about 6 dB (also, the standard deviation across listeners of the mean ILDs across azimuths follows essentially the same pattern). On the other hand, these equations conveniently give a value for ILD at any desired azimuth and frequency due to the continuous, rather than discrete, numerical modelling of the data.
We tested various reduced equations to determine their effectiveness, but we have not found any simpler way to represent the complexity in the frequential and azimuthal dependence of the ILD to the accuracy we desired. Excluding the 5-kHz perturbation barely affected the overall RMS error (0.5 dB)—though it doubled the RMS error at 5-kHz to 1 dB—but additionally excluding the 90° dip increased the overall RMS error to 1.6 dB. Further excluding the front-back asymmetry, so the ILD was modelled by the primary sinusoid only, increased the error to 1.9 dB. Excluding the various normal corrections in Eq. (2)–(8), i.e., modelling each parameter as a linear function only, increased the RMS error to 3 dB, and with the mean ILD being about 2 dB greater than the data. In this regard, the equation of Van Opstal (2016) does slightly better than this much-simplified reduction of our equations, as it gives a RMS error of 2.7 dB on our data, and with a mean ILD about 2 dB less than the data. Thus, that equation can be recommended if a front-back symmetric ILD is sufficient. However, if the front-back asymmetry matters (or if the major perturbations that we identified are required), then the equations reported here should be suitable for the purpose of calculating ILDs for any value of azimuth or frequency in the horizontal plane for use in computational research.
This work was supported by the Medical Research Council (Grant No. MR/S002898/1). We thank the Editor and the two expert reviewers for their comments, which improved the manuscript greatly.
See supplementary material at https://www.scitation.org/doi/suppl/10.1121/10.0004261 or Graetzer (2021) for matlab and Python functions implementing the equations.