The loudness recruitment associated with cochlear hearing loss increases the perceived amount of amplitude modulation (AM), called “fluctuation strength.” For normal-hearing (NH) subjects, fluctuation strength “saturates” when the AM depth is high. If such saturation occurs for hearing-impaired (HI) subjects, they may show poorer AM depth discrimination than NH subjects when the reference AM depth is high. To test this hypothesis, AM depth discrimination of a 4-kHz sinusoidal carrier, modulated at a rate of 4 or 16 Hz, was measured in a two-alternative forced-choice task for reference modulation depths, mref, of 0.5, 0.6, and 0.7. AM detection was assessed using mref = 0. Ten older HI subjects, and five young and five older NH subjects were tested. Psychometric functions were measured using five target modulation depths for each mref. For AM depth discrimination, the HI subjects performed more poorly than the NH subjects, both at 30 dB sensation level (SL) and 75 dB sound pressure level (SPL). However, for AM detection, the HI subjects performed better than the NH subjects at 30 dB SL; there was no significant difference between the HI and NH groups at 75 dB SPL. The results for the NH subjects were not affected by age.
I. INTRODUCTION
The patterns of amplitude modulation (AM) in speech and other sounds convey important information (Drullman et al., 1994a; Shannon et al., 1995; Moore, 2014). Hearing loss usually changes the way that AM is perceived and that may contribute to the difficulties experienced by hearing-impaired (HI) people in understanding speech, especially when background sounds are present (Plomp, 1978; Moore, 2003). Most people with sensorineural hearing loss experience loudness recruitment, a more rapid than normal growth of loudness with increasing sound level once the elevated detection threshold is exceeded (Fowler, 1936; Steinberg and Gardner, 1937). Loudness recruitment may be partly explained by the loss of compression on the basilar membrane that is associated with cochlear hearing loss (Moore and Oxenham, 1998; Robles and Ruggero, 2001), although it may also be caused by altered transduction between the inner hair cells and auditory neurons (Kale and Heinz, 2010) and changes in short-term neural adaptation (Scheidt et al., 2010). Whatever the cause, physiological data show that the representation of envelope information in the auditory nerve (Kale and Heinz, 2010, 2012) and midbrain (Zhong et al., 2014) is amplified in animals with sensorineural hearing loss.
One might expect that loudness recruitment would be associated with a better than normal ability to detect AM. This has often been found, especially for stimuli with low sensation levels (SLs; Lüscher and Zwislocki, 1949; Moore et al., 1992; Füllgrabe et al., 2003; Ernst and Moore, 2012; Sek et al., 2015). However, better AM detection by HI subjects has not always been found, perhaps because hearing impairment can be associated with increased “internal noise” (Zwislocki and Jordan, 1986; Stone and Moore, 2014a); the deleterious effects of internal noise on AM detection may offset the beneficial effects of the magnified internal representation of the AM.
The AM patterns in speech usually have AM depths that are well above the detection threshold. When the AM is easily detectable, use of the information conveyed by the AM depends on the ability to discriminate differences in the AM pattern and depth. For sounds with a supra-threshold AM depth, the perceived amount of fluctuation is greater for HI than for normal-hearing (NH) ears (Moore et al., 1996). In other words, loudness recruitment has the effect of exaggerating the perceived “fluctuation strength” (Fastl, 1983). The possible effects of this on the ability to detect changes in AM depth have not been extensively explored.
Takahashi and Bacon (1992) measured modulation masking (elevation of the threshold for detecting signal AM when a masker AM is present; Bacon and Grantham, 1989; Houtgast, 1989) when the signal and masker modulation were applied to independent equal-level white noise carriers and the modulated carriers were then added. Ten young NH subjects and 30 older adults (divided into three age groups) with mild hearing loss were tested. In one condition, masker modulation with a rate of 8 Hz and a depth of 1 (100% modulation) was applied to one carrier, and the threshold for detecting 8-Hz AM applied to the other carrier was measured (the relative phase of the masker and signal AM was 90°), so the task became discrimination of modulation depth. Thresholds could not be measured in this condition for some subjects and unmeasurable thresholds tended to occur more for the older HI subjects than for the young NH subjects, although the effect was not significant.
Lorenzi et al. (1997) measured modulation masking patterns for four NH and three HI subjects, using a white noise carrier. They included a condition where the masker and signal had the same modulation rate (100 Hz), in which case the task became modulation depth discrimination, with a reference modulation depth, mref, of 0.5. There was no clear difference in AM depth discrimination for the NH and HI subjects.
Sek et al. (2015) measured thresholds for detecting an increase in AM depth of a 4000-Hz sinusoidal carrier, using mref = 0.4, and modulation rates of 4 and 16 Hz. They tested six NH subjects and nine HI subjects. Performance tended to be worse for the HI than for the NH subjects, although the effect failed to reach statistical significance.
Overall, there appear to be no data strongly supporting the hypothesis that AM depth discrimination is worse for HI than for NH subjects when mref is large. However, we are not aware of any experiments that assessed this using values of mref > 0.5. The discrimination of AM depth may be poorer for HI than for NH subjects for large mref because loudness recruitment has the effect of increasing the fluctuation strength in both intervals of a forced-choice trial. The sensation of fluctuation strength approaches an asymptotic value (“saturates”) when the AM depth is large but still well below 100% (Fastl, 1983). For large AM depths, AM depth discrimination may be relatively poor for HI subjects because the fluctuation strength is close to its asymptotic value for both the reference AM depth and the incremented AM depth. The present paper assesses whether HI subjects do indeed have higher AM depth-discrimination thresholds than NH subjects when the reference modulation depth is relatively large. The HI subjects were all aged 53 or older. To assess the effects of age on AM depth discrimination, two sub-groups of NH subjects were tested, one younger and one older.
II. METHOD
A. Subjects
Ten NH subjects (six female) were tested. Five of the NH subjects were relatively young, with ages ranging from 18 to 42 yr (mean = 31 yr). The other five were older, with ages ranging from 68 to 70 yr (mean = 69 yr). All NH subjects had audiometric thresholds ≤20 dB hearing level (HL) for frequencies from 250 to 6000 Hz. Their audiometric thresholds at the frequency of the target carrier (4000 Hz) were ≤10 dB HL for nine subjects and 20 dB HL for the remaining subject. Ten HI subjects (five female) were tested, with ages from 53 to 80 yr. They had typical sloping hearing losses. Their losses at the carrier frequency of 4000 Hz ranged from 40 to 60 dB HL.
B. Stimuli and procedure
The AM was applied to a 4000-Hz sinusoidal carrier. The AM rate, fm, was either 4 or 16 Hz. These rates were chosen to be within the range of modulation rates that are assumed to be important for speech perception (Drullman et al., 1994b; Shannon et al., 1995). The NH subjects were tested at carrier levels of both 75 dB sound pressure level (SPL) and 30 dB SL, whereas the HI subjects were tested only at 30 dB SL. To set the SL appropriately, the absolute threshold of each ear of each subject at 4000 Hz was measured using a two-alternative forced-choice procedure and a three-down one-up adaptive method with feedback. The two intervals in which the signal might occur were marked by lights. The signal duration was 500 ms, including 20-ms raised-cosine ramps, and the silent gap between the two intervals was 300 ms. The step size in signal level was 5 dB until four reversals occurred and 2 dB thereafter. Twelve reversals were obtained and the mean level at the last eight was taken as the threshold. The ear with the lower detection threshold at 4000 Hz was tested in the main experiment. Since the average absolute threshold of the HI subjects at 4000 Hz was about 46 dB SPL, a level of 30 dB SL corresponded to about 76 dB SPL for the HI group, which is similar to the 75 dB SPL used for the NH group.
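The adaptive rule used for the absolute-threshold measurement can be sketched as follows. This is a minimal illustration rather than the software used in the study: the function name, the starting level of 60 dB, and the convention of recording a reversal at the level where the track changes direction are assumptions, while the step sizes and reversal counts follow the description above.

```python
def three_down_one_up(respond, start_level=60.0, big_step=5.0, small_step=2.0,
                      n_big=4, n_reversals=12, n_average=8):
    """Track signal level (dB) with a three-down one-up rule.

    respond(level) must return True when the response at that level is correct.
    The start level is a hypothetical value; it is not given in the text.
    Returns the mean level at the last n_average reversals.
    """
    level = start_level
    run = 0            # consecutive correct responses since the last level change
    direction = 0      # -1 after a decrease, +1 after an increase, 0 at the start
    reversals = []
    while len(reversals) < n_reversals:
        if respond(level):
            run += 1
            if run < 3:
                continue          # the level drops only after three correct in a row
            new_direction = -1
        else:
            new_direction = +1    # a single error raises the level
        run = 0
        if direction != 0 and new_direction != direction:
            reversals.append(level)
        # 5-dB steps until four reversals have occurred, 2-dB steps thereafter
        step = big_step if len(reversals) < n_big else small_step
        direction = new_direction
        level += new_direction * step
    return sum(reversals[-n_average:]) / n_average


# Idealised listener who is always correct at or above 40 dB and always wrong below
threshold_estimate = three_down_one_up(lambda level: level >= 40.0)
```

With a real listener, `respond` would present a two-interval trial and return whether the chosen interval was correct. A listener who never errs would never produce a reversal, so a practical implementation would also need a trial limit.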
A two-alternative forced-choice task was used to measure AM detection thresholds and AM depth discrimination thresholds. The carrier was gated on for a duration of 1000 ms (including 20-ms raised-cosine ramps) with a 300-ms silent interval between the two carrier bursts on each trial. The AM was present throughout the carrier. The starting phase of the AM was selected randomly from one of eight values 0, 45, 90, 135, 180, 225, 270, 315°, and the randomization was different across the two intervals of a trial.
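One way of generating a single carrier burst of this kind is sketched below. The text does not give the stimulus equation, so the standard form [1 + m sin(2πfm t + φ)] sin(2πfc t) is assumed here, no level correction for the change in power with m is applied, and the function and parameter names are illustrative only.

```python
import math
import random


def am_tone(m, fm, fc=4000.0, dur=1.0, ramp=0.020, fs=44100, phase_deg=None):
    """Return one sinusoidally amplitude-modulated tone burst as a list of samples.

    m: modulation depth (0 to 1); fm: AM rate in Hz; fc: carrier frequency in Hz.
    If phase_deg is None, the AM starting phase is drawn from the eight values
    0, 45, ..., 315 degrees, as described in the text.
    """
    if phase_deg is None:
        phase_deg = random.choice(range(0, 360, 45))
    phi = math.radians(phase_deg)
    n = int(round(dur * fs))
    n_ramp = int(round(ramp * fs))
    samples = []
    for i in range(n):
        t = i / fs
        envelope = 1.0 + m * math.sin(2 * math.pi * fm * t + phi)
        gate = 1.0
        if i < n_ramp:                       # raised-cosine onset ramp
            gate = 0.5 * (1 - math.cos(math.pi * i / n_ramp))
        elif i >= n - n_ramp:                # raised-cosine offset ramp
            gate = 0.5 * (1 - math.cos(math.pi * (n - 1 - i) / n_ramp))
        samples.append(gate * envelope * math.sin(2 * math.pi * fc * t))
    return samples


# Reference and target intervals of one hypothetical trial (mref = 0.5, mtarget = 0.9)
reference = am_tone(0.5, fm=4.0)
target = am_tone(0.9, fm=4.0)
```

Calling the function separately for each interval gives independently randomized starting phases, in line with the procedure described above.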
In one interval, selected at random, AM with a reference modulation depth, mref, was present. In the other interval the AM depth was increased by Δm, giving an AM depth for the target, mtarget, of mref + Δm. The subject was asked to indicate, via a virtual button on the screen, the interval in which the sound appeared to fluctuate more. After the subject responded, feedback was provided via a light indicating the correct interval. Within a block of 55 trials, the value of mref was fixed at one of four values: 0, 0.5 (−6.0 dB when expressed as 20 log10 m), 0.6 (−4.4 dB), and 0.7 (−3.1 dB). When mref was 0, the task was to detect AM rather than to discriminate AM depth. Within each block, five values of mtarget were used. The value of mtarget started at a value that was chosen to make the task relatively easy. The starting value of mtarget was 0.2 (−14.0 dB) for mref = 0, 0.9 (−0.9 dB) for mref = 0.5, and 1.0 for the other two values of mref. To help subjects to learn what to listen for, the first five trials in a block all used the starting value of mtarget. Then the value was changed from the largest value to the smallest over five successive trials, and this sequence was repeated every five trials. Thus, the subject received a reminder “easy” stimulus every five trials.
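The decibel values quoted above and the ordering of trials within a block can be reproduced with a short sketch. The helper names are hypothetical, and the list of target depths in the example is purely illustrative, since the text gives only the starting value for each mref.

```python
import math


def depth_db(m):
    """Express a modulation depth m in decibels as 20 log10(m)."""
    return 20 * math.log10(m)


def block_sequence(targets, n_trials=55):
    """Order of target depths within one block.

    targets must be ordered from largest (easiest) to smallest. The first five
    trials use the starting value; the five values are then cycled from largest
    to smallest, so an easy reminder trial occurs every five trials.
    """
    sequence = [targets[0]] * 5
    while len(sequence) < n_trials:
        sequence.extend(targets)
    return sequence[:n_trials]


# Illustrative (not the actual) target depths for mref = 0.5
example_targets = [0.9, 0.8, 0.7, 0.65, 0.6]
example_block = block_sequence(example_targets)
```

With 55 trials per block, the five starting trials are followed by ten repetitions of the five-value sequence.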
For each subject, testing was completed for one AM rate before testing with the other AM rate. The order of testing the two rates was balanced across subjects. The order of mref was 0.5, 0.6, 0.7, and 0. The NH subjects were tested using all four values of mref first at 75 dB SPL and then at 30 dB SL, and thereafter alternating between the two levels. Five blocks were run for each combination of AM rate, level and mref, and percent correct scores were averaged across blocks for each value of mref and each value of mtarget. Each subject was given at least one training block with mref = 0.2 and mtarget = 0.9 before testing proper commenced.
Stimuli were generated digitally at a sample rate of 44.1 kHz. The signal was D/A-converted by an M-Audio Delta 44 audio interface (Cumberland, RI) and passed through a manual attenuator (Hatfield 2125, Hatfield, UK) to one earpiece of a Sennheiser HD580 headset (Wedemark, Germany).
III. RESULTS
The mean results for each group, AM rate, and level are shown in Fig. 1. Error bars depict the standard error of the mean across the ten subjects of each group. The dotted line indicates the 50% correct rate that would be obtained by guessing. For the three higher values of mref, i.e., in the AM depth discrimination task, the HI subjects (circles) performed more poorly than the NH subjects at both 30 dB SL (downward-pointing triangles) and 75 dB SPL (upward-pointing triangles). The difference of about 10–20 percentage points was rather consistent across modulation frequency and mref, and was also reasonably consistent across mtarget, except for the smallest value used. The performance of each group tended to worsen with increasing mref. In the AM detection task, the HI subjects performed considerably better than the NH subjects at 30 dB SL. The difference between the two groups was especially large for the AM rate of 16 Hz and mtarget = 0.12, where the HI subjects scored about 97% correct and the NH subjects scored only about 55% correct. For the two largest target modulation depths, the HI subjects achieved near-perfect performance.
Three-way analyses of variance (ANOVAs) were calculated on the arcsine-transformed percent correct scores separately for each value of mref, as the values of mtarget differed across mref. Separate ANOVAs were conducted for the comparison of the two groups at 30 dB SL (Table I) and at 75 dB SPL (Table II). Hearing status (HI or NH) was a between-subjects factor, while mtarget (five levels) and fm (4 Hz or 16 Hz) were within-subjects factors. The outcomes confirm the differences between HI and NH subjects described above. The only analysis with no significant main effect of hearing status was for mref = 0 at 75 dB SPL. There was a significant main effect of fm in every analysis, with the mean percentage correct always being higher for fm = 16 Hz. A similar effect was found by Sek et al. (2015). As expected, there was a highly significant effect of mtarget in every analysis. The interaction between hearing status and modulation rate was significant in both analyses for the AM detection task and in none of the analyses for the AM depth discrimination task. The HI subjects performed markedly better for fm = 16 Hz than for fm = 4 Hz when mref was equal to 0, while the NH subjects did not show such a strong effect of AM rate.
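The exact form of the arcsine transform is not stated. A common choice for proportions, assumed in the sketch below, is the arcsine of the square root of the proportion correct, which stretches the scale near 0% and 100% and so reduces the dependence of the variance on the mean.

```python
import math


def arcsine_transform(percent_correct):
    """Arcsine-square-root transform of a percent-correct score (assumed form)."""
    p = percent_correct / 100.0
    return math.asin(math.sqrt(p))


# Equal 5-point differences in percent correct are unequal after transformation
mid_range_step = arcsine_transform(55) - arcsine_transform(50)
near_ceiling_step = arcsine_transform(100) - arcsine_transform(95)
```

Other variants of the transform (for example, with small-sample corrections) exist, so this should be read as one plausible implementation only.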
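Table I. Outcomes of the ANOVAs comparing the HI and NH subjects at 30 dB SL, conducted separately for each value of mref.

Factor | df1 | df2 | F (mref = 0.5) | p (mref = 0.5) | F (mref = 0.6) | p (mref = 0.6) | F (mref = 0.7) | p (mref = 0.7) | F (mref = 0) | p (mref = 0) |
---|---|---|---|---|---|---|---|---|---|---|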
Hearing status | 1 | 18 | 9.18 | <0.01 | 11.6 | <0.01 | 10.3 | <0.01 | 22.0 | <0.001 |
fm | 1 | 18 | 11.5 | <0.01 | 18.8 | <0.001 | 22.5 | <0.001 | 13.2 | <0.01 |
mtarget | 4 | 72 | 94.4 | <0.001 | 50.6 | <0.001 | 35.8 | <0.001 | 52.2 | <0.001 |
Hearing status × fm | 1 | 18 | 0.00 | 0.98 | 0.49 | 0.49 | 0.03 | 0.88 | 30.5 | <0.001 |
Hearing status × mtarget | 4 | 72 | 2.79 | <0.05 | 2.82 | <0.05 | 2.84 | <0.05 | 10.9 | <0.001 |
fm × mtarget | 4 | 72 | 0.72 | 0.58 | 1.50 | 0.21 | 3.76 | <0.01 | 2.67 | <0.05 |
Hearing status × fm × mtarget | 4 | 72 | 1.44 | 0.23 | 3.09 | <0.05 | 0.82 | 0.52 | 5.29 | <0.001 |
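Table II. Outcomes of the ANOVAs comparing the HI and NH subjects at 75 dB SPL, conducted separately for each value of mref.

Factor | df1 | df2 | F (mref = 0.5) | p (mref = 0.5) | F (mref = 0.6) | p (mref = 0.6) | F (mref = 0.7) | p (mref = 0.7) | F (mref = 0) | p (mref = 0) |
---|---|---|---|---|---|---|---|---|---|---|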
Hearing status | 1 | 18 | 4.75 | <0.05 | 13.5 | <0.01 | 19.5 | <0.001 | 0.21 | 0.66 |
fm | 1 | 18 | 23.5 | <0.001 | 18.7 | <0.001 | 19.3 | <0.001 | 10.6 | <0.01 |
mtarget | 4 | 72 | 105 | <0.001 | 80.1 | <0.001 | 79.5 | <0.001 | 91.0 | <0.001 |
Hearing status × fm | 1 | 18 | 0.45 | 0.51 | 0.00 | 0.99 | 0.79 | 0.39 | 4.61 | <0.05 |
Hearing status × mtarget | 4 | 72 | 1.54 | 0.20 | 6.96 | <0.001 | 19.6 | <0.001 | 0.11 | 0.98 |
fm × mtarget | 4 | 72 | 2.00 | 0.10 | 3.68 | <0.01 | 3.28 | <0.05 | 8.76 | <0.001 |
Hearing status × fm × mtarget | 4 | 72 | 0.30 | 0.88 | 0.69 | 0.60 | 1.16 | 0.34 | 1.01 | 0.41 |
Figure 2 shows a comparison of the two age groups of NH subjects. In the AM depth discrimination task, performance was similar for the younger subjects (solid lines) and older subjects (dashed lines) at both 30 dB SL (downward-pointing triangles) and 75 dB SPL (upward-pointing triangles). For AM detection (mref = 0), the older subjects tended to perform more poorly than the younger subjects, especially for fm = 4 Hz, but there were marked individual differences among the older subjects.
Table III shows the outcomes of four-way ANOVAs on the arcsine-transformed percent correct scores for the NH subjects only, conducted separately for each mref, with age group as a between-subjects factor and mtarget, fm, and sound level as within-subjects factors. Age group was not significant in any of the ANOVAs, either as a main effect or in any two-way interaction.
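Table III. Outcomes of the four-way ANOVAs for the NH subjects only, conducted separately for each value of mref.

Factor | df1 | df2 | F (mref = 0.5) | p (mref = 0.5) | F (mref = 0.6) | p (mref = 0.6) | F (mref = 0.7) | p (mref = 0.7) | F (mref = 0) | p (mref = 0) |
---|---|---|---|---|---|---|---|---|---|---|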
Age group | 1 | 8 | 0.21 | 0.66 | 0.00 | 0.95 | 0.13 | 0.73 | 1.95 | 0.20 |
fm | 1 | 8 | 16.5 | <0.01 | 7.25 | <0.05 | 7.93 | <0.05 | 0.00 | 1.00 |
Level | 1 | 8 | 8.27 | <0.05 | 1.74 | 0.22 | 0.70 | 0.43 | 28.6 | <0.001 |
Δm | 4 | 32 | 112 | <0.001 | 74.1 | <0.001 | 59.3 | <0.001 | 48.6 | <0.001 |
Age group × fm | 1 | 8 | 0.09 | 0.78 | 0.02 | 0.88 | 0.37 | 0.56 | 2.85 | 0.13 |
Age group × level | 1 | 8 | 0.00 | 0.95 | 0.44 | 0.53 | 0.96 | 0.36 | 0.11 | 0.75 |
Age group × mtarget | 4 | 32 | 0.38 | 0.82 | 1.73 | 0.17 | 0.13 | 0.96 | 1.75 | 0.16 |
fm × level | 1 | 8 | 0.74 | 0.42 | 3.07 | 0.12 | 2.14 | 0.18 | 2.25 | 0.17 |
fm × mtarget | 4 | 32 | 0.40 | 0.81 | 0.83 | 0.52 | 3.77 | <0.05 | 1.79 | 0.16 |
Level × mtarget | 4 | 32 | 0.88 | 0.49 | 4.11 | <0.01 | 10.23 | <0.05 | 10.5 | <0.001 |
Age group × fm × level | 1 | 8 | 0.02 | 0.91 | 0.70 | 0.43 | 1.42 | 0.27 | 1.63 | 0.24 |
Age group × fm × mtarget | 4 | 32 | 0.15 | 0.96 | 1.31 | 0.29 | 0.61 | 0.66 | 0.34 | 0.85 |
Age group × level × mtarget | 4 | 32 | 0.58 | 0.68 | 3.40 | <0.05 | 0.61 | 0.66 | 0.01 | 1.00 |
fm × level × mtarget | 4 | 32 | 1.96 | 0.12 | 0.77 | 0.56 | 10.2 | <0.001 | 4.44 | <0.01 |
Age group × fm × level × mtarget | 4 | 32 | 0.77 | 0.55 | 0.26 | 0.90 | 0.89 | 0.48 | 0.60 | 0.67 |
IV. DISCUSSION
A. Comparison with previous results
The psychometric functions can be used to estimate the modulation depth of the target required to obtain a given percentage correct, such as 79%, as would be obtained using a three-down one-up adaptive tracking procedure (Levitt, 1971). This corresponds to a detectability index d′ of 1.14, which is close to the value tracked in several previous studies. Expressed in decibels as 20 log10 m, the 79% correct point for the NH subjects for AM depth discrimination was roughly 2 dB above mref for fm = 16 Hz and 3 dB above mref for fm = 4 Hz. The thresholds for the 16-Hz rate were similar to those reported by Ewert and Dau (2004) for similar stimuli, although they expressed the thresholds as the Weber fraction for a change in modulator power. Corresponding thresholds for the HI subjects tested in our study were roughly 3 dB for fm = 16 Hz and 5 dB or more for fm = 4 Hz. The HI subjects did not reach the 79% criterion for the higher values of mref. The resulting differences between HI and NH subjects were of the same order of magnitude as the 1.7 dB difference found by Sek et al. (2015) for mref = 0.4, though the difference was not statistically significant in their study. Our results for mref = 0.5 differ from those obtained by Lorenzi et al. (1997), since they found no clear difference in AM depth discrimination for NH and HI subjects. However, two of their HI subjects had sloping hearing losses (greater at high frequencies), and since a broadband carrier was used, performance might have been based on listening to the lower frequencies in the carrier, where the hearing loss was small. Also, performance in their task might have been partly limited by the inherent random amplitude fluctuations in the noise carrier.
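The correspondence between 79% correct and d′ = 1.14 can be checked under the usual equal-variance Gaussian assumption for a two-alternative forced-choice task, for which d′ = √2 z(Pc), where z is the inverse of the standard normal cumulative distribution. The function name below is illustrative.

```python
from statistics import NormalDist


def dprime_2afc(proportion_correct):
    """d' for a two-alternative forced-choice task (equal-variance Gaussian model)."""
    return 2 ** 0.5 * NormalDist().inv_cdf(proportion_correct)


# Proportion correct tracked by a three-down one-up rule: p**3 = 0.5
tracked_proportion = 0.5 ** (1 / 3)

dprime_at_79 = dprime_2afc(0.79)
dprime_at_tracked = dprime_2afc(tracked_proportion)
```

The three-down one-up rule converges on a proportion correct of about 0.794 (Levitt, 1971), so the d′ value at the exact tracked point is marginally higher than the value for 0.79.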
The better performance of the HI than of the NH subjects for AM detection when tested at the same relatively low SL agrees with previous results (Lüscher and Zwislocki, 1949; Buus et al., 1982; Ernst and Moore, 2012; Sek et al., 2015). The effect was especially dramatic for the 16-Hz rate. At 30 dB SL the NH subjects scored barely above chance for the 16-Hz rate for values of mtarget up to 0.12, while the HI subjects achieved about 86% correct for mtarget = 0.08. For fm = 16 Hz, the HI subjects scored close to ceiling for the three highest values of mtarget, while for fm = 4 Hz, the scores only reached about 86% correct for the two largest values of mtarget. The NH subjects also performed better for fm = 16 Hz than for fm = 4 Hz, but only when tested at 75 dB SPL. This pattern of results is similar to that obtained by Ernst and Moore (2012), who found better AM detection for fm = 10 Hz than for fm = 2 Hz. The effect of AM rate might be related to the greater number of AM cycles occurring in the fixed-duration stimulus as AM rate increases (Sheft and Yost, 1990).
B. Effect of age
The comparison of results for the younger and older NH subjects in our study suggests that age has no clear effect on AM depth discrimination. The results for the AM detection task showed a trend for the younger NH subjects to perform better than the older NH subjects, although this effect was not significant. Füllgrabe et al. (2015) compared AM detection using a 4-kHz sinusoidal carrier for younger and older subjects with matched (normal) audiograms. They found a small but significant effect of age, performance being poorer for the older subjects. However, the detection of AM imposed on noise carriers shows no clear effect of age (Takahashi and Bacon, 1992; Schoof and Rosen, 2014). Overall, it appears that age does not influence AM depth discrimination and, at most, has a minor effect on AM detection, in contrast to sensitivity to temporal fine structure, for which age appears to have a substantial influence (Ross et al., 2007; Moore et al., 2012; Moore, 2014).
Models of AM detection and discrimination (Dau et al., 1997; Ewert and Dau, 2000, 2004; Paraouty et al., 2016) typically have the following stages: (1) a simulation of peripheral processing, for example, bandpass filtering, followed by half-wave rectification and lowpass filtering to extract the envelope; (2) an additive noise that is used to account for AM detection and/or intensity discrimination; (3) a multiplicative noise that is used to account for AM depth discrimination when mref is well above the detection threshold; (4) a decision mechanism, based on an ideal detector, a template-matching mechanism, or the signal-to-noise ratio in the envelope domain. Our findings, taken together with earlier results, suggest that the hypothetical additive noise is at most slightly affected by age (since age has at most a minor influence on AM detection) while the multiplicative noise is unaffected by age (since age has no effect on AM depth discrimination).
C. Effects of hearing impairment
When comparing the results for the NH and HI subjects, the possible influence of off-frequency listening should be considered. In a normal cochlea, the input-output function of the basilar membrane is highly compressive for places tuned close to the signal frequency, but is more linear for places tuned well above the signal frequency (Robles and Ruggero, 2001). Hence, when detecting AM, the high-frequency side of the excitation pattern may be more informative than the central part or the low-frequency side of the pattern. Nevertheless, it seems likely that NH subjects do not detect AM solely using the high-frequency side of the excitation pattern; rather, they combine information from all audible parts of the excitation pattern (Florentine and Buus, 1981; Moore and Sek, 1994). As a result, AM detection and discrimination are probably at least partly affected by compressive processing in the cochlea.
For HI subjects, the input-output function of the basilar membrane becomes more linear (Robles and Ruggero, 2001), so information from all parts of the excitation pattern is approximately equally informative. It is likely that the HI subjects used in our study had reduced frequency selectivity (Pick et al., 1977; Glasberg and Moore, 1986), which on its own would lead to a broader excitation pattern. Counteracting this, most of the HI subjects had a sloping hearing loss, which would limit the audible range of the high-frequency side of the excitation pattern.
Because of these factors, the audible ranges of the excitation patterns may well have differed somewhat for the NH and HI subjects, even when tested at the same SL. However, the better AM detection of the HI than of the NH subjects when tested at the same SL was probably caused mainly by a loss of compression rather than by a difference in the audible extent of the excitation pattern.
The finding of poorer AM depth discrimination for the HI than for the NH subjects is broadly consistent with the hypothesis described in the Introduction, based on fluctuation strength. For NH subjects, fluctuation strength for a 4-Hz AM rate increases gradually as m is increased to about 0.2 (peak-to-valley ratio, PVR, of about 4 dB), grows rapidly as m is increased from 0.2 up to about 0.8 (PVR of about 19 dB), and then saturates, increasing only slightly for m between 0.8 and 1 (Fastl, 1983, Fig. 3). Consistent with this, the NH subjects performed more poorly at discriminating AM depth as the value of mref was increased, and for mref = 0.7, scores were well below ceiling even when the target had the maximum possible value of m = 1.
Consider now how fluctuation strength may have affected the results for the HI subjects. For the average hearing loss of the HI subjects at 4 kHz (about 50 dB), loudness recruitment probably had an effect similar to increasing the “internal” representation of the PVR in dB by a factor of about 1.7 (Moore et al., 1996; Moore and Glasberg, 2004). Thus, a PVR of 9.5 dB (corresponding to mref = 0.5) for the impaired ears would lead to an internal representation similar to that produced by a PVR of about 16 dB (9.5 × 1.7 = 16, corresponding to mref = 0.73) for the normal ears. For the NH subjects, when mref = 0.7, the mean score for the 4-Hz AM rate when the target m was 0.88 (PVR = 24 dB) was about 71% correct at 30 dB SL and 74% correct at 75 dB SPL. To get approximately the same internal AM depth for the target in the impaired ears, mtarget would need to be about 0.67 (PVR = 14 dB). One would therefore predict performance of 71%–74% correct for the impaired ears for mref = 0.5 and mtarget = 0.67. The mean obtained score for the 4-Hz AM rate for this condition was about 69%, reasonably close to the predicted value. For the 16-Hz AM rate, for the NH subjects, when mref = 0.7, the mean score for mtarget = 0.88 was about 78% correct at 30 dB SL and 81% correct at 75 dB SPL. One would therefore predict performance of 78%–81% correct for the impaired ears for mref = 0.5 and mtarget = 0.67. The mean obtained score for this condition was about 77%, again reasonably close to the predicted value. Overall, the results are consistent with the hypothesis that AM depth discrimination was based on differences in perceived fluctuation strength, and that the change in fluctuation strength at threshold is similar for NH and HI subjects.
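The arithmetic in this paragraph can be retraced with a few lines of code. The sketch assumes that the PVR of a sinusoidally modulated carrier is 20 log10[(1 + m)/(1 − m)], which reproduces the values quoted above, and that recruitment simply multiplies the PVR in dB by a fixed factor of 1.7. The function names are illustrative.

```python
import math


def pvr_db(m):
    """Peak-to-valley ratio in dB of the envelope for modulation depth m (m < 1)."""
    return 20 * math.log10((1 + m) / (1 - m))


def depth_from_pvr(pvr):
    """Modulation depth corresponding to a peak-to-valley ratio in dB."""
    ratio = 10 ** (pvr / 20)
    return (ratio - 1) / (ratio + 1)


def equivalent_normal_depth(m_impaired, factor=1.7):
    """Depth in a normal ear giving the same internal PVR as m_impaired."""
    return depth_from_pvr(pvr_db(m_impaired) * factor)


def equivalent_impaired_depth(m_normal, factor=1.7):
    """Depth in an impaired ear giving the same internal PVR as m_normal."""
    return depth_from_pvr(pvr_db(m_normal) / factor)


ref_for_normal = equivalent_normal_depth(0.5)         # reference: impaired mref = 0.5
target_for_impaired = equivalent_impaired_depth(0.88)  # target: normal m = 0.88
```

The single multiplicative factor is a simplification; in practice the effect of recruitment would be expected to vary with the degree of hearing loss and with level.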
Fluctuation strength tends to be maximal for AM rates close to 4 Hz and to decrease for lower and higher AM rates (Fastl, 1983, Fig. 1). It is possible that the value of m required for saturation of fluctuation strength is higher for fm = 16 Hz than for fm = 4 Hz, although data on this are lacking. If so, this could account for why AM depth discrimination was somewhat better for the 16-Hz than for the 4-Hz AM rate, especially for the HI subjects. However, the finding that AM detection was not better for the 4-Hz rate than for the 16-Hz rate suggests that fluctuation strength cannot explain all aspects of the data. Possibly, the detection of AM is partly limited by an additive internal noise that does not vary with AM depth or rate (Ewert and Dau, 2004).
Consider next the interpretation of the results for the HI subjects in terms of the models of AM detection and discrimination described above, for example, the envelope power spectrum model (Ewert and Dau, 2000, 2004). The increase in the “internal” strength of AM associated with loudness recruitment should result in an increase in the magnitude of the multiplicative noise that is assumed to be added after envelope extraction, since the variance of this noise is assumed to be proportional to the strength of the internal envelope fluctuations. If this were the only effect involved, the internal signal-to-noise ratio in the envelope domain should be unaffected by loudness recruitment and AM depth discrimination should be similar for NH and HI subjects. This was not found to be the case. One way of accounting for the poorer performance of the HI subjects in terms of the models is to assume that hearing loss leads to an increase in the internal multiplicative noise, perhaps because of loss of inner hair cells, synapses, and neurons (Schuknecht, 1993; Kujawa and Liberman, 2015).
Multiplicative noise is used in the models to account for the finding that, for NH subjects, AM depth discrimination obeys Weber's law when the modulation depths of the reference and target are expressed in terms of modulator power (Wakefield and Viemeister, 1990; Ewert and Dau, 2004). In other words, the Weber fraction, (mtarget² − mref²)/mref², should be constant. When comparing across values of mref, an equal value of the Weber fraction should correspond to an equal percent correct. To assess whether this was the case for our data, for each value of fm the function relating percent correct to the Weber fraction was estimated from the data for mref = 0.5. This function was then used to predict performance for the other values of mref. The analysis was conducted separately for the NH and HI subjects and for the two levels for the NH subjects.
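The Weber fraction in modulator power, and the target depth that yields a given Weber fraction at another reference depth, can be computed as in the following sketch. The function names are illustrative, and the mapping from Weber fraction to percent correct, which was estimated from the data, is not reproduced here.

```python
def weber_fraction(m_target, m_ref):
    """Weber fraction for a change in modulator power."""
    return (m_target ** 2 - m_ref ** 2) / m_ref ** 2


def target_for_weber(w, m_ref):
    """Target depth that gives Weber fraction w for reference depth m_ref."""
    return m_ref * (1 + w) ** 0.5


# Weber fraction for the easiest condition at mref = 0.5 (mtarget = 0.9)
w_easy = weber_fraction(0.9, 0.5)

# Largest Weber fraction available at mref = 0.7, where mtarget cannot exceed 1
w_max_at_07 = weber_fraction(1.0, 0.7)
```

Because m cannot exceed 1, the range of Weber fractions that can be tested shrinks as mref increases, so predictions for mref = 0.7 draw only on the lower part of the function estimated at mref = 0.5.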
The results are illustrated in Fig. 3. For fm = 16 Hz (bottom row) the data correspond reasonably well to the predicted values for both the NH and HI subjects. Thus, the data are consistent with the idea that the Weber fraction is approximately constant when it is expressed as $(m_{\mathrm{target}}^2 - m_{\mathrm{ref}}^2)/m_{\mathrm{ref}}^2$. For fm = 4 Hz (top row), the correspondence between the data and predictions is less good, especially for the data obtained at 30 dB SL. The scores obtained at 30 dB SL tend to fall below the predicted scores for both the NH and HI subjects, especially for mref = 0.7. This indicates that Weber's law does not hold exactly for fm = 4 Hz and for large values of mref; rather the Weber fraction tends to increase when mref is large. This may reflect the saturation of fluctuation strength, a factor that is not taken into account in models of AM detection and discrimination. As described above, the value of m required for saturation of fluctuation strength may be higher for fm = 16 Hz than for fm = 4 Hz and this may account for why the discrepancy between the obtained and predicted thresholds is greater for fm = 4 Hz than for fm = 16 Hz.
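The prediction procedure described above can be sketched in a few lines of Python. This is a minimal illustration, not the analysis code used in the study: the function names are hypothetical, linear interpolation of the psychometric function is an assumption (the fitting method is not specified here), and the numbers in the usage section are placeholders rather than measured data.

```python
from bisect import bisect_left
from math import sqrt


def weber_fraction(m_target, m_ref):
    """Weber fraction in modulator power: (m_target^2 - m_ref^2) / m_ref^2."""
    return (m_target ** 2 - m_ref ** 2) / m_ref ** 2


def equivalent_target(m_ref, wf):
    """Target depth that yields Weber fraction `wf` at reference depth `m_ref`."""
    return m_ref * sqrt(1.0 + wf)


def predict_percent_correct(m_target, m_ref, fit_targets, fit_scores, fit_ref=0.5):
    """Predict percent correct at (m_target, m_ref) from scores measured at fit_ref.

    Assumption: the psychometric function is linearly interpolated on the
    Weber-fraction axis and clamped to the end points outside the fitted range.
    """
    points = sorted(
        (weber_fraction(t, fit_ref), s) for t, s in zip(fit_targets, fit_scores)
    )
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    x = weber_fraction(m_target, m_ref)
    if x <= xs[0]:
        return ys[0]
    if x >= xs[-1]:
        return ys[-1]
    i = bisect_left(xs, x)
    x0, x1 = xs[i - 1], xs[i]
    y0, y1 = ys[i - 1], ys[i]
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)


if __name__ == "__main__":
    # Placeholder values for illustration only; NOT data from the study.
    fit_targets = [0.55, 0.60, 0.65, 0.70, 0.75]  # target depths at mref = 0.5
    fit_scores = [55.0, 65.0, 75.0, 85.0, 92.0]   # percent correct (hypothetical)
    for m_t in (0.77, 0.84, 0.91):
        pc = predict_percent_correct(m_t, 0.7, fit_targets, fit_scores)
        print(f"mref = 0.7, mtarget = {m_t:.2f}: predicted {pc:.1f}% correct")
```

If Weber's law holds, obtained scores should lie close to these predictions; scores that fall systematically below them at large mref indicate an increasing Weber fraction, as reported for fm = 4 Hz.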
D. Implications for speech perception
Finally, consider the possible implications of these results for speech perception. Speech is a highly modulated signal, with PVRs in narrow frequency bands reaching 30–40 dB (Plomp, 1983; Moore et al., 2008). When trying to identify the speech of a target talker in the presence of background sounds, time-frequency regions conveying useful information about the target talker may be partly identified by an increase of AM depth in those regions. However, modulation of the background sounds may make it difficult to detect an increase in AM depth produced by the target, especially when loudness recruitment leads to an effective magnification of the internal AM depth and near-saturation of fluctuation strength. When the background sound is one or a few talkers, it will be highly modulated and this could well lead to a saturation of fluctuation strength for HI listeners. This may partly explain the finding that NH listeners usually understand speech much better when it is presented in a fluctuating background than when it is presented in a steady background, whereas HI listeners often show a reduced or zero fluctuating-masker benefit (Duquesnoy, 1983; Peters et al., 1998; Bernstein and Grant, 2009). Even for notionally steady background noises, random amplitude fluctuations play a strong role in limiting the intelligibility of speech (Stone et al., 2012; Stone and Moore, 2014b), and magnification of the internal representation of the depth of these fluctuations by loudness recruitment may increase this effect for HI listeners.
V. CONCLUSIONS
For relatively large values of mref (0.5–0.7), NH subjects showed better AM depth discrimination than HI subjects, for modulation rates of 4 and 16 Hz. The difference in percent correct scores was typically 10%–20%.
In contrast, the HI subjects showed better AM detection than the NH subjects when the comparison was made at the same SL of 30 dB.
There was no clear effect of age on AM detection or discrimination for the NH subjects.
The differences between the HI and NH subjects in AM depth discrimination can be explained in terms of the sensation of fluctuation strength, and especially the way that fluctuation strength saturates at high modulation depths. Loudness recruitment probably increases fluctuation strength, leading to near-saturation of fluctuation strength in both intervals of a forced-choice trial when mref is large. The data are consistent with the idea that the change in fluctuation strength at the threshold for AM depth discrimination is similar for NH and HI subjects.
ACKNOWLEDGMENTS
This work was supported by the Engineering and Physical Sciences Research Council (UK, Grant No. RG78536). We thank Aleksander Sęk for writing the computer software used to run the experiment. We thank Virginia Richards, Christian Lorenzi, and an anonymous reviewer for very helpful comments on an earlier version of this paper.