Visual calibration of auditory space requires re-alignment of representations differing in (1) format (auditory hemispheric channels vs visual maps) and (2) reference frames (head-centered vs eye-centered). Here, a ventriloquism paradigm from Kopčo, Lin, Shinn-Cunningham, and Groh [J. Neurosci. 29, 13809–13814 (2009)] was used to examine these processes in humans for ventriloquism induced within one spatial hemifield. Results show that (1) the auditory representation can be adapted even by aligned audio-visual stimuli, and (2) the spatial reference frame is primarily head-centered, with a weak eye-centered modulation. These results support the view that the ventriloquism aftereffect is driven by multiple spatially non-uniform, hemisphere-specific processes.

Vision plays an important role in calibration of auditory spatial perception. In the “ventriloquism aftereffect” (VAE), repeated presentations of spatially mismatched visual and auditory stimuli produce a shift in perceived sound location that persists when the sound is presented alone (Canon, 1970; Recanzone, 1998; Woods and Recanzone, 2004; Bertelson et al., 2006). The brain mechanisms that support this process are mysterious because spatial representations seem to differ in vision and in hearing in two ways.

First, visual space is initially encoded relative to the direction of the eye gaze, while the cues for auditory space are first computed relative to the orientation of the head (Groh and Sparks, 1992). A means of reconciling this discrepancy in reference frames (RFs) is necessary to achieve correct recalibration. Our previous study suggests that a mixture of eye-centered and head-centered RFs is associated with recalibration in the central region of the audiovisual field (Kopčo et al., 2009).

Second, there is growing evidence that, in mammals, auditory space is encoded non-homogeneously, based on two (or more) spatial channels roughly aligned with the left and right hemifields of the horizontal plane (Grothe et al., 2010; Groh, 2014). This is markedly different from visual spatial codes, in which the retinal surface provides a map of the position of stimuli in the environment.

Thus, the process of using visual information to recalibrate auditory space is multifaceted, and may operate differently in different portions of the environmental scene. Indeed, differential patterns of adaptation across auditory space have been observed (Phillips and Hall, 2005; Maier et al., 2010), suggesting that the auditory code in humans likely employs the same two-channel scheme that has been observed in animal species (Salminen et al., 2009).

Here, we tested whether the spatial characteristics of the VAE induced in the audiovisual periphery (i.e., in a single hemifield) differ from those occurring when the aftereffect is induced in the central region (i.e., covering both hemifields; Kopčo et al., 2009). Persistent visually driven biases in perceived sound location were induced. As in Kopčo et al. (2009), we presented mismatched (5°-shifted) audio-visual (AV) stimuli in only a subregion of space [Fig. 1(A), top panel], but this time the training region was peripheral, rather than central, to the fixation point (FP) used for these trials. We evaluated the effects of this pairing on saccade accuracy for interleaved auditory-only trials both from that FP and a non-training FP in the opposite hemifield [Fig. 1(A), bottom panel].

Fig. 1.

(Color online) Experimental setup and raw experimental data. (A) AV display used to present the AV training stimuli in one experimental block. At the beginning of each AV training trial (top), the subject had to fixate on the same initial FP; then, the training stimulus was presented from one of three locations lateral to the FP, keeping the direction of the induced shift the same within a block (by consistently presenting the visual adaptor displaced to the left, to the right, or aligned with the target speaker). On the auditory-only probe trials (bottom), the same nine speaker locations and two FPs were used in all blocks. The probe trials were randomly interleaved among the training trials and the FP and target locations varied randomly from trial to trial. The dashed frame indicates the central training region used in Kopčo et al. (2009). (B) Raw saccade endpoints of the responses to the AV training stimuli and auditory-only probe stimuli as a function of the actual target speaker location, collapsed across time. The symbols represent across-subject mean responses [±1 standard error of the mean (SEM) indicated by horizontal lines] in different audiovisual conditions (see legend). Graphs for each measurement type are plotted in one row, vertically offset from data for other types, for visual clarity. The A-only data corresponding to each target location are approximately aligned with that target location. For the AV data, the dashed lines connect symbol triplets for the same auditory target when presented with one of the three different visual adaptors (the AV-aligned data are located approximately at the corresponding target location).


As was the case for our previous study involving central training, the pairing of a displaced visual stimulus induced a local aftereffect in the peripheral trained region. Contrary to the previous study, this aftereffect appeared to be mostly in the head-centered RF, as the contribution of an eye-centered component was not readily apparent. However, we also observed biases related to the location of the FP, even when the AV stimuli were aligned. Together, these findings confirm the contribution of multiple signals related to different RFs and representational formats across the horizontal space.

All procedures and equipment closely matched those used in Kopčo et al. (2009).

Experiments were performed in an experimental lab in the Boston University Hearing Research Center. Subjects made eye movements from a visual FP to a broadband noise delivered from loudspeakers in darkness. On training trials [Fig. 1(A), top], visual stimuli were presented simultaneously with the sounds, using light-emitting diodes (LEDs) displaced from the locations of the speakers or aligned with them. On randomly interleaved probe trials [Fig. 1(A), bottom], only the auditory stimuli were presented.

Seven young adults with normal hearing by self-report participated. The experimental protocols were approved by the Boston University institutional review committee.

Subjects were seated in a quiet darkened experimental room in front of an array of speakers and LEDs (Fig. 1). To keep the head-centered RF fixed, the subjects' heads were restrained by a chin rest. Subjects' behavior was monitored and responses were collected by an infrared eye tracker, calibrated using visually guided saccades to selected target locations at the beginning of each session.

Sounds were 100-ms broadband noises (0.2–6 kHz) with 10 ms on/off ramps presented at 70 dBA from speakers mounted in the horizontal plane ∼1.2 m from the center of the listener's head. Spacing between speakers was 7.5°. For the training AV stimuli, only the speakers at the locations 15°, 22.5°, and 30° were used [Fig. 1(A)]. The LEDs for the AV stimuli were mounted so that they were either horizontally aligned with the speakers or displaced (either to the left or to the right) by 5°. They were turned on and off in synchrony with the corresponding speakers. Two additional LEDs 10° below the speaker array served as fixation locations (azimuth of ±11.8°).

Trials began with the onset of one of the two fixation LEDs. After subjects fixated the LED for 150 ms, the fixation LED was turned off and the AV or A-only stimulus was presented. The subjects performed a saccade to the perceived location of the stimulus. The saccade end point was recorded at the saccade end, i.e., when the eye fixation was sustained at the same location for 150 ms, at which point the experiment continued with the next trial. In both AV and A-only trials, the subjects were instructed to look to the location of the auditory component of the stimulus.
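The end-point criterion described above (gaze sustained at the same location for 150 ms) can be sketched as follows. This is only an illustration and not the code used in the study: the function name, the 1000-Hz sampling rate, the 1° stability tolerance, and the 2° threshold for deciding that the eye has left the FP are all assumptions, and only the horizontal gaze component is considered.

```python
def saccade_endpoint(gaze_deg, fp_deg, sample_rate_hz=1000.0,
                     hold_ms=150.0, tol_deg=1.0, leave_deg=2.0):
    """Return the mean azimuth of the first stable post-saccade window.

    gaze_deg: horizontal gaze samples (deg) starting at stimulus onset.
    fp_deg:   azimuth of the initial fixation point (deg).
    The sampling rate and both thresholds are hypothetical values.
    """
    n_hold = int(round(hold_ms * sample_rate_hz / 1000.0))
    # Skip the initial fixation: start at the first sample that has left the FP.
    onset = next((i for i, g in enumerate(gaze_deg)
                  if abs(g - fp_deg) > leave_deg), None)
    if onset is None or n_hold <= 0:
        return None
    # Slide a hold_ms-long window and accept the first one that is stable.
    for start in range(onset, len(gaze_deg) - n_hold + 1):
        window = gaze_deg[start:start + n_hold]
        if max(window) - min(window) <= tol_deg:
            return sum(window) / n_hold
    return None


# Synthetic trace at 100 Hz: fixation, a brief saccade, then a stable landing.
trace = [11.8] * 20 + [14.0, 17.0, 20.0] + [22.5] * 20
endpoint = saccade_endpoint(trace, fp_deg=11.8, sample_rate_hz=100.0)
```

Requiring the gaze to leave the FP before searching for a stable window keeps the initial fixation itself from being reported as the end point.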

Training (AV) and probe (A-only) trials were randomly interleaved at a ratio of 1:1. Training stimuli were presented from one of the three training locations while the subject fixated the training FP [top panel of Fig. 1(A)]. Probe stimuli were presented from one of the nine speakers, while the subject fixated either the training or the non-training FP [bottom panel of Fig. 1(A)].

Trials were run in sessions with a consistent AV pairing (leftward, rightward, or no shift). Each session started with a pre-adaptation reference measurement (18 A-only trials from the training FP), followed by 720 trials in which the training FP and the AV shift direction were fixed. Each subject performed 12 sessions (2 FPs × 3 shift directions × 2 repeats) in an order that was randomized across subjects.
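As an illustration of this design, the following sketch builds a randomized session order and an interleaved trial list. It assumes that the 1:1 ratio is realized as exactly equal numbers of AV and A-only trials shuffled together, which the text does not specify; the names are hypothetical.

```python
import itertools
import random

FPS = ("left", "right")                      # side of the training FP
SHIFTS = ("leftward", "none", "rightward")   # displacement of the visual adaptor
REPEATS = 2
TRIALS_PER_SESSION = 720                     # after the 18 pre-adaptation trials


def session_order(rng):
    """Return the 12 (FP, shift) sessions of one subject in random order."""
    sessions = list(itertools.product(FPS, SHIFTS)) * REPEATS
    rng.shuffle(sessions)
    return sessions


def trial_types(rng, n_trials=TRIALS_PER_SESSION):
    """Return a shuffled list with equal numbers of AV and A-only trials.

    Exact balancing is an assumption; the source only states a 1:1 ratio.
    """
    types = ["AV"] * (n_trials // 2) + ["A"] * (n_trials // 2)
    rng.shuffle(types)
    return types


rng = random.Random(0)   # fixed seed so the example is reproducible
order = session_order(rng)
types = trial_types(rng)
```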

Data from the first quarter of each session were excluded to remove transitory values observed during the initial buildup of the VAE. Within-session averages were computed from the remaining data separately for each combination of target location, training FP location, fixation position, and condition. Since no large left–right differences were observed, data with the training FP on the left were mirror-flipped and combined with the data with the training FP on the right (see Table 1). All data are presented as across-subject means and standard errors of the mean, with the training FP always shown on the right and the non-training FP on the left. Repeated-measures analyses of variance (ANOVAs) were used to assess the statistical significance of the observed effects.
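These preprocessing steps can be sketched as below. The helper names are hypothetical, and the sketch makes two assumptions that the text does not spell out: that mirror-flipping means negating both the target and the response azimuth, and that the first quarter is counted over whichever trial list is passed in.

```python
import math
import statistics


def drop_first_quarter(trials):
    """Discard the first quarter of a session's trials (VAE buildup)."""
    return trials[len(trials) // 4:]


def mirror_trial(target_deg, response_deg):
    """Mirror-flip a left-FP trial so that the training FP is on the right."""
    return -target_deg, -response_deg


def mean_and_sem(values):
    """Across-subject mean and standard error of the mean."""
    mean = statistics.mean(values)
    sem = statistics.stdev(values) / math.sqrt(len(values))
    return mean, sem


kept = drop_first_quarter(list(range(720)))
flipped = mirror_trial(-22.5, -20.0)
mean, sem = mean_and_sem([1.0, 2.0, 3.0])
```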

Table 1.

Four-way repeated-measures ANOVA of the VAE magnitude data. Significance levels are as follows: *p <0.05, ***p <0.005.

Factor                                   d.f.    F       Signif.
Speaker Location (1 to 9)                8, 48   33.87   ***
A-only FP (Tr. vs Non-Tr.)               1, 6    0.99
Direction of Induced Shift (L vs R)      1, 6    0.43
AV FP (L vs R)                           1, 6    0.27
Speaker Location × A-only FP             8, 48   0.79
Speaker Location × AV FP                 8, 48   2.28
A-only FP × AV FP                        1, 6    0.42
Speaker Location × Direction             8, 48   0.56
AV FP × Direction                        1, 6    2.16
A-only FP × Direction                    1, 6    0.1
Speaker Loc. × AV FP × A-only FP         8, 48   0.31
Speaker Loc. × AV FP × Direction         8, 48   0.52
Speaker Loc. × A-only FP × Direction     8, 48   1.69
AV FP × A-only FP × Direction            1, 6    0.12
Loc. × AV FP × A-only FP × Direct.       8, 48   1.16

As in Kopčo et al. (2009), we presented paired visual-auditory stimuli in a subregion of audiovisual space, fixed in both eye- and head-centered coordinates. We used one initial eye fixation position on training trials and presented the discrepant audiovisual stimuli from a restricted spatial range that was lateral with respect to the FP [see Fig. 1(A), top]. Because the visual training was local, we could test the spatial attributes of the resulting recalibration by shifting fixation on probe trials. Specifically, on interleaved auditory-only probe trials, we varied initial eye position (FP) with respect to the head (which was fixed) and presented sounds from all target locations spanning both the same head-centered locations and the same eye-centered locations as on the training trials [see Fig. 1(A), bottom]. We first consider the effects observed on the AV training trials themselves before turning to aspects of how the effects generalize to the auditory-only conditions across both the trained and untrained regions of space as a function of eye-referenced vs head-referenced fixation position.

A strong ventriloquism effect (capture of the auditory stimulus location by the visual stimulus on combined AV trials) was observed. The green symbols in Fig. 1(B) show the raw responses. When the AV stimuli were aligned, the average responses were not biased at all. The relative strength of the ventriloquism effect was evaluated as the percentage shift in responses toward the visual (V) component relative to the auditory (A) component on misaligned AV trials. For each A target location and V-component shift, it was computed as $(\mathrm{resp}_{V\text{-misalign}} - \mathrm{resp}_{V\text{-align}})/(\mathrm{stim}_{V\text{-misalign}} - \mathrm{stim}_{V\text{-align}})$, where resp is the response location and stim is the actual location of the V-component. The strength ranged from 96% for the target at 15° to 82% for the target at 30° (averaged across the two directions of induced shift). Even though there was a slight decrease in the strength of the ventriloquism effect for the most lateral targets, it was expected that, as in Kopčo et al. (2009), this strong ventriloquism effect would be associated with a clear local VAE.
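The strength measure can be written as a short function. The numbers in the example are made up for illustration (chosen so that the result is 96%); they are not data from the study.

```python
def ventriloquism_strength(resp_misalign, resp_align, stim_misalign, stim_align):
    """Shift of the response toward the visual component, as a fraction of
    the visual displacement (1.0 = complete capture by the visual stimulus).

    resp_*: response location (deg) on V-misaligned / V-aligned AV trials.
    stim_*: actual location (deg) of the V-component in the two cases.
    """
    return (resp_misalign - resp_align) / (stim_misalign - stim_align)


# Hypothetical example: A target at 15 deg, V-component displaced 5 deg rightward.
strength_right = ventriloquism_strength(19.8, 15.0, 20.0, 15.0)
# Same target with the V-component displaced 5 deg leftward.
strength_left = ventriloquism_strength(10.2, 15.0, 10.0, 15.0)
# Average across the two directions of shift, as done for the reported values.
strength = 0.5 * (strength_right + strength_left)
```

Because the numerator and denominator change sign together, the measure is positive for capture in either direction, which is what allows the two shift directions to be averaged.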

We next assessed the auditory-only responses interleaved with the spatially aligned AV stimuli. The red and blue circles in Fig. 1(B) show these responses. Overall, the subjects localized the auditory targets accurately, with responses shifting systematically with the actual target locations. To analyze the impact of the visual training in more detail, the top panel of Fig. 2(A) shows the biases in these responses relative to the actual target location, separately for the two FPs. A gaze-direction-dependent adaptation is seen when comparing the responses from the training FP (red line) to those from the non-training FP (blue line). Specifically, the responses to the targets at azimuths of 0°–15° were biased to the left by 1°–2.5° when performed from the non-training FP (blue “+” symbol) compared to the responses from the training FP (red “+” symbol). A dashed line in this panel represents the same data from the central-adaptation experiment of Kopčo et al. (2009), averaged across the two FP locations as no large FP-dependent differences were observed in that study. A solid black line in the bottom panel of Fig. 2(A) shows the difference between the red and blue lines from the top panel, while the dashed line represents the difference from the central-adaptation experiment of Kopčo et al. (2009). These panels show that responses to auditory-only stimuli from AV-trained locations that are lateral and near the training FP differ depending on whether the eyes fixate within the same hemifield or the opposite hemifield. On the other hand, when the AV training locations are in the center, covering both hemifields, no such differential effect of fixation location is observed (dashed line). A one-way repeated-measures ANOVA performed on the difference data showed a significant effect of target location ($F_{8,48} = 9.45$, $p < 0.001$). This effect of eye fixation direction is strong, of size comparable to the VAE (see Sec. 3.3); thus, there is some eye-gaze-dependent contribution to responses to auditory-only stimuli even when vision is not used to induce any recalibration of the auditory spatial representation. However, this contribution is only visible if the AV stimuli are presented within one spatial hemifield. Overall, the pattern of results in the top panel of Fig. 2(A) for both experiments is that, independent of FP location, the responses are mostly accurate in the trained region (all errors are much smaller than 1°, except for the blue data point at 15°), while they tend to be biased away from the training region outside of it (except for the left-most data point). This bias away is observed in all the non-training subregions for both FPs and both experiments, with the exception of the trained-FP data in the central region in the current experiment [3 red central targets in Fig. 2(A) are approximately at 0°]. Thus, the gaze-specific adaptation, which is observed in the same region, is likely caused by this lack of repulsion in the trained-FP central-location data.

Fig. 2.

(Color online) Adaptation induced by AV stimuli. (A) Average bias in A-only responses in the AV-aligned baseline condition as a function of the actual target location. The top panel shows mean response biases (±SEM) when eyes are fixated at the training FP (red line) and the non-training FP (blue line). In addition, the across-FP average data for central adaptation from Kopčo et al. (2009) are shown for comparison purposes (dashed line). The solid line in the bottom panel shows the difference between responses from training FP and the non-training FP. The dashed line shows the difference taken from Kopčo et al. (2009). (B) Predicted and observed VAE. The top left panel plots the expected pattern of biases induced in the A-only probe responses when preceding AV trials are presented in the training region (15°–30°). The red line shows predictions when the eyes fixate the training FP (i.e., the FP location used during AV training trials). The dotted blue line shows expected results from the non-training FP if the RF of adaptation is head-centered, while the dashed blue line shows expected results for an eye-centered RF. The bottom panel shows the differences between the expected bias magnitudes from the training versus the non-training FPs in the two RFs in orange. For comparison, the black dashed line sketches the results corresponding to the mixed RF observed after VAE was induced in the central region in Kopčo et al. (2009). Top right panel shows the across-subject mean (±SEM) difference between the auditory saccade end point locations when interleaved with spatially displaced AV stimuli vs when interleaved with AV-aligned stimuli, collapsed across the direction of the AV displacement. 
The solid black line in the bottom right panel shows the effect of initial fixation position on the magnitude of the induced shift as the across-subject mean (±SEM) difference between the shifts from the training and non-training FPs (i.e., the difference between the red and blue lines). The orange lines show the predictions of the difference for the two RFs based on the training FP data (red line) from the top right panel.


The expected pattern of VAE, and the predictions about the RF based on it, are illustrated in the left-hand panels of Fig. 2(B). The red line in the top left panel shows the predicted magnitude of the aftereffect induced by the AV stimuli, peaking in the trained region (15°–30°) when assessed with eyes fixating the training FP. If visually induced spatial plasticity occurs in a brain area using a head-centered RF, then shifts in perceived sound location should occur mainly for sounds at the same head-centered locations [in Fig. 2(B), dashed blue line matches the red line]. Conversely, if plasticity occurs in an eye-centered RF, then visually induced shifts should occur mainly for sounds at the same eye-referenced locations (dotted blue line is shifted to the left of the red line by the same displacement as the non-training FP is shifted relative to the training FP). The bottom left panel summarizes the predicted results if evaluated as a difference between the responses from the training and non-training FPs. The dashed orange line shows the difference between the red and dashed blue lines, corresponding to the expected results if the RF is head-centered. The dotted orange line shows the difference between the red and dotted blue lines, corresponding to the expected results if the RF is eye-centered. The dashed black line shows the predicted difference in the biases expected if the RF is mixed, as observed in Kopčo et al. (2009), in which case it should fall approximately in the middle of the predictions of the two RFs shown in orange.
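The logic of these predictions can be sketched numerically. The Gaussian aftereffect profile and its 7.5° width are purely illustrative assumptions, as is the speaker layout of −30° to 30° in 7.5° steps (inferred from the nine speakers, the 7.5° spacing, and the training locations); the FP azimuths of ±11.8° and the peak of about 2.3° come from the text.

```python
import math

TRAIN_FP_DEG = 11.8       # training FP (shown on the right)
NONTRAIN_FP_DEG = -11.8   # non-training FP in the opposite hemifield
SPEAKERS_DEG = [-30.0 + 7.5 * i for i in range(9)]  # assumed layout


def vae_profile(az_deg, peak_deg=2.3, center_deg=22.5, width_deg=7.5):
    """Hypothetical aftereffect measured from the training FP (deg)."""
    return peak_deg * math.exp(-0.5 * ((az_deg - center_deg) / width_deg) ** 2)


def predicted_shift(az_deg, fp_deg, frame):
    """Predicted aftereffect at head-centered azimuth az_deg from fp_deg."""
    if frame == "head":
        # Same head-centered locations are affected regardless of gaze.
        return vae_profile(az_deg)
    if frame == "eye":
        # Same eye-centered locations are affected: the profile moves with the FP.
        return vae_profile(az_deg - fp_deg + TRAIN_FP_DEG)
    raise ValueError(f"unknown frame: {frame!r}")


def fp_difference(az_deg, frame):
    """Training-FP minus non-training-FP prediction."""
    return (predicted_shift(az_deg, TRAIN_FP_DEG, frame)
            - predicted_shift(az_deg, NONTRAIN_FP_DEG, frame))


for az in SPEAKERS_DEG:
    print(f"{az:6.1f}  head: {fp_difference(az, 'head'):5.2f}  "
          f"eye: {fp_difference(az, 'eye'):5.2f}")
```

Under these assumptions the head-centered difference is zero everywhere, whereas the eye-centered difference is positive in the trained region and negative about 23.6° to its left; a mixed RF would fall between the two curves.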

We assessed the auditory-only responses interleaved with the spatially misaligned AV stimuli against these predictions. The red and blue triangles in Fig. 1(B) show the raw responses in the conditions in which the VAE was induced in a leftward direction (leftward-pointing triangles) or rightward direction (rightward-pointing triangles). Overall, exposure to spatially mismatched AV stimuli resulted in a shift of responses to sounds in the direction of the previously presented visual stimuli (compare the corresponding triangles to the respective circles). To allow a detailed analysis of the results comparable with the predictions of Fig. 2(B), the red line in the top right panel of Fig. 2(B) plots the magnitude of the bias in responses measured with eyes fixating the trained FP (red plus sign) relative to the no-shift baseline from Fig. 1(B), as a function of target location and averaged across the two directions of induced shift (note that no main effect or interaction involving the direction factor was significant in the ANOVA, supporting this way of collapsing the data for visualization; Table 1). The effect was strongest for the three right-most targets, i.e., in the trained region, reaching approximately 2.3° (51% of the ventriloquism effect strength). It was also location-specific, decreasing quickly toward zero outside of the trained region. These results are consistent with the results of Kopčo et al. (2009), confirming that the VAE can be induced locally, so that it can be used to assess the VAE RF.
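One way to collapse the aftereffect across the two shift directions is sketched below. The sign convention (positive azimuths to the right, leftward shifts sign-flipped so that a positive value always means a shift in the direction of the visual adaptor) is an assumption, and the example values are invented.

```python
def vae_magnitude(resp_left_deg, resp_right_deg, resp_aligned_deg):
    """Aftereffect in the direction of the adaptor, averaged over directions.

    resp_left_deg / resp_right_deg: mean A-only response (deg) for one target
    in sessions with leftward / rightward displaced visual adaptors.
    resp_aligned_deg: mean A-only response in the AV-aligned baseline.
    """
    toward_right = resp_right_deg - resp_aligned_deg    # positive = rightward
    toward_left = -(resp_left_deg - resp_aligned_deg)   # sign-flipped leftward
    return 0.5 * (toward_left + toward_right)


# Hypothetical target whose responses move 2.5 deg toward each adaptor.
magnitude = vae_magnitude(20.0, 25.0, 22.5)
```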

The RF of the VAE was examined by shifting the initial FP to a new location and examining how the observed VAE changed. The blue line in the top right panel plots the bias in responses measured with eyes fixating the new, non-trained FP (blue plus sign), shifted by approximately 23° to the left from the trained FP. There was very little difference in the measured VAE for the two FPs (blue line lies approximately on top of the red line). Thus, the observed results are consistent with visual–auditory recalibration occurring in a predominantly head-centered coordinate frame.

To compare the current results more directly to the predictions of the two models and to the data of Kopčo et al. (2009), a difference between the shift magnitudes from the two FPs was computed [bottom right of Fig. 2(B), black traces] and compared with predictions based on the two models (orange traces). Again, the results are very close to the predictions of the head-centered RF.

These results were confirmed by performing a 4-way repeated-measures ANOVA with the factors of target speaker location (nine levels), FP of the trials (training vs non-training FP), AV-trial FP location (left vs right), and the direction of induced shift (left vs right). The results of this analysis, summarized in Table 1, show that the main effect of location was always significant, confirming that the VAE is spatially specific and does not automatically generalize to the whole audiovisual field. The location by FP interaction was also significant, showing that the RF of visual–auditory recalibration is not purely head-centered, even though the eye-centered modulation is relatively small.
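The degrees of freedom listed in Table 1 follow from the fully within-subject design with seven subjects. The sketch below uses the standard repeated-measures rule that the error term of an effect has the effect's degrees of freedom multiplied by the number of subjects minus one; the function name is hypothetical.

```python
def rm_anova_df(levels, n_subjects):
    """Degrees of freedom (effect, error) for a within-subject effect.

    levels: number of levels of each factor involved in the effect,
            e.g. [9] for a main effect or [9, 2] for a two-way interaction.
    """
    df_effect = 1
    for k in levels:
        df_effect *= k - 1
    df_error = df_effect * (n_subjects - 1)
    return df_effect, df_error


N_SUBJECTS = 7
df_location = rm_anova_df([9], N_SUBJECTS)            # speaker location
df_fp = rm_anova_df([2], N_SUBJECTS)                  # any two-level factor
df_four_way = rm_anova_df([9, 2, 2, 2], N_SUBJECTS)   # four-way interaction
```

With nine locations this gives 8 and 48 degrees of freedom for every effect involving speaker location, and 1 and 6 for effects involving only the two-level factors, matching the d.f. column of Table 1.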

The current study examined the spatial properties of the VAE induced by AV stimuli presented in only one spatial hemifield in the peripheral AV field. The goal was to ascertain how the VAE unfolds as a function of multiple different spatial attributes: fixation position, generalization in head- vs eye-centered coordinates, and training within one spatial hemifield in contrast to training in both hemifields (as in Kopčo et al., 2009). The results indicate that the VAE is a multifaceted process, dependent on both the format of the neural representation of space in hearing and vision, and on the RF used by the two senses.

In terms of the representational format, the location of the fixation position impacted the pattern of adaptation induced by the AV stimuli, even when the AV stimuli were presented from matching locations and no VAE was induced. This unexpected adaptation was not observed in the previous central-adaptation study (Kopčo et al., 2009). Its cause is difficult to identify, since a baseline measurement with no AV stimulation was not performed. However, a comparison of the central-adaptation and peripheral-adaptation data suggests that adaptation away from the training region was observed in the AV-aligned data in both experiments. Such expansion of space is consistent with previously observed inherent biases toward the periphery (Razavi et al., 2007). The current data suggest that the inherent biases might be more correctly described as biases away from the AV-training region, rather than toward the periphery, and that the biases might be modulated by eye-gaze direction. Specifically, in the current experiment, in which the AV-aligned stimuli were presented in the periphery, there was no repulsive bias in the central region when the gaze was fixated on a point in the training hemifield, but the bias was observed if the gaze was fixated in the opposite hemifield. At least two other factors of the current experimental design might also contribute to the effect. First, the effect might be a result of adaptation to the auditory stimulus distribution, which becomes skewed when the training stimuli are included since all of them come from one side [e.g., similar to adaptation reported by Dahmen et al. (2010)]. Second, the visual signal might be causing some global ventriloquism-like adaptation outside the training region, such that the auditory-only responses are shifted toward the region from which the visual stimuli are frequently presented, but only when the FP is in the hemifield ipsilateral to the AV stimulation (and such a shift toward the training region cancels out the repulsion observed otherwise). Whatever the specific mechanism, this adaptation effect shows that there is a hemifield-specific integration of visual and auditory spatial signals that differs from the integration occurring when the stimuli are presented centrally, covering both spatial hemifields.

Regarding RFs, the current results together with those of Kopčo et al. (2009) show that in humans the RFs of VAE are a mixture of eye-centered and head-centered coding. In the central region, the effect is a fairly even mixture of these two RFs, whereas in the periphery, the pattern more closely fits the head-centered predictions, but also shows an interaction with eye position. This shows that the transformation of the visual and auditory signals into an aligned RF, thought to be necessary for the VAE to work, is non-uniform. While it is not immediately clear what form of non-uniformity might be causing this pattern of results, it may be related to the hemispheric-difference channel models of auditory space representation (Salminen et al., 2009; Grothe et al., 2010; Groh, 2014).

Kopčo et al. (2009) performed the central-adaptation ventriloquism experiments in two rhesus monkeys in addition to the humans.¹ In the monkeys, the RF was mixed between head- and eye-centered frames, consistent with most neurophysiological observations in the same species (Lee and Groh, 2012). Overall, these differences across training regions (and, possibly, across species) suggest that the locations in the brain that are recruited to accomplish this recalibration of auditory space may be widely varied. Some are likely head-centered, some are eye-centered, and some may involve the position of the eyes in the orbits per se. These sites of plasticity may be recruited differently depending on the training region and whether it spans both head-centered hemifields or is contained within one.

Additional experimental and/or modeling studies are needed to test alternative explanations about the different RFs of the VAE as well as about the unexpected AV-aligned adaptation effect. However, the current results demonstrate that there are hemisphere-specific adaptation processes in visual recalibration of auditory space, resulting in different FP-dependent patterns of adaptation depending on the region in which adaptation is induced.

This work was supported by the SRDA, project DS-2016-0026, EU H2020-MSCA-RISE-2015 Grant No. 69122, and by the EU RDP projects TECHNICOM I, ITMS: 26220220182, and TECHNICOM II, ITMS2014+:313011D23. B.S.-C. was supported by Grant No. NIH R01 DC013825. J.G. was supported by Grant Nos. NIH NS50942 and NSF 0415634.

¹ The current experiments were also performed in two rhesus monkeys. A detailed treatment of these effects can be found in Kopčo et al. (2019).

1. Bertelson, P., Frissen, I., Vroomen, J., and de Gelder, B. (2006). “The aftereffects of ventriloquism: Patterns of spatial generalization,” Percept. Psychophys. 68, 428–436.
2. Canon, L. K. (1970). “Intermodality inconsistency of input and directed attention as determinants of the nature of adaptation,” J. Exp. Psych. 84, 141–147.
3. Dahmen, J. C., Keating, P., Nodal, F. R., Schulz, A. L., and King, A. J. (2010). “Adaptation to stimulus statistics in the perception and neural representation of auditory space,” Neuron 66, 937–948.
4. Groh, J. M. (2014). Making Space: How the Brain Knows Where Things Are (Harvard University Press, Cambridge, MA).
5. Groh, J. M., and Sparks, D. L. (1992). “Two models for transforming auditory signals from head-centered to eye-centered coordinates,” Biol. Cybern. 67, 291–302.
6. Grothe, B., Pecka, M., and McAlpine, D. (2010). “Mechanisms of sound localization in mammals,” Physiol. Rev. 90, 983–1012.
7. Kopčo, N., Lin, I. F., Shinn-Cunningham, B. G., and Groh, J. M. (2009). “Reference frame of the ventriloquism aftereffect,” J. Neurosci. 29, 13809–13814.
8. Kopčo, N., Lokša, P., Lin, I-f., Groh, J., and Shinn-Cunningham, B. (2019). bioRxiv 564682.
9. Lee, J., and Groh, J. M. (2012). “Auditory signals evolve from hybrid- to eye-centered coordinates in the primate superior colliculus,” J. Neurophysiol. 108, 227–242.
10. Maier, J. K., McAlpine, D., Klump, G. M., and Pressnitzer, D. (2010). “Context effects in the discriminability of spatial cues,” J. Assoc. Res. Oto. 11, 319–328.
11. Phillips, D. P., and Hall, S. E. (2005). “Psychophysical evidence for adaptation of central auditory processors for interaural differences in time and level,” Hear. Res. 202, 188–199.
12. Razavi, B., O'Neill, W. E., and Paige, G. D. (2007). “Auditory spatial perception dynamically realigns with changing eye position,” J. Neurosci. 27, 10249–10258.
13. Recanzone, G. H. (1998). “Rapidly induced auditory plasticity: The ventriloquism aftereffect,” Proc. Natl. Acad. Sci. U. S. A. 95, 869–875.
14. Salminen, N. H., May, P. J., Alku, P., and Tiitinen, H. (2009). “A population rate code of auditory space in the human cortex,” PLoS One 4, e7600.
15. Woods, T. M., and Recanzone, G. H. (2004). “Visually induced plasticity of auditory spatial perception in macaques,” Current Biol. 14, 1559–1564.