We video‐recorded isolated marimba notes performed by a world‐renowned percussionist attempting to produce long (L) and short (S) notes. We generated audio‐visual stimuli by crossing the auditory and visual components of these notes. Subjects rated the duration of sounds presented in an audio‐alone (A) condition and in an audio‐visual (AV) condition; in the latter, they were instructed to rely only on auditory information. In the A condition, the S and L sounds were rated as equal in duration. In the AV condition, visual information affected ratings, but auditory information did not. In a second experiment, to undermine a response‐bias account of these findings, we introduced a temporal offset between the auditory and visual information. When the visual information preceded the auditory information, vision influenced auditory judgments; when the visual information lagged, it did not. This tolerance for visual lead may reflect an ecology in which optical information about an event precedes the acoustic information about it. Together, these studies demonstrate a novel visual influence on auditory judgments of duration through the integration of sensory information.