Phonetic convergence, or shadowing, is the phenomenon in which people unintentionally and temporarily change phonetic details of their speech to sound more similar to another talker. Previous research has examined effects of the shadower and of the target speaker. In the current study, we examine listener contributions to the perception of phonetic convergence. Two hundred and sixty listeners completed an AXB perception task in which they judged whether the first (A) or third (B) stimulus was a better imitation of the middle stimulus (X). Listeners also reported their previous instrumental and vocal training. Preliminary examination of the results reveals that listeners with more musical training (in the form of instrumental or vocal experience) were more accurate on the task. These results suggest a link between musical training and the detection of fine-grained phonetic details in speech.