Recently, tones have been analyzed as articulatory gestures, which may be coordinated with segmental gestures. Our data from electromagnetic articulometry (EMA) show that a purportedly neutralized phonological contrast can nonetheless exhibit coordinative differences. We develop a model based on gestural coupling to account for the observed patterns. Mandarin Third Tone Sandhi (e.g., Tone3 → T3S /_ Tone3) is perceptually neutralizing in that the sandhi output (T3S) closely resembles Tone2. Although both tones have rising pitch contours, subtle acoustic differences exist between them. However, whether T3S and Tone2 differ in underlying representation remains unclear. By presenting evidence from the alignment pattern between tones and segments, we show that the acoustic differences between Tone2 and T3S arise from a difference in gestural organization. The temporal lag between the initiation of the Vowel gesture and that of the Tone gesture is shorter in T3S than in Tone2. We further argue that underlying Tone3 is the source of the incomplete neutralization between Tone2 and T3S. That is, despite the surface similarity, T3S is stored in the mental lexicon as Tone3.