Comparative studies of affective prosody in human speech have revealed remarkable cross‐cultural similarities, suggesting that affective prosody may have originated from a prehuman basis. To explore this hypothesis, acoustic variation in the communication calls of tree shrews, and the animals' perception of that variation, were examined. Tree shrews are small diurnal mammals, genetically closely related to primates, that live in dispersed pairs in the dense tropical forests of south‐east Asia. Calls were elicited experimentally in a social encounter paradigm and in a disturbance paradigm, and were related to two behaviorally defined arousal states within specific behavioral contexts. The context and the arousal state of the caller reliably predicted spectral and temporal variation in call structure. Whereas context was closely associated with the frequency‐time contour of calls (call type), arousal was expressed in shifts of fundamental frequency and in the rate of call production. In a habituation‐dishabituation paradigm testing the effect of arousal‐related variation within the same call type, tree shrews were able to discriminate the two arousal states acoustically. Together, these findings document the relevance of affect cues in the vocal communication system of a non‐primate mammal, the tree shrew, and support the view that mechanisms involved in the acoustic expression and perception of emotions are deeply rooted in mammals. [Work supported by DFG FOR 499.]