Fear is a frequently studied emotion category in music and emotion research. Work in music theory, however, suggests that music can convey finer-grained subtypes of fear, such as terror and anxiety, which previous research on musically expressed emotions has neglected to investigate. This study seeks to fill that gap. To that end, 99 participants rated the emotional impression of short excerpts of horror film music predicted to convey terror and anxiety, respectively. The excerpts that most effectively conveyed these target emotions were then analyzed descriptively and acoustically to characterize the sonic differences between musically conveyed terror and anxiety. The results support the hypothesis that music conveys terror and anxiety with markedly different musical structures and acoustic features. Terrifying music has a brighter, rougher, harsher timbre, is musically denser, and may be faster and louder than anxious music; anxious music has a greater degree of loudness variability. Both types of fearful music tend towards minor modalities and are rhythmically unpredictable. These findings further support the application of emotional granularity in music and emotion research.

Fear is a fundamental emotion that crucially influences our daily lives. It plays an important role in risk assessment during daily decision-making, is a critical motivator of behavior, and is a powerful shaper of thought patterns and beliefs, such as those surrounding vaccinations and safety protocols during the COVID-19 pandemic (Harper et al., 2021). Fear-based mental disorders (e.g., anxiety, panic disorders) are overwhelmingly common, plaguing millions worldwide each day (Yang et al., 2021). Humans communicate fear through their voice, facial expressions, and body movements, but also through art forms such as music. Anyone who has seen a scary movie, played a scary video game, or walked through a haunted house understands the ability of music to convey fear. Consider, for example, the well-known shrieking violins of "The Murder," the cue that Bernard Herrmann wrote for the famous shower murder scene in Alfred Hitchcock's film Psycho (Hitchcock, 1960). In fact, in their review of 41 studies on emotional expression in music performance, Juslin and Laukka (2003) found evidence that professional musicians are generally able to communicate fear (as well as happiness, anger, tenderness, and sadness) about as effectively as vocal and facial expressions do (Juslin, 2019). In music and emotion research, "fear"1 is one of the most frequently studied emotional categories (Juslin, 2019; Juslin and Laukka, 2003; Warrenburg, 2020a). For example, Warrenburg (2020a) reviewed the emotion terms used in 306 music and emotion studies and found that "fear" was the sixth-most popular term (following "sad," "happy," "anger," "relaxed," and "chills/pleasure"), with 332 instances.

How exactly does music convey fear? Juslin (2019) summarizes what is known to date in music and emotion research about musically conveyed fear: music typically communicates fear with fast tempi, minor and dissonant tonalities, ascending and wide-ranging pitches, staccato articulations, soft attacks, jerky and unpredictable rhythms, soft timbres, narrow and fast vibrato, and a large amount of variation in tempo, sound level, articulation, and timing (see the second column of Table I for the full list). These musical characteristics might successfully convey fear in part by mimicking acoustic elements of frightening vocal or natural sounds (Huron, 2015; Juslin and Laukka, 2003). For example, through acoustic analyses, Trevor et al. (2020) found that "scream-like" music underscoring terrifying scenes in horror films contains an acoustic feature unique to human screams, called "roughness," perhaps aiding the music in effectively communicating fear.

TABLE I.

This table shows how the findings of Juslin (2019) on how music portrays fear compare to the descriptions by McClelland (2012, 2014, 2017b) of ombra and tempesta. The first column lists categories of musical descriptors, the second contains musical characteristics that convey fear, and the third and fourth columns detail whether each feature is also characteristic of ombra or tempesta. Rows with a checkmark in only one of the two topic columns represent features that are characteristic of one topic but not the other. Overall, the table demonstrates that current findings on musically portrayed fear are over-generalized: research on how music expresses finer-grained subtypes of fear, such as anxiety and terror, is warranted.

Category | Fear | ombra (anxiety, dread) | tempesta (terror, panic)
Tempo | fast tempo | | ✓
 | large tempo variability | ✓ | ✓
Tonality | minor mode | ✓ | ✓
 | dissonance | ✓ | ✓
Dynamics | low sound level | ✓ |
 | large sound level variability | ✓ | ✓
 | rapid changes in sound level | ✓ | ✓
Pitch | high pitch | | ✓
 | ascending pitch | |
 | very wide pitch range | ✓ | ✓
 | large pitch contrasts | ✓ | ✓
 | micro-structural irregularity (a) | ✓ | ✓
Articulation | staccato articulation | |
 | large articulation variability | |
 | soft tone attacks | |
Rhythm | jerky rhythms | ✓ | ✓
 | very large timing variability | ✓ | ✓
 | pauses | ✓ | ✓
Timbre | soft timbre | ✓ |
 | fast vibrato rate | |
 | small vibrato extent | |

(✓ = characteristic of the topic; empty cell = not characteristic or not mentioned)
(a) Small variations in pitch and sound level (Cespedes-Guevara and Eerola, 2018).

However, research on musically expressed fear has been missing a crucial consideration: emotional granularity. Emotional granularity refers to an individual's capacity to recognize, in oneself and in others, finer-grained emotional states that may be similar to each other, and to communicate these distinctive emotional states with targeted terminology (Barrett, 2004, 2017; Warrenburg, 2019a,b, 2020b). Researchers have only just started to examine musically expressed subtypes of the frequently studied basic emotions. For example, Warrenburg (2020b) recently distinguished between two subtypes of sad emotions in music: melancholy and grief. She also called for more consideration of emotional granularity in the design of future music and emotion studies, in order to correct misconceptions or inconsistencies about how music conveys emotions that arose from experimental designs conflating finer-grained emotional states (Warrenburg, 2019a,b).2

There is considerable musical and psychological evidence that music may convey yet-unexplored subtypes of fear. Returning to the example of Psycho (1960), compare the previously discussed "The Murder" cue (YouTube, 2023a) to the suspenseful music Herrmann wrote for the scene in which detective Arbogast creeps through Bates' house [i.e., the "The Stairs" cue (YouTube, 2023b)]. To date, most music and emotion researchers would likely classify both as music that conveys fear, even though there are pronounced musical and acoustic differences between them. In a branch of Music Theory called Topic Theory,3 however, these two types of music would indeed be classified separately based on differences between their collective musical features. McClelland (2012, 2014, 2017a,b) divides scary classical and contemporary film music into two distinct types, or topics: ombra and tempesta. Ombra refers to the common collection of musical features that appear in music written to underscore ghost and witch scenes, melodramas on supernatural subjects, or any scene requiring a mysterious, suspenseful atmosphere (McClelland, 2012, 2014). McClelland (2012, 2014) describes ombra as somber and gloomy in style, having a slow or moderate pace, and containing melodies that are often exclamatory and fragmented with restless motion. The combination of a serious, dark mood with unnerving, unpredictable entrances and rhythms communicates anxiety and dread (McClelland, 2012, 2014). In contrast, the common collection of musical features used in music underscoring scenes involving storms, floods, earthquakes, or conflagrations is referred to as the topic tempesta (McClelland, 2014, 2017b). Such scenes often involve flight or pursuit, panic, or metaphorical depictions of rage or madness (McClelland, 2014, 2017b). Through the use of fast tempi, unusual modulations, fragmented melodies, and very wide melodic leaps, tempesta is agitated and stormy in style and generally communicates feelings of terror (McClelland, 2014, 2017b).4

Notably, several of the features that Juslin (2019) summarizes as communicative of fear are characteristic of tempesta, while others are characteristic of ombra. Table I shows which features map onto which of the two topics; rows with a checkmark in only one topic column depict features that are distinctly different between the two. For example, Juslin (2019) notes that music portrays fear through the use of fast tempi and high pitches, both of which are characteristic of tempesta (McClelland, 2014, 2017b); ombra, however, consists of slow or moderate tempi and lower pitches (McClelland, 2012, 2014). On the other hand, Juslin (2019) notes that music portrays fear with lower sound levels and soft timbres, much like ombra, which is generally quieter and employs darker or softer timbres (McClelland, 2012, 2014). Tempesta, contrastingly, uses rougher and brighter timbres and is notably louder than ombra (McClelland, 2014, 2017b). Additionally, some of the characteristics Juslin (2019) lists as conveying fear are not mentioned by McClelland (2012, 2014, 2017b) (signified by empty cells in both topic columns of Table I), and some features of ombra and tempesta are not mentioned by Juslin (2019), such as the unusual tonal modulations, the bold, unpredictable, chromatic harmonic motions, and the fragmented, disjunct melodic motions that are characteristic of both topics. These different accounts of fearful music provide strong evidence that more research is needed on musically conveyed subtypes of fear. More specifically, we argue that McClelland's research suggests that scary music conveys at least two subtypes of fear: terror and anxiety.

Current psychological theories of fear support such a distinction between subtypes or varieties of fear (Adolphs, 2013; Adolphs et al., 2019; LeDoux, 2014; Mobbs et al., 2019; Perkins et al., 2012). Subtypes of fear can be functionally differentiated: panic or terror is associated with attention and reaction to an immediate threat, whereas anxiety implies a future threat to be dealt with through planning and prediction behaviors (Adolphs, 2013; Mobbs et al., 2019). The discovery that different neural networks process subtypes of fear further supports these functional differences (Adolphs, 2013; Gross and Canteras, 2012; Mobbs et al., 2019). For example, as described by Adolphs (2019) in a recent interview on the state of fear research, a neural circuit involving the periaqueductal gray and the superior colliculus has been found to mediate fear behaviors in rodents that spot aerial predators (Evans et al., 2018), while another region often active in fear states, the ventromedial hypothalamus, has not been found to respond to sightings of aerial predators (Kunwar et al., 2015; Lin et al., 2011). Using functional magnetic resonance imaging (fMRI) and a simulated maze paradigm, in which virtual predators pursued participants and threatened them with an unpleasant, yet harmless, electric shock, Mobbs et al. (2007, 2009) found activations in forebrain areas, namely the subgenual anterior cingulate cortex (sgACC), hippocampus, and amygdala, in reaction to the detection of distant threats; avoidance of threats that were closer in proximity instead activated areas in the midbrain and the mid-dorsal ACC. Given the functional differences between anxiety (a response to a possible future threat) and terror or panic (a response to an immediate, known threat), it is possible to interpret these results as reflecting different neural networks for processing these separate subtypes of fear (Perkins et al., 2012).

Humans also express subtypes of fear in markedly different ways (Kumar and Mohanty, 2016; Perkins et al., 2012). For example, we display a different facial expression for anxiety (characterized by environmental scanning behaviors such as head swivels and eye darts) than for terror [characterized by staring straight ahead (Perkins et al., 2012)]. We also express anxiety and terror differently with our voices by using a higher fundamental frequency for terror (Kumar and Mohanty, 2016). Despite this multidisciplinary evidence, subtypes of fear have yet to be investigated in music and emotion research. For example, in a review of emotion terms used in music and emotion research, Warrenburg (2020a) found no use of the term “terror” and only 16 uses of the term “anxiety.” Our study aims to fill this gap in music and emotion research by investigating musically expressed subtypes of fear.

While many databases of emotional musical excerpts exist [e.g., Eerola and Vuoskoski (2011), Paquette et al. (2013), Vieillard et al. (2008), and Warrenburg (2021)], none of them distinguish between terror and anxiety. Therefore, to answer our research questions, we elected to create our own large database of ecologically valid musical excerpts that communicate terror and anxiety, respectively. To create and validate our database, we used a method similar to that of Eerola and Vuoskoski (2011). We curated excerpts from horror film soundtracks using an expertise-based approach and then recruited participants to rate the excerpts along several emotion rating scales (both dimensional and discrete). Given the previously discussed evidence that music expresses subtypes of fear markedly differently (McClelland, 2012, 2014, 2017b), we predicted that participants would rate the musical excerpts curated from horror film soundtracks in accordance with their target emotions (terror or anxiety) on discrete emotion rating scales. Additionally, we predicted that with dimensional (valence and arousal) emotion rating scales, participants would rate the terrifying musical excerpts as communicating a more negative valence and a higher arousal than the anxious musical excerpts due to the predicted heightened intensity and dissonance of the musical features associated with terror as compared to anxiety [e.g., louder dynamics, higher pitch, etc. (McClelland, 2012, 2014, 2017b)].

After testing these hypotheses, we used the data to filter for the excerpts that most successfully portrayed terror and anxiety, creating FEARMUS: a new battery of fearful musical stimuli. Specifically, we ranked excerpts based on their typicality index (defined in Sec. II F) and retained the 50 most typical from each category for a final database of 100 excerpts portraying terror and anxiety. Once we created FEARMUS, we used descriptive and acoustic analyses to further describe how music conveys these two subtypes of fear.

We curated three terrifying and three anxious musical excerpts from the original soundtracks of each of 30 horror films (see Table II for the list of films), resulting in a total of 180 excerpts (30 films × 3 excerpts × 2 emotions). We opted to curate music from horror films that were contemporary (released in 2013 or later), highly rated by both critics [assessed using the "Metascore" on Metacritic (2023)] and viewers [assessed via "User Ratings" on IMDb (2023) (Internet Movie Database)], and whose scores contained 20 min or more of originally composed music. To curate, one of the experimenters, with expertise in topic theory and horror film soundtracks, listened to each soundtrack in full and used the descriptive criteria for ombra (McClelland, 2012, 2014) to select anxious excerpts and the descriptive criteria for tempesta (McClelland, 2014, 2017b) to select terrifying excerpts.5 The excerpts were between 10 and 30 s in length, depending on their natural phrasing, similar to Eerola and Vuoskoski (2011). For the experiment, we also included music that conveys positive emotions to balance the experience of the participants: 30 excerpts that convey happiness and 30 excerpts that convey tenderness from Eerola and Vuoskoski's (2011) previously validated database of emotional film music excerpts.

TABLE II.

These are the film soundtracks from which we curated excerpts to create FEARMUS. We selected films that (i) were successful among viewers and critics alike, as indicated by ratings from IMDb (2023) and Metacritic (2023), (ii) had originally composed scores that included at least 20 min of music, and (iii) were released in 2013 or later. The IMDb "User Ratings" and Metacritic "Metascores" reported here were gathered on 11 November 2022.

No. | Title | Year | Country(ies) | Director(s) | Composer(s) | User Rating (out of 10) | Metascore (out of 100)
1 | Midsommar | 2019 | USA and Sweden | Ari Aster | Bobby Krlic | 7.1 | 72
2 | Us | 2019 | USA | Jordan Peele | Michael Abels | 6.8 | 81
3 | A Quiet Place | 2018 | USA | John Krasinski | Marco Beltrami | 7.5 | 82
4 | Annihilation | 2018 | UK and USA | Alex Garland | Ben Salisbury and Geoff Barrow | 6.8 | 79
5 | Hereditary | 2018 | USA | Ari Aster | Colin Stetson | 7.3 | 87
6 | Mandy | 2018 | USA and Canada | Panos Cosmatos | Jóhann Jóhannsson | 6.5 | 81
7 | 1922 | 2017 | USA | Zak Hilditch | Mike Patton | 6.2 | 70
8 | Annabelle: Creation | 2017 | USA | David F. Sandberg | Benjamin Wallfisch | 6.5 | 62
9 | Get Out | 2017 | USA | Jordan Peele | Michael Abels | 7.7 | 85
10 | Ghost Stories | 2017 | UK | Jeremy Dyson and Andy Nyman | Frank Ilfman | 6.4 | 68
11 | It | 2017 | USA | Andy Muschietti | Benjamin Wallfisch | 7.3 | 69
12 | It Comes At Night | 2017 | USA | Trey Edward Shults | Brian McOmber | 6.2 | 78
13 | Revenge | 2017 | France | Coralie Fargeat | ROB (Robin Coudert) | 6.4 | 81
14 | The Blackcoat's Daughter | 2017 | USA and Canada | Osgood Perkins | Elvis Perkins | 5.9 | 68
15 | Tigers Are Not Afraid (a) | 2017 | Mexico | Issa López | Vince Pope | 6.9 | 76
16 | 10 Cloverfield Lane | 2016 | USA | Dan Trachtenberg | Bear McCreary | 7.2 | 76
17 | Before I Wake | 2016 | USA | Mike Flanagan | Danny Elfman and The Newton Brothers | 6.2 | 68
18 | Don't Breathe | 2016 | USA | Fede Álvarez | Roque Baños | 7.1 | 71
19 | Hush | 2016 | USA | Mike Flanagan | The Newton Brothers | 6.6 | 67
20 | Raw (b) | 2016 | France and Belgium | Julia Ducournau | Jim Williams | 7.0 | 81
21 | Split | 2016 | USA | M. Night Shyamalan | West Dylan Thordson | 7.3 | 62
22 | Crimson Peak | 2015 | USA | Guillermo del Toro | Fernando Velázquez | 6.5 | 66
23 | The Devil's Candy | 2015 | USA | Sean Byrne | Michael Yezerski | 6.4 | 72
24 | The Invitation | 2015 | USA | Karyn Kusama | Theodore Shapiro | 6.6 | 74
25 | The Witch | 2015 | USA | Robert Eggers | Mark Korven | 6.9 | 83
26 | Goodnight Mommy (c) | 2014 | Austria | Veronika Franz and Severin Fiala | Olga Neuwirth | 6.7 | 81
27 | It Follows | 2014 | USA | David Robert Mitchell | Disasterpeace (Richard Vreeland) | 6.8 | 83
28 | The Babadook | 2014 | Australia | Jennifer Kent | Jed Kurzel | 6.8 | 86
29 | The Conjuring | 2013 | USA | James Wan | Joseph Bishara | 7.5 | 68
30 | You're Next | 2013 | USA | Adam Wingard | Jasper Justice Lee, Kyle McKinnon, Mads Heldtberg, and Adam Wingard | 6.6 | 66
 | | | | | Mean: | 6.8 | 75
 | | | | | SD: | 0.4 |
(a) Vuelven [Spanish: (they) Return].
(b) Grave [French: Severe].
(c) Ich seh, Ich seh [German: I see, I see].

Each of the excerpts we curated was sampled at 48 000 Hz and normalized to 1 dB. A half-second fade-in and fade-out were added to each excerpt. All recordings are single-channel 16-bit wav files. For the original 180 excerpts, the terrifying ones had a mean length of 16.73 s (SD = 5.70), and the anxious ones had a mean length of 17.88 s (SD = 4.18).
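
These processing steps are straightforward to reproduce. The sketch below shows one way to apply the half-second fades in R; it assumes a linear fade shape, which the text does not specify, and the function and variable names are illustrative rather than the authors' code.

```r
# Apply a half-second linear fade-in and fade-out to a mono waveform.
# Assumption: linear ramps; the study does not specify the fade shape.
apply_fades <- function(x, sr = 48000, fade_s = 0.5) {
  n_fade <- round(fade_s * sr)                 # samples per fade
  ramp   <- seq(0, 1, length.out = n_fade)
  x[1:n_fade] <- x[1:n_fade] * ramp            # fade-in
  n <- length(x)
  x[(n - n_fade + 1):n] <- x[(n - n_fade + 1):n] * rev(ramp)  # fade-out
  x
}

x <- sin(2 * pi * 440 * (0:(48000 * 10 - 1)) / 48000)  # placeholder 10-s tone
x <- apply_fades(x)
```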

We collected emotion ratings to filter the database down to the 50 most typical anxious musical excerpts and the 50 most typical terrifying musical excerpts. To accomplish this, we had each participant listen to a randomly selected third of the original collection of musical excerpts.6 Specifically, each participant listened to 30 of the terror excerpts, 30 of the anxiety excerpts, and all of the 30 happiness and 30 tenderness excerpts. We used a custom-built matlab script to randomly select one third of the anxiety and terror excerpts per participant; the script ensured that all of the excerpts were evenly distributed across participants.7 During the experiment, the sound files were presented in a pseudorandomized order, with no two excerpts of the same target emotion presented consecutively. After listening to each excerpt, participants rated how well it portrayed terror, anxiety, happiness, and tenderness. Four analogue-categorical scales were used for these ratings, each of which had visual reference points akin to a seven-point Likert scale but operated continuously (1 = portraying the target emotion very poorly, 7 = portraying the target emotion very successfully). Participants were additionally asked, on similar analogue-categorical scales, how familiar they were with each excerpt (0 = unfamiliar, 1 = somewhat familiar, 2 = very familiar), what valence the excerpt conveyed (−3 = very negative, 3 = very positive), and what arousal level it conveyed (1 = very low, 7 = very high). Before they began the experiment, participants were given a list of definitions8 for each of the emotion terms used during the task (i.e., terror, anxiety, happiness, tenderness, arousal, and valence). This study was approved by the Cantonal Ethics Commission of Zürich, Switzerland.
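
The balanced random selection can be sketched in a few lines. The study used a custom-built matlab script; the R version below is an assumption about one way to achieve the even distribution described above, shown for the 90 terror excerpts (the anxiety set would be handled identically), with hypothetical excerpt IDs.

```r
# For every group of three participants, shuffle the 90 excerpts and deal
# them out in thirds, so each participant hears 30 and, across 99
# participants, every excerpt is heard exactly 33 times.
set.seed(1)
n_participants <- 99
excerpts <- sprintf("terror_%02d", 1:90)      # hypothetical excerpt IDs

assignments <- vector("list", n_participants)
for (g in seq(1, n_participants, by = 3)) {
  shuffled <- sample(excerpts)                # fresh permutation per group
  assignments[[g]]     <- shuffled[1:30]
  assignments[[g + 1]] <- shuffled[31:60]
  assignments[[g + 2]] <- shuffled[61:90]
}
table(unlist(assignments))                    # each excerpt: 33 listeners
```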

We determined a target number for recruitment using a power analysis in r with a large effect size (0.7), a significance level of 0.05, and a power of 0.8 (yielding 33 participants per third, or 99 in total). We then recruited 113 English-speaking non-musician participants from the University of Zürich (UZH) and the Zürich University of Applied Sciences. We elected to recruit non-musicians so that the database would convey the target emotions successfully to the general population rather than only to a musically trained subgroup. To qualify as a non-musician, participants were required to (i) hold no music certificates or diplomas and (ii) not have played a musical instrument daily within the last five years or during their childhood. We also used the Goldsmiths Musical Sophistication Index (GMSI) (Müllensiefen et al., 2014) to quantify the average level of musicianship of our participants.9 During data collection, the data from eight participants were lost due to a technical error. Therefore, our initial sample contained 105 participants (69 female), who were 18 to 49 years old (M = 25.77, SD = 5.75). All participants gave informed, written consent in accordance with the ethical and data security guidelines of the University of Zürich.
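
The text does not state which statistical test the power analysis assumed, but an independent-samples t-test with the stated parameters reproduces the reported sample size, so the following call (using the pwr package) is a plausible reconstruction rather than the authors' exact code.

```r
library(pwr)

# Two-sample t-test power analysis with the parameters reported above.
pwr.t.test(d = 0.7, sig.level = 0.05, power = 0.8, type = "two.sample")
# n = 33.02 per group, i.e., 33 participants per third (99 in total)
```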

At the start of the experiment, participants completed two practice trials to familiarize themselves with the interface and rating scales and to test the headphones. Participants then listened to and rated 120 musical excerpts on the seven scales mentioned above: terror, anxiety, happiness, tenderness, arousal, valence, and familiarity. After the ratings experiment, they took an online questionnaire that consisted of several surveys,10 including the GMSI. Participants then chose between participation hours (course credit) and 40 Swiss francs (CHF) as compensation for their participation. The ratings task took 80–90 min, and the questionnaire took 20–30 min (total time: 100–120 min). Participants were invited to take a short break after every quarter of the ratings task (approximately every 20 min).

To check the data for outliers, we calculated, for each participant, the correlation between their ratings and the mean ratings of each musical excerpt. Boxplots of these correlation coefficients showed that six participants were frequent outliers and therefore may not have understood the rating scales; we eliminated their data from subsequent analyses. Our final sample contained 99 participants (66 female), who were 18–49 years old (M = 25.84, SD = 5.84). We also checked the familiarity rating of each musical excerpt and found that two excerpts were outliers, with mean familiarity ratings above 0.6. Because external associations that listeners have with familiar music can distort the emotions they believe the music portrays (Eerola and Vuoskoski, 2011; Juslin and Västfjäll, 2008; Schellenberg et al., 2008; Vieillard et al., 2008), we excluded these two excerpts from subsequent analyses as possibly being overly familiar to the participants.
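
The screening step can be illustrated with a short sketch; the data and variable names below are placeholders, not the study's code.

```r
# Each row of 'ratings' holds one participant's ratings of the excerpts
# they heard. A participant whose ratings correlate poorly with the item
# means (a boxplot outlier) is flagged for exclusion.
set.seed(1)
ratings <- matrix(runif(99 * 120, 1, 7), nrow = 99)  # placeholder data

item_means  <- colMeans(ratings)
consistency <- apply(ratings, 1, function(p) cor(p, item_means))
boxplot(consistency, main = "Participant vs. mean-rating correlations")
flagged <- which(consistency %in% boxplot.stats(consistency)$out)
```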

To obtain highly typical examples of music for terror and anxiety, we calculated a typicality index (T) of the target emotion for each excerpt using the same procedure and equations as Eerola and Vuoskoski (2011). Typicality was calculated by subtracting the mean of the excerpt's non-target emotion rating (NE) and the standard deviation of its target emotion rating (SE) from the mean of its target emotion rating (E):

T = E − NE − SE.

Based on the typicality index, we ranked the 180 original musical excerpts (90 terror, 90 anxiety) to find the most typical excerpts for each emotion. We then used this ranking to select the 50 most typical anxious and the 50 most typical terrifying excerpts, resulting in a total of 100 excerpts (i.e., 55.56% of the originally curated collection).
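
In code, the ranking and selection amount to a few lines. The R sketch below uses placeholder per-excerpt summary statistics; in the actual analysis, E, NE, and SE would be computed from the participants' ratings.

```r
# Compute the typicality index T = E - NE - SE for each excerpt and keep
# the 50 most typical excerpts per target emotion.
set.seed(1)
d <- data.frame(excerpt = 1:180,
                emotion = rep(c("anxiety", "terror"), each = 90),
                E  = runif(180, 3, 7),     # mean target-emotion rating
                NE = runif(180, 1, 5),     # mean non-target rating
                SE = runif(180, 0.5, 2))   # SD of target-emotion rating

d$typicality <- d$E - d$NE - d$SE
d <- d[order(d$emotion, -d$typicality), ]             # rank within emotion
fearmus <- do.call(rbind, lapply(split(d, d$emotion), head, 50))
```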

To uncover the sonic differences between the anxious and terrifying musical excerpts in FEARMUS, we conducted an exploratory descriptive analysis in which two of the investigators listened to randomly sampled selections of the FEARMUS database and noted their observations for various musical features. We used the same musical features that McClelland (2014) (p. 282) used to describe ombra and tempesta (see the first column of Table VI for the full list). For this analysis, we used Apple Music to randomly sample (using "shuffle" mode) ten musical excerpts from the FEARMUS database per musical descriptor per target emotion (ten terror and then ten anxiety excerpts per musical feature). Sitting together and listening to the excerpts on a speaker, the researchers individually noted their observations for each musical feature. For example, for the feature "tempo," the researchers listened to ten randomly selected excerpts that portrayed anxiety, and then ten that portrayed terror, writing down their observations about tempo for each target emotion before moving on to the next musical feature. Upon completing their notes for all of the features, the two investigators pooled and summarized their combined observations to produce the final results (shown in Table VI).

The results of our exploratory descriptive analyses produced several hypotheses about the sonic similarities and differences between musically conveyed terror and anxiety. We elected to test these hypotheses using confirmatory acoustic analyses. For these analyses, we chose to measure 13 acoustic features that were suitable to test our hypotheses (see Table VII in Sec. III). To complete these analyses, we used the Music Information Retrieval (MIR) toolbox (version 1.8.1) (Lartillot et al., 2007) in matlab.11 For acoustic features related to timbre, we controlled for the different lengths of the excerpts in FEARMUS by randomly sampling five one-second segments of music from each excerpt to analyze. These segments were non-overlapping and did not contain the first and last faded half-seconds of each track. We analyzed the other non-timbral acoustic features across the full lengths of the excerpts. We then controlled for length differences with random slopes in our mixed effects models (see Sec. II I). We measured all spectral features with a sampling rate of 48 000 Hz, a Hamming window, and a frame length of 50 ms with a half-overlapping hop length.
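
The selection of the five one-second segments described above can be sketched as simple rejection sampling; the exact scheme the authors used is not specified, so the function below is an assumption.

```r
# Draw n non-overlapping 1-s window onsets from an excerpt of duration
# dur_s, excluding the faded first and last half-seconds.
sample_segments <- function(dur_s, n = 5, win = 1, fade = 0.5) {
  starts <- numeric(0)
  while (length(starts) < n) {
    cand <- runif(1, fade, dur_s - fade - win)   # candidate onset (s)
    if (all(abs(cand - starts) >= win)) starts <- c(starts, cand)
  }
  sort(starts)
}

sample_segments(16.73)   # e.g., five onsets for a 16.73-s excerpt
```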

All statistical analyses were done using r analysis software version 3.6.1 (R Core Team, 2019). We used both general linear regression models (for data without repeated measures) and mixed-effects linear regression models (for data with repeated measures) to test our hypotheses regarding the emotion ratings and the acoustic analyses. We fit general linear models with the lm function and mixed-effects linear regression models with the lmer function from the "lme4" library (Bates et al., 2015), which also provided t-values; we used the "lmerTest" package (Kuznetsova et al., 2017) to estimate p-values and degrees of freedom. Before model fitting, all categorical variables were coded as 0 and 1 in alphabetical order (i.e., for target emotion, anxiety = 0 and terror = 1). The significance level for all analyses was set to FDR-adjusted p < 0.05.

To test our main hypotheses, we fit four mixed-effects linear regression models with the emotion ratings (valence, arousal, terror, and anxiety) as outcomes and target emotion (terror vs anxiety) as the predictor. Additionally, we fit two further mixed-effects linear regression models with the target emotions as outcomes and the emotion ratings (terror and anxiety) as predictors. Participants and excerpt track numbers were included as random slopes. We report the results of the regression analyses in Table III, and the means and standard deviations of the ratings, grouped by target emotion, in Table IV.
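
As an illustration, one of the rating models might be specified as follows; the column names, the placeholder data, and the exact random-effects structure are assumptions based on the description above, not the authors' code.

```r
library(lme4)
library(lmerTest)  # adds Satterthwaite df and p-values to lmer output

# Placeholder long-format data: 99 participants x 60 excerpts, with
# target emotion coded 0 = anxiety, 1 = terror, as in the text.
set.seed(1)
ratings_long <- data.frame(
  participant    = factor(rep(1:99, each = 60)),
  excerpt        = factor(rep(1:60, times = 99)),
  target_emotion = rep(c(0, 1), each = 30, times = 99))
ratings_long$terror_rating <- 3 + ratings_long$target_emotion +
  rnorm(nrow(ratings_long))

# terror rating predicted by target emotion, with random terms for
# participants and excerpts (an assumed structure).
m_terror <- lmer(terror_rating ~ target_emotion +
                   (1 + target_emotion | participant) + (1 | excerpt),
                 data = ratings_long)
summary(m_terror)

# p-values pooled across the models would then be FDR-adjusted, e.g.:
# p.adjust(pvals, method = "fdr")
```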

TABLE III.

This table shows the results of our linear regression analyses testing how the emotion ratings (anxiety, terror, valence, arousal) are predicted by the target emotions (anxiety, terror), and how the target emotions (anxiety, terror) are predicted by the discrete emotion ratings (anxiety, terror). The bold results are statistically significant at adjusted p < 0.01. The results demonstrate more negative valence ratings, higher arousal ratings, and higher terror ratings for terrifying musical stimuli as compared to anxious musical stimuli, consistent with our hypotheses. Furthermore, anxious musical stimuli were rated as conveying a higher level of anxiety than terror, consistent with our hypotheses. However, terrifying musical stimuli were rated as conveying a higher level of anxiety than terror, and a greater degree of anxiety than anxious musical excerpts, inconsistent with our hypotheses. n = 99.

Predictor | Est. | SE | t | Unadj. p | Adj. p | df

Anxiety rating
Intercept | 5.136 | 0.085 | 60.164 | < 2E−16 | < 2.6E−16 | 137.274
Target emotion (1 = terror) | 0.378 | 0.072 | 5.222 | 4.69E−07 | 5.12E−07 | 186.602

Terror rating
Intercept | 4.260 | 0.127 | 33.600 | < 2E−16 | < 2.6E−16 | 138.223
Target emotion (1 = terror) | 1.014 | 0.104 | 9.800 | < 2E−16 | < 2.6E−16 | 217.542

Valence rating
Intercept | −1.418 | 0.062 | −22.800 | < 2E−16 | < 2.6E−16 | 148.998
Target emotion (1 = terror) | −0.481 | 0.050 | −9.720 | < 2E−16 | < 2.6E−16 | 187.872

Arousal rating
Intercept | 4.230 | 0.091 | 46.850 | < 2E−16 | < 2.6E−16 | 133.467
Target emotion (1 = terror) | 0.890 | 0.081 | 11.000 | < 2E−16 | < 2.6E−16 | 204.545

Conveyed anxiety
Intercept | 5.136 | 0.085 | 60.421 | < 2E−16 | < 2.6E−16 | 133.188
Rating scale (1 = terror rating) | −0.875 | 0.096 | −9.141 | 4.52E−15 | 5.42E−15 | 106.842

Conveyed terror
Intercept | 5.513 | 0.071 | 78.191 | < 2E−16 | < 2.6E−16 | 121.292
Rating scale (1 = terror rating) | −0.239 | 0.074 | −3.221 | 1.64E−03 | 1.64E−03 | 121.538
TABLE IV.

This table shows the means and SDs of the seven emotion ratings (anxiety, terror, tenderness, happiness, valence, arousal, familiarity) that participants gave the musical excerpts, grouped by the four target emotions (anxiety, terror, tenderness, happiness). Analogue-categorical scales were used for these ratings, each of which had visual reference points akin to a seven-point Likert scale but operated continuously. The four discrete emotions were rated from 1 to 7 (1 = portraying the target emotion very poorly, 7 = portraying the target emotion very successfully), valence was rated −3 to 3 (−3 = very negative, 3 = very positive), arousal 1 to 7 (1 = very low, 7 = very high), and familiarity 0 to 2 (0 = unfamiliar, 1 = somewhat familiar, 2 = very familiar). n = 99.

Target emotion | anxiety rating | terror rating | tenderness rating | happiness rating | valence rating | arousal rating | familiarity rating
anxiety | 5.14 (1.36) | 4.26 (1.76) | 1.54 (1.16) | 1.29 (0.71) | −1.42 (0.98) | 4.26 (1.46) | 0.32 (0.50)
terror | 5.51 (1.24) | 5.28 (1.60) | 1.26 (0.90) | 1.16 (0.50) | −1.90 (0.95) | 5.15 (1.39) | 0.38 (0.55)
tenderness | 1.43 (0.91) | 1.11 (0.40) | 5.64 (1.28) | 4.43 (1.51) | 1.20 (1.09) | 4.32 (1.37) | 0.59 (0.63)
happiness | 1.36 (0.81) | 1.11 (0.41) | 4.61 (1.67) | 5.40 (1.36) | 1.67 (0.99) | 4.59 (1.39) | 0.67 (0.68)

Values are mean (SD). Scale ranges: discrete emotions and arousal, 1–7; valence, −3 to 3; familiarity, 0–2.

Consistent with our hypotheses, the results demonstrate a significant main effect of target emotion (1 = terror) on terror ratings, driven by higher terror ratings for terrifying musical stimuli (M = 5.28, SD = 1.60) than for anxious musical stimuli (M = 4.26, SD = 1.76, p < 0.0001). Also consistent with our hypotheses, the results demonstrate a significant main effect of target emotion on valence ratings, driven by more negative valence ratings for terrifying stimuli (M = −1.90, SD = 0.95) than for anxious stimuli (M = −1.42, SD = 0.98, p < 0.0001), and a significant main effect of target emotion on arousal ratings, driven by higher arousal ratings for terrifying stimuli (M = 5.15, SD = 1.39) than for anxious stimuli (M = 4.26, SD = 1.46, p < 0.0001). Furthermore, the results demonstrate a significant main effect of rating scale (1 = terror rating) on musically conveyed anxiety, driven by higher anxiety ratings (M = 5.14, SD = 1.36) than terror ratings (M = 4.26, SD = 1.76, p < 0.0001) for anxious music, also consistent with our hypotheses. However, inconsistent with our hypotheses, the results demonstrate a significant main effect of target emotion on anxiety ratings, driven by higher anxiety ratings for terrifying stimuli (M = 5.51, SD = 1.24) than for anxious stimuli (M = 5.14, SD = 1.36, p < 0.0001). Additionally, the results demonstrate a significant main effect of rating scale on musically conveyed terror, driven by higher anxiety ratings (M = 5.51, SD = 1.24) than terror ratings (M = 5.28, SD = 1.60, p < 0.01) for terrifying music, also inconsistent with our hypotheses. The rating results are also displayed as boxplots in Figs. 1 and 2.

FIG. 1.

The boxplots show the results of the ratings experiment during which 99 participants rated 240 musical excerpts selected to convey one of four target emotions (90 conveying terror, 90 conveying anxiety, 30 conveying happiness, and 30 conveying tenderness). Participants rated the conveyed emotion of the musical excerpts using discrete emotion scales and dimensional emotion scales (valence and arousal). Here, we show the average discrete emotional ratings per target emotion. In line with our predictions, participants rated the terrifying musical excerpts as conveying greater terror than the anxious musical excerpts. Furthermore, they also rated the anxious musical excerpts as conveying greater anxiety than terror, consistent with our hypotheses. However, inconsistent with our hypothesis, participants rated the terrifying excerpts as conveying greater anxiety than terror, and a greater degree of anxiety than the anxious excerpts.

FIG. 2.

The boxplots show additional results from the ratings experiment described in the Fig. 1 caption. Specifically, these boxplots show the average dimensional emotion ratings per target emotion of the musical excerpts. The top graph shows the valence ratings and the bottom graph shows the arousal ratings. Consistent with our hypotheses, participants rated the terrifying musical excerpts as conveying a more negative average valence and a higher average arousal than the anxious musical excerpts.


To create the FEARMUS database, we calculated the typicality index (T) for each of the 180 excerpts and used it to rank them and find the most typical excerpts for each target emotion. The final database consists of the 50 most typical terrifying musical excerpts and the 50 most typical anxious musical excerpts, for a total of 100 excerpts (see supplementary Table S1 for information about each excerpt12). Table V provides information on the distribution of the typicality indices per emotion.

TABLE V.

Descriptive statistics characterizing the resulting typicality indices per target emotion in FEARMUS. n = 100.

Typicality index
Target emotion | Mean | SD | Min | Max
Anxiety | 1.85 | 0.31 | 1.44 | 2.57
Terror | 1.65 | 0.41 | 1.02 | 2.77

The excerpts in the final selection have very similar length distributions: terrifying excerpts have a mean length of 18.2 s (SD = 6.02), and anxious excerpts have a mean length of 17.8 s (SD = 4.10). FEARMUS is available for download from the Open Science Framework (2023).

We modeled our descriptive analysis of the FEARMUS database on the list of musical features that McClelland (2014) used to describe ombra and tempesta. Table VI contains the results. In the second column, we define the musical terms using the Merriam-Webster online dictionary (Merriam-Webster, 2023). Several crucial differences between anxious and terrifying music are apparent in these results concerning tempo, harmony, melody and figuration, dynamics, timbre, and the sounds referenced by these types of music. Anxious music contains slow, heavy tempi, while terrifying music contains fast tempi or sustained walls of noise. The harmonies in anxious music are vast and spacious, while the harmonies in terrifying music are often densely packed. The melodies in anxious music contain stuttering, sighing figurations, while terrifying music contains more chaotic figurations, including shrill, randomly stepping, fast-moving lines or noisy clusters. Anxious music typically features open, hollow textures that lean on both extremes of the pitch spectrum (very low and very high), while terrifying music usually has a very active, dense, massive texture. Concerning dynamics, anxious music is typically quieter overall, compared to the screaming intensity of terrifying music. The timbre of anxious music is more muted, distant, and dark, while the timbre of terrifying music is harsh, noisy, and jarring. The sounds referenced by anxious music are suspenseful or generally creepy (e.g., rats skittering and squeaking, ticking clocks, creaking doors, whispering), while the sounds referenced by terrifying music are evocative of more urgent threats or reactions to such threats (e.g., fire alarms, explosions, thunder, banging on a door). There are also many shared features between anxious and terrifying music, especially concerning tonality, rhythm, bass, and instrumentation. Both are typically in a minor key, if a key is in fact discernable amidst the large amount of chromaticism present in both. They both often contain unpredictable dynamics, pulsing or sustained basses (although at different tempi), and hugely unpredictable rhythms (e.g., sudden entrances, fluctuations in tempo, random silences). Since these two types of music are often featured side by side in the same score, it is perhaps unsurprising that they also share similar instrumentations: a classic full string orchestra with some electronic sounds and sometimes added voices, piano, or percussion.

TABLE VI.

This table shows the results of the descriptive analysis of the FEARMUS database. The anxious and terrifying excerpts in the FEARMUS database are compared along features used by McClelland (2014) to describe the ombra and tempesta topics.

Feature | Definition | Anxious music | Terrifying music
General features | summary observations about the overall sound of these musical excerpts | impending doom communicated by insistent rhythms and held tones, suspenseful, atmospheric, uneasy | frantic, chaotic, noisy, distressing, inescapable, adrenaline
Tempo | the rate of speed of a musical piece or passage indicated by one of a series of directions (such as largo, presto, or allegro) and often by an exact metronome marking (a) | ponderous, pacing, heavy march, funereal, slowly meandering or held chords or tones without a clear tempo, stasis | frenetic, throbbing, far-apart beats barely held in succession, frantic, spasmodic, walls of noise
Tonality | the organization of all the tones and harmonies of a piece of music in relation to a tonic (a) | minor, chromatic, lacking a discernable tonic or tonality | monophonic drone tones with ambiguous tonalities, highly chromatic, percussive without tonality, minor
Harmony | the structure of music with respect to the composition and progression of chords (a) | wide ranging, hollow, clustered in the bass and soprano, minor chords, wandering, unmoored, chromatic, repetitive short progressions | dense, compact, unchanging wall of sound; slowly rising chromatically, building held chord that becomes increasingly dissonant
Melody | a rhythmic succession of single tones organized as an aesthetic whole (a) | high-pitched, drifting tones; sliding fragments, rising overall directionality, falling figures, held or repeated tones, minor mode, narrow pitch range | shrill, noisy clusters, randomly stepping lines, drunken, lost, repetitive fragments, held trills, lack of a distinct melody, rising figures
Bass | of low pitch; relating to or having the range or part of a bass (a) | percussive, unpredictable, absent from the texture, slow beats, sustained, narrow pitch range, voluminous drones | throbbing, rumbling presence, lack of bass entirely, insistent pulse, distant drones, shifting voices, swirling, dense chords
Figuration | ornamentation of a musical passage by using decorative and usually repetitive figures (a) | sighing, stuttering fragments; space between utterances, drifting held tones, held chord walls, blips of noise, slow trills, throbbing drones, narrow motions | short bursts of noise, trills, tremolo, held clusters of tones; chattering, sliding, rising figures
Rhythm | the aspect of music comprising all the elements (such as accent, meter, and tempo) that relate to forward movement (a) | stasis punctuated by unpredictable entrances, consistent underlying pulse, fluctuations in the tempo of fragments, uneven silences, shifting repetitive figures, slower motions | sustained chaos, fast-running pulse, unpredictable shifts in the overall rhythmic texture, slow chord changes, fast-running upper lines
Texture | a pattern of musical sound created by tones or lines played or sung together (a) | hollow polyphony, wide range between parts, substantial leaning in bass and soprano | dissonant polyphony, dense walls of sound, voices slowly added to build intensity, full and active texture, growing, massive
Dynamics | variation and contrast in force or intensity (a) | static, quieter, slowly increasing in volume, sudden swells | screaming, loud, startling entrances, very sudden shifts from total silence to extreme loudness, sharp accents, increasing loudness
Instrumentation | the arrangement or composition of music for instruments especially for a band or orchestra (a) | classical string orchestra, percussion, electronic sounds (more tonal than noisy), voices, piano | classical string orchestra, electronic sounds (more noisy than tonal), percussion, choirs, brass
Sound references | any non-musical sounds evoked or mimicked by the music | record playing, car running, alarms, clock ticking, rats, door creaking, rattle of a snake, wind chimes, large creatures bellowing, footsteps, machinery, children's voices, shouting, whispering, whistling | screams, fire alarms, banging on a door, tea kettle whistles, earthquake, thunder, bees buzzing, weather siren, bats, birds, elephant bellow, gunshots, explosions, car crash, monkey shrieks
Timbre | the quality given to a sound by its overtones (a) | buzzy, grating, ethereal, other-worldly, raspy, slithering, plucked, dark, muted, distant, reverberant, muddy, discordant, moving in space from far away to close to the ear | shrieking, shrill, painful, noisy, harsh, close to the ear, blurred, jarring, active, unpleasant
(a) Dictionary definitions by Merriam-Webster (2023).

Following the descriptive analysis, we selected 13 acoustic features for a confirmatory analysis of our observations. The full list of features, including definitions and our hypotheses, is in Table VII. With these features, we were able to gather data about the tempo and rhythm (pulse clarity, event density), loudness (RMS, low energy), mode, and timbre (brightness, roughness, noisiness, and spectral distribution descriptors) of the excerpts. Given our descriptive results, we predicted that, compared to anxious music, terrifying music would be significantly faster, louder, and rhythmically denser, would have more consistent loudness levels, and would exhibit timbres that are generally brighter, rougher, and noisier. We also predicted that both terrifying and anxious music would be in the minor mode and have similarly irregular rhythmic structures (low pulse clarity). We measured the non-timbral acoustic features (i.e., event density, loudness, loudness variability, mode, pulse clarity, and tempo) across the entire length of each excerpt. To measure the timbral features (i.e., brightness, roughness, zero crossing rate, and all spectral distribution descriptors), we analyzed five randomly selected one-second-long segments from each musical excerpt and then averaged those values, resulting in 100 mean values (one per excerpt). Table VIII presents the resulting unnormalized mean values.
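
Most of these features require the MIR toolbox, but the simplest one illustrates the general approach: zero crossing rate can be computed directly from the waveform. The base-R sketch below is an illustration, not the MIR toolbox call used in the study.

```r
# Zero crossing rate: sign changes per second of signal.
zcr <- function(x, sr) {
  sum(abs(diff(sign(x))) > 0) / (length(x) / sr)
}

x <- sin(2 * pi * 440 * seq(0, 1, length.out = 48000))  # 1 s, 440-Hz tone
zcr(x, 48000)   # ~880 crossings/s: twice the frequency of a pure tone
```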

TABLE VII.

This table shows the 13 acoustic features we selected to compare the anxious and terrifying music excerpts in FEARMUS. In the leftmost columns, we list the acoustic features of interest along with their definitions and interpretations of those definitions (see footnotes for citations). In the rightmost columns, we indicate how we predicted these features would behave in anxious compared to terrifying music (↑ = higher, ↓ = lower).

Feature | Definition | Interpretation | Anxiety | Terror
Event density | The average frequency of events, i.e., the number of events detected per second^a | Density of musical activity | ↓ | ↑
Root-mean-square (RMS) | The global energy of the signal, computed by taking the root average of the square of the amplitude^a | Loudness | ↓ | ↑
Low energy | The percentage of frames showing less-than-average energy^a,b | Loudness variability | ↑ | ↓
Mode | An arrangement of the eight diatonic notes or tones of an octave according to one of several fixed schemes of their intervals^c | Major (closer to +1) or minor (closer to −1) | equally minor (negative)
Pulse clarity | Estimates the rhythmic clarity, indicating the strength of the beats^a,d | Strength of a perceived rhythm or beat | equally low
Tempo | The rate of speed of a musical piece or passage, indicated by one of a series of directions (such as largo, presto, or allegro) and often by an exact metronome marking^c | The perceived pace or speed of the music | ↓ | ↑
Brightness | The amount of energy above a cut-off frequency (1500 Hz)^a,e | Estimate of the high-frequency energy content of a spectral distribution | ↓ | ↑
Roughness | The average of all the dissonance between all possible pairs of peaks of a spectrum^a,f | Estimate of the sensory dissonance^a | ↓ | ↑
Spectral centroid | Geometric center of the amplitude spectrum^a | Spectral distribution descriptor^g | ↓ | ↑
Spectral flatness | Ratio between the geometric and the arithmetic mean of the spectrum^a | Discriminates noise from harmonic content^g | ↓ | ↑
Spectral roll-off | The frequency below which 85%^d of the total spectral energy is contained^a | Estimate of the amount of high-frequency energy content^g | ↓ | ↑
Spectral spread | The standard deviation of the spectral distribution^a | Spectral distribution descriptor^g | ↓ | ↑
Zero crossing rate | The number of times the signal crosses the x-axis^a | A simple indicator of noisiness^h | ↓ | ↑
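For readers less familiar with these descriptors, the verbal definitions of the spectral distribution descriptors in Table VII can be written out explicitly. As a sketch, assuming an amplitude spectrum with magnitudes $a_k$ at frequencies $f_k$ (implementations such as the MIR Toolbox differ in windowing and in whether magnitudes or energies are summed):

$$\mu = \frac{\sum_k f_k\,a_k}{\sum_k a_k}\ \text{(centroid)}, \qquad \sigma = \sqrt{\frac{\sum_k (f_k - \mu)^2\,a_k}{\sum_k a_k}}\ \text{(spread)},$$

$$\text{flatness} = \frac{\bigl(\prod_{k=1}^{K} a_k\bigr)^{1/K}}{\tfrac{1}{K}\sum_{k=1}^{K} a_k}, \qquad \text{roll-off} = \min\Bigl\{ f_R : \sum_{f_k \le f_R} a_k^2 \ge 0.85 \sum_k a_k^2 \Bigr\}.$$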
TABLE VIII.

This table shows the unnormalized means and standard deviations of the 13 acoustic features (grouped by expressed emotion: anxiety or terror) that we measured to analyze the FEARMUS database. The upper half of the table reports the acoustic features we measured across entire excerpts, and the lower half reports the acoustic features we measured and averaged across five 1-s-long fragments randomly sampled per excerpt. The right column provides context for interpreting the resulting values for each acoustic feature [see Lartillot (2021) for more detailed information; top half, n = 100; bottom half, n = 500 (Chudy, 2016; Juslin, 2000; Lartillot, 2021; Lartillot et al., 2008; Peeters et al., 2011; Sethares, 2005; Tzanetakis and Cook, 2002)].

Acoustic feature | Anxiety M (SD) | Terror M (SD) | Measurement context

Measured using whole excerpts:
Event density | 2.176 (1.307) | 3.190 (1.758) | Number of sonic events per second
Loudness | 0.172 (0.045) | 0.194 (0.058) | Global energy of the signal; higher values indicate greater energy
Loudness variability | 0.561 (0.068) | 0.513 (0.099) | Range of 0–1; closer to 1 indicates greater loudness variability
Mode | −0.012 (0.094) | −0.024 (0.093) | Between −1 and +1; closer to −1 = minor, closer to +1 = major
Pulse clarity | 0.205 (0.155) | 0.192 (0.159) | Higher values indicate a clearer, stronger beat
Tempo | 123.6 (35.45) | 133.8 (31.82) | Beats per minute (bpm)

Measured across averaged 1-s fragments:
Brightness | 0.208 (0.152) | 0.484 (0.148) | Energy in upper frequencies
Roughness | 687.0 (570.9) | 2146 (1561) | Higher values indicate greater sensory dissonance
Spectral centroid | 1425 (985.8) | 2659 (1054) | Mean of the spectral distribution; reported in frequency (Hz)
Spectral flatness | 0.070 (0.049) | 0.106 (0.049) | Ratio indicating noisiness; closer to 1 indicates greater noisiness
Spectral roll-off | 2497 (2329) | 5153 (2128) | Frequency (Hz); higher values indicate more energy in upper frequencies
Spectral spread | 2647 (833.4) | 3107 (756.9) | SD of the spectral distribution; reported in frequency (Hz)
Zero crossing rate | 408.7 (346.5) | 1295 (856.7) | Higher values indicate noisier signals

To test our hypotheses related to the non-timbral acoustic features, we fit six mixed-effects linear regression models, one per feature, with the feature values as the dependent variable and target emotion (anxiety vs terror) as the predictor. The lengths of the excerpts (in seconds) were included as random slopes. Before running the analyses, we normalized the values of all six acoustic features. We report the regression results in the upper half of Table IX and in the upper panel of Fig. 3. Consistent with our hypotheses, there was a significant effect of target emotion (1 = terror) on event density, driven by a higher average event density for terrifying musical stimuli (M = 3.19, SD = 1.76) than for anxious musical stimuli (M = 2.18, SD = 1.31, p = 0.011). There was also a significant effect of target emotion on loudness variability, driven by a lower average loudness variability for terrifying musical stimuli (M = 0.513, SD = 0.099) than for anxious musical stimuli (M = 0.561, SD = 0.068, p = 0.032), although this difference of only 0.048 (i.e., 4.8% more frames in which the energy of the signal is lower than average) may not be perceptually meaningful. There were no significant effects of target emotion on loudness (p = 0.087), mode (p = 0.648), pulse clarity (p = 0.788), or tempo (p = 0.247). Although the difference was in the predicted direction, the average tempo (in beats per minute, bpm) of terrifying musical stimuli (M = 133.8, SD = 31.82) was not significantly faster than that of anxious music (M = 123.6, SD = 35.45), inconsistent with our hypothesis. Similarly, terrifying music was not significantly louder (M = 0.194, SD = 0.058) than anxious music (M = 0.172, SD = 0.045), also inconsistent with our hypothesis. Both terrifying and anxious music exhibited modal ambiguity leaning towards minor modalities (anxious: M = −0.012, SD = 0.094; terror: M = −0.024, SD = 0.093), consistent with our hypothesis that both would tend towards the minor mode. They also had similarly irregular rhythmic structures, signified by low pulse clarity (anxious: M = 0.205, SD = 0.155; terror: M = 0.192, SD = 0.159), again consistent with our hypotheses.
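For readers wishing to reproduce this style of analysis, the sketch below shows a minimal Python analogue using statsmodels, on hypothetical stand-in data. The published analysis was run in R with lme4 and lmerTest (Bates et al., 2015; Kuznetsova et al., 2017); because the paper states only that excerpt lengths entered as random slopes, the grouping factor used here ("film") and all column names are assumptions, not the authors' specification.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical stand-in data: one row per excerpt.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "feature":  rng.random(100),           # normalized acoustic feature value
    "emotion":  np.repeat([0, 1], 50),     # 0 = anxiety, 1 = terror
    "length_s": rng.uniform(8, 20, 100),   # excerpt length in seconds (assumed range)
    "film":     np.tile(np.arange(20), 5), # hypothetical grouping factor
})

# Mixed-effects linear regression: emotion as fixed effect,
# a random slope for excerpt length within the grouping factor.
model = smf.mixedlm("feature ~ emotion", data=df,
                    groups="film", re_formula="~length_s")
result = model.fit(reml=True)
print(result.summary())  # the `emotion` row parallels the upper half of Table IX
```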

TABLE IX.

This table shows the results of our linear regression analyses testing how the 13 acoustic features are predicted by the target emotions (anxiety, terror) of FEARMUS. Results are statistically significant where adjusted p-values < 0.05. The acoustic features in the top half of the table were measured across the full length of each excerpt; for those in the lower half, we randomly selected five 1-s-long segments from each excerpt and averaged across them. The results demonstrate that terrifying music has a significantly greater event density, more consistent loudness (lower loudness variability), and a noisier, brighter, and harsher timbre than anxious music, consistent with our hypotheses. However, terrifying music is not significantly faster in tempo or louder, inconsistent with our hypotheses. Both anxious and terrifying music use minor modes and have similarly inconsistent rhythmic structures, consistent with our hypotheses. n = 100.

Predictor | Est. | SE | t | Unadj. p | Adj. p | df

Event density
Intercept | 0.207 | 0.028 | 7.333 | 2.37E−07 | 4.40E−07 | 22.081
Expressed emotion (1 = terror) | 0.142 | 0.048 | 2.985 | 8.74E−03 | 0.011 | 16.043

Loudness
Intercept | 0.375 | 0.030 | 12.480 | 6.17E−08 | 1.23E−07 | 11.246
Expressed emotion (1 = terror) | 0.098 | 0.052 | 1.880 | 0.077 | 0.087 | 17.654

Loudness variability
Intercept | 0.618 | 0.027 | 23.318 | 1.43E−14 | 9.30E−14 | 17.403
Expressed emotion (1 = terror) | −0.095 | 0.040 | −2.393 | 0.027 | 0.032 | 18.619

Mode
Intercept | 0.533 | 0.033 | 15.912 | 7.88E−10 | 1.71E−09 | 12.842
Expressed emotion (1 = terror) | −0.021 | 0.042 | −0.498 | 0.623 | 0.648 | 26.694

Pulse clarity
Intercept | 0.205 | 0.031 | 6.538 | 9.42E−05 | 1.29E−04 | 9.250
Expressed emotion (1 = terror) | −0.012 | 0.045 | −0.275 | 0.788 | 0.788 | 13.643

Tempo
Intercept | 0.483 | 0.041 | 11.667 | 2.28E−06 | 3.95E−06 | 8.148
Expressed emotion (1 = terror) | 0.074 | 0.058 | 1.263 | 0.228 | 0.247 | 13.631

Predictor | Est. | SE | t | Unadj. p | Adj. p | R2 | Adj. R2

Brightness
Intercept | 0.240 | 0.024 | 9.926 | <2E−16 | <1.73E−15 | — | —
Expressed emotion (1 = terror) | 0.379 | 0.034 | 11.071 | <2E−16 | <1.73E−15 | 0.556 | 0.551

Roughness
Intercept | 0.086 | 0.020 | 4.308 | 3.91E−05 | 5.98E−05 | — | —
Expressed emotion (1 = terror) | 0.213 | 0.028 | 7.523 | 2.61E−11 | 7.54E−11 | 0.366 | 0.360

Spectral centroid
Intercept | 0.200 | 0.025 | 8.110 | 1.47E−12 | 5.46E−12 | — | —
Expressed emotion (1 = terror) | 0.249 | 0.035 | 7.153 | 1.55E−10 | 4.03E−10 | 0.343 | 0.336

Spectral flatness
Intercept | 0.200 | 0.025 | 8.175 | 1.07E−12 | 4.64E−12 | — | —
Expressed emotion (1 = terror) | 0.148 | 0.035 | 4.253 | 4.82E−05 | 6.96E−05 | 0.156 | 0.147

Spectral roll-off
Intercept | 0.202 | 0.026 | 7.896 | 4.22E−12 | 1.37E−11 | — | —
Expressed emotion (1 = terror) | 0.252 | 0.036 | 6.977 | 3.58E−10 | 8.46E−10 | 0.332 | 0.325

Spectral spread
Intercept | 0.329 | 0.027 | 12.012 | <2E−16 | <1.73E−15 | — | —
Expressed emotion (1 = terror) | 0.132 | 0.039 | 3.405 | 9.59E−04 | 1.25E−03 | 0.106 | 0.097

Zero crossing rate
Intercept | 0.079 | 0.017 | 4.643 | 1.07E−05 | 1.74E−05 | — | —
Expressed emotion (1 = terror) | 0.202 | 0.024 | 8.342 | 4.70E−13 | 2.44E−12 | 0.415 | 0.409
FIG. 3.

Here, we show the results of the acoustic analyses comparing musically conveyed anxiety and terror, as measured using the FEARMUS database excerpts. The upper panel focuses on the non-timbral acoustic features, which we analyzed across the entire length of each excerpt. The lower panel displays the results for the timbral acoustic features; for these, we pseudo-randomly selected five 1-s-long segments from each of the 100 excerpts in FEARMUS, analyzed them, and averaged the values per excerpt. Consistent with our hypotheses, the results indicate that terrifying music has a brighter, harsher, noisier timbre, more musical activity per second, and less variable loudness than anxious music. Furthermore, both subtypes of fearful music have relatively unclear rhythmic structures (low pulse clarity) and are mostly in minor modalities (see Table VIII for the unnormalized mean values). Inconsistent with our hypotheses, tempo and loudness are not statistically different between the two subtypes of fear: we had predicted that terrifying music would be louder and faster than anxious music, and while our results are in the predicted direction, they are not statistically significant.


To test our hypotheses related to timbre, we used standard general linear regression analyses. For each model, the dependent variable was one of the timbral acoustic features and the predictor was target emotion (terror = 1). We report the results of these analyses in the lower half of Table IX and in the lower panel of Fig. 3. We found a main effect of target emotion on every timbral acoustic feature we measured, driven by higher values for terrifying music than for anxious music. Consistent with our hypothesis, terrifying music exhibited a brighter average timbre than anxious music, indicated by a higher average brightness (p < 0.001; R2 = 0.556; adjusted R2 = 0.551), spectral centroid (p < 0.001; R2 = 0.343; adjusted R2 = 0.336), and spectral roll-off (p < 0.001; R2 = 0.332; adjusted R2 = 0.325). Also in line with our prediction, terrifying music exhibited a noisier and rougher average timbre than anxious music, indicated by a higher roughness (p < 0.001; R2 = 0.366; adjusted R2 = 0.360), spectral flatness (p < 0.001; R2 = 0.156; adjusted R2 = 0.147), spectral spread (p < 0.002; R2 = 0.106; adjusted R2 = 0.097), and zero crossing rate (p < 0.001; R2 = 0.415; adjusted R2 = 0.409).
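Because each of these models has a single binary predictor, each reduces to an ordinary least squares regression. A minimal Python sketch with statsmodels, on hypothetical toy data standing in for the 100 excerpt values (the numbers below are illustrative only):

```python
import pandas as pd
import statsmodels.formula.api as smf

# Toy stand-in for the brightness values of six excerpts,
# coded 0 = anxiety, 1 = terror (values are illustrative only).
df = pd.DataFrame({
    "emotion":    [0, 0, 0, 1, 1, 1],
    "brightness": [0.21, 0.18, 0.25, 0.49, 0.51, 0.44],
})
fit = smf.ols("brightness ~ emotion", data=df).fit()
# The emotion coefficient, its p-value, and R^2 mirror the structure
# of the lower half of Table IX.
print(fit.params["emotion"], fit.pvalues["emotion"],
      fit.rsquared, fit.rsquared_adj)
```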

Here, we report on how music differentially conveys two subtypes of fear: anxiety and terror. To research musically conveyed subtypes of fear, we created a custom database of musical excerpts called FEARMUS. To create the database, we first curated 180 excerpts from contemporary horror film soundtracks (90 portraying terror and 90 portraying anxiety) using an expertise-based approach. We then validated the efficacy of the music at portraying the target emotions through an experiment in which participants rated the conveyed emotions of the excerpts on discrete and dimensional emotion scales. Next, we used the results of the rating experiment to filter the database down to the 100 excerpts that most typically portray terror and anxiety. Finally, we applied descriptive and acoustic analyses to FEARMUS to outline the differences between these musically portrayed subtypes of fear.

The results of our behavioral ratings demonstrated that while terrifying and anxious music are quite differentiable on dimensional emotion scales, they are less clearly differentiable using discrete emotion scales. Consistent with our hypotheses, terrifying music was rated as conveying a lower valence and a higher arousal than anxious music. Furthermore, also consistent with our hypotheses, anxious music was rated as conveying a greater degree of anxiety than terror, and terrifying music was rated as conveying a greater degree of terror than anxious music. However, inconsistent with our predictions, terrifying music was rated as conveying a greater degree of anxiety than terror, and a greater degree of anxiety than the anxious musical excerpts. Overall, the evidence provides an inconclusive picture of the degree to which these subtypes of fear are perceptually differentiable in music.

It is worthwhile to consider what factors might be driving these conflicting results. For one, there may have been some confusion between portrayed and felt emotion during the ratings task (Schubert, 2013), despite our instructions to rate conveyed emotions. Notably, we did not instruct participants on the difference between felt and portrayed emotion. The terrifying musical excerpts may have induced anxious feelings in participants, causing them to give those excerpts higher anxiety ratings. It could also be that subtypes of emotion are layered or overlapping: when music conveys terror, that emotion may be layered with a high degree of anxiety. A similar future ratings experiment might employ a forced-choice design to account for these possibilities.

Furthermore, it is also worthwhile to consider a couple of potential confounding factors in our design that may have affected the results of the behavioral rating experiment. First, while we recruited participants who could speak English well, we did not test their level of English comprehension or document their native language. This confound may have affected the results of the rating experiment, since language and culture strongly shape emotion perception and labeling (Barrett et al., 2011; Engelmann and Pogosyan, 2013; Ogarkova, 2016). Different languages contain vastly different numbers of emotion terms (Ogarkova, 2016); for example, Dutch has been found to contain 1501 emotion terms (Hoekstra, 1986; Ogarkova, 2016), while Czech contains only 404 (Ogarkova, 2016; Slaměník et al., 2008). While our inclusion of emotion definitions may have helped to control for different perceptions of emotion across participants, there still may have been differences in their approach to the task linked to diverse native languages and cultural backgrounds. Second, participants may have varied in their emotional granularity, both in terms of perception and communication, which may have confounded our results. Participants with lower emotional granularity may have struggled more with the emotion rating task and provided less accurate ratings than participants with higher emotional granularity. Our study did not attempt to measure the emotional granularity of our participants.

The results of the descriptive and acoustic analyses provide more substantial evidence for the existence of at least two differentiable subtypes of musically conveyed fear. While some of their musical and acoustic features overlap, anxious and terrifying music display some striking sonic differences. Generally, terrifying music has a brighter, harsher, and rougher timbre and is musically denser than anxious music. Anxious music has a greater degree of loudness variability than terrifying music. In terms of their similarities, both anxious and terrifying music tend towards minor modalities and are rhythmically unpredictable. Finally, while the descriptive analysis indicated that terrifying music is typically faster and louder than anxious music, the results of the acoustic analyses were inconsistent with this observation. Recall that terrifying music sometimes has no tempo at all, consisting instead of static walls of noise; such instances may have pulled the average tempo down to a level not much faster than that of the anxious excerpts. Regarding loudness, terrifying excerpts occasionally feature long crescendos toward intense climaxes; their quieter beginnings may likewise have lowered the average loudness to a level comparable to that of the anxious excerpts. Possible explanations aside, it is interesting to compare these findings to Juslin's (2019) summary of previous findings on musically conveyed fear and to McClelland's (2012, 2014, 2017b) descriptions of ombra and tempesta. As highlighted in Table I, while Juslin (2019) describes fearful music as having fast tempi and low sound levels, McClelland (2012, 2014, 2017b) describes ombra as having slow-to-moderate tempi and tempesta as generally very loud. However, while we expected anxious and terrifying music to mirror ombra and tempesta in these critical differences, those expectations were not borne out in our results.

Overall, our results provoke the question of whether previous descriptions of musically conveyed fear [e.g., as summarized in Juslin (2019)] are adequate. Researching fear as a broad category in music cognition research may have produced overgeneralized accounts that conflate and under-describe musically conveyed subtypes of fear. By contrast, our results align well with McClelland's (2012, 2014, 2017a,b) descriptions of ombra and tempesta. This finding demonstrates the benefit of integrating traditional music theories with psychological approaches in music cognition research: such interdisciplinary approaches can uncover richer, better-informed portraits of how music functions.

In conclusion, there does indeed appear to be a strong sonic difference between at least two subtypes of musically conveyed fear. However, it remains inconclusive whether these subtypes of fear are clearly perceptually distinguishable from one another. To better uncover how distinguishable subtypes of musically conveyed emotions (including terror and anxiety) are, it is vital that researchers incorporate emotional granularity into future experimental designs. Considering subtypes of emotions is essential for avoiding overgeneralizations and incorrect conclusions about how music conveys emotion, although accounting for different languages and emotion constructs across cultures will be a challenging aspect of such future work.

C.T. received funding from the European Union's Horizon 2020 research and innovation program under the Marie Skłodowska-Curie grant agreement (Grant No. 835682). S.F. received funding from the Swiss National Science Foundation (Grant Nos. SNSF PP00P1_157409/1 and PP00P1_183711/1). We thank Arkady Konovalov and the members of the Cognitive and Affective Neuroscience Laboratory at UZH for their feedback and support.

1. The other four of the five most researched emotions are sadness, happiness, anger, and tenderness/love (Juslin, 2019).

2. For a more extensive discussion of the potential impact of incorporating emotional granularity into music and emotion research, see Chap. 10 of Warrenburg (2019b).

3. Topic Theory is used to define and catalogue instances in which certain combinations of musical features consistently and reliably communicate clear cultural associations and related emotions (Mirka, 2014; Ratner, 1980). For instance, one topic is the "march," which consists of brass instruments and drums, an upbeat even tempo, a major key, and exciting melodies (Ratner, 1980). The cultural associations with such a combination of musical features include celebrations, holidays, parades, and the military, along with related emotions such as joy, happiness, excitement, or pride.

4. For a table describing the musical features that convey ombra and tempesta, see McClelland's (2014) chapter "Ombra and Tempesta" in The Oxford Handbook of Topic Theory, pp. 279–300.

5. The experimenter also relied on the plots of the films to find these excerpts. Moments when antagonizing forces attacked the protagonists typically had music that matched the tempesta criteria, and moments of suspense typically had music matching the ombra criteria.

6. We elected to have each participant rate only one third of the excerpts due to time constraints.

7. The code first generated 35 lists containing the numbers 1–90 in a random order (e.g., 14, 32, 6, 87, 10, etc.). Each of those lists was then split into thirds to create 105 shorter lists of 30 numbers; each group of three lists thus encompassed all 90 excerpts in a random order with no overlap between those three lists. Those shorter lists were then used to index the list of audio files during the experiment. This procedure ensured that each participant heard a random third of the 90 excerpts and that each excerpt was rated by a third of the participants, evenly but randomly distributing the excerpts across the participants.
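A minimal Python sketch of this list-generation procedure (the original code is not reproduced here; the seed and variable names are illustrative assumptions):

```python
import random

random.seed(2023)                       # arbitrary seed, for reproducibility only
excerpt_ids = list(range(1, 91))        # the 90 excerpts
presentation_lists = []
for _ in range(35):                     # 35 random orderings of all 90 excerpts
    order = random.sample(excerpt_ids, k=len(excerpt_ids))
    # split each ordering into thirds: three non-overlapping lists of 30
    presentation_lists += [order[0:30], order[30:60], order[60:90]]
assert len(presentation_lists) == 105   # 35 orderings x 3 thirds
```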

8. We based our definitions of the discrete emotions on entries in the online Merriam-Webster dictionary (Merriam-Webster, 2023). Sometimes we altered a word or two to make the definitions more accessible (e.g., "misfortune" instead of "ill"). The definitions given to participants were as follows: Happiness—a state of well-being and contentment, joy; Tenderness—a tender quality or condition, such as gentleness and affection; Anxiety—apprehensive uneasiness or nervousness usually over an impending or anticipated misfortune; and Terror—a state of intense or overwhelming fear. Additionally, we used the Self-Assessment Manikin (Bradley and Lang, 1994), flipped horizontally to match the directionality of the rating scales (negative-positive/low-high), to demonstrate the meaning of valence and arousal.

9. Forty-two participants took the GMSI in English, 63 in German. The results showed a mean score of 63.79 (SD = 18.63) with no outliers and a normal distribution. This score is well below the average norm (M = 81.58; SD = 20.64) validated by the authors of the GMSI (Müllensiefen et al., 2014), supporting our claim that our participants were non-musicians.

10. The other surveys that participants completed were the Autism Spectrum Quotient, Beck Depression Inventory, Big Five Inventory, Positive and Negative Affect Schedule, Levenson Self-Report Psychopathy Scale, and the State-Trait Anxiety Inventory. These surveys were part of a larger study; therefore, we do not comment on their results in this paper.

11. For detailed information on how each acoustic feature is extracted by the MIR Toolbox (Lartillot et al., 2007), see the User's Manual (version 1.8.1) (University of Jyväskylä, 2023).

12. See supplementary material at https://www.scitation.org/doi/suppl/10.1121/10.0016857 for Table S1, containing metadata, typicality indices, and ranking information for all 100 excerpts in the FEARMUS database.

1. Adolphs, R. (2013). "The biology of fear," Curr. Biol. 23(2), R79–R93.
2. Adolphs, R., Mlodinow, L., and Barrett, L. F. (2019). "What is an emotion?," Curr. Biol. 29(20), R1060–R1064.
3. Barrett, L. F. (2004). "Feelings or words? Understanding the content in self-report ratings of experienced emotion," J. Personality Social Psychol. 87(2), 266–281.
4. Barrett, L. F. (2017). How Emotions Are Made: The Secret Life of the Brain (Houghton Mifflin Harcourt, New York).
5. Barrett, L. F., Mesquita, B., and Gendron, M. (2011). "Context in emotion perception," Curr. Dir. Psychol. Sci. 20(5), 286–290.
6. Bates, D., Mächler, M., Bolker, B., and Walker, S. (2015). "Fitting linear mixed-effects models using lme4," J. Stat. Soft. 67(1), 1–48.
7. Bradley, M. M., and Lang, P. J. (1994). "Measuring emotion: The self-assessment manikin and the semantic differential," J. Behav. Ther. Exp. Psych. 25(1), 49–59.
8. Cespedes-Guevara, J., and Eerola, T. (2018). "Music communicates affects, not basic emotions—A constructionist account of attribution of emotional meanings to music," Front. Psychol. 9, 215.
9. Chudy, M. (2016). "Discriminating music performers by timbre: On the relation between instrumental gesture, tone quality and perception in classical cello performance," Queen Mary University of London, https://qmro.qmul.ac.uk/xmlui/handle/123456789/18378 (Last viewed January 10, 2023).
10. Eerola, T., and Vuoskoski, J. K. (2011). "A comparison of the discrete and dimensional models of emotion in music," Psychol. Music 39(1), 18–49.
11. Engelmann, J. B., and Pogosyan, M. (2013). "Emotion perception across cultures: The role of cognitive mechanisms," Front. Psychol. 4, 118.
12. Evans, D. A., Stempel, A. V., Vale, R., Ruehle, S., Lefler, Y., and Branco, T. (2018). "A synaptic threshold mechanism for computing escape decisions," Nature 558, 590–594.
13. Gross, C. T., and Canteras, N. S. (2012). "The many paths to fear," Nat. Rev. Neurosci. 13(9), 651–658.
14. Harper, C. A., Satchell, L. P., Fido, D., and Latzman, R. D. (2021). "Functional fear predicts public health compliance in the COVID-19 pandemic," Int. J. Ment. Health Addiction 19(5), 1875–1888.
15. Hitchcock, A. (1960). Psycho (Paramount Pictures, Los Angeles, CA).
16. Hoekstra, H. A. (1986). Cognition and Affect in the Appraisal of Events (University of Groningen, Groningen).
17. Huron, D. (2015). "Cues and signals: An ethological approach to music-related emotion," Signata 6, 331–351.
18. IMDb (2023). http://www.imdb.com (Last viewed January 10, 2023).
19. Juslin, P. N. (2000). "Cue utilization in communication of emotion in music performance: Relating performance to perception," J. Exp. Psychol. Hum. Percept. Perform. 26(6), 1797–1812.
20. Juslin, P. N. (2019). Musical Emotions Explained: Unlocking the Secrets of Musical Affect (Oxford University Press, Oxford).
21. Juslin, P. N., and Laukka, P. (2003). "Communication of emotions in vocal expression and music performance: Different channels, same code?," Psychol. Bull. 129(5), 770–814.
22. Juslin, P. N., and Västfjäll, D. (2008). "Emotional responses to music: The need to consider underlying mechanisms," Behav. Brain Sci. 31(5), 559–575.
23. Kumar, P. H., and Mohanty, M. N. (2016). "Efficient feature extraction for fear state analysis from human voice," Indian J. Sci. Technol. 9(38), 1–11.
24. Kunwar, P. S., Zelikowsky, M., Remedios, R., Cai, H., Yilmaz, M., Meister, M., and Anderson, D. J. (2015). "Ventromedial hypothalamic neurons control a defensive emotion state," eLife 4, e06633.
25. Kuznetsova, A., Brockhoff, P. B., and Christensen, R. H. B. (2017). "lmerTest package: Tests in linear mixed effects models," J. Stat. Soft. 82(13), 1–26.
26. Lartillot, O. (2021). MIRtoolbox 1.8.1 User's Manual.
27. Lartillot, O., Eerola, T., Toiviainen, P., and Fornari, J. (2008). "Multi-feature modeling of pulse clarity: Design, validation, and optimization," pp. 521–526.
28. Lartillot, O., Toiviainen, P., and Eerola, T. (2007). "A MATLAB toolbox for music information retrieval," in Data Analysis, Machine Learning and Applications (Springer, Berlin), pp. 261–268.
29. LeDoux, J. E. (2014). "Coming to terms with fear," Proc. Natl. Acad. Sci. U.S.A. 111(8), 2871–2878.
30. Lin, D., Boyle, M. P., Dollar, P., Lee, H., Lein, E. S., Perona, P., and Anderson, D. J. (2011). "Functional identification of an aggression locus in the mouse hypothalamus," Nature 470(7333), 221–226.
31. McClelland, C. (2012). Ombra: Supernatural Music in the Eighteenth Century (Lexington Books, Lanham, MD).
32. McClelland, C. (2014). "Ombra and tempesta," in The Oxford Handbook of Topic Theory, edited by D. Mirka (Oxford University Press, Oxford), pp. 279–300.
33. McClelland, C. (2017a). "Of gods and monsters: Signification in Franz Waxman's film score Bride of Frankenstein," J. Film Music 7(1), 5–19.
34. McClelland, C. (2017b). Tempesta: Stormy Music in the Eighteenth Century (Lexington Books, Lanham, MD).
35. Merriam-Webster (2023). https://www.merriam-webster.com (Last viewed September 1, 2021).
36. Metacritic (2023). http://www.metacritic.com (Last viewed January 10, 2023).
37. Mirka, D. (2014). The Oxford Handbook of Topic Theory (Oxford University Press, Oxford).
38. Mobbs, D., Adolphs, R., Fanselow, M. S., Barrett, L. F., LeDoux, J. E., Ressler, K., and Tye, K. M. (2019). "Viewpoints: Approaches to defining and investigating fear," Nat. Neurosci. 22(8), 1205–1216.
39. Mobbs, D., Marchant, J. L., Hassabis, D., Seymour, B., Tan, G., Gray, M., Petrovic, P., Dolan, R. J., and Frith, C. D. (2009). "From threat to fear: The neural organization of defensive fear systems in humans," J. Neurosci. 29(39), 12236–12243.
40. Mobbs, D., Petrovic, P., Marchant, J. L., Hassabis, D., Weiskopf, N., Seymour, B., Dolan, R. J., and Frith, C. D. (2007). "When fear is near: Threat imminence elicits prefrontal-periaqueductal gray shifts in humans," Science 317(5841), 1079–1083.
41. Müllensiefen, D., Gingras, B., Musil, J., and Stewart, L. (2014). "The musicality of non-musicians: An index for assessing musical sophistication in the general population," PLoS One 9(2), e89642.
42. Ogarkova, A. (2016). "Translatability of emotions," in Emotion Measurement (Elsevier, Amsterdam), pp. 575–599.
43. Open Science Framework (2023). https://osf.io/8sjtw/ (Last viewed January 10, 2023).
44. Paquette, S., Peretz, I., and Belin, P. (2011). "The 'Musical Emotional Bursts': A validated set of musical affect bursts to investigate auditory affective processing," Front. Psychol. 4, 509.
45. Peeters, G., Giordano, B. L., Susini, P., Misdariis, N., and McAdams, S. (2011). "The timbre toolbox: Extracting audio descriptors from musical signals," J. Acoust. Soc. Am. 130(5), 2902–2916.
46. Perkins, A. M., Inchley-Mort, S. L., Pickering, A. D., Corr, P. J., and Burgess, A. P. (2012). "A facial expression for anxiety," J. Pers. Soc. Psychol. 102(5), 910–924.
47. Ratner, L. G. (1980). "Topics," in Classic Music: Expression, Form, and Style (Schirmer Books, New York).
48. R Core Team (2019). "R: A language and environment for statistical computing," R Foundation for Statistical Computing, Vienna, Austria, https://www.r-project.org (Last viewed January 10, 2023).
49. Schellenberg, E. G., Peretz, I., and Vieillard, S. (2008). "Liking for happy- and sad-sounding music: Effects of exposure," Cogn. Emotion 22(2), 218–237.
50. Schubert, E. (2013). "Emotion felt by the listener and expressed by the music: Literature review and theoretical perspectives," Front. Psychol. 4, 837.
51. Sethares, W. A. (2005). Tuning, Timbre, Spectrum, Scale (Springer Science & Business Media, New York).
52. Slaměník, I., Hurychová, Z., and Kebza, V. (2008). "Socio-cultural dependency of emotions: Comparative analysis using prototype approach," in Psychosocial Aspects of Transformation of the Czech Society within the Context of European Integration (Matfyzpress, Prague), pp. 105–122.
53. Trevor, C., Arnal, L. H., and Frühholz, S. (2020). "Terrifying film music mimics alarming acoustic feature of human screams," J. Acoust. Soc. Am. 147(6), EL540–EL545.
54. Tzanetakis, G., and Cook, P. (2002). "Musical genre classification of audio signals," IEEE Trans. Speech Audio Process. 10(5), 293–302.
55. University of Jyväskylä (2023). https://www.jyu.fi/hytk/fi/laitokset/mutku/en/research/materials/mirtoolbox/ (Last viewed January 10, 2023).
56. Vieillard, S., Peretz, I., Gosselin, N., Khalfa, S., Gagnon, L., and Bouchard, B. (2008). "Happy, sad, scary and peaceful musical excerpts for research on emotions," Cogn. Emotion 22(4), 720–752.
57. Warrenburg, L. A. (2019a). "Comparing musical and psychological emotion theories," Psychomusicol.: Music Mind Brain 30(1), 1–19.
58. Warrenburg, L. A. (2019b). "Subtle semblances of sorrow: Exploring music, emotional theory, and methodology," Ph.D. thesis, The Ohio State University, Columbus, OH.
59. Warrenburg, L. A. (2020a). "Choosing the right tune: A review of music stimuli used in emotion research," Music Percept. 37(3), 240–258.
60. Warrenburg, L. A. (2020b). "Redefining sad music: Music's structure suggests at least two sad states," J. New Music Res. 49(4), 373–386.
61. Warrenburg, L. A. (2021). "The PUMS database: A corpus of previously-used musical stimuli in 306 studies of music and emotion," Empirical Musicol. Rev. 16(1), 145–150.
62. Yang, X., Fang, Y., Chen, H., Zhang, T., Yin, X., Man, J., Yang, L., and Lu, M. (2021). "Global, regional and national burden of anxiety disorders from 1990 to 2019: Results from the Global Burden of Disease Study 2019," Epidemiol. Psychiatr. Sci. 30, e36.
63. YouTube (2023a). https://tinyurl.com/4wbjcumx (Last viewed January 10, 2023).
64. YouTube (2023b). https://tinyurl.com/35zv5e5r (Last viewed January 10, 2023).

Supplementary Material