This study investigated the abilities of listeners to classify various sorts of musical stimuli as major vs minor. All stimuli combined four pure tones: low and high tonics (G in two octaves), dominant (D), and either a major third (B) or a minor third (B♭). Especially interesting results were obtained using tone-scrambles, randomly ordered sequences of pure tones presented at 15 per second. All tone-scrambles tested comprised 16 G's (low and high tonics together), 8 D's, and either 8 B's or 8 B♭'s. The distribution of proportion correct across 275 listeners tested over the course of three experiments was strikingly bimodal, with one mode very close to chance performance, and the other very close to perfect performance. Testing with tone-scrambles thus sorts listeners fairly cleanly into two subpopulations. Listeners in subpopulation 1 are sufficiently sensitive to major vs minor to classify tone-scrambles nearly perfectly; listeners in subpopulation 2 (comprising roughly 70% of the population) have very little sensitivity to major vs minor. Skill in classifying major vs minor tone-scrambles shows a modest correlation of around 0.5 with years of musical training.
REFERENCES
The reader may wonder what would happen to performance in the major/minor classification task if the tone-scramble tonic were roved randomly from trial to trial. Pilot studies show that this makes the task much harder. None of the listeners we tested performed better in the roved-tonic version of the task; many who were near perfect in the basic tone-scramble task were severely impaired in the roved-tonic condition. Part of the reason for the increased difficulty may be that, in the roved-tonic condition, the tonic needs to be established anew in each tone-scramble. If tones at the third degree of the scale register as such only after the tonic is established, then these critical tones may have little impact on the listener's decision when they occur early in the tone-scramble (before the tonic has been established). By contrast, in the un-roved condition, the tonic persists across trials; thus, tones at the third degree of the scale can exert a powerful impact even when they occur very early in the stimulus.
For chords, the eight example stimuli formed four major-minor pairs, each pair consisting of a major chord followed by a minor chord at the same brightness. The pairs were presented in this order of brightness: (i, ii) the lowest of the five possible brightnesses, (iii, iv) the second-highest, (v, vi) the second-lowest, and (vii, viii) the highest. For tone-scrambles, as in Experiment 2, the eight example stimuli included two each of all four types of tone-scrambles (high and low pitch-height, major and minor), alternating between major and minor.
Previous studies have addressed similar questions by comparing the performance of trained musicians in various auditory tasks to that of nonmusicians. For example, Spiegel and Watson (Ref. 33) documented that trained musicians performed better (on average) than a matched sample of nonmusicians on several different tasks requiring the listener to discriminate the pitches of notes. Similarly, McDermott et al. (Ref. 35) found that trained musicians were better than nonmusicians at several different sorts of tasks requiring discrimination of note pitch, note loudness, note brightness as well as discrimination of two-note intervals in these different note properties. If, however, the sensitivity required to perform well in the tone-scramble classification task is (i) unlearnable for some people, and (ii) important for musical performance, then listeners who are highly sensitive to the difference between major and minor tone-scrambles will be more likely to proceed to higher levels of training than insensitive listeners. Under this scenario, the strategy of directly comparing the sensitivities of nonmusicians vs trained musicians might well yield a strong positive correlation between musical training and sensitivity in the tone-scramble task even though no causal relationship exists in fact.
If the prior density, , were nonuniform, then we would have .
Although the histogram shown in Fig. 1 appears strongly bimodal, one might wonder whether this impression would be confirmed by a formal statistical test. We performed such a test. Specifically, we used a likelihood ratio test to compare the fits provided to these data by two nested models. The restricted (unimodal) model assumed that the number of correct responses for each listener is a binomial random variable with parameters p and n = 45, where p is a random variable drawn from a beta density $f_{z,w}(p)$. (By varying z and w, $f_{z,w}$ can flexibly capture a wide range of different unimodal density functions defined on the interval [0, 1].) The fuller (bimodal) model assumed that p is drawn from a mixture of two beta densities. Because the models are nested, the fit provided by the fuller model must be at least as good as that provided by the restricted model; the likelihood ratio test assesses whether the improvement in fit is significantly greater than would be expected by chance under the null hypothesis that the restricted model is true. This test rejects the null hypothesis with a vanishingly small p-value for the data shown in Fig. 1. However, it also rejects the null hypothesis with very small p-values for all of the other data sets we report, even those whose histograms appear much more unimodal than Fig. 1 (e.g., those in the right two panels of Fig. 5). We conclude that none of our data sets are actually unimodal. The mixture parameter in the fuller model reflects the strength of the data’s bimodality, with small values indicating little need for a second mode. We do not report these values, however, because they merely confirm what is already clear from the histograms.
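To make the nested-model comparison concrete, the following is a minimal sketch of how a likelihood ratio statistic of this kind could be computed. It is not the authors' code. The function names, the coarse parameter grids, and the synthetic counts are all hypothetical, and a crude grid search stands in for a proper maximum-likelihood optimizer. No p-value is computed, because the text does not state which reference distribution was used for the statistic.

```python
import math
from collections import Counter
from itertools import product


def log_betabinom_pmf(k, n, z, w):
    """log P(k correct of n) when p ~ Beta(z, w) and k | p ~ Binomial(n, p)."""
    def log_beta(a, b):
        return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)
    log_choose = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    return log_choose + log_beta(k + z, n - k + w) - log_beta(z, w)


# Hypothetical search grids (assumptions for illustration, not from the paper).
SHAPES = (0.5, 1, 2, 4, 8, 16, 32)              # candidate beta parameters z, w
WEIGHTS = tuple(i / 10 for i in range(1, 11))   # mixture weights 0.1 ... 1.0


def lr_statistic(counts, n=45, shapes=SHAPES, weights=WEIGHTS):
    """Return (2 * (ll_full - ll_restricted), ll_restricted, ll_full).

    Both log-likelihoods are maximized over the grids only, so the
    statistic is an approximation to the true likelihood ratio statistic.
    """
    hist = Counter(counts)  # number of listeners at each score
    # Beta-binomial probability of each observed score for every (z, w) pair.
    pmf = {zw: {k: math.exp(log_betabinom_pmf(k, n, *zw)) for k in hist}
           for zw in product(shapes, repeat=2)}

    def loglik(mix, c1, c2):
        # Mixture of two beta-binomials; mix = 1.0 ignores component c2.
        return sum(m * math.log(mix * pmf[c1][k] + (1 - mix) * pmf[c2][k])
                   for k, m in hist.items())

    # Restricted model: a single beta density (weight 1 on one component).
    ll_restricted = max(loglik(1.0, c, c) for c in pmf)
    # Fuller model: the grid includes mix = 1.0, so it nests the restricted one.
    ll_full = max(loglik(mix, c1, c2)
                  for mix in weights for c1 in pmf for c2 in pmf)
    return 2 * (ll_full - ll_restricted), ll_restricted, ll_full


# Synthetic scores out of 45 (invented for illustration, not the study's data).
near_chance = [20, 21, 22, 23, 24, 25, 26, 22, 23, 24] * 3
near_perfect = [43, 44, 45, 45, 45] * 3
stat, ll_r, ll_f = lr_statistic(near_chance + near_perfect)
print(f"restricted: {ll_r:.2f}  full: {ll_f:.2f}  LR statistic: {stat:.2f}")
```

Because the weight grid contains 1.0, the fuller model can always reproduce the best restricted fit, so the statistic is never negative; this mirrors the nesting argument in the text. Replacing the grid search with a numerical optimizer would give a sharper statistic.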