A method for automatic transcription of polyphonic music is proposed in this work that models the temporal evolution of musical tones. The model extends the shift-invariant probabilistic latent component analysis method by supporting the use of spectral templates that correspond to sound states such as attack, sustain, and decay. The order of these templates is controlled using hidden Markov model-based temporal constraints. In addition, the model can exploit multiple templates per pitch and instrument source. The shift-invariant aspect of the model makes it suitable for music signals that exhibit frequency modulations or tuning changes. Pitch-wise hidden Markov models are also utilized in a postprocessing step for note tracking. For training, sound state templates were extracted for various orchestral instruments using isolated note samples. The proposed transcription system was tested on multiple-instrument recordings from various datasets. Experimental results show that the proposed model is superior to a non-temporally constrained model and also outperforms various state-of-the-art transcription systems for the same experiment.
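As an illustration of the note-tracking step mentioned above, the following is a minimal sketch of a two-state (off/on) hidden Markov model decoded with the Viterbi algorithm for a single pitch. It is not the authors' implementation: the function name `track_notes`, the self-transition probability `p_stay`, the uniform prior, and the observation model (the activation itself is treated as the probability of the "on" state) are all assumptions chosen for illustration.

```python
import numpy as np


def track_notes(activation, p_stay=0.9, eps=1e-6):
    """Viterbi-decode a two-state (0 = off, 1 = on) HMM for one pitch.

    activation: 1-D sequence of pitch activations scaled to [0, 1].
    p_stay: assumed self-transition probability, shared by both states.
    """
    act = np.clip(np.asarray(activation, dtype=float), eps, 1.0 - eps)
    # Assumed observation model: P(obs | on) = act, P(obs | off) = 1 - act.
    log_obs = np.log(np.stack([1.0 - act, act], axis=1))  # shape (T, 2)
    log_trans = np.log(np.array([[p_stay, 1.0 - p_stay],
                                 [1.0 - p_stay, p_stay]]))
    log_prior = np.log(np.array([0.5, 0.5]))  # assumed uniform prior

    n_frames = len(act)
    delta = np.empty((n_frames, 2))           # best log-score ending in each state
    backptr = np.zeros((n_frames, 2), dtype=int)
    delta[0] = log_prior + log_obs[0]
    for t in range(1, n_frames):
        # scores[i, j]: best score at t-1 in state i, then moving to state j.
        scores = delta[t - 1][:, None] + log_trans
        backptr[t] = np.argmax(scores, axis=0)
        delta[t] = np.max(scores, axis=0) + log_obs[t]

    # Backtrack from the best final state to recover the state sequence.
    states = np.empty(n_frames, dtype=int)
    states[-1] = int(np.argmax(delta[-1]))
    for t in range(n_frames - 1, 0, -1):
        states[t - 1] = backptr[t, states[t]]
    return states


# A one-frame dip inside a note is smoothed over instead of splitting the note.
print(track_notes([0.05, 0.05, 0.95, 0.3, 0.95, 0.95, 0.05, 0.05]))
```

Compared with simple thresholding, the transition penalty discourages spurious one-frame note onsets and offsets. In a full system one such model would be run independently for each pitch.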
March 2013
March 06 2013
Multiple-instrument polyphonic music transcription using a temporally constrained shift-invariant model
Emmanouil Benetos a)
Centre for Digital Music, School of Electronic Engineering and Computer Science, Queen Mary University of London, Mile End Road, London E1 4NS, United Kingdom
Simon Dixon
Centre for Digital Music, School of Electronic Engineering and Computer Science, Queen Mary University of London, Mile End Road, London E1 4NS, United Kingdom
a) Author to whom correspondence should be addressed. Electronic mail: emmanouilb@eecs.qmul.ac.uk
J. Acoust. Soc. Am. 133, 1727–1741 (2013)
Article history
Received: August 16 2012
Accepted: January 22 2013
Citation
Emmanouil Benetos, Simon Dixon; Multiple-instrument polyphonic music transcription using a temporally constrained shift-invariant model. J. Acoust. Soc. Am. 1 March 2013; 133 (3): 1727–1741. https://doi.org/10.1121/1.4790351