Posted on 09 December 2012

Part 8 in a series of videos recorded at ACM MIRUM 2012 in Nara, Japan.

One difficulty in tempo estimation is octave errors: a song played at 80 beats per minute (BPM) could be detected as 40 BPM or 160 BPM. Geoffroy Peeters presents an alternative approach, perceptual tempo estimation, which uses tempo estimates provided by humans both to define the reference tempo and to train the tempo estimator. The training data were collected in large-scale experiments in which users provided tempo estimates for several songs. These estimates are then supplied to the machine learning method, GMM (Gaussian mixture model) regression, which builds a probabilistic model of a joint feature space containing both the computationally determined tempo features (energy variation, harmonic variation, spectral balance variation, and short-term event repetition) and the perceptual tempo estimates. Finally, to estimate the tempo of a new input, the model infers the tempo from the computationally determined features in a maximum-likelihood fashion.
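To make the inference step more concrete, here is a minimal sketch of how a GMM regression might turn a feature value into a tempo estimate. It is not the implementation from the talk: it assumes a single scalar feature instead of the four feature types, and the mixture parameters in `COMPONENTS` are invented for illustration rather than fitted to any perceptual data. The function name `estimate_tempo` is also hypothetical.

```python
import math

# Hypothetical joint GMM over (feature x, perceptual tempo y in BPM).
# Each component: (weight, mean_x, mean_y, var_x, var_y, cov_xy).
# These numbers are made up for illustration; a real model would be
# fitted (for example with EM) to features paired with human tempo labels.
COMPONENTS = [
    (0.5, 0.3, 70.0, 0.01, 100.0, 0.50),   # "slow" cluster
    (0.5, 0.7, 140.0, 0.01, 225.0, 0.75),  # "fast" cluster
]


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2 * math.pi * var)


def estimate_tempo(x, components=COMPONENTS):
    """Return (expected_tempo, most_likely_component_tempo) given feature x."""
    # How well each component explains the observed feature value.
    likelihoods = [w * gaussian_pdf(x, mx, vx)
                   for (w, mx, _my, vx, _vy, _cxy) in components]
    total = sum(likelihoods)
    responsibilities = [l / total for l in likelihoods]

    # Conditional mean of tempo within each component:
    # E[y | x, k] = mean_y + cov_xy / var_x * (x - mean_x)
    cond_means = [my + cxy / vx * (x - mx)
                  for (_w, mx, my, vx, _vy, cxy) in components]

    # Responsibility-weighted average over all components.
    expected = sum(r * m for r, m in zip(responsibilities, cond_means))

    # Conditional mean of the single most responsible component.
    best_k = max(range(len(components)), key=lambda k: responsibilities[k])
    return expected, cond_means[best_k]


if __name__ == "__main__":
    for x in (0.3, 0.5, 0.7):
        expected, best = estimate_tempo(x)
        print(f"feature={x:.1f}  expected={expected:.1f} BPM  best component={best:.1f} BPM")
```

With these made-up parameters, a feature value near one cluster yields a tempo close to that cluster's mean, while a value halfway between the clusters is ambiguous and the weighted average lands between the two candidate tempi. That ambiguity loosely mirrors the octave-error problem, which in this approach the human-provided training labels are meant to help resolve.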