Posted on 22 March 2010

Paper details:

Methods that employ dictionary learning and sparse coding have become popular for discovering structure in acoustic signals. Unfortunately, these methods share a common limitation. When the dictionary atoms present in a signal overlap significantly in time and frequency, it becomes difficult for these methods to learn the atoms properly. Often, the learning procedure represents information from multiple atoms as a single atom. If an atom in the output dictionary contains musical information from multiple sources, transcription and source separation cannot be performed accurately.

In this paper, we propose a novel dictionary learning method that performs well despite the presence of spectral-temporal overlap among dictionary atoms. Our method imposes a harmonic constraint that restricts each atom to represent at most one pitch. Furthermore, our method is flexible: it allows the size of the dictionary to grow with the complexity of the input signal. Our method consistently achieves higher recall and precision than other well-known dictionary learning algorithms.
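
To make the idea of a harmonic constraint concrete, here is a minimal sketch in Python. It is not the paper's algorithm, whose details are not given in the abstract. It only illustrates one plausible way to restrict a spectral atom to a single pitch: keep the frequency bins near integer multiples of a candidate fundamental and zero everything else. The function names `harmonic_mask` and `project_atom`, the tolerance `tol_hz`, and the toy spectrum are all assumptions made for illustration.

```python
import numpy as np

def harmonic_mask(freqs, f0, tol_hz):
    """True for bins within tol_hz of a positive integer multiple of f0."""
    # Nearest harmonic number for each bin, clamped so that 0 Hz maps to the fundamental.
    harmonic_number = np.maximum(np.round(freqs / f0), 1.0)
    return np.abs(freqs - harmonic_number * f0) <= tol_hz

def project_atom(atom, mask):
    """Zero energy outside the harmonic mask, then rescale to unit L2 norm."""
    constrained = np.where(mask, atom, 0.0)
    norm = np.linalg.norm(constrained)
    return constrained / norm if norm > 0 else constrained

# Toy spectrum (hypothetical): 0 to 1990 Hz in 10 Hz bins, with energy from two pitches.
freqs = np.arange(0.0, 2000.0, 10.0)
atom = np.zeros_like(freqs)
atom[[22, 44, 66]] = 1.0   # 220, 440, 660 Hz: harmonics of 220 Hz
atom[[31, 62]] = 1.0       # 310, 620 Hz: harmonics of a second pitch

mask = harmonic_mask(freqs, f0=220.0, tol_hz=5.0)
single_pitch_atom = project_atom(atom, mask)
```

After the projection, `single_pitch_atom` keeps only the energy at 220, 440, and 660 Hz, so the atom can no longer mix information from the second pitch. A real method would also need to choose the fundamental for each atom and decide when to add new atoms, which this sketch does not attempt.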