The Raag Classifier

Buoyed by our success with the instrument classifier, we next attempted something more elusive: a raag classifier. The purpose of this project was to classify audio clips into four commonly used raags: pahadi, darbari, yaman and bhairavi. We trained it exclusively on sitar recordings, since the sitar is the most widely used classical instrument nowadays, but hoped to expand it so that it would work for every instrument. We started off with the YAMNet model. This did not work, perhaps because the model was meant to distinguish between different kinds of sounds, not to detect patterns in music. We then tried a variety of other approaches, such as making our neural network larger and more complex and using MFCCs (mel-frequency cepstral coefficients, features that approximate how the human ear perceives sound), but they did not work either.

A More Technical Perspective

We downloaded forty videos from YouTube, ten for each raag, for our dataset. The videos consisted primarily of sitar music and varied greatly in length. We then cut each video into twenty 15-second audio files, for a grand total of 800 fifteen-second audio files.
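
The cutting step can be sketched roughly as follows. This is a minimal illustration, not our original script: the function name split_into_clips is hypothetical, and it assumes the clips are evenly spaced across the recording, which is only one reasonable way of choosing them.

```python
import numpy as np

def split_into_clips(samples, sample_rate, clip_seconds=15, n_clips=20):
    """Cut n_clips evenly spaced, non-overlapping clips from a mono waveform.

    Assumption: clips are spread evenly over the whole recording.
    """
    clip_len = int(clip_seconds * sample_rate)
    if len(samples) < clip_len * n_clips:
        raise ValueError("recording is too short for the requested clips")
    # Evenly spaced start points, from the first sample to the last valid start.
    last_start = len(samples) - clip_len
    starts = np.linspace(0, last_start, n_clips).astype(int)
    return [samples[s:s + clip_len] for s in starts]

# 40 videos x 20 clips per video = 800 clips in total.
TOTAL_CLIPS = 40 * 20
```

Spreading the clips over the whole video, rather than taking the first five minutes, is meant to keep slow opening passages from dominating the dataset.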

The YAMNet Raag Attempt

We adapted the YAMNet code from our earlier 11-genre classifier to this problem, confident that its earlier success would carry over. What we did not realize then was that this was an entirely different problem from the one YAMNet was built for. YAMNet is meant to tell different kinds of sounds apart, whereas identifying a raag depends on which frequencies recur within a recording, so transfer learning did not help in this case. We got a dismal dev accuracy of around 0.35, not far above the 0.25 expected from guessing at random among four equally represented raags, which necessitated a fresh attempt at the problem.

The MFCC Attempt

We next approached the task using MFCCs, as we had done in our earlier projects. We hoped this would improve performance, since MFCCs are widely used by other programmers to identify raags, but we observed no noticeable difference.

Our Efforts to Decrease the Bias

We concluded that this was due to an exceedingly high avoidable bias, which we sought to correct by increasing the size of the network and by extracting more features. Neither effort bore any fruit. We increased the neural network’s hidden layers to four and considerably increased the number of neurons in each layer: 10,000 in the first, 5,000 in the second, 500 in the third and 50 in the fourth. Performance remained dismal. We next calculated the spectrogram of each audio file, using the scipy.signal.spectrogram function, to gain a better understanding of the pitches and notes used, but this did not help either.
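
For readers who want to try the spectrogram step, here is a small sketch. Only scipy.signal.spectrogram itself comes from our project; the helper names and the window length nperseg=4096 are assumptions chosen for illustration.

```python
import numpy as np
from scipy import signal

def clip_spectrogram(samples, sample_rate, nperseg=4096):
    """Return (freqs, times, power) for a mono clip.

    power has shape (len(freqs), len(times)). nperseg is an assumed
    window length; longer windows resolve pitch more finely.
    """
    return signal.spectrogram(samples, fs=sample_rate, nperseg=nperseg)

def dominant_frequency(freqs, power):
    """Frequency bin with the most power, averaged over the whole clip."""
    mean_power = power.mean(axis=1)
    return freqs[np.argmax(mean_power)]
```

A longer window trades time resolution for frequency resolution, which seems a reasonable choice when the goal is to see which notes are played rather than exactly when.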

A spectrogram of an audio clip in raag Pahadi

Our Use of Chromagrams

We next computed a chromagram for each audio file, which records how strongly each note is present. This let us detect the fundamental note of a recording by focusing on its two most prominent notes, which we assumed to be Sa and Pa. We then transposed the frequencies so that the Sa note became C, which let us compare all of the audio files on an equal footing. Finally, we used the two major notes (the tonic and the fifth) to identify the raag. This method was unusual for us because it did not actually use machine learning of any kind. It resulted in train and dev accuracies well above 0.8, but our test accuracy was only around 0.5. You can see the average distributions for each raag below.
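
The idea of finding Sa and transposing it to C can be sketched in plain NumPy. This is a simplified illustration under several assumptions, not our actual code: the note profile is built from a single FFT of the whole clip rather than a proper chromagram, all function names are hypothetical, and the final nearest-template comparison is just one plausible way to match a clip against per-raag average distributions.

```python
import numpy as np

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def pitch_class_profile(samples, sample_rate, fmin=100.0, fmax=2000.0):
    """Fold a clip's magnitude spectrum into 12 pitch classes (index 0 = C)."""
    spectrum = np.abs(np.fft.rfft(samples * np.hanning(len(samples))))
    freqs = np.fft.rfftfreq(len(samples), d=1.0 / sample_rate)
    keep = (freqs >= fmin) & (freqs <= fmax)
    # MIDI-style note number: 69 is A4 (440 Hz), so number % 12 == 0 is C.
    midi = np.round(69 + 12 * np.log2(freqs[keep] / 440.0)).astype(int)
    profile = np.bincount(midi % 12, weights=spectrum[keep], minlength=12)
    return profile / profile.sum()

def estimate_sa(profile):
    """Guess Sa from the two strongest pitch classes, assumed to be Sa and Pa."""
    second, first = np.argsort(profile)[-2:]
    # Pa lies a perfect fifth (7 semitones) above Sa.
    if (second + 7) % 12 == first:
        return int(second)
    if (first + 7) % 12 == second:
        return int(first)
    # Fallback (an assumption): if the peaks are not a fifth apart,
    # treat the strongest pitch class as Sa.
    return int(first)

def transpose_to_c(profile, sa):
    """Rotate the profile so that Sa lands on index 0, i.e. C."""
    return np.roll(profile, -sa)

def nearest_raag(profile, templates):
    """Pick the raag whose average profile is closest (Euclidean distance).

    templates is a hypothetical dict mapping raag name to a 12-element array.
    """
    return min(templates, key=lambda name: np.linalg.norm(profile - templates[name]))
```

In this sketch, a clip whose strongest notes are D and A would be read as having Sa on D, since A is a fifth above D, and its profile would be rotated two steps so that D sits where C normally does.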

This project marked the first time we explored possibilities beyond our original project’s mission, i.e., to classify music by instrument or genre. Here, for the first time, we tried to identify the patterns in the music itself, boldly trying to expand beyond our comfort zone. It thus marked a new era for us – an era of experimentation and ambition as we grew more and more comfortable with machine learning.

