Hello! It has been quite a while, hasn’t it? Apologies for the long absence. It started out as a combination of exam stress and Muneeb’s leaving for college, and then simply stretched on. Now, however, things are looking up somewhat. I’ll try to keep things updated more regularly from here on – we are officially back.
After this long pause, it will be nostalgic to look back at a new instrument, just like we have before – so behold, the chimta!
Wait a second, this looks familiar … no. Just no. Please. We’re still suffering from the effects of the matka, why this now! Just ignore us. We’ll be hyperventilating somewhere in the corner here. Focus instead on this magnificent, elaborately constructed, musically harmonious – pair of tongs.
If you’ve managed to get over your perpetual phobia of cooking implements reused as musical instruments (you don’t have one? How strange), you may be interested to know that the chimta is a traditional Punjabi folk instrument, which indeed has its origins in the humble tongs. Starting out as a cooking implement, it evolved over time into an instrument in its own right with the addition of small bells to its sides, much like the far better-known tambourine. Often used as an accompaniment to folk and religious music, it is essential to keeping the beat with its rhythmic shaking. That said, much like the tanpura, it is normally used to accompany singing (as done by the inimitable Arif Lohar) or other instruments such as the dhol – never as a solo instrument.
Arif Lohar’s father, Alam Lohar, singing while accompanied by the chimta
The chimta as of now remains alive and well, much like its compatriot and frequent associate, the dhol. That said, with the encroaching loss of many folk traditions and music styles, it is doubtless in peril over the long term as well. However, at the very least we can reassure ourselves that it will never go extinct in one place – the kitchen!
So far we have covered many instruments, from the exalted sitar to the humble matka (the jury is still out on whether this is actually an instrument or not). However, we have never covered the national instrument of Pakistan, have we? So what will it be? The sitar? The tabla? The sarangi? We have no idea! Well, actually, you do, because you read the title, but play along for now. So as we turn with eager eyes to that fount of all knowledge, the Wikipedia page, we find it to be … the daf.
What? What even is the daf? A sort of hat? A flower? A particularly repugnant type of worm? Well, as it turns out, none of the above! The daf is actually a frame drum, somewhat resembling the tambourine. It is an instrument of many names, even accounting for the limitations of romanization: its many aliases include the riq, the dayere and the gaval. It is also surprisingly widespread, ranging from the frigid North Pole to equatorial India. That said, its design remains constant throughout – it nearly always consists of some sort of skin or synthetic material stretched over a circular frame, with jingles attached beneath it. Compared to some of the variations the sitar (*cough cough* surbahar) can be subject to, the daf is almost suspiciously unchanged throughout its startlingly wide range – but then again, it is difficult to change much with such a basic design.
This simple instrument, however, hides a long and storied history. Emerging from Iran, it was documented by at least the fifth to sixth century BC, but most probably dates to some time before then. Long associated with spiritual music as well as traditional festivities, the daf was catapulted to fame when it was claimed to be the only musical instrument not forbidden by Islam, giving it an even more important place in Sufi music. In spite of that, the daf is also, somewhat paradoxically, often associated with old Bollywood songs, like the classics “Dil ka haal sune dil-waala” and “Daffli waale, Daffli baja”.
The daf thus continues to survive to the present day, much as it has in the past, weathering invasions, cultural clashes and much more. Its steady beat continues to resound throughout the Subcontinent and beyond – as we hope it will in the future.
Does the steady, rhythmic masitkhani baj of the Senia gharana call to you? Or are the singing sitarists of the Indore gharana your favourite? How about the lively playing of the Kapurthala rababis? Or the meend-using Poonch sitarists? How about the khayal-performing Delhi sitar players? We know it’s hard to choose from such a variety of options, but try to settle on your favourite and let us know!
Buoyed up by our victory with the instrument classifier, we next attempted to create a model which could classify the elusive raag. The purpose of this project was to classify audio clips into four commonly used raags: pahadi, darbari, yaman and bhairavi. We trained it exclusively on sitar recordings, the sitar being the most widely used classical instrument nowadays, but hoped to eventually expand it so that it would work for every instrument. We started off by using the YAMNet model to help. This did not work, perhaps because that model was built to distinguish between many different kinds of sound, not to detect patterns within music. We then tried a variety of other approaches, such as making our neural network larger and more complex, or using MFCCs (features which capture how the human ear perceives sound), but they did not work either.
A More Technical Perspective
We downloaded forty videos from YouTube for our database, ten for each raag; the videos were primarily composed of sitar music and varied greatly in length. We then cut each of them into twenty 15-second audio files, resulting in a grand total of 800 fifteen-second clips.
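As a rough illustration of this step, here is a minimal sketch of how each recording might be cut into clips. The post does not say which tools were used, so the use of pydub, the file layout and the naming scheme here are all assumptions.

```python
# Hypothetical sketch: cut one source recording into twenty 15-second clips.
import os
from pydub import AudioSegment  # assumed dependency for audio slicing

CLIP_MS = 15_000        # 15 seconds per clip, in milliseconds
CLIPS_PER_VIDEO = 20    # twenty clips taken from each source video

def cut_clips(wav_path, out_dir, raag_label):
    audio = AudioSegment.from_wav(wav_path)
    os.makedirs(out_dir, exist_ok=True)
    for i in range(CLIPS_PER_VIDEO):
        start = i * CLIP_MS
        end = start + CLIP_MS
        if end > len(audio):          # stop early if the recording runs out
            break
        clip = audio[start:end]       # each clip covers a different segment
        clip.export(os.path.join(out_dir, f"{raag_label}_{i:03d}.wav"),
                    format="wav")
```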
The YAMNet Raag Attempt
We modified our earlier 11-genre YAMNet classifier code to work on this problem, confident that its earlier success would translate to this field as well. What we did not realize then was that this was an entirely different problem from the one the YAMNet model was meant for: identifying a raag means detecting the recurring frequencies and patterns within a file, not sorting it into broad sound categories, and so transfer learning did not help in this case. We got a dismal dev accuracy of around 0.35, necessitating a fresh attempt at the problem.
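For context, this is roughly what such a transfer-learning setup looks like, assuming the standard TensorFlow Hub release of YAMNet; the pooling of the embeddings and the classifier’s layer sizes here are assumptions, not our exact configuration.

```python
# Sketch: extract YAMNet embeddings per clip, then train a small classifier.
import librosa
import tensorflow as tf
import tensorflow_hub as hub

yamnet = hub.load("https://tfhub.dev/google/yamnet/1")
RAAGS = ["pahadi", "darbari", "yaman", "bhairavi"]

def yamnet_embedding(path):
    # YAMNet expects mono float32 audio sampled at 16 kHz
    wav, _ = librosa.load(path, sr=16000, mono=True)
    _, embeddings, _ = yamnet(wav)              # (patches, 1024)
    return tf.reduce_mean(embeddings, axis=0)   # average over time patches

classifier = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(1024,)),
    tf.keras.layers.Dense(512, activation="relu"),
    tf.keras.layers.Dense(len(RAAGS), activation="softmax"),
])
classifier.compile(optimizer="adam",
                   loss="sparse_categorical_crossentropy",
                   metrics=["accuracy"])
# classifier.fit(X_train, y_train, validation_data=(X_dev, y_dev), epochs=30)
```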
The MFCC Attempt
We next tried to confront the task with MFCCs, as we had done for our earlier problems. We hoped this would result in better performance, as MFCCs are consistently used by many other programmers to identify raags, but no noticeable difference was observed.
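For the curious, this is roughly what extracting MFCCs from a clip looks like; the number of coefficients and the averaging over time are assumptions made for this sketch.

```python
# Sketch: one MFCC feature vector per 15-second clip.
import numpy as np
import librosa

def mfcc_features(path, n_mfcc=40):
    y, sr = librosa.load(path, sr=22050, mono=True)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)  # (n_mfcc, frames)
    return np.mean(mfcc, axis=1)   # average over time to get one vector
```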
Our Efforts to Decrease the Bias
We correctly observed that this was due to exceedingly high avoidable bias, which we sought to reduce by increasing the complexity and density of the network and by procuring more features. Neither effort bore any fruit. We increased the neural network to four hidden layers and raised the number of neurons in each layer considerably, with the first layer having 10000, the second 5000, the third 500 and the fourth 50; performance remained dismal. We next calculated the spectrogram of each audio file to gain a better understanding of the pitches and notes used, using the scipy.signal.spectrogram function, but this did not help either (a sketch of this stage follows the figure below).
A spectrogram of an audio clip in raag Pahadi
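For reference, here is a rough sketch of this stage: the four-hidden-layer network with the sizes mentioned above, together with the scipy.signal.spectrogram call. The input features and all training details here are assumptions, not our exact setup.

```python
# Sketch: the denser network (10000/5000/500/50 hidden units) and spectrograms.
import tensorflow as tf
from scipy.io import wavfile
from scipy.signal import spectrogram

def clip_spectrogram(path):
    sr, samples = wavfile.read(path)
    if samples.ndim > 1:                    # mix stereo down to mono
        samples = samples.mean(axis=1)
    freqs, times, sxx = spectrogram(samples, fs=sr)
    return freqs, times, sxx

def dense_raag_model(n_features, n_raags=4):
    return tf.keras.Sequential([
        tf.keras.layers.Input(shape=(n_features,)),
        tf.keras.layers.Dense(10000, activation="relu"),
        tf.keras.layers.Dense(5000, activation="relu"),
        tf.keras.layers.Dense(500, activation="relu"),
        tf.keras.layers.Dense(50, activation="relu"),
        tf.keras.layers.Dense(n_raags, activation="softmax"),
    ])
```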
Our Use of Chromagrams
We next used chromagrams to chronicle the notes present in each audio file. This let us detect the fundamental note of each clip by focusing on the two most prominent notes, which we assumed to be Sa and Pa. We then transposed the frequencies so that Sa became C, letting us compare all of the audio files on an equal footing, and used the two major notes (the tonic and the fifth) to identify the raag. This method was unusual for us because it did not actually use machine learning of any kind. It resulted in train and dev accuracies well above 0.8, but our test accuracy came out to around 0.5. You can see the average distributions for each raag below.
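A rough sketch of this chroma-based pipeline, using librosa’s chroma features: the way the final decision is made here (nearest average profile by Euclidean distance) is an assumption about the matching step, not a statement of exactly how we did it.

```python
# Sketch: estimate Sa from the two strongest pitch classes, transpose to C,
# then match the transposed note distribution against per-raag averages.
import numpy as np
import librosa

def pitch_class_profile(path):
    y, sr = librosa.load(path, sr=22050, mono=True)
    chroma = librosa.feature.chroma_stft(y=y, sr=sr)  # (12, frames)
    profile = chroma.mean(axis=1)
    return profile / profile.sum()

def transpose_to_c(profile):
    # Assume the two most prominent pitch classes are Sa and Pa (a fifth apart);
    # Sa is the one whose fifth is also prominent.
    top_two = np.argsort(profile)[-2:]
    for candidate in top_two:
        if (candidate + 7) % 12 in top_two:
            sa = candidate
            break
    else:
        sa = top_two[-1]              # fall back to the single strongest note
    return np.roll(profile, -sa)      # rotate so Sa sits at index 0 (C)

def classify(profile, raag_profiles):
    # raag_profiles: dict of raag name -> average transposed profile
    return min(raag_profiles,
               key=lambda name: np.linalg.norm(profile - raag_profiles[name]))
```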
This project marked the first time we explored possibilities beyond our original project’s mission, i.e., to classify music by instrument or genre. Here, for the first time, we tried to identify the patterns in the music itself, boldly trying to expand beyond our comfort zone. It thus marked a new era for us – an era of experimentation and ambition as we grew more and more comfortable with machine learning.
This model classifies audio into seven instrumental categories: the sitar, the bansuri, the sarangi, the sarod, the tabla, the harmonium and Western music (a catch-all category for any piece of music which does not fit into the previous ones). This was the first of our projects which could truly be called ours – we were the ones who wrote its code, as well as the ones who proposed and then decided on its purpose. Like the genre classifier, its current version trains on embeddings provided by the YAMNet model – meaning that we let that much more advanced model process the data before feeding it to our own model.
A More Technical Perspective
Our database consists of around 1070 files, each 15 seconds long, harvested from YouTube. Each file features one prominent instrument being played, with no other accompaniment. However, while cutting the files the first time, we mistakenly cut them in such a way that each file consisted of the first fifteen seconds of its source clip rather than a different segment of it. We did not realize this fatal mistake until afterwards, and it contributed greatly to the initial low performance of our model.
MFCC Instrument Classification
First of all, we extracted forty MFCC features from each file, which we then used to train our model. The model had only one hidden layer of 100 neurons, as increasing the network’s density showed no observable advantage. It reported a training accuracy of 1 and a dev set accuracy of 1, but a test accuracy of only 0.58 due to severe overfitting. One can see the breakdown of its performance below, and a sketch of this setup after the table.
                        Total  Sarangi  Sarod  Sitar  Bansuri  Tabla  Harmonium
Sarangi                    20       10      1      1        1      3          4
Sarod                      20        4     10      0        2      2          2
Sitar                      22        0      0      9        7      5          1
Bansuri                    18        3      2      0        6      7          0
Tabla with Harmonium       20        0      0      2        0     17          1
Tabla with Sarangi         17        0      0      0        0     16          1
Harmonium                  19        0      0      1        0      7         11
In the table above, the labels on the left show the actual category of the clips, while the labels along the top show how many of them our model assigned to each category.
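As promised, here is a minimal sketch of the setup described above: forty averaged MFCCs per clip feeding a single 100-neuron hidden layer. The optimizer, epoch count and other training details are assumptions, and the Western music category was only added later, to the YAMNet version.

```python
# Sketch: MFCC-based instrument classifier (one hidden layer of 100 neurons).
import numpy as np
import librosa
import tensorflow as tf

CLASSES = ["sarangi", "sarod", "sitar", "bansuri", "tabla", "harmonium"]

def mfcc_vector(path):
    y, sr = librosa.load(path, sr=22050, mono=True)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=40)  # forty coefficients
    return np.mean(mfcc, axis=1)                        # averaged over time

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(40,)),
    tf.keras.layers.Dense(100, activation="relu"),      # single hidden layer
    tf.keras.layers.Dense(len(CLASSES), activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
# model.fit(X_train, y_train, validation_data=(X_dev, y_dev), epochs=50)
```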
YAMNet Instrument Classification
We then used the YAMNet model instead, as it had worked quite well on the earlier genre-classification problem. We used the same training database again, and modified our YAMNet classification code for this purpose. We once more got a training and dev accuracy of 1, but this time a test accuracy of 0.88. The model again had only one hidden layer, this time of 512 neurons, but it trained on the embeddings extracted by the YAMNet model. You can see a breakdown of its performance, and how it compares to that of the MFCC-based model, below.
A comparison of the two models’ results, showing the increased accuracy of the YAMNet model
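For reference, a sketch of the YAMNet-based setup described above: each clip represented by its mean-pooled YAMNet embedding and classified by a single 512-neuron hidden layer. The pooling and training settings here are assumptions.

```python
# Sketch: instrument classifier trained on mean-pooled YAMNet embeddings.
import librosa
import tensorflow as tf
import tensorflow_hub as hub

yamnet = hub.load("https://tfhub.dev/google/yamnet/1")

def clip_embedding(path):
    wav, _ = librosa.load(path, sr=16000, mono=True)  # YAMNet wants 16 kHz mono
    _, embeddings, _ = yamnet(wav)                    # (patches, 1024)
    return tf.reduce_mean(embeddings, axis=0)         # one 1024-value vector

def instrument_model(n_classes=7):
    return tf.keras.Sequential([
        tf.keras.layers.Input(shape=(1024,)),
        tf.keras.layers.Dense(512, activation="relu"),   # single hidden layer
        tf.keras.layers.Dense(n_classes, activation="softmax"),
    ])
```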
We then added the category of Western music, which consisted only of violin and piano solos. The model reported an accuracy of 0.9 after this, alongside an accuracy of 1 for both the piano and tabla test cases. This was rather surprising to us, as after our experience of adding classes to our genre-identifying model we had expected the addition of another class to decrease the overall accuracy considerably. Instead, it remained mostly the same, contradicting our expectations. This is a prime example of how we truly learned machine learning throughout our journey – by actually experimenting to test our assumptions and, more often than not, making mistakes and learning from them.
Save the Sitar is a website dedicated to promoting and preserving Pakistan’s classical music. Join our growing community to help further our cause!