Tuesday, April 05, 2005

Matlab Art #2

Origin of the image: one frame from the evolution of a multilayer perceptron's output as it tries to adjust the different curves to straight lines. In this case the adjustment was obviously heading towards an explosion, so I stopped it at the moment shown in the image.


Anyone who has used Shazam's music tagging service might have wondered how on earth their computers can recognize audio that fast and under such extremely noisy conditions. The solution, as I found out after a fairly short internet search, is a pretty smart algorithm developed by Dr. Avery Wang. Thanks to the hints he gave in a presentation at the ISMIR 2003 conference, I was able to program the whole thing in Matlab in one evening.
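For readers curious about the core idea, here is a rough Python sketch (not my Matlab code) of the matching step as Wang describes it publicly: pairs of spectrogram peaks are turned into hashes, and a query is identified by letting its hashes vote for a track and a time offset. To keep it short, the sketch assumes the peaks have already been extracted and works on synthetic (time frame, frequency bin) lists. The names `pair_hashes`, `build_index` and `recognize`, and the `fan_out` and `max_dt` values, are made up for illustration.

```python
import random
from collections import Counter, defaultdict


def pair_hashes(peaks, fan_out=3, max_dt=10):
    """Pair each peak with a few later peaks; yield ((f1, f2, dt), t1)."""
    peaks = sorted(peaks)
    for i, (t1, f1) in enumerate(peaks):
        paired = 0
        for t2, f2 in peaks[i + 1:]:
            dt = t2 - t1
            if dt == 0:
                continue  # skip peaks in the same time frame
            if dt > max_dt or paired == fan_out:
                break
            paired += 1
            yield (f1, f2, dt), t1


def build_index(tracks):
    """Map every hash to the (track name, anchor time) pairs that produced it."""
    index = defaultdict(list)
    for name, peaks in tracks.items():
        for key, t in pair_hashes(peaks):
            index[key].append((name, t))
    return index


def recognize(index, query_peaks):
    """Vote on (track, time offset); a true match piles up on one offset."""
    votes = Counter()
    for key, t_query in pair_hashes(query_peaks):
        for name, t_track in index.get(key, ()):
            votes[(name, t_track - t_query)] += 1
    if not votes:
        return None
    (name, offset), count = votes.most_common(1)[0]
    return name, offset, count


# Synthetic "songs": one random peak per time frame (an assumption, not real audio).
rng = random.Random(1)
tracks = {name: [(t, rng.randrange(256)) for t in range(300)] for name in "abc"}
index = build_index(tracks)

# Query: 50 frames cut out of track "b", with a third of the peaks dropped
# and some spurious peaks added to mimic noise.
clip = [(t - 100, f) for t, f in tracks["b"] if 100 <= t < 150]
query = [p for p in clip if p[0] % 3 != 0]
query += [(t, rng.randrange(256)) for t in range(0, 50, 5)]

result = recognize(index, query)
print(result)
```

The query should come back as track "b" at offset 100. Random hash collisions scatter over many different offsets, while the true match keeps voting for the same one, which is what makes this kind of lookup both fast and tolerant of missing or extra peaks.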

First I ran a test with a few songs on my PC. After the program trains itself on the songs, it takes a 5-second fragment of one of them, heavily degrades it with noise and then tries to recognize it. It turned out to work just fine and, above all, very fast. Then I went for the heavier version and fed 1000 different audio tracks into the training part of the program. To my surprise, it continued to recognize the music perfectly, in many cases up to noise levels at which the human ear can no longer recognize it, which was nice.
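The degradation step in that test is easy to reproduce. Below is a small Python sketch (again not my Matlab code) that adds white Gaussian noise to a fragment at a chosen signal-to-noise ratio. The 8 kHz sample rate, the sine wave standing in for a song fragment, the -3 dB setting and the function names are all assumptions for the example. I am not claiming these are the settings used in the test above.

```python
import math
import random


def add_noise(signal, snr_db, seed=0):
    """Return the signal plus white Gaussian noise at the requested SNR in dB."""
    rng = random.Random(seed)
    signal_power = sum(x * x for x in signal) / len(signal)
    noise_power = signal_power / (10 ** (snr_db / 10))
    sigma = math.sqrt(noise_power)
    return [x + rng.gauss(0.0, sigma) for x in signal]


def measured_snr_db(clean, noisy):
    """SNR actually achieved, computed from the clean and noisy signals."""
    signal_power = sum(c * c for c in clean) / len(clean)
    noise_power = sum((n - c) ** 2 for c, n in zip(clean, noisy)) / len(clean)
    return 10 * math.log10(signal_power / noise_power)


# A 5-second, 440 Hz tone stands in for the song fragment (assumption).
sample_rate = 8000
fragment = [math.sin(2 * math.pi * 440 * t / sample_rate)
            for t in range(5 * sample_rate)]

# At -3 dB the noise carries about twice the power of the signal.
noisy = add_noise(fragment, snr_db=-3.0)
print(round(measured_snr_db(fragment, noisy), 2))
```

Lowering `snr_db` step by step and re-running the recognition on each degraded fragment is one way to find the level at which matching starts to fail.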

That said, if you want the Matlab code to play around with at home, check the GitHub repo [updated 2016-03-21].