Computers with an Ear for Music

System classifies music using a mix of human and artificial intelligence

If you wanted to find every jazz tune on the Internet that paired a harmonica with a saxophone, you could enter keywords for the instruments, but your results would be meager. That’s because verbal tagging for music hardly exists.

Now comes a new method that promises to use machines, trained by human beings, to classify the vast trove of music and video that’s in cyberspace. It’s an approach midway between the two methods that recommender systems have relied on so far.

The first method, called collaborative filtering, uses an algorithm to infer your taste in music, movies, or books from your past choices, then suggests new materials enjoyed by other people with similar taste. Improving such algorithms was the point behind the Netflix prize competition, which IEEE Spectrum covered in 2009. However, collaborative filtering’s no good at categorizing things that aren’t already popular.
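To make the idea concrete, here is a minimal sketch of user-based collaborative filtering in Python. It is an illustration only, not the algorithm used by Netflix or any other service: the listener names, song names, ratings, and the choice of cosine similarity are all assumptions made for this example.

```python
from math import sqrt

def cosine(a, b):
    """Cosine similarity between two sparse {item: rating} dicts."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[i] * b[i] for i in shared)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def recommend(target, ratings, top_n=3):
    """Rank items the target has not rated, weighting each other
    listener's ratings by how similar that listener is to the target."""
    mine = ratings[target]
    scores = {}
    for user, items in ratings.items():
        if user == target:
            continue
        sim = cosine(mine, items)
        if sim <= 0:
            continue  # no overlap in taste, so this listener is ignored
        for item, rating in items.items():
            if item not in mine:
                scores[item] = scores.get(item, 0.0) + sim * rating
    return sorted(scores, key=scores.get, reverse=True)[:top_n]

# Hypothetical toy data: three listeners, four songs.
ratings = {
    "ana": {"song_a": 5, "song_b": 4},
    "ben": {"song_a": 5, "song_b": 5, "song_c": 4},
    "cy": {"song_d": 5},
}

suggestions = recommend("ana", ratings)
```

In this toy data, "song_d" is never suggested to "ana" because the only listener who rated it shares no history with her, and a song that nobody has rated cannot be suggested at all. That is the weakness with unpopular material described above.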

The second method uses human experts to classify things, as Pandora has done with its Music Genome Project. The experts actually listen to each song, then fit it into a number of categories so that customers who name a song they like can be presented with a selection of similar music. But though Pandora’s musicians have thus categorized nearly a million songs in the past dozen years, there’s no way they can handle the 60 hours of multimedia that’s uploaded to YouTube every minute.

The new approach is something of a hybrid of man and machine. It comes from engineers Gert Lanckriet and Luke Barrington at the University of California at San Diego, and Douglas Turnbull at Ithaca College, in New York, who describe it in the 24 April issue of the Proceedings of the National Academy of Sciences.

The researchers got tagging information from human participants in the music annotation game “Herd It” and fed it into a machine-learning program, which then sought out additional information from the human informants. The program thus refines the model through several iterations until it can reliably mimic the humans. At that point the machine can go to work classifying music. The researchers say their method can be scaled up easily, which means that it should be able to search the Internet’s enormous inventory of music and video.
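The loop described above, in which a model asks people about the examples it finds hardest and then retrains, resembles what machine-learning practitioners call active learning. The Python sketch below shows one common variant, uncertainty sampling, under loud assumptions: the two-number song features, the nearest-centroid model, and the `ask_human` stand-in are all hypothetical, and none of it is taken from the researchers' paper.

```python
from math import dist

def centroid(points):
    """Mean of a list of equal-length feature tuples."""
    return tuple(sum(c) / len(points) for c in zip(*points))

def fit(labeled):
    """Build one centroid per tag value (True/False).
    Assumes the labeled set contains at least one song of each kind."""
    groups = {True: [], False: []}
    for features, has_tag in labeled:
        groups[has_tag].append(features)
    return {tag: centroid(pts) for tag, pts in groups.items()}

def margin(model, features):
    """A small margin means the song sits about equally far from both
    centroids, so the model is unsure about it."""
    return abs(dist(features, model[True]) - dist(features, model[False]))

def predict(model, features):
    """True if the song is closer to the 'has tag' centroid."""
    return dist(features, model[True]) < dist(features, model[False])

def active_learning(seed, pool, ask_human, rounds):
    """Repeatedly ask a human about the least certain unlabeled song,
    add the answer to the training set, and refit."""
    labeled, pool = list(seed), list(pool)
    for _ in range(rounds):
        if not pool:
            break
        model = fit(labeled)
        query = min(pool, key=lambda f: margin(model, f))
        pool.remove(query)
        labeled.append((query, ask_human(query)))
    return fit(labeled), labeled

# Hypothetical stand-in for a human annotator: the tag applies
# whenever the first feature exceeds 0.5.
def ask_human(features):
    return features[0] > 0.5

seed = [((0.9, 0.1), True), ((0.1, 0.9), False)]
pool = [(0.6, 0.5), (0.4, 0.5), (0.8, 0.2), (0.2, 0.8)]

model, labeled = active_learning(seed, pool, ask_human, rounds=2)
```

With this toy data the two questions go to the borderline songs, `(0.6, 0.5)` and `(0.4, 0.5)`, rather than to the two songs the model can already place with confidence. Spending human effort only where the machine is unsure is what lets such a loop scale.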

Such man-machine collaborations seem to be taking over many facets of artificial intelligence. Call it artificially enhanced intelligence—or naturally enhanced AI. In so-called Advanced Chess, for instance, human/machine combos known as “centaurs” regularly outplay the best human chess masters—and the best chess machines.

“This pattern is true not only in chess, but throughout the economy,” wrote MIT economists Erik Brynjolfsson and Andrew McAfee in the Sloan Management Review, in December. “In medicine, law, finance, retailing, manufacturing and even scientific discovery, the key to winning the race is not to race against machines, but to win using machines.”

Of course, nobody’s talking about using machines—alone or with coaching—to judge the artistic value of a piece of music. Yet.

Photo: Katie Wardrobe
