11 July 2008--We all occasionally catch ourselves humming a tune, or singing along to the radio. What separates most of us from real musicians is the knowledge and skill to turn a hummed melody into a complete song. Three researchers in Washington state, however, aim to bridge at least some of that gap.
They've created a program called MySong that can generate a chord progression to fit any vocal melody. You simply sing into a computer microphone to the beat of a digitized metronome, and MySong comes up with an accompaniment of chords that sounds good with it. “Lots of songs have only three chords,” says Sumit Basu of Microsoft Research, a cocreator of MySong. “If you have the melody, it seems like you ought to be able to predict what the chords are.” Basu and his collaborators--Dan Morris, a Microsoft Research colleague, and Ian Simon, a Ph.D. student at the University of Washington, Seattle--will show off some of their program's features at The Twenty-Third AAAI Conference on Artificial Intelligence next week.
For any given melody, there's no such thing as a “correct” chord progression. But we tend to like songs with patterns we're used to hearing. When a musician begins fitting chords to a melody, the choices are guided by a lifetime of listening to other songs and figuring out why they sound good. MySong's creators gave their program a similar musical education by assembling a library of nearly 300 songs in the form of lead sheets, sheet music where written-out chords--such as C major, A minor, G major--accompany a single melody line. (For examples of lead sheets, check out Wikiphonia, whose songs form the basis of MySong's library.)
By analyzing this library, MySong creates probability tables that capture two factors. For a given chord, which chords are most likely to precede or follow it? And for each chord, which melody notes are likely to appear with it? These probability tables essentially give MySong a music theory derived mathematically from pop music itself.
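The two tables amount to the transition and emission probabilities of a hidden-Markov-style model. A minimal sketch of how such tables could be built from lead-sheet data follows; the songs, chord names, and per-measure note lists here are made up for illustration, not taken from MySong's actual library or code:

```python
from collections import Counter, defaultdict

# Hypothetical toy lead sheets (MySong trained on nearly 300 real ones).
# Each song is a list of (chord, melody_notes) pairs, one pair per measure.
songs = [
    [("C", ["C", "E", "G"]), ("F", ["F", "A"]), ("G", ["G", "B", "D"]), ("C", ["E", "C"])],
    [("C", ["E", "G"]), ("G", ["D", "B"]), ("C", ["C"])],
]

transition_counts = defaultdict(Counter)  # counts of chord -> next chord
emission_counts = defaultdict(Counter)    # counts of chord -> melody note

for song in songs:
    # Which chords follow which: count each adjacent chord pair.
    for (chord, _), (next_chord, _) in zip(song, song[1:]):
        transition_counts[chord][next_chord] += 1
    # Which melody notes appear over each chord.
    for chord, notes in song:
        for note in notes:
            emission_counts[chord][note] += 1

def normalize(counter):
    """Turn raw counts into a probability distribution."""
    total = sum(counter.values())
    return {k: v / total for k, v in counter.items()}

transition_probs = {c: normalize(n) for c, n in transition_counts.items()}
emission_probs = {c: normalize(n) for c, n in emission_counts.items()}
```

With real lead sheets in place of the toy data, `transition_probs` answers "which chord comes next?" and `emission_probs` answers "which notes fit this chord?"--the two questions the article describes.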
After a user records a vocal track, MySong provides two ways to alter the generated chord progression, both in the form of slider bars. One slider adjusts how much weight each of the two probabilities is given: whether a chord should best match the vocal melody notes or best fit in with the chords that surround it. The second slider lets the user blend between major (happy-sounding) and minor (sad-sounding) keys.
“Typically in other machine-learning approaches, the blending would be fixed by whoever develops it,” says Morris. “Rather than fixing it in our code, we put that on a slider and exposed it to the user as an extra creative degree of freedom.” This idea of exposing the underlying model's hidden mathematics to amateur users is one of the topics the researchers will discuss at the AAAI conference. “With just a few clicks you can actually do a lot of manipulation and get a lot of different feels to accompany what you're doing,” adds Basu.
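The blending slider Morris describes can be pictured as a single weight between the two probabilities. The sketch below is an illustrative guess at that mechanism, not MySong's code: the probability tables, candidate chords, and scoring function are all invented for the example, and real systems would score whole progressions (e.g. with a Viterbi search) rather than one measure greedily:

```python
import math

# Hypothetical probability tables of the kind described in the article.
transition_probs = {"C": {"F": 0.5, "G": 0.5}, "F": {"G": 1.0}, "G": {"C": 1.0}}
emission_probs = {
    "C": {"C": 0.4, "E": 0.4, "G": 0.2},
    "F": {"F": 0.5, "A": 0.5},
    "G": {"G": 0.4, "B": 0.4, "D": 0.2},
}

def score(prev_chord, chord, melody_notes, weight, floor=1e-6):
    """Blend the two log-probabilities; `weight` is the slider position.

    weight = 0 trusts only chord-to-chord transitions;
    weight = 1 trusts only how well the chord fits the sung notes.
    """
    trans = transition_probs.get(prev_chord, {}).get(chord, floor)
    emit = 1.0
    for note in melody_notes:
        emit *= emission_probs.get(chord, {}).get(note, floor)
    return (1 - weight) * math.log(trans) + weight * math.log(max(emit, floor))

def best_chord(prev_chord, melody_notes, weight):
    """Greedily pick the highest-scoring chord for one measure."""
    candidates = ["C", "F", "G"]
    return max(candidates, key=lambda c: score(prev_chord, c, melody_notes, weight))
```

Dragging `weight` from 0 to 1 changes which chord wins, which is exactly the "extra creative degree of freedom" the slider exposes.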
The technology behind MySong goes beyond simple modeling of song structure: before it can do anything else, it first has to make sense of the imprecise acoustic sounds we call singing. “The voice is a real challenge,” says Basu, who notes that most people, including trained singers, use some vibrato, where the voice vibrates above and below the intended note. Although computers can accurately track pitch, it's difficult for them to determine the exact notes the singer meant to sing.
MySong's team realized that they could work around the complexities of the voice by requiring the user to pick a tempo and sing along with a digitized metronome so that the melody maps to uniform units of time. MySong takes frequency samples 100 times a second, and, rather than trying to accurately assemble a melody from those frequencies, it keeps track of how often each frequency appears during a user-defined number of beats. This quantity-over-quality approach means that the chord selected for a measure best fits those notes hit most often or held the longest.
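The histogram idea can be sketched in a few lines. This toy version (not MySong's actual code) assumes the raw frequency samples have already been mapped to pitch-class names, and scores a handful of hypothetical candidate chords by how much of the measure's histogram mass lands on their chord tones:

```python
from collections import Counter

# Illustrative chord-tone sets for a few candidate chords.
CHORD_TONES = {
    "C":  {"C", "E", "G"},
    "F":  {"F", "A", "C"},
    "G":  {"G", "B", "D"},
    "Am": {"A", "C", "E"},
}

def chord_for_measure(pitch_samples):
    """Pick the chord whose tones cover the most histogram mass.

    pitch_samples: a list of pitch-class names, e.g. one sample
    every 10 ms of singing, so long-held notes dominate the tally.
    """
    histogram = Counter(pitch_samples)
    total = sum(histogram.values())

    def coverage(chord):
        return sum(histogram[pc] for pc in CHORD_TONES[chord]) / total

    return max(CHORD_TONES, key=coverage)

# One second of a measure where the singer mostly holds E and C,
# with a stray passing D:
samples = ["E"] * 40 + ["C"] * 35 + ["G"] * 20 + ["D"] * 5
```

Because the stray D contributes only 5 of 100 samples, it barely affects the tally--which is the point of favoring quantity over note-by-note accuracy.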
MySong's novelty, says Christopher Raphael, an associate professor of informatics at Indiana University, is in this ability to avoid technical problems rather than solve them. He also sees promise in the program's ability to engage novices, a sentiment echoed by Gil Weinberg, director of music technology at Georgia Tech. “I believe everyone has the ability to express themselves by singing songs or banging on something,” says Weinberg. “What's nice about the voice is that you don't even need an object to bang on.” Weinberg says he hopes that MySong will provide a gateway for further learning. “Many students just don't get to the expressive and creative portion of music, because there's so much technique and theory in the beginning,” he says.
Basu, Morris, and Simon may write software by day, but they each spend much of their free time writing and performing music. “We're kind of casual, amateur musicians who love making music,” says Morris. In one test of MySong's abilities, they pitted some of the chord progressions they wrote for their own songs against those chosen by the software. A blind jury of experienced musicians consistently rated MySong's chord progressions nearly as high as it rated those chosen by human musicians.
For Basu, the success was a bittersweet triumph. “It made me feel proud for MySong, but these were the chords I had slaved over,” he laughs. “But when I listened to what MySong had chosen, it was often more interesting than what I had done.”
Since developing MySong, Basu has used the program to assist him in the early stages of songwriting. “Dan [Morris] always jokes that I'm the one user of MySong,” says Basu. “There are progressions I use now in my music that I learned from MySong.”
As a user and creator, Basu emphasizes that MySong is meant for creativity assistance rather than creativity replacement. “The creative spark still has to come from people,” he says, “and one of the things that makes me feel better as a musician is that there's more to music than just the chords you choose.”
Microsoft has not yet decided whether to commercialize MySong, but the team hopes to improve its core modeling algorithms, as well as give users more control over which song libraries the program relies on. Basu imagines users selecting chords based on libraries of specific musicians: a slider, for instance, that blends chord-progression styles between the jazz singer Ella Fitzgerald and the metal band Slayer.
To Probe Further
Listen to audio samples created by MySong users and judge the chords for yourself at https://research.microsoft.com/~dan/mysong/