We’ve been writing about the musical robots from Georgia Tech’s Center for Music Technology for many, many years. Over that time, Gil Weinberg’s robots have progressed from being able to dance along to music that they hear, to being able to improvise along with it, to now being able to compose, play, and sing completely original songs.
Shimon, the marimba-playing robot that has performed in places like the Kennedy Center, will be going on a new tour to promote an album that will be released on Spotify next month, featuring songs written (and sung) entirely by the robot.
Deep learning is famous for producing results that seem like they sort of make sense, but actually don’t at all. Key to Shimon’s composing ability is its semantic knowledge—the ability to make thematic connections between things, which is a step beyond just throwing some deep learning at a huge database of music composed by humans (although that’s Shimon’s starting point, a dataset of 50,000 lyrics from jazz, prog rock, and hip-hop). So rather than just training a neural network that relates specific words that tend to be found together in lyrics, Shimon can recognize more general themes and build on them to create a coherent piece of music.
Fans of Shimon may have noticed that the robot has had its head almost completely replaced. It may be tempting to say “upgraded,” since the robot now has eyes, eyebrows, and a mouth, but I’ll always have a soft spot for Shimon’s older design, which had just one sort of abstract eye thing (that functions as a mouth on the current design). Personally, I very much appreciate robots that are able to be highly expressive without resorting to anthropomorphism, but in its new career as a pop sensation, I guess having eyes and a mouth is, like, important, or something?
To find out more about Shimon’s new talents (and new face), we spoke with Georgia Tech professor Gil Weinberg and his PhD student Richard Savery.
IEEE Spectrum: What makes Shimon’s music fundamentally different from music that could have been written by a human?
Richard Savery: Shimon’s musical knowledge is drawn from training on huge datasets of lyrics, around 20,000 prog rock songs and another 20,000 jazz songs. With this level of data, Shimon is able to draw on far more sources of inspiration than a human would ever be able to. At a fundamental level, Shimon is able to take in huge amounts of new material very rapidly, so within a day it can change from focusing on jazz lyrics to hip-hop to prog rock, or a hybrid combination of them all.
How much human adjustment is involved in developing coherent melodies and lyrics with Shimon?
Savery: Just like working with a human collaborator, there are many different ways Shimon can interact. Shimon can perform a range of musical tasks, from composing a full song by itself to just playing a part composed by a human. For the new album we focused on human-robot collaboration, so every song has some elements that were created by a human and some by Shimon. Rather than adjusting Shimon’s output, we try to have a musical dialogue where we get inspired by and build on Shimon’s creations. Like any band, each of us has our own strengths and weaknesses; in our case, no one else writes lyrics, so it was natural for Shimon to take responsibility for them. As a lyricist, there are a few ways Shimon can work. First, Shimon can be given some keywords or ideas, like “earth” and “humanity,” and then generate a full song of lyrics around those words. In addition to keywords, Shimon can also take a melody and write lyrics that fit over it.
The press release mentions that Shimon is able to “decide what’s good.” What does that mean?
Savery: When Shimon writes lyrics, the first step is generating thousands of phrases. So for those keywords, Shimon will generate lots of material about “earth,” and then also generate material around related synonyms and antonyms like “world” and “ocean.” Like a human composer, Shimon has to parse through lots of ideas to choose what’s good from the creations. Shimon has preferences toward maintaining the same sentiment, or gradually shifting it, as well as trying to keep rhymes going between lines. For Shimon, good lyrics should rhyme, keep some core thematic ideas going, maintain a similar sentiment, and have some similarity to existing lyrics.
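The generate-and-filter process Savery describes can be sketched as a simple candidate selector. This is a minimal illustration, not Shimon’s actual code: the toy sentiment lexicon, the suffix-based rhyme test, and the scoring weights are all assumptions made for the example.

```python
# Hypothetical sketch of a generate-and-filter lyric selector: generate many
# candidate lines, then prefer those that rhyme with the previous line and
# stay close to it in sentiment. Lexicon and thresholds are illustrative.

# Toy word-level sentiment lexicon (word -> polarity in [-1, 1]); an assumption.
SENTIMENT = {"earth": 0.4, "world": 0.3, "ocean": 0.2, "dying": -0.8,
             "burning": -0.6, "rising": 0.5, "alone": -0.4, "home": 0.6}

def sentiment(line):
    """Mean polarity of known words; 0.0 if no words are in the lexicon."""
    scores = [SENTIMENT[w] for w in line.lower().split() if w in SENTIMENT]
    return sum(scores) / len(scores) if scores else 0.0

def rhymes(a, b, tail=3):
    """Crude rhyme test: do the final words share a trailing substring?"""
    wa, wb = a.lower().split()[-1], b.lower().split()[-1]
    return wa != wb and wa[-tail:] == wb[-tail:]

def pick_next_line(previous, candidates, max_sentiment_shift=0.5):
    """Choose the candidate that rhymes and stays closest in sentiment."""
    prev_s = sentiment(previous)
    scored = []
    for line in candidates:
        shift = abs(sentiment(line) - prev_s)
        if shift <= max_sentiment_shift:
            # Rhyming candidates sort ahead of non-rhyming ones.
            scored.append((0 if rhymes(previous, line) else 1, shift, line))
    scored.sort()
    return scored[0][2] if scored else None

previous = "the ocean keeps on rising"
candidates = ["our only world is burning", "we are not alone",
              "the fires keep on burning"]
print(pick_next_line(previous, candidates))  # -> our only world is burning
```

A real system would generate the candidate pool with a trained language model and use phonetic rhyme detection rather than string suffixes, but the select-from-thousands structure is the same.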
I would guess that Shimon’s voice could have been almost anything—why choose this particular voice?
Gil Weinberg: Since we did not have singing voice synthesis expertise in our Robotic Musicianship group at Georgia Tech, we looked to collaborate with other groups. The Music Technology Group at Pompeu Fabra University developed a remarkable deep learning-based singing voice synthesizer and was excited to collaborate. As part of the process, we sent them audio files of songs recorded by one of our students to be used as a dataset to train their neural network. In the end, we decided to use another voice that was trained on a different dataset, since we felt it better represented Shimon’s genderless personality and was a better fit for the melodic register of our songs.
Can you tell us about the changes made to Shimon’s face?
Weinberg: We are big fans of avoiding exaggerated anthropomorphism and using too many degrees of freedom in our robots. We feel that this might push robots into the uncanny valley. But after much deliberation, we decided that a singing robot should have a mouth to represent the embodiment of singing and to look believable. It was important to us, though, not to add DoFs for this purpose, but rather to replace the old eye DoF with a mouth to minimize complexity. Originally, we thought to repurpose both DoFs of the old eye (bottom eyelid and top eyelid) to represent a top lip and bottom lip. But we felt this might be too anthropomorphic, and that it would be more challenging and interesting to use only one DoF to automatically control mouth size based on the lyrics’ phonemes. For this purpose, we looked at examples as varied as parrot vocalization and Muppets animation, to learn how animals and animators go about mouth actuation. Once we were happy with what we developed, we decided to use the old top eyelid DoF as an eyebrow, to add more emotion to Shimon’s expression.
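The one-DoF approach Weinberg describes amounts to mapping each phoneme in the lyrics to a single mouth-opening command. The sketch below illustrates the idea; the phoneme set (ARPAbet-style symbols) and the opening values are assumptions for the example, not Shimon’s actual mapping.

```python
# Illustrative mapping from phonemes to a single mouth-opening DoF (0.0-1.0).
# Open vowels get a wide mouth; bilabial consonants close it. Values assumed.
MOUTH_OPENING = {
    "AA": 1.0, "AE": 0.9, "AH": 0.8, "OW": 0.7, "EH": 0.6, "IY": 0.4,
    "M": 0.0, "B": 0.0, "P": 0.0,   # bilabials: lips closed
    "S": 0.2, "T": 0.2, "N": 0.1,
}

def mouth_trajectory(phonemes, default=0.3):
    """Map a phoneme sequence to per-step mouth-opening commands."""
    return [MOUTH_OPENING.get(p, default) for p in phonemes]

# "bah": closed lips on the B, then wide open on the AA
print(mouth_trajectory(["B", "AA"]))  # -> [0.0, 1.0]
```

A production system would also interpolate between targets in time with the synthesized audio, but the phoneme-to-opening lookup is the core of driving the mouth with one actuator.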
Are you able to take advantage of any inherently robotic capabilities of Shimon?
Weinberg: One of the most important new features of the new Shimon, in addition to its singing and songwriting capabilities, is a total redesign of its striking arms. As part of the process we replaced the old solenoid-based actuators with new brushless DC motors that can support much faster striking (up to 30 hits per second) as well as a wider and more linear dynamic range, from very soft pianissimo to much louder fortissimo. This not only allows for much richer musical expression, but also supports the ability to create new, humanly impossible timbres and sonorities using the 8 novel virtuosic actuators. We hope and believe that these new abilities will push human collaborators in new, uncharted directions that could not be achieved in human-to-human collaboration.
How do you hope audiences will react to Shimon?
Weinberg: We hope both audiences and musicians will see Shimon as an expressive and creative musician, who can understand and connect to music like we humans do, but also has a strange and unique mind that can surprise and inspire us to listen to, play, and think about music in new ways.
What are you working on next?
Weinberg: We are currently working on new capabilities that would allow Shimon to listen to, understand, and respond to lyrics in real time. The first genre we are exploring for this functionality is rap battles. We plan to release a new album on Spotify April 10th featuring songs where Shimon not only sings but raps in real time as well.
[ Georgia Tech ]