In 2016, Microsoft’s Racist Chatbot Revealed the Dangers of Online Conversation

The bot learned language from people on Twitter—but it also learned values

Microsoft's Tay chatbot started out as a cool teenage girl, but quickly turned into a hate-speech-spewing disaster.
Photo-illustration: Gluekit

UPDATE 4 JANUARY 2024: In 2016, Microsoft’s chatbot Tay—designed to pick up its lexicon and syntax from interactions with real people posting comments on Twitter—was barraged with antisocial ideas and vulgar language. Within a few hours of it landing in bad company, it began parroting the worst of what one might encounter on social media.

But by 2022, that debacle was in the rear-view mirror, and Microsoft was basking in the global excitement over Generative Pre-trained Transformer 3, or GPT-3, the large language model on which the uber-popular chatbot ChatGPT is based. OpenAI, the company behind both and a close Microsoft partner, gave this new brainchild plenty of home training before exposing it to the world and its sometimes-unsavory elements.

Jay Wolcott, CEO of Knowbl, a startup that offers companies generative AI as a service, says, “Tay was an early version of a machine learning technique attempting to accomplish some of the generative appeal that we now see so famously through GPT. But it was on a much different scale than we’re experiencing with large language model frameworks.” LLMs such as GPT-3 and GPT-4 have been trained on billions of data points, so what’s happening today is different from what Microsoft was attempting before transformers became a thing. “[Tay] was probably more of a generative adversarial network, which is much easier to dynamically change through some of the intrusion aspects that we saw happen on Twitter,” says Wolcott. That catastrophe would be very difficult to replicate at scale today, he adds, explaining that the size of LLMs dramatically lowers the odds that the type of coordinated attack carried out against Tay would succeed.

But there are new challenges to contend with, Wolcott says. For one, “How do you control the content pieces that the LLM will and won’t respond to?” Knowbl, he says, is one of the firms that have sprung up to help companies take advantage of the power of generative AI, but with programmed guardrails that set limits on an interactive AI assistant so it doesn’t deliver responses that go against the client company’s interests.
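The guardrail idea Wolcott describes can be sketched very simply. The snippet below is a hypothetical illustration, not Knowbl's actual product: a wrapper screens each request against a blocklist of off-limits topics before any generative model is allowed to answer (the topic names and function names are invented for this example):

```python
# Hypothetical guardrail sketch: screen prompts before the model responds.
# BLOCKED_TOPICS and guarded_reply are illustrative names, not a real API.
BLOCKED_TOPICS = {"politics", "religion", "medical advice"}

REFUSAL = "I can't help with that topic, but I'm happy to answer other questions."

def guarded_reply(user_prompt, generate_fn):
    """Only call the underlying model if the prompt passes the topic filter."""
    lowered = user_prompt.lower()
    if any(topic in lowered for topic in BLOCKED_TOPICS):
        return REFUSAL  # block before the model ever sees the prompt
    return generate_fn(user_prompt)  # delegate safe prompts to the model
```

Real systems layer far more sophisticated classifiers and output filters on top of this, but the basic shape, a check that sits between the user and the model, is the same.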

Now, billions of people around the world rely on chatbots to answer questions and untangle knotty situations, both in enterprise settings and, increasingly, as individuals who want to avoid time-consuming deep dives to find what they’re looking for. Give a bot a prompt about almost anything (ask it to write a thank-you note, to help debug code, to suggest menu ideas, or to come up with talking points for a debate), and it will, with perhaps one or two follow-up questions, deliver a helpful response. College applicants are even using chatbots to help them draft their admissions essays. —IEEE Spectrum

Original article from 25 November 2019 follows:

This is part five of a six-part series on the history of natural language processing.

In March 2016, Microsoft was preparing to release its new chatbot, Tay, on Twitter. Described as an experiment in “conversational understanding,” Tay was designed to engage people in dialogue through tweets or direct messages, while emulating the style and slang of a teenage girl. She was, according to her creators, “Microsoft’s A.I. fam from the Internet that’s got zero chill.” She loved EDM, had a favorite Pokémon, and often said extremely online things, like “swagulated.”

Tay was an experiment at the intersection of machine learning, natural language processing, and social networks. While other chatbots in the past—like Joseph Weizenbaum’s Eliza—conducted conversation by following pre-programmed and narrow scripts, Tay was designed to learn more about language over time, enabling her to have conversations about any topic.


Machine learning works by developing generalizations from large amounts of data. In any given data set, the algorithm will discern patterns and then “learn” how to approximate those patterns in its own behavior.
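A toy example can make this concrete. The sketch below is an illustration of pattern learning in general, not Tay's actual architecture (which Microsoft never published): a bigram model records which word tends to follow which in a training corpus, then emulates those transitions when generating new text. All names and the tiny corpus are invented for the example:

```python
import random
from collections import defaultdict

# Illustrative bigram model (NOT Tay's real design): it "learns" word
# transitions from a corpus and imitates them when generating text.
def train(corpus):
    model = defaultdict(list)
    for sentence in corpus:
        words = sentence.split()
        for a, b in zip(words, words[1:]):
            model[a].append(b)  # record each observed transition a -> b
    return model

def generate(model, start, length=8):
    word, out = start, [start]
    for _ in range(length):
        if word not in model:
            break
        word = random.choice(model[word])  # follow a learned transition
        out.append(word)
    return " ".join(out)

corpus = ["i love chatting with humans", "i love pokemon", "humans love memes"]
model = train(corpus)
print(generate(model, "i"))  # e.g. "i love pokemon"
```

The point of the toy: the model has no notion of what the words mean, only of which patterns appear in its data. Feed it different data and it speaks differently, which is exactly what happened to Tay.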

Using this technique, engineers at Microsoft trained Tay’s algorithm on a dataset of anonymized public data along with some pre-written material provided by professional comedians to give it a basic grasp of language. The plan was to release Tay online, then let the bot discover patterns of language through its interactions, which she would emulate in subsequent conversations. Eventually, her programmers hoped, Tay would sound just like the Internet.

On March 23, 2016, Microsoft released Tay to the public on Twitter. At first, Tay engaged harmlessly with her growing number of followers with banter and lame jokes. But after only a few hours, Tay started tweeting highly offensive things, such as: “I f@#%&*# hate feminists and they should all die and burn in hell” or “Bush did 9/11 and Hitler would have done a better job…”

Within 16 hours of her release, Tay had tweeted more than 95,000 times, and a troubling percentage of her messages were abusive and offensive. Twitter users started registering their outrage, and Microsoft had little choice but to suspend the account. What the company had intended as a fun experiment in “conversational understanding” had become its very own golem, spiraling out of control through the animating force of language.

Over the next week, many reports emerged detailing precisely how a bot that was supposed to mimic the language of a teenage girl became so vile. It turned out that just a few hours after Tay was released, a post on the troll-laden bulletin board 4chan shared a link to Tay’s Twitter account and encouraged users to inundate the bot with racist, misogynistic, and anti-Semitic language.

In a coordinated effort, the trolls exploited a “repeat after me” function that had been built into Tay, whereby the bot repeated anything that was said to it on demand. But more than this, Tay’s built-in capacity to learn meant that she internalized some of the language she was taught by the trolls and repeated it unprompted. For example, one user innocently asked Tay whether Ricky Gervais was an atheist, to which she responded: “Ricky Gervais learned totalitarianism from Adolf Hitler, the inventor of atheism.”
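To see why that combination was so dangerous, consider a deliberately naive sketch of such a feature. This is hypothetical code written for illustration, not Microsoft's implementation: the handler echoes arbitrary user text verbatim and also stores it for later reuse, with no moderation check at either step.

```python
# Hypothetical, deliberately naive "repeat after me" handler.
# All names are invented; the flaw it illustrates is the absence of any
# filtering before echoing or before adding text to the learning store.
learned_phrases = []

def handle_message(text):
    trigger = "repeat after me:"
    if text.lower().startswith(trigger):
        payload = text[len(trigger):].strip()
        learned_phrases.append(payload)  # internalized with no moderation
        return payload                   # echoed back verbatim
    return "tell me more!"
```

Anything a user types after the trigger is both broadcast immediately and retained for future conversations, which is how a coordinated flood of abuse could reshape what the bot said on its own.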


The coordinated attack on Tay worked better than the 4channers expected and was discussed widely in the media in the weeks after. Some saw Tay’s failure as evidence of social media’s inherent toxicity, a place that brings out the worst in people and allows trolls to hide in anonymity.

For others, though, Tay’s behavior was evidence of poor design decisions on Microsoft’s part.

Zoë Quinn, a game developer and writer who’s been a frequent target of online abuse, argued that Microsoft should have been more cognizant of the context in which Tay was being released. If a bot learns how to speak on Twitter—a platform rife with abusive language—then naturally it will learn some abusive language. Microsoft, Quinn argued, should have planned for this contingency and ensured that Tay was not corrupted so easily. “It’s 2016,” she tweeted. “If you’re not asking yourself ‘how could this be used to hurt someone’ in your design/engineering process, you’ve failed.”

Some months after taking Tay down, Microsoft released Zo, a “politically correct” version of the original bot. Zo, who was active on social networks from 2016 to 2019, was designed to shut down conversations about certain contentious topics, including politics and religion, to ensure she didn’t offend people. (If a correspondent kept pressing her to talk about a certain sensitive topic, she left the conversation altogether, with a sentence like: “im better than u bye.”)

The lesson Microsoft learned the hard way is that designing computational systems that can communicate with people online is not just a technical problem, but a deeply social endeavor. Inviting a bot into the value-laden world of language requires thinking, in advance, about what context it will be deployed in, what type of communicator you want it to be, and what type of human values you want it to reflect.

As we move towards an online world in which bots are more prevalent, these questions must be at the forefront of the design process. Otherwise there will be more golems released into the world that will reflect back to us, in language, the worst parts of ourselves.

This is the fifth installment of a six-part series on the history of natural language processing. Last week’s post described people’s weird intimacy with a rudimentary chatbot created in 1966. Come back next Monday for part six, which tells of the controversy surrounding OpenAI’s magnificent language generator, GPT-2.

You can also check out our prior series on the untold history of AI.
