Microsoft’s AI Research Draws Controversy Over Possible Disinformation Use

Microsoft’s AI could enable its popular chatbot to comment on news, but critics see a tool for spreading disinformation


AI capable of automatically posting relevant comments on news articles has raised concerns that the technology could empower online disinformation campaigns designed to influence public opinion and national elections. The AI research in question, conducted by Microsoft Research Asia and Beihang University in China, became the subject of controversy even prior to the paper’s scheduled presentation at a major AI conference this week.

The “DeepCom” AI model developed by the Microsoft and Beihang University team showed that it could effectively mimic human behavior by reading and commenting on news articles written in English and Chinese. But the original paper [PDF] uploaded to the arXiv preprint server on 26 September made no mention of ethical issues regarding possible misuse of the technology. The omission sparked a backlash that eventually prompted the research team to upload an updated paper [PDF] addressing those concerns.

“A paper by Beijing researchers presents a new machine learning technique whose main uses seem to be trolling and disinformation,” wrote Arvind Narayanan, a computer scientist at the Center for Information Technology Policy at Princeton University, in a Twitter post. “It’s been accepted for publication at EMLNP [sic], one of the top 3 venues for Natural Language Processing research. Cool Cool Cool [sic].”

The Microsoft and Beihang University paper has spurred discussion within the broader research community about whether machine learning researchers should follow stricter guidelines and more openly acknowledge the possible negative implications of certain AI applications. The paper is currently scheduled for presentation at the 2019 Conference on Empirical Methods in Natural Language Processing (EMNLP) in Hong Kong on 7 November.

Both Narayanan and David Ha, a research scientist at Google Brain, voiced their skepticism of the original paper’s suggestion that “automatic news comment generation is beneficial for real applications but has not attracted enough attention from the research community.” Ha sarcastically asked if there would be a follow-up paper about an AI model called “DeepTroll” or “DeepWumao” (“wumao” is the name for Chinese Internet commentators paid by the Chinese Communist Party to help manipulate public opinion by making online comments).


Jack Clark, a former journalist turned policy director for the OpenAI research organization, gave a more blunt rebuttal to the paper’s suggestion: “As a former journalist, I can tell you that this is a lie.”

Researchers such as Alvin Grissom II, a computer scientist at Ursinus College in Collegeville, Penn., raised questions about what types of AI research deserve to be publicized by prominent research conferences such as EMNLP. “I think there’s qualitative difference between research on fundamental problems that have the potential for misuse and applications which are specifically suited to, if not designed for, misuse,” said Grissom in a Twitter post.

The Microsoft and Beihang University researchers’ updated paper, which acknowledges some of the ethical concerns, was uploaded after Katyanna Quach reported on the controversy for The Register. The updated version also removed the original paper’s statement that “automatic news comment generation is beneficial for real applications.”

“We are aware of potential ethical issues with application of these methods to generate news commentary that is taken as human,” the researchers wrote in the updated paper’s conclusion. “We hope to stimulate discussion about best practices and controls on these methods around responsible uses of the technology.”


The updated paper’s conclusion about possible applications also specifically mentions that the team was “motivated to extend the capabilities of a popular chatbot.” That almost certainly refers to Microsoft’s China-based chatbot named Xiaoice. It has more than 660 million users worldwide and has become a virtual celebrity in China. Wei Wu, one of the coauthors on the DeepCom paper, holds the position of principal applied scientist for the Microsoft Xiaoice team at Microsoft Research Asia in Beijing. 

The Microsoft and Beihang University researchers did not provide much additional input when reached for comment. Instead, both Wu and a Microsoft representative referred to the updated version of the paper that acknowledges the ethical issues. But the Microsoft representative was unable to refer IEEE Spectrum to even one source who could speak about the company’s research review process.

“I’d like to hear from Microsoft if they had any ethical review process in place, and whether they plan to make any changes to their processes in the future in response to the concerns about this paper,” Narayanan wrote in an email to IEEE Spectrum. His prior work includes research on how AI can learn gender and racial biases from language.

Microsoft has previously staked out a position for itself as a leader in AI ethics with initiatives such as the company’s AI and Ethics in Engineering and Research (AETHER) Committee. That committee’s advice has reportedly led Microsoft to reject certain sales of its commercialized technology in the past. It’s less clear how much AETHER is involved in screening AI research collaborations prior to the AI application and commercialization stage.

Meanwhile, Narayanan and other researchers have also raised questions about the review process for accepting papers at the EMNLP conference being held in Hong Kong. “Security conferences these days require submissions to describe ethical considerations and how the authors followed ethical principles,” Narayanan wrote in a Twitter post. “Machine learning conferences should consider doing this.”

Narayanan urged conference attendees to direct questions at both the paper’s authors and the program chairs for the conference.

The organizers admit that authors and reviewers were not required to look at the ethical considerations or social impact of the technologies described in the papers that were submitted. But reviewers did flag a handful of papers based on ethical concerns, says Jing Jiang, a computer scientist at Singapore Management University in Singapore who served on the conference’s organizing committee. The committee decided that the authors of any flagged papers making the cut on technical merits would be asked to address ethical issues in a revised draft and undergo additional review.

But here’s the rub: The DeepCom paper by Microsoft and Beihang University did not get flagged by any reviewers, says Jiang. Perhaps more formal guidelines for considering ethical issues in papers submitted to future conferences are in order.

This article was updated on 5 November.
