How Computers Can Finally Detect Sarcasm


Hi and welcome to Fixing the Future, IEEE Spectrum’s podcast series on the technologies that can set us on the right path toward sustainability, meaningful work, and a healthy economy for all. Fixing the Future is sponsored by COMSOL, makers of COMSOL Multiphysics simulation software. I’m Steven Cherry.

Leonard: Hey, Penny. How’s work?
Penny: Great! I hope I’m a waitress at the Cheesecake Factory for my whole life!
Sheldon: Was that sarcasm?
Penny: No.
Sheldon: Was that sarcasm?
Penny: Yes.

Steven Cherry That’s Leonard, Penny, and Sheldon from season two of The Big Bang Theory. Fans of the show know there’s some question of whether Sheldon understands sarcasm. In some episodes he does, and in others he’s just learning it. But there’s no question that computers don’t understand sarcasm, or didn’t until some researchers at the University of Central Florida started them on a path to learning it. Software engineers have been working on various flavors of sentiment analysis for quite some time. Back in 2005, I wrote an article in Spectrum about call centers automatically scanning conversations for anger, either by the caller or the service operator. That was one of the early use cases behind messages like “This call may be monitored for quality assurance purposes.” Since then, software has been getting better and better at detecting joy, fear, sadness, confidence, and now, finally, sarcasm. My guest today, Ramya Akula, is a PhD student and a graduate research assistant at the University of Central Florida’s Complex Adaptive Systems Lab. She has at least 11 publications to her name, including the most recent, “Interpretable Multi-Head Self-Attention Architecture for Sarcasm Detection in Social Media,” published in March in the journal Entropy with her advisor, Ivan Garibay. Ramya, welcome to the podcast.

Ramya Akula Thank you. It’s my pleasure to be here.

Steven Cherry Ramya, maybe you could tell us a little bit about how sentiment analysis works for things like anger and sadness and joy. And then, what’s different and harder about sarcasm?

Ramya Akula So in general, understanding the sentiment behind people’s emotions, across a variety of emotions, has always been hard. To some extent, when you are in a face-to-face conversation, all the visual cues and bodily gestures help the conversation. But when we do not know who is sitting behind the computer or the mobile phone, it’s always hard. That applies to all kinds of sentiments, including anger, humor, and sarcasm as well. So that’s the starting point of this research.

Steven Cherry And what makes sarcasm harder than some of the others?

Ramya Akula So sometimes sarcasm can be humor, but it can also hurt people really badly. There is also how people interpret it, because people come from different cultures and different backgrounds. In some cultures, something might be okay, but in another it is not. So these different cultures and backgrounds, and also the colloquialisms and the slang people use, are some of the challenges that we face in everyday conversations, especially with sarcasm detection.

Steven Cherry Computers have been writing news and sports stories for some time now, taking a bunch of facts and turning them into simple narratives. Professional writers haven’t been particularly worried by this development, though, because the thinking is that computers have a long way to go—which may be never—when it comes to nuanced, subtle, creative forms of writing. What writers are mainly depending on to save their jobs and maybe their souls is irony, satire, humor. What they’re depending on, in a word, is subtext. Are you trying to teach computers to understand subtext?

Ramya Akula To be precise, one of the toughest jobs for the algorithms is understanding the context, which we humans are really good at. Any human can understand the context and then interpret the content based on that context. For the algorithms it’s always hard, because with long sentences they have to capture the semantic similarity, or some kind of relationship, between the words. Understanding the context and then coming up with the next sentence, or attaching some kind of sentiment like humor or irony to the text, adds another level of complexity. In the machine learning community, most researchers started attacking this problem by looking at different representations: taking the sentence as it is, chunking it into parts like phrases, and then having a different representation for each phrase, in order to understand the context, put all of it together, and then generate a meaningful next sentence. I feel like it’s still in a very initial phase, and we have a long way to go.

Steven Cherry You started with social media posts. This seems like in some ways an easier problem and in some ways a harder problem than, say, audio from a call center. You don’t have tone and intonation, which I think in a real conversation are often clues in what we might call human-to-human sarcasm detection.

Ramya Akula Yes. So in speech recognition, that’s one advantage: we look at the intonation, or how the voice modulates, and those kinds of signals help us better understand it. But when we look at text, like real text from articles or the online conversations that we see day to day, there is not really any stress or any kind of intonation that you could relate to. So that’s what makes it a little harder for any algorithm. It’s harder to check the severity of the humor or sarcasm there.

Steven Cherry If I understand something in your paper, neuropsychologists and linguists have apparently worked on sarcasm, but often through identifying sarcastic words and punctuation and also something called sentimental shifts. But what are these? And did you use them too?

Ramya Akula So the neurolinguists or the psychologists primarily look at data that comes from real humans and real conversations. When they are looking at text written by real humans, it’s real humans who are making sense of the text. We humans, as I said earlier, are good at understanding the context just by reading or by talking in any form. In our case, no human is involved in any of the data analysis, so all the pre-processing and everything else is done automatically. It’s the algorithm that is doing it.

So we definitely use some cues. For the machine learning part, we have labeled data: each sentence is labeled as having some sarcasm or no sarcasm, and then the data is split into training and test sets. We use this data to train our algorithm and then test it on unseen annotated data. Because the data is already labeled, we use those labels, and we also use weights to understand what the cues are. So instead of real humans looking at the cues in the sentence, our algorithm looks at the weights that give us the cues for the words.
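The labeled-data workflow described here can be illustrated with a minimal sketch. It is not the architecture from the Entropy paper: it assumes a tiny made-up dataset and swaps in a plain bag-of-words logistic regression, and the names `LABELED`, `split`, `train`, and `predict` are hypothetical. It only shows the three ideas mentioned in the conversation: labels, a train/test split, and learned per-word weights read back as cues.

```python
import math
import random

# Hypothetical toy dataset: (sentence, label), where 1 = sarcastic, 0 = not.
LABELED = [
    ("oh great another meeting", 1),
    ("yeah right that will work", 1),
    ("oh great the train is late again", 1),
    ("yeah right like that ever happens", 1),
    ("the meeting starts at noon", 0),
    ("the train is late today", 0),
    ("that will work for me", 0),
    ("dinner was good tonight", 0),
]

def split(data, test_fraction=0.25, seed=0):
    """Shuffle the labeled examples and split them into train and test sets."""
    rows = list(data)
    random.Random(seed).shuffle(rows)
    n_test = max(1, int(len(rows) * test_fraction))
    return rows[n_test:], rows[:n_test]

def predict(weights, bias, sentence):
    """Probability that a sentence is sarcastic under the current weights."""
    score = bias + sum(weights.get(w, 0.0) for w in sentence.split())
    return 1.0 / (1.0 + math.exp(-score))

def train(rows, epochs=200, lr=0.5):
    """Fit one weight per word with plain stochastic gradient descent."""
    weights, bias = {}, 0.0
    for _ in range(epochs):
        for sentence, label in rows:
            error = predict(weights, bias, sentence) - label
            bias -= lr * error
            for w in sentence.split():
                weights[w] = weights.get(w, 0.0) - lr * error
    return weights, bias

train_rows, test_rows = split(LABELED)
weights, bias = train(train_rows)

# The largest positive weights are the words this toy model treats as cues.
cues = sorted(weights, key=weights.get, reverse=True)[:3]

# Accuracy on the held-out sentences the model never saw during training.
accuracy = sum(
    (predict(weights, bias, s) > 0.5) == bool(y) for s, y in test_rows
) / len(test_rows)
```

With only eight sentences the held-out accuracy is not meaningful; the point of the sketch is that the cue words come out of the learned weights rather than from a human reading the text.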

Steven Cherry Can you say a little bit more about this? There’s a lot of AI here, I gather, and it involves pre-trained language models that help you break down a sentence into something you call word embeddings. What are they and how do these models work?

Ramya Akula So basically a computer understands everything in terms of numbers, right? So we have to convert the words into numbers for the algorithm to understand. What an embedding does is basically convert the real words into vectors of numbers. In our case, we used multiple embeddings. There are many embeddings out there, starting from [inaudible] to the very latest GPT [Generative Pre-trained Transformer] that we are seeing every day, that’s generating tons of data out there.
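As a rough illustration of what "converting words into vectors of numbers" means, here is a minimal sketch of an embedding lookup. The table `EMBEDDINGS`, its three-dimensional values, and the helper `embed` are all made up for illustration; real embeddings such as BERT or ELMo are learned from large corpora and have hundreds of dimensions.

```python
# Hypothetical 3-dimensional embedding table with invented values.
EMBEDDINGS = {
    "i": [0.1, 0.0, 0.2],
    "need": [0.4, 0.3, 0.1],
    "apples": [0.9, 0.1, 0.5],
    "for": [0.0, 0.2, 0.1],
    "work": [0.3, 0.8, 0.4],
}
UNKNOWN = [0.0, 0.0, 0.0]  # fallback for words outside the toy vocabulary

def embed(sentence):
    """Map each lowercased token to its vector, one vector per word."""
    return [EMBEDDINGS.get(token, UNKNOWN) for token in sentence.lower().split()]

# A five-word sentence becomes a list of five numeric vectors.
vectors = embed("I need apples for work")
```

A fixed table like this gives a word the same vector in every sentence. Contextual models are designed to go further and let the surrounding words change the representation.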

So in our case, we use BERT, which is one of the latest embedding technologies out there. BERT stands for Bidirectional Encoder Representations from Transformers. I know it’s a mouthful, but basically it takes the individual words in a sentence and tries to relate, or connect, each word with every other word, reading both from left to right and from right to left. The main reason BERT works that way is that it is trying to understand the positional encoding.

So that’s basically what comes next. Take, for example, “I need apples for work.” In this context, does the user mean they need apples, the fruit, for work, or an Apple gadget for work? That really depends on the context. As I said, humans can understand the context, but for an algorithm to understand which one is meant, the gadget or the fruit, it depends on the entire sentence or the conversation. So what BERT does is look at these individual positional encodings, try to find the closest similar word that comes next to each one, and then put it all together. It works in both the right-to-left and the left-to-right directions.
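The idea of relating each word to every other word, on both its left and its right, is what self-attention computes. The following is a simplified sketch of scaled dot-product self-attention, not BERT itself: it assumes a single head, no learned query, key, or value projections, no positional encodings, and invented three-dimensional word vectors. The names `softmax` and `self_attention` are hypothetical.

```python
import math

def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(vectors):
    """Scaled dot-product self-attention with queries = keys = values = inputs.

    Simplifying assumption: one head and no learned projection matrices.
    """
    dim = len(vectors[0])
    outputs, all_weights = [], []
    for query in vectors:
        # Score this word against every word in the sentence, left and right.
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in vectors
        ]
        weights = softmax(scores)
        # The new representation is a weighted mix of all the word vectors.
        mixed = [
            sum(w * v[d] for w, v in zip(weights, vectors)) for d in range(dim)
        ]
        outputs.append(mixed)
        all_weights.append(weights)
    return outputs, all_weights

tokens = ["i", "need", "apples", "for", "work"]
vectors = [  # made-up values, one vector per token
    [0.1, 0.0, 0.2],
    [0.4, 0.3, 0.1],
    [0.9, 0.1, 0.5],
    [0.0, 0.2, 0.1],
    [0.3, 0.8, 0.4],
]
contextual, attention = self_attention(vectors)

# attention[2] holds how strongly "apples" attends to each of the five tokens.
apples_attention = dict(zip(tokens, attention[2]))
```

Because every row of `attention` covers the whole sentence, the representation of "apples" can be influenced by "work" even though "work" appears later. Attention weights of this kind are also one way a model can expose which words it relied on.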

That helps it better understand the semantic similarity. Similarly, we also have other embeddings like ELMo [Embeddings from Language Models]. We experimented with different embedding types, including BERT, ELMo, and several others. We added this part into our studies, and it is just the initial layer: a type of conversion that turns the real words into numbers so they fit into the algorithm.

Steven Cherry I gather you have in mind that this could help Twitter and Facebook better spot trolls. Is the idea that it could help detect insincere speech of all kinds?

Ramya Akula Yes, it does. That’s the short answer. But adopting algorithms is something that’s up to the corporations, whether they want to do it or not. That’s the main idea, though: to help curb the unhealthy conversations online. That could be anything ranging from trolling and bullying all the way to misinformation propagation. So that’s a wide spectrum.

Steven Cherry Do you think it would help to work with audio in the way the call centers do? For one thing, it would turn the punctuation cues into tones and inflections that they represent.

Ramya Akula So the most precise answer is yes. But there is another filter to it, or actually an additional layer. The first thing is analyzing the audio form. In the audio form, we also get cues, as I said earlier: the intonations or the expressions give us another set of helpful cues. But after the conversation is transcribed, that is when our algorithm can help. So, yes, our algorithm can definitely help with any kind of speech synthesis or any application in a call center or with voice recordings. And yes, we will also add the speech part to it.

Steven Cherry Ramya, your bachelor’s degree was earned in India and your master’s in Germany before you came to the U.S. You speak five languages. Two of them are apparently Dravidian languages. I have two questions for you. Why the University of Central Florida, and what’s next?

Ramya Akula I did my master’s at the Technical University of Kaiserslautern in Germany, and my master’s thesis was mainly on the visualization of social networks. This was back in 2014. That is when I got introduced to working on social networks. And I was so fascinated to learn about how people adapt to the changes that come their way, adopting the technology, and how online life keeps changing.

For example, before Covid and after Covid, we’ve moved from face-to-face to a completely virtual world. When I was doing my master’s thesis on social networks, I was so interested in the topic. Then I worked for a while in industry. But I wanted to come back to academia and get into the research field, to actually understand it, rather than developing something out in industry for someone else. I thought maybe I could do some research and try to understand and gain more knowledge about the field.

So then I was looking at different options. One of the options was working with Ivan Garibay, because he had the DARPA SocialSim project. It’s about a $6.5 million funded project, and the overall idea of the project is really fascinating. It’s looking at human simulation: how humans behave on all these kinds of online social media networks. So when I read about this project and about his lab, I think that was the main turning point toward this lab and my work.

And so this project is also part of that main big project. Going forward, I would want to work for a startup where I can learn, because every day is like a learning process; we can learn multiple things.

Steven Cherry It seems like a lot of this would be applicable to chatbots. Is that a possible direction?

Ramya Akula Chatbots? Yes, that’s one application, in a question-answering system. But there is a lot more to it than just the automated analysis of questions and answers. It can be applied to multiple things: not just online conversations, but also personal assistants.

Steven Cherry When a computer beat the world champion of chess, it was impressive, and winning at Go was more impressive. Beating the champions of Jeopardy was impressive, at least until you realized it was mostly that the computer knew Wikipedia better and faster than humans. But about four years ago, a computer at Carnegie Mellon University beat some top players at poker, which required it to, in some sense, understand bluffing. That seems impressive in a whole different way from chess and Go. And this development with sarcasm seems like a similar advance.

Ramya Akula So the main advantage of having these algorithms is that, as I said, they are really good at understanding different patterns. We as human beings are limited in that sense, however much of a pro we are at a certain task. There is always a limit to understanding different patterns, learning the patterns, or in fact matching the patterns. That is where we can take the help of algorithms, like our sarcasm detector or any other machine learning algorithm, because they look at all possible combinations. And the beauty of machine learning is that the algorithm knows when it should stop learning.

Or actually it’s the programmer who is looking at the training loss. When the training loss is really dropping, that’s when he would know that it’s starting to decay, for example, that it is overfitting on the data. So we have to stop the training. Those are the kinds of indications for a programmer to stop training.
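One common way to automate the decision to stop training is early stopping. The sketch below is an illustration under stated assumptions, not the procedure used in the research discussed here: it assumes a separate validation loss is recorded after each epoch, and the function name `early_stopping_epoch`, the `patience` parameter, and the loss values are all hypothetical.

```python
def early_stopping_epoch(val_losses, patience=2):
    """Return the index of the epoch whose weights should be kept.

    Training is considered finished once the validation loss has failed to
    improve for `patience` consecutive epochs, a typical sign of overfitting.
    """
    best_epoch, best_loss = 0, float("inf")
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_epoch, best_loss = epoch, loss
        elif epoch - best_epoch >= patience:
            break  # no improvement for `patience` epochs: stop here
    return best_epoch

# Hypothetical validation-loss curve: it improves, then starts rising again.
losses = [0.90, 0.70, 0.55, 0.50, 0.52, 0.56, 0.61]
stop_at = early_stopping_epoch(losses)
```

For this invented curve the loss is lowest at epoch 3 and rises afterward, so the model from epoch 3 would be kept. A larger `patience` tolerates more noise in the curve at the cost of extra training epochs.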

But after the training, we can see how well these patterns are learned. All the previous achievements by different machine learning algorithms, specifically the reinforcement learning algorithms, came from looking at the whole variety of combinations of winning chances, having all that data within a very short time, and then learning from it. Most of these also had some kind of feedback loop from which the algorithm learns. Sometimes it’s the programmer, the human in the loop, who helps the training, and sometimes the machine learning algorithm learns by itself. So these algorithms help us better understand the patterns, and we humans better understand the context.

Steven Cherry Well, Ramya, there are two kinds of people in the world, those who hate sarcasm and those who live by it. I sometimes think that no two people can have a friendship or a romance if they’re on opposite sides of that line. And I can’t think of a more exciting and terrifying prospect than a robot that understands sarcasm. Exciting because maybe I can have a real conversation someday with Siri and terrifying if it means software will soon be writing better fiction than me and not just sports records—to say nothing of the advance towards Skynet. But thanks for this completely trivial and unimportant work and for joining us today.

Ramya Akula It’s my pleasure. It was fun talking to you.

Steven Cherry We’ve been speaking with University of Central Florida PhD student Ramya Akula, whose work on detecting sarcasm, a significant advance in the field of sentiment analysis, was published recently in the journal Entropy.

Fixing the Future is sponsored by COMSOL, makers of mathematical modeling software and a longtime supporter of IEEE Spectrum as a way to connect and communicate with engineers.

Fixing the Future is brought to you by IEEE Spectrum, the member magazine of the Institute of Electrical and Electronics Engineers, a professional organization dedicated to advancing technology for the benefit of humanity.

This interview was recorded May 21st, 2021, on Adobe Audition via Zoom, and edited in Audacity. Our theme music is by Chad Crouch.

You can subscribe to Fixing the Future on Spotify, Stitcher, Apple, Google, and wherever else you get your podcasts, or listen on the Spectrum website, which also contains transcripts of all our episodes. We welcome your feedback on the web or in social media.

For Radio Spectrum, I’m Steven Cherry.

About Fixing the Future

On IEEE Spectrum's Fixing the Future podcast, our editors talk with the brightest minds in technology about concrete solutions to big challenges.


The Conversation (6)

Chen Qi, 08 Oct, 2021

This article mainly describes the development of artificial intelligence algorithms for analyzing human emotion today. At present, it is easy for AI to analyze human emotions such as joy, fear, sadness, and confidence, but it is very difficult to analyze sarcasm. Because of cultural differences between countries and different language environments, the conclusions are different. Scientists convert the words into numbers for the algorithm to understand. Humans can understand the context, but for an algorithm, it depends on the entire sentence or the conversation. So scientists tried experimenting with different embedding types, such as BERT and ELMo, and added this part into their studies. This technology can be used to prevent unhealthy speech, which I think will play a positive role in the network environment.


haichen fu, 08 Oct, 2021

This passage is about how computers learn the sentiment behind people’s emotions, especially sarcasm.

Ramya Akula shares her point of view about what makes sarcasm harder than some of the others. She said the most important thing is that sometimes sarcasm can be humor, but it can also hurt people really badly. There is also how people interpret it, because people come from different cultures and different backgrounds. Others may not know the meaning of those “jokes,” so they might misunderstand them and think they are being shamed.

Especially on websites, people talk only in words, so you can’t judge their real sentiment from their expression. Computers can only judge from the words, so it’s really significant for a computer to know the connection between the words in the passage.

In the end they talk about the uses of this technology: it can be used in chatbots and personal assistants. What’s more, just as AlphaGo defeated a human champion, one day machines could defeat people at writing and maybe also composing. In the past we always said that the one thing machines could never know is our emotion, so we could never be replaced by robots. But this passage shows me that it could be possible for robots to be like real humans. I don’t mean that they will rule the world, but it’s more possible for them to be real members of the world, maybe a friend or even family.

In conclusion, I think this is a really important technology in machine learning, so I’m really looking forward to seeing what it can create in the future.

Zhuangzhuang Jia, 08 Oct, 2021

The development of artificial intelligence provides us with great convenience in daily life, but emotional intelligence is difficult to achieve. Our tone of voice is a great help in understanding words, and the computer needs to translate what people say into numbers before it can understand the emotions, which is more difficult. Understanding the emotion in ironic statements is among the more difficult tasks. So we don't have to worry about artificial intelligence replacing our work at present.


Jiajun Fang, 08 Oct, 2021

Is it really possible to build a machine that thinks like a person? We're getting closer to building an AI that thinks like a human. Maybe in the future a robot nanny will replace a real human nanny, working more efficiently and without any complaints. If they have real emotion, they will be closer to perfect, rather than a cold machine. If robots turn out to be smarter than we think, that could be a huge ordeal for humans, and they might not want to be human servants. It is not clear what will happen in the future. Making robots serve human beings is a good thing, but the issue of artificial intelligence is still controversial.
