What Is Emotion AI and Why Should You Care?
Recently I had the opportunity to attend the inaugural Emotion AI Conference, organized by Seth Grimes, a leading analyst and business consultant in the areas of natural language processing (NLP), text analytics, sentiment analysis and their business applications. (Seth also organized the first Text Analytics Summit 15 years ago, which I also had the privilege to attend.) The conference was attended by about 70 people (including presenters and panelists) from industry and academia in the US, Canada, and Europe.
Given the conference topic, what is emotion AI, why is it relevant, and what do you need to know about it? Read on to find out, but first, some background.
Emotions are a cornerstone of human intelligence and decision making
We humans are highly emotional beings, and emotions impact everything we do, even if we are mostly unaware of it. They guide our attention, shape what and how we learn and remember, influence how we perceive ourselves and others, and ultimately determine how we grow as individuals and who we become. As Yann LeCun, one of the godfathers of AI and deep learning, put it: “It is impossible to have intelligence without emotions” (quoted in Daniel McDuff’s presentation).
Emotions are highly personal, yet also social. Reading emotional responses in others – both in general and in reaction to our own actions – is one of the first things we learn as infants. Emotions are also the reason why we love storytelling, and why stories are so effective for learning, for influencing and inspiring others, and for instigating action and change. In a well-constructed story, the plot (its narrative arc) is closely intertwined with the characters’ emotional evolution (the emotional arc), forming the double helix of narrative + emotion. And what are history, politics, and news if not collections of stories – true or otherwise – and the emotions therein?
We continue to be governed by emotions in subtle and not-so-subtle ways throughout our lives. As one of the conference presenters, Diana Lucaci of True Impact said, “People say what they think and act on how they feel.”
It is not surprising, then, that neuroscience research and emotional design have for years been staples in marketing and advertising and in product, service, and website design. Emotions nudge you, the user, to click on a link and buy a product, or to follow the emotional breadcrumbs through a website, just as its designer intended.
Emotion artificial intelligence seeks to understand, replicate, and simulate human emotions
In the field of artificial intelligence (AI), researchers and practitioners have likewise for years looked into ways to mimic and explore human emotions. The field took off in 1995, when MIT Media Lab professor Rosalind Picard published an article entitled “Affective Computing.” It gave rise to a new discipline by the same name, from which emotion AI spun off. The goal of emotion AI? To understand, replicate, and simulate human emotions in and by machines.
Affective computing and emotion AI incorporate many technologies and application areas. The Affective Computing group at MIT, for example, “aims to bridge the gap between human emotions and computational technology.” Its projects range from “finding new ways to forecast and prevent depression before there are any clear outward signs of it; to inventing ways to help people with special needs who face communication, motivation, and emotion regulation challenges; to enabling robots and computers to receive natural emotional feedback and improve human experiences.” And this is just scratching the surface; there are many more projects, applications, and use cases on the lab’s website.
Sentiment analysis goes multi-modal
One of the areas of emotion AI is sentiment analysis, a field that has existed since at least the early 2000s. Sentiment analysis is usually conducted on textual data, be it emails, chats, social media posts, or survey responses. It uses NLP, computational linguistics, and text analytics to infer the positive or negative attitude (aka “orientation”) of the text’s writer: do they say good or bad things about your brand and your products or services?
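To make this concrete, here is a minimal sketch of text-level sentiment scoring using NLTK’s off-the-shelf VADER lexicon – one common approach among many, and not a tool discussed at the conference; the example reviews are made up.

```python
# Minimal sentiment-analysis sketch using NLTK's VADER lexicon.
# Assumes: pip install nltk, plus a one-time download of the lexicon.
import nltk
nltk.download("vader_lexicon", quiet=True)

from nltk.sentiment import SentimentIntensityAnalyzer

analyzer = SentimentIntensityAnalyzer()

reviews = [  # hypothetical customer comments
    "The new app update is fantastic, support was super helpful!",
    "Checkout keeps crashing and nobody answers my emails.",
]

for text in reviews:
    scores = analyzer.polarity_scores(text)      # returns neg/neu/pos/compound
    print(f"{scores['compound']:+.2f}  {text}")  # > 0 leans positive, < 0 negative
```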
The obvious applications of sentiment analysis have been brand/reputation management (especially on social media), recommender systems, content-based filtering, semantic search, triaging customer complaints, and understanding user/consumer opinions and needs to inform product design.
Several of the conference presentations were devoted to this topic, which, despite all the recent progress in NLP and related fields, is still hard – not least because there is little agreement among researchers on even what constitutes basic human emotions and how many of them there are, said Bing Liu, Professor of Computer Science at the University of Illinois at Chicago.
Emotions are also notoriously hard to identify and code (label), since they are ambiguous, shifting, overlapping, and adjacent – one can feel anger, sadness, and disgust at the same time. And clear, unambiguous labels are important: AI – or at least the 70% of it that is known as supervised learning – depends on data that has been tagged (“annotated” or “labeled”) by humans. That’s how machines learn. (Hence “supervised.”)
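To see why labeling is tricky, here is a small, purely hypothetical sketch of multi-label emotion annotation, where two imaginary annotators tag the same texts and their label sets overlap only partially; the label set and texts are illustrative, not a standard taxonomy.

```python
# Hypothetical multi-label emotion annotations from two annotators.
# Overlapping emotions (e.g. anger + sadness + disgust) and disagreement
# between annotators are exactly what makes supervised training data noisy.
annotations = [
    {
        "text": "They cancelled my order again without telling me.",
        "annotator_a": {"anger", "disgust"},
        "annotator_b": {"anger", "sadness"},
    },
    {
        "text": "Honestly, I don't even care anymore.",
        "annotator_a": {"sadness"},
        "annotator_b": {"anger", "sadness"},
    },
]

for item in annotations:
    a, b = item["annotator_a"], item["annotator_b"]
    agreement = len(a & b) / len(a | b)   # Jaccard overlap between label sets
    print(f"{agreement:.2f}  {sorted(a)} vs {sorted(b)}  {item['text']}")
```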
Then there is the complexity of how emotions are conveyed, explained Professor Liu. In speech, emotions are communicated through a broad range of linguistic and paralinguistic cues, such as intonation, facial expressions, body movements, gestures and posture, and bio-physical signals (sweating, skin flushing, etc.). In writing, they are signaled by punctuation, capitalization, emoticons, and other creative devices such as word lengthening (e.g. “soooo slow” or “so sloooow”). And that is in addition to word choice and grammar!
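As a toy illustration of how some of these written cues can be picked up programmatically, the snippet below uses a few regular expressions to flag word lengthening, all-caps words, emoticons, and exclamation marks; it is a rough sketch, not how any system presented at the conference works.

```python
import re

# Crude detectors for a few written emotion cues mentioned above.
LENGTHENING = re.compile(r"(\w)\1{2,}")          # "soooo", "sloooow"
ALL_CAPS    = re.compile(r"\b[A-Z]{3,}\b")       # "WHY", "NEVER"
EMOTICON    = re.compile(r"[:;=][-^]?[)(DPpOo]") # ":)", ";-D", ":("

def written_cues(text: str) -> dict:
    return {
        "lengthening": bool(LENGTHENING.search(text)),
        "all_caps": bool(ALL_CAPS.search(text)),
        "emoticon": bool(EMOTICON.search(text)),
        "exclamations": text.count("!"),
    }

print(written_cues("The site is soooo slow!!! NEVER again :("))
# {'lengthening': True, 'all_caps': True, 'emoticon': True, 'exclamations': 3}
```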
There are also cultural differences in how people convey emotions. To complicate things further, there is a phenomenon known as the “cognitive gap”: what people say and how they truly feel do not always match, for multiple reasons – they may be trying to be polite, to avoid hurting other people’s feelings, or simply to keep their emotions to themselves.
Professor Liu said that context and multi-modal data may help resolve many such ambiguities. And in fact, with the advancement of biometrics and wearables, the field has expanded to analyzing emotions from sensor data – heart rate, temperature, brain waves, blood flow, and muscle bio-signals – as well as voice, facial expressions, images, and video.
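One simple way to picture multi-modal emotion analysis is late fusion: score each signal (text, heart rate, facial expression) separately and combine the scores. The sketch below illustrates only that general idea; the weights, feature names, and score ranges are assumptions made up for the example.

```python
# Purely illustrative late-fusion sketch: combine per-modality scores
# (each scaled to [-1, 1], negative to positive) with hand-picked weights.
WEIGHTS = {"text": 0.5, "heart_rate": 0.3, "face": 0.2}  # made-up weights

def fuse(scores: dict) -> float:
    """Weighted average of whatever modality scores are available."""
    available = {m: s for m, s in scores.items() if s is not None}
    total = sum(WEIGHTS[m] for m in available)
    return sum(WEIGHTS[m] * s for m, s in available.items()) / total

# Text alone reads mildly positive, but physiology suggests stress.
print(fuse({"text": 0.2, "heart_rate": -0.8, "face": -0.4}))  # ~ -0.22
print(fuse({"text": 0.2, "heart_rate": None, "face": None}))  # 0.2
```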
This trend of leveraging sensors will continue, predicted strategist and “tech emotionographer” Pamela Pavliscak (who also teaches at Pratt Institute), with the exception of, perhaps, facial recognition technologies (FRTs) and haptic/touch data. FRTs recently came under fire because of privacy concerns. (Some of my previous notes on this topic can be found here.) And touch data is a “no-go” for obvious reasons: the COVID-19 pandemic.
Can wearables be used to identify emotions?
A meta-analysis by Professor Przemysław Kazienko of Wroclaw University of Science and Technology focused on wearables and tried to answer the following question: “Can they be used to identify emotions in everyday life?” If we could do that, we could, for example, improve the health, well-being, and clinical outcomes of patients suffering from diseases that have mood-altering effects. (The example he used was kidney dysfunction.)
We could also use wearables to support stress management, mental health, and people with autism. One such app, developed at the MIT Media Lab, monitors a person’s heartbeat to detect whether they are experiencing negative emotions such as stress, pain, or frustration, and releases a scent to help the wearer cope.
And of course, we could use emotions detected from wearables for “good old” personalization and product/service improvement: from online content and product recommendations to virtual assistants and gaming experiences. (Although I am concerned about the privacy implications of several of the use cases Professor Kazienko mentioned.)
Emotion data from wearable devices can also be used to prevent car accidents (when the driver gets drowsy, for example), to track students’ attention and hence academic success, and to improve social interactions, among several other things, he said.
Ms. Pavliscak illustrated the latter with the US+ project by Lauren Lee McCarthy and Kyle McDonald, which is “a Google Hangout video chat app that uses audio, facial expression, and linguistic analysis to optimize conversations.” The app analyzes what users say and whether they use common vocabulary and sentence structures, which we humans tend to do as a conversation unfolds. (This is known as “linguistic style matching.”) For each of the chat participants, the app displays a quick visualization and pop-up notifications, for example: “Stop talking about yourself so much” or “What are you hiding? Clare is speaking much more honestly.” It can even automute a participant “when the conversation gets out of balance.”
Image: US+ project, accessed May 12, 2020
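Linguistic style matching itself can be approximated quite simply, for example by comparing how heavily two speakers lean on the same function words. The sketch below is a rough illustration of that idea, not the US+ implementation; the function-word list and the similarity measure are arbitrary choices.

```python
# Rough illustration of linguistic style matching: compare how heavily two
# speakers rely on the same function words. Not the US+ implementation.
FUNCTION_WORDS = {"i", "you", "we", "the", "a", "and", "but", "so", "really",
                  "just", "very", "that", "this", "of", "to", "in"}

def function_word_profile(utterance: str) -> dict:
    words = utterance.lower().split()
    return {w: words.count(w) / len(words) for w in FUNCTION_WORDS}

def style_match(a: str, b: str) -> float:
    """Close to 1.0 = very similar function-word usage, 0.0 = very different."""
    pa, pb = function_word_profile(a), function_word_profile(b)
    diffs = [abs(pa[w] - pb[w]) for w in FUNCTION_WORDS]
    return 1.0 - sum(diffs) / 2

print(style_match("I really think we should just try it",
                  "I really feel we should just do it"))  # 1.0: styles match
```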
There were many more fascinating applications of emotional data and emotion analysis in Ms. Pavliscak’s keynote, “Design for an Emotionally Intelligent Future.” The video recordings of all conference sessions are available on the conference site for $200, with a discounted price of $100 for additional viewers from your organization, and I highly recommend that you watch them.
Going back to the question, “Can wearables be used to identify emotions in everyday life?”, Professor Kazienko’s conclusion is: “Emotion recognition with wearables is the future of personalized affective computing.” However, he continued, we need more field studies, more and better data, active collaboration between researchers, a common model on how to classify emotions, data and code sharing, and several other things to improve recognition quality and reproducibility.
Using emotions to make AI sound (and act) like a human
As more of our experiences and daily lives are mediated by artificial intelligence, we increasingly want AI to look, “feel,” sound, and (re)act like a human. Perhaps because humans are, in general, less frightening than machines. (Or are they?) Or maybe because our imagination is constrained by our own physiological characteristics – similar to the Sapir-Whorf hypothesis, which postulates that the structure of a language affects how its speakers perceive the world. (Add emotions to the mix, and it becomes quite complex very quickly.)
Or maybe because we humans like to remake the world in our own image and to our liking. Have you seen Atlas, the robot by Boston Dynamics, performing gymnastics? In all his mechanical glory, he is sort of alright, no? The company’s headless mechanical dogs, on the other hand, are rather unsettling…
Now think about how you would like Atlas to sound if he could talk. Obviously, he can’t (yet), but we increasingly interact with chatbots and virtual assistants. And if we could imbue machines with empathy, they could provide better consumer and brand experiences, and in doing so build familiarity, stronger bonds, and deeper trust, said Greg Hedges, Chief Experience Officer at Rain, a consultancy. His firm creates “emotionally-intelligent voice experiences” for brands such as Nike, Starbucks, Tiffany, Tide, and Sesame Street.
From chatbots to “emotional chatting machines”?
Since potentially significant business benefits can be gained from emotionally aware AI, it is perhaps not surprising that “emotionally intelligent” chatbots are a hot research area. Professor Liu, mentioned above, is working “to create chatbots that can perceive and express emotions and learn continually during conversations.” He calls them “Emotional Chatting Machines.”
“Emotional intelligence is a vital aspect of human intelligence,” he said, and moreover, research shows that emotion in dialogue systems can enhance user satisfaction. Chatbots that express empathy decrease user frustration and stress. They lead to fewer breakdowns in dialogues. And they also inspire people to cooperate rather than rage about “stupid machines,” he continued.
“Systems that mimic human style are more natural to interact with,” said Daniel McDuff, Principal Researcher at Microsoft. His team is building embodied agents, among other things. (See also the last section for details on his other work.) We humans constantly adapt to each other and this adaptation creates social cohesion. When people interact with a virtual reality agent over time, they similarly try to adapt to its style, he said. And so virtual agents and chatbots will need to adapt to humans.
Emotional chatbots in business could mean more satisfied customers and citizens, and perhaps lower costs, as chatbots will be able to take care of most inquiries and free human operators for less common or more complex interactions – especially in these times of crisis. More than two months after the WHO declared the novel coronavirus disease a global pandemic, telephone lines at banks, insurance companies, major retailers, and government offices remain jammed. IBM, which offered its Watson Assistant to help governments deploy chatbots, saw a 40% increase in traffic to its chatbot platform. Google promptly launched its own Rapid Response Virtual Agent to “[q]uickly build and implement a customized Contact Center AI virtual agent to respond to questions your customers have due to COVID-19 over chat, voice, and social channels.”
Not yet…
But beware of vendor promises. Chatbots in general require extensive training and/or scripting, so the costs may not be significantly lower – depending on organization size; the number, type, and complexity of use cases; and other parameters.
“Emotionally aware” chatbots add another layer of complexity and challenges to chatbot technologies. Here are a few mentioned by Professor Liu:
- Emotion-labeled data that such emotionally aware chatbots and VAs rely on is hard to obtain at the scale needed to train the machines.
- Annotations (i.e. labels), where available, are subjective, and classification may be inaccurate. And with AI, just like with other data-dependent applications, it is “garbage in, garbage out.”
- An emotionally enabled bot must also balance understanding the emotions of the speaker it is conversing with against generating its own emotional and linguistic responses in return. This is really hard to do, because the two processes clearly depend on each other and need to happen almost simultaneously.
- All this becomes even harder in what is known as the “open domain,” unconstrained by bot use cases, application areas, industry, etc.
Professor Liu’s takeaway on the state of emotionally aware chatbots: “Chatting with emotions is vital for dialogue systems,” but there are massive quality issues, and because of that, the technology is not ready for prime time.
He said there are examples of deployed chatbots with emotions, but such bots are mainly rule-based – they are scripted, so there is no intelligence there. These bots are also domain-constrained: they are typically used in narrow cases such as customer service or as emotional companions. To create more human-like “emotional chatting machines,” he said, we will need to get better at multi-modal emotion detection and generation (using the sensor data mentioned earlier).
Emotion-inspired machine learning
The most intriguing session at the conference was the keynote by Daniel McDuff, Principal Researcher at Microsoft, who asked the question “How can machine learning leverage emotions to learn and explore?” He spoke about “visceral machines” – the idea that machine learning/AI systems should have some emotional mechanisms similar to those of humans. Or, at least that they should be able to model emotions for the reasons mentioned above: we want technology to interact with us humans as if it were human.
But there is another reason. Emotions, said Mr. McDuff, help us to understand and explore the world. They are fundamental to understanding what it means to take risks, to achieve positive outcomes, and so on. As we interact with the world, he continued, we receive positive or negative responses that guide our further actions. What if we could incorporate emotions into machine learning systems to inform them – and to improve their performance?
One of the examples he showed involved using drivers’ heartbeats to guide machine learning. (Heart rate is one of the ways we express and experience emotions. Others, at the biological level, include changes in body temperature; pupil and blood-vessel dilation and constriction; changes in blood flow, breathing, and brain waves; and increased or decreased production of saliva, hormones, and digestive enzymes.) Mr. McDuff’s team used drivers’ emotional responses, as expressed through their heartbeat, in training a neural network model to drive a car. (In addition to other data, I presume.) The result? The model was able to drive a vehicle longer than the state-of-the-art model without emotions.
Image: Daniel McDuff, “Emotion Inspired (Machine) Learning”, keynote at The Emotion AI Conference, May 5, 2020.
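Conceptually, one way to fold a signal like heart rate into training is reward shaping in reinforcement learning: penalize states that a physiological model predicts would stress a human driver. The sketch below shows only that generic idea; the weighting, function names, and arousal predictor are placeholders, not the setup Mr. McDuff’s team actually used.

```python
# Generic reward-shaping sketch: blend the task reward with an "emotional"
# term derived from a predicted physiological response (e.g. heart rate).
# This illustrates the idea only; it is not the published method.

def visceral_reward(task_reward: float,
                    predicted_arousal: float,
                    weight: float = 0.5) -> float:
    """predicted_arousal in [0, 1]: 0 = calm, 1 = highly stressed driver.
    High predicted arousal (a proxy for anticipated risk) is penalized."""
    return task_reward - weight * predicted_arousal

# Hypothetical training step: distance travelled is the task reward, and a
# model trained on drivers' heart rates predicts arousal for the current state.
step_reward = visceral_reward(task_reward=1.0, predicted_arousal=0.9)
print(step_reward)   # 0.55 – risky states become less attractive
```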
His team also looked at applying emotions (or rather human physiological response data) to teach machines to avoid car crashes, in this particular case, using human drivers’ facial expressions. It seems that it worked too. Their next project is looking to “combine various emotional response signals, risk-aversity and curiosity to nudge machines to explore more and do so in a safe way.”
Our take
In Russian there is a proverb which, loosely translated, says that when we meet a person, we judge them by their dress, and when they leave, we judge them by their intelligence. (It is roughly equivalent in English to “You only get one chance to make a first impression” and “You can’t judge a book by its cover.”) Interestingly enough – and this is perhaps one of the rare occasions when ancient folk wisdom does not ring true – people do not usually remember what you wore or said or did, but they do remember how you made them feel – in other words, the emotions you stirred within them.
Emotions impact all aspects of our intelligence and behavior, at the individual and group level. In aggregate, they determine the behavior of the markets, social cohesion, the health of local and global economies, and the progress of nations. The Nordics, for example, are some of the happiest nations in the world, and they are among the richest, too.
So, can we take emotions – this fundamental quality that makes us human – decode them, and re-code them into machines? Well, look at language and speech technologies, or computer vision: significant successes have been achieved in the past decade as a result of big data, cheap compute, powerful hardware, rapidly improving algorithms, and collaboration around open-source software.
However, if the history of AI is any indication, we may never be able to create truly emotionally aware machines. We just do not understand emotions – and intelligence – well enough. What started more than 60 years ago with the ambitious goal of simulating human intelligence – the field’s creators believed that “a significant advance can be made in one or more of these problems if a carefully selected group of scientists work on it together for a summer” – has evolved into something narrower and more brittle: applied AI (or narrow AI), as the field is now known.
Does this mean that emotion AI is too hard and that we should just give up? Despite their more limited scope, many narrow AI applications generate significant business benefits and human value, and emotion AI has the potential to do that, too, as The Emotion AI Conference illustrated – as long as we are aware of the challenges the field is facing and keep our expectations rational. But that would be so unlike us humans – being rational, that is.
Next steps
To learn more about emotion AI, check out the conference website, watch the recordings, or give us a call to learn more about this field and how to get started with AI and machine learning in the context of your organization.