Communication today is largely dominated by digital text-based channels, which transfer only a small part of the information present in face-to-face conversations. In particular, information about the communication partner’s emotional state, which is naturally expressed through facial expressions, body language and other non-verbal indicators, can hardly be transferred. Approaches such as emojis address this issue by allowing the sender to show how he (for reasons of readability, the pronoun “he” addresses all genders equally) feels by selecting an appropriate (smiley) face. However, the crucial difference is that this smiley must be deliberately chosen and does not necessarily represent an authentic expression of the sender’s emotional state. The present paper discusses typical challenges and misunderstandings of communication in the digital era, using chat communication as an example. It reflects on the ramifications for the perceived authenticity of the transferred emotions and discusses possible (technology-based) approaches towards a more direct, authentic way of communication.
With the shift of communication to digital channels, communication is increasingly subject to the rules these channels impose. It is thus shaped and influenced by rules that are deliberately or unintentionally implemented in communication systems. For example, features such as the WhatsApp read receipt check marks make it possible to track when a certain message has been read by the receiver. However, besides such an instrumental effect, features of communication technologies often impose new social rules on communication. In the case of the WhatsApp read receipt feature, for example, the imposed rule could be to answer messages immediately: it might be considered rude to postpone an answer to a message that has obviously been read, and the sender might get angry when no answer arrives although he knows his message has been read. The receiver is thus pressured to answer right away. In sum, digital communication often differs radically from natural face-to-face communication, as technological features can subtract or highlight certain layers of human communication. For example, the implicit, non-verbal signals of face-to-face communication tend to get lost in textual communication. In contrast, the specific words used to convey a message remain permanently present and carry a (sometimes disproportionately) high weight in text compared to spoken communication. However, such differences between digital and analog communication on the experiential level are rarely considered when designing technology. Instead, it seems that the design of communication devices is often driven by technical opportunities and by inventing ever new features that exploit what technology has to offer. Creating opportunities beyond natural communication and doing things one normally cannot do, like sending animated stickers or GIFs in real time via mobile text communication, seems to be an implicit goal.
In contrast, enabling the most natural form of interpersonal communication, with all its facets, through digital channels is rarely the primary objective. While digital communication can be advantageous for particular needs, other needs and possibilities are set aside. For example, the missing automatic transfer of authentic emotions in written messages can be considered an advantage if one does not want the communication partner to know how one actually feels. At the same time, it can create a feeling of insecurity between communication partners. Even though no actual conflict exists, one can easily misinterpret or get worried about the other person’s message. For example, a study on social conflicts related to digital technology showed that receiving a text message without any smileys or emojis is often interpreted as a signal that the sender must be angry, which usually was not the case. With the establishment of textual chat as a major communication medium, digital communication seems to become increasingly alienated from the natural way of communicating, driven by the more or less random effects of modern technology. This creates a gap between the most frequently used communication channels and the needs that people can fulfill in direct communication. The present paper highlights certain challenges and typical misunderstandings of communication in the digital era, using chat communication as an example. It reflects on the ramifications for the perceived authenticity of the transferred emotions and discusses possible technology-based approaches towards a more direct, authentic way of communication.
2 Characteristics and Challenges of Digital Channels in Socio-emotional Communication
Communication is shifting more and more to digital channels, with textual chat as a major medium. As one central aspect, textual chat provides the user with a shielded space and full control over the content he wants to send, which represents only a small snapshot of the sum of information the user expresses in direct communication. Interpersonal analog communication, when people come together in a personal conversation, mostly consists of non-verbal communication, including facial expressions, gestures, vocal pitch, speed of speech, etc. A study by Mehrabian suggests that the overall impression a person makes in communication can be divided into three components: words, tone of voice and body language. Words account for only 7 % of the total impression a person makes on a conversation partner, tone of voice for 38 % and body language for 55 %. Especially in socio-emotional communication, when the exchange is about more than factual information and also expresses something about the relationship between the communication partners, such non-verbal elements are of central importance. However, these components of non-verbal communication are to a large extent involuntary and thus not always consciously controlled. Most body language signals, including facial expressions, are unconscious gestures with which the body reacts to the conversational situation, the person’s own emotional world or the appearance of the other person. For example, when receiving genuinely bad news, such as news of a death, a person will hardly succeed in not revealing his feelings through signals from his face or posture. The same applies, conversely, to pleasant surprises. In general, emotional states such as fear, boredom, tension or self-confidence are reflected not only in words, but to a large part also in tone and unconscious body language.
Moreover, for meaningful and effective communication, words, tone of voice and body language must be congruent.
In digital communication, however, this congruence is difficult to achieve because at least one of the components is either missing or simplified in its multifaceted complexity. In video chat, for example, only a small part of the body (usually the face) is captured and displayed, and the video and audio quality depend on the technical conditions of the situation. Voice messages are also subject to encoding quality and cannot transmit the body language component. Text chat, a common form of communication today, operates exclusively via words and uses neither tone of voice nor body language. In text chat, the non-verbal, involuntary parts not only get lost, but the sender is also given the opportunity to deliberately place supposedly involuntary cues (see Figure 1).
Communication channels can be distinguished by the wealth of information they are able to transmit. In fact, as O’Sullivan points out, there are many aspects that could explain a preference for mediated, leaner channels over direct communication channels from the sender perspective, if one considers the particular features of such communication media. For example, leaner channels such as text messages, emails or letters could be used to ambiguate and obfuscate unattractive or embarrassing aspects. Attempts at deception would have less chance of detection (p. 408 ff.). At the same time, leaner channels are also a means to mute an expected negative response. In addition, leaner, asynchronous channels could also be used to benefit the partner. For example, mediated channels could be an asset when accusing the partner, as they reduce the distress of confrontation and give the partner greater control over when and how to respond. O’Sullivan calls this protective effect of leaner channels, which avoid a confrontation with the communication partner’s emotions in full intensity, a “buffer effect”. O’Sullivan further assumes that this buffer effect can be a central element of media choice. In contrast to earlier theories such as media richness theory, which assume that individuals generally prefer richer channels to reduce equivocality, O’Sullivan highlights a functional perspective on media choice. Communication channels can be seen as a tool for managing self-relevant information in pursuit of self-presentational goals, whereby the characteristics and particular constraints of mediated channels are often seen as advantageous for interactions that could threaten positive impressions. In consequence, under certain circumstances, individuals in fact seek to increase equivocality by selecting leaner channels. In sum, O’Sullivan (p. 423) argues that the “assumption in media richness theory that individuals seek high efficiency and low equivocality in making channel choices is applicable only in certain situations” and further points out that “situations exist in which individuals desire to shade and shape impressions, and that is when low efficiency and high equivocality channels can be functional and effective in helping individuals reach their interactional goals.”
3 Emojis and Other Examples of Misleading Emotion Expression in Textual Digital Communication
As pointed out in the previous paragraphs, different motivations can lead to deliberately limiting and ambiguating information exchange and thus to preferring leaner digital communication channels over direct conversation. In some cases, however, the communication partners are actually interested in intense and rich communication and use digital channels for practical reasons, such as staying in touch with friends and family abroad. In such situations, users might wish to simulate direct contact as much as possible and thus be interested in authentic emotion exchange. Different approaches have been developed to address the missing transfer of emotions in digital communication. In mobile chat communication, the substitute of choice for non-verbal communication is ideograms, in particular emojis. Basically, the idea of emojis is to show how one feels by selecting an appropriate smiley face. Instead of seeing the sender’s actual face, as in direct conversation, the receiver perceives a smiley face as a proxy for the sender’s emotional state. However, unlike in face-to-face conversation, choosing an emoji is a deliberate act and not an automatically sent signal. In contrast to a smile on the human face in a face-to-face conversation, the smiling emoji used in a chat conversation is not physiologically linked to the sender of the message. This difference is important for understanding the specific challenges of expressing and interpreting emotions in textual digital communication.
Conducting a social conversation is a skill that people learn from childhood. They learn to use their body for communication, to adjust their pitch to their words and to react to their conversation partner in an appropriate manner. With each conversation they improve their ability to read the body language of others, to initiate a conversation and to engage in a conversation. Digital communication utilizes a completely different form of communication and implies new rules of social interaction which do not necessarily correspond to those of analog communication. Knowledge learned from analog interpersonal interaction, for example regarding how to behave and present oneself according to etiquette in certain social situations, can hardly be applied in chat communication, since this aspect is simply not captured by the rules of textual chat. As users live in both analog and digital reality at all times, forms of behavior mix to the detriment of the analog reality, leading to a possible neglect of adequate manners in social situations when people are occupied with their mobile devices and fade into “the state of monomaniacal obliviousness”. As a natural consequence of substituting analog with digital channels in more and more situations, analog social competence seems threatened, as is already apparent in common social situations of our everyday lives. For example, making a phone call is becoming less and less common among younger people. Instead, text messages are used to communicate in each and every situation, sometimes even if the communication partner is in the room next door and one could just go and knock on the door. Likewise, young employees often get nervous if they have to make a phone call to business partners, which can be problematic for the business relationship even from an economic perspective.
Moreover, whereas in the past, it was generally more common to start a conversation in public spaces or new social environments, nowadays, this skill is less required, as one can always hide behind one’s smartphone. While pretending to be busy, there is no need to interact with others. As a side effect, however, people are less trained in the natural art of direct conversation.
In addition, current communication trends imply a loss of authenticity and mutual understanding between conversation partners. When communication between people becomes more and more dominated by digital, often text-based channels, reactions expressed via body language, and especially via facial expressions, are often replaced by deliberately chosen ideograms. In text communication, because the sender is free to choose an arbitrary emoji to represent an emotion, the chosen emoji may correspond neither to the actual facial expression nor to the actual emotional state of the message’s sender. Since facial expressions are a key component of experiencing empathy, this makes it difficult to build an emotional bond and overcome the uncertainty of being understood. Besides differences in individual interpretations of emojis, emojis can also be “misused” to pretend, hide or interpolate an emotion. In sum, the recipient can never be sure to what degree the chosen emoji represents an authentic expression of emotion, and it is questionable to what degree emojis help to empathize with the emotional state of the chat partner. This results in the paradox that today many people have more contacts and contact possibilities than ever, but actually lack authentic social interaction. Chat culture is characterized by a constant uncertainty about the true emotional state of the communication partner, which often manifests itself in a particular friction in communication, frequent misinformation and hurt feelings.
While on the one hand, emojis could help to reduce misunderstandings by enriching pure textual messages with some form of emotion expression, on the other hand, the misinterpretation or overuse of emojis in quality and quantity can become the actual problem. Compared to the strength of emotion expression in face-to-face conversations, users tend to choose much “stronger” emojis. For example, the top two emojis used worldwide have tears of joy or sadness in their eyes and thus depict an emotional state which in reality is rarely shown. This tendency to overplay emotions in chat communication has led to an inflation of the emotional value of emojis. The ironic result is that if you are in the rare situation of really feeling an emotion as intense as the chosen emoji depicts, you have to explain this to your chat partners. For example, when something is really funny, the excessively used tears-of-laughter smiley alone is not enough to convey that emotion. Thus, people often add an overhead of words to explain the emotion verbally (which makes the easy emotion expression through emojis redundant).
I am laughing so hard right now I am almost crying! Can you believe it?
Another issue is the fact that emojis can be interpreted in fundamentally different ways. For example, tears can be interpreted as tears of laughter or tears of suffering, leading to confusion about whether one and the same emoji is adequate for funny or sad messages. Depending on the social context, the impact can be quite devastating, such as when commenting on a message about illness or even death with a tears-of-laughter smiley (see Figure 2). The effect of individual interpretation is amplified by the fact that emojis are not displayed uniformly on all systems. Depending on the OS and OS version, the depictions of an emoji differ vastly (see Figure 3). For example, when an Apple user sends the top left emoji to a friend who owns a Samsung smartphone, what the friend receives is the bottom left emoji: obviously the Samsung user gets a different impression of the friend’s emotion than intended.
4 Approaches for a More Authentic Emotion Expression in Digital Communication
A possibility to address the issue of incongruence of emotions in digital communication is to transfer natural communication elements like facial expressions to digital channels. In order for a system to resemble as much as possible the situation of natural emotion expression in face-to-face conversations it is crucial that the system meets certain key criteria, which we describe as follows:
First, the system is required to simplify the process of emotional expression compared to existing systems. Therefore, it must integrate as seamlessly as possible into the user’s existing communication habits. It should use the expressions the user is already used to, so that the user has no trouble identifying with the expression of the emotional state delivered by the system. In the case of chat communication, a subset of the same set of emojis the user usually uses to express his emotions would be appropriate.
Second, the system should reduce the effort the user must invest to express feelings, thereby improving usability and reducing emotional friction.
Third, the system should take the responsibility for the emotion away from the user, i. e. the system should generate the emotional expression, not the user. Thus, the user’s expression gains value, as he cannot be held responsible for any manipulation. As a consequence, the system vouches for the authenticity of the representation of the emotional expression.
Fourth, in order to achieve this level of authenticity in expressiveness, the system must be able to reliably capture and transmit involuntary aspects of communication, in particular facial reactions. Facial reactions can be captured with the front camera of the user’s smartphone, using artificial intelligence (AI) methods of emotion recognition. The detected emotion can then be mapped to an emoji as a natural reaction of the dialog partner and, in synchronization with the textual content, allow an authentic exchange and evaluation of the emotional state for both sender and receiver of a message. In the following, we describe the implementation of such a system.
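To make the mapping step of the fourth criterion more concrete, the following minimal sketch shows how a detected emotion could be translated into an emoji. It is an illustration under assumptions, not the implementation described in this paper: it assumes that some emotion recognition component returns a confidence score per emotion label, and the label names, the emoji table, the function name emotion_to_emoji and the threshold of 0.5 are all hypothetical.

```python
from typing import Mapping

# Hypothetical label-to-emoji table; labels and emojis are illustrative only
# and are not taken from the prototype described in the paper.
EMOJI_FOR_EMOTION = {
    "happy": "😊",
    "sad": "😢",
    "surprised": "😮",
    "angry": "😠",
    "neutral": "😐",
}


def emotion_to_emoji(scores: Mapping[str, float], min_confidence: float = 0.5) -> str:
    """Return the emoji for the highest-scoring emotion label.

    `scores` is assumed to map emotion labels to confidence values in [0, 1],
    as an emotion recognition model might produce for one camera frame.
    Falls back to the neutral emoji when nothing was detected, when the best
    score is below `min_confidence`, or when the label is unknown.
    """
    if not scores:
        return EMOJI_FOR_EMOTION["neutral"]
    label, confidence = max(scores.items(), key=lambda item: item[1])
    if confidence < min_confidence or label not in EMOJI_FOR_EMOTION:
        return EMOJI_FOR_EMOTION["neutral"]
    return EMOJI_FOR_EMOTION[label]
```

Falling back to a neutral emoji for low-confidence detections is one possible design choice; it avoids attributing a strong emotion to the user on the basis of an uncertain reading.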
5 Auto Emojis as a Natural Way of Emotion Expression
One possible approach to transfer natural communication elements is to utilize artificial intelligence to map the user’s facial expression to corresponding emojis in real time. We operationalized this idea in a first prototype of a chat application called Chat42 (see Figure 4). Chat42 consists of two main features: Auto Emojis and Auto Emoji Read Receipts. With the Auto Emoji system, the user has only one emoji to choose from, which mirrors his current facial expression. Auto Emojis are sent voluntarily. The idea is that Auto Emojis may lead to an increase in the perceived authenticity of emojis as well as to an increase in empathy on the part of the recipient of the message. The Auto Emoji Read Receipt system, on the other hand, captures the recipient’s reaction to a message (meaning the dominant facial expression while the user is reading the message) and sends it as an emoji read receipt to the sender of the message, regardless of the consent of the recipient (whose reaction is sent to the sender). Therefore, Auto Emoji Read Receipts are sent involuntarily. This mirrors the natural process of human emotion expression, where emotional information is often processed unconsciously and facial expressions can be elicited as an emotional reaction without the person being aware of it. This is why people sometimes show facial reactions, like smiling at their phone, while reading non-interactive content such as text messages, without addressing an immediate recipient. The Auto Emoji Read Receipt system aims to catch that reaction in order to give the sender of a message an authentic impression of the recipient’s emotional state, and thereby aims to increase the empathy of the receiver of the read receipt. A first evaluation of Chat42 over the course of three weeks in an exploratory randomized field study revealed promising results. See Quintes (2019) for a more detailed presentation of the prototype and evaluation study.
As intended, the systems Auto Emoji and Auto Emoji Read Receipt significantly increased the perceived authenticity of emojis as well as the perceived empathy towards the chat partner. However, more elaborate studies with larger samples are needed to get a more comprehensive picture and to confirm the trends identified in this study.
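As a rough illustration of what “the dominant facial expression while the user is reading the message” could mean computationally, the sketch below picks the most frequent expression label from a series of per-frame detections. This is an assumption made for illustration and not a description of how Chat42 determines the read receipt; the function name dominant_expression and the per-frame label format are hypothetical.

```python
from collections import Counter
from typing import Iterable, Optional


def dominant_expression(frame_labels: Iterable[str]) -> Optional[str]:
    """Return the most frequent expression label observed while reading.

    `frame_labels` is assumed to hold one detected expression label per
    analysed camera frame. Returns None if no frame was analysed. If several
    labels are equally frequent, the one that was observed first wins.
    """
    counts = Counter(frame_labels)
    if not counts:
        return None
    label, _count = counts.most_common(1)[0]
    return label
```

For example, a reading phase labelled neutral, happy, happy, surprised would yield happy as the read receipt reaction. A majority vote is only one option; weighting frames by detection confidence or by recency would be equally conceivable.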
6 Critical Discussion and Future Work
The presented features Auto Emoji and Auto Emoji Read Receipt can be extended in a variety of ways. Auto Emoji could be implemented as an add-on option to normal manual emojis instead of a stand-alone option. This would give users the opportunity to decide whether they want to use manual emojis or Auto Emojis. Another direction for further investigation would be to extend the Auto Emoji Read Receipt system to a sequence of reactions. In that case, not only a single reaction, but all reactions that were detected while the recipient read the message would be attached to a message. This would diversify the reaction and transport even more facets of emotional expression in mobile chat communication. However, independent of the particular features, a “Deep talk, real chat” application like Chat42 also raises several critical considerations around ethical perspectives and users’ acceptance. Chat42 deeply intervenes in the privacy of the user, since the user entrusts the system with the representation of his or her personal expression. The system takes responsibility not only for the transmission of the emotions, as in typical chat applications, but also for the authenticity of the emotions. Auto Emojis simplify the communication process and at the same time manage to convey authentic facial expressions. Still, a central question is whether authenticity is always what users want: on the one hand, Auto Emojis take away the burden of having to express one’s emotions. On the other hand, this comes at the cost of a loss of control. Given that digital communication is often instrumentalized to gain more control over the impression one makes on others than is possible in face-to-face communication, one may question in which situations and to what extent users actually prefer the ease of emotion exposure to emotion control.
Funding statement: This research has been funded by the German Federal Ministry of Education and Research (BMBF), project GINA (FKZ: 16SV8097).
About the authors
Cedric Quintes studied Computer Science (B. Sc.) and Human-Computer-Interaction (M. Sc.). He is currently working as a researcher at the Institute of Psychology at the Ludwig-Maximilians-University in Munich, Germany in the area of human-robot-interaction.
Daniel Ullrich is a post-doctoral researcher at the Institute of media informatics at the Ludwig-Maximilians-University in Munich, Germany. His research interests are in the area of human-robot-interaction, the design and evaluation of interactive systems, and the effects of digital media for society and well-being.
References
Alford, H. (2012). Would it kill you to stop doing that?: a modern guide to manners. New York: Twelve.
Balconi, M., & Mazza, G. (2009). Consciousness and emotion: ERP modulation and attentive vs. pre-attentive elaboration of emotional facial expressions by backward masking. Motivation and Emotion, 33(2), 113–124. https://doi.org/10.1007/s11031-009-9122-8
Beaver, L. (2017). Here’s how millennials are impacting the future of communication. Retrieved from https://www.businessinsider.de/heres-how-millennials-are-impacting-the-future-of-communication-2017-1
Berscheid, E. S. (2006). Review of Silent Messages: Implicit Communication of Emotions and Attitudes. 2nd ed. Contemporary Psychology: A Journal of Reviews, 26, 648. https://doi.org/10.1037/020475
Blabst, N., & Diefenbach, S. (2017, July 1). WhatsApp and Wellbeing: A study on WhatsApp usage, communication quality and stress. https://doi.org/10.14236/ewic/HCI2017.85
Colbert, A., Yee, N., & George, G. (2016). The Digital Workforce and the Workplace of the Future. Academy of Management Journal, 59(3), 731–739. https://doi.org/10.5465/amj.2016.4003
Daft, R. L., & Lengel, R. H. (1984). Information richness: A new approach to managerial behavior and organizational design. Research in Organizational Behavior, 6, 191–233. https://doi.org/10.21236/ADA128980
Dana, & Gavril. (2019). Upstanders and the emotional effect of the haunting blue ticks. 76–84.
Diefenbach, S., & Ullrich, D. (2019). Disrespectful Technologies: Social Norm Conflicts in Digital Worlds. https://doi.org/10.1007/978-3-319-94947-5_5
Dimberg, U., Thunberg, M., & Elmehed, K. (2000). Unconscious Facial Reactions to Emotional Facial Expressions. Psychological Science, 11(1), 86–89. https://doi.org/10.1111/1467-9280.00221
Joyce, G. (2019). The Most Popular Emojis. Retrieved August 5, 2019, from https://www.brandwatch.com/blog/the-most-popular-emojis/
Martin, C. (2018). One In 10 Millennials Would Rather Lose A Finger Than Give Up Their Smartphone: Survey. Retrieved August 5, 2019, from https://www.mediapost.com/publications/article/322677/one-in-10-millennials-would-rather-lose-a-finger-t.html
Mehrabian, A. (1981). Silent messages – implicit communication of emotions and attitudes (2nd ed.). Belmont, California.
O’Sullivan, B. (2000). What you don’t know won’t hurt me: impression management functions of communication channels in relationships. Human Communication Research, 26(3), 403–431. https://doi.org/10.1093/hcr/26.3.403
Quintes, C. (2019). Outsourcing Emotions – An Introduction of AI-Based Emojis to Mobile Chat Communication. Ludwig-Maximilians-Universität München (Master Thesis).
Regenbogen, C., & Habel, U. (2015). Facial Expressions in Empathy Research. In Understanding Facial Expressions in Communication (pp. 101–117). https://doi.org/10.1007/978-81-322-1934-7_6
Riordan, M. A., & Kreuz, R. J. (2010). Cues in computer-mediated communication: A corpus analysis. Computers in Human Behavior, 26(6), 1806–1817. https://doi.org/10.1016/j.chb.2010.07.008
Trevino, L. K., Lengel, R. H., & Daft, R. L. (1987). Media Symbolism, Media Richness, and Media Choice in Organizations. Communication Research, 14(5), 553–574. https://doi.org/10.1177/009365087014005006
Turkle, S. (2017). Alone Together: Why We Expect More from Technology and Less from Each Other. Journal of the American Society for Information Science and Technology.
Yang, Y.-H., & Yeh, S.-L. (2018). Unconscious processing of facial expression as revealed by affective priming under continuous flash suppression. Psychonomic Bulletin & Review, 25(6), 2215–2223. https://doi.org/10.3758/s13423-018-1437-6
© 2019 Walter de Gruyter GmbH, Berlin/Boston