BY 4.0 license Open Access Published by De Gruyter January 12, 2022

Deep learning approach to text analysis for human emotion detection from big data

  • Jia Guo


Emotion recognition has emerged as an essential field of study that can yield a variety of valuable insights. Emotion can be expressed in several observable ways, such as speech, facial expressions, written text, and gestures. Emotion recognition in a text document is fundamentally a content-based classification problem that draws on notions from natural language processing (NLP) and deep learning. Hence, in this study, deep learning assisted semantic text analysis (DLSTA) is proposed for human emotion detection using big data. Emotion detection from textual sources can be performed using NLP concepts. Word embeddings are extensively used for several NLP tasks, such as machine translation, sentiment analysis, and question answering. NLP techniques improve the performance of learning-based methods by incorporating the semantic and syntactic features of the text. The numerical outcomes demonstrate that the suggested method achieves a markedly higher human emotion detection rate of 97.22% and a classification accuracy of 98.02% compared with different state-of-the-art methods, and that it can be enhanced by other emotional word embeddings.

1 Introduction to text analysis for human emotion detection

Emotion can be conveyed in several forms, such as facial expressions and gestures, voice, and written language [1]. Emotion recognition in text documents is a content-based classification problem that builds on principles derived from deep learning. Human emotions play an important role in day-to-day life [2]. Emotion can generally be understood as an intuitive feeling that differs from thought or knowledge. Emotion influences an individual's ability to assess different circumstances and to control the response to stimuli [3]. Emotion recognition is used in many fields, such as medicine, law, advertising, and e-learning [4].

Emotional expression is further considered an important aspect of developed human communication [5]. Beyond human interaction, emotion detection systems support psychosocial interventions and help identify criminal motivations [6]. A person's emotions can be conveyed through voice, facial appearance, and writing, referred to as speech, facial, and text emotion, respectively. Considerable effort has been devoted to recognizing speech and facial emotion; however, text-based emotion detection still needs to attract more research attention [7]. From a data analysis perspective, identifying human emotions in documents is becoming increasingly valuable in language modeling [8]. The emotions commonly considered include joy, sorrow, anger, delight, hate, and fear. Although there is no standard definition of the term emotion, emotional research receives particular emphasis in cognitive science [9].

An emotional state is often associated with a conscious arousal of feelings, triggered either subjectively or by environmental factors. Emotional responses such as pleasure, sadness, terror, anger, and surprise are thus deduced from people's private perceptions and their immediate environment [10]. Emotions play an essential role in every aspect of people's lives. Feelings can be sensed from various input types, such as words, short sentences, facial expressions, films, long messages, text, and emoticons. These input types differ from application to application [11].

Social networking sites generate large volumes of textual and audio data that carry significant information and play an increasingly important role in emotional understanding [12]. Reliable cognitive technologies are the foundation of emotional communication between humans and computers. Extracting emotion from media is a major challenge in improving interaction between humans and machines [13]. Textual opinion analysis of social media, including microblogs, has again attracted general interest, and several related research studies have been carried out [14]. However, the emotional information contained in a document is limited, and identifying domain-specific words is subject to various constraints [15]. As audio input on media platforms grows, a single modality is no longer sufficient for an emotion identification system to determine the correct emotions [16]. In textual sentiment classification, a system can hardly determine the emotions conveyed in an interaction by relying only on terms, expressions, words, and dependencies. Because text and voice are intrinsically related, modal fusion and emotion identification through NLP can improve the output of social networks [17]. The actual emotional state should be determined by analyzing both speech and text emotion.

The identification of emotions is one of the core tasks in NLP. Emotions are expressed through different communication modes, including voice, facial expression, and biological signals. Text messaging is now probably the most common mode of communication. Text messages have many uses, and they are among the most important texts in which emotions need to be understood efficiently. An intelligent chat agent on Twitter could understand the user's feelings and give more sensitive and human-like responses. If a device can discern emotions from the message text, it can generate natural speech in text-to-speech synthesis [18].

Emotions are an important factor in detecting human activity and have multiple applications in text messages published by users. Text analysis of human emotion is useful for information retrieval and human-computer interaction. Deep learning assisted semantic text analysis has been used to detect human emotions from big data [19]. Emotion detection from textual sources can be carried out using natural language processing concepts [20]. Word embeddings are widely used for many NLP tasks, such as machine translation, sentiment analysis, and question answering. NLP techniques improve the performance of learning-based methods by incorporating the semantic and syntactic characteristics of the text. The main contributions of DLSTA are as follows:

  • DLSTA performs emotion analysis of textual sources using natural language processing concepts. Word embeddings are commonly used for several NLP tasks, including machine translation, sentiment analysis, and question answering.

  • DLSTA is modeled with NLP methods that improve learning efficiency by integrating the semantic and syntactic characteristics of the text.

  • Numerical experiments have been carried out, and the suggested DLSTA model improves the prediction, classification accuracy, detection, precision, performance, and recall ratios compared to other existing approaches.

The remainder of the article is organized as follows: Section 2 reviews background studies on human emotion detection. Section 3 elaborates the proposed DLSTA model for human emotion detection using big data. Section 4 presents the results that validate the performance, with corresponding descriptions. Finally, the conclusion and future perspectives are discussed in Section 5.

2 Background study on human emotion detection

This section discusses several works carried out by various researchers. Zhong et al. [21] developed the Knowledge-Enriched Transformer (KET) model. In KET, contextual utterances are interpreted using hierarchical self-attention, while external knowledge is dynamically leveraged using a context-aware affective graph attention mechanism. Experiments on several textual data sets reveal that both context and commonsense knowledge consistently contribute to emotion detection performance.

Gaind et al. [22] proposed Emotion Detection and Analysis (EDA). EDA provides a way of classifying text into six types of emotion: happiness, sadness, fear, anger, surprise, and disgust. EDA uses two methods and merges them to derive these emotions from texts effectively. The first method is based on natural language processing and uses different text characteristics such as emoticons, degree words and negations, parts of speech, and other grammatical analyses. The second is based on machine learning classification algorithms. EDA also provides an automated way of building the training set, which eliminates the need for manual annotation of big datasets.

Shrivastava et al. [7] discussed the Sequence-Based Convolutional Neural Network (SB-CNN). SB-CNN applies sequence-based convolution to word embeddings for emotion recognition. The suggested model implements an attention mechanism that permits the CNN to concentrate on terms that have a larger influence on the classification, or on the parts of the features that require more attention. The key goal of the work is to build a framework that mines recently gathered data for users' opinions and tracks social media in order to understand the public sentiment behind given topics.

Sailunaz and Alhajj [23] proposed the Emotion and Sentiment Analysis (ESA) model. ESA detects and analyzes the sentiment and emotion that people express in the text of their Twitter posts and uses them to generate recommendations. ESA collected tweets and replies on a few particular topics and generated a dataset of e-mail, users, sentiments, feelings, etc. The authors used this collection of tweets and replies to detect sentiment and emotion and assessed the influence of users based on different user-based and tweet-based metrics.

Ghosh et al. [24] introduced the Touch Interactions Model (TIM). TIM exploits various characteristics of touch interactions on a mobile device, leading to a personalized model of user emotion. It is important to differentiate between typing and swiping behaviors in order to record the correct characteristics. The ground truth labels for user emotions are obtained directly from the users by collecting self-reports daily. The features feed a personalized machine learning model that detects four emotional states (happy, sad, stressed, relaxed).

Jena [25] developed a collaborative learning environment (CLE). CLE attempted to analyze academic data using several effective machine learning techniques. The contribution of CLE is twofold: (i) investigating the sentiment polarity of student data using machine learning, and (ii) analyzing and predicting the emotions of students using big-data systems. The CLE technologies can be scaled using big data frameworks and adapted to enhance value extraction for students, faculty, and other interested parties, given the variety, velocity, and veracity of the data.

Based on this survey, DLSTA with deep learning is proposed to detect human emotions using big data. Emotion analysis of textual sources can be carried out using natural language processing concepts. NLP techniques improve the effectiveness of learning-based methods by integrating semantic and syntactic text characteristics.

3 Deep learning assisted semantic text analysis (DLSTA)

Detecting a person's emotional state by analyzing their written text is challenging. Identifying the emotions in text plays a vital role in human-computer interaction (HCI). A person's emotions can be conveyed through speech, facial expressions, and written text, referred to as speech, facial, and text-based emotions, respectively. Considerable work has been performed on facial and speech emotion detection, whereas text-based emotion recognition still needs to attract researchers. From an application perspective, identifying human emotions in text is becoming increasingly significant in computational linguistics. Text emotion detection aims to discover the emotions of the writer by analyzing the input text. This is based on the assumption that a happy person will use positive words. Conversely, the words used by a person who is stressed, depressed, or frustrated may reveal underlying negative feelings. Emotion recognition in text is important because text is the primary medium of human-computer interaction in e-mails, text messages, chat rooms, forums, web blogs, product reviews, and social media platforms such as YouTube, Twitter, and Facebook. Emotion recognition can be applied in business, psychology, education, and many other areas in which feelings need to be understood and interpreted. Sentiment analysis is an NLP field that has proven its significance through the results it generates for user profiling. In particular, sentiment analysis is generally linked with opinion mining, where the objective is to determine a polarity (negative, neutral, positive) for every relevant aspect of a sentence. In real-time applications, the requirement is to go beyond polarity and determine a finer granularity for the state of mind expressed by users. The literature offers diverse emotional models, which differ in their specificity and granularity depending on the application field. However, recognizing various emotions from a short sentence is still a challenging task.
Every user has his or her own behavioral model, which can diverge from the normal model; the use of emotion in personalized systems is a well-established practice, and various works have confirmed its significance. Hence, in this paper, the DLSTA model is proposed for human emotion detection using big data. Word embeddings are commonly used in NLP applications because the vector representations of words, learned using deep learning methods, capture useful semantic components and linguistic associations among words. Word embeddings are frequently used as feature input to machine learning (ML) models, allowing ML methods to process raw text information.

DLSTA performs emotion analysis of textual sources using natural language processing concepts. Word embeddings have been frequently used in many NLP tasks, such as machine translation, sentiment analysis, and question answering. DLSTA is designed using NLP approaches that increase the efficiency of learning by integrating the semantic and syntactic features of the text. Numerical experiments were conducted, and compared to other current techniques, the proposed DLSTA model improves the prediction, classification accuracy, detection, precision, performance, and recall ratios. Figure 1 shows the proposed DLSTA model. In this work, the results of text analysis and questionnaire-based methods are analyzed to identify a person's emotional state. Features are extracted separately from the text analysis and the questionnaire-based methods. Subsequently, the features obtained from these two methods are pooled to produce the final feature vectors. These feature vectors are fed to a support vector machine-based platform to identify a person's emotional state. Finally, to improve the system's performance, the probability scores of the support vector machines are combined using NLP. Pre-processing is carried out on the gathered data for both the training and testing text datasets. If the word "not" occurs with a verb, adjective, or adverb, it is merged with that word for further consideration; otherwise, the negation is removed, as it will not affect the emotion of the sentence. The fundamental emotions are the only main features, since the text contains the fundamental emotions, and their values are the probabilities of each emotional state in the sentence. These fundamental emotions are sadness, joy, anger, fear, and disgust. Numerous characteristics/features relate to a specific emotion. The emotion strings are mapped to numerical values based on the data gathering format.
Emotion detection is carried out by extracting emotional keywords from the text. These keywords are matched against a knowledge base or a vocabulary such as a thesaurus to discover emotional expressions.
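
The negation handling and keyword matching described above can be illustrated with a minimal Python sketch. The lexicon `EMOTION_LEXICON`, the word set `MERGEABLE` (a stand-in for a real part-of-speech tagger), and both function names are hypothetical and introduced only for illustration; the article does not specify the actual resources used.

```python
# Hypothetical toy lexicon; a real system would use a curated knowledge base.
EMOTION_LEXICON = {
    "happy": "joy",
    "delighted": "joy",
    "sad": "sad",
    "furious": "anger",
    "afraid": "fear",
    "gross": "disgust",
}

# Stand-in for a POS tagger: words treated as verbs, adjectives, or adverbs.
MERGEABLE = {"happy", "sad", "afraid", "like", "well"}


def merge_negations(tokens):
    """Merge 'not' with a following verb/adjective/adverb; drop it otherwise."""
    merged = []
    i = 0
    while i < len(tokens):
        if tokens[i] == "not":
            if i + 1 < len(tokens) and tokens[i + 1] in MERGEABLE:
                merged.append("not_" + tokens[i + 1])
                i += 2
            else:
                i += 1  # negation attaches to nothing relevant: remove it
            continue
        merged.append(tokens[i])
        i += 1
    return merged


def detect_emotions(tokens):
    """Return the emotions whose keywords appear as (un-negated) tokens."""
    return sorted({EMOTION_LEXICON[t] for t in tokens if t in EMOTION_LEXICON})
```

Because "not happy" becomes the single token `not_happy`, it no longer matches the keyword "happy", which is one simple way to keep a negated word from being counted as the emotion it names.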

Figure 1: Proposed DLSTA model.

The features are extracted independently from the text analysis and from the questionnaire-based methods. The features from these two approaches are subsequently combined to generate the final feature vectors. These feature vectors are passed to a support vector machine-based platform to identify the emotional state of the individual. Finally, the probability scores of the support vector machines are combined using NLP to increase the system's performance.

Deep learning enables the system to understand the semantics and structure of sentences and the interdependencies within them. An emotion dataset is first built and tagged. This tagged dataset is then fed to the neural network, which is trained on it to improve accuracy and to handle new data. There are different options for the training model, such as a recurrent neural network or a convolutional neural network. After the neural network is trained, analytic reports are produced, and training continues until the desired accuracy is attained. Before the algorithms are applied to the input, the text is pre-processed. This converts the raw input into another format that is easy and efficient to process. There are different pre-processing approaches: cleaning, which deals with stop words, punctuation, capitalization, repeated letters, etc.; annotation, in which the tokens are marked up with their part of speech; standardization, in which the input is arranged for effective access; and feature extraction, in which the features that are valuable for a specific task or application are extracted.
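
As a rough illustration of the cleaning step, the following Python sketch lowercases the text, collapses repeated letters, strips punctuation, and removes stop words. The stop-word set and the rule of collapsing runs of three or more characters to two are assumptions made for this example, not details given in the article.

```python
import re
import string

# Illustrative subset only; a real pipeline would use a full stop-word list.
STOP_WORDS = {"a", "an", "the", "is", "am", "are", "i", "to", "of", "and"}


def clean_text(text):
    """Return the cleaned tokens of a raw input string."""
    text = text.lower()                                  # capitalization
    text = re.sub(r"(.)\1{2,}", r"\1\1", text)           # "soooo" -> "soo"
    text = text.translate(str.maketrans("", "", string.punctuation))
    return [tok for tok in text.split() if tok not in STOP_WORDS]
```

For example, `clean_text("I am soooo HAPPY!!!")` yields the tokens `soo` and `happy`; annotation and standardization would then operate on this cleaned token list.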

Figure 2 shows text classification using NLP. Human emotion recognition in text is a vital natural language processing (NLP) task whose solution can benefit numerous applications in diverse fields, including e-learning, data mining, human-computer interaction, information filtering systems, and psychology. NLP techniques have been used to extract syntactic and semantic features. In this method, pre-trained neural networks generate word embeddings that are used as features in NLP models. This paper identifies the sets of features that lead to the best-performing methods; highlights the influence of simple NLP tasks, such as parsing and part-of-speech tagging, on the performance of these methods; and points out some open issues.

Figure 2: Text classification using NLP.

Emotion recognition has emerged as a vital topic of study that can reveal a range of relevant insights. There are various ways of expressing emotions, such as voice, facial expressions, written language, and gestures. The identification of emotions in a written document is essentially a matter of content categorization that incorporates ideas from natural language processing and deep learning. Therefore, human emotion identification with DLSTA has been proposed. Textual sources may be used to detect emotions using NLP concepts.

Figure 3 shows the NLP pipeline used for automatic text classification. Pre-processing starts with extracting the text of each abstract and automatically cleaning it of probable encoding errors. The text is then segmented into sentences and tokenized into words. Documents are often supplemented with metadata that captures additional descriptive classification information about them. Part-of-speech (POS) tagging is the process of labeling every word in the text with a lexical category label, such as verb, adjective, or noun. These labels are required in the subsequent phases of the pipeline. Named entities are then identified and extracted. Dependency parsing extracts a syntactic structure (tree) that encodes the grammatical dependency relationships among the words in a sentence. For instance, direct object, indirect object, and non-clausal subject relationships in the parsed output are taken into account together with their head and dependent words. Lemmatization produces the lemmatized bag of words (LBOW) feature. A bag of words (BOW) records, for every word that occurs in the corpus, whether or not it appears in a given abstract. The n-gram model extracts noun compound bigrams that represent a concept in the text. Verb class clustering groups semantically similar predicates together. During feature selection, features that are too common or too rare in the annotated corpus are removed so that the classifiers use only the most discriminative features. The thresholds are set for every node by trial and error; normally the lowest threshold value of occurrences is chosen, while the high threshold varies considerably depending on the feature type.
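
The bag-of-words, bigram, and frequency-threshold steps of the pipeline can be sketched as follows. This is only an illustration under simplifying assumptions: tokens are taken as already lemmatized, all adjacent bigrams are kept (not only noun compounds), and the threshold parameters `min_count` and `max_count` are hypothetical names.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return the list of n-grams (joined by spaces) of a token list."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def build_vocabulary(docs, min_count=2, max_count=None):
    """Keep features whose document frequency lies in [min_count, max_count]."""
    doc_freq = Counter()
    for tokens in docs:
        doc_freq.update(set(tokens + ngrams(tokens, 2)))
    return sorted(
        feat for feat, count in doc_freq.items()
        if count >= min_count and (max_count is None or count <= max_count)
    )


def binary_bow(tokens, vocabulary):
    """Binary vector: 1 if the vocabulary feature occurs in the document."""
    present = set(tokens + ngrams(tokens, 2))
    return [1 if feat in present else 0 for feat in vocabulary]
```

With three toy documents, a feature such as "feel" that occurs in every document can be dropped by the upper threshold, while features seen only once are dropped by the lower one, leaving the mid-frequency features for the classifier.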

Figure 3: NLP pipeline.

Figure 4 shows the word embedding model. The proposed model attempts to predict the original value of the masked words based on the context given by the other, non-masked words in the sequence. In practice, predicting the output words requires adding a classification layer on top of the encoder outputs, multiplying the output vectors by the embedding matrix to transform them into the vocabulary dimension, and computing the probability of every word in the vocabulary with softmax. The loss function considers only the prediction of the masked values and disregards the prediction of the non-masked words.
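
A minimal sketch of the masked-word prediction step is given below, using plain Python lists. It assumes that the encoder outputs are already available and treats the classification layer as the identity, so only the projection onto the vocabulary, the softmax, and the masked-only loss are shown; all function and parameter names are illustrative.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of real scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def vocab_probabilities(hidden, embedding_matrix):
    """Project one encoder output vector onto the vocabulary and normalize."""
    logits = [sum(h * e for h, e in zip(hidden, row)) for row in embedding_matrix]
    return softmax(logits)


def masked_loss(hidden_states, embedding_matrix, target_ids, masked_positions):
    """Mean cross-entropy computed over the masked positions only."""
    losses = []
    for pos in masked_positions:
        probs = vocab_probabilities(hidden_states[pos], embedding_matrix)
        losses.append(-math.log(probs[target_ids[pos]]))
    return sum(losses) / len(losses)
```

Positions that are not listed in `masked_positions` never enter the loss, which mirrors the statement that the predictions for non-masked words are disregarded.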

Figure 4: Word embedding model.

This section describes the two classifiers and an ensemble technique that pools their outputs. The two classifiers are based on different document representations. Depending on the dataset used, the emotion classification task can be framed as a multiclass or a multilabel problem. For both types of problem, this study uses a one-vs-rest support vector machine classifier. Thus, given a test sample, the classifier outputs a decision function value for every emotion present in the training data. The class assigned to the test sample is then taken to be the emotion with the maximum decision function value (for multiclass) or the set of emotions with positive decision function values (for multilabel).
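
The two decision rules can be written down in a few lines of Python. The sketch assumes that the one-vs-rest classifier has already produced one decision value per emotion, passed here as a dictionary; the function names are illustrative.

```python
def predict_multiclass(decision_values):
    """Return the single emotion with the maximum decision function value."""
    return max(decision_values, key=decision_values.get)


def predict_multilabel(decision_values):
    """Return every emotion whose decision function value is positive."""
    return sorted(e for e, v in decision_values.items() if v > 0)
```

For decision values of 0.8 for joy, -0.3 for anger, and 0.1 for fear, the multiclass rule selects joy alone, whereas the multilabel rule returns both fear and joy.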

The first method uses a support vector machine classifier with a linear kernel and represents each document as a bag of words. Various n-gram (after lemmatization), social media, and punctuation features have been extracted. Specifically, these are unigrams, bigrams, NRC lexicon features (the number of terms in a post associated with every emotion label in the NRC lexicon), and the presence of question marks, exclamation marks, links, user names, sad emoticons, and happy emoticons.

Word embedding-based vectors can be combined to represent documents as fixed-size vectors. This study experimented with several document representations that combine word vectors according to the following notation. Low constant weights are assigned to words that do not appear in the training data, while the weights of the remaining words are proportional to their discriminative power. This approach captures the idea that the more discriminative terms of a document should count more in its embedded representation, since they provide more information for the classification task. As before, the classifier outputs a decision value for every emotion in the training data, and the sample is assigned the emotion with the highest decision value (for multiclass) or the set of emotions with positive decision values (for multilabel).

(1) $c_{we}(t_1, \ldots, t_l) = \frac{1}{\sum_{j=1}^{l} b_j} \sum_{j=1}^{l} b_j \, u(t_j).$

In equation (1), $c_{we}$ denotes the word embedding-based vector representation of a document $c$ with $l$ terms, $u$ denotes the pre-trained word-to-vector map, and $b_j$ is a weight specifying the relative importance of the term $t_j$. The document representations that this study experimented with include the continuous bag of words and classifier weights. In the classifier weights method, a weight function $w(t, e)$ is computed for every term $t$ in the training data, denoting its importance in classifying documents as expressing emotion $e$. The documents are first represented as binary bag-of-words vectors built from unigram features. Then, for every emotion $e$, a support vector machine model with a linear kernel is trained, and $n(e, t)$ is taken to be the weight that the model associates with term $t$ in the training data.
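
Equation (1) is a weighted average of word vectors, which the following Python sketch makes concrete. The dictionaries `embeddings` (the map $u$) and `weights` (the values $b_j$) are hypothetical inputs, and skipping terms without a pre-trained vector is an assumption of this example rather than something stated in the article.

```python
def document_vector(terms, embeddings, weights):
    """Weighted average of word vectors for one document, as in equation (1)."""
    dim = len(next(iter(embeddings.values())))
    total = [0.0] * dim
    weight_sum = 0.0
    for term in terms:
        if term not in embeddings:
            continue  # assumption: terms without a pre-trained vector are skipped
        b = weights.get(term, 1.0)  # default weight 1 when none is given
        weight_sum += b
        for k in range(dim):
            total[k] += b * embeddings[term][k]
    if weight_sum == 0.0:
        return total  # no known terms: return the zero vector
    return [x / weight_sum for x in total]
```

With all weights equal to 1, the function reduces to the plain average of the word vectors; larger weights pull the document vector toward the corresponding terms.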

(2) $w(t, e) = \frac{n(e, t) - \mu_e}{\sigma_e}.$

In equation (2), $\mu_e$ and $\sigma_e$ are, respectively, the mean and standard deviation of the model weights in absolute value.

(3) $b_j = \begin{cases} 1, & t_j \notin U \ \text{or} \ w(t_j, e) < 1, \\ w(t_j, e), & \text{otherwise}. \end{cases}$

In equation (3), $U$ represents the vocabulary produced from the training data. A low constant weight is allocated to terms that do not appear in the training data or that have little discriminative power. For the other terms, the weights are proportional to the discriminative power of the word. This approach captures the idea that the more discriminative terms of a document should count more in its embedded representation, since they provide more information for the classification task. Note that, in this approach, a different document representation is used for each emotion $e$, since the discriminative power of each word (weight $b_j$) differs for every emotion.
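
The weighting scheme of equations (2) and (3) can be sketched in Python as follows. The sketch standardizes each linear SVM weight with the mean and standard deviation of the absolute weights, and then applies the threshold of 1. The use of the population standard deviation and the function names are assumptions made for illustration, and the sketch expects at least two distinct weight magnitudes so that the standard deviation is non-zero.

```python
import statistics


def standardized_weights(model_weights):
    """Equation (2): standardize the linear-SVM weights n(e, t) of one emotion e.

    The mean and standard deviation are taken over the absolute weights.
    Assumption: population standard deviation.
    """
    magnitudes = [abs(v) for v in model_weights.values()]
    mu = statistics.mean(magnitudes)
    sigma = statistics.pstdev(magnitudes)
    return {t: (n - mu) / sigma for t, n in model_weights.items()}


def term_weight(term, w, vocabulary):
    """Equation (3): weight b_j of a term for one emotion."""
    if term not in vocabulary or w[term] < 1:
        return 1.0  # unseen or weakly discriminative term
    return w[term]
```

Only terms whose standardized weight reaches at least 1 receive more than the constant weight, so a few strongly discriminative terms dominate the document vector for a given emotion.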

Ensembles tend to achieve good results when there is significant diversity among the classifiers. As a preliminary step, the decision function values output by the above classifiers were converted into probabilities, using the softmax transformation for multiclass problems and the sigmoid transformation for multilabel problems. The ensemble approaches that this study experimented with follow the formulation:

(4) $n_{en}(c) = \beta \, n_{bow}(c) + (1 - \beta) \, n_{we}(c).$

In equation (4), $n_{en}(c)$ denotes the output probability vector of the ensemble classifier given a test document $c$, $n_{bow}(c)$ and $n_{we}(c)$ denote the output probability vectors of the bag-of-words classifier and the word embedding-based classifier, respectively, and $\beta$ is a parameter that depends on the particular ensemble approach used. This study tested the following weighted-average approaches: equal weights ($\beta = 0.5$), stacking ($\beta$ is learned by an additional classifier), and accuracy-based weighting ($\beta$ reflects the ratio between the macro accuracy scores of the two classifiers). In this set-up, accuracy-based weighting achieved good performance; therefore, the results are reported using this scheme.
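
The conversion of decision values into probabilities and the weighted average of equation (4) can be sketched as follows. The formula used in `accuracy_based_beta` (the share of the bag-of-words classifier in the sum of the two accuracies) is only one plausible reading of "ratio between the macro accuracy scores" and is an assumption of this example.

```python
import math


def softmax(values):
    """Multiclass case: turn decision values into a probability vector."""
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(values):
    """Multilabel case: turn each decision value into an independent probability."""
    return [1.0 / (1.0 + math.exp(-v)) for v in values]


def ensemble(p_bow, p_we, beta=0.5):
    """Equation (4): weighted average of the two probability vectors."""
    return [beta * a + (1.0 - beta) * b for a, b in zip(p_bow, p_we)]


def accuracy_based_beta(acc_bow, acc_we):
    """Assumed form of accuracy-based weighting (not specified in the article)."""
    return acc_bow / (acc_bow + acc_we)
```

Setting `beta=0.5` reproduces the equal-weights variant, while `accuracy_based_beta` shifts the average toward whichever classifier scored better on held-out data.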

The proposed model uses word2vec, as it has been shown that word2vec produces better word embeddings for most common natural language processing (NLP) tasks than other methods. Since there is no evidence that the continuous bag of words architecture outperforms the skip-gram architecture or vice versa, this paper arbitrarily selected the skip-gram architecture for word2vec. A word embedding can be denoted as a map $U \to \mathbb{R}^{C} : \omega \mapsto \theta$, which maps a word $\omega$ from a vocabulary $U$ to a real-valued vector $\theta$ in an embedding space of dimension $C$. The skip-gram architecture uses the target word as the single input layer and the contextual word as the output prediction layer. To avoid costly computation over every word in $U$, negative sampling samples a few output words and updates the embeddings for this small sample in every iteration. The model is formulated mathematically as follows. Given a sequence of target words $\omega_1, \omega_2, \ldots, \omega_T$ and their contextual words $g_1, g_2, \ldots, g_T$, the training objective is to maximize the conditional log-likelihood of observing the actual output contextual words given the input target words,

(5) $\max I = \max \frac{1}{T} \sum_{t=1}^{T} \log Q(g_t \mid \omega_t).$

In equation (5), $I$ denotes the objective function, and $Q(g \mid \omega)$ is the conditional probability in the neural probabilistic language model. $Q(g \mid \omega)$ is typically given by

(6) $Q(g \mid \omega) = \frac{e^{{\theta'_{g}}^{\top} \theta_{\omega}}}{\sum_{g' \in U} e^{{\theta'_{g'}}^{\top} \theta_{\omega}}}.$

In equation (6), $\theta'$ and $\theta$ are the output and input word embeddings, respectively. Consequently, the log-likelihood can be written as

(7) $\log Q(g \mid \omega) = {\theta'_{g}}^{\top} \theta_{\omega} - \log \sum_{g' \in U} e^{{\theta'_{g'}}^{\top} \theta_{\omega}}.$
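
As a small illustration of the full-softmax log-likelihood, the sketch below scores one context word against every word in the vocabulary. The dictionaries `in_emb` and `out_emb` stand for the input and output embeddings and are hypothetical toy inputs; the log-sum-exp shift by the maximum score is added only for numerical stability.

```python
import math


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def log_prob_context(context, target, in_emb, out_emb):
    """log Q(context | target) under the full softmax over the vocabulary."""
    scores = {g: dot(vec, in_emb[target]) for g, vec in out_emb.items()}
    m = max(scores.values())
    log_norm = m + math.log(sum(math.exp(s - m) for s in scores.values()))
    return scores[context] - log_norm
```

The normalization term loops over all output embeddings, which is exactly the per-iteration cost over the vocabulary that negative sampling is meant to avoid.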

The derivative of $I$ can be taken to learn the embeddings by updating them iteratively. However, the computation is extremely costly, since in every iteration the algorithm needs to go through the whole vocabulary $U$. Using negative sampling, an empirical log-likelihood $\hat{Q}(g \mid \omega)$ is used to approximate $\log Q(g \mid \omega)$:

(8) $\hat{Q}(g \mid \omega) = \log \rho({\theta'_{g}}^{\top} \theta_{\omega}) + \sum_{j=1}^{l} \mathbb{E}_{g_j \sim Q_m(g)} \left[ \log \rho(-{\theta'_{g_j}}^{\top} \theta_{\omega}) \right].$

In equation (8), $\rho(y) = 1 / (1 + \exp(-y))$ denotes the sigmoid function, which maps a real value to a probability, and $Q_m(g_j) = f(g_j)^{3/4} / \sum_{g_k \in U} f(g_k)^{3/4}$ is an empirical distribution from which $l$ negative samples are drawn, with $f(g_j)$ the term frequency of the term $g_j$. The word embeddings $\theta$ can be computed by maximizing the objective function in equation (5) with $\log Q(g \mid \omega)$ replaced by $\hat{Q}(g \mid \omega)$. The proposed DLSTA model enhances the prediction, classification accuracy, detection, precision, performance, and recall ratios compared to other existing methods.
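
The negative-sampling approximation can be sketched in Python as below, following the standard word2vec formulation (sigmoid of the positive score plus sigmoids of the negated scores of sampled words). The embedding dictionaries, the function names, and the use of `random.Random` for sampling are assumptions of this example, not details from the article.

```python
import math
import random


def sigmoid(y):
    return 1.0 / (1.0 + math.exp(-y))


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def noise_distribution(term_freq, power=0.75):
    """Unigram frequencies raised to the 3/4 power, then normalized."""
    scaled = {g: f ** power for g, f in term_freq.items()}
    total = sum(scaled.values())
    return {g: s / total for g, s in scaled.items()}


def draw_negatives(noise, l, rng):
    """Draw l negative context words from the noise distribution."""
    words = list(noise)
    return rng.choices(words, weights=[noise[w] for w in words], k=l)


def negative_sampling_objective(context, target, negatives, in_emb, out_emb):
    """Positive term plus one term per sampled negative word."""
    pos = math.log(sigmoid(dot(out_emb[context], in_emb[target])))
    neg = sum(
        math.log(sigmoid(-dot(out_emb[g], in_emb[target]))) for g in negatives
    )
    return pos + neg
```

Each evaluation touches only the true context word and the `l` sampled negatives, so its cost no longer depends on the vocabulary size.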

4 Results and discussion

DLSTA has been evaluated in terms of performance, accuracy, and detection. In the first group of experiments, the effect on emotion detection of various parameters of the word clustering approach is examined. In the second group, the emotion classification results obtained with various features and coefficients are compared. According to the text analysis, the detection results vary across emotions. Each row of the confusion matrix represents the actual class, and each column represents the predicted class. The confusion matrix is a visual tool that makes it possible to identify which types of emotion are confused with one another. The detection rate of DLSTA is shown in Figure 5.

Figure 5: The detection rate of DLSTA.

The correlation findings are then used to assess the various emotions, based on the observation that classification confidence is negatively correlated with misclassification. Thus, the error can be used as a consistency measure for predicting emotion based on text analysis. In each case, the data are divided into several classification confidence intervals, each covering a particular range. The number of correctly classified texts increases with growing classification confidence for each text. In comparison, the number of incorrectly labeled texts stays close to the predicted rate. The prediction rate of DLSTA is shown in Figure 6.

Figure 6: The prediction rate of DLSTA.

Excitement is easily mistaken for anger, whereas in the multimodal setting text and speech can be seen to be complementary. The precision for most types of emotion increases, and the ambiguity between emotions is reduced by integrating audio and text emotional features. This shows the feasibility of modal fusion. Experimental findings indicate that modal fusion can effectively reduce emotional confusion and enhance the emotion recognition rate. The precision rate of DLSTA is shown in Figure 7.

Figure 7: The precision rate of DLSTA.

The DLSTA method is used for human emotion detection based on text analysis. The recognition system trains seven text-based classifiers, one for each of the corresponding expressions, i.e., sadness, surprise, joy, anger, fear, disgust, and neutral. The prediction and detection rates of DLSTA are shown in Table 1. These variables were specifically chosen after experiments validating the mapped and transformed text. Overall, emotion detection through NLP provides a capability that saves a considerable amount of time.

Table 1

The prediction rate and the detection rate

| Emotion | Prediction (%) | Detection (%) |
| --- | --- | --- |
| Happy | 80.2 | 90.2 |
| Sad | 80.8 | 91.3 |
| Surprise | 80.9 | 94.5 |
| Disgust | 81.4 | 93.3 |
| Fear | 82.3 | 93.6 |
| Anger | 83.5 | 94.5 |
| Neutral | 81.2 | 98.2 |
| Average | 83.2 | 92.1 |
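For readers who wish to summarize the per-emotion rates themselves, the sketch below takes the unweighted mean of each column of Table 1. The paper does not state how its Average row was derived, so this is only one possible convention (a support-weighted mean would be another) and it need not reproduce that row. The variable and function names are illustrative.

```python
# Per-emotion rates copied from Table 1 (percent); the Average row is excluded.
table1 = {
    "Happy": (80.2, 90.2),
    "Sad": (80.8, 91.3),
    "Surprise": (80.9, 94.5),
    "Disgust": (81.4, 93.3),
    "Fear": (82.3, 93.6),
    "Anger": (83.5, 94.5),
    "Neutral": (81.2, 98.2),
}


def column_means(rows):
    """Return the unweighted mean of the prediction and detection columns."""
    count = len(rows)
    prediction = sum(p for p, _ in rows.values()) / count
    detection = sum(d for _, d in rows.values()) / count
    return prediction, detection


mean_prediction, mean_detection = column_means(table1)
print(f"prediction: {mean_prediction:.2f}  detection: {mean_detection:.2f}")
# prints "prediction: 81.47  detection: 93.66"
```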

Emotion recognition is the central task in this text analysis setting and is treated as multiclass classification. Accuracy, recall, and F1 were used to assess the quality of DLSTA. The per-class results of the expression classifier for each emotion category are combined into a macro average to evaluate its performance across all classes. The overall classification accuracy measures how well human emotion is detected by text analysis through NLP. The classification accuracy of DLSTA is shown in Figure 8.

Figure 8

The classification accuracy of DLSTA.
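The macro-averaged measures mentioned above can be computed as follows. This is a minimal sketch using the standard definitions of precision, recall, and F1; the function name `macro_scores` and the example labels are illustrative and do not come from the DLSTA evaluation.

```python
from collections import Counter


def macro_scores(y_true, y_pred):
    """Return accuracy plus macro-averaged precision, recall, and F1."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equally long")
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, guess in zip(y_true, y_pred):
        if truth == guess:
            tp[truth] += 1
        else:
            fp[guess] += 1  # the predicted class gains a false positive
            fn[truth] += 1  # the true class gains a false negative
    per_class = []
    for label in labels:
        p_den = tp[label] + fp[label]
        r_den = tp[label] + fn[label]
        precision = tp[label] / p_den if p_den else 0.0
        recall = tp[label] / r_den if r_den else 0.0
        pr_sum = precision + recall
        f1 = 2 * precision * recall / pr_sum if pr_sum else 0.0
        per_class.append((precision, recall, f1))
    n = len(labels)
    return {
        "accuracy": sum(tp.values()) / len(y_true),
        "precision": sum(p for p, _, _ in per_class) / n,
        "recall": sum(r for _, r, _ in per_class) / n,
        "f1": sum(f for _, _, f in per_class) / n,
    }


# Illustrative labels only.
y_true = ["joy", "joy", "anger", "fear"]
y_pred = ["joy", "anger", "anger", "fear"]
scores = macro_scores(y_true, y_pred)
print({name: round(value, 4) for name, value in scores.items()})
```

Macro averaging gives every emotion class the same weight regardless of how many texts it contains, so a rare class that is classified poorly lowers the score as much as a frequent one.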

The best values for describing the emotion in a text are estimated using recall and the F measure, and experiments with different scheme variants have been performed. The DLSTA system treats the characteristics of a word group as a single feature. All texts in a group are assigned to the different human emotions based on text analysis; the measurement function is zero. The recall and F measure of DLSTA are shown in Table 2. The overall classification accuracy is obtained from the recall and F measure of the different human emotions.

Table 2

The recall and the F measure

| Emotion | Recall rate (%) | F-measure (%) |
| --- | --- | --- |
| Happy | 84.2 | 90.2 |
| Sad | 84.3 | 91.3 |
| Surprise | 84.4 | 94.5 |
| Disgust | 86.3 | 93.3 |
| Fear | 81.2 | 93.6 |
| Anger | 86.3 | 94.5 |
| Neutral | 84.1 | 98.2 |
| Average | 85.5 | 92.1 |

When word clusters are used for text emotion detection, word classification is very important. The terms in a text are treated as carriers of emotional content. We grouped emotional words according to their type of expression and textual emotion. Content terms were preprocessed with NLP before clustering. Performance is evaluated on the text analysis used at the different stages of human emotion detection in the DLSTA method. The performance of DLSTA is shown in Figure 9.

Figure 9

The performance rate of DLSTA.
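The idea of grouping emotional words by emotion type can be illustrated with a simple lexicon lookup. This is a stand-in for illustration only and is not the deep learning model used by DLSTA; the toy lexicon, the function names, and the "neutral" fallback are all assumptions.

```python
import re
from collections import Counter

# Hypothetical toy lexicon: emotional words grouped by emotion type.
EMOTION_LEXICON = {
    "joy": {"happy", "delighted", "glad"},
    "sadness": {"sad", "unhappy", "miserable"},
    "anger": {"angry", "furious", "annoyed"},
    "fear": {"afraid", "scared", "terrified"},
}

# Reverse index from each word to its emotion group.
WORD_TO_EMOTION = {
    word: emotion for emotion, words in EMOTION_LEXICON.items() for word in words
}


def emotion_counts(text):
    """Count how many words of each emotion group occur in the text."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return Counter(WORD_TO_EMOTION[t] for t in tokens if t in WORD_TO_EMOTION)


def detect_emotion(text, default="neutral"):
    """Return the most frequent emotion group, or the default if none match."""
    counts = emotion_counts(text)
    if not counts:
        return default
    return counts.most_common(1)[0][0]


print(detect_emotion("I was scared and terrified, but now I am glad."))
print(detect_emotion("The meeting is at noon."))
```

A lookup of this kind ignores context and negation, which is one reason learned word embeddings are preferred for the actual classification.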

Compared with the existing methods, namely knowledge-enriched transformer (KET), emotion and sentiment analysis (ESA), emotion detection and analysis (EDA), sequence-based convolutional neural network (SB-CNN), touch interactions model (TIM), and collaborative learning environment (CLE), the proposed method achieves the highest classification accuracy and detection rate.

5 Future work and conclusion

This paper presents DLSTA for identifying human emotions through text analysis of big data. Textual emotion analysis can be carried out using natural language processing concepts. Word embeddings are commonly used for several NLP tasks, including machine translation, sentiment analysis, and question answering. NLP techniques improve the performance of learning-based approaches by incorporating the semantic and syntactic features of the text. Emotion is conveyed in different forms, such as facial expressions, voice, gestures, and written language. Emotion recognition in text is a content-based classification problem that draws on natural language processing and deep learning. The findings demonstrate that the suggested approach is a very promising choice for emotion recognition because of its ability to learn features directly from raw data. The numerical results indicate that the proposed DLSTA approach achieves the highest detection rate, 97.22%, and a classification accuracy of 98.02% with various emotional word embedding methods. Future work will concentrate on advancing emotion detection, modeling the magnitude of emotions, permitting multiple emotion classes to be active concurrently, and studying alternative emotion class models.

Conflict of interest: The author states no conflict of interest.


References

[1] Chatterjee A, Gupta U, Chinnakotla MK, Srikanth R, Galley M, Agrawal P. Understanding emotions in text using deep learning and big data. Comput Hum Behav. 2019;93:309–17. doi:10.1016/j.chb.2018.12.029

[2] Rodríguez AOR, Riaño MA, García PAG, Marín CEM, Crespo RG, Wu X. Emotional characterization of children through a learning environment using learning analytics and AR-Sandbox. J Ambient Intell Human Comput. 2020;11:1–15. doi:10.1007/s12652-020-01887-2

[3] Hossain MS, Muhammad G. Emotion recognition using deep learning approach from audio–visual emotional big data. Inf Fusion. 2019;49:69–78. doi:10.1016/j.inffus.2018.09.008

[4] Zhang H, Jolfaei A, Alazab M. A face emotion recognition method using convolutional neural network and image edge computing. IEEE Access. 2019;7:159081–9. doi:10.1109/ACCESS.2019.2949741

[5] Kanjo E, Younis EM, Ang CS. Deep learning analysis of mobile physiological, environmental and location sensor data for emotion detection. Inf Fusion. 2019;49:46–56. doi:10.1016/j.inffus.2018.09.001

[6] Chen T, Ju S, Yuan X, Elhoseny M, Ren F, Fan M, et al. Emotion recognition using empirical mode decomposition and approximation entropy. Comput Electr Eng. 2018;72:383–92. doi:10.1016/j.compeleceng.2018.09.022

[7] Shrivastava K, Kumar S, Jain DK. An effective approach for emotion detection in multimedia text data using sequence based convolutional neural network. Multimed Tools Appl. 2019;78(20):29607–39. doi:10.1007/s11042-019-07813-9

[8] Asghar MZ, Subhan F, Ahmad H, Khan WZ, Hakak S, Gadekallu TR, et al. Senti-eSystem: a sentiment-based eSystem-using hybridized fuzzy and deep neural network for measuring customer satisfaction. Software Pract Exper. 2021;51(3):571–94. doi:10.1002/spe.2853

[9] Hasan M, Rundensteiner E, Agu E. Automatic emotion detection in text streams by analyzing twitter data. Int J Data Sci Analytics. 2019;7(1):35–51. doi:10.1007/s41060-018-0096-z

[10] Alazab R. 0038 children below 5 years of employed mothers are less exposed to acute poisoning in Alexandria, Egypt. Occup Environ Med. 2014;71(Suppl 1):A63. doi:10.1136/oemed-2014-102362.196

[11] Dash S, Luhach AK, Chilamkurti N, Baek S, Nam Y. A Neuro-fuzzy approach for user behaviour classification and prediction. J Cloud Comput. 2019;8(1):17. doi:10.1186/s13677-019-0144-9

[12] Awad A, Obayan A, Salhab S, Roufayel R, Kadry S. Effect of smoking on appetite, concentration and stress level. Glob J Health Sci. 2020;12(1):139–9. doi:10.5539/gjhs.v12n1p139

[13] Meqdad MN, Abdali-Mohammadi F, Kadry S. Recognizing emotional state of user based on learning method and conceptual memories. TELKOMNIKA. 2020;18(6):3033–40. doi:10.12928/telkomnika.v18i6.16756

[14] Ding C, Zhou A, Liu Y, Chang R, Hsu CH, Wang S. A cloud-edge collaboration framework for cognitive service. IEEE Trans Cloud Comput. 2020. doi:10.1109/TCC.2020.2997008

[15] Chan SKW, Kao SYS, Leung SL, Hui CLM, Lee EHM, Chang WC, et al. The role of cognitive functioning and symptomology in self-stigma formation in psychosis. In: International Congress of Psychiatry, RANZCP 2016; 2016, May.

[16] Bhardwaj A, Al-Turjman F, Kumar M, Stephan T, Mostarda L. Capturing-the-Invisible (CTI): behavior-based attacks recognition in iot-oriented industrial control systems. IEEE Access. 2020;8:104956–66. doi:10.1109/ACCESS.2020.2998983

[17] Moreira MW, Rodrigues JJ, Kumar N, Saleem K, Illin IV. Postpartum depression prediction through pregnancy data analysis for emotion-aware smart systems. Inf Fusion. 2019;47:23–31. doi:10.1016/j.inffus.2018.07.001

[18] Zhang G, Hsu CHR, Lai H, Zheng X. Deep learning based feature representation for automated skin histopathological image annotation. Multimed Tools Appl. 2018;77(8):9849–69. doi:10.1007/s11042-017-4788-5

[19] Lokesh S, Kumar PM, Devi MR, Parthasarathy P, Gokulnath C. An automatic tamil speech recognition system by using bidirectional recurrent neural network with self-organizing map. Neural Comput Appl. 2019;31(5):1521–31. doi:10.1007/s00521-018-3466-5

[20] Kanisha B, Lokesh S, Kumar PM, Parthasarathy P, Babu GC. Speech recognition with improved support vector machine using dual classifiers and cross fitness validation. Personal Ubiquitous Comput. 2018;22(5–6):1083–91. doi:10.1007/s00779-018-1139-0

[21] Zhong P, Wang D, Miao C. Knowledge-enriched transformer for emotion detection in textual conversations. arXiv preprint arXiv:1909.10681; 2019. doi:10.18653/v1/D19-1016

[22] Gaind B, Syal V, Padgalwar S. Emotion detection and analysis on social media. arXiv preprint arXiv:1901.08458; 2019.

[23] Sailunaz K, Alhajj R. Emotion and sentiment analysis from Twitter text. J Comput Sci. 2019;36:101003. doi:10.1016/j.jocs.2019.05.009

[24] Ghosh S, Hiware K, Ganguly N, Mitra B, De P. Emotion detection from touch interactions during text entry on smartphones. Int J Human-Comput Stud. 2019;130:47–57. doi:10.1016/j.ijhcs.2019.04.005

[25] Jena RK. Sentiment mining in a collaborative learning environment: capitalizing on big data. Behav Inf Technol. 2019;38(9):986–1001. doi:10.1080/0144929X.2019.1625440

Received: 2021-03-01
Revised: 2021-07-06
Accepted: 2021-09-17
Published Online: 2022-01-12

© 2022 Jia Guo, published by De Gruyter

This work is licensed under the Creative Commons Attribution 4.0 International License.
