A study on intelligent translation of English sentences by a semantic feature extractor

Abstract: In order to enhance the performance of machine translation, this article briefly introduced algorithms that can be used to extract semantic feature vectors. Then, the aforementioned algorithms were integrated with the encoder-decoder translation algorithm, and the resulting algorithms were tested. First, the semantic recognition performance of the long short-term memory (LSTM)-based semantic feature extractor was tested, followed by a comparison with the translation algorithm that does not include semantic features, as well as the translation algorithm that incorporates convolutional neural network-extracted semantic features. The findings demonstrated that the LSTM-based semantic feature extractor accurately identified the semantics of the source language. The proposed translation algorithm, which is based on LSTM semantic features, achieved more accurate translations than the other two algorithms. Furthermore, it was less affected by the length of the source language.


Introduction
The advancement of globalization has led to more frequent cultural exchanges, and a common language is necessary for effective communication. Currently, English is one of the universal languages used internationally, so the promotion and application of English in China are becoming more and more widespread [1]. Although learning English can improve one's translation skills, it is a time-consuming process. Moreover, the efficiency of human translation is limited, making it challenging to handle large volumes of translation tasks efficiently; human energy is also limited, and long-term manual translation work can lead to a decline in translation quality [2]. In order to improve the efficiency of English translation, and with the advancement of computer technology, machine translation algorithms have been applied to English translation. Traditional machine translation relies on bilingual dictionaries for one-to-one translation. However, this method can only translate the general meaning, and the grammar and word order of the translation may not adhere to the rules of the target language; errors are particularly likely when translating words with similar meanings. Deep learning algorithms can effectively find hidden patterns in data, and when applied to machine translation, they can effectively extract semantic features from the source text to be translated, thereby making the translation more accurate [3]. Long short-term memory (LSTM) is a special type of recurrent neural network (RNN) that is widely used for processing sequential data. Due to its ability to memorize long-term dependency information, it plays a significant role in various fields, such as language modeling, machine translation, and speech recognition [4]. The technique of semantic feature extraction based on LSTM primarily involves utilizing the LSTM model to extract features from input sequences for subsequent classification and recognition purposes. To be specific,
the LSTM model explores and memorizes semantic information when processing sequential data based on contextual cues from both preceding and succeeding sequences [5]. It then utilizes these semantic details to generate new output sequences. This capability allows the LSTM model to be applied in unsupervised text data processing [6], facilitating the learning of complex language patterns and knowledge while transforming them into generalized semantic features. The advantage of this feature extraction method lies in its ability to extract features automatically without manual configuration, while also possessing good generalization and interpretability [7]. Relevant works are reviewed below. Lim [8] proposed a neural machine translation improvement based on a novel beam search evaluation function and found that the proposed method effectively improved the quality of English-Chinese translation. Xiong [9] studied the method of speech semantic feature recognition in an English spoken language database under the framework of machine learning and speech semantic recognition, based on Chinese name recognition results. The analysis of the experimental results showed that the spoken English semantic recognition model achieved significant improvements over traditional speech recognition systems. Lee et al. [10] used a character-level convolutional network for machine translation. In multi-language experiments, the character-level convolutional network encoder was significantly better than the sub-word-level encoder. Ban and Ning [11] studied a Chinese-English machine translation model based on deep neural networks and found that incorporating a portion of the voice data for assessing the enhancement in model performance could improve the translation model's effectiveness. This article briefly introduces algorithms that can be used to extract semantic feature vectors. Then, they were combined with the translation algorithm constructed with an encoder-decoder structure. Finally, these translation algorithms were tested. The novelty of this article lies in the utilization of an LSTM model to extract semantic features from English, thereby enriching features and improving translation accuracy. The contribution of this article is using an LSTM model to extract semantic features from English and combining them with a translation algorithm to provide effective references for accurate translation. The limitation of this study lies in the insufficient number of samples used to train the algorithm. Therefore, a future research direction would be to increase the sample size in order to enhance the algorithm's generalizability.
2 English intelligent translation algorithm with semantic feature fusion

Extraction algorithm for semantic feature vectors
Text sequences can be viewed as one-dimensional images, or as two-dimensional images after layout, so convolutional neural networks (CNNs) used for extracting image features can also be used for extracting semantic features of text sequences [12]. Before a CNN extracts semantic features from the source text, it undergoes training. In this process, the CNN uses convolutional kernels to perform convolution operations on the vectorized source text in the convolutional layer to obtain convolutional feature maps [13], compresses them in the pooling layer, repeats the convolution and pooling operations multiple times, and calculates the classification results in the fully connected layer. In this article, the classification result is the semantic label [14]. Then, the calculated classification result is compared with the actual semantic label, and the hyperparameters in the convolutional layer are adjusted in reverse based on the error between the two. The above steps are repeated until the error converges to the preset range [15]. The trained CNN performs convolution and pooling operations on the vectorized source text and then concatenates the convolutional features obtained from each convolution operation, after max-pooling and compression, to form the semantic feature vector [16].
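The convolution, max-pooling, and concatenation pipeline described above can be sketched in NumPy. This is an illustrative sketch rather than the authors' implementation; the kernel widths, the number of kernels, and the embedding dimension are assumptions, and the kernels here are random rather than trained.

```python
import numpy as np

rng = np.random.default_rng(0)

def conv1d_valid(x, kernel):
    """Valid 1-D convolution over the time axis.
    x: (seq_len, embed_dim), kernel: (width, embed_dim) -> (seq_len - width + 1,)."""
    width = kernel.shape[0]
    return np.array([np.sum(x[t:t + width] * kernel)
                     for t in range(x.shape[0] - width + 1)])

def cnn_semantic_features(x, kernels):
    """Convolve with each kernel, apply ReLU, max-pool each feature map,
    and concatenate the pooled values into one semantic feature vector."""
    return np.array([np.max(np.maximum(conv1d_valid(x, k), 0.0))
                     for k in kernels])

seq = rng.normal(size=(12, 8))                    # 12 word vectors of dimension 8
kernels = [rng.normal(size=(w, 8)) for w in (2, 3, 4) for _ in range(5)]
features = cnn_semantic_features(seq, kernels)
print(features.shape)                             # one pooled value per kernel
```

In a trained model the concatenated vector would feed a fully connected layer that predicts the semantic label; here only the feature-extraction step is shown.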
In addition to CNNs, RNNs can also be used for extracting semantic feature vectors from the source text, and compared to a CNN, an RNN is more suitable for processing sequential data [17]. However, an RNN may have difficulty processing long sequences due to the problem of vanishing gradients, so LSTM is generally used for long sequence data. Compared to an RNN, LSTM adds a cell state and introduces a forget mechanism [18]; its basic structure is shown in Figure 1. The hidden layer of LSTM uses forget gates, input gates, and output gates to process the input data. The calculation formula for the forget gate [19] is

$$f_t = \sigma(\omega_f \cdot [h_{t-1}, x_t] + b_f),$$

where $f_t$ is the output of the forget gate at time $t$, $\sigma(\cdot)$ refers to the activation function [20], $\omega_f$ refers to the weight in the forget gate, $h_{t-1}$ is the hidden state at time $t-1$, $x_t$ is the input data at time $t$, and $b_f$ is the bias of the forget gate. The input gate updates the cell state with the newly input data. The formulas are

$$i_t = \sigma(\omega_i \cdot [h_{t-1}, x_t] + b_i),$$
$$\tilde{C}_t = \tanh(\omega_C \cdot [h_{t-1}, x_t] + b_C),$$
$$C_t = f_t \odot C_{t-1} + i_t \odot \tilde{C}_t,$$

where $i_t$ is the output of the input gate at time $t$, $\omega_i$ is the weight in the input gate, $b_i$ represents the bias of the input gate, $\tilde{C}_t$ is the temporary (candidate) cell state at time $t$, $C_t$ is the state after the cell update at time $t$, $\omega_C$ is the weight used for updating the cell state, and $b_C$ denotes the bias used for updating the cell state. The output gate updates the hidden state based on the input data and the current cell state [21], and the formulas are

$$o_t = \sigma(\omega_o \cdot [h_{t-1}, x_t] + b_o),$$
$$h_t = o_t \odot \tanh(C_t),$$

where $o_t$ indicates the output of the output gate at time $t$, $\omega_o$ indicates the weight in the output gate, $b_o$ is the bias of the output gate, and $h_t$ is the hidden state at time $t$ [22]. After the LSTM is trained with training data carrying semantic labels, the source text is input into the LSTM when extracting the semantic feature vector, and the hidden state $h_t$ of the source text is obtained after the forward calculation of the above equations in the forget, input, and output gates. Then, $h_t$ is used as the semantic feature vector of the source text obtained by the LSTM extraction.
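The gate equations above can be implemented directly. The following NumPy sketch, with assumed, randomly initialized weights, runs one LSTM cell over a short sequence and keeps the final hidden state as the semantic feature vector, mirroring the extraction step described in the text.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, p):
    """One LSTM step following the forget/input/output gate equations.
    p holds weights w_f, w_i, w_C, w_o (each (hidden, hidden + input)) and biases."""
    z = np.concatenate([h_prev, x_t])            # [h_{t-1}, x_t]
    f_t = sigmoid(p["w_f"] @ z + p["b_f"])       # forget gate
    i_t = sigmoid(p["w_i"] @ z + p["b_i"])       # input gate
    c_tilde = np.tanh(p["w_C"] @ z + p["b_C"])   # candidate cell state
    c_t = f_t * c_prev + i_t * c_tilde           # cell state update
    o_t = sigmoid(p["w_o"] @ z + p["b_o"])       # output gate
    h_t = o_t * np.tanh(c_t)                     # hidden state
    return h_t, c_t

rng = np.random.default_rng(1)
H, D = 4, 3                                      # hidden size, input size (assumed)
p = {k: rng.normal(scale=0.1, size=(H, H + D)) for k in ("w_f", "w_i", "w_C", "w_o")}
p.update({b: np.zeros(H) for b in ("b_f", "b_i", "b_C", "b_o")})

h, c = np.zeros(H), np.zeros(H)
for x_t in rng.normal(size=(5, D)):              # run a length-5 input sequence
    h, c = lstm_step(x_t, h, c, p)
print(h.shape)                                   # final h serves as the feature vector
```

Because $h_t = o_t \odot \tanh(C_t)$ with $o_t \in (0, 1)$, every component of the extracted feature vector lies strictly within $(-1, 1)$.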

English intelligent translation combined with semantic feature vector parameter
The original machine translation algorithm used bilingual dictionaries to translate the source text one-to-one, but differences in grammar between the source text and the target text led to word displacement [23], and the existence of synonyms also greatly reduced the accuracy of translation. Deep learning algorithms can effectively mine hidden information in text and produce more accurate translations. Currently, commonly used intelligent machine translation algorithms adopt an end-to-end encoder-decoder structure. The basic principle of this algorithm for machine translation is to use the encoder to transform the source text sequence into an intermediate sequence and then use the decoder to convert the intermediate sequence into the target text sequence. The encoder and the decoder both use neural network algorithms. Simply put, they "guess" the most likely intermediate sequence and target sequence based on the statistical rules obtained after training. However, in this algorithm, the encoder only converts the source text sequence into an intermediate vector sequence without fully considering the semantic feature information contained in the source text. Therefore, this article extracts semantic feature vectors, combines them with the vector sequence obtained by the encoder, and finally uses the decoder to decode the fused vector sequence. The specific process is as follows.
① The source text to be translated is vectorized using Word2vec [24].
② The vectorized source text is input into both the semantic feature vector extractor and the translation encoder for forward calculation. The semantic feature vector extractor uses the LSTM algorithm described above, which has been trained using a training set with semantic labels, and the translation encoder also uses the LSTM algorithm.
③ During the forward calculation of the feature extractor and the encoder, the hidden state $h_t$ in their respective hidden layers is obtained from the source vector; this yields the source semantic feature vector extracted by the feature extractor and the encoding vector obtained by the encoder, respectively.
④ The source semantic feature vector and the encoding vector are combined by weighting. The gating mechanism in LSTM is mimicked to adaptively adjust the weights of the two vectors. The weighted calculation formulas are

$$\alpha = \sigma(\omega_{Ns} \cdot h_{Ns} + \omega_{sr} \cdot h_{sr}),$$
$$h_{mix} = \alpha \, h_{Ns} + (1 - \alpha) \, h_{sr},$$

where $h_{Ns}$ and $h_{sr}$ represent the hidden state vectors output by the translation encoder and the semantic feature extractor, respectively, $\omega_{Ns}$ and $\omega_{sr}$ stand for the gating parameters of $h_{Ns}$ and $h_{sr}$, $\alpha$ is the weight of the hidden state vector output by the translation encoder, and $h_{mix}$ represents the vector code after semantic feature fusion.
⑤ The fused vector code is input into the decoder for the decoding calculation. The decoder in the translation algorithm also uses the LSTM algorithm. After the fused vector code is calculated by the decoder, the distribution probability of the target text characters is obtained, and then the Viterbi beam search algorithm is used to obtain the target text. The sequence with the largest distribution probability is taken as the translation output.
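Step ④ can be illustrated with a small NumPy sketch. The exact gating form is not fully specified in the text, so the sigmoid gate below, modeled on the LSTM gates, is an assumption, as are the vector dimension and the random parameter values.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fuse(h_ns, h_sr, w_ns, w_sr):
    """Gated weighted combination of the encoder state h_ns and the
    semantic feature vector h_sr: alpha weights the encoder state and
    (1 - alpha) weights the semantic features, per dimension."""
    alpha = sigmoid(w_ns @ h_ns + w_sr @ h_sr)   # gate values in (0, 1)
    return alpha * h_ns + (1.0 - alpha) * h_sr

rng = np.random.default_rng(2)
d = 6                                            # hidden dimension (assumed)
h_ns, h_sr = rng.normal(size=d), rng.normal(size=d)
w_ns, w_sr = rng.normal(size=(d, d)), rng.normal(size=(d, d))
h_mix = fuse(h_ns, h_sr, w_ns, w_sr)
print(h_mix.shape)
```

Since each component of `h_mix` is a convex combination of the corresponding components of the two inputs, the fused code always lies element-wise between the encoder state and the semantic feature vector.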
3 Experimental analysis

Experimental data
The data required for the experiments in this article came from UM-Corpus [25], which provides two million bilingual corpora aligned in English and Chinese. Ten thousand sentences (English and Chinese parallel corpora) were randomly selected as the training corpus, and another 5,000 sentences were randomly selected as the test set.

Algorithm settings
The proposed English translation algorithm combining semantic feature vector parameters mainly included a semantic feature extractor, a translation encoder, and a translation decoder, all of which used the LSTM algorithm. The LSTM used in both the translation encoder and the translation decoder consisted of two hidden layers, each with 512 nodes, and the activation function employed was the sigmoid function. The LSTM used in the semantic feature extractor consisted of three hidden layers, each with 512 nodes, and the activation function employed was also the sigmoid function. During the training process, the semantic feature extractor was first trained using a training set with semantic labels, including part-of-speech and grammar, and the parameters in the hidden layers were adjusted using stochastic gradient descent based on the forward calculation error, with a learning rate of 0.01. Then, the trained semantic feature extractor was used to train the translation encoder and decoder, whose parameters were also adjusted using stochastic gradient descent with a learning rate of 0.01. The beam size in the Viterbi beam search algorithm used to obtain the decoder's output translation was set to 5. The above parameter settings were determined through orthogonal experiments. In addition to the proposed English translation algorithm, two other English translation algorithms were also tested. The main structure of the other two algorithms was also an encoder-decoder. One of them used a traditional encoder-decoder structure without semantic feature vectors, and its structural parameters were the same as those in the proposed translation algorithm. The other combined semantic feature vectors but used the CNN algorithm for semantic feature extraction, with the remaining parameters the same as in the proposed translation algorithm.
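The beam search used for decoding (beam size 5) can be sketched as follows. The fixed table of per-step log-probabilities is a deliberate simplification: a real decoder conditions each step's distribution on the partial output, but the pruning logic is the same.

```python
import math

def beam_search(step_logprobs, beam_size=5):
    """Keep the beam_size highest-scoring partial sequences at each step.
    step_logprobs[t][v] is the log-probability of token v at step t."""
    beams = [((), 0.0)]                                  # (token sequence, score)
    for logprobs in step_logprobs:
        candidates = [(seq + (v,), score + lp)
                      for seq, score in beams
                      for v, lp in enumerate(logprobs)]
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = candidates[:beam_size]                   # prune to the beam
    return beams[0]                                      # best-scoring sequence

# Toy vocabulary of 3 tokens over 3 decoding steps (assumed probabilities).
table = [[math.log(p) for p in row] for row in
         [[0.6, 0.3, 0.1], [0.2, 0.7, 0.1], [0.1, 0.2, 0.7]]]
best_seq, best_score = beam_search(table, beam_size=5)
print(best_seq)  # (0, 1, 2)
```

With independent per-step distributions the search simply picks each step's most likely token; the beam matters once the decoder's distributions depend on the chosen prefix.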

Test program
(1) The LSTM semantic feature extractor was trained using a training set with semantic labels. Then, a test set containing 5,000 parallel English-Chinese sentences was used to test the extractor. During actual usage, however, the new sequences converted by the semantic feature extractor are not Chinese characters and cannot be directly evaluated for quality. Therefore, the new sequences were classified, and the classification results were the various semantic labels. The accuracy of semantic classification was used as the measure of the performance of the semantic feature extractor. During testing, source texts with lengths ranging from 1 to 30 words were classified according to semantics, and the classification accuracy of the semantic feature extractor for different source text lengths was recorded.
(2) The translation algorithms without semantic features, based on CNN semantic features, and based on LSTM semantic features were compared. Similarly, a test set containing 5,000 parallel English-Chinese sentences was used to evaluate the translation performance for source sentence lengths ranging from 10 to 30 words.

Evaluation standards
When testing the English translation algorithm proposed in this article, the performance of the semantic feature extractor was tested first. The hidden states in the feature extractor, which represent the semantic feature vectors, as well as the parameters used to obtain these hidden states, are adjusted during training. However, there are no direct labels for the hidden states in the training set; only the corresponding semantic labels are available. Therefore, this article evaluated the extraction performance for semantic feature vectors by measuring the recognition performance of the semantic feature extractor on semantic labels. The higher the accuracy of semantic label recognition by the semantic feature extractor, the more accurate the semantic feature vectors represented by the hidden states in its recognition process. BLEU was used to measure the translation performance of the English translation algorithm. It scores the algorithm based on the similarity between the machine translation and the actual (reference) translation: the higher the similarity, the higher the score and the better the performance of the translation algorithm. Its calculation formula is

$$\mathrm{BLEU} = \beta \cdot \exp\!\left(\frac{1}{N}\sum_{n=1}^{N}\log p_n\right),$$
$$p_n = \frac{\sum_{i}\sum_{k}\min\!\big(h_k(c_i),\, \max_{j \le m} h_k(s_{ij})\big)}{\sum_{i}\sum_{k} h_k(c_i)},$$
$$\beta = \begin{cases} 1, & l_c > l_s, \\ e^{1 - l_s/l_c}, & l_c \le l_s, \end{cases}$$

where $\beta$ is a penalty factor, $N$ is the maximum order of the n-grams, $l_c$ and $l_s$ represent the lengths of the translation and the actual translation, respectively, $c_i$ is the $i$th translated sentence, $s_{ij}$ is the $j$th actual translation corresponding to $c_i$, $m$ represents the total number of sentences in the actual translation, $h_k(c_i)$ is the number of occurrences of the $k$th n-gram in the translation, and $h_k(s_{ij})$ denotes the number of occurrences of the $k$th n-gram in the actual translation.
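The BLEU computation described above can be sketched as follows. This is a minimal sentence-level implementation with clipped n-gram precision and a brevity penalty, not the evaluation script used in the study, and it returns 0 when any n-gram order has no matches rather than applying smoothing.

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """Count all n-grams of order n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def bleu(candidate, references, max_n=4):
    """Sentence-level BLEU: geometric mean of clipped n-gram precisions
    up to max_n, multiplied by the brevity penalty beta."""
    log_p = 0.0
    for n in range(1, max_n + 1):
        cand = ngrams(candidate, n)
        if not cand:
            return 0.0
        clipped = sum(min(c, max(ngrams(r, n)[g] for r in references))
                      for g, c in cand.items())
        if clipped == 0:
            return 0.0                      # no smoothing in this sketch
        log_p += math.log(clipped / sum(cand.values())) / max_n
    l_c = len(candidate)
    ref = min(references, key=lambda r: abs(len(r) - l_c))   # closest ref length
    beta = 1.0 if l_c > len(ref) else math.exp(1.0 - len(ref) / l_c)
    return beta * math.exp(log_p)

cand = "the cat sat on the mat".split()
refs = ["the cat sat on the mat".split(), "a cat was on the mat".split()]
print(round(bleu(cand, refs), 3))  # 1.0 for an exact match
```

The clipping step (`min(c, max ref count)`) prevents a candidate from being rewarded for repeating an n-gram more often than any reference contains it.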

Experimental results
The performance of the semantic feature extractor was tested first in the experimental analysis. The semantic feature extractor used the hidden state of the algorithm's hidden layer as the semantic feature vector, but different extractors had different parameter settings. Therefore, there was no fixed standard feature vector label in the training set; however, the semantic labels in the training set were fixed. Consequently, the accuracy of semantic label recognition by the semantic feature extractor was tested in order to measure the accuracy of its semantic feature vectors. Table 1 shows the partial recognition results of the semantic feature extractor for part-of-speech and grammar. Figure 2 shows the recognition accuracy of the part-of-speech and grammar of source texts of different lengths by the semantic feature extractor. It can be seen from Figure 2 that, as the length of the source text increased, the recognition accuracy of the semantic feature extractor for both part-of-speech and grammar decreased, but the decrease was not significant, and the recognition accuracy of both remained above 98%. After verifying the semantic feature vector extraction performance of the semantic feature extractor, this article conducted comparative experiments on the three translation algorithms. Table 2 shows the partial translation results of the three translation algorithms. Figure 3 shows the BLEU values of the three translation algorithms under different source text lengths. Comparing the translation results of the three translation algorithms in Table 2, it was found that the translation algorithm without semantic features differed the most from the reference translation, and its grammatical order did not conform to the Chinese grammatical structure; the translation algorithm that combined CNN-extracted semantic features showed a smaller difference between the translated text and the reference translation, but there were still differences in grammatical order; and the translation algorithm that combined LSTM-extracted semantic features produced a translation that was almost consistent with the reference translation.
From Figure 3, it was observed that the BLEU values of the translation algorithm without semantic features and the translation algorithm that integrated CNN-extracted semantic features both decreased as the length of the source text to be translated increased, while the translation algorithm that integrated LSTM-extracted semantic features remained stable. For the same source text length, the translation algorithm that integrated LSTM-extracted semantic features had the highest BLEU value, followed by the translation algorithm that integrated CNN-extracted semantic features, and the translation algorithm without semantic features had the lowest BLEU value.

Discussion
With the deepening of global communication and the rapid development of information technology, intelligent translation, a technology that can automatically translate languages quickly and accurately, is increasingly gaining attention and importance. Given that English has become a universal language worldwide, research on the intelligent translation of English sentences holds significant practical and theoretical value. English sentences consist of elements such as vocabulary, grammar, and semantics, which exhibit distinct characteristics in various contexts. Traditional translation methods primarily focus on translating vocabulary and syntax while neglecting the processing of semantic aspects. Some traditional translation methods also rely too heavily on corpora and prior knowledge, resulting in weak generalization ability. This article chose to use an LSTM model to extract feature parameters from English. Benefiting from the LSTM model's ability to utilize contextual information for learning and identifying semantic information, the sequence information obtained after the conversion of English text sequences contains semantic feature information. The recognition performance of the semantic feature extractor was tested first in the subsequent simulation experiments. Then, three translation algorithms were compared: one without semantic features, one based on CNN semantic features, and one based on LSTM semantic features, as shown in the previous section. The LSTM-based semantic feature extractor accurately extracted the semantics of English vocabulary. Compared to the other two translation algorithms, the translation algorithm based on LSTM semantic features demonstrated superior translation performance. Additionally, all three algorithms showed a decrease in translation performance with increasing translation length, although the LSTM-based algorithm was the least affected. The reasons for these results are as follows. The translation algorithm that does not utilize semantic features only focuses on vocabulary and syntax, without considering the impact of semantics on translation. Compared to the former, the translation algorithm based on CNN semantic features takes semantic features into account. However, although the CNN can use convolutional kernels to extract features and combine them into overall features, it does not fully consider the influence of the surrounding context on semantics. This leads to certain biases in semantic extraction and ultimately affects the algorithm's translation performance. The influence of context on semantics is fully considered when extracting semantic features using the LSTM, resulting in more accurate extracted semantic features and more precise translation results. The reason why the performance of the translation algorithms decreases as the length of the translation increases is that longer translations contain more contextual features, which can lead to increased interference.

Conclusion
This article briefly introduced algorithms that can be used to extract semantic feature vectors and then combined them with the translation algorithm constructed with an encoder-decoder structure. Finally, the translation algorithms were tested. In the test, the performance of the LSTM-based semantic feature extractor in recognizing semantics was first checked, followed by a comparison with the translation algorithm that did not use semantic features, as well as the translation algorithm that integrated CNN-extracted semantic features. The following results were obtained. (1) The LSTM-based semantic feature extractor accurately recognized the part-of-speech and grammar of the source text, and its recognition accuracy decreased only slightly as the length of the source text increased. (2) In terms of translation results, the LSTM-based translation algorithm was closest to the reference translation, followed by the translation algorithm that integrated CNN-extracted semantic features, while the translation algorithm without semantic features deviated the most.

Figure 2: The accuracy of the semantic feature extractor in recognizing part-of-speech and grammar.

Table 2: Partial translation results of the three translation algorithms.

Figure 3: The BLEU values of the three translation algorithms under different source text lengths.

Table 1: Partial recognition results of part-of-speech and grammar by the semantic feature extractor.