NMT verb rendering: A cognitive approach to informing Arabic-into-English post-editing

Machine translation (MT) has made significant strides and has reached accuracy levels that often make the post-editing (PE) of MT output a viable alternative to manual translation. However, despite professional translators increasingly considering PE as a valid stage in their translation workflow, little has been done to investigate MT output for the purpose of informing training in PE. Against this background, the present project focuses on the handling of tense and aspect configurations in the English translation of Arabic sentences using current neural machine translation (NMT) systems. Using a dataset of representative Arabic sentences, the output of five NMT engines was assessed against reference translations. The investigation reveals regressing accuracy levels when comparing morphological, structural, and contextual tenses. These findings are believed to represent valuable information that contributes to a more informed training in the PE of Arabic-into-English NMT output.


Introduction
Machine translation (MT) has drastically evolved from the simplistic systems of the late 1940s to present-day neural network sophistication. With this evolution, MT output quality has, notwithstanding language pair variations, witnessed impressive improvements that are steadily bringing it closer to human translation standards (Le and Schuster 2016). Understandably, MT output is still not perfect. Yet, its quality is now often sufficient for rapid information purposes, i.e. uses where only the gist of the source language (SL) text is needed (Hutchins 2001, 14). Even when the translation job is intended for more official usage and dissemination, MT output is increasingly considered a starting point for post-editing (PE)-based translation (see, for instance, O'Brien 2002).
In the area of translation, PE, or machine translation post-editing (MTPE), is the editing by human operators of MT output. This editing is meant to address issues, gaps, inconsistent terminology, etc. that are generated in the automatic translation process, with the intent to produce a polished translation of professional human translation quality (Allen 2003). The integration of MTPE in the translation workflow is supposed to improve translators' productivity. The validity of these claims notwithstanding,¹ PE has had a major impact on professional translation workflow. As such, MTPE has become a major item in the skill set of professional translators (O'Brien 2002, Pym 2013), one that requires dedicated training. Prior awareness of weaknesses that are proper to MT output can therefore prove useful for the training of post-editors as it can help inform the translator/post-editor in their work.
From an Arabic-into-English translation perspective, this study focuses on the English verb as a linguistic aspect of the MT output. The verb is a key element of the sentence as it encapsulates multiple meaning components. Out of this multiplicity, the present project narrows its focus to manifestations of tense and aspect. This focus is doubly motivated. First, there is the belief that it would be arduous to attempt, within the scope of one article, a comprehensive coverage of all elements of interest along the lines proposed in the present investigation. Second, tense and aspect capture several of the characteristic distinctions between English and Arabic, which makes their study a valid starting point. By its very nature, this scope signifies that out of the indicative, subjunctive, and imperative moods, the indicative is the most propitious mood for the study, as both the subjunctive and the imperative usually imply the use of the bare form of the verb and consequently hamper any discussion in terms of tense and aspect.
Tense and aspect represent two of the properties with which the English finite verb is marked (Almanna 2016a, 2016b, Kearns 2000, Leech and Svartvik 2002). English tenses are typically divided into "past," "present," and "future" (see for example Biber et al. 2002, Coe et al. 2006, Freeborn 1987).² In contrast, Arabic has three basic tenses: "ماض" past, "مضارع" present, and "أمر" imperative. The imperative, by nature, has a futuristic element, which has led some Arabic grammarians to refer to it as "future" (see Āl Sāqy 1977, Āl Syrāfī 1986, H̱ārūn 1977). For some grammarians, tenses in Arabic fall into either of only two classes: perfect and imperfect. According to Wright (1967), the perfect is used to describe an action or event which is completed in relation to other actions or events, usually in the past, and is, therefore, part of reality. The imperfect, on the other hand, is used to describe an action or event which is not completed, usually in the present or future, and is, as such, part of irreality.
Grammatical aspect represents a mark on the particulars of the action carried out by the verb, such as manner, perspective, completion, continuation, the frequency and regularity of an act as a matter of routine, the duration of the act, and the continuity of the act at a particular point in time (Gadalla 2017, 30, Quirk et al. 1972, 90, Radwan 1975). English is usually described as having four aspect types, namely, simple, perfect, progressive, and perfect progressive (cf. Almanna 2016a, 2016b, Celce-Murcia and Larsen 1999, Griffiths 2006, Kearns 2000, Kreidler 1998). Conversely, Arabic has no such grammatical category, with the consequence that any of the features which are expressed through grammatical aspect in English are lexicalised in Arabic, i.e. expressed through lexical and phrasal resources. As a consequence of this discrepancy in the availability of different tense/aspect categories, Arabic-into-English translators need to glean clues from both grammatical and lexical contextual elements to render the tense and aspect of the English verb. To account for the factors that go into the apprehension of the Arabic sentence tense, Almanna (2018) proposes a trichotomy that consists of "morphological tense," "structural tense," and "contextual tense." Morphological tense is the tense as expressed through the verb morphology. Structural tense is the sum of verb morphology and bound or free morphemes, aspectual words, and other items that typically point at a particular tense. Finally, contextual tense is the resultant of lexical and grammatical clues other than the verb group that are scattered within or sometimes even outside the boundaries of a sentence. In the two example sentences below, morphological tense is the "present," as can be gathered by zooming in on the verb itself, "يتصل."
However, this morphological tense represents, in this case, an element of structural tense whereby the negative particle "لم" associated with the verb in the morphological present refers to the "past." In the first sentence, the phrase "في الآونة الأخيرة" (in recent days/lately) leads to a contextual tense that emphasises a period that began in the past and is considered relevant to the moment of speaking. This is typically expressed in English through the present perfect. Conversely, the effect of the contextual time marker "أمس" (yesterday) in the second example places emphasis on the completion of the action of not calling over a specific period in the past and can therefore be translated into a simple past. Relying on all these elements rather than adhering to morphological or structural tenses alone, the two sentences should be rendered as follows:
- My son has not called me lately.
- My son did not call me yesterday.
² It is to be noted that some grammarians limit English tenses to past and present only (see, for instance, Quirk and Greenbaum 1973) as they consider future as the outcome of the use of constructions (for more details, see Gadalla 2017, 30, Quirk et al. 1972).
With this in mind, the present project aims to investigate how satisfactorily neural machine translation (NMT) systems render tense and aspect in Arabic-into-English translation and what characteristics this rendering exhibits. From the perspective of NMT, the latest mainstream MT generation, the handling of tense and aspect in this language pair is particularly interesting, not because the morpho-syntactic and cognitive analysis processes that are needed to render it are complex, but because NMT does not rely on any of these analyses.
NMT is the latest variant of a shift in MT development that relies on aligned bilingual corpora. Statistical machine translation (SMT) systems, the earlier systems adopting this approach, use frequency calculations to provide the most frequent equivalent to each identified phrase. NMT systems are similar to SMT, except that they use a different computational approach called neural networks in exploiting this corpus (Forcada 2017).
Reliance on corpora to train MT systems came as a reaction to the limitations of earlier rule-based machine translation (RBMT) systems. In addition to dictionaries, an RBMT engine typically incorporates sets of SL grammatical rules used to parse the source text and sets of target language (TL) rules to generate the TL text. The limitations of these engines were due to the irregularities of human languages and the challenges that come with the attempt to code parsing modules that integrate a comprehensive set of grammatical rules with their exceptions and variations.
The internal functioning of NMT systems restricts and, in fact, orients the way the topic of accuracy is to be approached. The failure of the output to conform to a reference translation cannot be considered a "mistake" or an "error," as such an interpretation would erroneously imply the use of modules that specifically address parsing or morpho-syntactic/cognitive analyses. The statistical calculations that define NMT equally mean that the output can be unpredictable. The production of a satisfactory output for a specific linguistic aspect, be it morphological, syntactic, or other, does not mean that this aspect will always be rendered correctly. Taken together, these two issues imply that we can only speak of relative strengths and weaknesses that emerge following the analysis of multiple illustrative cases.
Approaching MT as end users rather than natural language processing (NLP) specialists, our intent aligns with the methodological restrictions explained above and consists in using the conclusions of the investigation to inform training in PE. Despite its particulars, the present project remains fundamentally an instance of MT output quality evaluation. Within NLP circles, the focus on MT output quality is primarily undertaken for the purpose of gauging progress made by different engines and enhancing MT technologies (Arthur et al. 2016, Bentivogli et al. 2016, Chatzikoumi 2020, Hayakawa and Arase 2020, Popović 2018, Toral and Sánchez-Cartagena 2017, Vilar et al. 2006, Zakraoui et al. 2020). Here, the researcher can opt for either automatic or manual (subjective) evaluation (see Chatzikoumi 2020, Forcada 2017). Automatic evaluation usually relies on a reference translation that is prepared and validated for the purpose of the assessment project (Forcada 2017, 304, Popović 2018). This evaluation involves the implementation of a number of metrics such as (automatic) text similarity measures (the Bilingual Evaluation Understudy or BLEU); reordering of lexical items as well as morphology and lexical errors (Forcada 2017, 305; see also Bentivogli et al. 2016, Papineni et al. 2002, Toral and Sánchez-Cartagena 2017); adequacy, fluency, and informativeness (Doddington 2002, 138-9); fluency and intelligibility (Reeder 2001); and general text matcher and translation edit rate (O'Brien 2011). Subjective or human evaluation, on the other hand, considers relatively blurry aspects such as fluency and PE effort (Forcada 2017, 305).
There is equally a growing body of scholarship investigating MT output quality from a pedagogical perspective that focuses on ways to better prepare trainees for MTPE. The thrust of research here has remained confined to the two closely intertwined aspects of PE-related skills and course design and components.³ The focus on PE-specific skills partakes of the broader endeavour to produce a comprehensive mapping of translation skills, as with the Process in the Acquisition of Translation Competence and Evaluation (PACTE) project (Rico Pérez and Torrejón 2012). Different skill categorisations are provided depending on how generic or specific the framework is (Koponen 2015, Pym 2013, Rico Pérez and Torrejón 2012). As for course design and components, course proposals commonly place PE in tandem with MT (Doherty et al. 2012, Koponen 2015, O'Brien 2002). In line with the general call to integrate technology into practical translation courses rather than leave it as standalone courses, some proposals call for integrating PE into general translation courses (Mellinger 2017, Moorkens 2018).
The recognition that MTPE is different from other types of editing and thus requires specific attention equally calls for another line of investigation, one which undertakes to orient and raise the awareness of trainees by charting typical MT errors, as voiced by Depraetere (2010). Projects adopting this line of investigation can be traced back to Loffler-Laurian (1983; see also Schäfer 2003). Unfortunately, this type of error analysis investigation has remained underexplored, especially in MT involving Arabic. The general orientation of the present project is towards attempting to fill this gap. The research questions can be formulated as follows:
1. How accurate is the NMT handling of tense and aspect in Arabic-into-English translation?
2. What features impact this accuracy?
The following section provides an overview of the methodological aspects taken into consideration in the present output assessment exercise. Section 3 provides and discusses salient findings in the project. Recommendations for PE are made in the conclusion.

Methodology
Drawing on NLP tradition, there are generally two principal aspects to decide upon when devising an MT output assessment test. These relate to the dataset to be used for the assessment and the evaluation method to adopt.

The database
The performance evaluation of NLP tools such as parsers or MT engines usually uses a set of sentences called an evaluation database.⁴ Two types of databases are usually adopted. These are corpora and test suites. The use of one or the other of the database types is typically a factor of the purpose of the study. Corpora represent the first option in studies of a quantitative nature where the representativeness of the phenomena under investigation, as reflected in a corpus, is central to the research purpose (Lloberes et al. 2014, 87). Conversely, the use of a test suite is more appropriate in studies of a qualitative nature where phenomena under scrutiny need to be clearly isolated for analysis. In other words, the focus of projects using test suites is on coverage comprehensiveness rather than representativeness. For the present study, the test suite option has been adopted given the study's purpose of providing as exhaustive an investigation as possible of the MT rendering of tense and aspect manifestations. In other words, the frequency of occurrence of a particular tense or aspect in language does not represent a parameter in the investigation. Of more consequence for the study is providing a comprehensive coverage of these investigated phenomena.
Along the lines described in Burchardt et al. (2017, 160), the test suite for the study consists of 147 sentences derived from Almanna (2018), a pedagogical resource used for an introductory course in translation. These sentences are used in class to learn how to deal with morphological, structural, and contextual tenses in Arabic-to-English translation. The pedagogical origin of the test suite supports its comprehensiveness, as it covers the 12 English tense/aspect combinations of the indicative mood (Figure 1). With this number of sentences, the test suite is of a relatively small scale (Coughlin 2003, 63). However, this is compensated for by the narrow focus of the study and the use of five different MT engines to process each of the sentences (see below).

The evaluation method
Compared with general assessment tests that appraise the overall performance of a system, the focus of the current project, tense and aspect, is narrow. This limited scope renders unnecessary many of the metrics used in general assessment tests to secure objective and replicable results. For the current investigation, use was made of manual assessment that relies on a reference translation of the test suite Arabic sentences. The reference translation was produced based on analysis of the morphological, structural, and contextual tense types described in Section 1. To formalise the description of contextual tense, use was made of cognitive categories developed by Almanna (2022) that consist of point of emphasis, plexity, scope of intention and extent of causation, pace and time lapse, state of dividedness, state of boundedness, and degree of extension. These categories help define the contextual tense and account for the tense/aspect combination that is appropriate for each case.
Point of emphasis: This is the aspect of the action which the sentence focuses on. This emphasis can be on completion, duration, continuity, habituality, regularity, frequency, etc. In the example below, and by virtue of the phrase "منذ الصباح," the emphasis is placed on duration, the whole period that began in the past and is seen as relevant to another point in the present. In the same example, there is equally a focus on continuity, as attention is drawn to the middle phase of the action of "revising" rather than to its beginning or end. This favours seeing the action as an ongoing activity. The combination of these two points of emphasis, i.e. duration and continuity, constitutes the contextual tense that favours rendering the sentence with the use of the present perfect continuous instead of relying on the present morphological tense.
I have been revising my lessons since morning.
Plexity: This refers to whether the action depicted is uniplex, i.e. occurring once, or multiplex, occurring multiple times in a recurrent manner. In the first sentence rendered below, the action of "smoking" is uniplex by virtue of the second clause, which depicts an action occurring while the first is ongoing and probably interrupting it.
By contrast, the same action of "smoking" in the second sentence is multiplex on account of the clause meaning "when I was young," which refers to an extended period of time.
In translation, the differences in plexity between the two sentences represent major contextual tense elements. Hence, whereas the uniplex case requires the use of the continuous aspect, the multiplex case requires an interpretation that highlights habituality in the past. Considering these and other aspects, the two instances can be rendered as follows:
I was smoking in the street when my father called me.
I used to smoke heavily in my youth, but not anymore.
State of dividedness: While plexity refers to the number of times a full action is repeated, state of dividedness relates to the quality of one action being either internally continuous or characterised by internal breaks. Action in the sentence below is characterised by breaks as the action of "sipping" is inherently made up of internal repetitions.
She was sipping tea in front of the window.
Conversely, the act of "watching" in the sentence below is characterised by being internally continuous, involving no interruptions.
Scope of intention and extent of causation: Scope of intention is the quality inherent to an action that is presented as future-oriented. This contrasts with extent of causation, which is past-oriented. The interpretation of a sentence is usually a matter of balance between these two qualities. In the example below, the extent of causation in the two clauses is larger than the scope of intention as the two acts of "going" and "buying" are asserted. In other words, the emphasis in these two finite clauses is placed on the completion of the two actions, which are consequently described as points on the timeline. This favours a translation making use of the past simple.
My sister went to the grocery store and bought some sugar.
In contrast, in the sentence below, the scope of intention in the second clause, "to buy some sugar," is larger than the extent of causation as it is not asserted that my sister bought some sugar.
My sister went to the grocery store in order to buy some sugar.
Degree of extension and state of boundedness: Degree of extension refers to the quality of an action depicted as either a single point in time or a duration of various lengths. State of boundedness is about this action being either fully bounded, i.e. delimited in time from both beginning and end, partially bounded, or unbounded. In the sentence below, and based on the adverbial "منذ سنوات" (for years), the act of "not visiting" cannot be reduced to a point on the timeline, but is rather to be drawn out as a period or span that started in the past and is relevant to the time of speaking, an implicit "now." The starting point in the past is unspecified. Therefore, the action of "not visiting" is partially bounded as it has only a right-hand boundary (now), as modelled below:
These elements contribute to a contextual tense that requires a combination of the present tense and the perfect aspect.
I have not visited him for years.
In the following example, the act of "living" extends over a long period of time (almost 2 years). This act is characterised by full boundedness as we are able to identify the starting point and endpoint.
Being equally relevant to another point in time in the future, the sentence is to be rendered through the use of the future perfect.
After one month, I will have lived in this city for two years.
Pace and time lapse: These refer to the quality of actions immediately following one another, with a minimum time gap to separate them. In the sentence below, there is no time lapse between the process of doing expressed by the verb "فتح," "to open," and the process of sensing expressed by the verb "تذكر," "to remember." Here, there is also an implicit "كان" before the modalised preposition "على" that is normally rendered into "have to," "must," etc. This favours a rendition along the lines of:
The moment he opened his book, he remembered that he had to call his brother.
Clearly, these categories combine into a construct that contributes to the building of the overall contextual tense. In the example below, the action of "watching" is presented as one instance (uniplexity); there is no internal interruption in the flow of the action (dividedness), and no beginning or ending boundaries for the action (boundedness). Furthermore, the action of "watching" is drawn out over a short period of time (degree of extension) in which the action of "visiting" occurs (no time lapse).
All these elements combined contribute to the rendering of the sentence as follows:
I was watching TV yesterday when my friend came to visit me.
At a subsequent stage, the reference translations thus obtained through reliance on cognitive categories were submitted to two bilingual raters for further validation. The percentage of agreement was found to reach 97.9%. This was considered good enough to proceed with the project.
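The article does not state how the percentage of agreement was computed. As a minimal sketch, assuming simple item-by-item agreement between the verdicts of two raters, the figure could be obtained as follows. The function name and the toy verdicts are hypothetical and are not taken from the study's data.

```python
def percent_agreement(rater_a, rater_b):
    """Share of items (in %) on which two raters give the same verdict.

    Assumption: each rater supplies one verdict per item, in the same order.
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must judge the same non-empty item list")
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return 100.0 * matches / len(rater_a)


# Hypothetical verdicts on four reference translations (1 = accept, 0 = reject).
toy_a = [1, 1, 0, 1]
toy_b = [1, 1, 1, 1]
agreement = percent_agreement(toy_a, toy_b)
```

Simple percentage agreement does not correct for chance agreement; a chance-corrected coefficient would be an alternative design choice, but nothing in the text indicates that one was used.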
Five online MT services were used in the project, namely, Google Translate, Microsoft Bing Translator, Reverso Translation, Systran Translate, and Yandex Translate. All these systems are based on NMT technology, include Arabic on their language list, and offer free unlimited usage without requiring the creation of a user account. With these features, these five MT engines represent services that users can refer to on an ad hoc basis.
The use of five engines is not intended as a comparative exercise. Rather, bringing together results from different engines (as is described in the following paragraphs) is meant to help transcend individual engine manifestations and establish conclusions that relate to NMT as a global approach. Other than this methodological motivation, the use of multiple engines is equally meant to help dispel the misconception, confirmed by anecdotal evidence, particularly in some parts of the Middle East, whereby online MT services are typically equated with one specific engine, namely Google Translate. This state of affairs could be attributable to, or even partake of the same phenomenon as, the ubiquity of the browser and search engine of the same company. Proper investigations delving into the whys and wherefores of MT engine market shares in this part of the world deserve attention but go beyond the scope of the present project.
For the particulars of the current investigation, each of the SL sentences in the test suite was input into the five online NMT services and the output translations were collected into the database. Based on the reference translation, the MT outputs were assessed for the tense/aspect combination rendered (see Klubička et al. 2018 for the adoption of a similar approach). The assessment scale is binary, assigning "1" for an output that is similar to the reference translation and "0" for one that deviates from it. Renderings which did not conform to the reference translation but which were still acceptable were considered correct. Conversely, renderings containing errors that affected meaning were not accepted.
Because the purpose of the exercise is not to evaluate the performance of specific engines or to compare between them but rather to come up with a general appreciation of NMT rendering, a score representing the sum of the points gathered from the five individual MT outputs for each sentence was calculated. Each individual sentence score thus potentially ranges between 0 and 5. Finally, because the test suite sentences were sorted into the 12 tense/aspect combinations available in the translations, the scores of sentences belonging to each of the same tense/aspect combinations were added up and turned into a percentage to give an appreciation of the accuracy achieved for each specific combination.
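The scoring procedure described above can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the authors' actual tooling: the function names and the toy records are hypothetical, and the percentage is assumed to be the points earned divided by the maximum attainable points (five per sentence) for each tense/aspect combination.

```python
from collections import defaultdict

ENGINES = 5  # the study collected output from five NMT engines


def sentence_score(judgements):
    """Sum the binary judgements (1 = acceptable, 0 = not) for one sentence."""
    if len(judgements) != ENGINES or any(j not in (0, 1) for j in judgements):
        raise ValueError("expected five binary judgements per sentence")
    return sum(judgements)  # ranges from 0 to 5


def accuracy_by_combination(records):
    """Aggregate sentence scores into a percentage per tense/aspect combination.

    records: iterable of (combination_label, judgements) pairs.
    Assumption: the denominator is 5 points per sentence in the combination.
    """
    earned = defaultdict(int)
    possible = defaultdict(int)
    for label, judgements in records:
        earned[label] += sentence_score(judgements)
        possible[label] += ENGINES
    return {label: 100.0 * earned[label] / possible[label] for label in earned}


# Hypothetical toy data, not the study's results.
toy_records = [
    ("past simple", [1, 1, 1, 1, 1]),
    ("present perfect", [1, 1, 1, 1, 0]),
    ("present perfect", [0, 0, 0, 0, 0]),
]
toy_accuracy = accuracy_by_combination(toy_records)
```

Pooling the five engines before computing the percentage mirrors the stated aim of characterising NMT as a whole rather than ranking individual services.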
MT output was collected for evaluation over a period extending from May to June 2021. The data in Figure 2 show that none of the MT engines displays appreciably better overall output, and that the five NMT engines tend to have comparable results with similar overall strengths and weaknesses, despite differences that may appear for individual tenses.
The all-engine results (Figure 3) show discrepant tense rendering accuracy rates. One can distinguish an initial group of tenses with accuracy rates that exceed 90%. This group includes the future continuous, the future simple, the present perfect continuous, the present simple, the future perfect, and the present continuous. Within this group, output errors are usually due to an issue in the processing of the whole sentence rather than in the rendering of the tense (e.g. processing the sentence as imperative instead of indicative). Renderings for the remaining tenses are more erratic. The perfect aspect represents the common denominator in tenses with the lowest accuracy rates. However, given that two other tenses using the perfect aspect belong to the top-ranking group, it is not possible to establish any causal relationship between this aspect and accuracy rates. In fact, the overall results do not favour any patterns along tense/aspect lines.
Coupling the initial results provided above with the working principles of the NMT technology (Section 1), it is the main contention of the present project that the differences in rendering accuracies are to be explained in terms of tense types (morphological, structural, or contextual) along the lines defined in Almanna (2018, 62-4). In theory, every Arabic sentence carries morphological, structural, and contextual tenses. However, these tenses can be aligned, converging into the same tense, or they can point in different directions. Whether converging or not, these tense types need to be taken into consideration for an adequate rendering of the sentence. Results from the experiment show higher accuracy rates in instances where tense types within the sentence are aligned. The examples below illustrate different instances of this scenario. In sentence (1), the morphological "past" in "سافر" coincides with a contextual focus on completion (point of emphasis), which makes the action emerge as a uniplex point in time, and an extent of causation that relates to the past "أمس" (yesterday). The past simple is fully reflected in the five NMT systems. The same applies to sentence (2) where the morphological "present" is reinforced by a contextual focus on habituality (point of emphasis), an iteration in time (multiplexity) with interruptions (state of dividedness), but without boundaries (state of boundedness). Again, the present simple is fully reflected. Similarly, in sentence (3), the "future" is structurally expressed in Arabic by using the bound morpheme "سـ" or free morpheme "سوف" with the verb conjugated in the imperfect form (مضارع). This future orientation is contextually reinforced by uniplexity and futurity (scope of intention), especially with the phrase " ." Furthermore, the act of "marrying" is neither drawn out (degree of extension) nor delimited in time (state of boundedness).
As a last example of converging tense types, the structural form " " in (5) calls for the use of the past perfect. To explain, by virtue of the structural form " ," the action of "working" is past-oriented (extent of causation) and drawn out over a period of time (degree of extension) with a point of emphasis on the whole period that began in the past (unspecified) and seen as relevant to another point in the past (unspecified), thus having no boundaries (state of boundedness). Conversely, when tense types show less concurrence, the NMT rendering of tenses becomes more erratic. The following two sentences illustrate this unpredictability with the present perfect. In sentence (6), the verb is morphologically in the past. However, the presence of "في الآونة الأخيرة," "lately, recently," orients interpretation in terms of right-handed partial boundedness that relates a multiplex action to the present, thus favouring the use of the present perfect. Four out of the five MT engines produced this tense. This high accuracy seems to be the outcome of the specific presence of the item "lately" in the sentence. In other words, the association of "lately" with the present perfect represents a common pattern. In sentence (7), the verb "سافر," which is morphologically conjugated in the past, clashes, on account of the second clause "وسوف يعود غدا," with partial right-hand boundedness and has to be interpreted as a period of time related to the present by virtue of the reference to the future. These elements also require the use of the present perfect. However, this time, MT rendering was less satisfactory. In comparison with the previous example, this can be explained in terms of the absence of a pattern associated with the present perfect.
My father travelled to Egypt and will return tomorrow.

Systran Translate
My father travelled to Egypt and will return tomorrow.

Microsoft Bing Translator
My father travelled to Egypt and will return tomorrow.

Reverso Translation
My father travelled to Egypt and will return tomorrow.

Yandex Translate
My father travelled to Egypt and will return tomorrow.
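
As an illustration of how such renderings might be screened during PE training, the following minimal Python sketch checks the engine outputs listed above for a present perfect form. It is not part of the study's method: the function name `uses_present_perfect`, the small list of irregular participles, and the regular expression are assumptions made for this example, and the heuristic is far cruder than a real tense tagger (it would, for instance, misread "have seven").

```python
import re

# Heuristic only (an assumption of this sketch, not the study's procedure):
# "has/have" + optional -ly adverb + a word that looks like a past participle.
IRREGULAR = {"gone", "been", "done", "left", "come", "made"}
HAVE_PLUS_WORD = re.compile(r"\b(?:has|have)\s+(?:\w+ly\s+)?(\w+)\b", re.IGNORECASE)


def uses_present_perfect(sentence: str) -> bool:
    """Return True if the sentence appears to contain a present perfect form."""
    for match in HAVE_PLUS_WORD.finditer(sentence):
        word = match.group(1).lower()
        if word.endswith(("ed", "en")) or word in IRREGULAR:
            return True
    return False


# Engine outputs for sentence (7) as listed above.
outputs = {
    "Systran Translate": "My father travelled to Egypt and will return tomorrow.",
    "Microsoft Bing Translator": "My father travelled to Egypt and will return tomorrow.",
    "Reverso Translation": "My father travelled to Egypt and will return tomorrow.",
    "Yandex Translate": "My father travelled to Egypt and will return tomorrow.",
}

# Engines whose output contains a present perfect form.
hits = [engine for engine, text in outputs.items() if uses_present_perfect(text)]
print(f"{len(hits)}/{len(outputs)} listed outputs use the present perfect")
```

A check of this kind only flags candidate sentences for a post-editor's attention; deciding whether the present perfect is actually required still depends on the contextual analysis described above.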
In sentence (8), the second verb "يرمي" is conjugated in the present. Contextually, this second action is characterised by having more than one element (multiplexity), having no boundaries (state of boundedness), and having interruptions or breaks (state of dividedness) with a focus on continuity (point of emphasis). The action of the first verb, "قال," conjugated in the past simple, occurred in the middle of the action of "يرمي," thus indicating no time lapse and imposing past-orientation (extent of causation). These elements converge on a rendering of the second verb in the past continuous. However, this was only reflected in one NMT engine, as shown below.

He threw some incense into the coal pot.

Microsoft Bing Translator
He said as he threw some incense into the embers bowl.

Reverso Translation
He said throw in the bowl of embers some incense.

Yandex Translate
He said throw in the bowl of embers some incense.
In sentence (9), the second verb "تعمل" is conjugated in the present form [she works]. The point of emphasis here is on continuity. The phrase "منذ سنة" [for a year] imposes a right-hand partial boundedness on the act of working, characterised in this example by uniplexity and by being internally continuous (state of dividedness). However, the action does not relate to the present but to a point in the past (extent of causation) dictated by the use of the past in the first verb "أخبرتني" [she told me]. These elements lead to the use of the past perfect continuous, which is only partially reflected in NMT output.

With unf througher⁵ than worry, she told me that a year ago she worked for a big company in the city centre.

Reverso Translation
With an uneasy joy, she told me that years ago, she worked at a big company downtown.

Yandex Translate
With joy and anxiety she told me that a year ago she worked for a big company in the middle of town.
⁵ Typo in original.
The future perfect continuous seems to be particularly arduous to render, with NMT engines often generating the future perfect instead. In sentence (10), the second verb is morphologically in the past form "تعلمت," "I learned." In terms of structural tense, we have the structure "سأكون قد," "I will be already," which refers to something achieved at one point in the future.

Stative verbs
This section attempts an investigation of possible meaningful NMT output patterns in terms of verb types. The focus addresses in particular the category of stative verbs. This category concerns verbs that depict a state rather than an action and have the particularity of not being conjugated in the continuous aspect. In other words, stative verbs represent a case of a semantic component that imposes restrictions on the morpho-syntactic forms the verb can assume. This focus examines renderings yielding stative verbs that fall into the following two categories: verbs that exclusively express a stative meaning (e.g. remember) and verbs that express two meanings, one referring to a state and one referring to an action (e.g. think).
In the first case scenario, all engines provide a satisfactory rendering. This means that in no case did the rendering involve the erroneous use of the continuous aspect (sentences 11, 12, and 13).

We believe in your ability to solve the problem.

Sentence 13
We wish you all the best.
The second case scenario, where the English verb can serve both as a stative verb and an action verb, can in turn be broken down into two subcategories. The first is when the state and action meanings each correspond to a different verb in Arabic. In sentences 14 and 15, the two uses of the verb "think" correspond to two distinct verbs in Arabic, "اعتقد" and "فكّر," respectively. In this case, NMT engines tend to render the aspect satisfactorily. The second subcategory represents cases where the Arabic verb itself can, mainly from the perspective of point of emphasis, be used to refer to either a state or continuity. From an NMT perspective, the morphological, structural, and contextual elements provided in the Arabic sentence in this case may not be sufficient to determine the exact point of emphasis. This is especially the case when the verb is used in the present (imperfect) form, making the sentence open to interpretations yielding both simple and continuous aspects (Sentence 16). This ambiguity is reflected in the NMT engine output, with an average of over 46% of renderings using the continuous and the remaining cases using the simple aspect. Adding a contextual element, such as "الآن" (now), to the sentences does not seem to yield better results, with the average share of renderings that use the continuous aspect barely exceeding 53%. However, the output is markedly improved (86%) once structural elements are available, as when the particle "كان" is used to express the past tense.

All these results lead to the conclusion that NMT systems rely more on morphological and structural tenses for rendering verbs. Admittedly, NMT does not apply any morpho-syntactic analysis to the input text. However, morphological and structural manifestations of tenses are stable enough to represent patterns on which the NMT engine relies.
In cases where the rendering of the tense has to take into consideration conflicting tense types coexisting in the sentence, NMT engines are relatively less successful. Knowledge of these weaknesses is considered of major significance for the training of post-editors.
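
The contrast between congruent and conflicting tense types can be made concrete with a small annotation sketch. The Python example below is only an illustration under stated assumptions: the class `TenseProfile`, its field names, and the coarse labels "past" and "present" are hypothetical, and the values assigned to sentences (6), (8), and (9) are a simplified reading of the descriptions given earlier, not the study's own coding scheme. The item numbered 0 is an invented, fully congruent case added for contrast.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TenseProfile:
    """Hypothetical annotation of one test sentence (names are assumptions)."""
    sentence_id: int
    morphological: str           # time orientation of the verb form itself
    structural: Optional[str]    # orientation imposed by particles, if any
    contextual: str              # orientation the context calls for

    def is_congruent(self) -> bool:
        """True when all annotated tense types point to the same orientation."""
        layers = {self.morphological, self.contextual}
        if self.structural is not None:
            layers.add(self.structural)
        return len(layers) == 1


items = [
    TenseProfile(6, "past", None, "present"),   # past form, "lately" relates it to the present
    TenseProfile(8, "present", None, "past"),   # present form, past-oriented by the first verb
    TenseProfile(9, "present", None, "past"),   # present form, past-oriented by "she told me"
    TenseProfile(0, "past", "past", "past"),    # hypothetical congruent item
]

# Sentences whose tense types conflict are candidates for closer PE attention.
flagged = [p.sentence_id for p in items if not p.is_congruent()]
print(flagged)  # [6, 8, 9]
```

A conflict flag does not predict an error on its own: sentence (6) was rendered correctly by four engines despite the conflict, because "lately" is itself a stable pattern. The flag merely marks where a post-editor may want to look first.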
The findings of this study have to be seen in light of a number of limitations. First, the decision to use a test suite of carefully curated sentences was a response to the qualitative orientation of the project and the need to systematically cover the possible tense/aspect combinations and assess the MT output emerging from the experiment. However, the use of sentences that display single phenomena admittedly represents a limitation. A focus on the rendering of more complex structures, although more challenging in terms of evaluation, may yield insightful results that complement those obtained in the current study.
In what represents another limitation, the experiment followed the modus operandi of MT systems by adopting the sentence as a unit of analysis. Given the present state of the technology, NMT engines typically analyse sentences independently from each other. These engines will therefore remain totally blind to contextual elements that appear in sentences surrounding the one being processed. In sentence (18), the first verb, "forget," is to be rendered in the present perfect, taking contextual information available in the rest of the sentence into account. The present perfect in this sentence is rendered by four of the MT engines used in the experiment, whereas the fifth engine generated a past simple. However, when the clause containing this verb is severed from the rest of the sentence through a full stop (sentence 19), none of the MT renderings used the present perfect, with four of them shifting to the past simple. Contextual information is no longer available within the sentence being processed.

Another sequel to the investigation would be to further fine-tune investigations into verb categories. Verb patterns, such as the one associated with the expression of annoyance and requiring the use of the continuous aspect, represent a viable investigation path.

Finally, given that NMT is constantly improving, the output had to be repeatedly checked throughout the project stages to look for changes. Results obtained in the current project represent a snapshot of the current state of this technology.

Conclusions and recommendations
With the increasing interest in the editing of MT output as a viable step in the workflow of the professional translator, more time is dedicated to MTPE in formal training. However, much remains to be done to identify distinctive features and a typology of errors in this output within different language pairs. Along this line, the present project set out to examine one particular facet of Arabic-into-English NMT output, namely, the rendering of tense and aspect. Focus on verbs is all the more important as (1) the verb represents a pivotal element in the sentence and (2) huge discrepancies exist between Arabic and English verb systems, the range of tenses available in each language and the absence of aspect in Arabic being the most salient.
Results show that tense/aspect rendering is relatively accurate but that this accuracy is not uniformly observed over all tense/aspect combinations. No causal relation between accuracy and specific tenses or aspects was found. However, results clearly showed a significant connection between rendering accuracy and the congruence of the morphological, structural, and contextual tense types at play in the sentence. NMT systems, the fundamental functioning principle of which can be described as pattern matching, find in morphological and structural tenses patterns that are stable enough for their outputs. On the other hand, contextual tense is less morphologically and structurally tangible. In cases where contextual tense conflicts with morphological and structural tenses, NMT engines were found to be less successful.
These results point to the need for tense types in Arabic verbs to form a fundamental component of curricula that address MTPE. Heightening translation trainees' awareness of these concepts will represent an advantage when dealing with NMT output. When formal training in MTPE for the specific Arabic-English language pair is informed in this way, it can be more targeted, with trainers knowing in advance what to focus on instead of working on a random palette of texts. These findings equally underscore the viability of the framework using morphological, structural, and contextual tenses in formalising the description of the Arabic tense system and invite further, and more finely tuned, investigations along these lines.