Open Access. Published by De Gruyter Open Access, July 30, 2022, under a CC BY 4.0 license.

NMT verb rendering: A cognitive approach to informing Arabic-into-English post-editing

Ali Almanna and Rafik Jamoussi
From the journal Open Linguistics

Abstract

Machine translation (MT) has made significant strides and has reached accuracy levels that often make the post-editing (PE) of MT output a viable alternative to manual translation. However, despite professional translators increasingly considering PE as a valid stage in their translation workflow, little has been done to investigate MT output for the purpose of informing training in PE. Against this background, the present project focuses on the handling of tense and aspect configurations in the English translation of Arabic sentences using current neural machine translation (NMT) systems. Using a dataset of representative Arabic sentences, the output of five NMT engines was assessed against reference translations. The investigation reveals accuracy levels that decline from morphological to structural to contextual tenses. These findings are believed to represent valuable information that contributes to a more informed training in the PE of Arabic-into-English NMT output.

1 Introduction

Machine translation (MT) has drastically evolved from the simplistic systems of the late 1940s to present-day neural network sophistication. With this evolution, MT output quality has, notwithstanding language pair variations, witnessed impressive improvements that are steadily bringing it closer to human translation standards (Le and Schuster 2016). Understandably, MT output is still not perfect. Yet, its quality is now often sufficient for rapid information purposes, i.e. uses where only the gist of the source language (SL) text is needed (Hutchins 2001, 14). Even when the translation job is intended for more official usage and dissemination, MT output is increasingly considered a starting point for post-editing (PE)-based translation (see, for instance, O’Brien 2002).

In the area of translation, PE, or machine translation post-editing (MTPE), is the editing by human operators of MT output. This editing is meant to address issues, gaps, inconsistent terminology, etc. that are generated in the automatic translation process, with the intent to produce a polished translation version of a professional human translation quality (Allen 2003). The integration of MTPE in the translation workflow is supposed to improve translators’ productivity. The validity of these claims notwithstanding,[1] PE has had a major impact on professional translation workflow. As such, MTPE has become a major item in the skill set of professional translators (O’Brien 2002, Pym 2013), one that requires dedicated training. Prior awareness of weaknesses that are proper to MT output can therefore prove useful for the training of post-editors as it can help inform the translator/post-editor in their work.

From an Arabic-into-English translation perspective, this study focuses on the English verb as a linguistic aspect of the MT output. The verb is a key element of the sentence as it encapsulates multiple meaning components. Out of this multiplicity, the present project narrows down on manifestations of tense and aspect. This focus is doubly motivated. First, there is the belief that it would be arduous to attempt, within the scope of one article, a comprehensive coverage of all elements of interest along the lines proposed in the present investigation. Second, tense and aspect capture several of the characteristic distinctions between English and Arabic, which makes their study a valid starting point. By its very nature, this scope signifies that out of the indicative, subjunctive, and imperative moods, the indicative is the most propitious mood for the study, as both the subjunctive and the imperative usually imply the use of the bare form of the verb and consequently hamper any discussion in terms of tense and aspect.

Tense and aspect represent two of the properties with which the English finite verb is injected (Almanna 2016a, 2016b, Kearns 2000/2011, Leech and Svartvik 2002, 66). English tenses are typically divided into “past,” “present,” and “future” (see for example Biber et al. 2002, Coe et al. 2006, Freeborn 1987, among others).[2] In contrast, Arabic has three basic tenses: “ماض” past, “مضارع” present, and “أمر” imperative. The imperative, by nature, has a futuristic element, which has led some Arabic grammarians to refer to it as “future” (see Āl Sāqy 1977, Āl Syrāfī 1986, H̱ārūn 1977).

For some grammarians, tenses in Arabic fall into either of only two classes: perfect and imperfect. According to Wright (1967), the perfect is used to describe an action or event which is completed in relation to other actions or events – usually in the past and is, therefore, part of reality. The imperfect, on the other hand, is used to describe an action or event which is not completed – usually in the present or future and is, as such, part of irreality.

Grammatical aspect represents a mark on the particulars of the action carried out by the verb, such as manner, perspective, completion, continuation, the frequency and regularity of an act as a matter of routine, the duration of the act, and the continuity of the act at a particular point in time (Gadalla 2017, 30, Quirk et al. 1972, 90, Radwan 1975, 30). English is usually described as having four aspect types, namely, simple, perfect, progressive, and perfect progressive (cf. Almanna 2016a, 2016b, Celce-Murcia and Larsen-Freeman 1999, Griffiths 2006, Kearns 2000/2011, Kreidler 1998). Conversely, Arabic has no such grammatical category, with the consequence that any of the features which are expressed through grammatical aspect in English are lexicalised in Arabic, i.e. expressed through lexical and phrasal resources.

As a consequence of this discrepancy in the availability of different tense/aspect categories, Arabic-into-English translators need to glean clues from both grammatical and lexical contextual elements to render the tense and aspect of the English verb. To account for the factors that go into the apprehension of the Arabic sentence tense, Almanna proposes a trichotomy that consists of “morphological tense,” “structural tense,” and “contextual tense” (2018, 62–4). Morphological tense is the tense as is expressed through the verb morphology. Structural tense is the sum of verb morphology and bound or free morphemes, aspectual words, and other items that typically point at a particular tense. Finally, contextual tense is the resultant of lexical and grammatical clues other than the verb group that are scattered within or sometimes even outside the boundaries of a sentence. In the two example sentences below, morphological tense is the “present,” as can be gathered by zooming in on the verb itself “يتصل.” However, this morphological tense represents, in this case, an element of structural tense whereby the negative particle “لم” associated with the verb in the morphological present refers to the “past.”

لم1 يتّصل2 بي3 اِبني4 في الآونة الأخيرة5.

[Did not]1 [he calls]2 [me]3 [my son]4 [in recent days]5.

.لم1 يتّصل2 بي3 اِبني4 أمس5

[Did not]1 [he calls]2 [me]3 [my son]4 [yesterday]5.

In the first sentence, the phrase “في الآونة الأخيرة” (in recent days/lately) leads to a contextual tense that emphasises a period that began in the past and is considered relevant to the moment of speaking. This is typically expressed in English through the present perfect. Conversely, the effect of the contextual time marker “أمس” (yesterday) in the second example places emphasis on the completion of the action of not calling over a specific period in the past and can therefore be translated into a simple past. Relying on all these elements rather than adhering to morphological or structural tenses alone, the two sentences should be rendered as follows:

  1. My son has not called me lately.

  2. My son did not call me yesterday.
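The three-way distinction applied to these two sentences can be written down as a small record. The sketch below is only an illustration in Python: the class name `TenseLayers`, its field names, and the `rendering` helper are assumptions introduced here and are not part of Almanna's (2018) framework. The tense labels themselves are those given in the discussion above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TenseLayers:
    """Hypothetical record of the three tense readings of one sentence."""
    morphological: str  # read off the verb form alone
    structural: str     # verb form plus particles such as the negative "لم"
    contextual: str     # reading reached once adverbials and context are weighed

    def rendering(self) -> str:
        # Assumption: the English tense/aspect follows the contextual reading.
        return self.contextual


# Both sentences share the verb and the particle; only the adverbial differs.
lately = TenseLayers("present", "past", "present perfect")
yesterday = TenseLayers("present", "past", "simple past")

for label, layers in (("lately", lately), ("yesterday", yesterday)):
    print(label, "->", layers.rendering())
```

Keeping the three readings side by side makes it visible that the two sentences are identical at the morphological and structural levels and diverge only at the contextual level.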

With this in mind, the present project aims to investigate how satisfactorily neural machine translation (NMT) systems render tense and aspect in Arabic-into-English translation and what characteristics this rendering exhibits. From the perspective of NMT, the latest mainstream MT generation, the handling of tense and aspect in this language pair is particularly interesting, not because the morpho-syntactic and cognitive analysis processes that are needed to render it are complex, but because NMT does not rely on any of these analyses.

NMT is the latest variant of a shift in MT development that relies on aligned bilingual corpora. Statistical machine translation (SMT) systems, the earlier systems adopting this approach, use frequency calculations to provide the most frequent equivalent to each identified phrase. NMT systems are similar to SMT, except that they use a different computational approach called neural networks in exploiting this corpus (Forcada 2017).

Reliance on corpora to train MT systems came as a reaction to the limitations of earlier rule-based machine translation (RBMT) systems. In addition to dictionaries, an RBMT engine typically incorporates sets of SL grammatical rules used to parse the source text and sets of target language (TL) rules to generate the TL text. The limitations of these engines were due to the irregularities of human languages and the challenges that come with the attempt to code parsing modules that integrate a comprehensive set of grammatical rules with their exceptions and variations.

The internal functioning of NMT systems restricts and, in fact, orients the way the topic of accuracy is to be approached. The failure of the output to conform to a reference translation cannot be considered a “mistake” or an “error,” since such an interpretation would erroneously imply the use of modules that specifically address parsing or morpho-syntactic/cognitive analyses. The statistical calculations that define NMT equally mean that the output can be unpredictable. The production of a satisfactory output for a specific linguistic aspect, be it morphological, syntactic, or other, does not mean that this aspect will always be rendered correctly. Taken together, these two issues imply that we can only speak of relative strengths and weaknesses that emerge following the analysis of multiple illustrative cases.

Approaching MT as end users rather than natural language processing (NLP) specialists, our intent aligns with the methodological restrictions explained above and consists in using the conclusions of the investigation to inform training in PE. Despite its particulars, the present project remains fundamentally an instance of MT output quality evaluation. Within NLP circles, the focus on MT output quality is primarily undertaken for the purpose of gauging progress made by different engines and enhancing MT technologies (Arthur et al. 2016, Bentivogli et al. 2016, Chatzikoumi 2020, Hayakawa and Arase 2020, Popović 2018, Toral and Sánchez-Cartagena 2017, Vilar et al. 2006, Zakraoui et al. 2020).

Here, the researcher can opt for either of the automatic or manual (subjective) evaluation options (see Chatzikoumi 2020, Forcada 2017, 304–5). Automatic evaluation usually relies on a reference translation that is prepared and validated for the purpose of the assessment project (Forcada 2017, 304, Popović 2018, 130). This evaluation involves the implementation of a number of metrics such as (automatic) text similarity measures (the Bilingual Evaluation Understudy or BLEU); reordering of lexical items as well as morphology and lexical errors (Forcada 2017, 305. See also Bentivogli et al. 2016, Papineni et al. 2002, Toral and Sánchez-Cartagena 2017); adequacy, fluency, and informativeness (Doddington 2002, 138–9); fluency and intelligibility (Reeder 2001); and general text matcher and translation edit rate (O’Brien 2011). Subjective or human evaluation, on the other hand, considers relatively blurry aspects such as fluency and PE effort (Forcada 2017, 305).

There is equally a growing body of scholarship investigating MT output quality from a pedagogical perspective that focuses on ways to better prepare trainees for MTPE. The thrust of research here has remained confined to the two closely intertwined aspects of PE-related skills and course design and components.[3] The focus on PE-specific skills partakes of the broader endeavour to produce a comprehensive mapping of translation skills, as with the Process in the Acquisition of Translation Competence and Evaluation (PACTE) project (Rico Pérez and Torrejón 2012). Different skill categorisations are provided depending on how generic or specific the framework is (Koponen 2015, Pym 2013, Rico Pérez and Torrejón 2012). As for course design and components, course proposals commonly place PE in tandem with MT (Doherty et al. 2012, Koponen 2015, O’Brien 2002). In line with the general call to integrate technology into practical translation courses rather than leave it as standalone courses, some proposals call for integrating PE into general translation courses (Mellinger 2017, Moorkens 2018).

The recognition that MTPE is different from other types of editing and thus requires specific attention equally calls for another line of investigation, one that orients trainees and raises their awareness by charting typical MT errors, as voiced by Depraetere (2010). Projects adopting this line of investigation can be traced back to Loffler-Laurian (1983; see also Schäfer 2003). Unfortunately, this type of error analysis has remained underexplored, especially in MT involving Arabic. The present project attempts to fill this gap. The research questions can be formulated as follows:

  1. How accurate is the NMT handling of tense and aspect in Arabic-into-English translation?

  2. What features impact this accuracy?

The following section provides an overview of the methodological aspects taken into consideration in the present output assessment exercise. Section 3 provides and discusses salient findings in the project. Recommendations for PE are made in the conclusion.

2 Methodology

Drawing on NLP tradition, there are generally two principal aspects to decide upon when devising an MT output assessment test. These relate to the dataset to be used for the assessment and the evaluation method to adopt.

2.1 The database

The performance evaluation of NLP tools such as parsers or MT engines usually uses a set of sentences called an evaluation database.[4] Two types of databases are usually adopted: corpora and test suites. The choice between them typically depends on the purpose of the study. Corpora represent the first option in studies of a quantitative nature where the representativeness of the phenomena under investigation, as reflected in a corpus, is central to the research purpose (Lloberes et al. 2014, 87). Conversely, the use of a test suite is more appropriate in studies of a qualitative nature where phenomena under scrutiny need to be clearly isolated for analysis. In other words, the focus of projects using test suites is on coverage comprehensiveness rather than representativeness.

For the present study, the test suite option has been adopted given the study’s purpose of providing as exhaustive an investigation as possible of the MT rendering of tense and aspect manifestations. In other words, the frequency of occurrence of a particular tense or aspect in language does not represent a parameter in the investigation. Of more consequence for the study is providing a comprehensive coverage of these investigated phenomena.

Along the lines described in Burchardt et al. (2017, 160), the test suite for the study consists of 147 sentences derived from Almanna (2018), a pedagogical resource used for an introductory course in translation. These sentences are used in class to learn how to deal with morphological, structural, and contextual tenses in Arabic-to-English translation. The pedagogical origin of the test suite consolidates the fact that it is comprehensive, covering the 12 English tense/aspect combinations of the indicative mood (Figure 1). With this number of sentences, the test suite is of a relatively small scale (Coughlin 2003, 63). However, this is compensated for by the narrow focus of the study and the use of five different MT engines to process each of the sentences (see below).

Figure 1: Distribution of test suite sentences over tenses.

2.2 The evaluation method

Compared with general assessment tests that appraise the overall performance of a system, the focus of the current project, tense and aspect, is narrow. This limited scope renders unnecessary many of the metrics used in general assessment tests to secure objective and replicable results. For the current investigation, use was made of manual assessment that relies on a reference translation of the test suite Arabic sentences. The reference translation was produced based on analysis of the morphological, structural, and contextual tense types described in Section 1. To formalise the description of contextual tense, use was made of cognitive categories developed by Almanna (2022) that consist of point of emphasis, plexity, scope of intention and extent of causation, pace and time lapse, state of dividedness, state of boundedness, and degree of extension. These categories help define the contextual tense and account for the tense/aspect combination that is appropriate for each case.

Point of emphasis: This is the aspect of the action which the sentence focuses on. This emphasis can be on completion, duration, continuity, habituality, regularity, frequency, etc. In the example below, and by virtue of the phrase “منذ الصباح,” the emphasis is placed on duration, the whole period that began in the past and is seen as relevant to another point in the present.

.منذ الصباح وأنا أراجع دروسي
[Since the morning] [and] [I] [I review] [my lessons].

In the same example, there is equally a focus on continuity, as attention is drawn to the middle phase of the action of “revising” rather than to its beginning or end. This favours seeing the action as an ongoing activity. The combination of these two points of emphasis, i.e. duration and continuity, constitutes the contextual tense that favours rendering the sentence with the use of the present perfect continuous instead of relying on the present morphological tense.

I have been revising my lessons since morning.

Plexity: This refers to whether the action depicted is uniplex, i.e. occurring once, or multiplex, occurring multiple times in a recurrent manner. In the sentence below, the action of “smoking” is uniplex by virtue of the second clause, which depicts an action occurring while the first is ongoing and probably interrupting it.

.كنت أُدخّن في الشارع عندما ناداني والدي
[I was] [I smoke] [in the street] [when] [called me] [my father].

By contrast, the same action of “smoking” in the following sentence is multiplex on account of the clause “عندما كنت شابا” (when I was young), which refers to an extended period of time.

.كنت أُدخّن بشراهة عندما كنت شابا، أما الآن فلا
[I was] [I smoke] [heavily] [when] [I was] [young]. [But now] [no].

In translation, the differences in plexity between the two sentences represent major contextual tense elements. Hence, although the uniplex case requires the use of the continuous aspect, the multiplex case requires an interpretation that highlights habituality in the past. Considering these and other aspects, the two instances can be rendered as follows:

I was smoking in the street when my father called me.

I used to smoke heavily in my youth, but not anymore.

State of dividedness: While plexity refers to the number of times a full action is repeated, state of dividedness relates to the quality of one action being either internally continuous or characterised by internal breaks. Action in the sentence below is characterised by breaks as the action of “sipping” is inherently made up of internal repetitions.

.كانت ترشف الشاي أمام النافذة
[She was] [she sips] [tea] [in front of the window].

She was sipping tea in front of the window.

Conversely, the act of “watching” in the sentence below is characterised by being internally continuous, involving no interruptions.

.كنت أشاهد التلفاز عندما جاء صديقي لزيارتي أمس
[I was] [I watch] [TV] [when] [my friend] [came] [for visiting me] [yesterday].

I was watching TV when my friend came to visit me.

Scope of intention and extent of causation: Scope of intention is the quality inherent to an action that is presented as future-oriented. This contrasts with extent of causation, which is past-oriented. The interpretation of a sentence is usually a matter of balance between these two qualities. In the example below, the extent of causation in the two clauses is larger than the scope of intention as the two acts of “going” and “buying” are asserted. In other words, the emphasis in these two finite clauses is placed on the completion of the two actions, which are consequently described as points on the timeline. This favours a translation making use of the past simple.

.ذهبتْ أختي إلى البقالة أمس واِشترتْ بعض السكر
[My sister] [she went] [to the grocery store] [yesterday] [and] [she bought] [some sugar].

My sister went to the grocery store and bought some sugar.

In contrast, in the sentence below, the scope of intention in the second clause, “to buy some sugar,” is larger than the extent of causation as it is not asserted that my sister bought some sugar.

.ذهبتْ أختي إلى البقالة لتشتري بعض السكر
[My sister] [she went] [to the grocery store] [in order to she buys] [some sugar].

My sister went to the grocery store in order to buy some sugar.

Degree of extension and state of boundedness: Degree of extension refers to the quality of an action depicted as either a single point in time or a duration of various lengths. State of boundedness is about this action being either fully bounded, i.e. delimited in time from both beginning and end, partially bounded, or unbounded. In the sentence below, and based on the adverbial “منذ سنوات” “for years,” the act of “not visiting” cannot be reduced to a point on the timeline, but is rather to be drawn out as a period or span that started in the past and is relevant to the time of speaking, an implicit “now.”

.لم أزره منذ سنوات
[Not] [I visited him] [for years].

The starting point in the past is unspecified. Therefore, the action of “not visiting” is partially bounded as it has only a right-hand boundary (now), as modelled below:

These elements contribute to a contextual tense that requires a combination of the present tense and the perfect aspect.

I have not visited him for years.

In the following example, the act of “living” extends over a long period of time (almost 2 years). This act is characterised by full boundedness as we are able to identify the starting point and endpoint.

.بعد شهر سأكون قد عشتُ في هذه المدينة لمدة سنتين
[After one month] [I will be] [I lived] [in this city] [for a period of two years].

Being equally relevant to another point in time in the future, the sentence is to be rendered through the use of the future perfect.

After one month, I will have lived in this city for two years.

Pace and time lapse: These refer to the quality of actions immediately following one another, with a minimum time gap to separate them. In the sentence below, there is no time lapse between the process of doing expressed by the verb “فتح,” “to open,” and the process of sensing expressed by the verb “تذكر,” “to remember.”

.في اللحظة التي فتح فيها كتابه تذكر أن عليه أن يتصل بأخيه
[At the moment] [which] [he opened] [in it] [his book] [he remembered] [that] [he has to] [he calls] [his brother].

Here, there is also an implicit “كان” before the modalised preposition “على” that is normally rendered into “have to,” “must,” etc. This favours a rendition along the lines of:

The moment he opened his book, he remembered that he had to call his brother.

Clearly, these categories combine into a construct that contributes to the building of the overall contextual tense. In the example below, the action of “watching” is presented as one instance (uniplexity); there is no internal interruption in the flow of the action (dividedness), and no beginning or ending boundaries for the action (boundedness). Furthermore, the action of “watching” is drawn out over a short period of time (degree of extension) in which the action of “visiting” occurs (no time lapse).

.كنت أشاهد التلفاز عندما جاء صديقي لزيارتي أمس
[I was] [I watch] [TV] [when] [my friend] [he came] [for visiting me] [yesterday].

All these elements combined contribute to the rendering of the sentence as follows:

I was watching TV yesterday when my friend came to visit me.

At a subsequent stage, the reference translations thus obtained through reliance on cognitive categories were submitted to two bilingual raters for further validation. The percentage of agreement was found to reach 97.9%. This was considered good enough to proceed with the project.
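The text does not state how the agreement figure was computed. One common reading is simple percent agreement, i.e. the share of items on which the two raters give the same judgement. The Python sketch below illustrates that reading only: the function name `percent_agreement` and the sample judgements are hypothetical and are not taken from the study's data.

```python
def percent_agreement(rater_a, rater_b):
    """Share of items (in %) on which two raters give the same judgement."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must judge the same non-empty item set")
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return 100.0 * matches / len(rater_a)


# Made-up judgements for four reference translations (not the study's data).
rater_1 = ["accept", "accept", "accept", "reject"]
rater_2 = ["accept", "accept", "reject", "reject"]

print(percent_agreement(rater_1, rater_2))  # 75.0
```

Simple percent agreement does not correct for chance agreement. Whether the authors applied such a correction is not indicated.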

Five online MT services were used in the project, namely, Google Translate, Microsoft Bing Translator, Reverso Translation, Systran Translate, and Yandex Translate. All these systems are based on NMT technology, include Arabic on their language list, and offer free unlimited usage without requiring the creation of a user account. With these features, these five MT engines represent services that users can refer to on an ad hoc basis.

The use of five engines is not intended as a comparative exercise. Rather, bringing together results from different engines (as is described in the following paragraphs) is meant to help transcend individual engine manifestations and establish conclusions that relate to NMT as a global approach. Other than this methodological motivation, the use of multiple engines is equally meant to help dispel the misconception, confirmed by anecdotal evidence, particularly in some parts of the Middle East, whereby online MT services are typically equated with one specific engine, namely Google Translate. This state of affairs could be attributable to, or even partake of the same phenomenon as, the ubiquity of the browser and search engine of the same company. Proper investigations delving into the whys and wherefores of MT engine market shares in this part of the world deserve attention but go beyond the scope of the present project.

For the particulars of the current investigation, each of the SL sentences in the test suite was input into the five online NMT services and the output translations were collected into the database. Based on the reference translation, the MT outputs were assessed for the tense/aspect combination rendered (see Klubička et al. 2018 for the adoption of a similar approach). The assessment scale is binary, assigning “1” for an output that is similar to the reference translation and “0” for one that deviates from it. Renderings which did not conform to the reference translation but which were still acceptable were considered correct. Conversely, renderings containing errors that affected meaning were not accepted.

Because the purpose of the exercise is not to evaluate the performance of specific engines or to compare them but rather to come up with a general appreciation of NMT rendering, a score representing the sum of the points gathered from the five individual MT outputs for each sentence was calculated. Each individual sentence score thus potentially ranges between 0 and 5. Finally, because the test suite sentences were sorted into the 12 tense/aspect combinations available in the translations, the scores of sentences belonging to each of the same tense/aspect combinations were added up and turned into a percentage to give an appreciation of the accuracy achieved for each specific combination.
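The scoring procedure can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' own tooling: the function names and the sample records are hypothetical, and the denominator used for the percentage (number of sentences in a combination multiplied by the number of engine outputs) is an assumption, since the text only says that the summed scores were "turned into a percentage."

```python
from collections import defaultdict


def sentence_score(judgements):
    """Sum of the binary judgements (1 = acceptable, 0 = not) for one sentence."""
    return sum(judgements)


def accuracy_by_combination(records):
    """Map each tense/aspect combination to its accuracy in percent.

    `records` holds (combination, judgements) pairs, where `judgements`
    lists one 0/1 value per MT engine output for that sentence.
    """
    totals = defaultdict(lambda: [0, 0])  # combination -> [points, maximum]
    for combination, judgements in records:
        totals[combination][0] += sentence_score(judgements)
        totals[combination][1] += len(judgements)
    return {c: 100.0 * points / maximum for c, (points, maximum) in totals.items()}


# Made-up judgements for three sentences and five engines (not the study's data).
records = [
    ("past simple", [1, 1, 1, 1, 1]),
    ("past simple", [1, 1, 0, 1, 1]),
    ("past perfect", [1, 0, 0, 1, 0]),
]

print(accuracy_by_combination(records))
# {'past simple': 90.0, 'past perfect': 40.0}
```

Pooling the five outputs per sentence before aggregating is what allows the percentages to be read as a statement about NMT in general instead of about any single engine.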

3 Results and discussion

MT output was collected for evaluation over a period extending from May to June 2021. The data in Figure 2 show that none of the MT engines produces appreciably better overall output, and that the five NMT engines tend to have comparable results with similar overall strengths and weaknesses, despite differences that may appear for individual tenses.

Figure 2: Results by MT engine.

The all-engine results (Figure 3) show discrepant tense rendering accuracy rates. One can distinguish an initial group of tenses with accuracy rates that exceed 90%. This group includes the future continuous, the future simple, the present perfect continuous, the present simple, the future perfect, and the present continuous. Within this group, output errors are usually due to an issue in the processing of the whole sentence rather than in the rendering of the tense (e.g. processing the sentence as imperative instead of indicative). Renderings for the remaining tenses are more erratic. The perfect aspect represents the common denominator in tenses with the lowest accuracy rates. However, given that two other tenses using the perfect aspect belong to the top-ranking group, it is not possible to establish any causal relationship between this aspect and accuracy rates. In fact, the overall results do not favour any patterns along tense/aspect lines.

Figure 3: All-engine results.

Coupling the initial results provided above with the working principles of the NMT technology (Section 1), it is the main contention of the present project that the differences in rendering accuracies are to be explained in terms of tense types (morphological, structural, or contextual) along the lines defined in Almanna (2018, 62–4). In theory, every Arabic sentence carries morphological, structural, and contextual tenses. However, these tenses can be aligned, converging into the same tense, or they can point in different directions. Whether converging or not, these tense types need to be taken into consideration for an adequate rendering of the sentence. Results from the experiment show higher accuracy rates in instances where tense types within the sentence are aligned. The examples below illustrate different instances of this scenario.

In sentence (1), the morphological “past” in “سافر” coincides with a contextual focus on completion (point of emphasis), which makes the action emerge as a uniplex point in time, and an extent of causation that relates to the past “أمس” (yesterday). The past simple is fully reflected in the five NMT systems.

Sentence 1

.سافر والدي إلى مصر أمس ليلتقي صديقَهُ
[He travelled] [my father] [to Egypt] [yesterday] [to meet] [his friend].
Google Translate My father travelled to Egypt yesterday to meet his friend.
Systran Translate My father travelled to Egypt yesterday to meet his friend.
Microsoft Bing Translator My father travelled to Egypt yesterday to meet his friend.
Reverso Translation My father travelled to Egypt yesterday to meet his friend.
Yandex Translate My dad flew to Egypt yesterday to meet his friend.

The same applies to sentence (2) where the morphological “present” is reinforced by a contextual focus on habituality (point of emphasis), an iteration in time (multiplexity) with interruptions (state of dividedness), but without boundaries (state of boundedness). Again, the present simple is fully reflected.

Sentence 2

.يجلس والدي في الحديقة صباحا
[He sits] [my father] [in the garden] [in the morning].
Google Translate My father sits in the garden in the morning.
Systran Translate My father sits in the garden in the morning.
Microsoft Bing Translator My father sits in the garden in the morning.
Reverso Translation My dad sits in the park in the morning.
Yandex Translate My father sits in the garden in the morning.

Similarly, in sentence (3), the “future” is structurally expressed in Arabic by using the bound morpheme “سـ” or free morpheme “سوف” with the verb conjugated in the imperfect form (مضارع). This future orientation is contextually reinforced by uniplexity and futurity (scope of intention), especially with the phrase “في السّنةِ المُقبلةِ.” Furthermore, the act of “marrying” is neither drawn out (degree of extension) nor delimited in time (state of boundedness).

Sentence 3

.سوف أتزوّجُ في السّنةِ المُقبلةِ
[Will] [I marry] [next year]
Google Translate I will get married next year.
Systran Translate I’ll be married next year.
Microsoft Bing Translator I’m getting married next year.
Reverso Translation I’m getting married next year.
Yandex Translate I will get married next year.

In sentence (4), the structural form “كنت أقرأ” combines with a focus on duration (point of emphasis), uniplexity, internal continuity (state of dividedness), past orientation (extent of causation), and the lack of boundaries (state of boundedness), all favouring the use of the past continuous, which is fully reflected in the NMT engine outputs.

Sentence 4

.كنت أقرأ رواية في مثل هذا الوقت البارحة
[I was] [I read] [a novel] [at this time] [yesterday].
Google Translate I was reading a novel at this time yesterday.
Systran Translate I was reading a novel this time yesterday.
Microsoft Bing Translator I was reading a novel this time yesterday.
Reverso Translation I was reading a novel like this last night.
Yandex Translate I was reading a novel this time yesterday.

As a last example of converging tense types, the structural form “كنتُ قَد عملتُ” in (5) calls for the use of the past perfect. To explain, by virtue of the structural form “كنتُ قَد عملتُ” along with “لمدة سنتين,” the action of “working” is past-oriented (extent of causation) and drawn out over a period of time (degree of extension). The point of emphasis is on the whole period, which began at an unspecified point in the past and is seen as relevant to another unspecified point in the past, and which therefore has no boundaries (state of boundedness).

Sentence 5

.كنتُ قَد عملتُ في تلك الشركة لمدة سنتين
[I had been] [I worked] [in that company] [for a period of two years]
Google Translate I had worked for that company for two years.
Systran Translate I had worked at that company for two years.
Microsoft Bing Translator I had worked for that company for two years.
Reverso Translation I had worked at that company for two years.
Yandex Translate You have worked in that company for two years.

Conversely, when tense types show less concurrence, the NMT rendering of tenses becomes more erratic. The following two sentences illustrate this unpredictability with the present perfect. In sentence (6), the verb is morphologically in the past. However, the presence of “في الآونة الأخيرة,” “lately, recently,” orients interpretation in terms of right-handed partial boundedness that relates a multiplex action to the present, thus favouring the use of the present perfect. Four out of the five MT engines produced this tense. This high accuracy seems to be the outcome of the specific presence of the item “lately” in the sentence. In other words, the association of “lately” with the present perfect represents a common pattern.

Sentence 6

هل اتّصلت بأختِك في الآونة الأخيرة؟
[Do] [you called] [your sister] [lately]?
Google Translate Have you called your sister recently?
Systran Translate Have you called your sister lately?
Microsoft Bing Translator Have you called your sister lately?
Reverso Translation Have you called your sister lately?
Yandex Translate Did you call your sister lately?

In sentence (7), the verb “سافر,” which is morphologically conjugated in the past, clashes, on account of the second clause “وسوف يعود غدا,” with partial right-hand boundedness and has to be interpreted as a period of time related to the present by virtue of the reference to the future. These elements also require the use of the present perfect. However, this time, MT rendering was less satisfactory: all five engines retained the past simple. In comparison with the previous example, this can be explained in terms of the absence of a pattern associated with the present perfect.

Sentence 7

.سافر والدي إلى مصر وسوف يعود غدا
[He travelled] [my father] [to Egypt] [and] [will] [he returns] [tomorrow].
Google Translate My father travelled to Egypt and will return tomorrow.
Systran Translate My father travelled to Egypt and will return tomorrow.
Microsoft Bing Translator My father travelled to Egypt and will return tomorrow.
Reverso Translation My father travelled to Egypt and will return tomorrow.
Yandex Translate My father travelled to Egypt and will return tomorrow.
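
The contrast between sentences (6) and (7) can be tallied mechanically. The following is a minimal sketch and not part of the study's method: the regular expression `PRESENT_PERFECT` and the helper `engines_using` are hypothetical names introduced here, and the pattern is a crude heuristic that is only assumed to be adequate for the outputs quoted above.

```python
import re

# Hypothetical heuristic: "have/has", an optional intervening word (e.g. the
# subject in a question), then a participle ending in -ed. It is only meant
# to cover the outputs quoted above, not English in general.
PRESENT_PERFECT = re.compile(
    r"\b(?:have|has)\b(?:\s+\w+)?\s+\w+ed\b", re.IGNORECASE
)

# Engine outputs copied from sentences (6) and (7).
SENTENCE_6 = {
    "Google Translate": "Have you called your sister recently?",
    "Systran Translate": "Have you called your sister lately?",
    "Microsoft Bing Translator": "Have you called your sister lately?",
    "Reverso Translation": "Have you called your sister lately?",
    "Yandex Translate": "Did you call your sister lately?",
}
# All five engines returned the same string for sentence (7).
SENTENCE_7 = {
    engine: "My father travelled to Egypt and will return tomorrow."
    for engine in SENTENCE_6
}


def engines_using(pattern, outputs):
    """Return the names of the engines whose output matches the pattern."""
    return [engine for engine, text in outputs.items() if pattern.search(text)]


for label, outputs in (("Sentence 6", SENTENCE_6), ("Sentence 7", SENTENCE_7)):
    hits = engines_using(PRESENT_PERFECT, outputs)
    print(f"{label}: {len(hits)} of {len(outputs)} engines use the present perfect")
```

With the outputs listed above, the tally is four of five engines for sentence (6) and none for sentence (7). A surface pattern of this kind would not replace human rating against reference translations; it only makes the per-sentence counts reproducible.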

In sentence (8), the second verb “يرمي” is conjugated in the present. Contextually, this second action is characterised by having more than one element (multiplexity), having no boundaries (state of boundedness), and having interruptions or breaks (state of dividedness) with a focus on continuity (point of emphasis). The action of the first verb “قال,” conjugated in the past simple, occurs in the middle of the action of “يرمي,” thus indicating no time lapse and imposing past-orientation (extent of causation). Together, these elements call for a rendering of the second verb in the past continuous. However, this was only reflected in one NMT engine, as shown below.

Sentence 8

.قال وهو يرمي في وعاء الجمر بعض البخور
[He said] [while] [he throws] [into the coal bowl] [some incense].
Google Translate He said while throwing some incense in the coal bowl.
Systran Translate He threw some incense into the coal pot.
Microsoft Bing Translator He said as he threw some incense into the embers bowl.
Reverso Translation He said throw in the bowl of embers some incense.
Yandex Translate He said throw in the bowl of embers some incense.

In sentence (9), the second verb “تعمل” is conjugated in the present form [she works]. The point of emphasis here is on continuity. The phrase “منذ سنة” [for a year] imposes a right-hand partial boundedness on the act of working characterised in this example by uniplexity and being internally continuous (state of dividedness). However, the action does not relate to the present but to a point in the past (extent of causation) dictated by the use of the past in the first verb “أخبرتني” [she told me]. These elements lead to the use of the past perfect continuous, which is only partially reflected in NMT output.

Sentence 9

.بفرحة لا تخلو من القلق أخبرتني أنها منذ سنة وهي تعمل في شركة كبيرة في وسط المدينة
[With joy not free from anxiety] [she informed me] [that she] [for a year] [and she works] [in a big company] [in the middle of the city].
Google Translate With joy and unease she told me that for a year she had been working for a big company downtown.
Systran Translate With uneasiness she told me that a year ago she had been working for a large downtown company.
Microsoft Bing Translator With unf througher[5] than worry, she told me that a year ago she worked for a big company in the city centre.
Reverso Translation With an uneasy joy, she told me that years ago, she worked at a big company downtown.
Yandex Translate With joy and anxiety she told me that a year ago she worked for a big company in the middle of town.

The future perfect continuous seems to be particularly arduous to render, with NMT engines often generating the future perfect instead. In sentence (10), the second verb is morphologically in the past form “تعلمت,” “I learned.” In terms of structural tense, we have the structure “سأكون قد,” “I will be already,” which refers to something achieved at one point in the future. The context further conveys a point of emphasis that focuses on continuity through “لمدة عشرين عاما.” This focus on continuity is missing in renderings using the future perfect.

Sentence 10

.عندما أنهي هذه الدورة سأكون قد تعلمت اللغة الإنجليزية لمدة عشرين عاما
[When] [I finish] [this course] [I will be already] [I learned] [English] [for twenty years].
Google Translate When I finish this course I will have been learning English for twenty years.
Systran Translate When I finish this course I will be learning English for twenty years.
Microsoft Bing Translator When I finish this course, I’ll have learned English for 20 years.
Reverso Translation When I finish this course, I’ll have learned English for 20 years.
Yandex Translate When I finish this course I have learned English for twenty years.

Stative verbs

This section attempts an investigation of possible meaningful NMT output patterns in terms of verb types. The focus addresses in particular the category of stative verbs. This category concerns verbs that depict a state rather than an action and have the particularity of not being conjugated in the continuous aspect. In other words, stative verbs represent a case of a semantic component that imposes restrictions on the morpho-syntactic forms the verb can assume.

The analysis examines renderings yielding stative verbs that fall into the following two categories: verbs that exclusively express a stative meaning (e.g. remember) and verbs that express two meanings, one referring to a state and one referring to an action (e.g. think).

In the first case scenario, all engines provide a satisfactory rendering. This means that in no case did the rendering involve the erroneous use of the continuous aspect (sentences 11, 12, and 13).

Sentence 11

.أنا موافق على ما تقوله
I agree with what you say.

Sentence 12

.نحن نؤمن بقدرتك على حل المشكلة
We believe in your ability to solve the problem.

Sentence 13

.نتمنى لكم كل التوفيق
We wish you all the best.

The second case scenario, where the English verb can serve both as a stative verb and an action verb, can in turn be broken down into two subcategories. In the first, each of the state and action meanings corresponds to a different verb in Arabic. In sentences 14 and 15, the two uses of the verb “think” correspond to two distinct verbs in Arabic, “اعتقد” and “فكّر,” respectively. In this case, NMT engines tend to render the aspect satisfactorily.

Sentence 14

.أعتقد أنها فكرة جيدة
I think this is a good idea.

Sentence 15

.نحن نفكر فيما يمكن أن نقوم به
We are thinking about what we can do.

The second subcategory represents cases where the Arabic verb itself can, mainly from the perspective of point of emphasis, be used to refer to either a state or continuity. From an NMT perspective, the morphological, structural, and contextual elements provided in the Arabic sentence in this case may not be sufficient to determine the exact point of emphasis. This is especially the case when the verb is used in the present (imperfect) form, making the sentence open to interpretations yielding both simple and continuous aspects (Sentence 16). This ambiguity is reflected in the NMT engine output with an average of over 46% of renderings using the continuous and the remaining cases using the simple aspect.

Sentence 16

.أواجه صعوبات في الدراسة
[I face] [difficulties] [in] [the study]
Google Translate I am having difficulties studying.
Systran Translate I’m having trouble studying.
Microsoft Bing Translator I’m having trouble studying.
Reverso Translation I have difficulties studying.
Yandex Translate I have difficulties studying.

Adding a contextual element, such as “الآن” (now), to the sentences does not seem to yield markedly better results: on average, barely more than 53% of renderings use the continuous aspect. However, the output is markedly improved (86%) once structural elements are available, as when the particle “كان” is used to express the past tense.

Sentence 17

.كنت أواجه صعوبات في الدراسة
[I was] [I face] [difficulties] [in] [the study]
Google Translate I was having difficulties studying.
Systran Translate I was having trouble studying.
Microsoft Bing Translator I was having difficulties studying.
Reverso Translation I was having difficulties studying.
Yandex Translate I was having difficulties studying.
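
The share of continuous renderings for a single sentence can be computed directly from the engine outputs. The sketch below is illustrative only: `CONTINUOUS` and `continuous_share` are hypothetical names, and the pattern is a rough heuristic assumed to suffice for the outputs quoted in sentences (16) and (17). The averages reported in the text (46%, 53%, and 86%) are computed over the full dataset, which is not reproduced here, so the per-sentence figures differ from them.

```python
import re

# Hypothetical heuristic for the continuous aspect: a form of "be" (full, or
# contracted with a straight or curly apostrophe) directly followed by an
# -ing form. A bare gerund such as "studying" does not match on its own.
CONTINUOUS = re.compile(
    r"(?:\b(?:am|is|are|was|were)|['’](?:m|re|s))\s+\w+ing\b", re.IGNORECASE
)

# Engine outputs copied from sentences (16) and (17).
SENTENCE_16 = {
    "Google Translate": "I am having difficulties studying.",
    "Systran Translate": "I’m having trouble studying.",
    "Microsoft Bing Translator": "I’m having trouble studying.",
    "Reverso Translation": "I have difficulties studying.",
    "Yandex Translate": "I have difficulties studying.",
}
SENTENCE_17 = {
    "Google Translate": "I was having difficulties studying.",
    "Systran Translate": "I was having trouble studying.",
    "Microsoft Bing Translator": "I was having difficulties studying.",
    "Reverso Translation": "I was having difficulties studying.",
    "Yandex Translate": "I was having difficulties studying.",
}


def continuous_share(outputs):
    """Return the fraction of engine outputs that use the continuous aspect."""
    hits = sum(1 for text in outputs.values() if CONTINUOUS.search(text))
    return hits / len(outputs)


print(f"Sentence 16: {continuous_share(SENTENCE_16):.0%} continuous")
print(f"Sentence 17: {continuous_share(SENTENCE_17):.0%} continuous")
```

For these two sentences, three of the five engines use the continuous aspect without the structural cue and all five use it once “كنت” is present, which is in line with the trend described above.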

All these results lead to the conclusion that NMT systems rely more on morphological and structural tenses for rendering verbs. Admittedly, NMT does not apply any morpho-syntactic analysis to the input text. However, morphological and structural manifestations of tenses are stable enough to represent patterns on which the NMT engine relies. In cases where the rendering of the tense has to take into consideration conflicting tense types coexisting in the sentence, NMT engines are relatively less successful. Knowledge of these weaknesses is of major significance for training in PE.

The findings of this study have to be seen in light of a number of limitations. First, the decision on the use of a test suite with carefully curated sentences was justified as a response to the qualitative orientation of the project and the need to systematically cover the possible tense/aspect combinations and assess the MT output emerging from the experiment. However, it is to be admitted that the use of sentences that display single phenomena does represent a limitation. A focus on the rendering of more complex structures, although more challenging in terms of evaluation, may yield insightful results that complement those obtained in the current focus.

In what represents another limitation, the experiment followed the modus operandi of MT systems by adopting the sentence as a unit of analysis. Given the present state of the technology, NMT engines typically analyse sentences independently from each other. These engines will therefore remain totally blind to contextual elements that appear in sentences surrounding the one being processed. In sentence (18), the first verb, “forget,” is to be rendered in the present perfect, taking contextual information available in the rest of the sentence into account. Four of the five MT engines used in the experiment rendered this verb in the present perfect, whereas the fifth generated a past simple. However, when the clause containing this verb is severed from the rest of the sentence through a full stop (sentence 19), none of the MT renderings used the present perfect, with four of them shifting to the past simple. Contextual information is no longer available within the sentence being processed.

Sentence 18

لم أنس ما قاله ابدا خصوصا انني شاركت في عشرات ورشات العمل
والندوات والحوارات والنقاشات.[6]

[Not] [I forgot] [what he said] [at all] [especially] [that I] [I participated] [in] [tens] [workshops] [and seminars] [and dialogues] [and discussions].

Sentence 19

لم أنس ما قاله ابدا. خصوصا انني شاركت في عشرات ورشات
.العمل والندوات والحوارات والنقاشات
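
The effect of the full stop in sentence (19) can be pictured with a small sketch. It assumes, purely for illustration, that an engine splits its input at full stops and then processes each segment on its own; the actual segmentation used by the five engines is not documented here. The names `segment`, `first_unit_sees_context`, and `CONTEXT_CUE` are hypothetical.

```python
import re

# "especially": the word that introduces the supporting context in (18)/(19).
CONTEXT_CUE = "خصوصا"

# Sentences (18) and (19) as single strings; they differ only in the full
# stop inserted after the first clause in (19).
SENTENCE_18 = (
    "لم أنس ما قاله ابدا خصوصا انني شاركت في عشرات ورشات العمل "
    "والندوات والحوارات والنقاشات."
)
SENTENCE_19 = (
    "لم أنس ما قاله ابدا. خصوصا انني شاركت في عشرات ورشات العمل "
    "والندوات والحوارات والنقاشات."
)


def segment(text):
    """Naively split text into segments after each full stop."""
    return [part for part in re.split(r"(?<=\.)\s+", text.strip()) if part]


def first_unit_sees_context(text):
    """Check whether the segment holding the first verb also holds the cue."""
    return CONTEXT_CUE in segment(text)[0]


for label, text in (("Sentence 18", SENTENCE_18), ("Sentence 19", SENTENCE_19)):
    print(label, len(segment(text)), first_unit_sees_context(text))
```

Under this assumption, sentence (18) stays in one segment that contains both the verb and its context, whereas sentence (19) yields two segments and the verb “forget” is left without the clause that motivates the present perfect.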

Another sequel to the investigation would be to further fine-tune investigations into verb categories. Verb patterns, such as the one associated with the expression of annoyance and requiring the use of the continuous aspect, represent a viable investigation path. Finally, given that NMT is constantly improving, the output had to be repeatedly checked throughout the project stages to look for changes. Results obtained in the current project represent a snapshot of the current state of this technology.

4 Conclusions and recommendations

With the increasing interest in the editing of MT output as a viable step in the workflow of the professional translator, more time is dedicated to MTPE in formal training. However, much remains to be done to identify distinctive features and a typology of errors in this output within different language pairs. Along this line, the present project set out to examine one particular facet of Arabic-into-English NMT output, namely, the rendering of tense and aspect. Focus on verbs is all the more important as (1) the verb represents a pivotal element in the sentence and (2) huge discrepancies exist between the Arabic and English verb systems, salient ones being the range of tenses available in each language and the absence of aspect in Arabic.

Results show that tense/aspect rendering is relatively accurate but that this accuracy is not uniformly observed over all tense/aspect combinations. No causal relation between accuracy and specific tenses or aspects was found. However, results clearly showed a significant connection between rendering accuracy and the congruence of morphological, structural, and contextual tense types at play in the sentence. NMT systems, the fundamental functioning principle of which can be described as pattern matching, find in morphological and structural tenses stable enough patterns for their outputs. On the other hand, contextual tense is less morphologically and structurally tangible. In cases where contextual tense conflicts with morphological and structural tenses, NMT engines were found to be less successful.

These results point to the need for a focus on tense types in Arabic verbs as a fundamental component of curricula that address MTPE. Heightening the awareness of translation trainees to these concepts will represent an advantage when dealing with NMT output. By informing formal training in MTPE for the specific Arabic–English language pair, training can be more targeted, knowing in advance what to focus on instead of working on a random palette of texts. These findings equally underscore the viability of the framework using morphological, structural, and contextual tenses in formalising the description of the Arabic tense system and invite further, and more finely tuned, investigations along these lines.

Acknowledgments

The authors thank the colleagues who willingly agreed to rate the reference translations used in the project. The authors are equally grateful to the anonymous reviewers whose comments and suggestions helped improve the manuscript.

  1. Funding information: Open Access funding provided by the Qatar National Library.

  2. Conflict of interest: Authors state no conflict of interest.

References

Allen, Jeffrey. 2003. “Post-editing.” In Computers and translation: A translator’s guide, edited by Harold Somers, 297–317. Amsterdam: John Benjamins. 10.1075/btl.35.19all.

Almanna, Ali. 2016a. The Routledge course in translation annotation: Arabic-English-Arabic. London/New York: Routledge. 10.4324/9781315665580.

Almanna, Ali. 2016b. Semantics for translation students: Arabic-English-Arabic. Oxford: Peter Lang. 10.3726/978-3-0353-0840-2.

Almanna, Ali. 2018. The nuts and bolts of Arabic-English Translation: An introduction to applied contrastive linguistics. Newcastle upon Tyne, England: Cambridge Scholars Publishing.

Almanna, Ali. 2022. “A Configurational system-based approach to translating tenses from Arabic to English.” British Journal of Translation, Linguistics and Literature 2, 1–14. 10.54848/bjtll.v2i1.19.

Al H̱awājah, Faiz. 2021. Al lafẓ wa alfiʿl fī al ṯaqāfah al ʿarabiyah al sāʾidah [words and actions in prevalent Arab culture]. SSRCAW. Retrieved March 20, 2022, from https://www.ssrcaw.org/ar/show.art.asp?aid=733828.

Āl Sāqy, Fadel Mustafa. 1977. Āqsām āl Klām āl ʿarabī min Ḥayṯu āl Šakl wa āl Waẓīfa [Arabic parts of speech. Their forms and functions]. Cairo: Maktabat āl H̱ānǧī.

Āl Syrāfī, Al Hassan. S. 1986. Šarḥ Kitāb Sybāwayh [Explanation of Sibawayh’s book]. Cairo: Āl Hayā āl MasRya āl ʿāma lilkitāb.

Arthur, Philip, Graham Neubig, and Satoshi Nakamura. 2016. “Incorporating discrete translation lexicons into neural machine translation.” In Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, edited by Jian Su, Kevin Duh, Xavier Carreras, 1557–67. Austin, Texas: Association for Computational Linguistics. 10.18653/v1/D16-1162.

Bentivogli, Luisa, Arianna Bisazza, Mauro Cettolo, and Marcello Federico. 2016. “Neural versus phrase-based machine translation quality: A case study.” In Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, edited by Jian Su, Kevin Duh, Xavier Carreras, 257–67. Austin, Texas: Association for Computational Linguistics. 10.18653/v1/D16-1025.

Biber, Douglas, Susan Conrad and Geoffrey Leech. 2002. Longman student grammar of spoken and written English. Essex: Pearson Education Limited.

Bojar, Ondřej, Rajen Chatterjee, Christian Federmann, Yvette Graham, Barry Haddow, Matthias Huck, Antonio Jimeno Yepes, et al. 2016. “Findings of the 2016 conference on machine translation.” In Proceedings of the First Conference on Machine Translation: Volume 2, Shared Task Papers, edited by Ondřej Bojar, Christian Buck, Rajen Chatterjee, Christian Federmann, Liane Guillou, Barry Haddow, Matthias Huck, Antonio Jimeno Yepes, Aurélie Névéol, Mariana Neves, Pavel Pecina, Martin Popel, Philipp Koehn, Christof Monz, Matteo Negri, Matt Post, Lucia Specia, Karin Verspoor, Jörg Tiedemann, Marco Turchi, 131–98. Berlin, Germany: Association for Computational Linguistics. 10.18653/v1/W16-2301.

Burchardt, Aljoscha, Vivien Macketanz, Jon Dehdari, Georg Heigold, Jan-Thorsten Peter, and Philip Williams. 2017. “A linguistic evaluation of rule-based, phrase-based, and neural MT engines.” The Prague Bulletin of Mathematical Linguistics 108(1), 159–170. 10.1515/pralin-2017-0017.

Celce-Murcia, Marianne, and Diane Larsen-Freeman. 1999. The grammar book: An ESL/EFL Teacher’s Course (2nd edition). Boston: Heinle and Heinle Publishers Inc.

Chatzikoumi, Eirini. 2020. “How to evaluate machine translation: A review of automated and human metrics.” Natural Language Engineering 26(2), 137–61. 10.1017/S1351324919000469.

Coe, Norman, Mark Harrison, and Ken Paterson. 2006. Oxford practice grammar. Oxford: Oxford University Press.

Coughlin, Deborah. 2003. “Correlating automated and human assessments of machine translation quality.” In Proceedings of Machine Translation Summit IX: Papers. New Orleans, USA.

Depraetere, Ilse. 2010. “What counts as useful advice in a university post-editing training context? Report on a case.” In Proceedings of the 14th Annual Conference of the European Association for Machine Translation, edited by François Yvon and Viggo Hansen. Saint Raphaël, France: European Association for Machine Translation.

Doddington, George. 2002. “Automatic evaluation of machine translation quality using n-gram co-occurrence statistics.” In Proceedings of the Second International Conference on Human Language Technology Research, edited by Mitchell Marcus, 138–45. San Francisco: Morgan Kaufmann Publishers Incorporated. 10.3115/1289189.1289273.

Doherty, Stephen, Dorothy Kenny, and Andy Way. 2012. “Taking statistical machine translation to the student translator.” In Proceedings of the 10th Conference of the Association for Machine Translation in the Americas: Commercial MT User Program. San Diego, California, USA: Association for Machine Translation in the Americas.

Forcada, Mikel L. 2017. “Making sense of neural machine translation.” Translation Spaces 6(2), 291–309. 10.1075/ts.6.2.06for.

Freeborn, Dennis. 1987. A course book in English grammar. London: Palgrave. 10.1007/978-1-349-18527-6.

Gadalla, Hassan Abdel-Shafik Hassan. 2017. Translating tenses in Arabic-English and English-Arabic contexts. Newcastle upon Tyne, England: Cambridge Scholars Publishing.

Garcia, Ignacio. 2011. “Translating by post-editing: Is it the way forward?” Machine Translation 25(3), 217–37. 10.1007/s10590-011-9115-8.

Guerberof Arenas, Ana. 2008. “Productivity and quality in the post-editing of outputs from translation memories and machine translation.” Localisation Focus: The International Journal of Localisation 7(1), 11–21.

Guerberof Arenas, Ana and Joss Moorkens. 2019. “Machine translation and post-editing training as part of a master’s programme.” Jostrans: The Journal of Specialised Translation 31, 217–38.

Griffiths, Patrick. 2006. An introduction to English semantics and pragmatics. Edinburgh: Edinburgh University Press.

H̱ārūn, Ābdul Salam. 1977. Kitāb sybāwayh [Sibawayh’s book]. Cairo: Āl Hayā āl MasRya āl ʿāma lilkitāb.

Hayakawa, Takeshi and Yuki Arase. 2020. “Fine-Grained Error Analysis on English-to-Japanese Machine Translation in the Medical Domain.” In Proceedings of the 22nd Annual Conference of the European Association for Machine Translation, edited by André Martins, Helena Moniz, Sara Fumega, Bruno Martins, Fernando Batista, Luisa Coheur, Carla Parra, Isabel Trancoso, Marco Turchi, Arianna Bisazza, Joss Moorkens, Ana Guerberof, Mary Nurminen, Lena Marg, and Mikel L. Forcada, 155–64. Instituto Superior Técnico, Lisbon, Portugal: European Association for Machine Translation.

Hutchins, John. 2001. “Machine translation and human translation: In competition or in complementation.” International Journal of Translation 13(1–2), 5–20.

Kearns, Kate. 2000/2011. Semantics. Basingstoke/New York: Palgrave Macmillan. 10.1007/978-0-230-35609-2.

Klubička, Filip, Antonio Toral, and Víctor M. Sánchez-Cartagena. 2018. “Quantitative fine-grained human evaluation of machine translation systems: a case study on English to Croatian.” Machine Translation 32(3), 195–215. 10.1007/s10590-018-9214-x.

Koponen, Maarit. 2015. “How to teach machine translation post-editing? Experiences from a post-editing course.” In Proceedings of the 4th Workshop on Post-Editing Technology and Practice (WPTP4), edited by Sharon O’Brien and Michel Simard, 2–15. Miami, Florida: Association for Machine Translation in the Americas.

Kreidler, Charles. 1998. Introducing English semantics. London/New York: Routledge. 10.4324/9780203265574.

Läubli, Samuel, Chantal Amrhein, Patrick Düggelin, Beatriz Gonzalez, Alena Zwahlen, and Martin Volk. 2019. “Post-editing Productivity with Neural Machine Translation: An Empirical Assessment of Speed and Quality in the Banking and Finance Domain.” In Proceedings of the Machine Translation Summit XVII, edited by Mikel Forcada, Andy Way, Barry Haddow, Rico Sennrich, 267–72. Dublin, Ireland: European Association for Machine Translation.

Le, Quoc V., and Mike Schuster. 2016. “A neural network for machine translation, at production scale.” https://ai.googleblog.com/2016/09/a-neural-network-for-machine.html

Leech, Geoffrey, and Jan Svartvik. 2002. A communicative grammar of English (3rd edition). London/New York: Routledge.

Lloberes, Marina, Irene Castellón, Lluís Padró, and Edgar González. 2014. “Partes: Test suite for parsing evaluation.” Procesamiento del Lenguaje Natural 53, 87–94.

Loffler-Laurian, Anne-Marie. 1983. “Pour une typologie des erreurs dans la traduction automatique.” Multimedia 2(2), 65–78. 10.1515/mult.1983.2.2.65.

Mellinger, Christopher D. 2017. “Translators and machine translation: knowledge and skills gaps in translator pedagogy.” The Interpreter and Translator Trainer 11(4), 280–93. 10.1080/1750399X.2017.1359760.

Moorkens, Joss. 2018. “What to expect from Neural Machine Translation: a practical in-class translation evaluation exercise.” The Interpreter and Translator Trainer 12(4), 375–87. 10.1080/1750399X.2018.1501639.

O’Brien, Sharon. 2002. “Teaching post-editing: a proposal for course content.” In Proceedings of the 6th EAMT Workshop: Teaching Machine Translation, edited by Harold Somers, 99–106. Manchester, England: European Association for Machine Translation.

O’Brien, Sharon. 2011. “Towards predicting post-editing productivity.” Machine Translation 25(3), 197–215. 10.1007/s10590-011-9096-7.

Papineni, Kishore, Salim Roukos, Todd Ward, and Wei-Jing Zhu. 2002. “BLEU: a method for automatic evaluation of machine translation.” In Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics, edited by Pierre Isabelle, Eugene Charniak, and Dekang Lin, 311–8. Philadelphia, Pennsylvania, USA: Association for Computational Linguistics. 10.3115/1073083.1073135.

Popović, Maja. 2018. “Error classification and analysis for machine translation quality assessment.” In Translation quality assessment: From principles to practice, edited by Joss Moorkens, Sheila Castilho, Federico Gaspari, and Stephen Doherty, 129–58. Cham: Springer. 10.1007/978-3-319-91241-7_7.

Pym, Anthony. 2013. “Translation skill-sets in a machine-translation age.” Meta: Journal des Traducteurs/Translators’ Journal 58(3), 487–503. 10.7202/1025047ar.

Quirk, Randolph, Sidney Greenbaum, Geoffrey Neil Leech, and Jan Svartvik. 1972. A grammar of contemporary English. London: Longman.

Quirk, Randolph, and Sidney Greenbaum. 1973. A university grammar of English. London: Longman Group Limited.

Radwan, Muhammad Ramzy El-Sayyed. 1975. A Semantico-Syntactic Study of the Verbal Piece in Colloquial Egyptian Arabic. PhD thesis. London: University of London, UK.

Reeder, Florence. 2001. “In one hundred words or less.” In Paper Presented at the MT Evaluation Workshop MT Summit VIII, edited by Eduard Hovy, Margaret King, Sandra Manzi, and Florence Reeder. Santiago de Compostela, Spain.

Rico Pérez, Celia, and Enrique Torrejón. 2012. “Skills and profile of the new role of the translator as MT post-editor.” Tradumàtica 10, 166–78.

Schäfer, Falko. 2003. “MT post-editing: how to shed light on the ‘unknown task’. Experiences at SAP.” In EAMT Workshop: Improving MT through other Language Technology Tools: Resources and Tools for Building MT. Budapest, Hungary: European Association for Computational Linguistics.

Toral, Antonio, and Víctor M. Sánchez-Cartagena. 2017. “A multifaceted evaluation of neural versus phrase-based machine translation for 9 Language Directions.” In Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics, edited by Mirella Lapata, Phil Blunsom, Alexander Koller, 1062–1072. Valencia, Spain: Association for Computational Linguistics. 10.18653/v1/E17-1100.

Vilar, David, Jia Xu, Luis Fernando d’Haro, and Hermann Ney. 2006. “Error Analysis of Statistical Machine Translation Output.” In Proceedings of the fifth international conference on language resources and evaluation (LREC’06), edited by Nicoletta Calzolari, Khalid Choukri, Aldo Gangemi, Bente Maegaard, Joseph Mariani, Jan Odijk, Daniel Tapias, 697–702. Genoa, Italy: European Language Resources Association (ELRA).

Wright, William. 1967. A grammar of the Arabic Language (3rd edition). Cambridge: Cambridge University Press.

Zakraoui, Jezia, Moutaz Saleh, Somaya Al-Maadeed, and Jihad Mohamad AlJa’am. 2020, April. “Evaluation of Arabic to English machine translation systems.” In 2020 11th International Conference on Information and Communication Systems (ICICS), edited by Ismail Hmeidi, 185–90. Irbid, Jordan: Institute of Electrical and Electronics Engineers (IEEE). 10.1109/ICICS49469.2020.239518.

Received: 2021-07-19
Revised: 2022-05-16
Accepted: 2022-05-17
Published Online: 2022-07-30

© 2022 Ali Almanna and Rafik Jamoussi, published by De Gruyter

This work is licensed under the Creative Commons Attribution 4.0 International License.
