
An empirical study on the contribution of formal and semantic features to the grammatical gender of nouns

Ali Basirat, Marc Allassonnière-Tang and Aleksandrs Berdicevskis
From the journal Linguistics Vanguard

Abstract

This study conducts an experimental evaluation of two hypotheses about the contributions of formal and semantic features to the grammatical gender assignment of nouns. One hypothesis (Corbett and Fraser 2000) claims that semantic features dominate formal ones. The other, formulated within optimal gender assignment theory (Rice 2006), states that form and semantics contribute equally. Both hypotheses claim that the combination of formal and semantic features yields the most accurate gender identification. In this paper, we operationalize and test these hypotheses by trying to predict grammatical gender using only character-based embeddings (which capture only formal features), only context-based embeddings (which capture only semantic features), and the combination of both. We perform the experiment on data from three languages with different gender systems (French, German and Russian). Formal features are a significantly better predictor of gender than semantic ones, and the difference in prediction accuracy is very large. Overall, formal features are also significantly better than the combination of form and semantics, but here the difference is very small and the results are not entirely consistent across languages.


Corresponding author: Marc Allassonnière-Tang, Lab Dynamics of Language CNRS UMR 5596, University Lyon 2, Lyon, France, E-mail:

Funding source: IDEXLYON Fellowship Grant

Award Identifier / Grant number: 16-IDEX-0005

Funding source: University of Lyon Grant NSCO ED 476

Award Identifier / Grant number: ANR-10-LABX-0081

Funding source: French National Research Agency

Award Identifier / Grant number: ANR-11-IDEX-0007

Acknowledgement

The authors are thankful for the constructive comments from the anonymous referees and editors, which helped to significantly improve the quality of the paper. Special thanks to Niklas Edenmyr and Joakim Nivre for providing comments on earlier versions of the paper. The authors are fully responsible for all remaining errors.

  1. Research funding: The second author expresses his gratitude for the support of the IDEXLYON Fellowship Grant (16-IDEX-0005), University of Lyon Grant NSCO ED 476 (ANR-10-LABX-0081), and French National Research Agency (ANR-11-IDEX-0007).

Appendix

A. The effect of window size for character-based word embeddings

The following results are based on the validation sets. Three values of character window size are tested (3, 6, and all), based on two types of forms: inflected and lemmatized. In this context, character window size refers to the number of characters kept, in order, at the beginning and the end of a word; the value all covers the entire noun. Table 4 shows the accuracy of the classifiers trained with character-based word embeddings for the different character window sizes. The results for lemmas and inflected forms are shown separately, and their respective performance is compared with the majority baseline, i.e., every noun is assigned the most frequent gender in the data. The performance on inflected forms and lemmas is rather similar. In both cases, the accuracy of the classifiers is significantly higher than the baseline.

Table 4:

The performance of the classifier for predicting the gender of nouns (inflected forms and lemmas) based on character-based word embeddings with character windows of size 3, 6, and all characters. The baseline refers to the majority baseline.

          Size 3           Size 6           Size all         Baseline
          Inflected Lemma  Inflected Lemma  Inflected Lemma  Inflected Lemma
French    90.9      90.9   92.1      92.4   92.0      92.2   54.6      56.2
German    84.3      82.7   92.1      88.7   93.1      88.2   37.2      37.0
Russian   87.7      96.1   88.9      96.3   89.1      96.6   41.8      42.9

Different window sizes yield nearly identical results, suggesting that most of the information about the grammatical gender of nouns is present at the beginning or the end of the noun. This is expected from a linguistic perspective, since most nominal markers are located either at the beginning or at the end of words in the three languages. Overall, these results demonstrate that formal features are very good predictors of grammatical gender.
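
As a concrete illustration of the construction described above, the window extraction and the majority baseline can be sketched as follows (an illustrative sketch in Python, not the authors' code; the toy nouns and function names are ours):

```python
from collections import Counter

def char_window(word, size=None):
    """Keep the first `size` and last `size` characters of a word.

    size=None mimics the 'all' setting, which keeps the entire noun."""
    if size is None or len(word) <= 2 * size:
        return word
    return word[:size] + word[-size:]

def majority_baseline(genders):
    """Accuracy of always predicting the most frequent gender."""
    counts = Counter(genders)
    return counts.most_common(1)[0][1] / len(genders)

# Toy French nouns (m = masculine, f = feminine); labels are illustrative only.
nouns = [("maison", "f"), ("ordinateur", "m"), ("voiture", "f")]
print(char_window("ordinateur", 3))              # 'ordeur': first 3 + last 3 characters
print(majority_baseline([g for _, g in nouns]))  # ~0.67: 'f' is the majority gender
```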

B. The effect of method for context-based word embeddings

The information captured by context-based word embeddings might be influenced by parameters such as the embedding method, corpus type, context size, and noun form. In this section, we study the effect of these parameters on grammatical gender prediction.

Three word embedding methods are used for this study: word2vec (skip-gram) (Mikolov et al. 2013), GloVe (Pennington et al. 2014), and principal word embedding (PWE) (Basirat 2018; Basirat and Nivre 2019). Each method is trained with symmetric bag-of-words contexts of sizes 1–5, on both inflected and lemmatized corpora. Furthermore, two noun forms are distinguished: lemmas versus inflected forms. That is to say, we assess the performance of the classifier both for predicting the gender of lemmatized noun forms and of inflected noun forms; inflected noun forms may, for instance, carry singular or plural case markings. To avoid confusion, the terms 'lemmatized' and 'raw' are used in this section to refer to the embeddings (i.e., to the corpora they are trained on), while the terms 'lemma' and 'inflected' are used to refer to the noun forms.

word2vec offers two types of embedding model: cbow and skip-gram; we use the skip-gram model. The skip-gram model is a two-layer neural network that takes the word in the middle of a sequence of words (the context) as input and predicts the surrounding words within a certain range before and after the input word. Unlike word2vec, which relies on the local occurrences of words, GloVe and PWE rely on both local and global contexts of words. These methods use matrix factorization techniques to train word embeddings from a word co-occurrence matrix that has undergone a transformation function. GloVe applies a static transformation to the co-occurrence data and uses a regression model to factorize the transformed data, whereas PWE applies an adaptive transformation and uses a randomized singular value decomposition to factorize the matrix.
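
For concreteness, a skip-gram model of the kind described above can be trained as follows (a minimal sketch assuming the gensim library, version 4 or later; the corpus file name and hyperparameter values are hypothetical):

```python
from gensim.models import Word2Vec

# One whitespace-tokenized sentence per line; an inflected or a lemmatized
# corpus would be plugged in here.
sentences = [line.split() for line in open("corpus.txt", encoding="utf-8")]

model = Word2Vec(
    sentences,
    sg=1,             # 1 = skip-gram: predict the surrounding words from the centre word
    window=5,         # symmetric bag-of-words context; the study varies this from 1 to 5
    vector_size=300,  # embedding dimensionality
    min_count=5,      # ignore words occurring fewer than 5 times
)

# The resulting vector of a noun serves as its context-based (semantic) feature.
vector = model.wv["maison"]
```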

The results obtained from GloVe, word2vec, and PWE are shown in Figure 3. In terms of noun form, the performance does not vary much between inflected and lemmatized noun forms. The biggest drop in accuracy occurs with Russian, which is not surprising given its complex case inflection system, which differentiates the nominative, accusative, genitive, dative, instrumental, and locative cases. This is expected, since raw noun forms represent a larger and more versatile data source than lemmatized noun forms.

Figure 3:

A cross-lingual comparison of the performance of the classifier for predicting the grammatical gender of nouns from context-based word embeddings trained with different methods and different context sizes.

The performance of the raw embeddings, i.e., the embeddings based on raw (inflected) corpora, is higher than that of the embeddings based on lemmatized corpora. This is expected, since the inflected embeddings may include morphosyntactic features that provide obvious clues to the classifier. Nevertheless, the inflected embeddings trained with different methods behave differently on the task of gender prediction. In general, the best results for each language are obtained from the PWE embeddings at all context sizes, except for GloVe with Russian at larger context sizes. word2vec does not perform as well as GloVe and PWE, especially when the context size increases. The performance of the lemmatized embeddings, i.e., the embeddings based on lemmatized corpora, is much lower than that of the inflected embeddings. Lemmatized corpora thus seem to give a more accurate picture of how helpful purely semantic information is in predicting the grammatical gender of nouns. Nevertheless, both types of embeddings reach accuracies above the baselines, which implies that semantics plays an important role in the association between nouns and their grammatical gender.

Lemmatized embeddings result in similar performance for word2vec, PWE, and GloVe, regardless of the window size. Inflected embeddings, however, lead to different performance across the models. In general, the accuracy of PWE is stable across all window sizes for each language, whereas GloVe performs better with large window sizes and word2vec reaches higher accuracy with small window sizes. Since the variation in accuracy across context sizes is not consistent across embedding methods, it is more likely that the optimal context size does not vary much between languages, and that the small divergences in performance are due to the different structures of the embedding methods. In the current paper, we report the results from GloVe, since it is one of the most widely used methods.

C. The effect of parameters and corpora on the final results

Each language has 180 classification accuracy results: two types of noun encoding (inflected forms and lemmas) × two types of corpora (inflected and lemmatized) × three context-based embedding methods (word2vec, GloVe, and PWE), each trained with five context sizes, combined with character-based word embeddings trained with three character window sizes (3, 6, all). In total, this gives 180 × 3 = 540 results for the three languages. Rather than presenting the detailed results for every character window size, we plot only the results obtained with a character window size of six, which yielded relatively good results for all languages.
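
The configuration count can be verified with a short enumeration (a minimal sketch; the labels are purely descriptive strings):

```python
from itertools import product

noun_forms    = ["inflected", "lemma"]        # 2 noun encodings
corpora       = ["inflected", "lemmatized"]   # 2 corpus types
methods       = ["word2vec", "GloVe", "PWE"]  # 3 context-based embedding methods
context_sizes = [1, 2, 3, 4, 5]               # 5 context sizes
char_windows  = [3, 6, "all"]                 # 3 character window sizes

configs = list(product(noun_forms, corpora, methods, context_sizes, char_windows))
print(len(configs))      # 180 results per language
print(len(configs) * 3)  # 540 results for French, German, and Russian
```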

Figure 4 summarizes the results obtained from the combination of the formal and semantic features, using character-based and context-based word embeddings, respectively. As explained above, the character-based embeddings are trained with only six characters at the beginning and the end of words, while the context-based embeddings are trained with different context sizes on inflected and lemmatized corpora. Context size does not have a strong effect on classification performance: the optimal context size is rather stable with lemmatized embeddings and fluctuates more with inflected embeddings, but this fluctuation is related to the embedding model rather than to the data. In general, PWE and GloVe yield relatively good results for all languages, with GloVe requiring a larger context size for German. word2vec performs as well as PWE and GloVe only for Russian.

Figure 4:

A cross-lingual comparison of the performance of the classifier for predicting the grammatical gender of nouns from both character-based and context-based word embeddings. The character-based word embeddings are trained with a character window size of 6, while the context-based embeddings are trained with different context sizes and different embedding methods.

In terms of languages, the performance on German is generally lower than on French and Russian. When only context-based embeddings are used, the performance on French is consistently the highest. When character-based and context-based embeddings are combined, the performance is high for all three languages, especially for Russian, which almost reaches ceiling. This variation indicates that the amount of information encoded in form and semantics varies across languages, but it does not directly affect the current analysis, which focuses on relative accuracy within each language.

In terms of semantics and form, given the similarity of results between inflected forms and lemmas, only the results for the latter are shown in Table 5. The 'Sem.' columns show the accuracy of the classifier based on inflected and lemmatized embeddings. The 'Form + Sem.' columns show the performance of the classifier when both character-based and context-based embeddings are considered.

Table 5:

The accuracy of the grammatical gender classification task based on features defined over lemma forms (Form), semantics (Sem.), and their combination (Form + Sem.). The word embeddings representing the semantics of words are trained on lemmas as noun forms, with inflected (Inf.) and lemmatized (Lem.) corpora. The results are obtained from the test sets with the PWE embedding method, context size 5, and character window size 6.

           Form   Sem.          Form + Sem.
                  Inf.   Lem.   Inf.   Lem.
French     0.92   0.94   0.69   0.96   0.90
German     0.88   0.82   0.50   0.90   0.86
Russian    0.96   0.89   0.58   0.98   0.97

Formal features yield high accuracy on the classification task. In all cases, the accuracy based on character embeddings is higher than the accuracy based on lemmatized embeddings. The combination of lemmatized embeddings and character embeddings does not surpass the performance of character embeddings alone (except for Russian, and only by a small margin). Adding inflected embeddings helps the classifier reach an accuracy higher than with character embeddings alone. However, this high performance is very likely due to the morphosyntactic cues rather than the semantic information in the inflected embeddings.
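
The form + semantics combination can be sketched as a simple feature concatenation (an illustrative sketch, not the authors' exact architecture; the file names and classifier settings are hypothetical):

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

char_vecs = np.load("char_embeddings.npy")     # shape (n_nouns, d_char): formal features
ctx_vecs  = np.load("context_embeddings.npy")  # shape (n_nouns, d_ctx): semantic features
genders   = np.load("genders.npy")             # one gender label per noun

# Form + semantics: one concatenated feature vector per noun.
X = np.concatenate([char_vecs, ctx_vecs], axis=1)

X_train, X_test, y_train, y_test = train_test_split(X, genders, test_size=0.2, random_state=0)
clf = MLPClassifier(hidden_layer_sizes=(100,), max_iter=500).fit(X_train, y_train)
print(clf.score(X_test, y_test))  # classification accuracy on the held-out set
```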

D. The effect of dimensionality for both types of embeddings

The number of dimensions of word embeddings plays a vital role in the type and amount of information captured (Yin and Shen 2018). Too many dimensions may lead to overfitting, and too few may not be enough to capture the required information about words. We study the effect of the dimensionality of the word embeddings on grammatical gender classification as follows. We start with the context-based embeddings and train different sets of embeddings for each language with different numbers of dimensions. Then, we fix the context-based embeddings to the set that results in the highest accuracy (i.e., the 1,000-dimensional embeddings in this case) and study the effect of the dimensionality of the character-based embeddings on the performance of the neural network. We report the average accuracy of 10 trials for each set of embeddings. Table 6 summarizes the average accuracy of the classifier trained with GloVe word embeddings of different numbers of dimensions.

Table 6:

The average accuracy of grammatical gender classification using word embeddings with different numbers of dimensions.

           Context-based                                     Character-based
           50     100    500    750    1,000  1,250  1,500   50     100    500
French     0.65   0.68   0.72   0.72   0.73   0.73   0.73    0.91   0.91   0.91
German     0.48   0.49   0.51   0.51   0.52   0.52   0.52    0.88   0.88   0.89
Russian    0.54   0.57   0.59   0.59   0.60   0.60   0.60    0.97   0.97   0.97

On the left side, the results for the context-based embeddings show that the amount of gender-related information increases with the number of dimensions. The best results are obtained from 500-dimensional word embeddings onward, after which the accuracy reaches a plateau. Additional experiments with higher dimensions indicate that the model starts overfitting and the accuracy drops; as an extreme example, the accuracy on Russian with 5,000 dimensions is 0.37. The results for the character-based embeddings show no strong relationship between their number of dimensions and the performance of the classifier, indicating that the dimensionality of the character-based embeddings is not an influential parameter for the task. Taken together, these results show that even with higher dimensionality, context-based embeddings do not outperform character-based embeddings.
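
A minimal sketch of the dimensionality sweep described above, averaging accuracy over repeated trials (`load_gender_data` and `train_embeddings` are hypothetical helpers standing in for data loading and GloVe training):

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

def mean_accuracy(X, y, n_trials=10):
    """Average classifier accuracy over n_trials random train/test splits."""
    scores = []
    for seed in range(n_trials):
        X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=seed)
        clf = MLPClassifier(hidden_layer_sizes=(100,), max_iter=500, random_state=seed)
        scores.append(clf.fit(X_tr, y_tr).score(X_te, y_te))
    return float(np.mean(scores))

nouns, genders = load_gender_data("french")      # hypothetical data loader
for dim in [50, 100, 500, 750, 1000, 1250, 1500]:
    X = train_embeddings(nouns, dimensions=dim)  # hypothetical GloVe wrapper
    print(dim, mean_accuracy(X, genders))
```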

References

Allassonnière-Tang, Marc & Marcin Kilarski. 2020. Functions of gender and numeral classifiers in Nepali. Poznan Studies in Contemporary Linguistics 56(1). 113–168. https://doi.org/10.1515/psicl-2020-0004.

Altinok, Duygu. 2018. DEMorphy: German language morphological analyzer. arXiv:1803.00902.

Basirat, Ali. 2018. Principal word vectors. Uppsala: Acta Universitatis Upsaliensis PhD thesis. Available at: http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-353866.

Basirat, Ali & Joakim Nivre. 2019. Real-valued syntactic word vectors. Journal of Experimental and Theoretical Artificial Intelligence 32(4). 557–579. https://doi.org/10.1080/0952813X.2019.1653385.

Basirat, Ali & Marc Tang. 2018. Lexical and morpho-syntactic features in word embeddings: A case study of nouns in Swedish. In Proceedings of the 10th International Conference on Agents and Artificial Intelligence, vol. 2, 663–674. Setúbal: SciTePress. https://doi.org/10.5220/0006729606630674.

Bojanowski, Piotr, Edouard Grave, Armand Joulin & Tomas Mikolov. 2017. Enriching word vectors with subword information. Transactions of the Association for Computational Linguistics 5. 135–146. https://doi.org/10.1162/tacl_a_00051.

Boleda, Gemma. 2020. Distributional semantics and linguistic theory. Annual Review of Linguistics 6(1). 213–234. https://doi.org/10.1146/annurev-linguistics-011619-030303.

Bonami, Olivier, Matías Guzman Naranjo & Delphine Tribout. 2019. The role of morphology in gender assignment in French. Paper presented at the International Symposium of Morphology. Paris: Laboratoire de linguistique formelle.

Cao, Kris & Marek Rei. 2016. A joint model for word embedding and word morphology. In Proceedings of the 1st Workshop on Representation Learning for NLP, 18–26. Berlin: Association for Computational Linguistics. https://doi.org/10.18653/v1/W16-1603.

Chen, Xinxiong, Lei Xu, Zhiyuan Liu, Maosong Sun & Huanbo Luan. 2015. Joint learning of character and word embeddings. In Proceedings of the 24th International Conference on Artificial Intelligence (IJCAI'15), 1236–1242. Palo Alto: AAAI Press. Available at: http://dl.acm.org/citation.cfm?id=2832415.2832421.

Comrie, Bernard. 1999. Grammatical gender systems: A linguist's assessment. Journal of Psycholinguistic Research 28(5). 457–466. https://doi.org/10.1023/A:1023212225540.

Contini-Morava, Ellen & Marcin Kilarski. 2013. Functions of nominal classification. Language Sciences 40. 263–299. https://doi.org/10.1016/j.langsci.2013.03.002.

Corbett, Greville. 1982. Gender in Russian: An account of gender specification and its relationship to declension. Russian Linguistics 6. 197–232. https://doi.org/10.1007/BF03545848.

Corbett, Greville G. 1991. Gender. Cambridge: Cambridge University Press. https://doi.org/10.1017/CBO9781139166119.

Corbett, Greville G. 2013a. Number of genders. In Matthew S. Dryer & Martin Haspelmath (eds.), The world atlas of language structures online. Leipzig: Max Planck Institute for Evolutionary Anthropology.

Corbett, Greville G. 2013b. Systems of gender assignment. In Matthew S. Dryer & Martin Haspelmath (eds.), The world atlas of language structures online. Leipzig: Max Planck Institute for Evolutionary Anthropology.

Corbett, Greville G. & Norman Fraser. 2000. Gender assignment: A typology and a model. In Gunter Senft (ed.), Systems of nominal classification, 293–325. Cambridge: Cambridge University Press.

Corteen, Emma. 2019. The assignment of grammatical gender in German: Testing optimal gender assignment theory. Cambridge: Cambridge University PhD thesis.

Culbertson, Jennifer, Annie Gagliardi & Kenneth Smith. 2017. Competition between phonology and semantics in noun class learning. Journal of Memory and Language 92. 343–358. https://doi.org/10.1016/j.jml.2016.08.001.

Culbertson, Jennifer, Hanna Jarvinen, Frances Haggarty & Kenneth Smith. 2019. Children's sensitivity to phonological and semantic cues during noun class learning: Evidence for a phonological bias. Language 95(2). 268–293. https://doi.org/10.1353/lan.0.0234.

Devlin, Jacob, Ming-Wei Chang, Kenton Lee & Kristina Toutanova. 2019. BERT: Pre-training of deep bidirectional transformers for language understanding. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, vol. 1 (long and short papers), 4171–4186. Minneapolis: Association for Computational Linguistics. Available at: https://www.aclweb.org/anthology/N19-1423.

Dye, Melody, Petar Milin, Richard Futrell & Michael Ramscar. 2017. A functional theory of gender paradigms. In Ferenc Kiefer, James Blevins & Huba Bartos (eds.), Perspectives on morphological organization. Leiden: Brill. Available at: https://brill.com/view/book/edcoll/9789004342934/B9789004342934_011.xml (accessed 29 March 2020).

Fedden, Sebastian & Greville G. Corbett. 2019. The continuing challenge of the German gender system. Paper presented at the International Symposium of Morphology. Paris: Laboratoire de linguistique formelle.

Ginter, Filip, Jan Hajič, Juhani Luotolahti, Milan Straka & Daniel Zeman. 2017. CoNLL 2017 shared task – automatically annotated raw texts and word embeddings. Available at: http://hdl.handle.net/11234/1-1989.

Gonen, Hila, Yova Kementchedjhieva & Yoav Goldberg. 2019. How does grammatical gender affect noun representations in gender-marking languages? In Proceedings of the 2019 Workshop on Widening NLP, 64–67. Florence: Association for Computational Linguistics. https://doi.org/10.18653/v1/K19-1043.

Grinevald, Colette. 2015. Linguistics of classifiers. In James D. Wright (ed.), International encyclopedia of the social and behavioral sciences, 811–818. Oxford: Elsevier. https://doi.org/10.1016/B978-0-08-097086-8.53003-7.

Hinton, Geoffrey E., Nitish Srivastava, Alex Krizhevsky, Ilya Sutskever & Ruslan Salakhutdinov. 2012. Improving neural networks by preventing co-adaptation of feature detectors. CoRR. Available at: http://dblp.uni-trier.de/db/journals/corr/corr1207.html#abs-1207-0580.

Kann, Katharina. 2019. Grammatical gender, neo-Whorfianism, and word embeddings: A data-driven approach to linguistic relativity. Ithaca: Cornell University.

Kemmerer, David. 2017. Categories of object concepts across languages and brains: The relevance of nominal classification systems to cognitive neuroscience. Language, Cognition and Neuroscience 32(4). 401–424. https://doi.org/10.1080/23273798.2016.1198819.

Kramer, Ruth. 2020. Grammatical gender: A close look at gender assignment across languages. Annual Review of Linguistics 6(1). 45–66. https://doi.org/10.1146/annurev-linguistics-011718-012450.

Lakoff, George & Mark Johnson. 2003. Metaphors we live by. London: The University of Chicago Press. https://doi.org/10.7208/chicago/9780226470993.001.0001.

Lebret, Rémi & Ronan Collobert. 2015. Rehabilitation of count-based models for word vector representations. In Alexander Gelbukh (ed.), Computational linguistics and intelligent text processing, 417–429. Cham: Springer. https://doi.org/10.1007/978-3-319-18111-0_31.

LeCun, Yann A., Léon Bottou, Genevieve B. Orr & Klaus-Robert Müller. 2012. Efficient backprop. In Neural networks: Tricks of the trade, 9–48. Berlin: Springer. https://doi.org/10.1007/978-3-642-35289-8_3.

McCann, Bryan, James Bradbury, Caiming Xiong & Richard Socher. 2017. Learned in translation: Contextualized word vectors. In I. Guyon, U. V. Luxburg, S. Bengio, H. Wallach, R. Fergus, S. Vishwanathan & R. Garnett (eds.), Advances in neural information processing systems, vol. 30, 6294–6305. New York: Curran Associates. Available at: http://papers.nips.cc/paper/7209-learned-in-translation-contextualized-word-vectors.pdf.

Melamud, Oren, David McClosky, Siddharth Patwardhan & Mohit Bansal. 2016. The role of context types and dimensionality in learning word embeddings. In Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 1030–1040. San Diego: Association for Computational Linguistics. https://doi.org/10.18653/v1/N16-1118.

Mikolov, Tomas, Kai Chen, Greg Corrado & Jeffrey Dean. 2013. Efficient estimation of word representations in vector space. In 1st International Conference on Learning Representations (ICLR 2013), workshop track proceedings. Scottsdale, Arizona. Available at: http://arxiv.org/abs/1301.3781.

Nastase, Vivi & Marius Popescu. 2009. What's in a name? In some languages, grammatical gender. In Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing, 1368–1377. Singapore: Association for Computational Linguistics. Available at: https://www.aclweb.org/anthology/D09-1142.

Nesset, Tore. 2006. Gender meets the usage-based model: Four principles of rule interaction in gender assignment. Lingua 116. 1369–1393. https://doi.org/10.1016/j.lingua.2004.06.012.

Pennington, Jeffrey, Richard Socher & Christopher Manning. 2014. GloVe: Global vectors for word representation. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), 1532–1543. Doha: Association for Computational Linguistics. https://doi.org/10.3115/v1/D14-1162.

Peters, Matthew E., Mark Neumann, Mohit Iyyer, Matt Gardner, Christopher Clark, Kenton Lee & Luke Zettlemoyer. 2018. Deep contextualized word representations. In Proceedings of the 16th Annual Conference of the North American Chapter of the Association for Computational Linguistics. New Orleans: Association for Computational Linguistics. https://doi.org/10.18653/v1/N18-1202.

Prince, Alan & Paul Smolensky. 1993. Optimality theory: Constraint interaction in generative grammar. Boulder: Rutgers University and University of Colorado.

Rice, Curt. 2006. Optimizing gender. Lingua 116. 1394–1417. https://doi.org/10.1016/j.lingua.2004.06.013.

Sahlgren, Magnus. 2006. The word-space model. Stockholm: Stockholm University PhD thesis.

Schütze, Hinrich. 1992. Dimensions of meaning. In Proceedings of the 1992 ACM/IEEE Conference on Supercomputing, 787–796. IEEE Computer Society Press. https://doi.org/10.1109/SUPERC.1992.236684.

Senft, Gunter. 2000. What do we really know about nominal classification systems? In Gunter Senft (ed.), Systems of nominal classification, 11–49. Cambridge: Cambridge University Press.

Williams, Adina, Damian Blasi, Lawrence Wolf-Sonkin, Hanna Wallach & Ryan Cotterell. 2019. Quantifying the semantic core of gender systems. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), 5734–5739. Hong Kong: Association for Computational Linguistics. https://doi.org/10.18653/v1/D19-1577.

Yin, Zi & Yuanyuan Shen. 2018. On the dimensionality of word embedding. In Advances in Neural Information Processing Systems 31, 887–898.

Yu, Xiang, Agnieszka Falenska & Ngoc Thang Vu. 2017. A general-purpose tagger with convolutional neural networks. In Proceedings of the First Workshop on Subword and Character Level Models in NLP, 124–129. Copenhagen: Association for Computational Linguistics. https://doi.org/10.18653/v1/W17-4118.

Received: 2019-12-10
Accepted: 2020-12-08
Published Online: 2021-03-17

© 2021 Walter de Gruyter GmbH, Berlin/Boston