Linguistic Typology

Founded by Plank, Frans

Editor-in-Chief: Koptjevskaja-Tamm, Maria

Volume 20, Issue 3

Linguistic typology in natural language processing

Emily M. Bender
  • Department of Linguistics, University of Washington, Guggenheim Hall, 4th Floor, Box 352425, Seattle, WA 98195, U.S.A.
Published Online: 2016-12-23 | DOI: https://doi.org/10.1515/lingty-2016-0035

Abstract

This paper explores the ways in which the field of natural language processing (NLP) can and does benefit from work in linguistic typology. I describe the recent increase in interest in multilingual natural language processing and give a high-level overview of the field. I then turn to a discussion of how linguistic knowledge in general is incorporated in NLP technology before describing how typological results in particular are used. I consider both rule-based and machine learning approaches to NLP and review literature on predicting typological features as well as that which leverages such features.

About the article

Received: 2016-08-03

Revised: 2016-09-06

Published Online: 2016-12-23

Published in Print: 2016-12-01


Citation Information: Linguistic Typology, Volume 20, Issue 3, Pages 645–660, ISSN (Online) 1613-415X, ISSN (Print) 1430-0532, DOI: https://doi.org/10.1515/lingty-2016-0035.

©2016 by De Gruyter Mouton.
