
The Prague Bulletin of Mathematical Linguistics

The Journal of Charles University

2 Issues per year

Open Access
Online
ISSN
1804-0462

Continuous Learning from Human Post-Edits for Neural Machine Translation

Marco Turchi / Matteo Negri / M. Amin Farajian / Marcello Federico
Published Online: 2017-06-06 | DOI: https://doi.org/10.1515/pralin-2017-0023

Abstract

Improving machine translation (MT) by learning from human post-edits is a powerful solution that is still largely unexplored in the neural machine translation (NMT) framework. In this scenario too, effective techniques for continuously tuning an existing model on a stream of manual corrections would have several advantages over current batch methods: first, they would make it possible to adapt systems at run time to new users and domains; second, adaptation would come at a lower computational cost than retraining the NMT system from scratch or in batch mode. To address the problem, we explore several online learning strategies that stepwise fine-tune an existing model on the incoming post-edits. Our evaluation on data from two language pairs and different target domains shows significant improvements over the use of static models.
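The generic online-adaptation loop described above can be sketched in a few lines. This is not the authors' NMT system: a toy linear model stands in for the network, and a squared-error update stands in for the NMT training objective. The names (`online_adapt`, `w_true`) and all numeric settings are illustrative assumptions; only the control flow mirrors the idea in the abstract: translate each incoming sentence first, then immediately take a few gradient steps on that single (source, post-edit) pair before moving to the next one.

```python
import numpy as np

def online_adapt(model_w, stream, lr=0.1, steps=3):
    """Stepwise fine-tuning on a stream of human corrections.

    model_w : weight vector of a toy linear "translator" (a stand-in
              for the NMT parameters; hypothetical, for illustration).
    stream  : iterable of (x, y) pairs, where y plays the role of the
              human post-edit of the model's output for input x.
    """
    w = model_w.copy()
    outputs = []
    for x, y in stream:
        outputs.append(w @ x)      # "translate" before seeing the correction
        for _ in range(steps):     # a few SGD steps on this single pair,
            err = w @ x - y        # mimicking one-sentence online adaptation
            w -= lr * err * x
    return w, outputs

# Toy stream: the "post-edits" are consistent with a fixed target model,
# so the adapted weights should drift toward w_true as corrections arrive.
rng = np.random.default_rng(0)
w_true = np.array([2.0, -1.0])
stream = [(x, w_true @ x) for x in rng.normal(size=(200, 2))]
w, outs = online_adapt(np.zeros(2), stream)
```

Because each update touches only the one sentence just corrected, the per-sentence cost is a handful of gradient steps rather than a full retraining pass, which is the computational argument the abstract makes for online over batch adaptation.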

Bibliography

  • Bahdanau, Dzmitry, Kyunghyun Cho, and Yoshua Bengio. Neural Machine Translation by Jointly Learning to Align and Translate. arXiv preprint arXiv:1409.0473, 2014.

  • Bertoldi, Nicola, Mauro Cettolo, and Marcello Federico. Cache-based Online Adaptation for Machine Translation Enhanced Computer Assisted Translation. In Proc. of the XIV Machine Translation Summit, pages 35–42, Nice, France, September 2013.

  • Bojar, Ondřej, et al. Findings of the 2016 Conference on Machine Translation. In Proc. of the First Conference on Machine Translation, pages 131–198, Berlin, Germany, August 2016.

  • Bottou, Léon. Large-Scale Machine Learning with Stochastic Gradient Descent. In Proc. of COMPSTAT'2010, pages 177–187, Paris, France, August 2010. Springer.

  • Cettolo, Mauro, Jan Niehues, Sebastian Stüker, Luisa Bentivogli, Roldano Cattoni, and Marcello Federico. The IWSLT 2015 Evaluation Campaign. In Proc. of the 12th International Workshop on Spoken Language Translation (IWSLT 2015), Da Nang, Vietnam, 2015.

  • Denkowski, Michael, Chris Dyer, and Alon Lavie. Learning from Post-Editing: Online Model Adaptation for Statistical Machine Translation. In Proc. of the 14th Conference of the European Chapter of the Association for Computational Linguistics, Gothenburg, Sweden, April 2014.

  • Duchi, John, Elad Hazan, and Yoram Singer. Adaptive Subgradient Methods for Online Learning and Stochastic Optimization. Journal of Machine Learning Research, 12:2121–2159, 2011.

  • Germann, Ulrich. Dynamic Phrase Tables for Machine Translation in an Interactive Post-editing Scenario. In Proc. of the Workshop on Interactive and Adaptive Machine Translation, pages 20–31, Vancouver, BC, Canada, 2014.

  • Kingma, Diederik P. and Jimmy Ba. Adam: A Method for Stochastic Optimization. In Proc. of the 3rd Int. Conference on Learning Representations, pages 1–13, San Diego, USA, May 2015.

  • Koehn, Philipp. Statistical Significance Tests for Machine Translation Evaluation. In Proc. of the Conference on Empirical Methods in Natural Language Processing, pages 388–395, 2004.

  • Koehn, Philipp. Europarl: A Parallel Corpus for Statistical Machine Translation. In Proc. of the Tenth Machine Translation Summit, pages 79–86, Phuket, Thailand, 2005.

  • Li, Xiaoqing, Jiajun Zhang, and Chengqing Zong. One Sentence One Model for Neural Machine Translation. arXiv preprint arXiv:1609.06490, 2016.

  • Luong, Minh-Thang and Christopher D. Manning. Mixture-Model Adaptation for SMT. In Proc. of the 12th International Workshop on Spoken Language Translation, pages 76–79, Da Nang, Vietnam, December 2015.

  • McCandless, Michael, Erik Hatcher, and Otis Gospodnetic. Lucene in Action. Manning Publications Co., Greenwich, CT, USA, 2010.

  • Ortiz-Martínez, Daniel. Online Learning for Statistical Machine Translation. Computational Linguistics, 42(1):121–161, 2016.

  • Ortiz-Martínez, Daniel, Ismael García-Varea, and Francisco Casacuberta. Online Learning for Interactive Statistical Machine Translation. In Proc. of NAACL-HLT 2010, pages 546–554, Los Angeles, California, June 2010.

  • Pinnis, Marcis, Rihards Kalnins, Raivis Skadins, and Inguna Skadina. What Can We Really Learn from Post-editing? In Proc. of AMTA 2016, vol. 2: MT Users' Track, pages 86–91, Austin, Texas, November 2016.

  • Sennrich, Rico, Barry Haddow, and Alexandra Birch. Neural Machine Translation of Rare Words with Subword Units. In Proc. of the 54th Annual Meeting of the Association for Computational Linguistics, pages 1715–1725, Berlin, Germany, August 2016. Association for Computational Linguistics.

  • Wäschle, Katharina, Patrick Simianer, Nicola Bertoldi, Stefan Riezler, and Marcello Federico. Generative and Discriminative Methods for Online Adaptation in SMT. In Proc. of Machine Translation Summit XIV, pages 11–18, Nice, France, September 2013.

  • Wuebker, Joern, Spence Green, and John DeNero. Hierarchical Incremental Adaptation for Statistical Machine Translation. In Proc. of the 2015 Conference on Empirical Methods in Natural Language Processing, pages 1059–1065, Lisbon, Portugal, September 2015.

  • Zeiler, Matthew D. ADADELTA: An Adaptive Learning Rate Method. arXiv preprint arXiv:1212.5701, 2012.

About the article

Published in Print: 2017-06-01


Citation Information: The Prague Bulletin of Mathematical Linguistics, Volume 108, Issue 1, Pages 233–244, ISSN (Online) 1804-0462, DOI: https://doi.org/10.1515/pralin-2017-0023.


© 2017 Marco Turchi et al., published by De Gruyter Open. This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 3.0 License (CC BY-NC-ND 3.0).
