Corpus Linguistics and Linguistic Theory

Founded by Gries, Stefan Th. / Stefanowitsch, Anatol

Ed. by Wulff, Stefanie

2 Issues per year

IMPACT FACTOR 2017: 1.200
5-year IMPACT FACTOR: 1.386

CiteScore 2017: 0.80

SCImago Journal Rank (SJR) 2017: 0.288
Source Normalized Impact per Paper (SNIP) 2017: 0.930

DART – The dialogue annotation and research tool

Martin Weisser
  • Corresponding author
  • National Key Research Center for Linguistics and Applied Linguistics, Guangdong University of Foreign Studies, Guangzhou, Guangdong, China
Published Online: 2016-01-09 | DOI: https://doi.org/10.1515/cllt-2014-0051


Corpus-based research into pragmatics suffers from a distinct lack of suitably annotated corpora. This has so far generally forced researchers in corpus-based pragmatics to focus on well-known fixed expressions (e.g. discourse markers, politeness formulae) rather than investigating interaction at the level of speech acts and other pragmatics-relevant features on a larger scale. This article describes a research environment that aims to remedy this problem (currently for English only) by making large-scale annotation of, and research into, speech acts and other linguistic levels possible in an efficient manner, and discusses the difficulties and complexities inherent in such an endeavour. It then illustrates, in a brief case study, the efficiency of the approach and how the resulting annotations improve on existing models. The case study includes an illustrative discussion of the tool's performance in annotating a subset of 100 files from the Switchboard corpus, plus a more detailed comparison of the automatically annotated version of one of the files with its original, manually annotated version.

Keywords: corpus-based pragmatics; dialogue annotation; annotation tools



About the article

Published Online: 2016-01-09

Published in Print: 2016-10-01

Citation Information: Corpus Linguistics and Linguistic Theory, Volume 12, Issue 2, Pages 355–388, ISSN (Online) 1613-7035, ISSN (Print) 1613-7027, DOI: https://doi.org/10.1515/cllt-2014-0051.

©2016 by De Gruyter Mouton.

Citing Articles

Lihe Huang
Digital Scholarship in the Humanities, 2018, Volume 33, Number 2, Page 316
