Published by De Gruyter Oldenbourg, May 5, 2021

Argument parsing via corpus queries

Natalie Dykes, Stefan Evert, Merlin Göttlinger, Philipp Heinrich and Lutz Schröder


We present an approach to extracting arguments from social media, exemplified by a case study on a large corpus of Twitter messages collected under the #Brexit hashtag during the run-up to the referendum in 2016. Our method is based on constructing dedicated corpus queries that capture predefined argumentation patterns following standard Walton-style argumentation schemes. Query matches are transformed directly into logical patterns, i.e., formulae with placeholders in a general form of modal logic. We prioritize precision over recall, exploiting the fact that the sheer size of the corpus still delivers substantial numbers of matches for all patterns, with the goal of eventually gaining an overview of widely used arguments and argumentation schemes. We evaluate our approach in terms of recall on a manually annotated gold standard of 1000 randomly selected tweets for three selected high-frequency patterns. We also estimate precision by manual inspection of query matches in the entire corpus. Both evaluations are accompanied by an analysis of inter-annotator agreement between three independent judges.
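The evaluation outlined above rests on three quantities: recall against a gold standard, precision over query matches, and agreement between three judges. The following Python sketch shows how such figures could be computed. It is an illustration only: the function names, the toy data and the choice of Fleiss' kappa as the agreement measure are assumptions made here, not details taken from the paper.

```python
from collections import Counter


def precision_recall(predicted: set, gold: set) -> tuple[float, float]:
    """Precision and recall of a set of query matches against a gold set.

    Both arguments are sets of item identifiers (e.g. hypothetical tweet ids).
    """
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


def fleiss_kappa(ratings: list[list[str]]) -> float:
    """Fleiss' kappa for a fixed number of judges per item.

    ratings[i] lists the labels assigned to item i, one label per judge.
    Using Fleiss' kappa here is an assumption of this sketch.
    """
    n_items = len(ratings)
    n_raters = len(ratings[0])
    counts = [Counter(item) for item in ratings]
    categories = {label for item in counts for label in item}

    # Observed agreement: share of agreeing judge pairs, averaged over items.
    per_item = [
        (sum(v * v for v in c.values()) - n_raters) / (n_raters * (n_raters - 1))
        for c in counts
    ]
    observed = sum(per_item) / n_items

    # Chance agreement from the overall label proportions.
    proportions = [
        sum(c[label] for c in counts) / (n_items * n_raters) for label in categories
    ]
    expected = sum(p * p for p in proportions)

    if expected == 1.0:
        # Every judge used the same single label everywhere.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Toy data (hypothetical ids and labels, not results from the study).
matches = {1, 2, 3}
gold = {2, 3, 4, 5}
p, r = precision_recall(matches, gold)

judgements = [["y", "y", "n"], ["n", "n", "y"]]
kappa = fleiss_kappa(judgements)
```

In a precision-oriented setting like the one described, `precision_recall` would be applied per pattern, with the gold set restricted to the annotated sample when computing recall.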




Received: 2020-11-27
Revised: 2021-02-12
Accepted: 2021-03-15
Published Online: 2021-05-05
Published in Print: 2021-02-23

© 2021 Walter de Gruyter GmbH, Berlin/Boston
