Abstract
This article describes a method to analyze characters in a literary text by considering their verbal interactions. This method exploits techniques from computational linguistics to extract all direct speech from a treebank, and to build a conversational network that visualizes the speakers, the listeners and their degree of interaction. We apply this method to create and visualize a conversational network for the Chinese Buddhist Canon. We analyze the protagonists and their interlocutors, and report statistics on their number of utterances and types of listeners, how their speech was reported, and subcommunities in the network.
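As a rough illustration of the data structure behind such a conversational network, the sketch below counts utterances over directed speaker-listener pairs. It assumes the direct speech has already been extracted from the treebank as (speaker, listener) pairs. The function names and the toy speakers "A", "B" and "C" are hypothetical and are not taken from the article or its corpus.

```python
from collections import Counter, defaultdict


def build_network(utterances):
    """Return a Counter mapping each directed (speaker, listener) pair
    to the number of utterances from that speaker to that listener."""
    edges = Counter()
    for speaker, listener in utterances:
        edges[(speaker, listener)] += 1
    return edges


def utterance_counts(edges):
    """Total number of utterances made by each speaker."""
    totals = Counter()
    for (speaker, _listener), count in edges.items():
        totals[speaker] += count
    return totals


def listeners_of(edges):
    """Set of distinct listeners addressed by each speaker."""
    listeners = defaultdict(set)
    for speaker, listener in edges:
        listeners[speaker].add(listener)
    return listeners


# Hypothetical toy input: one (speaker, listener) pair per utterance.
utterances = [("A", "B"), ("B", "A"), ("A", "B"), ("A", "C")]

edges = build_network(utterances)
totals = utterance_counts(edges)
audience = listeners_of(edges)
```

The edge weights give the degree of interaction between two characters, and the per-speaker totals and listener sets correspond to the kinds of statistics the abstract mentions. Detecting subcommunities would need a further graph algorithm that is not shown here.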
© 2016 John Lee, Tak-sum Wong
This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 3.0 License.