Crowdsourcing has revolutionised the way tasks can be completed, but the process is frequently inefficient, costing practitioners time and money. This research investigates whether crowdsourcing can be optimised with a validation process, as measured by four criteria: quality, cost, noise and speed. A validation model is described, simulated and tested on real data from an online crowdsourcing game that collects data about human language. Results show that adding an agreement validation (or like/upvote) step means fewer annotations are required, noise and collection time are reduced, and quality may be improved.
Funding statement: The creation of the original game was funded by the Engineering and Physical Sciences Research Council (EPSRC) project AnaWiki, EP/F00575X/1. The analysis of the data was partially funded by an EPSRC Doctoral Training Allowance granted by the University of Essex and by EPSRC grant ES/L011859/1.
About the authors
Dr Jon Chamberlain is a web developer and lecturer in Human-Computer Interaction at the University of Essex with experience of industrial and academic computer applications (language processing, game design, social network analysis) in the domains of citizen science, marine ecology, and human rights observation. He has been the lead developer of the Phrase Detectives project since its inception in 2007 and has investigated crowdsourcing using games and social networks for almost a decade.
Professor Udo Kruschwitz’s research interests are in natural language processing (NLP), information retrieval (IR) and the implementation of such techniques in real applications. He is developing techniques that allow the extraction of conceptual information from document collections and access logs and the use of such knowledge in search and navigation contexts. Professor Kruschwitz was Co-PI in the original EPSRC project that developed Phrase Detectives.
Professor Massimo Poesio is a computational linguist. His work on anaphora is driven by the analysis of corpora and of disagreements in corpus annotation, most recently using the Phrase Detectives game-with-a-purpose to collect such data. He is also a PI of the DALI project, an Advanced ERC grant; a supervisor in the IGGI Doctoral Training Centre in Intelligent Games and Game Intelligence; and a PI in the Centre for Human Rights and Information Technology in the Era of Big Data.
The authors would like to acknowledge the feedback from anonymous reviewers of this paper and to thank all the players who played the game.
© 2018 Walter de Gruyter GmbH, Berlin/Boston