
it - Information Technology

Methods and Applications of Informatics and Information Technology

Editor-in-Chief: Conrad, Stefan / Molitor, Paul

6 Issues per year

Volume 60, Issue 1


Optimising crowdsourcing efficiency: Amplifying human computation with validation

Jon Chamberlain / Udo Kruschwitz / Massimo Poesio
Published Online: 2018-02-28 | DOI: https://doi.org/10.1515/itit-2017-0020


Crowdsourcing has revolutionised the way tasks can be completed, but the process is frequently inefficient, costing practitioners time and money. This research investigates whether crowdsourcing can be optimised with a validation process, as measured by four criteria: quality, cost, noise and speed. A validation model is described, simulated and tested on real data from an online crowdsourcing game that collects data about human language. Results show that adding an agreement validation step (such as a like or upvote) means fewer annotations are required, noise and collection time are reduced, and quality may be improved.

Keywords: Crowdsourcing; Empirical studies in interaction design; Interactive games; Social networks; Natural language processing
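
The idea of an agreement validation step can be illustrated with a minimal sketch. This is not the model described in the article: the scoring rule (each annotation or agreement adds one point to a label, each disagreement removes one, and a label is accepted once it reaches a threshold), the function name `resolve` and the threshold value are all assumptions made purely for illustration.

```python
from collections import defaultdict
from typing import Iterable, Optional, Tuple

# A judgement is (kind, label); kind is "annotate", "agree" or "disagree".
Judgement = Tuple[str, str]


def resolve(judgements: Iterable[Judgement],
            threshold: int = 3) -> Tuple[Optional[str], int, int]:
    """Process judgements in order until one label reaches `threshold`.

    Hypothetical scoring rule (an assumption, not taken from the article):
    an annotation or an agreement adds 1 to the label's score, a
    disagreement subtracts 1.

    Returns (accepted label or None, annotations used, validations used).
    """
    scores = defaultdict(int)
    annotations = 0
    validations = 0
    for kind, label in judgements:
        if kind == "annotate":
            scores[label] += 1
            annotations += 1
        elif kind == "agree":
            scores[label] += 1
            validations += 1
        elif kind == "disagree":
            scores[label] -= 1
            validations += 1
        else:
            raise ValueError(f"unknown judgement kind: {kind!r}")
        if scores[label] >= threshold:
            return label, annotations, validations
    return None, annotations, validations


# Annotation only: every contribution is a fresh annotation.
annotation_only = [("annotate", "A"), ("annotate", "B"),
                   ("annotate", "A"), ("annotate", "A")]

# With validation: later contributors agree or disagree with existing labels.
with_validation = [("annotate", "A"), ("agree", "A"), ("annotate", "B"),
                   ("disagree", "B"), ("agree", "A")]

print(resolve(annotation_only))   # ('A', 4, 0)
print(resolve(with_validation))   # ('A', 2, 3)
```

In this toy run the same label is accepted with two annotations plus three validations instead of four annotations. Whether that trade is worthwhile in practice depends on the relative cost and reliability of validations, which is the question the article examines.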



About the article

Jon Chamberlain

Dr Jon Chamberlain is a web developer and lecturer in Human-Computer Interaction at the University of Essex with experience of industrial and academic computer applications (language processing, game design, social network analysis) in the domains of citizen science, marine ecology, and human rights observation. He has been the lead developer of the Phrase Detectives project since its inception in 2007 and has continued investigating crowdsourcing using games and social networks for almost a decade.

Udo Kruschwitz

Professor Udo Kruschwitz’s research interests are in natural language processing (NLP), information retrieval (IR) and the implementation of such techniques in real applications. He is developing techniques that allow the extraction of conceptual information from document collections and access logs and the utilization of such knowledge in search and navigation contexts. Professor Kruschwitz was Co-PI in the original EPSRC project that developed Phrase Detectives.

Massimo Poesio

Professor Massimo Poesio is a computational linguist. His work on anaphora is driven by the analysis of corpora and of disagreements in corpus annotation, most recently using the Phrase Detectives game-with-a-purpose to collect such data. He is also a PI of the DALI project, an Advanced ERC grant; a supervisor in the IGGI Doctoral Training Centre in Intelligent Games and Game Intelligence; and a PI in the Centre for Human Rights and Information Technology in the Era of Big Data.

Received: 2017-08-31

Revised: 2018-02-02

Accepted: 2018-02-02

Published Online: 2018-02-28

Published in Print: 2018-03-01

Funding Source: Engineering and Physical Sciences Research Council

Award identifier / Grant number: EP/F00575X/1

Funding Source: Engineering and Physical Sciences Research Council

Award identifier / Grant number: Doctoral Training Allowance

Funding Source: Engineering and Physical Sciences Research Council

Award identifier / Grant number: ES/L011859/1

The creation of the original game was funded by the Engineering and Physical Sciences Research Council (EPSRC) project AnaWiki, EP/F00575X/1. The analysis of the data was partially funded by an EPSRC Doctoral Training Allowance granted by the University of Essex and by EPSRC grant ES/L011859/1.

Citation Information: it - Information Technology, Volume 60, Issue 1, Pages 41–49, ISSN (Online) 2196-7032, ISSN (Print) 1611-2776, DOI: https://doi.org/10.1515/itit-2017-0020.


© 2018 Walter de Gruyter GmbH, Berlin/Boston.
