Statistics, Politics and Policy

Editor-in-Chief: Wagschal, Uwe

Predicting the Brexit Vote by Tracking and Classifying Public Opinion Using Twitter Data

Julio Cesar Amador Diaz Lopez / Sofia Collignon-Delmar
  • University College London, London, United Kingdom of Great Britain and Northern Ireland
  • University of Strathclyde, Glasgow, United Kingdom of Great Britain and Northern Ireland
/ Kenneth Benoit
  • London School of Economics and Political Science – Methodology, London, United Kingdom of Great Britain and Northern Ireland
/ Akitaka Matsuo
  • London School of Economics and Political Science – Methodology, London, United Kingdom of Great Britain and Northern Ireland
Published Online: 2017-09-29 | DOI: https://doi.org/10.1515/spp-2017-0006


We use 23 million Tweets related to the UK's EU referendum to predict the Brexit vote. In particular, we use user-generated labels known as hashtags to build training sets for the Leave and Remain campaigns. Next, we train support vector machines (SVMs) to classify the Tweets. Finally, we compare our results with Internet and telephone polls. This approach not only reduces the time spent hand-coding data to create a training set, but also achieves a high level of correlation with Internet polls. Our results suggest that Twitter data may be a suitable substitute for Internet polls and a useful complement to telephone polls. We also discuss the reach and limitations of this method.
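The pipeline described in the abstract (hashtags as labels, then an SVM over the Tweet text) can be illustrated with a minimal sketch. Everything specific here is an assumption for illustration: the hashtag lists, the toy Tweets, and the function names are hypothetical, and the small pure-Python linear SVM (full-batch subgradient descent on the hinge loss, no bias term) stands in for whatever SVM implementation and features the authors actually used.

```python
import re
from collections import Counter, defaultdict

# Hypothetical hashtag lists; the abstract does not say which hashtags were used.
LEAVE_TAGS = {"#voteleave", "#leaveeu"}
REMAIN_TAGS = {"#strongerin", "#remain"}
TOKEN = re.compile(r"#?\w+")


def hashtag_label(tweet):
    """Return +1 (Leave), -1 (Remain), or None if unlabelled or mixed."""
    tags = {t for t in TOKEN.findall(tweet.lower()) if t.startswith("#")}
    leave, remain = bool(tags & LEAVE_TAGS), bool(tags & REMAIN_TAGS)
    if leave == remain:  # no campaign hashtag, or hashtags from both sides
        return None
    return 1 if leave else -1


def features(tweet):
    """Bag of words with hashtags dropped, so the model cannot memorise labels."""
    return Counter(t for t in TOKEN.findall(tweet.lower()) if not t.startswith("#"))


def train_linear_svm(examples, lam=0.01, lr=0.1, epochs=200):
    """Subgradient descent on the L2-regularised hinge loss (assumed settings)."""
    w = defaultdict(float)
    n = len(examples)
    for _ in range(epochs):
        grad = defaultdict(float)
        for x, y in examples:
            margin = y * sum(w[t] * c for t, c in x.items())
            if margin < 1:  # only margin violators contribute to the subgradient
                for t, c in x.items():
                    grad[t] -= y * c / n
        for t in set(w) | set(grad):
            w[t] -= lr * (lam * w[t] + grad[t])
    return w


def classify(w, tweet):
    """Classify by the sign of the linear score; a score of exactly 0 falls to 'remain'."""
    score = sum(w.get(t, 0.0) * c for t, c in features(tweet).items())
    return "leave" if score > 0 else "remain"


def leave_share(w, tweets):
    """Fraction of Tweets classified as Leave, the quantity one would compare with polls."""
    return sum(classify(w, t) == "leave" for t in tweets) / len(tweets)


# Toy data, invented for this sketch.
raw = [
    "Take back control #VoteLeave",
    "Leave the EU now #LeaveEU",
    "Stronger together in Europe #StrongerIn",
    "Remain in the EU #Remain",
    "Big debate tonight",  # no campaign hashtag: not used for training
]
train = [(features(t), hashtag_label(t)) for t in raw if hashtag_label(t) is not None]
w = train_linear_svm(train)
unlabelled = ["we must take back control", "we are stronger in europe"]
share = leave_share(w, unlabelled)
```

Dropping hashtags from the features is a design choice of this sketch: it forces the classifier to generalise from the surrounding words to Tweets that carry no campaign hashtag. A daily series of `leave_share` values is what one would then correlate with poll results.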


About the article

Published Online: 2017-09-29

Published in Print: 2017-10-26

Citation Information: Statistics, Politics and Policy, Volume 8, Issue 1, Pages 85–104, ISSN (Online) 2151-7509, ISSN (Print) 2194-6299, DOI: https://doi.org/10.1515/spp-2017-0006.

©2017 Walter de Gruyter GmbH, Berlin/Boston.

Citing Articles

Alexandre Bovet, Flaviano Morone, and Hernán A. Makse
Scientific Reports, 2018, Volume 8, Number 1
