Published by De Gruyter, September 29, 2017

Predicting the Brexit Vote by Tracking and Classifying Public Opinion Using Twitter Data

  • Julio Cesar Amador Diaz Lopez, Sofia Collignon-Delmar, Kenneth Benoit and Akitaka Matsuo


We use 23 million Tweets related to the EU referendum in the UK to predict the Brexit vote. In particular, we use user-generated labels known as hashtags to build training sets related to the Leave and Remain campaigns. Next, we train support vector machines (SVMs) to classify Tweets. Finally, we compare our results to Internet and telephone polls. This approach not only reduces the time spent hand-coding data to create a training set, but also achieves a high level of correlation with Internet polls. Our results suggest that Twitter data may be a suitable substitute for Internet polls and a useful complement to telephone polls. We also discuss the reach and limitations of this method.
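
The hashtag-labelling step described above can be illustrated with a minimal sketch. It assumes a Tweet is labelled only when it carries hashtags from exactly one campaign. The hashtag lists and the function names `label_tweet` and `build_training_set` are hypothetical and chosen for illustration; they are not taken from the paper.

```python
import re

# Hypothetical campaign hashtags, for illustration only.
# The lists actually used by the authors are not given in this abstract.
LEAVE_TAGS = {"#voteleave", "#leaveeu"}
REMAIN_TAGS = {"#strongerin", "#voteremain"}

HASHTAG_RE = re.compile(r"#\w+")


def label_tweet(text):
    """Return 'leave' or 'remain' if the Tweet uses hashtags from exactly
    one campaign; return None if it uses none or mixes both."""
    tags = {tag.lower() for tag in HASHTAG_RE.findall(text)}
    has_leave = bool(tags & LEAVE_TAGS)
    has_remain = bool(tags & REMAIN_TAGS)
    if has_leave == has_remain:  # neither campaign, or both
        return None
    return "leave" if has_leave else "remain"


def build_training_set(tweets):
    """Keep only the unambiguously labelled Tweets as (text, label) pairs."""
    labelled = []
    for text in tweets:
        label = label_tweet(text)
        if label is not None:
            labelled.append((text, label))
    return labelled


if __name__ == "__main__":
    sample = [
        "Time to take back control #VoteLeave",
        "We are better together #StrongerIn",
        "Still undecided: #VoteLeave or #StrongerIn?",
        "Lovely weather in London today",
    ]
    for text, label in build_training_set(sample):
        print(label, "|", text)
```

The resulting (text, label) pairs could then serve as training data for a text classifier such as an SVM, which would assign labels to the Tweets that carry no campaign hashtag. Discarding Tweets with mixed hashtags is a design choice of this sketch, not a documented step of the paper.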



Published Online: 2017-9-29
Published in Print: 2017-10-26

©2017 Walter de Gruyter GmbH, Berlin/Boston
