
Jahrbücher für Nationalökonomie und Statistik

Journal of Economics and Statistics

Editor-in-Chief: Winker, Peter

Ed. by Büttner, Thiess / Riphahn, Regina / Smolny, Werner / Wagner, Joachim


Volume 238, Issue 3-4


Randomization in Online Experiments

Konstantin Golyaev
Published Online: 2018-02-13 | DOI: https://doi.org/10.1515/jbnst-2018-0006


Most scientists consider randomized experiments to be the best method available to establish causality. On the Internet, during the past twenty-five years, randomized experiments have become common, often referred to as A/B testing. For practical reasons, much A/B testing does not use pseudo-random number generators to implement randomization. Instead, hash functions are used to transform the distribution of identifiers of experimental units into a uniform distribution. Using two large, industry data sets, I demonstrate that the success of hash-based quasi-randomization strategies depends greatly on the hash function used: MD5 yielded good results, while SHA512 yielded less impressive ones.

Keywords: Big Data; data science; Internet randomized experiments, A/B testing; hash functions

JEL Classification: C1; C8; C9
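
As a rough illustration of the hash-based assignment described in the abstract, the sketch below maps a unit identifier to one of 1,000 buckets and assigns the unit to treatment or control according to the bucket. It is not the article's implementation: the function names `assign_variant` and `treatment_fraction`, the experiment-name salt, the 1,000-bucket granularity, and the synthetic identifiers are all assumptions made for this example. Only Python's standard `hashlib` module is used.

```python
import hashlib

N_BUCKETS = 1000  # assumed granularity, not taken from the article


def assign_variant(unit_id, experiment, treatment_share=0.5, algorithm="md5"):
    """Deterministically assign a unit to 'treatment' or 'control'.

    The experiment name acts as a salt (an assumption of this sketch), so the
    same unit can land in different arms of different experiments.
    """
    key = f"{experiment}:{unit_id}".encode("utf-8")
    digest = hashlib.new(algorithm, key).digest()
    # Read the first 8 bytes as an unsigned integer and reduce it to a bucket.
    # The modulo bias is negligible because 2**64 is far larger than N_BUCKETS.
    bucket = int.from_bytes(digest[:8], "big") % N_BUCKETS
    return "treatment" if bucket < treatment_share * N_BUCKETS else "control"


def treatment_fraction(unit_ids, experiment, algorithm="md5"):
    """Share of units assigned to treatment; a quick balance check."""
    arms = [assign_variant(u, experiment, algorithm=algorithm) for u in unit_ids]
    return arms.count("treatment") / len(arms)


if __name__ == "__main__":
    # Synthetic sequential identifiers, standing in for real unit IDs.
    ids = [f"user-{i}" for i in range(10_000)]
    for algo in ("md5", "sha512"):
        print(algo, treatment_fraction(ids, "demo-experiment", algorithm=algo))
```

Because the assignment depends only on the identifier and the salt, a returning unit always sees the same variant without any stored state. A balance check on synthetic identifiers like this one says nothing about behaviour on real industry data, which is what the article examines.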



About the article

Received: 2016-10-26

Revised: 2017-04-24

Accepted: 2017-05-15

Published Online: 2018-02-13

Published in Print: 2018-07-26

Citation Information: Jahrbücher für Nationalökonomie und Statistik, Volume 238, Issue 3-4, Pages 223–241, ISSN (Online) 2366-049X, ISSN (Print) 0021-4027, DOI: https://doi.org/10.1515/jbnst-2018-0006.


© 2018 Oldenbourg Wissenschaftsverlag GmbH, Published by De Gruyter Oldenbourg, Berlin/Boston.
