Proceedings on Privacy Enhancing Technologies

4 Issues per year

Open Access

Online ISSN: 2299-0984

Privately Evaluating Decision Trees and Random Forests

David J. Wu / Tony Feng / Michael Naehrig / Kristin Lauter
Published Online: 2016-07-14 | DOI: https://doi.org/10.1515/popets-2016-0043

Abstract

Decision trees and random forests are widely used classifiers. In this paper, we develop two protocols for privately evaluating decision trees and random forests. We operate in the standard two-party setting, where the server holds a model (either a tree or a forest) and the client holds an input (a feature vector). At the conclusion of the protocol, the client learns only the model’s output on its input and a few generic parameters concerning the model; the server learns nothing. The first protocol provides security against semi-honest adversaries. We then extend the semi-honest protocol to be robust against malicious adversaries. We implement both protocols and show that each can process trees with several hundred decision nodes in just a few seconds, using a modest amount of bandwidth. Compared to previous semi-honest protocols for private decision tree evaluation, we demonstrate a tenfold improvement in computation and bandwidth.

Keywords: privacy; secure computation; decision trees

About the article

Received: 2016-02-29

Revised: 2016-06-02

Accepted: 2016-06-02

Published Online: 2016-07-14

Published in Print: 2016-10-01


Citation Information: Proceedings on Privacy Enhancing Technologies, ISSN (Online) 2299-0984, DOI: https://doi.org/10.1515/popets-2016-0043.

© 2016. This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 License (CC BY-NC-ND 4.0).
