
Dependence Modeling

Ed. by Puccetti, Giovanni


Covered by: WoS (ESCI), SCOPUS, MathSciNet, zbMATH


CiteScore 2018: 0.67

SCImago Journal Rank (SJR) 2018: 0.380
Source Normalized Impact per Paper (SNIP) 2018: 0.383

Open Access
Online ISSN: 2300-2298

A simple proof of Pitman–Yor’s Chinese restaurant process from its stick-breaking representation

Caroline Lawless / Julyan Arbel
Published Online: 2019-03-08 | DOI: https://doi.org/10.1515/demo-2019-0003

Abstract

For a long time, the Dirichlet process has been the gold standard discrete random measure in Bayesian nonparametrics. The Pitman-Yor process provides a simple and mathematically tractable generalization that allows flexible control of the clustering behaviour. Two commonly used representations of the Pitman-Yor process are the stick-breaking process and the Chinese restaurant process. The former is a constructive representation of the process that is convenient for practical implementation, while the latter describes the induced partition distribution. One is usually obtained from the other indirectly, using measure theory. In contrast, we propose here an elementary proof of Pitman-Yor’s Chinese restaurant process from its stick-breaking representation.

Keywords: Bayesian nonparametrics; clustering; Pitman-Yor process; partitions; stick-breaking process

MSC 2010: 62F15; 97K50
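
To make the two representations named in the abstract concrete, here is a minimal Python sketch of each. It is not taken from the article. It assumes the standard two-parameter notation, with a discount d in [0, 1) and a strength θ > -d, which the abstract itself does not state. The function names, the truncation level `num_sticks`, and the parameter values in the usage note are hypothetical choices made for illustration.

```python
import random


def stick_breaking_weights(discount, strength, num_sticks, rng):
    """Return the first `num_sticks` weights of a truncated stick-breaking draw.

    Assumed parametrisation: V_k ~ Beta(1 - d, theta + k * d) independently,
    and p_k = V_k * prod_{j < k} (1 - V_j).
    """
    weights = []
    remaining = 1.0  # length of stick not yet broken off
    for k in range(1, num_sticks + 1):
        v = rng.betavariate(1.0 - discount, strength + k * discount)
        weights.append(v * remaining)
        remaining *= 1.0 - v
    return weights


def chinese_restaurant_partition(discount, strength, num_customers, rng):
    """Seat customers one at a time and return the resulting table sizes.

    Assumed seating rule with n customers at k tables: join table j with
    probability (n_j - d) / (theta + n), open a new table with probability
    (theta + k * d) / (theta + n).
    """
    table_sizes = []
    for n in range(num_customers):
        k = len(table_sizes)
        u = rng.random() * (strength + n)  # point in [0, theta + n)
        cumulative = 0.0
        chosen = k  # index k means "open a new table"
        for j, size in enumerate(table_sizes):
            cumulative += size - discount
            if u < cumulative:
                chosen = j
                break
        if chosen == k:
            table_sizes.append(1)
        else:
            table_sizes[chosen] += 1
    return table_sizes
```

For example, with `rng = random.Random(0)`, calling `stick_breaking_weights(0.5, 1.0, 50, rng)` gives 50 non-negative weights summing to at most 1, and `chinese_restaurant_partition(0.5, 1.0, 200, rng)` gives table sizes summing to 200. The sketch only simulates the two constructions side by side and does not reproduce the proof that links them.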


About the article

Received: 2018-10-15

Accepted: 2019-02-01

Published Online: 2019-03-08

Published in Print: 2019-03-01


Citation Information: Dependence Modeling, Volume 7, Issue 1, Pages 45–52, ISSN (Online) 2300-2298, DOI: https://doi.org/10.1515/demo-2019-0003.


© by Caroline Lawless, Julyan Arbel, published by De Gruyter. This work is licensed under the Creative Commons Attribution 4.0 Public License (CC BY 4.0).
