The availability of complex-structured data has sparked new research directions in statistics and machine learning. Bayesian nonparametrics is at the forefront of this trend thanks to two crucial features: its coherent probabilistic framework, which naturally leads to principled prediction and uncertainty quantification, and its infinite-dimensionality, which removes parametric restrictions and ensures full modeling flexibility. In this paper, we provide a concise overview of Bayesian nonparametrics starting from its foundations and the Dirichlet process, the most popular nonparametric prior. We describe the use of the Dirichlet process in species discovery, density estimation, and clustering problems. Among the many generalizations of the Dirichlet process proposed in the literature, we single out the Pitman–Yor process and compare it to the Dirichlet process. Their different features are showcased with real-data illustrations. Finally, we consider more complex data structures, which require dependent versions of these models. Hierarchical constructions are among the most effective strategies to achieve this goal. We highlight the role of the dependence structure in the borrowing of information and illustrate its effectiveness on unbalanced datasets.

Bayesian modeling via discrete nonparametric priors / Catalano, Marta; Lijoi, Antonio; Pruenster, Igor; Rigon, Tommaso. - In: JAPANESE JOURNAL OF STATISTICS AND DATA SCIENCE. - ISSN 2520-8764. - 6:(2023), pp. 607-624. [10.1007/s42081-023-00210-5]

Bayesian modeling via discrete nonparametric priors

Catalano, Marta; Lijoi, Antonio; Pruenster, Igor; Rigon, Tommaso
2023

Clustering, Density estimation, Dependence, Dirichlet process, Exchangeability, Mixture model, Partial exchangeability, Pitman–Yor process, Species discovery
Files in this item:
File: 2023_JJSDS.pdf
Access: Open Access
Type: Publisher's version
License: Creative Commons
Size: 488.93 kB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11385/234458