A Wasserstein Index of Dependence for Random Measures / Catalano, Marta; Lavenant, Hugo; Lijoi, Antonio; Pruenster, Igor. - In: JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION. - ISSN 0162-1459. - 119:547(2024), pp. 2396-2406. [10.1080/01621459.2023.2258596]
A Wasserstein Index of Dependence for Random Measures
Catalano, Marta; Lavenant, Hugo; Lijoi, Antonio; Pruenster, Igor
2024
Abstract
Optimal transport and Wasserstein distances are flourishing in many scientific fields as a means for comparing and connecting random structures. Here we pioneer the use of an optimal transport distance between Lévy measures to solve a statistical problem. Dependent Bayesian nonparametric models provide flexible inference on distinct, yet related, groups of observations. Each component of a vector of random measures models a group of exchangeable observations, while their dependence regulates the borrowing of information across groups. We derive the first statistical index of dependence in [0, 1] for (completely) random measures that accounts for their whole infinite-dimensional distribution, which is assumed to be equal across different groups. This is accomplished by using the geometric properties of the Wasserstein distance to solve a max–min problem at the level of the underlying Lévy measures. The Wasserstein index of dependence sheds light on the models’ deep structure and has desirable properties: (i) it is 0 if and only if the random measures are independent; (ii) it is 1 if and only if the random measures are completely dependent; (iii) it simultaneously quantifies the dependence of d ≥ 2 random measures, avoiding the need for pairwise comparisons; (iv) it can be evaluated numerically. Moreover, the index allows for informed prior specifications and fair model comparisons for Bayesian nonparametric models. Supplementary materials for this article are available online.

File | Access | Type | License | Size | Format
---|---|---|---|---|---
2024_JASA.pdf | Archive managers only | Publisher's version | All rights reserved | 2.43 MB | Adobe PDF