Time series clustering by a robust autoregressive metric with application to air pollution / D'Urso, P.; De Giovanni, Livia; Massari, R. - In: CHEMOMETRICS AND INTELLIGENT LABORATORY SYSTEMS. - ISSN 0169-7439. - (2015), pp. 107-124. [10.1016/j.chemolab.2014.11.003]
Time series clustering by a robust autoregressive metric with application to air pollution
DE GIOVANNI, LIVIA
2015
Abstract
In this paper, following a fuzzy approach and adopting an autoregressive parameterization, we propose a robust clustering model for classifying time series. By adopting a fuzzy partitioning around medoids approach, the model identifies, for each cluster, the so-called medoid time series, which is a representative time series of that cluster, together with the membership degrees of each time series in the different clusters. Robustness is obtained by adopting a suitable robust metric for time series, the so-called exponential distance measure. The model can thus tolerate outlier time series in the clustering process: it neutralizes and smooths their disruptive effect and preserves the original clustering structure of the dataset by assigning to outlier time series approximately the same membership degrees across clusters. To illustrate the usefulness and effectiveness of the model, a simulation study and an application to air pollution time series are carried out. Comparison with some existing clustering procedures from the literature shows several advantages of the proposed model.
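The mechanism summarized in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation, and it rests on assumptions the record does not state: each series is summarized by least-squares AR coefficients of a fixed order, the exponential distance is taken as 1 - exp(-beta * d^2) with d^2 the squared Euclidean distance between coefficient vectors, and memberships follow the usual fuzzy c-medoids update with fuzziness m. The function names and the parameters `order`, `beta` and `m` are hypothetical.

```python
import numpy as np

def ar_coefficients(series, order=2):
    """Least-squares AR(order) coefficients of one series (assumed parameterization)."""
    x = np.asarray(series, dtype=float)
    x = x - x.mean()
    # Column k holds the lag-(k+1) values x[t-k-1] for t = order, ..., len(x)-1.
    lags = np.column_stack(
        [x[order - k - 1 : len(x) - k - 1] for k in range(order)]
    )
    target = x[order:]
    coef, *_ = np.linalg.lstsq(lags, target, rcond=None)
    return coef

def exponential_distance(a, b, beta=1.0):
    """Assumed form of the robust metric: bounded in [0, 1), saturating for far points."""
    d2 = float(np.sum((np.asarray(a, dtype=float) - np.asarray(b, dtype=float)) ** 2))
    return 1.0 - np.exp(-beta * d2)

def fuzzy_memberships(coefs, medoid_idx, beta=1.0, m=2.0, eps=1e-12):
    """Membership degrees of every series to every medoid (standard fuzzy update)."""
    dist = np.array(
        [[exponential_distance(c, coefs[j], beta) for j in medoid_idx] for c in coefs]
    )
    # eps guards the division when a series is itself a medoid (distance 0).
    inv = (1.0 / np.maximum(dist, eps)) ** (1.0 / (m - 1.0))
    return inv / inv.sum(axis=1, keepdims=True)

# Hypothetical AR(2) coefficient vectors: two groups plus one far-away outlier (last row).
coefs = np.array([[0.8, 0.0], [0.75, 0.05], [-0.8, 0.0], [-0.7, 0.1], [3.0, 5.0]])
u = fuzzy_memberships(coefs, medoid_idx=[0, 2])
```

Because the assumed distance saturates near 1 for a series far from every medoid, the outlier in the last row receives nearly equal membership in both clusters, which is the behaviour the abstract describes.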