In this paper we propose a novel evolutionary agent-based clustering algorithm in which agents act as individuals of an evolving population, each performing a random walk on a different subset of patterns drawn from the entire dataset. The agents are orchestrated by a customised genetic algorithm and perform clustering and feature selection simultaneously. Unlike standard clustering algorithms, each agent is in charge of discovering well-formed (compact and populated) clusters and, at the same time, a suitable subset of features corresponding to the subspace in which such clusters lie. This follows a local metric learning approach, in which each cluster is characterised by its own subset of relevant features. The approach not only leads to deeper knowledge of the dataset at hand, revealing clusters that are not evident when using the whole set of features, but is also suitable for large datasets, since each agent processes only a small subset of patterns. We show the effectiveness of our algorithm on synthetic datasets and outline some future work scenarios and extensions.
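The idea that a cluster may only be visible in a subspace of the features can be illustrated with a small sketch. It is not the algorithm from the paper: the function name `masked_distance`, the use of a binary feature mask, and the Euclidean distance are all assumptions made for illustration.

```python
import math

def masked_distance(x, y, mask):
    """Euclidean distance computed only over the features selected by mask.

    Hypothetical helper: mask is a sequence of booleans, one per feature,
    standing in for the subset of features associated with a cluster.
    """
    return math.sqrt(sum((xi - yi) ** 2 for xi, yi, m in zip(x, y, mask) if m))

# Two patterns that agree on features 0 and 1 but differ strongly on feature 2.
a = [1.0, 2.0, 10.0]
b = [1.0, 2.0, -10.0]

full = masked_distance(a, b, [True, True, True])   # all features used
sub = masked_distance(a, b, [True, True, False])   # feature 2 ignored

print(full, sub)
```

With all three features the two patterns look far apart, while in the subspace spanned by the first two features they coincide. This is the kind of structure that a per-cluster feature subset is meant to expose.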
Data mining by evolving agents for clusters discovery and metric learning / Martino, Alessio; Giampieri, Mauro; Luzi, Massimiliano; Rizzi, Antonello. - 102:(2019), pp. 23-35. [10.1007/978-3-319-95098-3_3]
Data mining by evolving agents for clusters discovery and metric learning
Alessio Martino
2019