Static and dynamic clustering algorithms are a fundamental tool in any machine learning library. Most of the efforts in developing dynamic machine learning and data mining algorithms have been focusing on the sliding window model or more simplistic models. However, in many real-world applications one might need to deal with arbitrary deletions and insertions. For example, one might need to remove data items that are not necessarily the oldest ones, because they have been flagged as containing inappropriate content or due to privacy concerns. Clustering trajectory data might also require to deal with more general update operations. We develop a ð2 þ Þ-approximation algorithm for the k-center clustering problem with “small” amortized cost under the fully dynamic adversarial model. In such a model, points can be added or removed arbitrarily, provided that the adversary does not have access to the random choices of our algorithm. The amortized cost of our algorithm is poly-logarithmic when the ratio between the maximum and minimum distance between any two points in input is bounded by a polynomial, while k and are constant. Furthermore, we significantly improve the memory requirement of our fully dynamic algorithm, although at the cost of a worse approximation ratio of 4 þ . Our theoretical results are complemented with an extensive experimental evaluation on dynamic data from Twitter, Flickr, as well as trajectory data, demonstrating the effectiveness of our approach.

Fully Dynamic k-Center Clustering With Improved Memory Efficiency / Hubert Chan, T. (-)H.; Guerquin, Arnaud; Hu, Shuguang; Sozio, Mauro. - In: IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING. - ISSN 1041-4347. - 34:7(2022), pp. 3255-3266. [10.1109/TKDE.2020.3023020]

Fully Dynamic k-Center Clustering With Improved Memory Efficiency

Mauro Sozio
2022

Abstract

Static and dynamic clustering algorithms are a fundamental tool in any machine learning library. Most of the efforts in developing dynamic machine learning and data mining algorithms have been focusing on the sliding window model or more simplistic models. However, in many real-world applications one might need to deal with arbitrary deletions and insertions. For example, one might need to remove data items that are not necessarily the oldest ones, because they have been flagged as containing inappropriate content or due to privacy concerns. Clustering trajectory data might also require to deal with more general update operations. We develop a ð2 þ Þ-approximation algorithm for the k-center clustering problem with “small” amortized cost under the fully dynamic adversarial model. In such a model, points can be added or removed arbitrarily, provided that the adversary does not have access to the random choices of our algorithm. The amortized cost of our algorithm is poly-logarithmic when the ratio between the maximum and minimum distance between any two points in input is bounded by a polynomial, while k and are constant. Furthermore, we significantly improve the memory requirement of our fully dynamic algorithm, although at the cost of a worse approximation ratio of 4 þ . Our theoretical results are complemented with an extensive experimental evaluation on dynamic data from Twitter, Flickr, as well as trajectory data, demonstrating the effectiveness of our approach.
2022
Clustering, k-center, approximation algorithm, fully-dynamic
Fully Dynamic k-Center Clustering With Improved Memory Efficiency / Hubert Chan, T. (-)H.; Guerquin, Arnaud; Hu, Shuguang; Sozio, Mauro. - In: IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING. - ISSN 1041-4347. - 34:7(2022), pp. 3255-3266. [10.1109/TKDE.2020.3023020]
File in questo prodotto:
File Dimensione Formato  
Fully_Dynamic_kk-Center_Clustering_With_Improved_Memory_Efficiency.pdf

Solo gestori archivio

Tipologia: Versione dell'editore
Licenza: Tutti i diritti riservati
Dimensione 704.86 kB
Formato Adobe PDF
704.86 kB Adobe PDF   Visualizza/Apri
Pubblicazioni consigliate

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11385/248146
Citazioni
  • Scopus 3
  • ???jsp.display-item.citation.isi??? 5
  • OpenAlex ND
social impact