We present STREAMRHF, an unsupervised anomaly detection algorithm for data streams. Our algorithm builds on some of the ideas of Random Histogram Forest (RHF) [1], a state-of-the-art algorithm for batch unsupervised anomaly detection. STREAMRHF constructs a forest of decision trees in which feature splits are determined by the kurtosis score of each feature. It irrevocably assigns an anomaly score to each data point as soon as it arrives, by incrementally updating its random trees and the kurtosis scores of the features. This enables efficient online scoring and concept drift detection at the same time. Because the approach is tree-based, it has several appealing properties, such as explainability of the results [2]. We conduct an extensive experimental evaluation on multiple datasets from different real-world applications. The evaluation shows that our streaming algorithm achieves average precision comparable to RHF while outperforming state-of-the-art streaming approaches for unsupervised anomaly detection, with limited computational complexity.
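
To illustrate the incremental computation of per-feature kurtosis scores mentioned in the abstract, the following is a minimal Python sketch, not the authors' implementation. It maintains running central moments for one feature with the standard one-pass update formulas and returns Pearson's (non-excess) kurtosis. The names `StreamingKurtosis` and `pick_feature` are hypothetical, and weighting features by log(kurtosis + 1) is an assumption made for this sketch; the abstract only states that splits depend on the kurtosis score.

```python
import math
import random


class StreamingKurtosis:
    """Running kurtosis of one feature, updated one value at a time.

    Illustrative sketch only: uses the standard one-pass update of the
    central moment sums M2, M3, M4 (no past values are stored).
    """

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0  # sum of squared deviations from the mean
        self.m3 = 0.0  # sum of cubed deviations from the mean
        self.m4 = 0.0  # sum of fourth-power deviations from the mean

    def update(self, x):
        n_prev = self.n
        self.n += 1
        n = self.n
        delta = x - self.mean
        delta_n = delta / n
        delta_n2 = delta_n * delta_n
        term = delta * delta_n * n_prev
        self.mean += delta_n
        # Order matters: M4 uses the old M2 and M3, M3 uses the old M2.
        self.m4 += (term * delta_n2 * (n * n - 3 * n + 3)
                    + 6 * delta_n2 * self.m2 - 4 * delta_n * self.m3)
        self.m3 += term * delta_n * (n - 2) - 3 * delta_n * self.m2
        self.m2 += term

    def kurtosis(self):
        """Pearson (non-excess) kurtosis; 0.0 if the feature is constant."""
        if self.n == 0 or self.m2 == 0.0:
            return 0.0
        return self.n * self.m4 / (self.m2 * self.m2)


def pick_feature(kurtosis_scores, rng):
    """Sample a feature index with probability proportional to
    log(kurtosis + 1). This weighting is an assumption of the sketch."""
    weights = [math.log(k + 1.0) for k in kurtosis_scores]
    if sum(weights) <= 0.0:
        # All features constant so far: fall back to a uniform choice.
        return rng.randrange(len(kurtosis_scores))
    return rng.choices(range(len(kurtosis_scores)), weights=weights)[0]


if __name__ == "__main__":
    rng = random.Random(0)
    trackers = [StreamingKurtosis() for _ in range(3)]
    for _ in range(1000):
        point = [rng.gauss(0, 1), rng.uniform(0, 1), rng.expovariate(1.0)]
        for tracker, value in zip(trackers, point):
            tracker.update(value)
    scores = [t.kurtosis() for t in trackers]
    print("kurtosis per feature:", scores)
    print("feature chosen for a split:", pick_feature(scores, rng))
```

Each update costs constant time and memory per feature, which is the property that makes this kind of statistic usable when points must be scored as they arrive.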

Nesic, S.; Putina, A.; Bahri, M.; Huet, A.; Navarro, J. M.; Rossi, Dario; Sozio, Mauro. (2022). STREamRHF: Tree-Based Unsupervised Anomaly Detection for Data Streams. In Proceedings of IEEE/ACS International Conference on Computer Systems and Applications, AICCSA (pp. 1-8). DOI: 10.1109/AICCSA56895.2022.10017876.

STREamRHF: Tree-Based Unsupervised Anomaly Detection for Data Streams

Rossi D.; Sozio M.
2022

Anomaly detection
Data streams
Random histogram
Unsupervised learning
Type: Post-print document
License: All rights reserved

Use this identifier to cite or link to this document: https://hdl.handle.net/11385/261259