IRIS - Institutional Research Information System

We study the problem of maintaining a decision tree in the fully-dynamic setting, where the dataset is updated by an adversarial sequence of insertions and deletions. We present the first algorithm with strong guarantees on both the quality of the tree and the worst-case update time (the maximum number of operations the algorithm performs between two dataset updates). For instance, we can maintain a tree where each node has Gini gain within β of the optimum, while guaranteeing an update time O d β−3 log4 n , where d is the number of features and n the maximum size of the dataset. This is optimal up to polylogarithmic factors, as any dynamic algorithm must have update time in Ω(d). Similar guarantees hold for the variance and information gain, for classification and regression, as well as for boosted trees. This shows that many popular decision trees such as ID3 or C4.5 can be efficiently made dynamic, answering an open question of Bressan, Damay, and Sozio (2023). We also show that, under the 3SUM conjecture or the Orthogonal Vectors Hypothesis, the update time must be polynomial in β −1

Bressan, Marco; Sozio, Mauro. (2024). Fully-Dynamic Approximate Decision Trees With Worst-Case Update Time Guarantees. In Fully-Dynamic Approximate Decision Trees With Worst-Case Update Time Guarantees (pp. 4517- 4541). https://proceedings.mlr.press/v235/bressan24a.html.

Fully-Dynamic Approximate Decision Trees With Worst-Case Update Time Guarantees

Marco Bressan;Mauro Sozio

2024

Abstract

We study the problem of maintaining a decision tree in the fully-dynamic setting, where the dataset is updated by an adversarial sequence of insertions and deletions. We present the first algorithm with strong guarantees on both the quality of the tree and the worst-case update time (the maximum number of operations the algorithm performs between two dataset updates). For instance, we can maintain a tree where each node has Gini gain within β of the optimum, while guaranteeing an update time O d β−3 log4 n , where d is the number of features and n the maximum size of the dataset. This is optimal up to polylogarithmic factors, as any dynamic algorithm must have update time in Ω(d). Similar guarantees hold for the variance and information gain, for classification and regression, as well as for boosted trees. This shows that many popular decision trees such as ID3 or C4.5 can be efficiently made dynamic, answering an open question of Bressan, Damay, and Sozio (2023). We also show that, under the 3SUM conjecture or the Orthogonal Vectors Hypothesis, the update time must be polynomial in β −1

Scheda breve

Scheda completa

Scheda completa (DC)

	Anno del convegno
	
				2024
			
	Citazione
	
				Bressan, Marco; Sozio, Mauro. (2024). Fully-Dynamic Approximate Decision Trees With Worst-Case Update Time Guarantees. In Fully-Dynamic Approximate Decision Trees With Worst-Case Update Time Guarantees (pp. 4517- 4541). https://proceedings.mlr.press/v235/bressan24a.html.
			
	Appare nelle tipologie:
	
				04.1 - Contributo in Atti di convegno (Paper in Proceedings)

File in questo prodotto:

File	Dimensione	Formato
4082_Fully_Dynamic_Approximate.pdf Open Access Tipologia: Versione dell'editore Licenza: Tutti i diritti riservati Dimensione 477.94 kB Formato Adobe PDF Visualizza/Apri	477.94 kB	Adobe PDF	Visualizza/Apri

Pubblicazioni consigliate

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11385/248142

Citazioni

0

ND

ND

social impact