Fully-Dynamic Decision Trees
Date: Wed 31 May 2023 at 15:30
Venue: Aula Seminari, via Salaria 113
Speaker: Prof. Mauro Sozio
Title: Fully-Dynamic Decision Trees
Abstract
Historically, machine learning (ML) algorithms have been developed assuming the input datasets to be static. However, as new data is collected or removed because being noisy or obsolete, one soon faces the need to efficiently update an ML model without sacrificing its accuracy. Decision trees are a cornerstone of ML and an essential tool in any ML library. Moreover, decision-tree based algorithms such as XGBoost and random forests have been consistently embraced in real-world applications and data challenges. In our talk, we present the first fully-dynamic algorithm that maintains an ``accurate'' decision tree with ``low'' amortized cost, over an arbitrary sequence of insertions and deletions of labeled examples. We also argue that our algorithm is optimal both in terms of memory usage and running time, up to polylogarithmic factors. We conclude our talk with an extensive experimental evaluation on real-world data showing the effectiveness of our approach. Our work (presented at AAAI23) represents one of the first studies on fully-dynamic supervised ML, which has been mostly unexplored so far, to the best of our knowledge.
Bio
Mauro Sozio is a full professor at Telecom Paris University which is part of the Institut Polytechnique de Paris. Previously he held research positions at IBM Almaden and Max Planck Institute for Informatics, while he received his PhD from Sapienza University of Rome. His research interests lie in the areas of scalable machine learning, graph mining and social network analysis, with a focus on both theoretical and practical aspects. He has coauthored more than 40 scientific articles, he has been granted two patents, while he has received several awards for his work in collaboration with both industrial and academic partners.