Projected Latent Distillation for Data-Agnostic Consolidation in distributed continual learning

Carta, Antonio; Cossu, Andrea; Lomonaco, Vincenzo; Bacciu, Davide; Van De Weijer, Joost
2024

Abstract

In continual learning applications on the edge, multiple self-centered devices (SCD) learn different local tasks independently, with each SCD optimizing only its own task. Can we achieve (almost) zero-cost collaboration between different devices? We formalize this problem as a Distributed Continual Learning (DCL) scenario, where SCDs greedily adapt to their own local tasks and a separate continual learning (CL) model performs a sparse and asynchronous consolidation step that combines the SCD models sequentially into a single multi-task model without using the original data. Unfortunately, current CL methods are not directly applicable to this scenario. We propose Data-Agnostic Consolidation (DAC), a novel double knowledge distillation method which performs distillation in the latent space via a novel Projected Latent Distillation loss. Experimental results show that DAC enables forward transfer between SCDs and reaches state-of-the-art accuracy on Split CIFAR100, CORe50 and Split TinyImageNet, in both single-device and distributed CL scenarios. Somewhat surprisingly, a single out-of-distribution image is sufficient as the only source of data for DAC.
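The abstract does not spell out the loss itself, so the following is only a minimal sketch of what a projected latent distillation term could look like, assuming the student's latent activations are mapped through a learnable linear projector and matched to the frozen teacher's latents with an MSE term computed on surrogate (e.g., out-of-distribution) inputs. The class name ProjectedLatentDistillation, the linear projector, the MSE matching, and the example dimensions are assumptions for illustration, not the authors' exact formulation.

import torch
import torch.nn as nn
import torch.nn.functional as F

class ProjectedLatentDistillation(nn.Module):
    """Illustrative latent-space distillation loss (assumed form, not the paper's exact one)."""

    def __init__(self, student_dim: int, teacher_dim: int):
        super().__init__()
        # Learnable linear projector aligning the student's latent space
        # with the teacher's latent space (assumption).
        self.proj = nn.Linear(student_dim, teacher_dim)

    def forward(self, student_latent: torch.Tensor, teacher_latent: torch.Tensor) -> torch.Tensor:
        # Match projected student latents to the (frozen) teacher latents.
        return F.mse_loss(self.proj(student_latent), teacher_latent.detach())

# Hypothetical usage during a data-free consolidation step: latents are
# computed on surrogate out-of-distribution inputs, since the original
# task data is not available.
# pld = ProjectedLatentDistillation(student_dim=512, teacher_dim=512)
# loss = pld(student_encoder(x_ood), teacher_encoder(x_ood))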
Keywords: Continual learning, Model consolidation, Distributed continual learning
Carta, Antonio; Cossu, Andrea; Lomonaco, Vincenzo; Bacciu, Davide; Van De Weijer, Joost (2024). Projected Latent Distillation for Data-Agnostic Consolidation in distributed continual learning. Neurocomputing (ISSN: 0925-2312), 598: 1-9. DOI: 10.1016/j.neucom.2024.127935.
Files in this item:

Projected Latent Distillation.pdf
Open Access
Type: Publisher's version
License: Creative Commons
Size: 2.28 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11385/253559
Citations
  • Scopus: 4
  • Web of Science (ISI): 4
  • OpenAlex: not available