Plug-and-play neural compression: A knowledge distillation framework with flexible dimensionality reduction

Meneghetti, Laura; Bianchi, Edoardo; Demo, Nicola; Rozza, Gianluigi
2026
  • journal article

Journal
JOURNAL OF SYSTEMS ARCHITECTURE
Abstract
The widespread adoption of embedded vision systems in industrial applications has highlighted the limitations of deep learning models, which are characterized by a high number of parameters. This is a significant concern within the scientific community because of the computational resources and memory these models require for training and inference. To address this, we propose a flexible and effective methodology for neural network compression that integrates a pluggable dimensionality reduction layer with a Knowledge Distillation (KD) approach. The proposed compression framework allows for the exploration and comparison of various state-of-the-art techniques as the reduction mechanism. Specifically, we investigate and implement reduction layers based on tensor decompositions, such as Averaged Higher-Order Singular Value Decomposition (AHOSVD), and on non-linear methods, such as bottleneck projection layers, convolutional autoencoders (CAEs), and MLP-Mixer architectures. In our approach, this reduction layer replaces certain layers of the original network, projecting feature maps into a lower-dimensional space. The subsequent KD process then guides the compressed network to retain high performance. We conducted extensive experiments on image classification tasks, evaluating the efficacy of networks incorporating these reduction strategies across multiple architectures (VGG19, ResNet101) and datasets (CIFAR-10, CIFAR-100, STL-10). Our approach was then compared against both the original, uncompressed models and quantization, a widely used reduction method, in terms of accuracy, model size, parameter reduction, and inference time. The results demonstrate the versatility and effectiveness of our approach in achieving substantial neural network compression and efficiency across various reduction layer instantiations, while consistently maintaining high accuracy.
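The abstract does not specify the distillation objective used in the paper. As a rough illustration of how a KD step can guide a compressed (student) network toward an uncompressed (teacher) network, the sketch below implements the commonly used temperature-scaled distillation loss for a single sample in plain Python. The function names, the `temperature` and `alpha` values, and the choice of this particular loss are assumptions made for illustration, not details taken from the article.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kd_loss(student_logits, teacher_logits, label, temperature=4.0, alpha=0.5):
    """Distillation loss for one sample (hypothetical helper).

    alpha weights the hard-label cross-entropy; (1 - alpha) weights the
    KL divergence between softened teacher and student distributions.
    The default temperature and alpha are illustrative assumptions.
    """
    # Hard-label term: cross-entropy of the student at temperature 1.
    p_student = softmax(student_logits)
    ce = -math.log(p_student[label])

    # Soft-label term: KL(teacher || student) at the given temperature.
    q_teacher = softmax(teacher_logits, temperature)
    q_student = softmax(student_logits, temperature)
    kl = sum(t * math.log(t / s) for t, s in zip(q_teacher, q_student) if t > 0)

    # Scaling by T^2 keeps the soft term comparable across temperatures.
    return alpha * ce + (1.0 - alpha) * temperature ** 2 * kl


if __name__ == "__main__":
    teacher = [1.0, 2.0, 3.0]
    print(kd_loss([1.0, 2.0, 3.0], teacher, label=2))  # student matches teacher
    print(kd_loss([3.0, 2.0, 1.0], teacher, label=2))  # student disagrees
```

In this sketch a student whose logits match the teacher's incurs only the hard-label term, while a student that disagrees is penalized by both terms. In a real training loop the same quantity would be computed batch-wise with an automatic differentiation library.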
DOI
10.1016/j.sysarc.2026.103778
Archive
https://hdl.handle.net/20.500.11767/151331
info:eu-repo/semantics/altIdentifier/scopus/2-s2.0-105032919184
https://ricerca.unityfvg.it/handle/20.500.11767/151331
Rights
metadata only access
Subjects
  • Deep learning
  • Image processing
  • Neural network compre...
  • Tensor decomposition
