Neural networks trained with SGD learn distributions of increasing complexity

Refinetti, Maria • Ingrosso, Alessandro • Goldt, Sebastian
2025 • journal article

Journal
JOURNAL OF STATISTICAL MECHANICS: THEORY AND EXPERIMENT
Abstract
The uncanny ability of over-parameterised neural networks to generalise well has been explained using various "simplicity biases". These theories postulate that neural networks avoid overfitting by first fitting simple, linear classifiers before learning more complex, non-linear functions. Meanwhile, data structure is also recognised as a key ingredient for good generalisation, yet its role in simplicity biases is not yet understood. Here, we show that neural networks trained using stochastic gradient descent initially classify their inputs using lower-order input statistics, like mean and covariance, and exploit higher-order statistics only later during training. We first demonstrate this distributional simplicity bias (DSB) in a solvable model of a single neuron trained on synthetic data. We then demonstrate DSB empirically in a range of deep convolutional networks and visual transformers trained on CIFAR10, and show that it even holds in networks pre-trained on ImageNet. We discuss the relation of DSB to other simplicity biases and consider its implications for the principle of Gaussian universality in learning.
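The distinction the abstract draws between lower-order statistics (mean and covariance) and higher-order statistics can be illustrated with a Gaussian surrogate dataset, which matches the first two moments of the inputs and discards everything else. The following is only a minimal sketch under stated assumptions: it uses NumPy, and the function names `gaussian_surrogate` and `skewness` and the toy data are hypothetical illustrations, not code or data from the article.

```python
import numpy as np


def gaussian_surrogate(x, rng):
    """Sample Gaussian points whose mean and covariance match those of x."""
    mean = x.mean(axis=0)
    cov = np.cov(x, rowvar=False)
    return rng.multivariate_normal(mean, cov, size=x.shape[0])


def skewness(a):
    """Per-feature standardised third moment (a higher-order statistic)."""
    centred = a - a.mean(axis=0)
    return (centred ** 3).mean(axis=0) / a.std(axis=0) ** 3


rng = np.random.default_rng(0)

# Toy non-Gaussian inputs (hypothetical): skewed, linearly mixed features.
z = rng.exponential(size=(20000, 3))
mixing = np.array([[1.0, 0.5, 0.0],
                   [0.0, 1.0, 0.3],
                   [0.0, 0.0, 1.0]])
x = z @ mixing

# Surrogate with the same mean and covariance but no higher-order structure.
g = gaussian_surrogate(x, rng)

print("max |mean difference|:", np.abs(x.mean(axis=0) - g.mean(axis=0)).max())
print("max |cov difference|: ", np.abs(np.cov(x, rowvar=False) - np.cov(g, rowvar=False)).max())
print("skewness of x:", skewness(x))
print("skewness of g:", skewness(g))
```

A classifier that relies only on mean and covariance cannot distinguish `x` from `g`, whereas one that exploits higher-order statistics such as skewness can.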
DOI
10.1088/1742-5468/ad8bb8
WOS
WOS:001424698800001
Archive
https://hdl.handle.net/20.500.11767/137850
info:eu-repo/semantics/altIdentifier/scopus/2-s2.0-105004673305
https://arxiv.org/abs/2211.11567
Rights
open access
license:creative commons
license uri:http://creativecommons.org/licenses/by/4.0/
Subjects
  • Settore FIS/07 - Fisi...
