In this work, we propose a novel stochastic optimization algorithm for the unconstrained, nonlinear, and non-convex optimization problems arising in the training of deep neural networks. The new algorithm combines first- and second-order information: at each step, the search direction is a linear combination of a variance-reduced gradient and a stochastic limited-memory quasi-Newton direction. We report computational experiments on the training of a modern deep residual neural network for image classification tasks. The numerical results show that the proposed algorithm performs comparably to or better than the state-of-the-art Adam optimizer, while avoiding the laborious tuning of Adam's many hyperparameters.
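As a schematic sketch of such a combined step, with symbols $\tilde g_k$, $H_k$, $\lambda$, $w_k$, and $\eta_k$ introduced here for exposition rather than taken from the paper's own notation, the update can be written as
$$
p_k \;=\; -\bigl[(1-\lambda)\,\tilde g_k \;+\; \lambda\, H_k \tilde g_k\bigr],
\qquad
w_{k+1} \;=\; w_k + \eta_k\, p_k,
$$
where $\tilde g_k$ denotes a variance-reduced stochastic gradient, $H_k \tilde g_k$ the stochastic limited-memory quasi-Newton direction obtained from an inverse-Hessian approximation $H_k$, $\lambda \in [0,1]$ a weight balancing the two contributions, and $\eta_k$ the step size.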