Parallel Trust-Region Approaches in Neural Network Training: Beyond Traditional Methods

We propose to train neural networks (NNs) using a novel variant of the “Additively Preconditioned Trust-region Strategy” (APTS). The method is based on a parallelizable additive domain decomposition of the neural network’s parameters. Because it is built upon the trust-region (TR) framework, APTS ensures global convergence towards a minimizer. Moreover, it eliminates the need for computationally expensive hyper-parameter tuning, as the TR mechanism automatically determines the step size in each iteration. We demonstrate the capabilities, strengths, and limitations of the proposed APTS training method through a series of numerical experiments, which include a comparison with widely used training methods such as SGD, Adam, L-BFGS, and the standard TR method.
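For intuition, here is a minimal sketch of the standard TR acceptance test that underlies such methods; the paper's APTS variant builds on this framework, and the symbols below are generic rather than taken from the paper. Given parameters θ_k, a local model m_k of the loss L, and a trial step s_k constrained by the TR radius Δ_k:

\[
\rho_k = \frac{L(\theta_k) - L(\theta_k + s_k)}{m_k(0) - m_k(s_k)},
\qquad
\theta_{k+1} =
\begin{cases}
\theta_k + s_k, & \text{if } \rho_k \geq \eta, \\
\theta_k, & \text{otherwise,}
\end{cases}
\]

with a fixed acceptance threshold 0 < η < 1. The radius Δ_k is enlarged when ρ_k indicates good agreement between model and loss and shrunk otherwise, which is how the step size is controlled automatically, without a manually tuned learning rate.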


Citation: K. Trotti, S. Cruz, A. Kopaničáková, and R. Krause. Parallel Trust-Region Approaches in Neural Network Training: Beyond Traditional Methods. arXiv:2302.07049, 2023.
Download: Preprint.