Multilevel training methods



Deep neural networks (DNNs) suffer from a computationally demanding training phase, which limits their applicability and hinders their development. During training, the parameters of the network are determined by minimizing a highly non-convex loss function. My research objective is to enhance the training of DNNs by leveraging multilevel and domain-decomposition techniques. This entails developing novel optimization methods that perform well in large-scale stochastic settings, i.e., methods that are memory- and computationally efficient and whose convergence does not deteriorate in the presence of sub-sampling noise. Moreover, it is important to devise suitable strategies for constructing the multilevel hierarchy. This can be achieved by exploring the structure of the network's architecture, the data representation, and the properties of the loss function.
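To make the multilevel idea concrete, here is a minimal two-level sketch in the spirit of MG/Opt. It is an illustration under strong assumptions, not the method of the references: the "loss" is a small deterministic quadratic rather than a stochastic DNN loss, the hierarchy is a 1D grid rather than a network architecture, and all names (laplacian, prolongation, descend, two_level_cycle) are hypothetical. The point it shows is the first-order consistency term, which makes the coarse-level model agree with the restricted fine-level gradient before the coarse correction is computed.

```python
import numpy as np

def laplacian(n):
    """Tridiagonal (-1, 2, -1) matrix: a toy stand-in for a fine-level Hessian."""
    return 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)

def prolongation(nc):
    """Linear interpolation from nc coarse unknowns to 2*nc + 1 fine unknowns."""
    P = np.zeros((2 * nc + 1, nc))
    for j in range(nc):
        P[2 * j : 2 * j + 3, j] = [0.5, 1.0, 0.5]
    return P

def descend(grad, x, step, iters):
    """Plain gradient descent, used as smoother and as coarse-level solver."""
    for _ in range(iters):
        x = x - step * grad(x)
    return x

def two_level_cycle(x, f, grad_f, grad_fc, P, R, step_f, step_c):
    x = descend(grad_f, x, step_f, 2)            # pre-smoothing on the fine level
    xc0 = R @ x                                  # restrict the current iterate
    delta = R @ grad_f(x) - grad_fc(xc0)         # first-order consistency term
    xc = descend(lambda y: grad_fc(y) + delta, xc0, step_c, 50)
    trial = x + P @ (xc - xc0)                   # prolongate the coarse correction
    if f(trial) < f(x):                          # accept only if the loss decreases
        x = trial
    return descend(grad_f, x, step_f, 2)         # post-smoothing on the fine level

# Toy problem (assumed for illustration): f(x) = 0.5 x^T A x - b^T x.
nc = 7
P = prolongation(nc)
R = 0.5 * P.T
A = laplacian(2 * nc + 1)
Ac = R @ A @ P                                   # Galerkin coarse-level operator
b = np.ones(2 * nc + 1)

f = lambda x: 0.5 * x @ A @ x - b @ x
grad_f = lambda x: A @ x - b
grad_fc = lambda y: Ac @ y - R @ b

step_f = 1.0 / np.linalg.norm(A, 2)              # 1 / largest eigenvalue
step_c = 1.0 / np.linalg.norm(Ac, 2)

x = np.zeros(2 * nc + 1)
history = [f(x)]
for _ in range(5):
    x = two_level_cycle(x, f, grad_f, grad_fc, P, R, step_f, step_c)
    history.append(f(x))
print(history)
```

The accept-or-reject test on the coarse correction is a deliberately crude safeguard. In a stochastic setting the loss and gradient would be sub-sampled estimates, so such a test becomes noisy; that is the kind of difficulty the paragraph above refers to.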
Related references:
[1] S. Gratton, A. Kopaničáková and Ph. L. Toint. Multilevel Objective-Function-Free Optimization with an Application to Neural Networks Training.
[2] A. Kopaničáková and R. Krause. Globally Convergent Multilevel Training of Deep Residual Networks.
[3] L. Gaedke-Merzhauser*, A. Kopaničáková*, and R. Krause. Multilevel minimization for deep residual networks.
[4] C. von Planta, A. Kopaničáková, and R. Krause. Training of residual networks with stochastic MG/Opt.
[5] V. Braglia*, A. Kopaničáková*, and R. Krause. A multilevel approach to training.

Nonlinear preconditioning


Nonlinear field-split or domain-decomposition preconditioners enable the efficient solution of large-scale nonlinear multi-physics problems. The idea behind these methods is to enhance the convergence of nonlinear solution strategies by rebalancing the nonlinearities, or by transforming the basis of the solution space. This can be achieved by decomposing either the computational domain or the physics of the coupled problems. The former is of particular interest in the field of scientific computing, as it allows for massive parallelization. My research goal is to enhance the class of nonlinear preconditioners by proposing novel algorithmic variants and by extending the applicability of existing approaches to a wider range of nonlinear problems.
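One common way to write this idea down is the additive Schwarz (ASPIN-type) form sketched below. It is given only as an illustration and is not necessarily the exact formulation used in the references; the symbols F, u, R_i, T_i and N are notation assumed here. Each subdomain or field contributes a local nonlinear solve, and the outer solver is applied to the sum of the resulting corrections instead of to the original residual.

```latex
% Original problem: find u such that F(u) = 0, with F : R^n -> R^n.
% R_i restricts a global vector to the i-th subdomain (or field), i = 1, ..., N.
% Local nonlinear solve: find the correction T_i(u) in the i-th subspace with
R_i \, F\bigl(u - R_i^{T} T_i(u)\bigr) = 0, \qquad i = 1, \dots, N .
% Nonlinearly preconditioned system, solved in place of F(u) = 0:
\mathcal{F}(u) := \sum_{i=1}^{N} R_i^{T} \, T_i(u) = 0 .
```

The local solves are independent of each other, which is where the parallelism of the domain-decomposition variant comes from.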
Related references:
[1] A. Kopaničáková, H. Kothari, G. Karniadakis and R. Krause. Enhancing training of physics-informed neural networks using domain-decomposition based preconditioning strategies.
[2] H. Kothari, A. Kopaničáková and R. Krause. Nonlinear Schwarz preconditioning for nonlinear optimization problems with bound constraints.
[3] A. Kopaničáková, H. Kothari, and R. Krause. Nonlinear Field-split Preconditioners for Solving Monolithic Phase-field Models of Brittle Fracture.
[4] C. Bilgen, A. Kopaničáková, R. Krause, and K. Weinberg. A detailed investigation of the model influencing parameters of the phase-field fracture approach.

Multilevel trust-region methods


Multilevel methods were originally designed for solving elliptic partial differential equations. Their applicability was extended to non-convex optimization problems by utilizing the trust-region globalization strategy, giving rise to recursive multilevel trust-region (RMTR) methods [Gratton et al. ’08]. I have contributed to the development of RMTR methods by proposing several novel variants that take into account the structure of the underlying optimization problem in order to construct the multilevel hierarchy and transfer operators. These methods are unique as they allow for the solution of complex non-convex minimization problems with multigrid efficiency. Moreover, they are provably globally convergent, thus guaranteeing the success of the nonlinear iteration process.
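The trust-region globalization mentioned above can be sketched on a single level as follows. This is a textbook-style illustration, not the RMTR algorithm itself: the step is the Cauchy point of a quadratic model, the test function is the Rosenbrock function, and the thresholds (0.1, 0.25, 0.75) and the names cauchy_point and trust_region are assumptions made for the example. In a multilevel variant, the step would instead be obtained from a coarse-level model, while the same acceptance test and radius update provide global convergence.

```python
import numpy as np

def f(x):
    """Rosenbrock function: a standard non-convex test problem."""
    return (1 - x[0]) ** 2 + 100 * (x[1] - x[0] ** 2) ** 2

def grad(x):
    return np.array([
        -2 * (1 - x[0]) - 400 * x[0] * (x[1] - x[0] ** 2),
        200 * (x[1] - x[0] ** 2),
    ])

def hess(x):
    return np.array([
        [2 - 400 * (x[1] - 3 * x[0] ** 2), -400 * x[0]],
        [-400 * x[0], 200.0],
    ])

def cauchy_point(g, B, radius):
    """Minimizer of the quadratic model along -g inside the trust region."""
    gnorm = np.linalg.norm(g)
    curvature = g @ B @ g
    tau = 1.0 if curvature <= 0 else min(1.0, gnorm ** 3 / (radius * curvature))
    return -tau * radius / gnorm * g

def trust_region(x, radius=1.0, eta=0.1, max_iter=200, tol=1e-8):
    history = [f(x)]
    for _ in range(max_iter):
        g, B = grad(x), hess(x)
        if np.linalg.norm(g) < tol:
            break
        s = cauchy_point(g, B, radius)
        predicted = -(g @ s + 0.5 * s @ B @ s)   # decrease promised by the model
        actual = f(x) - f(x + s)                 # decrease actually achieved
        rho = actual / predicted
        if rho > eta:                            # accept the step
            x = x + s
        if rho < 0.25:                           # poor model: shrink the region
            radius *= 0.5
        elif rho > 0.75:                         # good model: enlarge the region
            radius = min(2.0 * radius, 10.0)
        history.append(f(x))
    return x, history

x_final, tr_history = trust_region(np.array([-1.2, 1.0]))
print(tr_history[0], tr_history[-1])
```

Because a step is accepted only when the achieved decrease is a fixed fraction of the predicted one, the objective never increases, regardless of how the step was produced.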
Related references:
[1] S. Gratton, A. Kopaničáková and Ph. L. Toint. Multilevel Objective-Function-Free Optimization with an Application to Neural Networks Training.
[2] A. Kopaničáková. On the use of hybrid coarse-level models in multilevel minimization methods.
[3] A. Kopaničáková and R. Krause. Globally Convergent Multilevel Training of Deep Residual Networks.
[4] F. Chegini, A. Kopaničáková, R. Krause, and M. Weiser. Efficient identification of scars using heterogeneous model hierarchies.
[5] A. Kopaničáková and R. Krause. Recursive multilevel trust region method with application to fully monolithic phase-field models of brittle fracture.
[6] A. Kopaničáková, R. Krause, and R. Tamstorf. Subdivision-based nonlinear multiscale cloth simulation.
[7] F. Chegini, A. Kopaničáková, M. Weiser, and R. Krause. Quantitative analysis of nonlinear multifidelity optimization for inverse electrophysiology.
[8] A. Kopaničáková and R. Krause. Multilevel Active-Set Trust-Region (MASTR) Method for Bound Constrained Minimization.

Large-scale phase-field fracture simulations


Predicting damage and crack propagation is a long-standing challenge in computational mechanics. The phase-field approach to fracture allows for predicting crack evolution without the need to explicitly model crack paths and has therefore become very popular. The development of an efficient phase-field fracture simulation framework requires a scalable implementation of the underlying mathematical model and a robust solution strategy. I have contributed to both aspects by implementing finite-element phase-field fracture models and by proposing novel solution strategies for the arising non-convex, coupled, constrained minimization problems. The developed simulation framework has been used to simulate brittle, conchoidal, and pneumatic fractures. More recently, it has also been employed for pressure-induced fracture propagation of stochastic fracture networks in 2D/3D, considering realistic scenarios with up to 1 000 fractures.
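For orientation, a widely used form of the phase-field fracture energy is sketched below. It is a common textbook formulation and not necessarily the exact model of the references; the symbols u, c, psi, G_c, ell and the quadratic degradation function g are assumptions made here. It illustrates why the problem is coupled (displacement and phase field interact through g), non-convex in the pair (u, c), and constrained (cracks must not heal).

```latex
% u: displacement, c \in [0, 1]: phase field (c = 1 denotes fully broken material),
% \psi: elastic energy density, G_c: critical energy release rate,
% \ell: length-scale parameter, g(c) = (1 - c)^2: degradation function.
E(u, c) = \int_{\Omega} g(c)\, \psi\bigl(\varepsilon(u)\bigr)\, d\Omega
        + G_c \int_{\Omega} \Bigl( \frac{1}{2\ell}\, c^{2}
        + \frac{\ell}{2}\, |\nabla c|^{2} \Bigr)\, d\Omega ,
% minimized over (u, c) at each load step n, subject to irreversibility:
c^{n+1} \ge c^{n} \quad \text{in } \Omega .
```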
Related references:
[1] A. Kopaničáková, H. Kothari, and R. Krause. Nonlinear Field-split Preconditioners for Solving Monolithic Phase-field Models of Brittle Fracture.
[2] C. Bilgen, A. Kopaničáková, R. Krause, and K. Weinberg. A phase-field approach to pneumatic fracture.
[3] P. Zulian*, A. Kopaničáková* et al. Large scale simulation of pressure induced phase-field fracture propagation using Utopia.
[4] C. Bilgen, A. Kopaničáková, R. Krause, and K. Weinberg. A detailed investigation of the model influencing parameters of the phase-field fracture approach.
[5] A. Kopaničáková and R. Krause. Recursive multilevel trust region method with application to fully monolithic phase-field models of brittle fracture.
[6] C. Bilgen, A. Kopaničáková, R. Krause, and K. Weinberg. A phase-field approach to conchoidal fracture.

Scientific software

I actively contribute to the development of scientific software libraries with a particular focus on the development of large-scale, parallel nonlinear optimization strategies and simulation frameworks.
Software libraries:
Utopia: Open-source C++ embedded domain-specific language designed for parallel nonlinear solution strategies and finite element analysis. Code repository. (Core developer)
ROOK: Large-scale finite-element framework for (pressure-induced) phase-field fracture simulations. (Solo developer)
MultiscAI: Stochastic multilevel optimization framework for training ODE-based deep neural networks. (Solo developer)
DistTraiNN: Model-parallel framework for distributed training of deep neural networks. (Core developer)
Heart: Parallel framework for inverse problems in electrophysiology. (Contributor)