Globalization Strategies: What Do They Allow Us to Do?

Written by linearization | Published 2025/03/26
Tech Story Tags: julia-programming-language | nonlinearsolve.jl | robust-nonlinear-solvers | gpu-accelerated-computation | sparse-matrix-computations | jacobian-free-krylov-methods | scientific-machine-learning | benchmarking-nonlinear-solvers

TL;DR: Globalization strategies allow us to enhance the local algorithms described in Subsection 2.1 and ensure global convergence under suitable assumptions.

Table of Links

Abstract and 1. Introduction

2. Mathematical Description and 2.1. Numerical Algorithms for Nonlinear Equations

2.2. Globalization Strategies

2.3. Sensitivity Analysis

2.4. Matrix Coloring & Sparse Automatic Differentiation

3. Special Capabilities

3.1. Composable Building Blocks

3.2. Smart PolyAlgorithm Defaults

3.3. Non-Allocating Static Algorithms inside GPU Kernels

3.4. Automatic Sparsity Exploitation

3.5. Generalized Jacobian-Free Nonlinear Solvers using Krylov Methods

4. Results and 4.1. Robustness on 23 Test Problems

4.2. Initializing the Doyle-Fuller-Newman (DFN) Battery Model

4.3. Large Ill-Conditioned Nonlinear Brusselator System

5. Conclusion and References

2.2. Globalization Strategies.

Globalization Strategies allow us to enhance the local algorithms described in Subsection 2.1 and ensure global convergence under suitable assumptions [23, Theorem 3.2, 4.5, 4.6].

2.2.1. Line Search. Consider the N-dimensional Generalized Rosenbrock Function, which is defined as:
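A common form of this residual, assumed here (with f : R^N → R^N and root u* = (1, ..., 1)), is:

\[ f_1(u) = 1 - u_1, \qquad f_k(u) = 10\,\bigl(u_k - u_{k-1}^2\bigr), \quad k = 2, \dots, N. \]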

If we initialize the problem with (u_0)_1 = −1.2, Newton-Raphson fails to converge to the solution [Figure 1]. Line search methods are globalization strategies that avoid such failures by determining a positive step length α_k given the current iterate u_k and the search direction δu_k.

The direction is often given by the Newton Descent Direction, Steepest Descent, or one of the Multi-Step Schemes described in Subsection 2.1. The optimal choice for the step length is given by:
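With the merit function g(u) = (1/2)‖f(u)‖² (an assumption about the exact objective used here), the exact line search reads:

\[ \alpha_k = \underset{\alpha > 0}{\arg\min}\; g(u_k + \alpha\, \delta u_k). \]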

However, solving a global optimization problem on each step of the iterative nonlinear solver is prohibitively expensive. Instead, practical line search methods rely on selecting a set of candidate step sizes and terminating the search based on certain conditions:

Armijo Rule and Curvature Criteria: The Armijo rule, or sufficient decrease criterion, states that α_k is acceptable only if there is a sufficient decrease in the objective function:
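In its standard form [23], with the merit function g as above, the sufficient decrease condition is:

\[ g(u_k + \alpha_k\, \delta u_k) \le g(u_k) + c_1\, \alpha_k\, \nabla g(u_k)^T \delta u_k \]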

Additionally, we use the curvature condition to filter out values for α_k that satisfy sufficient decrease but are very close to the current iterate. This ensures that the algorithm makes reasonable progress at every step:
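In its standard form [23], the curvature condition is:

\[ \nabla g(u_k + \alpha_k\, \delta u_k)^T \delta u_k \ge c_2\, \nabla g(u_k)^T \delta u_k \]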

where c_2 ∈ (c_1, 1) and c_1 ∈ (0, 1). These two conditions are collectively known as the Wolfe Conditions [26, 27].

Strong Wolfe Conditions: The strong Wolfe conditions additionally tighten the curvature criterion so that the directional derivative cannot be too positive:
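In its standard form [23], the tightened curvature condition is:

\[ \left| \nabla g(u_k + \alpha_k\, \delta u_k)^T \delta u_k \right| \le c_2 \left| \nabla g(u_k)^T \delta u_k \right| \]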

Backtracking Line Search: In this method, we start with an initial step length (typically 1 for most root-finding algorithms) and shrink the step length by an (adaptive) factor ρ ∈ (0, 1). In this case, we automatically ensure that the algorithm is making reasonable progress; hence, we only need to check the sufficient decrease criterion [Equation (2.16)] [23, Section 3.1, Sufficient Decrease and Backtracking].
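As a minimal, generic sketch of this idea (not the LineSearches.jl implementation; the function and argument names are illustrative), a fixed-factor backtracking loop looks like:

```julia
# Generic backtracking sketch: `g` is a scalar merit function, `gk = g(u)`,
# `slope = ∇g(u)'δu` (negative for a descent direction), `c1` the
# sufficient-decrease constant, and `ρ` the shrink factor.
function backtrack(g, u, δu, gk, slope; α = 1.0, ρ = 0.5, c1 = 1e-4, maxiters = 50)
    for _ in 1:maxiters
        # Accept as soon as the sufficient decrease (Armijo) condition holds
        if g(u .+ α .* δu) <= gk + c1 * α * slope
            return α
        end
        α *= ρ            # otherwise shrink the step and try again
    end
    return α              # fall back to the last (smallest) trial step
end
```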

NonlinearSolve.jl supports line search routines for its solvers using the LineSearches.jl package [20]. In Figure 1, we demonstrate how NonlinearSolve.jl provides a uniform API to switch between different line search routines like BackTracking [23], HagerZhang [28], MoreThuente [29], etc. Using line search methods, we successfully solve the generalized Rosenbrock function [Equation (2.12)].
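As a minimal sketch of that uniform API (the `linesearch` keyword follows the LineSearches.jl-based interface assumed here and may differ between package versions; the residual form, `N = 10`, and the non-specified components of the initial guess are illustrative assumptions):

```julia
using NonlinearSolve, LineSearches

# In-place residual for the generalized Rosenbrock system (form assumed above)
function rosenbrock!(du, u, p)
    du[1] = 1 - u[1]
    for k in 2:length(u)
        du[k] = 10 * (u[k] - u[k - 1]^2)
    end
    return nothing
end

N = 10
u0 = ones(N)
u0[1] = -1.2    # the paper only fixes the first component; the rest is set to 1 here
prob = NonlinearProblem(rosenbrock!, u0)

# Same problem, different line search routines via a single keyword argument
sol_newton = solve(prob, NewtonRaphson())
sol_bt     = solve(prob, NewtonRaphson(; linesearch = BackTracking()))
sol_hz     = solve(prob, NewtonRaphson(; linesearch = HagerZhang()))
```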

2.2.2. Trust Region Methods. As an alternative to using line search as a globalization strategy, trust-region methods restrict the next iterate to lie in a local region around the current iterate. These methods estimate a quadratic model for the objective function:
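One common choice of model, assumed here, is the Gauss-Newton-type quadratic built from the Jacobian J(u_k):

\[ m_k(\delta u) = \tfrac{1}{2}\, \lVert f(u_k) + J(u_k)\, \delta u \rVert_2^2 \]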

and attempt to solve the following sub-problem:
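In its standard form [23], the trust-region sub-problem is:

\[ \min_{\delta u}\; m_k(\delta u) \quad \text{subject to} \quad \lVert \delta u \rVert \le \Delta_k \]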

where Δ_k is the trust region radius at the current iterate k. To perform an update to the current iterate and the trust region radius, we compute the ratio of the actual reduction to the predicted reduction (ρ_k):
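With the merit function g and model m_k as above (assumed forms), this ratio is:

\[ \rho_k = \frac{g(u_k) - g(u_k + \delta u_k)}{m_k(0) - m_k(\delta u_k)} \]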

If ρ_k is greater than a threshold, i.e., there is strong agreement between the predicted model and the true system, we accept the step and increase the trust-region radius Δ_k. If 0 < ρ_k ≪ 1, we accept the step but keep the trust-region radius unchanged. Otherwise, the step is rejected and the trust region is shrunk. The exact thresholds for these choices depend on the underlying algorithm being used. NonlinearSolve.jl implements additional sophisticated algorithms to update the trust region: Hei [30], Yuan [31], Fan [32], and Bastin [33]. In Figure 2 and Section 4, we demonstrate how different radius update schemes achieve a compromise between computational cost and solution accuracy.
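As a sketch of switching between these schemes (assuming the `radius_update_scheme` keyword and `RadiusUpdateSchemes` names exposed by NonlinearSolve.jl, which may differ between package versions, and reusing the `prob` defined in the line search example above):

```julia
using NonlinearSolve

# Same problem, different trust-region radius update schemes
sol_default = solve(prob, TrustRegion())
sol_hei     = solve(prob, TrustRegion(; radius_update_scheme = RadiusUpdateSchemes.Hei))
sol_bastin  = solve(prob, TrustRegion(; radius_update_scheme = RadiusUpdateSchemes.Bastin))
```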

This paper is available on arXiv under a CC BY 4.0 DEED license.

Authors:

(1) AVIK PAL, CSAIL MIT, Cambridge, MA;

(2) FLEMMING HOLTORF;

(3) AXEL LARSSON;

(4) TORKEL LOMAN;

(5) UTKARSH;

(6) FRANK SCHΓ„FER;

(7) QINGYU QU;

(8) ALAN EDELMAN;

(9) CHRIS RACKAUCKAS, CSAIL MIT, Cambridge, MA.

