C Example Problems
A serial dense example: kinFerTron_dns
As an initial illustration of the use of the KINSOL package for the solution of
nonlinear systems, we give a sample program called kinFerTron_dns.c. It uses
the dense linear solver and the serial vector for the solution of the Ferraris-Tronconi test problem
[63].
This problem involves a blend of trigonometric and exponential terms,
and is subject to the following constraints
The bounds constraints on \(x_1\) and \(x_2\) are treated by introducing four additional variables and using KINSOL’s optional constraints feature to enforce non-positivity and non-negativity:
The Ferraris-Tronconi problem has two known solutions. We solve it with KINSOL using two sets of initial guesses for \(x_1\) and \(x_2\) (first their lower bounds and secondly the middle of their feasible regions), both with an exact and modified Newton method, with and without line search.
Following the initial comment block, this program has a number of #include
lines, which allow access to useful items in KINSOL header files:
kinsol.hprovides prototypes for the KINSOL functions to be called and also a number of constants that are to be used in setting input arguments and testing the return value ofKINSol().nvector_serial.hprovides prototypes for the serial vector implementation.sunmatrix_dense.hprovides the prototypes for the dense matrix implementation.sunlinsol_dense.hprovides the prototypes for the dense linear solver implementation.sundials_types.hfile provides the definition ofsunrealtype. For now, it suffices to readsunrealtypeasdouble.
Next, the program defines some problem-specific constants, which are isolated to this early location to make it easy to change them as needed.
The program prologue ends with prototypes of the user-supplied system function
func and several helper functions.
The main program begins with creating the SUNDIALS context object
(sunctx) which must be passed to all other SUNDIALS constructors. Then we
allocated the user data structure, data, which contains two arrays with
lower and upper bounds for \(x_1\) and \(x_2\). Next, we create five
serial vectors for the two different initial guesses (u1 and u2), the
solution vector (u), the scaling factors (s), and the constraint
specifications (c).
The initial guess vectors u1 and u2 are filled by the functions
SetInitialGuess1 and SetInitialGuess2 and the constraint vector c is
initialized to [0,0,1,-1,1,-1] indicating that there are no additional
constraints on the first two components of u (i.e., \(x_1\) and
\(x_2\)) and that the 3rd and 5th components should be non-negative, while
the 4th and 6th should be non-positive.
The calls to N_VNew_Serial(), and also later calls to various KIN***
functions, make use of a function, check_flag, which examines the
return value and prints a message if there was a failure. The check_flag
function was written to be used for any serial SUNDIALS application.
The call to KINCreate() creates the KINSOL solver memory block. Its
return value is a pointer to that memory block for this problem. In the case of
failure, the return value is NULL. This pointer must be passed in the
remaining calls to KINSOL functions.
The next four calls to KINSOL optional input functions specify the pointer to
the user data structure (to be passed to all subsequent calls to func), the
vector of additional constraints, and the function and scaled step tolerances,
fnormtol and scsteptol, respectively.
The solver is initialized with KINInit() call which specifies the system
function func and provides the vector u which will be used internally as
a template for cloning additional necessary vectors of the same type as u.
The call to SUNDenseMatrix() to creates the Jacobian matrix to use with
the dense linear solver which is created by SUNLinSol_Dense(). Finally,
both of these objects are attached to KINSOL by calling
KINSetLinearSolver().
The main program proceeds by solving the nonlinear system eight times, using
each of the two initial guesses, u1 and u2 (which are first copied into
the vector u using N_VScale()), with and without globalization
through line search (specified by setting glstr to KIN_LINESEARCH and
KIN_NONE, respectively), and applying either an exact or a modified Newton
method. The switch from exact to modified Newton is done by changing the number
of nonlinear iterations after which a Jacobian evaluation is enforced, a value
mset=1 thus resulting in re-evaluating the Jacobian at every single
iteration of the nonlinear solver (exact Newton method). Note that passing
mset=0 indicates using the default KINSOL value of 10.
The actual problem solution is carried out in the function SolveIt which
calls the main solver function, KINSol(), after first setting the
optional input mset. After a successful return from KINSol(), the
solution \([x_1, x_2]\) and some solver statistics are printed.
The function func is a straightforward expression of the extended nonlinear
system. It uses the N_VGetArrayPointer() function to extract the data
arrays of the vectors u and f and sets the components of fdata using
the current values for the components of udata. See KINSysFn for a
detailed specification of func.
The output generated by kinFerTron_dns is shown below.
Ferraris and Tronconi test problem
Tolerance parameters:
fnormtol = 1e-05
scsteptol = 1e-05
------------------------------------------
Initial guess on lower bounds
[x1,x2] = 0.25 1.5
Exact Newton
Solution:
[x1,x2] = 0.299449 2.83693
Final Statistics:
nni = 3 nfe = 4
nje = 3 nfeD = 18
Exact Newton with line search
Solution:
[x1,x2] = 0.299449 2.83693
Final Statistics:
nni = 3 nfe = 4
nje = 3 nfeD = 18
Modified Newton
Solution:
[x1,x2] = 0.299449 2.83693
Final Statistics:
nni = 11 nfe = 12
nje = 2 nfeD = 12
Modified Newton with line search
Solution:
[x1,x2] = 0.299449 2.83693
Final Statistics:
nni = 11 nfe = 12
nje = 2 nfeD = 12
------------------------------------------
Initial guess in middle of feasible region
[x1,x2] = 0.625 3.89159
Exact Newton
Solution:
[x1,x2] = 0.5 3.14159
Final Statistics:
nni = 5 nfe = 6
nje = 5 nfeD = 30
Exact Newton with line search
Solution:
[x1,x2] = 0.5 3.14159
Final Statistics:
nni = 5 nfe = 6
nje = 5 nfeD = 30
Modified Newton
Solution:
[x1,x2] = 0.500003 3.1416
Final Statistics:
nni = 12 nfe = 13
nje = 2 nfeD = 12
Modified Newton with line search
Solution:
[x1,x2] = 0.500003 3.1416
Final Statistics:
nni = 12 nfe = 13
nje = 2 nfeD = 12
A serial Krylov example: kinFoodWeb_kry
We give here an example that illustrates the use of KINSOL with the GMRES Krylov method as the linear system solver.
This program solves a nonlinear system that arises from a discretized system of partial differential equations. The PDE system is a six-species food web population model, with predator-prey interaction and diffusion on the unit square in two dimensions. Given the dependent variable vector of species concentrations \(c = [c_1, c_2,..., c_{n_s}]^T\), where \(n_s = 2 n_p\) is the number of species and \(n_p\) is the number of predators and of prey, then the PDEs can be written as
where the subscripts \(i\) are used to distinguish the species, and where
The problem coefficients are given by
and
The spatial domain is the unit square \((x,y) \in [0,1] \times [0,1]\).
Homogeneous Neumann boundary conditions are imposed and the initial guess is constant in both \(x\) and \(y\). For this example, the equations are discretized spatially with standard central finite differences on a \(8 \times 8\) mesh with \(n_s = 6\), giving a system of size 384.
Among the initial #include lines in this case is sunlinsol_spgmr.h which
contains constants and function prototypes associated with the GMRES
linear solver.
The main program calls KINCreate() and then calls KINInit()
with the name of the user-supplied system function func and solution vector
as arguments. The main program then calls a number of KINSet* routines
to notify KINSOL of the user data pointer, the positivity constraints on the
solution, and convergence tolerances on the system function and step size. It
calls SUNLinSol_SPGMR() to create the linear solver, supplying the
maxl value of 15 as the maximum Krylov subspace dimension. It then calls
KINSetLinearSolver() to attach this solver module to KINSOL. Next, a
maximum value of maxlrst = 2 restarts is imposed through a call to
SUNLinSol_SPGMRSetMaxRestarts(). Finally, the user-supplied
preconditioner setup and solve functions, PrecSetupBD and PrecSolveBD,
are specified through a call to KINSetPreconditioner(). The data
pointer passed to KINSetUserData() is passed to PrecSetupBD and
PrecSolveBD whenever these are called.
Next, KINSol() is called to solve the system, the return value is tested
for error conditions, and the approximate solution vector is printed via a call
to PrintOutput. After that, PrintFinalStats is called to get and print
final statistics, and memory is freed by calls to N_VDestroy(),
FreeUserData, KINFree(), and SUNContext_Free(). The
statistics printed are the total numbers of nonlinear iterations (nni),
func evaluations (nfe, excluding those for \(Jv\) product
evaluations), func evaluations for \(Jv\) evaluations (nfeSG),
linear (Krylov) iterations (nli), preconditioner evaluations (npe), and
preconditioner solves (nps). See the Optional output functions
section for more information.
Mathematically, the dependent variable has three dimensions: species number,
\(x\) mesh point, and \(y\) mesh point. The macro IJ_Vptr isolates
the translation from three dimensions to the one-dimensional contiguous array of
components under the serial vector. This macro allows for clearer code and makes
it easy to change the underlying layout of the three-dimensional data.
The preconditioner used here is the block-diagonal part of the true Newton
matrix and is based only on the partial derivatives of the interaction terms
\(f\) in the above equation and hence its diagonal blocks are \(n_s
\times n_s\) matrices (\(n_s = 6\)). It is generated and factored in the
PrecSetupBD routine and backsolved in the PrecSolveBD routine. See
KINLsPrecSetupFn and KINLsPrecSolveFn for detailed
descriptions of these preconditioner functions.
The program kinFoodWeb_kry.c uses the “small” dense functions for all
operations on the \(6 \times 6\) preconditioner blocks. Thus it includes
sundials_dense.h, and calls the small dense matrix functions
SUNDlsMat_newDenseMat, SUNDlsMat_denseGETRF, and
SUNDlsMat_denseGETRS. The small dense functions are generally available for
KINSOL user programs (for more information, see the comments in the header file
sundials_dense.h).
In addition to the functions called by KINSOL, kinFoodWeb_kry.c includes
definitions of several functions. These are: AllocUserData to allocate
space for the preconditioner and the pivot arrays; InitUserData to load
problem constants in the data block; FreeUserData to free that block;
SetInitialProfiles to load the initial values in cc; PrintHeader to
print the heading for the output; PrintOutput to retrieve and print selected
solution values; PrintFinalStats to print statistics; and check_flag to
check return values for error conditions.
The output generated by kinFoodWeb_kry is shown below. Note that the
solution involved 9 Newton iterations, with an average of about 37 Krylov
iterations per Newton iteration.
Predator-prey test problem -- KINSol (serial version)
Mesh dimensions = 8 X 8
Number of species = 6
Total system size = 384
Flag globalstrategy = 0 (0 = None, 1 = Linesearch)
Linear solver is SPGMR with maxl = 15, maxlrst = 2
Preconditioning uses interaction-only block-diagonal matrix
Positivity constraints imposed on all components
Tolerance parameters: fnormtol = 1e-07 scsteptol = 1e-13
Initial profile of concentration
At all mesh points: 1 1 1 30000 30000 30000
Computed equilibrium species concentrations:
At bottom left:
1.16428 1.16428 1.16428 34927.5 34927.5 34927.5
At top right:
1.25797 1.25797 1.25797 37736.7 37736.7 37736.7
Final Statistics..
nni = 9 nli = 329
nfe = 10 nfeSG = 338
nps = 338 npe = 1 ncfl = 6
A parallel example: kinFoodWeb_kry_bbd_p
In this example, kinFoodWeb_kry_bbd_p, we solve the same problem as with
kinFoodWeb_kry above, but in parallel, and instead of supplying the
preconditioner we use the KINBBDPRE
preconditioner.
In this case, we think of the parallel MPI processes as being laid out in a
rectangle, and each process being assigned a subgrid of size MXSUB
\(\times\) MYSUB of the x-y grid. If there are NPEX processes in the
x direction and NPEY processes in the y direction, then the overall grid
size is MX \(\times\) MY with MX = NPEX * MXSUB and MY =
NPEY * MYSUB, and the size of the nonlinear system is NUM_SPECIES * MX *
MY.
The evaluation of the nonlinear system function is performed in func. In
this parallel setting, the processes first communicate the subgrid boundary data
and then compute the local components of the nonlinear system function. The MPI
communication is isolated in the function ccomm (which in turn calls
BRecvPost, BSend, and BRecvWait) and the subgrid boundary data
received from neighboring processes is loaded into the work array cext. The
computation of the nonlinear system function is done in func_local which
starts by copying the local segment of the cc vector into cext, and then
by imposing the boundary conditions by copying the first interior mesh line from
cc into cext. After this, the nonlinear system function is evaluated by
using central finite-difference approximations using the data in cext
exclusively.
KINBBDPRE uses a band-block-diagonal
preconditioner, generated by difference quotients. The upper and lower
half-bandwidths of the Jacobian block generated on each process are both equal
to \(2 n_s - 1\), and that is the value passed as mudq and mldq in
the call to KINBBDPrecInit(). These values are much less than the true
half-bandwidths of the Jacobian blocks, which are \(n_s \times\) MXSUB.
However, an even narrower band matrix is retained as the preconditioner, with
half-bandwidths equal to \(n_s\), and this is the value passed to
KINBBDPrecInit() for mukeep and mlkeep.
The function func_local is also passed as the gloc argument to
KINBBDPrecInit(). Since all communication needed for the evaluation of
the local approximation of \(f\) used in building the band-block-diagonal
preconditioner is already done for the evaluation of \(f\) in func, a
NULL pointer is passed as the gcomm argument to
KINBBDPrecInit().
The main program resembles closely that of the kinFoodWeb_kry example,
with particularization arising from the use of the MPI parallel vector. It begins by initializing MPI and obtaining the total
number of processes and the rank of the local process. The local length of the
solution vector is then computed as NUM_SPECIES * MXSUB * MYSUB. Distributed
vectors are created by calling the constructor N_VNew_Parallel() with
the MPI communicator and the local and global problem sizes as arguments. All
output is performed only from the process with ID equal to 0. Finally, after
KINSol() is called and the results are printed, all memory is
deallocated, and the MPI environment is terminated by calling MPI_Finalize.
The output generated by kinFoodWeb_kry_bbd_p is shown below. Note that 9
Newton iterations were required, with an average of about 52 Krylov iterations
per Newton iteration.
Predator-prey test problem -- KINSol (parallel-BBD version)
Mesh dimensions = 20 X 20
Number of species = 6
Total system size = 2400
Subgrid dimensions = 10 X 10
Processor array is 2 X 2
Flag globalstrategy = 0 (0 = None, 1 = Linesearch)
Linear solver is SPGMR with maxl = 20, maxlrst = 2
Preconditioning uses band-block-diagonal matrix from KINBBDPRE
Difference quotient half-bandwidths: mudq = 11, mldq = 11
Retained band block half-bandwidths: mukeep = 6, mlkeep = 6
Tolerance parameters: fnormtol = 1e-07 scsteptol = 1e-13
Initial profile of concentration
At all mesh points: 1 1 1 30000 30000 30000
Computed equilibrium species concentrations:
At bottom left:
1.165 1.165 1.165 34949 34949 34949
At top right:
1.25552 1.25552 1.25552 37663.2 37663.2 37663.2
Final Statistics..
nni = 9 nli = 464
nfe = 10 nfeSG = 473
nps = 473 npe = 1 ncfl = 6