1.5. Performance Profiling
Added in version 6.0.0.
SUNDIALS includes a lightweight performance profiling layer that can be enabled at compile-time. Optionally, this profiling layer can leverage Caliper [20] for more advanced instrumentation and profiling. By default, only SUNDIALS library code is profiled. However, a public profiling API can be utilized to leverage the SUNDIALS profiler to time user code regions as well (see §1.5.2).
1.5.1. Enabling Profiling
To enable profiling, SUNDIALS must be built with the CMake option
SUNDIALS_BUILD_WITH_PROFILING set to ON. To utilize Caliper
support, the CMake option ENABLE_CALIPER must also be set to ON.
More details in regards to configuring SUNDIALS with CMake can be found in
§1.1.
When SUNDIALS is built with profiling enabled and without Caliper, then the
environment variable SUNPROFILER_PRINT can be utilized to enable/disable the
printing of profiler information. Setting SUNPROFILER_PRINT=1 will cause the
profiling information to be printed to stdout when the SUNDIALS simulation context is
freed. Setting SUNPROFILER_PRINT=0 will result in no profiling information
being printed unless the SUNProfiler_Print() function is called
explicitly. By default, SUNPROFILER_PRINT is assumed to be 0.
SUNPROFILER_PRINT can also be set to a file path where the output should be printed.
If Caliper is enabled, then users should refer to the Caliper documentation
for information on getting profiler output. In most cases, this involves
setting the CALI_CONFIG environment variable.
Note
The SUNDIALS profiler requires POSIX timers or the Windows profileapi.h timers.
Warning
While the SUNDIALS profiling scheme is relatively lightweight, enabling profiling can still negatively impact performance. As such, it is recommended that profiling is enabled judiciously.
1.5.2. Profiler API
The primary way of interacting with the SUNDIALS profiler is through the following macros:
SUNDIALS_MARK_FUNCTION_BEGIN(profobj)
SUNDIALS_MARK_FUNCTION_END(profobj)
SUNDIALS_WRAP_STATEMENT(profobj, name, stmt)
SUNDIALS_MARK_BEGIN(profobj, name)
SUNDIALS_MARK_END(profobj, name)
Additionally, in C++ applications, the follow macro is available:
SUNDIALS_CXX_MARK_FUNCTION(profobj)
These macros can be used to time specific functions or code regions. When using
the *_BEGIN macros, it is important that a matching *_END macro is
placed at all exit points for the scope/function. The
SUNDIALS_CXX_MARK_FUNCTION macro only needs to be placed at the beginning of
a function, and leverages RAII to implicitly end the region.
The profobj argument to the macro should be a SUNProfiler object, i.e.
-
type SUNProfiler
An opaque pointer containing profiling information.
When SUNDIALS is built with profiling, a default profiling object is stored in the
SUNContext object and can be accessed with a call to
SUNContext_GetProfiler().
The name argument should be a unique string indicating the name of the
region/function. It is important that the name given to the *_BEGIN macros
matches the name given to the *_END macros.
In addition to the macros, the following methods of the SUNProfiler class
are available.
-
int SUNProfiler_Create(SUNComm comm, const char *title, SUNProfiler *p)
Creates a new
SUNProfilerobject.- Arguments:
comm– the MPI communicator to use, if MPI is enabled, otherwise can beSUN_COMM_NULL.title– a title or description of the profilerp– [in,out] On input this is a pointer to aSUNProfiler, on output it will point to a newSUNProfilerinstance
- Returns:
Returns zero if successful, or non-zero if an error occurred
-
int SUNProfiler_Free(SUNProfiler *p)
Frees a
SUNProfilerobject.- Arguments:
p– [in,out] On input this is a pointer to aSUNProfiler, on output it will beNULL
- Returns:
Returns zero if successful, or non-zero if an error occurred
-
int SUNProfiler_Begin(SUNProfiler p, const char *name)
Starts timing the region indicated by the
name.- Arguments:
p– aSUNProfilerobjectname– a name for the profiling region
- Returns:
Returns zero if successful, or non-zero if an error occurred
-
int SUNProfiler_End(SUNProfiler p, const char *name)
Ends the timing of a region indicated by the
name.- Arguments:
p– aSUNProfilerobjectname– a name for the profiling region
- Returns:
Returns zero if successful, or non-zero if an error occurred
-
int SUNProfiler_GetElapsedTime(SUNProfiler p, const char *name, double *time)
Get the elapsed time for the timer “name” in seconds.
- Arguments:
p– aSUNProfilerobjectname– the name for the profiling region of interesttime– upon return, the elapsed time for the timer
- Returns:
Returns zero if successful, or non-zero if an error occurred
-
int SUNProfiler_GetTimerResolution(SUNProfiler p, double *resolution)
Get the timer resolution in seconds.
- Arguments:
p– aSUNProfilerobjectresolution– upon return, the resolution for the timer
- Returns:
Returns zero if successful, or non-zero if an error occurred
-
int SUNProfiler_Print(SUNProfiler p, FILE *fp)
Prints out a profiling summary. When constructed with an MPI comm the summary will include the average and maximum time per rank (in seconds) spent in each marked up region.
- Arguments:
p– aSUNProfilerobjectfp– the file handler to print to
- Returns:
Returns zero if successful, or non-zero if an error occurred
-
int SUNProfiler_Reset(SUNProfiler p)
Resets the region timings and counters to zero.
- Arguments:
p– aSUNProfilerobject
- Returns:
Returns zero if successful, or non-zero if an error occurred
1.5.3. Example Usage
The following is an excerpt from the CVODE example code examples/cvode/serial/cvAdvDiff_bnd.c.
It is applicable to any of the SUNDIALS solver packages.
SUNContext ctx;
SUNProfiler profobj;
/* Create the SUNDIALS context */
retval = SUNContext_Create(SUN_COMM_NULL, &ctx);
/* Get a reference to the profiler */
retval = SUNContext_GetProfiler(ctx, &profobj);
/* ... */
SUNDIALS_MARK_BEGIN(profobj, "Integration loop");
umax = N_VMaxNorm(u);
PrintHeader(reltol, abstol, umax);
for(iout=1, tout=T1; iout <= NOUT; iout++, tout += DTOUT) {
retval = CVode(cvode_mem, tout, u, &t, CV_NORMAL);
umax = N_VMaxNorm(u);
retval = CVodeGetNumSteps(cvode_mem, &nst);
PrintOutput(t, umax, nst);
}
SUNDIALS_MARK_END(profobj, "Integration loop");
PrintFinalStats(cvode_mem); /* Print some final statistics */