scipy.differentiate.hessian

scipy.differentiate.hessian(f, x, *, tolerances=None, maxiter=10, order=8, initial_step=0.5, step_factor=2.0)

Evaluate the Hessian of a function numerically.

Parameters:
f : callable

The function whose Hessian is desired. The signature must be:

f(xi: ndarray) -> ndarray

where each element of xi is a finite real. If the function to be differentiated accepts additional arguments, wrap it (e.g. using functools.partial or lambda) and pass the wrapped callable into hessian. f must not mutate the array xi. See Notes regarding vectorization and the dimensionality of the input and output.
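For instance, a function that takes an extra parameter can be wrapped before being passed to hessian. The quadratic below is purely illustrative (it is not part of SciPy); its Hessian is exactly twice the identity at every point.

>>> import numpy as np
>>> from functools import partial
>>> from scipy.differentiate import hessian
>>> def f(xi, a):
...     return np.sum((xi - a)**2, axis=0)  # accepts shape (m, ...), returns (...)
>>> res = hessian(partial(f, a=1.0), np.zeros(2))  # fix extra argument `a` first
>>> np.allclose(res.ddf, 2 * np.eye(2))
True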

x : float array_like

Points at which to evaluate the Hessian. Must have at least one dimension. See Notes regarding the dimensionality and vectorization.

tolerances : dictionary of floats, optional

Absolute and relative tolerances. Valid keys of the dictionary are:

  • atol - absolute tolerance on the derivative

  • rtol - relative tolerance on the derivative

Iteration will stop when res.error < atol + rtol * abs(res.ddf). The default atol is the smallest normal number of the appropriate dtype, and the default rtol is the square root of the precision of the appropriate dtype.
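For example, tighter tolerances than the defaults can be requested with a dictionary (the values below are arbitrary; see the Notes for a lower bound on rtol):

>>> import numpy as np
>>> from scipy.differentiate import hessian
>>> from scipy.optimize import rosen, rosen_hess
>>> x = np.full(3, 0.5)
>>> res = hessian(rosen, x, tolerances={'atol': 1e-12, 'rtol': 1e-10})
>>> np.allclose(res.ddf, rosen_hess(x))
True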

order : int, default: 8

The (positive integer) order of the finite difference formula to be used. Odd integers will be rounded up to the next even integer.

initial_step : float, default: 0.5

The (absolute) initial step size for the finite difference derivative approximation.

step_factor : float, default: 2.0

The factor by which the step size is reduced in each iteration; i.e. the step size in iteration 1 is initial_step/step_factor. If step_factor < 1, subsequent steps will be greater than the initial step; this may be useful if steps smaller than some threshold are undesirable (e.g. due to subtractive cancellation error).
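In other words, the step sizes form a geometric sequence. A quick sketch with the default settings:

>>> import numpy as np
>>> initial_step, step_factor = 0.5, 2.0
>>> steps = initial_step / step_factor**np.arange(4)  # 0.5, 0.25, 0.125, 0.0625 for iterations 0-3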

maxiter : int, default: 10

The maximum number of iterations of the algorithm to perform. See Notes.

Returns:
res : _RichResult

An object similar to an instance of scipy.optimize.OptimizeResult with the following attributes. The descriptions are written as though the values will be scalars; however, each attribute is an array of the shape described in the Notes (e.g. ddf has shape (m, m) when the Hessian is evaluated at a single point).

success : bool array

True where the algorithm terminated successfully (status 0); False otherwise.

status : int array

An integer representing the exit status of the algorithm.

  • 0 : The algorithm converged to the specified tolerances.

  • -1 : The error estimate increased, so iteration was terminated.

  • -2 : The maximum number of iterations was reached.

  • -3 : A non-finite value was encountered.

ddf : float array

The Hessian of f at x, if the algorithm terminated successfully.

error : float array

An estimate of the error: the magnitude of the difference between the current estimate of the Hessian and the estimate in the previous iteration.

nfev : int array

The number of points at which f was evaluated.

Each element of an attribute is associated with the corresponding element of ddf. For instance, element [i, j] of nfev is the number of points at which f was evaluated for the sake of computing element [i, j] of ddf.
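For example, the per-element bookkeeping can be inspected directly (a small sketch using scipy.optimize.rosen):

>>> import numpy as np
>>> from scipy.differentiate import hessian
>>> from scipy.optimize import rosen
>>> res = hessian(rosen, np.full(3, 0.5))
>>> res.nfev.shape  # one evaluation count per element of res.ddf
(3, 3)
>>> res.error.shape  # likewise for the error estimates
(3, 3)
>>> converged = res.success  # boolean mask over the Hessian elements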

See also

derivative, jacobian

Notes

Suppose we wish to evaluate the Hessian of a function \(f: \mathbf{R}^m \rightarrow \mathbf{R}\), and we assign to variable m the positive integer value of \(m\). If we wish to evaluate the Hessian at a single point, then:

  • argument x must be an array of shape (m,)

  • argument f must be vectorized to accept an array of shape (m, ...). The first axis represents the \(m\) inputs of \(f\); the remaining axes indicated by ellipses are for evaluating the function at several abscissae in a single call.

  • argument f must return an array of shape (...).

  • attribute ddf of the result object will be an array of shape (m, m), the Hessian.
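As a concrete sketch of these shapes, consider a hypothetical quadratic (not part of SciPy) with a known Hessian:

>>> import numpy as np
>>> from scipy.differentiate import hessian
>>> def f(xi):  # accepts shape (m, ...), returns shape (...)
...     return xi[0]**2 + 3*xi[0]*xi[1] + 2*xi[1]**2
>>> res = hessian(f, np.zeros(2))  # single point: x has shape (m,) = (2,)
>>> res.ddf.shape
(2, 2)
>>> np.allclose(res.ddf, [[2, 3], [3, 4]])
True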

This function is also vectorized in the sense that the Hessian can be evaluated at k points in a single call. In this case, x would be an array of shape (m, k), f would accept an array of shape (m, ...) and return an array of shape (...), and the ddf attribute of the result would have shape (m, m, k). Note that the axis associated with the k points is included within the axes denoted by (...).

Currently, hessian is implemented by nesting calls to jacobian. All options passed to hessian are used for both the inner and outer calls with one exception: the rtol used in the inner jacobian call is tightened by a factor of 100 with the expectation that the inner error can be ignored. A consequence is that rtol should not be set less than 100 times the precision of the dtype of x; a warning is emitted otherwise.
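Conceptually, this works because the Hessian is the Jacobian of the gradient. The naive nesting below is a sketch of the idea only, not the library's actual implementation; it loses some accuracy because the outer call differentiates the (slightly noisy) output of the inner call, which is exactly why hessian tightens the inner rtol:

>>> import numpy as np
>>> from scipy.differentiate import jacobian
>>> from scipy.optimize import rosen, rosen_hess
>>> x = np.full(3, 0.5)
>>> def grad(xs):  # gradient of rosen, vectorized over the trailing axes of xs
...     g = jacobian(rosen, xs.reshape(len(xs), -1)).df  # shape (m, k)
...     return g.reshape(xs.shape)
>>> res = jacobian(grad, x)  # Jacobian of the gradient: the Hessian
>>> np.allclose(res.df, rosen_hess(x), atol=1e-4)
True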

Examples

The Rosenbrock function maps \(\mathbf{R}^m \rightarrow \mathbf{R}\); the SciPy implementation scipy.optimize.rosen is vectorized to accept an array of shape (m, ...) and return an array of shape (...). Suppose we wish to evaluate the Hessian at [0.5, 0.5, 0.5].

>>> import numpy as np
>>> from scipy.differentiate import hessian
>>> from scipy.optimize import rosen, rosen_hess
>>> m = 3
>>> x = np.full(m, 0.5)
>>> res = hessian(rosen, x)
>>> ref = rosen_hess(x)  # reference value of the Hessian
>>> np.allclose(res.ddf, ref)
True

hessian is vectorized to evaluate the Hessian at multiple points in a single call.

>>> rng = np.random.default_rng()
>>> x = rng.random((m, 10))
>>> res = hessian(rosen, x)
>>> ref = [rosen_hess(xi) for xi in x.T]
>>> ref = np.moveaxis(ref, 0, -1)
>>> np.allclose(res.ddf, ref)
True