scipy.spatial.distance.add_distance_kernel

scipy.spatial.distance.add_distance_kernel(metric_spec, dtype_spec, kernel, backend)

Register a distance computation kernel function.

Parameters:
metric_specstr

Specifies the metric name, dimension, striding, and additional parameters that the kernel supports. The format is described by the following BNF notation:

metric_spec ::= metric_name "." metric_dim
              | metric_name "." metric_dim ":"
              | metric_name "." metric_dim ":" param_spec
              | metric_name "." metric_dim ":" param_spec ":" param_spec
metric_name ::= IDENTIFIER
metric_dim  ::= DIM
param_spec  ::= param_name "." param_dim
param_name  ::= IDENTIFIER
param_dim   ::= DIM
IDENTIFIER  ::= [A-Za-z_][A-Za-z0-9_]*
DIM         ::= "s" | "v" | "m"

Examples:

  • 'cosine.s' declares a kernel to compute the distance between two vectors.

  • 'euclidean.v:' declares a kernel to compute the distance vector between two matrices of the same shape. The input and output may be strided.

  • 'mahalanobis.m:w.m' declares a kernel to compute the distance matrix between two matrices, taking an additional matrix-valued parameter w.

  • 'minkowski.m:w.v:p.s' declares a kernel to compute the distance matrix between two matrices, taking two additional parameters: vector parameter w and scalar parameter p.

The current implementation supports up to two additional parameters.
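As an illustration of the grammar above, the following minimal sketch validates and splits a metric_spec string with a regular expression. The helper name parse_metric_spec and the strided flag are hypothetical and not part of SciPy; the flag simply records whether a ':' is present.

```python
import re

# Hypothetical helper, not part of SciPy: mirrors the metric_spec BNF.
_IDENT = r"[A-Za-z_][A-Za-z0-9_]*"
_DIM = r"[svm]"
_METRIC_SPEC = re.compile(
    rf"(?P<name>{_IDENT})\.(?P<dim>{_DIM})"
    rf"(?P<strided>:(?P<params>{_IDENT}\.{_DIM}(?::{_IDENT}\.{_DIM})?)?)?"
)


def parse_metric_spec(spec):
    """Return (metric_name, metric_dim, strided, params) or raise ValueError."""
    match = _METRIC_SPEC.fullmatch(spec)
    if match is None:
        raise ValueError(f"invalid metric_spec: {spec!r}")
    params = []
    if match.group("params"):
        # Each param_spec has the form param_name "." param_dim.
        for item in match.group("params").split(":"):
            param_name, param_dim = item.split(".")
            params.append((param_name, param_dim))
    # A trailing or separating ':' is what marks the spec as strided here.
    strided = match.group("strided") is not None
    return match.group("name"), match.group("dim"), strided, params
```

Under this sketch, parse_metric_spec('minkowski.m:w.v:p.s') yields the metric name, its dimension, the strided flag, and the two (name, dim) parameter pairs, while a spec with three parameters is rejected.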

dtype_specstr

Specifies the input dtype and output dtype that the kernel supports.

dtype_spec   ::= inout_dtype
               | input_dtype "->" output_dtype
inout_dtype  ::= DTYPE_CODE
input_dtype  ::= DTYPE_CODE
output_dtype ::= DTYPE_CODE
DTYPE_CODE   ::= "f32"  ; IEEE 754-2008 'binary32' format
               | "f64"  ; IEEE 754-2008 'binary64' format

TODO: support non-floating point types

The kernel is guaranteed to be called with x and y of input_dtype, and with out and any additional arguments of output_dtype.
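The dtype_spec grammar can be sketched in the same way. The helper name parse_dtype_spec is hypothetical and not part of SciPy; the mapping to NumPy dtype names assumes only that 'binary32' and 'binary64' correspond to float32 and float64.

```python
# Hypothetical helper, not part of SciPy: mirrors the dtype_spec BNF.
_DTYPE_CODES = {"f32": "float32", "f64": "float64"}  # code -> NumPy dtype name


def parse_dtype_spec(spec):
    """Return (input_dtype, output_dtype) as NumPy dtype names."""
    input_code, arrow, output_code = spec.partition("->")
    if not arrow:
        # A single DTYPE_CODE covers both the input and the output.
        output_code = input_code
    try:
        return _DTYPE_CODES[input_code], _DTYPE_CODES[output_code]
    except KeyError:
        raise ValueError(f"invalid dtype_spec: {spec!r}") from None
```

With this sketch, 'f64' describes a kernel that reads and writes float64, and 'f32->f64' one that reads float32 inputs and writes float64 outputs.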

kernelscipy.LowLevelCallable

Distance computation kernel function and signature. kernel.function must wrap a C function pointer to the distance computation kernel; Python functions are NOT supported. kernel.user_data is ignored. kernel.signature must exactly match one of the following:

  • if metric_spec does not contain a ':', then

    • if metric_dim is 's':

      "int (void *, void *, size_t, void *)"

      The kernel is called with (x, y, n, out) where x and y are n-vectors of input_dtype and out is a scalar of output_dtype.

    • if metric_dim is 'v':

      "int (void *, void *, size_t, size_t, void *)"

      The kernel is called with (x, y, n, m, out) where x and y are m-by-n matrices of input_dtype and out is an m-vector of output_dtype.

    • if metric_dim is 'm':

      "int (void *, void *, size_t, size_t, size_t, void *)"

      The kernel is called with (x, y, n, mx, my, out) where x is an mx-by-n matrix and y is an my-by-n matrix, both of input_dtype, and out is an mx-by-my matrix of output_dtype.

  • if metric_spec contains a ':', then

    • if metric_dim is 's':

      "int (void *, ssize_t *, void *, ssize_t *, size_t, void *" + PARAM_LIST + ")"

      The kernel is called with (x, xs, y, ys, n, out, ARGS) where x, y, n, and out are defined above, and xs and ys are length-one arrays containing the strides of x and y, respectively.

    • if metric_dim is 'v':

      "int (void *, ssize_t *, void *, ssize_t *, size_t, size_t, void *, ssize_t *" + PARAM_LIST + ")"

      The kernel is called with (x, xs, y, ys, n, m, out, outs, ARGS) where x, y, n, m, and out are defined above, xs and ys are length-two arrays containing the strides of x and y, respectively, and outs is a length-one array containing the stride of out.

    • if metric_dim is 'm':

      "int (void *, ssize_t *, void *, ssize_t *, size_t, size_t, size_t, void *, ssize_t *" + PARAM_LIST + ")"

      The kernel is called with (x, xs, y, ys, n, mx, my, out, outs, ARGS) where x, y, n, mx, my, and out are defined above, and xs, ys, and outs are length-two arrays containing the strides of x, y, and out, respectively.

    In the above, PARAM_LIST = ", void *, ssize_t *" * N where 0 <= N <= 2 is the number of additional parameters. ARGS is empty when N = 0, is arg1, str1 when N = 1, and is arg1, str1, arg2, str2 when N = 2. Each argK is a scalar, vector, or matrix, and strK is a length-0, length-1, or length-2 array, respectively, holding its strides.
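The signature rules above can be collected into a small lookup. The sketch below copies the signature strings verbatim from the list and appends PARAM_LIST as described; the helper name expected_signature and its arguments are hypothetical and not part of SciPy.

```python
# Hypothetical helper, not part of SciPy: assembles the signature strings
# listed above for a given metric_dim, ':' flag, and parameter count.
_PLAIN = {
    "s": "int (void *, void *, size_t, void *)",
    "v": "int (void *, void *, size_t, size_t, void *)",
    "m": "int (void *, void *, size_t, size_t, size_t, void *)",
}
_STRIDED_PREFIX = {
    "s": "int (void *, ssize_t *, void *, ssize_t *, size_t, void *",
    "v": "int (void *, ssize_t *, void *, ssize_t *, size_t, size_t, "
         "void *, ssize_t *",
    "m": "int (void *, ssize_t *, void *, ssize_t *, size_t, size_t, size_t, "
         "void *, ssize_t *",
}


def expected_signature(metric_dim, strided, n_params=0):
    """Return the signature string a kernel must declare."""
    if not 0 <= n_params <= 2:
        raise ValueError("at most two additional parameters are supported")
    if not strided:
        if n_params:
            raise ValueError("additional parameters require ':' in metric_spec")
        return _PLAIN[metric_dim]
    # PARAM_LIST = ", void *, ssize_t *" * N
    return _STRIDED_PREFIX[metric_dim] + ", void *, ssize_t *" * n_params + ")"
```

A string built this way could be compared against kernel.signature before registration to catch a mismatched kernel early.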

All matrices are stored in row-major order. The kernel is guaranteed to be called such that no element of out overlaps in memory with any element of x, y, or an additional argument, and no two elements of out overlap in memory.
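To make the calling convention concrete, the following pure-Python model mimics what a non-strided 'cosine.m' kernel would compute from its (x, y, n, mx, my, out) arguments, using flat lists in place of row-major C buffers. It is only a model: a Python function cannot be registered, the name cosine_m_model is hypothetical, and returning 0 on success is an assumption because the meaning of the int return value is not stated here.

```python
import math


def cosine_m_model(x, y, n, mx, my, out):
    """Model of a non-strided 'cosine.m' kernel on flat row-major buffers."""
    for i in range(mx):
        xi = x[i * n:(i + 1) * n]        # row i of the mx-by-n matrix x
        for j in range(my):
            yj = y[j * n:(j + 1) * n]    # row j of the my-by-n matrix y
            dot = sum(a * b for a, b in zip(xi, yj))
            norm = math.sqrt(sum(a * a for a in xi) * sum(b * b for b in yj))
            # Element (i, j) of the mx-by-my matrix out.
            # Zero-length rows are not handled in this sketch.
            out[i * my + j] = 1.0 - dot / norm
    return 0  # assumed success code; the return convention is not documented


# One row in x, one row in y, three columns each.
out = [0.0]
cosine_m_model([1.0, 2.0, 3.0], [4.0, 5.0, 6.0], 3, 1, 1, out)
```

A real kernel would perform the same index arithmetic on typed pointers of input_dtype and output_dtype.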

backendstr

Name of the backend. Must be an IDENTIFIER. The default backend has the name "scipy".

Returns:
None

Notes

This function is not thread-safe and should not be called concurrently with other distance functions.

If a kernel is already registered with the given metric_spec, dtype_spec, and backend, possibly with different ordering of additional parameters, it is replaced with kernel.

The added kernel may be used if the backend is selected either via an explicit backend argument of a distance function or via the select_backend() function. (TODO: link select_backend)

Examples

The following example illustrates the steps needed from Python code to add a distance computation kernel and call it from cdist. It assumes that a capsule wrapping a C function pointer of the appropriate signature is already exported.

>>> import scipy
>>> import scipy.spatial.distance as sd
>>> capsule = sd._distance_wrap.cosine_DistanceMatrix_capsule()
>>> kernel = scipy.LowLevelCallable(capsule)
>>> sd.add_distance_kernel('cosine.m', 'f64', kernel, 'alt')
>>> sd.cdist([[1, 2, 3]], [[4, 5, 6]], 'cosine', backend='alt')
array([[0.02536815]])