hidimstat.clustered_inference

hidimstat.clustered_inference(X_init, y, ward, n_clusters, scaler_sampling=None, train_size=1.0, groups=None, seed=0, n_jobs=1, memory=None, verbose=1, **kwargs)

Clustered inference algorithm for statistical analysis of high-dimensional data.

This algorithm implements the method described in [Chevalier et al., 2022] for performing statistical inference on high-dimensional linear models using feature clustering to reduce dimensionality.

Parameters:

X_init : ndarray, shape (n_samples, n_features)
    Original high-dimensional input data matrix.
y : ndarray, shape (n_samples,) or (n_samples, n_times)
    Target variable(s). Can be univariate or multivariate (temporal) data.
ward : sklearn.cluster.FeatureAgglomeration
    Hierarchical clustering object that implements Ward's method for feature agglomeration.
n_clusters : int
    Number of clusters to use for dimensionality reduction.
scaler_sampling : sklearn.preprocessing object, optional (default=None)
    Scaler to standardize the clustered features.
train_size : float, optional (default=1.0)
    Fraction of samples to use for computing the clustering. When train_size=1.0, all samples are used.
groups : ndarray, shape (n_samples,), optional (default=None)
    Sample group labels for stratified subsampling.
seed : int, optional (default=0)
    Random seed for reproducible subsampling.
n_jobs : int, optional (default=1)
    Number of parallel jobs for computation.
memory : str or joblib.Memory object, optional (default=None)
    Used to cache the output of the clustering and inference computations. By default, no caching is done. If a string is given, it is the path to the caching directory.
verbose : int, optional (default=1)
    Verbosity level for progress messages.
**kwargs : dict
    Additional arguments passed to the statistical inference function.
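As an illustration of how the arguments above might be prepared, here is a minimal sketch using only NumPy and scikit-learn. The synthetic data, the 10 x 10 feature grid, the spatial connectivity constraint, and the choice of `StandardScaler` are all assumptions made for the example; none of them is required by the parameter descriptions. The call to `clustered_inference` is left commented out so the block runs without hidimstat installed, and its argument order simply follows the signature documented on this page.

```python
import numpy as np
from sklearn.cluster import FeatureAgglomeration
from sklearn.feature_extraction.image import grid_to_graph
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Assumed toy problem: 50 samples, features laid out on a 10 x 10 grid.
n_samples, grid_shape = 50, (10, 10)
n_features = grid_shape[0] * grid_shape[1]
X_init = rng.standard_normal((n_samples, n_features))

# Assumed sparse linear model: only the first 10 features carry signal.
beta = np.zeros(n_features)
beta[:10] = 1.0
y = X_init @ beta + 0.5 * rng.standard_normal(n_samples)

# Ward clustering object for the `ward` argument. The grid connectivity is an
# assumption; it restricts merges to spatially neighbouring features.
n_clusters = 20
connectivity = grid_to_graph(n_x=grid_shape[0], n_y=grid_shape[1])
ward = FeatureAgglomeration(
    n_clusters=n_clusters, connectivity=connectivity, linkage="ward"
)

# One possible choice for `scaler_sampling` (an assumption, not a default).
scaler_sampling = StandardScaler()

# With hidimstat installed, the call would follow the documented signature:
# from hidimstat import clustered_inference
# ward_, beta_hat, theta_hat, precision_diag = clustered_inference(
#     X_init, y, ward, n_clusters, scaler_sampling=scaler_sampling
# )
```

Passing an unfitted `FeatureAgglomeration` instance matches the Returns section, which lists the fitted clustering object among the outputs.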

Returns:

ward_ : FeatureAgglomeration
    Fitted clustering object.
beta_hat : ndarray, shape (n_clusters,) or (n_clusters, n_times)
    Estimated coefficients at cluster level.
theta_hat : ndarray
    Estimated precision matrix.
precision_diag : ndarray
    Diagonal of the covariance matrix.

Notes

The algorithm follows these main steps:

1. Subsample the data (if train_size < 1)
2. Cluster features using Ward hierarchical clustering
3. Transform data to cluster space
4. Perform statistical inference using desparsified lasso
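Steps 1 to 3 can be sketched with scikit-learn alone. This is a rough illustration under stated assumptions, not the library's implementation: the random data, the use of `rng.choice` for subsampling, and the specific sizes are all invented for the example. Step 4 (desparsified lasso) is specific to hidimstat and is deliberately left out.

```python
import numpy as np
from sklearn.cluster import FeatureAgglomeration

rng = np.random.default_rng(0)

# Assumed toy data: 60 samples, 40 features, reduced to 8 clusters.
n_samples, n_features, n_clusters = 60, 40, 8
X_init = rng.standard_normal((n_samples, n_features))

# Step 1: subsample the rows used to learn the clustering (train_size < 1).
# Drawing indices without replacement is an assumption about how to subsample.
train_size = 0.5
n_train = int(train_size * n_samples)
train_idx = rng.choice(n_samples, size=n_train, replace=False)

# Step 2: Ward hierarchical clustering of the features, fitted on the subsample.
ward = FeatureAgglomeration(n_clusters=n_clusters, linkage="ward")
ward.fit(X_init[train_idx])

# Step 3: project all samples into cluster space. By default each cluster is
# summarised by the mean of its member features.
X_reduced = ward.transform(X_init)

print(X_reduced.shape)            # (n_samples, n_clusters)
print(np.bincount(ward.labels_))  # number of original features per cluster
```

Fitting the clustering on a subsample while transforming the full data reflects the description of `train_size`, which controls only the fraction of samples used to compute the clustering.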

References

Examples using `hidimstat.clustered_inference`

Support recovery on fMRI data

Support recovery on simulated data (2D)
