Reference API

This is the primary API reference for snfpy. Refer to the user guide for guidance on using these functions in your own workflows.

List of modules

snf.compute - Primary SNF functionality

Contains the primary functions for conducting similarity network fusion workflows.

make_affinity(*data[, metric, K, mu, normalize]) Constructs affinity (i.e., similarity) matrix from data
get_n_clusters(arr[, n_clusters]) Finds optimal number of clusters in arr via eigengap method
snf(*aff[, K, t, alpha]) Performs Similarity Network Fusion on aff matrices
group_predict(train, test, labels, *[, K, mu, t]) Propagates labels from train data to test data via SNF
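
To make the eigengap idea behind get_n_clusters concrete, here is a minimal NumPy sketch. It is not snfpy's implementation: the function name eigengap_n_clusters, the plain (unweighted) gap criterion, and the toy affinity matrix are all assumptions chosen for illustration.

```python
import numpy as np


def eigengap_n_clusters(affinity, candidates=range(2, 6)):
    """Pick a cluster count via the largest gap in the Laplacian spectrum.

    Hypothetical helper for illustration only; not part of snfpy.
    """
    w = np.asarray(affinity, dtype=float)
    w = (w + w.T) / 2.0              # symmetrize (also makes a copy)
    np.fill_diagonal(w, 0.0)         # ignore self-similarity
    degree = w.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(degree, np.finfo(float).eps))
    # Symmetric normalized Laplacian: I - D^{-1/2} W D^{-1/2}
    laplacian = np.eye(len(w)) - d_inv_sqrt[:, None] * w * d_inv_sqrt[None, :]
    eigvals = np.linalg.eigvalsh(laplacian)   # sorted in ascending order
    gaps = np.diff(eigvals)                   # gaps[i] = eigvals[i+1] - eigvals[i]
    valid = [k for k in candidates if 1 < k < len(w)]
    # k clusters correspond to k small eigenvalues followed by a large gap
    return max(valid, key=lambda k: gaps[k - 1])


# Toy affinity: three tight groups of five samples with weak links between groups
demo_affinity = np.kron(np.eye(3), np.ones((5, 5))) + 0.01
estimated = eigengap_n_clusters(demo_affinity)
print(estimated)
```

The candidate range mirrors the idea of restricting the search to a small set of plausible cluster counts; any range of integers greater than one works in this sketch.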

snf.metrics - Evaluation metrics

Functions for computing various metrics to aid interpretation of similarity network fusion outputs.

nmi(labels) Calculates normalized mutual information for all combinations of labels
rank_feature_by_nmi(inputs, W, *[, K, mu, …]) Calculates NMI of each feature in inputs with W
silhouette_score(arr, labels) Calculates modified silhouette score from affinity matrix
affinity_zscore(arr, labels[, n_perms, seed]) Calculates z-score of silhouette (affinity) score by permutation
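
As a rough illustration of comparing every pair of label assignments with normalized mutual information, the sketch below uses only the standard library. The function names and the choice of arithmetic-mean normalization are assumptions made for this example; snfpy's nmi may differ in both its normalization and its return type.

```python
from collections import Counter
from math import log


def entropy(labels):
    """Shannon entropy (in nats) of a sequence of labels."""
    n = len(labels)
    return -sum((c / n) * log(c / n) for c in Counter(labels).values())


def normalized_mutual_info(a, b):
    """NMI of two labelings, normalized by the mean of their entropies."""
    n = len(a)
    count_a, count_b, count_ab = Counter(a), Counter(b), Counter(zip(a, b))
    mutual_info = sum(
        (c / n) * log(c * n / (count_a[x] * count_b[y]))
        for (x, y), c in count_ab.items()
    )
    denom = (entropy(a) + entropy(b)) / 2
    # Two constant labelings carry no information; treat them as identical
    return 1.0 if denom == 0 else mutual_info / denom


def pairwise_nmi(assignments):
    """Square matrix (list of lists) of NMI between all pairs of labelings."""
    return [[normalized_mutual_info(a, b) for b in assignments]
            for a in assignments]


# Three hypothetical cluster assignments for the same four samples
labelings = [[0, 0, 1, 1], [1, 1, 0, 0], [0, 1, 0, 1]]
nmi_matrix = pairwise_nmi(labelings)
```

In this toy input the first two labelings describe the same partition with swapped label names, so their NMI is at its maximum, while the third is independent of both.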

snf.cv - Cross-validation functions

Code for implementing cross-validation of similarity network fusion. Useful for determining the “optimal” number of clusters in a dataset within a cross-validated, data-driven framework.

snf_gridsearch(*data[, metric, mu, K, …]) Performs grid search for SNF hyperparameters mu, K, and n_clusters
get_optimal_params(zaff, labels[, neighbors]) Finds optimal parameters for SNF based on K-folds grid search
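
The general shape of a hyperparameter grid search over mu and K can be sketched as follows. This is a generic illustration, not snfpy's snf_gridsearch: the grid_search helper and the toy_score function are hypothetical, and in practice the score would come from a clustering-quality measure computed on the fused network.

```python
from itertools import product


def grid_search(score, mu_values, k_values):
    """Evaluate score(mu, K) on every grid point and return the best pair.

    Hypothetical helper for illustration only; not part of snfpy.
    """
    results = {(mu, k): score(mu, k) for mu, k in product(mu_values, k_values)}
    best = max(results, key=results.get)
    return best, results


def toy_score(mu, k):
    """Placeholder score that peaks at mu=0.5, K=20 (an arbitrary choice)."""
    return -((mu - 0.5) ** 2) - ((k - 20) ** 2) / 400


best_params, all_scores = grid_search(toy_score, [0.3, 0.5, 0.8], [10, 20, 30])
print(best_params)
```

An exhaustive grid is simple to reason about, but its cost grows with the product of the grid sizes, so coarse grids are a sensible starting point.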

snf.datasets - Load test datasets

Functions for loading test datasets.

load_simdata() Loads “similarity” data with two datatypes
load_digits() Loads “digits” dataset with four datatypes