snf.compute.snf

snf.compute.snf(*aff, K=20, t=20, alpha=1.0)

Performs Similarity Network Fusion (SNF) on the supplied aff matrices

Parameters:
  • *aff ((N, N) array_like) – Input similarity arrays; all arrays should be square and of equal size.
  • K ((0, N) int, optional) – Number of nearest neighbors used when building the sparse local similarity matrices (the KNN step described in Notes). Default: 20
  • t (int, optional) – Number of iterations to perform information swapping. Default: 20
  • alpha ((0, 1) float, optional) – Hyperparameter normalization factor for scaling. Default: 1.0
Returns:

W – Fused similarity network of input arrays

Return type:

(N, N) np.ndarray

Notes

In order to fuse the supplied \(m\) arrays, each must be normalized. A traditional normalization on an affinity matrix would suffer from numerical instabilities due to the self-similarity along the diagonal; thus, a modified normalization is used:

\[\begin{split}\mathbf{P}(i,j) = \left\{\begin{array}{rr} \frac{\mathbf{W}(i,j)} {2 \sum_{k\neq i} \mathbf{W}(i,k)} ,& j \neq i \\ 1/2 ,& j = i \end{array}\right.\end{split}\]
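The normalization can be checked numerically. The following is a minimal NumPy sketch of the formula only, not the library's internal code; the helper name `normalize_full` is hypothetical, and it assumes every row has at least one nonzero off-diagonal entry.

```python
import numpy as np


def normalize_full(W):
    """Hypothetical helper: P(i,i) = 1/2, off-diagonal entries of each row sum to 1/2."""
    W = np.asarray(W, dtype=float)
    off_diag = W.copy()
    np.fill_diagonal(off_diag, 0.0)  # drop self-similarity from the row sums
    # Assumption: no row is all zeros off the diagonal (otherwise this divides by zero).
    row_sums = off_diag.sum(axis=1, keepdims=True)
    P = off_diag / (2.0 * row_sums)
    np.fill_diagonal(P, 0.5)
    return P


# Small illustrative affinity matrix (made-up values).
W = np.array([[1.0, 0.8, 0.2],
              [0.8, 1.0, 0.4],
              [0.2, 0.4, 1.0]])
P = normalize_full(W)
```

Each row of `P` sums to 1, with half of the mass fixed on the diagonal, so the magnitude of the self-similarity no longer influences the scaling.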

Under the assumption that local similarities are more important than distant ones, a more sparse weight matrix is calculated based on a KNN framework:

\[\begin{split}\mathbf{S}(i,j) = \left\{\begin{array}{rr} \frac{\mathbf{W}(i,j)} {\sum_{k\in N_{i}}\mathbf{W}(i,k)} ,& j \in N_{i} \\ 0 ,& \text{otherwise} \end{array}\right.\end{split}\]
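A sketch of this sparse matrix in NumPy may help. The helper name `knn_kernel` is hypothetical, and the sketch assumes that \(N_{i}\) is the set of the K largest entries in row i, which will normally include i itself when the diagonal holds the largest similarity. The library may select neighbors differently.

```python
import numpy as np


def knn_kernel(W, K):
    """Hypothetical helper: keep the K largest similarities per row, then row-normalize."""
    W = np.asarray(W, dtype=float)
    n = W.shape[0]
    # Column indices of the K largest entries in each row (assumed definition of N_i).
    neighbors = np.argsort(W, axis=1)[:, -K:]
    rows = np.arange(n)[:, None]
    S = np.zeros_like(W)
    S[rows, neighbors] = W[rows, neighbors]
    S /= S.sum(axis=1, keepdims=True)  # each row now sums to 1 over its neighbors
    return S


# Small illustrative affinity matrix (made-up values).
W = np.array([[1.0, 0.8, 0.2],
              [0.8, 1.0, 0.4],
              [0.2, 0.4, 1.0]])
S = knn_kernel(W, K=2)
```

With `K=2`, the weakest link in each row is zeroed and the remaining weights are rescaled, so `S` carries only local neighborhood structure.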

The two weight matrices \(\mathbf{P}\) and \(\mathbf{S}\) thus provide information about a given patient’s similarity to all other patients and the K most similar patients, respectively.

These \(m\) matrices are then iteratively fused. At each iteration, the matrices are made more similar to each other via:

\[\mathbf{P}^{(v)} = \mathbf{S}^{(v)} \times \frac{\sum_{k\neq v}\mathbf{P}^{(k)}}{m-1} \times (\mathbf{S}^{(v)})^{T}, \quad v = 1, 2, \ldots, m\]

After each iteration, the resultant matrices are renormalized using the modified normalization for \(\mathbf{P}\) given above. Fusion stops after t iterations, or when the matrices \(\mathbf{P}^{(v)}, v = 1, 2, \ldots, m\) converge.
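The iteration can be sketched end to end in NumPy. This is an illustration of the equations in these notes under stated assumptions, not the implementation behind `snf.compute.snf`: the names `normalize_full`, `knn_kernel`, and `fuse` are hypothetical, neighbors are taken as the K largest entries per row, the loop always runs `t` times with no convergence check, and averaging the final \(\mathbf{P}^{(v)}\) into one output is an assumption that these notes do not state.

```python
import numpy as np


def normalize_full(W):
    """Hypothetical helper: diagonal set to 1/2, off-diagonal row mass scaled to 1/2."""
    off_diag = np.array(W, dtype=float)
    np.fill_diagonal(off_diag, 0.0)
    P = off_diag / (2.0 * off_diag.sum(axis=1, keepdims=True))
    np.fill_diagonal(P, 0.5)
    return P


def knn_kernel(W, K):
    """Hypothetical helper: keep the K largest entries per row, then row-normalize."""
    W = np.asarray(W, dtype=float)
    neighbors = np.argsort(W, axis=1)[:, -K:]
    rows = np.arange(W.shape[0])[:, None]
    S = np.zeros_like(W)
    S[rows, neighbors] = W[rows, neighbors]
    return S / S.sum(axis=1, keepdims=True)


def fuse(mats, K=3, t=20):
    """Hypothetical sketch of the fusion loop for m >= 2 affinity matrices."""
    m = len(mats)
    P = [normalize_full(W) for W in mats]
    S = [knn_kernel(W, K) for W in mats]
    for _ in range(t):
        updated = []
        for v in range(m):
            # Average of the other views' P matrices, diffused through S^(v).
            others = sum(P[k] for k in range(m) if k != v) / (m - 1)
            updated.append(normalize_full(S[v] @ others @ S[v].T))
        P = updated
    # Assumption: the fused output is the mean of the final P^(v).
    return sum(P) / m


# Two made-up symmetric affinity matrices over the same 6 samples.
rng = np.random.default_rng(0)
mats = []
for _ in range(2):
    X = rng.random((6, 6))
    W = (X + X.T) / 2.0
    np.fill_diagonal(W, 1.0)
    mats.append(W)

fused = fuse(mats, K=3, t=10)
```

Updating every \(\mathbf{P}^{(v)}\) from the previous iteration's values, rather than in place, keeps the update for each view independent of the order in which the views are processed.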

The output fused matrix is full rank and can be subjected to clustering and classification.