Preview notice. This site includes method notes, datasets, metrics, and code; results and weights are not included.

Evaluation design

Metrics

The benchmark evaluates latent representations with 20 metrics across clustering quality, projection preservation, and latent-space structure. These are the same metric families used in the manuscript.

Metric definitions are public in preview mode; aggregate result figures remain gated until journal acceptance.

Datasets: 53 (27 cancer and 26 development cohorts)

Study tracks: 11 (7 comparative, 4 robustness and efficiency)

Metrics: 20 (clustering, dimensionality reduction evaluation (DRE), and latent space evaluation (LSE) families)

Primary latent: 10D (shared dimensionality for learned methods)

Axis 01: Clustering

Partition agreement and separation against the shared Leiden reference.

Axis 02: Projection fidelity

Co-ranking measures for UMAP and t-SNE views of each learned latent space.

Axis 03: Latent structure

Intrinsic spectral and geometric diagnostics before plotting or clustering.

Definitions

Metric inventory

Each metric lists its intended direction so reviewers can distinguish optimization targets from diagnostic quantities.

Clustering quality

Agreement and separation of K-means clusters in latent space against the Leiden reference partition.

6 metrics

| Metric | Direction | Definition |
| --- | --- | --- |
| NMI | higher | Normalized mutual information for partition agreement. |
| ARI | higher | Adjusted Rand index with chance correction. |
| ASW | higher | Average silhouette width for intra- versus inter-cluster distance. |
| DAV | lower | Davies-Bouldin index; lower values indicate less cluster overlap. |
| CAL | higher | Calinski-Harabasz score for compact, well-separated clusters. |
| COR | diagnostic | Mean absolute inter-dimensional Pearson correlation in the latent space. |
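As a rough illustration of two rows in this table, the sketch below computes ARI from contingency counts and COR as the mean absolute off-diagonal Pearson correlation. The function names and the input conventions (label lists, a cells-by-dimensions array) are assumptions made for this example; the benchmark's own code is not shown here and may rely on library routines instead.

```python
from collections import Counter
from math import comb

import numpy as np


def adjusted_rand_index(labels_a, labels_b):
    """Chance-corrected pair-counting agreement between two partitions."""
    if len(labels_a) != len(labels_b):
        raise ValueError("label sequences must have the same length")
    n = len(labels_a)
    if n < 2:
        raise ValueError("need at least two items")

    # Pairs co-clustered in both partitions, in A only, and in B only.
    sum_ij = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    sum_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    sum_b = sum(comb(c, 2) for c in Counter(labels_b).values())

    expected = sum_a * sum_b / comb(n, 2)  # agreement expected by chance
    max_index = 0.5 * (sum_a + sum_b)
    if max_index == expected:
        # Degenerate case (e.g. both partitions trivial): treat as agreement.
        return 1.0
    return (sum_ij - expected) / (max_index - expected)


def mean_abs_latent_correlation(latent):
    """Mean |Pearson r| over all pairs of distinct latent dimensions."""
    latent = np.asarray(latent, dtype=float)
    corr = np.corrcoef(latent, rowvar=False)  # columns are latent dimensions
    off_diagonal = corr[~np.eye(corr.shape[0], dtype=bool)]
    return float(np.mean(np.abs(off_diagonal)))
```

In this sketch, ARI is invariant to how cluster labels are numbered, and COR returns NaN if any latent dimension is constant, so collapsed dimensions would need separate handling.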

Dimensionality reduction evaluation

Co-ranking evaluation of how UMAP and t-SNE projections preserve neighborhoods from the learned latent space.

8 metrics

| Metric | Direction | Definition |
| --- | --- | --- |
| UMAP distance correlation | higher | Rank-distance agreement for UMAP projections. |
| UMAP Q_local | higher | Local nearest-neighbor preservation at k = 15. |
| UMAP Q_global | higher | Global structure preservation in the projection. |
| UMAP overall | higher | Combined local and global UMAP quality. |
| t-SNE distance correlation | higher | Rank-distance agreement for t-SNE projections. |
| t-SNE Q_local | higher | Local nearest-neighbor preservation at k = 15. |
| t-SNE Q_global | higher | Global structure preservation in the projection. |
| t-SNE overall | higher | Combined local and global t-SNE quality. |
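One common way to read "local nearest-neighbor preservation at k = 15" is the co-ranking quantity Q_NX(k): the average fraction of each point's k nearest neighbors in the latent space that are also among its k nearest neighbors in the 2-D projection. The sketch below implements that reading with dense Euclidean distances. The function names are hypothetical, and the benchmark's exact Q_local (for example, whether it averages over a range of k) is an assumption not confirmed by this page.

```python
import numpy as np


def knn_indices(points, k):
    """Indices of the k nearest neighbors of each row, excluding the row itself."""
    points = np.asarray(points, dtype=float)
    # Dense pairwise Euclidean distances: O(n^2) memory, fine for a sketch.
    diffs = points[:, None, :] - points[None, :, :]
    dist = np.linalg.norm(diffs, axis=-1)
    np.fill_diagonal(dist, np.inf)  # a point is not its own neighbor
    return np.argsort(dist, axis=1)[:, :k]


def neighborhood_preservation(latent, projection, k=15):
    """Mean fraction of shared k-nearest neighbors between two embeddings."""
    latent = np.asarray(latent, dtype=float)
    projection = np.asarray(projection, dtype=float)
    n = latent.shape[0]
    if projection.shape[0] != n:
        raise ValueError("latent and projection must have the same number of rows")
    if not 1 <= k < n:
        raise ValueError("k must satisfy 1 <= k < number of points")

    nn_latent = knn_indices(latent, k)
    nn_projection = knn_indices(projection, k)
    shared = [
        len(set(a.tolist()) & set(b.tolist()))
        for a, b in zip(nn_latent, nn_projection)
    ]
    return float(np.mean(shared) / k)
```

A score of 1.0 means every k-neighborhood is preserved exactly. For large cohorts, a tree-based or approximate neighbor search would replace the dense distance matrix.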

Latent space evaluation

Intrinsic spectral and geometric diagnostics for the latent representation before 2-D plotting.

6 metrics

| Metric | Direction | Definition |
| --- | --- | --- |
| Manifold dimensionality | diagnostic | Intrinsic dimension estimate from the PCA eigenvalue spectrum. |
| Spectral decay | diagnostic | Slope of the sorted eigenvalue curve. |
| Participation ratio | higher | Effective number of active latent dimensions. |
| Anisotropy | diagnostic | Directional spread uniformity across the latent axes. |
| Noise resilience | higher | Embedding stability under Gaussian perturbation. |
| LSE overall | higher | Composite of normalized latent-space diagnostic scores. |
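The participation ratio is conventionally defined from the eigenvalues λ_i of the latent covariance matrix as (Σ λ_i)² / Σ λ_i². It equals the number of dimensions when variance is spread evenly and approaches 1 when a single direction dominates. The sketch below uses that standard definition. Whether the benchmark normalizes or rescales the value is not stated on this page, so treat the function as an assumption-laden illustration.

```python
import numpy as np


def participation_ratio(latent):
    """Effective number of active dimensions from the covariance spectrum."""
    latent = np.asarray(latent, dtype=float)
    # Rows are observations, columns are latent dimensions.
    cov = np.atleast_2d(np.cov(latent, rowvar=False))
    # Clip tiny negative eigenvalues that arise from floating-point error.
    eigenvalues = np.clip(np.linalg.eigvalsh(cov), 0.0, None)
    sum_of_squares = np.sum(eigenvalues ** 2)
    if sum_of_squares == 0.0:
        raise ValueError("latent representation has zero variance")
    return float(eigenvalues.sum() ** 2 / sum_of_squares)
```

For a 10D latent, values near 10 would indicate that all dimensions carry comparable variance, while values near 1 would indicate collapse onto a single axis.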

Statistical testing

  • Two-sided Wilcoxon signed-rank tests compare paired method outputs across the same datasets.
  • Benjamini-Hochberg FDR correction is applied within each results table at q = 0.05.
  • All comparisons use the same preprocessing pipeline and a fixed random seed where subsampling is needed.
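The Benjamini-Hochberg step can be sketched in a few lines. Given the p-values from one results table, the procedure sorts them, finds the largest rank k whose p-value is at most (k / m) · q, and rejects every hypothesis up to that rank. The function name is hypothetical and the p-values in the usage note are made up. In practice the inputs would come from the paired Wilcoxon tests (for instance via `scipy.stats.wilcoxon`, an assumption about tooling that this page does not confirm).

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return one reject/keep flag per p-value, controlling FDR at level q."""
    m = len(p_values)
    # Indices of the p-values in ascending order.
    order = sorted(range(m), key=lambda i: p_values[i])

    # Step-up rule: largest rank whose p-value is under its threshold.
    cutoff = 0
    for rank, index in enumerate(order, start=1):
        if p_values[index] <= q * rank / m:
            cutoff = rank

    # Reject everything ranked at or below the cutoff, in original order.
    rejected = [False] * m
    for rank, index in enumerate(order, start=1):
        if rank <= cutoff:
            rejected[index] = True
    return rejected
```

With the made-up input `[0.03, 0.03, 0.03, 0.04]`, all four hypotheses are rejected even though the smallest p-value misses its own threshold, which is the step-up behavior that distinguishes this procedure from a per-test cutoff.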
