Preview notice. This site includes method notes, datasets, metrics, and code; results and weights are not included.

Architecture

Method

GAHIB fits a variational autoencoder on single-cell RNA-seq counts. The encoder can be a multilayer perceptron, a Transformer, or a graph neural network, and the latent representation is shaped by three losses that together produce a hierarchy-aware embedding.

Encoder options: MLP, Transformer, GAT, GCN, GraphSAGE, ChebConv, TAG, GraphTransformer, ARMA.
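For orientation, the graph encoder names map naturally onto PyTorch Geometric convolution layers. The registry below is a hypothetical sketch of such a mapping, not GAHIB's actual factory code.

```python
# Hypothetical encoder registry; the mapping from GAHIB's encoder names to
# PyTorch Geometric convolutions is an assumption for illustration.
from torch_geometric.nn import (
    ARMAConv, ChebConv, GATConv, GCNConv, SAGEConv, TAGConv, TransformerConv,
)

ENCODER_LAYERS = {
    "GAT": lambda d_in, d_out: GATConv(d_in, d_out),
    "GCN": lambda d_in, d_out: GCNConv(d_in, d_out),
    "GraphSAGE": lambda d_in, d_out: SAGEConv(d_in, d_out),
    "ChebConv": lambda d_in, d_out: ChebConv(d_in, d_out, K=3),  # K is illustrative
    "TAG": lambda d_in, d_out: TAGConv(d_in, d_out),
    "GraphTransformer": lambda d_in, d_out: TransformerConv(d_in, d_out),
    "ARMA": lambda d_in, d_out: ARMAConv(d_in, d_out),
}
```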

Detailed benchmark figures remain gated; architecture and objective details are safe preview content.

Mechanism summary

Graph-attention VAE with bottleneck geometry

This first screen is text-first so that the companion-site overview can present the route's content without repeating manuscript figures.

Encoder (graph-aware counts). A graph or sequence encoder maps single-cell counts into a shared latent representation.

Bottleneck (compressed manifold). A low-dimensional coordinate is trained directly instead of relying only on post-hoc projection.

Geometry (Lorentz hierarchy). The hyperbolic objective encourages radial hierarchy and angular lineage separation.

Objective

The objective couples three losses, combining count reconstruction, bottleneck reconstruction, and Lorentz geometry rather than treating visualization as a post-hoc projection.

Reconstruction. A count likelihood (negative binomial, zero-inflated negative binomial, Poisson, or zero-inflated Poisson) fits the raw counts.
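As a concrete instance, here is a minimal sketch of the negative-binomial case in PyTorch, assuming a mean/inverse-dispersion (scVI-style) parameterization; whether GAHIB parameterizes it this way is an assumption.

```python
import torch

def nb_nll(x, mu, theta, eps=1e-8):
    """Negative log-likelihood of counts x under a negative binomial with
    mean `mu` and inverse dispersion `theta` (scVI-style parameterization;
    GAHIB's exact parameterization is an assumption here)."""
    log_theta_mu = torch.log(theta + mu + eps)
    log_lik = (
        theta * (torch.log(theta + eps) - log_theta_mu)  # theta*log(theta/(theta+mu))
        + x * (torch.log(mu + eps) - log_theta_mu)       # x*log(mu/(theta+mu))
        + torch.lgamma(x + theta)
        - torch.lgamma(theta)
        - torch.lgamma(x + 1.0)
    )
    return -log_lik.sum(dim=-1).mean()
```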

Information bottleneck. A second decoder reconstructs the input from a 2-D manifold coordinate, compressing the latent into a low-dimensional representation suitable for visualization and trajectory analysis.
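A hypothetical shape for that second decoder, with illustrative layer widths and activations (not GAHIB's exact configuration):

```python
import torch.nn as nn

class BottleneckDecoder(nn.Module):
    """Decodes expression from the 2-D manifold coordinate; layer widths
    and activations here are illustrative assumptions."""
    def __init__(self, n_genes: int, hidden: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(2, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, n_genes), nn.Softplus(),  # non-negative rates
        )

    def forward(self, z2d):
        return self.net(z2d)
```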

Lorentz hyperbolic loss. Anchors the manifold coordinate on the hyperboloid model of hyperbolic space. Radial distance from the origin encodes hierarchy: cells closer to the origin sit higher in the developmental tree, and lineage divergence maps to angular separation.
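Taken together, the total objective plausibly reduces to a weighted sum like the sketch below; the weight names and the explicit VAE KL term are assumptions, not GAHIB's documented interface.

```python
def gahib_loss(count_nll, bottleneck_recon, lorentz_loss, kl_divergence,
               w_count=1.0, w_bottleneck=1.0, w_lorentz=1.0, w_kl=1.0):
    """Weighted sum of the three coupled losses plus the standard VAE KL
    regularizer. Weight names and the KL term are illustrative assumptions."""
    return (w_count * count_nll              # NB/ZINB/Poisson/ZIP on raw counts
            + w_bottleneck * bottleneck_recon  # reconstruction from 2-D coordinate
            + w_lorentz * lorentz_loss       # hyperbolic geometry on the coordinate
            + w_kl * kl_divergence)          # standard VAE term (assumed)
```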

Lorentz distance

The Lorentz distance between two points on the hyperboloid is

d_\mathcal{L}(\mathbf{u}, \mathbf{v}) = \operatorname{arcosh}\!\left(-\langle \mathbf{u}, \mathbf{v}\rangle_\mathcal{L}\right)

where the Minkowski inner product is

\langle \mathbf{u}, \mathbf{v}\rangle_\mathcal{L} = -u_0 v_0 + \sum_{i=1}^{d} u_i v_i.
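Both quantities translate directly into code. The hyperboloid lift shown for the 2-D coordinate is the standard x0 = sqrt(1 + ||x||^2) embedding; whether GAHIB maps the bottleneck coordinate exactly this way is an assumption.

```python
import torch

def minkowski_inner(u, v):
    """Minkowski inner product: <u, v>_L = -u0*v0 + sum_i ui*vi."""
    return -u[..., 0] * v[..., 0] + (u[..., 1:] * v[..., 1:]).sum(dim=-1)

def lorentz_distance(u, v, eps=1e-7):
    """Geodesic distance on the hyperboloid: arcosh(-<u, v>_L).
    The clamp guards acosh against arguments just below 1 from rounding."""
    return torch.acosh(torch.clamp(-minkowski_inner(u, v), min=1.0 + eps))

def lift_to_hyperboloid(x):
    """Lift a Euclidean coordinate x onto the unit hyperboloid by setting
    x0 = sqrt(1 + ||x||^2); a standard embedding, assumed here."""
    x0 = torch.sqrt(1.0 + (x * x).sum(dim=-1, keepdim=True))
    return torch.cat([x0, x], dim=-1)
```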

Reference diagrams

Manuscript figures

These diagrams remain available for site readers below the method summary; they are excluded from the standalone site-overview composition.
GAHIB architecture (figure): graph-attention encoder, information bottleneck, hyperbolic loss. The encoder produces a high-dimensional latent; an information bottleneck compresses it to a 2-D manifold coordinate, and a Lorentz hyperbolic loss anchors that coordinate on the hyperboloid.

GAHIB overview (figure): end-to-end pipeline from scRNA-seq counts to encoder to bottleneck to hyperbolic embedding.

Important constraint

The hyperbolic loss is degenerate when the bottleneck coordinate is untrained, so the implementation enforces this dependency whenever the bottleneck reconstruction weight is set to zero.
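A hypothetical version of that guard (the actual check in the source may differ):

```python
def validate_loss_weights(w_bottleneck: float, w_lorentz: float) -> None:
    """Reject configurations where the Lorentz loss would act on an
    untrained bottleneck coordinate. Hypothetical sketch of the guard."""
    if w_lorentz > 0.0 and w_bottleneck == 0.0:
        raise ValueError(
            "The Lorentz hyperbolic loss requires a trained bottleneck "
            "coordinate; set a nonzero bottleneck reconstruction weight."
        )
```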

Optional trajectory modules

Stochastic differential equation and partial differential equation modules are provided for trajectory experiments. These are off by default and documented in the source.

Detailed quantitative comparisons against baselines, ablations, and interpretability analyses appear in the manuscript and will be published on this site upon journal acceptance.
