Preprint

Meaning on a Video Card

Scalable, Lossless Geometric Primitives

John Vaught · Synoros · ORCID 0009-0006-0179-9522

Abstract

A GPU is engineered to rasterize triangles and decode video. Exact, lossless representation of non-Euclidean structure was never one of its design goals, and geometric machine learning re-implements the same delicate primitives project by project, re-encountering their numerical pitfalls each time. holonomy_lib is a PyTorch library of geometric and spectral primitives (Riemannian manifolds, hyperbolic graph operations, spectral graph theory, discrete Ricci curvature, computational topology) built to close that gap: a trusted substrate for learning geometric structure directly, made to hold under float64, automatic differentiation, and GPU execution.

Several of its components surfaced results worth recording. A multiplicative spectral-shift factor is silently dropped from the hyperbolic heat-kernel dimension recursion under automatic differentiation. A hand-derived closed form for the ℍ⁵ heat kernel is faster and about three orders of magnitude more precise than that recursion. An arcsinh reparameterization of hyperbolic distance removes two distinct backward-NaN modes without an ε hyperparameter. A curvature-sign dispatch lets a continuous, learnable per-point curvature κ cross zero during training without re-instantiation, improving on discrete curvature gating in a controlled identifiability task. A reference-free dual-check method, computable from the object alone, certifies such primitives where no reference solution exists, and caught the dropped factor.

Two further capabilities instrument the geometry in use. The first is a spectral-dimension estimator that detects representational collapse, and the second is a content-addressable provenance layer that makes activation tracing and ablation native operations. Taken together, these results indicate that nonlinear, non-Euclidean embedding (learnable, per-point, and free to cross zero) is practically viable on commodity GPUs. Whether structure made directly available shortens the route language models take to the same geometry is the open wager the library is built to test.

Introduction

When I first heard Sutton say that large language models are bitter too, it crystallized the idea I had been circling in theory: that what a language model learns is the geometry of meaning (the shape of how things relate) and that it is made to learn this only indirectly, as an artifact of compressing how we communicate, rather than from the shape itself.

Sutton [2019] argued that across seven decades of artificial intelligence research, general methods that scale with computation (search and learning) have consistently overtaken systems built from human-encoded domain knowledge. The knowledge-engineering approach wins in the short run and is overtaken in the long run. This is the bitter lesson. In a 2025 interview [Sutton and Patel, 2025], Sutton extended the argument to current large language models, which learn by imitating a finite store of human-generated text and do not learn continually from their own experience. He expects systems that learn from experience and computation to eventually supersede them: another turn of the same lesson, with the models that depend most on human-produced data being the ones overtaken.

Language is itself a human artifact, an engineered, lossy intermediate representation of the world. Models trained on it do acquire non-linguistic structure, such as linear analogy geometry in word embeddings [Mikolov et al., 2013], the Fourier-basis representation a small transformer forms to do modular addition [Nanda et al., 2023], and the geometric addition circuit a production model uses in place of the textbook algorithm [Lindsey et al., 2025]. But that structure is acquired indirectly, reconstructed from the statistics of text, and at a cost in data and compute far above what the structure itself requires. If the bitter lesson applies one level down, the language channel is the human-encoded prior that scale will eventually route around.

The compute that powers that scale is no different. In practice scale means the GPU, and the GPU is a human-engineered artifact built to rasterize triangles and decode video. The general methods that overtake hand-built knowledge run on hardware hand-built for graphics (dense, regular arithmetic at high throughput), where exact, lossless representation of structure was never a design goal. The substrate the bitter lesson runs on is itself a domain-specific human prior. Geometric machinery that means to ride the same hardware therefore carries a concrete obligation: to stay numerically exact while fitting what a video card does well.

This lab was founded on a wager that this concept can be acted on: that geometric and algorithmic structure can be made directly available to learning, as differentiable, composable primitives a model optimizes over and through. The working hypothesis of this substrate research program is that supplying correct geometric machinery, and letting scale and search act on it directly, reaches by a shorter and less wasteful route the structure that language models acquire slowly and as a side effect. The program is pursued in a sibling project (synoros-substrate) and shares the direction of an active literature on curvature-aware representation learning [Nickel and Kiela, 2017, Gu et al., 2019, Bachmann et al., 2020, Giovanni et al., 2022, Guo et al., 2025]. A learner that operates on a geometric substrate, however, inherits every error in that substrate. The substrate has to be correct first.

The actual reason for curved geometry in representation is target-fidelity [Vaught, 2026]. The familiar version (spacetime is hyperbolic, so embeddings should be hyperbolic) is not the true justification. Modeling meaning is, with one label peeled away, modeling the world that meaning describes. A model of that world must span the regimes it presents, whereas a linear-algebra default supplies only the single flat one. Aircraft engineers studied birds without claiming aircraft are birds: they took the features that make flight work and built systems that respect them. The analogue here is a substrate that spans curvature (spherical, flat, hyperbolic, and the per-point mixtures and sign changes between them) because a faithful model of the target cannot be confined to one regime. The target is the structure of relations, learnable from text, images, or concepts alike, since what is fit is the shape of those relations, independent of the semantics that express them.

holonomy_lib is that substrate layer. It consolidates the mathematics that geometric machine learning otherwise re-implements project by project. It spans Riemannian manifolds (fixed-rank, SPD, the hyperboloid model, the κ-stereographic model, the Lorentzian (1, n−1) model, product manifolds, and a per-point-κ heterogeneous manifold); hyperbolic graph operations (the Fréchet mean [Karcher, 1977], manifold-aware inner products [Pennec, 2006], hyperbolic Laplacian eigenmaps, the ℍⁿ heat kernel); spectral graph theory (combinatorial, normalized, signed and magnetic Laplacians, diffusion maps, effective resistance, spectral dimension); discrete Ricci curvature and flow; tensor decompositions; Riemannian optimization [Absil et al., 2008]; simplicial topology and batched persistent homology; cellular sheaves; SO(3) primitives; and a content-addressable provenance layer. Everything is GPU-native and batched-first; every numerical constant is either derived from inputs, marked as a universal invariant, or cataloged with a documented procedure and scale of validity; every public primitive carries a citation to the source that defines its mathematics; and a static audit enforces the constant discipline in continuous integration.