Modeling the universe well
§04 walked through four fields that all reached for high-dimensional geometric descriptions of their subject matter. §05 argued that the convergence is not a methodological accident: the fields were forced into geometric structure by what they were studying. This section takes the deflationary step that ties those facts together. Meaning is our word for our cognitive model of the universe. A model of meaning is therefore, with one layer of human label removed, a model of the universe. The convergence in §04 is what happens when models of the universe finally start being built the way the universe is shaped, rather than the way GPU arithmetic happens to be shaped.
The argument has three parts. First, the universe has structural features (similarity-shaped, hierarchy-shaped, correspondence-shaped) that any model of it has to respect, and “meaning” is the human label for our modeling of those features. Second, a model takes the shape of its target, or at the very least is designed after it the way aircraft are designed after birds; the target underneath the label “meaning” is the universe. Third, this position has identifiable kin in the philosophy literature, and identifiable distance from neighbors that make stronger claims than this argument needs.
Features the universe has, and their geometric homes
Start with what shows up in our cognitive modeling of the world, before any theory is named.
The first feature is similarity. “Cat” and “kitten” sit nearer in sense than either does to “logarithm.” Anything that wears the title of a meaning-model has to encode that fact somewhere. The geometric home for similarity is distance [a numerical measure of how far apart two points sit in some space]. Closer points mean more similar things. A flat vector space with cosine similarity is one realization. A hyperbolic space with hyperbolic distance is another. What is forced is not the specific realization but the bare fact that any model with a notion of similarity has to land it somewhere, and “somewhere” is some notion of nearness.
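A minimal sketch of the two realizations just named, under loud assumptions: the two-dimensional coordinates below are invented by hand for illustration and come from no trained model, and the function names are hypothetical. The hyperbolic distance uses the standard Poincaré-ball formula. The point is only that both realizations land the same similarity judgment on some notion of nearness.

```python
import math

def cosine_similarity(u, v):
    """Flat-space nearness: cosine of the angle between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def poincare_distance(u, v):
    """Hyperbolic distance between two points strictly inside the unit ball."""
    sq_u = sum(a * a for a in u)
    sq_v = sum(b * b for b in v)
    sq_diff = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.acosh(1.0 + 2.0 * sq_diff / ((1.0 - sq_u) * (1.0 - sq_v)))

# Hypothetical coordinates, chosen by hand; not taken from any real embedding.
toy = {
    "cat": (0.60, 0.50),
    "kitten": (0.62, 0.46),
    "logarithm": (-0.55, 0.30),
}

# Higher cosine similarity means nearer; lower hyperbolic distance means nearer.
cos_near = cosine_similarity(toy["cat"], toy["kitten"])
cos_far = cosine_similarity(toy["cat"], toy["logarithm"])
hyp_near = poincare_distance(toy["cat"], toy["kitten"])
hyp_far = poincare_distance(toy["cat"], toy["logarithm"])

print("cosine:    ", cos_near > cos_far)
print("hyperbolic:", hyp_near < hyp_far)
```

Both comparisons agree on these toy points, which is all the sketch is meant to show. It says nothing about which realization fits real data better.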
The second is hierarchy. “Animal” contains “dog,” which contains “Labrador,” which contains the specific dog asleep in the next room. The structure is not a chain; it is a tree, because “animal” also contains “cat” with its own breeds, and “fish” with its own families, and so on. Trees grow exponentially. The number of leaves at depth \(n\) is the branching factor raised to the \(n\)th power. If your tree branches three ways at every node, depth ten gives you \(3^{10} = 59049\) leaves, about sixty thousand. The geometric home for tree-like growth is hyperbolic geometry [the curved geometry where the volume of a ball grows exponentially with its radius rather than polynomially]. Nickel and Kiela showed in 2017 that hierarchical data embeds dramatically more efficiently in hyperbolic space than in Euclidean space, because the host’s volume growth matches the data’s (Nickel and Kiela 2017). They tried to fit the WordNet hierarchy [a standard hand-curated tree of English nouns and their relationships] into a flat space and a hyperbolic one with the same number of dimensions, and the hyperbolic version won by a large margin on the reconstruction and link-prediction measures they reported. The 2025 HypLoRA paper (Yang et al. 2025) went further: it found that the token embeddings inside trained language models already contain power-law and tree-like structure, which the flat-space training has been trying to fit through the wrong-shaped doorway. Hierarchy is real out there in how the world organizes its categories, and the geometry that fits it is the geometry where exponential branching is what the volume itself does.
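The growth comparison can be checked with a few lines of arithmetic. This is a sketch under stated assumptions: a perfectly uniform tree, the two-dimensional case, and a hyperbolic plane of curvature \(-1\), where a disc of radius \(r\) has area \(2\pi(\cosh r - 1)\) against \(\pi r^2\) in the flat plane. The function names are illustrative only.

```python
import math

def leaves_at_depth(branching, depth):
    """Leaf count of a uniform tree: branching factor raised to the depth."""
    return branching ** depth

def euclidean_disc_area(r):
    """Area of a flat disc: grows polynomially (quadratically) in r."""
    return math.pi * r ** 2

def hyperbolic_disc_area(r):
    """Area of a disc in the hyperbolic plane of curvature -1: grows like e^r."""
    return 2.0 * math.pi * (math.cosh(r) - 1.0)

print(leaves_at_depth(3, 10))  # 59049

# Side-by-side room available at increasing radius.
for r in (1, 2, 5, 10):
    flat = euclidean_disc_area(r)
    curved = hyperbolic_disc_area(r)
    print(f"r={r:>2}  flat={flat:10.1f}  hyperbolic={curved:10.1f}")
```

At small radius the two areas nearly coincide; the gap opens as the radius grows, which is the regime where a deep tree needs the room.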
The third is correspondence. The relationship between “father” and “son” is mirrored in the relationship between “tree” and “branch.” When we recognize an analogy, we are recognizing that two structures share their connectivity even though their substances differ. The geometric home for cross-domain pattern-matching is topology [the branch of geometry that asks what features of a shape survive when the shape is bent or stretched without being torn]. Topological invariants are exactly the features that ride through transformations of the underlying material. Carlsson’s framework for topological data analysis (Carlsson 2009) and the persistent-homology results from neuroscience (Giusti et al. 2015; Reimann et al. 2017) are versions of the same move: track the shape that survives rotation, rescaling, or relabeling, and you have tracked the part of the structure that is portable across instances. Persistent homology is the technique that makes this concrete. Given a cloud of points, it asks which loops, voids, and higher-dimensional holes persist as you smoothly inflate the points into one connected blob, and the persistent ones are the genuine features rather than the noise.
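A minimal sketch of the simplest case of persistence, covering connected components only (dimension zero). Loops, voids, and higher-dimensional holes need a full persistent-homology library and are not attempted here. The assumptions: every point is born at scale zero, and two components die into one at the pairwise distance where an edge first joins them. The point cloud is invented for illustration and the function name is hypothetical.

```python
import math
from itertools import combinations

def h0_death_scales(points):
    """Scales at which connected components merge as the linking distance grows."""
    parent = list(range(len(points)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # All pairwise edges, shortest first.
    edges = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i, j in combinations(range(len(points)), 2)
    )
    deaths = []
    for length, i, j in edges:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:
            parent[root_i] = root_j   # two components become one
            deaths.append(length)     # one component dies at this scale
    return deaths

# Hypothetical cloud: two tight clusters far apart.
cloud = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
         (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]

deaths = h0_death_scales(cloud)
print([round(d, 3) for d in deaths])
```

The short-lived merges are the within-cluster noise. The single long-lived one records that the cloud has two components, and it survives any rotation or relabeling of the points because only pairwise distances enter the computation.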
Three features. Three geometric homes. I have not picked convenient mappings. Any model of the universe that captures human-scale features runs into similarity, hierarchy, and correspondence in turn, and each has its own geometric home regardless of substrate. This is the part of the argument I call forced in the same sense the special-relativity geometry was forced in §02. You can ignore one of the features and end up with a poor model, and many models do exactly this. But the features of the target do not stop being there because the model ignores them. They show up in the failure modes of whichever model you built. A flat-space model of a tree-shaped concept eventually crowds together the leaves it should be separating, because the volume in a flat space does not grow fast enough to give them room. A model with no topology eventually fails to recognize an analogy that would be obvious to a child. The geometry is not a stylistic choice. It is what fits or fails to fit the shape of the target the model is aimed at.
After the target, not by analogy
The literature is full of a careless version of this move, and I need to walk past it before I can state the careful one.
The careless version runs: the universe is non-Euclidean, therefore meaning embeddings should be non-Euclidean. That is an analogy. Object A has property X, so the model of A should have property X, because A and its model resemble each other. The reasoning is bad. The non-Euclidean structure of spacetime arose from particular physical pressures: the invariance of the speed of light, the curvature induced by mass, the gauge structure of the standard model. Those pressures do not transfer to language modeling. A reader who left §02 thinking “spacetime is hyperbolic, so embeddings should be hyperbolic” was reading the section wrong, and I am not endorsing that move.
The argument I am making is target-fidelity.
What we are actually modeling, once “meaning” is unpacked as a label for our cognitive activity, is the universe. Cognition is something structural beings do (brains, ecosystems of brains, machines built by brains) inside a structural universe, and the values that activity produces are what we call meaning. So a model of meaning is, with the human label peeled away, a model of the universe; and the question of what shape that model should take is the question of what shape its target has. The universe has multiple geometric regimes: locally Euclidean, hyperbolic where exponential branching is the natural growth, Hilbert-space-like where degrees of freedom multiply combinatorially, topological in whatever survives the various transformations the universe runs over its contents. A model that respects only one regime is, by construction, only modeling one regime of the target. Aircraft designers studied birds without claiming aircraft are birds: they learned which features made flight possible and built systems that respected those features. The same move applies here. Build models that respect the regimes the universe presents, not by analogy with the universe but by design after the actual target.
The distinction changes what evidence counts. An analogy argument can be defeated by pointing out the analogy is shallow: the model of A does not need to share property X with A just because we noticed a resemblance. A target-fidelity argument cannot be defeated that way, because it is not an inference from a target’s properties to its model’s. It is a statement about what counts as a good model of a target. If the target has multiple geometric regimes, a model that respects only one is, by construction, modeling only one regime of the target. The four-program convergence in §04 is then not a coincidence and not a methodological artifact. It is what target-fidelity looks like from the inside, observed in four different examples: four programs realizing, despite themselves, that the target has more structure than the flat-space tooling can carry. Our idea of meaning is just our word for the values we use to model the universe.
This is the load-bearing move of the section. For the engineering argument the paper is making, what matters is target-fidelity between our models and the universe they are modeling, regardless of which metaphysical reading the reader prefers. The structural picture sits one level above the metaphysical disputes. A reader committed to mathematical-universe-hypothesis maximalism, an ontic structural realist, a sophisticated substantivalist about spacetime, and a relationalist can each take the target-fidelity claim home with their own gloss on what “the universe” amounts to. The §04 convergence is the data. The target-fidelity framing is a way of reading what the data is showing, and it is compatible with several deeper readings of what existence is.
The features-to-homes argument is forced by the features themselves. The §04 convergence is the empirical fact. Target-fidelity is the philosophical proposal that ties the two together, and it is the part a reader is most welcome to push back on. A reader who declines target-fidelity can still take much of the paper home; the §07 architecture sketch rests on the geometric features being real, not on any particular reading of why. But the section’s strong claims are that “meaning” is our word for our model of the universe, and that good models take their target’s shape, and these are claims, not derivations.
Distance from Tegmark, position relative to structural realism
The position has neighbors in the philosophy literature.
The more distant neighbor, and the one a careful reader is most likely to confuse this position with, is Max Tegmark’s mathematical universe hypothesis, often abbreviated MUH [the claim that every consistent mathematical structure is realized as a physical universe; ours is one such structure among an enormous ensemble] (Tegmark 2008). Tegmark’s argument runs roughly like this: the only way to take seriously a physical reality independent of human description is to identify physical reality with mathematical structure. Once you do that, there is no principled reason to grant physical existence to one consistent structure (ours) and withhold it from all the others. So all of them exist. Our universe is one structure in an ensemble whose size is the size of “all consistent mathematical structures.”
The position here does not make that claim. I am asserting that this universe has whatever structure it has, and that our cognitive models of the universe (which we label “meaning”) share that structure because they are aimed at it. I am not asserting that every other consistent mathematical structure exists physically. The narrower claim does not require the broader one. The §04 convergence and the target-fidelity move are equally compatible with “this universe is structural and the only universe” and with “one structural universe among many.” This paper picks neither. It is about modeling, not about the cardinality of existence.
This position also sidesteps the standard objection to Tegmark. George Ellis’s 2009 critique (Ellis 2009) is the sharpest version of the worry. Ellis grants that mathematics is unreasonably effective in describing the physical world; that part is not in dispute. What he argues is that Tegmark conflates two different domains in identifying them. The first is the observable world that physics is responsible to: the world you can probe with apparatus, measure with clocks and rulers, and check predictions against. The second is the Platonic mathematical world that mathematics is responsible to: the world of consistent structures, theorems, and proofs. Physics, on Ellis’s reading, is the discipline that holds mathematical models accountable to observation. To identify physical reality with mathematical structure, Tegmark has to throw out the observational content of physical theories as “baggage” around a mathematical core. Ellis argues this is incoherent: the external-reality claim was supposed to be about the observable world in the first place, and a structure-only ontology has nothing to observe with and nothing to be observed.
Ellis’s critique lands on Tegmark and does not transfer here, because the structure-versus-observation split Ellis assumes is exactly what the target-fidelity view denies. Observation is not extra-structural content laid on top of a mathematical skeleton. Observation is itself structural. It is the phenomenology of a constrained slice observer, which §03 set up directly. When a physicist looks at an apparatus, the looking is a structural process inside the universe. The data the apparatus produces is structural. The match between mathematical model and observed measurement is a match between two structures, both inside the host. There is no separate observation-domain on one side and a separate structure-domain on the other; there is a single structural reality, and observation is a process inside it. If a reader finds that move unconvincing, the section’s strongest claim degrades to “structural realism applied to model-building for the universe,” which is still defensible. It is just less ambitious than the version with observation folded in.
The closer neighbors are the structural realists. John Worrall’s 1989 paper (Worrall 1989) introduced the move. Worrall noticed a striking pattern in the history of physics: when one theory replaces another, the entities the old theory posited often get discarded outright. Caloric (the imagined fluid that was supposed to carry heat from one body to another) is gone. Phlogiston (the imagined substance released when things burned) is gone. The aether of §02 (the imagined medium that light was supposed to wave through) is gone. But the structural relationships those theories captured between observable quantities tend to survive into the new theory in a modified form. Maxwell’s equations were originally written down for an aether-based theory of electromagnetism. The aether got discarded in 1905, but the equations themselves survived almost untouched into relativistic field theory. Worrall’s diagnosis: what science tracks across theory change is structure, not the underlying entities. Theory change preserves the structural skeleton while replacing the metaphysical clothing.
James Ladyman and Don Ross sharpened Worrall’s move into ontic structural realism [the claim that structure is not merely what science best tracks but what fundamentally exists; the underlying entities are either nothing over and above the structural relations, or nothing at all] in their 2007 book Every Thing Must Go (Ladyman and Ross 2007). Their argument runs through philosophy of physics. Quantum mechanics, on their reading, gives us particles whose individuality is ambiguous to the point of being meaningless: two electrons in a singlet state cannot be told apart even in principle, and the formalism behaves as if there is no fact of the matter about which is which. General relativity gives us a spacetime whose points have no identity independent of the relations the metric imposes on them. Both theories, they argue, push toward the conclusion that relations are more fundamental than any objects standing in them. The position this paper takes can fairly be described as ontic structural realism applied to model-building for the universe, sitting at the more aggressive end of that tradition.
The most recent kin is Colin Hamlin’s 2026 paper in Synthese (Hamlin 2026). Hamlin develops what he calls a Universal Theory of Structure, abbreviated UTS, by holding ontic structural realism and Tegmark’s MUH together at once. Both views, taken seriously, run into what Hamlin calls the collapse problem: the worry that they erase the distinction between physical structures and merely mathematical structures, leaving no principled reason to call one of them “real” and the other “abstract.” Hamlin’s move is to say the collapse is not a bug but the feature, and that the collapsed view, where physical and mathematical structures are not different kinds of thing at all, is the correct one. I am not following him into that move. My position remains narrower: this universe has structure, our cognitive models of it (which we call meaning) share that structure because they are aimed at it, and the cardinality of the rest of existence is left open. But Hamlin is the closest published kin to this section’s philosophical core, and a reader who wants the maximally ambitious version of structure-as-existence has a developed account of it in his work.
When is a model wrong?
The standard worry, against any view that places “meaning” in the modeling activity rather than in some abstract realm, is that correctness dissolves. If a cognizer’s [the thing doing cognition] internal model is just a structure, and “meaning” is just our word for what the model produces, then any internal structure is meaning of something, and the cognizer is incapable of being wrong. This is the “anything goes” objection.
The objection has a clean answer once target-fidelity is the framing.
A cognizer is itself a structure inside the host. Its internal model is another structure, the representational machinery considered as a system of relations. The cognizer’s model is correct about a target to the extent that the structure of the model matches the structure of the target. Correctness is structure-to-structure fit. It is not arbitrary; the host’s relations either are or are not preserved by the model.
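One way to make structure-to-structure fit concrete, offered as a toy sketch and not as the paper’s formal definition: treat target and model as sets of ordered pairs under a single relation, fix a mapping from target elements to model elements, and count how many target relations still hold after mapping. Every name below (`preserved_fraction`, the toy “contains” relation, the single-letter model labels) is an assumption introduced for illustration.

```python
def preserved_fraction(target_relations, model_relations, mapping):
    """Share of target relations whose images under `mapping` hold in the model."""
    if not target_relations:
        raise ValueError("target has no relations to preserve")
    kept = sum(
        1
        for a, b in target_relations
        if (mapping[a], mapping[b]) in model_relations
    )
    return kept / len(target_relations)

# Hypothetical target: a fragment of a "contains" hierarchy.
target = {("animal", "dog"), ("animal", "cat"), ("dog", "labrador")}

# Hypothetical mapping from target elements to the model's internal tokens.
mapping = {"animal": "A", "dog": "D", "cat": "C", "labrador": "L"}

# Two hypothetical models of the same target.
model_good = {("A", "D"), ("A", "C"), ("D", "L")}
model_poor = {("A", "D"), ("C", "L")}

print(preserved_fraction(target, model_good, mapping))  # 1.0
print(preserved_fraction(target, model_poor, mapping))
```

The score is not arbitrary: each target relation either is or is not preserved. The sketch checks one direction only, so it does not penalize a model for asserting relations the target lacks, such as the pair `("C", "L")` in `model_poor`; a fuller measure would check both directions.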
What is configuration-dependent is which target the cognizer is modeling and which structural features matter for its purposes. A bee modeling nectar sources cares about the spatial structure of flower distributions and not about bee parliaments, because there are no bee parliaments. A human modeling a conversation cares about social-relational structure (who said what to whom, what was implied, how the listener took it) and not about the quantum-mechanical phase of every electron in the room. Different cognizers, configured differently by genetics, environment, and training, fit different aspects of the host. They get different “correct” models because they are modeling different sub-structures of the same target, not because correctness has dissolved into preference.
In the configuration-independent limit, where a cognizer’s purposes shrink to nothing in particular and the question is whether the model fits the host as such, there is a single target: the structure of the universe itself. A model is substrate-correct in this limit when it preserves the structural relations of the host. Science is the long-running project of approaching this limit asymptotically. We do not get there; the match is always partial, and there is no view from nowhere we could check the match against. But “more substrate-correct” is a meaningful comparative even when the asymptote is unreachable.
What this gives back is the resolution of the “anything goes” worry. Different configurations of cognizer give rise to different “correct” models, each correct relative to its configuration. No model is correct in isolation from a configuration, and no model is wrong simply because some other model exists. But there is a structural-realist asymptote all of them are approaching, and approach is comparable. The view does not collapse into relativism. It anchors at the substrate.
The reach of the claim
Here is what the paper does claim. The universe has structural features (similarity-shaped, hierarchy-shaped, correspondence-shaped) with natural geometric homes (distance, hyperbolic geometry, topology). “Meaning” is the human label for our cognitive modeling of those features. A model of meaning is therefore a model of the universe with one layer of label peeled away, and good models take their target’s shape, or at least are designed after it, in the way aircraft are designed after birds. Observation is itself structural. Correctness is structure-to-structure fit, with substrate-correct as the configuration-independent limit.
Here is what the paper does not claim. That every consistent mathematical structure exists physically. That ours is one structure among many. Substantivalism or relationalism about spacetime. Any specific reading of quantum mechanics. The existence of a view from nowhere that would let us check our models against reality directly. The position is consistent with several resolutions of each. The structural picture sits one level above those metaphysical disputes, and that is by design.
The paper is doing one job: arguing that what we call models of meaning should be built after the structure of the universe they are aimed at, rather than after the linear-algebra defaults that fit a flat geometry on a GPU. That argument does not depend on settling what existence is. It depends on the four-program convergence being real, the structural features of the universe being real, and the target-fidelity framing being a more honest reading than analogy.
The next section turns from the philosophical core to the engineering implication. If meaning shares the host’s regimes, the architecture for modeling meaning should span those regimes rather than collapse them into a single tractable space. That is where the paper goes from “what is the picture” to “what is the build.”