Synoros research

pes

Plastic Embedding Substrate

A typed-graph code retriever. Observer attention deposits mass; the distance field deforms; subsequent queries rank by what has been attended to.

Hit@5 over BM25: +10pp

Router vs RAG, 20-seed: +8.60pp ± 6.19pp

Agent cost vs bare: −31.6%

License: MIT + Commons Clause

Apply for access

Confidential · access by request

What ships

  • Typed-edge code graph parsed via tree-sitter + jedi. Five relations across file, class, function, and method nodes.
  • Substrate retrieval. Effective-resistance distance from a focus set, not embedding cosine. Observer attention deposits mass that deforms the distance field; the substrate is the working index, not a stale separate one.
  • Two substrate flavors. L⁺ scalar (production default) and per-concept (R, K) signed tensor (research-side, variable-rank under nuclear-norm regularization).
  • Compositional router. Per-task contextual bandit over (channels × depth × body-inject × RAG). k-NN fallback against historically-seen actions on similar tasks.
  • Claude Code hooks. user_prompt_submit injects substrate-ranked context before the model's turn; post_tool_use deposits observer mass on read / grep / edit.
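The two hooks in the last bullet can be pictured with a small sketch. It assumes the hook receives a JSON-like payload with `tool_name` and `tool_input` fields, and that file-oriented tools carry a `file_path` or `path` entry; those field names, the `MASS_BY_TOOL` values, and both function names are assumptions for illustration, not the shipped hook code. Ranking by raw deposited mass stands in for the substrate ranking.

```python
# Hypothetical per-tool mass; the source does not publish actual values.
MASS_BY_TOOL = {"Read": 1.0, "Grep": 0.5, "Edit": 2.0}


def post_tool_use(payload: dict, mass: dict) -> dict:
    """Deposit observer mass on the file a tool call touched."""
    tool = payload.get("tool_name")
    tool_input = payload.get("tool_input", {})
    # Assumed field names: file tools use "file_path", search tools use "path".
    path = tool_input.get("file_path") or tool_input.get("path")
    if tool in MASS_BY_TOOL and path:
        mass[path] = mass.get(path, 0.0) + MASS_BY_TOOL[tool]
    return mass


def render_context(mass: dict, top_n: int = 3) -> str:
    """Text a prompt-submit hook could inject: most-attended files first."""
    ranked = sorted(mass, key=mass.get, reverse=True)[:top_n]
    return "Attended files (most mass first):\n" + "\n".join(ranked)


mass = {}
events = [
    {"tool_name": "Read", "tool_input": {"file_path": "requests/models.py"}},
    {"tool_name": "Grep", "tool_input": {"pattern": "send", "path": "requests/sessions.py"}},
    {"tool_name": "Edit", "tool_input": {"file_path": "requests/models.py"}},
]
for event in events:
    post_tool_use(event, mass)

context = render_context(mass)
```

In this toy run the file that was both read and edited accumulates the most mass and is listed first in the injected context.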

Scalar substrate

Effective-resistance retrieval

Effective resistance is the spectral distance between two nodes on a graph. Close means many parallel paths, not just one short one. The substrate ranks every node by its distance from a focus set.
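As a minimal sketch of the idea, effective resistance can be read off the pseudoinverse of the graph Laplacian: R(i, j) = L⁺[i, i] + L⁺[j, j] − 2·L⁺[i, j]. The toy graph, the unit edge weights, and the choice to take the minimum over focus nodes as the set distance are assumptions for illustration; the function names are hypothetical and do not come from the pes codebase.

```python
import numpy as np


def effective_resistance(weights: np.ndarray) -> np.ndarray:
    """All-pairs effective resistance for a symmetric weighted adjacency matrix."""
    laplacian = np.diag(weights.sum(axis=1)) - weights
    # Moore-Penrose pseudoinverse of the Laplacian (the L⁺ in the text).
    l_plus = np.linalg.pinv(laplacian, rcond=1e-10)
    diag = np.diag(l_plus)
    # R[i, j] = L⁺[i, i] + L⁺[j, j] - 2 * L⁺[i, j]
    return diag[:, None] + diag[None, :] - 2.0 * l_plus


def rank_from_focus(weights, focus, names):
    """Rank non-focus nodes by distance to the focus set, nearest first."""
    resistance = effective_resistance(weights)
    # Assumption: distance to a set is the minimum over its members.
    dist = resistance[:, focus].min(axis=1)
    order = [int(i) for i in np.argsort(dist) if int(i) not in focus]
    return [(names[i], round(float(dist[i]), 3)) for i in order]


names = ["Request", "send()", "encode()", "auth.py"]
edges = [(0, 1), (0, 2), (1, 2), (0, 3)]  # toy graph, unit weights
weights = np.zeros((4, 4))
for i, j in edges:
    weights[i, j] = weights[j, i] = 1.0

ranking = rank_from_focus(weights, focus=[0], names=names)
```

All three candidates are one hop from Request, yet send() and encode() come out closer than auth.py (about 0.667 versus 1.0) because each also reaches Request through the other, which is the "many parallel paths" effect.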

Focus: Request

Top retrievals

  1. models.py (d=1)
  2. send() (d=1)
  3. encode() (d=1)

[Figure: typed code graph. Nodes: app.py, models.py, sessions.py, Request, Session, send(), encode(), headers(), auth.py, Adapter. Edge types shown: contains, imports, calls.]
Typed-graph retrieval. A focus concept lights up; the distance field decays through the graph and ranks nearby code first. Color intensity tracks closeness to the focus. In production the metric is effective resistance on the weighted graph Laplacian, not the hop count shown here.
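One way to picture how deposited mass could deform a resistance-based field is sketched below. It assumes that attending to a node scales the conductance of every edge touching it by (1 + mass); that update rule and the name `deposit_mass` are hypothetical, chosen only to show the direction of the effect, and are not the documented plasticity rule.

```python
import numpy as np


def resistance_from(weights: np.ndarray, source: int) -> np.ndarray:
    """Effective resistance from one source node to every node."""
    laplacian = np.diag(weights.sum(axis=1)) - weights
    l_plus = np.linalg.pinv(laplacian, rcond=1e-10)
    diag = np.diag(l_plus)
    return diag + diag[source] - 2.0 * l_plus[:, source]


def deposit_mass(weights: np.ndarray, node: int, mass: float) -> np.ndarray:
    """Hypothetical rule: scale conductance of edges touching `node` by (1 + mass)."""
    deformed = weights.copy()
    deformed[node, :] *= 1.0 + mass
    deformed[:, node] *= 1.0 + mass  # keeps the matrix symmetric (diagonal is zero)
    return deformed


# Path graph 0 - 1 - 2 - 3 with unit weights.
weights = np.zeros((4, 4))
for i, j in [(0, 1), (1, 2), (2, 3)]:
    weights[i, j] = weights[j, i] = 1.0

before = resistance_from(weights, source=0)
after = resistance_from(deposit_mass(weights, node=1, mass=1.0), source=0)
```

Here the distance from node 0 to node 2 drops from 2.0 to 1.0 after mass lands on node 1, and everything behind node 1 moves closer to the focus with it.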

Tensor substrate (prototype)

Per-relation geometries

One geometry per relation. Imports for bug fixes, calls for refactors, inherits for type-system changes. Imports is validated on current evals; the others are not.

Per-relation distance fields

Focus: Request

contains: 3/10
imports: 1/10
inherits: 3/10
calls: 6/10
refs: 3/10

Tensor substrate: a separate (R, K) signed bilinear geometry per relation type. The scalar substrate collapses these into one L⁺ field; the tensor variant keeps them separated, so different code-engineering task types can pull from the relation that actually carries their signal. Imports carries the bug-fix signal in current evals; refactor and inheritance tasks are unvalidated.
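The contrast between one collapsed field and separate per-relation fields can be sketched with plain unsigned Laplacians, one per edge type. This is a deliberate simplification: it does not implement the signed (R, K) bilinear geometry, and the `eps` coupling that keeps each per-relation graph connected is an assumption added so that unreachable nodes get a large but finite distance.

```python
import numpy as np

RELATIONS = ("contains", "imports", "inherits", "calls", "refs")


def distance_field(adjacency: np.ndarray, focus: int, eps: float = 1e-3) -> np.ndarray:
    """Effective resistance from `focus` under a single relation's edges."""
    n = adjacency.shape[0]
    # Assumption: weak all-pairs coupling so the Laplacian stays connected.
    weights = adjacency + eps * (np.ones((n, n)) - np.eye(n))
    laplacian = np.diag(weights.sum(axis=1)) - weights
    l_plus = np.linalg.pinv(laplacian, rcond=1e-10)
    diag = np.diag(l_plus)
    return diag + diag[focus] - 2.0 * l_plus[:, focus]


def per_relation_rankings(edges: dict, n_nodes: int, focus: int) -> dict:
    """One ranking per relation instead of a single collapsed ranking."""
    rankings = {}
    for relation in RELATIONS:
        adjacency = np.zeros((n_nodes, n_nodes))
        for i, j in edges.get(relation, []):
            adjacency[i, j] = adjacency[j, i] = 1.0
        dist = distance_field(adjacency, focus)
        rankings[relation] = [int(i) for i in np.argsort(dist) if int(i) != focus]
    return rankings


# Toy graph: 0 = Request, 1 = models.py, 2 = send(), 3 = Session.
edges = {"imports": [(0, 1)], "calls": [(0, 2)], "inherits": [(0, 3)]}
rankings = per_relation_rankings(edges, n_nodes=4, focus=0)
```

With the focus on Request, the imports field ranks models.py first while the calls field ranks send() first, so a task can read from whichever relation carries its signal.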

Scalar substrate vs BM25

10 real bug-fix tasks from after the model's training cutoff, so they are uncontaminated: the model has not seen the fixes. A hit means the gold-patch file appears in the retriever's top K. No router and no seed noise.

| Retriever | Hit @ K=5 | MRR |
| --- | --- | --- |
| BM25 over file contents | 30% | 0.225 |
| pes scalar substrate | 40% | 0.270 |

+10pp hit-rate, +20% MRR over BM25 on the same tasks. Bare LLM is not in the table because it does not retrieve; it greps until something works.
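For reference, the two metrics can be computed as in the sketch below. It assumes one gold file per task and an MRR that scores a task as 0 when the gold file is absent from the ranking; whether the reported MRR is truncated at K is not stated in the source. The toy rankings are invented for illustration and are not the evaluation data.

```python
def hit_at_k(rankings, gold, k=5):
    """Fraction of tasks whose gold file appears in the top k results."""
    hits = sum(1 for ranked, g in zip(rankings, gold) if g in ranked[:k])
    return hits / len(gold)


def mean_reciprocal_rank(rankings, gold):
    """Mean of 1/rank of the gold file; a miss contributes 0."""
    total = 0.0
    for ranked, g in zip(rankings, gold):
        if g in ranked:
            total += 1.0 / (ranked.index(g) + 1)
    return total / len(gold)


# Invented toy data: four tasks, one gold file each.
rankings = [
    ["a.py", "b.py", "c.py"],                          # gold at rank 1
    ["x.py", "y.py", "z.py"],                          # gold at rank 2
    ["p.py", "q.py"],                                  # gold missing
    ["m.py", "n.py", "o.py", "r.py", "s.py", "t.py"],  # gold at rank 6
]
gold = ["a.py", "y.py", "missing.py", "t.py"]

hit5 = hit_at_k(rankings, gold, k=5)
mrr = mean_reciprocal_rank(rankings, gold)
```

On the reported numbers, the relative MRR gain is (0.270 − 0.225) / 0.225 = 0.20, which is the +20% quoted above, and 40% − 30% gives the +10pp hit-rate difference.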

Tensor substrate, per slice

Same 10 tasks, using the tensor substrate's per-relation rankings (K=5 distinct geometries, one per edge type).

| Retriever | Hit @ K=5 | MRR |
| --- | --- | --- |
| tensor union (all K=5 slices) | 40% | 0.190 |
| tensor slice 1 (imports) alone | 40% | 0.240 |
| tensor slices 0 / 2 / 3 / 4 alone | 0% | 0.000 |

On bug-fix tasks only the imports slice has signal. Refactor and type-system tasks should exercise the other slices but the 10-task set does not cover them yet.

Router on SWE-bench

707 SWE-bench-Lite + Verified tasks × 193 public-system predictions. Trained on leave-one-repo-out CV across 12 repos. Per-repo resolution rate vs the always-best-single-action baseline.

| Repo | Router | Baseline (~RAG) | Δ |
| --- | --- | --- | --- |
| scikit-learn | 73.9% | 32.6% | +41.3pp |
| requests | 69.2% | 38.5% | +30.8pp |
| pallets/flask | 25.0% | 0.0% | +25.0pp |
| sphinx-doc | 29.8% | 5.3% | +24.6pp |
| pydata/xarray | 26.9% | 3.8% | +23.1pp |
| pylint-dev | 40.0% | 20.0% | +20.0pp |
| astropy | 37.5% | 20.8% | +16.7pp |
| django | 36.8% | 21.5% | +15.2pp |
| sympy | 37.7% | 23.8% | +13.8pp |

Per-repo wins are single-seed. 20-seed task-held-out cross-validation gives +8.60pp ± 6.19pp (router wins 19/20 seeds, never loses; median +6.67pp).
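The k-NN fallback behind the router can be sketched as follows: for a new task, look up the k most similar historical records and pick the action with the best mean reward among them. The action values in `ACTIONS`, the Euclidean distance on task features, and the name `knn_fallback` are assumptions for illustration; the source names the four action axes but not their values, and the bandit itself is not shown.

```python
from collections import defaultdict
from itertools import product

import numpy as np

# Hypothetical grid over (channels, depth, body-inject, RAG).
ACTIONS = list(product(("imports", "calls"), (1, 2), (False, True), (False, True)))


def knn_fallback(task_features, history, k=3):
    """history: list of (features, action, reward) from past tasks."""
    feats = np.array([h[0] for h in history], dtype=float)
    query = np.asarray(task_features, dtype=float)
    dists = np.linalg.norm(feats - query, axis=1)
    nearest = np.argsort(dists)[:k]
    rewards = defaultdict(list)
    for idx in nearest:
        _, action, reward = history[int(idx)]
        rewards[action].append(reward)
    # Best mean reward among actions seen on the nearest tasks.
    return max(rewards, key=lambda a: float(np.mean(rewards[a])))


action_a, action_b = ACTIONS[0], ACTIONS[-1]
history = [
    ([0.0, 0.0], action_a, 1.0),    # similar task, action A resolved it
    ([0.0, 0.1], action_b, 0.0),    # similar task, action B failed
    ([10.0, 10.0], action_b, 1.0),  # dissimilar task, ignored at k=2
]
choice = knn_fallback([0.1, 0.0], history, k=2)
```

Because the fallback only chooses among actions already observed on neighbouring tasks, its pick is never worse, on that neighbourhood's history, than the best action seen there.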

Agent cost

claude-opus-4-7 on 6 psf/requests SWE-bench tasks. Total cost across 6 tasks; all configs resolved all 6.

| Config | Total cost | vs bare |
| --- | --- | --- |
| bare LLM (no injection) | $4.19 | baseline |
| aider RepoMap (RAG) | $3.38 | −19.3% |
| pes scalar substrate (hooks default) | $2.87 | −31.6% |

The sample is small: 6 tasks, a single repo, Opus only, and all tasks predate the model's training cutoff, so the model may have seen the fixes. Treat the result as directional rather than definitive.

Modules

| Module | What it does |
| --- | --- |
| codegraph | Python source to typed-edge graph via tree-sitter and jedi resolution. File / class / function / method nodes. Five relations: contains, imports, inherits, calls, refs. |
| substrate / observer / plasticity / encoder | L⁺ scalar substrate primitives. The scalar field that beats RAG in the default configuration. |
| tensor_substrate / tensor_encoder / sequential_observer | Per-concept (R, K) signed bilinear edges, signed effective resistance, variable-rank emergence under nuclear-norm regularization. Research-side; not yet validated downstream. |
| substrate_session / tensor_session | High-level session adapters for both substrate flavors. |
| compositional_router / task_featurizer | Per-task contextual bandit over (channels × depth × body-inject × RAG). k-NN fallback so the router can't lose to the best historically-seen action on similar tasks. |
| hooks.user_prompt_submit / hooks.post_tool_use | Claude Code hooks. Substrate-ranked context injection before the model's turn; observer-mass deposit on read / grep / edit. |

Confidential

Access by request

The Plastic Embedding Substrate isn't publicly available. Tell us your use case and we'll follow up about access to the code and a guided substrate session.

Apply for access