Distilligent — The meaning between the data

[Figure: a dark chambered structure with luminous gold foam expanding through its reachable cavities, occupying rare representation space.]

Expanding foam fills the reachable cavities: verified, keyed, auditable routes through rare representation space.

The AI you used this morning did not move through all of itself.

It moved through a narrow set of familiar paths: the regions its training, prompts, tools, and ordinary user interactions tend to activate. Around those paths is a much larger territory: reachable, rarely visited, weakly governed, and difficult to inspect exhaustively.

This essay is about that territory.

Not because nobody has noticed it. The field has been circling it for years: superposition, sparse features, pruning, backdoors, latent representations, multilingual subspaces, stealth channels, defensive backdoors, and model interpretability all point toward the same structural fact.

Modern models contain more behavioral possibility than ordinary evaluation can cover.

The question is what we do about that.

The standard answer is detection: build better microscopes, search for hidden circuits, red-team more triggers, expand the evaluation distribution.

Detection is necessary. But detection alone is structurally outmatched by the size of the space.

This essay proposes a complementary move: stop only searching the empty rooms. Start occupying them.

The metaphor is expanding foam. In construction, expanding foam fills gaps inside a wall. It does not replace the frame. It does not rebuild the house. It expands into reachable cavities, hardens, and leaves less unoccupied space for anything else to enter.

The AI safety version is this: verified, keyed, auditable routes through rare reachable representation space. Not guardrails. Not hidden backdoors. Not a ban on entering certain regions. Occupation.

A guardrail says: do not go there. Foam says: we already live there.

What “rooms” means

When a language model learns, it does not store facts like files in a cabinet. It builds internal representations: high-dimensional geometric states in which concepts, relations, tasks, styles, memories, languages, and behaviors acquire positions and directions.

This is not metaphorical in the casual sense. These internal coordinates can be extracted, measured, decomposed, probed, and compared. Work on transformer circuits, superposition, monosemanticity, sparse autoencoders, multilingual representation geometry, and hidden-state probing all treats model cognition as something with internal shape.

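Let the hidden-state space of a model be:

$$\mathcal{H} \subseteq \mathbb{R}^{d}$$

where $d$ may be thousands of dimensions. Under a natural input distribution $\mathcal{D}$, define a visitation measure:

$$\mu_{\mathcal{D}}(R) = \Pr_{x \sim \mathcal{D}}\left[\, h(x) \in R \,\right], \qquad R \subseteq \mathcal{H}$$

where $h(x)$ is the hidden state the model reaches on input $x$. A region $R$ is ordinary if the model visits it often under normal use. A region is rare if $\mu_{\mathcal{D}}(R) < \varepsilon$ for some small threshold $\varepsilon > 0$. And a rare region is reachable if there exists some constructible input, context, or cue $c$ such that $h(c) \in R$.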

The regions that matter here are not imaginary. They are reachable but rare. The model can get there. Ordinary use usually does not take it there.

These are the empty rooms.
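The definitions above can be made concrete with a toy estimate. The sketch below is only an illustration under loud assumptions: hidden states are stand-in 2D points, a "region" is a coarse grid cell, and the names `cell_of`, `visitation`, and `classify` are hypothetical, not part of any Distilligent system. It estimates visitation frequency from sampled ordinary use, then labels a cell as ordinary, rare but reachable, or never reached.

```python
import random
from collections import Counter

def cell_of(state, cell_size=1.0):
    """Map a hidden state (tuple of floats) to a coarse grid cell, a stand-in for a region."""
    return tuple(int(x // cell_size) for x in state)

def visitation(states, cell_size=1.0):
    """Empirical visitation frequency of each cell under the sampled states."""
    counts = Counter(cell_of(s, cell_size) for s in states)
    total = len(states)
    return {cell: n / total for cell, n in counts.items()}

def classify(cell, mu, reachable_cells, eps=0.01):
    """Label a cell using the empirical visitation map `mu` and a rarity threshold `eps`."""
    if mu.get(cell, 0.0) >= eps:
        return "ordinary"
    if cell in reachable_cells:
        return "rare-reachable"
    return "unreached"

# Toy data (assumption): ordinary use clusters around the origin.
rng = random.Random(0)
ordinary_states = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(5000)]
mu = visitation(ordinary_states)

# A deliberately constructed cue lands far from the ordinary cluster.
constructed_states = [(6.2, 6.7)]
reachable_cells = {cell_of(s) for s in constructed_states}

for probe in [(0.1, 0.2), (6.2, 6.7), (20.0, 20.0)]:
    print(probe, classify(cell_of(probe), mu, reachable_cells))
```

In a real model the grid would be replaced by whatever partition of representation space the analysis uses, and reachability would have to be demonstrated by actually constructing inputs. The toy only shows how the three labels separate.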

Why empty rooms matter

Backdoors exploit this asymmetry.

A backdoored model does not need to behave badly under normal evaluation. It only needs a route from a trigger to a behavior that ordinary testing does not visit.

The backdoor literature has already named this threat in multiple forms: triggered behavior, stealth activation, supply-chain risk, adversarial pathways, and hidden behavior that remains invisible under standard evaluation.

The point of this essay is not to rediscover that risk. The point is to ask whether detection is enough.

If rare reachable space is much larger than ordinary behavioral space, then detection has a structural disadvantage. You are searching for covert routes in a territory whose size grows faster than your ability to exhaustively inspect it.

Better interpretability tools help. Better red teams help. Better evals help. But they do not change the asymmetry.

The adversary only needs one route you missed.

The expanding foam move

The expanding foam theory starts from a different instinct. Instead of only asking “how do we find hidden routes?” it asks: “what if the rare reachable space is already occupied by verified routes?”

More formally, construct a set of known, keyed routes:

$$\Gamma = \{\gamma_1, \gamma_2, \dots, \gamma_n\}$$

through rare reachable regions of the model’s representation space. Each route carries verified meaning, provenance, and access conditions. Around each route $\gamma_i$ is a neighborhood $N(\gamma_i)$ representing the local region influenced or covered by that route.

The coverage question is not raw geometric volume alone. In high-dimensional model space, raw volume is usually the wrong intuition. The relevant object is coverage under a reachable rare-space measure: the proportion of constructibly reachable rare regions that fall within known route neighborhoods. Call that coverage:

$$C(\Gamma) = \frac{\nu\left(\mathcal{R} \cap \bigcup_{i=1}^{n} N(\gamma_i)\right)}{\nu(\mathcal{R})}$$

where $\mathcal{R}$ is the reachable rare space and $\nu$ is a measure over it. The central conjecture is:

As $C(\Gamma)$ increases, covert adversarial routes through rare space become more likely to intersect, disturb, or reveal themselves against existing verified routes.

In simple words: if enough of the empty territory is already occupied, it becomes harder to hide a new path through it.

That is the foam.
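Coverage in this sense can be estimated by sampling rather than by computing volume. The sketch below is a minimal Monte Carlo illustration, assuming that points can be drawn from the reachable rare-space measure, that a route is a list of waypoints, and that a neighborhood is a fixed-radius ball around each waypoint. All of these are simplifying assumptions, and `near_route` and `coverage` are hypothetical names.

```python
import math
import random

def near_route(point, route, radius):
    """True if `point` lies within `radius` of any waypoint on `route`."""
    return any(math.dist(point, w) <= radius for w in route)

def coverage(rare_samples, routes, radius):
    """Fraction of sampled reachable rare points that fall inside some route neighborhood."""
    if not rare_samples:
        return 0.0
    covered = sum(
        1 for p in rare_samples
        if any(near_route(p, r, radius) for r in routes)
    )
    return covered / len(rare_samples)

# Toy data (assumption): rare reachable space is sampled uniformly from a 10 x 10 square.
rng = random.Random(1)
rare_samples = [(rng.uniform(0, 10), rng.uniform(0, 10)) for _ in range(2000)]
routes = [
    [(rng.uniform(0, 10), rng.uniform(0, 10)) for _ in range(5)]
    for _ in range(8)
]

# Coverage can only stay flat or grow as verified routes are added.
for n in (0, 2, 4, 8):
    print(n, round(coverage(rare_samples, routes[:n], radius=1.0), 3))
```

Because the samples stand in for the measure, the estimate weights regions by how the sampler reaches them, not by raw geometric volume. The conjecture about adversarial interference is a separate claim that this estimate does not test.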

This is not a hidden backdoor proposal

This distinction matters. A backdoor hides behavior. The foam does the opposite. It makes rare-space routes known, keyed, auditable, and attributable.

Each route has provenance. Each route belongs to a known key-holder. Each route carries verified knowledge or structure. Each route is meant to be inspected by the system that placed it there.

This is not a proposal to smuggle behavior into a model. It is a proposal to occupy rare reachable representation space with verified, keyed, auditable routes so that covert routes become harder to hide and easier to disturb.

Backdoors are covert. Foam is witnessed.

Constraint versus occupation

Most AI safety mechanisms are forms of constraint. They say: do not answer that. Do not produce that. Do not go there. Do not cross this boundary.

Constraint is necessary. But constraint has a weakness: the forbidden room still exists. A jailbreak is often just a way of reaching it through a different corridor.

Occupation works differently. Occupation says: this region is not empty. It is already structured. It already contains verified routes, provenance, keys, and expected interference patterns.

Constraint forbids a region. Occupation makes the region legible. That is the shift.

The multilingual fold

Here is where the geometry becomes personal.

I speak nine languages. I have spent my life watching the same concept occupy different rooms depending on the language I am thinking in. Trust in English is a handshake. اعتماد in Urdu is a relationship. الثقة in Arabic is earned through witnessed action. Same concept. Different geometry.

Multilingual models show a related structure. Representation spaces contain language-agnostic components and language-specific components. Across layers, models often move from more shared semantic structure toward more language-specific expression. Languages overlap, but they do not collapse into a single flat room.

So the rare space is not just large. It is folded. A multilingual model does not have one floor. It has many partially overlapping floors, with corridors between them. The foam must account for that.

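Let $\mathcal{L}$ be the set of languages. For each language $\ell \in \mathcal{L}$, there is a language-conditioned rare space $\mathcal{R}_\ell$. The multilingual coverage question becomes:

$$C_\ell(\Gamma) = \frac{\nu_\ell\left(\mathcal{R}_\ell \cap \bigcup_{i} N(\gamma_i)\right)}{\nu_\ell(\mathcal{R}_\ell)} \quad \text{for every } \ell \in \mathcal{L}$$

where $\nu_\ell$ is a measure over that language’s reachable rare space and $N(\gamma_i)$ is the neighborhood of route $\gamma_i$.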

This is not just translation. It is navigation. If you only inspect English-conditioned routes, you are checking one floor of a building with many floors.

The solution is not to fear the floors you cannot see. The solution is to occupy them.
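The "one floor of many" point can be shown with a per-language version of the coverage estimate. This is a toy sketch under stated assumptions: each language has its own handful of sampled rare points in a shared 2D space, routes are waypoint lists, and neighborhoods are fixed-radius balls. The language codes, coordinates, and function names are illustrative only.

```python
import math

def covered(point, routes, radius):
    """True if `point` is within `radius` of any waypoint of any route."""
    return any(math.dist(point, w) <= radius for route in routes for w in route)

def coverage_by_language(rare_by_lang, routes, radius):
    """Per-language fraction of sampled rare points inside some route neighborhood."""
    result = {}
    for lang, points in rare_by_lang.items():
        hits = sum(1 for p in points if covered(p, routes, radius))
        result[lang] = hits / len(points) if points else 0.0
    return result

# Toy data (assumption): each language conditions a different part of the space.
rare_by_lang = {
    "en": [(0.0, 0.0), (0.5, 0.2), (1.0, 0.1), (3.0, 3.0)],
    "ur": [(5.0, 5.0), (5.5, 4.8), (4.9, 5.3)],
    "ar": [(0.2, 5.0), (0.0, 5.6)],
}

# Routes laid only through the English-conditioned floor.
routes = [[(0.0, 0.0), (1.0, 0.0)]]
cov = coverage_by_language(rare_by_lang, routes, radius=0.6)
weakest = min(cov, key=cov.get)
print(cov, "weakest floor:", weakest)

# Adding a route through the Urdu-conditioned floor changes only that floor.
routes.append([(5.0, 5.0), (5.5, 5.0)])
cov_after = coverage_by_language(rare_by_lang, routes, radius=0.6)
print(cov_after, "weakest floor:", min(cov_after, key=cov_after.get))
```

Reporting the weakest floor rather than an average is a design choice for this sketch: an average can look healthy while one language-conditioned space remains entirely unoccupied.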

The foam is keyed

The foam is not anonymous. A route without provenance is just another mystery in the wall. A defensive route must be keyed.

A key is not merely a password. In this framework, the key is the relational context that makes the route meaningful and authorized. It determines who placed the route, what it carries, when it should activate, what it is allowed to influence, and what counts as misuse.

This gives the foam three properties.

Auditability. Every route has a provenance trail. Who placed it, when, carrying what, under which authority.

Non-interference by default. A keyed route is present but not generally active. It is not meant to alter ordinary behavior unless the correct relational and contextual conditions are met.

Model-agnostic ownership. The model is the building. The foam is the occupant. A model may be open-source, closed-source, domestic, foreign, commercial, or local. The route owner, key structure, and provenance layer remain distinct from the model’s original builder.

This is where the theory connects to the Verstehen Impossibility Theorem. The theorem states that meaning cannot be cold-extracted from relational structure without the relational context that constitutes it. Applied here, the key supplies that relational context. Without it, the meaning of a route is structurally underdetermined.

This should not be confused with ordinary cryptography. The implementation still requires access control, provenance engineering, and operational security. But the semantic-security intuition is stronger than “hard to decode.” Without the relation, there is no single meaning to decode.
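The access-control and provenance layer mentioned above is the one part that ordinary engineering can illustrate directly. The sketch below is a hypothetical record format, not Distilligent's implementation: the `RouteRecord` fields, the `sign`, `verify`, and `may_activate` helpers, and the demo key are all assumptions made for illustration. It uses a standard HMAC to bind who placed a route, when, carrying what, and in which contexts it may activate. It says nothing about the relational sense of "key" in the essay, which is a stronger notion than a shared secret.

```python
import hashlib
import hmac
import json
from dataclasses import dataclass, asdict, replace

@dataclass(frozen=True)
class RouteRecord:
    route_id: str
    owner: str              # who placed the route
    placed_at: str          # when, as an ISO 8601 timestamp
    payload_digest: str     # SHA-256 of what the route carries
    allowed_contexts: tuple # contexts in which activation is authorized

def sign(record: RouteRecord, key: bytes) -> str:
    """HMAC-SHA256 tag over a canonical JSON encoding of the record."""
    body = json.dumps(asdict(record), sort_keys=True).encode("utf-8")
    return hmac.new(key, body, hashlib.sha256).hexdigest()

def verify(record: RouteRecord, tag: str, key: bytes) -> bool:
    """Constant-time check that the record has not been altered since signing."""
    return hmac.compare_digest(sign(record, key), tag)

def may_activate(record: RouteRecord, tag: str, key: bytes, context: str) -> bool:
    """Inactive by default: requires a valid provenance tag and an authorized context."""
    return verify(record, tag, key) and context in record.allowed_contexts

# Demo values only; a real deployment would manage keys outside the code.
key = b"demo-key-not-for-production"
record = RouteRecord(
    route_id="route-001",
    owner="example-owner",
    placed_at="2026-06-01T00:00:00Z",
    payload_digest=hashlib.sha256(b"example payload").hexdigest(),
    allowed_contexts=("audit", "recovery"),
)
tag = sign(record, key)

print(may_activate(record, tag, key, "audit"))   # authorized context
print(may_activate(record, tag, key, "chat"))    # ordinary use, stays inactive
print(verify(replace(record, owner="someone-else"), tag, key))  # tampered provenance
```

The record gives auditability (every field is covered by the tag) and non-interference by default (activation fails unless both checks pass). Model-agnostic ownership follows from the record living outside the model.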

The untrusted-model question

This reframes a question that is often discussed politically: can we trust models built by someone else?

The better question is architectural: can we occupy and govern the reachable space inside the models we use?

A third-party model may be Chinese, American, open-source, closed-source, vendor-hosted, locally deployed, or internally fine-tuned. The geopolitical label matters less than the structural fact: if the model contains reachable rare space that you have not mapped, governed, or occupied, then you are trusting an empty building.

If you can lay your own verified, keyed, auditable routes through that space, the security posture changes. Not because all risk disappears. Because control moves from political trust to structural occupation.

The question stops being “who built the building?” and becomes “who occupies the rooms?”

What is implemented and what remains open

This theory comes from working system behavior, not only metaphor. The route/key/occupation primitive is already implemented inside Distilligent models. The same geometry underlies the Semantic Vault: a system for placing, recovering, and protecting proprietary meaning through keyed semantic routes.

What is implemented:

visitation mapping over model hidden-state behavior
route construction through rare reachable representation regions
keyed semantic routes
route fidelity checks
provenance-bearing route ownership
multilingual navigation primitives
model-side integration in Distilligent systems

What remains open is the full safety theorem. Specifically:

The interference conjecture. At what coverage density do verified routes reliably disturb, reveal, or constrain adversarial routes through the same rare space?

The scaling question. Can sufficient coverage be achieved in frontier-scale models with tens or hundreds of billions of parameters?

The stability question. How stable are routes under model updates, fine-tuning, quantization, and deployment drift?

These are not weaknesses to hide. They are the boundary of the current work. The primitive exists. The full density theorem is forthcoming.
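The stability question, at least, can be phrased as a measurable check even before any theorem exists. The sketch below is one possible framing, not the route fidelity check listed above: it assumes each route can be summarized as a vector, compares that vector before and after a model change using cosine similarity, and flags routes that moved past a tolerance or disappeared. The threshold of 0.95 and the names `cosine` and `drifted_routes` are arbitrary illustrative choices.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

def drifted_routes(before, after, min_similarity=0.95):
    """Ids of routes whose summary vector moved past tolerance or is missing after the change."""
    flagged = []
    for route_id, vec in before.items():
        new_vec = after.get(route_id)
        if new_vec is None or cosine(vec, new_vec) < min_similarity:
            flagged.append(route_id)
    return flagged

# Toy data (assumption): route summaries before and after a fine-tune or quantization step.
before = {"r1": (1.0, 0.0, 0.0), "r2": (0.0, 1.0, 0.0), "r3": (0.0, 0.0, 1.0)}
after = {"r1": (0.99, 0.05, 0.0), "r2": (0.5, 0.5, 0.0)}

print(drifted_routes(before, after))
```

Here `r1` survives with a small perturbation, `r2` has rotated well away from its original direction, and `r3` is absent after the change, so the last two are flagged for re-verification.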

The connection to prior work

The foam theory sits inside an active conversation. Backdoor research shows that models can carry hidden triggered behavior. Interpretability research gives us tools for reading internal structure. Superposition research shows that features are packed into high-dimensional space. Sparse autoencoders help isolate features and reason about occupied directions. Pruning research demonstrates that networks contain substantial redundancy. Multilingual representation work shows that language-specific and language-agnostic spaces coexist. Steganographic and multi-agent deception work shows that hidden channels are not hypothetical. Proactive defensive backdoor work has already explored the idea of occupying attack surfaces before an adversary does.

The foam theory does not deny that lineage. It sharpens one move inside it: defensive occupation should be representation-geometric, keyed, auditable, multilingual, and model-agnostic.

Weights are the building materials. Representation geometry is the building. Foam occupies the rooms.

The builder’s solution

The current safety posture spends enormous effort looking for what might be hidden inside models. We need that. We need microscopes. We need red teams. We need evals. We need interpretability.

But the microscope and the foam solve different problems. The microscope asks: “what is already hiding here?” The foam asks: “why is this space still empty enough for something to hide?”

That is the builder’s move. You do not only search the building. You move in. You fill reachable rare space with verified, keyed, auditable routes. You make the room legible. You make covert occupation harder. You turn empty territory into governed territory.

The expanding foam theory of AI safety is not detection alone. It is pre-emption through occupation.

One morning at a time. Usually with yogurt.

A working note from Distilligent, June 2026. The route/key/occupation primitive described here is already implemented inside Distilligent models. The formal treatment of coverage density, adversarial-route interference, and stability thresholds is forthcoming as a timestamped paper with DOI. This essay is for the intuition, the definitions, and the foundation.

The Expanding Foam Theory of AI Safety