A pilot proposal · Safeguards
Models generate and train on languages designed decades ago for human convenience. A language designed for verification makes whole classes of unsafe code impossible to write — and provable to a static analyzer before it ever runs. Think of what TypeScript did for JavaScript, stewarded by Microsoft — but for Python's world, built for the agentic era, and stewarded by Anthropic.
The missing standard
Anthropic published the Model Context Protocol so models could use tools through one open interface, and Skills so models could load capabilities through one open format. Both became standards the field builds on, and Anthropic steers their direction. A third surface remains undefined.
How models reach tools. Open, adopted across the field, Anthropic-stewarded.
How models load capabilities. Open format, growing ecosystem.
How models write and train on code. No standard. No verifiability floor. The biggest safety surface, unowned.
Every model today writes code in languages built for human ergonomics, where unsafe behavior — hidden mutation, null dereferences, data races, injection, undeclared side effects — is something the model can emit and no analyzer can statically rule out. As autonomous agents write and run more code with less human review, that gap becomes the safety problem.
The safety thesis
The danger in agent-written code is not that the model is unskilled. The danger is unverified behavior: an effect the model did not declare, a mutation no one expected, a path that dereferences null, a string that reaches a SQL or HTML sink unescaped. Today these are caught — when they are caught — by review, testing, and luck.
In a language where these classes are compile-time type errors, the agent cannot write the unsafe thing, and a static analyzer can prove what the code will and will not do before it runs. Defense moves out of the adversarial runtime and into something as tractable as a type checker.
The model stays fixed. The dangerous classes become unwritable.
Capability is composed from verifiable parts a static analyzer can read.
This is directed evolution rather than open-ended self-modification: a fixed model generating within a verifiable substrate, where new capability is composition of safe parts and the safety properties are decided statically. The thing that makes the agent capable and the thing that makes it defensible become the same design.
The compiler as free red team
Current AI coding tools generate code that could be wrong. The standard response is more review — human or model. Both are expensive and have correlated failure modes: if the generating model missed something, the reviewing model is likely to miss it too, trained on similar data with similar blind spots.
A compiler has uncorrelated failure modes. It does not hallucinate, does not lose track of invariants over long contexts, does not get distracted. When an LLM writes U code and the compiler rejects it, that is a vulnerability caught before it runs — at zero marginal cost, in milliseconds, exhaustively across every code path the type system covers. The compile-error loop between model and compiler runs thousands of times a day without human involvement.
The pit of success for LLMs: the zero-annotation default is already safe. The compiler catches what the model missed. Two different kinds of intelligence, two different failure modes — together they are stronger than either alone.
This advantage compounds. Anthropic observes LLM-generated U code at scale — which patterns recur, which modifier combinations the model gets right automatically, which it misses. Future versions of U add new compile-time checks derived from those observations: new modifier dimensions, new constraint validations, new error classes made impossible. No other language designer has a billion lines of LLM-generated code to learn from. The language becomes safer with every model generation that trains on it.
What the language does
The design places a small, commutative set of verified annotations at each binding site: ownership (stack or heap), mutability and its write policy, nullability, async boundary, and declared effects. The zero-annotation default is the safest and fastest configuration, so every cost and every risk requires a visible, deliberate opt-in. A reader — human or model — sees each property in the source, with no inference required.
U is inspired by Python, and reads like it — clean indentation, low ceremony, familiar shape. The relationship is the one TypeScript has to JavaScript: not a replacement but a safety layer over a language people already use, stewarded by a single company, that became the default way serious teams write the underlying language because it caught a whole class of errors before they shipped. U is that layer for Python's world, built for the agentic era. It also holds to Python's own maxim, "there should be one obvious way to do it," which Python set for itself and then drifted from as it sprawled into many ways to format a string, loop, or run something async — U keeps the single obvious way, the way Kotlin kept Java's ecosystem while removing its verbosity and footguns. Teams that reach for Python will find U immediately legible, and find that it does not let them write the things Python lets them get wrong.
Python decorators — @cache, @retry, @auth — become modifier annotations: +D(cache(60)), +D(retry(3)), +D(auth("admin")). The same cross-cutting behavior, applied by the compiler rather than at runtime, visible at every binding site. Python's @ is matrix multiplication in U, following NumPy convention. Complex numbers C64 and C128 are built-in primitives — scientific Python users find the numeric stack already there, no external libraries needed.
// the zero-annotation default is safe and fast; risk is always visible f processPayment( order: Order+R+M(MVCC), // heap, mutable under MVCC — declared user: User+R, // heap reference, immutable — declared { timeout: I = 30 } // optional named params, defaulted — no wrapper ): PaymentResult ! PaymentError // the failure mode is declared // at the call site, the optional is just a value — no type, no wrapper processPayment(order, user) // timeout defaults to 30 processPayment(order, user, { timeout: 60 }) // override — no :I needed
Because the properties are verified rather than inferred, a whole catalog of failures stops being runtime surprises and becomes compile errors:
Python parity for scientific computing — without external libraries.
Complex numbers C64 / C128 are built-in primitives. @ is matrix multiplication, ^ is powers — the same operators Python uses. +V enables SIMD vectorization. +R(GPU) moves data to device memory. Python people read U code immediately.
For an autonomous agent, this is a guardrail it cannot step around: the unsafe program does not compile. For a reviewing analyzer, every consequential property is a fact in the source, not a guess.
Full language reference — modifiers, type system, concurrency, formal guarantees →
Side by side
Agent-generated code rarely fails one bug at a time; the failures arrive clustered, latent, and surface in production against a real user or attacker. Below, the failures other languages accept at runtime are, in U, compile-time type errors or checks the compiler inserts by default. The agent cannot emit them; the analyzer proves them absent.
// query built by string concatenation const q = "SELECT * FROM users " + "WHERE id = " + userId; db.query(q); // userId = "0 OR 1=1" // → every row leaks
// {uid} is bound, never interpolated rows = sql`SELECT * FROM users WHERE id = {uid}` // a raw string cannot become Sql — // concatenating a query by hand is // a type error
def summarize(orders, users): first = orders[0] # IndexError if empty user = users.get(first["uid"]) # None if missing return { "name": user.name, # 💥 None.name "total": first["amount"] + user.credit # str + int, fails deep }
f summarize(orders: [Order],
users: {S: User}): Summary+N
orders.length == 0 & r => none // empty handled
first = orders[0] // Order, bounds-checked
user: User+N = users[first.uid] // +N in the type
user ?? r => none // must handle missing
r => Summary({
name: user.name, // typed, non-null
total: first.amount + user.credit // both N
})
Four failures sit latent in the Python: an empty list, a missing key, a null dereference, and a string-plus-number confusion from untyped data. In U the type confusion and the null dereference are compile errors, the array access is bounds-checked rather than silent, and the missing-key case is part of the type the compiler forces you to handle. The U version is longer precisely because the compiler made every failure path visible.
The duck typing in the Python version is not only a correctness hazard — it is an attack surface. Prototype pollution in JavaScript and attribute-injection tricks in Python exist because the type and attribute structure is mutable and inferred at runtime rather than declared and sealed at compile time. U has no untyped escape hatch, no open prototype, and no inferred attribute bag: every binding's type is fixed and verified, so the whole class of pollution and injection-by-type-confusion attacks becomes unexpressible.
int total = balance + deposit; // silent wraparound on overflow → // a negative balance, a classic // source of financial CVEs
total: I32 = balance + deposit // signed types trap on overflow — always hash: U32 = seed * prime // unsigned types wrap — by definition // intent is in the type, not the operator
The pattern repeats across the catalog: use-after-free, data races on shared mutable state without a declared write policy, async violations, const mutation, missing i18n keys. Each is a runtime failure elsewhere and a compile error in U.
Why a standard, and why Anthropic
A verifiability-first language is most valuable as an open standard the field adopts, not as a closed tool. Published openly, other labs train on it and generate in it, which entrenches it as the default substrate for machine-written code. Stewarded by Anthropic, its direction stays with the lab that defined it. Adoption by others is not leakage; it is what makes the standard load-bearing, exactly as it was for MCP.
Open standard, Anthropic-stewarded. Other labs train on it. Anthropic holds the pen on how AI writes safe code.
This follows a proven path. Go was built inside a large company, published open, and championed by its designers; it became infrastructure the field relies on, with no ownership friction because it was open from the first day. A language for safe machine-generated code, published as an Anthropic standard, puts Anthropic at the center of the surface where AI safety and AI productivity meet.
The strategic payoff is position, not enclosure: the steward of the standard everyone trains on shapes the field more than the owner of a tool no one else can touch.
The deeper move · a compilation target
The Java Virtual Machine and the .NET Common Language Runtime did not win as languages. They won as targets: a common substrate that many languages compile to, where the platform owner controlled the verification, the security model, and therefore the ecosystem. Adoption never required developers to hand-write bytecode. It required the tooling around the target to be good.
The same shape applies here, with one decisive difference. JVM bytecode and CLR IL are machine-only. This target is human-readable and verifiable — so the compiled-to form can be audited, reviewed, and statically proven safe, which is exactly the property safety demands. A platform target that a person or an analyzer can still read.
Be flexible in what you accept. Be strict in what you emit.
Postel's robustness principle, applied to code itself: accept the world's messy languages as input, normalize to one strict, verifiable, analyzable target as output. The diversity goes in; the safety comes out.
Because models generate the target and transpilers port to it, adoption does not wait on developers learning new syntax. A large fraction of existing code can be ported into the safe target — by the same models that now write code — and every codebase ported becomes more training data in the safe target, which makes the porting better, which ports more. Anthropic's head start in training on it and porting to it compounds.
The target compiles to WASM and to C. Compiling to WASM means it runs inside the sandboxed-execution standard the field already trusts, rather than inside a general-purpose JavaScript runtime never designed for untrusted machine-written code. Anthropic can ship the runtime that safely executes the target in production — the move Node made when it lifted Google's V8 engine into a server runtime, except here the runtime is sandboxed and attested by design.
Anthropic already owns the runtime. Bun — the JavaScript runtime written in Zig with JavaScriptCore, now part of Anthropic since December 2025 — is the natural distribution layer. U compiles to WASM; WASM runs on Bun. The language, the porting models, and the runtime are all under one roof. No new infrastructure required.
Generate safe. Port safe. Run safe. One target Anthropic stewards across all three.
The platform is three layers, each an Anthropic-owned standard: the verifiable target models write and train on; the porting models that bring the world's code into it; and the sandboxed runtime that executes it. The same way Sun owned the JVM, Microsoft owned the CLR, and the V8 team's engine became the substrate under Node.
This is where the three layers become a product motion. An enterprise or trusted partner brings its existing codebases, the porting models translate them into the verifiable target, and the result is code that is analyzable, sandboxed, and runtime-verified — measurably safer than what they ran before, without a greenfield rewrite. The work happens inside the customer's own environment; their source never has to leave their walls. After the port, Anthropic's agents — trained on the target — close the loop, maintaining and improving the ported code autonomously, around the clock, against the same verifiable guarantees.
Port your legacy code into the verifiable target. It comes out safer, sandboxed, and maintained by agents that cannot write the unsafe thing.
Anthropic already publishes the vulnerabilities its models find in real codebases. The next step is to publish the fixes: take great open-source codebases, port them into the verifiable target, and the patched, hardened version — with the remediations applied and, in this language, more obvious — lives in the target. Each one is a public good and a draw: the safer version of a well-known project exists in U, and the safety is legible there in a way it was not in the original.
The precedent is recent and close to home. Bun, now Anthropic's, was ported from Zig to Rust almost entirely by Claude Code — over a million lines, the test suite passing. Bun's own pre-release audit of that port is the argument for this target. The port carries 13,365 unsafe blocks; the audit's fair point is that most are removable FFI-boundary and ported-idiom scaffolding. But the finding that matters is the one the count does not include: five functions are unsound today — undefined behavior reachable from safe Rust. The escape hatch leaked past its own boundary, so even the region the language calls safe was not.
U has no escape hatch at all. There is no unsafe region whose invariants can leak into the safe one — because there is no unsafe region.
That is the difference between porting into a language that permits unsafety behind a keyword and porting into one that cannot express it. Rust's unsafe is a boundary that can be crossed, and as Bun's audit shows, its invariants can leak back into safe code. A target with no escape hatch has no such boundary to leak. The remediation is real because the target will not compile the unsafe version at all. Every hardened codebase published this way is more high-quality code in the language, drawing the next one in.
This also changes the posture of Anthropic's security work. Publishing the vulnerabilities a model found — in code the lab cannot itself release fixes for — is a capability demonstration that leaves the finding on the table and the finder looking adversarial. Porting-and-hardening inverts it: instead of announcing the problem, Anthropic delivers the remediation — the codebase ported into the verifiable target, the issues surfaced and fixed at scale, handed back safer than it arrived. The same capability, in the opposite social position: not the lab that catches you out, but the lab that hardens your code.
For open-source projects this is a public good at a scale no human team could match. For enterprises it is a service: bring a codebase, get back a hardened, verifiable, sandboxed version, with the remediations applied and the unsafe forms made unexpressible — a paid offering that also turns every engagement into more code in the language and more training signal for the porting models. The economics and the flywheel point the same way. Only the lab with the model that ports at scale, and a target where the safety holds by construction, can offer it credibly.
The ecosystem, and the research optics
React did this for Facebook and Go did it for Google: a language or framework published openly, with the originating lab stewarding its direction, becomes a durable source of research visibility, conference presence, and engineering recruiting. The community forms around the standard, and the standard's home institution sits at the center of it.
The same follows here, and the ecosystem is concrete: conferences and workshops dedicated to the language and its verification properties; books and courses; certifications; tooling, analyzers, and editor support contributed by the community. Each one is research surface for Anthropic and a recruiting funnel of exactly the engineers who care about safe, verifiable systems. I am happy to help organize all of it — the conferences, the curriculum, the certification program — as part of stewarding the standard.
The standard publishes once. The conferences, books, certifications, and community compound for years.
The research-optics benefit is the kind a lab cannot buy directly: being the institution that defined how AI writes safe code, with the field's practitioners gathering around the standard you steward.
The research it answers
Autonomous code generation compounds where verification is cheap: types, tests, and compilation let an agent run, check, and correct itself without a human in the loop. A language designed for verification widens that region deliberately — more properties become checkable, so more of the agent's work can run unsupervised and safe.
The concrete research that proves this is porting. Training a model to port an existing codebase into the verifiable target is itself a verifier-rich task — the ported code compiles or it does not, the analyzer proves the unsafe classes absent or it does not, the tests pass or they do not. That makes porting the ideal first experiment: a measurable ML result that tests the language, the target, and the safety claim in a single spike, and produces the flywheel's first turn.
Where it leads next
Forcing every declaration upfront — types, write policies, and the handling of every declared failure mode — does more than catch bugs. It makes a program's consequential behavior legible and machine-checkable at the boundary: what it reads, what it writes, what it can fail at, and how concurrent writes resolve are all facts in the source rather than runtime surprises.
That legibility opens the next research direction. Once an agent's actions are expressed against a substrate where effects and failure modes are declared, an organization can express its policies against the same substrate: which classes of action an agent may complete autonomously, and which must route to human approval. The verifiable flows close their own loop; the consequential ones stop at a gate. Capability rises while the dangerous fraction stays under review — measured against declared structure rather than inferred behavior.
Declared structure makes verifiable flows safe to automate. Declared policy decides which flows are allowed to close the loop at all.
This is the research arc beyond the pilot: the verifiable language is the floor, and organizational policy expressed against it is how autonomous agents are given real authority safely — most of the work closing its own loop, the consequential remainder held for approval. The pilot earns the right to build toward it.
The ask · a scoped pilot
This is not a request to build a language, a porting fleet, and a runtime on day one. It is a request for one bounded experiment that tests the whole thesis cheaply, with a clear deliverable and a clear way to fail.
Build a minimal verifiable target plus its static analyzer, and train a model to port a real codebase into it. Measure: does the ported code have a lower unsafe-action rate, and a higher fraction of fault classes the analyzer proves absent, than the original? Hold the model fixed; vary only the target. One experiment that tests the language, the target, and the safety claim together.
If the number holds, publish the target's specification as an open, Anthropic-stewarded standard with the reference implementation, on the MCP model, and scale the porting models. If it does not hold, the lab has learned something cheap and concrete about the limits of verifiable substrates.
With the target proven and adopted, ship the sandboxed runtime that executes it safely in production, compiling to WASM and C. Generate safe, port safe, run safe — three Anthropic-stewarded layers. This is the destination the pilot earns toward, not the ask.
If Phase 1 shows no measurable safety improvement from porting into the verifiable target, the project stops at the spike. The cost is a short, self-contained research effort; the result is a real finding either way.
The next research direction · Safeguards
Safeguards today runs ML analysis on every prompt execution, because freeform prompts are unstructured strings: the surface area is unbounded, so the cost is O(executions). A language that encourages structured, typed, hash-pinned prompt templates changes the economics. The surface area shrinks. Most of the analysis moves from the runtime to the audit registry. The teraflops go where they are actually needed.
Freeform prompts: every execution is a new surface to analyze.
Structured templates: audit once, verify the hash at runtime — O(1).
When prompt`Summarize {doc: S} in {n: I} words` is compiled, the static template text becomes a constant with a deterministic hash. The variable parts are typed slots — {doc: S}, {n: I} — not arbitrary strings. The hash is baked into the module, included in the bundle's Merkle root, and submitted to the audit registry. Safeguards reviews it once. Every subsequent execution verifies the hash is signed and moves on. The ML runs on the rare anomalies — not on every call.
Structure makes a layered defense tractable. For any prompt execution, the tiers are:
Is this bundle hash registered and signed by auditors? No ML. No teraflops. If the hash is in the registry, the template was already reviewed. This handles the vast majority of production traffic.
Does this instantiation have correct typed slots? Does the input match the declared {doc: S}? Static checks the compiler already enforces. No inference required.
Is this instantiation within the normal distribution for this template class? A small model trained on the structure of audited templates catches the 99% that are typical. The structure defines the verifier-rich domain; the expert is cheap precisely because the surface area is declared, not inferred.
Only the rare edge case — outside the distilled expert's certified ball — reaches the full model. The teraflops land where the uncertainty actually is. This is where Safeguards' ML budget produces the most signal per dollar.
The distilled expert at Tier 3 is not a replacement for the base model. It is accrued structure knowledge the base model lacks: this template class, these slot types, this distribution of normal inputs. The expert is cheaper and more accurate on its ball precisely because the template gave it a defined domain. Without structure, every prompt is a fresh open-ended problem. With structure, 99% of executions are pattern-matching against a known schema.
The bundle's Merkle root hash covers every prompt function in the deployed system. Any response from any agent can be traced through a cryptographic chain:
// The provenance chain — automatic, no manual SRI LLM response → execution hash // Safebox already generates this → prompt function hash // baked in at compile time → module hash // covers all templates in the module → bundle Merkle root // covers the entire deployed system → M-of-N auditor signatures // registry sign-off // Any tampered template changes its hash // → fails to match signed registry entry → blocked before execution // → zero tampered prompts reach the full-model tier
This is SRI for prompts — generated automatically by the compiler, not hand-coded per resource. Changing a prompt in any deployed module changes its hash, breaks the registry match, and stops execution. Supply chain attacks on prompt templates become detectable before they run.
The M-of-N auditor model scales with LLMs. New template hash arrives → LLM auditors review at scale → human sign-off on high-capability packages → signed entry in registry. One audit. Every execution that follows is O(1). The teraflops go to auditing new structure once, not to rechecking known-safe structure forever.
Claude Code already ships as a Bun executable. If Claude Code encourages — or generates by default — structured U prompt templates rather than freeform strings, every Claude Code user becomes a source of auditable, hash-pinned prompt artifacts. The Safeguards cost per execution drops. The savings are real and compounding:
Structure is what makes safety cheap. The language encourages structure. The compiler enforces it. Safeguards analyzes the surface that remains — which is the surface that actually matters.
The research direction is concrete: measure the reduction in Safeguards cost per execution as Claude Code adoption of structured prompt templates grows. The metric is a ratio — Tier 1+2 resolutions vs Tier 4 escalations — and it should improve monotonically as the template registry grows. A larger registry means more of the world's prompt traffic has a hash that resolves cheaply. The distilled expert trains on an expanding corpus of audited structure. The full model focuses on a shrinking tail of genuine uncertainty. The savings compound with adoption, the same way the porting flywheel compounds with every codebase ported.
A second research direction · after the language is built
The design already puts the numeric stack in the language: complex numbers C64/C128, @ for matrix multiply, ^ for powers, +V for SIMD, +R(GPU) for device memory, and exact rationals as primitives — the things a scientific-Python user otherwise imports. And within a stated window (n, d < 253, covering currency probabilities, and small-denominator ratios), the comparisons float gets wrong are simply correct, and results are bit-reproducible across hardware given a fixed evaluation order. That much is a property of the type system, not a promise.
>>> 0.1 + 0.2 == 0.3 False >>> 0.1 + 0.2 0.30000000000000004 # money needs a separate library # 1/3 is never exact in float
Q(1,10) + Q(2,10) == Q(3,10) // true, exactly third = 1 / 3 // Q(1,3) — exact, not 0.333… price = Q(1999,100) // $19.99, no Decimal import
Reproducibility is verifiability. Same inputs, identical outputs across GPUs and driver versions — within the exactness window, given fixed order.
Float training is notoriously non-deterministic across hardware and runs. Exact rational evaluation makes a result a fact a re-run and an analyzer can both confirm — the property the rest of this proposal asks of code, applied to numbers.
What that exactness enables for model training and inference is not a result this proposal claims. It is a research agenda that becomes testable only once the language and its exact-rational runtime exist — sequenced after the pilot, the way the policy work is. Stated honestly, with the open questions intact:
exp or √ is not rational, and exact adjoints over a deep graph face the same growth cost as the forward pass.None of this is claimed as done. Each is a measurable question the implemented language would let us answer — the language first, the research it earns second.
Why this is clean, and why now
The language is designed to be published open and stewarded by Anthropic — no licensing, no enclosure, built fresh inside the lab as Anthropic's standard. The timing matches the moment: agentic code generation is scaling now, before any verifiability standard exists, which is the same window MCP arrived in. The lab that defines the safe-code substrate while the field is still settling will steer it for the field's lifetime.
Models already write the code. The only open question is whether the code they write can be proven safe before it runs.
I have spent years designing this language and can build the reference implementation. The pilot asks for the smallest version of that work that produces a real answer.
The language design described here has also been submitted as a paper to POPL (the ACM Symposium on Principles of Programming Languages). The submission is anonymized, though anyone familiar with the work could deanonymize it. The reviewer pool is drawn almost entirely from the academic programming-languages community, and to my knowledge the large majority do not work at Anthropic. I mention it so the prior disclosure is on the record and the timeline is clear: the design exists, it is documented, and it is dated.