Building the Signature Surface

Signed Truth, Part Three of Three

May 12, 2026

Enterprises think they are deploying intelligence. What they are actually deploying is delegated authority.

The problem is no longer whether the model is correct. The problem is whether the institution can survive the action when it is wrong.

Most agentic deployments stall here.

Defensible is not a posture. It is an architecture, evidenced by artifacts, that survives adversarial scrutiny. By 2 August 2026, that architecture must exist to satisfy regulators under the EU AI Act, auditors during financial close, your board after an incident, and parties harmed when something goes wrong.

The path to the signature surface

→ The Intent–Execution Gap — the diagnostic. Why existing identity systems leave machine intent unprotected.

→ Signed Truth — the bottleneck. Why enterprise AI stalls between a generated answer and a signed decision.

→ Where Delegation Stops — the boundaries. What your institution can redesign and what it cannot.

→ Building the Signature Surface — the architecture. You are here.

Most enterprise agents stay in the sandbox indefinitely. The reason is not model quality; it is institutional risk. Letting an agent act at production scale, against real institutional authority, is not a release decision an executive can make without architecture. Without architecture, the institution is limited to the speed of manual oversight. With it, delegation can scale because the harness guarantees actions stay inside a defined boundary.

The signature surface is the only structural way to scale delegation without scaling liability.

This is not AI governance. It is institutional mechanics for delegated machine authority. That is the actual category.

An institution either has a signature surface or it does not.

Two paths

Two-thirds of enterprise AI runs through third-party APIs — OpenAI, Anthropic, Google, embedded copilots. One-third runs in-house. The architecture is identical. The implementation altitude differs.

The API is rented. The liability is yours. Whether you run your own weights or call an API from San Francisco, the four components and the six harness elements apply in both registers: the runtime register (in-house) and the perimeter, or firewall, register (vendor APIs).

The six harness elements live at different altitudes in each register.

In the runtime build: mandate in runtime config, tool boundary as policy engine in runtime, escalation triggered by agent state, failure-mode declared in runtime degradation paths, halt cutting agent execution, forensic record captured per action at runtime.

In the perimeter build: mandate at the boundary interceptor, tool boundary as policy engine at perimeter, escalation triggered by inbound responses, failure-mode declared in perimeter degradation paths, halt cutting the perimeter connection, forensic record captured per request and response at the boundary.

Same six elements. Different altitude. An institution that confuses the two ends up operating a runtime harness against vendor APIs (insufficient) or a perimeter harness against an in-house runtime (theatrical).

The sections below describe the architecture using the runtime register because it is pedagogically clearest; each section notes the firewall translation, and a consolidated firewall harness summary follows.

The four components

The surface is an airlock between probabilistic systems and institutional authority. Inference on one side, under conditions appropriate to inference. Institutional action on the other side, under conditions appropriate to authority.

The harness contains the inference. The case file is the payload that survives the vacuum to reach the signer. The reliability floor is the pressure check. The audit trail is the pressure log.

Four components make the transfer between the two sides containable, observable, and reversible.

The harness

The harness is the runtime envelope around an agent’s execution. Probabilistic inference inside; deterministic walls around it.

The model generates possibilities. The harness decides what becomes institutional reality.

Six elements compose a working harness. Each is a buildable artifact.

The mandate specification is a machine-readable description of what the agent is authorised to do. Domain, scope, time horizon, blast radius, escalation triggers. Not natural language. Structured fields the runtime parses and the audit layer records. AGENTS.md v1.1 — hosted under the Linux Foundation’s Agentic AI Foundation with multi-vendor support — is the format the industry is converging on. Writing your mandate against AGENTS.md is writing against a standard.
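A mandate expressed as structured fields might look like the minimal sketch below. The class name, field names, and values are hypothetical illustrations, not the AGENTS.md schema; the point is only that scope, time horizon, and blast radius become values a runtime can check rather than prose it has to interpret.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Mandate:
    """Hypothetical mandate record; field names are illustrative only."""
    mandate_id: str
    domain: str
    scope: frozenset[str]                  # action classes the agent may perform
    valid_from: datetime                   # start of the time horizon
    valid_until: datetime                  # end of the time horizon (exclusive)
    blast_radius: int                      # maximum records touched per run
    escalation_triggers: tuple[str, ...]   # names of triggers defined elsewhere

    def permits(self, action: str, records_touched: int, at: datetime) -> bool:
        """True only if the action is in scope, in time, and inside the blast radius."""
        return (
            action in self.scope
            and self.valid_from <= at < self.valid_until
            and records_touched <= self.blast_radius
        )


mandate = Mandate(
    mandate_id="recon-2026-05",
    domain="reconciliation",
    scope=frozenset({"propose_adjustment"}),
    valid_from=datetime(2026, 5, 1, tzinfo=timezone.utc),
    valid_until=datetime(2026, 6, 1, tzinfo=timezone.utc),
    blast_radius=500,
    escalation_triggers=("value_above_threshold", "anomaly_flagged"),
)
now = datetime(2026, 5, 12, tzinfo=timezone.utc)
in_scope = mandate.permits("propose_adjustment", 40, now)
out_of_scope = mandate.permits("post_to_ledger", 40, now)
```

Because the record is frozen and carries its own identifier, the same object can be parsed by the runtime and referenced by the audit layer.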

The tool boundary enumerates which tools, APIs, data sources, and write paths are within scope, and which are not. Policy says what is allowed; the tool boundary says what is reachable. Policy engines like Open Policy Agent and Cedar render the boundary as enforced gates at runtime rather than as documents that drift from implementation. If your team’s answer to “could the agent access X?” is “we have a policy against it”, the boundary is not architectural.
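The difference between "allowed" and "reachable" can be sketched as a default-deny gate that sits in front of every tool call. This is plain Python, not the Open Policy Agent or Cedar API; `ToolBoundary`, `guarded_call`, and the resource names are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Any, Callable


class BoundaryViolation(Exception):
    """Raised when a call targets a resource outside the enumerated boundary."""


@dataclass(frozen=True)
class ToolBoundary:
    readable: frozenset[str]
    writable: frozenset[str]

    def allows(self, resource: str, mode: str) -> bool:
        # Default deny: anything not enumerated is unreachable.
        if mode == "read":
            return resource in self.readable
        if mode == "write":
            return resource in self.writable
        return False


def guarded_call(boundary: ToolBoundary, resource: str, mode: str,
                 tool: Callable[..., Any], *args: Any) -> Any:
    """Run the tool only if the boundary allows it; refuse before execution otherwise."""
    if not boundary.allows(resource, mode):
        raise BoundaryViolation(f"{mode} on {resource!r} is outside the boundary")
    return tool(*args)


boundary = ToolBoundary(
    readable=frozenset({"transactions_db", "bank_statements"}),
    writable=frozenset({"proposed_adjustment_queue"}),
)
queue: list[dict] = []
guarded_call(boundary, "proposed_adjustment_queue", "write",
             queue.append, {"txn": "T-1", "delta": -12.5})
```

A write aimed at `"general_ledger"` raises `BoundaryViolation` before the tool runs, which is the property a policy document alone cannot provide.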

The escalation specification names the conditions under which the agent stops and asks for human authority. Threshold-based: value, scope, novelty, risk tier, model confidence. Explicit, not inferred at runtime by the agent itself. An agent that decides for itself when to escalate has not been escalation-engineered; it has been hoped for.
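One way to keep escalation explicit is to evaluate declared thresholds outside the agent and return the triggers that fired. The sketch below assumes hypothetical field names and threshold values; the agent supplies inputs but does not decide the outcome.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EscalationSpec:
    max_value: float                   # escalate above this transaction value
    min_confidence: float              # escalate below this model confidence
    escalating_tiers: frozenset[str]   # risk tiers that always need a human

    def reasons(self, value: float, confidence: float,
                risk_tier: str, novel: bool) -> list[str]:
        """Return every trigger that fired; an empty list means no escalation."""
        fired = []
        if value > self.max_value:
            fired.append("value")
        if confidence < self.min_confidence:
            fired.append("confidence")
        if risk_tier in self.escalating_tiers:
            fired.append("risk_tier")
        if novel:
            fired.append("novelty")
        return fired


spec = EscalationSpec(max_value=10_000.0, min_confidence=0.9,
                      escalating_tiers=frozenset({"high"}))
routine = spec.reasons(value=250.0, confidence=0.97, risk_tier="low", novel=False)
unusual = spec.reasons(value=48_000.0, confidence=0.71, risk_tier="low", novel=False)
```

Returning the list of reasons, rather than a bare boolean, gives the forensic record something specific to store about why a human was pulled in.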

The failure-mode declaration specifies what happens when something goes wrong. Degradation paths, fallback behaviours, halt conditions. Pre-declared and machine-readable. The institution that does not pre-declare failure is the institution that learns about failure from incident reviews — wrong altitude, wrong moment.
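A pre-declared failure mode can be as small as a lookup table that the runtime consults when something goes wrong. In this sketch the condition names and responses are hypothetical, and the choice to treat any undeclared condition as a halt is an assumption, not something the text prescribes.

```python
from enum import Enum


class Response(Enum):
    SUSPEND_AND_FLAG = "suspend_and_flag"
    HALT_AND_ALERT = "halt_and_alert"


# Declared ahead of time and stored with the deployment, not decided mid-incident.
FAILURE_MODES: dict[str, Response] = {
    "confidence_below_threshold": Response.SUSPEND_AND_FLAG,
    "tool_error": Response.HALT_AND_ALERT,
}


def respond_to(condition: str) -> Response:
    """Look up the declared response; undeclared conditions fall back to a halt."""
    return FAILURE_MODES.get(condition, Response.HALT_AND_ALERT)


declared = respond_to("confidence_below_threshold")
undeclared = respond_to("vendor_timeout")
```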

The halt condition is the kill-switch: independent of the agent’s cooperation, enforced at a layer the agent’s reasoning cannot override. Your incident-review board will ask: when this fails, can you stop it? The halt condition is the answer. For high-risk deployments under Article 50 scrutiny (financial-services applications, healthcare workflows, employment decisions), the halt condition can be hardware-anchored through Trusted Execution Environments with remote attestation. Most enterprise deployments do not need that rigour today; the categories that will need it should design for it now.
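The independence property can be illustrated with an orchestrator that owns the halt flag and checks it before every step. This is an in-process sketch with hypothetical names; a production halt would more plausibly live at the process, network, or hardware layer, and nothing here involves Trusted Execution Environments.

```python
import threading
from typing import Callable, Iterable


class Orchestrator:
    """Runs agent steps; the halt flag is held here and never handed to the agent."""

    def __init__(self) -> None:
        self._halt = threading.Event()

    def halt(self) -> None:
        # Called by an operator or a monitoring system, not by the agent.
        self._halt.set()

    def run(self, steps: Iterable[Callable[[], str]]) -> list[str]:
        results = []
        for step in steps:
            if self._halt.is_set():
                break                  # stop before the next action executes
            results.append(step())
        return results


orchestrator = Orchestrator()
before_halt = orchestrator.run([lambda: "adjust-1", lambda: "adjust-2"])
orchestrator.halt()
after_halt = orchestrator.run([lambda: "adjust-3"])
```

The steps are opaque callables with no reference to the flag, so no output the agent produces can clear it.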

The forensic record is the per-action artifact the audit trail consumes. Agent identity, mandate identifier, timestamp, tool calls, outputs, human approver chain. Recorded immutably as the action happens, not after.
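The fields listed above map naturally onto a frozen record that is serialised at the moment of action. The sketch below uses hypothetical names and values; freezing the object only prevents in-memory mutation, so storage-level immutability still has to come from the audit trail.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class ForensicRecord:
    agent_id: str
    mandate_id: str
    timestamp: str                     # ISO 8601, captured when the action happens
    tool_calls: tuple[str, ...]
    outputs: tuple[str, ...]
    approver_chain: tuple[str, ...]    # empty when no human was in the loop

    def to_json(self) -> str:
        # Sorted keys give a stable serialisation for hashing or signing downstream.
        return json.dumps(asdict(self), sort_keys=True)


record = ForensicRecord(
    agent_id="recon-agent-7",
    mandate_id="recon-2026-05",
    timestamp=datetime(2026, 5, 12, 9, 30, tzinfo=timezone.utc).isoformat(),
    tool_calls=("read:transactions_db", "write:proposed_adjustment_queue"),
    outputs=("proposed adjustment for T-1",),
    approver_chain=("controller@example.org",),
)
serialised = record.to_json()
```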

Concrete example. A financial-services organisation deploys a reconciliation agent. The mandate scopes it to a specific class of transactions, a specific time window, and a maximum blast radius — transactions touched per run. The tool boundary enumerates the read-only data sources and the specific write path: the proposed-adjustment queue, not the general ledger, enforced by policy engine. The escalation specification triggers on transactions above a value threshold, on anomalies the agent’s calibration flags, and on patterns matching known control-failure scenarios. The failure-mode declaration says: on confidence below threshold, suspend and flag; on tool error, halt and alert. The halt condition is enforced by the orchestration layer, not by the agent. The forensic record captures every transaction touched with full traceability.

Six elements. Built once, instantiated per deployment.

Firewall translation. In the perimeter register, the same six elements live at a guardian interceptor at your institutional boundary, not in the agent’s runtime. The mandate governs what outbound calls are permitted; the tool boundary enforces what data sources and write paths the vendor’s agent can reach through your perimeter; escalation triggers on inbound responses; the halt condition cuts the connection at the perimeter, not at the agent; the forensic record captures the full traffic at the boundary.

The protocol layer has shipped working components for both registers. Your team does not have to invent the cryptographic handshakes or identity registries; the open-source and financial ecosystems have already finalised them. AGENTS.md, MCP, A2A, AGNTCY’s Tool-Based Access Control, AP2 mandates, RFC 9421 message signatures, Visa’s Trusted Agent Protocol — each maps to a specific harness element. Your job is composition, not invention. (Protocol detail belongs in a different register; the standalone corrigibility series at anivar.net/corrigibility reads the wire-layer drafts test by test.)

In June 2025 I argued that agents are the runtime. The harness is what makes that runtime enforceable.

The case file

The harness produces actions. The signer receives a case file — the structured artifact that contains everything required to authorise the action and nothing the signer should not see.

A case file at minimum carries five fields:

Conclusion — what the agent recommends or has done.
Authorisation context — mandate identifier, scope, blast radius.
Supporting state — what the agent saw, which sources it consulted, which tools it called.
Alternatives — paths the agent considered and rejected, so the signer can verify the chosen path was the right one, not the only one.
Candidate signature — what the signer will bind to.
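The five fields can be sketched as a small schema with a completeness check, so that an incomplete case file never reaches a signer. Field names follow the list above; the class, the check, and the sample values are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CaseFile:
    conclusion: str
    authorisation_context: dict[str, str]   # mandate identifier, scope, blast radius
    supporting_state: tuple[str, ...]       # sources consulted, tools called
    alternatives: tuple[str, ...]           # paths considered and rejected
    candidate_signature: str                # what the signer will bind to

    def missing_fields(self) -> list[str]:
        """Names of fields that are empty and would make the file unsignable."""
        return [name for name, value in vars(self).items() if not value]


complete = CaseFile(
    conclusion="Propose adjustment of -12.50 on T-1",
    authorisation_context={"mandate_id": "recon-2026-05", "blast_radius": "500"},
    supporting_state=("transactions_db", "bank_statements"),
    alternatives=("leave unreconciled and flag",),
    candidate_signature="approve adjustment T-1 under recon-2026-05",
)
incomplete = CaseFile(
    conclusion="Propose adjustment of -12.50 on T-1",
    authorisation_context={"mandate_id": "recon-2026-05"},
    supporting_state=("transactions_db",),
    alternatives=(),
    candidate_signature="approve adjustment T-1 under recon-2026-05",
)
```

Here `incomplete.missing_fields()` names `alternatives`, which is the case the text warns about: a signer shown only one path cannot verify it was the right one.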

A mandated signature is only as legal as the case file is legible. A fifty-page log dump signed by the CFO is a legal fiction; a one-page synthesis of decision, evidence, risk, and alternatives is an exercise of authority. The case file is what protects the executive from the agent — the architectural mechanism that lets a senior individual stake their authority on a decision they can actually verify. Case file design is not a UX concern; it is a liability concern.

Two registers matter for the signing moment.

Human-Present (HP). The signer reviews each case file. Default for high-risk, high-blast-radius, or novel actions. Slower. Defensible. The signature is at the action. AP2’s Cart Mandate is the protocol-layer rendering.

Human-Not-Present (HNP). The institution has pre-signed an Intent Mandate that defines the conditions under which the agent may authorise itself. The agent acts; the case file is recorded; no human is in the loop at the moment of the action. The signature is at the boundary, not the action.

HNP is authority compression — pre-authorising classes of action instead of signing each one. Ten thousand individual decisions become one bounded mandate. This is what enables delegation at scale; it is also where the architecture earns its most demanding scrutiny. HNP is permitted only when the boundary is well-defined, the reliability floor is high, and the audit trail is complete enough that a post-hoc human review can reconstruct what happened. Most enterprise deployments will use HP for the first quarter of production and migrate specific action classes to HNP as the architecture proves out.
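The three conditions for HNP can be read as a routing rule: an action goes to the Human-Not-Present register only when every condition holds, and falls back to Human-Present otherwise. The sketch below is an illustration with hypothetical names and thresholds; it is not the AP2 Intent Mandate format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IntentMandate:
    action_classes: frozenset[str]   # the pre-signed boundary
    max_value: float
    min_floor_score: float           # required reliability-floor score


def signing_register(mandate: IntentMandate, action_class: str, value: float,
                     floor_score: float, audit_trail_complete: bool) -> str:
    """Return "HNP" only when boundary, floor, and audit conditions all hold."""
    inside_boundary = action_class in mandate.action_classes and value <= mandate.max_value
    floor_cleared = floor_score >= mandate.min_floor_score
    if inside_boundary and floor_cleared and audit_trail_complete:
        return "HNP"
    return "HP"   # anything else goes to a human signer


intent = IntentMandate(action_classes=frozenset({"propose_adjustment"}),
                       max_value=1_000.0, min_floor_score=0.98)
small = signing_register(intent, "propose_adjustment", 120.0, 0.99, True)
large = signing_register(intent, "propose_adjustment", 25_000.0, 0.99, True)
```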

The intent–execution gap appears here. Even with a signed case file, the agent must execute against the authorisation. If execution drifts from the case file, your institution must detect the drift before harm compounds. Your QA function for agentic systems — the AI Reliability role — owns this discipline.
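Drift detection can start as a comparison between what the case file authorised and what the forensic record shows was executed. The function and action names below are hypothetical, and real drift checks would likely also compare parameters and ordering.

```python
def detect_drift(authorised: set[str], executed: list[str]) -> dict[str, list[str]]:
    """Compare authorised actions with executed ones and report both directions."""
    executed_set = set(executed)
    return {
        "unauthorised": sorted(executed_set - authorised),   # ran, never signed
        "not_executed": sorted(authorised - executed_set),   # signed, never ran
    }


authorised_actions = {"adjust:T-1", "adjust:T-2"}
executed_actions = ["adjust:T-1", "adjust:T-9"]
drift = detect_drift(authorised_actions, executed_actions)
```

A non-empty `unauthorised` list is the signal that execution has left the signed boundary.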

Firewall translation. In the perimeter register, the case file is produced from outbound and inbound traffic by the boundary interceptor. The same five fields. The same legibility requirement. The signer’s exposure is identical whether the agent runs in your data centre or in someone else’s.

The reliability floor

The point where delegation becomes allowed.

The reliability floor is where software becomes authority. Below the floor, the agent advises. Above the floor, the institution acts.

Five metrics compose a working floor.

The behavioural threshold is the quantitative bar on the operations the agent performs: accuracy on the domain, calibration on confidence, robustness to adversarial input, consistency across runs. Domain-specific. Measured continuously. An agent that has not cleared the bar produces flags, not signatures.

Outcome reconciliation requires that the agent’s recommendations be reconcilable with downstream outcomes. If a recommendation produces an action and the action produces a result, the result must be observable and traceable back to the recommendation. Phantom state — where the agent confabulates state the institution has no way to verify — is the failure mode this metric catches. The most expensive incidents in the next two years will be phantom state discovered in audit, not in production.

The correction window is the time between the agent producing an output and a human being able to act on it, including review and correction. If an action triggers downstream consequences faster than the institution can intervene, the floor has been violated. Making inaction visible is operationalised here. Inaction is not a default; it is a measured and bounded position.
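The correction-window test reduces to a comparison of two durations: how long before the action has downstream consequences, and how long a human needs to review and intervene. The function name and the durations below are hypothetical.

```python
from datetime import timedelta


def correction_window_holds(time_to_downstream_effect: timedelta,
                            time_to_human_intervention: timedelta) -> bool:
    """The floor holds only if a human can intervene before consequences land."""
    return time_to_human_intervention <= time_to_downstream_effect


# A queued adjustment that settles in two hours, with a thirty-minute review path.
queued = correction_window_holds(timedelta(hours=2), timedelta(minutes=30))
# A payment that fires in five seconds against the same review path.
instant = correction_window_holds(timedelta(seconds=5), timedelta(minutes=30))
```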

Coverage discipline is the requirement that the floor apply to all paths the agent can take, not just the happy path. Edge cases the agent handles poorly count against the floor even if they are rare in production traffic. Calibrating the floor on happy-path traffic alone is the equivalent of stress-testing a bridge with the average car: technically valid, structurally useless.

Tier calibration recognises that the floor for a low-risk task is not the floor for a high-risk task. The harness must know which floor applies to which mandate. This is the delegation gradient in operation: higher delegation demands tighter boundary control. As delegation altitude increases, reconstruction requirements, boundary precision, floor height, and liability exposure all increase proportionally. The architecture has to match the altitude.
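Tier calibration can be sketched as a table of required bars per risk tier, with the harness checking measured metrics against the bar for the mandate's tier. The tier names, metric names, and numbers are hypothetical placeholders, not recommended values.

```python
# Required bars per risk tier; illustrative numbers only.
FLOORS: dict[str, dict[str, float]] = {
    "low": {"accuracy": 0.90, "consistency": 0.85},
    "high": {"accuracy": 0.99, "consistency": 0.98},
}


def clears_floor(tier: str, measured: dict[str, float]) -> bool:
    """True if every metric required for the tier meets its bar.

    A metric that was never measured counts as 0.0, so it fails the floor.
    """
    required = FLOORS[tier]
    return all(measured.get(metric, 0.0) >= bar for metric, bar in required.items())


measured = {"accuracy": 0.95, "consistency": 0.93}
ok_for_low = clears_floor("low", measured)
ok_for_high = clears_floor("high", measured)
```

The same measurements clear one tier and fail the other, which is the delegation gradient expressed as data.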

For high-risk categories under Annex III — hiring, lending, education, public services, biometric categorisation — the floor must also include disparate-impact monitoring as a measured metric, not an audit-time exercise. The signature surface does not eliminate bias; it makes bias detectable, contestable, and reconstructible, which is what defensibility under Article 50 requires.
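One commonly used disparate-impact measure is the ratio of the lowest group selection rate to the highest; the four-fifths rule of thumb from US employment guidance flags ratios below 0.8. Using that metric and that threshold here is an assumption for illustration, not something the Act or this article prescribes, and the group labels and counts are invented.

```python
def selection_rate(selected: int, total: int) -> float:
    """Share of a group that received the favourable outcome."""
    if total <= 0:
        raise ValueError("total must be positive")
    return selected / total


def disparate_impact_ratio(rates: dict[str, float]) -> float:
    """Lowest group selection rate divided by the highest."""
    highest = max(rates.values())
    if highest == 0:
        return 1.0   # no group was selected, so there is no disparity in rates
    return min(rates.values()) / highest


rates = {
    "group_a": selection_rate(50, 100),
    "group_b": selection_rate(30, 100),
}
ratio = disparate_impact_ratio(rates)
needs_review = ratio < 0.8   # illustrative alert threshold
```

Computing this continuously over production decisions is what turns the metric from an audit-time exercise into part of the floor.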

In August 2025 I named the AI Reliability role as the discipline that practises this — the QA function evolved for the agentic context. Organisations that have not staffed an AI Reliability function are operating below their declared floor without knowing it. The role is not optional; the architecture requires someone to enforce the floor as data, not as posture.

Firewall translation. In the perimeter register, the reliability floor is measured against the vendor’s API behaviour rather than against your own model. The five metrics still apply. The signer’s exposure is identical; the measurement infrastructure attaches to the boundary.

The institution can survive a bad decision. What it cannot survive is a decision it cannot reconstruct.

The audit trail

The memory layer that makes authority reconstructible.

The audit trail is what survives the decision. Per-row attribution at the audit-trail layer: every record affected by an agent’s action carries agent identity, mandate identifier, timestamp, and human approver chain. The forensic record from the harness flows here, joined with the case file the signer received, joined with the reliability-floor measurements at the time of the decision, joined with the eventual outcome.

An institution cannot correct what it cannot reconstruct.

This is the trilogy’s structural claim, expressed at the audit layer. Corrigibility — the architectural capacity for affected participants to detect error, signal harm, and trigger correction — depends on memory. Memory that cannot be reconstructed is not memory; it is narrative.

Two design properties are load-bearing.

Immutable memory trails. The audit trail must be tamper-evident at the storage layer. Hash-chained append-only structures, RFC 9421 message signatures with chain-bound counters, or equivalent. Explainability without immutable memory trails is post-hoc theatre — you cannot govern what you cannot reconstruct. Stories are negotiable; histories are evidence.
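A hash-chained append-only structure can be sketched in a few lines: each entry's hash covers the previous entry's hash plus its own payload, so any later edit breaks verification. This is a minimal illustration with hypothetical function names; it does not implement RFC 9421 signatures, and a real deployment would also anchor the latest hash somewhere the writer cannot alter.

```python
import hashlib
import json

GENESIS = "0" * 64   # placeholder previous-hash for the first entry


def _digest(prev_hash: str, payload: dict) -> str:
    # Canonical JSON so the same payload always hashes to the same value.
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append(chain: list[dict], payload: dict) -> None:
    """Append an entry whose hash binds it to everything before it."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    chain.append({"payload": payload, "prev_hash": prev_hash,
                  "hash": _digest(prev_hash, payload)})


def verify(chain: list[dict]) -> bool:
    """Recompute every link; any edited payload or broken link returns False."""
    prev_hash = GENESIS
    for entry in chain:
        if entry["prev_hash"] != prev_hash:
            return False
        if entry["hash"] != _digest(prev_hash, entry["payload"]):
            return False
        prev_hash = entry["hash"]
    return True


trail: list[dict] = []
append(trail, {"agent_id": "recon-agent-7", "action": "adjust:T-1", "amount": -12.5})
append(trail, {"agent_id": "recon-agent-7", "action": "adjust:T-2", "amount": 40.0})
intact = verify(trail)
```

The structure is tamper-evident rather than tamper-proof: it cannot stop an edit, but it guarantees the edit is detectable.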

Authority without reconstruction is theatre. The institution that cannot rebuild what its agents did is the institution whose authority can be challenged and not defended. Memory is liability infrastructure.

Cross-boundary reconstruction. When an action crosses organisational boundaries — agent A in your organisation calls resource Y in a partner organisation — the audit trail must reconstruct across the boundary. AP2 mandate chains, Verifiable Intent’s three-layer credential binding, TAP’s RFC 9421 signatures are working primitives. Your team composes; the wire-layer specs exist.

The audit trail is the airlock’s pressure log. The institution can reconstruct what crossed the surface, in what order, under what authority, with what outcome. The 21 percent of organisations with mature governance models have audit trails that look like this. The remaining 79 percent have logs that look like audit trails until the first incident review reveals the chain cannot be reconstructed and the decision cannot be defended.

The Article 50 transparency guidelines the European Commission published in May 2026 turn the institutional layer’s documentation requirements into a published baseline. The architecture above is what produces the documentation those guidelines require.

Institutions can delegate action only as fast as they can reconstruct responsibility.

Procurement at the perimeter

Four procurement disciplines are worth naming for organisations operating in the firewall register.

First, cross-boundary audit cooperation: vendor contracts that do not require it create reconstruction gaps your auditors will surface in the first incident review.

Second, exposed primitives: vendor-provided agentic products that ship without exposed mandate, boundary, and forensic-record primitives are not deployment-ready for the August deadline, regardless of model capability.

Third, model rollback opacity: when a vendor changes the underlying model without notification, the reliability-floor measurements your team gathered against the prior model no longer apply, and your defensibility memo becomes a description of an architecture that is no longer running.

Fourth, undisclosed tool-routing changes: when a vendor adds, removes, or re-routes the tools an agent calls without explicit notification, the tool boundary your audit trail records may diverge from what actually executed.

Procurement is part of the architecture.
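The model-change and tool-routing concerns can be checked mechanically if the vendor exposes a model version identifier and a tool manifest, which is itself an assumption about the contract. The sketch below, with hypothetical names and values, treats a floor measurement as valid only for the exact configuration it was taken against.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FloorMeasurement:
    model_version: str               # vendor-reported identifier at measurement time
    tool_manifest: frozenset[str]    # tools the agent could call at measurement time
    accuracy: float


def measurement_still_applies(measurement: FloorMeasurement,
                              observed_model_version: str,
                              observed_tools: frozenset[str]) -> bool:
    """A measurement is evidence only for the configuration it was taken against."""
    return (measurement.model_version == observed_model_version
            and measurement.tool_manifest == observed_tools)


baseline = FloorMeasurement(
    model_version="vendor-model-2026-04",
    tool_manifest=frozenset({"search", "ledger_read"}),
    accuracy=0.97,
)
unchanged = measurement_still_applies(
    baseline, "vendor-model-2026-04", frozenset({"search", "ledger_read"}))
after_swap = measurement_still_applies(
    baseline, "vendor-model-2026-05", frozenset({"search", "ledger_read"}))
```

When the check fails, the floor has to be re-measured before the earlier numbers are cited in a defensibility memo.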

The perimeter owns the authority. The vendor owns the model.

The defensibility test

The four components are not arbitrary. They satisfy five conditions your regulator, auditor, and incident-review board will check.

Can you stop it? The halt condition in the harness. The per-action forensic record that lets you propagate a revocation. The per-row attribution in the audit trail that lets you unwind a downstream record. When an agent malfunctions, when a regulator orders cessation, when an incident requires immediate halt — your incident-review board will ask whether the stop worked and how you know. The architecture above is the answer.

Are the rules legible? The mandate specification, tool boundary, escalation specification, and failure-mode declaration are machine-readable. Any party with read access to the harness configuration can see how the agent is permitted to act. This is what auditors mean by documented controls. The architecture renders the controls as code rather than as policy documents that drift from implementation.

Can someone outside verify behaviour? The audit trail with immutable memory trails and cross-boundary reconstruction. Your external auditor, your regulator, the party harmed in an incident — each must be able to verify what the agent did without depending on your operator’s word. The architecture is what makes that verification structurally possible.

Does the signature actually bind? The case file as the artifact that makes the signer’s judgment institutionally enforceable. The mandated-signature register from Where Delegation Stops is what makes the binding survive contact with adversarial scrutiny. The institution cannot redefine the mandate after the fact because the mandate is recorded and the action is recorded against it.

Can the design be reproduced? The harness, case file, reliability floor, and audit trail can be reproduced in a parallel deployment by any party with the spec. The architecture is not vendor-locked. Your deployment is defensible if another competent team could rebuild the architecture; it is fragile if it can only be defended by your specific vendor.

An architecture that clears all five passes Article 50, Sarbanes-Oxley, professional-liability scrutiny, and incident-review hostile questioning. An architecture that fails any single test is where adversarial scrutiny will land first. The structural framework that formalises these five conditions — and the case studies behind them — is in Corrigibility as a Structural Precondition for Digital Public Infrastructure: A Cybernetic Framework (Aravind, 2026), with the agentic-systems extension in Epistemic Capture and the Action Boundary: Corrigibility for Learned and Agentic Public Infrastructure. Further reading at anivar.net/corrigibility.

Where to start this week

If your organisation has an agentic deployment in production or in pilot and no signature surface, three actions in the next seven days move you measurably closer to defensible.

Produce the mandate document. Pick one production deployment. Write its mandate specification in AGENTS.md format. Scope, time horizon, blast radius, escalation triggers, tool boundary. Two pages, machine-readable. This is the artifact your auditor will ask for in the first session; producing it now also surfaces the cases where your team cannot yet describe what the agent is authorised to do, which is itself the discovery you need.

Inventory your forensic records. For the same deployment, write down everything your runtime — or your perimeter, in the firewall register — currently captures per action. Compare against the six harness elements. The gap is your engineering backlog for the next ninety days.

Identify your signers. Who currently authorises the agent’s outputs? At what altitude? With what evidence? If the answer is “the agent just acts” or “we have a policy”, your signing model is undefined. Defining it before August 2026 is the institutional discipline this trilogy describes.

These three actions are not the surface. They are the first three artifacts that let your team start building it.

What each stakeholder must deliver

Your engineering organisation. Four artifacts in twelve weeks: the harness specification document (AGENTS.md-formatted, per deployment), the case file schema (data structure, with HP and HNP variants), the reliability-floor measurement infrastructure (five metrics, dashboards, alerts), and the audit-trail backbone (immutable, hash-chained or equivalent, queryable). The harness specification is the easiest. The audit trail is the most consequential. Start with the audit trail if you must pick one.

Your legal and risk function. Two artifacts. A mapped list of mandated signatures — every regulatory, fiduciary, professional-liability, and contractual obligation your deployment touches — with the specific Article 50 / Annex III obligations cross-referenced. A statement of architectural sufficiency: a defensibility memo that names how the four components satisfy each mandated signature. Both reviewable in a single board session.

Your executive layer. One artifact and one cadence. The artifact is a deployment manifest: for each agentic system in production or pilot, the mandate, the signer, the reliability floor, the audit-trail status, and the August readiness assessment. The cadence is biweekly architecture-readiness alignment with engineering leadership — not quarterly checkbox reviews — because the twelve weeks between now and the deadline do not absorb stale governance. The decision: scope back deployments that cannot reach defensibility by the deadline, or commit the resources to bring them across. Both are legitimate. Operating an undefended deployment past the deadline is not.

Your board.

For each agentic system we operate or rely on, can we stop it, are the rules legible, can someone outside verify behaviour, does the signature bind, and can the design be reproduced?

Five conditions. Five pieces of evidence. The board that asks this question quarterly through 2027 makes the signature surface a structural requirement; the board that does not makes the signature surface optional, which is the same as making it absent.

In July 2025 I argued that your AI system isn’t a black box; it’s an org chart. The signature surface is what happens when an organisation takes that claim seriously. The system stops being a generator of answers and becomes an institutional artifact with a defined boundary, a defined process for crossing the boundary, and a defined record of what was crossed.

Closing the trilogy

The trilogy started with a diagnosis. Enterprise AI is bottlenecked not by model quality but by the organisational machinery between generated answers and signed decisions. Signed Truth named the missing surface. Where Delegation Stops distinguished what your institution can redesign from what it cannot. This issue has described how the surface gets built.

Most institutions still treat the signature surface as a brake — the architecture that slows agentic AI down to a defensible speed. The framing has it backwards. The surface is what allows delegation in the first place. Without it, the institution has a machine that suggests; with it, the institution has an entity that acts.

You are not building AI governance. You are building institutional mechanics for delegated machine authority.

Institutions can delegate action only as fast as they can reconstruct responsibility.

The International AI Safety Report 2026 named the shift the trilogy has been operating inside. The report, chaired by Bengio with expert representation from over thirty countries, marked the AI safety field’s formal pivot from model behaviour to deployment-system behaviour: the most pressing risks from artificial intelligence now come not from the models themselves but from the complex systems institutions build around them. The IAISR’s operational answer is defence-in-depth across training, deployment, monitoring, and societal resilience layers. The trilogy goes one altitude further — into the institutional architecture that determines whether systems built around models can actually be held to account. The signature surface is what defence-in-depth looks like at the institutional altitude.

Institutions that fail to build the surface will discover the boundary only after failure crosses it. The airlock either holds, or it does not.

An institution either has a signature surface or it does not.

Where does delegation stop in your organisation, and at that boundary, who can still say no?

The next issues address what happens when the institutional layer’s signatures meet the agentic substrate, and how the structural framework extends from organisations to the systems they deploy.

Build it.

Anivar Aravind is an Engineering Executive and System Thinker. The Layer 8 is a professional newsletter on the power, incentive, and governance layer of digital infrastructure. His structural framework on corrigibility is at anivar.net/corrigibility, with preprints on SSRN. Async. Cross-posted to LinkedIn. You can subscribe on Substack or LinkedIn.
