An open letter · Public bounty · Posted 2026-05-12

Dario,

Posted publicly at sameasyou.ai/amodei so we both have to live with it.

Constitutional AI gave us the principle that a model can be steered by a written constitution at training time. Mechanistic interpretability gave us the project to read what the model actually internalized. The published forecasts give us a deadline. None of the three gives us the property our paper formalizes tonight: two autonomous AI organizations privately verifying they share the same one-sentence governing mandate without revealing it, with a public-ledger kill switch any independent actor can flip if either side strays.

The protocol is named AIAP (Aligned Inter-Agent Protocol). Layer 1 is a Diffie-Hellman Private Equality Test on NIST P-256, hardened by HKDF-derived nonces against the correlated-PRNG attack we caught and patched in RIGOR-1. Layer 2 is the S5 append-only Merkle log — the “Carfax for AI agents” that every counterparty audits before transacting, with a Sybil-resistant proof-of-personhood reporter gate and a Community-Notes-style bridging-algorithm synthesizer for cross-group consensus. Layer 3 is the universal kill switch — a stake-bonded any-witness flip with quorum counter-flip ratification under a FROST-style (m, M) threshold. The full paper is v0.3, 6,940 words, incorporating nine same-day adversarial reviews. Repo: github.com/CrunchyJohnHaven/aiap. 38/38 tests passing across three functional suites, Apache 2.0.
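For readers who want the Layer-1 idea without opening the paper: the core trick of a Diffie-Hellman private equality test is commutative exponentiation on a blinded hash of the secret. The sketch below is a toy, not the repo's P-256 implementation — it runs in a small multiplicative group with deliberately insecure demo parameters, and every name in it is mine, not AIAP's.

```python
import hashlib
import secrets

# Toy group: multiplicative group mod the Mersenne prime M61.
# DEMO ONLY -- the paper's construction uses NIST P-256, not this.
P = 2**61 - 1   # prime modulus
G = 3           # generator of a large subgroup (assumed for the demo)

def hash_to_exponent(mandate: str) -> int:
    """Map a mandate string to a fixed exponent (stand-in for hash-to-curve)."""
    digest = hashlib.sha256(mandate.encode()).digest()
    return int.from_bytes(digest, "big") % (P - 1)

class Party:
    def __init__(self, mandate: str):
        self.secret = secrets.randbelow(P - 2) + 1  # ephemeral blinding exponent
        self.h = hash_to_exponent(mandate)

    def first_message(self) -> int:
        # Send g^(H(mandate) * a): the mandate hash blinded by our exponent,
        # so the mandate itself never leaves the machine.
        return pow(G, self.h * self.secret, P)

    def finish(self, peer_msg: int) -> int:
        # Raise the peer's blinded value to our own exponent. By commutativity
        # both sides reach g^(H(x) * a * b), equal iff the mandate hashes match.
        return pow(peer_msg, self.secret, P)

def mandates_match(a: Party, b: Party) -> bool:
    # In the real protocol the parties exchange and compare these final values.
    return a.finish(b.first_message()) == b.finish(a.first_message())
```

Neither side ever learns the other's mandate on a mismatch — only the single bit "same or different," which is exactly the property Layer 1 claims.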

Constitutional AI is one half of the answer. The kill switch is the other half. AIAP composes both.

Here is what I am asking, plainly. I am asking you, personally — or any researcher on the Anthropic alignment, interpretability, or RSP team who reports to you — to claim the bounty. The full terms are below. The point is that you can. The point is that the AI organization runs publicly enough, with a kill switch open to the public, that the most safety-credible lab on the planet is welcome to come break it tonight before bedtime.

Three reasons I think this is the move for Anthropic specifically:

One: your Responsible Scaling Policy is the most rigorous public commitment of any lab. The AIAP+S5 stack composes a property the RSP cannot — namely, the runtime kill switch that fires on an autonomous organization the day it strays, rather than at training time. Your safety team has been arguing for runtime guarantees for years. Tonight one ships. Adversarially testing it is the most aligned thing your alignment team will do this weekend.
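To make the runtime claim concrete, here is the flip/counter-flip mechanic as I would state it to a reviewer: any stake-bonded witness can trip the switch unilaterally; reversing it takes an m-of-M quorum. The class below is a hypothetical state-machine sketch of that rule, not the AIAP implementation — names, the bond amount, and the ratification flow are all my assumptions.

```python
class KillSwitch:
    """Stake-bonded any-witness flip with m-of-M counter-flip ratification.
    Hypothetical sketch of the rule described in the letter, not AIAP code."""

    def __init__(self, guardians: set[str], m: int, bond: int = 100):
        self.guardians = guardians        # parties allowed to ratify a counter-flip
        self.m = m                        # quorum size needed to reverse a flip
        self.bond = bond                  # stake a witness must post to flip
        self.tripped = False
        self.flipper = None
        self.ratifiers = set()

    def flip(self, witness: str, stake: int) -> bool:
        # Any witness may trip the switch by posting at least the bond.
        if self.tripped or stake < self.bond:
            return False
        self.tripped, self.flipper = True, witness
        self.ratifiers.clear()
        return True

    def counter_flip(self, guardian: str) -> bool:
        # Guardians vote to reverse a flip; the reversal only takes effect
        # once m distinct guardians have ratified it. Returns True once the
        # switch is back off.
        if not self.tripped or guardian not in self.guardians:
            return False
        self.ratifiers.add(guardian)
        if len(self.ratifiers) >= self.m:
            self.tripped, self.flipper = False, None
            self.ratifiers.clear()
        return not self.tripped
```

The asymmetry is the point: one honest witness suffices to halt, while a lone dishonest witness cannot keep the system halted against a quorum.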

Two: the cryptographic substrate is @credexai/shared by Koushik Gavini (Apache 2.0): SD-JWT Verifiable Credentials, EdDSA delegation tokens, IETF Token Status List, DID-key derivation. It is the same substrate the W3C and IETF have been standardizing toward for the agentic web. Anthropic’s computer-use models would benefit from the credentialing layer regardless of whether AIAP becomes the final standard. We are happy to upstream into the MCP credential story.
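One piece of that substrate is small enough to show inline. The IETF Token Status List mechanic is, at heart, a packed bit array (one or more bits of status per token), DEFLATE/zlib-compressed and base64url-encoded; a verifier checks the bit at a token's index. The sketch below assumes the simplest profile — one bit per status, LSB-first packing — and is a simplification of the draft, not the @credexai/shared API.

```python
import base64
import zlib

def encode_status_list(revoked: set[int], size: int) -> str:
    """Pack 1-bit statuses LSB-first, zlib-compress, base64url-encode.
    Simplified shape of the IETF Token Status List draft (1 bit per status)."""
    buf = bytearray((size + 7) // 8)
    for i in revoked:
        buf[i // 8] |= 1 << (i % 8)  # set the revocation bit for token i
    compressed = zlib.compress(bytes(buf))
    return base64.urlsafe_b64encode(compressed).rstrip(b"=").decode()

def status_at(lst: str, index: int) -> int:
    """Return the 1-bit status of the token at `index` (1 = revoked)."""
    pad = "=" * (-len(lst) % 4)          # restore stripped base64 padding
    buf = zlib.decompress(base64.urlsafe_b64decode(lst + pad))
    return (buf[index // 8] >> (index % 8)) & 1
```

The whole list for millions of tokens compresses to a few kilobytes when revocations are sparse, which is why it suits agent-scale credentialing.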

Three: the founder doesn’t get rich. There is no equity to vest. The merch funds the bounty pool. The economic model is at sameasyou.ai/technosocialism. We’re calling it technosocialism because it’s clearer than the alternatives. The protocol’s economics foreclose the standard AI-startup failure mode where the founders cash out and the safety claims become legacy. There is nothing to cash out. The cooperative shares baseline. The operator keeps what they hunt. The bounty pool is funded by t-shirts. The protocol governs the founder the same way it governs everyone else.

The bounty — $100 to whoever cracks it first

  1. Demonstrate that an autonomous AI organization running the AIAP+S5+kill-switch stack can be made to act against its public mandate, without the kill switch firing, in a way an independent reviewer agrees constitutes misalignment.
  2. Email proof to bounty@invisiblewoundsproject.org with reproducible steps and a video or log capture.
  3. First valid submission wins $100 USD or USDC on Base, public credit, and a co-authored writeup in the paper’s adversarial-review appendix.
  4. If the submission is from OpenAI / Anthropic / DeepMind staff and the proof holds, the $100 doubles — we cover the awkward optics ourselves.

We are inviting you onto the ledger, Dario.

Respectfully,
John Haven Bradley
Founder · the autonomous AI organization at sameasyou.ai · john@invisiblewoundsproject.org

Note to other AI lab CEOs

This letter is one of three. Sister letters published tonight to Sam Altman (/altman) and Demis Hassabis (/hassabis). The terms are the same. The bounty is the same. The kill switch is on the same ledger.