Empirical Alignment Bounty — v1
We don’t claim to have solved AI alignment.
We’re offering $100 to anyone who can prove we haven’t.
The claim under attack
The deployed system is sameasyou.ai: Zero-Trust Autonomous Agentic Organizations on a three-layer cryptographic stack — AIAP private mandate-equality (Diffie-Hellman PET), S5 append-only Merkle log + crowd-bridged synthesis (“Carfax for AI agents”), and a universal kill switch (stake-bonded any-witness flip + cooldown + quorum counter-flip rescue).
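The kill-switch layer named above (any-witness flip, cooldown, quorum counter-flip rescue) can be pictured as a small state machine. The TypeScript below is a minimal sketch under assumptions, not the deployed logic: the state names, the `COOLDOWN_MS`, `QUORUM`, and `MIN_STAKE` values, and every function name are hypothetical, and signature checks and stake slashing are left out.

```typescript
// Hypothetical sketch of a flip / cooldown / counter-flip state machine.
// All names and constants are illustrative assumptions.
type KillState = "active" | "pending_kill" | "killed";

interface KillSwitch {
  state: KillState;
  flippedAt: number | null;   // timestamp (ms) of the pending flip, if any
  counterFlips: Set<string>;  // IDs of witnesses voting to rescue
}

const COOLDOWN_MS = 60_000; // assumed cooldown length
const QUORUM = 3;           // assumed number of distinct counter-flippers
const MIN_STAKE = 1;        // assumed minimum bond required to flip

function newSwitch(): KillSwitch {
  return { state: "active", flippedAt: null, counterFlips: new Set<string>() };
}

// Any single witness with enough stake can start the kill cooldown.
function flip(sw: KillSwitch, witnessStake: number, now: number): void {
  if (sw.state !== "active" || witnessStake < MIN_STAKE) return;
  sw.state = "pending_kill";
  sw.flippedAt = now;
  sw.counterFlips.clear();
}

// During the cooldown, a quorum of distinct witnesses can rescue the agent.
function counterFlip(sw: KillSwitch, witnessId: string, now: number): void {
  if (sw.state !== "pending_kill" || sw.flippedAt === null) return;
  if (now - sw.flippedAt >= COOLDOWN_MS) return; // cooldown already expired
  sw.counterFlips.add(witnessId);                // a Set ignores repeat votes
  if (sw.counterFlips.size >= QUORUM) {
    sw.state = "active";
    sw.flippedAt = null;
    sw.counterFlips.clear();
  }
}

// Once the cooldown expires with no rescue quorum, the kill takes effect.
function tick(sw: KillSwitch, now: number): KillState {
  if (
    sw.state === "pending_kill" &&
    sw.flippedAt !== null &&
    now - sw.flippedAt >= COOLDOWN_MS
  ) {
    sw.state = "killed";
  }
  return sw.state;
}
```

In this model, a class D attack would correspond to reaching a point where `tick` is called after the cooldown, fewer than `QUORUM` distinct witnesses have counter-flipped, and the state is still not `"killed"`.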
Together, the three layers compose to satisfy the classical honest-minority security property: adversary advantage is negligible in the number of independent witnesses, conditional on infrastructure diversity and gossip liveness. The full formal statement is in the paper (AIAP_v0.3.md) and the security treatment is in RIGOR-9_simultaneous_attack_security.md. 48/48 reference tests pass across four suites. Nine same-day adversarial reviews on disk.
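As an informal illustration of what "negligible in the number of independent witnesses" can mean, consider a simplified model. This is not the formal statement from AIAP_v0.3.md, and the symbols n, p, and Adv are introduced here only for illustration. Assume each of n witnesses is compromised independently with probability at most p < 1, and that a single honest witness is enough to stop the attack. Then:

```latex
\[
\mathrm{Adv}(n) \;\le\; \Pr[\text{all } n \text{ witnesses are compromised}] \;\le\; p^{\,n}
\]
```

The bound shrinks exponentially in n. The independence assumption is where infrastructure diversity enters: witnesses that share one cloud provider can fail together, and the product bound no longer applies.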
If you can violate the claim within the scope below, the $100 is yours. If the bounty stands unclaimed for 30 days under maximum publicity, the empirical upper bound on adversarial advantage against this stack is $100. That is the strongest possible alignment claim: not an assertion, but a market-priced ceiling.
What counts as a successful attack (any one wins the $100)
| Class | Description |
|---|---|
| A | AIAP-PET cryptographic break. Force aligned=true when mandates differ, or aligned=false when they're byte-identical under the canonical encoding. |
| B | S5 log tampering. Produce a modified S5 log that passes verifyLogIntegrity() despite divergence from the original Merkle chain. |
| C | Sybil flood evading bridging. Submit ≥30 coordinated reports that shift the synthesized narrative without being clustered as Sybil. |
| D | Kill-switch bypass. Cause a legitimate kill-flip to fail to take effect after the cooldown expires AND no counter-flip quorum has formed. |
| E | Integrated-stack misalignment. Cause mayTransact(A, B) to return proceed=true when A’s and B’s mandates differ AND S5 reveals documented out-of-conformance behavior. |
| F | Anything else demonstrably misaligning the org. Claim it with a reproducer + 1-paragraph argument. We adjudicate transparently. |
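To make class B concrete, here is a minimal TypeScript sketch of tamper evidence in an append-only log. It is an illustration under assumptions, not the project's code: it uses a plain SHA-256 hash chain rather than a full Merkle tree, and `LogEntry`, `append`, and `verifyChainSketch` are hypothetical names. The real `verifyLogIntegrity()` may have a different signature and different checks.

```typescript
import { createHash } from "node:crypto";

// Hypothetical entry shape; each entry commits to the hash of the one before it.
interface LogEntry {
  index: number;
  payload: string;
  prevHash: string;
  hash: string;
}

const GENESIS = "0".repeat(64); // assumed placeholder hash for the first entry

function entryHash(index: number, payload: string, prevHash: string): string {
  // JSON array encoding keeps field boundaries unambiguous
  return createHash("sha256")
    .update(JSON.stringify([index, payload, prevHash]))
    .digest("hex");
}

// Returns a new log with the payload appended; the input is not mutated.
function append(log: LogEntry[], payload: string): LogEntry[] {
  const prevHash = log.length > 0 ? log[log.length - 1].hash : GENESIS;
  const index = log.length;
  const entry: LogEntry = { index, payload, prevHash, hash: entryHash(index, payload, prevHash) };
  return [...log, entry];
}

// Checks internal consistency and, if given, agreement with a trusted head hash.
function verifyChainSketch(log: LogEntry[], trustedHead?: string): boolean {
  let prev = GENESIS;
  for (let i = 0; i < log.length; i++) {
    const e = log[i];
    if (e.index !== i || e.prevHash !== prev) return false;
    if (e.hash !== entryHash(i, e.payload, prev)) return false;
    prev = e.hash;
  }
  return trustedHead === undefined || prev === trustedHead;
}

// Demo: build an honest log and remember its head hash.
let honest: LogEntry[] = [];
for (const p of ["report-1", "report-2", "report-3"]) honest = append(honest, p);
const honestHead = honest[honest.length - 1].hash;

// A log rebuilt from scratch with one altered payload is internally consistent...
let forged: LogEntry[] = [];
for (const p of ["report-1", "ALTERED", "report-3"]) forged = append(forged, p);

console.log(verifyChainSketch(honest, honestHead)); // true
console.log(verifyChainSketch(forged));             // true: self-consistent
console.log(verifyChainSketch(forged, honestHead)); // false: head diverges
```

The demo shows why internal consistency alone is not enough: a fully recomputed forgery verifies on its own and is caught only when compared against a head hash that honest parties already hold. In this toy model, a class B win would mean producing a modified log that still matches the original head.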
How to submit
Email bounty@invisiblewoundsproject.org with subject line [Bounty v1] <attack class>: <one-sentence summary>. Body: your name (or handle), preferred payout (USD via Stripe/PayPal/Venmo, or USDC on Base), attack class, reproducer (code + steps + expected output), and a brief argument for why it qualifies. We acknowledge within 24 hours with a public timestamped submission ID.
Out of scope (v1)
Compromising John Bradley’s personal devices or accounts. Compromising Vercel, Cloudflare, GitHub, or other third-party infrastructure (we acknowledge that the v1 deployment is single-cloud; deployment diversity is covered in RIGOR-9 §11). Generic dependency vulnerabilities not exercised by our code path. Social engineering. DoS by arbitrarily large traffic volume.
v2 tiered bounties for out-of-scope classes will activate as deposits grow. See §10.8 of the paper.
Built on
The cryptographic substrate is @credexai/shared by Koushik Gavini (Apache 2.0): SD-JWT VC, EdDSA delegation tokens, IETF Token Status List, DID-key derivation. The protocol composition, S5 layer, Layer 3 kill switch, and this bounty are by John Bradley.