Attested Fragmented Inference Routing

The most significant breakthrough in AI inference since the transformer.

6.1x faster on simple queries. 98.5% cheaper at scale. Every sub-step cryptographically signed before the output moves. Patent pending.

  • 6.1x faster than monolithic: gate bypass, Cerebras direct, live
  • 98.5% cheaper on complex queries: tiered routing vs GPT-4.1 mono
  • 0.8ms attestation overhead: ML-DSA-65 (NIST FIPS 204) per fragment
  • 28 patent claims: provisional filed 2026
Two ways in
Route it through us, or just sign what you already ran.

Same cryptographic receipt either way: ML-DSA-65 (NIST FIPS 204) + Ed25519, verifiable offline or by URL. The only question is whether AFIR runs the inference or simply attests inference you ran yourself.

No key yet? Mint a free one: 25 signed receipts, 7 days, no card. Already have a key? The same key runs a SideCar, so you can sign inference you already ran.
Mint a free key →
Route through AFIR
We run it. We sign it.

Send a prompt to /v1/afir/run. AFIR fragments, routes, and signs every sub-step before the output moves. This is the fastest path to faster, cheaper, signed inference.

  • 6.1x faster, 98.5% cheaper at scale
  • Use our key free to start, or bring your own models
  • Every fragment signed and verifiable
Mint a free key — 25 receipts →
SideCar · attest what you already ran
You run it. We sign it.

Already running your own model on your own stack? Hand us {input, output} at /v1/afir/sign. Zero routing, zero model change — the output becomes signed inference.

  • Try it free on our key — no signup, live demo
  • SideCar your own key so the receipt resolves to you
  • Provenance you control, on inference you own
SideCar your key →
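As a rough sketch of the SideCar call described above, the snippet below builds (but does not send) a POST to /v1/afir/sign using only the Python standard library. The base URL and the X-AFIR-API-Key header are taken from the integration examples on this page; the flat {"input", "output"} body shape and the build_sign_request helper are assumptions made for illustration, and the response schema is not documented here.

```python
import json
import urllib.request

# Base URL and header name come from the integration examples on this page.
BASE_URL = "https://srotzin--afir-cern-afir-api.modal.run"


def build_sign_request(api_key: str, model_input: str, model_output: str) -> urllib.request.Request:
    """Build, but do not send, a POST to /v1/afir/sign (hypothetical helper)."""
    # Assumption: the body is a flat JSON object with "input" and "output" keys.
    body = json.dumps({"input": model_input, "output": model_output}).encode("utf-8")
    return urllib.request.Request(
        url=f"{BASE_URL}/v1/afir/sign",
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "X-AFIR-API-Key": api_key,
        },
    )


req = build_sign_request("afir_your_key", "What is 2 + 2?", "4")
print(req.get_method(), req.full_url)
# Sending is left to the caller, e.g. urllib.request.urlopen(req).
# The response schema is not documented here, so this sketch does not parse it.
```

Keeping request construction separate from sending makes it easy to inspect exactly what would leave your stack before any network call happens.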
Live Demo
See it run in real time.

Type any question below. Watch AFIR fragment, route, and attest in parallel — side by side with standard inference.

Example prompts:
  • Analyze Q3 earnings for margin compression signals
  • Identify GDPR compliance gaps in this privacy policy
  • Summarize key holdings in case law excerpt

Standard Inference: gpt-4o · Monolithic · Unsigned
AFIR: Parallel · Attested · Patent Pending

[Interactive demo: each side shows status, cost, and latency once a prompt is submitted. The AFIR side also shows the fragment receipt (ML-DSA-65 signed).]
Receipt Anatomy
The Receipt. Verifiable by anyone.

Every AFIR response bundles a tamper-evident receipt containing signed attestations for each fragment and a Merkle root over the full response tree.

afir_receipt.json
Verify independently — no Hive API call required. Public key available at /.well-known/afir-signing-key
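To make the "Merkle root over the full response tree" idea concrete, here is a minimal sketch of computing a binary Merkle root over fragment outputs. The leaf encoding, the 0x00/0x01 domain-separation prefixes, and the rule for odd-sized levels are assumptions chosen for illustration; the actual AFIR receipt layout is not specified on this page.

```python
import hashlib


def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root(leaves: list[bytes]) -> bytes:
    """Compute a binary Merkle root; an odd node is paired with itself."""
    if not leaves:
        raise ValueError("at least one leaf is required")
    # Domain-separate leaves (0x00) from interior nodes (0x01).
    level = [sha256(b"\x00" + leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])
        level = [
            sha256(b"\x01" + level[i] + level[i + 1])
            for i in range(0, len(level), 2)
        ]
    return level[0]


# Hypothetical fragment outputs standing in for attested fragments.
fragments = [b"fragment-0 output", b"fragment-1 output", b"fragment-2 output"]
root = merkle_root(fragments)
tampered = merkle_root([b"fragment-0 output", b"fragment-1 OUTPUT", b"fragment-2 output"])
print(root.hex())
print(root != tampered)  # any change to a fragment changes the root
```

Because the root depends on every leaf, one signature over the root covers the whole set of fragments.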
P1
Signed Tool Calls
Before-and-after receipt for every MCP/A2A tool invocation. Binds tool name, phase, input hash, output hash, model, timestamp, nullifier, and parent receipt chain.
tool · agent
POST /v1/afir/tool/sign
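The fields listed for a signed tool call can be pictured as a pair of records, one per phase. This is a sketch only: the field names, the "before"/"after" phase labels, the JSON hashing scheme, and the random nullifier are all hypothetical, and the signature itself is omitted.

```python
import hashlib
import json
import secrets
import time
from typing import Optional


def sha256_hex(value: object) -> str:
    """Hash a JSON-serializable value using a deterministic encoding."""
    encoded = json.dumps(value, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def tool_record(tool: str, phase: str, model: str, tool_input: object,
                tool_output: object = None, parent: Optional[str] = None) -> dict:
    if phase not in ("before", "after"):
        raise ValueError("phase must be 'before' or 'after'")
    return {
        "tool": tool,
        "phase": phase,
        "input_hash": sha256_hex(tool_input),
        # No output exists yet in the "before" phase.
        "output_hash": sha256_hex(tool_output) if phase == "after" else None,
        "model": model,
        "timestamp": int(time.time()),
        "nullifier": secrets.token_hex(16),  # hypothetical: a random one-time ID
        "parent_receipt": parent,
    }


args = {"query": "indemnification clauses"}
before = tool_record("search_contracts", "before", "gpt-4o", args)
after = tool_record("search_contracts", "after", "gpt-4o", args,
                    tool_output={"hits": 3}, parent=before["nullifier"])
print(after["input_hash"] == before["input_hash"])
```

Linking the "after" record to the "before" record through parent_receipt is one way to express the parent receipt chain the card mentions.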
P2
Cross-Agent Receipt Trees
Aggregates fragment and tool nullifiers into a Merkle root. One signature over the entire pipeline. IETF SPICE draft-mw-spice-inference-chain aligned.
multi-agent
POST /v1/afir/tree/build
P3
KV Cache Signing
Signs KV cache prefix entries at write time using vLLM-compatible SHA-256 prefix hashes. Supports parent_cache_receipt chaining for provenance across turns.
context · prefix
POST /v1/afir/cache/sign
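Prefix hashing with parent chaining can be illustrated with a simplified block-wise scheme: each block of token IDs is hashed together with the hash of the block before it. This sketch does not reproduce vLLM's actual hash input layout, and BLOCK_SIZE, block_hash, and prefix_hashes are hypothetical names.

```python
import hashlib
from typing import Optional

BLOCK_SIZE = 4  # tokens per cache block; kept small for illustration


def block_hash(parent: Optional[str], tokens: list[int]) -> str:
    """Hash one block of token IDs together with its parent block's hash."""
    h = hashlib.sha256()
    h.update((parent or "").encode("ascii"))
    h.update(b"|")
    h.update(",".join(str(t) for t in tokens).encode("ascii"))
    return h.hexdigest()


def prefix_hashes(token_ids: list[int]) -> list[str]:
    """Return one chained hash per full block of the prompt."""
    hashes: list[str] = []
    parent: Optional[str] = None
    full = len(token_ids) - len(token_ids) % BLOCK_SIZE
    for start in range(0, full, BLOCK_SIZE):
        parent = block_hash(parent, token_ids[start:start + BLOCK_SIZE])
        hashes.append(parent)
    return hashes


turn_1 = [101, 7, 42, 9, 15, 3, 88, 61]
turn_2 = turn_1 + [5, 12, 19, 27]
h1, h2 = prefix_hashes(turn_1), prefix_hashes(turn_2)
print(h2[:len(h1)] == h1)  # a longer prompt reuses the shared prefix hashes
```

Since every hash commits to all earlier blocks, a receipt over the latest hash covers the whole prefix, which is what makes chaining across turns possible.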
P4
Model Manifest
TEE-less streaming attestation. Signs model_id, weights_sha3, config_hash, and endpoint. Resolves by nullifier publicly. No trusted hardware required.
attestation
POST /v1/afir/manifest/publish
P5
Crypto-Agile Layer
Per-request suite: ML-DSA-65+Ed25519 (default), ML-DSA-44+Ed25519, Ed25519-only. Reserved: SLH-DSA-SHA2-128s (FIPS 205), FN-DSA-512 (Falcon). No re-architecture on algorithm upgrade.
FIPS 204 · agile
POST /v1/afir/sign · GET /v1/afir/algorithms
Receipt + Reasoning State
The receipt says what happened. SMSH proves what the model knew.

Every AFIR receipt carries an smsh field — a cryptographic seal over the exact reasoning state (system prompt, context window, policy snapshot, identity) that was active when each fragment was signed. Without it, you can prove the output was signed. With it, you can prove the reasoning that authorized it. AFIR ships three SMSH tiers.

smsh
SHA-256 over the canonicalized state record. Default on every AFIR receipt. Verifiable offline — no Hive API call required.
smsh-pq
ML-KEM-768 sealed envelope. ML-DSA-65 primary signature. For regulated evidence payloads, ViewKey material, and long-retention compliance artifacts.
smsh-max
Full-chain binding. SMSH-PQ + delta chain + Merkle root anchored on Base/USDC. For FINRA WORM, EU AI Act high-risk, SR 11-7 model risk, and multi-year retention requirements.
SMSH Canon spec →
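For the default smsh tier, a SHA-256 seal over a canonicalized state record might look like the sketch below. The canonical form used here (sorted keys, compact separators, UTF-8) and the field names are assumptions; the SMSH Canon spec is the authority on the real canonicalization rules.

```python
import hashlib
import json


def canonicalize(state: dict) -> bytes:
    # Assumption: canonical form = sorted keys, no insignificant whitespace, UTF-8.
    return json.dumps(
        state, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    ).encode("utf-8")


def smsh_seal(state: dict) -> str:
    return hashlib.sha256(canonicalize(state)).hexdigest()


# Hypothetical state record with the four components named above.
state = {
    "system_prompt": "You are a contracts analyst.",
    "context_window": ["clause 14.2 text"],
    "policy_snapshot": "policy-v3",
    "identity": "agent-7",
}
reordered = dict(reversed(list(state.items())))
print(smsh_seal(state) == smsh_seal(reordered))  # key order does not matter
```

Canonicalizing before hashing is what lets two parties recompute the same seal offline from the same state, regardless of how each one serializes the record.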
Integration
One line. Zero migration.

Drop AFIR into any OpenAI-compatible workflow with a single base URL swap. Native endpoint also available.

Before — standard OpenAI
    from openai import OpenAI

    client = OpenAI(
        api_key="sk-...",
    )
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": prompt}]
    )
Change one line
After — AFIR
    from openai import OpenAI

    client = OpenAI(
        api_key="afir_your_key",
        base_url="https://srotzin--afir-cern-afir-api.modal.run/v1",  # Only change
    )
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": prompt}]
    )
    # Receipt automatically bundled in response.choices[0].message.receipt
Node.js — AFIR native endpoint
    const response = await fetch('https://srotzin--afir-cern-afir-api.modal.run/v1/afir/run', {
      method: 'POST',
      headers: {
        'Content-Type': 'application/json',
        'X-AFIR-API-Key': 'afir_your_key',
      },
      body: JSON.stringify({
        prompt: 'Review this contract for indemnification risk...',
        config: {
          fragment_strategy: 'auto',
          attestation: 'ml-dsa-65',
        }
      })
    });
    const { answer, receipt } = await response.json();
    // receipt.merkle_root is verifiable offline
curl
    curl -X POST https://srotzin--afir-cern-afir-api.modal.run/v1/afir/run \
      -H "Content-Type: application/json" \
      -H "X-AFIR-API-Key: afir_your_key" \
      -d '{
        "prompt": "Review this contract clause for indemnification obligations",
        "config": {
          "fragment_strategy": "auto",
          "attestation": "ml-dsa-65"
        }
      }'
Or bring your own models
curl — your models (any OpenAI-compatible endpoint)
    curl -X POST https://srotzin--afir-cern-afir-api.modal.run/v1/afir/run \
      -H "Content-Type: application/json" \
      -H "X-AFIR-API-Key: afir_your_key" \
      -d '{
        "prompt": "Review this contract clause for indemnification obligations",
        "models": {
          "cheap": {"model": "claude-haiku-3", "api_key": "YOUR_KEY", "base_url": "https://api.anthropic.com/v1"},
          "mid": {"model": "claude-sonnet-4", "api_key": "YOUR_KEY", "base_url": "https://api.anthropic.com/v1"},
          "premium": {"model": "claude-opus-4", "api_key": "YOUR_KEY", "base_url": "https://api.anthropic.com/v1"}
        }
      }'

Works with Anthropic, Google Gemini, Groq, Mistral, or any provider with an OpenAI-compatible /chat/completions endpoint. AFIR owns decomposition. You supply execution.

Pricing
Three products. One decision.

Pick the product that matches your compliance posture. All three use the same API — swap by changing one header.

AFIR
Signed Inference
Fragmented inference with ML-DSA-65 receipts. Production today. The baseline every other product builds on.
signing: ML-DSA-65 (NIST FIPS 204) · Ed25519 compat
smsh tier: smsh (canonical)
AFiR Pro
Predictive Routing. Signed.
Same inference pipeline with predictive tier demotion and live-wake routing. Every fragment signed ML-DSA-65 + Ed25519. Faster on complex queries, cheaper at scale.
signing: ML-DSA-65 (NIST FIPS 204) · Ed25519
smsh tier: smsh (canonical)
AFiRCern
Fastest. Cheapest. Strongest.
Dynamic live-wake propagation + predictive tier demotion + PQ signing + on-chain Merkle anchor.
  • 6.1x faster wall time
  • 98.5% cost reduction
  • 241ms median latency (Cerebras direct)
  • 0.8ms signing overhead
  • 74% receipt compression
signing: ML-DSA-65 primary · full delta chain · Base/USDC anchor
smsh tier: smsh-max (on-chain Merkle)
Developer
Engineers & Builders
$0.0001
per fragment, metered
You are building something and need attested inference without a monthly commitment. Pay only for what you route.
  • Up to 10M fragments / month
  • ML-DSA-65 (NIST FIPS 204) attestation on every fragment
  • OpenAI-compatible API
  • REST + receipt bundled in response
  • Public signing key at /.well-known
Get Started
Professional
Startups & AI Product Teams
$2,500
per month
You are shipping an AI product and need receipts for every inference call your users trigger. Flat billing, no surprises.
  • 25M fragments included
  • ML-DSA-65 (NIST FIPS 204) attestation on every fragment
  • OpenAI-compatible API
  • Merkle completeness proof
  • Standard support
Contact Sales
Enterprise
Large Enterprises & Financial Institutions
$50,000
per month
You are running AI at enterprise scale with internal audit, SIEM integration, and legal discovery requirements. Your signing key, your infrastructure, your receipts.
  • 5B fragments included
  • Dedicated signing key hierarchy
  • White-label receipt schema
  • On-premise deployment option
  • Joint patent licensing available
  • Dedicated engineering support
Contact Sales
Hyperscaler
Inference Platforms & Cloud Providers
$250K
platform fee / month
per fragment starting at $0.000020
You route inference at scale and need every call signed before the output moves. Your customers get receipts. Your platform gets liability coverage. Your legal team gets a defensible record.
  • Unlimited fragments at negotiated rate
  • Custom signing key infrastructure
  • Direct engineering SLA
  • Dedicated patent licensing terms
  • On-premise Docker deployment available — request access below
Talk to Steve
Cost Modeler
Run the numbers on your stack.

Pick your current model and your AFIR tier models. The math uses real published rates. See exactly what you save — or don't.

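[Interactive cost modeler]
Inputs: two sliders (defaults shown: 1,000,000 and 4,000), a complexity preset (3–5 fragments: entity extraction, classification, simple Q&A), and a model choice for each tier (Cheap, Mid, Full).
Outputs: fragment routing; monthly cost (monolithic on your current model vs AFIR tiered routing, what you save, attestation overhead); latency estimate (mono sequential vs AFIR parallel, speedup).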

Illustrative model. Rates from published pricing as of June 2026. Actual savings vary by prompt structure, cache hit rate, and tier spread. Attestation overhead is 0.785ms/fragment (ML-DSA-65, NIST FIPS 204). Latency uses parallel DAG execution — simple queries gate-bypass (single fragment, cheap model), complex queries run in parallel waves. Critical path shown, not total sequential time.
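The difference between critical-path time and total sequential time can be sketched in a few lines. The 0.785ms figure is the per-fragment overhead quoted above; the wave structure, the per-fragment latencies, and the choice to count signing once per wave on the critical path are assumptions for illustration, not the modeler's actual formula.

```python
ATTESTATION_MS = 0.785  # per-fragment ML-DSA-65 overhead quoted above


def afir_latency_ms(waves: list[list[float]]) -> float:
    """Critical path: waves run in sequence, fragments within a wave in parallel."""
    # Assumption: each fragment is signed right after it finishes, so the
    # signing overhead sits on the critical path once per wave.
    return sum(max(wave) + ATTESTATION_MS for wave in waves)


def sequential_latency_ms(waves: list[list[float]]) -> float:
    """Total time if the same fragments ran one after another, unsigned."""
    return sum(sum(wave) for wave in waves)


# Hypothetical per-fragment latencies in milliseconds, grouped into waves.
waves = [[300.0, 450.0, 280.0], [500.0, 350.0]]
parallel = afir_latency_ms(waves)
sequential = sequential_latency_ms(waves)
print(f"parallel {parallel:.2f} ms, sequential {sequential:.2f} ms")
```

In this toy model the slowest fragment in each wave sets the pace, which is why wall time tracks the critical path and not the sum of all fragment latencies.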

Live Benchmark Chart
Your configuration vs Standard — updated as you adjust the sliders above

Latency and cost reflect your selected model and complexity. Receipts and signing are fixed per tier.

INFERENCE COMPARATOR
How does your current provider stack up?

Enter your Together, Fireworks, or custom provider numbers. See the signed-tier margin and latency gap side by side.

Open Inference Comparator →
FAQ
Hard questions. Straight answers.

The objections engineers and legal teams raise. Answered with numbers, not marketing.

Latency reduction depends on query type. Simple queries bypass decomposition entirely (gate bypass) and complete in 352ms vs 1,692ms for a monolithic GPT-4.1 call — 4.8x faster. Complex queries run fragments in parallel; wall-clock time tracks the critical path, not total token count. In live benchmarks with Cerebras direct routing, gate bypass completes in 241ms vs 867ms monolithic — 3.6x faster. Complex multi-fragment queries: 5,955ms vs 10,628ms — 1.78x faster, 98.5% cheaper. Cheaper models account for cost savings; parallelism accounts for latency savings. These are orthogonal gains. The tiered routing cuts the bill; the DAG execution cuts the latency.
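The speedup figures follow directly from the latency pairs quoted in this answer. A short check, using only those numbers:

```python
# (label, AFIR latency in ms, monolithic latency in ms), as quoted above
benchmarks = [
    ("gate bypass vs GPT-4.1", 352, 1692),
    ("gate bypass, Cerebras direct", 241, 867),
    ("complex multi-fragment", 5955, 10628),
]


def speedup(afir_ms: float, mono_ms: float) -> float:
    """Speedup is monolithic wall time divided by AFIR wall time."""
    return mono_ms / afir_ms


for label, afir_ms, mono_ms in benchmarks:
    print(f"{label}: {speedup(afir_ms, mono_ms):.2f}x")
```

Rounded, the three ratios match the 4.8x, 3.6x, and 1.78x figures stated in the text.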

You could build a proxy with a hash in a weekend. What takes longer is everything else: a correct DAG decomposition engine that preserves semantic dependency ordering across arbitrary prompts; an ML-DSA-65 (NIST FIPS 204) attestation chain that holds per-fragment overhead to 0.8ms at inference latency scales; a Merkle completeness proof that binds input state, routing decisions, and output hashes into a single verifiable receipt; and a key-split architecture where your signing key never leaves your perimeter. The engineering surface is the DAG correctness guarantees and the attestation chain integrity, not the proxy layer. Build that, run it in production under audit, and then revisit the weekend estimate.

Signing outputs at the final response layer is not new. What is patent-pending (filed June 2026) is fragment-level attestation across a routed DAG: specifically, attesting each node before its output is consumed as input by a downstream node, so the chain of custody is continuous and not retrospective. Prior art signs the envelope; AFIR signs every edge in the dependency graph mid-execution. If you have specific prior art that covers per-node attestation within a runtime inference DAG with Merkle assembly proofs, file it against the application. That is the correct venue.
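The ordering constraint, attest each node before a downstream node consumes its output, can be shown with a toy DAG runner. This is a sketch under loud assumptions: HMAC-SHA256 with a shared demo key stands in for the ML-DSA-65 signature, the node names and work functions are hypothetical, and nothing here reflects AFIR's actual implementation.

```python
import hashlib
import hmac
from graphlib import TopologicalSorter

KEY = b"demo-key"  # stand-in secret; NOT ML-DSA-65


def attest(node: str, output: str, parent_tags: list[str]) -> str:
    """Tag a node's output, binding it to the tags of the nodes it consumed."""
    msg = "\n".join([node, output, *parent_tags]).encode("utf-8")
    return hmac.new(KEY, msg, hashlib.sha256).hexdigest()


def run_dag(deps: dict[str, set[str]], work: dict) -> dict[str, tuple[str, str]]:
    results: dict[str, tuple[str, str]] = {}  # node -> (output, tag)
    # static_order() yields every node after all of its dependencies.
    for node in TopologicalSorter(deps).static_order():
        parents = sorted(deps.get(node, ()))
        inputs = []
        for p in parents:
            out, tag = results[p]
            grandparents = [results[g][1] for g in sorted(deps.get(p, ()))]
            # Verify the upstream tag before its output is consumed.
            if not hmac.compare_digest(tag, attest(p, out, grandparents)):
                raise ValueError(f"attestation check failed for {p}")
            inputs.append(out)
        output = work[node](inputs)
        results[node] = (output, attest(node, output, [results[p][1] for p in parents]))
    return results


deps = {"extract": set(), "classify": set(), "assemble": {"extract", "classify"}}
work = {
    "extract": lambda _: "entities: Acme, Beta",
    "classify": lambda _: "type: indemnification",
    "assemble": lambda parts: " | ".join(parts),
}
results = run_dag(deps, work)
print(results["assemble"][0])
```

Each tag covers the node's own output plus the tags of everything it consumed, so the final node's tag transitively commits to the whole graph.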

The signed receipt captures the exact input state, the routing decision, the full dependency graph, and the output hash at assembly time. If assembly produces an incorrect result, the receipt is forensic evidence of exactly which node produced which output under which routing decision. That is the point of the Merkle completeness proof. Liability follows the evidence: if the decomposition logic is wrong, that is traceable to the DAG construction step in the receipt; if a model tier returns a defective fragment, that is attested at that node. Hive provides the receipt infrastructure and the routing logic; the customer's signing key binds them to the input state they submitted. The receipt does not resolve liability by itself. It makes the facts unambiguous.

The key-split architecture is designed for this constraint: the customer holds the ML-DSA-65 signing key on-premises, and Hive holds only the verification root. Inference fragments transit Hive's routing layer, but the signing authority never leaves the customer's perimeter, so Hive cannot forge a receipt the customer did not authorize. If your legal requirement is that no inference payload leaves your network, AFIR is not the right fit in its current hosted form. If the requirement is that a third party cannot produce valid attested outputs without your authorization, the key split satisfies that. Have your legal team review the receipt architecture spec against those two specific threat models.

The EU AI Act's current logging obligations for high-risk systems do not mandate per-fragment attestation or Merkle assembly proofs. AFIR's receipt format exceeds those requirements by design, not by regulatory necessity. The value proposition is not compliance box-checking — it is that when a regulator, customer, or internal audit asks exactly what inputs produced exactly what output under exactly what model routing decision, you produce a cryptographically verifiable answer rather than reconstructed logs. Regulations are a floor; your exposure in a dispute is the ceiling. The receipt is for the ceiling.

Audit logs record what your system observed; receipts attest what the inference system executed. Your logs can be amended, retroactively structured, or missing entries due to pipeline failures — they are assertions your system makes about itself. An AFIR receipt is signed at execution time by a key you hold, binds the input hash, the routing graph, and the output hash into a Merkle structure, and cannot be back-filled without invalidating the signature. The distinction matters when a counterparty — a regulator, a plaintiff, an enterprise customer — challenges whether a specific output came from a specific input under a specific model. A log entry is testimony; a receipt is evidence.

The receipt's verifiability depends on the ML-DSA-65 key pair, not on Hive's operational continuity. The customer holds the signing key; the verification root is a public key that can be exported, archived, and verified offline with any standard FIPS 204-compliant implementation. If Hive ceases operations, existing receipts remain verifiable against the public key the customer already holds — no Hive infrastructure required. New receipt issuance would stop, but the forensic value of issued receipts does not decay. The escrow and key export procedures are documented in the enterprise agreement for exactly this scenario.

On-Premise
AFIR runs inside your network.

No fragments leave your infrastructure. No data touches our servers. You operate the container. We license the IP and hold the verification root.

Your Infrastructure
AFIR Docker container — your VPC
Your signing key — your HSM or KMS
Fragment data never leaves this boundary
Receipts stored in your own infrastructure
Crossing the boundary: one-time activation; receipt hash only, on verify calls
Hive
Verification master key — held by Hive only
License activation server — one call on deploy
Countersignature on receipt schema — proves genuine AFIR
Patent-protected signing primitive — IP stays with Hive
How it works
01
You pull the image
Signed Docker image delivered to your private registry via presigned URL. No public Docker Hub.
02
One activation call
Container phones home once on first start. Passes instance fingerprint, receives signed session token. Runs fully offline after that.
03
Your pipeline calls AFIR
Drop-in replacement for your existing inference endpoint. Same API shape. Receipts generated locally, stored wherever you want.
04
Dual-signed receipts
Every receipt carries a signature from your signing key and Hive's countersignature on the schema. You prove what happened. We prove it was genuine AFIR.
Request Docker Access
Deploy AFIR in your VPC.

Available to Hyperscaler tier customers. Submit your request and we will follow up with deployment specs, image delivery, and licensing terms.

Get Access
Start attesting in minutes.

One checkout. Your live key appears on screen the moment payment clears — no email round-trip, no waiting. Metered billing: you only pay for what you sign.

EU AI Act — Articles 12 & 13 — Effective August 2, 2026

AFIR receipts address the technical logging and traceability requirements for high-risk AI systems. Every inference call produces a cryptographically signed, tamper-evident receipt that supports the logging obligations under Article 12 and the transparency obligations under Article 13. No additional integration is required: your existing AFIR API calls already produce these receipts.