Live in prod · ML-DSA-65 Patent Pending

HiveWidget. The Drop-In that signs every LLM response.
Any model. Any provider. One line.

HiveWidget is the customer-side trust layer for AI. Your LLM call stays where it is. After the response returns, the widget runs a 5-stage pipeline — COMPILE, COMPRESS, AMPLIFY, CERTIFY, WRITEBACK — and hands back an ML-DSA-65 (FIPS 204) post-quantum certificate. The cert binds prompt_hash, response_sha256, enrichment_sha256, tenant_did, and partner_id. Tamper with any of them and the signature breaks. Public verify, no shared secret.

What each piece does for you

XCALIBUR is the platform. CCACW is the pipeline. AmpliHive is the widget that delivers it.

One umbrella, one pipeline, five tools. Each tool answers one question a human actually asks. No marketing words you have to translate.

Tool 1 · COMPRESS

“Why is my LLM bill so high?”

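Squeezes prompts before they hit the model, up to 3-4× on dense domain text. Same output quality. Savings depend on prompt density: 25–40% on typical enterprise prompts, up to 78% on dense domain text, 0% on terse prompts. Runs at CCACW stage 2.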

Tool 2 · COMPILE

“Why does my LLM lie or get jailbroken?”

Hardens prompts against negation attacks, jailbreak templates, hallucination triggers before send. Runs at CCACW stage 1.

Tool 3 · AMPLIFY

“Why doesn’t my LLM remember my company’s context?”

Pulls your tenant’s corpus into the prompt at call time. Persistent memory. Runs at CCACW stage 3.

Tool 4 · VERIFY

“How do I prove to my regulator the corpus content is real?”

Zero-knowledge proof that the corpus content injected into a prompt is a genuine member of your committed corpus — without revealing what it is. Trustless attestation. Runs alongside CCACW stage 4.

Tool 5 · ORACLE

“Why is the full pipeline slow?”

Speculative coordinator. Predicts the LLM response, pre-stages the next AMPLIFY, pre-caches the next COMPRESS. Cuts pipeline latency 40% on existing hardware.

Cross-members · FEDERATION + MESH

“Why isn’t the network getting smarter? Why is my edge traffic slow?”

FEDERATION shares structural learning across tenants (gradients only, never content). MESH replicates the corpus to every edge PoP so Singapore traffic doesn’t round-trip to Virginia.

How the tiers map

smsh is the tier name. Each tier turns on more of the tools.

smsh is not a separate product — it is the service level. Think of it like card tiers. Same card, different included features.

smsh

$0.80 / 1M in

COMPRESS only. Ed25519 receipt. Base inference savings.

smshPQ

$1.10 / 1M in

COMPRESS + PQ-sealed receipt (ML-DSA-65 + SLH-DSA). Regulated workloads.

smshMax

$1.40 / 1M in

Full CCACW: COMPRESS + COMPILE + AMPLIFY + CERTIFY + WRITEBACK. Corpus moat.

smshPQMax · Flagship

$1.85 / 1M in

Full CCACW + PQ-sealed envelope + VERIFY-ready. The full stack.

AmpliHive Enterprise Tenant

$120K / yr base

All tools + ORACLE + MESH + FEDERATION. Partner-resident instance. Your IAM, your corpus, Hyperscale overage on smshPQMax.

See it on pricing

Live ladder

Full breakdown with Enterprise commit ladder (10% / 25% / 40% / 50% off at Hyperscale). /pricing/#inference
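
The per-tier input prices turn into a monthly number with simple arithmetic. A minimal sketch, assuming a hypothetical workload of 500M input tokens per month and treating a commit-ladder discount as a flat percentage off the input price. The helper name, the workload figure, and the way the discount is applied are illustrative assumptions, not product behavior.

```python
# Input-token prices per 1M tokens (USD), copied from the tier list above.
TIER_PRICE_PER_M = {
    "smsh": 0.80,
    "smshPQ": 1.10,
    "smshMax": 1.40,
    "smshPQMax": 1.85,
}


def monthly_cost(tier: str, input_tokens: int, discount_pct: float = 0.0) -> float:
    """Estimated input-token cost in USD (hypothetical helper, not an SDK call)."""
    base = TIER_PRICE_PER_M[tier] * input_tokens / 1_000_000
    # ASSUMPTION: a commit discount applies as a flat percentage off the base.
    return round(base * (1 - discount_pct / 100), 2)


if __name__ == "__main__":
    workload = 500_000_000  # hypothetical: 500M input tokens per month
    for tier in TIER_PRICE_PER_M:
        print(tier, monthly_cost(tier, workload))
    print("smshPQMax at 25% off:", monthly_cost("smshPQMax", workload, 25))
```

The output-token side and any per-call charges are not covered here; see /pricing/#inference for the actual ladder.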

What it is

A sidecar, not a pipe.

HiveWidget does not route your inference. It does not see your provider key. It does not change your latency budget. It sits next to your LLM call, hashes the inputs, runs five fast stages, and returns a signed certificate plus an enriched payload. Your customer’s data path stays exactly where it is. The only thing that changes is that every response is now provably yours, provably untampered, and provably attributed.

Drop-in

One npm or pip line

npm i @hivery/amplihive or pip install amplihive. Construct one client with your tenant DID. Wrap or sign. Done.

Provider-agnostic

Works with everything

Fireworks, Cloudflare AI Gateway, OpenRouter, Together, OpenAI, Anthropic, Bedrock, vLLM, Ollama, your private cluster. We extract response text from any standard shape and sign whatever you handed us.

Post-quantum

ML-DSA-65, FIPS 204

NIST-standardized lattice signatures. Sig 3,309 bytes, pubkey 1,952 bytes. Verifier needs nothing but the cert. No shared secret. No HSM dependency on the verifier side.

The 5-stage pipeline

What happens between your LLM call and the cert.

Total round-trip is ~6ms p95 plus the ML-DSA-65 sign (~60ms). Off your critical path. The customer’s LLM latency story is unchanged.

01 · COMPILE · tokenize + canonicalize
02 · COMPRESS · domain-dictionary token reduction
03 · AMPLIFY · tenant corpus + powers
04 · CERTIFY · ML-DSA-65 sign over full payload
05 · WRITEBACK · append to tenant corpus

Stage 01 · COMPILE

Normalize input. Hash everything.

The widget canonicalizes the prompt + response, computes prompt_hash (SHA-256 over UTF-8 bytes after whitespace canonicalization) and response_sha256 (SHA-256 over the raw response bytes). Anyone re-deriving these from the original text will get a bit-identical match. This is the audit anchor.

What the cert anchors
prompt_hash     "sha256:f8a3…9c1d"
response_sha256 "sha256:b27e…41a8"
// re-derive from text → byte-identical
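
Re-deriving the two hashes needs nothing beyond a SHA-256 implementation. A minimal sketch, assuming that "whitespace canonicalization" means collapsing runs of whitespace to a single space and trimming the ends. The exact rule is not spelled out here, so treat `canonicalize_whitespace` as a stand-in and both function names as hypothetical.

```python
import hashlib
import re


def canonicalize_whitespace(text: str) -> str:
    # ASSUMPTION: collapse whitespace runs to one space and trim both ends.
    # The widget's real canonicalization rule may differ.
    return re.sub(r"\s+", " ", text).strip()


def prompt_hash(prompt: str) -> str:
    """SHA-256 over the UTF-8 bytes of the canonicalized prompt."""
    canonical = canonicalize_whitespace(prompt).encode("utf-8")
    return "sha256:" + hashlib.sha256(canonical).hexdigest()


def response_sha256(response: bytes) -> str:
    """SHA-256 over the raw response bytes, with no canonicalization."""
    return "sha256:" + hashlib.sha256(response).hexdigest()


if __name__ == "__main__":
    # Whitespace differences vanish in prompt_hash...
    print(prompt_hash("Draft a  3-clause\nSaaS indemnity.") ==
          prompt_hash("Draft a 3-clause SaaS indemnity."))
    # ...but any byte difference changes response_sha256.
    print(response_sha256(b"ok") == response_sha256(b"ok "))
```

The asymmetry is the point: the prompt hash tolerates formatting noise, the response hash tolerates nothing.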
Stage 02 · COMPRESS

Domain-dictionary token reduction.

Long phrases that appear in regulated workloads — legal review, financial analysis, healthcare summarization, code review — get reversibly compressed to single tokens. Honest reduction depends on prompt density:

  • 25–40% on typical enterprise prompts
  • up to 78% on dense domain text (verified live on a legal indemnity review)
  • 0% on terse non-domain prompts — we don’t pretend otherwise

Reduction is reported per-call. Customers see exactly what they saved.

Live (legal prompt, /sign)
tokens_in            119
tokens_out           26
token_reduction_pct  78.15
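
The reported percentage is plain arithmetic over the two token counts, and the reversibility claim is easy to picture with a toy dictionary. A minimal sketch, assuming made-up dictionary entries and placeholder codes; the real domain dictionary and its token encoding are not described here, and none of these helper names come from the SDK.

```python
# Toy phrase dictionary. ASSUMPTION: entries and codes are invented for
# illustration; the production dictionary is not published in this page.
DICTIONARY = {
    "indemnify and hold harmless": "<D1>",
    "to the fullest extent permitted by law": "<D2>",
}


def compress(text: str) -> str:
    """Replace known long phrases with short codes."""
    for phrase, code in DICTIONARY.items():
        text = text.replace(phrase, code)
    return text


def expand(text: str) -> str:
    """Reverse compress(); valid as long as the codes never occur in raw text."""
    for phrase, code in DICTIONARY.items():
        text = text.replace(code, phrase)
    return text


def token_reduction_pct(tokens_in: int, tokens_out: int) -> float:
    """Percentage of input tokens removed, rounded to two decimals."""
    if tokens_in <= 0:
        raise ValueError("tokens_in must be positive")
    return round(100 * (tokens_in - tokens_out) / tokens_in, 2)


if __name__ == "__main__":
    clause = ("Vendor shall indemnify and hold harmless Customer "
              "to the fullest extent permitted by law.")
    assert expand(compress(clause)) == clause  # round trip is lossless
    print(token_reduction_pct(119, 26))  # 78.15, the live legal-prompt figure
```

A prompt with no dictionary phrases passes through unchanged, which is where the 0% case comes from.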
Stage 03 · AMPLIFY

Output-side enrichment from the tenant corpus.

Every tenant gets a structurally-isolated corpus. AMPLIFY pulls the relevant prior facts, patterns, and prior signed responses for this tenant and threads them into the output as enrichment, with a confidence score. The enrichment is hashed into enrichment_sha256 and bound into the same signature. The corpus is yours alone — no cross-tenant bleed, ever.

In the cert
enrichment_sha256  "sha256:97c4…a0b1"
confidence         0.87
corpus_hits_count  12
Stage 04 · CERTIFY

ML-DSA-65 signature over the full payload.

The payload — tenant DID, prompt hash, response hash, enrichment hash, partner ID, timestamp — is serialized canonically and signed under FIPS 204’s ML-DSA-65. Sig is 3,309 bytes. Pubkey is 1,952 bytes. Sign latency is ~60ms; verify is ~17ms. Public verify endpoint at POST /v1/amplify/verify — no shared secret, anyone can audit.

Production timing
amplihive_overhead_p50  4.47 ms
amplihive_overhead_p95  6.05 ms
mldsa65_sign_ms         59.76
mldsa65_verify_ms       16.76
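
What gets signed is a byte string, so the serialization has to be deterministic. A minimal sketch of the signing input, assuming canonical form means JSON with sorted keys, compact separators, and UTF-8 encoding. That rule, the `v` and `ts_ns` values, and the helper name are illustrative assumptions. The ML-DSA-65 signature itself is left out because it needs a FIPS 204 library outside the Python standard library.

```python
import hashlib
import json


def canonical_bytes(payload: dict) -> bytes:
    # ASSUMPTION: sorted keys + compact separators + UTF-8.
    # The widget's actual canonical serialization may differ.
    return json.dumps(
        payload, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    ).encode("utf-8")


def sha(data: bytes) -> str:
    return "sha256:" + hashlib.sha256(data).hexdigest()


# Field names follow the cert payload; values are placeholders.
payload = {
    "v": 1,
    "tenant_did": "did:hive:acme",
    "prompt_hash": sha(b"example prompt"),
    "response_sha256": sha(b"example response"),
    "enrichment_sha256": sha(b"example enrichment"),
    "partner_id": "fireworks",
    "ts_ns": 1_700_000_000_000_000_000,
}

signing_input = canonical_bytes(payload)

# Key order does not matter: the same fields always yield the same bytes.
reordered = dict(reversed(list(payload.items())))
assert canonical_bytes(reordered) == signing_input

# Changing any signed field changes the bytes, so a signature over the
# original input can no longer verify against the tampered payload.
tampered = dict(payload, partner_id="someone-else")
assert canonical_bytes(tampered) != signing_input
```

The signature is computed over `signing_input`, which is why a one-character change to partner_id or any hash is enough to break verification.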
Stage 05 · WRITEBACK

The signed response becomes future context.

The certified response is written back into the tenant corpus. Every signed call makes the next one slightly smarter, slightly cheaper, and adds defensible switching cost. This is the tenant moat: the longer you run on HiveWidget, the more provably-yours your corpus becomes. Disable with writeback: false if you don’t want it.

Tenant moat
writeback.id      8129
writeback.kind    "signed_response"
// future AMPLIFY draws from this
The 8 attributes

Why this is a homerun, not a feature.

Most “trust” layers ship one of these. HiveWidget ships eight, signed under the same cert, in the same call.

01

PQ certified

ML-DSA-65 / FIPS 204. Quantum-safe. NIST-standardized. Public verify.

02

Tenant moat

Per-tenant corpus, writeback default-on. Switching cost compounds per call.

03

Token compressed

25–40% reduction on typical enterprise prompts, up to 78% on dense domain text. Honest %, reported per call.

04

Enriched

Output-side amplify from corpus + powers. Hashed into the same signature.

05

Partner attributed

partner_id bound INSIDE the signed payload. Flip a byte, signature breaks.

06

Tamper-evident

3-case verify in the UI: clean, tampered response, tampered partner_id.

07

Co-brandable

Pre-wired partner pages. Private label on the table. White-label SDK namespace negotiable.

08

Revenue-share native

Every signed call carries the partner ID. Share is a SQL query, not a reconciliation call.
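
"A SQL query" can be taken literally. A minimal sketch using an in-memory SQLite table, assuming a hypothetical `signed_calls` table with one row per issued cert. The schema is invented for illustration; the $0.06 per 1M signed calls price and the 30% default share are the figures quoted in the FAQ.

```python
import sqlite3

PRICE_PER_SIGNED_CALL_USD = 0.06 / 1_000_000  # "$0.06 per 1M signed calls"
DEFAULT_PARTNER_SHARE = 0.30                  # default share from the FAQ

# ASSUMPTION: table and column names are hypothetical.
SHARE_QUERY = """
SELECT partner_id, COUNT(*) AS signed_calls
FROM signed_calls
WHERE partner_id IS NOT NULL
GROUP BY partner_id
ORDER BY partner_id
"""


def partner_calls(conn: sqlite3.Connection) -> dict:
    """Signed-call count per partner_id."""
    return {partner: calls for partner, calls in conn.execute(SHARE_QUERY)}


def partner_share_usd(calls: int, share: float = DEFAULT_PARTNER_SHARE) -> float:
    """Partner's cut of the signing fee for a given number of calls."""
    return calls * PRICE_PER_SIGNED_CALL_USD * share


def demo_connection() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE signed_calls ("
        "id INTEGER PRIMARY KEY, tenant_did TEXT NOT NULL, "
        "partner_id TEXT, ts_ns INTEGER NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO signed_calls (tenant_did, partner_id, ts_ns) VALUES (?, ?, ?)",
        [
            ("did:hive:acme", "fireworks", 1),
            ("did:hive:acme", "fireworks", 2),
            ("did:hive:acme", "cloudflare", 3),
            ("did:hive:acme", None, 4),  # partner_id is optional
        ],
    )
    return conn


if __name__ == "__main__":
    for partner, calls in partner_calls(demo_connection()).items():
        print(partner, calls, partner_share_usd(calls))
```

Because each row corresponds to a cert whose partner_id is inside the signed payload, the counts can be audited against the certs themselves.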

Production numbers

What it does in prod, today.

Re-derivable. Hit POST /v1/amplify/bench against hivemorph.onrender.com and watch.

78.15% · Token reduction
Dense legal prompt, /sign full pipeline, smsh-std post-dictionary-expansion.

6.05ms · p95 overhead
AmpliHive-side latency excluding ML-DSA-65 sign. Off your critical path.

59.76ms · Sign time
ML-DSA-65 signature over the full canonical payload. One round-trip.

16.76ms · Verify time
Public verify of signature + hash re-derivation. No shared secret needed.

Cert anatomy

What’s inside the signed payload.

Everything that matters is signed. Tamper with any field below and verify returns signature_valid: false.

XCALIBUR Certificate · payload
ML-DSA-65 / FIPS 204
v · Certificate schema version. Required for forward compatibility.
tenant_did · Decentralized identifier for the customer. Binds the cert to a specific tenant. Cross-tenant replay impossible.
prompt_hash · SHA-256 over canonicalized prompt bytes. Anyone with the original prompt can re-derive and confirm match.
response_sha256 · SHA-256 over the raw response bytes returned to the customer. If the response changes by one byte, this changes, signature breaks.
enrichment_sha256 · SHA-256 over the enrichment object from AMPLIFY. Binds the corpus-side amplification to this exact response.
partner_id · Optional. Inference provider attribution. Bound IN the signed payload, so revenue share is provable, not promised.
ts_ns · Nanosecond timestamp at sign time. Ordering, replay window, audit timeline.

Install paths

Three ways to ship it. Same cert at the end.

Mode 1 · split

Customer runs the LLM. Widget signs.

The most common. Customer keeps their provider key, calls their LLM, then calls hive.sign({ prompt, response }). Widget never sees the API key. Widget never routes inference.

Mode 2 · sealed

Widget wraps the LLM call.

hive.wrap(() => openai.chat.completions.create(req), { promptText }). The widget invokes the customer’s LLM function, extracts the response text, signs in one shot. One block of code instead of two.

// JS / TS — split mode (most common)
import OpenAI from "openai";
import { AmpliHive } from "@hivery/amplihive";

const llm  = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });
const hive = new AmpliHive({
  tenantDid: "did:hive:acme",
  apiKey:    process.env.AMPLIHIVE_API_KEY,
  partnerId: "fireworks",                 // optional, bound into cert
});

const prompt = "Draft a 3-clause SaaS indemnity.";
const r = await llm.chat.completions.create({
  model: "claude_sonnet_4_6",             // illustrative id; use a model your endpoint actually serves
  messages: [{ role: "user", content: prompt }],
});
const text = r.choices[0].message.content ?? "";   // content can be null

const cert = await hive.sign({ prompt, response: text });
//
// cert.certificate.alg                   → "ML-DSA-65"
// cert.certificate.payload.tenant_did    → "did:hive:acme"
// cert.certificate.payload.partner_id    → "fireworks"
// cert.compression.token_reduction_pct   → 78.15  (when dense)
// cert.stages.{compile,compress,amplify,certify,writeback}
# Python — split mode
import os                             # needed for os.environ below
import openai
from amplihive import AmpliHive

hive = AmpliHive(
    tenant_did="did:hive:acme",
    api_key=os.environ["AMPLIHIVE_API_KEY"],
    partner_id="cloudflare",          # optional, bound into cert
)

prompt = "Summarize this MRI report at the patient level."
r = openai.chat.completions.create(
    model="gpt_5_5",                  # illustrative id; use a model your account can call
    messages=[{"role": "user", "content": prompt}],
)
text = r.choices[0].message.content

cert = hive.sign(prompt=prompt, response=text)
# cert.certificate.alg                 → "ML-DSA-65"
# cert.certificate.payload.partner_id  → "cloudflare"
# cert.compression                     → {"tokens_in": ..., "tokens_out": ..., "token_reduction_pct": ...}
# Python — HiveCompute mode (tier-routed, USDC-settled)
import os                             # needed for os.environ below
from amplihive import HiveCompute

hive = HiveCompute(
    tenant_did="did:hive:acme",
    api_key=os.environ["AMPLIHIVE_API_KEY"],
    wallet_key=os.environ["BASE_WALLET_KEY"],   # pays 0.02 USDC per call
)

# tier="T1_STANDARD" picks cheapest qualifying model (Claude Sonnet 4.6 today)
# tier="T2_HIGH" routes to GPT-5.5 / Opus 4.7 / Gemini 3.1 Pro
result = hive.complete(
    prompt="Audit this 1099 for ALCOA compliance.",
    tier="T2_HIGH",
    partner_id="alcoa-agentguard",
)

# result.text                          → the response
# result.model_used                    → "claude_opus_4_7"
# result.cert.payload.tenant_did       → "did:hive:acme"
# result.cert.payload.counterparty_did → "did:eth:0x..."
# result.cert.payload.price_atomic     → 20000  (0.02 USDC)
# result.cert.alg                      → "ML-DSA-65"
Where it works

Provider-agnostic. By design.

The widget extracts response text from any standard shape and signs whatever you handed it. Bring your provider, keep your provider key, keep your latency budget.

Fireworks · OpenAI-compat
Cloudflare · AI Gateway
OpenRouter · routed
Together · OpenAI-compat
OpenAI · native
Anthropic · Messages API
Bedrock · via SDK
Workers AI · edge
vLLM · self-hosted
Ollama · local
HiveCompute · internal
your stack · extract fn
Models we sign for

Hard-core models. Three tiers. No filler.

The widget signs any provider you bring. When you route through HiveCompute, you get tier-based selection across seven frontier models from OpenAI, Anthropic, and Google. No open-weights filler, no random Llama forks, no gimmicks. Tier T0 for classification, T1 for standard agent work, T2 for audit-grade reasoning. Same signed cert at every tier.

GPT-5.5 · openai · T2 high
Claude Opus 4.7 · anthropic · T2 high
Gemini 3.1 Pro · google · T2 high
Claude Sonnet 4.6 · anthropic · T1 standard
GPT-5.4 · openai · T1 standard
GPT-5.4 Mini · openai · T0 cheap
Gemini 3 Flash · google · T0 cheap

Bring your own provider key in split mode and the widget never sees it. Route through HiveCompute and we pick the cheapest model that clears your tier, signed under ML-DSA-65, settled in USDC on Base.

Who buys this

Three buyers. One widget.

Direct customer

You ship AI into a regulated buyer

Bank, hospital, gov, pharma, insurer, law firm, F500 procurement. Their CISO asks: “Can you prove this response wasn’t tampered? Can you prove it was generated for our tenant, not someone else’s?” HiveWidget is the yes.

Install & try
Inference provider

You sell tokens (Fireworks, Cloudflare, Together, OpenRouter)

Your customers ask you for compliance. You don’t want to build a PQ signing infra. You don’t want to compete with your own customers’ trust layer. White-label HiveWidget. partner_id=you is in every cert. Revenue share is provable.

Reseller pitch
Platform / gateway

You aggregate AI in front of customers

AI Gateway, MCP gateway, agent platform, RAG SaaS. You need the audit and provenance story to close enterprise. Bind partner_id=<your-platform> into every cert. Customer-side install, zero data exfil, full provenance.

Architecture
FAQ

The questions we keep getting.

Does HiveWidget see my prompt or response body?

By default the widget hashes inputs locally and posts the cleartext to /v1/amplify/sign so AMPLIFY can run against the tenant corpus. If you want hash-only mode (no body crosses the wire), pass amplify: false and you ship only hashes. The cert still signs the same fields.

Does this slow my inference down?

No. The widget runs after your LLM returns. AmpliHive overhead is 4.47ms p50 / 6.05ms p95. ML-DSA-65 sign is ~60ms. Your customer sees the LLM response on the same TTFT as before; the cert arrives a fraction of a second later. If you want sealed-mode parallelism, use wrap().

How is partner_id tamper-proof?

It’s a field inside the canonical payload that gets signed. Change it to any other value, even by one character, and the ML-DSA-65 signature no longer verifies. There is no Fireworks signing key to steal or forge: the tenant key signs, and partner_id is simply bound into the payload alongside the hashes. A cert attributed to a different partner is a different payload and needs a fresh signature from the tenant key.

What does verify look like?

One POST to /v1/amplify/verify with the cert plus (optionally) the original prompt and response. Returns signature_valid: true|false and per-field hash-match booleans. No shared secret. No API key required. Anyone with the cert can audit.
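
A verify call can be built with the Python standard library alone. A minimal sketch, assuming the host named for the bench endpoint also serves verify and that the request body uses `certificate`, `prompt`, and `response` as field names. Both are assumptions; only the path, the `signature_valid` field, and the no-API-key property come from this page.

```python
import json
import urllib.request
from typing import Optional

# ASSUMPTION: same host as the bench endpoint mentioned under Production numbers.
BASE_URL = "https://hivemorph.onrender.com"


def build_verify_request(
    cert: dict,
    prompt: Optional[str] = None,
    response: Optional[str] = None,
) -> urllib.request.Request:
    """Build (but do not send) a POST to /v1/amplify/verify."""
    # ASSUMPTION: body field names are illustrative, not a documented schema.
    body = {"certificate": cert}
    if prompt is not None:
        body["prompt"] = prompt
    if response is not None:
        body["response"] = response
    return urllib.request.Request(
        BASE_URL + "/v1/amplify/verify",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},  # no API key header
        method="POST",
    )


def verify(cert: dict, prompt: Optional[str] = None,
           response: Optional[str] = None) -> dict:
    """Send the request and return the parsed JSON result."""
    req = build_verify_request(cert, prompt, response)
    with urllib.request.urlopen(req, timeout=10) as resp:
        return json.load(resp)
```

Calling `verify(cert, prompt, text)["signature_valid"]` would then give the top-level answer; passing the original prompt and response is what lets the endpoint report the per-field hash matches.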

What happens if I disable writeback?

Set writeback: false at client construction (or per-call skip_writeback: true). The cert is still issued. AMPLIFY still reads from your existing corpus. The new signed response just doesn’t append. Use this for ephemeral workloads or when the tenant corpus is read-only.

Why ML-DSA-65 and not Ed25519?

Quantum. ML-DSA-65 is NIST’s FIPS 204 standardized lattice-based signature, designed to be secure against quantum adversaries. Ed25519 is not. Federal procurement and serious enterprise procurement increasingly require PQ-ready cryptography. ML-DSA-65 gets you on that list. Ed25519 does not.

Can I run the widget against my private model?

Yes. The widget doesn’t care which provider generated the response. Pass the text into sign(). Or pass an extract function to wrap() that pulls text out of your custom response shape. vLLM, Ollama, custom HTTP — all fine.

What’s the pricing?

$0.06 per 1M signed calls. Partner share negotiable (default 30% in our calculators). For volume above 1B/month, contact us. Reseller terms.

Try the live widget against production.

Sign a real (prompt, response). Watch the 5-stage pipeline. Tamper the response. Tamper the partner_id. The signature breaks. That’s the close.