Live in prod · ML-DSA-65 Patent Pending

HiveWidget. The drop-in that signs every LLM response.
Any model. Any provider. One line.

HiveWidget is a trust layer that lives on your side of the call. Your LLM call stays exactly where it is. Once the response comes back, the widget runs it through five quick stages: COMPILE, COMPRESS, AMPLIFY, CERTIFY, WRITEBACK. At the end you get an ML-DSA-65 (FIPS 204) post-quantum certificate. That cert locks in prompt_hash, response_sha256, enrichment_sha256, tenant_did, and partner_id. Change any one of them and the signature breaks. Anyone can check it, and there's no shared secret involved.


What each piece does for you

XCALIBUR is the platform. CCACW is the pipeline. AmpliHive is the widget that delivers it.

There's one umbrella, one pipeline, and five tools underneath it. Each tool answers a question a real person actually asks. No marketing words you have to decode.

Tool 1 · COMPRESS

“Why is my LLM bill so high?”

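On dense domain prompts, it squeezes the input down 3 to 4 times smaller before it hits the model. That works out to roughly 60 to 76% fewer tokens at the same output quality. Typical enterprise prompts save less, and short plain prompts may save nothing. Runs at CCACW stage 2.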

Tool 2 · COMPILE

“Why does my LLM lie or get jailbroken?”

It hardens your prompt against negation attacks, jailbreak templates, and things that trigger hallucinations, before you ever send it. Runs at CCACW stage 1.

Tool 3 · AMPLIFY

“Why doesn’t my LLM remember my company’s context?”

It pulls your company's own data into the prompt right when you call it, so the model has memory that sticks. Runs at CCACW stage 3.

Tool 4 · VERIFY

“How do I prove to my regulator the corpus content is real?”

It gives you a zero-knowledge proof that the data added into a prompt really is part of your own committed dataset. It does this without showing anyone what that data actually is. Runs alongside CCACW stage 4.

Tool 5 · ORACLE

“Why is the full pipeline slow?”

It's a coordinator that guesses ahead. It predicts the LLM's response, gets the next AMPLIFY step ready early, and pre-caches the next COMPRESS step. That cuts pipeline delay by 40% on the same hardware.

Cross-members · FEDERATION + MESH

“Why isn’t the network getting smarter? Why is my edge traffic slow?”

FEDERATION shares learning across tenants using gradients only, never the actual content. MESH copies your data to every edge location, so traffic from Singapore doesn’t have to round-trip all the way to Virginia.

How the tiers map

smsh is the tier name. Each tier turns on more of the tools.

smsh isn't a separate product. It's the service level. Think of it like credit card tiers: same card, different features included.

smsh

$0.80 / 1M in

COMPRESS only. Ed25519 receipt. Base inference savings.

smshPQ

$1.10 / 1M in

COMPRESS + PQ-sealed receipt (ML-DSA-65 + SLH-DSA). Regulated workloads.

smshMax

$1.40 / 1M in

Full CCACW: COMPRESS + COMPILE + AMPLIFY + CERTIFY + WRITEBACK. Corpus moat.

smshPQMax · Flagship

$1.85 / 1M in

The full CCACW pipeline plus a PQ-sealed envelope, ready for VERIFY. This is everything.

AmpliHive Enterprise Tenant

$120K / yr base

All tools + ORACLE + MESH + FEDERATION. Partner-resident instance. Your IAM, your corpus, Hyperscale overage on smshPQMax.
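
To make the per-token tiers concrete, here is a minimal Python sketch of the arithmetic. It assumes "1M in" means one million input tokens, and the names `TIER_PRICE_USD_PER_1M_IN` and `input_cost_usd` are hypothetical helpers, not part of any SDK. The Enterprise base fee and the commit-ladder discounts are left out because their thresholds aren't given here.

```python
from decimal import Decimal

# Prices copied from the tier list above (USD per 1M input tokens, assumed unit).
TIER_PRICE_USD_PER_1M_IN = {
    "smsh": Decimal("0.80"),
    "smshPQ": Decimal("1.10"),
    "smshMax": Decimal("1.40"),
    "smshPQMax": Decimal("1.85"),
}


def input_cost_usd(tier: str, input_tokens: int) -> Decimal:
    """Return the list-price cost of `input_tokens` on the given tier."""
    price = TIER_PRICE_USD_PER_1M_IN[tier]
    return price * Decimal(input_tokens) / Decimal(1_000_000)


if __name__ == "__main__":
    for tier in TIER_PRICE_USD_PER_1M_IN:
        print(tier, input_cost_usd(tier, 250_000_000))
```

Decimal is used instead of float so the dollar amounts don't pick up binary rounding noise.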

See it on pricing

Live ladder

Full breakdown with Enterprise commit ladder (10% / 25% / 40% / 50% off at Hyperscale). /pricing/#inference

What it is

A sidecar, not a pipe.

HiveWidget doesn't route your inference. It never sees your provider key. It won't touch your latency budget. It just sits next to your LLM call, hashes the inputs, runs five fast stages, and hands back a signed certificate plus an enriched payload. Your customer’s data path stays exactly where it is. The only thing that changes: every response can now be proven yours, proven untampered, and proven attributed.

Drop-in

One npm or pip line

npm i @hivery/amplihive or pip install amplihive. Construct one client with your tenant DID. Wrap or sign. Done.

Provider-agnostic

Works with everything

Fireworks, Cloudflare AI Gateway, OpenRouter, Together, OpenAI, Anthropic, Bedrock, vLLM, Ollama, your private cluster. We extract response text from any standard shape and sign whatever you handed us.

Post-quantum

ML-DSA-65, FIPS 204

A NIST-standardized lattice signature. The signature is 3,309 bytes, the public key is 1,952 bytes. Anyone checking it needs nothing but the cert itself. No shared secret, no hardware security module needed on their end.
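
The "extract response text from any standard shape" step can be pictured roughly like this. It's a sketch only, not the widget's actual extractor: `extract_text` is a hypothetical helper that works on plain parsed-JSON dicts and covers the OpenAI-style chat shape, the Anthropic Messages shape, and the two Ollama shapes.

```python
from typing import Any


def extract_text(response: Any) -> str:
    """Pull the assistant text out of a few common response shapes (sketch)."""
    if isinstance(response, str):
        return response
    if isinstance(response, dict):
        # OpenAI-style: {"choices": [{"message": {"content": "..."}}]}
        choices = response.get("choices")
        if isinstance(choices, list) and choices:
            first = choices[0]
            message = first.get("message") if isinstance(first, dict) else None
            if isinstance(message, dict) and isinstance(message.get("content"), str):
                return message["content"]
        # Anthropic Messages: {"content": [{"type": "text", "text": "..."}]}
        content = response.get("content")
        if isinstance(content, list):
            parts = [b["text"] for b in content
                     if isinstance(b, dict) and b.get("type") == "text"]
            if parts:
                return "".join(parts)
        # Ollama /api/chat: {"message": {"content": "..."}}
        message = response.get("message")
        if isinstance(message, dict) and isinstance(message.get("content"), str):
            return message["content"]
        # Ollama /api/generate: {"response": "..."}
        if isinstance(response.get("response"), str):
            return response["response"]
    raise ValueError("unrecognized response shape; pass the text in directly")
```

Anything the helper doesn't recognize raises, which is the point where you'd hand the text to sign() yourself.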

The 5-stage pipeline

What happens between your LLM call and the cert.

The whole round-trip takes about 6ms at p95, plus about 60ms to sign with ML-DSA-65. It stays off your critical path, so your customer's LLM speed doesn't change at all.

COMPILE

tokenize + canonicalize

COMPRESS

domain-dictionary token reduction

AMPLIFY

tenant corpus + powers

CERTIFY

ML-DSA-65 sign over full payload

WRITEBACK

append to tenant corpus

Stage 01 · COMPILE

Clean up the input. Hash everything.

The widget cleans up the prompt and response into a standard format, then computes prompt_hash (a SHA-256 fingerprint of the text after whitespace cleanup) and response_sha256 (a SHA-256 fingerprint of the raw response bytes). If anyone recomputes these from the original text, they'll get an exact match. This is the anchor point for any audit.

What the cert anchors

prompt_hash     "sha256:f8a3…9c1d"
response_sha256 "sha256:b27e…41a8"
// recompute from the text, get an exact byte match
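
Here's a rough Python sketch of how an auditor might recompute the two fingerprints. The exact whitespace cleanup isn't spelled out here, so `normalize_prompt` (collapse whitespace runs, trim the ends) is an assumption, and the `sha256:` prefix is copied from the sample values above.

```python
import hashlib
import re


def normalize_prompt(text: str) -> str:
    """Assumed whitespace cleanup: collapse runs of whitespace, trim the ends."""
    return re.sub(r"\s+", " ", text).strip()


def prompt_hash(prompt: str) -> str:
    """SHA-256 over the normalized prompt text, UTF-8 encoded."""
    digest = hashlib.sha256(normalize_prompt(prompt).encode("utf-8")).hexdigest()
    return f"sha256:{digest}"


def response_sha256(raw_response: bytes) -> str:
    """SHA-256 over the raw response bytes, with no normalization."""
    return f"sha256:{hashlib.sha256(raw_response).hexdigest()}"


if __name__ == "__main__":
    print(prompt_hash("Draft a  3-clause\nSaaS indemnity."))
    print(response_sha256("1. Indemnification ...".encode("utf-8")))
```

If the real canonicalization differs from this guess, prompt_hash won't match, which is exactly what makes it an audit anchor.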

Stage 02 · COMPRESS

Shrinking tokens with a domain dictionary.

Long phrases that show up a lot in regulated work, like legal review, financial analysis, healthcare summaries, and code review, get squeezed down into single tokens and can be reversed back later. How much you save depends on how dense the prompt is:

25 to 40% on typical enterprise prompts
up to 78% on dense domain text (we've verified this live on a legal indemnity review)
0% on short, plain prompts. We're not going to pretend otherwise

You get the reduction number on every call, so you see exactly what you saved.

Live (legal prompt, /sign)

tokens_in            119
tokens_out           26
token_reduction_pct  78.15
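
The idea behind a reversible domain dictionary, plus the reduction figure reported above, can be sketched in a few lines of Python. The dictionary entries and placeholder tokens below are made up for illustration, and the real tokenizer and dictionary are not shown here.

```python
# Hypothetical domain dictionary: long phrase -> single placeholder token.
DOMAIN_DICT = {
    "indemnify and hold harmless": "§IHH",
    "to the fullest extent permitted by law": "§FEL",
}
REVERSE_DICT = {token: phrase for phrase, token in DOMAIN_DICT.items()}


def compress(text: str) -> str:
    """Replace each known phrase with its placeholder token."""
    for phrase, token in DOMAIN_DICT.items():
        text = text.replace(phrase, token)
    return text


def expand(text: str) -> str:
    """Reverse compress(); assumes the input never contained a placeholder."""
    for token, phrase in REVERSE_DICT.items():
        text = text.replace(token, phrase)
    return text


def token_reduction_pct(tokens_in: int, tokens_out: int) -> float:
    """Percentage of tokens removed, rounded to two decimals."""
    if tokens_in <= 0:
        return 0.0
    return round(100.0 * (tokens_in - tokens_out) / tokens_in, 2)


if __name__ == "__main__":
    clause = ("Vendor shall indemnify and hold harmless Customer "
              "to the fullest extent permitted by law.")
    assert expand(compress(clause)) == clause
    print(token_reduction_pct(119, 26))  # the live legal prompt above
```

With the live numbers, (119 - 26) / 119 comes to 78.15%, which matches the reported token_reduction_pct.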

Stage 03 · AMPLIFY

Adding context from your own data, on the way out.

Every tenant gets their own fully separate dataset. AMPLIFY pulls the relevant facts, patterns, and past signed responses for that tenant and weaves them into the output, along with a confidence score. That added context gets hashed into enrichment_sha256 and locked into the same signature. Your data is yours alone. It never mixes with another tenant's.

In the cert

enrichment_sha256  "sha256:97c4…a0b1"
confidence         0.87
corpus_hits_count  12
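
Hashing an enrichment object only works if both sides serialize it the same way. Here's a minimal sketch, assuming a canonical JSON form (sorted keys, no extra whitespace, UTF-8). The actual canonicalization isn't specified here, so treat that choice as a placeholder.

```python
import hashlib
import json


def enrichment_sha256(enrichment: dict) -> str:
    """Hash a canonical JSON rendering of the enrichment object (assumed format)."""
    canonical = json.dumps(
        enrichment,
        sort_keys=True,            # key order must not affect the hash
        separators=(",", ":"),     # no insignificant whitespace
        ensure_ascii=False,
    )
    return "sha256:" + hashlib.sha256(canonical.encode("utf-8")).hexdigest()


if __name__ == "__main__":
    enrichment = {"confidence": 0.87, "corpus_hits_count": 12}
    print(enrichment_sha256(enrichment))
```

Because keys are sorted before hashing, two dicts with the same content in a different order produce the same fingerprint, while any changed value produces a different one.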

Stage 04 · CERTIFY

An ML-DSA-65 signature over the whole payload.

The payload, meaning the tenant ID, prompt hash, response hash, enrichment hash, partner ID, and timestamp, gets packaged in a standard format and signed with FIPS 204's ML-DSA-65. The signature is 3,309 bytes. The public key is 1,952 bytes. Signing takes about 60ms, and checking it takes about 17ms. Anyone can check it for free at POST /v1/amplify/verify. No shared secret needed.

Production timing

amplihive_overhead_p50  4.47 ms
amplihive_overhead_p95  6.05 ms
mldsa65_sign_ms         59.76
mldsa65_verify_ms       16.76
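
Why does changing one field break the signature? Because the signature covers the exact bytes of the packaged payload. The sketch below builds those bytes from the six fields named above. The canonical JSON packaging is an assumption, `build_payload` and `signing_input` are hypothetical names, and the ML-DSA-65 signing call itself is deliberately left out since it needs a FIPS 204 library.

```python
import json
import time
from typing import Optional


def build_payload(tenant_did: str, prompt_hash: str, response_sha256: str,
                  enrichment_sha256: str, partner_id: Optional[str] = None,
                  ts_ns: Optional[int] = None) -> dict:
    """Collect the fields that the certificate signs over."""
    return {
        "tenant_did": tenant_did,
        "prompt_hash": prompt_hash,
        "response_sha256": response_sha256,
        "enrichment_sha256": enrichment_sha256,
        "partner_id": partner_id,
        "ts_ns": time.time_ns() if ts_ns is None else ts_ns,
    }


def signing_input(payload: dict) -> bytes:
    """Deterministic bytes handed to the signer (assumed canonical JSON)."""
    return json.dumps(payload, sort_keys=True, separators=(",", ":")).encode("utf-8")


if __name__ == "__main__":
    zero = "sha256:" + "0" * 64  # placeholder hashes, not real values
    payload = build_payload("did:hive:acme", zero, zero, zero, partner_id="fireworks")
    tampered = dict(payload, partner_id="someone-else")
    # Different bytes in means the original signature no longer verifies.
    print(signing_input(payload) != signing_input(tampered))
```

A verifier rebuilds the same bytes from the cert and checks the signature against them, so any edited field shows up as signature_valid: false.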

Stage 05 · WRITEBACK

Today's signed response becomes tomorrow's context.

The certified response gets written back into your dataset. Every signed call makes the next one a little smarter, a little cheaper, and harder to walk away from. The longer you run on HiveWidget, the more provably yours your data becomes. If you don't want this, turn it off with writeback: false.

Tenant moat

writeback.id      8129
writeback.kind    "signed_response"
// future AMPLIFY draws from this

The 8 attributes

Why this is a big deal, not just a feature.

Most “trust” layers give you one of these things. HiveWidget gives you eight, all signed under the same cert, in the same call.

PQ certified

ML-DSA-65 / FIPS 204. Quantum-safe. NIST-standardized. Anyone can check it.

Tenant moat

Your own separate dataset, with writeback turned on by default. Switching gets harder with every call.

Token compressed

25 to 40% smaller on typical enterprise prompts, up to 78% on dense domain text. Real numbers, reported on every call.

Enriched

Your own data gets pulled into the output. Hashed right into the same signature.

Partner attributed

partner_id is locked INSIDE the signed payload. Change one byte and the signature breaks.

Tamper-evident

Three test cases you can run in the UI: clean, tampered response, tampered partner_id.

Co-brandable

Partner pages already wired up. Private labeling available. White-label SDK naming is negotiable.

Revenue-share native

Every signed call carries the partner ID. Figuring out the split is a SQL query, not a phone call.
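
As a rough illustration of the "SQL query, not a phone call" point, here is a self-contained Python sketch using an in-memory SQLite table. The `signed_calls` table and its columns are hypothetical, and a real deployment would query whatever store holds the certs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE signed_calls ("
    " id INTEGER PRIMARY KEY, tenant_did TEXT, partner_id TEXT, ts_ns INTEGER)"
)
conn.executemany(
    "INSERT INTO signed_calls (tenant_did, partner_id, ts_ns) VALUES (?, ?, ?)",
    [
        ("did:hive:acme", "fireworks", 1),
        ("did:hive:acme", "fireworks", 2),
        ("did:hive:acme", "cloudflare", 3),
        ("did:hive:acme", None, 4),  # partner_id is optional
    ],
)


def calls_per_partner(db: sqlite3.Connection) -> list:
    """Count signed calls attributed to each partner."""
    return db.execute(
        "SELECT partner_id, COUNT(*) FROM signed_calls"
        " WHERE partner_id IS NOT NULL"
        " GROUP BY partner_id ORDER BY partner_id"
    ).fetchall()


if __name__ == "__main__":
    for partner, calls in calls_per_partner(conn):
        print(partner, calls)
```

Multiply each count by the agreed per-call price and share rate to get the split. Since partner_id sits inside the signed payload, each counted row can be backed by a cert.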

Cert anatomy

What’s inside the signed payload.

Everything that matters gets signed. Change any field below and checking it returns signature_valid: false.

XCALIBUR Certificate · payload

ML-DSA-65 / FIPS 204

The certificate's schema version. Needed so future versions stay compatible.

tenant_did

A decentralized ID for the customer. It locks the cert to one specific tenant. No one can replay it as another tenant.

prompt_hash

A SHA-256 fingerprint of the cleaned-up prompt. Anyone with the original prompt can recompute it and check the match.

response_sha256

A SHA-256 fingerprint of the raw response sent to the customer. Change the response by even one byte, and this changes, and the signature breaks.

enrichment_sha256

A SHA-256 fingerprint of the enrichment object from AMPLIFY. It ties the added context to this exact response.

partner_id

Optional. Credits the inference provider. It's locked IN the signed payload, so revenue share is provable, not just promised.

ts_ns

A nanosecond timestamp from when it was signed. Used for ordering, replay windows, and audit timelines.
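
The ts_ns field is what makes ordering and replay windows possible. A minimal sketch of both checks follows. The window size is whatever your own policy says, and `within_replay_window` and `in_signing_order` are hypothetical helper names.

```python
import time
from typing import Optional

NS_PER_SECOND = 1_000_000_000


def within_replay_window(ts_ns: int, window_seconds: int,
                         now_ns: Optional[int] = None) -> bool:
    """True if the cert was signed no more than `window_seconds` ago.

    Timestamps from the future are rejected rather than trusted.
    """
    now = time.time_ns() if now_ns is None else now_ns
    age_ns = now - ts_ns
    return 0 <= age_ns <= window_seconds * NS_PER_SECOND


def in_signing_order(payloads: list) -> list:
    """Sort cert payloads oldest-first by their signed timestamp."""
    return sorted(payloads, key=lambda p: p["ts_ns"])


if __name__ == "__main__":
    signed_at = time.time_ns()
    print(within_replay_window(signed_at, window_seconds=300))
```

Because ts_ns is part of the signed payload, nobody can shift a cert into a fresh window without breaking the signature.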

Install paths

Three ways to ship it. Same cert at the end.

Mode 1 · split

You run the LLM. The widget signs.

This is the most common setup. You keep your provider key, call your LLM, then call hive.sign({ prompt, response }). The widget never sees your API key and never routes your inference.

Mode 2 · sealed

The widget wraps the LLM call.

hive.wrap(() => openai.chat.completions.create(req), { promptText }). The widget calls your LLM function, pulls out the response text, and signs it in one shot. That's one block of code instead of two.

// JS / TS: split mode (most common)
import OpenAI from "openai";
import { AmpliHive } from "@hivery/amplihive";

const llm  = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });
const hive = new AmpliHive({
  tenantDid: "did:hive:acme",
  apiKey:    process.env.AMPLIHIVE_API_KEY,
  partnerId: "fireworks",                 // optional, bound into cert
});

const prompt = "Draft a 3-clause SaaS indemnity.";
const r = await llm.chat.completions.create({
  model: "claude_sonnet_4_6",              // illustrative ID; use a model your provider actually serves
  messages: [{ role: "user", content: prompt }],
});
const text = r.choices[0].message.content;

const cert = await hive.sign({ prompt, response: text });
//
// cert.certificate.alg                   → "ML-DSA-65"
// cert.certificate.payload.tenant_did    → "did:hive:acme"
// cert.certificate.payload.partner_id    → "fireworks"
// cert.compression.token_reduction_pct   → 78.15  (when dense)
// cert.stages.{compile,compress,amplify,certify,writeback}

# Python: split mode
import os

import openai
from amplihive import AmpliHive

hive = AmpliHive(
    tenant_did="did:hive:acme",
    api_key=os.environ["AMPLIHIVE_API_KEY"],
    partner_id="cloudflare",          # optional, bound into cert
)

prompt = "Summarize this MRI report at the patient level."
# The module-level openai client reads OPENAI_API_KEY from the environment.
r = openai.chat.completions.create(
    model="gpt_5_5",
    messages=[{"role": "user", "content": prompt}],
)
text = r.choices[0].message.content

cert = hive.sign(prompt=prompt, response=text)
# cert.certificate.alg                 → "ML-DSA-65"
# cert.certificate.payload.partner_id  → "cloudflare"
# cert.compression                     → {"tokens_in": ..., "tokens_out": ..., "token_reduction_pct": ...}

# Python: HiveCompute mode (tier-routed, USDC-settled)
import os

from amplihive import HiveCompute

hive = HiveCompute(
    tenant_did="did:hive:acme",
    api_key=os.environ["AMPLIHIVE_API_KEY"],
    wallet_key=os.environ["BASE_WALLET_KEY"],   # pays 0.02 USDC per call
)

# tier="T1_STANDARD" picks cheapest qualifying model (Claude Sonnet 4.6 today)
# tier="T2_HIGH" routes to GPT-5.5 / Opus 4.7 / Gemini 3.1 Pro
result = hive.complete(
    prompt="Audit this 1099 for ALCOA compliance.",
    tier="T2_HIGH",
    partner_id="alcoa-agentguard",
)

# result.text                          → the response
# result.model_used                    → "claude_opus_4_7"
# result.cert.payload.tenant_did       → "did:hive:acme"
# result.cert.payload.counterparty_did → "did:eth:0x..."
# result.cert.payload.price_atomic     → 20000  (0.02 USDC)
# result.cert.alg                      → "ML-DSA-65"

Models we sign for

Serious models. Three tiers. No filler.

The widget signs whatever provider you bring. Route through HiveCompute instead, and you get tier-based picking across seven frontier models from OpenAI, Anthropic, and Google. No open-weights filler, no random Llama forks, no gimmicks. Tier T0 handles classification, T1 handles standard agent work, T2 handles audit-grade reasoning. You get the same signed cert at every tier.

GPT-5.5 · openai · T2 high

Claude Opus 4.7 · anthropic · T2 high

Gemini 3.1 Pro · google · T2 high

Claude Sonnet 4.6 · anthropic · T1 standard

GPT-5.4 · openai · T1 standard

GPT-5.4 Mini · openai · T0 cheap

Gemini 3 Flash · google · T0 cheap

Bring your own provider key in split mode and the widget never sees it. Route through HiveCompute instead, and we pick the cheapest model that clears your tier, sign it with ML-DSA-65, and settle it in USDC on Base.

Who buys this

Three buyers. One widget.

Direct customer

You ship AI into a regulated buyer

Banks, hospitals, government, pharma, insurers, law firms, F500 procurement teams. Their CISO will ask: “Can you prove this response wasn't tampered with? Can you prove it was made for our account, not someone else's?” HiveWidget is how you say yes.


Inference provider

You sell tokens (Fireworks, Cloudflare, Together, OpenRouter)

Your customers keep asking you for compliance. You don't want to build a whole post-quantum signing setup, and you don't want to compete with your own customers' trust layer. White-label HiveWidget instead. partner_id=you shows up in every cert, and the revenue share is provable.


Platform / gateway

You aggregate AI in front of customers

AI Gateway, MCP gateway, agent platform, RAG SaaS. To close enterprise deals, you need a real story about audits and where the data came from. Lock partner_id=<your-platform> into every cert. It installs on the customer's side, no data ever leaves, and you get the full history.


FAQ

The questions we keep getting.

Does HiveWidget see my prompt or response body?

By default, the widget hashes your inputs locally and sends the plain text to /v1/amplify/sign so AMPLIFY can work against your dataset. If you want hash-only mode, where the actual text never crosses the wire, pass amplify: false and only hashes get sent. The cert still signs the same fields.

Does this slow my inference down?

No. The widget runs after your LLM has already returned its answer. AmpliHive's own overhead is 4.47ms at p50 and 6.05ms at p95. The ML-DSA-65 sign takes about 60ms. Your customer sees the LLM response at the same speed as before, and the cert shows up a fraction of a second later. If you want them running in parallel, use wrap().
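
One way to picture "off the critical path" is the asyncio sketch below. `fake_llm` and `fake_sign` are stand-ins with simulated delays, not SDK calls, and whether the real client exposes an async API is an assumption we're not making here.

```python
import asyncio


async def fake_llm(prompt: str) -> str:
    """Stand-in for your provider call."""
    await asyncio.sleep(0.01)
    return f"response to: {prompt}"


async def fake_sign(prompt: str, response: str) -> dict:
    """Stand-in for signing; sleeps about as long as an ML-DSA-65 sign."""
    await asyncio.sleep(0.06)
    return {"alg": "ML-DSA-65", "signed": True}


async def answer_then_certify(prompt: str):
    """Return the response immediately; signing continues in the background."""
    response = await fake_llm(prompt)
    cert_task = asyncio.create_task(fake_sign(prompt, response))
    return response, cert_task


async def main():
    response, cert_task = await answer_then_certify("hello")
    still_signing = not cert_task.done()  # response is usable before the cert exists
    cert = await cert_task
    return response, still_signing, cert


if __name__ == "__main__":
    print(asyncio.run(main()))
```

The customer-facing path only waits on the LLM. The cert arrives when the background task finishes.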

How is partner_id tamper-proof?

It's a field inside the standard payload that gets signed. Change it to anything else, even one character, and the ML-DSA-65 signature stops checking out. There is no separate Fireworks signing key involved. The tenant's key does the signing, and partner_id is locked into the payload alongside the hashes, so nobody can forge or swap a Fireworks attribution without that tenant key.

What does verify look like?

One POST to /v1/amplify/verify with the cert, plus optionally the original prompt and response. It returns signature_valid: true|false and a true or false for each field's hash match. No shared secret, no API key required. Anyone with the cert can check it.

What happens if I disable writeback?

Set writeback: false when you set up the client, or use skip_writeback: true on a single call. You still get the cert. AMPLIFY still reads from your existing data. The new signed response just doesn't get added to it. Use this for one-off jobs, or when your data is read-only.

Why ML-DSA-65 and not Ed25519?

Quantum computers. ML-DSA-65 is NIST's FIPS 204 standardized signature, built to stay secure even against quantum computers. Ed25519 isn't. Federal buyers and serious enterprise buyers increasingly want cryptography that's ready for that future. ML-DSA-65 qualifies. Ed25519 doesn't.

Can I run the widget against my private model?

Yes. The widget doesn't care which provider produced the response. Pass the text into sign(), or give wrap() a function that pulls the text out of your custom response format. vLLM, Ollama, custom HTTP, all fine.

What’s the pricing?

$0.06 per 1 million signed calls. The partner share is negotiable (30% is our default in the calculators). For volume above 1 billion a month, contact us directly.
