Sign every inference. Even the streaming ones.
Ed25519 receipts signed off the hot path, with enqueue overhead measured in microseconds. ML-DSA-65 PQ-finalized session anchor at close. Five reinforcement layers built for HIPAA voice, PCI call centers, and agentic streams. Offline zero-secret verify on a laptop.
Voice and streaming inference broke every receipt design before this.
A three-minute HIPAA voice call is hundreds of model turns. Old designs forced a choice between latency, tamper detection, and offline audit. You couldn't have all three.
Sign every turn inline
Adds milliseconds to every segment. Latency budget gone before the first packet ships. Voice goes choppy. Models go to the next vendor.
Sign once at session close
Fast on the wire. But a tampered mid-stream segment stays invisible until the call ends, or forever if the attacker also tampers with the close. No mid-stream tamper detection.
Verify online with a key holder
Auditor has to call your service to verify a receipt. Audit becomes a dependency on the vendor. That is not how compliance works. PCI / HIPAA reviewers verify on their own hardware.
One hot path. One off path. One PQ-finalized close.
The producer never waits on signing. The sidecar never blocks the producer. The PQ anchor never runs on the hot path. Three lanes, one verifiable proof.
HOT PATH                        OFF PATH                        AT SESSION CLOSE
--------                        --------                        ----------------

producer --> enqueue(seg) ----+
( voice / LLM stream )        |
                              v
                   +---------------------+      +-----------------------+
                   | sidecar queue       | ---> | per-segment receipt   |
                   | (non-blocking)      |      | Ed25519, off-path     |
                   +---------------------+      +-----------------------+
                                                            |
                                                            v
                                                chain of segment receipts
                                                ( hash-linked, tamper-detectable )
                                                            |
                                                            v
                                                +-------------------+
                                                | ML-DSA-65 anchor  | <-- session.close()
                                                | PQ-finalized      |
                                                +-------------------+
                                                            |
                                                            v
                                                one verifiable proof
                                                ( offline, zero-secret )
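The three lanes can be sketched in a few lines of standard-library Python. This is a minimal illustration, not the shipped sidecar: the class name SketchSidecar and its methods are hypothetical, and HMAC-SHA256 stands in for the Ed25519 signature because the standard library has no Ed25519. A symmetric HMAC would not give zero-secret verification; the sketch only shows the non-blocking enqueue and the hash-linked chain.

```python
import hashlib
import hmac
import queue
import threading


class SketchSidecar:
    """Hypothetical stand-in: enqueue on the hot path, sign on a worker thread."""

    def __init__(self, key: bytes):
        self._key = key
        self._q = queue.SimpleQueue()   # unbounded, so put() never blocks the producer
        self.receipts = []              # appended to only by the worker thread
        self._prev = b"\x00" * 32       # hash link to the previous receipt
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def enqueue(self, seq: int, payload: bytes) -> None:
        # Hot path: hand the segment to the queue and return immediately.
        self._q.put((seq, payload))

    def _run(self) -> None:
        # Off path: hash-link and sign each segment in arrival order.
        while True:
            item = self._q.get()
            if item is None:            # sentinel sent by close()
                return
            seq, payload = item
            digest = hashlib.sha256(
                self._prev + seq.to_bytes(8, "big") + payload
            ).digest()
            # Assumption: HMAC-SHA256 stands in for an Ed25519 signature here.
            sig = hmac.new(self._key, digest, hashlib.sha256).digest()
            self.receipts.append({"seq": seq, "digest": digest, "sig": sig})
            self._prev = digest

    def close(self) -> bytes:
        # Drain the queue, then return the chain head a close-time anchor would sign.
        self._q.put(None)
        self._worker.join()
        return self._prev
```

The producer only ever touches enqueue(). Everything that costs time (hashing, signing, appending) happens on the worker, and close() is the single point where the caller waits for the chain to settle.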
Every layer makes the receipt harder to attack and easier to audit.
Build order in the spec was D → C → A → B → E; the layers are presented here as A through E. Read them in any order, since each one stands on its own. All five are bound into the same signature chain, so any tamper breaks verify.
Drops, resumes, and transfers stay one record.
A transport drop, a network change, or a transfer to a human agent never breaks the chain. The next segment after the gap carries a continuation block referencing both the session root and the prior segment commit.
- sidecar.mark_interruption(cause=transport_close|network_change|transfer)
- sidecar.mark_transfer(to_handler=did:...) for warm-handoff scenarios
- verify_chain accepts seq jumps iff a continuation block is present
- missing continuation block → chain rejected as incomplete
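The continuation rule above can be sketched as a small chain walker. The Segment and Continuation types and the function name verify_chain_sketch are hypothetical and do not reflect the real receipt schema; signatures are left out so the gap logic stays visible.

```python
import hashlib
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Continuation:
    session_root: bytes   # root the session was opened with
    prior_commit: bytes   # commit of the last segment before the gap
    cause: str            # e.g. "transport_close"


@dataclass
class Segment:
    seq: int
    payload: bytes
    prev_commit: bytes
    continuation: Optional[Continuation] = None

    def commit(self) -> bytes:
        # The continuation block is hashed in, so it cannot be stripped later.
        h = hashlib.sha256()
        h.update(self.seq.to_bytes(8, "big"))
        h.update(self.prev_commit)
        if self.continuation is not None:
            c = self.continuation
            h.update(c.session_root + c.prior_commit + c.cause.encode("utf-8"))
        h.update(self.payload)
        return h.digest()


def verify_chain_sketch(session_root: bytes, chain: List[Segment]) -> Tuple[bool, str]:
    prev_commit, prev_seq = session_root, None
    for seg in chain:
        if seg.prev_commit != prev_commit:
            return False, f"seq {seg.seq}: broken hash link"
        if prev_seq is not None:
            if seg.seq <= prev_seq:
                return False, f"seq {seg.seq}: sequence did not advance"
            if seg.seq != prev_seq + 1:
                # A jump is accepted only with a matching continuation block.
                c = seg.continuation
                if c is None:
                    return False, f"seq {seg.seq}: gap without continuation block"
                if c.session_root != session_root or c.prior_commit != prev_commit:
                    return False, f"seq {seg.seq}: continuation does not match chain"
        prev_commit, prev_seq = seg.commit(), seg.seq
    return True, "ok"
```

A chain with seq 0, 1, 5 passes only when segment 5 carries a continuation that names both the session root and the commit of segment 1; the same chain without that block is reported as incomplete.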
Every segment names whether the model or the counterparty said it.
source_role is required on every SegmentInput. Bound into the segment signature, so tampering with attribution breaks verify. The asserts field declares the limit explicitly: this is origin within the session, not civil identity.
- source_role: "model_side" | "counterparty_side" (required field)
- asserts: stream_provenance_only; source_role_is_origin_not_identity
- tamper test: flipping source_role on one segment fails offline verify
- no overclaim — receipt never asserts who the counterparty actually is
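One way to picture the binding: the digest a signer would sign covers source_role, so flipping the role yields a different digest and the original signature no longer matches. The field layout and the function name segment_digest below are assumptions for illustration, not the actual SegmentInput encoding.

```python
import hashlib
import json

ALLOWED_ROLES = {"model_side", "counterparty_side"}


def segment_digest(seq: int, output: str, source_role: str) -> bytes:
    """Digest a signer would sign; hypothetical layout, not the real schema."""
    if source_role not in ALLOWED_ROLES:
        raise ValueError(f"source_role must be one of {sorted(ALLOWED_ROLES)}")
    body = {
        "seq": seq,
        "output": output,
        "source_role": source_role,
        # The limit of the claim travels with the segment.
        "asserts": ["stream_provenance_only", "source_role_is_origin_not_identity"],
    }
    # Canonical JSON (sorted keys, no extra whitespace) keeps the digest stable.
    canonical = json.dumps(body, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(canonical).digest()
```

Because the role is inside the signed bytes, an attacker who relabels a counterparty utterance as model output has to forge a new signature, not just edit a field.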
The "~0ms" claim is measured and signed, not asserted.
The sidecar measures its own hot-path overhead per enqueue and embeds {max, p99, mean, declared_budget_us, within_budget, n_samples} in the SessionReceipt. The attestation is bound into the PQ signature. Mutate it and PQ verify fails.
- measured max in smoke test: 11.59 µs against a 100 µs declared budget
- within_budget = (max ≤ declared_budget_us); reported, not just claimed
- mutating declared_budget_us after the fact breaks ML-DSA-65 verify
- n_samples reported so auditor sees the population size
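The attestation fields can be derived from raw per-enqueue timings as in the sketch below. The function name build_latency_attestation and the nearest-rank p99 method are assumptions; the source does not say how the sidecar computes its percentile.

```python
import statistics
from typing import Dict, List, Union


def build_latency_attestation(
    samples_us: List[float], declared_budget_us: float
) -> Dict[str, Union[float, int, bool]]:
    """Summarize hot-path overhead samples (microseconds) into the attested fields."""
    if not samples_us:
        raise ValueError("need at least one sample")
    ordered = sorted(samples_us)
    n = len(ordered)
    # Nearest-rank p99 using integer math: rank = ceil(0.99 * n).
    rank = (99 * n + 99) // 100
    worst = ordered[-1]
    return {
        "max": worst,
        "p99": ordered[rank - 1],
        "mean": statistics.fmean(ordered),
        "declared_budget_us": declared_budget_us,
        "within_budget": worst <= declared_budget_us,
        "n_samples": n,
    }
```

Samples could be collected by wrapping each enqueue call with time.perf_counter_ns() and dividing the difference by 1000. The resulting dictionary is what would then be serialized into the session receipt before the close-time signature is computed.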
Live receipts and sealed sessions are visibly different.
Every segment receipt carries verification_state="live_provisional" and sig_alg="ed25519". The SessionReceipt at close carries verification_state="sealed" and sig_alg="ML-DSA-65". Auditors and downstream systems can't confuse a per-segment receipt with a final session anchor.
- segment receipts: live_provisional / Ed25519 (fast, off-path)
- session receipt: sealed / ML-DSA-65 (post-quantum substrate)
- verify path picks the right algorithm based on the state field
- no possibility of replaying a live receipt as a sealed one
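A verifier that dispatches on the state field might look like the following sketch. The receipt keys "message" and "signature", the function name verify_receipt, and the caller-supplied verifier callables are hypothetical; real Ed25519 and ML-DSA-65 verification would come from a cryptography library, not from this code.

```python
from typing import Callable, Dict, Tuple

# (public_key, message, signature) -> True if the signature is valid
Verifier = Callable[[bytes, bytes, bytes], bool]

# Each state is pinned to exactly one algorithm.
EXPECTED_ALG = {"live_provisional": "ed25519", "sealed": "ML-DSA-65"}


def verify_receipt(
    receipt: dict, verifiers: Dict[str, Verifier], keys: Dict[str, bytes]
) -> Tuple[bool, str]:
    state = receipt.get("verification_state")
    alg = receipt.get("sig_alg")
    if state not in EXPECTED_ALG:
        return False, f"unknown verification_state: {state!r}"
    if alg != EXPECTED_ALG[state]:
        return False, f"sig_alg {alg!r} not allowed for state {state!r}"
    # Only the verifier pinned to this state is ever called.
    ok = verifiers[alg](keys[alg], receipt["message"], receipt["signature"])
    return (True, "ok") if ok else (False, "bad signature")
```

Relabeling a live receipt as sealed fails in one of two ways: either the algorithm no longer matches the state, or the Ed25519 signature is handed to the ML-DSA-65 verifier and rejected.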
PCI / PHI never lands in the receipt. The receipt still proves it was there.
A card number, an SSN, or a medical record value is redacted in the input before the segment is enqueued. Only {pos, class, commit, in_boundary} go into the receipt. The cleartext never enters the chain. Later, an authorized auditor with the salt + value can prove the held value matches the committed span.
- redact_text(cleartext, spans) → (redacted_text, span_dicts, salts)
- commit = SHA-256(domain || salt || class || value) — non-invertible
- verify_disclosure(span, salt, value) is the selective-disclosure check
- wrong value fails, wrong salt fails, wrong class fails
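The commitment and the selective-disclosure check can be sketched with the standard library alone. Everything here is illustrative: the DOMAIN tag, the length-prefixed field encoding, the placeholder text, and the names commit_span, redact_span, and verify_disclosure_sketch are assumptions, not the hive_stream API.

```python
import hashlib
import hmac
import secrets
from typing import Tuple

DOMAIN = b"afir-stream/redaction/v1"   # hypothetical domain-separation tag


def _field(data: bytes) -> bytes:
    # Length-prefix each field so ("ab", "c") and ("a", "bc") cannot collide.
    return len(data).to_bytes(4, "big") + data


def commit_span(salt: bytes, pii_class: str, value: str) -> str:
    # commit = SHA-256(domain || salt || class || value), fields length-prefixed.
    h = hashlib.sha256()
    for part in (DOMAIN, salt, pii_class.encode("utf-8"), value.encode("utf-8")):
        h.update(_field(part))
    return h.hexdigest()


def redact_span(
    text: str, start: int, end: int, pii_class: str, in_boundary: bool = True
) -> Tuple[str, dict, bytes]:
    # Replace text[start:end] with a placeholder; keep only a salted commitment.
    salt = secrets.token_bytes(32)     # fresh random salt per span
    span = {
        "pos": [start, end],
        "class": pii_class,
        "commit": commit_span(salt, pii_class, text[start:end]),
        "in_boundary": in_boundary,
    }
    redacted = text[:start] + "[" + pii_class.upper() + "]" + text[end:]
    return redacted, span, salt


def verify_disclosure_sketch(span: dict, salt: bytes, value: str) -> bool:
    # Recompute the commitment from the disclosed salt and value.
    expected = commit_span(salt, span["class"], value)
    return hmac.compare_digest(expected, span["commit"])
```

The random per-span salt matters for low-entropy values such as card numbers: without it, an attacker could hash every candidate value and compare against the commit. The salt stays with the party holding the cleartext and is handed to the auditor only at disclosure time.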
Acceptance proven. Numbers measured. Tamper rejected.
Both test suites run from a fresh checkout. Five base criteria, five reinforcement criteria. Ten of ten green. Bench overhead measured on a 200-segment, 300ms-cadence stream.
Acceptance smoke 10 of 10 pass
- 1 ✓ ~0ms added latency on the live stream
- 2 ✓ Session receipt reconstructs from the segment chain
- 3 ✓ Offline zero-secret verify of the whole chain
- 4 ✓ Tampering one segment breaks the chain
- 5 ✓ PQ-finalize session root post-stream and verify offline
- 6 ✓ Layer D: live_provisional segments vs sealed session
- 7 ✓ Layer C: latency attestation within budget, bound into PQ sig
- 8 ✓ Layer A: drop + resume verifies end-to-end via continuation
- 9 ✓ Layer B: source_role bound; tamper breaks verify
- 10 ✓ Layer E: PCI never in receipt; selective disclosure matches
Bench & latency attestation
200 segments at 300ms cadence. Off-path discipline holds end-to-end. Numbers ship with every session receipt as signed evidence.
Wrap your stream. Issue receipts. Verify offline.
The sidecar lives next to your producer — LLM emitter, voice TTS, agentic loop, whatever. You call enqueue() per segment. It returns immediately. At session close you get back one PQ-finalized receipt. Anyone with the public keys can verify the whole chain offline.
from hive_stream import StreamSidecar, SegmentInput, Grounding, LiveSigner, PQFinalizer
from hive_stream.boundary import ComplianceBoundary
from hive_stream.redaction import redact_text

# 1. open a sidecar (hot path is whatever follows; this runs once per session)
sidecar = StreamSidecar(
    session_id="did:hive:tenant:my-call-center",
    model_ref="mir:hivecompute/voice-pci-v1",
    live=LiveSigner.from_key(LIVE_ED25519_KEY),
    pq=PQFinalizer.from_key(PQ_MLDSA_KEY),
    declared_budget_us=100,
).start()

# 2. for each segment in the live stream -- this is the hot path
for seg in voice_stream:
    redacted_input, spans_in, salts_in = redact_text(seg.heard, seg.pii_spans_in)
    redacted_output, spans_out, salts_out = redact_text(seg.said, seg.pii_spans_out)
    sidecar.enqueue(SegmentInput(
        seq=seg.i,
        t_start=seg.t0,
        t_end=seg.t1,
        input_window=redacted_input,
        output=redacted_output,
        model_ref="mir:hivecompute/voice-pci-v1",
        grounding=Grounding(score=seg.grounding_score, claims_root=seg.claim_hash),
        boundary=ComplianceBoundary(in_scope=True, pii_class="phi"),
        source_role="model_side",              # Layer B
        redacted_spans=spans_in + spans_out,   # Layer E
    ))
    # enqueue() returns immediately. signing happens off the hot path.

    # 3. mid-stream interruption? one line. (Layer A)
    if transport.dropped():
        sidecar.mark_interruption(cause="transport_close")

# 4. close the session -- ML-DSA-65 PQ-finalized anchor
session_receipt = sidecar.close_session()
# session_receipt.verification_state == "sealed"   (Layer D)
# session_receipt.latency_attestation              (Layer C)

# verify offline anywhere:
# from hive_stream.session import verify_chain
# ok, why = verify_chain(sidecar.chain, LIVE_PUBLIC_KEY)
Built for streams that have to be auditable on the next business day.
Anywhere a model is talking to a person about regulated material.
HIPAA voice
Telehealth intake, behavioral-health sessions, voice agents touching PHI. Layer E keeps the data class on the receipt; the data itself stays out.
PCI call centers
Card-not-present voice flows where PANs must be auditable without ever entering log storage. PCI scope shrinks; auditability grows.
Agentic streaming
Long-running agent loops where the model talks to other models or tools over hours. Continuity (Layer A) means transfers and reconnects stay one record.
LLM proxies for regulated industries
If you re-sell inference to banks, insurers, or hospitals, AFiR-Stream is the receipt layer your customers' auditors will ask for first.
smoke/smoke_afir_stream.py) and bench (bench/bench_afir_stream.py) checked into the AFiR-Stream substrate. Receipt schema and offline verifier ship with the API key. SiGR, MiR, GCA, GiTM, QPuF, EvAR, MaR, MiR-M, MiR-Plus, MPP, AFiR-RC, AFiR-CacheSign, and AFiR-Manifest live at /.