The frontier model only sees real speech. And the routing is signed.
Most of a voice call is not speech that needs a frontier model: it is silence, hold music, background noise, DTMF tones. AFiR-Stream-Route classifies each segment with a cheap edge classifier, routes it to the cheapest sufficient model, and binds the routing decision into the segment receipt. The customer's model cost drops. The proof is the moat: which model handled which segment class, signed, per segment.
Compression is the wrong lever. Routing is the right one — if you can prove it.
Audio is already compressed and already binary; the cost and latency live in the model's forward pass, not in audio transport. The lever that works is sending less audio to the expensive model. But content-aware routing already exists — what does not exist is a tamper-evident record of which model actually handled each segment class. In regulated voice, that record is the audit answer.
Wrong bottleneck
Shrinking already-compressed audio does not make the model think faster, and it risks discarding signal the model needs. The cost is in the forward pass. Compression targets transport, which was never the constraint.
No provenance of the route
Routing silence away from the frontier model saves money — but with no signed record, you cannot prove the compliant model handled the speech that carried PHI. An unsigned routing log is an assertion, not evidence.
Alteration stays invisible
If the routing table lives in a vendor database with no tamper-evident accumulator over the per-segment decisions, a silent rewrite (for example, claiming a cheap-model segment went to the compliant model) is undetectable, and there is no portable audit trail.
Classify, route, sign — the routing decision is bound into the segment receipt.
A lightweight classifier runs at the edge as segments arrive from AFiR-Stream. Each segment gets a content class and routes to the cheapest sufficient model, and the routing block (content class, classifier identity, model handled, policy commitment) binds into the segment signature. No signer-core change: it consumes the live SiGR signer and does not modify it.
```text
EDGE CLASSIFIER            ROUTE BY CONTENT CLASS                  SIGN + VERIFY
---------------            ----------------------                  -------------

segment in --> content_class --+
( AFiR-Stream segmentation )   |
                               v
+-------------------+      speech  --> asr-frontier-compliant
| classifier        |      overlap --> asr-frontier-diarize
| mir:vad-class-v2  |      silence --> drop ( no model )
+-------------------+      noise   --> cheap-classifier
         |                 tones   --> dtmf-decoder
         v
materiality (AFiR-S) override?
high-consequence segment always routes to compliant frontier model
         |
         v
+------------------------+      +----------------------+
| routing block          | ---> | ML-DSA-65 signature  | <-- per segment
|   content_class        |      | bound into receipt   |
|   classifier_ref       |      +----------------------+
|   routed_to ( model )  |                 |
|   policy_id            |                 v
|   materiality_override |      offline, zero-secret verify
+------------------------+      ( rewrite a route -> verify fails )
```
Each content class to the cheapest model that fits it.
The policy in effect is committed by hash and bound into every segment receipt. Change the policy and the commitment changes — the route a customer was billed for is provable against the policy that produced it.
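A minimal sketch of what "committed by hash" could look like. The policy shape, the canonical serialization, and the use of SHA-256 are assumptions made for illustration; the source does not state how policy_id is actually derived. The model names are the ones shown in the routing diagram.

```typescript
import { createHash } from "node:crypto";

// Hypothetical policy shape: one model reference per content class.
type ContentClass = "speech" | "overlap" | "silence" | "noise" | "tones";
type RoutingPolicy = Record<ContentClass, string>;

// Canonical form: classes sorted, so the same policy always serializes
// to the same bytes regardless of key order in the source object.
function canonicalize(policy: RoutingPolicy): string {
  const classes = (Object.keys(policy) as ContentClass[]).sort();
  return JSON.stringify(classes.map((c) => [c, policy[c]]));
}

// Assumption: SHA-256 over the canonical form stands in for policy_id.
function commitPolicy(policy: RoutingPolicy): string {
  return createHash("sha256").update(canonicalize(policy), "utf8").digest("hex");
}

const policy: RoutingPolicy = {
  speech: "asr-frontier-compliant",
  overlap: "asr-frontier-diarize",
  silence: "drop",
  noise: "cheap-classifier",
  tones: "dtmf-decoder",
};

// Same routes, different key order.
const reordered: RoutingPolicy = {
  tones: "dtmf-decoder",
  noise: "cheap-classifier",
  silence: "drop",
  overlap: "asr-frontier-diarize",
  speech: "asr-frontier-compliant",
};

const policyId = commitPolicy(policy);
const reorderedId = commitPolicy(reordered);
// Changing a single route changes the commitment.
const changedId = commitPolicy({ ...policy, noise: "asr-frontier-compliant" });

console.log(policyId === reorderedId); // true
console.log(policyId === changedId); // false
```

Sorting before hashing matters here: without a canonical form, two semantically identical policies could produce different commitments.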
Four properties. Each one tested. Each one breaks verify when attacked.
All four bind into the same per-segment signature. Any tamper — to a content class, a routed-to model, a policy commitment, or an override flag — breaks the recomputed root and fails offline verify.
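One way a "recomputed root" could behave, sketched as a simple hash chain over a call's routing blocks. The chain construction, the field order, and the placeholder policy_id value are assumptions; the source does not describe the accumulator's actual structure.

```typescript
import { createHash } from "node:crypto";

// The four bound fields named above (hypothetical shape).
interface RoutingBlock {
  content_class: string;
  routed_to: string;
  policy_id: string;
  materiality_override: boolean;
}

const sha256 = (s: string): string =>
  createHash("sha256").update(s, "utf8").digest("hex");

// Fixed field order keeps the leaf hash deterministic.
function leafHash(b: RoutingBlock): string {
  return sha256(
    JSON.stringify([b.content_class, b.routed_to, b.policy_id, b.materiality_override]),
  );
}

// Hash chain: root_n = H(root_{n-1} || leaf_n), starting from a fixed seed.
function chainRoot(blocks: RoutingBlock[]): string {
  return blocks.reduce((acc, b) => sha256(acc + leafHash(b)), sha256("demo-genesis"));
}

const POLICY = "demo-policy"; // placeholder, not a real policy_id
const call: RoutingBlock[] = [
  { content_class: "speech", routed_to: "asr-frontier-compliant", policy_id: POLICY, materiality_override: false },
  { content_class: "silence", routed_to: "drop", policy_id: POLICY, materiality_override: false },
  { content_class: "noise", routed_to: "cheap-classifier", policy_id: POLICY, materiality_override: false },
];
const signedRoot = chainRoot(call);

// One tampered copy per bound field, each altering only the last segment.
const patch = (p: Partial<RoutingBlock>): RoutingBlock[] =>
  call.map((b, i) => (i === 2 ? { ...b, ...p } : b));

const tampered: RoutingBlock[][] = [
  patch({ content_class: "speech" }),
  patch({ routed_to: "asr-frontier-compliant" }),
  patch({ policy_id: "other-policy" }),
  patch({ materiality_override: true }),
];

const allDetected = tampered.every((t) => chainRoot(t) !== signedRoot);
console.log(allDetected); // true
```

In a real deployment the root would be covered by the signature, so a verifier recomputes it from the receipts and compares against the signed value.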
Every segment is bound to the model that actually handled it.
The content class, the classifier identity, and the model the segment was routed to are inputs to the segment receipt. The binding of a segment to its route is itself tamper-evident — rewrite which model handled a segment and the recomputed root fails verify.
- routing: { content_class, classifier_ref, routed_to }
- policy_id commits the routing policy in effect
- tamper test: rewrite routed_to fails offline verify
- each segment provably tied to the model that processed it
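The tamper test in the list above can be sketched with Node's built-in crypto. Important assumption: this uses Ed25519 as a stand-in, not ML-DSA-65, purely so the example runs on a stock Node install. The receipt shape, the signed-bytes layout, and the helper names are hypothetical.

```typescript
import { createHash, generateKeyPairSync, sign, verify } from "node:crypto";

interface Routing {
  content_class: string;
  classifier_ref: string;
  routed_to: string;
}

interface Receipt {
  segment_sha256: string;
  routing: Routing;
  policy_id: string;
  signature: string; // base64
}

// Bytes that get signed: the segment digest plus every routing field, fixed order.
function signedBytes(r: Omit<Receipt, "signature">): Buffer {
  return Buffer.from(
    JSON.stringify([
      r.segment_sha256,
      r.routing.content_class,
      r.routing.classifier_ref,
      r.routing.routed_to,
      r.policy_id,
    ]),
    "utf8",
  );
}

// Stand-in key pair: Ed25519, NOT the ML-DSA-65 scheme named in the text.
const { publicKey, privateKey } = generateKeyPairSync("ed25519");

function signReceipt(audio: Buffer, routing: Routing, policy_id: string): Receipt {
  const segment_sha256 = createHash("sha256").update(audio).digest("hex");
  const body = { segment_sha256, routing, policy_id };
  const signature = sign(null, signedBytes(body), privateKey).toString("base64");
  return { ...body, signature };
}

// Verification needs only the public key: no shared secret.
function verifyReceipt(r: Receipt): boolean {
  return verify(null, signedBytes(r), publicKey, Buffer.from(r.signature, "base64"));
}

const receipt = signReceipt(
  Buffer.from("demo audio bytes"),
  { content_class: "noise", classifier_ref: "mir:vad-class-v2", routed_to: "cheap-classifier" },
  "demo-policy", // placeholder, not a real policy_id
);

// Rewrite routed_to to claim the compliant model handled a cheap segment.
const forged: Receipt = {
  ...receipt,
  routing: { ...receipt.routing, routed_to: "asr-frontier-compliant" },
};

console.log(verifyReceipt(receipt)); // true
console.log(verifyReceipt(forged)); // false
```

Because routed_to is part of the signed bytes, the forged receipt carries a signature that no longer matches its contents.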
The compliant model handled every sensitive segment — signed.
The moat. In regulated voice the audit question is "did a compliant model handle the PHI-bearing speech segment?" Because routed_to names the attested compliance model and binds into the signature, that question has a signed, per-segment answer nobody else can produce.
- routed_to names the attested compliance model
- per-segment, signed, verifiable without a shared secret
- "compliant model handled the sensitive segment" is provable
- an audit answer no unsigned routing log can give
You never cheap out on the segment that matters.
A segment flagged high-consequence by AFiR-S — a payment instruction, a consent statement — always routes to the compliant frontier model, overriding a borderline content classification. The override is recorded and bound, so the escalation is itself provable.
- consequence / materiality measure overrides content class
- materiality_override flag bound into the receipt
- smoke test: a borderline noise segment at 9200 bp consequence was escalated
- provable that the material segment got the compliant model
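The override rule above might be sketched as follows. The threshold value is an assumption: the source says only that a segment at 9200 bp was escalated, not where the cutoff sits. Setting the flag only when the route actually changes is also a design choice made for this sketch.

```typescript
type ContentClass = "speech" | "overlap" | "silence" | "noise" | "tones";

interface RouteDecision {
  content_class: ContentClass;
  routed_to: string;
  materiality_override: boolean;
}

// Base routes as shown in the routing diagram.
const BASE_ROUTES: Record<ContentClass, string> = {
  speech: "asr-frontier-compliant",
  overlap: "asr-frontier-diarize",
  silence: "drop",
  noise: "cheap-classifier",
  tones: "dtmf-decoder",
};

const COMPLIANT_MODEL = "asr-frontier-compliant";

// Hypothetical cutoff in basis points (10000 bp = 100%); not from the source.
const MATERIALITY_THRESHOLD_BP = 9000;

function routeSegment(content_class: ContentClass, consequence_bp: number): RouteDecision {
  const base = BASE_ROUTES[content_class];
  // Escalate only when the consequence is high AND the base route is cheaper
  // than the compliant model; the flag records that the route was changed.
  const escalate = consequence_bp >= MATERIALITY_THRESHOLD_BP && base !== COMPLIANT_MODEL;
  return {
    content_class,
    routed_to: escalate ? COMPLIANT_MODEL : base,
    materiality_override: escalate,
  };
}

const borderline = routeSegment("noise", 9200);
const ordinary = routeSegment("noise", 100);

console.log(borderline.routed_to, borderline.materiality_override);
// asr-frontier-compliant true
console.log(ordinary.routed_to, ordinary.materiality_override);
// cheap-classifier false
```

The content class is left unchanged on escalation, so the receipt still shows what the classifier said alongside the fact that materiality overrode it.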
The receipt asserts the routing assignment — never that it was optimal.
The asserts field is set to routing_provenance_only. It proves the named segment was assigned the named class and routed to the named model under the named policy. It does not claim the routing was optimal, the classification correct, or the transcription accurate. No overclaim.
- asserts: "routing_provenance_only"
- routing proves assignment, not optimality
- classification correctness is scoped out, declared on the artifact
- composes on stream provenance + PQ session sealing
Reduction to practice. Real values. Tamper rejected.
A twenty-segment call, classified and routed from a fresh run against the live signer. Six smoke criteria, six green. The numbers and hashes below are the actual outputs of that test run.
Smoke test: 6 of 6 pass
- 1 ✓ Mixed stream classified and routed per policy, all twenty segments
- 2 ✓ Silence and noise routed away from the frontier model
- 3 ✓ Every segment receipt verifies offline, zero-secret, untampered
- 4 ✓ Rewriting a cheap-model route to the compliant model breaks the signed root
- 5 ✓ A materiality-flagged borderline segment overrode to the compliant model
- 6 ✓ One ML-DSA-65 signature per segment, routing block bound in
Signed test-run outputs
Twenty segments: 6 speech, 1 overlap, 7 silence, 4 noise, 2 non-speech tones. Frontier model saw 8 of 20. Seven silence segments dropped to no model. One borderline noise segment at 9200bp consequence overrode to the compliant model. All twenty receipts verify untampered.
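As a quick consistency check, the counts above can be tallied directly. This is arithmetic on the reported numbers only; it assumes both frontier-named models in the diagram count toward "frontier model saw", and it says nothing about cost, since segment durations are not given.

```typescript
type ContentClass = "speech" | "overlap" | "silence" | "noise" | "tones";

// Segment counts as reported for the test run.
const counts: Record<ContentClass, number> = {
  speech: 6,
  overlap: 1,
  silence: 7,
  noise: 4,
  tones: 2,
};

// One borderline noise segment was escalated by materiality.
const noiseOverrides = 1;

const total = Object.values(counts).reduce((a, b) => a + b, 0);

// Frontier-bound: speech and overlap by class, plus the overridden noise segment.
const frontierSeen = counts.speech + counts.overlap + noiseOverrides;

// Silence goes to no model at all.
const dropped = counts.silence;

// Everything else: remaining noise to the cheap classifier, tones to the DTMF decoder.
const cheapHandled = counts.noise - noiseOverrides + counts.tones;

console.log(total); // 20
console.log(frontierSeen); // 8
console.log(dropped); // 7
console.log(cheapHandled); // 5
```

The three buckets sum back to twenty (8 + 7 + 5), which matches the reported run.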
Classify the segment. Route it. Sign the route. Verify offline.
Drop the classifier stage in front of your AFiR-Stream signer. Each segment gets routed to the cheapest sufficient model and gets one PQ-anchored receipt with the routing decision bound in. Anyone with the public key verifies the whole call offline.
```typescript
import { classify, route, signSegment, verify } from "@hive/afir-stream-route"

// 1. classify each segment at the edge -- cheap, microseconds, no frontier cost
const cls = classify(segment) // speech | silence | noise | non_speech_signal | overlap

// 2. route to the cheapest sufficient model -- materiality can override
const r = route(cls, {
  policy: ROUTING_POLICY,                 // policy_id = 0xa04681...32fe46
  materiality_bp: segment.consequence_bp, // high-consequence -> compliant model
})
// r.routed_to -> "mir:asr-frontier-compliant" (only for real speech)
// silence -> "drop"   noise -> cheap-classifier   tones -> dtmf-decoder

// 3. bind the routing block into the segment signature -- one PQ receipt per segment
const receipt = await signSegment(segment, {
  routing: r,                         // content_class, classifier_ref, routed_to, policy_id
  asserts: "routing_provenance_only", // assignment, not optimality
})

// 4. anyone verifies offline -- no shared secret, no call home
const { ok } = await verify(receipt, PUBLIC_KEY)
// rewrite routed_to to claim the compliant model handled a cheap segment -> ok === false
```
Built for real-time voice that has to be auditable.
Anywhere a regulated voice stream routes across models and an auditor will ask which model handled the sensitive segment.
HIPAA voice agents
PHI-bearing speech provably handled by the compliant ASR model, per segment. Silence and hold music route away — the model cost drops, the proof holds.
PCI call centers
Payment-instruction segments escalate to the compliant model by materiality, signed. Card-tone DTMF goes to a cheap decoder, never the frontier model.
Voice infra & CCaaS
Cut frontier-ASR spend on the non-speech fraction of every call while selling auditable routing proof to your regulated buyers. The meter still spins on receipts.
Verified Agents
Efficient real-time agents that prove they routed efficiently: no overcharging by sending silence to the frontier model, no cheaping out on what matters.