`router/v1` — Intent Classification + Sink Routing¶

Status: Draft (design-locked, ready for first implementation) · Stability: v1 will be frozen with the first reference router (router-v1) · Implementations: in-tree only until v1 is frozen.

The router/v1 surface sits between asr/v1 (which produces transcribed text in partially-filled IntentEnvelopes) and the orchestrator (which fans envelopes out to sinks per sink/v1). The router does two jobs:

Classify the envelope's intent (Intent.Kind, Intent.Confidence, optional Intent.Reasoning).
Route the envelope to sinks (Routing.PrimarySink, Routing.AlsoTo, Routing.Suppress).

Optionally, the router also derives envelopes — splitting compound utterances, rewriting transcripts via coreference, emitting session summaries, and dropping noise.

This document is the contract. Routers that conform to it can be loaded by any version of the Vox core that supports router/v1.

Scope¶

router/v1 covers:

The router's input contract (partially-filled IntentEnvelope)
The classifier chain (pluggable: rules / local-model / LLM)
Per-stream history (shallow context window)
Routing rule format (declarative table + per-source overrides)
Suppress patterns (kill-switch list)
Bundled routing presets
Derived envelope rules (splitting, coreference, summaries)
Drop rules
Provenance stamping for derived envelopes
Router interface lifecycle
Error semantics
Audit hook contract
Versioning and stability rules

router/v1 does not cover:

Audio capture or transcription (capture/v1, asr/v1)
Speaker diarization (segment/v1)
Sink delivery (sink/v1)
The orchestrator that drives the pipeline (internal)
Predicate-based dynamic routing rules (deferred to v1.x — additive)
RBAC / authz checks on routing decisions (Enterprise: authz/v1)

Input — what the router consumes¶

The router receives a partially-filled IntentEnvelope (full schema in sink-v1.md). When asr/v1 finishes transcribing a speech segment, the envelope has:

Populated: Identity (EnvelopeID, SessionID, StreamID, ParentID), Time span (StartedAt, EndedAt, Duration), Content (Transcript, Language, Confidence), Speaker (Label, SourceKind, optional Embedding), Provenance (ASRBackend, SegmenterImpl, CapturedAt, Pipeline), optional AudioRef.
Empty: Intent block, Routing block.

The router fills Intent and Routing, then emits the completed envelope.

Per-stream history¶

The router maintains a per-stream sliding window of recent envelopes for coreference and short-term context.

router:
  history_depth: 1                    # default: just the previous envelope
                                      # range: 0 (stateless) to 20

Per-stream, not per-session — different streams (your mic vs system audio) keep independent histories.
Read-only from the classifier's perspective. The router never mutates past envelopes; if it needs to derive new content from past context, it emits a new envelope with ParentID set.
Stateless mode (history_depth: 0) is supported for deterministic-replay and diagnostic use cases.
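A minimal sketch of how such a window could be kept, assuming envelopes are plain dicts. The `StreamHistory` class and its method names are hypothetical and not part of the router/v1 contract; the sketch only illustrates the per-stream, bounded, read-only properties listed above.

```python
from collections import defaultdict, deque


class StreamHistory:
    """Hypothetical per-stream sliding window; not part of the router/v1 contract."""

    def __init__(self, history_depth: int = 1):
        if not 0 <= history_depth <= 20:
            raise ValueError("history_depth must be in the range 0..20")
        # One bounded deque per StreamID; deque(maxlen=0) stays empty (stateless mode).
        self._windows = defaultdict(lambda: deque(maxlen=history_depth))

    def record(self, stream_id: str, envelope: dict) -> None:
        # Store a copy so later changes by the caller cannot alter history.
        self._windows[stream_id].append(dict(envelope))

    def recent(self, stream_id: str) -> list:
        # Return copies, oldest first, so classifiers get a read-only view.
        return [dict(e) for e in self._windows[stream_id]]


history = StreamHistory(history_depth=1)
history.record("mic", {"EnvelopeID": "e1", "Transcript": "remind me to email Sarah"})
history.record("mic", {"EnvelopeID": "e2", "Transcript": "do that for next Tuesday"})
history.record("system-audio", {"EnvelopeID": "e3", "Transcript": "welcome everyone"})
```

With `history_depth: 1`, the "mic" stream retains only `e2`, and the "system-audio" stream is unaffected by anything recorded on "mic".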

Classifier Chain¶

The router doesn't classify directly. It dispatches to classifiers in a configurable chain. Each classifier returns {intent_kind, confidence, reasoning} or "abstain".

Three classifier families (v1)¶

Classifier	Cost	Latency budget	Determinism	Default in chain?
`rules`	Free	10 ms	Yes	Yes
`local-model`	Free	500 ms	Mostly	Yes
`llm`	Tokens	5 s	No	No (opt-in)

Chain semantics¶

Classifiers run in declared order.
Each classifier returns confidence ∈ [0.0, 1.0] or abstains.
First classifier with confidence ≥ min_confidence_threshold wins; chain stops.
If all classifiers abstain or fall below threshold → Intent.Kind = unclassified → routes per routes.unclassified (fallback).
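The four rules above can be sketched as a short loop. This is an illustration under stated assumptions, not the reference implementation: `run_chain` is a hypothetical name, classifiers are modelled as plain functions that return `(intent_kind, confidence)` or `None` to abstain, and the two toy classifiers stand in for the real `rules` and `local-model` families.

```python
def run_chain(transcript, chain, min_confidence_threshold=0.7):
    """Run classifiers in declared order; the first result at or above the threshold wins."""
    for classifier in chain:
        result = classifier(transcript)
        if result is None:
            continue  # classifier abstained
        kind, confidence = result
        if confidence >= min_confidence_threshold:
            return kind, confidence  # chain stops here
    # Every classifier abstained or fell below the threshold.
    return "unclassified", 0.0


def rules(transcript):
    # Toy stand-in for the rules classifier: only recognizes a trailing "?".
    return ("question", 0.75) if transcript.endswith("?") else None


def local_model(transcript):
    # Toy stand-in for the local-model classifier: always answers, never confidently.
    return ("note", 0.4)


confident = run_chain("what time is it?", [rules, local_model])
fallback = run_chain("hmm okay", [rules, local_model])
```

Here `confident` is decided by the first classifier, while `fallback` ends up `unclassified` because the only non-abstaining answer (0.4) is below the 0.7 threshold.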

Configuration¶

router:
  history_depth: 1
  min_confidence_threshold: 0.7

  classifier_chain:
    - type: rules
      patterns_file: ~/.vox/router/rules.yaml
    - type: local-model
      model_path: ~/.vox/router/intent-classifier.onnx
    # llm classifier omitted by default — opt-in
    # - type: llm
    #   sink: llm-anthropic              # reuse an existing LLM sink in "classify" mode
    #   system_prompt_template: ~/.vox/router/classify.tmpl

`rules` classifier¶

Lightweight regex / keyword / verb-pattern matching. The file format:

rules:
  - intent: command
    patterns:
      - "^(create|make|add|file|open)\\s+(a |an )?(bd |beads |task |issue |ticket)"
      - "^(remind me to|todo|action item:)"
    confidence: 0.85

  - intent: question
    patterns:
      - "\\?$"
      - "^(what|why|how|when|where|who|can you|could you|would you|do you|does)"
    confidence: 0.75

  - intent: prompt
    patterns:
      - "^(write|generate|draft|compose|tell me)"
    confidence: 0.7

  - intent: note
    patterns:
      - "^(note:|fyi:|just noting|btw)"
    confidence: 0.8

Bundled defaults ship with Vox; users override per repo or per user.
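A sketch of how a rules file in this format might be evaluated, using two of the rules above as already-parsed data. Two behaviors are assumptions, since this document does not specify them: the transcript is trimmed and lower-cased before matching, and the first matching rule in file order wins. The function name `classify_with_rules` is hypothetical.

```python
import re

# Two rules from the example file, as they would look after YAML parsing.
RULES = [
    {"intent": "command",
     "patterns": [r"^(create|make|add|file|open)\s+(a |an )?(bd |beads |task |issue |ticket)",
                  r"^(remind me to|todo|action item:)"],
     "confidence": 0.85},
    {"intent": "question",
     "patterns": [r"\?$",
                  r"^(what|why|how|when|where|who|can you|could you|would you|do you|does)"],
     "confidence": 0.75},
]


def classify_with_rules(transcript, rules=RULES):
    """Return (intent, confidence) for the first matching rule, or None to abstain."""
    # Assumption: match against trimmed, lower-cased text.
    text = transcript.strip().lower()
    for rule in rules:
        if any(re.search(pattern, text) for pattern in rule["patterns"]):
            return rule["intent"], rule["confidence"]
    return None  # abstain; the next classifier in the chain gets a turn


command = classify_with_rules("Create a bd issue for the login bug")
question = classify_with_rules("Is the build green?")
abstained = classify_with_rules("the weather is nice")
```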

`local-model` classifier¶

A small ONNX text classifier (DistilBERT-class), fetchable via vox model download intent-classifier-v1. Not bundled (license + size). Custom models welcome with the same input/output signature:

Input: UTF-8 transcript text
Output: one of the 9 IntentKind values + a confidence float

`llm` classifier¶

Reuses an existing LLM sink in "classify" mode. The router sends a structured prompt; the LLM returns JSON with intent_kind + confidence + reasoning. Reasoning is captured in Intent.Reasoning for audit.

Off by default because every utterance would become an LLM call, which adds cost and latency and breaks offline operation. Users with the budget and stricter quality requirements can enable it as the chain's terminal classifier.

Latency budgets¶

Classifier	Default budget	On overrun
`rules`	10 ms	Hard cap; abort classifier
`local-model`	500 ms	Skipped if budget exhausted
`llm`	5 s	User-tuned; configurable
End-to-end router	600 ms with rules+local-model	`router.latency_exceeded` telemetry event

Routing Rule Format¶

The v1 routing table¶

Declarative intent_kind → {primary, also_to} with optional by_source override.

router:
  routes:
    prompt:
      primary: llm-anthropic
      also_to: [s3]
    command:
      primary: llm-anthropic
      also_to: [bd, s3]
    todo:
      primary: bd
      also_to: [s3, ox-ledger]
    note:
      primary: ox-ledger
      also_to: [s3]
    question:
      primary: llm-anthropic
      also_to: [s3]
    summary:
      primary: email-smtp
      also_to: [s3, ox-ledger]
    raw_transcript:
      primary: s3
      also_to: [ox-ledger]
    llm_response:
      primary: s3
      also_to: [email-smtp]
    unclassified:
      primary: local-file              # fallback when classifier gives up
      also_to: []

  # Per-source-kind overrides (sparse — only entries that differ from default)
  by_source:
    online:
      summary:
        primary: email-smtp
        also_to: [s3, ox-ledger, bd]    # meetings extract action items into bd
      todo:
        primary: bd
        also_to: [s3, ox-ledger, email-smtp]
    self:
      prompt:
        primary: llm-anthropic
        also_to: []                     # don't archive every dictation prompt
    file:
      raw_transcript:
        primary: local-file
        also_to: []

Lookup order¶

by_source[envelope.Speaker.SourceKind][envelope.Intent.Kind] — most specific
routes[envelope.Intent.Kind] — default for that intent
routes.unclassified — terminal fallback

Result populates Routing.PrimarySink and Routing.AlsoTo. Sinks then apply their own filter blocks (locked in sink/v1) — both layers must pass for actual delivery.
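The lookup order can be sketched as follows, using a trimmed copy of the table above. `resolve_route` is a hypothetical helper, and the sketch assumes a `by_source` entry replaces the default entry wholesale rather than merging with it (every override in the example config supplies both `primary` and `also_to`).

```python
# Trimmed copy of the example routing table.
ROUTES = {
    "prompt": {"primary": "llm-anthropic", "also_to": ["s3"]},
    "todo": {"primary": "bd", "also_to": ["s3", "ox-ledger"]},
    "unclassified": {"primary": "local-file", "also_to": []},
}
BY_SOURCE = {
    "self": {"prompt": {"primary": "llm-anthropic", "also_to": []}},
}


def resolve_route(intent_kind, source_kind, routes=ROUTES, by_source=BY_SOURCE):
    """Return the route entry chosen by the three-step lookup order."""
    override = by_source.get(source_kind, {}).get(intent_kind)
    if override is not None:
        return override               # 1. most specific: by_source override
    if intent_kind in routes:
        return routes[intent_kind]    # 2. default for that intent
    return routes["unclassified"]     # 3. terminal fallback


dictated_prompt = resolve_route("prompt", "self")
meeting_prompt = resolve_route("prompt", "online")
unknown_intent = resolve_route("question", "online")  # not in the trimmed table
```

The returned `primary` and `also_to` values are what would populate Routing.PrimarySink and Routing.AlsoTo.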

Suppress Patterns — The Safety Net¶

Pattern-based opt-outs that go straight into Routing.Suppress. Additive to any Suppress entries from upstream stages.

router:
  suppress:
    - name: private-marker
      pattern: "(?i)\\bprivate\\b"
      sinks: [s3, ox-ledger, email-smtp]      # mentioned "private" — keep off shared surfaces
    - name: secret-detection
      pattern: "(?i)\\b(password|secret|api[_\\s]?key|token|bearer\\s)\\b"
      sinks: [s3, ox-ledger, email-smtp, llm-anthropic, llm-openai, llm-google]
                                                # secrets stay local-only
    - name: pii-email-fallback
      pattern: "[\\w.+-]+@[\\w-]+\\.[\\w.-]+"
      sinks: [llm-google]                      # example: never send emails to a specific LLM
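A sketch of how these patterns could feed Routing.Suppress, using the first two entries above. `apply_suppress` is a hypothetical name; the sketch assumes each pattern is searched anywhere in the transcript and that duplicate sink names are not repeated in the result.

```python
import re

SUPPRESS = [
    {"name": "private-marker",
     "pattern": r"(?i)\bprivate\b",
     "sinks": ["s3", "ox-ledger", "email-smtp"]},
    {"name": "secret-detection",
     "pattern": r"(?i)\b(password|secret|api[_\s]?key|token|bearer\s)\b",
     "sinks": ["s3", "ox-ledger", "email-smtp",
               "llm-anthropic", "llm-openai", "llm-google"]},
]


def apply_suppress(transcript, existing_suppress, patterns=SUPPRESS):
    """Return the Suppress list after adding the sinks of every matching pattern."""
    suppress = list(existing_suppress)  # additive: upstream entries are kept
    for entry in patterns:
        if re.search(entry["pattern"], transcript):
            for sink in entry["sinks"]:
                if sink not in suppress:
                    suppress.append(sink)
    return suppress


private = apply_suppress("this part is private", [])
secret = apply_suppress("my password is hunter2", ["bd"])
clean = apply_suppress("nothing to hide here", ["bd"])
```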

Defaults¶

Two patterns ship enabled by default — table-stakes safety:

private-marker — "private" keyword → off shared surfaces
secret-detection — common secret patterns → no network sinks

Disable via router.suppress.enabled: false (not recommended).

Bundled Presets¶

Most users won't write a routing table from scratch. They pick a preset and tweak.

router:
  preset: meetings-and-dictation          # bundled preset name
  overrides:                              # applied on top of the preset
    summary:
      also_to: [s3, ox-ledger, my-custom-sink]

The six bundled presets¶

Preset	Primary use case	Key routing characteristic
`meetings-and-dictation` (default)	Mixed: meetings + dictation + calls	Balanced; all intent_kinds routed sensibly
`dictation-only`	Voice memos, prompt composition	Self source emphasized; commands → bd; prompts → LLM; minimal archive
`meeting-capture`	Recording meetings (in-person + online)	online + in-person emphasized; summary → email; action items → bd; full archive
`archive-everything`	Compliance / over-record posture	Every envelope to s3 + ox-ledger regardless of intent
`local-only`	No-network / offline / paranoid	All sinks are local-file or bd; no LLM, no email, no S3, no ox
`llm-heavy`	Power user with budget	LLM in classification chain enabled; LLM primary for most intents; full archive

Presets live at internal/router/presets/<name>.yaml. vox router presets lists them; vox router preset show <name> prints the resolved config. Copy and edit:

vox router preset show meetings-and-dictation > ~/.vox/router.yaml

Derived Envelopes¶

The router can emit zero, one, or many envelopes per input. Four derivation modes:

(a) Splitting compound utterances — opt-in¶

Compound utterances ("send Sarah an email and create a bd issue") become multiple envelopes. The LLM classifier handles segmentation; rules and local-model can't reliably detect compounds.

router:
  splitting:
    enabled: false                    # opt-in; off by default
    classifier: llm                   # LLM is required for reliable splitting
    min_segment_confidence: 0.75

When enabled, each output segment becomes its own envelope:

New EnvelopeID for each segment
ParentID = <original envelope's EnvelopeID> (audit linkage)
Same SessionID, StreamID, StartedAt, EndedAt, Speaker, Provenance as the parent
Transcript = <segment text>
Intent.Kind = <segment kind>
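The field rules above can be sketched as a small helper. This assumes envelopes are plain dicts and that segmentation has already produced `(text, intent_kind)` pairs; `derive_segments` and the sample values are hypothetical, and the LLM segmentation step itself is not shown.

```python
import copy
import uuid

# Fields each segment inherits unchanged from its parent envelope.
INHERITED = ("SessionID", "StreamID", "StartedAt", "EndedAt", "Speaker", "Provenance")


def derive_segments(parent, segments):
    """Build one child envelope per (segment_text, intent_kind) pair."""
    children = []
    for text, kind in segments:
        child = {field: copy.deepcopy(parent[field]) for field in INHERITED}
        child["EnvelopeID"] = str(uuid.uuid4())   # new ID for each segment
        child["ParentID"] = parent["EnvelopeID"]  # audit linkage to the original
        child["Transcript"] = text
        child["Intent"] = {"Kind": kind}
        children.append(child)
    return children


parent = {
    "EnvelopeID": "env-1", "SessionID": "sess-1", "StreamID": "mic",
    "StartedAt": 0.0, "EndedAt": 4.2,
    "Speaker": {"Label": "me", "SourceKind": "self"},
    "Provenance": {"ASRBackend": "example-asr"},
    "Transcript": "send Sarah an email and create a bd issue",
}
children = derive_segments(parent, [("send Sarah an email", "command"),
                                    ("create a bd issue", "command")])
```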

(b) Coreference resolution¶

If the current transcript contains a pronominal reference (it, that, this, one, the same, do so, them) AND the previous envelope falls within require_short_gap (30s by default), the router rewrites the transcript.

router:
  coreference:
    enabled: true                     # default ON — basic mode is cheap and useful
    mode: prepend-previous            # prepend-previous | rewrite-llm | off
    pronouns: [it, that, this, one, the same, "do so", "them"]
    max_context_chars: 200
    require_short_gap: 30s

prepend-previous (default): rewrite by appending the previous transcript to the current one as parenthetical context. Despite the mode's name, the context follows the current transcript.

Original:    "do that for next Tuesday"
Previous:    "remind me to email Sarah"
Rewritten:   "do that for next Tuesday (referring to: 'remind me to email Sarah')"

Dumb but reliable. Transparent (downstream can see what was done). Zero LLM cost.

rewrite-llm (opt-in): sends current + previous to an LLM with a "rewrite to be self-contained" prompt. More accurate, costs tokens.

When coreference applies, Custom["router.coreference_applied"] = true is stamped on the envelope.
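The prepend-previous mode might look like the sketch below. The function signature is hypothetical: it takes the gap between the two envelopes in seconds, and it assumes over-long context is simply truncated to `max_context_chars`, which this document does not spell out.

```python
import re

PRONOUNS = ["it", "that", "this", "one", "the same", "do so", "them"]


def prepend_previous(current, previous, gap_seconds,
                     max_context_chars=200, require_short_gap=30.0):
    """Return (transcript, applied) for the prepend-previous mode."""
    has_reference = any(
        re.search(r"\b" + re.escape(p) + r"\b", current, re.IGNORECASE)
        for p in PRONOUNS
    )
    if not has_reference or gap_seconds > require_short_gap:
        return current, False  # nothing to resolve, or previous envelope too old
    # Assumption: over-long context is truncated to max_context_chars.
    context = previous[:max_context_chars]
    return f"{current} (referring to: '{context}')", True


rewritten, applied = prepend_previous(
    "do that for next Tuesday", "remind me to email Sarah", gap_seconds=4.0)
```

When `applied` is true, the caller would stamp Custom["router.coreference_applied"] = true on the envelope.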

(c) Summary triggers — opt-in¶

The router accumulates envelopes per session and emits derived summary envelopes on:

Trigger	Default	Configurable
Session end (orchestrator emits `session.ended` event)	Always when summaries enabled	`triggers.on_session_end: bool`
Idle timeout	10 minutes	`triggers.idle_timeout: <duration>`
Envelope count threshold	Off	`triggers.envelope_threshold: <int>`
Explicit command (`vox session summarize`)	Always available	—

Two summary content modes:

Mode	Behavior	Use case
`concatenate` (default)	Transcript = newline-joined envelopes in time order, with speaker labels prefixed	Cheap; works offline; raw transcript IS the summary
`llm`	Route the concatenated transcript through a designated LLM sink in summarize mode; LLM response becomes the summary envelope's transcript	Higher quality; costs tokens; needs LLM sink configured

router:
  summaries:
    enabled: true
    mode: concatenate                 # concatenate | llm
    llm_sink: llm-anthropic           # required if mode: llm
    llm_prompt_template: ~/.vox/router/summarize.tmpl
    triggers:
      on_session_end: true
      idle_timeout: 10m
      envelope_threshold: 0

The emitted summary envelope:

Intent.Kind = summary
ParentID = "" (derived from many envelopes — use Custom["router.source_envelope_ids"] to list them)
SessionID matches the session being summarized
StreamID = "" (multi-stream summary)
StartedAt / EndedAt span the entire session
Provenance.RouterImpl = "router-v1"
Custom["router.summary_mode"] = "concatenate" or "llm"

The summary envelope routes through the normal routing table (default summary route: email-smtp + s3 + ox-ledger).
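A sketch of the concatenate mode and the summary envelope fields listed above. Several details are assumptions: envelopes are plain dicts with numeric timestamps, the speaker prefix is formatted as `Label: text`, and the session span is taken from the earliest StartedAt and latest EndedAt of the contributing envelopes. `concatenate_summary` is a hypothetical name.

```python
import uuid


def concatenate_summary(session_id, envelopes):
    """Build a summary envelope in concatenate mode from a session's envelopes."""
    ordered = sorted(envelopes, key=lambda e: e["StartedAt"])  # time order
    lines = [f'{e["Speaker"]["Label"]}: {e["Transcript"]}' for e in ordered]
    return {
        "EnvelopeID": str(uuid.uuid4()),
        "ParentID": "",                  # many-to-one derivation
        "SessionID": session_id,
        "StreamID": "",                  # multi-stream summary
        "StartedAt": ordered[0]["StartedAt"],
        "EndedAt": max(e["EndedAt"] for e in ordered),
        "Transcript": "\n".join(lines),
        "Intent": {"Kind": "summary"},
        "Provenance": {"RouterImpl": "router-v1"},
        "Custom": {
            "router.summary_mode": "concatenate",
            "router.source_envelope_ids": ",".join(e["EnvelopeID"] for e in ordered),
        },
    }


session = [
    {"EnvelopeID": "e2", "StartedAt": 5.0, "EndedAt": 8.0,
     "Speaker": {"Label": "sarah"}, "Transcript": "I can take the API work"},
    {"EnvelopeID": "e1", "StartedAt": 0.0, "EndedAt": 4.0,
     "Speaker": {"Label": "me"}, "Transcript": "who owns the API work"},
]
summary = concatenate_summary("sess-1", session)
```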

(d) Drop semantics¶

Configurable drop_rules — envelopes matching are dropped entirely, not routed anywhere.

router:
  drop_rules:
    - intent_kind: unclassified
      max_confidence: 0.3             # drop ONLY if classifier was REALLY uncertain
    - intent_kind: raw_transcript
      always: true                    # drop all raw transcripts (rare config)
    - max_transcript_chars: 5         # drop anything 5 chars or less
  log_dropped: true                   # log at DEBUG; counter always increments

Default drop rules:

unclassified with confidence < 0.3 — filler ("uh", "hmm")
Any transcript ≤ 5 chars

Dropped envelopes increment the router.dropped counter and emit an audit/v1 event when audit is loaded.
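The default drop rules can be sketched as a predicate over an envelope. `should_drop` and the `envelope` helper are hypothetical; the sketch assumes a rule's `intent_kind` scopes the rule and that each rule carries one further condition, as in the config above.

```python
DEFAULT_DROP_RULES = [
    {"intent_kind": "unclassified", "max_confidence": 0.3},
    {"max_transcript_chars": 5},
]


def should_drop(envelope, rules=DEFAULT_DROP_RULES):
    """Return True if any drop rule matches the envelope."""
    kind = envelope["Intent"]["Kind"]
    confidence = envelope["Intent"]["Confidence"]
    for rule in rules:
        if "intent_kind" in rule and rule["intent_kind"] != kind:
            continue  # rule is scoped to a different intent
        if rule.get("always"):
            return True
        if "max_confidence" in rule and confidence < rule["max_confidence"]:
            return True
        if ("max_transcript_chars" in rule
                and len(envelope["Transcript"]) <= rule["max_transcript_chars"]):
            return True
    return False


def envelope(transcript, kind, confidence):
    # Minimal envelope carrying only the fields the drop rules inspect.
    return {"Transcript": transcript, "Intent": {"Kind": kind, "Confidence": confidence}}


filler_dropped = should_drop(envelope("uh hmm well", "unclassified", 0.1))
short_dropped = should_drop(envelope("okay", "note", 0.9))
note_kept = should_drop(envelope("note: ship on Friday", "note", 0.8))
```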

Derived Envelope Provenance¶

Every router-produced envelope (split, coreference-rewritten, summary) carries uniform provenance stamping:

Field	Value
`EnvelopeID`	new UUID
`ParentID`	source envelope's `EnvelopeID`, or `""` for many-to-one (summary)
`Custom["router.source_envelope_ids"]`	for summaries: comma-separated list of contributing envelope IDs
`Custom["router.derivation"]`	`"split"` / `"coreference"` / `"summary"`
`Provenance.RouterImpl`	the router's identifier (e.g., `"router-v1"`)
`Custom["router.summary_mode"]`	populated for summary envelopes
`Custom["router.coreference_applied"]`	`true` if coreference rewrote the transcript

An auditor can reconstruct: "this summary covers these 47 envelopes from session X; this envelope had its transcript rewritten by coreference against envelope Y."

Router Interface¶

Router {
  # Identity
  Name()          -> string
  Capabilities()  -> Capabilities

  # Lifecycle
  Open(config)    -> Error
  Close()         -> Error            # drains internal state with timeout

  # Hot path
  Route(ctx, envelope)  -> ([]IntentEnvelope, RouterError?)
    # Returns zero or more output envelopes:
    #   - zero  → envelope dropped (drop_rules matched)
    #   - one   → standard 1:1 classification
    #   - many  → splitting fired (compound utterance)

  # Asynchronous emission (summaries, late LLM classifications)
  Emissions()     -> <-chan IntentEnvelope

  # Session lifecycle awareness (for summary triggers)
  OnSessionEvent(event)               # "session.started" | "session.ended" | "session.idle"

  # Diagnostics
  Stats()         -> Stats            # routed_in, routed_out, classifier_hits,
                                      # latency_p50/p99, drops, errors
  Health()        -> Health
}

Capabilities {
  SupportedClassifiers []string       # "rules" | "local-model" | "llm"
  SupportsSplitting    bool
  SupportsCoreference  bool
  SupportsSummaries    bool
  MaxHistoryDepth      uint32
}

Orchestrator loop (informative)¶

envelope = asr.NextEnvelope()
outputs, err = router.Route(ctx, envelope)
For each output in outputs: orchestrator sends to sinks per output.Routing
Concurrently, orchestrator drains router.Emissions() for async derived envelopes (summaries that arrive after a session-end trigger)
Orchestrator forwards session lifecycle events (OnSessionEvent) so the router can fire summary triggers

One router per pipeline¶

router/v1 does NOT support a chain of routers. Composition happens inside the router via the classifier chain. One router instance, configured via classifier chain + routing table + summary settings.

Reasoning:

- A chain of routers would duplicate speaker resolution, history bookkeeping, and summary accumulation
- A single router with internal composition is simpler to debug
- Users with exotic needs can write a wrapper router that delegates internally

Error Model¶

Typed errors, mirroring capture/v1 + sink/v1:

RouterError {
  Kind     RouterErrorKind
  Stage    string                     # "classifier:rules" | "classifier:llm" |
                                      # "split" | "coreference" | "summary"
  Message  string
  Cause    Error?
}

RouterErrorKind {
  ErrClassifierUnavailable           # specific classifier down
  ErrAllClassifiersFailed            # entire chain abstained or errored
  ErrInvalidEnvelope                 # input envelope failed validation
  ErrInternal                        # bug
  ErrBudgetExceeded                  # latency budget hit
}

Failure handling¶

Failure	Action
Classifier crashes / panics	Router-side recover; mark classifier unhealthy; take out of rotation for `health_recovery_interval` (60s default); continue with remaining classifiers
All classifiers abstain or below threshold	`Intent.Kind = unclassified`, `Intent.Confidence = 0`, route per `routes.unclassified` (default fallback: `local-file`)
Classifier returns invalid IntentKind	Treat as abstain; log structured warning; classifier stays in rotation; counter increments
Splitting LLM call fails	Emit the original envelope unchanged (single output); log warning
Coreference LLM rewrite fails	Fall through to `prepend-previous` (or skip if that's also failing); envelope continues
Summary generation (`llm` mode) fails	Fall back to `concatenate` mode and emit; orchestrator gets a summary envelope regardless
Router itself crashes	Orchestrator-side recover; mid-route envelopes dropped (logged + audit event); router restarted
`Route()` exceeds budget	Return whatever the chain produced so far; emit `router.latency_exceeded` event; count budget violation

Key principle: the router never blocks the pipeline. Worst case is Intent.Kind = unclassified routing to fallback — never a hung pipeline.

Route() returns output envelopes AND optionally a RouterError for telemetry. The error is informational, never fatal — for stats and audit only.
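The first row of the failure table (recover, take the classifier out of rotation, continue with the rest) and the "error is informational" rule can be sketched together. `GuardedChain` is a hypothetical wrapper, errors are modelled as plain dicts, and mapping a crash to `ErrClassifierUnavailable` is an assumption. An injectable clock keeps the example deterministic.

```python
import time


class GuardedChain:
    """Hypothetical wrapper: a crashing classifier is benched and the chain continues."""

    def __init__(self, classifiers, health_recovery_interval=60.0, clock=time.monotonic):
        self._classifiers = classifiers          # list of (name, callable) in chain order
        self._interval = health_recovery_interval
        self._clock = clock
        self._benched_until = {}                 # name -> time the classifier may return

    def classify(self, transcript, min_confidence_threshold=0.7):
        """Return ((intent_kind, confidence), errors); classifier faults never propagate."""
        errors = []
        for name, classifier in self._classifiers:
            if self._clock() < self._benched_until.get(name, 0.0):
                continue                         # still out of rotation
            try:
                result = classifier(transcript)
            except Exception as exc:             # router-side recover
                self._benched_until[name] = self._clock() + self._interval
                errors.append({"Kind": "ErrClassifierUnavailable",
                               "Stage": f"classifier:{name}",
                               "Message": str(exc)})
                continue
            if result is not None and result[1] >= min_confidence_threshold:
                return result, errors
        return ("unclassified", 0.0), errors     # fallback instead of a hung pipeline


def broken_rules(transcript):
    raise RuntimeError("pattern file unreadable")


def steady_model(transcript):
    return ("note", 0.8)


now = [0.0]  # fake clock so the example is deterministic
chain = GuardedChain([("rules", broken_rules), ("local-model", steady_model)],
                     clock=lambda: now[0])
first_result, first_errors = chain.classify("just noting the deploy went fine")
second_result, second_errors = chain.classify("just noting the deploy went fine")
now[0] = 61.0  # past the 60 s recovery interval, so "rules" is tried again
third_result, third_errors = chain.classify("just noting the deploy went fine")
```

The second call reports no errors because the failed classifier is skipped while benched; every call still yields a usable classification.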

Audit Hook¶

When audit/v1 is loaded, the router MUST emit a RouterDecisionEvent for every routing decision:

RouterDecisionEvent {
  Timestamp        Timestamp
  EnvelopeID       string
  ParentID         string?
  InputTranscript  string              # transcript as classified (may differ from output if coreference rewrote)
  OutputTranscript string              # final transcript on the output envelope
  Intent {
    Kind        IntentKind
    Confidence  float
    Reasoning   string?
  }
  Routing {
    PrimarySink    string
    AlsoTo         []string
    Suppress       []string
  }
  ClassifierChain []ClassifierStep     # which classifiers fired, in order
  Derivation      string?              # "split" | "coreference" | "summary" | null
  LatencyMS       uint32
}

ClassifierStep {
  Name        string                   # "rules" | "local-model" | "llm-anthropic"
  Confidence  float
  Abstained   bool
  LatencyMS   uint32
  Reasoning   string?
}

Routing decisions are the most consequential trust boundary in Vox — they determine where your voice content ends up. An auditor reconstructing "why did my password show up in an S3 bucket" needs exactly this data.

When audit/v1 is NOT loaded, the router emits the same data as structured logs at DEBUG level (configurable to INFO via router.log_decisions_at: info).

Versioning and Stability¶

router/v1 is the contract above. Once frozen:

Non-breaking changes (allowed in v1.x): adding optional fields to Capabilities / Stats / configuration; adding new classifier types; adding new presets; adding new drop_rules options; adding the deferred predicate-based extra_rules (with CEL or similar) as additive opt-in.
Breaking changes (require v2): changing the routing-table lookup semantics; changing the Route() signature; changing how splitting / coreference / summary derivations work in a non-additive way; removing or repurposing any existing field.

The core supports one vN of router/ at a time, with overlap during migrations.

v1.x additive features (shipped)¶

Filler-word removal¶

After the classifier sets Intent, the router optionally strips common verbal noise tokens ("um", "uh", "like" as discourse marker, "you know", etc.) from the transcript before it reaches sinks. Classification always runs on the raw transcript; filler removal is a post-classification pass.

router:
  filler_removal_enabled: true          # default: true; set false to disable
  filler_removal_words: []              # override the default list (empty = use built-in list)

Default filler list: um, uh, er, ah, hmm, like, you know, sort of, kind of, basically, literally, actually, well, so, right, okay.

When fillers are stripped, three provenance keys are stamped on the envelope:

- router.filler_removed = true
- router.filler_count = <int> (number of distinct filler tokens removed)
- router.original_transcript = <string> (the raw pre-removal transcript)

If the entire utterance consists of fillers (fewer than 3 non-whitespace characters would remain), removal is skipped and the original is preserved.
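A naive sketch of the post-classification pass, assuming word-boundary regex matching. It is deliberately simpler than the behavior described above: it removes every occurrence of a listed word (so it cannot tell "like" the discourse marker from "like" the verb), and it counts removed occurrences, which may differ from the intended meaning of "distinct". `remove_fillers` is a hypothetical name.

```python
import re

DEFAULT_FILLERS = ["um", "uh", "er", "ah", "hmm", "like", "you know", "sort of",
                   "kind of", "basically", "literally", "actually", "well", "so",
                   "right", "okay"]


def remove_fillers(transcript, fillers=DEFAULT_FILLERS):
    """Return (transcript, custom_keys); custom_keys is empty when nothing changed."""
    # Longest phrases first so multi-word fillers are tried before shorter ones.
    alternatives = "|".join(re.escape(f) for f in sorted(fillers, key=len, reverse=True))
    pattern = re.compile(r"\b(?:" + alternatives + r")\b[,.]?\s*", re.IGNORECASE)
    cleaned, count = pattern.subn("", transcript)
    cleaned = cleaned.strip()
    if count == 0:
        return transcript, {}
    if len(re.sub(r"\s", "", cleaned)) < 3:
        return transcript, {}  # utterance is all filler: keep the original
    return cleaned, {
        "router.filler_removed": True,
        "router.filler_count": count,  # assumption: counts removed occurrences
        "router.original_transcript": transcript,
    }


cleaned, keys = remove_fillers("um I think we should uh ship it")
all_filler, no_keys = remove_fillers("um uh")
```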

Snippet templates (voice text-expander)¶

A pre-classification stage replaces a matched trigger phrase with a configured expansion, then lets normal classification run on the expanded text.

router:
  snippets:
    "vox slack ack": "Got it, will follow up shortly."
    "vox standup status": "Yesterday: shipped X. Today: working on Y. Blockers: none."
    "vox sign off email": "Best, Jay"

Matching is case- and punctuation-insensitive. When a snippet fires, three provenance keys are stamped:

- router.snippet_expanded = true
- router.snippet_trigger = <original transcript> (what the user said)
- router.snippet_match = <normalized key> (the matched trigger)

Snippets default to empty (disabled). Configure via router.snippets in the config map.
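A sketch of case- and punctuation-insensitive snippet matching. It assumes a snippet fires only when the whole utterance normalizes to a trigger; whether triggers embedded in longer utterances also expand is not stated here. `normalize` and `expand_snippet` are hypothetical names.

```python
import re

SNIPPETS = {
    "vox slack ack": "Got it, will follow up shortly.",
    "vox sign off email": "Best, Jay",
}


def normalize(text):
    # Lower-case, drop punctuation, collapse runs of whitespace.
    text = re.sub(r"[^\w\s]", "", text.lower())
    return " ".join(text.split())


def expand_snippet(transcript, snippets=SNIPPETS):
    """Return (transcript, custom_keys); expands only on a whole-utterance match."""
    table = {normalize(trigger): expansion for trigger, expansion in snippets.items()}
    key = normalize(transcript)
    if key not in table:
        return transcript, {}
    return table[key], {
        "router.snippet_expanded": True,
        "router.snippet_trigger": transcript,  # what the user said
        "router.snippet_match": key,           # the normalized trigger
    }


expanded, keys = expand_snippet("Vox, Slack ack.")
untouched, no_keys = expand_snippet("please ack the slack thread")
```

Normal classification would then run on `expanded` rather than on the spoken trigger.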

Predicate-based extra_rules (deferred to v1.x)¶

router:
  extra_rules:                                  # opt-in; evaluated AFTER routes + by_source
    - when: "Intent.Kind == 'command' && contains(Transcript, 'urgent')"
      add_to: [email-smtp]                      # additive — not replacement
    - when: "Speaker.Label == 'cto'"
      add_to: [s3-cto-archive]

Why not in v1:

- Requires a sandboxed expression engine (likely CEL)
- Adds a debugging dimension ("why didn't my rule fire?") that hurts predictability
- Most use cases are already covered by by_source overrides + suppress patterns

Path to v1.x:

- Add CEL or equivalent as the expression engine
- Gate behind router.extra_rules.enabled flag
- Document the expression DSL with examples
- Ship as additive, with no breaking change to v1's table-based config

Reference Implementation¶

One built-in router ships in v1: router-v1 (aka default). It implements everything above.

Users can swap it for a custom router via router.implementation: <name>, but in practice the built-in covers the entire design space described in this document.

Enterprise: `router-rbac-audit` (planned)¶

Same router/v1 contract, adds:

RBAC checks before allowing certain routes (e.g., "envelopes from speakers without compliance:read cannot route to LLM sinks")
Immutable audit storage integration (authz/v1 + enterprise audit store)
Per-team / per-user routing policy overrides

Lives in the enterprise repo. Loaded via the same registry mechanism; swap-in is a config change.

Project Principle: Opinionated Defaults, Every Default Configurable¶

This contract continues the principle from capture/v1 and sink/v1. Every behavior with a defensible default (min_confidence_threshold: 0.7, history_depth: 1, coreference.mode: prepend-previous, summaries.mode: concatenate, the meetings-and-dictation preset, etc.) is exposed as a config knob. Defaults reflect a considered recommendation for the typical voice-to-LLM use case; the knobs exist so specialized workflows can tune them.
