Knowledge sources
A per-conversation toggle that decides which knowledge sources the Assistant and Practice agents consult.
The three sources
| Source | Engine | Budget (chunks) |
|---|---|---|
| Vault (default) | Hybrid retrieval over legal_vault_chunks | 8 |
| Firm templates | Vault scoped to template + precedent_contract classifications | 4 |
| Federal caselaw | CourtListener /api/rest/v3/search/ opinion search | 3 |
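The final response is capped at 12 chunks total. With all three sources enabled the budgets add up to 15 (8 + 4 + 3), so the merged result can be trimmed to fit the cap.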
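The budget arithmetic can be illustrated with a minimal sketch. The budgets and the cap of 12 come from the table above; the dictionary keys, the function name `merge_chunks`, and the rule that overflow is dropped from the tail in source order are assumptions made for illustration, not the actual implementation.

```python
from typing import Dict, List

# Per-source chunk budgets from the table above (key names are assumed).
SOURCE_BUDGETS: Dict[str, int] = {"vault": 8, "templates": 4, "caselaw": 3}
FINAL_CAP = 12  # total chunks allowed in the final response


def merge_chunks(results: Dict[str, List[str]], enabled: List[str]) -> List[str]:
    """Trim each enabled source to its budget, then apply the overall cap."""
    merged: List[str] = []
    for source in enabled:
        budget = SOURCE_BUDGETS[source]
        merged.extend(results.get(source, [])[:budget])
    # Assumption: overflow beyond the cap is dropped from the tail.
    return merged[:FINAL_CAP]
```

With every source returning more than its budget, enabling all three yields 15 candidates that are cut to 12, while enabling only caselaw yields 3.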
How the picker works
A chip row sits above the chat input; click any source to toggle it.
Vault is “sticky”: the UI won’t let you disable every source. The
source preference persists on the QA session
(legal_qa_sessions.sources, a varchar[] column).
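One way to read the “sticky” rule is sketched below. The function name `toggle_source` and the choice to fall back to Vault when the last source is switched off are assumptions; the real picker may instead refuse the click. The returned list is ordered so it could be stored as-is in a varchar[] column.

```python
from typing import Iterable, List

# Source vocabulary from this page, in display order (key names assumed).
ALL_SOURCES = ("vault", "templates", "caselaw")


def toggle_source(current: Iterable[str], clicked: str) -> List[str]:
    """Flip one source on or off, never leaving the selection empty."""
    if clicked not in ALL_SOURCES:
        raise ValueError(f"unknown source: {clicked}")
    selected = set(current)
    if clicked in selected:
        selected.remove(clicked)
    else:
        selected.add(clicked)
    # Assumption: an empty selection falls back to Vault.
    if not selected:
        selected = {"vault"}
    return [s for s in ALL_SOURCES if s in selected]
```

Under this reading, clicking Vault when it is the only enabled source leaves the selection unchanged.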
Caselaw nuances
CourtListener results are ephemeral: they are not persisted to
legal_vault_chunks. The Assistant gets the snippet, case name, court,
and date filed; if the model cites them, the citation card shows an Open source ↗ link to the full opinion on courtlistener.com.
The CourtListener search API we use is unauthenticated, so it works out of the box. No additional credentials are required.
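The request and the fields handed to the Assistant might look roughly like the sketch below. The endpoint path is the one named in the table; the `q` and `type=o` query parameters and the result field names (`caseName`, `court`, `dateFiled`, `snippet`, `absolute_url`) are assumptions about the v3 search response and should be checked against the live API. The helper names are hypothetical, and the sketch performs no network call.

```python
from typing import Any, Dict
from urllib.parse import urlencode

BASE = "https://www.courtlistener.com"
SEARCH_PATH = "/api/rest/v3/search/"


def build_search_url(query: str) -> str:
    """Build an opinion-search URL (type=o is assumed to select opinions)."""
    return f"{BASE}{SEARCH_PATH}?{urlencode({'q': query, 'type': 'o'})}"


def to_citation(result: Dict[str, Any]) -> Dict[str, str]:
    """Keep only the fields the Assistant needs; nothing is persisted."""
    return {
        "case_name": result.get("caseName", ""),
        "court": result.get("court", ""),
        "date_filed": result.get("dateFiled", ""),
        "snippet": result.get("snippet", ""),
        # Assumed: absolute_url is a site-relative path to the opinion.
        "open_source_url": BASE + result.get("absolute_url", ""),
    }
```

Because the mapped results live only in memory for the duration of the request, this shape is consistent with caselaw hits never reaching legal_vault_chunks.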
Practice-agent interaction
When a session is bound to a practice agent and the user has the
Vault source enabled, the agent’s vault_classifications scope is
applied (e.g., Litigation agent restricts to brief +
depo_transcript + exhibit). If that scoped retrieval returns
nothing, we broaden to the full Vault and stamp broadened=true on
the Why trace.
Templates and caselaw sources are not affected by practice-agent scoping.
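The scope-then-broaden behavior for the Vault source can be sketched as follows. The `search` callable, the function name `retrieve_vault`, and the shape of the trace dictionary are hypothetical stand-ins; only the fallback rule and the broadened flag come from the description above.

```python
from typing import Callable, Dict, List, Optional, Sequence, Tuple

# Hypothetical retriever signature: (query, classification scope or None).
Search = Callable[[str, Optional[Sequence[str]]], List[str]]


def retrieve_vault(
    query: str,
    scope: Optional[Sequence[str]],
    search: Search,
) -> Tuple[List[str], Dict[str, bool]]:
    """Try the agent's classification scope first, then the full Vault."""
    trace = {"broadened": False}
    if scope:
        chunks = search(query, scope)
        if chunks:
            return chunks, trace
        # Scoped retrieval came back empty: note the fallback on the trace.
        trace["broadened"] = True
    return search(query, None), trace
```

A session with no practice agent passes `scope=None`, so it searches the full Vault without setting the broadened flag.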
Adding a source
The source vocabulary is currently vault | templates | caselaw.
Adding a fourth (state statutes, an internal Slack archive, etc.) is a
two-file change. In services/legal_sources.py, declare the key in
ALL_SOURCES, add a budget entry, and add a retriever. Then expose the
new source in the SPA picker.
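The backend half of that change might look like the sketch below. Only the name ALL_SOURCES and the three existing keys come from this page; `SOURCE_BUDGETS`, `RETRIEVERS`, `register_source`, the `state_statutes` key, its budget of 2, and the stub retriever are all hypothetical and chosen purely for illustration.

```python
from typing import Callable, Dict, List

# Hypothetical retriever signature: (query, chunk limit) -> chunks.
Retriever = Callable[[str, int], List[str]]

ALL_SOURCES: List[str] = ["vault", "templates", "caselaw"]
SOURCE_BUDGETS: Dict[str, int] = {"vault": 8, "templates": 4, "caselaw": 3}
RETRIEVERS: Dict[str, Retriever] = {}  # existing retrievers omitted here


def register_source(key: str, budget: int, retriever: Retriever) -> None:
    """Declare a source key, its chunk budget, and its retriever together."""
    if key in ALL_SOURCES:
        raise ValueError(f"source already declared: {key}")
    ALL_SOURCES.append(key)
    SOURCE_BUDGETS[key] = budget
    RETRIEVERS[key] = retriever


def retrieve_state_statutes(query: str, limit: int) -> List[str]:
    """Placeholder retriever; a real one would query a statutes index."""
    return [f"statute stub for {query!r}"][:limit]


# Hypothetical fourth source with an illustrative budget of 2 chunks.
register_source("state_statutes", 2, retrieve_state_statutes)
```

Keeping the three declarations behind one function makes it harder to add a key without a budget or a retriever, though the actual module may simply edit the three structures directly.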