Legal Vault

The firm’s AI-ready document corpus. Indexed for hybrid retrieval (pgvector + tsvector + Reciprocal Rank Fusion) and surfaced to the Assistant, Workflows, Practice agents, Templates, and Bots.
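Reciprocal Rank Fusion merges the vector ranking and the keyword ranking by summing 1 / (k + rank) for each item across the lists. The following is a minimal sketch, assuming each retriever returns an ordered list of chunk IDs; the function name, the sample IDs, and k = 60 (a commonly used default) are illustrative and not taken from the product.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of IDs into one list, best first.

    Each ID scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k=60 is an assumed default, not a Vault setting.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, item_id in enumerate(ranking, start=1):
            scores[item_id] += 1.0 / (k + rank)
    # Sort by descending score; break ties by ID so the result is stable.
    return sorted(scores, key=lambda item_id: (-scores[item_id], item_id))


vector_hits = ["c7", "c2", "c9"]   # hypothetical pgvector results
keyword_hits = ["c2", "c4", "c7"]  # hypothetical tsvector results
fused = reciprocal_rank_fusion([vector_hits, keyword_hits])
print(fused)
```

Items that rank well in both lists (here c2 and c7) rise above items that appear in only one, which is the point of fusing the two retrievers.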

Three ways to populate

Path         Where                                    When
Paste text   Paste text button on /admin/legal/vault  One-off briefs, memos
Upload file  Upload file button                       PDF / DOCX / TXT / XLSX. Server extracts via pypdf, falls back to tesseract OCR for scanned PDFs, walks every sheet for xlsx
DMS import   /admin/legal/dms                         Bulk: connect iManage / NetDocs, browse folders, import

In addition, the email-to-Vault gateway ingests inbound correspondence, and the Word add-in captures ad-hoc text fragments.
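The PDF path (text layer first, OCR as a fallback) can be sketched as below. This is an illustration under assumptions, not the server's code: the `min_chars` threshold for deciding that a PDF is scanned is invented here, and the OCR helper assumes the `pdf2image` and `pytesseract` packages, which the page does not name (it only mentions pypdf and tesseract).

```python
def extract_pdf_text(path, read_text_layer, run_ocr, min_chars=20):
    """Return the PDF's text layer, or OCR output if that layer is nearly empty.

    min_chars is an assumed heuristic for "this looks like a scanned PDF".
    """
    text = read_text_layer(path)
    if len(text.strip()) >= min_chars:
        return text
    return run_ocr(path)


def pypdf_text_layer(path):
    """Read the embedded text layer with pypdf (imported lazily)."""
    from pypdf import PdfReader

    reader = PdfReader(path)
    return "\n".join(page.extract_text() or "" for page in reader.pages)


def tesseract_ocr(path):
    """OCR every page; pdf2image and pytesseract are assumed dependencies."""
    from pdf2image import convert_from_path
    import pytesseract

    pages = convert_from_path(path)  # one PIL image per page
    return "\n".join(pytesseract.image_to_string(img) for img in pages)
```

A caller would combine them as `extract_pdf_text("brief.pdf", pypdf_text_layer, tesseract_ocr)`, where the file name is hypothetical. Passing the two readers in as arguments keeps the fallback rule testable without either library installed.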

Per-document fields

title "Smith v. Jones — MTD (2024)"
classification brief | precedent_contract | depo_transcript | memo | exhibit | statute_notes | template | correspondence | other
tags ["delaware", "indemnity", "12(b)(6)"]
practice_area m_and_a | litigation | tax | ip | family ...
source_matter_id (optional — links to a matter)
inline_text the body (auto-extracted for uploads)
acl_mode "open" | "matter_team"
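To make the field list concrete, here is a sketch of the record as a Python dataclass with basic validation. The class name, the defaults, and the rule that matter_team requires a source_matter_id are assumptions inferred from this page, not a published schema; practice_area is left as free text because the list above is elided.

```python
from dataclasses import dataclass, field
from typing import Optional

CLASSIFICATIONS = {
    "brief", "precedent_contract", "depo_transcript", "memo", "exhibit",
    "statute_notes", "template", "correspondence", "other",
}
ACL_MODES = {"open", "matter_team"}


@dataclass
class VaultDocument:
    """Hypothetical in-memory shape of one Vault document."""
    title: str
    classification: str
    inline_text: str
    tags: list[str] = field(default_factory=list)
    practice_area: Optional[str] = None
    source_matter_id: Optional[str] = None
    acl_mode: str = "open"

    def __post_init__(self):
        if self.classification not in CLASSIFICATIONS:
            raise ValueError(f"unknown classification: {self.classification!r}")
        if self.acl_mode not in ACL_MODES:
            raise ValueError(f"unknown acl_mode: {self.acl_mode!r}")
        # Assumption: a matter-team ACL is meaningless without a linked matter.
        if self.acl_mode == "matter_team" and self.source_matter_id is None:
            raise ValueError("matter_team requires source_matter_id")


doc = VaultDocument(
    title="Smith v. Jones, MTD (2024)",
    classification="brief",
    inline_text="Defendant moves to dismiss under Rule 12(b)(6).",
    tags=["delaware", "indemnity", "12(b)(6)"],
    practice_area="litigation",
)
```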

Ingest pipeline

  1. Chunker splits into ~1200-char chunks with 200-char overlap (cap 2000 chunks per doc).
  2. Embedder produces pgvector embeddings via the tenant’s embedding-task AI Provider.
  3. Postgres computes a tsvector for keyword search via a generated column.
  4. An ivfflat index (on the embeddings) and a GIN index (on the tsvector) back hybrid retrieval.
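The chunking step can be approximated with a fixed-width sliding window. The sketch below uses the sizes quoted above (1200 characters, 200 overlap, 2000-chunk cap); the real chunker targets roughly 1200 characters and may respect sentence or paragraph boundaries, and the behaviour at the cap (silently truncating here) is an assumption.

```python
def chunk_text(text, size=1200, overlap=200, max_chunks=2000):
    """Split text into fixed-size chunks where each chunk repeats the
    last `overlap` characters of the previous one."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be non-negative and smaller than size")
    step = size - overlap
    chunks = []
    for start in range(0, len(text), step):
        if len(chunks) >= max_chunks:
            break  # assumed behaviour: stop at the cap rather than fail
        chunks.append(text[start:start + size])
        if start + size >= len(text):
            break  # this chunk already reaches the end of the text
    return chunks


sample = "".join(str(i % 10) for i in range(2500))
chunks = chunk_text(sample)
print([len(c) for c in chunks])
```

With these defaults the window advances 1000 characters per step, so a 2,500-character document yields three chunks, and the cap corresponds to roughly two million characters of source text.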

The status chip advances pending_ingest → ingesting → ready, or shows failed with the reason. While jobs are in flight, the Vault page auto-refreshes every 5 seconds.
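The refresh rule amounts to "keep polling while any document is still pending or ingesting". A small sketch, assuming the four status strings shown above are the complete set; the enum, the constant, and the function name are hypothetical.

```python
from enum import Enum

POLL_INTERVAL_SECONDS = 5  # refresh interval stated in the text


class IngestStatus(str, Enum):
    PENDING_INGEST = "pending_ingest"
    INGESTING = "ingesting"
    READY = "ready"
    FAILED = "failed"


# Statuses that mean a job is still in flight.
IN_FLIGHT = {IngestStatus.PENDING_INGEST, IngestStatus.INGESTING}


def should_keep_polling(statuses):
    """True while at least one document has not reached ready or failed."""
    return any(IngestStatus(s) in IN_FLIGHT for s in statuses)
```

A client would call `should_keep_polling` on the statuses returned by each refresh and schedule the next one `POLL_INTERVAL_SECONDS` later only when it returns True.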

Prerequisites

At least one AI Provider must have task_routing.embedding = true (configured under AI providers). Without it, ingest ends in failed: no_embedding_provider. The Assistant and Workflows additionally need a provider with task_routing.chat = true.
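A pre-flight check for these routes might look like the following. The provider records are modelled as plain dicts with a task_routing mapping; that shape and the provider names are assumptions for illustration, since the page only gives the task_routing.embedding and task_routing.chat flags.

```python
def missing_task_routes(providers, required=("embedding", "chat")):
    """Return the required tasks that no provider is routed for."""
    missing = []
    for task in required:
        routed = any(p.get("task_routing", {}).get(task) is True for p in providers)
        if not routed:
            missing.append(task)
    return missing


providers = [
    # Hypothetical tenant configuration.
    {"name": "provider-a", "task_routing": {"chat": True}},
    {"name": "provider-b", "task_routing": {"embedding": False}},
]
print(missing_task_routes(providers))
```

Here the result lists embedding, which is the condition under which an upload would end in failed: no_embedding_provider.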

ACL modes

  • open (default): anyone with the pollenix.legal entitlement and a legal role can read the document.
  • matter_team: only members of the team for the document’s source_matter_id can read it. See Matter-team ACLs.

Toggle at upload time via the Restrict to matter team checkbox.
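The two modes reduce to a short read check. In this sketch the user and document are plain dicts and matter_teams maps a matter ID to a set of user IDs; all of these shapes are hypothetical. It also assumes that the pollenix.legal entitlement and a legal role are still required in matter_team mode, which this page does not state.

```python
def can_read(user, doc, matter_teams):
    """Decide whether `user` may read `doc` under its acl_mode."""
    # Assumed baseline for both modes: entitlement plus at least one legal role.
    if "pollenix.legal" not in user["entitlements"] or not user["legal_roles"]:
        return False
    if doc["acl_mode"] == "open":
        return True
    if doc["acl_mode"] == "matter_team":
        team = matter_teams.get(doc.get("source_matter_id"), set())
        return user["id"] in team
    return False  # unknown modes are denied


matter_teams = {"M-100": {"u1"}}  # hypothetical matter and members
alice = {"id": "u1", "entitlements": {"pollenix.legal"}, "legal_roles": {"associate"}}
bob = {"id": "u2", "entitlements": {"pollenix.legal"}, "legal_roles": {"paralegal"}}
restricted = {"acl_mode": "matter_team", "source_matter_id": "M-100"}
shared = {"acl_mode": "open"}
```

With this data, both users can read the open document, and only the user on the M-100 team can read the restricted one.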

Sources retrieved

The Vault is one of three knowledge sources the Assistant can consult. See Knowledge sources.