Per-firm fine-tuning
Per-firm fine-tuning
Train a custom model on your firm’s Vault so the Assistant
and Workflows match your house style and
vocabulary. Configure at /admin/legal/fine-tuning.
What it tunes
Style + voice, not facts. The Vault chunks become instruction-style JSONL pairs (~3 per chunk) — summarize-this / extract-clauses / rephrase-as. The fine-tuned model learns your firm’s register; the underlying facts still come from retrieval at inference time.
Backends
- OpenAI — fully implemented. Uploads JSONL to OpenAI Files,
starts a
/v1/fine_tuning/jobsrun, polls until ready. - Anthropic — scaffolded; no self-serve fine-tune API yet, so
jobs go to
failed: backend_not_supported_yet. - Bedrock, Azure OpenAI — same scaffolded state. Roadmap for when those providers’ fine-tune surfaces stabilize.
Credentials are pulled from the tenant’s existing AI Provider configuration — you don’t enter the OpenAI API key twice.
Starting a job
- New job → label, backend, base model
(
gpt-4o-mini-2024-07-18is a good default). - Pollen8 snapshots the Vault to JSONL (must have ≥10 ready chunks).
- Uploads to the provider, creates the fine-tune job, stores the
provider_job_id. - Poll status with Refresh until it reaches
ready.
Activating
Once a job is ready, Activate flips is_active=true on that
row (atomically deactivating any other). From that moment on, the
Assistant + every workflow driver substitute the fine-tuned model id
for the provider’s default — transparently.
The Why trace stamps fine_tuned: true on every LLM call that used
the override, so you can audit attribution.
Going back to stock
Use stock model button bulk-deactivates all jobs for the tenant — Assistant + workflows resume using the provider default. Existing jobs aren’t deleted; they remain in the table for re-activation.
When not to fine-tune
If the Vault is small (under a few hundred chunks), the fine-tune will overfit. Stick with retrieval-only until the corpus matures. If house style is already captured by the base model, the marginal gain is small — measure A/B before promoting.