Safety & validation
DB Engine is read-only by contract. The validator layer enforces it — even SQL hand-edited in the workbench passes through the same gate as planner-generated SQL.
Five rejection rules
| Rule | Why |
|---|---|
| SELECT-only | DROP / DELETE / UPDATE / INSERT / TRUNCATE / GRANT / REVOKE / ALTER all rejected at parse time |
| Single-statement | Stacked ;-separated statements rejected — no smuggling DML behind a SELECT |
| Single-connection | Three-part names referencing another connection rejected |
| No pg_* / information_schema writes | Even on Postgres, system-catalog writes blocked |
| Row-cap enforced | LIMIT auto-injected if missing |
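
As an illustration of three of the rules above (SELECT-only, single-statement, row cap), here is a minimal Python sketch. It is not the actual validator: the source says rejection happens at parse time, which implies a real SQL parser, whereas this sketch uses regular expressions and only understands single-quoted strings and standard comments. The names `validate`, `FORBIDDEN`, and the default `row_cap` of 1000 are assumptions made for the example.

```python
import re

# Keywords the rules table lists as rejected.
FORBIDDEN = {"DROP", "DELETE", "UPDATE", "INSERT", "TRUNCATE", "GRANT", "REVOKE", "ALTER"}

# Matches a single-quoted string, a line comment, or a block comment.
_LITERAL_OR_COMMENT = re.compile(r"'(?:[^']|'')*'|--[^\n]*|/\*.*?\*/", re.S)


def _scrub(sql: str, keep_strings: bool) -> str:
    """Drop comments; optionally blank string literals; trim one trailing ';' run."""
    def repl(match):
        text = match.group(0)
        if text.startswith("'"):
            return text if keep_strings else "''"
        return " "  # comments become whitespace
    return _LITERAL_OR_COMMENT.sub(repl, sql).strip().rstrip(";").strip()


def validate(sql: str, row_cap: int = 1000) -> str:
    """Return SQL to execute, or raise ValueError (hypothetical helper)."""
    analysis = _scrub(sql, keep_strings=False)
    if ";" in analysis:
        raise ValueError("single-statement: stacked statements rejected")
    tokens = re.findall(r"[A-Za-z_][A-Za-z0-9_]*", analysis.upper())
    if not tokens or tokens[0] != "SELECT":
        raise ValueError("SELECT-only: statement must start with SELECT")
    hit = FORBIDDEN.intersection(tokens)
    if hit:
        raise ValueError(f"SELECT-only: forbidden keyword {sorted(hit)[0]}")
    executable = _scrub(sql, keep_strings=True)
    if "LIMIT" not in tokens:  # naive: a LIMIT in a subquery also counts
        executable = f"{executable} LIMIT {row_cap}"
    return executable
```

The sketch errs on the side of rejecting: a quoted identifier named "delete" would be refused, for example. It does not cover the single-connection or system-catalog rules, and it can be fooled by dialect-specific quoting, which is why a production gate should work on a parsed statement rather than on text.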
Scope predicates
Per-connection WHERE-fragment list. The validator AND-injects them into every executed query. Format:
[ { table: "encounters", predicate: "tenant_id = '...'" }, { table: "patients", predicate: "tenant_id = '...'" } ]

The planner is told about scope predicates in its system prompt so it doesn’t double-apply them, but if it omits them, the validator re-applies them. They can’t be bypassed.
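
One way to picture the AND-injection is to replace each scoped table reference with a filtered subquery. The Python sketch below does only that string-building step; how the validator actually rewrites queries is not described in this section, so `scoped_source`, `predicates_for`, and the tenant value `'t1'` are hypothetical.

```python
from typing import Dict, List

# Same shape as the per-connection list: {"table": ..., "predicate": ...}
ScopePredicate = Dict[str, str]


def predicates_for(table: str, scope: List[ScopePredicate]) -> List[str]:
    """All WHERE fragments configured for one table."""
    return [p["predicate"] for p in scope if p["table"] == table]


def scoped_source(table: str, scope: List[ScopePredicate]) -> str:
    """Return a FROM-clause source for `table` with its predicates AND'd in."""
    preds = predicates_for(table, scope)
    if not preds:
        return table  # no scope configured: table is referenced as-is
    where = " AND ".join(f"({p})" for p in preds)
    return f"(SELECT * FROM {table} WHERE {where}) AS {table}"


scope = [{"table": "encounters", "predicate": "tenant_id = 't1'"}]
print(scoped_source("encounters", scope))
# (SELECT * FROM encounters WHERE (tenant_id = 't1')) AS encounters
```

Applying the predicate unconditionally is safe even when the planner already included it, because ANDing the same condition twice does not change the result set.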
EXPLAIN preflight
Before execution, the driver runs the equivalent of:
- Postgres / MySQL: EXPLAIN (FORMAT JSON) <user_sql>.
- BigQuery: dry-run with billing estimate.
- Snowflake: EXPLAIN <user_sql>.
- DuckDB: EXPLAIN <user_sql>.
Estimated row scan above db.explain.max_rows → block. Estimated
cost (BigQuery $$) above db.explain.max_cost_usd → block. Both
limits are tenant-configurable.
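
The threshold check can be sketched as follows for the Postgres case, where EXPLAIN (FORMAT JSON) returns a list whose first element has a "Plan" object with "Plan Rows" and nested "Plans". Taking the largest row estimate in the tree as the "estimated row scan" is an assumption of this sketch, as are the function names; the BigQuery cost is passed in as a plain number because the dry-run call itself is out of scope here.

```python
from typing import Any, Dict, List, Optional


def max_plan_rows(node: Dict[str, Any]) -> float:
    """Largest 'Plan Rows' estimate anywhere in a Postgres plan tree."""
    rows = float(node.get("Plan Rows", 0))
    for child in node.get("Plans", []):
        rows = max(rows, max_plan_rows(child))
    return rows


def preflight(explain_json: List[Dict[str, Any]], max_rows: float,
              cost_usd: Optional[float] = None,
              max_cost_usd: Optional[float] = None) -> None:
    """Raise RuntimeError if the estimate exceeds a configured limit."""
    estimate = max_plan_rows(explain_json[0]["Plan"])
    if estimate > max_rows:
        raise RuntimeError(f"blocked: ~{estimate:.0f} rows > max_rows {max_rows:.0f}")
    # Cost check only applies when both an estimate and a limit are present.
    if cost_usd is not None and max_cost_usd is not None and cost_usd > max_cost_usd:
        raise RuntimeError(f"blocked: ${cost_usd:.2f} > max_cost_usd ${max_cost_usd:.2f}")
```

Here `max_rows` and `max_cost_usd` stand in for the values of db.explain.max_rows and db.explain.max_cost_usd.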
Statement timeout
Driver-level statement timeout: 30 seconds. A long-running query
gets cancelled. Adjust per connection via
config.statement_timeout_ms if you need a different ceiling (some
analytical workloads do).
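
A minimal sketch of how a per-connection override could resolve against the 30-second default, assuming the connection config is available as a plain dict. The helper names are hypothetical, and the session statements shown are the standard Postgres and MySQL settings, not necessarily what the DB Engine driver issues.

```python
DEFAULT_STATEMENT_TIMEOUT_MS = 30_000  # the 30-second driver default


def effective_timeout_ms(config: dict) -> int:
    """Per-connection statement_timeout_ms, falling back to the default."""
    value = config.get("statement_timeout_ms", DEFAULT_STATEMENT_TIMEOUT_MS)
    if isinstance(value, bool) or not isinstance(value, int) or value <= 0:
        raise ValueError("statement_timeout_ms must be a positive integer")
    return value


def timeout_statement(dialect: str, timeout_ms: int) -> str:
    """Session-level timeout statement for the dialects sketched here."""
    if dialect == "postgres":
        return f"SET statement_timeout = {timeout_ms}"  # milliseconds
    if dialect == "mysql":
        return f"SET SESSION max_execution_time = {timeout_ms}"  # milliseconds
    raise ValueError(f"no timeout sketch for dialect {dialect!r}")
```

For example, a connection configured with `{"statement_timeout_ms": 120000}` would resolve to a two-minute ceiling, while an empty config keeps the default.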
What an adversarial user can NOT do
- Execute DML or DDL.
- Read another tenant’s data, even on a shared connection: scope predicates are AND’d into every query.
- Run a query that scans more than max_rows (configurable).
- Run a query for longer than the statement timeout.
- Bypass the validator by hand-editing: workbench-edited SQL re-enters the same validator.
What the validator does NOT prevent
- A poorly-permissioned connection. If the DB user the connection uses has more privileges than you intended, that’s a connection-config problem. Pollen8’s best practice: connect with a least-privilege read-only user.
- Information disclosure from queries that should have been scope-predicated but weren’t. Make sure your scope predicates cover every table you don’t want exposed.
Audit
Every execute call stamps a Why trace with:
- Connection id.
- Original NL question (when planner-routed).
- Planner SQL + edited SQL (when different).
- EXPLAIN cost estimate.
- Execution time + row count.
- User id + AuthContext token id.
Queryable via the audit table for any SOC2 / HIPAA accounting need.
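
To make the shape of a trace concrete, the fields listed above can be modelled as a small record. This is only a sketch: the class name, field names, and types are assumptions, and the real audit table schema may differ.

```python
from dataclasses import dataclass, asdict
from typing import Optional


@dataclass(frozen=True)
class WhyTrace:
    """One audit record per execute call (hypothetical field names)."""
    connection_id: str
    nl_question: Optional[str]      # set only when planner-routed
    planner_sql: str
    edited_sql: Optional[str]       # set only when it differs from planner_sql
    explain_cost_estimate: float
    execution_time_ms: int
    row_count: int
    user_id: str
    auth_token_id: str

    @property
    def executed_sql(self) -> str:
        """The SQL that actually ran: the edit if present, else the planner's."""
        return self.edited_sql if self.edited_sql is not None else self.planner_sql


trace = WhyTrace("conn-1", "How many encounters last week?",
                 "SELECT count(*) FROM encounters", None,
                 42.0, 18, 1, "user-7", "tok-9")
record = asdict(trace)  # plain dict, ready to serialize
```

Keeping both the planner SQL and the edited SQL makes it possible to see what a user changed in the workbench before execution.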