[codex] Harden tenant isolation SQL boundaries #4

Draft
kopy wants to merge 2 commits into lukaskratzel:main from kopy:codex/tenant-project-isolation-hardening

kopy commented May 3, 2026

Summary

This PR hardens tenant/project isolation for agent-generated SQL compiled by voight, especially when tenantScopingPolicy protects tables by scope columns such as project_id, tenant_id, or workspace_id.

Security assessment after the final pass:

  • No RCE was found.
  • No mutation/multi-statement parser escape was found.
  • No clean client-side CVE-grade issue was proven where attacker-controlled Agent SQL alone crosses tenant boundaries under a clearly correct full deployment configuration.
  • The strongest finding is a conditional policy/catalog issue: logical alias protection did not automatically protect the resolved physical table name when that physical table name was still queryable.
  • The remaining findings are defense-in-depth or deployment hazards around MySQL runtime behavior and policy identity spoofing.

The fixes in this PR still matter because tenant isolation often fails at exactly these catalog, policy, and database-runtime boundaries.

Findings Split

Agent-SQL-only Candidate

Direct physical-table bypass behind catalog aliases

If the intended security model is "the app exposes and protects the logical table name," this is the strongest finding.

Prerequisites:

  • Catalog maps a public logical name such as public_events to a physical table such as tracking.events.
  • Policy protects tables: ["public_events"].
  • The physical table name is still accepted by the catalog.
  • The attacker knows or guesses the physical table path.

Example before the fix:

-- catalog alias: public_events -> tracking.events
-- policy tables: ["public_events"]
SELECT project_id
FROM tracking.events
WHERE project_id = 'project-bravo';

What was wrong:

The SQL was valid, so the parser accepted it, and the old policy matched only the logical alias public_events. A direct reference to tracking.events could therefore compile without the injected project_id = 'project-alpha' guard.

Reassessment:

  • 7.1 High, conditional CVSS-style triage label.
  • Agent SQL alone can trigger it if logical aliases are treated as the public/security boundary.
  • If the deployment requires every queryable physical name to be separately protected or hidden, then this becomes a configuration gap rather than a standalone compiler break.

Fix:

The policy now resolves configured scope table names through the catalog, so protecting a logical alias also protects the resolved physical table identity.
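The idea behind this fix can be sketched as follows. This is a minimal illustration, not voight's actual implementation: the catalog shape and the names catalogAliases, resolveProtectedTables, and isProtected are all hypothetical.

```typescript
// Hypothetical sketch: none of these names come from voight's real API.
type TablePath = readonly string[]; // e.g. ["tracking", "events"]

// Assumed catalog shape: logical alias -> physical table path.
const catalogAliases = new Map<string, TablePath>([
  ["public_events", ["tracking", "events"]],
]);

// Encode a path as a string key; JSON keeps the segments separate.
const pathKey = (path: TablePath): string => JSON.stringify(path);

// Expand the configured scope tables so that each logical alias also
// protects the physical table it resolves to.
function resolveProtectedTables(configured: readonly string[]): Set<string> {
  const protectedKeys = new Set<string>();
  for (const name of configured) {
    protectedKeys.add(pathKey([name]));
    const physical = catalogAliases.get(name);
    if (physical !== undefined) {
      protectedKeys.add(pathKey(physical));
    }
  }
  return protectedKeys;
}

// A table reference needs the scope guard if its path is in the expanded set.
function isProtected(path: TablePath, protectedKeys: Set<string>): boolean {
  return protectedKeys.has(pathKey(path));
}
```

With tables: ["public_events"], both the alias and tracking.events end up in the protected set, so the direct physical query from the example would no longer skip the guard.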

Policy-identity Hardening

CTE reference alias shadowing

A CTE could be referenced with an alias that matched a scoped table name without tripping the shadowing guard.

Prerequisites:

  • Policy protects a table such as events.
  • Attacker can submit a WITH query.
  • Attacker aliases a CTE reference as the protected table name.

Example before the fix:

-- policy tables: ["events"]
WITH planted AS (
  SELECT id, 'project-bravo' AS project_id, metric
  FROM users
)
SELECT project_id
FROM planted AS events;

Reassessment:

  • 5.3 Medium CVSS-style triage label.
  • This is policy-identity spoofing, not a proven direct cross-project row leak by itself.
  • The old shadowing check rejected CTE names that matched scoped tables, but missed aliases on CTE references.

Fix:

The policy now rejects CTE references whose CTE name or table alias shadows a scoped table identity.
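A rough sketch of the extended shadowing check, under stated assumptions: the CteReference type and shadowsScopedTable function are hypothetical and do not reflect voight's AST, and identifiers are compared case-insensitively only for the purposes of this illustration.

```typescript
// Hypothetical sketch; these types are illustrative, not voight's AST.
interface CteReference {
  cteName: string; // name declared in the WITH clause
  alias?: string;  // optional alias at the reference site (FROM planted AS events)
}

// Assumption for this sketch: identifiers compare case-insensitively.
const normalize = (id: string): string => id.toLowerCase();

// Returns true when either the CTE name or its reference alias
// collides with a scoped table name.
function shadowsScopedTable(
  ref: CteReference,
  scopedTables: readonly string[],
): boolean {
  const scoped = new Set(scopedTables.map(normalize));
  if (scoped.has(normalize(ref.cteName))) {
    return true; // the case the old check already caught
  }
  // The case the old check missed: an alias on the CTE reference.
  return ref.alias !== undefined && scoped.has(normalize(ref.alias));
}
```

For the query above, the reference { cteName: "planted", alias: "events" } would be flagged because the alias matches the scoped table events.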

Config / Runtime Hazards

MySQL string project_id type-coercion leak

This is not an Agent-SQL-only break. It requires trusted application code to pass the wrong policy context type.

Prerequisites:

  • MySQL project_id is a string column, for example VARCHAR.
  • Trusted application code accidentally passes policyContext.projectId = 0 or false.

Example emitted guard before the fix:

WHERE project_id = 0

What was wrong:

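On MySQL 8.4, comparing a string column with a number converts the string to a number, and nonnumeric strings such as project-alpha and project-bravo convert to 0. Both therefore compare equal to 0, so a wrong trusted context type could widen the tenant boundary.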

Reassessment:

  • 4.8 Medium, config/trusted-input hazard CVSS-style triage label.
  • Not a parser escape.
  • Not attacker SQL alone against a correct config.

Fix:

tenantScopingPolicy now defaults scope values to strings. Numeric, bigint, and boolean scope columns must opt in through scopeValueType, for example scopeValueType: "bigint" for a BIGINT project_id.
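The string-by-default behavior might look roughly like the sketch below. Only the option name scopeValueType and the value "bigint" come from this PR; the other type names and the assertScopeValue helper are assumptions made for illustration.

```typescript
// Hypothetical sketch. "string" as the default and "bigint" as an opt-in
// follow the PR text; "number" and "boolean" are assumed spellings.
type ScopeValueType = "string" | "number" | "bigint" | "boolean";

// Reject any scope value whose runtime type differs from the declared one,
// so a stray 0 or false never reaches a string project_id guard.
function assertScopeValue(
  value: unknown,
  expected: ScopeValueType = "string",
): void {
  if (typeof value !== expected) {
    throw new TypeError(
      `scope value must be of type ${expected}, got ${typeof value}`,
    );
  }
}
```

Under this sketch, assertScopeValue("project-alpha") passes, while assertScopeValue(0) and assertScopeValue(false) throw before any WHERE project_id = 0 guard could be emitted.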

MySQL string collation widening

This is a schema/runtime hazard, not a compiler/parser bug.

Prerequisites:

  • MySQL string project_id uses case/accent-insensitive collation, for example utf8mb4_0900_ai_ci.
  • Different project IDs are equal under that collation.

Example:

WHERE project_id = 'project-alpha'

Under a case-insensitive collation such as utf8mb4_0900_ai_ci, that predicate can also match PROJECT-ALPHA.

Reassessment:

  • 4.2 Medium, schema hazard CVSS-style triage label.
  • Not Agent-SQL-only.
  • Mitigation is binary or case-sensitive collation/comparison semantics for string scope columns.
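A deployment check for this hazard could classify the collation of each scope column by name. The sketch below is a heuristic that relies only on MySQL's collation naming suffixes (_bin, _cs, _ci, _ai, _as); classifyCollation is a hypothetical helper, not part of voight, and a "sensitive" result does not guarantee byte-exact comparison.

```typescript
// Hypothetical helper: classify a MySQL collation by its name suffixes.
type CollationClass = "binary" | "sensitive" | "insensitive";

function classifyCollation(name: string): CollationClass {
  const lower = name.toLowerCase();
  // Binary collations compare code points or bytes directly.
  if (lower === "binary" || /_bin$/.test(lower)) {
    return "binary";
  }
  const parts = lower.split("_");
  const caseSensitive = parts.indexOf("cs") !== -1;
  const accentInsensitive = parts.indexOf("ai") !== -1;
  // Case-sensitive and not explicitly accent-insensitive.
  if (caseSensitive && !accentInsensitive) {
    return "sensitive";
  }
  // Everything else (for example *_ai_ci or *_as_ci) can equate
  // distinct project IDs and is treated as unsafe for scope columns.
  return "insensitive";
}
```

A schema audit could then flag any string scope column whose collation classifies as "insensitive".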

Additional Hardening

Function-call sandbox note

The focused RCE pass flagged arbitrary function calls when compile(...) is used without allowedFunctionsPolicy(...). This is not a default RDS MySQL 8.4 RCE: sys_exec/sys_eval are not RDS features and are not built-in MySQL 8.4 functions. They require dangerous third-party loadable UDFs or stored routines to be installed and executable. Built-ins such as LOAD_FILE() also require server privileges such as FILE and compatible server settings.

Still, because voight is intended for untrusted Agent SQL, callers should configure a strict function allowlist instead of relying on database deployment defaults.
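The allowlist principle is simple to sketch. The code below is not the allowedFunctionsPolicy implementation; the function set and the findDisallowedFunctions helper are hypothetical and only show the fail-closed shape of the check.

```typescript
// Hypothetical allowlist; a real deployment would choose its own set.
const allowedFunctions = new Set(["count", "sum", "min", "max", "lower"]);

// Return every called function name that is not on the allowlist.
// Anything unknown is rejected, including UDFs such as sys_exec
// and privileged built-ins such as LOAD_FILE.
function findDisallowedFunctions(calledNames: readonly string[]): string[] {
  return calledNames.filter(
    (name) => !allowedFunctions.has(name.toLowerCase()),
  );
}
```

A compile step would reject the query whenever this list is non-empty, regardless of which functions the database server happens to have installed.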

  • maxLimitPolicy now traverses every bound select, not only the outer select, and recursive defaultLimit insertion covers CTEs, derived tables, scalar subqueries, EXISTS, and IN subqueries.
  • Function names and CAST target type identifiers are rejected unless they are simple unquoted raw SQL identifiers.
  • Catalog path keys use tuple-style identity and reject dotted path segments, preventing quoted dotted identifiers from colliding with schema-qualified paths.
  • Schema-qualified physical tables fail closed when protected only by a short physical table name.
  • Alias expansion in GROUP BY, HAVING, and ORDER BY now preserves non-associative binary expression grouping.
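To illustrate the catalog path key item above: joining path segments with a dot lets a single quoted identifier containing a dot collide with a schema-qualified path. The sketch below contrasts that with a tuple-style key; naiveKey and tupleKey are hypothetical names, and the JSON encoding is just one possible way to keep segments distinct.

```typescript
type Path = readonly string[];

// Naive key: the one-segment path ["tracking.events"] (a quoted dotted
// identifier) and the two-segment path ["tracking", "events"] both
// become "tracking.events".
const naiveKey = (path: Path): string => path.join(".");

// Tuple-style key: segments stay separate, and dotted segments are
// rejected outright so they can never imitate a qualified path.
function tupleKey(path: Path): string {
  for (const segment of path) {
    if (segment.indexOf(".") !== -1) {
      throw new Error(`dotted path segment rejected: ${segment}`);
    }
  }
  return JSON.stringify(path);
}
```

Here naiveKey gives the same key for both paths, while tupleKey accepts ["tracking", "events"] and throws for ["tracking.events"].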

MySQL 8.4 / RDS Coverage

Added an opt-in live MySQL 8.4 integration test, gated by VOIGHT_MYSQL84_URL or VOIGHT_MYSQL84_PORT. The live check ran against a disposable mysql:8.4 container and covers:

  • OR 1=1 with direct victim predicates
  • victim-side LEFT JOIN predicates
  • scalar, EXISTS, and correlated aggregate subqueries
  • rejection of 0/false context values for string project_id
  • explicit BIGINT project_id scoping via scopeValueType: "bigint"
  • a documented MySQL collation hazard where case-insensitive string comparison can match PROJECT-ALPHA for project-alpha

The compile hardening suite also pins newer MySQL 8.4 features as rejected until the compiler models them and can tenant-scope every table-producing branch:

  • JSON_TABLE
  • TABLE
  • VALUES
  • UNION, INTERSECT, EXCEPT
  • recursive CTEs
  • named windows and window frames
  • JSON arrows
  • quantified subqueries
  • rollups
  • locking clauses and index hints
  • version comments and optimizer hints

Validation

Ran:

corepack pnpm run check

Result:

  • packages/voight: 39 test files passed, 1 skipped, 337 tests passed, 6 skipped
  • packages/voight-parser: 1 test file passed, 10 tests passed
  • format check passed
  • typecheck passed

Also ran live MySQL 8.4 validation:

VOIGHT_MYSQL84_PORT=32768 corepack pnpm --filter @voight8/voight exec vitest run tests/integration/security/mysql84-project-isolation.test.ts

Result: 1 test file passed, 6 tests passed.
