Skip to content

Centralize per-database logic behind a single SqlDialect abstraction #1317

@lovasoa

Description

@lovasoa

Introduce a SqlDialect abstraction to make per-database logic a single source of truth

Per-database knowledge is currently scattered across many files and keyed on two parallel enums that nothing forces to agree: SupportedDatabase (8 variants: Sqlite, Duckdb, Oracle, Postgres, MySql, Mssql, Snowflake, Generic) and sqlx's AnyKind (5 variants). Because the dialect-specific data is spread over ~6 independent match sites, several keyed on the wrong enum, the compiler cannot enforce completeness: each site silently falls through to a _ default. Adding or fixing a dialect today means hunting down every site and hoping none was missed, and a missed arm is a silent correctness/data bug, not a compile error.

Proof: the references are real and scattered

$ grep -rcn -E "SupportedDatabase::(Sqlite|Postgres|MySql|Mssql|Duckdb|Oracle|Snowflake|Generic)" src | grep -v ':0' ; echo "total:" ; grep -rn -E "SupportedDatabase::(Sqlite|Postgres|MySql|Mssql|Duckdb|Oracle|Snowflake|Generic)" src | wc -l
src/webserver/database/sql.rs:37
src/webserver/database/sql/parameter_extraction.rs:9
src/filesystem.rs:3
total:      49

$ grep -rn -E "AnyKind::" src | wc -l
      24

49 per-variant SupportedDatabase references (37 in sql.rs alone) plus 24 AnyKind references, all to be kept consistent by hand.

The scattered sites

The enum mismatch directly produces fragile fallthroughs that only exist because the key type is wrong:

Proposed change

Introduce a single trait SqlDialect (or a method-rich impl on SupportedDatabase) that centralizes everything per-DB:

  • placeholder style and placeholder cast type
  • sqlparser Dialect
  • otel db.system.name
  • default pool size
  • capability flags: native JSON, COPY FROM STDIN, trace propagation, rollback statement, :: cast support, MSSQL concat rewriting

Crucially, key the placeholder data on SupportedDatabase and make DB_PLACEHOLDERS (and the other tables) an exhaustive match, so the compiler forces every dialect to be handled and the unreachable!/_ => Odbc/_ => Generic fallthroughs disappear. Adding or fixing a dialect then becomes "implement one trait / add the arm the compiler demands" instead of editing ~6 scattered sites that nothing forces to agree.

Risks and mitigations

  • Large but mechanical refactor. Mitigate by leaning on the existing dialect/placeholder tests in sql.rs (the ALL_DIALECTS table drives many of them) which already pin per-dialect behavior.
  • Land it incrementally, one capability at a time, so each step stays reviewable and bisectable.

Expected win

  • Single source of truth for per-DB behavior.
  • Compiler-enforced dialect completeness instead of hand-synced matches across two enums.
  • Removes the silent _ => Odbc / _ => Generic fallthroughs that can quietly mishandle (or corrupt data for) an unlisted dialect.

This refines the "split sql.rs by responsibility" suggestion in #1249: a SqlDialect abstraction is the natural seam to split along.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions