Skip to content

Compiling SQL scalar expressions

Nicolas Buduroi edited this page Jan 22, 2011 · 10 revisions

Compiling SQL Scalar Expressions

Definitions

SQL scalar expressions of different kinds can be used in a variety of statements like check constraints, row values, where and default clauses for examples. They can be broke down into the following (simplified) categories based on the standard.

  • Scalar values
    • <value expression>
  • Null or default values
    • <contextually typed value specification>
  • Qualified identifiers
    • <identifier chain>
  • Functions
    • <set function specification> (only for querying?)
    • <cast specification>
    • <numeric value function>
    • <string value function>
    • <datetime value function>
  • Case expressions
    • <case expression>
  • Operators
    • <boolean value expression>
    • <numeric value expression>
    • <string value expression>
    • <datetime value expression>

Scalar Values

These shall be represented by Clojure equivalent scalar values.

Null or Default Values

Represented as keyword (:null and :default) or null values could also simply be Clojure nil values.

Qualified Identifiers and Single-Argument Functions

Identifiers shall be represented as keyword in the part after the forward slash. The identifier chain would be separated by dots. When compiled each identifier in the chain could optionally delimited.

Functions shall be represented as keyword in the part before the forward slash. That's a great idea for functions having only one argument.

Case expression and Operators

Case expressions and operators shall be represented by corresponding Clojure functions.

AST

Here's a proposal for the eventual AST to be used:

ValueExpression ::=
  | ScalarExpression
  | NullOrDefaultExpression
  | Identifier
  | Function
  | CaseExpression
  | Operator

Implementation Differences

For MySQL, PostgreSQL and SQL Server.

Numeric Value Functions

  • position POSITION(substr IN str)
    • SQL Server: CHARINDEX(str, substr [, start_loc])
  • length for char/binary types
    • PostgreSQL: length (with an optional encoding parameter), bit_length, char_length, octet_length
    • MySQL: length, bit_length, char_length, octet_length
    • SQL Server: LEN for *CHAR types and DATALENGTH for the TEXT type.
  • extracting date/time extract(field from timestamp|interval)
    • SQL Server: DATEPART(field, date)
  • absolute
  • ceiling/floor
  • modulus
    • MySQL: % operator
    • SQL Server: % operator

String Value Functions and Operator

  • substring substring(string [from int] [for int])
    • PostgreSQL: standard substring plus regexp version and substr(string, from [, count])
    • SQL Server: SUBSTRING (value, start, length)
  • upper and lower case
  • character encoding
    • We'll skip those for now!
  • trimming trim([leading | trailing | both] [characters] from string)
    • SQL Server: only ltrim and rtrim
  • replace (standard got some weird TRANSLATE_REGEX function)
    • PostgreSQL: overlay(string placing string from int [for int])
    • MySQL: REPLACE(str, from_str, to_str)
    • SQL Server: REPLACE (string, pattern, replacement)
  • concatenation (|| operator)
    • MySQL: CONCAT function
    • SQL Server: + operator (no implicit convertion, no text or image support)

Datetime Value Functions

  • current time/data/timestamp (with or without timezone)