Add post-processing SQL over semantic query results by nicosuave · Pull Request #135 · sidequery/sidemantic

nicosuave · 2026-04-10T23:37:02Z

Summary

Adds support for composing arbitrary SQL (CASE, window functions, arithmetic, COALESCE, etc.) on top of semantic query results via subquery wrapping. The inner query compiles through the semantic pipeline; the outer query is plain SQL left untouched.
Refactors the query rewriter to walk the AST recursively (bottom-up), enabling nested subqueries and semantic subqueries inside JOIN clauses.
Adds a post_process parameter to compile() and query() for the Python API path, with automatic CTE hoisting so the composed SQL is valid.
15 new tests covering CASE expressions, window functions, metric arithmetic, deeply nested subqueries, JOIN subqueries, and the post_process API.

Support arbitrary SQL (CASE, window functions, arithmetic, etc.) on top of semantic query results via subquery wrapping. The rewriter now walks the query tree recursively so nested subqueries and JOIN subqueries that reference semantic models are compiled correctly. Also adds a post_process parameter to compile() and query() for the Python API path, with automatic CTE hoisting.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 552c2d9091

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-04-10T23:40:19Z

sidemantic/sql/query_rewriter.py

+        has_subquery_in_joins = any(isinstance(join.this, exp.Subquery) for join in (parsed.args.get("joins") or []))

-        if has_ctes or has_subquery_in_from:
+        if has_ctes or has_subquery_in_from or has_subquery_in_joins:


Preserve root semantic rewrite when JOIN has subquery

Routing every query with a JOIN subquery through _rewrite_with_ctes_or_subqueries skips _rewrite_simple_query for the root SELECT, because _rewrite_select_tree only rewrites child scopes. For queries whose root FROM is a semantic model (for example, FROM orders o JOIN (SELECT ...) s), metric references like o.revenue are no longer rewritten through the semantic pipeline or rejected by the explicit JOIN guard, so execution can now fail with binder errors or silently return raw row-level columns instead of semantic aggregates.

Useful? React with 👍 / 👎.

When the root FROM references a semantic model and a JOIN contains a subquery, the routing into _rewrite_with_ctes_or_subqueries must still apply _rewrite_simple_query to the root SELECT so the explicit JOIN guard and semantic rewriting are not bypassed.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: c3565e5730

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-04-11T00:16:26Z

sidemantic/sql/query_rewriter.py

+        if self._references_semantic_model(parsed):
+            return self._rewrite_simple_query(parsed)


Keep CTE scope when rewriting root semantic queries

When a query has a root semantic FROM plus a WITH clause, this branch rewrites the whole root through _rewrite_simple_query, which regenerates SQL from extracted metrics/dimensions and drops the original CTE definitions. Any root predicate/expression that references those CTE names then fails at execution time (e.g., WITH allowed AS (...) SELECT ... FROM orders WHERE status IN (SELECT ... FROM allowed) now errors with missing table allowed). This regression is introduced by the new root rewrite path for CTE/subquery queries.

Useful? React with 👍 / 👎.

chatgpt-codex-connector · 2026-04-11T00:16:26Z

sidemantic/core/semantic_layer.py

+                    parsed_inner.set("with", None)
+                    body = parsed_inner.sql(dialect=target_dialect)
+                    ctes = with_clause.sql(dialect=target_dialect)
+                    return ctes + "\n" + post_process.replace("{inner}", body)


Merge hoisted CTEs with outer post-process WITH

The post-processing path prepends hoisted inner CTEs directly in front of the user-supplied post_process SQL. If post_process itself starts with WITH, the result becomes WITH ...\nWITH ..., which is invalid SQL and fails to parse/execute. Since the feature is documented as allowing arbitrary SQL wrapping, this makes valid CTE-based post-processing unusable whenever the inner semantic query emits CTEs.

Useful? React with 👍 / 👎.

Fix two issues from PR review: 1. When the root SELECT references a semantic model and has user-defined CTEs, _rewrite_simple_query was discarding them. Now user CTEs are saved before rewriting and merged back into the generated SQL so filter references like IN (SELECT ... FROM user_cte) remain valid. 2. When post_process SQL has its own WITH clause and the inner semantic query also produces CTEs, the two WITH clauses are now merged into one instead of producing invalid double-WITH SQL.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 02194ec3a6

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-04-11T14:37:53Z

sidemantic/sql/query_rewriter.py

+                    user_ctes = [cte.copy() for cte in original_with.expressions]
+                    gen_with.set("expressions", user_ctes + list(gen_with.expressions))


Preserve WITH RECURSIVE when merging root CTEs

When a root semantic query includes WITH RECURSIVE, this merge path copies only original_with.expressions into the generated WITH and drops the recursive flag from the original clause. The rewritten SQL becomes WITH ... instead of WITH RECURSIVE ..., so self-referencing CTEs fail at execution (e.g., DuckDB reports a circular CTE reference). I reproduced this with a root FROM orders query filtered by a recursive CTE; rewrite output omitted RECURSIVE.

Useful? React with 👍 / 👎.

Propagate the recursive flag from user-defined CTEs to the merged WITH clause so self-referencing CTEs execute correctly.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 6cc3958289

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-04-11T14:44:30Z

sidemantic/sql/query_rewriter.py

+                gen_with = rewritten.args.get("with")
+                if gen_with:
+                    user_ctes = [cte.copy() for cte in original_with.expressions]
+                    gen_with.set("expressions", user_ctes + list(gen_with.expressions))


Avoid duplicate CTE aliases when merging root WITH clauses

When a root semantic query already has a user CTE name that matches an internally generated CTE (for example orders_cte), this merge creates duplicate aliases and the rewritten SQL fails to parse. I reproduced this with WITH orders_cte AS (...) SELECT orders.revenue FROM orders ..., which rewrote to two orders_cte entries and raised a parser error. Since generated CTE names are implementation details, users can hit this unintentionally; the merge needs collision handling (rename or namespace).

Useful? React with 👍 / 👎.

chatgpt-codex-connector · 2026-04-11T14:44:30Z

sidemantic/core/semantic_layer.py

+                        merged = list(inner_with.expressions) + list(outer_with.expressions)
+                        outer_with.set("expressions", merged)


Prevent CTE name collisions in post_process WITH hoisting

The post_process hoisting path prepends inner generated CTEs to outer post_process CTEs without checking alias conflicts, so a user CTE name like orders_cte collides with the semantic generator’s orders_cte and produces invalid SQL. Calling compile(..., post_process='WITH orders_cte AS (...) SELECT ... FROM ({inner})') currently yields duplicate orders_cte definitions and fails at execution time.

Useful? React with 👍 / 👎.

Two changes: 1. post_process path: remove CTE hoisting entirely. Inner SQL (with CTEs) is placed directly in the subquery position. CTEs inside subqueries are valid in all target databases and naturally scoped, so name collisions with post_process CTEs cannot occur. 2. Root semantic + user CTEs: detect name collisions between user CTEs and generated CTEs, raising a clear error instead of producing invalid SQL. Walk-based renaming was too aggressive (renamed user CTE references inside filter subqueries).

chatgpt-codex-connector bot reviewed Apr 10, 2026

View reviewed changes

chatgpt-codex-connector bot reviewed Apr 11, 2026

View reviewed changes

Preserve WITH RECURSIVE when merging root CTEs

6cc3958

Propagate the recursive flag from user-defined CTEs to the merged WITH clause so self-referencing CTEs execute correctly.

chatgpt-codex-connector bot reviewed Apr 11, 2026

View reviewed changes

nicosuave merged commit 26e9294 into main Apr 11, 2026
15 checks passed

nicosuave deleted the nicosuave/subquery-postprocess branch April 11, 2026 15:07

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add post-processing SQL over semantic query results#135

Add post-processing SQL over semantic query results#135
nicosuave merged 5 commits intomainfrom
nicosuave/subquery-postprocess

nicosuave commented Apr 10, 2026

Uh oh!

chatgpt-codex-connector bot left a comment

Uh oh!

chatgpt-codex-connector bot Apr 10, 2026

Uh oh!

chatgpt-codex-connector bot left a comment

Uh oh!

chatgpt-codex-connector bot Apr 11, 2026

Uh oh!

chatgpt-codex-connector bot Apr 11, 2026

Uh oh!

chatgpt-codex-connector bot left a comment

Uh oh!

chatgpt-codex-connector bot Apr 11, 2026

Uh oh!

chatgpt-codex-connector bot left a comment

Uh oh!

chatgpt-codex-connector bot Apr 11, 2026

Uh oh!

chatgpt-codex-connector bot Apr 11, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

		if self._references_semantic_model(parsed):
		return self._rewrite_simple_query(parsed)

		user_ctes = [cte.copy() for cte in original_with.expressions]
		gen_with.set("expressions", user_ctes + list(gen_with.expressions))

		merged = list(inner_with.expressions) + list(outer_with.expressions)
		outer_with.set("expressions", merged)

Conversation

nicosuave commented Apr 10, 2026

Summary

Uh oh!

chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector bot Apr 10, 2026

Choose a reason for hiding this comment

Uh oh!

chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector bot Apr 11, 2026

Choose a reason for hiding this comment

Uh oh!

chatgpt-codex-connector bot Apr 11, 2026

Choose a reason for hiding this comment

Uh oh!

chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector bot Apr 11, 2026

Choose a reason for hiding this comment

Uh oh!

chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector bot Apr 11, 2026

Choose a reason for hiding this comment

Uh oh!

chatgpt-codex-connector bot Apr 11, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant