Skip to content

feat: Support planning subqueries with OuterReferenceColumn belongs to non-adjacent outer relations #19930

Open
mkleen wants to merge 26 commits intoapache:mainfrom
mkleen:plann-recursive-subquery-cleanup
Open

feat: Support planning subqueries with OuterReferenceColumn belongs to non-adjacent outer relations #19930
mkleen wants to merge 26 commits intoapache:mainfrom
mkleen:plann-recursive-subquery-cleanup

Conversation

@mkleen
Copy link
Contributor

@mkleen mkleen commented Jan 21, 2026

Which issue does this PR close?

Cleaned-up version of #18806 with:

Rationale for this change

See #18806

What changes are included in this PR?

See #18806

Are these changes tested?

Yes

Are there any user-facing changes?

outer_queries_schema is removed from PlannerContext.

@github-actions github-actions bot added sql SQL Planner logical-expr Logical plan and expressions sqllogictest SQL Logic Tests (.slt) and removed sqllogictest SQL Logic Tests (.slt) labels Jan 21, 2026
@mkleen mkleen changed the title Plann recursive subquery cleanup feat: Support planning subqueries with OuterReferenceColumn belongs to non-adjacent outer relations Cleanup Jan 21, 2026
Projection: l.a, l.b, l.c, l.d, l.e
SubqueryAlias: l
Projection: lineitem.l_item_id AS a, lineitem.l_description AS b, lineitem.price AS c
Projection: lineitem.l_orderkey AS a, lineitem.l_item_id AS b, lineitem.l_description AS c, lineitem.l_extendedprice AS d, lineitem.price AS e
Copy link
Contributor Author

@mkleen mkleen Jan 21, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is because the lineitem table is extended now.

"o1.o_custkey".to_string(),
"o1.o_orderstatus".to_string(),
"o1.customer_id".to_string(),
"o1.o_totalprice".to_string(),
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is because the order table is extended now.

assert_snapshot!(
err.strip_backtrace(),
@"Error during planning: Source table contains 3 columns but only 2 names given as column alias"
@"Error during planning: Source table contains 5 columns but only 2 names given as column alias"
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is because the lineitem table is extended now.

])),
"orders" => Ok(Schema::new(vec![
Field::new("order_id", DataType::UInt32, false),
Field::new("o_orderkey", DataType::UInt32, false),
Copy link
Contributor Author

@mkleen mkleen Jan 21, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These extensions were necessary to get the tests to work.

github-merge-queue bot pushed a commit that referenced this pull request Jan 22, 2026
## Which issue does this PR close?

While working on #19930 , I
noticed a trailing whitespace in the CROSS JOIN logical plan output.
This whitespace is inconsistent with the rest of the logical plan
formatting.

## Rationale for this change

This change removes the unnecessary trailing whitespace in the logical
plan output of a CROSS JOIN.

## What changes are included in this PR?

## Are these changes tested?

Yes.

## Are there any user-facing changes?

No.
@mkleen mkleen changed the title feat: Support planning subqueries with OuterReferenceColumn belongs to non-adjacent outer relations Cleanup feat: Support planning subqueries with OuterReferenceColumn belongs to non-adjacent outer relations Jan 23, 2026
@mkleen mkleen force-pushed the plann-recursive-subquery-cleanup branch from 99902bc to 3b1fa64 Compare January 23, 2026 08:51
@mkleen
Copy link
Contributor Author

mkleen commented Jan 26, 2026

@alamb @duongcongtoai Could you review this when you have a moment so we can get it merged?

@mkleen mkleen force-pushed the plann-recursive-subquery-cleanup branch from af87772 to cd81a28 Compare January 26, 2026 08:52
@duongcongtoai
Copy link
Contributor

nice, i'll take a look once i have time

fn table_with_column_alias() {
let sql = "SELECT a, b, c
FROM lineitem l (a, b, c)";
let sql = "SELECT a, b, c, d, e
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would be nice to have a sqllogictest for this PR

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah i see you replaced them with integration test instead

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree we should have some sqllogictest coverage to make sure everything is hooked up right

Copy link
Contributor Author

@mkleen mkleen Feb 7, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added two sqllogictests back which pass without the optimizer changes, the other two remain in sql_integration.rs because they depend on the optimizer change.

// TODO revisit this validation logic
plan_err!(
"Correlated scalar subquery in the GROUP BY clause must also be in the aggregate expressions"
"Correlated scalar subquery in the GROUP BY clause must \
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: this change is unrelated (it's in my original PR)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is from fmt.

@mkleen
Copy link
Contributor Author

mkleen commented Feb 5, 2026

@duongcongtoai Thank you for your feedback!

Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @mkleen and @duongcongtoai

I think this looks pretty good to me -- I have some small suggestions for API improvements, but the basic idea is 🏆

It would also be nice to restore one or two slt tests to make sure this code is connected end to end

fn table_with_column_alias() {
let sql = "SELECT a, b, c
FROM lineitem l (a, b, c)";
let sql = "SELECT a, b, c, d, e
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree we should have some sqllogictest coverage to make sure everything is hooked up right

/// The query schema of the outer query plan, used to resolve the columns in subquery
outer_query_schema: Option<DFSchemaRef>,

/// The queries schemas of outer query relations, used to resolve the outer referenced
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

perfect

@mkleen
Copy link
Contributor Author

mkleen commented Feb 6, 2026

@alamb Thank you for the review. I will add your feedback soon!

@alamb
Copy link
Contributor

alamb commented Feb 6, 2026

Thanks @mkleen -- sorry for the delay in reviewing this

@mkleen mkleen force-pushed the plann-recursive-subquery-cleanup branch from 8899f47 to 5328de1 Compare February 7, 2026 13:11
@github-actions github-actions bot added the sqllogictest SQL Logic Tests (.slt) label Feb 7, 2026
@mkleen mkleen force-pushed the plann-recursive-subquery-cleanup branch from 9ba50c7 to 42170bf Compare February 7, 2026 13:40
@mkleen mkleen requested review from alamb and duongcongtoai February 7, 2026 19:35
@mkleen
Copy link
Contributor Author

mkleen commented Feb 7, 2026

@alamb @duongcongtoai Could you please do one more review?

This is the interesting change 818926d

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

logical-expr Logical plan and expressions sql SQL Planner sqllogictest SQL Logic Tests (.slt)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Supporting planning (binding) Nested correlated subquery error with a depth exceeding 1

3 participants