Skip to content

Improve hstack lowering#22353

Open
rjzamora wants to merge 19 commits intorapidsai:mainfrom
rjzamora:improve-hstack-lowering
Open

Improve hstack lowering#22353
rjzamora wants to merge 19 commits intorapidsai:mainfrom
rjzamora:improve-hstack-lowering

Conversation

@rjzamora
Copy link
Copy Markdown
Member

@rjzamora rjzamora commented May 1, 2026

Description

Problem statement: We currently translate HStack nodes with non-pointwise expressions to the equivalent Select node at lowering time. This is because all our non-pointwise Expr-decomposition logic is specific to Select. Before this PR, this translation was skipped whenever the underlying HStack was completely overwriting it's original columns. The problem with this case is that we loose "anchor" columns that tell the Select how to broadcast scalar-aggregation results.

Proposed solution: We add a temporary "anchor" column to the translated HStack so that broadcasting works correctly in the Select node.

Motivation:

  • We can handle all over() expression decomposition within Select if we know all non-pointwise HStack operations are lowered to Select anyway.
  • We don't "fall back" for other non-over HStack corner cases either.

Checklist

  • I am familiar with the Contributing Guidelines.
  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.

@rjzamora rjzamora self-assigned this May 1, 2026
@rjzamora rjzamora requested a review from a team as a code owner May 1, 2026 18:37
@rjzamora rjzamora requested a review from vyasr May 1, 2026 18:37
@rjzamora rjzamora added 3 - Ready for Review Ready for review by team improvement Improvement / enhancement to an existing function non-breaking Non-breaking change labels May 1, 2026
@github-actions github-actions Bot added Python Affects Python cuDF API. cudf-polars Issues specific to cudf-polars labels May 1, 2026
@GPUtester GPUtester moved this to In Progress in cuDF Python May 1, 2026
Copy link
Copy Markdown
Contributor

@Matt711 Matt711 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good one nitpick. And I think we can write a test here.

Comment thread python/cudf_polars/cudf_polars/experimental/parallel.py Outdated
@rjzamora
Copy link
Copy Markdown
Member Author

rjzamora commented May 4, 2026

I think we can write a test here.

So, I'm not entirely sure if a new test makes sense. test_with_columns_scalar_upstream_20981 would fail if this code path was implemented incorrectly. The only reason we didn't get an error before is that we didn't define a msg argument to the _lower_ir_fallback call (so we didn't see a UserWarning failure here). Now that there is no silent fallback path, it needs to "work" for the test to pass.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

3 - Ready for Review Ready for review by team cudf-polars Issues specific to cudf-polars improvement Improvement / enhancement to an existing function non-breaking Non-breaking change Python Affects Python cuDF API.

Projects

Status: In Progress

Development

Successfully merging this pull request may close these issues.

3 participants