[FEA][Java][JNI] Support multi-input groupby aggregations #22377

Draft
malinjawi wants to merge 1 commit into rapidsai:main from malinjawi:java-multi-input-aggregations

malinjawi commented May 5, 2026

Description

Adds role-tagged Java/JNI groupby aggregations that allow a logical
multi-input aggregate to be represented by multiple Aggregation instances
sharing a correlation id. The JNI groupby path buckets those role-tagged
aggregations, dispatches the underlying libcudf request directly, and places
results back in the requested output slots.
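
The bucketing step described above can be pictured with a small plain-Java sketch. It is illustrative only: the `Role` enum, the `TaggedAgg` class, and `pairByCorrelationId` are hypothetical names, not the types used in this PR, and the real grouping happens on the JNI side rather than in Java.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class RoleBucketingSketch {
  // Hypothetical role tags; the PR's native representation may differ.
  public enum Role { ORDERING, VALUE }

  // Hypothetical stand-in for one role-tagged aggregation request.
  public static final class TaggedAgg {
    final long multiInputId; // correlation id shared by both roles
    final Role role;
    final int inputColumn;   // index of the column this role reads

    public TaggedAgg(long multiInputId, Role role, int inputColumn) {
      this.multiInputId = multiInputId;
      this.role = role;
      this.inputColumn = inputColumn;
    }
  }

  // Pairs requests by correlation id.
  // Returns id -> {orderingColumn, valueColumn}, in first-seen id order.
  public static Map<Long, int[]> pairByCorrelationId(List<TaggedAgg> requests) {
    Map<Long, int[]> pairs = new LinkedHashMap<>();
    for (TaggedAgg r : requests) {
      int[] pair = pairs.computeIfAbsent(r.multiInputId, k -> new int[] {-1, -1});
      int slot = (r.role == Role.ORDERING) ? 0 : 1;
      if (pair[slot] != -1) {
        throw new IllegalArgumentException("duplicate " + r.role + " for id " + r.multiInputId);
      }
      pair[slot] = r.inputColumn;
    }
    // Every logical aggregate needs both roles before it can be dispatched.
    for (Map.Entry<Long, int[]> e : pairs.entrySet()) {
      if (e.getValue()[0] == -1 || e.getValue()[1] == -1) {
        throw new IllegalArgumentException("incomplete pair for id " + e.getKey());
      }
    }
    return pairs;
  }

  public static void main(String[] args) {
    List<TaggedAgg> requests = Arrays.asList(
        new TaggedAgg(7L, Role.VALUE, 2),
        new TaggedAgg(9L, Role.ORDERING, 3),
        new TaggedAgg(7L, Role.ORDERING, 1),
        new TaggedAgg(9L, Role.VALUE, 4));
    for (Map.Entry<Long, int[]> e : pairByCorrelationId(requests).entrySet()) {
      System.out.println(e.getKey() + " -> " + Arrays.toString(e.getValue()));
    }
  }
}
```

Requests may arrive in any order and interleaved with other ids; the sketch only assumes that each id carries exactly one ordering role and one value role.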

This PR adds the initial min_by roles:

  • GroupByAggregation.orderingForMinBy(long multiInputId)
  • GroupByAggregation.valueForMinBy(long multiInputId)
  • MultiInputIds.next() for thread-safe id generation
  • JNI-native multi_input_aggregation dispatch for paired min-by inputs
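
For readers wondering what thread-safe id generation might look like, here is a minimal sketch. It assumes an `AtomicLong` counter; the PR description does not say how `MultiInputIds.next()` is implemented, so the class name `MultiInputIdsSketch` and its internals are hypothetical.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

public class MultiInputIdsSketch {
  // Assumption: a single process-wide counter is enough to correlate roles.
  private static final AtomicLong NEXT = new AtomicLong(1);

  // getAndIncrement is atomic, so concurrent callers never receive the same id.
  public static long next() {
    return NEXT.getAndIncrement();
  }

  public static void main(String[] args) throws InterruptedException {
    final Set<Long> seen = ConcurrentHashMap.newKeySet();
    Thread[] workers = new Thread[4];
    for (int t = 0; t < workers.length; t++) {
      workers[t] = new Thread(() -> {
        for (int i = 0; i < 1000; i++) {
          seen.add(next());
        }
      });
      workers[t].start();
    }
    for (Thread w : workers) {
      w.join();
    }
    System.out.println(seen.size()); // prints 4000: every id is distinct
  }
}
```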

This avoids forcing callers to pack multi-input aggregate arguments into a
struct column before groupby and unpack the struct afterward.
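
To make the intended semantics concrete, the sketch below computes a min-by over two parallel columns without packing them into a struct. It is plain Java over `int` arrays, not cuDF code, and it assumes the usual min-by meaning: for each group, return the value from the row whose ordering is smallest. The name `MinBySketch` and the tie-breaking rule are assumptions made for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class MinBySketch {
  // For each key, returns the value from the row whose ordering is smallest.
  // Assumption: ties keep the first row seen; the PR does not state a tie rule.
  public static Map<Integer, Integer> minBy(int[] keys, int[] ordering, int[] values) {
    Map<Integer, Integer> bestRow = new LinkedHashMap<>();
    for (int row = 0; row < keys.length; row++) {
      Integer current = bestRow.get(keys[row]);
      if (current == null || ordering[row] < ordering[current]) {
        bestRow.put(keys[row], row);
      }
    }
    // Translate the winning row index of each group into its value.
    Map<Integer, Integer> result = new LinkedHashMap<>();
    for (Map.Entry<Integer, Integer> e : bestRow.entrySet()) {
      result.put(e.getKey(), values[e.getValue()]);
    }
    return result;
  }

  public static void main(String[] args) {
    int[] keys = {1, 1, 2, 2, 2};
    int[] ordering = {5, 3, 9, 4, 7};
    int[] values = {50, 30, 90, 40, 70};
    System.out.println(minBy(keys, ordering, values)); // prints {1=30, 2=40}
  }
}
```

The ordering and value columns stay separate inputs throughout, which is the property the role-tagged API is meant to preserve.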

closes #22276

Validation

  • clang-format -i on touched native C++ files.
  • git diff --check rapids/main...HEAD.
  • Java main source compilation with javac -source 8 -target 8.
  • Focused Java test compilation for MultiInputAggregationTest.
  • mvn -B surefire:test -Dtest="ai.rapids.cudf.MultiInputAggregationTest#testMultiInputIdsUniqueness+testMinByAggregationEquality" -DfailIfNoTests=false in the cuDF Java devcontainer.
  • mvn -B install -DskipTests ... in the cuDF Java devcontainer, including native cudfjni rebuild of AggregationJni.cpp and TableJni.cpp.

Checklist

  • I am familiar with the Contributing Guidelines.
  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.

copy-pr-bot Bot commented May 5, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

github-actions Bot added the Java label (Affects Java cuDF API.) May 5, 2026

malinjawi (Author) commented

I do not have permission to add RAPIDS repo labels from this fork. Could a maintainer please add the required label-checker labels? Suggested labels: feature request, Java, Spark, and non-breaking.

Development

Successfully merging this pull request may close these issues.

[FEA][Java][JNI] Support multi-input aggregations without requiring a struct-column wrapper
