Summary
Investigate integrating libcudf-rs as an optional GPU execution backend for Beacon queries.
Beacon already focuses on high-performance scientific and tabular data access with Arrow + DataFusion interoperability across formats such as Zarr, NetCDF, Parquet, Arrow IPC, CSV, and BBF. libcudf-rs may provide a useful path to accelerate eligible DataFusion physical plans using NVIDIA GPUs through RAPIDS cuDF.
This issue proposes a prototype to determine whether libcudf-rs can be integrated cleanly, safely, and optionally without changing Beacon’s default CPU execution path.
Motivation
Some Beacon workloads may be GPU-friendly, especially:
- large tabular scans from Parquet / Arrow IPC / CSV
- projection-heavy queries
- filter-heavy queries
- aggregation workloads
- sort / group-by workloads
- repeated analytical queries over large datasets
libcudf-rs includes a libcudf-datafusion crate that integrates with Apache DataFusion by applying physical optimizer rules that replace eligible DataFusion execution nodes with cuDF-backed GPU variants.
For Beacon, this could provide an optional acceleration path for supported query plans while retaining the existing CPU/DataFusion path as the default and fallback.
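The node-replacement idea can be sketched with a toy plan tree. Every type and function below is hypothetical: none of them are DataFusion or libcudf-rs APIs, and the choice of which operators count as eligible is an assumption made only for illustration. The sketch also uses a deliberately simple policy (a node moves to the GPU only when its whole subtree can), whereas a real rule set would likely insert host/device transfer nodes instead.

```rust
// Toy model of a physical plan. All names are hypothetical stand-ins,
// not DataFusion or libcudf-rs types.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Device {
    Cpu,
    Gpu,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Op {
    ParquetScan,
    ZarrScan,
    Filter,
    Aggregate,
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Node {
    pub op: Op,
    pub device: Device,
    pub children: Vec<Node>,
}

impl Node {
    /// Builds a node that starts on the default CPU path.
    pub fn cpu(op: Op, children: Vec<Node>) -> Node {
        Node { op, device: Device::Cpu, children }
    }
}

/// Assumed eligibility table: ZarrScan is treated as CPU-only here
/// purely to demonstrate the fallback behaviour.
fn gpu_eligible(op: Op) -> bool {
    matches!(op, Op::ParquetScan | Op::Filter | Op::Aggregate)
}

/// Bottom-up rewrite: a node is marked for the GPU only if its operator
/// is eligible and every child has already been marked for the GPU.
/// Anything else stays on the CPU path unchanged.
pub fn offload(node: Node) -> Node {
    let children: Vec<Node> = node.children.into_iter().map(offload).collect();
    let children_on_gpu = children.iter().all(|c| c.device == Device::Gpu);
    let device = if gpu_eligible(node.op) && children_on_gpu {
        Device::Gpu
    } else {
        Device::Cpu
    };
    Node { op: node.op, device, children }
}
```

Under these assumptions, an Aggregate over a Filter over a ParquetScan is marked for the GPU as a whole, while a Filter over a ZarrScan stays entirely on the CPU.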
Goals
- Evaluate whether libcudf-rs can be used as an optional GPU backend for Beacon.
- Identify which Beacon query paths are compatible with libcudf-datafusion.
- Prototype GPU acceleration for a minimal subset of query plans.
- Preserve current CPU behavior when GPU support is disabled or unavailable.
- Define a clean feature flag / runtime configuration model.
- Measure performance and correctness against the existing execution engine.
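One possible shape for the feature flag / runtime configuration goal is sketched below. It is a minimal sketch, not a proposed final design: the `GpuMode` values, the `select_backend` function, and the idea of combining a compile-time flag with a runtime device check are all assumptions introduced for illustration. The key property it tries to show is that the CPU path remains the default and the fallback.

```rust
// Hypothetical configuration model; none of these names exist in Beacon
// or libcudf-rs today.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum GpuMode {
    /// Never use the GPU (default).
    Off,
    /// Use the GPU when possible, otherwise fall back to the CPU.
    Auto,
    /// Fail instead of silently falling back.
    Required,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Backend {
    Cpu,
    Gpu,
}

impl GpuMode {
    /// Parses an optional setting value; a missing or empty value means Off.
    pub fn parse(raw: Option<&str>) -> Result<GpuMode, String> {
        match raw.map(|s| s.trim().to_ascii_lowercase()).as_deref() {
            None | Some("") | Some("off") => Ok(GpuMode::Off),
            Some("auto") => Ok(GpuMode::Auto),
            Some("required") => Ok(GpuMode::Required),
            Some(other) => Err(format!("unknown GPU mode: {other}")),
        }
    }
}

/// Chooses the execution backend.
/// `compiled_with_gpu` would typically come from a Cargo feature check
/// (for example `cfg!(feature = "gpu")`, with a hypothetical feature name);
/// `device_available` would come from a runtime probe.
pub fn select_backend(
    mode: GpuMode,
    compiled_with_gpu: bool,
    device_available: bool,
) -> Result<Backend, String> {
    let usable = compiled_with_gpu && device_available;
    match (mode, usable) {
        (GpuMode::Off, _) => Ok(Backend::Cpu),
        (GpuMode::Auto, true) | (GpuMode::Required, true) => Ok(Backend::Gpu),
        (GpuMode::Auto, false) => Ok(Backend::Cpu),
        (GpuMode::Required, false) => {
            Err("GPU execution was required but is not available".to_string())
        }
    }
}
```

Separating `Auto` from `Required` keeps the default behavior safe while still letting benchmarks and correctness tests fail loudly when the GPU path is unexpectedly not taken.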