perf: remove Thrust #12

Draft
mojomex wants to merge 3 commits into main from perf/yank-out-thrust
@mojomex mojomex commented May 26, 2026

SpConv is great when there's only one model on the GPU, but its extensive use of Thrust comes with lots of cudaMalloc, cudaFree, and cudaMemset calls that cannot be avoided, even with thrust::cuda::par.on(stream).

We had similar issues with autoware_cuda_pointcloud_preprocessor here.

The (ugly but most straightforward) solution for SpConv is to add an API-compatible Thrust replacement, and point SpConv and CUMM to that replacement. This replacement always uses the Async equivalents of the above CUDA calls.

@mojomex mojomex self-assigned this May 26, 2026
mojomex added 3 commits May 26, 2026 21:27
Signed-off-by: Max SCHMELLER <max.schmeller@tier4.jp>
Signed-off-by: Max SCHMELLER <max.schmeller@tier4.jp>
Signed-off-by: Max SCHMELLER <max.schmeller@tier4.jp>
@mojomex mojomex force-pushed the perf/yank-out-thrust branch from 4819705 to 90271a6 Compare May 26, 2026 12:27