Use CUDA batch memory copy API wherever possible by kingcrimsontianyu · Pull Request #954 · rapidsai/kvikio

kingcrimsontianyu · 2026-04-14T20:43:20Z

The use of CUDA batched memory copy for general CPU-GPU copy is recommended by the driver team, as it avoids certain limitations in the traditional memory copy API, such as unexpected device-wide synchronizations. This PR replaces cuMemcpyHtoDAsync and cuMemcpyDtoHAsync with cuMemcpyBatchAsync.
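As a rough illustration of the shape of this change, the sketch below expresses a single copy as a batch of one. It is host-only and uses a hypothetical stand-in, `mock_memcpy_batch`, in place of the driver call. It is not the signature of cuMemcpyBatchAsync, which additionally takes copy attributes and a stream. The wrapper name `copy_one` is also an assumption made for this example.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical host-only stand-in for a batched copy entry point.
// It performs `count` independent copies described by parallel arrays.
// This is NOT the CUDA driver signature; it only mirrors the
// "arrays of destinations, sources, and sizes" calling convention.
inline int mock_memcpy_batch(void* const* dsts,
                             void const* const* srcs,
                             std::size_t const* sizes,
                             std::size_t count)
{
  for (std::size_t i = 0; i < count; ++i) {
    if (dsts[i] == nullptr || srcs[i] == nullptr) { return 1; }  // nonzero means failure
    std::memcpy(dsts[i], srcs[i], sizes[i]);
  }
  return 0;  // zero means success
}

// Hypothetical wrapper: a former single-copy call site rewritten as a
// batch that contains exactly one (dst, src, size) entry.
inline int copy_one(void* dst, void const* src, std::size_t size)
{
  void* dsts[]        = {dst};
  void const* srcs[]  = {src};
  std::size_t sizes[] = {size};
  return mock_memcpy_batch(dsts, srcs, sizes, 1);
}
```

A call site that previously issued one host-to-device copy would call `copy_one(dst, src, size)` in this sketch. With the real driver API, the attributes and stream arguments would have to be supplied as well.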

copy-pr-bot · 2026-04-14T20:43:24Z

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

kingcrimsontianyu · 2026-04-15T01:28:40Z

/ok to test 4d1e8c1

vuule

Looks good, just one question

vuule · 2026-05-06T22:45:09Z

  decltype(cuMemcpyHtoDAsync)* MemcpyHtoDAsync{nullptr};
  decltype(cuMemcpyDtoHAsync)* MemcpyDtoHAsync{nullptr};


are these still needed?

kingcrimsontianyu · May 7, 2026

They will not be used, but since they are currently part of a public API, I hope they can stay.
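The trade-off discussed here can be sketched with a small function table: the legacy pointer stays as a member so existing users of the struct still compile, while new code goes through the batch pointer. Everything below is a host-only assumption for illustration. `stub_copy_htod` and `stub_batch_copy` stand in for driver entry points, and the `DriverTable` struct, its `MemcpyBatchAsync` member, and `load_table` are hypothetical names, not KvikIO's actual definitions.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical stand-in for a legacy single-copy driver function.
inline int stub_copy_htod(void* dst, void const* src, std::size_t n)
{
  std::memcpy(dst, src, n);
  return 0;
}

// Hypothetical stand-in for a batched copy driver function.
inline int stub_batch_copy(void* const* dsts,
                           void const* const* srcs,
                           std::size_t const* sizes,
                           std::size_t count)
{
  for (std::size_t i = 0; i < count; ++i) {
    std::memcpy(dsts[i], srcs[i], sizes[i]);
  }
  return 0;
}

// Function table in the decltype(...)* style quoted in the review comment.
struct DriverTable {
  // No longer called internally, but kept so the public struct layout and
  // member names remain available to existing users.
  decltype(stub_copy_htod)* MemcpyHtoDAsync{nullptr};
  // Used by the new code path (hypothetical member name).
  decltype(stub_batch_copy)* MemcpyBatchAsync{nullptr};
};

// Populate the table. Real code would resolve the symbols from the driver
// library at run time instead of taking the addresses of local stubs.
inline DriverTable load_table()
{
  DriverTable table;
  table.MemcpyHtoDAsync  = &stub_copy_htod;
  table.MemcpyBatchAsync = &stub_batch_copy;
  return table;
}
```

Removing the legacy member would be a breaking change for any caller that reads `MemcpyHtoDAsync` from the table, which is consistent with the reply's preference to keep it.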

Use cuda batch memory copy API wherever applicable

c690b0a

kingcrimsontianyu added the improvement (Improves an existing functionality), non-breaking (Introduces a non-breaking change), and c++ (Affects the C++ API of KvikIO) labels Apr 14, 2026

kingcrimsontianyu added 4 commits April 14, 2026 16:56

Update existing code that uses legacy cuda memcpy API

3eb78d6

Fix bugs

1151018

Fix bugs

a4f64ce

Cleanup

4d1e8c1

kingcrimsontianyu changed the title ~~Use cuda batch memory copy API wherever possible~~ Use CUDA batch memory copy API wherever possible Apr 15, 2026

kingcrimsontianyu added 3 commits May 6, 2026 18:12

Merge branch 'main' into cuda-batch-memcpy

7f9592d

Update

e7e0cdd

Improve comment

7bf47cd

kingcrimsontianyu marked this pull request as ready for review May 6, 2026 18:38

kingcrimsontianyu requested a review from a team as a code owner May 6, 2026 18:38

vuule approved these changes May 6, 2026


Merge branch 'main' into cuda-batch-memcpy

a82c686

kingcrimsontianyu wants to merge 9 commits into rapidsai:main from kingcrimsontianyu:cuda-batch-memcpy
