36000 (Collaborator) commented Jan 7, 2026

This is a large PR that does a lot of things:
(1) Removes the C++ and pybind layer in favor of CUDA Python. This means a C compiler is no longer needed, so the package can now be pip installed and should be put on PyPI. For now, CUDA Python also uses NVIDIA's runtime compiler (NVRTC). This seems to work: it only takes a second or two to compile, and the compiled code is not meaningfully slower. However, we can easily switch back to nvcc if we want to, because CUDA Python can also load code that is already compiled. As part of removing the C++ code and dependencies, I have also removed the dump streamlines call, and I am working on doing that in Python instead. For now, anyone with a large tractography should really be using TRX anyway, which is implemented.

(2) Streamlines the Python API so that users can more easily pass in their data and let GPUstreamlines handle things like TRX generation, batching, and some of the model parameters.

(3) Updates the PTT algorithm, which is now working great! After this PR, I can start optimizing it and adding features, which will hopefully attract more users.

(4) Refactors the CUDA C code (without really changing it) to separate the original bootstrapping code from the PTT and probabilistic code. These approaches require different inputs and are different enough that I think it makes sense to keep them separate in the code base. From the user's perspective, this change does not matter.

(5) Sets floats to single precision by default. This can still be changed back to double in the header files at any time. For now, this increases performance without changing streamline outputs meaningfully.
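To give a rough sense of what the single-precision default in (5) trades away, here is a small standard-library sketch that round-trips a value through 32-bit and 64-bit IEEE 754 storage. It does not touch any GPUstreamlines code; the helper name `roundtrip` and the example value are made up for illustration.

```python
import struct


def roundtrip(value: float, fmt: str) -> float:
    """Pack a Python float into the given struct format and unpack it again."""
    return struct.unpack(fmt, struct.pack(fmt, value))[0]


# Bytes per value: a real size of 4 corresponds to "<f", a real size of 8 to "<d".
single_bytes = struct.calcsize("<f")
double_bytes = struct.calcsize("<d")

step = 0.1  # arbitrary example value, not a GPUstreamlines default

as_single = roundtrip(step, "<f")
as_double = roundtrip(step, "<d")

# Relative rounding error introduced by storing the value at each width.
rel_err_single = abs(as_single - step) / step
rel_err_double = abs(as_double - step) / step

print("bytes per value:", single_bytes, double_bytes)
print("relative error:", rel_err_single, rel_err_double)
```

Single precision halves the memory per value and keeps roughly seven significant decimal digits, which fits the observation above that streamline outputs do not change meaningfully.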

Copilot AI review requested due to automatic review settings January 7, 2026 22:45
36000 (Collaborator, Author) commented Jan 7, 2026

@mauro-bis @romerojosh

This is a pretty significant change to the code base. It's a totally different API and hopefully more oriented toward Python users. I could have done most of this with the old setup, but I didn't like depending on pybind and a C compiler, and I also just wanted to try out CUDA Python. I hope these changes will lead more people to use this! I still find this software super useful, but it has not caught on yet in the wider community. We are planning on setting the current master to version 1.0, then merging this and calling it 2.0. Then, we plan to put this on PyPI. Does this sound good to you two?

Copilot AI left a comment

Pull request overview

This PR represents a major architectural refactoring that removes the C++/pybind11 build pipeline in favor of CUDA Python with JIT compilation. The changes include:

  • Migration from C++ bindings to pure Python with CUDA Python for GPU code compilation
  • Streamlined Python API with new GPUTracker context manager and direction getter classes
  • Improved PTT algorithm implementation with better parallelization
  • Separation of bootstrapping code from probabilistic/PTT tracking code
  • Simplified build system using setuptools instead of scikit-build-core
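The summary mentions a context manager for the tracker. As a rough illustration of why that pattern suits GPU work, the sketch below uses a hypothetical `SketchTracker` class that sets up and releases a stand-in resource around a `with` block. It is not the `GPUTracker` class from this PR, and every name, argument, and return value in it is an assumption made for the example.

```python
class SketchTracker:
    """Hypothetical stand-in; NOT the GPUTracker class from this PR."""

    def __init__(self, direction_getter: str):
        self.direction_getter = direction_getter
        self.resources_ready = False

    def __enter__(self):
        # A real tracker would compile kernels and allocate device memory here.
        self.resources_ready = True
        return self

    def __exit__(self, exc_type, exc, tb):
        # Runs even if the body raised, so resources are always released.
        self.resources_ready = False
        return False  # do not swallow exceptions

    def track(self, seeds):
        if not self.resources_ready:
            raise RuntimeError("use SketchTracker inside a 'with' block")
        # Placeholder result; a real tracker would return streamlines.
        return [f"streamline from {seed}" for seed in seeds]


with SketchTracker("prob") as tracker:
    results = tracker.track([(0, 0, 0), (1, 1, 1)])

released = not tracker.resources_ready
print(len(results), released)
```

The point of the pattern is that cleanup lives in `__exit__`, so callers cannot forget to release resources when tracking fails partway through.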

Key Changes:

  • Removes ~2400 lines of C++/CUDA code and replaces them with organized Python + CUDA modules
  • Introduces runtime compilation via CUDA Python instead of pre-compilation
  • Adds new helper classes (BootDirectionGetter, ProbDirectionGetter, PttDirectionGetter)
  • Updates API to be more user-friendly with better batching and TRX support
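Batching is listed above as something the new API handles for the user. A minimal sketch of the general idea, assuming seeds arrive as a plain sequence of 3D points, could look like the following. The function name `iter_batches` and the batch size are hypothetical and are not taken from the PR.

```python
from typing import Iterator, Sequence, Tuple

Seed = Tuple[float, float, float]


def iter_batches(seeds: Sequence[Seed], batch_size: int) -> Iterator[Sequence[Seed]]:
    """Yield consecutive slices of at most batch_size seeds."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(seeds), batch_size):
        yield seeds[start:start + batch_size]


# Ten made-up seed points, processed four at a time.
seeds = [(float(i), 0.0, 0.0) for i in range(10)]
batch_lengths = [len(batch) for batch in iter_batches(seeds, 4)]
print(batch_lengths)  # [4, 4, 2]
```

Processing seeds in fixed-size chunks bounds how much memory each pass needs, which is the usual reason to batch work sent to a GPU.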

Reviewed changes

Copilot reviewed 28 out of 31 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| `setup.py` | New build script that auto-generates Python constants from `globals.h` |
| `pyproject.toml` | Updated to use setuptools with cuda-python dependencies |
| `cuslines/__init__.py` | New package entry point exposing GPUTracker and direction getters |
| `cuslines/cuda_python/*.py` | New Python modules implementing GPU tracking with CUDA Python |
| `cuslines/cuda_c/*.cu` | Refactored CUDA kernels separated by algorithm type |
| `run_gpu_streamlines.py` | Simplified example using new API |
| `globals.h` | Updated constants including REAL_SIZE change to 4 (float32) |
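The file summary says `setup.py` auto-generates Python constants from `globals.h`. One way such a step could work is sketched below: it scans header text for numeric `#define` lines and renders them as Python assignments. This is an assumption about the general approach, not the PR's actual script. The header text is invented, and apart from `REAL_SIZE` the constant names are made up for the example.

```python
import re

# Hypothetical header text; the real globals.h is not reproduced here.
HEADER = """
#define REAL_SIZE 4
#define EXAMPLE_BLOCK_SIZE 64   // made-up constant
#define EXAMPLE_STEP 0.5        // made-up constant
"""

# Matches lines such as "#define NAME 123" or "#define NAME 1.5".
DEFINE_RE = re.compile(
    r"^[ \t]*#define[ \t]+(\w+)[ \t]+([-+]?\d+(?:\.\d+)?)\b", re.MULTILINE
)


def parse_defines(text):
    """Return a dict mapping each numeric #define name to an int or float."""
    constants = {}
    for name, raw in DEFINE_RE.findall(text):
        constants[name] = float(raw) if "." in raw else int(raw)
    return constants


def render_module(constants):
    """Render the constants as the source text of a small Python module."""
    lines = ["# Auto-generated from a C header; do not edit by hand."]
    lines += [f"{name} = {value!r}" for name, value in constants.items()]
    return "\n".join(lines) + "\n"


constants = parse_defines(HEADER)
module_source = render_module(constants)
print(module_source)
```

Generating the Python side from the header keeps a single source of truth, so a change such as switching the real size between 4 and 8 only has to be made in one place.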
