jcelerier/librediffusion

librediffusion

A C++ / CUDA / TensorRT implementation of StreamDiffusion

Implemented in ossia score

Benchmarks

On an RTX 5090 at 1 step:

SDXL Turbo 1024x1024: stable 26 fps

SD Turbo 512x512: stable 96 fps

SDXS: above 600 fps
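
For real-time use it can help to read these throughput figures as a per-frame time budget. The following is a minimal sketch, assuming the frame time is simply the reciprocal of the reported frame rate (this ignores any pipelining or latency hidden behind the fps number). The function name frame_time_ms is illustrative and not part of librediffusion:

```python
def frame_time_ms(fps: float) -> float:
    """Average time per frame in milliseconds for a given frame rate."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return 1000.0 / fps


# Frame rates reported in the benchmarks for an RTX 5090 at 1 step.
benchmarks = {
    "SDXL Turbo 1024x1024": 26.0,
    "SD Turbo 512x512": 96.0,
    "SDXS (lower bound on fps)": 600.0,
}

for name, fps in benchmarks.items():
    print(f"{name}: {frame_time_ms(fps):.2f} ms per frame")
```

Under that assumption, 26 fps corresponds to roughly 38 ms per frame, 96 fps to roughly 10 ms, and anything above 600 fps to less than 1.7 ms.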

Models must first be converted to TensorRT engines with the Python script train-lora.py:

$ uv run train-lora.py --model stabilityai/sd-turbo --min-batch 1 --max-batch 1 --opt-batch 1 --min-resolution 512 --max-resolution 1024 --output ./engines-sd-turbo
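
When scripting the conversion, the same command line can be assembled programmatically. The following is a minimal sketch that only builds and prints the command, using exactly the flags shown above. The helper name build_convert_command is hypothetical, and whether train-lora.py accepts values other than the ones shown is an assumption:

```python
import shlex


def build_convert_command(model: str, output_dir: str,
                          batch: int = 1,
                          min_resolution: int = 512,
                          max_resolution: int = 1024) -> list[str]:
    """Build the argument list for the conversion command from the README.

    Hypothetical helper: the flags are copied from the example command,
    and min/max/opt batch are all set to the same value as in that example.
    """
    return [
        "uv", "run", "train-lora.py",
        "--model", model,
        "--min-batch", str(batch),
        "--max-batch", str(batch),
        "--opt-batch", str(batch),
        "--min-resolution", str(min_resolution),
        "--max-resolution", str(max_resolution),
        "--output", output_dir,
    ]


cmd = build_convert_command("stabilityai/sd-turbo", "./engines-sd-turbo")
# Print the command instead of running it, so the sketch has no side effects.
print(shlex.join(cmd))
```

To actually run the conversion, the list can be passed to subprocess.run(cmd, check=True) from the directory containing the script.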
