
Update dependency diffusers to v0.38.0 #89

Open

renovate[bot] wants to merge 1 commit into master from renovate/diffusers-0.x

Conversation

renovate[bot] (Contributor) commented Dec 8, 2025

ℹ️ Note

This PR body was truncated due to platform limits.

This PR contains the following updates:

Package | Change
diffusers | ==0.35.2 → ==0.38.0

Release Notes

huggingface/diffusers (diffusers)

v0.38.0: Diffusers 0.38.0: New image and audio pipelines, Core library improvements, and more

Compare Source

New Pipelines

LLaDA2

LLaDA2 is a family of discrete diffusion language models that generate text through block-wise iterative refinement. Instead of autoregressive token-by-token generation, LLaDA2 starts with a fully masked sequence and progressively unmasks tokens by confidence over multiple refinement steps.
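The confidence-based unmasking loop described above can be sketched as a toy Python routine. This is an illustration of the general block-wise refinement idea only, not the diffusers API; `mock_model` is a hypothetical stand-in for the real transformer denoiser, which would score every still-masked position in a forward pass.

```python
import random

MASK = "<mask>"

def mock_model(seq, vocab):
    # Hypothetical stand-in for the denoiser: propose a token and a
    # confidence score for every still-masked position.
    return {
        i: (random.choice(vocab), random.random())
        for i, tok in enumerate(seq) if tok == MASK
    }

def iterative_unmask(length, vocab, steps):
    seq = [MASK] * length
    per_step = max(1, length // steps)  # tokens revealed per refinement step
    while MASK in seq:
        proposals = mock_model(seq, vocab)
        # Keep only the highest-confidence proposals this step; the
        # rest stay masked and are re-predicted next iteration.
        best = sorted(proposals.items(), key=lambda kv: kv[1][1], reverse=True)
        for i, (tok, _conf) in best[:per_step]:
            seq[i] = tok
    return seq

out = iterative_unmask(8, ["a", "b", "c"], steps=4)
print(out)
```

The key contrast with autoregressive decoding is that positions are filled in confidence order rather than left to right, and every remaining position is re-scored at each step.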

Nucleus-MoE

NucleusMoE-Image is a 17B-parameter model with 2B active parameters, trained with efficiency at its core. Its novel architecture highlights the scalability of a sparse MoE architecture for image generation.

Thanks to @​sippycoder for the contribution.

Ernie-Image

ERNIE-Image is a powerful and highly efficient image generation model with 8B parameters.

Thanks to @​HsiaWinter for the contribution.

LongCat-AudioDiT

LongCat-AudioDiT is a text-to-audio diffusion model from Meituan LongCat.

Thanks to @​RuixiangMa for the contribution.

Ace-Step 1.5

ACE-Step 1.5 generates variable-length stereo audio at 48 kHz (10 seconds to 10 minutes) from text prompts and optional lyrics. The full system pairs a Language Model planner with a Diffusion Transformer (DiT) synthesizer; this pipeline wraps the DiT half of that stack. It consists of three components: an AutoencoderOobleck VAE that compresses waveforms into 25 Hz stereo latents, a Qwen3-based text encoder for prompt and lyric conditioning, and an AceStepTransformer1DModel DiT that operates in the VAE latent space using flow matching.

Thanks to @​ChuxiJ for the contribution.
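From the rates stated above (48 kHz output audio, 25 Hz latents), the temporal compression factor and the latent sequence lengths the DiT must denoise follow directly. A small sketch using only the numbers given in these notes:

```python
SAMPLE_RATE = 48_000   # output audio rate in Hz, per the release notes
LATENT_RATE = 25       # VAE latent frame rate in Hz, per the release notes

def latent_frames(duration_s: float) -> int:
    # Number of latent frames corresponding to a clip of this length.
    return round(duration_s * LATENT_RATE)

# Waveform samples represented by each latent frame (per channel).
compression = SAMPLE_RATE // LATENT_RATE

print(compression)        # 1920
print(latent_frames(10))  # 250   (shortest supported clip)
print(latent_frames(600)) # 15000 (10-minute clip)
```

So a 10-minute generation means denoising a 15,000-frame latent sequence, which is why working in the compressed VAE space rather than on raw 48 kHz samples matters here.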

Flux.2 Small Decoder

Make your Flux.2 decoding faster with this new small decoder model from Black Forest Labs. You can check it out here. It was contributed by @​huemin-art in this PR.

Modular Pipeline Support

We added modular support for LTX-2 and Hunyuan 1.5.

Core Library

All commits

Significant community contributions

The following contributors have made significant changes to the library over the last release:

v0.37.1: Fixes for AutoModel type hints in Modular Pipelines and Flux Klein LoRA loading

Compare Source

  • Fix for loading ModularPipelines with AutoModel type hints in their modular_model_index.json #​13271
  • Fix Flux Klein LoRA loading #​13313
  • Fix unguarded torchvision import in Cosmos Predict 2.5 #​13321

v0.37.0: Diffusers 0.37.0: Modular Diffusers, New image and video pipelines, multiple core library improvements, and more 🔥

Compare Source

Modular Diffusers

Modular Diffusers introduces a new way to build diffusion pipelines by composing reusable blocks. Instead of writing entire pipelines from scratch, you can now mix and match building blocks to create custom workflows tailored to your specific needs! This complements the existing DiffusionPipeline class, providing a more flexible way to create custom diffusion pipelines.

Find more details on how to get started with Modular Diffusers here, and also check out the announcement post.
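The block-composition idea can be illustrated with a toy sketch. Note that these classes are hypothetical stand-ins for illustration, not the actual Modular Diffusers API (see the docs linked above for the real block types): each block reads from and writes to a shared state, and a pipeline is just an ordered composition of blocks.

```python
# Toy illustration of composing reusable pipeline blocks over a
# shared state dict. All class and key names here are invented.

class Block:
    def __call__(self, state: dict) -> dict:
        raise NotImplementedError

class EncodePrompt(Block):
    def __call__(self, state):
        state["embeds"] = f"embeds({state['prompt']})"
        return state

class Denoise(Block):
    def __call__(self, state):
        state["latents"] = f"denoised({state['embeds']})"
        return state

class Decode(Block):
    def __call__(self, state):
        state["image"] = f"image({state['latents']})"
        return state

def run_pipeline(blocks, **inputs):
    state = dict(inputs)
    for block in blocks:  # each block consumes and extends the state
        state = block(state)
    return state

result = run_pipeline([EncodePrompt(), Denoise(), Decode()], prompt="a cat")
print(result["image"])
```

Swapping, reordering, or inserting blocks (say, a custom guidance step between `Denoise` and `Decode`) is what "mix and match" means here, as opposed to subclassing an entire monolithic pipeline.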

New Pipelines and Models

Image 🌆
  • Z Image Omni Base: Z-Image is the foundation model of the Z-Image family, engineered for good quality, robust generative diversity, broad stylistic coverage, and precise prompt adherence. While Z-Image-Turbo is built for speed, Z-Image is a full-capacity, undistilled transformer designed to be the backbone for creators, researchers, and developers who require the highest level of creative freedom. Thanks to @​RuoyiDufor for contributing this in #​12857.
  • Flux2 Klein: FLUX.2 [Klein] unifies generation and editing in a single compact architecture, delivering state-of-the-art quality with end-to-end inference in under a second. It is built for applications that require real-time image generation without sacrificing quality, and runs on consumer hardware with as little as 13 GB of VRAM.
  • Qwen Image Layered: Qwen-Image-Layered is a model capable of decomposing an image into multiple RGBA layers. This layered representation unlocks inherent editability: each layer can be independently manipulated without affecting other content. Thanks to @​naykun for contributing this in #​12853.
  • FIBO Edit: FIBO Edit is an 8B parameter image-to-image model that introduces a new paradigm of structured control, operating on JSON inputs paired with source images to enable deterministic and repeatable editing workflows. Featuring native masking for granular precision, it moves beyond simple prompt-based diffusion to offer explicit, interpretable control optimized for production environments. Its lightweight architecture is designed for deep customization, empowering researchers to build specialized "Edit" models for domain-specific tasks while delivering top-tier aesthetic quality. Thanks to galbria for contributing it in #​12930.
  • Cosmos Predict2.5: Cosmos-Predict2.5 is the latest version of the Cosmos World Foundation Models (WFMs) family, specialized for simulating and predicting the future state of the world. Thanks to @​miguelmartin75 for contributing it in #​12852.
  • Cosmos Transfer2.5: Cosmos-Transfer2.5 is a conditional world generation model with adaptive multimodal control that produces high-quality world simulations conditioned on multiple control inputs. These inputs can take different modalities, including edges, blurred video, segmentation maps, and depth maps. Thanks to @​miguelmartin75 for contributing it in #​13066.
  • GLM-Image: GLM-Image is an image generation model that adopts a hybrid autoregressive + diffusion decoder architecture, effectively pushing the upper bound of visual fidelity and fine-grained details. In general image generation quality, it aligns with industry-standard LDM-based approaches, while demonstrating significant advantages in knowledge-intensive image generation scenarios. Thanks to @​zRzRzRzRzRzRzR for contributing it in #​12973.
  • RAE: Representation Autoencoders (aka RAE) are an exciting alternative to traditional VAEs, typically used in the area of latent-space diffusion models of image generation. RAEs leverage pre-trained vision encoders and train lightweight decoders for the task of reconstruction.
Video + audio 🎥 🎼
  • LTX-2: LTX-2 is an audio-conditioned text-to-video generation model that can generate videos with synced audio. Full and distilled model inference, as well as two-stage inference with spatial sampling, is supported. We also support a conditioning pipeline that allows for passing different conditions (such as images, series of images, etc.). Check out the docs to learn more!
  • Helios: Helios is a 14B video generation model that runs at 17 FPS on a single NVIDIA H100 GPU and supports minute-scale generation while matching a strong baseline in quality. Thanks to @​SHYuanBest for contributing this in #​13208.

Improvements to Core Library

New caching methods
New context-parallelism (CP) backends
Misc
  • Mambo-G Guidance: New guider implementation (#​12862)
  • Laplace Scheduler for DDPM (#​11320)
  • Custom Sigmas in UniPCMultistepScheduler (#​12109)
  • MultiControlNet support for SD3 Inpainting (#​11251)
  • Context parallel in native flash attention (#​12829)
  • NPU Ulysses Attention Support (#​12919)
  • Fix Wan 2.1 I2V Context Parallel Inference (#​12909)
  • Fix Qwen-Image Context Parallel Inference (#​12970)
  • Introduction of the @apply_lora_scale decorator for simplifying model definitions (#​12994)
  • Introduction of pipeline-level “cpu” device_map (#​12811)
  • Enable CP for kernels-based attention backends (#​12812)
  • Diffusers is fully functional with Transformers V5 (#​12976)

A lot of the above features/improvements came as part of the MVP program we have been running. Immense thanks to the contributors!

Bug Fixes

  • Fix QwenImageEditPlus on NPU (#​13017)
  • Fix MT5Tokenizer → use T5Tokenizer for Transformers v5.0+ compatibility (#​12877)
  • Fix Wan/WanI2V patchification (#​13038)
  • Fix LTX-2 inference with num_videos_per_prompt > 1 and CFG (#​13121)
  • Fix Flux2 img2img prediction (#​12855)
  • Fix QwenImage txt_seq_lens handling (#​12702)
  • Fix prefix_token_len bug (#​12845)
  • Fix ftfy imports in Wan and SkyReels-V2 (#​12314, #​13113)
  • Fix is_fsdp determination (#​12960)
  • Fix GLM-Image get_image_features API (#​13052)
  • Fix Wan 2.2 when either transformer isn't present (#​13055)
  • Fix guider issue (#​13147)
  • Fix torchao quantizer for new versions (#​12901)
  • Fix GGUF for unquantized types with unquantize kernels (#​12498)
  • Make Qwen hidden states contiguous for torchao (#​13081)
  • Make Flux hidden states contiguous (#​13068)
  • Fix Kandinsky 5 hardcoded CUDA autocast (#​12814)
  • Fix aiter availability check (#​13059)
  • Fix attention mask check for unsupported backends (#​12892)
  • Allow prompt and prior_token_ids simultaneously in GlmImagePipeline (#​13092)
  • GLM-Image batch support (#​13007)
  • Cosmos 2.5 Video2World frame extraction fix (#​13018)
  • ResNet: only use contiguous in training mode (#​12977)

All commits

Note

PR body was truncated to here.


Configuration

📅 Schedule: (UTC)

  • Branch creation
    • At any time (no schedule defined)
  • Automerge
    • At any time (no schedule defined)

🚦 Automerge: Disabled by config. Please merge this manually once you are satisfied.

Rebasing: Whenever PR is behind base branch, or you tick the rebase/retry checkbox.

🔕 Ignore: Close this PR and you won't be reminded about this update again.


  • If you want to rebase/retry this PR, check this box

This PR was generated by Mend Renovate. View the repository job log.

@renovate renovate Bot force-pushed the renovate/diffusers-0.x branch from 75b4d4b to a2b7aef Compare March 5, 2026 17:33
@renovate renovate Bot changed the title Update dependency diffusers to v0.36.0 Update dependency diffusers to v0.37.0 Mar 5, 2026
@renovate renovate Bot force-pushed the renovate/diffusers-0.x branch from a2b7aef to 7514a41 Compare March 25, 2026 08:59
@renovate renovate Bot changed the title Update dependency diffusers to v0.37.0 Update dependency diffusers to v0.37.1 Mar 25, 2026
@renovate renovate Bot force-pushed the renovate/diffusers-0.x branch from 7514a41 to 4f0e94c Compare May 1, 2026 14:51
@renovate renovate Bot changed the title Update dependency diffusers to v0.37.1 Update dependency diffusers to v0.38.0 May 1, 2026