Skip to content

Fix SDXL Diffusers Format Support & Add Manual Model Type Override#1285

Open
fewtarius wants to merge 3 commits intoleejet:masterfrom
SyntheticAutonomicMind:master
Open

Fix SDXL Diffusers Format Support & Add Manual Model Type Override#1285
fewtarius wants to merge 3 commits intoleejet:masterfrom
SyntheticAutonomicMind:master

Conversation

@fewtarius
Copy link

Summary

This PR fixes loading of SDXL models in diffusers directory format and adds a new --model-type CLI parameter for manual model version control.

Problem

When loading SDXL models in diffusers directory format (e.g., duchaiten-pony-real-v20-sdxl), the text encoder tensors failed to load with "unknown tensor" or "tensor not in model file" errors. This occurred because:

  1. Missing prefix conversions: The te. and te.1. prefixes used by diffusers format weren't in the name conversion list
  2. Incorrect tensor prefixes: Text encoders were loaded with prefixes that didn't match what the model graph expects
  3. Version detection issues: Later components (VAE) could override the already-detected SDXL version

Solution

1. Name Conversion Support (name_conversion.cpp)

Added te. and te.1. prefixes to the cond_stage_model conversion list, enabling proper tensor name translation from diffusers format to checkpoint format.

2. Correct Tensor Prefixes (model.cpp)

  • Changed text_encoder prefix from te. to cond_stage_model.transformer.
  • Changed text_encoder_2 prefix from te.1. to cond_stage_model.1.transformer.
  • Added early return in get_sd_version() when SDXL is detected to prevent VAE from overriding version

3. Version Caching (stable-diffusion.cpp)

Added version caching to preserve SDXL detection across component loading phases.

4. Manual Model Type Override (--model-type parameter)

New CLI parameter allowing users to manually specify the model version:

--model-type sdxl          # Force SDXL version
--model-type sd1           # Force SD 1.x version
--model-type flux          # Force FLUX version

Supported values: sd1, sd2, sdxl, sdxl_inpaint, sdxl_pix2pix, flux, sd3, svd

Use cases:

  • Auto-detection fails or is ambiguous
  • Testing model behavior with different version settings
  • Working with modified/custom models

Files Changed

File Changes
src/name_conversion.cpp +2 lines: Added te. and te.1. prefix mappings
src/model.cpp +15/-3 lines: Updated diffusers text encoder prefixes, improved SDXL detection
src/stable-diffusion.cpp +18/-4 lines: Added version caching and manual override logic
include/stable-diffusion.h +1 line: Added version_override field to sd_ctx_params_t
examples/common/common.hpp +40 lines: Added --model-type CLI parameter and string-to-enum conversion

Testing

  • ✅ SDXL diffusers models now load successfully
  • ✅ SD 1.5 models continue to work (regression tested)
  • ✅ All text encoder tensors are found and loaded correctly
  • ✅ Manual override with --model-type sdxl works
  • ✅ Auto-detection still works when --model-type is not specified

Commits

  1. fix(diffusers): use correct tensor name prefixes for SDXL text encoders - Core fix for tensor prefix matching
  2. feat(cli): add --model-type parameter for manual version override - User-facing control over model detection
  3. fix(diffusers): add support for diffusers SDXL text encoder prefixes - Name conversion support for diffusers format

Compatibility

  • Backward compatible: All existing workflows continue to work unchanged
  • No breaking changes: Auto-detection remains the default behavior
  • Additive only: New functionality doesn't modify existing behavior

Related Issues

Fixes loading of diffusers-format SDXL models that previously failed with tensor loading errors.

Problem:
When loading SDXL models in diffusers directory format, text encoders were
loaded with prefixes "te." and "te.1." which don't match the expected tensor
names in the model graph. The model expects "cond_stage_model.transformer."
for clip_l and "cond_stage_model.1.transformer." for clip_g.

This caused "tensor not in model file" errors for all text encoder tensors
when loading SDXL diffusers models.

Solution:
- Changed text_encoder prefix from "te." to "cond_stage_model.transformer."
- Changed text_encoder_2 prefix from "te.1." to "cond_stage_model.1.transformer."
- These prefixes now match what's used when loading separate clip_l/clip_g files
- Added early return in get_sd_version() when SDXL is detected to prevent
  later components (VAE) from overriding the version
- Added version caching to prevent re-detection from changing SDXL version

Testing:
- SDXL diffusers models now load successfully
- SD 1.5 models continue to work (regression tested)
- All text encoder tensors are found and loaded correctly

Files changed:
- model.cpp: Updated diffusers text encoder prefixes and SDXL detection logic
- stable-diffusion.cpp: Added version caching to preserve SDXL detection
Adds a new --model-type CLI parameter that allows users to manually specify
the model version instead of relying on auto-detection. This is useful when:
- Auto-detection fails or is ambiguous
- Testing model behavior with different version settings
- Working with modified/custom models

Usage:
  --model-type sdxl          # Force SDXL version
  --model-type sd1           # Force SD 1.x version
  --model-type flux          # Force FLUX version

Supported values: sd1, sd2, sdxl, sdxl_inpaint, sdxl_pix2pix, flux, sd3, svd

Implementation:
- Added version_override field to sd_ctx_params_t struct
- Added model_type string parameter to SDContextParams
- Added string-to-enum conversion in to_sd_ctx_params_t()
- Updated model loading to check for manual override before auto-detection
- Auto-detection still works when --model-type is not specified

Testing:
- Tested manual override with --model-type sdxl (works)
- Tested auto-detection without parameter (still works)
- Tested with SD 1.5 model and --model-type sd1 (works)

Files changed:
- stable-diffusion.h: Added version_override field to sd_ctx_params_t
- stable-diffusion.cpp: Added version override logic and initialization
- examples/common/common.hpp: Added CLI parameter and string-to-enum conversion
Problem: SDXL models in diffusers directory format fail to load with "unknown tensor" errors
Solution: Added te. and te.1. prefixes to cond_stage_model conversion list
Testing: SDXL diffusers models now load and generate successfully

Root cause: When loading diffusers SDXL models, text_encoder uses "te." prefix
and text_encoder_2 uses "te.1." prefix. These weren't in the name conversion
prefix list, so tensors weren't being converted to checkpoint format names.

This fix enables diffusers-format SDXL models to work alongside single-file
checkpoint models without requiring format conversion.

Fixes: Models like duchaiten-pony-real-v20-sdxl in diffusers directory layout
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant