Add handle_generation_config function to manage model generation_config saving failure #1448
Conversation
Pull request overview
Adds a helper that adjusts `generation_config.do_sample` based on sampling-related parameters, to avoid failures when saving models with non-default generation settings. (A minimal repro of the underlying failure is sketched after the change list.)
Changes:
- Invoke a new `handle_generation_config()` during both the LLM and MLLM model load flows.
- Add `handle_generation_config()` to set `do_sample=True` when `top_p`, `top_k`, or `temperature` indicates sampling.
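For context, a minimal sketch of the failure this helper works around, assuming the behavior of recent transformers releases where `GenerationConfig.save_pretrained` promotes validation warnings to errors (version-dependent; the exact message varies):

```python
# Minimal repro sketch. Assumption: recent transformers versions refuse to
# save a GenerationConfig whose validation emits warnings.
from transformers import GenerationConfig

cfg = GenerationConfig(do_sample=False, temperature=0.6)

# Raises ValueError: temperature != 1.0 is only meaningful with
# do_sample=True. Flipping do_sample, as this PR's helper does, makes the
# config self-consistent so the save succeeds.
cfg.save_pretrained("gen_cfg_out")
```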
```python
model = model.eval()
check_and_mark_quantized_module(model)
handle_generation_config(model)
```
Copilot AI · Feb 12, 2026
This call mutates model.generation_config during load, which can change downstream generation behavior (enabling sampling) even if the caller did not intend behavior changes at load time. Since the PR goal is to address generation_config saving failures, consider moving this normalization to the save/export path (or applying it only immediately before serialization) to avoid surprising side effects during loading.
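A sketch of what moving the normalization to the save path could look like (the wrapper function below is hypothetical, not this repo's actual API; only `handle_generation_config` comes from the PR):

```python
# Hypothetical save-path placement: normalize immediately before
# serialization so load-time generation behavior is left untouched.
def save_model(model, output_dir: str) -> None:
    handle_generation_config(model)  # normalize just before saving
    model.save_pretrained(output_dir)
```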
```python
model = model.eval()
check_and_mark_quantized_module(model)
handle_generation_config(model)
```
Copilot AI · Feb 12, 2026
Same concern as in llm_load_model: mutating generation settings during model load can unexpectedly change runtime generation behavior. If the intent is specifically to avoid GenerationConfig validation errors on save, prefer applying this right before saving rather than at load time.
```python
if hasattr(generation_config, "top_k") and generation_config.top_k != 0:
    model.generation_config.do_sample = True
if hasattr(generation_config, "temperature") and generation_config.temperature != 1.0:
    model.generation_config.do_sample = True
```
Copilot AI · Feb 12, 2026
If the intent is to prevent GenerationConfig validation/saving failures caused by inconsistent sampling settings, this handling looks incomplete: Transformers' validation can also consider other sampling-related fields (e.g., typical_p, min_p, epsilon_cutoff, eta_cutoff, etc.). With the current implementation, save/validate can still fail when those are set away from defaults while do_sample remains False. Consider expanding the normalization condition to cover all sampling parameters that require do_sample=True.
Suggested change (keeping the existing checks and appending the additional ones):
```python
if hasattr(generation_config, "top_k") and generation_config.top_k != 0:
    model.generation_config.do_sample = True
if hasattr(generation_config, "temperature") and generation_config.temperature != 1.0:
    model.generation_config.do_sample = True
# Additional sampling-related parameters that also imply do_sample=True
if hasattr(generation_config, "typical_p") and generation_config.typical_p is not None:
    model.generation_config.do_sample = True
if hasattr(generation_config, "min_p") and generation_config.min_p is not None:
    model.generation_config.do_sample = True
if hasattr(generation_config, "epsilon_cutoff") and generation_config.epsilon_cutoff is not None:
    model.generation_config.do_sample = True
if hasattr(generation_config, "eta_cutoff") and generation_config.eta_cutoff is not None:
    model.generation_config.do_sample = True
```
```python
if hasattr(generation_config, "top_p") and generation_config.top_p != 1.0:
    model.generation_config.do_sample = True
if hasattr(generation_config, "top_k") and generation_config.top_k != 0:
    model.generation_config.do_sample = True
if hasattr(generation_config, "temperature") and generation_config.temperature != 1.0:
    model.generation_config.do_sample = True
```
Copilot AI · Feb 12, 2026
This repeats both the attribute checks and the assignment to model.generation_config.do_sample. Since generation_config is already a local variable, it would be clearer to set generation_config.do_sample once based on a combined condition (e.g., compute a boolean like needs_sampling = ... and then assign once). This reduces duplication and makes it easier to extend the list of parameters consistently.
Suggested change (replacing the three checks above):
```python
needs_sampling = (
    (hasattr(generation_config, "top_p") and generation_config.top_p != 1.0)
    or (hasattr(generation_config, "top_k") and generation_config.top_k != 0)
    or (
        hasattr(generation_config, "temperature")
        and generation_config.temperature != 1.0
    )
)
if needs_sampling:
    generation_config.do_sample = True
```
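A brief aside on why assigning through the local works, assuming the helper binds it from the model (the binding itself is not shown in this diff):

```python
from types import SimpleNamespace

# Stand-in objects purely to illustrate the aliasing (not the repo's code).
model = SimpleNamespace(generation_config=SimpleNamespace(do_sample=False))

# Assumption: handle_generation_config binds a local alias like this.
generation_config = model.generation_config
generation_config.do_sample = True  # mutates the shared object

assert model.generation_config.do_sample is True
```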
```python
model = model.eval()
check_and_mark_quantized_module(model)
handle_generation_config(model)
```
Add a TODO with a link to huggingface/transformers#43937. Once Transformers ships a fix, we can remove the workaround.
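For instance (placement illustrative):

```python
# TODO: workaround for huggingface/transformers#43937; remove once the
# upstream fix lands in a released transformers version.
handle_generation_config(model)
```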
Description
huggingface/transformers#43937
Type of Change
Related Issues
Fixes or relates to #
Checklist Before Submitting