feat(BA-5419): support percentage values for rolling update max_surge and max_unavailable #10563
Open
jopemachine wants to merge 13 commits into main
jopemachine added a commit that referenced this pull request Mar 26, 2026
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Pull request overview
Adds support for percentage-based max_surge / max_unavailable for rolling updates (Kubernetes-style rounding), expanding validation/DTO/GQL surfaces and updating strategy evaluation to resolve percentages at runtime.
Changes:
- Introduce shared `validate_int_or_percent()` / `resolve_int_or_percent()` helpers and apply them to rolling update config/spec.
- Update rolling update strategy to resolve surge/unavailable against desired replicas at execution time.
- Expand DTO/GQL types and add unit tests covering percentage parsing and rounding behavior.
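To make the validation step concrete, here is a minimal sketch of what an int-or-percent validator could look like. It is not the PR's implementation: the function name `validate_int_or_percent_sketch`, the accepted syntax (non-negative whole digits followed by `%`), and the choice not to cap percentages at 100 are all assumptions made for illustration. The only behavior taken from the review summary is that percent strings are accepted and numeric strings normalize to `int`.

```python
from __future__ import annotations

import re

# Assumption: a percent value is non-negative whole digits followed by "%".
_DIGITS_PATTERN = re.compile(r"[0-9]+")
_PERCENT_PATTERN = re.compile(r"[0-9]+%")


def validate_int_or_percent_sketch(value: object) -> int | str:
    """Return a non-negative int or a normalized percent string such as "25%".

    Hypothetical helper for illustration only; not the PR's actual code.
    """
    # bool is a subclass of int, so reject it explicitly.
    if isinstance(value, bool):
        raise ValueError(f"expected int or percent string, got {value!r}")
    if isinstance(value, int):
        if value < 0:
            raise ValueError(f"value must be non-negative, got {value!r}")
        return value
    if isinstance(value, str):
        text = value.strip()
        # A purely numeric string is normalized to a plain int.
        if _DIGITS_PATTERN.fullmatch(text):
            return int(text)
        # A percent string is kept as a string, with leading zeros dropped.
        if _PERCENT_PATTERN.fullmatch(text):
            return f"{int(text[:-1])}%"
    raise ValueError(f"expected int or percent string, got {value!r}")
```

Keeping the percent as a string until execution time fits the described design, where the absolute count is only known once the desired replica count is available.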
Reviewed changes
Copilot reviewed 11 out of 11 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| tests/unit/manager/sokovan/deployment/strategy/test_rolling_update.py | Adds strategy/spec unit tests for percentage inputs, rounding, and deadlock prevention cases. |
| tests/unit/common/dto/manager/v2/deployment/test_request.py | Adds DTO validation tests ensuring percent strings are accepted/rejected as expected and numeric strings normalize to int. |
| src/ai/backend/manager/sokovan/deployment/strategy/rolling_update.py | Switches from raw spec fields to desired-based resolution helpers before computing budgets. |
| src/ai/backend/manager/models/deployment_policy/row.py | Updates RollingUpdateSpec schema to accept int-or-percent and adds resolver helpers. |
| src/ai/backend/manager/api/gql/deployment/types/policy.py | Broadens GQL fields to JSON to carry either int or percent string. |
| src/ai/backend/common/types.py | Adds shared int-or-percent validation and resolution helpers. |
| src/ai/backend/common/dto/manager/v2/deployment/types.py | Broadens response DTO fields to `int |
| src/ai/backend/common/dto/manager/v2/deployment/request.py | Broadens request DTO to accept int-or-percent with a shared validator. |
| changes/10563.feature.md | Adds changelog entry for the new percentage-based behavior. |
c874cd8 to 15ed8b7
2768df4 to ce69b1e
7ce41a4 to 2dbf998
… and max_unavailable Add support for percentage-based values (e.g., "25%") in addition to absolute integers for max_surge and max_unavailable rolling update parameters. Percentage values are resolved to absolute counts at execution time based on the desired replica count, with max_surge rounding up and max_unavailable rounding down (matching Kubernetes semantics). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
… and max_unavailable Add support for float fraction values (e.g., 0.25 for 25%) in addition to absolute integers for max_surge and max_unavailable rolling update parameters. Float values are resolved to absolute counts at execution time based on the desired replica count, with max_surge rounding up and max_unavailable rounding down (matching Kubernetes semantics). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: octodog <mu001@lablup.com>
Update test files to use IntOrPercent instead of plain int for RollingUpdateSpec max_surge/max_unavailable after type change. Also merge diverged alembic heads from main. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
4946496 to 7647381
…-check logic Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
… Field descriptions - Create IntOrPercentTypeGQL as a standalone enum.Enum class with @gql_enum decorator instead of wrapping the DTO enum via gql_enum() function call - Move JSON examples from description to examples in RollingUpdateConfigInput fields - Rename _iop helper to _int_or_percent in tests Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…ng update tests - Add docstring and use match statement in RollingUpdateSpec._resolve - Refactor TestRollingUpdateConfigInput with RollingUpdateValidScenario dataclass - Add max_unavailable invalid input test cases Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…ate in tests - Replace hardcoded added_version="26.4.0" with NEXT_RELEASE_VERSION - Use InvalidEndpointState instead of bare Exception in test assertions Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
… values - Add field_validator to coerce plain int to IntOrPercent for legacy DB rows - Fix type annotation in migration eb9441fcf90a that broke SQLAlchemy 2.0 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The legacy int values in strategy_spec won't exist in practice, so no backward compatibility shim or data migration is needed. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replace Field(default=None, description=...) with plain None default for extra_mounts in CreateRevisionInput and AddRevisionInput, and remove unused pydantic Field import. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: octodog <mu001@lablup.com>
Resolves BA-5419.
Summary
- Add support for float fraction values (e.g., `0.25` for 25%) for `max_surge` and `max_unavailable` rolling update parameters, in addition to existing absolute integer values
- Float values are resolved to absolute counts at execution time based on the desired replica count, with `max_surge` rounding up and `max_unavailable` rounding down (matching Kubernetes semantics)

Rounding semantics (follows Kubernetes convention)

| Parameter | Rounding |
|---|---|
| `max_surge` | Round up (ceil) |
| `max_unavailable` | Round down (floor) |

Example (`desired_replicas = 10`, both set to `25%`):
- `max_surge`: `ceil(10 * 0.25) = ceil(2.5) = 3` → up to 13 replicas simultaneously
- `max_unavailable`: `floor(10 * 0.25) = floor(2.5) = 2` → at least 8 replicas always available

Both directions round toward safety: surge allows more creation headroom, unavailable restricts how many can go down.
Test plan
📚 Documentation preview 📚: https://sorna--10563.org.readthedocs.build/en/10563/
📚 Documentation preview 📚: https://sorna-ko--10563.org.readthedocs.build/ko/10563/