Fix responses transformer to properly close reasoning before message content#466
Open
kwanLeeFrmVi wants to merge 3 commits intodecolua:masterfrom
Open
Conversation
…fter reasoning blocks - Auto-close reasoning block when switching from reasoning_content to regular content - Use incremented output_index (reasoningIndex + 1) for message items after reasoning - Apply msgIdx correction to both text content and tool_calls handling - Prevent output_index collision between reasoning and subsequent message items
ee9d427 to
ab03398
Compare
…_tokens - Convert OpenAI reasoning_effort (low/medium/high) to Claude thinking config - Map low→1024, medium→8192, high→32768 budget_tokens - Use DEFAULT_BUDGET_TOKENS fallback for thinking.budget_tokens when not specified - Import DEFAULT_BUDGET_TOKENS from runtimeConfig
Contributor
Author
Description for commit #972363Fix: Add mandatory
|
… budget_tokens for older models - Detect Claude 4.6 models via model name pattern (4-6 or 4.6) - Map reasoning_effort to output_config.effort + adaptive thinking for 4.6 - Keep budget_tokens mapping (low→1024, medium→8192, high→32768) for older Claude models - Default to adaptive thinking type for 4.6 when explicit thinking config provided without type - Conditionally include budget_tokens only when present in explicit thinking config
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Problem
When models output native
reasoning_content(e.g.,ollama/glm-5,ollama/kimi-k2.5), the Responses API transformer was not properly transitioning from reasoning to message content. The reasoning stream stayed "open" untilfinish_reasonarrived, causing out-of-order events that confused clients and prevented message rendering after reasoning.Root Cause
In createResponsesApiTransformStream, closeReasoning() was only called at:
</thinking>tag detection (for tag-based reasoning)finish_reasonarrivalWhen native
reasoning_contentwas followed bycontent, there was no explicit transition.Fix
Added early reasoning closure when content arrives after reasoning_content: