Skip to content

WIP: Route inline reasoning streams#3446

Open
Hmbown wants to merge 2 commits into
mainfrom
issue/3222-inline-reasoning-tags
Open

WIP: Route inline reasoning streams#3446
Hmbown wants to merge 2 commits into
mainfrom
issue/3222-inline-reasoning-tags

Conversation

@Hmbown

@Hmbown Hmbown commented Jun 23, 2026

Copy link
Copy Markdown
Owner

Summary

  • add provider-scoped reasoning_stream_style config for separate_field, inline_tags, and none stream behavior
  • route inline <think>...</think> content into Thinking blocks, including tags split across SSE chunks
  • document the provider-table key for OpenAI-compatible gateways

Tests

  • cargo fmt --all -- --check
  • cargo test -p codewhale-tui --bin codewhale-tui --locked reasoning_style
  • cargo test -p codewhale-tui --bin codewhale-tui --locked provider_config_loads_reasoning_stream_style
  • cargo test -p codewhale-tui --bin codewhale-tui --locked stream_decoder_tests
  • cargo test -p codewhale-tui --bin codewhale-tui --locked stream_reasoning (0 matching tests)
  • git diff --check

Refs #3222.

Reported with detailed route-shape examples by @buko.

@greptile-apps greptile-apps Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Your free trial has ended. If you'd like to continue receiving code reviews, you can add a payment method here.

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a new configuration option, reasoning_stream_style, which allows users to customize how streaming reasoning content is handled and separated from the main answer text. It supports separate fields, inline tags (e.g., <think>...</think>), or treating reasoning content as standard text. The changes include updates to client initialization, SSE chunk parsing, configuration loading, documentation, and comprehensive unit tests. The reviewer suggested using fully qualified paths for SystemTime and UNIX_EPOCH in the new configuration test to avoid potential compilation issues if they are not imported.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

Comment on lines +4122 to +4125
let nanos = SystemTime::now()
.duration_since(UNIX_EPOCH)
.unwrap()
.as_nanos();

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

To prevent potential compilation errors if SystemTime and UNIX_EPOCH are not imported in this test module, it is safer to use their fully qualified paths from std::time.

Suggested change
let nanos = SystemTime::now()
.duration_since(UNIX_EPOCH)
.unwrap()
.as_nanos();
let nanos = std::time::SystemTime::now()
.duration_since(std::time::UNIX_EPOCH)
.unwrap()
.as_nanos();

@greptile-apps greptile-apps Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Your free trial has ended. If you'd like to continue receiving code reviews, you can add a payment method here.

@Hmbown Hmbown marked this pull request as ready for review June 23, 2026 05:56

@greptile-apps greptile-apps Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Your free trial has ended. If you'd like to continue receiving code reviews, you can add a payment method here.

Hmbown added 2 commits June 23, 2026 01:55
Add provider-scoped reasoning_stream_style handling for chat-completion streams so OpenAI-compatible routes can choose separate_field, inline_tags, or none without a new provider enum variant.

Inline <think> blocks are streamed into Thinking content blocks, including tags split across SSE chunks, while none preserves raw visible text and existing separate-field reasoning keeps its default behavior.

Refs #3222.
Limit the legacy parse_sse_chunk wrapper to test builds so release and clippy builds use only the style-aware stream parser path.

Refs #3222.
@Hmbown Hmbown force-pushed the issue/3222-inline-reasoning-tags branch from 53ef812 to 3636629 Compare June 23, 2026 08:59
@claude

claude Bot commented Jun 23, 2026

Copy link
Copy Markdown

Claude finished @Hmbown's task in 5m 19s —— View job


PR Review: Route inline reasoning streams

  • Gather context and understand request
  • Read diff and changed files
  • Review correctness and logic
  • Check provider/model/route safety
  • Check reuse and simplification
  • Check test coverage
  • Post findings

Summary

The routing logic is sound and the state machine for split-chunk tag parsing is correct. A few issues worth addressing before this ships, in priority order:


1. Silent fallback on unrecognized config value — parse_reasoning_stream_style (chat.rs:2165–2173)

fn parse_reasoning_stream_style(value: &str) -> Option<ReasoningStreamStyle> {
    match value.trim().to_ascii_lowercase().replace('-', "_").as_str() {
        ...
        _ => None,
    }
}

reasoning_stream_style_for_stream calls this and, on None, falls through to the provider/model heuristic. A user who typos "inline_tag" (singular) gets silently ignored — their configured gateway behaviour is then driven by is_reasoning_model_for_stream, which returns SeparateField for reasoning providers and None otherwise. Neither is what they intended. A logging::warn (or even a debug log) when the configured value goes unrecognized would surface this during setup. Fix this →


2. InlineTags mode also consumes reasoning_content fields (chat.rs:2742–2751)

if reasoning_stream_style != ReasoningStreamStyle::None
    && let Some(reasoning) = reasoning_text.as_deref()
{
    push_thinking_delta(...);
}

The condition != None means InlineTags also routes a reasoning_content/reasoning delta to push_thinking_delta. For a gateway configured with inline_tags that also happens to emit reasoning_content (e.g. an OpenRouter passthrough with metadata), both paths activate: reasoning_content starts a Thinking block, then the content string with <think> tags is split by inline_reasoning_segments. The block-transition logic in push_text_delta/push_thinking_delta handles it correctly (no panic, no dropped content), but the downstream consumer sees more thinking-block starts than expected.

If the intent is "InlineTags gateways don't use the separate field", the condition should be == ReasoningStreamStyle::SeparateField. If the intent is "route both if present", that's fine but should be documented. Fix this →


3. Missing test: unclosed <think> tag at stream end (chat.rs:2891–2915)

The flush path (inline_reasoning_segments("", state, true) at finish_reason) correctly emits remaining pending content as Thinking when inside_think == true. But there is no test for this — a truncated stream ending with an open <think> and no matching </think>. Worth a dedicated #[test] that sends a single chunk with <think> content and no close tag, followed by a finish-reason chunk, asserting the partial reasoning is emitted rather than silently dropped. Fix this →


4. Module-level doc comment names a now-test-only function (chat.rs:4)

//! ... SSE parsing (`parse_sse_chunk`) all live here.

parse_sse_chunk is now #[cfg(test)]; the production entry point is parse_sse_chunk_with_reasoning_style. The doc comment should be updated to avoid confusion. Fix this →


False alarm from Gemini

The inline comment suggestion on config/tests.rs:4241 (use std::time::SystemTime::now() fully qualified) is not needed — SystemTime and UNIX_EPOCH are already imported at config/tests.rs:8 via use std::time::{SystemTime, UNIX_EPOCH}.


What's good

  • trailing_tag_prefix_len — the is_char_boundary guard handles UTF-8 content (e.g. CJK text before a tag) correctly; the O(tag.len()) scan is fine for 7–9 char tags.
  • push_reasoning_segment empty filter — prevents spurious ContentBlockStart events when a split produces an empty leading/trailing segment.
  • #[cfg(test)] shim on parse_sse_chunk — clean pattern; the old signature is a thin wrapper around parse_sse_chunk_with_reasoning_style and gated so it doesn't pollute production exports. All callers in client.rs are inside #[cfg(test)] mod tests, so nothing breaks.
  • Config merge at merge_provider_config — the new field is included correctly with or semantics (override wins, then base).
  • reasoning_stream_style_for_stream override priority — explicit config beats model/provider heuristic, which is the right default for OpenAI-compatible gateways.
  • Block transition logic in push_text_delta/push_thinking_delta — correctly stops the opposite block and increments content_index before opening the new one; no index bookkeeping bugs found.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant