WIP: Route inline reasoning streams by Hmbown · Pull Request #3446 · Hmbown/CodeWhale

Hmbown · 2026-06-23T04:26:36Z

Summary

add provider-scoped reasoning_stream_style config for separate_field, inline_tags, and none stream behavior
route inline <think>...</think> content into Thinking blocks, including tags split across SSE chunks
document the provider-table key for OpenAI-compatible gateways

Tests

cargo fmt --all -- --check
cargo test -p codewhale-tui --bin codewhale-tui --locked reasoning_style
cargo test -p codewhale-tui --bin codewhale-tui --locked provider_config_loads_reasoning_stream_style
cargo test -p codewhale-tui --bin codewhale-tui --locked stream_decoder_tests
cargo test -p codewhale-tui --bin codewhale-tui --locked stream_reasoning (0 matching tests)
git diff --check

Refs #3222.

Reported with detailed route-shape examples by @buko.

greptile-apps

Your free trial has ended. If you'd like to continue receiving code reviews, you can add a payment method here.

gemini-code-assist

Code Review

This pull request introduces a new configuration option, reasoning_stream_style, which allows users to customize how streaming reasoning content is handled and separated from the main answer text. It supports separate fields, inline tags (e.g., <think>...</think>), or treating reasoning content as standard text. The changes include updates to client initialization, SSE chunk parsing, configuration loading, documentation, and comprehensive unit tests. The reviewer suggested using fully qualified paths for SystemTime and UNIX_EPOCH in the new configuration test to avoid potential compilation issues if they are not imported.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

gemini-code-assist · 2026-06-23T04:27:44Z

+    let nanos = SystemTime::now()
+        .duration_since(UNIX_EPOCH)
+        .unwrap()
+        .as_nanos();


To prevent potential compilation errors if SystemTime and UNIX_EPOCH are not imported in this test module, it is safer to use their fully qualified paths from std::time.

Suggested change

let nanos = SystemTime::now()

.duration_since(UNIX_EPOCH)

.unwrap()

.as_nanos();

let nanos = std::time::SystemTime::now()

.duration_since(std::time::UNIX_EPOCH)

.unwrap()

.as_nanos();

greptile-apps

Your free trial has ended. If you'd like to continue receiving code reviews, you can add a payment method here.

greptile-apps

Your free trial has ended. If you'd like to continue receiving code reviews, you can add a payment method here.

Add provider-scoped reasoning_stream_style handling for chat-completion streams so OpenAI-compatible routes can choose separate_field, inline_tags, or none without a new provider enum variant. Inline <think> blocks are streamed into Thinking content blocks, including tags split across SSE chunks, while none preserves raw visible text and existing separate-field reasoning keeps its default behavior. Refs #3222.

Limit the legacy parse_sse_chunk wrapper to test builds so release and clippy builds use only the style-aware stream parser path. Refs #3222.

claude · 2026-06-23T09:02:23Z

Claude finished @Hmbown's task in 5m 19s —— View job

PR Review: Route inline reasoning streams

Summary

The routing logic is sound and the state machine for split-chunk tag parsing is correct. A few issues worth addressing before this ships, in priority order:

1. Silent fallback on unrecognized config value — `parse_reasoning_stream_style` (`chat.rs:2165–2173`)

fn parse_reasoning_stream_style(value: &str) -> Option<ReasoningStreamStyle> {
    match value.trim().to_ascii_lowercase().replace('-', "_").as_str() {
        ...
        _ => None,
    }
}

reasoning_stream_style_for_stream calls this and, on None, falls through to the provider/model heuristic. A user who typos "inline_tag" (singular) gets silently ignored — their configured gateway behaviour is then driven by is_reasoning_model_for_stream, which returns SeparateField for reasoning providers and None otherwise. Neither is what they intended. A logging::warn (or even a debug log) when the configured value goes unrecognized would surface this during setup. Fix this →

2. `InlineTags` mode also consumes `reasoning_content` fields (`chat.rs:2742–2751`)

if reasoning_stream_style != ReasoningStreamStyle::None
    && let Some(reasoning) = reasoning_text.as_deref()
{
    push_thinking_delta(...);
}

The condition != None means InlineTags also routes a reasoning_content/reasoning delta to push_thinking_delta. For a gateway configured with inline_tags that also happens to emit reasoning_content (e.g. an OpenRouter passthrough with metadata), both paths activate: reasoning_content starts a Thinking block, then the content string with <think> tags is split by inline_reasoning_segments. The block-transition logic in push_text_delta/push_thinking_delta handles it correctly (no panic, no dropped content), but the downstream consumer sees more thinking-block starts than expected.

If the intent is "InlineTags gateways don't use the separate field", the condition should be == ReasoningStreamStyle::SeparateField. If the intent is "route both if present", that's fine but should be documented. Fix this →

3. Missing test: unclosed `<think>` tag at stream end (`chat.rs:2891–2915`)

The flush path (inline_reasoning_segments("", state, true) at finish_reason) correctly emits remaining pending content as Thinking when inside_think == true. But there is no test for this — a truncated stream ending with an open <think> and no matching </think>. Worth a dedicated #[test] that sends a single chunk with <think> content and no close tag, followed by a finish-reason chunk, asserting the partial reasoning is emitted rather than silently dropped. Fix this →

4. Module-level doc comment names a now-test-only function (`chat.rs:4`)

//! ... SSE parsing (`parse_sse_chunk`) all live here.

parse_sse_chunk is now #[cfg(test)]; the production entry point is parse_sse_chunk_with_reasoning_style. The doc comment should be updated to avoid confusion. Fix this →

False alarm from Gemini

The inline comment suggestion on config/tests.rs:4241 (use std::time::SystemTime::now() fully qualified) is not needed — SystemTime and UNIX_EPOCH are already imported at config/tests.rs:8 via use std::time::{SystemTime, UNIX_EPOCH}.

What's good

trailing_tag_prefix_len — the is_char_boundary guard handles UTF-8 content (e.g. CJK text before a tag) correctly; the O(tag.len()) scan is fine for 7–9 char tags.
push_reasoning_segment empty filter — prevents spurious ContentBlockStart events when a split produces an empty leading/trailing segment.
#[cfg(test)] shim on parse_sse_chunk — clean pattern; the old signature is a thin wrapper around parse_sse_chunk_with_reasoning_style and gated so it doesn't pollute production exports. All callers in client.rs are inside #[cfg(test)] mod tests, so nothing breaks.
Config merge at merge_provider_config — the new field is included correctly with or semantics (override wins, then base).
reasoning_stream_style_for_stream override priority — explicit config beats model/provider heuristic, which is the right default for OpenAI-compatible gateways.
Block transition logic in push_text_delta/push_thinking_delta — correctly stops the opposite block and increments content_index before opening the new one; no index bookkeeping bugs found.

greptile-apps Bot reviewed Jun 23, 2026

View reviewed changes

gemini-code-assist Bot reviewed Jun 23, 2026

View reviewed changes

greptile-apps Bot reviewed Jun 23, 2026

View reviewed changes

Hmbown marked this pull request as ready for review June 23, 2026 05:56

greptile-apps Bot reviewed Jun 23, 2026

View reviewed changes

Hmbown added 2 commits June 23, 2026 01:55

Fix inline reasoning stream lint

3636629

Limit the legacy parse_sse_chunk wrapper to test builds so release and clippy builds use only the style-aware stream parser path. Refs #3222.

Hmbown force-pushed the issue/3222-inline-reasoning-tags branch from 53ef812 to 3636629 Compare June 23, 2026 08:59

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

WIP: Route inline reasoning streams#3446

WIP: Route inline reasoning streams#3446
Hmbown wants to merge 2 commits into
mainfrom
issue/3222-inline-reasoning-tags

Hmbown commented Jun 23, 2026

Uh oh!

greptile-apps Bot left a comment

Uh oh!

gemini-code-assist Bot left a comment

Uh oh!

gemini-code-assist Bot Jun 23, 2026

Uh oh!

greptile-apps Bot left a comment

Uh oh!

greptile-apps Bot left a comment

Uh oh!

claude Bot commented Jun 23, 2026 •

edited

Loading

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

Hmbown commented Jun 23, 2026

Summary

Tests

Uh oh!

greptile-apps Bot left a comment

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

gemini-code-assist Bot Jun 23, 2026

Choose a reason for hiding this comment

Uh oh!

greptile-apps Bot left a comment

Choose a reason for hiding this comment

Uh oh!

greptile-apps Bot left a comment

Choose a reason for hiding this comment

Uh oh!

claude Bot commented Jun 23, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

PR Review: Route inline reasoning streams

Summary

1. Silent fallback on unrecognized config value — parse_reasoning_stream_style (chat.rs:2165–2173)

2. InlineTags mode also consumes reasoning_content fields (chat.rs:2742–2751)

3. Missing test: unclosed <think> tag at stream end (chat.rs:2891–2915)

4. Module-level doc comment names a now-test-only function (chat.rs:4)

False alarm from Gemini

What's good

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

claude Bot commented Jun 23, 2026 •

edited

Loading

1. Silent fallback on unrecognized config value — `parse_reasoning_stream_style` (`chat.rs:2165–2173`)

2. `InlineTags` mode also consumes `reasoning_content` fields (`chat.rs:2742–2751`)

3. Missing test: unclosed `<think>` tag at stream end (`chat.rs:2891–2915`)

4. Module-level doc comment names a now-test-only function (`chat.rs:4`)