feat(protocol): add thinking param to Chat API and support bare string image_url#1098

Open
ConnorLi96 wants to merge 1 commit into main from connorli/thinking-and-image-url

Conversation

@ConnorLi96
Collaborator

Problem

  1. No way to control extended thinking via Chat Completions API (only Messages API has thinking field).
  2. Some clients send "image_url": "https://..." (bare string) instead of "image_url": {"url": "https://..."}, causing deserialization errors.
  3. Tiktoken partial UTF-8 decode logs at WARN level, flooding production logs during CJK/emoji streaming.

Solution

  1. Add thinking: Option<ThinkingConfig> to ChatCompletionRequest, reusing the existing ThinkingConfig from the Messages API. Normalized into chat_template_kwargs with both enable_thinking (Qwen3) and thinking (Kimi-K2.5) keys. Uses or_insert so user-provided kwargs are not overridden.
  2. Custom Deserialize on ImageUrl to accept both bare string and object forms.
  3. Change tiktoken partial UTF-8 log from warn! to debug!.

Behavior change note: ImageUrl now uses a custom deserializer. Serialization is unchanged (#[derive(Serialize)]).
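The normalization described in point 1 can be sketched with the standard library alone. This is only an illustration: the real code stores serde_json::Value in chat_template_kwargs (plain bool is used here for brevity), and the ThinkingConfig shape is assumed from the diff.

```rust
use std::collections::HashMap;

// Assumed mirror of the Messages-API config; field names taken from the diff.
#[allow(dead_code)]
enum ThinkingConfig {
    Enabled { budget_tokens: u32 },
    Disabled,
}

// Normalize `thinking` into template kwargs. Both keys are set because Qwen3
// templates read `enable_thinking` while Kimi-K2.5 reads `thinking`.
// `or_insert` leaves any user-provided value in place.
fn normalize(thinking: &ThinkingConfig, kwargs: &mut HashMap<String, bool>) {
    let flag = matches!(thinking, ThinkingConfig::Enabled { .. });
    kwargs.entry("enable_thinking".to_string()).or_insert(flag);
    kwargs.entry("thinking".to_string()).or_insert(flag);
}

fn main() {
    // User already set enable_thinking; or_insert must not override it.
    let mut kwargs = HashMap::from([("enable_thinking".to_string(), false)]);
    normalize(&ThinkingConfig::Enabled { budget_tokens: 1024 }, &mut kwargs);
    assert_eq!(kwargs["enable_thinking"], false); // user value preserved
    assert_eq!(kwargs["thinking"], true);         // missing key filled in

    // Disabled fills both keys with false when they are absent.
    let mut empty = HashMap::new();
    normalize(&ThinkingConfig::Disabled, &mut empty);
    assert_eq!(empty["enable_thinking"], false);
    println!("{kwargs:?}");
}
```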

Changes

  • crates/protocols/src/chat.rs: add thinking field + normalization + unit tests
  • crates/protocols/src/common.rs: custom ImageUrl deserializer + unit tests (bare string, object, missing url, roundtrip, skip_none)
  • crates/tokenizer/src/tiktoken.rs: log level warn -> debug

Test Plan

  • 7 new unit tests for ImageUrl deserialization + serialization
  • 4 new unit tests for thinking normalization (enabled, disabled, no-override, absent)
  • All 63 protocol tests pass
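The acceptance rule those ImageUrl tests exercise can be sketched without serde. The actual change is a custom serde::Deserialize impl; the enum below is only a stand-in for the two wire forms it accepts.

```rust
// The two JSON shapes a client may send for `image_url`.
#[derive(Debug, PartialEq)]
enum ImageUrlInput {
    Bare(String),                                   // "image_url": "https://..."
    Object { url: String, detail: Option<String> }, // "image_url": {"url": "..."}
}

#[derive(Debug, PartialEq)]
struct ImageUrl {
    url: String,
    detail: Option<String>,
}

// Both wire forms normalize to the same struct; the bare form has no detail.
fn from_input(input: ImageUrlInput) -> ImageUrl {
    match input {
        ImageUrlInput::Bare(url) => ImageUrl { url, detail: None },
        ImageUrlInput::Object { url, detail } => ImageUrl { url, detail },
    }
}

fn main() {
    let a = from_input(ImageUrlInput::Bare("https://x/cat.png".into()));
    let b = from_input(ImageUrlInput::Object {
        url: "https://x/cat.png".into(),
        detail: None,
    });
    assert_eq!(a, b); // bare string and object forms agree
    println!("{a:?}");
}
```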

Checklist:

  • cargo +nightly fmt passes
  • cargo clippy --all-targets --all-features -- -D warnings passes
  • cargo test -p openai-protocol passes (63 tests)

Made with Cursor

@coderabbitai

coderabbitai bot commented Apr 11, 2026

Warning

Rate limit exceeded

@ConnorLi96 has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 16 minutes and 50 seconds before requesting another review.


⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.


ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 0d342ad2-0d7e-47ad-8f28-bd2b0faa7a66

📥 Commits

Reviewing files that changed from the base of the PR and between 381c5f6 and cf003e3.

📒 Files selected for processing (3)
  • crates/protocols/src/chat.rs
  • crates/protocols/src/common.rs
  • crates/tokenizer/src/tiktoken.rs


@github-actions bot added labels tokenizer (Tokenizer related changes) and protocols (Protocols crate changes) on Apr 11, 2026
…g image_url

1. Add Anthropic-style `thinking` field to ChatCompletionRequest.
   Normalized to chat_template_kwargs with both `thinking` and
   `enable_thinking` keys for Kimi-K2.5 and Qwen3 template compat.

2. Support bare string image_url: accept both `"image_url": "url"`
   and `"image_url": {"url": "..."}` via custom serde Deserialize.

3. Reduce tiktoken partial UTF-8 decode log from warn to debug
   (expected during streaming of CJK/emoji, not an error).

Includes unit tests for ImageUrl deserialization (bare string, object,
missing url, roundtrip) and thinking normalization.

Signed-off-by: ConnorLi96 <ConnorLi96@users.noreply.github.com>
Made-with: Cursor
Comment on lines +577 to +584
ThinkingConfig::Enabled { .. } => {
kwargs
.entry("enable_thinking".to_string())
.or_insert(Value::Bool(true));
kwargs
.entry("thinking".to_string())
.or_insert(Value::Bool(true));
}

🟡 Nit: budget_tokens is silently discarded here — the Enabled { .. } pattern ignores it, and only a boolean true is forwarded to the template kwargs. Since budget_tokens is a required field on ThinkingConfig::Enabled, every caller must provide it, yet it has no effect on this code path. This could confuse users who expect their token budget to be respected.

Consider either:

  • Adding a tracing::debug! noting the value is ignored (cheap, aids debugging), or
  • Forwarding it as an additional kwarg (e.g., budget_tokens) so future templates can opt in to it.
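A minimal sketch of the second option, assuming a serde_json::Value-like type. The Value enum here is a hand-rolled stand-in, and budget_tokens as a kwarg key is hypothetical until some template actually consumes it.

```rust
use std::collections::HashMap;

// Hand-rolled stand-in for serde_json::Value, for illustration only.
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Bool(bool),
    Num(u32),
}

// Forward budget_tokens as its own kwarg instead of silently dropping it,
// so templates that understand it can opt in. `or_insert` still defers to
// any user-provided value.
fn forward_enabled(budget_tokens: u32, kwargs: &mut HashMap<String, Value>) {
    kwargs.entry("enable_thinking".into()).or_insert(Value::Bool(true));
    kwargs.entry("thinking".into()).or_insert(Value::Bool(true));
    kwargs.entry("budget_tokens".into()).or_insert(Value::Num(budget_tokens));
}

fn main() {
    let mut kwargs = HashMap::new();
    forward_enabled(2048, &mut kwargs);
    assert_eq!(kwargs["budget_tokens"], Value::Num(2048));
    assert_eq!(kwargs["thinking"], Value::Bool(true));
    println!("{kwargs:?}");
}
```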

Contributor

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request introduces support for extended thinking configurations in ChatCompletionRequest by mapping the thinking field to chat_template_kwargs. It also enhances ImageUrl to support deserialization from both string and object formats and downgrades the logging level for tiktoken decode failures. Feedback was provided to ensure the thinking field takes precedence during normalization and to update the JsonSchema for ImageUrl to match its new deserialization capabilities.

Comment on lines +574 to +594
if let Some(ref thinking) = self.thinking {
let kwargs = self.chat_template_kwargs.get_or_insert_with(HashMap::new);
match thinking {
ThinkingConfig::Enabled { .. } => {
kwargs
.entry("enable_thinking".to_string())
.or_insert(Value::Bool(true));
kwargs
.entry("thinking".to_string())
.or_insert(Value::Bool(true));
}
ThinkingConfig::Disabled => {
kwargs
.entry("enable_thinking".to_string())
.or_insert(Value::Bool(false));
kwargs
.entry("thinking".to_string())
.or_insert(Value::Bool(false));
}
}
}
Contributor


high

Using or_insert here creates a logic issue where the explicit thinking field is ignored if the corresponding keys already exist in chat_template_kwargs. For example, if a user sets thinking: Disabled but chat_template_kwargs contains "enable_thinking": true, the model will remain in thinking mode. Since thinking is a first-class field in the request, it should take precedence over the generic chat_template_kwargs during normalization to ensure alignment with the external specification. Additionally, the budget_tokens value from the Enabled variant is currently dropped; consider propagating it to the kwargs as well if the underlying templates support it.

        if let Some(ref thinking) = self.thinking {
            let kwargs = self.chat_template_kwargs.get_or_insert_with(HashMap::new);
            match thinking {
                ThinkingConfig::Enabled { .. } => {
                    kwargs.insert("enable_thinking".to_string(), Value::Bool(true));
                    kwargs.insert("thinking".to_string(), Value::Bool(true));
                }
                ThinkingConfig::Disabled => {
                    kwargs.insert("enable_thinking".to_string(), Value::Bool(false));
                    kwargs.insert("thinking".to_string(), Value::Bool(false));
                }
            }
        }
References
  1. For protocol data structures that mirror an external API (e.g., OpenAI), prioritize alignment with the external specification over internal consistency.

Comment on lines +198 to +203
#[derive(Debug, Clone, Serialize, PartialEq, schemars::JsonSchema)]
pub struct ImageUrl {
pub url: String,
#[serde(skip_serializing_if = "Option::is_none")]
pub detail: Option<String>, // "auto", "low", or "high"
pub detail: Option<String>,
}
Contributor


medium

The ImageUrl struct now supports deserialization from a bare string via a custom Deserialize implementation, but the derived JsonSchema does not reflect this change. This will result in generated API documentation and client libraries that only allow the object form. To maintain alignment with external specifications that allow both forms, you should implement schemars::JsonSchema manually for ImageUrl to specify that it can be either a string or an object (e.g., using any_of).

References
  1. For protocol data structures that mirror an external API (e.g., OpenAI), prioritize alignment with the external specification over internal consistency.
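For concreteness, the schema shape a manual impl would need to advertise can be sketched without the schemars API at all. The JSON below is only the target output; the schemars plumbing to produce it is omitted, and the property set is assumed from the struct definition above.

```rust
// Stdlib sketch of the JSON Schema a manual JsonSchema impl for ImageUrl
// would emit: an `anyOf` covering both the bare-string and object forms.
fn image_url_schema() -> String {
    r#"{
  "anyOf": [
    { "type": "string" },
    {
      "type": "object",
      "properties": {
        "url": { "type": "string" },
        "detail": { "type": "string" }
      },
      "required": ["url"]
    }
  ]
}"#
    .to_string()
}

fn main() {
    let schema = image_url_schema();
    assert!(schema.contains("anyOf"));        // both wire forms advertised
    assert!(schema.contains("\"required\"")); // object form still demands url
    println!("{schema}");
}
```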

}

#[derive(Debug, Clone, Deserialize, Serialize, PartialEq, schemars::JsonSchema)]
#[derive(Debug, Clone, Serialize, PartialEq, schemars::JsonSchema)]

🟡 Nit: The schemars::JsonSchema derive still generates a schema for the struct form only (object with url + optional detail). Now that the custom Deserialize impl also accepts a bare string, the generated JSON Schema / OpenAPI docs won't reflect that. Clients doing client-side validation against the schema would reject the bare-string form that the server actually accepts.

If the schema is exposed in API docs, consider a manual JsonSchema impl (or #[schemars(with = ...)]) that emits a oneOf of string and object.


@claude claude bot left a comment


Clean PR with good test coverage. Two minor nits posted inline — neither is blocking.

Summary: 🟡 Nit x2, 🔴 Important x0

  • budget_tokens silently discarded during thinking normalization — worth a debug log or forwarding as a kwarg
  • JsonSchema derive on ImageUrl doesn't reflect the new bare-string acceptance


@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: cf003e3994


// Migrate thinking → chat_template_kwargs.
// Sets both keys: Qwen3 templates check `enable_thinking`, Kimi-K2.5 checks `thinking`.
// Uses or_insert to not override user-provided kwargs.
if let Some(ref thinking) = self.thinking {

P1 Badge Drop migrated thinking field after populating template kwargs

In the OpenAI HTTP routing path I checked (model_gateway/src/routers/openai/chat.rs:84-106), the request is serialized from ChatCompletionRequest and then only fields in SGLANG_FIELDS are stripped (model_gateway/src/routers/openai/provider.rs:11-35), which does not include thinking. Because this normalization block migrates thinking into chat_template_kwargs but leaves self.thinking set, any client that now sends thinking will forward an extra unsupported parameter to external OpenAI-compatible backends (a regression from previous behavior where unknown thinking was dropped during deserialization), causing 400-style upstream rejections.
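One way to address this, sketched on a reduced request type with field types simplified to bool for illustration: move the value out with Option::take during normalization, so the field is None (and serializes as absent) by the time the request is forwarded upstream.

```rust
use std::collections::HashMap;

// Hypothetical reduced request; the real struct has many more fields and
// uses Option<ThinkingConfig> rather than Option<bool>.
#[derive(Default)]
struct ChatCompletionRequest {
    thinking: Option<bool>,
    chat_template_kwargs: Option<HashMap<String, bool>>,
}

impl ChatCompletionRequest {
    // `take()` both reads and clears the field, so later serialization of
    // the request no longer carries an unsupported `thinking` parameter.
    fn normalize(&mut self) {
        if let Some(flag) = self.thinking.take() {
            self.chat_template_kwargs
                .get_or_insert_with(HashMap::new)
                .entry("enable_thinking".to_string())
                .or_insert(flag);
        }
    }
}

fn main() {
    let mut req = ChatCompletionRequest {
        thinking: Some(true),
        ..Default::default()
    };
    req.normalize();
    assert!(req.thinking.is_none()); // won't reach OpenAI-compatible backends
    assert_eq!(req.chat_template_kwargs.unwrap()["enable_thinking"], true);
}
```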

