cross provider handoff support#47
Conversation
seems like open ai streams dont have a last item all the time so if we ever saw a tool_use we map it.
i removed the error of unsupported provider since nothing relies on it and i wanted the tests to pass. all of this will be refactored in a moment anyways
There was a problem hiding this comment.
Reviewed Changes
Managerbot reviewed 15 out of 20 changed files in this pull request and generated 12 comments.
Files not reviewed (5)
- test/fixtures/handoff_live_fixture.json
- test/fixtures/handoff_media_live_fixture.json
- scripts/generate_handoff_live_fixture.rb
- scripts/generate_handoff_media_fixture.rb
- test/utils/readfile_tool_helper.rb
Comments suppressed due to low confidence (2)
-
lib/llm_gateway/client.rb:65
- (are_you_sure) Falling back to `client.class.name.downcase` instead of raising an error silently accepts unknown providers. This could cause subtle bugs where sanitization comparisons silently pass or fail with unexpected provider strings (e.g. `"llmgateway::clients::someclient"` instead of a short id like `"openai"`). Consider raising for truly unknown types or maintaining an explicit allowlist. test/unit/input_message_sanitizer_test.rb:33
- (test_completeness) The test uses `{ type: "reasoning", summary: [ { text: "second thought" } ] }` but `extract_reasoning_text` looks for `item[:text]`, `item[:summary_text]`, or `item[:reasoning]` inside a summary array. The hash `{ text: "second thought" }` uses symbol key `:text` which matches `item[:text]`. This test is valid, but there is no test covering the string-keyed variant `{ "text" => "second thought" }` to ensure cross-key compatibility.
| end | ||
| end | ||
|
|
||
| message.merge(content: sanitized_content) |
There was a problem hiding this comment.
message.merge(content: sanitized_content) always uses symbol key :content, but the original message might have used string key "content". This could result in both "content" and :content keys existing in the merged hash, which may cause downstream issues.
| return block if same_model_replay | ||
|
|
||
| text = extract_reasoning_text(block) | ||
| return nil if text.nil? || text.strip.empty? |
There was a problem hiding this comment.
When text is non-empty, the block is converted to { type: "text", text: text } with only symbol keys. However if the original block used string keys elsewhere, this inconsistency could cause issues in downstream mappers that expect a uniform key type.
| summary = block[:summary] | ||
| if summary.is_a?(Array) | ||
| text = summary.filter_map do |item| | ||
| next item if item.is_a?(String) |
There was a problem hiding this comment.
next item if item.is_a?(String) — a plain string item in the summary array would be used as raw text, which seems unintentional given the rest of the logic expects hashes with text fields. Consider removing or clarifying this branch.
| return messages unless input_sanitizer | ||
|
|
||
| target_provider = LlmGateway::Client.provider_id_from_client(client) | ||
| target_api = stream_api_name |
There was a problem hiding this comment.
stream_api_name is used for both chat and stream methods, but the name implies it is stream-specific. When called from chat, returning a stream-oriented API name could lead to incorrect same_model_replay? comparisons. Consider using a more general method or accepting the API name as a parameter.
| id_map = {} | ||
|
|
||
| messages.map do |message| | ||
| next message unless message.is_a?(Hash) && message[:content].is_a?(Array) |
There was a problem hiding this comment.
message[:content].is_a?(Array) — after super (the base sanitizer) processes the message, the content could now contain symbol-keyed blocks. However the check next message unless message.is_a?(Hash) && message[:content].is_a?(Array) skips messages whose content is a string. Tool messages in OpenAI's chat completions format can have string content; if a message was not transformed by the base sanitizer, its content key might still be a string, causing the normalization to be silently skipped.
| original_id = block[:id] || block["id"] | ||
| normalized_id = normalize_tool_call_id(original_id, target_provider: target_provider) | ||
| id_map[original_id] = normalized_id if original_id && normalized_id | ||
| block.merge(id: normalized_id) |
There was a problem hiding this comment.
block.merge(id: normalized_id) always uses a symbol key :id, which could conflict with an existing string key "id" if the block came in with string keys, resulting in duplicate keys with different types.
| block.merge(id: normalized_id) | ||
| when "tool_result" | ||
| original_tool_use_id = block[:tool_use_id] || block["tool_use_id"] | ||
| normalized_tool_use_id = id_map[original_tool_use_id] || normalize_tool_call_id(original_tool_use_id, target_provider: target_provider) |
There was a problem hiding this comment.
For tool_result blocks, the fallback normalize_tool_call_id(original_tool_use_id, target_provider: target_provider) is called when the id is not found in id_map. This means if the corresponding tool_use block was in a different message that was skipped (e.g. non-Array content), the id could be incorrectly normalized independently without the mapping context, leading to a mismatch.
| skip "Missing fixture at #{FIXTURE_PATH}. Run: ruby scripts/generate_handoff_live_fixture.rb" unless File.exist?(FIXTURE_PATH) | ||
|
|
||
| base_transcript = symbolize(JSON.parse(File.read(FIXTURE_PATH))) | ||
| transcript = Marshal.load(Marshal.dump(base_transcript)) |
There was a problem hiding this comment.
Marshal.load(Marshal.dump(base_transcript)) is used for deep cloning. This is fine functionally, but Marshal can fail on objects that don't support serialization. Consider using a pure Ruby deep-dup approach (e.g. JSON.parse(base_transcript.to_json)) for safety with plain Hash/Array structures.
| skip "Missing fixture at #{FIXTURE_PATH}. Run: ruby scripts/generate_handoff_media_fixture.rb" unless File.exist?(FIXTURE_PATH) | ||
|
|
||
| base_transcript = symbolize(JSON.parse(File.read(FIXTURE_PATH))) | ||
| transcript = Marshal.load(Marshal.dump(base_transcript)) |
There was a problem hiding this comment.
Same Marshal.load(Marshal.dump(...)) concern as in handoff_live_test.rb. Prefer a JSON round-trip for deep cloning plain data structures.
| last_item = output.last || {} | ||
|
|
||
| last_item[:type] == "function_call" ? "tool_use" : "stop" | ||
| tool_state.any? || last_item[:type] == "function_call" ? "tool_use" : "stop" |
There was a problem hiding this comment.
tool_state.any? || last_item[:type] == "function_call" — if tool_state is truthy but the actual response did not involve tool calls (e.g. stale state from a previous turn), this could incorrectly return "tool_use" as the stop reason. Ensure tool_state is properly reset between turns.
1) Every adapter can now sanitize input messages before mapping
In lib/llm_gateway/adapters/adapter.rb:
So cross-provider handoff happens at request time, right before provider-specific input mapping.
2) Generic sanitizer handles reasoning/thinking portability
New file: lib/llm_gateway/adapters/input_message_sanitizer.rb
For assistant messages with metadata (provider, api, model):
This prevents provider-specific reasoning block formats from breaking on another provider.
3) OpenAI chat-completions adds tool-call ID normalization
New file: lib/llm_gateway/adapters/open_ai/chat_completions/input_message_sanitizer.rb
On top of generic sanitizing, it normalizes tool IDs:
This keeps tool-call/result references valid when handing off transcripts between APIs/providers with different ID constraints.