fix: reconstruct OpenAI Responses SSE output items #414
tomasmach wants to merge 6 commits into spacedriveapp:main from
Conversation
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
src/llm/model.rs (Outdated)

```rust
        return Vec::new();
    };

    let mut ordered_items = vec![serde_json::Value::Null; max_index + 1];
```
Filling gaps with `null` means the final `response.output` can contain `null` entries if indices ever get sparse, which can make downstream parsing/reporting noisy. If you don't rely on preserving sparse indices, you can just collect the existing items in key order (`BTreeMap` already keeps them ordered).
Suggested change:

```diff
-    let mut ordered_items = vec![serde_json::Value::Null; max_index + 1];
+fn collect_responses_output_items(
+    output_items: &BTreeMap<usize, serde_json::Value>,
+) -> Vec<serde_json::Value> {
+    output_items.values().cloned().collect()
+}
```
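The suggestion relies on `BTreeMap` iterating its entries in ascending key order. A minimal standalone sketch of that behavior (using `String` as a stand-in for `serde_json::Value` so it runs without external crates; not the PR's actual code):

```rust
use std::collections::BTreeMap;

// Stand-in for collect_responses_output_items: values come back in
// ascending key order, so sparse indices simply collapse instead of
// being padded with nulls.
fn collect_output_items(items: &BTreeMap<usize, String>) -> Vec<String> {
    items.values().cloned().collect()
}

fn main() {
    let mut items = BTreeMap::new();
    // Insert out of order and with a gap at index 1.
    items.insert(2, "message".to_string());
    items.insert(0, "function_call".to_string());

    let ordered = collect_output_items(&items);
    // Two entries, in index order, no null padding for the gap.
    assert_eq!(
        ordered,
        vec!["function_call".to_string(), "message".to_string()]
    );
    println!("{ordered:?}");
}
```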
🧹 Nitpick comments (1)
src/llm/model.rs (1)
2851-2858: Consider adding a comment explaining the streaming merge heuristic. The string merge logic has specialized semantics for streaming text reconstruction—it only replaces when the source is longer and starts with the target. This non-obvious behavior could confuse future maintainers.
```diff
 (serde_json::Value::String(target_text), serde_json::Value::String(source_text)) => {
+    // Streaming merge: only replace if source extends target (common in delta accumulation)
+    // or if target is empty
     if target_text.is_empty()
         || (source_text.len() > target_text.len()
             && source_text.starts_with(target_text.as_str()))
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/llm/model.rs` around lines 2851 - 2858, Add a concise comment above the match arm handling (serde_json::Value::String(target_text), serde_json::Value::String(source_text)) explaining the streaming-merge heuristic: we only replace target_text when the incoming source_text is longer and starts_with(target_text) to support incremental/streamed text reconstruction (where newer chunks append to an earlier partial value) and to avoid overwriting with shorter partial updates; mention that empty target_text is also replaced. Keep the comment short, reference the variables target_text and source_text, and place it immediately above that block in src/llm/model.rs.
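The heuristic under review can be sketched in isolation. This is an illustrative reduction operating on plain strings (the function name and signature are assumptions, not the PR's `merge_json_value` implementation):

```rust
// Streaming-merge heuristic: replace target only when source strictly
// extends it (as later delta accumulations do), or when target is empty.
fn merge_streamed_text(target: &mut String, source: &str) {
    if target.is_empty()
        || (source.len() > target.len() && source.starts_with(target.as_str()))
    {
        *target = source.to_string();
    }
    // Otherwise keep target: a shorter or divergent source is treated as a
    // stale partial update and must not clobber already-accumulated text.
}

fn main() {
    let mut text = String::from("Hel");
    merge_streamed_text(&mut text, "Hello"); // extends -> replaced
    assert_eq!(text, "Hello");

    merge_streamed_text(&mut text, "He"); // stale shorter chunk -> ignored
    assert_eq!(text, "Hello");

    println!("{text}");
}
```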
Co-authored-by: tembo[bot] <208362400+tembo[bot]@users.noreply.github.com>
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/llm/model.rs`:
- Around line 3004-3012: There is a stray duplicated block of code that
reimplements the append logic after the append_json_string_field function has
already closed, causing an unmatched delimiter and compilation failure; remove
the duplicated block (the second occurrence that starts with let object =
ensure_json_object(value); and the subsequent operations on field_name/suffix)
so only the proper append_json_string_field implementation remains, ensuring no
extra closing brace or duplicated logic referencing ensure_json_object,
field_name, or suffix is left in the file.
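For illustration, a single well-formed helper of the kind the comment describes might look like the sketch below. It uses a plain `HashMap` as a stand-in for a `serde_json` object, and the name `append_string_field` is an assumption; the point is that exactly one function body should exist, with balanced braces, so a duplicated second body would not compile:

```rust
use std::collections::HashMap;

// Append a suffix to a string field, creating the field if absent.
fn append_string_field(
    object: &mut HashMap<String, String>,
    field_name: &str,
    suffix: &str,
) {
    object
        .entry(field_name.to_string())
        .or_default()
        .push_str(suffix);
} // single closing brace; a duplicated block after this point is a syntax error

fn main() {
    let mut obj = HashMap::new();
    append_string_field(&mut obj, "text", "Hel");
    append_string_field(&mut obj, "text", "lo");
    assert_eq!(obj["text"], "Hello");
    println!("{}", obj["text"]);
}
```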
- Simplify `collect_responses_output_items` to use `BTreeMap::values().cloned().collect()` instead of filling sparse indices with null (per Tembo review)
- Add doc comments to all 13 new helper functions for docstring coverage
- Add explanatory comment on the `merge_json_value` streaming heuristic (per CodeRabbit review)
Summary
Accumulate `response.output_item`, content-part, text-delta, and function-call-argument events before parsing the final `response.completed`, so message-only completion shells from ChatGPT Codex no longer fail memory persistence flows in channels like Telegram.

Changed Files
- `src/llm/model.rs` — teach the Responses SSE parser to accumulate streamed output state and merge it with completed-response metadata
- `src/llm/model.rs` — add regression tests for message content reconstruction and function call reconstruction

Verification
- `./scripts/preflight.sh` ✅ passed
- `./scripts/gate-pr.sh` ✅ passed
- `cargo test parse_openai_` ✅ passed
- `cargo build --release` ✅ passed locally with embedded frontend during reproduction setup

Root Cause
- `response.completed` payloads only include a bare `message` shell without the text or function-call payload that arrived earlier in stream events
- The parser previously only parsed `response.completed`, so it surfaced "empty or unsupported response from OpenAI ChatGPT Responses API" instead of reconstructing the assistant output

Residual Risk
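The fix's stateful-reconstruction idea can be sketched as follows. Types are simplified to plain strings and the names (`StreamState`, `on_text_delta`, `finalize`) are illustrative assumptions, not the PR's actual API:

```rust
use std::collections::BTreeMap;

// Accumulate output_text deltas per output index during the stream, then
// fall back to the accumulated text when the final response.completed
// shell arrives bare (metadata only, no content).
#[derive(Default)]
struct StreamState {
    text_by_index: BTreeMap<usize, String>,
}

impl StreamState {
    fn on_text_delta(&mut self, output_index: usize, delta: &str) {
        self.text_by_index
            .entry(output_index)
            .or_default()
            .push_str(delta);
    }

    fn finalize(&self, completed_text: Option<&str>, output_index: usize) -> Option<String> {
        // Prefer text carried by response.completed; otherwise use deltas.
        completed_text
            .map(str::to_string)
            .or_else(|| self.text_by_index.get(&output_index).cloned())
    }
}

fn main() {
    let mut state = StreamState::default();
    state.on_text_delta(0, "Hello, ");
    state.on_text_delta(0, "world");

    // Bare completion shell carries no text; reconstruction fills it in.
    assert_eq!(state.finalize(None, 0).as_deref(), Some("Hello, world"));
    println!("ok");
}
```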
Note
This fix significantly improves the robustness of the OpenAI Responses SSE parser by implementing stateful stream reconstruction. The parser now accumulates output events across `response.output_item.added`, `response.content_part.added/done`, `response.output_text.delta/done`, `response.refusal.delta/done`, and `response.function_call_arguments.delta/done` before finalizing the response. This ensures incomplete `response.completed` payloads (which only contain metadata shells) are enriched with content captured during streaming. Two regression tests validate both message text reconstruction and function call argument reconstruction flows.

Written by Tembo for commit 35904669f. This will update automatically on new commits.