Add support for iterative agent cycles that allow the agent to execute multiple tool calls and regenerate responses. Introduce message queuing with automatic flushing based on chat state, and improve UI feedback for tool execution progress.

- Add MAX_AGENT_CYCLES constant (50) and loop generation in start_generation
- Introduce ToolStepOutcome enum (NoToolCalls, Paused, Continue)
- Replace check_tool_calls_and_continue with process_tool_calls_once
- Add chatId parameter to queue actions for proper multi-chat support
- Implement useQueueAutoFlush hook for automatic message queue processing
- Add subchat_log tracking and attached_files deduplication in ToolCall
- Update ToolUsageSummary to display progress steps and file information
- Improve UsageCounter to persist token display during idle periods
- Fix tool decision handling to only regenerate when decisions accepted
- Add regenerate command support in ResendButton
- Update test coverage for new agent loop behavior and queue operations
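A minimal sketch of the agent-cycle loop described above; `MAX_AGENT_CYCLES` and the enum variants come from the commit message, while the `step` closure stands in for `process_tool_calls_once` plus a regeneration call:

```rust
const MAX_AGENT_CYCLES: usize = 50;

/// Result of a single pass over the model's tool calls (variant names per the commit message).
enum ToolStepOutcome {
    NoToolCalls, // the model produced a plain answer; stop looping
    Paused,      // a tool call needs user confirmation; stop and wait
    Continue,    // tool results were appended; run another generation
}

/// Hypothetical shape of the loop inside start_generation.
fn run_agent_cycles(mut step: impl FnMut() -> ToolStepOutcome) {
    for _cycle in 0..MAX_AGENT_CYCLES {
        match step() {
            ToolStepOutcome::NoToolCalls | ToolStepOutcome::Paused => break,
            ToolStepOutcome::Continue => continue, // regenerate with the new tool results
        }
    }
}
```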
- Store limited_messages in PreparedChat and session for tool context
- Track last_prompt_messages in ChatSession to use actual LLM input for tools
- Allow critical ToolDecision commands to bypass queue size limits when paused
- Refactor deduplicate_and_merge_context_files to return dedup notes
- Rename is_covered_by_history to find_coverage_in_history for clarity
- Improve finish_stream to only add non-empty messages
- Consolidate tool result notes into existing tool messages
- Import setUseCompression action and PATCH_LIKE_FUNCTIONS constant
- Add listener to automatically approve patch-like tool confirmations when automatic_patch is enabled
- Add listener for setUseCompression action to sync compression setting
- Improve error handling consistency with inline comments
- Add type annotation to useQueueAutoFlush reducer
Separate user messages into queued_messages and non-user messages into thread.messages when creating a new chat. This allows the UI to handle message sending through the queue mechanism. Also remove setSendImmediately call in useCompressChat as queued messages are now the default flow.
…cancellation
- Rename `queued_messages` to `queued_items` throughout codebase for clarity
- Introduce `QueuedItem` type with `client_request_id`, `priority`, `command_type`, `preview` (sketched after this list)
- Replace `QueuedUserMessage` with unified `QueuedItem` supporting all command types
- Add `priority` field to `ChatCommand` and `CommandRequest` for immediate execution
- Implement `cancelQueuedItem` API endpoint (DELETE /chats/:chat_id/queue/:client_request_id)
- Add `cancelQueued` action to `useChatActions` hook
- Update `sendUserMessage` to accept optional `priority` parameter
- Refactor `Chat.tsx` to pass `sendPolicy` ("immediate" | "after_flow") to submit handler
- Update `ChatForm.tsx` to use `selectQueuedItems` instead of `selectQueuedMessages`
- Add queue state to runtime updates for real-time visibility in UI
- Implement `build_queued_items()` and `emit_queue_update()` in session
- Add `inject_priority_messages_if_any()` to generation flow for priority message handling
- Update snapshot and runtime events to include `queued_items` array
- Improve queued message display with command type and preview text
- Update CSS for queued message styling with priority indicators
- Remove `useQueueAutoFlush` hook (no longer needed with server-side queue management)
- Update test fixtures and mock stores to use `queued_items`
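Assuming the fields listed above, the unified queue item might look roughly like this on the backend; the `priority` type and the serde derives are guesses, not confirmed by the diff:

```rust
use serde::{Deserialize, Serialize};

/// One entry in a chat's command queue; replaces the old QueuedUserMessage.
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct QueuedItem {
    /// Client-generated id, also used by DELETE /chats/:chat_id/queue/:client_request_id.
    pub client_request_id: String,
    /// Whether the item may be injected before the current flow finishes (type assumed).
    pub priority: bool,
    /// Which ChatCommand this item carries, e.g. a user message or a tool decision.
    pub command_type: String,
    /// Short human-readable text shown in the queued-items list in the UI.
    pub preview: String,
}
```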
Add `newChatWithInitialMessages` async thunk to handle creating a new chat and sending initial user messages. Includes helper function `toMessageContent` to normalize message content formats (string, array, or legacy format). Update consumers in `useCompressChat` and `useEventBusForApp` to use the new thunk for chats with initial messages. Add safety check in `useCapsForToolUse` to prevent redundant model updates.
Add support for automatic patch execution when all pending confirmations are patch-like tool calls. This includes:

- New `setAutomaticPatch` action creator and middleware listener in GUI
- New `automatic_patch` field in ThreadParams
- Patch-like tool detection in backend (patch, text_edit, create_textdoc, etc)
- Logic to skip pause when automatic_patch is enabled and all confirmations are patch-like tools
- Persistence of automatic_patch setting in trajectory snapshots
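A rough sketch of the skip-pause decision under `automatic_patch`; the tool-name list follows the commit message (the real `PATCH_LIKE_FUNCTIONS` list is longer, per the "etc"), and the helper names are assumptions:

```rust
/// Tool names treated as "patch-like" (partial list taken from the commit message).
const PATCH_LIKE_FUNCTIONS: &[&str] = &["patch", "text_edit", "create_textdoc"];

fn is_patch_like(tool_name: &str) -> bool {
    PATCH_LIKE_FUNCTIONS.contains(&tool_name)
}

/// Hypothetical check: only skip the confirmation pause when the setting is on
/// and every pending confirmation is a patch-like tool call.
fn should_skip_pause(automatic_patch: bool, pending_tool_names: &[&str]) -> bool {
    automatic_patch
        && !pending_tool_names.is_empty()
        && pending_tool_names.iter().all(|&name| is_patch_like(name))
}
```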
Add optional `file_rev` field to ContextFile to track file revisions/hashes. This field is used for deduplication and change detection across file contexts. The field is skipped during serialization and deserialization to maintain backward compatibility.

- Add `file_rev: Option<String>` field to ContextFile struct
- Update all ContextFile instantiations to include `file_rev: None`
- Add file hashing via `official_text_hashing_function` in pp_tool_results
- Use file_rev for duplicate detection in find_coverage_in_history
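The new field, as described, might be declared along these lines; the serde attribute follows the "skipped during serialization" note, and the other field names are placeholders for the existing struct members:

```rust
use serde::{Deserialize, Serialize};

#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct ContextFile {
    pub file_name: String,
    pub file_content: String,
    // ...other existing fields elided...
    /// Revision/hash of the file content, used for dedup and change detection.
    /// Skipped by serde so existing snapshots keep (de)serializing unchanged.
    #[serde(skip)]
    pub file_rev: Option<String>,
}
```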
Extend both STREAM_IDLE_TIMEOUT and STREAM_TOTAL_TIMEOUT from their previous values (120s and 900s respectively) to 3600s (1 hour) to allow longer-running streaming operations to complete without premature timeout.
- Add draft message storage utilities with automatic cleanup of stale messages
- Implement useDraftMessage hook for managing chat input drafts
- Add ChatLoading component with animated skeleton and progress indicators
- Enable parallel tool execution with configurable concurrency limit (16)
- Add snapshot_received flag to track initial chat state synchronization
- Implement thread parameter persistence to localStorage
- Add entertainment messages to strategic planning tool with progress tracking
- Use official text hashing for file revisions in postprocessing
- Add middleware listeners for syncing thread parameters to backend
- Normalize line endings across codebase (CRLF → LF)

Refs: Multiple feature enhancements for improved UX and performance
…agement

- Add voice recording and transcription capabilities using Whisper
- Implement audio decoding for multiple formats (WAV, WebM, MP3, OGG)
- Add audio resampling to 16kHz for transcription
- Create voice service with model download and caching
- Add microphone button UI with recording state feedback
- Add voice status API endpoint with model management
- Integrate LLVM installation in CI/CD workflows
- Add optional voice feature flag to Cargo.toml
- Add user memories context injection in chat system
- Improve chat snapshot handling to preserve thread settings
- Add missing dependency imports and error handling
Implement comprehensive task management with kanban board, multi-role chat system (planner/orchestrator/agent), and tools for task lifecycle management.

- Add KanbanBoard component for visual task card management
- Implement TaskList and TaskWorkspace for task UI
- Add Sidebar integration showing active tasks
- Extend Toolbar with task tabs and creation
- Implement task agent tools (update, complete, fail, finish)
- Add task board management tools (create/update/move/delete cards)
- Implement orchestrator tools (check agents, ready cards, instructions)
- Add task agent finish tool for completion reporting
- Implement task mark card tools for manual resolution
- Add task check agents tool for monitoring agent status
- Implement task initialization and board state management
- Update config with task planner/orchestrator/agent prompts and parameters
Add optional task_meta field to ThreadParams to track task context (task_id, role, agent_id, card_id) in chat sessions. Update chat reducer to preserve and propagate task metadata through snapshots. Add infer_task_id_from_chat_id utility to derive task ID from chat ID patterns. Update all task tool get_task_id functions to use inference fallback. Allow trajectory save for task chats with empty messages.
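One plausible shape for the inference helper; the chat-id pattern ("task-" prefix followed by the task id) is an illustrative assumption, since the real pattern is not shown in the message:

```rust
/// Hypothetical: derive a task id from chat ids shaped like "task-<task_id>-<role>-<suffix>".
fn infer_task_id_from_chat_id(chat_id: &str) -> Option<String> {
    let rest = chat_id.strip_prefix("task-")?;
    // Take everything up to the next separator as the task id.
    let task_id = rest.split('-').next()?;
    if task_id.is_empty() {
        None
    } else {
        Some(task_id.to_string())
    }
}

#[test]
fn infers_task_id() {
    assert_eq!(infer_task_id_from_chat_id("task-T1-agent-abc"), Some("T1".to_string()));
    assert_eq!(infer_task_id_from_chat_id("regular-chat"), None);
}
```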
Add an optional code_workdir parameter to AtCommandsContext to support task-specific working directories for agent operations. Update all instantiations of AtCommandsContext to pass the new parameter, and implement path resolution logic for relative file paths within the working directory. Also refactor task-related endpoints to use "planner" terminology instead of "orchestrator" and add PATCH endpoint for updating task metadata.
Refactor ToolTaskBoardCreateCard, ToolTaskBoardUpdateCard, ToolTaskBoardMoveCard, and ToolTaskBoardDeleteCard to acquire the context lock only for extracting planner role and global context, then release it before performing long-running operations like board loading and storage. This reduces lock contention and prevents potential deadlocks.
- Add expandable/collapsible chat section with chevron toggle in header
- Chat header becomes clickable to expand/hide kanban board and panels
- Smooth 0.3s chevron rotation animation when expanding/collapsing
- Create new AgentStatusDot component with animated indicators
  - Blue pulsing dots (1.5s) for active agents showing progress
  - Green pulsing dots (2s) for completed agents
  - Static red dots for failed agents
- Integrated animated dots into AgentsPanel for visual status feedback
…unction

Add new `get_project_dirs_with_code_workdir()` and `get_project_dirs_for_workdir()` functions to handle code_workdir parameter consistently across tools. Update file edit, cat, mv, rm, tree, and tool_strategic_planning to use the new function instead of duplicating workdir logic.
- Restructure Chat component with flex layout for better content scrolling
- Add model parameter to createChatWithId action for task agent chats
- Update ChatRawJSON to accept generic thread objects instead of ChatHistoryItem
- Relax copyChatHistoryToClipboard type to accept Record<string, unknown>
- Fix ModelSelector nullish coalescing operator (|| to ??)
- Add task_meta handling in reducer snapshot event for task chat detection
- Support task name updates in UpdateTaskMetaRequest and handle_update_task_meta
- Add model field to updateTaskMeta mutation in tasks service
- Pass default_agent_model when creating agent chats in TaskWorkspace
- Add selectThreadById fallback in ThreadHistory for active thread lookup
- Enable task tab renaming in Toolbar with updateTaskMeta integration
…trategic_planning, memory_bank)
Disable message compression logic in history limiting to simplify token management. Add parse_depends_on helper to accept both array and comma-separated string formats for task card dependencies, improving API flexibility.
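A sketch of the dual-format `parse_depends_on` helper; the serde_json handling is straightforward, but where the value comes from in the card payload is assumed:

```rust
use serde_json::Value;

/// Accept either ["T-1", "T-2"] or "T-1, T-2" for a card's depends_on field.
fn parse_depends_on(value: &Value) -> Vec<String> {
    match value {
        Value::Array(items) => items
            .iter()
            .filter_map(|v| v.as_str())
            .map(|s| s.trim().to_string())
            .filter(|s| !s.is_empty())
            .collect(),
        Value::String(s) => s
            .split(',')
            .map(|part| part.trim().to_string())
            .filter(|part| !part.is_empty())
            .collect(),
        _ => Vec::new(), // anything else yields no dependencies
    }
}
```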
…9b0bb8f1/card/T-1/5d257e1f
…9b0bb8f1/card/T-3/1bafe783
…9b0bb8f1/card/T-5/fdd3b6cb
…9b0bb8f1/card/T-4/1cc14148
Update handle_v1_trajectories_get and handle_v1_trajectories_delete to use find_trajectory_path() for consistent trajectory resolution across workspace and task directories. Filter task trajectories from main history view.
- Rename response types: PreviewResponse → TransformPreviewResponse/HandoffPreviewResponse
- Update response fields to match backend: stats → individual token counts and reduction percent
- Simplify transform options: remove summarize_conversation, add dedup_and_compress_context
- Expand handoff options: add include_last_user_plus, llm_summary_for_excluded
- Refactor TrajectoryPopover into separate button and content components
- Update trajectory hook to request SSE refresh after successful transform apply
- Add sse_refresh_requested field to chat state for reconnection signaling
- Update useChatSubscription to listen for refresh requests and reconnect
- Wrap request bodies in options envelope for API consistency
Apply consistent multi-line formatting to JSX props, TypeScript types, and complex expressions across multiple components and utilities. Remove an unused import and apply minor whitespace fixes.
Remove dedicated self_hosted.rs module (374 lines) and integrate self-hosted caps parsing directly into caps.rs via new convert_self_hosted_caps_if_needed() function. Simplify provider configs by removing legacy chat_models fields and always using Vec::new(). Add model caps auto-application to all chat models. Implement normalized model matching (case-insensitive, strip -latest/-preview/-fp* suffixes, dot→dash conversion).

Other changes:
- Fill model defaults in chat sampling params
- Improve model filter regexes (o1/o3/o4 support)
- GUI: animation improvements + session_state tracking
- HTTP API: centralized json_response helper
- Pricing: add GPT-5/o4/Claude 4.x/Gemini 2.5 entries

Closes #TODO
…M requests

- Add `linearize_thread_for_llm()`: merges consecutive user-like messages (user/plain_text/cd_instruction/context_file) and folds tool-appendable content into preceding tool/diff messages while preserving tool_call_id
- Comprehensive test suite covering 100%+ real-world patterns from trajectories
- Update all LLM adapters (refact/openai/anthropic/responses) to use linearized messages instead of raw chat history
- Eliminates redundant user messages for strict role alternation and prefix caching optimization

refactor(chat): extract synchronous thread params patching
- Move `buildThreadParamsPatch()` from actions to shared util
- New chats: send tool_use/mode only on empty message history
- Existing chats: sync model/boost_reasoning/include_project_info/etc.
- Eliminates async RTK listener race condition

feat(providers): add Responses API toggles for OpenAI/xAI
- `use_responses_api` field with UI toggle and smartlinks
- xAI: Responses API (`/v1/responses`) vs Chat Completions (`/v1/chat/completions`)
- OpenAI: Responses API (`/v1/responses`) vs Chat Completions
- Hidden provider variants (`xai_responses`, `openai_responses`) for UI merging
- Icons/labels show Responses API variant

feat(ui): tool call argument tooltips
- Hover tool card headers to see function name + JSON args
- 10s delay, positioned above card, auto-dismiss
- Handles malformed JSON gracefully with raw fallback
- Portal positioning with mouse tracking

fix(chat): prevent stale trajectory state overwrites
- Skip boolean flag updates (streaming/waiting/etc.) when chat SSE active
- Sidebar SSE (trajectories) can lag behind chat SSE; only update session_state
- Preserves `snapshot_received` as authoritative chat state source

feat(metering): USD cost display with breakdown
- Parse `metering_usd` from usage with prompt/generated/cache breakdown
- Hover cards show detailed USD vs coins (show USD when available)
- History list, message footers, usage counter all updated
- Graceful fallback to coins when no USD data

refactor(providers): model fallback + defaults
- Fallback missing light/thinking models to chat_default_model
- Provider defaults YAML loaded per-model type (chat/light/thinking)
- Resolve user model aliases (gpt-4o → refact/gpt-4o)
- Hide refact/refact_self_hosted settings (managed by backend)

feat(chat): save preamble/knowledge to session
- System prompts/project context persisted to avoid regeneration churn
- Agentic knowledge enrichment saved before user messages
- `prepare_session_preamble_and_knowledge()` extracts to shared util

chore(modes): bump schema_version + add_workspace_folder tool
- All modes updated to schema v6 with `add_workspace_folder` tool enabled
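The linearization rule in the first block above might reduce to a sketch like this; the role names come from the commit, while the message shape, merge separator, and the omitted tool-result-folding half are assumptions:

```rust
#[derive(Debug, Clone)]
struct ChatMessage {
    role: String, // "user", "assistant", "tool", ...
    content: String,
}

/// Treat these roles as user-visible input that strict-alternation providers
/// expect to arrive as a single user turn.
fn is_user_like(role: &str) -> bool {
    matches!(role, "user" | "plain_text" | "cd_instruction" | "context_file")
}

fn linearize_thread_for_llm(messages: &[ChatMessage]) -> Vec<ChatMessage> {
    let mut out: Vec<ChatMessage> = Vec::new();
    for msg in messages {
        match out.last_mut() {
            Some(prev) if is_user_like(&prev.role) && is_user_like(&msg.role) => {
                // Fold into the previous user turn instead of emitting a second one.
                prev.content.push_str("\n\n");
                prev.content.push_str(&msg.content);
                prev.role = "user".to_string();
            }
            _ => out.push(msg.clone()),
        }
    }
    out
}
```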
Introduce tracking of currently running Refact models from pricing metadata and filter the available models list to only include those models.

- Add `running_models` field to `RefactProvider`
- Implement `set_running_models()` and override `get_available_models_from_caps()`
- Extract running model IDs from pricing object keys in capabilities loader
- Add default `set_running_models()` no-op to provider trait
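The pricing-keys extraction might reduce to something like this; the pricing JSON shape and the provider wiring are assumptions:

```rust
use serde_json::Value;
use std::collections::HashSet;

/// Assume pricing arrives as an object keyed by model id,
/// e.g. {"gpt-4o": {"prompt": 2.5, ...}, "claude-sonnet": {...}}.
fn running_models_from_pricing(pricing: &Value) -> HashSet<String> {
    pricing
        .as_object()
        .map(|models| models.keys().cloned().collect())
        .unwrap_or_default()
}

/// Keep only the models the server reports as currently running.
fn filter_available(all_models: Vec<String>, running: &HashSet<String>) -> Vec<String> {
    all_models.into_iter().filter(|m| running.contains(m)).collect()
}
```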
- Add n_ctx to CommonParams with serde skip for optional serialization
- Expand ReasoningIntent with Default, Minimal, XHigh variants and update adapter mappings (anthropic_budget, to_openai_reasoning)
- Implement Anthropic citation resending in multi-turn conversations with proper text block attachment
- Add comprehensive Refact adapter citation parsing for LiteLLM Anthropic (provider_specific_fields.citation[s], non-streaming message fields)
- Simplify usage parsing: consistent cache_read subtraction from prompt_tokens across providers with zero-filtering and fallback logic
- Introduce selectEffectiveMaxContextTokens respecting context_tokens_cap
- Remove extra_headers from AdapterSettings (cleanup)
- Add extensive tests for citation streaming, multi-turn resending, and usage edge cases (zero fields, partial chunks, PDF page citations)
- Map new ReasoningEffort variants and pass n_ctx to Refact adapter

Fixes # various citation/streaming issues
- Add abort_flag polling and process group killing in shell tool execution
- Distinguish user aborts from tool timeouts and normal completion
- Update queue/tools logic to check abort state before/after tool execution
- Add interrupted flag and partial output handling for interrupted shell commands
- Ensure proper state transitions (Idle) and UI feedback for aborts
- Sort paths and directories in files_in_workspace for consistent trie building
- Sort context paths in pp_utils for stable processing order
- Replace HashMap with BTreeMap in pp_tool_results for sorted iteration
- Sort customization modes by title/id to ensure consistent UI order
- Enhance model pattern matching with canonical names and wildcard support
- Change default reasoning effort to Medium when boost_reasoning enabled
- Update anthropic test comment for clarity
…port

- Expand ReasoningEffort enum with None/Default/Minimal/XHigh/Max variants
- Add Anthropic "effort" reasoning style (adaptive thinking + output_config)
- Update UI: temperature disabled with reasoning, mid-chat warnings, new buttons
- Fix tool call args normalization for empty/null values from LLMs
- Improve reasoning display: paragraph breaks for bold titles, better caps lookup
- Add n_ctx to SamplingParameters, update test timeouts and fixtures

Fixes reasoning config inconsistencies across providers (OpenAI/Mistral/XAI/Qwen/etc).
Extend ReasoningEffort enum parsing and conversion to support new XHigh and Max levels in both generation and preparation modules.
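Given the variants named across these commits, the parsing side might look like this; Low/Medium/High are assumed pre-existing, and the lowercase string mapping is a guess:

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ReasoningEffort {
    None,
    Default,
    Minimal,
    Low,
    Medium,
    High,
    XHigh,
    Max,
}

impl ReasoningEffort {
    /// Parse a user/provider supplied effort string, case-insensitively.
    fn parse(s: &str) -> Option<Self> {
        match s.to_ascii_lowercase().as_str() {
            "none" => Some(Self::None),
            "default" => Some(Self::Default),
            "minimal" => Some(Self::Minimal),
            "low" => Some(Self::Low),
            "medium" => Some(Self::Medium),
            "high" => Some(Self::High),
            "xhigh" | "x-high" => Some(Self::XHigh),
            "max" => Some(Self::Max),
            _ => None,
        }
    }
}
```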
Replace truncateMiddle/truncate functions with full text display across all tool cards. Improve layout with flex properties and vertical-align fixes for better text wrapping. Add streaming progress step display.

- ResearchTool, KnowledgeTool, ShellTool, SubagentTool: full query display
- EditTool: robust file path fallback, full filename display
- ReadTool: clickable full file list instead of truncated summary
- GenericTool: simplified full arg formatting without truncation limits
- StreamingToolCard: add step progress parsing and display
- CSS: flex layout improvements, ellipsis handling, baseline alignment
Add full Claude Code integration supporting:

- Auto-detection of CLI credentials at ~/.claude/.credentials.json
- OAuth Bearer token auth (Authorization: Bearer) alongside API keys
- Claude Code-specific headers (user-agent, beta flags, system prefix)
- Server-side web_search tool passthrough for multi-turn conversations
- mcp_ tool prefixing/stripping for server-side tool validation
- Dynamic model discovery from Anthropic /v1/models API

Refactor provider traits to support async model fetching and auth_token. Add server_content_blocks to preserve server_tool_use/web_search_tool_result blocks verbatim across turns (required for encrypted_index citation validation). Add claude_code.yaml default provider config.
- Add OpenAICodexProvider registration and module import
- Add codex-mini pricing model ($1.50/$6.00 per M tokens)
- Add robust ChatToolFunction::parse_args() to handle malformed LLM tool arguments (empty strings, null, arrays → `{}`); see the sketch after this list
- Add comprehensive unit tests for argument parsing edge cases
- Update all tool argument parsing to use new parse_args() method
- Improve argument extraction logic in stream_core.rs
- Remove unused is_claude_code_oauth flags from adapters
BREAKING CHANGE: Tool argument parsing behavior changed - non-object JSON now normalizes to `{}` instead of erroring
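A sketch of the normalization the BREAKING CHANGE note describes: non-object argument JSON becomes an empty map instead of an error. Its placement on `ChatToolFunction` and the exact signature are assumptions:

```rust
use serde_json::{Map, Value};

/// Parse LLM-provided tool-call arguments defensively:
/// "", null, "null", arrays, or malformed JSON all normalize to {}.
fn parse_args(raw: &str) -> Map<String, Value> {
    let trimmed = raw.trim();
    if trimmed.is_empty() || trimmed == "null" {
        return Map::new();
    }
    match serde_json::from_str::<Value>(trimmed) {
        Ok(Value::Object(map)) => map,
        _ => Map::new(), // non-object JSON or parse error → empty args
    }
}

#[test]
fn malformed_args_normalize_to_empty() {
    assert!(parse_args("").is_empty());
    assert!(parse_args("null").is_empty());
    assert!(parse_args("[1,2]").is_empty());
    assert_eq!(parse_args(r#"{"path":"a.rs"}"#).len(), 1);
}
```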
Implement full OpenAI Codex provider with:

- Multiple auth methods (API key, OAuth token, Codex CLI auto-detection)
- Dynamic model discovery via /v1/models API filtered for codex models
- Support for CODEX_API_KEY/OPENAI_API_KEY env vars
- Model capabilities matching with date suffix stripping
- Provider schema and default YAML config
- Fallback to model_caps when API unavailable

Also increase tool budget limits and cat tool max lines to 32768/2000 to support larger Codex contexts.
- Implement full OAuth2 PKCE flow with Anthropic Console API (PKCE sketch after this list)
- Add GUI "Login with Anthropic" button and code exchange UI
- Remove legacy API key support (ANTHROPIC_API_KEY, api_key field)
- Prioritize OAuth tokens over CLI session tokens in auth resolution
- Add OAuth status display with token expiry detection
- New HTTP endpoints: /providers/claude_code/oauth/{start,exchange,logout}
- Store OAuthTokens (access_token, refresh_token, expires_at) in provider config
- Update model listing and auth diagnostics for OAuth-only flow
Closes #TODO
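The PKCE portion of the flow is standard (RFC 7636); a minimal sketch using the `rand`, `sha2`, and `base64` crates, with the Anthropic-specific endpoints and token exchange omitted:

```rust
use base64::engine::general_purpose::URL_SAFE_NO_PAD;
use base64::Engine;
use rand::RngCore;
use sha2::{Digest, Sha256};

/// Generate a PKCE code_verifier and its S256 code_challenge.
fn pkce_pair() -> (String, String) {
    let mut bytes = [0u8; 32];
    rand::thread_rng().fill_bytes(&mut bytes);
    let verifier = URL_SAFE_NO_PAD.encode(bytes); // 43-char URL-safe verifier
    let challenge = URL_SAFE_NO_PAD.encode(Sha256::digest(verifier.as_bytes()));
    (verifier, challenge)
}
```

The verifier is kept locally and sent during the code exchange; only the challenge appears in the authorization URL.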
…odels status

Introduce `has_credentials()` trait method and `selected_model_count()` for providers to determine status automatically. Provider status now shows "active"/"configured"/"not_configured" badges. ProviderForm uses dynamic schema parsing with per-field save (patch semantics) and removes global Save button.

Backend changes:
- ProviderRuntime.enabled now requires credentials AND selected models
- ProviderListItem/ProviderDetailResponse add has_credentials/status fields
- Model enable/disable logic simplified (allowlist only, no empty-set default)

Frontend changes:
- ProviderCard shows model count + status dots instead of toggle
- SchemaField components handle string/boolean/secret fields with save states
- Per-field editing with blur-save and inline feedback

Closes #providers-config-ux
…ditor

Introduce dynamic YAML schema parsing for provider configuration with:
- Separate important/extra fields based on f_extra flag
- Per-field save with backend YAML merge preserving secrets
- Remove legacy OAuth UI in favor of direct credential input
- Add API key support alongside OAuth tokens for Claude Code

Backend changes:
- Simplify settings merge logic
- Remove obsolete OAuth endpoints
- Dual auth support (API key + OAuth) for Anthropic/Claude Code
Implement full OpenAI OAuth2 flow with PKCE for secure browser-based login:
- New openai_codex_oauth module with verifier/challenge generation
- Support for both in-app OAuth tokens and Codex CLI credentials
- HTTP callback endpoint for automatic auth completion
- Updated provider schema to use oauth: { supported: true }
- Enhanced GUI with auto-polling and provider-specific labels
- Simplified OpenAICodexProvider auth resolution
Supports ChatGPT Plus/Pro subscriptions for GPT-5-Codex model access.
Apply consistent line breaking for multiline JSX elements and expressions in ToolCard components, ProviderForm, and related files.
…ries surfacing

Introduce KnowledgeIndex for O(1) retrieval by files/tags/entities/related fields. Auto-builds in background from all knowledge dirs (local+global).

Key improvements:
- Surface "Related memories (short form)" in 20+ tools (cat, search, tree, edit tools, knowledge creation, subagents, etc.)
- <50ms in-memory lookup
- Richer frontmatter (created_at, summary, entities, related_files/entities, hashes)
- Scoped VecDB search (knowledge/trajectory dirs) with de-dup + usefulness scoring
- Consistent archiving/indexing across local/global knowledge roots

Supports both new rich docs and legacy frontmatter.
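Conceptually the index is a set of inverted maps over the knowledge docs; a minimal sketch under assumed field names and doc shape:

```rust
use std::collections::HashMap;

#[derive(Debug, Clone)]
struct KnowledgeDoc {
    path: String,
    tags: Vec<String>,
    related_files: Vec<String>,
}

/// Inverted maps giving constant-time lookup from a tag or file to the docs mentioning it.
#[derive(Default)]
struct KnowledgeIndex {
    by_tag: HashMap<String, Vec<usize>>,
    by_file: HashMap<String, Vec<usize>>,
    docs: Vec<KnowledgeDoc>,
}

impl KnowledgeIndex {
    fn build(docs: Vec<KnowledgeDoc>) -> Self {
        let mut index = KnowledgeIndex { docs, ..Default::default() };
        for (i, doc) in index.docs.iter().enumerate() {
            for tag in &doc.tags {
                index.by_tag.entry(tag.clone()).or_default().push(i);
            }
            for file in &doc.related_files {
                index.by_file.entry(file.clone()).or_default().push(i);
            }
        }
        index
    }

    /// Docs that reference a given file; used to surface "Related memories".
    fn docs_for_file(&self, file: &str) -> Vec<&KnowledgeDoc> {
        self.by_file
            .get(file)
            .map(|ids| ids.iter().map(|&i| &self.docs[i]).collect())
            .unwrap_or_default()
    }
}
```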
Replace `supports_reasoning: Option<String>` and `supports_boost_reasoning` with explicit `reasoning_effort_options`, `supports_thinking_budget`, and `supports_adaptive_thinking_budget` fields throughout model records and caps. Simplify model adaptation logic in chat/prepare.rs to use capability checks instead of string matching.

- Update OAuth token exchange to use form params
- Add background OAuth token refresh task
- Improve knowledge index ranking
- Fix diff rendering to separate JSON from related memories
- Update known_models.json, TypeScript types, and tests accordingly
…de only

- Remove openai_codex handling from OAuth start/exchange/reset endpoints
- Delete openai_codex OAuth callback handler entirely
- Simplify provider checks to only support "claude_code"
- Rename save_provider_oauth_tokens to save_oauth_tokens_to_provider
- Update redirect URI in openai_codex_oauth (likely for migration)
- Hardcode claude_code.yaml config path and provider creation
Move completion_models to completion_presets.json and embedding_models to embedding_presets.json for better modularity. Update CONTRIBUTING.md reference. Retain chat_models in known_models.json (not shown in diff).
… discovery

Remove dependency on OpenAI models API call and instead use:
- Hardcoded known Codex model IDs
- Regex discovery of Codex models from model_caps
- Simplified model matching without date suffix stripping

Improves reliability by eliminating external API dependency and enables multimodal support by default.
…form endpoints

Improve OpenAI Codex auth resolution to prefer OPENAI_API_KEY (from token-exchange) over OAuth access tokens for api.openai.com compatibility. Also add subchat thinking progress streaming for deep_research/strategic_planning/code_review tools, plus minor GUI provider label/icon additions and YAML compat fixes.

- Update auth priority: in-app API key → Codex CLI API key → OAuth tokens
- Add token-exchange flow to obtain API key during OAuth
- Preserve existing oauth_tokens in YAML refresh
- Expose OPENAI_API_KEY at config top-level for backward compat
- Implement SubchatProgressCollector for real-time thinking previews
- Update GUI constants/icons for openai_codex/claude_code providers
Replace deprecated reqwest-eventsource with eventsource-stream for SSE streaming. Add HTTP status validation before streaming and improve error handling with format_llm_error_body.

refactor(openai-codex): support ChatGPT backend OAuth endpoint
Add ChatGptBackendOAuth variant with chatgpt_account_id extraction from JWT. Support both Platform API (/v1/responses) and ChatGPT backend (/backend-api/codex/responses) endpoints with conditional request params.

refactor(subchat): add run_subchat_once_with_parent and subchat_depth
Introduce run_subchat_once_with_parent for nested subchats with proper parent tx/abort/depth propagation. Remove entertainment message helpers in favor of native progress streaming. Add unicode-safe truncation.

feat(anthropic)!: filter orphaned web search citations and server blocks
Strip citations with encrypted_index lacking server_content_blocks, and server_tool_use blocks without matching web_search_tool_result. Prevents invalid multi-turn requests.

refactor(tools): migrate to run_subchat_once_with_parent
Update code_review, strategic_planning, deep_research to use new run_subchat_once_with_parent API. Remove hardcoded entertainment messages.

chore: add ChatGPT OAuth diagnostics and GUI polling fix
Add api_key_exchange_error field and status checks. Fix ProviderOAuth polling to stop on terminal backend errors.

fix(adapter): ChatGPT backend param compatibility
Detect chatgpt.com/backend-api endpoint and omit unsupported params (max_output_tokens, temperature, stop). Set store:false.

fix(refact): sanitize thinking_blocks and citations
Filter thinking_blocks to valid types only. Strip web citations without server_content_blocks.
- Add AppendReasoning delta and reasoning_tail tracking for OpenAI Responses API
- Implement FinalizeToolCalls to handle complete tool calls from .done events
- Fix thinking blocks deduplication by ID to prevent duplicates
- Enhance UI streaming progress with auto-scroll, markdown rendering, and preserved newlines
- Improve scrollbar UX with stable gutters and hover-only thumbs
- Robustify OpenAI adapter with ChatGPT backend param filtering and comprehensive event handling
- Consistent null-safe reasoning capability checks across UI components

Fixes streaming progress truncation and tool call argument replacement issues.
…nt blocks

- Add dedicated ToolCard components for OpenAI server tools (web_search_call, file_search_call, code_interpreter_call, computer_call, image_generation, audio, refusal, mcp_call, mcp_list_tools)
- Implement OpenAI Responses API stateful multi-turn support (previous_response_id, store=true, tail-only message sending)
- Add server_content_blocks display and server-executed tool result formatting
- Enhance stream parsing for lifecycle events, output_items, and citations
- Update ThreadParams/TrajectorySnapshot with previous_response_id persistence
- Rich rendering: web results with links, file matches, code outputs, images, transcripts with proper icons and summaries

BREAKING CHANGE: OpenAI Responses API now requires store=true and chains via previous_response_id. UI expects srvtoolu_* prefixed server tool calls.

Fixes #model-capabilities-resolution
- Disable store=true, previous_response_id, include fields for chatgpt.com/backend-api
- Filter reasoning items from input (not persisted server-side)
- Skip redundant server content blocks for already-streamed data (output_text.done, reasoning.done, etc.)
- Refine output_item handling to avoid premature tool call emissions

Fixes compatibility issues causing 404s and incorrect streaming behavior.
…uter provider routing

Improve provider model management and streaming robustness:

**Provider Enhancements**
- Add vLLM, Ollama, LM Studio API model discovery (`fetch_available_models`)
- OpenRouter provider routing: `selected_provider`, `provider_variants`, endpoints API
- Live model filtering, pricing, capabilities from provider APIs
- Google Gemini API model listing and health checks
- Cache-aware provider enabled state (`enabled: true` in YAML)

**Streaming Improvements**
- Anthropic interleaved thinking: per-block reasoning via `block_index`
- Robust thinking block deduplication by `(id, type+index, type+signature)`
- Server content block ordering preservation (`_order_index`)
- Cache guard: prompt prefix validation, ephemeral cache_control injection
- OpenAI Responses/LiteLLM: `FinalizeToolCalls`, `AppendReasoning` deltas

**Fixes**
- Fix streaming truncation, tool call deduplication, argument replacement
- Model switch clears `previous_response_id` (Responses API)
- Filter orphaned `server_tool_use` blocks in multi-turn
- UI: ServerContentBlocks rendering, auto-scroll, markdown

**UI/UX**
- Provider model cards: search, grouping (OpenRouter families), provider selection table
- Cache guard confirmation dialog with diff/estimated cost
- OpenRouter account balance, health status badges

Closes streaming progress and tool finalization issues