fix: prevent workers from getting stuck in timeout-retry loops#477
fix: prevent workers from getting stuck in timeout-retry loops#477TheDarkSkyXD wants to merge 1 commit intospacedriveapp:mainfrom
Conversation
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (2)
🚧 Files skipped from review as they are similar to previous changes (1)
WalkthroughWorker-level transient retry cap reduced from 5 to 3 attempts. Anthropic API Changes
Estimated code review effort🎯 1 (Trivial) | ⏱️ ~3 minutes Suggested reviewers
🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 4
🧹 Nitpick comments (5)
src/llm/anthropic/params.rs (1)
100-106: LGTM — per-request timeout correctly overrides the client-level default.The 10-minute timeout is appropriate for large completions with extended thinking, and the comment clearly explains the rationale. This directly addresses the timeout→retry cascade described in the PR.
Consider extracting the timeout value to a named constant for self-documentation:
const ANTHROPIC_REQUEST_TIMEOUT: std::time::Duration = std::time::Duration::from_secs(10 * 60);This is optional; the inline expression is clear enough given the comment.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/llm/anthropic/params.rs` around lines 100 - 106, Extract the inline duration into a named constant for clarity and reuse: define a const like ANTHROPIC_REQUEST_TIMEOUT (type std::time::Duration) and replace the inline std::time::Duration::from_secs(10 * 60) in the builder.timeout(...) call (the http_client.post(&url).timeout(...) chain) with that constant; keep the existing comment and header usage around the builder variable unchanged.interface/src/components/ModelSelect.tsx (3)
102-105: Consider hoistingproviderRankoutside the component.Since
providerOrderis a module-level constant,providerRankdoesn't depend on any component state and can be defined at module scope for clarity.♻️ Optional: move to module scope
+ const providerRank = (p: string) => { + const index = providerOrder.indexOf(p); + return index === -1 ? Number.MAX_SAFE_INTEGER : index; + }; + export function ModelSelect({ ... - const providerRank = (p: string) => { - const index = providerOrder.indexOf(p); - return index === -1 ? Number.MAX_SAFE_INTEGER : index; - };🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@interface/src/components/ModelSelect.tsx` around lines 102 - 105, Hoist the providerRank helper out of the React component by moving its definition to module scope (using the existing module-level providerOrder constant) so it no longer gets recreated on each render; locate providerRank in ModelSelect.tsx, cut the function from inside the component body and paste it above the component definition (keeping the same implementation that returns Number.MAX_SAFE_INTEGER when index === -1), and ensure any references to providerRank inside the component remain unchanged.
313-316:flatList.indexOf(model)inside render loop causes O(n²) complexity.Each model lookup traverses the entire
flatList. With hundreds of models this can cause noticeable lag, especially during mouse hover events that triggersetHighlightIndex.♻️ Proposed fix: build an index map once
const flatList = useMemo(() => { const items: ModelInfo[] = []; for (const p of sortedProviders) { for (const m of grouped[p]) { items.push(m); } } return items; }, [sortedProviders, grouped]); + const flatIndexMap = useMemo(() => { + const map = new Map<string, number>(); + flatList.forEach((m, i) => map.set(m.id, i)); + return map; + }, [flatList]);Then in the render:
- const flatIdx = flatList.indexOf(model); + const flatIdx = flatIndexMap.get(model.id) ?? -1;🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@interface/src/components/ModelSelect.tsx` around lines 313 - 316, The render loop is doing flatList.indexOf(model) causing O(n²) behavior; create a lookup map once (e.g., build a flatIndexMap keyed by model id or a stable identifier from flatList) and use that map inside the grouped[prov].map to compute flatIdx instead of indexOf; update references to flatIdx, highlightIndex, setHighlightIndex, model.id and value to use the map lookup and ensure the map is rebuilt only when flatList changes (memoize or compute outside the render loop).
198-214: Minor UX inconsistency: ArrowUp doesn't open the dropdown when closed.ArrowDown opens the dropdown (line 200-203), but ArrowUp only cycles through items. This may be intentional, but for consistency you could consider also opening on ArrowUp.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@interface/src/components/ModelSelect.tsx` around lines 198 - 214, The ArrowUp handler should mirror ArrowDown by opening the dropdown when it's closed; inside the ArrowUp branch (where e.key === "ArrowUp") add the same check used by ArrowDown: if (!open) { setOpen(true); setFilter(value); } and then adjust the setHighlightIndex logic to handle the newly opened state (e.g., initialize to flatList.length - 1 so the last item is highlighted) while preserving the existing decrement/wrap behavior; update references to setOpen, setFilter, setHighlightIndex, flatList, and value in the ArrowUp block accordingly.scripts/bundle-sidecar.ps1 (1)
33-33: ReplaceWrite-HostwithWrite-Informationfor better log stream compatibility.
Write-Hostoutputs directly to the console and cannot be captured by redirection operators;Write-Informationparticipates in PowerShell output streams and allows better control via-InformationAction.♻️ Proposed refactor
-Write-Host "Building spacebot ($BuildMode) for $TargetTriple..." +Write-Information "Building spacebot ($BuildMode) for $TargetTriple..." -InformationAction Continue @@ -Write-Host "Copied $SrcBin -> $DestBin" -Write-Host "Sidecar binary ready." +Write-Information "Copied $SrcBin -> $DestBin" -InformationAction Continue +Write-Information "Sidecar binary ready." -InformationAction ContinueAlso applies to: 51-52
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@scripts/bundle-sidecar.ps1` at line 33, Replace the direct console writes using Write-Host with Write-Information so messages participate in PowerShell's information stream; update the call shown (the literal line containing Write-Host "Building spacebot ($BuildMode) for $TargetTriple...") and the other occurrences around lines 51-52 to use Write-Information "Building spacebot ($BuildMode) for $TargetTriple..." -InformationAction Continue (preserving the original message text and any variable interpolation) so logs can be captured/controlled by InformationAction and redirection.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@scripts/bundle-sidecar.ps1`:
- Around line 35-40: The cargo build invocations (the native calls in the
branches comparing $TargetTriple and $HostTriple) can fail silently in
PowerShell 5.1; after each cargo build call (both the platform-specific branch
and the else branch around the lines invoking cargo build with `@CargoFlags` and
--target $TargetTriple), check $LASTEXITCODE (or the ExitCode from
Start-Process) and if non-zero immediately fail (e.g., throw or exit 1) with a
clear message including the exit code so the script doesn’t continue to
Copy-Item with a missing/stale spacebot.exe.
In `@src/agent/compactor.rs`:
- Around line 226-247: Branch compaction uses drain(..remove_count) without the
ToolResult-boundary protection found in compactor.rs (the while loop that
advances remove_count when encountering User messages containing
UserContent::ToolResult), which can orphan ToolResult blocks; refactor by
extracting that boundary logic into a shared function (e.g.,
calculate_safe_remove_count(total: usize, fraction: f32, hist: &History) ->
usize) and call it from both compactor.rs (replacing the existing remove_count
calculation and while loop) and Branch::compact_history() (replacing the direct
drain(..remove_count)), ensuring the helper checks hist.get(index) for
Message::User with content items matching UserContent::ToolResult and advances
the count accordingly.
In `@src/config/types.rs`:
- Line 845: Default-enabling the evaluate_enabled flag exposes JS execution by
default; change the default value of the evaluate_enabled field back to false in
the type's Default/constructor so browser-driven JS evaluation is opt-in, update
any related comments/docs to note it must be explicitly enabled, and run/update
tests that assert default settings; specifically modify the evaluate_enabled
default and ensure the browser evaluation code path (the JS/evaluate flow)
continues to check evaluate_enabled before executing.
In `@src/tools/browser.rs`:
- Around line 2057-2067: The current code in src/tools/browser.rs always treats
any error from browser.close().await (in the block handling Some(mut browser))
as non-fatal; change it to only suppress known terminal/disconnect errors (e.g.
oneshot canceled / WebSocket connection closed) by matching the error
kind/string/inner type from the result of browser.close().await and logging
those as warnings, but for any other unexpected error propagate it (return Err
or propagate the error out of the function) so the caller does not receive a
false "Browser closed" success; keep the existing tracing::warn for the known
cases and ensure the code path that produces the success response ("Browser
closed") is only reached when close either succeeded or matched a known terminal
disconnect.
---
Nitpick comments:
In `@interface/src/components/ModelSelect.tsx`:
- Around line 102-105: Hoist the providerRank helper out of the React component
by moving its definition to module scope (using the existing module-level
providerOrder constant) so it no longer gets recreated on each render; locate
providerRank in ModelSelect.tsx, cut the function from inside the component body
and paste it above the component definition (keeping the same implementation
that returns Number.MAX_SAFE_INTEGER when index === -1), and ensure any
references to providerRank inside the component remain unchanged.
- Around line 313-316: The render loop is doing flatList.indexOf(model) causing
O(n²) behavior; create a lookup map once (e.g., build a flatIndexMap keyed by
model id or a stable identifier from flatList) and use that map inside the
grouped[prov].map to compute flatIdx instead of indexOf; update references to
flatIdx, highlightIndex, setHighlightIndex, model.id and value to use the map
lookup and ensure the map is rebuilt only when flatList changes (memoize or
compute outside the render loop).
- Around line 198-214: The ArrowUp handler should mirror ArrowDown by opening
the dropdown when it's closed; inside the ArrowUp branch (where e.key ===
"ArrowUp") add the same check used by ArrowDown: if (!open) { setOpen(true);
setFilter(value); } and then adjust the setHighlightIndex logic to handle the
newly opened state (e.g., initialize to flatList.length - 1 so the last item is
highlighted) while preserving the existing decrement/wrap behavior; update
references to setOpen, setFilter, setHighlightIndex, flatList, and value in the
ArrowUp block accordingly.
In `@scripts/bundle-sidecar.ps1`:
- Line 33: Replace the direct console writes using Write-Host with
Write-Information so messages participate in PowerShell's information stream;
update the call shown (the literal line containing Write-Host "Building spacebot
($BuildMode) for $TargetTriple...") and the other occurrences around lines 51-52
to use Write-Information "Building spacebot ($BuildMode) for $TargetTriple..."
-InformationAction Continue (preserving the original message text and any
variable interpolation) so logs can be captured/controlled by InformationAction
and redirection.
In `@src/llm/anthropic/params.rs`:
- Around line 100-106: Extract the inline duration into a named constant for
clarity and reuse: define a const like ANTHROPIC_REQUEST_TIMEOUT (type
std::time::Duration) and replace the inline std::time::Duration::from_secs(10 *
60) in the builder.timeout(...) call (the http_client.post(&url).timeout(...)
chain) with that constant; keep the existing comment and header usage around the
builder variable unchanged.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 9c545778-e40b-4860-aea5-141a799abdfa
⛔ Files ignored due to path filters (50)
desktop/src-tauri/gen/schemas/windows-schema.jsonis excluded by!**/gen/**,!**/*.json,!**/gen/**desktop/src-tauri/icons/64x64.pngis excluded by!**/*.png,!**/*.pngdesktop/src-tauri/icons/Square107x107Logo.pngis excluded by!**/*.png,!**/*.pngdesktop/src-tauri/icons/Square142x142Logo.pngis excluded by!**/*.png,!**/*.pngdesktop/src-tauri/icons/Square150x150Logo.pngis excluded by!**/*.png,!**/*.pngdesktop/src-tauri/icons/Square284x284Logo.pngis excluded by!**/*.png,!**/*.pngdesktop/src-tauri/icons/Square30x30Logo.pngis excluded by!**/*.png,!**/*.pngdesktop/src-tauri/icons/Square310x310Logo.pngis excluded by!**/*.png,!**/*.pngdesktop/src-tauri/icons/Square44x44Logo.pngis excluded by!**/*.png,!**/*.pngdesktop/src-tauri/icons/Square71x71Logo.pngis excluded by!**/*.png,!**/*.pngdesktop/src-tauri/icons/Square89x89Logo.pngis excluded by!**/*.png,!**/*.pngdesktop/src-tauri/icons/StoreLogo.pngis excluded by!**/*.png,!**/*.pngdesktop/src-tauri/icons/android/mipmap-anydpi-v26/ic_launcher.xmlis excluded by!**/*.xmldesktop/src-tauri/icons/android/mipmap-hdpi/ic_launcher.pngis excluded by!**/*.png,!**/*.pngdesktop/src-tauri/icons/android/mipmap-hdpi/ic_launcher_foreground.pngis excluded by!**/*.png,!**/*.pngdesktop/src-tauri/icons/android/mipmap-hdpi/ic_launcher_round.pngis excluded by!**/*.png,!**/*.pngdesktop/src-tauri/icons/android/mipmap-mdpi/ic_launcher.pngis excluded by!**/*.png,!**/*.pngdesktop/src-tauri/icons/android/mipmap-mdpi/ic_launcher_foreground.pngis excluded by!**/*.png,!**/*.pngdesktop/src-tauri/icons/android/mipmap-mdpi/ic_launcher_round.pngis excluded by!**/*.png,!**/*.pngdesktop/src-tauri/icons/android/mipmap-xhdpi/ic_launcher.pngis excluded by!**/*.png,!**/*.pngdesktop/src-tauri/icons/android/mipmap-xhdpi/ic_launcher_foreground.pngis excluded by!**/*.png,!**/*.pngdesktop/src-tauri/icons/android/mipmap-xhdpi/ic_launcher_round.pngis excluded by!**/*.png,!**/*.pngdesktop/src-tauri/icons/android/mipmap-xxhdpi/ic_launcher.pngis excluded by!**/*.png,!**/*.pngdesktop/src-tauri/icons/android/mipmap-xxhdpi/ic_launcher_foreground.pngis excluded by!**/*.png,!**/*.pngdesktop/src-tauri/icons/android/mipmap-xxhdpi/ic_launcher_round.pngis excluded by!**/*.png,!**/*.pngdesktop/src-tauri/icons/android/mipmap-xxxhdpi/ic_launcher.pngis excluded by!**/*.png,!**/*.pngdesktop/src-tauri/icons/android/mipmap-xxxhdpi/ic_launcher_foreground.pngis excluded by!**/*.png,!**/*.pngdesktop/src-tauri/icons/android/mipmap-xxxhdpi/ic_launcher_round.pngis excluded by!**/*.png,!**/*.pngdesktop/src-tauri/icons/android/values/ic_launcher_background.xmlis excluded by!**/*.xmldesktop/src-tauri/icons/icon.icois excluded by!**/*.ico,!**/*.icodesktop/src-tauri/icons/icon.pngis excluded by!**/*.png,!**/*.pngdesktop/src-tauri/icons/ios/AppIcon-20x20@1x.pngis excluded by!**/*.png,!**/*.pngdesktop/src-tauri/icons/ios/AppIcon-20x20@2x-1.pngis excluded by!**/*.png,!**/*.pngdesktop/src-tauri/icons/ios/AppIcon-20x20@2x.pngis excluded by!**/*.png,!**/*.pngdesktop/src-tauri/icons/ios/AppIcon-20x20@3x.pngis excluded by!**/*.png,!**/*.pngdesktop/src-tauri/icons/ios/AppIcon-29x29@1x.pngis excluded by!**/*.png,!**/*.pngdesktop/src-tauri/icons/ios/AppIcon-29x29@2x-1.pngis excluded by!**/*.png,!**/*.pngdesktop/src-tauri/icons/ios/AppIcon-29x29@2x.pngis excluded by!**/*.png,!**/*.pngdesktop/src-tauri/icons/ios/AppIcon-29x29@3x.pngis excluded by!**/*.png,!**/*.pngdesktop/src-tauri/icons/ios/AppIcon-40x40@1x.pngis excluded by!**/*.png,!**/*.pngdesktop/src-tauri/icons/ios/AppIcon-40x40@2x-1.pngis excluded by!**/*.png,!**/*.pngdesktop/src-tauri/icons/ios/AppIcon-40x40@2x.pngis excluded by!**/*.png,!**/*.pngdesktop/src-tauri/icons/ios/AppIcon-40x40@3x.pngis excluded by!**/*.png,!**/*.pngdesktop/src-tauri/icons/ios/AppIcon-512@2x.pngis excluded by!**/*.png,!**/*.pngdesktop/src-tauri/icons/ios/AppIcon-60x60@2x.pngis excluded by!**/*.png,!**/*.pngdesktop/src-tauri/icons/ios/AppIcon-60x60@3x.pngis excluded by!**/*.png,!**/*.pngdesktop/src-tauri/icons/ios/AppIcon-76x76@1x.pngis excluded by!**/*.png,!**/*.pngdesktop/src-tauri/icons/ios/AppIcon-76x76@2x.pngis excluded by!**/*.png,!**/*.pngdesktop/src-tauri/icons/ios/AppIcon-83.5x83.5@2x.pngis excluded by!**/*.png,!**/*.pngdesktop/tauri.windows.conf.jsonis excluded by!**/*.json
📒 Files selected for processing (8)
interface/src/components/ModelSelect.tsxscripts/bundle-sidecar.ps1src/agent/compactor.rssrc/agent/worker.rssrc/api/models.rssrc/config/types.rssrc/llm/anthropic/params.rssrc/tools/browser.rs
scripts/bundle-sidecar.ps1
Outdated
| if ($TargetTriple -ne $HostTriple) { | ||
| cargo build @CargoFlags --target $TargetTriple --manifest-path "$RepoRoot\Cargo.toml" | ||
| $SrcBin = Join-Path $RepoRoot "target\$TargetTriple\$BuildMode\spacebot.exe" | ||
| } else { | ||
| cargo build @CargoFlags --manifest-path "$RepoRoot\Cargo.toml" | ||
| $SrcBin = Join-Path $RepoRoot "target\$BuildMode\spacebot.exe" |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Verify current script lacks explicit native-exit guards and source existence check.
rg -n -C2 'cargo build|LASTEXITCODE|PSNativeCommandUseErrorActionPreference|Test-Path \$SrcBin|Copy-Item \$SrcBin' scripts/bundle-sidecar.ps1Repository: spacedriveapp/spacebot
Length of output: 636
🏁 Script executed:
head -n 35 scripts/bundle-sidecar.ps1Repository: spacedriveapp/spacebot
Length of output: 1059
🏁 Script executed:
#!/bin/bash
# Verify PowerShell behavior: does $ErrorActionPreference = "Stop" catch native command non-zero exit?
# Key: native commands in PS 5.1 and earlier do NOT inherently throw on non-zero exit with ErrorActionPreference alone
rg -n '\$ErrorActionPreference' scripts/bundle-sidecar.ps1Repository: spacedriveapp/spacebot
Length of output: 100
🏁 Script executed:
# Check for PowerShell version requirements or constraints
rg -i 'powershell|pwsh|version|requires' scripts/bundle-sidecar.ps1 | head -20Repository: spacedriveapp/spacebot
Length of output: 48
🏁 Script executed:
# Check for any CI/build config that specifies PowerShell version
fd -e yml -e yaml -e json | xargs rg -l 'powershell|pwsh' | head -5Repository: spacedriveapp/spacebot
Length of output: 97
🏁 Script executed:
# Check if tauri config references this script or specifies environment
cat desktop/tauri.windows.conf.json | jq '.build // .bundle // empty' 2>/dev/null || cat desktop/tauri.windows.conf.json | head -40Repository: spacedriveapp/spacebot
Length of output: 407
🏁 Script executed:
# Search for any other safety checks in build/CI pipeline
fd -e yml -e yaml . .github/ .gitlab/ azure-pipelines* 2>/dev/null | head -3Repository: spacedriveapp/spacebot
Length of output: 155
Fail fast when cargo build returns non-zero exit code.
The script invokes cargo build as a native command via Windows PowerShell 5.1 (called as powershell in Tauri config). In PowerShell 5.1, $ErrorActionPreference = "Stop" does not catch non-zero exits from native commands, so a failed build will silently continue to Copy-Item, risking a stale or missing binary being shipped.
🔧 Proposed fix
$ErrorActionPreference = "Stop"
+$PSNativeCommandUseErrorActionPreference = $true
@@
if ($TargetTriple -ne $HostTriple) {
cargo build `@CargoFlags` --target $TargetTriple --manifest-path "$RepoRoot\Cargo.toml"
+ if ($LASTEXITCODE -ne 0) { throw "cargo build failed with exit code $LASTEXITCODE" }
$SrcBin = Join-Path $RepoRoot "target\$TargetTriple\$BuildMode\spacebot.exe"
} else {
cargo build `@CargoFlags` --manifest-path "$RepoRoot\Cargo.toml"
+ if ($LASTEXITCODE -ne 0) { throw "cargo build failed with exit code $LASTEXITCODE" }
$SrcBin = Join-Path $RepoRoot "target\$BuildMode\spacebot.exe"
}
@@
+if (-not (Test-Path $SrcBin)) {
+ throw "Built binary not found: $SrcBin"
+}
Copy-Item $SrcBin $DestBin -ForceAlso applies to: 50–50
🧰 Tools
🪛 PSScriptAnalyzer (1.24.0)
[warning] Missing BOM encoding for non-ASCII encoded file 'bundle-sidecar.ps1'
(PSUseBOMForUnicodeEncodedFile)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@scripts/bundle-sidecar.ps1` around lines 35 - 40, The cargo build invocations
(the native calls in the branches comparing $TargetTriple and $HostTriple) can
fail silently in PowerShell 5.1; after each cargo build call (both the
platform-specific branch and the else branch around the lines invoking cargo
build with `@CargoFlags` and --target $TargetTriple), check $LASTEXITCODE (or the
ExitCode from Start-Process) and if non-zero immediately fail (e.g., throw or
exit 1) with a clear message including the exit code so the script doesn’t
continue to Copy-Item with a missing/stale spacebot.exe.
src/agent/compactor.rs
Outdated
| let mut remove_count = ((total as f32 * fraction) as usize) | ||
| .max(1) | ||
| .min(total.saturating_sub(2)); | ||
| if remove_count == 0 { | ||
| return Ok(0); | ||
| } | ||
|
|
||
| // Advance past User messages containing ToolResult blocks so we never | ||
| // orphan tool_results whose matching tool_use was in a removed message. | ||
| while remove_count < total.saturating_sub(2) { | ||
| let has_tool_result = matches!( | ||
| hist.get(remove_count), | ||
| Some(Message::User { content }) | ||
| if content.iter().any(|item| | ||
| matches!(item, UserContent::ToolResult(_))) | ||
| ); | ||
| if has_tool_result { | ||
| remove_count += 1; | ||
| } else { | ||
| break; | ||
| } | ||
| } |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Description: Check if branch.rs compact_history handles ToolResult boundaries
# Expected: Either branch.rs has similar skip logic, or branches never contain ToolResult messages
# Search for compact_history in branch.rs and show surrounding context
rg -n -A 20 'fn compact_history' src/agent/branch.rs
# Check if branches ever receive ToolResult content
rg -n 'ToolResult' src/agent/branch.rsRepository: spacedriveapp/spacebot
Length of output: 740
Branch compaction should apply the same ToolResult boundary logic as compactor — extract to shared utility.
Branch.rs compact_history() (line 287-303) uses .drain(..remove_count) without the ToolResult-boundary skip logic from compactor.rs (lines 226-247). While branches currently don't receive ToolResult messages, the inconsistency is a latent bug: if branch conversations ever include tool results (e.g., from enhanced curation flows), the drain could orphan ToolResult blocks and trigger Anthropic API rejection.
Recommendation: Either add the defensive skip logic to branch compaction, or extract the boundary logic into a shared utility function (e.g., calculate_safe_remove_count()) to prevent future divergence.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/agent/compactor.rs` around lines 226 - 247, Branch compaction uses
drain(..remove_count) without the ToolResult-boundary protection found in
compactor.rs (the while loop that advances remove_count when encountering User
messages containing UserContent::ToolResult), which can orphan ToolResult
blocks; refactor by extracting that boundary logic into a shared function (e.g.,
calculate_safe_remove_count(total: usize, fraction: f32, hist: &History) ->
usize) and call it from both compactor.rs (replacing the existing remove_count
calculation and while loop) and Branch::compact_history() (replacing the direct
drain(..remove_count)), ensuring the helper checks hist.get(index) for
Message::User with content items matching UserContent::ToolResult and advances
the count accordingly.
src/config/types.rs
Outdated
| enabled: true, | ||
| headless: true, | ||
| evaluate_enabled: false, | ||
| evaluate_enabled: true, |
There was a problem hiding this comment.
Default-enabling browser_evaluate weakens the security boundary.
Line 845 flips evaluate_enabled to true, which makes arbitrary JS execution available by default (see src/tools/browser.rs, Line 1753). This changes a high-risk capability from opt-in to opt-out and increases prompt-injection blast radius.
🔧 Suggested fix
- evaluate_enabled: true,
+ evaluate_enabled: false,📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| evaluate_enabled: true, | |
| evaluate_enabled: false, |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/config/types.rs` at line 845, Default-enabling the evaluate_enabled flag
exposes JS execution by default; change the default value of the
evaluate_enabled field back to false in the type's Default/constructor so
browser-driven JS evaluation is opt-in, update any related comments/docs to note
it must be explicitly enabled, and run/update tests that assert default
settings; specifically modify the evaluate_enabled default and ensure the
browser evaluation code path (the JS/evaluate flow) continues to check
evaluate_enabled before executing.
src/tools/browser.rs
Outdated
| if let Some(mut browser) = browser { | ||
| if let Err(error) = browser.close().await { | ||
| // The browser connection may already be dead (e.g. "oneshot canceled" | ||
| // from a dropped WebSocket). That's fine — the browser is effectively | ||
| // closed. Log it but don't fail the tool call. | ||
| tracing::warn!( | ||
| policy = "close_browser", | ||
| %error, | ||
| "browser.close() failed, treating as already closed" | ||
| ); | ||
| } |
There was a problem hiding this comment.
Don’t treat every browser.close() failure as success.
Lines 2058-2067 currently suppress all close failures and still return "Browser closed". That can hide real shutdown failures and leave a running/orphaned browser while reporting success.
Please only suppress known terminal/disconnect errors and propagate unexpected close failures.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/tools/browser.rs` around lines 2057 - 2067, The current code in
src/tools/browser.rs always treats any error from browser.close().await (in the
block handling Some(mut browser)) as non-fatal; change it to only suppress known
terminal/disconnect errors (e.g. oneshot canceled / WebSocket connection closed)
by matching the error kind/string/inner type from the result of
browser.close().await and logging those as warnings, but for any other
unexpected error propagate it (return Err or propagate the error out of the
function) so the caller does not receive a false "Browser closed" success; keep
the existing tracing::warn for the known cases and ensure the code path that
produces the success response ("Browser closed") is only reached when close
either succeeded or matched a known terminal disconnect.
Workers spawning large LLM requests (e.g. writing a full PRD) were getting stuck for 30+ minutes because: 1. The global reqwest client timeout of 120s was too short for large Anthropic completions with extended thinking — requests timed out before the API could finish generating. 2. The retry cascade (3 model-level retries × 5 worker-level retries) meant each timeout failure took ~30 minutes before the worker finally gave up, only to be respawned with the same result. Changes: - Add 10-minute per-request timeout on Anthropic API calls, overriding the 120s global client timeout. Matches the scale of the streaming path (30min). - Reduce MAX_TRANSIENT_RETRIES from 5 to 3 (still 9 total attempts with model-level retries) to fail faster on sustained API issues. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
248967e to
5a5d5b8
Compare
Summary
reqwest::Clienttimeout of 120s was too short for large completions with extended thinking, causing requests to time out before the API could finish generating. The per-request override matches the scale of the streaming path (30min).Context
Workers spawning large LLM requests (e.g. writing a full PRD document) were getting stuck in a timeout→retry cascade:
Test plan
cargo build -p spacebot)🤖 Generated with Claude Code