chore: executor benchmarks, scripts, docs, and visualization #209
sethconvex wants to merge 7 commits into graphite-base/209 from
Conversation
Important: Review skipped. Auto reviews are disabled on base/target branches other than the default branch. Check the settings in the CodeRabbit UI.
📝 Walkthrough
This PR introduces executor mode, a sharded task queue architecture for high-throughput workflow execution. It adds new components (task queue, coordinator, executor actions), extends the client and schema, includes comprehensive benchmarks and documentation, and modifies routing logic in existing modules to support batch actions and executor-based step processing.
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~50 minutes
🚥 Pre-merge checks: ✅ 2 passed | ❌ 1 failed (1 warning)
Actionable comments posted: 20
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/component/journal.ts (1)
130-266: ⚠️ Potential issue | 🟠 Major
Executor path drops per-step retry/scheduling.
Lines 130-266: when `executorShards` is set, inserts use `DEFAULT_QM_RETRY` (or none for actions) and ignore `stepArgs.retry`/`schedulerOptions`, so `retry: false`, custom backoff, or `runAt`/`runAfter` are silently lost vs the workpool path. Please map retries into `taskQueue` and explicitly reject unsupported scheduling (or implement delayed tasks).
🛠️ Suggested normalization/guard
```diff
- const { retry, schedulerOptions } = stepArgs;
+ const { retry, schedulerOptions } = stepArgs;
+ const retryConfig =
+   retry === false
+     ? undefined
+     : retry === true || retry === undefined
+       ? DEFAULT_QM_RETRY
+       : retry;
+ if (workflow.executorShards && schedulerOptions) {
+   throw new Error("schedulerOptions not supported in executor mode");
+ }

- await ctx.db.insert("taskQueue", {
+ await ctx.db.insert("taskQueue", {
    shard,
    functionType: "query",
    handle: step.handle,
    args: step.args,
    stepId,
    workflowId: workflow._id,
    generationNumber,
-   retry: DEFAULT_QM_RETRY,
+   retry: retryConfig,
  });
```
Apply `retryConfig` similarly in mutation/action taskQueue inserts.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/component/journal.ts` around lines 130 - 266, The executor path (when workflow.executorShards is set) currently hardcodes DEFAULT_QM_RETRY (or omits retry for actions) and ignores per-step retry/scheduling (stepArgs.retry and schedulerOptions) leading to lost retry/backoff/runAt semantics; update the taskQueue inserts (the ctx.db.insert("taskQueue") calls in the switch branches for "query","mutation","action") to map stepArgs.retry into the inserted retry configuration (and include any supported schedulerOptions fields like runAt/runAfter/backoff), and for any unsupported scheduling options explicitly throw or return an error so callers know they’re unsupported (use shardForWorkflow, stepId, stepArgs.batchActionName and workflow.executorShards to locate the relevant branches). Ensure behavior mirrors workpool.enqueueQuery/Mutation/Action parameter handling or documents/rejects divergences.
♻️ Duplicate comments (3)
benchmark_results/batched_1000_timeline_20260215_112330.json (1)
1-43: Same unpopulated timeline fields as `batched_1000_timeline_20260215_111510.json`.
`durationMs: 0`, `elapsedMs: 0`, `concurrency: []`, `itemRows: []`, `steps: []` — see the note raised on `batched_1000_timeline_20260215_111510.json`. The pattern is consistent across both timeline snapshots, confirming a systematic recording gap rather than a one-off artifact.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@benchmark_results/batched_1000_timeline_20260215_112330.json` around lines 1 - 43, The batched timeline object is being left unpopulated (durationMs, elapsedMs, concurrency, itemRows, steps are empty/zero) — update the timeline recording/finalization logic so the "batched" timeline is fully populated before persisting: in the code paths that build the batched object (look for functions or methods that construct or finalize the "batched" timeline or call a finalizeTimeline/recordTimeline routine), calculate and set timelineStart and timelineEnd (use run start/stop timestamps), compute durationMs and elapsedMs, populate concurrency and steps arrays with the recorded events, and fill itemRows with per-item timing/metadata; ensure this update runs for the batched mode (batchedMaxWorkers/batchedMaxParallelism paths) and is invoked both on normal completion and on retry/error code paths so the JSON snapshot contains the expected fields.
benchmark_results/batched_1000_status_20260215_112330.json (1)
1-13: Same concern as the other benchmark JSON files — consider excluding from repo.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@benchmark_results/batched_1000_status_20260215_112330.json` around lines 1 - 13, The benchmark results file benchmark_results/batched_1000_status_20260215_112330.json should not be committed; remove it from the repository and stop tracking future similar files by deleting or moving this file from source control (use git rm --cached on the file to remove from the index while keeping local copy) and add an appropriate pattern for benchmark_results/*.json (or the specific naming convention) to .gitignore so future benchmark JSONs are not committed; ensure CI/artifacts or a designated storage location is documented for keeping these outputs instead.
benchmark_results/batched_1000_timeline_20260215_115704.json (1)
1-43: Same concern as the other benchmark JSON file — consider excluding from repo.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@benchmark_results/batched_1000_timeline_20260215_115704.json` around lines 1 - 43, This JSON (batched_1000_timeline_20260215_115704.json) is a generated benchmark artifact that should be removed from the repo; delete the committed file, add a .gitignore entry to exclude generated benchmark timeline files (e.g. a wildcard like batched_*_timeline_*.json or a dedicated pattern for benchmark outputs), and update any CI/docs to upload or store these artifacts outside source control instead of committing them.
🧹 Nitpick comments (13)
scripts/check_regular.cjs (1)
2-2: Consider accepting the file path as a CLI argument instead of hardcoding `/tmp/jokebattle_data.json`.
The hardcoded path makes the script fragile across different machines or CI environments. Using `process.argv[2]` with a sensible default is an easy improvement.
♻️ Proposed refactor
```diff
-const d = readFileSync("/tmp/jokebattle_data.json", "utf8");
+const filePath = process.argv[2] ?? "/tmp/jokebattle_data.json";
+const d = readFileSync(filePath, "utf8");
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@scripts/check_regular.cjs` at line 2, The script currently hardcodes the input file path in the call to fs.readFileSync (const d = fs.readFileSync("/tmp/jokebattle_data.json", "utf8")), making it brittle; change this to read the path from process.argv[2] with a sensible default (e.g. use process.argv[2] || "/tmp/jokebattle_data.json"), update any related variable names that consume the file contents (e.g. d) and ensure the script prints a helpful usage message or default note when no argument is provided; keep the fs.readFileSync call but replace the literal path with the variable so the file path can be passed via CLI.
scripts/watch_workflow_debug.sh (1)
7-22: Extract `json_last` to a shared helper function.
This function is duplicated identically across three scripts: `scripts/watch_workflow_debug.sh`, `scripts/benchmark_workflow_small.sh`, and `scripts/benchmark_joke_battle.sh`. Create a shared helper (e.g., `scripts/lib/common.sh`) and `source` it from each script to eliminate duplication.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@scripts/watch_workflow_debug.sh` around lines 7 - 22, Move the json_last function out of each script into a single shared helper file (e.g., create scripts/lib/common.sh) that defines json_last exactly once, then update scripts/watch_workflow_debug.sh, scripts/benchmark_workflow_small.sh, and scripts/benchmark_joke_battle.sh to source that helper (source "scripts/lib/common.sh") and remove the duplicated function bodies; ensure the helper is executable/readable and that the function name json_last remains unchanged so callers keep working.
scripts/benchmark_workflow_small.sh (1)
37-50: Avoid clobbering the fixed /tmp error log.
A shared `/tmp/workflow_bench_err.log` can be overwritten if runs overlap, making debugging harder. Consider a per-run temp log (PID-based or mktemp).
♻️ Proposed refactor
```diff
 run_one() {
   local mode="$1"
   local topic="$2"
+  local err_log="/tmp/workflow_bench_err.${mode}.$$"
   if [[ "$SKIP_CLEAR" != "1" ]]; then
     run_convex llmSimulation:clearAll "{}" >/dev/null || true
     sleep 1
   fi
@@
-  st="$(run_convex llmSimulation:benchmarkStatus "{\"simulationId\":\"$simulation_id\"}" 2>/tmp/workflow_bench_err.log)"
+  st="$(run_convex llmSimulation:benchmarkStatus "{\"simulationId\":\"$simulation_id\"}" 2>"$err_log")"
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@scripts/benchmark_workflow_small.sh` around lines 37 - 50, The script's run_one function currently writes stderr to a fixed /tmp/workflow_bench_err.log which can be clobbered by concurrent runs; update run_one to create a per-run temporary error log (e.g., use mktemp or include $$/PID in the filename), store that temp path in a local variable (e.g., err_log), and replace all references to /tmp/workflow_bench_err.log with that variable (including cleanup at the end of run_one); ensure the temp file is created before the loop and removed on exit so concurrent runs do not overwrite each other's logs.
scripts/benchmark_compare.sh (3)
36-39: Three separate `python3` invocations to parse the same JSON is wasteful.
You could parse all three fields in a single call:
Proposed consolidation
```diff
- local completed failed running
- completed="$(echo "$raw" | python3 -c 'import json,sys; print(json.load(sys.stdin).get("completed",0))' 2>/dev/null || echo 0)"
- failed="$(echo "$raw" | python3 -c 'import json,sys; print(json.load(sys.stdin).get("failed",0))' 2>/dev/null || echo 0)"
- running="$(echo "$raw" | python3 -c 'import json,sys; print(json.load(sys.stdin).get("running",0))' 2>/dev/null || echo 0)"
+ local parsed
+ parsed="$(echo "$raw" | python3 -c '
+import json, sys
+d = json.load(sys.stdin)
+print(d.get("completed", 0), d.get("failed", 0), d.get("running", 0))
+' 2>/dev/null || echo "0 0 0")"
+ local completed failed running
+ read -r completed failed running <<< "$parsed"
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@scripts/benchmark_compare.sh` around lines 36 - 39, Replace the three separate python3 invocations that extract "completed", "failed", and "running" from "$raw" with a single python3 call that reads the JSON once and prints the three values (e.g., space-separated or newline), then read those into the shell variables; update the code around variables completed, failed, running to use a single python3 invocation that returns defaults of 0 when keys are missing and assign with read -r completed failed running (or mapfile) to avoid repeated JSON parsing.
46-48: `eval` for dynamic variable assignment is fragile — consider alternatives.
While the `mode` argument is currently hardcoded ("batched"/"standard"), `eval` is a common source of injection bugs if that ever changes. A safer pattern uses `declare` or an associative array.
Example using declare
```diff
- eval "${mode}_elapsed=$elapsed"
- eval "${mode}_completed=$completed"
- eval "${mode}_failed=$failed"
+ declare -g "${mode}_elapsed=$elapsed"
+ declare -g "${mode}_completed=$completed"
+ declare -g "${mode}_failed=$failed"
```
Apply the same change at lines 54-56.
Also applies to: 54-56
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@scripts/benchmark_compare.sh` around lines 46 - 48, The script uses eval to assign dynamic variables (e.g., eval "${mode}_elapsed=$elapsed", eval "${mode}_completed=$completed", eval "${mode}_failed=$failed") which is fragile and can lead to injection; replace these eval assignments with a safer approach such as using declare to create the dynamic variable names (declare "${mode}_elapsed=$elapsed" etc.) or, better, refactor to store results in an associative array (e.g., results[${mode}_elapsed]=$elapsed) and update the code to read from that array; apply the same change for the corresponding eval calls at the other location (lines 54-56).
24-25: `start_out` is captured but never used.
The return value of `startBenchmark` is assigned to `start_out` but never referenced. Either remove the variable or use it (e.g., to log the run ID).
Proposed fix
```diff
- local start_out
- start_out="$(convex_run "benchmark:startBenchmark" "{\"mode\":\"$mode\",\"count\":$COUNT}")"
+ convex_run "benchmark:startBenchmark" "{\"mode\":\"$mode\",\"count\":$COUNT}" >/dev/null
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@scripts/benchmark_compare.sh` around lines 24 - 25, The script assigns the result of convex_run "benchmark:startBenchmark" to start_out but never uses it; either remove start_out and just call convex_run for side effects, or capture and use it (e.g., parse and echo a run ID/info). Update the invocation around start_out and ensure the symbol start_out (and the convex_run call to "benchmark:startBenchmark") is either removed or its value is logged/consumed (for example echoing the returned run ID together with mode and COUNT) so the assignment is meaningful.
src/client/step.ts (1)
191-205: Base-name matching could produce false positives across modules.
Splitting on `:`/`/` and taking only the last segment means two functions like `"moduleA:process"` and `"moduleB:process"` would both match if `"process"` is in `batchActionNames`. This is probably intentional given how batch actions are registered, but worth noting if module-qualified names become necessary in the future.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/client/step.ts` around lines 191 - 205, The current base-name matching logic (involving safeFunctionName(target.function), splitting on /[:\/]/ and comparing the last segment against this.batchActionNames) can produce false positives across modules; to fix, narrow matching by preferring full-qualified names first and falling back to base-name only when necessary: update the detection in the code that sets batchActionName to first check this.batchActionNames.has(fnName) (the full safeFunctionName), then if not found check the existing base-name fallback (parts[parts.length-1]) and only set batchActionName to that when explicitly required; ensure this preserves behavior for target.kind === "function" and target.functionType === "action" and keep the variable name batchActionName unchanged.
scripts/soak_workflow_20.sh (2)
22-25: `json_field` interpolates the field name directly into Python code — safe here but fragile.
Since `json_field` is only called with hardcoded field names, there's no current injection risk. However, a safer pattern uses `sys.argv`:
Safer alternative
```diff
 json_field() {
   local field="$1"
-  python3 -c "import json,sys; d=json.load(sys.stdin); print(d['$field'])"
+  python3 -c "import json,sys; d=json.load(sys.stdin); print(d[sys.argv[1]])" "$field"
 }
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@scripts/soak_workflow_20.sh` around lines 22 - 25, The json_field function currently interpolates the field name directly into the inline Python code (json_field), which is fragile; change it to pass the field name as an argument via sys.argv to avoid injecting shell content into the Python snippet, e.g. call python3 with the field as an argv parameter and use sys.argv[1] inside the Python code, and update any callers of json_field if needed to continue passing the field name as the first parameter.
41-67: Line 67 is dead code — the `while true` loop always sets `done=1` before `break`.
Both exit paths from the loop (lines 55-56 and lines 62-63) set `done=1` before `break`, and there is no other way to exit the `while true`. The guard on line 67 will therefore never increment `failures`.
Proposed fix — remove the dead code
```diff
-done=0
 while true; do
   ...
-    done=1
     break
   fi
   if [[ "$elapsed_s" -ge "$TIMEOUT_SECS" ]]; then
     echo "timeout run=$i"
     run_secs+=("$elapsed_s")
     failures=$((failures + 1))
-    done=1
     break
   fi
   sleep "$POLL_SECS"
 done
-[[ "$done" -eq 1 ]] || failures=$((failures + 1))
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@scripts/soak_workflow_20.sh` around lines 41 - 67, The guard at the end using the done flag is dead code because the while true loop always sets done=1 before breaking; remove the final check/increment line ("[[ \"$done\" -eq 1 ]] || failures=$((failures + 1))") so failures is only updated where the loop exits (inside the non-running and timeout branches). Locate the while true loop that calls run_convex llmSimulation:benchmarkStatus and updates status, elapsed_s, run_secs and failures, and delete the trailing dead-code guard referencing done.
benchmark_results/batched_1000_timeline_20260215_115155.json (1)
1-43: Consider excluding benchmark result artifacts from the repository.These JSON files are ephemeral, machine-generated snapshots that will accumulate over time and bloat the repository history. They also contain run-specific IDs and timestamps with no reuse value.
Consider adding `benchmark_results/` to `.gitignore` and storing these artifacts externally (CI artifacts, cloud storage, etc.) instead.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@benchmark_results/batched_1000_timeline_20260215_115155.json` around lines 1 - 43, Add the benchmark_results/ directory to .gitignore and stop committing ephemeral JSON artifacts: update .gitignore to include the literal entry "benchmark_results/" and commit that change, then remove any already-tracked files from git (e.g., using git rm --cached on files under benchmark_results/) and commit the removal so history no longer grows; finally, configure CI or external storage to persist these artifacts instead of the repository (ensure any pipeline or job names that produce these files push them to CI artifacts or cloud storage).
scripts/run_joke_batched_trace.sh (1)
5-5: `DEBUG_TRACE` is interpolated as a raw JSON value — non-boolean inputs will produce invalid JSON.
`DEBUG_TRACE` defaults to `"false"` which works since it's a valid JSON boolean literal. However, inputs like `DEBUG_TRACE=1` or `DEBUG_TRACE=yes` would produce malformed JSON in the argument string on line 29. Consider validating or coercing:
Optional guard
```diff
+[[ "$DEBUG_TRACE" == "true" ]] || DEBUG_TRACE="false"
+
 echo "debugTrace=$DEBUG_TRACE count=$COUNT"
```
Also applies to: 29-29
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@scripts/run_joke_batched_trace.sh` at line 5, The DEBUG_TRACE variable is inserted as a raw JSON literal and accepts arbitrary input; coerce/validate it to a JSON boolean before use. Replace the simple default assignment of DEBUG_TRACE with a small normalization block that maps common truthy values (e.g., 1, "1", yes, y, true, TRUE) to the literal true and everything else to false (for example using a case/regex check or if-statement), and then use that normalized DEBUG_TRACE variable in the JSON argument construction so the resulting JSON contains only the literal true or false; reference the DEBUG_TRACE variable and the JSON argument construction where the variable is interpolated.
src/client/workflowMutation.ts (1)
45-49: Consider grouping optional parameters into an options object for better ergonomics.
Two trailing optional positional parameters means a caller who needs only `batchActionNames` must write:
`workflowMutation(component, registered, undefined, batchActionNames)`
Consolidating into a single options bag avoids this:
♻️ Suggested refactor
```diff
 export function workflowMutation<ArgsValidator extends PropertyValidators>(
   component: WorkflowComponent,
   registered: WorkflowDefinition<ArgsValidator>,
-  defaultWorkpoolOptions?: WorkpoolOptions,
-  batchActionNames?: Set<string>,
+  options?: {
+    workpoolOptions?: WorkpoolOptions;
+    batchActionNames?: Set<string>;
+  },
 ): RegisteredMutation<...> {
   const workpoolOptions = {
-    ...defaultWorkpoolOptions,
+    ...options?.workpoolOptions,
     ...registered.workpoolOptions,
   };
   // ...
   const executor = new StepExecutor(
     // ...
-    batchActionNames,
+    options?.batchActionNames,
   );
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/client/workflowMutation.ts` around lines 45 - 49, Replace the two trailing optional positional parameters on workflowMutation (defaultWorkpoolOptions and batchActionNames) with a single optional options object to improve ergonomics: change the signature of workflowMutation to accept an options?: { defaultWorkpoolOptions?: WorkpoolOptions; batchActionNames?: Set<string> } (or similar), update the body to read options.defaultWorkpoolOptions and options.batchActionNames, and update all call sites to pass an options object instead of using positional undefined placeholders; optionally add a lightweight overload or compatibility branch in workflowMutation to accept the old positional form and map it to the new options shape while emitting a deprecation note to callers.
src/component/workflow.ts (1)
330-357: Sort steps by `stepNumber` in `timelinePage` for stable ordering.
The DB collect order isn't guaranteed; sorting avoids shuffled bars in the timeline viz.
Suggested fix
```diff
-  const stepDocs = await ctx.db
+  const stepDocs = await ctx.db
     .query("steps")
     .withIndex("workflow", (q) => q.eq("workflowId", wf._id))
     .collect();
+  stepDocs.sort((a, b) => a.stepNumber - b.stepNumber);
   return {
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/component/workflow.ts` around lines 330 - 357, The timeline steps can appear in nondeterministic DB order; before mapping stepDocs into the steps array in the Promise.all mapping (the async wf => { ... } block that builds page), sort the collected stepDocs by their stepNumber (or sort the produced steps array by stepNumber) to ensure stable ordering for the timeline visualization; update the code around the stepDocs handling (the stepDocs variable and the steps: stepDocs.map(...) block) to perform a numeric sort by stepNumber prior to mapping.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In @.mcp.json:
- Around line 1-13: The .mcp.json currently contains user-specific absolute
paths (the "args" entry pointing to
"/Users/magicseth/Projects/claudemanager/dist/main/mcp-server.js" and the "env"
value for "MCP_SOCKET_PATH"), so replace those with non-user-specific defaults
and move this file to a template: create a .mcp.example.json that uses
repo-relative paths (e.g., "args": ["./dist/main/mcp-server.js"]) and a
placeholder or env interpolation for MCP_SOCKET_PATH (e.g., use an explicitly
documented placeholder value or reference an env var like "${MCP_SOCKET_PATH}"),
update README with instructions for creating a local .mcp.json from
.mcp.example.json, and add the real .mcp.json to .gitignore so developers/CI can
supply machine-specific paths without committing them.
In `@1npm`:
- Around line 1-108: The file contains a duplicated CLI help output for "convex
dev" (the entire help block is repeated); remove the duplicate so only one
instance of the help text remains (or delete the file if this output should not
be committed), ensuring the single retained block includes the full options list
and usage lines for "convex dev".
In `@benchmark_results/batched_1000_timeline_20260215_111510.json`:
- Around line 1-43: The batched run timeline fields (batched.jokesDone,
judgesDone, picksDone, maxConcurrency, concurrency, itemRows, durationMs,
elapsedMs, steps, result) are not being populated for large batches; inspect and
fix the timeline aggregation path (e.g., finalizeBatchedRun / recordTimeline /
aggregateWorkerTimeline) to ensure it iterates over all items/workers instead of
being truncated by a hardcoded cap or page size, correctly accumulates counts
and maxConcurrency, appends all itemRows and concurrency samples, computes
durationMs/elapsedMs from startedAt/completedAt using safe numeric types, and
sets result when aggregation completes; update any early-return conditions or
try/catch that swallow errors so the batched object is assigned the computed
values for the runId.
In `@benchmark_results/workflow_benchmark_small_1000_2026-02-15_00-08-07.txt`:
- Around line 1-4: This benchmark run shows errors from the CONVEX cleanup step
(CONVEX ?(llmSimulation:clearAll)) reporting "Simulation not found" for IDs
'j9712dprjeky0709w2098prf65816c0h' and 'j97507jy904tpv1erkkkzgt1yx8172qh';
before archiving, either move this file out of the successful archive into a
failed/partial folder or update the file/metadata to annotate the run as
incomplete and include the exact error lines and the simulation IDs so consumers
won't treat this as a successful workflow_small benchmark (section_count=1000).
In `@benchmark_results/workflow_results_archive_latest.txt`:
- Around line 2-4: The file contains an absolute local path in the repo metadata
(the "repo=" line) which leaks local filesystem info; change the "repo=" value
in benchmark_results/workflow_results_archive_latest.txt to a repo-relative
identifier or a sanitized placeholder (e.g., repository name or commit hash)
instead of an absolute /Users/... path so the artifact is portable and
non-identifying.
In `@docs/executor-mode-spec.md`:
- Around line 42-66: Several fenced code blocks (ASCII diagrams such as the
"User mutation └─ workflow.start() └─ workflow.create mutation (component) ..."
and other similar blocks) are missing language identifiers which triggers
markdownlint MD040; update each of those fenced blocks (the diagram blocks
referencing executor(shard=0/1/.../N-1), taskQueue, flush loop, and the other
listed blocks) by adding a language tag like ```text (or ```typescript where
appropriate) at the opening fence so the blocks become e.g. ```text ... ```;
ensure you apply this change to all reported blocks (the diagram shown plus the
other occurrences) so every fenced block includes a language identifier.
- Line 1035: Fix the typo in the docs for the POLL_BACKOFF_MS entry: change
"budge" to "budget" in the table cell describing POLL_BACKOFF_MS so the sentence
reads "wastes query budget" (refer to the `POLL_BACKOFF_MS` table row in
executor-mode-spec.md).
In `@docs/executor-mode.md`:
- Around line 196-202: The constants in the executor-mode table (CLAIM_LIMIT,
MAX_CONCURRENCY, POLL_BACKOFF_MS, MAX_EMPTY_POLLS, RESCHEDULE_MS) conflict with
the values in the tuning guide; update the table to match the authoritative
tuning-guide values (e.g., set CLAIM_LIMIT to 800, POLL_BACKOFF_MS to 200,
MAX_CONCURRENCY to 500, MAX_EMPTY_POLLS to 150) and ensure RESCHEDULE_MS matches
the implementation/guide as well, adjusting the table row text and any
explanatory text so both docs present the same defaults.
- Around line 93-97: Replace the inaccurate description that tasks are assigned
via Math.random with the current deterministic shard mapping: explain that
executor mode uses a hash-based assignment (e.g., hash(workflowId) % numShards)
so the same workflowId always maps to the same shard; update any examples or
tuning guidance that assume uniform random distribution to reflect that skew and
affinity are determined by workflowId hashing rather than Math.random.
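The deterministic mapping described above can be sketched in TypeScript. This is a minimal illustration, not the component's actual code: the FNV-style hash is an assumption (the name `shardForWorkflow` appears in the review, but its real implementation may differ).

```typescript
// Hypothetical sketch of hash-based shard assignment: unlike Math.random,
// the same workflowId always maps to the same shard.
function shardForWorkflow(workflowId: string, numShards: number): number {
  // Simple FNV-1a-style string hash; the real component may use another hash.
  let hash = 2166136261;
  for (let i = 0; i < workflowId.length; i++) {
    hash ^= workflowId.charCodeAt(i);
    hash = Math.imul(hash, 16777619);
  }
  // Force to a non-negative 32-bit integer before taking the modulus.
  return (hash >>> 0) % numShards;
}
```

Because the mapping is pure, repeated calls with the same workflowId land on the same shard, which is why skew and affinity follow from the id distribution rather than from random placement.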
In `@example/convex/benchmark.ts`:
- Around line 140-150: The file defines a BenchmarkMode type alias that's
unused, so update function signatures to use that alias instead of inline
unions: change doWork(mode: "simulated" | "real", ...) to use mode:
BenchmarkMode, and similarly update any other functions mentioned (e.g.,
simulateWork, callClaude or the function at the 162-163 region) to accept
BenchmarkMode; alternatively, if you prefer removal, delete the BenchmarkMode
alias and keep the inline unions—make the change consistently where mode is
typed to eliminate the no-unused-vars warning.
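The cleanup above is mechanical; a sketch of the intended shape follows (function bodies are placeholders, not the example's real logic):

```typescript
// Reuse the alias instead of repeating the inline union in each signature.
type BenchmarkMode = "simulated" | "real";

// Placeholder body: illustrates the signature change only.
function doWork(mode: BenchmarkMode, count: number): string {
  return `${mode}:${count}`;
}

// Placeholder body: same alias applied to another function.
function simulateWork(mode: BenchmarkMode): boolean {
  return mode === "simulated";
}
```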
In `@example/convex/http.ts`:
- Around line 101-114: The code reads params via URLSearchParams and casts
params.get("after") to a Number in createdAfter without validation, so guard the
parsed value (createdAfter) using Number.isFinite (or Number.isNaN/isFinite
checks) and, if invalid, replace the current friendly missing-parameter UI (the
document.body.innerHTML block) with a similar message that the ?after= value is
not a valid timestamp and instructs the user how to obtain a numeric startedAt;
after showing the message, throw an Error to stop execution. Ensure you locate
and update the logic around URLSearchParams/params, the createdAfter assignment,
and the existing document.body.innerHTML error block so invalid non-numeric
input is handled the same way as a missing parameter.
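The guard described above can be sketched as a small helper. The name `parseCreatedAfter` and the thrown messages are hypothetical; the real handler renders a friendly HTML message before throwing rather than throwing directly.

```typescript
// Hypothetical helper: parse ?after= and reject missing or non-numeric values
// instead of letting NaN flow into the query.
function parseCreatedAfter(search: string): number {
  const params = new URLSearchParams(search);
  const raw = params.get("after");
  if (raw === null) {
    throw new Error("Missing ?after= parameter; pass a numeric startedAt timestamp.");
  }
  const createdAfter = Number(raw);
  if (!Number.isFinite(createdAfter)) {
    throw new Error(`?after=${raw} is not a valid timestamp; expected a number.`);
  }
  return createdAfter;
}
```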
In `@package.json`:
- Around line 65-67: The package.json currently lists "@anthropic-ai/sdk" in
"dependencies" but it's only used by the example (example/convex/benchmark.ts);
move "@anthropic-ai/sdk": "^0.75.0" from "dependencies" into "devDependencies"
in package.json, keep the same version spec, update the lockfile (npm/yarn/pnpm
install) so consumers won't receive this runtime dependency, and verify the
example still resolves the package during local development/test runs.
In `@screenshots/d1024.md`:
- Line 12: The Viz URL in screenshots/d1024.md (the string
"https://cautious-quail-607.convex.site/benchmark-viz?after=1771390151616") is
deployment-specific and may break; update the markdown to either (a) replace the
live link with an archived/static asset (commit a screenshot image into the repo
and link to that file), or (b) keep the live link but add an explicit
ephemeral/disposable-note next to it (e.g., "ephemeral deployment — may be
removed") so readers know it can become a dead link; ensure the referenced URL
string and any alt text reflect the change.
In `@scripts/check_regular.cjs`:
- Around line 2-3: Wrap the synchronous file read and JSON parse in a try/catch
around the fs.readFileSync("/tmp/jokebattle_data.json", "utf8") and
JSON.parse(d) calls (variables d and j) to handle ENOENT and SyntaxError: on
ENOENT log a clear message that the file is missing, on SyntaxError log that the
JSON is malformed (including the error.message), and for any other error log it
and exit non‑zero; alternatively check fs.existsSync before reading and still
catch JSON.parse failures to provide actionable messages rather than letting the
raw exceptions bubble.
- Around line 1-3: Rename the file from check_regular.cjs to check_regular.mjs
and convert CommonJS to ESM by replacing the require("fs") usage with an ESM
import (e.g., import fs from "fs" or import { readFileSync } from "fs"), keep
using readFileSync("/tmp/jokebattle_data.json", "utf8") and JSON.parse(d) as
before, and ensure package.json ("type": "module") is compatible; alternatively,
if you prefer to keep .cjs, add an eslint.config.js override to mark scripts/
with the node environment or exclude it from linting so no ESLint
no-undef/require errors occur.
In `@scripts/watch_workflow_debug.sh`:
- Around line 43-57: The inline Python block invoked with python3 -c that reads
json into d and prints fields (d["pendingTotal"], d["pendingBySlot"], and
iterates d["simulations"]) can raise KeyError/Exception and, under set -euo
pipefail, will exit the whole script; update that block to guard the parse/print
work with a try/except that catches Exception (including
KeyError/JSONDecodeError), writes a short diagnostic to stderr, and exits with
code 0 so the surrounding while true loop continues; locate the python3 -c '...
d = json.load(sys.stdin) ... print("pending_total=...") ...' block and add the
exception handling around the parsing/printing logic.
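The guarded parse can be sketched as follows; the field names (`pendingTotal`, `pendingBySlot`, `simulations`) come from the review's excerpt, and everything else is illustrative.

```python
import json
import sys


def print_summary(stream, out=sys.stdout, err=sys.stderr):
    """Summarize benchmark status JSON without ever raising.

    Under `set -euo pipefail`, an uncaught exception in the inline python
    would kill the surrounding watch loop, so every failure is reported to
    stderr and swallowed (the caller always sees exit code 0).
    """
    try:
        d = json.load(stream)
        print(f"pending_total={d['pendingTotal']}", file=out)
        print(f"pending_by_slot={d['pendingBySlot']}", file=out)
        for sim in d["simulations"]:
            print(f"  sim={sim}", file=out)
    except Exception as exc:  # KeyError, json.JSONDecodeError, ...
        print(f"watch: could not summarize status: {exc}", file=err)
```

In the script this body replaces the bare `python3 -c '... json.load(sys.stdin) ...'` block, and the `while true` loop keeps polling even when a poll returns truncated or malformed JSON.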
In `@src/client/index.ts`:
- Around line 6-7: Replace the permissive "type BatchWorkpool = any" with a
minimal local interface named BatchWorkpool that declares only the
methods/properties your code actually calls (e.g., the run/submit/close/stop
signatures or iterator/length properties used elsewhere); update
src/client/index.ts to export or use that interface in place of any so callers
get type safety, and add a TODO comment to remove this local interface once the
official BatchWorkpool type is exported by `@convex-dev/workpool`. Ensure the
interface method names/signatures exactly match uses in functions that accept a
BatchWorkpool so the compiler will catch API drift.
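A minimal structural type might look like the following; the method name and signature here are placeholders (the whole point is to declare exactly what the client actually calls, so verify against the real call sites first).

```typescript
// TODO: delete once @convex-dev/workpool exports its own BatchWorkpool type.
// Only declare members the workflow client actually uses; the compiler will
// then flag API drift at the call sites. `enqueueBatch` is a stand-in name.
interface BatchWorkpool {
  enqueueBatch(items: unknown[]): Promise<string[]>;
}

// A stub implementation, just to show the interface is satisfiable.
const stubPool: BatchWorkpool = {
  async enqueueBatch(items: unknown[]): Promise<string[]> {
    return items.map((_, i) => `work-${i}`);
  },
};
```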
In `@src/component/coordinator.ts`:
- Around line 10-21: The ensureCoordinatorRunning function can create duplicate
coordinatorState rows under concurrency; change the logic to enforce a singleton
state record: use a fixed unique key/ID for the coordinator state (instead of
blind insert) and perform an atomic upsert/conditional update so only one record
is created and its scheduled flag is set (replace the current ctx.db.insert call
with an upsert or a transaction that patches-or-inserts by the fixed _id); also
add a small cleanup step (query all coordinatorState rows and collapse/delete
duplicates, keeping the single canonical record) to guard existing databases.
Target symbols: ensureCoordinatorRunning, coordinatorState query/first(),
ctx.db.patch, ctx.db.insert and the scheduler call
internal.coordinator.coordinator.
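One possible shape for the singleton enforcement, with the cleanup pass folded in. The table and field names follow the review; the `ctx` parameter is left as `any` since this sketch is not wired to real Convex types, and Convex mutations run as serializable transactions, so a query-then-patch-or-insert cannot race with itself — the remaining hazard is duplicates the old blind-insert path already created, which the loop collapses.

```typescript
async function ensureCoordinatorRunning(ctx: any): Promise<void> {
  const rows = await ctx.db.query("coordinatorState").collect();
  // Keep the first row canonical; delete any pre-existing duplicates.
  for (const extra of rows.slice(1)) {
    await ctx.db.delete(extra._id);
  }
  const state = rows[0];
  if (state === undefined) {
    await ctx.db.insert("coordinatorState", { scheduled: true });
    await ctx.scheduler.runAfter(0, "coordinator"); // internal.coordinator.coordinator
  } else if (!state.scheduled) {
    await ctx.db.patch(state._id, { scheduled: true });
    await ctx.scheduler.runAfter(0, "coordinator");
  }
}
```

Calling this twice in a row leaves exactly one `coordinatorState` row and schedules the coordinator once.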
In `@src/component/pool.ts`:
- Around line 97-99: The onCompleteHandler currently creates a logger with a
hard-coded DEFAULT_LOG_LEVEL (const console = createLogger(DEFAULT_LOG_LEVEL)),
which ignores any per-workflow/workpool logLevel; update onCompleteHandler to
read an optional log level from the provided context (e.g.,
args.context.logLevel or args.context.onComplete?.logLevel) and pass that into
createLogger instead of DEFAULT_LOG_LEVEL, falling back to DEFAULT_LOG_LEVEL
only when the context value is absent; adjust variable references around
createLogger, console and any callers in onCompleteHandler to use the resolved
log level so verbosity remains configurable.
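The resolution itself is a one-liner; sketched with stand-in names (the context field and `DEFAULT_LOG_LEVEL` value are assumptions):

```typescript
type LogLevel = "DEBUG" | "INFO" | "WARN" | "ERROR";
const DEFAULT_LOG_LEVEL: LogLevel = "INFO";

// Prefer the per-workflow/workpool level carried on the context; fall back
// to the component default only when it is absent.
function resolveLogLevel(context?: { logLevel?: LogLevel }): LogLevel {
  return context?.logLevel ?? DEFAULT_LOG_LEVEL;
}
```

`onCompleteHandler` would then call `createLogger(resolveLogLevel(args.context))` instead of `createLogger(DEFAULT_LOG_LEVEL)`.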
In `@src/component/taskQueue.ts`:
- Line 3: The import statement in taskQueue.ts currently imports MutationCtx but
it is unused; remove MutationCtx from the named imports in the import from
"./_generated/server.js" (i.e., update the import that includes mutation and
query so it no longer imports MutationCtx) to satisfy the linter and eliminate
the unused-symbol warning.
---
Outside diff comments:
In `@src/component/journal.ts`:
- Around line 130-266: The executor path (when workflow.executorShards is set)
currently hardcodes DEFAULT_QM_RETRY (or omits retry for actions) and ignores
per-step retry/scheduling (stepArgs.retry and schedulerOptions) leading to lost
retry/backoff/runAt semantics; update the taskQueue inserts (the
ctx.db.insert("taskQueue") calls in the switch branches for
"query","mutation","action") to map stepArgs.retry into the inserted retry
configuration (and include any supported schedulerOptions fields like
runAt/runAfter/backoff), and for any unsupported scheduling options explicitly
throw or return an error so callers know they’re unsupported (use
shardForWorkflow, stepId, stepArgs.batchActionName and workflow.executorShards
to locate the relevant branches). Ensure behavior mirrors
workpool.enqueueQuery/Mutation/Action parameter handling or documents/rejects
divergences.
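The retry mapping described above reduces to a small normalizer. The `RetryBehavior` shape and the default values below mirror the `retry` block seen in this PR's benchmark snapshots (`maxAttempts: 4, initialBackoffMs: 200, base: 2`) but should be treated as illustrative, not as the component's actual constants.

```typescript
interface RetryBehavior {
  maxAttempts: number;
  initialBackoffMs: number;
  base: number;
}

const DEFAULT_QM_RETRY: RetryBehavior = {
  maxAttempts: 4,
  initialBackoffMs: 200,
  base: 2,
};

// retry === false  -> no retries at all;
// true / undefined -> the component default;
// an object        -> caller-supplied behavior, passed through unchanged.
function resolveRetry(
  retry: boolean | RetryBehavior | undefined,
): RetryBehavior | undefined {
  if (retry === false) return undefined;
  if (retry === true || retry === undefined) return DEFAULT_QM_RETRY;
  return retry;
}
```

Each `ctx.db.insert("taskQueue", ...)` branch would then write `retry: resolveRetry(stepArgs.retry)`, and throw on any `schedulerOptions` it cannot honor.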
---
Duplicate comments:
In `@benchmark_results/batched_1000_status_20260215_112330.json`:
- Around line 1-13: The benchmark results file
benchmark_results/batched_1000_status_20260215_112330.json should not be
committed; remove it from the repository and stop tracking future similar files
by deleting or moving this file from source control (use git rm --cached on the
file to remove from the index while keeping local copy) and add an appropriate
pattern for benchmark_results/*.json (or the specific naming convention) to
.gitignore so future benchmark JSONs are not committed; ensure CI/artifacts or a
designated storage location is documented for keeping these outputs instead.
In `@benchmark_results/batched_1000_timeline_20260215_112330.json`:
- Around line 1-43: The batched timeline object is being left unpopulated
(durationMs, elapsedMs, concurrency, itemRows, steps are empty/zero) — update
the timeline recording/finalization logic so the "batched" timeline is fully
populated before persisting: in the code paths that build the batched object
(look for functions or methods that construct or finalize the "batched" timeline
or call a finalizeTimeline/recordTimeline routine), calculate and set
timelineStart and timelineEnd (use run start/stop timestamps), compute
durationMs and elapsedMs, populate concurrency and steps arrays with the
recorded events, and fill itemRows with per-item timing/metadata; ensure this
update runs for the batched mode (batchedMaxWorkers/batchedMaxParallelism paths)
and is invoked both on normal completion and on retry/error code paths so the
JSON snapshot contains the expected fields.
In `@benchmark_results/batched_1000_timeline_20260215_115704.json`:
- Around line 1-43: This JSON (batched_1000_timeline_20260215_115704.json) is a
generated benchmark artifact that should be removed from the repo; delete the
committed file, add a .gitignore entry to exclude generated benchmark timeline
files (e.g. a wildcard like batched_*_timeline_*.json or a dedicated pattern for
benchmark outputs), and update any CI/docs to upload or store these artifacts
outside source control instead of committing them.
---
Nitpick comments:
In `@benchmark_results/batched_1000_timeline_20260215_115155.json`:
- Around line 1-43: Add the benchmark_results/ directory to .gitignore and stop
committing ephemeral JSON artifacts: update .gitignore to include the literal
entry "benchmark_results/" and commit that change, then remove any
already-tracked files from git (e.g., using git rm --cached on files under
benchmark_results/) and commit the removal so history no longer grows; finally,
configure CI or external storage to persist these artifacts instead of the
repository (ensure any pipeline or job names that produce these files push them
to CI artifacts or cloud storage).
In `@scripts/benchmark_compare.sh`:
- Around line 36-39: Replace the three separate python3 invocations that extract
"completed", "failed", and "running" from "$raw" with a single python3 call that
reads the JSON once and prints the three values (e.g., space-separated or
newline), then read those into the shell variables; update the code around
variables completed, failed, running to use a single python3 invocation that
returns defaults of 0 when keys are missing and assign with read -r completed
failed running (or mapfile) to avoid repeated JSON parsing.
- Around line 46-48: The script uses eval to assign dynamic variables (e.g.,
eval "${mode}_elapsed=$elapsed", eval "${mode}_completed=$completed", eval
"${mode}_failed=$failed") which is fragile and can lead to injection; replace
these eval assignments with a safer approach such as using declare to create the
dynamic variable names (declare "${mode}_elapsed=$elapsed" etc.) or, better,
refactor to store results in an associative array (e.g.,
results[${mode}_elapsed]=$elapsed) and update the code to read from that array;
apply the same change for the corresponding eval calls at the other location
(lines 54-56).
- Around line 24-25: The script assigns the result of convex_run
"benchmark:startBenchmark" to start_out but never uses it; either remove
start_out and just call convex_run for side effects, or capture and use it
(e.g., parse and echo a run ID/info). Update the invocation around start_out and
ensure the symbol start_out (and the convex_run call to
"benchmark:startBenchmark") is either removed or its value is logged/consumed
(for example echoing the returned run ID together with mode and COUNT) so the
assignment is meaningful.
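Both of the first two suggestions — one JSON parse instead of three, and no `eval` — can be sketched together. The key names come from the review; `raw` is inlined here instead of coming from `convex_run`, and the variable/array names are illustrative.

```shell
#!/usr/bin/env bash
set -euo pipefail

raw='{"completed": 7, "failed": 1}'

# One python3 invocation prints all three counters, defaulting missing keys to 0.
read -r completed failed running < <(
  printf '%s' "$raw" | python3 -c '
import json, sys
d = json.load(sys.stdin)
print(d.get("completed", 0), d.get("failed", 0), d.get("running", 0))
')

# Store per-mode results in an associative array instead of eval-built names.
mode="batched"
declare -A results
results["${mode}_completed"]=$completed
results["${mode}_failed"]=$failed
results["${mode}_running"]=$running
echo "completed=${results[batched_completed]} failed=${results[batched_failed]} running=${results[batched_running]}"
```

`declare "${mode}_completed=$completed"` is the drop-in alternative if the surrounding code must keep reading plain variables.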
In `@scripts/benchmark_workflow_small.sh`:
- Around line 37-50: The script's run_one function currently writes stderr to a
fixed /tmp/workflow_bench_err.log which can be clobbered by concurrent runs;
update run_one to create a per-run temporary error log (e.g., use mktemp or
include $$/PID in the filename), store that temp path in a local variable (e.g.,
err_log), and replace all references to /tmp/workflow_bench_err.log with that
variable (including cleanup at the end of run_one); ensure the temp file is
created before the loop and removed on exit so concurrent runs do not overwrite
each other's logs.
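A per-run error log can be sketched with `mktemp`; the command being benchmarked is a stand-in.

```shell
#!/usr/bin/env bash
set -euo pipefail

run_one() {
  # Unique per invocation, so concurrent runs cannot clobber each other.
  local err_log
  err_log="$(mktemp "${TMPDIR:-/tmp}/workflow_bench_err.XXXXXX")"
  # Stand-in for the real benchmark command; stderr goes to the private log.
  if ! echo "bench step" 2>"$err_log"; then
    cat "$err_log" >&2
  fi
  rm -f "$err_log"
}

run_one
```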
In `@scripts/check_regular.cjs`:
- Line 2: The script currently hardcodes the input file path in the call to
fs.readFileSync (const d = fs.readFileSync("/tmp/jokebattle_data.json",
"utf8")), making it brittle; change this to read the path from process.argv[2]
with a sensible default (e.g. use process.argv[2] ||
"/tmp/jokebattle_data.json"), update any related variable names that consume the
file contents (e.g. d) and ensure the script prints a helpful usage message or
default note when no argument is provided; keep the fs.readFileSync call but
replace the literal path with the variable so the file path can be passed via
CLI.
In `@scripts/run_joke_batched_trace.sh`:
- Line 5: The DEBUG_TRACE variable is inserted as a raw JSON literal and accepts
arbitrary input; coerce/validate it to a JSON boolean before use. Replace the
simple default assignment of DEBUG_TRACE with a small normalization block that
maps common truthy values (e.g., 1, "1", yes, y, true, TRUE) to the literal true
and everything else to false (for example using a case/regex check or
if-statement), and then use that normalized DEBUG_TRACE variable in the JSON
argument construction so the resulting JSON contains only the literal true or
false; reference the DEBUG_TRACE variable and the JSON argument construction
where the variable is interpolated.
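The normalization block can be a single `case`; the JSON argument name below is illustrative.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Map common truthy spellings to the JSON literal true; everything else,
# including unset/empty, becomes false.
case "${DEBUG_TRACE:-}" in
  1|y|Y|yes|YES|true|TRUE|True) DEBUG_TRACE=true ;;
  *) DEBUG_TRACE=false ;;
esac

args="{\"debugTrace\": $DEBUG_TRACE}"
echo "$args"
```

After this block the interpolated value can only ever be the literal `true` or `false`, so the constructed JSON stays valid regardless of what the caller exported.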
In `@scripts/soak_workflow_20.sh`:
- Around line 22-25: The json_field function currently interpolates the field
name directly into the inline Python code (json_field), which is fragile; change
it to pass the field name as an argument via sys.argv to avoid injecting shell
content into the Python snippet, e.g. call python3 with the field as an argv
parameter and use sys.argv[1] inside the Python code, and update any callers of
json_field if needed to continue passing the field name as the first parameter.
- Around line 41-67: The guard at the end using the done flag is dead code
because the while true loop always sets done=1 before breaking; remove the final
check/increment line ("[[ \"$done\" -eq 1 ]] || failures=$((failures + 1))") so
failures is only updated where the loop exits (inside the non-running and
timeout branches). Locate the while true loop that calls run_convex
llmSimulation:benchmarkStatus and updates status, elapsed_s, run_secs and
failures, and delete the trailing dead-code guard referencing done.
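The argv-based `json_field` suggested above can be sketched like this (the sample JSON is illustrative):

```shell
#!/usr/bin/env bash
set -euo pipefail

# The field name travels as argv, so shell content is never spliced into
# the python source.
json_field() {
  python3 -c '
import json, sys
d = json.load(sys.stdin)
print(d.get(sys.argv[1], ""))
' "$1"
}

echo '{"status": "running", "elapsed_s": 12}' | json_field status  # → running
```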
In `@scripts/watch_workflow_debug.sh`:
- Around line 7-22: Move the json_last function out of each script into a single
shared helper file (e.g., create scripts/lib/common.sh) that defines json_last
exactly once, then update scripts/watch_workflow_debug.sh,
scripts/benchmark_workflow_small.sh, and scripts/benchmark_joke_battle.sh to
source that helper (source "scripts/lib/common.sh") and remove the duplicated
function bodies; ensure the helper is executable/readable and that the function
name json_last remains unchanged so callers keep working.
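The shared-helper pattern looks like this in miniature; the lib path and the body of `json_last` are stand-ins (the real body lives in the scripts being deduplicated).

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical scripts/lib/common.sh, written to a temp dir for this sketch.
libdir="$(mktemp -d)"
cat >"$libdir/common.sh" <<'EOF'
json_last() { tail -n 1; }
EOF

# Each script sources the helper instead of redefining the function.
. "$libdir/common.sh"
printf 'first\nlast\n' | json_last  # → last
```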
In `@src/client/step.ts`:
- Around line 191-205: The current base-name matching logic (involving
safeFunctionName(target.function), splitting on /[:\/]/ and comparing the last
segment against this.batchActionNames) can produce false positives across
modules; to fix, narrow matching by preferring full-qualified names first and
falling back to base-name only when necessary: update the detection in the code
that sets batchActionName to first check this.batchActionNames.has(fnName) (the
full safeFunctionName), then if not found check the existing base-name fallback
(parts[parts.length-1]) and only set batchActionName to that when explicitly
required; ensure this preserves behavior for target.kind === "function" and
target.functionType === "action" and keep the variable name batchActionName
unchanged.
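Extracted as a pure function, the preferred matching order reads as follows (function and sample names are illustrative):

```typescript
// Exact fully-qualified match wins; the base-name fallback only applies
// when no full match exists, narrowing cross-module false positives.
function resolveBatchActionName(
  fnName: string,
  batchActionNames: Set<string>,
): string | undefined {
  if (batchActionNames.has(fnName)) return fnName;
  const parts = fnName.split(/[:/]/);
  const base = parts[parts.length - 1];
  return batchActionNames.has(base) ? base : undefined;
}
```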
In `@src/client/workflowMutation.ts`:
- Around line 45-49: Replace the two trailing optional positional parameters on
workflowMutation (defaultWorkpoolOptions and batchActionNames) with a single
optional options object to improve ergonomics: change the signature of
workflowMutation to accept an options?: { defaultWorkpoolOptions?:
WorkpoolOptions; batchActionNames?: Set<string> } (or similar), update the body
to read options.defaultWorkpoolOptions and options.batchActionNames, and update
all call sites to pass an options object instead of using positional undefined
placeholders; optionally add a lightweight overload or compatibility branch in
workflowMutation to accept the old positional form and map it to the new options
shape while emitting a deprecation note to callers.
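The options-bag shape might look like this; the field names follow the review, while `WorkpoolOptions` is stubbed here since the real type lives in `@convex-dev/workpool`.

```typescript
interface WorkpoolOptionsStub {
  maxParallelism?: number;
}

interface WorkflowMutationOptions {
  defaultWorkpoolOptions?: WorkpoolOptionsStub;
  batchActionNames?: Set<string>;
}

// Simplified stand-in for workflowMutation, showing only the options plumbing.
function workflowMutation(options: WorkflowMutationOptions = {}) {
  const pool = options.defaultWorkpoolOptions ?? {};
  const batchActionNames = options.batchActionNames ?? new Set<string>();
  return { pool, batchActionNames };
}

// Call sites name what they set instead of padding positional undefineds:
const wm = workflowMutation({ batchActionNames: new Set(["generateJoke"]) });
```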
In `@src/component/workflow.ts`:
- Around line 330-357: The timeline steps can appear in nondeterministic DB
order; before mapping stepDocs into the steps array in the Promise.all mapping
(the async wf => { ... } block that builds page), sort the collected stepDocs by
their stepNumber (or sort the produced steps array by stepNumber) to ensure
stable ordering for the timeline visualization; update the code around the
stepDocs handling (the stepDocs variable and the steps: stepDocs.map(...) block)
to perform a numeric sort by stepNumber prior to mapping.
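The fix is a numeric sort immediately before the mapping; sketched with a minimal stand-in for the step documents:

```typescript
interface StepDoc {
  stepNumber: number;
  name: string;
}

// Copy-then-sort keeps the original stepDocs array untouched while giving
// the timeline a stable, stepNumber-ascending order.
function orderedSteps(stepDocs: StepDoc[]): StepDoc[] {
  return [...stepDocs].sort((a, b) => a.stepNumber - b.stepNumber);
}
```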
ℹ️ Review info
Configuration used: defaults
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (5)
example/convex/_generated/api.d.ts is excluded by !**/_generated/**
package-lock.json is excluded by !**/package-lock.json
src/component/_generated/api.ts is excluded by !**/_generated/**
src/component/_generated/component.ts is excluded by !**/_generated/**
src/component/_generated/server.ts is excluded by !**/_generated/**
📒 Files selected for processing (78)
.mcp.json
1cd
1npm
benchmark_results/batched100_logs_20260215_103112.jsonl
benchmark_results/batched100_logs_filtered_20260215_103112.txt
benchmark_results/batched100_status_20260215_103112.json
benchmark_results/batched100_timeline_20260215_103112.json
benchmark_results/batched100_trace_20260215_103112.txt
benchmark_results/batched_1000_logs_20260215_104803.txt
benchmark_results/batched_1000_logs_20260215_105644.txt
benchmark_results/batched_1000_logs_20260215_111119.txt
benchmark_results/batched_1000_logs_20260215_111510.txt
benchmark_results/batched_1000_logs_20260215_112033.txt
benchmark_results/batched_1000_logs_20260215_112330.txt
benchmark_results/batched_1000_logs_20260215_113347.txt
benchmark_results/batched_1000_logs_20260215_115155.txt
benchmark_results/batched_1000_logs_20260215_115704.txt
benchmark_results/batched_1000_status_20260215_104803.json
benchmark_results/batched_1000_status_20260215_104803.txt
benchmark_results/batched_1000_status_20260215_105644.json
benchmark_results/batched_1000_status_20260215_105644.txt
benchmark_results/batched_1000_status_20260215_111119.json
benchmark_results/batched_1000_status_20260215_111119.txt
benchmark_results/batched_1000_status_20260215_111510.json
benchmark_results/batched_1000_status_20260215_111510.txt
benchmark_results/batched_1000_status_20260215_112033.json
benchmark_results/batched_1000_status_20260215_112033.txt
benchmark_results/batched_1000_status_20260215_112330.json
benchmark_results/batched_1000_status_20260215_112330.txt
benchmark_results/batched_1000_status_20260215_113347.json
benchmark_results/batched_1000_status_20260215_113347.txt
benchmark_results/batched_1000_status_20260215_115155.json
benchmark_results/batched_1000_status_20260215_115155.txt
benchmark_results/batched_1000_status_20260215_115704.json
benchmark_results/batched_1000_status_20260215_115704.txt
benchmark_results/batched_1000_timeline_20260215_104803.json
benchmark_results/batched_1000_timeline_20260215_105644.json
benchmark_results/batched_1000_timeline_20260215_111119.json
benchmark_results/batched_1000_timeline_20260215_111510.json
benchmark_results/batched_1000_timeline_20260215_112033.json
benchmark_results/batched_1000_timeline_20260215_112330.json
benchmark_results/batched_1000_timeline_20260215_113347.json
benchmark_results/batched_1000_timeline_20260215_115155.json
benchmark_results/batched_1000_timeline_20260215_115704.json
benchmark_results/workflow_benchmark_small_1000_2026-02-14.txt
benchmark_results/workflow_benchmark_small_1000_2026-02-15_00-08-07.txt
benchmark_results/workflow_benchmark_small_1000_2026-02-15_00-12-56_skipclear.txt
benchmark_results/workflow_benchmark_small_500_2026-02-14.txt
benchmark_results/workflow_benchmark_small_50_2026-02-14.txt
benchmark_results/workflow_results_archive_2026-02-15_00-06-51.txt
benchmark_results/workflow_results_archive_latest.txt
docs/executor-mode-spec.md
docs/executor-mode.md
docs/tuning-guide.md
example/convex/benchmark.ts
example/convex/convex.config.ts
example/convex/http.ts
example/convex/schema.ts
package.json
screenshots/d1024.md
scripts/benchmark_compare.sh
scripts/benchmark_joke_battle.sh
scripts/benchmark_workflow_small.sh
scripts/check_regular.cjs
scripts/run_joke_batched_trace.sh
scripts/soak_workflow_20.sh
scripts/watch_workflow_debug.sh
src/client/index.ts
src/client/step.ts
src/client/workflowMutation.ts
src/component/batch.ts
src/component/coordinator.ts
src/component/event.ts
src/component/journal.ts
src/component/pool.ts
src/component/schema.ts
src/component/taskQueue.ts
src/component/workflow.ts
| { | ||
| "mcpServers": { | ||
| "hive": { | ||
| "type": "stdio", | ||
| "command": "node", | ||
| "args": [ | ||
| "/Users/magicseth/Projects/claudemanager/dist/main/mcp-server.js" | ||
| ], | ||
| "env": { | ||
| "MCP_SOCKET_PATH": "/Users/magicseth/Library/Application Support/Hive/mcp.sock" | ||
| } | ||
| } | ||
| } |
Avoid committing user-specific absolute paths in .mcp.json.
Lines 7 and 10 hardcode /Users/magicseth/..., which will break for other devs/CI and leaks a local machine path. Consider moving this to a template (e.g., .mcp.example.json + .gitignore) or switching to repo‑relative paths and externally supplied environment variables.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In @.mcp.json around lines 1 - 13, The .mcp.json currently contains
user-specific absolute paths (the "args" entry pointing to
"/Users/magicseth/Projects/claudemanager/dist/main/mcp-server.js" and the "env"
value for "MCP_SOCKET_PATH"), so replace those with non-user-specific defaults
and move this file to a template: create a .mcp.example.json that uses
repo-relative paths (e.g., "args": ["./dist/main/mcp-server.js"]) and a
placeholder or env interpolation for MCP_SOCKET_PATH (e.g., use an explicitly
documented placeholder value or reference an env var like "${MCP_SOCKET_PATH}"),
update README with instructions for creating a local .mcp.json from
.mcp.example.json, and add the real .mcp.json to .gitignore so developers/CI can
supply machine-specific paths without committing them.
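A committed template along these lines would do the job (`.mcp.example.json`, with the socket path left to the environment; the placeholder syntax is illustrative and depends on what the MCP client supports):

```json
{
  "mcpServers": {
    "hive": {
      "type": "stdio",
      "command": "node",
      "args": ["./dist/main/mcp-server.js"],
      "env": {
        "MCP_SOCKET_PATH": "${MCP_SOCKET_PATH}"
      }
    }
  }
}
```

Developers copy it to a gitignored `.mcp.json` and fill in machine-specific values.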
| error: too many arguments for 'dev'. Expected 0 arguments but got 2. | ||
| error: too many arguments for 'dev'. Expected 0 arguments but got 2. | ||
|
|
||
|
|
||
| Usage: convex dev [options] | ||
|
|
||
| Develop against a dev deployment, watching for changes | ||
|
|
||
| 1. Configures a new or existing project (if needed) | ||
| 2. Updates generated types and pushes code to the configured dev deployment | ||
| 3. Runs the provided command (if `--run` or `--run-sh` is used) | ||
| 4. Watches for file changes, and repeats step 2 | ||
|
|
||
|
|
||
| Options: | ||
| -v, --verbose Show full listing of changes | ||
| --typecheck <mode> Check TypeScript files with `tsc --noEmit`. | ||
| (choices: "enable", "try", "disable", default: | ||
| "try") | ||
| --typecheck-components Check TypeScript files within component | ||
| implementations with `tsc --noEmit`. (default: | ||
| false) | ||
| --codegen <mode> Regenerate code in `convex/_generated/` | ||
| (choices: "enable", "disable", default: | ||
| "enable") | ||
| --once Execute only the first 3 steps, stop on any | ||
| failure (default: false) | ||
| --until-success Execute only the first 3 steps, on failure | ||
| watch for local and remote changes and retry | ||
| steps 2 and 3 (default: false) | ||
| --run <functionName> The identifier of the function to run in step | ||
| 3, like `api.init.createData` or | ||
| `myDir/myFile:myFunction` | ||
| --run-component <functionName> If --run is used and the function is in a | ||
| component, the path the component tree defined | ||
| in convex.config.ts. Components are a beta | ||
| feature. This flag is unstable and may change | ||
| in subsequent releases. | ||
| --run-sh <command> A shell command to run in step 3, like `node | ||
| myScript.js`. If you just want to run a Convex | ||
| function, use `--run` instead. | ||
| --tail-logs [mode] Choose whether to tail Convex function logs in | ||
| this terminal (choices: "always", | ||
| "pause-on-deploy", "disable", default: | ||
| "pause-on-deploy") | ||
| --configure [choice] Ignore existing configuration and configure | ||
| new or existing project, interactively or set | ||
| by --team <team_slug>, --project | ||
| <project_slug>, and --dev-deployment | ||
| local|cloud (choices: "new", "existing") | ||
| --env-file <envFile> Path to a custom file of environment | ||
| variables, for choosing the deployment, e.g. | ||
| CONVEX_DEPLOYMENT or CONVEX_SELF_HOSTED_URL. | ||
| Same format as .env.local or .env files, and | ||
| overrides them. | ||
| -h, --help display help for command | ||
| Usage: convex dev [options] | ||
|
|
||
| Develop against a dev deployment, watching for changes | ||
|
|
||
| 1. Configures a new or existing project (if needed) | ||
| 2. Updates generated types and pushes code to the configured dev deployment | ||
| 3. Runs the provided command (if `--run` or `--run-sh` is used) | ||
| 4. Watches for file changes, and repeats step 2 | ||
|
|
||
|
|
||
| Options: | ||
| -v, --verbose Show full listing of changes | ||
| --typecheck <mode> Check TypeScript files with `tsc --noEmit`. | ||
| (choices: "enable", "try", "disable", default: | ||
| "try") | ||
| --typecheck-components Check TypeScript files within component | ||
| implementations with `tsc --noEmit`. (default: | ||
| false) | ||
| --codegen <mode> Regenerate code in `convex/_generated/` | ||
| (choices: "enable", "disable", default: | ||
| "enable") | ||
| --once Execute only the first 3 steps, stop on any | ||
| failure (default: false) | ||
| --until-success Execute only the first 3 steps, on failure | ||
| watch for local and remote changes and retry | ||
| steps 2 and 3 (default: false) | ||
| --run <functionName> The identifier of the function to run in step | ||
| 3, like `api.init.createData` or | ||
| `myDir/myFile:myFunction` | ||
| --run-component <functionName> If --run is used and the function is in a | ||
| component, the path the component tree defined | ||
| in convex.config.ts. Components are a beta | ||
| feature. This flag is unstable and may change | ||
| in subsequent releases. | ||
| --run-sh <command> A shell command to run in step 3, like `node | ||
| myScript.js`. If you just want to run a Convex | ||
| function, use `--run` instead. | ||
| --tail-logs [mode] Choose whether to tail Convex function logs in | ||
| this terminal (choices: "always", | ||
| "pause-on-deploy", "disable", default: | ||
| "pause-on-deploy") | ||
| --configure [choice] Ignore existing configuration and configure | ||
| new or existing project, interactively or set | ||
| by --team <team_slug>, --project | ||
| <project_slug>, and --dev-deployment | ||
| local|cloud (choices: "new", "existing") | ||
| --env-file <envFile> Path to a custom file of environment | ||
| variables, for choosing the deployment, e.g. | ||
| CONVEX_DEPLOYMENT or CONVEX_SELF_HOSTED_URL. | ||
| Same format as .env.local or .env files, and | ||
| overrides them. | ||
| -h, --help display help for command |
Deduplicate or drop this CLI output artifact.
This file appears to be duplicated help output (entire help block repeated). If it’s meant to be checked in, trim to a single occurrence; otherwise consider removing the file to avoid noise.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@1npm` around lines 1 - 108, The file contains a duplicated CLI help output
for "convex dev" (the entire help block is repeated); remove the duplicate so
only one instance of the help text remains (or delete the file if this output
should not be committed), ensuring the single retained block includes the full
options list and usage lines for "convex dev".
| { | ||
| "batched": { | ||
| "batchedMaxConcurrencyPerWorker": 1000, | ||
| "batchedMaxParallelism": 100, | ||
| "batchedMaxWorkers": 2, | ||
| "completedAt": 1771183136898, | ||
| "completedWorkflows": 1000, | ||
| "concurrency": [], | ||
| "count": 1000, | ||
| "durationMs": 0, | ||
| "elapsedMs": 0, | ||
| "failedWorkflows": 0, | ||
| "itemRows": [], | ||
| "jokesDone": 0, | ||
| "jokesTotal": 2000, | ||
| "judgesDone": 0, | ||
| "judgesTotal": 1000, | ||
| "maxConcurrency": 0, | ||
| "mode": "batched", | ||
| "picksDone": 0, | ||
| "picksTotal": 1000, | ||
| "regularMaxParallelism": 50, | ||
| "result": null, | ||
| "retry": { | ||
| "base": 2, | ||
| "initialBackoffMs": 200, | ||
| "maxAttempts": 4 | ||
| }, | ||
| "runId": "jh79gh7s96qn90zp247mh68s5d81691f", | ||
| "startedAt": 1771182912234, | ||
| "status": "completed", | ||
| "steps": [], | ||
| "tieCount": 316, | ||
| "timelineEnd": 1771182912234, | ||
| "timelineStart": 1771182912234, | ||
| "winner1Count": 315, | ||
| "winner2Count": 369 | ||
| }, | ||
| "count": 1000, | ||
| "globalTimelineEnd": 1771182912234, | ||
| "globalTimelineStart": 1771182912234, | ||
| "regular": null | ||
| } |
🧩 Analysis chain
🏁 Script executed:
# Search for code that writes or generates timeline snapshots
rg -n --type=ts --type=js 'timelineStart|timelineEnd|durationMs|itemRows|concurrency' -A 3 -B 3 | head -100
Repository: get-convex/workflow
Length of output: 3774
🏁 Script executed:
# Find files in benchmark_results to check if this is a pattern
find . -name "*timeline*.json" -type f | head -20
Repository: get-convex/workflow
Length of output: 690
🏁 Script executed:
# Search for code that generates or populates timeline data
rg -n --type=ts --type=js 'jokesDone|judgesDone|picksDone|completedWorkflows' -B 2 -A 2 | head -100
Repository: get-convex/workflow
Length of output: 45
🏁 Script executed:
# Look for visualization or chart code that uses these timeline fields
rg -n --type=ts --type=jsx --type=tsx 'concurrency.*render|itemRows.*render|steps.*render|chart' -B 2 -A 2 | head -80
Repository: get-convex/workflow
Length of output: 90
🏁 Script executed:
# Search for code that writes JSON files named timeline
rg -n --type=ts --type=js 'timeline.*json|writeFileSync|writeFile' -B 3 -A 3 | head -120
Repository: get-convex/workflow
Length of output: 45
🏁 Script executed:
# Look for test or benchmark directories
find . -type d -name '*test*' -o -type d -name '*benchmark*' | head -20
Repository: get-convex/workflow
Length of output: 82
🏁 Script executed:
# Check if there's benchmark code that generates these timeline files
fd -t f -e ts -e js . | grep -i benchmark | head -20
Repository: get-convex/workflow
Length of output: 90
🏁 Script executed:
# Examine the actual structure of a couple timeline files to compare
cat ./benchmark_results/batched100_timeline_20260215_103112.json
Repository: get-convex/workflow
Length of output: 50376
🏁 Script executed:
# Look for where completedAt, startedAt, durationMs are used in timeline context
rg -n --type=ts 'completedAt|startedAt' -B 2 -A 2 | grep -A 10 -B 10 'timeline' | head -80
Repository: get-convex/workflow
Length of output: 45
🏁 Script executed:
# Look at the benchmark code that generates these files
cat -n example/convex/benchmark.ts | head -200
Repository: get-convex/workflow
Length of output: 8981
🏁 Script executed:
# Search for timeline snapshot generation in the benchmark
rg -n 'timeline.*json|writeFile|snapshot' example/convex/benchmark.ts -A 5 -B 5
Repository: get-convex/workflow
Length of output: 45
🏁 Script executed:
# Check if there's any batch-size-dependent logic
rg -n 'batched.*1000|count.*1000|batchedMaxWorkers\|batchedMaxParallelism' example/convex/benchmark.ts -B 3 -A 3
Repository: get-convex/workflow
Length of output: 45
🏁 Script executed:
# Search for where timeline snapshots are written to JSON
rg -n 'writeFileSync|writeFile|JSON.stringify' example/convex/ -B 5 -A 5 | head -150
Repository: get-convex/workflow
Length of output: 796
🏁 Script executed:
# Look for Timeline type or snapshot-related code in the entire codebase
rg -n 'Timeline|timeline.*json|concurrency.*record|itemRows' src/ -B 2 -A 2 | head -100
Repository: get-convex/workflow
Length of output: 45
🏁 Script executed:
# Check if there's test/bench code that generates these files
fd -t f -e ts -e js . | xargs grep -l 'timeline.*json\|writeFile.*timeline' 2>/dev/null | head -10
Repository: get-convex/workflow
Length of output: 45
🏁 Script executed:
# Search more broadly for code that populates the fields mentioned in the review
rg -n 'durationMs.*0|jokesDone|judgesDone|picksDone|maxConcurrency' -B 3 -A 3 | head -80
Repository: get-convex/workflow
Length of output: 6297
Timeline snapshot fields are not being populated for larger batched runs — confirmed recording bug.
The batched 100-item run properly records timeline data (jokesDone: 200, judgesDone: 100, picksDone: 100, maxConcurrency: 200), but all 1000-item batched runs consistently show zero values for these fields (jokesDone: 0, judgesDone: 0, picksDone: 0, maxConcurrency: 0), empty arrays for concurrency and itemRows, and durationMs and elapsedMs both zero. All 1000-item runs also return result: null, indicating incomplete or failed state.
This pattern is 100% reproducible across 7 separate 1000-item benchmark runs. Any visualization code consuming concurrency, itemRows, steps, durationMs, elapsedMs, or maxConcurrency will receive no data for these larger batches, potentially causing silent failures or empty charts.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@benchmark_results/batched_1000_timeline_20260215_111510.json` around lines 1
- 43, The batched run timeline fields (batched.jokesDone, judgesDone, picksDone,
maxConcurrency, concurrency, itemRows, durationMs, elapsedMs, steps, result) are
not being populated for large batches; inspect and fix the timeline aggregation
path (e.g., finalizeBatchedRun / recordTimeline / aggregateWorkerTimeline) to
ensure it iterates over all items/workers instead of being truncated by a
hardcoded cap or page size, correctly accumulates counts and maxConcurrency,
appends all itemRows and concurrency samples, computes durationMs/elapsedMs from
startedAt/completedAt using safe numeric types, and sets result when aggregation
completes; update any early-return conditions or try/catch that swallow errors
so the batched object is assigned the computed values for the runId.
=== workflow small benchmark ===
section_count=1000
[CONVEX ?(llmSimulation:clearAll)] [ERROR] 'Simulation not found: j9712dprjeky0709w2098prf65816c0h'
[CONVEX ?(llmSimulation:clearAll)] [ERROR] 'Simulation not found: j97507jy904tpv1erkkkzgt1yx8172qh'
Clarify failed benchmark output before archiving.
Lines 3-4 show “Simulation not found” errors. If this run is incomplete, consider moving it to a failed/partial folder or annotating the archive so it isn’t mistaken for a successful benchmark.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@benchmark_results/workflow_benchmark_small_1000_2026-02-15_00-08-07.txt`
around lines 1 - 4, This benchmark run shows errors from the CONVEX cleanup step
(CONVEX ?(llmSimulation:clearAll)) reporting "Simulation not found" for IDs
'j9712dprjeky0709w2098prf65816c0h' and 'j97507jy904tpv1erkkkzgt1yx8172qh';
before archiving, either move this file out of the successful archive into a
failed/partial folder or update the file/metadata to annotate the run as
incomplete and include the exact error lines and the simulation IDs so consumers
won't treat this as a successful workflow_small benchmark (section_count=1000).
generated_at=2026-02-15 00:23:43 PST
repo=/Users/magicseth/Projects/workflow
branch=llm-simulation-demo
Avoid committing absolute local paths.
Line 3 includes /Users/..., which leaks a local filesystem path and makes the artifact less portable. Consider stripping it or replacing with a repo-relative identifier.
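One way to prevent this in future archives is to sanitize metadata lines before they are written. A small sketch; `sanitizeRepoLine` is a hypothetical helper keyed to the `repo=` line format shown above:

```typescript
// Replace an absolute local path in a "repo=" metadata line with just the
// repository directory name; other lines pass through unchanged.
function sanitizeRepoLine(line: string): string {
  const match = line.match(/^repo=(.+)$/);
  if (!match) return line;
  const parts = match[1].split("/").filter(Boolean);
  return `repo=${parts[parts.length - 1] ?? match[1]}`;
}
```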
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@benchmark_results/workflow_results_archive_latest.txt` around lines 2 - 4,
The file contains an absolute local path in the repo metadata (the "repo=" line)
which leaks local filesystem info; change the "repo=" value in
benchmark_results/workflow_results_archive_latest.txt to a repo-relative
identifier or a sanitized placeholder (e.g., repository name or commit hash)
instead of an absolute /Users/... path so the artifact is portable and
non-identifying.
python3 -c '
import json, sys
d = json.load(sys.stdin)
print("pending_total={} workflow_running={} workflow_done={}".format(
    d["pendingTotal"], d["workflowRunning"], d["workflowDone"]
))
print("pending_by_slot:", ", ".join("s{}={}".format(p["slot"], p["pending"]) for p in d["pendingBySlot"]))
print("--- simulations ---")
for s in d["simulations"]:
    elapsed = s["elapsedMs"] / 1000.0
    print("{} mode={} status={} elapsed={:.1f}s outline={} sections={}".format(
        s["id"], s["mode"], s["status"], elapsed, s["outlineCount"], s["sectionCount"]
    ))
' <<< "$json"
sleep "$INTERVAL_SECS"
Python parse failure under set -e kills the entire monitoring loop.
With set -euo pipefail active, any KeyError (e.g., debugPoolView schema change or partial response) exits the python3 command with a non-zero code, which propagates out of the while true loop and terminates the script rather than retrying on the next interval.
🛡️ Proposed fix – guard the parse step and continue on failure
- python3 -c '
+ if ! python3 -c '
import json, sys
d = json.load(sys.stdin)
print("pending_total={} workflow_running={} workflow_done={}".format(
d["pendingTotal"], d["workflowRunning"], d["workflowDone"]
))
print("pending_by_slot:", ", ".join("s{}={}".format(p["slot"], p["pending"]) for p in d["pendingBySlot"]))
print("--- simulations ---")
for s in d["simulations"]:
elapsed = s["elapsedMs"] / 1000.0
print("{} mode={} status={} elapsed={:.1f}s outline={} sections={}".format(
s["id"], s["mode"], s["status"], elapsed, s["outlineCount"], s["sectionCount"]
))
-' <<< "$json"
+' <<< "$json"; then
+ echo "parse error – unexpected JSON shape"
+ echo "$json"
+ fi
+ sleep "$INTERVAL_SECS"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@scripts/watch_workflow_debug.sh` around lines 43 - 57, The inline Python
block invoked with python3 -c that reads json into d and prints fields
(d["pendingTotal"], d["pendingBySlot"], and iterates d["simulations"]) can raise
KeyError/Exception and, under set -euo pipefail, will exit the whole script;
update that block to guard the parse/print work with a try/except that catches
Exception (including KeyError/JSONDecodeError), writes a short diagnostic to
stderr, and exits with code 0 so the surrounding while true loop continues;
locate the python3 -c '... d = json.load(sys.stdin) ...
print("pending_total=...") ...' block and add the exception handling around the
parsing/printing logic.
// BatchWorkpool not yet exported from @convex-dev/workpool@0.3.1
type BatchWorkpool = any;
🧩 Analysis chain
🏁 Script executed:
# First, check the current state of src/client/index.ts around lines 6-7
cat -n src/client/index.ts | head -20
Repository: get-convex/workflow
Length of output: 734
🏁 Script executed:
# Check package.json to see workpool version
cat package.json | grep -A 5 -B 5 "workpool"
Repository: get-convex/workflow
Length of output: 630
🏁 Script executed:
# Search for any workpool imports or usages in the codebase
rg "workpool" --type ts --type js -i
Repository: get-convex/workflow
Length of output: 5957
🌐 Web query:
@convex-dev/workpool BatchWorkpool export TypeScript type 0.3.1 latest version
💡 Result:
@convex-dev/workpoolv0.3.1 is a real published version (itspackage.jsonshows"version": "0.3.1"). [1]- In v0.3.1, the client API exports a
Workpoolclass (with batching methods likeenqueueActionBatch,enqueueMutationBatch,statusBatch, etc.), but there is no exported TypeScript symbol namedBatchWorkpoolindist/client/index.d.ts. [3] - Latest version is not 0.3.1: UNPKG’s version selector shows
latest (0.4.0)(and also lists0.3.2). [2]
Sources:
[1] [3] [2]
Define a local BatchWorkpool interface instead of any.
BatchWorkpool is not exported from @convex-dev/workpool@0.3.1. Using any loses type safety and hides API drift. Define a minimal local interface based on actual usage, or replace it when the type becomes available in a future workpool release.
Suggested minimal local interface
-// BatchWorkpool not yet exported from `@convex-dev/workpool`@0.3.1
-type BatchWorkpool = any;
+// BatchWorkpool not yet exported from `@convex-dev/workpool`@0.3.1
+type BatchWorkpool = {
+ action: <Args extends DefaultFunctionArgs = any, Returns = any>(
+ name: string,
+ opts: {
+ args: Record<string, Validator<any, any, any>>;
+ handler: (
+ ctx: GenericActionCtx<GenericDataModel>,
+ args: Args,
+ ) => Promise<Returns>;
+ },
+ ) => RegisteredAction<"internal", Args, Returns>;
+ enqueueByHandle: (
+ ctx: GenericMutationCtx<GenericDataModel>,
+ name: string,
+ args: DefaultFunctionArgs,
+ options: { onComplete: { fnHandle: string; context: unknown } },
+ ) => Promise<string>;
+};
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/client/index.ts` around lines 6 - 7, Replace the permissive "type
BatchWorkpool = any" with a minimal local interface named BatchWorkpool that
declares only the methods/properties your code actually calls (e.g., the
run/submit/close/stop signatures or iterator/length properties used elsewhere);
update src/client/index.ts to export or use that interface in place of any so
callers get type safety, and add a TODO comment to remove this local interface
once the official BatchWorkpool type is exported by `@convex-dev/workpool`. Ensure
the interface method names/signatures exactly match uses in functions that
accept a BatchWorkpool so the compiler will catch API drift.
export async function ensureCoordinatorRunning(ctx: MutationCtx) {
  const state = await ctx.db.query("coordinatorState").first();
  if (state?.scheduled) {
    return;
  }
  if (state) {
    await ctx.db.patch(state._id, { scheduled: true });
  } else {
    await ctx.db.insert("coordinatorState", { scheduled: true });
  }
  await ctx.scheduler.runAfter(0, internal.coordinator.coordinator);
}
Guard against multiple coordinatorState rows.
Lines 10-19 insert coordinatorState if missing; concurrent callers can insert duplicates and leave a stale scheduled: true, which can block future scheduling or spawn multiple coordinators. Consider enforcing a singleton (fixed ID/config entry) or cleaning duplicates.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/component/coordinator.ts` around lines 10 - 21, The
ensureCoordinatorRunning function can create duplicate coordinatorState rows
under concurrency; change the logic to enforce a singleton state record: use a
fixed unique key/ID for the coordinator state (instead of blind insert) and
perform an atomic upsert/conditional update so only one record is created and
its scheduled flag is set (replace the current ctx.db.insert call with an upsert
or a transaction that patches-or-inserts by the fixed _id); also add a small
cleanup step (query all coordinatorState rows and collapse/delete duplicates,
keeping the single canonical record) to guard existing databases. Target
symbols: ensureCoordinatorRunning, coordinatorState query/first(), ctx.db.patch,
ctx.db.insert and the scheduler call internal.coordinator.coordinator.
const console = createLogger(DEFAULT_LOG_LEVEL);
const stepId =
  "stepId" in args.context && typeof args.context.stepId === "string"
Preserve configurable log levels in onCompleteHandler.
Line 97 hard-codes DEFAULT_LOG_LEVEL, which bypasses any per-workflow/workpool logLevel settings. Consider plumbing the optional log level from the onComplete context so verbosity stays configurable.
Suggested fix
- const console = createLogger(DEFAULT_LOG_LEVEL);
+ const context = args.context as Partial<OnCompleteContext>;
+ const console = createLogger(
+ context.workpoolOptions?.logLevel ?? DEFAULT_LOG_LEVEL,
+ );🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/component/pool.ts` around lines 97 - 99, The onCompleteHandler currently
creates a logger with a hard-coded DEFAULT_LOG_LEVEL (const console =
createLogger(DEFAULT_LOG_LEVEL)), which ignores any per-workflow/workpool
logLevel; update onCompleteHandler to read an optional log level from the
provided context (e.g., args.context.logLevel or
args.context.onComplete?.logLevel) and pass that into createLogger instead of
DEFAULT_LOG_LEVEL, falling back to DEFAULT_LOG_LEVEL only when the context value
is absent; adjust variable references around createLogger, console and any
callers in onCompleteHandler to use the resolved log level so verbosity remains
configurable.
@@ -0,0 +1,454 @@
import { v } from "convex/values";
import { vResultValidator } from "@convex-dev/workpool";
import { mutation, query, type MutationCtx } from "./_generated/server.js";
Remove unused MutationCtx import (lint warning).
Suggested fix
-import { mutation, query, type MutationCtx } from "./_generated/server.js";
+import { mutation, query } from "./_generated/server.js";
🧰 Tools
🪛 GitHub Check: Test and lint
[warning] 3-3:
'MutationCtx' is defined but never used. Allowed unused vars must match /^_/u
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/component/taskQueue.ts` at line 3, The import statement in taskQueue.ts
currently imports MutationCtx but it is unused; remove MutationCtx from the
named imports in the import from "./_generated/server.js" (i.e., update the
import that includes mutation and query so it no longer imports MutationCtx) to
satisfy the linter and eliminate the unused-symbol warning.
b1552ab to 372547f (Compare)
b5cd3ea to 801c4b2 (Compare)
Warning: This pull request is not mergeable via GitHub because a downstack PR is open. Once all requirements are satisfied, merge this PR as a stack on Graphite.
This stack of pull requests is managed by Graphite. Learn more about stacking.
3cb55b4 to a6f0186 (Compare)
91a3b55 to ff68a69 (Compare)
932d4cc to ae2a0a9 (Compare)
Benchmark results, tuning docs, example benchmark/viz endpoints, and helper scripts for the executor mode implementation. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
ae2a0a9 to a974d24 (Compare)
- Replace fire-and-forget scheduler.runAfter safety net with durable replayQueue table - recordResultBatch: insert replay entry → inline replay → delete on success - processReplayBatch runs sequentially in flush loop (OCC-free) - Remove eager replayQueue cleanup from completeHandler (avoids OCC with executors) - Add bumpEpoch (stop executors), clearReplayQueue, clearTaskQueue utilities - Lower MAX_CONCURRENCY=50, CLAIM_LIMIT=200 for real API calls - Simplify Haiku benchmark to "hello world" (minimal tokens) - 19825/20000 real Haiku benchmark: 0 stuck, 175 failed (API transients) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- CLAIM_LIMIT=50 (matches MAX_CONCURRENCY) so executors re-query the task queue frequently, picking up later steps for earlier workflows instead of always grabbing step-0 tasks for newer workflows. - Task queue indexed by [shard, workflowCreatedAt] (ascending) so tasks for earlier-created workflows are always claimed first. This gives FIFO completion ordering: the first 2k of 20k workflows finish in ~4 min median while the last 2k take ~20 min. - 50 concurrency × 100 shards = 5000 concurrent slots. This works because real-world throughput is gated by the LLM API (Anthropic), not local compute. Higher per-shard concurrency wastes V8 memory (64 MB limit) holding idle HTTP connections. - failPendingTasks mutation: force-fails all queued tasks in a shard, marks steps as failed, and inserts replay entries so workflows complete (as failures) rather than getting stuck forever. - failAllPendingTasks action: iterates all 100 shards to nuke the entire task queue — useful for clearing stale work after crashes. - priorityAnalysis action: measures whether FIFO ordering is working by bucketing workflows into creation-time deciles and comparing median completion times. Benchmark results (20k real Claude Haiku workflows): 19,886 completed, 114 failed (0.57%), 0 stuck p50=12.1min, p90=20.8min, p99=22.1min, slowest=26.6min Earliest 2k workflows: median 4.3 min to complete Latest 2k workflows: median 20.1 min to complete Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
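The priorityAnalysis measurement described in this commit (bucket workflows into creation-time deciles, compare median completion times) reduces to a small aggregation. A sketch under the assumption that each workflow contributes a `(createdAt, completionMs)` pair; these helpers are illustrative, not the benchmark's actual code:

```typescript
interface Run {
  createdAt: number; // creation timestamp, used for decile bucketing
  completionMs: number; // time from creation to completion
}

function median(xs: number[]): number {
  const s = [...xs].sort((a, b) => a - b);
  const mid = Math.floor(s.length / 2);
  return s.length % 2 ? s[mid] : (s[mid - 1] + s[mid]) / 2;
}

// Median completion time per creation-time decile; decile 0 holds the
// earliest-created workflows. With FIFO claiming, the medians should
// rise monotonically across deciles.
function decileMedians(runs: Run[]): number[] {
  const sorted = [...runs].sort((a, b) => a.createdAt - b.createdAt);
  const out: number[] = [];
  for (let d = 0; d < 10; d++) {
    const lo = Math.floor((d * sorted.length) / 10);
    const hi = Math.floor(((d + 1) * sorted.length) / 10);
    const bucket = sorted.slice(lo, hi).map((r) => r.completionMs);
    out.push(bucket.length ? median(bucket) : 0);
  }
  return out;
}
```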
Self-rescheduling watchdog mutation checks each shard's task/replay queues every 30s. If tasks are older than 60s, the shard's executor is presumed dead and a replacement is scheduled at the current epoch. Watchdog starts automatically with startExecutors and stops on bumpEpoch. Also adds mode/count badge to benchmark viz (green for real, gray for simulated) via URL query params. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
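The watchdog's core predicate is simple: any shard whose oldest queued task exceeds the staleness threshold needs a replacement executor. A sketch using the 60s threshold from the commit; `staleShards` is a hypothetical helper, not the watchdog's actual code:

```typescript
// Tasks older than 60s imply the shard's executor has died.
const STALE_AFTER_MS = 60_000;

// `pending` maps shard number -> enqueue timestamps of its queued tasks.
// Returns the shards whose executors should be replaced at the current
// epoch.
function staleShards(
  pending: Map<number, number[]>,
  now: number,
): number[] {
  const out: number[] = [];
  for (const [shard, enqueuedAt] of pending) {
    if (enqueuedAt.some((t) => now - t > STALE_AFTER_MS)) {
      out.push(shard);
    }
  }
  return out;
}
```

Running this every 30s tick over each shard's task and replay queues reproduces the behavior the commit describes: healthy shards (tasks younger than 60s) are skipped, dead ones get a replacement scheduled.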
… name - Schedule successor at RESCHEDULE_MS (no jitter) so it's running before the original stops claiming at RESCHEDULE_MS + jitter. Eliminates the ~2min gap when rate-limited tasks block drainInFlight. - Reduce RESCHEDULE_MS from 5min to 3min for headroom under 10min timeout. - Batch watchdog into 25-shard ticks to stay under 32k doc read limit. - Viz: derive WF_NAME from mode URL param so standard runs render correctly. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Benchmarks, Scripts, and Documentation for Executor Mode
Depends on #210 (executor mode core implementation).
Contents
- Benchmark suite (example/convex/benchmark.ts)
  - startBenchmark / startBenchmarkBatch
- Timeline visualization (example/convex/http.ts)
- Benchmark results (benchmark_results/)
- Scripts (scripts/)
  - benchmark_workflow_small.sh — run benchmarks and collect results
  - benchmark_compare.sh — compare standard vs executor mode
- Documentation (docs/)
  - executor-mode.md — usage guide
  - executor-mode-spec.md — clean-room implementation spec
  - tuning-guide.md — tuning constants and benchmark results