feat(session): task checkpointing — blocked agents hand off to next session #351
krrish-berri-2 wants to merge 1 commit into
Adds three MCP tools (save_task_progress, list_blocked_tasks, get_blocked_task)
and backing platform routes so agents can persist their work state before giving
up, and successor sessions can resume blocked tasks instead of starting from scratch.
- harnesses/opencode/session-task-mcp.mjs: new standalone stdio MCP server
exposing the three tools; follows same env/retry/proxy pattern as report-issue-mcp
- harnesses/opencode/gen-mcp-config.mjs: wire lap-session-task into opencode.json
- prisma/schema.prisma + migration: add task_checkpoint JSONB to managed_agent_session
- POST /sessions/{id}/task_checkpoint: agent writes {summary, status, blocked_reason}
- GET /sessions/{id}/task_checkpoint: read checkpoint for any session (same agent)
- GET /sessions/{id}/blocked_tasks: list other sessions with status=blocked
- tests/agent-task-checkpoint.spec.ts: e2e behavioral tests
Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
Greptile Summary
This PR introduces a task-checkpointing system for managed agents: a new stdio MCP server (

Confidence Score: 3/5
The blocked-task handoff mechanism has a persistent-state bug that will cause every subsequent session to re-attempt already-claimed work, and the E2E test will fail immediately in any environment where a specific agent has not been pre-provisioned.

Once a new session picks up a blocked task, the prior session's checkpoint remains in state 'blocked' indefinitely. Every future session for that agent will discover the same stale entry, attempt to resume it, and produce duplicate or conflicting work. There is no MCP tool path that allows the new session to update the old session's record.

The E2E test also relies on a hardcoded agent UUID that will not exist outside the original dev environment, so the test plan cannot be executed without additional setup steps that are not yet documented.

| Filename | Overview |
|---|---|
| src/app/api/v1/managed_agents/sessions/[session_id]/blocked_tasks/route.ts | New route returning blocked checkpoints for the same agent; has a logic gap where picked-up sessions are never cleared from the list, and ordering by created_at is less useful than ordering by last_seen_at |
| src/app/api/v1/managed_agents/sessions/[session_id]/task_checkpoint/route.ts | New POST/GET routes for writing and reading task checkpoints; auth is correct; Zod validation is missing a refine constraint to require blocked_reason when status=blocked |
| harnesses/opencode/session-task-mcp.mjs | New stdio MCP server exposing three task-checkpoint tools; retry logic and token refresh are solid; save_task_progress always writes to the caller's own session, leaving no mechanism to clear a prior session's blocked status |
| harnesses/opencode/gen-mcp-config.mjs | Adds lap-session-task MCP entry guarded by the same issueBase+issueAccess check as the existing lap-issue-reporter; no issues |
| prisma/migrations/20260526000001_add_task_checkpoint/migration.sql | Adds nullable JSONB column task_checkpoint to managed_agent_session; additive-only migration, safe to deploy and roll back |
| prisma/schema.prisma | Adds task_checkpoint Json? field to Session model with an inline comment; matches migration |
| tests/agent-task-checkpoint.spec.ts | E2E behavioral tests for checkpoint tools; AGENT_ID falls back to a hardcoded UUID that won't exist in any environment except the original dev setup, causing immediate failures on staging or CI |
Reviews (1): Last reviewed commit: "feat(session): add task checkpointing so..."
```ts
const rows = await prisma.session.findMany({
  where: {
    agent_id: sessionRow.agent_id,
    session_id: { not: session_id },
    task_checkpoint: {
      path: ["status"],
      equals: "blocked",
    },
  },
  select: {
    session_id: true,
    task_checkpoint: true,
    // Fall back to session updated_at for ordering when checkpoint updated_at is unavailable.
    last_seen_at: true,
    created_at: true,
  },
  orderBy: { created_at: "desc" },
  take: 10,
});
```
Blocked tasks stay visible after pickup — persistent duplicate resumption
When a new session (S2) picks up a blocked task from S1, it calls save_task_progress on its own session_id (S2). S1's task_checkpoint.status remains "blocked" forever because neither the MCP tool nor any API operation updates S1's record. Every subsequent session will also see S1 in list_blocked_tasks and attempt to resume the same work, with no way to know S2 is already handling it.
The MCP tool callSaveTaskProgress always writes to process.env.SESSION_ID || input.session_id (the caller's current session), so there is no code path today that clears the prior session's blocked status after pickup.
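One possible shape for a fix is an explicit claim step, sketched below under stated assumptions. The `claimed_by` field, the `claimBlockedTask` and `listUnclaimedBlocked` helpers, and the in-memory `Map` are all hypothetical and not part of this PR. A real implementation would need an atomic conditional update in the database so that two sessions cannot claim the same task concurrently.

```typescript
// Hypothetical checkpoint shape. `claimed_by` is an assumed extra field, not part of this PR.
type CheckpointStatus = "in_progress" | "blocked" | "complete";

interface TaskCheckpoint {
  summary: string;
  status: CheckpointStatus;
  blocked_reason?: string;
  claimed_by?: string; // session_id of the successor that picked the task up
}

// In-memory stand-in for the managed_agent_session rows (session_id -> checkpoint).
type CheckpointStore = Map<string, TaskCheckpoint>;

// Marks a blocked checkpoint as claimed. Returns false when the checkpoint is
// missing, not blocked, or already claimed by a different session.
function claimBlockedTask(
  store: CheckpointStore,
  blockedSessionId: string,
  claimerSessionId: string,
): boolean {
  const checkpoint = store.get(blockedSessionId);
  if (!checkpoint || checkpoint.status !== "blocked") return false;
  if (checkpoint.claimed_by && checkpoint.claimed_by !== claimerSessionId) return false;
  store.set(blockedSessionId, { ...checkpoint, claimed_by: claimerSessionId });
  return true;
}

// Lists session IDs whose checkpoint is blocked and unclaimed, excluding the caller.
function listUnclaimedBlocked(store: CheckpointStore, callerSessionId: string): string[] {
  return Array.from(store.entries())
    .filter(([id, cp]) => id !== callerSessionId && cp.status === "blocked" && !cp.claimed_by)
    .map(([id]) => id);
}
```

With something like this in place, S2 would claim S1's checkpoint before resuming, and S3 would no longer see S1 in its list.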
```ts
const CheckpointBody = z.object({
  summary: z.string().min(1),
  status: z.enum(["in_progress", "blocked", "complete"]),
  blocked_reason: z.string().optional(),
});
```
blocked_reason is not enforced when status is "blocked"
The Zod schema marks blocked_reason as optional unconditionally. When an agent sets status="blocked" without providing a reason, the next agent receiving this checkpoint from list_blocked_tasks has no context to determine whether the original blocker is still present. The tool description says "Required when status=blocked" but the server doesn't enforce it, so agents can silently omit it.
Suggested change:

```diff
-const CheckpointBody = z.object({
-  summary: z.string().min(1),
-  status: z.enum(["in_progress", "blocked", "complete"]),
-  blocked_reason: z.string().optional(),
-});
+const CheckpointBody = z.object({
+  summary: z.string().min(1),
+  status: z.enum(["in_progress", "blocked", "complete"]),
+  blocked_reason: z.string().optional(),
+}).refine(
+  (data) => data.status !== "blocked" || (typeof data.blocked_reason === "string" && data.blocked_reason.length > 0),
+  { message: "blocked_reason is required when status is 'blocked'", path: ["blocked_reason"] },
+);
```
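The same rule could also be checked in the MCP tool before the POST is sent, so the agent gets an immediate error instead of a round trip. The sketch below is a dependency-free illustration of the intended rule, not the route's actual validation. `validateCheckpoint` is a hypothetical helper name, and like the suggested `refine` it accepts a whitespace-only reason.

```typescript
// Allowed status values, taken from the CheckpointBody schema in this PR.
const CHECKPOINT_STATUSES = ["in_progress", "blocked", "complete"];

interface CheckpointInput {
  summary?: unknown;
  status?: unknown;
  blocked_reason?: unknown;
}

// Hypothetical helper: returns a list of validation errors (empty when the body is valid).
function validateCheckpoint(body: CheckpointInput): string[] {
  const errors: string[] = [];

  if (typeof body.summary !== "string" || body.summary.length === 0) {
    errors.push("summary must be a non-empty string");
  }
  if (typeof body.status !== "string" || CHECKPOINT_STATUSES.indexOf(body.status) === -1) {
    errors.push("status must be one of: " + CHECKPOINT_STATUSES.join(", "));
  }
  if (body.blocked_reason !== undefined && typeof body.blocked_reason !== "string") {
    errors.push("blocked_reason must be a string when provided");
  }
  // The conditional rule: a blocked checkpoint must explain why it is blocked.
  if (
    body.status === "blocked" &&
    (typeof body.blocked_reason !== "string" || body.blocked_reason.length === 0)
  ) {
    errors.push("blocked_reason is required when status is 'blocked'");
  }
  return errors;
}
```

Returning a list rather than throwing keeps every problem visible in one response, which may be more useful to an agent than failing on the first error.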
```ts
    last_seen_at: true,
    created_at: true,
  },
  orderBy: { created_at: "desc" },
```
Ordering by created_at instead of checkpoint's updated_at may surface stale tasks first
Sessions are sorted by when they were created, not by when they were blocked. A session created last week whose task was blocked yesterday would be listed after a newer session whose task was blocked several days ago. Ordering by last_seen_at (already selected) is a better approximation than created_at.
Suggested change:

```diff
-  orderBy: { created_at: "desc" },
+  orderBy: { last_seen_at: "desc" },
```
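If the checkpoint JSON carried its own timestamp, the ordering could follow the time of blocking directly and fall back to last_seen_at, as the code comment in the route hints. The sketch below assumes a hypothetical `updated_at` ISO string inside task_checkpoint, which this PR does not write. `BlockedRow`, `blockedAt`, and `sortMostRecentlyBlockedFirst` are illustrative names.

```typescript
// Row shape mirroring the fields selected by the blocked_tasks route.
// `task_checkpoint.updated_at` is an assumed field, not written by this PR.
interface BlockedRow {
  session_id: string;
  last_seen_at: Date;
  created_at: Date;
  task_checkpoint: { status: string; updated_at?: string };
}

// Best available estimate of when the task was blocked, in epoch milliseconds.
function blockedAt(row: BlockedRow): number {
  const raw = row.task_checkpoint.updated_at;
  const parsed = raw ? Date.parse(raw) : NaN;
  // Fall back to the session's last_seen_at when the checkpoint timestamp is missing or invalid.
  return isNaN(parsed) ? row.last_seen_at.getTime() : parsed;
}

// Returns a new array sorted so the most recently blocked task comes first.
function sortMostRecentlyBlockedFirst(rows: BlockedRow[]): BlockedRow[] {
  return rows.slice().sort((a, b) => blockedAt(b) - blockedAt(a));
}
```

Because the query uses `take: 10`, an in-memory sort like this only reorders rows that were already fetched. Sorting in the database would still be needed to pick the right ten.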
```ts
  process.env.CHECKPOINT_TEST_AGENT_ID ?? "9cbb91a6-e66d-43c5-92ed-68a570429527";

const TURN_TIMEOUT_MS = 90_000;
```
Hardcoded agent UUID will fail in every non-dev environment
AGENT_ID falls back to a hardcoded UUID when CHECKPOINT_TEST_AGENT_ID is not set. spawnAndWait calls POST /agents/{AGENT_ID}/session, which will return a 404 on any staging or CI environment that doesn't have this specific agent. The test plan mentions running against staging, but the test will immediately fail there unless CHECKPOINT_TEST_AGENT_ID is explicitly configured — and there is no documentation of this requirement in the test file or the test plan.
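One way to make the requirement explicit is to drop the fallback and fail with a descriptive message when the variable is unset. This is a minimal sketch. `resolveTestAgentId` is a hypothetical helper, and the environment is passed in as a parameter so the function can be exercised without touching the real process environment.

```typescript
// Hypothetical helper: reads the agent ID from the given environment and fails
// loudly instead of falling back to a UUID that only exists on one dev machine.
function resolveTestAgentId(env: Record<string, string | undefined>): string {
  const id = (env.CHECKPOINT_TEST_AGENT_ID ?? "").trim();
  if (id === "") {
    throw new Error(
      "CHECKPOINT_TEST_AGENT_ID must be set to the ID of an agent that exists in the target environment",
    );
  }
  return id;
}

// Intended use in the spec file (assumption about how it would be wired in):
//   const AGENT_ID = resolveTestAgentId(process.env);
```

Failing at startup turns a confusing 404 from the session endpoint into a clear configuration error. Skipping the suite instead would also be reasonable if the project prefers that.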
Summary
- `session-task-mcp.mjs` exposes 3 tools: `save_task_progress`, `list_blocked_tasks`, `get_blocked_task`
- Platform routes: `POST/GET /sessions/{id}/task_checkpoint` and `GET /sessions/{id}/blocked_tasks`
- Migration adds `task_checkpoint JSONB` column to `managed_agent_session`
- `gen-mcp-config.mjs` wires `lap-session-task` MCP into opencode.json at harness boot
- `tests/agent-task-checkpoint.spec.ts` checks implicit tool use behavior

How it works
Agent calls `save_task_progress({summary, status, blocked_reason})` at each milestone and before giving up. Status `blocked` means another session should pick it up.

On next session start, agent calls `list_blocked_tasks` (per system prompt instruction), sees prior blocked work, calls `get_blocked_task` for full context, and resumes instead of picking a new ticket.

Verified locally
- `lap-session-task_*` in agent toolset ✓
- `list_blocked_tasks` returns blocked checkpoints from prior sessions ✓
- `save_task_progress` writes to DB and `GET task_checkpoint` reads it back ✓
- Agent calls `list_blocked_tasks` unprompted when asked to pick a GitHub issue ✓

Test plan
- `session-task-mcp.mjs` + updated `gen-mcp-config.mjs` must be in the Docker image
- Run `npx prisma migrate deploy` on prod DB (adds `task_checkpoint` column)
- Run `tests/agent-task-checkpoint.spec.ts` against staging
- Confirm the agent calls `list_blocked_tasks` first

🤖 Generated with Claude Code