feat(session): task checkpointing — blocked agents hand off to next session by krrish-berri-2 · Pull Request #351 · BerriAI/litellm-agent-platform

krrish-berri-2 · 2026-05-27T01:45:35Z

Summary

New MCP server session-task-mcp.mjs exposes 3 tools: save_task_progress, list_blocked_tasks, get_blocked_task
Platform routes: POST/GET /sessions/{id}/task_checkpoint and GET /sessions/{id}/blocked_tasks
DB migration: adds task_checkpoint JSONB column to managed_agent_session
gen-mcp-config.mjs: wires lap-session-task MCP into opencode.json at harness boot
E2E test: tests/agent-task-checkpoint.spec.ts checks implicit tool use behavior

How it works

Agent calls save_task_progress({summary, status, blocked_reason}) at each milestone and before giving up. Status blocked means another session should pick it up.

On next session start, agent calls list_blocked_tasks (per system prompt instruction), sees prior blocked work, calls get_blocked_task for full context, and resumes instead of picking a new ticket.
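
The handoff flow above can be sketched with a small in-memory model. This is an illustration only, not the PR's code: the `Checkpoint` shape mirrors the `{summary, status, blocked_reason}` payload, while `CheckpointStore`, its methods, and the session and agent IDs are hypothetical stand-ins for the MCP tools and platform routes.

```typescript
// Hypothetical in-memory model; the real tools call the platform routes over HTTP.
type Status = "in_progress" | "blocked" | "complete";

interface Checkpoint {
  summary: string;
  status: Status;
  blocked_reason?: string;
}

interface SessionRecord {
  session_id: string;
  agent_id: string;
  task_checkpoint: Checkpoint | null;
}

class CheckpointStore {
  private sessions = new Map<string, SessionRecord>();

  startSession(session_id: string, agent_id: string): void {
    this.sessions.set(session_id, { session_id, agent_id, task_checkpoint: null });
  }

  // Mirrors save_task_progress: always writes to the caller's own session.
  saveTaskProgress(session_id: string, checkpoint: Checkpoint): void {
    const row = this.sessions.get(session_id);
    if (!row) throw new Error(`unknown session ${session_id}`);
    row.task_checkpoint = checkpoint;
  }

  // Mirrors list_blocked_tasks: other sessions of the same agent with status "blocked".
  listBlockedTasks(session_id: string): SessionRecord[] {
    const caller = this.sessions.get(session_id);
    if (!caller) throw new Error(`unknown session ${session_id}`);
    return Array.from(this.sessions.values()).filter(
      (s) =>
        s.agent_id === caller.agent_id &&
        s.session_id !== session_id &&
        s.task_checkpoint !== null &&
        s.task_checkpoint.status === "blocked",
    );
  }
}

const store = new CheckpointStore();

// Session s1 gets stuck and records why before giving up.
store.startSession("s1", "agent-a");
store.saveTaskProgress("s1", {
  summary: "Fix flaky test",
  status: "blocked",
  blocked_reason: "sandbox failure",
});

// Session s2 for the same agent starts later and looks for prior blocked work.
store.startSession("s2", "agent-a");
const blockedForS2 = store.listBlockedTasks("s2").map((s) => s.session_id);
console.log(blockedForS2);
```

In this model, s2 would then read the full checkpoint for s1 (the `get_blocked_task` step) and continue that work rather than picking a new ticket.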

Verified locally

All 3 tools appear as lap-session-task_* in agent toolset ✓
list_blocked_tasks returns blocked checkpoints from prior sessions ✓
save_task_progress writes to DB and GET task_checkpoint reads it back ✓
Agent calls list_blocked_tasks unprompted when asked to pick a GitHub issue ✓

Test plan

Deploy inline harness image (new session-task-mcp.mjs + updated gen-mcp-config.mjs must be in the Docker image)
Run npx prisma migrate deploy on prod DB (adds task_checkpoint column)
Run tests/agent-task-checkpoint.spec.ts against staging
Manual: ask Shin to pick a ticket, confirm it calls list_blocked_tasks first
Manual: let a session get blocked (sandbox failure), verify next session sees it and resumes

🤖 Generated with Claude Code

…ext session

Adds three MCP tools (save_task_progress, list_blocked_tasks, get_blocked_task) and backing platform routes so agents can persist their work state before giving up, and successor sessions can resume blocked tasks instead of starting from scratch.

- harnesses/opencode/session-task-mcp.mjs: new standalone stdio MCP server exposing the three tools; follows same env/retry/proxy pattern as report-issue-mcp
- harnesses/opencode/gen-mcp-config.mjs: wire lap-session-task into opencode.json
- prisma/schema.prisma + migration: add task_checkpoint JSONB to managed_agent_session
- POST /sessions/{id}/task_checkpoint: agent writes {summary, status, blocked_reason}
- GET /sessions/{id}/task_checkpoint: read checkpoint for any session (same agent)
- GET /sessions/{id}/blocked_tasks: list other sessions with status=blocked
- tests/agent-task-checkpoint.spec.ts: e2e behavioral tests

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>

greptile-apps · 2026-05-27T01:49:25Z

Greptile Summary

This PR introduces a task-checkpointing system for managed agents: a new stdio MCP server (session-task-mcp.mjs) exposes three tools for saving and retrieving task progress, two new API routes persist and read checkpoint state from a new task_checkpoint JSONB column, and an E2E test validates the behavior end-to-end.

MCP server + routes: save_task_progress, list_blocked_tasks, and get_blocked_task are wired to POST/GET /sessions/{id}/task_checkpoint and GET /sessions/{id}/blocked_tasks. Auth correctly scopes reads and writes to the owning agent.
DB migration: Additive JSONB column on managed_agent_session, safe to deploy and roll back.
Core gap: When a new session picks up a blocked task, the old session's task_checkpoint.status is never updated from "blocked", because save_task_progress always writes to the caller's own session_id. Every future session will continue seeing the old blocked entry in list_blocked_tasks, leading to repeated duplicate pickup attempts with no way to suppress them short of a manual DB update.

Confidence Score: 3/5

The blocked-task handoff mechanism has a persistent-state bug that will cause every subsequent session to re-attempt already-claimed work, and the E2E test will fail immediately on any environment without pre-provisioning a specific agent.

Once a new session picks up a blocked task, the prior session's checkpoint remains in state 'blocked' indefinitely. Every future session for that agent will discover the same stale entry, attempt to resume it, and produce duplicate or conflicting work. There is no MCP tool path that allows the new session to update the old session's record. The E2E test also relies on a hardcoded agent UUID that will not exist outside the original dev environment, so the test plan cannot be executed without additional setup steps that are not yet documented.

src/app/api/v1/managed_agents/sessions/[session_id]/blocked_tasks/route.ts (stale-blocked logic) and tests/agent-task-checkpoint.spec.ts (hardcoded agent UUID) need attention before this is ready to ship.

Important Files Changed

Filename	Overview
src/app/api/v1/managed_agents/sessions/[session_id]/blocked_tasks/route.ts	New route returning blocked checkpoints for the same agent; has a logic gap where picked-up sessions are never cleared from the list, and ordering by created_at is less useful than ordering by last_seen_at
src/app/api/v1/managed_agents/sessions/[session_id]/task_checkpoint/route.ts	New POST/GET routes for writing and reading task checkpoints; auth is correct; Zod validation is missing a refine constraint to require blocked_reason when status=blocked
harnesses/opencode/session-task-mcp.mjs	New stdio MCP server exposing three task-checkpoint tools; retry logic and token refresh are solid; save_task_progress always writes to the caller's own session, leaving no mechanism to clear a prior session's blocked status
harnesses/opencode/gen-mcp-config.mjs	Adds lap-session-task MCP entry guarded by the same issueBase+issueAccess check as the existing lap-issue-reporter; no issues
prisma/migrations/20260526000001_add_task_checkpoint/migration.sql	Adds nullable JSONB column task_checkpoint to managed_agent_session; additive-only migration, safe to deploy and roll back
prisma/schema.prisma	Adds task_checkpoint Json? field to Session model with an inline comment; matches migration
tests/agent-task-checkpoint.spec.ts	E2E behavioral tests for checkpoint tools; AGENT_ID falls back to a hardcoded UUID that won't exist in any environment except the original dev setup, causing immediate failures on staging or CI

Reviews (1): Last reviewed commit: "feat(session): add task checkpointing so..."

greptile-apps · 2026-05-27T01:49:29Z

+  const rows = await prisma.session.findMany({
+    where: {
+      agent_id: sessionRow.agent_id,
+      session_id: { not: session_id },
+      task_checkpoint: {
+        path: ["status"],
+        equals: "blocked",
+      },
+    },
+    select: {
+      session_id: true,
+      task_checkpoint: true,
+      // Fall back to session updated_at for ordering when checkpoint updated_at is unavailable.
+      last_seen_at: true,
+      created_at: true,
+    },
+    orderBy: { created_at: "desc" },
+    take: 10,
+  });


Blocked tasks stay visible after pickup — persistent duplicate resumption

When a new session (S2) picks up a blocked task from S1, it calls save_task_progress on its own session_id (S2). S1's task_checkpoint.status remains "blocked" forever because neither the MCP tool nor any API operation updates S1's record. Every subsequent session will also see S1 in list_blocked_tasks and attempt to resume the same work, with no way to know S2 is already handling it.

The MCP tool callSaveTaskProgress always writes to process.env.SESSION_ID || input.session_id (the caller's current session), so there is no code path today that clears the prior session's blocked status after pickup.
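
One possible way to close this gap is to let the successor session claim the prior checkpoint, so later sessions no longer see it as available. The sketch below is a hypothetical in-memory illustration, not part of the PR: the `picked_up_by` field and the `claimBlockedTask` and `listUnclaimedBlocked` helpers are assumptions made for this example.

```typescript
// Hypothetical fix sketch: a successor claims a blocked checkpoint so others skip it.
type Status = "in_progress" | "blocked" | "complete";

interface Checkpoint {
  summary: string;
  status: Status;
  blocked_reason?: string;
  picked_up_by?: string; // hypothetical field, not in the PR
}

const checkpoints = new Map<string, Checkpoint>([
  ["S1", { summary: "Migrate auth", status: "blocked", blocked_reason: "sandbox failure" }],
]);

// Returns true if the caller won the claim, false if the task is missing,
// no longer blocked, or already claimed by another session.
function claimBlockedTask(priorSessionId: string, callerSessionId: string): boolean {
  const cp = checkpoints.get(priorSessionId);
  if (!cp || cp.status !== "blocked" || cp.picked_up_by !== undefined) return false;
  cp.picked_up_by = callerSessionId;
  return true;
}

// Lists blocked checkpoints from other sessions that nobody has claimed yet.
function listUnclaimedBlocked(callerSessionId: string): string[] {
  return Array.from(checkpoints.entries())
    .filter(
      ([id, cp]) =>
        id !== callerSessionId && cp.status === "blocked" && cp.picked_up_by === undefined,
    )
    .map(([id]) => id);
}

const s2Claimed = claimBlockedTask("S1", "S2"); // S2 picks up the work
const s3Claimed = claimBlockedTask("S1", "S3"); // S3 arrives later and is refused
const visibleToS3 = listUnclaimedBlocked("S3");
console.log(s2Claimed, s3Claimed, visibleToS3);
```

Against a real database the claim would need to be a single conditional update so that two sessions starting at the same time cannot both win; the in-memory version sidesteps that concern.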

greptile-apps · 2026-05-27T01:49:30Z

+const CheckpointBody = z.object({
+  summary: z.string().min(1),
+  status: z.enum(["in_progress", "blocked", "complete"]),
+  blocked_reason: z.string().optional(),
+});


blocked_reason is not enforced when status is "blocked"

The Zod schema marks blocked_reason as optional unconditionally. When an agent sets status="blocked" without providing a reason, the next agent receiving this checkpoint from list_blocked_tasks has no context to determine whether the original blocker is still present. The tool description says "Required when status=blocked" but the server doesn't enforce it, so agents can silently omit it.

Suggested change

-const CheckpointBody = z.object({
-  summary: z.string().min(1),
-  status: z.enum(["in_progress", "blocked", "complete"]),
-  blocked_reason: z.string().optional(),
-});
+const CheckpointBody = z.object({
+  summary: z.string().min(1),
+  status: z.enum(["in_progress", "blocked", "complete"]),
+  blocked_reason: z.string().optional(),
+}).refine(
+  (data) => data.status !== "blocked" || (typeof data.blocked_reason === "string" && data.blocked_reason.length > 0),
+  { message: "blocked_reason is required when status is 'blocked'", path: ["blocked_reason"] },
+);
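
To show which payloads the suggested refinement would accept, here is a dependency-free sketch that applies the same predicate to a few sample inputs. `hasRequiredBlockedReason` and the sample data are hypothetical and written only for this example; the route itself would rely on the Zod schema.

```typescript
type Status = "in_progress" | "blocked" | "complete";

interface CheckpointInput {
  summary: string;
  status: Status;
  blocked_reason?: string;
}

// Same condition as the suggested .refine(): a blocked checkpoint needs a non-empty reason.
function hasRequiredBlockedReason(data: CheckpointInput): boolean {
  return (
    data.status !== "blocked" ||
    (typeof data.blocked_reason === "string" && data.blocked_reason.length > 0)
  );
}

const samples: CheckpointInput[] = [
  { summary: "Halfway through refactor", status: "in_progress" }, // reason not needed
  { summary: "Cannot run tests", status: "blocked" }, // reason missing
  { summary: "Cannot run tests", status: "blocked", blocked_reason: "" }, // reason empty
  { summary: "Cannot run tests", status: "blocked", blocked_reason: "sandbox failure" },
];

const results = samples.map(hasRequiredBlockedReason);
console.log(results); // values in order: true, false, false, true
```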

greptile-apps · 2026-05-27T01:49:31Z

+      last_seen_at: true,
+      created_at: true,
+    },
+    orderBy: { created_at: "desc" },


Ordering by created_at instead of checkpoint's updated_at may surface stale tasks first

Sessions are sorted by when they were created, not by when they were blocked. A session created last week that became blocked yesterday would appear after a session created more recently but blocked days earlier. Ordering by last_seen_at (already selected) is a better approximation than created_at.

Suggested change

-    orderBy: { created_at: "desc" },
+    orderBy: { last_seen_at: "desc" },
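
The difference between the two orderings can be seen on a small sample. The sketch below sorts two hypothetical sessions in memory; the session IDs and timestamps are invented for illustration, and plain array sorting stands in for the Prisma `orderBy` clause.

```typescript
interface BlockedSession {
  session_id: string;
  created_at: Date;
  last_seen_at: Date;
}

const sessions: BlockedSession[] = [
  // Created a week before the PR date, but still active (and blocked) yesterday.
  {
    session_id: "old-session",
    created_at: new Date("2026-05-19T00:00:00Z"),
    last_seen_at: new Date("2026-05-26T00:00:00Z"),
  },
  // Created more recently, but last active three days ago.
  {
    session_id: "new-session",
    created_at: new Date("2026-05-23T00:00:00Z"),
    last_seen_at: new Date("2026-05-23T06:00:00Z"),
  },
];

// Sort a copy in descending order of the chosen timestamp and return the IDs.
function idsByDesc(key: "created_at" | "last_seen_at"): string[] {
  return sessions
    .slice()
    .sort((a, b) => b[key].getTime() - a[key].getTime())
    .map((s) => s.session_id);
}

const byCreated = idsByDesc("created_at"); // new-session first
const byLastSeen = idsByDesc("last_seen_at"); // old-session first
console.log(byCreated, byLastSeen);
```

With `created_at`, the most recently blocked work (old-session) lands second; with `last_seen_at` it comes first, which is closer to the order in which tasks actually became blocked.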

greptile-apps · 2026-05-27T01:49:32Z

+  process.env.CHECKPOINT_TEST_AGENT_ID ?? "9cbb91a6-e66d-43c5-92ed-68a570429527";
+
+const TURN_TIMEOUT_MS = 90_000;


Hardcoded agent UUID will fail in every non-dev environment

AGENT_ID falls back to a hardcoded UUID when CHECKPOINT_TEST_AGENT_ID is not set. spawnAndWait calls POST /agents/{AGENT_ID}/session, which will return a 404 on any staging or CI environment that doesn't have this specific agent. The test plan mentions running against staging, but the test will immediately fail there unless CHECKPOINT_TEST_AGENT_ID is explicitly configured — and there is no documentation of this requirement in the test file or the test plan.
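
One way to make the requirement explicit is to fail fast with a descriptive error when the variable is missing, rather than falling back to a UUID that exists only in one environment. The helper below is a hypothetical sketch, not the PR's code; `resolveAgentId` is a name introduced for this example, and it takes the environment as a parameter so it can be exercised without a real process environment.

```typescript
// Hypothetical helper: require the agent ID explicitly instead of using a dev-only fallback.
function resolveAgentId(env: Record<string, string | undefined>): string {
  const id = env["CHECKPOINT_TEST_AGENT_ID"];
  if (id === undefined || id.trim() === "") {
    throw new Error(
      "CHECKPOINT_TEST_AGENT_ID must be set to an agent that exists in the target environment",
    );
  }
  return id.trim();
}

// With the variable set, the configured ID is used as-is.
const configuredId = resolveAgentId({ CHECKPOINT_TEST_AGENT_ID: "agent-on-staging" });

// Without it, the test setup stops with a clear message instead of a later 404.
let missingMessage = "";
try {
  resolveAgentId({});
} catch (err) {
  missingMessage = (err as Error).message;
}
console.log(configuredId, missingMessage);
```

In the spec file this would be called once at module load, for example as `resolveAgentId(process.env)`, so a misconfigured run fails before any session is spawned.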

krrish-berri-2 wants to merge 1 commit into main from worktree-session-checkpointing
