Open
Commits
51 commits
57ba9d8
docs: add codex durability research
May 14, 2026
2ad7f36
Plan Codex turn-completion durability
May 14, 2026
635fadd
Address Codex durability plan review gaps
May 14, 2026
7c87679
Tighten Codex durability plan edge cases
May 14, 2026
3b32457
Fix MCP instruction template syntax
May 14, 2026
eeb42f6
Add Codex durability proof and store
May 14, 2026
2de71b0
Observe Codex turns through remote proxy
May 14, 2026
ff837b4
Launch fresh Codex without pre-durable resume
May 14, 2026
548eb23
Persist Codex candidate before input
May 14, 2026
139453f
Promote Codex sessions after rollout proof
May 14, 2026
dc56171
Repair captured Codex reopen deterministically
May 14, 2026
ebfbea8
Fail closed when Codex candidate capture fails
May 14, 2026
ad64bf4
Recover durable Codex PTY exits
May 14, 2026
1c2ff3b
Stabilize Codex durable session restore
May 14, 2026
05549b9
Harden Codex durable restore promotion
May 14, 2026
6cd072c
Address Codex durable restore review gaps
May 14, 2026
b840575
Close Codex durability review gaps
May 14, 2026
a267fcb
Harden Codex durability failure paths
May 14, 2026
10fa789
Harden Codex candidate restore surfaces
May 14, 2026
a6dcaf2
Address Codex durability review gaps
May 14, 2026
f2d2af9
Reap stale Codex durability records
May 14, 2026
3ba7ee5
Preserve Codex durability in tab registry copies
May 14, 2026
42b6f8d
Block failed Codex identity reuse
May 14, 2026
77a7c95
Preserve Codex candidates from sidebar
May 14, 2026
091a454
Repair Codex proof reopen surfaces
May 14, 2026
d7b91c8
Persist Codex proof repair state
May 14, 2026
c0497f9
Complete Codex durability review fixes
May 14, 2026
087da3f
Enforce exact Codex live candidate repair
May 14, 2026
b8a0101
Update agent CLI smoke input stub
May 14, 2026
ff93d02
Extend Codex candidate capture deadline
May 14, 2026
634b53d
Allow Codex startup terminal reports through gate
May 14, 2026
2afe515
Preserve Codex proxy text frames
May 14, 2026
170b4be
Fix Codex gated prompt and blocked input feedback
May 14, 2026
273e374
Centralize Codex restore create decisions
May 15, 2026
0482f3d
docs: record Codex restore decision centralization lesson
May 15, 2026
7565ccc
Reject raw Codex resume ids consistently
May 15, 2026
7152a34
Bridge Codex restore state across server restart
May 15, 2026
b141d5e
Remove Codex resume loaded-thread polling
May 15, 2026
1efde54
Constrain Codex restore store evidence
May 15, 2026
915a2ed
Harden Codex durability restore edge cases
May 15, 2026
ed66bbe
Disable Codex apps for managed launches
May 16, 2026
a61f97f
Clarify Codex sidecar reaper blockers
May 16, 2026
41fbc5d
docs: capture Codex durability bridge invariant
May 16, 2026
1546890
fix: stabilize codex app server contracts
May 16, 2026
4a6c377
test: capture process exits in contract harness
May 16, 2026
1877e2d
fix: retry codex wrapper identity proof
May 16, 2026
73f2803
test: log managed codex app server argv
May 18, 2026
33c3de6
test: cover codex launch retry and legacy sidecar cleanup
May 18, 2026
7354303
test: remove stale provisional codex tab assertions
May 18, 2026
89db259
test: avoid duplicate Codex turn notification coverage
May 18, 2026
c2c30a2
test: use valid Claude session UUID
May 18, 2026
44 changes: 31 additions & 13 deletions docs/lab-notes/2026-04-20-coding-cli-session-contract.md
@@ -1,6 +1,6 @@
# Coding CLI Session Contract Lab Note

This note records the real-binary provider probes rerun on `2026-04-26` inside `/home/user/code/freshell/.worktrees/trycycle-codex-session-resilience`. Binary version facts were refreshed on `2026-05-03` inside `/home/user/code/freshell/.worktrees/land-local-main-codex-sidecar-lifecycle`.
This note records the real-binary provider probes rerun on `2026-04-26` inside `/home/user/code/freshell/.worktrees/trycycle-codex-session-resilience`. Binary version facts were refreshed on `2026-05-14` inside `/home/user/code/freshell/.worktrees/codex-stability-implementation-20260514`.

The implementation plan file is dated `2026-04-19` because the design work was written the day before. This note is dated `2026-04-26` because the real-provider contracts were re-proved on the implementation machine on that date, and that verification date is the one Freshell is allowed to build on.

@@ -9,7 +9,7 @@ The implementation plan file is dated `2026-04-19` because the design work was w
{
"capturedOn": "2026-04-26",
"planCreatedOn": "2026-04-19",
"dateReason": "The plan was drafted on 2026-04-19, but the checked-in note is dated 2026-04-26 because that is when the durable behavior contract was re-proved on the implementation machine and the earlier 2026-04-23 contract capture was superseded by the newer provider behavior. Binary version facts were refreshed on 2026-05-03 after the installed provider versions changed.",
"dateReason": "The plan was drafted on 2026-04-19, but the checked-in note is dated 2026-04-26 because that is when the durable behavior contract was re-proved on the implementation machine and the earlier 2026-04-23 contract capture was superseded by the newer provider behavior. Binary version facts were refreshed on 2026-05-14 after the installed provider versions changed.",
"cleanup": {
"liveProcessAuditCommand": "ps -eo pid,ppid,stat,cmd --sort=pid | rg \"codex|claude|opencode\"",
"ownershipReportFields": [
@@ -37,7 +37,7 @@ The implementation plan file is dated `2026-04-19` because the design work was w
"codex": {
"executable": "codex",
"resolvedPath": "/home/user/.npm-global/bin/codex",
"version": "codex-cli 0.128.0",
"version": "codex-cli 0.130.0",
"freshRemoteBootstrapCommand": "codex --remote <ws>",
"freshRemoteBootstrapEventsBeforeUserTurn": [
"connection",
@@ -60,8 +60,11 @@ The implementation plan file is dated `2026-04-19` because the design work was w
],
"remoteResumeBootstrapFollowupMethods": [
"account/rateLimits/read",
"command/exec",
"hooks/list",
"skills/list",
"skills/list"
"skills/list",
"thread/goal/get"
],
"freshRemoteAllocatesThreadBeforeUserTurn": true,
"shellSnapshotGlob": ".codex/shell_snapshots/*.sh",
@@ -81,7 +84,7 @@ The implementation plan file is dated `2026-04-19` because the design work was w
"executable": "claude",
"resolvedPath": "/home/user/bin/claude",
"isolatedBinaryPath": "/home/user/.local/bin/claude",
"version": "2.1.126 (Claude Code)",
"version": "2.1.140 (Claude Code)",
"exactIdCommandTemplate": "HOME=<temp-home> /home/user/.local/bin/claude --bare --dangerously-skip-permissions -p --session-id <uuid> <prompt>",
"namedResumeCommandTemplate": "HOME=<temp-home> /home/user/.local/bin/claude --bare --dangerously-skip-permissions -p --resume <title-or-uuid> [--name <title>] <prompt>",
"transcriptGlob": ".claude/projects/*/<uuid>.jsonl",
@@ -94,14 +97,15 @@ The implementation plan file is dated `2026-04-19` because the design work was w
"opencode": {
"executable": "opencode",
"resolvedPath": "/home/user/.opencode/bin/opencode",
"version": "1.14.33",
"version": "1.14.50",
"runCommandTemplate": "opencode run <prompt> --format json --dangerously-skip-permissions",
"serveCommandTemplate": "opencode serve --hostname 127.0.0.1 --port <port>",
"globalHealthPath": "/global/health",
"sessionStatusPath": "/session/status",
"canonicalIdentity": "session-id",
"runEventSessionIdMatchesDbId": true,
"busyStatusUsesAuthoritativeSessionId": true,
"attachFormatJsonEmitsEvents": false,
"titleOnResumeMutatesStoredTitle": false,
"sessionSubcommands": [
"list",
@@ -138,10 +142,10 @@ command -v codex
# /home/user/.npm-global/bin/codex

codex --version
# codex-cli 0.128.0
# codex-cli 0.130.0
```

This 2026-05-03 version refresh supersedes the older `codex-cli 0.125.0` capture. The current version of record on this machine is `codex-cli 0.128.0`.
This 2026-05-14 version refresh supersedes the older `codex-cli 0.128.0` capture. The current version of record on this machine is `codex-cli 0.130.0`.

Fresh remote bootstrap was probed with a loopback websocket stub and:

@@ -160,7 +164,7 @@ Before any user turn, the CLI opened a connection and issued:

That proves fresh `codex --remote` allocates a thread during bootstrap, before the first user turn, but that thread allocation is not yet the durable contract Freshell may persist.

The remote resume form was re-proved through a websocket proxy in front of the real app-server. Before any user turn, `codex --remote <ws> --no-alt-screen resume <sessionId>` issued the stable prefix through `thread/resume`, and then the follow-up `skills/list` and `account/rateLimits/read` calls. The trailing post-resume follow-up order was observed to vary between reruns on the same binary, so only the stable prefix plus the required follow-up method set is treated as contract.
The remote resume form was re-proved through a websocket proxy in front of the real app-server. Before any user turn, `codex --remote <ws> --no-alt-screen resume <sessionId>` issued the stable prefix through `thread/resume`, and then the follow-up `account/rateLimits/read`, `command/exec`, `hooks/list`, `skills/list`, and `thread/goal/get` calls. The trailing post-resume follow-up order was observed to vary between reruns on the same binary, so only the stable prefix plus the required follow-up method set is treated as contract.
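
The "stable prefix plus required follow-up method set" rule can be illustrated with a minimal sketch. The helper and type names below are hypothetical (not Freshell code), and the prefix contents are a placeholder because this note only states that the prefix runs through `thread/resume`. The sketch also assumes that extra follow-up methods beyond the required set are tolerated.

```typescript
// Hypothetical contract check; names are illustrative, not Freshell's actual code.
interface ResumeBootstrapContract {
  // Methods that must appear first, in exactly this order.
  stablePrefix: readonly string[];
  // Methods that must all appear after the prefix, in any order.
  requiredFollowups: readonly string[];
}

// Follow-up method set as listed in this note for codex-cli 0.130.0.
const codexResumeFollowups: readonly string[] = [
  "account/rateLimits/read",
  "command/exec",
  "hooks/list",
  "skills/list",
  "thread/goal/get",
];

function matchesResumeBootstrap(
  observed: readonly string[],
  contract: ResumeBootstrapContract,
): boolean {
  const { stablePrefix, requiredFollowups } = contract;
  if (observed.length < stablePrefix.length) return false;

  // The prefix is order-sensitive.
  const prefixOk = stablePrefix.every((method, i) => observed[i] === method);
  if (!prefixOk) return false;

  // The tail is compared as a set, so rerun-to-rerun reordering is accepted.
  const tail = new Set(observed.slice(stablePrefix.length));
  return requiredFollowups.every((method) => tail.has(method));
}
```

Comparing the tail as a set keeps a probe from failing when the same binary emits the follow-up calls in a different order on a later rerun.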

Real provider-owned durability was re-proved against the app-server websocket with:

@@ -229,6 +233,19 @@ Allowed Freshell behavior:
- Freshell may only persist canonical Codex identity after the durable `.jsonl` artifact exists at the provider-reported `thread.path`.
- Freshell must not treat the bootstrap `thread/start` id as durable restore identity.
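
The two rules above amount to a gate in front of persistence. A minimal sketch follows, assuming a hypothetical `ThreadReport` shape and an injected `fileExists` predicate; none of these names come from the Freshell codebase, and the `.jsonl` suffix check is an illustrative reading of "durable `.jsonl` artifact".

```typescript
// Hypothetical shapes for illustration only.
interface ThreadReport {
  threadId: string;
  // Provider-reported thread.path; null while no durable artifact is reported.
  path: string | null;
}

type PersistDecision =
  | { persist: true; threadId: string; rolloutPath: string }
  | { persist: false; reason: string };

function decideCodexPersist(
  report: ThreadReport,
  fileExists: (path: string) => boolean,
): PersistDecision {
  // A bootstrap thread id without a reported path is never durable identity.
  if (report.path === null || report.path === "") {
    return { persist: false, reason: "provider reported no thread.path" };
  }
  if (!report.path.endsWith(".jsonl")) {
    return { persist: false, reason: "thread.path is not a .jsonl artifact" };
  }
  // The artifact must actually exist before identity is persisted.
  if (!fileExists(report.path)) {
    return { persist: false, reason: "durable artifact not on disk yet" };
  }
  return { persist: true, threadId: report.threadId, rolloutPath: report.path };
}
```

Injecting `fileExists` keeps the gate testable without touching the filesystem; a caller could pass Node's `fs.existsSync` in production.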

### 2026-05-14 Codex restore decision addendum

The `da2e0076` refactor added a design constraint that belongs with the provider contract: deterministic Codex restore needs one typed create/restore decision path, not only a correct rollout proof reader. Restore-like entry points must make the same decision about canonical `sessionRef`, captured candidate proof, live attach after proof failure, fresh create, and legacy raw resume. Keeping those choices local to each caller risks separate restore semantics.

Design-level change recorded from `/home/user/code/freshell/.worktrees/codex-stability-implementation-20260514`: `/home/user/code/freshell/.worktrees/codex-stability-implementation-20260514/server/coding-cli/codex-app-server/restore-decision.ts` now owns `planCodexCreateRestoreDecision` and `resolveCodexCreateRestoreDecision`, and `/home/user/code/freshell/.worktrees/codex-stability-implementation-20260514/server/ws-handler.ts` routes Codex `terminal.create` and reopen handling through it. This is a narrow centralization, not a claim that every surface is done.

Follow-up constraints:

- Move exact live-candidate matching into the central module or make its typed input contract require enough live candidate identity for the module to verify `candidateThreadId` and `rolloutPath`.
- Remove or replace `legacy_raw_resume_passthrough`; raw resume should not remain a durable restore identity path.
- Extend the same decision path to REST, MCP, CLI, and any future restore-like surface instead of maintaining parallel semantics.
- Add surface matrix tests so coverage proves all entry points use the same restore decisions, not just the decision module and the current websocket route.
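
To make the "one typed decision path" idea concrete, here is a minimal sketch of a discriminated-union decision that every surface could share. It is not the actual `planCodexCreateRestoreDecision` or `resolveCodexCreateRestoreDecision` signature, which this note does not show. All type names, field names, and the precedence order are assumptions for illustration, and raw resume ids are modelled as rejected, in line with the follow-up constraint above.

```typescript
// Hypothetical types; not the real restore-decision.ts contract.
interface CandidateEvidence {
  candidateThreadId: string;
  rolloutPath: string;
  proofOk: boolean;        // rollout proof verified for this exact candidate
  liveAttachable: boolean; // a live terminal still owns this exact candidate
}

interface RestoreRequest {
  sessionRef?: string;
  candidate?: CandidateEvidence;
  rawResumeId?: string;
}

type RestoreDecision =
  | { kind: "restore_canonical"; sessionRef: string }
  | { kind: "restore_candidate"; candidateThreadId: string; rolloutPath: string }
  | { kind: "attach_live"; candidateThreadId: string }
  | { kind: "reject_raw_resume"; rawResumeId: string }
  | { kind: "fresh_create" };

// Assumed precedence: canonical ref, then proven candidate, then live attach
// after proof failure, then raw-resume rejection, then fresh create.
function sketchRestoreDecision(req: RestoreRequest): RestoreDecision {
  if (req.sessionRef) {
    return { kind: "restore_canonical", sessionRef: req.sessionRef };
  }
  const c = req.candidate;
  if (c && c.proofOk) {
    return {
      kind: "restore_candidate",
      candidateThreadId: c.candidateThreadId,
      rolloutPath: c.rolloutPath,
    };
  }
  if (c && c.liveAttachable) {
    return { kind: "attach_live", candidateThreadId: c.candidateThreadId };
  }
  if (req.rawResumeId) {
    return { kind: "reject_raw_resume", rawResumeId: req.rawResumeId };
  }
  return { kind: "fresh_create" };
}

// Every surface funnels through the same function, so semantics cannot drift.
type Surface = "websocket" | "rest" | "mcp" | "cli";
function decideForSurface(_surface: Surface, req: RestoreRequest): RestoreDecision {
  return sketchRestoreDecision(req);
}
```

A surface matrix test in this style would call `decideForSurface` once per surface with the same request and assert that the decisions are identical.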

## Claude

Version and binaries:
@@ -238,7 +255,7 @@ command -v claude
# /home/user/bin/claude

claude --version
# 2.1.126 (Claude Code)
# 2.1.140 (Claude Code)
```

The wrapper at `/home/user/bin/claude` shells out to `/home/user/.local/bin/claude`. The isolated probes used the actual binary and overrode `HOME` to keep persistence inside the probe temp root.
@@ -287,7 +304,7 @@ command -v opencode
# /home/user/.opencode/bin/opencode

opencode --version
# 1.14.33
# 1.14.50
```

Fresh isolated runs were probed with:
@@ -312,9 +329,10 @@ curl http://127.0.0.1:<port>/session/status

Observed control behavior:

- `/global/health` returned a healthy payload with version `1.14.33`.
- `/global/health` returned a healthy payload with version `1.14.50`.
- `/session/status` returned `{}` while idle.
- During an attached `opencode run ... --attach http://127.0.0.1:<port>`, `/session/status` returned the same authoritative `sessionID` with `{ "type": "busy" }`.
- During an attached `opencode run ... --attach http://127.0.0.1:<port>`, `/session/status` returned an authoritative `sessionID` with `{ "type": "busy" }`, and the same id was persisted as a `session.id` row in the isolated OpenCode database.
- On OpenCode `1.14.50`, attached `opencode run ... --attach ... --format json` exited successfully but emitted no JSON event lines on stdout, so attached-mode identity must come from `/session/status` plus the persisted database row rather than attached-run stdout.
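
The attached-mode rule (identity from `/session/status` plus the persisted database row, not from stdout) can be sketched as below. The payload shape is an assumption: this note only shows `{}` while idle and a `sessionID` paired with `{ "type": "busy" }`, so the sketch treats the body as a map from session id to a status object. The function names are hypothetical, and requiring exactly one busy session is an illustrative choice rather than observed provider behavior.

```typescript
// Assumed payload shape: { "<sessionID>": { "type": "busy" } }, and {} while idle.
function busySessionIds(status: unknown): string[] {
  if (typeof status !== "object" || status === null || Array.isArray(status)) {
    return [];
  }
  const ids: string[] = [];
  for (const [sessionId, value] of Object.entries(status as Record<string, unknown>)) {
    if (
      typeof value === "object" &&
      value !== null &&
      (value as { type?: unknown }).type === "busy"
    ) {
      ids.push(sessionId);
    }
  }
  return ids;
}

// Accept an identity only when the busy id also exists as a persisted session row.
function confirmAttachedIdentity(
  status: unknown,
  persistedSessionIds: readonly string[],
): string | null {
  const busy = busySessionIds(status);
  const [only] = busy;
  if (busy.length !== 1 || only === undefined) return null;
  return persistedSessionIds.includes(only) ? only : null;
}
```

Cross-checking the two sources means a busy id that was never written to the database is not promoted to durable identity.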

Title semantics were probed with:
