|
| 1 | +# Plan: Agent Harness Abstraction |
| 2 | + |
| 3 | +## Metadata |
| 4 | +- **Specification**: `codev/specs/591-af-workspace-failure-with-code.md` |
| 5 | + |
| 6 | +## Executive Summary |
| 7 | + |
| 8 | +Implement an extensible agent harness system that replaces hardcoded Claude-specific `--append-system-prompt` flags with per-harness role injection. Three phases: (1) create the harness module with built-in providers and config parsing, (2) refactor all call sites to use it, (3) add integration tests. |
| 9 | + |
| 10 | +## Phases (Machine Readable) |
| 11 | + |
| 12 | +```json |
| 13 | +{ |
| 14 | + "phases": [ |
| 15 | + {"id": "harness-module", "title": "Harness Provider Module + Config"}, |
| 16 | + {"id": "call-site-refactor", "title": "Call Site Refactoring"}, |
| 17 | + {"id": "integration-tests", "title": "Integration Tests"} |
| 18 | + ] |
| 19 | +} |
| 20 | +``` |
| 21 | + |
| 22 | +## Phase Breakdown |
| 23 | + |
| 24 | +### Phase 1: Harness Provider Module + Config |
| 25 | +**Dependencies**: None |
| 26 | + |
| 27 | +#### Objectives |
| 28 | +- Create the harness provider module with interface, built-in providers, custom config parsing, and resolution function |
| 29 | +- Add harness config fields to the UserConfig type system |
| 30 | + |
| 31 | +#### Deliverables |
| 32 | +- `packages/codev/src/agent-farm/utils/harness.ts` — new file |
| 33 | +- `packages/codev/src/agent-farm/types.ts` — updated UserConfig |
| 34 | +- `packages/codev/src/lib/config.ts` — updated CodevConfig + custom harness validation at load time |
| 35 | +- `packages/codev/src/agent-farm/__tests__/harness.test.ts` — new test file |
| 36 | + |
| 37 | +#### Implementation Details |
| 38 | + |
| 39 | +**`harness.ts`** — New module containing: |
| 40 | + |
| 41 | +1. `HarnessProvider` interface with `buildRoleInjection()` and `buildScriptRoleInjection()` |
| 42 | +2. Built-in provider objects: `CLAUDE_HARNESS`, `CODEX_HARNESS`, `GEMINI_HARNESS` |
| 43 | +3. `BUILTIN_HARNESSES` registry map: `{ claude, codex, gemini }` |
| 44 | +4. `buildCustomHarnessProvider(config)` — constructs a HarnessProvider from a custom config definition, expanding `${ROLE_FILE}` and `${ROLE_CONTENT}` template variables |
| 45 | +5. `resolveHarness(harnessName: string | undefined, userConfig: UserConfig)` — resolves a harness name to a provider: |
| 46 | + - `undefined` → returns `CLAUDE_HARNESS` (default) |
| 47 | + - matches built-in → returns built-in |
| 48 | + - matches key in `userConfig.harness` → builds custom provider |
| 49 | + - otherwise → throws descriptive error |
| 50 | + |
| 51 | +**`types.ts`** — Add to `UserConfig`: |
| 52 | +```typescript |
| 53 | +shell?: { |
| 54 | + architect?: string | string[]; |
| 55 | + architectHarness?: string; // NEW |
| 56 | + builder?: string | string[]; |
| 57 | + builderHarness?: string; // NEW |
| 58 | + shell?: string | string[]; |
| 59 | +}; |
| 60 | +harness?: Record<string, { // NEW |
| 61 | + roleArgs: string[]; |
| 62 | + roleEnv?: Record<string, string>; |
| 63 | + roleScriptFragment: string; |
| 64 | + roleScriptEnv?: Record<string, string>; |
| 65 | +}>; |
| 66 | +``` |
| 67 | + |
| 68 | +**`lib/config.ts`** — Add same fields to `CodevConfig`: |
| 69 | +```typescript |
| 70 | +shell?: { |
| 71 | + architect?: string | string[]; |
| 72 | + architectHarness?: string; // NEW |
| 73 | + builder?: string | string[]; |
| 74 | + builderHarness?: string; // NEW |
| 75 | + shell?: string | string[]; |
| 76 | +}; |
| 77 | +harness?: Record<string, { ... }>; // NEW — same shape as UserConfig |
| 78 | +``` |
| 79 | + |
| 80 | +Add a `validateHarnessConfig()` function called during `loadConfig()` that checks: |
| 81 | +- Each custom harness entry has required fields (`roleArgs`, `roleScriptFragment`) |
| 82 | +- `roleArgs` is an array of strings |
| 83 | +- `roleScriptFragment` is a string |
| 84 | +If validation fails, throw a descriptive error at config load time (per spec: "missing required fields produce a descriptive error at config load time"). |
| 85 | + |
| 86 | +#### Acceptance Criteria |
| 87 | +- `resolveHarness('claude')` returns claude provider |
| 88 | +- `resolveHarness('codex')` returns codex provider with `-c model_instructions_file=<path>` args |
| 89 | +- `resolveHarness('gemini')` returns gemini provider with `GEMINI_SYSTEM_MD` env var |
| 90 | +- `resolveHarness(undefined)` defaults to claude |
| 91 | +- `resolveHarness('nonexistent')` throws clear error |
| 92 | +- Custom harness from config correctly expands `${ROLE_FILE}` and `${ROLE_CONTENT}` |
| 93 | +- Missing required fields in custom harness throw descriptive error |
| 94 | + |
| 95 | +#### Test Plan |
| 96 | +- **Unit Tests**: Test each built-in provider's `buildRoleInjection()` and `buildScriptRoleInjection()` outputs. Test resolution logic (built-in, custom, default, error). Test template variable expansion. Test custom harness validation (missing required fields). |
| 97 | + |
| 98 | +--- |
| 99 | + |
| 100 | +### Phase 2: Call Site Refactoring |
| 101 | +**Dependencies**: Phase 1 |
| 102 | + |
| 103 | +#### Objectives |
| 104 | +- Replace all 5 hardcoded `--append-system-prompt` locations with harness provider calls |
| 105 | +- Update `buildArchitectArgs()` return type and its 3 callers |
| 106 | +- Fix side issues (deprecated Codex flag, "Claude exited" message) |
| 107 | + |
| 108 | +#### Deliverables |
| 109 | +- `packages/codev/src/agent-farm/commands/spawn-worktree.ts` — updated |
| 110 | +- `packages/codev/src/agent-farm/commands/architect.ts` — updated |
| 111 | +- `packages/codev/src/agent-farm/servers/tower-utils.ts` — updated |
| 112 | +- `packages/codev/src/agent-farm/servers/tower-terminals.ts` — updated (2 call sites) |
| 113 | +- `packages/codev/src/agent-farm/servers/tower-instances.ts` — updated (1 call site) |
| 114 | +- `packages/codev/src/commands/consult/index.ts` — side fix |
| 115 | +- `packages/codev/src/agent-farm/utils/config.ts` — add harness resolution helpers |
| 116 | +- `packages/codev/src/agent-farm/__tests__/spawn-worktree.test.ts` — updated |
| 117 | +- `packages/codev/src/agent-farm/__tests__/af-architect.test.ts` — updated |
| 118 | +- `packages/codev/src/commands/consult/__tests__/codex-sdk.test.ts` — updated (`experimental_instructions_file` → `model_instructions_file`) |
| 119 | + |
| 120 | +#### Implementation Details |
| 121 | + |
| 122 | +**`spawn-worktree.ts`** — Two functions to update: |
| 123 | + |
| 124 | +1. `startBuilderSession()` (line 597): Import `resolveHarness`. Get builder harness from config. Call `provider.buildScriptRoleInjection(roleContent, roleFile)`. Generate `export` lines from returned `env`. Replace hardcoded `--append-system-prompt "$(cat '${roleFile}')"` with the harness fragment. Change "Claude exited" to "Agent exited" in restart message. |
| 125 | + |
| 126 | +2. `buildWorktreeLaunchScript()` (line 668): Same pattern. Both functions need the harness name passed in (from config) or resolved internally. |
| 127 | + |
| 128 | +**`architect.ts`** (line 29): Write role to `<workspaceRoot>/.architect-role.md` (align with `tower-utils.ts`). Call `resolveHarness(harnessName).buildRoleInjection(role.content, roleFilePath)`. Spread returned args into spawn args. Merge returned env into spawn env. |
| 129 | + |
| 130 | +**`tower-utils.ts`** (line 175): `buildArchitectArgs()` return type changes from `string[]` to `{ args: string[]; env: Record<string, string> }`. Resolve harness, call `buildRoleInjection()`, return combined args and env. |
| 131 | + |
| 132 | +**Caller updates** — Three callers of `buildArchitectArgs()`: |
| 133 | +- `tower-terminals.ts:536` — destructure `{ args, env }`, pass env to session creation |
| 134 | +- `tower-terminals.ts:725` — same pattern for resume |
| 135 | +- `tower-instances.ts:377` — same pattern for workspace launch |
| 136 | + |
| 137 | +**Config helpers** — Add `getArchitectHarness()` and `getBuilderHarness()` to `agent-farm/utils/config.ts` that read `architectHarness`/`builderHarness` from UserConfig and call `resolveHarness()`. These helpers are the single point where harness resolution happens — callers (`startBuilderSession()`, `buildWorktreeLaunchScript()`, `architect()`, `buildArchitectArgs()`) call the helper rather than resolving inline, avoiding duplication. |
| 138 | + |
| 139 | +**Consult side fix** — `consult/index.ts:383`: Change `experimental_instructions_file` to `model_instructions_file` in the Codex SDK config. |
| 140 | + |
| 141 | +#### Acceptance Criteria |
| 142 | +- Claude workflows unchanged (regression-safe) |
| 143 | +- No remaining references to `--append-system-prompt` except inside `CLAUDE_HARNESS` provider |
| 144 | +- `buildArchitectArgs()` returns `{ args, env }` and all callers handle it |
| 145 | +- `architect.ts` writes role to `.architect-role.md` |
| 146 | +- "Agent exited" message in restart loop |
| 147 | +- `consult` uses `model_instructions_file` for Codex |
| 148 | + |
| 149 | +#### Test Plan |
| 150 | +- **Unit Tests**: Update `spawn-worktree.test.ts` to test with claude harness (regression) and verify no hardcoded `--append-system-prompt` outside the provider. Update `af-architect.test.ts` similarly. |
| 151 | +- **Manual Testing**: If possible, verify `afx workspace start` with claude config still works. |
| 152 | + |
| 153 | +--- |
| 154 | + |
| 155 | +### Phase 3: Integration Tests |
| 156 | +**Dependencies**: Phase 2 |
| 157 | + |
| 158 | +#### Objectives |
| 159 | +- Add comprehensive integration tests verifying all call sites produce correct commands for each harness type |
| 160 | +- Verify custom harness config end-to-end |
| 161 | + |
| 162 | +#### Deliverables |
| 163 | +- `packages/codev/src/agent-farm/__tests__/harness-integration.test.ts` — new test file |
| 164 | + |
| 165 | +#### Implementation Details |
| 166 | + |
| 167 | +Integration tests that verify the final output of each call site for each harness type: |
| 168 | + |
| 169 | +1. **`buildWorktreeLaunchScript()` integration**: For each harness (claude, codex, gemini, custom), verify the generated bash script contains the correct role injection fragment and env exports. |
| 170 | + |
| 171 | +2. **`buildArchitectArgs()` integration**: For each harness, verify the returned args and env are correct. For claude: `--append-system-prompt` with content. For codex: `-c model_instructions_file=<path>`. For gemini: env `GEMINI_SYSTEM_MD`. |
| 172 | + |
| 173 | +3. **Custom harness end-to-end**: Define a custom harness in mock config, resolve it, verify template expansion works correctly with `${ROLE_FILE}` and `${ROLE_CONTENT}`. |
| 174 | + |
| 175 | +4. **Error cases**: Unknown harness name, missing required fields in custom harness config. |
| 176 | + |
| 177 | +5. **No-role behavior**: Verify all call sites skip harness injection when role is `null`. |
| 178 | + |
| 179 | +#### Acceptance Criteria |
| 180 | +- All test scenarios from spec section "Test Scenarios" (1-8) have corresponding test cases |
| 181 | +- Tests pass for all 3 built-in harnesses + custom harness + error cases |
| 182 | +- No-role behavior tested |
| 183 | + |
| 184 | +#### Test Plan |
| 185 | +- **Integration Tests**: Parameterized tests across harness types for each call site function. |
| 186 | + |
| 187 | +## Risk Assessment |
| 188 | + |
| 189 | +| Risk | Impact | Mitigation | |
| 190 | +|------|--------|------------| |
| 191 | +| `tower-terminals.ts` env propagation may not reach PTY sessions | High | Verify how `createPtySession()` handles env vars; may need to update PTY creation | |
| 192 | +| `architect.ts` `shell: true` spawn with multiline role content | Medium | Existing issue — not introduced by this change. Document for future fix. | |
| 193 | +| Codex `model_instructions_file` flag may change again | Low | Verified against current Codex v0.117.0; easy to update | |
0 commit comments