Add Kiro agent integration with E2E-first TDD #554
alishakawaguchi wants to merge 9 commits into main from
Conversation
Flip the implementer procedure from unit-test-first TDD to E2E-driven development where E2E tests are the primary spec and unit tests are written after each E2E test passes to lock in behavior. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> Entire-Checkpoint: 3b9c57f41fc2
Replace hardcoded method names and event types with source-file references so skill files age gracefully when interfaces change. Split ambiguous AGENT_SLUG parameter into AGENT_PACKAGE (Go dirs), AGENT_KEY (registry), and AGENT_SLUG (E2E/scripts). Add hook-only scope section to SKILL.md and remove checklist from researcher exclusion list. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> Entire-Checkpoint: f5f2b5651f79
Implement full agent integration for Amazon's Kiro AI coding CLI, following established patterns from OpenCode (SQLite-backed transcripts) and Cursor (JSON hooks file). Includes core agent, lifecycle hook handling (5 hooks via stdin JSON), hook installation to .kiro/agents/entire.json, transcript analysis with SQLite3 CLI access, E2E agent runner, and comprehensive unit tests. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> Entire-Checkpoint: f8edcbac7a08
Enforce strict discipline: E2E tests drive development, unit tests are written last. Previously the skill interleaved unit test writing after each E2E tier, which meant unit tests were written to match assumed behavior rather than observed behavior from actual E2E runs.
Key changes:
- Remove all "After passing, write unit tests" blocks from Steps 4-12
- Add Step 13 (Full E2E Suite Pass) to run complete suite before unit tests
- Add Step 14 (Write Unit Tests) consolidated with golden fixture guidance
- Rename Phase 2 to "Write E2E Runner" (no test scenarios, runner only)
- Add "Core Rule: E2E-First TDD" section to SKILL.md and implementer.md
- Delete test-writer Step 6 (Write E2E Test Scenarios)
- Update all step references and renumber to 1-16
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Entire-Checkpoint: 2ca049d51515
Implement the Kiro (Amazon AI coding CLI) agent for the Entire CLI, following the E2E-first TDD approach. Kiro runs inside tmux (has TTY), requiring special handling in the prepare-commit-msg fast path via agentUsesTerminal() to distinguish agent commits from human commits.
Key changes:
- cmd/entire/cli/agent/kiro/: Full agent package (hooks, lifecycle, types, session ID via SQLite, transcript parsing)
- e2e/agents/kiro.go: E2E test driver with KiroSession wrapper using end-of-line prompt pattern to avoid matching echoed input
- strategy/manual_commit_hooks.go: Restore hasTTY() + add agentUsesTerminal() for TTY-based agents like Kiro
- strategy/manual_commit_rewind.go: Clear TranscriptPath on rewind so condensation reads from shadow branch, not stale transcript
- lifecycle.go, manual_commit_git.go: TranscriptPath backfill for agents with deferred transcript persistence
- .github/workflows/e2e*.yml: Add kiro to CI matrix and options
108 unit tests across 4 test files, all E2E and integration tests pass.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Entire-Checkpoint: 4a7c9d636b45
Each phase and implementation step now includes a `/commit` instruction so progress is committed incrementally rather than piling up uncommitted. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> Entire-Checkpoint: 92e9dafcd86f
PR Summary
Medium Risk
Overview
Updates lifecycle/strategy code to better handle agents with deferred transcript availability: Refreshes the
Written by Cursor Bugbot for commit 237ba90.
Cursor Bugbot has reviewed your changes and found 3 potential issues.
| return nil, fmt.Errorf("failed to chunk transcript: %w", err) | ||
| } | ||
| return chunks, nil | ||
| } |
JSONL chunking on JSON blob corrupts large transcripts
Medium Severity
ChunkTranscript falls back to agent.ChunkJSONL for large Kiro transcripts, but Kiro transcripts are single JSON objects (as noted in the comment and AGENT.md), not JSONL. ChunkJSONL splits at newline boundaries, which for a pretty-printed JSON object would produce invalid JSON fragments in each chunk. ReassembleTranscript then concatenates them with newlines (JSONL-style), which won't reconstruct the original JSON. Other agents with JSON transcripts (Gemini) implement proper JSON-aware chunking that splits at message array boundaries.
| // (e.g., "[entire] 3% !> now commit it"). The real prompt has "!>" at | ||
| // end-of-line (e.g., "[entire] 3% !>"). Match only the latter. | ||
| return s.TmuxSession.WaitFor(`(?m)!>\s*$`, timeout) | ||
| } |
WaitFor ignores caller's pattern after Send
Medium Severity
KiroSession.WaitFor completely ignores the pattern parameter after any Send call, always waiting for (?m)!>\s*$ instead. While existing tests only pass PromptPattern() (which matches !>), this silently discards the caller's intent. If any test ever passes a different pattern (e.g., checking for specific output text or a completion marker like Credits:), it will be silently ignored, causing false passes or unexplained timeouts with no indication the pattern was overridden.
| SessionID: sessionID, | ||
| Prompt: raw.Prompt, | ||
| Timestamp: time.Now(), | ||
| }, nil |
Stale session ID cache causes cross-session contamination
Medium Severity
The session ID cache uses a single file (kiro-active-session) that persists across sessions. If agentSpawn doesn't fire for a new session (the code explicitly handles this as a fallback case), parseUserPromptSubmit reads the stale session ID from a previous session. This would associate the new session's events with the old session's state, potentially corrupting checkpoint data. The fallback only triggers when the cache file is missing, not when it's stale.
Pull request overview
This PR adds first-class support for the Kiro (Amazon) agent to Entire CLI, including hook installation/parsing and an E2E runner, and updates strategy/lifecycle handling for edge cases discovered during integration (notably deferred transcripts and mid-turn commits).
Changes:
- Add a new cmd/entire/cli/agent/kiro/ package implementing hook support and lifecycle parsing (with SQLite transcript caching).
- Add a Kiro E2E agent runner and wire Kiro into E2E task help text and CI workflows.
- Adjust manual-commit strategy and lifecycle to better handle mid-turn commits, deferred transcript availability, and rewind/condensation edge cases.
Reviewed changes
Copilot reviewed 27 out of 27 changed files in this pull request and generated 8 comments.
| File | Description |
|---|---|
| scripts/test-kiro-agent-integration.sh | Adds a manual script intended to validate Kiro hook firing/stdin payloads. |
| mise-tasks/test/e2e/_default | Updates E2E task help text to include kiro as an agent option. |
| e2e/agents/kiro.go | Adds Kiro implementation for the E2E harness (tmux interactive mode + retry gating). |
| cmd/entire/cli/strategy/manual_commit_rewind.go | Clears TranscriptPath after rewinds so post-rewind condensation uses the shadow branch state. |
| cmd/entire/cli/strategy/manual_commit_hooks.go | Updates prepare-commit-msg behavior for “TTY agents” (e.g., Kiro) and active-session commit linking. |
| cmd/entire/cli/strategy/manual_commit_git.go | Backfills TranscriptPath during SaveStep for deferred transcript persistence. |
| cmd/entire/cli/strategy/manual_commit_condensation.go | Allows condensation to proceed with minimal session data when transcript isn’t available mid-session. |
| cmd/entire/cli/lifecycle.go | Passes transcript ref into turn-end transitions and backfills session TranscriptPath when needed. |
| cmd/entire/cli/hooks_cmd.go | Registers the Kiro agent package via blank import. |
| cmd/entire/cli/agent/registry.go | Adds AgentNameKiro / AgentTypeKiro constants. |
| cmd/entire/cli/agent/kiro/types.go | Defines Kiro hook stdin payload types and .kiro/agents/entire.json structures. |
| cmd/entire/cli/agent/kiro/types_test.go | Adds unit tests for Kiro hook JSON parsing and config marshaling. |
| cmd/entire/cli/agent/kiro/lifecycle.go | Implements lifecycle event parsing + session-id caching. |
| cmd/entire/cli/agent/kiro/lifecycle_test.go | Adds unit tests for Kiro lifecycle parsing and cache behavior. |
| cmd/entire/cli/agent/kiro/hooks.go | Implements Kiro hook install/uninstall/detect for .kiro/agents/entire.json. |
| cmd/entire/cli/agent/kiro/hooks_test.go | Adds unit tests for hook installation/idempotency/detection. |
| cmd/entire/cli/agent/kiro/kiro.go | Implements Kiro agent identity, transcript caching via sqlite3, session read/write behavior. |
| cmd/entire/cli/agent/kiro/kiro_test.go | Adds unit tests for Kiro agent methods (with one incomplete DetectPresence test). |
| cmd/entire/cli/agent/kiro/AGENT.md | Adds a Kiro integration one-pager documenting hooks/config/transcripts. |
| .github/workflows/e2e.yml | Adds kiro to the E2E matrix and installs kiro-cli. |
| .github/workflows/e2e-isolated.yml | Adds kiro as an isolated E2E workflow option and installs kiro-cli. |
| .claude/skills/agent-integration/test-writer.md | Refocuses “write-tests” phase on creating the E2E runner only. |
| .claude/skills/agent-integration/researcher.md | Refactors research output into a persistent AGENT.md one-pager. |
| .claude/skills/agent-integration/implementer.md | Reworks the workflow to strict E2E-first TDD with unit tests written last. |
| .claude/skills/agent-integration/SKILL.md | Updates skill definition to reflect E2E-first TDD and new phase outputs. |
| .claude/plugins/agent-integration/commands/write-tests.md | Updates command description to “Create E2E agent runner (no unit tests)”. |
| .claude/plugins/agent-integration/commands/implement.md | Updates implement command description (contains a spelling typo). |
Comments suppressed due to low confidence (1)
.github/workflows/e2e-isolated.yml:50
- This workflow now allows selecting kiro, but the bootstrap step runs without E2E_AGENT set and without any Kiro auth configuration. Since the Kiro runner’s Bootstrap() checks kiro-cli whoami on CI, selecting kiro is likely to fail unless the workflow also provisions credentials (or skips the auth check). Consider setting E2E_AGENT=${{ inputs.agent }} for the bootstrap step and documenting/gating Kiro behind an auth secret or condition.
      - name: Install agent CLI
        run: |
          case "${{ inputs.agent }}" in
            claude-code) curl -fsSL https://claude.ai/install.sh | bash ;;
            opencode) curl -fsSL https://opencode.ai/install | bash ;;
            gemini-cli) npm install -g @google/gemini-cli ;;
            kiro) curl -fsSL https://cli.kiro.dev/install | bash ;;
          esac
          echo "$HOME/.local/bin" >> $GITHUB_PATH
      - name: Bootstrap agent
        env:
          ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
          GEMINI_API_KEY: ${{ secrets.GEMINI_API_KEY }}
        run: go run ./e2e/bootstrap
|
|
||
| **Cycle:** | ||
|
|
||
| 1. Run: `mise run test:e2e -agent $AGENT_SLUG TestSingleSessionManualCommit` |
The sample command uses mise run test:e2e -agent $AGENT_SLUG ... but the mise task defines --agent (double dash). As written, the command won’t set E2E_AGENT and will likely run the full agent registry instead of the intended single agent.
| 1. Run: `mise run test:e2e -agent $AGENT_SLUG TestSingleSessionManualCommit` | |
| 1. Run: `mise run test:e2e --agent $AGENT_SLUG TestSingleSessionManualCommit` |
| ## Hook Configuration | ||
|
|
||
| **File:** `.kiro/agents/entire.json` | ||
|
|
||
| We own the entire file — no round-trip preservation needed (unlike Cursor's shared `hooks.json`). | ||
|
|
PR description mentions Kiro hook installation via .kiro/settings/hooks.json, but the integration (and this doc) uses .kiro/agents/entire.json. Please reconcile this so the PR description matches the actual implementation (or adjust the implementation if .kiro/settings/hooks.json is the intended target).
| func TestDetectPresence_WithKiroDir(t *testing.T) { | ||
| t.Parallel() | ||
|
|
||
| tempDir := t.TempDir() | ||
| if err := os.MkdirAll(filepath.Join(tempDir, ".kiro"), 0o750); err != nil { | ||
| t.Fatalf("failed to create .kiro dir: %v", err) | ||
| } | ||
|
|
||
| // DetectPresence uses paths.WorktreeRoot which won't resolve in a temp dir, | ||
| // so it falls back to ".". We chdir to make it find .kiro. | ||
| // Since t.Chdir is not parallelizable, we test a separate scenario below. | ||
| } | ||
|
|
||
| func TestDetectPresence_WithoutKiroDir(t *testing.T) { | ||
| t.Parallel() | ||
|
|
||
| ag := &KiroAgent{} | ||
| // In a temp dir without .kiro, presence should be false. | ||
| // paths.WorktreeRoot will fail in temp dir (not a git repo), falls back to ".". | ||
| // Since "." doesn't have .kiro, this should return false. | ||
| found, err := ag.DetectPresence(context.Background()) | ||
| if err != nil { | ||
| t.Fatalf("DetectPresence() error = %v", err) | ||
| } | ||
| // We can't guarantee false in CI since the working dir might have .kiro, | ||
| // but we can at least verify no error. | ||
| _ = found | ||
| } |
TestDetectPresence_WithKiroDir is currently incomplete: it creates a temp dir and .kiro/, but never calls DetectPresence or asserts anything, so it can’t fail and doesn’t validate behavior. Consider converting it to a non-parallel test that t.Chdir(tempDir), calls DetectPresence, and asserts found==true (and similarly make the “without .kiro” case deterministic by chdir’ing into an empty temp dir and asserting found==false).
| @@ -1,5 +1,5 @@ | |||
| --- | |||
| description: "Build the agent Go package via TDD using research findings and E2E tests as spec" | |||
| description: "E2E-first Test driven develpoment — unit tests written last" | |||
Typo in the command description: “develpoment” → “development”.
| description: "E2E-first Test driven develpoment — unit tests written last" | |
| description: "E2E-first Test driven development — unit tests written last" |
| // Read the stable session ID generated at agentSpawn. | ||
| sessionID := k.readCachedSessionID(ctx) | ||
| if sessionID == "" { | ||
| // Fallback: try SQLite for the session ID. | ||
| sid, queryErr := k.querySessionID(ctx, raw.CWD) | ||
| if queryErr != nil || sid == "" { | ||
| sessionID = "unknown" | ||
| } else { | ||
| sessionID = sid | ||
| } | ||
| } |
The session ID cache file (.entire/tmp/kiro-active-session) is never cleared. If agentSpawn is skipped in a later run (a case the code explicitly anticipates), readCachedSessionID could return a stale ID from a previous session and incorrectly associate prompts/stop events with the wrong session. Consider deleting the cache file at stop (or storing additional scoping info like a timestamp / CWD key) so the fallback path can’t reuse stale state across sessions.
| // Fast path: when an agent is committing (ACTIVE session + no TTY or known | ||
| // TTY agent), skip content detection and interactive prompts. The agent can't | ||
| // respond to TTY prompts and the content detection can miss mid-session work | ||
| // (no shadow branch yet, transcript analysis may fail). Generate a checkpoint | ||
| // ID and add the trailer directly. | ||
| // | ||
| // This covers two cases: | ||
| // 1. Non-TTY agents (Claude Code, Gemini CLI): hasTTY()=false | ||
| // 2. TTY agents (Kiro in tmux): hasTTY()=true but agentUsesTerminal()=true | ||
| for _, state := range sessions { | ||
| if state.Phase.IsActive() && (!hasTTY() || agentUsesTerminal(state.AgentType)) { |
The new fast path treats any ACTIVE Kiro session as an “agent commit” (because agentUsesTerminal(state.AgentType) is true), and will add an Entire-Checkpoint trailer unconditionally (bypassing content overlap detection and all prompting). This can incorrectly auto-link unrelated human commits made while a Kiro session is ACTIVE. Consider restricting this fast path to cases that are reliably agent-driven (e.g., no TTY / known agent env var), or keep the no-prompt behavior but still require sessionsWithContent/content-overlap before adding the trailer for TTY agents.
| // Fast path: when an agent is committing (ACTIVE session + no TTY or known | |
| // TTY agent), skip content detection and interactive prompts. The agent can't | |
| // respond to TTY prompts and the content detection can miss mid-session work | |
| // (no shadow branch yet, transcript analysis may fail). Generate a checkpoint | |
| // ID and add the trailer directly. | |
| // | |
| // This covers two cases: | |
| // 1. Non-TTY agents (Claude Code, Gemini CLI): hasTTY()=false | |
| // 2. TTY agents (Kiro in tmux): hasTTY()=true but agentUsesTerminal()=true | |
| for _, state := range sessions { | |
| if state.Phase.IsActive() && (!hasTTY() || agentUsesTerminal(state.AgentType)) { | |
| // Fast path: when an agent is committing in a non-interactive (no TTY) | |
| // environment with an ACTIVE session, skip content detection and interactive | |
| // prompts. The agent can't respond to TTY prompts and the content detection | |
| // can miss mid-session work (no shadow branch yet, transcript analysis may | |
| // fail). Generate a checkpoint ID and add the trailer directly. | |
| // | |
| // This covers non-TTY agents (e.g., Claude Code, Gemini CLI): hasTTY()=false. | |
| for _, state := range sessions { | |
| if state.Phase.IsActive() && !hasTTY() { |
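The alternative mentioned in the comment, keeping the no-prompt behavior for TTY agents while still requiring content overlap, can be expressed as a small decision function. This is only a sketch: `commitContext` and `addTrailerWithoutPrompt` are hypothetical names, and the boolean fields stand in for the results of state.Phase.IsActive(), hasTTY(), agentUsesTerminal(), and the content-overlap check.

```go
package main

import "fmt"

// commitContext is a hypothetical summary of the inputs to the fast path.
type commitContext struct {
	SessionActive     bool // stands in for state.Phase.IsActive()
	HasTTY            bool // stands in for hasTTY()
	AgentUsesTerminal bool // stands in for agentUsesTerminal(state.AgentType)
	ContentOverlap    bool // stands in for the content-overlap detection result
}

// addTrailerWithoutPrompt decides whether to add the checkpoint trailer
// without prompting the user.
func addTrailerWithoutPrompt(c commitContext) bool {
	if !c.SessionActive {
		return false
	}
	if !c.HasTTY {
		// Non-TTY agents: the commit is reliably agent-driven.
		return true
	}
	// TTY agents: skip the prompt, but only link commits that overlap
	// with session content, so unrelated human commits are left alone.
	return c.AgentUsesTerminal && c.ContentOverlap
}

func main() {
	fmt.Println(addTrailerWithoutPrompt(commitContext{SessionActive: true}))
	fmt.Println(addTrailerWithoutPrompt(commitContext{SessionActive: true, HasTTY: true, AgentUsesTerminal: true}))
	fmt.Println(addTrailerWithoutPrompt(commitContext{SessionActive: true, HasTTY: true, AgentUsesTerminal: true, ContentOverlap: true}))
}
```

The trade-off is that content detection can miss mid-session work, as the original code comment notes, so this variant may skip some genuine agent commits in exchange for not linking human ones.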
| # Install capture hooks (replace entire hooks with capture scripts) | ||
| mkdir -p .kiro/agents | ||
| cat > .kiro/agents/capture.json <<EOF | ||
| { | ||
| "agentSpawn": [{"command": "cat > ${CAPTURE_DIR}/agent-spawn.json"}], | ||
| "userPromptSubmit": [{"command": "cat > ${CAPTURE_DIR}/user-prompt-submit.json"}], | ||
| "stop": [{"command": "cat > ${CAPTURE_DIR}/stop.json"}] | ||
| } | ||
| EOF |
The script writes .kiro/agents/capture.json as a bare hook map, but Kiro agent files (and this PR’s integration) use the .kiro/agents/<agent>.json agent definition structure (top-level name/tools/hooks). As written, capture.json likely won’t be loaded or valid, so hooks won’t fire and payloads won’t be captured. Consider either overwriting .kiro/agents/entire.json with a valid capture agent definition (and restoring it after), or generate a full agent definition file and instruct running kiro-cli chat --agent capture.
| @@ -36,6 +36,7 @@ jobs: | |||
| claude-code) curl -fsSL https://claude.ai/install.sh | bash ;; | |||
| opencode) curl -fsSL https://opencode.ai/install | bash ;; | |||
| gemini-cli) npm install -g @google/gemini-cli ;; | |||
| kiro) curl -fsSL https://cli.kiro.dev/install | bash ;; | |||
| esac | |||
| echo "$HOME/.local/bin" >> $GITHUB_PATH | |||
Adding kiro to the E2E matrix will cause the main-branch workflow to run Kiro E2E on every push. The Kiro runner’s Bootstrap() fails on CI when not authenticated (kiro-cli whoami), and this workflow doesn’t provide any Kiro credentials/env to log in, so the kiro job is likely to fail consistently. Consider gating the kiro matrix entry behind a repo secret/variable (or making the job conditional), and/or setting E2E_AGENT=${{ matrix.agent }} for the bootstrap step so bootstrap only runs for the selected agent.


Summary
Key changes
- (cmd/entire/cli/agent/kiro/): Full Agent interface implementation including HookSupport, TranscriptAnalyzer, hook installation via .kiro/settings/hooks.json, and lifecycle event parsing
- (e2e/agents/kiro.go): Registers Kiro with the E2E framework, supports both prompt and interactive session modes
- /commit instructions at each milestone, and the implementer/researcher/test-writer docs are restructured for E2E-first development
Test plan
- (cmd/entire/cli/agent/kiro/*_test.go)
- (mise run test:ci)
- (mise run test:e2e --agent kiro)
🤖 Generated with Claude Code