Add Kiro agent integration with E2E-first TDD#554

Open
alishakawaguchi wants to merge 9 commits into main from alisha/kiro-oneshot

@alishakawaguchi alishakawaguchi commented Feb 28, 2026

Summary

  • Add Kiro (AWS IDE agent) integration: hook installation, lifecycle event parsing, transcript analysis, and session management
  • Restructure agent-integration skill for E2E-first TDD workflow with tiered test progression
  • Add commit steps to each skill phase so progress is committed incrementally
  • Add Kiro E2E test runner and CI workflow support

Key changes

  • New agent package (cmd/entire/cli/agent/kiro/): Full Agent interface implementation including HookSupport, TranscriptAnalyzer, hook installation via .kiro/settings/hooks.json, and lifecycle event parsing
  • E2E runner (e2e/agents/kiro.go): Registers Kiro with the E2E framework, supports both prompt and interactive session modes
  • Skill improvements: Agent-integration skill phases now include /commit instructions at each milestone, and the implementer/researcher/test-writer docs are restructured for E2E-first development
  • Strategy fixes: Minor fixes to condensation, hooks, and rewind for edge cases discovered during Kiro integration

Test plan

  • Unit tests for hooks, lifecycle, types, transcript, and agent registration (cmd/entire/cli/agent/kiro/*_test.go)
  • Integration tests pass (mise run test:ci)
  • E2E tests with real Kiro agent (mise run test:e2e --agent kiro)

🤖 Generated with Claude Code

alishakawaguchi and others added 9 commits February 26, 2026 15:35

Flip the implementer procedure from unit-test-first TDD to E2E-driven
development where E2E tests are the primary spec and unit tests are
written after each E2E test passes to lock in behavior.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Entire-Checkpoint: 3b9c57f41fc2

Replace hardcoded method names and event types with source-file references
so skill files age gracefully when interfaces change. Split ambiguous
AGENT_SLUG parameter into AGENT_PACKAGE (Go dirs), AGENT_KEY (registry),
and AGENT_SLUG (E2E/scripts). Add hook-only scope section to SKILL.md
and remove checklist from researcher exclusion list.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Entire-Checkpoint: f5f2b5651f79

Implement full agent integration for Amazon's Kiro AI coding CLI,
following established patterns from OpenCode (SQLite-backed transcripts)
and Cursor (JSON hooks file). Includes core agent, lifecycle hook
handling (5 hooks via stdin JSON), hook installation to
.kiro/agents/entire.json, transcript analysis with SQLite3 CLI access,
E2E agent runner, and comprehensive unit tests.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Entire-Checkpoint: f8edcbac7a08

Enforce strict discipline: E2E tests drive development, unit tests are
written last. Previously the skill interleaved unit test writing after
each E2E tier, which meant unit tests were written to match assumed
behavior rather than observed behavior from actual E2E runs.

Key changes:
- Remove all "After passing, write unit tests" blocks from Steps 4-12
- Add Step 13 (Full E2E Suite Pass) to run complete suite before unit tests
- Add Step 14 (Write Unit Tests) consolidated with golden fixture guidance
- Rename Phase 2 to "Write E2E Runner" (no test scenarios, runner only)
- Add "Core Rule: E2E-First TDD" section to SKILL.md and implementer.md
- Delete test-writer Step 6 (Write E2E Test Scenarios)
- Update all step references and renumber to 1-16

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Entire-Checkpoint: 2ca049d51515

Implement the Kiro (Amazon AI coding CLI) agent for the Entire CLI,
following the E2E-first TDD approach. Kiro runs inside tmux (has TTY),
requiring special handling in the prepare-commit-msg fast path via
agentUsesTerminal() to distinguish agent commits from human commits.

Key changes:
- cmd/entire/cli/agent/kiro/: Full agent package (hooks, lifecycle,
  types, session ID via SQLite, transcript parsing)
- e2e/agents/kiro.go: E2E test driver with KiroSession wrapper using
  end-of-line prompt pattern to avoid matching echoed input
- strategy/manual_commit_hooks.go: Restore hasTTY() + add
  agentUsesTerminal() for TTY-based agents like Kiro
- strategy/manual_commit_rewind.go: Clear TranscriptPath on rewind
  so condensation reads from shadow branch, not stale transcript
- lifecycle.go, manual_commit_git.go: TranscriptPath backfill for
  agents with deferred transcript persistence
- .github/workflows/e2e*.yml: Add kiro to CI matrix and options

108 unit tests across 4 test files, all E2E and integration tests pass.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Entire-Checkpoint: 4a7c9d636b45

Each phase and implementation step now includes a `/commit` instruction
so progress is committed incrementally rather than piling up uncommitted.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Entire-Checkpoint: 92e9dafcd86f
@alishakawaguchi alishakawaguchi requested a review from a team as a code owner February 28, 2026 00:12
Copilot AI review requested due to automatic review settings February 28, 2026 00:12
cursor bot commented Feb 28, 2026

PR Summary

Medium Risk
Adds a new agent integration and adjusts lifecycle/strategy behavior around transcript paths and commit-msg auto-linking, which can affect checkpoint linking and session state across agents. CI/E2E matrix changes also broaden execution surface and could expose flakiness or regressions in hook handling.

Overview
Adds first-class support for the Kiro agent, including hook installation to .kiro/agents/entire.json, lifecycle event parsing (agent-spawn, user-prompt-submit, stop), session/transcript caching, and SQLite-based transcript lookup (via sqlite3 CLI). Includes a comprehensive unit test suite for the new agent and a new E2E runner e2e/agents/kiro.go that drives Kiro via tmux/interactive mode.

Updates lifecycle/strategy code to better handle agents with deferred transcript availability: transitionSessionTurnEnd can now backfill TranscriptPath, condensation no longer hard-fails when mid-session commits lack a live transcript, commit-step handling backfills transcript path when it appears later, commit-msg auto-linking is extended to TTY-based agents (Kiro), and rewinds clear TranscriptPath to prefer shadow-branch extraction.
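The TranscriptPath handling described above could look roughly like the following. This is a minimal sketch, not the PR's code: `sessionState`, `backfillTranscriptPath`, `clearOnRewind`, and `transcriptSource` are hypothetical names, and the rule "only fill the path when it is empty" is an assumption.

```go
package main

import "fmt"

// sessionState is a pared-down, hypothetical stand-in for the CLI's session state.
type sessionState struct {
	TranscriptPath string
}

// backfillTranscriptPath records ref only when the session has no path yet and
// the agent has finally reported one. It reports whether the state changed.
func backfillTranscriptPath(s *sessionState, ref string) bool {
	if s.TranscriptPath != "" || ref == "" {
		return false
	}
	s.TranscriptPath = ref
	return true
}

// clearOnRewind drops the path so later reads fall back to the shadow branch.
func clearOnRewind(s *sessionState) {
	s.TranscriptPath = ""
}

// transcriptSource names where condensation would read from in this sketch.
func transcriptSource(s *sessionState) string {
	if s.TranscriptPath == "" {
		return "shadow-branch"
	}
	return "live-transcript"
}

func main() {
	s := &sessionState{}
	// Turn ends before the agent has persisted a transcript: nothing to backfill.
	fmt.Println(backfillTranscriptPath(s, ""), transcriptSource(s))
	// A later turn end carries the path: backfill it once.
	fmt.Println(backfillTranscriptPath(s, "/tmp/kiro/transcript.json"), transcriptSource(s))
	// A rewind clears the path again.
	clearOnRewind(s)
	fmt.Println(transcriptSource(s))
}
```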

Refreshes the .claude agent-integration skill docs to enforce E2E-first TDD (unit tests last) with tiered E2E progression and incremental commit checkpoints, and expands CI/workflows + mise E2E tooling to include kiro.

Written by Cursor Bugbot for commit 237ba90.

@cursor cursor bot left a comment

Cursor Bugbot has reviewed your changes and found 3 potential issues.

		return nil, fmt.Errorf("failed to chunk transcript: %w", err)
	}
	return chunks, nil
}

JSONL chunking on JSON blob corrupts large transcripts

Medium Severity

ChunkTranscript falls back to agent.ChunkJSONL for large Kiro transcripts, but Kiro transcripts are single JSON objects (as noted in the comment and AGENT.md), not JSONL. ChunkJSONL splits at newline boundaries, which for a pretty-printed JSON object would produce invalid JSON fragments in each chunk. ReassembleTranscript then concatenates them with newlines (JSONL-style), which won't reconstruct the original JSON. Other agents with JSON transcripts (Gemini) implement proper JSON-aware chunking that splits at message array boundaries.
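A JSON-aware alternative to newline splitting might look like the sketch below. It assumes the transcript is a single object with a top-level `messages` array, which is a guess at the shape and not the confirmed Kiro schema. `chunkJSON` and `reassembleJSON` are hypothetical names, and any other top-level fields are dropped in this simplified version.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// transcript is an assumed shape: one JSON object holding a "messages" array.
type transcript struct {
	Messages []json.RawMessage `json:"messages"`
}

// chunkJSON cuts only between messages, so every chunk is valid JSON on its own.
// maxBytes bounds the raw message bytes per chunk; a single oversized message
// still gets a chunk to itself.
func chunkJSON(data []byte, maxBytes int) ([][]byte, error) {
	var t transcript
	if err := json.Unmarshal(data, &t); err != nil {
		return nil, fmt.Errorf("failed to parse transcript: %w", err)
	}
	var chunks [][]byte
	var cur []json.RawMessage
	size := 0
	flush := func() error {
		if len(cur) == 0 {
			return nil
		}
		b, err := json.Marshal(transcript{Messages: cur})
		if err != nil {
			return fmt.Errorf("failed to encode chunk: %w", err)
		}
		chunks = append(chunks, b)
		cur, size = nil, 0
		return nil
	}
	for _, m := range t.Messages {
		if size+len(m) > maxBytes {
			if err := flush(); err != nil {
				return nil, err
			}
		}
		cur = append(cur, m)
		size += len(m)
	}
	if err := flush(); err != nil {
		return nil, err
	}
	return chunks, nil
}

// reassembleJSON merges the message arrays back into one transcript object.
func reassembleJSON(chunks [][]byte) ([]byte, error) {
	var out transcript
	for i, c := range chunks {
		var t transcript
		if err := json.Unmarshal(c, &t); err != nil {
			return nil, fmt.Errorf("failed to parse chunk %d: %w", i, err)
		}
		out.Messages = append(out.Messages, t.Messages...)
	}
	return json.Marshal(out)
}

func main() {
	// Pretty-printed input, the case where newline splitting breaks.
	in := []byte("{\n  \"messages\": [\n    {\"id\": 1},\n    {\"id\": 2},\n    {\"id\": 3}\n  ]\n}")
	chunks, err := chunkJSON(in, 10)
	if err != nil {
		fmt.Println("chunk failed:", err)
		return
	}
	for i, c := range chunks {
		fmt.Printf("chunk %d valid=%v: %s\n", i, json.Valid(c), c)
	}
	out, err := reassembleJSON(chunks)
	if err != nil {
		fmt.Println("reassemble failed:", err)
		return
	}
	fmt.Printf("reassembled: %s\n", out)
}
```

The reassembled output is semantically equal to the input but not byte-identical, since whitespace is normalized when the messages are re-encoded.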

	// (e.g., "[entire] 3% !> now commit it"). The real prompt has "!>" at
	// end-of-line (e.g., "[entire] 3% !>"). Match only the latter.
	return s.TmuxSession.WaitFor(`(?m)!>\s*$`, timeout)
}

WaitFor ignores caller's pattern after Send

Medium Severity

KiroSession.WaitFor completely ignores the pattern parameter after any Send call, always waiting for (?m)!>\s*$ instead. While existing tests only pass PromptPattern() (which matches !>), this silently discards the caller's intent. If any test ever passes a different pattern (e.g., checking for specific output text or a completion marker like Credits:), it will be silently ignored, causing false passes or unexplained timeouts with no indication the pattern was overridden.
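One way to keep the echo protection without discarding the caller's pattern is to require both. The sketch below is illustrative only: `waitPatterns` and `allMatch` are hypothetical helpers, and the real `TmuxSession.WaitFor` API is not reproduced here. The end-of-line regular expression is the one quoted in the snippet under review.

```go
package main

import (
	"fmt"
	"regexp"
)

// promptAtEOL is the end-of-line prompt pattern from the reviewed code.
const promptAtEOL = `(?m)!>\s*$`

// waitPatterns returns every pattern that must match before a wait succeeds.
// After a Send, the end-of-line prompt is still required (so echoed input is
// not mistaken for the prompt), but the caller's pattern is kept as well.
func waitPatterns(callerPattern string, sentSinceLastWait bool) []string {
	if !sentSinceLastWait || callerPattern == promptAtEOL {
		return []string{callerPattern}
	}
	return []string{callerPattern, promptAtEOL}
}

// allMatch reports whether every pattern matches the captured pane content.
func allMatch(patterns []string, pane string) (bool, error) {
	for _, p := range patterns {
		re, err := regexp.Compile(p)
		if err != nil {
			return false, fmt.Errorf("invalid pattern %q: %w", p, err)
		}
		if !re.MatchString(pane) {
			return false, nil
		}
	}
	return true, nil
}

func main() {
	echoed := "[entire] 3% !> now commit it"
	done := echoed + "\nCredits: 0.1\n[entire] 3% !>"

	for _, pane := range []string{echoed, done} {
		ok, err := allMatch(waitPatterns(`Credits:`, true), pane)
		if err != nil {
			fmt.Println("error:", err)
			return
		}
		fmt.Printf("ready=%v for pane %q\n", ok, pane)
	}
}
```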

		SessionID: sessionID,
		Prompt:    raw.Prompt,
		Timestamp: time.Now(),
	}, nil

Stale session ID cache causes cross-session contamination

Medium Severity

The session ID cache uses a single file (kiro-active-session) that persists across sessions. If agentSpawn doesn't fire for a new session (the code explicitly handles this as a fallback case), parseUserPromptSubmit reads the stale session ID from a previous session. This would associate the new session's events with the old session's state, potentially corrupting checkpoint data. The fallback only triggers when the cache file is missing, not when it's stale.

Copilot AI left a comment

Pull request overview

This PR adds first-class support for the Kiro (Amazon) agent to Entire CLI, including hook installation/parsing and an E2E runner, and updates strategy/lifecycle handling for edge cases discovered during integration (notably deferred transcripts and mid-turn commits).

Changes:

  • Add a new cmd/entire/cli/agent/kiro/ package implementing hook support and lifecycle parsing (with SQLite transcript caching).
  • Add a Kiro E2E agent runner and wire Kiro into E2E task help text and CI workflows.
  • Adjust manual-commit strategy and lifecycle to better handle mid-turn commits, deferred transcript availability, and rewind/condensation edge cases.

Reviewed changes

Copilot reviewed 27 out of 27 changed files in this pull request and generated 8 comments.

| File | Description |
| --- | --- |
| scripts/test-kiro-agent-integration.sh | Adds a manual script intended to validate Kiro hook firing/stdin payloads. |
| mise-tasks/test/e2e/_default | Updates E2E task help text to include kiro as an agent option. |
| e2e/agents/kiro.go | Adds Kiro implementation for the E2E harness (tmux interactive mode + retry gating). |
| cmd/entire/cli/strategy/manual_commit_rewind.go | Clears TranscriptPath after rewinds so post-rewind condensation uses the shadow branch state. |
| cmd/entire/cli/strategy/manual_commit_hooks.go | Updates prepare-commit-msg behavior for “TTY agents” (e.g., Kiro) and active-session commit linking. |
| cmd/entire/cli/strategy/manual_commit_git.go | Backfills TranscriptPath during SaveStep for deferred transcript persistence. |
| cmd/entire/cli/strategy/manual_commit_condensation.go | Allows condensation to proceed with minimal session data when transcript isn’t available mid-session. |
| cmd/entire/cli/lifecycle.go | Passes transcript ref into turn-end transitions and backfills session TranscriptPath when needed. |
| cmd/entire/cli/hooks_cmd.go | Registers the Kiro agent package via blank import. |
| cmd/entire/cli/agent/registry.go | Adds AgentNameKiro / AgentTypeKiro constants. |
| cmd/entire/cli/agent/kiro/types.go | Defines Kiro hook stdin payload types and .kiro/agents/entire.json structures. |
| cmd/entire/cli/agent/kiro/types_test.go | Adds unit tests for Kiro hook JSON parsing and config marshaling. |
| cmd/entire/cli/agent/kiro/lifecycle.go | Implements lifecycle event parsing + session-id caching. |
| cmd/entire/cli/agent/kiro/lifecycle_test.go | Adds unit tests for Kiro lifecycle parsing and cache behavior. |
| cmd/entire/cli/agent/kiro/hooks.go | Implements Kiro hook install/uninstall/detect for .kiro/agents/entire.json. |
| cmd/entire/cli/agent/kiro/hooks_test.go | Adds unit tests for hook installation/idempotency/detection. |
| cmd/entire/cli/agent/kiro/kiro.go | Implements Kiro agent identity, transcript caching via sqlite3, session read/write behavior. |
| cmd/entire/cli/agent/kiro/kiro_test.go | Adds unit tests for Kiro agent methods (with one incomplete DetectPresence test). |
| cmd/entire/cli/agent/kiro/AGENT.md | Adds a Kiro integration one-pager documenting hooks/config/transcripts. |
| .github/workflows/e2e.yml | Adds kiro to the E2E matrix and installs kiro-cli. |
| .github/workflows/e2e-isolated.yml | Adds kiro as an isolated E2E workflow option and installs kiro-cli. |
| .claude/skills/agent-integration/test-writer.md | Refocuses “write-tests” phase on creating the E2E runner only. |
| .claude/skills/agent-integration/researcher.md | Refactors research output into a persistent AGENT.md one-pager. |
| .claude/skills/agent-integration/implementer.md | Reworks the workflow to strict E2E-first TDD with unit tests written last. |
| .claude/skills/agent-integration/SKILL.md | Updates skill definition to reflect E2E-first TDD and new phase outputs. |
| .claude/plugins/agent-integration/commands/write-tests.md | Updates command description to “Create E2E agent runner (no unit tests)”. |
| .claude/plugins/agent-integration/commands/implement.md | Updates implement command description (contains a spelling typo). |
Comments suppressed due to low confidence (1)

.github/workflows/e2e-isolated.yml:50

  • This workflow now allows selecting kiro, but the bootstrap step runs without E2E_AGENT set and without any Kiro auth configuration. Since the Kiro runner’s Bootstrap() checks kiro-cli whoami on CI, selecting kiro is likely to fail unless the workflow also provisions credentials (or skips the auth check). Consider setting E2E_AGENT=${{ inputs.agent }} for the bootstrap step and documenting/gating Kiro behind an auth secret or condition.
      - name: Install agent CLI
        run: |
          case "${{ inputs.agent }}" in
            claude-code) curl -fsSL https://claude.ai/install.sh | bash ;;
            opencode)    curl -fsSL https://opencode.ai/install | bash ;;
            gemini-cli)  npm install -g @google/gemini-cli ;;
            kiro)        curl -fsSL https://cli.kiro.dev/install | bash ;;
          esac
          echo "$HOME/.local/bin" >> $GITHUB_PATH

      - name: Bootstrap agent
        env:
          ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
          GEMINI_API_KEY: ${{ secrets.GEMINI_API_KEY }}
        run: go run ./e2e/bootstrap


**Cycle:**

1. Run: `mise run test:e2e -agent $AGENT_SLUG TestSingleSessionManualCommit`
Copilot AI Feb 28, 2026

The sample command uses mise run test:e2e -agent $AGENT_SLUG ... but the mise task defines --agent (double dash). As written, the command won’t set E2E_AGENT and will likely run the full agent registry instead of the intended single agent.

Suggested change
- 1. Run: `mise run test:e2e -agent $AGENT_SLUG TestSingleSessionManualCommit`
+ 1. Run: `mise run test:e2e --agent $AGENT_SLUG TestSingleSessionManualCommit`

Comment on lines +28 to +33
## Hook Configuration

**File:** `.kiro/agents/entire.json`

We own the entire file — no round-trip preservation needed (unlike Cursor's shared `hooks.json`).

Copilot AI Feb 28, 2026

PR description mentions Kiro hook installation via .kiro/settings/hooks.json, but the integration (and this doc) uses .kiro/agents/entire.json. Please reconcile this so the PR description matches the actual implementation (or adjust the implementation if .kiro/settings/hooks.json is the intended target).

Comment on lines +78 to +105
func TestDetectPresence_WithKiroDir(t *testing.T) {
	t.Parallel()

	tempDir := t.TempDir()
	if err := os.MkdirAll(filepath.Join(tempDir, ".kiro"), 0o750); err != nil {
		t.Fatalf("failed to create .kiro dir: %v", err)
	}

	// DetectPresence uses paths.WorktreeRoot which won't resolve in a temp dir,
	// so it falls back to ".". We chdir to make it find .kiro.
	// Since t.Chdir is not parallelizable, we test a separate scenario below.
}

func TestDetectPresence_WithoutKiroDir(t *testing.T) {
	t.Parallel()

	ag := &KiroAgent{}
	// In a temp dir without .kiro, presence should be false.
	// paths.WorktreeRoot will fail in temp dir (not a git repo), falls back to ".".
	// Since "." doesn't have .kiro, this should return false.
	found, err := ag.DetectPresence(context.Background())
	if err != nil {
		t.Fatalf("DetectPresence() error = %v", err)
	}
	// We can't guarantee false in CI since the working dir might have .kiro,
	// but we can at least verify no error.
	_ = found
}
Copilot AI Feb 28, 2026

TestDetectPresence_WithKiroDir is currently incomplete: it creates a temp dir and .kiro/, but never calls DetectPresence or asserts anything, so it can’t fail and doesn’t validate behavior. Consider converting it to a non-parallel test that t.Chdir(tempDir), calls DetectPresence, and asserts found==true (and similarly make the “without .kiro” case deterministic by chdir’ing into an empty temp dir and asserting found==false).
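The deterministic version of both cases can be sketched outside the test framework as follows. `detectPresence` is a stand-in for `KiroAgent.DetectPresence` that only looks for `.kiro` relative to the working directory, which is an assumption about the fallback behavior described in the test comments. `detectIn` and `runScenarios` are hypothetical helpers. In a real test, `t.Chdir` (Go 1.24+) would replace the manual save and restore, and the tests would drop `t.Parallel()`.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"path/filepath"
)

// detectPresence is a stand-in: it reports whether ./.kiro exists as a directory.
func detectPresence() (bool, error) {
	info, err := os.Stat(".kiro")
	if errors.Is(err, os.ErrNotExist) {
		return false, nil
	}
	if err != nil {
		return false, fmt.Errorf("failed to stat .kiro: %w", err)
	}
	return info.IsDir(), nil
}

// detectIn runs detectPresence with dir as the working directory and restores
// the previous directory afterwards.
func detectIn(dir string) (found bool, err error) {
	prev, err := os.Getwd()
	if err != nil {
		return false, fmt.Errorf("failed to get working dir: %w", err)
	}
	if err := os.Chdir(dir); err != nil {
		return false, fmt.Errorf("failed to chdir to %s: %w", dir, err)
	}
	defer func() {
		if cerr := os.Chdir(prev); cerr != nil && err == nil {
			err = fmt.Errorf("failed to restore working dir: %w", cerr)
		}
	}()
	return detectPresence()
}

// runScenarios builds one temp dir with .kiro and one without, then checks both.
func runScenarios() (withKiro, withoutKiro bool, err error) {
	base, err := os.MkdirTemp("", "kiro-detect-")
	if err != nil {
		return false, false, fmt.Errorf("failed to create temp dir: %w", err)
	}
	defer os.RemoveAll(base)

	withDir := filepath.Join(base, "with")
	withoutDir := filepath.Join(base, "without")
	if err := os.MkdirAll(filepath.Join(withDir, ".kiro"), 0o750); err != nil {
		return false, false, fmt.Errorf("failed to create .kiro dir: %w", err)
	}
	if err := os.MkdirAll(withoutDir, 0o750); err != nil {
		return false, false, fmt.Errorf("failed to create empty dir: %w", err)
	}
	if withKiro, err = detectIn(withDir); err != nil {
		return false, false, err
	}
	if withoutKiro, err = detectIn(withoutDir); err != nil {
		return false, false, err
	}
	return withKiro, withoutKiro, nil
}

func main() {
	withKiro, withoutKiro, err := runScenarios()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("with .kiro: %v, without .kiro: %v\n", withKiro, withoutKiro)
}
```

Because each scenario controls its own directory, both outcomes can be asserted exactly instead of discarding `found`.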

@@ -1,5 +1,5 @@
 ---
-description: "Build the agent Go package via TDD using research findings and E2E tests as spec"
+description: "E2E-first Test driven develpoment — unit tests written last"
Copilot AI Feb 28, 2026

Typo in the command description: “develpoment” → “development”.

Suggested change
- description: "E2E-first Test driven develpoment — unit tests written last"
+ description: "E2E-first Test driven development — unit tests written last"

Comment on lines +108 to +118
	// Read the stable session ID generated at agentSpawn.
	sessionID := k.readCachedSessionID(ctx)
	if sessionID == "" {
		// Fallback: try SQLite for the session ID.
		sid, queryErr := k.querySessionID(ctx, raw.CWD)
		if queryErr != nil || sid == "" {
			sessionID = "unknown"
		} else {
			sessionID = sid
		}
	}
Copilot AI Feb 28, 2026

The session ID cache file (.entire/tmp/kiro-active-session) is never cleared. If agentSpawn is skipped in a later run (a case the code explicitly anticipates), readCachedSessionID could return a stale ID from a previous session and incorrectly associate prompts/stop events with the wrong session. Consider deleting the cache file at stop (or storing additional scoping info like a timestamp / CWD key) so the fallback path can’t reuse stale state across sessions.
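Both mitigations (clearing at stop, and scoping by CWD and timestamp) can be combined in a small cache type. The following is a sketch under stated assumptions: `sessionCache`, `cacheEntry`, the JSON field names, and the one-hour maximum age are all hypothetical, and the PR's actual cache stores only the session ID.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"os"
	"path/filepath"
	"time"
)

// cacheEntry is a hypothetical on-disk record carrying the ID plus its scope.
type cacheEntry struct {
	SessionID string    `json:"session_id"`
	CWD       string    `json:"cwd"`
	WrittenAt time.Time `json:"written_at"`
}

// sessionCache scopes the cached ID to a working directory and a maximum age.
type sessionCache struct {
	path   string
	maxAge time.Duration
	now    func() time.Time // injectable clock; defaults to time.Now
}

func newSessionCache(path string, maxAge time.Duration) *sessionCache {
	return &sessionCache{path: path, maxAge: maxAge, now: time.Now}
}

// write stores the session ID with its scope (called at agentSpawn).
func (c *sessionCache) write(sessionID, cwd string) error {
	data, err := json.Marshal(cacheEntry{SessionID: sessionID, CWD: cwd, WrittenAt: c.now()})
	if err != nil {
		return fmt.Errorf("failed to encode session cache: %w", err)
	}
	if err := os.MkdirAll(filepath.Dir(c.path), 0o750); err != nil {
		return fmt.Errorf("failed to create cache dir: %w", err)
	}
	if err := os.WriteFile(c.path, data, 0o600); err != nil {
		return fmt.Errorf("failed to write session cache: %w", err)
	}
	return nil
}

// read returns "" when the entry is missing, unreadable, from another
// directory, or older than maxAge, so the caller's fallback path runs.
func (c *sessionCache) read(cwd string) string {
	data, err := os.ReadFile(c.path)
	if err != nil {
		return ""
	}
	var e cacheEntry
	if err := json.Unmarshal(data, &e); err != nil {
		return ""
	}
	if e.CWD != cwd || c.now().Sub(e.WrittenAt) > c.maxAge {
		return ""
	}
	return e.SessionID
}

// clear removes the entry (called at stop); a missing file is not an error.
func (c *sessionCache) clear() error {
	if err := os.Remove(c.path); err != nil && !errors.Is(err, os.ErrNotExist) {
		return fmt.Errorf("failed to clear session cache: %w", err)
	}
	return nil
}

func main() {
	dir, err := os.MkdirTemp("", "kiro-cache-")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	defer os.RemoveAll(dir)

	c := newSessionCache(filepath.Join(dir, "tmp", "kiro-active-session"), time.Hour)
	if err := c.write("session-1", "/repo/a"); err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("same cwd: %q\n", c.read("/repo/a"))
	fmt.Printf("other cwd: %q\n", c.read("/repo/b"))

	// Simulate a later run by advancing the injected clock past maxAge.
	c.now = func() time.Time { return time.Now().Add(2 * time.Hour) }
	fmt.Printf("two hours later: %q\n", c.read("/repo/a"))

	if err := c.clear(); err != nil {
		fmt.Println("error:", err)
	}
}
```

With this shape, a skipped agentSpawn in a new session produces an empty read instead of a stale ID, and the existing SQLite fallback takes over.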

Comment on lines +356 to +366
// Fast path: when an agent is committing (ACTIVE session + no TTY or known
// TTY agent), skip content detection and interactive prompts. The agent can't
// respond to TTY prompts and the content detection can miss mid-session work
// (no shadow branch yet, transcript analysis may fail). Generate a checkpoint
// ID and add the trailer directly.
//
// This covers two cases:
// 1. Non-TTY agents (Claude Code, Gemini CLI): hasTTY()=false
// 2. TTY agents (Kiro in tmux): hasTTY()=true but agentUsesTerminal()=true
for _, state := range sessions {
if state.Phase.IsActive() && (!hasTTY() || agentUsesTerminal(state.AgentType)) {
Copilot AI Feb 28, 2026

The new fast path treats any ACTIVE Kiro session as an “agent commit” (because agentUsesTerminal(state.AgentType) is true), and will add an Entire-Checkpoint trailer unconditionally (bypassing content overlap detection and all prompting). This can incorrectly auto-link unrelated human commits made while a Kiro session is ACTIVE. Consider restricting this fast path to cases that are reliably agent-driven (e.g., no TTY / known agent env var), or keep the no-prompt behavior but still require sessionsWithContent/content-overlap before adding the trailer for TTY agents.
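The second option in this comment (no prompting for TTY agents, but still requiring content overlap before linking) reduces to a small decision function. This is a sketch only: `commitContext` and `fastPathAddsTrailer` are hypothetical names, and the booleans stand in for `state.Phase.IsActive()`, `hasTTY()`, `agentUsesTerminal(...)`, and whatever overlap check the strategy already performs.

```go
package main

import "fmt"

// commitContext gathers the inputs to the decision; all fields are stand-ins.
type commitContext struct {
	SessionActive     bool // session is in an ACTIVE phase
	HasTTY            bool // commit runs with a terminal attached
	AgentUsesTerminal bool // agent is known to run inside a TTY (e.g., Kiro in tmux)
	HasContentOverlap bool // commit content overlaps with session work
}

// fastPathAddsTrailer reports whether the trailer is added without prompting.
func fastPathAddsTrailer(c commitContext) bool {
	switch {
	case !c.SessionActive:
		return false
	case !c.HasTTY:
		// Non-TTY agents cannot answer prompts, so link directly.
		return true
	case c.AgentUsesTerminal:
		// A TTY is present, so a human may be committing: require overlap.
		return c.HasContentOverlap
	default:
		return false
	}
}

func main() {
	cases := map[string]commitContext{
		"non-TTY agent":               {SessionActive: true},
		"TTY agent, overlapping work": {SessionActive: true, HasTTY: true, AgentUsesTerminal: true, HasContentOverlap: true},
		"TTY agent, unrelated commit": {SessionActive: true, HasTTY: true, AgentUsesTerminal: true},
	}
	for name, c := range cases {
		fmt.Printf("%s: %v\n", name, fastPathAddsTrailer(c))
	}
}
```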

Suggested change
- // Fast path: when an agent is committing (ACTIVE session + no TTY or known
- // TTY agent), skip content detection and interactive prompts. The agent can't
- // respond to TTY prompts and the content detection can miss mid-session work
- // (no shadow branch yet, transcript analysis may fail). Generate a checkpoint
- // ID and add the trailer directly.
- //
- // This covers two cases:
- // 1. Non-TTY agents (Claude Code, Gemini CLI): hasTTY()=false
- // 2. TTY agents (Kiro in tmux): hasTTY()=true but agentUsesTerminal()=true
- for _, state := range sessions {
- if state.Phase.IsActive() && (!hasTTY() || agentUsesTerminal(state.AgentType)) {
+ // Fast path: when an agent is committing in a non-interactive (no TTY)
+ // environment with an ACTIVE session, skip content detection and interactive
+ // prompts. The agent can't respond to TTY prompts and the content detection
+ // can miss mid-session work (no shadow branch yet, transcript analysis may
+ // fail). Generate a checkpoint ID and add the trailer directly.
+ //
+ // This covers non-TTY agents (e.g., Claude Code, Gemini CLI): hasTTY()=false.
+ for _, state := range sessions {
+ if state.Phase.IsActive() && !hasTTY() {

Comment on lines +51 to +59
# Install capture hooks (replace entire hooks with capture scripts)
mkdir -p .kiro/agents
cat > .kiro/agents/capture.json <<EOF
{
"agentSpawn": [{"command": "cat > ${CAPTURE_DIR}/agent-spawn.json"}],
"userPromptSubmit": [{"command": "cat > ${CAPTURE_DIR}/user-prompt-submit.json"}],
"stop": [{"command": "cat > ${CAPTURE_DIR}/stop.json"}]
}
EOF
Copilot AI Feb 28, 2026

The script writes .kiro/agents/capture.json as a bare hook map, but Kiro agent files (and this PR’s integration) use the .kiro/agents/<agent>.json agent definition structure (top-level name/tools/hooks). As written, capture.json likely won’t be loaded or valid, so hooks won’t fire and payloads won’t be captured. Consider either overwriting .kiro/agents/entire.json with a valid capture agent definition (and restoring it after), or generate a full agent definition file and instruct running kiro-cli chat --agent capture.
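A full capture agent definition could be generated along these lines. The sketch assumes the shape named in this comment (top-level name/tools/hooks) with hook entries of the form used in the script. The exact field types, in particular `tools` as a list of strings, are guesses rather than the documented Kiro schema, and `captureAgent` is a hypothetical helper.

```go
package main

import (
	"encoding/json"
	"fmt"
	"path"
)

// hookEntry mirrors the {"command": "..."} objects in the capture script.
type hookEntry struct {
	Command string `json:"command"`
}

// agentDefinition is an assumed schema: name, tools, and hooks at the top level.
type agentDefinition struct {
	Name  string                 `json:"name"`
	Tools []string               `json:"tools"`
	Hooks map[string][]hookEntry `json:"hooks"`
}

// captureAgent builds a definition whose hooks dump their stdin payloads into
// captureDir. Paths containing spaces would need shell quoting, omitted here.
func captureAgent(captureDir string) agentDefinition {
	hook := func(file string) []hookEntry {
		return []hookEntry{{Command: "cat > " + path.Join(captureDir, file)}}
	}
	return agentDefinition{
		Name:  "capture",
		Tools: []string{},
		Hooks: map[string][]hookEntry{
			"agentSpawn":       hook("agent-spawn.json"),
			"userPromptSubmit": hook("user-prompt-submit.json"),
			"stop":             hook("stop.json"),
		},
	}
}

func main() {
	out, err := json.MarshalIndent(captureAgent("/tmp/kiro-capture"), "", "  ")
	if err != nil {
		fmt.Println("marshal failed:", err)
		return
	}
	// The output would be written to .kiro/agents/capture.json.
	fmt.Println(string(out))
}
```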

Comment on lines 20 to 41
@@ -36,6 +36,7 @@ jobs:
claude-code) curl -fsSL https://claude.ai/install.sh | bash ;;
opencode) curl -fsSL https://opencode.ai/install | bash ;;
gemini-cli) npm install -g @google/gemini-cli ;;
kiro) curl -fsSL https://cli.kiro.dev/install | bash ;;
esac
echo "$HOME/.local/bin" >> $GITHUB_PATH
Copilot AI Feb 28, 2026

Adding kiro to the E2E matrix will cause the main-branch workflow to run Kiro E2E on every push. The Kiro runner’s Bootstrap() fails on CI when not authenticated (kiro-cli whoami), and this workflow doesn’t provide any Kiro credentials/env to log in, so the kiro job is likely to fail consistently. Consider gating the kiro matrix entry behind a repo secret/variable (or making the job conditional), and/or setting E2E_AGENT=${{ matrix.agent }} for the bootstrap step so bootstrap only runs for the selected agent.
