feat: prompt user for evolution via Stop-hook AskUserQuestion + dev test harness #57

Merged: ThunderConch merged 15 commits into master from feat/evolve-prompt-at-stop on Apr 20, 2026

@ThunderConch
Owner

Summary

  • Party pokemon ready to evolve now trigger a Stop-hook decision:"block" whose reason tells Claude to call AskUserQuestion with the pokemon-voice phrasing (e.g. ...어라!? 파이리의 상태가...?, roughly "...Huh!? Charmander's condition is...?"). Works for both branch evolutions (e.g. Eevee's 3 stone-based paths) and single-chain evolutions (e.g. Charmander → Charmeleon), with a cross-gen ensurePokemonInDB fallback so pokemon not native to the active gen still resolve.
  • Refuse / accept handled through the same prompt: selecting a target runs tokenmon evolve <pokemon> <target>; Refuse (or no/cancel/거부, Korean for "refuse") sets a persistent evolution_prompt_shown flag so the same pokemon is not re-prompted until the user runs /tkm evolve manually.
  • Dev-only /tkm:test-evolve slash harness that backs up state, seeds one of six scenarios (branch-eevee / single-charmander / multi-3 / overflow-5 / refuse-persist / accept-clear-reprompt), swaps the active hooks/hooks.json to route hooks at the worktree, lets the user trigger the prompt in their live session, then auto-runs --verify + --restore via an append to the block reason.
  • Packaging: package.json now uses an explicit files allowlist so all test-evolve paths (skills/test-evolve/, src/cli/test-evolve.ts, src/test-evolve/, src/test-scenarios/) stay out of published tarballs, and .tokenmon/test-backup/ is gitignored.

Key commit surface

  • src/hooks/stop.ts: post-lock evolution scan, block emission before the first_stop/no_delta early return, system_message preserved in block output, test-harness tail appended when .tokenmon/test-backup/current.json exists.
  • src/core/evolution.ts: markEvolutionReady helper, single-chain evolution_ready plumbing (no more silent auto-evolve), ensurePokemonInDB fallback on every source/target lookup, new applySingleChainEvolution export that mirrors applyBranchEvolution.
  • src/core/pokemon-data.ts: cross-gen fallback in getPokemonName and pokemonIdByName so localized names from another generation's dex resolve.
  • src/cli/tokenmon.ts: cmdEvolve resolves targetArg through resolvePokemonArg and routes single-chain pokemon through checkEvolution + executeEvolve (previously "no eligible" for any non-branching pokemon); executeEvolve dispatches branch vs single-chain by Array.isArray(data.evolves_to) and clears evolution_prompt_shown on the new key.
  • src/status-line.ts: drop the statusline.evolution_ready hint; the AskUserQuestion prompt is the canonical surface.
  • src/core/notifications.ts: skip the evolution_ready notification when the prompt has already fired.
  • i18n en/ko + pokemon-voice variants: verbatim question text, 4+ branch overflow rule (3 targets + Refuse on buttons, remainder listed as "다른 폼" ("other forms") in the question body so the user can pick via Other), and Other-input validation with a max-two re-prompt cap.
  • test/e2e/evolve-askuserquestion.test.ts: hook-level contract coverage for the new decision:"block" JSON and the prompt-shown flag lifecycle (child_process fallback; tmux full-session harness remains a follow-up).

Dev harness (dev-only, excluded from publish)

  • /tkm:test-evolve <scenario> runs --setup (backup + seed + hooks swap)
  • User sends a short message → live AskUserQuestion renders
  • User clicks a button → tokenmon evolve in the same turn → auto --verify → auto --restore → compact report
  • /tkm:test-evolve --list | --restore for emergency cleanup | --help

Test plan

  • npm run typecheck clean
  • npm test — 1203 existing tests still green
  • node --import tsx --test test/e2e/evolve-askuserquestion.test.ts — 2/2 pass
  • Manual: /tkm:test-evolve single-charmander → click Charmeleon → verify + restore run automatically
  • Manual: /tkm:test-evolve branch-eevee → click Vaporeon → Eevee evolves via water-stone seed
  • Manual: /tkm:test-evolve refuse-persist → click Refuse → next Stop does not re-block

🤖 Generated with Claude Code

ThunderConch and others added 15 commits April 20, 2026 11:53
Stop hook scans party for pokemon ready to evolve (branch + single
chain) and emits {decision:"block", reason} to force Claude to call
AskUserQuestion instead of auto-evolving or silently staging a flag.
User selects a target, Claude runs `tokenmon evolve <name> <target>`.
Refuse sets evolution_prompt_shown, which holds the prompt until the
user manually runs `/tkm evolve` to clear it.

- evolution.ts: single-chain now uses the same flag-based flow as
  branch evolutions; state-missing callers keep the original return
  signature so existing tests remain green
- stop.ts: post-XP scan emits block JSON first, then persists the
  prompt_shown flag under lock (duplicate prompt > silent loss on
  crash); lock failures are logged rather than swallowed
- markEvolutionReady helper dedups the three single-chain paths in
  checkEvolution
- notifications.ts + status-line.ts: skip the ready hint once the
  prompt has fired
- test/e2e: harness verifies block JSON and flag persistence via an
  isolated CLAUDE_CONFIG_DIR; tmux full-session harness remains TODO

The evolution block path in stop.ts previously emitted
{decision:"block", reason} but discarded any system_message that was
populated in the same stop turn — level-up, catch, and achievement
notifications would silently disappear the moment an evolution
triggered. Merge the accumulated system_message into the block output
so user-facing messages persist alongside Claude's block instruction.
systemMessage is user-facing only per the Claude Code hooks spec, so
it does not interfere with the reason field Claude consumes.
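
A minimal sketch of the merge described above, under stated assumptions: `buildBlockOutput` and `StopHookOutput` are hypothetical names for illustration, not the ones used in stop.ts. Only the `decision`, `reason`, and `systemMessage` fields come from the commit text.

```typescript
// Hypothetical shape of the JSON the Stop hook prints on the block path.
interface StopHookOutput {
  decision: "block";
  reason: string;         // instruction consumed by Claude
  systemMessage?: string; // user-facing text only
}

// Assumed helper: attach any messages accumulated earlier in the same
// stop turn (level-up, catch, achievement) to the block output.
function buildBlockOutput(reason: string, pendingMessages: string[]): StopHookOutput {
  const output: StopHookOutput = { decision: "block", reason };
  const merged = pendingMessages.filter((m) => m.trim().length > 0).join("\n");
  if (merged.length > 0) {
    output.systemMessage = merged;
  }
  return output;
}

// Usage: serialize the result and write it to stdout from the hook.
const blockJson = JSON.stringify(
  buildBlockOutput("Call AskUserQuestion for Charmander.", ["Charmander leveled up!"]),
);
```

Keeping the merge in one small function makes it easy to check that an empty message list leaves `systemMessage` out entirely.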

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Dev-only slash command that auto-cycles the full E2E path for the
Stop-hook evolution AskUserQuestion feature. Per scenario: backup
state and hooks, seed test party, swap hooks.json to worktree paths,
spawn a fresh tmux pane with isolated CLAUDE_CONFIG_DIR, launch
Claude Code, capture the AskUserQuestion render, send-keys the
scenario's expected answer, 3-layer verify (UI regex, tokenmon evolve
tool call, state diff), restore backup. No-arg runs all 6 scenarios
sequentially; --scenario <name> runs one; --restore cleans up after
an aborted run; --dry-run lists scenarios without LLM cost.

Harness is excluded from the published plugin via the new files
allowlist in package.json.

- 6 scenarios covering branch, single-chain, batch, overflow, refuse
  persistence, and accept-clear-reprompt lifecycle
- backup.ts: dual-format hooks.json swap (baked absolute paths OR
  CLAUDE_PLUGIN_ROOT/DATA template vars), byte-level restore, gen-aware
  state/config paths; resolves hooks path via PLUGIN_ROOT walk-up
- tmux-driver.ts: pane spawn with isolated CLAUDE_CONFIG_DIR,
  capture-with-pattern-wait, numeric + text send-keys, graceful
  tmux-missing fallback
- verify.ts: 3-layer assertion from UI regex through tool call
  detection to state diff against expected_after
- cli/test-evolve.ts: SIGINT handler + try/finally for crash-safe
  restore (duplicate prompt preferred over silent loss)
- skills/test-evolve/SKILL.md: slash command entry delegating to the
  tsx CLI
- Tighten existing e2e test types (any to Partial<State>, Partial<Config>)
  and drop the unused execFileSync import

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Drop all tmux automation, Claude Code spawning, and UI/tool-level
asserts. Claude Code's Ink-based REPL does not accept tmux send-keys
submissions, which made the "spawn a child session and auto-answer
AskUserQuestion" path unreliable and expensive to debug. The simpler
and more useful shape is: backup, seed, swap hooks.json, let the
human trigger the prompt in their own live session, then verify state
and restore.

CLI subcommands are now:
  tokenmon test-evolve --list                list all scenarios
  tokenmon test-evolve --setup <scenario>    backup + seed + swap
  tokenmon test-evolve --verify              state diff vs expected
  tokenmon test-evolve --restore             byte-level restore

A tiny current.json pointer under .tokenmon/test-backup/ lets
--verify and --restore work without passing the scenario name again.

- cli/test-evolve.ts: 463 -> 156 lines, no tmux/spawn/waitForPattern
- test-evolve/verify.ts: 245 -> 115 lines, state-only assertions
- test-evolve/tmux-driver.ts: deleted
- skills/test-evolve/SKILL.md: rewritten for manual dispatch

Also pick up two earlier fixes needed to make the setup path actually
work end-to-end:
- backup.ts swapHooksJson: regex now accepts JSON-escaped `\"`
  surrounding baked absolute paths so hooks.json rewrites apply when
  the user runs with baked (post-install) hook paths
- backup.ts: drop the hardcoded `/home/minsiwon00/...` fallback and
  resolve the active hooks.json via PLUGIN_ROOT + a marketplace
  fallback, throwing when neither exists instead of writing to a
  non-existent path

Verified: typecheck passes; 1203/1203 tests pass; round-trip
--setup branch-eevee then --restore produces byte-identical state,
config, and hooks files.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

The previous SKILL.md asked the user to manually invoke
/tkm:test-evolve --verify and /tkm:test-evolve --restore at the end
of each scenario. That contradicted the original design goal — the
cycle (seed → test → verify → restore) must always close itself so
the user's live state and hooks.json never stay mutated past the
test window.

Update the skill to orchestrate the full cycle across turns:

1. On /tkm:test-evolve <scenario>: run --setup, tell the user to
   send any message, then stop the turn.
2. On the next turn, after the Stop hook emits the block reason and
   the user picks an option (or refuses) via AskUserQuestion, run
   `tokenmon evolve <pokemon> <target>` for the chosen target(s).
3. Immediately after the evolve call(s) succeed, run --verify and
   then --restore, always, even if verify reports FAIL. Restore is
   unconditional so the user's real state/config/hooks.json come
   back to pre-test bytes.
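
The "restore is unconditional" rule in step 3 can be sketched as a try/finally wrapper. This is an illustration only: `closeCycle` and `CycleReport` are hypothetical names, and in the PR the cycle is orchestrated by skill instructions rather than by a single function.

```typescript
interface CycleReport {
  verified: boolean;
  restored: boolean;
}

// Run verify, then always run restore, even if verify fails or throws.
function closeCycle(verify: () => boolean, restore: () => void): CycleReport {
  let verified = false;
  let restored = false;
  try {
    verified = verify();
  } catch {
    verified = false; // a crashing verify counts as FAIL
  } finally {
    restore(); // unconditional: bring state/config/hooks.json back
    restored = true;
  }
  return { verified, restored };
}
```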

--restore is still exposed as a manual escape hatch for the case
where the session is killed before the auto-restore step runs.
--verify is no longer a user-facing command.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Two adjustments so the evolution prompt is always visible through the
AskUserQuestion path:

1. Block detection moves BEFORE the first_stop/no_delta early return
   in stop.ts. The prior placement meant that if evolution_ready was
   already set when a new session started (e.g. after a cheat/test
   seed, or a resumed conversation where conditions had been met but
   the block had never fired), the very first Stop silently returned
   and the user had to send a second message before AskUserQuestion
   surfaced. Running block detection regardless of the lock result
   kind fixes this and keeps the existing `evolution_prompt_shown`
   guard so duplicate blocks are still prevented.

2. Drop the `evolution_ready` hint from the status line. Because the
   Stop hook now reliably produces an AskUserQuestion prompt on every
   qualifying stop, rendering the same "pokemon ready to evolve"
   notice in the status line was redundant noise — the prompt itself
   is the canonical surface.
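
The reordering in item 1 can be illustrated with a small sketch. The flag names `evolution_ready` and `evolution_prompt_shown` and the `first_stop`/`no_delta` kinds come from the commit text; `PartyEntry`, `handleStop`, and the `"delta"` kind are assumptions made for the example.

```typescript
// Hypothetical party entry carrying the two flags named in the PR.
interface PartyEntry {
  name: string;
  evolution_ready: boolean;
  evolution_prompt_shown: boolean;
}

type LockResultKind = "first_stop" | "no_delta" | "delta"; // "delta" is an assumed name

// Pokemon that should trigger a block: ready to evolve and not yet prompted.
function pendingEvolutions(party: PartyEntry[]): PartyEntry[] {
  return party.filter((p) => p.evolution_ready && !p.evolution_prompt_shown);
}

// The scan runs before the early return, so a pending prompt is no longer
// hidden on the first Stop of a session.
function handleStop(kind: LockResultKind, party: PartyEntry[]): "block" | "early_return" | "continue" {
  if (pendingEvolutions(party).length > 0) return "block";
  if (kind === "first_stop" || kind === "no_delta") return "early_return";
  return "continue";
}
```

The `evolution_prompt_shown` check inside `pendingEvolutions` plays the role of the duplicate-block guard.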

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…nches

Three polish items surfaced during live testing of the evolution
AskUserQuestion flow:

1. Tone mismatch. Claude was paraphrasing the AskUserQuestion
   question text instead of using the pokemon-voice phrasing
   ("...어라!? {pokemon}의 상태가...?") that the status line had
   been using. Each locale's hook.evolution_candidate_line now
   carries the exact per-pokemon question string under a "use
   VERBATIM" label, and hook.evolution_block_reason instructs Claude
   to copy that string into AskUserQuestion.question without any
   rewording.

2. Overflow handling for branches with more than 3 targets (Eevee has
   8). AskUserQuestion caps at 4 options, so the previous "all
   targets + Refuse" instruction silently broke for Eevee-class
   pokemon. The new reason text spells out the rule: ≤3 targets show
   all + Refuse, 4+ targets show the first 3 + Refuse and list the
   remaining targets in the question body so the user can pick any of
   them via 'Other'.

3. Cross-gen name resolution in getPokemonName. Seed data (and any
   other path that surfaces a pokemon not native to the active
   generation's i18n) was falling back to the numeric ID, so the
   block reason rendered "133" instead of "이브이" (Eevee). Added a cross-gen
   lookup that searches each generation's game i18n before using the
   ID as the final fallback; the active generation is still preferred.
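
The overflow rule in item 2 amounts to a simple split of the target list. The sketch below is hypothetical (`layoutOptions` and `PromptLayout` are not names from the PR, where the rule lives in the locale reason text that Claude follows) and assumes the targets arrive already ordered.

```typescript
// AskUserQuestion caps at 4 options; one slot is reserved for Refuse.
const MAX_TARGET_BUTTONS = 3;

interface PromptLayout {
  buttons: string[];  // option labels, Refuse last
  overflow: string[]; // targets listed in the question body, reachable via 'Other'
}

function layoutOptions(targets: string[]): PromptLayout {
  return {
    buttons: [...targets.slice(0, MAX_TARGET_BUTTONS), "Refuse"],
    overflow: targets.slice(MAX_TARGET_BUTTONS),
  };
}

// Usage with five targets: three become buttons, two go to the question body.
const eeveeLayout = layoutOptions(["Vaporeon", "Jolteon", "Flareon", "Espeon", "Umbreon"]);
```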

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ismatch

Without an explicit rule, Claude could blindly feed a user's freeform
'Other' response into `tokenmon evolve`. A garbled or off-list name
then errored silently inside applyBranchEvolution (not in the
evolves_to list -> null return) with no useful feedback to the user.

Extend hook.evolution_block_reason in all four locale + voice combos
with an explicit handling rule:

- Button picks run `tokenmon evolve` directly.
- 'Refuse' (button or text: refuse/no/cancel/거부) skips.
- Free-text 'Other' must be validated against the target list
  (case-insensitive, English and localized names both accepted)
  before the command runs. On mismatch, reply with a short "I didn't
  recognize that" and re-invoke the same AskUserQuestion. Re-prompt
  caps at 2 iterations so a user who keeps typing garbage eventually
  lands on an implicit 'Refuse' instead of an infinite loop.
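
As a rough illustration of the handling rule above, the validation could be expressed as the function below. In the PR the rule is prose in hook.evolution_block_reason that Claude follows; `resolveOther`, `Target`, and `OtherResult` are hypothetical names, and the exact counting of re-prompts is an assumption.

```typescript
// Hypothetical target shape: an ID plus its English and localized names.
interface Target {
  id: string;
  names: string[];
}

const REFUSE_WORDS = ["refuse", "no", "cancel", "거부"];
const MAX_REPROMPTS = 2;

type OtherResult =
  | { kind: "evolve"; targetId: string }
  | { kind: "refuse" }
  | { kind: "reprompt" };

function resolveOther(input: string, targets: Target[], repromptsSoFar: number): OtherResult {
  const needle = input.trim().toLowerCase();
  if (REFUSE_WORDS.includes(needle)) return { kind: "refuse" };
  // Case-insensitive match against every accepted name of every target.
  const match = targets.find((t) => t.names.some((n) => n.toLowerCase() === needle));
  if (match) return { kind: "evolve", targetId: match.id };
  // Once the cap is reached, an unrecognized answer becomes an implicit Refuse.
  return repromptsSoFar >= MAX_REPROMPTS ? { kind: "refuse" } : { kind: "reprompt" };
}
```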

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…items

Three live-test fallouts:

1. Cross-gen reverse name lookup in pokemonIdByName. Previously it
   searched only the active generation's i18n, so "이브이" (Eevee) in
   a gen4-active save returned undefined and the evolve CLI said
   "컬렉션에서 "이브이"을(를) 찾을 수 없다" ("cannot find "이브이" in
   the collection"). Mirror the forward lookup
   from getPokemonName: try the active gen first, then fall back
   across installed gens' ko/en tables. Real-gameplay benefit too —
   any path that accepts user-typed names for pokemon from other
   gens now resolves.

2. cmdEvolve's targetArg was passed through to the branch.name
   string-equal comparison without being resolved through
   resolvePokemonArg. That forced Claude (or the user) to pass
   numeric IDs; a localized name like "샤미드" (Vaporeon) always fell
   through to "현재 ~의 진화 조건을 만족하는 경로가 없다" ("no path
   currently satisfies ~'s evolution conditions") even when the
   target was actually eligible. Apply the same name→ID resolution
   we already use on pokemonArg.

3. The branch-eevee test scenario seeded `evolution_options` with
   all 8 Eeveelutions but never seeded the evolution conditions, so
   applyBranchEvolution's runtime check rejected every choice. Add
   `items: { water-stone, thunder-stone, fire-stone }` to the scenario
   seed (and an optional `items`/`current_region` pair on the
   Scenario type + writeSeed) so the chosen target actually evolves,
   and trim evolution_options to the 5 branches that are genuinely
   eligible (stones + friendship) for clearer overflow testing.
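
The "active gen first, then other installed gens" order from item 1 could look like the sketch below. The table layout (`GenTables`) and the function names are assumptions for illustration; the real pokemonIdByName reads the project's own i18n files.

```typescript
type NameTable = Record<string, string>;      // pokemon id -> display name
type GenTables = Record<string, NameTable[]>; // gen key -> its locale tables (e.g. ko, en); assumed layout

function findIdIn(tables: NameTable[], name: string): string | undefined {
  const needle = name.trim().toLowerCase();
  for (const table of tables) {
    for (const [id, display] of Object.entries(table)) {
      if (display.toLowerCase() === needle) return id;
    }
  }
  return undefined;
}

// Prefer the active generation, then fall back across the other gens.
function idByName(name: string, activeGen: string, gens: GenTables): string | undefined {
  const active = findIdIn(gens[activeGen] ?? [], name);
  if (active !== undefined) return active;
  for (const [gen, tables] of Object.entries(gens)) {
    if (gen === activeGen) continue;
    const hit = findIdIn(tables, name);
    if (hit !== undefined) return hit;
  }
  return undefined;
}
```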

All 1203 existing tests still green; harness CLI typechecks clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… in skill

1. branch-eevee now seeds the full 8-branch evolution_options again
   (134/135/136/196/197/470/471/700). Only 3 are eligible via the
   seeded stones + 2 via friendship, but all 8 belong to the real
   Eevee dex so limiting the seed to 5 was misleading — users
   expected the complete set in the AskUserQuestion prompt. The
   ineligible three (Leafeon/Glaceon/Sylveon) naturally end up in
   the question body's "Other forms" list because the 4-option cap
   rule only promotes eligible targets to buttons.

2. Rewrite skills/test-evolve/SKILL.md so the evolution-trigger turn
   is treated as one continuous cycle instead of a turn-N /
   turn-N+1 split. The previous wording let Claude stop after the
   user answered the AskUserQuestion without running `tokenmon
   evolve`, verify, or restore. The new instructions say explicitly
   that the entire chain (render question → run evolve → print
   summary → --verify → --restore → final report) must complete
   within the same turn without stopping early.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

When the active generation's pokemon DB does not contain the source
pokemon (e.g. Eevee #133 in a gen4-active save), checkEvolution,
getEligibleBranches, applyBranchEvolution and applySingleChainEvolution
all returned null/empty because `db.pokemon[baseId]` is undefined.
That left test-evolve --setup branch-eevee + Claude-driven evolve
failing with "현재 이브이의 진화 조건을 만족하는 경로가 없다" ("no path
currently satisfies Eevee's evolution conditions") on every
scenario that used a cross-gen pokemon — and would also affect any
real-gameplay path where a user's party member came from another
generation via migration.

Fix: every access of the source pokemon's data now falls back to
ensurePokemonInDB, which transparently walks the other generations'
data and injects the missing pokemon into the active gen's cache.
applyBranchEvolution also applies the same fallback for the target
data so cross-gen branch targets resolve.
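
A sketch of the fallback behaviour described above, assuming a simple in-memory layout: `ensureInDB`, `PokemonData`, and `GenDB` are stand-ins invented for this example, and the real ensurePokemonInDB may load and cache data differently.

```typescript
interface PokemonData {
  name: string;
  evolves_to?: string | string[];
}
type GenDB = Record<string, PokemonData>; // pokemon id -> data

// Look in the active gen first; otherwise copy the entry from another gen
// into the active gen's cache so later lookups succeed directly.
function ensureInDB(id: string, activeGen: string, dbs: Record<string, GenDB>): PokemonData | undefined {
  let active: GenDB | undefined = dbs[activeGen];
  if (active === undefined) {
    active = {};
    dbs[activeGen] = active;
  }
  const existing = active[id];
  if (existing !== undefined) return existing;
  for (const [gen, db] of Object.entries(dbs)) {
    if (gen === activeGen) continue;
    const found = db[id];
    if (found !== undefined) {
      active[id] = found; // inject into the active cache
      return found;
    }
  }
  return undefined;
}
```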

Also correct branch-eevee.json: the crawled data only contains the
three stone-based Eeveelutions (Vaporeon/Jolteon/Flareon), so the
scenario's evolution_options list is trimmed to match reality. The
later Eeveelutions (Espeon/Umbreon/Leafeon/Glaceon/Sylveon) are a
data-crawler backlog item, not an overflow-path regression.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…lback

Live scenario testing surfaced two related failures:

1. cmdEvolve only dispatched through getEligibleBranches, so pokemon
   whose data.evolves_to is a string (or legacy line[stage+1]) — i.e.
   every non-branching pokemon — hit "no eligible" and returned
   without invoking executeEvolve. That defeated the Stop-hook
   AskUserQuestion flow for any single-chain pokemon: the user would
   see the prompt, click the target, and nothing would happen.

   Add a dedicated single-chain branch in cmdEvolve that routes
   through checkEvolution (no state → returns a validated
   EvolutionResult) and then executeEvolve, which already dispatches
   to applySingleChainEvolution under the hood. The branch runs
   ahead of the getEligibleBranches path so branching pokemon still
   use the original flow.

2. checkEvolution's string-evolves_to path only called
   ensurePokemonInDB for explicit cross-gen references like
   "gen1:25". Plain numeric IDs that happened to live only in another
   generation (e.g. Charmeleon #5 on a gen4-active save) fell through
   and returned null because targetData was undefined. Same issue on
   the legacy line[stage+1] path. Both paths now use the
   ensurePokemonInDB fallback so cross-gen targets resolve.
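
The branch versus single-chain split that both items rely on comes down to the shape of `evolves_to`. A minimal sketch, with `EvolveData`, `EvolvePath`, and `evolvePath` as hypothetical names, and with the legacy line[stage+1] path left out:

```typescript
interface EvolveData {
  evolves_to?: string | string[];
}

type EvolvePath = "branch" | "single-chain" | "none";

// An array means several possible targets (branching); a string means one.
function evolvePath(data: EvolveData): EvolvePath {
  if (Array.isArray(data.evolves_to)) return "branch";
  if (typeof data.evolves_to === "string") return "single-chain";
  return "none";
}

// Usage: Eevee-style data branches, Charmander-style data is single-chain.
const eeveePath = evolvePath({ evolves_to: ["134", "135", "136"] });
const charmanderPath = evolvePath({ evolves_to: "5" });
```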

Also seed the required evolution stones on the multi-3 and
overflow-5 scenarios so Pikachu (thunder-stone) and Eevee
(water/thunder/fire stones) can actually evolve when chosen.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…tive

The test-evolve skill only lives in Claude's context during the turn
that invokes /tkm:test-evolve (setup turn). On the follow-up turn
where the Stop hook emits the evolution block and Claude renders
AskUserQuestion, the skill content has usually dropped out of the
prompt window, so Claude hands control back to the user after the
evolve call instead of running the auto verify + auto restore tail
the skill required.

When stop.ts detects the harness marker file (
.tokenmon/test-backup/current.json), append an explicit
"[TEST HARNESS ACTIVE]" block to the reason that spells out the two
trailing commands with their absolute paths (the dev flow's own
PLUGIN_ROOT + bin/tsx-resolve.sh + src/cli/test-evolve.ts), plus a
reminder to print the final report. That keeps the cycle
self-closing whether or not the skill-turn history is still in
context.
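
One way to picture the appended tail is the sketch below. The marker check is passed in as a boolean to keep the example free of file-system calls; `withHarnessTail` is a hypothetical name, and the exact command line (how bin/tsx-resolve.sh is invoked) is an assumption, not the wording used in stop.ts.

```typescript
// Append the harness instructions only when the marker file was found.
function withHarnessTail(reason: string, markerExists: boolean, pluginRoot: string): string {
  if (!markerExists) return reason;
  // Assumed invocation: resolver script followed by the CLI entry point.
  const cli = `${pluginRoot}/bin/tsx-resolve.sh ${pluginRoot}/src/cli/test-evolve.ts`;
  return [
    reason,
    "",
    "[TEST HARNESS ACTIVE]",
    `1. Run: ${cli} --verify`,
    `2. Run: ${cli} --restore`,
    "3. Print the final report.",
  ].join("\n");
}
```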

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…+ achievements)

Three review findings from review-comments.md:

1. [P1] applySingleChainEvolution now falls back to ensurePokemonInDB
   for plain numeric cross-gen targets, not just the explicit
   `genN:id` crossGenRef syntax. Matches the fallback already in
   checkEvolution so every Stop-hook evolution that asks the user
   can actually resolve when the target species lives in another
   generation's dex. The legacy line[stage+1] branch picks up the
   same fallback for symmetry.

2. [P2] getInstalledHooksPath in src/test-evolve/backup.ts now
   searches ~/.claude/plugins/cache/tkm/tkm/<version>/hooks/hooks.json
   in addition to the worktree PLUGIN_ROOT and the marketplaces tree.
   That is where the release install actually lives — the prior
   helper threw before creating backups on any machine installed via
   the cache path, which was exactly the setup the new harness
   documents.

3. [P2] cmdAchievements now reads commonState in addition to
   state.achievements and merges entries from getCommonAchievementsDB
   so cross-generation achievements such as all_gen_badges stay in
   the listing. Before this patch the command only consulted the
   active-gen table so common achievements silently disappeared once
   a user unlocked them.
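
The merge in item 3 can be sketched as follows. The inputs are simplified to ID lists and unlock maps, which is an assumption about the data shape; `listAchievements`, `AchievementRow`, and the `first_catch` ID in the usage line are hypothetical, while `all_gen_badges` is the ID named in the commit text.

```typescript
interface AchievementRow {
  id: string;
  unlocked: boolean;
  scope: "gen" | "common";
}

// Combine the active-gen table with the common table so cross-generation
// achievements stay in the listing.
function listAchievements(
  genIds: string[],
  commonIds: string[],
  genUnlocked: Record<string, boolean>,
  commonUnlocked: Record<string, boolean>,
): AchievementRow[] {
  const genRows = genIds.map((id): AchievementRow => ({ id, unlocked: genUnlocked[id] === true, scope: "gen" }));
  const commonRows = commonIds.map((id): AchievementRow => ({ id, unlocked: commonUnlocked[id] === true, scope: "common" }));
  return [...genRows, ...commonRows];
}

// Usage: a common achievement unlocked by the user remains visible.
const achievementRows = listAchievements(["first_catch"], ["all_gen_badges"], {}, { all_gen_badges: true });
```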

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@ThunderConch ThunderConch merged commit dd93a83 into master Apr 20, 2026
2 checks passed