Simulation mode for interactive spec-driven software simulation by freesig · Pull Request #16 · essential-contributions/spec-forest

freesig · 2026-04-06T23:17:24Z

Summary

Add core simulation engine: runner, prompts, session management, orchestration
Simulation UI: channel picker, scenario input, cursor navigation, mouse capture
Feature regeneration via MCP tool and TUI keybind
Convert MCP tool handlers to async to prevent runtime deadlock
Iterative prompt tuning to enforce JSON output and spec fidelity

Stack

PR 1/5: TUI Infrastructure
PR 2/5 ← you are here
PR 3/5: Sim Tools, Game Mode & Streaming
PR 4/5: Members Screen
PR 5/5: Lean Game Mode

Test plan

Launch simulation from TUI with a scenario
Simulation produces spec-referenced interactions
Channel picker options work (explore code, consume whole spec)
Feature regeneration updates descriptions from directory state
MCP tools don't deadlock when called during simulation

…tion Introduces a new TUI screen where users can interactively simulate their software based on spec nodes before implementation. An AI agent drives the simulation, producing per-channel output (UI, audio, network, errors, logs) grounded in spec node references. Key features:
- Channel picker to select output channels before starting
- Dual input modes (Normal/Insert) with Ctrl+Enter to submit
- Tab/split-pane layout cycling (F5) for multi-channel views
- Spec reference overlays ([^N] footnotes, number keys to inspect)
- Spec gap indicators for ungrounded agent behavior
- Behavior reporting mode (r key) for inverse traceability
- Background processing with spinner, session resumption via --resume

…s bar The MCP config passed to claude CLI was missing the required "type": "http" field, causing immediate rejection. Additionally, simulation errors were invisible because the simulation screen never rendered app.message.

Adds the ability to generate and view shadow answers in the TUI, comparing spec answers against actual codebase implementation. Press Shift+S to trigger shadow generation, which shows a progress bar and populates implementation status icons (○/◐/●/⚡) in the tree view and detailed status/review in the node content panel.

…scendants Instead of loading all spec nodes, the simulation now loads the selected node, its ancestor chain, descendants, a spec summary, and other root questions. The prompt strongly encourages querying MCP tools for additional context beyond the focus subtree.
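The focus-subtree selection described above can be sketched as a walk over parent links. This is only an illustration: the `Parents` map, the `u32` node ids, and the function names are assumptions made for this example and are not taken from the PR, which also loads a spec summary and other root questions that are omitted here. The sketch further assumes the nodes form a tree with no cycles.

```rust
use std::collections::HashMap;

/// Parent links keyed by node id; root nodes map to `None`.
/// (Hypothetical layout, chosen only for this sketch.)
type Parents = HashMap<u32, Option<u32>>;

/// Ancestor chain of `node`, nearest parent first.
fn ancestors(parents: &Parents, node: u32) -> Vec<u32> {
    let mut chain = Vec::new();
    let mut cur = parents.get(&node).copied().flatten();
    while let Some(p) = cur {
        chain.push(p);
        cur = parents.get(&p).copied().flatten();
    }
    chain
}

/// All descendants of `node`, sorted by id.
fn descendants(parents: &Parents, node: u32) -> Vec<u32> {
    let mut out = Vec::new();
    let mut stack = vec![node];
    while let Some(n) = stack.pop() {
        // Children are the nodes whose parent link points at `n`.
        let kids: Vec<u32> = parents
            .iter()
            .filter(|(_, p)| **p == Some(n))
            .map(|(id, _)| *id)
            .collect();
        out.extend(&kids);
        stack.extend(kids);
    }
    out.sort_unstable();
    out
}

/// Ids that would be loaded for a focus node: ancestors, the node, descendants.
fn focus_subtree(parents: &Parents, node: u32) -> Vec<u32> {
    let mut ids = ancestors(parents, node);
    ids.push(node);
    ids.extend(descendants(parents, node));
    ids.sort_unstable();
    ids
}
```

With a chain 1 → 2 → 3 → 4 plus a sibling 5 under 1, focusing on node 3 would select 1, 2, 3 and 4 and leave 5 for the agent to fetch through MCP tools if it needs it.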

Adds an optional scenario description step between channel picker and simulation start. Users can describe the starting state and ongoing conditions (e.g. "other nodes are sending ACK messages") to customize the simulation context instead of always starting from a blank state.

…lacing channels Reports now display a magenta-bordered overlay with the LLM's explanation of why the simulation behaves a certain way, citing spec nodes with [^N] markers. Channel contents are preserved so the user doesn't lose the current simulation state and node references.

…l focus cycling Add [S] keybinding in simulation Normal mode to edit the scenario at any point during a running simulation. The input area shows a [Scenario] prefix, pre-populates with the current scenario, and on submit sends a scenario update to the LLM while persisting it on the session. Also adds log panel focus support with arrow key line scrolling and Tab cycling through main → tree → log panels.

…t ref input to simulation
- LLM now must list every discrete decision in a "decisions" array with refs and spec_gaps
- Changed spec_gap from single optional string to spec_gaps array (multiple per channel)
- Number keys buffer with ~500ms delay for multi-digit refs (e.g. "11" for [^11])
- Decisions panel renders below channels in magenta with inline ref markers
- SimOpenRef searches both channel refs and decision refs
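The multi-digit ref input could work roughly like the following debounce buffer. This is a sketch under assumptions, not the PR's code: `RefBuffer`, `push_digit`, and `poll` are hypothetical names, and the caller passes in the current time so the logic can be exercised without sleeping.

```rust
use std::time::{Duration, Instant};

/// Buffers digit keypresses and emits a ref number once input goes quiet.
/// (Hypothetical type for illustration.)
struct RefBuffer {
    digits: String,
    last_key: Option<Instant>,
    delay: Duration,
}

impl RefBuffer {
    fn new(delay: Duration) -> Self {
        RefBuffer { digits: String::new(), last_key: None, delay }
    }

    /// Record a digit keypress at time `now`; non-digits are ignored.
    fn push_digit(&mut self, d: char, now: Instant) {
        if d.is_ascii_digit() {
            self.digits.push(d);
            self.last_key = Some(now);
        }
    }

    /// Call on every UI tick. Returns the buffered number once at least
    /// `delay` has passed since the last digit, then resets the buffer.
    fn poll(&mut self, now: Instant) -> Option<u32> {
        let last = self.last_key?;
        if now.saturating_duration_since(last) < self.delay {
            return None;
        }
        self.last_key = None;
        let n = self.digits.parse::<u32>().ok();
        self.digits.clear();
        n
    }
}
```

Pressing "1" twice within the delay window yields 11 on the first tick after the window has elapsed, rather than opening [^1] immediately.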

…from directory state Adds UpdateFeature op to the append-only log, a regeneration pipeline that uses directory context and child Q&A nodes to produce an updated description via Claude, and triggers cascade review on children after update.

…access codebase Previously run_claude was called without a directory, so Claude had no file access. Now uses run_claude_in_dir_cached with the spec's directory, includes project directory instructions in the prompt, and appends the codebase context output instruction so responses build the dir context cache.

…taminating sim responses With --output-format text, intermediate MCP tool call/result content was concatenated into stdout, causing JSON parse failures when the model used tools during resumed sessions (e.g., scenario updates). Switching to --output-format json isolates the final assistant text in a result field.

…ecified behavior Reframe the AI's role from "simulate a working app" to "spec-simulation tool" that renders ONLY what spec nodes explicitly describe. Key changes:
- Add CARDINAL RULE preamble making grounding the #1 priority
- Strengthen "do not guess" into explicit rules against loosely-related citations
- Remove "realistic TUI" language that pushed AI to invent UI elements
- Reframe spec_gaps as warning flags, not permission slips
- Add "prefer omission over invention" rule with concrete examples
- Fix initial/resume prompts to stop requesting a "starting state"

… invented behavior The previous prompt revision was too strict, causing the AI to just list spec coverage instead of rendering an interactive simulation. Rebalance:
- AI MUST render the feature as it would look if implemented
- Everything grounded in a spec node gets cited with [^N] markers
- Everything invented to make the simulation interactive gets flagged as a spec_gap, never silently blended with grounded content
- Loose citations (citing a parent/related node for something it doesn't specifically describe) are explicitly called out as the wrong pattern
- Updated both build_system_prompt and build_system_prompt_whole_spec

…ation prediction Use entropy-based gap detection: only flag spec_gaps for high-entropy decisions where different implementers would diverge. Low-entropy choices (submit buttons, standard layout, obvious defaults) are rendered naturally without flagging, producing cleaner simulation output.

…mode Enables mouse capture in the terminal and records left-clicks within the channel content area as {left-click:X,Y} tokens in the captured key sequence, allowing the AI to interpret simulated UI interactions.
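A left-click could be turned into a `{left-click:X,Y}` token along these lines. The `Area` type and `click_token` function are hypothetical, and the sketch assumes X and Y are relative to the top-left corner of the channel content area; the PR does not say whether the real coordinates are relative or absolute.

```rust
/// Channel content area in terminal cells (hypothetical type).
#[derive(Debug, Clone, Copy)]
struct Area {
    x: u16,
    y: u16,
    width: u16,
    height: u16,
}

impl Area {
    /// True if the terminal cell (col, row) lies inside the area.
    fn contains(&self, col: u16, row: u16) -> bool {
        col >= self.x
            && col < self.x.saturating_add(self.width)
            && row >= self.y
            && row < self.y.saturating_add(self.height)
    }
}

/// Returns a `{left-click:X,Y}` token for clicks inside `area`, else `None`.
/// Assumption: X and Y are offsets from the area's top-left corner.
fn click_token(area: Area, col: u16, row: u16) -> Option<String> {
    if !area.contains(col, row) {
        return None;
    }
    Some(format!("{{left-click:{},{}}}", col - area.x, row - area.y))
}
```

Clicks outside the area return `None`, so they never enter the captured key sequence.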

When enabled, sets the claude CLI working directory to the spec's project directory and appends code-aware instructions to the system prompt — letting the agent consult the actual codebase for unspecified details while treating the spec as the source of truth.

MCP answer_question and update_answer were reimplementing embed + submit_op directly, skipping the post-answer pipeline (entropy evaluation, child generation, descendant review, summary regeneration). Now both route through api::answer_node, the same path used by the web UI and TUI.
- Extend answer_node to accept optional residual_entropy; skip background AI evaluation when caller provides it
- Remove redundant update_answer MCP tool (answer_question handles both)
- MCP now returns full Node instead of just {node_id, status}

Add a broadcast channel to AppState so the op_loop notifies all subscribers after each committed operation. The TUI subscribes and refreshes the relevant view (spec list or node list) immediately, replacing the previous pull-only model that required user navigation to see background changes.
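The push model can be illustrated with a small fan-out notifier. This sketch uses only `std::sync::mpsc` as a stand-in and is not the PR's implementation, which may well use an async broadcast channel; `Notifier`, `subscribe`, `notify`, and the `u64` operation id are all assumptions made for the example.

```rust
use std::sync::mpsc::{channel, Receiver, Sender};

/// Fans out "operation committed" notifications to every subscriber.
/// (Hypothetical stand-in for the broadcast channel on AppState.)
#[derive(Default)]
struct Notifier {
    subscribers: Vec<Sender<u64>>,
}

impl Notifier {
    /// Register a new subscriber and hand back its receiving end.
    fn subscribe(&mut self) -> Receiver<u64> {
        let (tx, rx) = channel();
        self.subscribers.push(tx);
        rx
    }

    /// Send the committed op id to all subscribers, dropping any whose
    /// receiver has gone away.
    fn notify(&mut self, op_id: u64) {
        self.subscribers.retain(|tx| tx.send(op_id).is_ok());
    }
}
```

A subscriber such as the TUI would check its receiver on each tick and refresh the affected view when an id arrives, instead of waiting for user navigation.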

freesig added 30 commits March 31, 2026 13:26

fix: use Shift+Enter instead of Ctrl+Enter for simulation input submit

144a2bd

fix: show loading spinner in simulation channel panes during initial …

dba261e

…turn

fix: add Ctrl+S as alternative submit binding for simulation input

1eb5804

Shift+Enter is not reliably forwarded by tmux, making it impossible to submit simulation input in tmux sessions. Add Ctrl+S as a fallback.

fix: reinforce JSON output format on simulation resume turns

a420f1d

The model would break character on short inputs like "q", responding with plain text instead of JSON. Append a format reminder to each resume prompt to keep responses in the required JSON envelope.

fix: strengthen simulation prompt to enforce spec node references and…

2a40108

… spec gaps

feat: add cursor navigation and arrow key support to simulation inser…

94577b2

…t mode

fix: stop continuous candidate polling in TUI while background tasks …

d042d23

…are active Candidate reload was being forced every 250ms tick while any background operation was running. Now only reloads once on the busy→idle transition.
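The busy→idle edge detection amounts to remembering the previous tick's state. A minimal sketch, with `ReloadGate` and `should_reload` as hypothetical names not taken from the PR:

```rust
/// Remembers whether background work was running on the previous tick.
#[derive(Default)]
struct ReloadGate {
    was_busy: bool,
}

impl ReloadGate {
    /// Call once per tick. Returns true only on the tick where the state
    /// changes from busy to idle, so the reload happens exactly once.
    fn should_reload(&mut self, busy_now: bool) -> bool {
        let fire = self.was_busy && !busy_now;
        self.was_busy = busy_now;
        fire
    }
}
```

While a background operation is running the gate stays closed on every tick, which is what removes the repeated reload.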

fix: include spec_gap field in simulation prompt JSON schema so LLM a…

ecdc9f6

…ctually generates it

feat: add [R] Regen keypress in TUI to regenerate a feature from dire…

1d23b9c

…ctory state Adds RegenerateFeature action bound to Shift+R in both tree and flat-list modes, with an API layer that validates the selected node is a root feature before spawning regeneration.

feat: add depth picker modal for full exploration and end on answer

87161f7

Full explore (X key) now shows a depth selection modal instead of immediately starting with hardcoded depth 3. Exploration also runs with end_on_answer=true so it stops after answering leaf nodes.

feat: add consume whole spec toggle to simulation channel picker

8df5e46

Adds a [Tab] toggle on the channel picker screen that loads all spec nodes into the simulation system prompt instead of just the focus node's ancestors and descendants.

feat: display captured key representations in simulation input

1c0e123

Replace string buffer with Vec<CapturedKey> so simulation insert mode captures all keystrokes as discrete tokens. Special keys render as {enter}, {up}, {down}, {tab}, etc. for readable input display.
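A token buffer of this kind might look like the sketch below. Only the `Vec<CapturedKey>` type and the `{enter}`, `{up}`, `{down}`, `{tab}` renderings come from the commit message; the variant names and the `render_keys` helper are assumptions for illustration.

```rust
use std::fmt;

/// One captured input event. Variant names are illustrative only.
#[derive(Debug, Clone, PartialEq)]
enum CapturedKey {
    Char(char),
    Enter,
    Up,
    Down,
    Tab,
}

impl fmt::Display for CapturedKey {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            // Printable characters are shown as themselves.
            CapturedKey::Char(c) => write!(f, "{}", c),
            // Special keys are shown as brace-wrapped names.
            CapturedKey::Enter => f.write_str("{enter}"),
            CapturedKey::Up => f.write_str("{up}"),
            CapturedKey::Down => f.write_str("{down}"),
            CapturedKey::Tab => f.write_str("{tab}"),
        }
    }
}

/// Render the captured sequence for display in the input area.
fn render_keys(keys: &[CapturedKey]) -> String {
    keys.iter().map(|k| k.to_string()).collect()
}
```

Keeping keys as discrete tokens, rather than appending to a string, lets a literal typed "{enter}" stay distinguishable from the Enter key in the stored sequence, even though both render the same way.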

fix: scope mouse capture to simulation insert mode only

2bf4fcc

Global EnableMouseCapture was preventing text selection/copy across the entire TUI. Now mouse capture is toggled on only when entering simulation insert mode and off when leaving.

freesig added 5 commits April 1, 2026 08:20

fix: convert MCP tool handlers to async to prevent runtime deadlock

ab34691

Sync tool handlers were calling Handle::block_on() from within the tokio async runtime, causing hangs on every MCP tool call. Converted all 10 affected handlers to async fn and replaced block_on with .await.

debug: add tracing to MCP tool call path to diagnose hang

9a421cc

Instruments: tool handler entry/exit, answer_node stages, submit_op flow, op_loop receive/apply, and MCP handler creation.

chore: reduce debug tracing to lightweight debug-level logs

5f7a652

fix: use floor_char_boundary to prevent UTF-8 slicing panic in error …

1cf0c3d

…messages
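The panic this commit avoids comes from slicing a `&str` at a byte index that falls inside a multi-byte character. The sketch below shows the idea with a hand-written equivalent built on `str::is_char_boundary`, so it does not depend on `floor_char_boundary` being available on a given toolchain; `floor_boundary` and `truncate_message` are hypothetical names.

```rust
/// Largest index <= `index` that lies on a UTF-8 char boundary of `s`.
fn floor_boundary(s: &str, index: usize) -> usize {
    if index >= s.len() {
        return s.len();
    }
    let mut i = index;
    // Index 0 is always a boundary, so this loop terminates.
    while !s.is_char_boundary(i) {
        i -= 1;
    }
    i
}

/// Truncate `s` to at most `max_bytes` bytes without splitting a character.
fn truncate_message(s: &str, max_bytes: usize) -> &str {
    &s[..floor_boundary(s, max_bytes)]
}
```

Slicing `"h\u{e9}llo"` directly at byte 2 would panic because the two-byte character occupies bytes 1 and 2; rounding the index down to 1 keeps the slice valid.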

freesig wants to merge 35 commits into freesig/sim-1-tui-infra from freesig/sim-2-simulation
