Freesig/sim#9

Draft
freesig wants to merge 100 commits into main from freesig/sim

@freesig
Collaborator

@freesig freesig commented Mar 31, 2026

No description provided.

freesig added 30 commits March 30, 2026 15:10
…from dir

The seed feature now opens a full filesystem tree browser instead of a
plain text path input. Users can expand/collapse directories, navigate
with arrow keys, go above home with Backspace, and select with S.
After selecting a directory, a new screen lets you choose exploration
depth (1–5) before seeding. Uses the existing recursive ingest backend
instead of single-level seed_spec, with background status polling.

Add candidate browsing and acceptance to the spec_view content panel.
When a node has candidates, they appear below the answer section with
[/] to browse and [y] to accept. Candidates persist after acceptance.

Both create-spec and seed-from-directory flows now prompt the user
to choose between local/remote and development/exploration before
proceeding, passing the selection through to the spec-forest API.

Wire [f] and [n] keybindings in SpecView to create new root features
and child questions via the external editor, using existing backend
create_feature and add_child APIs.

Poll sync connection status on every tick and display a red
"Sync: not connected" label in the footer when a sync URL is
configured but the connection is not established (not logged in,
login failed, or connection dropped).

Add a settings screen accessible from the spec view (press [g]) that
shows the current active directory and allows changing it via the
directory browser or clearing it. The directory automatically flows
to AI prompts for development-mode specs once set.

Sync module and other shared library code used eprintln!() which writes
to stderr and corrupts the TUI alternate screen, leaving lingering text
after operations like sync login. Switched all eprintln calls to tracing
macros so messages route to the log file instead. Also fixed a test that
used an 80-column terminal too narrow for the full footer text.

Captures tracing events into an in-memory ring buffer via a custom
tracing Layer, displayed as a bottom panel toggled with [l].
Supports scrolling with PgUp/PgDn. File logging is unaffected.
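The in-memory log panel can be pictured as a bounded ring buffer. This is a minimal sketch using only the standard library. `LogRing` and its `window` method are hypothetical names; the PR's actual tracing Layer and buffer type are not shown here.

```rust
use std::collections::VecDeque;

/// Fixed-capacity buffer that keeps only the most recent log lines.
/// (Hypothetical type for illustration, not the PR's implementation.)
pub struct LogRing {
    lines: VecDeque<String>,
    capacity: usize,
}

impl LogRing {
    pub fn new(capacity: usize) -> Self {
        Self { lines: VecDeque::with_capacity(capacity), capacity }
    }

    /// Append a line, evicting the oldest one once the buffer is full.
    pub fn push(&mut self, line: impl Into<String>) {
        if self.capacity == 0 {
            return;
        }
        if self.lines.len() == self.capacity {
            self.lines.pop_front();
        }
        self.lines.push_back(line.into());
    }

    /// Return up to `height` lines ending `scroll` lines above the newest,
    /// which is roughly what a PgUp/PgDn-scrollable panel would render.
    pub fn window(&self, scroll: usize, height: usize) -> Vec<&str> {
        let end = self.lines.len().saturating_sub(scroll);
        let start = end.saturating_sub(height);
        self.lines.range(start..end).map(String::as_str).collect()
    }
}
```

With a capacity of 3, pushing five lines keeps only the last three, and `window(0, 2)` returns the two newest.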
… support

Show all candidate answer texts instead of only the selected one, fix
misleading [/] browse hint to show actual keybindings, and add [E] to
load the selected candidate into the external editor for editing before
submission.

Tab focus switching no longer desyncs the displayed node. Removed
node_selected flat list index in favor of always using tree_state
for selection. Main panel Up/Down now navigates siblings.

Answering a question no longer automatically generates child nodes by
default. A new "Auto Explore" toggle in the Config screen (g) controls
this behavior, persisted across sessions via the settings table.

Log model, prompt size, response size, and elapsed time on success,
failure, and timeout paths for better observability of Claude API calls.

The TUI now exclusively uses the spec-forest API layer instead of
accessing the database directly. Added missing API functions
(get_children, get_next_question, get/set_setting) and re-exported
DB types from spec-forest so frontends don't need a direct dep.

The editor now shows both fields as editable sections (## Question /
## Answer) so existing answers are preserved and either field can be
updated independently. The flat-list 'e' key now edits the selected
node instead of fetching the next unanswered question.

The comment filter in run_editor() was stripping all lines starting
with '#', including the '## Question' and '## Answer' section markers
that parse_question_answer() needs. This caused edited answers to be
placed into the question field instead.
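The fix amounts to making the comment filter aware of section markers. The sketch below assumes that editor comments start with a single `#`. `strip_comments` is a hypothetical name, and this `parse_question_answer` body is illustrative rather than the PR's implementation.

```rust
/// Drop editor comment lines but keep `## ` section markers.
/// Assumption: comments are lines that start with a single `#`.
pub fn strip_comments(text: &str) -> String {
    text.lines()
        .filter(|line| !line.starts_with('#') || line.starts_with("## "))
        .collect::<Vec<_>>()
        .join("\n")
}

/// Split the filtered text into (question, answer) using the markers.
/// Text before any marker is treated as part of the question.
pub fn parse_question_answer(text: &str) -> (String, String) {
    let mut question = Vec::new();
    let mut answer = Vec::new();
    let mut in_answer = false;
    for line in text.lines() {
        match line.trim_end() {
            "## Question" => in_answer = false,
            "## Answer" => in_answer = true,
            _ if in_answer => answer.push(line),
            _ => question.push(line),
        }
    }
    (
        question.join("\n").trim().to_string(),
        answer.join("\n").trim().to_string(),
    )
}
```

A filter that drops every line starting with `#` would remove both markers, so the parser would never switch to the answer section and all text would land in the question.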
…tion

Introduces a new TUI screen where users can interactively simulate their
software based on spec nodes before implementation. An AI agent drives the
simulation, producing per-channel output (UI, audio, network, errors, logs)
grounded in spec node references.

Key features:
- Channel picker to select output channels before starting
- Dual input modes (Normal/Insert) with Ctrl+Enter to submit
- Tab/split-pane layout cycling (F5) for multi-channel views
- Spec reference overlays ([^N] footnotes, number keys to inspect)
- Spec gap indicators for ungrounded agent behavior
- Behavior reporting mode (r key) for inverse traceability
- Background processing with spinner, session resumption via --resume
freesig added 29 commits April 1, 2026 17:02
Game mode lets users play through their spec by choosing interaction+outcome
pairs. Each choice feeds back into the spec DAG, iteratively refining the
specification through play. Players can also reject outcomes with corrections.

- New data types: GameChoiceGroup, GameOutcome, GameTreeRoot, GameSpecUpdate
- Game-specific AI prompts presenting alternative outcomes per interaction
- Backend orchestration for game turns with background spec updates
- 4 new MCP tools: game_get_choices, game_select_outcome, game_reject_outcome,
  game_get_spec_updates
- TUI: grouped choices panel, reject overlay, spec update log overlay
- Channel picker toggle (g key) to enable game mode

Update game mode cardinal rule and tree rules to steer the AI toward
interactions that expose genuine spec ambiguity rather than obvious
outcomes. At least half of interactions should target decision points
where the player's choice resolves a meaningful specification question.

Show a breadcrumb bar (Start > Login > Email > Submit) in the TUI
simulation view, allowing users to see their full interaction path
and jump to any previous point. Press 'b' to focus the trail, use
left/right to select, Enter to jump. Also exposes breadcrumbs via
the sim_get_status MCP tool and adds a sim_navigate_to tool.

LLM responses sometimes include prose or markdown code fences around
JSON, causing parse failures in simulation and game mode. Strengthen
all prompts to explicitly forbid non-JSON output and extract a shared
generic `extract_json<T>()` helper to replace duplicated 4-step
extraction logic across all four parse functions.
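The span-location part of such a helper can be sketched without any dependencies. `json_span` is a hypothetical name. The PR's `extract_json<T>()` is generic and presumably deserializes into `T`, a step this sketch leaves out, and its four extraction steps are not reproduced here.

```rust
/// Return the slice of `response` that looks like a JSON object or array,
/// ignoring surrounding prose and markdown code fences.
/// This only locates the span; it does not validate or deserialize it.
pub fn json_span(response: &str) -> Option<&str> {
    // The first opening bracket decides which closing bracket to look for.
    let start = response.find(|c: char| c == '{' || c == '[')?;
    let close = if response[start..].starts_with('{') { '}' } else { ']' };
    // The last matching closing bracket ends the span.
    let end = response.rfind(close)?;
    if end < start {
        return None;
    }
    // Both brackets are single-byte ASCII, so these indices are valid
    // character boundaries.
    Some(&response[start..=end])
}
```

A caller would pass the returned slice to its JSON parser and include both the parser error and the full response in any failure message.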
Simulation init was slow with no way to see what Claude was doing.
Switch from --output-format json to stream-json, parse NDJSON to
extract the same final result while logging tool calls at info level
and all intermediate events at debug level.

Replace cmd.output() with spawn + BufReader line-by-line streaming
so stream-json events are logged as they arrive, not after the full
response completes. Tool use events now appear immediately in logs.
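Line-by-line streaming with the standard library might look like the following. The function names are hypothetical, and the `on_line` callback stands in for whatever the PR does with each stream-json event (parsing and logging are omitted).

```rust
use std::io::{BufRead, BufReader};
use std::process::{Command, Stdio};

/// Call `on_line` for each line as soon as it is read, instead of
/// waiting for the whole stream to finish. Returns the line count.
pub fn stream_lines<R: BufRead>(
    reader: R,
    mut on_line: impl FnMut(&str),
) -> std::io::Result<usize> {
    let mut count = 0;
    for line in reader.lines() {
        let line = line?;
        on_line(line.as_str());
        count += 1;
    }
    Ok(count)
}

/// Spawn `cmd` with piped stdout and stream its output line by line.
/// Unlike `Command::output()`, nothing is buffered until exit.
pub fn run_streaming(
    mut cmd: Command,
    on_line: impl FnMut(&str),
) -> std::io::Result<usize> {
    let mut child = cmd.stdout(Stdio::piped()).spawn()?;
    let stdout = child.stdout.take().expect("stdout was piped");
    let count = stream_lines(BufReader::new(stdout), on_line)?;
    child.wait()?;
    Ok(count)
}
```

Keeping the reader generic means the streaming logic can be exercised with an in-memory `Cursor` without spawning a process.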
Byte-index slicing panicked on multi-byte UTF-8 characters (e.g. '…')
when truncating log lines to 200 bytes.
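A common way to avoid this panic is to back up to the nearest character boundary before slicing. This sketch uses a hypothetical helper name; the PR's actual fix is not shown.

```rust
/// Truncate `s` to at most `max_bytes` bytes without splitting a character.
pub fn truncate_at_char_boundary(s: &str, max_bytes: usize) -> &str {
    if s.len() <= max_bytes {
        return s;
    }
    // Index 0 is always a boundary, so this loop terminates.
    let mut end = max_bytes;
    while !s.is_char_boundary(end) {
        end -= 1;
    }
    &s[..end]
}
```

For example, '…' (U+2026) occupies three bytes, so cutting "ab…cd" at byte 3 would land inside it; the helper backs up and returns "ab".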
LLMs struggle to produce valid deeply-nested JSON at scale (~32KB),
causing consistent parse failures like "key must be a string at column
18790". Replace the nested tree format with a flat {nodes, edges}
adjacency list that eliminates nesting entirely.

- Add FlatTree/FlatNode/FlatEdge wire types for Claude's output
- Add flat_to_sim_tree and flat_to_game_tree conversion functions
- Parse functions try flat format first, fall back to nested (legacy)
- Update prompt schemas for both sim and game mode
- Include serde error and full response in parse failure messages
- Add tracing to extract_json for step-by-step diagnostics
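To illustrate the idea, here is a small sketch that converts a flat adjacency list back into a nested tree. The field names, the `root` field, `TreeNode`, and `flat_to_tree` are assumptions for illustration; the PR's `FlatTree`, `flat_to_sim_tree`, and `flat_to_game_tree` may differ.

```rust
use std::collections::HashMap;

/// Wire types: a flat adjacency list with no nesting.
/// Field names are assumptions, not the PR's actual schema.
pub struct FlatNode { pub id: String, pub text: String }
pub struct FlatEdge { pub from: String, pub to: String, pub label: String }
pub struct FlatTree { pub root: String, pub nodes: Vec<FlatNode>, pub edges: Vec<FlatEdge> }

/// Nested in-memory form: each child is paired with its edge label.
#[derive(Debug, PartialEq)]
pub struct TreeNode { pub text: String, pub children: Vec<(String, TreeNode)> }

/// Rebuild the nested tree by following edges from the root.
/// Returns None if an edge points at a missing node or forms a cycle.
pub fn flat_to_tree(flat: &FlatTree) -> Option<TreeNode> {
    let nodes: HashMap<&str, &FlatNode> =
        flat.nodes.iter().map(|n| (n.id.as_str(), n)).collect();
    let mut path = Vec::new();
    build(&flat.root, &nodes, &flat.edges, &mut path)
}

fn build<'a>(
    id: &'a str,
    nodes: &HashMap<&str, &FlatNode>,
    edges: &'a [FlatEdge],
    path: &mut Vec<&'a str>,
) -> Option<TreeNode> {
    if path.contains(&id) {
        return None; // cycle guard: this id is already on the current path
    }
    let node = nodes.get(id)?;
    path.push(id);
    let mut children = Vec::new();
    for e in edges.iter().filter(|e| e.from == id) {
        children.push((e.label.clone(), build(&e.to, nodes, edges, path)?));
    }
    path.pop();
    Some(TreeNode { text: node.text.clone(), children })
}
```

Because the wire format is only two flat arrays, the nesting depth of the generated JSON stays constant however deep the tree is.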
Mirrors the web UI's AccessPanel, allowing users to view spec members
and creators to add/remove members via the sync server.

New experimental game mode focused on efficient spec refinement through
simulated software play. Each output generates 2 new child nodes plus
shortcut edges to existing nodes, forming a DAG. Batch pre-generation
(depth 3) keeps navigation instant, and background spec updates refine
the spec as the player navigates. Includes TUI panel, MCP tools, and
entropy-guided interaction selection.

Use spec_read_only config for lean batch generation so the AI can only
call search_nodes, get_node, get_descendants, and get_spec_summary.
Removes filesystem tools (Read, Glob, Grep) and explicitly tells the
AI not to attempt any write or sim/game tools.

… and backgrounding

Allow pressing back while a scene is generating by updating can_go_back
during Processing status and cancelling in-flight leaf generation via the
generation counter. Auto-trigger pregen when landing on nodes with shallow
depth (after initial turn, navigation, and go-back). Change Esc to
background lean games instead of destroying them, with full session
restore via the existing session picker.
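Cancellation via a generation counter can be sketched like this. The `Generation` type and its methods are hypothetical. The idea is that work captures a ticket when it starts, and its result is discarded if the counter has moved on by the time it finishes.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Arc;

/// Shared counter; bumping it invalidates all work started earlier.
/// (Hypothetical type; the PR's counter may be a plain field.)
#[derive(Clone, Default)]
pub struct Generation(Arc<AtomicU64>);

impl Generation {
    /// Start a new generation and return a ticket for work spawned now.
    pub fn begin(&self) -> u64 {
        self.0.fetch_add(1, Ordering::SeqCst) + 1
    }

    /// Cancel in-flight work by advancing past every issued ticket.
    pub fn cancel(&self) {
        self.0.fetch_add(1, Ordering::SeqCst);
    }

    /// A result is applied only if its ticket is still current.
    pub fn is_current(&self, ticket: u64) -> bool {
        self.0.load(Ordering::SeqCst) == ticket
    }
}
```

A background task would call `is_current(ticket)` before writing its result, so pressing back simply makes the stale result a no-op rather than interrupting the task.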
…le spec

Main lean AI now signals spec-relevant behavior via per-node spec_updates in
the batch response. On navigation, a separate background AI session with write
tools and full spec context determines the best placement: adding Q&A nodes,
updating existing answers, or creating new features as appropriate.

Remove automatic background Claude sessions for spec updates during lean
game navigation. Instead, navigation history accumulates as unsent actions
that the user explicitly sends (press 's') with notes to the main AI
session via --resume. The AI then uses write MCP tools to update the spec.

Key changes:
- Track unsent action count via lean_sent_path_len on SimSession
- Queue leaf generation and send-actions when session is busy
- Warn on quit if unsent actions remain (double-press Q to confirm)
- Show send(N) in status bar, spec update progress in output title
- Remove LeanSpecSuggestion, background spec update orchestration,
  and separate spec update Claude sessions
- Add LeanGraph::replace_at() for modify: replaces current node's content
  and edges with the new batch so the modified output is immediately visible
- Update format_navigation_history to use edge labels (e.g. "Click Submit")
  instead of raw_text for clearer action descriptions
- Send actions overlay now lists all unsent actions by label before the
  notes input, so users can see what they're sending

Show "◐" yellow indicator on edges approaching ungenerated frontier nodes
instead of the normal "●" green. Trigger background pregen immediately
after navigating via a leaf edge so the next level generates without delay.

Silent navigation now signals acceptance: the AI is instructed to
treat unmodified/unqueried outputs as correct and use them to fill
unspecified gaps in the spec.

Op notifications were silently dropped while on LeanGame or Simulation
screens, leaving self.nodes stale. Nodes added via MCP during those
sessions would appear empty or missing when navigating back to SpecView.

…tputs

Add explicit instructions across system prompt, channel semantics, and DAG
rules telling the AI to render concrete application output rather than
surfacing spec-level questions or uncertainty markers in channel text.

Back navigation no longer removes entries from the send-actions history.
A separate append-only lean_action_history records every forward and back
navigation chronologically, so send-actions reflects the complete player
journey including backtracking.

Pregen was not triggering when navigating to a leaf node because:
1) A prior pregen (from an ancestor) held lean_generating=true, and
   after completing it never re-checked the current position.
2) The pregen anchor was the navigated-to node which might have only
   generative edges — merge_batch couldn't attach the new batch.

Now find_pregen_target BFS-walks to the nearest node with leaf edges,
and spawn_queued_work re-checks the current position after pregen ends.
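The BFS walk can be sketched as follows. The graph representation, `Edge`, and `EdgeKind` are simplified assumptions; only the name `find_pregen_target` comes from the commit message.

```rust
use std::collections::{HashMap, HashSet, VecDeque};

/// Edge kinds, simplified. Assumption: a leaf edge marks a spot where a
/// newly generated batch can attach; the other kinds point at existing nodes.
#[derive(Clone, Copy, PartialEq)]
pub enum EdgeKind { Generative, Shortcut, Leaf }

pub struct Edge { pub kind: EdgeKind, pub to: usize }

/// Breadth-first search from `start` for the nearest node that has at
/// least one leaf edge. Returns None if no such node is reachable.
pub fn find_pregen_target(
    graph: &HashMap<usize, Vec<Edge>>,
    start: usize,
) -> Option<usize> {
    let mut seen = HashSet::from([start]);
    let mut queue = VecDeque::from([start]);
    while let Some(node) = queue.pop_front() {
        // Nodes with no entry in the map have no outgoing edges.
        if let Some(edges) = graph.get(&node) {
            if edges.iter().any(|e| e.kind == EdgeKind::Leaf) {
                return Some(node);
            }
            for e in edges {
                // `insert` returns false for already-visited nodes,
                // which keeps shortcut edges from causing revisits.
                if seen.insert(e.to) {
                    queue.push_back(e.to);
                }
            }
        }
    }
    None
}
```

Because the search is breadth-first, the returned anchor is the closest one in number of hops from the player's current position.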
Generative/Shortcut navigation now explicitly sets status to Idle, and
the Processing branch syncs interactions from the graph so existing
nodes always show their edges regardless of status timing.

Include all simulation channels (network, audio, errors, logs) in the
navigation history sent to the spec update prompt, and add guidance for
the AI to reason about system-wide implications of UI interactions.

The lean game AI was not grounding its outputs in the spec, making
different choices even when the spec clearly specified behavior. Add
SPEC FIDELITY and WHEN THE SPEC IS SILENT sections to both system
prompt builders to prioritize faithful spec rendering over entropy
exploration.

Spawn fast Haiku-based text scenarios in parallel with the slow initial
lean game turn so the player has something to do while waiting. Warmup
picks high-entropy spec nodes, generates short situational prompts, and
captures player responses for later spec updates via send-actions.

Restructure lean game prompts so the AI acts as a scenario designer
rather than a generic simulator. The AI now identifies high-entropy
decisions it had to make and designs DAG paths as mini-scenarios that
force the player to confront those assumptions.

Key changes:
- Activate spec_gaps field: AI logs implementer assumptions per channel
- Feature-scoped entropy: high-entropy nodes filtered to focus feature
- Scenario design framing: CARDINAL RULE rewritten for gap exploration
- spec_gaps flow into send-actions prompt as validation evidence
- Resume prompt guides continued scenario exploration

Show warmup scenarios in the output panel while the main game loads.
Players press 'r' to enter response mode and Ctrl+S to submit. The
status bar shows warmup-specific hints and a "game ready!" indicator
when the real game has loaded. Warmup state is synced from the session
and cleared on transition to the real game.

@freesig freesig marked this pull request as draft April 6, 2026 23:19
