Skip to content

zzunkie/yardlet

Repository files navigation

Yardlet

crates.io CI downloads license: MIT

English | 한국어

Rent the intelligence. Own the work. Yardlet is a local console for engineering the loop that turns a few sentences of intent into verified, durable work, using your already-installed coding agents as interchangeable workers.

Yardlet terminal UI demo

"I don't prompt Claude anymore. I have loops running that prompt Claude… my job is to write loops." That is how Anthropic's Claude Code lead describes his own workflow now, and loop engineering is the name the practice picked up. Yardlet is that practice as a product, for everyone:

  • Prompts are compiled, not written. You state intent once; every worker prompt is built from contracts, rules, skills, role discipline, and checkpoints you own. Improve those inputs and every future prompt improves.
  • The loop is yours, not a vendor's. Worker-neutral (Claude Code, Codex, or any CLI behind one contract), local (state lives in your repo), and it survives crashes, restarts, and worker swaps.
  • The verifier is never the doer. A deterministic evaluator checks every run against the contract; risky plans get reviewer-role verification tasks. "Done" is earned, not self-reported. Mechanical checks are deterministic (schema, IDs, scope drift, forbidden paths from the actual git diff, and the validation commands Yardlet runs itself); semantic quality is judged by separate reviewer-role tasks, not by pretending a checker judges everything.

Full identity: docs/identity.md.

User intent (a few sentences)
  -> planning gate            intent / scope / acceptance contract
  -> work queue               bounded tasks, dependencies, parallel-ready
  -> packet compiler          prompts built from state you own
      -> hidden workers       claude / codex / any CLI, sandboxed, routable
  -> deterministic evaluator  done is checked, not declared
  -> checkpoint / handoff     durable artifacts, resumable forever

Install

cargo install yardlet

Prebuilt binaries for macOS and Linux are attached to each GitHub release; with cargo-binstall installed, cargo binstall yardlet fetches one instead of compiling.

Your Claude Code and Codex, as they are

If claude or codex runs on your machine, Yardlet can drive it, with no new accounts, no extra configuration, no setup step. Yardlet discovers the installed CLIs, probes readiness, and puts them to work exactly as you already pay for them. Any other agent CLI can be added in config alone (see "Adding a worker"), including API-backed tools via a per-worker invocation.pass_env opt-in.

Because the worker runs in your project, your existing setup keeps working inside each task: your CLAUDE.md, skills, hooks, MCP servers, and subagents all still apply. Yardlet layers orchestration and verification on top rather than replacing your harness. It does not try to be clever at the LLM layer; it runs the harnesses that are already good there as interchangeable workers and spends its own effort on the parts you can solve deterministically: routing, evaluation, state, merges, recovery, handoffs.

It is built to run within the subscriptions you already pay for, not to rack up per-token API costs. Billing keys (ANTHROPIC_API_KEY, OPENAI_API_KEY, and the like) are scrubbed from the worker environment before spawn, so an unattended auto-drain cannot silently bill against an API key instead of your subscription; a worker opts a specific var back in only via pass_env.

The loop

cd your-project
yardlet new "add admin order search with status, email, and date filters"
yardlet queue                      # review the planned tasks
yardlet run --auto                 # drain the queue, stopping only at human gates
yardlet handoff                    # read the teammate-readable summary
yardlet                            # or do it all from the terminal UI

Like the worker CLIs, yardlet just works in any directory: the first command creates .agents/ state on demand. yardlet init exists for scripting or to re-scaffold, but you do not need to run it first.

A one-sentence request becomes an intent contract plus a bounded task queue with explicit dependencies; each task runs through a hidden worker, is checked by a deterministic evaluator, and leaves a checkpoint and handoff under .agents/runs/.

Terminal UI shortcuts

The terminal UI (yardlet with no subcommand) is the main way to drive a session. From the Home screen:

Key Action
n New work: describe a request (when idle).
r Run the next task.
A Auto-drain the queue.
p Approve the next task; while a drain runs, request a graceful pause.
a Answer a task waiting on you (NeedsUser).
Esc Stop the running worker.
/ Browse the queue, then the workers panel past its end.
Enter Open the selected task's handoff.
Space / Enter Toggle the selected worker on/off (in the workers panel).
i View the intent contract.
h View the latest handoff.
R Reports and history browser.
m Monitor the worker's live output.
s Settings (can be opened mid-run).
g Refresh, re-probing worker readiness.
l Toggle language.
f Toggle access level (sandboxed / full).
u Restart into a freshly installed update (when available).
q / Ctrl+C Quit.

Korean keyboard layouts work without switching back to English: the Hangul jamo are mapped to the same shortcuts.

Commands

Command Purpose
yardlet Open the terminal UI (auto-inits on first use).
yardlet init [--force] Explicitly scaffold .agents/ state (optional).
yardlet new "<request>" [--worker <id>] Plan a request into an intent contract + queue.
yardlet goal "<goal>" [--verify "..."] Express lane: skip planning, run one goal to a verify condition.
yardlet new "..." --image <path> Attach a local image to the goal (also auto-detected from the request).
yardlet queue List the work queue.
yardlet status [--json] Workspace, intent, queue, and worker summary.
yardlet worker status Worker readiness and billing-env safety.
yardlet inspect repo [--json] Cheap deterministic local evidence.
yardlet packet --task <id> --worker <id> [--dry-run] Compile a worker packet.
yardlet run --next [--execute] [--worker <id>] Prepare (default) or run the next task.
yardlet run --auto [--parallel N] Drain the queue autonomously; optionally N tasks at once.
yardlet answer "<reply>" Answer a task waiting on you (NeedsUser) and resume it.
yardlet handoff Print the latest run's handoff.
yardlet report Print the intent's final report (aggregate of every task).
yardlet recover Recover state from an interrupted session (orphaned runs, unread plans).
yardlet skill list / suggest / equip <preset> / unequip / research / create / apply / review Classify, equip, author, and score skills.
yardlet harness review Show auto-learned rules and skills with their eval scores.
yardlet routing review Per-kind worker success stats + suggested preferences.
yardlet routing apply --kind K --worker W Pin a worker for a task kind (human-approved).

When a worker needs input it leaves the task in NeedsUser with a question. yardlet status (and the TUI) shows the question; reply with yardlet answer "..." (or press a in the TUI) and Yardlet re-runs the task with your answer.

Language

Worker-authored content (plan summary, task titles, handoff, questions) follows your language. By default Yardlet auto-detects it from your request, so a Korean request gets a Korean plan and handoff while code and identifiers stay English. Set language: in .agents/yardlet.yaml to ko/en/etc. to force one.

Permissions

Workers run in a bounded sandbox by default (local files and tests, no network). This is layered:

  1. Safe by default: codex workspace-write, claude acceptEdits.
  2. Report, don't bypass: if a worker needs network, an install, production, or a destructive action, it stops and asks via NeedsUser instead of failing silently. You grant access and resume.
  3. Explicit escalation: yardlet run --next --execute --full-access (or yardlet answer --full-access) drops the sandbox for that run only. Off by default; it is a human-granted permission, never automatic.

Worker routing

The planner picks a worker per task from an editable rubric in .agents/workers.yaml (each worker's best_for + a cost_bias dial). At run time the choice is deterministic: preferred worker → readiness check → fall back to the next ready worker. Every run logs its outcome to .agents/telemetry/runs.jsonl; yardlet routing review aggregates that and suggests profile changes (e.g. "claude-code wins refactors"), which you apply with yardlet routing apply. Telemetry never changes routing on its own. Design: docs/routing-and-telemetry.md.

run --next prepares a run and stops before invoking a worker by default, because spawning a subscription-backed worker consumes usage. Pass --execute to actually run it.

Workers can be toggled on/off from the Home workers panel (arrow keys past the queue, then Enter/Space); a disabled worker is skipped by routing and planning.

Adding a worker

Codex and Claude Code have built-in adapters. Any other subscription-backed CLI can be added in .agents/workers.yaml alone: give it an invocation template and Yardlet drives it through the same contract (packet on stdin → result files out). Placeholders: {run_dir}, {model}, {effort}, {image}.

- id: mytool
  best_for: "..."            # planner rubric
  invocation:
    command: mytool          # must support --version (readiness probe)
    supports_noninteractive: true
    args: ["run", "--json", "--out", "{run_dir}"]
    sandbox_args: ["--sandbox"]        # default access level
    full_access_args: ["--yolo"]       # only when full access is granted
    model_args: ["--model", "{model}"] # added when a model is set
    effort_args: ["--effort", "{effort}"]
    image_args: ["-i", "{image}"]      # repeated per attached image

The worker must be able to write files in the workspace (that is how results come back); its subprocess env is sanitized unless the profile opts vars back in with pass_env.

The ecosystem's agents are Yardlet's supply side: terminal agents like oh-my-pi (omp), OpenCode, Gemini CLI, or an API-backed CLI of your own all fit the same template. Register the winners, swap them per task, keep the records.

Role profiles

Each task runs under a role, a prompt mode over the worker, derived from the task kind: implementationbuilder, reviewreviewer, researchresearcher, safetysecurity. The same Codex/Claude session gets role-specific working rules (a reviewer cites file:line evidence and doesn't rewrite code; a researcher makes no code changes; security audits adversarially and never prints secret values). Extend a role per workspace by writing .agents/agents/<role>.md; it is appended to that role's packets.

Parallel execution

The planner marks which tasks genuinely depend on each other (depends_on); everything else is independent. With parallelism on, the auto-drain runs up to N independent tasks at once, each in its own git worktree on branch yard/<task-id>, possibly on different workers. Workers run in parallel, but queue state keeps a single writer and results merge back sequentially; a merge conflict is never auto-resolved (the task drops to Partial and its worktree is kept for inspection). Off by default; opt in via Settings ("Parallel tasks"), max_parallel in .agents/yardlet.yaml, or yardlet run --auto --parallel 3. Requires a clean git tree, otherwise Yardlet falls back to sequential.

Inside a task, workers are free to use their own subagents. Yardlet's queue parallelism is for work that must survive sessions, cross workers, or pass human gates. Design: docs/parallel-queue.md.

Crash safety

Yardlet state survives restarts. On startup (and via yardlet recover) it recovers interrupted sessions: a planning result the previous session paid for but never read is consumed into the queue, finished orphaned runs are evaluated and merged (worktree runs included), and unfinished ones are requeued.

Build

cargo build
cargo test
cargo run -- init

Contributing: see CONTRIBUTING.md for build/test, the core invariants, and the PR process. Adding another worker is config-only (see "Adding a worker"); PRs to wire up new workers are welcome.

Canonical state

Yardlet owns state; workers do not. Canonical state lives under .agents/ in the target repo:

.agents/
  yardlet.yaml              workspace config
  intent-contract.yaml   current goal / scope / acceptance
  work-queue.yaml         tasks
  *-policy.yaml           tool / approval / interaction / research / billing policy
  workers.yaml            worker profiles + routing
  runs/<run-id>/          per-run artifacts (result, validation, checkpoint, handoff)
  checkpoints/            latest compact resume points
  handoffs/               teammate-readable summaries

User-level, non-secret config lives under ~/.yardlet/.

License

MIT

Packages

 
 
 

Contributors

Languages