Sutra is a governed AI development orchestrator: Codex or Gemini plans, Claude Code executes, and Sutra validates, confirms, tracks, and reports the flow.
It is intentionally lightweight: a local Python CLI that can be installed directly into a repo and used from the terminal.
- Validates that Codex/Gemini and Claude Code are available from the same developer shell.
- Confirms the chain: Codex/Gemini -> Sutra -> Claude Code.
- Converts a requirement file into bounded Claude Code tasks.
- Shows the tasks before execution.
- Requires confirmation before running tasks unless --yes/--auto-approve is used.
- Runs Claude Code task-by-task with timeout, max turns, model, budget, and allowed tools.
- Runs validation commands after every task.
- Tracks progress in .sutra/runs/<RUN_ID>/progress.json and docs/progress.md.
- Tracks Claude token usage or estimated usage in .sutra/runs/<RUN_ID>/token-ledger.json.
- Produces a run summary.
Install the standalone tool globally using pipx:
pipx install sutra-cli
Or using standard pip:
python -m pip install sutra-cli
Download the pre-compiled binary for your operating system from the Latest Releases page. No Python installation is required.
Install the latest development version directly from the repository:
pipx install git+https://github.com/sairintechnologycom/sutra.git
If you are contributing to Sutra, install it in editable mode:
git clone https://github.com/sairintechnologycom/sutra.git
cd sutra
python -m pip install -e .
Verify the installation:
sutra --help
Install and authenticate the CLIs you want to use:
claude --version
codex --version
# or
gemini --version
Sutra does not directly authenticate these tools. It validates that they are installed, callable, and usable in headless mode.
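The "installed" part of that check can be approximated with the standard library. This is a minimal sketch, not Sutra's own implementation: `find_clis` and `missing_clis` are hypothetical helper names, and the sketch only confirms that each executable resolves on PATH. It does not verify authentication or headless mode.

```python
import shutil


def find_clis(names):
    """Map each CLI name to its resolved path, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in names}


def missing_clis(names):
    """Return the CLI names that could not be found on PATH."""
    return [name for name, path in find_clis(names).items() if path is None]


if __name__ == "__main__":
    # Hypothetical pre-flight check, run from the same shell you use for Sutra.
    for name, path in find_clis(["claude", "codex", "gemini"]).items():
        print(f"{name}: {path or 'not found'}")
```

Running it from the same shell you use for Sutra shows whether the planner and Claude Code resolve in one environment.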
sutra init
This creates:
.sutra/config.json
CLAUDE.md
AGENTS.md
GEMINI.md
.claude/settings.json
.claude/skills/implement-task/SKILL.md
docs/progress.md
docs/decisions.md
requirements/REQ-001.md
For Codex:
sutra doctor --engine codex --smoke-test
For Gemini:
sutra doctor --engine gemini --smoke-test
Expected successful output includes:
CHAIN CONFIRMED: codex -> Sutra -> Claude Code
or:
CHAIN CONFIRMED: gemini -> Sutra -> Claude Code
This means Sutra can invoke the planner and Claude Code from the same environment.
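If you script the doctor step (for example in CI), the confirmation line can be detected with a small parser. This is a sketch under assumptions: `confirmed_engine` is a hypothetical helper, and it relies only on the confirmation line quoted above. The rest of the doctor output is not assumed to have any particular format.

```python
import re

# Matches only the confirmation line shown in this README.
CHAIN_RE = re.compile(
    r"^CHAIN CONFIRMED: (codex|gemini) -> Sutra -> Claude Code$",
    re.MULTILINE,
)


def confirmed_engine(doctor_output):
    """Return 'codex' or 'gemini' if the chain was confirmed, else None."""
    match = CHAIN_RE.search(doctor_output)
    return match.group(1) if match else None
```

A caller would pass the captured text of `sutra doctor --engine codex --smoke-test` and treat `None` as a failed check.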
Example:
requirements/REQ-001.md
# Requirement: Add Student Progress Dashboard
## Goal
Build a dashboard that shows progress by subject, topic, quiz score, and mastery level.
## Expected Outcome
- Dashboard page created
- API endpoint added if required
- Tests added
- Existing tests pass
## Constraints
- Use existing frontend layout
- Do not introduce a new UI library
- Do not change authentication flow
## Validation
- npm test passes
- npm run lint passes
sutra plan --input requirements/REQ-001.md --engine codex
or:
sutra plan --input requirements/REQ-001.md --engine gemini
Sutra shows the generated tasks immediately.
sutra validate --run REQ-001
sutra approve --run REQ-001
sutra run --run REQ-001 --smoke-test
Sutra shows each task before and during execution:
▶ Executing T001: Inspect repository, confirm scope, and identify required files
model=sonnet timeout=300s max_turns=2
allowed_tools=Read, Bash(git status *), Bash(git diff *)
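One way to picture a bounded task is as a small record carrying the limits shown in that banner. The sketch below is illustrative only: the `Task` class and its field names are assumptions, not Sutra's internal data model. It reproduces the three lines above from those fields.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    # Hypothetical field names; Sutra's real task schema may differ.
    task_id: str
    title: str
    model: str
    timeout_s: int
    max_turns: int
    allowed_tools: list = field(default_factory=list)

    def banner(self):
        """Format the task the way the execution banner above is laid out."""
        return (
            f"▶ Executing {self.task_id}: {self.title}\n"
            f"model={self.model} timeout={self.timeout_s}s max_turns={self.max_turns}\n"
            f"allowed_tools={', '.join(self.allowed_tools)}"
        )


task = Task(
    task_id="T001",
    title="Inspect repository, confirm scope, and identify required files",
    model="sonnet",
    timeout_s=300,
    max_turns=2,
    allowed_tools=["Read", "Bash(git status *)", "Bash(git diff *)"],
)
print(task.banner())
```

Keeping the limits on the task itself makes each one reviewable before confirmation.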
sutra run \
--input requirements/REQ-001.md \
--engine codex \
--auto-approve \
--smoke-test
Use this only when .claude/settings.json and .sutra/config.json are properly locked down.
Dry run shows the planned task execution without invoking Claude Code:
sutra run --run REQ-001 --dry-run -y --skip-doctor
sutra tokens report --run REQ-001
Important: actual token usage is used when Claude output exposes usage metadata. Otherwise, Sutra estimates token usage using prompt/output size and calculates savings against the configured baseline multiplier.
Config location:
.sutra/config.json
Default baseline:
{
  "policy": {
    "token_baseline_multiplier": 1.5
  }
}
Formula:
tokens_saved = estimated_baseline_tokens - actual_or_estimated_tokens
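A worked example may help here. The README does not say how estimated_baseline_tokens is derived, so this sketch assumes the baseline is the used token count multiplied by token_baseline_multiplier. Treat that as an illustration, not as Sutra's exact calculation. The 12,000-token figure is made up.

```python
import json


def tokens_saved(used_tokens, multiplier):
    """Apply the formula above under an assumed baseline definition."""
    # ASSUMPTION: baseline = used tokens * configured multiplier.
    estimated_baseline_tokens = round(used_tokens * multiplier)
    return estimated_baseline_tokens - used_tokens


# Same shape as the default .sutra/config.json policy shown above.
config = json.loads('{"policy": {"token_baseline_multiplier": 1.5}}')
multiplier = config["policy"]["token_baseline_multiplier"]

# Illustrative input: a run that used 12,000 tokens.
print(tokens_saved(12_000, multiplier))  # 18000 - 12000 = 6000
```

Under this assumption, a multiplier of 1.5 always reports savings equal to half of the tokens used, which is why the figure should be read as an estimate.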
sutra init
sutra doctor --engine codex --smoke-test
sutra plan --input requirements/REQ-001.md --engine codex
sutra validate --run REQ-001
sutra approve --run REQ-001
sutra run --run REQ-001
sutra run-task --run REQ-001 --task T002
sutra status --run REQ-001
sutra summarize --run REQ-001
sutra tokens report --run REQ-001
Sutra uses several controls:
- CLI chain validation before execution.
- Task plan validation.
- Human confirmation before execution by default.
- Claude timeout per task.
- Claude max-turns per task.
- Claude allowed tools per task.
- Validation command allow-list.
- Dangerous command deny-list.
- Progress and token ledger after each task.
- Token savings are estimates unless Claude Code emits usage details in structured output.
- Codex/Gemini planner output must be valid JSON; otherwise Sutra falls back to a local deterministic starter plan unless --strict-planner is used.
- Sutra validates the chain via Sutra-mediated execution, not a direct native integration between Codex/Gemini and Claude.
- The CLI is local-first and does not yet include a remote dashboard.
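The planner fallback mentioned in the list above can be sketched roughly as follows. This is an illustration, not Sutra's code: `load_plan`, `starter_plan`, the `strict` parameter, and the plan shape are all hypothetical stand-ins for the behaviour described, with `strict=True` playing the role of --strict-planner.

```python
import json


def starter_plan():
    """Hypothetical stand-in for the local deterministic starter plan."""
    return {"tasks": [{"id": "T001", "title": "Inspect repository and confirm scope"}]}


def load_plan(planner_output, strict=False):
    """Parse planner output as JSON, falling back unless strict mode is on."""
    try:
        return json.loads(planner_output)
    except json.JSONDecodeError:
        if strict:
            # Mirrors the described strict behaviour: no silent fallback.
            raise ValueError("planner output is not valid JSON")
        return starter_plan()
```

The fallback keeps a run usable when the planner replies with prose, while strict mode surfaces the problem instead of hiding it.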