Autonomous software engineering, powered by OpenAI · Claude · OpenCode
APE plans, reviews, patches, builds, and commits your code while you stay in control. Every change is risk-classified and reviewed by the appropriate AI pipeline before a single byte hits disk. In debate mode, a full Debate Viewer CLI UI renders each phase in real time: color-coded, structured, and engineer-friendly. Works with any language, any stack, any build system.
APE is a fully autonomous AI coding agent for any software project: Python services, TypeScript apps, Rust CLIs, Go microservices, Java backends, C/C++ firmware, and everything in between. You give it a goal in plain English; it works out the task plan, generates minimal unified-diff patches, runs your build system, and iterates on failures, all with human approval gates at every critical step.
Version 2.0 introduces Risk-Gated Debate Mode: every proposed change is automatically routed to the right review pipeline based on a weighted risk classifier. Low-risk changes get a fast single-pass review; high-risk or safety-critical changes (data models, auth flows, transaction handlers, critical system code) go through a 4-phase adversarial AI debate before anything touches your codebase.
| Feature | Detail |
|---|---|
| Multi-provider AI | OpenAI, Claude, and OpenCode; mix and match any model as proposer or critic |
| Risk-gated mode selection | Automatic LITE vs DEBATE routing per change-set |
| 4-phase adversarial debate | Propose → Challenge → Rebuttal → Final audit |
| Free model support | OpenCode Zen API: glm-5-free, minimax-m2.5-free, kimi-k2.5-free, big-pickle |
| Debate Viewer CLI UI | Structured, color-coded terminal panels for every debate phase |
| Risk heatmap | ASCII per-file risk bars rendered after each debate |
| Debate session logs | Auto-persists _session.json, _patch_v1.diff, _patch_v2.diff |
| Firmware-safe guardrails | Blocks struct renames, ISR changes, oversized deletions, and protected paths; web/script files are exempt from deletion limits |
| Unified diff patches | All changes go through git apply; no full-file overwrites |
| Build loop | Runs your build after every patch; auto-fix on failure (up to 3 retries) |
| Budget tracking | Per-provider (OpenAI / Claude) + per-phase USD and token accounting |
| Persistent memory | Architecture decisions, constraints, and errors survive sessions |
| Resume sessions | Pick up exactly where you left off with --resume |
| Human approval gates | Y/N prompts before every apply and commit, always |
| Dry-run by default | Nothing is written to disk unless you explicitly pass --apply |
| Verbose debug mode | --verbose dumps full raw model JSON for every phase |
# Install
cd ape && npm install
cp .env.example .env # add OPENAI_API_KEY and ANTHROPIC_API_KEY
# Preview what APE would do (safe, no writes)
node index.js \
--goal="Add rate limiting middleware to the Express API" \
--type=node \
--build="npm test"
# Actually apply patches
node index.js \
--goal="Add rate limiting middleware to the Express API" \
--type=node \
--build="npm test" \
--apply
Risk-gated debate mode is the centrepiece of APE v2.0. Every task is classified before any AI call is made:
Change set
           │
           ▼
┌──────────────────────────────┐
│  Risk Classifier (0-100)     │
│  • struct / enum    → +40    │
│  • ISR / IRAM_ATTR  → +30    │
│  • memory ops       → +15    │
│  • concurrency      → +15    │
│  • protected path   → +25    │
└──────────┬───────────────────┘
           │
 ┌─────────▼──────────┐
 │   Mode Selector    │◄── --lite-only / --debate-only
 └─────────┬──────────┘
           │
  ┌────────┴────────┐
  │                 │
  ▼                 ▼
LITE mode         DEBATE mode
(score < 55)      (score ≥ 55)
1 provider        4 phases:
call              Phase 1: Proposer generates patch
~$0.01-0.03       Phase 2: Critic challenges
                  Phase 3: Proposer rebuts
                  Phase 4: Critic final audit
                  ~$0.00-0.25 (free w/ OpenCode)
Force-debate conditions override the score entirely:
- Any struct / typedef / enum keyword in the diff
- ISR or IRAM_ATTR in the diff
- Protected path (src/protocol, src/radio, src/routing) + patch > 80 lines
→ Full documentation: docs/risk-gated-debate.md
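The routing rules above can be sketched roughly as follows. This is a minimal illustration, not APE's actual riskClassifier.js or modeSelector.js: only the weights, the 55 threshold, the protected paths, and the force-debate conditions come from this README. The function names, the detection regexes, the clamp to 100, and the precedence between the two override flags are assumptions.

```javascript
// Hypothetical sketch of risk-gated routing. Weights, threshold, and
// force-debate rules follow the README; everything else is illustrative.
const PROTECTED_PATHS = ["src/protocol", "src/radio", "src/routing"];

function touchesProtected(files) {
  return files.some((f) => PROTECTED_PATHS.some((p) => f.startsWith(p)));
}

// Each signal adds its weight once if its (assumed) detector fires.
const SIGNALS = [
  { name: "struct/enum", weight: 40, test: (c) => /\b(struct|typedef|enum)\b/.test(c.diff) },
  { name: "ISR/IRAM_ATTR", weight: 30, test: (c) => /\b(ISR|IRAM_ATTR)\b/.test(c.diff) },
  { name: "memory ops", weight: 15, test: (c) => /\b(malloc|free|memcpy|memset)\b/.test(c.diff) },
  { name: "concurrency", weight: 15, test: (c) => /\b(mutex|semaphore|atomic|thread)\b/i.test(c.diff) },
  { name: "protected path", weight: 25, test: (c) => touchesProtected(c.files) },
];

function classify(change) {
  const triggers = SIGNALS.filter((s) => s.test(change));
  const raw = triggers.reduce((sum, s) => sum + s.weight, 0);
  // Clamping to 100 is an assumption; the listed weights can sum to 125.
  return { score: Math.min(100, raw), triggers: triggers.map((s) => s.name) };
}

function selectMode(change, flags = {}) {
  // CLI overrides win over the classifier (lite-first precedence is assumed).
  if (flags.liteOnly) return "lite";
  if (flags.debateOnly) return "debate";
  const { score, triggers } = classify(change);
  // Rough patch size: every line of the diff text, headers included.
  const patchLines = change.diff.split("\n").length;
  const forced =
    triggers.includes("struct/enum") ||
    triggers.includes("ISR/IRAM_ATTR") ||
    (touchesProtected(change.files) && patchLines > 80);
  return forced || score >= 55 ? "debate" : "lite";
}
```

Note that a lone struct keyword scores only 40, below the threshold, which is why a separate force-debate rule is needed to route it to DEBATE.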
When running in debate mode, APE renders a full structured terminal UI as each phase completes:
────────────────────────────────────────────────────────────
AI DEBATE SESSION [Task 3]
────────────────────────────────────────────────────────────
Mode: debate
Risk Level: HIGH
Risk Score: 72
Triggers:
  · struct keyword detected
  · TTL logic modified
────────────────────────────────────────────────────────────
[PHASE 1] GPT Proposal
────────────────────────────────────────────────────────────
Files: src/mesh_rx.c, src/routing.c
Patch lines: 184
Self risk: medium
Confidence: 78%
[PHASE 2] Critic Challenge
────────────────────────────────────────────────────────────
✗ mesh_rx.c:142-168
  Issue: Possible race condition on shared buffer
  Severity: CRITICAL
✗ packet.h:33-48
  Issue: Enum order modified (protocol risk)
  Severity: MEDIUM
[PHASE 3] Proposer Defense
────────────────────────────────────────────────────────────
✓ Reverted enum reorder
✓ Added boundary guard for TTL decrement
✓ Wrapped shared buffer access in mutex
--- PATCH CHANGES (v1 → v2) ---
Lines removed: 12 Lines added: 18
[PHASE 4] Final Audit
────────────────────────────────────────────────────────────
Remaining issues: none
Final Risk: LOW
Confidence: 84%
────────────────────────────────────────────────────────────
FINAL DECISION
────────────────────────────────────────────────────────────
Mode used: DEBATE
Allow Apply: YES
Allow Commit: NO (requires human)
Final Confidence: 82%
────────────────────────────────────────────────────────────
Risk Heatmap:
mesh_rx.c  ███████░░░ 70%
routing.c  ███░░░░░░░ 30%
packet.h   █████████░ 90%
Apply patch? (y/n)
Color coding: RED = critical, YELLOW = medium/high, GREEN = safe/low.
Enable --verbose to print full raw model JSON after each phase.
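As a rough illustration of how the per-file risk bars in the sample session could be drawn, here is a minimal sketch. It is not taken from debateViewer.js: the function names, the 10-cell bar width, and the block glyphs are assumptions.

```javascript
// Hypothetical helper: one ASCII risk bar per file.
// The 10-cell width and the filled/empty glyphs are illustrative choices.
function riskBar(percent, width = 10) {
  const clamped = Math.max(0, Math.min(100, percent));
  const filled = Math.round((clamped / 100) * width);
  return "█".repeat(filled) + "░".repeat(width - filled);
}

// Renders a { fileName: riskPercent } map as aligned rows.
function renderHeatmap(risks) {
  const pad = Math.max(...Object.keys(risks).map((name) => name.length));
  return Object.entries(risks)
    .map(([file, pct]) => `${file.padEnd(pad)}  ${riskBar(pct)} ${pct}%`)
    .join("\n");
}

console.log(renderHeatmap({ "mesh_rx.c": 70, "routing.c": 30, "packet.h": 90 }));
```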
Session artifacts persisted to <target>/.ape/sessions/:
.ape/sessions/
<ts>_session.json full 4-phase debate record
<ts>_patch_v1.diff original proposer patch
<ts>_patch_v2.diff revised patch after defense
| Flag | Default | Description |
|---|---|---|
| --goal | required | What to build or fix |
| --type | node | See Project Types table below. 23 types supported. |
| --build | (none) | Build command, e.g. npm test, cargo test, pytest, make, dotnet build |
| --target | cwd | Path to your project directory |
| --max-budget | 5.00 | USD spending cap |
| --max-tokens | 500000 | Total token cap |
| --resume | false | Resume from ape-state.json |
| --no-git | false | Skip all git operations |
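To make the --max-budget and --max-tokens caps concrete, the following sketch shows one way per-provider and per-phase accounting could enforce them. The class, its method names, and the entry shape are hypothetical and not taken from budgetManager.js; only the default caps (5.00 USD, 500000 tokens) come from the table above.

```javascript
// Hypothetical budget tracker; names and record shape are assumptions.
class BudgetTracker {
  constructor({ maxBudgetUsd = 5.0, maxTokens = 500000 } = {}) {
    this.maxBudgetUsd = maxBudgetUsd;
    this.maxTokens = maxTokens;
    this.entries = [];
  }

  // Record one model call, tagged by provider and pipeline phase.
  record({ provider, phase, tokens, usd }) {
    this.entries.push({ provider, phase, tokens, usd });
  }

  // Running totals across all recorded calls.
  totals() {
    return this.entries.reduce(
      (acc, e) => ({ tokens: acc.tokens + e.tokens, usd: acc.usd + e.usd }),
      { tokens: 0, usd: 0 }
    );
  }

  // USD spend grouped by an entry field such as "provider" or "phase".
  breakdown(field) {
    const out = {};
    for (const e of this.entries) {
      out[e[field]] = (out[e[field]] || 0) + e.usd;
    }
    return out;
  }

  // True once either cap is reached; a caller would stop issuing model calls.
  exhausted() {
    const { tokens, usd } = this.totals();
    return usd >= this.maxBudgetUsd || tokens >= this.maxTokens;
  }
}
```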

| Flag | Default | Description |
|---|---|---|
| --lite-only | false | Force single-pass LITE review for all tasks |
| --debate-only | false | Force 4-phase DEBATE review for all tasks |

| Flag | Default | Description |
|---|---|---|
| --apply | false | Write patches to disk. Without this, APE runs in dry-run mode |

| Flag | Default | Description |
|---|---|---|
| --allow-protected | false | Allow changes to protected paths (e.g. src/protocol, src/radio, src/routing) |
| --allow-isr | false | Allow patches touching ISR / IRAM_ATTR code |
| --confidence-threshold | 70 | Minimum AI confidence score (0-100) to allow apply |
| --auto-commit | false | Auto-propose commit after each task (human still approves) |
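The guardrail flags interact with the dry-run default, so a small sketch may help. The function below is a hypothetical gate, not APE's consensus or guardrail code: the option names and the shape of the review object are assumptions, and only the defaults mirror the flag tables.

```javascript
// Hypothetical apply gate. `review` is an assumed summary of one reviewed
// patch: { confidence, touchesProtected, touchesIsr }.
function canApply(review, opts = {}) {
  const {
    apply = false,
    allowProtected = false,
    allowIsr = false,
    confidenceThreshold = 70,
  } = opts;
  const reasons = [];
  if (!apply) reasons.push("dry-run: pass --apply to write patches");
  if (review.touchesProtected && !allowProtected) {
    reasons.push("protected path: pass --allow-protected");
  }
  if (review.touchesIsr && !allowIsr) {
    reasons.push("ISR code: pass --allow-isr");
  }
  if (review.confidence < confidenceThreshold) {
    reasons.push(`confidence ${review.confidence} is below ${confidenceThreshold}`);
  }
  return { allowed: reasons.length === 0, reasons };
}
```

Even when such a gate passes, the README's workflow still asks for a human Y/N before anything is applied or committed.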

| Flag | Default | Description |
|---|---|---|
| --verbose | false | Print full raw model JSON responses after each debate phase |

→ Full reference: docs/cli-reference.md
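For readers who want to see the documented flags and defaults in one place, here is a sketch using Node's built-in util.parseArgs (the default option needs Node 18.11 or later). How index.js actually parses its arguments is not shown in this README, so the option table and the parseCli name are assumptions; the flag names and default values come from the tables above.

```javascript
const { parseArgs } = require("node:util");

// Hypothetical option table mirroring the documented flags and defaults.
// Numeric flags are parsed as strings and converted afterwards.
const OPTIONS = {
  goal: { type: "string" },
  type: { type: "string", default: "node" },
  build: { type: "string" },
  target: { type: "string", default: process.cwd() },
  "max-budget": { type: "string", default: "5.00" },
  "max-tokens": { type: "string", default: "500000" },
  "confidence-threshold": { type: "string", default: "70" },
  resume: { type: "boolean", default: false },
  "no-git": { type: "boolean", default: false },
  "lite-only": { type: "boolean", default: false },
  "debate-only": { type: "boolean", default: false },
  apply: { type: "boolean", default: false },
  "allow-protected": { type: "boolean", default: false },
  "allow-isr": { type: "boolean", default: false },
  "auto-commit": { type: "boolean", default: false },
  verbose: { type: "boolean", default: false },
};

function parseCli(argv) {
  // strict mode rejects flags that are not in the table.
  const { values } = parseArgs({ args: argv, options: OPTIONS, strict: true });
  if (!values.goal) throw new Error("--goal is required");
  return {
    ...values,
    "max-budget": Number(values["max-budget"]),
    "max-tokens": Number(values["max-tokens"]),
    "confidence-threshold": Number(values["confidence-threshold"]),
  };
}
```

A call such as parseCli(process.argv.slice(2)) would then yield one options object with every default filled in.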
Pass any of the following to --type. Each type sets the planner prompt, build conventions, and guardrail rules appropriate for that stack.
| Type | Stack | Build command hint |
|---|---|---|
| embedded | C/C++ firmware (ESP-IDF / Arduino / bare-metal) | idf.py build / pio run |
| cli | CLI tool in any language (Python, Go, Rust, Node, C…) | (varies by language) |
| node | Node.js (Express / general) | npm test |
| web | Generic browser front-end (HTML + JS) | (none) |
| htmlcss | Pure HTML + CSS, no build tool | (none) |
| python | Python 3 scripts / libraries | pytest |
| react | React 18 + Vite SPA | npm test |
| api | Node/Express REST API | npm test |
| rust | Rust (Cargo 2021) | cargo test |
| docker | Dockerfiles + Compose only | docker build . |
| arduino | Arduino / PlatformIO sketches | pio run |
| nextjs | Next.js 14 App Router + Tailwind | npm test |
| go | Go modules (go.mod) | go test ./... |
| fastapi | FastAPI + Pydantic v2 | pytest |
| bash | Bash scripts | shellcheck |
| svelte | SvelteKit + TypeScript | npm test |
| tauri | Tauri (Rust backend + web front-end) | cargo test |
| vscode-ext | VS Code extension | npm test |
| terraform | Terraform / HCL2 | terraform validate |
| platformio | PlatformIO (embedded) | pio test |
| dotnet | .NET 8 / C# | dotnet test |
| cpp | C++17/20 with CMake | cmake --build build |
| c | C11 with Makefile / CMake | make |
1. PLAN      Proposer model generates architecture + task list
             Critic model reviews and refines the plan
                  │
2. FOR EACH TASK: │
   ┌──────────────▼────────────┐
   │  Pre-guardrail            │
   │  Protected path check     │
   └──────────────┬────────────┘
                  │
   ┌──────────────▼────────────┐
   │  Risk Classifier          │
   │  Score 0-100, detect ISR  │
   └──────────────┬────────────┘
                  │
   ┌──────────────▼────────────┐
   │  Mode Selector            │
   │  LITE or DEBATE           │
   └──────────────┬────────────┘
                  │
   ┌──────────────▼────────────┐
   │  Review Pipeline          │
   │  (LITE or DEBATE)         │
   └──────────────┬────────────┘
                  │
   ┌──────────────▼────────────┐
   │  Post-guardrail           │
   │  checkPatch()             │
   └──────────────┬────────────┘
                  │
   ┌──────────────▼────────────┐
   │  Consensus                │
   │  allow_apply?             │
   └──────────────┬────────────┘
                  │
   ┌──────────────▼────────────┐
   │  Dry-run gate             │ ← default: stop here
   │  --apply required         │
   └──────────────┬────────────┘
                  │
   ┌──────────────▼────────────┐
   │  Human Y/N                │
   └──────────────┬────────────┘
                  │
   ┌──────────────▼────────────┐
   │  git apply                │
   │  hard fail if rejected    │
   └──────────────┬────────────┘
                  │
   ┌──────────────▼────────────┐
   │  Build + fix loop         │
   │  up to 3 retries          │
   └──────────────┬────────────┘
                  │
   ┌──────────────▼────────────┐
   │  Human Y/N commit         │
   └──────────────┬────────────┘
                  │
   ┌──────────────▼────────────┐
   │  Save report artifact     │
   └──────────────┬────────────┘
                  │
3. SUMMARY   budget + phases
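The "Build + fix loop, up to 3 retries" step in the flow above could look something like the sketch below. It is an illustration under stated assumptions rather than APE's buildRunner.js: requestFix is a hypothetical callback standing in for "ask the model for a corrective patch and apply it", and the injectable build option exists only to make the loop easy to exercise.

```javascript
const { spawnSync } = require("node:child_process");

// Runs a shell build command; reports success plus combined output.
function runBuild(command, cwd = process.cwd()) {
  const result = spawnSync(command, { cwd, shell: true, encoding: "utf8" });
  return {
    ok: result.status === 0,
    output: `${result.stdout || ""}${result.stderr || ""}`,
  };
}

// Hypothetical build-and-fix loop. After each failed build, `requestFix`
// receives the build output and the retry number, then the build is re-run,
// for at most `maxRetries` fix attempts.
async function buildWithRetries(command, requestFix, options = {}) {
  const { maxRetries = 3, build = runBuild } = options;
  let result = build(command);
  let retries = 0;
  while (!result.ok && retries < maxRetries) {
    retries += 1;
    await requestFix(result.output, retries);
    result = build(command);
  }
  return { ok: result.ok, retries, output: result.output };
}
```

With the default of three retries, a build that never passes is run four times in total: once initially and once after each fix attempt.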
ape/
├── index.js                          CLI entry: arg parsing, option assembly
└── src/
    │
    ├── orchestrator.js               Master loop: risk → mode → review → apply
    │
    │   ── Planning ──
    ├── planner.js                    GPT creates task list for any project type; Claude refines
    ├── taskManager.js                Dependency-aware queue; done/failed/pending
    ├── memory.js                     Arch decisions, constraints, error history
    ├── stateTracker.js               Iteration counter, current task, budget snapshot
    │
    │   ── Risk & Mode ──
    ├── riskClassifier.js             Weighted 0-100 score; detects ISR/struct/protected
    ├── modeSelector.js               lite | debate; CLI flags override classifier
    │
    │   ── Review Pipelines ──
    ├── liteReviewer.js               Single pass → unified diff (LITE mode)
    ├── debateOrchestrator.js         4-phase adversarial debate (DEBATE mode)
    ├── debateViewer.js               Debate Viewer CLI UI: panels, heatmap, prompts, log persist
    ├── critiqueParser.js             Parses/normalises all 4 phase JSON; safe fallbacks
    ├── rebuttalEngine.js             Phase 3 rebuttal; addressed_items tracking
    ├── consensus.js                  fromLite / fromDebate → CONSENSUS_OUTPUT
    │
    │   ── Patch Lifecycle ──
    ├── patchApplier.js               applyDiff / saveDiff / previewDiff
    ├── guardrails.js                 checkPatch / checkPaths / checkFiles (file-type-aware)
    ├── patchGenerator.js             Legacy helper; used for record saving
    │
    │   ── Build & Git ──
    ├── buildRunner.js                Run build command; extract errors
    ├── gitManager.js                 Branch, stage, commit, awaitApproval
    │
    │   ── AI Providers ──
    ├── openai.js                     OpenAIProvider + legacy completeJSON helpers
    ├── claude.js                     ClaudeProvider + legacy completeJSON helpers
    ├── providers/LLMProvider.js      Abstract base: generate(prompt, options)
    ├── providers/providerFactory.js  createProvider('openai'|'claude'|'opencode')
    ├── providers/OpenCodeProvider.js fetch-based; free model allowlist; 4096 token default
    ├── core/DebateSession.js         Session state: proposer/critic providers + model names
    │
    │   ── Infrastructure ──
    ├── budgetManager.js              Per-model + per-phase USD + token tracking
    ├── logger.js                     Coloured console output; modeDecision, riskScore…
    └── config.js                     .env validation; throws on missing keys
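The module map describes taskManager.js as a dependency-aware queue with done/failed/pending states. A minimal sketch of that idea follows; the class name, the task shape ({ id, dependsOn }), and the method names are hypothetical and not taken from the actual module.

```javascript
// Hypothetical dependency-aware task queue.
class TaskQueue {
  constructor(tasks) {
    // Every task starts as "pending"; dependsOn defaults to no dependencies.
    this.tasks = new Map(
      tasks.map((t) => [t.id, { dependsOn: [], ...t, status: "pending" }])
    );
  }

  // Next pending task whose dependencies are all done, or null if none is ready.
  next() {
    for (const task of this.tasks.values()) {
      if (task.status !== "pending") continue;
      const ready = task.dependsOn.every(
        (id) => this.tasks.get(id)?.status === "done"
      );
      if (ready) return task;
    }
    return null;
  }

  // Set a task's status to "done", "failed", or "pending".
  mark(id, status) {
    this.tasks.get(id).status = status;
  }
}
```

In this sketch a failed task blocks everything that depends on it, since only "done" satisfies a dependency.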
Every run writes structured artifacts into your project:
<your-project>/
└── .ape/
    ├── patches/     <ts>_<taskId>.diff    every attempted patch (audit trail)
    ├── debates/     <ts>_<taskId>.json    full 4-phase debate records
    ├── sessions/    <ts>_session.json     debate viewer session log
    │                <ts>_patch_v1.diff    original proposer patch
    │                <ts>_patch_v2.diff    revised patch after defense (if changed)
    ├── memory.json                        architecture decisions, constraints, error history
    └── state.json                         iteration counter, task status, budget snapshot
# 1. Install dependencies
cd ape && npm install
# 2. Configure API keys
cp .env.example .env
Edit .env:
# Required for OpenAI / Claude (default providers)
OPENAI_API_KEY=sk-...
ANTHROPIC_API_KEY=sk-ant-...
# Optional: choose which model fills each debate role
# PROPOSER_PROVIDER=openai # openai | claude | opencode
# PROPOSER_MODEL=gpt-4.1
# CRITIC_PROVIDER=claude # openai | claude | opencode
# CRITIC_MODEL=claude-opus-4-5
# Optional: use OpenCode free models (no API key needed for free tier)
# OPENCODE_ZEN_BASE_URL=https://opencode.ai
# PROPOSER_PROVIDER=opencode
# PROPOSER_MODEL=glm-5-free
# CRITIC_PROVIDER=opencode
# CRITIC_MODEL=kimi-k2.5-free
# 3. Verify
node index.js --help
APE uses a proposer / critic model: one AI proposes the patch, another challenges it. You can assign any supported provider to either role via .env.
| Provider | Key | Models |
|---|---|---|
| OpenAI | openai | gpt-4.1, gpt-4o, any GPT model |
| Anthropic (Claude) | claude | claude-opus-4-5, any Claude model |
| OpenCode Zen API | opencode | glm-5-free, minimax-m2.5-free, kimi-k2.5-free, big-pickle (free) |
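As a sketch of how the two debate roles might be resolved from the environment: the variable names (PROPOSER_PROVIDER, PROPOSER_MODEL, CRITIC_PROVIDER, CRITIC_MODEL) and the provider keys come from this README, while the function name, the fallback to openai and claude (inferred from the commented .env example), and leaving an unset model to the provider are assumptions.

```javascript
// Hypothetical role resolution from environment variables.
const SUPPORTED_PROVIDERS = ["openai", "claude", "opencode"];

function resolveRoles(env = process.env) {
  const roles = {
    proposer: {
      provider: env.PROPOSER_PROVIDER || "openai",
      // undefined means "let the provider pick its own default model".
      model: env.PROPOSER_MODEL || undefined,
    },
    critic: {
      provider: env.CRITIC_PROVIDER || "claude",
      model: env.CRITIC_MODEL || undefined,
    },
  };
  for (const [role, { provider }] of Object.entries(roles)) {
    if (!SUPPORTED_PROVIDERS.includes(provider)) {
      throw new Error(`${role}: unsupported provider "${provider}"`);
    }
  }
  return roles;
}
```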
Set these four variables in your .env:
PROPOSER_PROVIDER=openai # who generates the patch
PROPOSER_MODEL=gpt-4.1
CRITIC_PROVIDER=claude # who challenges and audits it
CRITIC_MODEL=claude-opus-4-5
OpenCode exposes an OpenAI-compatible /zen/v1/chat/completions endpoint. The free models are allowlisted by default, so no billing setup is required.
Step 1. Add to .env:
OPENCODE_ZEN_BASE_URL=https://opencode.ai
# OPENCODE_ZEN_API_KEY=   (leave blank for free tier)
OPENCODE_DEFAULT_MODEL=glm-5-free
Step 2. Choose a debate pairing:
# All-free debate (GLM proposes, Kimi critiques)
PROPOSER_PROVIDER=opencode
PROPOSER_MODEL=glm-5-free
CRITIC_PROVIDER=opencode
CRITIC_MODEL=kimi-k2.5-free
# Mixed: GPT proposes, OpenCode critiques for free
PROPOSER_PROVIDER=openai
PROPOSER_MODEL=gpt-4.1
CRITIC_PROVIDER=opencode
CRITIC_MODEL=minimax-m2.5-free
Step 3. Run APE normally:
node index.js \
--goal="Add error handling to the data pipeline" \
--type=python --build="pytest" --apply
Allowlist: by default only glm-5-free, minimax-m2.5-free, kimi-k2.5-free, and big-pickle are accepted. Set OPENCODE_ALLOW_ANY_MODEL=1 to bypass the check for other model strings.
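The allowlist behaviour described above amounts to a small guard. The sketch below is illustrative only: the four model names and the OPENCODE_ALLOW_ANY_MODEL switch come from this README, whereas the function name and error message are assumptions, not the contents of OpenCodeProvider.js.

```javascript
// Hypothetical allowlist check for OpenCode model names.
const FREE_MODELS = new Set([
  "glm-5-free",
  "minimax-m2.5-free",
  "kimi-k2.5-free",
  "big-pickle",
]);

function assertModelAllowed(model, env = process.env) {
  // Setting OPENCODE_ALLOW_ANY_MODEL=1 bypasses the check entirely.
  if (env.OPENCODE_ALLOW_ANY_MODEL === "1") return model;
  if (!FREE_MODELS.has(model)) {
    throw new Error(`Model "${model}" is not on the OpenCode free allowlist`);
  }
  return model;
}
```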
# Safe preview: see the plan without touching any files
node index.js \
--goal="Add input validation and error handling to the user registration endpoint" \
--type=node
# FastAPI service
node index.js \
--goal="Add JWT authentication to the FastAPI backend" \
--type=fastapi \
--build="pytest" \
--apply --max-budget=5.00
# Force full adversarial debate on critical payment logic
node index.js \
--goal="Refactor the payment transaction rollback handler" \
--type=node \
--debate-only --allow-protected \
--build="npm test" \
--apply --max-budget=10.00
# Force debate with verbose model JSON output
node index.js \
--goal="Refactor routing layer" \
--type=embedded \
--debate-only --apply --verbose
# Rust CLI tool
node index.js \
--goal="Add async file processing with progress bar" \
--type=cli --target=./my-cli \
--build="cargo test" \
--apply --max-budget=3.00
# Resume an interrupted session
node index.js \
--goal="Add JWT authentication to the FastAPI backend" \
--type=fastapi --build="pytest" \
--resume --apply
# Zero-cost debate using OpenCode free models
node index.js \
--goal="Refactor the data pipeline module" \
--type=python --build="pytest" \
--debate-only --apply
# (set PROPOSER_PROVIDER=opencode CRITIC_PROVIDER=opencode in .env first)
| Doc | Description |
|---|---|
| docs/architecture.md | Full data-flow diagrams, module map, session state, risk scoring table |
| docs/risk-gated-debate.md | LITE mode, DEBATE mode (all 4 phases), debate viewer UI, consensus, budget fallback |
| docs/cli-reference.md | Every CLI flag with defaults, types, and examples |
| docs/guardrails.md | Pre/post guardrails, protected paths, deletion ratio, custom config |
| docs/modules.md | Full public API for every module in src/ |
MIT License
Copyright (c) 2026 APE Contributors
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.