AI coding agents that work autonomously while you sleep. Check in like a coworker, answer questions when needed, ship features continuously.
CodeFRAME is an autonomous AI development system where specialized agents collaborate to build software features end-to-end. It combines multi-agent orchestration, human-in-the-loop blockers, and intelligent context management to enable truly autonomous software development cycles.
Unlike traditional AI coding assistants that wait for your prompts, CodeFRAME agents work independently on tasks, ask questions when blocked, and coordinate with each other to ship complete features—day and night.
Two modes of operation:
- CLI-first (v2) — Complete Golden Path workflow from the command line, no server required
- Dashboard (v1) — Real-time web UI with WebSocket updates for monitoring and interaction
We've completed the v2 strategic planning and published a comprehensive 5-phase roadmap:
- Phase 1 (Current): CLI Foundation — cf prd generate, live streaming, integration tests
- Phase 2: Server Layer as a thin adapter over core
- Phase 3: Web UI rebuild on v2 foundation
- Phase 4: Multi-Agent Coordination (the "FRAME" vision)
- Phase 5: Advanced Features (TUI, metrics, replay)
See docs/V2_STRATEGIC_ROADMAP.md for full details.
New cf env commands — Validate your development environment and auto-install missing tools.
# Quick environment health check
cf env check
# Comprehensive diagnostics
cf env doctor
# Install a specific missing tool
cf env install-missing pytest
# Auto-install all missing required tools
cf env auto-install --yes

Supported tool ecosystems:
- Python: pytest, ruff, mypy, black, flake8, coverage, pre-commit
- JavaScript/TypeScript: eslint, prettier, jest, typescript, vite, vitest
- Rust: clippy, rustfmt, rust-analyzer, cargo-edit
- System: git, docker, make, curl, gh
New cf pr commands — Manage pull requests directly from the CLI.
# Create a PR from current branch
cf pr create --title "Add feature X" --body "Description..."
# List open PRs
cf pr list
# View PR details
cf pr view 123
# Merge a PR
cf pr merge 123 --method squash
# Close a PR
cf pr close 123

New cf work diagnose command — Automatically analyze failed tasks and get actionable recommendations.
# Diagnose why a task failed
cf work diagnose <task-id>
# Verbose output with detailed logs
cf work diagnose <task-id> --verbose

The diagnostic agent analyzes run logs, identifies root causes, and provides specific fix recommendations.
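The root-cause analysis above can be sketched as a simple rule-based log classifier. The patterns, messages, and `diagnose` function below are illustrative assumptions, not the diagnostic agent's actual implementation.

```python
import re

# Illustrative rules mapping log patterns to recommendations. These are
# assumptions for the sketch, not the diagnostic agent's real rules.
RULES = [
    (r"ModuleNotFoundError: No module named '(\w+)'",
     "Missing dependency '{0}': add it to the project and re-run."),
    (r"AssertionError",
     "A test assertion failed: inspect the failing test's expected values."),
    (r"Permission denied",
     "A filesystem permission error occurred: check workspace ownership."),
]

def diagnose(log_text: str) -> str:
    """Return the first matching recommendation, or a fallback message."""
    for pattern, advice in RULES:
        match = re.search(pattern, log_text)
        if match:
            return advice.format(*match.groups())
    return "No known pattern matched: review the full log."
```

In practice the real agent feeds the log to an LLM rather than a fixed rule table, but the input/output shape is the same: log text in, actionable recommendation out.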
- Fixed #265: NoneType error when codebase_index is None during task execution
- Fixed #253: Checkpoint diff API now returns 404 instead of 500 when the workspace directory is missing
Task Scheduling & Templates
Task Scheduling with Critical Path Analysis:
# Show task schedule with dependencies
cf schedule show
# Predict completion dates
cf schedule predict
# Identify bottleneck tasks
cf schedule bottlenecks

Task Templates for Common Patterns:
# List available templates
cf templates list
# Show template details
cf templates show api-endpoint
# Apply template to generate tasks
cf templates apply api-endpoint --name "User API"

7 Built-in Templates:
- api-endpoint — REST API endpoint with tests
- react-component — React component with tests
- database-migration — Schema migration with rollback
- cli-command — CLI command with help and tests
- integration-test — Integration test suite
- bug-fix — Bug investigation and fix workflow
- feature-flag — Feature flag implementation
Effort Estimation:
- Tasks now support an estimated_hours field
- CPM-based scheduling calculates the critical path
- Bottleneck detection identifies blocking tasks
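The CPM scheduling described above amounts to a longest-path computation over the task dependency graph. The sketch below is a minimal illustration under the assumption of an acyclic graph; `critical_path` and its inputs are hypothetical names, not CodeFRAME's API.

```python
def critical_path(tasks, deps):
    """tasks: {task_id: estimated_hours}; deps: {task_id: [prerequisites]}.

    Returns (total_hours, task ids along the critical path). Assumes the
    dependency graph is acyclic.
    """
    finish = {}  # earliest finish time per task
    via = {}     # predecessor that determines each task's start time

    def earliest_finish(task):
        if task not in finish:
            start, pred = 0.0, None
            for dep in deps.get(task, []):
                ef = earliest_finish(dep)
                if ef > start:           # latest-finishing prerequisite wins
                    start, pred = ef, dep
            finish[task] = start + tasks[task]
            via[task] = pred
        return finish[task]

    end = max(tasks, key=earliest_finish)  # task that finishes last
    path, node = [], end
    while node is not None:                # walk back along bottleneck edges
        path.append(node)
        node = via[node]
    return finish[end], path[::-1]
```

Tasks on the returned path are exactly the bottlenecks: delaying any of them delays the whole project, which is what `cf schedule bottlenecks` surfaces.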
Tech Stack Configuration (2026-01-16)
Describe your tech stack — Tell CodeFRAME what technologies your project uses during initialization.
# Auto-detect from project files (pyproject.toml, package.json, Cargo.toml, etc.)
cf init . --detect
# Provide explicit tech stack description
cf init . --tech-stack "Python 3.11 with FastAPI, uv, pytest"
cf init . --tech-stack "TypeScript monorepo with pnpm, Next.js frontend"
cf init . --tech-stack "Rust project using cargo"
# Interactive setup
cf init . --tech-stack-interactive

Why this matters: The agent uses your tech stack description to determine appropriate commands and patterns. Works with any stack — Python, TypeScript, Rust, Go, monorepos, or mixed environments.
Agent Self-Correction & Observability (2026-01-16)
Verification self-correction loop — Agent now automatically attempts to fix failing verification gates.
# Execute with verbose output to see self-correction progress
cf work start <task-id> --execute --verbose
# Watch the agent attempt fixes in real-time
[VERIFY] Running final verification (attempt 1/3)
[VERIFY] Gates failed: pytest, ruff
[SELFCORRECT] Attempting to fix verification failures
[SELFCORRECT] Applied 2/2 fixes, re-verifying...

Capabilities:
- Self-Correction Loop — Agent analyzes gate errors and generates fix plans using LLM
- Verbose Mode — --verbose/-v flag shows detailed verification and self-correction progress
- FAILED Task Status — Tasks can now transition to the FAILED state for proper error visibility
- Project Preferences — Agent loads AGENTS.md or CLAUDE.md for per-project configuration
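The verify/fix/retry shape of the self-correction loop can be sketched as follows. `run_gates`, `propose_fixes`, and `apply_fix` are stand-ins for the real gate runner and LLM-backed fix planner, not actual CodeFRAME functions.

```python
def verify_with_self_correction(run_gates, propose_fixes, apply_fix,
                                max_attempts=3):
    """Run verification gates, applying suggested fixes between tries.

    run_gates() returns a list of failed gate names (empty means success);
    propose_fixes(failed) yields fix actions; apply_fix applies one.
    Returns (passed, attempts_used).
    """
    for attempt in range(1, max_attempts + 1):
        failed = run_gates()          # e.g. ["pytest", "ruff"] on failure
        if not failed:
            return True, attempt
        if attempt < max_attempts:    # no point planning fixes after the last try
            for fix in propose_fixes(failed):
                apply_fix(fix)
    return False, max_attempts
```

The 3-attempt cap matches the limit stated above; after the final failure the task would transition to FAILED rather than loop forever.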
Parallel Batch Execution (2026-01-15)
Multi-task batch execution — Run multiple tasks with intelligent parallelization.
# Execute multiple tasks in parallel
cf work batch run task1 task2 task3 --strategy parallel
# Execute all READY tasks with LLM-inferred dependencies
cf work batch run --all-ready --strategy auto
# Automatic retry on failure
cf work batch run --all-ready --retry 3

Batch Capabilities:
- Parallel Execution — ThreadPoolExecutor-based concurrent task execution
- Dependency Graph — DAG-based task ordering with cycle detection
- LLM Dependency Inference — --strategy auto analyzes task descriptions to infer dependencies
- Automatic Retry — --retry N retries failed tasks up to N times
- Batch Resume — cf work batch resume <batch-id> re-runs failed/blocked tasks
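The DAG-ordered parallel execution with cycle detection described above can be sketched entirely with the standard library. `run_batch` is an illustrative shape, not the actual batch engine.

```python
from concurrent.futures import ThreadPoolExecutor
from graphlib import TopologicalSorter  # stdlib; prepare() raises CycleError on cycles

def run_batch(tasks, deps, max_workers=4):
    """tasks: {task_id: zero-arg callable}; deps: {task_id: set of prerequisites}.

    Executes tasks concurrently once their dependencies are satisfied and
    returns {task_id: result}.
    """
    graph = {t: set(deps.get(t, ())) for t in tasks}
    sorter = TopologicalSorter(graph)
    sorter.prepare()                   # cycle detection happens here
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        while sorter.is_active():
            ready = list(sorter.get_ready())                  # deps all done
            futures = {t: pool.submit(tasks[t]) for t in ready}
            for t, fut in futures.items():
                results[t] = fut.result()
                sorter.done(t)         # unlocks dependents for the next wave
    return results
```

A real implementation would also handle per-task failure and retry; this sketch only shows the ordering and concurrency skeleton.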
- Autonomous Execution — cf work start --execute runs the full agent loop
- Self-Correction Loop — Agent automatically fixes failing verification gates (up to 3 attempts)
- Human-in-the-Loop Blockers — Agents pause and ask questions when they need decisions
- Verification Gates — Automatic ruff/pytest checks after changes
- Verbose Mode — --verbose flag shows detailed progress and self-correction activity
- Dry Run Mode — Preview changes without applying them
- State Persistence — Resume work across sessions
- Task Diagnosis — Automatic root cause analysis for failed tasks
- Multi-Agent Orchestra — Lead agent coordinates backend, frontend, test, and review specialists
- Async/Await Architecture — Non-blocking agent execution with true concurrency
- Self-Correction Loops — Agents automatically fix failing tests (up to 3 attempts)
- WebSocket Agent Broadcasting — Real-time agent status updates to all connected clients
- Tactical Pattern Handling — Automatic resolution of common file conflicts
- AI Quality Enforcement — Dual-layer quality system preventing test skipping and enforcing 85%+ coverage
- Quality Gates — Pre-completion checks block bad code (tests, types, coverage, review)
- BUILD Gate — Validates configuration errors before execution
- Automated Code Review — Security scanning, OWASP pattern detection, and complexity analysis
- Lint Enforcement — Multi-language linting with trend tracking and automatic fixes
- Context-Aware Memory — Tiered HOT/WARM/COLD memory system reduces token usage by 30-50%
- Session Lifecycle — Auto-save/restore work context across CLI restarts
- Checkpoint & Recovery — Git + DB snapshots enable project state rollback
- Phase-Aware Components — UI intelligently selects data sources based on project phase
- Real-time Dashboard — WebSocket-powered UI with agent status, blockers, and progress tracking
- Environment Validation — cf env check validates tools and dependencies
- PR Workflow — cf pr create/list/merge for GitHub integration
- Task Scheduling — CPM-based critical path analysis
- Task Templates — 7 built-in templates for common development patterns
- Cost Tracking — Real-time token usage and cost analytics per agent/task
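The tiered HOT/WARM/COLD memory idea can be sketched as an importance-scored store: high-importance items stay verbatim in the prompt, mid-tier items get summarized, and the rest are persisted but excluded. The cutoffs and names below are assumptions for illustration, not CodeFRAME's real values.

```python
from dataclasses import dataclass

@dataclass
class MemoryItem:
    key: str
    content: str
    importance: float  # 0.0 (discardable) .. 1.0 (critical); assumed scale

def assign_tier(item, hot_cutoff=0.8, warm_cutoff=0.4):
    """HOT stays verbatim in the prompt, WARM gets summarized,
    COLD is persisted but left out of the context window."""
    if item.importance >= hot_cutoff:
        return "HOT"
    if item.importance >= warm_cutoff:
        return "WARM"
    return "COLD"

def build_context(items, hot_budget=2):
    """Keep only the highest-importance HOT items within a size budget."""
    hot = sorted((i for i in items if assign_tier(i) == "HOT"),
                 key=lambda i: i.importance, reverse=True)
    return [i.key for i in hot[:hot_budget]]
```

Dropping WARM/COLD content from the prompt while keeping it retrievable is what yields the 30-50% token savings claimed above.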
┌─────────────────────────────────────────────────────────────┐
│ CLI / Agent Orchestrator │
│ • cf work start --execute │
│ • Context loading → Planning → Execution → Verification │
│ • Blocker detection and human-in-loop │
└─────────────┬──────────────┬──────────────┬────────────┬────┘
│ │ │ │
┌───────▼───┐ ┌──────▼──────┐ ┌───▼────────┐ ┌▼────────┐
│ Backend │ │ Frontend │ │ Test │ │ Review │
│ Worker │ │ Worker │ │ Worker │ │ Worker │
│ (async) │ │ (async) │ │ (async) │ │ (async) │
└─────┬─────┘ └──────┬──────┘ └─────┬──────┘ └────┬────┘
│ │ │ │
│ ┌─────────────▼───────────────▼──────────────▼─────┐
│ │ Blocker Management (Sync/Async) │
│ │ • Database-backed queue (SQLite) │
│ │ • Human-in-the-loop questions │
│ └───────────────────────────────────────────────────┘
│
┌───────▼──────────────────────────────────────────────────┐
│ Context Management Layer │
│ • Tiered memory (HOT/WARM/COLD) │
│ • Importance scoring & tier assignment │
│ • Flash save mechanism │
└──────────────────────────────────────────────────────────┘
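The database-backed blocker queue in the diagram can be sketched with `sqlite3`: the agent inserts a question and pauses, a human answers, and the agent resumes. The schema and helper names are assumptions, not CodeFRAME's actual tables.

```python
import sqlite3

def open_queue(path=":memory:"):
    """Create (or open) the blocker table; schema is an assumption."""
    db = sqlite3.connect(path)
    db.execute("""CREATE TABLE IF NOT EXISTS blockers (
        id INTEGER PRIMARY KEY,
        question TEXT NOT NULL,
        answer TEXT,
        status TEXT NOT NULL DEFAULT 'OPEN')""")
    return db

def raise_blocker(db, question):
    """Agent side: park a question and pause the task."""
    cur = db.execute("INSERT INTO blockers (question) VALUES (?)", (question,))
    db.commit()
    return cur.lastrowid

def answer_blocker(db, blocker_id, answer):
    """Human side: record an answer so the agent can resume."""
    db.execute("UPDATE blockers SET answer = ?, status = 'ANSWERED' WHERE id = ?",
               (answer, blocker_id))
    db.commit()

def open_blockers(db):
    """What `cf blocker list` would surface."""
    return [q for (q,) in db.execute(
        "SELECT question FROM blockers WHERE status = 'OPEN'")]
```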
- Python 3.11+
- Node.js 18+ (for frontend, optional)
- Anthropic API key
- SQLite 3 (included with Python)
# Clone repository
git clone https://github.com/frankbria/codeframe.git
cd codeframe
# Install uv package manager
curl -LsSf https://astral.sh/uv/install.sh | sh
# Set up backend
uv venv
source .venv/bin/activate # On Windows: .venv\Scripts\activate
uv sync
# Set up environment
export ANTHROPIC_API_KEY="your-api-key-here"
# Validate environment (optional but recommended)
cf env check

# 1. Initialize workspace (with optional tech stack detection)
cd /path/to/your/project
cf init . --detect
# Or explicit: cf init . --tech-stack "Python with FastAPI, uv, pytest"
# 2. Add a PRD (Product Requirements Document)
cf prd add requirements.md
# 3. Generate tasks from PRD
cf tasks generate
# 4. List tasks
cf tasks list
# 5. Start work on a task (with AI agent)
cf work start <task-id> --execute
# 6. Check for blockers (questions the agent needs answered)
cf blocker list
cf blocker answer <blocker-id> "Your answer here"
# 7. Resume work after answering blockers
cf work resume <task-id>
# 8. If a task fails, diagnose the issue
cf work diagnose <task-id>
# 9. Review changes and create checkpoint
cf review
cf checkpoint create "Feature complete"

# Start the dashboard (from project root)
codeframe serve
# Or manually start backend and frontend separately:
# Terminal 1: Backend
uv run uvicorn codeframe.ui.server:app --reload --port 8080
# Terminal 2: Frontend
cd web-ui && npm install && npm run dev
# Access dashboard at http://localhost:3000

cf init <path>                            # Initialize workspace for a repo
cf init <path> --detect # Initialize + auto-detect tech stack
cf init <path> --tech-stack "description" # Initialize + explicit tech stack
cf init <path> --tech-stack-interactive # Initialize + interactive setup
cf status                                 # Show workspace status

cf env check                              # Quick environment health check
cf env doctor # Comprehensive diagnostics
cf env install-missing <tool> # Install specific missing tool
cf env auto-install --yes                 # Install all missing tools

cf prd add <file.md>                      # Add/update PRD
cf prd show # Display current PRD
cf prd list # List all PRDs
cf prd versions <id> # Show version history
cf prd diff <id> <v1> <v2>                # Compare versions

cf tasks generate                         # Generate tasks from PRD (uses LLM)
cf tasks list # List all tasks
cf tasks list --status READY # Filter by status
cf tasks show <id>                        # Show task details

cf schedule show                          # Show task schedule with dependencies
cf schedule predict # Predict completion dates
cf schedule bottlenecks                   # Identify blocking tasks

cf templates list                         # List available templates
cf templates show <name> # Show template details
cf templates apply <name>                 # Generate tasks from template

cf work start <id>                        # Start work (creates run record)
cf work start <id> --execute # Start with AI agent execution
cf work start <id> --execute --verbose # Execute with detailed output
cf work start <id> --execute --dry-run # Preview changes only
cf work stop <id> # Stop current run
cf work resume <id> # Resume blocked work
cf work diagnose <id>                     # Diagnose failed task

cf work batch run <id1> <id2> ...         # Execute multiple tasks
cf work batch run --all-ready # Execute all READY tasks
cf work batch run --strategy parallel # Parallel execution
cf work batch run --strategy auto # LLM-inferred dependencies
cf work batch run --retry 3 # Auto-retry failed tasks
cf work batch status [batch_id] # Show batch status
cf work batch cancel <batch_id> # Cancel running batch
cf work batch resume <batch_id>           # Re-run failed tasks

cf blocker list                           # List open blockers
cf blocker show <id> # Show blocker details
cf blocker answer <id> "answer"           # Answer a blocker

cf pr create --title "..." --body "..."   # Create PR
cf pr list # List open PRs
cf pr view <number> # View PR details
cf pr merge <number> --method squash # Merge PR
cf pr close <number>                      # Close PR

cf review                                 # Run verification gates
cf patch export # Export changes as patch
cf commit                                 # Commit changes

cf checkpoint create <name>               # Create checkpoint
cf checkpoint list # List checkpoints
cf checkpoint restore <id> # Restore to checkpoint
cf summary                                # Show session summary

# Required
ANTHROPIC_API_KEY=sk-ant-... # Anthropic API key
# Optional - Database
DATABASE_PATH=./codeframe.db # SQLite database path (default: in-memory)
# Optional - Quality Enforcement
MIN_COVERAGE_PERCENT=85 # Minimum test coverage required
CODEFRAME_ENABLE_SKIP_DETECTION=true # Enable skip detection gate (default: true)
# Optional - Git Integration
AUTO_COMMIT_ENABLED=true # Enable automatic commits after test passes
# Optional - Notifications
NOTIFICATION_DESKTOP_ENABLED=true # Enable desktop notifications
NOTIFICATION_WEBHOOK_URL=https://... # Webhook endpoint for agent events
# Frontend (set at build time for Next.js)
NEXT_PUBLIC_API_URL=http://localhost:8080
NEXT_PUBLIC_WS_URL=ws://localhost:8080/ws

See CLAUDE.md in project root for project-specific configuration including:
- Active technologies and frameworks
- Coding standards and conventions
- Testing requirements
- Documentation structure
POST /api/projects # Create project
GET /api/projects/{id} # Get project details
POST /api/projects/{id}/prd # Submit PRD
GET /api/projects/{id}/agents # List agents
POST /api/projects/{id}/agents # Create agent
GET /api/projects/{id}/blockers # List blockers
POST /api/blockers/{id}/answer # Answer blocker
GET /api/projects/{id}/tasks # List tasks
GET /api/tasks/{id} # Get task details
POST /api/tasks/approve # Approve tasks for development
GET /api/schedule/{project_id} # Get task schedule
GET /api/templates # List task templates
POST /api/templates/{name}/apply # Apply template
WS /ws?token=JWT_TOKEN # WebSocket connection (auth required)
For detailed API documentation, see /docs (Swagger UI) or /redoc (ReDoc) when the server is running.
# Run all unit tests
uv run pytest
# Run specific test suite
uv run pytest tests/core/ # Core module tests
uv run pytest tests/agents/ # Agent tests
uv run pytest tests/api/ # API endpoint tests
uv run pytest tests/cli/ # CLI command tests
# Run with coverage
uv run pytest --cov=codeframe --cov-report=html
# Run v2 tests only
uv run pytest -m v2

- Total Tests: 3000+
- Core module tests: ~815
- Unit tests: ~900 (Python + TypeScript)
- Integration tests: ~500
- E2E tests: 85+ (Backend + Playwright)
- Coverage: 88%+
- Pass Rate: 100%
For detailed documentation, see:
- Strategic Roadmap: docs/V2_STRATEGIC_ROADMAP.md - 5-phase development plan
- Quick Start (v2): docs/QUICKSTART.md - Get started in 5 minutes
- Golden Path (v2): docs/GOLDEN_PATH.md - CLI-first workflow contract
- Agent Implementation: docs/AGENT_IMPLEMENTATION_TASKS.md - Agent system details
- CLI Wireframe: docs/CLI_WIREFRAME.md - Command structure
- CLI Test Report: docs/CLI_V2_TEST_REPORT.md - End-to-end test results
- Product Requirements: PRD.md
- System Architecture: CODEFRAME_SPEC.md
- Authentication: docs/authentication.md - Security guide
- Sprint Planning: SPRINTS.md
- Agent Guide: AGENTS.md
We welcome contributions! To get started:
- Fork and clone the repository
- Install dependencies: uv sync
- Install pre-commit hooks: pre-commit install
- Run tests to ensure everything works: uv run pytest
- Python: Follow PEP 8, use ruff for linting
- TypeScript: Follow ESLint rules, use Prettier for formatting
- Type Hints: Required for all Python functions
- Tests: Required for all new features (85%+ coverage)
- Documentation: Update README and docstrings for API changes
- Create a feature branch: git checkout -b feature/my-feature
- Write tests first (TDD approach encouraged)
- Implement the feature with proper error handling
- Ensure all tests pass: uv run pytest
- Run quality checks: uv run ruff check .
- Update documentation if needed
- Submit PR with clear description of changes
- Phase 0: CLI-first Golden Path workflow
- Phase 1 (Batch): Batch execution (serial), status monitoring, cancellation
- Phase 2 (Batch): Parallel execution, dependency graphs, LLM inference, auto-retry
- Autonomous agent execution with blocker detection
- Verification gates integration
- Task-based model selection
- Self-correction loop for verification failures
- Verbose mode for observability
- Task scheduling with CPM analysis
- Task templates (7 built-in)
- Environment validation and tool detection
- GitHub PR workflow commands
- Task self-diagnosis system
- cf prd generate: Interactive AI-driven PRD creation (Socratic discovery)
- cf work follow: Live streaming of task execution
- Integration tests: Credential and environment module coverage
- Phase 2: Server layer as thin adapter, OpenAPI docs, rate limiting
- Phase 3: Web UI rebuild on v2 foundation
- Phase 4: Multi-agent coordination, agent roles, conflict resolution
- Phase 5: TUI dashboard, token/cost tracking, debug/replay mode
See docs/V2_STRATEGIC_ROADMAP.md for the complete roadmap.
This project is licensed under the GNU Affero General Public License v3.0 (AGPL-3.0).
Key points:
- Open Source: Free to use, modify, and distribute
- Copyleft: Derivative works must also be AGPL-3.0
- Network Use: If you run a modified version as a service, you must release source code
- Commercial Use: Permitted with AGPL-3.0 compliance
See LICENSE for full details.
- Frank Bria - Creator and Lead Developer
- Anthropic Claude - AI reasoning engine powering all agents
- FastAPI - High-performance async web framework
- FastAPI Users - Authentication and user management
- React + TypeScript - Modern frontend with real-time updates
- SQLite - Embedded database for persistence
- Playwright - End-to-end testing framework
- pytest + jest - Comprehensive testing frameworks
Built on the principles of:
- Autonomous agent systems (AutoGPT, BabyAGI)
- Multi-agent orchestration (LangGraph, CrewAI)
- Human-in-the-loop design (Constitutional AI)
- Test-driven development (Kent Beck, Robert Martin)
- Issues: GitHub Issues
- Discussions: GitHub Discussions
- Documentation: Full Documentation
Built with care by humans and AI agents working together