The practical guide to building AI agent harnesses β with real code examples you can copy and run.
π harness-guide.com | δΈζη«
English | δΈζ
A harness is the runtime wrapper that turns a bare language model into an agent β an autonomous system that can perceive its environment, make decisions, and take actions over multiple steps. The harness handles everything the model can't do on its own: executing tools, managing memory, assembling context, and enforcing safety boundaries.
This guide covers harness engineering from first principles to production patterns, with real code in every article.
| Topic | Description |
|---|---|
| What is a Harness? | The concept in 3 minutes. How it turns a model into an agent. Harness vs. framework vs. runtime. |
| Your First Harness | Build a working harness in 50 lines of Python. Complete code you can copy and run. |
| Harness vs. Framework | When to use a raw harness vs. LangChain/CrewAI. Decision tree + side-by-side code comparison. |
| Topic | Description |
|---|---|
| Agentic Loop | The think β act β observe cycle. Turn budgets, parallel tool calls, loop detection, streaming. |
| Tool System | Tool registry, static vs. dynamic loading, MCP protocol, description quality patterns. |
| Memory & Context | Context assembly, session management, two-tier memory (daily logs + long-term). AGENTS.md and MEMORY.md patterns. |
| Guardrails | Permission models, trust boundaries, sandboxing, prompt injection defense. |
| Topic | Description |
|---|---|
| Context Engineering | Priority-based assembly, three lines of defense for compression, token budgeting. |
| Sandbox | Docker and Firecracker setups, network isolation, filesystem restrictions. |
| Skill System | Skill packaging, on-demand loading, SKILL.md format, thin harness + thick skills. |
| Sub-Agent | Leader-Worker pattern, file-based communication, session isolation, parallel execution. |
| Error Handling | Error classification, retry strategies, graceful degradation, checkpoint/resume. |
| Multi-Agent Orchestration | Orchestration patterns (pipeline, fan-out, supervisor), context isolation, real-world examples (Multica, Paseo, OpenClaw). |
| Scheduling & Automation | Cron, heartbeats, event triggers. Session targeting, delivery, LangSmith vs harness-native comparison. |
| Long-Running Harness Design | Context anxiety, self-evaluation bias, context reset vs compaction, GAN-inspired generator-evaluator architecture. |
| Managed Agents Architecture | Brain/hands/session decoupling, pets vs cattle, credential isolation, TTFT improvements. |
| Eval Infrastructure Noise | Resource config swings benchmark scores by 6pp. Floor+ceiling enforcement strategy. |
| Topic | Description |
|---|---|
| Implementation Comparison | Side-by-side comparison of OpenClaw, Claude Code, Codex, Cline, Aider, Cursor. |
| Glossary | Key terms defined. |
| Topic | Description |
|---|---|
| Shipping Our Windows Client | Build time 15minβ4min, install time 10minβ2min. How we rebuilt the Electron packaging pipeline. |
| Ghost Account Hunting | 1000+ ghost accounts drained our platform in 15 days. The full post-mortem. |
- Go to Issues β New Issue
- Choose "π¬ Submit a Resource"
- Fill in the title, URL, and why it's relevant
Or submit a PR directly β see CONTRIBUTING.md.
- π¬ GitHub Discussions β Join the conversation
- π¦ Twitter β @nexudotio
- π¬ ι£δΉ¦ηΎ€ β ε ε ₯ Harness Engineering θ―ι’ηΎ€
Maintained by Nexu β the open-source Claude Co-worker & Managed Agent platform.
If you find this guide useful, please consider giving it a β
@misc{nexu_harness-engineering-guide_2026,
author = {Nexu Team},
title = {Harness Engineering Guide},
year = {2026},
publisher = {GitHub},
howpublished = {\url{https://github.com/nexu-io/harness-engineering-guide}}
}