Awesome Agent Harness

A curated, implementation-first list of agent harness engineering resources, with GitHub projects as the primary focus.

Total entries: 159
GitHub entries: 135 (84.9%)
GitHub entries in project categories (all categories except Essential Readings): 131/131 (100.0%)
Categories: 9
Last verified: 2026-04-22
Language: English | 中文

Featured Harness Blogs

Scaling Managed Agents: Decoupling the brain from the hands: Anthropic's meta-harness architecture for decoupling session logs, harness loops, and sandboxes in long-horizon agents.
Claude Code auto mode: Anthropic's write-up on classifier-backed approval delegation for safer high-autonomy coding-agent runs.
Harness engineering (OpenAI): Field report on building reliable agent-first software via harness constraints and verification.
Building Effective AI Agents: Anthropic's practical guidance on when to use workflows vs. autonomous agents and how to structure them.
Writing effective tools for AI agents: Best practices for tool interface design so agents call tools safely and reliably.
Effective harnesses for long-running agents: Practical guide to maintaining state, resumability, and reliability over long agent runs.
Harness design for long-running application development: Follow-up article on improving long-running app generation through harness structure.
Improving Deep Agents with harness engineering: Evidence that harness improvements alone can move benchmark performance.
Evaluating Deep Agents: Our Learnings: LangChain's practical lessons on evaluating stateful and long-horizon agents.
Your Agent Needs a Harness, Not a Framework: Argument for reliability-first infrastructure around agents instead of framework-only thinking.

Category Overview

Category	Entries
Harness Architecture & Orchestration	20
Context & Working-State Engineering	8
Execution Substrates & Sandboxing	16
Protocols, Tool Interfaces & Agent Contracts	11
Evaluation Harnesses & Benchmarks	20
Observability & Reliability Operations	13
Guardrails, Security & Governance	11
Reference Harness Implementations	32
Essential Readings & Ecosystem Maps	28
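
The header statistics can be re-derived from the category table above. The following is a minimal sketch, not the repository's own tooling: the function name `summarize` is hypothetical, and it assumes that every entry outside the readings category is a GitHub project (as the 131/131 figure states) and that the readings category contains 4 GitHub-hosted lists (as in the catalog below).

```python
# Category counts copied from the Category Overview table.
CATEGORY_COUNTS = {
    "Harness Architecture & Orchestration": 20,
    "Context & Working-State Engineering": 8,
    "Execution Substrates & Sandboxing": 16,
    "Protocols, Tool Interfaces & Agent Contracts": 11,
    "Evaluation Harnesses & Benchmarks": 20,
    "Observability & Reliability Operations": 13,
    "Guardrails, Security & Governance": 11,
    "Reference Harness Implementations": 32,
    "Essential Readings & Ecosystem Maps": 28,
}

READINGS = "Essential Readings & Ecosystem Maps"
# Assumption: 4 of the 28 readings entries are GitHub-hosted awesome lists.
GITHUB_IN_READINGS = 4


def summarize(counts, readings_key, github_in_readings):
    """Recompute the header statistics (hypothetical helper, for illustration)."""
    total = sum(counts.values())
    # Project categories are everything except the readings category.
    project_entries = total - counts[readings_key]
    # Assumption: all project-category entries are GitHub repositories.
    github_entries = project_entries + github_in_readings
    return {
        "total": total,
        "categories": len(counts),
        "project_entries": project_entries,
        "github_entries": github_entries,
        "github_share": round(100 * github_entries / total, 1),
    }


summary = summarize(CATEGORY_COUNTS, READINGS, GITHUB_IN_READINGS)
print(summary)
```

Under these assumptions the sketch reproduces the figures in the header: 159 entries across 9 categories, 131 project entries, and 135 GitHub entries (84.9%).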

Catalog

Notes:

Stars are rendered as badges from snapshot values.
Repository update dates are tracked in data/projects.yaml and validation reports.
Entries are sorted by stars (descending) within each category.
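
The ordering rule in the notes above can be illustrated as follows. This is a sketch only: the record fields (`name`, `category`, `stars`), the sample values, and the function `group_and_sort` are hypothetical and do not reflect the actual schema of data/projects.yaml or the real star counts.

```python
from collections import defaultdict

# Hypothetical records with made-up star snapshots, for illustration only.
entries = [
    {"name": "project-a", "category": "Example Category 1", "stars": 120},
    {"name": "project-b", "category": "Example Category 1", "stars": 950},
    {"name": "project-c", "category": "Example Category 2", "stars": 40},
    {"name": "project-d", "category": "Example Category 1", "stars": 300},
]


def group_and_sort(records):
    """Group records by category, then sort each group by stars, descending."""
    grouped = defaultdict(list)
    for record in records:
        grouped[record["category"]].append(record)
    return {
        category: sorted(items, key=lambda r: r["stars"], reverse=True)
        for category, items in grouped.items()
    }


catalog = group_and_sort(entries)
for category, items in catalog.items():
    print(category, [item["name"] for item in items])
```

Because the star values are snapshots, the order only changes when the snapshot is refreshed, not when a repository's live star count moves.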

Harness Architecture & Orchestration

Project	Link	Tags	Summary
DeerFlow	GitHub	long-horizon, memory, subagents	Long-horizon super-agent harness integrating memory, tools, subagents, and sandboxes.
AutoGen	GitHub	multi-agent, orchestration, framework	Programming framework for agentic AI with multi-agent interaction and orchestration.
Agno	GitHub	scale, runtime, management	Agent software runtime focused on running and managing agentic systems at scale.
LangGraph	GitHub	graph, workflow, runtime	Graph-based runtime for resilient stateful agents and deterministic workflow control.
Semantic Kernel	GitHub	enterprise, orchestration, plugins	Enterprise-grade agentic application framework with orchestration and plugin patterns.
OpenAI Agents SDK (Python)	GitHub	sdk, handoff, workflows	Lightweight framework for multi-agent workflows, handoffs, and production patterns.
deepagents	GitHub	runtime, orchestration, long-running	Open-source harness for long-running, tool-using agents with planning and subagent patterns.
Google ADK (Python)	GitHub	toolkit, deployment, evaluation	Code-first toolkit to build, evaluate, and deploy advanced AI agents.
PydanticAI	GitHub	python, typing, schema	Type-safe Python framework for agents with strong schema contracts and tooling.
Hive	GitHub	harness, orchestration, runtime	Outcome-driven agent runtime harness with explicit control loops and orchestration blocks.
Microsoft Agent Framework	GitHub	multi-agent, workflows, observability	Multi-language framework for building, orchestrating, and deploying AI agents with graph workflows and observability.
VoltAgent	GitHub	typescript, platform, runtime	TypeScript agent engineering platform built around open runtime abstractions.
mcp-agent	GitHub	mcp, runtime, workflow	Practical agent framework centered on MCP tool ecosystems and workflow composition.
Yao	GitHub	single-binary, runtime, autonomous	Single-binary runtime for defining and running autonomous agents.
Cloudflare Agents	GitHub	platform, deployment, runtime	Platform runtime for building and deploying agents with production infrastructure primitives.
Docker Agent	GitHub	docker, runtime, container	Agent builder and runtime stack emphasizing container-native execution.
NeMo Agent Toolkit	GitHub	multi-agent, optimization, toolkit	Open toolkit for connecting and optimizing teams of AI agents.
Scion	GitHub	multi-agent, containers, orchestration	Experimental multi-agent orchestration testbed that runs isolated agent harnesses in containers, worktrees, and remote runtimes.
deepagentsjs	GitHub	typescript, langgraph, subagents	TypeScript agent harness with built-in planning, filesystem tools, subagents, and LangGraph-native runtime hooks.
hankweave	GitHub	long-horizon, runtime, checkpoints	Headless-first long-horizon runtime that orchestrates existing agent harnesses with sentinels, loops, checkpoints, and event journals.

Context & Working-State Engineering

Project	Link	Tags	Summary
everything-claude-code	GitHub	context, skills, harness-practices	Large open repository of harness practices around memory, skills, and context control for coding agents.
claude-mem	GitHub	memory, context, session	Plugin-style memory layer that captures session history and reinjects relevant context into future coding runs.
planning-with-files	GitHub	planning, skills, persistence	Skill package for persistent file-based planning in coding-agent workflows.
Agent Skills for Context Engineering	GitHub	skills, context, production	Large skill library oriented around context engineering and production agents.
Context-Engineering Handbook	GitHub	context-engineering, handbook, practices	First-principles handbook focused on practical context engineering for agent systems.
Trellis	GitHub	specs, memory, workflow	Multi-platform coding-agent workflow framework with task context, project memory, and spec injection.
Awesome Context Engineering	GitHub	awesome-list, context, survey	Survey-style list for context engineering resources and frameworks.
context-space	GitHub	context, infrastructure, mcp	Infrastructure project focused on context engineering building blocks and MCP-centric integrations.

Execution Substrates & Sandboxing

Project	Link	Tags	Summary
Daytona	GitHub	sandbox, execution, infra	Secure and elastic sandbox infrastructure for running AI-generated code with file, Git, LSP, and execution APIs.
CUA	GitHub	computer-use, sandbox, infra	Infrastructure stack for computer-use agents with sandbox, SDK, and benchmark support.
E2B	GitHub	cloud-sandbox, execution, enterprise	Secure cloud environments with real tools for production-grade agent execution.
OpenSandbox	GitHub	sandbox, security, runtime	Secure and extensible sandbox runtime built for agent workloads.
agent-infra sandbox	GitHub	all-in-one, browser, shell	All-in-one sandbox combining browser, shell, files, MCP, and IDE server.
Judge0	GitHub	code-execution, sandbox, backend	Scalable sandboxed code execution system usable as an agent execution backend.
Agent Sandbox	GitHub	kubernetes, sandbox, stateful	Kubernetes-native sandbox control plane for isolated, stateful agent runtimes with stable identity, persistence, and warm-pool support.
stakpak/agent	GitHub	always-on, autonomous, ops	Always-on open agent that runs on your machines with autonomous operational loops.
OSS-Fuzz Gen	GitHub	fuzzing, security, execution	LLM-powered fuzzing workflows integrated with controlled execution contexts.
Tensorlake	GitHub	microvm, sandbox, orchestration	Serverless runtime for agent sandboxes with MicroVM isolation, snapshots, suspend-resume, and background orchestration.
Arrakis	GitHub	sandbox, microvm, snapshots	Self-hosted sandbox substrate with MicroVM isolation, snapshot restore, and REST, SDK, and MCP interfaces for agent code execution and computer use.
AgentScope Runtime	GitHub	runtime, sandbox, deployment	Production runtime for agent apps with secure tool sandboxes, deployment APIs, observability, and state services.
SWE-ReX	GitHub	sandbox, execution, coding-agent	Sandboxed execution infrastructure for AI coding agents at local and cloud scale.
sandboxed.sh	GitHub	self-hosted, isolation, orchestrator	Self-hosted orchestrator running coding agents inside isolated Linux workspaces.
Capsule	GitHub	wasm, sandbox, task-runtime	Durable runtime that coordinates agent tasks inside isolated WebAssembly sandboxes with retries and lifecycle tracking.
terminal-bench-env	GitHub	terminal, benchmark-env, sandbox	Environment layer for terminal-agent benchmark execution.

Protocols, Tool Interfaces & Agent Contracts

Project	Link	Tags	Summary
GitHub Spec Kit	GitHub	spec-driven, workflows, tooling	Toolkit for spec-driven development to guide deterministic agent execution.
MCP Servers	GitHub	mcp, servers, implementations	Official collection of MCP server implementations across tools and domains.
AGENTS.md	GitHub	spec, agent-file, instructions	Open format for repository-local instructions that coding agents can follow.
Model Context Protocol	GitHub	mcp, protocol, interoperability	Core specification and docs for MCP-based tool and context interoperability.
directories (rules and MCP indexes)	GitHub	directories, mcp, rules	Curated directories of agent rules and MCP servers for tool discovery.
LangChain MCP Adapters	GitHub	mcp, adapters, integration	Adapters connecting LangChain components with MCP servers.
Microsoft MCP Servers	GitHub	mcp, enterprise, servers	Microsoft's official MCP server catalog for enterprise data and tools.
ACPX	GitHub	acp, client, sessions	Headless CLI client for stateful Agent Client Protocol sessions.
Microsoft Learn MCP	GitHub	mcp, docs, grounding	MCP server and CLI for grounding agents with Microsoft documentation sources.
IBM MCP	GitHub	mcp, clients, tooling	IBM collection of MCP servers, clients, and developer tooling.
AGENT.md	GitHub	standard, agent-file, interoperability	Standardized machine-readable file format for agentic coding tools.

Evaluation Harnesses & Benchmarks

Project	Link	Tags	Summary
Promptfoo	GitHub	eval, red-team, ci	Config-driven prompt/agent/RAG testing, comparison, and red-team evaluation tool.
DeepEval	GitHub	evaluation, framework, testing	LLM evaluation framework supporting agent and workflow quality testing.
RAGAS	GitHub	rag, metrics, evaluation	Open evaluation toolkit for LLM and RAG quality metrics.
lm-evaluation-harness	GitHub	benchmark, harness, llm	Popular benchmark harness for consistent LLM evaluation across tasks.
SWE-bench	GitHub	benchmark, swe, evaluation	Standard benchmark for evaluating issue-fixing software engineering agents.
verifiers	GitHub	verifier, rl, evaluation	Library for RL environments and verifier-based evaluation loops.
AgentBench	GitHub	benchmark, cross-domain, agent	Cross-environment benchmark for evaluating LLM agents as tool-using systems.
LangWatch	GitHub	simulation, evaluation, testing	End-to-end platform for agent simulations, evaluation loops, and production testing.
EvalScope	GitHub	benchmark, framework, llm	Customizable framework for large-model benchmarking and performance evaluation.
Terminal-Bench	GitHub	terminal, benchmark, long-horizon	Terminal-native benchmark suite for long-horizon, verification-heavy agent tasks.
Harbor	GitHub	evaluation, harness, rl-env	Framework for running agent evaluations and constructing RL-style environments.
tau2-bench	GitHub	tool-use, interaction, benchmark	Tool-agent-user interaction benchmark emphasizing multi-step execution quality.
NeMo Gym	GitHub	rl-env, training, evaluation	Toolkit for building RL environments suitable for LLM/agent training and eval.
TheAgentCompany	GitHub	benchmark, workplace, multi-step	Agent benchmark with simulated software-company tasks for evaluating multi-step workplace autonomy.
Inspect Evals	GitHub	inspect, eval-suite, reproducibility	Evaluation suite collection for Inspect AI workflows.
auto-harness	GitHub	optimization, regression, evals	Benchmark-gated optimization loop that runs overnight, mining failures, editing agent code, and guarding against regressions.
Agent Evaluation	GitHub	evaluation, testing, ci	AWS framework for testing virtual agents with evaluator-driven multi-turn conversations, hooks, and CI-friendly workflows.
WorkArena	GitHub	browser, benchmark, enterprise	Browser benchmark for practical enterprise-like knowledge work tasks.
OpenHands Benchmarks	GitHub	openhands, eval, harness	Evaluation harness and benchmark definitions for OpenHands systems.
WebArena-Verified	GitHub	web-agent, benchmark, deterministic	Verified web-agent benchmark with deterministic evaluators.

Observability & Reliability Operations

Project	Link	Tags	Summary
MLflow	GitHub	platform, monitoring, evaluation	Broad AI engineering platform with monitoring and evaluation support for agents.
Langfuse	GitHub	llmops, tracing, metrics	Open-source LLM engineering platform for traces, metrics, prompts, and evals.
Opik	GitHub	monitoring, eval, tracing	End-to-end debug/eval/monitoring stack for LLM apps and agent workflows.
RagaAI Catalyst	GitHub	agentops, analytics, monitoring	Agent observability and monitoring framework with timeline and graph analytics.
TensorZero	GitHub	llmops, gateway, optimization	Open LLMOps stack unifying gateway, observability, evaluation, and optimization.
Arize Phoenix	GitHub	observability, tracing, evaluation	Open platform for AI observability, tracing, and evaluation analytics.
OpenLLMetry	GitHub	opentelemetry, instrumentation, tracing	OpenTelemetry-based instrumentation for GenAI and LLM applications.
Helicone	GitHub	monitoring, traffic, production	Lightweight platform for monitoring and evaluating LLM traffic in production.
AgentOps SDK	GitHub	agentops, monitoring, cost	Monitoring and benchmarking SDK for agent workflows with cost and trace tracking.
Latitude	GitHub	platform, eval, observability	Open-source agent engineering platform with eval and observability capabilities.
Laminar	GitHub	observability, tracing, evals	Agent-focused observability stack with tracing, evaluation runs, monitoring, and dashboards.
claude-code-reverse	GitHub	trace, visualization, debugging	Tooling to visualize and inspect Claude Code LLM interaction traces.
OpenInference	GitHub	spec, instrumentation, observability	Open instrumentation specification and tooling for AI observability.

Guardrails, Security & Governance

Project	Link	Tags	Summary
LiteLLM	GitHub	gateway, proxy, guardrails	Unified LLM gateway/proxy with cost tracking, load balancing, and guardrails.
Kong	GitHub	gateway, policy, infra	API and AI gateway infrastructure useful for policy enforcement in agent systems.
Portkey Gateway	GitHub	gateway, guardrails, routing	AI gateway with routing and guardrails for multi-model production traffic.
CAI (Cybersecurity AI)	GitHub	security, governance, framework	Security-focused agent framework for offensive/defensive AI workflows.
OpenAI Realtime Agents	GitHub	realtime, orchestration, control	Advanced agentic realtime patterns with structured control and interaction loops.
Plano	GitHub	proxy, safety, data-plane	AI-native proxy and data plane with orchestration, safety, and observability.
OpenAI CS Agents Demo	GitHub	demo, handoffs, governance	Customer-service multi-agent demo highlighting handoffs and guardrail-like control points.
ContextForge	GitHub	gateway, governance, observability	Registry and proxy layer that unifies MCP, A2A, and REST/gRPC endpoints with centralized governance and observability.
Archestra	GitHub	enterprise, guardrails, governance	Enterprise AI platform with guardrails, MCP registry, and orchestration services.
Tracecat	GitHub	security, automation, policy	AI automation platform for security teams with policy and workflow controls.
AgentGateway	GitHub	gateway, mcp, proxy	Agentic proxy gateway for AI agents and MCP server ecosystems.

Reference Harness Implementations

Project	Link	Tags	Summary
OpenCode	GitHub	terminal, coding-agent, subagents	Open-source coding agent with built-in plan/build roles, subagents, LSP support, and a client-server runtime.
Claude Code	GitHub	terminal, coding-agent, git-workflows	Official terminal coding agent that understands codebases and executes editing, debugging, and Git workflows through natural language.
Gemini CLI	GitHub	terminal, coding-agent, mcp	Open-source terminal agent with built-in tools, MCP support, checkpointing, and sandboxing controls.
Codex CLI	GitHub	terminal, coding-agent, local-execution	Terminal-native coding agent that runs locally and exposes practical agent workflows for software tasks.
OpenHands	GitHub	coding-agent, software-engineering, repo	Open-source AI software engineer focused on repo-level coding task execution.
OpenManus	GitHub	general-agent, autonomy, workflows	Open foundation for broad autonomous agent workflows with coding-heavy use cases.
learn-claude-code	GitHub	tutorial, harness, claude-code	Hands-on harness tutorial for building Claude Code-like systems from scratch.
aider	GitHub	terminal, repo-map, testing	Terminal coding assistant with repo mapping, git-aware edits, and built-in lint/test feedback loops.
Claude Code Plugins: Orchestration and Automation	GitHub	claude-code, plugins, orchestration	Production-ready Claude Code plugin marketplace bundling agents, skills, tools, and multi-agent workflow orchestrators.
CLI-Anything	GitHub	cli, tool-use, automation	CLI agent system that unifies command-line tool usage in agent loops.
NanoClaw	GitHub	containers, claude-sdk, scheduling	Container-isolated Claude agent harness with channel routing, scheduled jobs, per-group memory, and small-codebase customization.
Qwen Code	GitHub	terminal, coding-agent, cli	Terminal-native open-source coding agent tuned for practical dev loops.
SuperClaude Framework	GitHub	config, personas, workflow	Configuration framework adding commands, personas, and method templates to coding agents.
Devika	GitHub	assistant, planning, coding	Open-source coding assistant system for planning and implementing development tasks.
SWE-agent	GitHub	swe, issue-fixing, tooling	Research-grade coding agent that resolves GitHub issues with explicit tooling loops.
Aperant	GitHub	coding-agent, parallel, memory	Autonomous multi-agent coding framework with parallel execution, isolated workspaces, QA loops, and persistent memory.
Eigent	GitHub	desktop, cowork, productivity	Open-source desktop cowork agent for autonomous task execution and productivity.
IronClaw	GitHub	security, wasm, routines	Security-first personal agent harness with WASM sandboxing, routines, tool plugins, and persistent memory.
OpenHarness	GitHub	tool-use, memory, multi-agent	Open agent harness implementation covering tool use, skills, memory, permissions, and multi-agent coordination.
GitHub Copilot CLI	GitHub	terminal, coding-agent, mcp	Official terminal coding agent built on GitHub's Copilot harness with MCP extensibility, approval controls, and GitHub-native context.
Superset	GitHub	worktrees, desktop, parallel	Worktree-based desktop orchestrator for running and reviewing parallel CLI coding agents from one workspace.
Open SWE	GitHub	async, coding-agent, swe	Asynchronous open-source coding agent focused on software issue workflows.
OSAURUS	GitHub	macos, local-first, memory	Native macOS harness for autonomous coding agents with persistent memory.
HiClaw	GitHub	multi-agent, human-in-the-loop, shared-state	Collaborative multi-agent OS with manager-worker coordination, shared state, and human-in-the-loop oversight via Matrix rooms.
mini-swe-agent	GitHub	minimal, swe, coding-agent	Minimal coding agent implementation with strong benchmark competitiveness.
TinyAGI	GitHub	team-orchestration, autonomous, workflows	Team-style agent orchestrator for autonomous one-person-company workflows.
Devon	GitHub	pair-programming, coding-agent, autonomous	Open-source pair programmer agent with autonomous coding execution patterns.
oh-my-pi	GitHub	terminal, lsp, subagents	Terminal AI coding agent with edit safety, LSP integration, and subagent support.
Open Claude Cowork	GitHub	desktop, ui, orchestration	Desktop coding cowork assistant that turns agent orchestration into GUI workflows.
holaOS	GitHub	long-horizon, desktop, durable-state	Desktop-first long-horizon agent environment with runtime, memory, tools, apps, and durable state.
Amazon Bedrock AgentCore Samples	GitHub	aws, runtime, operations	Official sample suite for deploying and operating agents with runtime, gateway, memory, observability, evaluation, and policy layers.
mini-coding-agent	GitHub	coding-agent, minimal, approvals	Minimal coding agent harness illustrating approvals, memory, bounded delegation, and durable transcripts.

Essential Readings & Ecosystem Maps

Project	Link	Stars	Tags	Summary
awesome-claude-code	GitHub		awesome-list, claude-code, skills	Community collection of Claude Code skills, hooks, and orchestrator tooling.
awesome-agentic-patterns	GitHub		awesome-list, patterns, design	Catalog of reusable agentic design patterns and implementation motifs.
awesome-mcp-servers	GitHub		awesome-list, mcp, tools	Curated MCP server index for tool interoperability in agent systems.
awesome-harness-engineering	GitHub		awesome-list, curation, harness	Curated list focused on harness engineering articles, benchmarks, and implementations.
12 Factor Agents	Reference	-	reading, operations, principles	Operations-oriented principles for building maintainable production agents.
Agent Frameworks, Runtimes, and Harnesses, oh my!	Reference	-	reading, langchain, architecture	Clear decomposition of framework vs runtime vs harness responsibilities.
Building agents with the Claude Agent SDK	Reference	-	reading, claude, sdk	Claude blog on production-oriented SDK usage for sessions, tools, and orchestration.
Building Effective AI Agents	Reference	-	reading, anthropic, agents	Anthropic's practical guidance on when to use workflows vs. autonomous agents and how to structure them.
Claude Code auto mode	Reference	-	reading, anthropic, permissions	Anthropic's write-up on classifier-backed approval delegation for safer high-autonomy coding-agent runs.
Code execution with MCP	Reference	-	reading, anthropic, mcp	Anthropic's design notes on controlled code execution via MCP boundaries.
Demystifying Evals for AI Agents	Reference	-	reading, evals, anthropic	Methodology for designing robust evals for agents whose trajectories are non-deterministic.
Effective context engineering for AI agents	Reference	-	reading, context, anthropic	Guidance on context-window budgeting and working-state management for agents.
Effective harnesses for long-running agents	Reference	-	reading, long-running, anthropic	Practical guide to maintaining state, resumability, and reliability over long agent runs.
Evaluating Deep Agents: Our Learnings	Reference	-	reading, langchain, evaluation	LangChain's practical lessons on evaluating stateful and long-horizon agents.
Harness design for long-running application development	Reference	-	reading, app-dev, anthropic	Follow-up article on improving long-running app generation through harness structure.
Harness Engineering (Martin Fowler)	Reference	-	reading, architecture, fowler	Architectural perspective on harness engineering and entropy control.
Harness engineering (OpenAI)	Reference	-	reading, methodology, openai	Field report on building reliable agent-first software via harness constraints and verification.
How we built our multi-agent research system	Reference	-	reading, anthropic, multi-agent	Anthropic architecture write-up on role separation and coordination in multi-agent systems.
Improving Deep Agents with harness engineering	Reference	-	reading, langchain, harness	Evidence that harness improvements alone can move benchmark performance.
Making Claude Code more secure and autonomous with sandboxing	Reference	-	reading, anthropic, sandboxing	How Anthropic uses sandbox boundaries to raise agent autonomy without giving up security controls.
Quantifying infrastructure noise in agentic coding evals	Reference	-	reading, anthropic, evaluation	Analysis of how infrastructure choices impact coding-agent benchmark outcomes.
Scaling Managed Agents: Decoupling the brain from the hands	Reference	-	reading, anthropic, architecture	Anthropic's meta-harness architecture for decoupling session logs, harness loops, and sandboxes in long-horizon agents.
Skill Issue: Harness Engineering for Coding Agents	Reference	-	reading, humanlayer, coding-agents	Practical breakdown of why coding-agent quality depends heavily on harness setup.
Testing Agent Skills Systematically with Evals	Reference	-	reading, openai, evals	OpenAI Developers guide for turning agent traces into repeatable skill evaluations.
The Anatomy of an Agent Harness	Reference	-	reading, architecture, langchain	Conceptual decomposition of agent harness components and their responsibilities.
Unrolling the Codex agent loop	Reference	-	reading, openai, architecture	OpenAI engineering deep dive into the Codex harness loop, prompt growth, tool-call replay, and stateless execution tradeoffs.
Writing effective tools for AI agents	Reference	-	reading, anthropic, tools	Best practices for tool interface design so agents call tools safely and reliably.
Your Agent Needs a Harness, Not a Framework	Reference	-	reading, inngest, reliability	Argument for reliability-first infrastructure around agents instead of framework-only thinking.

Maintenance Notes

Source of truth: data/projects.yaml
Regenerate README files: python3 scripts/render_readme.py
Verify catalog and links: python3 scripts/verify_catalog.py

Citation

@misc{awesome-agent-harness,
  title={Awesome Agent Harness},
  howpublished={\url{https://github.com/Picrew/awesome-agent-harness.git}},
  year={2026}
}
