docs: Add production execution design (openadapt-agent proposal) #977

abrichr · 2026-01-17T05:51:56Z

Summary

Comprehensive design document for production execution capability, addressing the gap between benchmark evaluation and real-world automation.

Key Findings

- Literature Review: UFO, Claude Computer Use, OSWorld, ST-WebAgentBench
  - Only 2% of orgs have deployed agentic AI at scale
  - Safety and human-in-the-loop are mandatory for production

- Recommendation: Create new openadapt-agent package
  - Clear terminology (industry standard)
  - Separation from benchmarking (different requirements)
  - Production execution loop with safety integration

- README Proposal: Condense from 443 to ~100 lines
  - EXECUTE phase includes both openadapt-agent (production) AND openadapt-evals (benchmarks)
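
The "production execution loop with safety integration" item is specified in the design document, not in this description. As a rough illustration only, the idea of gating an execution loop on human approval might be sketched as below. Every name here (`Action`, `run_with_safety`, the `risky` flag, the `approve` callback) is hypothetical and not taken from openadapt-agent or any existing OpenAdapt package.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class Action:
    """Hypothetical action record; not part of any OpenAdapt API."""
    name: str
    risky: bool = False  # assumption: a safety layer has already flagged the step


def run_with_safety(
    actions: Iterable[Action],
    execute: Callable[[Action], None],
    approve: Callable[[Action], bool],
) -> List[str]:
    """Run actions in order, asking a human before each risky one."""
    log: List[str] = []
    for action in actions:
        if action.risky and not approve(action):
            log.append(f"blocked:{action.name}")
            break  # stop the whole run when a human rejects a step
        execute(action)
        log.append(f"done:{action.name}")
    return log


if __name__ == "__main__":
    executed: List[Action] = []
    plan = [
        Action("open_form"),
        Action("submit_payment", risky=True),
        Action("close"),
    ]
    # Deny every risky action to show the gate halting the loop.
    print(run_with_safety(plan, executed.append, approve=lambda a: False))
```

Passing the approval step in as a callback keeps the loop itself independent of how confirmation is collected (CLI prompt, GUI dialog, or an automated policy), which is one way the safety check could be wired into execution rather than living beside it.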

Files

docs/design/production-execution-design.md (+724 lines)

🤖 Generated with Claude Code

- Literature review: UFO, Claude CU, OSWorld, ST-WebAgentBench
- Gap analysis: safety exists but not wired to execution loop
- Recommendation: Create openadapt-agent package for production automation
- README improvement proposal: Condense from 443 to ~100 lines
- Implementation roadmap: Q1-Q3 2026

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

abrichr merged commit fcef4c8 into main Jan 17, 2026
6 checks passed

abrichr deleted the feature/agent-design branch January 17, 2026 05:52
