diff --git a/README.md b/README.md
index 8bffd41f5..9cc1b2b3f 100644
--- a/README.md
+++ b/README.md
@@ -1,6 +1,6 @@
 # OpenAdapt: AI-First Process Automation with Large Multimodal Models (LMMs)

-[![Build Status](https://github.com/OpenAdaptAI/OpenAdapt/workflows/Python%20CI/badge.svg?branch=main)](https://github.com/OpenAdaptAI/OpenAdapt/actions)
+[![Build Status](https://github.com/OpenAdaptAI/OpenAdapt/actions/workflows/main.yml/badge.svg)](https://github.com/OpenAdaptAI/OpenAdapt/actions/workflows/main.yml)
 [![PyPI version](https://img.shields.io/pypi/v/openadapt.svg)](https://pypi.org/project/openadapt/)
 [![Downloads](https://img.shields.io/pypi/dm/openadapt.svg)](https://pypi.org/project/openadapt/)
 [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
@@ -309,7 +309,7 @@ OpenAdapt's key differentiator is **demonstration-conditioned automation** - "sh

 ---

-## Terminology (Aligned with GUI Agent Literature)
+## Terminology

 | Term | Description |
 |------|-------------|
@@ -320,62 +320,6 @@ OpenAdapt's key differentiator is **demonstration-conditioned automation** - "sh
 | **Policy** | Decision-making component that maps observations to actions |
 | **Grounding** | Mapping intent to specific UI elements (coordinates) |

-## Meta-Package Structure
-
-OpenAdapt v1.0+ uses a **modular architecture** where the main `openadapt` package acts as a meta-package that coordinates focused sub-packages:
-
-- **Core Packages**: Essential for the three-phase pipeline
-  - `openadapt-capture` - DEMONSTRATE phase: Collects observations and actions
-  - `openadapt-ml` - LEARN phase: Trains policies from demonstrations
-  - `openadapt-evals` - EXECUTE phase: Evaluates agents on benchmarks
-
-- **Optional Packages**: Enhance specific workflow phases
-  - `openadapt-privacy` - DEMONSTRATE: PII/PHI scrubbing before storage
-  - `openadapt-retrieval` - LEARN + EXECUTE: Demo conditioning for both training and evaluation
-  - `openadapt-grounding` - EXECUTE: UI element localization (SoM, OmniParser)
-
-- **Cross-Cutting**:
-  - `openadapt-viewer` - Trajectory visualization at any phase
-
-### Two Paths to Automation
-
-1. **Custom Training Path**: Demonstrate -> Train policy -> Deploy agent
-   - Best for: Repetitive tasks specific to your workflow
-   - Requires: `openadapt[core]`
-
-2. **API Agent Path**: Use pre-trained VLM APIs (Claude, GPT-4V, etc.) with demo conditioning
-   - Best for: General-purpose automation, rapid prototyping
-   - Requires: `openadapt[evals]`
-
----
-
-## Installation Paths
-
-Choose your installation based on your use case:
-
-```
-What do you want to do?
-|
-+-- Just evaluate API agents on benchmarks?
-|   +-- pip install openadapt[evals]
-|
-+-- Train custom models on your demonstrations?
-|   +-- pip install openadapt[core]
-|
-+-- Full suite with all optional packages?
-|   +-- pip install openadapt[all]
-|
-+-- Minimal CLI only (add packages later)?
-    +-- pip install openadapt
-```
-
-| Installation | Included Packages | Use Case |
-|-------------|-------------------|----------|
-| `openadapt` | CLI only | Start minimal, add what you need |
-| `openadapt[evals]` | + evals | Benchmark API agents (Claude, GPT-4V) |
-| `openadapt[core]` | + capture, ml, viewer | Full training workflow |
-| `openadapt[all]` | + privacy, retrieval, grounding | Everything including optional enhancements |
-
 ---

 ## Demos
diff --git a/docs/design/production-execution-design.md b/docs/design/production-execution-design.md
index 10e880bdd..027f32dda 100644
--- a/docs/design/production-execution-design.md
+++ b/docs/design/production-execution-design.md
@@ -27,7 +27,7 @@ This document addresses a critical gap in the OpenAdapt architecture: the **EXEC

 ---

-## 1. Problem Statement
+## 1. Problem Statement {#1-problem-statement}

 ### Current Architecture

@@ -56,7 +56,7 @@ RECORD (capture) --> TRAIN (ml) --> EXECUTE (evals)

 ---

-## 2. Literature Review
+## 2. Literature Review {#2-literature-review}

 ### 2.1 Microsoft UFO (2024-2025)

@@ -173,7 +173,7 @@ RECORD (capture) --> TRAIN (ml) --> EXECUTE (evals)

 ---

-## 3. Gap Analysis
+## 3. Gap Analysis {#3-gap-analysis}

 ### 3.1 What OpenAdapt Has

@@ -289,7 +289,7 @@ RECORD (capture) --> TRAIN (ml) --> EXECUTE (evals)

 ---

-## 4. Architectural Options
+## 4. Architectural Options {#4-architectural-options}

 ### Option A: Rename/Expand openadapt-evals

@@ -425,7 +425,7 @@ openadapt-evals

 ---

-## 5. Recommendation
+## 5. Recommendation {#5-recommendation}

 ### Primary Recommendation: Option B (openadapt-agent)

@@ -507,7 +507,7 @@ openadapt eval run --benchmark waa # Benchmarking (unchanged)

 ---

-## 6. README Improvement Proposal
+## 6. README Improvement Proposal {#6-readme-improvement-proposal}

 ### Current Issues

@@ -583,7 +583,7 @@ Production automation includes:

 - [Documentation](https://docs.openadapt.ai)
 - [Discord](https://discord.gg/yF527cQbDG)
-- [Architecture](./docs/architecture-evolution.md)
+- [Architecture](../architecture-evolution.md)

 ## License

@@ -601,7 +601,7 @@ MIT

 ---

-## 7. Implementation Roadmap
+## 7. Implementation Roadmap {#7-implementation-roadmap}

 ### Q1 2026: Foundation

@@ -639,7 +639,7 @@ MIT

 ---

-## 8. References
+## 8. References {#8-references}

 ### GUI Automation Frameworks