📦 Trajectory Workbench

基于 OpenSandbox 的 AI Agent 轨迹数据合成工作台。

通过在安全隔离的沙箱环境中运行 LLM 驱动的 Agent，采集真实的 Observation → Thought → Action → Result 交互轨迹，并通过 Review Agent 自动评估与自迭代优化，最终产出高质量的 SFT / DPO / RLHF 训练数据集。

✨ 核心特性

真实环境交互 — Agent 在 OpenSandbox 沙箱中执行真实命令与 GUI 操控，轨迹数据源于闭环交互而非凭空生成
9 大场景覆盖 — 内置 9 种主流 Agent 场景（含 5 个独立合成引擎），一套工作台产出多类型训练数据
LLM 驱动决策 — Agent 由 DeepSeek API 驱动（OpenAI 兼容格式），每一步自主推理和决策
自迭代闭环 — Review Agent 自动评估轨迹质量，按三级授权模型迭代优化
三级授权审批 — 🟢 低风险自动执行 → 🟡 中风险人工确认 → 🔴 高风险人工审批
可视化工作台 — Web UI 实时展示执行日志、质量评分、审批操作和迭代历史
多格式导出 — 支持 SFT / DPO / RLHF / Raw 等多种训练数据格式

🎯 支持场景

Trajectory Workbench 内置 9 种场景，覆盖当前主流 Agent 能力评估维度。其中 5 个场景由独立合成引擎驱动，具备完整的数据生成 Pipeline。

独立合成引擎场景

场景	引擎	参考来源	说明
🏗️ EnvScaler 工具调用	`envscaler`	—	基于状态化环境骨架的工具调用轨迹合成，支持 reward 信号与 check function 验证
🔧 ToolACE 工具调用	`toolace`	ToolACE (ICLR 2025)、ToolACE-MT	工具自进化合成 + 多轮调用轨迹生成，含跨组交叉、角色背景注入、缺参追问
🔍 Search2QA	`search2qa`	WebExplorer	搜索轨迹驱动的 QA 合成，支持 Query Evolution 与轨迹改写
⚙️ Toucan MCP 工具交互	`toucan`	Toucan	基于 Smithery MCP Server 注册表的工具调用轨迹合成，含 6 维质量检查
📱 Mobile Agent	`mobile_agent`	Redroid	基于 OpenSandbox + Redroid 的 Android GUI 操控轨迹合成，VLM 驱动截图→推理→动作循环

通用沙箱场景

场景	ID	说明
🖥️ GUI 操作	`gui`	浏览器 / 桌面系统的界面操控，生成 GUI 自动化与 RPA 训练数据
🌐 Deep Search	`deep_search`	搜索引擎检索与信息整合，生成深度搜索和综合推理训练数据
🤖 多 Agent 协调	`multi_agent`	多智能体协作与交互，生成多轮多角色对话与协调决策训练数据
💻 代码执行	`code_exec`	代码编写、测试与调试，最基础也最通用的场景

🏗️ 系统架构

┌─────────────┐     ┌──────────────────────────┐     ┌─────────────────────┐
│   Web UI    │────▶│      Backend API          │────▶│  OpenSandbox Server │
│  (React +   │     │      (FastAPI)            │     │  (沙箱控制面)        │
│   Vite)     │     │                          │     │  port:8080          │
│  port:5173  │     │  ┌────────────────────┐  │     └──────────┬──────────┘
└─────────────┘     │  │  Scene Engines     │  │                │
                    │  │                    │  │                ▼
                    │  │  • EnvScaler       │  │         ┌──────────────┐
                    │  │  • ToolACE         │  │         │ Docker 沙箱   │
                    │  │  • Search2QA       │  │         │ (隔离执行环境) │
                    │  │  • Toucan          │  │         └──────────────┘
                    │  │  • Mobile Agent ◀──┼──┼──┐
                    │  └────────────────────┘  │  │  ┌───────────────────────┐
                    │                          │  └─▶│  Docker 沙箱           │
                    │  port:3000               │     │  (Redroid      │
                    └──────────┬───────────────┘     │   Redroid 容器)  │
                               │                     └───────────────────────┘
                        ┌──────▼──────┐
                        │ DeepSeek API│
                        │ (Agent 大脑) │
                        └─────────────┘

数据生成流程

用户提出任务 → 选择场景 → 生成 Pipeline → 沙箱执行 → 产出轨迹
                                                           ↓
                                                   Review Agent 评估
                                                           ↓
                       ┌── 🟢 自主执行区 → 自动修改，重跑
                       ├── 🟡 人工确认区 → Web UI 选择方案
                       └── 🔴 人工审批区 → Web UI 审批
                                                           ↓
                                                   质量达标 → 导出数据集
                                                       (SFT / DPO / RLHF)

📁 项目结构

trajectory-workbench/
├── README.md                 # 本文档
├── INSTALL_GUIDE.md          # 集成指南（新手版）
├── requirements.txt          # Python 依赖
├── backend.py                # 后端 API 服务 (FastAPI)，包含场景调度和完整执行逻辑
├── pipeline.py               # CLI Pipeline 编排脚本（可独立运行）
│
├── envscaler/                # 🏗️ EnvScaler 合成引擎
│   ├── config.py             #   配置 + MCP Server 模板 + Agent Prompt
│   ├── scene_manager.py      #   场景文件加载 / 解析 / 提取
│   ├── sandbox_runner.py     #   沙箱部署 MCP Server
│   ├── trajectory_gen.py     #   Agent 轨迹生成
│   ├── envscaler_pipeline.py #   Pipeline 编排 + Review + Export
│   └── envscaler_api.py      #   FastAPI 路由
│
├── toolace/                  # 🔧 ToolACE 合成引擎
│   ├── step1_tool_evolution.py   #   工具自进化合成 (TSS)
│   ├── step2_task_generation.py  #   任务生成 (SDG)
│   ├── step3_trajectory_gen.py   #   多轮轨迹生成 (ToolACE-MT)
│   ├── toolace_pipeline.py       #   Pipeline 编排 + Review + Export
│   └── toolace_api.py            #   FastAPI 路由
│
├── search2qa/                # 🔍 Search2QA 合成引擎
│   ├── main.py               #   三阶段流水线编排
│   ├── llm_engine.py         #   LLM 多轮交互引擎 (Function Calling)
│   ├── tools.py              #   工具实现 (DuckDuckGo + 网页爬取)
│   ├── prompts.py            #   三阶段提示词模板
│   ├── scene_handler.py      #   沙箱执行控制器
│   └── trace_manager.py      #   轨迹记录与管理
│
├── toucan/                   # ⚙️ Toucan 合成引擎
│   ├── step0_smithery_setup.py   #   Smithery MCP Server 注册
│   ├── step1_question_synthesis.py #  问题合成 + 嵌入去重
│   ├── step2_quality_check.py    #   6 维质量检查
│   ├── step3_trajectory_gen.py   #   Agent 轨迹生成 (MCP 调用)
│   ├── toucan_pipeline.py        #   Pipeline 编排
│   └── toucan_api.py             #   FastAPI 路由
│
├── mobile_agent/             # 📱 Mobile Agent 合成引擎 (NEW)
│   ├── config.py             #   配置 + 动作空间 + System Prompt + Tools Schema
│   ├── mobile_scenarios.json #   10 个内置 Android GUI 任务场景
│   ├── sandbox_runner.py     #   OpenSandbox + ADB 沙箱生命周期管理
│   ├── trajectory_gen.py     #   VLM 驱动的截图→推理→动作循环
│   ├── mobile_pipeline.py    #   Pipeline 编排 + 5 维 Review + Export
│   ├── mobile_api.py         #   FastAPI 路由 (/api/mobile/*)
│   ├── test_local.py         #   集成测试 (支持 Mock 模式离线运行)
│   └── README.md             #   模块详细文档
│
├── web-ui/                   # 前端 Web UI
│   ├── src/
│   │   ├── App.jsx           #   主界面组件
│   │   └── App.css           #   样式
│   ├── package.json
│   └── vite.config.js
│
└── output/                   # 导出的轨迹数据（自动生成）
    ├── *_export.json         #   Web UI 导出格式
    ├── *_sft_*.jsonl         #   SFT 训练数据
    ├── *_dpo_*.jsonl         #   DPO 训练数据
    └── best_trajectory_*.json    # 最佳轨迹

📋 环境要求

组件	最低版本	说明
macOS / Linux	—	支持 Apple Silicon (M1/M2/M3/M4)
Docker Desktop	4.0+	需分配至少 8GB 内存
Python	3.10+	推荐使用 `uv` 包管理器
Node.js	18+	用于前端 Web UI
DeepSeek API Key	—	在 platform.deepseek.com 获取
KVM 支持	—	Mobile Agent 场景需要（Linux 需 `/dev/kvm`，macOS 需 Rosetta 2）

🚀 快速开始

第一步：克隆项目

git clone https://github.com/Stephen3zero24/trajectory-workbench.git
cd trajectory-workbench

第二步：安装基础工具

macOS 用户：

# 安装 Homebrew（如未安装）
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"

# 安装 uv（Python 包管理器）和 Node.js
brew install uv node

安装 Docker Desktop： 前往 docker.com 下载安装。安装后进入 Settings → Resources，将 Memory 设为 8GB，CPUs 设为 4。

验证安装：

docker --version      # 应输出版本号
python3 --version     # 需要 3.10+
node --version        # 需要 18+
uv --version          # 应输出版本号

第三步：安装 Python 依赖

uv venv .venv
source .venv/bin/activate    # macOS / Linux
uv pip install -r requirements.txt

第四步：拉取沙箱镜像

# OpenSandbox 镜像（通用场景）
opensandbox-server init-config ~/.sandbox.toml --example docker

# Redroid 镜像（Mobile Agent 场景, 可选）
docker pull redroid/redroid:14.0.0-latest

# 国内加速: Redroid 镜像托管在 Docker Hub, 配置镜像加速即可
# 见下方 FAQ "Docker 镜像拉取很慢"

第五步：配置 API Key

export DEEPSEEK_API_KEY="your-api-key-here"

# 建议写入 shell 配置文件使其永久生效
echo 'export DEEPSEEK_API_KEY="your-api-key-here"' >> ~/.zshrc
source ~/.zshrc

第六步：安装前端依赖

cd web-ui && npm install && cd ..

第七步：启动服务

需要打开 3 个终端窗口，分别启动 3 个服务：

终端 1 — OpenSandbox Server（沙箱控制面）：

source .venv/bin/activate
opensandbox-server
# ✅ INFO: Uvicorn running on http://127.0.0.1:8080

终端 2 — Backend API（后端服务）：

source .venv/bin/activate
export DEEPSEEK_API_KEY="your-api-key-here"
python3 backend.py
# ✅ 🚀 轨迹合成工作台后端启动
# ✅ INFO: Uvicorn running on http://0.0.0.0:3000

终端 3 — Web UI（前端界面）：

cd web-ui && npm run dev
# ✅ VITE ready — Local: http://localhost:5173/

第八步：打开浏览器

访问 **http://localhost:5173**，页面顶部应显示两个绿色状态标签：

OpenSandbox: connected ✅
DeepSeek: configured ✅

📖 使用教程

1. 定义任务

在首页任务描述框中输入希望 Agent 完成的任务（或点击示例按钮快速填入），选择场景类型，配置模型参数，点击 "▶ 启动 Pipeline"。

💡 新手建议：从「代码执行」场景开始，沙箱环境最通用、上手门槛最低。

2. 观察执行过程

页面自动进入"沙箱执行"阶段，可实时看到沙箱初始化日志、Agent 每一步的决策与执行结果、Review Agent 的评估过程。

3. 处理审批请求

当 Review Agent 提出修改建议时：

🟢 自主执行区：低风险修改（如温度调整），系统已自动应用
🟡 人工确认区：中风险修改（如任务描述优化），选择一个方案后确认
🔴 人工审批区：高风险修改（如环境依赖变更），查看影响评估后审批

4. 导出数据集

轨迹质量达到阈值（默认 80 分）或手动点击"跳过迭代 · 直接导出"后，数据将导出至 output/ 目录。

🔧 CLI 模式（无 Web UI）

source .venv/bin/activate
export DEEPSEEK_API_KEY="your-api-key-here"

# 通用 Pipeline（确保 OpenSandbox Server 已在另一终端运行）
python3 pipeline.py

# 独立运行各合成引擎
python -m toolace.toolace_pipeline --task-count 5 --expansion-count 2
python -m toucan.toucan_pipeline
python3 search2qa/main.py --seed "量子计算" --mode question --evolutions 2

# Mobile Agent (同样通过 OpenSandbox, 需确保 Server 已启动)
python -m mobile_agent.test_local --dry-run          # 结构验证 (Mock, 无需 API Key)
python -m mobile_agent.test_local --max-tasks 3      # 完整测试 (需 API Key)

可修改 pipeline.py 底部的 TaskConfig 切换场景：

config = TaskConfig(
    task_id="task_001",
    task_desc="你的任务描述...",
    scene_type="mcp_tool",    # mcp_tool / gui / deep_search / multi_agent / code_exec
    model="deepseek-chat",
    temperature=0.7,
    max_steps=15,
)

Mobile Agent 使用独立的 Pipeline 入口：

import asyncio
from mobile_agent.config import MobileAgentPipelineConfig
from mobile_agent.mobile_pipeline import run_mobile_pipeline

config = MobileAgentPipelineConfig(
    task_id="my_task",
    max_steps=20,
    max_tasks=5,
    scenario_filter_tags=["settings"],  # 只跑 settings 标签的任务
    enable_vision=True,                 # 截图发送给 VLM
    enable_ui_tree=True,                # 获取 UI hierarchy
)
result = asyncio.run(run_mobile_pipeline(config))

🔌 场景引擎详解

🏗️ EnvScaler — 状态化环境工具调用

将外部生成的状态化环境（由 skel_builder + scen_generator 产出）部署为沙箱内 MCP Server，Agent 通过 scene_action 工具与环境交互，支持 reward 信号和 check function 自动验证。

与 Toucan / ToolACE 的核心区别在于：环境是有状态的领域模拟（如诊所预约系统、库存管理等），而非无状态 API 调用。

🔧 ToolACE — 工具自进化 + 多轮轨迹

三阶段 Pipeline：工具自进化合成 (TSS) → 任务生成 (SDG) → 多轮轨迹生成 (ToolACE-MT)。相比原论文新增了跨组交叉任务生成、8 种角色背景注入、缺参追问等改进，使生成的轨迹数据更接近真实场景。

🔍 Search2QA — 搜索轨迹驱动的 QA 合成

三阶段流水线：初始化 QA → 迭代复杂化 (Query Evolution) → 轨迹改写（造题轨迹 → 答题轨迹）。支持 Question 模式（种子→QA）和 Answer 模式（答案→问题），使用 DuckDuckGo 搜索（免费无需 API Key）。

⚙️ Toucan — MCP 工具调用

基于 Smithery MCP Server 注册表，Pipeline 为：MCP Server 注册 → 问题合成 + 嵌入去重 → 6 维质量检查（难度/质量/真实性/独特性/可验证性/稳定性） → Agent 轨迹生成。预置 8 个 MCP Server（Exa Search、Brave Search、GitHub、Filesystem 等），支持 4 种采样策略。

📱 Mobile Agent — Android GUI 操控

基于 OpenSandbox + Redroid，在 OpenSandbox 管理的 Redroid 容器中运行 VLM 驱动的 GUI Agent。Pipeline 为：场景加载 → OpenSandbox 创建 Android 沙箱 → 截图+UI树采集 → VLM 推理与动作决策 → 执行动作 → 循环 → 5 维质量评估 → 数据集导出。

与其他引擎的核心区别：

Observation 是视觉的 — 通过 adb shell screencap + uiautomator dump 获取截图和 UI hierarchy，而非文本工具返回值
Action 是空间操控 — tap(坐标)、swipe(轨迹)、input_text、key_event 等 7 种 GUI 动作，全部通过 ADB 执行
沙箱镜像不同 — 使用 redroid/redroid（Android AVD），但同样由 OpenSandbox Server 统一管理
Review 维度不同 — UI 理解、动作准确性、推理清晰度、完成度、效率
内置 10 个场景 — 覆盖系统设置、闹钟、联系人、计算器、跨应用多步操作等

详细文档见 mobile_agent/README.md。

⚙️ 配置说明

服务端口

服务	默认端口	配置方式
OpenSandbox Server	8080	`~/.sandbox.toml`
Backend API	3000	`backend.py` 末行
Web UI	5173	Vite 默认

环境变量

变量名	必填	说明
`DEEPSEEK_API_KEY`	✅	DeepSeek API 密钥
`OPENSANDBOX_SERVER`	❌	OpenSandbox 地址，默认 `http://127.0.0.1:8080`
`DEEPSEEK_BASE_URL`	❌	DeepSeek API 地址，默认 `https://api.deepseek.com`
`SMITHERY_API_KEY`	❌	Smithery API 密钥（Toucan 场景可选）
`MOBILE_SANDBOX_IMAGE`	❌	Redroid 镜像，默认 `redroid/redroid:14.0.0-latest`

支持的模型

模型 ID	名称	说明
`deepseek-chat`	DeepSeek-Chat (V3.2)	默认模型，性价比高
`deepseek-reasoner`	DeepSeek-Reasoner (R1)	推理能力更强，适合复杂任务

DeepSeek API 使用 OpenAI 兼容格式，可替换为任何兼容的模型提供商。Mobile Agent 场景推荐使用支持 Vision 的模型以获得最佳截图理解效果。

Docker 资源配置（推荐）

资源	推荐值	说明
CPUs	4	留一半给宿主机
Memory	8 GB	每个沙箱约占 1–2 GB，Mobile Agent 约占 2–3 GB
Disk	40 GB+	沙箱镜像需要存储空间（Redroid 镜像约 200 MB）

❓ 常见问题

Q: Docker 镜像拉取很慢？ 在 Docker Desktop → Settings → Docker Engine 中添加镜像加速：

{ "registry-mirrors": ["https://mirror.ccs.tencentyun.com"] }

Redroid 镜像托管在 Docker Hub，配置镜像加速即可。

Q: SDK 创建沙箱报 NoneType 错误？ 这是 SDK 0.1.5 和 Server 0.1.8 之间的已知兼容性问题。项目已通过 httpx 创建 + SDK connect 接管 的方式绕过，无需额外处理。

Q: 如何更换 LLM？ 修改 DEEPSEEK_BASE_URL 环境变量指向其他 OpenAI 兼容 API 地址即可，同时修改 DEEPSEEK_API_KEY 为对应密钥。

Q: 不同场景对沙箱环境有什么要求？ 大多数场景共用 OpenSandbox 镜像（opensandbox/code-interpreter:v1.0.2），差异体现在 Agent 的系统提示和任务描述上。Mobile Agent 场景同样由 OpenSandbox 管理，但使用 Redroid 镜像（redroid/redroid），通过 Android shell 直接执行 GUI 命令。

Q: Mobile Agent 在 macOS 上运行报 KVM 错误？ Redroid 不需要 KVM，但需要宿主机加载 binder 内核模块。macOS 用户需使用 Colima：

colima start --vm-type=vz --vz-rosetta --memory 8 --cpu 4

Q: OpenSandbox Server 不可达时能否使用 Mobile Agent？ 可以。Server 不可达时自动切换到 Mock 模式，Pipeline 逻辑正常运行但不会真正操控 Android 设备，适合开发调试 Agent Prompt 和验证 Pipeline 流程。

Q: 并发多少个沙箱合适？ 16 GB 内存的机器建议最多同时运行 3 个 OpenSandbox 沙箱，可在 Web UI 中调整并发数。Redroid 资源更轻（约 500MB/实例），128 CPU 可同时跑 30+ 实例。

Q: 轨迹质量一直不达标？ 可尝试：简化任务描述 → 降低质量阈值（0.8→0.7） → 增加最大迭代轮次 → 调低 Temperature（如 0.3） → 切换至 deepseek-reasoner 模型。

🗺️ Roadmap

~~GUI 操作场景专用桌面环境镜像~~ → Mobile Agent (OpenSandbox + Redroid)
鸿蒙 HarmonyOS 场景支持（基于 hdc 协议适配层 + DevEco 模拟器）
Deep Search 场景接入真实搜索引擎 API
多 Agent 协调场景支持自定义角色编排
更多导出格式（ShareGPT、Alpaca）
接入更多 LLM 提供商（OpenAI、Anthropic、本地模型）
批量任务调度与数据集自动化生产
Mobile Agent 支持自定义 APK 预装与应用内操控

🤝 技术栈

层级	技术
沙箱平台	OpenSandbox (Alibaba)
LLM API	DeepSeek (OpenAI 兼容格式)
后端	Python 3.10+ · FastAPI · uvicorn · httpx
前端	React 19 · Vite 8
合成引擎	ToolACE · Search2QA (WebExplorer) · Toucan · EnvScaler · Mobile Agent
容器化	Docker
MCP	fastmcp · qwen-agent · sentence-transformers
移动端	Redroid · Redroid (Android-in-Container) · uiautomator2

📚 参考论文

Docker Deployment

5/9 demo deployment is a 3-service docker-compose: VibeDataBot (platform side, separate repo) plus opensandbox-server and trajectory-workbench, both built from the same image in this repo and selected via the compose command field.

Build

docker build -t trajectory-workbench:demo .

Restricted networks (China mainland datacenters / transparent-proxy environments): if the build fails fetching deb.debian.org (502 / timeout), override the APT mirror via --build-arg:

docker build \
  --build-arg APT_MIRROR=https://mirrors.tuna.tsinghua.edu.cn/debian \
  -t trajectory-workbench:demo .

Available mirrors:

TUNA (Tsinghua): https://mirrors.tuna.tsinghua.edu.cn/debian
Aliyun: https://mirrors.aliyun.com/debian
USTC: https://mirrors.ustc.edu.cn/debian

The -security suffix is preserved automatically — sed replaces only the http://deb.debian.org/debian prefix.

Required env

var	required	notes
`DEEPSEEK_API_KEY`	yes	LLM credential, set on the trajectory-workbench service
`OPENSANDBOX_SERVER`	optional	full URL, defaults to `http://127.0.0.1:8080`; in docker-compose set to `http://opensandbox-server:8080` so the SDK connect path resolves to the sibling service
`PLATFORM_SKILLS_URL`	yes (compose)	root URL of VibeDataBot's `/api/skills` endpoint. trajectory-workbench fetches the skill manifest at startup and fails fast if the URL is unreachable or returns zero active `trajectory-synthesis` skills. In docker-compose set to the VibeDataBot service URL (e.g. `http://vibedatabot:3000`).

Volumes

./output:/app/output — persists trace artifacts written by search2qa scenarios
./deploy/sandbox.toml:/etc/opensandbox/sandbox.toml:ro — config for the opensandbox-server service (mount only on that service)
/var/run/docker.sock:/var/run/docker.sock — required on the opensandbox-server service only; trajectory-workbench doesn't need it. The sandbox runtime spawns sibling containers via the host docker daemon.

Service commands

services:
  opensandbox-server:
    image: trajectory-workbench:demo
    command: ["opensandbox-server", "--config", "/etc/opensandbox/sandbox.toml"]
  trajectory-workbench:
    image: trajectory-workbench:demo
    # default CMD is uvicorn backend:app — leave unset

Platform integration

VibeDataBot (separate repo, owned by Rui) reaches this service via the TRAJECTORY_WORKBENCH_URL env. Inside the same docker network the default is http://trajectory-workbench:3100. The compose file itself lives in the VibeDataBot repo, not here.

📄 License

Apache 2.0

Name		Name	Last commit message	Last commit date
Latest commit History 64 Commits
deploy		deploy
docs		docs
envscaler		envscaler
mobile_agent		mobile_agent
scripts		scripts
search2qa		search2qa
toolace		toolace
toucan		toucan
trajectory_agent		trajectory_agent
web-ui		web-ui
.dockerignore		.dockerignore
.gitignore		.gitignore
Dockerfile		Dockerfile
Dockerfile.search2qa-sandbox		Dockerfile.search2qa-sandbox
INSTALL_GUIDE.md		INSTALL_GUIDE.md
README.md		README.md
backend.py		backend.py
pipeline.py		pipeline.py
pytest.ini		pytest.ini
requirements-demo.txt		requirements-demo.txt
requirements.txt		requirements.txt
sandbox-requirements.txt		sandbox-requirements.txt
sandbox_utils.py		sandbox_utils.py

Folders and files

Latest commit

History

Repository files navigation

📦 Trajectory Workbench

✨ 核心特性

🎯 支持场景

独立合成引擎场景

通用沙箱场景

🏗️ 系统架构

数据生成流程

📁 项目结构

📋 环境要求

🚀 快速开始

第一步：克隆项目

第二步：安装基础工具

第三步：安装 Python 依赖

第四步：拉取沙箱镜像

第五步：配置 API Key

第六步：安装前端依赖

第七步：启动服务

第八步：打开浏览器

📖 使用教程

1. 定义任务

2. 观察执行过程

3. 处理审批请求

4. 导出数据集

🔧 CLI 模式（无 Web UI）

🔌 场景引擎详解

🏗️ EnvScaler — 状态化环境工具调用

🔧 ToolACE — 工具自进化 + 多轮轨迹

🔍 Search2QA — 搜索轨迹驱动的 QA 合成

⚙️ Toucan — MCP 工具调用

📱 Mobile Agent — Android GUI 操控

⚙️ 配置说明

服务端口

环境变量

支持的模型

Docker 资源配置（推荐）

❓ 常见问题

🗺️ Roadmap

🤝 技术栈

📚 参考论文

Docker Deployment

Build

Required env

Volumes

Service commands

Platform integration

📄 License

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages