blocksense-network/computer-use-agent

Computer-Use Agent

A Playwright-based “computer-use” agent that completes a 30-step browser navigation challenge and reaches /finish in under 5 minutes, running in headless Chromium.

This repo is a worked example of the Agent Harbor long-horizon task methodology: write the spec, encode success as tests, track progress explicitly, and iterate on failures using traces until the suite is green. The result is a “one-shot” agent you can run end-to-end without human intervention.

Methodology (simplified)

  1. Spec the task (inputs/outputs, constraints, non-cheating policy): docs/SPEC.md
  2. Make success verifiable with correctness + performance gates: tests/
  3. Track the plan + verification criteria in a single status file: computer-use-agent.status.org
  4. Build the smallest end-to-end loop (open → solve step → submit → advance) until all success criteria pass

The development loop is fully automated: run the tests, inspect the failure artifacts, patch, and repeat until the agent can reliably “one-shot” the full run.
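
The smallest end-to-end loop from step 4 (open → solve step → submit → advance) might be sketched as follows. This is an illustration only: `StepDriver`, `RunResult`, and `runChallenge` are hypothetical names, not taken from src/, and the real solvers may be organized differently.

```typescript
// Hypothetical driver interface; the real solvers in src/ may look different.
interface StepDriver {
  open(): Promise<void>;                                  // load the challenge start page
  solveStep(step: number): Promise<string>;               // work out the answer for one step
  submit(step: number, answer: string): Promise<boolean>; // true if the step was accepted
}

interface RunResult {
  completed: boolean;     // true only if every step was accepted
  stepsCompleted: number; // number of steps accepted before stopping
}

// Drive the challenge one step at a time: open, then solve, submit, advance.
async function runChallenge(
  driver: StepDriver,
  totalSteps: number = 30,
): Promise<RunResult> {
  await driver.open();
  let stepsCompleted = 0;
  for (let step = 1; step <= totalSteps; step++) {
    const answer = await driver.solveStep(step);
    const accepted = await driver.submit(step, answer);
    if (!accepted) {
      break; // stop at the first rejected step so the failure is easy to locate
    }
    stepsCompleted = step;
  }
  return { completed: stepsCompleted === totalSteps, stepsCompleted };
}
```

Keeping the loop behind a small interface like this would let a test suite exercise the control flow with a fake driver before a real browser is involved.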

Quickstart

Requirements: Node >=22.

npm ci
npx playwright install chromium
npm test

Run the solver:

npm run solve -- --version 3 --headless

Override the challenge URL:

BNC_BASE_URL='https://serene-frangipane-7fd25b.netlify.app' npm run solve -- --version 3 --headless
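
One way the solver could combine the `BNC_BASE_URL` override with the `--version` and `--headless` flags is sketched below. Only those three names come from this README. `resolveOptions`, `SolveOptions`, the `defaultBaseUrl` parameter, the trailing-slash handling, and the decision to reject a missing `--version` are assumptions made for illustration.

```typescript
interface SolveOptions {
  baseUrl: string;
  version: number;
  headless: boolean;
}

// Hypothetical helper: merges CLI flags and environment into one options object.
function resolveOptions(
  argv: string[],
  env: Record<string, string | undefined>,
  defaultBaseUrl: string, // assumption: the real default URL is not shown in this README
): SolveOptions {
  // Read the value that follows "--version", if the flag is present.
  const versionIndex = argv.indexOf("--version");
  const rawVersion = versionIndex >= 0 ? argv[versionIndex + 1] : undefined;
  const version = Number(rawVersion);
  if (!Number.isInteger(version)) {
    // Assumption: treat a missing or non-integer version as an error.
    throw new Error("--version requires an integer value");
  }

  // A non-empty BNC_BASE_URL takes precedence over the default.
  const override = env["BNC_BASE_URL"];
  const chosen = override && override.trim() !== "" ? override : defaultBaseUrl;

  return {
    baseUrl: chosen.replace(/\/+$/, ""), // assumption: drop trailing slashes
    version,
    headless: argv.includes("--headless"),
  };
}
```

In a Node entry point this would typically be called as `resolveOptions(process.argv.slice(2), process.env, someDefaultUrl)`.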

Verify the 5-minute budget

npx playwright test tests/perf.spec.ts
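
The contents of tests/perf.spec.ts are not shown here. As a rough sketch of what a performance gate can look like, the helper below times an async run and reports whether it stayed strictly under the budget. `timeRun`, `TimedRun`, and the injectable `now` clock are hypothetical names chosen for this example.

```typescript
const FIVE_MINUTES_MS = 5 * 60 * 1000;

interface TimedRun<T> {
  result: T;
  elapsedMs: number;
  withinBudget: boolean; // true only if elapsedMs is strictly below the budget
}

// Hypothetical helper: run an async task and compare its wall-clock time to a budget.
async function timeRun<T>(
  run: () => Promise<T>,
  budgetMs: number = FIVE_MINUTES_MS,
  now: () => number = Date.now, // injectable clock so the helper can be unit-tested
): Promise<TimedRun<T>> {
  const start = now();
  const result = await run();
  const elapsedMs = now() - start;
  return { result, elapsedMs, withinBudget: elapsedMs < budgetMs };
}
```

Inside a Playwright test, the outcome could then be asserted with something like `expect(timed.withinBudget).toBe(true)`.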

Repo map

  • docs/SPEC.md — requirements/spec
  • computer-use-agent.status.org — milestones + verification criteria
  • src/ — agent runner + per-method solvers
  • tests/ — correctness + performance tests (with trace/video/screenshot on failure)
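
Capturing a trace, video, and screenshot on failure is usually configured in `playwright.config.ts`. The fragment below is a sketch of such a configuration, not a copy of this repo's file. The `timeout` value in particular is an assumption, picked so a run with a 5-minute budget is not cut off by Playwright's default per-test timeout.

```typescript
// playwright.config.ts (illustrative sketch; the repo's actual config may differ)
import { defineConfig } from '@playwright/test';

export default defineConfig({
  testDir: 'tests',
  // Assumption: allow a little more than the 5-minute budget per test.
  timeout: 6 * 60 * 1000,
  use: {
    browserName: 'chromium',
    headless: true,
    trace: 'retain-on-failure',    // keep the trace only when a test fails
    video: 'retain-on-failure',    // keep the video only when a test fails
    screenshot: 'only-on-failure', // take a screenshot when a test fails
  },
});
```

A retained trace can be opened with `npx playwright show-trace <path-to-trace.zip>` to step through the failing run.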
