feat: OnlyAgent Hands — WebHID remote keystroke execution #3

Open
0c-coder wants to merge 1 commit into main from feat/onlyagent-hands

Conversation

@0c-coder
Owner

Summary

  • Gateway Hands module (Rust): macro→keystroke compiler, CBOR packet framing, SQLx DB layer, in-memory session manager, 16 Axum API endpoints
  • WebHID browser bridge (TypeScript): OnlyKeyHands driver class, relay loop polling gateway and delivering HID reports to OnlyKey device
  • Web UI: job list dashboard with polling, live execution view with step tracking and emergency stop
  • Screenshot capture agent (Rust): cross-platform daemon for AI visual reasoning feedback loop
  • Prisma schema: 5 new models (HandsJob, HandsStep, HandsSession, HandsScreenshot, HandsAuditEvent)

Architecture

The system uses OnlyKey hardware tokens as remote keystroke executors via the WebHID API:

  1. AI Reasoning Loop plans high-level macros (open_browser, navigate_url, type_text, etc.)
  2. Gateway compiler translates macros into OS-specific HID keystroke sequences
  3. CBOR packet framer encodes keystrokes into 64-byte HID reports (5B header + 59B payload)
  4. Browser bridge polls the gateway and delivers reports to OnlyKey via WebHID
  5. OnlyKey device types the keystrokes on the host machine as a physical keyboard
  6. Screenshot agent captures the result for visual verification, closing the loop
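
Steps 1 and 2 above can be sketched roughly as follows. This is a minimal illustration, not the code in apps/gateway/src/hands/compile: the `TargetOs`, `Macro`, and `Keystroke` types and the `compile` function are hypothetical names, and only `navigate_url` and `type_text` are covered. The modifier bits and key codes are the standard USB HID keyboard values; the choice of Cmd+L on macOS and Ctrl+L elsewhere to focus the address bar is an assumption about how the real compiler behaves.

```rust
/// Target operating system for compilation (hypothetical type).
#[allow(dead_code)]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum TargetOs {
    MacOs,
    Linux,
    Windows,
}

/// Two of the macros named in the PR; the payload shapes are assumptions.
#[derive(Clone, Debug)]
enum Macro {
    NavigateUrl(String),
    TypeText(String),
}

/// One compiled step (hypothetical type).
#[derive(Clone, Debug, PartialEq, Eq)]
enum Keystroke {
    /// Modifier bitmask plus a HID keyboard usage ID, pressed together.
    Chord { modifiers: u8, usage: u8 },
    /// Literal text to be typed character by character.
    Text(String),
}

// Standard USB HID keyboard modifier bits and usage IDs.
const MOD_LCTRL: u8 = 0x01;
const MOD_LGUI: u8 = 0x08; // Cmd on macOS, Win key on Windows
const KEY_L: u8 = 0x0F;
const KEY_ENTER: u8 = 0x28;

/// Translate one high-level macro into an OS-specific keystroke sequence.
fn compile(mac: &Macro, os: TargetOs) -> Vec<Keystroke> {
    match mac {
        Macro::TypeText(text) => vec![Keystroke::Text(text.clone())],
        Macro::NavigateUrl(url) => {
            // Assumed shortcut: focus the address bar, type the URL, press Enter.
            let modifiers = if os == TargetOs::MacOs { MOD_LGUI } else { MOD_LCTRL };
            vec![
                Keystroke::Chord { modifiers, usage: KEY_L },
                Keystroke::Text(url.clone()),
                Keystroke::Chord { modifiers: 0, usage: KEY_ENTER },
            ]
        }
    }
}

fn main() {
    let mac = Macro::NavigateUrl("https://example.com".to_string());
    for os in [TargetOs::MacOs, TargetOs::Linux] {
        println!("{:?}: {:?}", os, compile(&mac, os));
    }
}
```

Keeping the OS as an explicit parameter means the same planned macro can be compiled for whichever host the OnlyKey is plugged into.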

New Files

  • apps/gateway/src/hands/ — 7 Rust modules (models, compile, packet, db, session, api, mod)
  • apps/web/src/lib/hands/ — 5 TypeScript modules (types, webhid, api, bridge, index)
  • apps/web/src/app/(dashboard)/hands/ — 3 React components (page, job-list, live-view)
  • apps/screencap/ — Rust binary for cross-platform screenshot capture

Test plan

  • cargo check -p onecli-gateway compiles cleanly
  • cargo test -p onecli-gateway — compile, packet, and session unit tests pass
  • npx prisma generate succeeds with new schema
  • npx prisma migrate dev creates Hands tables
  • Web app builds (pnpm build) with new pages and components
  • WebHID connection flow works in Chrome with OnlyKey device
  • Job create → start → session establish → packet delivery end-to-end

🤖 Generated with Claude Code

Introduces OnlyAgent Hands, a system that uses OnlyKey hardware tokens
as remote keystroke executors via the WebHID API. An AI reasoning loop
plans actions, the gateway compiles them to OS-specific keystrokes,
and a browser bridge delivers HID reports to the OnlyKey device which
types them on the host machine. A screenshot capture agent closes
the feedback loop for visual verification.

Gateway (Rust):
- hands/ module: models, macro→keystroke compiler, CBOR packet framing,
  SQLx database layer, in-memory session manager, 16 Axum API endpoints
- Compiler translates high-level macros (open_browser, navigate_url,
  type_text, etc.) into OS-specific HID keystroke sequences
- 64-byte HID report framing with 5-byte header + 59-byte CBOR payload
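
The 64-byte framing above might look roughly like the sketch below. Only the sizes (64-byte report, 5-byte header, 59-byte payload) come from this PR; the header field layout `[msg_type, seq, total, chunk_len, reserved]` and the `frame` and `reassemble` functions are assumptions made for illustration. The payload is treated as already CBOR-encoded bytes, so no CBOR library is needed here.

```rust
const REPORT_LEN: usize = 64;
const HEADER_LEN: usize = 5;
const PAYLOAD_LEN: usize = REPORT_LEN - HEADER_LEN; // 59 bytes per report

/// Split an already CBOR-encoded message into fixed-size HID reports.
/// Assumed header layout: [msg_type, seq, total, chunk_len, reserved].
/// Returns None if the message needs more than 255 reports.
fn frame(msg_type: u8, cbor: &[u8]) -> Option<Vec<[u8; REPORT_LEN]>> {
    // An empty message still produces one report so the receiver sees it.
    let total = ((cbor.len() + PAYLOAD_LEN - 1) / PAYLOAD_LEN).max(1);
    let total_u8 = u8::try_from(total).ok()?;
    let mut reports = Vec::with_capacity(total);
    for seq in 0..total {
        let start = seq * PAYLOAD_LEN;
        let end = (start + PAYLOAD_LEN).min(cbor.len());
        let chunk = &cbor[start..end];
        let mut report = [0u8; REPORT_LEN]; // unused payload bytes stay zero
        report[0] = msg_type;
        report[1] = seq as u8; // safe: seq < total <= 255
        report[2] = total_u8;
        report[3] = chunk.len() as u8; // safe: at most 59
        // report[4] is reserved and left as zero.
        report[HEADER_LEN..HEADER_LEN + chunk.len()].copy_from_slice(chunk);
        reports.push(report);
    }
    Some(reports)
}

/// Rebuild the message, checking order, report count, and chunk lengths.
fn reassemble(reports: &[[u8; REPORT_LEN]]) -> Option<Vec<u8>> {
    let mut out = Vec::new();
    for (i, r) in reports.iter().enumerate() {
        let len = r[3] as usize;
        if r[1] as usize != i || r[2] as usize != reports.len() || len > PAYLOAD_LEN {
            return None;
        }
        out.extend_from_slice(&r[HEADER_LEN..HEADER_LEN + len]);
    }
    Some(out)
}

fn main() {
    // 150 stand-in bytes; a real caller would pass CBOR-encoded keystrokes.
    let cbor: Vec<u8> = (0..150u8).collect();
    let reports = frame(0x01, &cbor).expect("message fits in 255 reports");
    println!("{} reports of {} bytes", reports.len(), REPORT_LEN);
    assert_eq!(reassemble(&reports), Some(cbor));
}
```

Carrying an explicit chunk length in the header lets the receiver drop the zero padding in the last report without having to parse the CBOR first.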

Web (Next.js):
- WebHID driver (OnlyKeyHands class) for direct USB communication
- Bridge relay loop: gateway → WebHID → OnlyKey → status reports
- Job list dashboard with polling and status indicators
- Live execution view with step tracking and emergency stop

Screenshot agent (Rust):
- Cross-platform capture daemon (macOS/Linux/Windows)
- Periodic capture + upload to gateway for AI visual reasoning

Schema:
- 5 new Prisma models: HandsJob, HandsStep, HandsSession,
  HandsScreenshot, HandsAuditEvent

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>