A single-binary Go CLI that discovers hardware facts across your cluster via SSH, builds a ClusterSnapshot, provides deterministic reservation-aware task placement, and exposes guarded optional local control surfaces — no daemon-first architecture required.
```sh
# Install
go install github.com/toasterbook88/axis/cmd/axis@latest

# Inspect the local machine
axis facts

# Inspect the full cluster (requires ~/.axis/nodes.yaml)
axis status

# Ask where to run a task
axis task place "run ollama inference on a 7b model"
```

- `axis facts` — local hardware/tool snapshot
- `axis status` — full cluster snapshot over SSH
- `axis status --cached` — explicit daemon-backed cached snapshot read
- `axis task place` — advisory placement with fit score and failure reasoning
- `axis task place --cached` — explicit daemon-backed cached placement read
- `axis task context --cached` — explicit daemon-backed cached context block
- `axis serve` — optional local HTTP API surface
- `axis daemon refresh` — force a fresh daemon snapshot now
- `axis daemon invalidate` — explicit local daemon cache invalidation
- `axis mcp serve` — optional read-only MCP server over stdio
- `axis task run` — explicit execution surface layered on top of placement
- `/run` and `axis_execute` require explicit `mode=script` or `mode=exec` + `confirm: "YES"`
- Hardened safety blocker with allow-list, cluster-aware RAM/GPU checks, and learned-bad fast path
- Per-node reservation caps enforced before any command runs (`RAMTotalMB - 1024` headroom)
- Reservations auto-clean on every daemon refresh (45 min TTL + legacy leak detection)
- All cached paths (`task place --cached`, `task context --cached`, `mcp serve --cached`) use the single `internal/daemon/client`
- Visible in `/snapshot/meta` as `reserved_mb`
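The per-node cap described above (total RAM minus a fixed 1024 MB headroom, minus active reservations) can be sketched as follows. This is a minimal illustration; `allocatableMB` and its signature are hypothetical, not AXIS's actual API.

```go
package main

import "fmt"

// allocatableMB returns how much RAM a new task may still claim on a node:
// total RAM minus the fixed 1024 MB headroom, minus what is already reserved.
// Name and signature are illustrative only.
func allocatableMB(ramTotalMB, reservedMB int) int {
	free := ramTotalMB - 1024 - reservedMB
	if free < 0 {
		return 0 // never report negative headroom
	}
	return free
}

func main() {
	// A 16 GB node with 2 GB already reserved.
	fmt.Println(allocatableMB(16384, 2048)) // → 13312
}
```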
| Feature | Details |
|---|---|
| Local fact collection | OS, kernel, arch, CPU cores/model, RAM (total/free + pressure), disk, GPU list, network addresses |
| Tool inventory | go, python3, git, docker, ollama, node, swift, cargo, gcc |
| SSH cluster sweep | Concurrent fan-out over all configured nodes; per-node timeout |
| ClusterSnapshot | Structured JSON/YAML with per-node status (complete / partial / unreachable / error) and cluster-level aggregates |
| Advisory task placement | axis task place ranks nodes deterministically by pressure, GPU, effective headroom, allocatable RAM, reservation ratio, and locality; --cached uses the explicit daemon snapshot cache |
| Optional local control surfaces | axis serve, axis daemon invalidate, axis mcp serve, axis task run, and axis chat are available when explicitly invoked |
| Single-binary operation | No required daemon, database, or background process; local server/MCP surfaces are opt-in |
| Structured output | axis facts and axis status support JSON/YAML; axis task place supports human output and JSON |
Using go install (recommended):

```sh
go install github.com/toasterbook88/axis/cmd/axis@latest
```

Build from source:

```sh
git clone https://github.com/toasterbook88/axis.git
cd axis
go build -o axis ./cmd/axis/

# Optional: move to $PATH
mv axis /usr/local/bin/axis
```

Requirements: Go 1.26.1+, SSH key-based auth for remote nodes.
```sh
axis facts               # JSON (default)
axis facts --format yaml # YAML
```

Create ~/.axis/nodes.yaml (see nodes.example.yaml):
```yaml
nodes:
  - name: node-a
    hostname: node-a.local
    ssh_user: alice
    role: primary
  - name: node-b
    hostname: node-b.local
    ssh_user: alice
    role: worker
    # ssh_port: 22
    # timeout_sec: 10
```

Then:
```sh
axis status               # JSON cluster snapshot
axis status --format yaml
axis status --cached      # read explicit daemon cache instead of live SSH sweep
```

```sh
axis task place "analyze a git repo"
# → Selected node: node-b (remote, fit 82/100)
#    Tool: git
#    Reason:
#      - has required tool: git
#      - free RAM: 14336 MB

axis task place "run ollama inference on a 7b model" --format json
axis task place --cached "run ollama inference on a 7b model"
```

Placement uses keyword matching against the task description (no ML). It infers the required tool (ollama, git, go, docker) and minimum free RAM from specific keywords (model, 7b, inference, heavy, etc.), then scores each reachable node — tool presence is a hard requirement, and eligible nodes are ranked by pressure, GPU preference, effective headroom, allocatable RAM, reservation ratio, and stable name ordering.
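The keyword-matching idea can be sketched as below. The keyword lists and RAM thresholds here are illustrative assumptions, not AXIS's actual tables; the point is the shape of the inference, not its exact values.

```go
package main

import (
	"fmt"
	"strings"
)

// inferRequirements maps a free-text task description to a required tool and
// a minimum free-RAM estimate via simple substring matching. Keywords and
// thresholds are invented for illustration.
func inferRequirements(task string) (tool string, minFreeMB int) {
	t := strings.ToLower(task)
	switch {
	case strings.Contains(t, "ollama") || strings.Contains(t, "inference"):
		tool = "ollama"
	case strings.Contains(t, "docker"):
		tool = "docker"
	case strings.Contains(t, "git"):
		tool = "git"
	}
	minFreeMB = 1024 // illustrative baseline
	if strings.Contains(t, "7b") || strings.Contains(t, "model") || strings.Contains(t, "heavy") {
		minFreeMB = 8192 // illustrative threshold for model workloads
	}
	return tool, minFreeMB
}

func main() {
	tool, ram := inferRequirements("run ollama inference on a 7b model")
	fmt.Printf("tool=%s minFreeMB=%d\n", tool, ram) // → tool=ollama minFreeMB=8192
}
```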
With --cached, placement uses the explicit daemon snapshot cache instead of a fresh SSH sweep. JSON output includes a source wrapper so you can tell whether the decision came from daemon-cache or live fallback.
```sh
axis task context "test inference"
axis task context --cached "test inference"
```

--cached uses the explicit daemon snapshot cache and includes a Source: line in the rendered context block so you can tell where the prompt data came from.
```sh
axis serve
```

Starts the local AXIS HTTP API and execution surface on 127.0.0.1:42425 by default, plus a background snapshot refresh loop that powers the explicit cached-read path.
```sh
axis daemon invalidate
axis daemon invalidate --cache-addr 127.0.0.1:42425
```

Clears the daemon-backed snapshot cache explicitly. This does not change the default axis status live path; it only affects cached reads and other daemon-backed surfaces.
```sh
axis daemon refresh
axis daemon refresh --cache-addr 127.0.0.1:42425
```

Forces the daemon to rebuild its cached snapshot immediately. This is the fastest way to ensure axis status --cached and axis task place --cached use fresh cluster state without waiting for the next background tick.
~/.axis/nodes.yaml fields:

| Field | Required | Default | Description |
|---|---|---|---|
| name | yes | — | Logical node name |
| hostname | yes | — | Resolvable hostname or IP |
| ssh_user | yes | — | SSH username |
| role | no | — | primary or worker |
| ssh_port | no | 22 | SSH port |
| timeout_sec | no | 10 | Per-node collection timeout (seconds) |
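The optional fields above fall back to the documented defaults when omitted. A minimal sketch of that defaulting logic; the `Node` struct and `applyDefaults` helper are illustrative, not AXIS's internal types.

```go
package main

import "fmt"

// Node mirrors one entry in ~/.axis/nodes.yaml (field names per the table
// above; the struct itself is hypothetical).
type Node struct {
	Name       string
	Hostname   string
	SSHUser    string
	Role       string // "primary" or "worker", optional
	SSHPort    int    // defaults to 22
	TimeoutSec int    // defaults to 10
}

// applyDefaults fills in the documented defaults for optional fields.
func applyDefaults(n Node) Node {
	if n.SSHPort == 0 {
		n.SSHPort = 22
	}
	if n.TimeoutSec == 0 {
		n.TimeoutSec = 10
	}
	return n
}

func main() {
	n := applyDefaults(Node{Name: "node-b", Hostname: "node-b.local", SSHUser: "alice"})
	fmt.Printf("%s -> %s:%d (timeout %ds)\n", n.Name, n.Hostname, n.SSHPort, n.TimeoutSec)
}
```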
| Package | Role |
|---|---|
| cmd/axis/ | Cobra CLI entry — chat, facts, status, task, serve, context, scripts, skills |
| internal/config/ | Loads ~/.axis/nodes.yaml (node list, SSH user/port/timeout) |
| internal/facts/ | SSH into each node, collects RAM/CPU/GPU/tools |
| internal/placement/ | Filter + rank nodes by free RAM, pressure, GPU, locality; ComputeFitScore 0–100 |
| internal/chat/ | Streams via local Ollama (localhost:11434), graceful fallback message |
| internal/snapshot/ | Assembles `ClusterSnapshot` from `[]NodeFacts` |
| internal/daemon/ | Background snapshot refresh, in-memory cache, and explicit invalidation |
| internal/state/ | Persists local placement memory and execution state |
| internal/api/ | Local HTTP API and execution surface |
| internal/mcp/ | Read-only MCP server over stdio |
| internal/transport/ | Raw SSH execution layer |
| internal/discovery/ | Node discovery |
| internal/models/ | Shared types: NodeFacts, TaskRequirements, Locality |
- Config lives at `~/.axis/nodes.yaml` — no cluster IPs hardcoded in code
- Placement is deterministic: RAM pressure → GPU → effective headroom → allocatable RAM → reservation ratio → name
- ComputeFitScore factors in GPU (+25 pts) and a local-node bonus (+10 pts) — M1↔M3 RAM sharing would be relevant here
- Chat is hardcoded to localhost:11434 Ollama — no remote inference routing yet
- `axis serve` hosts an optional daemon-backed cache; `axis status --cached`, `axis task place --cached`, `axis task context --cached`, `axis daemon refresh`, and `axis daemon invalidate` use it explicitly
- `axis serve` and `axis mcp serve` are optional local surfaces, not required infrastructure
- Placement memory lives locally in `~/.axis/state.json`
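The documented bonuses (GPU +25, local node +10, score clamped to 0–100) can be sketched as below. The base term derived from free-RAM ratio and its weight are assumptions; only the two bonuses and the 0–100 range come from the notes above.

```go
package main

import "fmt"

// fitScore sketches the shape of ComputeFitScore: a 0-100 score with the
// documented GPU (+25) and local-node (+10) bonuses. The freeRatio base
// term and its weight of 65 are illustrative, not AXIS's actual formula.
func fitScore(freeRatio float64, hasGPU, isLocal bool) int {
	score := int(freeRatio * 65) // illustrative base from RAM headroom
	if hasGPU {
		score += 25 // documented GPU bonus
	}
	if isLocal {
		score += 10 // documented local-node bonus
	}
	if score > 100 {
		score = 100
	}
	if score < 0 {
		score = 0
	}
	return score
}

func main() {
	fmt.Println(fitScore(0.8, true, false)) // a remote GPU node → 77
}
```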
Current phase: The original Phase 1 observability core is complete, and main now ships Phase 2-style reservation-aware placement plus explicit cache, MCP, chat, and execution surfaces. The project is still not daemon-first.
See Phase 1 Spec and White Paper for detailed design notes.
The following are planned directions, not current functionality:
- Background coordinator (`axisd`)
- Mesh networking / peer discovery beyond a static seed file
- Phase 2+ features — see white paper
See CONTRIBUTING.md. Keep PRs small and focused; open an issue before adding Phase 2+ features.
MIT — see LICENSE.