ComfyUI platform integration (Phases 2–5): model store, capability picker, installer wiring, V2 pane #878

Merged
thinmintdev merged 25 commits into main from feat/comfyui-platform
Jun 17, 2026
@thinmintdev

Promotes ComfyUI toward a first-class hal0 platform component. Built subagent-driven from docs/superpowers/plans/2026-06-16-comfyui-platform-integration.md (grilled + planned this session).

⚠️ Merge as PARTIAL — known-broken / deferred (user-approved)

What works (tested: 130 pytest + 16 Playwright e2e green, ruff clean)

  • P2 model store: pulls reconciled to /mnt/ai-models/comfyui/models; capability→model matrix (benchmark defaults: Qwen-Image 4-step, LTX-2 video, +SDXL-Lightning, +ESRGAN upscale); vendored get_*.sh plus 2 new scripts; async fetch wrapper (see #872, "ComfyUI [C1]: deferred model fetch broken for 4/5 families (positional vs --precision)"); curated SDXL/ESRGAN.
  • P3 installer: ships control scripts + extra_model_paths.yaml; comfyui Extensions-registry entry; installer services step + repair; capability picker records defaults (deferred pull).
  • P4 API: /render/cancel, /restart, /logs, /workflows/{name}/launch, and /preview added alongside the existing /status, /switchover (gated), and /pin.
  • P5 UI: V2 "Render hero" pane (comfyui-pane.jsx) replacing V1: render hero, queue, GTT/RAM gauges, workflows strip, and footer; live-wired to the API; the empty-state lockup from PR #845 ("fix(memory): empty-bank graph placeholder no longer locks up the dashboard") is avoided; reduced-motion respected.

Decisions (grill 2026-06-16)

Keep kyuz0 digest-pinned image (not build) this week · ComfyUI stays slot/arbiter-managed (not systemd unit) · "official" = deterministic provisioning · switchover implicit + gated · deferred pulls.

Security review: clean (path-traversal router-blocked, fetch input validated, no injection, no secrets).

🤖 Generated with Claude Code

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

thinmintdev and others added 17 commits June 16, 2026 16:59
_comfyui_models_dir() now returns model_store_root()/comfyui/models/<subdir>
instead of hardcoded /var/lib/hal0/comfyui/models/<subdir>. Aligns with
per-install model-store config and deployment expectations.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
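
A minimal sketch of the path resolution described in the commit above. Only the `<root>/comfyui/models/<subdir>` layout comes from the commit message; the `HAL0_MODEL_STORE` environment variable and its `/mnt/ai-models` fallback are assumptions made for illustration, not the project's actual config mechanism.

```python
import os
from pathlib import Path


def model_store_root() -> Path:
    # ASSUMPTION: the root is read from an environment variable here;
    # the real hal0 helper may read per-install config instead.
    return Path(os.environ.get("HAL0_MODEL_STORE", "/mnt/ai-models"))


def comfyui_models_dir(subdir: str) -> Path:
    # Layout taken from the commit message: <root>/comfyui/models/<subdir>
    return model_store_root() / "comfyui" / "models" / subdir
```

Resolving the root at call time, rather than caching it at import, keeps the path in step with whatever the install configured.
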
- Vendor 5 kyuz0 scripts (set_extra_paths, get_qwen_image, get_wan22,
  get_hunyuan15, get_ltx2) — adapted MODEL_DIR → /mnt/ai-models/comfyui/models
- New get_sdxl.sh: SDXL base + SDXL-Lightning 8-step LoRA + sdxl-vae-fp16-fix;
  supports --precision and --dry-run
- New get_esrgan.sh: 4x-UltraSharp + RealESRGAN_x4plus → upscale_models/;
  supports --dry-run
- TDD: tests/comfyui/test_fetch_scripts.py 24/24 pass (bash -n, exec bit,
  yaml 8 keys + base_path, dry-run subdir assertions)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…lasses

Adds src/hal0/comfyui/capabilities.py: CAPABILITIES dict (5 caps × 14 variants)
with Capability/ModelVariant dataclasses and default_variant() helper.
Tests in tests/comfyui/test_capabilities.py (8 tests, all green).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
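
For readers unfamiliar with the shape of the capability matrix, here is a rough sketch of what `Capability`, `ModelVariant`, and `default_variant()` might look like. The field names, capability ids, and precision strings are hypothetical; the real module defines 5 capabilities, and the two below are placeholders.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class ModelVariant:
    family: str                    # e.g. "qwen-image", "esrgan"
    precision: str | None = None   # None for families with no precision choice
    default: bool = False          # hypothetical flag marking the default


@dataclass(frozen=True)
class Capability:
    cap_id: str
    variants: tuple[ModelVariant, ...]


# Illustrative subset only; ids and precision values are assumptions.
CAPABILITIES = {
    "image": Capability(
        "image",
        (
            ModelVariant("qwen-image", "fp8", default=True),
            ModelVariant("sdxl", "fp16"),
        ),
    ),
    "upscale": Capability("upscale", (ModelVariant("esrgan", None, default=True),)),
}


def default_variant(cap_id: str) -> ModelVariant:
    """Return the variant flagged as default, else the first one listed."""
    cap = CAPABILITIES[cap_id]
    for variant in cap.variants:
        if variant.default:
            return variant
    return cap.variants[0]
```

Frozen dataclasses keep the matrix immutable, so it can be shared safely between the installer and the API layer.
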
Task 2.4: fetch_model(variant) -> job_id via subprocess.Popen;
get_job/cancel_job with module-level registry. esrgan (precision=None)
skips --precision arg. 11 TDD tests (mocked Popen), no real downloads.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Add two image-gen entries to the curated catalogue:
- sdxl-lightning: ByteDance SDXL + 8-step Lightning LoRA (checkpoints/)
- esrgan-4x: xinntao RealESRGAN x4plus upscaler (upscale_models/, bundle_only)

TDD: tests/registry/test_curated_comfyui.py (8 tests, all green).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
TDD: tests/install/test_comfyui_scripts_shipped.py (6/6 pass).

- Add installer/comfyui/scripts/{comfy-up,comfy-down,comfy-logs,
  comfy-postinstall,start-inference,stop-inference}.sh — verbatim from
  the working CT105 copies (digest-pinned kyuz0 image kept intentionally).
- installer/install.sh: new "ComfyUI control scripts" block installs
  scripts to /opt/comfyui/ (fixed path; comfy-up.sh self-references it),
  creates /mnt/ai-models/comfyui/{models,output,input,user,custom_nodes},
  and places extra_model_paths.yaml if absent. Dev-mode skipped.

No script-name mismatch: comfyui.py does not shell out to any
/opt/comfyui/*.sh (API comment confirms — scripts are manual-ops only).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds installer/comfyui/extra_model_paths.yaml with in-container bind path
/root/comfy-models and all standard ComfyUI model-type keys. TDD tests verify
file existence, YAML validity, base_path, and required keys.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds ComfyUI to EXTENSIONS (kind=app, default_enabled=True). install_extension
branch starts the container via /opt/comfyui/comfy-up.sh if a container runtime
is present, skipping silently otherwise. ComfyUI is NOT a systemd unit.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds ComfyUI to GET /api/install/services (probed via _container_active()
using podman/docker inspect). Repair is a special case before the systemd
allowlist: calls /opt/comfyui/comfy-up.sh directly, not systemctl. Adds
_container_active() helper.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
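
One way a probe like `_container_active()` could be written, as a sketch only. The container name `comfyui`, the runtime search order, and the injectable `which`/`run` parameters are assumptions for illustration. The `inspect --format '{{.State.Running}}'` invocation is accepted by both podman and docker.

```python
import shutil
import subprocess


def container_active(name="comfyui", runtimes=("podman", "docker"),
                     which=shutil.which, run=subprocess.run) -> bool:
    """Return True if any available runtime reports the container as running."""
    for runtime in runtimes:
        exe = which(runtime)
        if exe is None:
            continue  # runtime not installed; try the next one
        result = run(
            [exe, "inspect", "--format", "{{.State.Running}}", name],
            capture_output=True,
            text=True,
            check=False,  # nonzero exit means "no such container", not an error
        )
        if result.returncode == 0 and result.stdout.strip() == "true":
            return True
    return False
```

Injecting `which` and `run` lets a test exercise the probe without a container runtime on the machine.
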
…stall hook

- src/hal0/comfyui/selection.py: auto_selections() and variant_for()
- POST /api/comfyui/models/fetch: auto or explicit selections → 202 + job ids
- Selections.comfyui_defaults: (cap_id, family) pairs recorded at install, no pull
- build_auto_selections() populates comfyui_defaults from CAPABILITIES defaults
- tests: 15 new TDD tests (7 selection + 8 fetch route), all green

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Add POST /render/cancel, POST /restart, GET /logs, POST /workflows/{name}/launch,
and GET /preview to the ComfyUI API router. Switchover gate untouched. The telemetry
util field is gated to null when no job is running (works around the
gpu_busy_percent forced-high artifact). The it/s, eta, and step fields are absent
by design (websocket-only, skip noted).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
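
The util gating mentioned above can be pictured roughly as follows. The function name and the `has_running_job` parameter are hypothetical; only the `util` field name and the null-when-idle behaviour come from the commit message.

```python
from __future__ import annotations

from typing import Any


def gate_util(telemetry: dict[str, Any], has_running_job: bool) -> dict[str, Any]:
    """Return a copy of telemetry with util nulled when nothing is rendering."""
    gated = dict(telemetry)  # copy so the caller's payload is not mutated
    if not has_running_job:
        # An idle reading is reported as None rather than trusted,
        # because the raw gauge can read artificially high.
        gated["util"] = None
    return gated
```
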
Port Design/design_handoff_comfyui_imagegen into comfyui-pane.jsx:
render-hero (preview frame + step pips + progress + it/s + eta),
ordered queue rows (running ldot.generating + pending), iGPU GTT
radial gauge + RAM spark + 2x2 device grid, workflows strip (6 flows),
models-on-share inventory, container footer with sctrl controls.

CSS scoped under .comfy-page (blue --comfy accent). Empty-queue state
in-flow with min-height (not position:absolute overlay — PR #845 guard).
Reduced-motion rule freezes pulse + shimmer animations. Mock data via
window.__comfyuiV2MockOverride seam for e2e injection.

8/8 imagegen-v2 e2e tests pass. comfyui-arbiter-v3 (4 tests) breaks
as expected: those specs test the old switchover/pin UI removed in V2;
will be updated in Task 5.2 live wiring.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Poll /api/comfyui/status at 900ms; transform server payload into the V2
pane's RUN/QUEUE/GTT/RAM data shape via transformComfyuiStatus (with safe
placeholders for absent per-job detail). Control handlers: Cancel →
POST /render/cancel, Restart (footer) → POST /restart, Logs →
GET /logs, workflow chip → POST /workflows/{name}/launch; Open ComfyUI
link uses live endpoint field. Preview <img> src → /api/comfyui/preview.
window.__comfyuiV2MockOverride seam preserved (override wins, bypasses
poll). Rewrote comfyui-arbiter-v3.spec.ts: 8 live-wired tests (status
renders into hero/queue/telemetry, cancel/launch/restart fire correct
endpoints, idle + degrade paths). imagegen-v2.spec.ts updated for
button/a duality. 16/16 pass; build clean.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The old fetch.py called scripts as `<script> --precision <p>` but all
kyuz0 scripts except get_sdxl.sh take POSITIONAL args and require multiple
invocations (dependency chain). Auto-fetch silently no-oped for 4/5 families.

- capabilities.py: add `fetch_steps: tuple[tuple[str,...],...]` to
  ModelVariant; populate all 12 variants with exact argv sequences per
  script contract (positional for ltx2/wan22/hunyuan15/qwen, flag for sdxl,
  empty for esrgan).
- fetch.py: fetch_model now iterates fetch_steps sequentially with
  `bash <script> *step_args`; stops on first nonzero exit; drop old
  `--precision` construction. Synchronous (steps run inline).
- tests/comfyui/test_fetch.py: replace wrong-contract mock tests with
  recording _PopenRecorder asserting correct call count, argv order,
  and early-stop on failure.
- tests/comfyui/test_fetch_dryrun.py: real subprocess smoke for flag-form
  scripts (esrgan --dry-run, sdxl --precision fp16 --dry-run) verifying
  argv is real-script-valid without network.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
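
The sequential, stop-on-first-failure loop described in this commit might look roughly like the sketch below. `run_steps` and its injectable `run` parameter are hypothetical names; the `bash <script> *step_args` invocation follows the commit message.

```python
import subprocess
from typing import Callable, Iterable, Sequence


def run_steps(script: str,
              steps: Iterable[Sequence[str]],
              run: Callable[[list], int] = subprocess.call) -> int:
    """Run `bash <script> *step_args` per step; return the first nonzero exit code."""
    for step_args in steps:
        code = run(["bash", script, *step_args])
        if code != 0:
            return code  # stop early: later steps depend on earlier ones
    return 0
```

Passing `run` as a parameter lets tests record argv without downloading anything.
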
Follow-up to 5d3fd15. The synchronous step loop blocked the caller for the
whole multi-hour download, breaking the async contract of
POST /api/comfyui/models/fetch (returns 202, jobs run in background) and
serializing {auto:true} multi-family fetches.

- fetch.py: factor step loop into private worker `_run_sequence(rec, ...)`;
  fetch_model creates rec(status=running), starts it in a daemon Thread
  stored as rec["_thread"], and returns job_id IMMEDIATELY. get_job reflects
  live status; cancel_job terminates rec["_proc"] (current step). Stop-on-
  first-nonzero + cancelled handling stay in the worker.
- test_fetch.py: add _wait_done() poll helper; assert terminal status only
  after worker completes; read _PopenRecorder after completion. New
  test_non_blocking_returns_while_step_running proves fetch_model returns
  with status=running while a slow step is still in wait(). Cancel test now
  cancels a genuinely in-flight (blocked) step.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
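
A simplified sketch of the background-worker pattern this commit describes: `fetch_model` returns a job id immediately while a daemon thread walks the steps. The record keys `_thread` and `_proc` follow the commit message. The status strings `failed` and `done`, the shared lock, and passing full argv lists are assumptions for illustration.

```python
import subprocess
import threading
import uuid

_JOBS = {}                 # job_id -> job record (module-level registry)
_LOCK = threading.Lock()   # guards status and _proc transitions


def _run_sequence(rec, commands):
    for argv in commands:
        with _LOCK:
            if rec["status"] == "cancelled":
                return
            proc = subprocess.Popen(argv)
            rec["_proc"] = proc
        code = proc.wait()  # blocks only this worker thread
        with _LOCK:
            if rec["status"] == "cancelled":
                return
            if code != 0:
                rec["status"] = "failed"  # stop on first nonzero exit
                return
    with _LOCK:
        if rec["status"] == "running":
            rec["status"] = "done"


def fetch_model(commands):
    """Start the steps in the background and return a job id right away."""
    job_id = uuid.uuid4().hex
    rec = {"status": "running", "_proc": None}
    _JOBS[job_id] = rec
    thread = threading.Thread(target=_run_sequence, args=(rec, commands), daemon=True)
    rec["_thread"] = thread
    thread.start()
    return job_id


def get_job(job_id):
    return _JOBS.get(job_id)


def cancel_job(job_id):
    rec = _JOBS.get(job_id)
    if rec is None:
        return False
    with _LOCK:
        if rec["status"] != "running":
            return False
        rec["status"] = "cancelled"
        proc = rec["_proc"]
        if proc is not None and proc.poll() is None:
            proc.terminate()  # ends the step currently in flight
    return True
```

Holding the lock across both the status check and `Popen` closes the window in which a cancel could arrive just before a step starts and leave that step running.
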
thinmintdev and others added 8 commits June 16, 2026 20:32
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…875)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
thinmintdev merged commit bbbdcdd into main Jun 17, 2026
3 checks passed