mac-llm-lab

Truly useful local AI on Apple Silicon. A worked reference rig across 16 GB, 32 GB, and 64 GB — one architecture, three SoC budgets, so anyone with the Mac they already own can run real models locally.

The wager: a single LLM call is brain-like but primitive; what feels useful (ChatGPT, Claude Code) is a system of models, retrieval, tools, and routing. This project builds that system locally with small open models and aims to show they can be genuinely useful for everyday work. Read MANIFESTO.md for the why.

Stack. Ollama on the host (native Apple Silicon, unified memory) + Open WebUI in Docker (LAN browser UI) + OpenCode in Docker (agentic coding against a launchd-resident llama-server, driven by the oc wrapper CLI), wired to a five-profile OWUI lineup. Branded mac-llm-lab here; one rename away from any other handle (see Fork checklist).

Architecture: spec.md. Model selection: profiles.md.

Migration note. The coding stack was rebuilt on OpenCode on 2026-06-10, replacing the previous claw-code + LiteLLM-bridge + grammar stack at every memory tier. Rationale and evidence: host/test/docs/OPENCODE-MIGRATION-DECISION.md. The last commit with the old stack intact is tagged claw-stack-final — check out that tag to reproduce the claw baseline.

Quickstart — install with the wizard

The fastest path to a working code stack (OpenCode + llama-server + the oc wrapper) is the bundled installer. It's pure Bash, curl-only, no Homebrew required, and strictly idempotent — re-runs are safe on a live system.

git clone https://github.com/<you>/mac-llm-lab.git
cd mac-llm-lab
./wizard/wizard install

The wizard will:

detect your Mac's RAM and pick a memory tier (16 / 32 / 64 GB) — override with ←/→ arrow keys on the slider
ask for a topology — full-local (host + client both on this Mac) or client-only (this Mac talks to a host elsewhere on the LAN)
install Xcode CLT, cmake, llama.cpp, OrbStack, and Ollama; fetch the tier GGUF; install the launchd-resident OpenCode llama-server; build the opencode:local client image; and install the global agent prompt (~/.config/opencode/AGENTS.md) and the oc wrapper (~/.local/bin/oc)
finish with an end-to-end smoke: the prompt-injection wire-capture probe plus a real oc run artifact
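
As a rough illustration of the tier-detection step above, the sketch below maps installed RAM to one of the three tiers. The function names `pick_tier` and `detect_ram_gb` are hypothetical, and the round-down rule (for example, 48 GB maps to the 32 GB tier) is an assumption; the wizard's actual logic may differ.

```shell
#!/usr/bin/env bash
# Hypothetical helper: map installed RAM (whole GiB) to a supported tier.
# Assumption: round down to the largest tier that fits; anything below 32 -> 16.
pick_tier() {
  local ram_gb="$1"
  if   [ "$ram_gb" -ge 64 ]; then echo 64
  elif [ "$ram_gb" -ge 32 ]; then echo 32
  else echo 16
  fi
}

# hw.memsize reports physical memory in bytes on macOS.
detect_ram_gb() {
  local bytes
  bytes="$(sysctl -n hw.memsize)"
  echo $(( bytes / 1024 / 1024 / 1024 ))
}

# Example (macOS only):
#   pick_tier "$(detect_ram_gb)"
```

Rounding down keeps the model inside unified memory on in-between configurations; the arrow-key slider remains the way to override the guess.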

After install:

oc                         # OpenCode TUI on the current directory
oc run "fix the tests"     # headless one-shot
oc probe                   # assert the global prompt reaches the agent
./wizard/wizard doctor     # read-only state inspection
./wizard/wizard --help

See wizard/README.md for tier model choices, idempotency guarantees, and trust boundaries (one upstream curl | sh for OrbStack, opt-out instructions included).
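
To make the idempotency claim concrete, here is a minimal sketch of one common pattern: compare before writing, so a re-run that finds nothing to change touches nothing. `install_file_if_changed` is a hypothetical helper written for illustration, not a function taken from the wizard.

```shell
#!/usr/bin/env bash
# Hypothetical helper: copy src to dest only when the content differs.
# Re-running it on an already-installed file is a no-op.
install_file_if_changed() {
  local src="$1" dest="$2"
  if [ -f "$dest" ] && cmp -s "$src" "$dest"; then
    echo "unchanged: $dest"
    return 0
  fi
  mkdir -p "$(dirname "$dest")"
  cp "$src" "$dest"
  echo "installed: $dest"
}

# Example (paths are placeholders):
#   install_file_if_changed ./AGENTS.md "$HOME/.config/opencode/AGENTS.md"
```

The same check-then-act shape applies to other steps (skip a download when the file already exists with the expected size, skip a launchd load when the service is already running).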

Profiles (Open WebUI lineup)

The wizard installs the code stack only. The five-profile OWUI chat lineup is the broader host/ setup — see Manual / OWUI setup below.

Profile	Use it for	Backing model
`general`	daily driver — chat, code, vision	Qwen3.6-27B Q8_0
`fast`	snappy triage, no `<think>`	Qwen3.6-35B-A3B MoE Q4
`reasoning`	hard thinking, planning	Nemotron Super 49B v1.5 Q6
`digest`	long-context extract	Qwen3-30B-A3B-Instruct-2507 Q4
`analyze`	long-context reasoning	Qwen3-30B-A3B-Thinking-2507 Q6

One profile resident at a time, swapped on demand. Full rationale in profiles.md. Agentic coding runs on a separate, dedicated llama-server (host/llama-server/) driven by OpenCode — that's what the wizard wires up.
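
One way to swap the resident profile by hand is through Ollama's HTTP API: a request to `/api/generate` with `keep_alive` set to 0 unloads a model, and a request with no prompt loads one. The sketch below assumes Ollama listens on its default port (11434) and that the Ollama aliases are named after the profiles (`general`, `fast`, and so on); `swap_profile` and the payload helpers are hypothetical names, and the project's own orchestration may work differently.

```shell
#!/usr/bin/env bash
# Assumption: Ollama on its default port; override with OLLAMA_URL.
OLLAMA_URL="${OLLAMA_URL:-http://localhost:11434}"

# JSON bodies for /api/generate.
# keep_alive=0 asks Ollama to unload the model right away;
# a body with only "model" (no prompt) loads it into memory.
unload_payload() { printf '{"model": "%s", "keep_alive": 0}' "$1"; }
load_payload()   { printf '{"model": "%s"}' "$1"; }

# Hypothetical helper: unload one profile, then load another.
swap_profile() {
  local from="$1" to="$2"
  curl -fsS "$OLLAMA_URL/api/generate" -d "$(unload_payload "$from")" >/dev/null
  curl -fsS "$OLLAMA_URL/api/generate" -d "$(load_payload "$to")" >/dev/null
}

# Example:
#   swap_profile general reasoning
#   ollama ps    # lists what is currently resident
```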

Manual / OWUI setup

If you want the full chat lineup (Open WebUI, the five profiles, the host orchestration CLI) or prefer to install piece-by-piece, each directory has its own README:

1. host/ollama/: install Ollama, stage GGUFs
2. host/ollama/Modelfiles/: ollama create the aliases
3. host/: Open WebUI Docker stack, groups, per-model config
4. host/llama-server/: the dedicated coding llama-server (launchd-resident, tier-parameterized)
5. host/scripts/: install mac-llm-lab-hostctl for orchestration
6. client/: install the mac-llm-lab CLI on your laptop
7. client/opencode/: containerised OpenCode + the oc wrapper

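The wizard automates steps 4 and 7 of this list, host/llama-server/ and client/opencode/ (serving, client image, global prompt, oc). The OWUI chat profiles in steps 1–3 (host/ollama/, host/ollama/Modelfiles/, host/) remain manual today.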

Fork checklist

# 1. Brand: replace `mac-llm-lab` everywhere (LAN hostname, script names, plist Label)
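#    .git is excluded so sed never rewrites repository internals; --null / -0 keeps paths with spaces intact.
grep -rlE --null --exclude-dir=.git 'mac-llm-lab|LLM Lab' . | xargs -0 sed -i '' 's/mac-llm-lab/your-brand/g; s/LLM Lab/Your-Brand/g'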

# 2. Rig username: Modelfile FROM paths point to /Users/nigel/.ollama/gguf/
sed -i '' "s|/Users/nigel/|/Users/$USER/|g" host/ollama/Modelfiles/*.Modelfile

# 3. Repo path: mac-llm-lab-hostctl defaults to ~/Desktop/bench/mac-llm-lab.
#    Either clone there, or set `HOST_REPO=/your/path` in your shell profile.

After step 1, also rename host/ollama/launchd/com.mac-llm-lab.ollama-env.plist to match.
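
For step 3, the override-or-default behaviour can be pictured with ordinary shell parameter expansion. This is a sketch under the assumption that mac-llm-lab-hostctl reads `HOST_REPO` from the environment as the checklist describes; `resolve_host_repo` is a hypothetical function, not part of the script.

```shell
#!/usr/bin/env bash
# Hypothetical helper: use HOST_REPO when set, else the documented default path.
resolve_host_repo() {
  echo "${HOST_REPO:-$HOME/Desktop/bench/mac-llm-lab}"
}

# In your shell profile (the path is a placeholder):
#   export HOST_REPO="$HOME/src/your-brand"
```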

Browser

Use Chrome or Firefox for long Open WebUI sessions. Safari's WebContent process retains 10+ GB of memory even after thinking-mode chats are closed.

License

MIT — see LICENSE.
