Ideogram 4 MLX WebUI

Local Ideogram 4 image generation for Apple Silicon using MLX-native weights. The default model is MLXBits/ideogram-4-mlx-q8, an int8 MLX conversion of Ideogram 4 intended for mflux.

Architecture

WebUI (:5173) -> FastAPI (:8000) -> Model daemon (:8001) -> mflux/MLX

server/model_daemon.py owns the single local MLX model instance and handles load/unload, generation jobs, cancellation, short-lived artifacts, and local LoRA reloads.
server/mlx_runtime.py resolves the Hugging Face or local MLX model path, loads the mflux Ideogram 4 runtime, tracks MLX memory, and runs image generation.
server/main.py is the WebUI gateway and persistence layer. It stores prompts, generated images, favorites, the last form state, and prompt-image history links; runs Magic Prompt; and proxies generation work to the daemon.
webui/ is the React/Vite/TanStack Router interface with editor, gallery, history, favorites, and a client-side generation queue.
ideogram4_mlx.py is the CLI. It uses the daemon by default and can run direct local MLX generation with --daemon off.

The FastAPI/WebUI generation contract stays stable: submit a structured caption, width, height, preset, seed, and output format.
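As a rough illustration of that contract, the sketch below assembles a request body in Python. The field names (`caption`, `width`, `height`, `preset`, `seed`, `output_format`) and the shape of the caption object are assumptions made for illustration, not the server's confirmed schema. The size bounds mirror the documented IDEOGRAM4_MIN_IMAGE_SIZE and IDEOGRAM4_MAX_IMAGE_SIZE defaults.

```python
import json

MIN_SIZE = 256   # documented default of IDEOGRAM4_MIN_IMAGE_SIZE
MAX_SIZE = 2048  # documented default of IDEOGRAM4_MAX_IMAGE_SIZE


def build_generate_request(caption, width, height, preset="V4_QUALITY_48",
                           seed=0, output_format="png"):
    """Assemble a generation request body (field names are hypothetical)."""
    for name, value in (("width", width), ("height", height)):
        if not MIN_SIZE <= value <= MAX_SIZE:
            raise ValueError(f"{name}={value} is outside {MIN_SIZE}..{MAX_SIZE}")
    return {
        "caption": caption,          # structured caption object
        "width": width,
        "height": height,
        "preset": preset,
        "seed": seed,
        "output_format": output_format,
    }


if __name__ == "__main__":
    # The caption content here is a placeholder, not the real caption schema.
    body = build_generate_request({"summary": "a red bicycle"}, 1024, 1024, seed=20260608)
    print(json.dumps(body, indent=2))
```

Checking dimensions on the client before submitting is optional; the server remains the authority on what it accepts.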

Legacy PyTorch/MPS Runtime

The previous PyTorch/MPS implementation is preserved on the legacy/pytorch-mps branch. Use that branch if you need the old direct torch/MPS runtime, FP8 dequant loading path, MPS scheduler patching, MPS warmup behavior, or the old ideogram4_mps.py CLI.

The main branch is now optimized for the MLX/mflux q8 runtime and does not aim to stay backward compatible with the old PyTorch/MPS architecture.

Install

python3 -m venv .venv
.venv/bin/python -m pip install -r server/requirements.txt
(cd webui && pnpm install)  # subshell, so the next command still runs from the repo root
./run.sh doctor

server/requirements.txt pins mflux to PR #445 commit 8d80b9cb53688b62a2f814604b9f8b48987c5acd. As of 2026-06-22, stable mflux 0.18.0 includes Ideogram 4 FP8 support, but the mlx-forge checkpoint loader needed for MLXBits/ideogram-4-mlx-q8 is still pending in PR #445. Keep the pin until a stable mflux release can load a repo containing split_model.json.

The rollback branch for the previous PyTorch/MPS implementation is legacy/pytorch-mps.

Model Access

Default:

IDEOGRAM4_MODEL_REPO=MLXBits/ideogram-4-mlx-q8

For an already downloaded model directory:

IDEOGRAM4_MODEL_PATH=/path/to/ideogram-4-mlx-q8

The model root must contain split_model.json. If IDEOGRAM4_MODEL_PATH is not set, the daemon downloads/verifies the Hugging Face repo with huggingface_hub.snapshot_download.
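The resolution order described above could look roughly like the following. This is a minimal sketch, not the code in server/mlx_runtime.py: the function name `resolve_model_root` is hypothetical, and only the environment variable names, the `split_model.json` requirement, and the use of `huggingface_hub.snapshot_download` come from this README.

```python
import os
from pathlib import Path


def resolve_model_root(env=None):
    """Return a local model root that contains split_model.json."""
    env = os.environ if env is None else env
    local = env.get("IDEOGRAM4_MODEL_PATH", "").strip()
    if local:
        # An explicit local path wins and skips any download.
        root = Path(local).expanduser()
    else:
        # Imported lazily so the local-path branch works without huggingface_hub.
        from huggingface_hub import snapshot_download

        root = Path(snapshot_download(
            repo_id=env.get("IDEOGRAM4_MODEL_REPO", "MLXBits/ideogram-4-mlx-q8"),
            revision=env.get("IDEOGRAM4_MODEL_REVISION") or None,
        ))
    if not (root / "split_model.json").is_file():
        raise FileNotFoundError(f"{root} does not contain split_model.json")
    return root
```

Passing a plain dict as `env` makes the local-path branch easy to exercise without touching the real environment or the network.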

The MLXBits conversion is gated on Hugging Face and distributed under the original Ideogram 4 Non-Commercial Model Agreement. Accept the model gate and authenticate with hf auth login or HF_TOKEN before first download. Check the model card before using it outside personal or research workflows.

Run

./run.sh          # model daemon + FastAPI + WebUI
./run.sh backend  # restart FastAPI only
./run.sh client   # restart Vite only
./run.sh doctor   # check dependencies, model files, ports, and memory policy

Manual debugging:

set -a && source .env && set +a
.venv/bin/python server/model_daemon.py
.venv/bin/python server/main.py
cd webui && pnpm dev

CLI:

python3 ideogram4_mlx.py --prompt-file examples/caption.json --out examples/result.png
python3 ideogram4_mlx.py --daemon off --prompt-file examples/caption.json --out examples/result.png

API

| Method | Path | Description |
| --- | --- | --- |
| `GET` | `/api/model/status` | Daemon state, backend, model repo/path, quantization, MLX memory |
| `POST` | `/api/model/load` | Load the MLX model |
| `POST` | `/api/model/unload` | Unload the MLX model |
| `GET` | `/api/magic-prompt/status` | Magic Prompt provider/configuration health |
| `POST` | `/api/magic-prompt` | Expand text and optional reference images into a structured caption |
| `POST` | `/api/verify` | Validate a structured caption through mflux's verifier when available |
| `POST` | `/api/generate` | Start one local generation job |
| `GET` | `/api/status/{task_id}` | Poll generation progress |
| `POST` | `/api/cancel/{task_id}` | Request cancellation |
| `GET` | `/api/lora/status` | Local LoRA files and active stack |
| `GET` | `/api/lora/presets` | Built-in downloadable LoRA presets and installed state |
| `POST` | `/api/lora/download` | Start a preset LoRA download task |
| `GET` | `/api/lora/download/{task_id}` | Poll LoRA download progress |
| `POST` | `/api/lora/apply` | Reload model with a local LoRA stack |
| `POST` | `/api/lora/remove` | Reload model without LoRA |
| `GET` | `/api/lora/operation/{task_id}` | Poll LoRA apply/remove progress |
| `GET` | `/api/images` | List generated images, optionally filtered by prompt/history link state |
| `GET` | `/api/images/stats` | Count total, linked, orphan, and dangling image records |
| `DELETE` | `/api/images/orphans` | Delete generated image files with no prompt history link |
| `DELETE` | `/api/images/{image_id}` | Delete one generated image record and file |
| `PATCH` | `/api/images/{image_id}` | Link an image to an existing prompt history row |
| `POST` | `/api/images/{image_id}/attach-history` | Create or attach prompt history for an image |
| `GET` | `/api/images/{image_id}/file` | Serve one generated image file |
| `GET` | `/api/prompts` | List saved prompt history rows |
| `GET` | `/api/prompts/{prompt_id}` | Fetch one prompt history row |
| `POST` | `/api/prompts` | Save a prompt history row |
| `DELETE` | `/api/prompts/{prompt_id}` | Delete a prompt history row |
| `GET` | `/api/favorites` | List favorited images/prompts |
| `GET` | `/api/favorites/{favorite_id}` | Fetch one favorite |
| `POST` | `/api/favorites` | Favorite an image or prompt |
| `DELETE` | `/api/favorites/images/{image_id}` | Remove favorite by image |
| `DELETE` | `/api/favorites/prompts/{prompt_id}` | Remove favorite by prompt |
| `GET` | `/api/form` | Load the last saved editor form |
| `POST` | `/api/form` | Save the last editor form |

The daemon runs one generation job at a time. The WebUI can queue, reorder, cancel, and retry multiple client-side jobs, but it submits only one job to the daemon at a time. Concurrent direct calls to /api/generate return HTTP 409. LoRA download, apply, and remove operations also take model operation locks because mflux applies LoRA at model load time.
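A script that calls /api/generate directly has to cope with that 409. One possible approach, sketched below, is to retry only while the slot is busy. The function name `submit_when_free`, the retry policy, and the assumption that the endpoint returns a JSON body are all illustrative; the URL uses the FastAPI port shown in the architecture line.

```python
import json
import time
import urllib.error
import urllib.request


def submit_when_free(body, url="http://127.0.0.1:8000/api/generate",
                     attempts=5, delay=2.0, opener=urllib.request.urlopen):
    """POST a generation request, retrying while the single slot is busy (HTTP 409)."""
    data = json.dumps(body).encode("utf-8")
    for attempt in range(attempts):
        request = urllib.request.Request(
            url, data=data,
            headers={"Content-Type": "application/json"}, method="POST")
        try:
            with opener(request) as response:
                # Assumes the server answers with a JSON document.
                return json.loads(response.read().decode("utf-8"))
        except urllib.error.HTTPError as error:
            if error.code != 409:
                raise  # only "slot busy" is retried; other errors propagate
            if attempt + 1 < attempts:
                time.sleep(delay)
    raise TimeoutError("generation slot stayed busy")
```

The `opener` parameter exists only so the retry logic can be exercised with a stub instead of a running server.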

All MLX/mflux runtime calls are routed through a single worker thread inside the model daemon. This avoids MLX thread-local stream failures when generation runs on a LoRA-loaded model after a reload. Do not call runtime.load, runtime.apply_loras, runtime.remove_loras, runtime.generate, or runtime.unload directly from request/task threads.
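One common way to get this single-thread guarantee in Python is a one-worker executor. The sketch below shows the pattern only; the class name `RuntimeWorker` is hypothetical and this is not necessarily how server/model_daemon.py implements it.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class RuntimeWorker:
    """Run every runtime call on one dedicated thread."""

    def __init__(self):
        # max_workers=1 gives a single long-lived thread and FIFO ordering.
        self._executor = ThreadPoolExecutor(
            max_workers=1, thread_name_prefix="mlx-runtime")

    def call(self, fn, *args, **kwargs):
        """Block the calling thread until fn has run on the worker thread."""
        return self._executor.submit(fn, *args, **kwargs).result()

    def shutdown(self):
        self._executor.shutdown(wait=True)
```

A request handler would then use something like `worker.call(runtime.generate, ...)` instead of calling the runtime method itself, so thread-local state is always created and used on the same thread.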

Configuration

See .env.example for all settings. Common values:

| Variable | Default | Description |
| --- | --- | --- |
| `IDEOGRAM4_MODEL_REPO` | `MLXBits/ideogram-4-mlx-q8` | Hugging Face MLX model repo |
| `IDEOGRAM4_MODEL_REVISION` | empty | Optional repo revision |
| `IDEOGRAM4_MODEL_PATH` | empty | Optional local model root containing `split_model.json` |
| `IDEOGRAM4_MLX_CACHE_LIMIT_GB` | empty | Optional MLX cache limit |
| `IDEOGRAM4_MODEL_DAEMON_AUTOLOAD` | `0` | Load model when daemon starts |
| `IDEOGRAM4_DEFAULT_PRESET` | `V4_QUALITY_48` | Default sampler preset |
| `IDEOGRAM4_MIN_IMAGE_SIZE` | `256` | Minimum API dimension |
| `IDEOGRAM4_MAX_IMAGE_SIZE` | `2048` | Maximum API dimension |
| `IDEOGRAM4_LORA_DIR` | `models/loras` | Local mflux-compatible LoRA files |

Autoload is off by default so the local Magic Prompt LLM and the image model do not immediately compete for unified memory. Use the WebUI Load button or POST /api/model/load when image generation is needed. Set IDEOGRAM4_MLX_CACHE_LIMIT_GB when the machine needs a stricter reusable MLX cache budget.
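To make the defaults concrete, here is a small sketch of how these variables might be parsed. The `DaemonSettings` class and `load_settings` function are hypothetical, and treating only the string `1` as "autoload on" is an assumption; the variable names and default values are the ones listed in the table.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class DaemonSettings:
    model_repo: str
    model_path: Optional[str]
    cache_limit_gb: Optional[float]
    autoload: bool
    default_preset: str
    min_image_size: int
    max_image_size: int
    lora_dir: str


def load_settings(env: Optional[Mapping[str, str]] = None) -> DaemonSettings:
    """Read settings from env, falling back to the documented defaults."""
    env = os.environ if env is None else env
    cache = env.get("IDEOGRAM4_MLX_CACHE_LIMIT_GB", "").strip()
    return DaemonSettings(
        model_repo=env.get("IDEOGRAM4_MODEL_REPO", "MLXBits/ideogram-4-mlx-q8"),
        model_path=env.get("IDEOGRAM4_MODEL_PATH") or None,
        # An empty value means "no limit set"; "0" is kept as a real limit of 0.0.
        cache_limit_gb=float(cache) if cache else None,
        # Assumption: only the literal "1" enables autoload.
        autoload=env.get("IDEOGRAM4_MODEL_DAEMON_AUTOLOAD", "0") == "1",
        default_preset=env.get("IDEOGRAM4_DEFAULT_PRESET", "V4_QUALITY_48"),
        min_image_size=int(env.get("IDEOGRAM4_MIN_IMAGE_SIZE", "256")),
        max_image_size=int(env.get("IDEOGRAM4_MAX_IMAGE_SIZE", "2048")),
        lora_dir=env.get("IDEOGRAM4_LORA_DIR", "models/loras"),
    )
```

Keeping "unset" and "0" distinct matters for the cache limit, since an unset limit and a limit of zero are different policies.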

Benchmarks

Use docs/benchmarks.md for the canonical prompt, seed, presets, and metrics. Current local measurements:

| Case | PyTorch/MPS legacy | MLX q8 | Difference |
| --- | --- | --- | --- |
| Model load, local files ready | about 285s | 2.5-4.6s runtime load, 5.3s API-observed smoke | MLX loads about 54-114x faster |
| 1024x1024 `V4_QUALITY_48`, seed `20260608` | 408.0s | 375.1s | MLX saves 32.9s, about 8.1% faster |

The 1024 benchmark uses the same examples/caption.json prompt, preset, seed, and output size as the legacy run. The old MPS result is preserved in examples/result.log; the MLX result was generated with the q8 runtime after the model was available locally.
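The figures in the Difference column can be reproduced from the raw timings. The short calculation below uses only the numbers reported above; the variable names are illustrative.

```python
# Model load: legacy time divided by each reported MLX time.
legacy_load_s = 285.0
mlx_load_s = (2.5, 4.6, 5.3)  # fastest runtime load, slowest runtime load, API-observed
load_speedups = [round(legacy_load_s / s, 1) for s in mlx_load_s]

# 1024x1024 generation: absolute and relative wall-time savings.
legacy_gen_s = 408.0
mlx_gen_s = 375.1
saved_s = round(legacy_gen_s - mlx_gen_s, 1)
saved_pct = round(100 * (legacy_gen_s - mlx_gen_s) / legacy_gen_s, 1)
throughput_ratio = round(legacy_gen_s / mlx_gen_s, 3)

print(load_speedups, saved_s, saved_pct, throughput_ratio)
```

The lower end of the 54-114x range corresponds to the 5.3s API-observed load and the upper end to the 2.5s runtime load. The 8.1% figure is the reduction in wall time relative to the legacy run.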

A post-merge smoke test on main produced 256x256 V4_TURBO_12 images in 8.1-9.7s through the FastAPI -> daemon path, including LoRA apply/remove checks. A direct local cache-limit pass showed that IDEOGRAM4_MLX_CACHE_LIMIT_GB=2 kept the reusable MLX cache near 2GB on the 256px turbo smoke test; with the variable unset the cache reached about 5.7GB, and a value of 0 eliminated the reusable cache. Treat these as local single-run measurements, not a cross-machine guarantee.

Magic Prompt

POST /api/magic-prompt expands a plain idea into the structured JSON caption that Ideogram 4 expects. It uses the existing OpenAI-compatible provider abstraction, with a local llama.cpp option. The endpoint accepts text plus up to IDEOGRAM4_MAGIC_PROMPT_MAX_IMAGES base64 reference images; text-only requests still work when no multimodal local server is configured. Caption validation uses mflux's Ideogram 4 caption verifier instead of the old ideogram4 Python package.
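A request with reference images might be assembled as follows. This is a sketch under assumptions: the body field names `text` and `images` are hypothetical, and `max_images` stands in for whatever IDEOGRAM4_MAGIC_PROMPT_MAX_IMAGES is set to, since this README does not state its default.

```python
import base64


def build_magic_prompt_request(text, images=(), max_images=0):
    """Build a Magic Prompt body with optional base64 reference images.

    Field names are hypothetical; max_images should mirror the server's
    IDEOGRAM4_MAGIC_PROMPT_MAX_IMAGES setting.
    """
    images = list(images)
    if len(images) > max_images:
        raise ValueError(f"{len(images)} images exceeds the limit of {max_images}")
    body = {"text": text}
    if images:
        # Raw image bytes are sent as ASCII base64 strings.
        body["images"] = [base64.b64encode(raw).decode("ascii") for raw in images]
    return body
```

With the default `max_images=0` the helper only allows text-only requests, which matches the case where no multimodal local server is configured.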

WebUI State

The sidebar contains prompt history, gallery, and favorites routes. The editor autosaves the latest form through /api/form; generation results are persisted under IDEOGRAM4_OUTPUT_DIR; and prompt/image links are stored in SQLite so history pages can show their generated images. A "new seed" or regenerate action adds a client-side queue job, then the queue submits to the daemon when the single generation slot is free.
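The prompt/image link idea can be pictured with a tiny SQLite example. The schema below is hypothetical and much simpler than whatever server/main.py actually stores; it only illustrates how a nullable link column separates linked images from orphans. The "dangling" category reported by /api/images/stats is not modeled, because this README does not define it.

```python
import sqlite3

# Hypothetical schema: an image row optionally points at a prompt history row.
SCHEMA = """
CREATE TABLE prompts (id INTEGER PRIMARY KEY, caption TEXT NOT NULL);
CREATE TABLE images (
    id INTEGER PRIMARY KEY,
    path TEXT NOT NULL,
    prompt_id INTEGER REFERENCES prompts(id)  -- NULL means no history link
);
"""


def image_stats(conn):
    """Count total, linked, and orphan (unlinked) image rows."""
    # COUNT(prompt_id) skips NULLs, so it counts only linked images.
    total, linked = conn.execute(
        "SELECT COUNT(*), COUNT(prompt_id) FROM images").fetchone()
    return {"total": total, "linked": linked, "orphan": total - linked}


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO prompts (id, caption) VALUES (1, '{}')")
    conn.execute("INSERT INTO images (path, prompt_id) VALUES ('a.png', 1)")
    conn.execute("INSERT INTO images (path, prompt_id) VALUES ('b.png', NULL)")
    print(image_stats(conn))
```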

webui/src/routeTree.gen.ts is generated by TanStack Router. Temporary webui/.tanstack/tmp/ files may appear during dev/build runs and should not be treated as source changes.

Verification

python3 -m compileall server ideogram4_mlx.py
rg "torch|safetensors.torch|from ideogram4|import ideogram4" server ideogram4_mlx.py
cd webui && pnpm lint && pnpm build

With model files available, also verify:

curl http://127.0.0.1:8001/health
curl -X POST http://127.0.0.1:8001/model/load
curl http://127.0.0.1:8001/model/status
