End-to-end documentation to set up your own local & fully private LLM server on Debian. Equipped with chat, web search, RAG, model management, MCP servers, image generation, and TTS.
Updated Mar 2, 2026
The operations layer for your local LLM stack
A robust, production-ready Python toolkit that automates synchronization between a directory of .gguf model files and a llama-swap config.yaml.
LLM routing proxy for coding harnesses. Auto-routes to cloud or local inference via Bonsai LLM classification. Supports fallback, prompt rewriting, MCP code review, and remote communication over Signal/Discord.
GGUF Model Manager for llama-swap — Browse HuggingFace, download, manage models in one place.
Custom Llama Swap Container Image
Auto-configure opencode to use a local llama-swap instance with model and context detection
Config-driven local LLM toolkit for llama.cpp and llama-swap, with a FastAPI Web UI, eval/benchmark helpers, and deployment packaging.
Launch and optimize llama.cpp servers automatically across Linux, macOS, and Windows using hardware detection and configuration tuning.
Start/stop your Llama Swap models with ulauncher
Autonomous overnight LLM eval pipeline for local GGUF models — multi-turn agentic tasks, dimension-routed dual-judge scoring, SQLite-backed comparison reports. Built for llama.cpp + llama-swap on dual-GPU rigs.
FLAI is a self-hosted, privacy-first AI platform. Local assistant for chat, voice, image/video gen, doc Q&A & camera analysis. Open source, GPU-optimized, multi-user with request queuing. Data never leaves your machine.
Cursor-Auto / Claude-tier-style serving for local GGUF models on Mac (M4 Max, 64 GB). FastAPI router fronts llama-swap + llama.cpp, classifying each request into a coder, planner, or uncensored-planner tier. OpenAI-compatible API, opencode integration, per-project subshell, one `llmstack` console-script.