Token‑efficient LLM router + context manager for n8n and Claude Code (MCP)—deployable locally or via remote HTTP bridges.
Goal: keep massive working context without massive token use. Replace fragile, token‑hungry prompts with diffed context, semantic retrieval, function routing, and strict token budgets.
- API: Python FastAPI with caching, prompt macros, delta-context, routing, retry support, and health checks.
- MCP bridge: FastAPI server that exposes FlowDex MCP tools over HTTP for remote VS Code / Claude Code setups.
- n8n node: Minimal custom node that hits FlowDex `/infer` with few parameters (model, task, inputs). Avoids a sprawl of native nodes.
- CLI: Inspect budgets, cache, and reproduce runs.
- Local stack: `docker-compose.yml` (API + Redis + MCP bridge). SQLite is used for persistence by default; swap in a real vector DB later.
FlowDex grew out of shipping real automations where “just throw more tokens at the prompt” stopped scaling. Product teams and operators needed to preserve weeks of context, tool transcripts, and troubleshooting notes without paying for massive context windows or rewriting flows every time the prompt changed. FlowDex packages those lessons into a single, opinionated service that:
- Preserves the working memory of an automation across turns without duplicating the full transcript.
- Routes expensive LLM calls only when they add value, preferring cached results or cheaper tools.
- Gives operators an auditable trail of what the assistant saw, which tool it picked, and why decisions were made.
At its core FlowDex is a FastAPI service that tracks every interaction in an LLM-powered workflow. It stores diffs between turns, enforces token budgets, records retries, and makes every run reproducible. The same backend also powers:
- An MCP bridge so Claude Code (VS Code) can call the exact API used in production.
- An n8n node that lets low-code builders orchestrate FlowDex without custom HTTP glue.
- A CLI for inspecting cache hits, budgets, manifests, and reproducing runs locally.
By centralizing prompting logic and run history, FlowDex lets you:
- Ship faster – Build against one API whether the assistant runs in n8n, Claude Code, or bespoke scripts.
- Spend less – Delta-context, semantic recall, and strict budgets keep token usage predictable and low.
- Operate confidently – Persisted run manifests, diffed context, and replay tooling make debugging straightforward.
- Stay flexible – Swap models, retrieval engines, or tool definitions without rewriting every workflow node.
- Delta Context (Patch Prompting): send only what changed. We compute a content hash and minimal JSON Patch between turns.
- Token Budgeter: hard caps for system, context, tools, user with graceful degradation and logs.
- Semantic Recall (Optional): simple bag-of-words + sqlite index today; plug your own vectors later.
- Function Registry: strongly-typed tools with cost hints; router chooses tool > text when cheaper.
- Determinism & Repro: Run manifests saved as JSON; one‑click replay.
- n8n First‑Class: one compact node → FlowDex API. No sprawl of native nodes.
- Claude Code via MCP: surface FlowDex as tools inside your editor, not another chat tab.
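To make the Delta Context idea above concrete, here is a minimal sketch of hashing a context and computing a JSON Patch between two turns. It is not FlowDex's actual implementation: the function names (`content_hash`, `diff`, `delta_for_turn`) and the payload keys `base` and `patch` are assumptions for illustration, and lists are replaced whole rather than diffed element by element.

```python
import hashlib
import json
from typing import Any


def content_hash(context: dict[str, Any]) -> str:
    """Hash a canonical JSON encoding so equal contexts hash equally."""
    canonical = json.dumps(context, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def _escape(key: str) -> str:
    # JSON Pointer escaping (RFC 6901): "~" becomes "~0", "/" becomes "~1".
    return key.replace("~", "~0").replace("/", "~1")


def diff(old: Any, new: Any, path: str = "") -> list[dict[str, Any]]:
    """Return JSON Patch (RFC 6902) operations that turn `old` into `new`.

    Dicts are compared key by key; any other changed value is replaced whole.
    """
    if isinstance(old, dict) and isinstance(new, dict):
        ops: list[dict[str, Any]] = []
        for key in sorted(old.keys() - new.keys()):
            ops.append({"op": "remove", "path": f"{path}/{_escape(key)}"})
        for key in sorted(new.keys() - old.keys()):
            ops.append({"op": "add", "path": f"{path}/{_escape(key)}", "value": new[key]})
        for key in sorted(old.keys() & new.keys()):
            ops.extend(diff(old[key], new[key], f"{path}/{_escape(key)}"))
        return ops
    if old != new:
        return [{"op": "replace", "path": path, "value": new}]
    return []


def delta_for_turn(previous: dict[str, Any], current: dict[str, Any]) -> dict[str, Any]:
    """Build a turn payload (hypothetical shape): base hash plus the patch to apply."""
    base = content_hash(previous)
    if base == content_hash(current):
        return {"base": base, "patch": []}
    return {"base": base, "patch": diff(previous, current)}
```

With this approach an unchanged context costs only a hash comparison, and a changed one sends only the operations for the keys that moved.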
| Component | Requirement | Notes |
|---|---|---|
| Docker (optional) | docker & docker compose | Recommended for the quickest path with Redis + MCP bridge bundled. |
| Python | 3.10+ | Needed for local installs, CLI tooling, and development. |
| Redis (optional) | 7+ | Docker compose spins this up automatically. Local installs can point to an existing instance or fall back to the in-memory cache. |
Have an Anthropic-compatible API key (or whichever backend model you configure) ready before deploying to remote environments.
git clone https://github.com/your-org/FlowDex.git
cd FlowDex
- Copy the sample environment file or create a new `.env` file at the project root.
- Update secrets such as `FLOWDEX_API_KEY` and the model you plan to call.
- Launch the stack:
docker compose up --build
- The API will be available at `http://localhost:8787`, and the MCP HTTP bridge at `http://localhost:8788`.
- Stop the stack with Ctrl+C or `docker compose down` when you are done.
- Create and activate a virtual environment:
python -m venv .venv
source .venv/bin/activate
- Install dependencies:
pip install -r server/requirements.txt
- (Optional) Start Redis locally if you want persistence/caching beyond the process lifetime.
- Launch the API server:
uvicorn server.app:app --reload --port 8787
- In a second terminal, start the MCP HTTP bridge if you need Claude Code connectivity:
python mcp/mcp_http_server.py
Create .env (or use Docker envs):
FLOWDEX_PORT=8787
FLOWDEX_MODEL=anthropic/claude-3-5-sonnet
FLOWDEX_CACHE_DIR=.flowdex_cache
FLOWDEX_MAX_TOKENS=6000
FLOWDEX_BUDGET_SYSTEM=1000
FLOWDEX_BUDGET_CONTEXT=2500
FLOWDEX_BUDGET_USER=1500
FLOWDEX_BUDGET_TOOLS=1000
FLOWDEX_API_KEY=change-me
FLOWDEX_REDIS_URL=redis://localhost:6379/0
- `FLOWDEX_MODEL` can be any provider/model name supported by your downstream LLM proxy.
- `FLOWDEX_REDIS_URL` is optional; omit it to run in in-memory mode (good for quick trials).
- Set `FLOWDEX_API_KEY` whenever the API is reachable from untrusted networks. The MCP bridge forwards the key in its request header automatically.
Real token counting & embeddings are pluggable. Stubs are provided so it runs offline now.
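As an illustration of how the `FLOWDEX_BUDGET_*` values above could be enforced, the sketch below loads the four section caps and truncates any section that exceeds its cap. It is a sketch under stated assumptions, not the shipped budgeter: `load_budgets`, `count_tokens`, and `fit_to_budget` are hypothetical names, and the whitespace word count is a stand-in for real token counting.

```python
import os
from typing import Mapping

# Defaults mirror the sample .env; the four caps sum to FLOWDEX_MAX_TOKENS (6000).
DEFAULT_BUDGETS = {"system": 1000, "context": 2500, "user": 1500, "tools": 1000}


def load_budgets(env: Mapping[str, str] = os.environ) -> dict[str, int]:
    """Read FLOWDEX_BUDGET_* values, falling back to the sample defaults."""
    return {
        name: int(env.get(f"FLOWDEX_BUDGET_{name.upper()}", default))
        for name, default in DEFAULT_BUDGETS.items()
    }


def count_tokens(text: str) -> int:
    # Stub (assumption): whitespace-separated words stand in for model tokens.
    return len(text.split())


def fit_to_budget(
    sections: Mapping[str, str], budgets: Mapping[str, int]
) -> tuple[dict[str, str], dict[str, dict[str, int]]]:
    """Truncate each section to its cap and report how much was dropped."""
    fitted: dict[str, str] = {}
    report: dict[str, dict[str, int]] = {}
    for name, text in sections.items():
        cap = budgets.get(name, 0)  # sections without a budget get nothing
        used = count_tokens(text)
        if used > cap:
            text = " ".join(text.split()[:cap])
        fitted[name] = text
        report[name] = {"used": min(used, cap), "cap": cap, "dropped": max(used - cap, 0)}
    return fitted, report
```

The returned report is the kind of record a budgeter could log so that operators can see which section was cut and by how much.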
With the API running, hit the health check:
curl http://localhost:8787/health
You should see a JSON response similar to `{"status": "ok"}`. If Redis is unavailable, the response will note degraded caching.
You can preload durable snippets that will be referenced by workflows:
curl -X POST http://localhost:8787/memory/put \
-H "Content-Type: application/json" \
-H "x-flowdex-api-key: $FLOWDEX_API_KEY" \
-d '{
"id": "runbook.postgres",
"title": "Postgres On-Call Runbook",
"body": "Check pg_stat_activity; restart read replicas if backlog > 500",
"tags": ["db", "incident"]
}'
The memory store keeps previous versions, so you can roll back or diff updates.
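One way to picture versioned memory is an append-only table keyed by `(id, version)`. The sketch below uses Python's built-in `sqlite3`; the table layout and the `open_store`, `put`, and `get` helpers are assumptions for illustration and may differ from how FlowDex persists memory.

```python
import json
import sqlite3
from typing import Any, Optional


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the store; the schema here is hypothetical."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS memory (
               id TEXT NOT NULL,
               version INTEGER NOT NULL,
               doc TEXT NOT NULL,
               PRIMARY KEY (id, version)
           )"""
    )
    return conn


def put(conn: sqlite3.Connection, doc: dict[str, Any]) -> int:
    """Append a new version instead of overwriting; return its version number."""
    row = conn.execute(
        "SELECT COALESCE(MAX(version), 0) FROM memory WHERE id = ?", (doc["id"],)
    ).fetchone()
    version = row[0] + 1
    conn.execute(
        "INSERT INTO memory (id, version, doc) VALUES (?, ?, ?)",
        (doc["id"], version, json.dumps(doc)),
    )
    conn.commit()
    return version


def get(
    conn: sqlite3.Connection, doc_id: str, version: Optional[int] = None
) -> Optional[dict[str, Any]]:
    """Return the latest version, or a specific one when `version` is given."""
    if version is None:
        row = conn.execute(
            "SELECT doc FROM memory WHERE id = ? ORDER BY version DESC LIMIT 1",
            (doc_id,),
        ).fetchone()
    else:
        row = conn.execute(
            "SELECT doc FROM memory WHERE id = ? AND version = ?", (doc_id, version)
        ).fetchone()
    return json.loads(row[0]) if row else None
```

Because old rows are never updated, rolling back amounts to reading an earlier version and putting it again.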
curl -X POST http://localhost:8787/infer \
-H "Content-Type: application/json" \
-H "x-flowdex-api-key: $FLOWDEX_API_KEY" \
-d '{
"task": "triage",
"model": "anthropic/claude-3-5-sonnet",
"inputs": {
"user": "Summarize the last incident and draft an update.",
"context": ["runbook.postgres"],
"tool_hints": ["post_incident_report"]
}
}'
The response contains a `run_id`, token usage, tool decisions, and the generated text. Store the `run_id` to retry or reproduce later.
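The same request can be built from Python with only the standard library. This sketch mirrors the curl call above; the helper names `build_infer_request` and `send` are hypothetical, and treating `model` as optional is an assumption based on the server having a default `FLOWDEX_MODEL`.

```python
import json
import urllib.request
from typing import Any, Optional, Sequence


def build_infer_request(
    base_url: str,
    api_key: str,
    task: str,
    user: str,
    context: Sequence[str] = (),
    tool_hints: Sequence[str] = (),
    model: Optional[str] = None,
) -> urllib.request.Request:
    """Build (but do not send) a POST /infer request."""
    inputs: dict[str, Any] = {"user": user}
    if context:
        inputs["context"] = list(context)
    if tool_hints:
        inputs["tool_hints"] = list(tool_hints)
    body: dict[str, Any] = {"task": task, "inputs": inputs}
    if model:
        body["model"] = model  # assumption: omitted means "use the server default"
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/infer",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json", "x-flowdex-api-key": api_key},
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 30.0) -> dict[str, Any]:
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Calling `send(build_infer_request("http://localhost:8787", key, "triage", "..."))` against a running server should return the decoded response, from which `run_id` can be stored for later retries.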
curl -X POST http://localhost:8787/infer/<run_id>/retry \
-H "Content-Type: application/json" \
-H "x-flowdex-api-key: $FLOWDEX_API_KEY" \
-d '{
"error": "Tool failed: post_incident_report",
"patch": {"path": "/inputs/tool_hints", "op": "add", "value": ["post_incident_report", "status_page"]}
}'
Retries automatically reuse context diffs to keep token usage minimal.
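An automated repair loop around the retry endpoint might look like the sketch below. Several details are assumptions rather than documented behavior: the response is assumed to carry an `error` field when a retry fails, the patch is sent only on the first attempt, and `retry_until_ok` and the injected `transport` callable are hypothetical names chosen so the loop can be exercised without a running server.

```python
from typing import Any, Callable, Optional

# transport(path, body) -> decoded JSON response (supplied by the caller).
Transport = Callable[[str, dict[str, Any]], dict[str, Any]]


def retry_until_ok(
    transport: Transport,
    run_id: str,
    first_error: str,
    patch: Optional[dict[str, Any]] = None,
    max_attempts: int = 3,
) -> dict[str, Any]:
    """POST to /infer/{run_id}/retry until a response carries no error."""
    error = first_error
    for attempt in range(1, max_attempts + 1):
        body: dict[str, Any] = {"error": error}
        if patch is not None and attempt == 1:
            body["patch"] = patch  # input changes are only needed once
        result = transport(f"/infer/{run_id}/retry", body)
        if not result.get("error"):  # assumed failure signal
            return {"attempts": attempt, "result": result}
        error = result["error"]  # feed the newest failure into the next retry
    raise RuntimeError(
        f"run {run_id} still failing after {max_attempts} attempts: {error}"
    )
```

Capping the number of attempts keeps a persistently failing tool from consuming tokens indefinitely.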
- `GET /health` – lightweight readiness + Redis connectivity check.
- `POST /infer` – run a task with budgets, diffed context, and optional retrieval.
- `POST /infer/{run_id}/retry` – re-run a previous task with new error context for automated repair loops.
- `POST /memory/put` – store or update named context blobs (versioned).
- `GET /memory/get?id=...` – retrieve latest or a specific version.
- `POST /tools/register` – declare a tool with a schema and cost hints.
- `GET /runs/{id}` – view a prior run manifest for repro.
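To show how cost hints on registered tools could drive routing, here is a small in-process sketch. The `Tool` fields, the `choose_route` function, and the rule "pick the cheapest candidate tool if it beats the estimated cost of free-form generation" are all assumptions for illustration; the real router and the `/tools/register` payload may look different.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Tool:
    name: str
    description: str
    cost_hint_tokens: int  # rough token cost of calling the tool (hypothetical field)


def choose_route(
    tools: Sequence[Tool], candidates: Sequence[str], text_cost_tokens: int
) -> str:
    """Return the cheapest candidate tool name, or "text" if none beats the LLM estimate."""
    registry = {tool.name: tool for tool in tools}
    # Unknown candidate names are ignored rather than treated as errors.
    eligible = [registry[name] for name in candidates if name in registry]
    if not eligible:
        return "text"
    cheapest = min(eligible, key=lambda tool: tool.cost_hint_tokens)
    return cheapest.name if cheapest.cost_hint_tokens < text_cost_tokens else "text"
```

For example, with `post_incident_report` hinted at 120 tokens and a free-form answer estimated at 900, the sketch routes to the tool.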
Two options depending on where VS Code / Claude Code runs:
- Local desktop – run the original stdio bridge:
python mcp/server.py
Then configure Claude Code to spawn that script.
- Remote / browser (Unraid, Codespaces, etc.) – run the HTTP bridge:
python mcp/mcp_http_server.py # or let docker compose manage it
Point Claude Code to `https://your-host/mcp` (behind Cloudflare, etc.). The bridge proxies `flowdex.infer`, `flowdex.infer.retry`, `flowdex.memory.get`, and `flowdex.health` over HTTP with API-key auth support.
The default docker-compose.yml now builds and runs the bridge alongside the API, exposing port 8788.
- Install the Claude Code extension in VS Code.
- Open the extension settings and add a new MCP server.
- For local setups select Process and point to `python mcp/server.py`. For remote setups select HTTP and supply the URL `http://<your-host>:8788/mcp`.
- Set the environment variable `FLOWDEX_API_KEY` within the MCP configuration if your API requires it.
- Reload the Claude Code extension. You should now see FlowDex tools (`flowdex.infer`, `flowdex.memory.get`, etc.) available in the tool palette.
Install from n8n-node/ into your n8n custom nodes folder. The node calls /infer with minimal configuration.
- Copy `n8n-node/flowdex` into your n8n custom nodes directory and restart n8n.
- In your workflow, add the FlowDex node.
- Configure the node parameters:
  - API Base URL: `http://flowdex:8787` (if running via docker compose) or `http://localhost:8787` for local runs.
  - Task: e.g., `triage` or `summarize`.
  - User Input: the user prompt or payload from upstream nodes.
  - System Prompt: optional guardrails to preface the run.
  - Context IDs: comma-separated memory IDs or contextual notes.
  - Tool Candidates: comma-separated tool identifiers FlowDex should consider.
  - Model: override the default model if needed.
- Set the `X-API-Key` credential globally in n8n or append a `Header Auth` node if your server requires authentication.
- Trigger the workflow manually or via webhook to confirm the integration.
The CLI under cli/ offers a lightweight way to send inference requests and experiment with different context/tool combinations from the terminal.
pip install -r cli/requirements.txt
python cli/flowdex_cli.py --help
- Send a quick inference:
python cli/flowdex_cli.py --task summarize --user "List recent changes"
- Include context IDs for richer memory:
python cli/flowdex_cli.py --task incident_review --user "Draft a customer update" --ctx runbook.postgres comms.template
- Provide tool candidates and a custom model:
python cli/flowdex_cli.py --task repair --user "Regenerate with tool assistance" --tool post_incident_report status_page --model anthropic/claude-3-5-sonnet
All CLI commands respect the same .env file for API URL and auth credentials.
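For readers who want to script something similar, the flags shown above map naturally onto `argparse`. The sketch below is a reconstruction from the example commands only, not the source of `cli/flowdex_cli.py`; `build_parser` and `to_payload` are hypothetical names, and the payload shape follows the `/infer` curl example.

```python
import argparse
from typing import Any


def build_parser() -> argparse.ArgumentParser:
    """Argument parser mirroring the flags used in the example commands."""
    parser = argparse.ArgumentParser(
        prog="flowdex_cli.py", description="Send an inference request to FlowDex."
    )
    parser.add_argument("--task", required=True, help="task name, e.g. summarize")
    parser.add_argument("--user", required=True, help="user prompt")
    parser.add_argument("--ctx", nargs="*", default=[], help="memory/context IDs")
    parser.add_argument("--tool", nargs="*", default=[], help="tool candidates")
    parser.add_argument("--model", default=None, help="override the default model")
    return parser


def to_payload(args: argparse.Namespace) -> dict[str, Any]:
    """Translate parsed flags into a request body shaped like the /infer example."""
    inputs: dict[str, Any] = {"user": args.user}
    if args.ctx:
        inputs["context"] = args.ctx
    if args.tool:
        inputs["tool_hints"] = args.tool
    payload: dict[str, Any] = {"task": args.task, "inputs": inputs}
    if args.model:
        payload["model"] = args.model
    return payload
```

Using `nargs="*"` is what lets `--ctx` and `--tool` accept several space-separated values, as in the examples.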
The examples/ folder contains end-to-end scenarios you can run as templates:
- `examples/incident_triage.ipynb` – notebook illustrating memory seeding, inference, and retries.
- `examples/tool_router.py` – Python script registering tools and demonstrating the router choosing them over free-form LLM output.
- `examples/n8n_flow.json` – importable n8n workflow connecting FlowDex to Slack + PagerDuty.
Each example is annotated with the commands needed to reproduce the run, making it easy to adapt to your own workflows.
Interested in improving FlowDex? Read the contribution guide for setup instructions and the recommended debugging workflow (automated tests ➝ Playwright verification ➝ manual checks only when unavoidable). The guide also includes a pull-request checklist to keep changes ship-ready.
© 2025 FlowDex. Generated 2025-10-23T00:31:05.007257Z