Community Discord bot for nan.builders. It answers member questions in configured support channels using semantic search over a curated markdown knowledge base (a `SimpleVectorStore` backed by SQLite, `qwen3-embedding` 4096-dim vectors, cosine similarity computed in Python) and the `qwen3.6` chat model exposed through the NaN LiteLLM gateway. The bot also reports daily and per-user token usage pulled from the LiteLLM admin API.
- Mention-triggered auto-response in channels listed in `ALLOWED_CHANNELS`, with retrieval-augmented answers from the local vector store and a `qwen3.6` fallback.
- Per-user rate limiting (3 mentions per 60-second window per `(user, channel)` pair).
- Username sanitization to mitigate prompt injection via Discord display names (see the sketch after this list).
- Text commands (prefix `/`): `/health`, `/docs`, `/search <query>`.
- Slash commands (Discord interactions): `/metrics`, `/my-metrics`.
- Daily token usage report posted to `STATUS_CHANNEL_ID` at `METRICS_SEND_HOUR` (UTC), pulled from the LiteLLM proxy.
- HTTP health endpoint on port `9101` (`GET /health`) consumed by the Docker `HEALTHCHECK`.
- Doc-hash optimization: unchanged markdown files are skipped on startup, so embeddings are only recomputed when content actually changes.
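The display-name sanitization is only summarized above; as a rough illustration of the idea (the actual function in the bot code may differ), it amounts to stripping mention and markdown syntax before the name is interpolated into the prompt:

```python
import re

def sanitize_display_name(name: str, max_len: int = 32) -> str:
    """Illustrative sketch: strip characters a display name could use for prompt injection."""
    # Remove mention brackets, pings, markdown, and backticks.
    cleaned = re.sub(r"[@<>`*_~|#]", "", name)
    # Collapse whitespace and cap the length before it reaches the prompt.
    cleaned = " ".join(cleaned.split())
    return cleaned[:max_len] or "user"
```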
- Python 3.11 (Dockerfile `python:3.11-slim`, `pyproject.toml` `requires-python = ">=3.11"`).
- discord.py >= 2.3.2.
- openai async SDK (`AsyncOpenAI`), pointed at the NaN LiteLLM gateway.
- aiohttp for the LiteLLM admin metrics calls.
- pydantic-settings for `.env` parsing.
- SQLite (stdlib `sqlite3`) as the vector store backend.
- Hatchling build backend.
- Ruff for linting and formatting.
`main.py` boots a `SimpleVectorStore` against `vector_db/vectors.db`, loads markdown files from `bot/docs/knowledge/` through `load_documentation`, embeds any new or changed chunks via the LiteLLM embeddings endpoint, and starts the `NanBot` discord.py client. Graceful shutdown is wired through SIGINT/SIGTERM handlers that cancel pending tasks, persist the vector store, close the OpenAI clients, stop the health HTTP server, and close the bot connection.
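In outline, that shutdown wiring looks roughly like the sketch below. The helper and its arguments are hypothetical; the actual `main.py` may structure the handlers differently.

```python
import asyncio
import signal

async def run_bot(bot, token: str, vector_store, llm_client, health_server):
    """Hypothetical sketch of the SIGINT/SIGTERM shutdown wiring described above."""
    loop = asyncio.get_running_loop()
    stop = asyncio.Event()

    # Translate SIGINT/SIGTERM into a single shutdown event.
    for sig in (signal.SIGINT, signal.SIGTERM):
        loop.add_signal_handler(sig, stop.set)

    bot_task = asyncio.create_task(bot.start(token))
    await stop.wait()

    # Mirror the steps described above: cancel work, persist state, close clients.
    bot_task.cancel()
    vector_store.persist()      # hypothetical persistence hook for the SQLite store
    await llm_client.close()    # close the AsyncOpenAI clients
    await health_server.stop()  # stop the HTTP health server
    await bot.close()           # close the Discord gateway connection
```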
On message events, `NanBot.on_message` filters by `ALLOWED_CHANNELS` and mention, applies the rate limiter, embeds the question, runs a cosine-similarity search over the in-memory chunks (top-K configurable via `TOP_K`), and calls `LLMClient.answer_with_context` against the `qwen3.6` chat model. A `CircuitBreaker` (5 failures, 60-second cool-off) protects the chat endpoint from cascading failures, and an `asyncio.Semaphore(5)` caps concurrent LLM calls.
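The retrieval step itself is plain Python over the chunks held in memory. A simplified sketch of the idea (chunk shape and function names are illustrative; see `bot/knowledge.py` for the real implementation):

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def top_k_chunks(query_vec: list[float], chunks: list[dict], k: int = 5) -> list[dict]:
    """Rank in-memory chunks by cosine similarity to the query embedding."""
    scored = sorted(
        ((cosine(query_vec, c["embedding"]), c) for c in chunks),
        key=lambda pair: pair[0],
        reverse=True,
    )
    return [{**chunk, "score": score} for score, chunk in scored[:k]]
```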
Metrics live in `bot/metrics.py`. They hit the LiteLLM proxy `/spend/logs/ui` endpoint (configured via `LITELLM_PROXY_URL` and `LITELLM_ADMIN_KEY`) and aggregate token usage per `user_api_key_alias`. The daily scheduler sleeps until `METRICS_SEND_HOUR` UTC, posts the top-10 report to `STATUS_CHANNEL_ID`, and then loops every 24 hours.
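The scheduling part can be expressed compactly; the sketch below assumes a `send_report()` coroutine that builds and posts the report (the real module may handle errors and disabled metrics differently).

```python
import asyncio
from datetime import datetime, timedelta, timezone

async def daily_report_loop(send_report, send_hour: int) -> None:
    """Sketch: sleep until the next METRICS_SEND_HOUR (UTC), post the report, repeat daily."""
    while True:
        now = datetime.now(timezone.utc)
        target = now.replace(hour=send_hour, minute=0, second=0, microsecond=0)
        if target <= now:
            target += timedelta(days=1)  # already past today's slot, wait for tomorrow
        await asyncio.sleep((target - now).total_seconds())
        await send_report()  # hypothetical coroutine posting the top-10 report
```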
```
discord-bot/
├── main.py                    # Entry point and shutdown wiring
├── bot/
│   ├── __init__.py
│   ├── base.py                # NanBot, commands, message handler, health HTTP server
│   ├── config.py              # pydantic-settings, paths, logger
│   ├── knowledge.py           # SimpleVectorStore, chunking, doc loader
│   ├── llm.py                 # LLMClient, CircuitBreaker, RAG prompt
│   ├── metrics.py             # LiteLLM spend log aggregation and reports
│   └── docs/
│       └── knowledge/         # Embedded markdown corpus
│           ├── intro.md
│           ├── getting-started.md
│           └── models.md
├── Dockerfile
├── entrypoint.sh
├── docker-compose.yml         # Local build
├── docker-compose.prod.yml    # Production override (pulled image)
├── pyproject.toml
├── .env.example
└── .github/
    └── workflows/
        └── deploy.yml         # GHCR build + SSH deploy
```
- Python 3.11.
- A Discord application with a bot user, a token, and the following privileged intents enabled in the Discord Developer Portal: MESSAGE CONTENT INTENT and SERVER MEMBERS INTENT. Without them the bot fails to connect with `PrivilegedIntentsRequired` (see the snippet after this list).
- The bot invited to your guild with permissions to read messages, send messages, embed links, and use slash commands.
- A LiteLLM API key. The bot defaults to `https://api.nan.builders/v1`; override with `LITELLM_BASE_URL` if you run your own gateway.
- For the metrics features: network reachability to the LiteLLM proxy URL (defaults to `http://localhost:4000`, i.e. the bot is expected to run on the same host) and an admin key with read access to `/spend/logs/ui`.
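For reference, these are the two intents the client has to request in code; the constructor call below only illustrates the discord.py pattern, the actual `NanBot` setup lives in `bot/base.py`.

```python
import discord
from discord.ext import commands

# Both privileged intents must also be switched on in the Developer Portal,
# otherwise bot.start() raises discord.PrivilegedIntentsRequired.
intents = discord.Intents.default()
intents.message_content = True  # MESSAGE CONTENT INTENT
intents.members = True          # SERVER MEMBERS INTENT

bot = commands.Bot(command_prefix="/", intents=intents)
```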
```bash
git clone https://github.com/helmcode/nan-discord-bot.git
cd nan-discord-bot
python -m venv .venv
source .venv/bin/activate
pip install -e ".[dev]"
cp .env.example .env
# Fill in DISCORD_TOKEN, DISCORD_GUILD_ID, LITELLM_API_KEY, ALLOWED_CHANNELS, etc.
python main.py
```

To run it in Docker instead:

```bash
cp .env.example .env
# Fill in the required variables.
docker compose up --build
```

The container exposes the health endpoint on port 9101 (consumed by the `HEALTHCHECK` defined in the Dockerfile).
NanBot registers two flavors of commands: text commands use the `commands.Bot` prefix (currently `/`), while slash commands are real Discord interactions registered via `bot.tree`. A registration sketch follows the table below.
| Command | Type | Description | Cooldown |
|---|---|---|---|
| `/health` | text | Bot status and the number of chunks currently loaded in the vector store. | none |
| `/docs` | text | List of markdown files loaded from `bot/docs/knowledge/`. | none |
| `/search <q>` | text | Top-3 chunks from the knowledge base for the given query, with cosine score. | none |
| `/metrics` | slash | Manually trigger the global LiteLLM top-10 token usage report (last 24 hours). | 1 per 3600 s |
| `/my-metrics` | slash | The caller's personal token usage and per-model breakdown (last 24 hours). | 1 per 300 s |
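As a rough sketch of how the two flavors are wired up with discord.py (handler names and bodies are illustrative; the real commands live in `bot/base.py`):

```python
import discord
from discord import app_commands
from discord.ext import commands

bot = commands.Bot(command_prefix="/", intents=discord.Intents.default())

@bot.command(name="health")
async def health(ctx: commands.Context) -> None:
    # Text command: replies with bot status and the loaded chunk count.
    await ctx.send("OK - vector store loaded")

@bot.tree.command(name="my-metrics", description="Your token usage for the last 24 hours")
@app_commands.checks.cooldown(1, 300)  # 1 use per 300 s, matching the table above
async def my_metrics(interaction: discord.Interaction) -> None:
    # Slash command: registered on bot.tree and answered through the interaction.
    await interaction.response.send_message("your usage report", ephemeral=True)
```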
Auto-response is triggered when the bot is mentioned inside a channel listed in `ALLOWED_CHANNELS`. Rate limiting allows at most 3 mentions per user per channel per 60-second window; excess messages get a Spanish "demasiadas peticiones" (too many requests) reply.
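A sliding-window limiter keyed by `(user, channel)` is enough to implement this; a minimal sketch (the bot's actual limiter may track state differently):

```python
import time
from collections import defaultdict, deque

class RateLimiter:
    """Sketch of the per-(user, channel) sliding window: 3 mentions per 60 seconds."""

    def __init__(self, limit: int = 3, window: float = 60.0) -> None:
        self.limit = limit
        self.window = window
        self._hits: dict[tuple[int, int], deque[float]] = defaultdict(deque)

    def allow(self, user_id: int, channel_id: int) -> bool:
        now = time.monotonic()
        hits = self._hits[(user_id, channel_id)]
        # Evict timestamps that fell out of the 60-second window.
        while hits and now - hits[0] > self.window:
            hits.popleft()
        if len(hits) >= self.limit:
            return False  # over the limit: reply with "demasiadas peticiones"
        hits.append(now)
        return True
```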
| Name | Required | Default | Description |
|---|---|---|---|
| `DISCORD_TOKEN` | yes | — | Bot token from the Discord Developer Portal. |
| `DISCORD_GUILD_ID` | yes | — | Guild (server) ID the bot is associated with. |
| `LITELLM_BASE_URL` | no | `https://api.nan.builders/v1` | OpenAI-compatible base URL used for chat completions and embeddings. |
| `LITELLM_API_KEY` | yes | — | LiteLLM key used by both the chat and embeddings clients. |
| `LITELLM_PROXY_URL` | no | `http://localhost:4000` | LiteLLM proxy base URL used by the metrics module to call `/spend/logs/ui`. |
| `LITELLM_ADMIN_KEY` | no | `""` (disables metrics) | Admin key for the LiteLLM proxy. When empty, metrics commands and the daily report are skipped. |
| `EMBEDDING_MODEL` | no | `qwen3-embedding` | Embedding model identifier sent to the LiteLLM gateway. |
| `EMBEDDING_DIM` | no | `4096` | Expected embedding dimensionality. Informational; not enforced at write time. |
| `TOP_K` | no | `5` | Number of chunks returned by the vector search used to build the RAG context. |
| `ALLOWED_CHANNELS` | no | `""` (all channels) | Comma-separated Discord channel IDs the bot will respond in. Empty means every channel is allowed. |
| `STATUS_CHANNEL_ID` | no | `""` (disables daily report) | Channel ID where the daily metrics report is posted. Required for the scheduler to run. |
| `METRICS_SEND_HOUR` | no | `9` | UTC hour (0–23) at which the daily metrics report is posted. |
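`bot/config.py` reads these variables with pydantic-settings. A minimal sketch of such a settings class, assuming field names simply mirror the table above (the real class likely adds paths, validators, and logger setup):

```python
from pydantic_settings import BaseSettings, SettingsConfigDict

class Settings(BaseSettings):
    """Illustrative pydantic-settings model mirroring the variables above."""

    model_config = SettingsConfigDict(env_file=".env", extra="ignore")

    discord_token: str
    discord_guild_id: int
    litellm_base_url: str = "https://api.nan.builders/v1"
    litellm_api_key: str
    litellm_proxy_url: str = "http://localhost:4000"
    litellm_admin_key: str = ""   # empty disables the metrics features
    embedding_model: str = "qwen3-embedding"
    embedding_dim: int = 4096
    top_k: int = 5
    allowed_channels: str = ""    # comma-separated channel IDs
    status_channel_id: str = ""   # empty disables the daily report
    metrics_send_hour: int = 9

settings = Settings()
```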
Markdown files in `bot/docs/knowledge/` are loaded at startup by `SimpleVectorStore`, chunked on paragraph boundaries (target ~2000 chars per chunk with overlap), embedded via the LiteLLM embeddings endpoint, and persisted to `vector_db/vectors.db`. A `doc_hashes` table stores a SHA-256 of each source file so unchanged files are skipped on subsequent boots; files that disappear from disk have their chunks evicted from the database.

To update the corpus, edit or add `.md` files under `bot/docs/knowledge/` and restart the bot. Only files whose content hash changed will trigger new embedding API calls.
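The hash check is conceptually simple; the sketch below assumes a `doc_hashes(path, hash)` schema for illustration, and the actual column names in `bot/knowledge.py` may differ.

```python
import hashlib
import sqlite3
from pathlib import Path

def needs_reembedding(conn: sqlite3.Connection, path: Path) -> bool:
    """Sketch of the doc-hash check: re-embed a file only when its content changed."""
    digest = hashlib.sha256(path.read_bytes()).hexdigest()
    row = conn.execute(
        "SELECT hash FROM doc_hashes WHERE path = ?", (str(path),)
    ).fetchone()
    if row and row[0] == digest:
        return False  # unchanged since the last boot: skip the embedding calls
    conn.execute(
        "INSERT OR REPLACE INTO doc_hashes (path, hash) VALUES (?, ?)",
        (str(path), digest),
    )
    return True
```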
- Lint: `ruff check .`
- Format: `ruff format .`
- Ruff is configured in `pyproject.toml` (`line-length = 120`, `target-version = "py311"`, rules `E, F, I, N, W, UP`).
- There is currently no test suite. The `dev` extra installs `pytest` and `pytest-asyncio`, and `pyproject.toml` already configures `asyncio_mode = "auto"` for when tests are added.
Production runs as a Docker container on the inference server.
- On every push to `main`, `.github/workflows/deploy.yml` builds the image and pushes it to GHCR (`ghcr.io/helmcode/nan-discord-bot`), tagged with both `latest` and the commit SHA.
- The same workflow then SSHes into the deploy target, pulls the new image (by SHA, with `latest` as fallback), retags it as `latest`, and runs `docker compose -f docker-compose.yml -f docker-compose.prod.yml up -d --remove-orphans`.
- The production compose override uses `network_mode: host`, mounts the read-only knowledge directory and a named volume for `vector_db`, and caps the container at 512 MiB / 1 vCPU.
- The Dockerfile starts as root and runs `entrypoint.sh`, which `chown`s the `vector_db` volume to the unprivileged `bot` user and then runs `exec gosu bot python main.py`.
| Secret | Purpose |
|---|---|
| `SERVER_HOST` | Hostname or IP of the deploy target. |
| `SERVER_USER` | SSH username on the deploy target. |
| `SSH_PRIVATE_KEY` | SSH private key authorized on the deploy target. |
| `DEPLOY_DIR` | Optional. Working directory on the deploy target. Falls back to `$HOME/nan-discord-bot` when empty or unset. |
- Branch from `main` and open a pull request.
- Run `ruff check .` and `ruff format .` before pushing.
- Never commit `.env` files or any token, API key, or secret.