Deploy a domain-expert AI chatbot in minutes. RAG-powered knowledge base, real-time SSE streaming, and a mobile-optimized webchat interface. Works with OpenAI, Claude, Ollama, or any OpenAI-compatible API.
- RAG-Powered Answers — Ingest your documents into ChromaDB; the chatbot retrieves relevant context before every response
- SSE Streaming — Tokens stream to the browser in real time via Server-Sent Events
- Mobile Optimized — Dark theme, responsive layout, Wake Lock API to prevent screen sleep
- Multi-Provider — OpenAI, Claude, Ollama, OpenRouter, or any OpenAI-compatible endpoint
- Easy Domain Swap — Change `SOUL.md` and your knowledge docs to create any expert (cooking, legal, medical, etc.)
- Production Ready — Retry logic, graceful error handling, health checks, CORS middleware
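The retrieve-then-answer flow behind the first bullet can be sketched in pure Python (a toy word-overlap score stands in for the real ChromaDB + sentence-transformers pipeline; function and variable names here are illustrative, not the repo's):

```python
import re

def tokens(text: str) -> set[str]:
    """Lowercase alphanumeric tokens, punctuation stripped."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve_context(question: str, chunks: list[str], top_k: int = 2) -> list[str]:
    # Score every stored chunk against the question and keep the top-k.
    # The real server does this with vector embeddings instead.
    q = tokens(question)
    scored = sorted(chunks, key=lambda c: len(q & tokens(c)), reverse=True)
    return scored[:top_k]

chunks = [
    "Basil pairs well with tomato dishes.",
    "The server listens on port 8000 by default.",
    "Tomato soup is improved by fresh basil and garlic.",
]
context = retrieve_context("What goes with tomato and basil?", chunks)
# The two food chunks win; the server-config chunk scores zero.
```

The retrieved chunks are then prepended to the LLM prompt, which is what makes every response grounded in your documents.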
```bash
# 1. Clone
git clone https://github.com/mingrath/ai-expert-chatbot.git
cd ai-expert-chatbot

# 2. Install
python -m venv venv && source venv/bin/activate
pip install -r requirements.txt

# 3. Configure
cp .env.example .env
# Edit .env — add your API_KEY and choose your MODEL_NAME

# 4. Ingest your knowledge base
python -m scripts.ingest

# 5. Run
uvicorn server.app:app --reload
```

Open http://localhost:8000 and start chatting.
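Responses arrive as Server-Sent Events: `data:` lines separated by blank lines. A minimal sketch of parsing such a stream (the canned payload and `[DONE]` sentinel follow the common OpenAI-style convention; the repo's actual event payloads may differ):

```python
def parse_sse(raw: str) -> list[str]:
    """Collect SSE 'data:' payloads; a blank line terminates each event."""
    events, buffer = [], []
    for line in raw.splitlines():
        if line.startswith("data:"):
            buffer.append(line[len("data:"):].strip())
        elif line == "" and buffer:  # blank line ends the current event
            events.append("\n".join(buffer))
            buffer = []
    return events

# Example stream as an OpenAI-compatible backend might emit it
stream = "data: Hello\n\ndata: world\n\ndata: [DONE]\n\n"
payloads = parse_sse(stream)  # → ["Hello", "world", "[DONE]"]
```

In the browser, `app.js` does the equivalent incrementally as chunks arrive, appending each token to the chat bubble.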
| Variable | Default | Description |
|---|---|---|
| `API_URL` | `https://api.openai.com/v1/chat/completions` | LLM API endpoint |
| `API_KEY` | — | Your API key |
| `MODEL_NAME` | `gpt-4o-mini` | Model to use |
| `AGENT_ID` | — | Optional agent/assistant ID |
| `HOST` | `0.0.0.0` | Server bind host |
| `PORT` | `8000` | Server bind port |
| `CHROMA_PATH` | `./chroma_data` | ChromaDB storage directory |
| `COLLECTION_NAME` | `knowledge_base` | ChromaDB collection name |
| `RAG_TOP_K` | `10` | Number of context chunks to retrieve |
| `SYSTEM_PROMPT_PATH` | `./SOUL.md` | Path to system prompt file |
```
┌─────────────────────────────────────────────────┐
│ Browser │
│ ┌───────────────────────────────────────────┐ │
│ │ index.html + app.js + style.css │ │
│ │ (SSE streaming, auto-resize, wake lock) │ │
│ └─────────────────┬─────────────────────────┘ │
└────────────────────┼────────────────────────────┘
│ POST /api/chat (SSE)
┌────────────────────┼────────────────────────────┐
│ FastAPI Server │ │
│ ┌─────────────────▼─────────────────────────┐ │
│ │ app.py — routes, CORS, static files │ │
│ └─────────────────┬─────────────────────────┘ │
│ │ │
│ ┌─────────────────▼─────────────────────────┐ │
│ │ chat.py — SSE streaming, retry logic │ │
│ │ ┌───────────┐ ┌────────────────────┐ │ │
│ │ │ rag.py │───▶│ ChromaDB + embeds │ │ │
│ │ └───────────┘ └────────────────────┘ │ │
│ └─────────────────┬─────────────────────────┘ │
└────────────────────┼────────────────────────────┘
│ OpenAI-compatible API
┌──────▼──────┐
│ LLM Provider│
│ (OpenAI, │
│ Claude, │
│ Ollama) │
└─────────────┘
```
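The `chat.py` layer in the diagram essentially wraps upstream LLM tokens into SSE frames. A self-contained sketch of that generator pattern (the `fake_llm` stand-in and frame format are illustrative; in the real server the generator feeds FastAPI's `StreamingResponse`):

```python
import asyncio

async def fake_llm(prompt: str):
    # Stand-in for the upstream OpenAI-compatible token stream
    for token in ["Hello", ", ", "world", "!"]:
        yield token

async def sse_stream(prompt: str):
    async for token in fake_llm(prompt):
        yield f"data: {token}\n\n"   # one SSE frame per token
    yield "data: [DONE]\n\n"         # sentinel the browser client watches for

async def main():
    return [frame async for frame in sse_stream("hi")]

frames = asyncio.run(main())  # 4 token frames + 1 [DONE] frame
```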
- Edit `SOUL.md` — Write a system prompt for your expert persona
- Replace knowledge docs — Drop your `.txt`, `.md`, or `.csv` files into `knowledge/`
- Re-ingest — Run `python -m scripts.ingest`
- Update the UI — Edit `static/index.html` to change the branding, suggestions, and avatar
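The re-ingest step boils down to splitting each document into overlapping chunks before embedding. A sketch of that chunking (the size/overlap values here are illustrative; `scripts/ingest.py` sets its own):

```python
def chunk_text(text: str, size: int = 50, overlap: int = 10) -> list[str]:
    """Split text into word chunks of `size`, each overlapping the previous
    by `overlap` words so context isn't cut mid-thought at boundaries."""
    words = text.split()
    step = size - overlap
    return [" ".join(words[i:i + size]) for i in range(0, len(words), step)]

doc = " ".join(f"word{i}" for i in range(120))
chunks = chunk_text(doc)  # 120 words, step 40 → chunks starting at 0, 40, 80
```

Overlap matters: without it, a fact straddling a chunk boundary might never be retrieved whole.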
Examples: Legal advisor, medical FAQ, customer support, internal company wiki, educational tutor.
| Provider | `API_URL` | `MODEL_NAME` example |
|---|---|---|
| OpenAI | `https://api.openai.com/v1/chat/completions` | `gpt-4o`, `gpt-4o-mini` |
| Claude (via OpenAI compat) | Use a proxy or adapter | `claude-sonnet-4-20250514` |
| Ollama | `http://localhost:11434/v1/chat/completions` | `llama3`, `mistral`, `phi3` |
| OpenRouter | `https://openrouter.ai/api/v1/chat/completions` | `openai/gpt-4o`, `anthropic/claude-3.5-sonnet` |
| Together AI | `https://api.together.xyz/v1/chat/completions` | `meta-llama/Llama-3-70b-chat-hf` |
| Any OpenAI-compatible | Your endpoint URL | Your model name |
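All rows in the table accept the same request body; only `API_URL` and `MODEL_NAME` change. A sketch of building that body (the helper name is illustrative):

```python
import json

def build_chat_request(model: str, system_prompt: str, user_msg: str) -> str:
    """Standard OpenAI-compatible chat-completions payload."""
    return json.dumps({
        "model": model,
        "stream": True,  # ask the provider for SSE token streaming
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_msg},
        ],
    })

body = build_chat_request("gpt-4o-mini", "You are a cooking expert.", "Hi!")
```

This uniformity is why swapping providers is a two-line `.env` change rather than a code change.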
```
ai-expert-chatbot/
├── README.md              # This file
├── LICENSE                # MIT License
├── requirements.txt       # Python dependencies
├── .env.example           # Environment template
├── .gitignore             # Git ignore rules
├── SOUL.md                # System prompt / persona
├── server/
│   ├── app.py             # FastAPI main application
│   ├── chat.py            # Chat handler with SSE streaming
│   └── rag.py             # RAG helper (ChromaDB + embeddings)
├── static/
│   ├── index.html         # Chat interface
│   ├── app.js             # Chat logic + streaming client
│   ├── style.css          # Dark theme, mobile-optimized
│   └── landing.html       # Landing page template
├── knowledge/
│   ├── sample-recipes.txt # Sample knowledge documents
│   └── sample-nutrition.txt # More sample documents
├── scripts/
│   ├── ingest.py          # Ingest documents into ChromaDB
│   └── query.py           # Test RAG queries from CLI
└── docs/
    └── deployment.md      # Deployment guide
```
See the full Deployment Guide for Docker, Railway, Render, Fly.io, and production Nginx configuration.
Quick Docker:
```bash
docker build -t ai-expert-chatbot .
docker run -p 8000:8000 --env-file .env ai-expert-chatbot
```

- FastAPI — High-performance async Python web framework
- ChromaDB — Open-source vector database for RAG
- sentence-transformers — State-of-the-art text embeddings
- httpx — Async HTTP client for streaming
- uvicorn — Lightning-fast ASGI server
MIT — Use it however you want.