- **Tavily Search API Key:**
  Local Mind uses Tavily Search to fetch live web results for its research agent.
  - Get your free Tavily API key here: https://app.tavily.com/home
  - Add your key to your `.env` file as: `TAVILY_API_KEY=your-key-here`
  - This is required for web research to function!
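The `.env` entry above is a plain `KEY=value` line. As a minimal sketch of how such a file can be read (the actual server may use a library like python-dotenv instead; `load_env` here is a hypothetical helper for illustration):

```python
import os
import tempfile

def load_env(path):
    """Parse simple KEY=value lines from a .env file, skipping blanks and comments."""
    env = {}
    with open(path) as fh:
        for line in fh:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            env[key.strip()] = value.strip()
    return env

# Demo: write a throwaway .env and read the key back.
env_path = os.path.join(tempfile.mkdtemp(), ".env")
with open(env_path, "w") as fh:
    fh.write("# server credentials\nTAVILY_API_KEY=your-key-here\n")
env = load_env(env_path)
```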
- **Local LLM Model:**
  Local Mind runs a quantized Jan-nano model on your machine, so all your data stays private.
  - **What is Jan-nano?**
    Jan-nano is a highly efficient, open-source LLM by Menlo, designed for local CPUs and resource-limited hardware, making it a good fit for private, local AI.
    - Trained on high-quality English datasets.
    - Optimized for speed and context length.
    - Well-suited for chat, question answering, and code.
- **Quantized Model Downloads:**
  Local Mind supports quantized GGUF versions for best performance on your system.
  - Choose a quantized file (`*.gguf`) from this list.
  - Download the version that matches your hardware and put it in `server/model/` or the main server directory.

  | Model Variant | RAM Required | File Size | Download Link |
  |---------------|--------------|-----------|---------------|
  | Q4, Q5, Q6    | Varies       | ~1-2GB    | Choose here   |
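A rough rule of thumb for picking a quantization level is to match it to your available RAM. The thresholds below are illustrative assumptions, not official figures; check the model card for actual memory requirements:

```python
def suggest_quant(ram_gb):
    """Suggest a Jan-nano GGUF quantization level from available RAM (assumed thresholds)."""
    if ram_gb >= 16:
        return "Q6_K"    # best quality of the three
    if ram_gb >= 8:
        return "Q5_K_M"  # good quality/size balance
    return "Q4_K_M"      # smallest footprint for low-RAM machines
```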
- **You need both a Tavily API key and a quantized Jan-nano GGUF model file!**
  - Get your Tavily API Key and add it to `server/.env`:
    `TAVILY_API_KEY=your-key-here`
  - **Download a quantized Jan-nano model:**
    Pick from Jan-nano GGUF releases and put your chosen `.gguf` file in `server/model/`.
  - Follow the previous setup steps for the server and client...
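Before starting the server, you can sanity-check both prerequisites. This is a hedged sketch, not part of the project: `check_setup` is a hypothetical helper, and the paths simply follow the layout described above.

```python
import tempfile
from pathlib import Path

def check_setup(server_dir):
    """Return a list of missing prerequisites (empty list = ready to run)."""
    server = Path(server_dir)
    problems = []
    env_file = server / ".env"
    if not env_file.is_file() or "TAVILY_API_KEY" not in env_file.read_text():
        problems.append("TAVILY_API_KEY missing from server/.env")
    model_dir = server / "model"
    if not (model_dir.is_dir() and any(model_dir.glob("*.gguf"))):
        problems.append("no .gguf model found in server/model/")
    return problems

# Demo against a throwaway directory laid out like server/.
demo = Path(tempfile.mkdtemp())
(demo / ".env").write_text("TAVILY_API_KEY=your-key-here\n")
(demo / "model").mkdir()
(demo / "model" / "jan-nano-128k-Q5_K_M.gguf").touch()
problems = check_setup(demo)
```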
- Jan-nano is a state-of-the-art, efficient LLM built for privacy, low resource usage, and versatility.
- Model card & benchmarks
- Quantized downloads & sizes
- Pick Q4 for low RAM, Q5/Q6 for best quality if you have more resources.
- Tavily is a fast, privacy-friendly web search API for AI research agents.
- Get your API key at https://app.tavily.com/home.
- Enables Local Mind to find and cite up-to-date answers beyond your files.
See the rest of the README above for setup, API, and use cases!
local-mind/
├── client/          # Next.js React frontend (chat UI, file upload, etc)
├── server/          # FastAPI backend, LLM, RAG, and web search agents
│   ├── app/
│   │   ├── config.py
│   │   ├── utils.py
│   │   ├── rag.py
│   │   ├── research.py
│   │   └── main.py
│   ├── data/        # Your uploaded/managed documents
│   ├── store_rag/   # Local vector store index
│   └── requirements.txt
└── README.md
Local Mind is your own personal, private, full-stack AI workspace.
- Upload files, chat with your own knowledge base
- Research with live web search (and get real, cited answers)
- All local, all private, powered by FastAPI and Next.js
- Beautiful chat interface
- File upload and management
- Live streaming answers from LLM
- Web search results shown in real time
The client talks to the FastAPI backend via simple HTTP endpoints.
- Runs your local LLM (Llama.cpp)
- Retrieval-Augmented Generation on your uploaded files
- Web research agent for up-to-date answers
- Automatic file watching and index updating
- API endpoints for chat, file management, research
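The retrieval step behind RAG can be illustrated with a toy example: rank documents against a query by word overlap. The real server uses a proper vector store (`store_rag/`); this stdlib-only sketch just shows the idea, with made-up document names.

```python
import math
from collections import Counter

def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query, docs, k=2):
    """Return the names of the top-k documents most similar to the query."""
    q = Counter(query.lower().split())
    scored = [(cosine(q, Counter(text.lower().split())), name)
              for name, text in docs.items()]
    return [name for score, name in sorted(scored, reverse=True)[:k] if score > 0]

docs = {
    "notes.txt": "llama cpp runs quantized gguf models locally",
    "recipe.txt": "add two eggs and a cup of flour",
}
top = retrieve("how to run a quantized gguf model", docs, k=1)
```

In the real pipeline, the retrieved chunks are then fed to the LLM as context so answers stay grounded in your own files.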
- Personal Knowledge Base: Chat with your notes, manuals, or code docs.
- Research Copilot: Get the best web and local insights, cited and summarized.
- Secure Team Docs: Host on LAN, share within your org; no data ever leaves.
- Developer Assistant: Index your codebase docs, get instant answers.
- Academic Summaries: Ask questions to your PDFs or web research.
git clone https://github.com/yourusername/local-mind.git
cd local-mind
cd server
python -m venv .venv
source .venv/bin/activate # On Windows: .venv\Scripts\activate
pip install -r requirements.txt
# Place your Llama.cpp GGUF model in this folder (e.g. jan-nano-128k-Q5_K_M.gguf)
python -m app.main
# or
uvicorn app.main:app --reload
- The server runs at http://localhost:8000
- Interactive API: http://localhost:8000/docs
cd ../client
npm install
npm run dev
- The client runs at http://localhost:3000
- Upload your files (PDF, TXT, etc) via the web UI.
- Chat with your knowledge base. Get answers instantly.
- Ask web questions: The agent pulls the latest info and cites real URLs.
- All private: Your files and chats never leave your device.
The FastAPI backend exposes endpoints such as:
- `POST /rag/upload/`: Upload documents
- `GET /rag/files/`: List uploaded files
- `DELETE /rag/delete/(unknown)`: Delete documents
- `GET /rag/stream`: Chat with your files (RAG)
- `GET /research/stream`: Research agent (web search)
- `GET /health`: Health check
See http://localhost:8000/docs for full details.
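Since the endpoints are plain HTTP, any client can call them. The sketch below spins up a tiny stub server standing in for the FastAPI backend so the example is self-contained; against the real server you would target `http://localhost:8000` instead.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

class StubHandler(BaseHTTPRequestHandler):
    """Minimal stand-in for the backend: answers /health like a health check."""
    def do_GET(self):
        if self.path == "/health":
            body = json.dumps({"status": "ok"}).encode()
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_response(404)
            self.end_headers()

    def log_message(self, *args):  # silence per-request logging
        pass

server = ThreadingHTTPServer(("127.0.0.1", 0), StubHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()

url = f"http://127.0.0.1:{server.server_port}/health"
with urllib.request.urlopen(url) as resp:
    health = json.load(resp)
server.shutdown()
```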
- 🛡️ Privacy First: 100% local, no data leaves your computer.
- 🧠 Multi-source AI: Mixes your files and the web.
- ⚡ Lightning Fast: No cloud lag, no rate limits.
- 🧩 Composable: Easy to extend or add your own tools.
Pull requests, issues, and suggestions are always welcome!
- Want to add more LLM models? New RAG features? Better UI? Jump in!


