📚 Local-Doc-Chat-OCR: RAG Assistant with Vision

A local RAG (Retrieval-Augmented Generation) Q&A system built with Streamlit.

Unlike traditional RAG tools, this project integrates OCR (Optical Character Recognition) capabilities, allowing you to chat not only with text documents but also with scanned PDFs and images.

Powered by DeepSeek V3 (for high-performance reasoning) and local Ollama (for privacy-preserving embedding).

⚡️ Update：

For better Chinese understanding。
- Prerequisites：
- Please download Ollama。 And run this in terminal：

ollama pull bge-m3

✨ Core Features

📄 Universal Document Support:
- PDF: Handles both standard text PDFs and Scanned/Image-based PDFs (Auto-triggers OCR).
- Markdown/TXT: Supports common text formats.
👁️ Built-in OCR Engine:
- Integrated RapidOCR + PyMuPDF for local text extraction. No need for third-party OCR APIs.
🧠 Hybrid AI Architecture:
- LLM: DeepSeek API (OpenAI SDK Compatible).
- Embedding: Local Ollama (all-minilm), zero-cost & privacy-first.
- Vector DB: ChromaDB for local persistence.
💬 Streaming Interaction:
- Real-time typewriter effect responses.

🛠️ Tech Stack

Component	Technology	Description
Frontend	Streamlit	Lightweight Python Web Framework
LLM	DeepSeek API	High performance, low cost reasoning model
Embedding	Ollama	Running `all-minilm` locally
Vector DB	ChromaDB	Local vector storage
OCR	RapidOCR	ONNX-based offline OCR engine
ETL	PyMuPDF (fitz)	PDF parsing and image extraction

🚀 Quick Start

1. Prerequisites

Ensure you have Python 3.8+ and Ollama installed.

# Clone the repository
git clone https://github.com/YOUR_USERNAME/local-rag-ocr-bot.git
cd local-rag-ocr-bot

2. Install Dependencies

pip install -r requirements.txt

Note: The OCR libraries are relatively large, so the download might take a moment.

3. Prepare Model (Ollama)

Pull the embedding model in your terminal:

ollama pull all-minilm

Make sure the Ollama service is running in the background.

4. Configure Environment

Copy the example configuration file:

# Windows
copy .env.example .env
# Mac/Linux
cp .env.example .env

Open .env and fill in your DeepSeek API Key:

# Your DeepSeek API Key
DEEPSEEK_API_KEY=sk-xxxxxxxxxxxxxxxxxxxxxxxx

# Keep others as default
DEEPSEEK_BASE_URL=https://api.deepseek.com
OLLAMA_BASE_URL=http://localhost:11434
OLLAMA_MODEL=all-minilm
CHROMA_DB_PATH=./chroma_db

5. Run App

streamlit run app.py

The browser will automatically open at http://localhost:8501.

📂 Project Structure

.
├── app.py                  # Main Streamlit application
├── rag_engine.py           # Core logic (OCR, Vectorization, RAG)
├── requirements.txt        # Python dependencies
├── .env.example            # Env template (Safe to commit)
├── .gitignore              # Git ignore rules
├── README.md               # English Documentation
└── README_CN.md            # Chinese Documentation

⚠️ Notes

OCR Speed: If you upload a scanned PDF, the system performs page-by-page recognition. Depending on your CPU, this may take longer than processing standard text. Please watch the terminal for progress.
DeepSeek Quota: Ensure your API Key has sufficient balance.
Reset Data: To clear the knowledge base, click the "Clear Knowledge Base" button in the sidebar or manually delete the local chroma_db folder.

🙌 Acknowledgments

Special thanks to the following tools that made this project possible:

Cursor: For the incredible AI-assisted coding experience.
Google Gemini: For providing architectural advice and debugging help.
DeepSeek: For the powerful reasoning API.

📄 License

This project is licensed under the MIT License. Feel free to Fork and Star!

Name		Name	Last commit message	Last commit date
Latest commit History 12 Commits
.env.example		.env.example
.gitignore		.gitignore
.python-version		.python-version
README.md		README.md
README_CN.md		README_CN.md
app.py		app.py
check_env.py		check_env.py
fix_env.py		fix_env.py
pyrightconfig.json		pyrightconfig.json
rag_engine.py		rag_engine.py
reset_db.py		reset_db.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

📚 Local-Doc-Chat-OCR: RAG Assistant with Vision

⚡️ Update：

✨ Core Features

🛠️ Tech Stack

🚀 Quick Start

1. Prerequisites

2. Install Dependencies

3. Prepare Model (Ollama)

4. Configure Environment

5. Run App

📂 Project Structure

⚠️ Notes

🙌 Acknowledgments

📄 License

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

📚 Local-Doc-Chat-OCR: RAG Assistant with Vision

⚡️ Update：

✨ Core Features

🛠️ Tech Stack

🚀 Quick Start

1. Prerequisites

2. Install Dependencies

3. Prepare Model (Ollama)

4. Configure Environment

5. Run App

📂 Project Structure

⚠️ Notes

🙌 Acknowledgments

📄 License

About

Topics

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages