A hybrid RAG (Retrieval-Augmented Generation) chatbot that answers questions from your documents and from live web search, built with LangChain, FAISS, and Streamlit.
- Name: Vimal Solanki
- Email: vimal162002@email.com
- 🖥️ Live Demo: multi-document-rag-ai-chatbot.streamlit.app
- 📹 Video Explanation: Watch here
Organizations usually store knowledge across multiple unstructured documents like PDFs, reports, and notes.
However, these documents are static and do not contain real-time information.
This creates a gap where users need both internal knowledge and up-to-date data.
To solve this problem, I built a hybrid RAG chatbot that combines document retrieval with live web search.
This system can:
- 📄 Search across multiple documents at the same time, using semantic search over your private files
- 🌐 Fetch real-time information from the web via Tavily
- 🔀 Run in hybrid mode, checking your documents first and falling back to the web if needed
- 📌 Clearly distinguish between document-based and real-time answers
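The hybrid mode described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: `hybrid_retrieve`, `search_docs`, `search_web`, and the `min_score` threshold are hypothetical names, and the sketch assumes the document retriever returns a similarity score where higher means more relevant. The README does not state how the app decides that the web is "needed", so the threshold rule here is an assumption.

```python
from typing import Callable, List, Tuple

# Hypothetical signatures: the document retriever returns (text, score) pairs,
# the web search returns plain text snippets.
DocSearch = Callable[[str], List[Tuple[str, float]]]
WebSearch = Callable[[str], List[str]]


def hybrid_retrieve(
    query: str,
    search_docs: DocSearch,
    search_web: WebSearch,
    min_score: float = 0.5,  # assumed relevance cutoff, not taken from the project
) -> Tuple[str, List[str]]:
    """Return (source_label, passages): documents first, web as a fallback."""
    hits = [text for text, score in search_docs(query) if score >= min_score]
    if hits:
        return "documents", hits
    # No sufficiently relevant document passage: fall back to live web search.
    return "web", search_web(query)


if __name__ == "__main__":
    docs = lambda q: [("Attention weighs tokens by relevance.", 0.82)]
    web = lambda q: ["Snippet from a web result."]
    print(hybrid_retrieve("Explain attention", docs, web))
```

Returning the source label alongside the passages is one simple way to let the UI mark an answer as document-based or real-time.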
| Component | Tool |
|---|---|
| Language | Python |
| LLM | LLaMA 3.3 70B via Groq |
| Embeddings | sentence-transformers/all-MiniLM-L6-v2 (FREE, local) |
| Vector DB | FAISS |
| Web Search | Tavily |
| Orchestration | LangChain |
| UI | Streamlit |
multi-document-rag-search-engine/
│
├── config/
│ └── settings.py # loads all .env variables
│
├── core/
│ ├── ingestion.py # loads and chunks PDF/TXT files
│ ├── embedding.py # HuggingFace embedding model
│ ├── vector_store.py # FAISS index management
│ └── chain.py # RAG pipeline (retrieve → context → answer)
│
├── tools/
│ └── tavily_search.py # Tavily web search integration
│
├── ui/
│ ├── chat.py # main controller connecting UI and backend
│ └── components.py # all Streamlit UI components
│
├── data/documents/ # sample documents for testing
├── main.py # app entry point
├── .env # API keys and config (not committed)
└── requirements.txt
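As an illustration of what a module like `config/settings.py` might do, the sketch below reads configuration from environment variables into a typed object. The variable names match the `.env` keys used by this project; the `Settings` class, the `load_settings` function, and the fallback defaults are assumptions for the example. It also assumes the `.env` file has already been loaded into the process environment (for instance with python-dotenv), which is not shown here.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for the values defined in .env."""
    embedding_model: str
    chunk_size: int
    chunk_overlap: int
    top_k_results: int
    model_name: str
    temperature: float
    top_k_web_results: int
    groq_api_key: str
    tavily_api_key: str


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build Settings from a mapping of environment variables."""
    return Settings(
        embedding_model=env.get("EMBEDDING_MODEL", "sentence-transformers/all-MiniLM-L6-v2"),
        chunk_size=int(env.get("CHUNK_SIZE", "1000")),
        chunk_overlap=int(env.get("CHUNK_OVERLAP", "150")),
        top_k_results=int(env.get("TOP_K_RESULTS", "6")),
        model_name=env.get("GPT_MODEL_NAME", "llama-3.3-70b-versatile"),
        temperature=float(env.get("TEMPERATURE", "0")),
        top_k_web_results=int(env.get("TOP_K_WEB_RESULTS", "3")),
        # API keys have no sensible default; an empty string signals "not set".
        groq_api_key=env.get("GROQ_API_KEY", ""),
        tavily_api_key=env.get("TAVILY_API_KEY", ""),
    )


if __name__ == "__main__":
    print(load_settings({"CHUNK_SIZE": "500"}))
```

Passing the mapping as a parameter keeps the loader easy to test without touching the real environment.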
main.py ──────────────────▶ Streamlit UI (entry point)
│
▼
chat.py ──────────────────▶ Controller (connects UI ↔ backend)
│
├──▶ ingestion.py ──▶ Load & chunk documents
│
├──▶ embedding.py ──▶ Convert text to vectors
│
├──▶ vector_store.py ──▶ Store & search in FAISS
│
├──▶ chain.py ──▶ RAG pipeline (retrieve → LLM → answer)
│
└──▶ tavily_search.py──▶ Live web search
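The retrieve → context → answer flow can be sketched with the standard library alone. This is a toy stand-in, not the project's pipeline: word counts replace the sentence-transformers embeddings, a sorted list replaces the FAISS index, and `llm` is any callable that takes a prompt string (in the real app this role is played by the Groq-hosted model). All function names are hypothetical.

```python
import math
from collections import Counter
from typing import Callable, List


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts (stand-in for a real model)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query: str, chunks: List[str], top_k: int = 6) -> List[str]:
    """Return the top_k chunks most similar to the query."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:top_k]


def answer(query: str, chunks: List[str], llm: Callable[[str], str], top_k: int = 6) -> str:
    """Retrieve context, build a prompt, and pass it to the LLM callable."""
    context = "\n\n".join(retrieve(query, chunks, top_k))
    prompt = f"Answer using only this context:\n{context}\n\nQuestion: {query}"
    return llm(prompt)


if __name__ == "__main__":
    chunks = ["faiss stores vectors", "streamlit renders the ui", "tavily searches the web"]
    # An echo function stands in for the LLM so the prompt is visible.
    print(answer("what stores vectors", chunks, llm=lambda p: p, top_k=1))
```

The default `top_k=6` mirrors the `TOP_K_RESULTS` value from the project's configuration.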
1. Clone the repository

       git clone https://github.com/yourusername/multi-document-rag-search-engine.git
       cd multi-document-rag-search-engine

2. Install dependencies

       pip install -r requirements.txt

3. Set up environment variables

       cp .env.example .env
       # Add your API keys in .env

       EMBEDDING_MODEL=sentence-transformers/all-MiniLM-L6-v2
       CHUNK_SIZE=1000
       CHUNK_OVERLAP=150
       TOP_K_RESULTS=6
       GPT_MODEL_NAME=llama-3.3-70b-versatile
       GROQ_API_KEY=your_groq_api_key
       TEMPERATURE=0
       TAVILY_API_KEY=your_tavily_api_key
       TOP_K_WEB_RESULTS=3

4. Run the app

       streamlit run main.py

Once the app is running:
- Upload one or more PDF or TXT files
- Click Process Documents
- Select a retrieval mode: Documents, Web, or Hybrid
- Ask your question and get answers with citations
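To make the `CHUNK_SIZE` and `CHUNK_OVERLAP` settings concrete, here is a minimal character-based sliding-window splitter. It is only a sketch of the idea: the project's `ingestion.py` may well use a LangChain text splitter with different boundary rules, and `chunk_text` is a hypothetical name.

```python
from typing import List


def chunk_text(text: str, chunk_size: int = 1000, chunk_overlap: int = 150) -> List[str]:
    """Split text into windows of chunk_size characters that overlap by chunk_overlap."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far each window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks


if __name__ == "__main__":
    sample = "".join(chr(97 + i % 26) for i in range(2000))
    parts = chunk_text(sample)
    print(len(parts), [len(p) for p in parts])
```

With the defaults, each chunk repeats the last 150 characters of the previous one, so a sentence cut at a boundary still appears whole in at least one chunk in most cases.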
| Query | Best Mode |
|---|---|
| "Explain attention mechanism" | 📄 Documents |
| "Latest news about GPT-5" | 🌐 Web |
| "How does RAG compare to current LLM tools?" | 🔀 Hybrid |
| Service | Free Tier | Link |
|---|---|---|
| Groq | ✅ Yes | console.groq.com |
| Tavily | ✅ Yes | tavily.com |
| HuggingFace Embeddings | ✅ 100% Free | Runs locally |
- Built a hybrid RAG system with document + web retrieval
- Used FAISS for fast semantic vector search
- Integrated Tavily for real-time web results
- Implemented citation-aware answer generation
- Deployed a full AI app with Streamlit
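One possible way to get citation-aware answers is to number each retrieved passage, label it with its source, and ask the model to cite those numbers. The sketch below shows that prompt-building step only; `build_cited_context`, `build_prompt`, and the prompt wording are assumptions for illustration, not the project's actual implementation.

```python
from typing import List, Tuple


def build_cited_context(passages: List[Tuple[str, str]]) -> str:
    """Turn (source, text) pairs into a numbered context block."""
    lines = [
        f"[{i}] ({source}) {text}"
        for i, (source, text) in enumerate(passages, start=1)
    ]
    return "\n".join(lines)


def build_prompt(question: str, passages: List[Tuple[str, str]]) -> str:
    """Combine instructions, numbered sources, and the question into one prompt."""
    return (
        "Answer the question using only the numbered sources below. "
        "Cite sources inline as [n].\n\n"
        + build_cited_context(passages)
        + f"\n\nQuestion: {question}"
    )


if __name__ == "__main__":
    passages = [
        ("report.pdf", "FAISS performs nearest-neighbour search over vectors."),
        ("web: example.com", "Tavily returns live search results."),
    ]
    print(build_prompt("How are documents searched?", passages))
```

Because each passage carries its source label, the same numbering can be reused in the UI to show whether a cited passage came from an uploaded document or from the web.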
This project is for educational and portfolio purposes.