Self-hostable agentic RAG platform powered by Ollama, ChromaDB, Neo4j, and CrewAI
OpenAgentRAG is an enterprise-grade, fully self-hostable Retrieval-Augmented Generation (RAG) platform that combines multi-agent orchestration with advanced retrieval techniques. Built on open-source technologies, it enables organizations to deploy intelligent document query systems without relying on external APIs or cloud services.
- Multi-Agent Architecture: CrewAI-powered agents for query understanding, retrieval, synthesis, and evaluation
- Graph RAG: Knowledge graph integration with Neo4j. Entities and relationships are extracted from documents and used to enrich retrieval with connected context
- Hybrid Retrieval: Combines semantic search (ChromaDB) with keyword matching and knowledge graph traversal for optimal relevance
- Query Expansion: Automatically generates query variations to improve retrieval coverage
- Self-Improving Pipeline: Feedback loop for continuous quality enhancement based on user interactions
- 100% Self-Hosted: Run entirely on your infrastructure with Ollama for LLM inference
- Multi-Format Support: Ingest PDF, DOCX, TXT, Markdown, CSV, JSON, and HTML documents
- Phase-Based Chunking: Intelligent document segmentation that preserves semantic boundaries
- Real-Time Evaluation: Built-in metrics for retrieval relevance, groundedness, and answer quality
- Modern Web UI: Next.js 14 frontend with real-time streaming responses
- Docker-First Deployment: One-command setup with docker-compose
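
The feature list does not say how the semantic, keyword, and graph results are merged. One common option is reciprocal rank fusion (RRF). The sketch below assumes each retrieval path returns a ranked list of chunk IDs; the function name `reciprocal_rank_fusion`, the constant `k=60`, and the sample IDs are illustrative assumptions, not OpenAgentRAG's actual code.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def reciprocal_rank_fusion(rankings: Sequence[Sequence[str]], k: int = 60) -> List[str]:
    """Merge several ranked lists of chunk IDs into one list.

    Each list contributes 1 / (k + rank) per chunk (rank starts at 1), so a
    chunk that ranks well in several lists rises to the top.
    """
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] += 1.0 / (k + rank)
    # Sort by fused score (highest first); break ties by ID for determinism.
    return sorted(scores, key=lambda cid: (-scores[cid], cid))


# Hypothetical ranked results from the three retrieval paths.
semantic = ["c3", "c1", "c7"]
keyword = ["c1", "c9", "c3"]
graph = ["c1", "c4"]

fused = reciprocal_rank_fusion([semantic, keyword, graph])
print(fused[:3])
```

RRF only needs ranks, so it avoids having to normalize similarity scores that come from different retrievers.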
Request flow, top to bottom:
- User Interface (Next.js 14 web application), which talks to the backend over HTTP/WebSocket
- Backend API Layer (FastAPI Python service)
  - REST API endpoints
  - WebSocket streaming
  - Document upload and processing
- Agent Orchestration (CrewAI multi-agent system)
  - Query Agent (expansion) -> Retrieval Agent -> Synthesis Agent (answer generation) -> Evaluation Agent (quality assessment)
- Storage and inference services
  - Vector Store (ChromaDB): embeddings, semantic search
  - LLM Inference (Ollama): Llama 3, Mistral, custom models
  - Document Store (local FS): raw files, metadata, processed
  - Knowledge Graph (Neo4j), used for graph-enhanced retrieval: entities, relationships, Graph RAG
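
The agent flow in the architecture (query, retrieval, synthesis, evaluation) can be pictured as four stages run in sequence. The sketch below uses plain Python callables rather than the CrewAI API; `PipelineResult`, `run_pipeline`, and the stub stages are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class PipelineResult:
    query: str
    variations: List[str] = field(default_factory=list)
    contexts: List[str] = field(default_factory=list)
    answer: str = ""
    score: float = 0.0


def run_pipeline(
    query: str,
    expand: Callable[[str], List[str]],
    retrieve: Callable[[str], List[str]],
    synthesize: Callable[[str, List[str]], str],
    evaluate: Callable[[str, List[str]], float],
) -> PipelineResult:
    """Run the four stages in the order shown in the architecture."""
    result = PipelineResult(query=query)
    # Query stage: keep the original query and add generated variations.
    result.variations = [query] + expand(query)
    # Retrieval stage: gather contexts for every variation, dropping
    # duplicates while preserving first-seen order.
    seen = set()
    for variation in result.variations:
        for ctx in retrieve(variation):
            if ctx not in seen:
                seen.add(ctx)
                result.contexts.append(ctx)
    # Synthesis stage: produce an answer from the collected contexts.
    result.answer = synthesize(query, result.contexts)
    # Evaluation stage: score the answer against its contexts.
    result.score = evaluate(result.answer, result.contexts)
    return result


# Stub stages standing in for the real agents (assumptions for the demo).
demo = run_pipeline(
    "What is Graph RAG?",
    expand=lambda q: [q.lower()],
    retrieve=lambda q: ["Graph RAG enriches retrieval with a knowledge graph."],
    synthesize=lambda q, ctxs: ctxs[0] if ctxs else "No context found.",
    evaluate=lambda answer, ctxs: 1.0 if answer in ctxs else 0.0,
)
print(demo.answer, demo.score)
```

Passing the stages in as callables keeps the ordering logic testable without an LLM or a database.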
- Docker 24.0+ and Docker Compose 2.0+
- 16GB RAM minimum (32GB recommended for larger models)
- 50GB free disk space (for models and data)
- Optional: NVIDIA GPU with CUDA support for faster inference
1. Clone the repository (the trailing argument names the target directory so that the following `cd` works):

       git clone https://github.com/theja0473/RAG-AS-SERVICE.git open-agent-rag
       cd open-agent-rag

2. Configure environment:

       cp .env.example .env
       # Edit .env with your preferred settings (defaults work out of the box)

3. Start all services:

       docker compose up -d

4. Pull the LLM model (required on first run):

       docker compose exec ollama ollama pull llama3

5. Access the application:
- Web UI: http://localhost:3000
- Backend API: http://localhost:8000
- API Documentation: http://localhost:8000/docs
- ChromaDB Admin: http://localhost:8100
- Neo4j Browser: http://localhost:7474 (credentials: neo4j/openagentrag)
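
Once the containers are up, a quick reachability check over the URLs above can save a trip through the browser. This is a minimal sketch using only the standard library; `http_ok` and `check_services` are hypothetical helpers, and the choice of which URLs to probe is an assumption (the API is probed at `/docs` because the list above names that path explicitly).

```python
import http.client
import urllib.request
from typing import Callable, Dict

# URLs taken from the access list above.
SERVICES: Dict[str, str] = {
    "Web UI": "http://localhost:3000",
    "API docs": "http://localhost:8000/docs",
    "Neo4j Browser": "http://localhost:7474",
}


def http_ok(url: str, timeout: float = 3.0) -> bool:
    """Return True if the URL answers with an HTTP status below 400."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status < 400
    except (OSError, http.client.HTTPException):
        # URLError and HTTPError are OSError subclasses, so refused
        # connections, timeouts, and 4xx/5xx responses all land here.
        return False


def check_services(
    services: Dict[str, str],
    probe: Callable[[str], bool] = http_ok,
) -> Dict[str, bool]:
    """Probe every service and report which ones responded."""
    return {name: probe(url) for name, url in services.items()}


# Usage (requires the stack to be running):
#   for name, up in check_services(SERVICES).items():
#       print(f"{name}: {'up' if up else 'DOWN'}")
```

The `probe` parameter exists so the function can be exercised without a live stack.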
Run the setup script to verify all services and initialize the system:

    chmod +x scripts/setup.sh
    ./scripts/setup.sh

This script will:
- Wait for all services to become healthy
- Pull the default LLM model
- Create the default ChromaDB collection
- Verify the complete pipeline
All configuration is managed through the .env file. Key parameters:
| Variable | Default | Description |
|---|---|---|
| `LLM_MODEL` | `llama3` | Ollama model for generation (llama3, mistral, mixtral) |
| `EMBEDDING_MODEL` | `sentence-transformers/all-MiniLM-L6-v2` | Model for vector embeddings |
| `CHUNK_SIZE` | `512` | Document chunk size in tokens |
| `CHUNK_OVERLAP` | `50` | Overlap between chunks |
| `TOP_K_RETRIEVAL` | `5` | Number of chunks to retrieve |
| `SIMILARITY_THRESHOLD` | `0.7` | Minimum similarity score for retrieval |
| `ENABLE_QUERY_EXPANSION` | `true` | Generate query variations |
| `NUM_QUERY_VARIATIONS` | `2` | Number of query alternatives |
| `NEO4J_URI` | `bolt://neo4j:7687` | Neo4j connection URI |
| `NEO4J_USERNAME` | `neo4j` | Neo4j username |
| `NEO4J_PASSWORD` | `openagentrag` | Neo4j password |
| `GRAPH_RAG_ENABLED` | `true` | Enable knowledge graph enrichment |
See .env.example for the complete list of configuration options.
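
To make the chunking and retrieval parameters concrete, here is a minimal sketch of how `CHUNK_SIZE`, `CHUNK_OVERLAP`, `TOP_K_RETRIEVAL`, and `SIMILARITY_THRESHOLD` could interact. It assumes a simple sliding window over a pre-tokenized list and assumes the threshold is applied before the top-k cut; `chunk_tokens` and `select_chunks` are hypothetical names, and the real pipeline may differ.

```python
from typing import List, Tuple


def chunk_tokens(tokens: List[str], chunk_size: int = 512, overlap: int = 50) -> List[List[str]]:
    """Split tokens into windows of chunk_size that share `overlap` tokens."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # the last window already reaches the end
    return chunks


def select_chunks(
    scored: List[Tuple[str, float]],
    top_k: int = 5,
    threshold: float = 0.7,
) -> List[Tuple[str, float]]:
    """Drop results below the threshold, then keep the top_k best."""
    kept = [item for item in scored if item[1] >= threshold]
    kept.sort(key=lambda item: item[1], reverse=True)
    return kept[:top_k]


# 1000 placeholder tokens, chunked with the documented defaults.
tokens = [f"t{i}" for i in range(1000)]
chunks = chunk_tokens(tokens)
print([len(c) for c in chunks])

# Hypothetical (chunk_id, similarity) pairs.
hits = select_chunks([("a", 0.91), ("b", 0.65), ("c", 0.78)])
print(hits)
```

With the defaults, each window advances by 462 tokens (512 minus 50), so consecutive chunks repeat 50 tokens of context.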
| Format | Extension | Parsing Strategy |
|---|---|---|
| PDF | `.pdf` | PyMuPDF with text/table extraction |
| Word | `.docx` | python-docx with style preservation |
| Text | `.txt` | Plain text with UTF-8 encoding |
| Markdown | `.md` | CommonMark parser with header-based chunking |
| CSV | `.csv` | Pandas with configurable delimiter |
| JSON | `.json` | Recursive extraction with JSONPath support |
| HTML | `.html` | BeautifulSoup with boilerplate removal |
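
Choosing a parser by file extension can be sketched as a lookup table. The mapping below is copied from the format table; the `parsing_strategy` helper and its error message are assumptions for illustration and do not reflect the backend's actual dispatch code.

```python
from pathlib import Path

# Extension -> parsing strategy, copied from the supported-formats table.
PARSING_STRATEGIES = {
    ".pdf": "PyMuPDF with text/table extraction",
    ".docx": "python-docx with style preservation",
    ".txt": "Plain text with UTF-8 encoding",
    ".md": "CommonMark parser with header-based chunking",
    ".csv": "Pandas with configurable delimiter",
    ".json": "Recursive extraction with JSONPath support",
    ".html": "BeautifulSoup with boilerplate removal",
}


def parsing_strategy(filename: str) -> str:
    """Return the strategy for a file, matching the extension case-insensitively."""
    suffix = Path(filename).suffix.lower()
    try:
        return PARSING_STRATEGIES[suffix]
    except KeyError:
        raise ValueError(f"unsupported file type: {suffix or filename!r}") from None


print(parsing_strategy("Quarterly Report.PDF"))
```

Rejecting unknown extensions up front keeps unsupported uploads from reaching the chunking stage.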
| Component | Technology | Purpose |
|---|---|---|
| Backend | FastAPI (Python 3.9+) | REST API and async processing |
| Frontend | Next.js 14 (React 18) | Modern web interface with App Router |
| Vector DB | ChromaDB 0.4+ | Semantic search and embedding storage |
| Knowledge Graph | Neo4j 5 Community | Entity/relationship storage and graph traversal |
| LLM | Ollama | Local LLM inference engine |
| Agents | CrewAI | Multi-agent orchestration framework |
| Embeddings | Sentence Transformers | Text-to-vector conversion |
| Containerization | Docker + Docker Compose | Service orchestration |
Backend:

    cd backend
    python -m venv .venv
    source .venv/bin/activate  # On Windows: .venv\Scripts\activate
    pip install -r requirements.txt
    uvicorn main:app --reload --host 0.0.0.0 --port 8000

Frontend:

    cd frontend
    npm install
    npm run dev

External Dependencies:
- Install Ollama locally: https://ollama.ai
- Run ChromaDB: `docker run -p 8100:8000 chromadb/chroma:latest`
- Run Neo4j: `docker run -p 7474:7474 -p 7687:7687 -e NEO4J_AUTH=neo4j/openagentrag neo4j:5-community`

open-agent-rag/
├── backend/
│   ├── agents/            # CrewAI agent definitions
│   ├── config/            # Application configuration (Pydantic Settings)
│   ├── database/          # SQLAlchemy models, ChromaDB and Neo4j clients
│   ├── rag/               # RAG pipeline (chunking, embedding, retrieval, graph retrieval, generation)
│   ├── routers/           # FastAPI route handlers
│   ├── services/          # Business logic layer
│   ├── main.py            # FastAPI application entry point
│   ├── Dockerfile
│   └── requirements.txt
├── frontend/
│   ├── src/
│   │   ├── app/           # Next.js 14 App Router pages
│   │   ├── components/    # React components
│   │   └── lib/           # API client
│   ├── Dockerfile
│   └── package.json
├── docs/                  # Detailed documentation
├── scripts/               # Utility scripts
├── docker-compose.yml
├── .env.example
└── README.md
# Backend tests
cd backend
pytest tests/ --cov=app --cov-report=term-missing
# Frontend tests
cd frontend
npm test
# Integration tests
docker compose -f docker-compose.test.yml up --abort-on-container-exit
Coming soon: Web UI screenshots showing document upload, chat interface, and evaluation dashboard
We welcome contributions! Please see CONTRIBUTING.md for guidelines.
Quick contribution workflow:
1. Fork the repository
2. Create a feature branch: `git checkout -b feature/your-feature`
3. Make your changes and add tests
4. Run the test suite: `pytest` (backend) and `npm test` (frontend)
5. Commit with conventional commits: `feat: add XYZ capability`
6. Push and create a Pull Request
This project is licensed under the Apache License 2.0 - see the LICENSE file for details.
If you use OpenAgentRAG in your research or project, please cite:
@software{openagentrag2026,
title = {OpenAgentRAG: Self-Hostable Agentic RAG Platform},
author = {OpenAgentRAG Contributors},
year = {2026},
url = {https://github.com/theja0473/RAG-AS-SERVICE},
version = {0.1.0}
}
- Core RAG pipeline with hybrid retrieval
- Multi-agent orchestration with CrewAI
- Docker-based deployment
- Graph RAG with knowledge graph integration (Neo4j)
- Multi-modal support (images, audio)
- Fine-tuning pipeline for domain adaptation
- Kubernetes deployment manifests
- RESTful API for programmatic access
- Multi-tenancy support
- Documentation: docs/
- Issues: GitHub Issues
- Discussions: GitHub Discussions
Built with open-source technologies for the open-source community.