Skip to content

umair801/rag-knowledge-agent

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

13 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

RAG Knowledge Base Agent

Enterprise AI — Instant Answers from 10,000+ Documents

Live API: https://rag.datawebify.com/api/v1/health
Interactive Docs: https://rag.datawebify.com/docs
Portfolio: datawebify.com/projects/agai4-rag-knowledge-agent
Part of: Agentic AI Portfolio — Project 4 of 50


The Problem

Enterprise teams waste thousands of hours per month searching through PDFs, policy documents, manuals, and knowledge bases to answer repetitive questions. A typical support or operations team of 3 agents spends 60-70% of their time on lookups that should be instant.

Cost of manual knowledge lookup:

  • 3 support agents x $25/hr x 160 hrs/month = $12,000/month
  • Average query resolution time: 4-6 minutes
  • Error rate from outdated documents: 15-20%

The Solution

A production-ready RAG (Retrieval Augmented Generation) system that ingests your company documents and answers any question in under 3 seconds — with source citations, conversation memory, and a REST API that plugs into any existing SaaS product.

After deployment:

  • Query resolution time: under 3 seconds (vs 4-6 minutes)
  • Agent hours saved: 400+ hrs/month
  • Cost per query: $0.002 (vs $1.67 human cost)
  • Availability: 24/7 with zero agent involvement

Business Metrics

Metric Before After Change
Query resolution time 4-6 min under 3 sec 99% faster
Monthly agent cost $12,000 $180 (AI cost) 98.5% savings
Document coverage Manual search 10,000+ docs instant 100x scale
Availability Business hours 24/7 Always on
Accuracy 80% (human error) 94%+ (cited sources) 18% improvement

System Architecture

┌─────────────────────────────────────────────┐
│              Ingestion Pipeline              │
│     (PDF / DOCX / URL → Chunks → Embeds)    │
└──────────────────┬──────────────────────────┘
                   │
                   ▼
         ┌─────────────────┐
         │  Vector Store   │
         │   (Pinecone)    │
         └────────┬────────┘
                  │
       ┌──────────▼──────────┐
       │    RAG Query Agent  │
       │  (Semantic Search + │
       │   GPT-4o-mini)      │
       └──────────┬──────────┘
                  │
       ┌──────────▼──────────┐
       │   FastAPI Backend   │
       │  (Chat + REST API)  │
       └──────────┬──────────┘
                  │
       ┌──────────▼──────────┐
       │   Metrics Layer     │
       │ (Latency, Accuracy, │
       │  Query Volume)      │
       └─────────────────────┘

Tech Stack

Layer Technology
LLM OpenAI GPT-4o-mini
Embeddings OpenAI text-embedding-3-small (1536d)
Vector Database Pinecone
RAG Framework LangChain
Document Loaders PyMuPDF, python-docx, BeautifulSoup
API Layer FastAPI + Uvicorn
Metadata + Logs Supabase (PostgreSQL)
Deployment Docker + Railway
Language Python 3.12

API Endpoints

Method Endpoint Description
GET /api/v1/health System health check
POST /api/v1/ingest/file Upload PDF, DOCX, or TXT
POST /api/v1/ingest/url Ingest content from URL
POST /api/v1/chat Ask a question, get cited answer
POST /api/v1/session/new Start a conversation session
GET /api/v1/session/{id}/history Retrieve chat history
DELETE /api/v1/session/{id} End a session
GET /api/v1/metrics System performance metrics

Full interactive documentation: https://rag.datawebify.com/docs


Key Features

Multi-format ingestion Ingest PDF, DOCX, TXT files and any public URL. Automatic chunking with overlap ensures no context is lost at document boundaries.

Semantic search Questions are matched by meaning, not keywords. "What is our cancellation policy?" finds the right clause even if the document says "termination procedure."

Source citations Every answer includes the source document and relevance score. No hallucinations — the model only answers from your documents.

Conversation memory Multi-turn sessions allow follow-up questions. Users can ask "tell me more about that" without repeating context.

Metadata filtering Restrict search to specific document categories, source types, or filenames. Query only HR documents, only legal contracts, etc.

Metrics tracking Every query is logged with latency, retrieval score, and session ID. Business reporting shows queries handled, hours saved, cost per query.


Quick Start

1. Clone the repository

git clone https://github.com/umair801/rag-knowledge-agent
cd rag-knowledge-agent

2. Install dependencies

python -m venv venv
venv\Scripts\activate
pip install -r requirements.txt

3. Configure environment

cp .env.example .env
# Add your API keys to .env

4. Run locally

uvicorn main:app --reload --port 8000

5. Ingest a document

curl -X POST https://rag.datawebify.com/api/v1/ingest/url \
  -H "Content-Type: application/json" \
  -d '{"url": "https://yoursite.com/docs", "category": "support"}'

6. Ask a question

curl -X POST https://rag.datawebify.com/api/v1/chat \
  -H "Content-Type: application/json" \
  -d '{"query": "What is your refund policy?", "top_k": 5}'

Project Structure

rag-knowledge-agent/
├── app/
│   ├── ingestion/
│   │   ├── loaders.py       # PDF, DOCX, URL, TXT loaders
│   │   ├── chunker.py       # Token-aware text chunking
│   │   ├── embedder.py      # OpenAI embeddings + Pinecone storage
│   │   └── pipeline.py      # Orchestration: load → chunk → embed → store
│   ├── retrieval/
│   │   ├── retriever.py     # Semantic search via Pinecone
│   │   ├── generator.py     # GPT-4o-mini answer generation
│   │   ├── memory.py        # Multi-turn conversation sessions
│   │   └── rag_agent.py     # Full pipeline orchestrator
│   ├── api/
│   │   ├── routes.py        # All FastAPI endpoints
│   │   └── app.py           # App factory + middleware
│   ├── metrics/
│   │   ├── tracker.py       # Query logging to Supabase
│   │   └── reporter.py      # Business metrics reporter
│   └── models/
│       └── schemas.py       # Pydantic data models
├── Dockerfile
├── railway.toml
├── requirements.txt
├── main.py
└── README.md

Environment Variables

OPENAI_API_KEY=           # OpenAI API key
PINECONE_API_KEY=         # Pinecone API key
PINECONE_INDEX_NAME=      # Pinecone index name
PINECONE_ENVIRONMENT=     # Pinecone region
SUPABASE_URL=             # Supabase project URL
SUPABASE_KEY=             # Supabase anon key
APP_ENV=                  # development or production
LOG_LEVEL=                # INFO or DEBUG

Use Cases

  • SaaS companies — Answer customer questions from help docs instantly
  • Legal firms — Query contracts and case files without manual search
  • Healthcare — Search clinical guidelines and compliance documents
  • E-commerce — Product catalog and policy Q&A at scale
  • Enterprise IT — Internal knowledge base for support teams

Engagement

Building a RAG knowledge base for your organization?

Muhammad Umair
Agentic AI Specialist and Enterprise Consultant
datawebify.com | github.com/umair801


AgAI_4 of 50 — Enterprise Agentic AI Portfolio

About

Enterprise RAG system — instant answers from 10,000+ documents. GPT-4o-mini + Pinecone + FastAPI. Live at rag.datawebify.com

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors