A production-grade multi-agent RAG system for intelligent customer support. Handles text queries, error screenshots, and system logs with confidence-based routing, 3-layer guardrails, and human escalation.
- Multi-agent RAG pipeline - Classifier → Retriever → Responder → Evaluator → Router
- Multimodal inputs - text queries, error screenshots (Gemini Vision), and system error logs
- 3-layer guardrails - regex fast path, LLM-based injection + toxicity detection, PII masking
- Confidence-based routing - automatically escalates low-confidence queries to human agents
- Conversation memory - remembers last 3 exchanges for natural follow-up questions
- Evaluator fast path - skips LLM call for high/low similarity cases to reduce latency
- Category-aware reranking - boosts retrieved docs that match the classified category
- Gradio chat UI - clean interface with screenshot upload, log paste, and example queries
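The evaluator fast path and confidence-based routing described above can be sketched as follows. The threshold values and function signature are illustrative assumptions, not the project's actual settings:

```python
# Illustrative sketch of the evaluator fast path + router.
# HIGH_SIM / LOW_SIM are assumed thresholds, not the project's real values.

HIGH_SIM = 0.75   # above this: confident enough to respond without an LLM check
LOW_SIM = 0.35    # below this: clearly off-topic, escalate without an LLM check

def evaluate(similarity: float, llm_check=None) -> str:
    """Return 'RESPOND' or 'ESCALATE', skipping the LLM call when possible."""
    if similarity >= HIGH_SIM:
        return "RESPOND"        # fast path: no LLM call needed
    if similarity <= LOW_SIM:
        return "ESCALATE"       # fast path: no LLM call needed
    # Mid-range similarity: fall back to an LLM confidence check
    confident = llm_check() if llm_check else False
    return "RESPOND" if confident else "ESCALATE"
```

The fast path means only ambiguous, mid-range cases pay for the extra LLM round-trip, which is where the latency savings come from.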
Customer Input (text + optional image/logs)
↓
┌─────────────────┐
│ Guardrails │ ← PII masking, injection detection, toxicity filter
└────────┬────────┘
│ ALLOW / MASK / BLOCK
↓
┌─────────────────┐
│ Classifier │ ← Intent + category detection (Llama 3.1)
└────────┬────────┘
↓
┌─────────────────┐
│ Retriever │ ← Semantic search over 26K KB docs (ChromaDB)
└────────┬────────┘
↓
┌─────────────────┐
│ Responder │ ← Grounded answer generation (Llama 3.1)
└────────┬────────┘
↓
┌─────────────────┐
│ Evaluator │ ← Confidence scoring (similarity + LLM check)
└────────┬────────┘
↓
┌─────────────────┐
│ Router │ ← RESPOND or ESCALATE TO HUMAN
└─────────────────┘
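The flow above can be sketched as a simple orchestrator. The agent names mirror the diagram, but the function signatures and the 0.5 confidence cutoff are assumptions for illustration, not the project's actual API:

```python
# Hypothetical pipeline orchestration matching the diagram above.
# Each agent is passed in as a callable; signatures are illustrative.

def run_pipeline(query, guardrails, classifier, retriever, responder, evaluator):
    verdict, safe_query = guardrails(query)      # ALLOW / MASK / BLOCK
    if verdict == "BLOCK":
        return {"action": "BLOCKED"}
    category = classifier(safe_query)            # intent + category detection
    docs = retriever(safe_query, category)       # top-k KB documents
    answer = responder(safe_query, docs)         # grounded answer generation
    confidence = evaluator(answer, docs)         # similarity + LLM check
    action = "RESPOND" if confidence >= 0.5 else "ESCALATE"
    return {"action": action, "answer": answer, "confidence": confidence}
```

Because each stage is a plain callable, any agent can be stubbed out in tests or swapped for a different model without touching the orchestration logic.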
| Component | Technology |
|---|---|
| LLM (text) | Llama 3.1 8B via Groq API |
| LLM (vision) | Gemini 2.5 Flash |
| Embeddings | all-MiniLM-L6-v2 (sentence-transformers) |
| Vector DB | ChromaDB (local, persisted) |
| Dataset | Bitext Customer Support (26K conversations) |
| UI | Gradio 6.x |
| Guardrails | Regex + LLM classification |
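As a rough illustration of the regex fast path in the guardrails layer, PII masking can look like the sketch below. The patterns are simplified assumptions, not the project's actual rules; a real deployment needs far broader coverage:

```python
import re

# Simplified, illustrative PII patterns (assumed, not the project's own).
# Order matters: card numbers are masked before the looser phone pattern.
PII_PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "CARD":  re.compile(r"\b\d(?:[ -]?\d){12,15}\b"),
    "PHONE": re.compile(r"\+?\d(?:[ ().-]?\d){6,12}"),
}

def mask_pii(text: str) -> str:
    """Replace detected PII spans with [LABEL] placeholders."""
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text
```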
- Knowledge base - 26,000 real customer support responses embedded using all-MiniLM-L6-v2 and stored in ChromaDB
- Query embedding - the customer question is converted into the same vector space
- Similarity search - ChromaDB finds top 5 most semantically similar KB documents
- Reranking - documents matching the classified category are boosted, top 3 returned
- Grounded generation - Llama 3.1 takes the 3 retrieved documents as context and generates an answer grounded in them
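The category-aware reranking step can be sketched as follows; the boost value and the document shape are assumptions for illustration, not the project's actual implementation:

```python
# Illustrative category-aware reranking over retrieved documents.
# `boost` is an assumed value; docs are assumed to carry a category label
# and the similarity score returned by the vector search.

def rerank(docs, category, boost=0.15, top_k=3):
    """docs: list of {'text': str, 'category': str, 'similarity': float}."""
    def score(doc):
        bonus = boost if doc["category"] == category else 0.0
        return doc["similarity"] + bonus
    # Highest boosted score first; keep the top_k documents for the responder
    return sorted(docs, key=score, reverse=True)[:top_k]
```

A flat additive boost keeps the ordering transparent: an on-category document only outranks an off-category one when their raw similarities are within `boost` of each other.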
customer-support-copilot/
│
├── notebooks/
│ ├── agents.py ← classifier, retriever, responder, evaluator, router
│ ├── multimodal.py ← vision agent, log analyzer, enrich query, pipeline
│ ├── guardrails.py ← PII detector, injection detector, toxicity filter
│ ├── ui.py ← Gradio chat interface
│ └── metrics.py ← evaluation pipeline and benchmark
│
├── data/
│ ├── raw/ ← downloaded dataset + uploaded images
│ ├── processed/ ← cleaned CSV, eval results
│ └── knowledge_base/ ← FAQ text chunks
│
├── vector_store/ ← ChromaDB persisted embeddings (auto-created)
├── requirements.txt
└── README.md
1. Clone the repository and install dependencies:

```bash
git clone https://github.com/msithili74/Customer-Support-Copilot.git
cd Customer-Support-Copilot
pip install -r requirements.txt
```

2. Open notebooks/agents.py (or whichever file you run first) and set your API keys:

```python
GROQ_API_KEY = "gsk_..."
GEMINI_API_KEY = "AIza..."
```

3. Run all cells in data_rag.ipynb. This downloads the dataset, creates embeddings, and saves them to ChromaDB. The first run takes about 5-10 minutes; subsequent runs are instant.

4. Launch the UI:

```bash
python notebooks/ui.py
```

Then open http://localhost:7860 in your browser.
| Metric | Score |
|---|---|
| Intent Classification Accuracy | 83.3% |
| Answer Quality Rate | 75.0% |
| Avg Retrieval Similarity | 0.563 |
| Guardrail Precision | 100% |
| Escalation Rate | 25.0% |
| Avg End-to-End Latency | ~2s |