hq969/CogniVault

CogniVault 🛡️

Enterprise Intelligent Document Assistant

Python FastAPI LangChain License

CogniVault is a secure, scalable Retrieval-Augmented Generation (RAG) platform for enterprise knowledge management. Organizations ingest proprietary documentation (PDFs), and employees query it in natural language; answers are grounded strictly in the ingested internal documents.


🏗️ Architecture

CogniVault follows a modular Service-Repository pattern to separate business logic from API handling, ensuring scalability and maintainability.

The Workflow:

  1. Ingestion: PDFs are uploaded, and their text is extracted and split into chunks of 1000 tokens.
  2. Embedding: Text chunks are converted to vector embeddings using OpenAI models.
  3. Storage: Vectors are stored locally in ChromaDB (Vector Store).
  4. Retrieval: The user query is embedded, and a cosine similarity search returns the 3 most relevant chunks.
  5. Generation: The LLM generates a factual answer based only on the retrieved context.
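
The ingestion and retrieval steps above can be illustrated with a minimal, dependency-free sketch. It is not taken from the CogniVault codebase: the function names `split_into_chunks`, `cosine_similarity`, and `top_k` are hypothetical, whitespace-separated words stand in for tokens, and small hand-written vectors stand in for OpenAI embeddings.

```python
import math


def split_into_chunks(text: str, chunk_size: int = 1000) -> list[str]:
    """Split text into chunks of at most `chunk_size` words.

    Assumption: words stand in for tokens; a real pipeline would use a tokenizer.
    """
    words = text.split()
    return [
        " ".join(words[i:i + chunk_size])
        for i in range(0, len(words), chunk_size)
    ]


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec: list[float], chunk_vecs: list[list[float]], k: int = 3) -> list[int]:
    """Return the indices of the k chunk vectors most similar to the query."""
    ranked = sorted(
        range(len(chunk_vecs)),
        key=lambda i: cosine_similarity(query_vec, chunk_vecs[i]),
        reverse=True,
    )
    return ranked[:k]
```

In a real deployment the vector store would perform this ranking; the sketch only shows what "top 3 by cosine similarity" means for a handful of vectors.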

📂 Project Structure

This project follows a conventional layered directory layout:

enterprise_project/
├── app/
│   ├── api/          # API Controllers (FastAPI Routes)
│   ├── core/         # Configuration & Secrets Management
│   ├── schemas/      # Pydantic Models for Data Validation
│   └── services/     # Business Logic (Ingestion & RAG)
├── data/             # Persisted Vector Database (ChromaDB)
├── docs/             # Architecture & API Documentation
├── Dockerfile        # Containerization instructions
└── requirements.txt  # Python Dependencies


🚀 Getting Started

Prerequisites

  • Python 3.10+
  • OpenAI API Key

1. Installation

Clone the repository and install dependencies:

git clone https://github.com/hq969/cognivault.git
cd cognivault
pip install -r requirements.txt

2. Configuration

Create a .env file in the root directory:

OPENAI_API_KEY=sk-proj-your-api-key-here
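
As a rough illustration of how such a file can be consumed, the sketch below reads `KEY=VALUE` pairs with the standard library only. How CogniVault actually loads its settings is not shown in this README, so `parse_env_file` and `get_openai_api_key` are hypothetical helpers, not the project's code.

```python
import os
from pathlib import Path


def parse_env_file(path: str = ".env") -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values: dict[str, str] = {}
    env_path = Path(path)
    if not env_path.exists():
        return values
    for raw_line in env_path.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def get_openai_api_key(path: str = ".env") -> str:
    """Prefer the process environment, then fall back to the .env file."""
    key = os.environ.get("OPENAI_API_KEY") or parse_env_file(path).get("OPENAI_API_KEY")
    if not key:
        raise RuntimeError("OPENAI_API_KEY is not set; add it to .env or the environment")
    return key
```

Failing fast with a clear error when the key is missing tends to be easier to debug than a rejected request at query time.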

3. Running the Application

Start the server using Uvicorn:

uvicorn app.main:app --reload

The API will be available at: http://localhost:8000


🔌 API Usage

You can interact with the API via the built-in Swagger UI at http://localhost:8000/docs.

1. Upload a Document

  • Endpoint: POST /api/v1/upload
  • Description: Indexes a PDF file into the knowledge base.
  • Input: Multipart/Form-Data (PDF File)
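
A client-side upload might look like the following sketch, which builds the multipart request with the Python standard library. The form field name `file` is an assumption (the README does not state it; the Swagger UI shows the actual name), and `build_upload_request` is a hypothetical helper.

```python
import urllib.request
import uuid


def build_upload_request(
    pdf_bytes: bytes,
    filename: str,
    base_url: str = "http://localhost:8000",
    field_name: str = "file",  # assumption: check the name in the Swagger UI
) -> urllib.request.Request:
    """Build a multipart/form-data POST carrying one PDF file."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field_name}"; filename="{filename}"\r\n'
        "Content-Type: application/pdf\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/api/v1/upload",
        data=head + pdf_bytes + tail,
        method="POST",
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
    )
```

With the server running, the request can be sent with `urllib.request.urlopen(build_upload_request(data, "handbook.pdf"))`, where `data` holds the bytes of the PDF.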

2. Query the System

  • Endpoint: POST /api/v1/query
  • Description: Ask a question based on uploaded documents.
  • Input:
{
  "question": "What is the severance policy mentioned in the handbook?"
}
  • Response:
{
  "answer": "According to the handbook, severance is calculated based on...",
  "sources": ["Employee_Handbook_2024.pdf"]
}
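
The request and response shapes above can be exercised from Python as sketched below. The helper names `build_query_request` and `parse_query_response` are hypothetical, and the code assumes the server is reachable at the default local address given earlier.

```python
import json
import urllib.request


def build_query_request(
    question: str, base_url: str = "http://localhost:8000"
) -> urllib.request.Request:
    """Build a JSON POST for the query endpoint."""
    payload = json.dumps({"question": question}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/api/v1/query",
        data=payload,
        method="POST",
        headers={"Content-Type": "application/json"},
    )


def parse_query_response(raw: bytes) -> tuple[str, list[str]]:
    """Extract the answer text and the list of source documents."""
    data = json.loads(raw.decode("utf-8"))
    return data["answer"], list(data.get("sources", []))
```

With the server running, `urllib.request.urlopen(build_query_request("..."))` returns a response whose body can be passed to `parse_query_response`.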

🐳 Docker Deployment

To run CogniVault in a containerized production environment:

# Build the image
docker build -t cognivault .

# Run the container
docker run -p 80:80 cognivault

🔮 Future Roadmap

  • Hybrid Search: Combining keyword search with vector search for better accuracy.
  • Role-Based Access Control (RBAC): Restricting document access by user level.
  • Frontend UI: A React/Streamlit dashboard for non-technical users.

👤 Author

Developed for Enterprise AI Solutions.
