DamianTarnowski/RAGAdminPanel

🧠 Advanced RAG System - Retrieval-Augmented Generation

.NET Semantic Kernel PostgreSQL pgvector

Enterprise-grade Retrieval-Augmented Generation system built with .NET 9 and Semantic Kernel. Combines intelligent document processing, vector search, and AI-powered response generation for knowledge management and Q&A systems.

🆕 What's New (recent changes)

  • Flexible vector dimensions: storage as vector(1536) with an embedding_dimension column (the original dimension), plus automatic pad/trim in the vector service
  • HNSW tuning in the UI: m, ef_construction, and ef_search parameters, with index rebuild available from the Settings page
  • RAG chat with streaming: new /chat page with pseudo-streaming (IAsyncEnumerable), provider/model selection, and API-key warnings
  • LocalStorage: persistent conversation history and UI preferences (models, providers, API keys); "Test pgvector and indexes", "Load demo data", and "Full healthcheck" buttons
  • LLM/Cohere reranking: support for reranking strategies (Semantic/LLM/Cohere)
  • Neo4j: IGraphDatabaseService implementation plus a graph page (Cypher queries, statistics, editing)
  • Per-tenant embedder preferences (DB): provider/model stored in UiSettings and applied at runtime (no restart) for both ingestion and query embeddings
  • Live settings refresh: SettingsHub (SignalR) broadcasts SettingsUpdated → the Dashboard auto-reloads
  • Forcing 1536 for OpenAI: even for "large" (3072) we send dimensions: 1536 and store vector(1536)
  • Per-document override: the Documents page lets you force a provider; stored in metadata as embeddingProviderOverride and visible in the chunk preview
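The pad/trim normalization mentioned above can be sketched as a small pure function. This is a minimal illustration, not the project's actual code; PadOrTrim is a hypothetical helper name:

```csharp
using System;

// Minimal sketch of pad/trim normalization to the fixed storage dimension.
// PadOrTrim is a hypothetical helper name; the real vector service may differ.
public static class EmbeddingNormalizer
{
    public const int StorageDimension = 1536;

    // Returns a vector of exactly `target` length: shorter inputs are
    // zero-padded, longer inputs are truncated. The input's original length
    // is what the embedding_dimension column records.
    public static float[] PadOrTrim(float[] embedding, int target = StorageDimension)
    {
        if (embedding.Length == target) return embedding;
        var result = new float[target]; // zero-initialized by default
        Array.Copy(embedding, result, Math.Min(embedding.Length, target));
        return result;
    }
}
```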

🌟 Features

📄 Document Processing

  • Multi-format Support: PDF, DOCX, TXT, Markdown, HTML
  • Intelligent Chunking: Smart text splitting with overlap
  • Automatic Embeddings: Local and cloud (OpenAI, Azure OpenAI, Google Gemini)
  • Metadata Extraction: Rich document metadata and insights

🔍 Advanced Search

  • Semantic Search: Vector similarity (pgvector + HNSW)
  • Hybrid Search: Semantic + full-text
  • Reranking: Semantic, LLM (GPT/Claude/Grok), Cohere (rerank-v3.5), Popularity, Recency, Combined
  • Context-aware Results: Extended context from neighboring chunks
  • Flexible dimensions: support for 384/768/1536 models via normalization to 1536 and the embedding_dimension column
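A hybrid query of this kind can be sketched in SQL, blending vector similarity with full-text rank. The table and column names here are illustrative assumptions, not the project's actual schema:

```sql
-- Illustrative hybrid search: blend cosine similarity with full-text rank.
-- Table/column names and the 0.7/0.3 weights are assumptions.
SELECT id, content,
       1 - (embedding <=> :query_embedding)            AS vector_score,
       ts_rank(search_vector, plainto_tsquery(:query)) AS text_score
FROM document_chunks
ORDER BY 0.7 * (1 - (embedding <=> :query_embedding))
       + 0.3 * ts_rank(search_vector, plainto_tsquery(:query)) DESC
LIMIT 10;
```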

🤖 AI-Powered Responses

  • LLM Integration: OpenAI/Azure OpenAI; other providers optionally via configuration
  • Source Attribution and Confidence
  • Follow-up questions and quality assessment
  • Streaming chat with model/provider selection; history and settings kept in LocalStorage
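Pseudo-streaming over IAsyncEnumerable, as used by the chat page, can be sketched like this. Names and the chunk size are assumptions, not the project's actual implementation:

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.CompilerServices;
using System.Threading;
using System.Threading.Tasks;

// Illustrative pseudo-streaming: yield an answer in small slices so the
// Blazor UI can render text incrementally as pieces arrive.
public static class ChatStreaming
{
    public static async IAsyncEnumerable<string> StreamAnswerAsync(
        string answer,
        int chunkSize = 8,
        [EnumeratorCancellation] CancellationToken ct = default)
    {
        for (var i = 0; i < answer.Length; i += chunkSize)
        {
            ct.ThrowIfCancellationRequested();
            yield return answer.Substring(i, Math.Min(chunkSize, answer.Length - i));
            await Task.Delay(20, ct); // small delay to simulate token cadence
        }
    }
}
```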

🏗️ Architecture

  • PostgreSQL + pgvector: vector database
  • Neo4j: knowledge graph (document nodes, relationships, Cypher queries)
  • Blazor Server (.NET 9): interactive UI with MudBlazor
  • EF Core: migrations, UseVector()

🚀 Quick Start

Prerequisites

  • .NET 9 SDK
  • PostgreSQL 16+ with the pgvector extension
  • (optional) Docker Desktop

1. Database (local Postgres)

  • Create the rag database and enable the extensions:
CREATE DATABASE rag;
\c rag
CREATE EXTENSION IF NOT EXISTS "uuid-ossp";
CREATE EXTENSION IF NOT EXISTS vector;
  • If the vector extension is missing, install pgvector via StackBuilder (Windows) or use the pgvector Docker image.
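With pgvector in place, the HNSW index the app maintains can also be created by hand. A sketch matching the HnswM / HnswEfConstruction / HnswEfSearch configuration keys; the index and table names are illustrative:

```sql
-- Illustrative HNSW index with the configured build parameters.
CREATE INDEX IF NOT EXISTS idx_chunks_embedding_hnsw
ON document_chunks
USING hnsw (embedding vector_cosine_ops)
WITH (m = 16, ef_construction = 64);

-- ef_search is a query-time setting, not an index option:
SET hnsw.ef_search = 100;
```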

2. Configuration

Configure src/RAGAdminPanel/appsettings.Development.json:

{
  "Logging": {
    "LogLevel": { "Default": "Debug", "Microsoft.AspNetCore": "Information" }
  },
  "RAG": {
    "EmbeddingProvider": "Local", // Local | OpenAI | AzureOpenAI | GoogleGemini
    "PostgreSQL": {
      "ConnectionString": "Host=localhost;Port=5432;Database=rag;Username=postgres;Password=HasloHaslo122@@@;Trust Server Certificate=true",
      "UseHalfPrecisionIndex": false,
      "HnswM": 16,
      "HnswEfConstruction": 64,
      "HnswEfSearch": 100
    },
    "OpenAI": { "ApiKey": "", "EmbeddingModel": "text-embedding-3-small", "ChatModel": "gpt-4o" },
    "GoogleGemini": { "ApiKey": "", "EmbeddingModel": "gemini-embedding-001" },
    "Anthropic": { "ApiKey": "", "ChatModel": "claude-3.7-sonnet" },
    "xAI": { "ApiKey": "", "ChatModel": "grok-4" },
    "Cohere": { "ApiKey": "", "RerankModel": "rerank-v3.5" },
    "GraphDatabase": { "Provider": "Neo4j", "ConnectionString": "bolt://localhost:7687", "Username": "neo4j", "Password": "password" },
    "RerankingProvider": "Semantic" // Semantic | LLM | Cohere
  }
}

3. Database migrations

cd src/RAGAdminPanel
# development environment
$env:ASPNETCORE_ENVIRONMENT="Development"
dotnet ef database update

4. Run

dotnet run --project src/RAGAdminPanel/RAGAdminPanel.csproj

Panel: https://localhost:5001

On startup, the application:

  • forces CREATE EXTENSION IF NOT EXISTS vector,
  • runs db.Database.Migrate() (automatic migrations),
  • rebuilds the HNSW index according to the configured parameters.
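These startup tasks might look roughly like this in Program.cs. This is a C#-flavored sketch under assumptions: AppDbContext, vectorStore, and EnsureHnswIndexAsync are illustrative names, not the project's actual API:

```csharp
// Sketch of the startup tasks listed above (names are assumptions).
using var scope = app.Services.CreateScope();
var db = scope.ServiceProvider.GetRequiredService<AppDbContext>();

// 1. Make sure pgvector exists before migrations touch vector columns.
await db.Database.ExecuteSqlRawAsync("CREATE EXTENSION IF NOT EXISTS vector;");

// 2. Apply pending EF Core migrations automatically.
await db.Database.MigrateAsync();

// 3. Rebuild the HNSW index using HnswM / HnswEfConstruction from configuration.
await vectorStore.EnsureHnswIndexAsync(m: 16, efConstruction: 64);
```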

📖 Documentation

🏗️ System Architecture

graph TB
    subgraph "🎯 Presentation Layer"
        UI[Blazor Server UI]
        API[REST API Endpoints]
    end
    
    subgraph "🧠 Business Logic Layer"
        RAG[RAG Service<br/>Pipeline Orchestrator]
        DOC[Document Processor]
        SEARCH[Semantic Search]
        LLM[LLM Service]
        VECTOR[Vector Store]
    end
    
    subgraph "💾 Data Layer"
        PG[(PostgreSQL<br/>+pgvector)]
        FILES[File Storage]
    end
    
    subgraph "🌐 External Services"
        OPENAI[OpenAI API]
        LOCAL[Local Models<br/>Ollama/SmartComponents]
    end
    
    UI --> RAG
    API --> RAG
    RAG --> DOC
    RAG --> SEARCH
    RAG --> LLM
    RAG --> VECTOR
    VECTOR --> PG
    DOC --> FILES
    LLM --> OPENAI
    LLM --> LOCAL
    SEARCH --> VECTOR

🔧 Core Components

RAG Service Pipeline

The main orchestrator coordinates a multi-step, advanced RAG workflow:

  1. Pre-analysis & Query Rewrite: An LLM first analyzes the user's query, rewriting it for clarity, extracting key entities, and deciding whether graph-based expansion is needed.
  2. Hybrid Search: The improved query is used to perform a hybrid search, combining vector similarity with full-text search to retrieve relevant document chunks.
  3. Advanced Reranking: The initial results are reranked using a sophisticated model (like Cohere or another LLM) to bring the most relevant results to the top.
  4. Conditional Graph Expansion: If suggested by the pre-analysis step, the context is enriched with additional facts and relationships from the Neo4j knowledge graph.
  5. Response Generation: The curated, high-quality context is passed to the main LLM to generate a final, source-based answer.
  6. Post-processing & Quality Gates: The generated response undergoes a final check. This can include generating follow-up questions for the user and a quality assessment in which an LLM scores the answer's faithfulness to the source context. If confidence is low, the system can even suggest a better query to the user.
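The six steps above can be condensed into one orchestration method. This is a C#-flavored sketch: every method name on the injected services (_llm, _search, _reranker, _graph) is an assumption, not the project's actual API:

```csharp
// Sketch of the RAG pipeline; all method and type names are illustrative.
public async Task<RagAnswer> AskAsync(string userQuery, CancellationToken ct)
{
    // 1. Pre-analysis & query rewrite
    var analysis = await _llm.AnalyzeQueryAsync(userQuery, ct);

    // 2. Hybrid search (vector similarity + full-text)
    var hits = await _search.HybridSearchAsync(analysis.RewrittenQuery, topK: 20, ct);

    // 3. Advanced reranking (Semantic / LLM / Cohere)
    hits = await _reranker.RerankAsync(analysis.RewrittenQuery, hits, ct);

    // 4. Conditional graph expansion via Neo4j
    if (analysis.NeedsGraphExpansion)
        hits = await _graph.ExpandContextAsync(hits, ct);

    // 5. Response generation from the curated context
    var answer = await _llm.GenerateAnswerAsync(analysis.RewrittenQuery, hits, ct);

    // 6. Post-processing & quality gates
    answer.FollowUps = await _llm.SuggestFollowUpsAsync(answer, ct);
    answer.Confidence = await _llm.AssessFaithfulnessAsync(answer, hits, ct);
    return answer;
}
```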

Service Architecture

| Service | Responsibility | Key Features |
|---|---|---|
| IRAGService | Main orchestrator | Pipeline coordination, health monitoring |
| IDocumentProcessorService | Document processing | Multi-format parsing, chunking |
| IVectorStoreService | Vector operations | PostgreSQL + pgvector CRUD, similarity search |
| ISemanticSearchService | Advanced search | Hybrid search, reranking, caching |
| ILLMService | AI responses | Generation, summarization, analysis |
| IEmbeddingService | Vector embeddings | Local/cloud embedding generation |

Embedder configuration and priorities

  • Per-tenant preferences are stored in the UiSettings table (JSONB) and read via SettingsService.GetEmbeddingPreferencesAsync()
  • Embedding provider resolution order:
    1. Per-document override (embeddingProviderOverride in metadata, set on the Documents page)
    2. Tenant preference from the DB (UiSettings EmbeddingProvider + model)
    3. Default configuration from appsettings*.json
  • The Dashboard and the "Generate embedding" test use the tenant preference. After settings are saved, the UI broadcasts SettingsUpdated (SignalR) and the dashboard reloads automatically.
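The resolution order reads naturally as a three-step fallback chain. A C#-flavored sketch; all names except embeddingProviderOverride and GetEmbeddingPreferencesAsync are illustrative:

```csharp
// Sketch of the embedder provider resolution order; names are assumptions.
public async Task<string> ResolveEmbeddingProviderAsync(Document doc)
{
    // 1. Per-document override stored in the document's metadata.
    if (doc.Metadata.TryGetValue("embeddingProviderOverride", out var forced))
        return forced;

    // 2. Tenant preference persisted in the UiSettings table.
    var prefs = await _settings.GetEmbeddingPreferencesAsync();
    if (!string.IsNullOrEmpty(prefs?.Provider))
        return prefs.Provider;

    // 3. Fall back to the static configuration (RAG:EmbeddingProvider).
    return _configuration["RAG:EmbeddingProvider"] ?? "Local";
}
```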

Limiting dimensions to 1536

  • OpenAI text-embedding-3-* models support the dimensions parameter, so the backend enforces <= 1536. Storage is vector(1536), and the original dimension is recorded in the embedding_dimension column.
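For example, a request for the "large" model would then carry an explicit dimensions field. A sketch of the OpenAI embeddings request body (the input text is illustrative):

```json
{
  "model": "text-embedding-3-large",
  "input": "What is pgvector?",
  "dimensions": 1536
}
```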

📊 Performance & Scalability

Benchmarks

| Operation | Avg Response Time | Throughput |
|---|---|---|
| Document Upload (1MB PDF) | ~2.5s | 40 docs/min |
| Semantic Search | ~150ms | 1,000 queries/min |
| LLM Response Generation | ~3s | 20 responses/min |
| Vector Similarity (10k docs) | ~50ms | 2,000 searches/min |

Scaling Strategies

  • Horizontal: Multiple app instances with shared PostgreSQL
  • Caching: Redis integration for query caching
  • CDN: Static asset distribution
  • Database: Read replicas for search-heavy workloads

🔐 Security Features

  • Input Validation: Comprehensive request validation
  • SQL Injection Protection: Parameterized queries via EF Core
  • File Upload Security: Type validation, size limits
  • API Rate Limiting: Configurable throttling
  • Error Handling: Secure error responses without information leakage

🧪 Testing

Run the test suite:

# Unit tests
dotnet test tests/RAGAdminPanel.Tests

📦 Deployment Options

Docker Deployment

# Build and run with Docker Compose
docker-compose up --build

# Production deployment
docker-compose -f docker-compose.prod.yml up -d

Cloud Deployment

  • Azure: App Service + Azure Database for PostgreSQL
  • AWS: ECS/EKS + RDS PostgreSQL
  • GCP: Cloud Run + Cloud SQL

🤝 Contributing

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/amazing-feature)
  3. Commit changes (git commit -m 'Add amazing feature')
  4. Push to branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

📝 Changelog

v1.0.0 (Current)

  • ✅ Complete RAG pipeline implementation
  • ✅ Multi-format document processing
  • ✅ PostgreSQL + pgvector integration
  • ✅ Semantic Kernel LLM integration
  • ✅ Blazor Server admin interface
  • ✅ Comprehensive logging and monitoring

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

🙏 Acknowledgments

📞 Support


Built with ❤️ using .NET 9 and Semantic Kernel
