Enterprise-grade Retrieval-Augmented Generation system built with .NET 9 and Semantic Kernel. Combines intelligent document processing, vector search, and AI-powered response generation for knowledge management and Q&A systems.
- Flexible vector dimensions: `vector(1536)` storage with an `embedding_dimension` column (original dimension); automatic pad/trim in the vector service
- HNSW tuning in the UI: `m`, `ef_construction`, and `ef_search` parameters plus index rebuild from the Settings page
- RAG chat with streaming: new `/chat` page with pseudo-streaming (`IAsyncEnumerable`), provider/model selection, and API-key warnings
- LocalStorage: persistent conversation history and UI preferences (models, providers, API keys); "Test pgvector and indexes", "Load demo data", and "Full healthcheck" buttons
- LLM/Cohere reranking: support for reranking strategies (Semantic/LLM/Cohere)
- Neo4j: `IGraphDatabaseService` implementation plus a graph page (Cypher, statistics, editing)
- Per-tenant embedder preferences (DB): provider/model stored in `UiSettings` and applied at runtime (no restart) to ingestion and query embeddings
- Live settings refresh: `SettingsHub` (SignalR) broadcasts `SettingsUpdated` → the Dashboard reloads automatically
- Enforced 1536 for OpenAI: even for "large" (3072) we send `dimensions: 1536` and store `vector(1536)`
- Per-document override: the Documents page can force a provider; stored in metadata as `embeddingProviderOverride` and visible in the chunk preview
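The pad/trim normalization described above can be sketched as follows. This is an illustrative Python version of the logic (the project itself implements it in its C# vector service; the function name is hypothetical):

```python
TARGET_DIM = 1536  # storage dimension of the vector(1536) column

def normalize_embedding(vec: list[float], target_dim: int = TARGET_DIM) -> tuple[list[float], int]:
    """Pad with zeros or trim so every embedding fits the fixed-size column.

    Returns the normalized vector plus the original dimension, which the
    service stores separately in the embedding_dimension column.
    """
    original_dim = len(vec)
    if original_dim < target_dim:
        # zero-pad short vectors (e.g. 384- or 768-dimensional local models)
        vec = vec + [0.0] * (target_dim - original_dim)
    elif original_dim > target_dim:
        # trim oversized vectors (e.g. 3072 -> 1536 for text-embedding-3-large)
        vec = vec[:target_dim]
    return vec, original_dim
```

Storing the original dimension alongside the padded vector lets the service reverse the normalization when comparing against embeddings from the same model.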
- Multi-format Support: PDF, DOCX, TXT, Markdown, HTML
- Intelligent Chunking: Smart text splitting with overlap
- Automatic Embeddings: Local and cloud (OpenAI, Azure OpenAI, Google Gemini)
- Metadata Extraction: Rich document metadata and insights
- Semantic Search: Vector similarity (pgvector + HNSW)
- Hybrid Search: Semantic + full-text
- Reranking: Semantic, LLM (GPT/Claude/Grok), Cohere (`rerank-v3.5`), Popularity, Recency, Combined
- Context-aware Results: Extended context from neighboring chunks
- Flexible dimensions: supports 384/768/1536 models via normalization to `1536` plus the `embedding_dimension` column
- LLM Integration: OpenAI/Azure OpenAI; other providers optionally via configuration
- Source Attribution and Confidence
- Follow-up questions and quality assessment
- Streaming chat with model/provider selection; history and settings persisted in LocalStorage
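The hybrid search listed above combines a vector-similarity score with a full-text relevance score. A minimal sketch of one common way to blend them (illustrative Python; the weight `alpha` and the function names are assumptions, not the project's actual implementation):

```python
def hybrid_score(semantic: float, fulltext: float, alpha: float = 0.7) -> float:
    """Weighted blend of vector-similarity and full-text relevance scores.

    alpha controls how much the semantic (vector) score dominates; both
    inputs are assumed to be pre-normalized to [0, 1].
    """
    return alpha * semantic + (1 - alpha) * fulltext

def rank_hybrid(results: list[dict], alpha: float = 0.7) -> list[dict]:
    """Sort candidate chunks by their blended score, best first."""
    for r in results:
        r["score"] = hybrid_score(r["semantic"], r["fulltext"], alpha)
    return sorted(results, key=lambda r: r["score"], reverse=True)
```

With `alpha = 0.7`, a chunk that matches only lexically can still surface, but semantic matches dominate the ranking; the reranking stage then refines this initial order.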
- PostgreSQL + pgvector: vector DB
- Neo4j: knowledge graph (document nodes, relations, Cypher queries)
- Blazor Server (.NET 9): interactive UI, MudBlazor
- EF Core: migrations, `UseVector()`
- .NET 9 SDK
- PostgreSQL 16+ with the `pgvector` extension
- (optional) Docker Desktop

Create the `ragdb_dev` database and enable the extensions:

```sql
CREATE DATABASE rag;
\c rag
CREATE EXTENSION IF NOT EXISTS "uuid-ossp";
CREATE EXTENSION IF NOT EXISTS vector;
```

- If `vector` is missing, install pgvector via StackBuilder (Windows) or use the `pgvector` Docker image.
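For reference, the HNSW parameters from the configuration map onto pgvector DDL roughly like this (the table and column names here are assumptions, not the project's actual schema):

```sql
-- HNSW index built with the m / ef_construction values from appsettings
CREATE INDEX IF NOT EXISTS idx_chunks_embedding_hnsw
    ON document_chunks
    USING hnsw (embedding vector_cosine_ops)
    WITH (m = 16, ef_construction = 64);

-- ef_search is a query-time setting, applied per session
SET hnsw.ef_search = 100;
```

Higher `m` and `ef_construction` improve recall at the cost of build time and index size; `ef_search` trades query latency for recall at search time.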
Configure `src/RAGAdminPanel/appsettings.Development.json`:

```json
{
  "Logging": {
    "LogLevel": { "Default": "Debug", "Microsoft.AspNetCore": "Information" }
  },
  "RAG": {
    "EmbeddingProvider": "Local", // Local | OpenAI | AzureOpenAI | GoogleGemini
    "PostgreSQL": {
      "ConnectionString": "Host=localhost;Port=5432;Database=rag;Username=postgres;Password=HasloHaslo122@@@;Trust Server Certificate=true",
      "UseHalfPrecisionIndex": false,
      "HnswM": 16,
      "HnswEfConstruction": 64,
      "HnswEfSearch": 100
    },
    "OpenAI": { "ApiKey": "", "EmbeddingModel": "text-embedding-3-small", "ChatModel": "gpt-4o" },
    "GoogleGemini": { "ApiKey": "", "EmbeddingModel": "gemini-embedding-001" },
    "Anthropic": { "ApiKey": "", "ChatModel": "claude-3.7-sonnet" },
    "xAI": { "ApiKey": "", "ChatModel": "grok-4" },
    "Cohere": { "ApiKey": "", "RerankModel": "rerank-v3.5" },
    "GraphDatabase": { "Provider": "Neo4j", "ConnectionString": "bolt://localhost:7687", "Username": "neo4j", "Password": "password" },
    "RerankingProvider": "Semantic" // Semantic | LLM | Cohere
  }
}
```

Run the migrations and start the app:

```powershell
cd src/RAGAdminPanel
# development environment
$env:ASPNETCORE_ENVIRONMENT="Development"
dotnet ef database update
dotnet run --project src/RAGAdminPanel/RAGAdminPanel.csproj
```

Panel: https://localhost:5001
On startup the application:
- forces `CREATE EXTENSION IF NOT EXISTS vector`,
- runs `db.Database.Migrate()` (automatic migrations),
- rebuilds the HNSW index according to the configuration parameters.
- 🏗️ Architecture Overview
- ⚙️ Configuration Guide
- 📚 API Documentation
- 🔧 Development Guide
- 🐳 Docker Deployment
- 🚀 Production Setup
- 🧾 TL;DR for LLMs (short description of how it works)
```mermaid
graph TB
    subgraph "🎯 Presentation Layer"
        UI[Blazor Server UI]
        API[REST API Endpoints]
    end
    subgraph "🧠 Business Logic Layer"
        RAG[RAG Service<br/>Pipeline Orchestrator]
        DOC[Document Processor]
        SEARCH[Semantic Search]
        LLM[LLM Service]
        VECTOR[Vector Store]
    end
    subgraph "💾 Data Layer"
        PG[(PostgreSQL<br/>+pgvector)]
        FILES[File Storage]
    end
    subgraph "🌐 External Services"
        OPENAI[OpenAI API]
        LOCAL[Local Models<br/>Ollama/SmartComponents]
    end
    UI --> RAG
    API --> RAG
    RAG --> DOC
    RAG --> SEARCH
    RAG --> LLM
    RAG --> VECTOR
    VECTOR --> PG
    DOC --> FILES
    LLM --> OPENAI
    LLM --> LOCAL
    SEARCH --> VECTOR
```
The main orchestrator coordinates a multi-step, advanced RAG workflow:
- Pre-analysis & Query Rewrite: The user's query is first analyzed by an LLM to be rewritten for clarity, have key entities extracted, and determine if graph-based expansion is needed.
- Hybrid Search: The improved query is used to perform a hybrid search, combining vector similarity with full-text search to retrieve relevant document chunks.
- Advanced Reranking: The initial results are reranked using a sophisticated model (like Cohere or another LLM) to bring the most relevant results to the top.
- Conditional Graph Expansion: If suggested by the pre-analysis step, the context is enriched with additional facts and relationships from the Neo4j knowledge graph.
- Response Generation: The curated, high-quality context is passed to the main LLM to generate a final, source-based answer.
- Post-processing & Quality-gates: The generated response undergoes a final check. This can include generating follow-up questions for the user and a quality assessment where an LLM scores the answer's faithfulness to the source context. If confidence is low, the system can even suggest a better query to the user.
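The six steps above can be sketched as a single orchestration function. This is an illustrative Python sketch; every method on `deps` is a hypothetical stand-in for the project's C# service interfaces (`IRAGService`, `ISemanticSearchService`, and so on), not their real signatures:

```python
def answer(query: str, deps) -> dict:
    """Illustrative RAG pipeline orchestration; `deps` bundles the services."""
    # 1. Pre-analysis & query rewrite
    analysis = deps.analyze(query)            # rewritten query, entities, graph flag
    # 2. Hybrid search (vector similarity + full-text)
    chunks = deps.hybrid_search(analysis["rewritten"])
    # 3. Advanced reranking (Semantic / LLM / Cohere)
    chunks = deps.rerank(analysis["rewritten"], chunks)
    # 4. Conditional graph expansion from the knowledge graph
    if analysis["needs_graph"]:
        chunks += deps.expand_from_graph(analysis["entities"])
    # 5. Response generation from the curated context
    response = deps.generate(analysis["rewritten"], chunks)
    # 6. Post-processing & quality gates
    response["follow_ups"] = deps.follow_ups(response)
    response["confidence"] = deps.assess(response, chunks)
    return response
```

The key design point is that graph expansion happens after reranking, so the graph only enriches an already-curated context rather than flooding the initial candidate pool.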
| Service | Responsibility | Key Features |
|---|---|---|
| `IRAGService` | Main orchestrator | Pipeline coordination, health monitoring |
| `IDocumentProcessorService` | Document processing | Multi-format parsing, chunking |
| `IVectorStoreService` | Vector operations | PostgreSQL + pgvector CRUD, similarity search |
| `ISemanticSearchService` | Advanced search | Hybrid search, reranking, caching |
| `ILLMService` | AI responses | Generation, summarization, analysis |
| `IEmbeddingService` | Vector embeddings | Local/cloud embedding generation |
- Per-tenant preferences are stored in the `UiSettings` table (JSONB) and read via `SettingsService.GetEmbeddingPreferencesAsync()`
- Embedding provider priority:
  1. Per-document override (`embeddingProviderOverride` in metadata, set on the Documents page)
  2. Tenant preference from the DB (`UiSettings` → `EmbeddingProvider` + model)
  3. Default configuration from `appsettings*.json`
- The Dashboard and the "Generate embedding" test use the tenant preferences. After settings are saved, the UI sends a `SettingsUpdated` signal (SignalR) and the Dashboard reloads automatically.
- The OpenAI `text-embedding-3-*` series supports the `dimensions` parameter → the backend enforces `<= 1536`. Storage is `vector(1536)`; the original dimension is recorded in the `embedding_dimension` column.
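The three-level priority can be sketched as a simple fallback chain (illustrative Python; the metadata and settings field names mirror the ones mentioned above, everything else is an assumption):

```python
def resolve_embedding_provider(doc_metadata: dict,
                               tenant_settings: dict,
                               app_config: dict) -> str:
    """Pick the embedding provider using the documented priority order:
    1. per-document override -> 2. tenant preference (UiSettings) ->
    3. default from appsettings*.json.
    """
    override = doc_metadata.get("embeddingProviderOverride")
    if override:
        return override
    tenant = tenant_settings.get("EmbeddingProvider")
    if tenant:
        return tenant
    return app_config["EmbeddingProvider"]
```

Because the tenant preference is read from the database on each resolution, changing it in the UI takes effect without a restart.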
| Operation | Avg Response Time | Throughput |
|---|---|---|
| Document Upload (1MB PDF) | ~2.5s | 40 docs/min |
| Semantic Search | ~150ms | 1000 queries/min |
| LLM Response Generation | ~3s | 20 responses/min |
| Vector Similarity (10k docs) | ~50ms | 2000 searches/min |
- Horizontal: Multiple app instances with shared PostgreSQL
- Caching: Redis integration for query caching
- CDN: Static asset distribution
- Database: Read replicas for search-heavy workloads
- Input Validation: Comprehensive request validation
- SQL Injection Protection: Parameterized queries via EF Core
- File Upload Security: Type validation, size limits
- API Rate Limiting: Configurable throttling
- Error Handling: Secure error responses without information leakage
Run the test suite:

```powershell
# Unit tests
dotnet test tests/RAGAdminPanel.Tests
```

Docker deployment:

```powershell
# Build and run with Docker Compose
docker-compose up --build

# Production deployment
docker-compose -f docker-compose.prod.yml up -d
```

- Azure: App Service + Azure Database for PostgreSQL
- AWS: ECS/EKS + RDS PostgreSQL
- GCP: Cloud Run + Cloud SQL
- Fork the repository
- Create a feature branch (`git checkout -b feature/amazing-feature`)
- Commit changes (`git commit -m 'Add amazing feature'`)
- Push to the branch (`git push origin feature/amazing-feature`)
- Open a Pull Request
- ✅ Complete RAG pipeline implementation
- ✅ Multi-format document processing
- ✅ PostgreSQL + pgvector integration
- ✅ Semantic Kernel LLM integration
- ✅ Blazor Server admin interface
- ✅ Comprehensive logging and monitoring
This project is licensed under the MIT License - see the LICENSE file for details.
- Microsoft Semantic Kernel - AI orchestration framework
- pgvector - PostgreSQL vector similarity search
- MudBlazor - Material Design components for Blazor
- OpenAI - Large Language Models and embeddings
- 📚 Documentation
- 🐛 Issue Tracker
- 💬 Discussions
- 📧 Email: support@your-org.com
Built with ❤️ using .NET 9 and Semantic Kernel