Public service interfaces: documents, semantic search, LLM response generation, RAG orchestrator. LLM/Cohere reranking has been added, along with Neo4j integration via IGraphDatabaseService (used directly by the UI).
```mermaid
graph TD
    subgraph "🎯 Main API"
        RAG[IRAGService<br/>Main Orchestrator]
    end
    subgraph "📄 Document Services"
        DOC[IDocumentProcessorService]
        CHUNK[ITextChunkingService]
    end
    subgraph "🔍 Search Services"
        SEARCH[ISemanticSearchService]
        VECTOR[IVectorStoreService]
    end
    subgraph "🤖 AI Services"
        LLM[ILLMService]
        EMB[IEmbeddingService]
    end
    RAG --> DOC
    RAG --> SEARCH
    RAG --> LLM
    RAG --> VECTOR
    DOC --> CHUNK
    SEARCH --> VECTOR
    LLM --> EMB
```
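The delegation shown in the diagram can be sketched as follows. This is an illustrative outline only, not the actual `RAGService` implementation; the property names on the result objects (`Results`, `Content`) are assumptions:

```csharp
// Illustrative sketch: how the orchestrator could compose the other services.
// The order (search, then generation) follows the diagram above.
public class RAGServiceSketch
{
    private readonly ISemanticSearchService _search;
    private readonly ILLMService _llm;

    public RAGServiceSketch(ISemanticSearchService search, ILLMService llm)
    {
        _search = search;
        _llm = llm;
    }

    public async Task<RAGResponse> QueryAsync(
        string userQuery, RAGOptions? options = null, CancellationToken ct = default)
    {
        // 1. Semantic search (with optional reranking) over the vector store
        var searchResults = await _search.SearchAsync(userQuery, options?.SearchOptions, ct);

        // 2. LLM answer generation grounded in the retrieved chunks
        var llmResponse = await _llm.GenerateResponseAsync(
            userQuery, searchResults.Results, options?.LLMOptions, ct);

        return new RAGResponse
        {
            UserQuery = userQuery,
            Response = llmResponse.Content,   // property name assumed
            SearchResults = searchResults,
            LLMResponse = llmResponse,
            Timestamp = DateTime.UtcNow
        };
    }
}
```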
Task<RAGResponse> QueryAsync(string userQuery, RAGOptions? options = null, CancellationToken cancellationToken = default);

Example with reranking configuration:
var response = await ragService.QueryAsync(
"Jak skonfigurować API?",
new RAGOptions
{
SearchOptions = new SearchOptions
{
MaxResults = 12,
SimilarityThreshold = 0.7,
UseReranking = true,
RerankingStrategy = RerankingStrategy.LLM // lub Cohere / Semantic / Combined
},
LLMOptions = new LLMGenerationOptions { Temperature = 0.3 }
}
);

Execute RAG query with conversation history context.
Task<RAGResponse> QueryWithConversationAsync(
string userQuery,
List<ConversationMessage> conversationHistory,
RAGOptions? options = null,
CancellationToken cancellationToken = default
);

Process and add documents to the knowledge base.
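A usage sketch for `QueryWithConversationAsync` shown above; the shape of `ConversationMessage` (`Role`/`Content` properties) is an assumption:

```csharp
// Assumed shape: ConversationMessage exposes Role and Content properties.
var history = new List<ConversationMessage>
{
    new ConversationMessage { Role = "user", Content = "Jak skonfigurować API?" },
    new ConversationMessage { Role = "assistant", Content = "Konfiguracja znajduje się w appsettings.json..." }
};

var response = await ragService.QueryWithConversationAsync(
    "A gdzie ustawić klucz API?",   // follow-up resolved against the history
    history,
    new RAGOptions { LLMOptions = new LLMGenerationOptions { Temperature = 0.3 } }
);
```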
Task<DocumentIngestionResult> IngestDocumentAsync(
IBrowserFile file,
DocumentIngestionOptions? options = null,
CancellationToken cancellationToken = default
);

Example:
var result = await ragService.IngestDocumentAsync(
uploadedFile,
new DocumentIngestionOptions
{
GenerateEmbeddings = true,
ExtractInsights = true,
ProcessingOptions = new DocumentProcessingOptions
{
MaxChunkSize = 1200,
ChunkOverlap = 300,
UseSmartChunking = true
}
}
);
if (result.Success)
{
Console.WriteLine($"Document processed: {result.EmbeddingsGenerated} embeddings generated");
}

Check system health and component status.
Task<RAGSystemHealth> GetSystemHealthAsync(CancellationToken cancellationToken = default);

Retrieve comprehensive system statistics.
Task<RAGSystemStatistics> GetSystemStatisticsAsync(CancellationToken cancellationToken = default);

Backup and restore system data.
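A sketch combining the two monitoring calls above; the property names on `RAGSystemHealth` and `RAGSystemStatistics` are assumptions, not confirmed by this document:

```csharp
// Periodic monitoring sketch; property names below are illustrative assumptions
var health = await ragService.GetSystemHealthAsync();
var stats = await ragService.GetSystemStatisticsAsync();

Console.WriteLine($"Healthy: {health.IsHealthy}");
Console.WriteLine($"Documents: {stats.TotalDocuments}");
```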
Task<RAGExportResult> ExportSystemDataAsync(RAGExportOptions? options = null);
Task<RAGImportResult> ImportSystemDataAsync(RAGImportData importData);

Handles multi-format document processing and text extraction.
Process document from uploaded file.
Task<DocumentProcessingResult> ProcessDocumentAsync(
IBrowserFile file,
DocumentProcessingOptions? options = null,
CancellationToken cancellationToken = default
);

Process raw text content.
Task<DocumentProcessingResult> ProcessTextAsync(
string text,
string title,
DocumentProcessingOptions? options = null,
CancellationToken cancellationToken = default
);

| Extension | Format | Parser Library |
|---|---|---|
| .pdf | PDF Documents | PdfPig |
| .docx | Microsoft Word | DocumentFormat.OpenXml |
| .txt | Plain Text | Native |
| .md | Markdown | Native |
| .html | HTML | HtmlAgilityPack |
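For content that does not come from a file upload, `ProcessTextAsync` accepts raw text directly, e.g.:

```csharp
// Process pasted or programmatically generated text instead of a file
var result = await documentProcessor.ProcessTextAsync(
    "RAG combines retrieval with generation...",
    "RAG Overview",   // title stored with the document
    new DocumentProcessingOptions { MaxChunkSize = 1000, ChunkOverlap = 200 }
);
```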
Example:
var result = await documentProcessor.ProcessDocumentAsync(
pdfFile,
new DocumentProcessingOptions
{
MaxChunkSize = 1000,
ChunkOverlap = 200,
UseSmartChunking = true,
GenerateEmbeddings = true
}
);
foreach (var chunk in result.Document.Chunks)
{
Console.WriteLine($"Chunk {chunk.ChunkIndex}: {chunk.Content.Substring(0, 100)}...");
}

Advanced search capabilities with semantic understanding.
Task<SemanticSearchResult> SearchAsync(string query, SearchOptions? options = null, CancellationToken cancellationToken = default);

SearchOptions (excerpt):
public class SearchOptions
{
public double SimilarityThreshold { get; set; } = 0.7;
public int MaxResults { get; set; } = 10;
public bool UseReranking { get; set; } = true;
public RerankingStrategy RerankingStrategy { get; set; } = RerankingStrategy.Semantic; // + LLM, Cohere
}

Combine semantic and full-text search.
Task<SemanticSearchResult> HybridSearchAsync(
string query,
SearchOptions? options = null,
CancellationToken cancellationToken = default
);

Search using pre-computed embedding vectors.
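`HybridSearchAsync` takes the same `SearchOptions`; a short sketch, using the `SemanticWeight` knob listed in the options:

```csharp
// Hybrid search: vector similarity combined with full-text matching
var hybrid = await searchService.HybridSearchAsync(
    "SSL certificate renewal",
    new SearchOptions
    {
        MaxResults = 10,
        SemanticWeight = 0.7   // 70% vector score, 30% full-text score
    }
);
```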
Task<SemanticSearchResult> SearchByEmbeddingAsync(
float[] queryEmbedding,
SearchOptions? options = null,
CancellationToken cancellationToken = default
);

var searchOptions = new SearchOptions
{
SimilarityThreshold = 0.75, // Minimum similarity score
MaxResults = 10, // Maximum results to return
UseReranking = true, // Enable result reranking
RerankingStrategy = RerankingStrategy.Combined,
IncludeContext = true, // Include neighboring chunks
ContextChunks = 2, // Number of context chunks
SemanticWeight = 0.7 // Weight for hybrid search
};

| Strategy | Description | Use Case |
|---|---|---|
| Semantic | LLM-based relevance scoring | High-quality results |
| Popularity | Usage-based ranking | Trending content |
| Recency | Time-based boosting | Latest information |
| Combined | Multi-factor scoring | Balanced results |
Analyze query intent and complexity.
Task<QueryAnalysis> AnalyzeQueryAsync(
string query,
CancellationToken cancellationToken = default
);

Example:
var analysis = await searchService.AnalyzeQueryAsync("How to setup SSL certificates?");
Console.WriteLine($"Complexity: {analysis.Complexity}");
Console.WriteLine($"Intents: {string.Join(", ", analysis.Intents)}");
Console.WriteLine($"Keywords: {string.Join(", ", analysis.Keywords)}");

AI-powered response generation and analysis.
Generate AI response from search context.
Task<LLMResponse> GenerateResponseAsync(
string userQuery,
List<SearchResultItem> searchResults,
LLMGenerationOptions? options = null,
CancellationToken cancellationToken = default
);

Generate response with conversation context.
Streaming responses in conversational mode.
IAsyncEnumerable<string> GenerateConversationalResponseStreamAsync(
string userQuery,
List<SearchResultItem> searchResults,
List<ConversationMessage> conversationHistory,
LLMGenerationOptions? options = null,
CancellationToken cancellationToken = default
);

Note: currently pseudo-streaming (the response is split into chunks after generation); the interface is ready for providers with real streaming support.
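Consuming the stream with `await foreach`; cancellation flows through the enumerator:

```csharp
// Chunks arrive incrementally; with the current pseudo-streaming
// implementation they are slices of an already generated response.
await foreach (var chunk in llmService.GenerateConversationalResponseStreamAsync(
    userQuery, searchResults, conversationHistory, cancellationToken: ct))
{
    Console.Write(chunk);
}
```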
Task<LLMResponse> GenerateConversationalResponseAsync(
string userQuery,
List<SearchResultItem> searchResults,
List<ConversationMessage> conversationHistory,
LLMGenerationOptions? options = null,
CancellationToken cancellationToken = default
);

Generate content summaries.
Task<string> SummarizeContentAsync(
string content,
SummaryType summaryType = SummaryType.Bullet,
int maxLength = 500,
CancellationToken cancellationToken = default
);

Extract insights from documents.
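Example call for `SummarizeContentAsync` using the defaults shown in the signature above:

```csharp
// Bullet-point summary capped at 300 characters
var summary = await llmService.SummarizeContentAsync(
    longArticleText,
    SummaryType.Bullet,
    maxLength: 300
);
```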
Task<DocumentInsights> ExtractDocumentInsightsAsync(
Document document,
InsightExtractionType extractionType = InsightExtractionType.KeyPoints,
CancellationToken cancellationToken = default
);

var llmOptions = new LLMGenerationOptions
{
Temperature = 0.7, // Creativity (0.0-2.0)
MaxTokens = 1000, // Response length limit
TopP = 0.9, // Nucleus sampling
FrequencyPenalty = 0.0, // Repetition penalty
PresencePenalty = 0.0, // Topic diversity
Style = ResponseStyle.Informative,
Language = "pl", // Response language
IncludeSources = true, // Include source attribution
IncludeConfidence = true // Include confidence scores
};
// Per-request overrides (e.g. per-bot custom-openai)
llmOptions.ProviderOverride = "custom-openai"; // e.g. openrouter/custom-openai/azure-openai etc.
llmOptions.ModelOverride = "deepseek/deepseek-chat";
llmOptions.BaseUrlOverride = "https://api.deepseek.com/v1";
llmOptions.ApiKeyOverride = "sk-...";

| Style | Description | Use Case |
|---|---|---|
| Concise | Brief, to-the-point | Quick answers |
| Informative | Detailed explanations | Learning content |
| Academic | Formal, scholarly | Research papers |
| Technical | Developer-focused | API documentation |
| Casual | Conversational tone | Chat interfaces |
PostgreSQL + pgvector operations for document and embedding management.
Store document in vector database.
Task<Guid> StoreDocumentAsync(
Document document,
CancellationToken cancellationToken = default
);

Retrieve document by ID.
Task<Document?> GetDocumentAsync(
Guid documentId,
bool includeEmbeddings = false,
CancellationToken cancellationToken = default
);

Full-text search in documents.
Task<List<Document>> SearchDocumentsAsync(
string query,
int limit = 10,
CancellationToken cancellationToken = default
);

Store vector embeddings for document chunks.
Task<int> StoreEmbeddingsAsync(
Guid documentId,
List<TextChunk> chunks,
CancellationToken cancellationToken = default
);

Vector similarity search.
Task<List<SimilaritySearchResult>> SearchSimilarAsync(
float[] queryEmbedding,
double threshold = 0.7,
int maxResults = 10,
string? embeddingModel = null,
CancellationToken cancellationToken = default
);

Notes (flexible dimensions):
- Storage uses a VECTOR(1536) column plus an additional embedding_dimension column; the service normalizes the query embedding to 1536 (pad/trim), and result vectors are trimmed back to their original dimension during mapping.
- The HNSW ef_search parameter can be set locally per query (SET LOCAL); it defaults to the RAG:PostgreSQL:HnswEfSearch setting.
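`SearchSimilarAsync` pairs naturally with `IEmbeddingService`; a sketch of the raw-vector path (the property name read from `SimilaritySearchResult` is an assumption):

```csharp
// Embed the query, then search the vector store directly
var queryEmbedding = await embeddingService.GenerateEmbeddingAsync("database backup strategy");

var matches = await vectorStore.SearchSimilarAsync(
    queryEmbedding.ToArray(),   // ReadOnlyMemory<float> -> float[]
    threshold: 0.7,
    maxResults: 5
);

foreach (var match in matches)
{
    Console.WriteLine($"{match.Similarity:F3}");   // property name assumed
}
```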
Retrieve vector store statistics.
Task<VectorStoreStatistics> GetStatisticsAsync(CancellationToken cancellationToken = default);

Example:
var stats = await vectorStore.GetStatisticsAsync();
Console.WriteLine($"Total Documents: {stats.TotalDocuments}");
Console.WriteLine($"Total Embeddings: {stats.TotalEmbeddings}");
Console.WriteLine($"Database Size: {stats.DatabaseSizeBytes / (1024*1024)} MB");
Console.WriteLine($"Indexed Vectors: {stats.IndexedVectors}");

Vector embedding generation with multiple providers.
Generate embedding vector for text.
Task<ReadOnlyMemory<float>> GenerateEmbeddingAsync(
string text,
CancellationToken cancellationToken = default
);

Get information about current embedding provider.
EmbeddingProviderInfo GetProviderInfo();

| Provider | Model | Dimensions | Speed | Cost |
|---|---|---|---|---|
| Local | all-MiniLM-L6-v2 | 384 | Fast | Free |
| OpenAI | text-embedding-ada-002 | 1536 | Medium | $0.0001/1K tokens |
| Azure OpenAI | text-embedding-ada-002 | 1536 | Medium | Enterprise pricing |
Example:
// Generate embedding for query
var embedding = await embeddingService.GenerateEmbeddingAsync("machine learning algorithms");
Console.WriteLine($"Embedding dimensions: {embedding.Length}");
// Get provider info
var info = embeddingService.GetProviderInfo();
Console.WriteLine($"Provider: {info.ProviderName}, Model: {info.ModelName}");

Complete response from RAG query.
public class RAGResponse
{
public string UserQuery { get; set; }
public string Response { get; set; }
public SemanticSearchResult SearchResults { get; set; }
public LLMResponse LLMResponse { get; set; }
public double OverallConfidence { get; set; }
public List<string> FollowUpQuestions { get; set; }
public ResponseQualityAssessment? QualityAssessment { get; set; }
public RAGPerformanceMetrics Performance { get; set; }
public List<string> Warnings { get; set; }
public DateTime Timestamp { get; set; }
}

Individual search result with metadata.
public class SearchResultItem
{
public Guid Id { get; set; }
public Guid DocumentId { get; set; }
public string DocumentTitle { get; set; }
public string Content { get; set; }
public string? ExtendedContext { get; set; }
public double SimilarityScore { get; set; }
public double? RerankScore { get; set; }
public int ChunkIndex { get; set; }
public string EmbeddingModel { get; set; }
public List<string> Highlights { get; set; }
public MatchInfo MatchInfo { get; set; }
}

Result of document processing operation.
public class DocumentProcessingResult
{
public bool Success { get; set; }
public Document? Document { get; set; }
public List<string> Errors { get; set; }
public List<string> Warnings { get; set; }
public TimeSpan ProcessingTime { get; set; }
public int ExtractedChunks { get; set; }
}

The system uses custom exception types for different error scenarios:
// Document processing errors
public class DocumentProcessingException : Exception
{
public string FileName { get; }
public string FileType { get; }
}
// Vector search errors
public class VectorSearchException : Exception
{
public string SearchQuery { get; }
public SearchOptions Options { get; }
}
// LLM generation errors
public class LLMGenerationException : Exception
{
public string Query { get; }
public LLMGenerationOptions Options { get; }
}

public class ApiErrorResponse
{
public string Error { get; set; }
public string Message { get; set; }
public string? Details { get; set; }
public DateTime Timestamp { get; set; }
public string TraceId { get; set; }
}

// 1. Configure services
services.AddScoped<IRAGService, RAGService>();
services.AddScoped<IDocumentProcessorService, DocumentProcessorService>();
services.AddScoped<ISemanticSearchService, SemanticSearchService>();
services.AddScoped<ILLMService, LLMService>();
// 2. Inject and use
public class DocumentController : ControllerBase
{
private readonly IRAGService _ragService;
public DocumentController(IRAGService ragService)
{
_ragService = ragService;
}
[HttpPost("upload")]
public async Task<IActionResult> UploadDocument(IFormFile file)
{
var browserFile = new BrowserFileWrapper(file);
var result = await _ragService.IngestDocumentAsync(browserFile);
return result.Success
? Ok(new { DocumentId = result.DocumentId, Message = result.Message })
: BadRequest(new { Errors = result.Errors });
}
[HttpPost("query")]
public async Task<IActionResult> Query([FromBody] QueryRequest request)
{
var response = await _ragService.QueryAsync(
request.Query,
new RAGOptions
{
SearchOptions = request.SearchOptions,
LLMOptions = request.LLMOptions
}
);
return Ok(response);
}
}

This API documentation provides comprehensive coverage of all services and their capabilities. Each service is designed to work independently or as part of the larger RAG pipeline, providing flexibility for different integration scenarios.
- ILLMRerankerService – LLM-based (OpenAI/Gemini/Anthropic/xAI – provider selected via the RAG configuration section)
- ICohereRerankService – Cohere API (rerank-v3.5)
- Connection testing, ExecuteCypherQueryAsync, adding nodes/relationships, statistics
- The Graph.razor UI uses the service directly to execute queries and modify the graph
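A hedged sketch of calling IGraphDatabaseService from application code; the exact signature of `ExecuteCypherQueryAsync` (parameter shape and return type) is an assumption:

```csharp
// Assumed signature:
// Task<List<Dictionary<string, object>>> ExecuteCypherQueryAsync(
//     string cypher, Dictionary<string, object>? parameters = null,
//     CancellationToken ct = default)
var rows = await graphDb.ExecuteCypherQueryAsync(
    "MATCH (d:Document)-[:MENTIONS]->(e:Entity) WHERE e.name = $name RETURN d.title",
    new Dictionary<string, object> { ["name"] = "PostgreSQL" }
);
```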