diff --git a/examples/nvingest_mcp/README.md b/examples/nvingest_mcp/README.md new file mode 100644 index 0000000..f052854 --- /dev/null +++ b/examples/nvingest_mcp/README.md @@ -0,0 +1,318 @@ + + +# NV-Ingest MCP RAG Example + +This example demonstrates how to use NVIDIA NeMo Agent toolkit with NV-Ingest for document processing and retrieval workflows. You can expose NV-Ingest capabilities as MCP tools for AI agents to ingest documents, build knowledge bases, and perform semantic search. + +## Prerequisites + +1. **NeMo Agent Toolkit**: Ensure you have the NeMo Agent toolkit installed. If you have not already done so, follow the instructions in the [NeMo Agent Toolkit documentation](https://github.com/NVIDIA/NeMo-Agent-Toolkit) to create the development environment and install NeMo Agent Toolkit. + +2. **NV-Ingest Stack**: This example assumes you have NV-Ingest (NeMo Retriever Extraction) deployed locally. Your NV-Ingest services must be running: + - NV-Ingest microservice on port 7670 + - Milvus on port 19530 + - Embedding service on port 8012 + - MinIO on port 9000 + + If you don't have NV-Ingest deployed, refer to the [NV-Ingest documentation](https://github.com/NVIDIA/nv-ingest/blob/main/README.md) for setup instructions. For quick testing, you can use [Library Mode](https://github.com/NVIDIA/nv-ingest/blob/main/README.md#library-mode-quickstart) which requires only self-hosted NIMs or NIMs hosted on build.nvidia.com. + +3. **API Key**: Set your NVIDIA API key: + ```bash + export NVIDIA_API_KEY= + ``` + + If you don't have an API key, you can get one from [build.nvidia.com](https://org.ngc.nvidia.com/setup/api-keys). + +**Note**: If you installed NeMo Agent toolkit from source, MCP client functionality is already included. If you installed from PyPI, you may need to install the MCP dependencies separately with `uv pip install "nvidia-nat[mcp,langchain]"`. 
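Before installing the workflow, it can help to confirm the prerequisite services are actually listening on the ports listed above. The snippet below is a small hypothetical helper (not part of the toolkit) that probes each port with a plain TCP connect:

```python
import socket

# Map each prerequisite service to the local port listed above.
SERVICES = {
    "nv-ingest": 7670,
    "milvus": 19530,
    "embedding": 8012,
    "minio": 9000,
}

def port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

if __name__ == "__main__":
    for name, port in SERVICES.items():
        status = "up" if port_open("localhost", port) else "DOWN"
        print(f"{name:10s} localhost:{port:<6d} {status}")
```

If any service reports `DOWN`, revisit the NV-Ingest deployment steps before continuing.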
+ +## Install this Workflow + +Install this example: + +```bash +uv pip install -e examples/nvingest_mcp +``` + +## Run the Workflow + +This example includes sample PDF files in `data/` for testing. + +**Note:** This is a simple example demonstrating end-to-end functionality. The tool implementations in `src/` are intentionally basic to illustrate the concepts. For production use cases, you may want to extend the retrieval logic with filtering, reranking, or more sophisticated chunking strategies. + +This example supports three modes of operation: + +| Mode | When to Use | +|------|-------------| +| **Mode 1: Direct Agent** | Uses `nat run` with tools defined in YAML - fastest and simplest | +| **Mode 2: MCP Server** | Expose tools for Cursor, Claude Desktop, or other MCP-compatible apps | +| **Mode 3: MCP Client** | Your NAT agent uses tools from an external MCP server (different tools or different machine) | + +### Mode 1: Direct Agent (Simplest) + +Tools defined directly in YAML configuration. Lowest latency, single process. + +#### Example Queries + +```bash +# Ingest the sample document +nat run --config_file examples/nvingest_mcp/configs/nvingest_agent_direct.yml \ + --input "Ingest examples/nvingest_mcp/data/multimodal_test.pdf into the vector database" +``` + +**Expected output:** The pipeline extracts text, tables, and charts from the PDF, generates embeddings, and uploads 8 chunks to Milvus collection `nv_ingest_collection`. + +```bash +# Search the knowledge base +nat run --config_file examples/nvingest_mcp/configs/nvingest_agent_direct.yml \ + --input "Search for information about tables and charts" +``` + +**Expected output:** Returns information about charts showing average frequency ranges for speaker drivers and tables describing animals and their activities. To verify the extracted content, view the source document: [multimodal_test.pdf](data/multimodal_test.pdf). 
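The chunk counts reported in the expected outputs come from NV-Ingest's result shape: `ingest()` returns one list of chunk dictionaries per input document (`List[List[Dict]]`). A minimal sketch of that counting logic, with illustrative sample data (the chunk contents here are made up):

```python
# One inner list per ingested document; each dict is one extracted chunk.
sample_results = [
    [
        {"metadata": {"content": "Page 1 text ...",
                      "content_metadata": {"type": "text"}}},
        {"metadata": {"content": "| Animal | Activity | ...",
                      "content_metadata": {"type": "structured"}}},
    ],
]

def count_chunks(results) -> int:
    """Total chunks across all ingested documents."""
    return sum(len(doc) for doc in results if isinstance(doc, list))

print(count_chunks(sample_results))  # prints 2 for this sample
```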
+ +```bash +# Ask questions about the ingested documents +nat run --config_file examples/nvingest_mcp/configs/nvingest_agent_direct.yml \ + --input "What animals are mentioned in the documents?" +``` + +**Expected output:** The animals mentioned in the documents are Giraffe, Lion, Cat, and Dog. + +### Mode 2: MCP Server + +Expose tools as MCP endpoints for external clients. + +#### Start the Server + +```bash +nat mcp serve \ + --config_file examples/nvingest_mcp/configs/nvingest_mcp_server.yml \ + --host 0.0.0.0 \ + --port 9901 +``` + +Server will be available at: `http://localhost:9901/mcp` + +#### Verify Server is Running + +```bash +# Check health +curl http://localhost:9901/health + +# List available tools +curl http://localhost:9901/debug/tools/list | jq + +# Ping through MCP protocol +nat mcp client ping --url http://localhost:9901/mcp + +# List tools through MCP protocol +nat mcp client tool list --url http://localhost:9901/mcp +``` + +#### Call Tools Directly (No LLM) + +```bash +# Call document_ingest_vdb tool +nat mcp client tool call document_ingest_vdb \ + --url http://localhost:9901/mcp \ + --json-args '{"file_path": "examples/nvingest_mcp/data/embedded_table.pdf"}' +``` + +**Expected output:** 10 chunks uploaded to Milvus collection `nv_ingest_collection`. + +```bash +# Call semantic_search tool (returns raw document chunks) +nat mcp client tool call semantic_search \ + --url http://localhost:9901/mcp \ + --json-args '{"query": "What is the minimum version of xlrd and what are the notes about it?"}' +``` + +**Expected output:** Returns raw document chunks containing the Excel dependencies table with xlrd version 2.0.1. 
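Quoting JSON on the command line is a common source of errors with `--json-args`. If you script these tool calls, a small hypothetical helper (not part of the toolkit) can build the command with the JSON arguments quoted safely:

```python
import json
import shlex

def tool_call_cmd(tool: str, url: str, args: dict) -> str:
    """Build a `nat mcp client tool call` command line like the ones
    above, with the JSON arguments shell-quoted."""
    return (
        f"nat mcp client tool call {tool} --url {url} "
        f"--json-args {shlex.quote(json.dumps(args))}"
    )

cmd = tool_call_cmd(
    "semantic_search",
    "http://localhost:9901/mcp",
    {"query": "What is the minimum version of xlrd?"},
)
print(cmd)
```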
+ +#### Call the Agent Workflow (With LLM Reasoning) + +The `react_agent` workflow is also exposed as an MCP tool, allowing you to ask questions and get formulated answers: + +```bash +# Ask a question and get a reasoned answer +nat mcp client tool call react_agent \ + --url http://localhost:9901/mcp \ + --json-args '{"query": "What is the minimum version of xlrd and what are the notes about it?"}' +``` + +**Expected output:** The minimum version of xlrd is 2.0.1, and the notes indicate it is used for "Reading Excel". To verify, see [embedded_table.pdf](data/embedded_table.pdf). + +### Mode 3: MCP Client + +Connect to the MCP server from another workflow. + +#### Step 1: Start the MCP Server (Terminal 1) + +```bash +nat mcp serve \ + --config_file examples/nvingest_mcp/configs/nvingest_mcp_server.yml \ + --host 0.0.0.0 +``` + +#### Step 2: Run the MCP Client Workflow (Terminal 2) + +```bash +# Set your API key first +export NVIDIA_API_KEY= + +# Ingest through MCP +nat run --config_file examples/nvingest_mcp/configs/nvingest_mcp_client.yml \ + --input "Ingest examples/nvingest_mcp/data/test-page-form.pdf into the database" +``` + +**Expected output:** 1 chunk uploaded to Milvus collection `nv_ingest_collection`. + +```bash +# Search through MCP +nat run --config_file examples/nvingest_mcp/configs/nvingest_mcp_client.yml \ + --input "Search the knowledge base for information about Parallel Key-Value Cache Fusion and who are the authors" +``` + +**Expected output:** The authors of the paper "Parallel Key-Value Cache Fusion for Position Invariant RAG" are Philhoon Oh, Jinwoo Shin, and James Thorne from KAIST AI. To verify, see [test-page-form.pdf](data/test-page-form.pdf). + +## Available Tools + +### 1. `document_ingest` - Extract Content (No VDB) + +Extracts content from documents and returns it directly. Use for inspection or custom processing. 
+ +**Input:** +- `file_path`: Path to the document (PDF, DOCX, and so on) + +**Output:** Extracted text, tables, and charts as formatted string + +### 2. `document_ingest_vdb` - Extract + Embed + Upload + +Full pipeline: extracts content, generates embeddings, uploads to Milvus. Use to build your knowledge base. + +**Input:** +- `file_path`: Path to the document + +**Output:** Status message with chunk count and collection name + +### 3. `semantic_search` - Query Knowledge Base + +Searches Milvus for documents relevant to a natural language query. + +**Input:** +- `query`: Natural language search query + +**Output:** Top-K relevant document chunks + +## Configuration + +### Extraction Options + +Configure what to extract from documents in your config YAML: + +| Option | Description | NIM Service Used | +|--------|-------------|------------------| +| `extract_text` | Plain text content | Built-in | +| `extract_tables` | Structured table data | table-structure (port 8006) | +| `extract_charts` | Chart and graphic descriptions | graphic-elements (port 8003) | +| `extract_images` | Image content | page-elements (port 8000) | + +Example configuration: + +```yaml +document_ingest_vdb: + extract_text: true + extract_tables: true + extract_charts: true + extract_images: false +``` + +> **Note on Embedding URLs:** The configs use different embedding URLs because of Docker networking: +> - `document_ingest_vdb` uses `http://embedding:8000/v1` - NV-Ingest runs inside Docker and reaches the embedding service via Docker's internal network +> - `semantic_search` uses `http://localhost:8012/v1` - runs on the host machine and reaches the embedding service via the exposed port + +### Configuration Reference + +#### Direct Mode (`nvingest_agent_direct.yml`) + +```yaml +functions: + document_ingest_vdb: + _type: nvingest_document_ingest_vdb + nvingest_host: localhost + nvingest_port: 7670 + milvus_uri: http://localhost:19530 + collection_name: nv_ingest_collection + embedding_url: 
http://embedding:8000/v1 # Docker internal network + embedding_model: nvidia/llama-3.2-nv-embedqa-1b-v2 + + semantic_search: + _type: milvus_semantic_search + milvus_uri: http://localhost:19530 + embedding_url: http://localhost:8012/v1 # Host machine access + embedding_model: nvidia/llama-3.2-nv-embedqa-1b-v2 + +workflow: + _type: react_agent + tool_names: [document_ingest, document_ingest_vdb, semantic_search] + llm_name: nim_llm +``` + +#### MCP Client Mode (`nvingest_mcp_client.yml`) + +```yaml +function_groups: + nvingest_tools: + _type: mcp_client + server: + transport: streamable-http + url: "http://localhost:9901/mcp" + include: + - document_ingest_vdb + - semantic_search + +workflow: + _type: react_agent + tool_names: [nvingest_tools] +``` + +## Troubleshooting + +Test your service connections: + +```bash +# Test NV-Ingest connection +curl http://localhost:7670/health + +# Test Milvus connection +curl http://localhost:19530/health + +# Test embedding service +curl http://localhost:8012/v1/models + +# Test MCP server (if running) +curl http://localhost:9901/health +``` + +## Capabilities + +This integration enables AI agents to: + +- **Ingest documents**: Process PDFs, extract text, tables, charts, and images +- **Build knowledge bases**: Automatically embed and store documents in Milvus VDB +- **Semantic search**: Query the knowledge base using natural language +- **Retrieval workflows**: Combine retrieval with LLM reasoning for document Q&A diff --git a/examples/nvingest_mcp/configs b/examples/nvingest_mcp/configs new file mode 120000 index 0000000..2ebd658 --- /dev/null +++ b/examples/nvingest_mcp/configs @@ -0,0 +1 @@ +src/nvingest_mcp_rag/configs \ No newline at end of file diff --git a/examples/nvingest_mcp/data b/examples/nvingest_mcp/data new file mode 120000 index 0000000..680920b --- /dev/null +++ b/examples/nvingest_mcp/data @@ -0,0 +1 @@ +src/nvingest_mcp_rag/data \ No newline at end of file diff --git a/examples/nvingest_mcp/pyproject.toml 
b/examples/nvingest_mcp/pyproject.toml new file mode 100644 index 0000000..1d7ba23 --- /dev/null +++ b/examples/nvingest_mcp/pyproject.toml @@ -0,0 +1,40 @@ +# SPDX-FileCopyrightText: Copyright (c) 2024-2025, NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +[build-system] +build-backend = "setuptools.build_meta" +requires = ["setuptools >= 64"] + +[project] +name = "nvingest_mcp" +version = "0.1.0" +dependencies = [ + "nvidia-nat[mcp,langchain]~=1.3", + "nv-ingest-client", + "llama_index", + "llama-index-embeddings-nvidia", + "llama-index-llms-nvidia", + "llama-index-vector-stores-milvus", + "pymilvus[bulk_writer,model]", # bulk_writer: minio/azure, model: BM25EmbeddingFunction + "minio" +] +requires-python = ">=3.11,<3.14" +description = "NV-Ingest MCP Server + RAG Workflow Example using NeMo Agent Toolkit" +keywords = ["ai", "rag", "agents", "mcp", "nv-ingest"] +classifiers = ["Programming Language :: Python"] + +[project.entry-points.'nat.components'] +nvingest_mcp = "nvingest_mcp_rag.register" + diff --git a/examples/nvingest_mcp/src/.gitignore b/examples/nvingest_mcp/src/.gitignore new file mode 100644 index 0000000..b810e9b --- /dev/null +++ b/examples/nvingest_mcp/src/.gitignore @@ -0,0 +1,4 @@ +# Python packaging metadata (generated by setuptools editable install) +*.egg-info/ + + diff --git a/examples/nvingest_mcp/src/nvingest_mcp_rag/__init__.py 
b/examples/nvingest_mcp/src/nvingest_mcp_rag/__init__.py new file mode 100644 index 0000000..bec85dd --- /dev/null +++ b/examples/nvingest_mcp/src/nvingest_mcp_rag/__init__.py @@ -0,0 +1,19 @@ +# SPDX-FileCopyrightText: Copyright (c) 2024-2025, NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +"""NV-Ingest MCP RAG Example Package.""" + + + diff --git a/examples/nvingest_mcp/src/nvingest_mcp_rag/configs/nvingest_agent_direct.yml b/examples/nvingest_mcp/src/nvingest_mcp_rag/configs/nvingest_agent_direct.yml new file mode 100644 index 0000000..4fc499a --- /dev/null +++ b/examples/nvingest_mcp/src/nvingest_mcp_rag/configs/nvingest_agent_direct.yml @@ -0,0 +1,92 @@ +# SPDX-FileCopyrightText: Copyright (c) 2024-2025, NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ +# ============================================================================= +# NV-Ingest Direct Agent Configuration +# ============================================================================= +# +# ReAct agent with NV-Ingest tools defined directly (no MCP). +# Simplest mode - lowest latency. +# +# Usage: +# nat run --config_file configs/nvingest_agent_direct.yml --input "Your query" +# +# Requires: NV-Ingest (7670), Milvus (19530), Embedding (8012) +# +# ============================================================================= + +# NV-Ingest tools +functions: + # Extract content from documents (no VDB upload) + document_ingest: + _type: nvingest_document_ingest + nvingest_host: localhost + nvingest_port: 7670 + # Extraction options (enable/disable as needed) + extract_text: true + extract_tables: true # Uses table-structure NIM (port 8006) + extract_charts: true # Uses graphic-elements NIM (port 8003) + extract_images: false # Set to true for image extraction + text_depth: page # 'page' or 'document' + + # Extract, embed, and upload to Milvus (multimodal) + document_ingest_vdb: + _type: nvingest_document_ingest_vdb + nvingest_host: localhost + nvingest_port: 7670 + # Extraction options + extract_text: true + extract_tables: true # Uses table-structure NIM (port 8006) + extract_charts: true # Uses graphic-elements NIM (port 8003) + extract_images: false # Set to true for image extraction + text_depth: page + # VDB settings + milvus_uri: http://localhost:19530 + collection_name: nv_ingest_collection + embedding_url: http://embedding:8000/v1 + embedding_model: nvidia/llama-3.2-nv-embedqa-1b-v2 + minio_endpoint: localhost:9000 + minio_access_key: minioadmin + minio_secret_key: minioadmin + + # Search Milvus for relevant documents + semantic_search: + _type: milvus_semantic_search + milvus_uri: http://localhost:19530 + collection_name: nv_ingest_collection + embedding_url: http://localhost:8012/v1 + embedding_model: nvidia/llama-3.2-nv-embedqa-1b-v2 + 
top_k: 5 + +# LLM for agent reasoning +llms: + nim_llm: + _type: nim + model_name: meta/llama-3.1-70b-instruct + temperature: 0.0 + max_tokens: 2048 + api_key: ${NVIDIA_API_KEY} + +# ReAct agent workflow +workflow: + _type: react_agent + tool_names: + - document_ingest + - document_ingest_vdb + - semantic_search + llm_name: nim_llm + verbose: true + parse_agent_response_max_retries: 3 + diff --git a/examples/nvingest_mcp/src/nvingest_mcp_rag/configs/nvingest_mcp_client.yml b/examples/nvingest_mcp/src/nvingest_mcp_rag/configs/nvingest_mcp_client.yml new file mode 100644 index 0000000..7332ffa --- /dev/null +++ b/examples/nvingest_mcp/src/nvingest_mcp_rag/configs/nvingest_mcp_client.yml @@ -0,0 +1,60 @@ +# SPDX-FileCopyrightText: Copyright (c) 2024-2025, NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +# ============================================================================= +# NV-Ingest MCP Client Configuration +# ============================================================================= +# +# Connects to the NV-Ingest MCP server and uses tools via MCP protocol. +# +# Usage: +# 1. Start MCP server first: +# nat mcp serve --config_file configs/nvingest_mcp_server.yml +# +# 2. 
Run this client: +# nat run --config_file configs/nvingest_mcp_client.yml --input "Your query" +# +# ============================================================================= + +# Connect to NV-Ingest MCP server and use tools remotely +function_groups: + nvingest_tools: + _type: mcp_client + server: + transport: streamable-http + url: "http://localhost:9901/mcp" + include: + - document_ingest + - document_ingest_vdb + - semantic_search + +# LLM for agent reasoning +llms: + nim_llm: + _type: nim + model_name: meta/llama-3.1-70b-instruct + temperature: 0.0 + max_tokens: 2048 + api_key: ${NVIDIA_API_KEY} + +# ReAct agent workflow +workflow: + _type: react_agent + tool_names: + - nvingest_tools + llm_name: nim_llm + verbose: true + parse_agent_response_max_retries: 3 + diff --git a/examples/nvingest_mcp/src/nvingest_mcp_rag/configs/nvingest_mcp_server.yml b/examples/nvingest_mcp/src/nvingest_mcp_rag/configs/nvingest_mcp_server.yml new file mode 100644 index 0000000..5f0d028 --- /dev/null +++ b/examples/nvingest_mcp/src/nvingest_mcp_rag/configs/nvingest_mcp_server.yml @@ -0,0 +1,91 @@ +# SPDX-FileCopyrightText: Copyright (c) 2024-2025, NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ +# ============================================================================= +# NV-Ingest MCP Server Configuration +# ============================================================================= +# +# Exposes NV-Ingest tools as MCP endpoints for external clients. +# +# Usage: +# nat mcp serve --config_file configs/nvingest_mcp_server.yml --host 0.0.0.0 +# +# Server: http://localhost:9901/mcp +# Tools: document_ingest, document_ingest_vdb, semantic_search +# +# ============================================================================= + +# NV-Ingest tools (exposed as MCP endpoints) +functions: + # Extract content from documents (no VDB upload) + document_ingest: + _type: nvingest_document_ingest + nvingest_host: localhost + nvingest_port: 7670 + # Extraction options (enable/disable as needed) + extract_text: true + extract_tables: true # Uses table-structure NIM (port 8006) + extract_charts: true # Uses graphic-elements NIM (port 8003) + extract_images: false # Set to true for image extraction + text_depth: page # 'page' or 'document' + + # Extract, embed, and upload to Milvus (multimodal) + document_ingest_vdb: + _type: nvingest_document_ingest_vdb + nvingest_host: localhost + nvingest_port: 7670 + # Extraction options + extract_text: true + extract_tables: true # Uses table-structure NIM (port 8006) + extract_charts: true # Uses graphic-elements NIM (port 8003) + extract_images: false # Set to true for image extraction + text_depth: page + # VDB settings + milvus_uri: http://localhost:19530 + collection_name: nv_ingest_collection + embedding_url: http://embedding:8000/v1 + embedding_model: nvidia/llama-3.2-nv-embedqa-1b-v2 + minio_endpoint: localhost:9000 + minio_access_key: minioadmin + minio_secret_key: minioadmin + + # Search Milvus for relevant documents + semantic_search: + _type: milvus_semantic_search + milvus_uri: http://localhost:19530 + collection_name: nv_ingest_collection + embedding_url: http://localhost:8012/v1 + embedding_model: 
nvidia/llama-3.2-nv-embedqa-1b-v2 + top_k: 5 + +# LLM for agent reasoning +llms: + nim_llm: + _type: nim + model_name: meta/llama-3.1-70b-instruct + temperature: 0.0 + max_tokens: 2048 + api_key: ${NVIDIA_API_KEY} + +# ReAct agent workflow (functions above are exposed as MCP tools) +workflow: + _type: react_agent + tool_names: + - document_ingest + - document_ingest_vdb + - semantic_search + llm_name: nim_llm + verbose: true + parse_agent_response_max_retries: 3 diff --git a/examples/nvingest_mcp/src/nvingest_mcp_rag/data/embedded_table.pdf b/examples/nvingest_mcp/src/nvingest_mcp_rag/data/embedded_table.pdf new file mode 100644 index 0000000..099c08f Binary files /dev/null and b/examples/nvingest_mcp/src/nvingest_mcp_rag/data/embedded_table.pdf differ diff --git a/examples/nvingest_mcp/src/nvingest_mcp_rag/data/multimodal_test.pdf b/examples/nvingest_mcp/src/nvingest_mcp_rag/data/multimodal_test.pdf new file mode 100644 index 0000000..61edab6 Binary files /dev/null and b/examples/nvingest_mcp/src/nvingest_mcp_rag/data/multimodal_test.pdf differ diff --git a/examples/nvingest_mcp/src/nvingest_mcp_rag/data/test-page-form.pdf b/examples/nvingest_mcp/src/nvingest_mcp_rag/data/test-page-form.pdf new file mode 100644 index 0000000..cccf300 Binary files /dev/null and b/examples/nvingest_mcp/src/nvingest_mcp_rag/data/test-page-form.pdf differ diff --git a/examples/nvingest_mcp/src/nvingest_mcp_rag/register.py b/examples/nvingest_mcp/src/nvingest_mcp_rag/register.py new file mode 100644 index 0000000..226a911 --- /dev/null +++ b/examples/nvingest_mcp/src/nvingest_mcp_rag/register.py @@ -0,0 +1,510 @@ +# SPDX-FileCopyrightText: Copyright (c) 2024-2025, NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. 
+# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +""" +NV-Ingest MCP RAG Example - Custom Function Registration + +This module registers custom NAT functions that wrap NV-Ingest operations. +These functions can be exposed via MCP server for use by other workflows. + +Aligned with the ingestion_text_only pattern from the main NV-Ingest API. +""" + +import asyncio +import logging +import os +from typing import Optional + +from pydantic import Field + +from nat.builder.builder import Builder +from nat.builder.function_info import FunctionInfo +from nat.cli.register_workflow import register_function +from nat.data_models.function import FunctionBaseConfig + +logger = logging.getLogger(__name__) + + +# ============================================================================= +# FUNCTION 1: Document Ingest Function (Configurable extraction) +# ============================================================================= + +class DocumentIngestConfig(FunctionBaseConfig, name="nvingest_document_ingest"): + """Configuration for the NV-Ingest document ingestion function.""" + + nvingest_host: str = Field( + default="localhost", + description="Hostname of the NV-Ingest service" + ) + nvingest_port: int = Field( + default=7670, + description="Port of the NV-Ingest service" + ) + # Extraction options + extract_text: bool = Field( + default=True, + description="Extract text content from documents" + ) + extract_tables: bool = Field( + default=False, + description="Extract tables from documents" + ) + extract_charts: bool = Field( + default=False, + description="Extract charts/graphics from documents" + ) + 
extract_images: bool = Field( + default=False, + description="Extract images from documents" + ) + text_depth: str = Field( + default="page", + description="Text extraction depth: 'page' or 'document'" + ) + + +@register_function(config_type=DocumentIngestConfig) +async def nvingest_document_ingest_function(config: DocumentIngestConfig, builder: Builder): + """ + Ingest a document using NV-Ingest and return the extracted content. + Extraction targets (text, tables, charts, images) are configurable via the function config. + """ + from nv_ingest_client.client import Ingestor, NvIngestClient + + # Create client with proper configuration + client = NvIngestClient( + message_client_hostname=config.nvingest_host, + message_client_port=config.nvingest_port, + ) + + async def _ingest_document(file_path: str) -> str: + """ + Ingest a document and return the extracted content. + + Args: + file_path: Path to the document file to ingest + + Returns: + Extracted content from the document as a string + """ + logger.info(f"Ingesting document: {file_path}") + + # Validate file exists + if not os.path.exists(file_path): + return f"Error: File not found: {file_path}" + + try: + # Create ingestor with client + ingestor = Ingestor(client=client) + + # Configure pipeline with extraction options from config + ingestor = ingestor.files(file_path) + ingestor = ingestor.extract( + extract_text=config.extract_text, + extract_tables=config.extract_tables, + extract_charts=config.extract_charts, + extract_images=config.extract_images, + text_depth=config.text_depth, + ) + + extraction_types = [] + if config.extract_text: + extraction_types.append("text") + if config.extract_tables: + extraction_types.append("tables") + if config.extract_charts: + extraction_types.append("charts") + if config.extract_images: + extraction_types.append("images") + logger.info(f"Extracting: {', '.join(extraction_types)}") + + # Run ingestion synchronously in thread + results = await asyncio.to_thread( + lambda:
ingestor.ingest(show_progress=True) + ) + + # Extract content from results + # NV-Ingest returns List[List[Dict]] - each inner list is chunks for one document + extracted_content = [] + + logger.info(f"Processing {len(results)} document results") + + for doc_idx, doc_chunks in enumerate(results): + if isinstance(doc_chunks, list) and len(doc_chunks) > 0: + logger.info(f"Document {doc_idx}: {len(doc_chunks)} chunks") + for chunk in doc_chunks: + if isinstance(chunk, dict): + # Access per NV-Ingest MetadataSchema structure + metadata = chunk.get("metadata", {}) or {} + content = metadata.get("content", "") + content_type = metadata.get("content_metadata", {}).get("type", "text") + if content: + extracted_content.append(f"[{content_type}] {content}") + + if extracted_content: + summary = f"Extracted {len(extracted_content)} chunks from {file_path}:\n\n" + return summary + "\n\n---\n\n".join(extracted_content) + else: + return f"Document {file_path} was processed but no content was extracted." + + except Exception as e: + logger.error(f"Error ingesting document: {e}", exc_info=True) + return f"Error ingesting document: {str(e)}" + + yield FunctionInfo.from_fn( + _ingest_document, + description=( + "Ingest a document using NV-Ingest to extract text content. " + "Provide the file path to a PDF or other supported document type." + ) + ) + + +# ============================================================================= +# FUNCTION 2: Document Ingest with VDB Upload Function +# ============================================================================= + +class DocumentIngestVDBConfig(FunctionBaseConfig, name="nvingest_document_ingest_vdb"): + """Configuration for NV-Ingest document ingestion with VDB upload. + Supports text, tables, charts, and images extraction. 
+ """ + + nvingest_host: str = Field( + default="localhost", + description="Hostname of the NV-Ingest service" + ) + nvingest_port: int = Field( + default=7670, + description="Port of the NV-Ingest service" + ) + # Extraction options + extract_text: bool = Field( + default=True, + description="Extract text content from documents" + ) + extract_tables: bool = Field( + default=False, + description="Extract tables from documents" + ) + extract_charts: bool = Field( + default=False, + description="Extract charts/graphics from documents" + ) + extract_images: bool = Field( + default=False, + description="Extract images from documents" + ) + text_depth: str = Field( + default="page", + description="Text extraction depth: 'page' or 'document'" + ) + # VDB configuration + milvus_uri: str = Field( + default="http://localhost:19530", + description="URI of the Milvus vector database" + ) + collection_name: str = Field( + default="nv_ingest_collection", + description="Name of the Milvus collection to upload to" + ) + embedding_url: str = Field( + default="http://localhost:8012/v1", + description="Endpoint URL for the embedding model" + ) + embedding_model: str = Field( + default="nvidia/llama-3.2-nv-embedqa-1b-v2", + description="Name of the embedding model" + ) + # MinIO configuration for multimodal storage + minio_endpoint: str = Field( + default="minio:9000", + description="MinIO endpoint for storage" + ) + minio_access_key: str = Field( + default="minioadmin", + description="MinIO access key" + ) + minio_secret_key: str = Field( + default="minioadmin", + description="MinIO secret key" + ) + + +@register_function(config_type=DocumentIngestVDBConfig) +async def nvingest_document_ingest_vdb_function(config: DocumentIngestVDBConfig, builder: Builder): + """ + Ingest a document using NV-Ingest, embed it, and upload to Milvus VDB. + Text-only extraction - aligned with ingestion_text_only pattern. 
    """
    from nv_ingest_client.client import Ingestor, NvIngestClient

    # Create client with proper configuration
    client = NvIngestClient(
        message_client_hostname=config.nvingest_host,
        message_client_port=config.nvingest_port,
    )

    async def _ingest_and_upload(file_path: str) -> str:
        """
        Ingest a document, embed content, and upload to the vector database.

        Args:
            file_path: Path to the document file to ingest

        Returns:
            Status message indicating success or failure
        """
        # Build extraction types list for logging
        extraction_types = []
        if config.extract_text:
            extraction_types.append("text")
        if config.extract_tables:
            extraction_types.append("tables")
        if config.extract_charts:
            extraction_types.append("charts")
        if config.extract_images:
            extraction_types.append("images")

        # Log detailed configuration for visibility
        logger.info("=" * 60)
        logger.info("DOCUMENT INGEST WITH VDB UPLOAD")
        logger.info("=" * 60)
        logger.info(f"File path: {file_path}")
        logger.info(f"NV-Ingest: {config.nvingest_host}:{config.nvingest_port}")
        logger.info(f"Extraction: {', '.join(extraction_types)}")
        logger.info(f"Milvus URI: {config.milvus_uri}")
        logger.info(f"Collection name: {config.collection_name}")
        logger.info(f"Embedding URL: {config.embedding_url}")
        logger.info(f"Embedding model: {config.embedding_model}")
        logger.info(f"MinIO endpoint: {config.minio_endpoint}")
        logger.info("=" * 60)

        # Validate that the file exists
        if not os.path.exists(file_path):
            logger.error(f"File not found: {file_path}")
            return f"Error: File not found: {file_path}"

        try:
            # Create an ingestor bound to the client
            logger.info("Creating NV-Ingest ingestor...")
            ingestor = Ingestor(client=client)

            # Build pipeline with configurable extraction
            logger.info("Configuring pipeline: files -> extract -> embed -> vdb_upload")
            ingestor = ingestor.files(file_path)

            ingestor = ingestor.extract(
                extract_text=config.extract_text,
                extract_tables=config.extract_tables,
                extract_charts=config.extract_charts,
                extract_images=config.extract_images,
                text_depth=config.text_depth,
            )
            logger.info(f"  - Extract: {', '.join(extraction_types)}, depth={config.text_depth}")

            # Embed with the configured endpoint
            ingestor = ingestor.embed(
                endpoint_url=config.embedding_url,
                model_name=config.embedding_model
            )
            logger.info(f"  - Embed: {config.embedding_model}")

            # VDB upload with full configuration (following the ingestion_text_only pattern)
            logger.info(f"  - VDB Upload: collection='{config.collection_name}' at {config.milvus_uri}")
            ingestor = ingestor.vdb_upload(
                milvus_uri=config.milvus_uri,
                collection_name=config.collection_name,
                recreate=False,
                stream=False,
                purge_results_after_upload=False,
                threshold=5000,
                minio_endpoint=config.minio_endpoint,
                access_key=config.minio_access_key,
                secret_key=config.minio_secret_key
            )

            # Execute the pipeline in a worker thread
            logger.info("Executing ingestion pipeline...")
            results, failures = await asyncio.to_thread(
                lambda: ingestor.ingest(return_failures=True, show_progress=True)
            )

            # Count uploaded items
            total_chunks = 0
            if results:
                for doc_chunks in results:
                    if isinstance(doc_chunks, list):
                        total_chunks += len(doc_chunks)

            logger.info("=" * 60)
            logger.info("INGESTION COMPLETE")
            logger.info(f"Total chunks uploaded: {total_chunks}")
            logger.info(f"Target collection: {config.collection_name}")
            logger.info(f"Milvus URI: {config.milvus_uri}")
            logger.info("=" * 60)

            # Check for failures
            if failures:
                logger.warning(f"Some failures during ingestion: {failures}")
                return (
                    f"Ingested {file_path} with {len(failures)} failures. "
                    f"Uploaded {total_chunks} chunks to collection '{config.collection_name}' "
                    f"at {config.milvus_uri}"
                )

            return (
                f"Successfully ingested {file_path} and uploaded {total_chunks} "
                f"chunks to Milvus collection '{config.collection_name}' at {config.milvus_uri}"
            )

        except Exception as e:
            logger.error(f"Error in ingest and upload: {e}", exc_info=True)
            return f"Error processing document: {str(e)}"

    yield FunctionInfo.from_fn(
        _ingest_and_upload,
        description=(
            "Ingest a document using NV-Ingest, generate embeddings, and upload to Milvus VDB. "
            "Use this to add documents to the knowledge base for later retrieval."
        )
    )


# =============================================================================
# FUNCTION 3: Milvus Query Function (for RAG retrieval)
# =============================================================================

class MilvusQueryConfig(FunctionBaseConfig, name="milvus_semantic_search"):
    """Configuration for the Milvus semantic search function."""

    milvus_uri: str = Field(
        default="http://localhost:19530",
        description="URI of the Milvus vector database"
    )
    collection_name: str = Field(
        default="nv_ingest_collection",
        description="Name of the Milvus collection to search"
    )
    embedding_url: str = Field(
        default="http://localhost:8012/v1",
        description="Endpoint URL for the embedding model"
    )
    embedding_model: str = Field(
        default="nvidia/llama-3.2-nv-embedqa-1b-v2",
        description="Name of the embedding model"
    )
    top_k: int = Field(
        default=5,
        description="Number of top results to return"
    )


@register_function(config_type=MilvusQueryConfig)
async def milvus_semantic_search_function(config: MilvusQueryConfig, builder: Builder):
    """
    Perform semantic search on a Milvus vector database using embeddings.
    """

    async def _semantic_search(query: str) -> str:
        """
        Search the Milvus collection for documents similar to the query.

        Args:
            query: The search query text

        Returns:
            Retrieved document content relevant to the query
        """
        logger.info("=" * 60)
        logger.info("SEMANTIC SEARCH")
        logger.info("=" * 60)
        logger.info(f"Query: {query}")
        logger.info(f"Milvus URI: {config.milvus_uri}")
        logger.info(f"Collection: {config.collection_name}")
        logger.info(f"Top K: {config.top_k}")
        logger.info(f"Embedding model: {config.embedding_model}")
        logger.info("=" * 60)

        try:
            from llama_index.embeddings.nvidia import NVIDIAEmbedding
            from pymilvus import MilvusClient

            # Embed the query with the same model used at ingest time
            logger.info("Creating query embedding...")
            embed_model = NVIDIAEmbedding(
                base_url=config.embedding_url,
                model=config.embedding_model
            )

            # Get the query embedding in a worker thread
            query_embedding = await asyncio.to_thread(
                lambda: embed_model.get_query_embedding(query)
            )
            logger.info(f"Query embedding generated (dimension: {len(query_embedding)})")

            # Connect to Milvus and search
            logger.info(f"Connecting to Milvus at {config.milvus_uri}...")
            client = MilvusClient(uri=config.milvus_uri)

            # Check whether the collection exists
            collections = client.list_collections()
            logger.info(f"Available collections: {collections}")

            if config.collection_name not in collections:
                logger.warning(f"Collection '{config.collection_name}' does not exist!")
                return (
                    f"Collection '{config.collection_name}' does not exist in Milvus. "
                    f"Available collections: {collections}. "
                    "Please ingest documents first using document_ingest_vdb."
                )

            logger.info(f"Searching collection '{config.collection_name}'...")
            results = client.search(
                collection_name=config.collection_name,
                data=[query_embedding],
                limit=config.top_k,
                output_fields=["text"]
            )

            # Format results
            retrieved_texts = []
            for hits in results:
                for hit in hits:
                    text = hit.get("entity", {}).get("text", "")
                    if text:
                        retrieved_texts.append(text)

            logger.info(f"Found {len(retrieved_texts)} results")

            if retrieved_texts:
                return (
                    f"Found {len(retrieved_texts)} relevant documents from collection "
                    f"'{config.collection_name}':\n\n" + "\n\n---\n\n".join(retrieved_texts)
                )
            else:
                return (
                    f"No relevant documents found in collection '{config.collection_name}'. "
                    "The collection may be empty - try ingesting documents first using document_ingest_vdb."
                )

        except Exception as e:
            logger.error(f"Error in semantic search: {e}", exc_info=True)
            return f"Error performing search: {str(e)}"

    yield FunctionInfo.from_fn(
        _semantic_search,
        description=(
            "Search the knowledge base for documents relevant to the query. "
            "Returns the most relevant document content from Milvus VDB."
        )
    )
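The result-formatting step of `_semantic_search` can be sanity-checked without a running Milvus or NV-Ingest stack by exercising it on hand-built data. This is a minimal sketch: `fake_results` is invented for illustration and only mimics the list-of-hit-lists shape that `MilvusClient.search()` returns when called with `output_fields=["text"]`.

```python
def format_hits(results, collection_name):
    """Flatten Milvus-style hit lists into the response string used above."""
    retrieved_texts = []
    for hits in results:
        for hit in hits:
            # Each hit carries the requested output fields under "entity"
            text = hit.get("entity", {}).get("text", "")
            if text:
                retrieved_texts.append(text)
    if not retrieved_texts:
        return f"No relevant documents found in collection '{collection_name}'."
    return (
        f"Found {len(retrieved_texts)} relevant documents from collection "
        f"'{collection_name}':\n\n" + "\n\n---\n\n".join(retrieved_texts)
    )

# Hand-built stand-in for a MilvusClient.search() result (one query, two hits)
fake_results = [[
    {"entity": {"text": "Chunk about speaker drivers"}, "distance": 0.42},
    {"entity": {"text": "Chunk about animal tables"}, "distance": 0.57},
]]
print(format_hits(fake_results, "nv_ingest_collection"))
```

Driving the formatting through a pure function like this keeps the network-dependent search call and the string assembly separable, which makes the tool's output easy to unit test.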