Skip to content

bluewms/knowledge-catalog

 
 

🇺🇸 English | 🇨🇳 中文 | 🇨🇳 开发

Knowledge Catalog

Knowledge Catalog (formerly Dataplex), is an AI-powered data catalog and metadata management platform. It provides a dynamic knowledge graph of all your data, structured and unstructured, to provide semantics and business context to AI agents

This repository features tools, agents, and samples that demonstrate Knowledge Catalog features, and building context management, enrichment and retrieval solutions.

Getting Started

Open in Cloud Shell


✨ Community Extensions (Fork Features)

This fork extends the original reference-agent beyond BigQuery into a general-purpose OKF knowledge base generator — from local files, remote APIs, and with multi-LLM support. All changes are backward compatible; the original enrich command is untouched.

What's New

Feature Description
📁 Local file source Generate OKF bundles from 16 file formats: PDF, Word, Excel, PPT, Markdown, code, config, HTML, CSV
🌐 Remote API source Fetch files from URLs, API endpoints, or URL list files (--source api)
🤖 Multi-LLM support Gemini, Claude, OpenAI, DeepSeek, Qwen, Ollama — pick any via --model
🇨🇳 Chinese support Unicode filenames + automatic language matching (source is Chinese → output is Chinese)
localfile shortcut One-command workflow: reference-agent localfile /path --pattern "**/*.pdf"

Quick Start

# Install
cd okf
pip install --user -e ".[localfile]"
pip install litellm                          # multi-LLM support

# Generate a knowledge base from local PDFs (default: Gemini)
reference-agent localfile ~/Documents --pattern "**/*.pdf"

# Use DeepSeek (works in China, cost-effective)
export DEEPSEEK_API_KEY=xxx
reference-agent localfile ~/Documents --model deepseek/deepseek-chat

# Mix local files + remote URLs
reference-agent localfile ~/docs --api-url https://example.com/remote.pdf

# View supported LLM models
reference-agent list-models

# Generate interactive HTML graph
reference-agent visualize --bundle ./okf-bundle

Documentation


Contributing

See the contributing instructions to get started contributed.

License

All solutions within this repository are provided under the Apache 2.0 license. Please see LICENSE for more detailed terms and conditions.

Disclaimer

This repository and its contents are not an official Google product.

About

Google Cloud Knowledge Catalog Tools and Samples

Resources

License

Code of conduct

Contributing

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors

Languages

  • HTML 45.4%
  • Python 26.1%
  • TypeScript 24.9%
  • JavaScript 1.7%
  • Shell 1.1%
  • CSS 0.8%