toolglot

Define tools once. Use them with any model.

Quick Start · Why ToolGlot · Format Zoo · 30+ Models · LangGraph · Provider Cookbook · Contributing · Production Hardening · Security



You wrote 15 MCP tools. They work perfectly with GPT-4o. Then your boss says "make it work with Claude, Gemini, Llama, and that 2B model running on a Raspberry Pi."

Every model speaks a different tool dialect. OpenAI wants functions. Anthropic wants tools with input_schema. Gemini wants function_declarations. Cohere wants flat parameters. Ollama needs it in the chat template. That 2B model? It needs tools injected into the system prompt and responses parsed from freeform text.

ToolGlot translates between all of them.

Your Tools ──→ [ ToolGlot ] ──→ Any Model, Any Provider, Any Format
     ↑              │
 MCP / OpenAPI      └──→ OpenAI, Anthropic, Gemini, Mistral, Cohere,
 / JSON Schema           Bedrock, Ollama, vLLM, or plain system prompt

Quick Start

pip install toolglot

Translate tool definitions from the CLI:

# MCP → OpenAI format
toolglot translate --from mcp --to openai --input tools.json --output openai_tools.json

# OpenAPI spec → Anthropic format
toolglot translate --from openapi --to anthropic --input petstore.yaml

# Inspect what a model supports
toolglot capabilities --model gpt-4o

# Validate a tool definition
toolglot validate --format openai --input tools.json

# Compare two schemas and detect lossy changes
toolglot schema-diff --left-input a.json --left-format mcp --right-input b.json --right-format openai --output json

Scaffold a provider plugin:

toolglot plugin init --name acme --output-dir ./plugins

Automation-friendly CLI modes:

# Machine-readable output
toolglot capabilities --model gpt-4o --output json
toolglot validate --format mcp --input tools.json --output json
toolglot import-mcp --config ./mcp_config.json --output json

# Logging control for scripts
toolglot inspect --format mcp --input tools.json --quiet
toolglot translate --from mcp --to openai --input tools.json --verbose

Or in Python — define once, export everywhere:

from toolglot import ToolKit

# Load from any source
toolkit = ToolKit.from_mcp("./my_tools.json")

# Export to any target — one line each
openai_tools    = toolkit.to_openai()
anthropic_tools = toolkit.to_anthropic()
gemini_tools    = toolkit.to_gemini()
mistral_tools   = toolkit.to_mistral()
cohere_tools    = toolkit.to_cohere()
bedrock_tools   = toolkit.to_bedrock()
ollama_tools    = toolkit.to_ollama(model="llama3.2")
prompt_text     = toolkit.to_system_prompt()   # any model, zero native support needed

# Parse tool calls back from any provider
from toolglot import parse_tool_calls

calls = parse_tool_calls(response, provider="anthropic")
for call in calls:
    print(call.name, call.arguments)   # unified format, always

Why ToolGlot

The problem exists in pieces. Nobody assembled the solution.

| What You Need | Existing Solutions | The Problem |
| --- | --- | --- |
| Provider-agnostic tool definitions | LiteLLM | Full proxy. Couples you to their runtime. You wanted a library, not a service. |
| Tool format translation | LangChain tool abstraction | Locked into the LangChain ecosystem. Not standalone. |
| Multi-provider tool calling | Vercel AI SDK | TypeScript only. Frontend-focused. |
| Schema optimization for small models | Nothing | Nobody downgrades complex schemas for weaker models. |
| Tool calling for non-native models | Nothing | If the model doesn't support tools natively, you're on your own. |

ToolGlot is the missing primitive. A standalone Python library that translates tool definitions between formats, optimizes schemas per model, parses responses back to a unified format, and makes tool calling work on models that don't natively support it.

┌─────────────┐     ┌───────────┐     ┌───────────┐     ┌──────────┐
│   Import    │────▶│ Canonical │────▶│ Transform │────▶│  Export  │
│ MCP/OpenAPI │     │    IR     │     │ per model │     │ per prov │
└─────────────┘     └───────────┘     └───────────┘     └──────────┘
       │                  │                 │                  │
  Your tool defs    Pydantic models    Flatten, deref,    OpenAI, Claude,
  in any format     (the truth)        simplify, validate  Gemini, Ollama...

The Format Zoo

Here's the same tool — get_weather — in six different formats. This is what ToolGlot handles for you.

Your MCP tool definition (input):

{
  "name": "get_weather",
  "description": "Get current weather for a city",
  "inputSchema": {
    "type": "object",
    "properties": {
      "city": { "type": "string", "description": "City name" },
      "units": { "type": "string", "enum": ["celsius", "fahrenheit"], "default": "celsius" }
    },
    "required": ["city"]
  }
}

OpenAI format — wraps in type: "function", nests under function.parameters:

{
  "type": "function",
  "function": {
    "name": "get_weather",
    "description": "Get current weather for a city",
    "parameters": {
      "type": "object",
      "properties": {
        "city": { "type": "string", "description": "City name" },
        "units": { "type": "string", "enum": ["celsius", "fahrenheit"], "default": "celsius" }
      },
      "required": ["city"]
    }
  }
}

Anthropic format — uses input_schema instead of parameters:

{
  "name": "get_weather",
  "description": "Get current weather for a city",
  "input_schema": {
    "type": "object",
    "properties": {
      "city": { "type": "string", "description": "City name" },
      "units": { "type": "string", "enum": ["celsius", "fahrenheit"], "default": "celsius" }
    },
    "required": ["city"]
  }
}
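
The OpenAI and Anthropic renames above can be sketched in plain Python. This is a minimal illustration, not ToolGlot's implementation: `mcp_to_openai` and `mcp_to_anthropic` are hypothetical helper names, and the sketch assumes the MCP schema needs no other changes for these two targets.

```python
import copy

# Hypothetical helpers for illustration; these are not ToolGlot functions.
def mcp_to_openai(tool: dict) -> dict:
    """Wrap an MCP tool in OpenAI's {"type": "function", "function": {...}} envelope."""
    return {
        "type": "function",
        "function": {
            "name": tool["name"],
            "description": tool.get("description", ""),
            "parameters": copy.deepcopy(tool["inputSchema"]),
        },
    }


def mcp_to_anthropic(tool: dict) -> dict:
    """Rename MCP's inputSchema key to Anthropic's input_schema."""
    return {
        "name": tool["name"],
        "description": tool.get("description", ""),
        "input_schema": copy.deepcopy(tool["inputSchema"]),
    }


mcp_tool = {
    "name": "get_weather",
    "description": "Get current weather for a city",
    "inputSchema": {
        "type": "object",
        "properties": {
            "city": {"type": "string", "description": "City name"},
            "units": {"type": "string", "enum": ["celsius", "fahrenheit"], "default": "celsius"},
        },
        "required": ["city"],
    },
}

openai_tool = mcp_to_openai(mcp_tool)
anthropic_tool = mcp_to_anthropic(mcp_tool)
print(sorted(openai_tool["function"]))
print(sorted(anthropic_tool))
```

The schema itself is copied rather than shared, so later edits to one export do not leak into the other.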

Gemini format — uppercase types, no default, nested under function_declarations:

{
  "function_declarations": [{
    "name": "get_weather",
    "description": "Get current weather for a city",
    "parameters": {
      "type": "OBJECT",
      "properties": {
        "city": { "type": "STRING", "description": "City name" },
        "units": { "type": "STRING", "enum": ["celsius", "fahrenheit"] }
      },
      "required": ["city"]
    }
  }]
}
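
A rough sketch of the Gemini conversion, covering only the three differences named above (uppercase types, dropped `default`, `function_declarations` wrapper). The helper names are hypothetical, and a real exporter would likely handle more keywords than this.

```python
# Hypothetical helpers for illustration; these are not ToolGlot functions.
def to_gemini_schema(schema: dict) -> dict:
    """Uppercase type names and drop `default`, recursing into nested schemas."""
    out = {}
    for key, value in schema.items():
        if key == "default":
            continue  # the Gemini format shown above carries no defaults
        if key == "type" and isinstance(value, str):
            out[key] = value.upper()
        elif key == "properties" and isinstance(value, dict):
            # Keys here are parameter names, not schema keywords
            out[key] = {name: to_gemini_schema(sub) for name, sub in value.items()}
        elif key == "items" and isinstance(value, dict):
            out[key] = to_gemini_schema(value)
        else:
            out[key] = value
    return out


def mcp_to_gemini(tools: list[dict]) -> dict:
    """Collect all tools under a single function_declarations list."""
    return {
        "function_declarations": [
            {
                "name": tool["name"],
                "description": tool.get("description", ""),
                "parameters": to_gemini_schema(tool["inputSchema"]),
            }
            for tool in tools
        ]
    }


mcp_tool = {
    "name": "get_weather",
    "description": "Get current weather for a city",
    "inputSchema": {
        "type": "object",
        "properties": {
            "city": {"type": "string", "description": "City name"},
            "units": {"type": "string", "enum": ["celsius", "fahrenheit"], "default": "celsius"},
        },
        "required": ["city"],
    },
}

gemini_tools = mcp_to_gemini([mcp_tool])
print(gemini_tools["function_declarations"][0]["parameters"]["properties"]["units"])
```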

Cohere format — flat parameter_definitions, Python types, no nesting:

{
  "name": "get_weather",
  "description": "Get current weather for a city",
  "parameter_definitions": {
    "city": { "type": "str", "description": "City name", "required": true },
    "units": { "type": "str", "description": "Temperature units: celsius or fahrenheit. Default: celsius", "required": false }
  }
}
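
The Cohere shape can be approximated as below. This is a sketch under assumptions: `mcp_to_cohere` and the `PYTHON_TYPES` mapping are hypothetical, and the way enum values and defaults are folded into the description is one possible wording, not necessarily the text ToolGlot generates.

```python
# Assumed JSON Schema -> Python type-name mapping, for illustration only.
PYTHON_TYPES = {
    "string": "str",
    "integer": "int",
    "number": "float",
    "boolean": "bool",
    "array": "list",
    "object": "dict",
}


def mcp_to_cohere(tool: dict) -> dict:
    """Build flat parameter_definitions with per-parameter `required` flags."""
    schema = tool["inputSchema"]
    required = set(schema.get("required", []))
    definitions = {}
    for name, prop in schema.get("properties", {}).items():
        parts = [prop.get("description", "")]
        # The flat format has no enum/default slots, so keep them as prose
        if "enum" in prop:
            parts.append("One of: " + ", ".join(str(v) for v in prop["enum"]))
        if "default" in prop:
            parts.append(f"Default: {prop['default']}")
        definitions[name] = {
            "type": PYTHON_TYPES.get(prop.get("type"), "str"),
            "description": ". ".join(part for part in parts if part),
            "required": name in required,
        }
    return {
        "name": tool["name"],
        "description": tool.get("description", ""),
        "parameter_definitions": definitions,
    }


mcp_tool = {
    "name": "get_weather",
    "description": "Get current weather for a city",
    "inputSchema": {
        "type": "object",
        "properties": {
            "city": {"type": "string", "description": "City name"},
            "units": {"type": "string", "enum": ["celsius", "fahrenheit"], "default": "celsius"},
        },
        "required": ["city"],
    },
}

cohere_tool = mcp_to_cohere(mcp_tool)
print(cohere_tool["parameter_definitions"]["units"])
```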

System prompt fallback — for models with zero native tool support:

You have access to the following tools:

## get_weather
Get current weather for a city

Parameters:
- city (string, required): City name
- units (string, optional): Temperature units. One of: celsius, fahrenheit. Default: celsius

When you want to call a tool, respond with:
<tool_call>{"name": "get_weather", "arguments": {"city": "Tokyo"}}</tool_call>

Six formats. Same tool. ToolGlot handles all of this with one line of code.
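
For the system prompt fallback, the rendering step might look roughly like this. `render_system_prompt` is a hypothetical name, and the layout simply mirrors the example above rather than ToolGlot's actual template.

```python
# Hypothetical renderer for illustration; not ToolGlot's template.
def render_system_prompt(tools: list[dict]) -> str:
    """Describe each tool in plain text and explain the <tool_call> reply format."""
    lines = ["You have access to the following tools:", ""]
    for tool in tools:
        schema = tool["inputSchema"]
        required = set(schema.get("required", []))
        lines += [f"## {tool['name']}", tool.get("description", ""), "", "Parameters:"]
        for name, prop in schema.get("properties", {}).items():
            flag = "required" if name in required else "optional"
            parts = [prop.get("description", "")]
            if "enum" in prop:
                parts.append("One of: " + ", ".join(str(v) for v in prop["enum"]))
            if "default" in prop:
                parts.append(f"Default: {prop['default']}")
            detail = ". ".join(part for part in parts if part)
            lines.append(f"- {name} ({prop.get('type', 'any')}, {flag}): {detail}")
        lines.append("")
    lines += [
        "When you want to call a tool, respond with:",
        '<tool_call>{"name": "tool_name", "arguments": {"param": "value"}}</tool_call>',
    ]
    return "\n".join(lines)


mcp_tool = {
    "name": "get_weather",
    "description": "Get current weather for a city",
    "inputSchema": {
        "type": "object",
        "properties": {
            "city": {"type": "string", "description": "City name"},
            "units": {"type": "string", "enum": ["celsius", "fahrenheit"], "default": "celsius"},
        },
        "required": ["city"],
    },
}

prompt = render_system_prompt([mcp_tool])
print(prompt)
```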


Features

Import From Anywhere

from toolglot import ToolKit

# MCP tool definitions (JSON)
toolkit = ToolKit.from_mcp("./mcp_tools.json")

# MCP server config (connects and lists tools)
toolkit = ToolKit.from_mcp_server("./mcp_server_config.json")

# OpenAPI / Swagger spec
toolkit = ToolKit.from_openapi("./petstore.yaml")

# OpenAI function format
toolkit = ToolKit.from_openai([{"type": "function", "function": {...}}])

# Raw JSON Schema
toolkit = ToolKit.from_json_schema({"get_weather": {...}})

# LangChain tools (requires toolglot[langchain])
from langchain_core.tools import tool

@tool
def get_weather(city: str) -> str:
    """Get weather for a city."""
    ...

toolkit = ToolKit.from_langchain([get_weather])

Supported input formats:

| Source | Method | Notes |
| --- | --- | --- |
| MCP tool definitions | ToolKit.from_mcp("tools.json") | JSON file with MCP tool array |
| MCP server config | ToolKit.from_mcp_server("config.json") | Connects to server, lists tools |
| OpenAPI 3.x spec | ToolKit.from_openapi("spec.yaml") | Extracts operations as tools |
| OpenAI function format | ToolKit.from_openai(tools_list) | List of {"type": "function", ...} |
| JSON Schema | ToolKit.from_json_schema(schemas) | Dict of name → schema |
| LangChain tools | ToolKit.from_langchain(tools) | List of BaseTool |

Export To Any Model

toolkit = ToolKit.from_mcp("./tools.json")

# Cloud providers
openai_tools    = toolkit.to_openai()          # GPT-4o, GPT-4.1, o1, o3-mini
anthropic_tools = toolkit.to_anthropic()        # Claude Sonnet 4, Claude 3.5 Haiku
gemini_tools    = toolkit.to_gemini()           # Gemini 2.5 Pro, Gemini 2.0 Flash
mistral_tools   = toolkit.to_mistral()          # Mistral Large, Codestral
cohere_tools    = toolkit.to_cohere()           # Command R+, Command A

# Cloud platforms
bedrock_tools   = toolkit.to_bedrock()          # Any model on AWS Bedrock
# Azure OpenAI and Vertex AI use the same format as OpenAI/Gemini respectively

# Local / self-hosted
ollama_tools    = toolkit.to_ollama(model="llama3.2")   # Chat-template-aware
vllm_tools      = toolkit.to_vllm()                      # Guided generation format

# Universal fallback
prompt_text     = toolkit.to_system_prompt()    # Works with ANY model

Parse Responses Back

Every provider returns tool calls differently. ToolGlot normalizes them.

from toolglot import parse_tool_calls, ToolCall

# OpenAI: tool_calls[].function.arguments (JSON string)
calls = parse_tool_calls(openai_response, provider="openai")

# Anthropic: content[].type=="tool_use", input (dict)
calls = parse_tool_calls(claude_response, provider="anthropic")

# Gemini: candidates[].content.parts[].function_call
calls = parse_tool_calls(gemini_response, provider="gemini")

# Mistral: tool_calls[].function (similar to OpenAI, subtly different)
calls = parse_tool_calls(mistral_response, provider="mistral")

# Cohere: tool_calls[].name + parameters
calls = parse_tool_calls(cohere_response, provider="cohere")

# Freeform text (for system prompt fallback)
calls = parse_tool_calls(raw_text, provider="freeform")

# Every call is the same type regardless of source
for call in calls:
    assert isinstance(call, ToolCall)
    print(call.id, call.name, call.arguments)
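
To make the normalization concrete, here is a small sketch for two of the shapes listed above: OpenAI (arguments as a JSON string) and Anthropic (input as a dict). It assumes the responses have already been converted to plain dicts; `Call`, `from_openai`, and `from_anthropic` are hypothetical stand-ins, not ToolGlot's `ToolCall` or parsers.

```python
import json
from dataclasses import dataclass


@dataclass
class Call:
    """Hypothetical unified record, standing in for a ToolCall-like type."""
    id: str
    name: str
    arguments: dict


def from_openai(response: dict) -> list[Call]:
    """OpenAI: choices[].message.tool_calls[].function.arguments is a JSON string."""
    calls = []
    for choice in response.get("choices", []):
        for item in choice.get("message", {}).get("tool_calls") or []:
            fn = item["function"]
            calls.append(Call(item["id"], fn["name"], json.loads(fn["arguments"])))
    return calls


def from_anthropic(response: dict) -> list[Call]:
    """Anthropic: content[] blocks with type == "tool_use" carry `input` as a dict."""
    return [
        Call(block["id"], block["name"], block["input"])
        for block in response.get("content", [])
        if block.get("type") == "tool_use"
    ]


openai_payload = {
    "choices": [{"message": {"tool_calls": [{
        "id": "call_1",
        "type": "function",
        "function": {"name": "get_weather", "arguments": '{"city": "Tokyo"}'},
    }]}}]
}
anthropic_payload = {
    "content": [
        {"type": "text", "text": "Let me check."},
        {"type": "tool_use", "id": "toolu_1", "name": "get_weather", "input": {"city": "Tokyo"}},
    ]
}

openai_calls = from_openai(openai_payload)
anthropic_calls = from_anthropic(anthropic_payload)
print(openai_calls, anthropic_calls)
```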

For streaming chunks/events:

from toolglot import parse_stream_tool_calls

calls = parse_stream_tool_calls(openai_stream_chunks, provider="openai")
calls = parse_stream_tool_calls(anthropic_stream_events, provider="anthropic")
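
Streaming is harder than it looks because OpenAI-style chunks deliver the arguments as string fragments keyed by an index. A minimal accumulator, assuming the chunks are plain dicts in the chat completions delta shape, could look like this (`accumulate_openai_stream` is a hypothetical name, not the ToolGlot function):

```python
import json


def accumulate_openai_stream(chunks: list[dict]) -> list[dict]:
    """Stitch per-index argument fragments back into complete tool calls."""
    partial: dict[int, dict] = {}
    for chunk in chunks:
        for choice in chunk.get("choices", []):
            for delta in choice.get("delta", {}).get("tool_calls") or []:
                slot = partial.setdefault(
                    delta["index"], {"id": None, "name": "", "arguments": ""}
                )
                if delta.get("id"):
                    slot["id"] = delta["id"]
                fn = delta.get("function") or {}
                if fn.get("name"):
                    slot["name"] += fn["name"]
                slot["arguments"] += fn.get("arguments") or ""
    # Only parse the JSON once every fragment has arrived
    return [
        {"id": s["id"], "name": s["name"], "arguments": json.loads(s["arguments"] or "{}")}
        for _, s in sorted(partial.items())
    ]


stream_chunks = [
    {"choices": [{"delta": {"tool_calls": [
        {"index": 0, "id": "call_1", "function": {"name": "get_weather", "arguments": ""}}
    ]}}]},
    {"choices": [{"delta": {"tool_calls": [
        {"index": 0, "function": {"arguments": '{"city": '}}
    ]}}]},
    {"choices": [{"delta": {"tool_calls": [
        {"index": 0, "function": {"arguments": '"Tokyo"}'}}
    ]}}]},
    {"choices": [{"delta": {}}]},
]

streamed_calls = accumulate_openai_stream(stream_chunks)
print(streamed_calls)
```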

Schema Transforms

Not all models handle complex schemas. ToolGlot downgrades schemas to fit what the target model supports.

from toolglot.transforms import flatten, simplify, deref, validate

# Flatten nested objects (required for Cohere)
# {passengers: {adults: int, children: int}} → {passengers_adults: int, passengers_children: int}
flat_toolkit = flatten(toolkit)

# Resolve $ref pointers (required for Gemini)
resolved_toolkit = deref(toolkit)

# Simplify for small models (remove anyOf, simplify enums, add descriptions)
simple_toolkit = simplify(toolkit, target_model="phi4-mini")

# Validate a tool call against the schema
result = validate(tool_call, toolkit)
if not result.valid:
    print(result.errors)
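
The flatten transform can be sketched on a bare JSON Schema. This is an illustration of the idea, not ToolGlot's code: `flatten_schema` is a hypothetical name, the underscore separator follows the example in the comment above, and the rule that a nested field stays required only when its parent is also required is an assumption.

```python
def flatten_properties(properties: dict, required: list, prefix: str = "") -> tuple[dict, list]:
    """Recursively lift nested object properties to prefixed top-level keys."""
    flat, flat_required = {}, []
    for name, prop in properties.items():
        key = f"{prefix}{name}"
        if prop.get("type") == "object" and "properties" in prop:
            # Assumption: children of an optional object are all optional
            child_required = prop.get("required", []) if name in required else []
            sub_flat, sub_required = flatten_properties(
                prop["properties"], child_required, prefix=key + "_"
            )
            flat.update(sub_flat)
            flat_required += sub_required
        else:
            flat[key] = prop
            if name in required:
                flat_required.append(key)
    return flat, flat_required


def flatten_schema(schema: dict) -> dict:
    props, required = flatten_properties(
        schema.get("properties", {}), schema.get("required", [])
    )
    return {"type": "object", "properties": props, "required": required}


nested_schema = {
    "type": "object",
    "properties": {
        "origin": {"type": "string"},
        "passengers": {
            "type": "object",
            "properties": {
                "adults": {"type": "integer"},
                "children": {"type": "integer"},
            },
            "required": ["adults"],
        },
    },
    "required": ["origin", "passengers"],
}

flat_schema = flatten_schema(nested_schema)
print(list(flat_schema["properties"]))
```

Key collisions (a top-level `passengers_adults` next to a nested `passengers.adults`) are not handled here and would need a deliberate policy.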

When transforms are applied automatically:

| Transform | Auto-applied For | Why |
| --- | --- | --- |
| flatten | Cohere | Only supports flat parameter_definitions |
| deref | Gemini | No $ref support |
| simplify | Edge models via mode="simplified" | Complex schemas confuse small models |
| All three | mode="auto" | ToolGlot picks based on capability matrix |
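
Resolving `$ref` pointers is the least obvious of the three transforms, so here is a minimal sketch. It assumes only local references (`#/...`), discards any keys that sit next to a `$ref`, and raises on circular references; `inline_refs` is a hypothetical name rather than ToolGlot's `deref`.

```python
def resolve_pointer(root: dict, ref: str):
    """Follow a local JSON Pointer such as '#/$defs/address'."""
    if not ref.startswith("#/"):
        raise ValueError(f"only local refs are supported in this sketch: {ref}")
    node = root
    for part in ref[2:].split("/"):
        # Undo JSON Pointer escaping (~1 is '/', ~0 is '~')
        node = node[part.replace("~1", "/").replace("~0", "~")]
    return node


def _inline(node, root: dict, seen: tuple = ()):
    if isinstance(node, dict):
        if "$ref" in node:
            ref = node["$ref"]
            if ref in seen:
                raise ValueError(f"circular $ref: {ref}")
            return _inline(resolve_pointer(root, ref), root, seen + (ref,))
        return {key: _inline(value, root, seen) for key, value in node.items()}
    if isinstance(node, list):
        return [_inline(item, root, seen) for item in node]
    return node


def inline_refs(schema: dict) -> dict:
    """Return a copy with every local $ref replaced by its target."""
    body = {k: v for k, v in schema.items() if k not in ("$defs", "definitions")}
    return _inline(body, schema)


schema_with_refs = {
    "type": "object",
    "properties": {
        "home": {"$ref": "#/$defs/address"},
        "work": {"$ref": "#/$defs/address"},
    },
    "$defs": {
        "address": {"type": "object", "properties": {"city": {"type": "string"}}}
    },
}

resolved_schema = inline_refs(schema_with_refs)
print(resolved_schema["properties"]["home"])
```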

Capability Matrix

ToolGlot ships a capability matrix covering 30+ models. Query it programmatically or from the CLI.

from toolglot import capabilities

caps = capabilities("gpt-4o")
print(caps.native_tools)      # True
print(caps.parallel_calls)    # True
print(caps.streaming)         # True
print(caps.strict_mode)       # True
print(caps.max_tools)         # 128

From the CLI:

toolglot capabilities --model claude-sonnet-4
# native_tools: true
# parallel_calls: true
# streaming: true
# strict_mode: false
# max_tools: 500+
# recommended_mode: native

Full matrix (30+ models):

| Model | Provider | Native Tools | Parallel | Streaming | Strict | Max Tools | Recommended Mode |
| --- | --- | --- | --- | --- | --- | --- | --- |
| GPT-4o | OpenAI | Yes | Yes | Yes | Yes | 128 | native |
| GPT-4o-mini | OpenAI | Yes | Yes | Yes | Yes | 128 | native |
| GPT-4.1 | OpenAI | Yes | Yes | Yes | Yes | 128 | native |
| GPT-4.1-mini | OpenAI | Yes | Yes | Yes | Yes | 128 | native |
| GPT-4.1-nano | OpenAI | Yes | Yes | Yes | Yes | 128 | native |
| o1 | OpenAI | Yes | Yes | No | Yes | 128 | native |
| o3-mini | OpenAI | Yes | Yes | No | Yes | 128 | native |
| o4-mini | OpenAI | Yes | Yes | No | Yes | 128 | native |
| Claude Sonnet 4 | Anthropic | Yes | Yes | Yes | No | 500+ | native |
| Claude 3.5 Haiku | Anthropic | Yes | Yes | Yes | No | 500+ | native |
| Gemini 2.5 Pro | Google | Yes | Yes | Yes | No | 128 | native |
| Gemini 2.0 Flash | Google | Yes | Yes | Yes | No | 128 | native |
| Mistral Large | Mistral | Yes | Yes | Yes | No | 64 | native |
| Mistral Small | Mistral | Yes | Yes | Yes | No | 64 | native |
| Codestral | Mistral | Yes | Yes | Yes | No | 64 | native |
| Command R+ | Cohere | Yes | No | No | No | 40 | native (flat) |
| Command A | Cohere | Yes | No | No | No | 40 | native (flat) |
| DeepSeek V3 | DeepSeek | Yes | Yes | Yes | No | 128 | native |
| DeepSeek R1 | DeepSeek | No | No | No | No | -- | system_prompt |
| Grok 3 | xAI | Yes | Yes | Yes | No | 128 | native |
| Grok 3 Mini | xAI | Yes | Yes | Yes | No | 128 | native |
| Llama 4 Maverick | Meta | Yes | Yes | Yes | No | 64 | native |
| Llama 4 Scout | Meta | Yes | Yes | Yes | No | 64 | native |
| Llama 3.3 70B | Groq / Together | Yes | Yes | Yes | No | 64 | native |
| Llama 3.2 3B | Ollama | Template | No | No | No | ~10 | simplified |
| Llama 3.2 1B | Ollama | Template | No | No | No | ~5 | system_prompt |
| Phi-4-mini | Ollama | Template | No | No | No | ~10 | simplified |
| Qwen 2.5 7B | Ollama | Template | No | No | No | ~15 | native |
| Qwen 2.5 3B | Ollama | Template | No | No | No | ~10 | simplified |
| Mistral 7B | Ollama | Template | No | No | No | ~10 | native |
| Gemma 2 9B | Ollama | No | No | No | No | -- | system_prompt |
| Gemma 2 2B | Ollama | No | No | No | No | -- | system_prompt |
| DeepSeek R1 8B | Ollama | No | No | No | No | -- | system_prompt |
| QwQ 32B | Ollama | No | No | No | No | -- | system_prompt |
| SmolLM2 1.7B | Ollama | No | No | No | No | -- | system_prompt |

Key:

  • Yes: Model has a built-in tool calling API
  • Template: Tools injected via chat template (Ollama/vLLM)
  • No: No native support — use system_prompt mode
  • Max Tools: Approximate practical limit before quality degrades
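
One way to picture how a matrix like this drives mode selection is the sketch below. The three entries copy rows from the table above, but the `Caps` dataclass, the `MATRIX` dict, and `pick_mode` are hypothetical names, and the choice to fall back to `system_prompt` for unknown models is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Caps:
    native_tools: str          # "yes", "template", or "no" (see the key above)
    max_tools: Optional[int]   # None where the table shows "--"
    recommended_mode: str


# Three rows taken from the table; the approximate limit "~10" is stored as 10.
MATRIX = {
    "gpt-4o": Caps("yes", 128, "native"),
    "phi4-mini": Caps("template", 10, "simplified"),
    "gemma2:2b": Caps("no", None, "system_prompt"),
}


def pick_mode(model: str, tool_count: int) -> tuple[str, list[str]]:
    """Return the recommended mode plus any warnings about the tool count."""
    caps = MATRIX.get(model)
    if caps is None:
        # Assumption: the universal fallback is the safest default
        return "system_prompt", [f"unknown model {model!r}; using universal fallback"]
    warnings = []
    if caps.max_tools is not None and tool_count > caps.max_tools:
        warnings.append(
            f"{tool_count} tools exceeds the practical limit of {caps.max_tools}"
        )
    return caps.recommended_mode, warnings


for model_name in ("gpt-4o", "phi4-mini", "gemma2:2b", "mystery-model"):
    print(model_name, pick_mode(model_name, tool_count=15))
```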

LangGraph Integration

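This is where ToolGlot earns its keep. An agent built with create_react_agent can break when you switch models, because providers accept different schema features (Cohere rejects nested schemas, Gemini rejects $ref). ToolGlot adapts the tools to each model.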

pip install "toolglot[langchain]"

Same Agent, Any Model

from langgraph.prebuilt import create_react_agent
from langchain_openai import ChatOpenAI
from langchain_anthropic import ChatAnthropic
from langchain_google_genai import ChatGoogleGenerativeAI
from langchain_mistralai import ChatMistralAI
from langchain_cohere import ChatCohere
from toolglot import ToolKit
from toolglot.integrations.langchain import adapt

toolkit = ToolKit.from_mcp("./my_tools.json")
tools = toolkit.to_langchain()

# adapt() wraps the model with format-aware, capability-aware tool binding
# It detects the provider, picks the right exporter, applies schema transforms

models = {
    "gpt-4o":        adapt(ChatOpenAI(model="gpt-4o"), toolkit),
    "gpt-4.1-nano":  adapt(ChatOpenAI(model="gpt-4.1-nano"), toolkit),
    "claude-sonnet":  adapt(ChatAnthropic(model="claude-sonnet-4-20250514"), toolkit),
    "gemini-flash":  adapt(ChatGoogleGenerativeAI(model="gemini-2.0-flash"), toolkit),
    "mistral-large": adapt(ChatMistralAI(model="mistral-large-latest"), toolkit),
    "command-r+":    adapt(ChatCohere(model="command-r-plus"), toolkit),
    #                 ↑ ToolGlot auto-flattens nested schemas for Cohere
}

for name, model in models.items():
    agent = create_react_agent(model, tools)
    result = agent.invoke({"messages": [("user", "Search flights from NYC to London")]})
    print(f"{name}: {result['messages'][-1].content}")

Without ToolGlot, the Cohere agent crashes on nested schemas. The Gemini agent fails on $ref. With ToolGlot, they all just work.

Local Models via Ollama

from langgraph.prebuilt import create_react_agent
from langchain_ollama import ChatOllama
from toolglot import ToolKit
from toolglot.integrations.langchain import adapt

toolkit = ToolKit.from_mcp("./my_tools.json")
tools = toolkit.to_langchain()

# Native mode — models with solid chat template tool support
llama = adapt(ChatOllama(model="llama3.2:3b"), toolkit, mode="native")
agent = create_react_agent(llama, tools)

# Simplified mode — strip schema complexity for smaller models
phi = adapt(ChatOllama(model="phi4-mini"), toolkit, mode="simplified")
agent = create_react_agent(phi, tools)

# System prompt mode — any model, zero native support needed
gemma = adapt(ChatOllama(model="gemma2:2b"), toolkit, mode="system_prompt")
agent = create_react_agent(gemma, tools)

# Auto mode — ToolGlot checks capability matrix and picks the best strategy
qwen = adapt(ChatOllama(model="qwen2.5:7b"), toolkit, mode="auto")
agent = create_react_agent(qwen, tools)

Reasoning Models

o1, o3-mini, DeepSeek R1, QwQ: these models reason deeply, but not all of them support native tool calling. o1 and o3-mini do; DeepSeek R1 and QwQ do not, so on their own they can't be used with create_react_agent. With ToolGlot, they can.

from langgraph.prebuilt import create_react_agent
from langchain_openai import ChatOpenAI
from langchain_ollama import ChatOllama
from toolglot import ToolKit
from toolglot.integrations.langchain import adapt

toolkit = ToolKit.from_mcp("./my_tools.json")
tools = toolkit.to_langchain()

# o3-mini — native tool calling, but simplify schemas for better results
o3 = adapt(ChatOpenAI(model="o3-mini"), toolkit, mode="auto")
agent = create_react_agent(o3, tools)

# DeepSeek R1 — no native tools, but ToolGlot injects into system prompt
# and parses <tool_call> blocks from R1's chain-of-thought output
r1 = adapt(ChatOllama(model="deepseek-r1:8b"), toolkit, mode="auto")
agent = create_react_agent(r1, tools)
# R1's deep reasoning makes it surprisingly good at multi-step tool planning.
# It just couldn't express tool calls before. Now it can.

# QwQ — same approach, reasoning model with no native tool support
qwq = adapt(ChatOllama(model="qwq:32b"), toolkit, mode="auto")
agent = create_react_agent(qwq, tools)

Multi-Provider Fallback

The crown jewel — an agent that tries the cheapest model first and falls back:

from langgraph.prebuilt import create_react_agent
from langchain_openai import ChatOpenAI
from langchain_anthropic import ChatAnthropic
from langchain_ollama import ChatOllama
from toolglot import ToolKit
from toolglot.integrations.langchain import adapt

toolkit = ToolKit.from_mcp("./my_tools.json")
tools = toolkit.to_langchain()

tiers = [
    ("local-phi4",    adapt(ChatOllama(model="phi4-mini"), toolkit, mode="auto")),
    ("gpt-4.1-nano",  adapt(ChatOpenAI(model="gpt-4.1-nano"), toolkit, mode="auto")),
    ("gpt-4o",        adapt(ChatOpenAI(model="gpt-4o"), toolkit, mode="auto")),
    ("claude-sonnet", adapt(ChatAnthropic(model="claude-sonnet-4-20250514"), toolkit, mode="auto")),
]

def call_with_fallback(messages: list):
    for name, model in tiers:
        try:
            agent = create_react_agent(model, tools)
            result = agent.invoke({"messages": messages})
            print(f"Resolved by: {name}")
            return result
        except Exception as exc:
            # Report why this tier failed before trying the next one
            print(f"{name} failed: {exc!r}")
            continue
    raise RuntimeError("All tiers exhausted")

result = call_with_fallback([("user", "Book me a flight to Tokyo next Friday")])

Provider Cookbook

Detailed examples for every major provider. All examples use the same three tools:

from toolglot import ToolKit, parse_tool_calls

toolkit = ToolKit.from_mcp("./travel_tools.json")
# Tools: get_weather, search_flights, book_hotel
# search_flights has nested params: passengers: {adults: int, children: int}
# book_hotel has array params: preferences.amenities: list[str]

OpenAI (GPT-4o, GPT-4.1, GPT-4.1-mini, GPT-4.1-nano, o1, o3-mini, o4-mini)

from openai import OpenAI

client = OpenAI()
tools = toolkit.to_openai()

# Works with every OpenAI model
for model in ["gpt-4o", "gpt-4.1", "gpt-4.1-mini", "gpt-4.1-nano", "o3-mini"]:
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": "Weather in Tokyo?"}],
        tools=tools,
    )
    calls = parse_tool_calls(response, provider="openai")
    print(f"{model}: {calls[0].name}({calls[0].arguments})")

# Strict mode (OpenAI-specific: guarantees JSON Schema compliance)
tools_strict = toolkit.to_openai(strict=True)
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Search flights NYC to London, 2 adults"}],
    tools=tools_strict,
)

Anthropic (Claude Sonnet 4, Claude 3.5 Haiku)

from anthropic import Anthropic

client = Anthropic()
tools = toolkit.to_anthropic()

for model in ["claude-sonnet-4-20250514", "claude-3-5-haiku-20241022"]:
    response = client.messages.create(
        model=model,
        max_tokens=1024,
        tools=tools,
        messages=[{"role": "user", "content": "Weather in Tokyo?"}],
    )
    calls = parse_tool_calls(response, provider="anthropic")
    print(f"{model}: {calls[0].name}({calls[0].arguments})")

Google Gemini (Gemini 2.5 Pro, Gemini 2.0 Flash)

from google import genai

client = genai.Client()
tools = toolkit.to_gemini()

for model in ["gemini-2.5-pro-preview-05-06", "gemini-2.0-flash"]:
    response = client.models.generate_content(
        model=model,
        contents="Weather in Tokyo?",
        config=genai.types.GenerateContentConfig(tools=tools),
    )
    calls = parse_tool_calls(response, provider="gemini")
    print(f"{model}: {calls[0].name}({calls[0].arguments})")

Mistral (Mistral Large, Mistral Small, Codestral)

from mistralai import Mistral

client = Mistral()
tools = toolkit.to_mistral()

for model in ["mistral-large-latest", "mistral-small-latest", "codestral-latest"]:
    response = client.chat.complete(
        model=model,
        messages=[{"role": "user", "content": "Weather in Tokyo?"}],
        tools=tools,
    )
    calls = parse_tool_calls(response, provider="mistral")
    print(f"{model}: {calls[0].name}({calls[0].arguments})")

Cohere (Command R+, Command A)

import cohere

client = cohere.ClientV2()
tools = toolkit.to_cohere()
# ToolGlot auto-flattened nested params:
# passengers: {adults, children} → passengers_adults, passengers_children

for model in ["command-r-plus", "command-a-03-2025"]:
    response = client.chat(
        model=model,
        messages=[{"role": "user", "content": "Search flights NYC to London, 2 adults 1 child"}],
        tools=tools,
    )
    calls = parse_tool_calls(response, provider="cohere")
    # calls[0].arguments has passengers_adults=2, passengers_children=1
    # Use toolkit.unflatten(calls) to restore nested structure if needed
    print(f"{model}: {calls[0].name}({calls[0].arguments})")
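
Restoring the nested structure is safest when guided by the original schema, since a bare underscore split cannot tell `passengers_adults` from a parameter that simply has an underscore in its name. The sketch below shows the idea; `unflatten_arguments` is a hypothetical helper, not the `toolkit.unflatten` method mentioned above.

```python
def unflatten_arguments(flat_args: dict, schema: dict, prefix: str = "") -> dict:
    """Rebuild nested arguments by walking the original (nested) schema."""
    nested = {}
    for name, prop in schema.get("properties", {}).items():
        key = f"{prefix}{name}"
        if prop.get("type") == "object" and "properties" in prop:
            child = unflatten_arguments(flat_args, prop, prefix=key + "_")
            if child:  # omit objects for which no field was supplied
                nested[name] = child
        elif key in flat_args:
            nested[name] = flat_args[key]
    return nested


search_flights_schema = {
    "type": "object",
    "properties": {
        "origin": {"type": "string"},
        "destination": {"type": "string"},
        "passengers": {
            "type": "object",
            "properties": {
                "adults": {"type": "integer"},
                "children": {"type": "integer"},
            },
        },
    },
}

flat_call_args = {
    "origin": "NYC",
    "destination": "London",
    "passengers_adults": 2,
    "passengers_children": 1,
}

nested_args = unflatten_arguments(flat_call_args, search_flights_schema)
print(nested_args)
```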

DeepSeek (V3 — native, R1 — system prompt)

from openai import OpenAI

# DeepSeek V3 — native tool calling (OpenAI-compatible API)
client = OpenAI(base_url="https://api.deepseek.com", api_key="...")
tools = toolkit.to_openai()

response = client.chat.completions.create(
    model="deepseek-chat",
    messages=[{"role": "user", "content": "Weather in Tokyo?"}],
    tools=tools,
)
calls = parse_tool_calls(response, provider="openai")

# DeepSeek R1 — no native tools, use system prompt injection
prompt, instructions = toolkit.to_system_prompt(return_instructions=True)
response = client.chat.completions.create(
    model="deepseek-reasoner",
    messages=[
        {"role": "system", "content": prompt},
        {"role": "user", "content": "Weather in Tokyo?"},
    ],
)
calls = parse_tool_calls(response.choices[0].message.content, provider="freeform")
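
Freeform parsing boils down to finding `<tool_call>` blocks in otherwise unstructured text and ignoring anything that is not valid JSON. A minimal sketch follows; `parse_freeform` is a hypothetical name, and the real parser is described elsewhere in this README as using regex plus heuristics, which this does not try to reproduce.

```python
import json
import re

# Non-greedy match so several blocks in one reply are found separately
TOOL_CALL_RE = re.compile(r"<tool_call>(.*?)</tool_call>", re.DOTALL)


def parse_freeform(text: str) -> list[dict]:
    """Extract {"name", "arguments"} dicts from <tool_call> blocks in raw text."""
    calls = []
    for match in TOOL_CALL_RE.finditer(text):
        try:
            payload = json.loads(match.group(1).strip())
        except json.JSONDecodeError:
            continue  # skip malformed blocks instead of failing the whole reply
        if isinstance(payload, dict) and isinstance(payload.get("name"), str):
            arguments = payload.get("arguments", {})
            calls.append({
                "name": payload["name"],
                "arguments": arguments if isinstance(arguments, dict) else {},
            })
    return calls


model_reply = (
    "I should look up the weather before answering.\n"
    '<tool_call>{"name": "get_weather", "arguments": {"city": "Tokyo"}}</tool_call>\n'
    "<tool_call>this block is not JSON</tool_call>"
)

freeform_calls = parse_freeform(model_reply)
print(freeform_calls)
```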

xAI (Grok 3, Grok 3 Mini)

from openai import OpenAI

client = OpenAI(base_url="https://api.x.ai/v1", api_key="...")
tools = toolkit.to_openai()

for model in ["grok-3", "grok-3-mini"]:
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": "Weather in Tokyo?"}],
        tools=tools,
    )
    calls = parse_tool_calls(response, provider="openai")
    print(f"{model}: {calls[0].name}({calls[0].arguments})")

AWS Bedrock (Claude, Llama, Mistral on Bedrock)

import boto3

client = boto3.client("bedrock-runtime")
tools = toolkit.to_bedrock()

for model_id in [
    "anthropic.claude-sonnet-4-20250514-v1:0",
    "meta.llama3-3-70b-instruct-v1:0",
    "mistral.mistral-large-2407-v1:0",
]:
    response = client.converse(
        modelId=model_id,
        messages=[{"role": "user", "content": [{"text": "Weather in Tokyo?"}]}],
        toolConfig={"tools": tools},
    )
    calls = parse_tool_calls(response, provider="bedrock")
    print(f"{model_id}: {calls[0].name}({calls[0].arguments})")
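
For reference, the Bedrock Converse API expects each tool wrapped in a `toolSpec` with the schema under `inputSchema.json`. A plain-Python sketch of that wrapping is shown here; `mcp_to_bedrock` is a hypothetical helper and may differ from what `to_bedrock()` actually emits.

```python
# Hypothetical helper for illustration; not a ToolGlot function.
def mcp_to_bedrock(tool: dict) -> dict:
    """Wrap an MCP tool in the Converse API's toolSpec envelope."""
    return {
        "toolSpec": {
            "name": tool["name"],
            "description": tool.get("description", ""),
            "inputSchema": {"json": tool["inputSchema"]},
        }
    }


mcp_tool = {
    "name": "get_weather",
    "description": "Get current weather for a city",
    "inputSchema": {
        "type": "object",
        "properties": {"city": {"type": "string", "description": "City name"}},
        "required": ["city"],
    },
}

# Same shape as the toolConfig argument passed to client.converse(...) above
tool_config = {"tools": [mcp_to_bedrock(mcp_tool)]}
print(tool_config["tools"][0]["toolSpec"]["name"])
```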

Azure OpenAI (GPT-4o on Azure)

from openai import AzureOpenAI

client = AzureOpenAI(
    azure_endpoint="https://your-resource.openai.azure.com/",
    api_version="2025-01-01-preview",
)
tools = toolkit.to_openai()

response = client.chat.completions.create(
    model="gpt-4o",   # your deployment name
    messages=[{"role": "user", "content": "Weather in Tokyo?"}],
    tools=tools,
)
calls = parse_tool_calls(response, provider="openai")

Together AI / Groq / Fireworks / Cerebras (OpenAI-compatible)

from openai import OpenAI

tools = toolkit.to_openai()

providers = {
    "together": ("https://api.together.xyz/v1", "meta-llama/Llama-3.3-70B-Instruct-Turbo"),
    "groq":     ("https://api.groq.com/openai/v1", "llama-3.3-70b-versatile"),
    "fireworks": ("https://api.fireworks.ai/inference/v1", "accounts/fireworks/models/llama-v3p3-70b-instruct"),
    "cerebras": ("https://api.cerebras.ai/v1", "llama-3.3-70b"),
}

for name, (base_url, model) in providers.items():
    client = OpenAI(base_url=base_url, api_key="...")
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": "Weather in Tokyo?"}],
        tools=tools,
    )
    calls = parse_tool_calls(response, provider="openai")
    print(f"{name} ({model}): {calls[0].name}({calls[0].arguments})")

Ollama (Llama, Phi, Qwen, Mistral, Gemma — local)

from openai import OpenAI

client = OpenAI(base_url="http://localhost:11434/v1", api_key="ollama")

# Models with chat template tool support — use native format
native_models = ["llama3.2:3b", "qwen2.5:7b", "mistral:7b"]

for model in native_models:
    tools = toolkit.to_ollama(model=model)
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": "Weather in Tokyo?"}],
        tools=tools,
    )
    calls = parse_tool_calls(response, provider="openai")
    print(f"{model}: {calls[0].name}({calls[0].arguments})")

# Models WITHOUT tool support — use system prompt injection
no_tool_models = ["gemma2:2b", "smollm2:1.7b", "deepseek-r1:8b"]
prompt = toolkit.to_system_prompt()

for model in no_tool_models:
    response = client.chat.completions.create(
        model=model,
        messages=[
            {"role": "system", "content": prompt},
            {"role": "user", "content": "Weather in Tokyo?"},
        ],
    )
    calls = parse_tool_calls(response.choices[0].message.content, provider="freeform")
    if calls:
        print(f"{model}: {calls[0].name}({calls[0].arguments})")
    else:
        print(f"{model}: no tool call extracted (model too small or confused)")

vLLM (Self-Hosted, Any HuggingFace Model)

from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="token")
payload = toolkit.to_vllm_request(guided=True, guided_backend="outlines")

response = client.chat.completions.create(
    model="Qwen/Qwen2.5-7B-Instruct",
    messages=[{"role": "user", "content": "Weather in Tokyo?"}],
    **payload,  # includes tools + guided decoding config
)
calls = parse_tool_calls(response, provider="openai")

Architecture

toolglot/
├── toolglot/
│   ├── __init__.py            # Public API: ToolKit, parse_tool_calls, capabilities
│   ├── cli.py                 # Typer CLI: translate, validate, capabilities, inspect
│   ├── types.py               # Canonical IR: ToolDefinition, ToolParameter, ToolCall
│   ├── capabilities.py        # Model capability matrix (30+ models)
│   ├── importers/
│   │   ├── mcp.py             # MCP tool definitions → IR
│   │   ├── openapi.py         # OpenAPI 3.x specs → IR
│   │   ├── openai.py          # OpenAI function format → IR
│   │   ├── langchain.py       # LangChain BaseTool → IR
│   │   └── json_schema.py     # Raw JSON Schema → IR
│   ├── exporters/
│   │   ├── openai.py          # IR → OpenAI (+ Azure, Together, Groq, etc.)
│   │   ├── anthropic.py       # IR → Anthropic
│   │   ├── gemini.py          # IR → Google Gemini / Vertex AI
│   │   ├── mistral.py         # IR → Mistral
│   │   ├── cohere.py          # IR → Cohere (auto-flatten)
│   │   ├── bedrock.py         # IR → AWS Bedrock Converse API
│   │   ├── ollama.py          # IR → Ollama (chat template aware)
│   │   ├── vllm.py            # IR → vLLM / TGI
│   │   └── system_prompt.py   # IR → system prompt text (universal fallback)
│   ├── parsers/
│   │   ├── openai.py          # OpenAI response → ToolCall
│   │   ├── anthropic.py       # Anthropic response → ToolCall
│   │   ├── gemini.py          # Gemini response → ToolCall
│   │   ├── mistral.py         # Mistral response → ToolCall
│   │   ├── cohere.py          # Cohere response → ToolCall
│   │   └── freeform.py        # Raw text → ToolCall (regex + heuristics)
│   ├── transforms/
│   │   ├── flatten.py         # Flatten nested objects
│   │   ├── deref.py           # Resolve $ref pointers
│   │   ├── simplify.py        # Downgrade schemas for small models
│   │   └── validate.py        # Validate tool calls against schemas
│   └── integrations/
│       ├── langchain.py       # adapt(), to_langchain(), BaseTool wrappers
│       └── langgraph.py       # ToolGlotNode for graph-level integration
├── examples/
│   ├── quickstart.py          # 10-line getting started
│   ├── multi_provider.py      # Same tools → every provider
│   ├── langgraph_react_agent.py  # LangGraph with 5+ models
│   ├── local_models.py        # Ollama / vLLM examples
│   ├── format_comparison.py   # Side-by-side format output
│   ├── schema_downgrade.py    # Complex → simplified for edge models
│   ├── response_parsing.py    # Parse tool calls from all providers
│   └── capability_matrix.py   # Query model capabilities
├── tests/
│   ├── conftest.py            # Shared fixtures
│   └── unit/
│       ├── test_types.py
│       ├── test_exporters.py
│       ├── test_parsers.py
│       └── test_transforms.py
├── pyproject.toml
├── Makefile
├── LICENSE
└── README.md

Research Artifacts

python research/eval_harness.py

How It Compares

| | ToolGlot | LiteLLM | LangChain | Vercel AI SDK |
| --- | --- | --- | --- | --- |
| What it is | Library | Proxy/service | Framework | TypeScript SDK |
| Tool translation | Standalone, composable | Coupled to their router | Coupled to their ecosystem | Coupled to their runtime |
| Schema optimization | Auto per model | No | No | No |
| Capability matrix | Built-in (30+ models) | Partial | No | Partial |
| System prompt fallback | Built-in | No | No | No |
| Response parsing | Unified across all providers | Via their proxy | Per-provider classes | Per-provider |
| LangGraph integration | First-class adapt() | N/A | Built-in (basic) | N/A |
| Local model support | Ollama + vLLM + system prompt | Via proxy | Via ChatOllama | No |
| Language | Python | Python | Python + JS | TypeScript |
| Install size | Minimal (core has 4 deps) | Heavy | Very heavy | Heavy |

Requirements

  • Python 3.10+
  • No GPU needed. No heavy ML dependencies.

Core dependencies (4 packages):

  • pydantic — IR models and validation
  • typer — CLI
  • rich — pretty output
  • pyyaml — config files

Production Hardening

For deployment and operations guidance (validation guardrails, retries/timeouts, redaction, and incident response), see:


Performance Benchmarks

ToolGlot includes benchmark coverage for import/export/transform paths across small/medium/large schema fixtures.

make benchmark
make benchmark-budget

Benchmark metrics are written to .benchmarks/latest.json and budget thresholds are enforced in CI.


Installation

# Core — translate, validate, inspect
pip install toolglot

# With LangChain/LangGraph integration
pip install "toolglot[langchain]"

# Everything
pip install "toolglot[all]"

# Development
pip install "toolglot[dev]"

CI Quality Gate Parity

CI runs the same baseline commands expected locally:

make lint
make typecheck-ci
make test
make test-cov

Current CI coverage gate is enforced in workflow configuration and should be raised over time as test coverage grows.


Release and Versioning

ToolGlot follows Semantic Versioning with automated tagged releases to TestPyPI and PyPI.


Roadmap

  • Canonical IR with Pydantic models
  • Importers: MCP, OpenAI, JSON Schema
  • Exporters: OpenAI, Anthropic, Gemini, Mistral, Cohere, Bedrock
  • System prompt fallback for non-native models
  • Response parsers for all providers
  • Schema transforms: flatten, deref, simplify
  • Capability matrix (30+ models)
  • LangGraph adapt() integration
  • OpenAPI 3.x importer
  • LangChain tool importer
  • Ollama chat template detection
  • vLLM guided generation integration
  • Streaming tool call parsing
  • MCP server auto-discovery
  • Schema diff tool (compare formats side-by-side)
  • Plugin system for custom providers
  • Web playground for interactive translation

API Stability Policy

ToolGlot's public API contract and deprecation framework are documented in:


Security

ToolGlot publishes a dedicated vulnerability reporting and response process in SECURITY.md.

  • Report vulnerabilities privately via GitHub Security Advisories
  • Scorecard runs in CI via .github/workflows/scorecard.yml
  • Any failed or regressed Scorecard check should be tracked in a follow-up issue labeled area:security and priority:P1

Contributing

Contributions welcome. Adding a new provider? It's one file in exporters/ and one in parsers/.

Compatibility invariants across exporters/parsers are enforced via contract tests:

git clone https://github.com/NP-compete/toolglot.git
cd toolglot
pip install -e ".[dev]"
pytest

Property/fuzz robustness tests:

pytest tests/unit/test_property_based.py -q

Supply-Chain Security

ToolGlot includes baseline dependency governance:

  • Dependabot updates for both pip and GitHub Actions via .github/dependabot.yml
  • PR dependency risk checks via .github/workflows/dependency-review.yml
  • SBOM generation and release attachment via .github/workflows/release-sbom.yml

Citation

@software{toolglot2026,
  author = {Soham Dutta},
  title = {ToolGlot: Define Tools Once, Use Them With Any Model},
  year = {2026},
  url = {https://github.com/NP-compete/toolglot}
}

License

MIT
