toolglot

Define tools once. Use them with any model.

Quick Start • Why ToolGlot • Format Zoo • 30+ Models • LangGraph • Provider Cookbook • Contributing • Production Hardening • Security

You wrote 15 MCP tools. They work perfectly with GPT-4o. Then your boss says "make it work with Claude, Gemini, Llama, and that 2B model running on a Raspberry Pi."

Every model speaks a different tool dialect. OpenAI wants functions. Anthropic wants tools with input_schema. Gemini wants function_declarations. Cohere wants flat parameters. Ollama needs it in the chat template. That 2B model? It needs tools injected into the system prompt and responses parsed from freeform text.

ToolGlot translates between all of them.

Your Tools ──→ [ ToolGlot ] ──→ Any Model, Any Provider, Any Format
     ↑              │
 MCP / OpenAPI      └──→ OpenAI, Anthropic, Gemini, Mistral, Cohere,
 / JSON Schema           Bedrock, Ollama, vLLM, or plain system prompt

Quick Start

pip install toolglot

Translate tool definitions from the CLI:

# MCP → OpenAI format
toolglot translate --from mcp --to openai --input tools.json --output openai_tools.json

# OpenAPI spec → Anthropic format
toolglot translate --from openapi --to anthropic --input petstore.yaml

# Inspect what a model supports
toolglot capabilities --model gpt-4o

# Validate a tool definition
toolglot validate --format openai --input tools.json

# Compare two schemas and detect lossy changes
toolglot schema-diff --left-input a.json --left-format mcp --right-input b.json --right-format openai --output json

Scaffold a provider plugin:

toolglot plugin init --name acme --output-dir ./plugins

Automation-friendly CLI modes:

# Machine-readable output
toolglot capabilities --model gpt-4o --output json
toolglot validate --format mcp --input tools.json --output json
toolglot import-mcp --config ./mcp_config.json --output json

# Logging control for scripts
toolglot inspect --format mcp --input tools.json --quiet
toolglot translate --from mcp --to openai --input tools.json --verbose

Or in Python — define once, export everywhere:

from toolglot import ToolKit

# Load from any source
toolkit = ToolKit.from_mcp("./my_tools.json")

# Export to any target — one line each
openai_tools    = toolkit.to_openai()
anthropic_tools = toolkit.to_anthropic()
gemini_tools    = toolkit.to_gemini()
mistral_tools   = toolkit.to_mistral()
cohere_tools    = toolkit.to_cohere()
bedrock_tools   = toolkit.to_bedrock()
ollama_tools    = toolkit.to_ollama(model="llama3.2")
prompt_text     = toolkit.to_system_prompt()   # any model, zero native support needed

# Parse tool calls back from any provider
from toolglot import parse_tool_calls

calls = parse_tool_calls(response, provider="anthropic")
for call in calls:
    print(call.name, call.arguments)   # unified format, always

Why ToolGlot

The problem exists in pieces. Nobody assembled the solution.

What You Need	Existing Solutions	The Problem
Provider-agnostic tool definitions	LiteLLM	Full proxy. Couples you to their runtime. You wanted a library, not a service.
Tool format translation	LangChain tool abstraction	Locked into the LangChain ecosystem. Not standalone.
Multi-provider tool calling	Vercel AI SDK	TypeScript only. Frontend-focused.
Schema optimization for small models	Nothing	Nobody downgrades complex schemas for weaker models.
Tool calling for non-native models	Nothing	If the model doesn't support tools natively, you're on your own.

ToolGlot is the missing primitive. A standalone Python library that translates tool definitions between formats, optimizes schemas per model, parses responses back to a unified format, and makes tool calling work on models that don't natively support it.

┌─────────────┐     ┌───────────┐     ┌───────────┐     ┌──────────┐
│   Import    │────▶│ Canonical │────▶│ Transform │────▶│  Export  │
│ MCP/OpenAPI │     │    IR     │     │  per model │     │ per prov │
└─────────────┘     └───────────┘     └───────────┘     └──────────┘
       │                  │                 │                  │
  Your tool defs    Pydantic models    Flatten, deref,    OpenAI, Claude,
  in any format     (the truth)        simplify, validate  Gemini, Ollama...
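The four stages can be pictured with a toy pipeline. This is only a sketch under stated assumptions: it uses a plain dataclass where ToolGlot uses Pydantic models, and `ToolDef`, `import_mcp`, `strip_defaults`, and `export_openai` are hypothetical names chosen for illustration, not ToolGlot's API.

```python
import copy
from dataclasses import dataclass, field
from typing import Any


@dataclass
class ToolDef:
    """Hypothetical provider-neutral record for one tool (stand-in for the IR)."""
    name: str
    description: str
    schema: dict[str, Any] = field(default_factory=dict)


def import_mcp(raw: dict[str, Any]) -> ToolDef:
    # Import stage: MCP stores the JSON Schema under "inputSchema".
    return ToolDef(raw["name"], raw.get("description", ""), copy.deepcopy(raw["inputSchema"]))


def strip_defaults(tool: ToolDef) -> ToolDef:
    # Transform stage (one example rule): drop "default" from top-level properties.
    props = {
        name: {k: v for k, v in prop.items() if k != "default"}
        for name, prop in tool.schema.get("properties", {}).items()
    }
    return ToolDef(tool.name, tool.description, {**tool.schema, "properties": props})


def export_openai(tool: ToolDef) -> dict[str, Any]:
    # Export stage: wrap the schema in OpenAI's function envelope.
    return {
        "type": "function",
        "function": {
            "name": tool.name,
            "description": tool.description,
            "parameters": tool.schema,
        },
    }


mcp_tool = {
    "name": "get_weather",
    "description": "Get current weather for a city",
    "inputSchema": {
        "type": "object",
        "properties": {
            "city": {"type": "string"},
            "units": {"type": "string", "default": "celsius"},
        },
        "required": ["city"],
    },
}
openai_tool = export_openai(strip_defaults(import_mcp(mcp_tool)))
```

Because every stage takes and returns the same neutral record, transforms can be chained or skipped per target without the importers or exporters knowing about each other.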

The Format Zoo

Here's the same tool — get_weather — in six different formats. This is what ToolGlot handles for you.

Your MCP tool definition (input):

{
  "name": "get_weather",
  "description": "Get current weather for a city",
  "inputSchema": {
    "type": "object",
    "properties": {
      "city": { "type": "string", "description": "City name" },
      "units": { "type": "string", "enum": ["celsius", "fahrenheit"], "default": "celsius" }
    },
    "required": ["city"]
  }
}

OpenAI format — wraps in type: "function", nests under function.parameters:

{
  "type": "function",
  "function": {
    "name": "get_weather",
    "description": "Get current weather for a city",
    "parameters": {
      "type": "object",
      "properties": {
        "city": { "type": "string", "description": "City name" },
        "units": { "type": "string", "enum": ["celsius", "fahrenheit"], "default": "celsius" }
      },
      "required": ["city"]
    }
  }
}

Anthropic format — uses input_schema instead of parameters:

{
  "name": "get_weather",
  "description": "Get current weather for a city",
  "input_schema": {
    "type": "object",
    "properties": {
      "city": { "type": "string", "description": "City name" },
      "units": { "type": "string", "enum": ["celsius", "fahrenheit"], "default": "celsius" }
    },
    "required": ["city"]
  }
}

Gemini format — uppercase types, no default, nested under function_declarations:

{
  "function_declarations": [{
    "name": "get_weather",
    "description": "Get current weather for a city",
    "parameters": {
      "type": "OBJECT",
      "properties": {
        "city": { "type": "STRING", "description": "City name" },
        "units": { "type": "STRING", "enum": ["celsius", "fahrenheit"] }
      },
      "required": ["city"]
    }
  }]
}
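The differences from the MCP input are mechanical, so they can be sketched in a few lines. The helper names `to_gemini_schema` and `mcp_to_gemini` are hypothetical and only cover the keywords shown in this example (`type`, `properties`, `items`, `default`); ToolGlot's own exporter may handle more.

```python
import copy
from typing import Any


def to_gemini_schema(schema: dict[str, Any]) -> dict[str, Any]:
    """Return a copy with upper-cased "type" values and no "default" keys."""
    out: dict[str, Any] = {}
    for key, value in schema.items():
        if key == "default":
            continue  # dropped, matching the Gemini example
        if key == "type" and isinstance(value, str):
            out[key] = value.upper()
        elif key == "properties" and isinstance(value, dict):
            # Recurse per property so a property *named* "type" is left alone.
            out[key] = {name: to_gemini_schema(sub) for name, sub in value.items()}
        elif key == "items" and isinstance(value, dict):
            out[key] = to_gemini_schema(value)
        else:
            out[key] = copy.deepcopy(value)
    return out


def mcp_to_gemini(tools: list[dict[str, Any]]) -> dict[str, Any]:
    # All declarations are grouped under one "function_declarations" list.
    return {
        "function_declarations": [
            {
                "name": tool["name"],
                "description": tool.get("description", ""),
                "parameters": to_gemini_schema(tool["inputSchema"]),
            }
            for tool in tools
        ]
    }


weather = {
    "name": "get_weather",
    "description": "Get current weather for a city",
    "inputSchema": {
        "type": "object",
        "properties": {
            "city": {"type": "string", "description": "City name"},
            "units": {"type": "string", "enum": ["celsius", "fahrenheit"], "default": "celsius"},
        },
        "required": ["city"],
    },
}
gemini_tools = mcp_to_gemini([weather])
```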

Cohere format — flat parameter_definitions, Python types, no nesting:

{
  "name": "get_weather",
  "description": "Get current weather for a city",
  "parameter_definitions": {
    "city": { "type": "str", "description": "City name", "required": true },
    "units": { "type": "str", "description": "Temperature units: celsius or fahrenheit. Default: celsius", "required": false }
  }
}
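A rough sketch of that conversion for scalar parameters follows. The type mapping and the way enum and default values are folded into the description are assumptions made for illustration (`mcp_to_cohere` and `JSON_TO_PY` are hypothetical names), so the generated wording differs from the example above.

```python
from typing import Any

# Assumed mapping from JSON Schema scalar types to Python type names.
JSON_TO_PY = {"string": "str", "integer": "int", "number": "float", "boolean": "bool"}


def mcp_to_cohere(tool: dict[str, Any]) -> dict[str, Any]:
    schema = tool["inputSchema"]
    required = set(schema.get("required", []))
    params: dict[str, Any] = {}
    for name, prop in schema.get("properties", {}).items():
        json_type = prop.get("type")
        if not isinstance(json_type, str) or json_type not in JSON_TO_PY:
            # Nested objects and arrays need flattening first; this sketch stops here.
            raise ValueError(f"{name}: unsupported type {json_type!r}")
        # The flat format has no enum/default slots, so keep them in the description.
        notes = [prop.get("description", "").rstrip(".")]
        if "enum" in prop:
            notes.append("One of: " + ", ".join(str(v) for v in prop["enum"]))
        if "default" in prop:
            notes.append(f"Default: {prop['default']}")
        params[name] = {
            "type": JSON_TO_PY[json_type],
            "description": ". ".join(n for n in notes if n),
            "required": name in required,
        }
    return {
        "name": tool["name"],
        "description": tool.get("description", ""),
        "parameter_definitions": params,
    }


cohere_tool = mcp_to_cohere({
    "name": "get_weather",
    "description": "Get current weather for a city",
    "inputSchema": {
        "type": "object",
        "properties": {
            "city": {"type": "string", "description": "City name"},
            "units": {"type": "string", "enum": ["celsius", "fahrenheit"], "default": "celsius"},
        },
        "required": ["city"],
    },
})
```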

System prompt fallback — for models with zero native tool support:

You have access to the following tools:

## get_weather
Get current weather for a city

Parameters:
- city (string, required): City name
- units (string, optional): Temperature units. One of: celsius, fahrenheit. Default: celsius

When you want to call a tool, respond with:
<tool_call>{"name": "get_weather", "arguments": {"city": "Tokyo"}}</tool_call>

Six formats. Same tool. ToolGlot handles all of this with one line of code.
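For a feel of what the system prompt fallback involves, here is a minimal renderer that produces text in the shape shown above. It is a sketch, not ToolGlot's exporter: `render_tool` and `render_system_prompt` are hypothetical names, and the exact wording of the generated prompt is an assumption.

```python
from typing import Any


def render_tool(tool: dict[str, Any]) -> str:
    """Render one MCP-style tool definition as a plain-text section."""
    schema = tool["inputSchema"]
    required = set(schema.get("required", []))
    lines = [f"## {tool['name']}", tool.get("description", ""), "", "Parameters:"]
    for name, prop in schema.get("properties", {}).items():
        flag = "required" if name in required else "optional"
        notes = [prop.get("description", "").rstrip(".")]
        if "enum" in prop:
            notes.append("One of: " + ", ".join(str(v) for v in prop["enum"]))
        if "default" in prop:
            notes.append(f"Default: {prop['default']}")
        detail = ". ".join(n for n in notes if n)
        lines.append(f"- {name} ({prop.get('type', 'any')}, {flag}): {detail}")
    return "\n".join(lines)


def render_system_prompt(tools: list[dict[str, Any]]) -> str:
    header = "You have access to the following tools:"
    footer = (
        "When you want to call a tool, respond with:\n"
        '<tool_call>{"name": "<tool name>", "arguments": {...}}</tool_call>'
    )
    return "\n\n".join([header, *(render_tool(t) for t in tools), footer])


prompt = render_system_prompt([{
    "name": "get_weather",
    "description": "Get current weather for a city",
    "inputSchema": {
        "type": "object",
        "properties": {
            "city": {"type": "string", "description": "City name"},
            "units": {"type": "string", "enum": ["celsius", "fahrenheit"], "default": "celsius"},
        },
        "required": ["city"],
    },
}])
```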

Features

Import From Anywhere

from toolglot import ToolKit

# MCP tool definitions (JSON)
toolkit = ToolKit.from_mcp("./mcp_tools.json")

# MCP server config (connects and lists tools)
toolkit = ToolKit.from_mcp_server("./mcp_server_config.json")

# OpenAPI / Swagger spec
toolkit = ToolKit.from_openapi("./petstore.yaml")

# OpenAI function format
toolkit = ToolKit.from_openai([{"type": "function", "function": {...}}])

# Raw JSON Schema
toolkit = ToolKit.from_json_schema({"get_weather": {...}})

# LangChain tools (requires toolglot[langchain])
from langchain_core.tools import tool

@tool
def get_weather(city: str) -> str:
    """Get weather for a city."""
    ...

toolkit = ToolKit.from_langchain([get_weather])

Supported input formats:

Source	Method	Notes
MCP tool definitions	`ToolKit.from_mcp("tools.json")`	JSON file with MCP tool array
MCP server config	`ToolKit.from_mcp_server("config.json")`	Connects to server, lists tools
OpenAPI 3.x spec	`ToolKit.from_openapi("spec.yaml")`	Extracts operations as tools
OpenAI function format	`ToolKit.from_openai(tools_list)`	List of `{"type": "function", ...}`
JSON Schema	`ToolKit.from_json_schema(schemas)`	Dict of name → schema
LangChain tools	`ToolKit.from_langchain(tools)`	List of `BaseTool`

Export To Any Model

toolkit = ToolKit.from_mcp("./tools.json")

# Cloud providers
openai_tools    = toolkit.to_openai()          # GPT-4o, GPT-4.1, o1, o3-mini
anthropic_tools = toolkit.to_anthropic()        # Claude Sonnet 4, Claude 3.5 Haiku
gemini_tools    = toolkit.to_gemini()           # Gemini 2.5 Pro, Gemini 2.0 Flash
mistral_tools   = toolkit.to_mistral()          # Mistral Large, Codestral
cohere_tools    = toolkit.to_cohere()           # Command R+, Command A

# Cloud platforms
bedrock_tools   = toolkit.to_bedrock()          # Any model on AWS Bedrock
# Azure OpenAI and Vertex AI use the same format as OpenAI/Gemini respectively

# Local / self-hosted
ollama_tools    = toolkit.to_ollama(model="llama3.2")   # Chat-template-aware
vllm_tools      = toolkit.to_vllm()                      # Guided generation format

# Universal fallback
prompt_text     = toolkit.to_system_prompt()    # Works with ANY model

Parse Responses Back

Every provider returns tool calls differently. ToolGlot normalizes them.

from toolglot import parse_tool_calls, ToolCall

# OpenAI: tool_calls[].function.arguments (JSON string)
calls = parse_tool_calls(openai_response, provider="openai")

# Anthropic: content[].type=="tool_use", input (dict)
calls = parse_tool_calls(claude_response, provider="anthropic")

# Gemini: candidates[].content.parts[].function_call
calls = parse_tool_calls(gemini_response, provider="gemini")

# Mistral: tool_calls[].function (similar to OpenAI, subtly different)
calls = parse_tool_calls(mistral_response, provider="mistral")

# Cohere: tool_calls[].name + parameters
calls = parse_tool_calls(cohere_response, provider="cohere")

# Freeform text (for system prompt fallback)
calls = parse_tool_calls(raw_text, provider="freeform")

# Every call is the same type regardless of source
for call in calls:
    assert isinstance(call, ToolCall)
    print(call.id, call.name, call.arguments)
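To make the normalization concrete, here is a small sketch for two of the shapes listed above, working on plain dicts as they appear in the JSON responses. `NormalizedCall` and the two helper functions are hypothetical stand-ins; ToolGlot's own `ToolCall` type and parsers may look different.

```python
import json
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class NormalizedCall:
    """Hypothetical unified shape for a tool call."""
    id: Optional[str]
    name: str
    arguments: dict[str, Any]


def from_openai_message(message: dict[str, Any]) -> list[NormalizedCall]:
    # OpenAI: arguments arrive as a JSON *string* and must be decoded.
    calls = []
    for tc in message.get("tool_calls") or []:
        fn = tc["function"]
        calls.append(NormalizedCall(tc.get("id"), fn["name"], json.loads(fn["arguments"] or "{}")))
    return calls


def from_anthropic_content(content: list[dict[str, Any]]) -> list[NormalizedCall]:
    # Anthropic: tool calls are content blocks of type "tool_use"; input is already a dict.
    return [
        NormalizedCall(block.get("id"), block["name"], block.get("input") or {})
        for block in content
        if block.get("type") == "tool_use"
    ]


openai_message = {
    "role": "assistant",
    "content": None,
    "tool_calls": [{
        "id": "call_1",
        "type": "function",
        "function": {"name": "get_weather", "arguments": "{\"city\": \"Tokyo\"}"},
    }],
}
anthropic_content = [
    {"type": "text", "text": "Checking the weather."},
    {"type": "tool_use", "id": "toolu_1", "name": "get_weather", "input": {"city": "Tokyo"}},
]

a = from_openai_message(openai_message)[0]
b = from_anthropic_content(anthropic_content)[0]
```

Both `a` and `b` end up with the same `name` and `arguments`, which is the property downstream code relies on.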

For streaming chunks/events:

from toolglot import parse_stream_tool_calls

calls = parse_stream_tool_calls(openai_stream_chunks, provider="openai")
calls = parse_stream_tool_calls(anthropic_stream_events, provider="anthropic")
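Streaming is harder than it looks because a single call arrives in fragments. The sketch below shows the accumulation step for OpenAI-style deltas, where each fragment carries an `index` and the argument JSON is split across chunks. `accumulate_tool_calls` is a hypothetical helper operating on plain dicts, not the internals of `parse_stream_tool_calls`.

```python
import json
from typing import Any, Iterable


def accumulate_tool_calls(deltas: Iterable[dict[str, Any]]) -> list[dict[str, Any]]:
    """Merge OpenAI-style streaming deltas into complete tool calls."""
    partial: dict[int, dict[str, str]] = {}
    for delta in deltas:
        for piece in delta.get("tool_calls") or []:
            slot = partial.setdefault(piece["index"], {"id": "", "name": "", "arguments": ""})
            if piece.get("id"):
                slot["id"] = piece["id"]
            fn = piece.get("function") or {}
            if fn.get("name"):
                slot["name"] = fn["name"]
            # Argument text is only valid JSON once every fragment has arrived.
            slot["arguments"] += fn.get("arguments") or ""
    return [
        {"id": s["id"], "name": s["name"], "arguments": json.loads(s["arguments"] or "{}")}
        for _, s in sorted(partial.items())
    ]


deltas = [
    {"tool_calls": [{"index": 0, "id": "call_1", "function": {"name": "get_weather", "arguments": ""}}]},
    {"tool_calls": [{"index": 0, "function": {"arguments": "{\"city\": "}}]},
    {"content": "text-only delta, no tool call fragment"},
    {"tool_calls": [{"index": 0, "function": {"arguments": "\"Tokyo\"}"}}]},
]
streamed = accumulate_tool_calls(deltas)
```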

Schema Transforms

Not all models handle complex schemas. ToolGlot downgrades schemas to what the target model can handle.

from toolglot.transforms import flatten, simplify, deref, validate

# Flatten nested objects (required for Cohere)
# {passengers: {adults: int, children: int}} → {passengers_adults: int, passengers_children: int}
flat_toolkit = flatten(toolkit)

# Resolve $ref pointers (required for Gemini)
resolved_toolkit = deref(toolkit)

# Simplify for small models (remove anyOf, simplify enums, add descriptions)
simple_toolkit = simplify(toolkit, target_model="phi4-mini")

# Validate a tool call against the schema
result = validate(tool_call, toolkit)
if not result.valid:
    print(result.errors)
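The flatten step and its inverse can be sketched as follows. This is an illustration of the idea rather than ToolGlot's implementation: `flatten_properties` and `unflatten_arguments` are hypothetical names, the sketch ignores `required` lists, and it does not guard against a flat name colliding with an existing property.

```python
from typing import Any

Path = tuple[str, ...]


def flatten_properties(
    props: dict[str, Any], prefix: Path = ()
) -> tuple[dict[str, Any], dict[str, Path]]:
    """Return (flat properties, mapping of flat name -> original path)."""
    flat: dict[str, Any] = {}
    paths: dict[str, Path] = {}
    for name, prop in props.items():
        path = prefix + (name,)
        if prop.get("type") == "object" and "properties" in prop:
            sub_flat, sub_paths = flatten_properties(prop["properties"], path)
            flat.update(sub_flat)
            paths.update(sub_paths)
        else:
            key = "_".join(path)
            flat[key] = prop
            paths[key] = path
    return flat, paths


def unflatten_arguments(args: dict[str, Any], paths: dict[str, Path]) -> dict[str, Any]:
    """Rebuild nested arguments from a flat tool call using the recorded paths."""
    nested: dict[str, Any] = {}
    for key, value in args.items():
        path = paths.get(key, (key,))
        node = nested
        for part in path[:-1]:
            node = node.setdefault(part, {})
        node[path[-1]] = value
    return nested


props = {
    "origin": {"type": "string"},
    "passengers": {
        "type": "object",
        "properties": {"adults": {"type": "integer"}, "children": {"type": "integer"}},
    },
}
flat_props, name_paths = flatten_properties(props)
restored = unflatten_arguments(
    {"origin": "NYC", "passengers_adults": 2, "passengers_children": 1}, name_paths
)
```

Recording the original path for each flat name avoids guessing where to split on underscores when restoring the nested structure.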

When transforms are applied automatically:

Transform	Auto-applied For	Why
`flatten`	Cohere	Only supports flat `parameter_definitions`
`deref`	Gemini	No `$ref` support
`simplify`	Edge models via `mode="simplified"`	Complex schemas confuse small models
All three	`mode="auto"`	ToolGlot picks based on capability matrix
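As an illustration of the `deref` row, resolving `$ref` amounts to inlining each referenced subschema. The sketch below handles only local references (`#/...`) and rejects cycles; `resolve` and `deref_schema` are hypothetical names, and keys sitting next to a `$ref` are ignored, which is a simplification.

```python
from typing import Any


def resolve(node: Any, root: dict[str, Any], seen: tuple[str, ...] = ()) -> Any:
    """Recursively replace local $ref pointers with the schema they point to."""
    if isinstance(node, dict):
        ref = node.get("$ref")
        if isinstance(ref, str):
            if not ref.startswith("#/"):
                raise ValueError(f"only local references are supported: {ref}")
            if ref in seen:
                raise ValueError(f"cyclic reference: {ref}")
            target: Any = root
            for part in ref[2:].split("/"):
                # JSON Pointer escapes: "~1" is "/", "~0" is "~".
                target = target[part.replace("~1", "/").replace("~0", "~")]
            return resolve(target, root, seen + (ref,))
        return {key: resolve(value, root, seen) for key, value in node.items()}
    if isinstance(node, list):
        return [resolve(item, root, seen) for item in node]
    return node


def deref_schema(schema: dict[str, Any]) -> dict[str, Any]:
    resolved = resolve(schema, schema)
    # Definitions are inlined everywhere, so the lookup tables can go.
    resolved.pop("$defs", None)
    resolved.pop("definitions", None)
    return resolved


schema = {
    "type": "object",
    "properties": {
        "home": {"$ref": "#/$defs/address"},
        "work": {"$ref": "#/$defs/address"},
    },
    "$defs": {"address": {"type": "object", "properties": {"city": {"type": "string"}}}},
}
inlined = deref_schema(schema)
```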

Capability Matrix

ToolGlot knows what every model can do. Query it programmatically or from the CLI.

from toolglot import capabilities

caps = capabilities("gpt-4o")
print(caps.native_tools)      # True
print(caps.parallel_calls)    # True
print(caps.streaming)         # True
print(caps.strict_mode)       # True
print(caps.max_tools)         # 128

toolglot capabilities --model claude-sonnet-4
# native_tools: true
# parallel_calls: true
# streaming: true
# strict_mode: false
# max_tools: 500+
# recommended_mode: native

Full matrix (30+ models):

Model	Provider	Native Tools	Parallel	Streaming	Strict	Max Tools	Recommended Mode
GPT-4o	OpenAI	Yes	Yes	Yes	Yes	128	`native`
GPT-4o-mini	OpenAI	Yes	Yes	Yes	Yes	128	`native`
GPT-4.1	OpenAI	Yes	Yes	Yes	Yes	128	`native`
GPT-4.1-mini	OpenAI	Yes	Yes	Yes	Yes	128	`native`
GPT-4.1-nano	OpenAI	Yes	Yes	Yes	Yes	128	`native`
o1	OpenAI	Yes	Yes	No	Yes	128	`native`
o3-mini	OpenAI	Yes	Yes	No	Yes	128	`native`
o4-mini	OpenAI	Yes	Yes	No	Yes	128	`native`
Claude Sonnet 4	Anthropic	Yes	Yes	Yes	No	500+	`native`
Claude 3.5 Haiku	Anthropic	Yes	Yes	Yes	No	500+	`native`
Gemini 2.5 Pro	Google	Yes	Yes	Yes	No	128	`native`
Gemini 2.0 Flash	Google	Yes	Yes	Yes	No	128	`native`
Mistral Large	Mistral	Yes	Yes	Yes	No	64	`native`
Mistral Small	Mistral	Yes	Yes	Yes	No	64	`native`
Codestral	Mistral	Yes	Yes	Yes	No	64	`native`
Command R+	Cohere	Yes	No	No	No	40	`native` (flat)
Command A	Cohere	Yes	No	No	No	40	`native` (flat)
DeepSeek V3	DeepSeek	Yes	Yes	Yes	No	128	`native`
DeepSeek R1	DeepSeek	No	No	No	No	--	`system_prompt`
Grok 3	xAI	Yes	Yes	Yes	No	128	`native`
Grok 3 Mini	xAI	Yes	Yes	Yes	No	128	`native`
Llama 4 Maverick	Meta	Yes	Yes	Yes	No	64	`native`
Llama 4 Scout	Meta	Yes	Yes	Yes	No	64	`native`
Llama 3.3 70B	Groq / Together	Yes	Yes	Yes	No	64	`native`
Llama 3.2 3B	Ollama	Template	No	No	No	~10	`simplified`
Llama 3.2 1B	Ollama	Template	No	No	No	~5	`system_prompt`
Phi-4-mini	Ollama	Template	No	No	No	~10	`simplified`
Qwen 2.5 7B	Ollama	Template	No	No	No	~15	`native`
Qwen 2.5 3B	Ollama	Template	No	No	No	~10	`simplified`
Mistral 7B	Ollama	Template	No	No	No	~10	`native`
Gemma 2 9B	Ollama	No	No	No	No	--	`system_prompt`
Gemma 2 2B	Ollama	No	No	No	No	--	`system_prompt`
DeepSeek R1 8B	Ollama	No	No	No	No	--	`system_prompt`
QwQ 32B	Ollama	No	No	No	No	--	`system_prompt`
SmolLM2 1.7B	Ollama	No	No	No	No	--	`system_prompt`

Key:

Yes: Model has a built-in (native) tool-calling API
Template: Tools injected via chat template (Ollama/vLLM)
No: No native support — use system_prompt mode
Max Tools: Approximate practical limit before quality degrades
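A capability matrix like this boils down to a lookup table plus a safe default. The sketch below copies a few rows from the table; `ModelCaps`, `MATRIX`, `pick_mode`, and `within_tool_budget` are hypothetical names, the model ids used as keys are illustrative, and falling back to `system_prompt` for unknown models is an assumption rather than documented ToolGlot behavior.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ModelCaps:
    native_tools: str            # "yes", "template", or "no"
    max_tools: Optional[int]     # None where the matrix shows "--"
    recommended_mode: str


# A few rows copied from the matrix above.
MATRIX = {
    "gpt-4o": ModelCaps("yes", 128, "native"),
    "llama3.2:3b": ModelCaps("template", 10, "simplified"),
    "qwen2.5:7b": ModelCaps("template", 15, "native"),
    "gemma2:2b": ModelCaps("no", None, "system_prompt"),
}


def pick_mode(model: str) -> str:
    caps = MATRIX.get(model)
    # Unknown models get the universal fallback, which needs no native support.
    return caps.recommended_mode if caps else "system_prompt"


def within_tool_budget(model: str, tool_count: int) -> bool:
    caps = MATRIX.get(model)
    # No recorded limit means there is nothing to check against.
    return caps is None or caps.max_tools is None or tool_count <= caps.max_tools
```

A check such as `within_tool_budget("llama3.2:3b", 15)` returning `False` is a hint to trim the toolkit before sending it to a small model.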

LangGraph Integration

This is where ToolGlot becomes essential. create_react_agent can break when you switch models, for example on nested schemas with Cohere or on $ref with Gemini. ToolGlot fixes that.

pip install "toolglot[langchain]"

Same Agent, Any Model

from langgraph.prebuilt import create_react_agent
from langchain_openai import ChatOpenAI
from langchain_anthropic import ChatAnthropic
from langchain_google_genai import ChatGoogleGenerativeAI
from langchain_mistralai import ChatMistralAI
from langchain_cohere import ChatCohere
from toolglot import ToolKit
from toolglot.integrations.langchain import adapt

toolkit = ToolKit.from_mcp("./my_tools.json")
tools = toolkit.to_langchain()

# adapt() wraps the model with format-aware, capability-aware tool binding
# It detects the provider, picks the right exporter, applies schema transforms

models = {
    "gpt-4o":        adapt(ChatOpenAI(model="gpt-4o"), toolkit),
    "gpt-4.1-nano":  adapt(ChatOpenAI(model="gpt-4.1-nano"), toolkit),
    "claude-sonnet":  adapt(ChatAnthropic(model="claude-sonnet-4-20250514"), toolkit),
    "gemini-flash":  adapt(ChatGoogleGenerativeAI(model="gemini-2.0-flash"), toolkit),
    "mistral-large": adapt(ChatMistralAI(model="mistral-large-latest"), toolkit),
    "command-r+":    adapt(ChatCohere(model="command-r-plus"), toolkit),
    #                 ↑ ToolGlot auto-flattens nested schemas for Cohere
}

for name, model in models.items():
    agent = create_react_agent(model, tools)
    result = agent.invoke({"messages": [("user", "Search flights from NYC to London")]})
    print(f"{name}: {result['messages'][-1].content}")

Without ToolGlot, the Cohere agent crashes on nested schemas. The Gemini agent fails on $ref. With ToolGlot, they all just work.

Local Models via Ollama

from langgraph.prebuilt import create_react_agent
from langchain_ollama import ChatOllama
from toolglot import ToolKit
from toolglot.integrations.langchain import adapt

toolkit = ToolKit.from_mcp("./my_tools.json")
tools = toolkit.to_langchain()

# Native mode — models with solid chat template tool support
llama = adapt(ChatOllama(model="llama3.2:3b"), toolkit, mode="native")
agent = create_react_agent(llama, tools)

# Simplified mode — strip schema complexity for smaller models
phi = adapt(ChatOllama(model="phi4-mini"), toolkit, mode="simplified")
agent = create_react_agent(phi, tools)

# System prompt mode — any model, zero native support needed
gemma = adapt(ChatOllama(model="gemma2:2b"), toolkit, mode="system_prompt")
agent = create_react_agent(gemma, tools)

# Auto mode — ToolGlot checks capability matrix and picks the best strategy
qwen = adapt(ChatOllama(model="qwen2.5:7b"), toolkit, mode="auto")
agent = create_react_agent(qwen, tools)

Reasoning Models

o1, o3-mini, DeepSeek R1, QwQ: these models reason deeply, but not all of them support native tool calling. o1 and o3-mini do; DeepSeek R1 and QwQ do not, so out of the box you can't use them with create_react_agent. With ToolGlot, you can.

from langgraph.prebuilt import create_react_agent
from langchain_openai import ChatOpenAI
from langchain_ollama import ChatOllama
from toolglot import ToolKit
from toolglot.integrations.langchain import adapt

toolkit = ToolKit.from_mcp("./my_tools.json")
tools = toolkit.to_langchain()

# o3-mini — native tool calling, but simplify schemas for better results
o3 = adapt(ChatOpenAI(model="o3-mini"), toolkit, mode="auto")
agent = create_react_agent(o3, tools)

# DeepSeek R1 — no native tools, but ToolGlot injects into system prompt
# and parses <tool_call> blocks from R1's chain-of-thought output
r1 = adapt(ChatOllama(model="deepseek-r1:8b"), toolkit, mode="auto")
agent = create_react_agent(r1, tools)
# R1's deep reasoning makes it surprisingly good at multi-step tool planning.
# It just couldn't express tool calls before. Now it can.

# QwQ — same approach, reasoning model with no native tool support
qwq = adapt(ChatOllama(model="qwq:32b"), toolkit, mode="auto")
agent = create_react_agent(qwq, tools)
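The parsing side of this can be sketched with two regular expressions. The example assumes the model wraps its reasoning in `<think>` tags and emits calls in the `<tool_call>` format from the system prompt; `extract_tool_calls` is a hypothetical helper, and ignoring calls that appear inside the reasoning block is a design choice of this sketch, not necessarily what ToolGlot does.

```python
import json
import re
from typing import Any

THINK_RE = re.compile(r"<think>.*?</think>", re.DOTALL)
CALL_RE = re.compile(r"<tool_call>(.*?)</tool_call>", re.DOTALL)


def extract_tool_calls(text: str) -> list[dict[str, Any]]:
    """Pull well-formed <tool_call> blocks out of freeform model output."""
    # Drop the reasoning first so half-formed ideas are not executed as calls.
    visible = THINK_RE.sub("", text)
    calls = []
    for match in CALL_RE.finditer(visible):
        try:
            payload = json.loads(match.group(1).strip())
        except json.JSONDecodeError:
            continue  # skip malformed blocks instead of failing the whole turn
        if isinstance(payload, dict) and isinstance(payload.get("name"), str):
            calls.append({"name": payload["name"], "arguments": payload.get("arguments") or {}})
    return calls


output = (
    '<think>Maybe <tool_call>{"name": "draft", "arguments": {}}</tool_call> first? No.</think>\n'
    "I will check the weather.\n"
    '<tool_call>{"name": "get_weather", "arguments": {"city": "Tokyo"}}</tool_call>\n'
    "<tool_call>{not json}</tool_call>"
)
parsed = extract_tool_calls(output)
```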

Multi-Provider Fallback

The crown jewel — an agent that tries the cheapest model first and falls back:

from langgraph.prebuilt import create_react_agent
from langchain_openai import ChatOpenAI
from langchain_anthropic import ChatAnthropic
from langchain_ollama import ChatOllama
from toolglot import ToolKit
from toolglot.integrations.langchain import adapt

toolkit = ToolKit.from_mcp("./my_tools.json")
tools = toolkit.to_langchain()

tiers = [
    ("local-phi4",    adapt(ChatOllama(model="phi4-mini"), toolkit, mode="auto")),
    ("gpt-4.1-nano",  adapt(ChatOpenAI(model="gpt-4.1-nano"), toolkit, mode="auto")),
    ("gpt-4o",        adapt(ChatOpenAI(model="gpt-4o"), toolkit, mode="auto")),
    ("claude-sonnet", adapt(ChatAnthropic(model="claude-sonnet-4-20250514"), toolkit, mode="auto")),
]

def call_with_fallback(messages: list):
    for name, model in tiers:
        try:
            agent = create_react_agent(model, tools)
            result = agent.invoke({"messages": messages})
            print(f"Resolved by: {name}")
            return result
        except Exception as exc:
            # Report why this tier failed before trying the next one
            print(f"{name} failed: {exc!r}")
            continue
    raise RuntimeError("All tiers exhausted")

result = call_with_fallback([("user", "Book me a flight to Tokyo next Friday")])

Provider Cookbook

Detailed examples for every major provider. All examples use the same three tools:

from toolglot import ToolKit, parse_tool_calls

toolkit = ToolKit.from_mcp("./travel_tools.json")
# Tools: get_weather, search_flights, book_hotel
# search_flights has nested params: passengers: {adults: int, children: int}
# book_hotel has array params: preferences.amenities: list[str]

OpenAI (GPT-4o, GPT-4.1, GPT-4.1-mini, GPT-4.1-nano, o1, o3-mini, o4-mini)

from openai import OpenAI

client = OpenAI()
tools = toolkit.to_openai()

# Works with every OpenAI model
for model in ["gpt-4o", "gpt-4.1", "gpt-4.1-mini", "gpt-4.1-nano", "o3-mini"]:
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": "Weather in Tokyo?"}],
        tools=tools,
    )
    calls = parse_tool_calls(response, provider="openai")
    print(f"{model}: {calls[0].name}({calls[0].arguments})")

# Strict mode (OpenAI-specific: guarantees JSON Schema compliance)
tools_strict = toolkit.to_openai(strict=True)
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Search flights NYC to London, 2 adults"}],
    tools=tools_strict,
)

Anthropic (Claude Sonnet 4, Claude 3.5 Haiku)

from anthropic import Anthropic

client = Anthropic()
tools = toolkit.to_anthropic()

for model in ["claude-sonnet-4-20250514", "claude-3-5-haiku-20241022"]:
    response = client.messages.create(
        model=model,
        max_tokens=1024,
        tools=tools,
        messages=[{"role": "user", "content": "Weather in Tokyo?"}],
    )
    calls = parse_tool_calls(response, provider="anthropic")
    print(f"{model}: {calls[0].name}({calls[0].arguments})")

Google Gemini (Gemini 2.5 Pro, Gemini 2.0 Flash)

from google import genai

client = genai.Client()
tools = toolkit.to_gemini()

for model in ["gemini-2.5-pro-preview-05-06", "gemini-2.0-flash"]:
    response = client.models.generate_content(
        model=model,
        contents="Weather in Tokyo?",
        config=genai.types.GenerateContentConfig(tools=tools),
    )
    calls = parse_tool_calls(response, provider="gemini")
    print(f"{model}: {calls[0].name}({calls[0].arguments})")

Mistral (Mistral Large, Mistral Small, Codestral)

from mistralai import Mistral

client = Mistral()
tools = toolkit.to_mistral()

for model in ["mistral-large-latest", "mistral-small-latest", "codestral-latest"]:
    response = client.chat.complete(
        model=model,
        messages=[{"role": "user", "content": "Weather in Tokyo?"}],
        tools=tools,
    )
    calls = parse_tool_calls(response, provider="mistral")
    print(f"{model}: {calls[0].name}({calls[0].arguments})")

Cohere (Command R+, Command A)

import cohere

# parameter_definitions is the tool format of Cohere's v1 Chat API, so use the v1 client
client = cohere.Client()
tools = toolkit.to_cohere()
# ToolGlot auto-flattened nested params:
# passengers: {adults, children} → passengers_adults, passengers_children

for model in ["command-r-plus", "command-a-03-2025"]:
    response = client.chat(
        model=model,
        message="Search flights NYC to London, 2 adults 1 child",
        tools=tools,
    )
    calls = parse_tool_calls(response, provider="cohere")
    # calls[0].arguments has passengers_adults=2, passengers_children=1
    # Use toolkit.unflatten(calls) to restore nested structure if needed
    print(f"{model}: {calls[0].name}({calls[0].arguments})")

DeepSeek (V3 — native, R1 — system prompt)

from openai import OpenAI

# DeepSeek V3 — native tool calling (OpenAI-compatible API)
client = OpenAI(base_url="https://api.deepseek.com", api_key="...")
tools = toolkit.to_openai()

response = client.chat.completions.create(
    model="deepseek-chat",
    messages=[{"role": "user", "content": "Weather in Tokyo?"}],
    tools=tools,
)
calls = parse_tool_calls(response, provider="openai")

# DeepSeek R1 — no native tools, use system prompt injection
prompt, instructions = toolkit.to_system_prompt(return_instructions=True)
response = client.chat.completions.create(
    model="deepseek-reasoner",
    messages=[
        {"role": "system", "content": prompt},
        {"role": "user", "content": "Weather in Tokyo?"},
    ],
)
calls = parse_tool_calls(response.choices[0].message.content, provider="freeform")

xAI (Grok 3, Grok 3 Mini)

from openai import OpenAI

client = OpenAI(base_url="https://api.x.ai/v1", api_key="...")
tools = toolkit.to_openai()

for model in ["grok-3", "grok-3-mini"]:
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": "Weather in Tokyo?"}],
        tools=tools,
    )
    calls = parse_tool_calls(response, provider="openai")
    print(f"{model}: {calls[0].name}({calls[0].arguments})")

AWS Bedrock (Claude, Llama, Mistral on Bedrock)

import boto3

client = boto3.client("bedrock-runtime")
tools = toolkit.to_bedrock()

for model_id in [
    "anthropic.claude-sonnet-4-20250514-v1:0",
    "meta.llama3-3-70b-instruct-v1:0",
    "mistral.mistral-large-2407-v1:0",
]:
    response = client.converse(
        modelId=model_id,
        messages=[{"role": "user", "content": [{"text": "Weather in Tokyo?"}]}],
        toolConfig={"tools": tools},
    )
    calls = parse_tool_calls(response, provider="bedrock")
    print(f"{model_id}: {calls[0].name}({calls[0].arguments})")
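For reference, the Converse API expects each tool wrapped in a `toolSpec` with the schema under `inputSchema.json`, and returns calls as `toolUse` content blocks. The sketch below shows both directions on plain dicts; `mcp_to_bedrock` and `tool_uses` are hypothetical helper names, not ToolGlot functions, and the response is a hand-written sample.

```python
from typing import Any


def mcp_to_bedrock(tool: dict[str, Any]) -> dict[str, Any]:
    # Converse wraps each tool in "toolSpec" and the schema in {"json": ...}.
    return {
        "toolSpec": {
            "name": tool["name"],
            "description": tool.get("description", ""),
            "inputSchema": {"json": tool["inputSchema"]},
        }
    }


def tool_uses(response: dict[str, Any]) -> list[dict[str, Any]]:
    # Tool calls come back as content blocks carrying a "toolUse" key.
    blocks = response.get("output", {}).get("message", {}).get("content", [])
    return [block["toolUse"] for block in blocks if "toolUse" in block]


spec = mcp_to_bedrock({
    "name": "get_weather",
    "description": "Get current weather for a city",
    "inputSchema": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
})

sample_response = {
    "output": {"message": {"role": "assistant", "content": [
        {"text": "Let me check."},
        {"toolUse": {"toolUseId": "tooluse_1", "name": "get_weather", "input": {"city": "Tokyo"}}},
    ]}},
    "stopReason": "tool_use",
}
uses = tool_uses(sample_response)
```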

Azure OpenAI (GPT-4o on Azure)

from openai import AzureOpenAI

client = AzureOpenAI(
    azure_endpoint="https://your-resource.openai.azure.com/",
    api_version="2025-01-01-preview",
)
tools = toolkit.to_openai()

response = client.chat.completions.create(
    model="gpt-4o",   # your deployment name
    messages=[{"role": "user", "content": "Weather in Tokyo?"}],
    tools=tools,
)
calls = parse_tool_calls(response, provider="openai")

Together AI / Groq / Fireworks / Cerebras (OpenAI-compatible)

from openai import OpenAI

tools = toolkit.to_openai()

providers = {
    "together": ("https://api.together.xyz/v1", "meta-llama/Llama-3.3-70B-Instruct-Turbo"),
    "groq":     ("https://api.groq.com/openai/v1", "llama-3.3-70b-versatile"),
    "fireworks": ("https://api.fireworks.ai/inference/v1", "accounts/fireworks/models/llama-v3p3-70b-instruct"),
    "cerebras": ("https://api.cerebras.ai/v1", "llama-3.3-70b"),
}

for name, (base_url, model) in providers.items():
    client = OpenAI(base_url=base_url, api_key="...")
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": "Weather in Tokyo?"}],
        tools=tools,
    )
    calls = parse_tool_calls(response, provider="openai")
    print(f"{name} ({model}): {calls[0].name}({calls[0].arguments})")

Ollama (Llama, Phi, Qwen, Mistral, Gemma — local)

from openai import OpenAI

client = OpenAI(base_url="http://localhost:11434/v1", api_key="ollama")

# Models with chat template tool support — use native format
native_models = ["llama3.2:3b", "qwen2.5:7b", "mistral:7b"]

for model in native_models:
    tools = toolkit.to_ollama(model=model)
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": "Weather in Tokyo?"}],
        tools=tools,
    )
    calls = parse_tool_calls(response, provider="openai")
    print(f"{model}: {calls[0].name}({calls[0].arguments})")

# Models WITHOUT tool support — use system prompt injection
no_tool_models = ["gemma2:2b", "smollm2:1.7b", "deepseek-r1:8b"]
prompt = toolkit.to_system_prompt()

for model in no_tool_models:
    response = client.chat.completions.create(
        model=model,
        messages=[
            {"role": "system", "content": prompt},
            {"role": "user", "content": "Weather in Tokyo?"},
        ],
    )
    calls = parse_tool_calls(response.choices[0].message.content, provider="freeform")
    if calls:
        print(f"{model}: {calls[0].name}({calls[0].arguments})")
    else:
        print(f"{model}: no tool call extracted (model too small or confused)")

vLLM (Self-Hosted, Any HuggingFace Model)

from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="token")
payload = toolkit.to_vllm_request(guided=True, guided_backend="outlines")

response = client.chat.completions.create(
    model="Qwen/Qwen2.5-7B-Instruct",
    messages=[{"role": "user", "content": "Weather in Tokyo?"}],
    **payload,  # includes tools + guided decoding config
)
calls = parse_tool_calls(response, provider="openai")

Architecture

toolglot/
├── toolglot/
│   ├── __init__.py            # Public API: ToolKit, parse_tool_calls, capabilities
│   ├── cli.py                 # Typer CLI: translate, validate, capabilities, inspect
│   ├── types.py               # Canonical IR: ToolDefinition, ToolParameter, ToolCall
│   ├── capabilities.py        # Model capability matrix (30+ models)
│   ├── importers/
│   │   ├── mcp.py             # MCP tool definitions → IR
│   │   ├── openapi.py         # OpenAPI 3.x specs → IR
│   │   ├── openai.py          # OpenAI function format → IR
│   │   ├── langchain.py       # LangChain BaseTool → IR
│   │   └── json_schema.py     # Raw JSON Schema → IR
│   ├── exporters/
│   │   ├── openai.py          # IR → OpenAI (+ Azure, Together, Groq, etc.)
│   │   ├── anthropic.py       # IR → Anthropic
│   │   ├── gemini.py          # IR → Google Gemini / Vertex AI
│   │   ├── mistral.py         # IR → Mistral
│   │   ├── cohere.py          # IR → Cohere (auto-flatten)
│   │   ├── bedrock.py         # IR → AWS Bedrock Converse API
│   │   ├── ollama.py          # IR → Ollama (chat template aware)
│   │   ├── vllm.py            # IR → vLLM / TGI
│   │   └── system_prompt.py   # IR → system prompt text (universal fallback)
│   ├── parsers/
│   │   ├── openai.py          # OpenAI response → ToolCall
│   │   ├── anthropic.py       # Anthropic response → ToolCall
│   │   ├── gemini.py          # Gemini response → ToolCall
│   │   ├── mistral.py         # Mistral response → ToolCall
│   │   ├── cohere.py          # Cohere response → ToolCall
│   │   └── freeform.py        # Raw text → ToolCall (regex + heuristics)
│   ├── transforms/
│   │   ├── flatten.py         # Flatten nested objects
│   │   ├── deref.py           # Resolve $ref pointers
│   │   ├── simplify.py        # Downgrade schemas for small models
│   │   └── validate.py        # Validate tool calls against schemas
│   └── integrations/
│       ├── langchain.py       # adapt(), to_langchain(), BaseTool wrappers
│       └── langgraph.py       # ToolGlotNode for graph-level integration
├── examples/
│   ├── quickstart.py          # 10-line getting started
│   ├── multi_provider.py      # Same tools → every provider
│   ├── langgraph_react_agent.py  # LangGraph with 5+ models
│   ├── local_models.py        # Ollama / vLLM examples
│   ├── format_comparison.py   # Side-by-side format output
│   ├── schema_downgrade.py    # Complex → simplified for edge models
│   ├── response_parsing.py    # Parse tool calls from all providers
│   └── capability_matrix.py   # Query model capabilities
├── tests/
│   ├── conftest.py            # Shared fixtures
│   └── unit/
│       ├── test_types.py
│       ├── test_exporters.py
│       ├── test_parsers.py
│       └── test_transforms.py
├── pyproject.toml
├── Makefile
├── LICENSE
└── README.md

Research Artifacts

Manuscript sources: paper/
Reproducible paper metrics harness: research/

python research/eval_harness.py

How It Compares

	ToolGlot	LiteLLM	LangChain	Vercel AI SDK
What it is	Library	Proxy/service	Framework	TypeScript SDK
Tool translation	Standalone, composable	Coupled to their router	Coupled to their ecosystem	Coupled to their runtime
Schema optimization	Auto per model	No	No	No
Capability matrix	Built-in (30+ models)	Partial	No	Partial
System prompt fallback	Built-in	No	No	No
Response parsing	Unified across all providers	Via their proxy	Per-provider classes	Per-provider
LangGraph integration	First-class `adapt()`	N/A	Built-in (basic)	N/A
Local model support	Ollama + vLLM + system prompt	Via proxy	Via ChatOllama	No
Language	Python	Python	Python + JS	TypeScript
Install size	Minimal (core has 4 deps)	Heavy	Very heavy	Heavy

Requirements

Python 3.10+
No GPU needed. No heavy ML dependencies.

Core dependencies (4 packages):

pydantic — IR models and validation
typer — CLI
rich — pretty output
pyyaml — config files

Production Hardening

For deployment and operations guidance (validation guardrails, retries/timeouts, redaction, and incident response), see:

docs/production.md

Performance Benchmarks

ToolGlot includes benchmark coverage for import/export/transform paths across small/medium/large schema fixtures.

make benchmark
make benchmark-budget

Benchmark metrics are written to .benchmarks/latest.json and budget thresholds are enforced in CI.

Installation

# Core — translate, validate, inspect
pip install toolglot

# With LangChain/LangGraph integration
pip install "toolglot[langchain]"

# Everything
pip install "toolglot[all]"

# Development
pip install "toolglot[dev]"

CI Quality Gate Parity

CI runs the same baseline commands expected locally:

make lint
make typecheck-ci
make test
make test-cov

The CI coverage gate is enforced in the workflow configuration and should be raised as test coverage grows.

Release and Versioning

ToolGlot follows Semantic Versioning with automated tagged releases to TestPyPI and PyPI.

Roadmap

API Stability Policy

ToolGlot's public API contract and deprecation framework are documented in:

Security

ToolGlot publishes a dedicated vulnerability reporting and response process in SECURITY.md.

Report vulnerabilities privately via GitHub Security Advisories
Scorecard runs in CI via .github/workflows/scorecard.yml
Any failed or regressed Scorecard check should be tracked in a follow-up issue labeled area:security and priority:P1

Contributing

Contributions welcome. Adding a new provider? It's one file in exporters/ and one in parsers/.

Compatibility invariants across exporters/parsers are enforced via contract tests: docs/contract-tests.md
Start here for a first PR path, local setup, and contributor checklist: CONTRIBUTING.md
Error classes and stable CLI exit semantics are documented in: docs/errors.md
Golden fixture drift detection and regeneration workflow: docs/golden-fixtures.md
Defensive validation limits for untrusted schemas: docs/security-limits.md
Plugin authoring and publish guidance: docs/plugin-authoring.md
MCP discovery/import quickstart and troubleshooting: docs/mcp-discovery.md

git clone https://github.com/NP-compete/toolglot.git
cd toolglot
pip install -e ".[dev]"
pytest

Property/fuzz robustness tests:

pytest tests/unit/test_property_based.py -q

Supply-Chain Security

ToolGlot includes baseline dependency governance:

Dependabot updates for both pip and GitHub Actions via .github/dependabot.yml
PR dependency risk checks via .github/workflows/dependency-review.yml
SBOM generation and release attachment via .github/workflows/release-sbom.yml

Citation

@software{toolglot2026,
  author = {Soham Dutta},
  title = {ToolGlot: Define Tools Once, Use Them With Any Model},
  year = {2026},
  url = {https://github.com/NP-compete/toolglot}
}

License

MIT
