Skip to content

[Sandbox] Inference Gateway #486

@edenreich

Description

@edenreich

Project summary

An open-source, high-performance gateway unifying multiple LLM providers, from local solutions like Ollama to major cloud providers such as OpenAI, Groq, Cohere, Anthropic, Cloudflare and DeepSeek.

Project description

Inference Gateway is an open-source, high-performance gateway that unifies access to multiple Large Language Model (LLM) providers behind a single, consistent API. It supports local solutions such as Ollama, as well as major cloud providers including OpenAI, Groq, Cohere, Anthropic, Cloudflare, and DeepSeek.

The project provides a normalized, OpenAI-compatible interface so that applications and agents can switch between providers without code changes, and it offers first-class support for emerging agent protocols like MCP (Model Context Protocol) and A2A (Agent-to-Agent). It is designed to run natively in cloud native environments, with Kubernetes deployment, a dedicated Operator, Helm charts, OpenTelemetry tracing, and Prometheus metrics out of the box.

In addition to the gateway itself, the project ships an ecosystem of SDKs (Go, Python, TypeScript, Rust), a UI, and a Kubernetes Operator, making it straightforward for developers and platform teams to integrate LLMs into their cloud native workloads. The goal is to provide a vendor-neutral, observable, and secure entry point for AI inference traffic in cloud native architectures.

This application supersedes the previous application in #382, which was closed at the maintainer's request because the application form has since changed.

Org repo URL (provide if all repos under the org are in scope of the application)

https://github.com/inference-gateway/

Project repo URL in scope of application

https://github.com/inference-gateway/inference-gateway

Additional repos in scope of the application

Website URL

https://docs.inference-gateway.com/

Roadmap

Roadmap 2026

Roadmap context

Near-term focus is on building a great CLI for convenient interaction with multiple LLMs across providers, deepening MCP and A2A protocol support, and hardening the Kubernetes Operator and observability story. The full 2026 roadmap is tracked in the project repository.

Contributing guide

https://github.com/inference-gateway/inference-gateway/blob/main/CONTRIBUTING.md

Code of Conduct (CoC)

N/A - will be added before TOC review

Adopters

N/A

Maintainers file

N/A - will be added before TOC review

Security policy file

N/A - will be added before TOC review

Standard or specification?

The project itself is not a standard, but it implements and exposes an OpenAPI 3.x specification (see openapi.yaml in the primary repo) for its public HTTP API, and integrates with emerging agent protocols including MCP (Model Context Protocol) and A2A (Agent-to-Agent).

Business product or service to project separation

This project is unrelated to any product or service.

Why CNCF?

Inference Gateway is a natural fit for the CNCF landscape because it is a cloud native, Kubernetes-first project that integrates closely with existing CNCF technologies (Kubernetes, Helm, OpenTelemetry, Prometheus). Joining the CNCF would provide vendor-neutral governance, a broader contributor base, and visibility within the cloud native ecosystem, while ensuring the project remains open, community-driven, and aligned with cloud native best practices as AI inference becomes a core workload on Kubernetes.

Benefit to the landscape

The project enhances the cloud native landscape by providing a vendor-neutral, observable, and Kubernetes-native entry point for LLM inference traffic. It offers strong documentation and first-class support for emerging protocols such as MCP and A2A, which are still very new and not yet well-documented in the wider ecosystem. By unifying multiple LLM providers (local and cloud) behind a consistent OpenAI-compatible API, it lowers the barrier for cloud native applications and platform teams to adopt AI workloads in a portable, observable way.

Cloud native 'fit'

Inference Gateway fits in the AI / AI Agents area of the cloud native landscape. It is designed from day one to be cloud native: it ships as a container image, has a Helm chart and a dedicated Kubernetes Operator, exposes Prometheus metrics, emits OpenTelemetry traces, and is configured via standard cloud native primitives (env vars, ConfigMaps, Secrets). It embodies cloud native principles of containerization, declarative configuration, observability, and Kubernetes-native operation.

Cloud native 'integration'

Inference Gateway integrates with and complements several CNCF projects:

  • Kubernetes: deployed natively as a workload, with a dedicated Operator and Helm chart.
  • Helm: official Helm chart provided for easy installation.
  • OpenTelemetry: emits traces and metrics for full observability of inference traffic.
  • Prometheus: exposes Prometheus-compatible metrics out of the box.
  • containerd / OCI: distributed as standard OCI container images.

Cloud native overlap

There may be some overlap with the Gateway API Inference Extension work in the Kubernetes/CNCF ecosystem (see https://www.cncf.io/blog/2025/04/21/deep-dive-into-the-gateway-api-inference-extension/). The scope is different: that effort focuses on extending the Gateway API for inference routing within Kubernetes, while Inference Gateway is a standalone, provider-agnostic gateway that normalizes access to many LLM providers (local and cloud) and adds support for agent protocols like MCP and A2A.

Similar projects

  • Gateway API Inference Extension (Kubernetes SIG-Network) - related but narrower scope, focused on Gateway API routing for inference inside Kubernetes.
  • Other LLM gateway/proxy projects exist in the wider ecosystem, but they typically focus on a subset of providers and do not provide first-class cloud native integration (Operator, Helm, OpenTelemetry, Prometheus) together with MCP and A2A protocol support.

Landscape

No - the project is not yet listed on the CNCF Cloud Native Landscape.

Insights

No - the project is not yet listed on LFX Insights.

Trademark and accounts

  • If the project is accepted, I agree to donate all project trademarks and accounts to the CNCF

IP policy

  • If the project is accepted, I agree the project will follow the CNCF IP Policy

Will the project require a license exception?

The project is licensed under MIT. If MIT is not on the CNCF Allowlist for core project code, a license exception may be required, or the project can be relicensed to Apache 2.0 as part of the onboarding process if the TOC prefers.

Project "Domain Technical Review"

No response - the project has not yet engaged with a domain-specific TAG. Happy to do so as part of the review process.

Application contact email(s)

eden.reich@gmail.com

Contributing or sponsoring entity signatory information

Name Country Email address
Eden Reich Germany eden.reich@gmail.com

CNCF contacts

No response

Additional information

This application supersedes the previous Sandbox application tracked in #382, which was closed by a maintainer with a request to open a new issue because the application form has since changed. The information here has been updated and adapted to the current form.

Metadata

Metadata

Assignees

Type

No type
No fields configured for issues without a type.

Projects

Status

🏗 Upcoming

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions