Design Spec: k8s-litellm-proxy-health
Parent: #87
Target: rw-cli-codecollection
Spec
codebundle_name: "k8s-litellm-proxy-health"
target_collection: "rw-cli-codecollection"
display_name: "Kubernetes LiteLLM Proxy API Health"
author: "rw-codebundle-agent"
purpose: |
  Exposes LiteLLM proxy health beyond pod logs by calling the proxy HTTP API:
  liveness/readiness, configured models/routes, optional deep model probes, and
  integration health, typically via the in-cluster Service URL or a kubectl
  port-forward to the proxy Service.
tasks:
  - name: "Check LiteLLM Liveness Endpoint"
    description: "Calls GET /health/liveliness (or equivalent) to confirm the proxy process responds without invoking upstream LLMs."
    script_name: "check-litellm-liveness.sh"
    expected_issue_severity: [3, 4]
    access_level: "read-only"
    data_type: "metrics"
  - name: "Check LiteLLM Readiness and Dependencies"
    description: "Calls GET /health/readiness and surfaces DB/cache connectivity, version, and callback registration; detects misconfigured persistence or startup failures."
    script_name: "check-litellm-readiness.sh"
    expected_issue_severity: [2, 3]
    access_level: "read-only"
    data_type: "metrics"
  - name: "List Configured Models and Routes"
    description: "Uses /v1/models and/or /model/info to verify expected providers/models are registered; flags empty or unexpected routing relative to config intent."
    script_name: "list-litellm-models.sh"
    expected_issue_severity: [2, 3]
    access_level: "read-only"
    data_type: "logs-config"
  - name: "Optional Deep Model Health Probe"
    description: "When enabled, calls GET /health with the master key to run upstream health checks; off by default and run with a short timeout because it invokes real LLM API calls and can be costly."
    script_name: "check-litellm-deep-health.sh"
    expected_issue_severity: [3, 4]
    access_level: "read-only"
    data_type: "metrics"
  - name: "Check External Integration Service Health"
    description: "Calls GET /health/services for named integrations (for example, observability exporters) when configured; reports unhealthy sidecar integrations."
    script_name: "check-litellm-integration-health.sh"
    expected_issue_severity: [2, 3]
    access_level: "read-only"
    data_type: "metrics"
  - name: "Verify Kubernetes Service Reachability Context"
    description: "Optional kubectl-based checks (Service/Endpoints presence, port) to correlate API failures with cluster networking or missing endpoints; not a substitute for the API checks."
    script_name: "verify-litellm-k8s-service.sh"
    expected_issue_severity: [2, 3]
    access_level: "read-only"
    data_type: "metrics"
scope:
  level: "Resource"
  qualifiers:
    - CONTEXT
    - NAMESPACE
    - LITELLM_SERVICE_NAME
iteration_pattern: |
  One SLX per LiteLLM proxy Service (or Deployment) in a namespace, discovered by
  label/name qualifier; PROXY_BASE_URL may point to ClusterIP/DNS or to localhost
  when using the kubectl port-forward documented in the README.
resource_types:
  - "kubernetes_service"
generation_strategy: |
  Generate one SLX per matched Service (or annotated Deployment) in the target namespace
  representing the LiteLLM proxy. Qualifiers: cluster context, namespace, service name.
  Resource match: kubernetes_service, with an optional label selector for the LiteLLM proxy.
env_vars:
  - name: CONTEXT
    description: "Kubernetes context to use for optional kubectl checks"
    required: true
  - name: NAMESPACE
    description: "Namespace where LiteLLM proxy runs"
    required: true
  - name: PROXY_BASE_URL
    description: "Base URL for LiteLLM HTTP API (for example http://127.0.0.1:4000 after kubectl port-forward svc/...)"
    required: true
  - name: LITELLM_SERVICE_NAME
    description: "Kubernetes Service name for the LiteLLM proxy (for discovery and kubectl helpers)"
    required: true
  - name: LITELLM_HTTP_PORT
    description: "Service port number for the proxy HTTP listener"
    required: false
    default: "4000"
  - name: LITELLM_RUN_DEEP_HEALTH
    description: "Set true to enable expensive GET /health upstream probes"
    required: false
    default: "false"
  - name: LITELLM_INTEGRATION_SERVICES
    description: "Comma-separated integration names for /health/services checks, or empty to skip"
    required: false
    default: ""
secrets:
  - name: litellm_master_key
    description: "LiteLLM master or admin API key (Bearer) for protected endpoints such as /health, /health/services, and some model info routes"
    format: "Plain text or secret reference compatible with RunWhen secret import"
  - name: kubeconfig
    description: "Standard kubeconfig for kubectl-backed optional tasks"
    format: "kubeconfig YAML"
platform:
  name: "kubernetes"
  cli_tools:
    - "kubectl"
    - "curl"
    - "jq"
  auth_methods:
    - "Bearer token (LiteLLM master key)"
    - "kubeconfig for cluster context"
  api_docs: "https://docs.litellm.ai/docs/proxy/health"
related_bundles:
  - name: "k8s-litellm-spend-governance"
    relationship: "complements"
    notes: "Proxy-health covers availability and routing; spend-governance covers budgets, spend logs, and failure-oriented usage analytics."
  - name: "k8s-redis-healthcheck"
    relationship: "complements"
    notes: "When Redis backs the proxy, the Redis bundle validates datastore health; this bundle validates the proxy API layer."
  - name: "k8s-jaeger-http-query"
    relationship: "complements"
    notes: "Similar pattern of HTTP API diagnostics against a Kubernetes-scoped workload."
test_scenarios:
  - name: "healthy_proxy"
    description: "Readiness connected, liveness OK, models listed"
    expected_issues: 0
  - name: "db_disconnected"
    description: "Readiness shows a DB error"
    expected_issues: 1
    expected_severities: [3]
notes: |
  Document kubectl port-forward for internal-only Services (for example,
  kubectl port-forward -n NAMESPACE svc/LITELLM_SERVICE_NAME 4000:4000) and warn
  that GET /health performs real upstream LLM calls; gate it behind LITELLM_RUN_DEEP_HEALTH.
  LiteLLM versions differ slightly, so prefer tolerant JSON parsing and explicit HTTP
  status handling. Align timeouts with RunWhen task limits (under one minute for the whole suite).
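The liveness/readiness tasks above might be sketched roughly as follows. This is a minimal illustration, not the real check-litellm-liveness.sh: the /health/liveliness and /health/readiness paths follow the LiteLLM proxy docs, PROXY_BASE_URL comes from this spec's env_vars, and json_field is a hypothetical fallback helper shown in place of the jq parsing the real scripts would use.

```shell
#!/usr/bin/env bash
# Sketch of the liveness/readiness checks described in the tasks above.
# Assumes PROXY_BASE_URL is exported (see env_vars); endpoint paths follow
# the LiteLLM proxy docs but may vary slightly by version.
set -u

# Explicit HTTP status handling: only a 2xx response counts as healthy.
is_healthy_code() {
  local code="$1"
  [ "$code" -ge 200 ] && [ "$code" -lt 300 ]
}

# Tolerant JSON field extraction: a missing field degrades to "unknown"
# instead of failing the task (LiteLLM versions differ in payload shape).
# json_field is a hypothetical helper; the real scripts would use jq.
json_field() {
  local json="$1" field="$2" value
  value=$(printf '%s' "$json" \
    | sed -n "s/.*\"$field\"[[:space:]]*:[[:space:]]*\"\([^\"]*\)\".*/\1/p")
  printf '%s' "${value:-unknown}"
}

# Only probe the live proxy when a base URL is provided.
if [ -n "${PROXY_BASE_URL:-}" ]; then
  # Liveness: cheap, never invokes upstream LLMs.
  code=$(curl -s -o /dev/null -w '%{http_code}' --max-time 10 \
    "${PROXY_BASE_URL}/health/liveliness")
  if is_healthy_code "$code"; then
    echo "liveness OK (HTTP $code)"
  else
    echo "ISSUE: liveness endpoint returned HTTP $code" >&2
  fi

  # Readiness: surface DB connectivity and version with tolerant parsing.
  readiness=$(curl -s --max-time 10 "${PROXY_BASE_URL}/health/readiness" || echo '{}')
  echo "readiness: db=$(json_field "$readiness" db) version=$(json_field "$readiness" litellm_version)"
fi
```

Real tasks would raise RunWhen issues at the severities listed above instead of only echoing, and the network probe is skipped entirely when PROXY_BASE_URL is unset.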