[design-spec] k8s-litellm-spend-governance #89

@rw-codebundle-agent

Design Spec: k8s-litellm-spend-governance

Parent: #87
Target: rw-cli-codecollection

Spec

codebundle_name: "k8s-litellm-spend-governance"
target_collection: "rw-cli-codecollection"
display_name: "Kubernetes LiteLLM Spend and Governance"
author: "rw-codebundle-agent"

purpose: |
  Surfaces LiteLLM operational and cost governance signals from the proxy Admin APIs:
  spend logs, global spend reports, per-key/user/team budgets and rate limits, and
  aggregates that highlight failed or budget-blocked traffic, without relying on
  container log scraping alone.

tasks:
  - name: "Review Recent Spend Logs for Failures"
    description: "Queries /spend/logs (with date/window parameters) and flags rows indicating errors, budget_exceeded, rate_limited, or provider failures."
    script_name: "review-litellm-spend-logs.sh"
    expected_issue_severity: [2, 4]
    access_level: "read-only"
    data_type: "metrics"

  - name: "Check Global Spend Report Against Threshold"
    description: "Calls /global/spend/report over a configurable window and compares total spend to LITELLM_SPEND_THRESHOLD_USD (or flags a relative increase versus the prior window)."
    script_name: "check-litellm-global-spend.sh"
    expected_issue_severity: [2, 3]
    access_level: "read-only"
    data_type: "metrics"

  - name: "Inspect Virtual Key Spend and Remaining Budget"
    description: "Uses key metadata endpoints (for example /key/info or list keys) to report keys near max_budget, expired keys, or anomalous spend velocity."
    script_name: "inspect-litellm-key-budgets.sh"
    expected_issue_severity: [2, 3]
    access_level: "read-only"
    data_type: "metrics"

  - name: "Review User Budget and Rate Limit Status"
    description: "Calls /user/info (and related endpoints) for the configured user_id(s) to surface soft_budget_cooldown, tpm/rpm limits, and spend versus budget."
    script_name: "review-litellm-user-budgets.sh"
    expected_issue_severity: [2, 3]
    access_level: "read-only"
    data_type: "metrics"

  - name: "Summarize Team Budgets and Limits"
    description: "When team IDs or aliases are configured, queries team endpoints to verify rpm/tpm/max_budget settings and highlight teams at risk of blocking traffic."
    script_name: "summarize-litellm-team-budgets.sh"
    expected_issue_severity: [2, 3]
    access_level: "read-only"
    data_type: "metrics"

  - name: "Aggregate Error and Blocked Request Signals"
    description: "Derives a short summary of failure modes from spend logs and proxy error fields (for example budget_exceeded count, provider 429/5xx) for quick triage."
    script_name: "aggregate-litellm-failure-signals.sh"
    expected_issue_severity: [3, 4]
    access_level: "read-only"
    data_type: "metrics"

scope:
  level: "Resource"
  qualifiers:
    - CONTEXT
    - NAMESPACE
    - LITELLM_SERVICE_NAME
  iteration_pattern: |
    Same LiteLLM proxy Service resource as the health bundle; this bundle focuses on
    authenticated Admin/spend routes against PROXY_BASE_URL.

resource_types:
  - "kubernetes_service"
generation_strategy: |
  Pair with k8s-litellm-proxy-health SLX generation: one spend-governance SLX per
  proxy instance (context + namespace + service). Optional additional qualifiers for
  TEAM_IDS or USER_IDS when operators want scoped governance reports.

env_vars:
  - name: CONTEXT
    description: "Kubernetes context for optional kubectl correlation"
    required: true

  - name: NAMESPACE
    description: "Namespace of the LiteLLM deployment"
    required: true

  - name: PROXY_BASE_URL
    description: "LiteLLM proxy base URL for API calls"
    required: true

  - name: LITELLM_SERVICE_NAME
    description: "Service name for labeling and docs"
    required: true

  - name: RW_LOOKBACK_WINDOW
    description: "Time window for spend logs and reports (for example 24h, 7d); the implementer maps it to the API date parameters"
    required: false
    default: "24h"

  - name: LITELLM_SPEND_THRESHOLD_USD
    description: "Alert when global or scoped spend in the window exceeds this USD amount (0 disables)"
    required: false
    default: "0"

  - name: LITELLM_USER_IDS
    description: "Comma-separated internal user_ids to check with /user/info; empty skips user task"
    required: false
    default: ""

  - name: LITELLM_TEAM_IDS
    description: "Comma-separated team identifiers for team budget task; empty skips"
    required: false
    default: ""

secrets:
  - name: litellm_master_key
    description: "Master key, or a key with permissions for the /spend/logs, /global/spend/report, /key/info, /user/info, and team routes used by the tasks"
    format: "Bearer token"

  - name: kubeconfig
    description: "kubeconfig for optional kubectl context"
    format: "kubeconfig YAML"

platform:
  name: "kubernetes"
  cli_tools:
    - "kubectl"
    - "curl"
    - "jq"
  auth_methods:
    - "Bearer master key with spend route permissions"
  api_docs: "https://docs.litellm.ai/docs/proxy/cost_tracking"

related_bundles:
  - name: "k8s-litellm-proxy-health"
    relationship: "complements"
    notes: "Health bundle validates proxy availability; this bundle validates financial and policy pressure (budgets, failures)."

  - name: "k8s-prometheus-healthcheck"
    relationship: "complements"
    notes: "If Prometheus scrapes LiteLLM metrics, that bundle complements API-derived spend and error summaries."

test_scenarios:
  - name: "nominal_spend"
    description: "Spend within threshold, no budget_exceeded in logs"
    expected_issues: 0

  - name: "budget_exceeded_spike"
    description: "Spend logs show repeated budget_exceeded for a key"
    expected_issues: 2
    expected_severities: [3, 3]

notes: |
  Some spend and team report routes are Enterprise or require specific key
  permissions; implement graceful degradation with clear issues when HTTP 403
  indicates missing scope. Database-backed features must be enabled on the proxy
  for full spend logs. Prefer summarization over dumping full log payloads to stay
  within report size limits. Document port-forward usage the same way as k8s-litellm-proxy-health.
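
The spec leaves three small decisions to the implementer: mapping RW_LOOKBACK_WINDOW to a concrete time span, treating a LITELLM_SPEND_THRESHOLD_USD of 0 as "disabled", and degrading gracefully on HTTP 403. The following is a minimal POSIX shell sketch of those decisions, not the bundle's actual implementation. The function names and the returned status strings are hypothetical, and the accepted window units (m, h, d) are an assumption beyond the 24h and 7d examples in the spec.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; none of these names come from the spec.

# Convert a lookback window such as "24h" or "7d" into seconds.
# Assumption: only the units m (minutes), h (hours), and d (days) are accepted.
lookback_to_seconds() {
  _window="$1"
  _number="${_window%?}"            # everything except the last character
  _unit="${_window#"$_number"}"     # the last character
  case "$_number" in
    ''|*[!0-9]*) echo "unsupported window: $_window" >&2; return 1 ;;
  esac
  case "$_unit" in
    m) _mult=60 ;;
    h) _mult=3600 ;;
    d) _mult=86400 ;;
    *) echo "unsupported window: $_window" >&2; return 1 ;;
  esac
  # awk reads the number as decimal, so a value like "08" is not parsed as octal.
  awk -v n="$_number" -v m="$_mult" 'BEGIN { printf "%d\n", n * m }'
}

# Succeed (exit 0) only when spend exceeds a non-zero threshold.
# A threshold of 0 disables the check, as described for LITELLM_SPEND_THRESHOLD_USD.
spend_exceeds_threshold() {
  awk -v spend="$1" -v limit="$2" \
    'BEGIN { exit ((limit + 0 > 0 && spend + 0 > limit + 0) ? 0 : 1) }'
}

# Map an HTTP status code to a coarse outcome so that a 403 can be reported as
# a clear "missing scope" issue instead of failing the whole task.
classify_http_status() {
  case "$1" in
    2??) echo "ok" ;;
    403) echo "degraded_missing_scope" ;;
    *)   echo "error" ;;
  esac
}
```

A task script might capture the status code with curl's `-w '%{http_code}'` option, pass it to `classify_http_status`, and raise an issue only for the non-`ok` outcomes. The exact query parameters each LiteLLM route accepts should be confirmed against the proxy documentation before the window is mapped to dates.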

Metadata

Labels: completed (Agent work completed), design-spec (Architect has produced a design spec), new-codebundle (Scoped issue for SRE to implement a new CodeBundle)
