[intake] Monitor the health of LiteLLM on Kubernetes beyond contai... #87

@rw-codebundle-agent

Description

Request

Title: Monitor the health of LiteLLM on Kubernetes beyond contai...

Description:

Monitor the health of LiteLLM on Kubernetes beyond container logs: API-driven insights, request failures, budgets, and so on.

Platform: Kubernetes, LiteLLM

Additional context: use kubectl port-forward to reach the internal LiteLLM API.
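
A minimal sketch of what the port-forward approach could look like, to help scope the request. It is not a proposed implementation. The namespace, Service name, ports, and the `LITELLM_API_KEY` variable are hypothetical placeholders. The `/health/readiness` and `/health` paths and the `healthy_endpoints` / `unhealthy_endpoints` response keys are assumptions about the LiteLLM proxy API and should be verified against the deployed version.

```python
import json
import urllib.request

# Hypothetical defaults: the request does not name the namespace, Service, or port.
NAMESPACE = "litellm"
SERVICE = "litellm"
LOCAL_PORT = 4000
REMOTE_PORT = 4000


def port_forward_cmd(namespace=NAMESPACE, service=SERVICE,
                     local_port=LOCAL_PORT, remote_port=REMOTE_PORT):
    """Build the kubectl argv that exposes the Service on localhost."""
    return ["kubectl", "port-forward", "-n", namespace,
            f"svc/{service}", f"{local_port}:{remote_port}"]


def fetch_json(path, api_key=None, base=f"http://127.0.0.1:{LOCAL_PORT}"):
    """GET a JSON document from the forwarded proxy, optionally with a bearer key."""
    req = urllib.request.Request(base + path)
    if api_key:
        req.add_header("Authorization", f"Bearer {api_key}")
    with urllib.request.urlopen(req, timeout=10) as resp:
        return json.load(resp)


def summarize_health(payload):
    """Reduce a /health-style payload to counts and a 0..1 score.

    Assumes the payload carries `healthy_endpoints` and `unhealthy_endpoints`
    lists; missing keys are treated as empty.
    """
    healthy = len(payload.get("healthy_endpoints", []))
    unhealthy = len(payload.get("unhealthy_endpoints", []))
    total = healthy + unhealthy
    score = healthy / total if total else 0.0
    return {"healthy": healthy, "unhealthy": unhealthy, "score": score}


if __name__ == "__main__":
    import os
    import subprocess
    import time

    proc = subprocess.Popen(port_forward_cmd())
    try:
        time.sleep(2)  # crude wait for the tunnel; a real check would poll the port
        print(fetch_json("/health/readiness"))
        # LITELLM_API_KEY is a hypothetical variable name for the proxy key.
        key = os.environ.get("LITELLM_API_KEY")
        print(summarize_health(fetch_json("/health", key)))
    finally:
        proc.terminate()
```

The score follows the 1 (healthy) / 0 (unhealthy) convention used by the existing health-check CodeBundles listed below. Request-failure and budget checks would need further endpoints that this sketch does not cover.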


Existing Coverage (from registry search)

The following CodeBundles may overlap with this request. Designer: consider reusing, extending, or differentiating.

1. Azure ACR Health Check (95% match)

  • Collection: rw-cli-codecollection
  • Platform: Azure
  • Description: Comprehensive health checks for Azure Container Registry (ACR), including network configuration, resource health, authentication, storage utilization, pull/push metrics, and security analysis.
  • Tasks: Check Network Configuration for ACR ${ACR_NAME} In Resource Group ${AZ_RESOURCE_GROUP}, Check DNS & TLS Reachability for Registry ${ACR_NAME}, Check ACR Login & Authentication for Registry ${ACR_NAME}, Check ACR SKU and Usage Metrics for Registry ${ACR_NAME}, Check ACR Storage Utilization for Registry ${ACR_NAME}, Analyze ACR Pull/Push Success Ratio for Registry ${ACR_NAME}, Check ACR Repository Event Failures for Registry ${ACR_NAME}, Check ACR Security Configuration and RBAC for Registry ${ACR_NAME}
  • Tags: ACR, STORAGE, REGISTRY, CONTAINER, NETWORK, HEALTH, AZURE, SECURITY
  • Link: /collections/rw-cli-codecollection/codebundles/azure-acr-health

2. Kubernetes API Server Health (90% match)

  • Collection: rw-public-codecollection
  • Platform: None
  • Description: Check the health of a Kubernetes API server using kubectl.
  • Tags: KUBERNETES, AKS, EKS, OPENSHIFT, GKE
  • Link: /collections/rw-public-codecollection/codebundles/k8s-kubectl-apiserverhealth

3. Kubernetes Redis Healthcheck (87% match)

  • Collection: rw-cli-codecollection
  • Platform: Kubernetes
  • Description: This taskset collects information on your redis workload in your Kubernetes cluster and raises issues if any health checks fail.
  • Tasks: Ping ${DEPLOYMENT_NAME} Redis Workload, Verify ${DEPLOYMENT_NAME} Redis Read Write Operation in Kubernetes
  • Tags: KUBERNETES, REDIS, AKS, EKS, OPENSHIFT, GKE
  • Link: /collections/rw-cli-codecollection/codebundles/k8s-redis-healthcheck

4. AWS CloudWatch Logs health (83% match)

  • Collection: aws-c7n-codecollection
  • Platform: AWS
  • Description: Check AWS Monitoring Configuration Health
  • Tasks: List CloudWatch Log Groups Without Retention Period in AWS Region ${AWS_REGION} in AWS Account ${AWS_ACCOUNT_NAME}, Check CloudTrail Configuration in AWS Region ${AWS_REGION} in AWS Account ${AWS_ACCOUNT_NAME}, Check for CloudTrail integration with CloudWatch Logs in AWS Region ${AWS_REGION} in AWS Account ${AWS_ACCOUNT_NAME}
  • Tags: CLOUDWATCH, AWS, TAG, CLOUDCUSTODIAN, CLOUDTRAIL
  • Link: /collections/aws-c7n-codecollection/codebundles/aws-c7n-monitoring-health

5. Kubernetes cert-manager Healthcheck (78% match)

  • Collection: rw-cli-codecollection
  • Platform: Kubernetes
  • Description: Checks the overall health of certificates in a namespace that are managed by cert-manager.
  • Tasks: Get Namespace Certificate Summary for Namespace ${NAMESPACE}, Find Unhealthy Certificates in Namespace ${NAMESPACE}, Find Failed Certificate Requests and Identify Issues for Namespace ${NAMESPACE}
  • Tags: KUBERNETES, CERT-MANAGER, AKS, EKS, OPENSHIFT, GKE
  • Link: /collections/rw-cli-codecollection/codebundles/k8s-certmanager-healthcheck

6. Kubernetes Cluster Node Health (75% match)

  • Collection: rw-cli-codecollection
  • Platform: Kubernetes
  • Description: Evaluate cluster node health using kubectl
  • Tasks: Check for Node Restarts in Cluster ${CONTEXT} within Interval ${RW_LOOKBACK_WINDOW}
  • Tags: KUBERNETES, AKS, EKS, OPENSHIFT, GKE
  • Link: /collections/rw-cli-codecollection/codebundles/k8s-cluster-node-health

7. Kubernetes Daemonset Health Check (71% match)

  • Collection: rw-public-codecollection
  • Platform: None
  • Description: Checks that the current state of a daemonset is healthy and returns a score of either 1 (healthy) or 0 (unhealthy).
  • Tags: KUBERNETES, AKS, EKS, K8S, OPENSHIFT, GKE
  • Link: /collections/rw-public-codecollection/codebundles/k8s-daemonset-healthcheck

8. Kubernetes Patroni Health Check (67% match)

  • Collection: rw-public-codecollection
  • Platform: None
  • Description: Uses kubectl (or equivalent) to query the state of a patroni cluster and determine if it's healthy.
  • Tags: POSTGRESQL, KUBERNETES, AKS, EKS, PATRONI, OPENSHIFT, GKE
  • Link: /collections/rw-public-codecollection/codebundles/k8s-patroni-healthcheck

Open Requests (may overlap)

Consider commenting on an existing issue instead of duplicating work.


Created via CodeCollection Registry intake at 2026-04-16 00:52 UTC.
