[design-spec] vercel-project-http-error-health #80

@rw-codebundle-agent

Design Spec: vercel-project-http-error-health

Parent: #79
Target: rw-cli-codecollection

Spec

# CodeBundle Design Spec: vercel-project-http-error-health
# Parent issue: runwhen-contrib/codecollection-registry#79

codebundle_name: "vercel-project-http-error-health"
target_collection: "rw-cli-codecollection"
display_name: "Vercel Project HTTP Error Health"
author: "rw-codebundle-agent"

purpose: |
  Monitors frontend and edge/serverless request health on Vercel by aggregating
  runtime request logs for a project deployment: 4xx (including 400 bad requests),
  5xx server errors, error rates versus configurable thresholds, and the top
  request paths contributing to those failures. Helps operators spot regressions,
  misconfigured routes, and upstream failures quickly.

tasks:
  - name: "Validate Vercel API Access and Resolve Project for `${VERCEL_PROJECT}`"
    description: |
      Confirms the bearer token can access the team scope, resolves project id/slug,
      and fails fast with a clear issue if credentials or project identifiers are wrong.
    script_name: "vercel-validate-project.sh"
    expected_issue_severity: [3, 4]
    access_level: "read-only"
    data_type: "config"

  - name: "Resolve Production Deployment for Log Analysis for Project `${VERCEL_PROJECT}`"
    description: |
      Selects the target deployment (default: latest production deployment) used as the
      log source for the lookback window; documents deployment id and URL in the report.
    script_name: "vercel-resolve-deployment.sh"
    expected_issue_severity: [2, 3]
    access_level: "read-only"
    data_type: "config"

  - name: "Summarize 5xx Server Error Rate for Project `${VERCEL_PROJECT}`"
    description: |
      Streams or pages runtime logs for the deployment, counts responses with status
      500-599, and computes the 5xx rate as a percentage of total sampled requests in
      the lookback window. Raises an issue when the rate exceeds
      `${ERROR_RATE_THRESHOLD_PCT}`; a high-severity issue additionally requires at
      least `${MIN_ERROR_EVENTS}` 5xx events (noise guard).
    script_name: "vercel-summarize-5xx-rate.sh"
    expected_issue_severity: [2, 4]
    access_level: "read-only"
    data_type: "metrics"

  - name: "Summarize 4xx Client Error Rate (incl. 400) for Project `${VERCEL_PROJECT}`"
    description: |
      Aggregates 4xx responses with emphasis on 400 (Bad Request) and other application
      client errors. When `${EXCLUDE_404_FROM_4XX}` is true (the default), 404 responses
      are counted separately so that this task reports only non-404 4xx errors.
    script_name: "vercel-summarize-4xx-rate.sh"
    expected_issue_severity: [2, 3]
    access_level: "read-only"
    data_type: "metrics"

  - name: "List Top Error Paths by 5xx Count for Project `${VERCEL_PROJECT}`"
    description: |
      Ranks request paths by volume of 5xx responses in the lookback window to show
      which routes or assets fail most often.
    script_name: "vercel-top-paths-5xx.sh"
    expected_issue_severity: [2, 3]
    access_level: "read-only"
    data_type: "metrics"

  - name: "List Top Paths by 4xx (non-404) Count for Project `${VERCEL_PROJECT}`"
    description: |
      Surfaces the paths with the highest 4xx counts, excluding 404 responses when
      `${EXCLUDE_404_FROM_4XX}` is true, to highlight validation, auth, or routing
      problems distinct from missing pages.
    script_name: "vercel-top-paths-4xx.sh"
    expected_issue_severity: [2, 3]
    access_level: "read-only"
    data_type: "metrics"

scope:
  level: "Project"
  qualifiers:
    - VERCEL_TEAM_ID
    - VERCEL_PROJECT
  iteration_pattern: |
    One SLX per Vercel project the user selects (by id or slug under a team). Optional
    future discovery: list projects via API for a team when RESOURCES=All (stretch goal;
    initial implementation may require explicit project id/slug).

resource_types:
  - "vercel_project"
generation_strategy: |
  Generate one TaskSet/SLX per qualified Vercel project. Resource match: project slug
  or id under team VERCEL_TEAM_ID. Qualifiers: team id, project id or slug.

env_vars:
  - name: VERCEL_TEAM_ID
    description: "Vercel team (teamId) scope for API calls"
    required: true

  - name: VERCEL_PROJECT
    description: "Project id or slug to analyze"
    required: true

  - name: LOOKBACK_MINUTES
    description: "Window for log sampling relative to now"
    required: false
    default: "60"

  - name: ERROR_RATE_THRESHOLD_PCT
    description: "Issue when 5xx rate exceeds this percentage of sampled requests"
    required: false
    default: "1"

  - name: MIN_ERROR_EVENTS
    description: "Minimum 5xx events before raising a high-severity issue (noise guard)"
    required: false
    default: "5"

  - name: EXCLUDE_404_FROM_4XX
    description: "If true, 404 responses are excluded from this bundle's 4xx summaries and top-path rankings"
    required: false
    default: "true"

secrets:
  - name: vercel_api_token
    description: "Vercel bearer token with read access to projects and deployment logs"
    format: "Plain-text Vercel access token (the value typically exported as VERCEL_TOKEN), sent as an Authorization Bearer header"

platform:
  name: "vercel"
  cli_tools:
    - "curl"
    - "jq"
  auth_methods:
    - "Bearer token (Vercel personal or team token with log read scope)"
  api_docs: "https://vercel.com/docs/rest-api"

related_bundles:
  - name: "curl-http-ok"
    relationship: "complements"
    notes: "Synthetic URL probes; this bundle uses Vercel runtime logs for real traffic."
  - name: "vercel-project-path-traffic-health"
    relationship: "complements"
    notes: "Sibling bundle covers popular paths and 404-focused analysis from the same log source."

test_scenarios:
  - name: "healthy_project"
    description: "Low 4xx/5xx rates below thresholds in window"
    expected_issues: 0

  - name: "spike_5xx"
    description: "Deployment returning many 5xx for a single path"
    expected_issues: 2
    expected_severities: [3, 3]

notes: |
  Runtime logs are exposed per deployment (e.g. a GET runtime-logs stream whose entries
  include responseStatusCode and requestPath). The implementation must handle
  NDJSON/stream parsing, pagination or sampling limits, and API rate limits. If log
  volume is large, use statistical sampling and document the sample size and the
  confidence of the resulting rates. For projects whose traffic is mostly edge-served
  static content, verify that the selected deployment is the one receiving production
  traffic. Web Analytics is browser-oriented and is not a substitute for
  server/runtime logs when analyzing status codes.
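
The 5xx rate check described in the spec can be sketched as follows. This is a minimal illustration in Python rather than the bundle's eventual shell scripts. It assumes each log line is a JSON object carrying the `responseStatusCode` field named in the notes. The function name `summarize_5xx` and the returned keys are hypothetical. It reports the two threshold conditions separately, because how they combine into issue severities is left to the implementation.

```python
import json
from typing import Iterable


def summarize_5xx(lines: Iterable[str],
                  threshold_pct: float = 1.0,
                  min_error_events: int = 5) -> dict:
    """Count 5xx responses in NDJSON log lines and compare against thresholds.

    Defaults mirror ERROR_RATE_THRESHOLD_PCT and MIN_ERROR_EVENTS in the spec.
    """
    total = 0
    errors = 0
    for raw in lines:
        raw = raw.strip()
        if not raw:
            continue  # skip blank lines in the stream
        try:
            entry = json.loads(raw)
        except json.JSONDecodeError:
            continue  # tolerate truncated or partial lines
        if not isinstance(entry, dict):
            continue
        status = entry.get("responseStatusCode")
        if not isinstance(status, int):
            continue  # entries without a status code are not counted as requests
        total += 1
        if 500 <= status <= 599:
            errors += 1

    # Rate is a percentage of sampled requests; an empty sample yields 0.
    rate_pct = (100.0 * errors / total) if total else 0.0
    return {
        "total": total,
        "errors_5xx": errors,
        "rate_pct": rate_pct,
        "rate_exceeded": rate_pct > threshold_pct,
        "min_events_met": errors >= min_error_events,
    }
```

Keeping `rate_exceeded` and `min_events_met` as separate flags lets the caller apply the noise guard however the final severity rules require.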

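The top-path tasks in the spec rank request paths by error count, optionally leaving out 404 responses. A minimal Python sketch of that ranking follows, assuming NDJSON log lines with the `responseStatusCode` and `requestPath` fields mentioned in the notes. The function name `top_error_paths` and its parameters are hypothetical and not part of the spec.

```python
import json
from collections import Counter
from typing import Iterable, List, Tuple


def top_error_paths(lines: Iterable[str],
                    low: int,
                    high: int,
                    exclude_404: bool = False,
                    limit: int = 10) -> List[Tuple[str, int]]:
    """Rank request paths by the number of responses with status in [low, high]."""
    counts: Counter = Counter()
    for raw in lines:
        try:
            entry = json.loads(raw)
        except json.JSONDecodeError:
            continue  # skip blank or malformed lines
        if not isinstance(entry, dict):
            continue
        status = entry.get("responseStatusCode")
        path = entry.get("requestPath")
        if not isinstance(status, int) or not isinstance(path, str):
            continue
        if exclude_404 and status == 404:
            continue  # mirrors EXCLUDE_404_FROM_4XX=true
        if low <= status <= high:
            counts[path] += 1
    # most_common returns (path, count) pairs, highest count first.
    return counts.most_common(limit)
```

Calling `top_error_paths(lines, 500, 599)` would cover the 5xx ranking, and `top_error_paths(lines, 400, 499, exclude_404=True)` the non-404 4xx ranking.
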