client-proxy 502 errors & orchestration-api snapshot not found: poor observability and ambiguous error messages #2730

@AdaAibaby

Summary

In production environments, we are observing two critical issues that lead to sandbox startup failures, ambiguous user-facing errors, and poor debuggability:

  1. client-proxy floods its logs with 502 "Reverse proxy error" entries

    • No upstream health checking
    • No TCP/gRPC connectivity validation before forwarding
    • Returns a generic 502 instead of a more meaningful 503 Service Unavailable
    • Causes sandbox process startup to fail intermittently
  2. orchestration-api returns misleading 404 on snapshot not found

    • Log: snapshot not found
    • User error: Sandbox doesn't exist or you don't have access to it
    • No detailed logging of why the snapshot lookup failed (missing object, path, permissions, cache)
    • Service still registers routes as ready despite being unable to serve snapshots

These issues make sandbox startup failures extremely hard to debug and result in a bad user experience.

Log Evidence

client-proxy 502
Reverse proxy error {"service": "client-proxy", "shturl.cc/": "im2pa8g5739aezjy57vr1", "target_hostname": "10.254.73.19", "target_port": "5007", "status_code": 502}
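
To illustrate the kind of pre-forwarding check requested above, here is a minimal sketch in Python. It is language-agnostic in intent and is not the client-proxy code: `probe_upstream` and `precheck` are hypothetical names, and the 1-second timeout is an arbitrary assumption. The idea is to attempt a plain TCP connect to the target first and answer 503 with a concrete reason when the upstream is unreachable, rather than forwarding and surfacing a generic 502.

```python
import socket
from typing import Optional, Tuple


def probe_upstream(host: str, port: int, timeout: float = 1.0) -> Tuple[bool, str]:
    """Try a plain TCP connect to the upstream; return (reachable, reason)."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True, "ok"
    except socket.timeout:
        return False, "connect timed out"
    except ConnectionRefusedError:
        return False, "connection refused"
    except OSError as exc:
        # Covers unreachable networks, DNS failures, and similar errors.
        return False, f"connect failed: {exc}"


def precheck(host: str, port: int, timeout: float = 1.0) -> Tuple[Optional[int], str]:
    """Decide whether to forward (hypothetical helper).

    Returns (None, "ok") when the request can be forwarded, or
    (503, reason) when the upstream cannot be reached at all.
    """
    reachable, reason = probe_upstream(host, port, timeout)
    if reachable:
        return None, reason
    return 503, reason
```

With this shape, the proxy log entry could carry the `reason` string next to `target_hostname` and `target_port`, and 502 would remain for cases where the upstream was reached but returned an invalid response.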

orchestration-api snapshot not found
snapshot not found {"service": "orchestration-api", "shturl.cc/": "im2pa8g5739aezjy57vr1"}
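
One possible way to separate the causes behind "snapshot not found" is sketched below in Python. This is an illustration under stated assumptions, not the orchestration-api implementation: the `SnapshotFailure` reasons are taken from the causes listed in the summary, while `SnapshotLookupError`, `log_fields`, `to_response`, the `sandbox_id` field name, and the choice of 503 for non-missing cases are all hypothetical.

```python
import enum
import json
import logging
from typing import Dict, Tuple

logger = logging.getLogger("orchestration-api")


class SnapshotFailure(enum.Enum):
    """Hypothetical failure reasons, mirroring the causes named in the issue."""
    MISSING_OBJECT = "missing_object"
    BAD_PATH = "bad_path"
    PERMISSION_DENIED = "permission_denied"
    CACHE_ERROR = "cache_error"


class SnapshotLookupError(Exception):
    """Carries the reason a snapshot lookup failed, for logging and mapping."""

    def __init__(self, sandbox_id: str, reason: SnapshotFailure, detail: str):
        super().__init__(f"snapshot lookup failed ({reason.value}): {detail}")
        self.sandbox_id = sandbox_id
        self.reason = reason
        self.detail = detail


def log_fields(err: SnapshotLookupError) -> Dict[str, str]:
    """Structured fields to attach to the log entry (field names are assumed)."""
    return {
        "service": "orchestration-api",
        "sandbox_id": err.sandbox_id,
        "reason": err.reason.value,
        "detail": err.detail,
    }


def to_response(err: SnapshotLookupError) -> Tuple[int, str]:
    """Map the failure to a user-facing status and message (assumed mapping)."""
    if err.reason is SnapshotFailure.MISSING_OBJECT:
        return 404, "Sandbox doesn't exist or you don't have access to it"
    # Storage, permission, and cache problems are server-side conditions.
    return 503, "Sandbox snapshot is temporarily unavailable"


def handle(err: SnapshotLookupError) -> Tuple[int, str]:
    """Log the detailed reason, then return the user-facing response."""
    logger.warning("snapshot not found %s", json.dumps(log_fields(err)))
    return to_response(err)
```

For example, `handle(SnapshotLookupError("sbx-1", SnapshotFailure.PERMISSION_DENIED, "access denied"))` would log the reason and return a 503 rather than the current 404, so only a genuinely missing object produces the "doesn't exist" message.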
