Goal
Add WebSocket transport for InterLink server ↔ plugin communication while keeping the existing REST API working.
Phase 0 — Confirm the desired direction and roles
Recommended topology: plugin connects to server.
- Plugin (apptainer) opens wss://interlink-server/ws
- Keeps one long-lived connection
- Server can send requests to plugin (bidirectional control) and plugin can push events back without polling
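This typically works better with NAT/firewalls than server-initiated connections, because the plugin only needs outbound access and no inbound port has to be opened on the plugin side.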
Phase 1 — Introduce a transport layer abstraction (server + plugin)
1A) Define a common internal “service interface”
On the server, refactor REST handlers so they call a set of internal methods like:
CreatePod(...)
DeletePod(...)
GetPodStatus(...)
GetLogs(...) (even if currently implemented as polling)
The key is: HTTP handlers become thin adapters.
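As a rough illustration of the "thin adapter" idea, here is a minimal Go sketch. PodRequest, PodStatus, PodService, fakeService and callCreate are hypothetical names made up for this example; the real InterLink structs and handlers would take their place.

```go
package main

import (
	"context"
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// PodRequest and PodStatus are hypothetical placeholders for the real
// InterLink request/response structs.
type PodRequest struct {
	Name string `json:"name"`
}

type PodStatus struct {
	Name  string `json:"name"`
	Phase string `json:"phase"`
}

// PodService is the transport-neutral interface that both the REST
// handlers and a later WS dispatcher would call.
type PodService interface {
	CreatePod(ctx context.Context, req PodRequest) (PodStatus, error)
}

// fakeService is a stand-in implementation used only by this example.
type fakeService struct{}

func (fakeService) CreatePod(_ context.Context, req PodRequest) (PodStatus, error) {
	if req.Name == "" {
		return PodStatus{}, fmt.Errorf("pod name is required")
	}
	return PodStatus{Name: req.Name, Phase: "Pending"}, nil
}

// createPodHandler is the thin adapter: decode, delegate, encode.
func createPodHandler(svc PodService) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		var req PodRequest
		if err := json.NewDecoder(r.Body).Decode(&req); err != nil {
			http.Error(w, "invalid JSON body", http.StatusBadRequest)
			return
		}
		st, err := svc.CreatePod(r.Context(), req)
		if err != nil {
			http.Error(w, err.Error(), http.StatusBadRequest)
			return
		}
		w.Header().Set("Content-Type", "application/json")
		json.NewEncoder(w).Encode(st)
	}
}

// callCreate runs the handler in-process and returns status code and body.
func callCreate(body string) (int, string) {
	rec := httptest.NewRecorder()
	req := httptest.NewRequest(http.MethodPost, "/v1/pods", strings.NewReader(body))
	createPodHandler(fakeService{})(rec, req)
	return rec.Code, strings.TrimSpace(rec.Body.String())
}

func main() {
	code, body := callCreate(`{"name":"demo"}`)
	fmt.Println(code, body)
}
```

Because the handler holds no business logic, a WS dispatcher can call the same PodService method with the same structs.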
1B) Define a transport-neutral message envelope
Start with JSON (easy to debug). Example:
{
  "v": 1,
  "id": "uuid-or-monotonic",
  "kind": "request",
  "op": "POST",
  "path": "/v1/pods",
  "headers": { "authorization": "Bearer …" },
  "body": { "...": "..." },
  "deadline_ms": 30000
}
Response:
{
  "v": 1,
  "id": "same-id",
  "kind": "response",
  "status": 200,
  "body": { "...": "..." },
  "error": null
}
And define a few control messages:
- hello / welcome (handshake + protocol version)
- ping / pong (keepalive)
- register (plugin type, capabilities, plugin instance ID)
- optionally, event for async server→plugin or plugin→server events later
This is the “REST-over-WS” bridge that avoids rewriting all payloads immediately.
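The envelope above could map to a single Go struct along these lines. This is a sketch, not a settled schema: the type and function names (Envelope, DecodeEnvelope, NewResponse, ProtocolVersion) are assumptions, and the body is kept as raw JSON so existing REST payloads pass through untouched.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ProtocolVersion is the only envelope version this sketch accepts.
const ProtocolVersion = 1

// Envelope covers both request and response messages; unused fields are
// omitted on the wire. A nil Error means "no error".
type Envelope struct {
	V          int               `json:"v"`
	ID         string            `json:"id"`
	Kind       string            `json:"kind"` // "request", "response", or a control kind such as "ping"
	Op         string            `json:"op,omitempty"`
	Path       string            `json:"path,omitempty"`
	Headers    map[string]string `json:"headers,omitempty"`
	Status     int               `json:"status,omitempty"`
	Body       json.RawMessage   `json:"body,omitempty"`
	Error      *string           `json:"error,omitempty"`
	DeadlineMS int64             `json:"deadline_ms,omitempty"`
}

// DecodeEnvelope parses and minimally validates one message. Unknown JSON
// fields are ignored by encoding/json, which helps forward compatibility.
func DecodeEnvelope(data []byte) (Envelope, error) {
	var env Envelope
	if err := json.Unmarshal(data, &env); err != nil {
		return Envelope{}, fmt.Errorf("decode envelope: %w", err)
	}
	if env.V != ProtocolVersion {
		return Envelope{}, fmt.Errorf("unsupported protocol version %d", env.V)
	}
	if env.ID == "" || env.Kind == "" {
		return Envelope{}, fmt.Errorf("envelope needs both id and kind")
	}
	return env, nil
}

// NewResponse builds the response for req, reusing its id for correlation.
func NewResponse(req Envelope, status int, body any) (Envelope, error) {
	raw, err := json.Marshal(body)
	if err != nil {
		return Envelope{}, fmt.Errorf("encode response body: %w", err)
	}
	return Envelope{V: ProtocolVersion, ID: req.ID, Kind: "response", Status: status, Body: raw}, nil
}

func main() {
	req, err := DecodeEnvelope([]byte(`{"v":1,"id":"42","kind":"request","op":"POST","path":"/v1/pods"}`))
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	resp, _ := NewResponse(req, 200, map[string]string{"state": "created"})
	out, _ := json.Marshal(resp)
	fmt.Println(string(out))
}
```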
Phase 2 — WebSocket endpoint on the InterLink server
2A) Add WS endpoint
Expose a new endpoint (example):
GET /ws (HTTP upgrade)
On upgrade:
- Authenticate (see below)
- Read the register message (plugin identifies itself: apptainer, slurm, etc.)
- Store the connection in a connection manager keyed by plugin instance ID
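A connection manager for the last step might look like the sketch below. Conn is a hypothetical minimal interface standing in for whichever WebSocket library's connection type you pick, and ConnManager, Register, Get and Count are illustrative names. The unregister closure is there so that a stale connection's cleanup cannot remove a newer registration after a reconnect.

```go
package main

import (
	"fmt"
	"sync"
)

// Conn is the minimal surface the manager needs from a WebSocket
// connection (hypothetical; a real library type would satisfy it).
type Conn interface {
	Close() error
}

type entry struct{ conn Conn }

// ConnManager tracks one live connection per plugin instance ID.
type ConnManager struct {
	mu    sync.RWMutex
	conns map[string]*entry
}

func NewConnManager() *ConnManager {
	return &ConnManager{conns: make(map[string]*entry)}
}

// Register stores c under id, closing any connection it replaces, and
// returns a function that removes exactly this registration.
func (m *ConnManager) Register(id string, c Conn) (unregister func()) {
	e := &entry{conn: c}
	m.mu.Lock()
	old := m.conns[id]
	m.conns[id] = e
	m.mu.Unlock()
	if old != nil {
		old.conn.Close() // a reconnect supersedes the stale connection
	}
	return func() {
		m.mu.Lock()
		defer m.mu.Unlock()
		if m.conns[id] == e { // leave a newer registration alone
			delete(m.conns, id)
		}
	}
}

// Get returns the current connection for a plugin instance, if any.
func (m *ConnManager) Get(id string) (Conn, bool) {
	m.mu.RLock()
	defer m.mu.RUnlock()
	e, ok := m.conns[id]
	if !ok {
		return nil, false
	}
	return e.conn, true
}

// Count reports the number of connected plugins (useful as a metric).
func (m *ConnManager) Count() int {
	m.mu.RLock()
	defer m.mu.RUnlock()
	return len(m.conns)
}

// fakeConn records whether Close was called; used only by the example.
type fakeConn struct{ closed bool }

func (f *fakeConn) Close() error { f.closed = true; return nil }

func main() {
	m := NewConnManager()
	unregister := m.Register("apptainer-1", &fakeConn{})
	fmt.Println("connected:", m.Count())
	unregister()
	fmt.Println("connected:", m.Count())
}
```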
2B) Authentication options
Pick one (in order of simplicity):
- Bearer token in Authorization header during WS upgrade
- Token as query param (works, but less clean: URLs tend to be recorded in access and proxy logs)
- Mutual TLS if you already have that model
For parity with REST, keep the same token validation logic.
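For the Bearer-token option, the check can run as ordinary HTTP middleware before the upgrade happens. The sketch below uses only net/http; requireToken, staticValidator, statusFor and the token value are made up for illustration, and the validate function is where your existing REST token validation would plug in.

```go
package main

import (
	"crypto/subtle"
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// bearerToken extracts the token from "Authorization: Bearer <token>".
func bearerToken(r *http.Request) (string, bool) {
	const prefix = "Bearer "
	h := r.Header.Get("Authorization")
	if len(h) <= len(prefix) || !strings.EqualFold(h[:len(prefix)], prefix) {
		return "", false
	}
	return strings.TrimSpace(h[len(prefix):]), true
}

// requireToken rejects the request with 401 before next (the WS upgrade
// handler) runs. validate should be the same function the REST API uses.
func requireToken(validate func(string) bool, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		tok, ok := bearerToken(r)
		if !ok || !validate(tok) {
			http.Error(w, "unauthorized", http.StatusUnauthorized)
			return
		}
		next.ServeHTTP(w, r)
	})
}

// staticValidator is a hypothetical stand-in that compares against one
// expected token in constant time.
func staticValidator(expected string) func(string) bool {
	return func(got string) bool {
		return subtle.ConstantTimeCompare([]byte(got), []byte(expected)) == 1
	}
}

// statusFor sends GET /ws through the middleware and returns the status.
func statusFor(authHeader string) int {
	upgrade := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		// The real handler would perform the WebSocket upgrade here.
		w.WriteHeader(http.StatusNoContent)
	})
	req := httptest.NewRequest(http.MethodGet, "/ws", nil)
	if authHeader != "" {
		req.Header.Set("Authorization", authHeader)
	}
	rec := httptest.NewRecorder()
	requireToken(staticValidator("s3cret"), upgrade).ServeHTTP(rec, req)
	return rec.Code
}

func main() {
	fmt.Println(statusFor("Bearer s3cret"), statusFor("Bearer wrong"), statusFor(""))
}
```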
2C) Routing incoming WS requests
Implement a dispatcher that:
- parses the envelope
- maps op+path to the same internal handler logic you use for REST
- returns a response message with the same id
This allows you to reuse existing request structs and validation.
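A dispatcher along those lines can be quite small. The sketch below matches op+path exactly, so parameterized paths would need a real router; Request, Response, HandlerFunc, Dispatcher and newDemoDispatcher are illustrative names, and the registered route is a dummy.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Request and Response carry only the envelope fields the dispatcher needs.
type Request struct {
	ID   string          `json:"id"`
	Op   string          `json:"op"`
	Path string          `json:"path"`
	Body json.RawMessage `json:"body,omitempty"`
}

type Response struct {
	ID     string          `json:"id"`
	Kind   string          `json:"kind"`
	Status int             `json:"status"`
	Body   json.RawMessage `json:"body,omitempty"`
	Error  *string         `json:"error"`
}

// HandlerFunc is the shape of an internal handler shared with REST.
type HandlerFunc func(body json.RawMessage) (status int, result any, err error)

// Dispatcher maps "OP path" keys to handlers.
type Dispatcher struct {
	routes map[string]HandlerFunc
}

func NewDispatcher() *Dispatcher {
	return &Dispatcher{routes: make(map[string]HandlerFunc)}
}

func (d *Dispatcher) Handle(op, path string, h HandlerFunc) {
	d.routes[op+" "+path] = h
}

func errorResponse(id string, status int, msg string) Response {
	return Response{ID: id, Kind: "response", Status: status, Error: &msg}
}

// Dispatch parses one envelope, runs the matching handler, and returns a
// response that carries the same id as the request.
func (d *Dispatcher) Dispatch(raw []byte) Response {
	var req Request
	if err := json.Unmarshal(raw, &req); err != nil {
		return errorResponse("", 400, "malformed envelope")
	}
	h, ok := d.routes[req.Op+" "+req.Path]
	if !ok {
		return errorResponse(req.ID, 404, "no route for "+req.Op+" "+req.Path)
	}
	status, result, err := h(req.Body)
	if err != nil {
		return errorResponse(req.ID, status, err.Error())
	}
	body, err := json.Marshal(result)
	if err != nil {
		return errorResponse(req.ID, 500, "encode result: "+err.Error())
	}
	return Response{ID: req.ID, Kind: "response", Status: status, Body: body}
}

// newDemoDispatcher registers one dummy route for the example.
func newDemoDispatcher() *Dispatcher {
	d := NewDispatcher()
	d.Handle("POST", "/v1/pods", func(json.RawMessage) (int, any, error) {
		return 200, map[string]bool{"created": true}, nil
	})
	return d
}

func main() {
	resp := newDemoDispatcher().Dispatch([]byte(`{"id":"42","op":"POST","path":"/v1/pods"}`))
	fmt.Println(resp.ID, resp.Status, string(resp.Body))
}
```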
Phase 3 — Plugin client (interlink-plugin-apptainer) changes
3A) Implement Transport interface in the plugin
Create an interface like:
Do(ctx, method, path, body) -> (status, respBody, err)
Implementations:
HTTPTransport (existing)
WSTransport (new)
The plugin chooses based on config:
INTERLINK_TRANSPORT=ws|http|auto
INTERLINK_WS_URL=wss://.../ws
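In Go, the interface and the config selection could be sketched as follows. The environment variable names come from the plan above; Config, LoadConfig and the choice to treat an unset INTERLINK_TRANSPORT as http are assumptions for this example.

```go
package main

import (
	"context"
	"fmt"
	"os"
)

// Transport is implemented by both HTTPTransport and WSTransport, so the
// rest of the plugin never needs to know which one is in use.
type Transport interface {
	Do(ctx context.Context, method, path string, body []byte) (status int, respBody []byte, err error)
}

// Config holds the transport settings read from the environment.
type Config struct {
	Mode  string // "ws", "http", or "auto"
	WSURL string
}

// LoadConfig reads and validates the settings. getenv is injected so the
// function can be tested without touching the process environment.
func LoadConfig(getenv func(string) string) (Config, error) {
	cfg := Config{
		Mode:  getenv("INTERLINK_TRANSPORT"),
		WSURL: getenv("INTERLINK_WS_URL"),
	}
	if cfg.Mode == "" {
		cfg.Mode = "http" // assumption: HTTP remains the default
	}
	switch cfg.Mode {
	case "http":
		// Nothing else is required.
	case "ws", "auto":
		if cfg.WSURL == "" {
			return Config{}, fmt.Errorf("INTERLINK_WS_URL is required when INTERLINK_TRANSPORT=%s", cfg.Mode)
		}
	default:
		return Config{}, fmt.Errorf("unknown INTERLINK_TRANSPORT %q (want ws, http, or auto)", cfg.Mode)
	}
	return cfg, nil
}

func main() {
	cfg, err := LoadConfig(os.Getenv)
	if err != nil {
		fmt.Println("config error:", err)
		return
	}
	fmt.Println("transport mode:", cfg.Mode)
}
```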
3B) WS client features to implement
Minimum viable:
- connect + handshake/register
- request/response correlation via id
- timeouts
- reconnect (exponential backoff)
- on reconnect: re-register
Edge-case policy (pick one):
- fail in-flight requests on disconnect (simpler)
- or retry idempotent ones
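Two of these pieces are easy to get subtly wrong, so here is a sketch of id-based correlation and of the backoff schedule. It implements the simpler policy (fail in-flight requests on disconnect). Pending, Forget, FailAll, Backoff and ErrDisconnected are hypothetical names; the actual socket read/write loop is left out.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
	"time"
)

// ErrDisconnected is delivered to callers whose request was in flight
// when the connection dropped.
var ErrDisconnected = errors.New("connection lost before response arrived")

type result struct {
	payload []byte
	err     error
}

// Pending correlates responses with waiting callers by request id.
type Pending struct {
	mu      sync.Mutex
	waiters map[string]chan result
}

func NewPending() *Pending {
	return &Pending{waiters: make(map[string]chan result)}
}

// Add registers id and returns the channel its response will arrive on.
func (p *Pending) Add(id string) <-chan result {
	ch := make(chan result, 1) // buffered so delivery never blocks
	p.mu.Lock()
	p.waiters[id] = ch
	p.mu.Unlock()
	return ch
}

// Resolve hands a response to its waiter; false means unknown or late id.
func (p *Pending) Resolve(id string, payload []byte) bool {
	p.mu.Lock()
	ch, ok := p.waiters[id]
	delete(p.waiters, id)
	p.mu.Unlock()
	if ok {
		ch <- result{payload: payload}
	}
	return ok
}

// Forget drops a waiter, for example after the caller's timeout fired.
func (p *Pending) Forget(id string) {
	p.mu.Lock()
	delete(p.waiters, id)
	p.mu.Unlock()
}

// FailAll fails every in-flight request and returns how many there were.
func (p *Pending) FailAll(err error) int {
	p.mu.Lock()
	waiters := p.waiters
	p.waiters = make(map[string]chan result)
	p.mu.Unlock()
	for _, ch := range waiters {
		ch <- result{err: err}
	}
	return len(waiters)
}

// Backoff returns the delay before reconnect attempt n (0-based):
// base doubled n times, capped at max. Adding jitter is a common refinement.
func Backoff(attempt int, base, max time.Duration) time.Duration {
	d := base
	for i := 0; i < attempt && d < max; i++ {
		d *= 2
	}
	if d > max {
		d = max
	}
	return d
}

func main() {
	for attempt := 0; attempt < 6; attempt++ {
		fmt.Println(attempt, Backoff(attempt, time.Second, 30*time.Second))
	}
}
```

A caller would typically select on the returned channel and on its context or a timer, calling Forget when the timeout wins.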
Phase 4 — Dual-stack rollout and feature flags
4A) Server
- Keep existing REST endpoints unchanged
- Add WS in parallel
- Add metrics: connected plugins, reconnect count, WS request latency
4B) Plugin
- Default stays HTTP
- Enable WS only in testing environments first
- Add verbose logging of WS connect/register/errors
4C) “Auto” mode
auto tries WS first, falls back to HTTP if:
- WS endpoint not reachable
- handshake fails
- server replies “unsupported”
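The fallback rule can be kept in one small function. In this sketch, dialWS is a hypothetical callback that performs connect plus handshake and returns an error for any of the three cases above; chooseTransport and ErrUnsupported are likewise illustrative names. The fallback cause is returned separately so it can be logged.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrUnsupported stands for a server that answers the handshake with
// "unsupported" (hypothetical sentinel for this example).
var ErrUnsupported = errors.New("server replied unsupported")

// chooseTransport returns the transport to use. In auto mode, any dialWS
// failure falls back to HTTP and is reported via fallbackCause. In ws mode
// the same failure is a hard error.
func chooseTransport(mode string, dialWS func() error) (chosen string, fallbackCause error, err error) {
	switch mode {
	case "http":
		return "http", nil, nil
	case "ws":
		if err := dialWS(); err != nil {
			return "", nil, fmt.Errorf("ws transport: %w", err)
		}
		return "ws", nil, nil
	case "auto":
		if err := dialWS(); err != nil {
			// Unreachable endpoint, failed handshake, or "unsupported".
			return "http", err, nil
		}
		return "ws", nil, nil
	default:
		return "", nil, fmt.Errorf("unknown transport mode %q", mode)
	}
}

func main() {
	chosen, cause, err := chooseTransport("auto", func() error { return ErrUnsupported })
	fmt.Println(chosen, cause, err)
}
```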
Phase 5 — Streaming enhancements (the real reason to use WS)
Once REST-over-WS is stable, you can add native streaming message types without breaking existing operations.
5A) Log streaming
Instead of repeated REST polling, add:
- request: StreamLogs { podID, sinceTime, follow: true }
- server sends multiple logChunk messages (same stream ID)
- client can cancel stream with CancelStream { streamID }
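The three message types and the receiving side's bookkeeping could be sketched like this. The field names beyond podID, sinceTime, follow and streamID (seq, data, eof) are assumptions, as are the Streams type and its methods, and so is the choice to carry a streamID in StreamLogs itself rather than reuse the envelope id.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sync"
)

// StreamLogs opens a log stream. StreamID is chosen by the requester.
type StreamLogs struct {
	StreamID  string `json:"streamID"`
	PodID     string `json:"podID"`
	SinceTime string `json:"sinceTime,omitempty"`
	Follow    bool   `json:"follow"`
}

// LogChunk is one piece of output for a stream; EOF marks the last one.
type LogChunk struct {
	StreamID string `json:"streamID"`
	Seq      int    `json:"seq"`
	Data     string `json:"data"`
	EOF      bool   `json:"eof,omitempty"`
}

// CancelStream asks the sender to stop producing chunks.
type CancelStream struct {
	StreamID string `json:"streamID"`
}

// Streams routes incoming chunks to the sink registered for their stream.
type Streams struct {
	mu   sync.Mutex
	open map[string]func(LogChunk)
}

func NewStreams() *Streams {
	return &Streams{open: make(map[string]func(LogChunk))}
}

// Open registers a sink that receives every chunk of the stream.
func (s *Streams) Open(id string, sink func(LogChunk)) {
	s.mu.Lock()
	s.open[id] = sink
	s.mu.Unlock()
}

// Deliver decodes one logChunk and passes it to its sink. The stream is
// closed after an EOF chunk; chunks for unknown streams are rejected.
func (s *Streams) Deliver(raw []byte) error {
	var c LogChunk
	if err := json.Unmarshal(raw, &c); err != nil {
		return fmt.Errorf("decode logChunk: %w", err)
	}
	s.mu.Lock()
	sink, ok := s.open[c.StreamID]
	if ok && c.EOF {
		delete(s.open, c.StreamID)
	}
	s.mu.Unlock()
	if !ok {
		return fmt.Errorf("logChunk for unknown or cancelled stream %q", c.StreamID)
	}
	sink(c)
	return nil
}

// Cancel stops local delivery and returns the message to send to the peer.
func (s *Streams) Cancel(id string) CancelStream {
	s.mu.Lock()
	delete(s.open, id)
	s.mu.Unlock()
	return CancelStream{StreamID: id}
}

func main() {
	s := NewStreams()
	s.Open("s1", func(c LogChunk) { fmt.Printf("[%d] %s\n", c.Seq, c.Data) })
	if err := s.Deliver([]byte(`{"streamID":"s1","seq":0,"data":"starting container"}`)); err != nil {
		fmt.Println("error:", err)
	}
	fmt.Println("cancel:", s.Cancel("s1").StreamID)
}
```

Chunks that arrive after a cancel are expected, since the peer may have sent them before it saw CancelStream, so the resulting error is usually worth logging at a low level rather than treating as a fault.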
5B) Pod lifecycle events
Have plugin push events:
podStarted, podFinished, podFailed, etc.
This reduces server polling and improves UI responsiveness.
Phase 6 — Testing plan
- Protocol unit tests: encode/decode envelope, unknown fields, version mismatch
- Integration tests:
  - start server
  - connect WS client
  - run a set of “golden” calls that are currently REST calls and validate identical outputs
- Fault injection:
  - disconnect mid-request
  - server restart
  - auth failure
- Load tests: concurrent requests over one connection, or multiple connections
Phase 7 — Long-term cleanup
After a few releases:
- make WS the default transport for capable plugins
- keep REST for backward compatibility
- optionally move from JSON → protobuf for performance and strict schemas
What I need from you to tailor this to InterLink exactly
To make this plan file-level actionable (exact packages, structs, endpoints), paste either:
- the list of current plugin-facing REST endpoints + where they’re implemented (paths), or
- links to the router/handler files in interlink-hq/interLink and the HTTP client code in interlink-plugin-apptainer.
If you share those, I can return:
- a mapping table “REST endpoint → WS message op/path → internal method”
- a minimal set of new Go packages/files to add
- and a migration sequence that avoids breaking existing plugins.