## Summary
Resource instances created via `getStreamUUID` and destroyed via `deleteUUID` never release their WASM `ArrayBuffer` memory back to the OS. The WASM linear memory grows monotonically: even after Skip's internal GC frees objects, the `ArrayBuffer` stays at its high-water mark. Over time this leads to OOM on constrained instances.
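To illustrate the underlying constraint (not Skip-specific): JavaScript's `WebAssembly.Memory` exposes `grow()` but no shrink operation, so the backing `ArrayBuffer` can only ratchet upward. A minimal sketch:

```ts
// WebAssembly.Memory can grow but never shrink: once the linear memory
// reaches a high-water mark, the backing ArrayBuffer keeps that size.
const mem = new WebAssembly.Memory({ initial: 1 }); // 1 page = 64 KiB
mem.grow(16); // detaches the old buffer and allocates a larger one
console.log(mem.buffer.byteLength); // 17 * 65536 = 1_114_112 bytes
// There is no mem.shrink(); short of dropping the whole instance,
// the pages are never returned to the OS.
```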
## Environment

- `@skipruntime/wasm@0.0.19`, `@skipruntime/server@0.0.19`, `@skip-adapter/postgres@0.0.19`
- Node.js on Clever Cloud (XS instance, `--max-old-space-size=644`)
- Platform: `wasm` (default)
## Reproduction

- Start a Skip service with several external Postgres resources (`syncHistoricData: true`)
- Observe baseline `ArrayBuffer` memory (~1 GB for our dataset: photos, comments, notifications, reactions, etc.)
- Open SSE connections: each `getStreamUUID` call instantiates a resource with `.map()` chains (enrichers, filters), allocating additional WASM memory (see the sketch after this list)
- Close SSE connections: `deleteUUID` is called and the resource instance is destroyed
- Observe `ArrayBuffer` memory: it never decreases, even with 0 active SSE connections
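For context, a minimal sketch of our SSE handler lifecycle. It assumes an Express app and `SkipServiceBroker` from `@skipruntime/helpers`; the resource name, ports, and broker construction details are placeholders approximated from the Skip docs, not our exact code:

```ts
import express from "express";
import { SkipServiceBroker } from "@skipruntime/helpers";

const app = express();

// Broker pointed at the running Skip service; ports are placeholders.
const broker = new SkipServiceBroker({
  host: "localhost",
  control_port: 8081,
  streaming_port: 8080,
});

app.get("/events/:userId", async (req, res) => {
  res.setHeader("content-type", "text/event-stream");
  // Step 3: each open connection instantiates the resource;
  // WASM memory grows here.
  const uuid = await broker.getStreamUUID("notifications", {
    userId: req.params.userId,
  });
  // ...proxy the Skip stream for this uuid to the client as SSE...
  req.on("close", () => {
    // Step 4: destroys the resource instance. Step 5: the ArrayBuffer
    // nevertheless stays at its high-water mark.
    broker.deleteUUID(uuid).catch(() => {});
  });
});

app.listen(3000);
```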
## Data from production

Memory samples from `process.memoryUsage().arrayBuffers`, in MB:

| Uptime   | ArrayBuf (MB) | SSE conns | Notes |
|----------|---------------|-----------|-------|
| 63s      | 1025          | 0         | Fresh start, baseline after `syncHistoricData` |
| 9843s    | 1077          | 3         | Stable with 3 connections |
| 10263s   | 1077          | 0         | Connections closed, memory unchanged |
| 10803s   | 1077          | 0         | Still 0 connections, still 1077 MB |
| 9063s\*  | 1704          | 6         | After burst of 17 SSE connections |
| 9783s\*  | 1715          | 3         | Connections dropped, memory stayed at 1715 |
| 10683s\* | 1715          | 3         | Never came back down |

\* Second deployment, same pattern
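The samples come from a plain Node.js interval, roughly as follows; `activeSseConnections` is our own bookkeeping counter, not a Skip API:

```ts
// Logs one row of the table above every minute; pure Node.js, no Skip APIs.
let activeSseConnections = 0; // incremented/decremented in the SSE handler

setInterval(() => {
  const mb = Math.round(process.memoryUsage().arrayBuffers / (1024 * 1024));
  console.log(
    `[MEMORY-TREND] uptime=${Math.round(process.uptime())}s ` +
      `arrayBuffers=${mb}MB sse=${activeSseConnections}`,
  );
}, 60_000);
```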
The `[MEMORY-TREND]` watchdog flagged 368 MB of `ArrayBuffer` growth per SSE connection (threshold: 10 MB). After the connections closed and `deleteUUID` succeeded, the memory remained permanently allocated.
Eventually the process becomes unresponsive: event loop lag reaches 10+ seconds, DB queries time out, all health checks fail. Only a process restart recovers.
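For reference, the lag figure can be observed with Node's built-in event-loop histogram; this is a sketch of such a probe using the standard `perf_hooks` API, not our exact tooling:

```ts
import { monitorEventLoopDelay } from "node:perf_hooks";

// Samples event-loop delay; when the WASM heap thrashes, the max
// delay climbs into the tens of seconds.
const histogram = monitorEventLoopDelay({ resolution: 20 });
histogram.enable();

setInterval(() => {
  console.log(`event loop max delay: ${(histogram.max / 1e6).toFixed(0)} ms`);
  histogram.reset();
}, 10_000);
```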
## Expected behavior

After `deleteUUID` successfully destroys a resource instance, the WASM memory allocated for that instance's reactive graph (mapped/filtered collections) should be reclaimable: either by shrinking the WASM linear memory, or by reusing freed pages for subsequent allocations without growing further.
## Questions

- Is there a way to configure WASM memory limits or trigger compaction?
- Would switching more resources to `syncHistoricData: false` significantly reduce baseline memory?
- Is the `native` platform option (`runService` with `platform: "native"`) expected to have better memory reclamation behavior? (See the sketch after this list.)
- Are there plans to support WASM memory shrinking (e.g., via the proposed `memory.discard` instruction)?
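For the `native` question, this is the switch we would test. The options shape is an assumption based on our current config and the `platform` option named above; only `platform: "native"` itself comes from the docs:

```ts
import { runService } from "@skipruntime/server";
import { service } from "./service.js"; // our SkipService definition (placeholder path)

// Same service, native platform instead of the default "wasm".
// Port fields are assumptions mirroring our wasm deployment.
await runService(service, {
  streaming_port: 8080,
  control_port: 8081,
  platform: "native",
});
```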
## Workaround

We're implementing a self-healing watchdog that calls `process.exit(1)` when `ArrayBuffer` memory exceeds a threshold, relying on the hosting platform to restart the process. This works but causes brief downtime on each restart.
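A trimmed-down version of the watchdog; the threshold is deployment-specific:

```ts
// Self-healing watchdog: when ArrayBuffer memory crosses the limit, exit
// and let the hosting platform restart the process (brief downtime).
const ARRAY_BUFFER_LIMIT_MB = 1800; // tuned to our XS instance; adjust per deployment

setInterval(() => {
  const mb = process.memoryUsage().arrayBuffers / (1024 * 1024);
  if (mb > ARRAY_BUFFER_LIMIT_MB) {
    console.error(`[WATCHDOG] arrayBuffers at ${Math.round(mb)} MB, exiting for restart`);
    process.exit(1);
  }
}, 30_000);
```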