Conversation
Add explicit shutdown() to SerialExecutor and DaskDelayedExecutor that clears tracked futures. Enhance LithopsEagerFunctionExecutor.shutdown() to clear cached _call_output on ResponseFutures before closing, preventing memory accumulation across repeated map() calls. Add parametrized tests. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Codecov Report❌ Patch coverage is
❌ Your patch check has failed because the patch coverage (40.00%) is below the target coverage (75.00%). You can increase the patch coverage or adjust the target coverage. Additional details and impacted files@@ Coverage Diff @@
## main #925 +/- ##
==========================================
- Coverage 89.23% 88.62% -0.61%
==========================================
Files 33 33
Lines 2025 2031 +6
==========================================
- Hits 1807 1800 -7
- Misses 218 231 +13
🚀 New features to boost your workflow:
|
virtualizarr/tests/test_parallel.py
Outdated
|
|
||
|
|
||
| @pytest.mark.parametrize("executor_cls", ALL_EXECUTORS) | ||
| class TestExecutorMemory: |
There was a problem hiding this comment.
I'm not sure if either of these tests will be reliable enough - curious of @chuckwondo 's thoughts.
for more information, see https://pre-commit.ci
virtualizarr/parallel.py
Outdated
| # Lithops registers self.clean as an atexit handler (executors.py __init__), | ||
| # which prevents the FunctionExecutor from ever being garbage collected. | ||
| # Unregister it so the executor can be freed after shutdown. | ||
| atexit.unregister(self.lithops_client.clean) |
There was a problem hiding this comment.
This is absolutely wild and deserves raising upstream
|
|
||
|
|
||
| @pytest.mark.parametrize("executor_cls", ALL_EXECUTORS) | ||
| class TestExecutorMemory: |
There was a problem hiding this comment.
I'm not sure these really belong here. They seem like tests that should occur upstream. If we see memory leaks in the upstream executors, we should probably be opening bugs against the appropriate repositories, no?
There was a problem hiding this comment.
I agree, see #926 for discussion of what we should do or not do to clean up
There was a problem hiding this comment.
I think you are right in principal, but I would propose to keep this around as at least an optional test due to the significant work that was needed to get to the bottom of this.
There was a problem hiding this comment.
Yeah @chuckwondo I see these tests as hopefully-temporary, but unfortunately important.
|
Ok Tom and I actually worked on an alternative approach where we change the lithops config to set I have limited this to when the backend is |
TomNicholas
left a comment
There was a problem hiding this comment.
I think this is good, but before releasing it I want to:
- confirm with @jbusecke that nothing else puzzling has come up wrt this rabbit hole,
- raise an upstream issue on lithops in case we're missing something important here.
|
@TomNicholas I just looked into the failing Minimum Version Test and it seems that lithops is installed with the latest version 3.6.4? a) That does not seems like the behavior we want from the test? |
|
Aha! now the 3.11 tests also fail. |
|
The 3.12 and 3.13 tests fails with that The 3.11 min-deps test does not mention this import error, but there is something failing within lithops, so it is not just an issue of not releasing the caches IIUC. |
|
Rerunning them now. |
@TomNicholas and I have been mulling over a complex native zarr ingestion job for a few days now. We were ingesting many batches of large (~1TB) native zarr stores, and saw a steady increase of memory which indicated that 'something' was holding onto memory in between batches. This PR adds tests to catch this behavior and a fix for the lithops executor that did fix our problem for now.
The dataset we are using is currently not public. I would like to demonstrate the base issue fully reproducibly. If anyone knows a ~1TB/300k chunks native zarr store in an anon bucket, please let me know.
Closes Lithops FunctionExecutor memory leaks: atexit handler + unbounded futures list #926
Tests added
Tests passing
No test coverage regression
Full type hint coverage
Changes are documented in
docs/releases.mdNew functions/methods are listed in an appropriate
*.mdfile underdocs/apiNew functionality has documentation