fix: add shutdown methods to executors by jbusecke · Pull Request #925 · zarr-developers/VirtualiZarr

jbusecke · 2026-03-12T17:28:15Z

@TomNicholas and I have been mulling over a complex native zarr ingestion job for a few days now. We were ingesting many batches of large (~1TB) native zarr stores and saw memory usage increase steadily, which indicated that something was holding onto memory between batches. This PR adds tests to catch this behavior and a fix for the lithops executor that resolves our problem for now.

The dataset we are using is currently not public. I would like to demonstrate the underlying issue in a fully reproducible way. If anyone knows of a ~1TB/300k chunks native zarr store in an anon bucket, please let me know.
Closes Lithops FunctionExecutor memory leaks: atexit handler + unbounded futures list #926
Tests added
Tests passing
No test coverage regression
Full type hint coverage
Changes are documented in docs/releases.md
New functions/methods are listed in an appropriate *.md file under docs/api
New functionality has documentation

Add explicit shutdown() to SerialExecutor and DaskDelayedExecutor that clears tracked futures. Enhance LithopsEagerFunctionExecutor.shutdown() to clear cached _call_output on ResponseFutures before closing, preventing memory accumulation across repeated map() calls. Add parametrized tests. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
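
The pattern described in the commit message above can be sketched roughly as follows. This is a minimal illustration, not the code in virtualizarr/parallel.py: the class name `TrackingSerialExecutor` and its `_futures` attribute are hypothetical, and it assumes an executor built on `concurrent.futures.Executor` that keeps a reference to every future it creates.

```python
from concurrent.futures import Executor, Future


class TrackingSerialExecutor(Executor):
    """Hypothetical serial executor that tracks the futures it creates."""

    def __init__(self):
        # Every submitted future is kept here, so results stay alive
        # until the list is cleared.
        self._futures = []

    def submit(self, fn, /, *args, **kwargs):
        # Run the function immediately and wrap the outcome in a Future.
        future = Future()
        try:
            future.set_result(fn(*args, **kwargs))
        except Exception as exc:
            future.set_exception(exc)
        self._futures.append(future)
        return future

    def shutdown(self, wait=True, *, cancel_futures=False):
        # Drop all references to tracked futures (and their results)
        # so they can be garbage collected between batches.
        self._futures.clear()
```

Because `Executor.__exit__` calls `shutdown()`, using such an executor in a `with` block per batch would release the tracked futures at the end of each batch.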

codecov · 2026-03-12T17:30:11Z

Codecov Report

❌ Patch coverage is 40.00000% with 6 lines in your changes missing coverage. Please review.
✅ Project coverage is 88.62%. Comparing base (bdc1a3a) to head (a612af5).

Files with missing lines	Patch %	Lines
virtualizarr/parallel.py	40.00%	6 Missing ⚠️

❌ Your patch check has failed because the patch coverage (40.00%) is below the target coverage (75.00%). You can increase the patch coverage or adjust the target coverage.

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #925      +/-   ##
==========================================
- Coverage   89.23%   88.62%   -0.61%     
==========================================
  Files          33       33              
  Lines        2025     2031       +6     
==========================================
- Hits         1807     1800       -7     
- Misses        218      231      +13

Files with missing lines	Coverage Δ
virtualizarr/parallel.py	`64.10% <40.00%> (-25.56%)`	⬇️

... and 3 files with indirect coverage changes

TomNicholas · 2026-03-12T18:49:06Z

virtualizarr/tests/test_parallel.py

+
+
+@pytest.mark.parametrize("executor_cls", ALL_EXECUTORS)
+class TestExecutorMemory:


I'm not sure if either of these tests will be reliable enough - curious about @chuckwondo's thoughts.

TomNicholas · 2026-03-12T20:02:41Z

virtualizarr/parallel.py

+        # Lithops registers self.clean as an atexit handler (executors.py __init__),
+        # which prevents the FunctionExecutor from ever being garbage collected.
+        # Unregister it so the executor can be freed after shutdown.
+        atexit.unregister(self.lithops_client.clean)


This is absolutely wild and deserves raising upstream

Probably so.
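
The mechanism discussed in this thread can be reproduced with the standard library alone. The sketch below is an illustration under stated assumptions, not Lithops code: `Client` is a hypothetical stand-in for any object that registers one of its own bound methods with `atexit`. The registered bound method holds a strong reference to the object, so it stays alive until the handler is unregistered.

```python
import atexit
import gc
import weakref


class Client:
    """Hypothetical object that registers its own cleanup hook at construction."""

    def __init__(self):
        # The bound method self.clean references self, and atexit keeps
        # the bound method until interpreter exit (or until unregistered).
        atexit.register(self.clean)

    def clean(self):
        pass


def is_alive_after_gc(unregister: bool) -> bool:
    """Create a Client, drop our reference, and report whether it survived."""
    client = Client()
    ref = weakref.ref(client)
    if unregister:
        # Bound methods compare equal when they wrap the same object and
        # function, so this removes the handler registered in __init__.
        atexit.unregister(client.clean)
    del client
    gc.collect()
    return ref() is not None
```

Calling `is_alive_after_gc(False)` shows the object surviving collection, while `is_alive_after_gc(True)` shows it being freed once the handler is removed.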

chuckwondo · 2026-03-12T20:08:54Z

virtualizarr/tests/test_parallel.py

+
+
+@pytest.mark.parametrize("executor_cls", ALL_EXECUTORS)
+class TestExecutorMemory:


I'm not sure these really belong here. They seem like tests that should occur upstream. If we see memory leaks in the upstream executors, we should probably be opening bugs against the appropriate repositories, no?

I agree, see #926 for discussion of what we should do or not do to clean up

I think you are right in principle, but I would propose keeping this around as at least an optional test, given the significant work that was needed to get to the bottom of this.

Yeah @chuckwondo I see these tests as hopefully-temporary, but unfortunately important.

jbusecke · 2026-03-12T23:04:49Z

Ok, Tom and I actually worked on an alternative approach where we change the lithops config to set lithops.data_cleaner to false (it is true by default, and that is what triggers the atexit registration). Combined with the added .shutdown() method on the Lithops executor, this solves the problem in #926 and seems a bunch nicer than the original approach.

I have limited this to when the backend is localhost so that we leave the serverless behavior untouched for now. We could easily extend this if a user finds this error with other backends.
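
A rough sketch of the config adjustment described above, under stated assumptions: it treats the Lithops config as a nested dict with a `lithops` section holding `backend` and `data_cleaner` keys (as the `lithops.data_cleaner` name in the comment suggests), and the helper name `disable_data_cleaner_on_localhost` is hypothetical. How the PR actually applies the setting is not shown in this conversation.

```python
import copy


def disable_data_cleaner_on_localhost(config: dict) -> dict:
    """Return a copy of `config` with the data cleaner turned off for localhost.

    Hypothetical helper: only configs that explicitly name the
    "localhost" backend are changed; all others are returned as-is.
    """
    new_config = copy.deepcopy(config)
    lithops_section = new_config.setdefault("lithops", {})
    if lithops_section.get("backend") == "localhost":
        # Assumption: disabling this setting avoids the atexit registration.
        lithops_section["data_cleaner"] = False
    return new_config
```

Returning a copy rather than mutating the input keeps a user-supplied config untouched, which matters if the same dict is reused for a serverless backend.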

TomNicholas

I think this is good, but before releasing it I want to:

confirm with @jbusecke that nothing else puzzling has come up wrt this rabbit hole,
raise an upstream issue on lithops in case we're missing something important here.

jbusecke · 2026-03-16T18:17:06Z

@TomNicholas I just looked into the failing Minimum Version Test and it seems that lithops is installed with the latest version 3.6.4?

a) That does not seem like the behavior we want from the test?
b) Worries me that the fix is somehow dependent on the python version or some other library?

jbusecke · 2026-03-16T18:19:44Z

Aha! now the 3.11 tests also fail.

jbusecke · 2026-03-16T19:23:56Z

Waiting for #932 to make sure that this is limited to python 3.11, but also wondering if this is another example of #933?

jbusecke · 2026-03-16T19:45:29Z

The 3.12 and 3.13 tests fail with that VirtualiZarrDatasetAccessor import error we saw last week in the office @tom.

The 3.11 min-deps test does not mention this import error, but there is something failing within lithops, so it is not just an issue of not releasing the caches IIUC.

jbusecke · 2026-03-16T19:49:47Z

Rerunning them now.

TomNicholas · 2026-03-16T21:12:36Z

Try adding the @pytest.mark.flaky decorators from #934 @jbusecke

jbusecke and others added 2 commits March 12, 2026 12:26

Add tests and constrain fix only to lithops

4f4c061

jbusecke requested review from TomNicholas and chuckwondo March 12, 2026 17:28

TomNicholas reviewed Mar 12, 2026

TomNicholas added the performance label Mar 12, 2026

TomNicholas reviewed Mar 12, 2026

jbusecke and others added 2 commits March 12, 2026 15:35

Clean up claudes horrible tests

d925fb7

[pre-commit.ci] auto fixes from pre-commit.com hooks

46908c1

jbusecke marked this pull request as ready for review March 12, 2026 19:50

TomNicholas reviewed Mar 12, 2026

chuckwondo reviewed Mar 12, 2026

TomNicholas mentioned this pull request Mar 12, 2026

Lithops FunctionExecutor memory leaks: atexit handler + unbounded futures list #926

Open

jbusecke and others added 2 commits March 12, 2026 18:59

Alternative approach via lithops config

4732cc3

[pre-commit.ci] auto fixes from pre-commit.com hooks

031ffac

jbusecke added 2 commits March 12, 2026 19:07

toms renaming suggestion

6414546

Merge branch 'executor-cleaning' of https://github.com/zarr-developer…

f013d74

…s/VirtualiZarr into executor-cleaning

Merge branch 'main' into executor-cleaning

216fb09

TomNicholas approved these changes Mar 13, 2026

TomNicholas mentioned this pull request Mar 16, 2026

Release v2.5.0 #915

Open

Merge branch 'main' into executor-cleaning

6a83bf3

Update lithops dependency version in pyproject.toml

3f9aa44

Revert lithops dependency version constraint

af2c83e

jbusecke mentioned this pull request Mar 16, 2026

Test python 3.12, 3.13 #932

Merged

4 tasks

Merge branch 'main' into executor-cleaning

4481f63

jbusecke mentioned this pull request Mar 16, 2026

Examples of CI hanging or being flaky/weird #933

Open

Merge branch 'main' into executor-cleaning

a612af5
