
Numba cache error in Apptainer/Singularity container #20

@jonasmac16


Description of the bug

I am trying to run SOPA on our HPC using Nextflow. I would like to use Singularity/Apptainer to run the pipeline and have set up a local cache of the containers, since our compute nodes are offline.

I am doing a test run on the login node to pull all Nextflow plugin dependencies, but I get the following error. It looks like Numba is trying to write its compilation cache to disk and cannot, because it is running inside the (read-only) container.

I have to run Nextflow from the downloaded pipeline directory, because I need to lower the test profile's requirement of 4 CPUs (the login node only has 2).

nextflow run . --outdir ./test_run -profile "test,apptainer"

-[nf-core/sopa] Pipeline completed with errors-
ERROR ~ Error executing process > 'NFCORE_SOPA:SOPA:TO_SPATIALDATA (1)'

Caused by:
  Process `NFCORE_SOPA:SOPA:TO_SPATIALDATA (1)` terminated with an error exit status (1)


Command executed:

  sopa convert samplesheet.csv --sdata-path sample_name.zarr --technology 'toy_dataset' --kwargs "{}"

  cat <<-END_VERSIONS > versions.yml
  "NFCORE_SOPA:SOPA:TO_SPATIALDATA":
      sopa: $(sopa --version)
      spatialdata: $(python -c "import spatialdata; print(spatialdata.__version__)" 2> /dev/null)
      spatialdata_io: $(python -c "import spatialdata_io; print(spatialdata_io.__version__)" 2> /dev/null)
  END_VERSIONS

Command exit status:
  1

Command output:
  (empty)

Command error:
  /usr/local/lib/python3.12/site-packages/dask/dataframe/__init__.py:31: FutureWarning: The legacy Dask DataFrame implementation is deprecated and will be removed in a future version. Set the configuration option `dataframe.query-planning` to `True` or None to enable the new Dask Dataframe implementation and silence this warning.
    warnings.warn(
  Traceback (most recent call last):
    File "/usr/local/bin/sopa", line 5, in <module>
      from sopa.main import app
    File "/usr/local/lib/python3.12/site-packages/sopa/__init__.py", line 12, in <module>
      from spatialdata import read_zarr  # will set `dataframe.query-planning` to False
      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    File "/usr/local/lib/python3.12/site-packages/spatialdata/__init__.py", line 60, in <module>
      from spatialdata import dataloader, datasets, models, transformations
    File "/usr/local/lib/python3.12/site-packages/spatialdata/datasets.py", line 20, in <module>
      from spatialdata._core.operations.aggregate import aggregate
    File "/usr/local/lib/python3.12/site-packages/spatialdata/_core/operations/aggregate.py", line 17, in <module>
      from xrspatial import zonal_stats
    File "/usr/local/lib/python3.12/site-packages/xrspatial/__init__.py", line 1, in <module>
      from xrspatial.aspect import aspect  # noqa
      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    File "/usr/local/lib/python3.12/site-packages/xrspatial/aspect.py", line 10, in <module>
      from xrspatial.utils import ArrayTypeFunctionMapping, cuda_args, ngjit, not_implemented_func
    File "/usr/local/lib/python3.12/site-packages/xrspatial/utils.py", line 4, in <module>
      import datashader as ds
    File "/usr/local/lib/python3.12/site-packages/datashader/__init__.py", line 7, in <module>
      from .core import Canvas                                 # noqa (API import)
      ^^^^^^^^^^^^^^^^^^^^^^^^
    File "/usr/local/lib/python3.12/site-packages/datashader/core.py", line 18, in <module>
      from . import reductions as rd
    File "/usr/local/lib/python3.12/site-packages/datashader/reductions.py", line 16, in <module>
      from datashader.transfer_functions._cuda_utils import (
    File "/usr/local/lib/python3.12/site-packages/datashader/transfer_functions/__init__.py", line 992, in <module>
      @nb.jit(nopython=True, nogil=True, cache=True)
       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    File "/usr/local/lib/python3.12/site-packages/numba/core/decorators.py", line 227, in wrapper
      disp.enable_caching()
    File "/usr/local/lib/python3.12/site-packages/numba/core/dispatcher.py", line 811, in enable_caching
      self._cache = FunctionCache(self.py_func)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^
    File "/usr/local/lib/python3.12/site-packages/numba/core/caching.py", line 687, in __init__
      self._impl = self._impl_class(py_func)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^
    File "/usr/local/lib/python3.12/site-packages/numba/core/caching.py", line 423, in __init__
      raise RuntimeError("cannot cache function %r: no locator available "
  RuntimeError: cannot cache function '_array_density': no locator available for file '/usr/local/lib/python3.12/site-packages/datashader/transfer_functions/__init__.py'

Work dir:
  /exafs1/well/gordon-weeks/projects/01_crclm/pipelines/nf-core-sopa/dev/work/64/b991f83cb84a08b91c90048d235db3

Container:
  /well/gordon-weeks/shared/nextflow/.apptainer_cache/quentinblampey-sopa-2.1.11.img

Tip: you can try to figure out what's wrong by changing to the process work dir and showing the script file named `.command.sh`

 -- Check '.nextflow.log' file for details
ERROR ~ Pipeline failed. Please refer to troubleshooting docs: https://nf-co.re/docs/usage/troubleshooting

 -- Check '.nextflow.log' file for details

I have tried pointing the Numba cache at a writable location as follows, without luck. Can you help?

mkdir $SHARED/.numba_cache
export NUMBA_CACHE_DIR="$SHARED/.numba_cache"
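One variant I have not yet tried: Apptainer may not pass the plain host export into the container (depending on how the profile invokes it), but it does inject any variable carrying the `APPTAINERENV_` prefix. A sketch, assuming `SHARED` is a placeholder for our writable shared filesystem path:

```shell
# Placeholder: point SHARED at a writable shared location
SHARED="${SHARED:-$PWD}"

# Create the cache dir and export it both directly and with Apptainer's
# env prefix, so it reaches the container environment as NUMBA_CACHE_DIR
# even if the plain host export is not propagated.
mkdir -p "$SHARED/.numba_cache"
export NUMBA_CACHE_DIR="$SHARED/.numba_cache"
export APPTAINERENV_NUMBA_CACHE_DIR="$NUMBA_CACHE_DIR"

# Then re-run the pipeline, e.g.:
#   nextflow run . --outdir ./test_run -profile "test,apptainer"
```

Whether this reaches the task environment presumably also depends on the Nextflow Apptainer configuration, so I am not sure it is sufficient on its own.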

Command used and terminal output

Relevant files

No response

System information

No response


Labels: bug
