Workers fail to deserialize with rasterio

7/30/2019

After deploying the official Dask Helm chart on Google Cloud, I updated the environment with some extra conda packages, specifically xarray and rasterio. When I run my code, the worker logs show the following error and the computation stops.

Traceback (most recent call last):
  File "/opt/conda/lib/python3.7/site-packages/tornado/ioloop.py", line 743, in _run_callback
    ret = callback()
  File "/opt/conda/lib/python3.7/site-packages/tornado/ioloop.py", line 767, in _discard_future_result
    future.result()
  File "/opt/conda/lib/python3.7/site-packages/tornado/gen.py", line 742, in run
    yielded = self.gen.throw(*exc_info)  # type: ignore
  File "/opt/conda/lib/python3.7/site-packages/distributed/worker.py", line 661, in handle_scheduler
    self.ensure_computing])
  File "/opt/conda/lib/python3.7/site-packages/tornado/gen.py", line 735, in run
    value = future.result()
  File "/opt/conda/lib/python3.7/site-packages/tornado/gen.py", line 742, in run
    yielded = self.gen.throw(*exc_info)  # type: ignore
  File "/opt/conda/lib/python3.7/site-packages/distributed/core.py", line 386, in handle_stream
    msgs = yield comm.read()
  File "/opt/conda/lib/python3.7/site-packages/tornado/gen.py", line 735, in run
    value = future.result()
  File "/opt/conda/lib/python3.7/site-packages/tornado/gen.py", line 742, in run
    yielded = self.gen.throw(*exc_info)  # type: ignore
  File "/opt/conda/lib/python3.7/site-packages/distributed/comm/tcp.py", line 206, in read
    deserializers=deserializers)
  File "/opt/conda/lib/python3.7/site-packages/tornado/gen.py", line 735, in run
    value = future.result()
  File "/opt/conda/lib/python3.7/site-packages/tornado/gen.py", line 209, in wrapper
    yielded = next(result)
  File "/opt/conda/lib/python3.7/site-packages/distributed/comm/utils.py", line 82, in from_frames
    res = _from_frames()
  File "/opt/conda/lib/python3.7/site-packages/distributed/comm/utils.py", line 68, in _from_frames
    deserializers=deserializers)
  File "/opt/conda/lib/python3.7/site-packages/distributed/protocol/core.py", line 132, in loads
    value = _deserialize(head, fs, deserializers=deserializers)
  File "/opt/conda/lib/python3.7/site-packages/distributed/protocol/serialize.py", line 184, in deserialize
    return loads(header, frames)
  File "/opt/conda/lib/python3.7/site-packages/distributed/protocol/serialize.py", line 57, in pickle_loads
    return pickle.loads(b''.join(frames))
  File "/opt/conda/lib/python3.7/site-packages/distributed/protocol/pickle.py", line 59, in loads
    return pickle.loads(x)
  File "/opt/conda/lib/python3.7/site-packages/rasterio/__init__.py", line 22, in <module>
    from rasterio._base import gdal_version
ImportError: libzstd.so.1: cannot open shared object file: No such file or directory

As I understand it, the problem seems to be a missing or corrupted libzstd library; am I right? I can't reinstall it because I don't have admin rights. The Helm chart is based on the official dask/docker version. Can anyone tell me which channel is best for reporting this problem?

-- Cursore
dask
dask-distributed
dask-kubernetes

2 Answers

7/30/2019

It looks like your versions are not the same across all of your clients and workers. Note that the EXTRA_CONDA_PACKAGES= environment variable that you're using needs to be used both in the client and the worker specs, not just one.

You might also try client.get_versions(check=True) to verify that some of the packages that are more central to Dask are synchronized.
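
As a rough illustration of what such a check looks for, the sketch below compares package versions across the scheduler, client, and workers. The helper name `find_mismatches` is hypothetical, and it assumes the dictionary returned by `client.get_versions()` has `scheduler`, `client`, and `workers` entries, each with a flat `packages` mapping; the exact layout may differ between distributed releases.

```python
def find_mismatches(versions):
    """Return {package: {node: version}} for packages whose versions differ.

    `versions` is assumed to look like the result of client.get_versions():
    {"scheduler": {"packages": {...}}, "client": {"packages": {...}},
     "workers": {address: {"packages": {...}}}}
    """
    # Collect every node (scheduler, client, each worker) under one mapping.
    nodes = {
        "scheduler": versions.get("scheduler") or {},
        "client": versions.get("client") or {},
    }
    for address, info in (versions.get("workers") or {}).items():
        nodes[address] = info or {}

    # Group the reported versions by package name.
    seen = {}
    for node, info in nodes.items():
        for package, version in (info.get("packages") or {}).items():
            seen.setdefault(package, {})[node] = version

    # Keep only packages reported with more than one distinct version.
    return {
        package: by_node
        for package, by_node in seen.items()
        if len({str(v) for v in by_node.values()}) > 1
    }


# Possible usage against a live cluster (the address is a placeholder):
# from dask.distributed import Client
# client = Client("tcp://scheduler-address:8786")
# print(find_mismatches(client.get_versions()))
```

An empty result would suggest the listed packages agree everywhere; it says nothing about system libraries such as libzstd, which are not Python packages.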

-- MRocklin
Source: StackOverflow

7/31/2019

Solved by adding this to the Helm file:

env:
    - name: EXTRA_APT_PACKAGES
      value: libzstd1
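
One way to confirm that the library is now loadable on every worker is to try opening it with `ctypes` and run that check across the cluster. This is only a sketch: the helper name `can_load` is hypothetical, and the scheduler address in the usage comment is a placeholder.

```python
import ctypes


def can_load(libname="libzstd.so.1"):
    """Return True if the dynamic loader can open the shared library."""
    try:
        ctypes.CDLL(libname)
        return True
    except OSError:
        # Raised when the shared object cannot be found or opened.
        return False


# Possible usage against a live cluster (the address is a placeholder);
# Client.run calls the function on every worker and returns a dict
# keyed by worker address:
# from dask.distributed import Client
# client = Client("tcp://scheduler-address:8786")
# print(client.run(can_load))
```

A `False` for any worker would suggest that the apt package was not installed in that worker's container.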
-- Cursore
Source: StackOverflow