
Saving hangs #258

Closed
jklymak opened this issue Mar 4, 2021 · 11 comments

jklymak commented Mar 4, 2021

If I try to save a multi-file xmitgcm dataset to disk, it hangs and has to be killed.

The script is reading three time steps:

import xarray as xr
import xmitgcm as xm
import os

runname = 'Iso3kmlowU10Amp305f141B059RoughPatch100'
data_dir = f'../results/{runname}/input'
out_dir = f'../reduceddata/{runname}/'

with xm.open_mdsdataset(data_dir, prefix=['spinup'], endian='=',
                        geometry='cartesian') as ds:
        ds = ds.isel(YC=200, YG=200)
        ds.to_netcdf(f'../reduceddata/{runname}/SliceMid.nc', engine='netcdf4')

This routinely hangs, and if I kill it I get:

   ds.to_netcdf(f'../reduceddata/{runname}/SliceMid.nc', engine='netcdf4')
  File "/home/jklymak/venvs/AbHillInter2/lib/python3.8/site-packages/xarray/core/dataset.py", line 1644, in to_netcdf
    return to_netcdf(
  File "/home/jklymak/venvs/AbHillInter2/lib/python3.8/site-packages/xarray/backends/api.py", line 1120, in to_netcdf
    writes = writer.sync(compute=compute)
  File "/home/jklymak/venvs/AbHillInter2/lib/python3.8/site-packages/xarray/backends/common.py", line 155, in sync
    delayed_store = da.store(
  File "/home/jklymak/venvs/AbHillInter2/lib/python3.8/site-packages/dask/array/core.py", line 1022, in store
    result.compute(**kwargs)
  File "/home/jklymak/venvs/AbHillInter2/lib/python3.8/site-packages/dask/base.py", line 281, in compute
    (result,) = compute(self, traverse=False, **kwargs)
  File "/home/jklymak/venvs/AbHillInter2/lib/python3.8/site-packages/dask/base.py", line 563, in compute
    results = schedule(dsk, keys, **kwargs)
  File "/home/jklymak/venvs/AbHillInter2/lib/python3.8/site-packages/dask/threaded.py", line 76, in get
    results = get_async(
  File "/home/jklymak/venvs/AbHillInter2/lib/python3.8/site-packages/dask/local.py", line 476, in get_async
    key, res_info, failed = queue_get(queue)
  File "/home/jklymak/venvs/AbHillInter2/lib/python3.8/site-packages/dask/local.py", line 133, in queue_get
    return q.get()
  File "/cvmfs/soft.computecanada.ca/easybuild/software/2020/avx2/Core/python/3.8.2/lib/python3.8/queue.py", line 170, in get
    self.not_empty.wait()
  File "/cvmfs/soft.computecanada.ca/easybuild/software/2020/avx2/Core/python/3.8.2/lib/python3.8/threading.py", line 302, in wait
    waiter.acquire()

If I select the three time steps individually and save each one, everything works fine, i.e.:

# ds0 is the full dataset returned by open_mdsdataset above
for time in range(3):
    ds = ds0.isel(YC=200, YG=200, time=time)
    ds.to_netcdf(f'../reduceddata/{runname}/SliceMid{time}.nc', engine='netcdf4')

saves three files, one for each time slice. That is an acceptable workaround, but not something I have had to do on other clusters.

> pip list
Package         Version
--------------- --------
cachetools      4.2.1
cftime          1.4.1
dask            2021.2.0
netCDF4         1.5.4
numpy           1.20.1
pandas          1.1.4
pip             20.0.2
python-dateutil 2.8.1
pytz            2021.1
PyYAML          5.3.1
setuptools      46.1.3
six             1.15.0
toolz           0.11.1
wheel           0.34.2
xarray          0.16.2
xmitgcm         0.5.1

Note that xarray seems to be having a problem with file/thread locks as well (pydata/xarray#3961), so I wonder if this is the same thing. Using pure xarray and netCDF files, setting lock=False tends to make things work better, but I don't see a similar flag for xmitgcm.
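
For reference, the pure-xarray workaround I mean looks roughly like this (a sketch only: the lock keyword was accepted by open_mfdataset in the xarray 0.16 era and has since been reworked, and the file pattern is hypothetical):

import xarray as xr

# lock=False disables xarray's per-file lock around netCDF reads;
# hypothetical file pattern, shown only to illustrate the flag
ds = xr.open_mfdataset('../results/spinup*.nc', combine='by_coords', lock=False)
ds.isel(YC=200, YG=200).to_netcdf('SliceMid.nc', engine='netcdf4')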

(Sorry for two issues in one day - new cluster ;-)


jklymak commented Mar 4, 2021

Note that if I just let the job above run, it runs out of memory. Perhaps that is the fundamental issue, and I'll try with more memory:

slurmstepd: error: Detected 1 oom-kill event(s) in StepId=63164230.interactive cgroup. Some of your processes may have been killed by the cgroup out-of-memory handler.
srun: error: cdr767: task 0: Out Of Memory


rabernat commented Mar 4, 2021

open_mdsdataset is not a context manager, so you should not use it in a with block.
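
For example, just bind the return value directly (a minimal sketch, reusing the names from the script above):

import xmitgcm as xm

# open_mdsdataset returns an ordinary (dask-backed) xarray.Dataset
ds = xm.open_mdsdataset(data_dir, prefix=['spinup'], endian='=',
                        geometry='cartesian')
ds = ds.isel(YC=200, YG=200)
ds.to_netcdf(f'../reduceddata/{runname}/SliceMid.nc', engine='netcdf4')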

To diagnose this better, could you show the full repr of your dataset and report the variable total size and chunk size?

If you are using dask, the dask dashboard is invaluable for debugging.


jklymak commented Mar 4, 2021

Oh, it's not a context manager? Oops.

Upping the memory to 16 GB does fix the problem. However, I am concerned about what will happen with more iterations. I can definitely share the data if this continues to be an issue, but thanks for the suggestions!


rabernat commented Mar 4, 2021

Please post the repr: print(ds).


jklymak commented Mar 4, 2021

<xarray.Dataset>
Dimensions:  (XC: 480, XG: 480, YC: 400, YG: 400, Z: 400, Zl: 400, Zp1: 401, Zu: 400, time: 3)
Coordinates: (12/36)
  * XC       (XC) float64 1.5e+03 4.5e+03 7.5e+03 ... 1.436e+06 1.438e+06
  * YC       (YC) float64 1.5e+03 4.5e+03 7.5e+03 ... 1.196e+06 1.198e+06
  * XG       (XG) float64 0.0 3e+03 6e+03 ... 1.431e+06 1.434e+06 1.437e+06
  * YG       (YG) float64 0.0 3e+03 6e+03 ... 1.191e+06 1.194e+06 1.197e+06
  * Z        (Z) float64 -5.0 -15.0 -25.0 ... -3.975e+03 -3.985e+03 -3.995e+03
  * Zp1      (Zp1) float64 0.0 -10.0 -20.0 -30.0 ... -3.98e+03 -3.99e+03 -4e+03
    ...       ...
    rLowW    (YC, XG) float64 dask.array<chunksize=(400, 480), meta=np.ndarray>
    rLowC    (YC, XC) float64 dask.array<chunksize=(400, 480), meta=np.ndarray>
    rhoRef   (Z) float64 dask.array<chunksize=(400,), meta=np.ndarray>
    rSurfW   (YC, XG) float64 dask.array<chunksize=(400, 480), meta=np.ndarray>
    iter     (time) int64 dask.array<chunksize=(1,), meta=np.ndarray>
  * time     (time) timedelta64[ns] 0 days 1 days 2 days
Data variables:
    UVEL     (time, Z, YC, XG) float64 dask.array<chunksize=(1, 400, 400, 480), meta=np.ndarray>
    VVEL     (time, Z, YG, XC) float64 dask.array<chunksize=(1, 400, 400, 480), meta=np.ndarray>
    WVEL     (time, Zl, YC, XC) float64 dask.array<chunksize=(1, 400, 400, 480), meta=np.ndarray>
    THETA    (time, Z, YC, XC) float64 dask.array<chunksize=(1, 400, 400, 480), meta=np.ndarray>
    PHIHYD   (time, Z, YC, XC) float64 dask.array<chunksize=(1, 400, 400, 480), meta=np.ndarray>


rabernat commented Mar 4, 2021

So each chunk is 614 MB. That's pretty big but should work fine if you have enough memory.
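
(That figure follows directly from the chunk shape in the repr above: one float64 chunk of shape (1, 400, 400, 480).)

# bytes per chunk for a float64 variable chunked as (1, 400, 400, 480)
1 * 400 * 400 * 480 * 8   # = 614,400,000 bytes, i.e. about 614 MB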

Since the data are chunked in time, theoretically the writing should execute in a streaming manner and not require more memory for more timesteps. A couple of suggestions:

  • Use a dask distributed cluster and watch the dashboard. It is so useful for debugging.
  • Try writing to zarr instead of netcdf. It plays so much better with parallel writes. (In our group we have completely stopped using netcdf as an intermediate data format.) A rough sketch of both suggestions follows.
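
A minimal sketch of those two suggestions (the local Client and the zarr output path are placeholders, and the dataset names are reused from the script above):

from dask.distributed import Client
import xmitgcm as xm

client = Client()                      # local distributed cluster
print(client.dashboard_link)           # open this URL to watch the tasks

ds = xm.open_mdsdataset(data_dir, prefix=['spinup'], endian='=',
                        geometry='cartesian')
ds = ds.isel(YC=200, YG=200)

# zarr writes each dask chunk as a separate object, so the time steps
# can stream out one at a time instead of accumulating in memory
ds.to_zarr(f'../reduceddata/{runname}/SliceMid.zarr', mode='w')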


jklymak commented Mar 4, 2021

OK, I'll look into the dashboard.

As an experiment, running with 16 GB completed the write in 17 s; dropping to 8 GB required 50 s. These memory caps are set by the cluster: I need to request a memory size for shared nodes.

If I only write 2 out of 3 files, it takes 17 s with 8 GB. So, something funny is happening with memory.

I'll give zarr a try. I've just been using netcdf because...


rabernat commented Mar 4, 2021

Another good debugging technique is to just do a reduction, e.g. ds.mean().load(). That eliminates the writing part of the workflow and tests whether your chunk structure is computable.
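
Something like (a minimal sketch):

# computes the whole dask graph without touching the netCDF layer;
# if this blows up, the problem is the chunk structure, not the writer
print(ds.mean().load())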


jklymak commented Mar 4, 2021

OK, I'll make a bigger data set and report whether the memory needs stay the same or grow.

zarr seems to work about the same. I can imagine it is a lot better for parallel writes, but as a portable format to transfer somewhere else it seems suboptimal, because I would guess you want a tar first? (Sorry if this is turning into a chat; feel free to return to whatever else you might have been doing, I think I'm OK for now ;-))


jklymak commented Mar 7, 2021

Saving as zarr works. Loading the zarr with pure xarray and then trying to save it as netCDF fails on this machine, so I think we can safely rule out xmitgcm. Thanks for the help!
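
For the record, the failing step boils down to something like this pure-xarray round trip (paths hypothetical):

import xarray as xr

ds = xr.open_zarr(f'../reduceddata/{runname}/SliceMid.zarr')
# hangs / gets OOM-killed on this cluster
ds.to_netcdf(f'../reduceddata/{runname}/SliceMid.nc', engine='netcdf4')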

jklymak closed this as completed Mar 7, 2021

rabernat commented Mar 8, 2021

NetCDF / HDF5 does not play particularly well with distributed writing. It should be possible, but there are lots of things that can go wrong. That was one of the motivating factors that inspired the creation of Zarr in the first place.

If you want to store zarr in a single file, .zip is the way to go. Zarr supports reading / writing directly to zip files (although locking is required for writes).
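
For example (a sketch against the zarr v2 ZipStore API; the file name is hypothetical):

import xarray as xr
import zarr

# write the whole zarr hierarchy into a single zip file
store = zarr.ZipStore('SliceMid.zarr.zip', mode='w')
ds.to_zarr(store)
store.close()

# reading back works directly from the zip
ds2 = xr.open_zarr(zarr.ZipStore('SliceMid.zarr.zip', mode='r'))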
