Segmentation fault when writing to netcdf with dask-enabled xarray dataset #1172
An important addition -- the following also causes low-level system errors (bus error or seg fault, I can't remember which), so the problem does not originate in to_netcdf per se, but rather in the chunking / loading of the dataset.
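The snippet referenced here was lost in this extract; a minimal sketch of what a crash with no to_netcdf involved might look like (file name and chunk sizes hypothetical):

```python
import xarray as xr

# Simply pulling the dask-chunked dataset into memory is enough to crash.
ds = xr.open_dataset("input.nc", chunks={"time": 100})
ds.load()
```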
I'm using 0.8.2. Thanks for the issue link; I had read through that, but since I am not using open_mfdataset and lock=True did not fix my issue, I figured my problem was subtly different. Some incompatibility / race condition when using netcdf4 and dask together? This might be a tricky problem to track down, as my code did complete without seg faulting when my dimensions were subtly different (about 10% smaller in space and 20% smaller in time), even on a server with half as much memory. Bleck.
Yes, this is different. I think this is a bug in how we write netCDF files. Currently, we always use a new thread lock in …
Thanks for the info! Given this potential bug, is the engine='pynio' solution acceptable, or is it just working for me for now and liable to fail in some subtly different configuration / data size? Another possible solution suggested by a colleague: add the following at the top to enforce single-threaded reads and writes.
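The colleague's snippet was not preserved in this extract; a minimal sketch of the idea, forcing dask onto its single-threaded scheduler (using the modern dask API):

```python
import dask

# Run every dask graph on the synchronous, single-threaded scheduler,
# so netCDF4/HDF5 is never entered from two threads at once.
dask.config.set(scheduler="synchronous")
```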
It is possible that pynio is linking to an independent HDF5 installation, which should eliminate the need for a shared lock. But if that's not the case, then you probably just got lucky.
…writing

Fixes pydata#1172. The serializable lock will be useful for dask.distributed or multi-processing (xref pydata#798, pydata#1173, among others).
…ing (#1179)

* Switch to shared Lock (SerializableLock if possible) for reading and writing. Fixes #1172. The serializable lock will be useful for dask.distributed or multi-processing (xref #798, #1173, among others).
* Test serializable lock
* Use conda-forge for builds
* remove broken/fragile .test_lock
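A minimal sketch of the idea behind this fix, assuming dask's SerializableLock; the helper name and write target below are hypothetical, only dask.utils.SerializableLock is the real class:

```python
from dask.utils import SerializableLock

# One shared, process-wide lock for every operation that touches HDF5,
# instead of a fresh threading.Lock per read or write. Unlike a plain
# threading.Lock, a SerializableLock can be pickled, so it also works
# with dask.distributed and multiprocessing schedulers.
HDF5_LOCK = SerializableLock()

def locked_write(target, region, values):
    # hypothetical helper: funnel all HDF5 writes through the shared lock
    with HDF5_LOCK:
        target[region] = values
```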
I have a 4 GB netcdf file and am running on a machine with 32 GB of memory. The following works just fine without error on this large memory machine:
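The original code blocks did not survive this extract; a minimal sketch of the working no-dask version (file names hypothetical):

```python
import xarray as xr

# Eager read: no chunks argument, so no dask is involved.
ds = xr.open_dataset("input.nc")
ds.to_netcdf("output.nc")
```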
This dask + pynio approach also works correctly:
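Again reconstructed as a hedged sketch (file name and chunk sizes hypothetical):

```python
import xarray as xr

# Lazy, dask-chunked read through the pynio backend.
ds = xr.open_dataset("input.nc", engine="pynio", chunks={"time": 100})
ds.to_netcdf("output.nc")
```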
But the following dask + default engine (netcdf4, probably?) approach slowly sucks up all the system memory, writes out a file twice as large as it should be with variable values that are extremely large, and then fails with a seg fault, bus error, or other low-level system errors we'd rather not be seeing in Python! Adding lock=True to the open_dataset call does not help.
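A sketch of the failing combination, under the same hypothetical names:

```python
import xarray as xr

# Lazy, dask-chunked read with the default backend (netcdf4), then write.
# This is the combination that eats memory and eventually segfaults;
# passing lock=True to open_dataset does not help.
ds = xr.open_dataset("input.nc", chunks={"time": 100})
ds.to_netcdf("output.nc")
```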
I have two workable solutions to my problem (run without dask, because I have a lot of memory available, or use engine='pynio'), but this error was hard to track down, so I thought you would want to know. I'd also be glad to hear if I missed something in the docs and the all-too-common user error is to blame =)