Description
The following error just came up when running sweets:
lib/python3.12/site-packages/h5py/_hl/files.py", line 238, in make_fid
    fid = h5f.open(name, flags, fapl=fapl)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
  File "h5py/h5f.pyx", line 102, in h5py.h5f.open
OSError: Unable to synchronously open file (truncated file: eof = 96, sblock->base_addr = 0, stored_eof = 2048)
The truncation is confirmed by checking the file sizes (short sample here):
-rw-rw-r-- 1 renierlgv renierlgv 96 Dec 3 14:45 t110_234946_iw2/20200109/t110_234946_iw2_20200109.h5
-rw-rw-r-- 1 renierlgv renierlgv 96 Dec 3 14:45 t110_234946_iw3/20200109/t110_234946_iw3_20200109.h5
-rw-rw-r-- 1 renierlgv renierlgv 96 Dec 3 14:45 t110_234945_iw1/20200109/t110_234945_iw1_20200109.h5
-rw-rw-r-- 1 renierlgv renierlgv 96 Dec 3 14:45 t110_234947_iw3/20200109/t110_234947_iw3_20200109.h5
-rw-rw-r-- 1 renierlgv renierlgv 252M Dec 3 14:52 t110_234934_iw1/20200202/t110_234934_iw1_20200202.h5
-rw-rw-r-- 1 renierlgv renierlgv 204K Dec 3 14:53 t110_234944_iw1/20200309/t110_234944_iw1_20200309.h5
-rw-rw-r-- 1 renierlgv renierlgv 204K Dec 3 14:53 t110_234945_iw1/20200309/t110_234945_iw1_20200309.h5
-rw-rw-r-- 1 renierlgv renierlgv 204K Dec 3 14:53 t110_234946_iw1/20200309/t110_234946_iw1_20200309.h5
-rw-rw-r-- 1 renierlgv renierlgv 204K Dec 3 14:53 t110_234935_iw1/20200414/t110_234935_iw1_20200414.h5
-rw-rw-r-- 1 renierlgv renierlgv 193K Dec 3 14:54 t110_234933_iw3/20200402/t110_234933_iw3_20200402.h5
-rw-rw-r-- 1 renierlgv renierlgv 193K Dec 3 14:54 t110_234932_iw3/20200402/t110_234932_iw3_20200402.h5
-rw-rw-r-- 1 renierlgv renierlgv 193K Dec 3 14:55 t110_234945_iw3/20200309/t110_234945_iw3_20200309.h5
Checking at ASF, data is actually available for this day, so my guess is that one or more workers are not doing their job as they should.
Deleting the root folders of the truncated .h5 files and restarting "sweets run" seems to solve the issue. Still, it would be highly appreciated to have a sanity check at this point that automatically re-runs the run_files for those dates whose outputs do not match the expected size (~250 MB). Would that be possible to implement?
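To illustrate the kind of sanity check I have in mind, here is a minimal sketch using only the Python standard library. It is not part of sweets: the function name `find_suspect_h5` and the `min_bytes` threshold are hypothetical, and the default of 2048 bytes is only an assumption taken from the `stored_eof = 2048` value in the error above. It also assumes the files have no HDF5 user block, so the signature sits at offset 0.

```python
from pathlib import Path

# Standard 8-byte signature at the start of an HDF5 file (no user block assumed).
HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"


def find_suspect_h5(root, min_bytes=2048):
    """Return .h5 files under `root` that look truncated or invalid.

    A file is flagged when it is smaller than `min_bytes` (hypothetical
    threshold, tune it to the expected output size) or when it does not
    start with the HDF5 signature.
    """
    suspects = []
    for path in sorted(Path(root).rglob("*.h5")):
        size = path.stat().st_size
        with path.open("rb") as f:
            header = f.read(len(HDF5_SIGNATURE))
        if size < min_bytes or header != HDF5_SIGNATURE:
            suspects.append(path)
    return suspects


if __name__ == "__main__":
    # Print the parent folder of each suspect file, i.e. the date folder
    # that would need to be deleted and re-run.
    for bad in find_suspect_h5("."):
        print(bad.parent)
```

The 96-byte files listed above would be flagged by the size test alone. A stricter check could try to open each file with h5py and catch the `OSError`, at the cost of an extra dependency in the check itself.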