Sanity check on gslcs #127

@renierlgv

The following error came up while running sweets:

lib/python3.12/site-packages/h5py/_hl/files.py", line 238, in make_fid
    fid = h5f.open(name, flags, fapl=fapl)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
  File "h5py/h5f.pyx", line 102, in h5py.h5f.open
OSError: Unable to synchronously open file (truncated file: eof = 96, sblock->base_addr = 0, stored_eof = 2048)

The truncation is confirmed by the file sizes (short sample here):

-rw-rw-r-- 1 renierlgv renierlgv   96 Dec 3 14:45 t110_234946_iw2/20200109/t110_234946_iw2_20200109.h5
-rw-rw-r-- 1 renierlgv renierlgv   96 Dec 3 14:45 t110_234946_iw3/20200109/t110_234946_iw3_20200109.h5
-rw-rw-r-- 1 renierlgv renierlgv   96 Dec 3 14:45 t110_234945_iw1/20200109/t110_234945_iw1_20200109.h5
-rw-rw-r-- 1 renierlgv renierlgv   96 Dec 3 14:45 t110_234947_iw3/20200109/t110_234947_iw3_20200109.h5
-rw-rw-r-- 1 renierlgv renierlgv 252M Dec 3 14:52 t110_234934_iw1/20200202/t110_234934_iw1_20200202.h5
-rw-rw-r-- 1 renierlgv renierlgv 204K Dec 3 14:53 t110_234944_iw1/20200309/t110_234944_iw1_20200309.h5
-rw-rw-r-- 1 renierlgv renierlgv 204K Dec 3 14:53 t110_234945_iw1/20200309/t110_234945_iw1_20200309.h5
-rw-rw-r-- 1 renierlgv renierlgv 204K Dec 3 14:53 t110_234946_iw1/20200309/t110_234946_iw1_20200309.h5
-rw-rw-r-- 1 renierlgv renierlgv 204K Dec 3 14:53 t110_234935_iw1/20200414/t110_234935_iw1_20200414.h5
-rw-rw-r-- 1 renierlgv renierlgv 193K Dec 3 14:54 t110_234933_iw3/20200402/t110_234933_iw3_20200402.h5
-rw-rw-r-- 1 renierlgv renierlgv 193K Dec 3 14:54 t110_234932_iw3/20200402/t110_234932_iw3_20200402.h5
-rw-rw-r-- 1 renierlgv renierlgv 193K Dec 3 14:55 t110_234945_iw3/20200309/t110_234945_iw3_20200309.h5

I checked at the ASF and data is actually available for this day, so my guess is that one or more workers are not doing their job as they should.

Deleting the root folders of the truncated .h5 files and restarting "sweets run" seems to solve the issue. Still, it would be highly appreciated to have a sanity check at this point that automatically re-runs the run_files for those dates whose outputs do not match the expected size (~250 MB). Would that be possible to implement?
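As a stopgap until something like this exists in sweets, here is a minimal sketch of the kind of check I have in mind. It is not part of sweets: the function names `find_small_h5` and `can_open_h5` and the `min_bytes` default are my own assumptions. The default of 2048 is taken only from the `stored_eof = 2048` in the traceback above. A threshold closer to the expected output size may be more appropriate.

```python
from pathlib import Path


def find_small_h5(root, min_bytes=2048):
    """Return sorted paths of .h5 files under `root` smaller than `min_bytes`.

    `min_bytes` is a hypothetical threshold, not a value defined by sweets.
    """
    return sorted(
        p for p in Path(root).rglob("*.h5") if p.stat().st_size < min_bytes
    )


def can_open_h5(path):
    """Return True if h5py can open `path` read-only (requires h5py)."""
    import h5py  # imported lazily so the size check works without h5py

    try:
        with h5py.File(path, "r"):
            return True
    except OSError:
        # Truncated files raise OSError on open, as in the traceback above.
        return False


if __name__ == "__main__":
    # Only report suspicious files; nothing is deleted or re-run here.
    for path in find_small_h5("."):
        print(path)
```

Run from the directory that holds the burst folders, this prints the 96-byte files from the listing above. Deleting their folders and restarting "sweets run" would remain a manual step.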
