
Datamodel I/O efficiencies causing error in copy() #5699

Closed
PaddyKavanagh opened this issue Feb 5, 2021 · 7 comments · Fixed by #5787

Comments

@PaddyKavanagh

There seems to be a datamodel issue when running the ramp output of the development version of MIRISim through any pipeline build >= 7.6, which we can't pin down. It occurs when the dq_init step tries to make a copy of the ramp datamodel. The error can be recreated using the following code snippet with the ramp file available here (the traceback is at the end of this report):

https://www.dropbox.com/s/nxzgahfveejzhqy/det_image_seq1_MIRIMAGE_F1000Wexp1.fits?dl=0

from jwst import datamodels
dm = datamodels.RampModel('det_image_seq1_MIRIMAGE_F1000Wexp1.fits')
dmc = dm.copy()

I found an issue with a similar error, though not identical, in the following:

#5527

I tried the fix of setting the SKIP_FITS_UPDATE environment variable to false to disable some of the I/O efficiencies, and this fixes the problem described above, so the two appear to be related.
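For reference, a minimal sketch of how that workaround can be applied, assuming the environment variable is read when the model is opened (the jwst calls are commented out since they need the ramp file from this report):

```python
import os

# Workaround referenced in #5527: disable the FITS-update I/O shortcut.
# Set this before opening the model so it is seen when the file is read.
os.environ["SKIP_FITS_UPDATE"] = "false"

# from jwst import datamodels
# dm = datamodels.RampModel('det_image_seq1_MIRIMAGE_F1000Wexp1.fits')
# dmc = dm.copy()  # no longer raises ValueError with the workaround
```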

==========================================
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/patrickkavanagh/anaconda3/anaconda3/envs/miricle.devel/lib/python3.8/site-packages/jwst/datamodels/model_base.py", line 352, in copy
    self.clone(result, self, deepcopy=True, memo=memo)
  File "/Users/patrickkavanagh/anaconda3/anaconda3/envs/miricle.devel/lib/python3.8/site-packages/jwst/datamodels/model_base.py", line 331, in clone
    instance = copy.deepcopy(source._instance, memo=memo)
  File "/Users/patrickkavanagh/anaconda3/anaconda3/envs/miricle.devel/lib/python3.8/copy.py", line 172, in deepcopy
    y = _reconstruct(x, memo, *rv)
  File "/Users/patrickkavanagh/anaconda3/anaconda3/envs/miricle.devel/lib/python3.8/copy.py", line 296, in _reconstruct
    value = deepcopy(value, memo)
  File "/Users/patrickkavanagh/anaconda3/anaconda3/envs/miricle.devel/lib/python3.8/copy.py", line 151, in deepcopy
    copier = getattr(x, "__deepcopy__", None)
  File "/Users/patrickkavanagh/anaconda3/anaconda3/envs/miricle.devel/lib/python3.8/site-packages/asdf/tags/core/ndarray.py", line 362, in __getattr__
    return getattr(self._make_array(), attr)
  File "/Users/patrickkavanagh/anaconda3/anaconda3/envs/miricle.devel/lib/python3.8/site-packages/asdf/tags/core/ndarray.py", line 267, in _make_array
    self._array = np.ndarray(
ValueError: strides is incompatible with shape of requested array and size of buffer
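For context, the final ValueError can be reproduced in isolation with plain numpy, assuming a buffer sized exactly for the shape above but paired with the strides from the bad metadata:

```python
import numpy as np

shape = (1, 10, 1024, 1032)
buf = bytearray(4 * 1 * 10 * 1024 * 1032)   # buffer sized exactly for this shape
bad_strides = (52838400, 5283840, 4128, 4)  # strides recorded in the file

error = None
try:
    # These strides address bytes beyond the end of the buffer, so
    # numpy refuses to construct the array.
    np.ndarray(shape, dtype=">f4", buffer=buf, strides=bad_strides)
except ValueError as exc:
    error = exc
print(error)  # strides is incompatible with shape of requested array and size of buffer
```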

@stscijgbot-jp
Collaborator

This issue is tracked on JIRA as JP-1911.

@stscijgbot-jp
Collaborator

Comment by Jonathan Eisenhamer on JIRA:

Of course I cannot find it now, but there was some issue where the error was something like "not an ndarray". This is symptomatic of a case where, due to lazy loading, an "array" that hasn't been loaded yet is passed directly to a numpy function, which numpy has no idea what to do with. The solution there was to force-load the array, by referencing it, before the call.
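A toy illustration of that failure mode (LazyArray below is a stand-in for asdf's NDArrayType, not the real class): the real ndarray is only built on first access, so simply referencing it forces the load before anything else sees the object.

```python
import numpy as np

class LazyArray:
    """Toy stand-in for a lazily loaded array: the real ndarray is
    only built on first access, mimicking asdf's deferred loading."""

    def __init__(self, shape):
        self._shape = shape
        self._array = None

    def _make_array(self):
        if self._array is None:
            self._array = np.zeros(self._shape, dtype=np.float32)
        return self._array

    def __getattr__(self, attr):
        # Any attribute lookup (e.g. .shape) materializes the array
        return getattr(self._make_array(), attr)

lazy = LazyArray((2, 3))
forced = lazy._make_array()  # "force-load by referencing" before the numpy call
print(forced.shape)          # (2, 3)
```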

@eslavich
Collaborator

The file on dropbox seems to have a bad strides property in its embedded ASDF tree:

data: !core/ndarray-1.0.0
  source: fits:SCI,1
  datatype: float32
  byteorder: big
  shape: [1, 10, 1024, 1032]
  strides: [52838400, 5283840, 4128, 4]

float32 is a 4-byte type, so the strides in the first dimension suggest 52838400 / 4 = 13209600 elements in the array. But by the shape (and the actual number of elements of the array) the number of elements is 10 * 1024 * 1032 = 10567680.
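The arithmetic checks out directly in numpy; as a side observation (not stated in the file itself), the bad strides are exactly those of a contiguous parent array with 1280 rows, which already hints that the array was a view over something larger:

```python
import numpy as np

shape = (1, 10, 1024, 1032)
bad_strides = (52838400, 5283840, 4128, 4)

# Strides numpy assigns to a C-contiguous float32 array of this shape
good_strides = np.zeros(shape, dtype=np.float32).strides
print(good_strides)         # (42270720, 4227072, 4128, 4)

print(bad_strides[0] // 4)  # 13209600 elements implied by the first stride
print(10 * 1024 * 1032)     # 10567680 elements actually in the array

# The bad strides match a contiguous parent of shape (1, 10, 1280, 1032):
print(1280 * 1032 * 4)      # 5283840 == bad_strides[1]
```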

@PaddyKavanagh are you able to give me instructions for creating this file? This smells like a bug in datamodels or asdf, but I haven't been able to reproduce it by creating simple files with the same array.

@eslavich
Collaborator

eslavich commented Feb 19, 2021

Never mind, I was able to reproduce without datamodels:

from astropy.io import fits
import asdf
import numpy as np

data = np.zeros((10, 10))
data_view = data[:, :5]

hdul = fits.HDUList([fits.PrimaryHDU(), fits.ImageHDU(data_view)])
with asdf.fits_embed.AsdfInFits(hdulist=hdul) as af:
    af["data"] = hdul[-1].data
    af.write_to("test.fits", overwrite=True)

with asdf.open("test.fits") as af:
    np.array(af["data"])

The asdf library isn't properly serializing an array referenced in the surrounding .fits file that is a view over some larger array.
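Until a fix lands, one possible workaround on the writing side (a sketch, not the eventual fix in asdf) would be to embed a contiguous copy of the view, so that the strides recorded in the tree match the buffer that actually gets serialized:

```python
import numpy as np

data = np.zeros((10, 10), dtype=np.float32)
data_view = data[:, :5]                  # non-contiguous view: strides (40, 4)
safe = np.ascontiguousarray(data_view)   # contiguous copy: strides (20, 4)

print(data_view.strides, safe.strides)   # (40, 4) (20, 4)
# af["data"] = safe  # embedding the copy avoids bad strides in the ASDF tree
```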

@stscijgbot-jp
Collaborator

Comment by James Davies on JIRA:

So basically this is caused by writing out a Numpy view to a datamodel, and asdf doesn't serialize it correctly?

Is there a reason we should allow writing out views? That seems ripe for inadvertent mistakes.
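If views were to be rejected at write time, a guard along these lines could work (a hypothetical helper, not an actual datamodels or asdf API):

```python
import numpy as np

def is_unsafe_to_embed(arr):
    """Hypothetical check: True for arrays that are views on another
    array or are not C-contiguous, i.e. whose strides may not match
    the buffer that actually gets serialized."""
    return arr.base is not None or not arr.flags["C_CONTIGUOUS"]

a = np.zeros((10, 10))
print(is_unsafe_to_embed(a))          # False: owns its contiguous buffer
print(is_unsafe_to_embed(a[:, :5]))   # True: non-contiguous view
```

Note this check is conservative: some harmless contiguous views (e.g. the result of a reshape) would also be flagged, so a real guard might copy rather than reject.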

@eslavich
Collaborator

@PaddyKavanagh I have a PR open on the asdf repo with a proposed fix. Can you try regenerating your file with my branch installed?

pip install git+https://github.com/eslavich/asdf.git@JP-1911-fix-array-views

@stscijgbot-jp
Collaborator

Comment by Howard Bushouse on JIRA:

Fixed by #5787
