Which (FITS) header should be stored in Spectrum1D.meta (and where)? #617

@dhomeier

Loaders for FITS files are saving the full information from the header cards in Spectrum1D.meta['header']. In #608 (comment) I noted some ambiguity as to which header should be stored.
Most of the default_loaders for such formats, namely apogee, hst_*, muscles, subaru_pfs and sdss spec_loader, are pulling the info from the FITS header of the first HDU, while the spectral data are read from one (or several) of the extension HDUs.
This primary HDU header will usually contain general information on the target, observing run etc., while the HDU with the spectrum will include more specific info on the dataset. In principle both could be of interest. Should there be a general recommendation for new default (or custom) loaders, and a fix to existing ones, on how to handle this?

I see the following options:

  1. Leave it at the primary HDU header as with the majority of current default_loaders.
    Pros: no API changes
    Cons: discarding potentially useful information; inconsistent with Astropy's io.fits behaviour for Table (see below)

  2. Always read the header for the HDU containing the data.
    This is already implicitly done by generic_spectrum_from_table and spectrum_from_column_mapping as they are loading table.meta under the hood.
    Pros: Matches the Table.meta data; possibility to use the io.fits mechanism to filter relevant keywords (i.e. excluding REMOVE_KEYWORDS and all those that have already been used in defining column formats and units).
    Cons: Losing info from the primary HDU header (among the current formats with data in a BINTABLE extension, at least the Apogee, HST, MUSCLES and SDSS also have potentially interesting information in the primary HDU).
    Some formats, e.g. apogee, are parsing spectrum data from several different extensions.

  3. Combine all header info from primary HDU and any extension HDU from which data are read.
    This is already implemented in the jwst_reader by using the (FITS) header.extend method to update the primary header with the data extension header.
    Pros: No or minimal loss of information.
    Cons: Possibly excessive accumulation of values in spectrum.meta['header']; writing the spectrum back will produce a FITS file with different headers (but currently there is no guarantee that any of the loaders will write back identically formatted files anyway).

I lean towards the third option, using a similar implementation to jwst_reader. It might still be preferable to use the Astropy Table.meta dict for updating to take advantage of the io.fits mechanism for stripping excess cards.

Addendum
In the above I was assuming that the Spectrum1D format always has the header info collected in one dictionary within the meta dictionary as Spectrum1D.meta['header']; this is also suggested in the docs for creating a custom loader by prescribing
meta = {'header': header}
and followed by most of the default_loaders.
But in fact neither parsing_utils nor jwst_reader follows this scheme; both instead put all saved header cards directly into Spectrum1D.meta.
I therefore suggest first settling on a consistent scheme for organising the meta dict.
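To illustrate the inconsistency, the two layouts currently in use look roughly like this (plain dicts with made-up keywords stand in for the real header objects). `get_header` is a hypothetical helper, not part of specutils, showing what downstream code has to do as long as both schemes coexist.

```python
# Layout prescribed by the custom loader docs: cards nested under 'header'.
meta_nested = {'header': {'OBJECT': 'DEMO-STAR', 'EXPTIME': 120.0}}

# Layout produced when cards are written directly into meta.
meta_flat = {'OBJECT': 'DEMO-STAR', 'EXPTIME': 120.0}


def get_header(meta):
    """Return the header mapping for either layout (hypothetical helper)."""
    # Fall back to the meta dict itself if there is no 'header' entry.
    return meta.get('header', meta)


object_nested = get_header(meta_nested)['OBJECT']
object_flat = get_header(meta_flat)['OBJECT']
```

With a single agreed layout this kind of fallback would not be needed.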
