
exposures-jura contains some NaN values #2333

Closed
weaverba137 opened this issue Aug 21, 2024 · 8 comments · Fixed by #2336

@weaverba137
Member

While working on #2251, we discovered that the top-level exposures-jura files, specifically the exposure tables in those files, contain some NaN values.

The list below should contain all of the exposures with NaN values. You can see that some of the columns are masked.

 NIGHT   EXPID  TILEID  TILERA TILEDEC ...   SKY_MAG_Z_SPEC        EFFTIME_GFA          EFFTIME_DARK_GFA    EFFTIME_BRIGHT_GFA EFFTIME_BACKUP_GFA
 int32   int32  int32  float64 float64 ...      float64              float64                float64              float64            float64      
-------- ------ ------ ------- ------- ... ------------------ ---------------------- ---------------------- ------------------ ------------------
20210202  74307  80679   111.0    41.5 ... 19.993767541215583                     --                     --                 --                 --
20210308  79769  80731  192.86   27.13 ... 19.575426801000127                     --                     --                 --                 --
20210404  83420  80697   145.0  32.375 ...  19.65692757225773 1.0075802546981036e-05 1.0075802546981036e-05                 --                 --
20211125 110852  42262   5.991   4.792 ... 19.157658679585744                     --                     --                 --                 --
20230815 190752  40203 320.573  26.421 ... 19.386191940942705                     --                     --                 --                 --
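
For reference, here is a minimal sketch of this kind of check (assuming the standard jura path and that the exposure HDU is named EXPOSURES; this is not the exact command that produced the table above):

import numpy as np
import fitsio
from astropy.table import Table

# Read the exposure table from the top-level jura summary file.
exposures = Table(fitsio.read("/global/cfs/cdirs/desi/spectro/redux/jura/exposures-jura.fits", "EXPOSURES"))

# Flag any row with a non-finite value in a floating-point column.
bad = np.zeros(len(exposures), dtype=bool)
for name in exposures.colnames:
    if exposures[name].dtype.kind == "f":
        bad |= ~np.isfinite(exposures[name])

print(exposures["NIGHT", "EXPID", "TILEID"][bad])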

This is the list of columns in exposures-jura that contain NaN values:

TRANSPARENCY_GFA
SEEING_GFA
FIBER_FRACFLUX_GFA
FIBER_FRACFLUX_ELG_GFA
FIBER_FRACFLUX_BGS_GFA
FIBERFAC_GFA
FIBERFAC_ELG_GFA
FIBERFAC_BGS_GFA
EFFTIME_GFA
EFFTIME_DARK_GFA
EFFTIME_BRIGHT_GFA
EFFTIME_BACKUP_GFA

The origin of these invalid values should be investigated as a high priority in preparation for kibo. As part of the investigation, we need to look into the tsnr_afterburner to see where these columns come from.

We also need to check the FRAMES table for invalid values (part of exposures-jura.fits).

Attn: @sbailey, @araichoor, @akremin, et al.

@araichoor
Contributor

thanks for initiating that.
some complementary information:

in short:
these five exposures are poor-quality data (from sv1 or mainbackup), where the gfa pipeline returns NaN's for some reason.
I've not investigated why; not sure it's worth it.
can't we leave the NaN's? if not, we could patch with values like -99, which is clearly a non-physical value for those quantities.

in detail:

about the expids:
these five exposures are poor-quality data; from jura:

EXPID   NIGHT    FAFLAVOR       EXPTIME            EFFTIME_SPEC    
------ -------- ---------- ------------------ ---------------------
 74307 20210202  sv1elgqso    566.04931640625    1.3787630796432495
 79769 20210308     sv1ssv 300.06988525390625 5.455439168144949e-05
 83420 20210404     sv1elg   80.4530029296875 2.655843309185002e-05
110852 20211125 mainbackup  601.6654663085938    17.180953979492188
190752 20230815 mainbackup  603.0149536132812    0.6969366669654846

about the columns:
those all are gfa columns.
I believe that these columns are propagated from reading the gfa files (here, code version 0.63.0, the one used for jura):

gfa_table = None
if args.gfa_proc_dir is not None :
    try :
        gfa_table = read_gfa_data(args.gfa_proc_dir)
    except PermissionError as e :
        log.error(e)
        log.error("could not read some files in {}".format(args.gfa_proc_dir))
        args.gfa_proc_dir = None

if args.gfa_proc_dir is not None and gfa_table is not None :
    e2i = {e:i for i,e in enumerate(gfa_table["EXPID"])}
    jj=[]
    ii=[]
    for j,e in enumerate(tsnr2_expid_table["EXPID"]) :
        if e in e2i :
            jj.append(j)
            ii.append(e2i[e])
    cols=["TRANSPARENCY","FWHM_ASEC","FIBER_FRACFLUX","FIBER_FRACFLUX_ELG","FIBER_FRACFLUX_BGS","FIBERFAC","FIBERFAC_ELG","FIBERFAC_BGS","SKY_MAG_AB","AIRMASS",]
    for col in cols :
        if col in gfa_table.dtype.names :
            if col == "FWHM_ASEC" :
                col2 = "SEEING_GFA"
            else :
                col2 = col+"_GFA"
            tsnr2_expid_table[col2]=np.zeros(len(tsnr2_expid_table),dtype=float)
            tsnr2_expid_table[col2][jj] = gfa_table[col][ii]

    if args.gfa_proc_dir is not None and (args.skymags is not None or args.compute_skymags) :
        efftime_dark, efftime_bright, efftime_backup = compute_efftime(tsnr2_expid_table[jj])
        for col in ["EFFTIME_DARK_GFA","EFFTIME_BRIGHT_GFA","EFFTIME_BACKUP_GFA"] :
            tsnr2_expid_table[col]=np.zeros(len(tsnr2_expid_table),dtype=float)
        tsnr2_expid_table["EFFTIME_DARK_GFA"][jj]=efftime_dark
        tsnr2_expid_table["EFFTIME_BRIGHT_GFA"][jj]=efftime_bright
        tsnr2_expid_table["EFFTIME_BACKUP_GFA"][jj]=efftime_backup
        goaltype=tsnr2_expid_table["GOALTYPE"]
        tsnr2_expid_table["EFFTIME_GFA"] = tsnr2_expid_table["EFFTIME_DARK_GFA"] # default
        tsnr2_expid_table["EFFTIME_GFA"][goaltype=="bright"]=tsnr2_expid_table["EFFTIME_BRIGHT_GFA"][goaltype=="bright"]
        tsnr2_expid_table["EFFTIME_GFA"][goaltype=="backup"]=tsnr2_expid_table["EFFTIME_BACKUP_GFA"][goaltype=="backup"]

the EFFTIME_* columns come from a computation based on those columns, hence they also get NaN's:
https://github.com/desihub/desispec/blob/0.63.0/py/desispec/efftime.py#L13
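
As a toy illustration of that propagation (this is not the actual compute_efftime formula linked above; numbers loosely follow the tables in this thread): any NaN in the GFA inputs carries straight through the arithmetic into the EFFTIME_* values.

import numpy as np

# Toy efftime-like quantity: exposure time scaled by throughput-like GFA terms.
exptime = np.array([566.0, 300.1, 80.5])
transparency = np.array([0.059, np.nan, np.nan])
fiber_fracflux = np.array([0.632, 0.261, np.nan])

toy_efftime = exptime * transparency**2 * fiber_fracflux  # any NaN input -> NaN output
print(toy_efftime)  # approx. [1.25 nan nan]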

for completeness, reading the gfa_table with the same routine and looking at those columns for those expids:

EXPID   NIGHT       TRANSPARENCY         FWHM_ASEC         FIBER_FRACFLUX    FIBER_FRACFLUX_ELG  FIBER_FRACFLUX_BGS         FIBERFAC            FIBERFAC_ELG          FIBERFAC_BGS     
------ -------- -------------------- ------------------ ------------------- ------------------- -------------------- --------------------- --------------------- ----------------------
 74307 20210202 0.059126717922222294 0.9789517068362219  0.6320627694913908 0.44970044302469636  0.20504407421980816                   nan                   nan                    nan
 79769 20210308                  nan 1.8761397170576326  0.2613052900585784 0.21108856454109742  0.10411062155477979                   nan                   nan                    nan
 83420 20210404                  nan                nan                 nan                 nan                  nan 0.0007895831162312318 0.0006804336788210475 0.00030002771194605096
110852 20211125   0.8780836223539111   3.20778365614237 0.11836103479224255 0.10716889904754712 0.057902263978654445                   nan                   nan                    nan
190752 20230815  0.04454787890451631 1.0542636639223057   0.572109104821019  0.4136182060095519  0.19035298877013784                   nan                   nan                    nan

@araichoor
Contributor

and about the FRAMES extension of exposures-jura.fits: I don't find any NaN's in there.

I did the following check:

import numpy as np
import fitsio
from astropy.table import Table

d = Table(fitsio.read("/global/cfs/cdirs/desi/spectro/redux/jura/exposures-jura.fits", "FRAMES"))
for key in d.colnames:
    if isinstance(d[key][0], str):
        continue
    sel = ~np.isfinite(d[key])
    if sel.sum() > 0:
        print("{}\t{}".format(key, d["EXPID"][sel].tolist()))

@sbailey
Contributor

sbailey commented Aug 21, 2024

IIUC all NaN values come from the GFA pipeline when its outputs are merged into the exposures table. I don't think any of these are coming from the spectro pipeline itself. Options:

  1. Fix upstream and rerun so that the GFA pipeline doesn't produce NaNs. This is best if possible, but might not be viable for either technical or timescale reasons for Kibo.
  2. Intercept in the tsnr_afterburner and replace with dummy values (e.g. -99); be careful to also update any exposures -> tiles summary statistics to filter out these dummy values so that they don't mess up e.g. sum(EFFTIME_GFA). (A sketch of this filtering is below.)
  3. Leave the NaNs as-is in the files, and intercept and replace with dummy values when loading the DB; arguably a NaN is a valid value for missing data.
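
A minimal sketch of what option 2 implies, using a toy table and a -99 sentinel (column and variable names are illustrative, not the actual tsnr_afterburner code):

import numpy as np
from astropy.table import Table

SENTINEL = -99.0

# Toy exposures table standing in for the real one.
exposures = Table({"EXPID": [74307, 79769, 83420],
                   "EFFTIME_GFA": [np.nan, np.nan, 1.0e-05]})

# In the afterburner: replace NaN with the sentinel before writing.
values = np.asarray(exposures["EFFTIME_GFA"], dtype=float)
exposures["EFFTIME_GFA"] = np.where(np.isfinite(values), values, SENTINEL)

# In the exposures -> tiles summary: exclude sentinel rows from aggregates,
# otherwise sum(EFFTIME_GFA) would be biased by the -99 entries.
good = exposures["EFFTIME_GFA"] != SENTINEL
print(exposures["EFFTIME_GFA"][good].sum())  # 1e-05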

In practice these GFA quantities are mostly informative and used for operations debugging, since the GFAs feed the exposure time calculator. i.e. if we've got a quick and solid fix, ok, but I don't think this should delay Kibo. Also, if we confirm that all the NaNs are coming from GFA pipeline inputs being merged into the exposures table, that could be fixed and rerun while Kibo is running, since the exposures table isn't generated until after all of the tile-based processing is done.

@weaverba137
Member Author

Thank you for contributing to this discussion. This is getting interesting. Here are some preliminary findings:

  1. For patching these values in jura, the best value to use is zero (0.0). Why?

    • The NaNs are masked, so summing over the column treats them as zero (see the sketch after this list). This was probably not the intended behavior of that code snippet, but that is what actually happened in jura, so we should just use that value.
    • Zero is perfectly valid for loading the database.
  2. What are we actually patching?

    • The idea was to patch missing values in daily with values in jura, then it was discovered that some values in jura are bad.
    • I do not propose changing any existing files in jura, only applying a patch to daily.
  3. Why didn't we find this before?

    • The exposures listed above appear in both fuji and iron. However, in both fuji and iron the GFA quantities associated with all of the exposures for the corresponding tiles are set to zero (another reason zero is a good patch value!).
    • Somehow the interpretation of the GFA files changed between iron and jura. The GFA file itself is the same: desi/survey/GFA/offline_matched_coadd_ccds_SV1-thru_20210928.fits which was frozen in fuji. There are definitely NaN values present in that file. Somehow, the interpretation of bad GFA values changed between desispec/0.56.5 (iron) and desispec/0.63.x (jura).
  4. What should we do for kibo?

    • Determine what changed in interpreting GFA data between iron and jura and decide (in progress). I vote for replacing NaN with zero explicitly--with logging and documentation--in kibo.
    • Issue a warning in compute_tile_completeness_table() when it sums over masked values, so we can at least know that is happening.
  5. What else?

    • desi_tiles_completeness effectively does the same thing as desi_tsnr_afterburner --tile-completeness, but does not share any code. These could diverge in the future. It is not obvious why desi_tiles_completeness is even needed. That should be a separate ticket, of course, and is not necessarily needed for kibo.
    • The processing of desi_tsnr_afterburner should be standardized. The outputs and log files are literally all over the place, which is not good practice.
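
As a concrete illustration of point 1 (a minimal sketch, not taken from the pipeline code): summing an astropy MaskedColumn skips the masked entries, which is numerically the same as treating them as zero.

import numpy as np
from astropy.table import Table, MaskedColumn

# EFFTIME-like column where two entries are masked (the NaN exposures).
efftime = MaskedColumn([10.0, 20.0, 30.0], mask=[False, True, True])
t = Table([efftime], names=["EFFTIME_GFA"])

# Masked entries are skipped by sum(), i.e. they contribute zero.
print(t["EFFTIME_GFA"].sum())    # 10.0
print(np.sum(t["EFFTIME_GFA"]))  # 10.0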

@weaverba137
Member Author

Follow up results:

  • There is one significant difference between desi_tiles_completeness and desi_tsnr_afterburner --tile-completeness: the former uses Table.read() while the latter uses desispec.io.read_table(). Table.read() masks NaN, but read_table() does not (a quick check illustrating this difference is sketched after this list). If desi_tsnr_afterburner --tile-completeness had been used for jura, NaN would probably have shown up in the tiles-jura.(fits|csv) files.
  • When running desi_tsnr_afterburner for iron, the SV1 GFA file was never read in. That means effectively all SV1 exposures have no GFA data. Or equivalently, it's not the presence or absence of NaN that caused the iron GFA values to be zero, the values were never assigned in the first place. This might have also happened for fuji but that's a slightly lower priority investigation at this point.
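
A quick way to see the masking difference described above (a sketch assuming the exposure HDU is named EXPOSURES and the masking behavior noted in the first bullet; fitsio is used here as a stand-in for a non-masking reader, the actual desispec.io.read_table call is not reproduced):

import numpy as np
import fitsio
from astropy.table import Table

path = "/global/cfs/cdirs/desi/spectro/redux/jura/exposures-jura.fits"

t_masked = Table.read(path, hdu="EXPOSURES")   # astropy masks NaN in float columns
t_raw = Table(fitsio.read(path, "EXPOSURES"))  # keeps literal NaN, like read_table()

print(np.sum(t_masked["EFFTIME_GFA"].mask))        # number of masked (NaN) rows
print(np.sum(~np.isfinite(t_raw["EFFTIME_GFA"])))  # same rows, as literal NaN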

@araichoor
Contributor

thanks for the investigation.
if useful, a minor clarification about the sv1 and sv3 gfa files: the naming is a bit misleading; I believe that the sv3 file contains all exposures taken since the start of sv3 (20210405), which include some sv1 exposures (approx. half of the sv1 exposures).
actually the offline_matched_coadd_ccds_SV3-thru_20240409.fits file probably contains all exposures since 20210405.

>>> d1 = Table(fitsio.read("/global/cfs/cdirs/desi/survey/GFA/offline_matched_coadd_ccds_SV1-thru_20210928.fits", 2))
>>> d3 = Table(fitsio.read("/global/cfs/cdirs/desi/survey/GFA/offline_matched_coadd_ccds_SV3-thru_20240409.fits", 2))

>>> d1["NIGHT"].min(), d1["NIGHT"].max()
(20201214, 20210928)
>>> d3["NIGHT"].min(), d3["NIGHT"].max()
(20210405, 20240409)

>>> np.in1d(d1["EXPID"], d3["EXPID"]).mean()
0.48369036027263873
>>> np.all(~np.in1d(d1["EXPID"][d1["NIGHT"] < d3["NIGHT"].min()], d3["EXPID"]))
True
>>> np.all(np.in1d(d1["EXPID"][d1["NIGHT"] >= d3["NIGHT"].min()], d3["EXPID"]))
True

@weaverba137
Member Author

OK, I should have said "many early" SV1 exposures rather than "all" SV1 exposures, but the important thing here is that two GFA summary files must be read for completeness, and one of them will continue to contain the label "SV1".

@weaverba137
Member Author

We also now know the reason for the difference between iron & jura:

When iron was run, it used the directory that is now called desi/survey/GFA.KPNO, but at the time, it was just desi/survey/GFA. That directory did not (and does not) contain a copy of a SV1 GFA summary file, so all of the exposures that appear only in the SV1-labelled GFA summary files had their values set to zero. This had nothing to do with the presence of NaN in GFA summary files at all. The necessary file just wasn't there. This also happened for fuji.

jura used desi/survey/GFA.NERSC (which is the current desi/survey/GFA via a symlink), which does contain a SV1 GFA summary file, so we then "discovered" the NaNs for exposures that had previously just been set to zero.
