Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support group_attrs in cf_writer #1914

Open
wants to merge 8 commits into
base: main
Choose a base branch
from
Open

Conversation

zxdawn
Copy link
Member

@zxdawn zxdawn commented Dec 2, 2021

  • Tests added
  • Fully documented

Sometimes when we save datasets to NetCDF as groups, it will be useful to add the group attributes. In this PR, I add the option by passing group_attrs. It works as same as header_attrs.

Here's an example:

from satpy import Scene
from glob import glob

tropomi_dir = './'
scn = Scene(glob(tropomi_dir+'S5P_OFFL_L2__NO2____20190630T0901*.nc'), reader='tropomi_l2')
vnames = ['nitrogendioxide_tropospheric_column']
scn.load(vnames)

header_attrs = {'description': 'Global attrs'}
group_attrs = {'description': 'Group attrs of data'}

scn.save_datasets(filename='test_cf_group.nc',
                    datasets=vnames,
                    groups={'S5P': vnames},
                    compute=True,
                    header_attrs=header_attrs,
                    group_attrs=group_attrs,
                    writer='cf',
                    engine='netcdf4',
                    )

Output of ncdump -h test_cf_group.nc:

netcdf test_cf_group {

// global attributes:
		:description = "Global attrs" ;
		:history = "Created by pytroll/satpy on 2021-12-02 09:54:54.089499" ;

group: S5P {
  dimensions:
  	time = 1 ;
  	y = 3245 ;
  	x = 450 ;
  	bnds_1d = 2 ;
  variables:
  	int64 time(time) ;
  		time:bounds = "time_bnds" ;
  		time:standard_name = "time" ;
  		time:units = "days since 2019-06-30" ;
  		time:calendar = "proleptic_gregorian" ;
  	float latitude(y, x) ;
  		latitude:_FillValue = NaNf ;
  		latitude:name = "latitude" ;
  		latitude:standard_name = "latitude" ;
  		latitude:units = "degrees_north" ;
  	float longitude(y, x) ;
  		longitude:_FillValue = NaNf ;
  		longitude:name = "longitude" ;
  		longitude:standard_name = "longitude" ;
  		longitude:units = "degrees_east" ;
  	float nitrogendioxide_tropospheric_column(time, y, x) ;
  		nitrogendioxide_tropospheric_column:_FillValue = NaNf ;
  		nitrogendioxide_tropospheric_column:ancillary_variables = "nitrogendioxide_tropospheric_column_precision air_mass_factor_troposphere air_mass_factor_total averaging_kernel" ;
  		nitrogendioxide_tropospheric_column:end_time = "2019-06-30 10:43:01" ;
  		nitrogendioxide_tropospheric_column:file_key = "PRODUCT/nitrogendioxide_tropospheric_column" ;
  		nitrogendioxide_tropospheric_column:file_type = "tropomi_l2" ;
  		nitrogendioxide_tropospheric_column:long_name = "Tropospheric vertical column of nitrogen dioxide" ;
  		nitrogendioxide_tropospheric_column:modifiers = "" ;
  		nitrogendioxide_tropospheric_column:multiplication_factor_to_convert_to_molecules_percm2 = 6.02214e+19f ;
  		nitrogendioxide_tropospheric_column:platform_shortname = "S5P" ;
  		nitrogendioxide_tropospheric_column:sensor = "tropomi" ;
  		nitrogendioxide_tropospheric_column:standard_name = "troposphere_mole_content_of_nitrogen_dioxide" ;
  		nitrogendioxide_tropospheric_column:start_time = "2019-06-30 09:01:31" ;
  		nitrogendioxide_tropospheric_column:units = "mol m-2" ;
  		nitrogendioxide_tropospheric_column:coordinates = "longitude latitude" ;
  	double time_bnds(time, bnds_1d) ;

  // group attributes:
  		:description = "Group attrs of data" ;
  } // group S5P
}

@codecov
Copy link

codecov bot commented Dec 2, 2021

Codecov Report

Merging #1914 (e2e3d6d) into main (f34d563) will increase coverage by 0.00%.
The diff coverage is 100.00%.

@@           Coverage Diff           @@
##             main    #1914   +/-   ##
=======================================
  Coverage   93.78%   93.79%           
=======================================
  Files         282      282           
  Lines       42250    42260   +10     
=======================================
+ Hits        39626    39636   +10     
  Misses       2624     2624           
Flag Coverage Δ
behaviourtests 4.74% <0.00%> (-0.01%) ⬇️
unittests 94.35% <100.00%> (+<0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
satpy/tests/writer_tests/test_cf.py 99.70% <100.00%> (+<0.01%) ⬆️
satpy/writers/cf_writer.py 94.45% <100.00%> (+0.09%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update f34d563...e2e3d6d. Read the comment docs.

@djhoese
Copy link
Member

djhoese commented Mar 18, 2022

Our new code checker isn't happy with the quality of the 'cf' writer. Any chance you could try cleaning some of this stuff up?

https://github.com/pytroll/satpy/pull/1914/checks?check_run_id=5603372095

@djhoese djhoese added enhancement code enhancements, features, improvements component:writers labels Mar 18, 2022
@coveralls
Copy link

coveralls commented Mar 18, 2022

Coverage Status

Coverage increased (+0.001%) to 94.292% when pulling e2e3d6d on zxdawn:cf_group_attrs into f34d563 on pytroll:main.

@zxdawn
Copy link
Member Author

zxdawn commented Mar 18, 2022

I have improved the test part. For the cf_writer, the high complexity seems OK?

if flatten_attrs:
group_attrs = flatten_dict(group_attrs)
dataset.attrs = encode_attrs_nc(group_attrs)

encoding, other_to_netcdf_kwargs = update_encoding(dataset, to_netcdf_kwargs, numeric_name_prefix)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you put this chunk of code (everything inside the for loop besides the to_netcdf and append at the end) in a separate method, how ugly would it be?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, that passes now.

satpy/writers/cf_writer.py Outdated Show resolved Hide resolved
satpy/writers/cf_writer.py Outdated Show resolved Hide resolved
satpy/writers/cf_writer.py Outdated Show resolved Hide resolved
satpy/writers/cf_writer.py Outdated Show resolved Hide resolved
satpy/writers/cf_writer.py Outdated Show resolved Hide resolved
satpy/writers/cf_writer.py Outdated Show resolved Hide resolved
satpy/writers/cf_writer.py Outdated Show resolved Hide resolved
def save_datasets(self, datasets, filename=None, groups=None, header_attrs=None, engine=None, epoch=EPOCH,
flatten_attrs=False, exclude_attrs=None, include_lonlats=True, pretty=False,
compression=None, include_orig_name=True, numeric_name_prefix='CHANNEL_', **to_netcdf_kwargs):
def combine_info(self, group_name, group_datasets, epoch, group_attrs, flatten_attrs, exclude_attrs,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, this got uglier than I was expecting. I think any sub-methods should be prefixed with _ as they probably aren't public API for users. The main ugliness with this refactor is the number of arguments that needed to be passed. Looking at the arguments it looks like most of them go to the _collect_datasets call. Maybe that could be moved back into the for loop? Then the time if/else in one method and the group attrs in another method. So the for loop would essentially look like:

for group_name, group_datasets ...:
    ... = self._collect_datasets(...)
    dataset = xr.Dataset(datas)
    self._add_time_bounds(...)
    self._add_group_attrs(...)
    encoding, other_to_netcdf_kwargs = update_encoding(...)
    return dataset, encoding, other_to_netcdf_kwargs

@gerritholl
Copy link
Collaborator

@zxdawn What is the status of this PR? Any chance of looking into the remaining points?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
component:writers enhancement code enhancements, features, improvements
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants