Description
After downloading MODIS L2 Ocean Color data from https://oceandata.sci.gsfc.nasa.gov/MODIS-Aqua/L2/ or via the L1 and L2 browser at https://oceancolor.gsfc.nasa.gov/cgi/browse.pl, xarray does not seem to open these files properly. The global attributes are there, but there are no data variables, dimensions, or coordinates. The netCDF4 module reads the files fine, and I've been using it to build xarray datasets by hand, but that feels clunky.
MCVE Code Sample
The problem can be reproduced with the code below; a working example is on Colab here: https://colab.research.google.com/drive/1sLh98c06I99kRGEFYuzIUD9nqgTvasdo
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
# data can be downloaded from https://oceandata.sci.gsfc.nasa.gov/ob/getfile/A2020113184000.L2_LAC_OC.nc
fn = 'A2020113184000.L2_LAC_OC.nc'
import xarray as xr
xds = xr.open_dataset(fn)
# when you view the contents of the xds there is no data
print(xds)
# netcdf4 works fine
from netCDF4 import Dataset
dataset = Dataset(fn)
gd = dataset.groups['geophysical_data']
nav = dataset.groups['navigation_data']
lons = nav.variables["longitude"][:]
lats = nav.variables["latitude"][:]
flags = gd.variables["l2_flags"][:]
# this has the data we want
print(gd.variables.keys())
# we can create the correct xds with the data from netcdf4
chl_xds = xr.Dataset(
    {'chlor_a': (('x', 'y'), gd.variables['chlor_a'][:].data)},
    coords={'latitude': (('x', 'y'), lats),
            'longitude': (('x', 'y'), lons)},
    attrs={'variable': 'Chlorophyll-a'})
# then merge back into the xarray dataset with all the attributes, though I'm not 100% sure I'm doing this correctly
xds['chlor_a'] = chl_xds.chlor_a
# replace nodata areas with nan
xds = xds.where(xds['chlor_a'] != -32767.0)
# plot the data as a quick visual check
xds.chlor_a.plot(vmin=0, vmax=1)
Expected Output
I would expect xarray to be able to open this level two data from the netCDF file.
Problem Description
The netCDF4 workaround seems clunky, and I would have expected xarray to open this data directly, since I imagine many people work with it. I'm happy to help resolve this if there is an established way to contribute details that would let xarray read MODIS L2 data without issue.
Versions
Output of xr.show_versions()
INSTALLED VERSIONS
commit: None
python: 3.6.9 (default, Nov 7 2019, 10:44:02)
[GCC 8.3.0]
python-bits: 64
OS: Linux
OS-release: 4.19.104+
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
libhdf5: 1.10.4
libnetcdf: 4.6.3
xarray: 0.15.1
pandas: 1.0.3
numpy: 1.18.2
scipy: 1.4.1
netCDF4: 1.5.3
pydap: None
h5netcdf: None
h5py: 2.10.0
Nio: None
zarr: None
cftime: 1.1.2
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: 1.3.2
dask: 2.12.0
distributed: 1.25.3
matplotlib: 3.2.1
cartopy: None
seaborn: 0.10.0
numbagg: None
setuptools: 46.1.3
pip: 19.3.1
conda: None
pytest: 3.6.4
IPython: 5.5.0
sphinx: 1.8.5