concat_dim getting added to *all* variables of multifile datasets #2064

Open
@xylar

Code Sample

Using the following example data set:
example_jan.nc

#!/usr/bin/env python3

import xarray

# Open a single file through the multi-file API; 'Time' already exists in the file.
ds = xarray.open_mfdataset('example_jan.nc', concat_dim='Time')
print(ds)

The result from xarray 0.10.2 (and all earlier xarray versions we've worked with):

<xarray.Dataset>
Dimensions:                                                      (Time: 1, nOceanRegions: 7, nOceanRegionsTmp: 7, nVertLevels: 100)
Dimensions without coordinates: Time, nOceanRegions, nOceanRegionsTmp, nVertLevels
Data variables:
    time_avg_avgValueWithinOceanLayerRegion_avgLayerTemperature  (Time, nOceanRegionsTmp, nVertLevels) float64 dask.array<shape=(1, 7, 100), chunksize=(1, 7, 100)>
    time_avg_avgValueWithinOceanRegion_avgSurfaceTemperature     (Time, nOceanRegions) float64 dask.array<shape=(1, 7), chunksize=(1, 7)>
    time_avg_daysSinceStartOfSim                                 (Time) timedelta64[ns] dask.array<shape=(1,), chunksize=(1,)>
    xtime_end                                                    (Time) |S64 dask.array<shape=(1,), chunksize=(1,)>
    xtime_start                                                  (Time) |S64 dask.array<shape=(1,), chunksize=(1,)>
    refBottomDepth                                               (nVertLevels) float64 dask.array<shape=(100,), chunksize=(100,)>
Attributes:
    history:  Tue Dec  6 04:49:14 2016: ncatted -O -a ,global,d,, acme_alaph7...
    NCO:      "4.6.2"

The result from xarray 0.10.3:

<xarray.Dataset>
Dimensions:                                                      (Time: 1, nOceanRegions: 7, nOceanRegionsTmp: 7, nVertLevels: 100)
Dimensions without coordinates: Time, nOceanRegions, nOceanRegionsTmp, nVertLevels
Data variables:
    time_avg_avgValueWithinOceanLayerRegion_avgLayerTemperature  (Time, nOceanRegionsTmp, nVertLevels) float64 dask.array<shape=(1, 7, 100), chunksize=(1, 7, 100)>
    time_avg_avgValueWithinOceanRegion_avgSurfaceTemperature     (Time, nOceanRegions) float64 dask.array<shape=(1, 7), chunksize=(1, 7)>
    time_avg_daysSinceStartOfSim                                 (Time) timedelta64[ns] dask.array<shape=(1,), chunksize=(1,)>
    xtime_end                                                    (Time) |S64 dask.array<shape=(1,), chunksize=(1,)>
    xtime_start                                                  (Time) |S64 dask.array<shape=(1,), chunksize=(1,)>
    refBottomDepth                                               (Time, nVertLevels) float64 dask.array<shape=(1, 100), chunksize=(1, 100)>
Attributes:
    history:  Tue Dec  6 04:49:14 2016: ncatted -O -a ,global,d,, acme_alaph7...
    NCO:      "4.6.2"

Problem description

We expected refBottomDepth not to have Time as a dimension. It does not vary with time and does not have a Time dimension in the input data set.

It seems like #1988 and #2048 were intended to address cases where concat_dim is not yet present in the input files. But when concat_dim is already in the input files, it seems like only those fields that include this dimension should be concatenated, and other fields should remain free of concat_dim. Part of the problem for us is that the number of dimensions of some of our variables changes depending on which xarray version is being used.
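
One possible way to get the 0.10.2-style result is the `data_vars='minimal'` option of `xarray.concat`, which concatenates only those data variables that already contain the dimension. The sketch below is a minimal illustration using small in-memory datasets. The shortened variable names, the values, and the helper `make_month` are made up for this example and do not come from the attached file.

```python
import numpy as np
import xarray as xr


def make_month(temps):
    """Hypothetical stand-in for one monthly file (not the real example_jan.nc)."""
    return xr.Dataset({
        # Varies in time: already has the Time dimension in the "file".
        "avgSurfaceTemperature": (("Time", "nOceanRegions"), np.array([temps])),
        # Static field: no Time dimension in the "file".
        "refBottomDepth": (("nVertLevels",), np.array([10.0, 20.0, 30.0])),
    })


jan = make_month([1.0, 2.0])
feb = make_month([1.5, 2.5])

# data_vars='all' concatenates every data variable, so the static field gains Time.
all_vars = xr.concat([jan, feb], dim="Time", data_vars="all")

# data_vars='minimal' concatenates only variables that already contain Time.
minimal = xr.concat([jan, feb], dim="Time", data_vars="minimal")

print(all_vars["refBottomDepth"].dims)
print(minimal["refBottomDepth"].dims)
```

If the installed `open_mfdataset` accepts a `data_vars` keyword, passing `data_vars='minimal'` there may give the same result for files on disk. That is an assumption to check against the installed version, not something confirmed in this report.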

Expected Output

The output from 0.10.2 (see above).
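
Because the number of dimensions can differ between xarray versions, a small check that flags variables which picked up the concatenation dimension may help catch the change early. This is only a sketch: `vars_gaining_dim` is a hypothetical helper, and the in-memory dataset is a made-up stand-in for a single input file.

```python
import numpy as np
import xarray as xr


def vars_gaining_dim(before, after, dim):
    """Return names of data variables that have `dim` in `after` but not in `before`."""
    return sorted(
        name
        for name in after.data_vars
        if name in before.data_vars
        and dim in after[name].dims
        and dim not in before[name].dims
    )


# Made-up stand-in for one input file that already contains Time.
source = xr.Dataset({
    "avgSurfaceTemperature": (("Time", "nOceanRegions"), np.array([[1.0, 2.0]])),
    "refBottomDepth": (("nVertLevels",), np.array([10.0, 20.0, 30.0])),
})

# Single-file case, as in the report: concatenate a one-element list along Time.
combined_all = xr.concat([source], dim="Time", data_vars="all")
combined_minimal = xr.concat([source], dim="Time", data_vars="minimal")

print(vars_gaining_dim(source, combined_all, "Time"))
print(vars_gaining_dim(source, combined_minimal, "Time"))
```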

Output of xr.show_versions()

/home/xylar/miniconda2/envs/mpas_analysis_py3/lib/python3.6/site-packages/h5py/__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`.
  from ._conv import register_converters as _register_converters

INSTALLED VERSIONS

commit: None
python: 3.6.5.final.0
python-bits: 64
OS: Linux
OS-release: 4.13.0-38-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8

xarray: 0.10.3
pandas: 0.22.0
numpy: 1.14.2
scipy: 1.0.1
netCDF4: 1.3.1
h5netcdf: 0.5.1
h5py: 2.7.1
Nio: None
zarr: None
bottleneck: 1.2.1
cyordereddict: None
dask: 0.17.2
distributed: 1.21.6
matplotlib: 2.2.2
cartopy: 0.16.0
seaborn: None
setuptools: 39.0.1
pip: 9.0.3
conda: None
pytest: 3.5.0
IPython: None
sphinx: 1.7.2
