Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

xarray.merge function and major refactor for merge logic #857

Merged
merged 9 commits into from
Jul 28, 2016

Conversation

shoyer
Copy link
Member

@shoyer shoyer commented May 23, 2016

Fixes #417

New top level :py:func:merge function allows for combining variables from any number of Dataset and/or DataArray variables.

Example usage:

>>> arrays = [xr.DataArray(n, name='var%d' % n) for n in range(5)]
>>> xr.merge(arrays)
<xarray.Dataset>
Dimensions:  ()
Coordinates:
    *empty*
Data variables:
    var0     int64 0
    var1     int64 1
    var2     int64 2
    var3     int64 3
    var4     int64 4

The internal refactoring also lays the ground work for supporting ufunc-like functions that merge three or more arguments, such as the full form of where.

Notes to any potential reviewers:

This doesn't have as much test coverage as I would like, but I did get all the existing functionality working. I'll see if I can add in some more tests for the internal code in merge.py. Also, more docstrings is probably a good idea.

Either way, I'm a bit nervous that I haven't broken things, because frankly I still don't understand exactly how the old code worked (and yes, I wrote it!). So we should probably make a release candidate for xarray v0.8 to avoid the need for a bug fix release shortly afterwards.

cc @jhamman @MaximilianR

Fixes GH417

New top level :py:func:`merge` function allows for combining variables from
any number of ``Dataset`` and/or ``DataArray`` variables.

Example usage:

    >>> arrays = [xr.DataArray(n, name='var%d' % n) for n in range(5)]
    >>> xr.merge(arrays)
    <xarray.Dataset>
    Dimensions:  ()
    Coordinates:
        *empty*
    Data variables:
        var0     int64 0
        var1     int64 1
        var2     int64 2
        var3     int64 3
        var4     int64 4

The internal refactoring also lays the ground work for supporting ufunc-like
functions that merge three or more arguments, such as the full form of
`where`.
[-1.13563237, 1.21211203, -0.17321465]])
second value: <xarray.Variable (x: 2, y: 3)>
array([[ 1.4691123 , 0.71713666, -0.5090585 ],
[-0.13563237, 2.21211203, 0.82678535]])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a little confused by this MergeError example. Is the error raised because this merge is trying to merge two datasets that have the same variables which have the same coordinates?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that's what this is attempting to indicate. Let me know if this is clearer....

@shoyer
Copy link
Member Author

shoyer commented May 25, 2016

@jhamman thanks for catching my lint errors -- that's a little embarrassing! My linter seems to have disabled itself, which I need to fix.

@jhamman
Copy link
Member

jhamman commented May 25, 2016

@shoyer - I gave this a once over. A few minor comments in addition to my comments above,

  • I didn't look at the tests in depth so whoever looks at this next, that would be a place to start. I also glazed over toward the end of merge.py so that needs another look.
  • There are a lot of private functions/methods involved in this. That may be how it has to be but it seems a bit messy at times.

@shoyer
Copy link
Member Author

shoyer commented Jul 19, 2016

@jhamman another look?

@shoyer shoyer mentioned this pull request Jul 20, 2016
3 tasks
@shoyer
Copy link
Member Author

shoyer commented Jul 28, 2016

OK, I want to get this in for v0.8.0 (and make that release soon), so I'm merging this. I think it's in significantly better shape that before, though it's certainly still more complex than ideal.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Make merge a function as well as a method
2 participants