Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ENH: resample methods with tolerance #2716

Merged
merged 5 commits into from
Jan 31, 2019

Conversation

observingClouds
Copy link
Contributor

@observingClouds observingClouds commented Jan 26, 2019

  • ENH: resample methods bfill, pad, nearest accept tolerance keyword

* ENH: resample methods bfill, pad, nearest accept tolerance keyword

* DOC: documentation is updated with examples

Fixes: GH2695
@jhamman
Copy link
Member

jhamman commented Jan 26, 2019

Thanks @observingClouds - At first glance, this seems right but we'll want some tests to confirm. You'll probably want to put these in test_missing.py. That's where the existing tests for bfill/pad/etc are so should be easy to extend for this feature.

@observingClouds
Copy link
Contributor Author

observingClouds commented Jan 27, 2019

Sure @jhamman, I'll add some tests.
However, I thought the test should rather go into test_dataarray.py than test_missing.py, because
this is an improvement to resample/_upsample?

Something like

    def test_upsample_tolerance(self):
        # Test tolerance keyword for upsample methods bfill, pad, nearest
        times = pd.date_range('2000-01-01', freq='1D', periods=2)
        times_upsampled = pd.date_range('2000-01-01', freq='6H', periods=5)
        array = DataArray(np.arange(2), [('time', times)])

        # Forward fill
        actual = array.resample(time='6H').ffill(tolerance='12H')
        expected = DataArray([0., 0., 0., np.nan, 1.],
                             [('time', times_upsampled)])
        assert_identical(expected, actual)

@jhamman
Copy link
Member

jhamman commented Jan 28, 2019

@observingClouds - I had forgotten the upsample tests were in test_dataarray.py. That seems like a better place to test this functionality.

Include tests for GH2695
@pep8speaks
Copy link

pep8speaks commented Jan 28, 2019

Hello @observingClouds! Thanks for updating the PR.

Cheers ! There are no PEP8 issues in this Pull Request. 🍻

Comment last updated on January 30, 2019 at 22:33 Hours UTC

Copy link
Member

@shoyer shoyer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two minor suggestions on the test, but other this looks good to me!

xarray/tests/test_dataarray.py Outdated Show resolved Hide resolved
xarray/tests/test_dataarray.py Outdated Show resolved Hide resolved
Copy link
Contributor

@dcherian dcherian left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Dataset and DataArray objects with an arbitrary number of dimensions.

In order to limit the scope of the methods ``ffill``, ``bfill``, ``pad`` and
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mention that "tolerance" is in coordinate (or label) units, not index units and that the index version is limit

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The unit for tolerance is now (a878ebc) explicitly mentioned in the time-series example. However, I did not comment on the index method as this has not been implemented for the resample method, yet. The index method seems to be a bigger/other project.

@shoyer shoyer merged commit 3cc0d22 into pydata:master Jan 31, 2019
@shoyer
Copy link
Member

shoyer commented Jan 31, 2019

Thanks @observingClouds -- really nice PR!

dcherian pushed a commit to yohai/xarray that referenced this pull request Feb 4, 2019
* master:
  remove xfail from test_cross_engine_read_write_netcdf4 (pydata#2741)
  Reenable cross engine read write netCDF test (pydata#2739)
  remove bottleneck dev build from travis, this test env was failing to build (pydata#2736)
  CFTimeIndex Resampling (pydata#2593)
  add tests for handling of empty pandas objects in constructors (pydata#2735)
  dropna() for a Series indexed by a CFTimeIndex (pydata#2734)
  deprecate compat & encoding (pydata#2703)
  Implement integrate (pydata#2653)
  ENH: resample methods with tolerance (pydata#2716)
  improve error message for invalid encoding (pydata#2730)
  silence a couple of warnings (pydata#2727)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants