[BUGZILLA #17604] approx's new na.rm=FALSE does not play well with 'ties' argument. (R-devel post 3.6.1) #6778

Open
MichaelChirico opened this issue May 19, 2020 · 1 comment

Comments

@MichaelChirico

In R-devel (post 3.6.1), approx() and approxfun() have a new argument, na.rm, which defaults to TRUE. If it is FALSE, input (x, y) pairs with NAs in only the y values are kept, so that predictions for xout values close to such an x become NA. The bug occurs when there are duplicate x values for which some y values are NA and some are not: the ties function is only called if there is more than one non-NA y value for a given x value. I think it should be called if there is more than one y value, NA or not, for a given x.

E.g., shouldn't the following two examples give the same results?

data.frame(approx(c(2,3,3,4), c(12,NA,13.2,14), method="constant",
                  xout=seq(2,4,by=1/2), ties=function(x){print(x);mean(x)},
                  na.rm=FALSE))
    x    y
1 2.0 12.0
2 2.5 12.0
3 3.0 13.2
4 3.5 13.2
5 4.0 14.0

data.frame(approx(c(2,mean(c(3,3)),4), c(12,mean(c(NA,13.2)),14),
                  method="constant", xout=seq(2,4,by=1/2),
                  ties=function(x){print(x);mean(x)},
                  na.rm=FALSE))
    x  y
1 2.0 12
2 2.5 12
3 3.0 NA
4 3.5 NA
5 4.0 14
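
The behaviour the report asks for can be sketched by collapsing the ties by hand before calling approx(), so that the ties function sees every y value for a given x, NA or not. This is only an illustration of the expected result, not how approx() is implemented; the names ties_fun, x_collapsed and y_collapsed are made up for this sketch, and the final call assumes an R version whose approx() accepts na.rm.

```r
# Input from the report: x = 3 is duplicated, with one NA and one non-NA y.
x <- c(2, 3, 3, 4)
y <- c(12, NA, 13.2, 14)

# Hypothetical stand-in for the 'ties' argument.
ties_fun <- mean

# Collapse duplicates manually, passing ALL y values (including NA)
# for each distinct x to the ties function.
x_collapsed <- sort(unique(x))
y_collapsed <- as.vector(tapply(y, x, ties_fun))

# y_collapsed is now c(12, NA, 14): the NA at x = 3 survives the collapse.
# Interpolating the collapsed data corresponds to the second example above.
res <- approx(x_collapsed, y_collapsed, method = "constant",
              xout = seq(2, 4, by = 1/2), na.rm = FALSE)
data.frame(res)
```

With mean as the ties function, the collapsed y at x = 3 is NA, which is why the report expects the first example to return NA at xout = 3 and 3.5 as well.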


METADATA

  • Bug author - Bill Dunlap
  • Creation time - 2019-08-26 19:37:48 UTC
  • Bugzilla link
  • Status - UNCONFIRMED
  • Alias - None
  • Component - Analyses
  • Version - R 3.5.0
  • Hardware - Other Other
  • Importance - P5 normal
  • Assignee - R-core
  • URL -
@github-actions

I think what you want is

data.frame(approx(c(2,mean(c(3,3), na.rm = TRUE),4),
                  c(12,mean(c(NA,13.2), na.rm = TRUE),14),
                  method="constant", xout=seq(2,4,by=1/2),
                  ties=function(x){print(x);mean(x)},
                  na.rm=FALSE))
    x    y
1 2.0 12.0
2 2.5 12.0
3 3.0 13.2
4 3.5 13.2
5 4.0 14.0

because omitting na.rm=TRUE from mean(c(NA,13.2)) makes the second element of y NA.

This can be closed.
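
A minimal check of the point made in this comment, using only base R's mean(); the variable name y_at_3 is made up for the illustration.

```r
# The two y values that share x = 3 in the report.
y_at_3 <- c(NA, 13.2)

# Without na.rm the NA propagates into the collapsed value.
mean(y_at_3)

# With na.rm = TRUE the NA is dropped and only 13.2 is averaged.
mean(y_at_3, na.rm = TRUE)
```

The first call returns NA and the second returns 13.2, which accounts for the difference between the two outputs in the original report.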


METADATA

  • Comment author - elin.waring
  • Timestamp - 2020-06-18 14:21:32 UTC
