Hi,
createDataPartition creates exact splits for 100%, 80% and 0%, but only approximate (inaccurate) splits for 70%, 50% and 10%. I did not test all the numbers with apply, but I am sure that for 70% it should return 70 rows instead of 72, unless that is a feature and not a bug.
library(caret)
library(mlbench)
# create list of simulated regression data
# y = 10*sin(pi*x1*x2) + 20*(x3 - 0.5)^2 + 10*x4 + 5*x5 + N(0, s^2)
# x1..x5 are informative; x6..x10 are non-informative
set.seed(123)
simReg<- mlbench.friedman1(100, sd=1)
# conversion to data frame as suggested in book Applied ML
simReg$x <- data.frame(simReg$x)
inTrain<- createDataPartition(y=simReg$y, p=1.0, list=FALSE)
dim(inTrain)
inTrain<- createDataPartition(y=simReg$y, p=0.80, list=FALSE)
dim(inTrain)
inTrain<- createDataPartition(y=simReg$y, p=0.70, list=FALSE)
dim(inTrain)
inTrain<- createDataPartition(y=simReg$y, p=0.50, list=FALSE)
dim(inTrain)
inTrain<- createDataPartition(y=simReg$y, p=0.1, list=FALSE)
dim(inTrain)
inTrain<- createDataPartition(y=simReg$y, p=0.0, list=FALSE)
dim(inTrain)
str(simReg)
Yes, if you turn off the default stratified sampling by setting the number of y quantile breaks to two or fewer, e.g., createDataPartition(simReg$y, p=0.10, list=FALSE, groups=2)
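To make the numbers in the report easier to follow, here is a minimal base R sketch of the arithmetic that would produce them. It is not caret's code: it assumes that a numeric outcome is cut into quantile bins (with `groups` break points, so `groups - 1` bins) and that `ceiling(bin size * p)` rows are drawn from each bin. The helper name `expected_train_size` is made up for this illustration.

```r
# Hypothetical helper: expected number of training rows under the
# assumptions stated above (quantile bins, ceiling() applied per bin).
expected_train_size <- function(y, p, groups = min(5, length(y))) {
  # 'groups' quantile break points give groups - 1 bins
  breaks <- unique(quantile(y, probs = seq(0, 1, length.out = groups)))
  bins <- cut(y, breaks, include.lowest = TRUE)
  # round up inside every bin, then add the bins together
  sum(ceiling(table(bins) * p))
}

set.seed(123)
y <- rnorm(100)  # stand-in for any continuous outcome with 100 values

# default of 5 break points: four bins of 25 rows each
sizes_default <- sapply(c(1, 0.8, 0.7, 0.5, 0.1, 0),
                        function(p) expected_train_size(y, p))

# groups = 2: a single bin holding all 100 rows
sizes_one_bin <- sapply(c(0.7, 0.5, 0.1),
                        function(p) expected_train_size(y, p, groups = 2))

print(sizes_default)
print(sizes_one_bin)
```

Under these assumptions each bin holds 25 rows, so p = 0.7 rounds 17.5 up to 18 in each of the four bins and gives 72, which matches the size reported above. With a single bin there is only one rounding step, so the sizes come out as 70, 50 and 10.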
I, too, was just bitten by this. I understand the rationale for the splitting procedure to respect the structure in the outcome variable, but it's counterintuitive to a new user that p = 0.5 does not give a 50/50 split without setting another argument away from its default value.
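If an exact split size matters more than preserving the distribution of the outcome, a plain random split in base R is one option. This is only a sketch: `exact_split` is a hypothetical helper, not part of caret, and it does no stratification at all.

```r
# Hypothetical helper: draw exactly round(n * p) row indices without
# replacement, ignoring the outcome variable entirely.
exact_split <- function(n, p) {
  sample.int(n, size = round(n * p))
}

set.seed(123)
in_train_50 <- exact_split(100, 0.5)
in_train_70 <- exact_split(100, 0.7)

print(length(in_train_50))
print(length(in_train_70))
```

The trade-off is that the training and test sets may no longer have similar outcome distributions, which is what the stratified default is designed to protect.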
Is there a way to create accurate splits?
Cheers
Tobias