-
Notifications
You must be signed in to change notification settings - Fork 194
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
added docstring for predict #227
added docstring for predict #227
Conversation
src/statmodels.jl
Outdated
@@ -145,11 +145,18 @@ end | |||
|
|||
const adjr² = adjr2 | |||
|
|||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unrelated change here. Would be great if you could remove it, though it's not a huge deal.
src/statmodels.jl
Outdated
abstract RegressionModel <: StatisticalModel | ||
|
||
fitted(obj::RegressionModel) = error("fitted is not defined for $(typeof(obj)).") | ||
model_response(obj::RegressionModel) = error("model_response is not defined for $(typeof(obj)).") | ||
residuals(obj::RegressionModel) = error("residuals is not defined for $(typeof(obj)).") | ||
|
||
""" | ||
predict(obj::RegressionModel [, newX]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For consistency with the other docstrings, it should be , [newX]
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sorry, fixed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Though this http://docs.julialang.org/en/stable/manual/documentation/ seems to disagree, see e.g. the second code box, or point 1 in the first numbered list.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We've done the docstrings in this package slightly differently. IIRC it's not entirely consistent in Base.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cool - I already pushed the change
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great, thanks :)
In what scenario would a one-argument It would make sense to me to change the function signature to """
predict(obj::RegressionModel, newX)
Predict the response of the regression model `obj` from the values in `newX`.
"""
predict(obj::RegressionModel, newX) = error(...) Having |
Thanks. It can be useful to get the model predictions for the original dataset, e.g. when plotting a smoothed line (LOESS) -- though it's maybe redundant with Would also be nice to explain what's the expected type/format for |
@ararslan Yes, as @nalimilan said it is nice to just get the fitted values when not supplying newx. Also note that it is not redundaint with @nalimilan It is my impression that the interface is always to use a DataFrame, see the troubles I was having here: JuliaStats/GLM.jl#165 . I think that the DataFrames module has a predict function that generates the newx matrix and calls the functions in GLM. |
If it were me, I wouldn't specify the format of |
I don't think packages should extend this method to support other input types for So for now I would document both the matrix and data frame formats. |
Just to state though that I find the current format really confusing - the user is supposed to supply a DataFrame, but the only docstring brought up by |
I tried to incorporate a description of |
src/statmodels.jl
Outdated
Form the predicted response of model `obj`. New covariate values `newX` can be supplied as a `DataFrame` | ||
with the same names as the original predictors, or as an AbstractMatrix (for low-level use) with a column | ||
for the value of each predictor (including a `Vector` of 1s if an intercept was fitted). | ||
""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this is the place to be specific about newX
. I think the right place to add documentation is here https://github.com/JuliaStats/DataFrames.jl/blob/master/src/statsmodels/statsmodel.jl#L76
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The point is, all these methods are going to be moved to StatsModels. So I don't think it's worth the trouble of adding partial docstrings everywhere just to remove them in a few weeks.
This PR was requested in another PR (on GLM) that was merged Jan 5. Should this be merged as well now? |
Sorry for a negative comment but I still don't think this is the right place for the doc string. It belongs where the functionality is implemented. |
No problem, I am just following up. If you can direct me to where this should go, in your view (if anywhere) I am happy to close this and make a new PR there. |
Let's just document here the method taking |
The method here just throws an error. The reason to have it here is to be able to import the name such that downstream packages can all use it without conflicts. If we add a docstring here, I think it should simply state that reason. Specific implementations can then add docstrings. There is already one in GLM, right? |
There is one in GLM (https://github.com/JuliaStats/GLM.jl/blob/master/src/lm.jl#L161) but the problem is that it only uses AbstractMatrix (because that is what gets called here), whereas the user is supposed to call the DataFrames version (JuliaStats/GLM.jl#165 (comment)). The preferred DataFrames version is not documented, which is super confusing for the user, so I think the attempt was to remedy this. |
@andreasnoack Yet we don't want all modeling packages to write their own docstrings, since they should all follow the same pattern for the most part for consistency. There should be a minimal docstring in StatsBase/StatsModels. Then we can document the generic data frame method in DataFrames. That one cannot be documented by each modeling package, since they should only define an |
I think many modeling packages would need to write their own docstring to give details about the implementation but I've been convinced that it could be helpful to have a general description of how |
Great, what is the easiest way forward - can you edit the PR directly, or do you want to merge (or close) and do the edit? |
OK, I have tried to change the docstring according to @andreasnoack 's suggestion |
Thanks! |
Requested by @nalimilan here JuliaStats/GLM.jl#171