ENH Add "dvs_to_predict" param to ModelPipeline.predict #241

stephen-hoover · 2018-03-07T18:36:23Z

CivisML v2.2 will add the ability for users to subset model predictions, as a way to save time and space. Add this parameter to `ModelPipeline.predict`.

elsander · 2018-03-07T20:14:46Z

civis/ml/_model.py

+            If this is a multi-output model, you may list a subset of
+            dependent variables for which you wish to generate predictions.
+            This list must be a subset of the original `dependent_variable`
+            input. Ignoring some of the model's outputs will let predictions


Instead of this sentence, what about something like "Scores for all outputs will always be calculated, but only the subset will be returned, allowing predictions to complete faster and use less disk space"? I think it's good to make this explicit so that users know that asking for a subset won't affect their scores.

It's true that we do currently let the model compute everything, then drop columns from the output array before writing to disk. But that seems like an implementation detail. If we were to somehow modify the models such that they only computed the requested labels, it would still look the same to the user. How about:

> The scores for the returned subset will be identical to the scores which those outputs would have had if all outputs were written, but ignoring some of the model's outputs will let predictions complete faster and use less disk space.

Good point, that is more of an implementation detail. I like your revision.
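
To make the behavior discussed in this thread concrete (every output is computed, then columns are dropped before writing), here is a minimal sketch in plain Python. The function name `subset_predictions` and its arguments are hypothetical and not part of the civis client; the sketch only shows that the retained columns are identical to what a full prediction would contain.

```python
def subset_predictions(scores, dependent_variable, dvs_to_predict=None):
    """Return (names, rows) restricted to the requested dependent variables.

    Hypothetical helper for illustration only.
    `scores` is a list of rows, one score per entry in `dependent_variable`.
    """
    if not dvs_to_predict:
        # No subset requested: return every output unchanged.
        return list(dependent_variable), [list(row) for row in scores]

    requested = set(dvs_to_predict)
    missing = sorted(requested - set(dependent_variable))
    if missing:
        # The requested list must be a subset of the original outputs.
        raise ValueError("Not in dependent_variable: {}".format(missing))

    # Keep columns in their original order; the values are not recomputed.
    keep = [i for i, dv in enumerate(dependent_variable) if dv in requested]
    names = [dependent_variable[i] for i in keep]
    rows = [[row[i] for i in keep] for row in scores]
    return names, rows
```

Whether the columns are dropped after scoring or never computed, the caller sees the same result, which is why the docstring wording avoids committing to either.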

elsander · 2018-03-07T20:18:09Z

civis/ml/_model.py

+        if dvs_to_predict:
+            if isinstance(dvs_to_predict, six.string_types):
+                dvs_to_predict = [dvs_to_predict]
+            if self.predict_template_id > 10600:


This is really nitpicky, but it would be clearer if this were 10583 instead of 10600, to match other parts of the code that reference the specific template id that is the cutoff for that argument.

Good point. Changed.
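
For readers following the hunk above, this is a rough standalone sketch of the check after the requested change. Only the string-to-list normalization and the `> 10583` comparison come from the diff and the review; the function name `build_predict_args`, the `"dvs_to_predict"` dictionary key, and the warn-and-ignore fallback for older templates are assumptions, since the rest of the method is not shown here. It also uses `str` in place of `six.string_types`, so it assumes Python 3.

```python
import warnings

# Cutoff template id named in the review; the diff compares with ">".
DVS_TO_PREDICT_MIN_TEMPLATE_ID = 10583


def build_predict_args(predict_template_id, dvs_to_predict=None):
    """Hypothetical helper: assemble extra prediction arguments."""
    args = {}
    if dvs_to_predict:
        # Accept a single name as shorthand for a one-element list.
        if isinstance(dvs_to_predict, str):
            dvs_to_predict = [dvs_to_predict]
        if predict_template_id > DVS_TO_PREDICT_MIN_TEMPLATE_ID:
            args["dvs_to_predict"] = list(dvs_to_predict)  # assumed key name
        else:
            # Assumed fallback: older templates cannot subset outputs.
            warnings.warn("dvs_to_predict is not supported by this "
                          "template and will be ignored.")
    return args
```

Using one named constant for the cutoff keeps this check in step with any other code that references the same template id, which is the point of the review comment.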

elsander

LGTM!

stephen-hoover added enhancement Modeling labels Mar 7, 2018

stephen-hoover added this to the Next Version milestone Mar 7, 2018

ENH Add "dvs_to_predict" param to ModelPipeline.predict

b31f364

stephen-hoover force-pushed the paro-636-skip-output-cols branch from 482fc9e to b31f364 Compare March 7, 2018 18:38

stephen-hoover changed the title ~~ENH Add "targets_to_predict" param to ModelPipeline.predict~~ ENH Add "dvs_to_predict" param to ModelPipeline.predict Mar 7, 2018

stephen-hoover requested a review from elsander March 7, 2018 19:35

stephen-hoover assigned elsander Mar 7, 2018

stephen-hoover mentioned this pull request Mar 7, 2018

ENH "dvs_to_predict" option for model predictions civisanalytics/civis-r#92

Merged

elsander suggested changes Mar 7, 2018

elsander assigned stephen-hoover and unassigned elsander Mar 7, 2018

CR

2818d43

stephen-hoover assigned elsander and unassigned stephen-hoover Mar 7, 2018

elsander approved these changes Mar 7, 2018

elsander assigned stephen-hoover and unassigned elsander Mar 7, 2018

stephen-hoover merged commit 664f6b3 into civisanalytics:master Mar 7, 2018

stephen-hoover deleted the paro-636-skip-output-cols branch March 7, 2018 21:07
