Add validation to log-regression benchmark #436
Merged
Validation currently supports only binary classification, where the model has a single intercept value. If multinomial logistic regression is needed, the validation must be updated to handle multiple intercept values.
The validation checks the logistic regression model's coefficients, its intercept, and the number of classes the model can classify. For the coefficients, it compares the sum and the sum of squares of all coefficient values with the expected values, and then validates the coefficient count. For the intercept, it compares the value with the expected value, and then checks that the model contains only one intercept.
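To illustrate the checks described above, here is a minimal Python sketch. It is not the code added in this PR: the names `Expected` and `validate_model`, the tolerance value, and the choice to pass coefficients and intercepts as plain sequences are all assumptions made for the example.

```python
import math
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Expected:
    # Hypothetical container for the reference values; not part of the PR.
    coeff_sum: float
    coeff_sum_squares: float
    coeff_count: int
    intercept: float
    class_count: int = 2  # binary classification only


def validate_model(
    coefficients: Sequence[float],
    intercepts: Sequence[float],
    class_count: int,
    expected: Expected,
    tol: float = 1e-6,  # assumed tolerance, chosen for illustration
) -> None:
    """Raise ValueError on the first check that fails."""

    def check_close(name: str, actual: float, wanted: float) -> None:
        if not math.isclose(actual, wanted, rel_tol=tol, abs_tol=tol):
            raise ValueError(f"{name}: expected {wanted}, got {actual}")

    def check_equal(name: str, actual: int, wanted: int) -> None:
        if actual != wanted:
            raise ValueError(f"{name}: expected {wanted}, got {actual}")

    # Coefficients: sum, sum of squares, then count.
    check_close("coefficient sum", sum(coefficients), expected.coeff_sum)
    check_close(
        "coefficient sum of squares",
        sum(c * c for c in coefficients),
        expected.coeff_sum_squares,
    )
    check_equal("coefficient count", len(coefficients), expected.coeff_count)

    # Intercept: exactly one value (binary case), compared with the reference.
    check_equal("intercept count", len(intercepts), 1)
    check_close("intercept", intercepts[0], expected.intercept)

    # Number of classes the model can classify.
    check_equal("class count", class_count, expected.class_count)
```

Comparing the sum and the sum of squares avoids storing every expected coefficient, at the cost of not pinpointing which coefficient differs when a check fails.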
The validation does not change the code being measured, so it should not affect benchmark performance. As a quick check, I calculated the average duration of the last 5 repetitions of a single benchmark run, before and after adding validation:
log-regression.no-validation.result.txt
log-regression.with-validation.result.txt
The variant with validation appears approx. 1% slower, but this comes from a single run of each variant. Given that the measured benchmark code did not change, I don't think this indicates an actual slowdown, but if necessary, I can repeat the comparison over multiple runs.
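For reference, the comparison above amounts to the following arithmetic, shown here as a small Python sketch. The function names are hypothetical, and the durations in the usage lines are made-up placeholders, not values from the attached result files.

```python
from statistics import mean
from typing import Sequence


def mean_of_last(durations: Sequence[float], n: int = 5) -> float:
    """Average the last n repetition durations (earlier ones serve as warm-up)."""
    if len(durations) < n:
        raise ValueError(f"need at least {n} repetitions, got {len(durations)}")
    return mean(durations[-n:])


def relative_slowdown(before: float, after: float) -> float:
    """Fractional change of `after` relative to `before` (0.01 means 1% slower)."""
    return (after - before) / before


# Placeholder durations in milliseconds, purely illustrative.
no_validation = [1300.0, 1100.0, 1010.0, 1000.0, 990.0, 1005.0, 995.0]
with_validation = [1310.0, 1120.0, 1020.0, 1012.0, 1001.0, 1013.0, 1004.0]

change = relative_slowdown(
    mean_of_last(no_validation), mean_of_last(with_validation)
)
print(f"relative change: {change:.2%}")
```

A difference of this size from one run per variant is within what run-to-run noise could plausibly produce, which is why repeating the measurement over several runs would be the next step if a firmer answer is needed.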