Use L-BFGS-B Fortran library for native logistic regression benchmark #8
native/Makefile
Outdated
@@ -26,10 +27,19 @@ all: $(addprefix bin/,$(BENCHMARKS))
bin:
	mkdir -p bin

bin/log_reg_lbfgs: log_reg_lbfgs_bench.cpp $(FOBJ) | bin
	$(CXX) $^ $(CXXINCLUDE) $(CXXFLAGS) $(LDFLAGS) -lmkl_rt -lm -lifcore \
Since this links against mkl_rt, are we making sure that MKL uses TBB as its underlying threading layer?
MKL's default is OpenMP, and since DAAL's default is TBB, we end up incurring the cost of both runtimes.
One can either set the threading layer in the benchmark itself by calling mkl_set_threading_layer, or use explicit dynamic linking; see the MKL Link Line Advisor.
I added a call to mkl_set_threading_layer in native/log_reg_lbfgs_bench.cpp.
Thanks!
This PR adds logistic regression that replicates the results of sklearn.linear_model.LogisticRegression but is implemented with daal4py. It supports the lbfgs and newton-cg solvers. test_fit uses the canonical form of logistic regression for the binary case (multi_class='ovr' in scikit-learn) and the multinomial form (softmax over the class scores, multi_class='multinomial') for the multi-class case. test_predict supports any combination of n_classes and multi_class, but we use the same multi_class as in test_fit. We cannot use daal4py's logistic regression directly, because passing in a custom solver is not as easy as it is in native DAAL (see #8); instead, we use daal4py's logistic regression objective functions and math primitives to compute logistic regression.
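To make the two formulations mentioned above concrete, here is a minimal pure-Python sketch (not the daal4py code from the PR). It assumes the binary case uses a single score passed through a sigmoid and the multi-class case uses a softmax over per-class scores; the function names and sample scores are illustrative only.

```python
import math


def sigmoid(z):
    """Binary logistic link: P(y = 1) for a single score z."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)  # avoids overflow for large negative z
    return e / (1.0 + e)


def softmax(scores):
    """Multinomial link: normalized exponentials of per-class scores."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Binary case: one score, probability of the positive class.
p_binary = sigmoid(1.5)

# The same binary problem written as a two-class softmax with scores (0, z).
p_softmax = softmax([0.0, 1.5])[1]

# Three-class multinomial example (scores are made up).
p_multi = softmax([2.0, 1.0, 0.1])
```

With two classes and the reference class pinned to score 0, the softmax probability of the positive class coincides with the sigmoid, which is why the binary case can use the simpler form.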
optional build mode without distributed mode
* adding cycling notebook example * Making adjustments to the Jupyter notebook * adding in main landing page changes * Adding in examples page * Fixing examples, formatting, and wording * fixing removed items and typos in example code
This PR swaps out the native logistic regression benchmark's solver for the same one used in SciPy. L-BFGS-B is wrapped in a minimal DAAL optimization_solver class and directly set as the solver for DAAL's logistic regression. This brings native performance for this benchmark to match our optimized scikit-learn performance for a sample binary classification problem with 100k samples and 1k features. Because we must now directly call BLAS and LAPACK functions instead of using DAAL's internal BLAS and LAPACK implementations, we add a dependency on MKL as well.
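As a rough illustration of what a quasi-Newton solver such as L-BFGS-B consumes, the sketch below computes the mean binary logistic loss together with its gradient and checks the gradient by finite differences. It is a pure-Python stand-in, not the C++/Fortran wrapper from this PR; the function name, the tiny dataset, and the omission of an intercept and regularization are all simplifying assumptions.

```python
import math


def logistic_loss_and_grad(w, X, y):
    """Mean logistic loss and gradient for labels in {0, 1} (no intercept)."""
    n = len(X)
    loss = 0.0
    grad = [0.0] * len(w)
    for row, label in zip(X, y):
        z = sum(wj * xj for wj, xj in zip(w, row))
        # log(1 + exp(z)), computed without overflow
        loss += max(z, 0.0) + math.log1p(math.exp(-abs(z))) - label * z
        # sigmoid(z), computed without overflow
        if z >= 0:
            p = 1.0 / (1.0 + math.exp(-z))
        else:
            p = math.exp(z) / (1.0 + math.exp(z))
        for j, xj in enumerate(row):
            grad[j] += (p - label) * xj
    return loss / n, [g / n for g in grad]


# Hypothetical toy data: 4 samples, 2 features.
X = [[1.0, 2.0], [-1.0, 0.5], [0.3, -1.2], [2.0, 1.0]]
y = [1, 0, 0, 1]
w = [0.1, -0.2]

loss, grad = logistic_loss_and_grad(w, X, y)

# Central finite differences as an independent check of the gradient.
eps = 1e-6
fd_grad = []
for j in range(len(w)):
    w_plus = list(w)
    w_plus[j] += eps
    w_minus = list(w)
    w_minus[j] -= eps
    f_plus = logistic_loss_and_grad(w_plus, X, y)[0]
    f_minus = logistic_loss_and_grad(w_minus, X, y)[0]
    fd_grad.append((f_plus - f_minus) / (2 * eps))
```

A function of this shape, returning the objective value and gradient as a pair, is the form that SciPy's `scipy.optimize.fmin_l_bfgs_b` accepts when no separate gradient callback is supplied.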