Use L-BFGS-B Fortran library for native logistic regression benchmark #8

Merged
bibikar merged 13 commits into IntelPython:master from feature/l_bfgs_b on Jun 27, 2019

Conversation

@bibikar bibikar commented Jun 13, 2019

This PR swaps out the native logistic regression benchmark's solver for the same one used in SciPy. L-BFGS-B is wrapped in a minimal DAAL optimization_solver class and directly set as the solver for DAAL's logistic regression. This brings native performance for this benchmark to match our optimized scikit-learn performance for a sample binary classification problem with 100k samples and 1k features.

Because we must now directly call BLAS and LAPACK functions instead of using DAAL's internal BLAS and LAPACK implementations, we add a dependency on MKL as well.
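
To make concrete what an L-BFGS-B driver needs from the caller, here is a minimal sketch of the quantity it asks for at each step: the objective value and its gradient. This is not the PR's code. The function name `logistic_loss_and_gradient`, the row-major layout, and the omission of the intercept and regularization terms are all assumptions made for illustration.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper (not from the PR): mean logistic loss and its gradient
// for binary labels y in {0, 1}. X is row-major with y.size() rows and
// w.size() columns. No intercept and no regularization, to keep it short.
double logistic_loss_and_gradient(const std::vector<double>& X,
                                  const std::vector<double>& y,
                                  const std::vector<double>& w,
                                  std::vector<double>& grad) {
    const std::size_t n = y.size();
    const std::size_t p = w.size();
    grad.assign(p, 0.0);
    double loss = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        // Linear score z = x_i . w
        double z = 0.0;
        for (std::size_t j = 0; j < p; ++j) z += X[i * p + j] * w[j];
        // log(1 + exp(z)), evaluated without overflow for large |z|
        const double softplus = (z > 0.0) ? z + std::log1p(std::exp(-z))
                                          : std::log1p(std::exp(z));
        loss += softplus - y[i] * z;
        // d(loss_i)/dw = (sigmoid(z) - y_i) * x_i
        const double prob = 1.0 / (1.0 + std::exp(-z));
        for (std::size_t j = 0; j < p; ++j) grad[j] += (prob - y[i]) * X[i * p + j];
    }
    loss /= static_cast<double>(n);
    for (std::size_t j = 0; j < p; ++j) grad[j] /= static_cast<double>(n);
    return loss;
}
```

A solver wrapper would call a function of this shape each time the optimizer requests a new evaluation. In a real benchmark the inner products would go through BLAS rather than hand-written loops.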

@bibikar bibikar requested a review from oleksandr-pavlyk June 13, 2019 15:36
native/Makefile Outdated
@@ -26,10 +27,19 @@ all: $(addprefix bin/,$(BENCHMARKS))
bin:
mkdir -p bin

bin/log_reg_lbfgs: log_reg_lbfgs_bench.cpp $(FOBJ) | bin
$(CXX) $^ $(CXXINCLUDE) $(CXXFLAGS) $(LDFLAGS) -lmkl_rt -lm -lifcore \
Contributor

Since we link against mkl_rt, are we making sure that MKL uses TBB as its underlying threading layer?

MKL's default is OpenMP, and since DAAL's default is TBB, we end up incurring the cost of both runtimes.

One can either set the threading layer in the benchmark itself by calling mkl_set_threading_layer, or link the threading layer explicitly instead of going through mkl_rt (see the MKL Link Line Advisor).

Contributor Author

I added a call to mkl_set_threading_layer in native/log_reg_lbfgs_bench.cpp.

@oleksandr-pavlyk oleksandr-pavlyk left a comment

Thanks!

@bibikar bibikar merged commit a475be0 into IntelPython:master Jun 27, 2019
@bibikar bibikar deleted the feature/l_bfgs_b branch June 27, 2019 18:10
bibikar added a commit that referenced this pull request Jun 27, 2019
This PR adds logistic regression that replicates the results of sklearn.linear_model.LogisticRegression but is implemented in daal4py. It supports the lbfgs and newton-cg solvers. test_fit uses the canonical form of logistic regression for the binary case (multi_class='ovr' in scikit-learn) and the multinomial form (softmax over the class scores, multi_class='multinomial') for the multi-class case. test_predict supports any combination of n_classes and multi_class, but we use the same multi_class value as in test_fit.

While we cannot directly use daal4py's logistic regression because it isn't as easy as native DAAL to pass in a custom solver (see #8), we use daal4py's logistic regression objective functions and math primitives to compute logistic regression.
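
The difference between the two multi_class settings named above can be sketched as follows. This is an illustration only, not daal4py or scikit-learn code. The function names are hypothetical, both functions assume a non-empty score vector, and the one-vs-rest variant assumes the per-class sigmoid outputs are simply renormalized to sum to one.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: multinomial probabilities via a softmax over the
// per-class scores. Subtracting the maximum score avoids overflow in exp.
std::vector<double> softmax_probabilities(const std::vector<double>& scores) {
    const double max_score = *std::max_element(scores.begin(), scores.end());
    std::vector<double> probs(scores.size());
    double total = 0.0;
    for (std::size_t k = 0; k < scores.size(); ++k) {
        probs[k] = std::exp(scores[k] - max_score);
        total += probs[k];
    }
    for (double& v : probs) v /= total;
    return probs;
}

// Hypothetical helper: one-vs-rest probabilities. Each class score passes
// through its own sigmoid, then the results are renormalized to sum to 1.
std::vector<double> ovr_probabilities(const std::vector<double>& scores) {
    std::vector<double> probs(scores.size());
    double total = 0.0;
    for (std::size_t k = 0; k < scores.size(); ++k) {
        probs[k] = 1.0 / (1.0 + std::exp(-scores[k]));
        total += probs[k];
    }
    for (double& v : probs) v /= total;
    return probs;
}
```

The two functions generally give different probabilities for the same scores, which is why the same multi_class value has to be used for both fitting and prediction.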
razdoburdin pushed a commit to razdoburdin/scikit-learn_bench that referenced this pull request Jun 13, 2023
optional build mode without distributed mode
razdoburdin pushed a commit to razdoburdin/scikit-learn_bench that referenced this pull request Jun 13, 2023
* adding cycling notebook example
* Making adjustments to the Jupyter notebook
* adding in main landing page changes
* Adding in examples page
* Fixing examples, formatting, and wording
* fixing removed items and typos in example code