Add GPU benchmarks support to readme #59

Merged

Conversation

vlad-nazarov

No description provided.

@vlad-nazarov vlad-nazarov added the docs documentation and readme update label Mar 30, 2021

@michael-smirnov michael-smirnov left a comment

LGTM

README.md Outdated
|**[train_test_split](https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.train_test_split.html)**|train_test_split|:white_check_mark:|:x:|:white_check_mark:|:x:|
|**[GradientBoostingClassifier](https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.GradientBoostingClassifier.html)**|gbt|:x:|:x:|:x:|:white_check_mark:|
|**[GradientBoostingRegressor](https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.GradientBoostingRegressor.html)**|gbt|:x:|:x:|:x:|:white_check_mark:|
| algorithm | benchmark name | sklearn | sklearn on GPU | daal4py | cuml | xgboost |

@PetrovKP PetrovKP Mar 30, 2021

My concerns:

  • What is sklearn on GPU? (It is not private benchmarks)
  • It is not aligned with the rest of the benchmarks for libraries: sklearn/xgboost/daal4py/cuml

If you want to highlight algorithms on GPU, I would add that as a note at the bottom of the table.

@SmirnovEgorRu , @michael-smirnov What do you think?

In my opinion, we should think about how to present GPU support better, especially because the number of supported algorithms will grow in the future. It's not a supplementary thing to place somewhere at the bottom; it should be on the same level as CPU support for sklearn.
We can remove it from this table and instead create another section, "Support of Intel(R) Extension for scikit-learn", where we describe which benchmarks are supported on CPU and GPU. @PetrovKP what do you think about this?

README.md Outdated
@@ -67,7 +66,7 @@ Run `python runner.py --configs configs/config_example.json [--output-file resul

runner options:
* ``configs`` : configuration files paths
* ``no-intel-optimized`` : using Scikit-learn without Intel(R) Extension for Scikit-learn*. Now avalible for scikit-learn benchmarks. Default starts with using Intel(R) Extension for Scikit-learn*.
* ``no-intel-optimized`` : using Scikit-learn without [Intel(R) Extension for Scikit-learn*](#Intel(R)-Extension-for-Scikit-learn*-support). Now avalible for scikit-learn benchmarks. Default starts with using Intel(R) Extension for Scikit-learn*.

The reference does not work. Maybe this instead?

Suggested change
* ``no-intel-optimized`` : using Scikit-learn without [Intel(R) Extension for Scikit-learn*](#Intel(R)-Extension-for-Scikit-learn*-support). Now avalible for scikit-learn benchmarks. Default starts with using Intel(R) Extension for Scikit-learn*.
* ``no-intel-optimized`` : using Scikit-learn without [Intel(R) Extension for Scikit-learn*](## Intel(R)-Extension-for-Scikit-learn*-support). Now avalible for scikit-learn benchmarks. Default starts with using Intel(R) Extension for Scikit-learn*.
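
For context on why the in-page link fails: GitHub derives heading anchors by lowercasing the heading and dropping most punctuation, so neither `#Intel(R)-...` nor `## Intel(R)-...` can match. Below is a minimal sketch that approximates that rule. The function name `github_style_anchor` is hypothetical, and the regex is an assumption about GitHub's behavior rather than its actual implementation (it ignores edge cases such as duplicate headings).

```python
import re


def github_style_anchor(heading: str) -> str:
    """Approximate the anchor GitHub generates for a Markdown heading.

    Assumed rule: lowercase the text, drop every character that is not a
    word character, hyphen, or space, then turn spaces into hyphens.
    """
    text = heading.strip().lower()
    text = re.sub(r"[^\w\- ]", "", text)  # removes "(", ")", "*", etc.
    return "#" + text.replace(" ", "-")


anchor = github_style_anchor("Intel(R) Extension for Scikit-learn* support")
print(anchor)
```

Under these assumptions the heading maps to `#intelr-extension-for-scikit-learn-support`, which is the form that later revisions of this line use.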

README.md Outdated
@@ -67,7 +66,7 @@ Run `python runner.py --configs configs/config_example.json [--output-file resul

runner options:
* ``configs`` : configuration files paths
* ``no-intel-optimized`` : using Scikit-learn without Intel(R) Extension for Scikit-learn*. Now avalible for scikit-learn benchmarks. Default starts with using Intel(R) Extension for Scikit-learn*.
* ``no-intel-optimized`` : using Scikit-learn without [Intel(R) Extension for Scikit-learn*](#Intel(R)-Extension-for-Scikit-learn*-support). Now avalible for scikit-learn benchmarks. Default starts with using Intel(R) Extension for Scikit-learn*.

Suggested change
* ``no-intel-optimized`` : using Scikit-learn without [Intel(R) Extension for Scikit-learn*](#Intel(R)-Extension-for-Scikit-learn*-support). Now avalible for scikit-learn benchmarks. Default starts with using Intel(R) Extension for Scikit-learn*.
* ``no-intel-optimized`` : using Scikit-learn without [Intel(R) Extension for Scikit-learn*](#Intel(R)-Extension-for-Scikit-learn*-support). Now available for [scikit-learn benchmarks](https://github.com/IntelPython/scikit-learn_bench/tree/master/sklearn_bench). Default running with using Intel(R) Extension for Scikit-learn.

README.md Outdated
@@ -108,6 +107,9 @@ The configuration of benchmarks allows you to select the frameworks to run, sele
|**[GradientBoostingClassifier](https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.GradientBoostingClassifier.html)**|gbt|:x:|:x:|:x:|:white_check_mark:|
|**[GradientBoostingRegressor](https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.GradientBoostingRegressor.html)**|gbt|:x:|:x:|:x:|:white_check_mark:|

## Intel(R) Extension for Scikit-learn* support
[Intel(R) Extension for Scikit-learn](https://github.com/intel/scikit-learn-intelex) provides the ability to run scikit-learn on CPU and GPU with kernels optimized by [oneDAL](https://github.com/oneapi-src/oneDAL). The extension support GPU patching for algorihms: **DBSCAN**, **KMeans**, **LinearRegression**, **LogisticRegression**. You can launch benchmarks with GPU support using [skl_xpu_config.json](https://github.com/IntelPython/scikit-learn_bench/blob/master/configs/skl_with_context_config.json) configuration file.

Suggested change
[Intel(R) Extension for Scikit-learn](https://github.com/intel/scikit-learn-intelex) provides the ability to run scikit-learn on CPU and GPU with kernels optimized by [oneDAL](https://github.com/oneapi-src/oneDAL). The extension support GPU patching for algorihms: **DBSCAN**, **KMeans**, **LinearRegression**, **LogisticRegression**. You can launch benchmarks with GPU support using [skl_xpu_config.json](https://github.com/IntelPython/scikit-learn_bench/blob/master/configs/skl_with_context_config.json) configuration file.
[Intel(R) Extension for Scikit-learn](https://github.com/intel/scikit-learn-intelex) providing drop-in patching speeds up scikit-learn on CPU and GPU.
Scikit-learn benchmarks on GPU with Intel(R) Extension for Scikit-learn support algorithms: **DBSCAN**, **KMeans**, **LinearRegression**, **LogisticRegression**. Example config with GPU support [here](https://github.com/IntelPython/scikit-learn_bench/blob/master/configs/skl_with_context_config.json).

@@ -27,6 +27,7 @@ Refer to the tables below for descriptions of all fields in the configuration fi
|data-order| array[string] | **REQUIRED** input data order. Data order: *C* (row-major, default) or *F* (column-major) |
|dtype| array[string] | **REQUIRED** input data type. Data type: *float64* (default) or *float32* |
|check-finitness| array[] | Check finiteness in sklearn input check(disabled by default) |
|device| array[string] | For scikit-learn only. List of devices to run with sycl context. It can be *None* (without context, default), *cpu*, *gpu* or *host*|

Minor: what does the *host* device mean for users? Maybe add a description? Also, is there no such parameter for the runner script?
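
To make the `device` row concrete, here is a small sketch of a config fragment limited to the fields quoted in this hunk. The helper `check_devices`, the constant `ALLOWED_DEVICES`, and the choice to spell every value (including *None*) as a JSON string are assumptions for illustration; the real config schema may differ.

```python
import json

# Values listed in the README row for `device`. Per that row, "None" means
# the benchmark runs without a SYCL context (the default).
ALLOWED_DEVICES = {"None", "cpu", "gpu", "host"}


def check_devices(devices):
    """Hypothetical helper: reject values the README row does not list."""
    unknown = [d for d in devices if d not in ALLOWED_DEVICES]
    if unknown:
        raise ValueError(f"unsupported device value(s): {unknown}")
    return list(devices)


# Fragment using only fields shown in the table above; other required
# fields of a full configuration file are omitted here.
fragment = {
    "data-order": ["F"],
    "dtype": ["float64"],
    "device": check_devices(["None", "cpu", "gpu"]),
}
print(json.dumps(fragment, indent=4))
```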

@PetrovKP PetrovKP self-requested a review April 14, 2021 01:12
README.md Outdated
@@ -108,6 +107,11 @@ The configuration of benchmarks allows you to select the frameworks to run, sele
|**[GradientBoostingClassifier](https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.GradientBoostingClassifier.html)**|gbt|:x:|:x:|:x:|:white_check_mark:|
|**[GradientBoostingRegressor](https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.GradientBoostingRegressor.html)**|gbt|:x:|:x:|:x:|:white_check_mark:|

## Intel(R) Extension for Scikit-learn* support
[Intel(R) Extension for Scikit-learn](https://github.com/intel/scikit-learn-intelex) providing drop-in patching speeds up scikit-learn on CPU and GPU.

"providing drop-in patching": there is no final decision yet about how XPUs will be supported. Can we use more neutral wording, such as "provides optimized functionality of scikit-learn on Intel(R) CPU and GPU", for now?

README.md Outdated
## Intel(R) Extension for Scikit-learn* support
[Intel(R) Extension for Scikit-learn](https://github.com/intel/scikit-learn-intelex) providing drop-in patching speeds up scikit-learn on CPU and GPU.

Scikit-learn benchmarks on GPU with Intel(R) Extension for Scikit-learn support algorithms: **DBSCAN**, **KMeans**, **LinearRegression**, **LogisticRegression**. Example config with GPU support [here](https://github.com/IntelPython/scikit-learn_bench/blob/master/configs/skl_xpu_config.json).

The following benchmarks support GPU computations with the help of Intel(R) Extension for Scikit-learn*:

  • dbscan
  • kmeans ...

"Example config with GPU support [here]": I suggest "The configuration file that contains all of these benchmarks can be found [here]."
The link for "here" is broken.
Please add an example command here showing how to run this config on GPU.

@@ -27,6 +27,7 @@ Refer to the tables below for descriptions of all fields in the configuration fi
|data-order| array[string] | **REQUIRED** input data order. Data order: *C* (row-major, default) or *F* (column-major) |
|dtype| array[string] | **REQUIRED** input data type. Data type: *float64* (default) or *float32* |
|check-finitness| array[] | Check finiteness in sklearn input check(disabled by default) |
|device| array[string] | For scikit-learn only. List of devices to run with sycl context. It can be *None* (without context, default), *cpu*, *gpu* or *host*|

README.md Outdated
@@ -108,6 +107,17 @@ The configuration of benchmarks allows you to select the frameworks to run, sele
|**[GradientBoostingClassifier](https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.GradientBoostingClassifier.html)**|gbt|:x:|:x:|:x:|:white_check_mark:|
|**[GradientBoostingRegressor](https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.GradientBoostingRegressor.html)**|gbt|:x:|:x:|:x:|:white_check_mark:|

## Intel(R) Extension for Scikit-learn support

By default scikit-learn benchmark launches using [Intel(R) Extension for Scikit-learn](https://github.com/intel/scikit-learn-intelex) on the CPU (use ``no-intel-optimized`` option to run without extention). Some benchmarks have a GPU support:

I'm not sure this sentence is correct. Maybe change it to "The launches of scikit-learn benchmark use ... by default"?

README.md Outdated
* linear
* log_reg

A configuration file that contains all these benchmarks can be found [here](https://github.com/IntelPython/scikit-learn_bench/blob/master/configs/skl_xpu_config.json). You can use this file to run these benchmarks on both CPU and GPU.

Please add the command to run this config and get a report
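
One possible shape for such a command, sketched from the runner options quoted elsewhere in this review (`--configs`, `--output-file`, `--report`, `--no-intel-optimized`, `--dummy-run`). The helper `build_runner_command` is hypothetical, and the config path assumes the file lives where the README link points (`configs/skl_xpu_config.json`).

```python
import shlex


def build_runner_command(config_path, output_file=None, report=False,
                         no_intel_optimized=False, dummy_run=False):
    """Hypothetical helper: assemble a runner.py command line as a string."""
    argv = ["python", "runner.py", "--configs", config_path]
    if output_file is not None:
        argv += ["--output-file", output_file]
    if report:
        argv.append("--report")  # Excel report; the README says openpyxl is required
    if no_intel_optimized:
        argv.append("--no-intel-optimized")
    if dummy_run:
        argv.append("--dummy-run")
    return shlex.join(argv)  # shlex.join needs Python 3.8+


command = build_runner_command("configs/skl_xpu_config.json", report=True)
print(command)
```

Under these assumptions the printed command is `python runner.py --configs configs/skl_xpu_config.json --report`, which could be pasted into the README once the config path is confirmed.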

@@ -27,6 +27,7 @@ Refer to the tables below for descriptions of all fields in the configuration fi
|data-order| array[string] | **REQUIRED** input data order. Data order: *C* (row-major, default) or *F* (column-major) |
|dtype| array[string] | **REQUIRED** input data type. Data type: *float64* (default) or *float32* |
|check-finitness| array[] | Check finiteness in sklearn input check(disabled by default) |
|device| array[string] | For scikit-learn only. The list of devices to run the benchmarks on. It can be *None* (default, run on CPU without sycl context) or one of the types of sycl devices: *cpu*, *gpu*, *host*. Please reffer to [SYCL specification](https://www.khronos.org/files/sycl/sycl-2020-reference-guide.pdf) for details|

Suggested change
|device| array[string] | For scikit-learn only. The list of devices to run the benchmarks on. It can be *None* (default, run on CPU without sycl context) or one of the types of sycl devices: *cpu*, *gpu*, *host*. Please reffer to [SYCL specification](https://www.khronos.org/files/sycl/sycl-2020-reference-guide.pdf) for details|
|device| array[string] | For scikit-learn only. The list of devices to run the benchmarks on. It can be *None* (default, run on CPU without sycl context) or one of the types of sycl devices: *cpu*, *gpu*, *host*. Please refer to [SYCL specification](https://www.khronos.org/files/sycl/sycl-2020-reference-guide.pdf) for details|

README.md Outdated
@@ -67,7 +66,7 @@ Run `python runner.py --configs configs/config_example.json [--output-file resul

runner options:
* ``configs`` : configuration files paths

Suggested change
* ``configs`` : configuration files paths
* ``--configs``: specify the path to a configuration file.

README.md Outdated
@@ -67,7 +66,7 @@ Run `python runner.py --configs configs/config_example.json [--output-file resul

runner options:
* ``configs`` : configuration files paths
* ``no-intel-optimized`` : using Scikit-learn without Intel(R) Extension for Scikit-learn*. Now avalible for scikit-learn benchmarks. Default starts with using Intel(R) Extension for Scikit-learn*.
* ``no-intel-optimized`` : using Scikit-learn without [Intel(R) Extension for Scikit-learn*](#intelr-extension-for-scikit-learn-support). Now available for [scikit-learn benchmarks](https://github.com/IntelPython/scikit-learn_bench/tree/master/sklearn_bench). Default running with using Intel(R) Extension for Scikit-learn.
* ``output-file``: output file name for result benchmarks. Default is `result.json`
* ``report``: create an Excel report based on benchmarks results. Need library `openpyxl`.
* ``dummy-run`` : run configuration parser and datasets generation without benchmarks running.

Suggested change
* ``dummy-run`` : run configuration parser and datasets generation without benchmarks running.
* ``--dummy-run``: run configuration parser and dataset generation without benchmarks running.

README.md Outdated
@@ -67,7 +66,7 @@ Run `python runner.py --configs configs/config_example.json [--output-file resul

runner options:

Suggested change
runner options:
Options:

README.md Outdated
@@ -67,7 +66,7 @@ Run `python runner.py --configs configs/config_example.json [--output-file resul

runner options:
* ``configs`` : configuration files paths
* ``no-intel-optimized`` : using Scikit-learn without Intel(R) Extension for Scikit-learn*. Now avalible for scikit-learn benchmarks. Default starts with using Intel(R) Extension for Scikit-learn*.
* ``no-intel-optimized`` : using Scikit-learn without [Intel(R) Extension for Scikit-learn*](#intelr-extension-for-scikit-learn-support). Now available for [scikit-learn benchmarks](https://github.com/IntelPython/scikit-learn_bench/tree/master/sklearn_bench). Default running with using Intel(R) Extension for Scikit-learn.
* ``output-file``: output file name for result benchmarks. Default is `result.json`
* ``report``: create an Excel report based on benchmarks results. Need library `openpyxl`.

Suggested change
* ``report``: create an Excel report based on benchmarks results. Need library `openpyxl`.
* ``--report``: create an Excel report based on benchmark results. The `openpyxl` library is required.

README.md Outdated
@@ -67,7 +66,7 @@ Run `python runner.py --configs configs/config_example.json [--output-file resul

runner options:
* ``configs`` : configuration files paths
* ``no-intel-optimized`` : using Scikit-learn without Intel(R) Extension for Scikit-learn*. Now avalible for scikit-learn benchmarks. Default starts with using Intel(R) Extension for Scikit-learn*.
* ``no-intel-optimized`` : using Scikit-learn without [Intel(R) Extension for Scikit-learn*](#intelr-extension-for-scikit-learn-support). Now available for [scikit-learn benchmarks](https://github.com/IntelPython/scikit-learn_bench/tree/master/sklearn_bench). Default running with using Intel(R) Extension for Scikit-learn.
* ``output-file``: output file name for result benchmarks. Default is `result.json`

Suggested change
* ``output-file``: output file name for result benchmarks. Default is `result.json`
* ``--output-file``: output file name for the benchmark result. The default name is `result.json`.

README.md Outdated
@@ -67,7 +66,7 @@ Run `python runner.py --configs configs/config_example.json [--output-file resul

runner options:
* ``configs`` : configuration files paths
* ``no-intel-optimized`` : using Scikit-learn without Intel(R) Extension for Scikit-learn*. Now avalible for scikit-learn benchmarks. Default starts with using Intel(R) Extension for Scikit-learn*.
* ``no-intel-optimized`` : using Scikit-learn without [Intel(R) Extension for Scikit-learn*](#intelr-extension-for-scikit-learn-support). Now available for [scikit-learn benchmarks](https://github.com/IntelPython/scikit-learn_bench/tree/master/sklearn_bench). Default running with using Intel(R) Extension for Scikit-learn.

Now available for [scikit-learn benchmarks] -- why do you need this info?

Suggested change
* ``no-intel-optimized`` : using Scikit-learn without [Intel(R) Extension for Scikit-learn*](#intelr-extension-for-scikit-learn-support). Now available for [scikit-learn benchmarks](https://github.com/IntelPython/scikit-learn_bench/tree/master/sklearn_bench). Default running with using Intel(R) Extension for Scikit-learn.
* ``--no-intel-optimized``: use Scikit-learn without [Intel(R) Extension for Scikit-learn*](#intelr-extension-for-scikit-learn-support). Now available for [scikit-learn benchmarks](https://github.com/IntelPython/scikit-learn_bench/tree/master/sklearn_bench). By default, the runner uses Intel(R) Extension for Scikit-learn.

@@ -27,6 +27,7 @@ Refer to the tables below for descriptions of all fields in the configuration fi
|data-order| array[string] | **REQUIRED** input data order. Data order: *C* (row-major, default) or *F* (column-major) |
|dtype| array[string] | **REQUIRED** input data type. Data type: *float64* (default) or *float32* |
|check-finitness| array[] | Check finiteness in sklearn input check(disabled by default) |
|device| array[string] | For scikit-learn only. The list of devices to run the benchmarks on. It can be *None* (default, run on CPU without sycl context) or one of the types of sycl devices: *cpu*, *gpu*, *host*. Please refer to [SYCL specification](https://www.khronos.org/files/sycl/sycl-2020-reference-guide.pdf) for details|

Suggested change
|device| array[string] | For scikit-learn only. The list of devices to run the benchmarks on. It can be *None* (default, run on CPU without sycl context) or one of the types of sycl devices: *cpu*, *gpu*, *host*. Please refer to [SYCL specification](https://www.khronos.org/files/sycl/sycl-2020-reference-guide.pdf) for details|
|device| array[string] | For scikit-learn only. The list of devices to run the benchmarks on. It can be *None* (default, run on CPU without sycl context) or one of the types of sycl devices: *cpu*, *gpu*, *host*. Refer to [SYCL specification](https://www.khronos.org/files/sycl/sycl-2020-reference-guide.pdf) for details.|

README.md Outdated
@@ -108,6 +107,17 @@ The configuration of benchmarks allows you to select the frameworks to run, sele
|**[GradientBoostingClassifier](https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.GradientBoostingClassifier.html)**|gbt|:x:|:x:|:x:|:white_check_mark:|
|**[GradientBoostingRegressor](https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.GradientBoostingRegressor.html)**|gbt|:x:|:x:|:x:|:white_check_mark:|

## Intel(R) Extension for Scikit-learn support

The runs of Scikit-learn benchmark use [Intel(R) Extension for Scikit-learn](https://github.com/intel/scikit-learn-intelex) on the CPU by default (use ``no-intel-optimized`` option to run without extention). Some benchmarks have a GPU support:

Suggested change
The runs of Scikit-learn benchmark use [Intel(R) Extension for Scikit-learn](https://github.com/intel/scikit-learn-intelex) on the CPU by default (use ``no-intel-optimized`` option to run without extention). Some benchmarks have a GPU support:
When you run scikit-learn benchmarks on CPU, [Intel(R) Extension for Scikit-learn](https://github.com/intel/scikit-learn-intelex) is used by default. Use the ``--no-intel-optimized`` option to run the benchmarks without the extension.
The following benchmarks have a GPU support:

README.md Outdated
* linear
* log_reg

A configuration file that contains all these benchmarks can be found [here](https://github.com/IntelPython/scikit-learn_bench/blob/master/configs/skl_xpu_config.json). You can use this file to run these benchmarks on both CPU and GPU.

Suggested change
A configuration file that contains all these benchmarks can be found [here](https://github.com/IntelPython/scikit-learn_bench/blob/master/configs/skl_xpu_config.json). You can use this file to run these benchmarks on both CPU and GPU.
You may use the [configuration file for these benchmarks](https://github.com/IntelPython/scikit-learn_bench/blob/master/configs/skl_xpu_config.json) to run them on both CPU and GPU.

@vlad-nazarov vlad-nazarov merged commit c5ddd0e into IntelPython:master Apr 15, 2021
4 participants