IntelPython · icfaust · Sep 25, 2023
@@ -33,7 +33,7 @@ We publish blogs on Medium, so [follow us](https://medium.com/intel-analytics-so
 - [How to create conda environment for benchmarking](#how-to-create-conda-environment-for-benchmarking)
 - [Running Python benchmarks with runner script](#running-python-benchmarks-with-runner-script)
 - [Benchmark supported algorithms](#benchmark-supported-algorithms)
-  - [Scikit-learn benchmakrs](#scikit-learn-benchmakrs)
+  - [Scikit-learn benchmarks](#scikit-learn-benchmarks)
 - [Algorithm parameters](#algorithm-parameters)
 
 ## How to create conda environment for benchmarking
@@ -105,6 +105,8 @@ The configuration of benchmarks allows you to select the frameworks to run, sele
 |**[DBSCAN](https://scikit-learn.org/stable/modules/generated/sklearn.cluster.DBSCAN.html)**|dbscan|:white_check_mark:|:white_check_mark:|:white_check_mark:|:white_check_mark:|:x:|
 |**[RandomForestClassifier](https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.RandomForestClassifier.html)**|df_clfs|:white_check_mark:|:x:|:white_check_mark:|:white_check_mark:|:x:|
 |**[RandomForestRegressor](https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.RandomForestRegressor.html)**|df_regr|:white_check_mark:|:x:|:white_check_mark:|:white_check_mark:|:x:|
+|**[ExtraTreesClassifier](https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.ExtraTreesClassifier.html)**|et_clsf|:white_check_mark:|:x:|:x:|:x:|:x:|
+|**[ExtraTreesRegressor](https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.ExtraTreesRegressor.html)**|et_regr|:white_check_mark:|:x:|:x:|:x:|:x:|
 |**[pairwise_distances](https://scikit-learn.org/stable/modules/generated/sklearn.metrics.pairwise_distances.html)**|distances|:white_check_mark:|:x:|:white_check_mark:|:x:|:x:|
 |**[KMeans](https://scikit-learn.org/stable/modules/generated/sklearn.cluster.KMeans.html)**|kmeans|:white_check_mark:|:white_check_mark:|:white_check_mark:|:white_check_mark:|:x:|
 |**[KNeighborsClassifier](https://scikit-learn.org/stable/modules/generated/sklearn.neighbors.KNeighborsClassifier.html)**|knn_clsf|:white_check_mark:|:x:|:x:|:white_check_mark:|:x:|
@@ -118,7 +120,7 @@ The configuration of benchmarks allows you to select the frameworks to run, sele
 |**[GradientBoostingClassifier](https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.GradientBoostingClassifier.html)**|gbt|:x:|:x:|:x:|:x:|:white_check_mark:|
 |**[GradientBoostingRegressor](https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.GradientBoostingRegressor.html)**|gbt|:x:|:x:|:x:|:x:|:white_check_mark:|
 
-### Scikit-learn benchmakrs
+### Scikit-learn benchmarks
 
 When you run scikit-learn benchmarks on CPU, [Intel(R) Extension for Scikit-learn](https://github.com/intel/scikit-learn-intelex) is used by default. Use the ``--no-intel-optimized`` option to run the benchmarks without the extension.
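
The opt-out behaviour described above could be wired up roughly as follows. This is a minimal sketch, not the repository's actual implementation: only the ``--no-intel-optimized`` flag name comes from the README, while `parse_args` and `maybe_patch_sklearn` are hypothetical helper names. `patch_sklearn` is the public entry point of Intel(R) Extension for Scikit-learn (`sklearnex`), and the sketch assumes it is called before any scikit-learn estimator is imported.

```python
import argparse


def parse_args(argv=None):
    # Hypothetical parser: only the flag name is taken from the README.
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--no-intel-optimized",
        action="store_true",
        help="run stock scikit-learn without the Intel extension",
    )
    return parser.parse_args(argv)


def maybe_patch_sklearn(no_intel_optimized):
    """Patch scikit-learn unless the user opted out; return True if patched."""
    if no_intel_optimized:
        return False
    try:
        # Imported lazily so the sketch still runs when sklearnex is absent.
        from sklearnex import patch_sklearn
    except ImportError:
        return False
    # Must run before scikit-learn estimators are imported to take effect.
    patch_sklearn()
    return True
```

A benchmark script would then call `maybe_patch_sklearn(parse_args().no_intel_optimized)` at the top, before importing `sklearn.ensemble`.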
 

@@ -182,6 +182,78 @@
                 }
             ]
         },
+        {
+            "algorithm": "et_clsf",
+            "dtype": "float32",
+            "dataset": [
+                {
+                    "source": "npy",
+                    "name": "higgs1m",
+                    "training":
+                    {
+                        "x": "data/higgs1m_x_train.npy",
+                        "y": "data/higgs1m_y_train.npy"
+                    },
+                    "testing":
+                    {
+                        "x": "data/higgs1m_x_test.npy",
+                        "y": "data/higgs1m_y_test.npy"
+                    }
+                },
+                {
+                    "source": "npy",
+                    "name": "airline-ohe",
+                    "training":
+                    {
+                        "x": "data/airline-ohe_x_train.npy",
+                        "y": "data/airline-ohe_y_train.npy"
+                    },
+                    "testing":
+                    {
+                        "x": "data/airline-ohe_x_test.npy",
+                        "y": "data/airline-ohe_y_test.npy"
+                    }
+                }
+            ],
+            "num-trees": 50,
+            "max-depth": 16,
+            "max-leaf-nodes": 131072,
+            "max-features": 0.2
+        },
+        {
+            "algorithm": "et_regr",
+            "dtype": "float32",
+            "dataset": [
+                {
+                    "source": "npy",
+                    "name": "year_prediction_msd",
+                    "training":
+                    {
+                        "x": "data/year_prediction_msd_x_train.npy",
+                        "y": "data/year_prediction_msd_y_train.npy"
+                    },
+                    "testing":
+                    {
+                        "x": "data/year_prediction_msd_x_test.npy",
+                        "y": "data/year_prediction_msd_y_test.npy"
+                    }
+                },
+                {
+                    "source": "npy",
+                    "name": "airline_regression",
+                    "training":
+                    {
+                        "x": "data/airline_regression_x_train.npy",
+                        "y": "data/airline_regression_y_train.npy"
+                    },
+                    "testing":
+                    {
+                        "x": "data/airline_regression_x_test.npy",
+                        "y": "data/airline_regression_y_test.npy"
+                    }
+                }
+            ]
+        },
         {
             "algorithm": "ridge",
             "dataset": [

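Every dataset entry added in this hunk follows the same path pattern, `data/<name>_{x,y}_{train,test}.npy`. The sketch below generates such entries from a dataset name, which can help when checking new cases for copy-paste errors. The helper name `npy_dataset` and the `data_dir` parameter are hypothetical and not part of the repository.

```python
import json


def npy_dataset(name, data_dir="data"):
    """Build one "dataset" entry following the naming pattern in the config."""

    def split(part):
        # part is "train" or "test"; x holds features, y holds labels/targets.
        return {
            "x": f"{data_dir}/{name}_x_{part}.npy",
            "y": f"{data_dir}/{name}_y_{part}.npy",
        }

    return {
        "source": "npy",
        "name": name,
        "training": split("train"),
        "testing": split("test"),
    }


# Rebuild the et_regr case from the hunk above using the helper.
et_regr_case = {
    "algorithm": "et_regr",
    "dtype": "float32",
    "dataset": [
        npy_dataset("year_prediction_msd"),
        npy_dataset("airline_regression"),
    ],
}

if __name__ == "__main__":
    print(json.dumps(et_regr_case, indent=4))
```
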
@@ -0,0 +1,165 @@
+{
+    "common": {
+        "lib": "sklearn",
+        "algorithm": "et_clsf",
+        "data-format": "pandas",
+        "data-order": "F",
+        "dtype": ["float32", "float64"],
+        "max-features": "sqrt",
+        "device": ["host", "cpu", "gpu", "none"]
+    },
+    "cases": [
+        {
+            "dataset": [
+                {
+                    "source": "npy",
+                    "name": "higgs1m",
+                    "training":
+                    {
+                        "x": "data/higgs1m_x_train.npy",
+                        "y": "data/higgs1m_y_train.npy"
+                    },
+                    "testing":
+                    {
+                        "x": "data/higgs1m_x_test.npy",
+                        "y": "data/higgs1m_y_test.npy"
+                    }
+                }
+            ],
+            "workload-size": "medium",
+            "num-trees": 50,
+            "max-depth": 16,
+            "max-leaf-nodes": 131072,
+            "max-features": 0.2
+        },
+        {
+            "device": "none",
+            "dataset": [
+                {
+                    "source": "npy",
+                    "name": "airline-ohe",
+                    "training":
+                    {
+                        "x": "data/airline-ohe_x_train.npy",
+                        "y": "data/airline-ohe_y_train.npy"
+                    },
+                    "testing":
+                    {
+                        "x": "data/airline-ohe_x_test.npy",
+                        "y": "data/airline-ohe_y_test.npy"
+                    }
+                }
+            ],
+            "workload-size": "medium",
+            "num-trees": 50,
+            "max-depth": 16,
+            "max-leaf-nodes": 131072,
+            "max-features": 0.2
+        },
+        {
+            "dataset": [
+                {
+                    "source": "npy",
+                    "name": "susy",
+                    "training":
+                    {
+                        "x": "data/susy_x_train.npy",
+                        "y": "data/susy_y_train.npy"
+                    },
+                    "testing":
+                    {
+                        "x": "data/susy_x_test.npy",
+                        "y": "data/susy_y_test.npy"
+                    }
+                }
+            ],
+            "workload-size": "medium",
+            "num-trees": 10,
+            "max-depth": 5
+        },
+        {
+            "dataset": [
+                {
+                    "source": "npy",
+                    "name": "susy",
+                    "training":
+                    {
+                        "x": "data/susy_x_train.npy",
+                        "y": "data/susy_y_train.npy"
+                    },
+                    "testing":
+                    {
+                        "x": "data/susy_x_test.npy",
+                        "y": "data/susy_y_test.npy"
+                    }
+                }
+            ],
+            "workload-size": "large",
+            "num-trees": 100,
+            "max-depth": 8
+        },
+        {
+            "dataset": [
+                {
+                    "source": "npy",
+                    "name": "susy",
+                    "training":
+                    {
+                        "x": "data/susy_x_train.npy",
+                        "y": "data/susy_y_train.npy"
+                    },
+                    "testing":
+                    {
+                        "x": "data/susy_x_test.npy",
+                        "y": "data/susy_y_test.npy"
+                    }
+                }
+            ],
+            "workload-size": "medium",
+            "num-trees": 20,
+            "max-depth": 16
+        },
+        {
+            "dataset": [
+                {
+                    "source": "npy",
+                    "name": "mnist",
+                    "training":
+                    {
+                        "x": "data/mnist_x_train.npy",
+                        "y": "data/mnist_y_train.npy"
+                    },
+                    "testing":
+                    {
+                        "x": "data/mnist_x_test.npy",
+                        "y": "data/mnist_y_test.npy"
+                    }
+                }
+            ],
+            "workload-size": "large",
+            "num-trees": 100,
+            "max-depth": 10
+        },
+        {
+            "dataset": [
+                {
+                    "source": "npy",
+                    "name": "hepmass_150K",
+                    "training":
+                    {
+                        "x": "data/hepmass_150K_x_train.npy",
+                        "y": "data/hepmass_150K_y_train.npy"
+                    },
+                    "testing":
+                    {
+                        "x": "data/hepmass_150K_x_test.npy",
+                        "y": "data/hepmass_150K_y_test.npy"
+                    }
+                }
+            ],
+            "workload-size": "medium",
+            "num-trees": 50,
+            "max-depth": 15
+        }
+    ]
+}
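
The new config combines a `common` block with per-case overrides and list-valued fields such as `dtype` and `device`. The sketch below shows one plausible way such a file expands into concrete runs. It rests on two assumptions that this diff does not confirm: that case-level keys override `common` (for example `"max-features": 0.2` replacing `"sqrt"`), and that list-valued fields are expanded as a Cartesian product. `expand_cases` is a hypothetical helper, and the inline config is a trimmed copy of the first two cases.

```python
import itertools


def expand_cases(config):
    """Yield one flat parameter dict per concrete benchmark run."""
    common = config.get("common", {})
    for case in config.get("cases", []):
        merged = {**common, **case}  # assumption: case keys override "common"
        datasets = merged.pop("dataset", [])
        keys = list(merged)
        # Wrap scalars so every key contributes exactly one value to the product.
        choices = [v if isinstance(v, list) else [v] for v in merged.values()]
        for dataset in datasets:
            for combo in itertools.product(*choices):
                yield {"dataset": dataset["name"], **dict(zip(keys, combo))}


# Trimmed copy of the config above (dataset paths and some keys omitted).
config = {
    "common": {
        "algorithm": "et_clsf",
        "dtype": ["float32", "float64"],
        "max-features": "sqrt",
        "device": ["host", "cpu", "gpu", "none"],
    },
    "cases": [
        {"dataset": [{"name": "higgs1m"}], "num-trees": 50, "max-features": 0.2},
        {"device": "none", "dataset": [{"name": "airline-ohe"}], "num-trees": 50},
    ],
}

runs = list(expand_cases(config))
```

Under these assumptions the first case yields 2 dtypes × 4 devices = 8 runs, and the second, pinned to `"device": "none"`, yields 2, for 10 runs in total.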