Ravin Kohli: [FIX] formatting in docs (#342)
Github Actions committed Nov 22, 2021
1 parent b64c977 commit 8c9a9c1
Showing 42 changed files with 1,546 additions and 1,219 deletions.
Original file line number Diff line number Diff line change
@@ -26,10 +26,13 @@
from autoPyTorch.api.tabular_classification import TabularClassificationTask
from autoPyTorch.datasets.resampling_strategy import CrossValTypes, HoldoutValTypes

############################################################################
# Default Resampling Strategy
# ============================

############################################################################
# Data Loading
# ============
# ------------
X, y = sklearn.datasets.fetch_openml(data_id=40981, return_X_y=True, as_frame=True)
X_train, X_test, y_train, y_test = sklearn.model_selection.train_test_split(
X,
@@ -39,7 +42,7 @@

############################################################################
# Build and fit a classifier with default resampling strategy
# ===========================================================
# -----------------------------------------------------------
api = TabularClassificationTask(
# 'HoldoutValTypes.holdout_validation' with 'val_share': 0.33
# is the default argument setting for TabularClassificationTask.
@@ -51,7 +54,7 @@

############################################################################
# Search for an ensemble of machine learning algorithms
# =====================================================
# -----------------------------------------------------
api.search(
X_train=X_train,
y_train=y_train,
@@ -64,7 +67,7 @@

############################################################################
# Print the final ensemble performance
# ====================================
# ------------------------------------
y_pred = api.predict(X_test)
score = api.score(y_pred, y_test)
print(score)
@@ -76,17 +79,22 @@

############################################################################

############################################################################
# Cross validation Resampling Strategy
# =====================================

############################################################################
# Build and fit a classifier with Cross validation resampling strategy
# ====================================================================
# --------------------------------------------------------------------
api = TabularClassificationTask(
resampling_strategy=CrossValTypes.k_fold_cross_validation,
resampling_strategy_args={'num_splits': 3}
)

############################################################################
# Search for an ensemble of machine learning algorithms
# =====================================================
# -----------------------------------------------------------------------

api.search(
X_train=X_train,
y_train=y_train,
@@ -99,7 +107,7 @@

############################################################################
# Print the final ensemble performance
# ====================================
# ------------------------------------
y_pred = api.predict(X_test)
score = api.score(y_pred, y_test)
print(score)
@@ -111,9 +119,13 @@

############################################################################

############################################################################
# Stratified Resampling Strategy
# ===============================

############################################################################
# Build and fit a classifier with Stratified resampling strategy
# ==============================================================
# --------------------------------------------------------------
api = TabularClassificationTask(
# For demonstration purposes, we use
# Stratified hold out validation. However,
@@ -124,7 +136,7 @@

############################################################################
# Search for an ensemble of machine learning algorithms
# =====================================================
# -----------------------------------------------------
api.search(
X_train=X_train,
y_train=y_train,
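The three strategies exercised in this example (a holdout split with `val_share` 0.33, k-fold cross validation with `num_splits` 3, and a stratified holdout) differ only in how sample indices are divided between training and validation. The following is a minimal pure-Python sketch of that idea. The function names `holdout_split`, `k_fold_splits` and `stratified_holdout_split` are hypothetical and written for illustration; this is not how autoPyTorch implements `HoldoutValTypes` or `CrossValTypes` internally.

```python
import random
from collections import defaultdict


def holdout_split(n_samples, val_share=0.33, seed=1):
    """Shuffle the indices once and reserve a val_share fraction for validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_val = int(round(n_samples * val_share))
    return indices[n_val:], indices[:n_val]


def k_fold_splits(n_samples, num_splits=3, seed=1):
    """Yield (train, val) index lists; each sample is used for validation once."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices round-robin into num_splits folds.
    folds = [indices[i::num_splits] for i in range(num_splits)]
    for i, val in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, val


def stratified_holdout_split(labels, val_share=0.33, seed=1):
    """Hold out roughly val_share of every class so class ratios are preserved."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    rng = random.Random(seed)
    train, val = [], []
    for members in by_class.values():
        rng.shuffle(members)
        n_val = int(round(len(members) * val_share))
        val.extend(members[:n_val])
        train.extend(members[n_val:])
    return train, val
```

With an imbalanced label list, the plain holdout may leave very few minority samples in the validation part, while the stratified variant keeps each class at about the requested share. That is the usual motivation for choosing a stratified strategy on classification data.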
Original file line number Diff line number Diff line change
@@ -15,7 +15,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"\n# Tabular Classification with Custom Configuration Space\n\nThe following example shows how to adjust the configuration space of\nthe search. Currently, there are two changes that can be made to the space:\n1. Adjust individual hyperparameters in the pipeline\n2. Include or exclude components:\n a) include: Dictionary containing components to include. Key is the node\n name and Value is an Iterable of the names of the components\n to include. Only these components will be present in the\n search space.\n b) exclude: Dictionary containing components to exclude. Key is the node\n name and Value is an Iterable of the names of the components\n to exclude. All except these components will be present in\n the search space.\n"
"\n# Tabular Classification with Custom Configuration Space\n\nThe following example shows how to adjust the configuration space of\nthe search. Currently, there are two changes that can be made to the space:\n\n1. Adjust individual hyperparameters in the pipeline\n2. Include or exclude components:\n a) include: Dictionary containing components to include. Key is the node\n name and Value is an Iterable of the names of the components\n to include. Only these components will be present in the\n search space.\n b) exclude: Dictionary containing components to exclude. Key is the node\n name and Value is an Iterable of the names of the components\n to exclude. All except these components will be present in\n the search space.\n"
]
},
{
@@ -26,7 +26,133 @@
},
"outputs": [],
"source": [
"import os\nimport tempfile as tmp\nimport warnings\n\nos.environ['JOBLIB_TEMP_FOLDER'] = tmp.gettempdir()\nos.environ['OMP_NUM_THREADS'] = '1'\nos.environ['OPENBLAS_NUM_THREADS'] = '1'\nos.environ['MKL_NUM_THREADS'] = '1'\n\nwarnings.simplefilter(action='ignore', category=UserWarning)\nwarnings.simplefilter(action='ignore', category=FutureWarning)\n\nimport sklearn.datasets\nimport sklearn.model_selection\n\nfrom autoPyTorch.api.tabular_classification import TabularClassificationTask\nfrom autoPyTorch.utils.hyperparameter_search_space_update import HyperparameterSearchSpaceUpdates\n\n\ndef get_search_space_updates():\n \"\"\"\n Search space updates to the task can be added using HyperparameterSearchSpaceUpdates\n Returns:\n HyperparameterSearchSpaceUpdates\n \"\"\"\n updates = HyperparameterSearchSpaceUpdates()\n updates.append(node_name=\"data_loader\",\n hyperparameter=\"batch_size\",\n value_range=[16, 512],\n default_value=32)\n updates.append(node_name=\"lr_scheduler\",\n hyperparameter=\"CosineAnnealingLR:T_max\",\n value_range=[50, 60],\n default_value=55)\n updates.append(node_name='network_backbone',\n hyperparameter='ResNetBackbone:dropout',\n value_range=[0, 0.5],\n default_value=0.2)\n return updates\n\n\nif __name__ == '__main__':\n\n ############################################################################\n # Data Loading\n # ============\n X, y = sklearn.datasets.fetch_openml(data_id=40981, return_X_y=True, as_frame=True)\n X_train, X_test, y_train, y_test = sklearn.model_selection.train_test_split(\n X,\n y,\n random_state=1,\n )\n\n ############################################################################\n # Build and fit a classifier with include components\n # ==================================================\n api = TabularClassificationTask(\n search_space_updates=get_search_space_updates(),\n include_components={'network_backbone': ['MLPBackbone', 'ResNetBackbone'],\n 'encoder': ['OneHotEncoder']}\n )\n\n 
############################################################################\n # Search for an ensemble of machine learning algorithms\n # =====================================================\n api.search(\n X_train=X_train.copy(),\n y_train=y_train.copy(),\n X_test=X_test.copy(),\n y_test=y_test.copy(),\n optimize_metric='accuracy',\n total_walltime_limit=150,\n func_eval_time_limit_secs=30\n )\n\n ############################################################################\n # Print the final ensemble performance\n # ====================================\n y_pred = api.predict(X_test)\n score = api.score(y_pred, y_test)\n print(score)\n print(api.show_models())\n\n # Print statistics from search\n print(api.sprint_statistics())\n\n ############################################################################\n # Build and fit a classifier with exclude components\n # ==================================================\n api = TabularClassificationTask(\n search_space_updates=get_search_space_updates(),\n exclude_components={'network_backbone': ['MLPBackbone'],\n 'encoder': ['OneHotEncoder']}\n )\n\n ############################################################################\n # Search for an ensemble of machine learning algorithms\n # =====================================================\n api.search(\n X_train=X_train,\n y_train=y_train,\n X_test=X_test.copy(),\n y_test=y_test.copy(),\n optimize_metric='accuracy',\n total_walltime_limit=150,\n func_eval_time_limit_secs=30\n )\n\n ############################################################################\n # Print the final ensemble performance\n # ====================================\n y_pred = api.predict(X_test)\n score = api.score(y_pred, y_test)\n print(score)\n print(api.show_models())\n\n # Print statistics from search\n print(api.sprint_statistics())"
"import os\nimport tempfile as tmp\nimport warnings\n\nos.environ['JOBLIB_TEMP_FOLDER'] = tmp.gettempdir()\nos.environ['OMP_NUM_THREADS'] = '1'\nos.environ['OPENBLAS_NUM_THREADS'] = '1'\nos.environ['MKL_NUM_THREADS'] = '1'\n\nwarnings.simplefilter(action='ignore', category=UserWarning)\nwarnings.simplefilter(action='ignore', category=FutureWarning)\n\nimport sklearn.datasets\nimport sklearn.model_selection\n\nfrom autoPyTorch.api.tabular_classification import TabularClassificationTask\nfrom autoPyTorch.utils.hyperparameter_search_space_update import HyperparameterSearchSpaceUpdates\n\n\ndef get_search_space_updates():\n \"\"\"\n Search space updates to the task can be added using HyperparameterSearchSpaceUpdates\n Returns:\n HyperparameterSearchSpaceUpdates\n \"\"\"\n updates = HyperparameterSearchSpaceUpdates()\n updates.append(node_name=\"data_loader\",\n hyperparameter=\"batch_size\",\n value_range=[16, 512],\n default_value=32)\n updates.append(node_name=\"lr_scheduler\",\n hyperparameter=\"CosineAnnealingLR:T_max\",\n value_range=[50, 60],\n default_value=55)\n updates.append(node_name='network_backbone',\n hyperparameter='ResNetBackbone:dropout',\n value_range=[0, 0.5],\n default_value=0.2)\n return updates"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Data Loading\n\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"X, y = sklearn.datasets.fetch_openml(data_id=40981, return_X_y=True, as_frame=True)\nX_train, X_test, y_train, y_test = sklearn.model_selection.train_test_split(\n X,\n y,\n random_state=1,\n)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Build and fit a classifier with include components\n\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"api = TabularClassificationTask(\n search_space_updates=get_search_space_updates(),\n include_components={'network_backbone': ['MLPBackbone', 'ResNetBackbone'],\n 'encoder': ['OneHotEncoder']}\n)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Search for an ensemble of machine learning algorithms\n\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"api.search(\n X_train=X_train.copy(),\n y_train=y_train.copy(),\n X_test=X_test.copy(),\n y_test=y_test.copy(),\n optimize_metric='accuracy',\n total_walltime_limit=150,\n func_eval_time_limit_secs=30\n)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Print the final ensemble performance\n\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"y_pred = api.predict(X_test)\nscore = api.score(y_pred, y_test)\nprint(score)\nprint(api.show_models())\n\n# Print statistics from search\nprint(api.sprint_statistics())"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Build and fit a classifier with exclude components\n\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"api = TabularClassificationTask(\n search_space_updates=get_search_space_updates(),\n exclude_components={'network_backbone': ['MLPBackbone'],\n 'encoder': ['OneHotEncoder']}\n)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Search for an ensemble of machine learning algorithms\n\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"api.search(\n X_train=X_train,\n y_train=y_train,\n X_test=X_test.copy(),\n y_test=y_test.copy(),\n optimize_metric='accuracy',\n total_walltime_limit=150,\n func_eval_time_limit_secs=30\n)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Print the final ensemble performance\n\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"y_pred = api.predict(X_test)\nscore = api.score(y_pred, y_test)\nprint(score)\nprint(api.show_models())\n\n# Print statistics from search\nprint(api.sprint_statistics())"
]
}
],
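The include/exclude semantics described in the notebook's introduction (only the included components remain in a node; everything except the excluded components remains) can be sketched in a few lines of plain Python. This is an illustration only: `filter_components` and the `available` registry are hypothetical and are not part of the autoPyTorch API, and the component lists are example values rather than the library's full set of choices.

```python
def filter_components(available, include=None, exclude=None):
    """Return the components left in each node after include/exclude filtering.

    available: dict mapping a node name to the component names it offers
               (a hypothetical registry used only for this sketch).
    include:   dict mapping a node name to the only components to keep.
    exclude:   dict mapping a node name to the components to drop.
    """
    include = include or {}
    exclude = exclude or {}
    filtered = {}
    for node, components in available.items():
        kept = list(components)
        if node in include:
            allowed = set(include[node])
            kept = [c for c in kept if c in allowed]
        if node in exclude:
            blocked = set(exclude[node])
            kept = [c for c in kept if c not in blocked]
        filtered[node] = kept
    return filtered


# Illustrative registry; the real choices depend on the installed version.
available = {
    "network_backbone": ["MLPBackbone", "ResNetBackbone", "ShapedMLPBackbone"],
    "encoder": ["OneHotEncoder", "NoEncoder"],
}

included = filter_components(
    available,
    include={"network_backbone": ["MLPBackbone", "ResNetBackbone"],
             "encoder": ["OneHotEncoder"]},
)
excluded = filter_components(
    available,
    exclude={"network_backbone": ["MLPBackbone"],
             "encoder": ["OneHotEncoder"]},
)
print(included)
print(excluded)
```

The two calls mirror the `include_components` and `exclude_components` arguments passed to `TabularClassificationTask` in the notebook cells, so the printed dictionaries show which choices each setting would leave available under the assumed registry.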
Original file line number Diff line number Diff line change
@@ -33,7 +33,14 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"## Data Loading\n\n"
"## Default Resampling Strategy\n\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Data Loading\n\n"
]
},
{
@@ -51,7 +58,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"## Build and fit a classifier with default resampling strategy\n\n"
"### Build and fit a classifier with default resampling strategy\n\n"
]
},
{
@@ -69,7 +76,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"## Search for an ensemble of machine learning algorithms\n\n"
"### Search for an ensemble of machine learning algorithms\n\n"
]
},
{
@@ -87,7 +94,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"## Print the final ensemble performance\n\n"
"### Print the final ensemble performance\n\n"
]
},
{
@@ -105,7 +112,14 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"## Build and fit a classifier with Cross validation resampling strategy\n\n"
"## Cross validation Resampling Strategy\n\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Build and fit a classifier with Cross validation resampling strategy\n\n"
]
},
{
@@ -123,7 +137,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"## Search for an ensemble of machine learning algorithms\n\n"
"### Search for an ensemble of machine learning algorithms\n\n"
]
},
{
@@ -141,7 +155,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"## Print the final ensemble performance\n\n"
"### Print the final ensemble performance\n\n"
]
},
{
@@ -159,7 +173,14 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"## Build and fit a classifier with Stratified resampling strategy\n\n"
"## Stratified Resampling Strategy\n\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Build and fit a classifier with Stratified resampling strategy\n\n"
]
},
{
@@ -177,7 +198,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"## Search for an ensemble of machine learning algorithms\n\n"
"### Search for an ensemble of machine learning algorithms\n\n"
]
},
{