update samples from Release-79 as a part of SDK release #1255

Merged
merged 1 commit on Dec 7, 2020
4 changes: 2 additions & 2 deletions NBSETUP.md
@@ -28,7 +28,7 @@ git clone https://github.com/Azure/MachineLearningNotebooks.git
pip install azureml-sdk[notebooks,tensorboard]

# install model explainability component
pip install azureml-sdk[explain]
pip install azureml-sdk[interpret]

# install automated ml components
pip install azureml-sdk[automl]
@@ -86,7 +86,7 @@ If you need additional Azure ML SDK components, you can either modify the Docker
pip install azureml-sdk[automl]

# install the core SDK and model explainability component
pip install azureml-sdk[explain]
pip install azureml-sdk[interpret]

# install the core SDK and experimental components
pip install azureml-sdk[contrib]
2 changes: 1 addition & 1 deletion configuration.ipynb
@@ -103,7 +103,7 @@
"source": [
"import azureml.core\n",
"\n",
"print(\"This notebook was created using version 1.18.0 of the Azure ML SDK\")\n",
"print(\"This notebook was created using version 1.19.0 of the Azure ML SDK\")\n",
"print(\"You are currently using version\", azureml.core.VERSION, \"of the Azure ML SDK\")"
]
},
9 changes: 6 additions & 3 deletions contrib/fairness/fairlearn-azureml-mitigation.ipynb
@@ -38,7 +38,7 @@
"## Introduction\n",
"This notebook shows how to use [Fairlearn (an open source fairness assessment and unfairness mitigation package)](http://fairlearn.github.io) and Azure Machine Learning Studio for a binary classification problem. This example uses the well-known adult census dataset. For the purposes of this notebook, we shall treat this as a loan decision problem. We will pretend that the label indicates whether or not each individual repaid a loan in the past. We will use the data to train a predictor to predict whether previously unseen individuals will repay a loan or not. The assumption is that the model predictions are used to decide whether an individual should be offered a loan. Its purpose is purely illustrative of a workflow including a fairness dashboard - in particular, we do **not** include a full discussion of the detailed issues which arise when considering fairness in machine learning. For such discussions, please [refer to the Fairlearn website](http://fairlearn.github.io/).\n",
"\n",
"We will apply the [grid search algorithm](https://fairlearn.github.io/api_reference/fairlearn.reductions.html#fairlearn.reductions.GridSearch) from the Fairlearn package using a specific notion of fairness called Demographic Parity. This produces a set of models, and we will view these in a dashboard both locally and in the Azure Machine Learning Studio.\n",
"We will apply the [grid search algorithm](https://fairlearn.github.io/master/api_reference/fairlearn.reductions.html#fairlearn.reductions.GridSearch) from the Fairlearn package using a specific notion of fairness called Demographic Parity. This produces a set of models, and we will view these in a dashboard both locally and in the Azure Machine Learning Studio.\n",
"\n",
"### Setup\n",
"\n",
@@ -98,8 +98,11 @@
"metadata": {},
"outputs": [],
"source": [
"from sklearn.datasets import fetch_openml\n",
"data = fetch_openml(data_id=1590, as_frame=True)\n",
"from utilities import fetch_openml_with_retries\n",
"\n",
"data = fetch_openml_with_retries(data_id=1590)\n",
" \n",
"# Extract the items we want\n",
"X_raw = data.data\n",
"Y = (data.target == '>50K') * 1\n",
"\n",
7 changes: 5 additions & 2 deletions contrib/fairness/upload-fairness-dashboard.ipynb
@@ -98,8 +98,11 @@
"metadata": {},
"outputs": [],
"source": [
"from sklearn.datasets import fetch_openml\n",
"data = fetch_openml(data_id=1590, as_frame=True)\n",
"from utilities import fetch_openml_with_retries\n",
"\n",
"data = fetch_openml_with_retries(data_id=1590)\n",
" \n",
"# Extract the items we want\n",
"X_raw = data.data\n",
"Y = (data.target == '>50K') * 1"
]
28 changes: 28 additions & 0 deletions contrib/fairness/utilities.py
@@ -0,0 +1,28 @@
# ---------------------------------------------------------
# Copyright (c) Microsoft Corporation. All rights reserved.
# ---------------------------------------------------------

"""Utilities for azureml-contrib-fairness notebooks."""

from sklearn.datasets import fetch_openml
import time


def fetch_openml_with_retries(data_id, max_retries=4, retry_delay=60):
"""Fetch a given dataset from OpenML with retries as specified."""
for i in range(max_retries):
try:
print("Download attempt {0} of {1}".format(i + 1, max_retries))
data = fetch_openml(data_id=data_id, as_frame=True)
break
except Exception as e:
print("Download attempt failed with exception:")
print(e)
if i + 1 != max_retries:
print("Will retry after {0} seconds".format(retry_delay))
time.sleep(retry_delay)
retry_delay = retry_delay * 2
else:
raise RuntimeError("Unable to download dataset from OpenML")

return data
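
A quick usage note: the two fairness notebooks above import this helper in place of calling fetch_openml directly, so a transient OpenML or network failure no longer kills the run. With the default arguments the worst case is four attempts spread over roughly seven minutes of waiting (60 + 120 + 240 seconds). A small usage sketch; the shorter retry settings here are only for illustration:

```python
# Usage sketch for the helper above; retry settings shortened for illustration.
from utilities import fetch_openml_with_retries

data = fetch_openml_with_retries(data_id=1590, max_retries=3, retry_delay=5)

X_raw = data.data                   # features as a pandas DataFrame
Y = (data.target == '>50K') * 1     # binary label, as in the notebooks
print(X_raw.shape, Y.mean())
```
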
6 changes: 3 additions & 3 deletions how-to-use-azureml/automated-machine-learning/automl_env.yml
@@ -3,7 +3,7 @@ dependencies:
# The python interpreter version.
# Currently Azure ML only supports 3.5.2 and later.
- pip<=19.3.1
- python>=3.5.2,<3.6.8
- python>=3.5.2,<3.8
- nb_conda
- boto3==1.15.18
- matplotlib==2.1.0
@@ -21,8 +21,8 @@ dependencies:

- pip:
# Required packages for AzureML execution, history, and data preparation.
- azureml-widgets~=1.18.0
- azureml-widgets~=1.19.0
- pytorch-transformers==1.0.0
- spacy==2.1.8
- https://aka.ms/automl-resources/packages/en_core_web_sm-2.1.0.tar.gz
- -r https://automlcesdkdataresources.blob.core.windows.net/validated-requirements/1.18.0/validated_win32_requirements.txt [--no-deps]
- -r https://automlcesdkdataresources.blob.core.windows.net/validated-requirements/1.19.0/validated_win32_requirements.txt [--no-deps]
@@ -3,7 +3,7 @@ dependencies:
# The python interpreter version.
# Currently Azure ML only supports 3.5.2 and later.
- pip<=19.3.1
- python>=3.5.2,<3.6.8
- python>=3.5.2,<3.8
- nb_conda
- boto3==1.15.18
- matplotlib==2.1.0
@@ -21,9 +21,9 @@ dependencies:

- pip:
# Required packages for AzureML execution, history, and data preparation.
- azureml-widgets~=1.18.0
- azureml-widgets~=1.19.0
- pytorch-transformers==1.0.0
- spacy==2.1.8
- https://aka.ms/automl-resources/packages/en_core_web_sm-2.1.0.tar.gz
- -r https://automlcesdkdataresources.blob.core.windows.net/validated-requirements/1.18.0/validated_linux_requirements.txt [--no-deps]
- -r https://automlcesdkdataresources.blob.core.windows.net/validated-requirements/1.19.0/validated_linux_requirements.txt [--no-deps]

@@ -4,7 +4,7 @@ dependencies:
# Currently Azure ML only supports 3.5.2 and later.
- pip<=19.3.1
- nomkl
- python>=3.5.2,<3.6.8
- python>=3.5.2,<3.8
- nb_conda
- boto3==1.15.18
- matplotlib==2.1.0
@@ -22,8 +22,8 @@ dependencies:

- pip:
# Required packages for AzureML execution, history, and data preparation.
- azureml-widgets~=1.18.0
- azureml-widgets~=1.19.0
- pytorch-transformers==1.0.0
- spacy==2.1.8
- https://aka.ms/automl-resources/packages/en_core_web_sm-2.1.0.tar.gz
- -r https://automlcesdkdataresources.blob.core.windows.net/validated-requirements/1.18.0/validated_darwin_requirements.txt [--no-deps]
- -r https://automlcesdkdataresources.blob.core.windows.net/validated-requirements/1.19.0/validated_darwin_requirements.txt [--no-deps]
@@ -105,7 +105,7 @@
"metadata": {},
"outputs": [],
"source": [
"print(\"This notebook was created using version 1.18.0 of the Azure ML SDK\")\n",
"print(\"This notebook was created using version 1.19.0 of the Azure ML SDK\")\n",
"print(\"You are currently using version\", azureml.core.VERSION, \"of the Azure ML SDK\")"
]
},
@@ -93,7 +93,7 @@
"metadata": {},
"outputs": [],
"source": [
"print(\"This notebook was created using version 1.18.0 of the Azure ML SDK\")\n",
"print(\"This notebook was created using version 1.19.0 of the Azure ML SDK\")\n",
"print(\"You are currently using version\", azureml.core.VERSION, \"of the Azure ML SDK\")"
]
},
@@ -96,7 +96,7 @@
"metadata": {},
"outputs": [],
"source": [
"print(\"This notebook was created using version 1.18.0 of the Azure ML SDK\")\n",
"print(\"This notebook was created using version 1.19.0 of the Azure ML SDK\")\n",
"print(\"You are currently using version\", azureml.core.VERSION, \"of the Azure ML SDK\")"
]
},
@@ -298,7 +298,7 @@
" compute_target=compute_target,\n",
" training_data=train_dataset,\n",
" label_column_name=target_column_name,\n",
" blocked_models = ['LightGBM'],\n",
" blocked_models = ['LightGBM', 'XGBoostClassifier'],\n",
" **automl_settings\n",
" )"
]
@@ -81,7 +81,7 @@
"metadata": {},
"outputs": [],
"source": [
"print(\"This notebook was created using version 1.18.0 of the Azure ML SDK\")\n",
"print(\"This notebook was created using version 1.19.0 of the Azure ML SDK\")\n",
"print(\"You are currently using version\", azureml.core.VERSION, \"of the Azure ML SDK\")"
]
},
@@ -68,6 +68,7 @@
"import logging\n",
"\n",
"from matplotlib import pyplot as plt\n",
"import json\n",
"import numpy as np\n",
"import pandas as pd\n",
" \n",
@@ -92,7 +93,7 @@
"metadata": {},
"outputs": [],
"source": [
"print(\"This notebook was created using version 1.18.0 of the Azure ML SDK\")\n",
"print(\"This notebook was created using version 1.19.0 of the Azure ML SDK\")\n",
"print(\"You are currently using version\", azureml.core.VERSION, \"of the Azure ML SDK\")"
]
},
@@ -322,6 +323,24 @@
"print(best_run)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Show hyperparameters\n",
"Show the model pipeline used for the best run with its hyperparameters."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"run_properties = json.loads(best_run.get_details()['properties']['pipeline_script'])\n",
"print(json.dumps(run_properties, indent = 1)) "
]
},
{
"cell_type": "markdown",
"metadata": {},
@@ -113,7 +113,7 @@
"metadata": {},
"outputs": [],
"source": [
"print(\"This notebook was created using version 1.18.0 of the Azure ML SDK\")\n",
"print(\"This notebook was created using version 1.19.0 of the Azure ML SDK\")\n",
"print(\"You are currently using version\", azureml.core.VERSION, \"of the Azure ML SDK\")"
]
},
@@ -3,11 +3,11 @@
from azureml.core.conda_dependencies import CondaDependencies
from azureml.train.estimator import Estimator
from azureml.core.run import Run
from azureml.automl.core.shared import constants


def split_fraction_by_grain(df, fraction, time_column_name,
grain_column_names=None):

if not grain_column_names:
df['tmp_grain_column'] = 'grain'
grain_column_names = ['tmp_grain_column']
@@ -17,10 +17,10 @@ def split_fraction_by_grain(df, fraction, time_column_name,
.groupby(grain_column_names, group_keys=False))

df_head = df_grouped.apply(lambda dfg: dfg.iloc[:-int(len(dfg) *
fraction)] if fraction > 0 else dfg)
fraction)] if fraction > 0 else dfg)

df_tail = df_grouped.apply(lambda dfg: dfg.iloc[-int(len(dfg) *
fraction):] if fraction > 0 else dfg[:0])
fraction):] if fraction > 0 else dfg[:0])

if 'tmp_grain_column' in grain_column_names:
for df2 in (df, df_head, df_tail):
@@ -59,11 +59,13 @@ def get_result_df(remote_run):
'primary_metric', 'Score'])
goal_minimize = False
for run in children:
if('run_algorithm' in run.properties and 'score' in run.properties):
if run.get_status().lower() == constants.RunState.COMPLETE_RUN \
and 'run_algorithm' in run.properties and 'score' in run.properties:
# We only count in the completed child runs.
summary_df[run.id] = [run.id, run.properties['run_algorithm'],
run.properties['primary_metric'],
float(run.properties['score'])]
if('goal' in run.properties):
if ('goal' in run.properties):
goal_minimize = run.properties['goal'].split('_')[-1] == 'min'

summary_df = summary_df.T.sort_values(
@@ -118,7 +120,6 @@ def run_multiple_inferences(summary_df, train_experiment, test_experiment,
compute_target, script_folder, test_dataset,
lookback_dataset, max_horizon, target_column_name,
time_column_name, freq):

for run_name, run_summary in summary_df.iterrows():
print(run_name)
print(run_summary)
@@ -87,7 +87,7 @@
"metadata": {},
"outputs": [],
"source": [
"print(\"This notebook was created using version 1.18.0 of the Azure ML SDK\")\n",
"print(\"This notebook was created using version 1.19.0 of the Azure ML SDK\")\n",
"print(\"You are currently using version\", azureml.core.VERSION, \"of the Azure ML SDK\")"
]
},
@@ -1,22 +1,24 @@
import argparse
import azureml.train.automl
from azureml.core import Run
from azureml.core import Dataset, Run
from sklearn.externals import joblib


parser = argparse.ArgumentParser()
parser.add_argument(
'--target_column_name', type=str, dest='target_column_name',
help='Target Column Name')
parser.add_argument(
'--test_dataset', type=str, dest='test_dataset',
help='Test Dataset')

args = parser.parse_args()
target_column_name = args.target_column_name
test_dataset_id = args.test_dataset

run = Run.get_context()
# get input dataset by name
test_dataset = run.input_datasets['test_data']
ws = run.experiment.workspace

df = test_dataset.to_pandas_dataframe().reset_index(drop=True)
# get the input dataset by id
test_dataset = Dataset.get_by_id(ws, id=test_dataset_id)

X_test_df = test_dataset.drop_columns(columns=[target_column_name]).to_pandas_dataframe().reset_index(drop=True)
y_test_df = test_dataset.with_timestamp_columns(None).keep_columns(columns=[target_column_name]).to_pandas_dataframe()
@@ -1,29 +1,32 @@
from azureml.train.estimator import Estimator
from azureml.core import ScriptRunConfig


def run_rolling_forecast(test_experiment, compute_target, train_run, test_dataset,
target_column_name, inference_folder='./forecast'):
def run_rolling_forecast(test_experiment, compute_target, train_run,
test_dataset, target_column_name,
inference_folder='./forecast'):
train_run.download_file('outputs/model.pkl',
inference_folder + '/model.pkl')

inference_env = train_run.get_environment()

est = Estimator(source_directory=inference_folder,
entry_script='forecasting_script.py',
script_params={
'--target_column_name': target_column_name
},
inputs=[test_dataset.as_named_input('test_data')],
compute_target=compute_target,
environment_definition=inference_env)
config = ScriptRunConfig(source_directory=inference_folder,
script='forecasting_script.py',
arguments=['--target_column_name',
target_column_name,
'--test_dataset',
test_dataset.as_named_input(test_dataset.name)],
compute_target=compute_target,
environment=inference_env)

run = test_experiment.submit(est,
tags={
'training_run_id': train_run.id,
'run_algorithm': train_run.properties['run_algorithm'],
'valid_score': train_run.properties['score'],
'primary_metric': train_run.properties['primary_metric']
})
run = test_experiment.submit(config,
tags={'training_run_id':
train_run.id,
'run_algorithm':
train_run.properties['run_algorithm'],
'valid_score':
train_run.properties['score'],
'primary_metric':
train_run.properties['primary_metric']})

run.log("run_algorithm", run.tags['run_algorithm'])
return run
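
The rewrite above replaces the Estimator-based submission with ScriptRunConfig. One detail worth calling out: passing test_dataset.as_named_input(...) as a command-line argument means the script receives the dataset's ID as a string, which is why the updated forecasting_script.py rehydrates it with Dataset.get_by_id instead of reading run.input_datasets. A condensed sketch of the two sides of that hand-off; objects such as test_experiment, compute_target, inference_env and test_dataset are assumed to be defined by the calling notebook, as in run_remote_inference above:

```python
# Condensed sketch of the hand-off introduced above (submit side vs. script side).
from azureml.core import Dataset, Run, ScriptRunConfig

# --- submit side (as in run_remote_inference) ---
# The dataset argument is resolved to the dataset's ID string at runtime.
config = ScriptRunConfig(source_directory="./forecast",
                         script="forecasting_script.py",
                         arguments=["--target_column_name", "demand",   # assumed column name
                                    "--test_dataset",
                                    test_dataset.as_named_input(test_dataset.name)],
                         compute_target=compute_target,
                         environment=inference_env)
run = test_experiment.submit(config)

# --- script side (inside forecasting_script.py) ---
# args comes from the script's argparse parser shown earlier in this PR.
script_run = Run.get_context()
ws = script_run.experiment.workspace
test_dataset = Dataset.get_by_id(ws, id=args.test_dataset)
```
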
@@ -97,7 +97,7 @@
"metadata": {},
"outputs": [],
"source": [
"print(\"This notebook was created using version 1.18.0 of the Azure ML SDK\")\n",
"print(\"This notebook was created using version 1.19.0 of the Azure ML SDK\")\n",
"print(\"You are currently using version\", azureml.core.VERSION, \"of the Azure ML SDK\")"
]
},