Commit 441a5b0

Merge pull request Azure#1440 from Azure/release_update/Release-95
update samples from Release-95 as a part of SDK 1.27 release
2 parents 6f893ff + 70902df commit 441a5b0

29 files changed: +184 -41 lines changed
configuration.ipynb

Lines changed: 1 addition & 1 deletion
@@ -103,7 +103,7 @@
 "source": [
 "import azureml.core\n",
 "\n",
-"print(\"This notebook was created using version 1.26.0 of the Azure ML SDK\")\n",
+"print(\"This notebook was created using version 1.27.0 of the Azure ML SDK\")\n",
 "print(\"You are currently using version\", azureml.core.VERSION, \"of the Azure ML SDK\")"
 ]
 },

contrib/fairness/fairlearn-azureml-mitigation.ipynb

Lines changed: 7 additions & 8 deletions
@@ -36,9 +36,9 @@
 "\n",
 "<a id=\"Introduction\"></a>\n",
 "## Introduction\n",
-"This notebook shows how to use [Fairlearn (an open source fairness assessment and unfairness mitigation package)](http://fairlearn.github.io) and Azure Machine Learning Studio for a binary classification problem. This example uses the well-known adult census dataset. For the purposes of this notebook, we shall treat this as a loan decision problem. We will pretend that the label indicates whether or not each individual repaid a loan in the past. We will use the data to train a predictor to predict whether previously unseen individuals will repay a loan or not. The assumption is that the model predictions are used to decide whether an individual should be offered a loan. Its purpose is purely illustrative of a workflow including a fairness dashboard - in particular, we do **not** include a full discussion of the detailed issues which arise when considering fairness in machine learning. For such discussions, please [refer to the Fairlearn website](http://fairlearn.github.io/).\n",
+"This notebook shows how to use [Fairlearn (an open source fairness assessment and unfairness mitigation package)](http://fairlearn.org) and Azure Machine Learning Studio for a binary classification problem. This example uses the well-known adult census dataset. For the purposes of this notebook, we shall treat this as a loan decision problem. We will pretend that the label indicates whether or not each individual repaid a loan in the past. We will use the data to train a predictor to predict whether previously unseen individuals will repay a loan or not. The assumption is that the model predictions are used to decide whether an individual should be offered a loan. Its purpose is purely illustrative of a workflow including a fairness dashboard - in particular, we do **not** include a full discussion of the detailed issues which arise when considering fairness in machine learning. For such discussions, please [refer to the Fairlearn website](http://fairlearn.org/).\n",
 "\n",
-"We will apply the [grid search algorithm](https://fairlearn.github.io/master/api_reference/fairlearn.reductions.html#fairlearn.reductions.GridSearch) from the Fairlearn package using a specific notion of fairness called Demographic Parity. This produces a set of models, and we will view these in a dashboard both locally and in the Azure Machine Learning Studio.\n",
+"We will apply the [grid search algorithm](https://fairlearn.org/v0.4.6/api_reference/fairlearn.reductions.html#fairlearn.reductions.GridSearch) from the Fairlearn package using a specific notion of fairness called Demographic Parity. This produces a set of models, and we will view these in a dashboard both locally and in the Azure Machine Learning Studio.\n",
 "\n",
 "### Setup\n",
 "\n",
@@ -48,7 +48,7 @@
 "* `azureml-contrib-fairness`\n",
 "* `fairlearn==0.4.6` (v0.5.0 will work with minor modifications)\n",
 "* `joblib`\n",
-"* `shap`\n",
+"* `liac-arff`\n",
 "\n",
 "Fairlearn relies on features introduced in v0.22.1 of `scikit-learn`. If you have an older version already installed, please uncomment and run the following cell:"
 ]
@@ -88,7 +88,6 @@
 "from fairlearn.widget import FairlearnDashboard\n",
 "\n",
 "from sklearn.compose import ColumnTransformer\n",
-"from sklearn.datasets import fetch_openml\n",
 "from sklearn.impute import SimpleImputer\n",
 "from sklearn.linear_model import LogisticRegression\n",
 "from sklearn.model_selection import train_test_split\n",
@@ -112,9 +111,9 @@
 "metadata": {},
 "outputs": [],
 "source": [
-"from fairness_nb_utils import fetch_openml_with_retries\n",
+"from fairness_nb_utils import fetch_census_dataset\n",
 "\n",
-"data = fetch_openml_with_retries(data_id=1590)\n",
+"data = fetch_census_dataset()\n",
 " \n",
 "# Extract the items we want\n",
 "X_raw = data.data\n",
@@ -137,7 +136,7 @@
 "outputs": [],
 "source": [
 "A = X_raw[['sex','race']]\n",
-"X_raw = X_raw.drop(labels=['sex', 'race'],axis = 1)"
+"X_raw = X_raw.drop(labels=['sex', 'race'], axis = 1)"
 ]
 },
 {
@@ -584,7 +583,7 @@
 "<a id=\"Conclusion\"></a>\n",
 "## Conclusion\n",
 "\n",
-"In this notebook we have demonstrated how to use the `GridSearch` algorithm from Fairlearn to generate a collection of models, and then present them in the fairness dashboard in Azure Machine Learning Studio. Please remember that this notebook has not attempted to discuss the many considerations which should be part of any approach to unfairness mitigation. The [Fairlearn website](http://fairlearn.github.io/) provides that discussion"
+"In this notebook we have demonstrated how to use the `GridSearch` algorithm from Fairlearn to generate a collection of models, and then present them in the fairness dashboard in Azure Machine Learning Studio. Please remember that this notebook has not attempted to discuss the many considerations which should be part of any approach to unfairness mitigation. The [Fairlearn website](http://fairlearn.org/) provides that discussion"
 ]
 },
 {
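The notebook text in this diff names Demographic Parity as the fairness notion used with `GridSearch`. As a rough illustration of what that notion measures, the sketch below compares the rate of positive predictions across groups using only the standard library. It is not Fairlearn's implementation; the function names and the sample predictions are made up for this example.

```python
from collections import defaultdict


def selection_rates(y_pred, sensitive):
    """Return the fraction of positive (1) predictions per sensitive group."""
    counts = defaultdict(int)
    positives = defaultdict(int)
    for pred, group in zip(y_pred, sensitive):
        counts[group] += 1
        positives[group] += int(pred == 1)
    return {group: positives[group] / counts[group] for group in counts}


def demographic_parity_difference(y_pred, sensitive):
    """Largest gap in selection rate between any two groups (0 means parity)."""
    rates = selection_rates(y_pred, sensitive)
    return max(rates.values()) - min(rates.values())


# Illustrative, made-up predictions: 1 = offer loan, 0 = decline.
y_pred = [1, 0, 1, 1, 0, 0, 1, 0]
sex = ["Male", "Male", "Male", "Male", "Female", "Female", "Female", "Female"]

rates = selection_rates(y_pred, sex)
gap = demographic_parity_difference(y_pred, sex)
print(rates)
print(gap)
```

A mitigation method that targets Demographic Parity tries to shrink this gap while giving up as little accuracy as possible.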

contrib/fairness/fairlearn-azureml-mitigation.yml

Lines changed: 1 addition & 0 deletions
@@ -5,3 +5,4 @@ dependencies:
 - azureml-contrib-fairness
 - fairlearn==0.4.6
 - joblib
+- liac-arff

contrib/fairness/fairness_nb_utils.py

Lines changed: 65 additions & 0 deletions
@@ -4,7 +4,13 @@
 
 """Utilities for azureml-contrib-fairness notebooks."""
 
+import arff
+from collections import OrderedDict
+from contextlib import closing
+import gzip
+import pandas as pd
 from sklearn.datasets import fetch_openml
+from sklearn.utils import Bunch
 import time
 
 
@@ -26,3 +32,62 @@ def fetch_openml_with_retries(data_id, max_retries=4, retry_delay=60):
         raise RuntimeError("Unable to download dataset from OpenML")
 
     return data
+
+
+_categorical_columns = [
+    'workclass',
+    'education',
+    'marital-status',
+    'occupation',
+    'relationship',
+    'race',
+    'sex',
+    'native-country'
+]
+
+
+def fetch_census_dataset():
+    """Fetch the Adult Census Dataset
+
+    This uses a particular URL for the Adult Census dataset. The code
+    is a simplified version of fetch_openml() in sklearn.
+
+    The data are copied from:
+    https://openml.org/data/v1/download/1595261.gz
+    (as of 2021-03-31)
+    """
+    try:
+        from urllib import urlretrieve
+    except ImportError:
+        from urllib.request import urlretrieve
+
+    filename = "1595261.gz"
+    data_url = "https://rainotebookscdn.blob.core.windows.net/datasets/"
+    urlretrieve(data_url + filename, filename)
+
+    http_stream = gzip.GzipFile(filename=filename, mode='rb')
+
+    with closing(http_stream):
+        def _stream_generator(response):
+            for line in response:
+                yield line.decode('utf-8')
+
+        stream = _stream_generator(http_stream)
+        data = arff.load(stream)
+
+    attributes = OrderedDict(data['attributes'])
+    arff_columns = list(attributes)
+
+    raw_df = pd.DataFrame(data=data['data'], columns=arff_columns)
+
+    target_column_name = 'class'
+    target = raw_df.pop(target_column_name)
+    for col_name in _categorical_columns:
+        dtype = pd.api.types.CategoricalDtype(attributes[col_name])
+        raw_df[col_name] = raw_df[col_name].astype(dtype, copy=False)
+
+    result = Bunch()
+    result.data = raw_df
+    result.target = target
+
+    return result
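The new `fetch_census_dataset()` reads the downloaded `.gz` file through `gzip.GzipFile` and hands `arff.load` a generator of decoded lines. The sketch below isolates that decompress-and-decode step so it can be tried offline with the standard library alone. The helper name `iter_decoded_lines` and the tiny sample file are assumptions made for this illustration; they are not part of the commit.

```python
import gzip
import os
import tempfile
from contextlib import closing


def iter_decoded_lines(path, encoding="utf-8"):
    """Yield text lines lazily from a gzip-compressed file."""
    stream = gzip.GzipFile(filename=path, mode="rb")
    # closing() guarantees the file handle is released once iteration ends.
    with closing(stream):
        for line in stream:
            yield line.decode(encoding)


# Build a tiny, made-up ARFF-like file so the sketch runs without a download.
sample = "@relation demo\n@attribute age numeric\n@data\n39\n50\n"
tmp_dir = tempfile.mkdtemp()
sample_path = os.path.join(tmp_dir, "sample.arff.gz")
with gzip.open(sample_path, "wb") as handle:
    handle.write(sample.encode("utf-8"))

lines = list(iter_decoded_lines(sample_path))
print(lines)
```

Decoding line by line keeps memory use proportional to one line instead of the whole decompressed file, which is presumably why the committed code streams the data too.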

contrib/fairness/upload-fairness-dashboard.ipynb

Lines changed: 3 additions & 4 deletions
@@ -50,7 +50,7 @@
 "* `azureml-contrib-fairness`\n",
 "* `fairlearn==0.4.6` (should also work with v0.5.0)\n",
 "* `joblib`\n",
-"* `shap`\n",
+"* `liac-arff`\n",
 "\n",
 "Fairlearn relies on features introduced in v0.22.1 of `scikit-learn`. If you have an older version already installed, please uncomment and run the following cell:"
 ]
@@ -88,7 +88,6 @@
 "source": [
 "from sklearn import svm\n",
 "from sklearn.compose import ColumnTransformer\n",
-"from sklearn.datasets import fetch_openml\n",
 "from sklearn.impute import SimpleImputer\n",
 "from sklearn.linear_model import LogisticRegression\n",
 "from sklearn.model_selection import train_test_split\n",
@@ -110,9 +109,9 @@
 "metadata": {},
 "outputs": [],
 "source": [
-"from fairness_nb_utils import fetch_openml_with_retries\n",
+"from fairness_nb_utils import fetch_census_dataset\n",
 "\n",
-"data = fetch_openml_with_retries(data_id=1590)\n",
+"data = fetch_census_dataset()\n",
 " \n",
 "# Extract the items we want\n",
 "X_raw = data.data\n",
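Both notebooks switch from `fetch_openml_with_retries(data_id=1590)` to `fetch_census_dataset()`, which performs a single `urlretrieve` call with no retry loop. If retry behaviour is still wanted around the new helper, a generic wrapper along the following lines could be used. This is a sketch under assumptions: `call_with_retries` and `flaky_download` are hypothetical names, and only the `max_retries=4` and `retry_delay=60` defaults are borrowed from the replaced helper's signature.

```python
import time


def call_with_retries(func, max_retries=4, retry_delay=60, sleep=time.sleep):
    """Call func() until it succeeds or max_retries attempts have failed."""
    for attempt in range(1, max_retries + 1):
        try:
            return func()
        except Exception as exc:  # deliberately broad for this sketch
            print("Attempt {0} failed: {1}".format(attempt, exc))
            if attempt < max_retries:
                sleep(retry_delay)
    raise RuntimeError("Unable to download dataset")


# Hypothetical flaky download that succeeds on the third call.
calls = {"count": 0}


def flaky_download():
    calls["count"] += 1
    if calls["count"] < 3:
        raise IOError("temporary network error")
    return "dataset"


result = call_with_retries(flaky_download, retry_delay=0)
print(result, calls["count"])
```

In a notebook this might be invoked as `call_with_retries(fetch_census_dataset)`, assuming the helper is importable from `fairness_nb_utils`.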

contrib/fairness/upload-fairness-dashboard.yml

Lines changed: 1 addition & 0 deletions
@@ -5,3 +5,4 @@ dependencies:
 - azureml-contrib-fairness
 - fairlearn==0.4.6
 - joblib
+- liac-arff

how-to-use-azureml/automated-machine-learning/automl_env.yml

Lines changed: 2 additions & 2 deletions
@@ -21,8 +21,8 @@ dependencies:
 
 - pip:
   # Required packages for AzureML execution, history, and data preparation.
-  - azureml-widgets~=1.26.0
+  - azureml-widgets~=1.27.0
   - pytorch-transformers==1.0.0
   - spacy==2.1.8
   - https://aka.ms/automl-resources/packages/en_core_web_sm-2.1.0.tar.gz
-  - -r https://automlcesdkdataresources.blob.core.windows.net/validated-requirements/1.26.0/validated_win32_requirements.txt [--no-deps]
+  - -r https://automlcesdkdataresources.blob.core.windows.net/validated-requirements/1.27.0/validated_win32_requirements.txt [--no-deps]

how-to-use-azureml/automated-machine-learning/automl_env_linux.yml

Lines changed: 2 additions & 2 deletions
@@ -21,8 +21,8 @@ dependencies:
 
 - pip:
   # Required packages for AzureML execution, history, and data preparation.
-  - azureml-widgets~=1.26.0
+  - azureml-widgets~=1.27.0
   - pytorch-transformers==1.0.0
   - spacy==2.1.8
   - https://aka.ms/automl-resources/packages/en_core_web_sm-2.1.0.tar.gz
-  - -r https://automlcesdkdataresources.blob.core.windows.net/validated-requirements/1.26.0/validated_linux_requirements.txt [--no-deps]
+  - -r https://automlcesdkdataresources.blob.core.windows.net/validated-requirements/1.27.0/validated_linux_requirements.txt [--no-deps]

how-to-use-azureml/automated-machine-learning/automl_env_mac.yml

Lines changed: 2 additions & 2 deletions
@@ -22,8 +22,8 @@ dependencies:
 
 - pip:
   # Required packages for AzureML execution, history, and data preparation.
-  - azureml-widgets~=1.26.0
+  - azureml-widgets~=1.27.0
   - pytorch-transformers==1.0.0
   - spacy==2.1.8
   - https://aka.ms/automl-resources/packages/en_core_web_sm-2.1.0.tar.gz
-  - -r https://automlcesdkdataresources.blob.core.windows.net/validated-requirements/1.26.0/validated_darwin_requirements.txt [--no-deps]
+  - -r https://automlcesdkdataresources.blob.core.windows.net/validated-requirements/1.27.0/validated_darwin_requirements.txt [--no-deps]
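The environment files pin `azureml-widgets` with the `~=` (compatible release) operator. For a three-part version such as `1.27.0`, this accepts any `1.27.x` at or above `1.27.0` and rejects `1.28.0`. The sketch below approximates that rule for purely numeric versions. It is an illustration only, not pip's resolver; the function names are invented here, and pre-release or post-release tags are not handled.

```python
def parse_version(text):
    """Turn a purely numeric version such as '1.27.0' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def is_compatible_release(candidate, spec):
    """Approximate `~=spec`: candidate >= spec, with the same leading components."""
    cand, base = parse_version(candidate), parse_version(spec)
    if len(base) < 2:
        raise ValueError("~= needs at least two version components")
    # Pad the candidate with zeros so '1.27' compares like '1.27.0'.
    cand = cand + (0,) * (len(base) - len(cand))
    prefix = base[:-1]
    return cand >= base and cand[:len(prefix)] == prefix


versions = ["1.26.0", "1.27.0", "1.27.3", "1.28.0"]
checks = {v: is_compatible_release(v, "1.27.0") for v in versions}
print(checks)
```

Under this rule the bump from `~=1.26.0` to `~=1.27.0` in the diff moves the accepted range from the 1.26 series to the 1.27 series.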

how-to-use-azureml/automated-machine-learning/classification-bank-marketing-all-features/auto-ml-classification-bank-marketing-all-features.ipynb

Lines changed: 1 addition & 1 deletion
@@ -105,7 +105,7 @@
 "metadata": {},
 "outputs": [],
 "source": [
-"print(\"This notebook was created using version 1.26.0 of the Azure ML SDK\")\n",
+"print(\"This notebook was created using version 1.27.0 of the Azure ML SDK\")\n",
 "print(\"You are currently using version\", azureml.core.VERSION, \"of the Azure ML SDK\")"
 ]
 },
