Commit b936dd3

Merge pull request Azure#69 from rastala/master
New SDK version 0.1.74
2 parents 32102e2 + 7339c95 commit b936dd3

46 files changed, +8203 -4617 lines changed

01.getting-started/01.train-within-notebook/01.train-within-notebook.ipynb

Lines changed: 806 additions & 807 deletions
Large diffs are not rendered by default.

01.getting-started/02.train-on-local/.ipynb_checkpoints/02.train-on-local-checkpoint.ipynb

Lines changed: 477 additions & 0 deletions
Large diffs are not rendered by default.

01.getting-started/02.train-on-local/02.train-on-local.ipynb

Lines changed: 475 additions & 475 deletions
Large diffs are not rendered by default.
Lines changed: 325 additions & 0 deletions
@@ -0,0 +1,325 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Copyright (c) Microsoft Corporation. All rights reserved.\n",
+    "\n",
+    "Licensed under the MIT License."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "# 03. Train on Azure Container Instance (EXPERIMENTAL)\n",
+    "\n",
+    "* Create Workspace\n",
+    "* Create Project\n",
+    "* Create `train.py` in the project folder.\n",
+    "* Configure an ACI (Azure Container Instance) run\n",
+    "* Execute in ACI"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Prerequisites\n",
+    "Make sure you go through the [00. Installation and Configuration](00.configuration.ipynb) Notebook first if you haven't."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# Check core SDK version number\n",
+    "import azureml.core\n",
+    "\n",
+    "print(\"SDK version:\", azureml.core.VERSION)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Initialize Workspace\n",
+    "\n",
+    "Initialize a workspace object from persisted configuration."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "tags": [
+     "create workspace"
+    ]
+   },
+   "outputs": [],
+   "source": [
+    "from azureml.core import Workspace\n",
+    "\n",
+    "ws = Workspace.from_config()\n",
+    "print(ws.name, ws.resource_group, ws.location, ws.subscription_id, sep = '\\n')"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Create An Experiment\n",
+    "\n",
+    "An **Experiment** is a logical container in an Azure ML Workspace. It hosts run records, which can include run metrics and output artifacts from your experiments."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "from azureml.core import Experiment\n",
+    "experiment_name = 'train-on-aci'\n",
+    "experiment = Experiment(workspace = ws, name = experiment_name)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Create a folder to store the training script."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "import os\n",
+    "script_folder = './samples/train-on-aci'\n",
+    "os.makedirs(script_folder, exist_ok = True)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Remote execution on ACI\n",
+    "\n",
+    "Use the `%%writefile` magic to write the training code to a `train.py` file under the project folder."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "%%writefile $script_folder/train.py\n",
+    "\n",
+    "import os\n",
+    "from sklearn.datasets import load_diabetes\n",
+    "from sklearn.linear_model import Ridge\n",
+    "from sklearn.metrics import mean_squared_error\n",
+    "from sklearn.model_selection import train_test_split\n",
+    "from azureml.core.run import Run\n",
+    "from sklearn.externals import joblib\n",
+    "\n",
+    "import numpy as np\n",
+    "\n",
+    "os.makedirs('./outputs', exist_ok=True)\n",
+    "\n",
+    "X, y = load_diabetes(return_X_y = True)\n",
+    "\n",
+    "run = Run.get_submitted_run()\n",
+    "\n",
+    "X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.2, random_state = 0)\n",
+    "data = {\"train\": {\"X\": X_train, \"y\": y_train},\n",
+    "        \"test\": {\"X\": X_test, \"y\": y_test}}\n",
+    "\n",
+    "# list of numbers from 0.0 up to (but not including) 1.0 with a 0.05 interval\n",
+    "alphas = np.arange(0.0, 1.0, 0.05)\n",
+    "\n",
+    "for alpha in alphas:\n",
+    "    # Use Ridge algorithm to create a regression model\n",
+    "    reg = Ridge(alpha = alpha)\n",
+    "    reg.fit(data[\"train\"][\"X\"], data[\"train\"][\"y\"])\n",
+    "\n",
+    "    preds = reg.predict(data[\"test\"][\"X\"])\n",
+    "    mse = mean_squared_error(data[\"test\"][\"y\"], preds)\n",
+    "    run.log('alpha', alpha)\n",
+    "    run.log('mse', mse)\n",
+    "\n",
+    "    model_file_name = 'ridge_{0:.2f}.pkl'.format(alpha)\n",
+    "    # write the model straight into ./outputs (joblib opens the file itself)\n",
+    "    joblib.dump(value = reg, filename = os.path.join('outputs', model_file_name))\n",
+    "\n",
+    "    print('alpha is {0:.2f}, and mse is {1:0.2f}'.format(alpha, mse))"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Configure for using ACI\n",
+    "Linux-based ACI is available in `westus`, `eastus`, `westeurope`, `northeurope`, `westus2` and `southeastasia` regions. See details [here](https://docs.microsoft.com/en-us/azure/container-instances/container-instances-quotas#region-availability)."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "tags": [
+     "configure run"
+    ]
+   },
+   "outputs": [],
+   "source": [
+    "from azureml.core.runconfig import RunConfiguration\n",
+    "from azureml.core.conda_dependencies import CondaDependencies\n",
+    "\n",
+    "# create a new runconfig object\n",
+    "run_config = RunConfiguration()\n",
+    "\n",
+    "# signal that you want to use ACI to execute script.\n",
+    "run_config.target = \"containerinstance\"\n",
+    "\n",
+    "# ACI container group is only supported in certain regions, which can be different than the region the Workspace is in.\n",
+    "run_config.container_instance.region = 'eastus'\n",
+    "\n",
+    "# set the ACI CPU and Memory\n",
+    "run_config.container_instance.cpu_cores = 1\n",
+    "run_config.container_instance.memory_gb = 2\n",
+    "\n",
+    "# enable Docker\n",
+    "run_config.environment.docker.enabled = True\n",
+    "\n",
+    "# set Docker base image to the default CPU-based image\n",
+    "run_config.environment.docker.base_image = azureml.core.runconfig.DEFAULT_CPU_IMAGE\n",
+    "#run_config.environment.docker.base_image = 'microsoft/mmlspark:plus-0.9.9'\n",
+    "\n",
+    "# use conda_dependencies.yml to create a conda environment in the Docker image for execution\n",
+    "run_config.environment.python.user_managed_dependencies = False\n",
+    "\n",
+    "# auto-prepare the Docker image when used for execution (if it is not already prepared)\n",
+    "run_config.auto_prepare_environment = True\n",
+    "\n",
+    "# specify CondaDependencies obj\n",
+    "run_config.environment.python.conda_dependencies = CondaDependencies.create(conda_packages=['scikit-learn'])"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Submit the Experiment\n",
+    "Finally, run the training job on the ACI"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "tags": [
+     "remote run",
+     "aci"
+    ]
+   },
+   "outputs": [],
+   "source": [
+    "%%time\n",
+    "from azureml.core.script_run_config import ScriptRunConfig\n",
+    "\n",
+    "script_run_config = ScriptRunConfig(source_directory = script_folder,\n",
+    "                                    script= 'train.py',\n",
+    "                                    run_config = run_config)\n",
+    "\n",
+    "run = experiment.submit(script_run_config)\n"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "tags": [
+     "remote run",
+     "aci"
+    ]
+   },
+   "outputs": [],
+   "source": [
+    "%%time\n",
+    "# Shows output of the run on stdout.\n",
+    "run.wait_for_completion(show_output = True)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "tags": [
+     "query history"
+    ]
+   },
+   "outputs": [],
+   "source": [
+    "# Show run details\n",
+    "run"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "tags": [
+     "get metrics"
+    ]
+   },
+   "outputs": [],
+   "source": [
+    "# get all metrics logged in the run\n",
+    "run.get_metrics()\n",
+    "metrics = run.get_metrics()"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "import numpy as np\n",
+    "print('When alpha is {1:0.2f}, we have min MSE {0:0.2f}.'.format(\n",
+    "    min(metrics['mse']),\n",
+    "    metrics['alpha'][np.argmin(metrics['mse'])]\n",
+    "))"
+   ]
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "Python 3",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.6.5"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 2
+}
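
The final code cell of the notebook above reports the alpha with the lowest MSE by indexing `metrics['alpha']` with `np.argmin(metrics['mse'])`. The sketch below shows the same lookup without NumPy. It assumes `run.get_metrics()` returns two equal-length lists under the keys `'alpha'` and `'mse'`, which is what that cell relies on. The helper name `best_alpha` is hypothetical (it is not part of the SDK or of this commit), and the sample values are made up for illustration, not taken from a real run.

```python
def best_alpha(metrics):
    """Return (alpha, mse) for the lowest logged MSE.

    Assumes `metrics` holds two equal-length lists under the keys
    'alpha' and 'mse' (hypothetical shape, mirroring the notebook's last cell).
    """
    alphas, mses = metrics["alpha"], metrics["mse"]
    if not mses or len(alphas) != len(mses):
        raise ValueError("expected non-empty 'alpha' and 'mse' lists of equal length")
    # Index of the smallest MSE; ties resolve to the first occurrence,
    # which matches the behaviour of np.argmin.
    best_index = min(range(len(mses)), key=mses.__getitem__)
    return alphas[best_index], mses[best_index]


# Made-up values for illustration only; not results from an actual run.
sample_metrics = {"alpha": [0.0, 0.05, 0.10], "mse": [3.5, 3.2, 3.4]}
alpha, mse = best_alpha(sample_metrics)
print("When alpha is {0:0.2f}, we have min MSE {1:0.2f}.".format(alpha, mse))
```

Returning the pair from a single index keeps the alpha and the MSE in step, and the length check fails early if the two logged lists ever fall out of sync.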