Working on questions and assignments for Session 3
svpino committed Aug 19, 2023
1 parent 2acf57b commit 3bf2f1a
Showing 2 changed files with 48 additions and 6 deletions.
53 changes: 48 additions & 5 deletions penguins/penguins-cohort.ipynb
@@ -10707,7 +10707,7 @@
},
{
"cell_type": "code",
"execution_count": 35,
"execution_count": null,
"id": "90fe82ae-6a2c-4461-bc83-bb52d8871e3b",
"metadata": {
"tags": []
@@ -10996,13 +10996,57 @@
"\n",
"<div style=\"margin: 30px 0 10px 0;\"><span style=\"font-size: 1.1em; padding:4px; background-color: #b8bf9f; color: #000;\"><strong>Question 3.1</strong></span></div>\n",
"\n",
"TBD\n",
"When a Training Job finishes, SageMaker automatically uploads the model to S3. Which of the following statements about this process is correct?\n",
"\n",
"1. SageMaker automatically creates a `model.tar.gz` file with the entire content of the `/opt/ml/model` directory.\n",
"2. SageMaker automatically creates a `model.tar.gz` file with any files inside the `/opt/ml/model` directory as long as those files belong to the model we trained.\n",
"3. SageMaker automatically creates a `model.tar.gz` file with any new files created inside the container by the training script.\n",
"4. SageMaker automatically creates a `model.tar.gz` file with the content of the output folder configured in the training script.\n",
"\n",
"\n",
"<div style=\"margin: 30px 0 10px 0;\"><span style=\"font-size: 1.1em; padding:4px; background-color: #b8bf9f; color: #000;\"><strong>Question 3.2</strong></span></div>\n",
"\n",
"Our pipeline uses \"file mode\" to provide the Training Job access to the dataset. When using file mode, SageMaker downloads the training data from S3 to a local directory in the training container. Imagine we have a large dataset and don't want to wait for SageMaker to download it every time we want to train a model. How can we solve this problem?\n",
"\n",
"1. We can train our model with a smaller portion of the dataset.\n",
"2. We can increase the number of instances and train many models in parallel.\n",
"3. We can use \"fast file mode\" to get file system access to S3.\n",
"4. We can use \"pipe mode\" to stream data directly from S3 into the training container.\n",
"\n",
"\n",
"<div style=\"margin: 30px 0 10px 0;\"><span style=\"font-size: 1.1em; padding:4px; background-color: #b8bf9f; color: #000;\"><strong>Question 3.3</strong></span></div>\n",
"\n",
"When tuning the model, we used an `IntegerParameter` to define the range we wanted to explore for the number of epochs. Which of the following classes are also supported to define the range of other types of parameters?\n",
"\n",
"1. `FloatParameter`\n",
"2. `ContinuousParameter`\n",
"3. `CategoricalParameter`\n",
"4. `DateTimeParameter`\n",
"\n",
"\n",
"<div style=\"margin: 30px 0 10px 0;\"><span style=\"font-size: 1.1em; padding:4px; background-color: #b8bf9f; color: #000;\"><strong>Question 3.4</strong></span></div>\n",
"\n",
"Which of the following statements are true about the usage of `max_jobs` and `max_parallel_jobs` when running a Hyperparameter Tuning Job?\n",
"\n",
"1. `max_jobs` represents the maximum total number of Training Jobs that the Hyperparameter Tuning Job will start.\n",
"2. `max_parallel_jobs` represents the maximum total number of Training Jobs that will run in parallel at any given time.\n",
"3. `max_parallel_jobs` can never be larger than `max_jobs`.\n",
"4. `max_jobs` can never be larger than `max_parallel_jobs`.\n",
"\n",
"\n",
"\n",
"## Assignments\n",
"\n",
"* <span style=\"padding:4px; background-color: #f2a68a; color: #000;\"><strong>Assignment 3.1</strong></span> We currently define the number of epochs to train the model as a constant that we pass to the Estimator using the list of hyperparameters. Replace this constant with a new Pipeline Parameter named `training_epochs`. You'll need to specify this new parameter when creating the Pipeline.\n",
"\n"
"\n",
"* <span style=\"padding:4px; background-color: #f2a68a; color: #000;\"><strong>Assignment 3.1</strong></span> The training script is using a hard-coded learning rate value to train the model. Modify the code to accept the learning rate as a parameter that we can control from outside the script.\n",
"\n",
"* <span style=\"padding:4px; background-color: #f2a68a; color: #000;\"><strong>Assignment 3.2</strong></span> We currently define the number of epochs to train the model as a constant that we pass to the Estimator using the list of hyperparameters. Replace this constant with a new Pipeline Parameter named `training_epochs`. You'll need to specify this new parameter when creating the Pipeline.\n",
"\n",
"* <span style=\"padding:4px; background-color: #f2a68a; color: #000;\"><strong>Assignment 3.3</strong></span> Our pipeline uses \"file mode\" to provide the Training Job access to the dataset. When using file mode, SageMaker downloads the training data from S3 to a local directory in the training container. For this assignment, modify the code to stream the data into the training container instead of copying it.\n",
"\n",
"* <span style=\"padding:4px; background-color: #f2a68a; color: #000;\"><strong>Assignment 3.4</strong></span> TBD.\n",
"\n",
"* <span style=\"padding:4px; background-color: #f2a68a; color: #000;\"><strong>Assignment 3.5</strong></span> Modify the pipeline you created for the \"Pipeline of Digits\" project and add a Training Step. This Training Step should receive the train and validation splits from the Preprocessing step.\n"
]
},
{
@@ -13627,7 +13671,6 @@
"vcpuNum": 96
}
],
"instance_type": "ml.t3.medium",
"kernelspec": {
"display_name": "Python 3 (TensorFlow 2.6 Python 3.8 CPU Optimized)",
"language": "python",
1 change: 0 additions & 1 deletion penguins/penguins-setup.ipynb
@@ -1394,7 +1394,6 @@
"vcpuNum": 96
}
],
"instance_type": "ml.t3.medium",
"kernelspec": {
"display_name": "Python 3 (TensorFlow 2.6 Python 3.8 CPU Optimized)",
"language": "python",
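
Assignment 3.1 in the cohort notebook asks for the training script to accept the learning rate from outside instead of hard-coding it. A minimal sketch of one way to do this with `argparse` follows. The argument names (`--epochs`, `--learning_rate`) and their default values are assumptions made for illustration; the actual training script in the repository may use different names and defaults.

```python
import argparse


def parse_args(argv=None):
    """Parse training hyperparameters from command-line arguments.

    The argument names and defaults below are hypothetical; adjust them
    to match the real training script.
    """
    parser = argparse.ArgumentParser()
    parser.add_argument("--epochs", type=int, default=50)
    parser.add_argument("--learning_rate", type=float, default=0.01)

    # parse_known_args ignores any additional arguments that the
    # environment running the script may pass along.
    args, _ = parser.parse_known_args(argv)
    return args


# Example: simulate the arguments the script would receive.
args = parse_args(["--epochs", "10", "--learning_rate", "0.001"])
print(f"epochs={args.epochs}, learning_rate={args.learning_rate}")
```

Assuming the Estimator runs the script in script mode, the entries of its `hyperparameters` dictionary reach the script as command-line arguments, so adding a `learning_rate` entry there would be enough to control the value from the notebook.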
