@@ -220,8 +220,9 @@ def shorten_param(param_name):
 cv_results
 
 # %% [markdown]
-# With only 2 parameters, we might want to visualize the grid-search as a
-# heatmap. We need to transform our `cv_results` into a dataframe where:
+# Given that we are tuning only 2 parameters, we can visualize the results as a
+# heatmap. To do so, we first need to reshape the `cv_results` into a dataframe
+# where:
 #
 # - the rows correspond to the learning-rate values;
 # - the columns correspond to the maximum number of leaf nodes;
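The reshaping step itself is not shown in this hunk; a minimal sketch of what it might look like, using a small fabricated `cv_results` dataframe in place of the real grid-search output (the `param_*` and `mean_test_score` column names match what scikit-learn stores in `cv_results_`, but the values here are made up):

```python
import pandas as pd

# Fabricated stand-in for the notebook's `cv_results`; in the lesson it comes
# from a fitted GridSearchCV, here we hard-code a few rows for illustration.
cv_results = pd.DataFrame({
    "param_learning_rate": [0.01, 0.01, 0.1, 0.1],
    "param_max_leaf_nodes": [3, 10, 3, 10],
    "mean_test_score": [0.80, 0.82, 0.85, 0.83],
})

# Reshape into a matrix: one row per learning-rate value, one column per
# maximum number of leaf nodes, cells holding the mean test score.
pivoted_cv_results = cv_results.pivot_table(
    values="mean_test_score",
    index="param_learning_rate",
    columns="param_max_leaf_nodes",
)
print(pivoted_cv_results)
```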
@@ -237,7 +238,8 @@ def shorten_param(param_name):
 pivoted_cv_results
 
 # %% [markdown]
-# We can use a heatmap representation to show the above dataframe visually.
+# Now that we have the data in the right format, we can create the heatmap as
+# follows:
 
 # %%
 import seaborn as sns
@@ -253,6 +255,14 @@ def shorten_param(param_name):
 ax.invert_yaxis()
 
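The heatmap call itself is elided from this hunk; a self-contained sketch of how such a plot might be built, with fabricated scores and hypothetical styling choices (`cmap`, the color range, the axis names), rendered off-screen:

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen; no display needed
import pandas as pd
import seaborn as sns

# Fabricated pivoted results: rows are learning rates, columns are the
# maximum number of leaf nodes, cells are mean test scores.
pivoted_cv_results = pd.DataFrame(
    [[0.80, 0.82], [0.85, 0.83]],
    index=pd.Index([0.01, 0.1], name="learning_rate"),
    columns=pd.Index([3, 10], name="max_leaf_nodes"),
)

ax = sns.heatmap(
    pivoted_cv_results,
    annot=True,            # write the score inside each cell
    cmap="YlGnBu",         # hypothetical colormap choice
    vmin=0.7, vmax=0.9,    # hypothetical color range
)
# Flip the y-axis so the smallest learning rate sits at the bottom,
# as in the notebook's `ax.invert_yaxis()` call.
ax.invert_yaxis()
```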
 # %% [markdown]
+# The heatmap above shows the mean test accuracy (i.e., the average over
+# cross-validation splits) for each combination of hyperparameters, where
+# darker colors indicate better performance. Notice, however, that color
+# encodes only the mean test score and carries no information about the
+# standard deviation across splits, which makes it difficult to say whether
+# the differences between combinations correspond to a significantly better
+# model or not.
+#
 # The table above highlights the following:
 #
 # * for too high values of `learning_rate`, the generalization performance of
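The point about the standard deviation can be made concrete: scikit-learn's `cv_results_` also stores a `std_test_score` column, and one way to use it is to flag all combinations whose mean score falls within one standard deviation of the best. A minimal sketch with fabricated results (the column names match `cv_results_`, the values and the one-std rule of thumb are illustrative assumptions):

```python
import pandas as pd

# Fabricated grid-search results; `mean_test_score` and `std_test_score`
# are the column names scikit-learn uses in `cv_results_`.
cv_results = pd.DataFrame({
    "param_learning_rate": [0.01, 0.1, 1.0],
    "param_max_leaf_nodes": [10, 10, 10],
    "mean_test_score": [0.84, 0.85, 0.78],
    "std_test_score": [0.02, 0.02, 0.05],
})

# Combinations whose mean score lies within one standard deviation of the
# best mean score are hard to tell apart from the best one.
best = cv_results.loc[cv_results["mean_test_score"].idxmax()]
overlapping = cv_results[
    cv_results["mean_test_score"]
    >= best["mean_test_score"] - best["std_test_score"]
]
print(overlapping)
```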