@@ -220,8 +220,9 @@ def shorten_param(param_name):
 cv_results
 
 # %% [markdown]
-# With only 2 parameters, we might want to visualize the grid-search as a
-# heatmap. We need to transform our `cv_results` into a dataframe where:
+# Given that we are tuning only 2 parameters, we can visualize the results as a
+# heatmap. To do so, we first need to reshape the `cv_results` into a dataframe
+# where:
 #
 # - the rows correspond to the learning-rate values;
 # - the columns correspond to the maximum number of leaf nodes;
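The reshaping step itself is not shown in this hunk; a minimal sketch of what it might look like, using a small fabricated `cv_results` dataframe in place of the real grid-search output (the `param_*` and `mean_test_score` column names match what scikit-learn stores in `cv_results_`, but the values here are made up):

```python
import pandas as pd

# Fabricated stand-in for the notebook's `cv_results`; in the lesson it comes
# from a fitted GridSearchCV, here we hard-code a few rows for illustration.
cv_results = pd.DataFrame({
    "param_learning_rate": [0.01, 0.01, 0.1, 0.1],
    "param_max_leaf_nodes": [3, 10, 3, 10],
    "mean_test_score": [0.80, 0.82, 0.85, 0.83],
})

# Reshape into a matrix: one row per learning-rate value, one column per
# maximum number of leaf nodes, cells holding the mean test score.
pivoted_cv_results = cv_results.pivot_table(
    values="mean_test_score",
    index="param_learning_rate",
    columns="param_max_leaf_nodes",
)
print(pivoted_cv_results)
```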
@@ -237,7 +238,8 @@ def shorten_param(param_name):
 pivoted_cv_results
 
 # %% [markdown]
-# We can use a heatmap representation to show the above dataframe visually.
+# Now that we have the data in the right format, we can create the heatmap as
+# follows:
 
 # %%
 import seaborn as sns
@@ -253,6 +255,14 @@ def shorten_param(param_name):
 ax.invert_yaxis()
 
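The heatmap call itself is elided from this hunk; a self-contained sketch of how such a plot might be built, with fabricated scores and hypothetical styling choices (`cmap`, the color range, the axis names), rendered off-screen:

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen; no display needed
import pandas as pd
import seaborn as sns

# Fabricated pivoted results: rows are learning rates, columns are the
# maximum number of leaf nodes, cells are mean test scores.
pivoted_cv_results = pd.DataFrame(
    [[0.80, 0.82], [0.85, 0.83]],
    index=pd.Index([0.01, 0.1], name="learning_rate"),
    columns=pd.Index([3, 10], name="max_leaf_nodes"),
)

ax = sns.heatmap(
    pivoted_cv_results,
    annot=True,            # write the score inside each cell
    cmap="YlGnBu",         # hypothetical colormap choice
    vmin=0.7, vmax=0.9,    # hypothetical color range
)
# Flip the y-axis so the smallest learning rate sits at the bottom,
# as in the notebook's `ax.invert_yaxis()` call.
ax.invert_yaxis()
```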
 # %% [markdown]
+# The heatmap above shows the mean test accuracy (i.e., the average over
+# cross-validation splits) for each combination of hyperparameters, where
+# darker colors indicate better performance. Notice, however, that color
+# encodes only the mean test score and carries no information about the
+# standard deviation across splits, which makes it difficult to say whether
+# the differences between combinations correspond to a significantly better
+# model or not.
+#
 # The table above highlights the following:
 #
 # * for too high values of `learning_rate`, the generalization performance of
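The point about the standard deviation can be made concrete: scikit-learn's `cv_results_` also stores a `std_test_score` column, and one way to use it is to flag all combinations whose mean score falls within one standard deviation of the best. A minimal sketch with fabricated results (the column names match `cv_results_`, the values and the one-std rule of thumb are illustrative assumptions):

```python
import pandas as pd

# Fabricated grid-search results; `mean_test_score` and `std_test_score`
# are the column names scikit-learn uses in `cv_results_`.
cv_results = pd.DataFrame({
    "param_learning_rate": [0.01, 0.1, 1.0],
    "param_max_leaf_nodes": [10, 10, 10],
    "mean_test_score": [0.84, 0.85, 0.78],
    "std_test_score": [0.02, 0.02, 0.05],
})

# Combinations whose mean score lies within one standard deviation of the
# best mean score are hard to tell apart from the best one.
best = cv_results.loc[cv_results["mean_test_score"].idxmax()]
overlapping = cv_results[
    cv_results["mean_test_score"]
    >= best["mean_test_score"] - best["std_test_score"]
]
print(overlapping)
```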