powerquery-docs/cluster-values.md
Lines changed: 1 addition & 1 deletion
@@ -25,7 +25,7 @@ In this example, the outcome you're looking for is a table with a new column tha

To cluster values, first select the **Person** column, go to the **Add column** tab in the ribbon, and then select the **Cluster values** option.

- :::image type="content" source="media/cluster-values/cluster-column-icon.png" alt-text="Screenshot of the cluster values icon inside the Add column tab in the Power Query online ribbon.":::
+ :::image type="content" source="media/cluster-values/cluster-column-icon.png" alt-text="Screenshot of the cluster values icon inside the Add column tab in the Power Query online ribbon." lightbox="media/cluster-values/cluster-column-icon.png":::

In the **Cluster values** dialog box, confirm the column that you want to use to create the clusters from, and enter the new name of the column. For this case, name this new column **Cluster**.
powerquery-docs/fuzzy-matching.md
Lines changed: 4 additions & 4 deletions
@@ -56,7 +56,7 @@ To determine what's causing this clustering, double-click **Clustered values** i

Enabling the **Show similarity scores** option creates a new column in your table. This column shows you the exact similarity score between the defined cluster and the original value.

- :::image type="content" source="media/fuzzy-matching/values-with-show-similarity-score.png" alt-text="Screenshot of the table with a new similarity score column named Fruit_Cluster_Similarity.":::
+ :::image type="content" source="media/fuzzy-matching/values-with-show-similarity-score.png" alt-text="Screenshot of the table with a new similarity score column named Fruit_Cluster_Similarity." lightbox="media/fuzzy-matching/values-with-show-similarity-score.png":::

Upon closer inspection, Power Query couldn't find any other values within the similarity threshold for the text strings `Blue berries are simply the best`, `Strawberries = <3`, `fav fruit is bananas`, and `My favorite fruit, by far, is Apples. I simply love them!`.

@@ -66,14 +66,14 @@ Go back to the **Cluster values** dialog box one more time by double-clicking **

This change gets you closer to the result that you're looking for, except for the text string `My favorite fruit, by far, is Apples. I simply love them!`. When you changed the **Similarity threshold** value from **0.8** to **0.6**, Power Query was able to use the values with a similarity score from 0.6 all the way up to 1.

- :::image type="content" source="media/fuzzy-matching/values-with-show-similarity-score-60.png" alt-text="Screenshot of the table after defining the similarity threshold at 0.6 with new values assigned in the Cluster column.":::
+ :::image type="content" source="media/fuzzy-matching/values-with-show-similarity-score-60.png" alt-text="Screenshot of the table after defining the similarity threshold at 0.6 with new values assigned in the Cluster column." lightbox="media/fuzzy-matching/values-with-show-similarity-score-60.png":::

> [!NOTE]
> Power Query always uses the value closest to the threshold to define the clusters. The threshold defines the lower limit of the similarity score that's acceptable to assign the value to a cluster.

You can try again by changing the **Similarity threshold** from 0.6 to a lower number until you get the results that you're looking for. In this case, change the **Similarity threshold** to **0.5**. This change yields the exact result that you're expecting with the text string `My favorite fruit, by far, is Apples. I simply love them!` now assigned to the cluster `Apples`.

- :::image type="content" source="media/fuzzy-matching/values-with-show-similarity-score-50.png" alt-text="Screenshot of the table with all the correct values in the Cluster column.":::
+ :::image type="content" source="media/fuzzy-matching/values-with-show-similarity-score-50.png" alt-text="Screenshot of the table with all the correct values in the Cluster column." lightbox="media/fuzzy-matching/values-with-show-similarity-score-50.png":::

> [!NOTE]
> Currently, only the [Cluster values](cluster-values.md) feature in Power Query Online provides a new column with the similarity score.
@@ -84,9 +84,9 @@ The transformation table helps you map values from your column to new values bef

Some examples of how the transformation table can be used:

+ * [Transformation table in cluster values](cluster-values.md#using-the-fuzzy-cluster-options)
* [Transformation table in fuzzy merge queries](merge-queries-fuzzy-match.md#transformation-table)
* [Transformation table in group by](group-by.md#fuzzy-grouping)
- * [Transformation table in cluster values](cluster-values.md#using-the-fuzzy-cluster-options)

> [!IMPORTANT]
> When the transformation table is used, the maximum similarity score for the values from the transformation table is 0.95. This deliberate penalty of 0.05 signals that the original value from the column isn't equal to the values that it was compared to, because a transformation occurred.
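
The two rules stated in the notes in this file (the threshold is a lower limit, and matches that come through the transformation table score at most 0.95) can be illustrated with a minimal sketch. This is not Power Query's implementation: the function names and the raw score of 0.55 are assumptions made up for the example, chosen only so that the long `Apples` sentence fails at a threshold of 0.6 and passes at 0.5, as the text describes.

```python
# Illustrative sketch only: these helpers are hypothetical and do not
# reproduce how Power Query computes similarity scores.

MAX_SCORE_WITH_TRANSFORMATION = 0.95  # cap stated in the IMPORTANT note


def effective_score(raw_score: float, used_transformation_table: bool) -> float:
    """Return the score after applying the transformation-table cap."""
    if used_transformation_table:
        return min(raw_score, MAX_SCORE_WITH_TRANSFORMATION)
    return raw_score


def joins_cluster(score: float, threshold: float) -> bool:
    """The threshold is the lowest score accepted for cluster membership."""
    return score >= threshold


# Assumed score for the long "Apples" sentence. The real value is not
# given in the text, only that it fails at 0.6 and passes at 0.5.
apples_sentence_score = 0.55

at_06 = joins_cluster(apples_sentence_score, 0.6)  # threshold too high
at_05 = joins_cluster(apples_sentence_score, 0.5)  # threshold low enough
mapped = effective_score(1.0, used_transformation_table=True)
```

Under these assumptions, lowering the threshold is what admits the value to the cluster, and a value matched through the transformation table can never report a perfect score of 1.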
powerquery-docs/merge-queries-fuzzy-match.md
Lines changed: 2 additions & 0 deletions
@@ -101,6 +101,8 @@ For this example, the **Similarity score** serves only as additional information
Screenshot of the fuzzy merge survey output table with the Question column containing the column distribution graph showing nine distinct answers with all answers unique, and the answers to the survey with all the typos, plural or singular, and case problems. Also contains the Fruit column with the column distribution graph showing four distinct answers with one unique answer and lists all of the fruits properly spelled, singular, and proper case.
:::image-end:::

+ For more information about how transformation tables work, go to [Transformation table precepts](cluster-values.md#transformation-table-precepts).