All Guess Words Groups Stats

Introduction

  • The file PatternGrpGuessesLarge.txt is a text file summarizing the word grouping performance of every word in the Wordle allowed guess word list when used as a guess from the starting state, where every possible Wordle solution can be the day's Wordle solution. The data is based on the Classic+ list of 3,189 possible Wordle solutions.

  • The data can help one evaluate the merit of a game's first word only. The rankings in PatternGrpGuessesLarge.txt no longer apply after the first guess, because the clues from that guess reduce the set of possible solutions.

  • The data is sorted first by the word groups' unbiased entropy (ent) value (descending), then by the number of word groups (qty) generated (descending), and then by the maximum group size (max) (ascending).

  • The list columns are:

    • guess - This is the guess word.
    • qty - The number of word groups into which the guess divides the possible Wordle solution words. The guess word matches or mismatches all the words in a given word group in exactly the same way. Each group is a possible outcome of the guess, so qty is the number of possible remaining-word outcomes that could result from the guess.
    • min - The number of words in the smallest word group.
    • max - The number of words in the largest word group. The largest word group often contains the words having nothing in common with the guess word, i.e. the guess results in all grey clues.
    • ave - The average size of all the word groups. This is simply the number of possible solutions divided by the number of groups.
    • ent - The word groups' unbiased entropy in bits. The NYT WordleBot refers to this as 'Information Bits'. 'Bits' means the entropy is computed with base-2 logarithms. Unbiased means every word in the word groups is considered equally likely to be the solution. In the context of Wordle groups, entropy is simply a measure of how evenly the possible solution words are distributed among the groups, or outcomes, that result from a guess. Larger entropy means more evenly distributed solution words.
    • exp - The expected size of the word group that remains after the guess.
    • p2 - The population variance of the word group sizes.
      • wrds/max - The ratio of the 3,189 solution words to the maximum group size, i.e. how many times larger the full solution list is than the largest group. This value may have no bearing on the guess's merit, but it does provide a means of comparison. (Newer Group Driller output no longer shows this statistic.)
  • The Group Driller Condensed verbose output creates the content shown in this list.

  • The concept of Wordle Groups is explained in an allegory here: Groups Allegory
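
The column definitions above can be illustrated with a minimal Python sketch. This is not Group Driller's code: the names `pattern` and `group_stats` are hypothetical, the five-word list is a toy stand-in for the Classic+ solutions, and the formulas for exp (sum of squared group sizes divided by the number of solutions) and p2 (variance of the group sizes around ave) are assumptions consistent with the descriptions, which the original data may compute differently.

```python
from collections import Counter
from math import log2


def pattern(guess: str, solution: str) -> str:
    """Clue pattern for a guess: G = green, Y = yellow, - = grey."""
    result = ["-"] * len(guess)
    # Solution letters not matched in place remain available for yellow clues.
    unmatched = Counter(s for g, s in zip(guess, solution) if g != s)
    for i, (g, s) in enumerate(zip(guess, solution)):
        if g == s:
            result[i] = "G"
        elif unmatched[g] > 0:
            result[i] = "Y"
            unmatched[g] -= 1
    return "".join(result)


def group_stats(guess: str, solutions: list[str]) -> dict:
    """Group the solutions by clue pattern and summarize the group sizes."""
    sizes = list(Counter(pattern(guess, s) for s in solutions).values())
    total = len(solutions)
    qty = len(sizes)
    ave = total / qty
    return {
        "guess": guess,
        "qty": qty,
        "min": min(sizes),
        "max": max(sizes),
        "ave": ave,
        # Unbiased entropy in bits: every solution is equally likely.
        "ent": sum((n / total) * log2(total / n) for n in sizes),
        # Assumed definition: a solution lands in a group of size n with probability n / total.
        "exp": sum(n * n for n in sizes) / total,
        # Assumed definition: population variance of the group sizes.
        "p2": sum((n - ave) ** 2 for n in sizes) / qty,
    }


# Toy data only, not the Classic+ list.
solutions = ["crane", "crate", "trace", "slate", "plate"]
rows = sorted(
    (group_stats(g, solutions) for g in solutions),
    key=lambda r: (-r["ent"], -r["qty"], r["max"]),  # file's sort order
)
```

Sorting on the key `(-ent, -qty, max)` reproduces the ordering described in the Introduction: entropy descending, then group quantity descending, then maximum group size ascending.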

Observations

  • When sorted in specific ways and charted, the data produces interesting striped graphs from which one can draw observations. For example, in the following chart the data is sorted by group quantity first (descending) and then by maximum group size (ascending). The stripes correspond to guesses that produce the same number of word groups, or outcomes. If "better" is judged as producing more groups, which tends to leave fewer remaining words, one can see by looking up a stripe's words in the data that the "better end" of a stripe tends to hold words containing the most commonly used letters, "AE,NOSTRIL". This observation supports the letter-frequency strategy of puzzle play.
  • One can also observe that both the entropy values and the expected outcome sizes chart as fuzzy stripes. Those characteristics are better predictors than group quantity.
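
The chart ordering described above can be sketched as follows. The helper name `stripes` and the row values are hypothetical placeholders, not entries from PatternGrpGuessesLarge.txt; the sketch only assumes each row carries the qty and max columns.

```python
from itertools import groupby


def stripes(rows: list[dict]) -> list[tuple[int, list[dict]]]:
    """Sort by qty (descending) then max (ascending) and split into stripes of equal qty."""
    ordered = sorted(rows, key=lambda r: (-r["qty"], r["max"]))
    return [
        (qty, list(members))
        for qty, members in groupby(ordered, key=lambda r: r["qty"])
    ]


# Placeholder rows for illustration only.
rows = [
    {"guess": "w1", "qty": 3, "max": 9},
    {"guess": "w2", "qty": 5, "max": 7},
    {"guess": "w3", "qty": 3, "max": 4},
    {"guess": "w4", "qty": 5, "max": 6},
]

for qty, members in stripes(rows):
    # The first member of each stripe has the smallest max, i.e. the "better end".
    print(qty, [m["guess"] for m in members])
```

Each tuple returned is one stripe: the guesses sharing a group quantity, ordered so that the smallest maximum group size comes first.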

This Data Graphed

'All Guess Words Groups Graphed.png Image'

Other Data

  • PatternGrpGuessesLarge.txt - Unbiased groups' data generated by applying the Large vocabulary, i.e. all the allowed Wordle words, as first guesses to the Classic+ vocabulary. This list shows technically better performing first guesses because it includes guesses that are not possible solutions but divide the solutions more efficiently than the solution words themselves do. This data is sorted by the groups' ent and includes the groups' expected size exp and the groups' population variance p2.

  • ClassicPatternGrpGuessesLarge.txt - Unbiased groups' data generated by applying the Large vocabulary, i.e. all the allowed Wordle words, as first guesses to the Classic vocabulary. This list shows the technically best performing first guesses when the Classic vocabulary supplies the possible solutions. The Classic vocabulary does not include the words with which the NYT WordleBot padded its Possible Solutions list (i.e. the Classic+ vocabulary), many of which arguably would never be the solution. As a result, its groups' data to some degree resembles the biased NYT WordleBot's versions of the groups' data.