Wordle is the popular game where you try to guess a hidden word in 6 guesses.

After each guess you get a pattern consisting of black, yellow and green colours, one for each letter of the guess.

A guess of “PARSE” would return a hint with a pattern, like this:



| Word | Meaning |
| --- | --- |
| Hidden word | A word that could be the answer, but it is hidden from the player |
| Test word | A word that can be used as a guess |
| Pattern | The hint given by the game after making a guess |
| "Easy" mode | The original game, where the set of test words is not restricted |
| "Hard" mode | A variation where the set of test words gets smaller depending on the previous guesses and patterns. In this variation, you can only use guesses that match the patterns provided |


The regular wordle list from 15-Feb-22 has 2315 hidden words $H$, and 12972 test words $T$.

The regular wordle problem has been studied in great depth by Alex Selby in The best strategies for Wordle, part 2.

I focused on BIGHIDDEN mode, a harder version where there are 12972 hidden words and 12972 test words. In particular, I focused on wordle "easy" mode, which is ironically harder to compute than wordle hard mode, where the test words you can use are restricted by your previous guesses.

With BIGHIDDEN mode, it can be proven (by verifying the models) that there are at least 6460 unique starting test words that have a solution. Unless there are mistakes in the code, there should be exactly 6460 unique starting test words for this problem, because the result was computed exhaustively.

Finding a solution is quite difficult, but verifying that it is correct is simple. The complete 6460 word list is available here, and all the models are here.

Using `solve` from domsleee/wordle:

| Goal | Command |
| --- | --- |
| Find all solutions | `./bin/solve --guesses ext/wordle-combined.txt --answers ext/wordle-combined.txt` |
| Verify a model | `./bin/solve --guesses ext/wordle-combined.txt --answers ext/wordle-combined.txt --verify model.json` |

New strategies

1. EndGame Db

Refer to the EndGameAnalysis/ folder.

This is a cutoff that caches all powersets of sets of words $E$. Currently it uses 2276 sets of words that match on 4 letters + positions, e.g. the pattern `.arks`.


2. Lower bounds using a cache entry that is a subset

The main idea is that if you are querying for a lower bound for a set of hidden words $Q$ when you have known lower bounds for sets of hidden words $L = \{H_1, H_2, ..., H_n\}$, you can use:

$$l = \max_{H \in L}\{lb(H) \mid H \subseteq Q\}$$

This only makes sense for easy mode, since for hard mode you would also need to consider the test words.

In terms of implementation details, I tried an inverted index approach (SubsetCache.h), and also a trie approach.
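The inverted-index idea can be sketched as follows (a minimal sketch with hypothetical names, not the actual SubsetCache.h): since the query set $Q$ is small, iterate over its words, tally hits per cached set, and any cached set whose tally equals its size is a subset of $Q$ and contributes its lower bound.

```python
from collections import defaultdict


class SubsetLowerBoundCache:
    """Toy lower-bound cache keyed on sets of hidden words.

    For a query set Q, any cached set H_i with H_i being a subset of Q
    gives a valid lower bound; we take the maximum over all such H_i.
    (Hypothetical sketch, not the repo's implementation.)
    """

    def __init__(self):
        self.sets = []                   # list of (frozenset, lower_bound)
        self.index = defaultdict(list)   # word -> ids of cached sets containing it

    def add(self, hidden_set, lower_bound):
        set_id = len(self.sets)
        self.sets.append((frozenset(hidden_set), lower_bound))
        for word in hidden_set:
            self.index[word].append(set_id)

    def lower_bound(self, query):
        # Count, per cached set, how many of its words appear in the query.
        hits = defaultdict(int)
        for word in query:
            for set_id in self.index[word]:
                hits[set_id] += 1
        best = 0
        for set_id, count in hits.items():
            words, lb = self.sets[set_id]
            if count == len(words):      # cached set is a subset of the query
                best = max(best, lb)
        return best
```

Each query is linear in the number of (query word, cached set) pairs touched, which stays manageable while $|Q|$ is small.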

3. Most solved in 2 or 3 guesses

See this PR.

We know the following

  • 2 guesses ⇒ any 3 words can be solved (by exhaustion)
  • 3 guesses ⇒ any 5 words can be solved (by reasoning: a {1,1,1} split can add any 2 words)

normal: `bin/solve --guesses ext/wordle-guesses.txt --answers ext/wordle-answers.txt`

bighidden: `bin/solve --guesses ext/wordle-combined.txt --answers ext/wordle-combined.txt`

| --other-program | normal | bighidden |
| --- | --- | --- |
| any4in2 | ❌ (13 groups, eg batch,catch,hatch,patch) | ❌ (11423 groups) |
| any6in3 | ❌ (eg fills,jills,sills,vills,bills,zills) | |

4. Remove guesses that have a better guess

Refer to the RemoveGuessesBetterGuess/ folder.

Goal: reduce the size of the test word set $T$ by removing test words that have a better test word.

Let $\mathcal{P}(H, t)$ be the partitioning of the hidden word set $H$ using test word $t$, i.e.

$$\mathcal{P}(H,t)=\{P(H,t,x)\mid x\in X\}$$

where $P(H,t,x)$ is the set of words in $H$ which would give pattern $x$ when guessing the word $t$.

Define $t_1 \geq t_2$ ($t_1$ is at least as good as $t_2$) to hold if every partition of $t_1$ is a subset of a partition of $t_2$:

$$\forall_{p_1\in\mathcal{P}(H,t_1)}\exists_{p_2\in \mathcal{P}(H,t_2)}(p_1 \subseteq p_2)$$

$t_1 \geq t_2$ implies that $t_2$ can be deleted from $T$.
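The dominance check can be written down naively as follows (a sketch with my own names, not the repo's implementation; `feedback` is a standard Wordle scoring function):

```python
from collections import Counter, defaultdict


def feedback(guess, hidden):
    """Standard Wordle pattern: 'G' green, 'Y' yellow, 'B' black (handles repeats)."""
    pattern = ["B"] * len(guess)
    remaining = Counter()
    for i, (g, h) in enumerate(zip(guess, hidden)):
        if g == h:
            pattern[i] = "G"
        else:
            remaining[h] += 1
    for i, g in enumerate(guess):
        if pattern[i] == "B" and remaining[g] > 0:
            pattern[i] = "Y"
            remaining[g] -= 1
    return "".join(pattern)


def partition(H, t):
    """P(H, t): group the hidden words by the pattern that guess t produces."""
    parts = defaultdict(set)
    for h in H:
        parts[feedback(t, h)].add(h)
    return list(parts.values())


def at_least_as_good(H, t1, t2):
    """t1 >= t2: every part of P(H, t1) is a subset of some part of P(H, t2)."""
    parts2 = partition(H, t2)
    return all(any(p1 <= p2 for p2 in parts2) for p1 in partition(H, t1))
```

For example, against $H = \{batch, catch, hatch\}$ a guess sharing no letters with $H$ produces a single partition, so any other guess dominates it.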


Consider this definition of solvability, which returns $1$ iff there is a solution in $d$ guesses:

$$\mathcal{F}(H,d) = \begin{cases} |H|\leq d & d\leq1\\ \max_{t\in T}\prod_{p\in\mathcal{P}(H,t)}\mathcal{F}(p,d-1) & \text{otherwise} \end{cases}$$

We want to prove that $\mathcal{F}(H,d)$ will return an equivalent value if we remove values from $T$.
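For intuition, $\mathcal{F}$ can be transcribed directly into Python (an exponential, illustrative sketch for tiny instances only; `feedback` is a standard Wordle scorer and all names here are my own):

```python
from collections import Counter, defaultdict


def feedback(guess, hidden):
    """Standard Wordle pattern: 'G' green, 'Y' yellow, 'B' black (handles repeats)."""
    pattern = ["B"] * len(guess)
    remaining = Counter()
    for i, (g, h) in enumerate(zip(guess, hidden)):
        if g == h:
            pattern[i] = "G"
        else:
            remaining[h] += 1
    for i, g in enumerate(guess):
        if pattern[i] == "B" and remaining[g] > 0:
            pattern[i] = "Y"
            remaining[g] -= 1
    return "".join(pattern)


def solvable(H, d, T):
    """F(H, d): 1 iff every word in H can be guessed within d guesses."""
    if d <= 1:
        return int(len(H) <= d)
    for t in T:
        parts = defaultdict(list)
        for h in H:
            parts[feedback(t, h)].append(h)
        if len(parts) == 1 and "G" * len(t) not in parts:
            continue  # t gives no information on H
        # max over t of the product of F(p, d-1): succeed if any t works
        if all(solvable(p, d - 1, T) for p in parts.values()):
            return 1
    return 0
```

This reproduces the any4in2 failure from the table above: the group batch/catch/hatch/patch cannot be finished in 2 guesses.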

For every $t_1$ that is used as a better guess than $t_2$:

$$\prod_{p\in\mathcal{P}(H,t_1)}\mathcal{F}(p,d-1) \geq \prod_{p\in\mathcal{P}(H,t_2)}\mathcal{F}(p,d-1)$$

We can prove this by cases:

If any factor of the RHS is $0$, the RHS product is $0$, and the claim is clearly true because the codomain is $\{0,1\}$.

If all factors of the RHS are $1$, then all factors of the LHS are $1$, because every partition of the LHS is a subset of a partition on the RHS, and a subset of a set solvable in $d-1$ guesses is itself solvable in $d-1$ guesses.


For least possible guesses

The definition for least possible guesses, taken from The best strategies for Wordle:

$$f(H) = \begin{cases} |H| & |H|\leq1\\ |H|+\min_{t\in T}\sum_{s \ne GGGGG}f(P(H,t,s)) & \text{otherwise} \end{cases}$$

The same definition of "better guess" doesn't work because the sum excludes a partition ($s \ne GGGGG$). So for least guesses, we must add the additional check that $P(H,t_1,GGGGG)=P(H,t_2,GGGGG)$.
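The $f(H)$ definition can likewise be transcribed directly (exponential; tiny instances only; `feedback` is a standard Wordle scorer and the names are my own):

```python
from collections import Counter, defaultdict


def feedback(guess, hidden):
    """Standard Wordle pattern: 'G' green, 'Y' yellow, 'B' black (handles repeats)."""
    pattern = ["B"] * len(guess)
    remaining = Counter()
    for i, (g, h) in enumerate(zip(guess, hidden)):
        if g == h:
            pattern[i] = "G"
        else:
            remaining[h] += 1
    for i, g in enumerate(guess):
        if pattern[i] == "B" and remaining[g] > 0:
            pattern[i] = "Y"
            remaining[g] -= 1
    return "".join(pattern)


def least_total_guesses(H, T):
    """f(H): minimum total number of guesses, summed over all hidden words.

    Assumes at least one informative guess exists (e.g. T includes H).
    """
    if len(H) <= 1:
        return len(H)
    best = None
    all_green = "G" * len(H[0])
    for t in T:
        parts = defaultdict(list)
        for h in H:
            parts[feedback(t, h)].append(h)
        if len(parts) == 1 and all_green not in parts:
            continue  # t does not split H and is not in H: useless guess
        # sum over all patterns except GGGGG, per the definition of f
        total = sum(least_total_guesses(p, T)
                    for s, p in parts.items() if s != all_green)
        if best is None or total < best:
            best = total
    return len(H) + best
```

For $H = \{batch, catch\}$ this gives 3: one guess solves the first word, and the second then takes two guesses.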


This is nice; however, I could not find a fast enough way of doing this.

Using the partitioning approach, I could find an $O(T^2H)$ algorithm, but with a reasonable approximation there is an $O(T\log{T})$ approach. Typical values are $T = 4000$ and $H = 40$.

An approximation of this partitioning check is to reason about which partitions $p_2$ would be split, by considering how two words of $H$ are differentiated:

1 Non-letter rule

If a letter does not occur in any word of $H$, then any occurrence of that letter in a test word will not differentiate any two words of $H$.

2 Green letter rule

If a letter occurs in exactly the same positions in every word of $H$ (for every position of that letter), then any occurrence of that letter in a test word will not differentiate any two words of $H$.

3 exactly-n letter rule (yellow letter rule)

If there are exactly $n$ occurrences of $c$ in every word in $H$, then any occurrence of $c$ in a test word that is in a position other than the positions of $c$ in $H$ can be ignored.

  • If there are at most two $c$ in every word in $H$, and one of them is green, the same rule does not apply
    • reason: for guess $t$, 1Y gives no information when $t$ has $1c$ but gives information when $t$ has $2c$. Example: 1Y from $2c$ implies $h$ has $1c$

4 most-n letter rule

If there are at most $n$ occurrences of $c$ in every word in $H$, a word $t$ with $\geq n+m$ occurrences of $c$ can mark up to $m$ positions that don't occur in any position of $H$ - they are guaranteed to be useless $B$

5 least-n letter rule

If there are at least $n$ occurrences of $c$ in every word in $H$, a word $t$ with $\le n$ occurrences of $c$ can mark all letters that don't occur in any position of $H$ - they are guaranteed to be useless $Y$
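As a toy illustration of the non-letter rule (rule 1 only; helper names are my own, not the repo's): any letter absent from every word of $H$ scores black against every hidden word, so guesses that differ only in such letters induce the same partition, and all but one can be removed.

```python
import string


def dead_letters(H):
    """Letters occurring in no word of H (rule 1): they can never
    differentiate two words of H, since they score black everywhere."""
    return set(string.ascii_lowercase) - set("".join(H))


def canonical(t, dead):
    """Replace dead letters with a placeholder. Guesses with the same
    canonical form partition H identically, so only one need be kept."""
    return "".join("." if c in dead else c for c in t)
```

For $H = \{sales, tales\}$, the guesses "zeals" and "jeals" both canonicalise to ".eals" and are interchangeable.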