Solve Wordle puzzles with regular expressions

Preparing the Dictionary

Change to your home directory and surf to the following GitHub page to open the file: https://github.com/dwyl/english-words/blob/master/words_alpha.txt. Press Download, and you will see the start of a list of words. Now right click and select Save Page As to save the file in your home directory. If you're allergic to the GUI, you can use wget instead.

To make the work a little easier, you should convert the list to uppercase (all Wordle entries are uppercase) with tr (translate or transliterate) and store the results in a file named wordle-caps.txt:

$ tr '[:lower:]' '[:upper:]' < words_alpha.txt > wordle-caps.txt

Your new dictionary file named wordle-caps.txt should have just over 370,000 words. Use wc -l (short for word count) to count the lines in the file:

$ wc -l wordle-caps.txt
370102 wordle-caps.txt
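If you want to see what tr and wc -l do before running them on the full dictionary, the following sketch applies both to a tiny, made-up word list. The sample words and the temporary file are assumptions for illustration only and are not part of the real dictionary workflow.

```shell
# Hypothetical three-word sample list, one word per line.
list=$(mktemp)
printf '%s\n' apple wordle grep > "$list"

# Translate every lowercase letter to its uppercase counterpart.
upper=$(tr '[:lower:]' '[:upper:]' < "$list")
printf '%s\n' "$upper"

# Count the lines, which equals the number of words here.
count=$(printf '%s\n' "$upper" | wc -l)
echo "lines: $count"

rm -f "$list"
```

Because the list has one word per line, the line count reported by wc -l is also the word count.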

For Wordle, you only need the words with exactly five characters from this list. This is a job for grep. Because all of the words are already in uppercase, you only need to output the five-letter words to a text file. The following grep command stores the five-letter words in a file named wordle-complete.txt:

$ grep -o -w "\w\{5\}" wordle-caps.txt > wordle-complete.txt

The -o option tells grep to print only the matching part of each line, while -w restricts matches to whole words, so that longer words containing a five-letter run are skipped. The regex string itself, \w\{5\}, matches exactly five consecutive word characters. Now run another line count as follows:

$ wc -l wordle-complete.txt
15918 wordle-complete.txt
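The effect of combining -o and -w is easy to check on a small sample. The sketch below uses a handful of made-up entries in a temporary file (both are assumptions for illustration) and shows that only words of exactly five letters survive.

```shell
# Hypothetical sample list with words of different lengths.
sample=$(mktemp)
printf '%s\n' CAT HOUSE PLANETS MOORY AT NATAL > "$sample"

# -w accepts a match only if it forms a whole word, so PLANETS is skipped
# even though it contains five consecutive word characters.
five=$(grep -o -w '\w\{5\}' "$sample")
printf '%s\n' "$five"

rm -f "$sample"
```

On this sample, only HOUSE, MOORY, and NATAL are printed; without -w, grep would also print PLANE, the first five letters of PLANETS.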

This leaves you with nearly 16,000 words, which is more than enough to solve the Wordle of the day. Let's find out.

Grep the Wordle

You only have to do the preliminary work once, so keep the wordle-complete.txt file safe for later. To solve the Wordle shown in Figure 3, you need to start with a completely random word from your Wordle dictionary. Initially, the game grid shown in Figure 3 is empty. You can run shuf to pick five random words from the file (Listing 2). If you are not happy with the selection, simply repeat the command.

Listing 2

Game 1, Round 1

$ shuf -n 5 wordle-complete.txt
FANGA
FRASS
SIAFU
MOORY
HALDU

Wow! Listing 2 resulted in an amazing collection of weird and wonderful words. In our example, we went for the word MOORY. When we entered it in Wordle, all the letter fields were gray – so at first glance, this wasn't a good guess. But now we know that the word we are looking for does not contain any of the letters from MOORY. This knowledge is actually helpful in our search for the solution.

The first command from Listing 3 filters out all words from our word list that contain the characters M, O, R, or Y. The -v switch (--invert-match) tells grep to invert the match and print only the lines that do not match the regex that follows. The command saves the results to the file wordle1, which "only" contains 5,362 words. From this list, you can output another five arbitrary words.

Listing 3

Example 1, Attempt 2

$ grep -v '[MOORY]' wordle-complete.txt > wordle1
$ wc -l wordle1
5362 wordle1
$ shuf -n 5 wordle1
TUDEL
DATED
CEILE
ENCUP
DEFET
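To see the gray-letter filter in isolation, here is a minimal sketch that runs the same kind of inverted match against a few hand-picked words. The sample list and the temporary file are assumptions for illustration; the bracket expression lists each gray letter once, which behaves the same as [MOORY].

```shell
# Hypothetical candidate list.
words=$(mktemp)
printf '%s\n' MOORY DATED NATAL FATAL HATCH ROBIN > "$words"

# A bracket expression matches any one of its characters; -v keeps only
# the lines in which no such match exists.
left=$(grep -v '[MORY]' "$words")
printf '%s\n' "$left"

rm -f "$words"
```

MOORY and ROBIN drop out because they contain at least one gray letter; the other four words remain.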

From the selection offered, we liked DATED best – well, it was the only word we understood, so hey ho. I wonder if Wordle will agree with us. Transferred to Wordle, the A in the second position and the T in the third position both light up green, so a pretty good guess. We now know that the second letter in the solution we are looking for is an A and the third letter is a T. The D and the E in DATED are shown in gray, so the letters do not appear in the solution.

Armed with this information, we can now narrow down the word list even further. The grep command from line 1 of Listing 4 combines all the conditions into a single call. A circumflex (^) at the start of a bracket expression negates it, so [^ED] matches any single character except E or D; the effect is similar to the -v switch, but applies to one position only. So the full regular expression [^ED][A][T][^ED][^ED] searches for a string of five letters: The first must not be E or D, the second must be an A, the third must be a T, and the fourth and fifth must again not be E or D.

Listing 4

Example 1, Attempt 3

01 $ grep '[^ED][A][T][^ED][^ED]' wordle1 > wordle2
02 $ wc -l wordle2
03 55 wordle2
04 $ shuf -n 5 wordle2
05 HATCH
06 BATAN
07 PATTA
08 BATTS
09 WATAP

Our wordle2 file now contains only 55 potential solutions. From this, we again output five random words (line 4). The dictionary defines a watap as a thread made from the stringy roots of various coniferous trees and used by Native Americans, so let's go with it. Again, Wordle isn't entirely happy with our guess. But we have a matching trio of ATA in the middle of our word, which results in more fodder for grep:

$ grep '[^WP][A][T][A][^WP]' wordle2 > wordle3

Another call to wc -l tells us that wordle3 only contains 10 words, so let's just cat the file and see what we get:

$ cat wordle3
BATAK
BATAN
CATAN
FATAL
KATAT
LATAH
LATAX
NATAL
SATAI
SATAN

Time for some guesswork: FATAL looks like a good choice, but, fatally (ouch), Wordle doesn't see things our way. Not to worry, though: The L in fifth position is marked in green, and the only remaining candidate is NATAL. Lo and behold, we finished the game in four steps, but only due to a bit of bad luck at the end.
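The last elimination can also be left to grep instead of the naked eye. This sketch recreates the ten words from wordle3 in a temporary file (the file name is an assumption for illustration) and applies the feedback from FATAL: a gray F in first position and green letters everywhere else.

```shell
# The ten remaining candidates from wordle3, recreated for this sketch.
last=$(mktemp)
printf '%s\n' BATAK BATAN CATAN FATAL KATAT LATAH LATAX NATAL SATAI SATAN > "$last"

# First letter is anything but F; the rest is fixed as ATAL.
answer=$(grep '^[^F]ATAL$' "$last")
printf '%s\n' "$answer"

rm -f "$last"
```

The only line that survives is NATAL.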

New Day, New Game

Using the same logic, we can tackle the next game (Figure 4). The hard work has already been done (i.e., we have already retrieved a dictionary and created an uppercase word list). Again, we need to start with an arbitrary word, extracting it from the complete Wordle dictionary:

$ shuf -n 5 wordle-complete.txt
CRAPS
DAMON
TAREQ
GEYAN
CLARO

Figure 4: We solved the second example in just two steps with a bit of luck this time.

This time we went for CLARO. Not a bad start: It looks like the C is in the right place already. The A can occur in the second, fourth, or fifth position, and the O can occur in the second, third, or fourth position. L and R do not occur at any position in the target word. The regex for this is [C][^LR][^LR][^LR][^LR], but we also need to pipe the output through two further greps: After all, the word needs to contain an A and an O, too (Listing 5).

Listing 5

Game 2, Round 2

$ grep -P '[C][^LR][^LR][^LR][^LR]' wordle-complete.txt | grep A | grep O > wordle1
$ wc -l wordle1
71 wordle1
$ shuf -n 5 wordle1
COCOA
CANOE
CHOCA
COMMA
COTTA
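The regex in Listing 5 only excludes the gray letters; it does not yet use the fact that A cannot be in third position and O cannot be in fifth. As a possible refinement, and not part of the listing above, the sketch below adds A and O to the negated classes at those positions. The sample words and the temporary file are assumptions for illustration.

```shell
# Hypothetical candidate list.
pool=$(mktemp)
printf '%s\n' COCOA CANOE COMMA COATI CHAOS CAMEO CLOAK > "$pool"

# Green C in position 1; gray L and R nowhere; yellow A not in position 3;
# yellow O not in position 5. The two extra greps require A and O somewhere.
better=$(grep '^C[^LR][^LRA][^LR][^LRO]$' "$pool" | grep A | grep O)
printf '%s\n' "$better"

rm -f "$pool"
```

COATI and CHAOS are dropped because of the A in third position, CAMEO because of the O in fifth, and CLOAK because of the L, leaving COCOA, CANOE, and COMMA. On the real word list, this tighter pattern should shrink the candidate file further than the 71 words in Listing 5.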

Now, I don't drink cocoa and prefer boats that are bigger than canoes, and I'm pretty sure that CHOCA isn't actually a word, so I'll go for the next word on the list, COMMA – after all, I probably type hundreds of them a day. Success! We solved the Wordle in only two guesses.
