Solve Wordle puzzles with regular expressions
Preparing the Dictionary
Change to your home directory and surf to the following GitHub page to open the file: https://github.com/dwyl/english-words/blob/master/words_alpha.txt. Press Download, and you will see the start of a list of words. Now right-click and select Save Page As to save the file in your home directory. If you're allergic to the GUI, you can use wget instead.
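For the wget route, a minimal sketch follows. The raw-file URL is an assumption derived from the address of the GitHub page above (the article does not give it), and fetch_wordlist is a hypothetical helper name; check the URL in a browser before relying on it.

```shell
# ASSUMPTION: raw-file URL derived from the GitHub page address; verify before use.
url="https://raw.githubusercontent.com/dwyl/english-words/master/words_alpha.txt"

# Hypothetical helper: download into the current directory,
# unless a non-empty copy of the file is already there.
fetch_wordlist() {
    [ -s words_alpha.txt ] || wget -q -O words_alpha.txt "$url"
}
```

Calling fetch_wordlist from your home directory should leave words_alpha.txt there, ready for the next step.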
To make the work a little easier, you should convert the list to uppercase (all Wordle entries are uppercase) with tr (translate or transliterate) and store the results in a file named wordle-caps.txt:
$ tr '[:lower:]' '[:upper:]' < words_alpha.txt > wordle-caps.txt
Your new dictionary file named wordle-caps.txt should have just over 370,000 words. Use wc -l (wc is short for word count; the -l option counts lines instead) to count the lines in the file:
$ wc -l wordle-caps.txt
370102 wordle-caps.txt
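To see what these two commands do without touching the full dictionary, the following sketch runs them on a made-up three-word sample. The sample words and variable names are purely illustrative.

```shell
# Hypothetical three-word sample standing in for words_alpha.txt.
sample=$(printf '%s\n' apple zebra wordle)

# Same conversion as above: map every lowercase letter to its uppercase form.
caps=$(printf '%s\n' "$sample" | tr '[:lower:]' '[:upper:]')

# Reading from a pipe makes wc print the bare count without a file name;
# $((...)) strips the padding that some wc implementations add.
count=$(printf '%s\n' "$caps" | wc -l)

printf '%s\n' "$caps"
echo "lines: $((count))"
```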
For Wordle, you only need the words with exactly five characters from this list. Again, you need the help of grep. Because all of the words are already in uppercase, you only need to output the five-letter words to a text file. The following grep command simply stores the five-letter words in a file named wordle-complete.txt:
$ grep -o -w "\w\{5\}" wordle-caps.txt > wordle-complete.txt
The -o option tells grep to print only the matching part of each line, while -w restricts the matches to whole words, so a five-letter stretch inside a longer word does not count. The regex string itself, \w\{5\}, matches five consecutive word characters. Now run another line count as follows:
$ wc -l wordle-complete.txt
15918 wordle-complete.txt
This leaves you with nearly 16,000 words, which is more than enough to solve the Wordle of the day. Let's find out.
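The effect of combining -o and -w in the five-letter filter is easy to check on a small, made-up sample (the words below are illustrative and not taken from the dictionary file):

```shell
# Hypothetical sample list; the real input is wordle-caps.txt.
sample=$(printf '%s\n' CAT HOUSE HOUSES TABLE A ZEBRA)

# \w\{5\} matches five word characters; -w rejects matches that are only
# part of a longer word (HOUSES), and -o prints just the matched text.
five=$(printf '%s\n' "$sample" | grep -o -w '\w\{5\}')

printf '%s\n' "$five"
```

Only HOUSE, TABLE, and ZEBRA survive: the shorter words never match, and the six-letter HOUSES is dropped by -w.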
Grep the Wordle
You only have to do the preliminary work once, so keep the wordle-complete.txt file safe for later. To solve the Wordle shown in Figure 3, you need to start with a completely random word from your Wordle dictionary. Initially, the game grid shown in Figure 3 is empty. You can run shuf to pick five random five-letter words from the file (Listing 2). If you are not happy with the selection, simply repeat the command.
Listing 2
Game 1, Round 1
$ shuf -n 5 wordle-complete.txt
FANGA
FRASS
SIAFU
MOORY
HALDU
Wow! Listing 2 resulted in an amazing collection of weird and wonderful words. In our example, we went for the word MOORY. When we entered it in Wordle, all the letter fields were gray – so at first glance, this wasn't a good guess. But now we know that the word we are looking for does not contain any of the letters from MOORY. This knowledge is actually helpful in our search for the solution.
The first command from Listing 3 filters out all words from our word list that contain any of the characters M, O, R, or Y. The -v switch (--invert-match) tells grep to print only the lines that do not match the regex that follows. The command saves the results to the file wordle1, which "only" contains 5,362 words. From this list, you can output another five arbitrary words.
Listing 3
Example 1, Attempt 2
$ grep -v '[MOORY]' wordle-complete.txt > wordle1
$ wc -l wordle1
5362 wordle1
$ shuf -n 5 wordle1
TUDEL
DATED
CEILE
ENCUP
DEFET
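As a quick illustration of the inverted match from Listing 3, here is the same kind of filter on a made-up mini list. The sample words are illustrative; the bracket expression is written as [MORY] because listing the O twice, as in [MOORY], makes no difference.

```shell
# Hypothetical mini word list; the real input is wordle-complete.txt.
words=$(printf '%s\n' MOORY DATED NATAL ROBOT FATAL YACHT)

# A bracket expression matches any ONE of its characters, so -v drops
# every line that contains at least one of M, O, R, or Y.
left=$(printf '%s\n' "$words" | grep -v '[MORY]')

printf '%s\n' "$left"
```

DATED, NATAL, and FATAL remain; every other sample word contains at least one of the gray letters.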
From the selection offered, we liked DATED best – well, it was the only word we understood, so hey ho. I wonder if Wordle will agree with us. Transferred to Wordle, the A in the second position and the T in the third position both light up green, so a pretty good guess. We now know that the second letter in the solution we are looking for is an A and the third letter is a T. The D and the E in DATED are shown in gray, so the letters do not appear in the solution.
Armed with this information, we can now narrow down the word list even further. The grep command from line 1 of Listing 4 combines all the conditions into a single call. A circumflex (^) at the start of a bracket expression negates it, so [^ED] matches any single character except E or D. This works per position, whereas the -v switch inverts the match for the whole line. So the full regular expression [^ED][A][T][^ED][^ED] searches for a string of five letters: The first must not be E or D, the second must be an A, the third must be a T, and the last two, again, must not be E or D.
Listing 4
Example 1, Attempt 3
01 $ grep '[^ED][A][T][^ED][^ED]' wordle1 > wordle2
02 $ wc -l wordle2
03 55 wordle2
04 $ shuf -n 5 wordle2
05 HATCH
06 BATAN
07 PATTA
08 BATTS
09 WATAP
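The positional pattern from Listing 4 can be tried out on a made-up sample as follows. The sketch adds the -x option (match the whole line), which is not in the listing; in the article's word list every line is exactly five letters long, so the unanchored pattern gives the same result there. Single-letter brackets such as [A] are written simply as A.

```shell
# Hypothetical mini list; the real input is the wordle1 file.
words=$(printf '%s\n' HATCH WATAP DATED EATEN LATEX BATTS)

# Position by position: not E/D, then A, then T, then twice not E/D.
# -x requires the pattern to cover the entire line.
left=$(printf '%s\n' "$words" | grep -x '[^ED]AT[^ED][^ED]')

printf '%s\n' "$left"
```

HATCH, WATAP, and BATTS pass; DATED and EATEN fail on the first letter, LATEX on the fourth.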
Our wordle2 file now contains only 55 potential solutions. From this, we again output five random words (Listing 4, line 4). The dictionary defines a watap as a stringy thread made from the roots of various coniferous trees and used by Native Americans, so let's go with it. Again, Wordle isn't entirely happy with our guess. But we have a matching trio of ATA in the middle of our word, while the W and the P are gray, which results in more fodder for grep:
$ grep '[^WP][A][T][A][^WP]' wordle2 > wordle3
Another call to wc -l tells us that wordle3 only contains 10 words, so let's just cat the file and see what we get:
$ cat wordle3
BATAK
BATAN
CATAN
FATAL
KATAT
LATAH
LATAX
NATAL
SATAI
SATAN
Time for some guesswork: FATAL looks like a good choice, but, fatally (ouch), Wordle doesn't see things our way. Not to worry, though: The L in fifth position is marked in green, and the only remaining candidate is NATAL. Lo and behold, we finished the game in five guesses (MOORY, DATED, WATAP, FATAL, and NATAL), but only due to a bit of bad luck at the end.
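The filtering steps of this game all follow the same pattern, so they could be wrapped in a small shell function. The following is only a sketch: wordle_filter is a hypothetical helper name, not something from the article, and it reads the word list on standard input.

```shell
# wordle_filter GRAYS PATTERN   (hypothetical helper, reads words on stdin)
#   GRAYS:   all letters shown in gray so far, e.g. MORYEDWP (must not be empty)
#   PATTERN: five characters, green letters in place and '.' for unknowns
wordle_filter() {
    grep -v "[$1]" | grep -x "$2"
}

# Hypothetical mini list standing in for wordle-complete.txt.
printf '%s\n' FATAL NATAL WATAP BATED SATAN |
    wordle_filter MORYEDWP '.ATA.'
```

With the real list, wordle_filter MORYEDWPF '.ATAL' < wordle-complete.txt would correspond to the state after the FATAL guess. One caveat: If a guess contains the same letter twice, Wordle can show one copy in green and the other in gray; such a letter must not be added to GRAYS.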
New Day, New Game
Using the same logic, we can tackle the next game (Figure 4). The hard work has already been done (i.e., we have already retrieved a dictionary and created an uppercase list of five-letter words). Again, we need to start with an arbitrary word, extracting it from the complete Wordle dictionary:
$ shuf -n 5 wordle-complete.txt
CRAPS
DAMON
TAREQ
GEYAN
CLARO
This time we went for CLARO. Not a bad start: It looks like the C is in the right place already. The A can occur in the second, fourth, or fifth position, and the O can occur in the second, third, or fourth position. L and R do not occur at any position in the target word. The regex for this is [C][^LR][^LR][^LR][^LR], but we also need to pipe the output through two further greps: After all, the word needs to contain an A and an O, too (Listing 5).
Listing 5
Game 2, Round 2
$ grep -P '[C][^LR][^LR][^LR][^LR]' wordle-complete.txt | grep A | grep O > wordle1
$ wc -l wordle1
71 wordle1
$ shuf -n 5 wordle1
COCOA
CANOE
CHOCA
COMMA
COTTA
Now, I don't drink cocoa and prefer boats that are bigger than canoes, and I'm pretty sure that CHOCA isn't actually a word, so I'll go for the next word on the list, COMMA – after all, I probably type hundreds of them a day. Success! We solved the Wordle in only two guesses.
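The regex in Listing 5 does not use the position information from the yellow letters: The A cannot be in third place, and the O cannot be in fifth. One possible tightening, shown here on a made-up sample list, adds those letters to the respective bracket expressions; on the real list, this may leave fewer candidates than the 71 in Listing 5. The -x option (match the whole line) is an addition that is not in the listing.

```shell
# Hypothetical mini list; the real input is wordle-complete.txt.
words=$(printf '%s\n' COCOA CHAOS CAMEO COMMA CLOAK CANOE)

# C is green in first place; L and R are gray everywhere;
# A is excluded from position 3 and O from position 5 (yellow there).
# The two extra greps make sure A and O occur somewhere in the word.
left=$(printf '%s\n' "$words" |
    grep -x 'C[^LR][^LRA][^LR][^LRO]' | grep A | grep O)

printf '%s\n' "$left"
```

COCOA, COMMA, and CANOE remain; CHAOS has its A in third place, CAMEO its O in fifth, and CLOAK contains a gray L.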