Track down race conditions with Go

Programming Snapshot – Racing Goroutines

© Lead Image © alphaspirit, 123RF.com

Article from Issue 251/2021
Author(s): Mike Schilli

If program parts running in parallel keep interfering with each other, you may have a race condition. Mike Schilli shows how to instruct the Go compiler to detect these conditions and how to avoid them in the first place.

If programmers are not careful, program parts that are running in parallel will constantly get in each other's way, whether as processes, threads, or goroutines. If you leave the order in which system components read or modify data to chance, you are adding time bombs to your code. They will blow up sooner or later, leaving you with runtime errors that are difficult to troubleshoot. But how do you avoid them?

The common assumption that components will run in the same order that a program calls them is a fallacy – one easily refuted with an example such as in Listing 1. But coincidence can also be a factor. It is quite possible for something to work once but then crash after a small, and often unrelated, change to the code. The load on the system you are using can also play a role: Something may work flawlessly in slack times but fall apart unexpectedly under a heavy load.

Listing 1

orderfail.go

01 package main
02 import (
03   "fmt"
04 )
05
06 func main() {
07   done := make(chan bool)
08   n := 10
09
10   for i := 0; i < n; i++ {
11     go func(id int) {
12       fmt.Printf("goroutine %d\n", id)
13       done <- true
14     }(i)
15   }
16
17   for i := 0; i < n; i++ {
18     <-done
19   }
20 }

Listing 1 [1] and the output in the upper part of Figure 1 nicely illustrate that unsynchronized goroutines do not run in the order in which they are defined, even if the program starts them one after the other. Although the for loop starts goroutine 0 first, followed by 1, then 2, and so on, as defined by the index numbers in i, the compiled program's output makes it clear that chaos reigns: The goroutines write their messages to the output in a seemingly random order.

Figure 1: Unsynchronized goroutines run in an idiosyncratic order.

Each of the 10 go func() calls in the for loop passes the current loop index as a parameter to its goroutine, by the book, so that the goroutines do not all share the same variable. To keep the program from terminating immediately after the for loop ends, before the goroutines have completed their work, each goroutine sends a message to the done channel at the end of its working life. The final for loop starting in line 17 collects these messages and does not terminate until the last goroutine has said goodbye.

One by One

But if you really want goroutine 0 to start first, then goroutine 1, and so on, you need to use a synchronization mechanism, such as channels or mutex constructs, to make sure that the Go runtime maintains the desired order, defying the natural chaos.

Listing 2 demonstrates this with a slice of 10 channels. Each goroutine blocks shortly after it starts and waits until a message arrives on the channel assigned to it. The arrival unblocks the read from the starters slice in line 17, and the goroutine moves on to printing its "Running" message. At first, none of the channels holds a message, but line 27, after the for loop, starts a chain of events by writing a value to the first channel.

Listing 2

orderok.go

01 package main
02 import (
03   "fmt"
04 )
05
06 func main() {
07   done := make(chan bool)
08   n := 10
09
10   starters := make([](chan bool), n)
11   for i := 0; i < n; i++ {
12     starters[i] = make(chan bool)
13   }
14
15   for i := 0; i < n; i++ {
16     go func(id int) {
17       <-starters[id]
18       fmt.Printf("Running %d\n", id)
19       if id < n-1 {
20         starters[id+1] <- true
21       }
22       // [... DO WORK ...]
23       done <- true
24     }(i)
25   }
26
27   starters[0] <- true
28
29   for i := 0; i < n; i++ {
30     <-done
31   }
32 }

This releases the goroutine with the id of 0, because the block in its read statement in line 17 is now lifted. The routine then outputs its message and, to keep things ticking along, writes to the channel with the index id+1 (i.e., 1). This in turn triggers goroutine 1, which in turn triggers goroutine 2. This merry dance continues in a controlled manner until goroutine 9, which has no successor to trigger (line 19), has run; once all 10 goroutines have reported to the done channel, the program ends.

This approach naturally reduces the concurrency of the goroutines, which no longer start quasi-simultaneously but wait for each other. However, each goroutine waits only until its predecessor has triggered it through the channel. What happens afterwards within the individual goroutines (marked in line 22 with the placeholder DO WORK) is again a quasi-simultaneous affair.

There Can Only Be One Winner

The disastrous consequences that race conditions can cause in an application are illustrated by an airline's booking program in Listing 3. In line 13, a goroutine checks the variable seats, which two different goroutines share, and finds that one seat is still available on the plane. It then outputs a success message to the user and sets the number of remaining seats to zero.

Listing 3

airline.go

01 package main
02 import (
03   "fmt"
04   "time"
05 )
06
07 func main() {
08   seats := 1
09
10   for i := 0; i < 2; i++ {
11     go func(id int) {
12       time.Sleep(100 * time.Millisecond)
13       if seats > 0 {
14         fmt.Printf("%d booked!\n", id)
15         seats = 0
16       } else {
17         fmt.Printf("%d missed out.\n", id)
18       }
19     }(i)
20   }
21
22   time.Sleep(1 * time.Second)
23   fmt.Println("")
24 }

However, there are two parallel goroutines fighting over the booking in the for loop starting in line 10. While one rejoices and prints the success message, the second goroutine also tests the variable seats, which is still set to 1, and proceeds to book the seat as well. The result is an overbooked plane and angry passengers.

The output at the top of Figure 2 shows that Listing 3 does indeed allow repeated double-bookings – exacerbated by the length of the microsleep instruction at line 12, simulating the actual booking process. This is not what a customer, or an airline, wants.

Figure 2: Without synchronization, two goroutines happily book the same seat.

The root of the problem is obvious: Two concurrent program threads share the variable seats during the time that elapses between the check seats > 0 in line 13 and the variable being reset by seats = 0 in line 15. If the second goroutine is performing a check while the first is booking the seat, the second goroutine erroneously thinks it has a free seat because seats is still set to 1. A booking error is inevitable.

The problem can be solved by either performing the check and setting the variable in a single atomic statement or by declaring the program area containing both statements to be a critical section that locks out other goroutines as long as one goroutine is working in it.

Listing 4 shows a possible solution to the problem using a buffered booking channel with a depth of 1, as created by the make statement in line 9. Thanks to the buffer, one goroutine can write a value into the channel without it immediately blocking [2]. But if the next goroutine tries to send a value into the channel, it blocks until someone else has extracted the buffered value, and this happens at the end of the critical section in line 21.
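
The behavior of a channel with a buffer depth of 1 can be seen in isolation in the following sketch. The helper tryAcquire is hypothetical and not part of the listings; it wraps the send in a select statement with a default branch, so that the program reports a full buffer instead of blocking the way a plain send would.

```go
package main

import "fmt"

// tryAcquire is a hypothetical helper: it attempts to place a token in
// sem and reports whether that worked. Unlike a plain send, it never
// blocks, because select falls through to default when the buffer is full.
func tryAcquire(sem chan bool) bool {
	select {
	case sem <- true:
		return true
	default:
		return false
	}
}

func main() {
	sem := make(chan bool, 1) // buffer depth 1

	fmt.Println(tryAcquire(sem)) // buffer empty: the token fits
	fmt.Println(tryAcquire(sem)) // buffer full: a plain send would block here
	<-sem                        // take the token out again
	fmt.Println(tryAcquire(sem)) // buffer empty again: the token fits
}
```

The program prints true, false, and true. In a booking program, the goroutine that finds the buffer full is meant to wait its turn, which is why a plain blocking send is the right choice there.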

Listing 4

airline-ok.go

01 package main
02 import (
03   "fmt"
04   "time"
05 )
06
07 func main() {
08   seats := 1
09   booking := make(chan bool, 1)
10
11   for i := 0; i < 2; i++ {
12     go func(id int) {
13       time.Sleep(100 * time.Millisecond)
14       booking <- true
15       if seats > 0 {
16         fmt.Printf("%d booked!\n", id)
17         seats = 0
18       } else {
19         fmt.Printf("%d missed out.\n", id)
20       }
21       <-booking
22     }(i)
23   }
24
25   time.Sleep(1 * time.Second)
26   fmt.Println("")
27 }

With this safeguard in place, only one goroutine traverses the critical section at any given time, and it doesn't matter how long it takes to check or set the seats variable, because no one can interfere in the meantime. The lower part of Figure 2 then also shows that only one goroutine at a time makes the booking, while the other goroutine reports that there are no more seats available – to the disappointment of the passenger who wants to book. But that's how things have to be.

Reporting Speeders

During development, Go helps you detect race conditions if you compile the source code with the -race option (for example, go build -race or go run -race). If two goroutines then access a variable without synchronization and at least one of them writes to it, the Go runtime detects this at the moment it happens and outputs a corresponding error message (Figure 3). However, the detector only sees races in code that actually executes, so the program must reach the code path that triggers the problem during the test run. This makes it important for the test suite to cover the code as completely as possible.

Figure 3: The program, built with the -race option in place, detects race conditions at runtime.
