Distributed Compiling with distcc

Distributed software compilation for the Raspberry Pi

By Paul Brown

Distributed compiling with distcc offloads the CPU-intensive compilation tasks from the Raspberry Pi to other computers, saving you days of time and frustration.

The Pi is wonderful and all, but it is not really ideal for compiling. Try to build anything more complex than a "Hello World" program, and you will lock it up for hours. However, Raspbian runs compiled programs, so how did they get there? Interestingly, the several ways to compile programs for the Raspberry Pi all involve removing the Rasp Pi's hardware from the equation.

One option is to compile using a tool chain, which is exactly what it says on the box: a series of tools you chain together on a regular non-Rasp Pi computer and through which you pipe the source code of a program. Out the other end pops the compiled version of the program ready for your Pi. The Raspberry Pi Foundation distributes an official tool chain [1].

I can think of two problems with tool chains. First, you need to replicate the Rasp Pi environment by copying over directories to the machine that is going to do the compiling. This, in itself, is not too difficult, but if you come across an unexpected, unmet dependency while compiling, you then have to go back to your Pi, install the packages you need, copy over the directories again, and restart the compile … and you have to do this every time the compile borks.

Second, you run into the problem of not actually being able to test your program until you copy it onto the Pi and try it out. If you skip a file by accident or fudge the install, you can spend hours or days trying to figure out what you missed.

Your second option is to use a virtual machine. You can't get Raspbian running on VirtualBox (VirtualBox only does x86, not ARM, architectures), but it does work on Qemu [2]. The idea is you decompress a Raspbian image file, mount it, and run it as a virtual SD card on Qemu. The biggest problem with this method is that virtual machines tend to be sloooooow, and resource hungry! Even on a modern multicore machine, you're going to be sucking up at least one core and, to make the experience less painful, probably two. Therefore, while compiling (which is a CPU-intensive task), you will seriously hamper any other heavy-duty activities, like playing a video game, using a design program, or watching a high-resolution movie.

The third option is distributed compiling, and this is the most intriguing of them all. The idea here is that the Rasp Pi works as a master and forks out the job of compiling to other computers on the network (aka nodes). The Rasp Pi "thinks" it is compiling locally, and all unmet dependencies are dealt with directly on the running Pi – no going back and forth.

Once set up, the nodes doing the real heavy lifting can be headless, so they can be idle print servers, file servers, or whatever you have lying around in your office or home. Even old computers can do the job decently well. This means you don't have to tie up your own computer in a CPU-intensive task. Of course, you can use your own computer as a node, but you don't have to. Because the nodes only do one thing – compile – and don't need to run a virtual environment, they are fast, or at least much faster than using a virtual machine.

Finally, there's an app for that: It's called distcc [3], and it is available in the Raspbian repositories and for most other Linux distributions. (See also the "Aim of the Game" box.)

Aim of the Game

This article came about because I wanted to port a program I had written for the Arduino 101 in a previous article to the Rasp Pi. In that article [4], I demoed how to use the gyroscope on the Arduino 101 by waving it around to move the model of a 3D helicopter on the screen of my laptop. However, being a contributor to a magazine that has Raspberry Pi in its name, the fact that the bits and pieces didn't work on the Rasp Pi bugged me.

Unfortunately, Panda3D [5], one of the cornerstones of the project, has no native package for the Rasp Pi. Compiling Panda3D on the Pi is nearly impossible because of the resources it sucks up in the process. I tried once, and it took more than 24 hours to reach about 30%, and then it just stopped, locking up the Pi completely.

Looking for ways to compile Panda3D led me down the rabbit hole of distributed compiling. And here we are. Although this article might seem a bit dry at the beginning, stick with me: The payoff is pretty great and includes cool things such as animated 3D graphics and gesture-controlled devices.

distcc on the Pi

Installing distcc on the Rasp Pi is straightforward:

sudo apt-get update
sudo apt-get upgrade
sudo apt-get install distcc

Configuring is a bit more complicated. First, you have to edit distcc's /etc/distcc/hosts file. This file contains the list of names or IPs on which distcc will compile. Comment out the line that says +zeroconf and add the IPs of your nodes, one per line. For example, I am only going to use one node, a quadcore i5 that I use as a printer/scanner server on my home network and that lives at 192.168.1.24. This machine is idle most of the time, so it is ideal. Adding the line

192.168.1.24

allows distcc to use that computer as a compile node.

If you have several nodes, distcc will try the first one in the list; if that doesn't work or is too busy, it will move on to the next, then the next, and the next. This means that if you have a number of computers you can use as nodes, it is a good idea to put the fastest or least busy nodes at the top. If you do want to include your personal computer (i.e., the one on which you regularly work), you might want to put it toward the bottom of the list. If no nodes are available, distcc will try to compile your program locally using the local compile tools.
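To make the ordering concrete, here is a minimal sketch of a hosts file with two nodes. It writes to a scratch file rather than to /etc/distcc/hosts; the second address, 192.168.1.30, and the scratch file name are made up for illustration. The small active_hosts helper (also hypothetical) just strips comments and blank lines so you can see the entries in the order distcc would read them.

```shell
# Sketch only: writes to a scratch file, not to /etc/distcc/hosts.
# 192.168.1.24 is the node used in this article;
# 192.168.1.30 is a hypothetical second, slower node.
hosts_file="${TMPDIR:-/tmp}/distcc-hosts.example"

cat > "$hosts_file" <<'EOF'
# +zeroconf
192.168.1.24
192.168.1.30
EOF

# List the active entries in file order: drop comments and blank lines.
active_hosts() {
    sed -e 's/#.*//' -e '/^[[:space:]]*$/d' "$1"
}

active_hosts "$hosts_file"
```

Once the list looks right, the same lines can go into /etc/distcc/hosts on the Pi, fastest or least busy node first.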

The distcc program creates a directory, /usr/lib/distcc, which it fills with dummy compilers, soft links that actually point to the distcc executable. To make sure you always compile using distcc, you want to put the path to that directory at the beginning of your Raspbian $PATH environment variable. Do that by adding the line

PATH=/usr/lib/distcc:$PATH

to the end of the /etc/profile file. This ensures that, when the time to compile comes, Raspbian will first look into the distcc directory before it looks anywhere else.

To activate the change, type:

. /etc/profile

You can check that everything is okay by typing:

echo $PATH

This should show /usr/lib/distcc at the beginning of the list of directories.

Use the which tool to check that Raspbian is picking up the correct compiler (i.e., the distcc dummy compiler):

$ which gcc
/usr/lib/distcc/gcc
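If you would rather script these two checks, something along the following lines should do. The first_path_entry helper is a name invented for this sketch; the directory and link locations are the ones described above.

```shell
# Print the first directory in a colon-separated, PATH-style string.
first_path_entry() {
    printf '%s\n' "${1%%:*}"
}

if [ "$(first_path_entry "$PATH")" = "/usr/lib/distcc" ]; then
    echo "OK: the distcc wrappers are found first"
else
    echo "Warning: /usr/lib/distcc is not first in PATH" >&2
fi

# Each dummy compiler should be a soft link to the distcc executable.
if [ -L /usr/lib/distcc/gcc ]; then
    readlink -f /usr/lib/distcc/gcc
fi
```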

Although not strictly necessary, you can include the following variables in your .bashrc file, assuming you are the user who is going to do the compiling:

export DISTCC_BACKOFF_PERIOD=0
export DISTCC_IO_TIMEOUT=3000
export DISTCC_SKIP_LOCAL_RETRY=1

The DISTCC_BACKOFF_PERIOD variable tells distcc how long (in seconds) it should wait when a node fails before trying again. By setting it to 0, distcc will try again immediately. The DISTCC_IO_TIMEOUT=3000 variable tells distcc how long (also in seconds) it has to wait before quitting with a timeout error when a node doesn't respond. Finally, DISTCC_SKIP_LOCAL_RETRY=1 tells distcc not to try to compile locally if all the other nodes fail. As mentioned before, the Pi is bad at compiling, so this is probably a sensible setting.

These variables don't work on distcc versions earlier than 3.2, and, at the moment of writing, the version in Raspbian's repository is 3.1. However, some day it will be updated and then you'll be ready!
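A quick way to find out which side of that line your installation falls on is to compare version numbers. The sketch below assumes GNU sort (for its -V version sort) and assumes that the first line of the distcc --version output has the form "distcc <version> ..."; check the output by hand if the result looks odd. The version_ge helper is a name made up for this example.

```shell
# True when version $1 is greater than or equal to version $2.
# Relies on the -V (version sort) option of GNU sort.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

if command -v distcc >/dev/null 2>&1; then
    # Assumption: the first output line reads "distcc <version> ...".
    installed=$(distcc --version 2>/dev/null | awk 'NR == 1 { print $2 }')
    if [ -n "$installed" ] && version_ge "$installed" 3.2; then
        echo "distcc $installed: the DISTCC_* variables should work"
    else
        echo "distcc ${installed:-unknown}: the DISTCC_* variables will not work yet"
    fi
fi
```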

Installing on Nodes

Now you have to configure what distcc calls the hosts – that is, the nodes (in my case, "node" in the singular) on which you will be compiling. My computer at IP 192.168.1.24 is a Debian machine, so I access it and install distcc onto it:

apt-get install distcc

When you do that, distcc actually installs two bits of software. You already saw how to configure the client bit on your Rasp Pi in the previous section, but now you need the daemon component, a program that runs as a server in the background on each of your nodes and listens for compile requests from the Pi.

The first thing to do is modify the /etc/default/distcc file as root by changing the line that says

STARTDISTCC="false"

to:

STARTDISTCC="true"

This change lets the distcc daemon start and makes it start again every time you reboot the node. The next line to change is

ALLOWEDNETS="127.0.0.1"

to:

ALLOWEDNETS="192.168.1.0/24"

If your network is like mine, with IPs that go from 192.168.1.1 to 192.168.1.254, this makes sure that the whole network is covered. My Rasp Pi, which has currently been assigned 192.168.1.111, will be able to pass on compile tasks to the node.

If your IPs are something different (e.g., 192.168.0.1 to 192.168.0.254), you would use:

ALLOWEDNETS="192.168.0.0/24"

If you have configured your Pi to have a static address (e.g., 192.168.1.31), you could use

ALLOWEDNETS="192.168.1.31"

and only allow compiling from the Pi.
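If the /24 notation is unfamiliar, the following sketch shows the arithmetic behind it: An address is allowed when its first 24 bits match those of the network. The ip_to_int and in_cidr helpers are invented for this illustration and are not part of distcc; they only mimic the comparison for IPv4 addresses.

```shell
# Convert a dotted-quad IPv4 address into a single integer.
ip_to_int() {
    old_ifs=$IFS
    IFS=.
    set -- $1          # split the address into its four octets
    IFS=$old_ifs
    echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# in_cidr ADDRESS NETWORK[/PREFIX]
# True when ADDRESS lies inside the network; a bare address counts as /32.
in_cidr() {
    case "$2" in
        */*) prefix=${2#*/} ;;
        *)   prefix=32 ;;
    esac
    addr=$(ip_to_int "$1")
    net=$(ip_to_int "${2%/*}")
    # 4294967295 is 32 one-bits; shifting left clears the host bits.
    mask=$(( (4294967295 << (32 - prefix)) & 4294967295 ))
    [ $(( addr & mask )) -eq $(( net & mask )) ]
}

# Try the three ALLOWEDNETS values from the text against the Pi at .111.
for net in 192.168.1.0/24 192.168.0.0/24 192.168.1.31; do
    if in_cidr 192.168.1.111 "$net"; then
        echo "ALLOWEDNETS=\"$net\" admits 192.168.1.111"
    else
        echo "ALLOWEDNETS=\"$net\" rejects 192.168.1.111"
    fi
done
```

Only the first value admits a Pi at 192.168.1.111; the single-address form would require the Pi to sit at exactly 192.168.1.31.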

Continuing down in the file, the last thing you need to change is the line that says

LISTENER="127.0.0.1"

to:

LISTENER="0.0.0.0"

This will ensure that distcc listens to the outside network.
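With several nodes to set up, you might prefer to script the three edits. The sketch below assumes GNU sed (for -i) and works on a scratch copy that contains only the three lines discussed here; the configure_node function and the scratch file name are made up for the example. To use it for real, you would run the function as root against /etc/default/distcc after making a backup.

```shell
# Scratch copy holding just the three settings discussed above.
conf="${TMPDIR:-/tmp}/distcc-default.example"
cat > "$conf" <<'EOF'
STARTDISTCC="false"
ALLOWEDNETS="127.0.0.1"
LISTENER="127.0.0.1"
EOF

# configure_node FILE ALLOWED_NET
# Rewrites the three settings in place. The | delimiter is used
# because the network contains a slash.
configure_node() {
    sed -i \
        -e 's|^STARTDISTCC=.*|STARTDISTCC="true"|' \
        -e "s|^ALLOWEDNETS=.*|ALLOWEDNETS=\"$2\"|" \
        -e 's|^LISTENER=.*|LISTENER="0.0.0.0"|' \
        "$1"
}

configure_node "$conf" "192.168.1.0/24"
cat "$conf"
```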

The most modern version of Debian uses systemd, so to get distcc started immediately, type

systemctl start distcc

as root. To check that everything is working as it should, use:

systemctl status distcc

You should see output something like that shown in Figure 1.

Figure 1: Distcc running as a daemon on a node.

The next step is to install the Rasp Pi tool chain. I know what I said before, but you are going to be using the specially tailored ARM compilers that come with it.

Make a directory in your home directory (I called mine RPiTC) and download the Rasp Pi tool chain into it:

mkdir ~/RPiTC
cd ~/RPiTC
git clone https://github.com/raspberrypi/tools.git --depth=1

This grabs the latest version directly from the Raspberry Pi Foundation's repository. Next, open /etc/init.d/distcc and add the path to the tool chain's compiler collection to the PATH variable

PATH=/home/<your_user>/RPiTC/tools/arm-bcm2708/gcc-linaro-arm-linux-gnueabihf-raspbian/bin:$PATH

and reload systemd's configuration, then restart the distcc daemon so that it picks up the new PATH:

systemctl daemon-reload
systemctl restart distcc

The distcc from the Rasp Pi is going to come looking for executable compilers called cpp, gcc, c++, g++, and so on, but if you look in tools/arm-bcm2708/gcc-linaro-arm-linux-gnueabihf-raspbian/bin, you'll see compilers called arm-linux-gnueabihf-cpp, arm-linux-gnueabihf-gcc, and so on. To keep the Rasp Pi distcc from bailing, create some soft links so that it finds what it's looking for:

cd ~/RPiTC/tools/arm-bcm2708/gcc-linaro-arm-linux-gnueabihf-raspbian/bin
ln -s arm-linux-gnueabihf-c++ c++
ln -s arm-linux-gnueabihf-cpp cpp
ln -s arm-linux-gnueabihf-g++ g++
ln -s arm-linux-gnueabihf-gcc gcc

This makes sure the above-named compilers exist, even though they are really pointing to the ARM equivalents.
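Because you have to repeat this on every node, the four links can also be created in a loop. The sketch below wraps the loop in a function, here called link_compilers (an invented name), and demonstrates it on a scratch directory full of empty placeholder files, so nothing in the real tool chain is touched. It uses ln -sf rather than ln -s so that running it a second time does not fail on links that already exist.

```shell
# link_compilers BIN_DIR
# Create the short-name links (gcc, g++, ...) next to the prefixed ARM tools.
link_compilers() {
    for tool in c++ cpp g++ gcc; do
        if [ -e "$1/arm-linux-gnueabihf-$tool" ]; then
            ln -sf "arm-linux-gnueabihf-$tool" "$1/$tool"
        else
            echo "missing: $1/arm-linux-gnueabihf-$tool" >&2
        fi
    done
}

# Demonstration with placeholder files in a scratch directory.
demo=$(mktemp -d)
for tool in c++ cpp g++ gcc; do
    : > "$demo/arm-linux-gnueabihf-$tool"
done
link_compilers "$demo"
ls -l "$demo"
```

On a real node, you would pass the tool chain's bin directory instead, for example link_compilers ~/RPiTC/tools/arm-bcm2708/gcc-linaro-arm-linux-gnueabihf-raspbian/bin.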

You have to do all of the above for each node. When you're done, trying to compile anything that is CPU-intensive on the Pi will result in it being shipped off to the nodes, at which point they will take over.
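Before throwing a big project at the setup, it might be worth running a small smoke test on the Pi. The sketch below writes a tiny C file to a scratch directory and compiles it to an object file with distcc's DISTCC_VERBOSE=1 setting, which makes distcc explain what it is doing on standard error. The file name and directory are arbitrary, and the compile step is skipped if distcc or gcc is not installed.

```shell
# Write a tiny test program to a scratch directory.
workdir=$(mktemp -d)
cat > "$workdir/hello.c" <<'EOF'
#include <stdio.h>

int main(void)
{
    puts("Hello from the Pi");
    return 0;
}
EOF

if command -v distcc >/dev/null 2>&1 && command -v gcc >/dev/null 2>&1; then
    # With /usr/lib/distcc first in PATH, "gcc" here is the distcc wrapper.
    # DISTCC_VERBOSE=1 makes distcc report its decisions on stderr.
    (cd "$workdir" && DISTCC_VERBOSE=1 gcc -c hello.c -o hello.o)
else
    echo "distcc or gcc not found; skipping the test compile"
fi
```

If the job really is shipped off, a matching entry should also show up in the distcc log on the node.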

Compiling Panda3D

Finally, I come to the job I was aiming to do all along: get Panda3D working on the Rasp Pi. This is not a task for the faint of heart – at least without guidance. Setting up distcc to compile is only the first hurdle. If you manage to figure out all the dependencies and complete a compile, most likely Panda3D will be sluggish, many features (e.g., texturing) will be missing, and the program will crash if you nudge it even a little bit.

All of these symptoms are the result of a regular compile that tries to use OpenGL. However, when it doesn't find a GPU that supports OpenGL, Panda3D will fall back to using the CPU, which is a very suboptimal solution.

The trick is figuring out how to compile Panda3D to use OpenGL ES and the on-board Broadcom GPU. ES stands for Embedded Systems and is the subset of OpenGL used for games and graphic-intensive programs on smartphones and, yes, the Rasp Pi.

Fortunately, Thomas Egenhofer has done the legwork and published an excellent guide [6]. However, his guide needs a bit of updating and a few tweaks to work perfectly on a distributed compile.

The first task is to get the dependencies out of the way:

sudo apt-get install build-essential pkg-config python-dev libpng-dev \
  libjpeg-dev libtiff-dev zlib1g-dev libssl-dev libx11-dev libgl1-mesa-dev \
  libxrandr-dev libxxf86dga-dev libxcursor-dev bison flex libfreetype6-dev \
  libvorbis-dev libeigen3-dev libopenal-dev libode-dev libbullet-dev \
  libgtk2.0-dev

The Panda3D developers recommended this list themselves, except I have taken out one package specific to Nvidia. The Rasp Pi doesn't come with an Nvidia GPU, so you won't need it; in fact, it would interfere with the correct working of Panda3D programs.

Next, you should download the latest Panda3D source code:

git clone https://github.com/panda3d/panda3d.git

Egenhofer has written a patch that tweaks all the scripts and source code files to adapt them to the Rasp Pi's architecture. Download it with:

wget http://home.arcor.de/positiveelectron/files/pandaprpi2.patch

The patch assumes Panda3D is in a directory called panda3d-master-org, but your directory is probably called simply panda3d. Rename it with:

mv panda3d/ panda3d-master-org

Now make sure you are in the directory that contains the panda3d-master-org directory (not in panda3d-master-org itself) and apply the patch with:

patch -s -p0 < pandaprpi2.patch

The output should be minimal – at most, a couple of lines mentioning what has been rejected, if anything. You can safely ignore the message.

With the source patched, Egenhofer then recommends setting an environment variable before starting the compile proper:

export LDFLAGS="-Wl,-allow-multiple-definition"

Finally, to get the compile going, he recommends using the following chain of commands, flags, and parameters:

python2.7 ./makepanda/makepanda.py --verbose --everything \
  --installer --threads=4 --no-eigen --gles-incdir=/opt/vc/include \
  --gles-libdir=/opt/vc/lib

Even using distcc, compiling Panda3D is a toughie and will take a while. Fortunately, using a quadcore i5, it takes minutes, not hours (Figure 2). If you access any of your nodes and, as root, have a look in /var/log/distcc (e.g., with tail):

tail -f /var/log/distcc

you will see each component pop up as it is compiled.

Figure 2: If you "follow" the distcc logfile on the compile node, the tasks scroll by as they occur.

The --installer flag used in the python2.7 command above ensures that, when the compile is done, you'll find a DEB package in your Panda3D directory. You can install it with

sudo dpkg -i panda3d<XXXXX>_armhf.deb

where <XXXXX> is the version number. Installing the DEB also takes a while, so be patient.

Using distcc on a quadcore i5 at 2.8GHz (a pretty old machine), the compile took about 22 minutes (Figure 3). Bearing in mind that my prior attempt to compile on the Rasp Pi only reached 30% after 24 hours before locking up the machine completely, that is a substantial boost! Egenhofer compiled using Raspbian within a virtual machine and managed to compile on a similar computer to mine in two hours. It seems pretty clear that distcc is the winner for both convenience and speed.

Figure 3: Compiled in 22 minutes instead of over 6 hours. Not bad at all.

Testing

To make sure everything works, you can now try out some programs that come with Panda3D. If you look in the /usr/share/panda3d/samples directory, you'll see some projects you can play with. Most are directly executable, so to play a game of Panda3D-powered asteroids, for example, type

/usr/share/panda3d/samples/asteroids/main.py

in a terminal window (Figure 4).

Figure 4: Playing a match of Panda-powered asteroids.

Some projects just show a blank window with the message: Video driver reports that Cg shaders are not supported. These projects use OpenGL-only extensions and cannot run with OpenGL ES.

The program I wrote for Raspberry Pi Geek #16 does work, though. It uses an Arduino 101 [7] to control an onscreen 3D model helicopter (rendered with Panda3D). You can download all the bits and pieces with:

git clone https://github.com/pbrown66/Arduino-101.git

Then, you need to upload the Arduino sketch gyro.ino to the Arduino 101 using a regular x86-based computer. The 101 is a rather new board and is only supported in the latest versions of the Arduino IDE, which are not yet available for the Rasp Pi.

Nevertheless, the Python program copter.py works just fine from the Raspberry Pi, as you can see in Figure 5 and a YouTube demo [8].

Figure 5: Making a Panda3D object twist and spin by waving around an Arduino 101 works on the Raspberry Pi!

Conclusion

Getting my Arduino 101/Panda3D program running flawlessly on the Rasp Pi has been a big deal for me. I sort of managed to start Panda3D compiling on the Pi, but as mentioned elsewhere, taking days and finally locking up completely was … er … suboptimal.

When I solved the compile problems inherent to the Rasp Pi thanks to distcc (and what a great technology that is), I sort of got my program to work, but only at a jerky 10fps, with terrible lag and continuous crashes.

Once I stumbled on Egenhofer's guide, everything sorted itself out: I managed a smooth, error-free compile, and I got a buttery responsive animation from my program when I waved around my Arduino 101.

The lesson I'm taking away from the whole experience is this: The Rasp Pi is way more powerful than you think – just not for everything. That said, the open source community usually finds a way and provides free tools and wise guidance to overcome nearly every problem you might face.

And that's what makes it such a joy.

Info

  1. Raspberry Pi tool chain: https://github.com/raspberrypi/tools
  2. The Qemu virtualizer: http://wiki.qemu.org/Main_Page
  3. The distcc compile distributor: https://github.com/distcc/distcc
  4. "Arduino 101" by Paul Brown, Raspberry Pi Geek, issue 16, 2016, pg. 20, http://www.raspberry-pi-geek.com/Archive/2016/16/Exploring-the-new-Arduino-Genuino-101
  5. Panda3D: https://www.panda3d.org
  6. Thomas Egenhofer's guide to installing Panda3D on the RPi: https://www.panda3d.org/forums/viewtopic.php?f=6&t=18214
  7. Arduino 101: https://www.arduino.cc/en/Main/ArduinoBoard101
  8. Watch the demo: https://youtu.be/G9cF04Os2YM
