Building Linux from Scratch

Master Recipe

Article from Issue 251/2021
Author(s):

If you really want to learn about Linux, build it from scratch.

Linux from Scratch (LFS) [1] and Beyond Linux from Scratch (BLFS) [2] are online tutorials that provide step-by-step instructions for how to compile and assemble your own Linux system from freely available sources. The abbreviation LFS refers to both the documentation and the resulting Linux system.

If you've been around Linux long enough, you have probably come across the need to compile software – either from a homegrown application or possibly from source code available at an open source project website. You might think that compiling an operating system is similar, and in some ways it is, but building a complete operating system is vastly more complicated than compiling your average desktop application.

Keep in mind that the modern distro you are using to build that desktop app already has a complete toolchain that has been tested and approved for compatibility. Imagine that this build environment doesn't exist yet and you have to create it. That includes a compiler and also important libraries such as glibc and libstdc++. The Linux from Scratch instructions (sometimes call the book) contain 11 chapters that are divided into five major parts. The authors break the process of building a Linux system into three stages (Figure 1):

  • configure a build host (Part II, Chapters 2 to 4)
  • build a cross toolchain and temporarily required tools (Part III, Chapters 5 to 7)
  • create the actual LFS system (Part IV, Chapters 8 to 11).
Figure 1: The three parts of the LFS build process.

Gerard Beekmans, the author of LFS, started working with Linux in 1998. After using various distributions, he realized that he could not find a "one size fits all" solution, and he would need to build his own system. When he started compiling, he quickly discovered that he had unleashed an onslaught of compile-time errors and circular dependencies. The experience he gained, which he shared with the Linux community, met with such widespread interest that the LFS project was born.

LFS is one of several distribution kits that focus on letting the user build their own system. Many of these projects specialize in the embedded area, including BitBake from the Yocto project [3]. Other options include Linux Target Image Builder [4], OpenWrt [5], and Scratchbox [6].

LFS makes an effort to follow common standards, including POSIX.1-2008 [7], which defines an operating system interface, including an environment for portability purposes, the Filesystem Hierarchy Standard FHS 3.0 [8], and the Linux Standard Base LSB 5.0 [9]. The Linux Standard Base defines four standards for core, desktop, runtime languages and imaging, that are hotly debated. LFS can implement them only partially due to its minimal software equipment.

If you follow the instructions, you will end up with a minimal Linux system that includes an Ext4 filesystem, Grub 2.04, Kernel 5.10.17, Bash 5.1, GCC 10.2, Perl 5.32.1, Python 3.9.2, and Vim 8.2. (This article uses LFS v10.1 from March 2021.) You have the choice between LFS with SysVinit [10] and LFS with Systemd and D-Bus [11]. I'll cover the Systemd variant, which includes DHCP and NTP clients, as well as other bits and pieces that are not included in the SysVinit variant of LFS, although BLFS offers some additional SysVinit options.

The build takes a good deal of time: Using the test VM described in this article with KVM on a machine with Core i7 10th Gen and NVMe SSD, the build takes several hours even with the tests turned off.

Cross-Compilating

To minimize the influence of the build host's operating system, you need to make sure that the LFS system is independent of it. Chapters 5 and 6 help you create the software tools you'll need to build Linux from Scratch. You then deploy these tools in an isolated chroot environment only for the purpose of building LFS.

The build process is based on the principle of cross-compiling: A cross-compiler is typically used to compile code for other system architectures. On your build machine, a cross-comiler is theoretically not necessary because you are most likely building on x86_64 for x86_64. However, you will want LFS to be free of inlinked software from the build host, which a cross-compiler guarantees.

The LFS documentation assumes you have three hosts, each with a different architecture. Host A has a compiler for its own architecture. The host is slow and its capacity is limited. Host B does not have a compiler initially, but it is fast and well equipped. It will be producing the software for Host C. Host C, with yet another architecture, also has no compiler and is small and slow.

This scenario requires two cross compilers: Cross-Compiler A and Cross-Compiler B. On Host A, the compiler builds Cross-Compiler A, which in turn builds Cross-Compiler B. Cross-Compiler B is used on Host B to build Compiler C and all programs for Host C. However, they cannot be tested on Host B. Therefore, Compiler C rebuilds and tests itself on Host C.

For LFS to succeed with cross-compiling on the same host, you need to gently adjust the well-known vendor triplet x86_64-pc-linux-gnu (which always contains four components today) in LFS_TGT to fool the compiler into thinking it has different architectures. Then, on the build host, the compiler build host first builds the cross-compiler build host, which in turn builds the compiler for LFS. In the LFS chroot on the build host, the compiler LFS then rebuilds and tests itself and is then ready for deployment.

For LFS, you need to use the cross-compiler to compile not only the C compiler but also the glibc and libstdc++ libraries. The problem is that, to be built, the C compiler relies internally on libgcc, which in turn has to be linked against glibc – which doesn't exist yet.

To work around these circular dependencies, first build a minimal version of the cross-compiler build host that is just about good enough to compile a full-fledged glibc, as well as a minimally functional libstdc++. The libstdc++ library will be built again later in the chroot environment, this time completely. Further details are given in the LFS documentation.

Hardware

The LFS build process will not work unless your build host is working perfectly. The tutorial in this article assumes an unused virtual machine based on a fully patched, current CentOS 8.3 minimal install with BIOS boot. If you're using a different Linux variant, you might need to modify some of the steps, but the process is similar.

The C compiler GCC benefits from a fast core. The documentation discusses a multi-thread build distributed over several cores, but this configuration is not recommended in practice because of occasional race conditions.

For the build server, two cores, 2GB RAM, an SSD-based hypervisor, and an Intel e1000 NIC are all you need. For mass storage, you need a 10GB disk (VirtIO, /dev/vda/) for the system and a 20GB disk (SATA or SCSI, /dev/sda/) for LFS.

Avoid VirtIO for the LFS disk and network card because the instructions in the LFS manual do not compile VirtIO support into the kernel. LFS is happy with a virtual SATA or SCSI disk that it identifies as /dev/sda/ and a classic Intel e1000 network card with 1 Gbps. The disk name is important when it comes to making LFS bootable. The name of the network card (such as enp1s0) matters if you want to use Predictable Network Interface Names [12].

Getting Started

After CentOS is installed, you need to set up the packages required for the first build processes. You can retrieve the required makeinfo from the texinfo package in the PowerTools repository, starting with CentOS 8.3. The packages required for the build are quickly installed (Listing 1).

Listing 1

Installing the Required Packages

$ dnf -y install yum-utils
$ yum config-manager --set-enabled powertools
 dnf -y install bison byacc bzip2 gcc-c++ patch perl python3 tar texinfo wget

You then need to check the versions installed by CentOS with the help of a small script: LFS v10.0 requires a GCC version of 6.2 or higher and a GNU Make version of 4.0 or higher, which an earlier Linux distribution such as CentOS 7 will not give you.

If everything is fine so far, split the second disk of the build host into the partitions /, swap and /home, create the Ext4 filesystem, and define the $LFS environment variable, which is essential for all subsequent shell calls and resolves the root mount point. The variable must therefore work under all circumstances. You will want to complete the locale settings in CentOS 8 and mount the second disk.

Make the /dev/sda/ disk bootable via BIOS/MBR and add various partitions. To give Grub's own core.img enough space, the first partition for / with a size of 10GB starts at 1MB. After that, create a /swap of 2GB, leaving the rest of the disk reserved for /home.

In this setup, /boot does not need its own partition but contains the kernel plus the boot loader configuration as an ordinary directory. You only need a separate partition if you want to support multiboot with other distributions, or if the boot loader can no longer access the root file system. Lack of access to the filesystem can happen as a result of missing drivers, disk encryption, software RAID, or LVM.

Buy this article as PDF

Express-Checkout as PDF
Price $2.95
(incl. VAT)

Buy Linux Magazine

SINGLE ISSUES
 
SUBSCRIPTIONS
 
TABLET & SMARTPHONE APPS
Get it on Google Play

US / Canada

Get it on Google Play

UK / Australia

Related content

  • Linux from Scratch 6.5 Available

    The Linux from Scratch (LFS) intensive training has been updated to version 6.5.

  • Linux From Scratch

    Linux From Scratch helps you create a custom Linux system with everything you need and nothing more. And as you build your system, you’ll get an insider’s look at how Linux really works.

  • Buildroot

    Whether you need a tiny OS for 1MB of flash memory or a complex Linux with a graphical stack, you can quickly set up a working operating system using Buildroot.

  • detLFS

    The detLFS project provides an ideal foundation for compiling Linux from source code, either to experience the fundamentals of how Linux works or to prepare an operating system for a project with very specific requirements.

  • Mozilla and Samsung Collaborate on New Browser Engine

    Mozilla is collaborating with Samsung on a new web browser engine called Servo.

comments powered by Disqus