Life and times of the classic ext Linux filesystem

The Filer

Article from Issue 189/2016
The ext filesystem celebrates its 25th birthday next year. A brief tour of ext history will give you some insights into how this classic Linux filesystem works – and how it has evolved to meet users' needs.

When Linus Torvalds developed the first early versions of Linux in 1991, he used the Minix filesystem by Andrew S. Tanenbaum. The Minix filesystem was part of the legacy of the Minix Unix clone [1], which Tanenbaum created for teaching purposes. Restrictions such as a maximum file name length of 14 characters and a file size limit of 64MB created the need for a filesystem developed specifically for Linux. The birth of the Extended Filesystem, or ext, followed shortly afterward.

French software developer Rémy Card released the first version of ext in 1992, making it possible to save files of up to 2GB on Linux. The permissible length for file names grew to 255 characters. Although ext got many things right in its first version, it was hardly up to professional requirements. With some workloads, ext suffered from severe fragmentation, and it could not store separate timestamps for file access, inode changes, and data modification.

Successor Ext2

When designing ext2, the developers adopted many best practices and principles of the then-widespread Unix Berkeley Fast File System [2]: Accordingly, an ext2 filesystem logically divides the storage medium into fixed-size blocks arranged one after another. The default block size is 4KB.

The blocks should generally be at least as large as the hard disk sectors. When ext2 was designed, the typical sector size was 512 bytes, so a 4KB block logically comprised eight sectors. For several years, however, many hard disks have come with 4KB sectors, so a block maps directly to a sector.

Ext2 also combines blocks into groups; given a block size of 4KB, 32,768 blocks typically form a group (of about 128MB). A modern hard disk with a capacity of 2TB holds roughly half a billion blocks and around 16,000 block groups.
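The figures above can be checked with a little arithmetic. The following Python sketch assumes that each group's block bitmap occupies exactly one block, with one bit per block, which is where the 32,768 figure comes from; the function name block_group_geometry is made up for this illustration, and 2TB is taken to mean 2 x 2^40 bytes.

```python
def block_group_geometry(disk_bytes, block_size=4096):
    """Return (blocks per group, bytes per group, total blocks, group count)."""
    # Assumption: one bitmap block tracks the group, one bit per block,
    # so a group spans block_size * 8 blocks.
    blocks_per_group = block_size * 8
    group_bytes = blocks_per_group * block_size
    total_blocks = disk_bytes // block_size
    # Ceiling division: a final, partially filled group still counts.
    groups = -(-total_blocks // blocks_per_group)
    return blocks_per_group, group_bytes, total_blocks, groups


bpg, group_bytes, blocks, groups = block_group_geometry(2 * 2**40)
print(bpg, group_bytes // 2**20, blocks, groups)
# prints: 32768 128 536870912 16384
```

The same function shows how a smaller block size shrinks the groups: with 1KB blocks, a group would hold 8,192 blocks and cover only 8MB.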

The division into blocks and block groups helps to organize the storage space logically and optimize read and write access. As a rule, the system tries to keep the blocks of a file within the same block group in order to minimize fragmentation and access times to the storage medium. If the system writes a file that is larger than the configured block size, the file occupies several blocks.

The division into fixed-size blocks has a decisive disadvantage: If a file does not completely fill a block of 4096 bytes (4KB), the rest of the block is wasted. A 96-byte file occupies a complete 4KB block, and a 4192-byte file occupies two. In either case, 4000 bytes are lost.
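The wasted space is easy to compute for any file size. This short Python sketch rounds a file size up to whole blocks and reports the unused remainder; the helper name slack_bytes is hypothetical, and the sizes are chosen purely for illustration.

```python
def slack_bytes(file_size, block_size=4096):
    """Return (blocks occupied, bytes wasted) for a file of the given size."""
    blocks = -(-file_size // block_size)  # round up to whole blocks
    return blocks, blocks * block_size - file_size


for size in (96, 4096, 4192):
    print(size, slack_bytes(size))
# prints:
# 96 (1, 4000)
# 4096 (1, 0)
# 4192 (2, 4000)
```

A file that lands just past a block boundary wastes almost an entire block, whereas a file that is an exact multiple of the block size wastes nothing.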

Inodes

An ext filesystem stores the contents of a file on the hard disk without any metadata, so it needs a separate place to record the file's size, ownership, and access permissions. The filesystem also needs to record exactly where the data is located on the disk so that the system can find the file quickly.

The developers of ext2 used inodes for this purpose: Every file and directory is represented by an inode, which contains the ownership and access information. The name inode stands for "index node," which is why the abbreviation i-node often appeared in the early years of ext2. An inode in ext2 has a default size of 128 bytes.
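Much of what an inode stores can be inspected from user space. The following Python sketch uses the standard os.stat() call on a temporary file; the helper name describe_inode is made up for this example, and the values shown depend on the system and filesystem on which it runs.

```python
import os
import stat
import tempfile


def describe_inode(path):
    """Collect the inode-level metadata that the kernel reports for a path."""
    st = os.stat(path)
    return {
        "inode": st.st_ino,                 # inode number
        "mode": stat.filemode(st.st_mode),  # file type and permissions
        "uid": st.st_uid,                   # owning user
        "gid": st.st_gid,                   # owning group
        "size": st.st_size,                 # file size in bytes
        "links": st.st_nlink,               # number of names pointing here
    }


with tempfile.NamedTemporaryFile() as tmp:
    tmp.write(b"hello")
    tmp.flush()
    info = describe_inode(tmp.name)
    print(info)
```

Note that the file name does not appear anywhere in the result: It is not part of the inode.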

Directories

From the perspective of ext, folders are nothing more than special files that contain a list of files. Each entry associates a file name with an inode number and also stores the length of the entry, the length of the name, and the name itself. When accessing a file, the system just needs to look at the inode that represents the directory holding the file and find the matching entry (Figure 1).

Figure 1: A look at the user-readable content of a directory inode using debugfs and the xxd hex viewer.
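To illustrate how such entries are laid out, the following Python sketch builds and parses a small synthetic directory block. It assumes the ext2 entry format with a file type byte: a 4-byte inode number, a 2-byte entry length, a 1-byte name length, and a 1-byte file type, all little-endian, followed by the name padded to a multiple of four bytes. The helpers pack_entry and parse_entries, the sample names, and the inode numbers are invented for this example; on a real filesystem, the last entry's length usually extends to the end of the block.

```python
import struct

HEADER = "<IHBB"  # inode, entry length, name length, file type
HEADER_SIZE = struct.calcsize(HEADER)  # 8 bytes


def pack_entry(inode, name, file_type=1):
    """Build one directory entry (file_type 1 = regular file, 2 = directory)."""
    raw = name.encode()
    rec_len = (HEADER_SIZE + len(raw) + 3) & ~3  # pad to a 4-byte boundary
    header = struct.pack(HEADER, inode, rec_len, len(raw), file_type)
    return header + raw.ljust(rec_len - HEADER_SIZE, b"\0")


def parse_entries(block):
    """Walk the entries in a directory block and return (name, inode) pairs."""
    entries, offset = [], 0
    while offset + HEADER_SIZE <= len(block):
        inode, rec_len, name_len, _ftype = struct.unpack_from(HEADER, block, offset)
        if rec_len < HEADER_SIZE:
            break  # malformed entry; stop rather than loop forever
        if inode != 0:  # inode 0 marks an unused entry
            start = offset + HEADER_SIZE
            entries.append((block[start:start + name_len].decode(), inode))
        offset += rec_len  # the entry length leads to the next entry
    return entries


block = pack_entry(2, ".", 2) + pack_entry(2, "..", 2) + pack_entry(12, "notes.txt")
print(parse_entries(block))
# prints: [('.', 2), ('..', 2), ('notes.txt', 12)]
```

Because every entry records its own length, the parser can hop from one entry to the next without knowing how long each name is in advance.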

File name and inode number mappings do not need to be unique: An additional file name that points to an already-referenced inode number is known as a hard link. Applications and users typically cannot tell a hard link from the original name, and a hard link can only point to objects in its own filesystem. Incidentally, the same principle applies to subdirectories. A subdirectory is also a special file, which the parent directory lists by name and inode number just like any other file.
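A quick experiment shows that two names can share one inode. This Python sketch creates a file and a hard link to it in a temporary directory using the standard os.link() call; the function name and file names are hypothetical, and the example assumes a filesystem that supports hard links.

```python
import os
import tempfile


def make_hard_link_demo(directory):
    """Create a file plus a hard link and return the stat results for both."""
    original = os.path.join(directory, "original.txt")
    alias = os.path.join(directory, "alias.txt")
    with open(original, "w") as fh:
        fh.write("same data\n")
    os.link(original, alias)  # second name for the same inode
    return os.stat(original), os.stat(alias)


with tempfile.TemporaryDirectory() as tmp:
    first, second = make_hard_link_demo(tmp)
    print(first.st_ino == second.st_ino, first.st_nlink)
# prints: True 2
```

Both names report the same inode number, and the link count of 2 records that two directory entries now reference that inode.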

If you type ls -a to list the contents of a folder, you will notice the . and .. entries. These entries are a special feature of directories: The system generates them automatically when it creates a new folder, and they cannot be deleted. Ext2 stores the . entry with the inode number of the directory itself and the .. entry with the inode number of the parent directory. The root directory always resides in inode number 2, so that the system can find it faster.
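The relationship between . and .. and the surrounding directories can be verified from user space. The following Python sketch compares inode numbers in a temporary directory tree; the helper name dot_entries is made up for this example. The final line prints the inode number of /, which should be 2 only if / is the root of an ext filesystem; in containers or on other filesystems, the value may differ.

```python
import os
import tempfile


def dot_entries(directory):
    """Return the inode numbers behind the . and .. entries of a directory."""
    dot = os.stat(os.path.join(directory, ".")).st_ino
    dotdot = os.stat(os.path.join(directory, "..")).st_ino
    return dot, dotdot


with tempfile.TemporaryDirectory() as parent:
    child = os.path.join(parent, "child")
    os.mkdir(child)
    dot, dotdot = dot_entries(child)
    same_self = dot == os.stat(child).st_ino
    same_parent = dotdot == os.stat(parent).st_ino
    print(same_self, same_parent)
# prints: True True

# Inode number of the root directory (system dependent, see above)
print(os.stat("/").st_ino)
```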
