Secure storage with GlusterFS

Bright Idea

Article from Issue 153/2013

Author(s): Kurt Seifried

You can create distributed, replicated, and high-performance storage systems using GlusterFS and some inexpensive hardware. Kurt explains.

Recently I had a CPU cache memory error pop up in my file server logfile (Figure 1). I don't know if that means the CPU is failing, or if it got hit by a cosmic ray, or if something else happened, but now I wonder: Can I really trust this hardware with critical services anymore? When it comes to servers, the failure of a single component, or even a single system, should not take out an entire service.

Figure 1: A CPU hardware error.

In other words, I'm doing it wrong by relying on a single server to provide my file-serving needs. Even though I have the disks configured in a mirrored RAID array, this won't help if the CPU goes flaky and dies; I'll still have to build a new server, and move the drives over, and hope that no data was corrupted. Now imagine that this isn't my personal file server but the back-end file server for your system boot images and partitions (because you're using KVM, RHEV, OpenStack, or something similar). In this case, the failure of a single server could bring a significant portion of your infrastructure to its knees. Thus, the availability aspect of the security triad (i.e., Availability, Integrity, and Confidentiality, or AIC) was not properly addressed, and now you have to deal with a lot of angry users and managers.

Enter GlusterFS

GlusterFS is a network/clustering filesystem acquired by Red Hat in 2011. In a nutshell, GlusterFS has a server and client component. The server is basically "dumb" (i.e., the metadata is stored with the file on the back end, simplifying the server considerably). You create a trusted pool of servers, and these servers contain storage "bricks" – basically disk space, which you can then export, either singularly or in groups (known as volumes) using a variety of replication, distribution, striping, and combinations thereof to strike the right balance of performance, reliability, and capabilities for the volumes you need.

The GlusterFS data can then be exported in one of three ways to clients, using the native GlusterFS client, which is your best bet for performance and features like automated failover, NFS (the GlusterFS server can emulate NFS), or CIFS (using Samba to export the storage). Of course, you could also mount the storage on a server and re-export it using other file-serving technologies.

The GlusterFS server storage bricks are just normal devices with a supported file system. XFS is recommended, and ext4 is supported (more on this later). With no need for special hardware, you could use leftover space on a device to create a partition that is then exported as a storage brick, by dedicating an entire device or making a RAID device and exporting it.

GlusterFS Installation

GlusterFS installation is pretty easy; Fedora, Debian, and several others include GlusterFS server and client by default. The upstream GlusterFS project also makes RPMs and DPKG files available [1], as well as a NetBSD package. I strongly recommend running the stable version if possible (3.3.x at the time of this writing), although the latest beta version (3.4) has some really cool features, such as Qemu thin provisioning and Write Once Read Many.

GlusterFS Setup and Performance

As with any technology, there are good ways and better ways to set up the servers. As far as filesystems go, you want to use XFS rather than ext4 if at all possible. XFS is far better tested and supported than ext4 for use with GlusterFS; for example, in March 2013, a kernel update broke GlusterFS on ext4 [2]. Also, if you're using XFS, you'll want to make sure the inodes are 512 bytes or larger because if you use ACLs on XFS, you'll want the inodes to be large enough to store the ACL directly in the inode; if it's stored externally to the inode, every file access will require additional lookups, thereby adding latency and affecting performance.

You also need to ensure the system running the GlusterFS Server has enough RAM. GlusterFS aggressively caches data, which is good for performance, but I have seen cases in which this caused the system to run short on memory, invoking the OOM killer and killing the largest process (in terms of memory use), which was invariably glusterd. The same goes for clients, which will need at least 1GB of RAM. I found in production that cloud images with 640MB of RAM would consistently result in the GlusterFS native client being killed by the OOM killer.

As far as storage, I recommend setting up one storage brick per device, or ideally using RAID 5/6/10 devices for storage bricks, so a single failed disk doesn't take out the entire brick on that server. If possible, you should not share devices between GlusterFS and the local system. For example, if you have a locally I/O-intensive process (e.g., log rotation with compression), this could affect the performance of glusterd on that server and, in turn, affect all the clients using the volume that include that storage brick.

Setup of GlusterFS and volumes is quite easy, assuming you have two servers with one disk each that you want to make into a single volume, replicated (basically RAID 1, a mirror). From the first server, run:

# gluster peer probe ip.of.server.2
# gluster volume create vol1 replica 2 transport tcp 10.2.3.4:/brick1 10.3.4.5:/brick2
# gluster volume start vol1

On the client(s), mount the filesystem:

# mount -t glusterfs -o acl 10.2.3.4:/vol1 /vol1

Replicated volumes are pretty simple: Basically it's RAID 1, and the number of bricks should be equal to the replica count. Plans have been made to support more flexible RAID setups (similar in nature to RAIDZ1/2/3, which allows you to specify that you want data replicated to withstand the loss of one, two, or three drives in the group). You can also create distributed volumes; this is most similar to JBOD (Just a Bunch of Disks), in that files are written in their entirety to one of the bricks in the storage volume. The advantage here is you can easily create very large storage volumes.

A third option is distributed volumes; these are basically RAID 0-style volumes. Here, files are written in pieces to each storage brick, and for write and read access, you can get very high volumes (e.g., 10 servers writing one tenth of the file each compared with a single server handling all of it). Finally, you can mix and match these three types. For example, you could create a distributed striped volume for extremely high performance, a distributed replicated volume for extremely high availability, or a distributed striped replicated volume for both performance and reliability.

A note on GlusterFS security: Currently, GlusterFS only supports IP/port-based access controls; in essence, it is meant to be used with a trusted network environment. A variety of authentication and encryption options (e.g., Kerberos and SSL) are being examined; however, nothing had been decided at the time of this writing. I suggest you use firewall rules to restrict access to the clients, and if you have clients with different security levels (e.g., trusted internal clients and untrusted external clients), I advise using different storage pools for each.

1 2 Next »

Buy this article as PDF

Express-Checkout as PDF

Price $2.95
(incl. VAT)

Buy Linux Magazine

SINGLE ISSUES

Print Issues

Digital Issues

SUBSCRIPTIONS

Print Subs

Digisubs

TABLET & SMARTPHONE APPS

US / Canada

UK / Australia

Support Our Work

Linux Magazine content is made possible with support from readers like you. Please consider contributing when you’ve found an article to be beneficial.

News

Linux Servers Targeted by Akira Ransomware

Enterprise Linux , Linux , ransomware , Security

A group of bad actors who have already extorted $42 million have their sights set on the Linux platform.
TUXEDO Computers Unveils Linux Laptop Featuring AMD Ryzen CPU

Games , Hardware , laptop , Linux

This latest release is the first laptop to include the new CPU from Ryzen and Linux preinstalled.
XZ Gets the All-Clear

Arch Linux , Fedora , Linux , open source , Security , Ubuntu

The back door xz vulnerability has been officially reverted for Fedora 40 and versions 38 and 39 were never affected.
Canonical Collaborates with Qualcomm on New Venture

Artificial Inte... , Linux , open source , Security , Ubuntu

This new joint effort is geared toward bringing Ubuntu and Ubuntu Core to Qualcomm-powered devices.
Kodi 21.0 Open-Source Entertainment Hub Released

audio , Multimedia , Music , open source , streaming video , Video

After a year of development, the award-winning Kodi cross-platform, media center software is now available with many new additions and improvements.
Linux Usage Increases in Two Key Areas

Games , Linux , open source , Steam

If market share is your thing, you'll be happy to know that Linux is on the rise in two areas that, if they keep climbing, could have serious meaning for Linux's future.
Vulnerability Discovered in xz Libraries

Fedora , Linux , malware , Security

An urgent alert for Fedora 40 has been posted and users should pay attention.
Canonical Bumps LTS Support to 12 years

Linux , open source , Operating Systems , Ubuntu

If you're worried that your Ubuntu LTS release won't be supported long enough to last, Canonical has a surprise for you in the form of 12 years of security coverage.
Fedora 40 Beta Released Soon

Fedora , Gnome , open source , Plasma , Wayland

With the official release of Fedora 40 coming in April, it's almost time to download the beta and see what's new.
New Pentesting Distribution to Compete with Kali Linux

Linux , open source , Tools , Ubuntu

SnoopGod is now available for your testing needs

Secure storage with GlusterFS