Merging file systems for a simple NAS with MergerFS

Come Together

Lead Image © Michelle Albers, Fotolia.com

Article from Issue 254/2022
Author(s): Adam Dix

MergerFS is a simple tool for bunching together disks, volumes, and arrays.

I had to make many decisions when setting up my personal network-attached storage box. I needed a machine capable of sharing files on my local network with Samba [1]. I also wanted to be able to use the system as a Plex streaming server [2] and to run virtual machines occasionally to test out new Linux distributions. I didn't need the system to be mission critical or high performing. A big motivation for setting this server up was to learn more about Linux. With that in mind, it should not be too surprising that I built it using spare parts.

The files I wanted to store on this server were mainly replaceable media files. A high-end file system such as ZFS sounded amazing, but it was more than I needed in this case, and ZFS wasn't really financially viable because of RAM costs. I just wanted to get the most mileage out of my hard disk space, redundancy be damned. All critical information, such as personal files, would be backed up to multiple machines and to someone else's computer (Alphabet's, to be precise).

I did plan on using a RAID-0 array – for speed rather than redundancy. Using a Plex and Samba server on a home network meant that the bulk of the data would be written once and read occasionally, and that speeds of even shingled magnetic recording-based spinning drives would be more than adequate. However, one issue was the need to support Windows and the desire to format the drives to NTFS so that, in the event of a hardware failure or operator error, the drive could be removed and installed into a 3.5-inch external enclosure on a nearby Windows system.

My frustration began when I tried to set up Samba and Plex in a way that would make logical sense to the person accessing the files on the other end. The easiest way to do this is to have one share that represents the contents of the entire machine. The problem with that approach is that the files and folders are spread across multiple disks in no particular order and, to add chaos to confusion, not all drives are even in the same format. One disk might have TV shows, another movies, a third software, and a fourth all of the above plus some documents. Music is scattered across all of the disks.

RAID could have been a possible solution from the start, but the disks were acquired one at a time, with a new disk added only when the installed disks were approaching their capacity (>80%). I had heard of Synology NAS machines using Btrfs in a way that allows the owner to install a disk of any size and to extend the array to the size of the current array plus the new disk, but again my disks were initially set up primarily with NTFS and there was not enough space on any one or two disks to allow for the creation of an array after the fact. Backing up all of the data to an online service would have been cost-prohibitive. It would be amazing if there existed a RAID technology that would allow me to keep the contents of the drives while creating the array, but that's the kind of magic reserved for unicorns in fairy tales.

The solution to this mess was MergerFS [3]. MergerFS is a tool that lets you combine separate file systems and volumes from different partitions or disks into one single volume facing the user. You can think of MergerFS as something similar to utilities such as mhddfs, UnionFS, and AuFS. MergerFS doesn't care if your drives are formatted to NTFS, FAT, ext3, or ext4, or if they are organized in a RAID-0, 1, 5, or another level of RAID. It doesn't even matter if your drives are in a logical volume already. MergerFS loves them all just the way they are.

MergerFS runs in userspace as a FUSE device, and it can be manipulated in most of the same ways that any other volume can be, but it is composed of partitions from other volumes or disks – with logic that makes it easy to configure how the user interacts with it. The GitHub page describes MergerFS as a "union of sets." You can configure policies defining how files and folders are added to the array.

MergerFS can handle both read/write and read-only disks. It doesn't rule out redundancy, but you need to plan for redundancy when you first set up the pool. For instance, suppose you want some personal files duplicated (health records, CV, Bitcoin wallet, pictures), but you also have many files that do not need to be kept safe (audio and video media, a collection of Linux ISOs, etc.). You could create a folder for personal files on an existing redundant array, say, a RAID-5 array. Then you could create another set of folders on separate volumes that are not redundant, such as one disk for movies and another disk for TV shows and music. The RAID-5 array, along with the two media disks, can be "merged" into what appears to be a single client-facing drive. As long as personal files are added to the personal files folder (which exists only on the RAID-5 array) and all media files are added to the folders that were created for them, the files will all end up in the right places. This behavior depends, of course, on the policy that the user chooses to set. More information on the available policies is on the GitHub page for MergerFS.
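The folder layout described above could be prepared along the following lines. This is only a sketch: the function name make_layout and the directory names raid5, movies, and tvmusic are hypothetical stand-ins for wherever the RAID-5 array and the two media disks are actually mounted.

```shell
#!/bin/sh
# Sketch only: create the example folder layout under a chosen root.
# The names raid5, movies, and tvmusic are hypothetical mount points.
# Usage: make_layout ROOT
make_layout() {
  root=$1
  # Personal files get a folder only on the redundant array.
  mkdir -p "$root/raid5/Personal"
  # Replaceable media folders live on the non-redundant disks.
  mkdir -p "$root/movies/Movies"
  mkdir -p "$root/tvmusic/TV" "$root/tvmusic/Music"
}
```

Because the Personal folder exists on only one branch, a policy that respects existing paths has only one place to put new personal files.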

If a disk in the RAID-5 array fails, then it can be replaced and the array rebuilt. On the other hand, if the disk with the movies, TV shows, or music fails, then the data is lost and will need to be recreated or downloaded again.

MergerFS allows for redundancy when needed, a lack of redundancy when redundancy is not needed, a single point of contact for the user, and a lower cost compared to setups where everything is redundant. You can even use MergerFS in conjunction with ZFS or Btrfs, and the benefits of those file systems are still achieved.

For my system, MergerFS has allowed me to take drives of different sizes, models, speeds, and even formats, and combine them in a way that makes it very simple for me to add, change, or remove data without worrying about what information is on the drive. In the future, I intend to invest in a much better system where data loss is minimized through using RAID or ZFS, but for now, MergerFS has allowed me to very easily and inexpensively manage a wide variety of data for my home server in a way that suits me and is easy to expand.

To set up MergerFS initially on Ubuntu, run the following commands:

$ sudo apt update
$ sudo apt install mergerfs -y

Next, you will need to create a folder to mount the MergerFS array (I have used /media/virt in the following example), apply the chosen policy, and add drives to the array. For my setup, I use the following:

mergerfs -o defaults,allow_other,use_ino,nonempty,fsname=MergerFS \
  /media/diskb:/media/diskc:/media/diskd:/media/diske \
  /media/virt

In this example, the user-facing volume containing the drive contents will be /media/virt, and it will consist of all of the data on /media/diskb, c, d, and e. Figure 1 shows how this appears in the file browser. The virt drive on my system can be used just as any other single drive would be. In each individual drive that makes up the array, I have separate folders for different data, such as programs, movies, and so forth, organized in a way that I set up initially. With this setup, if I have a folder named Programs, say, on both disk b and disk c, then MergerFS will add files to whichever drive has the most free space. If I only have a folder named Programs on disk c though, then it will automatically add anything that I put in the /media/virt/Programs folder into disk c.

Figure 1: Virt created from disks b, c, d, and e.
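The placement behavior described above resembles what the MergerFS documentation calls an "existing path, most free space" policy. The following shell function is a rough illustration of that selection logic, not MergerFS's actual implementation. The function name pick_branch is hypothetical, and the sketch assumes that df -Pk reports available space in the fourth column and that mount point names contain no spaces.

```shell
#!/bin/sh
# Illustration only: pick the branch that already contains a folder
# and has the most free space. This is not how MergerFS is implemented.
# Usage: pick_branch RELATIVE_PATH BRANCH...
pick_branch() {
  rel=$1; shift
  best=""; best_free=-1
  for branch in "$@"; do
    # Skip branches where the folder does not exist yet.
    [ -d "$branch/$rel" ] || continue
    # Column 4 of POSIX df output is the available space in 1K blocks.
    free=$(df -Pk "$branch" | awk 'NR==2 {print $4}')
    if [ "${free:-0}" -gt "$best_free" ]; then
      best=$branch; best_free=${free:-0}
    fi
  done
  # Print the winner; return non-zero if no branch has the folder.
  [ -n "$best" ] && printf '%s\n' "$best"
}
```

Called as pick_branch Programs /media/diskb /media/diskc, the function would print /media/diskc if only that disk holds a Programs folder, which matches the behavior described for the /media/virt/Programs folder.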

Conclusion

MergerFS has made it much easier for me to organize my data in a logical way. I can simplify my configuration by setting up different folders for each type of data, and MergerFS figures out the file system and redundancy details. To complete the configuration, you can set up MergerFS to start when your system boots up.
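One common way to mount the pool at boot is an entry in /etc/fstab. The sketch below only prints a candidate line built from the options and paths used earlier; it does not modify any file. The function name fstab_line is hypothetical, and the fuse.mergerfs file system type is taken from my reading of the MergerFS documentation, so check it against your installed version before relying on it.

```shell
#!/bin/sh
# Sketch only: print an /etc/fstab entry for a MergerFS pool.
# Usage: fstab_line MOUNTPOINT BRANCH...
fstab_line() {
  mountpoint=$1; shift
  # Join the remaining arguments with ':' to form the branch list.
  branches=$(IFS=:; printf '%s' "$*")
  # Same options as the manual mount command in this article.
  opts=defaults,allow_other,use_ino,nonempty,fsname=MergerFS
  # Assumption: fuse.mergerfs is the type expected by your version.
  printf '%s %s fuse.mergerfs %s 0 0\n' "$branches" "$mountpoint" "$opts"
}

fstab_line /media/virt /media/diskb /media/diskc /media/diskd /media/diske
```

After reviewing the printed line, you could add it to /etc/fstab by hand. The underlying disks need their own fstab entries as well, so that they are mounted before the pool.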

The Author

Adam Dix is a mechanical engineer and Linux enthusiast posing as an English teacher after playing around a bit in sales and marketing. You can check out some of his Linux work at the EdUBudgie Linux website.
