Run Samba in clustered mode with Ceph
Double Sure
Fail-safe operation is a major topic for file server admins. Thanks to CTDB and Ceph, you can put Samba in a cluster with minimal complications.
The popularity of Samba means file server admins have to think about how they can protect the service against loss. Samba is now mature and runs without any problems in most cases, but if the server on which Samba is running crashes, the service is no longer available.
The Samba developers are aware of the need for some fault tolerance and have responded to the problem with a genuine cluster option. Samba's cluster mode means you can use several Samba servers to process incoming requests. A single Samba server crash will not stop the show because other servers in the cluster will keep working.
Configuring Samba's cluster mode is not entirely intuitive, especially considering that the Samba cluster implementation has changed radically several times in the past few years. This article offers a quick look at high availability with Samba.
The Challenge
Why is a Samba cluster such a challenge? A little excursion into the world of storage theory will offer some answers. In particular, the issue of locking is very important. How does the application handle concurrent access to the same file? "Application," in this case, can mean a simple filesystem on a disk or a complex application. In any case, just imagine the chaos if two clients simultaneously access the same file and change parts of it. The file would end up corrupted, and neither client A nor client B could do anything with the contents.
Various filesystems have tried practically every conceivable solution for file locking: Older filesystems rigorously deny access to a file if it is already open. Modern filesystems follow the principle that the last write wins and determines the contents of the file.
Because Samba offers a network filesystem, it also has internal locking functions. Samba uses the TDB (Trivial Database) database format for storing internal metadata. One of the most important databases is locking.tdb, which tracks which client is currently accessing which file.
Samba relies on opportunistic locking, which means a client tells the server that it has claimed exclusive access rights to a file on the Samba share for itself. Once the Samba server has complied with the request, it writes a corresponding note to locking.tdb and stops other clients from accessing the same file.
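The bookkeeping described above can be pictured with a small Python sketch. It is a toy model only: the class name LockTable, its methods, and the client and path names are hypothetical, and nothing here reflects the real on-disk format of locking.tdb or Samba's internal code.

```python
# Toy model of a lock table; NOT Samba's implementation or the TDB format.
class LockTable:
    """Maps a file path to the client that currently holds exclusive access."""

    def __init__(self):
        self._holders = {}

    def acquire(self, path, client):
        """Grant access if the file is free or already held by this client."""
        holder = self._holders.get(path)
        if holder is None or holder == client:
            self._holders[path] = client
            return True
        return False

    def release(self, path, client):
        """Remove the entry, but only if this client is the current holder."""
        if self._holders.get(path) == client:
            del self._holders[path]
            return True
        return False

    def holder(self, path):
        """Return the client holding the file, or None if it is free."""
        return self._holders.get(path)


# Hypothetical clients and path, for illustration only.
table = LockTable()
granted_a = table.acquire("/share/report.odt", "client-a")
granted_b = table.acquire("/share/report.odt", "client-b")
```

In this sketch, the second request is refused for as long as the first client's entry exists, which is the behavior the note in locking.tdb is meant to enforce.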
As long as the process is limited to a single instance of Samba, everything works fine: The single Samba server can reliably assume that its version of locking.tdb is authoritative.
But a clustered configuration adds a challenge: Multiple Samba instances need to sync the contents of their locking.tdb files with each other. The cluster must therefore have some means for managing client access to files on the Samba volume.
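Why unsynchronized lock tables are a problem can be shown in a few lines of Python. This is a deliberately simplified sketch: the dictionaries stand in for each node's lock database, the function try_lock and all client and path names are hypothetical, and the single shared dictionary merely represents "the nodes agree on one view", not how any real cluster database works internally.

```python
def try_lock(table, path, client):
    """Record client as holder of path unless another client already holds it."""
    return table.setdefault(path, client) == client


# Case 1: two nodes, each trusting only its own private lock table.
alice_table, bob_table = {}, {}
private_results = (
    try_lock(alice_table, "/share/plan.ods", "client-a"),
    try_lock(bob_table, "/share/plan.ods", "client-b"),
)

# Case 2: both nodes consult the same, consistent view of the lock state.
shared_table = {}
shared_results = (
    try_lock(shared_table, "/share/plan.ods", "client-a"),
    try_lock(shared_table, "/share/plan.ods", "client-b"),
)
```

With private tables, both clients are granted exclusive access to the same file, because neither node knows about the other's entry. With a consistent view, the second request is refused.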
The solution for this problem, say the Samba developers, is CTDB (Clustered Trivial Database), an extension of TDB that lets many instances of Samba dynamically share TDB content.
Requirements for Clustered Samba
A few years ago, the option for a cluster file server was some form of clustered filesystem: solutions such as GFS or OCFS2 (Oracle Cluster Filesystem 2) could manage cluster-wide access to the same filesystem in a NAS share connected via iSCSI. But solutions of this sort required a cluster manager, preferably Pacemaker, and configuring and managing Pacemaker can be a very complicated task – especially when you are using it with GFS or OCFS2.
Luckily, distributed storage solutions have led to a simpler approach. Distributed storage tools such as GlusterFS and Ceph work differently: A large filesystem comprises many small segments on the participating servers, and consistency issues are addressed internally. Access occurs through designated, independent mechanisms via simple interfaces. In truth, distributed storage is no less complex than Pacemaker with OCFS2, but it does a better job of hiding the complexity. The barrier to entry is thus lower.
Two rival distributed storage solutions dominate the market, and both are sponsored by Red Hat: On one hand, GlusterFS offers a classic distributed filesystem; on the other, Ceph is an object store that can offer its contents in the form of a POSIX-compatible filesystem, CephFS. CephFS was stuck at the beta stage for several years, but the latest version of Ceph, "Jewel," promises a higher level of maturity: According to the developers, CephFS is now suitable for production use.
The Ceph example that follows uses three servers: Alice, Bob, and Charlie. Each of these servers has a hard drive that it contributes to the Ceph object store. Although the performance benefits of Ceph are best realized when the cluster runs on real hardware, you can easily emulate this configuration on virtual machines if you only want to try things out.
Even the most attractive Samba cluster will be no help if you ignore fundamental rules of high availability (HA). Basically, an HA cluster with Samba faces the same challenges that all other services on a server need to take on: Clustering at the software level only checks one box on the list. The loss of infrastructure that is not controlled by Samba can still trip Samba up.
Network and power are the two classic infrastructure issues you'll need to address: Several Samba servers in a combined cluster are good, but if they are all connected to the same electrical circuit and the circuit fails, all of the servers are dead. The problem is the same for Ethernet: If all nodes in the cluster are connected to the same switch and it fails, the Samba service is still running, but its clients can no longer reach it.
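A quick way to spot this kind of shared dependency is to write down which circuit and switch each node uses and look for components that every node relies on. The following Python sketch does exactly that; the node names come from the example cluster, whereas the circuit and switch labels, the inventory layout, and the function name are hypothetical.

```python
from collections import defaultdict

# Hypothetical inventory: which circuit and switch each node depends on.
nodes = {
    "alice": {"circuit": "circuit-1", "switch": "switch-1"},
    "bob": {"circuit": "circuit-1", "switch": "switch-2"},
    "charlie": {"circuit": "circuit-1", "switch": "switch-1"},
}


def single_points_of_failure(inventory):
    """Return (kind, name) pairs for components that every node depends on."""
    users = defaultdict(set)
    for node, deps in inventory.items():
        for kind, name in deps.items():
            users[(kind, name)].add(node)
    all_nodes = set(inventory)
    return sorted(key for key, members in users.items() if members == all_nodes)


spofs = single_points_of_failure(nodes)
for kind, name in spofs:
    print(f"All nodes share {kind} {name}")
```

In this made-up inventory, the switches are spread across the nodes, but all three servers hang off the same circuit, so a single power failure would still take down the whole cluster.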
Creating the Necessary Infrastructure
The degree of redundancy depends on the budget for the project. Redundancy at the power and network levels can cause significant additional costs, because you'll need to duplicate many components. Admins face a compromise: The more parts you make redundant, the lower the risk of failure, but the setup is more expensive.