Run Samba in clustered mode with Ceph
Double Sure
High availability is a major topic for file server admins. Thanks to CTDB and Ceph, you can run Samba in a cluster with minimal complications.
The popularity of Samba means file server admins have to think about how they can protect the service against outages. Samba is now mature and runs without any problems in most cases, but if the server on which Samba is running crashes, the service is no longer available.
The Samba developers are aware of the need for some fault tolerance and have responded to the problem with a genuine cluster option. Samba's cluster mode means you can use several Samba servers to process incoming requests. A single Samba server crash will not stop the show because other servers in the cluster will keep working.
Configuring Samba's cluster mode is not entirely intuitive, especially considering that the Samba cluster implementation has changed radically several times in the past few years. This article offers a quick look at high availability with Samba.
The Challenge
Why is a Samba cluster such a challenge? A little excursion into the world of storage theory will offer some answers. In particular, the issue of locking is very important. How does the application handle concurrent access to the same file? "Application," in this case, can mean a simple filesystem on a disk or a complex application. In any case, just imagine the chaos if two clients simultaneously access the same file and change parts of it. The file would end up corrupted, and neither client A nor client B could do anything with the contents.
Various filesystems have tried practically every conceivable solution for file locking: Older filesystems rigorously deny access to a file if it is already open. Modern filesystems follow the principle that the last write wins and determines the contents of the file.
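The "deny access while the file is in use" strategy can be illustrated on a single Unix machine with advisory locks. The following is a minimal sketch that uses Python's standard fcntl module; it is only an analogy for the idea, not a description of how any particular filesystem or Samba implements locking, and the names client_a and client_b are made up for the illustration.

```python
import fcntl
import os
import tempfile


def try_exclusive_lock(handle):
    """Try to take an exclusive advisory lock without blocking; report success."""
    try:
        fcntl.flock(handle, fcntl.LOCK_EX | fcntl.LOCK_NB)
        return True
    except BlockingIOError:
        return False


# A throwaway file stands in for a document that two clients want to edit.
fd, path = tempfile.mkstemp()
os.close(fd)

# Two independent open() calls behave like two separate clients.
with open(path, "r+") as client_a, open(path, "r+") as client_b:
    a_locked = try_exclusive_lock(client_a)  # first opener gets the lock
    b_locked = try_exclusive_lock(client_b)  # second opener is refused
    fcntl.flock(client_a, fcntl.LOCK_UN)     # client A releases the lock
    b_retry = try_exclusive_lock(client_b)   # now client B succeeds

os.remove(path)
print(a_locked, b_locked, b_retry)
```

Note that flock locks are advisory: they only protect against programs that also ask for the lock, which is one reason a file server needs its own bookkeeping on top.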
Because Samba offers a network filesystem, it also has internal locking functions. Samba uses the TDB (Trivial Database) database format for storing internal metadata. One of the most important databases is locking.tdb, which tracks which client is currently accessing which file.
Samba relies on opportunistic locking, which means a client tells the server that it has claimed exclusive access rights to a file on the Samba share for itself. Once the Samba server has complied with the request, it writes a corresponding note to locking.tdb and stops other clients from accessing the same file.
As long as the process is limited to a single instance of Samba, everything works fine: The single Samba server can reliably assume that its version of locking.tdb is authoritative.
But a clustered configuration adds a challenge: Multiple Samba instances need to sync the contents of their locking.tdb files with each other. The cluster must therefore have some means for managing client access to files on the Samba volume.
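A toy model makes the problem concrete. In the following sketch, a plain Python dictionary stands in for locking.tdb; the LockTable class, the server names, and the file path are all hypothetical, and real TDB files are far more elaborate. The point is only to compare two servers with private lock tables against two servers that consult the same table.

```python
class LockTable:
    """Toy stand-in for locking.tdb: maps a file path to the client holding it."""

    def __init__(self):
        self.holders = {}

    def request_exclusive(self, path, client):
        """Grant the lock if the file is free or already held by this client."""
        holder = self.holders.setdefault(path, client)
        return holder == client


def grants(tables_by_server, requests):
    """Send each (server, client, path) request to that server's lock table."""
    return [tables_by_server[server].request_exclusive(path, client)
            for server, client, path in requests]


# Two clients ask two different servers for the same file.
requests = [("server1", "client_a", "/share/report.odt"),
            ("server2", "client_b", "/share/report.odt")]

# Case 1: each server keeps its own private table (no synchronization).
private = {"server1": LockTable(), "server2": LockTable()}
unsynced = grants(private, requests)

# Case 2: both servers consult one shared table.
shared_table = LockTable()
shared = {"server1": shared_table, "server2": shared_table}
synced = grants(shared, requests)

print(unsynced)  # [True, True]: both clients believe they hold the file
print(synced)    # [True, False]: the second request is refused
```

With private tables, both clients are granted "exclusive" access to the same file, which is exactly the corruption scenario described earlier. Only a table that all servers agree on prevents it.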
The solution for this problem, say the Samba developers, is CTDB (Clustered Trivial Database), an extension of TDB that lets many instances of Samba dynamically share TDB content.
Requirements for Clustered Samba
A few years ago, the usual option for a clustered file server was some form of cluster filesystem: Solutions such as GFS or OCFS2 (Oracle Cluster Filesystem 2) could manage cluster-wide access to the same filesystem on shared storage connected via iSCSI. But solutions of this sort required a cluster manager, preferably Pacemaker, and configuring and managing Pacemaker can be a very complicated task – especially when you are using it with GFS or OCFS2.
Luckily, distributed storage solutions have led to a simpler approach. Distributed storage tools such as GlusterFS and Ceph work differently: A large filesystem comprises many small segments on the participating servers, and consistency issues are addressed internally. Access occurs through designated, independent mechanisms via simple interfaces. In truth, distributed storage is no less complex than Pacemaker with OCFS2, but it does a better job of hiding the complexity. The barrier to entry is thus lower.
Two rival distributed storage solutions dominate the market, and both are sponsored by Red Hat: On one hand, GlusterFS offers a classic distributed filesystem; on the other, Ceph is an object store that can offer its contents in the form of a POSIX-compatible filesystem, CephFS. CephFS was stuck at the beta stage for several years, but the latest Ceph release, "Jewel," promises a higher level of maturity: According to the developers, CephFS is now suitable for production use.
The following Ceph example uses three servers: Alice, Bob, and Charlie. Each of these servers has a hard drive that it contributes to the Ceph object store. Although the performance benefits of Ceph are best realized when the cluster runs on real hardware, you can easily emulate this configuration on virtual machines if you only want to try things out.
Even the most attractive Samba cluster will be no help if you ignore fundamental rules of high availability (HA). Basically, an HA cluster with Samba faces the same challenges that all other services on a server need to take on: Clustering at the software level only checks one box on the list. The loss of infrastructure that is not controlled by Samba can still trip Samba up.
Network and power are the two classic infrastructure issues you'll need to address: Several Samba servers in a combined cluster are good, but if they are all connected to the same electrical circuit and the circuit fails, all of the servers are dead. The problem is the same for Ethernet: If all nodes in the cluster are connected to the same switch and it fails, the Samba service is still running, but its clients can no longer reach it.
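A quick inventory check can reveal this kind of shared dependency before it bites. The following sketch is purely illustrative: The circuit and switch names are invented, and the model is deliberately simplistic in that it only flags a component on which every node depends.

```python
from collections import Counter


def single_points_of_failure(nodes):
    """Return (kind, name) pairs for every component that all nodes depend on."""
    findings = []
    for kind in ("circuit", "switch"):
        usage = Counter(node[kind] for node in nodes.values())
        for name, count in usage.items():
            if count == len(nodes):
                findings.append((kind, name))
    return findings


# Hypothetical inventory: which circuit and switch each server is attached to.
cluster = {
    "alice":   {"circuit": "A", "switch": "sw1"},
    "bob":     {"circuit": "B", "switch": "sw1"},
    "charlie": {"circuit": "A", "switch": "sw1"},
}

spofs = single_points_of_failure(cluster)
print(spofs)  # [('switch', 'sw1')]
```

In this made-up inventory, power is spread across two circuits, but every server hangs off the same switch, so the switch is the component to duplicate first.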
Creating the Necessary Infrastructure
The degree of redundancy depends on the budget for the project. Redundancy at the power and network levels can cause significant additional costs, because you'll need to duplicate many components. Admins face a compromise: The more parts you make redundant, the lower the risk of failure, but the more expensive the setup.