Staying in sync with a network filesystem

Connections

© Dmitry Sunagatov, Fotolia


Article from Issue 99/2009

Tired of copying and recopying files from your laptop to the office file server? Maybe you need an automated offline filesystem, such as OFS.

Users routinely copy documents to their laptops, edit the files from the road, and save the changes centrally when they get back to the office.

Unfortunately, it is all too easy to lose files, overwrite changes, or forget which version is the most recent. Windows provides an offline file storage option to address this problem, and several alternative tools are also available (see the "Similar Approaches" box).

Now a new project brings the offline storage option to Linux: OFS, the offline filesystem [1]. (Because it began at the Georg-Simon-Ohm University in Nuremberg, Germany, OFS also stands for Ohm Filesystem.)

Similar Approaches

Windows Offline Files: The motivation for the OFS project was the desire to create a Linux counterpart to Windows offline files. Microsoft introduced the feature to its operating systems in Windows 2000, and it is integrated with the Synchronization Center in Vista. Windows offline files makes files stored on an SMB share available offline. Users can access the share even if their computer doesn't have a connection to the server. Once the share is available again, Windows will synchronize automatically or at the user's request [5].

Coda: As early as the 1980s, a team at Carnegie Mellon University (CMU) started to create a network filesystem designed to make lost network connections invisible to users. Coda, which was intended as the successor to the Andrew File System (AFS), can bridge short-term network failures, letting users work without a network connection. The Coda programmers developed some excellent ideas, although the filesystem is not totally mature [6].

Intermezzo: Intermezzo, which is also maintained by CMU, is similar to Coda but with a far simpler design. Whereas Coda only accesses the cache if it does not have a connection, Intermezzo always uses the cache. The filesystem synchronizes the client cache with the server content by periodically polling the server or following a server request. The Intermezzo filesystem is no longer under active development; in fact, it was dropped from Linux kernel version 2.6, and it is not suitable for production use in most cases.

Version management: Offline filesystems cannot replace a full version management system like Subversion [7]. Version management systems provide a more extensive range of revision control features, but they are more complex and not as convenient as simple offline storage tools.

Synchronization Tools: Unison [8] and other products synchronize the contents of local and remote directories; however, users have to handle synchronizing their copies with the server content, even if they have an active connection.

OFS

To start, simply select the directories you need, and then the offline filesystem copies the contents of these directories to a cache on your local disk. Even if you don't have a connection to the server, you will still appear to be working on the network filesystem. In reality, you will be working with the copies in the cache. Paths stay the same whether or not you have a connection to the network.

When the connection becomes available again, the offline filesystem automatically launches a reintegration session to write the changes out to the server.

OFS is not a network filesystem in the true sense of the word. In fact, it is a layer between the network filesystems (for example, NFS or Samba) and a user view, which means that you can combine OFS with any filesystem.

How It Works

OFS code runs completely in userspace and relies on FUSE (Filesystem in Userspace [2]). FUSE provides an interface the kernel uses to forward file access to userspace programs.

The FUSE interface mainly consists of a kernel module, fuse.ko, and the Libfuse library, both of which are included with any recent Linux distribution. The kernel's VFS (Virtual Filesystem Switch) accepts the FUSE kernel module as a filesystem. FUSE forwards calls to the OFS daemon in userspace via the /dev/fuse device file.

To receive data, OFS relies on Libfuse and the C++ bindings by the Fusexx project [3] (Figure 1). Each action that affects a file or directory on an OFS filesystem triggers a function call through FUSE to the OFS daemon.

In addition to the OFS daemon shown in Figure 1, the OFS project also provides a mount helper and a file browser plugin (Figure 2). To communicate, the file browser plugin, which acts as a user interface to the OFS daemon, uses D-Bus [4]. This design makes it fairly simple to extend OFS by adding more file browser plugins, no matter which desktop you need to support.

The mount helper, mount.ofs, mounts the remote filesystem, for example, a Samba share, and launches the OFS daemon. The OFS daemon resides in userspace and accesses the remote filesystem. As with any normal application, OFS uses the standard filesystem API for this access.

To allow this to happen, the mount helper does not mount the remote filesystem at the location the admin specified in the call, but under /var/ofs/remote/URL_hash.

The original mount point is controlled by FUSE. Thus, OFS is completely independent of the remote filesystem and will work with any implementation supported by the Linux kernel.

The OFS daemon's internal structure is service oriented, making it easily extensible. In the simplest case, the daemon passes all requests on to the remote filesystem itself. However, it is also responsible for caching and maintaining all persistent data.

When a user enables caching – for the whole filesystem, or for individual directories – OFS creates a copy of any file that is opened in the cache. When a user works with the file, the OFS daemon writes the changes, both to the cache copy and to the remote file on the server if it is available. See Figure 3 for the individual steps for opening, reading, and writing a file.
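The open and write behavior described above can be illustrated with a minimal Python sketch. It is not the OFS implementation, which is written in C++ on top of FUSE; the function names open_cached and write_through and the online flag are hypothetical and exist only for this illustration.

```python
import shutil
from pathlib import Path


def open_cached(rel_path: str, remote_dir: Path, cache_dir: Path) -> Path:
    """Copy a remote file into the cache when it is opened (hypothetical helper)."""
    remote_file = remote_dir / rel_path
    cache_file = cache_dir / rel_path
    # Make sure the parent directories exist in the cache before copying.
    cache_file.parent.mkdir(parents=True, exist_ok=True)
    if remote_file.exists():
        shutil.copy2(remote_file, cache_file)
    return cache_file


def write_through(rel_path: str, data: bytes, remote_dir: Path,
                  cache_dir: Path, online: bool) -> None:
    """Write to the cache copy, and also to the remote file if it is reachable."""
    cache_file = cache_dir / rel_path
    cache_file.parent.mkdir(parents=True, exist_ok=True)
    cache_file.write_bytes(data)
    if online:
        remote_file = remote_dir / rel_path
        remote_file.parent.mkdir(parents=True, exist_ok=True)
        remote_file.write_bytes(data)
```

When online is false, only the cache copy changes; bringing the server copy up to date is then left to the reintegration step, which this sketch does not model.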

Caching

In OFS, the local cache mirrors the share. When the user chooses which directories to access while on the road, OFS creates a full mirror copy of only the required components. Other offline filesystems either copy the whole share or select the files based on some form of babysitting algorithm, for example, the most frequently changed or the most recently used files.

Because the OFS daemon always creates the whole path from the root of the share to the directory for which the user needs offline access, OFS can redirect access to the cache directly in case of connection loss, without the need to emulate missing parent directories.
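To make the idea of creating the whole path concrete, the following Python sketch lists every directory between the share root and an offline directory and then creates that chain inside a cache directory. The names path_chain and mirror_path are assumptions made for this example and do not come from the OFS source.

```python
from pathlib import Path, PurePosixPath


def path_chain(offline_dir: str) -> list[str]:
    """Return every directory from the share root down to offline_dir."""
    parts = PurePosixPath(offline_dir.strip("/")).parts
    return ["/".join(parts[:i]) for i in range(1, len(parts) + 1)]


def mirror_path(cache_dir: Path, offline_dir: str) -> Path:
    """Create the full chain of parent directories inside the cache."""
    target = cache_dir
    for rel in path_chain(offline_dir):
        target = cache_dir / rel
        target.mkdir(parents=True, exist_ok=True)
    return target
```

For the offline directory projects/2009/report, path_chain yields projects, projects/2009, and projects/2009/report, so a lookup of any parent directory succeeds in the cache even though only the last one was selected for offline use.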

Internally, OFS supports three directory trees: a local cache (Cache-Dir), a remote share directory (Remote-Dir, where OFS mounts the share), and the user's working directory (Working-Dir). The complete cache and the mount point for the remote share are local directories created by OFS itself.

Entries in the OFS daemon's configuration file decide where OFS creates the directories; the default location is /var/ofs. The directory name is a hash of the remote share's URL. This approach avoids conflicts and ensures that the admin can easily copy or move the internal directories.
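A short Python sketch shows how directory names could be derived from a share URL. The article does not say which hash function OFS uses, so SHA-1 is an assumption here, as are the function name share_dirs and the cache subdirectory; only the /var/ofs default and the remote subdirectory appear in the text.

```python
import hashlib
from pathlib import Path


def share_dirs(url: str, base: Path = Path("/var/ofs")) -> dict[str, Path]:
    """Map a share URL to per-share directories named after a hash of the URL."""
    # SHA-1 is an assumption; the actual OFS hash function is not documented here.
    digest = hashlib.sha1(url.encode("utf-8")).hexdigest()
    return {
        "remote": base / "remote" / digest,  # where the share gets mounted
        "cache": base / "cache" / digest,    # hypothetical cache location
    }
```

Because the name depends only on the URL, two different shares never collide, and the same share always maps to the same directories, no matter where the user mounts it.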

Practical Applications

To create an OFS-managed mount point, or – to put this a different way – to mount a share via OFS, use the following command:

mount -t ofs type://server/share /mountpoint

The command

mount -t ofs smb://fileserver.comp/data /network/data

mounts the SMB share data on the server fileserver.comp in the local /network/data directory. The mount.ofs mount helper uses FUSE to let the OFS daemon intercept calls to /network/data. At the same time, the mount helper breaks down the URL and executes an additional mount command to mount the remote directory:

mount -t smb //fileserver.comp/data /var/ofs/remote/URL_hash
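The way the mount helper breaks down the URL can be sketched in Python. This is an illustration only: the function name remote_mount_command is hypothetical, and the //server/share source form matches the SMB example above but would differ for other filesystem types such as NFS.

```python
from urllib.parse import urlsplit


def remote_mount_command(url: str, remote_dir: str) -> list[str]:
    """Turn type://server/share into the argument list for the second mount call."""
    parts = urlsplit(url)
    if not parts.scheme or not parts.netloc:
        raise ValueError(f"expected type://server/share, got {url!r}")
    # SMB-style source; other filesystem types use a different source syntax.
    source = f"//{parts.netloc}{parts.path}"
    return ["mount", "-t", parts.scheme, source, remote_dir]
```

Called with smb://fileserver.comp/data and /var/ofs/remote/URL_hash, the function returns the same arguments as the mount command shown above.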

Each OFS-managed mount point has a state that indicates whether or not the share is available and directs access to either the share itself or the cache. When the network cable is removed, Linux uses D-Bus to trigger an event; OFS receives this event and sets the filesystem status to offline.
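The per-mount-point state can be pictured as a small switch that decides which directory tree serves a request. The Python sketch below is a simplified model under that assumption; the class and method names are hypothetical, and the D-Bus event is reduced to a plain method call.

```python
from enum import Enum
from pathlib import Path


class State(Enum):
    ONLINE = "online"
    OFFLINE = "offline"


class MountPoint:
    """Simplified model of an OFS-managed mount point (hypothetical names)."""

    def __init__(self, remote_dir: Path, cache_dir: Path) -> None:
        self.remote_dir = remote_dir
        self.cache_dir = cache_dir
        self.state = State.ONLINE

    def on_network_event(self, link_up: bool) -> None:
        # Stands in for the event OFS receives over D-Bus.
        self.state = State.ONLINE if link_up else State.OFFLINE

    def resolve(self, rel_path: str) -> Path:
        # The path the user sees stays the same; only the backing tree changes.
        base = self.remote_dir if self.state is State.ONLINE else self.cache_dir
        return base / rel_path
```

After on_network_event(False), resolve returns paths below the cache directory; once the link comes back and the state returns to online, the same relative path resolves to the mounted share again.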

In the future, the developers would like to integrate more features, such as a timer to identify server problems or a polling component to detect the current online state.
