The Kosmos distributed FS

Launching KFS

The next step distributes the binary files to the meta- and chunk servers. A Python script in the ~/kfs-0.1.1/scripts directory takes care of this, creating a customized program package for each server and then securing the installation with SSH.

To allow this to happen, all of your servers should run the same Linux environment, or at least the distributions should not be wildly different. Configuring SSH with keypairs removes the need to keep entering multiple passwords.


The only thing missing now is the configuration file that tells the script which computers on the network will be handling which task. Listing 1 shows a sample configuration file.

Listing 1

Kosmos FS Sample Configuration


The file has a separate section for each server involved, headed by the server name in square brackets. The minimal requirement is a [metaserver] section.

Following is a section for each chunk server, which typically takes the form of [chunkserver1] through [chunkserverN]. The KFS cluster in this example comprises a metaserver and two cluster servers. Each section contains the settings for one server.

node: is followed by the name of the IP address for the server. rundir: is followed by the directory in which the binaries will be stored (in the example in Listing 1, this is the home directory for the tim user account on each server). The baseport: keyword specifies the TCP port that the server will use to communicate with the other nodes.

The computer names do not need to be different. In fact, Kosmos FS will let you run all the servers on a single machine – and this can be localhost – but in cases like this, you must assign unique TCP ports to your metaservers and cluster servers.

Each chunk server has a space: option that specifies how much disk space the server will use to save data. In the example here, the first chunk server provides 30GB, the second slightly less, 18,000MB. Sample configuration files are available in the conf directory below the source code archive.

Command Center

Now that the configuration file is complete, the next step is to change directory to scripts and enable the following:

python -f configuration_file.cfg -b ../build/bin

Thanks to the configuration file, all the servers and SSH can be launched centrally from the current machine:

python -f configuration_file.cfg --start

The following call shuts the system down:

python -f configuration_file<.cfg --stop

Specifying the configuration file is important and lets users manage different KFS clusters from a single console.

Now that the servers are running, users can start moving data onto the enormous new filesystem using either the special KFS Shell (see the box titled "Toolbox" for more details) or via the API. A simple example of a C++ program that stores its data in KFS is given in Listing 2.

Listing 2

Creating a File


Unfortunately, the header files are hidden away in the depths of the source code archive in src/cc. This also applies to the libraries, which are located in build/lib:

++ test.cpp -I ~/kfs-0.1.1/src/cc -L ~/kfs-0.1.1/build/lib/ -lkfsClient -lkfsIO -lkfsCommon

Before calling the results, LD_LIBRARY_PATH has to be set:

export LD_LIBRARY_PATH=~/kfs-0.1.1/build

To save the linker the trouble of searching for the dynamic libraries, you can link your own programs with the static variant, which is located in ~kfs-0.1.1/build/lib-static.

To handle huge volumes of data, a KFS application simply opens a new file via the client library.

Buy this article as PDF

Express-Checkout as PDF
Price $2.95
(incl. VAT)

Buy Linux Magazine

Get it on Google Play

US / Canada

Get it on Google Play

UK / Australia

Related content

  • RAID Performance

    You can improve performance up to 20% by using the right parameters when you configure the filesystems on your RAID devices.

  • Progress by Installments

    Desktop applications, websites, and even command-line tools routinely display progress bars to keep impatient users patient during time-consuming actions. Mike Schilli shows several programming approaches for handwritten tools.

  • Partition Backup

    A partition backup offers several advantages over legacy, file-based backup alternatives, and using a backup server adds even more convenience. We’ll show you some free tools for partition backup over the network.

  • Ready to Rumble

    A Go program writes a downloaded ISO file to a bootable USB stick. To prevent it from accidentally overwriting the hard disk, Mike Schilli provides it with a user interface and security checks.

  • Ask Klaus!
comments powered by Disqus