The Kosmos distributed FS
Launching KFS
The next step distributes the binary files to the meta- and chunk servers. A Python script in the ~/kfs-0.1.1/scripts directory takes care of this, creating a customized program package for each server and then installing it over SSH.
To allow this to happen, all of your servers should run the same Linux environment, or at least distributions that are not wildly different. Configuring SSH with keypairs removes the need to keep entering a password for each server.
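One common way to set this up is to generate a keypair on the control machine and then copy the public key to each server (the tim account and the chunkserver1 host name are placeholders):

ssh-keygen -t rsa
ssh-copy-id tim@chunkserver1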
Topology
The only thing missing now is the configuration file that tells the script which computers on the network will be handling which task. Listing 1 shows a sample configuration file.
Listing 1
Kosmos FS Sample Configuration
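The exact syntax is best checked against the sample files shipped in the source archive; a configuration of the kind described here might look like this (host names, directories, and port numbers are examples):

[metaserver]
node: meta.example.com
rundir: /home/tim/kfs
baseport: 20000

[chunkserver1]
node: chunk1.example.com
rundir: /home/tim/kfs
baseport: 30000
space: 30 G

[chunkserver2]
node: chunk2.example.com
rundir: /home/tim/kfs
baseport: 30000
space: 18000 M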
The file has a separate section for each server involved, headed by the server name in square brackets. The minimal requirement is a [metaserver] section.
Following this is a section for each chunk server, typically named [chunkserver1] through [chunkserverN]. The KFS cluster in this example comprises a metaserver and two chunk servers. Each section contains the settings for one server.
node: is followed by the host name or IP address of the server. rundir: is followed by the directory in which the binaries will be stored (in the example in Listing 1, this is the home directory of the tim user account on each server). The baseport: keyword specifies the TCP port that the server will use to communicate with the other nodes.
The computer names do not need to be different. In fact, Kosmos FS will let you run all the servers on a single machine – and this can be localhost – but in cases like this, you must assign unique TCP ports to the metaserver and chunk servers.
Each chunk server has a space: option that specifies how much disk space the server will use to store data. In the example here, the first chunk server provides 30GB; the second slightly less, at 18,000MB. Sample configuration files are available in the conf directory of the source code archive.
Command Center
Now that the configuration file is complete, the next step is to change to the scripts directory and run the following command:
python kfssetup.py -f configuration_file.cfg -b ../build/bin
Thanks to the configuration file and SSH, all the servers can then be launched centrally from the current machine:
python kfslaunch.py -f configuration_file.cfg --start
The following call shuts the system down:
python kfslaunch.py -f configuration_file.cfg --stop
Specifying the configuration file each time may seem redundant, but it lets users manage several different KFS clusters from a single console.
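For example, two independent clusters could be started from the same console simply by pointing the script at different configuration files (the file names here are placeholders):

python kfslaunch.py -f cluster_a.cfg --start
python kfslaunch.py -f cluster_b.cfg --start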
Now that the servers are running, users can start moving data onto the enormous new filesystem using either the special KFS Shell (see the box titled "Toolbox" for more details) or via the API. A simple example of a C++ program that stores its data in KFS is given in Listing 2.
Listing 2
Creating a File
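The following sketch shows what such a program can look like. It assumes the KfsClient interface declared in src/cc/libkfsClient/KfsClient.h; the metaserver host and port are placeholders, and the exact method names may differ between KFS versions:

#include <iostream>
#include <string>
#include "libkfsClient/KfsClient.h"

using namespace KFS;

int main()
{
    // Connect to the metaserver; host and port must match the
    // values from the configuration file (placeholders here)
    KfsClient *client = KfsClient::Instance();
    client->Init("meta.example.com", 20000);
    if (!client->IsInitialized()) {
        std::cerr << "Unable to reach the metaserver" << std::endl;
        return 1;
    }

    // Create a new file in KFS and write a short message to it
    int fd = client->Create("/hello.txt");
    if (fd < 0) {
        std::cerr << "Could not create the file" << std::endl;
        return 1;
    }
    std::string msg = "Hello, Kosmos FS!";
    client->Write(fd, msg.c_str(), msg.size());
    client->Close(fd);
    return 0;
}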
Unfortunately, the header files are hidden away in the depths of the source code archive in src/cc. This also applies to the libraries, which are located in build/lib:
g++ test.cpp -I ~/kfs-0.1.1/src/cc -L ~/kfs-0.1.1/build/lib/ -lkfsClient -lkfsIO -lkfsCommon
Before running the resulting binary, LD_LIBRARY_PATH has to be set:
export LD_LIBRARY_PATH=~/kfs-0.1.1/build/lib
To save the runtime linker the trouble of searching for the dynamic libraries, you can link your own programs against the static variants, which are located in ~/kfs-0.1.1/build/lib-static.
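Assuming the static archives use the same library names, the compiler call then changes only in the library path:

g++ test.cpp -I ~/kfs-0.1.1/src/cc -L ~/kfs-0.1.1/build/lib-static -lkfsClient -lkfsIO -lkfsCommon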
To handle huge volumes of data, a KFS application simply opens a new file via the client library.