The Kosmos distributed FS

Buffers

First, the library buffers incoming write operations. It pushes the data to the chunk servers only when the cache memory reserved for this purpose fills up or when the application issues a flush command.
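The buffering behavior can be illustrated with a minimal Python sketch. It is not the KFS client code: the class name `WriteBuffer`, the `capacity` parameter, and the `push` callback are hypothetical stand-ins for the client library's reserved cache memory and its transfer to the chunk servers.

```python
class WriteBuffer:
    """Collects writes in memory and pushes them out in one batch."""

    def __init__(self, capacity, push):
        self.capacity = capacity   # bytes reserved for buffering (hypothetical parameter)
        self.push = push           # callable standing in for "send to the chunk servers"
        self.pending = bytearray()

    def write(self, data):
        self.pending.extend(data)
        # A full buffer triggers a push without the application asking for one.
        if len(self.pending) >= self.capacity:
            self.flush()

    def flush(self):
        # An explicit flush pushes whatever has accumulated so far.
        if self.pending:
            self.push(bytes(self.pending))
            self.pending.clear()


sent = []                                  # records every batch that was pushed
buf = WriteBuffer(capacity=8, push=sent.append)
buf.write(b"abc")                          # stays in the buffer
buf.write(b"defgh")                        # buffer reaches 8 bytes and is pushed
buf.write(b"xy")                           # stays in the buffer
buf.flush()                                # application-issued flush
```

In this sketch, four writes and flushes result in only two transfers, which is the point of buffering: fewer, larger requests to the chunk servers.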

Immediately after the data arrive, they become available for further operations.

Besides the outgoing data, the client library also caches any metadata it has requested for 30 seconds. This avoids unnecessary, repeated contact with the metaserver.
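A time-limited metadata cache of this kind might look roughly like the following Python sketch. Only the 30-second lifetime comes from the description above; the class `MetadataCache`, the `fetch` callback, and the injectable `clock` are assumptions made for illustration and say nothing about how KFS implements it internally.

```python
import time


class MetadataCache:
    """Remembers metadata lookups for a limited time (30 seconds by default)."""

    def __init__(self, fetch, ttl=30.0, clock=time.monotonic):
        self.fetch = fetch     # callable standing in for a metaserver request
        self.ttl = ttl         # lifetime of a cached entry in seconds
        self.clock = clock     # injectable so the demo does not have to sleep
        self.entries = {}      # path -> (expiry time, metadata)

    def lookup(self, path):
        now = self.clock()
        hit = self.entries.get(path)
        if hit is not None and now < hit[0]:
            return hit[1]                      # still fresh: no server contact
        metadata = self.fetch(path)            # missing or expired: ask again
        self.entries[path] = (now + self.ttl, metadata)
        return metadata


calls = []                                     # records every simulated server request


def fake_fetch(path):
    calls.append(path)
    return {"path": path}


now = [0.0]                                    # fake clock, advanced by hand
cache = MetadataCache(fake_fetch, clock=lambda: now[0])
cache.lookup("/data/file")                     # first request goes to the server
now[0] = 10.0
cache.lookup("/data/file")                     # answered from the cache
now[0] = 31.0
cache.lookup("/data/file")                     # entry has expired, fetched again
```

Three lookups over 31 simulated seconds cause two server requests; the lookup at the 10-second mark is served locally.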

If a client is running on a chunk server, it retrieves the data locally rather than using up network bandwidth. If a chunk server suddenly fails during a read operation, the client library automatically switches to another chunk server. All of this is completely transparent to the application.
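The two read-side behaviors, preferring a local replica and falling back to another chunk server on failure, can be sketched in a few lines of Python. The function names, the `ChunkServerDown` exception, and the host names are hypothetical. For simplicity the sketch retries the whole read on the next replica, whereas a real client would presumably resume at the current offset.

```python
class ChunkServerDown(Exception):
    """Raised by the (simulated) transport when a chunk server does not answer."""


def order_replicas(replicas, local_host):
    # Stable sort: the local host, if it holds a replica, moves to the front.
    return sorted(replicas, key=lambda host: host != local_host)


def read_chunk(replicas, local_host, read_from):
    """Try the replicas in order and return the first successful read."""
    last_error = None
    for host in order_replicas(replicas, local_host):
        try:
            return read_from(host)
        except ChunkServerDown as err:
            last_error = err               # remember the failure, try the next replica
    raise ChunkServerDown("no replica reachable") from last_error


def fake_read(host):
    # Simulated transport: the local node fails, all others answer.
    if host == "node2":
        raise ChunkServerDown(host)
    return f"data from {host}"


result = read_chunk(["node1", "node2", "node3"], "node2", fake_read)
```

The caller only ever sees `result`; which replica delivered the data, and how many were tried first, stays hidden inside `read_chunk`.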

Conclusions

Kosmos FS is an interesting alternative to HDFS and Google FS, but it is still at an early stage of development. Currently, one weak point is the metaservers. They need to be able to deliver metadata quickly. After all, to be able to process the file, a client needs to know which node the file it requires is stored on. If the metaserver fails completely, the files on the chunk servers it manages are also unreachable.

Because the metaserver additionally handles load distribution, it is responsible for the performance of the KFS network it manages. Unfortunately, there is currently no replication plan for metadata, in contrast to the scheme used by the chunk servers. Administrators need to take care of this manually and back up the data regularly.

Another issue is the lack of access controls. Currently, users can store any data on the distributed filesystem and read any data stored there. For this reason, KFS should only be deployed in trusted environments until a more mature version is released.

Infos

  1. Kosmos filesystem: http://kosmosfs.sourceforge.net
  2. HDFS and the Hadoop project: http://lucene.apache.org/hadoop
  3. Paper on Google filesystem (GFS), on which KFS is based: http://research.google.com/pubs/papers.html
