Cloud backup with MCrypt and S3cmd

Pay for What You Use

Because you have already handed S3cmd your sacred AWS access credentials, it can also control other AWS services, including the superb "pay for what you use" CDN service called CloudFront [5]. If you're unfamiliar with Content Delivery Networks (CDNs), CloudFront is undeniably both powerful and affordable. You can use S3cmd to query, create, and modify several of CloudFront's functions.

I'll take an extremely brief look at S3cmd's integration with AWS's widely used CDN. Each instance of a CDN configuration assigned to your account is referred to as a "distribution" (probably because it efficiently distributes your data around the globe). You can display a list of your configured distributions with the command:

# s3cmd cflist

If you want a bit more detail about the configured parameters for all your distributions, use:

# s3cmd cfinfo

You can also query a specific distribution and its parameters by referencing its distributionID as follows (adding your own ID accordingly):

# s3cmd cfinfo cf://<distributionID>

You can also use cfcreate, cfdelete, and cfmodify to create, delete, and change a distribution without messing about with web interfaces.
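As a sketch of how those commands fit together (the bucket name is made up, and the --cf-default-root-object option shown here may vary between S3cmd versions, so check s3cmd --help before relying on it):

```shell
# Create a CloudFront distribution for an existing bucket
s3cmd cfcreate s3://mybucket

# Change a distribution's settings, e.g., its default root object
s3cmd cfmodify cf://<distributionID> --cf-default-root-object=index.html

# Remove a distribution you no longer need
s3cmd cfdelete cf://<distributionID>
```

Substitute your own distribution ID for the <distributionID> placeholder, as reported by cflist.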

Please, Sir, I Want Some More

The list of S3cmd features is remarkably comprehensive for such a diminutive utility. It includes a --force overwrite option, which should be used with great care, and a very useful --dry-run flag, which displays the files that would be uploaded or downloaded without actually transferring anything. If you're ever worried about breaking things horribly by getting a regex entry incorrect, you will appreciate the --dry-run feature.
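For instance, to preview a sync before committing to it (the paths and bucket name here are only examples):

```shell
# Show which files sync would transfer, without transferring anything
s3cmd sync --dry-run ~/photos/ s3://mybucket/photos/
```

Once the output looks right, simply drop the --dry-run flag and run the command again for real.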

The useful --continue option only works with downloads, but it should, in theory at least, resume a partially downloaded file so you don't have to start a big download again from scratch. It's fair to say that HTTP resumes have been around for a while, so this capability is hardly a new, earth-shattering feature, but it is still a welcome touch.

Stop me if I have already mentioned the -r parameter (otherwise known as --recursive), which works on uploads, downloads, and deletions when you want to include subdirectories. Use it with caution to avoid incurring massive data transfer fees or accidental deletions.
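A couple of recursive examples (again with made-up paths; the del command in particular deserves a careful double-check before you press Enter):

```shell
# Upload a directory tree, subdirectories and all
s3cmd put --recursive ~/projects/ s3://mybucket/projects/

# Recursively delete everything under a prefix - no undo!
s3cmd del --recursive s3://mybucket/projects/old/
```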

Reduced Cost

If you are storing large amounts of data, you might be concerned about minimizing expenses. Apparently (and please be warned that specifications, configurations, and procedures change frequently with emerging technologies, so don't take this as gospel), the standard repository for an uploaded Amazon S3 file spans three data centers. In other words, your file is copied three times across three geographically disparate buildings.

If you can live with your files being kept in only two data centers, you can opt for "Reduced Redundancy," and AWS will lower its storage fees. The theory is that these files won't be as critical to you; therefore, you might tolerate losing one or two. Perhaps you will have local backups available, or maybe the files have a limited shelf life.

The --rr switch, or --reduced-redundancy in longhand, allows you to instruct the clever S3cmd that you need to watch your costs and only use two data centers for storage.
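In practice, the flag is simply added to an upload (the encrypted archive name here is hypothetical, in keeping with the MCrypt-encrypted backups discussed earlier):

```shell
# Upload to cheaper Reduced Redundancy storage
s3cmd put --rr backup.tar.gz.enc s3://mybucket/backups/
```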
