Automate data backup at the command line
Automatic Backup
Backing up data is an unpopular task that many users – and even some administrators – consider a chore, prompting us to take a look at some command-line automatic backup programs.
Linux users have access to numerous backup tools. Administrators who like working with SSH appreciate that servers of any size and design can be backed up with command-line programs. However, the differences in terms of features are quite considerable (see Table 1 for an overview). Not every program is suitable for every application scenario. In this article, I investigate which tools work for which environments.
Table 1
Command-Line Backup Tools
| Feature | Attic | bup | Duplicity | rdiff-backup | rsnapshot |
|---|---|---|---|---|---|
| Local backup | Yes | Yes | Yes | Yes | Yes |
| Backup via SSH | Yes | Yes | Yes | Yes | Yes |
| Verification | Yes | Yes | Yes | Yes | Yes (logfile) |
| Encryption | Yes | Yes | Yes | No | No |
| Cloud services | No | No | Yes (Amazon, Rackspace) | No | No |
| Include/exclude directory | Yes | Yes | Yes | Yes | Yes |
| Time-controlled | Yes* | Yes* | Yes* | Yes* | Yes* |
| Front ends available | No | Yes | Yes | No | Yes |
| Incremental backups | Yes | Yes | Yes | Yes | No |
| Differential backups | No | No | No | No | No |
| Manual full backup | Yes | Yes | Yes | Yes | Yes |
| FUSE-mount possible | Yes | Yes | No | Yes | No |

*Backups scheduled with the cron daemon.
Server vs. Desktop
Home users often store large volumes of data on their computers, similar in volume to those found on servers in small businesses. High-definition video collections, as well as audio files with lossless compression and photo folders, consume a great deal of disk space. New data is often added, but once stored, the data hardly ever changes.
On the other hand, server systems often hold many small files, such as correspondence, spreadsheets, presentations, and databases. These data collections are constantly changing through modifications, such as newly created records or added documents. Accordingly, backup strategies must take the existing data into account to guarantee rapid reconstruction in the event of data loss.
Differential vs. Incremental
Administrators distinguish three backup strategies: full backup, differential backup, and incremental backup. The full backup, a copy of the existing data, is always the first backup in any plan – subsequent backups follow as differential or incremental backups. Whereas differential backups always save changes since the last full backup, incremental backups only save modifications relative to the last backup of any kind.
The differential backup procedure requires more space for the individual backups, but in an emergency, you need only the full backup and the most recent differential backup to recover the entire data set. Although the incremental method uses less disk space, every incremental backup has to be restored in the correct sequence during the restore, starting from the full backup. If even one small incremental backup is missing, the current state of the data can no longer be fully reconstructed.
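The difference in restore effort can be illustrated with a short shell sketch. The function name `restore_chain` and the labels `full` and `part` are invented for this illustration and do not belong to any backup tool. The function only prints the positions of the backups that a restore would need.

```shell
# restore_chain STRATEGY TYPE...
# STRATEGY: "differential" or "incremental"
# TYPE:     "full" or "part", listed oldest first
# Prints the 1-based positions of the backups needed to restore the newest state.
restore_chain() {
    strategy=$1
    shift
    total=$#
    full=0
    i=0
    # Remember the position of the most recent full backup.
    for t in "$@"; do
        i=$((i + 1))
        if [ "$t" = full ]; then
            full=$i
        fi
    done
    if [ "$full" -eq 0 ]; then
        echo "no full backup in the list" >&2
        return 1
    fi
    if [ "$strategy" = differential ]; then
        # Full backup plus the newest differential backup, if there is one.
        if [ "$total" -gt "$full" ]; then
            echo "$full $total"
        else
            echo "$full"
        fi
    else
        # Full backup plus every later backup, in order.
        chain=$full
        i=$full
        while [ "$i" -lt "$total" ]; do
            i=$((i + 1))
            chain="$chain $i"
        done
        echo "$chain"
    fi
}

restore_chain differential full part part part   # prints: 1 4
restore_chain incremental  full part part part   # prints: 1 2 3 4
```

With one full backup followed by three partial backups, the differential restore touches two backup sets, whereas the incremental restore needs all four.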
Before selecting command-line backup software, I recommend first carefully analyzing your data collection and its growth, so you do not accidentally choose a program that is unsuitable for your specific IT environment.
Differential backup strategies are more likely to be used for data sets that consist of relatively few large files with moderate regular modifications, whereas incremental backups are better suited to typical office environments. Regardless of which you choose, you should always run at least one full backup per week.
Desktop users who want to back up their own data without root privileges do not have a huge choice of command-line backup software, which obviously requires some knowledge of the command syntax. For users, it is important to be able to run a backup as smoothly and reliably as possible. The end user will only use the backup software – and actually perform the backup – if it is quick and easy to use.
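A rough calculation shows how the two strategies differ in space requirements over one week with a single full backup. The sketch below is a simplification under stated assumptions: each of the six remaining days changes a separate portion of the data of equal size, and compression and deduplication are ignored. The function name `weekly_storage` and the figures are invented for the example.

```shell
# weekly_storage FULL_GB DAILY_CHANGE_GB
# Prints "<differential total> <incremental total>" in GB for one full backup
# followed by six daily backups.
weekly_storage() {
    full=$1
    daily=$2
    days=6
    # The differential backup on day n contains all changes since the
    # full backup, that is, n times the daily change.
    diff_total=$full
    n=1
    while [ "$n" -le "$days" ]; do
        diff_total=$((diff_total + n * daily))
        n=$((n + 1))
    done
    # Each incremental backup contains only the changes of a single day.
    incr_total=$((full + days * daily))
    echo "$diff_total $incr_total"
}

weekly_storage 500 10   # prints: 710 560
```

For a 500GB full backup and 10GB of changes per day, the differential scheme occupies 500 + 10 x (1 + 2 + 3 + 4 + 5 + 6) = 710GB, whereas the incremental scheme occupies 500 + 6 x 10 = 560GB.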
Ideally, the same software can be used for mixed environments with both a backup server and additional desktop backups by users. Using the same software saves you the hassle of having to know the syntax of two programs – and thus avoids any associated errors.
Attic
The Attic backup program, which is written in Python, can be found in the repositories of some Linux distributions, such as Mageia, openSUSE, ROSA, or Slackware Linux; it can be installed conveniently using the respective package managers. The project page also provides the source code for download. Detailed documentation is also available [1].
Attic requires Python 3.2 or later and OpenSSL 1.0.0 or later. Because the software also lets you mount a backup set in user space, the llfuse package from the Python Package Index (PyPI) has to be installed to provide this function.
After a successful installation, you first have to initialize a backup repository. This is achieved with the command:
attic init /<Repository-Path>/<Repository-Name>.attic
You can then back up several directories into a named archive in this repository. The names for these archives can be freely selected. Attic does not enable encryption by default. The following command backs up the directories:
attic create /<Repository-Path>/<Repository-Name>.attic::<Archive-Name> /<Source-Directory-1>/ <[...]> <Source-Directory-n>
If the data must be encrypted, then the
--encryption=passphrase|keyfile
parameter must be added to the attic init command when the repository is created.
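In Attic, the encryption mode is an option of `attic init`, so it is chosen once when the repository is created. The following wrapper is a minimal sketch of that step. The function name `init_encrypted_repo` and the example path are hypothetical, and only the two modes named above are accepted.

```shell
# init_encrypted_repo REPOSITORY [MODE]
# MODE is "passphrase" (the default chosen for this sketch) or "keyfile".
init_encrypted_repo() {
    repo=$1
    mode=${2:-passphrase}
    case $mode in
        passphrase|keyfile) ;;
        *)
            echo "unknown encryption mode: $mode" >&2
            return 2
            ;;
    esac
    attic init --encryption="$mode" "$repo"
}

# Example call (hypothetical path):
#   init_encrypted_repo /srv/backup/office.attic keyfile
```

Rejecting unknown modes before calling Attic keeps a typing error from leaving an unencrypted repository behind unnoticed.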
I recommend using the weekday as the archive name for regular backups of the same directories, which quickly gives you the correct sequence of backups during a restore. The first backup in a repository can take a long time to complete for large data volumes, but subsequent backups will be far quicker because Attic saves them incrementally (i.e., only modified or newly added data is included in the backup).
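Naming the archive after the weekday is easy to automate. The following sketch derives the name from `date`. The function name `backup_today` and the paths in the comments are hypothetical.

```shell
# backup_today REPOSITORY SOURCE_DIR...
# Creates an archive named after the current weekday, for example "Monday".
backup_today() {
    repo=$1
    shift
    # LC_ALL=C keeps the weekday name in English regardless of the system locale.
    archive=$(LC_ALL=C date +%A)
    attic create "${repo}::${archive}" "$@"
}

# Example call (hypothetical paths):
#   backup_today /srv/backup/office.attic /home /etc
#
# Example crontab entry that runs a script containing this call at 1:30am:
#   30 1 * * * /usr/local/bin/backup-today.sh
```

As far as I know, Attic refuses to create an archive whose name already exists in the repository, so last week's archive of the same weekday would have to be removed or renamed first. The sketch leaves that step out.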
If you want to monitor the backup run, you can display the most important data for the backed up archive using the --stats parameter. Attic not only lists the directories and the required time for the backup run, but also the number of files backed up and the volume of data. It shows both the original and the compressed backed up data volumes, so you can keep track of data compression efficiency (Figure 1).
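For unattended runs, it can be useful to keep the statistics in a logfile. The following sketch appends the output of `attic create --stats` to a file together with a timestamp. The function name `backup_with_stats`, the header line format, and the paths in the comment are assumptions made for this example.

```shell
# backup_with_stats REPOSITORY::ARCHIVE LOGFILE SOURCE_DIR...
# Runs the backup with --stats and appends all output to LOGFILE.
backup_with_stats() {
    target=$1
    logfile=$2
    shift 2
    {
        # Header line so that individual runs can be told apart in the log.
        echo "=== $(date '+%Y-%m-%d %H:%M:%S') $target ==="
        attic create --stats "$target" "$@"
    } >>"$logfile" 2>&1
}

# Example call (hypothetical paths):
#   backup_with_stats /srv/backup/office.attic::Monday /var/log/attic.log /home /etc
```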
In contrast to many other backup tools, Attic provides a convenient approach to listing archive content. For this purpose, enter the following command at the prompt:
attic list -v /<Repository-Path>/<Repository-Name>.attic::<Archive-Name>
The software then lists all the content, including file size, owner, and file permissions. Subdirectories are included automatically, and the listing shows the absolute paths.
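Because the listing is plain text, it can be filtered with standard tools. The following sketch assumes that `attic list -v` prints one entry per line and searches that output with `grep`. The function name `find_in_archive` and the example values are hypothetical.

```shell
# find_in_archive REPOSITORY::ARCHIVE PATTERN
# Prints the entries of the archive listing that match PATTERN
# (a basic regular expression as understood by grep).
find_in_archive() {
    attic list -v "$1" | grep -e "$2"
}

# Example call (hypothetical values):
#   find_in_archive /srv/backup/office.attic::Monday 'report\.odt$'
```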