Recovering files with Magic Rescue
File Wizardry
The Magic Rescue recovery utility saves corrupt or deleted files by reading a file's magic number.
Free software has no lack of utilities for recovering deleted files. However, over the years, Magic Rescue [1] has proved to be one of the most reliable. In fact, it's so reliable that it continues to be carried by most major distributions despite the fact that it has been unmaintained for several years. A day will probably come when it is obsolete, but, meanwhile, it remains a standard recovery tool.
Magic Rescue works by reading a file's magic bytes or magic pattern, that is, the unique signature that designates each file type. This signature is often, but not always, within the very first bytes of a file. If it is not, then you can use a hex editor to find it (Figure 1). Magic bytes are mostly used by the file
command, often behind the scenes. Magic Rescue uses its collection of recipes to recognize the magic bytes in all deleted files of a particular type then saves deleted files to an output directory where they can be sorted.
Magic Rescue might not work on badly fragmented filesystems if it can only find the first chunk of a file; however, even then, it might be able to identify a file type for recovery, as long as the chunk is large enough to contain the complete magic bytes.
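To make the idea of magic bytes concrete, the following minimal sketch writes the well-known 8-byte PNG signature to a scratch file and dumps it as hex with od. The scratch file is purely illustrative; on a real file, only the od line is needed.

```shell
# Write the 8-byte PNG signature to a scratch file (illustration only).
sample=$(mktemp)
printf '\211PNG\r\n\032\n' > "$sample"

# Dump the first 8 bytes as hex and strip the spacing od adds.
magic=$(od -An -tx1 -N8 "$sample" | tr -d ' \n')
echo "$magic"

rm -f "$sample"
```

A hex editor shows the same bytes at offset 0 of any valid PNG file.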
Running Magic Rescue
Magic Rescue takes a device as its target, so you might need to run fdisk -l
to see a list of devices (Figure 2). You might also want to try running Magic Rescue on a damaged filesystem, which may not appear in the list. If you are working with a damaged filesystem, you should also run:
hdparm -d 1 -c 1 -u 1 /dev/DEVICE
This command enables DMA, 32-bit I/O support, and interrupt unmasking on the drive (the -d, -c, and -u options, respectively), which can help Magic Rescue read the drive faster and possibly decrease the run time.
The number of deleted files of a popular type can add up quickly on a filesystem that has been used actively for several years, so you should create a recovery directory on a filesystem with plenty of free space. The directory should not be on the filesystem from which Magic Rescue is trying to extract files, because the result can be an infinite loop as Magic Rescue continually finds an output file and creates a duplicate of it.
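One way to guard against that loop is to check, before starting, that the recovery directory does not sit on the device you are about to scan. The sketch below is a hypothetical helper, not part of Magic Rescue: on_other_filesystem is an invented name, /dev/sdX1 is a placeholder, and the comparison assumes that df reports the same device name you pass in, which may not hold for device-mapper or similar aliases.

```shell
# Hypothetical helper: succeed if DIR is on a filesystem other than DEVICE.
# Usage: on_other_filesystem DIR DEVICE
on_other_filesystem() {
    # df -P prints a header line, then one line whose first field is the device.
    dir_dev=$(df -P "$1" | awk 'NR == 2 { print $1 }')
    [ "$dir_dev" != "$2" ]
}

outdir=${TMPDIR:-/tmp}      # stand-in for your recovery directory
device=/dev/sdX1            # placeholder for the device to scan

if on_other_filesystem "$outdir" "$device"; then
    echo "OK: $outdir is not on $device"
else
    echo "Refusing: $outdir is on $device" >&2
fi
```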
You can see if Magic Rescue has a recipe for the file format you want to recover by checking the contents of /usr/share/magicrescue/recipes
(Figure 3). On my Debian system, Magic Rescue ships with 37 recipes, including ones for raw GIMP files, Mozilla inboxes and sent directories, .png
, .txt
, and .zip
(including LibreOffice and OpenOffice) files. Should a recipe for the format you need be missing, you might be able to find one online or create your own (see below). With this information, you are ready to run Magic Rescue. The minimum version of the command has the following structure:
magicrescue -r RECIPE -d OUTPUT-DIRECTORY /dev/DEVICE
The recipe's file name should be enough to identify it, but the recipe requires a path if it is not in ./recipes
or /usr/share/magicrescue/recipes
. Multiple recipes can also be specified, each with its own set of options immediately before it.
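The recipe check and the basic command structure can be combined in a small dry-run sketch. The function names have_recipe and rescue_cmd are hypothetical, the recipe name png is assumed from the list of formats above, and /mnt/recovery/png and /dev/sdX1 are placeholders. The script only prints the command so that you can review it before running it yourself.

```shell
# Recipe directory named in the text; override RECIPE_DIR to look elsewhere.
RECIPE_DIR=${RECIPE_DIR:-/usr/share/magicrescue/recipes}

# Hypothetical helper: succeed if a recipe file with this name exists.
have_recipe() {
    [ -f "$RECIPE_DIR/$1" ]
}

# Hypothetical helper: print (not run) a magicrescue command for one recipe.
# Usage: rescue_cmd RECIPE OUTPUT-DIRECTORY DEVICE
rescue_cmd() {
    printf 'magicrescue -r %s -d %s %s\n' "$1" "$2" "$3"
}

# Placeholder values, not real paths on your system.
if have_recipe png; then
    rescue_cmd png /mnt/recovery/png /dev/sdX1
else
    echo "no png recipe in $RECIPE_DIR" >&2
fi
```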
Often, the basic structure is all you need, but it can be modified by other options. Using -b BLOCKSIZE
causes Magic Rescue to consider only those files that start at a multiple of the blocksize argument. This option gives you results faster but also finds fewer files. In particular, it can miss files whose magic bytes are preceded by other data and files stored inside other files, such as compressed archives. However, the option applies only to the recipes that follow it, so you can reduce the chances of missing files by running the same recipe with different blocksizes.
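The trade-off is easier to see as arithmetic. The sketch below assumes, following the magicrescue man page, that -b restricts the search to byte offsets that are multiples of the blocksize; is_aligned is a hypothetical helper, and the offsets are invented for illustration.

```shell
# Hypothetical helper: succeed if byte OFFSET is a multiple of BLOCKSIZE.
# Usage: is_aligned OFFSET BLOCKSIZE
is_aligned() {
    [ $(( $1 % $2 )) -eq 0 ]
}

blocksize=512
for offset in 8192 8200; do
    if is_aligned "$offset" "$blocksize"; then
        echo "$offset: considered with -b $blocksize"
    else
        echo "$offset: skipped with -b $blocksize"
    fi
done
```

A file that starts at offset 8200, for instance inside an archive, would be found only with a smaller blocksize.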
Another option is to print machine-readable output to standard output with -M OUTPUT. When OUTPUT is i, the name of each input file is printed before processing, whereas when it is o, the name of each output file is printed after processing. With io, both names are shown.
Regardless of the options you add, you can stop the command at any time with Ctrl+C. You can restart it with the -O
option plus the last location listed in the terminal window (e.g., -O 0xF0CD2B
).
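As a small illustration of resuming, the sketch below converts the hexadecimal offset from the example to a decimal byte count and prints a resume command without running it. The recipe name, output directory, and device are placeholders.

```shell
# Offset copied from the last location shown before the interruption.
offset=0xF0CD2B

# POSIX shell arithmetic understands the 0x prefix.
bytes=$(($offset))
echo "resuming at byte $bytes"

# Print (not run) the resume command; all other values are placeholders.
echo magicrescue -r png -d /mnt/recovery/png -O "$offset" /dev/sdX1
```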
Running Magic Rescue can take several minutes – or even longer if you are recovering graphics files, especially on a filesystem that contains a web browser. Notice that the recovery often concludes with a note that an error has been reported (Figure 4), but on some devices, that may only denote the end of file.
Sorting the Results
When Magic Rescue completes its run, you can sort the results in the output directory. Although you can manually sort the results, an easy first step is to run dupemap
, a utility that accompanies Magic Rescue. Dupemap creates a database of checksums, which you can then use to eliminate duplicates using the command structure:
dupemap -d DATABASE.map delete,report OUTPUT-DIRECTORY
dupemap delete,report OUTPUT-DIRECTORY
The first form checks files against the checksums stored in DATABASE.map; the second, without a database, compares the files in the output directory only with each other. The delete and report operations do exactly what their names suggest, deleting duplicates and reporting the results. Afterward, the output directory contains only unique files, with the duplicates produced during the recovery eliminated.
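The idea of checksum-based duplicate detection can be tried with standard tools. The following sketch is a stand-in rather than dupemap itself: it uses the POSIX cksum utility, whose CRC may differ from whatever checksum dupemap uses internally, and it only lists duplicates instead of deleting them. The function name list_duplicates and the demo files are hypothetical.

```shell
# Hypothetical stand-in: list files in DIR whose checksum and size
# match a file already seen. Nothing is deleted.
list_duplicates() {
    find "$1" -type f -exec cksum {} + | sort |
        awk 'seen[$1 " " $2]++ { sub(/^[0-9]+ [0-9]+ /, ""); print }'
}

# Demo data: a.dat and b.dat have identical content, c.dat differs.
demo=$(mktemp -d)
printf 'same\n' > "$demo/a.dat"
printf 'same\n' > "$demo/b.dat"
printf 'other\n' > "$demo/c.dat"

dups=$(list_duplicates "$demo")
echo "$dups"

rm -rf "$demo"
```

Because a CRC can collide, a listing like this is best treated as a set of candidates to review, not as proof that two files are identical.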
Alternatively, Magic Rescue also provides the magicsort
utility. Its structure is as simple as possible:
magicsort OUTPUT-DIRECTORY
This bare command creates a subdirectory for each file type reported by the file command and moves the matching files into that directory.
Both dupemap
and magicsort
can be used without Magic Rescue, but the Debian packages do not include a man page. However, man pages are available online from Ubuntu [2][3].
Creating Recipes
If you need a recipe not included with Magic Rescue, you can create your own, although it takes a certain amount of knowledge of Linux and some ingenuity to modify existing recipes for your purposes.
Before you start building a recipe, read the man page for magicrescue
[4], so you can learn the basics of magic bytes. Then, sample some of the recipes in /usr/share/magicrescue/recipes
. Many of the recipes are heavily commented, which can help you, but most recipes have only three to five lines (Figure 5). They start with information about the magic bytes and the offset from the start of the file of any other identifying pattern. Usually, the file extension is mentioned. This identifying information is followed by an operation within Magic Rescue that should be applied to the file type and, sometimes, a parameter or sub-command that should operate on the file.
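When a recipe is heavily commented, it can help to view only its active lines. The sketch below assumes that recipe comments start with #, as in the recipes shipped with Magic Rescue, and that a recipe file named zip exists in the directory mentioned above; recipe_body is a hypothetical helper.

```shell
# Hypothetical helper: print a recipe without comment lines and blank lines.
# Usage: recipe_body RECIPE-FILE
recipe_body() {
    grep -v -e '^[[:space:]]*#' -e '^[[:space:]]*$' "$1"
}

# Assumed path; the block is skipped if the file is not present.
recipe=/usr/share/magicrescue/recipes/zip
if [ -r "$recipe" ]; then
    recipe_body "$recipe" || echo "(no active lines)"
fi
```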
For example, when a file is recognized as a compressed ZIP file, the recipe runs the dd
command to create a copy of the file. It then checks its characteristics to see if it is a .jar
(Java) or OpenDocument Format (LibreOffice or OpenOffice) file. If the file is identified as either type, it is renamed with the appropriate extension. You can copy the operations and parameters from other recipes, adapting them to your own purposes. The recipe might involve starting another utility designed to deal with the file type.
Other details are necessary to build a recipe but are too variable to give here. However, you can study the magicrescue
man page, preferably with it open in another tab or window. You can test a recipe with /usr/share/magicrescue/tools/checkrecipe
.