Klaus Knopper answers your Linux questions

Ask Klaus

Article from Issue 184/2016

Strategies for getting around flash drive limitations and updating OSs on flash drives.

Undeleting Files

Hi Klaus, Would you be able to advise what software is needed to undelete files that were deleted in error from the early version 7 persistent image file? Or, can this be done using the later versions?

Thanks, Marcus Pillifeant Maleny, Queensland, Australia

Because questions about undeleting, or recovering, accidentally deleted files come up quite often and are not specific to the Knoppix persistent image or partition, I'm going to answer in a more general way first.

Undeleting files, that is, reverting the effect of the remove (rm) or unlink commands, is a very filesystem-specific task. Its chances of success depend on the structure and features of the filesystem. I'll look at one of the simplest filesystems first, FAT32, which stores filesystem information in a simple table (hence the name "file allocation table"). Figure 1 shows a raw dump (hexedit) of the FAT with a few files in the root directory.

Figure 1: FAT32 file allocation table.

The actual filenames are lecture1.pdf, lecture2.pdf, and lecture3.pdf. The earliest FAT filesystems were only able to handle file names with eight uppercase letters, and an additional three-letter extension. This scheme is still used in the modern FAT, as marked in yellow, but the "long filename" with fewer limitations is now also present as an extension, which you may be able to identify somewhat above the "short" filename.
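The short-name scheme can be illustrated with a few lines of Python. This is only a minimal sketch: it assumes the classic 32-byte directory entry layout with eight name bytes followed by three extension bytes, both padded with spaces, and the sample entry is made up rather than taken from Figure 1.

```python
def short_name(entry: bytes) -> str:
    """Decode the 8.3 short name from one 32-byte FAT directory entry."""
    if len(entry) != 32:
        raise ValueError("a FAT directory entry is exactly 32 bytes")
    # Bytes 0-7 hold the base name, bytes 8-10 the extension; both are space padded.
    base = entry[0:8].decode("ascii", errors="replace").rstrip(" ")
    ext = entry[8:11].decode("ascii", errors="replace").rstrip(" ")
    return f"{base}.{ext}" if ext else base


# Hypothetical entry: 11 name bytes followed by 21 zeroed metadata bytes.
entry = b"LECTURE1" + b"PDF" + bytes(21)
print(short_name(entry))  # LECTURE1.PDF
```

The dot between name and extension is not stored on disk; it is implied by the fixed field widths.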

After deleting the file lecture2.pdf (using rm -f lecture2.pdf) and releasing the filesystem with umount, thus writing back all changes, the raw view of the file allocation table looks like Figure 2.

Figure 2: FAT32 dump after one file is deleted.

The most obvious change is the replacement of the filename's first letter, L, by the byte with hex value E5 (also in the "long filename" version above). This is how FAT32 first "hides" deleted files, before they are eventually overwritten by a newly created file later. At this stage, recovering the file is easily done by replacing the E5 byte at the beginning of the file name with an alphabetic letter (e.g., back to the original L).

After doing this, the deleted file is back when the filesystem is mounted again; you might want to do the same with the "long filename" part to get back the originally visible lowercase name. Recovery programs for DOS or Windows do exactly that. A very good recovery program for Linux is TestDisk (Figure 3), which knows about the specifics of file deletion and recovery for many filesystems.

Figure 3: TestDisk in action.
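The byte-level repair described above can be sketched in Python. The example works on an in-memory copy of a single 32-byte directory entry, not on a real device, and the entry contents are invented for illustration. It only covers the short-name entry; a tool like TestDisk takes care of considerably more than this one byte.

```python
DELETED_MARKER = 0xE5  # first name byte of a deleted FAT directory entry
ENTRY_SIZE = 32        # size of one directory entry in bytes


def is_deleted(entry: bytes) -> bool:
    """Report whether the entry carries the deletion marker."""
    return entry[0] == DELETED_MARKER


def undelete(entry: bytes, first_char: str) -> bytes:
    """Return a copy of the entry with the marker replaced by first_char."""
    if len(entry) != ENTRY_SIZE:
        raise ValueError("a FAT directory entry is exactly 32 bytes")
    if not is_deleted(entry):
        raise ValueError("entry is not marked as deleted")
    code = first_char.upper().encode("ascii")
    if len(code) != 1:
        raise ValueError("first_char must be a single ASCII character")
    return code + entry[1:]


# Hypothetical deleted entry for lecture2.pdf: marker byte, rest of the name, metadata.
deleted = bytes([DELETED_MARKER]) + b"ECTURE2PDF" + bytes(21)
restored = undelete(deleted, "L")
print(restored[:11])  # b'LECTURE2PDF'
```

Because the original first letter is lost on deletion, the caller has to supply it; any valid character makes the file visible again.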

Please note that, although recovering files in a FAT32 filesystem is comparatively easy, the file's data and metadata will only stay intact as long as a new file does not claim the same location in the FAT or overwrite the file's data location. If this happens, the file and its contents are really gone for good. Creating new files or modifying files on the filesystem will likely destroy the content of the deleted file, so undeletion won't work or the recovered file will be corrupt.

Now, FAT32 is a very simple filesystem and somewhat limited; for example, you can't create files larger than 4GB, and because of its static size, the FAT itself can run out of space for new file names. Therefore, you might not be able to create a vast number of small files in the same directory, even if there is still physical space available to hold the data. Also, FAT32 supports neither Unix permissions nor special file types such as block and character devices, symbolic and hard links, sockets, and named pipes, nor extended attributes, all of which are elementary for Linux to function properly and stay secure in a multiuser environment.
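The 4GB limit follows from the way FAT32 records a file's size: as an unsigned 32-bit number in the directory entry. The following Python sketch shows the arithmetic; the helper names are made up for illustration.

```python
import struct

MAX_FAT32_FILE_SIZE = 2**32 - 1  # largest value an unsigned 32-bit field can hold


def fits_on_fat32(size_in_bytes: int) -> bool:
    """Check whether a file of this size can be described by a 32-bit size field."""
    return 0 <= size_in_bytes <= MAX_FAT32_FILE_SIZE


def pack_size(size_in_bytes: int) -> bytes:
    """Encode a size as a 4-byte little-endian field.

    struct.pack raises struct.error for values above 2**32 - 1.
    """
    return struct.pack("<I", size_in_bytes)


print(fits_on_fat32(4 * 1024**3 - 1))  # True
print(fits_on_fat32(4 * 1024**3))      # False
```

So the largest possible file is one byte short of 4GB.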

Native Linux filesystems, such as ext2 through ext4, XFS, or ReiserFS, store data and metadata in a more efficient way, so file access in a complex tree of directories and files is much faster and costs less memory than searching for a file in a static file allocation table. Also, less space is wasted on a huge FAT, because the metadata is stored in a linked list that can span the complete partition; moreover, recent changes are kept in a journal, which allows for quick repair of the filesystem in the case of unfinished file operations or a crash before the filesystem is unmounted properly.

These advantages of modern journaling filesystems come at the cost of being able to "undo" valid transactions. A deleted file is unlinked from the metadata structure quickly, so it is quite difficult to find old entries once the filesystem tree is automatically optimized. Only very recent changes, which are kept in the journal, can be replayed or reversed with special, filesystem-specific software. Unfortunately, any references to file names and file metadata, like time stamps, disappear very quickly in modern filesystems after the file has been deleted, so you might still be able to recover the file data, but you won't get back the matching file name.

If you care more about the data of a single file than about retrieving the complete filesystem and directory structure, you can try PhotoRec instead of TestDisk to get your data back. PhotoRec scans raw data and finds file contents based on header signatures (Figure 4). In some cases, the file content also reveals the original file name, even if the file no longer appears in the filesystem's organizational structure, so you can get back the file with its (almost) original name. However, in most cases, such as pictures or videos, the file name is no longer associated with the data after file removal. PhotoRec then assigns new names, derived from block positions on disk, to the files it recovers and saves to a new partition or medium, so you have to search or guess based on the recovered files' sizes and positions.

Figure 4: Using PhotoRec.

PhotoRec scans files regardless of which filesystem is used on the source partition, but it honors filesystem-specific data links and file fragments if the filesystem is known or specified in the initial configuration options.
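The idea of signature-based scanning can be sketched in a few lines of Python. This is not how PhotoRec is implemented; it is a toy illustration that searches a raw byte buffer for three well-known header signatures and reports their offsets. The sample buffer is made up.

```python
# Well-known magic numbers at the start of some file formats.
SIGNATURES = {
    "pdf": b"%PDF-",
    "jpg": b"\xff\xd8\xff",
    "png": b"\x89PNG\r\n\x1a\n",
}


def find_headers(raw: bytes) -> list[tuple[int, str]]:
    """Return (offset, kind) for every known signature found in raw, sorted by offset."""
    hits = []
    for kind, magic in SIGNATURES.items():
        pos = raw.find(magic)
        while pos != -1:
            hits.append((pos, kind))
            pos = raw.find(magic, pos + 1)
    return sorted(hits)


# Hypothetical raw data: zero-filled gaps around a PDF header and a JPEG header.
raw = bytes(512) + b"%PDF-1.4 ..." + bytes(500) + b"\xff\xd8\xff\xe0" + bytes(100)
for offset, kind in find_headers(raw):
    print(f"offset {offset}: possible {kind} header")
# offset 512: possible pdf header
# offset 1024: possible jpg header
```

Finding a header only tells you where a file may start. Deciding where it ends, and coping with fragmented files, is the hard part that a real recovery tool has to solve.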

Back to the Knoppix-specific part of your question: Two filesystem types come into question for the read/write overlay: ext2 for the optional overlay file method, which is selected at flash disk installation, and ReiserFS for the additional overlay partition method, which is recommended for efficiency. Undeleting removed files may be more difficult in ReiserFS than in ext2 because of the balanced tree metastructure.

However, if you deleted a system file that's physically located on the read-only part of the Knoppix overlay stack, recovery is very simple: All original files residing in the compressed read-only overlay files KNOPPIX/KNOPPIX* are still immediately accessible under the /KNOPPIX* directories, which are mounted at boot. When you remove a file in Live operation, the AuFS overlay filesystem just creates a so-called "whiteout" file, whose name starts with .wh., in the writable /KNOPPIX-DATA directory structure; this whiteout hides the (read-only) original file. Either removing the whiteout file or copying back the original file from /KNOPPIX to /UNIONFS will recover the file. Of course, this method of recovering files from a part of the overlay stack only applies to files normally included in Knoppix, not to files that you downloaded or created yourself.
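To get an overview of which original files are currently hidden, you can list the whiteout entries in the writable branch. The following Python sketch assumes the usual AuFS naming convention, in which the whiteout for a file is named .wh. followed by the file's name, and it skips AuFS's own bookkeeping entries that begin with .wh..wh. (both are assumptions about AuFS, not something specific to Knoppix). It only reads the directory tree and changes nothing.

```python
import os

WHITEOUT_PREFIX = ".wh."      # assumed AuFS whiteout prefix
INTERNAL_PREFIX = ".wh..wh."  # assumed prefix of AuFS bookkeeping entries


def find_whiteouts(writable_root: str) -> list[tuple[str, str]]:
    """Return (whiteout_path, hidden_relative_path) pairs found under writable_root."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(writable_root):
        for name in filenames:
            if not name.startswith(WHITEOUT_PREFIX):
                continue
            if name.startswith(INTERNAL_PREFIX):
                continue  # not a whiteout for a user-visible file
            hidden = os.path.join(dirpath, name[len(WHITEOUT_PREFIX):])
            found.append(
                (os.path.join(dirpath, name), os.path.relpath(hidden, writable_root))
            )
    return sorted(found)


if __name__ == "__main__":
    # /KNOPPIX-DATA is the writable directory named in the text above.
    for whiteout, hidden in find_whiteouts("/KNOPPIX-DATA"):
        print(f"{hidden} is hidden by {whiteout}")
```

Each reported path is relative to the root of the overlay, so an entry such as etc/hosts would correspond to the original file of the same relative path in one of the /KNOPPIX* directories.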

A last hint: When you boot with the option

knoppix noimage

Knoppix will not access the overlay filesystem and thus will not attempt to write to it, so recovery from the purely read-only system (e.g., DVD) is safe.

Klaus Knopper

Klaus Knopper is an engineer, creator of Knoppix, and co-founder of LinuxTag expo. He works as a regular professor at the University of Applied Sciences, Kaiserslautern, Germany. If you have a configuration problem, or if you just want to learn more about how Linux works, send your questions to: klaus@linux-magazine.com
