Comparing files with diffoscope
Spot the Difference
Diffoscope finds all the differences between files or folders, but at the price of verbosity. We show you how to focus diffoscope on what you want to know.
Diffoscope [1] is a Python 3 command-line tool that shows, in far more detail than you might imagine, all the differences between two files, folders, or file archives in TAR, ISO, or other formats. For such containers, diffoscope can browse the full internal hierarchy and look at every file they contain.
The basic features and most common uses of diffoscope were previously explained by Bruce Byfield in Linux Magazine [2]. After a brief recap of those features, I will discuss what I consider a limitation of this tool for non-programmers and then illustrate a general method to overcome that limitation, so that you can exploit this great tool's potential much more quickly.
Diffoscope's Main Features
The full diffoscope package can handle almost every file format in existence, provided the right third-party tools are installed on your system. A "minimal" version of the package is also available and, as far as I can tell, should be more than adequate for the great majority of users. When browsing folders or archives, diffoscope compares files with the same or similar names in the same subfolder. If there is no other way to find and display the differences between two files, it falls back on "hexdump comparison" (i.e., comparing the two files byte by byte and showing the differences in byte values).
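To give a rough idea of what a hexdump comparison involves, here is a minimal Python sketch. It is not diffoscope's own code: the function names hexdump_lines and hexdump_diff are hypothetical, and the output layout (an offset followed by 16 hex byte values per line) is an assumption chosen for readability. The sketch renders each input as hexdump-style text and then diffs the two renderings with the standard difflib module.

```python
import difflib


def hexdump_lines(data: bytes, width: int = 16) -> list[str]:
    """Render bytes as hexdump-style lines: an offset, then hex byte values."""
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hex_part = " ".join(f"{byte:02x}" for byte in chunk)
        lines.append(f"{offset:08x}: {hex_part}")
    return lines


def hexdump_diff(first: bytes, second: bytes) -> list[str]:
    """Return a unified diff of the two hexdumps (illustrative only)."""
    return list(
        difflib.unified_diff(
            hexdump_lines(first),
            hexdump_lines(second),
            fromfile="first",
            tofile="second",
            lineterm="",
        )
    )


if __name__ == "__main__":
    # Two byte strings that differ in a single byte ("w" versus "W").
    for line in hexdump_diff(b"hello world\n", b"hello World\n"):
        print(line)
```

Lines starting with a minus sign come from the first input and lines starting with a plus sign from the second, so the differing byte values sit side by side. Diffing a text rendering rather than raw bytes keeps the output readable, at the cost of flagging a whole 16-byte row when only one byte in it changes.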
[...]