Examining the algorithms of the diff utility

WHAT'S THE DIFF?

Article from Issue 76/2007
Author(s):

Diff finds the differences between two versions of a file. We’ll show you how diff finds changes and matches in files without overtaxing a system's resources.

For a user at the command line, discovering the differences between two text files is easy: a simple command, such as diff Version_1.txt Version_2.txt, is all it takes. On closer inspection, however, it turns out that comparing files can require a large amount of memory and some ingenious algorithms. This article investigates how diff manages to find changes and matches in multi-megabyte files without overtaxing a system’s resources.
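
To make the memory concern concrete, here is a minimal sketch of a textbook line comparison based on the longest common subsequence (LCS). It is an illustration under stated assumptions, not a description of how diff itself is implemented: the function names lcs_table and simple_diff are hypothetical, and the approach deliberately builds a full table with one cell per pair of lines, which is exactly the kind of memory cost that becomes a problem for large files.

```python
def lcs_table(a, b):
    """Build a table where table[i][j] is the LCS length of a[i:] and b[j:].

    Note: this stores (len(a) + 1) * (len(b) + 1) integers, so memory use
    grows with the product of the two file lengths.
    """
    rows, cols = len(a), len(b)
    table = [[0] * (cols + 1) for _ in range(rows + 1)]
    for i in range(rows - 1, -1, -1):
        for j in range(cols - 1, -1, -1):
            if a[i] == b[j]:
                table[i][j] = table[i + 1][j + 1] + 1
            else:
                table[i][j] = max(table[i + 1][j], table[i][j + 1])
    return table


def simple_diff(a, b):
    """Return (tag, line) pairs: ' ' for kept, '-' for removed, '+' for added."""
    table = lcs_table(a, b)
    i = j = 0
    result = []
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            # Matching line: part of the common subsequence.
            result.append((" ", a[i]))
            i += 1
            j += 1
        elif table[i + 1][j] >= table[i][j + 1]:
            # Dropping the old line loses nothing from the LCS.
            result.append(("-", a[i]))
            i += 1
        else:
            result.append(("+", b[j]))
            j += 1
    # Whatever remains on either side is pure deletion or insertion.
    result.extend(("-", line) for line in a[i:])
    result.extend(("+", line) for line in b[j:])
    return result


if __name__ == "__main__":
    old = ["apple", "banana", "cherry"]
    new = ["apple", "blueberry", "cherry", "date"]
    for tag, line in simple_diff(old, new):
        print(tag, line)
```

With two files of 100,000 lines each, the table in this sketch would need on the order of ten billion cells, which is why practical tools rely on more economical algorithms.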

