Examining the algorithms of the diff utility

WHAT'S THE DIFF?

Article from Issue 76/2007
Author(s):

Diff finds the differences between two versions of a file. We’ll show you how diff finds changes and matches in files without affecting a system's resources.

For a user at the command line, discovering the differences between two text files is easy: a simple command, such as diff Version_1.txt Version_2.txt, is all it takes. On closer inspection, however, it turns out that diff needs a large amount of memory and some ingenious algorithms to compare files. This article investigates how diff manages to find changes and matches in multiple megabyte files without affecting a system’s resources.

Buy this article as PDF

Express-Checkout as PDF
Price $2.95
(incl. VAT)

Buy Linux Magazine

SINGLE ISSUES
 
SUBSCRIPTIONS
 
TABLET & SMARTPHONE APPS
Get it on Google Play

US / Canada

Get it on Google Play

UK / Australia

Related content

  • Command Line – diff and merge

    Diff and merge: They're not just for developers.

  • FOSSPicks

    Graham looks at TerraForge3D, nheko, Navidrome, ddcutil, and much more!

  • Team Spirit

    Instead of the coach determining the team lineup, an algorithm selects the players based on their strengths for Mike Schilli's amateur soccer team.

  • Monitoring Station

    With a monitoring system implemented in Go, Mike Schilli displays the Docker containers that have been launched and closed on his system.

  • Perl: Automating Color Correction

    If you have grown tired of manually correcting color-casted images (as described in last month's Perl column), you might appreciate a script that automates this procedure.

comments powered by Disqus