Hacking free software for creative writing

Binary Diffs

I covered diffoscope [7] in depth in the November 2020 issue of Linux Magazine [8]. Here, I will only say that diffoscope is a new command that brings the time-honored diff command to the desktop and works with over 60 binary formats, including LibreOffice's ODT format. Although originally written by Debian's Reproducible Builds project, diffoscope is an important building block for writers who choose to work more like developers.

At its simplest, diffoscope requires no more than the command and two file names to compare versions (Figure 3). However, if you choose, you can specify options such as the length of each excerpt from the two files and regular expressions to include or omit. You can also experiment with fuzzy matching.

Figure 3: Diffoscope is a new tool for comparing two different binary files.
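
For writers who want to script the comparison, the simplest invocation can be wrapped in a few lines of Python. This is only a sketch under stated assumptions: the function names and the file names are hypothetical, diffoscope must already be installed and on your path, and the code relies on diffoscope's --html report option and on its convention of exiting with 0 when the files match and 1 when they differ.

```python
import subprocess


def build_diffoscope_command(old_file, new_file, html_report=None):
    """Return the argument list for a diffoscope run on two files."""
    command = ["diffoscope"]
    if html_report is not None:
        # Assumption: --html writes the report to the named file.
        command += ["--html", html_report]
    command += [old_file, new_file]
    return command


def compare_drafts(old_file, new_file, html_report=None):
    """Run diffoscope; return (files_differ, report_text)."""
    result = subprocess.run(
        build_diffoscope_command(old_file, new_file, html_report),
        capture_output=True,
        text=True,
    )
    # Assumption: exit status 0 means identical, 1 means differences found.
    # Anything else is treated as a failure of the tool itself.
    if result.returncode not in (0, 1):
        raise RuntimeError(result.stderr.strip() or "diffoscope failed")
    return result.returncode == 1, result.stdout
```

Calling compare_drafts("draft-v1.odt", "draft-v2.odt") would return a flag saying whether the drafts differ, together with the text report that diffoscope would otherwise print to the terminal.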

Merging Files

One of the barriers to using Git or diff commands for writing is that they usually work with text files, while most writers work in binary formats or at least have to submit their work in one. In addition, as I write, diffoscope is too new to have an automated merge capability. Consequently, most merges have to be manual, although there are several ways to go about it.

If you only want to append files, ooo_cat [9] will do the job. From the desktop, you can connect files to each other in LibreOffice using File | Send | Create Master Document (Figure 4). This feature not only appends documents but also allows files to be rearranged and new text to be entered. The files in a master document remain separate but can be printed or saved as a single document.

Figure 4: LibreOffice Writer's Master Documents feature brings multiple files together.

More complicated file merges can be done with LibreOffice's Edit | Track Changes | Compare Documents (Figure 5). You open a second file and then review the differences between the two documents, approving or rejecting the changes one at a time. Alternatively, you can use Merge Documents to combine both documents all at once. The Compare Documents option in particular has an interface complicated enough to take some getting used to, although it is a powerful solution.

Figure 5: LibreOffice's Compare Documents feature is a graphical tool for choosing the changes to accept while comparing two documents.

You also have the option of working in plain text so you can use Git's own merge features. To do so, you need to convert your files to plain text, using a solution like odt2text [10] for LibreOffice files or PDFMiner [11] for PDF files. Unfortunately, while you can convert text back to ODT format by opening the file in LibreOffice and then saving it, much of the formatting needed to prepare a manuscript for submission still has to be added – a task eased by styles, but still a tedious one. A more practical solution would be to write in LaTeX or HTML, markup languages that are stored as plain text and therefore allow you to work directly with Git's merge command. When you are finished editing, you can compile the LaTeX file or run a script to convert the text or HTML to ODT.
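
To give a rough idea of what such a conversion involves, here is a minimal Python sketch that pulls the paragraphs and headings out of an ODT file. It is not how odt2text [10] works internally; it simply relies on the fact that an ODT file is a ZIP archive whose text lives in content.xml. The function name odt_to_text is hypothetical, and the sketch ignores tabs, line breaks, and text nested in frames or notes, so treat it as a starting point rather than a finished converter.

```python
import xml.etree.ElementTree as ET
import zipfile

# XML namespace that OpenDocument uses for text elements.
TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"
PARAGRAPH_TAGS = {f"{{{TEXT_NS}}}p", f"{{{TEXT_NS}}}h"}


def odt_to_text(path):
    """Return the headings and paragraphs of an ODT file, one per line."""
    # An ODT file is a ZIP archive; the document body is in content.xml.
    with zipfile.ZipFile(path) as odt:
        root = ET.fromstring(odt.read("content.xml"))

    lines = []
    for element in root.iter():
        if element.tag in PARAGRAPH_TAGS:
            # itertext() also collects text inside spans (bold, italics, ...).
            lines.append("".join(element.itertext()))
    return "\n".join(lines) + "\n"
```

Once each draft has been saved as plain text in this way, ordinary diff and Git's merge features can work on the result line by line.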

The New Workflow

None of the tools mentioned here is promoted as being specifically for writers. However, with a little ingenuity, writers can benefit from them almost as much as developers. They require a change in workflow that takes a while to learn, but the result is worth the effort. In the last few months, I have found that, thanks to these tools, my habits are becoming more organized. I no longer have to hunt for background notes, or figure out where my drafts are, or scroll up and down as much as I once did. These development tools have helped me to organize my workflow and increase my efficiency – improvements that, like most writers, I have badly needed.


  1. Williams, Robin. The PC Is Not a Typewriter. Peachpit Press, 1992: https://www.amazon.com/Pc-not-typewriter-Robin-Williams/dp/0938151495
  2. VimWiki: https://vimwiki.github.io/
  3. Zim: https://zim-wiki.org/
  4. xkcdpass: https://pypi.org/project/xkcdpass/
  5. Git: https://git-scm.com/
  6. Mercurial: https://www.mercurial-scm.org/
  7. Diffoscope: https://diffoscope.org/
  8. "A modern diff utility" by Bruce Byfield, Linux Magazine, issue 240, November 2020, pp. 30-32
  9. ooo_cat: http://ooopy.sourceforge.net/
  10. odt2text: https://medium.com/@mbrehin/git-advanced-diff-odt-pdf-doc-xls-ppt-25afbf4f1105
  11. PDFMiner: https://github.com/euske/pdfminer
