Needle in a Haystack

What Next?

In this tutorial, you have learned why and how to use a tool that automatically scans as many ODF or text files as you want to find any given string. Cool, but why stop here?

The first thing you can do is improve odfgrep as you please. To work on non-writable media, for example, you can modify it to create a temporary, complete copy of all the folders to examine in another folder. Alternatively, you can replace the test in Listing 1 (line 11) with one based on the file command: It would be more complicated, but it would recognize ODF files regardless of their extension.
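
As a rough illustration of the second idea, the following shell sketch checks the MIME type reported by file instead of the file name. The function names is_odf_mime and is_odf_file are made up for this example and are not part of odfgrep. The sketch also assumes that the installed file command labels ODF documents with a MIME type beginning with application/vnd.oasis.opendocument, which recent versions generally do, but which is worth verifying on your system.

```shell
#!/bin/sh
# Sketch only: decide whether a file is ODF from its content,
# not from its extension. Function names are hypothetical.

# Succeed if the given MIME type belongs to the OpenDocument family
# (text, spreadsheet, presentation, and so on).
is_odf_mime() {
  case "$1" in
    application/vnd.oasis.opendocument.*) return 0 ;;
    *) return 1 ;;
  esac
}

# Ask file(1) for the MIME type of the file and test it.
# -b suppresses the file name in the output; --mime-type prints
# only the type, without the character set.
is_odf_file() {
  is_odf_mime "$(file -b --mime-type -- "$1" 2>/dev/null)"
}
```

A script could then call is_odf_file "$f" on every file it finds and hand only the matches to the ODF search code. The price is one extra file process per candidate, which is part of why this approach is heavier than a simple extension test.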

Another fun and productive line of work is using odfgrep as a model to build similar tools. A good candidate would be an odfdiff script that prints out the differences between two ODF documents.
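
To give an idea of what a first version of such a tool might look like, here is a minimal sketch. It is not an existing script: the function names are invented for this example. It assumes that unzip, sed, diff, and mktemp are installed, and that comparing the plain text of content.xml is good enough. The tag stripping is deliberately crude: It ignores styles, tables, and metadata and leaves XML entities such as &amp;amp; undecoded.

```shell
#!/bin/sh
# Sketch only: a bare-bones odfdiff. All function names are hypothetical.

# Turn ODF content XML (read from standard input) into plain text:
# end every paragraph or heading with a newline, then drop all tags.
odf_xml_to_text() {
  sed -e 's/<\/text:[ph]>/\
/g' -e 's/<[^>]*>//g'
}

# Print the plain text of one ODF file.
# unzip -p writes the named archive member to standard output.
odf_text() {
  unzip -p "$1" content.xml | odf_xml_to_text
}

# Show a unified diff of the text of two ODF files.
# The body runs in a subshell, so its variables do not leak out.
odfdiff() (
  old=$(mktemp) || exit 2
  new=$(mktemp) || { rm -f "$old"; exit 2; }
  odf_text "$1" > "$old"
  odf_text "$2" > "$new"
  diff -u "$old" "$new"
  status=$?
  rm -f "$old" "$new"
  exit "$status"
)
```

After sourcing these functions, odfdiff old.odt new.odt would print the changed paragraphs in the usual unified diff format and return the exit status of diff.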

The most important take-home lesson, however, is this: ODF is a format for sophisticated text documents, presentations, and spreadsheets that is very easy to work with and to process efficiently. For more proof of this, visit my little "ODF scripting" collection [5]. If you know about other scripts like those, or write new ones, please let me know!

The Author

Marco Fioretti (http://mfioretti.com) is a freelance author, trainer, and researcher based in Rome, Italy. He has been working with free/open source software since 1995 and on open digital standards since 2005. Marco is also a board member of the Free Knowledge Institute (http://freeknowledge.eu).
