DocFetcher
Bloodhound
DocFetcher is a practical local search tool that is easy to configure and use – even for large data collections.
Modern operating systems take up several gigabytes of space for application programs alone and can contain several hundred thousand individual files. Add an extensive music or photo collection, and you can quickly lose track.
Modern desktop environments offer indexing and search applications for existing data, and the Linux environment includes several special search programs. However, many of these programs are not very intuitive, and some even expect you to install a database as a back end. In addition, many of these tools do not support full-text searches. If you are looking for a lean, practical, and powerful search tool for your workstation, DocFetcher is a very interesting alternative.
You can download the Java application from the project page, where you will also find installation instructions [1]. As a prerequisite, you need a reasonably up-to-date Java runtime environment; DocFetcher works well with the current OpenJDK environments, which you can usually install directly from your distribution's software repositories.
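Before unpacking anything, it can be worth confirming that a Java runtime is actually on the search path. The following is a minimal sketch only; the have_cmd helper is a name made up for this example, and the package you would need to install depends on your distribution.

```shell
#!/bin/sh
# have_cmd is a hypothetical helper: succeeds if the named command is on the PATH.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

if have_cmd java; then
    # java -version writes to stderr, so redirect it before picking the first line.
    java -version 2>&1 | head -n 1
else
    echo "No Java runtime found; install an OpenJDK package from your distribution first."
fi
```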
Unpack the downloaded ZIP archive with the DocFetcher files using a tool like Ark, File Roller, or Xarchiver. You can then move the subdirectory you created to a directory of your choice. To start the program from a desktop menu, however, you need to manually create a menu entry (see the box entitled "Installation").
Installation
Many Linux distributions do not include DocFetcher in their package sources. Ubuntu, for example, does not yet include a package for DocFetcher. It is thus often necessary to install DocFetcher manually.
Listing 1 shows how to unpack the ZIP archive downloaded from the project page into the /usr/local/bin/ directory. In Listing 2, you will find the content for /usr/share/applications/docfetcher.desktop to help you create a matching entry in the Start menu of the desktop environment.
Adjust the version number in the commands if necessary. If you prefer a location other than /usr/local/bin/docfetcher/, remember to change the paths appropriately. If you are still using a system without GTK3 libraries, you also need to swap DocFetcher-GTK3.sh for DocFetcher-GTK2.sh in the Exec line.
Listing 1
Unzipping DocFetcher
$ unzip docfetcher-1.1.19-portable.zip
$ sudo mv DocFetcher-1.1.19/ /usr/local/bin/docfetcher
Listing 2
Creating a Menu Entry
[Desktop Entry]
Version=1.0
Name=DocFetcher
GenericName=Document Index and Search
X-GNOME-FullName=DocFetcher Document Index and Search
Comment=Index and Search your computer
Type=Application
Categories=System;Utility;FileTools;Java;
Exec=/usr/local/bin/docfetcher/DocFetcher-GTK3.sh
Terminal=false
StartupNotify=true
Icon=/usr/local/bin/docfetcher/img/docfetcher128.png
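The launcher swap for systems without GTK3 libraries, mentioned in the box, can also be scripted rather than done in an editor. The sketch below assumes GNU sed (for the -i option) and tries the substitution on a scratch file first; use_gtk2 is a helper name invented for this example, not something DocFetcher ships.

```shell
#!/bin/sh
# use_gtk2 (hypothetical helper): point the Exec line of a desktop entry
# at the GTK2 launcher script instead of the GTK3 one.
use_gtk2() {
    sed -i 's|^\(Exec=.*/\)DocFetcher-GTK3\.sh|\1DocFetcher-GTK2.sh|' "$1"
}

# Try it on a scratch file that mimics the relevant lines of Listing 2.
demo=$(mktemp)
printf '%s\n' '[Desktop Entry]' 'Name=DocFetcher' \
    'Exec=/usr/local/bin/docfetcher/DocFetcher-GTK3.sh' > "$demo"
use_gtk2 "$demo"
grep '^Exec=' "$demo"
```

To apply the same change to the installed menu entry, you would run the sed command with root privileges against /usr/share/applications/docfetcher.desktop.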
Start Your Engines
When you first launch DocFetcher, some systems start with a dialog where you can change the keyboard shortcut from the default ([Ctrl]+[F8]). If the shortcut is already mapped, a message asks you to confirm by pressing OK. The program window, which is divided into five panes, then appears. In the top-left corner, you will find an input field for the minimum and maximum file size that DocFetcher should consider for the search.
Select the file types you want DocFetcher to find from a dropdown list; the program enables all supported formats by default. Below is the search area, and top-right is an input line for the search terms. Below this area, the software lists the results with information on match relevance and file size; an area in the bottom right displays the contents of the selected file (Figure 1).
![](/var/linux_magazin/storage/images/issues/2018/214/docfetcher/figure-1/731984-1-eng-US/Figure-1_large.png)
DocFetcher needs to index the contents of the mounted storage media in order to search reliably and quickly even in large data sets. You can trigger this indexing from the Create Index From dialog, which you can access by right-clicking in the search area in the bottom-left of the main window. Then select either a folder or an archive file. In Microsoft environments, DocFetcher supports indexing of PST files containing messages, contacts, tasks, or appointments.
To limit the size of files that the program should consider, enter the minimum and maximum values in the boxes in the upper-left corner. The process of indexing the data collection, which relies on Apache Lucene [2], takes some time during the first run, but this step will significantly speed up searching in these folders (Figure 2).
![](/var/linux_magazin/storage/images/issues/2018/214/docfetcher/figure-2/731987-1-eng-US/Figure-2_large.png)
After indexing is complete, you will find the indexed directories and archives in the Search Scope pane. Enter the desired search terms in the search box. After you press the Search button, DocFetcher searches through the indexed data and lists the locations. Files containing the search term appear together with information such as the file size. Below you will find the text passages where the search term appears. DocFetcher highlights the term in yellow (refer to Figure 1).
Multiple Terms
In addition to the simple keyword search, DocFetcher also offers simultaneous searching for several keywords. You can also search for word sequences or specify terms to exclude from the search. If you want to search for two terms, join them with the AND operator. DocFetcher then searches for files in which both terms occur, although they can occur at any location in the text. If you want the application to find an exact word order, you need to put the words in quotes.
You can exclude a term from the search by prefixing it with a minus sign. For a wildcard search, use a question mark or an asterisk: The question mark replaces exactly one character in a search term, whereas the asterisk replaces any number of characters. The asterisk is especially helpful when searching for compound nouns and technical terms.
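If you want a feel for the two wildcards before typing them into the search box, shell patterns use the same convention: ? stands for exactly one character, and * stands for any run of characters. The following sketch only illustrates that convention in a terminal; DocFetcher's actual matching is handled by Lucene, not by the shell, and the matches helper and the sample words are made up for this example.

```shell
#!/bin/sh
# matches (hypothetical helper): succeed if word $2 fits wildcard pattern $1.
matches() {
    case "$2" in
        $1) return 0 ;;   # unquoted, so $1 is treated as a pattern
        *)  return 1 ;;
    esac
}

# Print every sample word that fits one of the two example patterns.
for word in kernel kernal kerneel index indexing; do
    matches 'kern?l' "$word" && echo "kern?l matches $word"
    matches 'index*' "$word" && echo "index* matches $word"
done
```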
The search sometimes reveals results that are not needed at all. With the option to exclude unneeded formats, you can quickly thin out the list of hits. Uncheck the boxes to the left of the individual file formats in the Document Types window segment. Alternatively, use the Search Scope pane to limit the search to the relevant directory trees.
In the results display, you can scroll through the terms found page by page by clicking on the arrows to the left or right above the search display. The matches are shown with a yellow background. The up/down arrow buttons are used to navigate from match to match; DocFetcher highlights the search key in green.
Updates
As soon as you store new data in the directory hierarchies integrated by DocFetcher, you have to update the index to include all files in later searches. To update the index, right-click on the index in the search area and select the Update… option from the context menu. DocFetcher now integrates the new files and directories into the index in a process that is far faster than the initial indexing.
You can use the same context menu to list the documents in a folder without searching through them. Select the List Documents option. The software then displays the individual files in the results display top right in the program window. You can only apply this function to a single directory, not to higher-level directories that only contain subdirectories themselves.
To remove individual files from the folder, right-click the file and select Open Parent Folder from the context menu. The file manager opens, listing the files in the parent folder. Alternatively, you can display the folder contents by right-clicking on the directory in the lower-left corner of the search area and selecting Open Folder from the context menu.