Total Recoll
Lost and Found
KDE's unofficial search engine may be the most usable choice of all.
Searches have been KDE's weak point for several years. Nepomuk [1], which was introduced in the fourth release series as a sophisticated search engine, proved difficult to configure and use. Last year, Nepomuk was replaced by the supposedly easier-to-use Baloo [2], but it has been greeted with no greater enthusiasm. Currently, the most usable alternative for drive searches is Recoll [3] (Figure 1), which combines both simplicity and – for those who want them – advanced configuration options that are explained in a comprehensive manual [4].
Like Baloo, Recoll is a Qt tool that uses the Xapian search engine library [5]. Recoll's main difference from its predecessors is that it does not install as a daemon by default, a practice that has gained both Nepomuk and Baloo a reputation for being a drain on system resources. Recoll is not a standard KDE package, but it is well enough known that major distributions carry it. If your distro does not include the Recoll packages in its repository, the manual includes detailed information about building from source.
Once Recoll is installed, it needs to index your files. In my experience, Recoll indexed about a terabyte of files in one hour (Figure 2), about the same as either Nepomuk or Baloo; however, unlike my experience with Nepomuk, the process did not noticeably slow operations running concurrently.
To substantially reduce the time spent updating the index, go to Preferences | Index Configuration, where you can exclude paths (e.g., library directories) and specify languages to include in searches. From Preferences | Index Scheduling (Figure 3), you can create a cronjob to run indexing regularly. Although you can also update the index manually, creating a cronjob avoids relying on your memory and allows you to run the update at a time when it is less likely to interfere with other applications. Still another solution is to update the index at each login.
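To make the scheduling option more concrete, here is a minimal Python sketch that builds the kind of crontab entry a nightly index update might use. It assumes the indexer is started with the recollindex command; the helper name cron_line and the 3:30am schedule are illustrative only, and the entry that the Index Scheduling dialog actually writes may look different.

```python
def cron_line(hour, minute=0, command="recollindex"):
    """Return a crontab entry that runs `command` once a day.

    `command` defaults to recollindex, which is assumed here to be
    the indexer on your PATH; adjust it if your setup differs.
    """
    if not (0 <= hour <= 23 and 0 <= minute <= 59):
        raise ValueError("hour must be 0-23 and minute 0-59")
    # crontab field order: minute hour day-of-month month day-of-week command
    return f"{minute} {hour} * * * {command}"


if __name__ == "__main__":
    # Hypothetical schedule: every day at 3:30am, when the desktop is idle.
    print(cron_line(3, 30))
```

The printed line could be pasted into the file opened by crontab -e. Choosing an hour when the machine is on but unused is the point of the exercise.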
Before using Recoll, you might also check the list of supporting packages [6] that Recoll requires to read certain files. Many of these packages are probably already on most desktop computers, but if you need support for a specific file format, taking a moment to check is only sensible.
You can customize Recoll further by setting a keyboard binding or adding other customizations described in the manual, but the steps above are the minimum configuration you need to set up Recoll.
Simple and Advanced Searches
After basic configuration, simple searches are as easy as entering a term in the search field on the right side of the toolbar and pressing the Enter key (see Figure 1). By default, searches use stemming, so that searching for order will also return results like orders and ordered. Because Recoll has already indexed the files, results return in a matter of seconds.
As you scroll down the results window, clicking an Open link opens the file in its native application. Recoll is free software running on Linux, so you cannot open MS Word files in Microsoft Office; instead, they open in the Antiword document reader.
If this simple search does not help you locate a file, you can refine it in a number of ways. To start, you can get results more quickly if, instead of accepting the default choice All, you choose one of the half dozen common file types, such as media (graphic or sound), presentation, spreadsheet, or text. Alternatively, you can eliminate common file types by selecting other.
Another option is to refine a query. For example, selecting File name rather than the default Any term speeds search results by searching only for file names and not scanning file contents. More elaborately, by selecting Query language, you can narrow the search in other ways, many of which might be familiar to you from web searches. For example, prefacing a search term with a minus sign (-) excludes the term from the results; terms can also be separated by OR or AND for multiple simultaneous searches. In files with metadata, Recoll also has the ability to search for some of the more common fields. For example, you can search such formats by prefacing the search term with title:, author:, ext: (extension), or dir: (directory path), or restrict results by date with date: followed by a date in YYYY-MM-DD format.
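The query language is easier to remember once you see a few strings assembled. The following Python sketch composes a query from plain terms, excluded terms, and field prefixes such as author: and ext:. The function name build_query and the sample values are hypothetical; the result is simply the text you would type into the search field with Query language selected.

```python
def build_query(terms=(), exclude=(), **fields):
    """Assemble a Recoll-style query string.

    terms   -- words that must appear (space-separated terms are ANDed)
    exclude -- words to exclude, each prefixed with a minus sign
    fields  -- field prefixes such as author="byfield" or ext="pdf"
    """
    parts = list(terms)
    parts.extend(f"-{word}" for word in exclude)
    for name, value in fields.items():
        # Quote multi-word values so they stay attached to their field.
        if " " in value:
            value = f'"{value}"'
        parts.append(f"{name}:{value}")
    return " ".join(parts)


if __name__ == "__main__":
    # Hypothetical search: PDFs mentioning recoll but not baloo.
    print(build_query(["recoll"], exclude=["baloo"], author="byfield", ext="pdf"))
```

Building the string in small pieces like this makes it obvious which part of a query is narrowing the results too much when a search comes back empty.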
As in a web search, quotation marks define a phrase and the exact order in which the words must appear. However, you can also add a variety of letters immediately after the last quotation mark (i.e., with no space between) to refine a search further. For instance, adding C turns on case sensitivity, and adding D turns on diacritics sensitivity.
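As a small illustration of where the modifier letters go, this Python sketch wraps a phrase in quotation marks and appends the letters directly after the closing quote. The helper name phrase is made up for this example, and only the C and D modifiers mentioned above are covered.

```python
def phrase(words, case_sensitive=False, diacritics_sensitive=False):
    """Quote a phrase and append modifier letters for a phrase search."""
    modifiers = ""
    if case_sensitive:
        modifiers += "C"
    if diacritics_sensitive:
        modifiers += "D"
    # The letters follow the closing quote with no space in between.
    return f'"{words}"{modifiers}'


if __name__ == "__main__":
    print(phrase("Total Recoll", case_sensitive=True))
```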
Finally, Recoll also supports basic wild cards: an asterisk (*) for any number of characters, a question mark for a single character, and a set of characters in square brackets (e.g., [123]) to match any one of the specified characters. However, as the manual warns, wild cards should be used sparingly, especially at the start and end of search terms. Although wild cards might increase the likelihood that the term you are looking for will be in the search results, they can also slow the search and return many more unwanted results to scroll through.
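To see why a leading wild card casts a wider net than a trailing one, you can experiment with Python's fnmatch module, which follows the same shell-style rules for *, ?, and square brackets. This is only a stand-in for illustration, not how Recoll expands wild cards internally, and the word list is invented.

```python
from fnmatch import fnmatchcase

# Invented vocabulary standing in for the terms in an index.
WORDS = ["order", "ordered", "orders", "preorder", "border", "odor"]


def expand(pattern, vocabulary=WORDS):
    """Return the vocabulary words that match a shell-style pattern."""
    return [word for word in vocabulary if fnmatchcase(word, pattern)]


if __name__ == "__main__":
    print(expand("order*"))     # trailing wild card
    print(expand("*order"))     # leading wild card pulls in unrelated words
    print(expand("o?or"))       # question mark matches exactly one character
    print(expand("order[sx]"))  # brackets match one of the listed characters
```

With this list, the leading wild card matches border as well as preorder, which is the kind of unwanted result the manual's warning is about.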
The only trouble with some of these options is that you either have to memorize them or run Recoll with the manual open in another window. You might prefer to run Tools | Advanced search instead, which offers all these options in a graphical interface (Figure 4).
The mouseovers in the Advanced search dialog are extremely detailed and provide more than enough information to let you use the options in the Find and Filter features with a minimum of memorization.
Another way to save time with Recoll is to click Tools | Document history, which records the last 100 documents opened from the search results. Because of its limited number of records, calling up the document history (Figure 5) is often quicker than repeating a search.
The Problem of Integration
As an alternative to Nepomuk or Baloo, Recoll has at least two advantages. First, its interface is simple to use and, for many users, probably less intimidating as well. Second, its advanced search options are better integrated into the interface, and many are close enough to those of web browsers that they are easy to learn.
Unfortunately, Recoll is not integrated into the latest versions of KDE Plasma. The manual does discuss using a Unity Lens and an obsolete KRunner plugin, but what Recoll could really use is integration into the Dolphin file manager. If such a thing existed, it would show those who have had trouble with previous search indexers exactly what was intended.
As things are, non-integration is a small price to pay for Recoll's speed and convenience. As long as you are willing to memorize some of its advanced features, Recoll is a powerful tool that is totally undeserving of its obscurity. Like KRunner, it comes very close to being a superior alternative to a standard file manager, especially on large drives.
Info
[1] Nepomuk: https://en.wikipedia.org/wiki/NEPOMUK_%28framework%29
Author
Bruce Byfield is a computer journalist and a freelance writer and editor specializing in free and open source software. In addition to his writing projects, he also teaches live and e-learning courses. In his spare time, Bruce writes about Northwest coast art. You can read more of his work at http://brucebyfield.wordpress.com