Query Structure

Each Recoll query can contain several elements (Figure 4). Each element is composed of a value (e.g., "linux") and an optional field that describes where Recoll should look for that value. Field name and value are separated by a colon.

Figure 4: Another built-in way to learn how Recoll works: A handy cheat sheet appears showing information for the Query language search mode.

If a query contains several elements, Recoll will only return files that match all of them, unless you tell it otherwise using parentheses and logic operators. Consider the following examples:

Linux Stallman Torvalds
Linux AND Stallman OR Torvalds
(Linux AND Stallman) OR Torvalds
Linux Stallman -Torvalds
"Lord of the Rings"

The first query returns all documents whose name or contents contain all three of the listed terms. The second query returns only documents that contain Linux and either Stallman or Torvalds: This happens because the OR operator has higher priority than AND. You can set priorities as you want with parentheses, as in the third query, which returns documents that contain either the word Torvalds or both the words Linux and Stallman. The fourth query, thanks to the negation operator (-), finds the documents that contain both the words Linux and Stallman but not the word Torvalds. The double quotes in the last query tell Recoll to search for the complete phrase Lord of the Rings.
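The precedence rule can be illustrated with a toy model in Python. This is only a sketch, not Recoll's parser: the three sample documents and the has and search helpers are invented for illustration. Python's own and/or operators have the opposite precedence, so the grouping is spelled out with explicit parentheses.

```python
# Toy model: each document is represented by the set of lowercase words it
# contains. The documents and helpers are hypothetical, for illustration only.
docs = {
    "a.txt": {"linux", "stallman"},
    "b.txt": {"linux", "torvalds"},
    "c.txt": {"torvalds"},
}

def has(words, term):
    """Return True if the term occurs in the document's word set."""
    return term.lower() in words

def search(predicate):
    """Return the sorted names of all documents for which predicate is true."""
    return sorted(name for name, words in docs.items() if predicate(words))

# Linux AND Stallman OR Torvalds, read as Linux AND (Stallman OR Torvalds)
query2 = search(
    lambda w: has(w, "Linux") and (has(w, "Stallman") or has(w, "Torvalds"))
)

# (Linux AND Stallman) OR Torvalds
query3 = search(
    lambda w: (has(w, "Linux") and has(w, "Stallman")) or has(w, "Torvalds")
)

print(query2)
print(query3)
```

With these sample documents, the second query matches a.txt and b.txt, whereas the third also matches c.txt, which mentions only Torvalds.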

The next things to know are that you can use wildcards (as in "*ter" to find computer, commuter, etc.), even though they may seriously slow down Recoll, and that stemming is used by default, except for words that begin with a capital letter. For example, searching for linux will also return documents containing words like linuxian or linuxer. To avoid that and only get the exact word you want, enclose it in double quotes.

Recoll queries become more powerful when you pair double quotes with modifiers to allow proximity searches. To understand the concept of a proximity search, compare the following queries:

"Linux rules"
"Linux rules"p
"Linux rules"po10

The first query, thanks to the quotes, finds only the documents containing the exact phrase Linux rules. The p modifier attached (without spaces!) to the second statement makes Recoll find any document that contains those two words next to each other, but in any order. The o modifier in the last query, followed by a number, asks Recoll for all the documents in which those words appear in any order, but with up to 10 other words between them, as in Linux is an operating system that rules or Linux really, really rules!
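The difference between the three queries may be easier to see in a small Python model. This is a sketch of the behavior as described above, not of how Recoll or its index actually evaluates phrases. The function names and the slack parameter are made up for this example, and the two search words are assumed to be different.

```python
import re

def tokenize(text):
    """Split text into lowercase words, ignoring punctuation."""
    return re.findall(r"\w+", text.lower())

def phrase_match(text, first, second):
    """True if `first` is immediately followed by `second` (exact phrase)."""
    words = tokenize(text)
    return any(a == first and b == second for a, b in zip(words, words[1:]))

def proximity_match(text, first, second, slack=0):
    """True if both words occur, in either order, with at most `slack`
    other words between them."""
    words = tokenize(text)
    pos_first = [i for i, w in enumerate(words) if w == first]
    pos_second = [i for i, w in enumerate(words) if w == second]
    return any(
        abs(i - j) - 1 <= slack for i in pos_first for j in pos_second
    )

samples = [
    "Linux rules",
    "rules Linux",
    "Linux is an operating system that rules",
    "Linux really, really rules!",
]
for text in samples:
    print(
        text,
        phrase_match(text, "linux", "rules"),         # like "Linux rules"
        proximity_match(text, "linux", "rules"),      # like "Linux rules"p
        proximity_match(text, "linux", "rules", 10),  # like "Linux rules"po10
    )
```

Only the first sample satisfies the exact phrase test, the first two satisfy the unordered test without slack, and all four satisfy the test with a slack of 10.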

Basic Query Filters

In addition to describing which terms you want and in which combinations, you can specify where, in a document, Recoll should look for the search string. For instance, you can look in title, author, recipient, keyword, ext (that is, file extension), filename, and dirname. To find all documents with the phrase Linux rules in the title, or whose author is Marco, just enter:

title:"Linux rules" OR author:Marco

As this example shows, Recoll filters are easy to use – as long as you are aware of some basic properties of file formats and differences among them. Keywords, for example, are supported only in a few formats, like OpenDocument or markdown with front matter. The author field in Recoll is equivalent to the From address in email files and the Author in OpenDocument texts. In the same spirit, both the Subject of an email and the contents of the HTML <title> tag are the title, as far as Recoll is concerned.
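If you generate queries from a script, field-qualified elements can be assembled with a few lines of Python. The following is a minimal sketch: the element helper is hypothetical, it only produces the query string, and it assumes values that do not themselves contain double quotes.

```python
def element(value, field=None):
    """Build one query element, quoting values that contain spaces.

    Hypothetical helper: it only formats text and does not check whether
    Recoll knows the given field name.
    """
    quoted = f'"{value}"' if " " in value else value
    return f"{field}:{quoted}" if field else quoted

# Rebuild the title/author example by joining two elements with OR.
query = " OR ".join([
    element("Linux rules", field="title"),
    element("Marco", field="author"),
])
print(query)
```

The resulting string is the same title/author query shown earlier and can be pasted into the Recoll search field.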

The dir filter restricts searches to specific directories or parts of the filesystem:

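dir:reports/drafts

makes Recoll look only in folders like Documents/reports/drafts, Documents/archive/reports/drafts, and so on. A relative path like this one matches anywhere in the directory tree, whereas an absolute path (one starting with a slash) restricts the search to that exact location and everything below it.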

Special Filters

Recoll has four more filters that all work in the same way, regardless of how you mix them with other filters by means of parentheses or boolean operators. These filters are rclcat, mime, size, and date. rclcat defines what file categories you are interested in. The available categories are those in the main GUI window: text, spreadsheet, presentation, message, media, and other. The mime filter indicates specific types of files:

-mime:text/plain

means find all files except plain text files.

Finding only files with sizes in a certain range is as easy as writing, for example:

size>100 size<1000

to find only files that are larger than 100 bytes and smaller than 1000 bytes.

The Recoll query language also understands dates and periods – in other words, intervals of time. Dates have the format YYYY-MM-DD, but only the year is mandatory. Periods are strings beginning with a capital P (as in "period") followed by any combination of number of years (Y), months (M), and days (D):

P2Y10D = 2 years and 10 days
P2M5D = 2 months and 5 days
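To make the period notation concrete, here is a minimal Python sketch that splits such a string into its parts. It is not Recoll's own parser. It assumes the components appear in the order years, months, days, as in the examples above, and the parse_period name is invented for this illustration.

```python
import re

# Simplified period syntax: a capital P followed by optional year, month,
# and day counts, in that order (an assumption made for this sketch).
PERIOD_RE = re.compile(r"P(?:(\d+)Y)?(?:(\d+)M)?(?:(\d+)D)?")

def parse_period(text):
    """Return (years, months, days) for a period string such as 'P2Y10D'."""
    match = PERIOD_RE.fullmatch(text)
    if match is None or not any(match.groups()):
        raise ValueError(f"not a valid period: {text!r}")
    years, months, days = (int(g) if g else 0 for g in match.groups())
    return years, months, days

print(parse_period("P2Y10D"))
print(parse_period("P2M5D"))
```

Components that are missing from the string come back as zero, so P2Y10D yields two years, no months, and 10 days.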

Combining dates and periods with a slash tells Recoll to find only documents created or modified within a specified date range:

date:2018-04-01/2018-04-30
date:2018-04-01/P30D
date:P3M/
date:/2017-12-25

The first filter indicates the whole month of April 2018. The second filter describes essentially the same interval (a start date plus 30 days), but with a syntax much easier to implement in a shell script. The third filter means "anything that is no more than three months older than the current date" (assuming you have no files with dates in the future). The last filter means "anything dated before Christmas of 2017."
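Filters like these are easy to generate from a script. The following Python sketch builds the four variants shown above from date objects and numbers. The function names are hypothetical, and the code only formats strings; it does not check how Recoll interprets them.

```python
from datetime import date

def date_range_filter(start, end):
    """Filter for an explicit range, e.g. date:2018-04-01/2018-04-30."""
    return f"date:{start.isoformat()}/{end.isoformat()}"

def date_period_filter(start, days):
    """Filter for a start date plus a period in days, e.g. date:2018-04-01/P30D."""
    return f"date:{start.isoformat()}/P{days}D"

def recent_filter(months):
    """Filter for the last given number of months, e.g. date:P3M/."""
    return f"date:P{months}M/"

def until_filter(end):
    """Filter for everything up to a date, e.g. date:/2017-12-25."""
    return f"date:/{end.isoformat()}"

print(date_range_filter(date(2018, 4, 1), date(2018, 4, 30)))
print(date_period_filter(date(2018, 4, 1), 30))
print(recent_filter(3))
print(until_filter(date(2017, 12, 25)))
```

Because isoformat() always produces YYYY-MM-DD, the generated strings follow the date format described above.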
