Full-text search with Solr, Xapian, and Sphinx
Full-text search engines like Solr, Xapian, and Sphinx make the daily data chaos on your hard disk searchable – and they even cooperate with relational databases.
Creating a list of 10 websites that discuss the latest Ubuntu release is simple: just use Google or another one of the popular web search engines. But if you host an information-packed website yourself and want to offer your own search function for it, you need a full-text search tool. Full-text search engines have other benefits for the user and developer. If you are building a custom application or DVD, for instance, you might want to include a full-text search tool to put important information at the user's fingertips. Full-text search delves the depths of random or systematically arranged data for one or more search terms. You will want the search results sorted by relevance, and you will want the results in a split second.
Luckily, admins and developers need not reinvent the wheel: Solr, Xapian, and Sphinx are open source projects that index and analyze data. But how do you define data? You can roughly distinguish two states in which the search engines find information: structured and unstructured.
Structured data has a fixed, predefined structure that allows it to be easily recognized, categorized, and processed with the help of applications. The most common form of structured data is a relational database, with data organized in rows and columns that, in turn, are connected in the form of tables. In contrast to this, unstructured data lacks a data model. Such data sets are often so ambiguous that a program cannot simply process them because the data, facts, and figures are totally mixed. Unstructured data is the domain of search engines that can at least arrange the chaotic data semantically.
Read full article as PDF:
New release comes with better semantic search and improvements to Kontact.
Annual code quality report shows FOSS is more secure at all project size levels.
The Raspberry Pi Foundation has announced an even smaller version of the tiny computer that will fit into a DIMM slot.
A new class of problems lets a malicious app pre-configure an invisible privilege update.
New Hack language adds static typing and other conveniences.
New crypto policy system will offer easier configuration and more uniform security.
Ubuntu founder denounces insecurity in proprietary, close-source software blobs.
Vulnerability affects many Linux web servers
The Bavarian capital shuns Microsoft, Google, and other alternatives to implement an open source groupware solution.
Phone vendor partnerships bring Mark Shuttleworth's dream of Ubuntu on a phone a step closer to reality.