An indexing search engine with Nutch and Solr
CMS, wikis, text files … modern companies store important data in many different places, and that data must be accessible down to the tiniest detail through a single search. Commercial software vendors such as Google [1] offer tools that will index the data and store the index on an external server. But many organizations prefer to keep control of the search capabilities – for security and privacy reasons, but also to add flexibility and promote innovation and customization.
A handy constellation of open source tools from the Apache project will help you build your own search index for the assorted documents and data on your network: Nutch, Solr, Apache, and Lucene.
Nutch [2] is a powerful web crawler, and Apache Solr [3] is a search engine based on Apache Lucene [4]. You can combine Nutch with Solr to create a complete search engine – a miniature Google, if you like.
[...]
Buy this article as PDF
(incl. VAT)
Buy Linux Magazine
Subscribe to our Linux Newsletters
Find Linux and Open Source Jobs
Subscribe to our ADMIN Newsletters
    Support Our Work
Linux Magazine content is made possible with support from readers like you. Please consider contributing when you’ve found an article to be beneficial.
 
	
News
- 
		    					    		    KDE Unleashes Plasma 6.5The Plasma 6.5 desktop environment is now available with new features, improvements, and the usual bug fixes. 
- 
		    					    		    Xubuntu Site Possibly HackedIt appears that the Xubuntu site was hacked and briefly served up a malicious ZIP file from its download page. 
- 
		    					    		    LMDE 7 Now AvailableLinux Mint Debian Edition, version 7, has been officially released and is based on upstream Debian. 
- 
		    					    		    Linux Kernel 6.16 Reaches EOLLinux kernel 6.16 has reached its end of life, which means you'll need to upgrade to the next stable release, Linux kernel 6.17. 
- 
		    					    		    Amazon Ditches Android for a Linux-Based OSAmazon has migrated from Android to the Linux-based Vega OS for its Fire TV. 
- 
		    					    		    Cairo Dock 3.6 Now Available for More CompositorsIf you're a fan of third-party desktop docks, then the latest release of Cairo Dock with Wayland support is for you. 
- 
		    					    		    System76 Unleashes Pop!_OS 24.04 BetaSystem76's first beta of Pop!_OS 24.04 is an impressive feat. 
- 
		    					    		    Linux Kernel 6.17 is AvailableLinus Torvalds has announced that the latest kernel has been released with plenty of core improvements and even more hardware support. 
- 
		    					    		    Kali Linux 2025.3 Released with New Hacking ToolsIf you're a Kali Linux fan, you'll be glad to know that the third release of this famous pen-testing distribution is now available with updates for key components. 
- 
		    					    		    Zorin OS 18 Beta Available for TestingThe latest release from the team behind Zorin OS is ready for public testing, and it includes plenty of improvements to make it more powerful, user-friendly, and productive. 




