Safer Internet Searches
YaCy as a Solution
One of the biggest differences between YaCy and Searx is that YaCy runs independently of other search engines. YaCy creates its own distributed index. As with torrent clients that use distributed hash tables (DHTs), each peer keeps its own part of the tables.
To run YaCy, you need to set the amount of space that you will allow YaCy to occupy on your system, although the installation script has a default. Like Searx, you can use a Docker image to run YaCy. YaCy offers three different Docker images: amd64, arm64v8, and arm32v7.
To install YaCy with Docker, use the standard values found on YaCy's web page:
docker run -d --name yacy -p 8090:8090 -p 8443:8443 -v yacy_data:/opt/yacy_search_server/DATA --log-opt max-size=200m --log-opt max-file=2 yacy/yacy_search_server:latest
These standard values help you manage resource usage. Once the server is running, you can also access a management interface from your browser. If you want to be able to use the management interface from another computer, you need to set an administrator password. If you lose the password, you will need to go back to the command line in the root of the YaCy directory and run:
bin/password.sh
This command will handle changing the password, whether your server is running or not.
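Before opening the management interface, it can help to confirm that the server is answering. The following is a minimal sketch, assuming curl is installed and YaCy is published on port 8090 as in the Docker command above. The function name wait_for_yacy is made up for this example and is not part of YaCy.

```shell
#!/bin/sh
# Poll a URL until it responds or the attempts run out.
# Arguments: URL, number of attempts (default 30), seconds between attempts (default 2).
# wait_for_yacy is a hypothetical helper, not something YaCy ships.
wait_for_yacy() {
    url=$1
    attempts=${2:-30}
    delay=${3:-2}
    i=0
    while [ "$i" -lt "$attempts" ]; do
        # -s: silent, -f: treat HTTP errors as failure, -o: discard the body
        if curl -sf -o /dev/null --max-time 5 "$url"; then
            echo "YaCy is answering at $url"
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    echo "No answer from $url after $attempts attempts" >&2
    return 1
}
```

For example, wait_for_yacy http://localhost:8090/ polls for up to a minute with the defaults. If it gives up, docker logs yacy shows the server's output.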
You can also clone the GitHub repository and compile the binaries [11]. Confusingly, the GitHub repo does not mention at the top that you must compile before running the standard script (startYACY.sh).
YaCy needs Java. When you download the GitHub repo, you need ant to compile. You'll find the details further down in the GitHub document. If you need to install YaCy on multiple machines, you can create a Debian package directly with the compiler.
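The build from source can be sketched as follows. This is an outline under assumptions, not the project's official procedure: the repository URL and the ant clean all target are what the GitHub README describes at the time of writing, so check them against the current README first. The helper names missing_tools and build_yacy are invented for this example.

```shell
#!/bin/sh
# Print the name of every listed command that is not on the PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Clone, compile, and start YaCy.
# Assumptions: repository URL and ant target as given in the README.
build_yacy() {
    missing=$(missing_tools git java ant)
    if [ -n "$missing" ]; then
        echo "Install these first:" >&2
        echo "$missing" >&2
        return 1
    fi
    # Run in a subshell so the caller's working directory is unchanged.
    (
        git clone https://github.com/yacy/yacy_search_server.git &&
            cd yacy_search_server &&
            ant clean all &&
            ./startYACY.sh
    )
}
```

Nothing runs until you call build_yacy. The prerequisite check comes first so that a missing Java or ant installation is reported before anything is downloaded.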
Configuring YaCy
Whichever method you choose for installation, you need to set up some values to get the most out of your system. First, you should specify how you want to use YaCy. For the most basic configuration, you set an interface language, name, and search use case (Figure 6).
The search use case sets the type of search. An internal search will just find files on your network; more common is a search of the entire YaCy community.
In the YaCy Administration dialog, you can edit all your settings, including working memory, disk space, and more.
Clicking on RAM/Disk Usage & Updates lets you adjust the settings for working memory and disk space. The default memory for the Java Virtual Machine (JVM) is set to 600MB.
The other values in the RAM/Disk Usage & Updates dialog save you from running out of disk space. You can use the Steady-state minimum option to disable crawls when free disk space falls below a specified minimum number of megabytes. Disk space only becomes an issue when you have the ports open and contribute to the shared index or when you start your own crawl. HTCache configuration lets you control the size of the content retrieved via HTTP or FTP; the default size is 4GB.
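The rule behind the Steady-state minimum option can be illustrated outside YaCy. The sketch below is not YaCy's own code. It only shows the comparison described above: crawls are allowed while free space is at or above the minimum. The function names free_mb and crawl_allowed and the value 3072 are made up for the example.

```shell
#!/bin/sh
# Print the free space, in megabytes, on the filesystem that holds a path.
free_mb() {
    # -P forces one POSIX-format line per filesystem, -k reports kilobytes
    df -Pk "$1" | awk 'NR == 2 { print int($4 / 1024) }'
}

# Succeed if a crawl may run: free space is at or above the minimum.
# Arguments: free megabytes, minimum megabytes.
crawl_allowed() {
    free=$1
    minimum=$2
    [ "$free" -ge "$minimum" ]
}
```

For example, crawl_allowed "$(free_mb /)" 3072 || echo "crawls would be paused" reports whether the root filesystem has dropped below a hypothetical 3GB minimum.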
Putting YaCy to Work
Once you've configured YaCy, you can start a crawl from any web address. From the Administration dialog, click on Load Web pages, Crawler and enter the web address. YaCy will look through all the documents on the server and index them for you. You can use this to index your own internal network or add your new web page to the common index.
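Once a crawl has run, one way to check the results without a browser is to query the peer over HTTP. The sketch below assumes the peer offers its JSON search interface at /yacysearch.json with a query parameter, as described in YaCy's API documentation. Verify the path against your installed version. The helper yacy_search_url is hypothetical and encodes only spaces, so queries with other special characters need proper URL encoding.

```shell
#!/bin/sh
# Build the JSON search URL for a YaCy peer.
# Arguments: base URL of the peer, then the search terms.
# Assumption: the /yacysearch.json path and its query parameter.
yacy_search_url() {
    base=$1
    shift
    # Join the terms with spaces, then turn each space into "+".
    query=$(printf '%s' "$*" | tr ' ' '+')
    printf '%s/yacysearch.json?query=%s\n' "$base" "$query"
}
```

For example, curl -s "$(yacy_search_url http://localhost:8090 linux magazine)" fetches the results for the two search terms from a local peer.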
In addition to private searching, YaCy lets you share your search engine with others. You can customize YaCy for your website. Click on Portal Configuration to set color, title text, and even the logo that appears above the search box. From here, you also can see what the search engine will look like with your customizations.
If you use YaCy seriously, you should consider contributing to the YaCy index. To do this, you need to open your port to other peers on the network. In particular, you'll need to open port 8090, which is usually blocked by default.