Building a Web Spider with Ruby
Spider on the Web
Ruby is an elegant, harmonious language whose parts work together effectively, and it significantly reduces a developer’s burden. We’ll show you how to use Ruby to build a quick and simple web spider application.
Ruby is a scripting language developed by Yukihiro Matsumoto and released as free, open source software. The language has an excellent set of string manipulation and networking libraries, which makes it a great choice for writing web spiders. If you are not familiar with web spiders, they are programs designed to traverse the web automatically. Search engines use web spiders to add web pages to their indexes; companies like Netcraft use spiders to gather statistics on web servers. You can use a web spider to retrieve information automatically from almost any website; in this article, we’ll discuss how to use Ruby to retrieve information from LiveJournal, a popular weblog provider. You can extend these techniques to virtually any website that provides public information.
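To make the idea of "traversing the web" concrete, here is a minimal sketch of a spider that uses only Ruby's standard library. It is not the code from the article: the method names fetch_page, extract_links, and crawl are hypothetical, the regular expression is a deliberately naive stand-in for a real HTML parser, and the sketch assumes you only want to follow links on the starting host.

```ruby
require 'net/http'
require 'uri'
require 'set'

# Download a page body. Returns nil on a non-success response or a
# network error. (fetch_page is a hypothetical helper name.)
def fetch_page(url)
  response = Net::HTTP.get_response(URI.parse(url))
  response.is_a?(Net::HTTPSuccess) ? response.body : nil
rescue StandardError
  nil
end

# Pull href values out of anchor tags and resolve them against the
# URL of the page they came from. The regex is a naive assumption;
# it ignores unquoted attributes and drops fragment parts.
def extract_links(html, base_url)
  hrefs = html.scan(/<a\s[^>]*href\s*=\s*["']([^"'#]+)/i).flatten
  links = hrefs.map do |href|
    begin
      URI.join(base_url, href.strip).to_s
    rescue URI::Error
      nil # skip hrefs that are not valid URIs
    end
  end
  links.compact.uniq
end

# Breadth-first traversal starting at start_url. Stops after
# max_pages pages and never leaves the starting host.
def crawl(start_url, max_pages = 10)
  host = URI.parse(start_url).host
  queue = [start_url]
  seen = Set.new(queue)
  visited = []
  until queue.empty? || visited.size >= max_pages
    url = queue.shift
    html = fetch_page(url)
    next if html.nil?
    visited << url
    extract_links(html, url).each do |link|
      next unless link.start_with?('http')
      next unless URI.parse(link).host == host
      queue << link if seen.add?(link)
    end
  end
  visited
end
```

Calling crawl('http://example.com/', 5) would return the URLs of up to five pages that were fetched successfully. A spider you run against a real site should also honor robots.txt and pause between requests; both are left out here to keep the sketch short.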