Filtering log messages with Splunk
Search with Remote Control
The post()
method in line 57 sends a search command to the Splunk server. In contrast to the web GUI, searches submitted via the API need to start with the search
command. Besides the NOT eventtype=chatter
filter that I already described, it defines the restriction earliest=-24h
; that is, it only asks for events in the past 24 hours. As defined by the output_mode
parameter, I want Splunk to return the results in JSON format. If you prefer to avoid lengthy emails, you will also want to restrict the number of hits to 50 using limit=50
.
The from_json()
function from the JSON CPAN module then converts the results to Perl data structures one line at a time in line 82. Three fields are crucial for the mail to be sent: the time stamp of the log entry with the _time
key, the logfile in source
, and the original log line in _raw
.
The Net::SMTP module from CPAN sends the mail with the results of the search to the target defined in $to_email
. The SMTP server $smtp_server
was set previously in line 23.
For Dinosaurs and Hipsters
Complex HTML messages annoy old codgers like me that use text-based email readers such as Pine. Conversely, a plain text email is too old school for young, dynamic Outlook and Thunderbird mouse pushers. To mediate between the two worlds, Listing 1 formats the tabular results of the query in ASCII using the CPAN Text::ASCIITable module. To prevent the timestamp column from becoming too long, and to wrap it instead, line 76 limits its width to a maximum of 10 characters. The same thing applies to the column with the log entry, which wraps to a line length of 34, keeping messages readable, even on mobile phones.
Some modern mail readers prefer HTML, and to satisfy them, line 96 calls the CPAN Email::MIME module. It wraps the existing ASCII text in inline HTML, surrounded by simple pre
tags. Thus, the results are acceptable both in Alpine (Figure 7) and in Gmail (Figure 8).
The script can be easily extended to include tests that compare the values found with previously set limits and then only send messages when the limit is exceeded. This can happen once per day as a summary or at five-minute intervals for rapid-alert email messages.
Some open source alternatives to Splunk, such as Logstash [6] [7] and Graylog2 [8], are also available, but so far they do not come close to Splunk in terms of ease of use and scalability.
Infos
- Splunk: http://www.splunk.com
- Hadoop: http://hadoop.apache.org
- "Giant Data: MapReduce and Hadoop" by Thomas Hornung, Martin Przyjaciel-Zablocki, and Alexander Schätzle: http://www.admin-magazine.com/HPC/Articles/MapReduce-and-Hadoop
- Listings for this article: ftp://ftp.linux-magazin.de/pub/listings/magazine/155
- "Splunk: Intro REST API tutorial": http://dev.splunk.com/view/SP-CAAADQT
- Logstash: http://logstash.net
- "Centralized Log Archiving with Logstash" by Martin Loschwitz, Linux Magazine, June 2013, p. 60: http://www.linux-magazine.com/Issues/2013/151/Logstash/(language)/eng-US
- Graylog2: http://graylog2.org
« Previous 1 2 3
Buy this article as PDF
(incl. VAT)