Understanding data stream processing

All Is Flux

Article from Issue 244/2021
Author(s):

Batch processing strategies won't help if you need to process large volumes of incoming data in real time. Stream processing is a promising alternative to conventional batch techniques.

Stream processing, also known as data stream processing, has been around since the early 1970s, but it has seen a big resurgence of interest in recent years. To understand why stream processing is on the rise, first consider how a conventional program processes data. Traditional software reads a chunk of data all at once and then performs operations on it. This batch technique is fine for certain types of problems, but in other use cases, it is quite limiting – especially in the modern era of parallel processing and big data.

Stream processing instead envisions the data as a continuous flow. New events are processed as they occur. You can envision the program as something like a factory assembly line – a stream of incoming data is analyzed, manipulated, and transformed as it passes through the system. In some cases, parallel streams might arrive separately for the program to analyze, process, and merge together.

Stream processing excels at use cases that require real-time processing of incoming data from large datasets, such as fraud detection software for a credit card company or a program that manages and interprets data from IoT environmental sensors.

[...]

Use Express-Checkout link below to read the full article (PDF).

Buy this article as PDF

Express-Checkout as PDF
Price $2.95
(incl. VAT)

Buy Linux Magazine

SINGLE ISSUES
 
SUBSCRIPTIONS
 
TABLET & SMARTPHONE APPS
Get it on Google Play

US / Canada

Get it on Google Play

UK / Australia

Related content

  • Apache StreamPipes

    You don't need to be a stream processing expert to create useful custom solutions with Apache StreamPipes. We'll use StreamPipes to build a simple app that calculates when the International Space Station will fly overhead.

  • IBM System S: Stream Computing

    IBM's new System S analyzes business data at their creation.

  • FAQ – Apache Spark

    Spread your processing load across hundreds of machines as easily as running it locally.

  • ApacheCon 2009 Free Live Stream

    The Apache Software Foundation (ASF) is holding ApacheCon US 2009 from November 2-6 in Oakland, California. The foundation for a free webserver is also celebrating its 10th birthday. In honor of this 10th birthday, the celebration includes three days of the conference program available as a FREE Live stream.

  • DCCP

    The DCCP protocol gives multimedia developers a powerful alternative to TCP and UDP.

comments powered by Disqus

Direct Download

Read full article as PDF:

Price $2.95

News