Understanding data stream processing
All Is Flux

Batch processing strategies won't help if you need to process large volumes of incoming data in real time. Stream processing is a promising alternative to conventional batch techniques.
Stream processing, also known as data stream processing, has been around since the early 1970s, but it has seen a big resurgence of interest in recent years. To understand why stream processing is on the rise, first consider how a conventional program processes data. Traditional software reads a chunk of data all at once and then performs operations on it. This batch technique is fine for certain types of problems, but in other use cases, it is quite limiting – especially in the modern era of parallel processing and big data.
Stream processing instead treats the data as a continuous flow. New events are processed as they occur. You can picture the program as something like a factory assembly line: a stream of incoming data is analyzed, manipulated, and transformed as it passes through the system. In some cases, parallel streams might arrive separately for the program to analyze, process, and merge.
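To make the assembly-line picture concrete, the minimal Python sketch below chains three generator stages into a single pipeline. Everything in it is invented for illustration (the sensor_events() source, the field names, the 40 degree threshold); a real deployment would consume events from a message broker or socket rather than a simulated source.

import itertools
import random
import time

def sensor_events():
    """Simulate an unbounded stream of temperature readings."""
    while True:
        yield {"sensor": random.choice(["a", "b"]),
               "temp_c": random.uniform(15.0, 45.0)}
        time.sleep(0.1)

def to_fahrenheit(events):
    """Transformation stage: enrich each event as it passes through."""
    for event in events:
        event["temp_f"] = event["temp_c"] * 9 / 5 + 32
        yield event

def over_threshold(events, limit_c=40.0):
    """Analysis stage: keep only the events that need attention."""
    for event in events:
        if event["temp_c"] > limit_c:
            yield event

# Pull the first five alerts off the (otherwise endless) stream.
pipeline = over_threshold(to_fahrenheit(sensor_events()))
for alert in itertools.islice(pipeline, 5):
    print(f"alert: sensor {alert['sensor']} at {alert['temp_c']:.1f} °C")

Because each stage is a generator, an event moves through the whole pipeline the moment it arrives, like a part on a conveyor belt; no stage waits for a complete batch to accumulate.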
Stream processing excels at use cases that require real-time processing of high-volume incoming data, such as fraud detection software for a credit card company or a program that manages and interprets data from IoT environmental sensors.
[...]