IBM System S: Stream Computing

May 26, 2009

IBM's new System S analyzes business data as it is created.

System S processes different data streams simultaneously, hence the name "Stream Computing." Because the new software analyzes and processes data at creation time, it should benefit financial institutions, government agencies, and the transportation and retail industries, among others. IBM's labs have been working on the approach for seven years, and the company announced its availability at its recent annual investor briefing in New York. To develop the technology further, the company opened the IBM European Stream Computing Center in Dublin, Ireland, as a hub for research, customer support, and advanced testing.

IBM presents the current System S to users as a definitive first release. The new perpetual analytics model differs from the traditional one, in which data is first stored and only then analyzed. As the press release states, "Traditional computing models retrospectively analyze stored data and cannot continuously process massive amounts of incoming data streams that affect critical decision-making." A common application example for the new model is a web server that needs to process real-time location data for mapping purposes.
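The difference between the two models can be illustrated with a minimal Python sketch. It does not use System S, SPADE, or any IBM API. The function names and the sample readings are made up for illustration, and a running average stands in for whatever analysis a real application would perform.

```python
from typing import Iterable, Iterator, Sequence


def batch_average(stored: Sequence[float]) -> float:
    """Traditional model: the data set is stored first, then analyzed."""
    return sum(stored) / len(stored)


def streaming_average(stream: Iterable[float]) -> Iterator[float]:
    """Perpetual model: update the result as each value arrives.

    Only a count and a running total are kept, never the full history.
    """
    count = 0
    total = 0.0
    for value in stream:
        count += 1
        total += value
        yield total / count


if __name__ == "__main__":
    readings = [10.0, 20.0, 30.0]  # hypothetical sample values

    # Streaming: an up-to-date answer is available after every reading.
    for current in streaming_average(iter(readings)):
        print("running average:", current)

    # Batch: a single answer, available only once all data is stored.
    print("batch average:", batch_average(readings))
```

In the streaming version, a result is available after every incoming value, and memory use stays constant no matter how long the stream runs. The batch version cannot answer until the whole data set is on hand.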

There are a few avenues for this perpetual analytics approach. One is the Stream Processing Application Declarative Engine (SPADE), a programming language with its own runtime environment for applications that support Stream Computing. No specific knowledge of the underlying technology is necessary. Another avenue is the Semantic Solver in the IBM InfoSphere solution, which interprets data directly. The third avenue lets more advanced users work in an Eclipse-based IDE to develop applications that create data streams and types, which other components can then reuse or build new streams from.

Unfortunately, IBM seems loath to reveal much about the new technology. A data sheet on the subject is available only after an extensive registration (see Gallery). In response to our inquiry, a spokesperson at IBM Labs told us that System S is part of the InfoSphere product line and currently runs on 32-bit and 64-bit RHEL machines with Intel chips. She also revealed that the software was developed on Linux. Some open source software came into play, such as the C++ libraries of the Boost project, the Graphviz graph visualization software, and a few Perl modules (XmlSimple and XmlRegexp among them). Despite the naming similarities with IBM's X, I and Z systems, System S includes no hardware. It does, however, include SPADE along with developer and analysis tools. The InfoSphere family and its first offspring, System S, are a significant part of IBM's "Smarter Planet" initiative, which the company introduced at CeBIT 2009 in March.

Gallery (8 images)
