Hello Stream Processing

In analytics, organizations are coming to understand the shortcomings of batch-oriented data processing, and it is becoming clear that stream processing is inevitable in many real-life applications. Stream processing turns batch-oriented computation on its head. Batch follows a store-then-compute paradigm, whereas stream processing is compute-then-store: operations such as filter, join, enrich, aggregate, and group-by are performed in real time, before the data or aggregates are stored for historical analysis.

Let me show you an example of a simple stream processing application built on the Vitria Operational Intelligence (OI) platform. The intent is to tap into an order stream, filter for high-value orders, join them with a collection of VIP customers, enrich them with customer data, count the orders grouped by region, and look at the result for the last minute. Traditionally this is done by storing all order data, performing a daily or weekly ETL into a data warehouse, and then querying it, so the results are only as current as your ETL process.
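
To make the steps concrete, here is a minimal plain-Python sketch of the same logic. It is not Vitria OI code or its query language. The class name `VipOrderCounter`, the field names, and the 1000 cutoff for a "high-value" order are all assumptions made for illustration, and the sketch assumes orders arrive in timestamp order.

```python
from collections import Counter, deque


class VipOrderCounter:
    """Hypothetical sketch: count high-value VIP orders per region over a sliding window."""

    def __init__(self, vip_customers, min_amount=1000.0, window_seconds=60.0):
        self.vip_customers = vip_customers    # assumed reference data: customer_id -> attributes
        self.min_amount = min_amount          # assumed cutoff for a "high-value" order
        self.window_seconds = window_seconds  # "the last minute"
        self.window = deque()                 # (timestamp, region) pairs still inside the window
        self.counts = Counter()               # running count per region

    def on_order(self, order):
        """Process one order event and return the current per-region counts."""
        self._evict(order["ts"])
        if order["amount"] < self.min_amount:                    # filter
            return dict(self.counts)
        customer = self.vip_customers.get(order["customer_id"])  # join
        if customer is None:
            return dict(self.counts)
        region = customer["region"]                              # enrich
        self.window.append((order["ts"], region))
        self.counts[region] += 1                                 # aggregate, grouped by region
        return dict(self.counts)

    def _evict(self, now):
        # Drop orders that have aged out of the window and undo their counts.
        while self.window and now - self.window[0][0] >= self.window_seconds:
            _, region = self.window.popleft()
            self.counts[region] -= 1
            if self.counts[region] == 0:
                del self.counts[region]
```

Each call to `on_order` updates the answer incrementally as the event arrives, which is the compute-then-store idea: nothing has to be loaded into a warehouse before the per-region counts are available.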

This example is an archetypal design pattern for many common stream processing use cases, whether you are looking at top trending hashtags from Twitter streams, monitoring dropped calls for your most important customers from call detail records (CDRs), or measuring call handling time in the call center from service calls. However, most existing stream processing systems stop here: the computed analytics are not tied back into the decision process. Vitria takes it further by taking automated actions, alerting, or triggering workflows on the computed results, thereby not only enabling continuous and incremental computation but also triggering a response and tying it back to the decision process.

Some basic Vitria OI concepts as they relate to stream processing applications:

Event processing network (EPN): This is the query pipeline on streams (a stream is simply a sequence of tuples). Each query performs some operations on streams; you can fan out results from one query to other queries, as well as fan in results from multiple queries to a single query. You can also build a multi-stage pipeline for distributed scale-out processing (I will talk about that in a separate blog post).

Feed: This is the conduit for an event stream, a low-latency messaging layer that transports streaming data between the various Vitria OI components.

Live Collection: This is an in-memory collection of computed results from an EPN that can be queried incrementally. It can also snapshot time series data into a persistent store if needed.

Named Window: This is an in-memory collection of enrichment or reference data that can be operated on just like a database table, i.e. you can insert data into and delete data from this collection.
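
The concepts above can be pictured with a toy plain-Python model. This is only an illustration of fan-out, fan-in, and the two kinds of collections, not how Vitria OI implements them. The `Query` class, the dictionary standing in for a Named Window, and the list standing in for a Live Collection are all hypothetical stand-ins.

```python
class Query:
    """Hypothetical pipeline node: applies fn to each tuple and forwards non-None results."""

    def __init__(self, fn):
        self.fn = fn
        self.downstream = []

    def connect(self, other):
        self.downstream.append(other)
        return other

    def push(self, event):
        result = self.fn(event)
        if result is not None:
            for node in self.downstream:
                node.push(result)


vip_window = {"c1": {"region": "West"}}  # Named Window stand-in: reference data, editable like a table
live_collection = []                     # Live Collection stand-in: computed results

source = Query(lambda e: e)  # passes every tuple through
high_value = Query(lambda e: e if e["amount"] >= 1000 else None)
vip_only = Query(lambda e: e if e["customer_id"] in vip_window else None)
sink = Query(lambda e: live_collection.append(e))  # append returns None, so nothing is forwarded

source.connect(high_value)  # fan-out: one stream feeds two queries
source.connect(vip_only)
high_value.connect(sink)    # fan-in: two queries feed one
vip_only.connect(sink)

source.push({"customer_id": "c1", "amount": 50})    # kept by vip_only
source.push({"customer_id": "c9", "amount": 5000})  # kept by high_value
del vip_window["c1"]                                # delete from the reference data
source.push({"customer_id": "c1", "amount": 50})    # now dropped by both queries
```

Note that in this toy model an order that passes both filters would reach the sink twice; a real pipeline would decide how fan-in results are merged.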

Here is an EPN model built in Vitria OI.

With simple visual modeling, you can connect different queries in the pipeline. Yellow boxes are event streams, green boxes are queries, and black boxes are collections. Behind every box there is a powerful declarative query interface, Vitria’s optimized stream query language, which is conceptually similar to DML in SQL but eliminates the need to work with rigid schemas. In this era of post-SQL or NoSQL, this query language provides the flexibility to work with well-structured or loosely structured data streams.

From the results of the computation, you can trigger actions using Event Policies, a model that defines the rules that dispatch Vitria OI events to resolution processes (a process that is fully automated and/or triggers a workflow) based on information from the Vitria OI events. Vitria OI provides a full BPMN 2.0 engine that can be used to orchestrate any type of process.
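
As a rough illustration of the idea of rules dispatching events to resolution processes, here is a small plain-Python sketch. It does not reflect the actual Event Policy model or any Vitria API. `PolicyRule`, `dispatch`, the event fields, and the thresholds are all made up for the example.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class PolicyRule:
    """Hypothetical rule: a condition on an event plus the resolution to run when it matches."""
    name: str
    matches: Callable[[dict], bool]
    resolve: Callable[[dict], str]


def dispatch(event, rules):
    """Run the resolution of every rule whose condition matches the event."""
    return [rule.resolve(event) for rule in rules if rule.matches(event)]


# Assumed event shape: one computed result per region from the pipeline.
rules = [
    PolicyRule(
        name="vip-order-surge",
        matches=lambda e: e["vip_orders_last_minute"] > 100,
        resolve=lambda e: f"alert: order surge in {e['region']}",
    ),
    PolicyRule(
        name="quiet-region",
        matches=lambda e: e["vip_orders_last_minute"] == 0,
        resolve=lambda e: f"workflow: check order intake in {e['region']}",
    ),
]
```

Here the resolutions simply return strings; in a real system they would be the hooks that raise an alert or start a workflow.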

Vitria’s unified modeling environment allows customers to build rich, distributed stream processing applications from simple concepts and a powerful interface.
