Power of Fast Analytics over Slow Data

Fast analytics over fast data makes sense intuitively. How about fast analytics over slow data?

It sounds like an oxymoron, doesn’t it? Before we answer this, let’s look at what fast data, slow data and fast analytics mean. Fast data is like a high-velocity stream coming out of a fire hose.

Sensor data, financial ticker data, and click-stream data are a few examples of fast data, where large volumes of data arrive at high velocity. The “data cycle” refers to how frequently data gets updated and reported. A slower data cycle means more time passes between one report and the next, and a faster data cycle means less time passes. Slow data, then, is data with a long data cycle.

Fast analytics is about analyzing data in real time as it gets reported. Based on this, we might conclude that fast analytics only makes sense with faster data cycles. However, the Internet of Things (IoT) adds a new twist, because it deals with an enormous amount of data that varies widely in speeds, feeds and data cycles.

Let’s take a look at a real-life example to see what this actually means. A so-called “smart meter” reports electrical usage electronically via the Internet. Currently there are more than 50 million smart meters installed across the United States alone. Consider a smart electric grid that consists of 15 million smart meters.

Typically, each meter reports its energy consumption once an hour at some preset time during the hour, with different meters set to different presets so that there is a continuous inflow of meter data. Using traditional analytics, it might take 15 minutes to analyze the data and predict demand for the next hour, yielding a total cycle time (data reporting plus analytics processing) of 1 hour and 15 minutes.

Now, consider what happens on an afternoon when the weather is unexpectedly hot and humid. As customers start turning on their air conditioners, the actual demand will start to rise above the forecasted demand.

If this happens at the beginning of a data cycle, then it will be another hour and fifteen minutes before the next revised forecast is generated. If another power generator is required to meet this spike in demand, then it will take an additional 30 minutes to bring it online. So with traditional analytics, as long as 1 hour and 45 minutes may pass between the time demand unexpectedly starts to rise and the time a new generator is brought online. In the meantime, demand may outstrip supply, and brownouts or worse could occur.
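The worst-case arithmetic above can be written out as a short sketch. The durations are the ones given in the example; the constant and function names are illustrative only and do not come from any real system.

```python
# Worst-case response time with traditional (batch) analytics.
# All durations are in minutes and are taken from the smart grid example;
# the names below are hypothetical.

DATA_CYCLE_MIN = 60         # each meter reports once an hour
BATCH_ANALYTICS_MIN = 15    # time to analyze and forecast the next hour
GENERATOR_STARTUP_MIN = 30  # time to bring another generator online


def traditional_response_time(data_cycle, analytics, startup):
    """Minutes from a demand spike at the start of a data cycle
    until extra generation capacity is online."""
    forecast_delay = data_cycle + analytics  # wait for the full cycle, then analyze
    return forecast_delay + startup


worst_case = traditional_response_time(
    DATA_CYCLE_MIN, BATCH_ANALYTICS_MIN, GENERATOR_STARTUP_MIN
)
hours, minutes = divmod(worst_case, 60)
print(f"Worst case: {hours} h {minutes} min")  # Worst case: 1 h 45 min
```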

Now let’s consider how fast analytics can help even in the midst of a slow data cycle. Since the meters report uniformly over the hour, the 15 million meters will deliver a quarter of a million (250,000) new readings per minute. That is a statistically significant sample that we can use to predict trends. If we apply fast analytics to this sub-population as it grows, we can achieve the following:

  • Within 5 minutes into the data cycle we can detect that actual demand is deviating from the forecast, based on 1.25 million new readings.
  • Within 10 minutes into the data cycle we can predict a shortage of capacity based on 2.5 million new readings.
  • Within 15 minutes into the data cycle, we can predict an energy shortfall with a confidence level of over 99% based on 3.75 million new readings and, given this high confidence level, start to spin up a generator to meet the surge in demand.
  • Within 45 minutes, allowing 30 minutes for startup, we can bring a new gas turbine online, just in time to avoid brownouts or worse.
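The reading counts in this timeline follow directly from the uniform reporting assumption. Here is a minimal sketch of that arithmetic, using the figures from the example; the names are illustrative only.

```python
# Readings accumulated during a slow (hourly) data cycle, assuming the
# meters report uniformly over the hour, as in the example above.
# Function and constant names are hypothetical.

TOTAL_METERS = 15_000_000
DATA_CYCLE_MIN = 60
GENERATOR_STARTUP_MIN = 30


def readings_after(minutes, total_meters=TOTAL_METERS, cycle=DATA_CYCLE_MIN):
    """Number of new readings received `minutes` into the data cycle."""
    per_minute = total_meters // cycle
    return per_minute * min(minutes, cycle)  # never more than one full cycle


for checkpoint in (1, 5, 10, 15):
    print(f"{checkpoint:>2} min: {readings_after(checkpoint):,} readings")

decision_minute = 15  # point at which the shortfall is predicted
online_minute = decision_minute + GENERATOR_STARTUP_MIN
print(f"Generator online after {online_minute} min")
```

Compared with the 1 hour and 45 minutes of the traditional approach, the decision point moves from the end of the data cycle to a quarter of the way through it.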

This is the value of Fast Analytics over Slow Data. We see this kind of use case repeated over and over in IoT, where devices report on an hourly, daily or weekly basis. Because there are millions of devices, each passing moment brings a significant amount of new data, from which we can:

  • Detect anomalies and deviations from plan early.
  • Predict problems early.
  • And, consequently, act in time to avoid or mitigate problems.
  • Future-proof the operations against ever faster data cycles.
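One way to picture early detection is a running statistical test over the readings as they arrive. The sketch below is an assumption-laden illustration, not the method used by any particular product: it applies a simple one-sided z-test, treats readings as independent with a known standard deviation, and uses made-up per-meter figures (a forecast of 1.500 kWh, an observed mean of 1.502 kWh, and a standard deviation of 1.0 kWh).

```python
import math


def z_score(sample_mean, forecast_mean, std_dev, n):
    """Standardized gap between the observed mean and the forecast mean."""
    standard_error = std_dev / math.sqrt(n)
    return (sample_mean - forecast_mean) / standard_error


def confidence(z):
    """One-sided confidence that demand is above forecast (normal CDF)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def first_alert_minute(batches, forecast_mean, std_dev, threshold=0.99):
    """Return the first minute at which the running mean exceeds the
    forecast with at least `threshold` confidence, or None if it never does.

    `batches` yields one (sum_of_readings, number_of_readings) pair per minute.
    """
    total, count = 0.0, 0
    for minute, (batch_sum, batch_count) in enumerate(batches, start=1):
        total += batch_sum
        count += batch_count
        z = z_score(total / count, forecast_mean, std_dev, count)
        if confidence(z) >= threshold:
            return minute
    return None


# Hypothetical figures: 250,000 readings per minute averaging 1.502 kWh,
# against a forecast of 1.500 kWh with a per-meter std dev of 1.0 kWh.
per_minute = 250_000
batches = [(1.502 * per_minute, per_minute)] * 15
print(first_alert_minute(batches, forecast_mean=1.500, std_dev=1.0))
```

The design point is that the standard error shrinks with the square root of the number of readings, so even a small per-meter deviation becomes detectable within minutes when hundreds of thousands of readings arrive each minute.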

Future-proofing deserves emphasis: Fast Analytics over Slow Data can keep pace with ever faster data cycles. Your data cycles may be daily today, hourly next year, and every 15 minutes the year after. It doesn’t matter: Fast Analytics can keep up.
