As 2011 draws to a close, many around the blogosphere are forecasting what will be hyped next year. I often read that big data and mobile BI are the next big things. In this article, I will discuss those technologies. Don’t believe the ads: those technologies aren’t game changers at all.

Big data

First, there is a lot of hype around big data. The definition is still unclear, but the two main components are:

  • too big to handle with current methods, i.e. a regular DBMS
  • mostly unstructured

For the first issue, which is size, let me point out that eBay has two data warehouses holding many petabytes, running on Teradata. Obviously, Teradata is far from cutting-edge new stuff. I haven’t heard of a Hadoop cluster above the single-digit petabyte range, and the number of instances over the petabyte range seems to be about equal for Hadoop and Teradata (between 15 and 20). Therefore, we can at least say that old technology can handle big data. To be fair, it’s all about a trade-off. Hadoop is good for scaling, i.e. it costs little capital at the beginning and can grow along with the business. That comes at the cost of effectiveness: computational effectiveness, but human effectiveness as well.

For the unstructured issue, I would say that it’s simply not an issue. The first time I saw unstructured data was in an Oracle instance using XML fields (Oracle 9, from a decade ago, can handle XML). What was interesting was the reason for adding an unstructured XML field to an old relational table: the business guys were coming up with ideas faster than the development team could handle. One could call it agility at the cost of usability. To me, it’s just increasing the technical debt. Structuring data is boring because we have to think about how the data should be structured to benefit the business. Leaving it unstructured only pushes the boring part to later, and that’s not a good idea.

Mobile BI

That’s hot, or at least predicted to be hot for SAP BusinessObjects. Obviously, being able to answer business questions in a meeting using a smartphone is amazing. But think a little about how likely such a scenario is. In my experience, there are only three cases:

  • the data is a KPI: if it’s a KPI and people don’t already have an idea of its value, mobile BI can’t help. The BI stack has simply failed. If the C-level suite doesn’t know the performance of the business and its main drivers, what is the purpose of BI?
  • the data is not a KPI but is covered by the usual BI stack: nobody knows that the data is in the daily report they get. I have seen people receiving a lot of daily automated reports, some of them more than 50 pages long. That’s hundreds of indicators. Nobody can know them all, and some are not even computed correctly (at least not the way you think they should be).
  • they want insight and not a piece of data: a question like “What is the behavior of this kind of customer when we launch this kind of marketing operation?” will never be handled by mobile BI, or by any self-service BI at all. It just needs some work. Maybe the conversion rate increases but the customer lifetime decreases.
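To make the last case concrete, here is a minimal sketch of the arithmetic behind "conversion up, lifetime down". Every figure and the function name value_per_visitor are hypothetical, chosen only for illustration; the point is that a single number pulled up on a phone (the conversion rate) can look better while the value actually earned gets worse.

```python
def value_per_visitor(conversion_rate, monthly_margin, lifetime_months):
    """Expected margin per visitor: chance to convert times margin over the customer lifetime."""
    return conversion_rate * monthly_margin * lifetime_months


# All figures below are hypothetical, for illustration only.
baseline = value_per_visitor(conversion_rate=0.02, monthly_margin=10.0, lifetime_months=24)
campaign = value_per_visitor(conversion_rate=0.03, monthly_margin=10.0, lifetime_months=12)

print(f"baseline: {baseline:.2f} per visitor")
print(f"campaign: {campaign:.2f} per visitor")
print("campaign is better" if campaign > baseline else "campaign is worse")
```

With these made-up numbers, the campaign converts 50% more visitors but each customer stays half as long, so the value per visitor drops. Seeing that requires putting two indicators together, which is analysis work rather than a lookup.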

The old issue

To me, the root issue of both hyped technologies is the same. BI is still mainly a geek and IT topic. Answering a business question or doing things the old way is just … boring. Who cares what the acquisition trend is when we can build a sexy new Hadoop cluster (with HBase, ZooKeeper, HDFS and many funky tools), or when an SVM increases accuracy by 0.1%? So the idea is simple: let’s give the business some fancy tools so they don’t come back, and let’s build something awesome. Yahoo is very strong in Hadoop, but it’s hard to see any business result.
Technology is just a tool. Having petabytes of data and a huge Hadoop cluster is easy; it’s just time and money. Building reports or apps for smartphones is easy; again, just time and money. Extracting useful insights from data, that’s hard. Most of those insights can be extracted using a regular database and a simple Excel sheet.
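
As a rough illustration of how little machinery an acquisition trend needs, here is a small sketch using SQLite through Python’s standard sqlite3 module. The customers table, its columns and the rows are all made up for the example; a real warehouse would have its own schema.

```python
import sqlite3

# In-memory database standing in for "a regular database".
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, signup_date TEXT)")

# Hypothetical sign-ups, for illustration only.
rows = [
    (1, "2011-09-03"), (2, "2011-09-17"),
    (3, "2011-10-02"), (4, "2011-10-11"), (5, "2011-10-29"),
    (6, "2011-11-05"),
]
conn.executemany("INSERT INTO customers VALUES (?, ?)", rows)

# Acquisition trend: number of new customers per month.
trend = conn.execute(
    """
    SELECT strftime('%Y-%m', signup_date) AS month, COUNT(*) AS new_customers
    FROM customers
    GROUP BY month
    ORDER BY month
    """
).fetchall()

for month, new_customers in trend:
    print(month, new_customers)
```

The result is a handful of rows that paste straight into a spreadsheet. The hard part is deciding which question to ask and what to do with the answer.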
