Data is popularly referred to as the new oil. Whether it is the fuel that grows your business or snake oil that does nothing but burn a hole in your pocket is the million-dollar (or multi-million-dollar) question.
There are two perspectives on Big Data and its value: as a valuable commodity that flows through an organisation, powering the engines of new insight and action, or as a collection of sometimes disgraceful claims. Both views have merit, and the distinction between value and snake oil is helpful in understanding what Big Data is really worth.
It is commonly assumed that maturing technology and transformation will create value over time. But that assumption implies that enterprise data needs to be similar, or normalized in some way. This has forced inefficiencies and redundancies on enterprises, which then invest massively in identifying, aggregating, moving, storing and optimizing data before any value is created.
How does the snake oil get sold?
A report has stated that about 465 exabytes will be created each day by 2025. To put that in perspective, take a quite roomy 1 terabyte hard drive, and put 465 MILLION of them together. And that’s EVERY SINGLE DAY! This gargantuan number, however, conveniently ignores the reality that 95% of data is completely irrelevant.
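As a quick sanity check on the scale, the arithmetic can be sketched in a few lines of Python. This is only an illustration: it assumes decimal (SI) units, and the function and variable names are made up for the example. The 465 exabytes and 95% figures are simply the ones quoted above, not independently verified.

```python
# Assumption: decimal (SI) units, i.e. 1 exabyte = 10**18 bytes
# and 1 terabyte = 10**12 bytes.
BYTES_PER_EXABYTE = 10**18
BYTES_PER_TERABYTE = 10**12


def exabytes_to_terabyte_drives(exabytes: int) -> int:
    """Number of 1 TB drives needed to hold the given number of exabytes."""
    return exabytes * BYTES_PER_EXABYTE // BYTES_PER_TERABYTE


daily_exabytes = 465        # daily volume quoted in the text
irrelevant_share_pct = 95   # share of irrelevant data quoted in the text

drives_per_day = exabytes_to_terabyte_drives(daily_exabytes)
# Whatever is not irrelevant is, at best, potentially relevant.
relevant_drives_per_day = drives_per_day * (100 - irrelevant_share_pct) // 100

print(f"{drives_per_day:,} one-terabyte drives per day")
print(f"{relevant_drives_per_day:,} of those hold potentially relevant data")
```

Even taking both figures at face value, the potentially useful slice is a small fraction of the headline number, which is the point of the paragraph above.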
The fact remains that data is not about quantity, but quality. Greater quantity buries the quality. More does not equate to better. Sellers of technology and tools further the dangerous notion that more data is always useful. Companies selling the tools to collect, analyse and respond to data-driven triggers are growing fat on the profits of burying us in numbers. They shift our focus from the customer to the spreadsheet, while blinding us with too much data accompanied by buzzwords.
We are descending into a data-driven world of clichés and catchphrases. Generally speaking, the more information you have, the better the decisions you make. However, information overload kicks in at some point, and the data explosion has played a significant role in pushing us past it.
Although data is vital to the development and implementation of successful operational optimisations and efficiency campaigns, “information” should be the driver.
The purpose here is neither to be alarmist nor to bias you against bringing in technology to support data-driven decisions. Nor is it to say that Big Data is by default inferior to good old human knowledge. Quite the opposite, actually.
Any tool that accelerates the inherent knowledge of the organisation, and supports decision-making with the right interventions, communicated the right way, can supercharge the way the business operates.
Caution is needed, however, when a tool is sold as a panacea for a business problem without its champion being able to give a real strategy for how the tool would be deployed operationally in the company.
Machine Learning – the Snake Oil favourite
Machine Learning is both real and useful, but rare is the knowledge to figure out how these tools apply to business beyond finance. Prepackaged algorithmic solutions are the future of “big data”. But today we lack imagination, and that is a problem better tools alone cannot solve. The industry grew out of the electronification of paper and has barely moved beyond elementary data processing.
The favourite bogeyman is often Microsoft Excel. Even if we ignore the sheer power of Excel, it’s a flawed analogy that perpetuates the ‘this tool will solve all your ills’ fallacy.
Machine Learning is the lynchpin of this argument. While it is true that Excel is not the right tool for Machine Learning, the bigger question is whether ML adds value to your current situation. Does it allow a fresh perspective to come in where traditional statistical methods have already been deployed and used effectively?
There are key questions to ask in determining whether Machine Learning is the right choice:
- Can machine learning actually help us make strategic business decisions?
- Will the end result be a technology solution or management consulting, and do we bring in tech geeks, math geeks or a bunch of MBAs?
- Who manages a project like that in a large company?
- What kind of hairy simulations would you need to test the outcome of business decisions made by machines?
This is hard. Not everyone can do it. And it supports the argument that Machine Learning is a direction to be explored only after standard statistical methods of extracting information have been exhausted.
So what do I do?
Unfortunately, the answer isn’t simple. As a business, it is imperative that operations get actively engaged in determining the gaps where data can come in, collaborate with the right experts, and implement the tools that help fill those gaps.
In essence, it means asking the right questions. And this can be done only by someone intimately familiar with your business. The right questions are those whose answers are simple and precise, and communicate a wealth of information about the area the question targets. These questions lead to identifying the right use cases, which can then help identify the right data to pick up, process, and analyse.
The good news is that making better sense of data is an evolutionary process, and the benefit is that you can let go of the unrealistic expectation that a large tool implementation will automatically transform the company’s relationship with data.