
Wednesday, 19 November 2014

Defining Big Data In Two Words: Who Cares?


I’d like to think of my new Forbes BrandVoice blog series as a conversation with you about analytics. So let’s start here with the same question that usually comes up at the start of many conversations I have with colleagues, customers, partners and other industry watchers: “How do you define big data, Bill?” Before I share my answer, though, let’s take a quick look at some of the other definitions for “big data” since the term made its debut in 1997.
Most don’t remember, but “big data” was a term coined that year in a scientific article by NASA researchers to describe “the problem of big data” that happens when large visualization files put a strain on computer memory. The term, more or less, incubated for a decade until reports such as a 2008 Computing Community Consortium (CCC) paper on “Big-Data Computing” helped popularize it. The funny thing is, if you take a close look at the reference to big data in the document’s second paragraph, you’ll see how even this landmark CCC paper sidesteps an actual definition of the term and cites examples instead.
There’s been surprisingly little consensus on a definition since then, and today you can still put dozens of the best minds on the case and get a range of definitions. The Berkeley School of Information actually did this, asking 40 thought leaders to convene around the question “What is Big Data?” The answers were all over the map, and I had to laugh at the word cloud generated from all those definitions. My immediate instinct was to find a way – in bold and all caps – to superimpose a giant “WHO CARES?” over the whole tangle of words.
That, by the way, happens to be my answer when people ask, “How do you define big data, Bill?”
WHO CARES?
As someone who makes his living in analytics, why in the world would I say that?  The reason is simple. The goal should be to solve a business problem by using new analytics, not to worry about defining a term. That’s because definitions are a distraction from the simple question of “Does this data contain information that is valuable for my business?”
In other words, “What data – if collected, organized and used within an analytics process – would improve the answers that we are able to generate to solve our problem?” I’ve written before about how the answer to this question has absolutely nothing to do with the definition of big data. It could be big data, small data or a bunch of spreadsheets. By the time an organization realizes that it must make use of something resembling big data, it’s too late to worry about definitions. The data is needed; it’s valuable; you’ve got to figure out how to make use of it.

In fact, “Value” has been described as the “fourth V” to go along with the famous “Volume, Variety, Velocity” framework that Gartner developed for understanding big data (and Gartner’s is just one of several big data descriptions from influential organizations). But I would go even further and elevate value as the Uber-V that determines everything else. The only reason to worry about the other characteristics is that they govern your efforts to collect and analyze the data you’ve determined to be potentially valuable.

Don’t misunderstand what I’m saying. For some businesses, most of the data they’re dealing with fits the archetypal definitions of big data. In this case, the data itself will certainly influence the tools and techniques the organization must use to incorporate big data into its analytics processes. The important distinction here is that the choice of tools and techniques is a tactical issue in service of the initial strategic question: “Is the information this data contains important to my business?” Once that question is answered, an organization must do what it takes to put the data to work. Any formal definitions of big data become irrelevant, after-the-fact labels.
So, don’t get overburdened trying to understand what qualifies as big data and what doesn’t. While defining big data is an interesting academic exercise, frankly it’s a waste of time and even a bit silly if your goal is to solve a business problem. Nobody will care about your definitions; they’ll only care about your results.
Luckily, you don’t necessarily need massive new investment in tools and technology to create a high-performing analytics ecosystem for your business. Sometimes a few changes to policy and access around data can get you there with what you already have. In my next blog, I’ll serve up a non-traditional example to show how that can happen.
