In data, small is the new big

Humongous data
Everyone’s talking about big data. Often, it’s Big Data where an initial capital in each word lends the phrase an air of great significance that focuses on size.

From gigabytes to exabytes to zettabytes, you’re left in no doubt that this is big, with an equally big impact on society.

Yet for most people I know, pairing big data with size-words does little to help them understand what it means or, indeed, its relevance to their business. It’s easy to get lost in the hugeness of it all.

This perspective on big data formed a large part of a discussion I took part in last week at The Hospital Club in Covent Garden, London, in the inaugural Small Data Forum get-together. Organized by LexisNexis with Thomas Stoeckle as the master of ceremonies, the informal breakfast/morning event attracted well over 20 people.

I participated as a joint moderator (speaker, in fact), standing in for my IBM colleague and boss, Andrew Grill, with Sam Knowles, founder and managing director of Insight Agents. It gave me a great opportunity to hear the views and opinions of people from widely different organizations who are immersed in data in their work roles. While the tech of data was never far from our discussion, our focus was very much on the business end.

Sam has posted a terrific write-up of the event and I’m not going to duplicate his great narrative. Rather, let me highlight a small handful of thoughts I took away from the event.

  • First and foremost, the majority opinion in our discussion group was that the term “big data” is an ineffective one that’s hard to grasp and doesn’t represent what actually happens in business. “Small data” is a more accurate descriptor if you think of big data as a conglomeration of chunks of small data.
  • If big data is the jigsaw, then small data is the individual pieces that you assemble or connect to complete the full puzzle. That makes sense to me, and to anyone attempting to understand the data he or she is working with in a business context. It’s especially useful when you think of unstructured data, which accounts for 80 percent of all data today and which is often expressed in that enveloping phrase, big data. Unstructured data is at the heart of machine learning.
  • A common view in the discussion was that anonymous data will become more prevalent. That brings a big question to the forefront – how do you trust anonymous data? In other words, if you don’t know the source of the data, how can you trust it? There is a big opportunity here for proxy trust – trusting not necessarily the source (whom you don’t know) but the ‘purveyor’ (whom you do know) of anonymous data.

LexisNexis has produced a nifty infographic that focuses on five insights from the Small Data Forum:

  1. Making data valuable is about mixing together the right ingredients.
  2. Big data is both oil and soil.
  3. You don’t need a sledgehammer to crack a nut – you just need a clever nut-cracker.
  4. Create more of the future you want.
  5. Algorithms are important, but the human element cannot be replicated (yet).

I like the appended ‘yet’ on the last one.

The detail is in the infographic, below:

5 insights

Small Data Podcast soon

One extra outcome from the event – especially from the riff Sam and I had on the overall topic – is that the two of us will record a podcast discussion on themes from last week’s event. We’ll be recording it on Friday, May 27, and it will be published by LexisNexis soon thereafter (find out when via the Twitter hashtag #SmallDataForum).

Here’s my friend Watson to tell you a bit more :)

However you pronounce ‘data,’ big data is a big topic, without question, especially so if you have the business view rather than the tech. I hope our take on it in the shape of small data helps you better grasp it all.

(Cartoon at top via David Fletcher at CloudTweaks.)