I recently had the pleasure of presenting at a Big Data & Data Management event at Cambridge University in the UK. The event, jointly organised by the University and DAMA UK, focused on some of the key trends and challenges facing organisations looking to mine and exploit new Big Data sources. It was an excellent day, as attendees came from both Big Data science and the more traditional data management disciplines of data architecture, data integration, data quality and Business Intelligence. By the end we all realised how much we will ultimately depend on each other, rather than ploughing two parallel furrows that will never meet, as sadly seems to be the norm at present.
One surprise for me was to learn how many open data sources are now available to organisations and individuals. In his session, Dr Mark Harrison of the University’s Distributed Information and Automation Laboratory (DIAL) highlighted that in the UK, it’s now possible to access over 9,000 open data sets on the economy, transport, climate, demographics, industry, health and so on via a linked Open Data cloud. Linking and integrating these data sets with an organisation’s own data sources on its customers, services & operations has the potential to uncover new insights to support activities such as marketing and selling, logistics and product development. For example, integrating accurate local weather forecasts into logistics planning can help to determine the optimal daily route for delivery vehicles. The possibilities of Big Data and the ability to mesh these sources together are endless.
But a major challenge remains. In my session I talked about The Big Bad Data Wolf and the relationship between Big Data and data quality. Having the chance to present at one of the world’s leading academic institutions, where intellectual giants such as Isaac Newton, Charles Darwin & Stephen Hawking were educated, I naturally thought I’d start with a children’s fairy story, mainly to demonstrate my own lack of a prestigious education. So I talked about the Three Little Pigs and the Big Bad Wolf.
In this tale the Three Little Pigs went out into the world to find their fortune and built houses of straw, sticks and bricks respectively. Along came the Big Bad Wolf, who huffed and puffed and blew down the houses of straw and sticks, and feasted on the unfortunate porcine residents. When he encountered the house of bricks, however, he failed to demolish the structure, so in desperation he clambered down the chimney, plunged straight into the fire below, and huffed and puffed no more.
And so it is with exploiting these new open Big Data sources. If an organisation’s own information architecture is badly constructed and built with poor raw materials – namely deficient, missing and inaccurate data – when the Big Bad Data Wolf comes along it will all fall down. Poor data quality will make it impossible to link and integrate these new Big Data sources with your existing data.
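To make the point concrete, here is a minimal sketch in Python of how poor data quality defeats linking. Everything in it is an assumption made for illustration: the customer records, the postcode-keyed forecast table standing in for an open data set, and the function names `standardise_postcode` and `link` are all invented, not taken from any real system or data source.

```python
# Hypothetical data: both tables and all field names are invented for illustration.
customers = [
    {"name": "Acme Ltd", "postcode": "cb2 1tn"},        # wrong case
    {"name": "Bricks & Co", "postcode": "CB3  0FA "},   # stray whitespace
    {"name": "Straw Supplies", "postcode": None},       # missing value
]

# Stand-in for an open data set keyed by postcode (assumed structure).
open_weather = {
    "CB2 1TN": "heavy rain",
    "CB3 0FA": "clear",
}


def standardise_postcode(raw):
    """Upper-case and collapse whitespace; return None for missing values."""
    if raw is None:
        return None
    cleaned = " ".join(raw.upper().split())
    return cleaned or None


def link(records, lookup, key_fn=lambda value: value):
    """Join records to the lookup on postcode; return (matched, unmatched)."""
    matched, unmatched = [], []
    for record in records:
        key = key_fn(record["postcode"])
        if key in lookup:
            matched.append({**record, "forecast": lookup[key]})
        else:
            unmatched.append(record)
    return matched, unmatched


# Linking on the raw values, then again after standardising the key.
raw_matched, raw_unmatched = link(customers, open_weather)
clean_matched, clean_unmatched = link(customers, open_weather, standardise_postcode)

print(f"raw data: {len(raw_matched)} of {len(customers)} customers linked")
print(f"standardised data: {len(clean_matched)} of {len(customers)} customers linked")
```

In this toy case none of the raw records link, and standardising the key recovers the two that were merely badly formatted. The record with a missing postcode cannot be rescued by formatting rules at all, which is why missing data has to be fixed at source.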
The moral of the tale is to construct your data houses with high quality raw materials – the building bricks of well-constructed, solid, standardised & managed data. If you do this, you do not need to be afraid of the Big Bad Data Wolf when he comes along. Instead you might even be pleased to invite him in.
VP Information Management Strategy, Trillium Software
Nigel Turner works with Trillium Software clients to start, expand and accelerate their enterprise data quality initiatives. He spent much of his career at British Telecommunications plc (BT), where he led an internal enterprise-wide data quality improvement programme. This ten-year programme was praised by Gartner, Forrester, Ovum Butler and others both for its approach and proven benefits. Nigel has published several papers on data management and is a regular invited speaker at CRM and Information Management events. He is also a part-time lecturer at Cardiff University, where he teaches data management.