Everything old is new again. It’s an expression commonly used to describe those popular fashion trends, from pleated pants to skinny ties, that pop up every few decades. The colors, materials and supporting accessories may change, but at the foundation are classic clothing items that reappear time and time again.
The concept isn’t exclusive to clothes, however. Right now the market is abuzz about Big Data and the new information and technologies it brings to businesses. Concurrently, timeless principles of data quality and data governance are experiencing a resurgence in popularity, just as they did at the outset of past data management initiatives such as data warehousing, customer data integration and master data management. Organizations realize that to successfully leverage the full potential of new Big Data innovations, they need to draw from traditional concepts of data quality and governance. Here’s why.
Many new applications and platforms offer sophisticated ways to analyze and manipulate Big Data, but they lack the capabilities to adequately standardize, enrich and match complex data sets. As a result, organizations realize they must first ensure the reliability and quality of data lakes and enterprise data hubs before their contents can be used by any downstream applications. According to a recent TDWI report on best practices for Hadoop, 55% of organizations surveyed plan to invest in data quality tools for Hadoop over the next three years, a higher expected rate of investment than for analytics, reporting and data visualization tools. Organizations acknowledge that the potential return on Big Data investments is large, but it ultimately pays to get the data right first.
Additionally, organizations can only align Big Data to business initiatives if they have complete visibility into the full scope of their business information. And more often than not, organizations are integrating unfamiliar, third-party data sets into their existing data stores. Traditional principles of data profiling and data exploration are more critical than ever as all of this new information floods organizations every day. Data quality solutions not only provide the visibility that organizations require; they also identify connections between vast old and new data sets as they are integrated.
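To make the idea of profiling and standardization concrete, here is a minimal sketch in plain Python. The sample records, field names and `profile` helper are hypothetical illustrations, not part of any particular product; the point is simply that measuring completeness and distinct values, then standardizing case, is the kind of groundwork that must happen before downstream analytics can be trusted.

```python
# A minimal, hypothetical sketch of data profiling with the Python
# standard library. Records and field names are invented for illustration.
from collections import Counter

records = [
    {"name": "Acme Corp", "country": "US", "revenue": "1200000"},
    {"name": "acme corp", "country": "us", "revenue": "1200000"},
    {"name": "Globex",    "country": "DE", "revenue": None},
]

def profile(rows, field):
    """Summarize one field: completeness, distinct values, top values."""
    values = [r.get(field) for r in rows]
    non_null = [v for v in values if v is not None]
    return {
        "completeness": len(non_null) / len(values),
        "distinct": len(set(non_null)),
        "top_values": Counter(non_null).most_common(3),
    }

# Profiling reveals the problem: "US" and "us" count as distinct values.
before = profile(records, "country")

# A simple standardization step (uppercasing) collapses the duplicates.
normalized = [{**r, "country": r["country"].upper()} for r in records]
after = profile(normalized, "country")
```

Here `before["distinct"]` is 3 while `after["distinct"]` drops to 2, and profiling the `revenue` field reveals a missing value. Real data quality tooling applies far richer standardization, enrichment and matching rules, but the profile-then-standardize loop is the same.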
Finally, the key challenges facing organizations tackling Big Data are similar to those that arise around any major shift in information management: how to make the right investments and tie those investments to specific business goals that yield the greatest possible return. Multi-functional, collaborative governance strategies can ensure Big Data investments truly align to business needs.
Recently, Trillium Software announced Trillium Big Data, a new data quality solution at the convergence of new technology innovation and proven data quality expertise. Trillium Big Data is built to meet the speed and scale demanded by Big Data now and in the future, but at its core are market-leading data quality methodologies and content. For over two decades we have helped the world’s leading organizations overcome data quality challenges, and we are well-positioned to support current and future data quality challenges presented by Big Data.
So, what about all the new nuances of Big Data quality, such as when and how to apply data quality processes? Stay tuned for the next Big Data blog in which we’ll cover specific new data quality principles for Big Data and Hadoop.