The Cost of Bad Data

According to Gartner, decision-making based on inconsistent data costs the average enterprise $15 million per year. Why? Enterprises today use a variety of KPI reporting tools, each relying on a different subset of the data.

At what price is your team willing to take the risk? 

In our most recent webinar, Dave Mariani, AtScale Co-Founder and CSO, speaks to the benefits of implementing a universal semantic layer across your organization, so that your team has the confidence to make the right calls with secure, consistent data while saving money.

Recognizing the Problem Early On

Mariani recalls why he and his co-founders started AtScale, reflecting on his time at Yahoo!: “Bad data cost us a lot more than $15 million a year, and that was almost 10 years ago. In that environment where I actually ran the analytics teams and data pipelines, I really struggled, and my team struggled, to really deliver consistent data. And that was because we were supporting a variety of different BI tools.”

With multiple BI tools in place, it was difficult to arrive at the truth. His team kept receiving multiple answers to the same questions, and conflicting definitions had the team “twisted up.” Mariani explains where the team fell short: “Inconsistent data is really costly, not just in a physical cost, but also just in lost productivity.”

Trust Your Data 

Convincing people to trust their data is a hard challenge to take on. Mariani states, “Unless you have a semantic layer and a single source of truth, there’s always going to be doubt in the numbers.” What is responsible for the lack of trust? Between data usage, storage, migration, and integration, the quality of your data is at risk and mistakes are likely to be made. Enter self-service BI. While self-service tools grant users freedom, users still need to be aware of its consequences. It is not uncommon to see inconsistency among definitions and business metrics, creating larger problems down the road.

Mariani continues, “It’s very difficult to ensure and apply consistency in that kind of a ‘free for all’ environment. And then there’s just the fact that data gets old, data gets out of sync … You’re gonna get, you’re gonna get mismatches and you’re going to get errors in your data.”

RMS Titanic 

Mariani shares a Kaggle project the engineering team used to predict passenger survival on the Titanic. In this example, he walks through the common problems engineers face when performing calculations and how easy it is to arrive at the wrong answer when multiple tools are in play. He then shows that by implementing a universal semantic layer and pointing all of your tools at it, your team will arrive at a single truth.
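The webinar does not include the team’s actual code, but the failure mode is easy to sketch. Below is a hypothetical miniature of the Kaggle Titanic data, where two “tools” compute the same KPI (survival rate) with slightly different row-filtering logic and get different answers from identical data:

```python
# Hypothetical miniature of the Titanic dataset (illustrative rows only).
passengers = [
    {"survived": 1, "age": 22},
    {"survived": 0, "age": None},   # missing age
    {"survived": 1, "age": 35},
    {"survived": 0, "age": 54},
]

# Tool A: survival rate over all passengers.
rate_a = sum(p["survived"] for p in passengers) / len(passengers)

# Tool B: silently drops rows with a missing age before computing.
complete = [p for p in passengers if p["age"] is not None]
rate_b = sum(p["survived"] for p in complete) / len(complete)

print(rate_a)  # 0.5
print(rate_b)  # 0.6666...

# A single, agreed-upon definition (a semantic layer in miniature)
# removes the ambiguity: every tool calls this instead.
def survival_rate(rows):
    """The one governed definition of the 'survival rate' metric."""
    return sum(r["survived"] for r in rows) / len(rows)

assert survival_rate(passengers) == rate_a
```

Same data, two numbers. A governed, shared metric definition is what turns that back into one.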

What is a Universal Semantic Layer?

Let’s start with the basics. What is a semantic layer? According to Wikipedia, “A semantic layer is a business representation of corporate data that helps end users access data autonomously using common business terms. A semantic layer maps complex data into familiar business terms such as product, customer, or revenue to offer a unified, consolidated view of data across the organization.” Mariani praises this definition: “the semantic layer is meant to promote self service, but promote self service with consistency. And that’s where common business terms come into play.”
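In code, that mapping from “familiar business terms” to physical data can be pictured as a small lookup. The sketch below is purely illustrative, with made-up table and column names, and is not AtScale’s actual model:

```python
# A semantic layer in miniature: each common business term resolves to
# exactly one physical definition, so every BI tool gets the same answer.
SEMANTIC_LAYER = {
    "revenue":  "SUM(order_items.unit_price * order_items.quantity)",
    "customer": "dim_customer.customer_id",
    "product":  "dim_product.product_name",
}

def resolve(term: str) -> str:
    """Translate a business term into its single governed definition."""
    try:
        return SEMANTIC_LAYER[term]
    except KeyError:
        raise KeyError(f"'{term}' is not a governed business term")

print(resolve("revenue"))
```

The point is not the data structure but the contract: analysts ask for “revenue,” and no tool gets to invent its own formula for it.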

Click here to learn more about AtScale’s Universal Semantic Layer. 

What’s Not to Love? 

Now that you’re familiar with the universal semantic layer, why would you want one? For Mariani, it is all about gaining control, as teams can “deliver results, as quickly as they want and as quickly as they prioritize them again, as long as they have access to that raw data.” But with this newfound freedom, are there any dangers that follow closely behind? Two words: data wrangling. Mariani speaks to the challenge of data governance and how differing definitions force your business analysts to become data engineers, a loss in productivity. He states, “Do you really want your business users who are supposed to be running the business and figuring out how to improve profitability and improving customer experience actually becoming data engineers to do their jobs?”

Mariani goes on to weigh the pros and cons of building all business logic in the data warehouse, using canned reports and dashboards, and implementing intelligent data virtualization.

According to the Gartner Market Guide for Data Virtualization, “By 2020, organizations utilizing data virtualization as a data delivery style will spend 45% less than those who do not plan on building and managing data integration processes for connecting distributed data assets.” What does this mean for those who choose wisely? Expect to be free of the errors and inconsistent data that come with the traditional ETL path.

Analyzing Data Consistently and Securely with Intelligent Data Virtualization 

In this demonstration, Dave shows what AtScale’s intelligent data virtualization layer looks like and how it delivers consistency and governance throughout your entire organization.

Related Reading: 

To learn more, download your copy of “Why A Universal Semantic Layer is Critical to Your Data Architecture.” 

GigaOm Sonar Chart - semantic layers and metrics stores