3 Things About Data Virtualization You Might Not Know

Data virtualization is not new. It was born in the aughts, when data was still small by today’s standards. Back then, most data lived in relational databases from a handful of vendors like Oracle, Microsoft, and IBM. Oh my, how times have changed! Data now comes from everywhere and everything, and businesses realize that data is their lifeblood and can’t be thrown away. On top of that, we have data platforms for every type of data, whether it lives on premises or in the cloud. It’s no surprise that enterprises are looking for ways to manage this chaos at scale.

Data virtualization is now seen as a critical strategy for managing this new data volatility and variety. In fact, Gartner predicts that “by 2022, 60% of all organizations will implement data virtualization as one key delivery style in their data integration architecture” (Source: Gartner Market Guide for Data Virtualization). In this article, we’ll look at some of the misconceptions about data virtualization and highlight how intelligent data virtualization is changing the game.

#1: Data Virtualization Is Not Query Federation

Database platforms have included varying degrees of query federation for some time now. Query federation allows a single SQL query to combine data from more than one data platform using remote database connections to external data platforms. There are a few problems with this approach. First, users need to understand both the local and remote database schemas in order to join the data. Second, queries can’t scale if the remote database holds a large data set, because the network can’t support the data movement. Third, there’s no semantic layer, so users need to understand SQL, primary and foreign keys, and so on to make it work. In contrast, data virtualization solves these problems by presenting a single consolidated view that hides the underlying data platforms’ location and complexity while managing data flows to minimize data movement.
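Here is a minimal Python sketch of that contrast, with two in-memory SQLite databases standing in for separate platforms (all database, table, and column names are illustrative assumptions, not any real product’s schema): the federated path forces the consumer to know both schemas and the join keys, while the virtualized path exposes one consolidated view.

```python
import sqlite3

# Two separate "platforms", modeled here as in-memory SQLite databases.
# All database, table, and column names are illustrative assumptions.
sales_db = sqlite3.connect(":memory:")
sales_db.execute("CREATE TABLE orders (customer_id INTEGER, amount REAL)")
sales_db.executemany("INSERT INTO orders VALUES (?, ?)",
                     [(1, 250.0), (2, 99.0), (1, 40.0)])

crm_db = sqlite3.connect(":memory:")
crm_db.execute("CREATE TABLE customers (id INTEGER, name TEXT)")
crm_db.executemany("INSERT INTO customers VALUES (?, ?)",
                   [(1, "Acme"), (2, "Globex")])

# Query federation, hand-rolled: the consumer must know both schemas and
# the join keys, and every remote row crosses the "network" to be joined
# locally -- exactly the coupling and data movement described above.
customers = dict(crm_db.execute("SELECT id, name FROM customers"))
joined = [(customers[cid], amount) for cid, amount in
          sales_db.execute("SELECT customer_id, amount FROM orders")]

# Data virtualization, sketched: one consolidated view hides where the
# data lives; the consumer asks a business question, not a join puzzle.
def revenue_by_customer():
    totals = {}
    for name, amount in joined:
        totals[name] = totals.get(name, 0.0) + amount
    return totals

print(revenue_by_customer())  # {'Acme': 290.0, 'Globex': 99.0}
```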

#2: Data Virtualization Works for Analytics

When first introduced, data virtualization focused on small data sets and operational, not analytical, use cases – for example, combining data from monitoring dashboard A with ticketing dashboard B. Why? Analytical-style queries require lots of I/O because they tend to aggregate and join data across multiple large tables. More recently, data virtualization vendors like AtScale have solved the data movement problem by pushing the work down to the source data platforms and performing large-scale aggregations and joins in the database, not over the network. Now users can have their cake and eat it too: query big data, in multiple data platforms, at scale and at the speed of thought.
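To see why pushdown matters, consider this minimal sketch (in-memory SQLite stands in for a large source platform, and the table and column names are illustrative assumptions): aggregating at the source means only a tiny result set crosses the network, instead of every raw row.

```python
import sqlite3

# Hypothetical source platform, modeled as in-memory SQLite; the table
# and column names (events, region, revenue) are illustrative assumptions.
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE events (region TEXT, revenue REAL)")
source.executemany("INSERT INTO events VALUES (?, ?)",
                   [("EMEA", 10.0), ("EMEA", 5.0), ("APAC", 7.0)] * 100_000)

# Without pushdown: every raw row crosses the network, then we aggregate.
rows = source.execute("SELECT region, revenue FROM events").fetchall()
totals = {}
for region, revenue in rows:  # 300,000 rows moved before any math happens
    totals[region] = totals.get(region, 0.0) + revenue

# With pushdown: the virtualization layer rewrites the logical request so
# the source database does the heavy lifting; only 2 rows come back.
pushed = source.execute(
    "SELECT region, SUM(revenue) FROM events GROUP BY region").fetchall()

print(len(rows), "rows moved without pushdown;",
      len(pushed), "rows moved with pushdown")
```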

#3: Data Virtualization Solves Your Data Governance Woes 

Because the data virtualization platform acts as the single entry point for all queries and presents a unified semantic model, it’s an excellent place to enforce data governance. Rather than implementing varying and potentially conflicting data access rules across a myriad of databases, data virtualization can serve as data governance central, drastically simplifying data access rules for the enterprise. Combined with a data catalog like Collibra or Alation, a data virtualization platform becomes the “enforcer in chief” for the data catalog’s data governance policies.
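As a rough illustration of the idea, here is a sketch of one centrally defined policy applied to every result regardless of which source it came from; the policy shape, role names, and column names are assumptions for illustration, not any vendor’s actual API.

```python
# A minimal sketch of centralized policy enforcement at the virtualization
# layer. The policy structure and role names are illustrative assumptions.
POLICIES = {
    "analyst": {"mask_columns": {"ssn"}},
    "hr":      {"mask_columns": set()},
}

def enforce(role: str, rows: list[dict]) -> list[dict]:
    """Apply one set of rules, no matter which source produced the rows."""
    mask = POLICIES[role]["mask_columns"]
    return [{col: ("***" if col in mask else val)
             for col, val in row.items()} for row in rows]

employees = [{"name": "Ada", "ssn": "123-45-6789"}]
print(enforce("analyst", employees))  # [{'name': 'Ada', 'ssn': '***'}]
print(enforce("hr", employees))       # [{'name': 'Ada', 'ssn': '123-45-6789'}]
```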

As you can see, data virtualization has come a long way. Enterprises should consider data virtualization instead of, or in addition to, traditional ETL-style data integration. Data virtualization can drastically simplify your data infrastructure, improve your agility in responding to business needs, and protect your data by governing access regardless of the query source.

For more information, see the white paper How Data Governance and a Semantic Layer Support Data Mesh.