The Semantic Layer and its Role in the Modern Data Stack

This post is part of a series based on AtScale TechShorts, an interactive video series that builds a foundational understanding of the core building blocks of the universal semantic layer in the modern data stack.

In today’s business landscape, data and analytics are crucial for decision-making. Two significant gravity shifts are reshaping how organizations handle data and analytics.

The first shift is “data gravity”: the centralization of data on cloud platforms such as Google BigQuery, Microsoft Azure, Databricks, and Snowflake. Gartner estimates that the business world is more than halfway through moving its on-premises data infrastructure to the cloud, and that both the growth of cloud database management and its share gains over legacy on-premises systems will accelerate. Centralizing data assets on cloud data platforms creates a massive opportunity because it radically simplifies the data landscape.

The second shift is “insights gravity.” Organizations increasingly use multiple business intelligence (BI) and machine learning (ML) tools side by side; a typical organization owns at least four different analytics tools. According to Forrester, “25% of organizations use 10 or more BI platforms, 61% of organizations use four or more, and 86% of organizations use two or more.”

Organizations now have the promise of cloud data scalability and the BI or AI tool of their choice for analysis, but success with cloud analytics still depends on several critical factors.

Four factors to achieve successful cloud analytics using a semantic layer:

  1. Governance: The semantic layer consolidates data and makes it understandable in common business terms. It sits between raw data sources and end users, providing a well-defined, business-friendly layer that offers a single version of the truth. This not only simplifies data access but also ensures data consistency and reliability across the organization, promoting data-driven decision-making.
  2. Multi-dimensional Analysis: Humans naturally analyze data in multi-dimensional ways, and this should not change when moving data to the cloud. The semantic layer organizes data in a multi-dimensional manner without physically moving it. It structures the underlying cloud data using dimensional schemas, including measures and dimensions, into a consolidated view organized in simple business terms.
  3. Cloud Optimization: The age of cloud computing requires rethinking optimization not only for data governance but also for performance and cost. The semantic layer should utilize active metadata to understand analytics query patterns and optimize both performance and cost.
  4. Security and Integration: The semantic layer acts as an intermediary between cloud data sources and insights consumption. It should seamlessly integrate with an organization’s authentication and identity management tools, enforcing role-based access control at the row and column levels. Additionally, it should offer native API support to integrate with other components of the data and analytics ecosystem, such as data catalogs and business processes.
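
To make the factors above more concrete, here is a minimal Python sketch of how a semantic layer might map business terms (measures and dimensions) onto a physical cloud table and apply role-based row and column restrictions when it generates SQL. It is an illustration only and not AtScale's implementation or API: the class names, the `analytics.sales` table, its columns, and the `emea_analyst` role are all hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(frozen=True)
class Measure:
    name: str        # business-friendly name, e.g. "Total Revenue"
    expression: str  # SQL aggregate over a physical column


@dataclass(frozen=True)
class Dimension:
    name: str    # business-friendly name, e.g. "Region"
    column: str  # physical column in the underlying table


@dataclass
class SemanticModel:
    """Hypothetical model: one physical table described in business terms."""

    table: str
    measures: dict[str, Measure]
    dimensions: dict[str, Dimension]
    # Assumed security config: role -> SQL predicate (row-level access).
    row_filters: dict[str, str] = field(default_factory=dict)
    # Assumed security config: role -> hidden dimension names (column-level).
    hidden_columns: dict[str, set[str]] = field(default_factory=dict)

    def compile(self, measure: str, by: list[str], role: str) -> str:
        """Translate a business-level request into SQL for the given role."""
        denied = self.hidden_columns.get(role, set())
        for name in by:
            if name in denied:
                raise PermissionError(f"role {role!r} may not query {name!r}")

        m = self.measures[measure]
        dims = [self.dimensions[name] for name in by]

        select = [f'{d.column} AS "{d.name}"' for d in dims]
        select.append(f'{m.expression} AS "{m.name}"')
        sql = f"SELECT {', '.join(select)} FROM {self.table}"

        # The data stays in the warehouse; only the generated query changes.
        predicate = self.row_filters.get(role)
        if predicate:
            sql += f" WHERE {predicate}"
        if dims:
            sql += " GROUP BY " + ", ".join(d.column for d in dims)
        return sql


model = SemanticModel(
    table="analytics.sales",
    measures={"Total Revenue": Measure("Total Revenue", "SUM(amount_usd)")},
    dimensions={
        "Region": Dimension("Region", "region_code"),
        "Customer Email": Dimension("Customer Email", "customer_email"),
    },
    row_filters={"emea_analyst": "region_code = 'EMEA'"},
    hidden_columns={"emea_analyst": {"Customer Email"}},
)

print(model.compile("Total Revenue", ["Region"], role="emea_analyst"))
```

In this sketch, every BI or ML tool that asks for "Total Revenue" by "Region" would receive the same definition, which is the single-version-of-truth idea from the governance factor. A production semantic layer would also need joins across tables, query-pattern-aware optimization, and integration with an identity provider, none of which is modeled here.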

To learn more about the semantic layer in the modern data stack, watch the entire TechShorts video series.
