There’s been a lot of news lately about semantic layers. Google and Tableau announced their plans to connect Tableau to Looker’s semantic layer. It’s great to see the industry recognize the importance of the semantic layer in the new cloud analytics stack. My co-founders and I started AtScale in 2013 because we believed that an agnostic, independent semantic layer is key to driving self-service analytics at scale (can you guess why we named the company AtScale?).
As a reminder, this is Wikipedia’s pretty good definition of a semantic layer:
“A semantic layer is a business representation of corporate data that helps end users access data autonomously using common business terms. A semantic layer maps complex data into familiar business terms such as product, customer, or revenue to offer a unified, consolidated view of data across the organization.”
Add the adjective “Universal” to the definition of semantic layer and you’ll see why it’s a pretty popular topic these days. In this post, we’ll dig a little deeper and talk about how a Universal Semantic Layer (USL) should be a critical element of your modern analytics stack.
On the data platform side, the data warehousing space has changed drastically in the past 3 years. Now, cloud data lakes and cloud data warehouses have become well-accepted data platform architectures. According to the 2020 Big Data & Analytics Maturity Survey, 61% of respondents currently operate cloud data platforms and 48% plan on deploying them in the near future. This means that more data can be collected and stored than ever before since enterprises can outsource the data platform’s scaling and management to their cloud partners.
On the consumption side, contrary to many predictions, business intelligence (BI) tools have continued to proliferate and Excel never did go away (it’s more than alive and kicking). On top of that, we’ve seen data scientists emerge as another data-hungry consumer, needing the same access to business-friendly data as their business analyst partners.
What does this all mean? It means that the analytics landscape has become even more daunting for IT and users: more data (in volume and variety) and more consumers wanting to use the tool of their choice. This is why a Universal Semantic Layer is now being recognized as a critical piece of a modern analytics platform.
Location, Location, Location
As enterprises move their data operations to the cloud, the cloud file system (S3, ADLS, GCS) has become the default landing zone for raw data. This is essentially a data lake. Some organizations also process that raw data using data pipelines and write the transformed data to a cloud data warehouse like Snowflake, Google BigQuery, Amazon Redshift or Azure Synapse SQL.
There’s much debate about whether a data warehouse or a lakehouse is the right data architecture. I say “who cares?” With a universal semantic layer, enterprises can provide access to both the data warehouse and the data lake and hide the data’s location (and complexity) from their consumers. Providing access to both raw and prepared data is really important. The finance team will typically want to access blessed, reconciled data, ideally stored in a data warehouse, while the marketing team may want to analyze clickstream data in the data lake.
As you can see in the architecture below, with a proper USL, you can abstract data’s location and form and make data access ubiquitous for any analytics consumer.
Where a Universal Semantic Layer Fits in the Data Stack
This sort of abstraction layer makes data locale and data format invisible to end users.
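To make the idea of hiding data location a bit more concrete, here is a minimal Python sketch. It is only an illustration, not AtScale’s API: the `CATALOG` mapping, the `PhysicalSource` type, the `resolve` function and every table name and path in it are made up for this example. The point is simply that consumers ask for a business name and never see whether the data sits in a warehouse or a lake.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PhysicalSource:
    platform: str  # e.g. "snowflake" or "s3" (illustrative values)
    location: str  # table name or object path (made-up values)


# Hypothetical catalog: business-friendly names mapped to wherever the data lives.
CATALOG = {
    "finance.revenue": PhysicalSource("snowflake", "FIN_DB.PUBLIC.REVENUE_RECONCILED"),
    "marketing.clickstream": PhysicalSource("s3", "s3://example-lake/raw/clickstream/"),
}


def resolve(logical_name: str) -> PhysicalSource:
    """Return the physical source behind a business name.

    Consumers only ever pass the logical name; moving the data means
    editing CATALOG, not every report or notebook that uses it.
    """
    try:
        return CATALOG[logical_name]
    except KeyError:
        raise KeyError(f"unknown dataset: {logical_name}") from None


if __name__ == "__main__":
    for name in ("finance.revenue", "marketing.clickstream"):
        source = resolve(name)
        print(f"{name} -> {source.platform}: {source.location}")
```

If the clickstream data later moves from the lake into the warehouse, only the catalog entry changes; the finance and marketing teams keep using the same names.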
Who, What, Where?
It’s not just that data has become more dispersed. The types of analytics consumers and their respective tool sets have proliferated too. While the needs of a data scientist and a business intelligence user may seem quite different, they both need simple and secure access to clean, understandable data. With today’s self-service architectures, we’ve forced our analytics consumers to become data wranglers and data engineers. In fact, the average data scientist spends more than 80% of their time preparing data rather than modeling it. Besides being a colossal waste of time, asking business users and data scientists to program their own metrics and business terms is a recipe for chaos and inconsistency. Here, too, the USL is an excellent solution. By defining business metrics, data access and transformations in one place, analytics consumers are almost guaranteed to speak the same language, regardless of their use case or tool sets.
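As a rough illustration of defining a metric once and reusing it everywhere, consider the Python sketch below. It is a simplified assumption of how such a layer could work, not a description of any product: the `Metric` type, the `METRICS` registry, the `build_query` helper and the `orders` table with its `order_amount` column are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Metric:
    name: str
    expression: str  # SQL aggregate that defines the metric
    table: str       # hypothetical source table


# Single place where "revenue" is defined; every tool reuses this definition.
METRICS = {
    "revenue": Metric("revenue", "SUM(order_amount)", "orders"),
}


def build_query(metric_name: str, group_by: list[str]) -> str:
    """Render the shared metric definition as SQL for the requested dimensions."""
    metric = METRICS[metric_name]
    dims = ", ".join(group_by)
    select = f"{dims}, " if dims else ""
    sql = f"SELECT {select}{metric.expression} AS {metric.name} FROM {metric.table}"
    if dims:
        sql += f" GROUP BY {dims}"
    return sql


if __name__ == "__main__":
    # A BI dashboard and a data science notebook ask different questions,
    # but both get the same definition of revenue.
    print(build_query("revenue", ["region"]))
    print(build_query("revenue", []))
```

Because the aggregate lives in one registry, changing how revenue is calculated is a single edit rather than a hunt through dashboards, spreadsheets and notebooks.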
Even better, by creating a single point for data access, the USL also serves as a central governance gateway across the enterprise. IT can secure the data and control access to it once and for all. As you can see from the chart below, 79% of enterprises rank cloud security and governance as critical to their success in the cloud.
Source: 2020 Big Data & Analytics Maturity Survey
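The governance gateway idea can be sketched in a few lines of Python. Treat this as an assumption-laden toy, not a real access-control system: the `POLICIES` table, the role names and the `can_read` function are invented for illustration.

```python
# Hypothetical policy table: which roles may read which logical datasets.
POLICIES = {
    "finance.revenue": {"finance_analyst", "cfo"},
    "marketing.clickstream": {"marketing_analyst", "data_scientist"},
}


def can_read(role: str, dataset: str) -> bool:
    """Deny by default: unknown datasets or unlisted roles get no access."""
    return role in POLICIES.get(dataset, set())


if __name__ == "__main__":
    print(can_read("cfo", "finance.revenue"))
    print(can_read("data_scientist", "finance.revenue"))
```

Since every tool goes through the same check, a rule is written once instead of being re-created in each BI tool and notebook environment.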
Keep it Universal and Make it Simple
I’m a firm believer in the design principle of KISS (Keep It Simple, Stupid). By removing moving parts, you can simplify a design drastically and improve resiliency.
So why add another moving part to the analytics stack, you ask? Actually, by adding a Universal Semantic Layer to your architecture, you can drastically simplify your stack, not complicate it. To start with, you can retire multiple proprietary, conflicting, tool-based semantic layers that are tough to maintain and impossible to keep in sync. The key is that a semantic layer is useless and counterproductive unless it’s universal. All tool and cloud vendors want to convince you to stay within their walled gardens with their tool-specific semantic layers. Don’t fall into this trap (again).
By investing in a stand-alone Universal Semantic Layer, you can free yourself from vendors’ proprietary chains and create the flexibility you’ll need as new data platforms and tools inevitably continue to proliferate. Best of all, with a USL, everyone will be speaking the same language and playing by the same rules.
To learn more about how the AtScale Universal Semantic Layer can work for you, download the “Achieve Data-Driven Insights With AtScale Cloud OLAP” whitepaper.
To learn more about where enterprises are investing, download the “2020 Big Data & Analytics Maturity Survey Results” report.
To learn more about how AtScale can scale your cloud data warehouse and save you money, download the “AtScale Cloud Data Warehouse Benchmark Report”.