May 19, 2021
How Third-Party Data Modeling Enables Smarter Decisions
AtScale is a launch partner for the new open source Delta Sharing project. We see tremendous value in establishing an open source protocol for data sharing within modern cloud data architectures. Enterprise business intelligence and data science teams are expanding their interest beyond their first-party data to include sharing second-party data with partners and consuming third-party data from data providers. Delta Sharing streamlines data sharing with an open, scalable, and cost-efficient protocol, making it easier to consume new datasets, leverage new services, and incorporate new capabilities into an analytics infrastructure. A semantic layer is a critical component of the modern analytics stack that will leverage this protocol and help propagate its use.
AtScale’s semantic layer solution lets enterprise analytics organizations establish a business-oriented logical data model of data managed on cloud data platforms like the Databricks Lakehouse Platform. BI and data science teams consume live cloud data through the AtScale model using tools of their choice – including Excel, Power BI, Tableau, Looker, or Python. AtScale persists a live connection to shared datasets available via pre-signed URLs on cloud storage and virtualizes queries originating from data consumers. Enterprise analytics teams typically deploy a semantic layer to support a self-serve analytics program and to ensure consistent analytics performance on large cloud datasets.
How a semantic layer works with blended data
As more data moves into the cloud, the semantic layer becomes an important enabler of advanced analytics and data science use cases leveraging Lakehouse platforms. While the cloud has made it cheaper and faster for many organizations to work with first-party data, and data providers are making a wealth of new data sources available, it remains complicated to bridge these data sources for two primary reasons:
- Difficulty in establishing access to both first-party data and third-party data within the same workspace.
- Complexity in blending two disparate data models for unified analysis.
Data virtualization, leveraging standards-based approaches to sharing data, can eliminate expensive and time-consuming data movement. Since both first- and third-party data shared via Delta Sharing servers is stored in cloud data platforms, virtualization can be an effective alternative to a physical ETL process.
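As a rough illustration of what reading shared data in place can look like from Python, the sketch below prepares a Delta Sharing profile file and a table address of the form `<profile>#<share>.<schema>.<table>`, which the open source `delta-sharing` connector accepts. The endpoint, token, and share, schema, and table names are hypothetical placeholders, and this is not a description of how AtScale itself connects.

```python
import json
import tempfile
from pathlib import Path


def write_profile(directory: Path, endpoint: str, bearer_token: str) -> Path:
    """Write a Delta Sharing profile file (JSON) and return its path."""
    profile = {
        "shareCredentialsVersion": 1,
        "endpoint": endpoint,
        "bearerToken": bearer_token,
    }
    path = directory / "provider.share"
    path.write_text(json.dumps(profile))
    return path


def table_url(profile_path: Path, share: str, schema: str, table: str) -> str:
    """Build the '<profile>#<share>.<schema>.<table>' address for a shared table."""
    return f"{profile_path}#{share}.{schema}.{table}"


def load_shared_table(url: str):
    """Read the shared table without an ETL copy (needs `pip install delta-sharing`)."""
    import delta_sharing  # imported lazily so the helpers above work without it

    return delta_sharing.load_as_pandas(url)


# Hypothetical endpoint, token, and table names, for illustration only.
workdir = Path(tempfile.mkdtemp())
profile_path = write_profile(
    workdir, "https://sharing.example.com/delta-sharing/", "<token>"
)
url = table_url(profile_path, "retail_share", "public", "foot_traffic")
```

With a real profile issued by a data provider, `load_shared_table(url)` would return the shared table as a pandas DataFrame, read directly from the provider's storage rather than from a locally loaded copy.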
Blending data from different schemas can be complicated and expensive when it’s based on the physical transformation of data. An alternative is to leverage a semantic layer solution that can define a blended data model, conforming important dimensions like time and geography – without physically moving the data. AtScale helps organizations make their data “analysis-ready.” BI analysts work with a consistent set of key enterprise metrics and dimensions, regardless of the tool they are working in. Data scientists choose from a broad range of features – including features built from third-party data sources – exposed in a consistent manner through the semantic layer.
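To make the idea of a blended logical model concrete, here is a minimal sketch that uses an in-memory SQLite view as a stand-in for a semantic layer. The table names, columns, and values are invented for illustration; the point is only that the blend is defined as a query over conformed time and geography dimensions, not as a physically transformed copy of the data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical first-party data: daily orders keyed by state.
CREATE TABLE first_party_sales (order_date TEXT, state TEXT, revenue REAL);
-- Hypothetical third-party data: monthly weather keyed by region code.
CREATE TABLE third_party_weather (month TEXT, region_code TEXT, avg_temp_f REAL);

INSERT INTO first_party_sales VALUES
  ('2021-04-03', 'CA', 120.0),
  ('2021-04-17', 'CA', 80.0),
  ('2021-04-09', 'NY', 50.0);
INSERT INTO third_party_weather VALUES
  ('2021-04', 'CA', 61.5),
  ('2021-04', 'NY', 52.0);

-- The blended model is a view: dates are conformed to months and
-- state is matched to region_code, with no data copied.
CREATE VIEW blended_sales_weather AS
SELECT s.month AS month,
       s.state AS state,
       s.revenue AS revenue,
       w.avg_temp_f AS avg_temp_f
FROM (
  SELECT substr(order_date, 1, 7) AS month, state, SUM(revenue) AS revenue
  FROM first_party_sales
  GROUP BY substr(order_date, 1, 7), state
) AS s
JOIN third_party_weather AS w
  ON w.month = s.month AND w.region_code = s.state;
""")

# Consumers query the blended model as if it were a single table.
rows = conn.execute(
    "SELECT month, state, revenue, avg_temp_f "
    "FROM blended_sales_weather ORDER BY state"
).fetchall()
```

Because the view is evaluated at query time, new rows in either source table appear in the blended result without a reload step, which is the property the semantic layer approach relies on.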
When these two techniques are combined within the AtScale semantic layer, data consumers can interact with a blended data model and access raw data through virtualized connections to live shared data. This is a highly efficient way to deliver the benefits of third-party data sharing for BI and data science use cases.
Tapping into big data sources in real time
In a recent blog post, How Third-Party Data Modeling Enables Smarter Decisions, we noted that as the amount of data being created and shared continues to grow, it makes sense for enterprises to tap into these big data sources in real time. Using AtScale, organizations can enable self-service analytics with both internal and external data at scale. This allows companies to enhance their existing datasets to achieve higher growth, productivity, and other positive business outcomes.
Like many other technology providers striving to modernize enterprise analytics, AtScale will be able to leverage standards like Delta Sharing to tap into the power of modern cloud data platforms more efficiently and help customers get more value from their data and AI initiatives.