This article offers you a step-by-step guide to building an AtScale semantic layer on a Databricks lakehouse .
We’ll start by showing you how to create and publish a model within AtScale. We’ll then look at the difference of using Power BI with and without the AtScale semantic layer. We’ll also take a quick look at how to use Excel with a live connection to the model, to illustrate how AtScale can enable better self-service BI analysis in these environments.
Let’s get started.
Step 1. Log in and open the AtScale Design Center
When you first log into AtScale, you’ll be taken to the Design Center. This web-based interface is your workshop for everything you want to do with AtScale, where you can build your models, manage the system, watch queries, and so on. It allows you to create a business-friendly view of the data to facilitate their BI analytics.
On the AtScale Design Center home screen, you’ll find Quick Start buttons, where you can start a new project from scratch, explore a sample data model, import projects to work on, or continue to work on an existing project you’ve built.
Step 2. Choose the data model you want to work on.
Open the Projects tab and select a model to work on from the left sidebar.
This will open the Project Overview screen, where you will find all the design tools you need to build a data model.
Step 3. Add a new data model
Within the project, create a new AtScale data model by clicking Add New Cube. You’ll want to give it a meaningful name, description, and decide on who has access to read and edit it.
Once you click Add Cube, the new Cube will appear on your Project Overview screen.
Step 3. Build your data model
Once you’ve added a Cube, you can enter the model. Select Enter Model to work on the new Cube canvas.
You can start to build your AtScale data model by simply dragging and dropping measures, fact tables and dimensions from the Library onto the canvas. Note that this won’t move the existing tables from the Databricks data lake.
You can create a Preview Box that allows you to see what the BI users will see when the project is live. With this option, you can easily keep an eye on how the project you are developing will translate to the BI tool.
Step 5. Add measures and dimensions
Select Create a Measure and add the data columns you want to include as measures.
When you save the measure, you’ll notice that it has appeared in the Preview Box. This indicates that your BI tool will now be able to access it.
Similarly, you can create date dimensions, customer dimensions, order dimensions and so on. You can also reuse customized pre-built dimensions that you have already been using.
Here, we’ll add the order date to your date dimension. Simply enter the key for the date, and the date dimension will automatically populate. You can then create the relevant relationships between your data points, and these relationships will be reflected in Power BI.
Step 6. Enrich the data set.
With AtScale, you can enrich your data set by creating the calculations and formulas you need to deliver business intelligence to your end users, without having to go back to the data administrator or actually modifying or manipulating the underlying data set itself.
You are simply creating virtual elements that can look at the data lake and reflect the data you want to see in a format that is useful for business users. Even if you change your formulas, the data itself remains untouched. This way, you keep the underlying data secure and safe from error, while still being able to use it as you need to.
Step 7. Publish your model
When you’re happy with your model, you’re ready to publish it. Select your data source–in our example, we’re using Azure Databricks. Click through the wizard and hit Finish to publish. Once you’ve published your model, your users will be able to come in and query it.
Using Power BI with and without the AtScale Semantic Layer
Now, let’s look at the process of accessing the data you need using Power BI. without the AtScale semantic layer. While it is still possible to connect directly to the Databricks lakehouse, it’s somewhat less straightforward.
In Power BI, go to the Get Data tab and locate your data source. Again, our example is using Azure Databricks.
You will now need to find and enter the server, the host name and the http path.
One way to do this is to go back to the Databricks warehouse screen, locate those elements, copy the data one by one, and then go back to Power BI to paste them into the Power BI info sheet.
You’ll notice that the data connectivity mode defaults to “Import”. Since in this instance we’re not connected using DAX, Power BI will want to import a copy of the data to work with.
The benefit of using the AtScale semantic layer here is that you can select Direct Query. You can make full use of the data without having to create a copy or change the data itself.
Now, log into your Databricks cluster. Without AtScale, you will need to know which tables, schemas and databases you need and to search for these.
You will also need to create your date dimension, selecting the various tables you want to include. A problem here is that the date dimension you create may differ significantly from your colleagues’. As a result, data using different time periods could produce inconsistent results across your dashboards.
Now, import your data. This may take some time.
You will find your tables under the Modeling tab. However, without AtScale, there are no relationships yet between the tables. You will now need to create these.
The upshot of this is that, without AtScale, whenever your Power BI users want to create a report, they will need to know as much as your data steward knows.
Using AtScale, you simply need to create a connection string. Using the built-in SQL Server Analysis Services database, paste the data from Power BI into the AtScale server.
In this case, you’ll notice it defaults to Connect Live. This means you don’t have to move data off the Databricks lakehouse.
With AtScale, instead of having to model data in Power BI, you can simply point to it live, and the data will come over pre-modeled. You can drill into your date dimensions and measures dynamically, and publish reports without having to manipulate the data at all. This is how AtScale helps you to access truly business-ready data.
Using PowerBI and AtScale with Excel
You can also use the data dynamically in Excel. In Power BI, go to Get Data -> From Database, and select the Analysis Services connector.
Paste in the URL, and log in. Find your model and create a live connection in Excel.
Now, you can analyze the data and create useful visualizations, all without having to do any data-built modeling.
The Value of a Universal Semantic Layer
Using a semantic layer like AtScale allows you to create a fast, secure and governed business-friendly data layer. This enables your BI and AI teams and application developers to perform cloud data analytics, while also making that data easy to digest for your data consumers.
AtScale offers consumers and administrators a single source of truth, optimized for high performance and with consistent governance. With AtScale, data consumers don’t need to become data engineers to access and use the data they need to make decisions.
If you’d like to experience the power of Power BI with AtScale for yourself, schedule a live demo today!