How To Guide

How To Get Started with a Semantic Layer with Databricks

The AtScale semantic layer platform is a powerful complement to a Databricks Lakehouse.  AtScale forms a “diamond layer” on Databricks, enabling high performance, self service analytics from any BI platform.  

Follow this basic guide to get started with AtScale on Databricks.

Step 1. Log in and set up

Log into AtScale and open the Design Center, the main web-based interface for AtScale, where you can perform all your tasks and build your data model. We’re going to be using Databricks SQL as the principal repository.

AtScale Design Center - Databricks SQL

You can start from scratch and create a new model, import an XML file, or work on an existing project. For this guide, we’ll copy an existing project. 

You’ll see your existing projects listed on the screen. Here we have one called Sales Insights.

To duplicate the existing project, just click on the Copy icon at the end of the project box. Fill in a new name in the dialog that pops up. You can also set permissions and change aggregate settings for the project. When you’re ready, click Copy Project. You should now see your new project listed on the screen. Now you have a starting point to begin your modeling.

AtScale Design Center - Make a Copy of Project Sales Insights

Step 2. Explore your Cube

Click on the project you’ve just made to access the Project Overview Screen. Here you’ll see one or more Cubes waiting for you. In our example, we have one called Internet Sales Cube.

AtScale Design Center - Sales Insights Databrcicks

Click on the Cube and you’ll find yourself in the Cube Canvas Screen with some information from this Cube, such as the measures and dimensions. 

AtScale Design Center - Cube Canvas

Click on Enter Model to see the existing model. You can use this model as a seed to create your own.

AtScale Design Center - Enter Model

Step 3. Make a new Cube

Click on your project name in the first tab on the left to return to the Project Overview Screen.

Click New Cube. Add a name in the dialog box–let’s call it Cube 2, and add a description and change permissions if you need to.

AtScale Design Center -Add New Cube

Click Add Cube. Now click on the Cube you’ve just made and then click Enter Model. This is where you’re going to build your AtScale logical data model.

AtScale Design Center - Enter Model

Step 4. Add your data

Open the Library tool from the toolbar on the right-hand side of the workspace. From the Library, you can drag in tables or objects and drop them into your canvas, which will pull the logical data model into AtScale and allow you to work with all of the fields in the underlying data source.

AtScale Design Center -add data

Here we’ll start by dragging our Fact Internet Sales table into the canvas. Click on the hamburger menu in the top right of the table to get a quick view of the raw table data.

AtScale Design Center - Add fact table

When you look at this data model, you may see that there are fields you may want to modify to create additional fields.

AtScale Design Center- edit fields

Close the data model and return to your Canvas.

If you want to add a dimension, such as the date, to this model, you can grab the dimension from the Library and drag it into the Canvas.

This is one of the great features of AtScale: if you’ve already built a dimension, you can reuse it as what we call a Conform Dimension when you build additional models, allowing you to really leverage that power BI semantic layer.

AtScale Design Center - Conform dimension

Step 5. Build a relationship

So now you have a fact table and a dimension, it’s time to join the two together.

On your Fact Table, select an entry and drag it to an entry on your Date Dimension. In our example, here we’ll drag the Shipdatekey from the Fact Table to Daydatekey in the Date Dimension table to create a relationship between the two.

AtScale Design Center - Join two dimensions

In the dialog box that pops up, you can enter the Relationship Settings for the Role-Playing Template, let’s call ours “Ship”.

AtScale Design Center - Create a Relationship

Step 6. Create your data model

To build a semantic layer, click Save. You have now created an AtScale data model.

Step 7. Preview

Click on the Preview tool on the right-hand side to see what your BI tool is going to see when you publish your model.

AtScale Design Center - Preview

Step 8. Role playing

Of course, your dimensions can serve multiple business purposes, so AtScale lets you reuse them multiple times. You can easily add another relationship, by grabbing the OrderDate entry, for example, and dragging it into the same entry in the Date Dimension table and giving it a name, eg. Order, in the Role-Playing Template. With AtScale, reusability and flexibility for your analytics are baked in.

AtScale Design Center

AtScale Design Center -Create a Measure

This modularity means that in a hub and spoke analytics system, for example, you could have one team doing billing analytics, and your subject matter experts in different departments who own different domains can access and leverage all that work.

In this scenario, AtScale’s semantic layer would allow everyone to get a consistent view of dates and times right across BI tools, and come up with the same answers from this single source of truth. 

Step 9. Create measures

Let’s create a measure using Order Quantity from our Fact Table. Click on the Measures tool on the right. 

AtScale Design Center - Create a Measure

Now grab Order Quantity from the Fact Table, drag it over and drop it in the measures tool. 

In the dialog box that pops up, you can give the measure a name and append the suffix SUM, AVGT or AVG to it. Then if you scroll down you can adjust various settings such as the aggregation handling. Click Save when you’re ready.

AtScale Design Center

Add any other Measures you need in the same way and, thanks to AtScale’s iterative design process, in just a few clicks, you’ve already built a model that you can use to query data!

Step 10. Key-value pairs

On your Cube Canvas, click the hamburger menu on your Fact Table. Select View Raw Table Data. 

AtScale Cube Canvas - View Raw Table Data

In the different fields, e.g. Product Info, there can be embedded data. For example, key-value pairs of data like color, size, weight, and style in one column. 

Close the Raw Table view and on your Fact Table, click on the entry for the field you’re interested in. A dialog box will pop up and allow you to quickly map these columns.

AtScale Cube Canvas - map columns

Hit Save, and AtScale automatically extracts this data set and maps them onto your model. You can even drag these key-value pairs onto your Dimensions box to virtually add them there too–no need to call your data architect or ETL team.

AtScale Cube Canvas

Step 11. Publish

So, let’s make our metrics analysis-ready and available to run with live queries using the BI, Data Science, and ML services of your choice.

Return to your project overview. In the development phase we’ve been walking through, nobody could connect to your model. Now, we want to run a query, so, if you’re ready, click Publish to allow access to the data model.

AtScale - publish data model

Click through the dialog boxes and configure the settings as appropriate then click Finish. Now, on the left of your screen, you should see your published Cubes.

AtScale Design Center - Published Cubes

Step 12. Connect to Tableau for your BI analytics

Let’s click on our Cube 2 and click the Connect button at the top of the workspace. 

On the next screen, you will see a choice of BI tools and interfaces you can connect with your AtScale Cube.

Connect AtScale cube to Tableau

Once it’s downloaded click on it and Tableau will open up. Log in and Tableau will connect to the AtScale Cube. 

Login to Tableau

Now, you should see your data model and you can start to run queries by dragging and dropping your entries. Tableau will use AtScale to run queries and AtScale sends the query to Databricks and returns an answer to Tableau.

Tableau + AtScale

Step 13. Self-service BI with Excel

Go back to your Projects tab and click on your project, then select your published Cube 2 on the left, and then click on the Connect button.

Publish AtScale cube to Excel

Now, click on Excel and follow the instructions on the Client Tool Connection box that appears. 

Connect AtScale to Excel

Once Excel is ready, go to the Data tab, then click on the Get External Data button on the left. When the provider menu comes up, choose From Analysis Services. 

Excel- get data from AtScale cube

Paste your server connection info in the Server Name field and enter your username and password and click Next.

Excel connect to server

AtScale will ask you what you want to connect to. We’ll select Cube 2 and click Next. 

Excel - select database and table

On the next screen, check the Save Password In The File box if you want, and click Finish.

Excel

Choose how you want to view the data in your workbook and where you want to put the data, then click OK.

Excel

Now, you’ll have access to your data model provided to Excel through the semantic layer! 

Access data model in Excel

So, what’s going on under the hood of AtScale’s power BI semantic layer?

How can AtScale work so flexibly with such a variety of large-scale cloud data analytics? You can easily see what happens with the queries in detail, thanks to the Queries tab in AtScale.

AtScale Design Canvas - queries

This area lets you see how AtScale takes a query and rewrites it for the backend dialect. It doesn’t matter what platform or what dialect you’re running in the background. AtScale receives the query from Tableau, Excel, or other BI analytics tools and then automatically converts it to the proper dialect for the system so you can use any BI tool with every BI tool. 

AtScale queries

Even if different versions have different SQL, AtScale works out what to do with it, rewrites it, and sends it to the Databricks server. AtScale also accepts inbound MDX. 

AtScale replaces legacy “Cube” architectures like SSAS with a solution that enables blazing-fast dimensional analysis by creating aggregates on Databricks.

AtScale’s optimization system learns from the type of BI analytics queries it receives. It uses statistical analysis to create its own aggregates on the fly and leverages multiple aggregates in combinations to speed up the return of the cloud analytics results you need. No need for data wranglers or SQL jockeys.

AtScale aggregates

It can even rewrite a query to use an aggregate created by a previous user, in a previous BI tool, in a previous session. You can train a Cube about the type of queries you’re likely to be running and that learned efficiency is then available to anybody who connects to the Cube, meaning you only ever have to do it once. 

Aggregate tables also get resynced with the underlying tables that they’re coming from on Databricks.  

In the published section in your Cube, there are several ways to build your aggregates. Initially, when you create a data model, you will probably build them manually until the data model becomes part of a production workflow.

AtScale projects

Later, when you’re loading your data warehouses incrementally you’ll be able to use an API call to rebuild the aggregate right after your ETL step. 

AtScale - API trigger

AtScale ensures your aggregate updates can be done as quickly as possible, so your cloud data analytics are always fresh.

Finally, AtScale also allows you to look at your utilizations and see the query time saved in terms of BI analytics use and database processing.

AtScale query time saved

Databricks and AtScale’s Semantic Layer enable effortless self-service BI analysis for everyone in your organization. If you’d like to experience the power BI semantic layer provided by AtScale and Databricks’ Lakehouse for yourself, schedule a live demo today

Request a Demo