The AtScale semantic layer platform is a powerful complement to a Databricks Lakehouse. AtScale forms a “diamond layer” on Databricks, enabling high-performance, self-service analytics from any BI platform.
Follow this basic guide to get started with AtScale on Databricks.
Step 1. Log in and set up
Log into AtScale and open the Design Center, the main web-based interface for AtScale, where you can perform all your tasks and build your data model. We’ll be using Databricks SQL as the underlying data warehouse.
You can start from scratch and create a new model, import an XML file, or work on an existing project. For this guide, we’ll copy an existing project.
You’ll see your existing projects listed on the screen. Here we have one called Sales Insights.
To duplicate the existing project, just click on the Copy icon at the end of the project box. Fill in a new name in the dialog that pops up. You can also set permissions and change aggregate settings for the project. When you’re ready, click Copy Project. You should now see your new project listed on the screen. Now you have a starting point to begin your modeling.
Step 2. Explore your Cube
Click on the project you’ve just made to access the Project Overview Screen. Here you’ll see one or more Cubes waiting for you. In our example, we have one called Internet Sales Cube.
Click on the Cube and you’ll find yourself in the Cube Canvas Screen with some information from this Cube, such as the measures and dimensions.
Click on Enter Model to see the existing model. You can use this model as a seed to create your own.
Step 3. Make a new Cube
Click on your project name in the first tab on the left to return to the Project Overview Screen.
Click New Cube. Add a name in the dialog box (let’s call it Cube 2), then add a description and change permissions if you need to.
Click Add Cube. Now click on the Cube you’ve just made and then click Enter Model. This is where you’re going to build your AtScale logical data model.
Step 4. Add your data
Open the Library tool from the toolbar on the right-hand side of the workspace. From the Library, you can drag in tables or objects and drop them into your canvas, which will pull the logical data model into AtScale and allow you to work with all of the fields in the underlying data source.
Here we’ll start by dragging our Fact Internet Sales table into the canvas. Click on the hamburger menu in the top right of the table to get a quick view of the raw table data.
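If you’d also like to sanity-check the same table directly on Databricks SQL, here’s a minimal sketch using the open-source databricks-sql-connector Python package; the workspace hostname, HTTP path, access token, and three-part table name are placeholders to swap for your own.

```python
# Minimal sketch: preview a few rows of the source table on Databricks SQL.
# Assumes the databricks-sql-connector package (pip install databricks-sql-connector).
# The hostname, HTTP path, token, and table name below are placeholders.
from databricks import sql

with sql.connect(
    server_hostname="your-workspace.cloud.databricks.com",
    http_path="/sql/1.0/warehouses/your-warehouse-id",
    access_token="your-personal-access-token",
) as connection:
    with connection.cursor() as cursor:
        # A quick peek, similar to the raw-table view in Design Center
        cursor.execute("SELECT * FROM main.sales.fact_internet_sales LIMIT 10")
        for row in cursor.fetchall():
            print(row)
```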
When you look at this data model, you may notice fields you’d like to modify or use to create additional fields.
Close the data model and return to your Canvas.
If you want to add a dimension, such as the date, to this model, you can grab the dimension from the Library and drag it into the Canvas.
This is one of the great features of AtScale: if you’ve already built a dimension, you can reuse it as what we call a Conformed Dimension when you build additional models, allowing you to fully leverage the power of the semantic layer.
Step 5. Build a relationship
Now that you have a fact table and a dimension, it’s time to join the two together.
On your Fact Table, select an entry and drag it to an entry on your Date Dimension. In our example, we’ll drag ShipDateKey from the Fact Table to DayDateKey in the Date Dimension table to create a relationship between the two.
In the dialog box that pops up, you can enter the Relationship Settings for the Role-Playing Template; let’s call ours “Ship”.
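To make the relationship concrete, here’s an illustrative pandas sketch of what the ShipDateKey-to-DayDateKey join means. AtScale performs the equivalent join at query time on Databricks, so this is purely for intuition, and the tiny sample rows are made up.

```python
# Illustrative only: roughly what the ShipDateKey -> DayDateKey relationship means.
# AtScale itself performs this join at query time against Databricks, not in pandas.
import pandas as pd

fact_internet_sales = pd.DataFrame({
    "ShipDateKey": [20240105, 20240106],
    "OrderQuantity": [3, 5],
})
dim_date = pd.DataFrame({
    "DayDateKey": [20240105, 20240106],
    "CalendarYear": [2024, 2024],
})

# The relationship: each fact row looks up its shipping date in the date dimension
shipped = fact_internet_sales.merge(
    dim_date, left_on="ShipDateKey", right_on="DayDateKey", how="left"
)
print(shipped)
```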
Step 6. Create your data model
To build a semantic layer, click Save. You have now created an AtScale data model.
Step 7. Preview
Click on the Preview tool on the right-hand side to see what your BI tool is going to see when you publish your model.
Step 8. Role playing
Of course, your dimensions can serve multiple business purposes, so AtScale lets you reuse them multiple times. You can easily add another relationship by grabbing the OrderDate entry, for example, dragging it onto the same entry in the Date Dimension table, and giving it a name, e.g. Order, in the Role-Playing Template. With AtScale, reusability and flexibility for your analytics are baked in.
This modularity means that in a hub and spoke analytics system, for example, you could have one team doing billing analytics, and your subject matter experts in different departments who own different domains can access and leverage all that work.
In this scenario, AtScale’s semantic layer would allow everyone to get a consistent view of dates and times right across BI tools, and come up with the same answers from this single source of truth.
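If you’re curious what role playing amounts to underneath, here’s another illustrative pandas sketch, again with made-up sample rows, showing the same date dimension joined twice: once in the Ship role and once in the Order role. AtScale generates the equivalent joins for you.

```python
# Illustrative only: the same date dimension "role-played" twice, as Ship and Order.
# AtScale generates the equivalent joins; this sketch just shows why one conformed
# dimension can answer both questions.
import pandas as pd

fact = pd.DataFrame({
    "OrderDateKey":  [20240101, 20240102],
    "ShipDateKey":   [20240105, 20240106],
    "OrderQuantity": [3, 5],
})
dim_date = pd.DataFrame({
    "DayDateKey": [20240101, 20240102, 20240105, 20240106],
    "MonthName":  ["January", "January", "January", "January"],
})

# Role "Order": join on the order date
with_order = fact.merge(
    dim_date.add_prefix("Order_"), left_on="OrderDateKey", right_on="Order_DayDateKey"
)
# Role "Ship": join on the ship date, reusing the very same dimension table
with_both = with_order.merge(
    dim_date.add_prefix("Ship_"), left_on="ShipDateKey", right_on="Ship_DayDateKey"
)
print(with_both)
```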
Step 9. Create measures
Let’s create a measure using Order Quantity from our Fact Table. Click on the Measures tool on the right.
Now grab Order Quantity from the Fact Table, drag it over and drop it in the measures tool.
In the dialog box that pops up, you can give the measure a name and append a suffix such as SUM or AVG to it. Then, if you scroll down, you can adjust various settings such as the aggregation handling. Click Save when you’re ready.
Add any other Measures you need in the same way and, thanks to AtScale’s iterative design process, in just a few clicks, you’ve already built a model that you can use to query data!
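For intuition, this is roughly what an Order Quantity SUM measure computes. AtScale pushes the actual aggregation down to Databricks; the sample rows here are invented for illustration.

```python
# Illustrative only: what an "Order Quantity SUM" measure computes.
# AtScale pushes this aggregation down to Databricks SQL at query time.
import pandas as pd

fact = pd.DataFrame({
    "OrderDateKey":  [20240101, 20240101, 20240102],
    "OrderQuantity": [3, 5, 2],
})

# SUM measure grouped by the date key -- swap .sum() for .mean() for an AVG measure
order_quantity_sum = fact.groupby("OrderDateKey")["OrderQuantity"].sum()
print(order_quantity_sum)
```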
Step 10. Key-value pairs
On your Cube Canvas, click the hamburger menu on your Fact Table. Select View Raw Table Data.
Some fields, such as Product Info, can contain embedded data: for example, key-value pairs like color, size, weight, and style packed into a single column.
Close the Raw Table view and on your Fact Table, click on the entry for the field you’re interested in. A dialog box will pop up and allow you to quickly map these columns.
Hit Save, and AtScale automatically extracts these values and maps them onto your model. You can even drag these key-value pairs onto your Dimensions box to virtually add them there too, with no need to call your data architect or ETL team.
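Conceptually, the mapping does something like the following sketch. The semicolon-delimited key=value format shown here is an assumption for the example; AtScale handles the real extraction for you once you map the column in the dialog.

```python
# Illustrative only: splitting an embedded key-value column into separate fields.
# The "key=value;key=value" format is an assumed example, not AtScale's implementation.
import pandas as pd

fact = pd.DataFrame({
    "ProductInfo": [
        "color=Red;size=M;weight=1.2;style=Casual",
        "color=Blue;size=L;weight=1.5;style=Formal",
    ]
})

def parse_kv(cell: str) -> dict:
    # Split "key=value" pairs out of a single delimited string
    return dict(pair.split("=", 1) for pair in cell.split(";"))

# Expand each parsed dict into its own columns: color, size, weight, style
mapped = fact["ProductInfo"].apply(parse_kv).apply(pd.Series)
print(mapped)
```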
Step 11. Publish
Now let’s make our metrics analysis-ready and available for live queries from the BI, Data Science, and ML services of your choice.
Return to your project overview. In the development phase we’ve been walking through, nobody could connect to your model. Now, we want to run a query, so, if you’re ready, click Publish to allow access to the data model.
Click through the dialog boxes and configure the settings as appropriate then click Finish. Now, on the left of your screen, you should see your published Cubes.
Step 12. Connect to Tableau for your BI analytics
Let’s click on our Cube 2 and click the Connect button at the top of the workspace.
On the next screen, you will see a choice of BI tools and interfaces you can connect with your AtScale Cube.
Click Tableau, and a connection file will download. Once it’s downloaded, click on it and Tableau will open. Log in and Tableau will connect to the AtScale Cube.
Now, you should see your data model and you can start to run queries by dragging and dropping your entries. Tableau runs its queries through AtScale: AtScale sends each query to Databricks and returns the answer to Tableau.
Step 13. Self-service BI with Excel
Go back to your Projects tab and click on your project, then select your published Cube 2 on the left, and then click on the Connect button.
Now, click on Excel and follow the instructions on the Client Tool Connection box that appears.
Once Excel is ready, go to the Data tab, then click on the Get External Data button on the left. When the provider menu comes up, choose From Analysis Services.
Paste your server connection info in the Server Name field and enter your username and password and click Next.
AtScale will ask you what you want to connect to. We’ll select Cube 2 and click Next.
On the next screen, check the Save Password In The File box if you want, and click Finish.
Choose how you want to view the data in your workbook and where you want to put the data, then click OK.
Now, you’ll have access to your data model provided to Excel through the semantic layer!
So, what’s going on under the hood of AtScale’s semantic layer?
How can AtScale work so flexibly across such a variety of large-scale cloud data platforms? You can see exactly what happens to your queries, thanks to the Queries tab in AtScale.
This area lets you see how AtScale takes a query and rewrites it for the backend dialect. It doesn’t matter what platform or what dialect you’re running in the background. AtScale receives the query from Tableau, Excel, or other BI analytics tools and then automatically converts it to the proper dialect for the system, so you can use any BI tool with any data platform.
Even if different versions have different SQL, AtScale works out what to do with it, rewrites it, and sends it to the Databricks server. AtScale also accepts inbound MDX.
AtScale replaces legacy “Cube” architectures like SSAS with a solution that enables blazing-fast dimensional analysis by creating aggregates on Databricks.
AtScale’s optimization system learns from the type of BI analytics queries it receives. It uses statistical analysis to create its own aggregates on the fly and leverages multiple aggregates in combinations to speed up the return of the cloud analytics results you need. No need for data wranglers or SQL jockeys.
It can even rewrite a query to use an aggregate created by a previous user, in a previous BI tool, in a previous session. You can train a Cube about the type of queries you’re likely to be running and that learned efficiency is then available to anybody who connects to the Cube, meaning you only ever have to do it once.
Aggregate tables also stay in sync with the underlying Databricks tables they’re built from.
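As a rough mental model (not AtScale’s actual implementation), a day-grain aggregate built once can answer later, coarser questions, such as totals by year, without rescanning the fact table:

```python
# Illustrative only: why an aggregate table speeds things up. A daily rollup built
# once can answer a coarser, year-level question without touching the full fact table,
# which is roughly what AtScale's query rewriting does on Databricks.
import pandas as pd

fact = pd.DataFrame({
    "DateKey":       [20230105, 20230106, 20240105, 20240106],
    "OrderQuantity": [3, 5, 2, 7],
})

# Aggregate built once at the day grain
daily_agg = fact.groupby("DateKey", as_index=False)["OrderQuantity"].sum()

# A later year-level query is answered by rolling up the small day-grain aggregate
daily_agg["Year"] = daily_agg["DateKey"] // 10000
by_year = daily_agg.groupby("Year")["OrderQuantity"].sum()
print(by_year)
```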
In the published section of your Cube, there are several ways to build aggregates. Initially, when you create a data model, you will probably build them manually until the data model becomes part of a production workflow.
Later, when you’re loading your data warehouse incrementally, you can use an API call to rebuild the aggregates right after your ETL step.
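For example, an orchestration script could trigger the rebuild as soon as the load finishes. The host, port, endpoint path, and payload in this sketch are placeholders rather than AtScale’s documented API, so check your installation’s API reference for the exact call and authentication scheme.

```python
# Hedged sketch: trigger an aggregate rebuild from an orchestration script after ETL.
# The host, port, endpoint path, and payload below are placeholders -- consult your
# AtScale API documentation for the real aggregate rebuild call.
import requests

ATSCALE_HOST = "https://atscale.example.com:10500"    # placeholder host and port
API_TOKEN = "your-api-token"                          # placeholder credential

response = requests.post(
    f"{ATSCALE_HOST}/aggregates/rebuild",             # hypothetical endpoint path
    headers={"Authorization": f"Bearer {API_TOKEN}"},
    json={"cube": "Cube 2"},                          # hypothetical payload
    timeout=60,
)
response.raise_for_status()
print("Aggregate rebuild requested:", response.status_code)
```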
AtScale ensures your aggregate updates can be done as quickly as possible, so your cloud data analytics are always fresh.
Finally, AtScale also lets you look at your utilization and see the query time saved, both in terms of BI analytics use and database processing.
Databricks and AtScale’s Semantic Layer enable effortless self-service BI analysis for everyone in your organization. If you’d like to experience the semantic layer provided by AtScale and the Databricks Lakehouse for yourself, schedule a live demo today!