May 31, 2022
Deliver Self-Service BI at scale with the Semantic Layer + Databricks
In a previous post, we talked about using AtScale’s semantic layer to merge Foursquare Places data with first-party data. By blending third- and first-party data, organizations can improve their decision-making using advanced analytics and predictive data modeling. In this post, let’s look at how a national retail chain that sells bikes could use a semantic layer and analysis-ready third-party data to make smarter decisions.
First, let’s learn more about how AtScale’s semantic layer breaks down data silos, and then follow along as the bike company’s CFO and CEO dig into the numbers on where to open their next retail store.
Correlating First-Party and Third-Party Data with a Semantic Layer
First-party systems, SaaS apps, and third-party data brokers each have their own data schemas. To analyze this data effectively, you need to model it consistently. That’s where AtScale’s semantic layer comes in. It uses live connections to on-premises and cloud data sources to break down internal data silos and make external data more accessible.
In addition, data virtualization helps companies share third-party data throughout the organization without expensive and time-consuming data movement and transformations. Data virtualization hides the complexity of dealing with multiple data sources, so that business analysts and data scientists can more easily consume blended datasets within popular BI tools (e.g. Power BI, Tableau, Excel) and AI/ML tools (e.g. DataRobot, H2O, Python).
Applying Third-Party Data to Make Smarter Decisions
In our bike store example above, the CFO wants to run the numbers on where to open a new retail location. Using first-party datasets in Excel, the CFO can see that sales were highest in California, as the image below shows.
Meanwhile, the CEO can see the same data on orders by geography in Tableau, without having to wait on data engineers to model the data for her. The CEO decides to correlate the store’s own first-party data with third-party data from Foursquare Places to determine the best locations to open a potential store.
From there, the CEO and CFO bring in the data science team to dig deeper into the data using AI/ML solutions such as H2O or DataRobot. The team can bring in point-of-interest (POI) data from Foursquare to try to predict foot traffic before deciding which California city to choose for the shop. From a Jupyter notebook, they can initialize a connection to AtScale to pull in this Foursquare data without having to be ETL pipeline or SQL experts.
Once inside the Jupyter notebook, the data science team can pull different features from the Foursquare Places dataset and blend them with their own data to gain key insights for the store opening. This can include looking at flood zone data to determine which areas pose the highest risk to the business.
From there, they can pull in POI data to drill down into the types of similar businesses open in the area and determine whether there is any local competition.
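To make the blending step concrete, here is a minimal sketch in plain Python. It is not AtScale’s or Foursquare’s API: the field names (`orders`, `avg_daily_visits`, `flood_risk`), the cities, and every value are hypothetical, chosen only to illustrate joining first-party rows to third-party features on a shared key.

```python
# Hypothetical first-party data: orders per city from the retailer's own systems.
first_party_orders = [
    {"city": "San Diego", "state": "CA", "orders": 1200},
    {"city": "Sacramento", "state": "CA", "orders": 950},
    {"city": "Fresno", "state": "CA", "orders": 400},
]

# Hypothetical third-party features keyed by (city, state); values are made up.
third_party_features = {
    ("San Diego", "CA"): {"avg_daily_visits": 310, "flood_risk": "low"},
    ("Sacramento", "CA"): {"avg_daily_visits": 280, "flood_risk": "high"},
}


def blend(orders, features):
    """Left-join third-party features onto first-party rows by (city, state)."""
    blended = []
    for row in orders:
        key = (row["city"], row["state"])
        # Cities with no third-party coverage keep their row, with empty features.
        extra = features.get(key, {"avg_daily_visits": None, "flood_risk": None})
        blended.append({**row, **extra})
    return blended


blended_rows = blend(first_party_orders, third_party_features)

# Example question: which cities with known features sit in a low flood-risk area?
low_risk = [r["city"] for r in blended_rows if r["flood_risk"] == "low"]
print(low_risk)
```

In a notebook, the same left join would typically be written as a pandas `merge` with `how="left"`; the plain-Python version is used here only so the sketch has no dependencies.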
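As a rough illustration of the competition check, the sketch below counts POIs per city that fall into categories the team treats as competitors. The records, the category labels, and the `COMPETITOR_CATEGORIES` set are all assumptions for this example and do not reflect the actual Foursquare Places schema.

```python
from collections import Counter

# Hypothetical POI records; names, cities, and category labels are illustrative.
pois = [
    {"name": "Spoke & Wheel", "city": "San Diego", "category": "Bike Shop"},
    {"name": "Coastal Sports", "city": "San Diego", "category": "Sporting Goods Store"},
    {"name": "Harbor Cafe", "city": "San Diego", "category": "Cafe"},
    {"name": "Valley Cycles", "city": "Fresno", "category": "Bike Shop"},
]

# Assumption: which categories count as competition for a bike retailer.
COMPETITOR_CATEGORIES = {"Bike Shop", "Sporting Goods Store"}


def competitors_by_city(poi_rows, categories):
    """Count POIs per city whose category is in the competitor set."""
    return Counter(p["city"] for p in poi_rows if p["category"] in categories)


competitor_counts = competitors_by_city(pois, COMPETITOR_CATEGORIES)
for city, count in competitor_counts.most_common():
    print(f"{city}: {count} competing locations")
```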
Time series data can help the team learn more about daily visits and average dwell times. In the example below, the dwell time was especially high around February 20 in Carpinteria, CA. With a bit more research, the data science team finds that this is when youth recreation activities reopened in that region, which drove a spike in interest in sporting goods.
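One simple way to spot a spike like this is to compare the peak day against the average of the other days. The sketch below does that on a made-up series: the dwell-time values and the year are hypothetical, and only the February 20 peak mirrors the example in the text.

```python
from datetime import date
from statistics import mean

# Hypothetical average dwell time (minutes) per day; the year is an assumption.
dwell_minutes = {
    date(2022, 2, 17): 11.0,
    date(2022, 2, 18): 12.0,
    date(2022, 2, 19): 13.0,
    date(2022, 2, 20): 27.0,
    date(2022, 2, 21): 14.0,
    date(2022, 2, 22): 12.5,
}

# The day with the longest average dwell time.
peak_day = max(dwell_minutes, key=dwell_minutes.get)

# Baseline: mean dwell time over every other day in the window.
baseline = mean(v for d, v in dwell_minutes.items() if d != peak_day)

# How many times larger the peak is than the baseline.
spike_ratio = dwell_minutes[peak_day] / baseline

print(f"Peak on {peak_day:%B %d}: {spike_ratio:.2f}x the baseline dwell time")
```

A ratio well above 1 flags a day worth investigating, which is the point where the team would look for an external cause such as the reopening of youth recreation activities.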
More and more businesses are tapping into big data sources like Foursquare Places and AWS Data Exchange in real time to make smarter decisions. Using AtScale, businesses can enable self-service analytics with both internal and external data at scale. This allows teams to enhance their existing datasets and gain critical insights that can impact the company’s bottom line.
Get Foursquare Places & Visits data, pre-modeled for insights by AtScale, by requesting a free trial. For more information on using pre-modeled Foursquare Places & Visits data, watch this 20-minute overview demo.