Consider This When Managing Snowflake
In a recent webinar, Mark Stange-Tregear of Rakuten Rewards and I discussed how to control costs and manage your cloud data warehouse environment. Mark made some excellent suggestions for how to better manage your Snowflake environment.
Many cloud data warehouse vendors insist that they already have fast performance and endless capacity, leading you to believe that you have nothing to worry about. But buyer beware: there’s always a catch. To manage a cloud data warehouse efficiently, you should think about the following four dimensions:
Query Performance: How fast can the cloud data warehouse return a single query?
Each cloud data warehouse has its own latency for returning query results. If your end users require OLAP-style, instant queries, not all cloud data warehouses will fit the bill. Snowflake actually does a pretty good job in this department by leveraging its query cache. However, if a query hasn’t been cached, you’re likely to see it run for several seconds.
User Concurrency: How do multiple users running simultaneous queries affect performance and stability?
You would think that with the cloud’s endless compute capacity, user concurrency wouldn’t be a problem. But that’s just not the case. If your cloud warehouse is undersized for a spike in user query activity, query latency increases. As Mark mentioned, the way that cloud data warehouses typically deal with too much concurrency is by queuing. While queuing maintains the health of the cluster, queries will stack up and wait for their slot to run. This can result in unpredictable query runtimes and frustrated users.
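To make the queuing effect concrete, here is a minimal sketch of a first-come, first-served queue in front of a fixed number of execution slots. It is a toy model, not a description of how Snowflake schedules work internally: the function name `simulate_queue`, the constant 5-second runtime, and the slot counts are all assumptions chosen for illustration.

```python
import heapq

def simulate_queue(arrivals, runtime, slots):
    """Return per-query latency (queue wait + runtime) in seconds.

    arrivals: sorted list of arrival times in seconds
    runtime:  seconds each query needs once it starts (assumed constant)
    slots:    how many queries can run at the same time (hypothetical)
    """
    # Min-heap holding the time at which each slot next becomes free.
    free_at = [0.0] * slots
    heapq.heapify(free_at)
    latencies = []
    for arrival in arrivals:
        slot_free = heapq.heappop(free_at)
        start = max(arrival, slot_free)  # wait in the queue if no slot is free
        finish = start + runtime
        heapq.heappush(free_at, finish)
        latencies.append(finish - arrival)
    return latencies

# Ten queries arrive in the same instant; each needs 5 seconds of compute.
burst = [0.0] * 10
small = simulate_queue(burst, runtime=5.0, slots=2)
large = simulate_queue(burst, runtime=5.0, slots=10)
print("2 slots, worst latency:", max(small))
print("10 slots, worst latency:", max(large))
```

In this toy model the same query takes anywhere from 5 to 25 seconds on the undersized configuration, depending only on where it lands in the queue. That spread is the unpredictability end users notice.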
Compute Costs: How do query workloads and cluster configuration impact your monthly bill?
Mark talked a lot about compute costs because they need to be planned and managed before you get that surprise monthly bill (we’ve all been there). At Rakuten Rewards, Mark built a cost management system by leveraging Snowflake’s system activity tables so his CFO can see all cloud costs by department and function. No surprises.
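The webinar did not go into how Mark’s system is implemented, but the core idea of rolling usage up by department can be sketched in a few lines. Everything below is an assumption for illustration: the row shape (a warehouse name plus credits used), the `WAREHOUSE_OWNER` mapping, the warehouse names, and the price per credit are hypothetical, and in practice the rows would come from whatever usage data your account exposes.

```python
from collections import defaultdict

# Hypothetical mapping from warehouse name to the department that owns it.
WAREHOUSE_OWNER = {
    "BI_WH": "Analytics",
    "ETL_WH": "Data Engineering",
    "DS_WH": "Data Science",
}

def cost_by_department(metering_rows, price_per_credit):
    """Sum credit usage per department and convert it to currency."""
    totals = defaultdict(float)
    for row in metering_rows:
        # Usage from unmapped warehouses is kept visible instead of dropped.
        dept = WAREHOUSE_OWNER.get(row["warehouse_name"], "Unassigned")
        totals[dept] += row["credits_used"] * price_per_credit
    return dict(totals)

# Made-up sample rows standing in for exported usage records.
rows = [
    {"warehouse_name": "BI_WH", "credits_used": 12.0},
    {"warehouse_name": "ETL_WH", "credits_used": 30.0},
    {"warehouse_name": "BI_WH", "credits_used": 8.0},
    {"warehouse_name": "SANDBOX_WH", "credits_used": 2.0},
]
report = cost_by_department(rows, price_per_credit=3.0)
for dept, cost in sorted(report.items()):
    print(f"{dept}: {cost:.2f}")
```

Keeping an explicit "Unassigned" bucket is a deliberate choice in this sketch: spend that nobody owns is usually where the surprises come from.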
Semantic Complexity: How difficult is it to write the query to answer the business question?
Mark and I talked a lot about the importance of data modeling to make your cloud data warehouse consistent and easy to use. By creating a semantic layer on top of Snowflake’s raw tables and views, Mark made sure that his analysts and data scientists were all speaking the same language and saved them from the drudgery of ETL and data engineering.
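As a rough illustration of what a semantic layer buys you, the sketch below defines each business metric and dimension once and generates the SQL from those definitions, so every analyst who asks for "total sales by region" gets the same calculation. This is not AtScale’s API or modeling language: the `METRICS` and `DIMENSIONS` dictionaries, the `build_query` helper, and the `orders` table and its columns are all hypothetical.

```python
# Hypothetical business definitions, written once and shared by everyone.
METRICS = {
    "total_sales": "SUM(order_amount)",
    "order_count": "COUNT(DISTINCT order_id)",
}
DIMENSIONS = {
    "region": "customer_region",
    "order_month": "DATE_TRUNC('month', order_date)",
}

def build_query(metric, dimension, table="orders"):
    """Translate a business question into SQL using the shared definitions."""
    metric_sql = METRICS[metric]        # raises KeyError for unknown metrics
    dim_sql = DIMENSIONS[dimension]     # raises KeyError for unknown dimensions
    return (
        f"SELECT {dim_sql} AS {dimension}, {metric_sql} AS {metric} "
        f"FROM {table} GROUP BY {dim_sql}"
    )

print(build_query("total_sales", "region"))
print(build_query("order_count", "order_month"))
```

Analysts ask for names like `total_sales` and `region`; only the definitions need to change if the underlying tables do.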
What does the cloud analytics stack look like?
The Rakuten Rewards team restructured their data infrastructure by moving from an on-premises Hadoop cluster to a Snowflake cloud data warehouse on Amazon Web Services (AWS), with AtScale providing a universal semantic layer to optimize queries, manage costs and make the data easy to work with. Mark’s centralized BI team provides his internal customers with access to data through Tableau Server dashboards, ad hoc analysis using Tableau Desktop, and hand-written SQL queries.
The Benefits of the Universal Semantic Layer
What makes all this possible is the Universal Semantic Layer™. With AtScale, you get a full multi-dimensional engine that provides a rich, business-friendly interface for users while ensuring consistency for key business metrics and definitions.
In addition to the power of a semantic layer, AtScale’s single point of entry delivers a one-stop data governance shop. You can apply your data governance policies at a logical or physical level while virtualizing and hiding the physical implementation of the data.
Snowflake’s elastic, scalable resource model helps Rakuten Rewards manage their analytics infrastructure with the flexibility and scalability the business demands. AtScale’s Universal Semantic Layer provides labor-saving automation and makes Rakuten’s data easier and safer to use.
- How to Realize an Additional 270% ROI with Snowflake (Slideshare Presentation)