Data Lake Intelligence With Amazon S3 and Redshift Spectrum
Hadoop pioneered the concept of a data lake, but the cloud really perfected it. It’s no longer necessary to pipe all your data into a data warehouse in order to analyze it. With a cloud data lake like Amazon S3 and tools like Redshift Spectrum and Amazon Athena, you can query your data using SQL without first loading it into a traditional data warehouse. In this post, I will demonstrate a new cloud analytics stack in action that makes use of both the data lake and the data warehouse by leveraging AtScale’s Intelligent Data Virtualization platform.
The New Cloud Analytics Data Stack
In today’s cloud-y world, just about all data starts out in a data lake, or data file system, like Amazon S3. Later, the data may be cleansed, augmented and loaded into a cloud data warehouse like Amazon Redshift or Snowflake for running analytics at scale. Often, enterprises leave the raw data in the data lake (i.e. S3) and only load what’s needed into the data warehouse. With a virtualization layer like AtScale, you can have your cake and eat it too. By leveraging tools like Amazon Redshift Spectrum and Amazon Athena, you can provide your business users and data scientists access to data anywhere, at any grain, with the same simple interface.
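To make the "query the lake with SQL" idea concrete, here is a minimal sketch of the kind of statements Redshift Spectrum accepts for exposing S3 files as an external table and aggregating over them. It only builds the SQL text in Python and does not connect to anything. The schema, catalog database, table, columns, bucket path and IAM role ARN are all made-up placeholders, and this is not a description of how AtScale itself issues queries.

```python
# Sketch only: builds the SQL text one might run on Amazon Redshift to query
# files in S3 through Redshift Spectrum. Every name here (schema, catalog
# database, table, columns, bucket path, IAM role ARN) is a hypothetical
# placeholder, not something taken from the AtScale demo.

def spectrum_statements(schema, catalog_db, iam_role_arn, s3_location):
    """Return (create_schema, create_table, query) as SQL strings.

    Values are interpolated directly, so pass trusted constants only.
    """
    # Register an external schema backed by the data catalog.
    create_schema = (
        f"CREATE EXTERNAL SCHEMA IF NOT EXISTS {schema}\n"
        f"FROM DATA CATALOG DATABASE '{catalog_db}'\n"
        f"IAM_ROLE '{iam_role_arn}'\n"
        "CREATE EXTERNAL DATABASE IF NOT EXISTS;"
    )
    # Describe the files in S3 as a table; the data itself stays in the lake.
    create_table = (
        f"CREATE EXTERNAL TABLE {schema}.sales (\n"
        "    sale_id   INTEGER,\n"
        "    sale_date DATE,\n"
        "    amount    DECIMAL(10,2)\n"
        ")\n"
        "STORED AS PARQUET\n"
        f"LOCATION '{s3_location}';"
    )
    # Ordinary SQL over the external table.
    query = (
        "SELECT sale_date, SUM(amount) AS total_amount\n"
        f"FROM {schema}.sales\n"
        "GROUP BY sale_date\n"
        "ORDER BY sale_date;"
    )
    return create_schema, create_table, query


statements = spectrum_statements(
    schema="spectrum_demo",
    catalog_db="demo_lake",
    iam_role_arn="arn:aws:iam::123456789012:role/ExampleSpectrumRole",
    s3_location="s3://example-bucket/sales/",
)
for sql in statements:
    print(sql)
    print()
```

The point of the sketch is that the raw files never leave S3: the external table is only metadata, and the final SELECT reads like any other warehouse query.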
See how AtScale’s Intelligent Data Virtualization platform works in the new cloud analytics stack for the Amazon cloud (3-minute video):
Query S3, Redshift & Teradata in a Single Virtual Cube
AtScale lets you choose where it makes the most sense to store and serve your data. There’s no need to move all your data into a single, consolidated data warehouse to run queries that need data residing in different locations. In addition to saving money, you can eliminate the data movement, duplication and time it takes to load a traditional data warehouse. With the freedom to choose the best data store for the job, you can deliver data to your business users and data scientists immediately without compromising the integrity or granularity of the data.
See how AtScale can transparently query three different data sources (Amazon Redshift, Amazon S3 and Teradata) from Tableau (17-minute video):
NEW: The Virtual Cube Marketplace
The AtScale Intelligent Data Virtualization platform makes it easy for data stewards to create powerful virtual cubes, composed from multiple data sources, for business analysts and data scientists. With our latest release, data owners can now publish those virtual cubes in a “data marketplace”. As of our 2020.1 release, data consumers can “shop” in these virtual data marketplaces and request access to virtual cubes. This new feature creates a seamless conversation between the data publisher and the data consumer through a self-service interface.
See how AtScale can provide a seamless loop that allows data owners to reach their data consumers at scale (2-minute video):
As you can see, AtScale’s Intelligent Data Virtualization platform can do more than just query a data warehouse. Whether data sits in a data lake or a data warehouse, on premises or in the cloud, AtScale hides the complexity of today’s data. Provide instant access to all your data without sacrificing data fidelity or security.