Why AtScale Was Founded
Dave Mariani, Co-Founder and Chief Strategy Officer, AtScale
While running data pipelines and analytics for Yahoo! in 2012, I struggled to keep up with the growth of our advertising data and to make it usable for the business. We had truly big data before the term “Big Data” was invented. On top of the data volume challenge, I had several internal and external customers who all wanted to consume this data using different tools and applications. The result was utter chaos. Even our simplest terms, “page views” and “ad impressions,” had different definitions across our business units; there was no single source of truth for our most basic business metrics, and my users didn’t trust our data.
I needed a Universal Semantic Layer, a data API that would:
- Work with a variety of data consumers and tools
- Deliver fast, consistent query performance
- Define secure metrics in one place with an easy-to-use business interface
- Provide a self-serve interface for business analysts and data scientists
- Scale with our business
- Minimize manual data engineering
- Avoid data copies and data movement
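The “define metrics in one place” requirement can be sketched as a shared metric registry that every consuming tool compiles its queries through. This is only an illustrative sketch of the concept, not AtScale’s implementation; the names (`Metric`, `compile_query`, `fact_traffic`) are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Metric:
    """A business metric defined once, in a single canonical place."""
    name: str
    expression: str  # canonical aggregation expression

# The shared registry -- the "single source of truth" for metric definitions.
# Without this, "page_views" might mean something different in each BI tool.
METRICS = {
    "page_views": Metric("page_views", "SUM(view_count)"),
    "ad_impressions": Metric("ad_impressions", "SUM(impression_count)"),
}

def compile_query(metric_name: str, table: str) -> str:
    """Expand a metric reference into concrete SQL, identically for every tool."""
    m = METRICS[metric_name]
    return f"SELECT {m.expression} AS {m.name} FROM {table}"

# Any consumer -- dashboard, notebook, spreadsheet -- asks by metric name
# and gets the same definition back.
print(compile_query("page_views", "fact_traffic"))
```

The point is that consumers reference metrics by name and never re-implement the aggregation logic, so every tool agrees on what “page views” means.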
At the time, no single solution could meet my objectives. So, to satisfy these requirements, I forced a shotgun marriage of Hadoop, Oracle, and SQL Server Analysis Services (SSAS). The data landed in Hadoop, was pre-aggregated and ETL’ed into Oracle as a staging area, and then processed into an SSAS cube for end-user consumption. That cube ended up becoming the largest SSAS cube in the world: 24TB for three months of data. While it was five times larger than the next largest cube, 24TB is really not “Big Data” by my definition. Yet keeping that beast fed required a delicate dance involving NetApp snapshots and tricked-out ETL code. But my end users loved it, and they were able to do wonderful, revenue-producing things with it. I really needed a way to have my cake (OLAP functionality) and eat it too (without the OLAP architecture).