Welcome back to another interview from our “Analytics Leader Spotlight” series where we chat with business and technology professionals who are using data and analytics to transform their organizations. In today’s spotlight, we introduce you to Miloš Milenković, Business Intelligence Developer at Spotify.
Would you introduce yourself and explain where you are based and who you work for?
A: I’m currently based in Stockholm, Sweden, working for Spotify. Originally I’m from Serbia. That’s where I studied and started my career. I’ve been working in the data and business intelligence (BI) field since the beginning of my career, which started in 2007. I started at a local Serbian company in the banking software business with a really big BI department, and I was there for five years. Then I moved to another company, GroundLink, in Belgrade, Serbia, which was a New York-based company with an office in Serbia. That company is in the ground transportation business, and I was there for six years before eventually moving to Stockholm two years ago, where I started working at Spotify.
What originally sparked your interest in the BI space?
A: Somewhere at the end of my studies, I became really interested in databases. That was about the time when Microsoft released SQL Server 2005, which was a really big thing at the time, with shiny new Analysis Services and OLAP cubes. One of my professors at university was very interested in it and he pointed me in that direction. I did some work on learning about Analysis Services. I’m a huge fan of sports, especially basketball, which is really big in Serbia, so my first project was to create a cube on NBA data. It was an opportunity for me to join my two passions, sports and data. That’s how I got started.
When you were looking at the NBA data, what kind of analysis were you doing?
A: I’m not sure if I remember all of the details. It was a cube with dimensions such as players, teams, time, and I was looking at the number of wins and points over time for different players.
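To make the idea of a cube with dimensions and measures concrete, here is a minimal sketch in Python. It is not the original university project: the rows, player and team names, and the `roll_up` helper are all hypothetical, and a real OLAP engine would pre-aggregate and index far more cleverly. It only shows how the same facts can be summed along different combinations of dimensions.

```python
from collections import defaultdict

# Hypothetical fact rows: (player, team, season, wins, points).
# All names and numbers are made up for illustration.
FACTS = [
    ("Player A", "Team X", 2005, 1, 20),
    ("Player A", "Team X", 2006, 0, 31),
    ("Player B", "Team X", 2005, 1, 12),
    ("Player B", "Team Y", 2006, 1, 18),
]
DIMENSIONS = ("player", "team", "season")


def roll_up(facts, by):
    """Sum the wins and points measures, grouped by the chosen dimensions."""
    idx = [DIMENSIONS.index(d) for d in by]
    totals = defaultdict(lambda: {"wins": 0, "points": 0})
    for row in facts:
        key = tuple(row[i] for i in idx)
        totals[key]["wins"] += row[3]
        totals[key]["points"] += row[4]
    return dict(totals)


# Coarse view: one row per player.
by_player = roll_up(FACTS, by=("player",))
# Drill down: the same measures split by player and season.
by_player_season = roll_up(FACTS, by=("player", "season"))
```

Calling `roll_up` with a longer `by` tuple is the drill-down step; calling it with a shorter one is the roll-up.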
Was this project for personal use or did you share it with friends?
A: It was a project at the university for one of the exams. It was shared with classmates and professors. At that time, we used SSRS to connect to the cube and present the data.
It sounds like you’ve been in data and analytics since university and have a lot of experience doing this type of reporting style of analytics. How did that experience get you into your role at Spotify?
A: For most of my time back in Serbia, I worked with the Microsoft BI stack, which included SQL Server, OLAP cubes and reporting. There were other technologies along the way, but SQL Server was always at the center of it. At the end of 2017, my wife and I started thinking about exploring an opportunity to move somewhere in Europe. We were looking at Western Europe or Northern Europe, like Scandinavia, but we didn’t have a specific country in mind. I started looking into job openings in different places and found one at Spotify for a BI developer. They were looking for someone with experience in a SQL Server BI stack. They had the idea to migrate to Google Cloud, and mentioned Google BigQuery. At that point, I didn’t know much about cloud or BI in the cloud, but it sounded interesting. They were looking for someone with the experience that I had, and it looked like a good opportunity for career development to gradually migrate to the cloud, where Spotify was going. I applied, and after several rounds of interviews and a few months, I took the job and we moved to Sweden at the beginning of 2018.
Had you had any experience working with data in the cloud, or were you exclusively on the on-premises Microsoft stack?
A: I was exclusively on-premises, and mostly the Microsoft stack. There’s been a little bit of Oracle, MySQL, and PostgreSQL knowledge along with some other technologies, but all my knowledge was with servers, not in the cloud. I was familiar with cloud as a concept of having virtual machines and working with a serverless environment, but I had never tried it nor did I have the opportunity to. The first time that I had the opportunity to do something with the cloud was at Spotify.
When you joined Spotify, what type of BI analysis were you specifically supporting?
A: At Spotify, data management and data handling are done differently, perhaps, than at some other companies. There is no centralized enterprise data warehouse (EDW) where all of the data flows; it’s spread across business units. So each business unit has its own data team that does data, reporting and analytics for them. The unit that I’m in is focused on Spotify’s free business. It’s basically related to ads and how all of the different advertisers and companies book impressions and ads to be served to users who are in the free tier. When I started, the tech stack was only Microsoft SQL Server, except for the user-facing part (reports and dashboards). The BI tool was Qlik Sense, connecting to SQL Server and an OLAP cube.
Which lines of business do you support specifically, and what is that analysis used for in their decision making?
A: I would say that my team has three main stakeholder groups and several other groups that are occasional users. The main ones are Financial Planning and Analysis in the ads area, Sales Operations, and Pricing & Yield. Those are the three groups that use the DW the most. They do a lot of forecasting based on historical trends. They’re trying to see where ad sales are going, how much is going to be booked for the coming twelve months, how much we can expect to deliver, and what the revenue will be. Sales Operations uses the data when they are identifying a new market and determining the strategy to use there, based on similar markets that have already been launched. If a new market is being launched in South Korea, for example, they look at whether there is already a comparable market in the Far East to see what we should do. They are looking at what worked well and what didn’t work well.
Your background is in OLAP “what-if” style analysis. Are you still using that at Spotify? How do people use that style of analysis?
A: Yes, it is one of the main ways our users work with the data. Since the beginning, when we were on SQL Server, we have shifted towards the cloud, so we also have a copy of our Data Warehouse in Google BigQuery. We still do all of our processing and data transformations in SQL Server and then push the data to Google BigQuery. There are two ways for our users to access the data: directly, by writing SQL queries against Google BigQuery, or through an OLAP cube, a Microsoft SSAS cube that we still have on SQL Server. We have about 30-40 analysts who use the cube on a regular basis. So, they are very dependent on OLAP.
What is it about that style of analysis that they like so much?
A: It allows very seamless ad-hoc analysis. They’re used to Excel. Most of them are financial analysts, so they know their way around Excel. The OLAP cube is a pivot table on steroids. So, they know how to use it and it’s easy for them. They usually say that data leads to more questions. If they do a simple analysis, drag in a few dimensions and a measure and see something, they then drill down to another level or hierarchy and see more. It’s so easy to do that with OLAP.
You talked about Excel. How important is Excel? It sounds like it’s important to your business group. Is it important across the entire organization at Spotify?
A: Yes, it’s very important, especially in the financial organization. My team belongs to Financial Engineering, which supports Finance at Spotify at all levels, from business applications, ERPs and CRMs all the way to the data. For all the people who work in Finance, Excel is the number one tool. There’s a bit of a shift towards Google Sheets these days because Spotify is on G Suite, so some of the users have shifted from traditional Excel to Google’s product. Even that is not so seamless. I think many prefer to have Excel on their desktop.
You talked a lot about the challenge at Spotify for being able to use new technology, especially the idea of moving to the cloud and being able to do analysis in the cloud. What are some of the things that you love about the cloud? What are the challenges that you’re trying to overcome today?
A: What I like about the cloud is that there is much less need for DevOps compared to on-premises solutions. You don’t need to worry about the operating system and disk size. In a traditional setup, it takes days or weeks to change something like that. In the cloud, it’s just a few clicks of a button and there you are: you can double your disk size or RAM. That’s the beauty of it. Data processing happens on distributed servers, and you don’t even know what’s happening. You can see on a chart or graph somewhere that your data is being crunched by 40 workers at a time, and it just spits out the result. That’s really great.
The new data warehouses have columnar storage, which is really fast. Billions of records can be queried in seconds if you write the right query. In traditional databases, “select all” is not so bad and can work sometimes. If you move to Google BigQuery or Amazon Redshift or some of the others, select star is really not something you want to do. It costs a lot and it takes a lot of time. If you name just the columns you want, you will get the result in a few seconds. So it’s also a shift for those of us who come from a more traditional background, but the results are amazing once you get used to it. Those are some of the things that I love about it.
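The point about select star can be illustrated with a toy model of columnar storage, sketched here in Python. This is not how BigQuery or Redshift are implemented or billed in detail; the table, the column names and the per-value byte sizes are all hypothetical. The sketch only assumes that a column store reads whole columns, so the work grows with the columns a query touches.

```python
# Toy columnar table: each column is stored as its own list.
# Column names, values and byte sizes are hypothetical.
TABLE = {
    "ad_id": [1, 2, 3, 4],
    "market": ["SE", "SE", "KR", "KR"],
    "impressions": [100, 250, 75, 300],
    "notes": ["x" * 200] * 4,  # a wide column most queries never need
}
BYTES_PER_VALUE = {"ad_id": 8, "market": 2, "impressions": 8, "notes": 200}


def bytes_scanned(columns):
    """Bytes this toy column store reads for a query touching these columns."""
    return sum(BYTES_PER_VALUE[c] * len(TABLE[c]) for c in columns)


# SELECT * has to read every column, including the wide one.
select_star = bytes_scanned(TABLE.keys())
# SELECT market, impressions reads only the two columns it names.
select_two = bytes_scanned(["market", "impressions"])
```

In this made-up table the two-column query reads a small fraction of what select star reads, and the gap widens as tables get wider.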
The challenges are probably a result of my more traditional background, where I’m used to having the data processed close to where it lives. I like handling data in SQL even if it can require very complex transformations. In the cloud, it’s mostly data pipelines with Java, Python, Scala, and the like. For me, even though it’s not so hard to learn, it still feels a little abstract that I cannot really trace what happens to each of the rows. They’re just being processed somewhere far away, but that’s Big Data and that’s how it is now, and you just have to get used to it. As for other problems in our journey of trying to migrate our current BI solution to Google Cloud, we ran into some where we were not able to find the right product. None of the cloud platforms that I know, whether it’s Google Cloud, AWS or Microsoft Azure, offer a real cloud OLAP solution. There’s a lot of talk about the future of OLAP in the cloud and in the Big Data world, and there are two very different opinions on that.
How are you thinking of being able to continue OLAP analysis in the cloud? Why do you think it’s still important?
A: I think that OLAP, and the dimensional model that has been there for many years, is not going away even with Big Data and the cloud. Every few years there is something that seems outdated and then you realize that a lot of people in the world are still using it and loving it. That’s why I think it’s not going away. It just needs to find its way and position in this new world, and there are companies that are trying to find solutions for that. I’m still very surprised that Google Cloud or AWS do not have a product in their portfolio or have not acquired a company that is doing it, but I think they will do it at some point. From what I’ve learned so far (my team and I did some research on this topic), I think there are two approaches now. One is trying to keep the traditional way of having a physical, materialized cube somewhere: you build the cube and connect tools to it. And then there’s this other approach of data virtualization, where the cube is not materialized. It’s just a semantic layer that translates user queries, whether they are MDX or something else, into SQL queries against the underlying databases. Those are the two approaches that I’ve seen. They’re both interesting, and they have their advantages and disadvantages. I like the data virtualization approach, as long as the tool is able to do a quick translation of the user query into SQL that works well with the underlying cloud data warehouse.
What advice or recommendations would you have to others in the industry like you who are adopting the cloud and do want to maintain that “what-if” OLAP style of analysis?
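The data virtualization idea, a semantic layer that turns a cube-style request into SQL instead of materializing a cube, can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the `MODEL` dictionary, the `to_sql` helper, the `ad_bookings` table and its rows are hypothetical, SQLite stands in for a cloud data warehouse, and a real product would also handle MDX, joins, filters and caching.

```python
import sqlite3

# Hypothetical semantic model: business-facing names mapped to SQL expressions.
MODEL = {
    "table": "ad_bookings",
    "dimensions": {"Market": "market", "Year": "year"},
    "measures": {"Booked Impressions": "SUM(impressions)"},
}


def to_sql(dimensions, measures):
    """Translate a cube-style request into a single GROUP BY query.

    Only names defined in MODEL are accepted; unknown names raise KeyError.
    """
    dim_cols = [MODEL["dimensions"][d] for d in dimensions]
    select = dim_cols + [f'{MODEL["measures"][m]} AS "{m}"' for m in measures]
    sql = f'SELECT {", ".join(select)} FROM {MODEL["table"]}'
    if dim_cols:
        cols = ", ".join(dim_cols)
        sql += f" GROUP BY {cols} ORDER BY {cols}"
    return sql


# SQLite stands in for the warehouse; the rows are made up.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ad_bookings (market TEXT, year INTEGER, impressions INTEGER)"
)
conn.executemany(
    "INSERT INTO ad_bookings VALUES (?, ?, ?)",
    [("SE", 2019, 100), ("SE", 2020, 150), ("KR", 2020, 80)],
)

# The user asks for a measure by a dimension; the layer emits and runs SQL.
query = to_sql(["Market"], ["Booked Impressions"])
rows = conn.execute(query).fetchall()
```

Nothing is pre-aggregated in this sketch: each request becomes a fresh query, so its speed depends entirely on how well the generated SQL suits the underlying warehouse.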
A: From my current position, I think that the recommendation is to understand the business needs and see what the business users are capable of doing on their own: how well they are technically educated, and whether they can do a lot of SQL querying on their own or need help from others. If users and analysts are very good with SQL, I would probably recommend having them query the data warehouse or data lake directly, but I don’t think that’s realistic. I think data science and machine learning engineers can do that, but not really the users who aren’t engineers. The data and BI teams need to find a way to present the data to their users, whether it’s some sort of modern OLAP tool or building very advanced interactive dashboards in one of the dashboarding tools like Tableau, Qlik, Power BI or Looker. I still think that those cannot be as interactive, and cannot enable ad-hoc analysis the way simple Excel can, but I think that with the right people it’s possible to build a very good dashboard that can replace some of these. Having those reporting tools connected to a cube is also a good idea, even if Excel is not an option, because there are open-source OLAP solutions these days, like ClickHouse or Druid or some others, that big companies are using to analyze billions of records in the OLAP way, but they cannot really connect to Excel. It’s a trade-off, I guess. Each team has to work with their users and figure out what the best tool is for them and how they can enable them to do their analysis.
A lot of what you do on a daily basis is in support of the business users. Rather than determining on your own what the best solution is, a lot of your work is talking to your business customers within Spotify and understanding a day in their life, so that you can find the solution that best fits their needs. Would that be accurate?
A: Exactly. That’s what we do.