January 17, 2023
Using Self-Service Tools To Speed Up Data Product Development
I recently hosted another round of interviews with data leaders about how they deliver AI and BI for their organizations. We discussed combining a data fabric and data mesh, the hub-and-spoke approach to data analytics, and the future of a federated model for data analytics.
The data leaders I spoke to included: Ujjwal Goel, Director of Data Architecture and Data Engineering at Loblaw; Andrea de Mauro, Head of Business Intelligence at Vodafone; and Biju Mishra, Director of Corporate Business Services and Automation at Enbridge.
Here are some of the most thought-provoking insights from these data industry veterans.
What is a hub-and-spoke approach to data analytics?
According to Ujjwal Goel, a clear vision is crucial for modernizing data analytics at an organization. At Loblaw, this vision is what Goel calls a “centralized to decentralized model”, or a hub-and-spoke approach to data analytics.
It starts with the data fabric, or the hub, which is a centralized architecture where services are run to orchestrate data. The data fabric relies heavily on metadata to continuously identify, collect, cleanse, and enrich the data. In addition, Goel sees enormous benefits to a centralized governance approach, though there could be potential for a federated governance model in the future.
The data mesh, on the other hand, is a decentralized and distributed solution that puts data ownership in the hands of different teams or domains (the spokes). The problem with the classic version of this approach, according to Goel, is that domains create their own data pipelines, leading to data duplication and technical debt. “The data mesh is a new concept,” Goel explained. “And I personally believe it cannot be fully implemented until the organization is on the top of its game for data literacy.”
With a high level of data literacy, combined with a centralized data fabric, each domain can build its own data products and overcome the pitfalls of data silos. “We as the data team see ourselves as enablers,” Goel concluded. “That’s how the mix of data fabric and data mesh becomes the best of both worlds.”
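To make the hub-and-spoke idea concrete, here is a minimal, hypothetical sketch in Python. None of the speakers described an implementation; the names here (Hub, DataProduct, the required governance tags) are illustrative assumptions. The idea it shows is the division of labor: the centralized hub enforces shared governance standards and provides discovery, while each domain (spoke) owns and registers its data products.

```python
from dataclasses import dataclass, field

@dataclass
class DataProduct:
    name: str
    domain: str                 # owning spoke, e.g. "retail"
    owner: str                  # accountable team contact
    schema: dict                # column name -> type
    tags: list = field(default_factory=list)

class Hub:
    """Centralized registry: shared standards and discovery live here."""

    # Hypothetical governance standard every product must satisfy
    REQUIRED_TAGS = {"pii-reviewed", "quality-checked"}

    def __init__(self):
        self._catalog = {}

    def register(self, product: DataProduct) -> None:
        # The hub rejects products that skip the shared governance checks
        missing = self.REQUIRED_TAGS - set(product.tags)
        if missing:
            raise ValueError(f"{product.name}: missing governance tags {missing}")
        self._catalog[product.name] = product

    def discover(self, domain: str) -> list:
        # Any team can find products published by a given domain
        return [p for p in self._catalog.values() if p.domain == domain]

# A domain team (spoke) publishes its own data product through the hub
hub = Hub()
hub.register(DataProduct(
    name="store_sales_daily",
    domain="retail",
    owner="retail-data-team",
    schema={"store_id": "int", "date": "date", "revenue": "decimal"},
    tags=["pii-reviewed", "quality-checked"],
))
print([p.name for p in hub.discover("retail")])  # -> ['store_sales_daily']
```

In this sketch the centralized checks are what prevent the duplication and technical debt Goel warns about: every spoke builds independently, but only products meeting the hub's standards become discoverable.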
What is the best way of organizing teams to offer self-service analytics?
“The analytics journey is first and foremost an organizational and cultural challenge,” stated Andrea de Mauro of Vodafone. In his opinion, many technical complexities are much easier to solve once you’ve implemented an effective organizational structure to support self-service analytics.
From an organizational perspective, there’s a continuum of possible ways to do this. On one end, you have fully centralized organizations, most often companies at the beginning of their analytics journey with a lower level of maturity. There is usually a single center of excellence of data scientists and business analysts who work on analytics initiatives. However, this approach doesn’t fully utilize the domain knowledge of the business.
At the other end of the spectrum, de Mauro explained, you will find fully decentralized organizations. Data scientists are embedded into business units, so they’ll understand the business better and be able to proactively anticipate business needs. However, these organizations lose out on the synergy that comes from a centralized analytics approach.
“I think where most companies end up is somewhere in the middle with a hub-and-spoke organizational model,” de Mauro said. With this approach, you have a centralized data team working on scaling data and analytics that collaborates closely with embedded analysts who are part of the broader enterprise data program.
“It’s not easy to implement,” de Mauro concluded, “but it’s been proven to be the most effective, scalable, and sustainable organizational approach for enterprise analytics.”
How do you drive self-service analytics with quality and security?
“Organizations have evolved over time, often taking important data capabilities and centralizing them,” stated Biju Mishra. “But I think the future is a more federated model.” This means having some centralized capabilities for maintaining standards and protocols across the organization, while also allowing different departments and business units to find and use information on their own.
“One of the challenges with this federated model is making sure that when people look at the information that they have, that it actually makes sense to them,” Mishra suggested. With a federated model, there’s always the risk that someone will misinterpret data because they don’t fully understand its context. Mishra believes a great analytics organization is one that trains people about data while also making it easy for them to access the right data.
In short, each of these data leaders believes in combining centralized and decentralized capabilities. The centralized function can drive economies of scale, ensure standardization, and spread education and understanding. This should be combined with decentralized capabilities for people across the organization to access the data they need to do their jobs effectively.
You can watch the full discussion I had with these data leaders about How to Deliver Actionable Business Insights with AI & BI. In addition, we put together an eBook about how to Make AI & BI Work at Scale with 15 thought leaders and experts in the data industry about scaling AI and BI at large organizations.