October 10, 2019
In another round of data leader interviews, I spoke with a few more data analytics experts about enterprise data strategies. We discussed which stakeholders should be involved in defining a data strategy, how organizations can be more agile and how to create a data mesh.
The data leaders I spoke to included: Maria Villar, Head of Enterprise Data Strategy and Transformation at SAP NA; Ramdas Narayanan, VP PM of Data Analytics and Insights Tech at Bank of America; Karan Dhawal, Enterprise Data Leader at Rockwell Automation; and Srinivasan Sankar, Enterprise Data and Analytics Leader at The Hanover Insurance Group.
Read on for thoughtful insights from the experts about their companies’ data strategies.
Who are the stakeholders when defining a data strategy?
Maria Villar of SAP explained that there are various categories of stakeholders involved when defining a data strategy, and each group needs to be engaged differently. The first category is everyone who has a role to play in the management of data lifecycles, usage, and quality, which includes data scientists, analysts, and IT operations staff.
Then there are the business executives and sponsors who fund and endorse the enterprise data strategy, model behavior, and staff the projects. “Your CFO is a great sponsor to have because they usually get data,” Villar suggested. “They might push you on how much money you have to spend, but they understand the importance of data and the need to be data-driven.”
Finally, you have the technical community. “You might have your own technical team, but your CIO likely has a role to play in the overall data architecture,” Villar said. Sometimes CIOs can stall projects later on, she added, so it’s important to have them on board from the start.
How do organizations deal with fast-paced data environments?
Ramdas Narayanan explained that one of the biggest challenges companies face today is that they’re accumulating so much structured and unstructured data.
He said Bank of America has to deal with two types of data drift: volume and schema. Volume drifts occur when there’s a rapid increase or decrease in the amount of data coming into the organization. Schema drifts occur when the structure of the data changes.
Bank of America overcomes these challenges by proactively monitoring the data at different stages. “It’s a systematic methodology to handle data quality checks that can be built in-house or by purchasing a tool,” Narayanan explained.
Narayanan also recommends proactively engaging with all stakeholders so they know that potential data drifts are coming. “Gather as many details as possible and spread this information across teams,” he said. “This transparency will help everyone manage the data better.”
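To make the two kinds of drift concrete, here is a minimal Python sketch of the sort of check a monitoring step might run. It is an illustration under stated assumptions, not a description of Bank of America's tooling: the function names, the column names, and the 50% tolerance are all hypothetical.

```python
from statistics import mean


def schema_drift(expected: dict[str, type], record: dict) -> dict[str, list[str]]:
    """Compare one record against an expected column -> type mapping.

    Hypothetical helper: reports columns that are missing, columns that
    were not expected, and columns whose value has a different type.
    """
    missing = sorted(k for k in expected if k not in record)
    unexpected = sorted(k for k in record if k not in expected)
    retyped = sorted(
        k for k, t in expected.items()
        if k in record and not isinstance(record[k], t)
    )
    return {"missing": missing, "unexpected": unexpected, "retyped": retyped}


def volume_drift(history: list[int], current: int, tolerance: float = 0.5) -> bool:
    """Flag a batch whose row count is far from the recent average.

    The 50% default tolerance is an arbitrary assumption; it catches both
    rapid increases and rapid decreases.
    """
    baseline = mean(history)
    return abs(current - baseline) > tolerance * baseline


# Example with made-up column names and row counts.
expected_schema = {"account_id": str, "amount": float, "posted_at": str}
incoming = {"account_id": "A-1", "amount": "12.50", "currency": "USD"}

print(schema_drift(expected_schema, incoming))
print(volume_drift([1000, 1100, 900], current=2000))
```

In practice the results of checks like these would be shared with the affected teams, in line with the transparency Narayanan recommends.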
How do you get data structured to drive self-service and agility?
Karan Dhawal described how Rockwell Automation uses a data lakehouse for self-service analytics, though the lakehouse still requires routine maintenance. That maintenance needs to be enabled and operational across many different areas, including data acquisition, extraction, transformation, replication, privacy, and security.
“The data lakehouse maintenance is part of the data lifecycle,” Dhawal explained. Every aspect of data maintenance needs to be a part of data governance, which then feeds into the overall data strategy of the business.
In short, data is changing all the time. Companies need a process in place to make sure that unstructured and semi-structured data get turned into structured data that can be analyzed. This is especially important today with the growth of IoT and other machine-generated data.
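As a rough illustration of turning semi-structured data into a structured form, the Python sketch below flattens nested JSON records into rows with a fixed set of columns. The device readings and field names are invented for the example, and the helper names `flatten` and `to_table` are hypothetical; a real pipeline would likely rely on a lakehouse or ETL tool rather than hand-written code.

```python
import json


def flatten(obj: dict, prefix: str = "") -> dict:
    """Flatten nested dicts into one level with dotted column names.

    Lists and scalar values are kept as-is; only nested dicts are expanded.
    """
    row = {}
    for key, value in obj.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            row.update(flatten(value, prefix=f"{name}."))
        else:
            row[name] = value
    return row


def to_table(json_lines: list[str]) -> tuple[list[str], list[list]]:
    """Parse JSON lines and return (columns, rows) with None for missing fields."""
    rows = [flatten(json.loads(line)) for line in json_lines]
    columns = sorted({name for row in rows for name in row})
    return columns, [[row.get(c) for c in columns] for row in rows]


# Made-up machine-generated readings with slightly different shapes.
readings = [
    '{"device": "pump-1", "reading": {"temp_c": 71.5, "rpm": 1200}}',
    '{"device": "pump-2", "reading": {"temp_c": 68.0}, "status": "idle"}',
]

columns, rows = to_table(readings)
print(columns)
for row in rows:
    print(row)
```

Every record ends up with the same columns, with `None` where a field was absent, which is the shape most analytics tools expect.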
Is the data catalog the best way to start creating a data mesh architecture?
Srinivasan Sankar of The Hanover Insurance Group believes that a data catalog is an essential requirement for creating a data fabric and data mesh. The data fabric is the underlying infrastructure, architecture, and technologies that make data available from disparate sources.
A data mesh isn’t a tool or technology, but an organizational structure for data that’s business-centric and an enabler for self-service analytics. The enterprise data catalog is where the two often converge, connecting the physical data to business use cases.
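One way to picture that convergence is a catalog entry that records both where a dataset physically lives and which business domain and use cases it serves. The Python sketch below is purely illustrative: the class names, fields, and sample values are assumptions, not the design of any particular catalog product or of The Hanover's architecture.

```python
from dataclasses import dataclass, field


@dataclass
class CatalogEntry:
    name: str        # business-facing dataset name
    domain: str      # owning business domain (the data mesh side)
    location: str    # physical location of the data (the data fabric side)
    use_cases: list[str] = field(default_factory=list)


class DataCatalog:
    """Hypothetical in-memory catalog keyed by dataset name."""

    def __init__(self) -> None:
        self._entries: dict[str, CatalogEntry] = {}

    def register(self, entry: CatalogEntry) -> None:
        self._entries[entry.name] = entry

    def find_by_use_case(self, use_case: str) -> list[CatalogEntry]:
        """Return every registered dataset tagged with the given use case."""
        return [e for e in self._entries.values() if use_case in e.use_cases]


# Sample entries with invented names and locations.
catalog = DataCatalog()
catalog.register(CatalogEntry("claims", "claims-ops", "warehouse.claims_fact", ["fraud detection"]))
catalog.register(CatalogEntry("policies", "underwriting", "lake/policies/", ["pricing", "fraud detection"]))

for entry in catalog.find_by_use_case("fraud detection"):
    print(entry.name, "->", entry.location)
```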
“There’s no AI without IA [information architecture],” Sankar stated. “You need a solid data foundation, which requires a data strategy with a proper information architecture.”
Ultimately, defining the right enterprise data strategy is the key to viewing data as a product.
You can watch the full discussion I had with these data leaders about How to Accelerate AI and BI Impact with an Effective Data Strategy. In addition, we put together an eBook on how to Make AI & BI Work at Scale, with insights from 15 thought leaders and experts in the data industry on scaling AI and BI at large organizations.