What Is Agile Data Modeling?

Data modeling is the act of assembling and curating data for a particular analytical goal, a task typically performed by data engineers. Agile data modeling describes a simpler, more streamlined approach to provisioning data models, one that allows business users to create their own. This reduces or eliminates the need for data engineers to provision data by hand, considerably expediting the data modeling process. With agile data modeling, existing queries can be answered quickly and consistently, and the time saved opens the door to a dramatic expansion of the company’s data exploration and insight generation.

Requirements for agile data modeling

Traditionally, data had to be tagged manually with the company’s definitions of what type of data it is and what it is used for. Agile data modeling encodes far richer metadata into the model itself, giving users a much deeper understanding of the data. More information encoded into the model, along with an appropriate UX for conveying that information, means faster and more accurate representations of use cases. When all of your data is tagged at this level of granularity, interoperability follows: data can be mixed and matched to build robust data models and drive valuable business insights.
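
As a concrete illustration, the sketch below shows what this kind of granular tagging could look like. It is a minimal example, not a prescribed standard: the tag schema, field names, and values are all illustrative assumptions.

```python
# A minimal sketch of granular column tagging (hypothetical schema).
# Requires Python 3.10+ for the "str | None" type hint.
from dataclasses import dataclass, field

@dataclass
class ColumnTag:
    semantic_type: str                  # e.g. "currency", "email", "timestamp"
    unit: str | None = None             # e.g. "USD" for currency columns
    pii: bool = False                   # personally identifiable information?
    intended_use: list[str] = field(default_factory=list)

revenue = ColumnTag("currency", unit="USD", intended_use=["forecasting"])
email   = ColumnTag("email", pii=True, intended_use=["marketing"])

# Columns from different datasets that share a semantic type and unit can
# be safely mixed and matched -- the basis of the interoperability claim.
compatible = revenue.semantic_type == "currency" and revenue.unit == "USD"
print(compatible)  # True
```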

Agile data modeling helps an organization stay competitive with fast, agile big data analytics. Successful agile data modeling, however, requires a detailed understanding of the data: statistics on the data itself, the databases involved, the load on those shared resources, the use cases and intent of data consumers, security constraints, and more. Analysts therefore need platforms that operate at scale yet are flexible enough to support the investigative nature of their work. Achieving this requires a new kind of platform: the adaptive analytics fabric.

An adaptive analytics fabric seamlessly weaves together the data used to drive business decisions from a wide variety of sources. Unlike a physical data warehouse, an adaptive analytics fabric does not require data to be stored in a single location. It intelligently virtualizes an organization’s siloed data into a single, unified data view from which a variety of BI tools can obtain fast, consistent answers. Data is queried in its native form, “as is,” yet appears to users as part of one unified data warehouse.
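
To make the idea concrete, here is a minimal sketch of that virtualization pattern under simple assumptions: a logical catalog maps each table to the physical source that owns it, and only the query is routed and translated, never the data. The class names, sources, and query helper are hypothetical, not any product's API.

```python
# A minimal sketch of data virtualization: one logical view over many
# physical sources. All names below are illustrative assumptions.
from dataclasses import dataclass

@dataclass
class Source:
    name: str       # e.g. "warehouse", "lake"
    dialect: str    # the source's native query dialect

class UnifiedView:
    """Maps each logical table to the physical source that owns it."""

    def __init__(self) -> None:
        self.catalog: dict[str, Source] = {}

    def register(self, logical_table: str, source: Source) -> None:
        self.catalog[logical_table] = source

    def query(self, logical_table: str, predicate: str) -> str:
        # The data stays in place; only the query is routed to the source.
        source = self.catalog[logical_table]
        return (f"-- routed to {source.name} ({source.dialect}): "
                f"SELECT * FROM {logical_table} WHERE {predicate}")

view = UnifiedView()
view.register("sales", Source("warehouse", "ansi-sql"))
view.register("clickstream", Source("lake", "spark-sql"))
print(view.query("sales", "region = 'EMEA'"))
```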

The following capabilities are integral to implementing next-gen agile data modeling, and are enabled by adopting an adaptive analytics fabric.

Autonomous data engineering

Autonomous data engineering produces optimizations that a human engineer would be unlikely to conceive of. It uses machine learning (ML) to examine all the data, how it’s queried, and how it’s integrated into models being built by any user across the enterprise. Autonomous data engineering digests all of this information and builds optimized acceleration structures.

Autonomous data engineering can also automatically place data in the database where it will achieve the best performance, so you can leverage many different data platforms, each with its own advantages. This turns a traditional liability, the variability of all your different database types, into a strength.
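
The sketch below illustrates the placement idea with simple hand-written rules; a real system would learn these decisions from observed workloads, and the platform names, statistics, and thresholds here are illustrative assumptions.

```python
# A minimal, rule-based sketch of workload-aware data placement.
# A production system would use learned models, not fixed thresholds.

def choose_platform(stats: dict) -> str:
    """Pick the store where a table's acceleration structure should live,
    based on observed query patterns for that table."""
    if stats["pct_time_range_scans"] > 0.8:
        return "timeseries-db"          # dominated by time-range queries
    if stats["avg_rows_scanned"] > 1e9:
        return "columnar-warehouse"     # very large analytical scans
    if stats["pct_point_lookups"] > 0.5:
        return "key-value-store"        # frequent single-row reads
    return "default-warehouse"

print(choose_platform({"pct_time_range_scans": 0.9,
                       "avg_rows_scanned": 2e6,
                       "pct_point_lookups": 0.1}))
# -> timeseries-db
```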

Reverse engineering/Auto modeling

An adaptive analytics fabric can automatically understand the capabilities of the data platform, what data is available, and how it can be combined, with limited user intervention. This allows you to ingest new data sources quickly and easily, and to automatically discover what your data is, what its capabilities and limitations are, and how to integrate it with other data when building models.

Furthermore, an adaptive analytics fabric can reverse engineer the queries and data models used to create legacy reports. It can determine which data sets were used and which queries were run, so you don’t have to rebuild data models or queries and can keep using the same reports. For example, if you created your TPS report in the old system, you will still be able to retrieve it in the new one. Past queries may have been run on old data, but they can still be translated and run on the new system without any manual rewrites.
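
As a simplified sketch of that kind of translation, the snippet below rewrites the table references in a legacy query so it targets the unified view instead. The mapping and table names are hypothetical, and a real system would parse the SQL rather than pattern-match it.

```python
# A minimal sketch of legacy query translation via a name mapping.
# Table names and the mapping itself are illustrative assumptions.
import re

LEGACY_TO_VIRTUAL = {
    "dw.fact_sales_v2": "sales",
    "dw.dim_customer": "customers",
}

def translate(legacy_sql: str) -> str:
    """Rewrite legacy table references to their unified-view names."""
    for old, new in LEGACY_TO_VIRTUAL.items():
        legacy_sql = re.sub(re.escape(old), new, legacy_sql)
    return legacy_sql

old_query = ("SELECT c.name, SUM(s.amount) FROM dw.fact_sales_v2 s "
             "JOIN dw.dim_customer c ON s.cust_id = c.id GROUP BY c.name")
print(translate(old_query))
# SELECT c.name, SUM(s.amount) FROM sales s JOIN customers c ...
```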

Semi-structured data

There are many types of specialized data, and different formats are optimal for each. Some data types have also become more important for analysis, most notably the time dimension; entire data platform architectures have emerged around time series analysis.

With an adaptive analytics fabric, you can put acceleration structures in any database, and the fabric automatically decides where to place data based on where it will generate the best performance. So if your data model and queries are essentially working with time series data, the adaptive analytics fabric can put the acceleration structure in a different database that is optimized for time series, extracting better performance while the original data remains in place.
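
A minimal sketch of what such a time-oriented acceleration structure might compute is shown below: a daily roll-up of a stream of (timestamp, value) rows, kept in a time-series-optimized store so range queries avoid rescanning the raw data. The granularity and field names are illustrative assumptions.

```python
# A minimal sketch of a time-bucketed acceleration structure (daily sums).
from collections import defaultdict
from datetime import datetime

def build_daily_rollup(rows):
    """rows: iterable of (timestamp, value) pairs. Returns per-day sums."""
    rollup = defaultdict(float)
    for ts, value in rows:
        rollup[ts.date()] += value   # bucket each row by calendar day
    return dict(rollup)

raw = [(datetime(2024, 1, 1, 9), 10.0),
       (datetime(2024, 1, 1, 17), 5.0),
       (datetime(2024, 1, 2, 8), 7.5)]
print(build_daily_rollup(raw))
# {datetime.date(2024, 1, 1): 15.0, datetime.date(2024, 1, 2): 7.5}
```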

Collaboration

The canvas where you build your models has to be a shared workspace. Everyone in the data and analytics pipeline should be able to see who has been working on a model, how it has been edited, and the discussion around proposed changes. Tracking changes and having those discussions is imperative for a collaborative environment.

An adaptive analytics fabric enables this type of collaboration between many different stakeholders in the analytics pipeline, including data architects/modelers, data stewards, business analysts, and business users. 

Security & Governance

You can’t trade security for agility; you need to find a way to have both.

With an adaptive analytics fabric, all of the existing security solutions and policies governing your data remain in place. While your data may be surfaced to all of your users and a multitude of different BI tools, your permissions and policies do not change. Security and privacy information is preserved all the way down to the individual user by tracking both the data’s lineage and the user’s identity, and identity remains tracked even when users collaborate over shared data connections.

When users work with multiple databases that have different security policies, those policies are seamlessly merged, and global security and compliance policies are applied across all data. Your data remains as safe as it is today under your existing security policies and apparatus; no additional security measures are needed.
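
One common merge rule is to be conservative: grant access to a field in the unified view only if every contributing source grants it. The sketch below illustrates that rule under assumed policy shapes; it is not a description of any specific product's enforcement logic.

```python
# A minimal sketch of merging per-source security policies conservatively.
# Policy shapes, users, and columns are illustrative assumptions.

def merge_policies(policies: list[dict], user: str, column: str) -> bool:
    """A user may read a column in the unified view only if every
    source database contributing to it allows the read."""
    return all(p.get(user, {}).get(column, False) for p in policies)

warehouse_policy = {"alice": {"salary": False, "region": True}}
lake_policy      = {"alice": {"salary": True,  "region": True}}

print(merge_policies([warehouse_policy, lake_policy], "alice", "region"))  # True
print(merge_policies([warehouse_policy, lake_policy], "alice", "salary"))  # False
```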

Unleash your agility now 

It’s never been easier or more affordable to unleash the transformative power of big data analytics. With an adaptive analytics fabric, you can empower business users across your organization to quickly and easily uncover previously unseen insights in your data, ensuring you remain agile and competitive in a world that will only grow more data-driven. 

Learn more about the benefits of leveraging autonomous data engineering for agile analytics by downloading our white paper, How Automation Makes Analytics Agile.
