March 5, 2019
What’s the best BI tool for Hadoop?
While it may be tempting to focus self-service BI efforts only on security and access control mechanisms, it is also important to pay attention to economics. When an enterprise develops a self-service BI environment, its IT team effectively adopts the role of a service provider. Data and services become available to internal business users for a price. What are these hidden costs?
In this contributed piece, Javier Guillen unveils and helps us understand these costs associated with self-service BI initiatives, as well as how to address them. Javier is a highly respected figure in the Data Analytics world. At the time of writing, Javier was a Principal Consultant for BlueGranite – a data and analytics consulting firm.
Some organizations embark on the self-service BI journey because it’s popular among data analysts, or perhaps because competitors have an established program and they fear being left behind. Others do it to offload IT requests to the business, permitting expensive resources to focus on long-term projects. Organizations might also do it “by accident”, discovering that self-service BI grows organically whether they planned for it or not.
Regardless of how self-service BI initiatives started at your company, if you end up “owning” them (whether you are an IT person or not), it is worth considering the associated costs beyond licensing. Actively managing these costs can positively impact your organization, and ultimately your career.
Efficiency costs are the costs incurred to ensure reports are highly consumable by the target audience and that they can be trusted. These costs tend to rise when reports are confusing, or when they offer conflicting numbers compared to other reports. There are several ways to categorize these costs, explained below.
Redundant dataset modeling occurs when multiple data models query the same source in the same way, and possibly define the same business calculations, across departments. The result can be redundant data refreshes across reports. In some cases, this redundancy can even happen across reports built by different people in a single department.
Because data analysts commonly work independently, redundant dataset modeling occurs often. It increases the price tag of self-service BI initiatives when slow queries that return the same result set run concurrently during scheduled data refreshes and strain source system capacity.
Additionally, since the same calculation can be defined across reports, there is always a risk of creating rogue copies. A rogue copy, meaning the same calculation name with a different formula (and output), can hurt adoption if it diminishes users’ trust in the reports.
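To make the rogue-copy risk concrete, here is a minimal sketch of how an inventory of calculations could be scanned for names that carry more than one formula. The report names, formulas, and the `find_conflicts` helper are hypothetical, and the normalization is deliberately crude (lowercasing and stripping whitespace), so treat this as an illustration rather than a finished audit tool.

```python
from collections import defaultdict

# Hypothetical inventory: (report name, calculation name, formula text).
calculations = [
    ("Sales Overview", "Gross Margin", "SUM(Revenue) - SUM(Cost)"),
    ("Regional Sales", "Gross Margin", "sum(revenue) - sum(cost)"),
    ("Finance Pack", "Gross Margin", "SUM(Revenue) - SUM(Cost) - SUM(Freight)"),
    ("Finance Pack", "Order Count", "COUNT(OrderID)"),
]


def normalize(formula: str) -> str:
    """Lowercase and drop whitespace so cosmetic differences are ignored."""
    return "".join(formula.lower().split())


def find_conflicts(rows):
    """Return calculation names that have more than one distinct formula."""
    variants = defaultdict(lambda: defaultdict(list))
    for report, name, formula in rows:
        variants[name][normalize(formula)].append(report)
    return {
        name: {f: sorted(reports) for f, reports in by_formula.items()}
        for name, by_formula in variants.items()
        if len(by_formula) > 1
    }


conflicts = find_conflicts(calculations)
for name, by_formula in conflicts.items():
    print(f"{name}: {len(by_formula)} competing definitions")
    for formula, reports in by_formula.items():
        print(f"  {formula} <- {', '.join(reports)}")
```

In this made-up data, the two cosmetically different “Gross Margin” formulas are treated as one definition, while the version that also subtracts freight is flagged as a competing one.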
Competing solutions tend to emerge while IT is focused on broader, long-term projects that seek to satisfy many reporting use cases. These take time to develop, given their large scope. Meanwhile, business users may use self-service BI tools to create tactical, short-term solutions of their own that contain “tribal knowledge” that the IT-led corporate solutions sometimes lack.
When the time comes for those expensive IT-led solutions to be deployed to production, some users can be reluctant to adopt them because their own reports already satisfy their needs and, more importantly, because they feel they have control over those reports.
The price tag of such an experiment can be very high, given that abandoned IT solutions can involve multiple layers of work, each costing thousands (or even hundreds of thousands) of dollars.
In other scenarios, business users are simply unaware of the existence of enterprise solutions and datasets that may satisfy their needs. They could be building their own self-service BI reports while other existing solutions they were not informed of may have already been deployed.
Inaccurate Report Scope:
This occurs when well-meaning self-service BI authors create models that are too generic and cover too many use cases in hopes of targeting a wide audience of disparate users. Unfortunately, this approach can increase efficiency costs, as report performance can suffer due to excessive data volumes and broad data definitions. Performance issues alone can cause users to abandon reports.
I often say a report that attempts to answer too many questions will not properly answer any of them. In this scenario, users may be forced to spend additional hours creating more specialized reporting – adding cost to the initiative.
It is also common for analysts building reports for themselves, or for a small audience (their boss, for example), to take the opposite approach of building a single data model per report. This approach can be too narrow, and it increases efficiency costs due to redundancy (see the discussion of redundant dataset modeling above).
Ineffective Report Layouts:
Reports that are difficult to read, confusing, or overly decorated can increase cost by causing users to spend too much time trying to understand what the report says (or intends to say), or to abandon the report completely.
Lack of Systematic Testing:
Systematic testing is a process inherent to IT development. In contrast, self-service BI users may be business experts with no IT background and might not be accustomed to rigorous unit testing. A “random check” or “spot check” of numbers may not be sufficient to ensure accuracy across all report filtering conditions. Without appropriate testing, efficiency costs can increase when key users detect inaccuracies and usage drops as a result.
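As a rough illustration of why a spot check can miss errors, the sketch below compares a report figure with an independent recomputation for every combination of filters rather than for a single hand-picked one. The data, the two filters, and the deliberately buggy `report_total` function are all invented for the example; in practice the report side would come from whatever export or query interface your BI tool provides.

```python
from itertools import product

# Hypothetical source rows: (region, year, amount).
source_rows = [
    ("East", 2018, 100.0),
    ("East", 2019, 150.0),
    ("West", 2018, 80.0),
    ("West", 2019, 120.0),
]


def expected_total(region, year):
    """Recompute the figure independently, straight from the source rows."""
    return sum(
        amount
        for r, y, amount in source_rows
        if region in (None, r) and year in (None, y)
    )


def report_total(region, year):
    """Stand-in for the report's calculation, with a deliberate bug:
    it ignores the year filter whenever no region is selected."""
    if region is None:
        return sum(amount for _, _, amount in source_rows)
    return sum(
        amount
        for r, y, amount in source_rows
        if r == region and year in (None, y)
    )


def check_all_filters():
    """Compare report and source for every filter combination (None = no filter)."""
    regions = [None] + sorted({r for r, _, _ in source_rows})
    years = [None] + sorted({y for _, y, _ in source_rows})
    return [
        (region, year, report_total(region, year), expected_total(region, year))
        for region, year in product(regions, years)
        if abs(report_total(region, year) - expected_total(region, year)) > 1e-9
    ]


failures = check_all_filters()
for region, year, got, want in failures:
    print(f"region={region}, year={year}: report={got}, source={want}")
```

A spot check of “East, 2018” passes here, yet the exhaustive pass surfaces two failing combinations where a year is selected without a region.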
Undefined Development Environment:
One of the many strengths of self-service BI tools is the flexibility for rapid authoring. Report creators often want to experiment with new calculations or visualizations. This, coupled with the fact that many self-service BI initiatives do not designate a separate authoring environment (as in, development and production are the same), can create user confusion at best, and improper decision making based on “test” calculations at worst.
Reports that generate initial excitement may end up suffering low adoption if the original author modifies the “production” copy directly too frequently.
Operational costs are incurred to keep the self-service BI initiative running beyond a license bundle purchase. Similar to efficiency costs, operational costs can also be defined in a number of categories.
Incorrect License Allocation:
When some users (but not all) plan to author/publish reports, an organization might erroneously purchase authoring licenses for its entire user base. To prevent this, consider assigning authoring licenses to report creators, and read-only licenses to consumers.
This raises the question: do you know who your authors are? How about your consumers? Some companies, like BlueGranite, have strategies in place to understand the user population and segment it into dataset developers, report authors, or consumers. This can help align licensing with actual use.
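One way to approach that segmentation, assuming you can export some kind of activity log from your BI platform, is sketched below. The action names (`publish_dataset`, `publish_report`, `view`), the users, and the rule of assigning each person the most demanding role they have exercised are assumptions made for illustration, not a description of any vendor’s licensing model.

```python
from collections import Counter

# Hypothetical activity log: (user, action).
activity = [
    ("ana", "publish_dataset"), ("ana", "publish_report"), ("ana", "view"),
    ("ben", "publish_report"), ("ben", "view"),
    ("cho", "view"), ("cho", "view"),
    ("dev", "view"),
]


def segment(log):
    """Assign each user the most demanding role their activity shows."""
    actions = {}
    for user, action in log:
        actions.setdefault(user, set()).add(action)
    roles = {}
    for user, seen in actions.items():
        if "publish_dataset" in seen:
            roles[user] = "dataset developer"
        elif "publish_report" in seen:
            roles[user] = "report author"
        else:
            roles[user] = "consumer"
    return roles


roles = segment(activity)

# Consumers only need read-only licenses; everyone else needs authoring.
license_needs = Counter(
    "read-only" if role == "consumer" else "authoring" for role in roles.values()
)
print(roles)
print(dict(license_needs))
```

With this sample log, only two of the four users would need authoring licenses.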
Unplanned Team Assignment:
Some companies mistakenly treat all users as if they are in the same bucket. In reality, some authors are more driven by visualization best practices, while others are more interested in data preparation, modeling principles, and calculation development.
Making this distinction is significant, considering many companies want to follow a “shared dataset” approach to self-service BI. This implies a person (or team) is developing the data models, while other people (or teams) develop the reports on top of the shared model.
Although some users may claim expertise in both areas, typically one area is dominant. These groups have different interests and motivations, and they benefit from specialized training, support, and a process definition that empowers them. When companies instead treat all authors as belonging to the same category, costs can increase because they expect capabilities from the wrong audience or teach a skill to the wrong user.
Lack of Report Ownership:
Report prototypes often start as experiments. Some stick and become valuable, given their ability to solve reporting needs. However, once reaching that point, not all report authors have the interest or capabilities to own them moving forward. Without a proper ownership migration strategy in place, supporting these tactical solutions becomes expensive, since IT or business teams may struggle to properly maintain them.
No Usage Analysis:
Often, companies do not measure usage behavior for self-service BI reports. This increases operating costs, as IT may spend hours maintaining solutions people no longer use and could be missing important user adoption cues.
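A basic usage analysis does not need to be elaborate. The sketch below, which assumes a hypothetical log of report views and an arbitrary 90-day idle threshold, lists published reports that nobody has opened recently (or ever), so that maintenance effort can be redirected.

```python
from datetime import date, timedelta

# Hypothetical usage log: (report name, date of a view).
views = [
    ("Sales Overview", date(2019, 2, 28)),
    ("Sales Overview", date(2019, 3, 1)),
    ("Regional Sales", date(2018, 10, 15)),
    ("Finance Pack", date(2019, 1, 20)),
]
published = ["Sales Overview", "Regional Sales", "Finance Pack", "Legacy KPI"]


def stale_reports(reports, log, today, max_idle_days=90):
    """Return reports with no views within the last `max_idle_days` days."""
    last_view = {}
    for report, day in log:
        if report not in last_view or day > last_view[report]:
            last_view[report] = day
    cutoff = today - timedelta(days=max_idle_days)
    # Reports that were never viewed fall back to date.min and count as stale.
    return sorted(r for r in reports if last_view.get(r, date.min) < cutoff)


stale = stale_reports(published, views, today=date(2019, 3, 5))
print(stale)
```

The threshold is a judgment call: quarterly or annual reports may legitimately sit idle for longer, so the output is a list of candidates to review rather than to retire automatically.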
Some users are not only business experts, but also achieve self-service BI tool mastery. They are aware of their ability to produce value quickly and their technical prowess surpasses their peers’ capabilities for report creation. With their skills and enthusiasm, they are a fantastic asset to kick-start a self-service BI initiative.
However, as an initiative matures, these experts can become an operational cost risk because they can default to being as self-sufficient as possible. Circumventing the IT department instead of collaborating with it can lead teams to reinvent solutions that may already exist or be in the works.
Opportunity costs are the potential gains lost because alternative approaches were not taken.
As mentioned, self-service BI tools can be used to develop flexible reports quickly. IT has an opportunity to use this capability during the requirement-gathering phase by using self-service BI tools to confirm assumptions around reporting, data modeling, and data preparation. These discoveries can influence blueprints for data warehousing or enterprise data modeling work. In other words, self-service BI tools can be used to “fail cheaply and quickly” to save on expensive future refactoring efforts. By not planning for a prototyping phase using self-service BI tools, IT misses a tremendous opportunity to avoid costly refactoring down the road caused by inaccurate requirements.
Undefined Integration Lifecycle:
Leading reporting technologies allow for upgrading self-service BI models to enterprise BI models. For example, it is possible to migrate Microsoft Power BI models into SQL Server Analysis Services Tabular. As part of a managed integration process, developers can expand the reach and accuracy of enterprise models by:
- Comparing popular data models developed by the business to those created by IT
- Merging valuable query and calculation objects into the enterprise BI models
Oftentimes, however, businesses rolling out self-service BI programs are either unaware of these capabilities or do not plan on using them methodically. In these cases, organizations are missing out on an opportunity to harvest users’ knowledge and leverage ideas of how to enhance enterprise capabilities. Sometimes the opportunity cost for integration is so high that, over time, it can diminish IT’s ability to remain relevant as a provider of analytical capabilities when enterprise solutions become out-of-sync with business needs.
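The compare step of such an integration process can be pictured with a small sketch. Here each model is reduced to a dictionary of measure names and formula strings; the measures, the DAX-style formulas, and the `compare_models` helper are hypothetical, and a real comparison would work against exported model metadata rather than hand-typed dictionaries.

```python
# Hypothetical measure definitions, keyed by measure name.
enterprise = {
    "Revenue": "SUM(Sales[Amount])",
    "Order Count": "COUNTROWS(Sales)",
}
business = {
    "Revenue": "SUM(Sales[Amount])",
    "Order Count": "DISTINCTCOUNT(Sales[OrderID])",
    "Avg Order Value": "DIVIDE([Revenue], [Order Count])",
}


def compare_models(enterprise, business):
    """Classify business-model measures relative to the enterprise model."""
    shared = enterprise.keys() & business.keys()
    return {
        "new": sorted(business.keys() - enterprise.keys()),
        "conflicting": sorted(n for n in shared if enterprise[n] != business[n]),
        "identical": sorted(n for n in shared if enterprise[n] == business[n]),
    }


result = compare_models(enterprise, business)
for bucket, names in result.items():
    print(f"{bucket}: {names}")
```

Measures in the “new” bucket are merge candidates, while “conflicting” ones call for a conversation with the business author before anything is certified.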
Addressing hidden costs:
Regardless of how well engineered your back-end data layer might be, these hidden costs can eat up most of, or all, the value of your self-service BI initiative. Companies that seek to manage them typically do so through an established program that accounts for the following:
- Auditability: Usage metrics analysis and adoption management
- Training: User segmentation, gap analysis and roadmap
- Requirement discovery: Tactical and strategic managed prototyping
- Integration Life Cycle: Model harvesting via compare, merge and certification
- Process: Environment separation, team structure and SLAs
Beyond costs, check out Traps to Avoid for BI on Big Data.
Originally published here: https://cta-redirect.hubspot.com/cta/redirect/488249/b03e7889-d3e2-4b47-b196-7cbc88c7d372