Data Visualization is a method for presenting data visually and compellingly, in a way that highlights insights such as performance, change, trends, comparisons, patterns, correlations and anomalies. It grew out of the field of statistics, particularly descriptive statistics, as a way to see trends, relationships and patterns more easily than in columnar reports.
The purpose of Data Visualization is to improve awareness, recognition and understanding of data insights, including the key highlights that should be discussed, shared and addressed.
Principles to consider when implementing Data Visualization are as follows:
Edward Tufte defines ‘graphical displays’ and principles for effective graphical display in the following passage: “Excellence in statistical graphics consists of complex ideas communicated with clarity, precision, and efficiency. Graphical displays should:
- show the data
- induce the viewer to think about the substance rather than about methodology, graphic design, the technology of graphic production, or something else
- avoid distorting what the data have to say
- present many numbers in a small space
- make large data sets coherent
- encourage the eye to compare different pieces of data
- reveal the data at several levels of detail, from a broad overview to the fine structure
- serve a reasonably clear purpose: description, exploration, tabulation, or decoration
- be closely integrated with the statistical and verbal descriptions of a data set.”
Primary Uses of Data Visualization
Data Visualization has many uses, and there are many methods for viewing data statistically. Its primary use is cognitive: to view statistical representations of trends, patterns, relationships and outliers / anomalies in a way that improves awareness, recognition and understanding.
There are many types of visualization; a sample list follows:
- Information graphic types
- Line chart
- Bar chart
- Box plot
- Pareto chart
- Pie chart
- Area chart
- Tree map
- Bubble chart
- Stripe graphic
- Control chart
- Run chart
- Stem-and-leaf display
- Small multiple
- Marimekko chart
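As one illustration of the chart types listed above, the following is a minimal sketch of the calculation behind a Pareto chart: categories sorted from largest to smallest, each with a running cumulative percentage. The function names (`pareto_table`, `render_pareto`) and the defect counts are hypothetical and invented for this example; the text rendering stands in for a real charting tool.

```python
def pareto_table(counts):
    """Return rows of (category, count, cumulative_percent), largest count first."""
    total = sum(counts.values())
    if total == 0:
        return []
    rows = []
    running = 0
    # Sort descending by count; ties are broken by category name for stable output.
    for category, count in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
        running += count
        rows.append((category, count, round(100 * running / total, 1)))
    return rows


def render_pareto(rows, width=30):
    """Render the rows as text bars scaled to the largest count."""
    if not rows:
        return ""
    largest = rows[0][1]
    lines = []
    for category, count, cumulative in rows:
        bar = "#" * max(1, round(width * count / largest))
        lines.append(f"{category:<12} {bar:<{width}} {count:>4} {cumulative:>6.1f}%")
    return "\n".join(lines)


# Hypothetical defect counts, invented for illustration only.
defects = {"scratches": 45, "dents": 30, "misprints": 15, "other": 10}
print(render_pareto(pareto_table(defects)))
```

The cumulative column is what distinguishes a Pareto chart from an ordinary sorted bar chart: it shows how few categories account for most of the total.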
Key Business Benefits of Data Visualization
The main benefit of Data Visualization is an improved understanding of what the data indicate and how important each insight is: what the result of a query means in terms of inference and implications.
Common Roles and Responsibilities Associated with Data Visualization
Roles important to Data Visualization are as follows:
- BI Engineer – The BI engineer is responsible for delivering business insights using OLAP methods and tools. The BI engineer works with the business and technical teams to ensure that the data is available and modeled appropriately for OLAP queries, and then builds those queries, including designing the outputs (reports, visuals, dashboards) typically using BI tools. In some cases, the BI engineer also models the data.
- Business Owner – There needs to be a business owner who understands the business needs for data and for the subsequent reporting and analysis. This ensures accountability and actionability, as well as ownership of data quality and data utility based on the data model. The business owner and project sponsor are responsible for reviewing and approving the data model as well as the reports and analyses that OLAP will generate. For larger, enterprise-wide insights creation and performance measurement, a governance structure should be considered to ensure cross-functional engagement and ownership for all aspects of data acquisition, modeling and usage (reporting and analysis).
- Data Analyst / Business Analyst – A business analyst or, more recently, a data analyst is often responsible for defining the uses and use cases of the data, as well as providing design input on the data structure, particularly the metrics, business questions / queries and outputs (reports and analyses) to be produced and improved. Responsibilities also include owning the roadmap for how the data will be enhanced to address additional business questions and existing insights gaps.
Common Business Processes Associated with Data Visualization
The process for developing and deploying Data Visualization is as follows:
- Access – Data, often in a structured, ready-to-analyze form, are made available securely to approved users, including insights creators and enablers.
- Profiling – Data are reviewed for relevance, completeness and accuracy by data creators and enablers. Profiling can and should occur for individual datasets and integrated datasets, both in raw form and in a ready-to-analyze structured form.
- Preparation – Data are extracted, transformed, modeled, structured and made available in a ready-to-analyze form, often with standardized configurations and coded automation to enable faster data refresh and delivery. Data are typically made available in an easy-to-query form such as a database, spreadsheet or Business Intelligence application.
- Integration – When multiple data sources are involved, integration involves combining multiple data sources into a single, structured, ready-to-analyze dataset. Integration involves creating a single data model and then extracting, transforming and loading the individual data sources to conform to the data model, making the data available for querying by data insights creators and consumers.
- Extraction / Aggregation – The integrated dataset is made available for querying, including in aggregated form to optimize query performance.
- Analyze – Data are queried to create insights that address specific business questions. Analysis is often performed with business intelligence tools that run against a structured database, automate the queries and present the data for faster, repeated use by data analysts, business analysts and decision-makers.
- Synthesize – Determine the key insights that the data are indicating, and determine the best way to convey those insights to the intended audience.
- Visualize – Dashboards and visuals should be designed and then developed based on the business questions to be addressed and the queries implemented. Whether working in a waterfall or agile context, it is important to think about how the data will be presented so that the results are well understood and acted upon.
- Publish – Results of queries are made available for consumption via multiple forms, including as datasets, spreadsheets, reports, visualizations, dashboards and presentations.
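The steps above can be sketched end to end in a few lines. The following is an illustrative outline only, assuming two tiny in-memory sources; the data, field names and function names (`profile`, `prepare`, `integrate`, `aggregate`, `visualize`) are hypothetical, and a real deployment would typically use a database and a BI or charting tool instead of plain Python.

```python
from collections import defaultdict

# Hypothetical source data, invented for illustration only.
orders = [
    {"order_id": 1, "region_id": "E", "amount": 120.0},
    {"order_id": 2, "region_id": "W", "amount": 80.0},
    {"order_id": 3, "region_id": "E", "amount": None},  # missing value
    {"order_id": 4, "region_id": "W", "amount": 200.0},
]
regions = {"E": "East", "W": "West"}


def profile(rows, field):
    """Profiling: share of rows in which `field` is populated."""
    if not rows:
        return 0.0
    filled = sum(1 for row in rows if row.get(field) is not None)
    return filled / len(rows)


def prepare(rows, field):
    """Preparation: keep only rows in which `field` is populated."""
    return [row for row in rows if row.get(field) is not None]


def integrate(rows, lookup):
    """Integration: attach the region name from a second source."""
    return [{**row, "region": lookup.get(row["region_id"], "Unknown")} for row in rows]


def aggregate(rows, key, value):
    """Aggregation: total `value` per `key`."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[key]] += row[value]
    return dict(totals)


def visualize(totals, width=20):
    """Visualize: one text bar per group, scaled to the largest total."""
    largest = max(totals.values(), default=0)
    lines = []
    for name, total in sorted(totals.items()):
        bar = "#" * (round(width * total / largest) if largest else 0)
        lines.append(f"{name:<6} {bar} {total:.0f}")
    return "\n".join(lines)


completeness = profile(orders, "amount")
totals = aggregate(integrate(prepare(orders, "amount"), regions), "region", "amount")
print(f"amount completeness: {completeness:.0%}")
print(visualize(totals))
```

Keeping each step as a separate function mirrors the process list: profiling reports on data quality before anything is dropped, and the visual is built only from the aggregated result.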
Technologies involved with Data Visualization are as follows:
- Semantic Layer – Semantic layer applications enable the development of a logical and physical data model for use by OLAP-based business intelligence and analytics applications. The Semantic Layer supports data governance by enabling management of all data used to create reports and analyses, as well as all data generated for those reports and analyses, thus enabling governance of the output / usage aspects of input data.
- Business Intelligence (BI) Tools – These tools automate the OLAP queries, making it easier for data analysts and business-oriented users to create reports and analyses without having to involve IT / technical resources.
- Visualization tools – Visualizations are typically available within the BI tools and are also available as standalone applications and as libraries, including open source.
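To make the semantic layer idea concrete, here is a minimal sketch in which metrics and dimensions are defined once and every report compiles its query from those shared definitions. This is not any vendor's API: the `SEMANTIC_MODEL` dictionary, the `build_query` and `run` helpers, and the sample `sales` table are all hypothetical, and SQLite is used only because it ships with Python.

```python
import sqlite3

# Hypothetical shared definitions: each metric and dimension is defined once
# and reused by every report. The names are illustrative, not a product API.
SEMANTIC_MODEL = {
    "table": "sales",
    "dimensions": {"region": "region", "year": "order_year"},
    "metrics": {
        "revenue": "SUM(amount)",
        "order_count": "COUNT(*)",
    },
}


def build_query(model, metric, dimension):
    """Compile a metric/dimension pair into SQL using the shared definitions."""
    if metric not in model["metrics"] or dimension not in model["dimensions"]:
        raise KeyError(f"unknown metric or dimension: {metric}, {dimension}")
    column = model["dimensions"][dimension]
    expression = model["metrics"][metric]
    return (
        f"SELECT {column} AS {dimension}, {expression} AS {metric} "
        f"FROM {model['table']} GROUP BY {column} ORDER BY {column}"
    )


def run(connection, model, metric, dimension):
    """Execute the compiled query and return the result rows."""
    return connection.execute(build_query(model, metric, dimension)).fetchall()


# Hypothetical data, invented for illustration only.
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE sales (region TEXT, order_year INTEGER, amount REAL)")
connection.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [("East", 2023, 100.0), ("East", 2024, 150.0), ("West", 2024, 90.0)],
)

print(run(connection, SEMANTIC_MODEL, "revenue", "region"))
print(run(connection, SEMANTIC_MODEL, "order_count", "year"))
```

Because every chart asks for "revenue" by name rather than writing its own aggregation, a change to the definition is made in one place and applies to all outputs.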
Trends / Outlook
Key trends in Data Visualization are as follows:
- Semantic Layer – The semantic layer is a common, consistent representation of the data used by business intelligence for reporting and analysis, as well as for analytics. It is important because it provides one consistent way to define data in multidimensional form, so that queries made from and across multiple applications, including multiple business intelligence tools, go through a single common definition rather than requiring data models and definitions to be created within each tool. This ensures consistency and efficiency, including cost savings, and creates the opportunity to improve query speed / performance.
- Video Infographics – The use of video with graphics is increasing, adding context and explanation to the data and the visualizations presented.
- Real-time Visualization – With IoT and other sensor data becoming available, efforts are underway to present these data in real time, showing patterns and changes as they occur using visualizations, particularly maps.
- Color Gradients – Color gradients are being used more often, and more inventively, to highlight differences among data more effectively.
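As a small illustration of the color gradient idea, the sketch below maps each value to a color by linearly interpolating between two RGB endpoints according to the value's position between the minimum and maximum. The function names (`lerp_color`, `to_hex`, `gradient_for`) and the white-to-blue endpoints are assumptions made for this example; production tools usually offer perceptually tuned palettes instead of plain linear RGB blending.

```python
def lerp_color(start, end, t):
    """Linearly interpolate between two RGB tuples; t is clamped to [0, 1]."""
    t = max(0.0, min(1.0, t))
    return tuple(round(s + (e - s) * t) for s, e in zip(start, end))


def to_hex(rgb):
    """Format an (r, g, b) tuple of 0-255 integers as a hex color string."""
    return "#{:02x}{:02x}{:02x}".format(*rgb)


def gradient_for(values, start=(255, 255, 255), end=(0, 0, 255)):
    """Map each value to a hex color by its position between min and max."""
    if not values:
        return []
    low, high = min(values), max(values)
    span = high - low
    colors = []
    for value in values:
        # When all values are equal there is no spread, so use the start color.
        t = 0.0 if span == 0 else (value - low) / span
        colors.append(to_hex(lerp_color(start, end, t)))
    return colors


# Hypothetical measurements, invented for illustration only.
print(gradient_for([10, 20, 30]))
```

The smallest value receives the start color and the largest the end color, so differences in magnitude become differences in shade.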
AtScale and Data Visualization
AtScale’s semantic layer improves data visualization by enabling visualizations to be rendered faster via automated query optimization. The Semantic Layer enables development of a unified, business-driven data model that defines what data can be used, including support for the specific queries that generate data for visualization. This makes tracking and auditing easier, and ensures that every aspect of how data are defined, queried and rendered is known and tracked across dimensions, entities, attributes and metrics, including the source data and the queries used to produce output for reporting, analysis and analytics.