Technical, Data Analytics

Course 6: Share Data Through the Art of Visualisation, Course Overview plus Module 1: Visualise Data

Course Overview

Welcome to the sixth course in the Google Data Analytics Certificate! In this course, you’ll learn how to create data visualisations. Visualisations, along with compelling data storytelling, will help you communicate the meaning of a dataset to your audience. Sharing the results of an analysis is one of the most important parts of an analyst’s job.

This course starts with the basics: learning principles and best practices for data visualisation in spreadsheets. You’ll get hands-on experience creating data visualisations in Tableau, a specialized data visualisation tool. Beyond the basics, there’s a focus on professional tips for creating exciting visualisations, presentations, and talking points about your data. This course also covers how to prepare and deliver effective presentations, so you can confidently handle the most challenging questions about your data analysis. Once you’ve completed this course, you’ll be on your way to becoming a talented data storyteller!

Certificate program progress

The Google Data Analytics Certificate program has eight courses. Share data through the art of visualisation is the sixth course.

Course 6 content

This course is broken into four modules. Here’s an overview of the skills you’ll gain in each module:

Module 1: Visualise data

In this module, we will delve into the various types of data visualisations and explore what makes an effective visualisation. You’ll also learn about accessibility, design thinking, and other factors that will help you use data visualisations to effectively communicate data insights.

Learning Objectives


Describe the key concepts involved in data visualisation

Explain the key concepts involved in design thinking as they relate to data visualisation

Describe the use of data visualisations to talk about data and the results of data analysis

Discuss accessibility issues associated with data visualisation

Explain the importance of data visualisation to data analysts

Module 2: Create data visualisations with Tableau

Tableau is a business intelligence and analytics platform that helps people visualise, understand, and make decisions with data. In this part of the course, you’ll become well-versed in Tableau’s dynamic capabilities and learn to inject creativity and clarity into your visualisations, ensuring that your findings are easy to understand.

Module 3: Craft data stories

Connecting your objective with your data through insights is essential to data storytelling. In this part of the course, you’ll get acquainted with the principles of data-driven storytelling and learn to craft compelling narratives using Tableau’s dashboard and filtering capabilities, giving life to your data insights.

Module 4: Develop presentations and slideshows

In this part of the course, you’ll discover how to give an effective presentation about your data analysis. This final module teaches you to construct insightful presentations that resonate with your audience. You’ll learn to anticipate and address potential questions and to articulate the limitations of your data, ensuring a robust and credible narrative for your stakeholders.

Module 1: Visualise Data

Effective data visualisations

It can be difficult to understand data insights by examining individual data points or a table of information. Often, insights become more obvious when presented in an effective visual format. You can use data visualisation (often called “data viz”) techniques to help your audience interpret data in a concise, visual manner.

When creating data visualisations, you must strike a balance between presenting enough information for your audience to understand the meaning of the visualisation and not overwhelming them with too much detail. In this reading, you’ll learn tips and techniques for crafting visualisations that are both impactful and effective. You’ll explore:

  • Two frameworks for organizing your thoughts about data visualisation
  • Pre-attentive attributes

Frameworks for organizing your thoughts about visualisation

Frameworks help organize your thoughts about data visualisation and give you a useful checklist to reference as you plan and evaluate your data visualisation. Here are two frameworks that employ slightly different techniques. Both are intended to improve the quality of your visuals. 

The McCandless method

You learned about the David McCandless method earlier in the course; as a refresher, the McCandless method lists four elements of good data visualisation: 

  1. Information: the data with which you’re working 
  2. Story: a clear and compelling narrative or concept
  3. Goal: a specific objective or function for the visual
  4. Visual form: an effective use of metaphor or visual expression

The McCandless method provides terminology that isolates the specific elements of a graphic, so the person making a visual can evaluate how well each criterion has been met. The aim when crafting a visualisation is to incorporate all four elements effectively. Visualisations that fail to incorporate all four elements can be ineffective at communicating insights in various ways. For example, visual form without a goal, story, or data could be a sketch or even art. Data in visual form without a goal or function is just a pretty picture. Data with a goal but no story or visual form can be boring. All four elements need to be present to create an effective visual.

Kaiser Fung’s Junk Charts trifecta checkup

This approach is a set of questions that can help consumers of data visualisation critique what they are consuming and determine how effective it is. You can also use these questions to determine if your data visualisation is effective:

  1. What is the practical question? 
  2. What does the data say?
  3. What does the visual say? 

Each of these questions offers an opportunity to investigate a given problem with a slightly different context. A well-designed visual effectively answers all three of those questions at once. Moreover, this framework helps you think about your data viz from the perspective of your audience. 

Pre-attentive attributes

In addition to the frameworks mentioned above, several standard building blocks can help you construct your data visualisations. Creating effective visuals means leveraging what is known about how the brain works, and then using specific visual elements to communicate the information effectively. Pre-attentive attributes are the elements of a data visualisation that people recognize automatically and without conscious effort. The essential, basic building blocks that make visuals immediately understandable are called marks and channels. 

Marks

Marks are basic visual objects such as points, lines, and shapes. Every mark can be broken down into four qualities:

1. Position: Where is a specific mark in space relative to a scale or to other marks? 

For example, if you’re looking at two different trends, position allows you to compare the pattern of one element relative to another. 

2. Size: How big, small, long, or tall is a mark?

Comparing the sizes of objects is an easy visual interpretation for humans. This can be very useful for conveying the relationship between categories or data points. However, it also presents a potential problem: the human eye can inadvertently read meaning into comparisons that aren’t intended to convey any. For example, objects can sometimes appear to be the same size when they are not. Controlling the scale of a visual is important, even when comparative sizes are not intended to offer information.

3. Shape: Does the shape of a specific object communicate something about it?

Rather than using simple dots or lines, a bit of creativity can enhance how quickly people are able to interpret a visual by using shapes that align with a given application. In the example below, it is immediately obvious that numbers of people are represented because the bars are person-shaped. 

4. Colour: What colour is a mark?

Colours can be used both as a simple differentiator of groupings or as a way to communicate other concepts such as profitable versus unprofitable, or hot versus cold. 

Channels

Channels are visual aspects or variables that represent characteristics of the data in a visualisation. They are basically specialized marks that have been used to visualise data. It’s important to understand that channels vary in terms of how effective they are at communicating data based on three elements: 

1. Accuracy: Are the channels helpful in accurately estimating the values being represented?

For example, colour is very accurate when communicating categorical differences, such as apples and oranges. But it is much less effective when distinguishing quantitative data, such as 5 from 5.5.

2. Popout: How easy is it to distinguish certain values from others?

There are many ways of drawing attention to specific parts of a visual, and lots of them leverage pre-attentive attributes including line length, size, line width, shape, enclosure, hue, and intensity.

3. Grouping: How effective is a channel at communicating groups that exist in the data?

Consider the proximity, similarity, enclosure, connectedness, and continuity of the channel.

But remember: the more you emphasize one single thing, the more that emphasis counts. Emphasis diminishes with each item you emphasize because the items begin to compete with one another.

Key takeaways

Throughout your career as an analyst, you will use different techniques and types of data visualisations to present data and insights in a concise, impactful manner. This will include organising your data, selecting the right type of data visualisation, and designing visuals that are easy to understand and highly communicative while avoiding any that are misleading or inaccurate.

Keep in mind that data visualisation is an art form, and it takes time to develop these skills. Over your career as a data analyst, you will learn how to design and evaluate data visualisations. Use these tips to think critically about data visualisation, both as a creator and as an audience member.

Resources

  • The beauty of data visualization: In this video, David McCandless explains the need for design to not just be beautiful, but for it to be meaningful as well. Data visualization must be able to balance function and form for it to be relevant to your audience. 
  • ‘The McCandless Method’ of data presentation: At first glance, this blog appears to be written by a David McCandless fan, and it is. However, it contains very useful information and provides an in-depth look at the 5-step process that McCandless uses to present his data.
  • Information is beautiful: Founded by McCandless himself, this site serves as a hub of sample visualizations that make use of the McCandless method. Explore data from the news, science, the economy, and so much more and learn how to make visual decisions based on facts from all kinds of sources. 
  • Beautiful news: In this McCandless collection, explore uplifting trends and statistics that are beautifully visualized for your creative enjoyment. A new chart is released every day so be sure to visit often to absorb the amazing things happening all over the world.
  • The Wall Street Journal Guide to Information Graphics: The Dos and Don’ts of Presenting Data, Facts, and Figures: This is a comprehensive guide to data visualization, including chapters on basic data visualization principles and how to create useful data visualizations even when you find yourself in a tricky situation. This is a useful book to add to your data visualization library, and you can reference it over and over again.

The beauty of visualising: Inspiration is in the air

Data visualisation is the graphical representation of data. But why should data analysts care about data visualisation? Your audience won’t always be able to interpret or understand the complex information that you relay to them, so your job is to present your analysis in a way that is meaningful, engaging, and easy to understand. Part of why data visualisation is so effective is that people’s eyes are drawn to colors, shapes, and patterns, which makes those visual elements perfect for telling a story that goes beyond just the numbers.

Of course, one of the best ways to understand the importance of data visualisation is to go through different examples of it. As a junior data analyst, you want to have several visualisation options on hand for your creative process whenever you need them. Below is a list of resources that can inspire your next data-driven decisions, as well as teach you how to make your data more accessible to your audience:

  • The data visualization catalogue: Not sure where to start with data visualisation? This catalogue features a range of different diagrams, charts, and graphs to help you find the best fit for your project. As you navigate each category, you will get a detailed description of each visualisation as well as its function and a list of similar visuals. 
  • The 25 best data visualizations: In this collection of images, explore the best examples of data that gets made into a stunning visual. Simply click on the link below each image to get an in-depth view of each project, and learn why making data visually appealing is so important.
  • 10 data visualization blogs: Each link will lead you to a blog that is a fountain of information on everything from data storytelling to graphic data. Get your next great idea or just browse through some visual inspiration.  
  • Information is beautiful: Founded by David McCandless, this gallery is dedicated to helping you make clearer, more informed visual decisions based on facts and data. These projects are made by students, designers, and even data analysts to help you gain insight into how they have taken their own data and turned it into visual storytelling.
  • Data studio gallery: Information is vital, but information presented in a digestible way is even more useful. Browse through this interactive gallery and find examples of different types of data communicated visually. You can even use the data studio tool to create your own data-driven visual.

Engage your audience

Remember: an important component of being a data analyst is the ability to communicate your findings in a way that will appeal to your audience. Data visualisation has the ability to make complex (and even monotonous) information easily understood, and knowing how to utilize data visualisation is a valuable skill to have. Your goal is always to help the audience have a conversation with the data so your visuals draw them into the conversation. This is especially true when you have to help your audience engage with a large amount of data, such as the flow of goods from one country to other parts of the world.

Correlation and causation

In this reading, you will examine correlation and causation in more detail. Let’s review the definitions of these terms:

  • Correlation in statistics is the measure of the degree to which two variables move in relationship to each other. An example of correlation is the idea that “As the temperature goes up, ice cream sales also go up.” It is important to remember that correlation doesn’t mean that one event causes another. But, it does indicate that they have a pattern with or a relationship to each other. If one variable goes up and the other variable also goes up, it is a positive correlation. If one variable goes up and the other variable goes down, it is a negative or inverse correlation. If one variable goes up and the other variable stays about the same, there is no correlation.
  • Causation refers to the idea that an event leads to a specific outcome. For example, when lightning strikes, we hear the thunder (sound wave) caused by the air heating and cooling from the lightning strike. Lightning causes thunder.  
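The strength and direction of a correlation can be expressed as a single number. As a minimal sketch, the following Python computes the Pearson correlation coefficient, one common measure (the reading does not name a specific one), using only the standard library. All of the data values and the name `pearson_r` are invented for illustration.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Numerator: how the two variables move together around their means.
    co_movement = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Denominator terms: how much each variable varies on its own.
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return co_movement / sqrt(spread_x * spread_y)

# Hypothetical data, invented for illustration only.
temperature_c = [18, 21, 24, 27, 30, 33]
ice_cream_sales = [120, 135, 160, 180, 210, 230]
heater_sales = [90, 80, 65, 50, 40, 25]

r_positive = pearson_r(temperature_c, ice_cream_sales)
r_negative = pearson_r(temperature_c, heater_sales)
print(f"temperature vs ice cream: r = {r_positive:.2f}")
print(f"temperature vs heaters:   r = {r_negative:.2f}")
```

A value near +1 indicates a positive correlation and a value near -1 a negative one. Neither value says anything about whether one variable causes the other.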

Why is differentiating between correlation and causation important? 

When you make conclusions from data analysis, you need to make sure that you don’t assume a causal relationship between elements of your data when there is only a correlation. When your data shows that outdoor temperature and ice cream consumption both go up at the same time, it might be tempting to conclude that hot weather causes people to eat ice cream. But a closer examination of the data would reveal that not every change in temperature leads to a change in ice cream purchases. In addition, there might have been a sale on ice cream at the same time that the data was collected, which might not have been considered in your analysis.

Knowing the difference between correlation and causation is important when you make conclusions from your data since the stakes could be high. The next two examples illustrate the high stakes to health and human services. 

Cause of disease

For example, pellagra is a disease with symptoms of dizziness, sores, vomiting, and diarrhea. In the early 1900s, people thought that the disease was caused by unsanitary living conditions. Most people who got pellagra also lived in unsanitary environments. But, a closer examination of the data showed that pellagra was the result of a lack of niacin (Vitamin B3). Unsanitary conditions were related to pellagra because most people who couldn’t afford to purchase niacin-rich foods also couldn’t afford to live in more sanitary conditions. But, dirty living conditions turned out to be a correlation only.

Distribution of aid

Here is another example. Suppose you are working for a government agency that provides SNAP benefits. You noticed from the agency’s Google Analytics that people who qualify for the benefits are browsing the official website, but they are leaving the site without signing up for benefits. You think that the people visiting the site are leaving because they aren’t finding the information they need to sign up for SNAP benefits. Google Analytics can help you find clues (correlations), like the same people coming back many times or how quickly people leave the page. One of those correlations might lead you to the actual cause, but you will need to collect additional data, like in a survey, to know exactly why people coming to the site aren’t signing up for SNAP benefits. Only then can you figure out how to increase the sign-up rate.

Key takeaways 

In your data analysis, remember to: 

  • Critically analyse any correlations that you find 
  • Examine the data’s context to determine if a causation makes sense (and can be supported by all of the data)
  • Understand the limitations of the tools that you use for analysis

Further information

You can explore the following article and training for more information about correlation and causation:

  • Correlation is not causation: This article describes the impact to a business when correlation and causation are confused.
  • Correlation and causation (Khan Academy lesson): This lesson describes correlation and causation along with a working example. Follow the examples of the analysis and notice if there is a positive correlation between frostbite and sledding accidents.

Different viz methods

Line chart 

A line chart is used to track changes over short and long periods of time. When smaller changes exist, line charts are better to use than bar graphs. Line charts can also be used to compare changes over the same period of time for more than one group. 

Let’s say you want to present the graduation rate for a particular high school between 2008 and 2012. You would input your data in a table like this:

Year    Graduation rate
2008    87
2009    89
2010    92
2011    92
2012    96


From this table, you are able to present your data in a line chart like this:
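As a minimal sketch of what a line chart makes visible, the following Python computes the year-to-year change in the table's values, which corresponds to the slope of each line segment. The variable names are illustrative, not part of the course material.

```python
# Graduation rates from the table above, keyed by year.
graduation_rate = {2008: 87, 2009: 89, 2010: 92, 2011: 92, 2012: 96}

years = sorted(graduation_rate)
# Difference between each year and the one before it:
# this is what the rise or fall of each line segment shows.
changes = {
    year: graduation_rate[year] - graduation_rate[prev]
    for prev, year in zip(years, years[1:])
}

for year, delta in changes.items():
    print(f"{year}: {delta:+d}")
```

The flat segment between 2010 and 2011 and the steeper rise into 2012 are the kind of small changes that a line chart shows more clearly than a bar graph.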

Maybe your data is more specific than above. For example, let’s say you are tasked with presenting the difference in graduation rates between male and female students. Then your chart would resemble something like this:

Column chart 

Column charts use size to contrast and compare two or more values, using height or length to represent the specific values.

Below is example data on vehicle sales over the course of five months:

Month        Vehicles sold
August       2,800
September    3,700
October      3,750
November     4,300
December     4,600

Visually, it would resemble something like this:
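To illustrate how a column chart encodes value as size, here is a rough text-only sketch in Python. The bars are drawn horizontally for simplicity, and the scale of one `#` per 50 vehicles is an arbitrary assumption. In practice you would use a spreadsheet or a charting tool.

```python
# Vehicle sales from the table above.
vehicles_sold = {
    "August": 2800,
    "September": 3700,
    "October": 3750,
    "November": 4300,
    "December": 4600,
}

UNITS_PER_BLOCK = 50  # assumed scale: one "#" per 50 vehicles

def bar(value, units_per_block=UNITS_PER_BLOCK):
    """Return a bar whose length is proportional to value."""
    return "#" * (value // units_per_block)

bars = {month: bar(sold) for month, sold in vehicles_sold.items()}
for month, sold in vehicles_sold.items():
    print(f"{month:<10} {bars[month]} {sold:,}")
```

Because every bar uses the same scale and starts from zero, the relative lengths can be compared directly, which is the property that makes size a reliable mark.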

What would this column chart entail if we wanted to add the sales data for a competing car brand?

Heatmap 

Similar to bar charts, heatmaps also use color to compare categories in a data set. They are mainly used to show relationships between two variables and use a system of color-coding to represent different values. The following heatmap plots temperature changes for each city during the hottest and coldest months of the year.
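The colour-coding step of a heatmap can be sketched as mapping each value onto a fixed scale. In the Python below, text characters stand in for colour intensities. The city names, temperatures, and the five-step scale are all hypothetical and invented for illustration.

```python
# Hypothetical average temperatures (degrees C), invented for illustration.
temperatures = {
    "City A": {"January": -3, "July": 24},
    "City B": {"January": 8, "July": 31},
    "City C": {"January": 15, "July": 19},
}

SHADES = ".:-=#"  # lightest to darkest; stands in for a colour scale

values = [t for months in temperatures.values() for t in months.values()]
low, high = min(values), max(values)

def shade(value):
    """Map a value onto the shade scale, proportionally between low and high."""
    position = (value - low) / (high - low)  # 0.0 (coldest) to 1.0 (hottest)
    index = min(int(position * len(SHADES)), len(SHADES) - 1)
    return SHADES[index]

grid = {
    city: {month: shade(t) for month, t in months.items()}
    for city, months in temperatures.items()
}
for city, months in grid.items():
    print(city, " ".join(f"{month}:[{s}]" for month, s in months.items()))
```

Using one shared scale for every cell is what lets a reader compare cities and months at a glance.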

Pie chart

The pie chart is a circular graph divided into segments, where the size of each segment is proportional to the quantity it represents. It is especially useful when dealing with parts of a whole.

For example, let’s say you are determining favorite movie categories among avid movie watchers. You have gathered the following data:

Movie category    Preference
Comedy            41%
Drama             11%
Sci-fi            3%
Romance           17%
Action            28%

Visually, it would resemble something like this:
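The arithmetic behind the slices is straightforward: each category's share of the whole becomes the same share of a 360-degree circle. A minimal Python sketch, with illustrative variable names:

```python
# Preferences from the table above, in percent.
preference_pct = {"Comedy": 41, "Drama": 11, "Sci-fi": 3, "Romance": 17, "Action": 28}

# A pie chart only makes sense when the parts add up to the whole.
total = sum(preference_pct.values())
assert total == 100, "slices must sum to 100%"

# Each slice's angle is its share of the full 360-degree circle.
slice_degrees = {
    category: pct * 360 / 100 for category, pct in preference_pct.items()
}

# List the slices from largest to smallest.
for category, degrees in sorted(
    slice_degrees.items(), key=lambda item: item[1], reverse=True
):
    print(f"{category:<8} {degrees:6.1f} degrees")
```

The check that the percentages sum to 100 is worth keeping: if they do not, the data is not a true part-to-whole breakdown and a pie chart would mislead.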

Scatterplot

Scatterplots show relationships between different variables. Scatterplots are typically used for two variables for a set of data, although additional variables can be displayed.

For example, you might want to show data of the relationship between temperature changes and ice cream sales. It would resemble something like this:

As you may notice, the higher the temperature got, the more demand there was for ice cream—so the scatterplot is great for showing the relationship between the two variables.

Distribution graph

A distribution graph displays the spread of various outcomes in a dataset.

Let’s apply this to real data. To account for its supplies, a brand new coffee shop owner wants to measure how many cups of coffee their customers consume, and they want to know if that information is dependent on the days and times of the week. That distribution graph would resemble something like this:

From this distribution graph, you may notice that the amount of coffee sales steadily increases from the beginning of the week, reaching the highest point mid-week, and then decreases towards the end of the week.

If outcomes are categorized on the x-axis by distinct numeric values (or ranges of numeric values), the distribution becomes a histogram. If data is collected from a customer rewards program, they could categorize how many customers consume between one and ten cups of coffee per week. The histogram would have ten columns representing the number of cups, and the height of the columns would indicate the number of customers drinking that many cups of coffee per week.
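The binning described above can be sketched in a few lines of Python using `collections.Counter`. The customer data is hypothetical and invented for illustration. The ten bins, one per number of cups, follow the description in the text.

```python
from collections import Counter

# Hypothetical rewards-program data: cups of coffee per week for each customer.
cups_per_week = [1, 2, 2, 3, 3, 3, 4, 4, 5, 5, 5, 5, 6, 6, 7, 8, 8, 9, 10, 10]

# Count how many customers fall into each of the ten bins (1 to 10 cups).
counts = Counter(cups_per_week)
histogram = {cups: counts.get(cups, 0) for cups in range(1, 11)}

# Column height = number of customers drinking that many cups per week.
for cups, customers in histogram.items():
    print(f"{cups:>2} cups | {'#' * customers} ({customers})")
```

Building the dictionary over `range(1, 11)` keeps a column for every bin, including any with zero customers, so gaps in the distribution stay visible.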

Reviewing each of these visual examples, where do you notice that they fit in relation to your type of data? One way to answer this is by evaluating patterns in data. Meaningful patterns can take many forms, such as:

  • Change: This is a trend or instance of observations that become different over time. A great way to measure change in data is through a line or column chart.
  • Clustering: A collection of data points with similar or different values. This is best represented through a distribution graph.
  • Relativity: These are observations considered in relation or in proportion to something else. You have probably seen examples of relativity data in a pie chart.
  • Ranking: This is a position in a scale of achievement or status. Data that requires ranking is best represented by a column chart.
  • Correlation: This shows a mutual relationship or connection between two or more things. A scatterplot is an excellent way to represent this type of data pattern.

Studying your data

Data analysts are tasked with collecting and interpreting data as well as displaying data in a meaningful and digestible way. Determining how to visualise your data will require studying your data’s patterns and converting it using visual cues. Feel free to practice your own charts and data in spreadsheets. Simply input your data in the spreadsheet, highlight it, then insert any chart type and view how your data can be visualised based on what you choose.

So how do you decide which visualisation option to choose?

With so many visualisation options out there for you to choose from, how do you decide what is the best way to represent your data? 

A decision tree is a decision-making tool that allows you, the data analyst, to make decisions based on key questions that you can ask yourself. Each question in the visualisation decision tree will help you make a decision about critical features for your visualisation. Below is an example of a basic decision tree to guide you towards making a data-driven decision about which visualisation is the best way to tell your story. Please note that there are many different types of decision trees that vary in complexity, and can provide more in-depth decisions. 

Begin with your story

Start off by evaluating the type of data you have and go through a series of questions to determine the best visual source:

  • Does your data have only one numeric variable? If your data has one continuous, numerical variable, then a histogram or density plot is usually the best way to plot it. Depending on your type of data, a bar chart can even be appropriate in this case. For example, if you have data pertaining to the height of a group of students, you will want to use a histogram to visualise how many students there are in each height range.
  • Are there multiple datasets? For cases dealing with more than one set of data, consider a line or pie chart for accurate representation of your data. A line chart will connect multiple data sets over a single, continuous line, showing how numbers have changed over time. A pie chart is good for dividing a whole into multiple categories or parts. An example of this is when you are measuring quarterly sales figures of your company. Below are examples of this data plotted on both a line and pie chart.

  • Are you measuring changes over time? A line chart is usually adequate for plotting trends over time. However, when the changes are larger, a bar chart is the better option. If, for example, you are measuring the number of visitors to NYC over the past 6 months, the data would look like this:
  • Do relationships between the data need to be shown? When you have two variables for one set of data, it is important to point out how one affects the other. Variables that pair well together are best plotted on a scatterplot. However, if there are too many data points, the relationship between variables can be obscured so a heat map can be a better representation in that case. If you are measuring the population of people across all 50 states in the United States, your data points would consist of millions so you would use a heat map. If you are simply trying to show the relationship between the number of hours spent studying and its effects on grades, your data would look like this:
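The questions in the list above can be sketched as a small Python function. This is only an illustrative encoding under simplifying assumptions: the function name `suggest_chart` and all of its parameters are hypothetical, and a real choice would weigh more factors than these flags.

```python
def suggest_chart(
    numeric_variables: int,
    multiple_datasets: bool = False,
    parts_of_whole: bool = False,
    over_time: bool = False,
    large_changes: bool = False,
    show_relationship: bool = False,
    many_points: bool = False,
) -> str:
    """Suggest a chart type by asking the decision-tree questions in order."""
    # Only one numeric variable: show how its values are distributed.
    if numeric_variables == 1:
        return "histogram"
    # Multiple datasets: pie for parts of a whole, otherwise a line chart.
    if multiple_datasets:
        return "pie chart" if parts_of_whole else "line chart"
    # Changes over time: line for trends, bar when the changes are larger.
    if over_time:
        return "bar chart" if large_changes else "line chart"
    # Relationship between variables: scatterplot, or heat map when crowded.
    if show_relationship:
        return "heat map" if many_points else "scatterplot"
    return "no suggestion"

# The examples from the list above, expressed as calls.
student_heights = suggest_chart(numeric_variables=1)
quarterly_share = suggest_chart(2, multiple_datasets=True, parts_of_whole=True)
nyc_visitors = suggest_chart(2, over_time=True)
study_vs_grades = suggest_chart(2, show_relationship=True)
state_population = suggest_chart(2, show_relationship=True, many_points=True)

for name, chart in [
    ("student heights", student_heights),
    ("quarterly share", quarterly_share),
    ("NYC visitors", nyc_visitors),
    ("study vs grades", study_vs_grades),
    ("state population", state_population),
]:
    print(f"{name}: {chart}")
```

The order of the `if` statements matters: the first question that applies decides the answer, which mirrors how you would walk down a decision tree from the top.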

Additional resources

The decision tree example used in this reading is one of many. There are multiple decision trees out there with varying levels of details that you can use to help guide your visual decisions. If you want more in-depth insight into more visual options, explore the following resources:

  • From data to visualization: This is an excellent analysis of a larger decision tree. With this comprehensive selection, you can search based on the kind of data you have or click on each  graphic example for a definition and proper usage.
  • Selecting the best chart: This two-part YouTube video can help take the guesswork out of data chart selection. Depending on the type of data you are aiming to illustrate, you will be guided through when to use, when to avoid, and several examples of best practices. Part 2 of this video provides even more examples of different charts, ensuring that there is a chart for every type of data out there. 

So to summarise, what makes a good viz?

Here are some additional best practices to keep in mind:

  • Your audience should know what they are observing within five seconds of being shown a data visualization. Visuals should be clear and easy to follow.
  • In the five seconds after that, your audience should understand the conclusion your visualization is making—even if they aren’t familiar with your research.
  • As long as it’s not misleading, you should visually represent only the data that your audience needs to understand your findings. Including irrelevant data may confuse, distract, or overwhelm your audience.

Principles of design

In this part, we are going to learn more about using the elements of art and principles of design to create effective visualizations. So far, we have learned that communicating data visually is a form of art. Now, it’s time to explore the nine design principles for creating beautiful and effective data visualizations that can be informative and appeal to all audiences.

After we go through the various design principles, spend some time examining the visual examples to ensure that you have a thorough understanding of how the principle is put into practice. Let’s get into it! 

Nine basic principles of design 

There are nine basic principles of design that data analysts should think about when building their visualizations.  

1. Balance: The design of a data visualization is balanced when the key visual elements, like color and shape, are distributed evenly. This doesn’t mean that you need complete symmetry, but your visualization shouldn’t have one side distracting from the other. If your data visualization is balanced, this could mean that the lines used to create the graphics are similar in length on both sides, or that the space between objects is equal. For example, this column chart is balanced; even though the columns are different heights and the chart isn’t symmetrical, the colors, width, and spacing of the columns keep this data visualization balanced. The colors provide sufficient contrast to each other so that you can pay attention to both the motivation level and the energy level displayed.

2. Emphasis: Your data visualization should have a focal point, so that your audience knows where to concentrate. In other words, your visualizations should emphasize the most important data so that users recognize it first. Using color and value is one effective way to make this happen. By using contrasting colors, you can make certain that graphic elements—and the data shown in those elements—stand out. 

For example, you will notice a heat map data visualization below from The Pudding’s “Where Slang Comes From” article. This heat map uses colors and value intensity to emphasize the states where search interest is highest. You can visually identify the increase in the search over time from low interest to high interest. This way, you are able to quickly grasp the key idea being presented without knowing the specific data values.

3. Movement: Movement can refer to the path the viewer’s eye travels as they look at a data visualization, or literal movement created by animations. Movement in data visualization should mimic the way people usually read. You can use lines and colors to pull the viewer’s attention across the page. 

For example, notice how the average line in this combo chart (also shown below) draws your attention from left to right. Even though this example isn’t moving, it still uses the movement principle to guide viewers’ understanding of the data. 

4. Pattern: You can use similar shapes and colors to create patterns in your data visualization. This can be useful in a lot of different ways. For example, you can use patterns to highlight similarities between different data sets, or break up a pattern with a unique shape, color, or line to create more emphasis.

In the stacked column chart below, the differently colored categories form a consistent pattern that makes it easier to compare book sales by genre in each column. Notice that the Fantasy & Sci Fi category (royal blue) is increasing over time even as the general category (green) stays about the same. 

5. Repetition: Repeating chart types, shapes, or colors adds to the effectiveness of your visualization. Think about the book sales chart from the previous example: the repetition of the colors helps the audience understand that there are distinct sets of data. You may notice this repetition in all of the examples we have reviewed so far. Take some time to review each of the previous examples and notice the elements that are repeated to create a meaningful visual story.

6. Proportion: Proportion is another way that you can demonstrate the importance of certain data. Using various colors and sizes helps demonstrate that you are calling attention to a specific visual over others. If you make one chart in a dashboard larger than the others, then you are calling attention to it. It is important to make sure that each chart accurately reflects and visualizes the relationship among the values in it. In this dashboard (also shown below), the slice sizes and colors of the pie chart compared to the data in the table help make the number of donuts eaten by each person the focal point. 

These first six principles of design are key considerations that you can make while you are creating your data visualization. These next three principles are useful checks once your data visualization is finished. If you have applied the initial six principles thoughtfully, then you will probably recognize these next three principles within your visualizations already. 

7. Rhythm: This refers to creating a sense of movement or flow in your visualization. Rhythm is closely tied to the movement principle. If your finished design doesn’t successfully create a flow, you might want to rearrange some of the elements to improve the rhythm.

8. Variety: Your visualizations should have some variety in the chart types, lines, shapes, colors, and values you use. Variety keeps the audience engaged. But it is good to find balance since too much variety can confuse people. The variety you include should make your dashboards and other visualizations feel interesting and unified.

9. Unity: The last principle is unity. This means that your final data visualization should be cohesive. If the visual is disjointed or not well organized, it will be confusing and overwhelming. 

Being a data analyst means learning to think in a lot of different ways. These nine principles of design can help guide you as you create effective and interesting visualizations. 

Data is beautiful

At this point, you might be asking yourself: What makes a good visualization? Is it the data you use? Or maybe it is the story that it tells? In this reading, you are going to learn more about what makes data visualizations successful by exploring David McCandless’ elements of successful data visualization and evaluating examples based on those elements. Data visualization can change our perspective and allow us to notice data in new, beautiful ways. A picture is worth a thousand words, and that is true in data too! You will have the option to save all of the data visualization examples that are used throughout this reading; these are great examples of successful data visualization that you can use for future inspiration.

Let’s revisit the earlier question: what makes a good visualization?

Four elements of successful visualizations

The Venn diagram by David McCandless identifies four elements of successful visualizations: 

  • Information (data): The information or data that you are trying to convey is a key building block for your data visualization. Without information or data, you cannot communicate your findings successfully.
  • Story (concept): Story allows you to share your data in meaningful and interesting ways. Without a story, your visualization is informative, but not really inspiring. 
  • Goal (function): The goal of your data visualization makes the data useful and usable. This is what you are trying to achieve with your visualization. Without a goal, your visualization might still be informative, but can’t generate actionable insights.
  • Visual form (metaphor): The visual form element is what gives your data visualization structure and makes it beautiful. Without visual form, your data is not visualized yet. 

All four of these elements are important on their own, but a successful data visualization balances all four. For example, if your data visualization has only two elements, like the information and story, you have a rough outline. This can be really useful in your early planning stages, but is not polished or informative enough to share. Even three elements are not quite enough—you need to consider all four to create a successful data visualization.

Example 1: Visualization of dog breed comparison

View the data

The Best in Show visualization uses data about different dog breeds from the American Kennel Club. The data has been compiled in a spreadsheet. Click the link below and select “Use Template” to view the data.

Link to the template: KIB – Best in Show

Examine the four elements

This visualization compares the popularity of different dog breeds to a more objective data score. Consider how it uses the elements of successful data visualization:

  • Information (data): If you view the data, you can explore the metrics being illustrated in the visualization. 
  • Story (concept): The visualization shows which dogs are overrated, which are rightly ignored, and those that are really hot dogs! And, the visualization reveals some overlooked treasures you may not have known about previously.
  • Goal (function): The visualization is interested in exploring the relationship between popularity and the objective data scores for different dog breeds. By comparing these data points, you can learn more about how different dog breeds are perceived. 
  • Visual form (metaphor): In addition to the actual four-square structure of this visualization, other visual cues are used to communicate information about the dataset. The most obvious is that the data points are represented as dog symbols. Further, the size of a dog symbol and the direction the dog symbol faces communicate other details about the data.  

Example 2: Visualization of rising sea levels

Examine the four elements

The When Sea Levels Attack visualization illustrates how much sea levels are projected to rise over the course of 8,000 years. The silhouettes of different cities with different sea levels, rising from right to left, help to drive home how much of the world will be affected as sea levels continue to rise. Here is how this data visualization stacks up using the four elements of successful visualization:

  • Information (data): This visualization uses climate data on rising sea levels from a variety of sources, including NASA and the Intergovernmental Panel on Climate Change. In addition to that data, it also uses recorded sea levels from around the world to help illustrate how much rising sea levels will affect the world. 
  • Story (concept): The visualization tells a very clear story: Over the course of 8,000 years, much of the world as we know it will be underwater. 
  • Goal (function): The goal of this project is to demonstrate how soon rising sea levels are going to affect us on a global scale. Using both data and the visual form, this visualization makes rising sea levels feel more real to the audience. 
  • Visual form (metaphor): The city silhouettes in this visualization are a beautiful way to drive home the point of the visualization. It gives the audience a metaphor for how rising sea levels will affect the world around them in a way that showing just the raw numbers can’t do. And for a more global perspective, the visualization also uses inset maps. 

Notice how each of these visualizations balances all four elements of successful visualization. They clearly incorporate data, use storytelling to make that data meaningful, focus on a specific goal, and structure the data with visual forms to make it beautiful and communicative. The more you practice thinking about these elements, the more you will be able to include them in your own data visualizations.

Design thinking for visualisation improvement

Design thinking for data visualisation involves five phases:

  1. Empathize: Thinking about the emotions and needs of the target audience for the data visualisation 
  2. Define: Figuring out exactly what your audience needs from the data
  3. Ideate: Generating ideas for data visualisation
  4. Prototype: Putting visualisations together for testing and feedback
  5. Test: Showing prototype visualisations to people before stakeholders see them

As interactive dashboards become more popular for data visualisation, new importance has been placed on efficiency and user-friendliness. In this reading, we will learn how design thinking can improve an interactive dashboard. As a junior analyst, you wouldn’t be expected to create an interactive dashboard on your own, but you can use design thinking to suggest ways that developers can improve data visualizations and dashboards.

An example: online banking dashboard

Suppose you are an analyst at a bank that has just released a new dashboard in its online banking application. This section describes how you might explore this dashboard like a new user would, consider a user’s needs, and come up with ideas to improve data visualisation in the dashboard. The dashboard in the banking application has the following data visualisation elements:

  • Monthly spending is shown as a donut chart that reflects different categories like utilities, housing, transportation, education, and groceries. 
  • When customers set a budget for a category, the donut chart shows filled and unfilled portions in the same view.
  • Customers can also set an overall spending limit, and the dashboard will automatically assign the budgeted amounts (unfilled areas of the donut chart) to each category based on past spending trends.
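
The automatic budgeting described in the last bullet can be pictured with a small sketch. The reading does not say how the bank's dashboard actually computes the amounts, so this example assumes the simplest rule: split the overall limit in proportion to each category's past spending. The function name `allocate_budget` and all figures are hypothetical, invented for illustration.

```python
def allocate_budget(overall_limit, past_spending):
    """Split an overall limit across categories in proportion to past spending.

    Assumption: proportional allocation. The real dashboard may use a
    different rule; this is only an illustration.
    """
    total = sum(past_spending.values())
    if total <= 0:
        raise ValueError("past spending must sum to a positive amount")
    return {
        category: round(overall_limit * amount / total, 2)
        for category, amount in past_spending.items()
    }


# Hypothetical monthly averages, not real data.
past = {
    "housing": 1200,
    "groceries": 400,
    "transportation": 200,
    "utilities": 150,
    "education": 50,
}
budgets = allocate_budget(1800, past)
for category, amount in budgets.items():
    print(f"{category}: {amount}")
```

Each budgeted amount would become the full ring segment for its category in the donut chart, with actual spending filling part of it. With other inputs, rounding to cents can make the parts differ from the overall limit by a cent or two, which a real implementation would need to reconcile.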

Empathize

First, empathize by putting yourself in the shoes of a customer who has a checking account with the bank. 

  • Do the colors and labels make sense in the visualization? 
  • How easy is it to set or change a budget? 
  • When you click on a spending category in the donut chart, are the transactions in the category displayed?

What is the main purpose of the data visualization? If you answered that it was to help customers stay within budget or to save money, you are right! Saving money was a top customer need for the dashboard. 

Define

Now, imagine that you are helping dashboard designers define other things that customers might want to achieve besides saving money.

What other data visualizations might be needed? 

  • Track income (in addition to spending).
  • Track other spending that doesn’t neatly fit into the set categories (this is sometimes called discretionary spending).
  • Pay off debt.

Can you think of anything else?

Ideate

Next, ideate additional features for the dashboard and share them with the software development team. 

  • What new data visualizations would help customers?
  • Would you recommend bar charts or line charts in addition to the standard donut chart?
  • Would you recommend allowing users to create their own (custom) categories?

Can you think of anything else?

Prototype

Finally, developers can prototype the next version of the dashboard with new and improved data visualizations.

Test

Developers can close the cycle by having you (and others) test the prototype before it is sent to stakeholders for review and approval.

Key points

This design thinking example showed how important it is to:

  • Understand the needs of users.
  • Generate new ideas for data visualizations.
  • Make incremental improvements to data visualizations over time.

Data Visualisation Guidelines and pro tips

Refer to the following table for recommended guidelines and style checks for headlines, subtitles, labels, and annotations in your data visualizations. Think of these guidelines as guardrails. Sometimes data visualizations can become too crowded or busy. When this happens, the audience can get confused or distracted by elements that aren’t really necessary. The guidelines will help keep your data visualizations simple, and the style checks will help make your data visualizations more elegant.

Visualization component | Guidelines | Style checks
Headlines | Content: Briefly describe the data; Length: Usually the width of the data frame; Position: Above the data | Use brief language; Don’t use all caps; Don’t use italic; Don’t use acronyms; Don’t use abbreviations; Don’t use humor or sarcasm
Subtitles | Content: Clarify context for the data; Length: Same as or shorter than headline; Position: Directly below the headline | Use smaller font size than headline; Don’t use undefined words; Don’t use all caps, bold, or italic; Don’t use acronyms; Don’t use abbreviations
Labels | Content: Replace the need for legends; Length: Usually fewer than 30 characters; Position: Next to data or below or beside axes | Use a few words only; Use thoughtful color-coding; Use callouts to point to the data; Don’t use all caps, bold, or italic
Annotations | Content: Draw attention to certain data; Length: Varies, limited by open space; Position: Immediately next to data annotated | Don’t use all caps, bold, or italic; Don’t use rotated text; Don’t distract viewers from the data
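
A few of these guidelines are mechanical enough to check automatically. The sketch below is a rough illustration, not part of the course material: the function name `check_text` and the sample strings are hypothetical, and it only covers three rules from the table (no all caps, labels under 30 characters, subtitles no longer than the headline). Checks such as "don't use humor or sarcasm" still need a human reviewer.

```python
def check_text(component, text, headline=None):
    """Return a list of style problems found in a piece of chart text.

    component: "headline", "subtitle", "label", or "annotation".
    headline: optional headline text, used only to compare subtitle length.
    """
    problems = []

    # All-caps check applies to every component in the table.
    letters = [ch for ch in text if ch.isalpha()]
    if len(letters) > 1 and all(ch.isupper() for ch in letters):
        problems.append("uses all caps")

    # Labels: usually fewer than 30 characters.
    if component == "label" and len(text) >= 30:
        problems.append("label has 30 or more characters")

    # Subtitles: same length as or shorter than the headline.
    if component == "subtitle" and headline is not None and len(text) > len(headline):
        problems.append("subtitle is longer than the headline")

    return problems


# Hypothetical sample text, invented for illustration.
caps_result = check_text("headline", "SALES BY REGION")
label_result = check_text("label", "Fantasy & Sci Fi")
subtitle_result = check_text(
    "subtitle",
    "Units sold per quarter, 2019 to 2021",
    headline="Book sales by genre",
)
print(caps_result, label_result, subtitle_result)
```

Because the table says "usually" for the length guidelines, results like these are better treated as prompts for a second look than as hard failures.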

How to design a chart in 60 minutes

By now, you understand the principles of design and how to think like a designer. One of the many options for data visualization is a chart, which is a graphical representation of data. 

Representing your data with a chart is usually the simplest and most efficient method. Let’s go through the entire process of creating any type of chart in 60 minutes. The goal here is to develop a prototype or mock-up of your chart that you can quickly present to an audience. This will also give you a sense of whether or not the chart is communicating the information that you want.

Follow this high-level 60-minute breakdown to guide your thinking whenever you begin working on a data visualisation. 

Prep (5 min): Create the mental and physical space necessary for an environment of comprehensive thinking. This means allowing yourself room to brainstorm how you want your data to appear while considering the amount and type of data that you have.

Talk and listen (15 min): Identify the object of your work by getting to the “ask behind the ask” and establishing expectations. Ask questions and really concentrate on feedback from stakeholders regarding your projects to help you hone how to lay out your data. 

Sketch and design (20 min): Draft your approach to the problem. Define the timing and output of your work to get a clear and concise idea of what you are crafting.

Prototype and improve (20 min): Generate a visual solution and gauge its effectiveness at accurately communicating your data. Take your time and repeat the process until a final visual is produced. It is alright if you go through several visuals until you find the perfect fit. 
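
As a quick check on the arithmetic, the four phases above add up to exactly 60 minutes. The short sketch below, with variable names chosen only for illustration, totals the phases and works out the minute mark at which each one should end, which can be handy if you want to set a timer.

```python
# Phase names and durations as listed in the 60-minute breakdown above.
phases = [
    ("Prep", 5),
    ("Talk and listen", 15),
    ("Sketch and design", 20),
    ("Prototype and improve", 20),
]

total_minutes = sum(minutes for _, minutes in phases)

# Running total: the minute mark at which each phase ends.
elapsed = 0
checkpoints = []
for name, minutes in phases:
    elapsed += minutes
    checkpoints.append((name, elapsed))

for name, end_minute in checkpoints:
    print(f"{name}: ends at minute {end_minute}")
print(f"Total: {total_minutes} minutes")
```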

Glossary terms from module 1

Terms and definitions for Course 6, Module 1

Alternative text: Text that provides an alternative to non-text content, such as images and videos

Annotation: Text that briefly explains data or helps focus the audience on a particular aspect of the data in a visualization

AVERAGEIF: A spreadsheet function that returns the average of all cell values from a given range that meet a specified condition 

Balance: The design principle of creating aesthetic appeal and clarity in a data visualization by evenly distributing visual elements

Bar graph: A data visualization that uses size to contrast and compare two or more values

Calculus: A branch of mathematics that involves the study of rates of change and the changes between values that are related by a function 

Causation: When an action directly leads to an outcome, such as a cause-effect relationship

Channel: A visual aspect or variable that represents characteristics of the data in a visualization

Chart: A graphical representation of data from a worksheet

Cluster: A collection of data points on a data visualization with similar values

CONVERT: A SQL function that changes the unit of measurement of a value in data

Correlation: The measure of the degree to which two variables change in relationship to each other

CREATE TABLE: A SQL clause that adds a temporary table to a database that can be used by multiple people

Data composition: The process of combining the individual parts in a visualization and displaying them together as a whole 

Decision tree: A tool that helps analysts make decisions about critical features of a visualization

Design thinking: A process used to solve complex problems in a user-centric way

Distribution graph: A data visualization that displays the frequency of various outcomes in a sample 

DROP TABLE: A SQL clause that removes a temporary table from a database

Dynamic visualizations: Data visualizations that are interactive or change over time

Emphasis: The design principle of arranging visual elements to focus the audience’s attention on important information in a data visualization

HAVING: A SQL clause that adds a filter to a query instead of the underlying table that can only be used with aggregate functions

Headline: Text at the top of a visualization that communicates the data being presented

Heat map: A data visualization that uses color contrast to compare categories in a dataset

Histogram: A data visualization that shows how often data values fall into certain ranges

Inner query: A SQL subquery that is inside of another SQL statement

Label: Text in a visualization that identifies a value or describes a scale

Legend: A tool that identifies the meaning of various elements in a data visualization

Line graph: A data visualization that uses one or more lines to display shifts or changes in data over time

Map: A data visualization that organizes data geographically

Mark: A visual object in a data visualization such as a point, line, or shape

MAXIFS: A spreadsheet function that returns the maximum value from a given range that meets a specified condition

Mental model: A data analyst’s thought process and approach to a problem

Movement: The design principle of arranging visual elements to guide the audience’s eyes from one part of a data visualization to another

MINIFS: A spreadsheet function that returns the minimum value from a given range that meets a specified condition

Narrative: (Refer to story)

Ordinal data: Qualitative data with a set order or scale

Pattern: The design principle of using similar visual elements to demonstrate trends and relationships in a data visualization

Pie chart: A data visualization that uses segments of a circle to represent the proportions of each data category compared to the whole

Pre-attentive attributes: The elements of a data visualization that an audience recognizes automatically without conscious effort

Proportion: The design principle of using the relative size and arrangement of visual elements to demonstrate information in a data visualization

R: A programming language used for statistical analysis, visualization, and other data analysis 

Ranking: A system to position values of a dataset within a scale of achievement or status

Relativity: The process of considering observations in relation or proportion to something else

Repetition: The design principle of repeating visual elements to demonstrate meaning in a data visualization

Rhythm: The design principle of creating movement and flow in a data visualization to engage an audience

Scatterplot: A data visualization that represents relationships between different variables with individual data points without a connecting line 

SELECT INTO: A SQL clause that copies data from one table into a temporary table without adding the new table to the database

Sort range: A spreadsheet menu function that sorts a specified range and preserves the cells outside the range

Sort sheet: A spreadsheet menu function that sorts all data by the ranking of a specific sorted column and keeps data together across rows

Static visualization: A data visualization that does not change over time unless it is edited

Story: The narrative of a data presentation that makes it meaningful and interesting 

Subtitle: Text that supports a headline by adding context and description

Tableau: A business intelligence and analytics platform that helps people visualize, understand, and make decisions with data

Unity: The design principle of using visual elements that complement each other to create aesthetic appeal and clarity in a data visualization

Variety: The design principle of using different kinds of visual elements in a data visualization to engage an audience

Visual form: The appearance of a data visualization that gives it structure and aesthetic appeal

X-axis: The horizontal line of a graph usually placed at the bottom, which is often used to represent time scales and discrete categories

Y-axis: The vertical line of a graph usually placed to the left, which is often used to represent frequencies and other numerical variables
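
The three conditional spreadsheet functions in this glossary (AVERAGEIF, MAXIFS, and MINIFS) share one idea: keep only the values that meet a condition, then aggregate them. The Python sketch below illustrates that idea under simplifying assumptions. The helper names `average_if`, `max_ifs`, and `min_ifs` are hypothetical, the condition is passed as a Python function rather than a spreadsheet criterion string, only one condition is supported, and the sample data is invented. Behavior when nothing matches (here, an error) may differ from what a spreadsheet returns.

```python
def average_if(values, condition):
    """Average of the values that meet the condition (AVERAGEIF-style)."""
    matches = [v for v in values if condition(v)]
    if not matches:
        raise ValueError("no values meet the condition")
    return sum(matches) / len(matches)


def max_ifs(values, criteria_values, condition):
    """Largest value whose paired criteria value meets the condition (MAXIFS-style)."""
    return max(v for v, c in zip(values, criteria_values) if condition(c))


def min_ifs(values, criteria_values, condition):
    """Smallest value whose paired criteria value meets the condition (MINIFS-style)."""
    return min(v for v, c in zip(values, criteria_values) if condition(c))


# Hypothetical data: one sales figure per row, with the region for that row.
sales = [120, 80, 200, 50]
regions = ["east", "west", "east", "west"]

avg_large = average_if(sales, lambda v: v > 100)
max_west = max_ifs(sales, regions, lambda r: r == "west")
min_east = min_ifs(sales, regions, lambda r: r == "east")
print(avg_large, max_west, min_east)
```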
