Smart Data Visualizations: Quality Assessment Algorithm

The post Smart Data Visualizations: Quality Assessment Algorithm appeared first on Occam’s Razor by Avinash Kaushik.

The gap between a bad and good data visualization is small.

The gap between a good and great data visualization is a vast chasm!

The challenge is that we, and our HiPPOs, bring opinions and feelings and our perceptions of what will go viral to the conversation. This is entirely counter productive to distinguishing between bad, good, and great.

What we need instead is a rock solid understanding of the updraft we face in our quest for greatness, and a standard framework that can help us dispassionately assess quality.

Let’s do that today. Learn how to separate bad from good and good from great, and do so using examples that we can all relate to instantly.

We’ll start by looking at the two sets of humans who are at the root of the conflict of obsessions and then learn to assess how effective any data visualization is in an entirely new way. If you adopt it, I guarantee the impact on your work will be transformative.

The Conflict of Obsessions.

There are two parties involved in any data visualization.

1. Analyst/Data Visualizer.

As I’ve passionately shared frequently on this blog, we, Analysts, are all in the business of persuasion. We work against that desired outcome because when we work on creating a data visualization, here are our top-of-mind concerns/desires/perspectives:

How can I cram as much as I can into the graphic?

What can I include to ensure everyone clearly gets just how much work I did?

How much of my agenda do I need to make overt, and how much can I make covert?

Is there something I can add to increase the chances that this will go viral and result in fame and glory?

Ok. I’m only teasing.

But, as an Analyst, a Data Visualizer, I can’t say that these thoughts don’t cross my mind. :)

I’m sharing the above primarily to ensure that you know these motivations exist – and, like me, you should try to fight and resist!

The very best Data Visualizers obsess about:

1. known and unknown variables
2. causality
3. nuance
4. visualization techniques
5. rank-ordering messages
6. simplicity, simplicity, simplicity, simplicity, simplicity, simplicity, and, just to be safe one last time, simplicity.

These are the six things that matter supremely in my work, and they should be what matter in yours.

Simplicity matters more than the rest because if I can’t distill complexity, I might as well not do the work: there is only a snowball’s chance on the sun that the audience will understand my complex visual.

Let’s look at the other set of humans involved in a data visualization equation.

2. Data Consumer.

Here are the concerns/desires/perspectives that a consumer of data visualizations has top of mind when they are presented with a set of analysis:

What’s in it for me?

How easy is it to grasp the most important point?

What’s in it for me?

How much effort do I need to put in to understand the whole infographic?

What’s in it for me?

How can I trust that this message is from a credible Analyst/source/using sound methodology?

(Never underestimate the staggering selfishness that a Data Consumer brings with them to the table when you are showing them a table of data or a data visual. And, it is understandable because they have difficult jobs and 71 other things to worry about.)

Notice there is very little overlap between the obsessions of the Data Consumer and Data Visualizer.

If you have a choice (and you do!), let the needs of the Data Consumer drive your data visualization efforts. The only exception is when you are trying to push propaganda, then go with your agenda.

If an infographic sucks, it is usually due to the conflict between the Visualizer and the Consumer along the above dimensions.

You’ll see it vividly on display when you look at any graphic through the Consumer lens with an eye on simplicity (the Analyst dimension).

The Data Visualization Assessment Algorithm.

Algorithm might perhaps be a tad bit pompous, as applied here. I’ve developed a set of filters and lenses through which you can look at any data visualization in order to quickly assess quality.

Perhaps someone reading this blog post is going to help us all out by building a Machine Learning algorithm to assess if a Data Viz is bad, good, or great. :)

Reflecting on the aforementioned Consumer vs. Visualizer conflict of obsessions has helped me distill the evaluation of data visualizations to eight dimensions. They influence each other and the entire portfolio, yet they stand on their own.

In the format of “Obsession | [ratings scale],” here’s the data viz assessment algorithm:

1. Time to the most important insight. [Scale: Fast. Slow. KMN!]

2. The effort to understand the whole graphic. [Low. Medium. No Thank You.]

3. Trust marks. [Clear. Non-Obvious. None.]

4. Rank-ordering of key messages. [Yes. Partial. WTH!]

5. Explaining the key logic powering the graphic. [Super clear. Cloudy. Invisible.]

6. Exposing nuance. [Sweet. Some. Sour.]

7. Visualizer trying to be too clever. [No, and thank god. Yes, but it is harmless. Yes, sadly.]

8. Likely to recommend to influential leaders. [Yes! No. No way.]

I want you to explicitly notice:

I’ve put the Data Consumer first

Incentivized good behavior by the Data Visualizers, and …

… Included an outcome at the end, because activity is well and dandy but it is outcomes that matter.

My hope is to share a very specific algorithm that gets your critical thinking juices flowing. I invite your critique and suggestions on how I can make it even smarter. Please reply.

The best way to learn is to practice via real-world examples. So... let’s do that!

COVID What Should I be Afraid of (!) Data Visualizations.

A few weeks ago, perhaps not coincidentally, a number of different entities published visuals to help us understand what we can do safely and what’ll cause grievous harm.

I’ve collected four of these efforts – each a really different way to visualize nearly identical information. This gives us an ideal data set to apply our algorithm, and learn discerning skills along the way.

Data Visualization #1

The first graphic is from the inimitable Randall Munroe (I’m a very big xkcd fan!).
Randall has a unique way to communicate complex information (buy Thing Explainer!), and this graphic is no different. It combines seriousness, fun, and scientific accuracy.

As an approach, 2x2s work really well. They force simplicity. The color clustering above helps: you can jump to the safest or riskiest activities faster.

On the downside, it is hard to take in the whole thing. You can get lost.

I’m treating this as a very serious example, but it is important to remember that the intent above includes the goal of making us smile.

Let’s apply our algorithm and see how this graphic does with our tough, but with love, lens.

1. Time to the most important insight. [Fast. Slow. KMN!]

2. The effort to understand the whole graphic. [Low. Medium. No Thank You.]

3. Trust marks. [Clear. Non-Obvious. None.]

4. Rank-ordering of key messages. [Yes. Partial. WTH!]

5. Explaining the key logic powering the graphic. [Super clear. Cloudy. Invisible.]

6. Exposing nuance. [Sweet. Some. Sour.]

7. Visualizer trying to be too clever. [No, and thank god. Yes, but it is harmless. Yes, sadly.]

8. Likely to recommend to influential leaders. [Yes! No. No way.]

The graphic should technically get a pass on #3 as it is for fun, and possibly #5 as well. But, I’ve still graded it seriously so that all of us can practice scoring.

If the phrase big miss applies here it is perhaps #2, the effort to understand the whole graphic (or more precisely, cartoon).

Based on the algorithm’s assessment, it earns a score of 23/66.

Oh, I totally forgot to tell you… I made a little scoring system to help you truly internalize the key messages. Those who know me will not be surprised that my system has a steep grading curve (#highstandardsFTW!).

The scoring system uses a multiplier across each rating in the scale above. Additionally, since each dimension does not carry the same level of importance, there’s a multiplier for each dimension – to effectively communicate my values.

Here’s the math…

It is all fun and games until you realize there’s a score involved! :)
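For those who think better in code, here is a minimal sketch of how a weighted scoring system of this kind can be computed. To be clear, the rating multipliers and dimension weights in it are illustrative assumptions, picked only so that the totals line up with the two anchors mentioned in this post (66 if every dimension gets the top rating, 22 if every dimension gets the middle one). They are not the values from the scoring sheet.

```python
# Illustrative sketch only: these multipliers and weights are assumptions,
# not the values from the actual scoring sheet.

# Assumed multiplier for each position on a three-step rating scale.
RATING_MULTIPLIER = {"top": 3, "middle": 1, "bottom": 0}

# Assumed importance weight per dimension (hypothetical; they sum to 22).
DIMENSION_WEIGHT = {
    "time_to_insight": 5,
    "effort_to_understand": 4,
    "trust_marks": 2,
    "rank_ordering": 3,
    "key_logic": 2,
    "nuance": 2,
    "too_clever": 1,
    "recommend_to_leaders": 3,
}


def score(ratings):
    """Return the weighted score for a dict of dimension -> rating."""
    missing = set(DIMENSION_WEIGHT) - set(ratings)
    if missing:
        raise ValueError(f"unrated dimensions: {sorted(missing)}")
    return sum(
        DIMENSION_WEIGHT[dim] * RATING_MULTIPLIER[ratings[dim]]
        for dim in DIMENSION_WEIGHT
    )


def max_score():
    """Highest possible score: every dimension at the top rating."""
    return sum(DIMENSION_WEIGHT.values()) * max(RATING_MULTIPLIER.values())


all_top = {dim: "top" for dim in DIMENSION_WEIGHT}
all_middle = {dim: "middle" for dim in DIMENSION_WEIGHT}
print(score(all_top), "/", max_score())
print(score(all_middle), "/", max_score())
```

Swap in your own weights to communicate your own values; the point is that a "good enough" graphic lands far below a great one.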

Important: My intent in creating the data viz assessment algorithm, and scoring sheet, is not to have you entirely agree with how I’m grading each visualization. My intent is to teach a systematic approach you can bring to these difficult and complex tasks.

I do hope you see why I’m scoring the way I am, and I hope you’ll agree. But, that desire is tertiary.

Data Visualization #2

The second graphic is from the world-famous Information is Beautiful (IiB). They have some of the world’s most famous data visualizations. (The simple and effective: When Sea Levels Attack)

IiB tends to make graphics for large screens; I need to be on my beloved 27” ThinkVision monitor to read it optimally.

In this instance, you’ll notice the color palette works against the ability to read the text (teal on dark gray or slightly lighter gray on dark gray).

The spectrum from light yellow to blood red of the circles, with internal gradations, is trying to add a layer of cleverness that possibly satiates a Data Visualizer, at the cost of the Data Consumer.

Once you zoom into one part of the visual, things become readable. You do lose the full picture of any section. In this view, perhaps you’ll agree that there is a sense of randomness to what’s in the bubble (check for this in the two visuals below as well).

It was a lovely touch to add the “risk factors to consider” on the top left of the visualization, which explains the logic powering the graphic. (You can see it more clearly in the higher resolution view; the blue font on gray makes it hard to read above.)

I do like the subtle helpful tips like the one about condiments, below.

Let’s apply our algorithm and see how this graphic does with our tough, but with love, lens:

1. Time to the most important insight. [Fast. Slow. KMN!]

2. The effort to understand the whole graphic. [Low. Medium. No Thank You.]

3. Trust marks. [Clear. Non-Obvious. None.]

4. Rank-ordering of key messages. [Yes. Partial. WTH!]

5. Explaining the key logic powering the graphic. [Super clear. Cloudy. Invisible.]

6. Exposing nuance. [Sweet. Some. Sour.]

7. Visualizer trying to be too clever. [No, and thank god. Yes, but it is harmless. Yes, sadly.]

8. Likely to recommend to influential leaders. [Yes! No. No way.]

I was this close to choosing no way in terms of recommending this graphic to others (because I never will). In the end, IiB is such a huge entity and so famous and so many people love them… no way seemed too much against the grain.

I've come to understand that IiB has a very specific design language, texture, and philosophy that has come to define them. It possibly acts as a constraint now.

Based on the algorithm’s assessment, it earns a score of 7/66.

Here’s the math:

Data this critical – for this wide a consumption (whole planet) – needs to hit an extraordinarily high standard for simplicity and effective comms. Else, it remains an exercise in self-satisfaction by the Data Visualizer.

Data Visualization #3

The third graphic is by Professor Saskia Popescu, Dr. James P. Phillips, and Dr. Ezekiel Emanuel.

I’m a huge fan of Dr. Emanuel. He was the special advisor for health policy in the Obama administration and played an instrumental role in passing the Patient Protection and Affordable Care Act (aka. Obamacare). For this, he has my eternal gratitude on behalf of those who society and politicians don’t usually listen to in the United States.

The Covid-19 Risk Index clearly identifies the logic powering the graphic: enclosed space, crowds, duration of interaction, and forceful exhalation.

Note that IiB also had some of these factors; forceful exhalation is an addition here (unsurprising that the doctors brought that to the fore).

The colors in the graphic are related to the intensity of the risk: green is low and red is high. Simple, direct, effective.

I’m not a huge fan of a giant company logo on graphics as you see below in the "hexagon art." I believe: More white space = more peace.

Given the heartbreaking debate in the US, I did appreciate the bonus call to action up top to wear a mask.

Did you notice the trust marks at the bottom? Really nice.

As in the case with the IiB graphic, this one is meant for the large screen display. I applaud the team for making sure each segment is readable – no fancy font colors and fancy background as a demonstration of the Visualizer's smartness.

Folks in my teams know I hold a special hatred for icons. They add clutter. In this case, I do support the decision to include icons.

For example, without needing to read any text I know that working in the office carries medium/high risk, and participating in group religious services is in the recommend you please avoid category – even in the small version above and certainly in the zoomed-in version below.

Let’s apply our algorithm and see how this graphic does with our tough, but with love, lens.

1. Time to the most important insight. [Fast. Slow. KMN!]

2. The effort to understand the whole graphic. [Low. Medium. No Thank You.]

3. Trust marks. [Clear. Non-Obvious. None.]

4. Rank-ordering of key messages. [Yes. Partial. WTH!]

5. Explaining the key logic powering the graphic. [Super clear. Cloudy. Invisible.]

6. Exposing nuance. [Sweet. Some. Sour.]

7. Visualizer trying to be too clever. [No, and thank god. Yes, but it is harmless. Yes, sadly.]

8. Likely to recommend to influential leaders. [Yes! No. No way.]

This graphic went viral on the socials, and deservedly so. With CV-19 flaring up in multiple countries (sadly, we in the US are still making our way through wave one), I hope that you will use the graphic above to stay safe – and share it with your friends and family so that they can stay safe as well.

Based on the algorithm’s assessment, it earns a score of 50/66.

Here’s the math:

Clearly a graphic the Data Visualizer can be proud of, reaching a rare level of overlap between their obsessions and the Data Consumer’s.

Data Visualization #4

The last graphic was developed by the physicians on the Texas Medical Association COVID-19 Task Force and TMA Committee on Infectious Diseases.

I love it.

It is simple. It is easy to digest. There is absolutely nothing cute about it (hurrah!). There are no circles to jump through. No expensive Data Visualizer Specialist In Fonts was hired. The graphic is not trying too hard.

It was probably designed by the Doctors in TMA. It is insanely boring. All it is is… Effective.

Just about the only lite criticism I can make is that perhaps in keeping with the (ironically) liberal posture of the state of Texas when it comes to dealing with Covid, this graphic lowers the bar for what’s risky compared to all other sources. I share that as a small red flag, but it is adjacent to the technical analysis of the data viz that we are undertaking today.

The logic powering the graphic is integrated into the core of the graphic, as becomes clear below. There is little to no effort necessary to understand the visual. Start at the top, keep going. The colors and bars help you along.

Even in this small size, it is fairly readable…

When information is laid out so clearly, other things jump out at you that make you think (an excellent trait of a great data visualization).

All of the below items are an 8 or a 9 – but consider the staggering differences.

Attending a bar is just as risky as a religious service with 500+ worshipers! And, both are a tiny bit riskier than eating at a buffet!! You are leaning in, questioning the data, being curious. A good sign.

TMA COVID Highest Risks

Let’s apply our algorithm and see how this graphic does with our tough, but with love lens:

1. Time to the most important insight. [Fast. Slow. KMN!]

2. The effort to understand the whole graphic. [Low. Medium. No Thank You.]

3. Trust marks. [Clear. Non-Obvious. None.]

4. Rank-ordering of key messages. [Yes. Partial. WTH!]

5. Explaining the key logic powering the graphic. [Super clear. Cloudy. Invisible.]

6. Exposing nuance. [Sweet. Some. Sour.]

7. Visualizer trying to be too clever. [No, and thank god. Yes, but it is harmless. Yes, sadly.]

8. Likely to recommend to influential leaders. [Yes! No. No way.]

Based on the algorithm’s assessment, it earns a score of 64/66.

Here’s the math:

The TMA graphic was the spark to write this newsletter.

The world needed a simple way to communicate effectively, in this case literally, information that can save lives.

While things are rarely that high-stakes in a business environment, I hope the TMA inspires you to ensure that you don’t lose sight of what’s important when you work on data visualizations: The understanding of data.

Bottom line.

How do you handle the conflict between your goals as a Data Visualizer (and incentives your employer creates for you) and the Data Consumer? While the answer seems obvious, it is incredibly difficult to execute. I hope you’ll use the data visualization assessment to ensure you and your team solve for the Data Consumer first, yourselves second.

If you have graphics that score above 60, I would love to see them! (If they are shareable.)

All the best.

PS: Bonus Life Lesson:

A small number of you will surely have noticed that the perfect score from the algorithm is 66 (all Great), and the score for it-was-good-enough is 22 (all Could Be Optimized). That massive chasm reflects life (and my philosophy).

There are thousands of Analysts who’ll stop at good; after all, it is good. Perhaps a hundred, or fewer, will do the hard work required to get to great. They’ll rule the (biz) world.

#nowyouknow


The Perils of Poor Data Visualization in CRO & A/B Testing

As any UX & CRO expert should know, the way we present information matters a lot, both in terms of how well it is understood and in terms of the probability that it will lead to the desired action. A/B testing calculators and other tools of the trade are no exception and here I will […]

Data & Business Impact with Feras Alhlou

Google Analytics Studio

A few months ago I had the opportunity to chat with my friend and work partner Feras Alhlou, Co-Founder and Principal Consultant at E-Nor & Co-Author of Google Analytics Breakthrough. Feras and I have known each other for almost 10 years, and it is always great to hear more about the work that he and his first-class team are doing.

Here are the questions we discussed; check out the answers in the video below. I have also added some of my favorite highlights from the interview after the video.

  1. [01:05] What's the process that you use to make sense out of data?
  2. [02:41] During this process, what do you actually do when you start working with data?
  3. [04:07] When analyzing data, how can we make sure that we are looking at the context to understand what is happening around us?
  4. [07:24] How can Data Studio and better data visualizations help companies make more data-driven decisions?

We believe analytics is a business process. We start with an audit, both from the business side and the technical side - we want to engage the stakeholders to understand how to measure what matters most to the business. Once we have the data in place, we go to the reporting layer - how do we report on this data? Then, we start to be able to analyze the data and find some actionable insights. Last, we can move to testing and personalization - that's when you really can have an impact on the business. Read more about E-Nor's Optimization Framework

There's a whole lot of data these days, right? Life used to be simple for marketers: one device, a few channels - now there's data everywhere, mobile, social, web, and of course backend data. I think one of the first things we need to do is to understand the context around that data, focusing on the following:

  • The integrity of the data: is it clean, was it collected properly, is it raw or aggregated? Understand the data collection, how the data was put together.
  • Having a set of metadata, information about the data: if you're looking at Google Analytics metrics, knowing more about the user. For example, if you have a subscription-based model: Is it a premium user? Is it a standard user? Having that additional data gives a whole lot of context to the person who's consuming that data.

I would definitely advise having a data road map. Start with what you own: web and mobile analytics data. Then, start augmenting reports with basic social data; maybe you can get a little bit into the qualitative aspect with that. And last but not least, a great product that was recently introduced by Google is the Surveys product. There are surveys we can do on our own properties to understand the voice of the customer. But you can also use it to do market research - it used to be expensive and cumbersome to do, but now you can easily run a Google survey and do a lot of targeting.

And here is Feras and me having fun in the Google Analytics studio!

Daniel Waisberg and Feras Alhlou


Data That Matters: Maternal Mortality Trends

I have always appreciated the work of the Bill and Melinda Gates Foundation, it is really amazing to see people working so hard to make the world a better place. But I was left speechless when I opened their new report: GoalKeepers 2017. It tells the stories behind the data to help "accelerate progress in the fight against poverty by helping to diagnose urgent problems, identify promising solutions, measure and interpret key results, and spread best practices".

First and foremost, the goals themselves are superb - I can't think of more important issues to fight for. But I was also impressed by the information design: it is spotless. They used the right medium for each piece of information: text, images, videos, animations and charts. The report is engaging and, before you realize it, you have spent an hour going through it. So I was touched both as a person who cares about what is happening around me and as a professional appreciating good work.

Interestingly, a few months ago I was looking for some data to build a sample report, and I chose the maternal mortality dataset from UNICEF's data portal. I built the report and used it, but didn't take the time to publish it - ever heard of procrastination? :-)

In this article I will provide more context into GoalKeepers 2017 using publicly available UNICEF data on maternal mortality. I'll start with some words about the GoalKeepers 2017 report - then, I'll discuss some of the steps I used to create my report and the insights I learned from the data.

Stories behind the data: maternal mortality in Ethiopia

One of the highlights that I found particularly interesting in GoalKeepers 2017 was the maternal mortality case study, focusing on how Ethiopia is fighting this terrible issue. Here is how Bill and Melinda define it.

"If you were trying to invent the most efficient way to devastate communities and put children in danger, you would invent maternal mortality." Bill and Melinda Gates

Most people would agree that mothers are probably the most important pillar for a child (I'm a father, and I think fathers are important too, but as my mom always says: "you will never be a mother!"). So it is devastating to learn that in 2015, UNICEF registered 302,530 maternal deaths due to complications from pregnancy or childbirth - 168.7 deaths per 100,000 live births. And remember that a mother's death does not necessarily mean just one child left motherless; a woman may already have several children when it happens.
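As a quick aside on what "per 100,000 live births" means: the ratio is simply deaths divided by live births, scaled by 100,000. The sketch below works backwards from the two figures quoted above to the number of live births they imply. The function names are hypothetical, and the back-calculated number is only arithmetic for illustration, not a figure taken from UNICEF.

```python
def deaths_per_100k(deaths, live_births):
    """Maternal deaths per 100,000 live births."""
    return deaths / live_births * 100_000


def implied_live_births(deaths, rate_per_100k):
    """Invert the ratio: how many live births would give this rate?"""
    return deaths / rate_per_100k * 100_000


# Figures quoted in the paragraph above (2015).
deaths_2015 = 302_530
rate_2015 = 168.7

births = implied_live_births(deaths_2015, rate_2015)
print(f"Implied live births: about {births / 1e6:.0f} million")
print(f"Round trip: {deaths_per_100k(deaths_2015, births):.1f} per 100,000")
```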

However, as GoalKeepers 2017 shows, we've made some great progress, and the trends look good. In their case study, they show how Ethiopia is taking giant steps in its fight against maternal mortality, and the chart they used is simple and powerful: mortality went from 843 to 357 per 100,000 from 1990 to 2015 - that's great!

maternal mortality ethiopia

But in order to understand our global status better, it is important to put more context into the mix: what's happening around the world? And how does Ethiopia compare to other places?

Maternal Mortality around the world

To have a better understanding of how both Ethiopia and the world in general are progressing, I took a deeper look at the maternal mortality dataset from UNICEF's statistics website. The data is publicly available, well organized, and it seems trustworthy. I downloaded the xlsx file and formatted it for Data Studio using this spreadsheet; then, I imported it to Data Studio (learn how).
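To give a feel for what "formatting" a download like this can involve, here is a minimal sketch. It assumes the sheet is laid out wide, with one column per year (an assumption for illustration; the real layout may differ), and reshapes it into one row per country and year, which charting tools generally find easier to work with. The column names and the function are hypothetical; the sample row uses the global rates quoted in this article.

```python
def wide_to_long(rows, id_column="Country"):
    """Turn one-column-per-year rows into one row per (country, year)."""
    long_rows = []
    for row in rows:
        for column, value in row.items():
            if column == id_column:
                continue  # the identifier is copied onto every output row
            long_rows.append(
                {id_column: row[id_column], "Year": int(column), "Rate": value}
            )
    return long_rows


# Hypothetical wide layout: one row per country, one column per year.
wide = [{"Country": "World", "1990": 339, "2015": 168}]

for record in wide_to_long(wide):
    print(record)
```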

Below you'll find my data visualization embedded, scroll down to read some of my conclusions based on the data.

I know, the horizontal bar chart goes on forever! But I think it gives an interesting perspective.

Disclosure: I do not pretend to be a specialist in global health; my knowledge about the efforts in the area is minimal. The insights below are based on the data only - I'm assuming UNICEF publishes accurate and unbiased data. With that said, I hope it will help people better understand the status and trends of maternal mortality around the world.

Here are my insights on maternal mortality based on UNICEF's data.

  • Amazing progress - but not solved: out of 183 countries in the data, only 13 are worse off in 2015 compared to 1990. The trajectory is mostly good - globally, we saw a decrease from 339 to 168 in maternal mortality rate, an average decrease of 44%. For context, Ethiopia's rate decreased by 71%, significantly better than the average. However, it is clear from the map that Africa is bleeding, with Sierra Leone losing 1,360 mothers per 100,000 live births - that's very bad.
  • United States and South Africa have alarming trends: both countries are among the top 10 countries in the 'getting worse' table (sorted by 1990-2015 % change) - South Africa had an absolute 1,500 deaths and USA 550, that's a lot of loss. Even though they don't have the highest rates, it is quite alarming to see the negative trends and absolute numbers. For more on the USA trend check this article, which discusses possible reasons and links to more in-depth analyses.
  • Cambodia and Turkey up-and-to-the-right, but still a lot of deaths: both countries have shown great progress, appearing in the top 10 'getting better' table - but they still need a big push, especially Cambodia.

I think those are interesting points to think about as we continue fighting this horrible issue - the more data (and analyses) we have, the more prepared we will be. If you are looking for a place to start, UNICEF has a lot of interesting datasets in their data portal. Let's help make the world a better place!


Embedding Google Data Studio Visualizations

Last year I wrote about the Marvel vs. DC war on the big screen. It was super fun to merge two of my passions (data visualization and comics) in one piece. It started with my curiosity to understand what all those movies are amounting to, and I think it helped me prove a point: Marvel is kinda winning :-)

One of the things that annoyed me was that I had to link to the interactive visualization, so readers couldn't see the amazing charts in my article (!) - I ended up including static screenshots with some insights explained through text. While some people clicked through to play with the data, I suspect many just read the piece and went away, which is suboptimal - when I publish a story, my goal is to allow readers to interact with it quickly and effectively.

I am extremely excited that now Google Data Studio allows users to embed reports in any online environment, which empowers us to create an improved experience for telling stories with data. This feature will be an essential tool for data journalists and analysts to effectively share insights with their audiences.

A year has passed since I did the Marvel vs. DC visualization, so I thought it was time to update it (5 new movies!) and share some insights on how to use Data Studio report embedding to create effective data stories.

Enable embedding

The first step to embed reports is a pretty important one: enable embedding! This is quite simple to do:

  1. Open the report and click on File (top left)
  2. Click on Embed report
  3. Check Enable embedding and choose the width and height of your iframe (screenshot below)

Google data studio enable embedding

Please note that the embedding will work only for people who have access to the report. If the report is supposed to be publicly available, make sure that you make it viewable to everyone. If the report should be seen only by people in a group, then make sure to update your sharing settings accordingly. Read more about sharing reports in this help center article.

But how do you make sure you are choosing the right sizes? Read on...

Choosing the right visualization sizes

Needless to say, people access websites in all possible device categories and platforms, and we have little control over that. But we do have control over how we display information in different screens. The first obvious recommendation (and hopefully all the Interweb agrees with me) - make your website responsive! I am assuming you have already done that.

On Online Behavior, the content area is 640px wide, so the choice is pretty obvious when Data Studio asks me the width I want for my iframe - make sure you know the width of the content area where the iframe will be embedded. Also, since you want the visualizations to resize as the page responds to the screen size, set your Display mode to Fit to width (option available on Page settings).
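To make the sizing concrete, here is a small sketch that scales a report's height to a given content width and builds the iframe markup. The function name, the report dimensions, and the example URL are hypothetical placeholders; in practice, copy the src from the Embed report dialog rather than constructing it by hand.

```python
from html import escape


def embed_iframe(src, content_width, report_width, report_height):
    """Build iframe markup scaled to the content area, keeping aspect ratio."""
    height = round(report_height * content_width / report_width)
    return (
        f'<iframe width="{content_width}" height="{height}" '
        f'src="{escape(src, quote=True)}" frameborder="0" '
        'style="border:0" allowfullscreen></iframe>'
    )


# Placeholder URL and a made-up 1280x960 report, fitted to a 640px column.
print(embed_iframe("https://example.com/embed/REPORT_ID", 640, 1280, 960))
```

Computing the height from the report's own proportions avoids letterboxing or cropping when the content column is narrower than the report canvas.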

Without further ado, here is the full Marvel vs. DC visualization v2!

I personally think the full dataviz looks pretty good when reading on a desktop, I kept it clean and short. However, as your screen size decreases, even though the report iframe will resize the image, it will eventually get too small to read. In addition, I often like to develop my stories intertwining charts and text to make it more digestible. So here is an alternative to embedding the whole thing...

Breaking down your dataviz into digestible insights

As I mentioned, sometimes you want to show one chart at a time. In this case, you might want to create separate versions of your visualization. Below I broke down the full dataviz into small chunks. Note that you will find three different pages in the iframe below, one per chart (see the navigation at the bottom of the report).

Right now, you can't embed only one page, which means that if you want to show a specific chart that lives on page 2 of a report you would need to create a new report, but that's a piece of cake :-)

I am looking forward to seeing all the great visualizations that will be created and embedded throughout the web - why not partner with our data to create insightful stories? Let's make our blogs and newspapers more interesting to read :-) Happy embedding!

BONUS: Data Studio is the referee in the Marvel vs. DC fight!

As I was working on my dataviz, I asked my 10yo son (also a comic enthusiast) to create something that I could use to represent it. He created the collage / drawing below, and I think it is an amazing visual description of my work :-)

Data Studio referee


150 Years of Marriages and Divorces in the UK

Marriage and Divorce Trends

Have you ever wondered how divorce and marriage rates have trended over the last 150 years? Or what reasons husbands and wives give when getting a divorce? Fortunately these, and other questions, can be answered with data. The UK Office for National Statistics makes available two extremely interesting and rich datasets on marriages and divorces, providing data for the last 150 years.

Following the discovery of these datasets, I decided to uncover trends and patterns in the numbers, working with my colleague Lizzie Silvey. Two important questions were in our minds when exploring the data:

  1. Who wants a divorce and why?
  2. How do wars and the law impact marriage and divorce rates in the UK?

We discuss our findings in this article, but you can also drill down into the data using this interactive visualization that we created using Google Data Studio.

Divorce petitioners and their reasons

The ratio of petitioners has been stable since around 1974 (70% women and 30% men), the time at which both genders started having the same rights and divorce could be attained more easily.

In the charts below we see the trends for 'Adultery' and 'Unreasonable behaviour', the two most common reasons provided (out of five possible) - each line shows the number of divorces granted to the husband or wife for a specific reason.

Divorce reasons UK

In order to use Adultery grounds the petitioner must prove that the partner had sexual intercourse with someone else, which might not be simple. We can see in the chart that Adultery follows the exact same pattern for husbands and wives, but analyzing the statistics further we see that, on average, 40% of the adultery divorces are granted to husbands - since only 30% of total divorces are petitioned by husbands, it seems adultery is a particularly strong reason for men to file for a divorce.
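The comparison above boils down to a ratio of two shares. As a rough sketch, using only the rounded 40% and 30% figures quoted above, and taking the wives' shares as the complements (60% and 70%), which is an assumption made for illustration:

```python
def overrepresentation(share_in_category, share_overall):
    """Ratio of a group's share within one category to its overall share."""
    if not 0 < share_overall <= 1:
        raise ValueError("share_overall must be in (0, 1]")
    return share_in_category / share_overall


# Rounded shares from the text; wives' shares are assumed complements.
husband_ratio = overrepresentation(0.40, 0.30)
wife_ratio = overrepresentation(0.60, 0.70)
print(f"husbands: {husband_ratio:.2f}x, wives: {wife_ratio:.2f}x")
```

A ratio above 1 means the group shows up in adultery divorces more often than its share of all petitions would suggest. Note that this compares divorces granted with divorces petitioned, so it is an indication rather than a precise measure.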

The second chart, showing 'Unreasonable behaviour', is more enigmatic. While husbands were granted divorces at an increasing pace for behavioural reasons, and while the lines seem to be converging, there is a strange hump in the wives' line. Why were wives granted such a large number of divorces up to 1992 based on unreasonable behaviour? Could that be related to a “backlog” of cases of domestic violence (classified as a behavioural reason) that came to light after women could divorce based on those grounds more easily? Unfortunately we could not find data showing possible reasons for that.

The impact of laws & wars on marriage and divorces

When looking at the marriage and divorce trends since 1862, there were a few clear turning points.

UK Marriage Divorce rates

The wars seemed to affect marriages quite significantly. Around the beginning of World Wars I and II we see spikes in marriages, maybe as a result of young men wanting to vow their love before going to fight. Then, during the wars, marriages plunged as soldiers went away, and rose again when they came back home.

As for divorces influenced by the wars, we can only look at World War II, as women had a limited ability to divorce after World War I. It seems the Matrimonial Causes Act 1937, which made other grounds legal (e.g. drunkenness and insanity), coupled with premature weddings (discussed above) and possibly estrangement due to separation, led to a spike in divorces starting in 1946 - who would have the heart to divorce in war times?

But what seems to be the strongest influence in divorces in the history of the UK is the Divorce Reform Act 1969 (link to PDF), which came into effect in 1971. This act states that divorce can be granted on the grounds that the marriage has irretrievably broken down, and it is not essential for either partner to prove an offense. While that explains the strong increase in divorce, we could not find a strong reason for the decline in marriages at the same time - we invite possible explanations in the comments section.

Closing Thoughts

While we couldn't explain why the trends are going in a certain direction or predict upcoming changes, we believe that the data can shed new light on British society and family relations. Hopefully with new releases of data in the future we will also be able to dive deeper and answer more existential questions.

If you are interested in exploring the data further, check the interactive visualization, created with Google Data Studio. You will find more context and charts showing trends and patterns in marriage and divorce in the UK.


Partnering with data to create insightful stories

[Cross-posted from The Next Web]

Data Storyteller

Whether you are a marketer trying to persuade people, a technologist building a startup, or an executive making business decisions, data is your partner. You can use it to make better decisions and create insightful data stories inside and outside your company.

The first step is to accept your data relationship: you are partners forever. Once you understand that, there is an important consideration that will define how to tell your data stories: the context of where they live, which also defines the audience that will interact with them. In this post I will go through some important lessons I learned when visualizing and communicating data in and outside Google.

Data is your partner, live with it!

Data is no longer "next year's big thing"; we have gone through that many times over, and almost everyone accepts data as a valuable team member. But not everyone can understand and make use of it optimally, which means lots of decisions are still made based on intuition. If you don't believe me, check PwC's Global Data and Analytics Survey 2016, which shows some interesting numbers on how often managers use data during the decision-making process. Data education is a crooked road and we have a long journey ahead of us.

One of the reasons for that is similar to the well-known phenomenon called mathematical anxiety, where people are afraid of maths as a result of past difficulties and traumas. Every one of us has interacted with data analyses (at work, in newspapers or in academic research) that were created by unskilful communicators, people who might be amazing statisticians but lack the ability to convey the stories behind the numbers. That creates anxiety and could prevent professionals from even trying to understand data.

I believe the reason the data community is not growing like weeds is that professionals are not confident enough with numbers and charts. I have written about how to overcome the fear of analytics (and help others). Here is a quick summary:

  1. Never mock people for not understanding a chart
  2. Take baby-steps towards numeracy
  3. Make analytics more fun

When you create a visualization you may affect other people both positively and negatively. If you create a complex and unintuitive visualization, you might be creating a phobia in other people, and they will forever hate numbers and stats. However, if you create a powerful and beautiful visualization, you might be persuading another mind to join the data visualization tribe.

Below are some ideas that might help you craft better data stories, both for businesses and in general.

Stories tailored to businesses, the world, and beyond...

There are many ways to communicate data, but choosing the right format will depend on where the data will be published or presented, the context. Is it a daily performance report or a quarterly result presentation? Or a behavioral analysis using web data? Or an interactive visualization showing global trends?

I'd like to break down data stories into two main branches: business reporting or analysis, and visualizing the world. These groups can show very different characteristics, so let's look into each separately.

Business reporting or analysis

I recently had the opportunity to interview Avinash Kaushik, Digital Marketing Evangelist at Google. In our conversation we discussed techniques to create great data stories, focusing on businesses. Avinash talked about his business framework See, Think, Do, Care and the role of data visualization during the decision making process.

We also discussed data visualization (see minute 11:08), and Avinash explained how not to make silly mistakes when using data in a business context. He distinguishes between three main types of visualizations:

  1. Elaborated stories presented with the intent to change people's views on a complex subject (what I call visualizing the world).
  2. Strategic analysis of business results presented to executives.
  3. Day-to-day reporting used to drive most small business decisions.

Avinash differentiates between analysis "packed together" with a storyteller, which allows for more complex visualizations, and day-to-day reporting, which is supposed to stand on its own and help people make decisions by themselves.

Considering the data delivery circumstances is a great start when designing your visualizations as they will inform the presentation style and level of complexity that can be used. While every visualization should strive for simplicity, a daily report (and business visualizations in general) must be extremely clean and self-explanatory, as the data storyteller won't be there to help the decision maker.

Below is a quote by Avinash summarizing his views on how to succeed with data.

"On a business context, a data visualization has to do one job really well, and it has to answer the question ‘so what?' If your data doesn't answer the 'so what' question, and if there isn't a punchy insight that drives action, all you have is a customized data puke, it looks really nice but it serves no purpose. If you want to drive change, you have to get to the simplest possible way to present the data, and once you get to it ask the so what question. After you answer it, ask if it quantifies the opportunity, if it does you are going to win."

Visualizing the world

Luckily for our society, visualizations are increasingly used in a broader context, where the goal is not to understand the business or track performance, but to educate the public and change people's minds. There are some great examples of visualizations that make a difference, but probably the most famous are Hans Rosling's motion charts, where he debunks several myths about world development.

I've written about data stories in the past, discussing why they are important and providing some ideas on how to use data visualization to tell stories. Basically, here are two really important things you need in a good data story:

  • It stands on its own - if taken out of context, the reader should still be able to understand what a chart is saying because the visualization tells the story.
  • It is easy to understand - but while too much interaction can distract, the visualization should incorporate some layered data so the curious can explore.

Recently I worked on a data story with my colleague Lizzie Silvey, where we analyzed stats from the UK Office for National Statistics. We looked into Divorce and Marriage trends starting from 1862, and came up with an interactive visualization. Below is a screenshot with some of the insights on how changes in the law impacted marriage and divorce rates in the UK. Check the visualization to play with the data.

UK Marriage Divorce rates

Whether you are working on a monthly report or a world-changing visualization, if you take the time to uncover and communicate the stories behind the data, you will be contributing to better decisions in your company and in society in general.


Visualization Techniques to Communicate Data

So here's the deal: you've spent a ton of time with your data and you know it inside out. You've wrangled, sliced and diced it and are now the expert with this data for this problem. You've uncovered new, actionable insights that will lead to fantastic opportunities or improve your bottom line. Great! Time to show your colleagues or your boss or your clients these findings.

You open your data tool of choice, quickly create some charts and make it all look pretty with a flashy color scheme or fancy logos. More often than not, we fly through this final stage and don't give the data visualization step the due care it needs. This is insane!

Think about it. Your charts and dashboards are most likely the only piece of information your boss or client will interact with. The only information! And yet, here we are, creating default charts and missing the opportunity to really convey our message.

Effective charts are a compelling way to show your data. The human brain is simply better at retaining and recalling information that has been presented visually.

Sales chart year-over-year comparison

In this article I will discuss several techniques that will help you create more effective charts to communicate the underlying data. There's no big secret here. However, by applying deliberate thought and a handful of best practices, and by allocating sufficient time in projects for the data visualization step, you can make a big difference to the impact of your charts.

Plan your approach

Before firing up your favorite data visualization software, it pays to spend some time thinking about your output and your goals. Start by answering a few simple questions:

  • Who is the intended audience?
  • What medium will you use to show your charts? (e.g. slides / dashboard / report etc.)
  • What is the goal of this project?

For example, consider the audience who will view your chart. How long will they have to study it? How familiar are they with the data? Are they technically inclined? Do they want detailed charts, or quick summaries?

You want to optimize your message to resonate with your audience, so the more you know about them, the more likely you'll be able to achieve that.

Likewise, how you deliver your message will affect your decisions. Is it a chart in a slide deck? In an informal email? A formal report? An interactive dashboard?

Reports and dashboards are typically pored over for longer periods of time, so charts and findings can be more detailed, whereas presentations or client pitches are short and sweet, where the audience will only have a moment to understand and absorb the information.

Lastly, think about what your end goal is. What do you want your audience to do with the information you show them? For example, if you want your manager to make a cost-benefit decision for a new hire or expensive research tool, make sure your solution answers the question and facilitates making that decision.

Deliberately focus the viewer's attention

Remember, the point of your visualizations is to communicate information, and you can ensure they do that more effectively by giving prominence to the key message within your chart.

You can do this by using attributes, for example color, to highlight specific elements of your charts and focus your audience's attention there. These are known as pre-attentive attributes, and they dramatically help speed up the absorption of information.

Consider this chart showing the open rates for four newsletters that you manage. There's an important story in there, but it's difficult to see with the default colors:

Newsletter open rates chart

However, by carefully using colors, we can bring that story to the fore:

Newsletter open rates with color
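The idea behind the second chart can be sketched independently of any charting tool: give one strong color to the series that carries the story and mute everything else. The newsletter names and hex colors below are made up for illustration.

```python
HIGHLIGHT = "#d62728"  # a strong red for the series carrying the story
MUTED = "#bbbbbb"      # light grey for everything else


def highlight_colors(series_names, focus):
    """Map each series to a color, emphasizing only the focus series."""
    if focus not in series_names:
        raise ValueError(f"unknown series: {focus!r}")
    return {name: (HIGHLIGHT if name == focus else MUTED) for name in series_names}


# Hypothetical newsletter names, for illustration only.
colors = highlight_colors(["Weekly digest", "Product news", "Events", "Tips"], "Events")
print(colors)
```

The resulting mapping can be handed to whichever charting library or tool you use. The point is that the choice of the focus series is deliberate, not left to the default palette.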

Add context to aid understanding

Consider the two charts above, showing email newsletter open rates. The second chart also has a heading that adds context to the story. The words complement the chart and reinforce the message.

Much like writing titles for your blog posts or newsletters, think about the title of your chart in the same way. It should tell the viewer what to expect in your chart and summarize the message.

Similarly, your data may have unexpected spikes or dips, so you might want to use annotations directly on the data points or as footnotes, to make sure the viewers have all the context they need.

Reduce clutter in charts

Renowned data visualization pioneer Edward Tufte coined the term data-ink ratio to convey the ratio of ink needed to tell the core message in your display, divided by the total ink in the display. The idea is to maximize this ratio, in other words, reduce the amount of non-essential ink.
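Written as a formula, following Tufte's own definition, the ratio is:

```latex
\text{data-ink ratio}
  = \frac{\text{data-ink}}{\text{total ink used to print the graphic}}
  = 1 - \text{proportion of the graphic that can be erased without loss of data-information}
```

A ratio close to 1 means almost every mark on the chart is carrying data.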

Let's see that in practice. Compare the following two charts showing Amazon's revenue between 2007 and 2016:

Amazon revenue cluttered

After decluttering, the annual revenue figures jump out at the viewer and the information is much quicker to absorb:

Amazon revenue chart after decluttering

Avoid using overly complex charts for the sake of it

There are a lot of complex chart types out there: waterfall charts, radar charts, box and whisker plots, bubble graphs, streamgraphs, tree maps, Pareto charts, etc.

Sometimes these may be appropriate for specific cases (e.g. a Sankey chart to show web traffic flow) but it really comes back to the question of who your intended audience is and what medium you'll be showing your chart through.

Does this radar chart really communicate your message well? Would a simple bar chart, which is widely understood, be a better alternative?

Radar chart example

Whenever I teach a dataviz class, I always say that a good chart should be like a good joke: it should be understood without you having to explain it.

Is that pie chart really the best choice?

Pie charts are popular and ubiquitous, but somewhat maligned by the data visualization community. Why is that?

Consider this default pie chart in Data Studio, showing website Sessions broken out by Medium:

Bad pie chart example

This chart (and pie charts in general) has two main drawbacks: 1) it's hard for human beings to judge the relative sizes of the slices (and their order and position affect this), and 2) the long tail is unreadable. Plus, the legend is ugly to look at.

A much better chart for data with many categories and a long-tail would be a standard bar chart. Nothing fancy here, but it's super quick and easy to read off the values, especially for the smaller categories (e.g. compare trying to understand email sessions in the pie chart vs. the bar chart).

Bar chart to replace pie chart

So if you're going to use them, restrict pie charts to small numbers of categories (I'd advise three or fewer), and always ask yourself if a simple bar chart or table would suffice and be quicker to read.
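That rule of thumb is easy to encode. Here is a minimal sketch, with a hypothetical helper name and made-up session counts, that suggests a pie chart only for three or fewer categories and otherwise returns the categories sorted for a bar chart:

```python
def suggest_chart(values_by_category, max_pie_slices=3):
    """Suggest 'pie' for a handful of categories, otherwise a sorted bar chart."""
    # Sort descending so the long tail sits at the end of the bar chart.
    ordered = sorted(values_by_category.items(), key=lambda kv: kv[1], reverse=True)
    chart = "pie" if len(ordered) <= max_pie_slices else "bar"
    return chart, ordered


# Made-up session counts by medium, for illustration only.
sessions = {"organic": 5200, "referral": 1400, "direct": 2100, "email": 90, "social": 310}
chart, ordered = suggest_chart(sessions)
print(chart, [name for name, _ in ordered])
```

Sorting the bars by value is what makes the small categories quick to compare.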

Be careful with dual axes charts

Dual axes charts should be used with caution as they often cause confusion. It's tempting to use them when trying to chart data series with large size differences, as shown in the following image. Which series goes with which axis? Lines that cross or overlap will also suggest meaning that doesn't actually exist, because the series are on different scales.

Dual axis confusion

Some strategies you can use to mitigate confusion include matching the series and axes with different colors, labeling the axes clearly and even using different chart types for the different series (line with a bar).

However, I'd still advocate only using them sparingly. It's often better to show the two series in separate charts next to each other.

When to start the y-axis at 0

For bar charts, you should always start the y-axis at 0 since the height of the bar represents the count in that category. We look at the height of the bars and compare them. If one bar is twice the height of the other, then we're going to conclude that the value of that category is twice the value of the other category, even if the axis shows otherwise.

Consider this simple example. Both bar charts have been plotted from the same data but they tell very different stories:

Truncated y-axis
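To see how much a truncated axis can distort a comparison, here is a small sketch with made-up numbers. It computes the ratio of the bar heights as drawn, given where the y-axis starts; the function name is illustrative.

```python
def apparent_ratio(a, b, axis_start=0.0):
    """Ratio of bar heights as drawn when the y-axis starts at axis_start."""
    if min(a, b) <= axis_start:
        raise ValueError("both values must lie above the axis start")
    return (a - axis_start) / (b - axis_start)


# Made-up values: the real difference between them is 5%.
from_zero = apparent_ratio(105, 100)       # y-axis starts at 0
truncated = apparent_ratio(105, 100, 95)   # y-axis starts at 95
print(from_zero, truncated)
```

With the axis starting at 0, the taller bar is drawn about 5% taller, which matches the data. With the axis starting at 95, it is drawn twice as tall, even though the underlying values have not changed.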

Vox Media created an excellent video about truncating the y-axis. With line charts we don't need to be so strict with truncated y-axes as the visual lines are used to compare trends, not actual values, as in the case of bars. Indeed you sometimes need to narrow the range with line charts to show the story.

Remember to consider the color blind

Roughly 8% of men and 0.5% of women (the figures most commonly cited, for people of Northern European descent) have some form of color-blindness, and the most common type is red-green color-blindness. So it pays to keep this in mind when designing your charts.
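One practical option is to start from a palette designed to remain distinguishable under the common forms of color-blindness. The sketch below uses the Okabe-Ito palette, which is my suggestion rather than something prescribed above; the helper name and series names are hypothetical.

```python
from itertools import cycle

# Okabe-Ito palette, a widely used color-blind-friendly set (black omitted).
OKABE_ITO = [
    "#E69F00",  # orange
    "#56B4E9",  # sky blue
    "#009E73",  # bluish green
    "#F0E442",  # yellow
    "#0072B2",  # blue
    "#D55E00",  # vermillion
    "#CC79A7",  # reddish purple
]


def assign_colors(series_names):
    """Pair each series with a palette color, cycling if series outnumber colors."""
    return dict(zip(series_names, cycle(OKABE_ITO)))


print(assign_colors(["Desktop", "Mobile", "Tablet"]))
```

Even with a safe palette, it helps not to rely on color alone: direct labels or different line styles carry the same information for everyone.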

Closing Thoughts

Once you've created your charts, or your dashboard, pause and ask yourself these few questions:

  • Is it effective at communicating your message?
  • Is it efficient at communicating your message?
  • Ultimately, does the audience benefit from seeing your visualization?

There is no single right answer with data visualizations, as it will depend on many of the factors discussed above. People will come out with different charts from the same dataset, all of which could be equally effective. However, by following some best practices and thinking critically about your charts, you can improve them dramatically.

I'll leave you with some parting words from a master in this field:

"Above all else show the data" Edward Tufte
