(This post was originally published on Tableau Public as part of their Politics Viz Month campaign.)
AS FAR BACK AS 2013, one of my politically aware colleagues was periodically tracking overseas gambling odds on the 2016 presidential race. He used the bookmaker Ladbrokes as his source, and gathered the data by hand at irregular intervals. Occasionally he would share a simple line graph with his friends. However, when the race heated up and more candidates entered, his friends clamored for more frequent analysis, and he asked for my help.
Gathering the Data
I discovered that Oddschecker.com kept historical records on betting odds for a vast array of wagers, including futures bets on who the next US president would be, from more than 20 different overseas gambling sites. By leveraging their data, I could get as many observations as possible instead of the random sampling I had been getting before.
As primary season came to a close, I wanted to take a look back at all the candidates’ campaigns individually. This required a different approach than a simple line chart. I considered bump charts, overlapping area charts, and complex multi-sheet dashboards chock full of interactivity.
The result was “Four Years and Two Dozen Candidates.” This small-multiples chart shows how the bookmakers perceived the rise and fall of each candidate’s campaign over a four-year period.
Here, we could clearly see the rise and dominant lead of Hillary Clinton over all rivals; the sharp rise and even sharper fall of Marco Rubio; the odd persistence of Jim Gilmore; the wasted opportunities of former favorites Chris Christie and Jeb Bush; and the slim chances of candidates who once had a disproportionate share of the media spotlight, like Carly Fiorina, Ben Carson, or even Ted Cruz.
Once I came up with the basic concept, executing the design wasn’t complicated. There were a number of steps, but no one step required Zen Mastery of Tableau. Let me walk you through the steps.
Building the Viz
Let’s Get Something on the Screen
I find it helpful to get some data into the view right away. This helps me see and think through my options—even if I know it’s only a temporary (and certainly incorrect) solution.
I knew that I didn’t want each of the 24 candidates to have an individually colored line, because that’s way too many colors. I chose instead to color each candidate only by party affiliation.
The field Decimal Odds can be thought of this way: “This candidate has a Decimal Odds-to-1 chance of becoming the president, according to bookmakers.”
Well, we have something, but it’s a pretty useless chart at this point. It highlights an unimportant outlier (Lindsey Graham’s 10,000-to-1 odds on one particular day) and is completely unhelpful at showing differences among strong candidates. Obviously, the y-axis needs to work in the opposite direction, with favorites toward the top and long shots toward the bottom.
Favorites Up, Long Shots Down: Calculating an Implied Chance of Winning
I fixed this by using some calculated fields that show each candidate’s implied chance of winning on a daily basis, based on the odds available at the time.
That’s an improvement. Now I can see Hillary Clinton and Donald Trump toward the top of the chart, and a big cluster of other candidates toward the bottom. If I look closely, I can see that candidates like Rick Perry, Scott Walker, and Lincoln Chafee dropped out early while Bernie Sanders continued to campaign.
However, there were still many stories in this data that were hard to see. I couldn’t tell when candidates entered the race, or how their trajectories rose or fell over time; I couldn’t see where they peaked; and I couldn’t easily compare which candidates really contended vs. which ones were just kidding themselves.
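For readers who want to reproduce the implied-chance step outside Tableau, here is a minimal Python sketch. It is not the calculated field used in the workbook. It assumes the Decimal Odds value reads as "X-to-1 against," as described above, so the implied chance is 1 / (X + 1). If the field actually held European-style decimal odds (stake included), the conversion would be 1 / X instead. Averaging the bookmakers' odds before converting is also an assumption, and the function names are hypothetical.

```python
from statistics import mean


def implied_chance(odds_to_one: float) -> float:
    """Convert X-to-1 odds against into an implied probability of winning.

    Assumption: the odds are quoted as "X-to-1 against", so a 1-to-1 bet
    implies a 50% chance. Bookmaker margin is not removed.
    """
    if odds_to_one < 0:
        raise ValueError("odds must be non-negative")
    return 1.0 / (odds_to_one + 1.0)


def daily_implied_chance(bookmaker_odds: list[float]) -> float:
    """Average one day's odds across bookmakers, then convert (assumed order)."""
    return implied_chance(mean(bookmaker_odds))


if __name__ == "__main__":
    # A 10,000-to-1 long shot lands near the bottom of the axis,
    # while an even-money favorite sits at the midpoint.
    print(f"{implied_chance(10_000):.4%}")
    print(f"{implied_chance(1):.1%}")
```

Because the result always falls between 0 and 1, favorites plot toward the top of the axis and long shots toward the bottom, which is the flip the chart needed.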
Small Multiples to the Rescue
With so many different values on my candidate dimension, none of the preset chart types in Tableau showed my desired comparison clearly and cleanly. My preferred solution for visualizing data like this is to use small multiples, also known as a panel or trellis chart.
Many people in the Tableau community have created such charts, including Andy Cotgreave, Michael Mixon, Jim Wahl, and Chris DeMartini, to name just a few. I knew that I could create such a chart. The real questions were: How should I order the panels for display? And how much data should be in each panel?
Getting the House in Order
I settled on ordering the panels in descending order of “highest single-day chance of winning the presidency.”
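The ordering rule can be sketched outside Tableau as follows. This is an illustration, not the workbook's table calculations: the function name, the column count, and the placeholder candidates "A", "B", and "C" with their values are all made up for the example.

```python
def order_panels(daily_chance: dict[str, dict[str, float]], columns: int):
    """Rank candidates by their highest single-day implied chance (descending)
    and assign each one a (row, column) slot in a grid filled left to right."""
    # Peak value per candidate across all of that candidate's days.
    peaks = {name: max(series.values()) for name, series in daily_chance.items()}
    ordered = sorted(peaks, key=lambda name: peaks[name], reverse=True)
    # Integer division gives the row; the remainder gives the column.
    return [(name, i // columns, i % columns) for i, name in enumerate(ordered)]


if __name__ == "__main__":
    # Placeholder data, not real odds.
    sample = {
        "A": {"day1": 0.10, "day2": 0.30},
        "B": {"day1": 0.50},
        "C": {"day1": 0.20},
    }
    for name, row, col in order_panels(sample, columns=2):
        print(name, row, col)
```

With this layout the strongest peak lands in the top-left panel and the weakest in the bottom-right, so reading order matches rank.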
Clarifying the Message With Dual Axes
Once I constructed the trellis, I could tell that the differences among all the candidates were still too difficult to see. I decided to turn the line chart into a dual-axis line-and-area chart.
This made it easier to see the area under the curve for each candidate, which would help to convey both the length of each campaign and its relative success. It also gave me two different labels to play with, which I took advantage of when it came time to polish up the viz at the end.
The End of the Road
I felt it was important to emphasize the date when each candidate chose to suspend his or her campaign. I chose to use a reference line with shading to convey this data.
The Ol’ Fit-and-Finish Routine
At this point, the data layout was basically complete, but as someone who came into this field with a design background, I considered it far from finished. I hadn’t set up any labels or tooltips, and I had barely touched any of the formatting options.
Since I had used a dual-axis approach to creating each panel, I could take advantage of the fact that each of the two charts could be given a unique label.
On the line chart, I could create the label that would indicate each candidate’s campaign peak by showing the date and their implied chance to win for that day.
For the area chart, I could create the label that would simply be the candidate’s name, which I could drag to the top-left corner of each panel and use as a title at the very end of the viz-creation process.
Now each panel displayed the candidate’s name, and their peak day was labeled with the date and their implied chance of winning.
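The peak label amounts to finding the day with the highest implied chance and printing that date alongside the value. Here is a small Python sketch of the idea. The function name and the label format are assumptions made for illustration, and the sample values are placeholders rather than real odds.

```python
from datetime import date


def peak_label(series: dict[date, float]) -> str:
    """Return a label with the peak day and its implied chance of winning.

    If several days tie for the peak, the earliest-inserted one is used.
    """
    peak_day = max(series, key=series.get)
    return f"{peak_day.isoformat()}: {series[peak_day]:.1%}"


if __name__ == "__main__":
    # Placeholder series for one hypothetical candidate.
    series = {
        date(2015, 1, 1): 0.10,
        date(2015, 6, 1): 0.25,
        date(2016, 1, 1): 0.05,
    }
    print(peak_label(series))
```

Only one mark per panel gets this label, which keeps the peak visible without cluttering the rest of the line.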
I wanted more information about each candidate’s day-to-day campaign status available in the tooltip. This would include:
The daily implied chance of winning
The actual average of available odds (shown in the “X-to-1 chance of winning” format)
The number of bookmakers taking wagers on that candidate that day
To do this, I created these corresponding calculated fields solely for use in tooltips.
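As a rough illustration of what those three tooltip values contain, the Python sketch below derives them from one candidate's odds on one day. The field names, the number formats, and the 1 / (X + 1) conversion (which treats the odds as "X-to-1 against") are assumptions, not the actual Tableau calculations.

```python
from statistics import mean


def tooltip_fields(bookmaker_odds: list[float]) -> dict[str, str]:
    """Build display strings for one candidate on one day.

    bookmaker_odds holds one X-to-1 value per bookmaker quoting that day.
    """
    average_odds = mean(bookmaker_odds)
    return {
        # Assumed conversion: X-to-1 against implies a 1 / (X + 1) chance.
        "implied_chance": f"{1 / (average_odds + 1):.1%}",
        "average_odds": f"{average_odds:,.1f}-to-1",
        "bookmakers": str(len(bookmaker_odds)),
    }


if __name__ == "__main__":
    # Placeholder odds from two hypothetical bookmakers.
    print(tooltip_fields([3.0, 5.0]))
```

Formatting the values as strings up front mirrors the idea of building fields solely for tooltips: the text is ready to drop in without further number formatting.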
Formatting in the Worksheet
In Tableau 10, many of these steps will be a lot faster since there are many new formatting tools available.
In general, I wanted to remove everything that could distract the viewer from seeing the main messages of the viz, so I hid most headers and titles, went with a minimalist look for the axis labels, emphasized the borders between each panel, and added light vertical gridlines to show where each year began.
Dashboarding: The Final Touch
Putting this worksheet into a dashboard let me add a text column along the right side of the viz, a title along the top, and a graphic on the bottom right. The very last bit of finishing involved dragging the candidates’ names to the top-left corner of each panel. By default, the dashboard looked like this:
Here’s where I used that little-known feature of Tableau whereby you can click-and-drag mark labels to any location within the pane of their parent mark. There’s no snap-to-grid or auto-align function, however, so I aligned each label anchor to the top of the pane by hand and left a bit of space between the label and the left-hand edge of the pane. After that was done, the viz was complete.
The small-multiples format proved to be the ideal choice for comparing the ups, downs, and outs of candidates in the current presidential election cycle. Within each panel, a candidate's path is easy to track. And across panels, labeling (and ordering by) a candidate's peak implied chance of winning emphasizes just how far away from victory most hopefuls truly were.
While the interactive version allows viewers to get granular data, down as deep as the daily level, the high-level, static version is intentionally designed to stand on its own as an informative and easily shareable visualization.