Kris Coombes

Data Scientist - Insights

How to build your own data visualisation package

By Kris Coombes on June 10, 2022 - 5 Minute Read

The Peak Insight team has created our own “peak-theme” visualisation package. It saves us time and ensures our charts are of the highest standard, every time. In this article, we give you the lowdown on how to build a data visualisation package of functions to standardise your charts with your own custom themes.

What does it mean to standardise a chart?

Chart standardisation is the practice of producing visualisations that are styled consistently, regardless of the type of chart you use. 

Rather than reusing long blocks of code, Peak has built a package of functions that produce these standardised charts in single lines of code.

Building your own visualisation package has a number of benefits:

  1. All those hours spent writing and tweaking visualisation code can be a thing of the past!
  2. All of your visualisations are stylistically consistent and repeatable.
  3. Brand recognition! Incorporating your brand’s colours and fonts into your visuals gives your charts a professional feel and a style that your audience recognises.

Let’s take a look at some examples of standardised, Peak-branded charts…

[Figure: Bar chart]

[Figure: Graph]

[Figure: Dot chart]

All of these charts were produced using single lines of code like this…

linechart(data_frame = data, x = 'date', y = 'headcount', group = 'country')

If we didn’t have functions for these charts, we’d have to use a cumbersome chunk of code to get the job done, like this:

library(ggplot2)
library(magrittr)  # provides the %>% pipe

data %>%
  ggplot(aes(x = date, y = headcount, group = country)) +
  geom_line(aes(colour = country)) +
  labs(x = 'Date', y = 'Headcount', colour = 'Country') +
  scale_colour_manual(values = c('#000033', '#66FF99', '#FF3399')) +
  theme(text = element_text(),
        plot.title = element_text(face = "bold", size = rel(1.2), family = "nhaasgrotesk75bd"),
        plot.background = element_rect(colour = NA),
        plot.margin = margin(3, 3, 3, 3, "mm"),
        …,
        strip.background = element_rect(colour = "#f0f0f0", fill = "#f0f0f0"),
        strip.text = element_text())

Building your package: before you begin

Now we’ve shared some of the benefits of chart standardisation, you must be itching to start writing some functions, right? Great! The rest of this article will show you how to build a package.

Before you build your first functions, you need to do a little bit of groundwork and organising. It’s important to lay some solid foundations early on, or you’ll find yourself backtracking on your progress further down the line.

1. Define your package roadmap

The first thing you need to do is prioritise which functions you want to build first. There are loads of charts you could build functions for, but the ones you most commonly use should take priority. At Peak, we use bar charts, dot charts and line charts most frequently, so we built those first.

2. Create a functionality checklist

Secondly, you should spend some time writing a checklist of features that your functions should include.

Look back over your code and try to identify modifications you regularly make, such as any formatting changes (e.g. when to switch between comma and percent formatting automatically for your axis tick labels). 
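As one illustration of a checklist item, the sketch below picks an axis label format from the data itself. The helper name pick_label_format() and the rule it applies (values that all sit between 0 and 1 are treated as proportions) are assumptions made for this example, not part of the Peak package. It relies on the scales package, which is installed alongside ggplot2.

```r
library(scales)

# Hypothetical helper: choose a tick-label formatter from the values plotted.
# Assumption: values that all fall within [0, 1] are proportions.
pick_label_format <- function(values) {
  values <- values[!is.na(values)]
  if (length(values) > 0 && all(values >= 0 & values <= 1)) {
    label_percent()   # proportions are shown as percentages
  } else {
    label_comma()     # larger numbers get comma thousands separators
  }
}

# The returned object is itself a function that formats a numeric vector
share_labels <- pick_label_format(c(0.1, 0.25, 0.9))(c(0.1, 0.25, 0.9))
count_labels <- pick_label_format(c(1200, 45000))(c(1200, 45000))
```

Inside a chart function, the result could be passed to the labels argument of scale_y_continuous(), so the user never has to think about the format.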

The first version of your chart function doesn’t need to be able to do everything from the start – prioritise what matters most! 

Tip: Below is a checklist we’re using for our bar chart function; you can use it as a starting point.

[Figure: Bar chart function checklist]

3. Collaborate with your design team

Finally, you should make some time to talk to your design team — their input will be invaluable in helping you build out your custom theme. Some of your brand’s colours might not be easy to read when applied to a chart, so you may need to make a few compromises to make sure they’re fit for purpose in visualisations. 

We’d recommend familiarising yourself with your brand guidelines too; these should contain everything from the colours in your palette and the fonts you need to which font colour you should use with each palette colour!

Tip: This would also be a great time to think about the inclusivity of your theme and define a couple of alternative palettes for colourblind readers!
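One way to keep palettes in one place is to store them as objects that every chart function can look up. In the sketch below, the object names and colour labels are hypothetical, the brand hex codes are the three used in the line chart example earlier in this article, and the alternative is the Okabe-Ito palette that ships with R (version 4.0 or later), which was designed with colour vision deficiency in mind. Treat it as a starting point to check with your design team rather than a finished palette.

```r
# Hypothetical palette objects; the names and labels are placeholders
brand_palette <- c(navy = "#000033", mint = "#66FF99", pink = "#FF3399")

# Okabe-Ito palette from base R (grDevices, R >= 4.0)
accessible_palette <- palette.colors(n = 8, palette = "Okabe-Ito")

# Small lookup so chart functions can request a palette by name
get_palette <- function(name = c("brand", "accessible")) {
  name <- match.arg(name)
  switch(name,
         brand = brand_palette,
         accessible = accessible_palette)
}

# First three colours of the alternative palette
get_palette("accessible")[1:3]
```

When passing a named palette to scale_colour_manual(), wrapping it in unname() stops ggplot2 from trying to match the colour labels to the values in your data.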

 

How to create functions for your charts 

Now you’ve laid the foundations, it’s time to start building functions. The objective of your function is to produce an on-brand chart in a single line of code, with different options you can set to change the chart parameters.

Functions are far less scary to write than you might think: if you treat them with care, they’ll treat you kindly back. We’ll be writing functions in R for our examples, specifically as wrappers for ggplot2 code.

Your first task should be to create a function containing all of your custom styling (also known as a ‘theme’) and create objects for your colour palettes. 

If you use ggplot2 on the regular, you should be familiar with theme_bw() or theme_minimal() – these are effectively theme() statements wrapped into one function that sets things like text size, fonts, legend position, spacing and much more. Building your own is crucial, as it’s what ultimately makes all of your plots look consistent!
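A custom theme function can be sketched as a built-in theme plus your own overrides. The name my_theme() and the particular settings below are assumptions for illustration (a few are borrowed from the long ggplot2 example earlier in this article). The brand font is left out here because it only works on machines where that font is installed.

```r
library(ggplot2)

# Hypothetical brand theme: start from a built-in theme, then override
# the pieces that should look the same on every chart.
my_theme <- function(base_size = 11) {
  theme_minimal(base_size = base_size) +
    theme(
      plot.title = element_text(face = "bold", size = rel(1.2)),
      plot.margin = margin(3, 3, 3, 3, "mm"),
      legend.position = "bottom",
      strip.background = element_rect(colour = "#f0f0f0", fill = "#f0f0f0")
    )
}

# Applied like any built-in theme, here on a data set that ships with R
p <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  labs(title = "Weight against fuel economy") +
  my_theme()
```

Because every chart function ends with the same my_theme() call, a styling change only ever needs to be made in one place.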

Once you’ve done that, you can start writing chart functions.

1. What baselines do you want to set?

Let’s take a bar chart as an example. What are the basic things that a bar chart function would need to do? It would need to take some data, and plot an x and a y variable from it, like the below.

my_barchart <- function(data, x, y){…}

2. What common modifications do you want to include?

Your function should also include some options for common modifications that you make to your charts. For our bar chart example, we want users to be able to compare different groups: let’s add this to our function.

my_barchart <- function(data, x, y, group = ""){…}

Note that, for the group option, we have provided an empty string as a default value. If you don’t supply a group argument when calling the function, it won’t group your bar chart.

You can also use boolean (TRUE/FALSE) arguments in functions to change chart elements or add new details. For example, if we wanted to flip the axes so the chart reads horizontally, we could add an option for this that defaults to FALSE and can be set to TRUE when needed.

my_barchart <- function(data, x, y, group = "", flip = FALSE){…}
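To make the signature concrete, here is one possible body for it. This is a minimal sketch rather than Peak’s implementation: it assumes column names are passed as strings, uses theme_minimal() as a stand-in for a custom theme, and sets flip to FALSE by default so the horizontal layout is opt-in. The sales data frame is made up purely for the example.

```r
library(ggplot2)

# Sketch of a bar chart wrapper; column names are passed as strings
my_barchart <- function(data, x, y, group = "", flip = FALSE) {
  p <- ggplot(data, aes(x = .data[[x]], y = .data[[y]]))

  if (nzchar(group)) {
    # A grouping column was supplied: colour bars by group, side by side
    p <- p +
      geom_col(aes(fill = .data[[group]]), position = "dodge") +
      labs(fill = group)
  } else {
    p <- p + geom_col()
  }

  if (flip) {
    # Swap the axes so the bars read horizontally
    p <- p + coord_flip()
  }

  p + labs(x = x, y = y) + theme_minimal()
}

# Made-up data, for illustration only
sales <- data.frame(
  region  = c("North", "South", "North", "South"),
  year    = c("2021", "2021", "2022", "2022"),
  revenue = c(120, 95, 140, 110)
)

p1 <- my_barchart(sales, x = "region", y = "revenue")
p2 <- my_barchart(sales, x = "region", y = "revenue", group = "year", flip = TRUE)
```

Each option is a small, independent branch, which keeps the function easy to extend when the next item on your checklist comes up.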

3. How can you make your functions robust and user-friendly?

Now you’ve defined what you want your function to do, it’s time to build the internal workings of it! While we won’t go into the ins and outs of that here, it’s important to think about what things you don’t want your function to do.

In our bar chart example, we’d need one of our variables to be a numeric field and one to be non-numeric. You should write a test that checks whether this is true; if the test fails, the function should stop and return a helpful, easy-to-debug error message, like in the example below.

tests_passed <- TRUE
# x must be a categorical (non-numeric) field
if (is.numeric(data[[x]])) {
  message("my_barchart() requires x to be a categorical field.")
  tests_passed <- FALSE
}
# y must be a numeric field
if (!is.numeric(data[[y]])) {
  message("my_barchart() requires y to be a numeric field.")
  tests_passed <- FALSE
}
# Stop only after every check has run, so the user sees all problems at once
if (!tests_passed) {
  stop("my_barchart() input checks failed; see the messages above.")
}
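The same checks can be pulled out into a small helper that gathers every problem and then fails once with a single message. The sketch below is one way to do that in base R; check_barchart_inputs() is a hypothetical name, and the extra check for missing columns is an addition made for this example.

```r
# Hypothetical validator: collect every problem, then fail once
check_barchart_inputs <- function(data, x, y) {
  problems <- character(0)

  missing_cols <- setdiff(c(x, y), names(data))
  if (length(missing_cols) > 0) {
    problems <- c(problems, paste0("column(s) not found in data: ",
                                   paste(missing_cols, collapse = ", ")))
  } else {
    if (is.numeric(data[[x]])) {
      problems <- c(problems, "x must be a categorical field")
    }
    if (!is.numeric(data[[y]])) {
      problems <- c(problems, "y must be a numeric field")
    }
  }

  if (length(problems) > 0) {
    stop("my_barchart() cannot draw this chart: ",
         paste(problems, collapse = "; "), call. = FALSE)
  }
  invisible(TRUE)
}

# Swapping x and y trips both checks; tryCatch captures the error text
result <- tryCatch(
  check_barchart_inputs(iris, x = "Sepal.Length", y = "Species"),
  error = function(e) conditionMessage(e)
)
```

Calling the helper on the first line of my_barchart() means bad inputs are caught before ggplot2 has a chance to produce a more cryptic error.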

4. Testing

Congratulations, you’ve written your first chart function – but the job isn’t done yet! You need to ensure the chart your function produces is exactly what you would expect it to be before releasing it, so testing and documentation are essential.

Here are some tips from us on what you should do to stay organised when building your package:

  • Set up a GitHub repository: you’ll need to set up a GitHub repository to store all your cool new functions in! When you have finished writing a batch of new functions, push them to a ‘beta’ branch before you push to your main branch so you can start testing them from there.
  • Test it with friends: try to round up a bunch of people who’ll help out and do a mass testing session! They’ll be able to test your functions across various data sources, and try and break them in ways you probably hadn’t even thought of.
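A mass testing session can be backed by a small script that runs a chart function against several data sets and records which calls fail. The sketch below uses data sets that ship with R and a stand-in chart function; smoke_test() and the list of cases are assumptions for illustration, and in a real package a testing framework such as testthat would be the more usual home for checks like these.

```r
library(ggplot2)

# Stand-in for a chart function under test (hypothetical)
my_barchart <- function(data, x, y) {
  if (is.numeric(data[[x]]) || !is.numeric(data[[y]])) {
    stop("my_barchart() requires a categorical x and a numeric y.")
  }
  ggplot(data, aes(x = .data[[x]], y = .data[[y]])) + geom_col()
}

# Each case names a built-in data set and the columns to plot
cases <- list(
  list(data = iris,        x = "Species", y = "Sepal.Length"),
  list(data = ToothGrowth, x = "supp",    y = "len"),
  list(data = mtcars,      x = "mpg",     y = "cyl")  # numeric x: should fail
)

# Run one case: build the plot fully so later-stage errors surface too
smoke_test <- function(case) {
  tryCatch({
    ggplot_build(my_barchart(case$data, case$x, case$y))
    "ok"
  }, error = function(e) conditionMessage(e))
}

# One entry per case: "ok" or the error message that was raised
results <- vapply(cases, smoke_test, character(1))
```

Whenever a tester finds a new way to break a function, adding it to the list of cases stops the same bug from sneaking back in later.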

 

5. Releasing and implementing your package

Once you’re satisfied that your function is free from bugs and gremlins, it’s time to release your package! Here are some things you should consider to make life easier for your package’s users:

  • Create a release note: that way if anything goes awry you can roll back your release to the previous version.
  • Write documentation: creating things like help pages and vignettes (you can search for these in R using ??my_barchart, for example) ensures users can quickly learn what the function is capable of.
  • Write a visualisation cookbook: practical documentation like cookbooks doesn’t just give you an opportunity to show off your new chart functions, it can also provide tips on how to customise visualisations more generally.
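For the documentation step, a common approach in R packages is to write roxygen2 comments directly above each function; running devtools::document() then turns them into help pages. The sketch below shows the general shape, with a deliberately simple stand-in for my_barchart(). The wording of each field is an assumption for illustration.

```r
#' Draw a branded bar chart
#'
#' @param data A data frame.
#' @param x Name of a categorical column, as a string.
#' @param y Name of a numeric column, as a string.
#' @return A ggplot object.
#' @examples
#' my_barchart(iris, x = "Species", y = "Sepal.Length")
#' @export
my_barchart <- function(data, x, y) {
  # ggplot2:: prefixes are used so the function works without library(ggplot2)
  ggplot2::ggplot(data, ggplot2::aes(x = .data[[x]], y = .data[[y]])) +
    ggplot2::geom_col()
}

# The documented example doubles as a quick check that the function runs
p <- my_barchart(iris, x = "Species", y = "Sepal.Length")
```

Keeping the documentation next to the code makes it much more likely that the two stay in step as the function evolves.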

And there you have it, our trilogy on Storytelling With Data has come to an end. 

We’ve talked about what storytelling with data is, why it’s important, and equipped you with the guidance needed to tell that story.

Now we’ve given you the tools to tell that story with, let us know how you get on applying your newly acquired storytelling skills in the Community discussion below.
