Transcript
This transcript was generated automatically and may contain errors.
Hello, everybody. Thanks for joining. I'm David. I run R for the Rest of Us. A lot of people know us because we produce a bunch of educational content. We also do consulting work, and today I'm going to be talking about that. Specifically, I'm going to be talking about the projects that we have worked on where we design highly stylized reports for our clients. So the title of my talk is Report Design in R, Small Tweaks that Make a Big Difference.
To start out, I want to actually take you back to a time. This was around, I think, 2015, maybe. I had a job at the Oregon Community Foundation. It's a foundation in Portland where I'm based, and at that time, this was before I used R. I was very excited about this job. One day, though, they told me, OK, David, we have this program where we fund nonprofits across the state that do after-school programs. They do surveys. We need you to take that survey data and produce one report for each nonprofit. As you can see, there are a lot of nonprofits, and at that time, I was working with Microsoft products, Excel, Word, PowerPoint. You can imagine that producing these reports was pretty terrible, and by the end of doing this, I felt kind of like that.
Now, shortly after that, I taught myself R, and I got really, really excited. I learned about the idea of reproducible reporting, and especially I learned about the idea of parameterized reporting, where you can take one kind of report template and then make multiple reports. I thought back to that time when I had had to make these reports for all these nonprofits across the state, and I was like, oh my gosh, I can do this automatically in R.
Eventually, though, what I realized is that it wasn't as simple to make really good-looking reports as I hoped it would be. Eventually, I came to see that it seemed like there were two paths to making really nice-looking reports. On the one hand, I could work with other tools, so I could work with things like Word, Excel, PowerPoint, sometimes working with a graphic designer, but the problem with that is it's not reproducible, so if you're making dozens, hundreds of reports, it's not really a good way to go. The other approach seemed to be to make ugly reports, and so I'm assuming most people in the room recognize this as the default output if you work with Quarto and you render to PDF, right? It's not the most attractive thing you've ever seen.
Eventually, though, what I realized is that there was actually a third path that I had missed, so if you look really closely in the middle, there's actually a path that goes up there, and I realized that the third path was to make beautiful reports in R, and so today I'm going to talk to you about how we've done this and hopefully show you that I think you can do it as well.
This is an example of a report that we made. This is actually showing just a couple pages. It was working with a client in Connecticut making reports on housing and demographic data, and this is a report that we had to make not just for one town, in this case Hartford, but for 169 towns in the state of Connecticut, so again this is not the kind of thing that you would want to do in, you know, some other tool. We need to do this reproducibly, but if we want to make it look really good, we're going to have to put in some effort to make that happen.
Does design matter?
First though, let me ask and answer a question that I think some of you might have, which is does design matter? I think especially among technical folks, and I count myself among that group, there can be a view that, oh well it doesn't really matter, let me just, you know, make some graphs, send them off to whoever, and they can do whatever they want with them, but I don't think that's accurate. I think design really does matter. There's something that's been studied by researchers called the aesthetic-usability effect. This was actually brought to my attention by Will Chase, who you may know, who does a lot of amazing data visualization, and this aesthetic-usability effect talks about the idea that things that are more aesthetically pleasing are perceived to be more useful, so to the degree that we want what we're producing, our results, to be used, it behooves us to make them aesthetically pleasing.
I also think good design builds trust, so if you can show people that, hey, I spent a bunch of time thinking about designing this well, they're also going to trust that you've spent time with your analysis and other pieces of the work that you're presenting. Now you might be thinking, wait a second, David, I'm not a designer, but here's the thing, neither am I, and I have learned over the last few years that good design is just a few small tweaks, and I'm going to show you, hopefully, how you can do the same.
A framework for good design
I've got a framework for good design, and I've got a real quick do and a don't, so what I'm going to show you is the do is going to be to focus on being consistent, and I'll show you how we do that both with report layout as well as with data visualization, and the don't is going to be don't use defaults, so don't use that like Quarto default output that you get, don't use default themes in ggplot, don't use default colors, and I'll show all this in a second.
So let's talk for a second, what is a report? What makes up a report? I've made a really kind of simplified version here that you're looking at. The first thing you're going to want to do is create a layout. I've created a very simple layout here, you can see I've got that like top bar, and then I've got a footer at the bottom, nothing super crazy, like this isn't, again, I'm not a designer, but you can make something really simple with this type of layout. You also need to add your brand colors, so here, for example, you can see the blue, the dark blue, which was one of the colors from our client, which I'll talk about in a second. Also add your brand fonts, so here I've gone from, I think it was Times New Roman I had put before, to a brand font, which in this case is Open Sans. And finally, you want to add your plots, and when you add those plots, you want them to have all those things that we just talked about, specifically your brand colors and your brand fonts.
So I'm going to make this, this is an even more simplified version, just in the interest of time, I'm going to make this report here with Typst, which I, up until today, have been calling Typist, so I will do my best to correct myself. So first thing is to identify your brand colors. If you work internally in an organization, just ask, they'll tell you what your brand colors are. If you work for an external client, like we often do, sometimes we'll ask them, sometimes we'll know, a lot of times we'll just go to their website. You can either inspect the HTML source, if you know how to do that, you can find what are called color picker tools, which can go and say, like, what is the hex code for this color, and you can extract that. And so, for example, working with this client, you can see we got a couple colors that we used in this report, a blue and a red.
Next, identify your brand fonts. Again, if you work internally, you can just ask around. For us, that often means, again, going to the website and figuring out what font they're using there, and then using it as well. And this was Open Sans, which is what I'll use as I show how to create a layout.
Creating a layout with Typst
So, creating a layout. We're going to do this with Typst, so I'm going to give you a very quick overview of how this would work. How Typst works is this. You start with your report.qmd file, so same Quarto file that you use for any other output format. Then you're going to have what are called template partials, so you have a typst-show.typ file. That file is actually going to take the variables from Quarto and pass them into Typst, which we'll use to create our template in our typst-template.typ file. And from there, that will then create our report.pdf, so that's kind of the overall process.
So, let's look at our report.qmd file. I've just made a very simple version. I've got the title here called Housing Data Profiles, and then again, this is a parameterized report, so I've got one parameter, which is the town, and in this case, I'm using Hartford. You can see then I've got my template partials. That's the typst-show and typst-template that I talked about.
So, let's look at each of these files. The typst-show.typ file, again, is how we're getting variables from Quarto to Typst. So, on the first line here, I'm creating a Typst function, psc-report, and within this function, what I'm going to do is I am going to pass the variables from Quarto into Typst. So, I'm saying if title, so in other words, if there is a title in my Quarto document, which there was, it was called Housing Data Profiles, then on line three, here, we're creating a variable called title that gets the value of that title from Quarto. So, we're basically passing it from Quarto to Typst. Same thing with the parameter of town. So, if params.town exists, then we pass it to Typst as town. And I will say at the very end, I have a link to the GitHub repo, so you can get all this code. So, don't feel like you need to copy it as I'm going.
The typst-template is where we're going to lay out our document. So, here, for example, you can see at the top, I've got some variables that I'm defining. So, title and town, I have to give them default values, but just know that they'll actually be replaced by the values that are passed from typst-show when I'm actually making my report. And then below that, I'm going to set up a whole bunch of properties for my report, a whole bunch of body properties, and I'll walk through each of these in just a second.
So, the first body property that I'm going to do is I'm going to set the text for my whole report. So, I'm saying set text font, Open Sans, size 12 point. So, as you can guess, that's saying make everything 12 point Open Sans. So, again, not using the default font, I'm using Open Sans. Then I'm going to set up my page properties. I'm saying let's use US letter. So, use that, you know, you can also use like A4, you can define custom sizes as well. And then I'm setting my margins as well.
Next, what I'm doing is I'm actually adding this blue rectangle on the top. So, I'm saying background place top. So, in other words, putting something on the top, and what I'm placing there is a rectangle. So, you can see rect, and then I'm setting the properties of the rectangle. So, most importantly, see this hex value here, that's that blue hex value. So, that's going to put that blue rectangle at the very top. And you can see I'm saying make it 100% width and 0.5 inches in height. Next, in the header, what I'm doing is I want to add my text. So, I want to add like Housing Data Profiles and Hartford. So, I'm doing that. The most important piece here is to focus on the grid here, where I'm saying grid columns 80%, 20%. So, what that's doing, the 80% is this column here, that's going to take up 80% of the width. Then down here, I'm saying align on the left side, I'm defining a bunch of text properties. But the value that's going to actually show up here is title. And remember that the value for title is Housing Data Profiles. So, again, that shows up here. Same thing on the right side, except we're showing town. So, we get Hartford there.
We do something very similar for the footer here. So, I'm defining columns 40%, 60%. I'm using this counter page display one. That just sets the page number there. And then on the right side, I'm just adding the logo, which is the logo of the client that we were working with, Partnership for Strong Communities. Okay. So, that's like the fastest run through of Typst ever. But hopefully that gives you kind of a basic overview.
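With a parameterized report.qmd like the one described above, one way to produce a PDF for each town is to loop over town names from R. This is a minimal sketch and not the workflow shown in the talk. It assumes the quarto R package and the Quarto CLI are installed. The helper names town_output_file() and render_town_report(), and the three-town vector, are hypothetical and chosen only for illustration.

```r
# Hypothetical helper: turn a town name into a file name such as "new-haven.pdf"
town_output_file <- function(town) {
  paste0(gsub("[^a-z0-9]+", "-", tolower(town)), ".pdf")
}

# Hypothetical helper: render report.qmd for one town.
# Assumes the quarto R package and the Quarto CLI are available.
render_town_report <- function(town, input = "report.qmd") {
  quarto::quarto_render(
    input = input,
    output_file = town_output_file(town),
    execute_params = list(town = town)  # fills in params$town in the report
  )
}

# Illustrative subset; a real run would use all 169 town names
towns <- c("Hartford", "New Haven", "Stamford")

# Only attempt to render when the report file is actually present
if (file.exists("report.qmd")) {
  for (town in towns) render_town_report(town)
}
```

Keeping the file-name logic in its own small function makes it easy to check the output names before starting a long batch of renders.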
Using functions for consistent plots
At this point, now that you have a layout, you need to add some content. One thing that we do that I think really helps is we use functions to make our plots consistent. So, rather than having like a whole smorgasbord of different types of plots, or even, for example, if you have bar charts, you don't want each one to look a little bit different. So, we make functions. So, for example, in this report, we made something, I don't know if it has a better name, but we called it a comparison plot. We wanted to be able to compare the town that the report was for to every other town in the state of Connecticut. And so, you can see, for example, we do that with single-family homes as a percent of all homes, and then also with total population. And we did this for many different things as well.
So, to make this comparison plot, I'm going to load a couple packages, tidyverse and scales. And you can see that I've got data that I'm working with here, single-family homes. Then I'm going to make a function that I'm calling comparison plot, takes two arguments, my data, my data frame, and then whichever town I want to highlight, the highlight town. Then I'm going to use this in ggplot. I want you to actually focus less on the specifics of how ggplot works and more on kind of the overall concept. But the basic idea is I take my data frame, pipe it into ggplot, then I'm going to add the light gray lines for all the towns. Then I'm going to add another line that shows whichever town I want to highlight. So, if I do comparison plot for single-family homes in Hartford, I get that.
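A minimal sketch of a comparison plot function along the lines described above. The code from the talk is in the linked repo; this version is an approximation that makes several assumptions: the data values are made up, the column names town and value are hypothetical, only ggplot2 is loaded rather than the full tidyverse, and each town is drawn as a "|" tick mark with geom_point().

```r
library(ggplot2)

# Made-up values for illustration only; these are not the client's data
single_family_homes <- data.frame(
  town  = c("Hartford", "Town A", "Town B", "Town C", "Town D"),
  value = c(0.20, 0.55, 0.72, 0.64, 0.81)
)

comparison_plot <- function(df, highlight_town) {
  highlighted <- df[df$town == highlight_town, ]

  ggplot(df, aes(x = value, y = 1)) +
    # One light gray tick mark per town
    geom_point(shape = "|", size = 10, color = "grey80") +
    # The highlighted town, drawn on top in a darker color
    geom_point(data = highlighted, shape = "|", size = 10, color = "grey20") +
    theme_minimal() +
    theme(
      axis.title = element_blank(),
      axis.text.y = element_blank(),
      panel.grid = element_blank()
    )
}

p <- comparison_plot(single_family_homes, "Hartford")
```

Because every chart of this type goes through the same function, a change to the tick size or the gray only has to be made in one place.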
Now, that's looking pretty good, but see this has like 0.25, I'd rather that be 25 percent. So, to make that 25 percent, I'm going to use scale_x_continuous, where labels is percent_format. And when I do that, you can see now I've got 25 percent, and that looks great. The only problem is I also want this function that I've made, I want it to be consistent, but I also want to have some flexibility for different types of data. So, for example, for total population, I don't want it to say 25 percent, right? So, I'm going to add an argument to comparison plot, and I'm calling it value type. And I'm saying if value type is percent, then we'll use that percent format. If value type is number, then I'm going to use another function from scales, comma_format. So, if I do comparison plot for single-family homes with percent, we get what we got before. But if I do it with number for total population, you can see now I get nicely formatted numbers here with commas in the thousands place, etc.
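The value type switch can be sketched as follows. This assumes the same hypothetical town and value columns and made-up numbers as an illustration. The helper value_labels() is not from the talk; it is a hypothetical name used here to keep the formatter choice in one place. percent_format() and comma_format() are the scales functions named in the talk; recent versions of scales also provide label_percent() and label_comma() for the same purpose.

```r
library(ggplot2)
library(scales)

# Hypothetical helper: choose an axis label formatter for the value type
value_labels <- function(value_type = c("percent", "number")) {
  value_type <- match.arg(value_type)
  if (value_type == "percent") percent_format() else comma_format()
}

comparison_plot <- function(df, highlight_town, value_type = "percent") {
  highlighted <- df[df$town == highlight_town, ]

  ggplot(df, aes(x = value, y = 1)) +
    geom_point(shape = "|", size = 10, color = "grey80") +
    geom_point(data = highlighted, shape = "|", size = 10, color = "grey20") +
    # Axis labels follow the value type: "25%" or "150,000"
    scale_x_continuous(labels = value_labels(value_type)) +
    theme_minimal() +
    theme(
      axis.title = element_blank(),
      axis.text.y = element_blank(),
      panel.grid = element_blank()
    )
}

# Made-up population values for illustration only
total_population <- data.frame(
  town  = c("Hartford", "Town A", "Town B", "Town C"),
  value = c(120000, 24000, 60000, 150000)
)

p_number <- comparison_plot(total_population, "Hartford", value_type = "number")
```

Using match.arg() means a misspelled value type fails immediately with a clear error instead of silently falling back to one of the formats.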
I also use functions, we also use functions to make what I call these big numbers, so the plots on the left, they don't actually look like plots, but you can make them in ggplot. So, I'm talking about this here, and this. So, to do that, I make a function for big number plot, where I have two arguments, value, whatever the number that I want to show is, and then the text. So, again, I make it a function, I have a geom text that adds the value, and then I have another geom text that adds the actual text. And then I have a theme void, because, again, I just want to strip everything out. So, if I do big number plot, where value is 19 percent, and text is single-family homes as a percent of all homes, we get that.
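A minimal sketch of a big number plot built from two geom_text() layers and theme_void(), as described above. The positions, text sizes, and y-axis limits are assumptions chosen for illustration, not the values used in the talk's code.

```r
library(ggplot2)

big_number_plot <- function(value, text) {
  ggplot() +
    # The large value on top
    geom_text(
      data = data.frame(label = value),
      aes(x = 1, y = 2, label = label),
      size = 20, fontface = "bold"
    ) +
    # The description underneath
    geom_text(
      data = data.frame(label = text),
      aes(x = 1, y = 1, label = label),
      size = 5
    ) +
    # Leave some room above and below the two text layers
    scale_y_continuous(limits = c(0.5, 2.5)) +
    # Strip out axes, grid lines, and background
    theme_void()
}

p <- big_number_plot(
  value = "19%",
  text = "Single-family homes as a percent of all homes"
)
```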
Applying brand colors and fonts to plots
Now, that's looking good, but the problem is we don't have any color, right? It looks very generic. So, let's use our brand colors. The first thing we typically do is we define our brand colors as variables. So, if you remember these colors from before, we can define these as variables in R, PSC blue and PSC red. Now, we can use these brand colors in our plots. To do that, I'm adding an additional argument in my comparison plot function called highlight color, and then I use that for the geom point where I'm actually adding the line for the town that I want to highlight. And so, now, if I do comparison plot where the highlight color is PSC blue, I get that, but if I do highlight color PSC red, I get that. So, you can see how I'm making it consistent with a function, but I'm giving myself some flexibility as well. With the big numbers, I do the same thing. I add an argument called value color, and then I use that. You can see there now it actually shows up with the color.
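A sketch of the highlight color argument. The transcript does not give the hex codes, so the two values below are placeholders and not the client's actual brand colors. The variable names psc_blue and psc_red, the column names, and the data are likewise assumptions.

```r
library(ggplot2)

# Placeholder hex codes; substitute the values from your own brand guide
psc_blue <- "#1B4F8A"
psc_red  <- "#C8323C"

# Made-up values for illustration only
towns <- data.frame(
  town  = c("Hartford", "Town A", "Town B", "Town C"),
  value = c(0.20, 0.55, 0.72, 0.64)
)

comparison_plot <- function(df, highlight_town, highlight_color) {
  highlighted <- df[df$town == highlight_town, ]

  ggplot(df, aes(x = value, y = 1)) +
    geom_point(shape = "|", size = 10, color = "grey80") +
    # The brand color is applied only to the highlighted town
    geom_point(
      data = highlighted,
      shape = "|", size = 10, color = highlight_color
    ) +
    theme_void()
}

p_blue <- comparison_plot(towns, "Hartford", highlight_color = psc_blue)
p_red  <- comparison_plot(towns, "Hartford", highlight_color = psc_red)
```

Defining the colors once as variables means a brand color change is a one-line edit, and the same pattern applies to a value color argument on the big number function.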
Last thing I want to do is I want to use my brand fonts in my plots. Typically, the first thing we do is we make a custom theme. So, here I've made an extremely simple theme called theme PSC, which is a function that starts with theme void, and the most important thing is this, the base family, which is Open Sans. So, that will set it for all theme elements. It will make them Open Sans, and then you can do any additional tweaks in the theme function if you want to. So, you can see here's what it looks like with the theme not applied, and here if I add theme PSC, it's a little bit subtle, but if you compare that 150,000 to that, the one on the bottom is actually using Open Sans.
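A sketch of a simple custom theme in the spirit of the one described above. It assumes the Open Sans font is installed and visible to the graphics device used for rendering; if it is not, R will typically fall back to a default font with a warning. The axis text tweak is an assumption added to illustrate where further adjustments would go.

```r
library(ggplot2)

theme_psc <- function(base_family = "Open Sans") {
  # Start from an empty theme and set the font for all theme text
  theme_void(base_family = base_family) +
    # Example tweak: bring back x-axis labels, keep the y-axis hidden
    theme(
      axis.text = element_text(color = "grey30", size = 11),
      axis.text.y = element_blank()
    )
}

# Made-up values for illustration only
towns <- data.frame(value = c(24000, 60000, 120000, 150000))

p <- ggplot(towns, aes(x = value, y = 1)) +
  geom_point(shape = "|", size = 10, color = "grey80") +
  theme_psc()
```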
Now, if I want to do the same thing with my big numbers, I might think to apply theme PSC, but unfortunately that doesn't change anything. The reason why is because I'm using geom text. Geom text is not actually impacted by the theme. The first way I might think to do this would be to add family equals Open Sans in every instance of geom text, but that gets annoying because you have to do it over and over and over. Instead, I really like the function update geom defaults. So, for this, you put it at the top of your script or your Quarto document, and you specify which geom you're focusing on, geom text, and then what aesthetic properties you want to use. So, family equals Open Sans. So, here's what it looks like without that applied. With it applied, now I'm using Open Sans.
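The geom default approach can be sketched like this, again assuming Open Sans is installed. update_geom_defaults() changes the default for the rest of the R session, so it is usually called once near the top of the script or Quarto document.

```r
library(ggplot2)

# Set the default font family for every geom_text() layer in this session
update_geom_defaults("text", list(family = "Open Sans"))

# No family argument is needed on the layer itself
p <- ggplot(data.frame(x = 1, y = 1, label = "19%"), aes(x, y, label = label)) +
  geom_text(size = 20) +
  theme_void()
```

geom_label() keeps its own defaults, so a report that uses labels as well would need a second call for "label".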
So, overall, this is basically what it looks like. We've got my yaml at the top in my report.qmd where I'm specifying I'm using Typst. I'm defining my colors. I'm using update geom defaults, and then I'm using my functions throughout. And this is how we get to something like this. My hope is that you, in the end, can make reports like this and end up feeling just like this guy here. I've got links to the slides, to GitHub repo if you want to see the code, as well as several examples of reports that we made. Thanks.