Sydeaka Watson | A Robust Framework for Automated Shiny App Testing | RStudio (2022)
For production-grade Shiny applications, regression testing ensures that the application maintains its core functionality as new features are added to the app. With the help of various R and Python tools that programmatically interact with the UI and examine UI outputs, regression test logic can be represented programmatically and can run as often as needed. This gives the development team an opportunity to catch and fix bugs before they are pushed to production. In this talk, I will introduce a framework for automated testing of Shiny applications both (1) during the development phase and (2) after the app is deployed. I will share a demo Shiny app along with relevant shinytest2 and Selenium code. Session: I like big apps: Shiny apps that scale
Transcript
This transcript was generated automatically and may contain errors.
Hi, my name is Sydeaka Watson, and so that's me. I work as a data scientist at Eli Lilly, the pharmaceutical company. I'm on the development team for a Shiny application that is used at Lilly. So essentially my job is to wear this hat, the app developer hat.
This isn't just a normal Shiny app. This is a complex production-grade app. We get a new version of the app released every month with new features and bug fixes. And a lot of people in my organization visit this app every day, so it's really important that the app is functioning properly all the time.
A story about manual testing pain
So with that in mind, I'd like to begin with a story about something that happened to me back in May 2021. So I was in my home office working on a new feature for the app, but of course I was multitasking because I was also keeping an eye on the activity in our team chat window. So one of the users was saying she was having some issues with the app. It wasn't populating the drop-down menu with the values that she was expecting. So the values were in the app before, and then now all of a sudden she didn't see them anymore.
So one of the other developers identified the issue and determined that we would have to quickly release a new version of the app that resolves the issue with the drop-down menu. And that's when I panicked because I realized that meant two things were about to happen.
First, it meant that I would have to put on another hat, my code reviewer hat. So as a code reviewer, I'm responsible for reviewing GitHub pull requests, have to read the code to make sure it adheres to our coding standards, and then I have to run the app in a local environment to make sure that all the old features work with the new features. I have to check the radio buttons, the drop-down menus, the contents of the tables. I have to make sure that the database API and the model API connections are still working as they used to as we added these new features. And so if it passes those tests, then we deploy the app to the test server.
But then that means I have to put on yet another hat. This time I have to put on my app tester hat. So as a member of the app testing team, I was one of several people that had to manually test the deployed version of the app. So I would have to open the app in a web browser and then run a lot of the same tests as before, check the radio buttons, check the drop-down menus, check the API connections, database connections, all those other things that I did before in the code review. And so now the only difference is that this is on the deployed version of the app as opposed to the pre-deployment version.
Now, testing is very important. It allows us to catch bugs so that we can fix them before the new version of the app is released to production. However, as one of the people responsible for actually running the tests, I can say that this was an extremely tedious, painful experience for me. It was very stressful.
I also felt undervalued because I'm thinking, OK, I have a PhD in statistics, and yet my job now is to click this radio button or check this drop-down menu. So I felt like this wasn't a good use of my time. Another issue is that I and the other app testers had other priorities. So we weren't full-time app testers. We could only run the test when we had time to squeeze testing in.
And managing all of those schedules across the different testers meant that it could take a long time for us to get through those tests and move to the next stage of our software release. So that was increasing the amount of time it took us to deploy the app. Because testing was so time-intensive, we had to apply these tests sparingly. We couldn't afford to rerun them as new features were coming in, or to cover multiple scenarios; realistically, we could only run the full suite once per release, if that.
And then lastly, we found that different testers were running the tests with slightly different conditions. So there was some subjective assessment with whether or not the test passed or failed. So one tester could say it passed, and another person running the same test would say that it failed. So that was a problem.
Discovering automation tools
So because these releases were such a heavy burden for me, I started looking for a way to make this process more efficient for me and for our team. I was already familiar with Selenium. I knew it was a great tool for automating interactions with the web browser. And I figured this might help automatically check whether the UI elements were still working on the deployed version of the app.
I was somewhat familiar with the rvest package and that it was great for web scraping. So I figured this might help to programmatically extract the contents of the web page. And I could run some sanity checks against it to make sure everything's rendering properly so Selenium and rvest could work together on the deployed version of the app.
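To make the rvest idea concrete, here is a minimal sketch of that kind of sanity check. The element ID and the expected choices are hypothetical; in practice the HTML would be fetched from the deployed app's URL rather than from a literal string.

```r
library(rvest)

# In practice the page would come from the deployed app, e.g.
# page <- read_html("https://example.com/myapp")  # placeholder URL
# Here we parse a static snippet so the check runs offline.
page <- read_html('
  <select id="flavor">
    <option>Vanilla</option>
    <option>Chocolate</option>
  </select>')

# Extract the dropdown choices from the rendered HTML.
choices <- page |>
  html_elements("#flavor option") |>
  html_text()

# Sanity check: the values users expect are still rendering.
stopifnot(all(c("Vanilla", "Chocolate") %in% choices))
```

Pointing `read_html()` at the deployed app's URL and running checks like this after each release is exactly the kind of programmatic extraction described above.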
My colleague Eric Nantz recommended that I use the shinytest package to test the code that other developers were submitting in their pull requests. He figured that it might help to automate some of the code reviews. And then, of course, later I found out that there was a newer version of the shinytest package called shinytest2, which somebody will speak about later in this session. And finally, the shinytest documentation recommended that we also use the testthat package.
And so now I had all these tools. There was a bit of a learning curve, but over the next three months I was able to use those tools to create an automated testing pipeline.
Impact on the software release workflow
Let's see how the software release workflow changed after introducing these tools. So in the code review stage in blue, we see the manual pre-deployment testing during code reviews was replaced with the shinytest2 script that automates these same tasks. And now the code reviewer hat sits on the head of the bot that runs the test script.
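A pre-deployment test of that kind might look like the following shinytest2 sketch. The app directory, input IDs, and snapshot name here are assumptions for illustration, not the talk's actual code.

```r
library(shinytest2)

test_that("dropdown keeps its expected behavior", {
  # Launch the app from its directory in a headless browser.
  app <- AppDriver$new(app_dir = ".", name = "dropdown-check")

  # Interact with the UI the way a code reviewer would by hand.
  app$set_inputs(flavor = "Chocolate")   # hypothetical input ID
  app$click("add_ingredient")            # hypothetical button ID

  # Snapshot the input/output values; on later runs this is
  # compared against the stored baseline JSON.
  app$expect_values()
  app$stop()
})
```

The first run records baseline snapshots; every subsequent run replays the same clicks and fails loudly if the UI values drift, which is what lets the bot wear the code reviewer hat.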
And then in orange, in the post-deployment testing stage, the manual regression tests I was running are replaced with a suite of Selenium test scripts. And then I get to remove that heavy app tester hat from my head and place it on the bot, who is happy to run those test scripts. At least it doesn't complain, even if it isn't happy about it.
So those two changes had a significant impact on our software release workflow. Our code reviewers and app testers had less work to do. They weren't feeling stressed out with each new release. We didn't have to worry about testers balancing their testing duties with other high-priority work, so we ended up shortening our time to deployment from five days to five hours, which was pretty huge for our testing team.
We could apply our testing criteria as often as needed, no matter how many releases we had during the month or how many feature pull requests we merged. And because the test logic was contained in the test script, we could apply that logic consistently from one test run to the next, which removed the variation in test results that we had been seeing.
The automated testing framework
So I've talked about this before on the previous slides, but here's where I get to formally introduce the automated testing pipeline that I'm proposing. It contains four key principles. First, I believe you should automate the tests. You should actually have some sort of programmatic representation of your test logic.
We could easily fill an hour, or maybe a day, talking about this: things like how do you recover safely from errors? How do you appropriately log the test results? How do you maintain a repository of all your test artifacts, maybe screenshots or files that were downloaded in the course of running the app? That's what I mean by automating the tests: all of that.
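The error recovery and logging pieces can start as simply as a small wrapper like this sketch; the function and file names are illustrative, not from the talk.

```r
# Run one test step, catch failures instead of aborting the whole
# suite, and append the outcome to a log the team can review later.
run_step <- function(step_name, step_fn, log_file = "test_log.csv") {
  result <- tryCatch(
    { step_fn(); "PASS" },
    error = function(e) paste("FAIL:", conditionMessage(e))
  )
  cat(sprintf("%s,%s,%s\n", format(Sys.time()), step_name, result),
      file = log_file, append = TRUE)
  result
}

# A passing step returns "PASS"; a failing step is logged and
# returns "FAIL: <message>" instead of stopping the run.
run_step("dropdown has choices", function() {
  stopifnot(length(c("Vanilla", "Chocolate")) > 0)
})
```

Screenshots and downloaded files can be written to a timestamped artifacts directory by the same wrapper, which gives you the test-artifact repository mentioned above.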
Next, it's important to have one or more development servers that are completely separate from your production server. That allows us to see how the updated app works without breaking the production version that end users are interacting with. The next two principles have to do with when to run the tests. First, I recommend running pre-deployment tests so that you can integrate new features safely on your development server without crashing the version of the app that end users are using. And finally, I recommend running post-deployment checks, also on the development server; this is where we can most closely mimic what end users will experience when they open the app in a web browser.
So again, pre-deployment and post-deployment: that's really important. One question that comes up is whether automation makes sense for a particular use case or web app. To answer it, I usually ask a couple of follow-up questions. The first: is the test logic consistent from one run to the next? If you can write it down as step one, step two, step three, and not have to worry about it changing from one run to the next, that's one consideration.
The second question is whether the test logic can be completely scripted, so that you can get the result without any subjective judgment or a human in the loop making that decision. There are other considerations, obviously: do you have the infrastructure to support automation tools like shinytest2 or Selenium? Do you have the time? There's a learning curve and a time investment in writing the test scripts. But in general, if you answered yes to both of those questions, I would say your app is a prime candidate for automation.
Working example: the ice cream app
So I prepared a working example that you can play around with on your own. It's a simple app that allows you to create an ice cream recipe using various ingredients like peanuts and raisins and chocolate syrup. And for the adventurous, if you want to add some chicken nuggets to your ice cream, no judgment here, but we allow you to do that. So the GitHub repo linked at the bottom has all the code needed to run the app. It also has the shinytest2 and the Selenium code that will allow you to just kind of play around with this in your own environment. So I definitely encourage you to check it out.
I created a couple of screencasts that demonstrate how this works. First, I'll show you the pre-deployment testing with shinytest2. If you're running the shinytest2 script in the RStudio IDE, you'll see a Run Test button visible at the top of the script. And if you click that, you'll see some activity over here in the Build tab that shows the progress of the test run.
This test script automatically opens the local version of the app in a web browser window. I didn't do that; it did it on its own in Google Chrome. You'll notice that it's selecting various checkboxes, entering text in the text field, and showing how the app changes in response. All of those things happened automatically. When I go back to the RStudio IDE, I see that the test results folder now contains a few files that were saved during the test run. If I click on one of those, I've got screenshots showing what the app looked like at various points in the test run. There's also a JSON file that shows the values of various UI elements as they were at the time the snapshot was created.
If you stick around for the next talk in this session, you'll actually get a chance to see the shinytest2 package author give a presentation about shinytest2. So I didn't spend a whole lot of time on those details in this presentation, but I highly recommend that you check it out.
And finally, I'll show you what the post-deployment version of the testing looks like with Selenium. As before, we click the Run Test button at the top, and we see the test run's progress in the Build tab in the upper right corner. Before, a web browser window would open automatically in Google Chrome. But this time it's a headless browser, so we don't actually see that browser. Instead, it just works under the hood and silently executes the test instructions that I provided.
And so we'll just see the mouse that I can't see anymore for some reason. Where are you, mouse? I see it on your side, but not my side for some reason. Give me mouse. OK, there we go. I'll pause that here.
So what you saw, and unfortunately I wasn't able to narrate it while I was looking for my mouse, was the script running programmatically. It captured some screenshots along the way as it progressed through the various test steps. And in the log area, it let you know what was going on in the app: at this point it was selecting the flavor, at another point it was entering the recipe name, and so on.
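A post-deployment script along these lines can be written with RSelenium, the R binding for Selenium. This is a sketch, not the talk's code: it assumes a Selenium server running on localhost port 4444, and the app URL and CSS selector are placeholders.

```r
library(RSelenium)

# Connect to a Selenium server (assumed running on port 4444)
# and drive a headless Chrome session.
remDr <- remoteDriver(
  remoteServerAddr = "localhost", port = 4444L,
  browserName = "chrome",
  extraCapabilities = list(
    "goog:chromeOptions" = list(args = list("--headless=new"))
  )
)
remDr$open()

# Visit the deployed app (placeholder URL) and exercise the UI.
remDr$navigate("http://localhost:3838/icecream")
flavor <- remDr$findElement(using = "css selector", "#flavor")
flavor$clickElement()

# Capture a screenshot artifact for the test log.
remDr$screenshot(file = "post_deploy_flavor.png")
remDr$close()
```

Because the browser is headless, a run like this produces only the log lines and screenshot artifacts, which matches what the screencast shows.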
So this was all very easy and straightforward thanks to the tools that some of the other developers have created. So I thank them for that.
Closing thoughts
So in closing, I'd like to re-emphasize that these tools make automated testing much more accessible to the general population of data scientists, especially people like me who don't have any formal training in software engineering or software testing. We're expected to know and do a lot across many different areas, and these tools make that much easier for us.
Another thing that's out of scope for this talk, but that I get asked about all the time when I present on automated testing, is how this fits into a CI/CD pipeline. It definitely does, and there are lots of tools you can use. For example, with GitHub Actions, when a person submits a pull request, you can programmatically run your tests on the pre-deployment version of the app, merge the pull request, deploy to the next test server, and then programmatically run the next set of scripts. So you can do a lot of these things very easily.
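As a sketch of that wiring, a GitHub Actions workflow for the pre-deployment step could look like the following. The workflow name, action versions, and test command are assumptions for illustration.

```yaml
# .github/workflows/test.yaml (illustrative)
name: shiny-app-tests
on: pull_request

jobs:
  pre-deployment-tests:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - uses: r-lib/actions/setup-r@v2
      - uses: r-lib/actions/setup-r-dependencies@v2
      - name: Run shinytest2 suite
        run: Rscript -e 'shinytest2::test_app(".")'
```

A second workflow triggered on merge could then deploy to the test server and kick off the post-deployment Selenium scripts.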
So I'd like to thank my colleagues at Eli Lilly and the Design Hub Data Insights, Information and Digital Solutions, and the Statistical Innovation Center teams for their help. And I'd also like to thank Wardell Piquette at Piquette Studios for his help with the graphic design in my presentation. And here's my contact information. Feel free to reach out if you'd like to discuss anything. And I thank you for your time. Thank you.