R at AstraZeneca: upskilling our workforce through education, experience, and exposure
We were joined on November 29th at 12PM EST by Gabriella Rustici & Guillaume Desachy, who shared their experience about the R journey AstraZeneca is currently on.

Resources:
⬢ R @ AZ: Building a Community in the Pharmaceutical Industry Blog Post: https://www.rstudio.com/blog/building-a-community-in-the-pharmaceutical-industry/
⬢ R in Pharma YouTube videos: https://www.youtube.com/c/RinPharma
⬢ Posit Pharma Site: https://posit.co/solutions/pharma/

Timestamps:
4:53 - Start of session
5:41 - Paradigm shift in the pharmaceutical industry (many people are multilingual)
6:40 - Profile of R users at AstraZeneca (varied across data science, clinicians, medical director)
8:28 - Meet the R&D Learning & Development Team
10:03 - 3E Framework: Education, Exposure, Experience
12:11 - Bridging the science community and data science audience
13:11 - We all learn differently (solutions that suit different needs & styles)
17:31 - Index of learning (self-led index, synchronicity index)
12:20 - Experiential Learning
20:47 - The community of R users at AstraZeneca
21:05 - The early days (April 2021)
21:52 - azTidyTuesday: a playground to hone data viz skills
24:15 - internal R conference
25:38 - R function of the month
27:04 - Lunch & LeaRn
28:20 - R @ AZ 10:1
29:44 - Communication expanded from internal social media to R @ AZ Monthly Newsletter
31:57 - Workshops with Posit
32:11 - AZRHotdesk - come with your questions and someone will help you solve it
35:02 - Wish list for 2023
38:38 - Start of Q&A section

Abstract: The use of R continues to become more and more important at AstraZeneca. It is a true paradigm shift that we have embarked on! This shift has required upskilling our workforce to make them proficient R users. To do so, we are leveraging the 3Es of learning: education, experience and exposure.
Learn more about their team's Data Science Educational Program and how the team at AstraZeneca has built their own strong community of R users - where learning takes place through experience and exposure.

Speaker bios:
Gabriella is Data Science Learning Senior Director in AstraZeneca's R&D Data Science & AI, where she is responsible for developing a strategy for, and creating a centralised approach to, data science learning for R&D. Gabriella completed her PhD at the Wellcome Sanger Institute and previously ran bioinformatics training programs at the University of Cambridge and the European Bioinformatics Institute, in the UK. She is passionate about designing, implementing and evaluating effective and scalable solutions to educate scientists and data science practitioners at all career stages.
Guillaume is passionate about helping bring new medicines to patients by leveraging the power of statistics and precision medicine. Since October 2020, he has been doing so at AstraZeneca, where he works as a Statistical Science Director. In addition, since March 2022, he has been leading a team of 15 collaborators focusing on building the community of R users at AstraZeneca, called R @ AZ.

During the event you can ask questions anonymously through Slido here as well: rstd.io/meetup-questions
Blog post on R @ AZ, Building a Community in the Pharmaceutical Industry: https://www.rstudio.com/blog/building-a-community-in-the-pharmaceutical-industry/
Please note the recording of this session will be shared at the same YouTube Live link.
Transcript
This transcript was generated automatically and may contain errors.
I'll wait a few seconds here until we go live over to YouTube. I think we've just gone live over to YouTube right now.
Hi, everybody. Welcome to the Posit Enterprise Community Meetup. I'm Rachel. I lead our customer community here at Posit. This is a friendly and open meetup environment for teams to share use cases, teach lessons learned, and just meet each other and ask questions. If you've joined us here at the meetup before, put a two into the YouTube chat. If this is your first time, put a one into the chat.
Together, we're all dedicated to making this space inclusive and open for everyone, no matter your experience, industry, or background. During the event, we will certainly have time for Q&A. And you're able to ask questions live on YouTube Live through the chat. And I can see them and save them and star them here, but also anonymously through Slido, which I will share on the screen in just a second here.
Our presenters today also have a few questions for you, too. So there will be a poll through Slido as well. To address one of the popular questions up front, yes, the recording will be available. And it will actually be ready right as the presentation ends. So at the same exact YouTube Live link.
But anyways, I am so excited to have you all here with us today. At Posit, we get to see firsthand the shift to open source across the pharmaceutical industry. I'm joined by my colleague, Jason Milnes, who is helping out in the background as well. Jason works with many of our life science customers today.
We are so excited to have both Gabriella Rustici and Guillaume Desachy here with us today to share their experience and learn more about the R journey that AstraZeneca is currently on. Gabriella is Data Science Learning Senior Director in AstraZeneca's R&D Data Science and AI, where she is responsible for developing a strategy for and creating a centralized approach to data science learning for R&D. Guillaume is a statistical science director at AstraZeneca, passionate about helping bring new medicines to patients by leveraging the power of statistics and precision medicine. He has also been leading a team of 15 collaborators focused on building the community of R users at AstraZeneca called R @ AZ.
Introduction and paradigm shift in pharma
So thank you. Thank you very much, Rachel, for having us today. So before we get started, we actually have a couple of questions, and Rachel will be sharing a Slido link with all of you because we want to get to know who you are, we want to get to know where you are connecting from.
My name is Guillaume, and I want you to picture a scene. You are learning a new programming language, and I won't tell you how long ago I first started learning how to program in SAS because both Gabriella and I decided, okay, let's not disclose it. You first learn to use this programming language, and for me, it was actually SAS. And back then, R kind of felt like a faraway land.
I'm mentioning it because I think there's been a true paradigm shift in the pharmaceutical industry. And back when I learned how to use SAS, I think there was a strong divide between SAS and R. And you basically had to choose your camp. Either you were a SAS user or you were an R user. But I think things have really changed in the industry. And now things are really different because you can read an actual SAS dataset from R, and there's real interconnection between the two languages.
But the other thing that I find very interesting is that our workforce, the new people that we hire, they're actually multilingual. They speak different programming languages. And most of them, they speak R and SAS or R and another one. And that's what is really fun.
Profile of R users at AstraZeneca
Talking a bit more about our workforce and the profiles of our users, I think we can think of the usual suspects. Because we can think of the statisticians, we can think of the programmers, we can think of the data scientists. But what is very interesting is that, at AstraZeneca, the R users are actually much more than the usual suspects. Because we've got clinicians using R, we've got medical directors using R, we've got information practice scientists using R. So we've got a lot of people using R. And it is this diversity which makes it very fun.
So we're there. So Gabri, you've got the answer to your question. You wanted to know how people had first learned how to use R. Well, it seems like a lot of people are self-taught R users who never actually followed any kind of formal training.
The Data Science Academy and the 3E framework
Yes, indeed. Thank you, Guillaume. And I don't think this is different from what I had expected, actually. I know lots of programmers teach themselves how to program. Not everybody is fortunate in their workplace or throughout their career to have a wonderful, dedicated learning and development team, as we are fortunate to have in DS&AI here at AstraZeneca.
So maybe you'll find some of these people on LinkedIn and you'll connect with them. Let's just say that although, you know, the success of our program is also linked to the fact that we have a very large group of trainers. So again, a training program wouldn't be possible if it wasn't for the time and commitment that the instructors give us.
So let's move to the next slide. What do I run? So the learning and development team that I represent aligns their activities and initiatives to the 3E framework. So we all know that the best way of learning is through experiencing something, and through being exposed to others and the way that others work and connect. And the smallest component should be education, you know, but often education might help get us started, to gain a little bit of confidence to then go out there and experience and expose ourselves to the way others are working.
The project that I lead within AstraZeneca is called Data Science Academy. So Data Science Academy is an R&D-wide educational resource in everything that pertains to data science and AI. You can already imagine that R features prominently in this. So what we try to do is to cater to all of the needs of the people that Guillaume had on his last slide. So we recognize that there are many different individuals in different roles with different needs, some that require, you know, a little bit of data science awareness, some that require deep technical skills to carry out their role.
But let's also keep in mind that this is not just about technical training. This is also about how do we bridge the gap between scientists and data scientists? How do we improve the way that we communicate? And we try to tailor everything to the needs of our therapy areas and functions. So again, you relate to training better if we are using examples that are close to your heart.
As I said before, we recognize the fact that there are many aspects of data science, some that are more relevant to scientists, some that are more relevant to data scientists. We have awareness-raising level training where there is no coding. It's just more about understanding how data science is used, the ethical aspects of working responsibly with data and artificial intelligence, and maybe, for the bravest, starting to dip their toes into coding.
If you look at the right hand side of the slide, this is, you know, the component of the training that is more technical. So learning how to use R, how to use Python, then starting to apply these programming languages in the context of statistics, in the context of machine learning. How do we move then to the next level and adopt coding best practices so that we can collaborate and share code effectively? And then, for example, how we use R in the analysis of transcriptomics data, analysis of proteomics data, so looking at the analysis of particular data types.
Moving to the next slide, we also recognise the fact that we all learn differently. Some people are more inclined to self-study, and for that reason, we have several educational resources that are available on our e-learning platform. They are curated by subject matter experts within the organisation, and we sort of bring to learners a digested set of learning pathways to get them started in many of the topics that we discussed above.
Our best-selling point is indeed our portfolio of virtual instructor-led courses. So this is our traditional classroom experience, although we do a lot of our activities virtually still, until we go back to the face-to-face. These are your multi-day courses, a mixture of lectures and practicals, where you obviously have the possibility of interacting with the trainer, interacting with a mentor, and furthering your knowledge by starting to experience, by doing exercises.
And this is where also our existing collaboration with Posit Academy lies, because we have done and delivered some training in collaboration with them around basic R training. And then we bleed even more into the experiential learning, although I won't talk too much about this today, with our apprenticeship programme, some of our project-based work, and some of our mentoring. Again, thinking about how do we learn on the job, how do we learn by concentrating on specific projects that are offered to our learners, and how do we learn by talking to mentors?
Just by looking and reiterating the fact that R training is a major component of our virtual instructor-led courses, I launched Data Science Academy shortly after joining AstraZeneca in 2020. So we have been running our programme since then, and actually the R training is probably the first training that we offer. 41% of our courses focus on R. As I said before, we range from basic programmes, statistical analysis, machine learning, analytics.
We offer in excess of 900 places on our virtual instructor-led courses, with lots of training for SAS programmers learning to use R. We have done a specific training programme for them. And as you see, there is a little glimpse of what our learners say six months after attending a course. Attendees have definitely improved their coding skills and their ability to handle data. So we can definitely see tangible effects of learners attending courses, and then also starting to implement what they learn in their day-to-day job.
Diversity of learning styles
Thank you very much, Gabriella. And I think you said something very nice regarding the diversity in the way we learn. And I think it's very true. And if we think a bit about it, some of us like learning by watching videos, for some of us it's going to be by opening a book, and for some of us it's going to be by being in the classroom, even though these days it's a bit harder to be in a physical classroom.
And the way I kind of like to think about these things is with two indices. So the first one, I'm calling it the self-led learning index. And at the top of it, it means that you are going to be learning by yourself. And at the bottom of it, people are going to be taking you by the hand, someone is going to be helping you out. And then there's the synchronicity index. On the right hand side, if you miss the event, it's too bad, you won't be able to catch up. But on the left hand side, you are going to be able to catch up because this kind of event or this kind of training is asynchronous.
And now going back to what Gabriella was actually talking about, we can fit the various kinds of trainings and various kinds of resources in these four quadrants. And if we think about the self-study resources, they are very much asynchronous and they are very much self-led, as their name suggests. Then if we think about the virtual instructor-led courses, someone is taking you by the hand. You're not in the classroom per se, not physically, but still someone is helping you out. And last but not least, we've got the experiential learning opportunities.
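The two indices described here can be read as a small classification scheme. The sketch below is only an illustration of that idea in Python and is not something shown in the talk: the names `Pace`, `Timing`, `LearningFormat` and `by_quadrant` are hypothetical, and the timing of the virtual instructor-led courses is an assumption, since the talk only says that those courses are guided.

```python
from dataclasses import dataclass
from enum import Enum


class Pace(Enum):
    """Self-led learning index: top of the chart is self-led, bottom is guided."""
    SELF_LED = "self-led"
    GUIDED = "guided"


class Timing(Enum):
    """Synchronicity index: left is asynchronous, right is synchronous."""
    ASYNCHRONOUS = "asynchronous"
    SYNCHRONOUS = "synchronous"


@dataclass(frozen=True)
class LearningFormat:
    name: str
    pace: Pace
    timing: Timing

    def quadrant(self) -> str:
        """Return the quadrant label, for example 'top-left'."""
        vertical = "top" if self.pace is Pace.SELF_LED else "bottom"
        horizontal = "left" if self.timing is Timing.ASYNCHRONOUS else "right"
        return f"{vertical}-{horizontal}"


FORMATS = [
    # Stated in the talk: self-study is asynchronous and self-led.
    LearningFormat("self-study resources", Pace.SELF_LED, Timing.ASYNCHRONOUS),
    # Guided is stated in the talk; SYNCHRONOUS is an assumption made here.
    LearningFormat("virtual instructor-led courses", Pace.GUIDED, Timing.SYNCHRONOUS),
]


def by_quadrant(formats):
    """Group format names by the quadrant they fall into."""
    grouped = {}
    for fmt in formats:
        grouped.setdefault(fmt.quadrant(), []).append(fmt.name)
    return grouped


if __name__ == "__main__":
    for quadrant, names in by_quadrant(FORMATS).items():
        print(quadrant, "->", ", ".join(names))
```

Keeping the two indices as separate fields, rather than storing a quadrant label directly, mirrors how the speaker describes each initiative: first by who leads the learning, then by whether you can catch up later.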
The R @ AZ community of R users
So let's move on with the community of R users at AstraZeneca. So the way we called it, we called it R @ AZ. And as with all things related to R, we decided, okay, if we are going to be doing an R community, we're going to have some hex stickers. And this is the logo of R @ AZ.
Now, when we started it, and when we started it, I'm talking about the early days, but the early days were not that long ago. It was a bit more than a year and a half ago. So a bit more than a year and a half ago, a few of us got together and we brainstormed. And we were like, okay, what could we do to get a community of R users started at AstraZeneca? And we jotted down some ideas. And one of them was, what about R Tidy Tuesday? Another one of them was, what about a weekly blog post? And another one of them was, what about an internal R conference?
Now, we knew that it was going to be taking time. And that every single idea that fell on a piece of paper was going to be taking a whole lot of time. But we were actually like, okay, let's explore these ideas. And the first one we started was something that we branded as AZ Tidy Tuesday. So if you've heard of Tidy Tuesday, AZ Tidy Tuesday is inspired by Tidy Tuesday. But we made it our very own.
And when I say we made it our very own, basically what we did is that each time we were promoting a dataset, a publicly available dataset, we were framing it. We were telling a story around something going on internally within AstraZeneca or around one of our values or framing it around something going on externally. And this was happening once a month.
And the way we were doing this, we were promoting the dataset and the story on our internal social media, which is something which is supported by Facebook. So we were promoting this dataset. Collaborators had about two weeks to submit their entry. And a few days after the deadline, we were actually promoting one of the entries and there was a featured contributor. After 25 editions, we decided to retire AZ Tidy Tuesday.
Now, let me go back to the diversity of learning. AZ Tidy Tuesday fits on the top left. Because you can go back in time: all the code, all the images, all the graphs are actually stored on SharePoint Online. So if you want to go back in time, you can go back to edition number one. And it's a mix of self-led learning and also people walking you by the hand. Because we were doing some code review for every single entry for the first few editions of AZ Tidy Tuesday.
But as I was saying, very early on, we were like, okay, maybe we could organize an internal R conference. We knew it was going to be a big piece, but we also knew that it was going to be good for the community. In a nutshell, it was a half-day event. We had 10-plus hours of content related to R, 30 speakers, close to 30 posters, and all of these from close to 100 authors. And we were very fortunate to have Max Kuhn, now from Posit, formerly known as RStudio.
And it was the first time we were doing this. So we didn't know how it was going to be received. We didn't know if people were going to be showing up. We were blown away. And actually, the feedback was stellar. And we had more than 500 participants in this internal R conference. So seeing this, we're like, OK, we can't just stop here. We have to keep going. And that's what we did in 2022.
And one of these initiatives is the R function of the month. So the concept is that we detail one R function. And by we, I actually mean Daniel here, because this is led by Daniel. And he does this in the form of a blog post. It happens once a month. And again, we leverage our internal social media. So on our internal social media, we do a Workplace post, a Facebook post. And that's actually what it looks like. So it's an R Markdown document. It's a .html. And people can go back to it.
So that's the R function of the month. And if we go back to this, then this is pretty much self-led, because you are going to have to make the effort to go through this .html. But you can pretty much go back in time, because it's very easy to consult this and to have a look at this.
Now, another initiative that we kicked off this year, and this is led by Gustav Satterstrom, is something that we're calling the Lunch and Learns. And the whole idea of the Lunch and Learns is that we know that time is scarce. So we decided, OK, let's tackle one topic at a time, and let's have 30-minute sessions. So about a 20-minute presentation, 5-10 minutes Q&A. And same thing, happening once a month, and having inclusion in mind while alternating between the US and Europe.
And so for these kinds of events, we had a mix of internal speakers and external speakers. And, for example, in here, we had a presentation from Monika Huynh, who is one of our collaborators, and she talked about Plotly. But some other people talked about R Markdown, or some other people talked about many other topics, about packages that they had developed.
So that's the Lunch and Learns. And again, the Lunch and Learns are very much synchronous events, because if you miss this, it's going to be hard. You're going to be able to catch up, but you will, again, lose the interactivity.
Now, let me move on to something that we branded as the 10-to-1s. So the 10-to-1s are led by Tom Marlow. And the idea with this was, OK, we need to federate this community, and we want people to get to know each other. So let's ask R users 10 questions. And each month it's going to be the exact same 10 questions. And again, it's once a month.
And what I really like with this initiative is that you can flip through the different 10-to-1s, like you would flip through a magazine. And you can see the diversity. And that's where you see that everyone learns differently. Everyone has different kinds of tips on how to learn R. And if we look at this, the 10-to-1s, you can pretty much go back in time, because you can be, OK, I'm going to be catching up on the 10 past editions. People are going to be taking you by the hand, because the tips are very precise. So it's a gem, because you can very easily apply this to your day-to-day work.
And initially, I was saying that we were leveraging our internal social media, and that we were doing all our communications via our internal social media. And this was true until very recently. And again, it's a question of inclusion and diversity, because we know that not everyone is checking our internal social media. So we decided, OK, let's launch a monthly newsletter. And Shili Zhang is the one leading this. And the idea is a newsletter. So you've got all news, internal and external. It's one single email. We're not spamming you. And it's happening once a month, by the middle of the month.
And basically, Shili is collating everything that has happened in the past month, every single R @ AZ initiative that has happened in the past month. And also what has been going on externally: is there, let's say, an R in Pharma event, is there a meetup coming up? These kinds of things.
But that's not it, because we've got other kinds of initiatives. And one of them, we are calling it, let's meet each other. And again, it goes back to getting to know each other. Because I think at the end of the day, that's all that matters. We want, as human beings, we crave these connections. We want to know each other. We want to build connections with other people. And then these connections are going to be helping building connections across departments.
So that's what is let's meet each other. It's a very informal get together. It's a 30 minute session. You connect, you don't connect, doesn't matter. And you get to meet other R users in the organization. Sometimes we talk about R, sometimes we talk about other things. One thing we discussed, for example, was, how do you, what do you think about Posit, about the new name? Another time we talked about the holidays. Another time we talked about our favorite R package.
And then recently, we also did some workshops jointly with Posit. And we've done two so far. And we are very much looking forward to doing more with Posit, because they were very well received in the organization. And last but not least, the very latest initiative that we launched is something that we called the AZR Hot Desk. And the whole idea in here is, you come with your question. It's a face-to-face kind of event. You come with your question, and we're going to be helping you to answer this R-related question.
So that's what we're doing in the community of R users at AstraZeneca. Now, of course, all of these initiatives, and in total, we've got, in 2022, we had eight initiatives going on at the same time. Now we've got seven, because Tidy Tuesday, AZTidyTuesday retired. But I won't lie, it does take quite a substantial amount of time.
I'm the one doing the talking today, but behind the scenes, there's a great team working fabulously. And they did an amazing job throughout the year. And I don't think we would be here today without their dedication, without their passion, and without their commitment. So I want to thank them.
If you want to know more about the community, we've written a blog post, which is available on the Posit blog. So you can type this on your favorite browser, and you'll be able to find this. Otherwise, you can flash the QR code, and you will also land right there.
Wishes for 2023
So I think, Gabriella, I've got a question for you. So we talked about what we did in 2022, both in terms of trainings, and you talked about the importance of the diversity of learnings with education, experience and exposure. But what is on your wish list for 2023?
Well, I think one of the important things is we can train lots of people on how to use R, but then we need to make sure that people use the skill that they have acquired: use it or lose it. So how do we make sure that there are opportunities for them to do so? How do we make sure that there are projects lined up once the skill has been acquired? How do we make sure, actually, that a lot of the training that we develop is developed in response to the need of a particular group, a particular function, a particular set of individuals?
And then we are obviously looking at the opportunity of strengthening our collaboration with Posit. So we've actually been discussing seconding someone from Posit to AstraZeneca to better understand some of our learning needs and develop some solutions. So we're really looking forward to this opportunity for 2023.
But Guillaume, I'm curious to ask you the same question. What is on your wish list for 2023?
Well, thank you very much, Gabriella. I think for me it's going to be the connections, and I'm going to be making the parallel with the picture I showed at the beginning. I think we are in a very nice place at the moment because we've got more and more connections between SAS and R.
Now, the other thing I'm very much looking forward to in 2023 is all the connections that are going to be made through the various departments at AstraZeneca. And we often talk about how do we evaluate a community and how hard it is to actually evaluate a community. I think all these connections at the end of the day do help the business, do help the organization. Because if you know that your friend that you met at Let's Meet Each Other or that you met at any other kind of event is also working on developing, let's say, a Shiny app, but you've never done this, then you might be able to reach out to that person instead of learning it the hard way. So that's what I'm very much looking forward to in 2023. So it's these bridges, these connections between people and between departments.
Q&A
Oh, I was muted. Thank you so much for joining us, Gabriella and Guillaume. It's really amazing to hear about all the different ongoing initiatives at AstraZeneca and about this growing community, too. I see a lot of love for AstraZeneca in the chat and a few people from your team on as well. There are so many great questions here.
So I can take this one. I can talk about the R community. So one thing that I did not mention is that today I talked about the R community, but we actually have other communities. So there is, for example, an imaging community. There is, for example, a Python community. Now answering Jeremy's question, the community is an AstraZeneca-wide community. So that's why it was and it is important for us to have these diverse sets of initiatives so that we try to meet the needs of everyone. And when you think about it, it's actually not that easy because if you sit in R&D, if you sit in research and development, then when you think of a topic, intuitively, you're going to be thinking of a topic which is research and development-oriented. So I think you have to be proactive and you have to make conscious decisions so that you include everyone.
I see on Slido there's a question that got a lot of upvotes. How do people at AstraZeneca find balance between training versus doing their actual job? And how has AstraZeneca leadership supported and enabled this balance?
I think it's an interesting question. Finding balance at the end of the year is what everybody's looking for. But I definitely have to say that AstraZeneca leadership has been very supportive, particularly when it comes to specific programs that we have developed. We had buy-in from them from the beginning in helping us steer the development of some of these programs as well.
We do try to raise line managers' awareness of the importance of ring-fencing time for learning, and especially encourage people, now that we have all of these resources and opportunities, to think about how training can help them achieve goals in their individual development plans. So we can improve that because we haven't necessarily found the perfect balance. But I have to say that support from leadership is there, and I think it's quite crucial to make sure that these training programs land and people have the chance of taking advantage of them.
Another popular question I see was from Dan over on Slido. Was there any issue with regulators, FDA, EMA, when you switched to R from SAS?
So I think, so we've not made a complete switch to R. There are other companies in the industry that decided to make this leap. We've not made it yet for a variety of reasons. And I think whether we like it or not, and I'm not saying us as an industry, SAS is going to remain a key player in the years to come. And again, for a variety of reasons, because if we take, it can be AstraZeneca, it can be any other pharma company. If you've got SAS macros, if you've got SAS programs that are developed, it's going to be taking quite a substantial amount of time before you can actually transition to another kind of programming language.
So we've not completely made that shift, and I think if we do decide to make that shift, it's going to take a bit of time. However, in our workforce, what we see is that there are more and more people using a wide variety of languages.
No, I think I very much agree with what you say, Guillaume. I think there is a recognition anyway that regulators are now fully aware of R and the power of R, so things will change in the future. And for the moment, as Guillaume said, we are validating some of the work in R and getting people prepared for that switch, basically.
Another question over on Slido is, a challenge I find is having enterprise tools for R available for people to use after learning or training. How have you dealt with this at AstraZeneca?
So I think, for R, we have several platforms that are available to our data scientists and researchers. I mean, R is obviously one of the tools that can be easily installed from our software store, so anybody can install it. We have an RStudio server. You know, we have computing resources where R is available. So depending on your preferred way of working, there should be a solution available for you and you should definitely be able to have access to R through these various platforms that are available.
I see Joshua has asked, thank you both, Guillaume and Gabriella, for this excellent talk. I'm curious as to what role machine learning with R has in R&D at AstraZeneca.
So actually, it is true that by default, one would think machine learning equals Python. And it's possibly very true that lots of our data scientists that are working in machine learning might have chosen Python. I can say that the first course that we offered in machine learning was machine learning in R through some of my former collaborators at the University of Cambridge.
No, I agree with you and I'm going to be alluding, I can't remember who said that in pharma, but I think, you know, in 2022, I think what is important is that we've got a set of tools. We've got, historically in the industry, in the pharmaceutical industry, it's been SAS. So we've got SAS, we've got R, we've got Python. Some of us use Julia. I think depending on what you do and depending on the requirements from the regulatory authorities, then you've got this set of tools and you can tap on one tool or the other.
And I don't think, and I think that's actually quite important, I don't think we should limit ourselves to one programming language, because one programming language is going to have some strengths that the other doesn't have. And again, I think it goes back to personal preference. Some of us prefer doing machine learning with R and some of us prefer doing machine learning with Python. It depends on what you're most comfortable with. As long as you get the job done.
Thank you. I see a few questions that touch upon a similar idea, but one was, would the AstraZeneca's Data Science Academy be available to the public to gain the expertise? And maybe I'll also ask, are there any other community resources you recommend too?
In the plans that we currently have, the resource itself is only an in-house resource. What I can say though is I am actually working on some training activity, more in the format of summer schools or winter schools, where we can actually bring together academics and colleagues from AstraZeneca. I'm currently working on a summer school for next year to take place in Cambridge that will be open to academics and members of the private sector. So opportunities like that, I'm really looking forward to opening up more in the future.
And in the same spirit, for some of our community activities, we also invite external speakers. We really try to increase the flow of knowledge and networking opportunities between academia and industry. Let's not forget that there are many other programmes that we did not talk about today. We have PhD programmes, we have internship programmes. So there are other opportunities where we can create collaborations. So I would say probably stay tuned for the first AZ summer school.
Somebody else asked, how do you determine which people go through Posit Academy versus in-house training? Our team is exploring Academy now. So I think actually it's an interesting question. So we were actually some of the beta testers of Posit Academy. So we were one of the first to take advantage of the programme. And I think one of the strengths of this programme is that the training is developed around a particular project or a particular objective. So the individuals that applied to attend that course also needed to propose a project that they would work on and around which the training was crafted.
So I think we went on selecting individuals that also had quite a timely need to learn R. They had the need to implement the skill. We try to also guarantee sort of an equal representation of participants from our different research areas. So that was kind of the approach that we used. So we actually do not normally do selection for participation in the courses, whereas in this case we did, so that we would get, you know, the most motivated individuals with the most pressing need to acquire the skill and with a project lined up to apply it to. So I would say selection was quite key in this case, and we would continue to run this collaboration in this spirit of selecting participants that would benefit the most.
So going by some of the most upvoted questions here on Slido, one was how do you do risk analysis for any tasks you are doing first time in R and submission studies? So to the best of my knowledge, so as I was saying, we've not made the transition to R at the moment. And to the best of my knowledge, I'm not sure we've actually done any kind of submission using R. So I don't think I'm actually able to answer this question.
I think there's a few things in this question. So there's definitely the price of the license that is definitely pricey. And for some organizations, it is an issue. And it has a huge impact on our budget as well.
But I think there's another aspect to it, which is the open source aspect. And as of today, when you want to consult, when you're reading a paper, and when you want to give a try to the statistical methodology or the new machine learning methodology, it's going to be in the form of an R package. Some kind of development, some kind of open source development. And I think that's a big difference between SAS and R in the sense that there's all the regulatory work.
But then there's all the question about exploratory work and all the question about study design, if we take the example of designing randomized controlled trials. And then that's where I think the strength of open source plays a big role. So I think even if SAS had not raised the prices that much, I still think open source would have played a key role in the pharmaceutical industry.
And I think this is also linked to one of the questions that popped up on the YouTube channel. You know, Konstantinos was asking, what are the dangers of using R or open source languages compared to licensed languages? And I wouldn't call it a danger, you know, but I'd say that when I first heard about R, you know, I, in my PhD, used closed tools. But I think the advantage of open source is that you have a community. You have the experts in a field developing a solution that is available to you a few months after it has been developed. It's out there. You can use it. You can modify it.
You post the question on the R community and, you know, the person that has developed the package answer you. It doesn't get better than that in terms of access to expertise and knowledge. So I'm not going to say that we all have to ditch non-open source tools, but I think it's a great complement to be able to have that support and be able to move at quite a fast pace with open source.
So I've been in contact with someone in late phase clinical trials, so late phase programming in randomized controlled trials. And this is definitely a hot topic, because if we want to push the use of R further at AstraZeneca, there is definitely all the piece related to the training, how do we upskill our collaborators, but there is also all the piece regarding GxP-compliant environments. So this is something we're working on, and hopefully in the next few months, we should have an answer to this.
Going back and touching on the training aspects, somebody had asked, what KPIs does your team use to evaluate how much each course was? I briefly, briefly, briefly touched it when we were looking at those numbers for our R training, but this is actually something that I am quite passionate about. Even before joining AZ, I was already looking at how to measure quality and impact of short courses in one of the previous projects that I was involved in.
And, you know, I think the way we approach it is, right after the course, we would measure the quality. Did you like the course? Do you think it was balanced? Are you planning to use the tools in your future work? What did you like the most? What did you like the least? You know, just to get a sense of how the course is received. And then, at intervals of six months after the course, well, six months after the course, we will circulate another survey where we are instead going to ask, are you using the tools?
In what way has the training helped you? Has it improved the way you code? Has it improved the way you collaborate? But we also ask questions around, have you actually passed your knowledge to others? Have you trained others on how to use the tools? So, we do collect some metrics, a combination of quantitative and qualitative, although these types of KPIs are often more qualitative than quantitative. We try to do some quantitation, but it's not perfect. But also, we do interviews, a round of interviews with our learners, just to ask them, because, you know, fatigue by survey is definitely something that we are experiencing.
So, it's nice to do some case studies. We have, you know, evidence of individuals that have changed roles, and were not in a data science role, and then they are transitioning to one. We have several success stories on how the training is helping. So, we have a few sets of KPIs, but in some cases, it's also just, you know, interviews, and talking to the learners, and see how they've been affected.
But on Slido, I'm seeing right now the most upvoted is, how did you pick use cases that would motivate people to want to learn? Did you start with specific technologies to highlight the power of open source data science, like Shiny?
I think when it comes to use cases, I would say that the success of the training program is also about trying to marry, you know, what the need is with the availability of the people that are going to contribute to the training program. When we work on use cases, you know, we try to showcase the work that others within the organization are doing. So, we want to kind of kill two birds with one stone. One is to display how we are already using data science, how we are utilizing one particular technology, and how that is enabling us to achieve particular goals, in a way that also gets to inspire others.
And that is where you go back to that peer-to-peer learning as well: how others are using the skill within the organization might be relevant for what I'm trying to do. But at the same time, it's also an opportunity for our subject matter experts, many of whom contribute to our training, to showcase what they do. So, give them that visibility. Make people aware of who the experts are within the organization. So, if I have a problem and I want to talk about natural language processing, who do I go and talk to?
The other one on the training was, how much is the reproducible environments and workflows taught for all users going through the training? Well, I can definitely say that reproducible science, you know, is very high on our list. We can do better. We are developing more courses that are around software best practices and how you work reproducibly. As I said before, we try to get more of our courses run on our infrastructure. So, when the training is over, the infrastructure is already readily available to the user to continue their work.
We have recently done a seminar, actually just last week, wasn't it, Guillaume, around the workflows in collaboration with Posit. So, again, I think we definitely need to expose more of these and get more people to utilize our environments and adopt best practices. So, it is definitely something on the rise. How do you, you know, containers and reproducibility and portability is definitely something that is high on my list for next year.
On the community front, I know, Guillaume, at R in Pharma, we were talking a bit about making things sustainable. And actually, somebody had asked, can you talk about why you decided to drop azTidyTuesday? That's a very good question. It was not an easy decision because it was the first initiative we started and that's what built quite some momentum around the community. So, that was definitely not an easy decision. But the reason why we decided to drop azTidyTuesday was basically we could see a drop in participation.
And it was also the fact that, you know, all these initiatives, even if you think that at first they don't take time, they still do take some time. Because, you know, for every single azTidyTuesday edition, we had to look for a dataset. So, it was a publicly available dataset. But we still had to look for one. And we had to frame it around a nice story. Then you have to select a featured entry. Then you go through the code. Then we were selecting three to five nice things in the code that we were actually putting up so that everyone could benefit. So, you know, it was a matter of time, basically. And it was to create a bit of space, a bit of time for new initiatives.
Okay. Two more questions, I promise. One great question over on Slido is, are there teams in AZ that include both R-only people and SAS-only people? If yes, how did you manage them working together?
I think I can give this one a go. I think this is kind of fading away, I would say. Because, as I was saying, you know, everyone is becoming multilingual these days. And I think it is true. And it is true for the new joiners in the industry. Because even if they learned R, Python, or Julia at university, when they join AstraZeneca as early career professionals, they have some SAS training. So, they become multilingual with SAS. Now, if we take the example of people who have been with us for a longer period of time, and who are more versed in SAS, I think they're also seeing the R journey, not only at AstraZeneca, but in the pharmaceutical industry. And if people want to remain competitive, they need to learn to use R. So, this kind of, you know, dichotomy between R-only people or SAS-only people, I don't think it exists anymore.
Okay. So, in thinking about all the different community initiatives, I see Eric had a great question. How do you motivate community members to share their insights and learnings in the meetups or other sessions?
And it very much relates to what we talked about in the past, Rachel. When we're like, I think what is important in communities is how do you make sure that your community goes from a receiving mode to a contributing mode? And I was just looking earlier today at our Facebook equivalent group at AstraZeneca. And I was actually quite happy when I was scrolling through the feed of the group, because I was like, I could see questions popping up and posts popping up. And these posts were not driven by steering committee members.
And I think that's the hardest bit, maybe. How do you make sure that you create, we talked about this as well in the past, how do you make sure you create a safe space so that everyone is fine being vulnerable in such an environment and being like, okay, I'm going to be asking a question. Maybe it's a simple question, but it doesn't matter. And I think that's kind of how we motivate others. We try to create this safe space so that everyone can speak up.
And the other thing is, we've got different kinds of initiatives. Some of them require quite a level of R, but some of them, let's say the Lunch and Learns, we make sure that we don't overcomplicate things so that it's also open for a beginner to present in a few editions. And very much like what you say, Rachel, when you talk about, you know, the journey of a participant, maybe that person is going to be joining the meeting first, not speaking, then following that, that person is going to be asking a question. And then the goal is to have that person present something at some point. So that's what we try to do.
And I would say that there is a lot of enthusiasm, though, within the organisation for sharing. I think there was a lot of uptake. People really volunteered to participate, present, co-organise. I think one thing that we have to appreciate, and I didn't when I joined the organisation, this is a very vast organisation. So you can be an R user, you know, relatively isolated in your group, trying to do something, not knowing what others are doing. So I think establishing these networks and meetings or meetups, people really value the fact that they can share experience and connect. So I think the enthusiasm is definitely there.
That's a great point, though, Gabriella. Sometimes people might be working in a team where they don't know about the community that exists already. And if in the future, somebody is watching this meetup recording on YouTube, and they're from AstraZeneca, and they just started, how do they go about finding out about the community events?
As Guillaume said, you know, we have social media, everybody's added to that social media. You know, we are, I'm personally working on getting more data science learning onto onboarding. So, you know, it's all about raising awareness of what's out there and what's available. So I think there are channels for newcomers to do so, but I'm sure we can improve in that respect as well. But there are various channels through which they can become aware.
Great. Thank you both so much, Gabriella and Guillaume. It's really been great getting to join some of the R in Pharma events and to get to meet you both and work with you both. It's so impressive to see all the training and community building that you both do.
Thank you very much. Thank you for the invite.
For others listening in, I also wanted to just give you a heads up, and I just put it into the chat too. But it's a big Pharma week for us here, because we also have Christina Fillmore, a data science leader at GSK joining us as a featured leader at the data science hangout this Thursday as well. But I want to thank you all for all your great questions and for being here and joining us today as well. Have a great rest of the day, everybody. Thank you so much.