Gordon Shotwell | Socure | Creating Secure Systems for Growth | Posit
Setting Up Secure Systems for Growth
Presentation by Gordon Shotwell, Lead Data Scientist at Socure

Abstract: One of the main challenges of doing data science is getting access to the data in the first place. Data scientists need to be able to look at data in detail to do their work effectively, but that necessarily creates data security and data governance problems for the organization and its clients. In this presentation, we go through the social and technical processes that can create a secure data analytics environment and set your team up for success.

Speaker Bio: Gordon Shotwell is a Lead Data Scientist at Socure, where he helps develop tools for data scientists to securely and efficiently work on sensitive data. They're hiring too! https://www.socure.com/about/careers

Timestamps:
09:45 - Make friends with the security team in your org
19:40 - Set up child-proof data science environments
22:11 - Developer experience is a security problem: people will do the easy thing, so make the right thing easy
35:34 - Buy tools, don't build them: an economic lesson in understanding cost centers

Read the related blog post here: https://blog.rstudio.com/2021/10/26/how-data-scientists-and-security-teams-can-work-together/
Transcript
This transcript was generated automatically and may contain errors.
Thank you so much for having me. I'm located in Halifax, Nova Scotia, where it's quite beautiful today. What I'm going to talk about is security: how to approach getting your data scientists working on data in a secure environment, and how to interface with the other parts of the company. One thing I realized while figuring out what to say is that there isn't really a single right answer for what level of security any of these systems needs. So these are general strategies. The particular things you need to do, like "we need this kind of authentication" or "the data needs to be stored this way or tracked in some fashion", are going to be individual to the place that you work. This is more about how you work with the people and systems at your company, and how you set those things up to be successful.
About Socure
So a little bit about Socure. Socure is an identity company: we build models and data products that help people identify fraudsters. Whenever you apply for a bank account or a credit card, your bank sends your information to a company like ours, or to an internal model, to check: are you really who you say you are? This is partly for regulatory reasons, to make sure people aren't laundering money, and partly just loss prevention: fraud costs companies a lot of money, so they try to do things that reduce it.
And we're the best at that; we have some of the best products in these markets. A lot of that has to do with the fact that, from the start, we set up systems where we're able to confidently work with very sensitive data of various kinds, and to segregate data that's more sensitive from data that's less sensitive. We're also going through a period of very fast growth. Revenue is growing, depending on how you count it, between 100 and 200% year over year. We started the year at about 120 people and will probably grow to about 500 people by the end of this year. So we're hiring.
This puts a lot of pressure on security, because the ways you have to think about security when you're growing this fast are slightly different from the ways you can think about it at a larger company. I started at Socure two years ago and I'm one of the more senior people there, which means all the people who joined after me have less context: why do we have things set up this way? What are the risks? What's the threat model? But I've found it a really helpful environment for learning this stuff, because if you can build secure systems at a fast-growing company, it really helps you think about how to make them work at all different types of companies.
The security bad place
I want to talk about what I think of as the security bad place that a lot of organizations find themselves in. It comes from a basic conflict between data scientists and really everybody else. You can see a similar conflict between salespeople and security people, and to some degree between engineers and security people, but between data scientists and security people it's the worst. The basic conflict is that the safest way to deal with sensitive data is to just delete it: to not use it. If you can't delete it, you want to make sure you don't give anybody access to it. And if you can't prevent people from accessing it at all, you want a small number of people to have access, and it should be really, really annoying to access.
From a security perspective, that's the hierarchy of data safety. And as soon as somebody says, "I'm going to open up a data set and look at the emails, or look at the names," that's a security nightmare. But from a data science perspective, there's a lot of insight in that data, and you actually do need to look at it. For example, if we have a fraud ring coming in, they might have a particular email pattern, which we don't know about until somebody actually looks at the scores and asks: what are the emails that are causing us to decide this person is fraudulent? Eventually you can maybe build an automated system to detect that pattern, but at some point data scientists do need to be able to look at the data in some environment.
Here's what I've noticed happens with security groups, which I'll just refer to as "security": the compliance people, the security engineers, the people who are there to make sure your company doesn't get hacked. Security will come into a situation and try to fix something that's broken: some old system that was set up a certain way, maybe it was good at the time, or maybe it was just never thought about, but it's a bad system. They'll say, look, this is a big problem, there could be an attack here, we've got to fix it. But in order to fix it, they have to redirect money from new projects to the old ones, and in most companies the groups responsible for security don't actually have the power to direct resources like that.
But the power that security does always have is the power to veto new things. When a new project comes in, it usually goes through a security review, and if it fails, it doesn't get put in place. So you get this situation where the company can't fix the old security problems, but also can't put new systems in place, because they get vetoed. Everybody thinks the other side is unreasonable: the data scientists feel like security is just putting needless roadblocks in their way, security thinks the rest of the business doesn't care about security, and then everybody just stops talking.
I think this is really common. And it's worth knowing that even if this isn't a real problem at your business, most of the people coming into that business have had experiences like this at other companies. Pretty much all the security people I know have been in situations where they say, "at my last job, there was this giant fire of a bad system that never got fixed, and it was so frustrating that I left." They carry that from job to job. And data scientists have often had the experience of, "I could produce so much value, but I just can't get access to the data I need to actually produce it."
So that's the bad place. It's a really big problem, because it's all of the friction of a high-security environment with none of the security. These hacks and security issues are usually adversarial: somebody is looking for the weak link in your company's armor. So it really doesn't matter whether 40% of your systems are secure or 90% of your systems are secure, because attackers will tend to find the holes. You always want to focus your energy on lifting the floor of your business, rather than making some small number of things super duper secure. The analogy I use is: if all the windows in your house are open, it doesn't really matter if you lock the door. A burglar is just going to go through the window.
Being an ally to security
As a data scientist, you can help with this problem by being intentional about being an ally to the security parts of your organization. When decision makers hear about security issues from several different places in the company, those issues are much more likely to get fixed. If it's not just coming from the people who are always bringing up security issues, if it's coming from somebody else as well, that's a really powerful way of making progress on these things.
Something that's helped me is to think about the mindset of somebody whose job it is to identify low-probability risks at a business. They live in a lonely, anxious world. They're often saying no to awesome stuff: everybody else says "this would be so awesome," and the security person says, "no, we're not doing that, that would be awful." There's also a lot of anxiety, because they don't get any positive feedback. When nothing happens, nobody says "you did such a great job." They're just constantly worried about the small, low-probability things they might have missed that would sink the business. It's a high-anxiety job.
Understanding that, and putting yourself in that mindset when you're talking to people in security, is really helpful, because it keeps you from thinking they're being ridiculous. The thing that seems overly strict to you is often there because, when you spend your time thinking about rare but really tragic events, you're more likely to find those things and think they're important.
Another good thing to do is just to talk about security projects. Pretty much everywhere I've worked, there's a list of important security projects that have not been done, because for whatever reason they never got to the top of the pile; there are always new things going on top of the pile, and the old ones get worse and worse over time. Being somebody who's not part of the security organization but is aware of those issues, and who continues to bring them up over and over again, helps the compliance and security teams realize you're on their side when you're talking about building any of these systems.
Then do things like have one-on-ones with people: talk to them about their jobs and what they're working on, and develop those relationships. The last one, which I think is really important, is to prove that you can make progress on these things. Prove to security that you, a data scientist who handles this stuff, can push these things through and accomplish them. You can start with small things, like security bad practices on the data science team: improve those a little bit, and be able to demonstrate it. Why is that important? Because when you're setting up a system, it allows you to make promises to security and compliance groups and have them believe you. You can say: we're putting this in place now because it's better than what we had before, and here are the three or four things we'd like to improve over the next six months. If you don't repeatedly demonstrate that you're going to move that security ball forward, they're going to say no, we have to meet all the requirements today, on day one.
Because otherwise they won't trust that you're actually going to deliver on those improvements. If you do all that, you can start thinking of yourself, or maybe a couple of people on the data science team, as a liaison. At Socure, this is something I've done: acted as a liaison between the big group of data scientists and the big group of security people. Now those groups don't feel like they need to talk to each other as much; they can both talk to me and my team, and we can translate. We have the context of what the data scientists are trying to do and the context of what the security people are trying to do, and we can find the solution that works for both groups.
Security priorities list
Once you have that relationship, I think it's important to come up with an ordered list of security priorities as they relate to data science. And it's important to write it in really basic language, because a lot of the time, people who work in security don't have a good sense of what it is that data scientists do. The programming that data scientists do is very different from the programming you would do as an application software engineer; it's much more a process of discovery than a process of implementation.
There are two things I would really highlight. The first is to precisely articulate the business value of the data science work. At Socure this is easy on one level, because we're a data science company: all we sell are statistical models, so we can say our work is the main product the company sells. But being really precise about it helps to narrow down what kind of data access you actually need. When you're more precise about the business value you're producing, you might find you can get the same business value with only 10% of the people having access to the sensitive data, or with the same productive velocity using one kind of environment versus another.
Similarly, from the security side, get them to precisely articulate the threat model they're worried about. This goes back to that vague anxiety: you have a vague anxiety about everything, so you veto things because you don't want any kind of breach. When you're more intentional about what kinds of attacks you're trying to prevent, you have a better chance of right-sizing the security access. For example, you might worry about mistakes: you don't want employees accidentally putting a file somewhere they shouldn't. Or you might worry that employees are malicious and will actually try to steal data and sell it to somebody else. Those are very, very different threat models, and the systems you design that are good for the first one are not going to be good for the second one.
Three principles for building secure systems
Okay, so that was all prologue. Here are my three principles for building secure systems. One: childproof rooms. Two: developer experience is a security problem. Three: buy things, don't build them. Let's talk about childproof rooms. This is my daughter's playroom, and I feel like it's the perfect analogy for what you want to do with data scientists, especially new data scientists who are just joining your company. You want to give them a place that has all of their stuff, that is really nice, and where they can't burn your house down. It's a place where you can just say: go play, go do your thing in here.
These are places where people can't hurt the company, and that's the only way you can onboard people quickly and have it be safe, because there's no way a new employee in their first month of work is going to understand the security and regulatory environments that Socure operates in. The only way we're going to get them to abide by those policies is to make the environment they're in enforce the policies. When you embed the security inside the system you're building, you don't have to tell people about it as much, because that's just how it works. You don't want a situation where you're having people read 100-page policy documents and expecting them to abide by them. Nobody can do that.
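The "childproof room" idea can be sketched in code: rather than documenting where sensitive files may go, the tooling refuses to write anywhere else. This is a minimal, hypothetical Python sketch; the approved prefixes, error message, and function names are all illustrative, not Socure's actual setup.

```python
import csv

# Hypothetical policy: the only places analysts may write data.
APPROVED_PREFIXES = ("s3://analytics-sandbox/", "/secure/scratch/")

def check_destination(destination: str) -> str:
    """Raise unless the destination is an approved data location."""
    if not destination.startswith(APPROVED_PREFIXES):
        raise PermissionError(
            f"{destination!r} is not an approved data location; "
            "use the analytics sandbox instead."
        )
    return destination

def save_rows(rows, destination: str) -> None:
    """Write rows as CSV, but only to policy-approved locations."""
    with open(check_destination(destination), "w", newline="") as f:
        csv.writer(f).writerows(rows)
```

The point is that a new hire who calls `save_rows` with the wrong destination gets an immediate, explanatory error instead of silently violating a policy they haven't read yet.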
The next one is the idea of developer experience as a core security problem. I think this is not well understood by people who work in security day to day, because they come from a model that's kind of a police model or a legal model: I'll put some policy out, I'll say what the law is, and then everybody's just going to follow it. After all, we have really smart, thoughtful employees; they understand that they're supposed to follow the rules, so they'll just do what we tell them to do.
But that's not what happens, because it turns out you just don't have the enforcement mechanisms for those policies. And all the people you've hired, you've hired because they're really clever with computers, and people who are really clever with computers can subtly circumvent things. Even if they sort of know they shouldn't do something, there's some really urgent project and they've got to get something out the door. People will always end up taking little shortcuts around something they find burdensome. So if the thing that is secure is also a pain to use, difficult, unappealing, some people are going to not use it.
And that's all it takes: some people not using the secure process. So you can either make it impossible for them to do the wrong thing, by putting up big walls, or you can understand their motivation, which is that they want to get their work done as easily as possible. If you assume that they're going to do the easy thing, and you make the right thing easy, then you can have some confidence that they're going to tend to do the right thing. If you put up roadblocks that make the insecure process super irritating, and also give them a really convenient way of doing the more secure process, you're going to have a lot of success with adoption. And from a security perspective, I actually think adoption is one of the big problems: just getting people to do the thing.
Here are the ways I've built good developer experiences for secure projects. The first one, which I think is really crucial, is to control the client libraries. We have a set of internal R packages and a set of internal Python packages, and they're great to use; they've gotten good adoption. What that means is that we have a layer of abstraction between the user and however we're interfacing with the back-end system. For example, we have a function that connects to a database. That function has been in the package for years, but over my history there we've probably changed the mechanism of how that database connection works seven or eight times, and the users never knew. We've thought about their experience, and part of their experience is that they should never have to learn a new database connection mechanism. They shouldn't need to know how AWS access tokens work; that's not relevant to their job. They just want to connect to the database.
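A sketch of that abstraction layer, assuming a hypothetical internal package: the only thing users ever learn is `connect()`, while the credential mechanics live in a private helper whose body can change without anyone noticing. All names, environment variables, and defaults here are illustrative.

```python
import os

def _fetch_credentials() -> dict:
    """Resolve credentials however the security team currently prefers.

    In a real package this body might change many times over the years
    (env vars, a secrets vault, cloud IAM roles...) while connect()'s
    signature stays exactly the same.
    """
    return {
        "host": os.environ.get("ANALYTICS_DB_HOST", "localhost"),
        "user": os.environ.get("ANALYTICS_DB_USER", "readonly"),
        "token": os.environ.get("ANALYTICS_DB_TOKEN", ""),
    }

def connect(database: str) -> dict:
    """The one database-access function data scientists ever learn."""
    creds = _fetch_credentials()
    # A real implementation would return a live DB connection; this
    # sketch returns the resolved settings to stay self-contained.
    return {"database": database, **creds}
```

Because callers only ever write `connect("some_db")`, swapping the auth mechanism is a one-file change inside the package rather than a migration for every analyst.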
The next thing is to think about security from a product sales perspective: you're actively selling the system to the user, even if at some point you'll be able to say, "we're not asking anymore, this is just the rule, this is how it's going to happen." Start by thinking of it like you're selling a product. And one pitch for secure systems that's really helpful: for most data scientists, working on sensitive data is just freaky. You do not want to do it, it makes you really nervous, you're worried about leaving it somewhere, you're worried about making a mistake. So you can sell them on it: if you start using this system, you don't need to worry about those mistakes anymore; the system will handle it for you.
One of my first data science jobs was working on youth criminal justice data for the Nova Scotia government, and it was actually air-gapped. I had to work on a computer with no internet connection and no R libraries. On some level it was really irritating, but on another level I thought: I'm kind of happy to just be here and not have to worry that I'm going to accidentally email a file and then go to jail. There's some real comfort in having something that's secure.
The third one is to pair restrictions with power: always ship security things together with functionality. This helps a lot. If you give people something that's just irritating, they're going to be irritated. But if you give them the irritation together with something they really want, in one release, it helps make the medicine go down: some new functionality or new capacity that goes along with it. And the last one is that looks really matter. Whenever you're building a secure system, one of the most neglected things is user experience and design. It's worth spending a bit of time saying: we're building a secure web application, so let's do a sprint with a UX designer and somebody to make this look nice.
RStudio Connect case study
I have a little case study about our dear host, which is RStudio Connect. We implemented RStudio Connect to host Shiny applications. I had been putting together a way of hosting Shiny applications using Docker, and when that went through a security review, it was vetoed, because it didn't fulfill a bunch of security requirements, and I didn't have the skills to put those requirements in place myself.
RStudio Connect gave us the authentication we needed, and the logging, tracking, and auditing capabilities we needed to host Shiny applications. But as we were using it, we realized we could replace a lot of other things with it. There were a lot of ways we were communicating what we call company confidential information, which is not PII, but is our own IP that we're worried about getting out into the world. We realized we could replace a lot of those channels with hosted R Markdown and things like that. It was also something we could childproof: we can build clients that allow things in one environment but not on RStudio Connect, and we can put network-level controls on what RStudio Connect can talk to in the rest of Socure's systems.
That let us put something up where, if you're hosting something here, it's going to be safe enough, and it let us build an environment where data scientists could deploy things and we knew that was okay. Here's a pattern I've used a number of times: our internal R package doesn't have a lot of security around who is able to download and install it, but it has functions that refer to a pinned file hosted on RStudio Connect, or to an API hosted on RStudio Connect, and there we have user authentication on those files and APIs. So somebody can use a function from the R package, and it will pick up who that person is and reference those Connect-hosted resources. And we know those resources are okay, because the whole system has been verified and hardened.
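One way to picture that pattern in code: the package itself stays open, but the functions it exposes point at resources behind an authenticated host, so authorization is enforced server-side, per user. This is a hypothetical Python sketch; the host URL, path scheme, and header format are illustrative stand-ins, not a documented API.

```python
# Illustrative host; in practice this would be your hardened,
# security-reviewed server (e.g. an RStudio Connect instance).
CONNECT_URL = "https://connect.example.com"

def build_artifact_request(artifact_name: str, api_key: str) -> dict:
    """Build an authenticated request for an artifact hosted on the server.

    The package just forwards the caller's key; the server decides what
    that particular user is allowed to see.
    """
    return {
        "url": f"{CONNECT_URL}/content/{artifact_name}/latest",
        "headers": {"Authorization": f"Key {api_key}"},
    }
```

Calling code would hand this request to an HTTP client; two users running the identical function get different results, because access control lives on the hardened host rather than in the freely installable package.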
The second-last one is that it paired restrictions with power. The restriction: you're not allowed to email this stuff anymore; you have to share it in a way where we can track who's looking at every single document if we need to. The power: it's reproducible, you can schedule things, that kind of stuff. People didn't really notice the restrictions, because we were giving them something so much more powerful and more convenient for their work. And the last one is that it's just easy to use and pretty.
The economic case for buying software
Okay, the last thing I want to talk about is buying software. I want to give you an economic argument for why buying software is always better from a security perspective, and this is an argument you can use to get licensing money from your company. At most companies, security is a cost center. A cost center is something the company needs to spend money on but doesn't make revenue from in the short term. You need to spend money on office space, or you used to, but you don't directly make money on the office building; you make money on the stuff that happens in it.
Every business, or almost every business, is going to go through rough times where they don't have enough cash flow. In those times they're going to cut costs: they're going to neglect cost centers and put money into revenue centers, the places that actually put money on the balance sheet. And they'll do this even if it's a bad idea in the long run. Even if it's a long-run mistake to cut the security people or the DevOps people, when you're in a threatening economic or business situation, you need to get back into the black, so you'll tend to neglect cost centers.
But while security is a cost for you, for the vendor it is a revenue center. If Databricks or RStudio is deciding whether to invest in supporting some brand-new security protocol, they know there's going to be some customer out there who will buy their product because they did. So they'll put money into that work even in times when their business is difficult, because that's a big part of the product they're selling: something that is secure, something that is compliant, something that works for all the companies they're trying to sell to. Whenever you buy or license software, you're basically moving security from living in the cost-center world of your business to the revenue center of somebody else's, because you're offshoring it to some degree.
It's also a maintenance commitment to the future: we're buying the software now, and we're committing to paying license fees for however long we use it, which is like putting a little bit of money every year into that system getting better and more secure. This is really important if you're growing, because as you grow, your security requirements change.
Socure now has a billion-dollar valuation, we're in Forbes, and we're a much, much bigger target than we were. So our security requirements have changed dramatically. The nice thing about buying software is that your vendor has other clients, and has built these products for clients that have big-company problems. If you're a small company buying these things, you might look at the feature list and say: I don't need that, I don't need that, I don't need that. But as you grow, you suddenly discover you need all of those things.
And then the thing about growth is that at some point you start being limited by people, not money. It takes maybe six months to hire and onboard a new DevOps person, and we can't just hire 20 of them all at once. So right now we're really grateful for any opportunity to spend licensing money to avoid development work, because we have tons of development work in our core business that there are no vendors for. Anything where we can basically say, I'm going to put licensing money onto this problem, helps us grow, because we can just keep giving our vendor more cash for more licenses and support.
So RStudio Server, now RStudio Workbench, is a good case study of this. When we first implemented it, it was mostly for remote R development. We were pretty much universally an R team. We wanted something that gave people more compute power and a remote environment, so that we knew where the data lived and that it could be controlled and audited, and you could shut off people's access, things like that.
Now, we could have built this remote environment with a number of open-source tools. We could have used the open-source RStudio, or just a server with a terminal. There are tons of ways we could have done this that would have been cheaper in the short term. But after we bought RStudio Server, they kept working on it, and it kept getting better. A lot of things that, when we first set it up, were either not security requirements at all or were on our nice-to-have list became requirements later. At some point in our growth, we'd be selling something to a client and they'd have questions like, how is your data science environment authenticated? And it would be: if you're authenticating in way A, we're not interested; if you're authenticating in way B, that'll pass our audit of your work. Many times, because we had bought the system and just kept getting patches and updates, every time there was a new release we'd spend two minutes running sudo to install it, and we got those things. They weren't free, but they didn't require any work. We didn't have to do some urgent sprint; they came out of the box, because the thing had kept accumulating features over time.
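To make that server-side control a little more concrete, here is a minimal sketch of the kind of settings involved. It assumes RStudio Workbench reading its configuration from /etc/rstudio/rserver.conf; the group names are illustrative, and exact option names vary by version and authentication method, so treat this as a sketch rather than a recipe:

```
# /etc/rstudio/rserver.conf -- minimal sketch, not a complete config
www-port=8787

# Run each user's session under their own authenticated account via PAM
auth-pam-sessions-enabled=1

# Only members of this (illustrative) group can log in at all
auth-required-user-group=data-science

# Named admins can see and manage other users' sessions
admin-enabled=1
admin-group=rstudio-admins
```

The point is that access control, auditing, and the ability to shut someone off live in one server-side file, rather than on each analyst's laptop.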
And I think this is something that's really underrated about buying software. You shouldn't think of it as building a short-term thing and asking whether it's more or less expensive to buy or build. You should think of it as: do we want to spend the money and commit the people to maintain this system forever, or do we want to pay somebody else to maintain it forever? Unless the thing is actually really, really core to your company, unless you're the best in the world at that thing, you should usually try hard to buy.
Summary and closing
So those are my principles. And again, there's not much here about needing a particular technical setup; I can answer questions about that if it's helpful, although there are usually better people to answer those. I do think this is, to a large degree, an organizational, social thing. Work hard at developing a relationship with your security teams where you all feel like you're on the same team. Try to set up childproof environments, or data-science-proof environments, whatever you want to call them. Think about developer experience as a security problem: security and convenience are not at odds, it's not pick one or the other; you actually need your secure systems to be convenient for them to be successful. And last, try hard to buy things. There are good economic reasons why the one you buy is going to be better than the one you build.
And then lastly, we are hiring in huge numbers. We have a lot of dedicated R developers as well as many dedicated Python developers, and some people who do both. So if you would like to hear more about this, or if fraud is an area you're interested in, please feel free to reach out. My information is here, and I'm also in the LinkedIn group. Thank you very much.
Q&A
Thank you so much, Gordon. That was awesome. I feel like that is so valuable for so many different industries right now. I'm thinking of a lot of conversations I've had with customers in the past who are selling even the idea of data science internally to their teams, where those points are really relevant.
Gordon, one that has been upvoted is, how did you communicate R packages to your security team?
So like what an R package is, or you mean like external R packages?
Probably external ones. I've heard this from a few different teams before. If someone doesn't really know what R is, how do you explain the idea of packages? Yeah, so I think you can think about packages the way you'd think about JavaScript libraries. The way I think about it from a security perspective, and this is another place where having a robust internal package is really helpful, is that I'm able to make decisions about whether a package should be included in our work in a way that other data scientists might not be, because I've done a fair amount of package development. So I can look at a package and ask: how good is this package? How solid is its development? Who developed it? I'm much more likely to trust packages maintained by a professional, somebody whose job it is to maintain them, than ones that are ad hoc, and much more likely to trust ones that are popular than ones that aren't.
So if you have an internal package that does almost everything people need it to do, you'll be in a position to make that call: I'm not going to take on that dependency; I'll do it in base R or with the packages we already use. That's one piece of it. The other piece is articulating a threat model. I think security people will sometimes worry more about an R package than they will about a JavaScript library, even though the JavaScript library is probably going to be compromised before the R one, just because of surface area: there are more JavaScript people out there.
And actually, the requirements for putting something on npm are much lower than for releasing something to CRAN. Not that CRAN is a security guarantee. But there are all these other places where we inject dependencies all the time. So the main thing I would argue for is trying to get a fair assessment: how do we monitor whether somebody is using the right npm packages? How do we know those aren't malicious? Okay, we'll do the same thing for R.
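To make that kind of assessment concrete, here is a rough sketch in R of the signals described above, popularity and maintenance. It assumes the cranlogs package, and "somepkg" is an illustrative stand-in for whatever dependency you're vetting:

```r
# Rough dependency-vetting signals; "somepkg" is illustrative
library(cranlogs)

# Popularity: is anyone else relying on this package?
downloads <- cran_downloads(packages = "somepkg", when = "last-month")
sum(downloads$count)

# Maintenance: who maintains it, and when was it last published?
db <- tools::CRAN_package_db()
db[db$Package == "somepkg", c("Maintainer", "Published")]
```

None of these numbers decide the question by themselves, but they give the security conversation something more solid than a gut feeling.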
I see there's one other question on Slido that was anonymous: how do you deal with a security team that has less knowledge of best practices than the data science team? Well, the security people at Socure are very, very good. But I would say you have to meet people where they're at. People have power within organizations, and you just have to deal with that power. So don't get frustrated, and don't talk to somebody who has power over a system as though they're an idiot; the moment you do, you've lost, you're done, you're not going to make any progress. You need to look at their work, look at their requirements, and try to understand them in a sympathetic way. One tool I use a lot is repeating things back to people: okay, what's the most charitable way I can say my understanding of this back to you?
Then you can take that and write it down: here's the requirement, here's why we have it, here's the motivation for it, and get broad agreement on that. And then move forward from there: given all that, here's a solution that fulfills those things. But you can't treat people like they're dumb, because that puts you in that bad place where people stop trusting each other. Then you'll ask for something totally reasonable in six months, and they'll just think, that jerk, and say no right away. There's a lot of discretion in these things.
Thank you. I think that's such a good lesson in communication in general, even when you're sharing visualizations with people from the business as well.
Kevin, I see you asked a question, which may have been answered, but I want to make sure that I get to it as well. Do you want to provide any additional context?
Yeah, of course. So other than RStudio Connect, what other tools were you leveraging with RStudio to help with your security goals? Those are the two main ones. We also have RStudio Package Manager, which we use; I wouldn't say we do anything particularly interesting with it, but it's useful. I've had a good experience with all of those. These aren't the only tools we use, and they aren't the only layers of security. There's this idea in security called defense in depth, where you want many, many layers. But we do use both of those, and I've been very happy with them.
I'm also wondering which external package you're using most right now.
Oh, external, like another R package? I don't know, that's a tough question. pins was one that I used a lot, although it's something we've since stopped doing; we have another solution now. But there was a period of time when we had a really urgent data storage change to make, and the data storage project was late. So we ended up using pins on our RStudio Connect as a little stopgap until we got the real solution. I'm such a fan of pins. It's so simple: you can send a file somewhere controlled, have it on S3, and have it come back cached. It's pretty nice.
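For anyone curious, that stopgap pattern looks roughly like this in R. The bucket and pin names are illustrative, and board_s3() assumes your AWS credentials are already configured in the environment:

```r
# Minimal sketch of the pins stopgap pattern; names are illustrative
library(pins)

# Data lives in one bucket the team controls and can audit
board <- board_s3("my-controlled-bucket")

# Writer side: publish a versioned copy of a data frame
pin_write(board, results, name = "model-results")

# Reader side: downloads once, then comes back from the local cache
results <- pin_read(board, "model-results")
```

Because every read and write goes through one board, the data stays in a single governed location instead of being emailed around, which is exactly the property a security team cares about.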