Resources

Dan Negrey @ MarketBridge | Creating a framework for consistent measurement | Data Science Hangout

We were joined by Dan Negrey, Director, Analytics at MarketBridge. At (15:11) we asked Dan about a tip for impacting the business with data science.

So I think every business is going to have KPIs (Key Performance Indicators), and there's going to be other metrics besides KPIs, things that lead into that. As crazy as it sounds, some organizations struggle to measure those and to do so in a consistent and repeatable way. Maybe they measure something that just comes from one person sitting at a desk, and they've done it for six years, and they leave. All of a sudden, who knows how they do that? Creating a framework for consistent measurement is huge for an organization. The measurement is consistent and the outcomes are measured consistently. Then taking action to improve those outcomes can be thought of as more reliable because the measurement process is consistent. So that would be one thing for sure.

Another, on that note, is decision making. Every company makes decisions. A lot of us are here because we like to do this kind of work, but most of our companies exist because they like to make money, and they like to grow. So we find a balance between doing what we do to help our company to achieve their goals. Find ways to help your company optimize cost, reduce waste and increase growth. All of that is through measuring and looking at decisions that have been made in the past and thinking about how they could have been made differently. This could be through historical analysis or building models to help make those decisions more effectively. That’s a huge win for any organization.

There was also a lot of love for repeatable data with the pins package at this Data Science Hangout. Dan Negrey shared: "Pins has been a huge package that we've started using a year or so ago...if you've never used pins, it's definitely worth checking out."

Helpful resources on pins:
Pins for R: http://pins.rstudio.com/
Pins for Python: https://lnkd.in/ghmxiEHV
Great repo that uses pins: https://lnkd.in/ezvBkav
Workflow that involves Quarto, pins, plumber API, vetiver and shiny: https://lnkd.in/e6gnMXfD
Link to Ryan's video & stepping stones: https://lnkd.in/erR-Mjr9

Other resources:
MarketBridge career page (with open data science roles): https://marketbridge.applytojob.com/

Subscribe to Our Channel Here: https://bit.ly/2TzgcOu

Follow Us Here:
Website: https://www.posit.co
LinkedIn: https://www.linkedin.com/company/posit-software
Twitter: https://twitter.com/posit_pbc

To join future data science hangouts, add to your calendar here: pos.it/dsh (All are welcome! We'd love to see you!)

Apr 10, 2023
1h 0min

image: thumbnail.jpg

Transcript

This transcript was generated automatically and may contain errors.

Welcome to the Data Science Hangout, hope everybody's having a great week. If we haven't met before, I'm Rachel Dempsey, I lead our pro community at Posit.

Before I forget to say this, I want to share an exciting announcement that the call for talks at Posit conference in Chicago in September was extended to April 7th, so you have a little bit more time to procrastinate. So if anybody wants to submit a talk, I'll put the updated blog post into the chat here too.

But thanks again for joining us today. The Data Science Hangout is our open space to chat about data science leadership, questions you're facing, and getting to hear about what's going on in the world of data across different industries. And so if you are watching this on YouTube in the future, in our future world, it happens every Thursday at the same time, same place, so you can add it to your calendar with the details that would be below in the YouTube.

But every week we feature a different data science leader as my co-host to help lead our discussion and answer questions from you all. Together at the Hangout, we're all dedicated to making this a welcoming environment for everybody. We love when everybody can participate, no matter your level of experience or industry or area of work. It is totally okay to just listen in too.

There's always three ways, though, that you could jump in and ask questions or provide your own perspective. So you can jump in by raising your hand here on Zoom, and I'll look out for it. You can put questions in the Zoom chat, and feel free to put a little star next to it if you wanted me to read it, if you're in a coffee shop or something. Otherwise, I'll just call on you to jump in and introduce yourself. And then third, we also have a Slido link where you can ask questions anonymously.

And I believe Hannah and Tyler will be sharing that, yep, in the chat in just a second. Thank you so much again for joining us, and thank you, Dan. Dan Negrey is Director of Analytics at MarketBridge and is joining as our co-host today. Dan, I'd love to have you jump in and introduce yourself and maybe share a little bit about your role and also something you like to do outside of work.

Thanks, Rachel, and thanks for having me here today. I was super excited about it when you reached out. I was thrilled. I love talking about this stuff.

So as Rachel mentioned, I'm Director of Analytics at MarketBridge. We're a – we call ourselves a go-to-market science company, and so we really use reproducible scientific methods to tackle our clients' toughest measurement problems in sales and marketing.

My role is – I wear a lot of hats, but I guess primarily I'm focused on our – sort of our tooling and our approach to, like, building tools that we can apply across different problems as well as some of the front-end stuff we do with Shiny and serving up insights to clients based on the models we build.

I'm sure we'll get into more questions about some of that, but I'm from Cleveland, Ohio. I live right outside of Cleveland, and outside of work, I love playing basketball.

Marketing measurement and team structure

So to kind of put us in your mindset of the problems that you're helping to solve, what's an example of a marketing use case or maybe a problem you've helped a stakeholder with?

Probably the most, you know, common problem that we work on that every company is challenged with is measuring their marketing effectiveness or, you know, you think about all the different efforts you do from a marketing perspective, reaching out to customers directly through different channels, direct mail, email, phone, spending money on paid ads on, you know, various social media channels and paid search and all that kind of stuff. So we help our clients measure those efforts effectively and in a way that doesn't double count or sort of over-attribute.

So maybe these could be related, but I know on past Hangouts, we've talked a lot about like community. And so, for example, like an internal meetup and being able to show like the value of that within your own organization, do you have some tips for people here who might be trying to do that?

Well, I would say, you know, without getting, you know, too into the weeds in some of this stuff, try to think of the problems as in more broad ways. It's like, what is your goal? Like is your goal to, you know, by having internal meetups, is your goal to get like what's your objective? What's your measurable objective on the back end? Are you trying to get more people to register for these events going forward?

Or, you know, is it more of a softer lead type of thing to get people to sign up for the conference? And, you know, just using Posit as an example, but, you know, think about the problem holistically before we start like diving into pulling data and arranging data and all that stuff. Because the more you can think about it, the bigger picture of what's going on, the more you'll be able to, you know, one, explain what you're doing to other people, but also, you know, think about how all the different pieces that affect what you're trying to measure can kind of come into play and work together with each other.

So, I get to ask you all the questions right in the beginning as we're thinking of the ones that as people are thinking of their own, but I know you said you use Shiny a bit, so I'd love to hear a little bit more about how you're using Shiny today.

So, first of all, I'm a huge fan, and I've been following the product, I guess you'd call it product, along since it first launched. I knew it was going to be a game changer, because at the time you could already do some really incredible stuff visually with R. And the moment I saw how the Reactive Framework worked and how to manipulate things and have it, you know, change what users were looking at, I knew it was going to be awesome.

So, I've spent a lot of time in the last seven or eight years, I can't remember how long it's been, trying to get better and trying to learn the ins and outs of Shiny. But at the end of the day, like, I look at it like, your ability to create in a tool like R is pretty limitless.

There are so many, like, I compare it to, like, Legos, right? There's all sorts of cool Lego sets out there. You think of a package maybe as a Lego set, and the fact that you can just pull from different packages and you can kind of create your own original Lego creation from all these different, you know, really strong and stable parts that come underneath them. So, as soon as you kind of accept that, and then you think how Shiny can kind of take that to the next level, you're kind of giving your users the ability to take your Legos and rearrange them in different ways and do some cool stuff with it.

Yeah, with what we do, you know, I think about building marketing measurement or marketing mix models, when we can serve those up to our clients and have them change the data around that they are scoring those models on. Maybe they're making assumptions around how they might spend differently next quarter on a certain channel, or maybe their budget's going to be reduced and they need to scale things back. They can rescore the models, and all that's happening, you know, in the Reactive backend. You know, they're making, using some inputs to change assumptions around the data, and then having the models get rescored.

You know, seeing visuals get regenerated that show, you know, time series plots or tables that have summarized data, that's a lot of what we're doing, you know, manipulating those models on the backend and, you know, rescoring them and showing results in different ways.
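The rescoring workflow described here can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the channel names, the coefficients, and the `log1p` diminishing-returns shape are hypothetical and are not MarketBridge's models. It only shows the idea of changing spend assumptions and scoring a fitted model again.

```python
import math

# Hypothetical fitted model: outcome = baseline + sum(coef * log1p(spend)).
# The coefficients and the log1p response shape are illustrative assumptions.
MODEL_BASELINE = 200.0
MODEL_COEFFICIENTS = {"paid_search": 40.0, "direct_mail": 25.0, "email": 10.0}


def score(spend: dict[str, float]) -> float:
    """Predicted outcome (for example, new signups) for a spend plan."""
    return MODEL_BASELINE + sum(
        coef * math.log1p(spend.get(channel, 0.0))
        for channel, coef in MODEL_COEFFICIENTS.items()
    )


def rescore(spend: dict[str, float], adjustments: dict[str, float]) -> float:
    """Apply per-channel multipliers (what-if assumptions), then score again."""
    scenario = {
        channel: amount * adjustments.get(channel, 1.0)
        for channel, amount in spend.items()
    }
    return score(scenario)


current_plan = {"paid_search": 50_000.0, "direct_mail": 20_000.0, "email": 5_000.0}
planned_outcome = score(current_plan)
# Scenario: the paid search budget is cut by 20% next quarter.
reduced_outcome = rescore(current_plan, {"paid_search": 0.8})
```

In a Shiny app, the adjustment multipliers would come from user inputs, and the plots and tables would be regenerated from the rescored output.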

Alan, I see you just asked a question in the chat. Do you want to jump in here?

I've been learning a little bit about some of what our teams are doing with marketing effectiveness measurement lately, and it's an interesting space, and there's a ton of challenges and stuff to learn. So, this is timely and fun to sort of get to learn from you a little bit. I'm curious if you can talk a little bit about your team and what the roles are there, and maybe say a little bit more about, are you, as a director, are you leading people, or is your leadership in a more of a kind of technical space, or like a mix, and how you find that balance?

So, my role is in a little bit more of a technical space. We do have a good structure to our team, though. I think, you know, we've kind of, it's a little tough to describe because it's a mixture of, you know, sort of flat, but also with specialty.

So, we know we have, and this is an important thing I was hoping to bring up at some point in this conversation, but, you know, we, where I've seen teams kind of fail or not succeed as well is when, you know, people are trying to, there's like a single role that, you know, leadership has in mind, and everyone has to fit into that role. And that's not the case that we have here. You know, we have people that are really good with theoretical statistics, and we have people that are really good with, you know, data storage and data storage solutions and data engineering, I guess. We have people that are good with Shiny, and we have people good with data visualization.

So, what we try to do is, it involves a lot of communication, too, and breaking projects down into individual components where we can say who fits best on those components so that we can get the best out of everybody while also giving exposure to different components so people can learn in different areas. So, recognizing those differences and trying to get the most out of people's expertise is huge for that.

R beyond Shiny

I see there was an anonymous question that goes back to us talking about Shiny, and it said, do you think that the main value that R provides is Shiny? And this person said, personally, I find the emphasis on, the overemphasis on Shiny sometimes makes R seem like a limited language, and was curious to get your thoughts on that.

I don't see it that way because, I mean, I've been using R for a long time and have seen it kind of evolve. So, obviously, I think, it wasn't by much, but I believe R Markdown preceded the release of Shiny. So, that was pretty innovative at the time. You could, you know, you could create reports in all sorts of different formats and embed output and code. And so, that is still a great use case.

In fact, that's one of the things when I'm talking to clients and, you know, in best case scenario, we have clients that can log into our environment and see some of the cool stuff we create in our environment. But there are other cases when, you know, just by, I'm sure everybody on the phone here has been in a situation where they've had to work in a different computing environment without all the resources that they would love to have at their disposal. So, we find ourselves in that situation sometimes. And when we're fortunate enough to have even like the ability to install RStudio Desktop or something, your ability to create those types of reports now becomes on the table.

And one thing that I think gets overlooked a lot is all the really cool and somewhat interactive visualizations you can do in R Markdown output. You know, I'm a huge fan of those, the HTML widgets collection of packages. So like DT for tables and Leaflet for maps and dygraphs for time series plots and things like that. A lot of clients and companies you might work with may have never even seen something like that before. They might not even know that they could get a static document that has an embedded map with icons for individual lat long coordinates and polygons for shapes.

And, you know, they can hover over to see different information at different, you know, regions. So, those have gone a long way, right? I mean, they've gone a long way with clients and have been received very well. And that obviously stops short of Shiny reactivity. But then even just as a, you know, as an exploratory or language to explore data, manipulate data, generate plots in the viewer, even pasting those into PowerPoint, even if they're, you know, static plots can go a long way. And of course, building models and generating insights around, you know, inferential statistics as well.

A tip for impacting the business with data science

A question, I'm not sure if I'm going to frame this question the right way or if I fully thought about it. But I, so, I read the book Checklist Manifesto, and I've been thinking about, like, if there may be some sort of checklist for, like, providing value to an organization, like, or a way for your team to, like, best impact the business with data science. And I was wondering if there's something that comes to mind for you, like, something that you say, like, oh, you definitely need to have this to provide the impact on the business.

So I think every business is going to have KPIs, and there's going to be other metrics besides KPIs, things that, you know, lead into that. And as crazy as it sounds, some organizations struggle to measure those, and to do so in a consistent and repeatable way. You know, so maybe they measure something that just comes from one person sitting at a desk, and they've done it for six years, and they leave, and now all of a sudden, you know, who knows how they do that. So just creating a framework for consistent measurement is huge for an organization, right?

If the measurement is consistent, outcomes are measured consistently, then taking action to improve those outcomes can be thought of as more reliable because the measurement process is consistent.
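One way to picture a framework for consistent measurement is a single shared place where each metric is defined once, so the calculation no longer lives with one person at one desk. The sketch below is a hypothetical illustration only: the `signup_rate` metric, its field names, and the registry design are assumptions, not something described in the conversation.

```python
from typing import Callable

Record = dict[str, float]

# Shared registry: each metric name maps to exactly one definition.
METRICS: dict[str, Callable[[list[Record]], float]] = {}


def metric(name: str):
    """Decorator that registers a metric definition under a fixed name."""
    def register(fn: Callable[[list[Record]], float]):
        METRICS[name] = fn
        return fn
    return register


@metric("signup_rate")
def signup_rate(rows: list[Record]) -> float:
    """Signups divided by contacts, computed the same way for every report."""
    contacts = sum(row["contacts"] for row in rows)
    signups = sum(row["signups"] for row in rows)
    return signups / contacts if contacts else 0.0


def measure(name: str, rows: list[Record]) -> float:
    """Look up a metric by name so every caller uses the same definition."""
    return METRICS[name](rows)
```

Any report or dashboard that calls `measure("signup_rate", rows)` gets the same calculation, which is what makes changes in the number comparable over time.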

And I guess another, you know, on that note, it's decisioning, right? Every company makes decisions, and a lot of us are here because we like to do this kind of work. But most of our companies exist because they like to make money, and they like to grow. And so we find a balance between, you know, doing what we do to help our company to achieve their goals. And so if you can find ways to help your company optimize cost and, you know, reduce waste and increase growth, and again, all of that is through, you know, measuring and looking at decisions that have been made in the past and thinking about how they could have been made differently and doing historical analysis or building models to help make those decisions more effectively, then that's obviously a huge win for any organization.

Marketing mix models (MMM)

Josh, I see you had asked a question in the chat. Do you want to jump in and maybe introduce yourself?

So I'm Josh King. I'll be, I think, April 20th, I'll be a leader as well. And similar to you, Dan, I focus a lot on marketing data science. One of the things that really caught my ear earlier, you were referencing, you have a Shiny framework and you referenced specifically MMM as an input to that and like being able to, for your clients to use that as a scoring mechanism for new models. So I'm curious because I've tried to give some thought to this myself in the past. Are you using that as a means for actually constructing MMM for your clients? Or is it more of like a scenario planning aspect to utilize response curves coming out of an MMM for them to actually use it as a tool for them?

And the reason that I've had a lot of struggle wrapping my head around this in the past is I see MMM not necessarily as like a specific model necessarily, but more of like a framework to build out what it is that you're trying to measure, which is very human in the loop, which is kind of different from a lot of other data science projects. It's very much a blend of the art and the science of having a balance between statistical fit as well as business fit. So I'm curious, like within the apps you've built in the past, like how you're able to kind of find that balance.

Currently, I'd say the state is more of the latter that you mentioned. Taking models that our team has worked very hard to build and very precise and serving those up in a way to give clients the ability to understand how those models work and consider different scenarios where they think about how they might manipulate, spend differently or what would happen if this happens sort of a thing. But totally agree on your point about making it more of a fluid application where it's not just reacting to something that's been built, but like also has a hand in building it to begin with. So that's definitely in mind and on our roadmap for how we approach these things. It's just, it's actually one of the more exciting things that I think about that I'm working on this year.

So we were double checking in the chat on what MMM stands for and thought it might be helpful to include with the recording too. So what is MMM?

So you might hear different answers from people, but marketing mix models or media mix models. But basically, when you look at, you consider all the different types of marketing stimulus or marketing efforts that go towards a measurable objective and it's trying to assign, consider all those things at the same time and assign contributions by each effort towards the outcome and do so in a way that doesn't double count or over-attribute or anything like that.

The question was, what are the shortcomings of MMM or as we say, marketing mix models or media mix models?

I think in some cases, you know, one thing I'll say is that you need historical data or you need a great way to simulate what you think is a good representation of historical data if it doesn't exist. I think that's the case with a lot of, with pretty much any model you build. But in some cases for marketing data that it, a lot of times that data doesn't exist or isn't in, you might have, you might have data that's, well, one thing I'll say is, as marketing gets, is more digital nowadays and is being executed through all sorts of different social media platforms or other online platforms, what you might be able to extract out of those platforms may not be in the best form for a model.

Let's say you have, you know, new customer signups by day, just some generic term, right? But maybe you can only get, you know, Facebook advertising spend at a week level or something. So there's a lot of, you know, there's assumptions you can make about spreading some of that data around to try and get it to fit the structure you need for building a model. But so I would say that, you know, in terms of pros and cons, it's, it's dependent. You know, if you like data munging, it's a great space to be in because you get data from all these disparate systems and it's all in different formats.
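As a concrete illustration of spreading weekly data to a daily grain, here is a minimal sketch. It assumes the simplest possible rule, an even split across the seven days of the week. That rule is an assumption chosen for illustration, and a real project might instead weight days by impressions or by historical patterns. The function name and the example dates are hypothetical.

```python
from datetime import date, timedelta


def spread_weekly_to_daily(week_start: date, weekly_spend: float) -> dict[date, float]:
    """Split one week's spend evenly across its seven days (an assumed rule)."""
    daily_amount = weekly_spend / 7
    return {week_start + timedelta(days=offset): daily_amount for offset in range(7)}


# Example: weekly advertising spend reported for the week starting 2023-03-06.
daily_spend = spread_weekly_to_daily(date(2023, 3, 6), 7_000.0)
```

The daily values still sum to the reported weekly total, so the reshaped data stays consistent with what the platform reported while lining up with daily signup counts.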

The pros, I'll say, and this is the most obvious one, is that, you know, you can have a conversation with the CMO and tell them that, you know, you've got a sound statistical approach that suggests, you know, channel X has contributed, you know, Y percent towards this particular outcome and a different channel has contributed this. And had you done this instead, this might've been the outcome. And, you know, you can test that going forward and, you know, over the next quarter or six months or whatever. And a lot of times that's eye-opening for, for a CMO.

There's a lot of pressure, I think, on CMOs, chief marketing officers these days, to show they spend a lot of money, right? So there's a lot of pressure to show the value of all that spend. And if you're, you know, every, every channel that, you know, you execute stuff through is probably going to offer some kind of reporting attribution that only looks at what they do and only considers it being like the only thing that somebody might've been exposed to. And obviously that's, you know, not really practical, right?

So if you've got a tool that can consider all the things you're doing where the spend all adds up to your total budget and the outcomes all add up to what was observed by the whole organization and you can say, you know, based on our model, we think that, you know, Facebook did this or Google did this or Direct Mail did this. And so now you can, now that CMO can have a much, you know, much better conversation with the CFO and the rest of the organization about how they can optimize that budget to, you know, keep costs under control while trying to maximize the, you know, the outcome, you know, new signups, new customers, revenue, that sort of thing.
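The idea that contributions add up to the total, with no double counting, can be sketched with a toy additive model. The numbers below are hypothetical, and a purely linear effect per dollar is an assumption made only to keep the arithmetic visible. Real marketing mix models are typically more involved.

```python
# Hypothetical fitted values for a simple additive model (illustration only).
ORGANIC_BASELINE = 1_200.0  # outcome expected with no marketing spend
EFFECT_PER_DOLLAR = {"facebook": 0.02, "google": 0.03, "direct_mail": 0.01}


def decompose(spend: dict[str, float]) -> dict[str, float]:
    """Attribute the modeled outcome to the baseline and to each channel."""
    contributions = {"baseline": ORGANIC_BASELINE}
    for channel, effect in EFFECT_PER_DOLLAR.items():
        contributions[channel] = effect * spend.get(channel, 0.0)
    return contributions


def shares(contributions: dict[str, float]) -> dict[str, float]:
    """Express each contribution as a fraction of the modeled total."""
    total = sum(contributions.values())
    return {name: value / total for name, value in contributions.items()}


quarter_spend = {"facebook": 10_000.0, "google": 20_000.0, "direct_mail": 5_000.0}
quarter_contributions = decompose(quarter_spend)
quarter_shares = shares(quarter_contributions)
```

Because every channel is scored inside one model, the shares sum to one. That is the contrast with per-platform reporting, where each platform counts itself as the only thing a customer was exposed to.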

Dan's career journey

Dan, I'd love to hear a little bit about your journey into your role now, and I know you previously were an analytics consultant as well. And like, what has the transition been like into a director of analytics role?

I'll probably go back a little bit more in time. So I was going to be a math teacher. That was, that was the start of my journey. So I got a bachelor's degree in secondary math education from Ohio University. And towards the end of that, decided I wanted to just keep learning more math and stats. Wasn't, wasn't sure if I wanted to get into public school education and, or at least wanted to put it off for a little bit. So I had an opportunity to stay at OU and go to grad school and study applied math and statistics.

And right around that time is when jobs in, they weren't calling it data science yet, but it was definitely more statistician or maybe if you're lucky analytics jobs. This was like 2006-ish. And, and so, man, I thought this stuff was super interesting. I thought, and it wasn't, it wasn't like, wasn't like actual science, not to say that that's not interesting, but just the idea that there's this like emerging field of, it's sort of not fleshed out yet. You're kind of using a little bit of computer science, a little bit of math and stats and just some business or otherwise context acumen to, to solve interesting problems.

And so when I finished school, when I finished grad school, I had a couple of different jobs for, you know, a year or so each at larger companies and then had an opportunity to get to work for a locally owned consulting company in Cleveland. That's really where I spent a lot of time, almost nine years altogether, but went from sort of a consultant with a few years of experience all the way up through, you know, by the time I left there, I was a director and had been leading projects and leading longer term engagements, but also working on R&D side with new technology.

So that's what really got me interested in this stuff was, you know, they trusted me to work on having our, helping our company grow into using this, you know, there's just, I'll go back like 10 years ago, there was just an explosion of open source technology. Like there was a lot of proprietary software that was getting to be very expensive and it became very well known that tools like R or Python were starting to become used in more business contexts.

And also at the same time, you know, the argument used to be all, well, it's in memory computing and what can you do? Like, you know, computer has two gigs of RAM or whatever. Well, then, you know, that price of memory started going down and, you know, they started being able to spin up virtual machines and kind of keep that, an eye on that cost a little bit more.

And this all just, this is so interesting of the timing of all this, because right around the same time, it was like, you know, RStudio was getting bigger and they were, you know, there was more awareness of the packages coming out. ggplot2 was, had already been pretty big, I think at the time, but then those HTML widgets came out and it's like, oh, look what you can do. You can embed an interactive map or a table or something in a, what looks to be like a presentation or a document or something.

Well, I should mention that, like that itch never left. Like I never did end up teaching in a public school, but I did teach undergrad math in grad school and every job I've had, I've, I've taken it as like an unspoken responsibility to train and teach people, especially as this stuff has just exploded. There's, I mean, if you're new, if you're finishing school and you have, if you've not had exposure to a lot of this stuff, you probably think like, where do I start? Like these companies expect me to be a brilliant statistician and have 10 years of computer science background and, you know, all this kind of stuff. And it's not practical.

So there needs to be good training programs in place at places and people patient and excited about teaching others how to use it. But yeah, so I had an opportunity to lead R&D efforts at the one consulting company I was at and get us up to speed on Hadoop and R and Python and Spark and all this cool stuff. And, and then, you know, at some point I decided it was time to move on to a different, to a different role. And I took a role at a software company. That's where Rachel and I actually first met because I was trying to get RStudio Pro products installed there.

And that was a great experience. I learned a ton about B2B software and the sales and marketing process. If you, if you hadn't, if you've only worked on like in a B2C environment, working in a B2B environment is, is much different in how long everything takes and the, the marketing efforts that go into, to, you know, what could be like a year or two long sales cycle with a much bigger price tag at the end.

But after a while I wanted to get back into just, you know, being more hands-on data science developer type, but had accumulated a lot of this experience that, you know, that I could reliably say that I'm, you know, call myself a leader in this space, even though I still have imposter syndrome. I'm looking at all of you and I'm immediately thinking I'm, I'm out of my league here and you guys, you guys should be the ones talking.

But so I've met, you know, I met people at MarketBridge and they knew that we clicked in terms of our vision. It was also a smaller company. I think I just had a, I have a, you know, my preference is to just work in a smaller company that can move a little bit faster and not have as much, you know, red tape across when it comes to trying things out with technology.

So, yeah, so now we, now we're here and we've kind of taken, everything is cumulative. I have everything I've done for like the last 15 years. I feel like I've, I've tried to put into play here and, you know, it's teaching others. It's building internal packages so that we can abstract problems and have people, you know, use consistent functions to address similar problems. It's shiny. It's, it's everything.

Tools, packages, and integrations

I'll give a plug for pins. Pins has been a huge package that we've started using. We started using like a year or so ago and, you know, we've just got into using Azure Blob containers as the boards for pinning objects and reading from. So if you've never used pins, it's definitely worth checking out, even if you're using, you know, local folders as the backend boards, because the idea of abstraction there is really cool and what you can do with it.

There are a few anonymous questions that came through, and one was, are there any specific packages that you find are good for manipulating and displaying time series data or packages for missing data?

So because I started learning R before the tidyverse, if you were looking at my code, you're going to see a nice blend of bracket subsetting meets dplyr meets all sorts of. But I believe it's sound, and it could be totally different than somebody else's approach. Sometimes I probably code things up the long way because it's immediately the solution that comes to mind when there's probably already a package that does it in a simple call. But that's the fun of it, right? That's the fun of solving problems and thinking about how it should be solved and then coding it up yourself, even if at first glance it's not the most efficient way.

So dplyr is just incredible. I remember being at, I'm trying to remember exactly what it was, if it was a useR! conference. I think it was the useR! conference at UCLA in like 2014 or something. And Hadley did a talk about dplyr, and I thought it was a game changer. Because now all of a sudden, what I found is people had a hard time figuring out manipulating data in R with the apply functions and bracket subsetting and that sort of thing when they were used to an approach that was more like SQL or SAS. So dplyr comes along and it sort of like speaks to them and gives them a new tool set for manipulating data.

So as far as time series stuff goes, I know for a while I was using xts and dygraphs to get more of a formal time series objects and visualize those. dygraphs was incredible. I was blown away when I saw the sliders. And if you're not familiar with it, imagine a time series plot that right out of the box has sliders at the bottom that you can easily like pinch in and pull apart and see different parts of the time and hover overs and everything. I started using Plotly a ton for visualization, just because it can do so much besides time series. So I just find it like got used to using their framework for handling a lot of different visualization problems.

So going back to Pins for just a second, because I just was reading Libby's comment in the chat. If somebody was just getting started with using Pins now, is there a specific resource you'd recommend?

Yeah, I'd point to pins.rstudio.com. I still pull that up in my browser all the time and even read through the first page or a couple of the articles that are embedded there. It does a great job of explaining it. Especially, I think the examples start with just using a folder, you know, right near your working directory, that can be used to pin stuff. And then once you see what it's doing there, you look at all the other backend storage options. Sometimes you've got to deal with, okay, well, you don't have to authenticate to a folder that's next to your working directory, but you do with something like Azure blob storage. So connecting to that board is the hurdle. But once you have the board established, then you're past that point and it can be really cool.
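The folder-based starting point mentioned here can be sketched in a few lines. This assumes the pins package (version 1.0 or later); the folder path and the pin name `mtcars-demo` are made up for illustration.

```r
library(pins)

# A board backed by a plain folder: no authentication needed
board <- board_folder(file.path(tempdir(), "pins-demo"))

# Write a data frame to the board as a pin
pin_write(board, mtcars, name = "mtcars-demo", type = "rds")

# Read it back, from this session or any other that can reach the folder
restored <- pin_read(board, "mtcars-demo")
```

Once this pattern is familiar, moving to a remote backend mostly means swapping the board constructor and supplying credentials; the `pin_write()` and `pin_read()` calls stay the same.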

Eric, I see you were putting in a plug for a package as well. Do you want to jump in?

Yeah, this has been a wonderful talk, Dan. I really enjoyed learning from your perspective. Just on the topic of additional packages first, yeah, I love pins as well. I've been using that in a lot of my daily new workflows. It's been making things a lot more efficient. But with interactive visualizations, yeah, I've been on a huge journey with that as well. I'll give a nice verbal shout-out to echarts4r, which is another great way to have some awesome visuals. John is here in the audience. It's helped a lot in my major app. So if you're doing interactive visuals and need both the more standard type of statistical displays and also really novel displays, it's got you covered.

So one other anonymous question we had here was, do you or your clients use Power BI, QlikView, or Tableau?

Yeah, we have clients that use all those things. Our company has a lot of Fortune 1000 clients, and a lot of big companies have enterprise BI tools. And we've worked at smaller companies, too, that have led with those as their BI offerings. It's interesting to think about how Shiny or Connect plays a role with those things. Because, and I've played that role, it's easy to get into, oh, we don't need enterprise BI tools. We'll just do everything in Shiny.

I think those tools serve a good purpose in terms of keeping an eye on a lot of those KPIs and things I mentioned before. I think there's a good audience for those tools, and they're a great way to learn data visualization and manipulation. I guess one thing I'll say about those is it's not worth fighting about BI tools. Like, you probably shouldn't go into a conversation and say, Shiny should replace this.

The conversation I've found more helpful is when you can demonstrate what something like Shiny can do that those tools can't, or, you know, the use case for those. Like serving up model scores through a plumber API and RStudio Connect. That's a good example of stuff that's probably much more difficult or maybe even impossible to do in some of the enterprise BI tools. So try to focus on the use cases that those tools can help solve instead of, you know, coming in with your sleeves rolled up.

And I think it can be kind of cool, too. It goes back to what I said about other technology. If somebody else has a role on, we'll say, the presentation layer, then let them own that role and try to do what you can to get your stuff integrated with that. It could be sort of a shared product between the two of you.

Project lifecycle and client work

I'm curious, Dan, if there's a kind of typical project, or if there's just some variation. Going back to the root of the question, I think it's around: do you end up making robust infrastructure for a customer that needs an enduring way to collect data that produces some kind of output that they're then monitoring? Or do they come in with a bunch of data and say, we need an analysis and a way to answer this question about market value?

I would say for the most part it's more the latter: clients coming with their data so that we can build models and reports and analytics off of that. In other cases we're in their environments and having to do a lot of the detective work to figure out where data is and how to get it. And then in other cases, we find out that data still resides in source systems, and so we'll be in the client environment.

We've had situations where we've built databases from the ground up inside a client environment so that we can ingest data from some of those systems to support the different types of analytics work we do. So it's all over the map, really. We definitely have a preference for what we'd want to do, but in a lot of cases, especially dealing with data, there are compliance and governance issues and constraints that suggest the type of environment stuff can go or should go in, and we deal with a lot of that.

Jared, I realized we missed one of your questions from a bit earlier. Do you want to jump in here?

I guess my question is more general, but it's about handling client ambiguity when they come to you with a need. Do you have any tips, tricks, insight or wisdom on shoring that up and getting more specifics?

One thing I've seen that's been helpful when engaging with a new organization or new client, whether you show them this or not, is having a maturity framework, just so you can assess where they're at in terms of data accessibility, the types of analytics or insights they currently develop, and the tools they use, and get an understanding of their environment. If you plot all that out, the first thing you might say is that they're pretty low, they have hardly any of this stuff, so maybe that's the first thing.

I hear what you're saying about ambiguity. They might have that type of setup, but then they might also come to you with very specific questions, and you think, well, before we can even get to this we have to set up this foundation. If you can talk to them about that, maybe building that foundation is a place you can start. If they're not patient enough for that, you can get into more specifics around the data that's needed to answer the specific questions they have.

Especially when you're dealing with leadership, they don't know that that gap exists between what they're looking for and the resources and tools they have in place to achieve it. So if you can bring a perspective on what you can do to help them achieve that, that's always huge.

Yeah, I think that comes with experience too. This is where the context and the business acumen come into play, right? I like the word context, I don't like the phrase business acumen, but the context of the problem, because you can be working for pharmaceuticals or you can be working for local government. The more you are aware of the pitfalls that people in decision-making roles will run into, the more you can anticipate situations that they might encounter, and you can do some work to help support that.

If there's anyone that has, you know, maybe no coding abilities but tons of context, they're a valuable person to have around because they can help shape those conversations.

Open roles and book recommendations

I remember seeing on LinkedIn, and I think it was maybe from a post about your hangout, that you have several open positions on your team for people using R and Python, so I just wanted to make sure I gave a minute here for those too.

Yeah, we do. Thanks for mentioning that. So we're growing. We're growing pretty well, so we're always looking for bright, motivated people that love to tackle problems like these and just love to solve problems in general. Finding people who are curious about how things work and how to solve problems is huge.

And I'm also trying to rewrite my website or blog in Quarto, which is going pretty well. I'm just not ready to push it out yet, so if you see old posts of mine on LinkedIn and it says go to denegrate.com, it's not going to look like what you think. More likely than not I'm going to just publish to my own GitHub Pages repo and point people there.

Dan, I'd love to hear, do you have any book recommendations for us?

I assume you mean data science books, but my favorite book is The Hitchhiker's Guide to the Galaxy, so if you've never read that you should give that a read. But the ones I keep here all the time that I'm always looking at are Advanced R and R Packages from Hadley. Those are phenomenal resources.

The other one that I swear by and tell people about, because I'm a huge Linux guy. The moment I started working in R and Python and someone said, oh, you know, there are open source, free operating systems and they play nicer with these tools, I was floored and I got really into that. So there's a book called The Linux Command Line by William Shotts, and it's available free as a PDF. It's not too expensive on Amazon either. If you're interested in learning more about Linux servers and command line tools, it's a great introduction, especially if you're not familiar with that stuff already. It eases you in and relates everything back to some of the more GUI OS stuff that you might already be familiar with.

I'm a big advocate of REST APIs. Any software can access a REST API, so Power BI can access a REST API, Tableau can access a REST API, anything can. The idea is that if you can write a function in R, you can make a REST API. So the question then just becomes, how do you host that REST API? So maybe you want to have a function that manipulates your data, and you want Power BI to use that. All you need to do is make it accessible via an API. The hosting part's the hard part, but I'm sure there are 25 people in this chat who can help you figure out how to do that with Connect.
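As a hedged illustration of turning an R function into a REST API, here is a minimal plumber sketch using its programmatic interface. The model, the `predict_mpg` function, the `/predict` path and the port are all hypothetical choices for this example, and hosting (for instance on Connect) is not covered.

```r
library(plumber)

# Hypothetical model: predict mpg from weight, fit once at startup
model <- lm(mpg ~ wt, data = mtcars)

# An ordinary R function; query parameters arrive as strings
predict_mpg <- function(wt) {
  wt <- as.numeric(wt)
  prediction <- predict(model, newdata = data.frame(wt = wt))
  list(wt = wt, predicted_mpg = unname(prediction))
}

# Build a router and expose the function at GET /predict
api <- pr()
api <- pr_get(api, "/predict", predict_mpg)

# To serve locally (blocks the session), uncomment:
# pr_run(api, port = 8000)
```

With the API running, any client that can issue an HTTP GET, including a BI tool, could request something like `/predict?wt=3` and receive the score as JSON.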

Looking ahead

Dan, I know we're coming up quickly on our time here. A question I wanted to ask you, and I think we touched on it a little bit, is when you think of the year ahead for your team, what is something that you're most excited about?

I think about the work we're doing with my team, specifically the work we're doing with marketing effectiveness, and I think what excites me the most about that is that we've done a lot of these projects and are approaching it from, how do we abstract it? There are so many things in common from one project to the next. Yes, every client can be different, every outcome could be different, whether it's a count of something or a spend amount or whatever, but we're always thinking about how these problems relate and how they're common from one to the other, or how they differ, so that we can expand how we abstract those.

And when you can abstract a problem, you can create reusable solutions and classes and methods. And that's exactly what we're trying to do. So that helps us be more consistent with the work, the way we approach the problems. And, you know, it's sort of a self-sustaining model too, because then we can train people on how these things work and we can reason about them the same way when we talk to each other. So it's a very exciting way to approach this type of work that I've wanted to do for a while, but just finally have the opportunity to approach it this way with the support of our leadership. So it's really cool.
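One way to picture "reusable classes and methods" for outcomes that differ by type (a count versus a spend amount) is an S3 generic with one method per outcome kind. This is only an illustrative sketch; every name here (`new_outcome`, `summarise_outcome`, the class names) is hypothetical and not taken from MarketBridge's code.

```r
# Hypothetical constructor: tag a vector of outcome values with its kind
new_outcome <- function(values, kind = c("count", "spend")) {
  kind <- match.arg(kind)
  structure(list(values = values), class = c(paste0(kind, "_outcome"), "outcome"))
}

# One generic, so every project is reasoned about the same way
summarise_outcome <- function(x, ...) UseMethod("summarise_outcome")

# Counts: the total is what matters
summarise_outcome.count_outcome <- function(x, ...) {
  c(total = sum(x$values))
}

# Spend: report the total and the average amount
summarise_outcome.spend_outcome <- function(x, ...) {
  c(total = sum(x$values), mean = mean(x$values))
}

orders <- summarise_outcome(new_outcome(c(1, 2, 3), "count"))
spend <- summarise_outcome(new_outcome(c(10, 20, 30), "spend"))
```

Adding a new outcome type then means writing one more method rather than a new project-specific script.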

One other thing I'll say while we're waiting on that is what I'd love to see in the next year is more Quarto reveal.js presentations. It's very cool, very innovative. Something similar was available with ioslides a while ago, but obviously this approach with Quarto is, you know, a little bit superior. If you think about business presentations, you know, having to copy and paste stuff into PowerPoint, even though I know Quarto can output PowerPoint now too, I would love to see those types of presentations have a greater presence in terms of data science delivery.

Yeah, templates. And I know people are constantly adding to the assets available there, but I just remember when that got announced, I immediately checked if it could support htmlwidgets, because that was the reason I went to ioslides a while ago instead of the RStudio presentation format that was built in. Because when I've shown people, oh, here's a table with 400 rows, but you can see 10, 25 or a hundred or however many at a time and page through it, versus just a Microsoft Office table embedded into a slide, that really blew their minds. So I guess just continuous support for the different visualizations, especially the more JavaScript-based ones that can introduce some interactivity.
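The pageable 400-row table described here can be sketched with the DT package, one of the htmlwidgets that can be embedded in HTML slide formats. The choice of DT and the simulated data are assumptions; the hangout does not name the table package used.

```r
library(DT)

# Simulated 400-row table standing in for real results (assumption)
set.seed(1)
big_table <- data.frame(id = 1:400, value = round(rnorm(400), 2))

# Show 10 rows per page by default and let viewers switch to 25 or 100
widget <- datatable(
  big_table,
  options = list(pageLength = 10, lengthMenu = c(10, 25, 100))
)
```

Printing `widget` inside a code chunk of an HTML-based presentation renders the interactive, searchable table in place of a static one.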

Career advice

One last question for you, which I always think is fun to ask. Is there a piece of career advice that you've received or given that comes to mind? It doesn't have to be like the best or favorite, but something that was impactful.

Yeah, I think it comes back to like imposter syndrome and all that. So, like, have a lot of confidence in your background, in your academic background, in your work experience, even in your thought process for approaching problems. A long time ago, when I said something that kind of illustrated that I wasn't sure of myself or whatever, and he just said, you have a master's degree in math. And not that everyone has to have a master's degree in math, but it's just like, you know what, like, you can do this, right? Like, you have all the tools you need. You can communicate to people. You can reason about problems. You can do this.

I love that. Thank you so much, Dan, for joining us here and sharing your experience. So that's all. If people want to get in touch, is LinkedIn the best way? It sounds like not the blog yet.

LinkedIn, and I'm on Twitter. If you search for my name, you'll probably find me, and somewhere around there is an email address for whatever you want. I'm usually pretty open. I've got three kids, though, so I might take a while to get back to people.

Sounds good. Well, thank you so much, Dan. Have a great rest of the day, everybody.