Transcript

This transcript was generated automatically and may contain errors.

Okay, so today we're going to talk really briefly about the differences between using something like Claude Code to do data analysis in R versus Posit Assistant. I'm Sarah, I'm on the AI Core team at Posit. Simon, do you want to introduce yourself?

Yeah, I'm Simon. Sarah and I work together on the AI Core team. We do a good bit of writing together and we also work on Posit Assistant.

Cool, so one thing before we get going: if you're still using something like ChatGPT to help you write code, using a coding agent, like any coding agent, is probably going to be a big improvement, since you don't have to copy and paste and the agent can run code for you and see your environment, all of that. So if you haven't tried that, I would just, like, try anything out. It's probably going to be better than copying and pasting.

Loading the TidyTuesday data

Okay, so the first thing I want to do, I found some TidyTuesday data that is on likeliness words, so words like very likely and then people assign probabilities to it. So you might think, you know, very likely has like a 70%.

So before we look at Claude Code and Posit Assistant, I'm just going to load this the old fashioned way. Read this in. So there's 3 tables. I think we're probably only going to really look at 1 or maybe 2. So 1 of them is this absolute judgment table, which has, yeah, like the word and then different people assign probabilities to that word.

I thought this was kind of interesting. Yeah, that is interesting. I'm just going to put some, just plot it really quickly.

So, yeah, there's a bunch of words and then people assigned different probabilities to them, and then there's some demographic data in another one of the tables.

Okay, so many outliers. Yeah, yeah, yeah, I don't know what you said, 1% probability or something. Yeah, yeah, it's funny. And almost no, someone said almost no chances. So, yeah, there might be some things going on in the data. I don't know if we'll have a chance to look at that, but it is interesting. This was, yeah, the TidyTuesday data from a few weeks ago.

Okay, so now we're just going to try to do some basic analysis with this data with both Claude Code and with Posit Assistant. Posit Assistant right now is only available in RStudio, so we're going to use it in RStudio. So, let's first look at Claude Code. So I'm going to use it from the command line.

Connecting Claude Code to R sessions with btw

And I'm in the same directory I was over here in R. So Claude Code, like, out of the box, it can't see your R session. I think we can just ask it, can you see my R session? Let's see what it says. It can't see your R session. It can run R code by, you know, putting it in a file and then running it with an Rscript command, but it can't, like, directly access the data I've loaded. It can't directly run R code in my session out of the box, but we can set it up with MCP tools to let it run R code. So we're going to do that. I'm just going to exit. And this is pretty cool. So we need to do two things. The first is register the MCP server for Claude. And this is using the btw package. So this will let it run R code. You have the btw run R tool and then also the docs tool. So I'm going to run this and it added it here. So this is cool. So now when I launch Claude, it'll have access to that. We also need to run this in the R console to connect this R session to the MCP server so that they can talk to each other.

Yeah. Do you want to say anything about btw or mcptools? Yeah. So we put these packages out in the middle of 2025 or something like that. mcptools allows you to, like, write functions inside of R and then make them available as tools to language models. So you could connect it to Claude Code, but also, like, GitHub Copilot or Codex or whatever it may be. And then btw is something like a companion package, but it has a bunch of sort of pre-baked tools for using R. So one of them that Sarah registered is the run R tool, which is basically just, like, it can run any R code that it wants. And then there's a few tools that are targeted towards documentation, which she's registered using that docs argument. And so that just lets it read function documentation, vignettes, and things like that.
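A minimal sketch of the two setup steps described above, assuming the `btw_mcp_server()` and `btw_mcp_session()` functions from the btw package and Claude Code's `claude mcp add` command. The server name `r-btw` is an arbitrary label, and the exact tool selection used in the demo (the run R tool plus the docs tools) is not reproduced here, so the tools exposed by this default call may differ from what was shown.

```r
# Step 1 (run once, in a terminal, not in R): register btw's MCP server with
# Claude Code. "r-btw" is an arbitrary name chosen for this sketch.
#
#   claude mcp add -s "user" r-btw -- Rscript -e "btw::btw_mcp_server()"

# Step 2 (in the R console you want the agent to see): connect this R session
# to the MCP server so the two can talk to each other.
if (requireNamespace("btw", quietly = TRUE)) {
  btw::btw_mcp_session()
} else {
  message("The btw package is not installed in this R library.")
}
```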

Cool. Okay. So now we should be able to ask it something like, can you summarize the absolute judgments table? And we already have this loaded in our R session, so it should know about this and be able to do something with it.

Hopefully. Yeah, interesting. Oh, there it goes. Okay. There we go. It found it. Yeah. Cool. Okay. So this, like, I think this is, oh, it's doing a lot more than last time I ran this. But it, so this is, I do think this is cool. Like, we didn't do that much, and now it can access our R session and run R code. So it did, it did a lot. Like, it found the table, it ran summary, it did some summarization. So this is nice. I think, like, it feels a little clunky, I guess, and it's hard to see the code that it ran. So it's kind of hard to audit what it did.

R session access with Posit Assistant

Now let's look at it in Posit Assistant. I'm going to make this bigger. So Posit Assistant lives in the sidebar in RStudio, and it's built to work within your R session. So, like, out of the box, it can see your variables and plots and console. So we don't need to do any setup or any additional configuration for it to do that. So let's ask it the same, what did I ask it? Let's ask it that same question.

And just for a good comparison, what model was I using? It's like Opus 4.6. Okay, so I'll change, I'll change this to Opus. Cool. So I'll have it summarize the absolute judgments table. So it already knows it exists, and it can start summarizing. So it's doing, it's doing, like, kind of the same operations that Claude was doing, which makes sense since it's using the same underlying model. But I think when we made this, like, the point is partially data analysis and being able to see your R session. So it's just, and like, the purpose is for it to run R code, at least partially. So it's easier to see the code that was run and the output. The code and output is for you as the person as much as it is for the model. So it's easier to audit the code and see the output. And then Posit Assistant is going to give us a little summary and then have some suggestions. Again, it's like kind of running similar bits of code. There might be some differences because of how we've prompted it. But I think for now, like, the one just major difference is how easy it is to see the code and the output and that we didn't have to do any additional configuration.

Yeah, it really does feel like the big difference, especially getting the indentation right. You get, like, the pretty printing. Like, I think with that na.rm = TRUE, you get to have, like, the nice blue TRUE so that you know it's a logical and stuff like that. Yeah, I did find it interesting. Another one of, like, the big differences from my perspective is the plotting. And it did suggest to do that first in the suggestions.

Plotting with Claude Code

Yeah. Yeah. Yeah. So maybe now we can talk about plots. So Claude Code is a very general purpose coding agent and it works very well for that. But it's not really designed for data analysis in particular. So and I think one of those specifics is plotting. And if you're analyzing data, you're probably going to want to plot it at some point. Let's just see how that would work. We can get something to work with Claude Code.

Oh, that's funny. Claude Code even suggested visualizing the probabilities. That's interesting. Okay, I guess we'll do that. That was the exact thing I was going to ask it to do. It feels a little weird. We'll have it do that. And again, like, it's capable of writing code to visualize this. So I think what's happening here is that it's trying to show the plot, but we're in the terminal and it can't show the plot, I think. But usually it figures that out.

Yeah, like it says it can't find the rendered ggplot object. And again, like, if you're using Claude Code for data analysis, you would know this, and you'd probably have some alternative setup that you wanted to use. So I'm not trying to say, like, look what it did. It shouldn't do that because I don't think. It doesn't work. Yeah, because it does know how to write plotting code. Eventually it usually figures out that, like, it could write it to an image file, and we'll see if it does this. Sometimes it automatically opens it in Preview, which is nice, but I don't think it did that this time. Let me just check the other screen. Yeah, sometimes it opens it up automatically, which is sort of a nice workflow, although there is sort of a lot of clicking around, but it made a box plot. Yeah, again, I just think, like, if you're using Claude Code in the terminal, this just really isn't what it's made for. So even though it's very good at writing the code, it's not going to be easy to do it, like, back and forth with visualization.
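The image-file workaround mentioned here can be sketched roughly as follows. This is an illustration in base R rather than the ggplot2 code the agent wrote in the demo, and the data frame is made up: the column names `term` and `probability` and every value are hypothetical stand-ins, not the TidyTuesday table.

```r
# Made-up illustration data, NOT the TidyTuesday table. The column names
# `term` and `probability` are hypothetical.
judgments <- data.frame(
  term = rep(c("almost no chance", "about even", "very likely"), each = 4),
  probability = c(2, 5, 1, 90, 50, 45, 55, 50, 85, 90, 70, 10)
)

# Open a PNG device, draw one box per term, then close the device so the
# file is fully written to disk.
plot_path <- file.path(tempdir(), "probability_by_term.png")
png(plot_path, width = 800, height = 500)
boxplot(probability ~ term, data = judgments,
        xlab = "Term", ylab = "Assigned probability (%)")
invisible(dev.off())

# The file can now be opened in an image viewer outside the terminal.
file.exists(plot_path)
```

Writing to a file keeps everything in the terminal workflow, at the cost of switching windows each time the plot changes.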

Plotting with Posit Assistant

So let's see. Let's just try this in Posit Assistant. So we'll have it visualize the probabilities associated with each term. Yeah, I don't know if I pointed this out before, but you can see it thinking here. So it made basically the plot that we did at the beginning, but it is nice to just be able to easily see the plot here.

Plot interpretation

And then the other thing I just want to point out is that Posit Assistant can also actually see the plots and it can interpret them. So it added a little bit of interpretation here, but if we ask it to, like, interpret the plot further, it can actually see the plot image and will interpret the plot for you, which is really useful if you're using it for EDA or any kind of data analysis, since a lot of the insights that you might get about your data are going to come from visualization. So it has some observations.

Just nice. We'll see if it.

Yeah, and it's coming to the same conclusion that we did at the beginning, which is that there are so many outliers in this data and then it'll have some suggestions at the end.

This was something that I was wondering if we were going to see. Like, in the interpretation from Claude Code, it basically says, like, I kind of see what I expected to see, and, like, calls out that high certainty terms are near 95 to 100. When we look at the interpretation from Posit Assistant, especially the first time around, it's just sort of like, oh yeah, there's a lot of spread everywhere, which is really, like, the lesson: they have notably wide spread, suggesting less agreement on what those terms mean numerically. Like, to me, that's kind of the point of the visualization as I saw it. And I think, like, I'm happy to see that Posit Assistant is behaving in this way because we really tried to get it to sort of surface the real message of the plot rather than, like, what it kind of expected to see.

Yeah. Yeah. And I'm not sure. So do you think Claude Code is looking at the PNG file to interpret? It was like in the, in the past when I've done this, it said, like, I can't see, I can't look at the image and it has just run some code to try and get the information. But I'm wondering if because it wrote it to the PNG, it is actually ingesting it and looking at it.

Yeah. Okay. So it's referencing the IQRs, which kind of makes me think that it's relying on it's having run like, you know, deterministic analysis rather than seeing the image. Yeah. Claude Code. Also, like if you scroll up a little bit, it just says read one file, which, yeah. Okay. So it can't actually see the plot.

But anyway, I think, like, my main point is, you can make workarounds for Claude Code for this stuff. And I think there's even a way to hook, like, in btw to maybe have it so that it can see the plot image. But, like, this is what Posit Assistant was designed for, whereas Claude Code is really just a coding agent. And so, like, you're going to get these abilities out of the box. And also, like, this is what you're supposed to use it for. We have tried to make it so that it is good for EDA and visualization. And it's nice to use for those things. And that gives you, like, reliable results if you're visualizing.

Iterative data analysis with Claude Code

Okay. So the last thing is iterative data analysis. So Posit Assistant is an evolution of Databot, which was just an EDA agent. And it still has that, like, core functionality in it. So if it thinks that you're exploring data, it will do things like only do a couple of tool calls at once, like only run a couple of bits of code at once, and then give you suggestions. So it's supposed to be this, like, iterative process where you're involved in the exploration of your data.

So let's go back to Claude Code first. Let's say we want to do some more exploration and see what phrases people most disagree on. So again, I feel like I'm using Claude Code in, like, a little bit of a weird way to do this, because I think if I only had access to Claude Code and I wanted to do analysis, I would probably just, like, have it write code into a script and then move back and forth between the pieces. So it's going to do some exploration and then give us some results. And then we can, like, keep going. In the past, I've seen it, like, give me suggestions sort of like Posit Assistant does. But we could take its next suggested thing, which again might be from my notes. A little weird. Because we might want to know, like, how did they vary by country of origin? And that's in that third table for the respondent metadata.

I find it interesting in the first response. It says the pattern is clear, vague, non-committal phrases. And again, like, one of the clearest things about the plot and the numbers that it's analyzing is that, like, there's so much variation here. I don't know that there really is a super clear pattern to speak of. Yeah. And so it wants to make another plot. And then it has some summary. Yeah. So again, I think you'd probably have a different workflow for Claude Code where you'd have it write to a script, because it really just isn't made for this kind of interaction.
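One way to make "which phrases do people most disagree on" concrete is to rank terms by the spread of the probabilities assigned to them. The sketch below uses the interquartile range, which the speakers mention Claude Code referencing later on. It is not the code either agent ran: the data frame is made up, and the column names `term` and `probability` are hypothetical.

```r
# Made-up illustration data, NOT the TidyTuesday table. The column names
# `term` and `probability` are hypothetical.
judgments <- data.frame(
  term = rep(c("almost no chance", "about even", "very likely"), each = 4),
  probability = c(2, 5, 1, 90, 50, 45, 55, 50, 85, 90, 70, 10)
)

# Interquartile range of the assigned probabilities within each term.
spread <- aggregate(probability ~ term, data = judgments, FUN = IQR)
names(spread)[2] <- "iqr"

# Largest spread first: the terms at the top are the ones respondents
# disagree on most under this measure.
spread <- spread[order(-spread$iqr), ]
spread
```

The IQR is only one choice of spread. With as many outliers as the speakers noticed in this data, the standard deviation would rank terms differently, so the measure itself is worth auditing.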

Iterative data analysis with Posit Assistant

So let's just try this with Posit Assistant. We're going to see what phrases people most disagree on.

Okay. It gave us a table and yeah, again, this is one of the core features of Posit Assistant, that it can do this kind of exploration. So it gives us some suggestions, but I'm going to have it do what we asked Claude to do, which was see how the probability judgments vary by country of origin. It makes a plot.

So you'll notice it's only supposed to do a couple of these tool calls before it regroups and gives you some takeaways and then adds suggestions.

Okay. So this, let's see, I was going to ask it to limit the countries, but it seems like it already, or maybe that's, maybe those are the only countries in there. It's interesting. It's a little hard to. Interesting. Is there some sort of filtering that determines whether the plot is visible or not, like if we scroll up a little bit, is it like filtering? There we go. Filtering by top countries.
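The kind of step being audited here, joining judgments to respondent metadata and keeping only the most common countries, might look roughly like this. It is a sketch under assumptions, not the code Posit Assistant ran: both tables are made up, and the column names (`response_id`, `term`, `probability`, `country`) and the cutoff of two countries are hypothetical.

```r
# Made-up illustration data, NOT the TidyTuesday tables. All column names
# and values are hypothetical.
judgments <- data.frame(
  response_id = rep(1:5, each = 2),
  term = rep(c("about even", "very likely"), times = 5),
  probability = c(50, 90, 45, 80, 55, 85, 50, 60, 40, 95)
)
respondents <- data.frame(
  response_id = 1:5,
  country = c("Country A", "Country A", "Country B", "Country B", "Country C")
)

# Attach each respondent's country to their judgments.
joined <- merge(judgments, respondents, by = "response_id")

# Keep only the most common countries (top 2 by respondent count here).
# This silent filtering is exactly what is worth checking when auditing.
country_counts <- sort(table(respondents$country), decreasing = TRUE)
top_countries <- names(country_counts)[1:2]
filtered <- joined[joined$country %in% top_countries, ]

# Median assigned probability for each term within each remaining country.
by_country <- aggregate(probability ~ term + country,
                        data = filtered, FUN = median)
by_country
```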

Okay. Yeah. I mean, I think we saw the plot, which we didn't with Claude Code. And then, like, when we have a question like that, like, what did it actually do? It's much easier to audit what happened than in Claude Code. Yeah. It's supposed to be easy for you to audit the code, look at it, and change it in the next round if you don't like it, and continue your exploration process. So you might, like, click on one of these follow-ups or you can write your own.

But yeah, again, like, this exploration process and visualization, general data analysis, these are core features of Posit Assistant. This is really what it was designed for. So even though it's using the same model as this Claude Code session, it's just, like, a different way of interacting. And again, like, it gets your R session automatically.

Cool. So those are the three things that I wanted to show. So yeah, I think in conclusion, we both think they're both very useful tools. And at least one of the reasons we made Posit Assistant is because we think you should have a good agent for data tasks specifically, and that includes data analysis and EDA and visualization, but also includes things like, you know, writing an R package or making a Shiny app, more coding focused tasks, which we didn't really talk about today. But we're really excited about Posit Assistant. We think it's really good for data tasks.