Transcript
This transcript was generated automatically and may contain errors.
Welcome back to the Data Science Lab everybody! This is a place where we get together every Tuesday to pair code as a community and we have a different Data Lab Manager every week. Today I am joined by my co-host Isabella Velazquez. Isabella, would you like to say hello?
Hi everyone! Thanks for joining. I'll be posting the links.
Yes, Isabella is absolutely on top of links in the chat. So if you are wondering what we are using or talking about today and you think that there might be a link, give it a second. Isabella will probably put it in the chat. She's so fast and amazing. And I would love to introduce our Lab Manager for today, Joey Marshall. Joey, would you introduce yourself?
Yes, thank you so much Libby. It's nice to see everyone. I'm Joey Marshall and this is Tofu. You can't tell that he's a cat because of the shape that he's in, but he's typically a cat. And collectively we are VP of Data Science at Verisight, a survey and public opinion research firm. I know a lot of you though through the Posit Data Science Hangout community, just because I lurk in the Discord. And so for the longest time I was a Data Scientist at the Census Bureau. So Joey from Census is the same as current Joey. I don't know, we change a lot over the course of our lives. Am I exactly the same? Kind of.
We're a million different people. Well, I bet a lot of people will remember you because you ask a lot of great, great questions at the Data Science Hangout. And I have been told before, I have been sent many a DM. Who is that guy who just asked that question? He has a podcast voice.
I'm like, he sure does. Joey and I need a podcast. Well, I will say Tofu is adorable and makes me want to ask Lauren to share her cat Nightmare in the chat somewhere. Share us a picture of Nightmare, Lauren, because Lauren has a very, very adorable black cat too.
Introducing today's topic: Claude Code
All right. Well, Joey, today our plan is for you to walk us through how you use Claude Code. We do not always talk about AI and LLMs and stuff on the Data Science Lab. We cover all kinds of topics. Today we happen to be talking about AI. And Joey, you have been using LLMs regularly for your data and other workflows for how long?
Oh, a long time. I mean, definitely I was using the early LLMs before there were transformer models to do work, before they were any good, really. But that was kind of in the before times. Ever since, you know, ChatGPT and Claude and Gemini have gotten good, I've increasingly used LLMs both to build things for people, either employers I've worked for full time, or, in the past, side contract work for employers who wanted things built with LLMs. So that would be like fine-tuning an LLM or building a wrapper around an LLM in order to create something for customers. So that's one use case. But then the other use case, which is more central today, is pair programming with LLMs, like using Claude Code to help me build stuff, which I increasingly do. And that's probably more relevant to today's chat.
Yes, absolutely. And I say we go ahead and start screen sharing and start working through what we're going to do today, which is working through a TidyTuesday data set, but giving Claude a lot more free rein than I would in an analysis. This is going to be fun. And while we are doing that, Noor has our first question today in the Discord, which is: for someone who's bearish on AI, what benefit does using something like Claude Code bring? And second question, which I want you to answer first: what's your favorite pie?
Oh, hey, Noor. My favorite pie is that pie that you posted in the Discord last night, because it's the most beautiful pie I've ever seen. And I really like your question on being bearish about AI. I'm even reluctant to assume that identity myself. And that's something I'll try to be sensitive to as we go through the... Like today, we're going to intentionally use Claude Code and really push it to the max and try really hard not to do anything hands-on. That's not a representation of the way that I always work. But I think some reasons to be skeptical, especially for us data scientists, would be statistical errors: whatever you build needs to be accurate. There are security risks, like giving secrets and your API keys and tokens to Claude Code and just exposing your file system to Claude Code. It's a CLI tool. Those are all reasons to be bearish about AI. But there are a lot of good reasons actually to appreciate working with it too and to try and strike a balance.
And for me, that's that... I mean, obviously, I can work a lot faster with Claude Code. That's not always a virtue. Sometimes being slow is preferable. But if it's something that needs to just work, let's say it's like a front-end tool. I'm not a front-end developer. I don't need a production-grade front-end tool. I just need a clicky interactive thing that will work as a proof of concept. And then I can hand it over to real software engineers to build that kind of thing. Then AI is great for that. First of all, I know it's not making any statistical errors because I'm not asking it to do any statistics in that case. I just want it to mock up like a simple front-end, which it's great at. And you can just fly through tasks like that with Claude Code. In more sensitive stuff, you have to be a lot more careful, build in a lot of unit testing, look at a lot of stuff hands-on.
But the other thing is I always learn a ton from Claude Code. And I think this is true of all the AI agents. They're great teachers. Like what Wikipedia used to be: a great jumping-off point. Don't get me wrong, I still love Wikipedia. I still use Wikipedia all the time.
Setting up the terminal and starting Claude Code
I think that we will demonstrate some of this stuff as we go through today. And Kieran just said in the chat that Claude's loop where it can be permitted to use local resources to write, to run, to test code on your machine, to read output from it, turns out to be really a powerful way to iterate on things. And I think we'll do a little bit of iteration today and demonstrate that. So the first thing that we will do is I'm guessing tell Claude what we want to do. And what are we looking at here? We have your terminal is up and we don't have a Claude open yet.
That's right. This is just an empty terminal. Okay, I'm on a MacBook Pro, so you know what computing system I'm using. And my terminal here is through Cmux. I have a list of stuff that I use, and Isabella and Libby can share that at the end. I like Cmux because it has nice vertical tabs. Check this out. I can spawn all these terminals and have them all doing different things in these pretty little vertical tabs. Cmux also hooks into Claude's hooks, so you can get desktop notifications whenever Claude is done doing stuff, whatever. I don't sell this product. It's free. Cmux. Okay. We're looking at a bare terminal, and I have some terminal aliases set up. If you don't use the terminal a lot: you'll have some configuration file, depending on your operating system, where you can define a few characters that translate to a longer command. So for me, the terminal alias dv just puts me in my dev directory.
I have a terminal alias called CCDP that is going to boot up Claude Code for me. And you'll notice it just asked me for my fingerprint password. That's because I'm running Claude Code inside of a 1Password command line tool. This is also in my list of resources. 1Password, you know, it's just a password manager, but it has a secrets manager that can interact with your command line. You can run Claude Code inside of it, and that lets you do really useful stuff, like keep your passwords secret. So all Claude Code sees is a link to a password, not an actual password. I'll say more about that later. Now we're in Claude Code, and we're going to... let me just grab this data and pull it over. I've never worked with this before. It's about frogs. That is about the extent of what I know. So we are going to do this in real time.
Exploring the frog data set
We're looking at the Australian frogs data set from TidyTuesday, which is, I believe, week 35 from 2025. Okay. So we've got this FrogID data. This gives us instructions on how to read it using R, and we're kind of going fast and having fun. So I'm just going to copy this URL, and then we're just going to drop this into Claude Code and ask it to explain it. I'm using voice dictation to communicate with Claude Code, an app called Wispr Flow. It's also in the list of resources.
I'm doing a live demo of Claude Code, and we need to build some kind of data product within about 45 minutes. So don't let me down. That's the first thing I'm going to say to Claude. I really need it to be on its best behavior. The first thing I want you to do is to explore this data set. Tell me about it. Make sure you can read it and just explain the data a little bit to me.
Okay. While this is cooking, I want to quickly explain what I was saying before about wrapping my Claude Code command in the 1Password password manager. Often when you work with an AI agent, if you want the agent to be able to do things autonomously, you need to be able to share secrets, API tokens, credentials for stuff, URLs to things that may not be public, whatever. And it's common, I think, in practice to drop those in like a .env file, .env in your directory. But the problem is, you've exposed a list of secrets in plain text on your machine. If it's a shared computing environment, that's really risky. And also, if you accidentally commit that, that's a real problem.
Let's see what Claude says. This comes from FrogID, a citizen science project where people across Australia record frog calls via a mobile app. There are two files. Frog ID data.csv. It's got about 140,000 records. Each row is a frog call recording, with lat, long coordinates, so we can maybe do something geospatial. State, which I guess are the top-level geographies in Australia. And recorder ID. Okay, cool. 186 unique species. That seems like a lot to me. New South Wales dominates with 43% of records, followed by Victoria and Queensland. Okay. And then we've got frog names, which just maps the scientific names to common names. Okay, cool.
What kind of data product are you thinking? An interactive map dashboard analysis or something else? So here's what I'm going to do. Let's do some interactivity. I'm going to get Claude to give us a few ideas and then we can all decide as a group what we want to do. It's a safe place.
Choosing a project and discussing prompting strategy
That's a great question. I should have said at the beginning, I'm using Opus. I think 4.6 is the latest Claude Code generation here. Codex is getting better. I tend to use it less for vibe reasons because it feels less... let me give a more intelligent answer. Definitely read Simon's analysis. I do most of my work in Claude Code using Opus. You burn through tokens really fast that way. Sonnet is fine for light engineering stuff. I tend to only use Haiku, Claude's smallest model, for LLM tasks, like when I want an LLM to do something, like read some words and extract something or whatever. I did build an integration once, and you can find it on my LinkedIn and GitHub and stuff, where I could give Claude Code the autonomous ability to get a code review from Codex on the same machine. I liked that for a while, until Claude Code's own code reviewer sub-agent was created, I think in the last generation or two. You can ask Claude Code to deploy its own code reviewer. It's a code-reviewer sub-agent, which is a much better code reviewer in my opinion. I just haven't used Codex all that much recently. Sorry that I can't give you a better answer. I'm using Opus right now.
That is totally okay. And Becca had also asked a follow-on to this whole prompt, which we are about to look at and go over the results of. She asked, with your Claude prompts, what components are you intentionally saying versus what did you feel free to be more liberal about? And are there any aspects of your prompts that are directly influenced by your training as a data scientist? Oh, that's a great question. I used to try to be very precise in my prompts. And just anecdotally, I've learned that the more words, the more human language I can give agents, the better they perform. I mean, think about what it's doing. It's taking your prompt, it's getting an embedding, and then it's moving all that context back and forth and back and forth every time. And so, in terms of it generating relevant text after what I've said, the more I can give it, the better.
As a matter of fact, this is just, like, a general, I think, best practice: I tend to have Claude Code write as much documentation, maybe more, as it does code. In a second, we'll use planning mode, where it spins up a whole bunch of agents and does a whole bunch of reading and writing. Before it does anything, I'll have it write a bunch of documentation that we can reference. The same principle applies to prompts that I give it. I tend to just say a lot. And if you're unsure, just say even more. And that's why I really appreciate voice dictation. So I'm less precise and much more verbose with my prompts than I used to be.
I think we'll move on in just two seconds after we stick in Zach's question here, which is: how does the computer know that you're talking to the AI versus just talking? Like, is there a button you're pressing or a hotkey? Yeah. You can set a hotkey. So I've assigned a hotkey that I hold down. You can choose a hotkey.
Planning mode and managing secrets with 1Password
Let's see what we've got. It gave us three proposals. All feasible as a single page web app that we can ship to Vercel in under 30 minutes. What the heck is Vercel, Joey? Oh, it's just like another back end as a service app. I have so many of these goofy things. For databasing, like I use Supabase a lot, like Railway for workers or GitHub Actions. All these back end as a service things, a lot of times what they're doing is reselling you AWS. But what you're paying for is a ton of convenience. So they're really doing some work to design away a lot of the friction in getting a working thing online. So Vercel is just another hosting and deployment environment. But it's super tailor made for Python, which I work in a lot. And so you can just like put together and deploy a Python app and then layer on a little front end and it kind of just works. So I have a Vercel token anonymized in my environment here. And it's just like a quick way to get up and running.
For your own org's enterprise solution, your org probably has something better with like software engineers on staff who can help you do it. But if you're just like doing stuff as a person and for proof of concept, all these services are I think pretty great and they're all pretty cheap. Usually everything is like 20 bucks. But then it's like streaming services. You have $200 a month of $20 a month stuff, whatever.
So who's the winner here of the options? Isabella, who is our winner? I think as it currently stands, it's option two. And apparently we get a pie.
Yeah, well, no, I was hoping for, I mean, like, nightlife heat map sounds really cool. But I like that Claude is proposing a radius query. I think that means we can make a little circle and then learn stuff about it. That's how I'm interpreting that. So let's do it. Option two is the winner. Would you please go into planning mode and do deep research on what we need to do to build and implement option two. At the end, I want this to be a GitHub repo that you can deploy to my personal GitHub. By the way, look in your global CLAUDE.md file for information on how we manage secrets. You should be able to see my personal GitHub PAT in 1Password. And you should be able to use that to deploy to GitHub. Also, I would like for this to be an interactive web app. My Vercel token is also in 1Password. You should be able to get it there. Otherwise, I want to make sure that whatever we build is feasible, and that you've thought through both the data that we need and the tech stack. And also the aesthetics. I would like for this to have a cottagecore aesthetic. I'm going to ask the audience for a color scheme. And I'll provide you a color scheme later. Go into planning mode. Think this through. And come back to me with an implementation plan.
Can you please ask Joey, do these voice-to-text things require a lot of hardware to work fine? Wispr Flow is cloud based, I think. It's a good question. I don't know. I don't know how well Wispr Flow is optimized and whether it uses any kind of GPU acceleration. I'm using, like, an M4 MacBook Pro. 48 gigabytes of shared memory, and it has, like, a 12 core CPU. But you know what? I also run Wispr Flow on my piece of trash Windows machine and it works fine. So it must be pretty well optimized.
Okay. So, Noor had also asked while we're cooking here, she's like, I know 1Password is a password manager, but how does Claude have access to it? Okay. Let's do both. Whoever asked about conflicting subagents. JPS. This is important. You can ask Claude to spin up subagents or you can use the slash.
This is a real problem. So, you can use the /agents command in Claude Code to either define your own subagents, or it has some prebaked subagents that you can use, like the code reviewer and some other ones. Or you can just ask in natural language, please spin up agents to parallelize this task as much as possible. You know, whatever. And it will try to do that. However, say you have multiple Claude Code sessions. I spun up a bunch of Claude Code sessions here. If I had them all working on the same repository, they would absolutely conflict with each other. And if you're doing stuff like that, you should use git worktrees. It's a little out of scope for today. That's like a whole Data Science Lab, using git worktrees. But it basically provides an OS-level kind of barrier between the various agents, so they won't conflict with each other. They all get their own branch and do their work. And then, you know, merge conflicts are designed away like that. So, git worktrees are a thing in git, but Claude Code is designed to integrate with git worktrees. That's a great way to parallelize across multiple Claude Code sessions.
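The worktree idea described above can be sketched in a few lines of Python. This is only an illustration, not how Claude Code itself manages worktrees: the `agent/<name>` branch naming, the sibling-directory layout, and the function names are all assumptions made for the example. The only real interface used is `git worktree add -b <branch> <path>`.

```python
import subprocess
from pathlib import Path


def worktree_add_command(repo: Path, name: str) -> list[str]:
    """Build the git command that gives one agent its own checkout.

    Hypothetical layout: the worktree sits next to the repo as
    <repo>-<name>, on a new branch called agent/<name>.
    """
    worktree_path = repo.parent / f"{repo.name}-{name}"
    return [
        "git", "-C", str(repo),
        "worktree", "add",
        "-b", f"agent/{name}",
        str(worktree_path),
    ]


def create_worktrees(repo: Path, names: list[str], dry_run: bool = True) -> list[list[str]]:
    """Create (or, with dry_run=True, only plan) one worktree per agent name."""
    commands = [worktree_add_command(repo, name) for name in names]
    if not dry_run:
        for command in commands:
            # check=True raises if git reports an error (e.g. branch exists).
            subprocess.run(command, check=True)
    return commands


if __name__ == "__main__":
    for command in create_worktrees(Path("/tmp/frogs"), ["map", "data"]):
        print(" ".join(command))
```

Each session would then be started inside its own directory, so edits made by one agent never touch the files another agent is working on until the branches are merged.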
Otherwise, for Claude just spinning up its own agents, like in planning mode, which we just did, Claude almost certainly spun up a whole bunch of subagents to do specific tasks. Oh, you go off and research this part of the tech stack, and you go off and look at the data, and you go off and do whatever. And you can see their little commands. You can see them as they work. But use git worktrees if this is a concern.
Noor asked about secrets. Okay. So, 1Password has a command line interface, or CLI tool. And what that allows you to do is to wrap your Claude command in a 1Password command called op run. And what that does is... let me just show you. Okay. So, I'm back on a terminal. We're out of Claude Code. And I'm in my dev directory. All of my keys are in here. I'm not going to show them to you because actually they're not printed in plain text. I'm just going to show you the file. There's a file called env.env.1password. And these are my keys. But they are not raw keys. This video can go on YouTube and no one can get anything from me here. They're just paths to my 1Password vaults. So, you set up a vault in 1Password.
So the file holds paths to your 1Password vaults, and then there's the op command that you wrap Claude Code in. I just dump all that into a terminal alias; more about this in the resources that will get sent around. This will trigger something called password injection. So Claude only ever sees this, but when it runs a command, the 1Password CLI tool will replace this with the actual secret. That'll get injected, and then you don't have to work with bare secrets in a text file. Does that make sense? Amazing, yes, this is so helpful.
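As a rough illustration of the "paths, not raw keys" idea, here is a small Python check that an env file contains only 1Password secret references (values starting with `op://`) and no plaintext values. The variable names and vault paths in the sample are made up for the example and are not taken from the session; the function is a hypothetical helper, not part of the 1Password CLI. A file like this is what gets passed to `op run --env-file=<file> -- <command>`, which resolves the references when the command runs.

```python
SECRET_REF_PREFIX = "op://"  # 1Password secret references start with this scheme


def find_plaintext_values(env_text: str) -> list[str]:
    """Return the names of variables whose value is NOT an op:// reference."""
    offenders = []
    for raw_line in env_text.splitlines():
        line = raw_line.strip()
        # Skip blanks, comments, and anything that is not NAME=value.
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, _, value = line.partition("=")
        value = value.strip().strip('"').strip("'")
        if not value.startswith(SECRET_REF_PREFIX):
            offenders.append(name.strip())
    return offenders


# Hypothetical file contents: two references and one accidental raw key.
SAMPLE_ENV = """
# secrets for the demo (illustrative names only)
VERCEL_TOKEN=op://dev/vercel/token
GITHUB_PAT="op://dev/github/pat"
DEBUG_KEY=abc123
"""

if __name__ == "__main__":
    print(find_plaintext_values(SAMPLE_ENV))
```

A check like this could run before a commit, so that a raw key pasted into the file by mistake is caught while it is still only on your machine.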
Context windows and token management
Okay. So, what's an LLM doing? It's not doing anything magic. A lot of it you can think back to your early text analysis and NLP principles. An LLM, a chatbot essentially, is taking your prompt and tokenizing it. You probably remember from early days that a token is just a fundamental unit of language. Claude, GPT, all the various LLMs tokenize in different ways. Tokens are usually parts of words. Claude tokenizes really aggressively, so even a space character is a token, and stop is a token, like stop generating text. Those are all tokens. But anyway, it's going to tokenize your prompt and get its sentence embedding, which is sort of like its coordinates in high-dimensional semantic space. That's not exactly accurate, but it's a nice heuristic for how to think about sentence embeddings. And its working memory can only hold so many tokens at once. Before last week, Claude Opus 4.6 could hold a total of like 200,000 tokens. So throughout this conversation, these tokens are going back and forth between you and the LLM, and its context window is filling up, filling up, filling up, until it's larger than it can handle. And when that happens, it's a real problem. First of all, anecdotally, I see performance issues whenever my context window gets close to the end. But more importantly, when your context window fills completely up, you can't prompt Claude anymore. So it will automatically do something called compacting the conversation, which is it will write its own summary of the conversation thread and then clear everything else out of memory. Now, you can trigger that compact yourself with /compact, and you can even write your own summary of the conversation thread. Now, as of last week, or maybe over the weekend, the context window for Opus is a million tokens. It expanded to 5x the size.
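The arithmetic behind the context window can be made concrete with a toy budget tracker. This is a sketch under stated assumptions: the 15% safety buffer and the function names are invented for illustration and do not describe how Claude Code decides when to compact. Only the two window sizes (200,000 and 1,000,000 tokens) come from the discussion above.

```python
OLD_WINDOW_TOKENS = 200_000    # window size mentioned for before the change
NEW_WINDOW_TOKENS = 1_000_000  # window size mentioned for after the change


def context_usage(used_tokens: int, window_tokens: int) -> float:
    """Fraction of the context window currently consumed."""
    return used_tokens / window_tokens


def should_compact(used_tokens: int, window_tokens: int,
                   buffer_fraction: float = 0.15) -> bool:
    """True once usage reaches the window minus a reserved buffer.

    buffer_fraction is a hypothetical safety margin, not a documented value.
    """
    return used_tokens >= window_tokens * (1 - buffer_fraction)


if __name__ == "__main__":
    used = 180_000
    for window in (OLD_WINDOW_TOKENS, NEW_WINDOW_TOKENS):
        print(window, f"{context_usage(used, window):.0%}",
              should_compact(used, window))
```

The same 180,000-token conversation that would be close to a forced compaction in the smaller window takes up less than a fifth of the larger one.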
Saw a lot of chatter on Reddit about that this weekend. People are hyped. I mean, I'm hyped, it's nice. And if you do /context in a Claude Code session, you can see this really helpful... I just love little visualizations in the terminal. So this is basically telling us that three units of this are consumed with its system prompt. You know, that's like stuff that Anthropic gives it: don't be evil, you are an AI, you know, whatever. It has like a don't actively harm things... Yeah, Anthropic has a philosopher on staff. She researches like morality and... Well, good. Yeah, that seems... I'm glad.
And then there are a few skills. We haven't talked about skills yet, but Claude Code skills are like markdown files that tell it how to do stuff that is useful to you. Yeah, we can have a whole Data Science Lab about skills, probably. Oh, time check: we have 15 minutes left. All right, let's hurry up. Thanks. Hey, Frog Finder is live at this URL. What, already? Like, it deployed it?
Reviewing the deployed frog finder app
I mean, I gave it my Vercel token. Okay, messages: this is stuff that I've sent back and forth to it, you know, whatever. Eventually this will get almost full, and it has a little auto-compact buffer so it doesn't let you go above this. So when it's getting around here, you want to compact your conversation. Okay. GitHub auto-linking failed, needs a Vercel login connection for that GitHub org. Oh, that's right, I haven't authorized Vercel and GitHub to communicate with each other. But the deployment works via direct upload. Okay, I can connect the repo in Vercel settings later.
Oh, am I like logged in or something? This is actually the part where I usually fail with Claude: the deployment part, where I don't have things hooked up in the background for it with my PAT and stuff like that, and things fall through the cracks. But I'm curious about how this will work. Please feel free to log in on the side if you need to. We're gonna give Claude a browser and let it look at the browser itself, so I have a Chrome browser with the Claude extension.
I recently asked it to book some flights for me using the browser, and it did better than you'd think.
Happy St. Patrick's Day everybody by the way I'm wearing green.
Okay, Claude is still thinking, so I'm going to pop over to Joe's question in the chat, which is: can you switch back and forth between models during the same session depending on what you need, or do you have to stick with a single model while you're in that one session? Because you can /model, right, Joey? Yes. Oh, was the question can you switch models in the middle of a session? Oh, good question. I tend not to do that. I don't know if that would wipe your context. Maybe? Does someone else know? I actually don't know the answer to that. If you know the answer, put it in the chat. My gut tells me that you would have to re-give it all the stuff, right? You'd have to re-give it your context, your CLAUDE.md file, all that stuff, I would think. But I actively do not know.
This is something that we will have to try. Okay. I think, you know, what's interesting is often the models don't know that much about themselves, because they were trained on data in a world before that model existed. Claude will try to reference its own docs, but yeah. I have a question from Isla: curious to know what resources you would recommend to learn all of this stuff. How did you learn all this stuff, Joey? Oh, blogs, YouTube videos, trying and failing yourself. Well, you could join the Data Science Hangout Discord. Yeah, so the Discord where you are right now is a great place to hang out with people. We have an AI and LLM channel. I remember asking Hadley this question not that long ago, and he was like, blogs, I read a million blogs. And I think that that's a great answer, and I've watched a million YouTube videos as well. Isabella, how about you? What have you learned from? Yeah, I also find, like, Anthropic has a course marketplace, and you know, they're fairly short and, like, from the foundational, so I've taken a few of those as well.
Oh my gosh. Okay. All right, I'm gonna drop this URL in the chat so other people can go there too. I guess if everyone goes there, you might blow up my $20 a month... Hug of death, everybody. Yeah, exactly. Okay, what can we see? Oh, there we go. Okay, I can click on a circle. Oh, it just makes an automatic circle. I kind of wanted to draw a circle. Oh, I can increase the radius. Here we go, there we go. And then this tells me the species found. Oh, and it says at the top, 41 species found within 90 kilometers of where we are centered, which is right around Brisbane.
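A radius query like the one on screen ("species found within 90 kilometers") can be sketched with the haversine great-circle distance. This is not the code Claude generated for the app, which was not shown; the record layout, field names, and placeholder species labels below are assumptions for illustration, and the coordinates are approximate city locations rather than rows from the data set.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, spherical approximation


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in kilometers between two lat/long points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def species_within(records: list[dict], lat: float, lon: float,
                   radius_km: float) -> set[str]:
    """Unique species among records that fall inside the circle."""
    return {
        r["species"]
        for r in records
        if haversine_km(lat, lon, r["lat"], r["lon"]) <= radius_km
    }


# Illustrative records only: placeholder labels, approximate coordinates.
RECORDS = [
    {"species": "species A", "lat": -27.50, "lon": 153.00},  # near Brisbane
    {"species": "species B", "lat": -28.00, "lon": 153.40},  # Gold Coast area
    {"species": "species C", "lat": -33.87, "lon": 151.21},  # near Sydney
]

if __name__ == "__main__":
    print(sorted(species_within(RECORDS, -27.47, 153.03, radius_km=90)))
```

Widening the radius slider corresponds to re-running the same filter with a larger `radius_km`, which is why the species count can only stay the same or grow as the circle expands.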
Yeah, we did this in like half an hour, because we talked for a lot of this, right, in the beginning. And we're 52 minutes in, we have eight minutes left, and we have a thing that is mostly working, as long as we can verify that it is correct, right? Yeah, I mean, maybe not like ready to, you know... But I mean, yeah, for a POC, a proof of concept, this is great.
Yeah, Anna says it's kind of weird that the dots are equally spaced, like put in a grid. I wonder... that's weird. I haven't looked at the lat long data, but I'm wondering if that's a limitation of the data or if it did that on its own and just aggregated within a grid. I don't know. That's a question we would want to ask it, you know, about the choices it made, and we'd fine-tune this as we went, you know, for a real thing.
We got them from Wikipedia. A few folks mentioned that it was actually in the plan. Oh, we just missed it because we're trying to go fast. Yeah, thanks, Aaron. Aaron let us know that it told us in the project specs it was going to pull them from Wikipedia. Okay. Yeah, Amelia, same question: I'm really curious about how accurately it did it though. That's the kind of thing that, when I am working with Claude, I am constantly cherry-picking to check myself. I'll go grab five of them and then I'll go look them up myself, and I'll try to make sure that I'm verifying that things are coming through correctly. So we would also build a whole bunch of unit tests to double-check its work. For clarity, we're going so fast right now, right, because we had an hour, more like 50 minutes, to do this stuff. We wanted to see how much we could get Claude to do in sort of a one shot here.
Oh, Darren, I'm so glad you're here. Oh, they're just put in grid cells, so a tenth of a degree. Oh, we would have to get into the details of, like, what is that in this projection? Yeah, I don't even know how this map is projected. It's a Leaflet map, so maybe it's Mercator by default. I like the little explanation though: small, medium, and large. Small, fewer species; bigger, moderate diversity. Yeah, this is something we would have to dig into. Oh, it's guessing 11 kilometer squares. Does that vary going away from the equator? Probably. I don't think these would be equal-area grids. Ah, that's getting into the weeds of geography stuff, whatever, it doesn't matter. Plus, Australia is a huge place, right?
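The question of whether a tenth-of-a-degree cell is really an 11 kilometer square can be checked with a short calculation. The sketch below uses a spherical-Earth approximation (about 111.32 km per degree of latitude) and illustrative latitudes for the equator, Brisbane, and Hobart; those constants and the function name are assumptions for the example, not values read from the app.

```python
import math

KM_PER_DEGREE_LAT = 111.32  # approximate, spherical-Earth assumption


def cell_size_km(lat_deg: float, cell_deg: float = 0.1) -> tuple[float, float]:
    """Approximate (east-west width, north-south height) of a grid cell in km."""
    height = cell_deg * KM_PER_DEGREE_LAT
    # Meridians converge toward the poles, so width shrinks with cos(latitude).
    width = height * math.cos(math.radians(lat_deg))
    return width, height


if __name__ == "__main__":
    for label, lat in [("equator", 0.0), ("Brisbane", -27.47), ("Hobart", -42.88)]:
        width, height = cell_size_km(lat)
        print(f"{label}: {width:.1f} km wide x {height:.1f} km tall")
```

Under these assumptions the north-south side stays near 11 km everywhere, while the east-west side narrows as latitude increases, so the cells are not equal-area: an 11 km square is only a good description near the equator.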
Amazing. Well, this got us so far. We learned so much. There are a million links in the chat, but also, when this video ends up on YouTube, if you are looking at the YouTube videos, all of the links you could possibly want are in the description. They're usually just stacked with links, and if anything is ever missing from those, send me a message on Discord and I will go fix them and just add stuff to the descriptions in YouTube. Thank you so much for hanging out with us. Isabella, thank you for hanging out. Joey, I hope you had a wonderful time. Thank you so much for sharing your knowledge with us. This was fun. Thanks, everyone. Yeah, everybody say thanks to Joey in the chat. This was so much fun. I learned so much. Bye, everybody.
