Wes McKinney: Part 2 — The open source hustle and an insider view of Positron

In Part 2 of our conversation with Wes McKinney, we go beyond the code and into the mission-driven world of open source funding, community-building, and product strategy. Wes talks about what it takes to make critical tools like Arrow sustainable, from pitching to mavericks at Two Sigma to navigating the politics of Apache Software Foundation governance. Also, Wes gives a peek behind the curtain on the origins of Positron. And, yeah, metal.

In part two of our conversation with Wes McKinney, we dig into the challenges and realities of sustaining open source software. Wes shares how funding actually works (or doesn’t), why corporate buy-in is essential, and what it’s like building tools across languages, communities, and IDEs. We also talk about the Apache Software Foundation’s role in open governance and the origin of the Positron IDE.

What’s Inside:
• Why passion isn’t enough for open source to scale
• Apache Arrow’s origin story and how it was pitched
• How open governance enables trust between competitors
• The thinking behind Positron, Posit’s next-gen IDE
• Polyglot programming: Designing tools that bridge the R/Python divide
• LLMs and data UX: Why modern IDEs need to serve both humans and models
• Day-to-day coding, advising, investing, and context-switching
• Metalheads unite

Dec 3, 2025
26 min

Transcript

This transcript was generated automatically and may contain errors.

Welcome to The Test Set. Here we talk with some of the brightest thinkers and tinkerers in statistical analysis, scientific computing, and machine learning, digging into what makes them tick, plus the insights, experiments, and OMG moments that shape the field.

This episode is part two of a conversation with Wes McKinney, open source software developer, author, metalhead, and principal architect at Posit. I'm Michael Chow. Thanks for joining us.

Funding open source software

I do feel like a theme that comes up in a lot of your work and what you talk about is the funding side, like how do we get people into this position? And I feel like a lot of really interesting things you've said have also highlighted what gets attention and what doesn't.

Well, the whole funding open source is a really complex question. I came initially from the finance world, and historically, folks from financial firms were not so keen to release open source software and maybe even less keen to fund work that might benefit their competitors.

So I recognize that a big part of being a successful open source developer is building a business case for why people should adopt the software and why people should contribute to it. But then the next level beyond that is direct funding, and that often is even more complicated, in part because of kind of boring problems around budgeting and accounting and things like that.

Because, you know, maybe the person who wants to fund the project has to convince the powers that be to allocate budget to do the work. I recognized that particularly with Arrow, because Arrow was a project that was much more resource intensive than Pandas, and it also needed to have broad buy-in from a lot of sectors of business.

And so I had to essentially embrace this role of being like a developer evangelist or like a chief marketing officer for the project.

Yeah, well there's the technical persuasion of convincing people that you have an approach to solving a problem that is the right one and that it is also solving a problem that they have and that they should consider adopting your experimental brand new technology, which, you know, not that many other people use. You're asking people to take a leap of faith.

And so you have to identify people who have that maverick type innovator personality, who are willing to take a risk and try something new. They hope that it succeeds, but they're willing to fail. As an example, early on in Arrow, I connected with folks at Two Sigma. And they were like, we're really excited about this. And so we spent a bunch of time together, and they were willing to put their money where their mouth is and put significant resources behind me to work on the project.

You know, I think people have this idea that a lot of open source projects succeed on a single passionate developer working nights and weekends, squirreled away in the attic. But the reality is that it takes a lot. It takes corporate resources. It takes a village, essentially.

And so I couldn't have done most of what I've achieved without the support of people believing in the different visions and experiments that I've run. Some of those have succeeded, and, you know, not everything has been successful.

The Apache Software Foundation

I'm so curious how Apache factored in too. I know the name Apache, and obviously a lot of the Apache projects, but I'm curious what kind of benefits you saw in the strategy.

The Apache Software Foundation, or the ASF, is one of the best known open source foundations. They provide a home for open source projects. They provide marketing and some developer infrastructure, legal protections. You know, they manage trademark and copyright and that sort of thing, code licensing. They were started in the 1990s when the world of open source was very different.

So Linux was only a few years old at that time, and everyone was freaked out about getting sued by Oracle or Sun or whoever the major players were in that day. Because open source as a thing was really just the free software movement. It was still very new. But as the ASF evolved, it has become essentially a neutral home for open source projects that have corporate contributors that are looking to establish a Switzerland-like neutral ground for governance.

So that contributors can develop the project in a way that is open and transparent and try, at least as much as possible, to mitigate some of their conflicts of interest. That matters when a company is looking to develop a project but give up enough control that maybe even its direct competitors can join as contributors. And, you know, you don't want to be looking around and saying, what's your motivation? What are you hiding? What's going on in these backroom discussions?

So in the ASF, the rule is that you can't make decisions in private or in backroom discussions. If you want to make a decision, you have to have the discussion openly and in public, in a way where everybody can participate. And so this helps level the playing field. But it also helps people who maybe don't work at your company. It gives them visibility into how you're thinking, what you're doing, and why you're doing it, and gives them a way to also get involved.

And so as a way to build community, I think the ASF has played a pretty important role.

It sounds like Apache really establishes a place for companies or, like you mentioned, competitors to kind of trust. Open source is an interesting one, whereas I guess as an open source developer I've often thought mostly about the licensing. It's interesting to hear all the other factors.

So ASF is a combination of a permissive license that's corporate friendly. So the Apache 2.0 license includes, you know, do whatever you like with the software, but also don't sue us. And don't get software patents and then sue us over the software patents. So it has protections for all of those things, to make the corporate, you know, pointy-haired lawyers happy that their bases are covered with respect to IP.

But beyond the licensing, there's just this community. Community over code is the motto of the ASF. It's this sort of community-first mindset about how you communicate, how you make decisions, how you treat each other, like having a code of conduct, being respectful. And how you treat each other is also that you treat people with respect in terms of, you know, you don't just show up and try to steamroll a pull request or say, I'm the authority on this matter, and this is the way that I think it should be done. You actually have to argue from first principles why your approach to the problem is the right one.

So there's no project dictator, you know, benevolent dictator for life, and there's no explicit hierarchy within a project. If you have commit access to a project, your commit access cannot be taken away. So it's a very egalitarian and, I think, pretty successful model that other projects that aren't necessarily in the ASF, for whatever reason, have strived to emulate.

But I had almost no exposure to this way of working in open source until circa 2015, when I ended up at Cloudera. And so I think that was interesting because it was a combination of seeing how ASF projects work, but then also exposure to big enterprise software. Because I had lots of colleagues at Cloudera who had worked for VMware or for Google or for, I guess, Meta, Microsoft, all these big tech companies.

And so I was also coming from the finance world, which, software engineering wise, is a little bit more of the wild west, into more of this enterprise corporate software engineering culture. So it was a very eye-opening experience. I think that's definitely helped me become a better software engineer, or at least think about how to build larger software projects and enable them to scale in a way that's more healthy.

Origins of Positron

What kind of stuff are you up to now? Like I know a little bit about your stuff on the Positron team that I'm curious to hear about.

Well, I mean, it was just around the time that I was learning about the ASF and we decided to start the Arrow project. And that's when I first got involved with Posit. So I'd known Hadley Wickham for several years. And Hadley just happened to be in town the month that we were launching Arrow. And I was really excited about it. Oh, we're starting this new thing. And the idea is to have portable, interoperable data frames.

And so we spent a day together and brainstormed, like, what could we do with this that could be useful for the data science community? And so we decided to create a small file format called Feather. So that was my first concrete collaboration with Hadley. And I think that was pretty successful. And it also helped demonstrate the potential of sharing technology across the language fences. And we were like, how can we build tools that could help maybe tear down the language walls and end some of the language wars?

And so that kind of sent me down this path of thinking a lot about how to build tools and systems that empower polyglot teams and polyglot development. So I've spent the better part of a decade working on the Arrow project, which has been all about building composable, modular, interoperable technologies that can be used interchangeably across programming languages and that can be mixed and matched to build different types of data processing systems.

I was catching up with JJ Allaire, founder of Posit, and the Posit team. And so I was like, what's new at Posit? And they were like, oh, we're building a new polyglot data science IDE called Positron. And this was in 2023. So this was all under NDA, like, you can't talk about this.

And I was drawn to the idea of taking everything that Posit, formerly RStudio, had learned about building an amazing data science user experience in an IDE form factor. So code first, you know, code editor, console, variables pane, plots pane. And re-imagining that within the context of the modern programming landscape, which is very dominated by VS Code and the VS Code extension ecosystem.

And so at that point, Positron had been developed privately for a year and a half or so. And the prospect of getting involved in that, and also getting involved in Posit's Python strategy more broadly, really appealed to me. Because, you know, Posit had been RStudio and had been focused on R programming. It had rebranded to Posit, added Python support to all of its products, started making significant contributions in the Python landscape, and hired you.

As well as bringing some of the goodness from the R world into the Python world. You know, Shiny, Great Tables, lots of good stuff. And so, you know, I never imagined myself working on an IDE, but I found some things in Positron. For example, building a modern data viewer, like a data explorer component. I imagined, like, what do I want if I have data frames in memory? What do I want a data viewer, a data explorer component, to do?

And so we set about to try to build that thing, and it's still early, but it's already a very useful thing that I enjoy using on a, you know, nearly daily basis.

Building the data explorer

So a data viewer is like a really fast kind of glimpse into your data that you can filter. And so when we initially brainstormed on it, we looked at RStudio, which has a data viewer component. So when you have a data frame loaded up in RStudio, you can click on the data frame in the variables pane, and it will pop open this data viewer window. It has some basic filtering capabilities. It has some search capabilities. So we started from that place.

We're like, well, we'd like to essentially have table stakes, like deliver what exists in RStudio, but also sky's the limit. What else can we build? What have we always dreamed of having?

And so, you know, for me, so often I've found myself looking at the data while I'm writing code to clean it and manipulate it. And so I wanted to have a thing where I could apply selective filters, or I could sort, or I could search, but that would live update while I'm writing code and modifying the data set. I wanted that live update experience. I want something that has really smooth, infinite scroll, both horizontally and vertically. I've often found myself with these massive, you know, 10,000 column or 100,000 column data sets. I'm like, why can't we have a data explorer that is infinitely scrollable horizontally and vertically?

And of course, you know, seeing some of the innovations that have come out of the business intelligence world, having things like sparklines, summary statistics, being able to have the statistical visualizations just immediately available. We're really influenced by, you know, I'm a big fan of this tool called Rill Data. So it's Mike Driscoll's kind of latest company, powered by DuckDB.

But I also wanted really basic stuff. Like, if you're in VS Code, and I'm sure you've been in VS Code, and you have a data file, shouldn't you be able to just open the data file and look at it? So I wanted that instant preview experience. And so we use DuckDB to build that. You can just click on a Parquet file and it opens and everything works. And if it's a really big Parquet file, it may not be instantaneous, but it's really snappy.

And so I'm proud of what we've built. And it's a nice feeling, you know, to cross over, to build something that's at the intersection of backend engineering, because clearly you need to have knowledge of the internals of how Pandas works and how Polars works, and R data frames, and how DuckDB works, to create this thing where you can deliver that snappy, intuitive thing, something that's easy to use but also is fast.

I mean, you spend a lot of time looking at data. And now I think what's interesting is that now we have LLMs. And like LLMs also need to be able to look at data. And they need to be able to look at a lot of data. And context windows are only so big. So it may not be practical to take the whole data frame and stuff it in a context window.

And so clearly I think the focus of Positron and this data explorer is empowering humans to be able to get the information that you need quickly and intuitively, and for it to be fast and not get in your way. But then, as time goes on, we have to think about how we expose information about data sets to the LLMs, so that if they're empowering you to ask more questions, they can get the information they need to give you high quality suggestions.
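
One way to picture the point about context windows is a digest that stands in for the full data frame. The sketch below is a minimal illustration in plain Python, not a description of how Positron or its data explorer works. The `summarize_table` function, its `max_examples` parameter, and the sample rows are all hypothetical, and it assumes every cell holds a simple hashable value such as a string or a number.

```python
from collections import Counter


def summarize_table(rows, max_examples=3):
    """Build a compact text summary of a table given as a list of dict rows.

    Hypothetical helper: the summary is short enough to hand to a model in
    place of the full data. Assumes cell values are hashable scalars.
    """
    if not rows:
        return "empty table"

    columns = list(rows[0].keys())
    lines = [f"{len(rows)} rows x {len(columns)} columns"]

    for name in columns:
        values = [row.get(name) for row in rows]
        present = [v for v in values if v is not None]
        missing = len(values) - len(present)
        type_names = sorted({type(v).__name__ for v in present}) or ["unknown"]
        # The most frequent values stand in for the whole column.
        examples = [repr(v) for v, _ in Counter(present).most_common(max_examples)]
        lines.append(
            f"- {name} ({'/'.join(type_names)}): {missing} missing, "
            f"{len(set(present))} distinct, e.g. {', '.join(examples)}"
        )

    return "\n".join(lines)


# Made-up sample rows, only to show the shape of the summary.
rows = [
    {"city": "Nashville", "temp_c": 31.5},
    {"city": "Charleston", "temp_c": None},
    {"city": "Nashville", "temp_c": 28.0},
]
print(summarize_table(rows))
```

The summary grows with the number of columns rather than the number of rows, which is the property that matters when the context window is the constraint.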

Day-to-day work and investing

Mostly I've been working on Positron lately. I work closely with folks who specialize in front-end development, and sometimes I find myself bottlenecked, like I need help from somebody who knows front-end better than I do. Or if I try to do the front-end portion, I'll make a mess.

And so it's a good ebb and flow in that like I might have a week where, you know, I work mostly on Positron and, you know, have 10 pull requests or something. And then a week where maybe I'm at a conference or like I'm, you know, joining customer calls or like doing presentations. Like actually I think one of the great things about being back at Posit is having more input from real data science users. Like learning from people who are using data science, open source data science in a business setting.

Because I've been so heads down working on, just working on open source projects. And, you know, I had my, you know, I was full-time at a startup, you know, up until a year and a half ago. So didn't have a lot of time to like go interview, you know, people working on enterprise data science teams to learn about like what their big problems are. So that's been really interesting.

And I do spend a portion of my time investing. Part-time, I have a small venture fund, which basically started out as an extension of the angel investing that I was already doing. But also it helps me be, not only, but help be Posit's eyes and ears in the broader market and understand where technology trends are going.

Like what, you know, new companies, like what, you know, what they're doing. You know, what can we learn from? What should we do differently? You know, where are there opportunities, things that we should be doing that we're not doing? And so that's been, you know, that's been interesting as well.

I do not aspire to be a full-time venture investor. People are asking me that all the time. Like I was just at a conference last week. They're like, Wes, you're a VC now? I'm like, no, no. I'm at best a super angel.

And I've really focused on working with companies that are directly involved in areas that I care about, like data infrastructure, developer tools, open source, things relating to open source data science. And now some of the crossover into the AI landscape. And so I'm hoping, in picking companies to work with, that they're building stuff or engaging with communities that I care about, and that that helps make the whole pie bigger for everyone.

Essentially it's another lever in the community development toolbox, in a sense. I've been an advisor, like I'm an advisor for LanceDB, for example, which is Chang's new company. So he co-founded LanceDB with Lei from Cloudera. So they're building a vector database and a file format for multimodal machine learning and AI workloads. And that's the perfect example: they're using Arrow, they're using DataFusion, which came out of the Arrow ecosystem. And so their success is the success of the Arrow ecosystem. And that benefits the data science world more broadly.

Metal music

I'm so curious to ask you, because I know last year we were both in Charleston and we went to a metal bar. And in fact, someone in a review called it a relatively new metal bar that hadn't gotten its patina, its metal bar patina. And I remember asking, who listens to metal? And we had one very obviously metal coworker. And I feel like you really surprised us by coming out of the woodwork. I figured this is our chance for you to maybe tell the people what you love, what's hot, what's not in metal for you.

Well, I definitely have wide ranging and at times eclectic music tastes. But I do enjoy metal. When I was in college, I had a number of friends who were big metalheads, and they introduced me to symphonic and power metal. And so recently the last metal show I saw was Dream Theater. They played Nashville. It was awesome. I think that was the second time that I've seen them.

And so I wouldn't say I listen to too much metal these days. Maybe I ran the half marathon in San Francisco several years ago and my playlist was mostly metal. I think it really helped. I really enjoy the concerts, especially the live concerts, just because I think the crowd energy is amazing. But also I appreciate the technical execution. Just the playing is just incredible. Everything, the singers, the drummers, all of the guitar and bass. They put on really technically impressive shows.

Power metal is pretty epic. I guess DragonForce is the one a lot of people know. Yeah, they're kind of the most over the top. I do like DragonForce. I saw them once in concert. It was pretty insane. A band that I really liked was Sabaton. They're Swedish. They do a lot of war themed music.

It's all a crowd energy thing. I didn't even know them. I went to see them in San Francisco with a former colleague from Cloudera. I was there to see Nightwish, which is a famous symphonic metal band from Finland. They've been around since the late 90s. Sabaton was the opener. I realized pretty quickly, everyone's here to see Sabaton. They're not here to see the main show. I think we were also blown away by the opener. Nightwish was still great. They're always great. It was almost a little bit of a letdown.

For coding music, it needs to be a little more chill. I do like having something on in the background. I've got to listen to music without lyrics, I've found. Movie soundtracks and soundtracks to video games have been pretty successful coding background music. It varies by the day and my mood and whatnot.

Well, Wes, thanks so much for chatting. I mean, we're co-hosts. The Test Set is a production of Posit PBC, an open source and enterprise tooling data science software company. This episode was produced in collaboration with branding and design agency, Adji. For more episodes, visit thetestset.co or find us on your favorite podcast platform.