Latent Space: The AI Engineer Podcast - ⚡ Inside GitHub’s AI Revolution: Jared Palmer Reveals Agent HQ & The Future of Coding Agents

Starting point is 00:00:03 All right, we are here for a very special edition of Lane Space with my buddy, Jared Palmer, SVP at GitHub, and VP at Cori at Microsoft. Correct. Dual title. Yeah. Twice the fun. Do we're going to have two jobs? I'm only on, I'm only to fold the slamer.

Starting point is 00:00:19 I'm only on day 13, I think. Yeah. So early days. So far so good. So far so good. We've been trying to get you on the podcast for two years, I think. I think so, yeah. You know, like that more it.

Starting point is 00:00:30 Yeah, you're a busy guy. We were doing Herson, so. We have to do in person. Yeah, exactly. Way better. I should also plug that you have, if Jared Palmer Friends should dig into your previous podcast with Ken Wheeler. Not about that. It's from the inner room.

Starting point is 00:00:45 Okay. Shout out to Ken. Before that, you were building, I guess, like, V-Zero and EISDK. And you were just sort of VP of AI at. At Voselle, yeah. Yes, all AI initiatives and vibes. Yeah. And I feel like basically you went from, like, you sort of building one, code.

Starting point is 00:01:03 agent to now being the home for all coding agents. Is that like the general vibe of agent HQ? I think that's right. Yeah. So backing up, I spent the last sort of two years or so building VZO at VSEL and AISDK. And then the summer took time off and now joined GitHub. And today we launched Asian HQ among other things here at Universe. And yeah, it's going to be the home we hope of not only agents but also developers.

Starting point is 00:01:29 And it seems like the gravity well of. of this new collaboration space that we're trying to build. Yeah. What do you think, like, basically that GitHub can do that you couldn't do at V-Zero? GitHub is an enormous platform, right? So these are- 180 million. It's something about it.

Starting point is 00:01:45 Yeah, it's under 80 million developers. It's just the scale is immense, right? And V-Zero was focused on not only one language, but one framework. Right. And a specific problem space with a built-in renderer. And, you know, for those you're not aware of V-Zero, It's like a fault or lovable, but it's built by Versel. And it's focused on building NextJS apps, specifically NextJS apps.

Starting point is 00:02:11 That constraint was rather liberating for the team at the time, and it lets us really like laser focus. It does that my world at it. I hope, thank you. I hope so. And obviously at GitHub, you know, we're the home of all languages and, you know, and frameworks and developers. And so the scope is broadened.

Starting point is 00:02:29 And yeah, it's just a different part of the map, with you will, right? Yeah. How I've seen, so you've been basically covering the entire journey of coding agents. Yeah. For, you know, from the start. Like, what do you think?

Starting point is 00:02:41 What's your personal journey through coding agents, right? Like, we started out with co-pilot. Obviously, GitHub started the co-pilot trend. Sure, sure. When, tell us about the origin story of V-0. Yeah. And then how that develops and maybe where, like, what you want to see next with?

Starting point is 00:02:57 So, history. It's funny you ask that. As I've told this story multiple times, I feel like I've unlocked. different parts of it in my brain by going back, you know, like, so maybe we'll have to figure out how retrieval memory works. By the way, interesting how memory works for agents.

Starting point is 00:03:11 Totally. That's why I brought it up. Like, kind of everything as you... Yeah. Sometimes you discover new paths, right? Yeah. Anyway, the story goes like this. So when chat GPT first came out, obviously it was incredible, right? Like, world changing, immediately, faster growing product ever.

Starting point is 00:03:27 I looked back at like the timeline and dates and we were very early, like, in when I was at Purcell jumping into AI stuff. But the journey kind of went like this. So at the time, actually there was no AI division. There was no AI group.

Starting point is 00:03:40 I was actually the director of engineering for all of Versal frameworks. And I was helping NextJS spell, right. Yeah, so NextJS, Svelte, Turbo repo, TurboPack, Webpack, and all internal dev tools at Versel. And I was helping the NextJS team dog food

Starting point is 00:03:56 and test the initial implementation of server actions. And instead of building a to-do app, Gijermo, the CEO of herself was like, I don't you build like a playing ground? And I was like, okay, cool. So that led to the AI playground,

Starting point is 00:04:10 which is now just part of AISDK. We'll get there a second. Which, by the way, iconic for like side by side, but also the... Right, so Nat. God C. So Nathodev. Nath.

Starting point is 00:04:18 So G. I got a DM. I remember because I was at a bachelor party. And Gizermo, totally online, sends me know, like, Nat. Dot Dev is launching on Monday.

Starting point is 00:04:28 You have to ship. And I have been working on it previously. And so we were like... game idea. Yeah, so he got wind of it, I guess. So I definitely had to, like, jump into motion. And I didn't think we didn't even ship chat first. So he's never sends me this DM over the weekend. I'm at a match for party. And he's like, Nat.D. Dev, this sidebys, he sends me the link. And I play with it. I'm like, chow. Okay, so I spring into gear, ship the AI playground. It was cool about the AI

Starting point is 00:04:50 playground was it forced me to go through every single model provider's API docs and figure out their quirks of their nuanced streaming. Because at the time, it wasn't like everybody used Open AI. It was like, oh, little quarks. Some of them kind of were compatible. So that was my first way to it. And then launched AI Playground, that shot to the top of Hacker News. And I remember we didn't, I didn't even implement chat, because that chat wasn't actually like a fake, like, wasn't as important. It was just like, Confucian. So eventually we facted a chat. And that, that project, uh, out of that came AISDK, because I had already looked at all the model providers and all the, um, the combinations. It was like, okay, here's that chunk of streaming

Starting point is 00:05:29 code you need. And then AI The AI SDK found that sort of niche of like, how do we focus on the part that we're going to be good at, which is like that UI aspect of it, but then also knock it in your way. So that we shipped AIFlayground, and then AISDK launched. And then, you know, we're always about demos and having great starter templates at Bersel. And I remember writing Gigerra. I was like, you know what would be cool? This guy, Shad Sien. Oh, man, he seems like amazing.

Starting point is 00:05:54 And his UI library is doing great. Why don't we team up and ship a chaty BT clone? open source. And we did. We shipped us this awesome template, which is now called chat SDK, but it's great. And what that did, though, at Vercel was it set us up for like rapid experimentation because we had this really good, like pretty full-featured chat GPT ready to rock with all the latest features. Yeah. So when it came to like rapid prototyping that summer, now we're summer 23, it was so great. It was like liberating. So I remember at that point I had gained some momentum internally and pivoted almost entirely to AI. And I had Shu Ding, you're friends with, and Max Lider, and Shad Sien now, we're cooking. And I think at that

Starting point is 00:06:47 point, code execution had just come out. I think that's my time later. They have a cool sandbox interpreter. Interco interpreter. That's what they call at a time. And I had a very, as soon as I saw this, I had a very ambitious idea and proposal to present to Gijermo, which was like, What if there was some, mind you, tool calls don't exist, the context windows, 4,000 tokens. So, like, there's not much here.

Starting point is 00:07:08 What if we had this thing where, like, you could prompt, and sometimes it would do code interpretation, and then maybe we could sort of, sometimes we would decode interpretation, but then other times it would choose to render, like, UI, or then it would render sometimes, like, a document. In line in the chat.

Starting point is 00:07:26 Yeah, and like, it would just have different sort of generative UI. Yeah, and maybe you could. pipe them together. So like the output of one could pipe into it. So if we did code interpretation, we coerced, it's always emit like tabular data. Maybe we could pass that to another prompt that which like a you want. And just like some idea there. It's got crazy. But if it sounds like these are just tool calls, that that's exactly what these aren't really was. Yeah. So it became pretty obvious that like the, so I came because you remember at the time we just had a secure,

Starting point is 00:07:55 we had a sort of security debate. Like should we code interpretation with the ability to such data, like giving an internet access was like kind of now they're like, fine, whatever, whatever you, whatever you want, wow, wow, West. But at the time, it was like a little scary. So we kind of said, okay, no to the code interpretation, but this UI thing, it's pretty neat. And so that, this like prompt to UI, that was like the aha moment of V0. But the models were not very good, right? Or relative to where they are now.

Starting point is 00:08:25 And it was four, GPD4 era? This is just into the GPT4 era. and now we're probably at a 16,000 token context window. So you can't really do chat. So we had to kind of invent this kind of new paradigm of like fake it with completion. But that forced us to do sort of the click the initial V0, which launched I think in September 23. It would look more mid-jury.

Starting point is 00:08:51 In fact, if you like the original tweet, it was like mid-jury for React because it was all very visual. And you could click on different components and elements and repromp. But it was, again, we were kind of hacking this because we didn't have chat and we didn't have tool calls. And then fast forward again, you know, that launches. And then probably like nine months later, it took us like nine months to get to like a million ARR, this little team. But then the models progressed. And from, you know, GPD4, GPD432K, the big boy, we never really got GPT4 turbo working.

Starting point is 00:09:28 I don't know why. which never happened, and then switch to other frontier models and then start doing our own models and stuff like that. But fast forward another 10 months or nine months or so, and then we rebased towards chat. And now the models finally could do chat. And the artifact pattern had evolved,

Starting point is 00:09:46 so it was time to rewrite. When we launched VZero, the chat version, or the new VZero, whatever I would call it. It's like 14 days, another million MRR. 14 days, another million MR. It was like a rocket ship after that. And that just proceeded. And we just kept cooking.

Starting point is 00:10:01 And so that's been the journey. We just kept perfecting. And what was really liberating for us was actually the focus on just one stack or one framework. Whenever he else was trying to do general purpose coding agent, we were like, no, we're just going to focus on NextJS front end and Chad Sien. And that was really, that allowed the team to focus. So that's the story arc. I mean, to be fair, like, ball level ball, like because next year is so dominant, basically everyone has to be good at NextJS. Right. But being focused on, like, even right down to the UI library and component stuff, like, that actually helps a lot.

Starting point is 00:10:33 We also started working with all the frontier model labs to help, because it was in our Vursell's best interest to have them be created an XJS. Yeah. And also because of the post-trained models, and you can read about this on the Vassel blog, like we can, the post-training harness that we created, we started sharing and stuff with other model labs and stuff like that. And we had all our data in a very hygienic state to work with them. Did you have a debate internally? And because, you know, I've, my, from my seat at cognition, I can also see this, where you should pick the best qualities of every model and string them together in the V0, or you have the model selector and you let, it's counter is true. We went back and forth.

Starting point is 00:11:13 And I think, you know, we launched, we went back and forth on all this. I think at the end of the day, there's pros and comments. Yeah. One of the benefits of having your own branded model or synthetic. or composite is that you can stick these things together. Yeah, there's like at higher levels of, and now it's a little different with the say, gentic flow, but even look at what you,

Starting point is 00:11:34 like what you guys launched recently with, these are actually wrapped, right? So search is it going to be a different model than what generation, but, you know, the Genesis, but like, because it's a sort of. But like, search and Genesis are two different, like, like, you tire subsystems, right? So you can have search evals that are going to be totally different. And so how do you, so where we ended up now,

Starting point is 00:11:56 where are we? Where we've probably ended up now is like, for a long time, we didn't have model selector, and then we had our own models, which were composites, which we talked about. And then, like, and that would allow us to, you know, mix match. And I think that's probably if what,

Starting point is 00:12:14 it's also nice because you get as a, this is like the product cap, you get to brand it. Yes, right? And you can decouple it from the launch of the frontier lap. Yes. How do you guys, how does cognition even build for it?

Starting point is 00:12:25 Like, AFU's, uh, computer is a synthetic unit, right? Yeah. So it gets a little wonky.

Starting point is 00:12:31 Yeah. We can go on for pricing this, uh, this stuff. It gets challenging. But the nice thing about having the like, brand name model is that like, you get to co-launch with the provider and they'll hide you up.

Starting point is 00:12:43 So, uh, but your billing needs to then is capped at whatever retail is, right? Or, or some. Right. Right. Right.

Starting point is 00:12:49 Right. Or you can really charge of, uh, too much of a premium. You can, but. Right. But if you are,

Starting point is 00:12:54 but I want to be. I like, well, then how do we charge you for Sweet Rep or something? Yeah, yeah. And so, like, I think it's, some part of it is the cynical, like, you want to create a sustainable business and independence from the model labs. But the other part is genuinely,

Starting point is 00:13:08 you actually do get better performance. Right. You, like, string together, all these things. Yes, and so it's tough. I think, well, we've switching gears to GitHub, like, we are all about model choice now. And, like, making sure that, and what's cool is that, like,

Starting point is 00:13:21 we also have co-pilot, which is our harness, and co-pilot C-LI, but we also have third-party partners as like Cloud Code and Codex and cognition now in Agent HQ. So you kind of get the best of both worlds. And like, I think that's going to be awesome and ultimately what people want. Yeah, I think also the model layer is not the right abstraction to do the switcher anymore. Which is weird because that's where you started with the ISDK. Yeah, yeah, yeah, exactly.

Starting point is 00:13:46 But now it's like the model and the agent have to be strictly tied together, like very, very strongly balanced. You can't loosely bound it and just do it. a generic interface because then you're just going to have the lowest common denominator of all the models. If you're in Agent World, which may just be better than Chat World, like, in general, like, better... Agent World is a much better abstraction. I'm calling it Agent World, but I mean by like A, like, A, Loop with maybe compute runtime and, like, files. That's that your definition of agent? You're dropping your official definition here?

Starting point is 00:14:14 No, don't put me on stuff. But, um, maybe... No, so my initial definition of Agent for AIS, because, like, I actually fought I was dying on this still. because AISDK, everyone else is an agent framework. And I think maybe they actually went to like this, I don't know what it says on the front page now, but like, who knows? But like six.

Starting point is 00:14:31 An agent is, you know, an agent is orchestrating, you know, an API request with a queue and a four loop. Okay, but a coding agent now has meant so much more. There's like these coding agent SDKs and you've got sandboxing and file systems and tool calls. And I do think that is a unique, I'll call that agent world and charge of our coding agents here. And yeah, I think that that seems to be where things are going.

Starting point is 00:14:55 And even I believe the Claude Excel agent is basically, I was talking to Mike Krieger backstage, like, I think it's related to Claude Code. It could be. I actually don't know how it should have made it under a hood. Yeah, so it wouldn't surprise me if it was. Yeah, yeah, yeah. They seem very all in on skills, which is kind of like an interesting. What do you think of skills? It's kind of Dxt, which is like the sort of bundled.

Starting point is 00:15:22 version of MCPs. Okay. It wasn't, yeah, the reason you don't know about it is because it wasn't very popular. Okay. So,

Starting point is 00:15:29 Skills is kind of like the second shot that is very LLM-pills. Like, it's like, just read my markdown and just read this, this directory of files

Starting point is 00:15:36 and go nuts. As long as, like, it can understand that you have the capability to run code, to read files, you're good.

Starting point is 00:15:44 And actually, that is the universal interface, which is a file system. Right. Back to Agent File System. Yeah, which is kind of cool. Yeah,

Starting point is 00:15:51 so I mean, I think, like, what you're hitting at is, you know, this philosophy of, like, understanding of what coding agents, the minimum bar is over the last two years. Yeah. Right. Like, you've lived this journey. And, like, now you're basically kind of like the kingmaker or like the... I don't know that. You have phrase.

Starting point is 00:16:10 You run Agent HQ. And, I mean, I imagine you have other projects, too. But, like, AJSQ is, like, the big one they were talking about here. Like, what are you seeing from, like, the different agents? Like, what do you want to, this to become? Such a good question. I think that Agent HQ and GitHub itself needs to co-evolve. And, you know, one of the things that Microsoft has done really well is by putting things that are alike closer together.

Starting point is 00:16:39 And so you think about the new core AI organization. Yeah. Got Visual Studio Code. GitHub and parts of Azure all in one. and obviously the GitHub team and the BS code team have been working closely together for a long time but now we're really close together and I think for me one of the cooler things

Starting point is 00:16:59 that Agent HQ can sort of offer is this seamlessness, this fluidity with your workflow, right? So if you saw in the demo today, we saw a demonstration of you use Agent Hsu, you fire off a task and it creates a PR, but you can also open that PR up in VS code in one click.

Starting point is 00:17:19 And that's awesome. And I think the vision for GitHub as it evolves is to look at those touch points where AI can be sprinkled in, you know, salt-based style, into the native workflow, whether you're assigning an issue or maybe some new stuff that I think we should like focus on. Maybe it could be like, how do we resolve a merge conflict? Oh, my God. Right? Like, how do we maybe pop open an action or like get in? Right. And so like, I think Zolvi Emerging.

Starting point is 00:17:48 It makes this my definition of EGI. Totally. But like, you get that error on an action and you're like, we've all been in that, we've all been in that sort of flow where like actions kind of don't work. Locally, or is that tool act. If you are trying to, I don't have that set up on my machine. I haven't done this. And so you're pushing up and you're this like, okay, what if we could just put, like,

Starting point is 00:18:06 you know, comment or kick off a task to solve this for you or do your things there? I think, right, or I'm trying to describe as this like this workflow where it's just like seamless and fluid and you can stay in a flow state. across, whether you're across all devices, mobile web on GitHub.com or in your local editor. And I think that's where my focus is going to be in the next six months or so. Yeah, yeah. Just a side tangent on this. So one of the things that Microsoft also owns, I don't know if it's Microsoft or GitHub, is dev containers.

Starting point is 00:18:40 And I think like very important concepts for sandboxing, environment, whatever you call it, it is kind of a light version of what Docker containers are, kind of. Right. Do you see that as a standard that we should invest in as like a thing? Because it's supported in VS code. I don't think it's just that popular outside of VS code. Yeah, it's used internally at GitHub too for like development at GitHub. Oh, yeah, yeah.

Starting point is 00:19:05 Which is cool. Yeah, I think they were so far ahead almost. But now there's sandboxes. There's so many of these days, right? So I think Cloudflare just launched theirs. Yeah, there's Dayton. They're here, Purcell, modal, which I think Loveable uses. I'm not yet.

Starting point is 00:19:22 I mean, you'll probably have your own. I don't know. What do you guys do you guys? Just some Kubernetes pods. Okay, you guys are rolling yourself. That cool. I think that's maybe the runtime. But there's work and discussion about what that runtime should be even internally at Microsoft.

Starting point is 00:19:36 We've got a couple different competing things. So we'll figure it out in the next cycle here. But there's a great point. Like there's a lot of cool stuff that is. in a dev container. You've got Vs code loaded. You've got a file system. You've got a sandbox. You've got the security protocol. Yeah. And also like wired into GitHub Enterprise. Yeah. And like ready to be packaged. So there's lots of goodness there. Yeah. I see like the number one pain points that cognition has, but also codex, also presumably the other guys, is repo set up. It's setup. Which is effectively what dev containers and a Docker file does for you is like run this thing, then that thing. Sit this up, do that thing. Why is it so hard? Like, what have we solved?

Starting point is 00:20:15 I don't know. I think it's hard because you can't predict what's in the repo, right? So it's like, and you don't know when they bundled FFMPEG. You just don't know. It's nice when like if it's just next year, you just run PMPM install. Correct, correct. You can like do special. Like there's obviously, through constraints, you can make optimizations.

Starting point is 00:20:34 And I think the general purpose container is just like challenging. That being said, though, I think there's probably some work to do on, you know, auto detection and reempting. stuff to be done there, but it's just a bigger, it's a broader problem space, right? Yeah. And, uh, yeah. So, fun fact, when I was at Nelify, I actually wanted to reach out to Rizel to do, like, a standardized open source auto detection thing of frameworks.

Starting point is 00:21:00 Oh, yeah. And I, like, we, we never, like, got internal momentum on that. It was an idea. I was like, shouldn't this be open source, you know? Yeah. Like, auto detection is a, is a common utility that everyone needs. Yes. Yes.

Starting point is 00:21:12 I remember that I'm having a flashback. It's like, have, yeah, yeah, right. everyone builds their base and probably we shouldn't all build it. Right. Now it's, and then also like what are your defaults? They're not exactly the same, which would be better to like just having even the same like preference stack. Yeah, of defaults is the right. Yeah, yeah.

Starting point is 00:21:30 Would be great because then we could move the whole ego system together. Right. From like to PMPM or bun, right? Okay, so are there other movements or protocols or standards that you're interested in? Like MCP was a big winner this year There's other like, I don't know ACA, ACP, all this All this like that's interesting

Starting point is 00:21:50 I'm not as familiar with ACP the payments or one or the is that right right? Oh, no, there's that one. The one, okay. And then the payment that was a Stripe or Coinbase? Stripe. Yeah, that's very cool. We've had them on the pod. Okay, yeah, that's very cool.

Starting point is 00:22:05 It'd be interesting to see if that takes off. I mean, it's striped. Yeah, but it's supposed to be adopted by like the clients, right? Yeah, and I think that's the, It's fascinating. The MCP is huge, it seems like it is the way that a lot of the, especially when it comes to digital transformation or some of our enterprise customers, it's where they're going to be able to where they are able to add context.

Starting point is 00:22:26 In addition to that, we also have custom agents that we announced today too. So, like, you can work with prompts and stuff within your agent HQ and customize these agents for different tasks, and those can have MCPs and such. And I think that's going to be really powerful from like a platform perspective. Yeah, it's me excited. That's what I think is like shipping now. in the next, but we're always in lookout for like the next thing. And I don't know, what's on your, what's top of mind for you?

Starting point is 00:22:50 For like standards or standards? It should be a container. Look, I think dev container just as a PR problem. It's a great idea. Right, right. Just no one makes it interesting. I think you can do it, basically. Okay.

Starting point is 00:23:08 Add it to my list. But before that, probably you have a bunch of other stuff that I do want to get to. But just staying on the AI stuff, like I think we're actively exploring computer use as a thing, because it kind of got going a little bit. People were very excited, and then they found out it was slow and bad and inaccurate. It is computationally intensive. In my understanding.

Starting point is 00:23:30 It's getting better. Yeah. Especially with Open Vision models like Deep Seek OCR and OMO OCR, like just give it a few more turns of the scaling. It seems like it's like you need. that edge case and primary just seems like a modality worth pursuing. I think a lot of people

Starting point is 00:23:49 are on the code gen side, the code agent side, a lot of people are trying to think about all right, we had this evolution from co-pilot to like a more agetic sort of like a cloud code situation is where I think that the status is like what's next, right? What's the obvious next step? I mean making them good.

Starting point is 00:24:07 Making them good, yeah. You don't like clothes? No, no, I don't. It's more just like, you know, the devil's the details, like, going from 90%, going to like, hill climbing, it gets steeper. Yeah, yeah. In my opinion, it gets deeper. And so going from 90% success to 95 to 98 to 99 to 9s of success. Yeah.

Starting point is 00:24:28 I mean, we're really hard. Paying Mark Gore a lot of money for expert programmers of open source maintainers. Then you realize along the way maybe the users aren't that good at, but, like, no, but I just think there's a lot of work to do to finish the swing. and there's a big difference between 98% than 99% correct. And that's like noticeable. And this used to hit, you know, if you're working on AI product, you probably don't realize how you've probably seen this. Like most people are blind, like living in La La Land about how poor quality

Starting point is 00:25:01 their AI product lately is, unless they're really measuring like a number of error-free sessions, like how many errors are coming from the infra providers, like, you know, how many requests or drop, how fast these things are. And so that's something that we cared about it for sell quite a bit and, like, we'll care about it. Do you have, like, a daily review of your dashboard? I don't know. Daily, daily, slow. Okay.

Starting point is 00:25:26 I thought you were going to say days too much. No, and you're thinking about it every three hours. A roll-up of key metrics and stats. Yeah. And, like, one of them was, like, error-free sessions and other things like that. That was, like, really important because, you know, especially now with agents which are like multi-turn. I have a tweet about this that was like in 2024,

Starting point is 00:25:46 which is that like agents will really only work where we get to not only like the more intelligent models, but better reliability of the infrastructure providers, right? These are not, inference is not like a database, like update uptime, right? So there's still differences between providers. There's little differences between performance and difference uptimes. And that's why you see things like open router being very successful and different gateway products because like reliability,

Starting point is 00:26:09 you need to switch. They go down all the time. So, long story short, yeah, we would do, like, you know, it was almost like video game style. Like we'd have, like, all the data coming in all the time. Yeah. And that allowed, I used to joke, it's by mood ring, like, good day, bad day. So it was very successful for us.

Starting point is 00:26:27 I think other teams should adopt that, like, data-driven approach. I think one thing that's surprising is the lack, the relative lack I still see on data analyst agents where you can sort of chat. add a slackbot for the precise analytics that you want to generate. Because I think we're still in the BI era. Yeah. Isn't that weird?

Starting point is 00:26:50 Yeah, I totally agree. It's interesting that space hasn't been like captured as much. Like, I guess maybe now. Actually, I'm interested in this like shift to a way into, like, into knowledge work tasks with coding agents. Okay. I wonder. Using coding agents but non-coding tests? Correct.

Starting point is 00:27:08 Do you do that personally? Yeah, yeah, yeah, what do you do? Well, like, I was, like, this summer I was doing, like, I was something, I was trying to automate some my dad's workflows and stuff like that. And just like some of his, he's got some Excel spreadsheets and, uh, for like accounting, like, financial accounting, or managerial accounting, I guess. And, uh, yeah, just like, point cloud code of that stuff and see what happens.

Starting point is 00:27:28 And like, it's, it ends up doing Python and, and, and generating some scripts. And it kind of got off, down on the hairs. But it was like, even he saw that it was better at it than the, chat client that exists. Super obvious. It became kind of obvious. Yeah, it felt better. I wonder if he can try Cloud for Excel and see. Yeah, yeah, that's that. He got me right.

Starting point is 00:27:49 And then, of course, you got the browser, the, I don't even want, not browser, browsers, but not computer users, but browsers with agents, Asian browsers. Asian browsers. For Flexity. Everybody's, you come right, that guy. It's like, is that the better. If that's true, then maybe the general purpose

Starting point is 00:28:04 injection point is there. Yeah. What do you, or have, have you tried any of the agent browsers? All of them. All of what do you're doing? I am very, I'm currently maining Atlas, mostly because I just want to give

Starting point is 00:28:15 chat GPT a fair go. Okay. But I'm very stuck to the arc and like the vertical tab. Oh, okay. I think any pro user, like I have multiple businesses. How many tabs do you?

Starting point is 00:28:25 I'm context switching. I have hundreds of tabs open. I made a open source tool called Chrome Dump. You can find it on my GitHub where it literally dumps all the tabs open. It summarizes them and I can close it by deleting

Starting point is 00:28:36 them on markdown. That's pretty cool. So you just go on like a, you just go on like a, like a, like a, a, like a, a, a, a, a, a, a, and then you just dump it. It should be as easy to close it's marked down. And Chrome isn't that good at the performance side of things yet.

Starting point is 00:28:51 And you were working on like some browser comparisons. I was, but so I tried to build it in Tori. Okay. And Tori explicitly doesn't want you to build a browser, and I, I cared to fight it too much. I see. Yeah, that's very cool. So just to wrap things out of that and we're around about time.

Starting point is 00:29:07 There are other side projects that tasks and things that you've announced here. First of all, redesign GitHub homepage, which a lot of people don't even know GitHub has a homepage. I'm legitimately... I have a tweet, which is... Riz's this tweet, like, printed out. Like, you see the tweet, like, there's a tweet from Riz and it's like, no one uses... All of this stuff is totally useless. I'll pull it up.

Starting point is 00:29:31 Yeah, yeah, yeah. I quarry out today because we're like, when we launch... Let me get it right, because I got to do it right. Hold on. Okay. Was incredible how pretty much the entire GitHub homepage is useless. And is 1.3 million views and 19,000 likes. And this was May 25.

Starting point is 00:29:50 So, heard. The team made improvements. And today they launched a new GitHub homepage. Yeah, I'm very proud of. And they should be really proud of. It's got tasks at this top. It's got recent PRs. Some stuff is still there, like your recent repositories.

Starting point is 00:30:04 I think there's more work to do, but it's like really overhauled. and they did an amazing job with it. So they nailed it. But, you know, more work to do, never done. And, like, hopefully we can keep iterating with the community and everyone. And, uh, and keep going. The last thing I want to hit you on is stack tips. Oh, yeah.

Starting point is 00:30:21 You asked everyone when you join, what should I work on or something? Yeah. What happened? I don't know if this is your job specifically. It wasn't. But why do people want stack this so much? Listen, I think you have some history there. Yes.

Starting point is 00:30:32 Anyone who's interacted with anyone at Facebook, no worries about fabric here. They're not just about it. Yeah. So can you explain what it is? Why is it so hard? Okay, so this concept of a pull request, which we're all familiar with, you write some commits, you open a PR, and then you merge the PR, and you go about your day. So as you scales, like, larger organizations, and, like, you look at your history, and there are people who are, like, very, I'll say, like, have, like, near religious. disbeliefs about how to get, how to do get right. Rebase versus merch.

Starting point is 00:31:11 There's a crowd that wants to fast forward the repository, so to preserve all the history. And then there's a crowd that wants to squash and merge into the anime. I'm pews squash. Okay. Anyway, Facebook, and I've never worked Facebook, but in my previous startup before Versel, Turbo repo, I did a lot of research on build systems. And at Facebook, they have, not only they have their custom build tools called Buck, they also have a custom file system,

Starting point is 00:31:37 and they don't use Git. They use Mercurial, and then now it's sort of custom, and it's all wired together. And at Facebook, they don't use pull requests. They have a different sort of philosophy. You can,

Starting point is 00:31:53 it's sort of like the best way to it, like, think about this is like, imagine every PR just have one commit in it. Yeah, you could branch them, and the critical thing is you can restack them. and then if you restack or make a change

Starting point is 00:32:10 later in the stack than later in the stack and these stacks are just diffs right the commits or just dips and it's the term stack diffs you can then collapse them and merge the last one

Starting point is 00:32:19 and merge them towards it and it just gives you a little of a nicer workflow and it's what people if you work on a monorepo or you work on a very very large code base it's a really

Starting point is 00:32:28 really nice way to work especially you've got a system that will that will automatically restack And then if you think even more deeply about it, like, and get really deeper into the weeds, you can decide which diffs in the stack CI should run against. If you get fancy. Okay.

Starting point is 00:32:48 May not be. Like, so kind of commit messages. They're just always, yeah, you could decide, like, maybe this one doesn't need it or skip that one or whatever. And you end up getting these, like, sort of these groups, these stacks. And it's really nice from a code review perspective because when you go to update or you can update a different part of the stack, it just thinks. it's a little bit more fluid. And so it's what people want. There are a couple tools out there in the market

Starting point is 00:33:08 that do this kind of behavior. One's called graphite. There are a couple others. So many of it was called graphite. It seems there's another graphite right here. Yeah, yeah. And it's got, it's a generic workflow. And so it's been the top full request,

Starting point is 00:33:21 or sorry, the top feature request. Thank you. At GitHub for year, from the community's perspective. I don't know at GitHub. And then as soon as I joined, the first thing I did, I was going to look this up. And, well, that's the first thing. I asked how should make it up better, and it was the top feature request, right?

Starting point is 00:33:38 And then I went to go like, okay, investigate, like any good product person would. And there's been multiple attempts of this internally, going back to like 2020. Okay. And there was one very, very, very polished attempt to in 2022. And it just, I don't have all the context. So, but it was, there was a pretty good implementation. All of the work was done on the client. And it reintroduced this new concept called Stacks.

Starting point is 00:34:05 outside the pull request into GitHub, and it was a little too risky. It was sort of deemed too risky, too big of a change. That's just what I was told. So anyway, we're, we have cold meetings internally already, and we're trying to weave it into planning and the roadmap, and so hopefully we'll be able to share more updates soon,

Starting point is 00:34:23 but like it's a top of the list known feature at well. Again, heard. Yeah, and so, and like we're working on it, obviously, like, something the size of GitHub to move, to, like, support this kind of new, your feature is not just like a walk in the park because of the size of GitHub's and GitHub's Git implementation but it's something that we're actively exploring.

Starting point is 00:34:44 Yeah. Well, I think, you know, just to wrap all that up, you know, I think it's really nice for someone who's so deeply engaged and like coming from like one of us, literally. Yeah, yeah. That you now run things at GitHub and we can just add you. You can do that mean. I think like Audrey Carpathie was the other day was saying,

Starting point is 00:35:01 like every company needs one of these. Or you can just like, hey, like, this really should exist at GitHub. We love GitHub. We use GitHub. But like, come on. Well, yeah, feature requests, welcome. Like, my DMs are always open. Oh, careful. I don't know. 10080, whatever. I am of the philosophy that, like, all feedback is a gift. Like, it's all a signal. Yeah. And the more signal we can collect, the better decisions we can make.

Starting point is 00:35:25 And truly build this really, really useful website and company, like, together. And that's going to need the future. focus just on that, we're going to be okay. Yeah, we're going to be okay. All right. Well, thanks so much, Eric. This is a real pleasure. Awesome. Catching up. Yep, likewise. Congrats.

Latent Space: The AI Engineer Podcast - ⚡ Inside GitHub’s AI Revolution: Jared Palmer Reveals Agent HQ & The Future of Coding Agents

There aren't comments yet for this episode. Click on any sentence in the transcript to leave a comment.