The Infra Pod - Let's chat about vibe coding & Ralph! (Chat with Dexter at Humanlayer)

Episode Date: January 26, 2026

In this episode of The Infra Pod, hosts Tim and Ian sit down with Dexter Horthy, CEO of HumanLayer, to explore the evolution of AI coding agents and the future of software development. Dexter shares his journey from building data tools to discovering the real problem: making AI coding agents actually productive for senior engineers, not just juniors.

The conversation dives deep into the research-plan-implement workflow that enables engineers to ship 99% of their code with AI assistance, the challenges of getting staff engineers to adopt AI tools, and why most AI coding ecosystems don't actually help you sell to enterprises. Dexter also shares his spicy take on how Ralph-style agents can be even further enhanced.

Whether you're a skeptical senior engineer or an AI-curious developer, this episode offers practical insights into what actually works in production AI coding today.

[0:00] Introduction & Dexter's Journey: Why Dexter finally started a company, the failed data catalog pivot, and building an AI janitor for data warehouses
[8:00] The Hard Lessons of AI Ecosystem Hype: Why there's no "SAML for AI agents" and what enterprises actually need versus what the hype machine promises
[13:00] The Research-Plan-Implement Breakthrough: How to make senior engineers productive with AI, staying objective during research, and making decisions at the top of the context window
[26:00] The Vibe Shift & Where We Are Today: When respected engineers started believing, the role of Ralph and spec-driven development, and what's working in production
[37:00] Spicy Take: Ralph Goes to the Supreme

Transcript
Starting point is 00:00:04 All right, this is Tim from Essence and Ian, let's go. Hey, this is Ian Livingston, CEO of Keycard, decider of agent identity. Couldn't be more excited to have Dexter Horthy on the podcast today. I think it's like a fourth or fifth podcast. Anyway, Dexter, CEO of HumanLayer. Dexter, how are you doing? I'm doing great.
Starting point is 00:00:26 You all missed all the fun cold-opener content because Tim didn't tell me we weren't recording, but we'll try to bring it back a little bit. No, super stoked to be here. Like I said, I have been a fan of the Infra Pod since you guys had my buddy Hunter on. I was like, all right, me and Hunter are low-key, we were like two of the only solo founders in our YC batch, so we're buds.
Starting point is 00:00:46 We just got dinner a couple of weeks ago. But I was like, all right, if Hunter's on the InfraPod, I got to figure out how to get on the infra pod. And here you are. So Dexter tell us, what is human layer? What in the world are you working on? And why the hell did you even start on this journey? This is the key question we ask every single guess. What makes you so crazy and what compelled you to start a company?
Starting point is 00:01:03 Honestly, one of my best friends in college, like, kind of startup-pilled me when I was, like, 19. And then I just, like, waited around for too long with, like, imposter syndrome, trying to, like, figure out when was the right time to do it. So I was like, okay, I got to learn this. And I got to learn this. And I got to do this. And I worked at a ton of startups. And, like, so I was like, I don't know when I decided to do it.
Starting point is 00:01:26 It's kind of like, it's been such a part of my identity for so long that, like, it's honestly embarrassing that it took this long. But I woke up one day and I was just like, oh, you've been sitting around waiting for the right idea or the right co-founder to come along, or for someone to tell you that you're ready. And it's like, I had a buddy ask me to come be the CTO of his startup, and they were cool guys and I love them. But the problem they were working on was just, like, not interesting to me.
Starting point is 00:01:49 But I was like, oh, I just set up all your infrastructure in a weekend for fun as a bit to see if we could do it. I think we could probably figure this out. And then I started doing the co-founder dating thing and talked a bunch of people and met up with a buddy of mine who had an idea. He was working on this open source thing for like, it was kind of like a component of like a data catalog,
Starting point is 00:02:09 which I don't know. Did you see this interview from Noreps with Swix and Sarah Katanzaro from Amplify? Not going to see it, no. So it was basically like I call data catalog as like the hot product category of like 2024. And that did not work out. He was also the hot data product of 2018 and 2012.
Starting point is 00:02:32 Data catalogs? What was the 2012 one? Atlassian or allation's been adlin. Oh, Allation. Yeah, elation. And there was one other one that was before it, right? Yeah, yeah. That's like a long time.
Starting point is 00:02:45 Yeah. Yeah. Yeah. Do you call it data catalog? I think catalog something. I once it was like data catalog, like the way we talk about at least. Yeah. Well, it was like all data products have the like before DBT and after DBT sort of like minting
Starting point is 00:02:59 of like generation of companies when it was like, oh, this is an engineer. discipline now. And it did work out. And I'm like really glad that all happened. But I also, when I was starting a company, I was like, oh, yeah, we're going to do this thing. And it's going to be like segment, but for your data warehouse, where it'll give you product analytics for like which datasets are actually in use and like, how are they being used so you know actually which pagers to turn off? Because like, anyways, every VC I talk to is like, they actually seem really smart. I like you. Like lots of cool people have said you're dope. Please don't do a data company in 2020. before. Like, it's, the party is over. It's still good. It's still important tooling. But, like,
Starting point is 00:03:37 most of those companies are not going to end up being venture scale. I don't know how much truth there was. And of course, as a founder, you're like, you know what? Screw you. I believe people need this. And the reason why it's not like venture scale is because people don't have my thing. And when they have my thing, suddenly we're going to unlock all the value of data, which I'm sure I'm the only person to have ever said. Incredible. You did try it, right? Like, you try to. did it. We built a product. We did a pilot with like a really, really like, if we had closed this logo, it would have been like cool, like off to like a dope seed round. And they used it for a couple months
Starting point is 00:04:11 and then like didn't convert. And then we tried a bunch of stuff. And we were like hacking around with AI. This was like April of 2024, I want to say. And like this was when all the like agent stuff was starting to build up. And like I talked to some of the folks who were early investors at crew AI. And they're like, you should really pay attention to this thing, decks. And we hacked on a bunch of stuff. and we tried a bunch of things, and we tried to, like, the Snowflake App Marketplace had just come out, which was this crazy, like,
Starting point is 00:04:38 have you seen the Snowflake Kubernetes thing that came out in, like, late 20, mid-20204, where you, like, instead of running Cube Control logs, you, like, select from container logs, and it's like a function, you, like, write SQL to query your Kubernetes cluster, and you get the results back as, like, a data table. I didn't see that.
Starting point is 00:04:57 Yep. Yeah, it was a platform to deploy on-prem apps into other people's snowflakes. So you could build your app and then like bundle it as containers that you pushed to a snowflake table and then or like to the Snowflake Object Store and then you could basically have like a one-click deploy
Starting point is 00:05:14 where it would like Kubernetes deploy your app into your customers like Snowflake compute basically. We've got very hardly nerd snipped on that. It didn't go well. We tried six more things. My co-founder burned out. I like was like, all right, now what are we going to do? And we built this like, the like new MVP was like,
Starting point is 00:05:30 we're going to make an AI agent that is the janitor for your data warehouse. We're going to run around and find tables that aren't used or find indexes that aren't used or find like query patterns that some analyst is like copying and pasting SQL out of a spreadsheet and tell you that that's happening and tell you like, hey, maybe this like these seven queries get run in a row once a month by this one person and that's the report that probably ends up on the CEO's desk. And that should probably be the thing that all of your analytics engineers are building, not whatever else they're working on.
Starting point is 00:06:01 How would you go from that to what you're doing now? What's the journey, my man? So the MVP was like this agent would like find tables that were unused and drop them for you. But it's like, I don't really want an AI running around my snowflake instance dropping stuff because AI was still not, it was as unreliable as it is now, it was much worse when it was like, this was pre-GPT-40. This was like early GPT-4 days, basically. And we built this human-loof system basically.
Starting point is 00:06:26 We're like, okay, find tables that are unused. for 90 days, send a Slack message to the three people who matter. Like the person who made it and the last person to query it and be like, hey, we're going to drop this table. And then they would get a Slack message and they could say yes or they could say no or they could respond in natural language and say like check back in two weeks. And we would like reschedule the check for you and stuff like that. And it was an interesting idea.
Starting point is 00:06:47 But the thing that came out of it was like we built this system that was like almost like pageer duty style like escalations for human in the loop. Like an angel wants to do a thing that's scary. Go find a person to get permanent. mission for it and like based on what it was dynamically figure out who that was, query it from some user database and then like go find those people and get an answer. And I was tired of trying to sell the data teams. So I pivoted to like, what if we built this as plat? Basically, we decided that like cleaning up people's unused snowflake tables was not a like problem that people were
Starting point is 00:07:18 willing to pay a lot of money for. But the pattern of like, hey, let's have an AI that runs in the background. It does things that are actually useful instead of just like summarizing data or creating research or creating, like, written content. Let's go turn that into a paradigm and see if we can, like, automate and, like, make that easy for developers to incorporate into their AI application so they can do more interesting things than just, like, chatbots. And what do you learn? So much.
Starting point is 00:07:43 So much. So we built that. I, like, shipped that in, like, three weeks. Float to SF, demo to meetup, had a lot of good conversations. I was like, great. I don't know if this is enough poll. This is way more pull than we had for the last thing. And obviously, go the tailwinds and AI and stuff.
Starting point is 00:07:56 close first revenue a week later. This was like August 2024, and then a couple weeks later, like got into YC, spent a long time trying to make it work in YC, and like spent about three or four months learning this really weird, hard lesson, which was like basically the,
Starting point is 00:08:11 what we had been sold as AI builders was maybe kind of not 100% true. The things that the hype machine told you was like, there's this whole ecosystem and there's all these plugable bits. And if you integrate, as long as you're usable in like, LangeChang crew AI and every single new agent framework
Starting point is 00:08:28 there was like one a week coming out at that point in like fall of 2024. Like you integrate into this ecosystem and then everyone will be able to use your thing. And then we actually went to like, a ton of success at hackathons and like indie devs loved it. I get 100 people to try this thing in a day by doing a workshop. And then we go try to like sell to people who actually were making money
Starting point is 00:08:47 and we're like building AI that shipped to the enterprise and like we're making high six low seven figure ARR these like seed series A companies, none of them were using any of that. Or they had tried it and they had thrown it out because they needed way more flexibility. And so it turned out like there was no standard. There was no ecosystem to implement against. So in order to use our like human, where you like fire off a request and we go like go runoff and get an answer for you and then send you a web hook when it's ready, people would have to
Starting point is 00:09:15 re-architect their applications to fit in that model. It wasn't like, I don't know, like the thing I compare it to is like Samuel II or like Oath or something, right? where it's like, okay, when all the enterprises decided to adopt SAML II, because they built it into Active Directory, it was this literally flick a switch. If you support SAML II, you can sell to every enterprise that you want to, with some exception, right?
Starting point is 00:09:35 You still the on-prem thing and stuff like that, but it made it way easier. And like, that didn't exist. And so I, like, documented all of that and, like, talked about it a lot because they didn't want other people to go through this, like, challenging thing of, like, oh, I'm going to build a startup, but I'm going to build a product around this ecosystem that doesn't actually, like, help you sell things to businesses that have, like, real needs and, like, need to deliver
Starting point is 00:09:56 high-quality AI solutions to large customers that, like, it just can't be wrong 20% of time. It has to be wrong, like, 0.1% of the time. And it's not the path to turning that into a company. It's like, okay, you create the standard and you push it out into the world and you get people to adopt this. Like, we even wrote the protocol, right? It was like A2H agent to human protocol. This was like around the time A2A was out. A lot of people use A2A for this now, or we're using A2A for it, you know, a year ago or 18 months ago. And like, I saw the path to make human layer work. And I looked at all the other experiments we had kicked off because we realized like, okay, this clearly doesn't have anything even resembling product market fit. Yeah.
Starting point is 00:10:35 And there was another experiment we were working on that I liked much better. They're actually all of them I liked much better than the path of like trying to get people to re-architect their apps around a standard that didn't exist yet. It might not even be like ready. And so we went all in on one of the. other ideas. Well, what is it? Tell us all. Because it's been a hot moment, because you call your company Humian Layer, but is that really what you're
Starting point is 00:10:57 up to anymore? Like, it makes sense when you're doing how do I, human the loop controls for one thing or the other. Yeah. Yeah, and I'll get into, ask me again in a minute, like, why the name still works. But yeah, we had, we've been using Cloud Code for a while. And when the Cloud SDK came
Starting point is 00:11:13 out, which was like, wasn't even a library at that point. It was literally just a bunch of flags on the CLI and like a streaming JSON output format and like some weird like undocumented MCP interface where you could give Claude a custom tool to ask permission from a user. I don't know, you use cloud code. You know, it's like, hey, I want to write to this file and you have to like approve it. There was like a format where you could actually like get permission from the user to do
Starting point is 00:11:38 the thing. And so we hooked that up to human life. We built this demo of like running the Claude Code CLI in a loop to basically like every time it needs permission, it sends me an email. and I respond to the email with like yes, no, or change it in this way, like plain text, goes back into the Clod CLI, and then when it's done, I get an email with the final answer. So we basically hooked Clod up to our API so that now it can work over email or Slack, and you can work with Clod code wherever you want.
Starting point is 00:12:08 And then we decided, okay, cool, like, what if we want to run a bunch of clod? We built like a terminal UI to like multiplex clod codes because there's all this crazy stuff. people doing these like T-MUC setups with work trees in like, you know, May, April of 2024. And so we were like messed around in that world a lot. And we started to get some like conviction around like there's an interesting thing of like the super like Unix OG like own my own setup, do everything in NeoVim people had like started to develop a workflow in some of these like private communities on Twitter and stuff.
Starting point is 00:12:40 And I was like, what if we productize this and made this way of working available? to lots more people. So we built a 2E, and then we started working on a desktop app. And while we were building that, we kind of stumbled upon the thing that actually was the most exciting unlock, which was like when you spend 70 hours a week
Starting point is 00:12:59 talking to Claude, me and the engineers on the team were figuring out some really interesting ways of like we had built this very complicated, like Golang, demon, that like managed a bunch of sub-processes and stream things on standard out and like kind of looked a little bit
Starting point is 00:13:14 like a very naive version of Docker, like early Docker, I'm sure, look kind of like that, not in the provision stuff, but just managing lots of processes for you and like the configuration for them and like giving you a UI to like view all the outputs and stuff. And that became very complicated very quickly, you know, tens of thousands of lines of Golan code.
Starting point is 00:13:32 And so we were forced to like figure out, okay, how do we make Claude productive in a complex code base like this that's like not perfectly architected and came together really quickly. and we want to ship large features, not just like what most people and when I was using coding agents for mostly, it was like small fixes, adding tests, diagnosing problems. But like most of the time when I set up to have it do like a hard thing, it was like rolling the dice.
Starting point is 00:13:58 And you probably weren't going to hit the jackpot. But every once in a while you would get it. So you'd keep trying. And we kind of like figured out this workflow of like research plan implement that actually allowed us to consistently ship large interesting features in a, large complex code base. And that actually ended up being the more interesting thing than just a UI to run lots of clauses in parallel. That makes sense?
Starting point is 00:14:23 Yeah, I think so. So let's talk about more about this, because I feel like if you're listening so far, probably still a little confused exactly what you guys do. So what I can condense is like, okay, you guys are working on the ability to almost like help use AI coding agents in a much more sophisticated and more complete way because I saw to example you doing PRs for battle, right? Like this complex code, we have some support and cancellation support. It takes five or seven days for a senior engine to do those PRs. So typically, I guess the whole point of those prenotes in those description is usually if you use AI coding agents, you're going to create
Starting point is 00:14:59 a bunch of code, but it's hard to support complicated features. And today, people are just producing garbage code. I don't even know what they're doing anymore, right? So it's really difficult. I think that's kind of where we're going, right? And so maybe talk about more. more, what exactly is the type of problems you're trying to solve? And is your product even, like, ready for people to use? Maybe talk about the state of it as well, yeah. Yeah, we'll do another like two months of the story. So we gave this talk in August of 2025 of like, here's how we do this stuff. And like the prompts are open source. You can go use them. I've done a podcast about it like a month before. Like, here's the 15 minute version. If you want the hour and a half version, you can go
Starting point is 00:15:37 watch it over there. It blew up. It had like, you know, 100,000 views within like a month. I wrote a blog post version of it. It blew up on hacker news. People were obsessed with these prompts that we hit open sourced. And so we built the product. We integrated them into the product, but it was really like there was this UI
Starting point is 00:15:52 on top of cloud code, and then there were the prompts. And we learned a couple things working with like early design partners that were basically around the lines of like, okay, cool. Like step one is how do you get a engineer really, really productive with AI?
Starting point is 00:16:09 How do you make it so someone can write 99% of their code with AI? And that's not like shipping tons of sloth. That's like, when you say 99%, I'm like, I'm talking about a senior engineer can do all of their work with AI coding assistance, which means not just the easy stuff, not just the small stuff, but like large features, complex features in large code bases. The kinds of things that senior and staff engineers are generally working on. Like how do you make coding agents work for that? And we did a lot of like workshop based, training base, like, hey, send me your staff engineer and we'll sit with them for a day and we'll ship, you know, two days worth of work in one day or something. And it went well.
Starting point is 00:16:44 And the other thing we learned is like the thesis is always like, okay, once you have four or five, this happened internally with us. And this was like the secondary problem. Like once you get really good at shipping with AI, you have this problem, this like follow on problem, which is like nobody else on the team can like keep up with what's going on.
Starting point is 00:17:01 And like reading a thousand lines of like wall of green text and GitHub, ain't going to cut it anymore. And so we had to like change our SDLC and develop new ways of working together. and like on paper for all of Q4 last year, that was kind of, this is the problem we're solving with our design partners. Like, I think coding agents are going to be commodified.
Starting point is 00:17:21 I think like the open source prompts are open source. Like anyone can go take them and run with them and learn how to use them if they're like motivated enough. But it's like cool. We're going to like help you speed run the route to like, how do we make one engineer good? And then we're going to help you redesign your SDLC so that you can actually collaborate on this
Starting point is 00:17:37 without descending into a chaotic slot fest where most people kind of lose touch with what the code. it is and how it works. There were some issues with that theory, which we kind of like ironed out and I like reacted to in the last like two months, which is like the answer to like what is the product now. So we had the product who's open source for a long time. We kind of forked it and are working on like a private version. I'm not sure exactly what the licensing is going to look like for that yet, but it was basically around the idea of like, oh, actually I can sit down with one engineer for seven hours and get them on the path to shipping 99% of their code with AI. They
Starting point is 00:18:12 won't necessarily be there, but it's like showing them what is possible. If you take a person who's an expert on the workflow, which is usually me or my co-founder, and you take a person who's an expert in the code base, an engineer who's been there for a long time and has architectural opinions and knows how they want things to be. And with those two people, you can show them, oh, we could ship two to three days worth of work in a day, or with Viobb is a special case because that dude's absolutely legendary, but like ship, you know, 10 days worth of work in seven hours, right? the problem was is like we did that
Starting point is 00:18:42 and then our champion would like go try to roll like this is great I'm going to give this everyone on my team but they wouldn't spend the seven hours with every single person to like hand the knowledge on and so a lot of the like other people on a team
Starting point is 00:18:53 would actually get kind of like bad results from the planet they wouldn't actually get they couldn't they can't like quite broken through and they had the same thing that like I think is happening in lots of companies and has been happening for the last year which is like
Starting point is 00:19:06 the people you need to embrace AI If you have a team of hundreds of engineers or more, you need your principal, staff-level engineers to embrace this stuff and own the cultural shift in the engineering org to help everybody adopted in a safe and productive way. And those are all the people who, like, tried it. It didn't make them that much faster because, like, without a lot of reps, it's not actually, you have to, like, practice and learn this stuff to get good at it. And two, it's like, okay, like, they are already pretty freaking fast. So it didn't speed them up that much. And they probably got some bad results. sometimes. Like, I wouldn't have written the code this way, or this is slop or this is bad,
Starting point is 00:19:42 so I'm not going to use this. I'm going to keep doing what I was doing because it's faster that way. And so you have this like rift growing. I talked about this in the talk where it's like the juniors pick it up and the mid-level engineers pick it up and they are like, oh my God, this is amazing. I'm actually, it's like filling in gaps in my knowledge. I can ship more than I could before. And so they love AI more and more every day, but they do also ship a little. There's like slop gets in there and stuff because they don't have the experience to like or the taste of the judgment to know like, oh, no, this is an anti-pattern, and here's why, because they haven't spent 10 years getting paged at 3 in the morning because someone did something stupid. They're just
Starting point is 00:20:14 like, yeah, let's go. It works. Like, I tested it as good. And so the, you're like, most senior engineers who need to, for this to work, who need to be all in on the AI thing, like, hate it more and more week by week because they're spending more and more of their time, like, fixing bugs introduced by like, okay, it's a 2000-line PR. I can't read this whole thing. Like, I hope it doesn't suck. I was like, yep, I was right. It sucked. So we rebuilt the product around this idea of like, there's a bunch of things we all wanted to do forever, which is like build a bunch of collaboration features and session sharing and Figma for Cloud Code. So like that's all coming.
Starting point is 00:20:48 But also just how do we make it easier to get good plans if you're not an expert at the workflow? Forget your like software engineering expertise. Forget your code base expertise. Like if you are not an expert, if you haven't sat down with an expert for a day and like consumed the learnings and all the knowledge and all the like, between the lines advice, how do we still give you a really good experience that lets you unlock these workflows without, again, having to like, oh, there's this stuff that was like, if you sprinkled in these magic words in between certain steps of the workflow, you got way better results. And people would be like, I did a plan. I was like, well, did you use the magic
Starting point is 00:21:23 words? And I'm sitting there on a customer call, I'm like, I sound like an idiot. Like, I can't believe this is the answer to your question of how do I get better results from this thing. Interesting. Well, maybe unpacked it a little bit because I think what I've, why I think it's found it's so interesting is to use the word workflow, right? Because I like a lot of people just to assume, okay, AI coding is good if you know how to use it. And this is like how you use it is a very opaque thing because everybody talks about it differently. Like the workflow sometimes includes a lot of tools. Some people use it, I only use it in this section of my code or I only use it this way or that way. Like it's not really almost like a very simple to abstract way to think about.
Starting point is 00:22:00 about a typical workflow that everybody can adopt, right? So what is a good workflow or an AI-quoting agent that can actually unlock good productivity? Like, what does that even look like? And how do you help people to even adopt that in the first place? Yeah, so there's a lot of concepts that flow into this, and they all come from like the stuff we talked about back in April with the 12-factor agent's paper,
Starting point is 00:22:23 which was like, you know, hey, just build your AI pipelines from scratch and make the things deterministic that should be deterministic and use AI for the things that AI is good. it at. It's like core context engineering stuff, which includes like, you know, it's broken on in three phases, right? Research plan, implement. We've actually in the new version, there's more phases because planning is actually a lot more than just like write a file that says what we're going to do. But the two biggest things that I think are non-obvious, you can go get a prompt or a spec kit or any of these tooling things that like walk you through the workflow and try to like handhold
Starting point is 00:22:58 you and just ask you the questions that matter. But the two most important things are like, when you're doing research, you want to stay objective. No opinions allowed. The models get immensely worse when you have them make decisions about how to solve a problem. The first stage of workflow is just understand the code base. What we wanted to deliver to people was like drop in your Jira ticket or your linear ticket, give it to the research prompt. And the research prompt will go find the things that matter and compact down all of the code base
Starting point is 00:23:25 understanding that matters for this task in. to a single file that then you can reuse in future sessions. And so this built on this idea of frequent intentional compaction, which is like try to keep your context windows fairly short and build your workflow around. Like each step has a very clear purpose, which is like research or a planning document or a design discussion or a outline of like the structure of how we're going to do it.
Starting point is 00:23:49 But in the research phase, you want to stay objective. And like that means that like the skill that people had to pick up that we're trying to automate now is like, You need to read a ticket and then combine that with your understanding of the codebase and generate questions that say, go find, go look in this part of the code base and this part of the code base and this part of the code base without leaking implementation details into that problem.
Starting point is 00:24:11 Like the model can't know what we're trying to do because then it's going to bring its opinion. If you wanted to solve this, you should do this, this and this. The entire point of the workflow is like keep an objective as long as possible. And then when you go into planning, it's really important for the model. people have been saying this since the chat GPT era is like analyze my question and give me three options or like come up with three or three to five options that we could do and then we'll talk through the like
Starting point is 00:24:36 People have been saying this since the ChatGPT era: analyze my question and give me three options, or come up with three to five options that we could do, and then we'll talk through the strengths and weaknesses of each option. Like that part is the thing that coding agents do not do by default. Claude Code now has a plan mode that, for the average user, for the average use case, works much better than not using it. But I still think there is more to be desired in a planning workflow
Starting point is 00:25:01 that basically forces the model to ask me questions until I say, okay, I believe that you understand it. Now you can go build. Because the building part, the implementation part, is very context intensive. You're writing files, you're running tests, you're getting output from all these tools. And so if you're doing that and you're halfway, 50% through your context window, and then you're making more decisions, whether the model's making decisions, or even in the off chance that it actually asks, how do you want to proceed here, the quality is going to go way down. So you want to make all your decisions in the very top part of the context window, where you get the smartest and the best results and the least amount of noise. That's kind of the core of it, without going into the exact steps that we recommend
Starting point is 00:25:37 and the exact prompt workflows that we're rolling out in the rebuild. We basically rebuilt from scratch. We literally threw it out and we rebuilt the entire thing from scratch in the last month. We went down from like 60,000 lines of code to something like 35,000. And I think probably the only thing we ported over was the UI components, and everything else is rebuilt from the ground up. It has way more features. It's way more stable. All this stuff that we did in a hacked way is now built properly. I'm so excited, guys. It's going to be so sick. I'm curious, like, what were the,
Starting point is 00:26:08 you know, clearly over the last two or three weeks, there's been some, you know, everyone was around for the holidays. They're all playing with stuff. The Twittersphere finally picked up on Claude Code skills, and lazy loading was supported. And, you know, MCP lazy loading's kind of starting to be there. But more importantly, Opus 4.5 is there, and then people finally had time to pick up some of these prompting techniques you're talking about. There's 100% a vibe shift. I think if I look at my own agentic coding use cases, you know, back in September and August, I was building proofs of concept that were relatively small, but the minute I got to any form of complexity, it just fell over
Starting point is 00:26:46 and couldn't manage, like, the complexity. It became more trouble than it was worth, right? Yeah. And it generated slop, and I couldn't give that to an engineer. And the engineer would be like, what the fuck is this? Right. Like, to be honest, as an engineer, I was like, I don't know what the hell this shit is. I guess, you know, like, that doesn't make a lot of sense.
Starting point is 00:27:03 It's not what I told it to do. And it would have these crazy deviations where, like, it'd make these massive migrations that don't make any sense and go all the way down these roads. And then you come back in December and you start, you know, you pick up Opus 4.5, you pick up some of these new techniques. And you're like, this shit works. Like, it's like the moment's here. Does that resonate with sort of where your brain's at? Because that's the experience I had, is like, it's not great. There's still like large issues, but like you can,
Starting point is 00:27:26 you can kind of squint and be like, okay, I actually, I believed the hype in concept, you know, six months ago, now I see the hype in practice. That's better. And I remember the first time I saw Mitchell Hashimoto tweeting about how he uses agentic coding all day. And I was like, okay, this is a signal of a vibe shift. Yeah.
Starting point is 00:27:44 And in the last six weeks, I've seen a lot of people of that archetype. And like, you see, I don't even remember his last name, but he's Cooper Dennis. He used to, like, work on a ton of, like, huge Kubernetes stuff. I'm going to have him on my show, and we're going to talk about the, like, Claude Code that runs in your Kubernetes cluster and fixes your pods for you thing. I saw Jeffrey Ferzell
Starting point is 00:28:03 talking about using, like, Claude Opus 4.5 to do a bunch of dope shit, or maybe she's a code expert. Anyways, these people who are, like, the most cynical, smartest, most respected engineers in the community, who, like, built the early versions of some of the dopest stuff. People who built Golang. People who, you have Yana from Google talking
Starting point is 00:28:21 about like, oh, yeah, no, like, Claude just did this thing that we worked on for the last year. People who built the early versions of Docker are now leaning into this. And it's like, cool, that's how I know there's a vibe shift. I'm kind of curious, where do you think things like Ralph fit in terms of what's going on? Like, are we truly at the moment of autonomy? Have we reached autonomy in coding agents? Or is that a fun toy? What about the rise of background agents? Like, how do you think about where we are in this transition from, like, you know, we started 2023 with GitHub Copilot, and that was fancy autocomplete. And then Cursor was good, it was so much better than anything
Starting point is 00:28:58 we had. And then Cursor was a better version of fancy autocomplete, and then Claude Code, you know, then Cursor and Claude Code kind of battled about who's the best autocomplete. But over the last six months, agents have really, really started to come full circle in the sense of how we think about them and work with them. So I'm kind of curious, like, where does this fit, and how do you think about where we're at in that journey? So I like Ralph for two reasons. One, Ralph is kind of the, like, if you watch Sean Grove's talk at AI Engineer in June,
Starting point is 00:29:27 his whole thing about spec-driven development, and like, someday we're going to treat the code like assembly and we're never even going to read it except on very rare occasions, and we're just going to read the specs and stuff. Like, that's a future that will probably come to pass. I don't know when. But the closest version of that we have that actually works is this Ralph Wiggum thing where you just,
Starting point is 00:29:47 like, here's a directory full of specs, run a loop forever until all the specs are implemented. And so from that perspective, of just, what might the end state of spec-driven development look like? It's a really interesting look into that. I will say, and even Jeff says this, like, you can run Ralph and it will probably do like 90% of the work. And you still need to check in every couple days, and you'll find a bunch of shit that's broken. And you'll actually need to tweak your prompt and add another thing that's like, here's all the things that are broken, build a fix plan. And now we're no longer working on the implementation plan. We have a new plan that is the fix plan.
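The loop Dexter describes, run a fresh agent session over and over until the specs are satisfied, can be sketched like this. `run_agent` is a stand-in for one bounded coding-agent run (in practice, something like invoking Claude Code once); the spec and code dictionaries are invented for illustration.

```python
# Toy Ralph Wiggum loop: desired state is a folder of specs, current state is
# the code, and the outer loop keeps re-running fresh sessions until nothing
# is left to reconcile.

def run_agent(specs: dict[str, str], code: dict[str, str]) -> None:
    """One bounded session: implement at most one missing spec, then exit."""
    for name, spec in specs.items():
        if name not in code:
            code[name] = f"implementation of: {spec}"
            return  # stop here so the next spec gets a fresh context window

def ralph(specs: dict[str, str], code: dict[str, str],
          max_loops: int = 100) -> dict[str, str]:
    for _ in range(max_loops):
        missing = [n for n in specs if n not in code]
        if not missing:          # desired state == current state: done
            break
        run_agent(specs, code)   # state lives in specs/code, not in the loop,
    return code                  # so Ctrl-C and restart at any point is safe

specs = {"login": "users can log in", "billing": "monthly invoices"}
code = ralph(specs, {})
```

Because all state is in the spec and code files, editing the specs and restarting mid-run just changes the desired state the next iteration reconciles against, which is the Kubernetes-style control-loop property described below.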
Starting point is 00:30:17 It's such a different way of interacting with coding agents. Like, the primitives you're working with are a folder full of specs and an implementation plan, and you're modifying those things. And like, you don't care about the LLM's context window. You don't care about the conversation. Like, you can Ctrl-C Ralph at any
Starting point is 00:30:34 point and just change the specs and restart it, and it will just reconsume the, it's like, okay, here's the desired state of the world, here's the current state of the world, let me make progress. And it's almost like, I don't know, I'm very Kubernetes-pilled. I love Kubernetes not just as an infrastructure thing, but for the core, like, philosophy underneath it, of like, everything should be a
Starting point is 00:30:54 control loop, because that's how you can make robust distributed systems where each component has its own sphere that it cares about. And even if it loses touch with everything, as long as it can get the state of the world and interact with its own local environment, every part of the system can reach eventual consistency and move towards the actual thing that the user wants out of the system, whether it's X containers running over here, or it's an agent working on this, or it's this feature in a web app. All those concepts are the same. I also like Ralph as just a really deep and advanced lesson on context engineering, and on thinking about your context window as something that is deterministically allocated. Think of being a C developer
Starting point is 00:31:39 allocating arrays of memory with malloc. Ralph is designed from that perspective: you're always going to have the system prompt, and then you have some budget for what's the desired state of the world, and you have some budget for what's the current state of the world. And those are the parameters, how big are those things, and then how much time do you want to spend, like how much more context window can you afford to use to actually do meaningful work of editing files and running tests and verifying behavior, before you want to cut it and just run that loop again. That makes sense. Yeah. So, you know, with all the vibe shifting now, we're trying to, like, figure out how do you actually realize all these values?
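Dexter's malloc analogy can be rendered as a toy budget: fixed segments are carved out of the window up front, and the loop restarts once the remaining work budget is spent. All the numbers and names here are invented for illustration; a real agent would count tokens, not use round constants.

```python
# Toy rendering of deterministic context-window allocation, malloc-style:
# fixed segments first, and whatever is left funds actual work in one loop
# iteration.

WINDOW = 200_000  # total context budget in tokens (illustrative)

def allocate(window: int = WINDOW) -> dict[str, int]:
    """Reserve fixed-size segments; the remainder is the per-loop work budget."""
    budget = {
        "system_prompt": 10_000,   # always present
        "desired_state": 40_000,   # the specs
        "current_state": 50_000,   # compact summary of the code as it is now
    }
    budget["work"] = window - sum(budget.values())  # edits, tests, tool output
    return budget

def should_restart(used_work_tokens: int, budget: dict[str, int]) -> bool:
    """Cut the session and re-run the loop once the work budget is spent."""
    return used_work_tokens >= budget["work"]

b = allocate()
```

With these numbers the work budget comes out to 100,000 tokens, and `should_restart` is the signal to cut the session and let the next loop iteration start fresh at the top of a new window.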
Starting point is 00:32:21 Like I said, the workflows. And I understand Ralph and the goal of getting to autonomy. And so, like, I know your product is still on a waitlist, right? And still getting shipped. What do you expect people can be doing out of it? Do you expect everybody will pick up Human Layer, and when the waitlist is done, we all can write 3,000 lines of code with no bugs
Starting point is 00:32:46 and we're all like superhuman right away? Or do you feel like this is an incremental step that's happening? Because I feel like right now, with context engineering, with all the research plans, like the workflows, it feels like we're trying to find a better stance of how to improve. And we've seen some signals already, right? I'm just curious, like, do you think this is enough? Or we still, like, there's a lot of stuff to figure out, and we're still kind of.
Starting point is 00:33:06 So this is, I haven't really spoken publicly on this, and it'll probably, you know, be a formal thing we write about soon. But I will share it here. Here's more breaking news for you. The Human Layer waitlist is kind of a, it's a little bit of a psy-op.
Starting point is 00:33:21 So it's open source. All of the releases are published to the GitHub repo. If you can find the URL with the instructions to download it, you can just go get it, at least to grab the free version. The prompts are open source. So there's like probably tens of thousands of people who have grabbed the prompts.
Starting point is 00:33:37 I base this on the fact that there are 7,000 GitHub stars, so I imagine at least two or three x that number of people have gone and grabbed the stuff and started using it in their orgs. I've seen products, like, I don't know, Block has this Goose agent they built. They've built an RPI plugin. They were kind enough to cite us on the plugin page,
Starting point is 00:33:54 but the prompts are, like, very similar, if not almost word for word from what's in the Human Layer ones. Like, lots of people, this is making its way into lots of large engineering teams all over the world. That's level one. Again, the desktop app that is an IDE for managing lots of Claude Code sessions, the current version is like, if you ping me, I will tell you how to get it. We have not blasted it out to the waitlist, because by the time it was at the point
Starting point is 00:34:18 where we're like, we should just launch this and give this to everybody, we had already decided we were going to rebuild it. And so we were just like, okay, am I going to bother everyone on our waitlist with, like, here, go try this thing that we're deprecating in a couple months? So the answer is, the waitlist has now transitioned to be the waitlist for the new product, which has a SaaS component, and we'll have a sign-up, and it's much, much better. But you can technically go play with CodeLayer today.
Starting point is 00:34:43 You can go get the prompts and use them. We'll probably find a way to publish and make available the version two of the prompt workflow. But part of what makes it so good is it's not just, you know, using a giant prompt for control flow. It's actually, like, workflows are built into the product itself. This is the 12-factor agents thing, right? Don't use prompts for control flow. If you know what the workflow is, use control flow for control flow. And so that's the answer: you can go grab the prompts.
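The "use control flow for control flow" idea can be sketched like this: when the workflow is known up front, the sequence of steps lives in ordinary code, and each step gets its own small prompt, instead of one giant prompt asking the model to manage the whole sequence. The `llm` function and the step prompts here are placeholders, not a real API.

```python
# Toy sketch of deterministic control flow around small per-step prompts,
# in the spirit of the 12-factor agents principle mentioned above.

def llm(prompt: str, context: str) -> str:
    """Stand-in for a model call; returns a labeled artifact."""
    return f"[{prompt}] {context}"

STEPS: list[tuple[str, str]] = [
    ("research",  "objectively locate relevant code"),
    ("plan",      "propose 3-5 options and pick one"),
    ("implement", "apply the chosen plan"),
]

def run_workflow(ticket: str) -> dict[str, str]:
    artifacts: dict[str, str] = {}
    context = ticket
    for name, prompt in STEPS:      # the loop, not the model, owns sequencing
        context = llm(prompt, context)
        artifacts[name] = context   # each step's output feeds the next
    return artifacts

out = run_workflow("fix password reset")
```

The model never decides what step comes next; it only fills in the content of each step, which is what "workflows built into the product" amounts to here.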
Starting point is 00:35:12 You can go grab the thing today and go mess with it. You go to the GitHub repo and download a DMG. It should just work. There's a Homebrew command you can use to get it. And we're most excited about putting it in people's hands. I think actually today is the first, like, hands-on install with someone outside the company that we're going to do. So very excited to quickly start rolling that out to people.
Starting point is 00:35:31 And, you know, we have the best customers. Our customers have the best problems. Like, I love that we spent the last six months doing a kind of, like, people advise us against doing a services-heavy motion of go get in the trenches with your customers and watch them use it and train them. Like, I talk to many VCs who are like, I don't want to invest in your consulting company. And like, we're not raising right now anyways, but it's like, you know, there are people in the community that you like and look up to and respect, and you
Starting point is 00:35:56 want their advice. It was like, you can have a billion YouTube views. Like, that's not a company. What makes a good company is you have to have a good product. But the asterisk on that, for all the other builders watching this, is: doing services is time-consuming. It takes you off of product. But getting invited in the building to be in the trenches with your customers
Starting point is 00:36:16 and really understand their problem is priceless when you're early on. And it is very likely it would have taken us six months longer to realize what the shortfalls were with the current set of problems and the current workflow and the current product if we hadn't put ourselves on the hook for outcomes. And we said, we're not just going to give you a tool and see how it goes. We are going to commit to working with you until you feel like you're shipping two to three x faster across every engineer on the team, across every project, and every codebase. Amazing. Well, as I told you, we're going to have our favorite section of our pod right here.
Starting point is 00:36:53 And I know you have a ton of stuff you can probably pull out of your hat. So what is your spicy future hot take here? So this is the thing I believed six months ago. I'm sure you saw the tweet about this. This is July 11th. I texted one of my buddies in L.A. who's like, he's a software engineer, but he's not like super deep into the AI world.
Starting point is 00:37:16 I said, in three to six months, a phrase that resembles Ralph Wiggum loop will escape San Francisco and go viral. And it's pretty close. Three days from now would have been the six-month cutoff. So I only just got it. So my new prediction for 2026:
Starting point is 00:37:33 when I first saw Ralph, one of the things that people liked to talk about, I was like, oh, so you could just copy any software from the specifications. And Jeff loves to talk about what happened with, like, Intel and AMD, the way that AMD was legally able to, like, kind of copy Intel's architecture, allegedly. Somebody went in and reverse engineered all of the chips, developed clean-room specifications that had nothing to do with the implementation,
Starting point is 00:37:59 just, this is how every component works, and this is how it behaves from a black-box perspective. And then you hand those specs to someone who has never looked at an Intel chip, never seen an Intel chip, has no idea how they work, and then they implement the specs from scratch. And so I don't know what the legal precedent's going to be here. I believe that if you could get your hands on a well-formatted, context-efficient copy of the Salesforce documentation, you could in a couple months build a completely wire-compatible version of Salesforce, slap whatever UI you wanted onto it, do a bunch of AI-native stuff, build it in a modern stack.
Starting point is 00:38:35 So my take for 2026 is we are going to see, either directly or loosely, something like Ralph Wiggum will be upstream from a Supreme Court case about IP and copying stuff from other products and rebuilding software products from scratch. That's going to be either because it's released as open source, or some big company or some freaking startup is just going to clone Salesforce. It'd be literally like, just point your URLs over here.
Starting point is 00:39:10 We'll literally slurp all your integrations out and reconfigure them. And now you have Salesforce, but it's faster, cheaper, scales, runs on-prem, every single thing you could want that Salesforce doesn't have but people put up with because they won the ecosystem. But if you can do a wire-compatible, like, complete API clone of, just, here's the contracts and here's how everything works, then you could make a lot of
Starting point is 00:39:43 money. So you're bullish on spec-driven development, like driving specs, using the specs to drive the coding agents, using the specs as sort of the source of truth for the- I'm not sure we're there yet, maybe in the next generation of models, but I do think that something like Ralph can work to get you 90% of the way there. And yeah, building 10% of Salesforce is a huge mountain of work, but it's a lot less than building all of it, which is what all of the others, there are already well-funded startups that are cloning Salesforce and trying to replace it. Imagine if you could just walk to an enterprise and be like, push this button, and now you're on our thing. That's a great take. I'm curious, when do you think we see the first use of Ralph Wiggum in production in, you know, your average enterprise company? When are we there? Is it happening today? Or do you think this is still, like, it just hasn't left the ether of the Twittersphere? Oh, it's been
Starting point is 00:40:25 done. It was done last year, many times. I know lots of consultants who have gone in and done, like, hey, look, we did this, like, C# .NET 4 to .NET 9 migration. Actually, someone did this and then wrote a case study about it and PR'd it into our open source repo. You can go read it. It's basically this thing that the company was never going to do, had never done, because it was going to be 18 months and they just couldn't justify it. And something like Ralph did it in like a couple weeks, basically.
Starting point is 00:40:54 Ralph is delivering enterprise value already. It's amazing. I mean, a single while loop statement can turn into this, like, person that's just, like, taking over the world. Don't know how to really personify this, but it's amazing. We could talk at length, but, you know, just based on time, what are ways people can find you, or all the amazing stuff you're putting out there? How do people find you?
Starting point is 00:41:20 So I'm pretty active on X, if you want to follow me there. Send me a note. We have a Discord channel, go to humanlayer.dev/discord. If you go to humanlayer.dev, you can sign up for the waitlist. And this time, I promise we won't make people wait for four months to get something and secretly just have it available for free on the internet anyways. Like, we have a lot of conviction in this new product, and we have a lot of conviction that it's going to change the way people build. So if you go sign up
Starting point is 00:41:43 for the waitlist, you'll get our monthly updates. And when it's ready, we will be much more aggressive. We're going to let 10 people in, and the next week we'll let 100 people in, and the next week we'll let 1,000, something like that. We want to ramp this up quickly, and we'd rather give it to people. And I'm sure most of the people have lost interest by now and are on to the next thing, but we'll give it to you. If you're interested, you can stay on it. You will not hurt my feelings if you don't like it. That's my only ask. I'll just keep saying this as much as I can: the only thing that will hurt my feelings is if you try it and you don't like it and you don't tell us why. I know that's a big ask. Feedback is a gift. But that's my ask for you
Starting point is 00:42:16 all: sign up for the thing, we'll send you the stuff, please tell us what you liked, please tell us what you don't like, so we can make it better. Amazing. I can't wait. Give it a go. Humanlayer.dev, everybody. Thank you so much, Dex. It's been such a pleasure. Thanks, fellas. This was a blast.
