The Changelog: Software Development, Open Source - Spec-driven development with Kiro (Interview)

Starting point is 00:00:00 Welcome, friends. I'm Jared and you are listening to The Changelog, where each week, Adam and I speak with the hackers, the leaders, and the innovators of the software world. We pick their brains, we learn from their failures, we get inspired by their accomplishments, and we have a whole lot of fun along the way. Today we're joined by Deepak Singh from the Kiro team. Kiro is AWS's attempt at building an AI coding environment to take you from prototype to production. It does that by bringing structure to your agentic workflow with spec-driven development. Their aim: the flow of AI coding, leveled up with mature engineering practices. Deepak shares some really good ideas that are driven by real-world use. I think you'll enjoy it. But first, a big thank you to our partners at Fly.io. That's the public cloud built for developers who love to ship. We love to ship, so we love

Starting point is 00:01:00 Fly. You might too. Learn more at fly.io. Okay, Deepak Singh talking Kiro on The Changelog. Let's do it. Well, friends, the news is out. Our friends over at CodeRabbit, coderabbit.ai, have raised a massive Series B, and they've launched their CLI reviews tool. It is now out there. I've been playing with it. It's cool. The bottleneck is not code. The bottleneck is code review. With so many people coding now, so much code being generated, and so many things competing for developers' time and attention, code review still remains a bottleneck. But not anymore.

Starting point is 00:01:45 CodeRabbit. CLI code reviews, code reviews in your pull requests, code reviews in your VS Code, and more. Teams now have a true answer to what it means to code review at scale. Code review at the speed of AI, and CodeRabbit is right there for you. You'll learn more at coderabbit.ai. We'll link up their latest blog announcing their Series B and their announcement of their CLI review tool.

Starting point is 00:02:10 Again, coderabbit.ai. Today we are joined by Deepak Singh, one of the leaders of the developer agents team at AWS working on Kiro, a very exciting and interesting new take on a tool that a lot of people have takes on right now. Deepak, welcome to the show. Thanks for having me. Y'all announced Kiro, I think it was back in July, early July, to much fanfare and excitement. I even got excited, and that's hard for me these days,

Starting point is 00:03:01 you know, to get a little excited about a tool. So I was happy when you all reached out and said, let's talk about it on the pod. Welcome to have you. And why, when there are so many agentic

Starting point is 00:03:12 coding things going on, is AWS throwing your hat into the ring? Yeah. I mean, we threw our hat into the ring a little bit before Kiro, and actually we learned a lot doing that. If you go back in AI world,

Starting point is 00:03:26 like, donkey's years, less than two and a half, three years ago. It feels like another era, epoch in AI time. You had all these assistants that were basically fast typists. They helped you complete sentences. And then you went into these chat-based assistants that you could write functions with. Like, hey, give me a function that does blah. And, you know, it was pretty good.

Starting point is 00:03:49 But we looked around inside the company and amongst our customers. And one of the nice things about being at Amazon is we have a lot of software developers in the company who are very opinionated about how they use software. And so talking to them and talking to our customers, it became pretty clear that the sort of chat-based auto-completion, give me a function, was useful, but wasn't going to change the world. And along the way, LLMs got better, a lot better.

Starting point is 00:04:17 They added thinking and reasoning to it. And that's when we started building sort of what we call our next generation of agents. And we started launching them, actually, earlier this year. We launched something called the QCLI. This is actually the most heavily used agent inside Amazon. And in the process of building things like the QCLI, learning from our customers, a few things became very clear to us. The first one was, we were moving towards a world where code was not going to be typed

Starting point is 00:04:45 by humans. So the "type faster, write blocks of code for me" was not going to be the game changer. The game changer was going to be, I have a pull request, convert it into something. I have written an issue, convert it into something that's meaningful that I can make part of a bigger project

Starting point is 00:05:01 or write my entire software for me and a human was not going to do the typing. We already started seeing examples of that and it was pretty clear that that's where the world was starting to go.

Starting point is 00:05:12 The second thing we did was we found that senior engineers had a really, really hard time with the original era of assistants, where invariably what they would say is, I understand the code base,

Starting point is 00:05:23 I understand the problem, I can type faster. Yes, I might use it for some help. Like I used the, you know, I'm dating myself, like the Perl Cookbook to find a function. And we were like, okay, that's not useful. How do we get our best engineers wanting to use these tools and actually finding them useful in their day-to-day work? As we started talking to them, the thing that became very clear was the way they wrote code or solved a problem was by basically breaking it down into smaller problems. Like how do you take a larger problem?

Starting point is 00:05:53 What are its constituent elements and how do they express them? Sometimes they write them on a whiteboard. Sometimes they write a design document. Sometimes they write pseudocode, right? And then they work with a set of junior engineers and other engineers to convert that into a final product. This was happening before AI.

Starting point is 00:06:10 So the question we asked ourselves was, can we convert that into an AI-based system? Can we take the way our best engineers think, make it easier for them to do that? And along the way, also assume that they're actually not going to type the code, an agent is going to do it for them. Along the way, as we were doing this, this whole idea of vibe coding, as we call it, came up, which is this prompt-based loop where you keep prompting an agent until you get to an answer.

Starting point is 00:06:36 And in that world, also we noticed there were people who were, for the more complex problems, going into something else, maybe Claude or ChatGPT, and creating a set of tasks and breaking it up, and then throwing it into the coding agent. The whole premise of Kiro was, that's just the way it works. The code typing part is hidden. It's there. You can type code if you wanted to, but the assumption is the only time you'll type code

Starting point is 00:06:59 is to tweak things that an agent may have gotten wrong. But even changes that you want to make will be done by talking to an agent. So the way Kiro works is, like our senior engineers work, you express a problem. Kiro will convert that into a specification, or spec, which is sort of the core piece of Kiro. It's a spec authoring tool at its heart.

Starting point is 00:07:20 And a spec is a set of requirements in Markdown, a design document, also in Markdown, and a set of tasks that it wants to do, also written in Markdown. And that's what you work on with the agent. And the code part of it is just the output that the agent generates. There's more to it. But that was a starting point. That's how we came to the user experience that Kiro has today.
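
To make the three-part structure described here concrete, the following is a minimal Python sketch of a spec as three Markdown documents. The class name, field names, and file names are illustrative assumptions, not Kiro's actual on-disk format.

```python
from dataclasses import dataclass, field


@dataclass
class Spec:
    """Hypothetical model of a spec: requirements, a design, and tasks."""

    title: str
    requirements: list[str] = field(default_factory=list)
    design: str = ""
    tasks: list[str] = field(default_factory=list)

    def to_markdown_files(self) -> dict[str, str]:
        """Render each part as its own Markdown document (file names are assumed)."""
        reqs = "\n".join(f"{i}. {r}" for i, r in enumerate(self.requirements, 1))
        tasks = "\n".join(f"- [ ] {t}" for t in self.tasks)
        return {
            "requirements.md": f"# Requirements: {self.title}\n\n{reqs}\n",
            "design.md": f"# Design: {self.title}\n\n{self.design}\n",
            "tasks.md": f"# Tasks: {self.title}\n\n{tasks}\n",
        }


# Example: one requirement, a one-line design, and two tasks.
spec = Spec(
    title="Subscribe button",
    requirements=["As a user, I want to subscribe from the home page."],
    design="Add a button component to the page header.",
    tasks=["Create the button", "Wire the click handler"],
)
files = spec.to_markdown_files()
```

In this sketch the code itself is absent on purpose: the three documents are the thing a person reviews, and generated code would be a downstream output.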

Starting point is 00:07:43 It's interesting to go that deep. I mean, even QCLI, if I got the pullback right, QCLI is the, it enables completions, is that right, for hundreds of popular CLIs. Is that what's going on? No, that's how it started initially. Yeah, there was an open source project called Fig that became part of AWS about two and a half years ago. I used that as part of Oh My Zsh on my shell, even before they became part of AWS. And they do a lot of CLI auto completion.

Starting point is 00:08:16 In March of this year, the QCLI team put an agent inside the terminal, inside the CLI. And you can basically start vibe coding. That's what it is. You start prompting it. It's advanced quite a bit now. For example, you can start a conversation and if you want to experiment with something, you can actually create a branch of your conversation. You can go into a tangent.

Starting point is 00:08:41 Quite literally, it's called tangent. Have a conversation. If you don't like that path, you can come back to your checkpoint. Or you can now go down the different branch and merge. Make that main later, effectively. So that's been very successful, but it's all an interactive conversation, right? So the problem is what happens on day 20 when, you know, you've forgotten what you did. You have to go look at a conversation history.
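
The checkpoint-and-tangent idea can be sketched in a few lines of Python. This is only an illustration of the concept as described in the conversation; the class and method names are hypothetical and do not reflect how the QCLI implements it.

```python
from typing import Optional


class Conversation:
    """Toy conversation history with a single checkpoint for tangents."""

    def __init__(self) -> None:
        self.messages: list[str] = []
        self._checkpoint: Optional[int] = None

    def say(self, message: str) -> None:
        self.messages.append(message)

    def start_tangent(self) -> None:
        # Remember where the main conversation stood before experimenting.
        self._checkpoint = len(self.messages)

    def end_tangent(self, keep: bool) -> None:
        if self._checkpoint is None:
            raise RuntimeError("no tangent in progress")
        if not keep:
            # Discard everything said since the checkpoint.
            del self.messages[self._checkpoint:]
        self._checkpoint = None


chat = Conversation()
chat.say("Build a subscribe form")
chat.start_tangent()
chat.say("What if we used a modal instead?")
chat.end_tangent(keep=False)  # did not like that path, back to the checkpoint
```

Passing `keep=True` instead corresponds to going down the branch and making it the main line later.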

Starting point is 00:09:06 And what happens when you give that code to somebody else? They have no context of what was in your head and no artifact that they can look at in a team. And, you know, we work in teams. A lot of engineers work in teams. So in many ways, what we were trying to do with the Kiro spec UX was, how do we capture that vibe coding magic while making the code more robust over a period of time, where it's still useful a year later and people have the ability to inspect the specification and modify it if they want to make any changes. The two teams work pretty closely together so that we learn a lot

Starting point is 00:09:41 from each other. So that's kind of fun. I think perhaps why this idea, at least, if not the implementation, resonated with me, and I think with a lot of people. Well, first of all, it's AWS, so you guys have a loud megaphone. So when you announce something, people pay attention. And so that got some of the attention. But also, it's just a new take on something that we're all working on, which is like this agentic coding thing. And it's kind of jumping ahead, I think, in the process that we're all kind of getting to, maybe in our own ways, of like realizing that chat isn't it. And chat's cool and all, but you just can't build serious software in a chat dialogue. Like, you can do stuff and you can make progress and you can experiment, but there's like a certain point where you're like, yeah, I don't want to

Starting point is 00:10:21 live my entire life this way. I'm not going to be... Yeah, I'm just not going to be productive. And you guys formalizing... Yeah, formalizing the idea of, I think you're calling it spec-driven development, or you can give me other taglines about like this concept that is... That's the right tagline. Okay, perfect.

Starting point is 00:10:38 Like, even just that to me, I was like, yeah, that makes some sense. Like, that's something worth trying. And Adam can tell you about Agent Flow. I'm sure he will. He's kind of developing his own spec-driven development style inside of Claude Code or with Claude Code as he builds things. And Kiro seems like an attempt to, like,

Starting point is 00:10:57 what you're using with Claude Code, right? Well, Amp and Augment. And Amp. Yeah, it's agents. So lots of things. Yeah. Okay.

Starting point is 00:11:05 Well said. Regardless of the agent, he's doing that. And I think that what Kiro attempts to do, I guess, is to formalize that into the entire experience. Is that fair? Yeah.

Starting point is 00:11:15 That is fair. And it'll evolve. I think we are learning how to build software in a world where the role of an engineer is going from being the, I'll say typist, I don't mean it quite like that,

Starting point is 00:11:31 but they're the ones who are, their fingers on the keyboard, you know, they're converting that, converting what they have in their head into a language, which is usually, you know, pick your language of choice here. Inside Amazon, AWS, if you're building a back-end system, it's probably going to be Rust.

Starting point is 00:11:47 And how, and that's what they excel at. Now, you're acting a little bit differently, you're a driver. You're driving the behavior of an agent. It's about how you give the agent the right context. Structured thinking with specs is one way. There are other components to it, which, I suspect in your Agent Flow thing that also comes up, which is how do you provide access to the right tools?

Starting point is 00:12:09 How do you provide access to the right steering files that give the agent the right context to be more effective? And this combination of, actually inside a Kiro project, the core components are the spec, the tools, which is usually MCP servers, the steering files, which are essentially, again, Markdown files, JSON files, that tell an agent, this is what I want you to do, how I want you to behave, give it a persona as well. And downstream of that, you have this idea of hooks, which are these automations that happen. Like, the moment you check in code, it's watching for events. And it could be the moment you save a file,

Starting point is 00:12:49 please run an optimizer for me because there may be a better way of optimizing the code, or send somebody an email saying, I'm done checking in code. You may want to look at your merge request or whatever. That combination is the Kiro UX, but the spec workflow is the heart of that. So what does Kiro look like?
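
The hooks described here are event-driven automations: something watches for an event such as a file save or a check-in and runs whatever is registered for it. A minimal Python sketch of that pattern follows. The registry class, the event names, and the actions are all hypothetical and are not Kiro's hook format.

```python
from collections import defaultdict
from typing import Callable

# A hook receives the path that triggered the event and returns a description
# of the action it took (a real hook would run a tool or send a message).
Hook = Callable[[str], str]


class HookRegistry:
    """Toy registry mapping event names to the hooks that should run."""

    def __init__(self) -> None:
        self._hooks: dict[str, list[Hook]] = defaultdict(list)

    def on(self, event: str, hook: Hook) -> None:
        self._hooks[event].append(hook)

    def fire(self, event: str, path: str) -> list[str]:
        # Run every hook registered for this event, in registration order.
        return [hook(path) for hook in self._hooks.get(event, [])]


registry = HookRegistry()
registry.on("file_saved", lambda path: f"run optimizer on {path}")
registry.on("code_checked_in", lambda path: f"email reviewer about {path}")

saved_actions = registry.fire("file_saved", "app.py")
```

Events with no registered hooks simply produce an empty list, so firing an unknown event is harmless in this sketch.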

Starting point is 00:13:09 How does it work? What is it today? Where is it heading, etc. But first lay the field out so we can all see what it looks like. For those of you who haven't seen Kiro, it lives on its own website called kiro.com. The best place to find anything about Kiro is to go there. It has, you know, you can download it.

Starting point is 00:13:27 You can read the documentation, but my recommendation is get it. Right now we have a wait list. So one of the, you know, we talked about the Kiro launch. It got really popular, really fast. We got over 100,000 developers signing up in the first few days. And we had to put a wait list on, which we've mostly cleared the original wait list which got very, very large, but it's still there.

Starting point is 00:13:51 But the assumption is if you sign up for the waitlist now, you'll be out of the waitlist very quickly. You're not going to wait for a month or so. But once you download it, you'll recognize a few things. One, I think the most obvious one is that it is a Code OSS-based editor. So if you have VS Code profiles and preferences already set up, it'll import them for you. So if you had any plugins, et cetera,

Starting point is 00:14:15 running, most of them will run pretty successfully. With some, there are clashes, and we'll warn you, please do not use this plugin with Kiro, for whatever reason. And if you have, you know, we use what is called Open VSX, which is the open source plugin ecosystem for VS Code. So that's where you can go and get most of your plugins. That's the obvious thing. The second thing that will jump out at you is when you open a project, instead of getting your code windows, et cetera, the default is you'll get a question.

Starting point is 00:14:45 Do you want to just talk to an agent or do you want to start building code using spec workflows? And you can actually just start talking and Kiro will detect whether you need to do a more structured development or just, you know, sometimes vibing is fun. Sometimes you just have questions to ask the agent. But you'll notice very quickly that the emphasis in Kiro is not in the code editor part, the code editing part, the syntax highlighting of code. That stuff is all hidden. The main window that's open and, you know, the way you can set it, is all around the spec authoring experience where you're creating a specification

Starting point is 00:15:19 and you're working with an agent to create the spec. So that's the first thing you'll notice about Kiro. And you can use it on your favorite laptop, all the operating systems that you care about. And it's pretty easy to get going with because once your brain adjusts to the fact that what you have in front of you is not code. But actually, as I like to joke,

Starting point is 00:15:44 Kiro is more of a Markdown editor than a code editor because most of your interactions are going to be natural language with Markdown, and what you see is Markdown. The first thing that will come out is a set of requirements that are written in EARS format. So for those of you who have worked with user requirements, etc., before, EARS is a format that I think came out of Atlassian to represent user stories and requirements, all of them in Markdown.
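
For anyone who has not seen EARS (Easy Approach to Requirements Syntax), its event-driven pattern reads "When <trigger>, the <system> shall <response>." The Python sketch below formats one requirement that way. The function name and the example requirement are made up for illustration, and this is not necessarily the exact wording Kiro emits.

```python
def event_driven_requirement(trigger: str, system: str, response: str) -> str:
    """Format a requirement using the EARS event-driven pattern."""
    return f"When {trigger}, the {system} shall {response}."


requirement = event_driven_requirement(
    trigger="the user clicks the subscribe button",
    system="newsletter service",
    response="add the user's email address to the mailing list",
)
```

The fixed sentence shape is the point: every requirement names a trigger, a system, and an observable response, which makes it easier to turn into a design and a task list.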

Starting point is 00:16:11 You can throw text at it. You can throw pseudocode at it. It's multimodal. You can throw your classic, I drew my startup idea on a napkin, and throw it at it. And Kiro will convert that into a specification. Once you get the requirements, you can take a look at them and say, yeah, I like them. If you don't like them, you can type into the chat window that you want to change them. Or you can directly edit the Markdown, which we rarely see people do. The next thing it does is actually come up with the design. Like an engineer, it'll come up with, here's my design proposal, and you get to decide if you like it. And if you're okay with it, then you say, okay,

Starting point is 00:16:49 I'm good, and it'll create a set of tasks. It could have one task or 20 tasks, where a task could be: I am going to create a window in the, you know, a button on the right corner which is blue in color and says subscribe. That could be one of the tasks. It'll be written, and within that there will be subtasks, which are the actual things you'll do. You can do it task by task, because some of those tasks are optional, or you can say, implement it all. It'll go ahead and do it. You'll see, if you go into the file browser, you'll see all the files getting created.

Starting point is 00:17:20 You can inspect them. Sometimes it'll come back to you for help. Say, hey, I'm not, even if you're in autopilot mode, where it needs a user permission because your file system requires it, for example. And that's how your projects are created. Like, that workflow is all around that specification.

Starting point is 00:17:35 And if you want to go back and change the code, the most recommended way is actually to change the specification and not go in and manually edit the code. Like, no, I actually don't want this to be blue. I want this to be green. Or I didn't like what you did here. Change your implementation a little bit.

Starting point is 00:17:51 And you can also do it in real time. If you see it's going in a wrong direction, because you get the chain of thought in front of you, you can essentially hit Control-C and go, nope, I want you to go in a different direction. So that's kind of how it works. I'm simplifying greatly, but that's kind of how it works. What if AI agents could work together just like developers do?

Starting point is 00:18:26 That's exactly what AGNTCY is making possible. Spelled A-G-N-T-C-Y, AGNTCY is now an open-source collective under the Linux Foundation, building the Internet of Agents. This is a global collaboration layer where AI agents can discover each other, connect, and execute multi-agent workflows across any framework. Everything engineers need to build and deploy multi-agent software is now available to anyone building on AGNTCY, including trusted identity and access management, open standards for agent discovery, agent-to-agent communication protocols, and modular pieces you can remix for scalable systems. This is a true collaboration from Cisco, Dell, Google Cloud,

Starting point is 00:19:11 Red Hat, Oracle, and more than 75 other companies all contributing to the next-gen AI stack. The code, the specs, the services, they're dropping, no strings attached. Visit agntcy.org, that's A-G-N-T-C-Y dot org, to learn more and get involved. Again, that's AGNTCY, agntcy.org. You know, it sounds super weird to say this, but I resonate with not editing the code yourself. And the reason why is because the LLM, the agent, has this context that is beyond my context.

Starting point is 00:19:53 And it kind of has this plan. So when I start like telling it what to do, I kind of like derail its plan. And that to me is jarring at first because I'm like, no, no, no. I still want to be in charge here. I want to have agency. Not to be too punny,

Starting point is 00:20:08 but I find that like even a small change, I really don't want to do because I want to tell it what to do. And so Jared mentioned Agent Flow. This is just a term I coined. It really just, I think it sounds cool for one. I think it's the coolest term off. So if everybody wants to use Agent Flow, you know, back off. I'm just kidding.

Starting point is 00:20:28 I like it. It's cool. But mine is not, my flow is not as sophisticated as yours is in terms of like MCP servers and tools yet. It's really just how I think in my brain about the problem I'm trying to solve: develop some version of a spec. I've been calling them just documents, so I've been calling it document development.

Starting point is 00:20:46 They are documents. They are documents. And it's nothing against the word spec, but generally in our industry specs have been a version of a pejorative. It's just kind of like, ooh, you want me to do that on spec?

Starting point is 00:21:00 I just personally didn't jive with that term. Cool with you all using it. It's just my personal preference. But it was not as kind of sophisticated as yours. That being said, though, as I'm playing with Kiro, you know, you're generating this initial document, the spec, which I get, and I've written those specs before. I was on an agile team. I was a product manager, led teams. And I would speak in these terms, like as a user,

Starting point is 00:21:23 I want X, et cetera, that whole format. And I get that. And then the design document that comes from that is even cooler. Like, that's where I think my brain gravitates most. So that's kind of what I've been generating that version of a design document, not so much that initial spec that creates this design document. Yeah. But not wanting to touch the code I thought was really strange at first until I started to get into the agent's way with what I told it to do via this document-driven, spec-driven agent flow. And I just became the architect of the idea and the researcher and the viber. I'd like to call it riffing, really, because I would like ask questions about different things, poke holes in the CLI or the API or how does this truly work. Did you research that?

Starting point is 00:22:10 Is that really based on facts and truth? Here's the truth here. Or here's an example of how I've hit a pitfall there. So avoid that. And then we come up with this thing called a spec or a document or whatever. And it's like, okay, great. This thing represents as best as I can think of the idea. And then usually the last thing I do is I say, where are the blind spots at with this?

Starting point is 00:22:31 Okay, we've defined this. I think I've got good eyeballs on the spec or the design. Where are the blind spots here? And we'll go back and forth on what those are, and we'll update that thing. And then when you get into that flow and you say, okay, go. And I want to be on autopilot, right? Your button is autopilot.

Starting point is 00:22:47 And I kind of like that too. It's like auto edit, sure, go ahead. Just make the world. I don't want to really get in that thing's way, you know. I wanted to call it YOLO mode, but. Yeah. It's kind of, yeah, YOLO mode actually, I mean, I would almost advocate for that

Starting point is 00:23:03 because that kind of makes the tool fun. You know, what we don't want in this process of creating software is removing the fun of it and making it this tedious task of just like writing specs and writing documents. We can't lose that fun aspect of it. So I rambled a bit, but I totally vibe with the idea of not touching the code, not because I can't change one single value, but because I kind of get in its way, and it has this context window and I really don't want to disturb it because I've done my job, which is be the

Starting point is 00:23:37 architect and the visionary, the idea. And then we've blessed this thing called a document or this spec. Yeah. And let's just go. And if I get in its way, I kind of like derail it. It can get back in motion again. So it's not like a total derail. But sometimes you can really jack it up and you're on a tangent, or what I call a side quest. Yeah. No, so I think you're completely right. There's two reasons for this also. There's a technical reason, which is context is king. Your results are only going to be as good as your context. And especially if you've been in a long conversation or a long session, going in and manually changing code actually kills the context.

Starting point is 00:24:15 And you get out of sync. And getting back into sync is just really hard. In fact, I mean, even if you're not using structured work, but just documents, giving it instructions, the moment you go start touching the code, you've lost that context completely. So that's one part of it. I also feel, at Amazon, we have this leadership, a tenet for our principal engineers called illuminate and clarify, where our senior engineers are very good at illuminating and clarifying

Starting point is 00:24:46 a problem. They usually do it with their team or group of junior engineers. I think of what you just described as illuminating and clarifying to an agent, to an AI agent. And I think the metaphor stands, and the better we are at doing that, because I think it's more art than science right now, a lot of folks are learning how to work with these agents. You've developed your own workflow, I like the name, agent flow. I've seen other engineers have their own flow, so to speak, and I think the thing that we, as the Kiro team, will have to do over time,

Starting point is 00:25:21 is today there's a particular flow that Kiro uses. You can go, we can do some stuff around it, but you can't go randomly and build your own, completely randomly build your own flow. I think we'll learn quickly from how people want to build these because we want to help developers stay in the flow that they like. That's a big part of what we want to accomplish. Just keep it fun.

Starting point is 00:25:40 So there's a couple of decisions that you guys made. Probably had to make them pretty early in the process, which I think are interesting to me, the whys behind the decisions. And one of those decisions was it's going to be an editor. It's going to be a VS Code. Is it a fork, I assume? Or is it a... It's based on Code OSS, which is the open

Starting point is 00:26:00 source framing project underneath VS Code. Gotcha. So it's not exactly a fork of VS Code, but it's based on the same foundation as VS Code. It's VS Code-esque. Not really interested in that decision. I think once you decide we're going to build an editor, then that's like a really good decision to just be like,

Starting point is 00:26:16 let's build off of that. But a lot of people are going to the terminal. And I have found as an engineer that I've enjoyed using terminal-based agentic tools more than I've enjoyed using the exact same tools in my editor, whether it is VS Code, whether it is Zed. Those are the two that I've been using things in. And so I'm curious, first of all, about the choice of like, where does the agent live? And your guys' answer is it lives inside your editor. Can you speak to that?

Starting point is 00:26:47 Yeah. So actually, what's interesting is that we have both: the agent living inside Kiro, and one of the nice things is we have the terminal-based agent as well with the QCLI. You know, when we say 80% of builders inside Amazon are developing code with AI tools on a very regular basis, a very large number of them are using QCLI. There's a couple of, because as I said, it predates Kiro, there's a couple of reasons we did that. There's a class of people who just love working with what you would call an IDE. And I think the IDE as we know it, which is a code editor, is going to go away.

Starting point is 00:27:26 because as we just discussed, that's the less important part. It's more of a visualization system. If you look at people doing terminal editors, they will use an IDE or some text editor at some point, because they need to visualize, they need to do diffs, et cetera. There are some things that we found, even with QCLI, which are very, very hard.

Starting point is 00:27:46 If you want to build a spec and have a spec workflow to it, that's a much richer interface, and it's very difficult to put that interface into a terminal. We think there will be a lot of folks, especially power users, who will use terminal-like systems because they give you that freedom and flexibility. Or they're just from, you know, it's the classic vi versus Emacs war at some level. That's just the flow that they like. But if you're interacting with an agent, it's not a bad system to use.

Starting point is 00:28:13 We also felt that there was a very large number of people who wanted that richer user interface. And the richer user interface allowed us to try a few things. One was the whole spec workflow thing. You can try building it inside a terminal, but it's going to look clunky and ugly, and it won't work. That was one, and we had strong belief in that. But the whole helping people author an MCP server, helping them author a set of steering files, the whole library of hooks that Kiro has, it makes it much, much easier. The way you would do hooks in a terminal was, you would probably have a slash hooks

Starting point is 00:28:50 button, and you'd get a list of the hooks that you wanted. But if you had 500 hooks, you'd have to scroll through all of them to find them. In a visual interface, that's much easier. You can have a library of hooks that only your team cares about, and over time you may have a library of hooks that, in fact, your company may publish, and it's like, this is the code review agent that you're going to use. Here's an MCP server that goes with it. Here's a steering file that goes with it. All that context is shared. So there's things you can do from a user experience perspective that are just hard in a terminal. If all you're doing is just driving an agent with prompts, terminals are great.

Starting point is 00:29:25 I don't think the IDE gives you any advantage. But within an editor, if you are working on these rich things, you want to visualize diagrams, work with multimodal inputs, have these other affordances that you want to make part of your context. It's a more powerful, more visual interface to do it. You're not just prompting at that point of time. You're doing a lot more than that. But I agree.

Starting point is 00:29:48 If you're just prompting an agent, CLIs are amazing. Okay, good thoughts. The second decision that you all made, at least I'm assuming you made this. was we are going to be, I guess, not model agnostic, but we're going to pick a model that you're going to use and then you can change that model.

Starting point is 00:30:06 There's different angles of this. Like obviously, some people have their own model they want you to use, right? Like Anthropic ones, you know, the model behind the agent is like a huge part of the system, right? And so, uh, Kiro, as I downloaded it, selected Claude Sonnet 4. And there's a drop-down selector.

Starting point is 00:30:22 For me, I can't pick any other option. I'm assuming I could configure that to pick different options. But just your thoughts around model selection. I mean, it's a big part of it because at the end of the day, you are wrapping a model that you have to decide what to do with. This is a bit of a philosophical debate we have within ourselves as well.

Starting point is 00:30:41 So the reason the drop down is there, there was a time that Kiro allowed you to pick between Sonnet 3.7 and Sonnet 4 because sometimes you ran into availability issues with Sonnet 4. There was just not enough capacity. So you could choose Sonnet 3.7 and do your work there. But that problem went away and nobody cares about Sonnet 3.7 when Sonnet 4 is available. Right. It always picks Sonnet 4, so we took that out.

Starting point is 00:31:05 So that's not going to say GPT-5 next to this ever. No, but there isn't any, if you download it now, you will get two options. The default model, I won't say a model, the default agent in Kiro is something that we call Auto, and I'll go into this in a second. I'll actually go back to Sonnet 4 as well. What you're using in Kiro is not Sonnet 4. You're using an agent that uses Sonnet 4 for the core thinking part of its work.

Starting point is 00:31:34 There's a set of scaffolding that goes around it, everything from your system prompts to the way you handle caching, to the way you are handling any guardrails that might be in the picture. Sonnet 4, let's take two editors. They both say Sonnet 4. They're not equivalent. It's how you drive context into the LLM.

Starting point is 00:31:54 And the scaffolding you have around it that makes it useful or less useful or more. So in some ways, part of the, you could argue that saying Sonnet 4 is also almost a comfort level for people saying, I've got Sonnet 4 inside it, because it is an agent that's doing the work. It happens to be an agent that is built on Sonnet 4. There could be pieces of it. And actually, that's true for many agents out there, where for saying hello, I don't actually want to use Sonnet 4. That's a very expensive hello, you know, as an example. Yeah. Well, I was just thinking about that because you, I didn't want to call it a wrapper because that seems like that's just like belittling the amount of effort that you all have done. But basically in wrapping a model like that, you're making a pretty large bet on the provider of that. And you're, you know, choosing them versus being like, well, you can use Gemini. You can use ChatGPT. You can use Anthropic. The good news is because you have a drop down, if we want to go in that direction, that's fine. Now, where we

Starting point is 00:32:52 have decided to go, at least for now, is an agent called Auto. Auto is using multiple agents underneath the hood. We can choose whichever largest frontier model we want to choose. Right now, we tell people, it's there in the documentation, that one of the models that we use is Sonnet 4, because that's a model that we know well. We understand well. All our agents use Sonnet 4. Q CLI also uses a Sonnet 4 system. We have all the scaffolding in place.

Starting point is 00:33:19 We understand how it works. But we are starting to, we want more. There's a number of ideas we have around how we handle the model based on the tasks you have, including maybe some specialized models for certain things. So Auto is an agent. It's not a model. It uses multiple models underneath the hood, one of which happens to be Sonnet 4,

Starting point is 00:33:39 and over time we may have different categories. Sonnet 4 sets the bar in some ways for how good the results need to be. The way we think about it is our quality of results with Auto has to be at least as good as the agent that uses Sonnet 4, the one that you've been using, that's Sonnet 4.

Starting point is 00:33:59 But we want to do it at a fraction of the cost and hopefully better performance. That's a mental model. But again, we haven't taken one-way doors because I think this is a philosophical debate within the community, within our team on what's the right thing to do because the funny part is

Starting point is 00:34:16 if you give people five agents, chances are half of them, most of them, will just pick the one that they think is the best anyway. So that's where we are right now. Where the future goes, we'll see. Choosing models as a, I think there's a cost perspective there, right? And if you remove the cost perspective to the individual user, so I guess the context here really is like, we're drastically different types of developers in terms of like, I'm a solo. I'm not doing it as a team, even though Jared and I are both here.

Starting point is 00:34:47 I'm not working on, like, Changelog web proper stuff. It's like scratching my own itch, CLI type things, just really learning and exploring. And so a lot of my context and even my preferences are largely based around a team of one. Whereas Kiro is really a team of many. And so the choices you made were really around teams and context sharing and things like that. And in a world like that, you usually have some version of an enterprise cutting the check, and there's less concern about, you know, ultimately the cost of an agent

Starting point is 00:35:20 and how it may impact you. Obviously, you've done your cost analysis, but that's largely similar to us usually. Yes and no, because I think if you go directly, like that becomes, if you go directly with a cheaper model, the challenge is people will do that,

Starting point is 00:35:35 but then will get very frustrated with the quality of the output. What I care about is the quality of the output. And that works for individual developers, that works for, you know, your cell startup that's funded by Y Combinator, it works for the enterprise. I actually think in some ways enterprise developers work under more constraints on how creative they want to get than developers in other places.

Starting point is 00:35:59 So the way Kiro thinks about it is the customer that we want to really make happy is an individual developer. That individual developer could be a solopreneur, somebody sitting at home and scratching an itch, somebody sitting as part of a small startup, maybe the technical person in a two-person startup, for example. They could be a developer at, you know, at a startup that we all know, one of these

Starting point is 00:36:22 series D unicorn startups, or it could be somebody at Delta, to pick a big developer customer. All of that is possible, which is, and I think the reason, so there's a few things that we did in Kiro that point to that. One is that the default login system in

Starting point is 00:36:38 Kiro is social. You either log in with your Gmail or you log in with your GitHub login. It's not that you need an enterprise system. That's a big part. Again, going back to decisions that we made, a very conscious one. It's actually one of the few, if not the only, products built by an AWS team that you can sign up for with Gmail and have the ability to pay.

Starting point is 00:37:02 You don't need an AWS account. In fact, you don't need anything. You need no AWS account. You don't need to worry about AWS permissions. Zero. There's no requirement at all. If you happen to be an AWS customer, there's ways you can get signed up. So that was the big part of it. What we did decide to do was assume that people want to write code that's going to do.

Starting point is 00:37:22 A big part of the idea was, yes, there's a lot of fun projects out there that people write, you know, I'll call them toy projects. Again, I'm being sort of general. Projects that you do to learn, et cetera, and you can do them with Kiro. You can vibe code with Kiro just like with any other agent. But our assumption was people want to take code and actually do something meaningful with it and grow it over time and evolve it over time. So that was very much part of it, and it is also true a lot of people do that as part of a team.

Starting point is 00:37:51 So we wanted to keep that in mind. The team could be two people. It could be a two-pizza Amazon team. But the fundamental assumption was that the person signing up is an individual somewhere, whether they are at home or in a company. And I think finding that balance between the fun part, I forget which one of you mentioned, you have to,

Starting point is 00:38:13 like development is fun. You can't take it away, especially developing with AI agents. You don't want to make it boring, put in a bunch of checks and balances because you assume it's going to be in the enterprise. I think that's kind of where we are right now. And we'll see as it grows and where people push us, which direction we end up in. But there's a lot of white space right now in this whole area. We're just all vibing our way there.

Starting point is 00:38:40 We're getting there eventually. We'll get there eventually. How many are there at AWS? Are there two-pizza teams shipping code with Kiro? I mean, I know you guys have a big wait list, but you have a great internal team that you can just test stuff out on and say, hey, use this sucker.

Starting point is 00:38:53 Well, the best example is the Kiro team. The Kiro team's just been using Kiro from the day Kiro started. Really? So they built Kiro in Kiro. I don't have any stats on it, but the one use case we mentioned is this one engineer who wanted to ship a feature, a notification feature in Kiro,

Starting point is 00:39:10 and he did it over a day because he wrote a spec for it, and then the agent wrote the code and maybe somebody else shipped it. But in general, one of the tenets we had in Kiro, and I have it for a lot of my teams, is you want to build it almost for yourself. Like, developers work best when they're building things that make them happy developers.

Starting point is 00:39:36 And you're a development team, build something that makes you happy. Then, of course, you give it to a bunch of customers and see whether their happiness and your happiness happen the same way. Developers are picky people. And that's when you start making the choices. The spec editor in Kiro actually evolved significantly because the way the team was using it worked for them, but as we gave it to other people,

Starting point is 00:39:56 it didn't work quite as well. And we almost rewrote it. I mean, we did rewrite it thrice, I think, in the user experience, to get it to a point where the majority of our customers really liked the experience, most of the people in our, you know, beta cohort. And so that's one. I'll give a more general one, which is, you know, we are at a point inside the company,

Starting point is 00:40:19 which actually helps us a lot, learn behavior. It's that most developers now, 80%, as I said earlier, are using AI systems to build code, where we have now got significant projects, which are significant complexity, which are 80, 90% AI, you know, written using AI. So we are moving quickly in that direction. That's really cool. I would love to see some of your guys's specs like from the Kiro product team

Starting point is 00:40:46 like not open source the whole thing or whatever but like here is a large production code base that was built with spec driven development and here is a bunch of specs that you can look at to see what we've ended up with as we've built out this code base. That's a really good idea. We've also, I mean I've been

Starting point is 00:41:03 some of the folks on the team are toying around with the idea of can you have a spec sharing, just a place for people to share specifications, which include not just the spec files but also steering files, etc., because they all go together. Like in the end,

Starting point is 00:41:19 I think this is what Adam was saying context is not as one dimensional as people like to say it is. It includes a spec it includes a bunch of other things right and your prompts your steering files can is there a package way of sharing all of like

Starting point is 00:41:35 you have agents.md now right it's a very powerful sharing mechanism because it's a standard format. So I think that's kind of where I think there's going to be interesting evolution even in that space. And as you said, I think the community is sort of converging into the idea of specs. It's whether they are doing it deliberately or sort of everyone's heading there. So we're very excited to be part of that because we've had this religion for a while. So inside of Kiro for our listener, the left-hand side, when you have the Kiro panel open. There's

Starting point is 00:42:08 one, two, three. There's four subsections: specs, agent hooks, agent steering, and MCP servers. So, Deepak, how many of those produce artifacts that end up checked into your source control? Like, how many of those are like trackable things?

Starting point is 00:42:25 Yeah, I mean, they're text files. You can check all of them in. All of them are sitting on your file system. Yeah. Everything in Kiro is a text file on your, I laugh because for me, they're on my right side. The way my brain works, I move all of those to the right. But it's different.

Starting point is 00:42:38 By default, I think it's out on the left. I just popped it up. I move everything to the right. I like having them there. There's the reason workspaces are configurable because you're all picky. And our own way is always the best way, isn't it? Like, my way is better. Yeah, you can have yours, but mine's the best.

Starting point is 00:42:57 Yeah. But they're all Markdown files in the end or JSON files. They can all be checked in. And that's the beauty, right? That, you know, in the end, Git is your source of truth. It's so wild how we have come back to Markdown files and JSON files, you know. I mean, honestly, because you can have, you know, data serialized into JSON. You can read it as a human.

Starting point is 00:43:23 It's not that fun. You can still do it. Yeah. But, you know, you can still build an interface out of that through a script that says, you know, turn this data into human readable, or Markdown it, essentially. And those two different document types to me are key. One thing I'm not seeing in this, and tell me if this is something that's on your roadmap,

Starting point is 00:43:44 is I feel like you've got the initial spec, you've got the design, and you've got some version of a task list that takes you to an implementation. Feature is, in quotes, done, right? It hasn't been QA'd, it hasn't been deployed. It may spike an incident. there may be a bug and something that I've really, which I really feel like is what I'm calling the agent flow part of it, is it's not just this initial spec and the feature that gets written.

Starting point is 00:44:15 It's the bugs that may come after it. I started to do things called builder logs. They're blogs, but I've enjoyed reading what the agent wrote about the feature we delivered because it helps me learn. So I'm using it as a means to like be a better illuminator and clarifier essentially. You know, how can I better understand the system I'm defining? And the builder logs are part of that, incidents are part of that, bugs are part of that, you know, knowledge base articles. What do we learn as a result of this exploration, or even in this little riff or this vibe, not even writing software, but what do we learn from, you know, poking holes in the API or the CLI or what really is or is not there, what really is or is not possible?

Starting point is 00:44:55 When do we have to go from API to SSH because the API lacks an interface to certain parts of the system? Then we actually have to go in via SSH and sniff around or whatever. So all these little weird things that happen as you build software are largely part of like bugs, incidents, knowledge base articles, builder logs. Do you have plans for the full ecosystem beyond this? I'm glad you asked this question. I will say the way we are thinking about it right now is where hooks come in. Because I don't think we know exactly what's the right way to do it. Like, Kiro will summarize what it's done for you.

Starting point is 00:45:30 It's like, here's all I did, right? You can take a look at all the actions that it took. That you can go find. Our current mechanism for doing that is the hooks part. Hooks are a great way for people to hook into, no pun intended, into Kiro. The hooks can come from you. They can also come from a third party, by the way. It's a great integration point because, you know, it's easy for somebody else to write a hook.

Starting point is 00:45:54 So, for example, you want to take your code and document it properly. You know, let's say you're writing Java code, you wanted Javadocs created or something like that. Like, I'm not a Java guy, so I have no idea how that works. But you wanted to convert that into really nice documentation that you can upload onto a website, maybe even publish it. You could create a downstream hook that is essentially a documentation agent or another agent that takes a look at it and converts it into documentation.

Starting point is 00:46:22 Maybe somebody else has written it. There's another startup or some open source project that does it. You can plug it into Kiro as a hook that will do it automatically. you still don't have to drive it. The good news is that the hook is using the same context as the rest of the project. So the context gets preserved. So that's one example. On the bug side, the thing that I would love, like one of the things I would love to do,

Starting point is 00:46:45 actually, as learnings is can you convert your learnings into a steering file that can then be used as a feedback loop? The good news with agents is they're really good at reflection. You give them error logs and feedback. They'll get better and better over time. How can we do more there? It's not something that's built into Kiro, but you can imagine that that's something that ends up happening.

Starting point is 00:47:06 And I think what we learn over time is, what are the kind of hooks that everybody does? And to me, as a product person, that's a capability that we should probably just take care of within Kiro. But over time, you want people to be able to extend it in the ways that they like and learn from that. I think if you look at all the successful things out there that people enjoy using, I'm a heavy Obsidian user, for example.

Starting point is 00:47:32 The whole Obsidian plugin, you know, extension community is a big, rich community that's out there. But there's a set of core plugins. And what's the core versus what comes from the community is something that we learn over time with hooks. It's early right now. It is early. So the hooks essentially are your way to allow yourselves to operate within a certain known user experience, a certain, you know, clarified way that you expect Kiro to be used, but it allows folks to extend it beyond and add to it

Starting point is 00:48:04 that you would not maybe otherwise want to prescribe because it's white space, as you mentioned before. There's a lot of a lot of vibing happening. I think we're all learning too as we do this vibe code type stuff and allowing the agent to write all of it versus stepping in even if we have the technical abilities to do so. How far, how mature is that? Like how is there anything out there in the hooks part of it? The mechanism is fairly mature. We are talking to, there's a bunch of hooks that have been written by our beta customers and early users for various things.

Starting point is 00:48:39 One of my favorite hooks that's out there, like the classic hook that's actually part of the sample hooks in Kiro, is a hook that translates dictionaries. So if you change your English language dictionary, it'll go and update every other dictionary you use for localization. It's a localization hook. You know, localize my code, basically. It's the kind of thing that people hate doing, but you can do it automatically, and if you make it a hook, every time you check in code, it'll automatically go and update all your dictionaries for you.
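[Editor's note: the dictionary-syncing step behind a localization hook like the one described can be sketched roughly as follows. This is a minimal illustration, not Kiro's sample hook: the function name `sync_locale` and the flat key-to-string dictionary layout are assumptions, and the actual translation of pending keys would be left to an agent or a translator.]

```python
from typing import Dict, List, Tuple


def sync_locale(
    source: Dict[str, str], target: Dict[str, str]
) -> Tuple[Dict[str, str], List[str]]:
    """Bring a translated dictionary in line with the English source.

    Hypothetical helper: returns an updated copy of `target` plus the
    list of keys that still need a real translation.
    """
    updated: Dict[str, str] = {}
    pending: List[str] = []
    for key, english_text in source.items():
        if key in target:
            # Existing translations are kept as they are.
            updated[key] = target[key]
        else:
            # New key: fall back to the English text until it is translated.
            updated[key] = english_text
            pending.append(key)
    # Keys that no longer exist in the source are dropped implicitly.
    return updated, pending


english = {"greeting": "Hello", "farewell": "Goodbye"}
french = {"greeting": "Bonjour", "obsolete": "Vieux"}
french_updated, needs_translation = sync_locale(english, french)
```

[In this sketch, `needs_translation` is what a hook would hand to the agent after each change to the English dictionary.]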

Starting point is 00:49:07 I know somebody else was writing a game in a JavaScript framework that they were not very familiar with. I think they were writing in Vue. They didn't know Vue very well. So their Vue code, they didn't know if the agent was perfect. They had an optimizer that ran every time they checked in code. There's a lot of things that you can do. I've seen people hook it up into code review tools. So I think that will happen. And I think part of what we need to figure out is

Starting point is 00:49:32 what's the right mechanism to make it scalable, to use that term. The bug thing is kind of interesting. I'm going to go outside Kiro. In CloudWatch, we have, and by the way, you can hook that into Kiro quite easily, we have an agent, which we call incident response, that every time an alarm gets triggered, it starts creating hypotheses on where in your system you have an issue that may have triggered the alarm. You can probably convert your root cause into a spec. I'm speculating. It's probably not that hard to take,

Starting point is 00:50:07 here's my root cause, here's where the code, you know, this is why the issue happened. You can create an issue. That's what people do. They'll file a ticket to do it. But you can even create a spec that says this is the right way, you know, this is the problem that we need to go solve, feed it back into something like Kiro. And with hooks, you should be able to automate something like that,

Starting point is 00:50:25 is my guess. So I haven't tried. I don't know if anybody else has. But I think these kinds of workflows will become more common. What's interesting about this, hooks, if I understand it correctly, is there's things that, you know, as you're doing something with, you know, with the agent, these hooks are sort of like side things you want done, like a formatter or, as you mentioned, a localization thing.

Starting point is 00:50:47 Maybe you're even running a build. And in the case of, say, Claude Code, I'd have to open a new tab and instantiate a new version of Claude with brand new context to not be blocked. So I may do something that a hook may do, but I'm doing it because there's no concept of hooks in the Claude Code world, at least that I'm using. Maybe an MCP server can do that, but still you're going to have that context window or that active window you're working in consumed by this kind of like a side task. And you're kind of blocked.

Starting point is 00:51:25 You're blocked until you could do something else. So that's kind of cool how you've solved that problem there. Yeah, it's something that the team did pretty early. I mean, this is where I think the advantage of building a tool and using it to build your own, you know, it's somewhat meta, but that's kind of how it works. And hooks came out of some of those ideas because, hang on, we need to do that. So, you know, it was a very early feature in Kiro. Yeah, it seems like it has tons of potential.

Starting point is 00:51:54 What are the timings of the hooks? Like, when can you actually fire off a thing or is I'm sure there's a life cycle of some kind? Yeah, there's a, at least the way it is today, it's like a watch API on a file system or a repo. It's looking for some change. Okay. So when files change, it does things. Yeah, it's an if this, then that kind of thing. If this happens, do it.
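[Editor's note: the "watch the file system, and if this changes, then do that" idea can be sketched with a small polling watcher. This is an illustration under stated assumptions, not Kiro's hook mechanism or file format: the `Hook` and `Watcher` classes, the glob patterns, and the shared `context` dictionary are all hypothetical names, and a real tool would likely use OS-level file events rather than polling.]

```python
import fnmatch
import os
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Hook:
    """Hypothetical hook: a file-name pattern plus an action to run on change."""
    pattern: str
    action: Callable[[str, dict], None]


@dataclass
class Watcher:
    """Polls a directory tree and fires matching hooks for changed files."""
    root: str
    context: dict  # shared project context passed to every hook
    hooks: List[Hook] = field(default_factory=list)
    _seen: Dict[str, int] = field(default_factory=dict)

    def snapshot(self) -> Dict[str, int]:
        """Map each file under root to its last-modified time in nanoseconds."""
        state: Dict[str, int] = {}
        for folder, _dirs, files in os.walk(self.root):
            for name in files:
                path = os.path.join(folder, name)
                state[path] = os.stat(path).st_mtime_ns
        return state

    def poll(self) -> List[str]:
        """Detect new or modified files since the last poll and run hooks."""
        current = self.snapshot()
        changed = [p for p, mtime in current.items() if self._seen.get(p) != mtime]
        self._seen = current
        for path in changed:
            for hook in self.hooks:
                if fnmatch.fnmatch(os.path.basename(path), hook.pattern):
                    # The hook receives the same context every time,
                    # so it never has to start from scratch.
                    hook.action(path, self.context)
        return changed
```

[Calling `poll()` in a loop with a short sleep gives the "if this, then that" behavior; the point of passing `context` to every action is the one made in the conversation, that the hook works within the project's existing context.]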

Starting point is 00:52:15 The key part is what Adam mentioned is it retains the context of the project, which is very, very important. It's not going to go randomly start from scratch and start it. But now you have to feed it all the context back in. It already has all the context. So it's going to work within that context. Yeah, if I think about it's a context mover and saver is all I am. Like, I want to keep the context. I want to move the context.

Starting point is 00:52:34 I want to save the context. And then in the case of Claude Code, when that infamous auto compact comes around, like, no, my gosh, please. Stop destroying my world, you know. Auto compaction is such an interesting thing, which is necessary because you want to conserve tokens, but it kills your cache. It kills your context. There's all kinds of challenges.

Starting point is 00:52:56 Well, then, you know, as a user, I just feel like it's a ticking time bomb constantly. I just feel such anxiety. You know, I'm like right in the middle of the best part. We're finally to the nugget. You know, we've riffed back and forth enough to finally get the clarity. And it's like, okay, draft or update this spec.

Starting point is 00:53:17 Or in my case, it's a PEP. I've actually borrowed the Python way, and even the acronym too, PEP, but I call mine Project Enhancement Proposals versus Python Enhancement Proposals. But by and large, the same concept where you've got this initiation of whatever it might be. It's got statuses and all those things. And that's what seemed to work for me, like updating that. And I can't do it because, you know, I've got to wait for the auto compact to happen or then it comes back. And it's like, I think I was doing this based on, you know, what this history says.

Starting point is 00:53:51 But then you kind of reset to zero. I understand what happens, but it's always like this anxious moment as a user to map around. Yeah, as you were talking about it, I felt like driving an electric vehicle and having range anxiety. Yeah, similar.

Starting point is 00:54:07 It feels like that. It's like how many miles till I can get there to charge up, yeah, exactly. It's, I think eventually that will get solved and I understand technically why it's there but as a user, all it does is basically give me anxiety

Starting point is 00:54:22 and then I got to work around it and moving context and saving context and I mean I've even started to like save chat sessions because you can do slash export copy to your clipboard or move it somewhere and kind of save at least the chat history so that you can read back from it

Starting point is 00:54:39 you may not have the full actual context that was saved, but you at least have some version of a map of how you got there. I'd be very interested when some of the new Kiro builds come out. We've got a couple of enhancements coming out this week in the clients. I'd be very interested in you running long-running sessions and seeing what happens because we've tried to do a lot of work in managing that context,

Starting point is 00:55:00 in managing that and making sure you get the right behavior out of it. It's not perfect, but I'd be very curious to see your experience, you know, whether the range anxiety gets mitigated or just gets a little bit better. And let's talk about that then because I want to talk about stack, because, you know, my stack for what I'm doing is pretty much any agent CLI. And, Jared, you'll be happy to know that Zed is my daily driver now. Like, I just, I've thrown every other possible editor to the wind

Starting point is 00:55:31 because Zed obviously is fast and beautiful. And the only thing it lacks is some of the things that Sublime Text still has, but, you know, I digress from that point there. But I'm using, you know, some version of a CLI in the terminal and then a separate editor. And I would not call Zed an IDE, although it has IDE tendencies. That's my stack. And the reason why it's my stack is one, it's accessible to me.

Starting point is 00:55:56 I've just learned about Kiro and just got access to it. But even with Augment Code, I used it inside of VS Code. And me as a user, as a visual person, as a designer, VS Code is great, but the text is small, the windows are spaced out. What I like about the Claude Code, Augment Code, Amp CLI flows is that the user experience of using them is a larger text. I can see it better. I feel like I've got more clarity with the vibing, for lack of better terms. And then I still have this really awesome, Zed is the best editor to still go and review and look at the actual code, you know, go around the entire directory that is the project. To me, while I would love to use Kiro, the thing that would hold me back is that you're in this VS Code-like world, and it's just less than desirable, I would say.

Starting point is 00:56:57 It's not bad. It's just not my preference. Yeah. I mean, I think we talked a lot about, back in the day, you know, building something completely new like Zed did. I like the fact that they call it Zed and not Zee. Or working off, you know, working off of Code OSS. And the reason we ended up, the goal, the idea, it was very clear to us, because we used to have a plugin, you know, classic VS Code plugin.

Starting point is 00:57:25 We still do, which is on the Q Developer side. But it became pretty clear to us from a UX affordance side that was going to be a very, very difficult place to be. We wanted to do things in the user experience that as a plugin were just hard. So having, you know, Code OSS as a starting point was great. I think the interesting point is, will we stay with Code OSS or will we evolve,

Starting point is 00:57:49 like at the end you fork it so much, it becomes its own thing? I think that's to be determined. There's a lot of that. One of the other advantages of using Code OSS is there's so many people using VS Code and they all have their plugins and extensions and just pulling them into Kiro

Starting point is 00:58:04 to do things makes their life easier as opposed to starting from scratch. That's actually probably one of the biggest reasons that we stuck with the Code OSS part. But the goal was always making a UX that was optimal for the flows that we want to get to. And I don't disagree with you. I think UX is also a very personal thing and how it works.

Starting point is 00:58:25 I'll always go back to, I just, like, personally, I never liked using code editors. Like, you know, I like using text editors, but I'm not a really good programmer either. But I think with Kiro, because now you're not in code editing land, you're in vibing-with-agent land to create specs.

Starting point is 00:58:49 Our user experience is going to keep going in. How can we make that a true, rich, multimodal experience and keep evolving it? Where it ends up again, I don't know. But that's a goal. But actually what you said, the stack part is very important. I do feel like most people will end up using a, what I'll call a more rich experience, which you can call an editor, IDE, whatever.

Starting point is 00:59:11 I almost feel like calling it an IDE is doing the thing a disservice right now. So, you know, we call it Kiro IDE, but in my head it's like a desktop product, which happens to have a code editor inside it, and a terminal-based product, like a CLI. And I think those two, I mean, because it's VS Code, I actually run Q CLI in my VS Code,

Starting point is 00:59:35 in my Kiro terminal as well. I often also have a separate shell open all the time. And one of the ideas, I think that's going to be interesting in how it evolves. You see that a little bit with some of these CLI things that now have editor plugins where you're using the editor just to visualize what's happening. I think there's going to be some very interesting, again,

Starting point is 00:59:58 UX evolution of how terminal-based agents and richer UX agents evolve over the next several years. So progress with Kiro means the VS Code side, the code side, gets kind of tucked further and further away, and the spec editor and this like immersive, interactive building of a specification and then some sort of end product kind of continues to level up and become more and more primary. So perhaps Kiro down the road, you know, maybe you don't have to tell people that yes, you can look at the code underneath if you need to. I argued with that.

Starting point is 01:00:37 I remember having a debate with the team and do we even need a code editor in here? Yeah. They won that battle. Correctly so. Yeah, I think right now you do. Yeah. My question for you is

Starting point is 01:00:48 with the current state of the art models, do you think that world that we just described of Kiro as this interactive app building thing with a code editor hidden in it? Do you think that's

Starting point is 01:01:03 You think that's possible today with just better agents, auto getting better and Kiro getting better, and leaving all else the same? Do you think we can get there? Or do you think the underlying models also need to improve to bring us to where you want to go? Oh, both, both. I mean, it wasn't a both question, Deepak. You can't say both. I'll be blunt.

Starting point is 01:01:24 It's either or. That's fine. Touché. I think we need one, at least one more generation of models. There are probably models out there that can get you there, but they're really expensive. And their context windows aren't right. There's a lot of limitations right now. I mean, I say that given the fact that over the last year, the models have become so much better.

Starting point is 01:01:49 Some of the things that we're doing today, you could not do a year ago. You just couldn't. I do think there's one more generation of models before we get. I still feel like to write a full application, like not a simple CRUD app, but a proper, you know, distributed backend. And we have examples of people who've done that. But it requires a lot of skill on the part of the developer to be sure that you're in the right place. I'm not saying that's a bad thing. I still feel like the people who are going to be most effective using these tools are skilled people. There's a lot of folks who believe and hope that you can give somebody

Starting point is 01:02:27 like me and somehow make me a better programmer, developer. That's not happening. Can I write applications more quickly and easily? Absolutely. I can prototype better than I ever could. I've written more in the last year than I had in the five years before that. And mostly it's me trying to make a molecular viewer for myself, because that's what I like doing.

Starting point is 01:02:46 But I do think... Have you got one? You got one done? I've got, like, 15 versions of it. They keep... Because that's my sort of test pet project. Do they work though? Are these... Can you ship them? And the molecular viewer? Is that right? That's cool. A molecular viewer.

Starting point is 01:03:01 I used to be a bioinformatician, so I like my molecules, because that's my standard test project. Would you ship any of those? You got 15 of them. Would you ship any of those as products? I have no idea. I think it works, but I'm not sure if it works. Right. Okay, keep going.

Starting point is 01:03:17 Yeah, but I do think. So you need that, there's a level of skill and understanding that you need to get the maximum out of these agents that we have today. But as these reasoning models get better, as we start adding new techniques to it, for example, there's this fun term these days called neurosymbolic AI, which is basically bringing mathematical techniques into generative AI systems. So we have a very strong neurosymbolic AI team over here. By the time this podcast publishes, this is probably going to be well-known news, but Joe Hellerstein, who's well known in the database community, he has a project called Hydro. Hydro is an open source project to basically do,

Starting point is 01:04:04 verify the correctness of a distributed system written in Rust. We want to bring those kinds of techniques and build them into the spec engine of Kiro. So as you're building software, you don't need a human at the other end to verify correctness, that you can verify correctness using a mathematical model in this case. That's game-changing, the kinds of techniques that take us to the next level.

Starting point is 01:04:26 You're not there. Yeah. But I think it's going to happen. Within, I don't know if it's six months, one year, two years, three years. Yeah. That would be a game changer for sure. It seems like everything's 80% solutions right now, or maybe even better than that. Maybe like 95%.

Starting point is 01:04:41 And then, like, that last mile is where you need to be a senior engineer to actually recognize and finalize the last mile of delivery on a lot of this stuff. There's an intuition that they have that you just don't get otherwise. And I, as I said, to the question you specifically asked, I would say, no, not yet, but we're not that far away. Let's talk about pricing. This thing's a product you all are trying to sell.

Starting point is 01:05:11 How do we buy it? So we changed our pricing twice already. Okay. This is a, so this is a pricing that we announced, and I'm going to get it wrong, last week and it'll probably go fully live and we'll start charging people in some time in the next few weeks

Starting point is 01:05:30 but it's your classic per-user pricing. You get a free tier. You actually get a free trial, because the free tier is mostly just for vibe coding. The free trial gives you the full experience. I think we put you into the pro tier for about a month and you can play around with it, and then if you like it, you can actually start paying

Starting point is 01:05:51 or you can flip back to the free tier. But there's a $20 tier, a $40 tier, and a $200 tier that give you 1,000, 2,000, and 10,000 credits, as a credit system. The way we talk about it is, if you're using the auto model, you will get those. Credit consumption for Sonnet 4

Starting point is 01:06:12 is 1.3 times the credit consumption of auto. And so if you're using, so you get, you just get a set of credits. As you can see, if we gave you selection of some kind, we would have different credit consumption rates based on, you know, the cost of running those models. And so you get two agents, an auto agent and a Sonnet 4-based agent. The Sonnet 4-based agent, you know, uses credits 30% faster, typically, because it depends on the prompt and the LLMs aren't deterministic. But for the most part, it's a 30% faster rate of consumption than the

Starting point is 01:06:55 auto model. And we also have overages. So if you, you know, use, so there's another thing that we did: we added the idea of fractional credits. So when you go, you can go from one to one to one. You don't go from one to two in these step changes. We found people who were going through their credits a lot because they used a little more than, you know, within the boundary, and then they would get sent to the next one. And that meant that you just consume credits really quickly. The other thing we have is overages. If you're in the $20 plan, you don't need to go to the $40 plan

Starting point is 01:07:30 if you just want to spend $21 for the project you have in hand. So there's overages there, and then for the $200 one, the assumption is most people will rarely get there, but we know there's some people who will because they have. But we spent a lot of time trying to optimize that credit consumption, and at least in our early testing it looks like you're not running out of credits so quickly. We had

Starting point is 01:07:54 in the middle separated out the amount of usage the way we were charging for specs versus vibing and people found that too difficult to wrap their heads around so we've combined it into a single credit bucket. I'm still confused. You're still confused. I'm still confused.
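
The credit arithmetic described above can be made concrete with a minimal Python sketch. It assumes only the figures mentioned in the conversation: the $20, $40, and $200 tiers with 1,000, 2,000, and 10,000 credits, and Sonnet 4 consuming credits at roughly 1.3 times the auto rate. The per-request base cost, the model key names, and the function names are hypothetical, and since the price of an overage credit is not stated here, the sketch reports overage in credits rather than dollars.

```python
# Plan price in USD -> included credits, as quoted in the conversation.
PLAN_CREDITS = {20: 1_000, 40: 2_000, 200: 10_000}

# Approximate consumption rate relative to the auto model (assumed fixed here;
# in practice it varies by prompt because LLM output is not deterministic).
MODEL_MULTIPLIER = {"auto": 1.0, "sonnet-4": 1.3}


def credits_used(base_credits: float, model: str) -> float:
    """Fractional credits consumed by one request.

    base_credits is a hypothetical cost of the request on the auto model.
    """
    return base_credits * MODEL_MULTIPLIER[model]


def overage_credits(plan_price: int, requests: list[tuple[float, str]]) -> float:
    """Credits consumed beyond the plan's allowance (0.0 if within it)."""
    total = sum(credits_used(base, model) for base, model in requests)
    return max(0.0, total - PLAN_CREDITS[plan_price])


if __name__ == "__main__":
    usage = [(600.0, "auto"), (400.0, "sonnet-4")]
    print(overage_credits(20, usage))  # credits over the $20 plan's allowance
    print(overage_credits(40, usage))  # same usage fits inside the $40 plan
```

Because credits are fractional, a request is charged in proportion to what it used instead of being rounded up to the next whole credit, which is the step-change problem described in the conversation.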

Starting point is 01:08:11 It's probably simpler on the web page. Well, I don't mean, I don't even mean your description of it. I mean just generally, because I think Amp Code and Augment Code, they use messages as their terminology, whereas you're using credits. That's what we used before. Yeah, we use credits. And how that maps to usage is sort of like a black box to me because, you know, I guess in those cases, like what I learned about Amp Code when I talked to Beyang Liu, you know, CTO of Sourcegraph, he helped me understand how, because I had a larger context window, it was actually consuming more credits, which I thought was like, well, that's convenient for you, because obviously I want context and so I'm burning

Starting point is 01:08:54 faster. It's super unclear as the user. And I'm like, gosh, you're just super expensive. He's like, actually, you're just holding it wrong. I'm like, well, I'm not actually holding it wrong. I'm just holding it. I don't know how I'm holding it. Right. How do I hold it right? Right. So this mapping of usage to messages to credits, to me, and it's not your fault, it's just hard to understand. It's tricky. One of the things we've gone back and done, because it became pretty clear that's what folks wanted, was actually show you real-time usage. So when you work, I don't know if you'll have it right off the bat, but I know we're working on it. We now show you, as you are using it, how your credits are getting consumed.

Starting point is 01:09:31 Where we want to get to, and we'll get there pretty quickly, if not in the next couple of weeks, is actually showing you, for a particular thing that you did, what kind of credits are consumed. Because, to your point, consumption of credit, how much you're using, is not just about a prompt. It's not even about the context window. It's about tools. If the LLM decides to use

Starting point is 01:09:53 200 tools instead of five, it's going to use more credits. That's the danger of MCP servers. They're amazing, but it's also like an enlarged context window too, so the more tools you use from those servers. Yeah,

Starting point is 01:10:08 if you preload your project with 5,000 MCP servers, you'll run out of credit in like five minutes because that context will get so large. I also think, as a community, we need to start becoming much better at educating people on best practices. So I think part of where this will end up is you'll end up having really, really nice, I won't say really nice, but mechanisms to understand what's the optimal way to set up a project. How do you do things that you always want versus bring them in as you need them, showing you the right visual cues. And the other problem is, and I'll go back to tokens here, because, you know, in some sense, that is the currency,

Starting point is 01:10:48 a token in one model is not equal to a token in another model. Tokens are also different. People put token weights, et cetera, but that's a heuristic. So I think the thing that people are struggling with is that how LLMs work is not deterministic the way people's brains work: how things get consumed and cost accrues. And quite honestly, I think you're seeing the entire industry deal with it over the last six months. And in different ways, this is our way. We think it's going to work for what we want, what our customers are going to do. But a lot of the responsibility is on us to be very transparent about when you submit a prompt and something happens,

Starting point is 01:11:32 you know, you at least get a sense of this is what this takes. And if you have different agents, the fact that they behave differently, you can make decisions based on quality and speed on whether it's worth, you know, you may want to go with the more expensive agent for whatever reason. But you have to know why. And I think that's going to be, and that's hard, but that's what you're trying to do. Well, certainly still the early days, I think. I mean, I think this is evidence of the trailblazing that is going on

Starting point is 01:12:01 and just the figuring out of how it all has to work and how it should work, and then educating ourselves on how to hold things optimally, not wrong. And picking the correct model, or having a model picker who's orchestrating behind the scenes, like where you send any particular prompt to save money and speed. I mean, it's all very much the land of trailblazing, isn't it? I mean, it does not sound figured out by any means. Yeah, I mean, we also, I mean, I think the other thing is how people use these agents and how they get success

Starting point is 01:12:38 out of it is sometimes learned only after doing it. You don't know beforehand, this is the best way to use this agent to be successful. The whole idea of what became specs and these memory banks and prompt libraries and steering

Starting point is 01:12:54 files, I think we sort of, you know, this is the equivalent of trying to learn how to walk, falling down, trying to learn again. It's almost like your first baby steps. That's how you learn. And I think we've gotten to the point where we are not because all of us were smart and knew what we were doing. I think as an industry, we are constantly learning from people's behaviors, how they use these things.

Starting point is 01:13:17 Inside Amazon, we have this power users Slack. I think I get more understanding of what's the best way to run an agent because of people who are using these agents in anger every day and learning better and better ways to use them. And they come up with things like agent flow, and that gives product people product ideas. Right. But yeah, that's just us. I think the whole industry is doing that. And so it is early days, but it's also a lot of fun. So yeah, that is the hand of innovation, is using things in anger, you know. Right. It creates itches that you scratch. Next thing you know, you're doing something completely, you're innovating. Yeah, you're actually, yeah, you're on the edge there. I'm going to mention something that's slightly dystopian, even at this, we almost

Starting point is 01:14:04 It's true without any dystopian, yeah. Well, you know, are you guys both familiar with the idea of a toll booth? Of course. The way to get rich is to create a toll booth, right? One way. Well, yeah, sure. One way. Well, by and large, if you look at most of the way people develop wealth, it's some version of a toll booth in metaphorical terms.

Starting point is 01:14:26 It could be real estate. That's a toll booth because you need somewhere to stay. I got the toll booth. You want to stay? Boom. I make my money. or in the case of something like this, I'm thinking of it like the future of software development,

Starting point is 01:14:38 I see it all going this way. We all see it going in this direction. And for a while there, the cost to be a developer was your ability to, you know, get hold something in anger, maybe have a license for a code editor, or in the case of VIM for free,

Starting point is 01:14:54 it's open source, the machine you had to buy, there was no toll booth in front of your ability to create and innovate. The toll booth was maybe the books or the education process or the schooling required, things like that. But you could bypass those things if you were smart enough to learn on your own and just avoid the toll booth altogether. And I guess what I'm struggling with as we're in this conversation is maybe just the fact that if this is the way of the future, what a massive toll booth there is in front of the innovation that drives the world. And I'm just not sure how I feel about that right now.

Starting point is 01:15:32 Yeah, I actually think it's the other way around. I'll use examples. Maybe there's a toll booth. I don't know. But the way I think about it is there are things happening where the cost of doing that thing is very, very high. Maybe I am a semi-technical founder and I'm busy spending time looking for a very deep engineering team, because I can't even build a prototype, right, because I'm not technical enough to go do that. I have an idea in my head. I can write some JavaScript

Starting point is 01:16:07 maybe, but my idea requires some heavy-duty backend stuff and I have no idea how to do it. In theory, and I've seen enough examples that this is possible, you can do that on your own, or maybe with one person to do much more work. Once you have that, you can go further. That's one example. I've also seen people who've had a project that's been sitting on the back burner for a decade because it is something that they've built, it's an open source project, they've got a day job, and they don't have the time to go and evolve that open source project because it takes too much time. They don't want to spend their weekends doing maintenance of an old thing, maybe. Or you have a back end that's not scaling and you need to change it, but the cost, you know you need to change it,

Starting point is 01:16:53 but the cost of doing it is like 20 people and a year and a half of development, if you did it the old way. I've seen enough evidence now in all three cases of people using these agentic development systems, going and tackling those problems. The first one is an obvious one people talk about all the time, you know, the founder thing. The second and third ones, I think, are where the magic lies, which is you are doing things now that you would not have done before, which results in better code, better user experience, etc., because your barrier to entry is small and low enough that what you thought would take you two weeks is now going to take you a day and a half, maybe even less.

Starting point is 01:17:38 What would take you 20 engineers and, you know, a year and a half, you can now do in much shorter time, et cetera. My favorite example is actually of a team that does sprint planning every four days instead of two weeks because they were finishing their backlog that quickly. So I think that's, to me, that's, I don't know if you'd call that a toll booth or shifting of the toll booth or whatever, but I do think there's this innate sort of barrier reduction. There's an activation energy to everything. I used to, you know, I'll speak in my physics terms.

Starting point is 01:18:09 And you need a catalyst to cross that activation energy. And that catalyst could be funding because you get more people. It could be hiring really talented people. I actually think it's not just the catalyst part. your activation energy has gone down significantly. So the catalysts you need are, you don't need catalyst at all in some cases, or you don't need the best ones out there.

Starting point is 01:18:29 And I think that one, it'll have a meaningful impact on software development and how teams are structured and how we think about it in a very positive way, because you're getting things done that you were either not doing, or wouldn't have done, or would have done much more slowly. And I think that's, at least that's how I think about it.

Starting point is 01:18:47 Maybe I'm the glass-half-full guy, and let's just see you. You're spot on, and I concur and agree with everything you just said. But what we cannot deny is that there seems to be a subscription of some sort. And this is where I'm struggling, really. Even though I'm, like, emphatically happy about the circumstance, I think it's cool. I think it's the best thing almost ever, but now I'm struggling to think that there's now a subscription attached to being a developer or to doing developer things that build software. And the reason why I say that is you go back to a couple years ago when they said you will own nothing and you'll be happy.

Starting point is 01:19:25 Well, now we don't really even, like, if this is the way, right, and I was just telling Jared this the other day, if Adam and Jared from 20 years ago, which is today, let's say there's versions of me and him today that are entering the software world, if this is the way, maybe they don't even own their intellect and their innovation ability anymore. Like, now we're renting that too. And I'm kind of just struggling with that in this moment. Yeah, and I, the ideas are still yours. I think it's, I still, I mean, I remember the days when I was at AWS a long time ago, when I used to have a stack of Linux servers at home that I finally, like, I don't even know why I have them, right? Because I, before I joined AWS also, where I, you know, I got rid of them and started using EC2 for everything.

Starting point is 01:20:15 I don't need, like, I have this little corner of my house picking up heat, not required. Like, there's an early one. So I still remember when you switch, you know, there's a time you switched from assembly language to more of the modern languages. Or you went from some language to a Rails-type structure. There's always an abstraction or something that comes up that allows you to shift your attention somewhere else. I think the big one is where does that attention shift and how valuable is that? And I think those have historically defined whether something is going to be successful or not. I do think, and I mean, maybe all of us have drunk the Kool-Aid on this one.

Starting point is 01:20:50 In this particular case, I don't think we're talking about a 5%, like my ability to innovate or my ability to act and my ability to think and, you know, it's not becoming 5% better. I actually think I have a new tool that allows me to operate fundamentally differently. And I don't think we've quite grasped what operating fundamentally differently as individuals and teams actually means. I think people are starting to find out, but I think it's going to make an impact in a very positive way. And maybe it's the shifting of the toll booths on where the toll, because there's always a toll booth, as you said, somewhere. But don't take it, even though I described it as dystopian, and I don't want to be necessarily negative, although it is a negative thought when you sort of go down the rabbit hole, I 100% agree. I'm personally doing things, just simply exploring and scratching my own itches,

Starting point is 01:21:45 because I just didn't have the time to even go to the depth necessary to do it in a way that was even worthwhile doing. And you take that same idea and you apply it to an enterprise team that has different context windows and challenges and issues. Gosh, yes, for sure.

Starting point is 01:22:02 I was just telling Jared the other day that for a while there I was bummed that I thought that, well, there's going to be a market shift and a market correction in terms of the way enterprises or even teams are built these days, and you may see layoffs, but I think by and large, we're going to see a massive, massive influx into folks becoming what are called software developers, because

Starting point is 01:22:23 there's tooling that allows them to operate like a senior engineer, you know, or with senior engineer abilities, and that's amazing. I'm all for it. In the moment, I was just like, gosh, do we now have a toll booth in front of our ability to show up as developers? And that was just kind of a bummer. I think subscription fatigue gets to us all. It does.

Starting point is 01:22:46 Yes. You know, when you're paying per token, you're basically paying that subscription fatigue. Well, Deepak, I know you are hitting up against a hard stop. We appreciate you spending a few extra minutes with us. Kiro looks really cool. I love all the innovations going on. I think that spec-driven development is a very awesome, like, stake in the ground to claim, like, here's a way of doing agentic things that is better.

Starting point is 01:23:10 And I wish you guys all the best in building it out and realizing that vision of what you're trying to build. Cool stuff, man. Thanks for having me. It was fun talking to you. And as I said before, I think, we started, I've been, I've probably heard, I started listening to the Changelog at episode one. So it's great to actually. That's so cool. Appreciate it.

Starting point is 01:23:29 So cool. That's beyond cool. Thanks for listening, for being a listener for all these years. I appreciate that. Special thanks to our longtime community member Jordi Mon Companys for connecting us with Emma and the AWS team for this episode. You're awesome, Jordi. Thank you for being awesome.

Starting point is 01:23:49 And thank you, of course, to our awesome partners at Fly.io and to our awesome sponsors of this episode, CodeRabbit.ai and agntcy.org. That's A-G-N-T-C-Y dot org. And thank you to our awesome musical producer, the one and only Breakmaster Cylinder. Sounds like the word of the day might be awesome. Oh, Terry. Yes, Pee-wee.

Starting point is 01:24:13 How do you like today's secret word? Okay, that's it. This one's done, but stay tuned because we have another awesome. Yeah! Good screaming, everybody. Conversation. This time, all about the trouble in the Ruby community and what it means for open source at large.

Starting point is 01:24:31 Coming on Friday. We'll talk to you then.
