The Changelog: Software Development, Open Source - Spec-driven development with Kiro (Interview)
Episode Date: October 15, 2025
We're joined by Deepak Singh from the Kiro team. Kiro is AWS's attempt at building an AI coding environment to take you from prototype to production. It does that by bringing structure to your agentic... workflow with spec-driven development. Their aim: the flow of AI coding, leveled up with mature engineering practices.
Transcript
Welcome, friends. I'm Jared and you are listening to The Changelog, where each week, Adam and I speak with the hackers, the leaders, and the innovators of the software world.
We pick their brains, we learn from their failures, we get inspired by their accomplishments, and we have a whole lot of fun along the way.
Today we're joined by Deepak Singh from the Kiro team. Kiro is AWS's attempt at
building an AI coding environment to take you from prototype to production. It does that by bringing
structure to your agentic workflow with spec-driven development. Their aim, the flow of AI
coding, leveled up with mature engineering practices. Deepak shares some really good ideas that are
driven by real-world use. I think you'll enjoy it. But first, a big thank you to our partners at
Fly.io, the public cloud built for developers who love to ship. We love to ship, so we love Fly. You might too. Learn more at fly.io.
Okay, Deepak Singh talking Kiro on The Changelog. Let's do it.
Well, friends, the news is out. Our friends over at CodeRabbit, coderabbit.ai, have raised a massive Series B and they've launched their CLI review tool. It is now out there. I've been playing with it. It's cool. The bottleneck is not code. The bottleneck is code review. With so many people coding now, so much code being generated, and so many things competing for developers' time and attention, code review still remains a bottleneck.
But not anymore.
CodeRabbit.
CLI code reviews, code reviews in your pull requests,
code reviews in your VS Code, and more.
Teams now have a true answer to what it means to code review at scale.
Code review at the speed of AI, and CodeRabbit is right there for you.
You'll learn more at coderabbit.ai.
We'll link up their latest blog announcing their series B
and their announcement of their CLI review tool.
Again, coderabbit.ai.
Today we are joined by Deepak Singh, one of the leaders of the developer agents team at AWS working on Kiro, a very exciting and interesting new take on a tool that a lot of people have takes on right now.
Deepak, welcome to the show.
Thanks for having me.
Y'all announced Kiro, I think it was back in July, early July, to much fanfare and excitement. I even got excited, and that's hard for me these days, you know, to get a little excited about a tool. So I was happy when you all reached out and said, let's talk about it on the pod. Happy to have you. And why, when there are so many agentic coding things going on, is AWS throwing its hat into the ring?
Yeah.
I mean,
we threw our hat into the ring
a little bit before Kiro,
and actually we learned a lot doing that.
If you go back, in the AI world it's like donkey's years, but it's less than two and a half, three years ago.
It feels like another era, an epoch in AI time.
You had all these assistants that were basically fast typists.
They helped you complete sentences.
And then you went into these chat-based assistants that you could write functions with.
Like, hey, give me a function that does blah.
And, you know, it was pretty good.
But we looked around inside the company amongst our customers.
And one of the nice things about being at Amazon is we have a lot of software
developers in the company who are very opinionated about how they use software.
And so talking to them and talking to our customers, it became pretty clear that
the sort of chat-based auto-completion, give me a function, was useful, but wasn't going
to change the world.
And along the way, LLMs got better, a lot better.
They added thinking and reasoning to it.
And that's when we started building sort of what we call our next generation of agents.
And we started launching them, actually, earlier this year.
We launched something called the QCLI.
This is actually the most heavily used agent inside Amazon.
And in the process of building things like the QCLI, learning from our customers, a few things
became very clear to us.
The first one was, you were moving towards a world where code was not going to be typed
by humans.
So the "type faster, write blocks of code for me" approach was not going to be the game changer.
The game changer was going to be, I have a pull request converted into something.
I have written an issue
converted into something
that's meaningful
that I can make
part of a bigger project
or write my entire software
for me
and a human was not going to do the typing.
We already started seeing
examples of that
and it was pretty clear
that that's where the world
was starting to go.
The second thing was we found that senior engineers had a really, really hard time with the original era of assistants, where invariably they would say: I understand the code base, I understand the problem, I can type faster. Yes, I might use it for some help, like I used, you know, I'm dating myself, the Perl Cookbook to find a function.
And we were like, okay, that's not useful.
How do we get our best engineers wanting to use these tools and actually finding them useful in their day-to-day work?
As we started talking to them, the thing that became very clear was the way they wrote code or solved a problem was by basically breaking it down into smaller problems.
Like how do you take a larger problem?
what are its constituent elements
and how do they express them?
Sometimes they write them on a whiteboard.
Sometimes they write a design document.
Sometimes they write pseudocode, right?
And then they work with a set of junior engineers
and other engineers to convert that into a final product.
This is happening before AI.
So the question we asked ourselves was,
can we convert that into an AI-based system?
Can we take the way our best engineers think,
make it easier for them to do that?
And along the way, also assume that
they're actually not going to type the code, an agent is going to do it for them.
Along the way, as we were doing this, this whole idea of vibe coding, as we call it, came up,
which is this prompt-based loop where you keep prompting an agent until you get to an answer.
And in that world, also we noticed there were people who were, for the more complex problems,
going into something else, maybe Claude or ChatGPT, and creating a set of tasks and breaking it up,
and then throwing it into the coding agent.
The whole premise of Kiro was, that's just the way it works.
The code typing part is hidden.
It's there.
You can type code if you wanted to,
but the assumption is the only time you'll type code
is to tweak things that an agent may have gotten wrong.
But even changes that you want to make
will be done by talking to an agent.
So the way Kiro works is, like our senior engineers work,
you express a problem.
Kiro will convert that into a specification, or spec,
which is sort of the core piece of Kiro.
It's a spec authoring tool at its heart.
And a spec is a set of requirements in Markdown, a design document, also in Markdown, and a set of tasks that it wants to do, also written in Markdown.
And that's what you work on with the agent.
And the code part of it is just the output that the agent generates.
There's more to it.
But that was a starting point.
That's how we came to the user experience that Kiro has today.
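The three-part spec described here (requirements, a design document, and a task list, all in Markdown) can be pictured with a small sketch. This is only an illustration under assumptions: the `Spec` class, the file names `requirements.md`, `design.md`, and `tasks.md`, and the checkbox task format are hypothetical choices made for the example, not something the conversation confirms about Kiro's actual layout.

```python
from dataclasses import dataclass, field
from pathlib import Path


@dataclass
class Spec:
    """Hypothetical model of a spec: three Markdown documents for one feature."""

    name: str
    requirements: list[str] = field(default_factory=list)
    design: str = ""
    tasks: list[str] = field(default_factory=list)

    def to_markdown(self) -> dict[str, str]:
        # File names are illustrative assumptions, not a documented layout.
        reqs = "\n".join(f"{i}. {r}" for i, r in enumerate(self.requirements, 1))
        tasks = "\n".join(f"- [ ] {t}" for t in self.tasks)
        return {
            "requirements.md": f"# Requirements: {self.name}\n\n{reqs}\n",
            "design.md": f"# Design: {self.name}\n\n{self.design}\n",
            "tasks.md": f"# Tasks: {self.name}\n\n{tasks}\n",
        }

    def write(self, root: Path) -> list[Path]:
        # Write each document under <root>/<spec name>/ and return the paths.
        out_dir = root / self.name
        out_dir.mkdir(parents=True, exist_ok=True)
        paths = []
        for filename, text in self.to_markdown().items():
            path = out_dir / filename
            path.write_text(text, encoding="utf-8")
            paths.append(path)
        return paths


spec = Spec(
    name="subscribe-button",
    requirements=["As a visitor, I want a subscribe button so that I can sign up."],
    design="Add a blue button labeled Subscribe in the top-right corner.",
    tasks=["Create the button component", "Wire the click handler"],
)
docs = spec.to_markdown()
```

Calling `spec.write(Path("specs"))` would put the three files on disk, which is the kind of artifact a teammate could read a year later without the original chat history.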
It's interesting to go that deep.
I mean, even QCLI, if I've got this right, QCLI
enables completions, is that right, for hundreds of popular CLIs.
Is that what's going on?
No, that's how it started initially.
Yeah, there was an open source project called Fig that became part of AWS about two and a half years ago.
I used that as part of Oh My Zsh on my shell, even before they became part of AWS.
And they do a lot of CLI autocompletion.
In March of this year, the QCLI team put an agent inside the terminal, inside the CLI.
And you can basically start vibe coding.
That's what it is.
You start prompting it.
It's advanced quite a bit now.
For example, you can start a conversation and if you want to experiment with something,
you can actually create a branch of your conversation.
You can go into a tangent.
Quite literally, it's called tangent.
Have a conversation.
If you don't like that path, you can come back to your checkpoint.
Or you can now go down a different branch and merge.
Make that main later, effectively.
So that's been very successful, but it's all an interactive conversation, right?
So the problem is what happens on day 20 when, you know, you've forgotten what you did.
You have to go look at a conversation history.
And what happens when you give that code to somebody else, they have no context of what
was in your head and no artifact that they can look at in a team.
And, you know, we work in teams.
A lot of engineers work in teams.
So in many ways, what we were trying to do with the Kiro spec UX was, how do we capture that vibe coding magic while making the code more robust over a period of time, where it's still useful a year later and people have the ability to inspect the specification and modify it if they want to make any changes. The two teams work pretty closely together, so we learn a lot from each other. So that's kind of fun.
I think perhaps why this idea, at least, if not the implementation, resonated with me, and I think with a lot of people: well, first of all, it's AWS, so you guys have a loud megaphone. So when you announce something, people pay attention, and so that got some of the attention. But also, it's just a new take on something that we're all working on, which is this agentic coding thing. And it's kind of jumping ahead, I think, in the process that we're all kind of getting to, maybe in our own ways, of realizing that chat isn't it. Chat's cool and all, but you just can't build serious software in a chat dialogue. You can do stuff and you can make progress and you can experiment, but there's a certain point where you're like, yeah, I don't want to
live my entire life this way.
I'm not going to be...
Yeah, I'm just not going to be productive.
And you guys formalizing...
Yeah, formalizing the idea of what I think you're calling spec-driven development,
or you can give me other taglines about like this concept that is...
That's the right tagline.
Okay, perfect.
Like, even just that to me, I was like, yeah, that makes some sense.
Like, that's something worth trying.
And Adam can tell you about Agent Flow.
I'm sure he will.
He's kind of developing his own spec-driven development style inside of Claude Code or
with Claude Code as he builds things.
And
Kiro seems like an attempt to, like,
what you're using with Claude Code, right?
Well,
Amp and Augment.
And Amp.
Yeah, it's agent.
So lots of things.
Yeah.
Okay.
Well said.
Regardless of the agent,
he's doing that.
And I think that what Kiro attempts to do,
I guess,
is to formalize that into the entire experience.
Is that fair?
Yeah.
That is fair.
And it'll evolve.
I think we are learning how to build software.
in a world where
the engineer's role is
the role of an engineer is going from
being the, I'll say
typist, I don't mean it quite like that,
but they're the ones who are, their fingers on
the keyboard, you know,
they're converting that, converting what they have
in their head into a language, which is usually,
you know, pick your language of choice
here inside Amazon,
AWS, if you're building a back-end system,
it's probably going to be Rust.
And how, and that's what they excel
at. Now,
you're acting a little bit differently, you're a driver.
You're driving the behavior of an agent.
It's about how you give the agent the right context.
Structured thinking with specs is one way.
There are other components to it, which I suspect also come up in your Agent Flow thing,
which is, how do you provide access to the right tools?
How do you provide access to the right steering files that give the agent the right
context to be more effective?
And this combination: inside a Kiro project, the core
components are the spec, the tools, which are usually MCP servers, the steering files,
which are essentially, again, Markdown files, JSON files, that tell an agent, this is
what I want you to do, how I want you to behave, give it a persona as well. And downstream of
that, you have this idea of hooks, which are these automations that happen. Like, the moment
you check in code, it's watching for events. And it could be the moment you save a file,
please run an optimizer for me
because there may be a better way of optimizing the code
or send somebody an email saying,
I'm done checking in code.
You may want to look at your merge request or whatever.
That combination is the Kiro UX,
but the spec workflow is the heart of that.
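The hooks idea, automations that watch for events such as a file save or a check-in and then trigger an action, can be sketched as a tiny event registry. Everything here is an assumption made for illustration: the `HookRegistry` class, the event names `file_saved` and `code_checked_in`, and the string "actions" are hypothetical and are not Kiro's hook format or API.

```python
from collections import defaultdict
from typing import Callable

# A hook takes the path that triggered the event and returns a description
# of the action it would ask an agent (or another system) to perform.
Hook = Callable[[str], str]


class HookRegistry:
    """Hypothetical registry mapping event names to automations."""

    def __init__(self) -> None:
        self._hooks: dict[str, list[Hook]] = defaultdict(list)

    def on(self, event: str) -> Callable[[Hook], Hook]:
        # Decorator that registers a function for the given event name.
        def register(hook: Hook) -> Hook:
            self._hooks[event].append(hook)
            return hook

        return register

    def fire(self, event: str, path: str) -> list[str]:
        # Run every hook registered for this event, in registration order.
        return [hook(path) for hook in self._hooks[event]]


hooks = HookRegistry()


@hooks.on("file_saved")
def suggest_optimizations(path: str) -> str:
    return f"ask agent to look for optimizations in {path}"


@hooks.on("code_checked_in")
def notify_reviewer(path: str) -> str:
    return f"notify reviewer that {path} is ready for review"


results = hooks.fire("file_saved", "app/main.py")
```

In this sketch a team could share a module of registered hooks the same way it shares lint rules, which is roughly the "library of hooks" idea that comes up later in the conversation.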
So what does Kiro look like?
How does it work?
What is it today?
Where is it heading, etc.
But first lay the field out so we can all see what it looks like.
For those of you who haven't seen Kiro,
it lives on its own website called kiro.com.
The best place to find anything about Kiro is to go there.
It has, you know, you can download it.
You can read the documentation, but my recommendation is get it.
Right now we have a wait list.
So one of the, you know, we talked about the Kiro launch.
It got really popular, really fast.
We got over 100,000 developers signing up in the first few days.
And we had to put a wait list on, which we've mostly cleared
the original wait list which got very, very large,
but it's still there.
But the assumption is, if you sign up for the waitlist now,
you'll be out of the waitlist very quickly.
You're not going to wait for a month or so.
But once you download it, you'll recognize a few things.
One, I think the most obvious one, is that it is a Code OSS-based editor.
So if you have VS Code profiles and preferences already set up,
it'll import them for you.
So if you had any plugins, et cetera,
running, most of them will run pretty successfully.
With some, there are clashes, and we'll warn you: please do not use this plugin with Kiro, for whatever reason.
And, you know, we use what is called Open VSX, which is the open source
plugin ecosystem for VS Code.
So that's where you can go and get most of your plugins.
That's the obvious thing.
The second thing that will jump out at you is when you open a project, instead of getting
your code windows, et cetera, the default is you'll get a question.
do you want to just talk to an agent or do you want to start building code using spec workflows?
And you can actually just start talking and Kiro will detect whether you need to do a more structured development or just, you know, sometimes vibing is fun.
Sometimes you just have questions to ask, ask the agent.
But you'll notice very quickly that the emphasis in Kiro is not on the code editor part, the code editing part, the syntax highlighting of code.
That stuff is all hidden.
The main window that's open, and, you know, the way you can set it,
is all around the spec authoring experience
where you're creating a specification
and you're working with an agent to create the spec.
So that's the first thing you'll notice about Kiro.
And you can use it on your favorite laptop,
all the operating systems that you care about.
And it's pretty easy to get going with,
once your brain adjusts to the fact
that what you have in front of you is not code.
Actually, as I like to joke,
Kiro is more of a Markdown editor than a code editor,
because most of your interactions are going to be natural language
with Markdown, and what you see is Markdown.
The first thing that will come out is a set of requirements
that are written in EARS format.
For those of you who have worked with user requirements, et cetera,
before, EARS is a format that I think came out of Atlassian
to represent user stories and requirements, all of them in Markdown.
You can throw text at it.
You can throw pseudocode at it. It's multimodal. You can throw your classic "I drew my
startup idea on a napkin" at it. And Kiro will convert that into a specification.
Once you like the requirements, you can take a look at them and say, yeah, like them. If you
don't like them, you can type into the chat window that you want to change them. Or you can
directly edit the Markdown, which we rarely see people do. The next thing it does is actually
come up with the design. Like an engineer, it'll come up with, here's my design
proposal, and you get to decide if you like it. And if you're okay with it, then you say, okay,
I'm good, and it'll create a set of tasks. It could have one task or 20 tasks, where a task could be,
I am going to create, you know, a button in the right corner which is blue in color and
says Subscribe. That could be one. Other tasks will be written, and within each there will be subtasks,
which are the actual things it'll do. You can go task by task, because some of those tasks are optional,
or you can say, implement it all.
It'll go ahead and do it.
If you go into the file browser,
you'll see all the files getting created.
You can inspect them.
Sometimes it'll come back to you for help,
even if you're in autopilot mode,
where it needs a user permission
because your file system requires it, for example.
And that's how your projects are created.
Like, that workflow is around that specification.
And if you want to go back and change the code,
the most recommended way
is actually to change the specification
and not go in and manually edit the code.
Like, no, I actually don't want this to be blue.
I want this to be green.
Or I didn't like what you did here.
Change your implementation a little bit.
And you can also do it in real time.
If you see it's going in a wrong direction,
because you get the chain of thought in front of you,
you can essentially hit Control-C and go, nope,
I want you to go in a different direction.
So that's kind of how it works.
I'm simplifying greatly, but that's kind of how it works.
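To make the EARS requirements format more concrete, here is a minimal sketch that renders requirements using two of the commonly cited EARS (Easy Approach to Requirements Syntax) sentence templates: the ubiquitous form "The <system> shall <response>" and the event-driven form "When <trigger>, the <system> shall <response>", plus the optional "While <state>" prefix. The `ears` helper and the example requirement wording are hypothetical; they are not output from Kiro.

```python
def ears(system: str, response: str, *, when: str = "", while_: str = "") -> str:
    """Render one requirement sentence using EARS-style templates."""
    parts = []
    if while_:
        # State-driven prefix: "While <state>, ..."
        parts.append(f"While {while_},")
    if when:
        # Event-driven prefix; lower-cased if it follows a "While" clause.
        parts.append(f"when {when}," if parts else f"When {when},")
    subject = f"the {system}" if parts else f"The {system}"
    parts.append(f"{subject} shall {response}.")
    return " ".join(parts)


requirements = [
    ears("page", "display a blue Subscribe button in the top-right corner"),
    ears("system", "open the sign-up form", when="the visitor clicks Subscribe"),
]

# Numbered Markdown list, the way a requirements document might present them.
markdown = "\n".join(f"{i}. {r}" for i, r in enumerate(requirements, 1))
```

The point of the fixed sentence shapes is that both a person and an agent can tell what triggers a behavior and what the system must do in response, which makes a later change such as "green, not blue" a one-line edit to the requirement rather than to the code.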
What if AI agents could work together just like developers do?
That's exactly what AGNTCY is making possible.
Spelled A-G-N-T-C-Y, AGNTCY is now an open-source collective under the Linux Foundation, building the Internet of Agents.
This is a global collaboration layer where AI
agents can discover each other, connect, and execute multi-agent workflows across any framework.
Everything engineers need to build and deploy multi-agent software is now available to anyone
building on AGNTCY, including trusted identity and access management, open standards for
agent discovery, agent-to-agent communication protocols, and modular pieces you can remix
for scalable systems. This is a true collaboration from Cisco, Dell, Google Cloud,
Red Hat, Oracle, and more than 75 other companies all contributing to the next-gen AI stack.
The code, the specs, the services, they're dropping, no strings attached.
Visit agntcy.org, that's A-G-N-T-C-Y dot org, to learn more and get involved.
Again, that's AGNTCY, A-G-N-T-C-Y dot org.
You know, it sounds super weird to say this, but I really
resonate with not editing the code yourself.
And the reason why is because the LLM,
the agent has this context that is beyond my context.
And it kind of has this plan.
So when I start like telling it what to do,
I kind of like derail its plan.
And that to me is jarring at first because I'm like,
no, no, no.
I still want to be in charge here.
I want to have agency.
Not to be too punny,
but I find that like even a small change,
I really don't want to do because I want to tell it what to do.
And so Jared mentioned Agent Flow.
This is just a term I coined.
It really just, I think it sounds cool for one.
I think it's the coolest term of all.
So if everybody wants to use Agent Flow, you know, back off.
I'm just kidding.
I like it.
It's cool.
But mine is not, my flow is not as sophisticated as yours is in terms of like
MCP servers and tools yet.
It's really just how I think in my brain about the problem I'm trying to solve.
I develop some version of a spec.
I've been calling them just documents,
so I've been calling it document development.
They are documents.
They are documents.
And it's nothing against the word spec,
but generally
in our industry specs have been
a version of a pejorative.
It's just kind of like,
ooh, you want me to do that on spec?
I just personally didn't jive with that term.
Cool with you all using it.
It's just my personal preference.
But it was not as sophisticated as yours.
That being said, though, as I'm playing with Kiro,
you know, you're generating
this initial document, the spec, which I get, and I've written those specs before. I was on
an agile team. I was a product manager, led teams. And I would speak in these terms, like, as a user,
I want X, et cetera, that whole format. And I get that. And then the design document that comes
from that is even cooler. Like, that's where I think my brain gravitates most. So that's kind of
what I've been generating that version of a design document, not so much that initial spec that
creates this design document. Yeah.
But not wanting to touch the code I thought was really strange at first until I started to get into the agent's way with what I told it to do via this document-driven, spec-driven agent flow.
And I just became the architect of the idea and the researcher and the viber.
I'd like to call it riffing, really, because I would like ask questions about different things, poke holes in the CLI or the API or how does this truly work.
Did you research that?
Is that really based on facts and truth?
Here's the truth here.
Or here's an example of how I've hit a pitfall there.
So avoid that.
And then we come up with this thing called a spec or a document or whatever.
And it's like, okay, great.
This thing represents as best as I can think of the idea.
And then usually the last thing I do is I say, where are the blind spots at with this?
Okay, we've defined this.
I think I've got good eyeballs on the spec or the design.
Where are the blind spots here?
And we'll go back and forth on what those are.
and we'll update that thing.
And then when you get into that flow and you say, okay, go.
And I want to be autopilot, right?
Your button is autopilot.
And I kind of like that too.
It's like auto edit, sure, go ahead.
Just make the world.
I don't want to really get in that thing's way, you know.
I wanted to call it YOLO mode, but.
Yeah.
It's kind of, yeah, YOLO mode, actually.
I mean, I would almost advocate for that,
because that kind of makes the tool fun.
You know, what we don't want to do in
this process of creating software is remove the fun of it and make it this tedious task of just
like writing specs and writing documents.
We can't lose that fun aspect of it.
So I rambled a bit, but I totally vibe with the idea of not touching the code, not
because I can't change one single value, but because I kind of get in its way. It has this
context window and I really don't want to disturb it, because I've done my job, which is to be the
architect and the visionary, the idea. And then we've blessed this thing called a document or this
spec. Yeah. And let's just go. And if I get in its way, I kind of derail it. It can,
it can get back in motion again. So it's not like a total derail. But sometimes you can really jack
it up and you're on a tangent, or what I call a side quest.
Yeah. No, so I think you're completely
right. There's two reasons for this also. There's a technical reason, which is context is king. Your
results are only going to be as good as your context.
And especially if you've been in a long conversation or a long session,
going in and manually changing code actually kills the context.
And you get out of sync.
And getting back into sync is just really hard.
In fact, I mean, even if you're not using structured work,
but just documents, giving it instructions,
the moment you go start touching the code, you've lost that context completely.
So that's one part of it.
I also feel, at Amazon, we have this leadership tenet for our principal engineers called
illuminate and clarify, where our senior engineers are very good at illuminating and clarifying
a problem. They usually do it with their team or group of junior engineers. I think of what
you just described as illuminating and clarifying to an agent, to an AI agent. And I think the
metaphor stands, and the better we are at doing that,
because I think it's more art than science right now,
a lot of folks are learning how to work with these agents.
You've developed your own workflow, I like the name, agent flow.
I've seen other engineers have their own flow, so to speak,
and I think that's something that we, as the Kiro team, will have to address over time.
Today there's a particular flow that Kiro uses.
You can do some stuff around it,
but you can't go and build your own,
completely randomly build your own flow.
I think we'll learn quickly from how people want to build these,
because we want to help developers stay in the flow that they like.
That's a big part of what we want to accomplish.
Just keep it fun.
So there's a couple of decisions that you guys made.
Probably had to make it pretty early in the process,
and I'm interested in the whys behind those decisions.
And one of those decisions was it's going to be an editor.
It's going to be VS Code.
Is it a fork, I assume?
Or is it a...
It's based on Code OSS, which is the open
source project underneath VS Code.
Gotcha.
So it's not exactly a fork of VS Code,
but it's based on the same foundation as VS Code.
It's VS Code-esque.
Not really interested in that decision.
I think once you decide we're going to build an editor,
then that's like a really good decision to just be like,
let's build off of that.
But a lot of people are going to the terminal.
And I have found as an engineer that I've enjoyed using terminal-based agentic tools
more than I've enjoyed using the exact same tools
in my editor, whether it is VS Code, whether it is
Zed. Those are the two that I've been using things in. And so I'm curious, first
of all, about the choice of, like, where does the agent live? And your guys' answer is,
it lives inside your editor. Can you speak to that?
Yeah. So actually, what's interesting is that
we have both: the agent living inside Kiro, and one of the nice things
is we have the terminal-based agent as well with the QCLI.
You know, when we say 80% of builders inside Amazon are developing code with AI tools on a very regular basis,
a very large number of them are using QCLI.
Because, as I said, it predates Kiro. There's a couple of reasons we did that.
There's a class of people who just love working with what you would call an IDE.
And I think the IDE as we know it, which is a code editor, is going to go away,
because as we just discussed, that's the less important part.
It's more of a visualization system.
If you look at people doing terminal editors,
they will use an IDE or some text editor at some point,
because they need to visualize, they need to do diffs,
et cetera.
There are some things that we found,
even with QCLI, which are very, very hard.
If you want to build a spec and have a spec workflow to it,
that's a much richer interface,
and it's very difficult to put that interface into a terminal.
We think there will be a lot of folks, especially power users who will use terminal-like systems
because they give you that freedom and flexibility.
Or, you know, it's the classic vi versus Emacs war at some level.
That's just the flow that they like.
But if you're interacting with an agent, it's not a bad system to use.
We also felt that there was a very large number of people who wanted that richer user interface.
And the richer user interface allowed us to try a few things.
One was the whole spec workflow thing. You can try building it
inside a terminal; it's going to look clunky and ugly, and it won't work.
That was one, and we had strong belief in that.
But also, the whole thing of helping people author an MCP server, helping them author a set of steering files,
the whole library of hooks that Kiro has, is much, much easier.
The way you would do hooks in a terminal was, you would probably have a slash hooks
button, and you'd get a list of the hooks that you wanted.
But if you had 500 hooks, you'd have to scroll through all of them to find.
them. In a visual interface, that's much easier. You can have a library of hooks that only
your team cares about, and over time you may have a library of hooks that, in fact, your company
may publish, and it's like, this is the code review agent that you're going to use. Here's
an MCP server that goes with it. Here's a steering file that goes with it. All that context
is shared. So there's things you can do from a user experience perspective that are just hard
in a terminal. If all you're doing is just driving an agent with prompts, terminals are great.
I don't think the IDE gives you any advantage.
But within an editor, if you are working on these rich things,
you want to visualize diagrams, work with multimodal inputs,
have these other affordances that you want to make part of your context.
It's a more powerful, more visual interface to do it.
You're not just prompting at that point of time.
You're doing a lot more than that.
But I agree.
If you're just prompting an agent, CLIs are amazing.
Okay, good thoughts.
The second decision that you all made,
at least I'm assuming you made this.
was we are going to be,
I guess, not model agnostic,
but we're going to pick a model that you're going to use
and then you can change that model.
There's different angles of this.
Like obviously, some people have their own model
they want you to use, right?
Like Anthropic's, you know,
the model behind the agent is like a huge part of the system, right?
And so, uh, Kiro as I download it,
it selected Claude Sonnet 4.
And there's a drop-down selector.
For me, I can't pick any other option.
I'm assuming I could
configure that to pick different options.
But just your thoughts around model selection.
I mean, it's a big part of it because at the end of the day,
you are wrapping a model that you have to decide what to do with that.
This is a bit of a philosophical debate
we have among ourselves as well.
So the reason the drop-down is there: there was a time that Kiro allowed you to pick
between Sonnet 3.7 and Sonnet 4, because sometimes you ran into availability issues
with Sonnet 4.
There was just not enough capacity.
So you could choose Sonnet 3.7 and do your work there.
But that problem went away, and nobody cares about Sonnet 3.7 when Sonnet 4 is available.
Right.
It always picks Sonnet 4, so we took that out.
So that's not going to say GPT-5 next to this ever.
No, but if you download it now, you will get two options.
The default model, I won't say a model, the default agent in Kiro is something that we call Auto,
and I'll go into this in a second.
I'll actually go back to Sonnet 4 as well.
What you're using in Kiro is not Sonnet 4.
You're using an agent that uses Sonnet 4
for the core thinking part of its work.
There's a set of scaffolding that goes around it,
everything from your system prompts
to the way you handle caching,
to the way you are handling any guardrails that might be in the picture.
Sonnet 4, let's take two editors.
They both say Sonnet 4.
They're not equivalent.
It's how you drive context into the LLM.
And the scaffolding you have around it that makes it useful or less useful or more.
So in some ways, part of the, you could argue that saying Sonnet 4 is also almost a comfort level for people saying,
I've got Sonnet 4 inside it, because it is an agent that's doing the work.
It happens to be an agent that is built on Sonnet 4.
There could be pieces of it.
And actually, that's true for many agents out there, where for saying hello, I don't actually want to use Sonnet 4.
That's a very expensive hello, you know, as an example.
Yeah. Well, I was just thinking about that because, I didn't want to call it a wrapper because that seems like that's just belittling the amount of effort that you all have done. But basically in wrapping a model like that, you're making a pretty large bet on the provider of that. And you're, you know, choosing them versus being like, well, you can use Gemini. You can use ChatGPT. You can use Anthropic.
The good news is because you have a drop-down, if we want to go in that direction, that's fine. Now, where we
have decided to go, at least for now, is an agent called Auto.
Auto is using multiple agents underneath the hood.
We can choose whichever largest frontier model we want to choose.
Right now, we tell people, it's there in the documentation, that one of the models that
we use is Sonnet 4, because that's a model that we know well.
We understand well.
All our agents use Sonnet 4. Q CLI also uses Sonnet 4.
We have all the scaffolding in place.
We understand how it works.
but we are starting to want more.
There's a number of ideas we have
around how we handle the model based on the tasks you have,
including maybe some specialized models for certain things.
So Auto is an agent. It's not a model.
It uses multiple models underneath the hood,
one of which happens to be Sonnet 4.
And over time we may have different categories.
Sonnet 4 sets the bar in some ways for
how good the results need to be.
You know, the way we think about it is
our quality of results with Auto
has to be at least as good as the agent
that uses Sonnet 4, the one that you've been using,
that's on Sonnet 4.
But we want to do it at a fraction of the cost
and hopefully better performance.
That's a mental model.
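The routing idea described here (a cheap model for a "hello", a frontier model for real work) can be sketched roughly as below. This is only an illustration of the mental model, not how Auto is implemented: the model names, the relative costs, and the `pick_model` helper are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str
    relative_cost: float  # hypothetical cost per request, relative to the frontier model


# Placeholder tiers; these names are assumptions, not Kiro's actual model lineup.
SMALL = Model("small-model", 0.1)
FRONTIER = Model("frontier-model", 1.0)

# Prompts that do not need a frontier model (the "very expensive hello" case).
TRIVIAL_PROMPTS = {"hello", "hi", "thanks", "thank you"}


def pick_model(prompt: str) -> Model:
    """Send trivial chatter to the cheap model and everything else to the frontier model."""
    normalized = prompt.strip().lower().rstrip("!.?")
    if normalized in TRIVIAL_PROMPTS:
        return SMALL
    return FRONTIER
```

A real router would presumably classify the task rather than match strings; the point is only that the agent, not the user, decides which model handles each request.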
But again, we haven't taken one way doors
because I think this is a philosophical debate
within the community, within our team
on what's the right thing to do
because the funny part is
if you give people five agents,
chances are half of them, most of them, will just pick the one that they think is the best anyway.
So that's where we are right now.
Where the future goes, we'll see.
Choosing models as a, I think there's a cost perspective there, right?
And if you remove the cost perspective to the individual user, so I guess the context here really is like,
we're drastically different types of developers in terms of like, I'm a solo.
I'm not doing as a team, even though Jared and I are both here.
I'm not working on, like, Changelog web proper stuff.
It's like scratching my own itch, CLI type things, just really learning and exploring.
And so a lot of my context and even my preferences are largely based around a team of one.
Whereas Kiro is really a team of many.
And so the choices you made were really around teams and context sharing and things like that.
And in a world like that, you usually have some version of an enterprise cutting the check.
and there's less concern about, you know,
ultimately the cost of an agent
and how it may impact you.
Obviously, you've done your cost analysis,
but that's largely similar to us usually.
Yes and no,
because I think if you go directly,
like that becomes,
if you go directly with a cheaper model,
the challenge is people will do that,
but then we'll get very frustrated
with the quality of the output.
What I care about is the quality of the output.
And that works for individual developers,
that works for, you know,
your small startup that's funded by Y Combinator, it works for the enterprise.
I actually think in some ways enterprise developers work under more constraints on how
creative they want to get than developers in other places.
So the way Kiro thinks about it is the customer that we want to really make happy is an
individual developer.
That individual developer could be a soloproner, somebody sitting at home and scratching
a niche, somebody sitting as part of a small startup, maybe the technical
person in a two-person startup, for example.
They could be a developer at,
you know, at a startup that
we all know, one of these
series D unicorn startups, or it could be
somebody at Delta,
to pick a big
developer customer. All
of that is possible, which is, and I
think the reason, so there's a few things that we did in
Kiro that point to that. One is
that the default login system in
Kiro is social. You either
log in with your Gmail
or you log in with your GitHub login.
It's not you need an enterprise system.
That's a big part.
Again, going back to decisions that we made, a very conscious one.
It's actually one of the few, if not only, products that have been built by an AWS team,
that you can sign up with Gmail and have the ability to pay.
You don't need an AWS account.
In fact, you don't need anything.
You need no AWS account.
You don't need to worry about AWS permissions. Zero.
There's no requirement at all.
If you happen to be an AWS customer, there's ways you can get signed up.
So that was the big part of it.
What we did decide to do was assume that people want to write code that's going to do.
A big part of the idea was, yes, there's a lot of fun projects out there that people write, you know, I'll call them toy projects.
Again, I'm being sort of general.
Projects that you do to learn, et cetera, and you can do them with Kiro.
You can vibe code with Kiro just like with any other agent.
But our assumption was people want to take code and actually do something meaningful with it and grow it over time.
and evolve it over time.
So that was very much a part of it, and it is also true.
A lot of people do that as part of a team.
So we wanted to keep that in mind.
The team could be two people.
It could be a two-pizza Amazon team.
But the fundamental assumption was that the
person signing up is an individual somewhere,
whether they are at home or in a company.
And I think finding that balance between the fun part,
I forget which one of you mentioned, you have to,
like development is fun.
You can't take it away, especially developing with AI agents.
You don't want to add, make it like a boring, put in a bunch of checks and balances
because you assume it's going to be in the enterprise.
I think that's kind of where we are right now.
And we'll see as it grows and where people push us, which direction we end up in.
But there's a lot of white space right now in this whole area.
We're all just vibing our way there.
We're getting there eventually.
We'll get there eventually.
How many, at AWS,
are there two-pizza teams shipping code with Kiro?
I mean, I know you guys have a big wait list,
but you have a great internal team
that you can just test stuff out on and say,
hey, use this sucker.
Well, the best example is the Kiro team.
The Kiro team's just been using Kiro from the day Kiro started.
Really?
So they built Kiro in Kiro.
I don't have any stats on it,
but the one use case we mentioned is
this one engineer who wanted to ship a feature,
a notification feature in Kiro,
and it did it over a day
because he wrote a spec,
a spec for it, and then somebody else,
and then he wrote, the agent wrote the code and maybe somebody else shipped it.
But in general, part of one of the tenets we had in Kiro,
and I have it for a lot of my teams, is you can only build,
you want to build it almost for yourself.
Like, developers work best when they're building things that make them happy developers.
And you're a development team, build something that makes you happy.
Then, of course, you give it to a bunch of customers and see whether their happiness
and your happiness happens the same way
or not. Developers are picky people.
And that's when you start making the choices.
The spec editor in Kiro actually evolved significantly
because the way the team was using it worked for them,
but as we gave it to other people,
it didn't work quite as well.
And we almost rewrote it.
I mean, we did rewrite it thrice, I think,
in the user experience, to get it to a point
where the majority of our customers really liked the experience,
or most of the people in our, you know, beta cohort.
And so that's one.
I'll give a more general one, which is, you know, we are at a point inside the company,
which actually helps us a lot, learn behavior.
It's that most developers now, 80%, as I said earlier, are using AI systems to build code,
where we have now got significant projects, which are significant complexity,
which are 80, 90% AI, you know, written using AI.
So we are moving quickly in that direction.
That's really cool.
I would love to see some of your guys's specs
like from the Kiro product team
like not open source the whole thing or whatever
but like here is a large production code base
that was built with spec driven development
and here is a bunch of specs that you can look at
to see what we've ended up with
as we've built out this code base.
That's a really good idea.
We've also, I mean I've been
some of the folks on the team are toying around
with the idea of can you have a spec sharing
just a place for people
to share specifications
which include not just the spec files
but also steering files, etc.,
because they all go together
like in the end
I think this is what Adam was saying
context is not as
one dimensional as people like to say
it is. It includes a spec
it includes a bunch of other things
right and your prompts
your steering files. Is there a packaged
way of sharing all of, like,
you have AGENTS.md now, right? It's a very
powerful sharing
mechanism because it's a standard format. So I think that's kind of where I think there's going
to be interesting evolution even in that space. And as you said, I think the community is
sort of converging into the idea of specs. It's whether they are doing it deliberately or sort
of everyone's heading there. So we're very excited to be part of that because we've had this
religion for a while. So inside of Kiro for our listener, the left-hand side, when you have the
Kiro panel open. There's
one, two, three. There's four
subsections: specs,
agent hooks, agent steering, and
MCP servers. So, Deepak,
how many of those produce
artifacts that end up
checked into your source control? Like, how many
of those are like trackable things?
Yeah, I mean, there are text files.
You can check all of them in.
All of them are. Sitting on your file system. Yeah.
Everything in Kiro is a text file
on your, I laugh because
for me, they're on my right side. The way my brain works, I have
to move all of those to the right.
But it's different.
By default, I think it's out on the left.
I just popped it up.
I move everything to the right.
I like having them there.
There's a reason workspaces are configurable, because you're all picky.
And our own way is always the best way, isn't it?
Like, my way is better.
Yeah, you can have yours, but mine's the best.
Yeah.
But they're all Markdown files in the end, or JSON files.
They can all be checked in.
And that's the beauty, right?
that, you know, in the end, Git, Git is your source of truth.
It's so wild how we have come back to Markdown files and JSON files, you know.
I mean, honestly, because there's, you can have, you know, data serialized into JSON.
That's, you can read it as a, as a human.
It's not that fun.
You can still do it.
Yeah.
But, you know, you can still build an interface out of that through a script.
It says, you know, make this, turn this data into human readable.
or Markdown it, essentially.
And those two different document types to me are key.
One thing I'm not seeing in this and tell me if this is something that's on your roadmap
is I feel like you've got the initial spec, you've got the design,
and you've got some version of a task list that takes you to an implementation.
Feature is, in quotes, done, right?
It hasn't been QA'd, it hasn't been deployed.
It may spike an incident.
there may be a bug and something that I've really,
which I really feel like is what I'm calling the agent flow part of it,
is it's not just this initial spec and the feature that gets written.
It's the bugs that may come after it.
I started to do things called builder logs.
They're blogs, but I've enjoyed reading what the agent wrote about the feature we delivered
because it helps me learn.
So I'm using it as a means to, like, be a better illuminator and clarifier, essentially.
You know, how can I better understand the system I'm defining?
And the builder logs are part of that, incidents are part of that, bugs are part of that, you know, knowledge base articles.
What do we learn as a result of this exploration, or even in this little riff or this vibe, not even writing software, but what do we learn from, you know, poking holes in the API or the CLI or what really is or is not there, what really is or is not possible?
When do we have to go from API to SSH because the API lacks an interface to certain parts of the system?
then we actually have to go in via SSH and sniff around or whatever.
So all these little weird things that happen as you build software are largely part of, like, bugs, incidents, knowledge base articles, builder logs.
Do you have plans for the full ecosystem beyond this?
I'm glad you asked this question.
I will say the way we are thinking about it right now is where hooks come in.
Because I don't think we know exactly what's the right way to do it.
Like, Kiro will summarize what it's done for you.
It's like, here's all I did, right?
You can take a look at all the actions that it took.
That you can go find.
Our current mechanism for doing that is the hooks part.
Hooks is a great way for people to hook into, no pun intended, into Kiro.
The hooks can come from you.
They can also come from a third party, by the way.
It's a great integration point because, you know, it's easy for somebody else to write a hook.
So, for example, you want to take code and document it properly, create what's the,
you know, let's say you're writing Java code,
you wanted Javadocs created or something like that.
Like, I'm not a Java guy, so I have no idea how that works.
But you wanted to convert that into a really nice documentation
that you can upload onto a website, maybe even publish it.
You could create a downstream hook that is essentially a documentation agent
or another agent that takes a look at it and converts it into documentation.
Maybe somebody else has written it.
There's another startup or some open source project that does it.
You can plug it into Kiro as a hook that will do it automatically.
you still don't have to drive it.
The good news is that the hook is using the same context as the rest of the project.
So the context gets preserved.
So that's one example.
On the bug side, the thing that I would love, like one of the things I would love to do,
actually, as learnings is can you convert your learnings into a steering file that can then
be used as a feedback loop?
The good news with agents is they're really good at reflection.
You give them error logs and feedback.
They'll get better and better over time.
How can we do more there?
It's not something that's built into Kiro,
but you can imagine that that's something that ends up happening.
And I think what we learn over time is,
what are the kind of hooks that everybody does?
And that makes, to me, as a product person,
is like, that's a capability that we should probably just take care of within Kiro.
But over time, you want people to be able to extend it in the ways that they like
and learn from that.
I think there's, if you look at all the successful things out there that people enjoy using,
I'm a heavy Obsidian user, for example.
The whole Obsidian plugin, you know, extension community is a big, rich community that's out there.
So, but there's a set of core plugins.
And what's the core versus what comes from the community is something that we learn over time with hooks.
It's early right now.
It is early.
So the hooks essentially is your way to allow yourselves to operate within a certain known user experience,
a certain, you know, clarified way.
that you expect Kiro to be used, but it allows folks to extend it beyond and add to it
that you would not maybe otherwise want to prescribe because it's white space, as you mentioned
before. There's a lot of a lot of vibing happening. I think we're all learning too as we do
this vibe code type stuff and allowing the agent to write all of it versus stepping in even
if we have the technical abilities to do so. How far, how mature is that? Like how is there anything
out there in the hooks part of it?
The mechanism is fairly mature.
We are talking to, there's a bunch of hooks that have been written by our beta customers
and early users for various things.
One of my favorite hooks that's out there, like the classic hook that's actually
part of the sample hooks in Kiro is a hook that translates dictionaries.
So if you change your English language dictionary, it'll go and update every other dictionary
used for localization.
It's a localization hook.
You know, localize my code, basically.
It's the kind of thing that people hate doing, but you can do it automatically,
and if you make it a hook, every time you check in code, it'll automatically go and update all your dictionaries for you.
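As a rough sketch of what such a localization hook has to work out: given the old and new English dictionaries, find the keys that are new or changed, then refresh only those entries in each other locale. The function names and the `translate` callback are hypothetical stand-ins for whatever the agent actually does, and this is not the sample hook that ships with Kiro.

```python
from typing import Callable


def keys_to_retranslate(old_en: dict[str, str], new_en: dict[str, str]) -> set[str]:
    """Keys that were added, or whose English text changed."""
    return {key for key, text in new_en.items() if old_en.get(key) != text}


def sync_locale(
    locale: dict[str, str],
    new_en: dict[str, str],
    stale: set[str],
    translate: Callable[[str], str],  # hypothetical: in practice the agent translates
) -> dict[str, str]:
    """Rebuild one locale: keep untouched translations, redo stale or missing ones,
    and drop keys that no longer exist in the English dictionary."""
    return {
        key: translate(text) if key in stale or key not in locale else locale[key]
        for key, text in new_en.items()
    }


old_en = {"greeting": "Hello", "bye": "Goodbye"}
new_en = {"greeting": "Hello there", "bye": "Goodbye", "thanks": "Thanks"}
fr = {"greeting": "Bonjour", "bye": "Au revoir"}

stale = keys_to_retranslate(old_en, new_en)
updated_fr = sync_locale(fr, new_en, stale, translate=lambda text: f"[fr] {text}")
```

Here `updated_fr` keeps "Au revoir" untouched and marks the two stale entries for retranslation.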
I know somebody else was writing a game in a JavaScript framework that they were not very familiar with.
I think they were writing in Vue.
They didn't know Vue very well.
So for their Vue code, they didn't know if the agent was perfect.
They had an optimizer that ran every time they checked in code.
There's a lot of things that you can do.
I've seen people hook it up into code
review tools. So I think that will happen. And I think part of what we need to figure out is
what's the right mechanism to make it scalable, to use that term. The bug thing is kind of
interesting. I'm going to go outside Kiro. In CloudWatch, we have, and by the way, you can
hook that into Kiro quite easily. We have an agent, which we call incident response, that every
time an alarm gets triggered, it starts creating hypotheses on where your, where in your system
you have an issue that the alarm may have been triggered,
you can probably convert your root cause into a spec.
I'm speculating.
It's probably not that hard to take,
here's my root cause, here's where the code,
you know, this is why the issue happened.
You can create an issue, that's what people do.
They'll file a ticket to do it.
But you can even create a spec that says this is the right way,
you know, this is the problem that we need to go solve,
feed it back into something like Kiro.
And with hooks, you should be able to automate something like that,
is my guess.
So I haven't tried.
I don't know if anybody else has.
But I think these kinds of workflows will become more common.
What's interesting about this, hooks, if I understand it correctly,
is there's things that, you know, as you're doing something with, you know, with the agent,
these hooks are sort of like side things you want done, like a formatter or, as you mentioned,
a localization thing.
Maybe you're even running a build.
And if you're in, in the case of, say, Claude Code, I'd have to open a new tab
and instantiate a new version of Claude with brand new context to not be blocked.
So I may do something that a hook may do, but I'm doing it because there's no concept of
hooks in the Claude Code world, at least not that I'm using.
Maybe an MCP server can do that, but still you're going to have that context window or that
active window you're working in consumed by this kind of like a side task.
And you're kind of blocked.
you're blocked until you could do something else.
So that's kind of cool how you've solved that problem there.
Yeah, it's something that the team did pretty early.
I mean, I think one of the nice, this is where I think the advantage of building a tool
and using it to build your own, you know, it's somewhat meta, but that's kind of how it works.
And hooks came out of some of those ideas because hang on, we need to do that.
So, you know, it was a very early feature in, in Kiro.
Yeah, it seems like it has tons of potential.
What are the timings of the hooks?
Like, when can you actually fire off a thing or is I'm sure there's a life cycle of some kind?
Yeah, there's a, at least the way it is today, it's like a watch API on a file system or a repo.
It's looking for some change.
Okay.
So when files change, it does things.
Yeah, it's an if this, then that kind of thing.
If this happens, do it.
The key part is what Adam mentioned is it retains the context of the project, which is very, very important.
It's not going to go randomly start from scratch and start it.
But now you have to feed it all the context back in.
It already has all the context.
So it's going to work within that context.
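The "if this, then that" shape described here can be sketched as a small dispatcher: each hook pairs a file pattern with an action, and a change event fires every hook whose pattern matches. The `Hook` class, the patterns, and the actions are all hypothetical; Kiro's real hook format and triggers may differ.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase
from typing import Callable


@dataclass
class Hook:
    name: str
    pattern: str                  # glob matched against the changed file path
    action: Callable[[str], str]  # what to do when the pattern matches


def fire_hooks(changed_path: str, hooks: list[Hook]) -> list[str]:
    """Run every hook whose pattern matches the changed file and collect the results."""
    return [
        hook.action(changed_path)
        for hook in hooks
        if fnmatchcase(changed_path, hook.pattern)
    ]


# Hypothetical hooks modeled on the examples mentioned in the conversation.
HOOKS = [
    Hook("docs", "*.java", lambda path: f"regenerate docs for {path}"),
    Hook("l10n", "locales/en.json", lambda path: f"update other dictionaries from {path}"),
]
```

In this sketch the actions just return strings; in the tool being discussed, the action would hand the change to an agent that already holds the project context.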
Yeah, if I think about it's a context mover and saver is all I am.
Like, I want to keep the context.
I want to move the context.
I want to save the context.
And then in the case of Claude Code, when that infamous auto compact comes around,
like, no, my gosh, please.
Stop destroying my world, you know.
Auto compaction is such an interesting thing, which people, which is necessary
because you want to conserve tokens, but it kills your cache.
It kills your context.
There's all kinds of challenges
Well, then you, you know, as a user, you're
I just feel like it's a ticking time bomb constantly.
I just feel like such anxiety.
You know, I'm like right in the middle of the best part.
We're finally to the nugget.
You know, we've riffed back and forth enough
to finally get the clarity.
And it's like, okay, draft or update this spec.
Or in my case, it's a PEP.
I've actually borrowed the Python way.
and even the acronym too, PEP, but I call mine Project Enhancement Proposals versus Python Enhancement Proposals.
But by and large, the same concept where you've got this initiation of whatever it might be.
It's got statuses and all those things.
And that's what seemed to work for me, like updating that.
And I can't do it because, you know, I've got to wait for the auto compact to happen or then it comes back.
And it's like, I think I was doing this based on, you know, what this history says.
but then you kind of reset to zero.
I understand what happens
but it's always like this anxious moment
as a user to map around
Yeah, as you were talking about it
I felt like driving an electric vehicle
and having range anxiety
Yeah, similar.
It feels like that.
It's like how many miles till I can get there
to charge up, yeah, exactly.
It's, I think eventually
that will get solved and I understand
technically why it's there
but as a user, all it does
is basically give me anxiety
and then I got to work around it
and moving context and saving context
and I mean I've even started to like save
chat sessions because you can do
slash export copy to your clipboard
or move it somewhere
and kind of save at least the chat history
so that you can read back from it
you may not have the full actual context
that was saved but you at least have
some version of a map of how you got there
I'd be very interested when
some of the new Kiro builds come out.
We've got a couple of enhancements coming out this week in the clients.
I'd be very interested in you running long running sessions and seeing what happens
because we've done, tried to do a lot of work in managing that context,
in managing that and making sure you get the right behavior out of it.
It's not perfect, but I'd be very curious to see your experience,
you know, whether the range anxiety gets mitigated or just gets a little bit better.
And let's talk about that then because I want to talk about stack,
because, you know, my stack for what I'm doing
is pretty much any agent CLI.
And, Jared, you'll be happy to know that Zed is my daily driver now.
Like, I just, I've thrown every other possible editor to the wind
because Zed obviously is fast and beautiful.
And the only thing it lacks is some of the things that Sublime Text still uses,
but, you know, I digress from that point there.
But I'm using, you know, some version of a CLI in the terminal
and then a separate editor.
And I would not call Zed an IDE, although it has IDE tendencies.
That's my stack.
And the reason why it's my stack is one, it's accessible to me.
I've just learned about Kiro and just got access to it.
But even with Augment Code, I used it inside of VS Code.
And me as a user, as a visual person, as a designer, VS Code is great, but the text is small, the windows are spaced out.
What I like about the Claude Code, Augment Code, Amp CLI flows is that the user experience of using them is, it's a larger text.
I can see it better.
I feel like I've got more clarity with the, the vibing for lack of better terms.
And then I still have this really awesome, Zed is the best editor to still go and review and look at the actual code, you know, go around the entire directory that is the project.
To me, while I would love to use Kiro, the thing that would hold me back is that you're in this VS Code-like world, and it's just less than desirable, I would say.
It's not bad.
It's just not my preference.
Yeah.
I mean, I think we talked a lot about back in the day, you know, building something completely new like Zed did.
I like the fact that they call it Zed and not Zee.
or working off, you know, working off Code OSS.
And the reason we ended up, the goal, the idea, it was very clear to us,
because we used to have a plugin, you know, classic VS Code plugin.
We still do, which is on the Q Developer side.
But it became pretty clear to us from a UX affordance side
that was going to be a very, very difficult place to be.
We wanted to do things in the user experience that as a plugin was just hard.
So having, you know, Code OSS as a starting point
was great.
I think the interesting point is,
will we stay with Code OSS or will we evolve,
like at the end you fork it so much,
it becomes its own thing?
I think that's to be determined.
There's a lot of that.
One of the other advantages of using Code OSS is
there's so many people using VS Code
and they all have their plugins and extensions
and just pulling them into Kiro
to do things makes their life easier
as opposed to starting from scratch.
That's actually probably one of the biggest reasons
that we stuck with the Code OSS part,
but the goal was always making a UX
that was optimal for the flows that we want to get to
and I don't disagree with you
I think UX is also a very personal thing and how it works
I'll always go back to I just like personally
I never like using code editors
like I you know my
I like using text editors
but I'm not a
I'm not a really good programmer either.
But I think with Kiro, because now you're not in code editing land,
you're in vibing-with-agent land to create specs.
Our user experience is going to keep going in.
How can we make that a true, rich, multimodal experience and keep evolving it?
Where it ends up again, I don't know.
But that's a goal.
But actually what you said, the stack part is very important.
I do feel like most people will end up using a,
what I'll call it, a more rich experience.
which you can call an editor, IDE, whatever.
I almost feel like calling an IDE is doing the thing
a disservice right now.
So, you know, we call it Kiro IDE,
but in my head it's like a desktop product,
which happens to have a code editor inside it,
and a terminal-based product, like a CLI.
And I think those two, I mean,
because it's VS Code, I actually run Q CLI in my VS Code,
in my Kiro terminal as well.
I often also have a separate shell open all the time.
And one of the ideas, I think that's going to be interesting
in how it evolves.
You see that a little bit with some of these CLI things
that now have editor plugins where you're using the editor
just to visualize what's happening.
I think there's going to some very interesting, again,
UX evolution of how terminal-based agents
and richer UX agents evolve over the next several years.
So progress with Kiro means the VS Code side, the code side, gets kind of tucked further and further away,
and the spec editor and this, like, immersive, interactive building of a specification, and then
some sort of end product, kind of continues to level up and become more and more primary.
So perhaps Kiro down the road, you know, maybe you don't have to tell people that, yes, you can
look at the code underneath if you need to.
I argued with that
I remember having a debate with the team
and do we even need a code editor in here?
Yeah.
They won that battle.
Correctly so.
Yeah, I think right now you do.
Yeah.
My question for you is
with the current state of the art
models,
do you think that world
that we just described
of Kiro as this interactive
app building thing
with a code editor hidden in it?
Do you think that's
You think that's possible today with just better agents,
auto getting better and Kiro getting better and leaving all else the same?
Do you think we can get there?
Or do you think the underlying models also need to improve to bring us to where you want to go?
Oh, both, both.
I mean, it wasn't a both question, Deepak.
You can't say both.
I'll be blunt.
It's either or.
That's fine.
Touché.
I think we need one, at least one more generation of,
There are probably models out there that can get you there, but they're really expensive.
And their context windows aren't right.
There's a lot of limitations right now.
I mean, I say that given the fact that over the last year, the models have become so much better.
Some of the things that we're doing today, you could not do a year ago.
You just couldn't.
I do think there's one more generation of models before we get.
I still feel like to write a full application, like not a simple CRUD app, but a proper, you know,
distributed backend. And we have examples of people who've done that. But it requires a lot of
skill on the part of the developer to be sure that you're in the right place. I'm not saying
that's a bad thing. I still feel like the people who are going to be most effective using these
tools are skilled people. There's a lot of folks who believe and hope that you can give it to somebody
like me and somehow make me a better programmer, developer. That's not happening. Can I write
applications more quickly and easily.
Absolutely. I can prototype better than
I ever could. I've written
more in the last year than I had in the five years
before that. And mostly it's me
trying to make a molecular viewer for myself
because that's what I like doing.
But I do think...
Have you got one? You got one done?
I've got, like, 15 versions of it.
They keep... Because
that's my sort of test pet project.
Do they work though? Are these... Can you ship
them? And the molecular viewer? Is that right?
That's cool. A molecular viewer.
I used to be a bioinformatician, so I like my molecular viewer, because that's my standard test project.
Would you ship any of those?
You got 15 of them.
Would you ship any of those as products?
I have no idea.
I think it works, but I'm not sure if it works.
Right.
Okay, keep going.
Yeah, but I do think.
So you need that, there's a level of skill and understanding that you need to get the maximum out of these agents that we have today.
But as these reasoning models get better, as we start adding new techniques to it, for example, there's this fun term these days called Neurosymbolic AI, which is basically bringing mathematical techniques into generative AI systems.
So we have a very strong neurosymbolic AI team over here.
By the time this podcast publishes, this is probably going to be well-known news.
but Joe Hellerstein, who's well known in the database community,
he has a project called Hydro.
Hydro is an open source project to basically do,
verify the correctness of a distributed system written in Rust.
We want to bring those kinds of techniques
and build them into the spec engine of Kiro.
So as you're building software,
you don't need a human at the other end to verify correctness,
that you can verify correctness using a mathematical model in this case.
That would be game-changing.
Those are the kinds of techniques that take us to the next level.
You're not there.
Yeah.
But I think it's going to happen.
Within, I don't know if it's six months, one year, two years, three years.
Yeah.
That would be a game changer for sure.
It seems like everything's 80% solutions right now, or maybe even better than that.
Maybe like 95%.
And then like that last mile is where you need to be a senior engineer to actually recognize
and finalize the last mile of delivery on a lot of this stuff.
There's an intuition that they have
that you just don't get otherwise.
And I, as I said, to the question you specifically asked,
I would say, no, not yet, but we're not that far away.
Let's talk about pricing.
This thing's a product you all are trying to sell.
How do we buy it?
So we changed our pricing twice already.
Okay.
This is a, so this is a pricing that we announced, and I'm going to get it wrong,
last week
and it'll probably go fully live
and we'll start charging people
some time in the next few weeks
but it's your classic per user pricing
you get a free tier
you actually get a free trial
because the free tier is mostly just for vibe coding
the free trial gives you the full experience
you just I think we put you into the pro tier
for about a month and you can play around with it
and then if you like it you can actually start paying
or you can flip back to the free tier,
but there's a $20 tier, a $40 tier,
and a $200 tier
that give you 1,000, 2,000, and 10,000
credits. It's a credit system.
The way we talk about it is,
if you're using the auto model,
you will get those credits. The credit consumption for Sonnet 4
is 1.3 times the credit consumption of auto.
And so you just get a set
of credits. As you can see, if we gave you a selection of some kind, we would have different
credit consumption rates based on, you know, the cost of running those models. And so you get
two agents, an auto agent and a Sonnet 4-based agent. The Sonnet 4-based agent uses
credits about 30% faster, typically, because it depends on the prompt and the LLMs aren't
deterministic. But for the most part, it's a 30% faster rate of consumption than the
auto model. And we also have overages. Another thing
that we did is we added the idea of fractional credits. So you can
consume a fraction of a credit. You don't go from one to two in these step changes. We found people
were going through their credits a lot because they used a little more than, you know,
the boundary, and then they would get sent to the next one.
And that meant that you just consume credits really quickly.
The other thing we have is overages.
If you're in the $20 plan, you don't need to go to the $40 plan
if you just want to spend $21 for the project you have in hand.
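The credit arithmetic described here can be sketched in a few lines of Python. This is only an illustration of the numbers mentioned in the conversation (the three plan sizes and the roughly 1.3x rate for the Sonnet 4 agent), not Kiro's actual billing logic. The overage price per credit is never stated, so the value below is a hypothetical placeholder, and the function names are made up for this sketch. Reading the pre-fractional behavior as "round up to a whole credit" is also an assumption.

```python
import math

# Plan price (USD per month) -> included credits, as stated in the conversation.
PLAN_CREDITS = {20: 1_000, 40: 2_000, 200: 10_000}

# Relative consumption rates: Sonnet 4 burns roughly 1.3x the credits of Auto.
# The speaker notes this is typical, not exact, since it depends on the prompt.
AGENT_RATE = {"auto": 1.0, "sonnet-4": 1.3}

# ASSUMPTION: the overage price is never stated in the conversation.
# This value is hypothetical and chosen only to make the arithmetic concrete.
HYPOTHETICAL_OVERAGE_USD_PER_CREDIT = 0.04


def credits_for(work_units: float, agent: str, fractional: bool = True) -> float:
    """Credits consumed by a request that would cost `work_units` on Auto.

    With fractional=False the cost is rounded up to a whole credit, which is
    one reading (an assumption) of the "step changes" described above.
    """
    cost = work_units * AGENT_RATE[agent]
    return cost if fractional else float(math.ceil(cost))


def monthly_bill(plan_usd: int, credits_used: float) -> float:
    """Plan price plus overage for any credits beyond the plan's allotment."""
    overage = max(0.0, credits_used - PLAN_CREDITS[plan_usd])
    return plan_usd + overage * HYPOTHETICAL_OVERAGE_USD_PER_CREDIT


if __name__ == "__main__":
    # The same request costs more credits on the Sonnet 4 agent than on Auto.
    print(credits_for(1.0, "auto"), credits_for(1.0, "sonnet-4"))
    # Whole-credit rounding would turn a fractional cost into the next integer.
    print(credits_for(1.0, "sonnet-4", fractional=False))
    # Going slightly over the $20 plan adds a small overage, not a jump to $40.
    print(monthly_bill(20, 1_025))
```

Under these assumptions, a user on the $20 plan who goes a little past 1,000 credits pays a small overage instead of moving to the $40 plan, which is the scenario described above.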
So there's overages there, and then with the $200 one, the assumption is most people will rarely get there,
but we know there's some people who will, because they have.
But we spent a lot of time
trying to optimize that credit consumption
and at least in our early testing
it looks like you're not running out of credits
so quickly. In the middle, we had
separated out the amount of
usage, the way we were
charging for specs
versus vibing, and people found that
too difficult to wrap their heads around
so we've combined it into a single
credit bucket.
I'm still confused.
You're still confused. I'm still confused.
It's probably simpler on the web page.
Well, I don't mean, I don't even mean your description of it.
I mean just generally because I think Amp Code and Augment Code, they use messages as their terminology, whereas you're using credits.
That's what we used before.
Yeah, we use credits.
And how that maps to usage is sort of like a black box to me because, you know, I guess in those cases, like what I learned about Amp Code when I talked to Beyang Liu, you know, CTO of Sourcegraph, he helped me understand how,
because I had a larger context window, it was actually consuming more credits, which I thought
was like, well, that's convenient for you because obviously I want context and so I'm burning
faster. It's super unclear as the user. And I'm like, gosh, you're just super expensive. He's like,
actually, you're just holding it wrong. I'm like, well, I'm not actually holding it wrong. I'm just
holding it. I don't know how I'm holding it. Right. How do I hold it right? Right. So this mapping
of usage to messages to credits to me and it's not your fault. It's just, it's hard to understand.
It's tricky.
One of the things we've gone back and done is, because it became pretty clear that's what folks wanted, was actually show you real-time usage.
So when you work, I don't know if you'll have it right off the bat, but I know we're working on it.
We now show you as you are using it, how your credits are getting consumed.
Where we want to get to, and we'll get there pretty quickly, if not in the next couple of weeks, is actually showing you how a particular thing that you did, what kind of credits are consumed.
because to your point
consumption of credit
how much you're using
is not just about a prompt
it's not even about the context window
it's about tools
if the LLM decides to use
200 tools instead of five
it's going to use much
it's going to use more credits
that's the danger of MCP servers
they're amazing but they also
enlarge the context window too
so the more tools you use from those servers
yeah if you preload your
project with 5,000 MCP servers
you'll run out of credits in like five minutes because that context will get so large.
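A rough way to see why loaded tools matter is to treat each tool definition as extra context sent along with the prompt. The sketch below is a back-of-the-envelope model only: the per-tool token count is a hypothetical number, the function names are invented for illustration, and it assumes tool definitions are included in the context of every request. It says nothing about how Kiro actually converts context into credits.

```python
# ASSUMPTION: every number here is hypothetical. Real tool definitions and
# tokenizers vary widely, and credits are not a fixed function of tokens.
HYPOTHETICAL_TOKENS_PER_TOOL_DEFINITION = 150


def context_tokens(prompt_tokens: int, tools_loaded: int,
                   tokens_per_tool: int = HYPOTHETICAL_TOKENS_PER_TOOL_DEFINITION) -> int:
    """Tokens sent with one request: the prompt plus every loaded tool definition."""
    return prompt_tokens + tools_loaded * tokens_per_tool


def session_tokens(requests: int, prompt_tokens: int, tools_loaded: int) -> int:
    """Assuming definitions accompany every request, their cost multiplies."""
    return requests * context_tokens(prompt_tokens, tools_loaded)


if __name__ == "__main__":
    # Same prompt, same number of requests, very different context sizes.
    print(session_tokens(requests=10, prompt_tokens=500, tools_loaded=5))
    print(session_tokens(requests=10, prompt_tokens=500, tools_loaded=200))
```

In this toy model the prompt is identical in both cases, and the gap in total tokens comes entirely from how many tools were preloaded.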
I also think as a community, we need to start becoming much better at educating people on best
practices. So I think part of where this will end up is you'll end up
having really, really nice, I won't say really nice, but mechanisms to understand what's the
optimal way to set up a project. How do you do things that you always want versus bring them in
as you need it, showing you the right visual cues, because, and the other problem is,
and I'll go back to tokens here, because there's a, you know, in some sense, that is the currency,
a token in one model is not equal to the token in another model. Token is also different.
People put token weights, et cetera, but that's a heuristic. So I think the thing that people are
struggling with is that how LLMs work is not deterministic the way people's brains work:
how things get consumed and cost accrues.
And quite honestly, I think you're seeing the entire industry deal with it over the last six months.
And in different ways, this is our way.
We think it's going to work for what we want, what our customers are going to do.
But a lot of the responsibility is on us to be very transparent about when you submit a prompt and something happens,
you know, you at least get a sense of this is what this takes.
And if you have different agents, the fact that they behave differently,
you can make decisions based on quality and speed on whether it's worth,
you know, you may want to go with the more expensive agent for whatever reason.
But you have to know why.
And I think that's going to be, and that's hard, but that's what we're trying to do.
Well, certainly still the early days, I think.
I mean, I think this is evidence of the trailblazing that is going on
and just the figuring out of how it all has to work and how it
should work and then educating ourselves on how to hold things optimally, not wrong.
And picking the correct model or having a model picker who's orchestrating behind the scenes,
like where you send any particular prompt to to save money and speed.
I mean, it's all very much the land of trailblazing, isn't it?
I mean, it does not sound figured out by any means.
Yeah, I mean, I think the other thing is how people use
these agents and how they get success
out of it is sometimes learned only
after doing it. You don't
know beforehand
this is the best way to use this agent
to be successful. The whole idea
of what became specs and these
memory banks and
prompt libraries and steering
files, I think
we sort of
you know, this is the equivalent of trying to learn
how to walk, falling down, trying to learn
again. It's almost like your
first baby steps. That's how you learn.
And I think we've gotten to the point where we are, not because all of us were smart and knew what we were doing.
I think as an industry, we are constantly learning from people's behaviors, how they use these things.
Inside Amazon, we have this power users Slack.
I think I get more understanding of what's the best way to run an agent because of people who are using these agents in anger every day and learning better and better ways to use them.
And they come up with things like agent flow, and that gives product people product ideas.
Right. But that's just us. I think the whole industry is doing that. And so it is early days, but it's also a lot of fun.
So yeah, that is the hand of innovation, is using things in anger, you know.
Right.
It creates itches that you scratch. Next thing you know, you're doing something completely... you're innovating.
Yeah.
You're actually, yeah, you're on the edge there. I'm going to mention something that's slightly dystopian, even at this...
We almost got through without any dystopian, yeah.
Well, you know, are you guys both familiar with the idea of a toll booth?
Of course.
The way to get rich is to create a toll booth, right?
One way.
Well, yeah, sure.
One way.
Well, by and large, if you look at most of the way people develop wealth, it's some version of a toll booth in metaphorical terms.
It could be real estate.
That's a toll booth because you need somewhere to stay.
I got the toll booth.
You want to stay?
Boom.
I make my money.
or in the case of something like this,
I'm thinking of it like the future of software development,
I see it all going this way.
We all see it going in this direction.
And for a while there,
the cost to be a developer was your ability to,
you know,
hold something in anger,
maybe have a license for a code editor,
or in the case of Vim, it's free,
it's open source,
the machine you had to buy,
there was no toll booth in front of your ability to create and innovate.
The toll booth was maybe the books or the education process or the schooling required, things like that.
But you could bypass those things if you were smart enough to learn on your own and just avoid the toll booth altogether.
And I guess what I'm struggling with as we're in this conversation is maybe just the fact that if this is the way of the future,
what a massive toll booth there is in front of the innovation that drives the world.
And I'm just not sure how I feel about that right now.
Yeah, I actually think it's the other way around.
I'll use examples.
Maybe there's a toll booth.
I don't know.
But the way I think about it is there are things happening where the cost of doing that thing is very, very high.
Maybe I am a semi-technical founder and I'm busy spending time looking for
a very deep engineering team, because I can't even build a prototype, right, because
I'm not technical enough to go do that. I have an idea in my head. I can write some JavaScript
maybe, but my idea requires some heavy-duty backend stuff and I have no idea how to do it.
In theory, and I've seen enough examples that this is possible, you can do that on your own, or maybe
with one person to do much more work. Once you have that, you can go further. That's one example.
I've also seen people who've had a project that's been sitting on the back burner for a decade
because it is something that they've built, it's an open source project, they've got a day job,
and they don't have the time to go and evolve that open source project because it takes too much time.
They don't want to spend their weekends doing maintenance of an old thing, maybe.
Or you have a back end that's not scaling and you need to change it, but the cost, you know you need to change it,
But the cost of doing it is like 20 people and a year and a half of development, if you did it the old way.
I've seen enough evidence now in all three cases of people building, using these agentic development systems, going and tackling those problems.
The first one is an obvious one people talk about all the time that says, you know, founder thing.
The second and third ones, I think, are where the magic lies, which is you are doing things now that you would not have done before,
which results in better code, better user experience, etc.,
because your barrier to entry and ability to do it is small and low enough
that what you thought would take you two weeks
is now going to take you a day and a half, maybe even less.
What would take you 20 engineers and, you know, a year and a half,
you can now do in much shorter time, et cetera.
My favorite example is actually of a team that does sprint planning every four days
instead of two weeks because they were finishing their backlog that quickly.
So I think that's, to me that's, I don't know if you'd call that a toll booth or shifting
of the toll booth or whatever, but I do think there's this innate sort of barrier reduction.
There's an activation energy to everything.
I used to, you know, I'll speak in my physics terms.
And you need a catalyst to cross that activation energy.
And that catalyst could be funding because you get more people.
It could be hiring really talented people.
I actually think it's not just the catalyst part.
your activation energy has gone down significantly.
So the catalysts you need are,
you don't need catalyst at all in some cases,
or you don't need the best ones out there.
And I think that one,
it'll have a meaningful impact on software development
and how teams are structured and how we think about it
in a very positive way,
because you're getting things done
that you were either not doing or wouldn't have done
or would have done much more slowly.
And I think that's, at least that's how I think about it.
Maybe I'm the glass-half-full guy,
and let's just see.
You're spot on, and I concur and agree with everything you just said.
But what we cannot deny is that there seems to be a subscription of some sort.
And this is where I'm struggling, really.
Even though I'm, like, emphatically happy about the circumstance, I think it's cool.
I think it's the best thing almost ever, but now I'm struggling to think that there's now a subscription attached to being a developer or to doing developer things that build software.
And the reason why I say that is you go back to a couple years ago when they said you will own nothing and you'll be happy.
Well, now we don't really even, like if this is the way, right, and I was just telling Jared this the other day, if Adam and Jared from 20 years ago, which is today, let's say there's versions of me and him today that are entering the software world, if this is the way, maybe they don't even own their intellect and their innovation ability anymore.
Like, now we're renting that too.
And I'm kind of just struggling with that in this moment.
Yeah, and I, the ideas are still yours.
I think it's, I still, I mean, I remember the days when I was at AWS a long time ago,
when I used to have a stack of Linux servers at home that I finally,
like, I don't even know why I have them, right?
Because I, before I joined AWS also, where I, you know, I got rid of them and started using EC2 for everything.
I don't need, like, I have this little corner of my house picking up heat, not required.
Like, there's an early one.
So I still remember when you switch, you know, there's a time you switch from assembly language to more of the modern languages.
Or you went from some language to a Rails-type structure.
There's always an abstraction or something that comes up that allows you to shift your attention somewhere else.
I think the big one is where does that attention shift and how valuable is that?
And I think those have historically defined whether something is going to be successful or not.
I do think, and I mean, maybe all of us have drunk the Kool-Aid on this one.
In this particular case, I don't think we're talking about a 5%, like my ability to innovate or my ability to act and my ability to think and, you know, it's not becoming 5% better.
I actually think I have a new tool that allows me to operate fundamentally differently.
And I don't think we've quite grasped what operating fundamentally differently as individuals and teams actually means.
I think people are starting to find out, but I think it's going to make an impact in a very positive way.
And maybe it's the shifting of the toll booths on where the toll, because there's always a toll booth, as you said, somewhere.
But don't take it, even though I described it as dystopian, and I don't want to be necessarily negative, although it is a negative thought when you sort of go down the rabbit hole, I 100% agree.
I'm personally doing things, just simply exploring and scratching my own itches,
that I just would never, ever do
because I just didn't have the time
to even go to the depth necessary
to do it in a way that was even worthwhile doing.
And you take that same idea
and you apply it to an enterprise team
that has different context windows
and challenges and issues.
Gosh, yes, for sure.
I was just telling Jared the other day
that for a while there I was bummed
that I thought that, well,
there's going to be a market shift
and a market correction in terms of
the way enterprises or even teams
are built these days, and you may see layoffs, but I think by and large, we're going to
see a massive, massive influx into folks becoming what are called software developers, because
there's tooling that allows them to operate like a senior engineer, you know, or with
senior engineer abilities, and that's amazing.
I'm all for it.
In the moment, I was just like, gosh, do we now have a toll booth in front of our ability to
show up as developers?
And that was just kind of a bummer.
I think subscription fatigue gets to us all.
It does.
Yes.
You know, when you're paying per token, you're basically paying that subscription fatigue.
Well, Deepak, I know you are hitting up against a hard stop.
We appreciate you spending a few extra minutes with us.
Kiro looks really cool.
I love all the innovations going on.
I think that spec-driven development is a very awesome, like, stake in the ground to claim,
like, here's a way of doing agentic things that is better.
And I wish you guys all the best in building it out and realizing that vision of what you're trying to build.
Cool stuff, man.
Thanks for having me.
It was fun talking to you.
And as I said before we started, I started listening to the Changelog at episode one.
So it's great to actually...
That's so cool.
Appreciate it.
So cool.
That's beyond cool.
Thanks for listening, for being a listener for all these years.
I appreciate that.
Special thanks to our longtime community member
Jordi Mon Companys for connecting us with Emma and the AWS team for this episode.
You're awesome, Jordi.
Thank you for being awesome.
And thank you, of course, to our awesome partners at fly.io and to our awesome sponsors of this episode.
CodeRabbit.ai and agntcy.org.
That's A-G-N-T-C-Y dot org.
And thank you to our awesome
musical producer, the one and only Breakmaster Cylinder.
Sounds like the word of the day might be awesome.
Oh, Terry.
Yes, Pee-wee.
How do you like today's secret word?
Okay, that's it.
This one's done, but stay tuned because we have another awesome.
Yeah!
Good screaming, everybody.
Conversation.
This time, all about the trouble in the Ruby community
and what it means for open source at large.
Coming on Friday.
We'll talk to you then.
