The Changelog: Software Development, Open Source - Flowing with agents (Interview)
Episode Date: September 17, 2025
Everything is changing. Adam is joined by his good friend Beyang Liu from Sourcegraph — this time, talking about Amp (ampcode.com). Amp is one of the many, and one of Adam's favorite, agentic coding tools to use. What makes it different is how they've engineered it to maximize what's possible with today's frontier models: autonomous reasoning, access to the oracle, comprehensive code editing, and complex task execution. That's nearly verbatim from their homepage, but it's also exactly what Adam has experienced. They talk through all things agents, how Adam might have been holding Amp wrong, and they even talk through Adam's idea called "Agent Flow". If you're babysitting agents, this episode is for you.
Transcript
Welcome back, friends. It is your favorite podcast. That's the Changelog. We speak to the hackers, the leaders, and the innovators living at the speed of AI. And if you don't know, everything is changing.
Now, I'm joined today by my good friend, Beyang Liu from Sourcegraph, this time talking about Amp. Check it out at ampcode.com. Amp is one of the many, and one of my favorite, agentic coding tools to use.
What makes it different is how they've engineered it to maximize what's possible with frontier models, autonomous reasoning, access to the Oracle, super cool, comprehensive code editing, and complex task execution.
That's nearly verbatim from their website, I admit, but it's also exactly what I've experienced.
We talk through all things agents, how I might have been holding it wrong, and we even talk through my idea called Agent Flow.
If you're babysitting agents, this episode's for you.
A big, massive thank you to our friends and our partners over at fly.io.
That is the home of changelog.com. Learn more at fly.io.
Okay, let's agent.
Well, friends, the news is out.
Our friends over at CodeRabbit, coderabbit.ai.
They've raised a massive Series B, and they've launched their CLI review tool.
It is now out there.
I've been playing with it. It's cool. The bottleneck is not code; the bottleneck is code review.
With so much code happening, so many people coding now, so much code being generated, and so many things competing for developers' time and attention, maximizing code review still remains a bottleneck. But not anymore.
CodeRabbit CLI code reviews. Code reviews in your pull requests, code reviews in your VS Code, and more. Teams now have a true answer to what it means to code review at scale, code review at the speed of AI, and CodeRabbit is right there for you. Go learn more at coderabbit.ai. I will link up their latest blog announcing their Series B and their announcement of their CLI review tool. Again, coderabbit.ai.
Beyang Liu, welcome back to the Changelog. Let's go as deep as humanly possible
on AMP. Not Sourcegraph, necessarily, but AMP. What do you think? Cool. Yeah, it sounds good to me,
and thanks for having me back on the show, Adam. What is AMP? Yeah, I don't think that it takes too much
explanation. AMP is a coding agent. So most people, I would imagine your audience, know what that is at this
point. But, you know, if you've been living under a rock for the past six months, a coding agent is
essentially an AI-powered program that takes natural language instructions and then goes and
gathers context and then modifies the code following your instruction.
And so it's much more, kind of, high-level. You describe what you want,
you figure out how to instruct it, and then it figures
out how to do most of the editing for you, you know, runs the tests, runs the compilers,
it can correct itself.
And the idea is that we want to enable programmers to operate at sort of like a higher level.
So, you know, dictating what the architecture should be, what the key interfaces should be,
what the UI should look like, and not as much have to be in the weeds of every single line of code that is written.
And then AMP in particular, among the landscape of coding agents, is distinguished by the fact that, I think, we're the only coding agent that I've come across that has this approach,
which is: we're multi-model. We use multiple LLMs, which is not distinctive. There's a lot of coding
agents that use a variety of underlying large language models and small language models now.
But I think our approach is different in that we don't have like a model selector. It's not like
a model harness. So an AMP user doesn't think in terms of like, oh, let me use, you know,
Claude Sonnet 4 today, or let me use GPT-5 today. We view that as basically an implementation detail.
So, like, we'll figure out what the models are good at, and the contract with the end user is an agentic experience that hopefully just works well for your use cases.
So there's no toggling on different specific models.
It's more like, oh, you know, for this particular feature or sub capability of AMP, we're going to use this model because it has the right characteristics in terms of latency, intelligence, and, like, competency around, you know, one particular task.
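To make that concrete, here's a hypothetical sketch in Go of what "the model is an implementation detail" could look like: the harness routes each sub-capability to a model, and the user never sees the table. The capability and model names are illustrative, not Amp's internals.

```go
package routing

// Capability is a task the harness needs a model for.
type Capability string

const (
	MainAgent Capability = "main-agent" // drives the agentic loop
	FastEdit  Capability = "fast-edit"  // low latency matters most
	Oracle    Capability = "oracle"     // deep reasoning on hard questions
)

// modelFor maps a capability to whichever model currently has the right
// latency/intelligence/competency trade-off; this table can change
// without any user-facing model selector.
func modelFor(c Capability) string {
	switch c {
	case FastEdit:
		return "small-fast-model" // placeholder model names
	case Oracle:
		return "deep-reasoning-model"
	default:
		return "frontier-agentic-model"
	}
}
```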
Maybe just a side tangent, which I didn't really plan to do until this very moment,
but I was thinking back to the last time we had a conversation.
And I think at the very end, I'm not even sure if it ended up on the air or not.
So this may have just been side conversation between you and I, and maybe just me giving feedback.
I think it was, how in the world do I sign up for Cody?
And so obviously we're talking about AMP.
I don't know where Cody went.
I've literally forgotten about Cody until this very moment.
and that feedback I gave to you, which was,
hey, I'd love to play with your stuff,
but I don't feel like I know how to do it.
I think I have a Sourcegraph account.
I've tried 16 ways,
and I keep hitting, you know, no progress.
So there you go.
Oh, you've run into, like, auth issues with AMP?
No, no, no.
This was Cody, back when we talked.
Oh, yeah.
I don't even know when we talked,
maybe eight months ago, 10 months ago or something like that.
Yeah.
So where's Cody?
So I would say...
Cody is still alive and well.
Okay.
In the enterprise.
Okay.
But we've sort of moved away from it for non-enterprise use cases.
So there's still plenty of enterprise customers that use it quite heavily for, I would say,
like non-agentic AI coding assistance.
And that's still a common feature among a lot of large companies.
There's some companies out there that even have like a legal prohibition on anything
agentic.
However, you want to define that.
You can't say "agent" inside
some orgs.
What's the alternative word to agent if you can't say agent?
Oh, I think sometimes people say, like, workflow, and that means... you know,
there's no single source of authority on what agent or workflow or any
of these terms mean, but there is a different connotation or zone of what that definition
means. And so, like, I think some more regulatory or, like, legally sensitive orgs, they're
okay with workflows, but not anything that's, quote, unquote, like, fully agentic. And I think
it's just hard to pin these down. And at some point, it's not worthwhile
discussing it in too much depth, because at the end of the day, it's like, you know, a certain
customer has this requirement, and, you know, we're fine meeting that. They determine
they need that. And, you know, we're happy to serve them where we can.
But long story short is, Cody is still alive and well in the enterprise.
It's used for non-agentic workflows.
But outside of the enterprise, the reason why we built AMP outside of Cody
was really, we felt that there was this very big technological shift that was happening.
If anything, it's like, you know, the whole, like, gen AI phenomenon:
there's multiple waves of technology that have actually gotten lumped into one umbrella term
that people call gen AI.
But we view the agentic, we view agents as like fundamentally a different technology and a different application architecture than the pre-agent, more like chat-based AI.
And so that, to us, was: we really needed to think from first principles in terms of how we built this thing in order to fully unlock the potential of agentic models.
And so we didn't want to be hampered by the design constraints and the way that we built Cody originally, because that was very much built
for the, like, chat-LLM world.
And if anything, a lot of the best practices
for building with chat-oriented models,
it's like the inverse with agentic LLMs.
Big shift, big waves changing as we have this gen AI world.
And we've been, what, three-ish years deep
into this phenomenon, as you have said, right?
Just under three years?
Yeah, just under three.
It feels like a decade.
I don't know about you.
I feel like it's been a decade.
Yeah, it's wild.
I'm excited. Are you excited about this change?
I mean, I feel like I've had like peak hype and then a drop down to reality, then
peak hype, then drop back down to reality.
And I feel like now I'm just sort of like peak hype all the time.
Yeah.
I mean, definitely excited about the technology.
I think it's like got huge potential.
I would say, like, I was never excited to the point where I believed in the sort of, like,
AGI myth, or believed
that it would eliminate all need for work or, you know, kill us all. I thought that was always
like a fairy tale told by people with overactive imaginations. But definitely very excited about the
potential for this to eliminate toil for all kinds of knowledge work, not just developers,
but especially developers, because that's who we sort of build for. Yeah. I feel like it's
a, it's obviously a force multiplier. I still feel like it's this weird genie in a bottle where you have to conjure it in certain ways. You have to hold it delicately. I feel like, even in my own experience, it's smart in some cases and then really dumb in some cases. I have some questions for you about that, to sort of demystify when I feel like, gosh, we've done this before and you've been really good, and now suddenly you have no idea how to commit to my Git repository. You know, like,
that's the most basic function that you could possibly ask for. It's like git commit -m blah. Like, you know,
that's pretty easy. But I feel like I've had moments where I'm like, you know how to commit
to a Git repository, right? And I'm not speaking to it that way, but in my brain I'm like, wow, this thing
was really smart, and now it's not really smart. So I feel like there's waves of that, even in my own
personal usage.
Yeah, I feel like, so, the committing to Git, I think that's, I mean, at least for, like,
AMP, that's largely solved now. It's been a while
since I've seen that class of error.
I'm not blaming somebody else.
Okay, gotcha.
Like, I think those sorts of things,
I have very high confidence
that it can just do now.
But I get what you're saying.
There is, like, the spectrum
where some things that you, a human,
would consider difficult,
it can just, you know,
almost one shot.
Whereas other things where a human,
you know, it'd be like a two-line change
or whatnot.
And it will just,
we would call it, like, doom looping
where it's just like iterating
over and over again.
It can't figure it out.
even though you're saying like, okay, now try it like this.
Try it. You're trying to like nudge it and it just like just won't get it.
There's certainly that phenomenon that's still at play.
And it's almost at the point where a lot of this reduces to the underlying model intelligence.
And so I think the proper way to view this is: part of it is, okay,
there's some improvements that we need to make to the product in terms of how we present the answer to the user
and help people prompt in a way that gets you to the sweet spot
of the model. But there's also, like,
users have to view this as
like a skill set they need to acquire
as well. So like the more you
use agent decoding
tools, the more you develop an
intuition for
what it can do and
can't do. And so
you can almost
like tell from talking to a person
how much they've used these tools
in terms of like what they express
their frustrations as. Like a lot
of newbie users, especially the ones that are like
from the onset skeptical, for whatever reason.
They will say, like, oh, you know, I asked it to do this thing
that ought to be very easy, but it fell flat on its face.
And then you ask to see what their prompt looked like,
and it's like five words.
And it's almost like, at that point, it's like, you know,
what do you expect it to do?
Like, it's not a mind reader.
There's almost like an information, like,
you put a certain amount of bits in,
and from that, the model has to tease out what your intention was.
And, like, it can certainly accept, you know,
instructions that are, like, five
words long, but if it's only five words, it's going to be very much, like, you know,
based on what its prior behavior wants to do. So, like, if you have, like, a five-word
prompt to create, like, a simple React app or a simple game or whatnot, it could probably
do that, but it's not going to be probably what you intended, unless what you intended is, like,
the median React app or the median JavaScript game that's represented in its training set.
Whereas the more expert users, they still have complaints, but it's oftentimes around very
specific things. It's like, okay, I know what it's good at. I'm not complaining about the
fact that it can't do these things anymore. I can still do those things. It's fine. It's more
around like, hey, I would love to be getting more out of this tool, but there's certain bottlenecks
that are kind of constraining how much I can use it. I think one very important bottleneck that
still remains is code review. So, like, I think most expert users of coding agents realize that
like, you know, they're very powerful and they want to be using them heavily, but you can't trust them fully.
Like, you still got to read through the code that they emit and understand it because otherwise the slop will creep in.
And they'll make subtle changes that, if you don't catch them and are not wary of them, will add complexity and, like, nuanced bugs, subtle bugs, to the code.
And so this process of the human understanding the code that's been generated, and ensuring that it doesn't do anything very incorrect
or not according to their intentions,
that's an important part of the process.
And it's quickly becoming, I think,
like one of the key bottlenecks
in this process among the folks
that are really, really trying to push
the frontier of what they could do with these tools.
Yeah. My case in particular,
I'll call out the tool. It was not AMP.
Just so you know. It was Claude.
And it was around 11:30 p.m., 11:45 p.m. Central Standard
Time, literally last night. And the example was, I'm obviously, you know, using Claude Code.
Well, not obviously. I'm using Claude Code on the command line, so I'm using their CLI tool. So
in my directory, claude opens up their CLI tool, and then there you go. In my case, I think it may
have been a recent context window clearing. Either way, I just feel like simple tasks were
getting harder and harder for it to do.
Like, it was almost like it just got unsmart.
I don't want to say the opposite word, because it's not cool to do that.
But it had become unsmart, basically.
And this specific example was SSH into this known machine we're working with.
Like it has a name in my hierarchy.
Like it's clear.
There's context for what this machine is, how we've been interacting with this machine
and how it would work.
And I said, SSH into this machine and just check on
the memory state, because this is something we had been doing.
And it suddenly says, I can't SSH into machines.
I'm not able to do things like that.
And I'm like, no, no, no.
You're Claude Code. That's your first job as Claude Code: to be able to traverse anywhere
I can go in my terminal sessions.
Yeah.
And you should be able to SSH in.
Because I've got, you know, an SSH key, and this is my own machine.
And so this is a known thing.
And I had to remind it, and it was like, oh, you're right.
Right. Oh, you're right. I can do this. And so that's the very specific example, versus just... I feel like sometimes there's drift in its ability. And that could be a Claude thing. It could be, maybe it was a certain time and usage was peaked, and maybe these models get less smart whenever there's peak usage, because there's maybe less memory to go around. I have no idea. I feel like that part of the world, the usage, is so black box. And maybe it's black box,
honestly, to you too. But I feel like,
in that case, I was like, you know how to SSH,
and yes, you can. And it's like, oh, you're right, I can.
Yeah. Yeah, it's interesting.
I think there was like, I don't know when this is going out,
but there was a recent report that Anthropic posted
where they were, apparently they had rolled out
like a quantized version of the model
over the past couple of days,
which actually did yield degraded quality.
And so this kind of confirmed the conspiracy theories
in people's minds, where it's like,
you know, are they nerfing the model quality?
And it turns out they had rolled out some changes
that were, you know, aimed at improving efficiency
but did actually have a tangible impact on model quality.
So I wonder if, like, you know, that happened in your case.
In our case, you know, we do use the Claude family models
heavily underneath the hood,
but we have a couple levers that we can pull
that can help address this issue.
You know, one is we actually use multiple inference providers
that provide Claude.
And there's actually periods of time
where, like, one provider
will have different,
like, uptime characteristics.
So, like, if the model is completely down
from one provider,
we can switch over to the other
and stay online
while other people, using tools
that are tied to just one provider,
can't use that tool.
And that also goes to the model quality, right?
So, like, if we notice quality degradation
from one provider,
we can cut over to the other,
and still be consuming the model at sort of like full fidelity and full quality.
And then the second lever that's just now coming online is we're starting to play around with
more and more additional families of models.
So we already make use of a variety of models for specific use cases and capabilities within AMP.
And we're also constantly trying out different models as the main agentic driver.
And I could see it like a future where if we start to notice, you know, degraded quality,
or higher error rates from one model family,
or maybe it just goes completely offline,
we could cut over to another model family
as a fallback, provided it's good enough
at the core capability of driving
the agentic workflow that we need.
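As a rough illustration of those two levers, here's a minimal Go sketch: try each inference provider for the preferred model family, then fall back to another family that's good enough to drive the agentic loop. The types and names are assumptions for illustration, not Amp's routing code.

```go
package routing

import "errors"

// Provider serves a model family and reports whether it currently looks
// healthy (uptime plus no observed quality degradation).
type Provider struct {
	Name    string
	Healthy func() bool
}

// pick returns the first healthy provider, walking model families in
// preference order, so one provider's outage or a degraded rollout
// doesn't take the agent offline.
func pick(families map[string][]Provider, order []string) (string, Provider, error) {
	for _, family := range order {
		for _, p := range families[family] {
			if p.Healthy() {
				return family, p, nil
			}
		}
	}
	return "", Provider{}, errors.New("no healthy provider in any model family")
}
```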
Let's talk about, I guess, how it works, if you don't mind.
Can you...
You're CTO of Sourcegraph, so you should know these things.
At least the read-me version,
I'm assuming you're super deep.
I don't want to assume a lot of stuff,
but I figured your position gives you the ability to go
probably as deep as anyone might be able to,
aside from maybe the core team and literal, you know,
implementers who are working on this day to day.
But give it to me as best you can for the world, you know, for the public:
how AMP works, from the architecture at, like, a hosted-service level.
You talked about being able to determine degradation.
Give me, as best you can, the lay of the land
that can be public.
Yeah, so I'd say there's no weird rocket science here.
I'd say like at the core, at a very high level,
AMP operates the same way that every other agent does,
which, the way I'd describe it, is a for-loop wrapping an agentic LLM.
So you take an LLM like Claude or GPT-5,
or any of the other, like, agentic LLMs that are now online,
you put that in a for-loop.
And what the loop looks like is: you take the user input, you feed it into the model, and the model will come back with some combination of tool calls and, like, written response, like, you know, thinking things out or responding to the user query.
And the tool calls are just text.
They describe like, hey, you told me that you had access to these tools.
You know, maybe it's grep, maybe it's a read file tool, and maybe it's a tool to edit a file.
I want to invoke this tool with these arguments.
and then our application logic goes and executes the tool call.
So, like, given the spec that we get from the model,
we execute the tool, we get the response,
and then that response gets fed back into the next iteration of the loop.
And that just keeps looping until there are no more tool calls,
at which point the model generates a final response.
And that's usually either, like, the answer to the user's question,
or it says, I made these edits to the code in these files,
and these are the changes that I made,
and gives you a summary of those changes.
So at a high level, every single agent in the world
follows that architecture.
It's like the agentic equivalent
of the ChatGPT wrapper architecture
that was so prevalent, you know, a year, 18 months ago.
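To make that loop concrete, here's a minimal sketch in Go. The Model interface and tool names are stand-ins for illustration; this is the generic architecture Beyang describes, not Amp's actual code.

```go
package agent

import "fmt"

// ToolCall is the model asking the harness to run a tool with arguments.
type ToolCall struct {
	Name string // e.g. "grep", "read_file", "edit_file"
	Args map[string]string
}

// Turn is one model response: some text plus zero or more tool calls.
type Turn struct {
	Text      string
	ToolCalls []ToolCall
}

// Model abstracts the underlying agentic LLM (Claude, GPT-5, ...).
type Model interface {
	Step(transcript []string) Turn
}

// Run is the for-loop: feed the transcript to the model, execute its tool
// calls, append the results, and stop when no more tools are requested.
func Run(m Model, userInput string) string {
	transcript := []string{"user: " + userInput}
	for {
		turn := m.Step(transcript)
		transcript = append(transcript, "assistant: "+turn.Text)
		if len(turn.ToolCalls) == 0 {
			return turn.Text // final answer, or a summary of the edits made
		}
		for _, tc := range turn.ToolCalls {
			result := executeTool(tc)
			transcript = append(transcript, fmt.Sprintf("tool(%s): %s", tc.Name, result))
		}
	}
}

// executeTool is where application logic actually runs the requested tool.
func executeTool(tc ToolCall) string {
	return fmt.Sprintf("ran %s with %v", tc.Name, tc.Args) // stub for illustration
}
```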
Like beyond that layer, you know, there's nuances.
There's nuances in terms of, like, what tools we provide
the models, the different prompts that we use
for the main agent, and then there's also sub-agents.
So sub-agents are a special
class of tools where you call a tool, but underneath the hood, it's just its own, like, nested
for loop. So it's its own, like, agentic loop doing a more targeted task. So you could have
just like an agent calling itself in sort of like a recursive fashion or might call a domain
specific agent. Like we have an agent that's dedicated to code-based search, which has been
optimized for finding and locating relevant context within the repository. And so there's a lot of
tuning that we do to make sure that all those pieces operate well together, that we can
kind of eliminate the dumb class of errors like, you know, oh, I can't SSH into this thing,
or I can't read this file. Smooth all those out because those are big disruptors to the user
experience. And then we find pieces to fill in the gaps for, you know, what blocks the agent
from getting further or what helps it correct, course correct itself when it doesn't immediately
do the right thing off the bat.
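Continuing the earlier sketch, a sub-agent is just a tool whose handler runs its own nested loop. The "codebase_search" name is illustrative, echoing the search-focused sub-agent described here; the wiring is an assumption, not Amp's implementation.

```go
// executeToolWithSubAgents treats certain tools as nested agent loops.
func executeToolWithSubAgents(m Model, tc ToolCall) string {
	if tc.Name == "codebase_search" {
		// A nested for-loop: a targeted agent tuned to locate relevant
		// context in the repository, which reports back to the caller.
		return Run(m, "find code relevant to: "+tc.Args["query"])
	}
	return executeTool(tc) // ordinary tools run directly
}
```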
And then there's sort of this client
server architecture that we have where
part of it always has to be on the server
because the best
models are still server bound right now
because local models have not yet gotten to the point
where they're fast enough
to run locally. But in addition
to that, AMP also has a
server-side store for, like,
all the threads. So we call
the agentic interactions that people
have with AMP "threads."
So every kind of like session you have with AMP to accomplish a task is a thread.
And all those threads get synced to the server.
And the benefit of that is that you can go on AMP and see the threads of all your teammates
and how they're using AMP, which is really helpful because coding agents, using them effectively, in our view, is a high ceiling skill.
Like you said before, it's not trivial, right?
Like there's certain things, especially if you're a newbie, where you expect it to be able to do something, you know, especially if you've come in consuming a lot of the hype around AI. You're like, oh, you know, it's magic. And then it doesn't do the thing. It doesn't read your mind. And so it's really helpful to have this team-wide view of how other people are using this tool to great effect, because then you can go and learn what their best practices are. So that's another big component of the AMP architecture: we do have the server-side component that stores thread information,
so you can share with your teammates
and also you can go back
and revisit previous threads
in case you're like, oh, like
what was that thing that I was learning about
like a couple of days back? Let me go back
and look at that thread and see
what it answered or what it actually did there.
Is the thread simply
kind of a chat history?
It's not really context.
It's just more what was printed back and forth kind of thing.
It includes every single message
from the assistant or the user, and all the
tool calls and all the tool results. So it's basically the entire interaction. It's like, here's what
you asked it to do. Yeah, it's like a transcript. Here's what you asked it to do, here are the tools that
it called, here's the results of those tool calls. Like, I read this file, read that file, it listed
this directory, it made this edit. And then you asked it to do this. And then it's just like
the entire message history.
Interesting.
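Based on that description, a stored thread is roughly the full transcript: every user and assistant message plus each tool call and its result. Here's a sketch of what such a record could look like in Go; the field names are assumptions, not Amp's actual schema.

```go
package threads

import "time"

// ToolRecord is one tool invocation and what it returned.
type ToolRecord struct {
	Name   string // e.g. "read_file", "edit_file"
	Args   map[string]string
	Result string
}

// Message is one entry in the transcript.
type Message struct {
	Role      string       // "user" or "assistant"
	Text      string
	ToolCalls []ToolRecord // calls the assistant made on this turn
	At        time.Time
}

// Thread is one session with the agent, synced server-side so you and
// your teammates can revisit it later.
type Thread struct {
	ID       string
	Messages []Message
}
```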
So you've got client, server; you obviously have your own
CLI, so you can install it.
I think, can you install it via brew
on a Mac? I can't recall if I did it via brew
or if it was via npm.
The preferred way is via
npm for now. Gotcha. Okay.
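For reference, the npm route looks like this; the package name here is from memory of the docs, so treat it as an assumption and check ampcode.com if it has changed:

```
npm install -g @sourcegraph/amp
amp   # launches the CLI in the current directory
```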
So you've got that client
architecture, that is your own CLI, which
by the way is just stunning. It's beautiful.
I love the way y'all did that. Thank you. Yeah.
Really love the nod
to NeoFetch.
At least I think it was a nod to
NeoFetch, the opening splash screen.
I could be wrong.
The orb?
Well, you know, when you launch, you know how you launch Amp.
So when you first, for the users who may not, or listeners who may not, I'm calling them users.
Y'all are listeners still yet.
You're not users yet.
But, you know, when you launch Amp, if you've ever used NeoFetch on Linux, when you NeoFetch on any given box or machine you've SSH'd into, it essentially gives you this sort of bigger logo on the left and a list of details on the right-hand side.
That Amp splash page reminds me of NeoFetch.
I'm not sure if there was a nod there or homage there.
To my knowledge, no, but I'm also not the one who created that splash page.
So I don't know if that was an inspiration.
I think we're just, like, part of what we're trying to do,
I think like command line coding agents are all the rage right now for a good reason.
I think, you know, it's super versatile.
And I think now, in part because
of AI, people are really pushing the boundaries of what you can do inside a terminal-based
UI. And so now there's all these, like, great new libraries and frameworks that are doing
things that are visually, like, stunning inside the terminal. Whereas, like, I don't know,
two years ago, you asked me to describe, like, the prettiest command line tool, and they all kind
of look the same, right? It's all mostly text.
Htop, maybe.
Yeah, yeah, exactly. But now, I think.
because the terminal is so versatile, right,
like you can deploy a coding agent in a huge number of places
if it's in a terminal form, right?
Because you don't have to install, like, a graphical interface.
You don't have to, like, go through any of the loopholes you have to do.
Or not the loopholes, but you don't have to jump through any of the
hurdles to set up the coding agent
if it's based on the command line.
And so I think we, along with everyone else,
are trying to make AMP as pretty as possible in the terminal.
Because today you want to, yeah, you want to use something that's delightful, right?
You know, there's a lot, I can gush about a lot, but I think that just the animation of that orb
initially was really cool.
And I think it's changed over time.
I can't tell because I never snapped a perfect memory shot of it in my brain.
But I think it's changed over time and it's gotten different colors and maybe even the shape
has changed.
But it's cool.
It's gotten better.
And I do want to call out that, so we just shipped a new terminal UI like yesterday as of this recording.
And it's actually using a new terminal UI framework that was built in-house specifically for AMP.
And one of the things that's immediately obvious if you use it is that there's no flicker.
So like we're using ink beforehand, which is very popular.
And, you know, frankly, like very, very great framework.
But one of the things that people notice with ink is that it flickers a lot.
And the Flickr gets worse in some terminals over others.
So, like, if you're using it inside T-Mux, as I often do,
like, the Flickr is very noticeable.
And that's something that we're able to eliminate
as a very non-trivial technical task.
And special shout-out to Tim Culverhouse on our team
because he basically just joined the team.
He's, like, a terminal UI expert.
He did a lot of work on Ghostty, that terminal,
working with Mitchell Hashimoto and folks there.
And he's just been, he basically rewrote the entire terminal UI and it just launched.
And it's really solid, I think.
This date, just so everybody knows, we're recording on September 3rd.
So yesterday it was September 2nd, if you can't do calendar math.
I did it for you there.
I'm not sure if I noticed that the UI
has changed much, but I have noticed the flicker. And I thought it was because, like, anybody
doing this... I don't think I've shut my machine down in, like, a month or so.
I'm just afraid to shut it down.
I'm kidding.
I've shut it down since then, because I've learned.
But, like, for a while there, I was like, because of threads, like, going back to this
thread and the transcript, I was, you know, because the things you're putting into it is, like,
as you said, it's a skill set that you learn.
And so I feel like as you get better and better,
you want to understand what you did a week ago or a session ago or two days ago
or how you frame something so that you can keep doing a version of that or build upon it.
And so this history to me is really important.
And I've been like collecting little text files basically because I didn't realize or, you know,
I don't think everybody has your gumption where, hey, these threads,
these transcripts are kind of important to Spalunk.
If you're a week or so past or what the, the true conversation was or what, you know, the user asked or the, I think you called them, not an agent.
The agent is the thing, but what did you call the person?
You had a name for the person.
Oh, I usually just say the user or the user.
The user.
Okay, cool.
You know, me as the user, I was like, you know, I've been collecting little prompts and little things like that.
Like, nothing major, but just things I'm doing frequently.
Like, you know, I've come to this, you know, my usage pattern has gotten to this point where I feel like I need to give, I need to define roles similar to how you build out a team.
I have a Linux platform engineer or a, you know, a Go style enforcer who understands really good Go, idiomatic Go code.
And then a style guide for my Go code.
And then I have a, you know, a Go style reviewer, because you were talking about code review before.
And so I will give them different roles, and I feel like that works for me.
But as part of getting there, it was condensing more and more of what I would kind of keep as this massive prompt.
And I would just sort of move that larger prompt for me into a history file of sorts where this is a role, assume this role, here's your task, and then, you know, go forth with that new role.
And I feel like that gives it some really good context versus.
some magical prompt I've been cargo-culting again and again and again.
So I've learned a skill set, if you can call it that, to better be an agent babysitter.
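As a concrete illustration of the role idea, a role file might look something like this. The file name and format are a sketch of Adam's own convention as described, not anything AMP prescribes:

```
# admin/roles/go-style-reviewer.md (illustrative)
You are a senior Go reviewer. Enforce the style guide in
admin/styleguide-go.md. Review only the diff you are given.
For each issue: cite the guideline, quote the offending lines,
and propose an idiomatic fix. Do not rewrite unrelated code.
```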
Yeah, I think, you know, I talked to a lot of users and a lot of people have some variant of that
where it's almost like they've built up a library of prompts for specific, you know, task
types. I think a lot of people have built up some sort of template for listing out. It's like a
combination of like here's how to structure your plan. First of all, like generate a plan at
all. If you want to do like a longer running task, generate a plan, here's a couple
pointers about how to structure that. And then maybe there's a couple pointers about the codebase
as well. We have introduced this thing called agents.md.
I think it's like pretty standard now
across many coding agents
and now there's finally a standard.
I think more and more people are hopping on to this
using the agents dot MD file
as a standard for context.
So, you know, Amp will consume that,
but a lot of people add additional context
beyond that that are, you know,
maybe not specific to the code base,
but specific to their like personal,
like the things that they personally feel
they want the agent to do.
And so they'll copy and paste that in
to sort of like steer the agent to do
and behave the way they want it to behave,
especially over these longer running tasks.
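For readers who haven't seen one, an agents.md is just a Markdown file of standing instructions the agent reads before working. A minimal, made-up example:

```
# AGENTS.md (illustrative)
- Build with `go build ./...`; test with `go test ./...`.
- Run gofmt and go vet before declaring a task done.
- Commits: imperative subject line; the body explains why.
- Never modify files under admin/ without asking first.
```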
How do you feel about the user experience,
the way a user interacts?
So you said you interact with a lot of different users
who have different ways and all that good stuff.
I've been working on a,
since I've been building my skills to be an agent babysitter.
I've kind of come to this idea.
I'm calling it Agent Flow.
It's an internal name to my own brain.
I'm a team of one.
You know, I've got a team around here,
Changelog.
But the things I'm working on are just sort of like fun tools that solve my own itch and my own problems.
And so my interactions is not team-based.
You know, so I'm not leveraging your threads and sharing those threads with other people.
It's simply just a team of one doing it.
And so I've come up with this idea of agent flow.
I would say the protocol is called document-driven development.
And the implementation is called agent flow.
How do I get agents to understand how I want to flow with it?
You know, we vibe and we flow.
And so I'm like, okay, agent flow sounds kind of cool.
I don't know if that's trademarked or not, but I like it, okay?
I've kind of given you a glimpse of it where I define roles.
I obviously leverage your agent file.
I think Claude has its own version of it.
And so, you know, maybe you can centralize it to just simply AGENTS.md versus CLAUDE.md.
But essentially, how do I want my agent to know about my codebase,
to know about the things, the tasks I'm giving it?
Yeah.
And, you know, obviously, which role to take on, which hat to put on, so to speak,
and maybe a style guide of sorts that represents how do we write go code or how to write rust code?
Like, what is the blessed idiomatic way of doing that?
And then I think the additive to that that really sort of solidifies agent flow for me,
and you sort of mentioned it was this, how do you give it its list?
How do you give it its direction?
And I just borrowed it from the great language of Python.
They have PEPs, Python enhancement proposals.
And they have a whole system around drafting PEPs.
And so I took the same acronym PEP, and I just said that this is a project enhancement proposal.
And so for every new thing, every new idea I have, it begins as riffing or vibing as you would.
But it's all about developing this document called a PEP.
And the PEPs have numbers, so it could be PEP-0031. Maybe I'll need 10,000 at some point, but for now I feel like,
you know, four numerals there is just fine. But that's the essence, for the most part. It's like:
draft PEPs so that I understand full context. Let's dream and build and think. And that thing
has a status. It has all the artifacts. That PEP lives in its own directory. The agent can add more
things like maybe it completed something.
So it's a completion report.
We complete this PEP, but don't just update the PEP document.
Give me a whole separate document that is about the completion of it.
And how did it work?
So I can go back and learn and read later on.
You know, you even have things like knowledge base.
So I have it draft knowledge-base articles.
So once we're done with a PEP and we've done X, Y, or Z, not just the completion
report inside that PEP, but more like, what did we learn
that is now institutional knowledge for you as an agent, or any other agents, or any of the team members
I bring on later on? Even internal blogs, or what I'm calling builder logs. It's still blog. It's
B-L-O-G-S, plural. But I'm calling them builder logs. So this is more of an internal free-form thing. But just go
and dream about what happened here. And I'll go back and read those with my agent-babysitter hat on. And I learn
with it how to direct it. And so this entire
flow that I've sort of learned as a skill set, I've been calling agent flow. What do you think?
Yeah. I think that fits with the way that a lot of people use AMP actually. I think the more
sophisticated intentional users have a very intentional design process that often starts with
the human generating a lot of tokens describing what they want. And for the
length and complexity of the tasks that you
want it to handle, there's kind of like a linear correlation between the complexity of the task
and how much direction you want to give it.
I want and I want the agent to get as far as it can on its own, then I will do things that are
very similar to what you described. Like, hey, I want you to generate a plan. It's got to have
these properties. Make sure that each step is annotated with the relevant
context files that you'll want to address when you're actually implementing this plan.
And maybe I do a couple iterations on the plan itself, you know, if there's any course
corrections, like, oh, you know, use this technology, not that technology, or use this library,
not that library. And then once that plan is generated, then you have another thread
where it goes and says, like, okay, go and execute this plan, or do, like, steps one through two
of this plan and then let me review it. And so that's a very interesting
potential workflow. There's a certain set of our users that are all about trying to do as
much as possible in a fully automated fashion. And the ones that are trying to do that are
essentially doing exactly what you're doing. They're trying to build this workflow around
the human being the product manager,
describing at a high level what they want, and the agent essentially filling in all the details.
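A plan-first prompt of the kind described here might look like the following; the wording is a sketch, not a template either of them uses verbatim:

```
Generate a plan for <feature>. Requirements:
- Number each step and keep steps independently reviewable.
- Annotate every step with the files/context it touches.
- Note risks and the tests that will validate each step.
Stop after the plan; do not implement until I approve it.
```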
I will say that that is an important set of use cases
and I find myself doing that from time to time,
but there are a lot of other workloads.
I would say, like, me personally,
I run threads using the workflow
that you described relatively frequently,
but not necessarily every day,
because oftentimes, especially now,
a lot of the work I do is more exploratory.
There is a different workflow that ends up being, like, you get less out of each thread,
but you probably end up doing it more frequently, because it ends up being
less mental energy up front, and also a better fit for use cases where you're in more,
like, exploration mode, or you're trying to learn alongside of what the agent is doing.
So, like, there's tons of development tasks where I myself only have a half-baked idea
of what I'm looking for, right?
It's like, oh, you know, maybe something, maybe a sub-agent that, you know,
uses a different search tool would be interesting.
But, like, do I have, like, a clear, exact vision for what that should look like?
No.
Like, I want you to first do, like, a spike.
And so in that case, it's much more informal.
Like, I'd say if you look at most of my threads, most of them don't involve a specific, like,
plan generation step.
It's more like, hey, where's the part of the code base that pertains to this?
and can you give me an explanation
of how it currently works?
And then I have this idea.
Help me think through a couple possibilities
of how I would go about implementing this
and what frameworks I'd use.
Okay, great.
Can you do a quick spike?
Don't implement everything yet,
but just these endpoints
because I want to play around with it.
And so it's kind of like ping pong back and forth
between me and the agent
where part of the side effect
of getting it to do stuff
is that I'm kind of like learning
and acquiring domain knowledge
alongside of it.
So, yeah, like most of my workflows
are probably somewhere between
those two points along the spectrum. It's like either very like, you know, casual, you know,
still thoughtful prompts, but, you know, not following a specific structure all the way to,
like, I know exactly what I want because this is very similar to something I've done before
or, like, the feature is very clear in my mind, in which case I try to, like, front-load more
context and see how far I can get it without any intervention or correspondence from me.
What's up, friends?
I'm here with Kyle Galbraith, co-founder and CEO of Depot.
Depot is the only build platform looking to make your builds as fast as possible.
But Kyle, this is an issue because GitHub Actions is the number one CI provider out there.
But not everyone's a fan.
Explain that.
I think when you're thinking about GitHub Actions,
it's really quite jarring how you can have such a wildly popular CI provider,
and yet it's lacking some of the basic functionality or tools
that you need to actually be able to debug your builds or deployments.
And so back in June, we essentially took a stab at that problem in particular
with Depot's GitHub Actions runners.
What we've observed over time is, effectively, GitHub Actions,
when it comes to actually debugging a build, is pretty much useless.
The job logs in GitHub Actions UI is pretty much where your dreams go to die.
Like, they're collapsed by default.
They have no resource metrics.
When jobs fail, you're essentially left playing detective, like clicking each little drop-down
on each step in your job to figure out like, okay, where did this actually go wrong?
And so what we set out to do with our own GitHub Actions observability is essentially
to build a real observability solution around GitHub Actions.
Okay, so how does it work?
All of the logs, by default, for a job that runs on a
Depot GitHub Actions runner, they're uncollapsed.
You can search them, you can detect if there's been out of memory errors.
You can see all of the resource contention that was happening on the runner.
So you can see your CPU metrics, your memory metrics, not just at the top level runner level,
but all the way down to the individual processes running on the machine.
And so for us, this is our take on the first step forward of actually building a real
observability solution around GitHub Actions, so that developers have real
debugging tools to figure out what's going on in their builds.
Okay, friends, you can learn more at depot.dev. Get a free trial, test it out instantly, make your
builds faster. So cool. Again, depot.dev.
I think what you just described is the exact way I work as well. And the way I would change it, or maybe even
suggest to you, and maybe you do this, is what I call riffing. I'll
riff for a little bit with it, you know, I'll explore, you know. I might ask a question. I might
ask it to help me think through some things. And there's some back and forth
on thinking that I just call riffing. And everybody calls that riffing, right? Yeah.
But at some point, there's some clarity that begins to form. Sometimes it even begins to do some
of the work. I'm like, okay, I kind of like that. But hang on, pause. Don't do that. Now that
we have a little bit of clarity, draft this pep. That way now all this riffing has a place
to put things. So it has a drawer or a folder to put stuff into. Sometimes, even if I have a
half-baked idea, I will still, like, if I'm in flow and I'm thinking
through something and it's working and it comes back, I might just say, quick idea, throw this
into a pep, but let's continue.
That way, it's sort of like a to-do list.
And I'll come back and I review
what I'm calling my peps.
And it becomes like this place where
eventually it has a home
and these peps have statuses.
So I literally just borrowed Python's
statuses. You have final, you've got
superseded, you've got
all these different flavors of statuses.
And I'm like, I've already thought through this all.
Let's just borrow their ways, right?
I love that, yeah.
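A PEP document in this scheme might carry front matter like the following; the fields are one reading of Adam's description, with statuses borrowed from Python's PEP process:

```
# PEP-0031: <title>
Status: Draft | Accepted | Final | Superseded
Context: what we were riffing on, and why
Proposal: the plan, broken into phases
Artifacts: completion-report.md, kb-article.md, builder-log.md
```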
And so I just kind of like keep
these PEPs as, like, this constant...
I don't want to call it a junk drawer,
necessarily, but it's a place to
put something. I'll riff for a bit, I'll explore for a bit,
but the moment there's clarity or any
sort of visibility, I want the
model to kind of put all the
context we've gathered into this thing.
I don't even go back and read it right away,
because I just hope,
I suppose, hope, and maybe this is where I'm
rubbing the
lamp a little bit, hoping the genie does it,
that it puts this idea there with enough
fruition to
put the right things there.
So when I come back to it, I can go deeper.
And we can turn this into a real deliverable or a real spec.
I treat it as exactly that.
Yeah.
Do you commit those to the Git repository?
Are they in some like temporary folder?
I mean, right now it's not open source.
I've wondered about that.
Like should all this be there?
And I feel like maybe, as one of the earliest humans living in this
three-and-a-half, you know, almost three-year-old world where this
new phenomenon is on my CLI, on my command line,
I almost feel a requirement to future humanity to commit this to memory in a way, you know?
So I'm not really sure.
I'm in the blurred line there whether it should or shouldn't.
But I think from my use case, it makes sense to have it all in the same repository,
to have it all in the same place.
My agent flow, the way I describe it at least, puts all this stuff, not in the root, but in a folder called
admin. One, I wanted it to be at the top of the stack when I look at my... I still use Zed. I still use
a code editor, but it's mostly a code viewer for me. I'm editing some things. I'm not doing a lot
of coding in there, but I still do some stuff. It's largely, you know, pushed to the
model, to the agent. I put all that in an admin folder. So you've got admin slash peps, admin
slash knowledge, admin slash builder logs, you know, all these different things there, or even
the role files. These all live in the admin
folder, which is sort of like me
as the
product owner, CEO of this
directory that is now a project,
that's how I look at it. Like, this is all
admin stuff. And I put it there.
Yep.
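Reconstructed from that description, the layout is roughly the following; the exact folder names are Adam's own convention, not a standard:

```
admin/
  peps/PEP-0031/   # proposal, completion report, other artifacts
  knowledge/       # KB articles: how and why we use things
  builder-logs/    # free-form journey notes per task
  roles/           # role prompts, e.g. go-style-reviewer.md
```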
Makes sense. I like that.
I like having kind of like a, it's almost like an audit log
for how the code was generated.
Absolutely. I feel like
that might, it might be an area
of untapped potential there
because I think one of the things
that would be helpful, especially if you're going back and reading it later, or if someone
needs to review that code, is to see the kind of audit trail of, like, oh, okay, this was the
plan that generated this code, and this is what the high-level intention was. If anything,
that's almost, you know, if you could snap your fingers and have that included as part of
every, like, pull request, you almost want to do that, because it gives you just...
I feel like a lot of time in code review is often spent trying to piece together what the high-level intention was, especially if the person on the other end, you know, underspecified what they're trying to do.
And so you almost have to play like archaeologist a little bit and trying to like piece together what the high level intent was and where to start.
And if you get access to the plan, then it kind of gives you a pointer right into what parts of the code you should read first.
Well, I was moving so fast, I was like, I've got to capture this exhaust.
I don't want to slow down or read it, but I want it to get created.
So when I can slow down and think about some of these things, it's there.
And so maybe the way I looked at these builder logs, essentially, was that they were the what and why.
You know, the long-term knowledge, the institutional knowledge, went into a knowledge-base thing.
And they may be connected, but the intention is different.
In a builder log, I kind of want the agent, the developer, to tell me what they encountered along this journey.
What thorns did they get stuck on?
Where did they get cut?
Where do they apply a Band-Aid?
How do they apply whatever?
And what are they trying to do?
What were the blockers?
What were the learnings in there?
And some of that stems back into maybe a knowledge-base article.
The knowledge-base article is more like how to use it, why we use it this way, kind of thing.
Whereas the builder log is more like the journey of the developer during
implementation: what are all the things that you encountered there? And honestly, it's been
paramount, I think. I don't want people to think that I'm sitting here, like,
literally glued to my screen, watching this thing constantly. This flow lets me become a
babysitter. It lets me hang with my kids while doing this in the background. My computer's
over there. I go check on it every 10 or 15 minutes. Push yes, push no. You know,
commit that to memory, kind of thing. You know, I can be making dinner out in the
backyard, barbecuing, whatever, you know, and still babysit this thing with this kind of flow. So it's
sort of allowed me the ability, when I want to, to keep making progress, because I feel
like this stuff is just so, I don't want to call it addictive, but tantalizing. Like, I've never been
able to do this kind of thing at this level, a 10x, 20x productivity level, ever in my
life, right?
Yeah, it's absolutely incredible. And I think one of the first, like, wow
moments, or points of realization, for me with AMP too, this was very early on,
back in probably April or something. I had one of these weekend
experiences where I was taking care of our kid. We have a two-year-old at home, so he's
running around. He's an agent of chaos. And so it's very difficult to be in the same
room with him and do anything for longer than, like, two minutes at a time. And I was literally
running back and forth between playing with him and trying to get the spike of this
feature up and running. And I ended up getting to a point where I was like,
oh, this is actually good. And I'm going to push this, like, over the course of a Sunday.
And it's like, without the benefit of this tool, it would not have happened.
Because, like, in order to write anything non-trivial, you need kind of like that mental space to get into flow and to, like, think through the things and, like, page in all the context.
And every interaction, like, resets that.
Whereas with the agent, if the thing that you're doing is, like, within the capability of the agent and you're able to remain at that high level, you can absolutely do this, like, async thing where it's like, okay, like, let me tell you what I want you to do.
Let me give you some detail.
let me maybe give you some pointers for how to validate it
and verify that you're doing the right thing
and then I'll fire it
and then I'll go and like do something else
and then, like, AMP has a built-in notification sound,
it kind of, like, pings you.
It's like a sonar sound.
And it's almost like Pavlovian now
where like I will hear it and I'm like
oh like it's time to go and like check on
what the status of that thread was
and we'll go see what the agent hath wrought
and see if it's good
or if it needs more adjustment.
I found that AMP is the one that I can most easily set loose on a task.
I told you I've described my pep ways, you know,
and so everything that is long running is not me prompting it necessarily.
It is, you know, this riffing to a pep that has clarity,
it's got, you know, conviction, it's got pros and cons.
It's got whatever context is necessary.
Even if my team was formed with humans, it would be, you know, human-readable. It's
English. It's not some random language that is hard to understand. It's
just a normal way to do it. I find that that methodology lets me not have to keep the
spaghetti together in my brain. Yeah, it all kind of falls in there, and it doesn't really
matter what it is; it's just in that flow, and it just sort of works.
Yeah.
I like that.
Yeah, I appreciate you saying that.
I think, you know, I don't know if we have, we definitely do not have like one weird
trick that makes it work better in certain cases than others.
All I can say is that we are very heavy dog fooders of our own product.
Everyone on the team is using AMP to write probably like north of 80 to 90% of the code
that they generate.
And as part of that, there's just a lot of iteration that we do on the prompts, on the tool definitions, on the way that subagents combine that I think has compounded over time, you know, every day and week of development usage, has led to an experience where it is able to get farther than some of the other coding agents out there on these long running tasks.
One thing I've done recently, and we can keep going deeper on how we use it if you'd like to, I'm having fun with this, is I told you about the roles.
And so, rather than just defining a good Go developer role, let's just call it, right?
Or an "I need a Linux platform engineer" kind of role.
Yeah.
I got into this place where I'm like, you know what, I'm just tired of repeating myself the same things.
It's kind of got a little boring.
So I was like, well, let me create a product manager.
And so now I've taught my product manager what I think I know, at least the way to prompt it.
And it came together, at least in a long-running PEP where we had a clear goal; it had multiple phases.
Yeah.
And again, I'm not trying to be glued to my screen.
I'm not trying to be stuck to this black mirror.
I'm trying to walk away and do my life or move to a different tab and keep doing my real job, which is not writing software at all.
Like, my job is not writing software, necessarily.
It is talking about software or, you know, defining relationships or nurturing relationships, or all the things I do, you know.
And so when I had defined this product manager role, especially in this long-running PEP, I would have the product manager review the output from the other agent.
So I had two agents going at once.
In this case, one of them was AMP and the other one was Claude Code because that one's cheaper for me.
You're really expensive, by the way.
Well, we do the usage-based pricing thing.
I know.
I know.
It's still, it's costly.
We can talk about that.
I think it's costly.
And that's okay.
I'm not fouling you for it.
But AMP is, I think, I'll pause here and just say,
I think AMP's one of the best, if not the best agent out there.
And I think your cost is justified.
I just don't like it.
So.
We wish we could offer intelligence
at a cheaper rate. I think certain things are just, you know, dependent on the, you know, the models,
where they are, what size of model you need for a certain level of intelligence and the cost
of GPU inference. One of the things that we haven't done that others have done is price
amp at below cost. And, you know, that's for two reasons. One is we are in this for the long
term, we've been building developer tools for more than the past decade. And so we want to
build a sustainable business around this. And with a lot of the other tools, I think
there's this notion of, like, hey, if we can sell a dollar's worth of tokens for, like, 70, 80 cents
to grab market share, and that somehow that will mean we can lock in users to our coding
agent. But I very much don't think that's the case. Coding agents have very little lock-in.
It's very easy to switch and try, like, a new tool.
It's so easy, yeah.
And so I don't think it's a good business decision for us.
And two, it also introduces a perverse incentive.
So you brought up like model quality degradation earlier in the conversation.
And one of the things that we never want to be in a position of having to do is nerf the models that we're using because we need to keep the costs under whatever flat-rate price
most of our users are paying.
So I think that, look, you're right.
Like, all things considered cheaper is better.
I definitely feel the same pain as a user.
But at the end of the day, I think we're making the right trade-off in terms of what's
sustainable and what incentivizes us to build the best quality.
And I think the real trade-off here is not, you know, whether you're paying X or Y dollars
for the coding agent; it's how much
time you're saving as a human, and what more you can build. And if that's your
barometer of comparison, then the difference in cost of agents is really like a rounding
error in terms of the time that you're saving and the additional value that whatever
your building can bring. Provided what you're building is something that you can put
an economic price tag on. Yeah, I think my frame of reference, and maybe even me justifying my comment: if I were a business that employed multiple people with multiple salaries, and I could augment them or enhance them, then yeah. In that case, I'm already paying large salaries, and so
this is a nominal additive to that existing metric. In my case, I'm just tinkering. So it's expensive for tinkering. The tools aren't open source yet, but I plan to open source the tooling. I'm just literally trying to play. I'm trying to explore, trying to learn.
I wholeheartedly believe what you said earlier, which is that playing with command-line-level agents like AMP, like Claude Code, like, you know, Codex, et cetera, is a skill set to be learned and flexed.
If you throw it five words because you think it knows everything, you will get a website that doesn't do what you think it should do.
It won't be using the language that you should be using.
It won't have the deployment target you think it should have.
you need to be specific just like you would
with any other team member
like, the context is still key
and I've actually learned more about
how I think I could be a better human
by how I interact with these machines
and I'm just going to call them machines, that's just the simplest term to use. Because you realize how much... I've always harped on the idea of expectation and clarity. I can't fault Beyang for not doing X if I haven't described X to you. I have to describe X to you so that you have clarity on what I expect from you. And if we have an agreement on that clarity, my expectation is for you to understand that agreement and come back with something that meets it. But if I haven't given you that, how can I expect you to come back with any version of what I want, if the clarity hasn't been met?
So that's how I've tried to be, I think, a better human in my life: having that understanding. But I think it's become even clearer to me how important that is, how important context is, because, just like with frame of reference here, I called you too expensive. Naturally, you got defensive. It's good. You should do that. You defended against my fact, which is a fact to me. But then I clarified that my frame of reference is that I'm not already paying developers, and so this is a cost center for me, because I'm just trying to play and tinker. So, just to give that context, I think it's taught me how important context is in all relationships, whether machine or not. In all aspects, how important is clarity? If something is not clear, how can you come back to me with frustration, anger, disappointment, if the expectation of what you wanted from me was not clear and we didn't have some baseline of an agreement? That's what you're asking these machines to do, right? You're asking them to agree to make you productive. Agree, I mean, theoretically agree, right? Me opening, you know, firing up AMP in your command line terminal or whatever, that's the agreement, right?
Yeah, I kind of pause there because I just feel like we went on a little bit. But the frame of reference really is how I leverage AMP, the way I think of AMP in my own workflows. I don't have the cash flow, or the cash, to make you my daily driver. But what I can do is, when I know I have a longer-running, really important thing that requires deep, very good execution, I will draft a very clear PEP, as I've already described to you, and I will set AMP loose on it for a little bit. Not the whole thing, or I'll define it very clearly. That's where I'm leveraging AMP personally, because of those reasons.
Yeah, well, first of all, I'm curious as to, like, how you're prompting AMP, because
so, you know, it's one thing if you're using AMP heavily and you're, like, coding, you
know, a full work day every day of the week.
In that case, like, if I look at my usage, I'm probably racking up maybe on the order
of, like, low hundreds of dollars per month.
But if I look at just my weekend side projects, for me at least, it probably comes in, like, sub-$100.
You know, in some cases, like, if it's something simple, it's like, you know, a couple dollars here and there.
It's like, for the price of a coffee, I could get a mini app that scratches an itch.
And so for me, it's like, yes, like the, if you use it heavily, at some point, the, the token cost will exceed whatever, like, you know, flat rate folks charge.
But in terms of, like, what you're getting out of it, I feel like I'm using it.
I use AMP on my personal credit card
for personal side projects
and it's not really like an issue for me
it's like for the price of eating out
that probably covers like
a month's worth of personal usage for me
but I do see like some users
I will say this like the way that people
prompt and the behavior patterns
around when people start new threads
is very different
and we almost want to put out, like, a guide for how to use tokens more efficiently.
Here's one thing I've noticed.
So if you're talking to, like, very senior engineers on our team, and you look at their threads, most of their threads err on the side of being very targeted and, like, short and sweet.
You know, not like trivially short, like not like one message and then you're done.
But it's like I want to do a very specific thing.
Go and do it.
Okay.
On the next one, new thread.
because it's a new task.
Or, you know, like you implemented step one of the plan,
now you're done, step two, new thread.
What I noticed among the, like, people coming from non-technical background,
so we have a ton of people on our go-to-market team, for instance,
who are building, like, side projects and, like, games with AMP.
I look at their threads, and it's like so-and-so,
you got, like, 200 messages.
It's like a thread that's, like, 200 messages long.
You fill up the context window.
You know, now the underlying model supports a million tokens of context. And yeah, we have prompt caching and all that, so it keeps the cost low, to a point.
But it's like if you filled the context up to the point
where you're occupying hundreds of thousands of tokens,
first of all, like each incremental request,
you're not going to get the highest quality
because there's a lot of stuff in the context we know
that could confuse the model.
And you're paying the cost of like consuming as input
the entire previous, you know, 100 messages
in the thread.
And at that point, you almost want to ping people and be like, hey, you know, I know that this may be the thing that feels most natural to you, but I would actually recommend you treat threads as these sort of one-and-done, like, rip-off notes, you know, rip them off frequently. You don't need to build a whole app inside a single thread. In fact, I would probably recommend against doing that, because you will get lower quality, higher latency, and more cost if you do that.
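To make that cost intuition concrete, here's a rough back-of-the-envelope sketch in Go. The numbers are illustrative assumptions, not AMP's actual pricing: roughly $3 per million input tokens, and each new message re-sending the accumulated thread as input.

```go
package main

import "fmt"

func main() {
	// Illustrative assumption, not AMP's actual pricing:
	// ~$3 per million input tokens.
	const dollarsPerMillionInput = 3.0

	// Each new message re-sends the accumulated thread as input tokens.
	inputCost := func(contextTokens int) float64 {
		return float64(contextTokens) / 1e6 * dollarsPerMillionInput
	}

	longThread := 200_000 // a thread that has nearly filled the window
	freshThread := 5_000  // a short, targeted thread

	fmt.Printf("long thread:  ~$%.3f of input per additional message\n", inputCost(longThread))
	fmt.Printf("fresh thread: ~$%.4f of input per additional message\n", inputCost(freshThread))
	// Prompt caching discounts re-read tokens, but the ~40x gap between
	// 200K and 5K of context carries through to every later message.
}
```

Under those assumptions, each extra message in the long thread costs about $0.60 of input versus about $0.015 in the fresh one, which is the whole argument for ripping threads off frequently.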
Yeah, I would concur emphatically.
Please teach people how to be more efficient because, you know, as you described before,
it's the skill set you learn.
So I'm learning.
And the whole even reason why I'm even playing with it is not because I'm trying to really build a bunch of software.
I'm just trying to tinker.
I've got, you know, one particular itch I'm scratching here. It changed.
We obviously produce video podcasts, right?
Like it's video first.
It's on YouTube.
The weight of media is very heavy.
When we were audio-only, it was maybe a four-to-six-gig project.
When we moved to video, it was easily 15, 20, 25 gigs for every episode as the normal.
And then, because we have a desire to keep the long life of these projects, we have a full system to do all this. Yeah, and this goes back, I mean, more than a decade. We've been doing a version, an iterative version, of this practice, of what we're doing today. Every episode has its own directory; it's got its own workflows and statuses, etc. But when the show's done, by and large it's done forever, for the most part, unless we go back to it to remaster it, or unless we go back to it and reference content within it, in which case we want to pull that original source and extract it as a tool would let us, like Adobe Premiere or Adobe Audition.
Those are the two tools we use in our kind of primary workflow.
So I use a technology called 7z. It's a pretty well-known archiving tool. It's an archiving format that I've been using, and it generally shrinks the things we do by around half. So if it's a 20, 25-gig thing, it's a 10-to-12-gig artifact long-term. And I will compress the entire thing.
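As a concrete illustration of the kind of smarter wrapper on top of 7z he goes on to describe, here is a minimal Go sketch. The 7-Zip flags (`a`, `-t7z`, `-mx=N`) are real, but the media heuristic and everything else is a hypothetical stand-in, not the actual seven-z-arch logic:

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
	"path/filepath"
	"strings"
)

// chooseLevel is a hypothetical heuristic: already-compressed video and
// audio gain little from -mx=9, so fall back to a fast level for those.
func chooseLevel(dir string) string {
	var media, total int
	filepath.WalkDir(dir, func(path string, d os.DirEntry, err error) error {
		if err != nil || d.IsDir() {
			return nil
		}
		total++
		switch strings.ToLower(filepath.Ext(path)) {
		case ".mp4", ".mov", ".mkv", ".mp3", ".aac":
			media++
		}
		return nil
	})
	if total > 0 && media*2 > total {
		return "-mx=1" // fastest: the payload is mostly compressed already
	}
	return "-mx=9" // maximum compression for project files, WAVs, etc.
}

func main() {
	dir := os.Args[1]
	// "a" adds to an archive; -t7z selects the 7z format.
	cmd := exec.Command("7z", "a", "-t7z", chooseLevel(dir), dir+".7z", dir)
	cmd.Stdout, cmd.Stderr = os.Stdout, os.Stderr
	if err := cmd.Run(); err != nil {
		fmt.Fprintln(os.Stderr, "7z failed:", err)
		os.Exit(1)
	}
}
```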
For a while there, this thing was just a simple bash script, meant to be a smarter thing on top of 7z, because I forgot how to use it and I wasn't going to keep going back to the 7z documentation on how to use it. So I just simplified it by writing a bash script, like any hacker, right? And then over time that bash script got more and more sophisticated. And then about maybe five weeks ago or so, we were on a podcast talking about Claude Code, and
I was like, you know what, I haven't played with that enough yet.
I've been largely in this hype cycle, not really playing with it much.
And I thought, okay, let me open up Claude Code right now and see where I can set it free real quick on this call.
And I said, just improve this bash script.
And in the moment, that's all I said, like four or five words.
It was not: make this amazing, here's the CLI I want you to build, here are all the flags I want you to implement. None of that. It was just: improve it. Show me how I can improve this thing.
And I was just flabbergasted with how good it was, how much improvement there was to it.
And I was like, oh, my gosh, I have missed, I've been missing out.
I have got to go as deep as possible on this thing because this is a crucial tool for me.
I use it on the daily, if not, you know, several times a day.
I'm archiving, moving things for, for us and whatnot.
And I don't like to spend a lot of money on hard drives.
So I want to compress and store forever.
And so we have a TrueNAS server here on our LAN.
All that goes over there.
We have this whole entire flow.
And so this bash script really has been my itch. And so I've now developed a tool called Seven's Arch, or Seven Z Arch, whichever pronunciation you like, and it is now my brains on top of 7z. It is sophisticated enough for me to point it at a directory, examine it, know its media, and set the compression algorithm to be the best fit for what's in there. Nothing like that existed in my bash script. And so because of these agents and what I'm doing, I'm able to now make that thing what it is. So those are the things I'm using it for. It's not like I'm an everyday software developer; it's those kinds of things. Back to "help me be more efficient": my gosh, yes. I've shared with you my exhaustive workflow. I think I am using way too many tokens. I'm definitely not treating threads, especially with AMP, like throwaways. Because,
in my experience, when the context window collapses, the thing forgets everything. And so I thought, as a user, I need to keep this context window as clear as what I'm working on, because if not, it forgets. It doesn't know how to be what I've asked it to be for this task. And I kind of get lost in it. It doesn't know what to do.
So shoot me some of your threads. I think one of the cool things about AMP is you can share your threads with other people. And so oftentimes we get users sharing their threads and saying, hey, you know, this is how I built this, or, hey, can you help me improve and understand what I could do better?
Yeah.
I would love to take a look at how you're using it and maybe give you, like, best practice
tips from what we've learned.
For sure.
I will say, like, my intuitions are almost the inverse of yours. If anything, starting fresh is great, because you know the context window is clean. The more things that you accrue in the context over time, the more possibility there is for confusing the model. And so, like, the quality... back when the context window limit for Claude Sonnet was around 200K, I think one of the things we noticed was you started to notice a little bit of degradation actually around like 70K, and then a much steeper degradation past like 120K tokens. And, you know, it was
always one of those things where it's like, it was vibe-based. So we never built that directly
into the product saying like, oh, don't go past 120 because, you know, sometimes, you know,
there's a legitimate use case that does involve, you know, take advantage of a lot of the previous
context. So it's a matter of user taste. And, you know, plus we're all still like learning in
real time, so we didn't want to be too prescriptive. But I think maybe, you know, now we're at the
point where we've seen enough where we will be offering some more visual indicators around like
how much of the context has been used and where the sweet spot starts to end for some of these
models. Now it's like you can go up to a million. And I think the quality has gotten better
beyond 70K now. But it's still to the point where, all things considered, I prefer the rip-off-note approach: let's start a fresh thing.
And it's almost like the context that you wanted to remember,
hopefully is captured within the files that it's modified.
So it's like, you know, make this edit to the code.
Now it does this thing.
Okay, now let's start a new thread and say like, okay, okay,
now it does the thing that I just asked it to do,
but now make this other modification.
It's almost like, you know, as a human, when you git commit, you're kind of saying, okay, I'm committing this, this is one atomic change, and now let me start from a clean slate to do the next incremental thing, where I don't have to worry about all the other stuff, all of the other changes that I made to do the previous thing. It's sort of like
mentally clear to have the separation. I think, you know, the rough analogy applies to agents
as well, where if you want to be like clear and precise with what you're doing, it's,
It's almost better to have this habit of starting a new thread for each sort of isolated or targeted task that you want to accomplish.
Yeah, this is definitely something that I think needs to be more clear, because obviously, for me, it was not.
And here's me, you know, calling you out on the podcast, telling you you're too expensive.
And here's me, the user who doesn't understand how to use your thing.
And so it's not really expensive.
I was using it in an expensive way.
I was using it inefficiently.
And I would say that, like, first of all, like, we're building the product.
We need to build a product experience that makes sense to users.
And so it's like never the user's fault, right?
At the same time, I think what...
I would say this is definitely your fault.
I was joking around.
Yeah, it's definitely our fault.
At the same time, the tension that we notice is that, again, like, coding with agents is a high-ceiling skill.
And so there are legitimate use cases that do involve filling up the context window.
And so what we don't want to do is we don't want to be overly prescriptive.
Because again, the persona that we have in mind when we're building AMP is really ourselves, as a proxy for a professional engineer whose job it is to code every day and use this stuff frequently, as the main way to generate the code they produce.
And there is this tension where the more things you build into the agent, the more things you build into the UI that are prescriptive, like "you should do it this way or that way," the more it runs in direct tension with the desire to create a power tool that lets people use it in the way that they want to use it.
And so it's kind of this constant struggle where like we do see a good amount of people
using it in ways that, you know, we don't use it ourselves internally. Some of them are like
weekend vibe coders, and it's, you know, they're getting value out of it in the way they're using
it. It's not the way that I would use it. And, you know, I think we'll put out some, like, blog
posts to say, like, here are the general best practices that we consider are good, but we also
don't want to overly constrain people because, one, like, there might be a power user out there
that's getting value out of it
in a way that we don't realize.
And two, this is all still something
that we're learning in real time, right?
Like, the era of coding agents,
I think we're like, what, like six, seven months into it?
And so the last thing we want to do is,
you know, like guardrails are good,
but guardrails can also yield blind spots
as the technology evolves.
And the last thing we want to do is
like obstruct the view of our user base,
or ourselves, from unlocking a usage pattern that is really powerful, that you don't see because the product is telling you to do something that is, like, my personal best practice, but may not, you know, generalize to all the different users and code bases out there.
Yeah. Well, I think, as a response to that, one thing I would suggest is: definitely don't change the way AMP works. I would say, if anything, add maybe a slash command that is not so much buried, but enabled if a user wants to go and read documentation and do an alternate version. So I would keep things the way they are. But I would definitely educate on how the context window works, because that is something, to me... I can assume, as a, you know, smart person, I can begin to assume, but I have no idea. And as we just described, we're all learning these things. And this is a new, burgeoning thing, so we're sort of all learning on the fly.
This is definitely a skill set to be developed, but I have no idea how the context window
works.
I don't know how to leverage it.
So when you just said, there are times whenever filling the context window makes sense,
I don't know when that makes sense.
I don't know when it makes sense to thread and rip it off and start new.
I don't even know how to create new threads.
I don't even know how, aside from maybe, you know, back at the literal command line inside of AMP. And Claude has this too, where you can resume and you can continue and stuff like this. So this is not an AMP-only thing. This is, you know, kind of an agent-level, you know, common thing. Maybe threads is something unique to you, and the way you can share them and stuff like that. But I'm not ripping off my threads and starting new. I don't even know how to create them. I don't know how to list them. I do know how to see the ID of them. I assume I have to go and copy and paste it, and continue it if I want to. So there's no CLI or TUI for perusing my threads and doing things, at least in, like, a command line.
So this is just early days for you.
I'm sure you'll build that eventually in there.
But that context window and how to leverage threads is black box to me personally.
Yeah.
No, I totally get that.
And it's something that we've been actively thinking about for a while now.
I think we're one of the first to ship the context window meter.
So just like showing you how much of the context window has been filled up.
That's like an important indicator.
I think now I've seen it pop up in a couple of other coding agents.
But that to us is like it's one signal that's useful to look at.
Because the higher that percentage goes, the more costly each additional message becomes.
You take a latency hit as well.
And you start to see quality degradation at some of these key falloff points.
I will say, for the folks listening: my personal best practice for using agents, the analogy that I draw when explaining this to people, is that the context window is sort of analogous to the human brain's working memory.
So, you know, there's the old adage that you can't think of more than, like, five or seven things; I forget what the exact number is. But there's only a certain number of distinct concepts that you can hold in your brain's, kind of, L1 cache, or working state, that are immediately, instantaneously accessible at a given time.
And the context window is essentially that but for LMs.
So the more you try to cram in there, at some point, it's got to pick and choose which pieces of that it's actually going to pay attention to.
And then it has to effectively ignore the rest.
The more you cram in there, the more chance that it will become confused, because the salient piece in the prior thread history is not actually being attended to when it responds to the next request from the user. And this is not something... it's not just long-running threads where this is an
issue. It's also an issue with things like MCP servers. So like MCP servers are, MCP is like very
hot right now. And I think it's great that this protocol, I mean, we partner with Anthropic
on the initial kind of versions of this. I think it's a great protocol. And it's great that it's
gotten basically everyone in the world thinking about, hey, how can I, how can I, you know, any service that
exists, how can I build an MCP server
to provide tools and context
to large language models? Like that
aspect of it is awesome.
But I think that there is this kind of like
learning curve you have when it comes
to MCP servers where a lot
of the MCP servers in existence
now weren't written intentionally
and they weren't written for a specific
application. As a consequence, a lot
of the tools that they inject are largely irrelevant to the task at hand. And each additional tool definition becomes a chunk of context that gets placed in the context window.
So the last thing you want to do is you don't want to turn on like a dozen MCP servers,
each of which injects a dozen or two tools, tool definitions, into the context window.
Now all of a sudden you're talking about like 100, 200 tool definitions that off the bat,
you're paying in terms of token cost, latency, and degraded quality if they're not relevant
to the thing that you're trying to do.
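To put rough numbers on that: an MCP tool definition is essentially a name, a description, and a JSON input schema, and all of it lands in the context window on every request. A hedged sketch in Go; the per-tool size and the counts are assumptions for illustration, not measurements:

```go
package main

import "fmt"

// ToolDef mirrors the general shape of an MCP tool definition. Each
// field is serialized into the model's context on every request.
type ToolDef struct {
	Name        string
	Description string
	InputSchema string // JSON schema; often the bulk of the definition
}

func main() {
	// Assumptions for illustration: ~150 tokens per tool definition,
	// a dozen servers, each injecting a dozen tools.
	const tokensPerTool = 150
	const servers, toolsPerServer = 12, 12

	tools := servers * toolsPerServer
	fmt.Printf("%d tool definitions ≈ %d tokens of context on every request,\n",
		tools, tools*tokensPerTool)
	fmt.Println("paid in cost, latency, and attention before you ask anything.")
}
```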
Yeah.
Babysitting is getting harder.
I'm telling you.
I mean, this all leans back to... I mean, it makes me rethink my agent flow model, not in terms of changing it drastically, but more like, okay, I have to be more efficient with prompts. I have to be more efficient with maybe a role definition, for whatever the role might be, because it seems like that's kind of token-heavy. And now it seems to be more clear. I literally had no idea how the context window worked, beyond knowing
what it does until this conversation.
And I didn't think about it like that,
that the more you have in your prefrontal cortex or that L1 cache, like you mentioned,
yeah, you're going to be confused.
They call it focus for a reason, right?
Take your to-do list, pull one off, focus on that.
So that's analogous to a thread, right? A new thread: focus, context, which role (in my case, or somebody else's case), which PEP. Execute. Done. Rip it off, new thread, clear context. To me, now that makes a lot more sense.
I would love it if you can educate the world more on this context window, when and how it works, because I've spent too much money with you.
Yeah, we'll put out more material. Actually, just as you were saying that, another analogy comes to mind, beyond the human brain analogy: there's an analogy that can be drawn between threads and functions.
So, like, if you're a developer, you're writing code.
You know it's an anti-pattern to have, like,
a thousand-line-long main function,
or like 10,000 line-long main function, right?
Because there's just too much to think about.
There's too many ways in which those pieces could interact.
And so what do you do? You decompose that long main function into various sub-functions, each of which, hopefully, is no more than a dozen or a couple dozen lines long.
And that helps you encapsulate things
so that you can reason about it.
I think the thread is essentially
the agentic analog to the function call.
And so, like, you can make agents do very powerful things
with relatively short threads
using the flow that you describe, right?
Like, if you have a planning thread
that generates a plan,
well, to generate the plan, you know,
often doesn't require filling up
the entire context window to generate a high-quality plan.
But once you have the plan,
you don't have to use the same thread
that generated the plan to go and implement the plan.
The plan is an artifact
that you can then feed in as an argument
to another thread to say, like,
okay, go implement step one or two or three of the plan.
And that thread doesn't have to know
how the plan was written.
All they need is the plan,
because the point of the plan is that,
you know, once you have the plan,
that's what you're following.
It stands alone, yeah.
Yeah, exactly.
So maybe that's like another analogy
that could resonate with a developer audience.
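One way to render that analogy as code: a hypothetical Go sketch (the function names are illustrative, not AMP's API) where the plan is the only value passed between threads, just as a return value is the only thing passed between functions:

```go
package main

import "fmt"

// runThread stands in for one agent thread: fresh context in, one
// artifact out. Purely illustrative -- not AMP's actual interface.
func runThread(prompt string) string {
	return "artifact produced for: " + prompt
}

func main() {
	// Thread 1: planning. A short thread whose output is the plan.
	plan := runThread("draft a step-by-step plan for feature X")

	// Threads 2..n: implementation. Each starts clean and receives
	// only the plan, not the planning thread's full history.
	for step := 1; step <= 3; step++ {
		fmt.Println(runThread(fmt.Sprintf("implement step %d of:\n%s", step, plan)))
	}
}
```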
For sure, for sure.
Well, I've enjoyed that part of the conversation.
I know that while I didn't plan to actually go this deep on how to actually babysit an agent,
maybe we could talk about what it takes to raise an agent.
I love this podcast y'all did.
I think, if I understand it, too, the way AMP came about was, I don't want to call it accidental, because I think you all are very, very deliberate with what you do. But it seems like, in my understanding, and you can clarify this, Quinn, your co-founder and CEO, had this idea, and I think Thorsten, Thorsten Ball, a friend of ours as well, over a weekend put it together. I'm loosely paraphrasing what I thought was the inception of AMP. What has it been like to go from there, if that part of the story is true, to raising an agent, building a podcast around it? How has this new journey been for you all? Because I feel like, for me, you know, something I've been preaching
to a lot of the different brands we work with is that, you know, I love it that you sponsor
a podcast. That's amazing. Don't stop doing that obviously because that's what sustains our
business. That's what makes this show even stay here right now. But I think that you all should
have your own channel. You should be creating your own content. And that's kind of what you've done
with Raising an Agent. Can you talk about the inception of AMP and Raising an Agent, what you all have been doing around media and creation and software and the core team, et cetera?
Yeah, so the inception of AMP is basically as you described it.
It was, you know, Thorsten rejoined the team, and he came in and was trying to figure out what to
work on.
And I think we'd been talking about wanting to do more agentic stuff for a while.
You know, I think like the first week he was back, he tried, like, basically I told him,
like, hey, can you go and take some of these principles and try to make Cody do more agentic things. And then he
did that for a week and basically
came to the conclusion, which, you know, I shared
at the very beginning, which is like there's so
many design constraints that
assume, you know, certain
things about the model that
it feels like you're going
against the grain of what everything
else in the application wanted to do.
So essentially he and Quinn
went off and
basically spun this up.
It was like, let's just, you know, do
a spike and see where it goes.
And that first iteration, I think he's got like insanely great taste and great intuition for how to build quality software that wraps new technology in a way that is tasteful, but, you know, unleashes the power of the underlying technology while keeping the user experience really good.
And then it just sort of like compounded from there.
It's like he built that and folks started trying it out internally.
I tried it out internally.
I was like, wow, this is really powerful.
like that first moment where it's like,
hey, go and change the background color
of this random component.
And then it goes and finds like the exact component
and makes it and you didn't do anything past that.
It was like one of those like holy crap moments.
And what we quickly realized is how many of the things we had to kind of rethink from first principles. Like, I think any time a new, amazing, disruptive technology comes online, you really have to force yourself to think from first principles, because a lot of the rules of thumb and best practices that you've imbibed over the years were based on assumptions that may or may not hold true any longer.
And so part of that was, okay, we've got to go back to first principles and learn what the models are capable of and what they can do within an application scaffold that we're building in real time. We're going to have to relearn this in real time.
How can we do that in a way that shares those learnings
with people using what we're building
and also engages people in a way
that gets them to share feedback back to us?
Because we don't have a monopoly on insights.
I think everyone recognizes that we're all kind of figuring out
what agents can do in real time.
And so we've learned a ton
from our user community
in terms of how people
are using it. You know, the workflow that you just described, the agent workflow, the agent flow
workflow, that's something that we, or I at least, heard from some of our users for the first
time. So, like, that's not something that I came up with on my own. It's like some of the users
who are using it heavily reached out and said, like, hey, we're using it in this fashion. And
then I was like, oh, maybe that's something I could learn from. But a lot of those people,
they reach out because they listen to this podcast and they see that we're kind of like openly
sharing
what we're figuring out in real time.
And it's a very raw, unadulterated podcast.
It's literally like two,
sometimes maybe three people
on the AMP Corps team
just like riffing about some of the topics
that we've been thinking about
recently, some of the challenges,
some of the idiosyncrasies,
some of the weird things that we've come across,
and some of the insights.
I think the most recent episode
was one we did on
evaluating different models and trying to plug them into both the main agent and various subagents and
things we noticed around that. And so it's not a polished production at all. I don't think we'll start our own channel around that, just because it's very ad hoc. There's no regular release schedule; we only put out an episode when we've shipped something or we've noticed something that's particularly interesting. Then we'll get together the relevant people.
And it's like, okay, let's riff for an hour on that and talk about it.
So it's not like a, there's no regularly scheduled programming.
It's more of just, hey, if we learn something cool, let's share it with others and use that as a way to get more smart people who are very on the forefront to figure out what agents can do, you know, talking to us because there's a ton to learn from out there.
This most recent episode you're mentioning was episode 8, if I'm correct.
Yeah.
13 days ago.
Yep.
You sat down with Camden and other team members on the AMP team.
Currently sitting at 674 views.
I would call that not enough, in my opinion.
And it's not your fault.
I don't think it's your fault.
I think if you keep doing this, you'll see the dividends get paid.
But, you know, I look at what you are doing, and I can't tell if it is or is not, you know, weekly, biweekly. I think, by and large, with what we've called a podcast, there's still this... what we do here, The Changelog, is a podcast.
There is a rhythm to it, Monday, Wednesday, Friday.
There's a rhythm to it.
But I think in a brand world like yours, what I call brand world, I don't think you have to be that way. I think YouTube changes the model for you; that can be your first home. This is kind of like some real-time advice, by the way.
Yeah, I know. I appreciate it. I think there are certainly things we're going to be doing better on the awareness front, for sure. Yeah. I think you're doing it right, though. So I don't think you
should change much aside from maybe give it more of a first-class citizen structure in terms of
what you're doing, but don't feel like you have to be there on the weekly. Don't feel like you
have to be there on some sort of ceremony. Also, don't take three months, but don't feel like you have
to abide by this weekly, must-show-up, must-podcast ceremony because the audience expects a release. I think there are a lot of things that you and others are doing inside of organizations that need some cycle, a content engine, but it doesn't have to be the Raising an Agent content engine. It can be Sourcegraph or AMP at large, things that you should be talking about as a leading brand on this frontier. And the last time I checked, YouTube is not charging you any money to publish, right?
So it's free.
Yes.
The only cost to you is your time, the literal cost to produce it. It is otherwise free to you to produce and publish content on YouTube.
Yeah.
I would say, if there's anyone listening to this that is good at just handling all the publishing stuff: it literally is basically just me or Thorsten or someone else on the AMP core team recording and editing and pushing these things out. And so I feel like if we had a halfway decent person who was just like, "okay, you guys just talk and we'll handle the publishing and the editing and all that," that might help us release on a cadence. But you're right. Like, you know, as a business, we should probably be more intentional about that. For us, it's just a fun way to engage users and people using coding agents at the moment, and, yeah, we could probably be getting more juice out of those conversations.
I will say that, like, you know, for us, the quality people who, like, listen to it and then reach out, and they're like, oh, hey, you know, I heard you guys talking about this.
Like, that in itself has been worthwhile so far, but yeah, yeah, I don't know.
We absolutely should be more intentional about getting it out to a bigger audience.
I don't think you should feel bad about that, really.
My advice is not to shame you by any means or to make you feel like, you're doing great.
Don't stop doing that.
The reason why you show up and it's fun is what makes it enjoyable to watch.
So don't change how you assemble the team to sit down and have the conversation.
Don't change the fun aspect of it.
I would just treat some of the production and orchestration and timing a little more carefully, because Sourcegraph deserves it. And I think if you have some sort of rhythm, it's a little easier to be whimsical when you show up, because you have some sort of cycle where the brand itself is publishing content, you know, somebody kind of in charge of that. And there is an easy button out there for you, and I'll share it with you after this podcast, but there is an easy button I'll help you press. I say keep doing it. I'm enjoying the pod. That's why I brought it up. Because I think calling it Raising an Agent only shines a light on the fact that this is burgeoning, that we're all exploring together, and that you're kind of learning in real time and sharing almost in real time too with the core team. And it began as sort of a skunkworks kind of podcast, just talking about what you've done. And I think you have an opportunity to blossom it into just a little bit more than that. So mainly I just encourage you to keep doing it, because I'm enjoying the content personally. Cool.
Appreciate you saying that.
Yeah, absolutely.
I think even calling it raising an agent is just, it's genius.
I don't know who came up with that, but it's genius.
I think Thorsten. Credit goes to him for that.
Something needs to be done with him because he is the brainchild behind this.
You might remember this: I emailed you a few months back, when the AMP code website was, I would say, in its very, very infancy, about the copy, the web copy, the copywriting on it. I think I misspoke at the time. For whatever reason, I thought I had seen a Slack message where Quinn said he wrote it, but it was actually Thorsten.
Yeah.
And he's an amazing writer.
He puts out like a weekly newsletter too
called Joy and Curiosity, where he just talks about
cool and delightful things that he's played with
both like technical and non-technical in the previous week
that's just, it's amazing.
Like, he has a rare combination of super sharp, great technical skills, but he's also a great communicator and, frankly, a great writer, which alone is too rare a skill these days. Very good quality writing from the heart that is clearly articulated, where, like, every word you can tell was thoughtful, intentional. In the age of, you know, LLM-generated text, it's increasingly rare.
And I think it's a very unique and it's a good skill to have.
Yeah, I'm a big fan of Thorsten.
We had him on the podcast Go Time a long time ago.
That's where I first met him.
He'd written a couple books since then.
And then whenever we were changing the cast members out to kind of invite more folks into the Go-Time podcast world, he was on my list.
And I really wanted him to be involved.
He ended up not being able to have as much time.
I think because he got employed by you all.
And he's like, now I'm totally focused.
This and that.
And I was like, oh, my gosh, source graph is amazing.
You should totally be milking that opportunity and doing all that.
And this is way before this world we're in.
Now, this is when it was code search, you know, code intelligence, and I don't mean that as a pejorative, but, like, wow, how much progress.
That's why I've been so enamored by you personally and Quinn and source graph.
I've seen your story arc since
2014 when we interviewed you
in a dark room at GopherCon
asking you five questions like that's
like I've seen you since then
and like the beyond who you are
is still the same person but like what you've been able to accomplish
on this thread pull
of, like, just helping developers be more intelligent
with their code bases have more
clarity and understanding be more efficient
with your processes understand a code base
more quickly you know get up to speed
and ship something sooner
like this iterative approach to
where you're at is why I'm just like so
so curious, I suppose, about what you're building
and why I care so much about what you,
how you showcase what you build, you know?
No, totally. I mean, we've been building dev tools
for, you know, the vast majority of our professional careers.
And I think it's just, I mean, both Quinn and I are
just insanely passionate about this area.
It's like, you know, if, if you just like let us code all day
in a cave, that would be a great life in my view.
And I think, like, you know, building DevTools for as long as we've been building
them, we've learned a lot both about how individual developers work, the diversity of
preferences and skills and tools out there, but also at the team level, like, how
organizations build software and all the challenges and bottlenecks that are in the
that process. And I think there's nothing I'd rather be doing with my life. Because like developer
tools, it's one of those very rare areas where it's fun and intellectually stimulating. It's also like
the economic impact is huge, given the importance, the increasing importance of like software
and driving the world. But, you know, like the theory behind computer science and now, you know,
machine learning and AI. I don't know. There's almost like a spiritual element to it where you
can see glimpses of maybe some of the fundamental laws around like knowledge and information
that govern the universe. And so for that reason, it's just it's insanely rewarding and fun
area to work in. Yeah. Man. You know, I really struggle with that idea of going into a cave and coding all day.
And actually I have a playlist on
on YouTube called Working Beats.
It's a glimpse into the Adam world.
And I just, as I hear cool beats on YouTube,
I store them away.
If I'm going to work to them,
I store them in my working beats playlist.
And I go back to it.
And there's lots of cool beats on YouTube
from all sorts of places.
So I don't subscribe to one channel.
I just make a playlist.
And I let the algorithm help me in discovery, right?
That's how you're supposed to do it.
And so I put this one in there. The background on a YouTube video, like, there's always a video, right? And so on this one, it's not just the music, it's this background. And it's this developer with these three large monitors, which is just amazing, code on all of them. I'm like, man, that's beautiful, right? But then this developer is sitting at this really cool desk, in this really minimalistic room, with these super huge windows. They've got the desk, the monitors, and these massive windows with this view, like a DHH view. Have you ever seen him?
And the view is this massive, beautiful mountain scape.
And I'm like, that is so weird.
I love it visually.
But then you think about it, right?
That developer is staring into the black window, the black mirror, they call it, right?
Hammering away code.
Now, I'm with you.
I'm like, yes, take me there.
I love that.
I get joy in that.
There's more life than that, of course.
Yeah, yeah.
But then I'm like, this is juxtaposed to this beautiful view.
Right?
And software is not making the mountain, right?
Software may make the company possible that makes the boots that allow a person to put them on and climb the mountain, but it's not making the mountain.
So it's like this really weird, you know, beautiful view, but juxtaposed against each other.
It's like, well, here's a, here's developer, three huge monitors, coding, loving life, while this beautiful mountain scape is, is off the view.
Yeah. I don't know.
But, you know, what I would say to that is, like, the way you experience the mountain, you know, the photons bouncing, you know, from the star to the mountain into your eyeballs, and then how they get translated into these signals that get assembled into what we call consciousness... there are certain, like, fundamental, I don't know, forces of nature at work there. And I think, you know, one of the cool things about the latest advances in AI is now you can actually
see some of these come into something that's like tangible to everyone, you know, like what is
intelligence, what is reasoning, what is, you know, thinking. You know, I said earlier, like,
I'm not AGI-pilled or anything like that. I'm not a doomer. I don't think it's Nirvana. I think these are very much tools. But I do think that we figured out, in the transformer architecture and some of the other emerging architectures that are coming out, how to capture what is essentially happening in some part of your brain.
There's some sub-region in your brain that I think is very, like, whatever it's doing is
very well pattern-matched by what the neurons in a transformer are doing when they're taking
input and producing output that is, you know, what we would call like semantically correct
so yeah
I don't know
I just think it's all insanely cool
yeah
I mean, if I could have that view, I would have that view. So I'm not knocking it by any means. I was just thinking, you know, as you said, "I would be happy to code in a cave all day," or be in a cave coding all day. I can agree with that, and I can empathize with it, and I can have desires around that. And then I'm like, well, there's also more to life than that. And I'm not saying that you don't think that, by any means, because we both have children. I'm sure we both feel the same way about our families and our wives and our children and our lives and stuff like that.
I think we're in a unique position in time and history,
which isn't that always the case for every human being to be in a unique position in time
and history?
It's always the case, right?
It's funny how history works.
Every year is unprecedented, right?
That's right.
Yeah.
I think, in particular to us, we've been on the struggle bus, I would call it, as developers: key by key, stroke by stroke, character by character, putting into the machine to get the dev tool out, or to get the thing out. And now we've gotten to a point where we can say, okay, now my one actually equals 20, you know? It's a force multiplier. Now you take one Adam, or one Beyang, and I'm able to contextually hold and produce more in various lanes because of it. It's a force multiplier. So we're in that unique world.
So I'm with you. I struggle with the desire, because I want to go deeper, and it is tantalizing, because I can produce way more than I've ever been able to before, or even explore caverns I've never explored before, right? I'm going into new regions I've personally never explored before. You may have, somebody else may have, but not me. And so I like that idea. Here's what I want to end on, because I know we're getting close to time. I want to call them skeptics. I kind of want to call them resistors.
They're not AI skeptics,
and maybe you can disagree or agree with this,
but I want you to give,
because I think you've got a unique,
you know, perspective,
a unique frame of reference
on where things are at
because you're building some of these tools
and where we may be going
and even tap by your own personal taste
and desires for where we can go.
Speak to what I would call the resistors, who don't know how to handle it. They're aware of it, obviously. They're not resisting it necessarily, but they're just not sure they want to let go of the old way. They're not sure how to conjure it. They're not sure how to get the magic out of the genie in the lamp. They're not sure how to rub it, or they're using the context window wrong.
They're holding it wrong.
And they're just not leaning in.
And they're not seeing it as a skill set to build or the skill set to build for the future
of where software development and program is going.
speak to that world as clear and as open as you like to.
Yeah, that is a great prompt. I'm thinking about, like...
It wasn't a question. It was open-ended. Like, I just gave you some points.
Two things.
One, I'll speak kind of like philosophically,
and then I'll speak on just like a practical level.
So, philosophically: I think a lot of the skeptics come from a place that a lot of developers come from, which is a default of skepticism. And especially in an area that's as hyped up as AI is, there's sort of a natural reaction: if there's this much noise around what it supposedly can do, that's often a contraindicator of the actual substance of the technology.
And there, I would agree with
a lot of them in saying that, like, the space is
very much overhyped.
Like, the people saying that
it's going to solve all our problems,
humans are going to be out of the job, or we won't
have to work, or, you know, it could go off
the rails and kill us all.
Like, that is just, it's like, those statements like that are, to me, they're like not even wrong.
They're like so far outside the band of like what we should even be talking about with respect to this technology that I can understand the frustration that like a lot of people have where they're like, look, you know, people are saying like, this is, this is going to replace humans.
And then I go and use it and can't even figure out how to SSH into, you know, my remote machine, right?
I don't know how to do that.
I can't do that.
Yes, you can.
Yeah.
And so, like, I would say philosophically, what the technology is: it's not consciousness, it's not a human replacement, it's not a god. What it is, is a universal pattern matcher. Like, if you go and watch Demis Hassabis's Nobel lecture, he's the head of DeepMind, he received a Nobel Prize for the work that his team did on protein folding. It's a very short lecture, like 30 minutes. But he says something very insightful there,
where it's like the transformer architecture,
what it really does is it allows you to fit any pattern
that is either observable in nature
or that you can synthetically generate.
So as long as you have a way to collect the data
from observing nature or generate that data in an environment
that closely enough approximates what you're getting after,
what the transformer model architecture does
is allow you to train a model that fits that pattern. They're great, like, curve fitters or pattern matchers.
And so that is
not, you know, Nirvana,
but it is a useful tool.
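For readers who want the "curve fitter" framing made concrete, the standard textbook formulation (nothing transformer-specific, just empirical risk minimization) is: training picks parameters $\theta$ that minimize the average loss over observed or synthetically generated examples,

```latex
\theta^{\star} \;=\; \arg\min_{\theta} \; \frac{1}{N} \sum_{i=1}^{N} \ell\bigl(f_{\theta}(x_i),\, y_i\bigr)
```

where $f_\theta$ is the network, the pairs $(x_i, y_i)$ come from observing nature or from a synthetic environment, and $\ell$ scores how far the model's output is from the pattern being fit.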
And so if I
hand you a universal pattern matcher, and I say,
I've trained it on all these different workflows,
you know, like coding, the coding
workflow, there's patterns
that emerge from that.
A lot of the tedious stuff is very, you know, patternful in the way that you do it; there's almost a rote that you learn when doing it.
You as a human understand
what these patterns are.
Now I'm handing you this technology
that can fit those patterns
as long as they're represented in text
and as long as they're represented
an environment where you can validate
what is a good pattern
or what is a batter pattern in that universe.
That's the mindset with which you should approach these technologies. You don't want to go into trying these tools with a mindset of automatic skepticism. I feel like for a lot of the more prominent voices out there, it's almost like they're going into trying these tools with the intent to show that it's all hot air.
That's the wrong mindset.
I think you should go in with a mindset of like,
hey, this is amazing new technology,
is a universal pattern matcher.
How can I explore what's possible with this?
Let me put myself in like explore, try new things mode.
And that segues into my practical advice,
which is like if you want to experience the wow,
in a way that's like tangible and delightful
and also practical, I think the first thing I would do
is pick an app or pick a domain
that's like somewhat outside of your wheelhouse
as a developer.
You know, maybe you're a hardcore systems engineer,
but you've never built an iPhone app.
But, you know, you have a three-year-old kid
and you want to build, you know, a game
that teaches him how to like spell basic words
or something like that.
Go and sit down, you know, it could be with AMP.
It could be with any of the other
like coding agents, for this particular task, I think most of them that you've heard of will do a pretty good job of building a basic app that's outside of your wheelhouse to do something that is simple to a lot of people out there, but hard for you. And if you do that, I think probably with like 98% confidence, you'll have a good experience with the proper mindset. And that will then kind of like motivate you to try the technology in various
settings and increasingly complex settings to see what it's capable of.
I like that.
I mean,
I think that's spot on with my own recent experience is that I've gone just a little outside of my wheelhouse.
I've built a couple of CLI tools.
I've learned about CLI patterns that are obvious, like dash-dash help, dash-dash version. Those are pretty common ones. But there are all these different ways to leverage CLIs. There are known patterns out there.
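For instance, a minimal Go sketch of those conventions, with the tool name and version as hypothetical placeholders. Go's standard flag package wires up -h/--help once flags are defined, and a --version flag is a few lines:

```go
package main

import (
	"flag"
	"fmt"
	"os"
)

const version = "0.1.0" // hypothetical version for illustration

func main() {
	// flag accepts both -version and --version, and generates
	// -h/--help usage text from these definitions automatically.
	showVersion := flag.Bool("version", false, "print version and exit")
	flag.Parse()

	if *showVersion {
		fmt.Println("mytool", version) // "mytool" is a placeholder name
		os.Exit(0)
	}
	fmt.Println("doing the actual work...")
}
```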
And I've done a version of what you've said.
I've had a personal itch.
And in my case, I've justified the spend because I'm like, well, I've got to learn.
I've got to do these things anyways.
And so it's cool to do that.
But I want to build a tool or something I can actually use day to day that maybe gets better.
And my hope is that eventually Seven's Arch or Seven Z Arch is shareable to the world.
I want it to be open source.
It is not an atom tool.
It's a world tool.
And hopefully, you know, if anybody likes the way I'm compressing and the way that we have to compress large media
directories, how many YouTubers
are out there, right? You will eventually use my tool, Beyang, your team will, because, hey, you produce this awesome podcast called Raising an Agent, or you do things on YouTube.
So you have maybe a desire to keep
these artifacts long term. And I
say, hey, compress them, save some
file size, you know, that makes sense.
But I've done just that. I've stepped
outside. It's a useful tool to me.
And because I have
the itch and because I'm
the user who understands how I want to work,
I'm able to more clearly
learn all about the Go world, all about CLI tooling, all about maybe even the TUI world and
the exploration there. I can do all those things, but I can also improve my own workflows,
my own tools, and then share those with the world. And for a while there, I was really,
Like, maybe about a month ago, I was actually really, really sad, because we've obviously built this podcast around software development, but specifically this burgeoning wave called open source. And back in 2009, when we first started this podcast, open source was moving so fast. GitHub was one year old, wasn't owned by Microsoft, you know, and it was moving fast.
And it was so hard to keep up.
And our tagline was open source moves fast, keep up.
We've let that go because it's sort of snarky, but it was kind of core to our original DNA, right?
But open source moves fast.
It's hard to keep up.
But my hope is that there's more open source.
But for a bit there, I was really, really bummed, thinking, wow, if we can just generate new code so quickly, does the value of the patterns
captured in open source at large, do they become less important?
Does it become less important to structure those patterns into projects, into communities,
into whatever, and that's how open source works.
Currently, does that change because now we can generate so quickly, does the pattern called
open source no longer become the same value?
And for a bit, I was really bummed.
And now I feel very hopeful that the future of open source is maybe even more brighter
because, one, you may have an influx of users to open source that's already out there.
So discovering good tools.
And so that's great for open source.
Hypothetically, you know, a maintainer may feel the pain because of the slop, but that's a whole different podcast and subject.
Yep, yep.
But then you also have all these new builders that can start to scratch their own itch and want to share the thing they made. I'm so hopeful for open source now, where before I thought that maybe, because you could generate it so easily, it would become less valuable.
Yeah.
I think it's so hard to predict
what the net effects of this will be.
Because there's a certain aspect of it
which makes libraries less necessary.
You know, when common pieces of functionality
are like auto-generatable.
There's a lot of cases where
even with AMP,
there's like one or two packages
that we built internally.
We built our own like TUI framework
for the latest,
the new TUI of AMP.
And it's like,
you know, would we have done that
prior to
coding agents being a thing?
It would have been a lot more expensive.
It would have taken a lot longer to ship that.
But now we can do that
because we're able to move much more quickly.
But at the same time,
I do think that there's,
there remains a use case
for having libraries,
I just think that the nature of which libraries are really popular
is probably going to change.
I think prior to agents and AI,
a lot of the most popular libraries in existence
were some form of middleware
or a piece of abstraction
that helped make using a particular API or technology
or a piece of physical hardware more accessible.
And so that was like a big problem
that libraries like open source packages and libraries solve for you.
I think now there's kind of like less demand for that form of package, but still a large amount of demand for libraries that, you know, are robust and well tested. You still need some amount of abstraction, but maybe fewer layers of abstraction over an underlying capability.
I don't know if that makes sense.
Like, I would say, you know, my strongly stated but maybe weakly held opinion, given how quickly things are changing, is that we'll see far fewer libraries that are purely just, like, hey, this is a neat way to read files in Node.js or something like that. And many more libraries like, hey, there's this new piece of hardware that got developed, or maybe there's some new biotech technology that is available now, and someone built an API for it that exposes all the key hooks. And because software is so cheap now, because code is cheaper to generate with agents than with humans manually hand-coding everything, we'll just see a lot more things expose software endpoints. And we'll see a much richer playground for people building software to be able to hit different things that do things, either, you know, in cyberspace or in the physical world.
Cyberspace.
That's cool, man.
I haven't heard that word in a while, man.
That's cool.
Cyberspace.
All right.
Well, ampcode.com.
I am a fan.
And to be clear, I use all agents.
I am agent agnostic.
I want to use everything.
And I know that's the cool thing with AMP
is I don't have to choose the model.
We barely touch on that.
But I love that I could just use it knowing that it's a high-quality-output kind of tool, that you're tuning it to help me not have to think about swapping models or limitations. It's always that. And I want you to help me learn how to use my context window better, which was maybe the news of the podcast for me. Yeah, because I'm probably inefficient and overspending as a result. So now I know. Now I know. Yeah. To those listening, like, my recommendation is you should
absolutely sample the field of coding agents. Like there's so many and I think you should find the one that
like fits the best with you. I think like we mentioned earlier in this conversation,
like the switching cost is so low. But the only way you're going to find the best one for you
is if you actually go and like try the different ones in existence. It's funny how much like
hype and like high level conversation there is about like this and that. And at the end of
the day, it's like it's never been easier just to like go try these things. So firsthand experience,
I think it heavily outweighs
whatever the latest Twitter influencer
is saying about the landscape.
So, you know, try AMP, let us know what you think,
but also try a bunch of other
agentic coding tools as well.
Yeah, try them all. Try them all.
Anything left in closing, anything over the horizon,
anything that is maybe a sneak peek or a tease
or anything you can share in closing.
I will say, when is this coming out?
Is it going to be like sometime in September?
Not next Wednesday, but the Wednesday after that.
So literally it's going to ship on September 17th.
Okay, cool.
I think by then some of this stuff will be out.
But I think right now an area of active exploration for us
is just experimenting with all the different new models
that have come online.
There's a lot of great models that are really good at tool use now
that occupy different places along the latency-intelligence Pareto curve.
And so one of the things that we're doing is we're playing around with all these different models
and seeing how well they function in the different sub-agents that AMP uses,
like the Oracle for thinking or the search sub-agent for discovering context
or just like the generic sub-agent, which conserves the context window of the main agent.
But I think, like, by the release date of this podcast, we'll have deployed some of these new models into different places in the application.
They should help speed things up.
And I think one of the things that's an active area of consideration for us
is like up until now,
part of the experience of using a coding agent
has been just like waiting,
waiting for it to get done.
Because the token throughput is at a certain level.
And I think we're seeing a lot of positive signs
that will allow us to bring down that latency by using different models, the best model that we can find for each task, in the next couple of weeks.
And so I think that will drastically improve the experience.
And it may also push the latency past some sort of inflection point: if we can get the latency below a certain level, it will change the nature of how it feels to use a coding agent.
Let's put it that way.
Yeah, absolutely.
I love the principles you built on.
We didn't touch on those.
But I love the principles you built on.
Keep doing what you're doing.
Don't change the thing.
Just do more of it, and share the whats and the whys in-house on that awesome Raising an Agent podcast.
Make it more frequent if you can.
I don't think it's necessary, but I think definitely elevate it to that first-class experience of intentionality and production quality. I think that's a pretty easy button to push. And it doesn't take a lot for you and the team involved to sit down and host and talk. You can employ people around you to carry that context instead of you.
There you go. All right, Beyang, thank you so much for your time here, man. It's always a pleasure.
I always appreciate talking with you. Thank you so much.
Thanks for having me on, Adam. Always a pleasure. And yeah, let's keep talking.
You know, I'm a firm believer in just sitting down for a deep conversation with a nerd.
And Beyang is a fellow nerd like me, like you, just holding tight to where we're at, this ever-changing, wow, how-did-we-get-here moment in software development. I don't know about you, but everything is changing, everything has changed, and everything will be changing. So lots of change. Change... Changelog. That sounds kind of familiar. But AMP is one of my favorite agentic tools. I use Augment Code, I've tried Codex, I use Claude Code pretty much regularly, that is my main driver, but I'm trying them all, and I think you should too. But one thing I love about AMP, and one thing I've experienced with it, is it seems to be a cut above the rest. It seems to be better. It seems to think differently,
think better, plan better, execute better. And I can just make a plan and set it loose and just
sit back. And that's kind of cool. It's not a one-shot. It is a very thorough, very deep process that takes me lots of time. Let's say I'll sit down for 40-ish minutes and just make a plan,
just planning, thinking through, orchestrating, thinking about the user experience, thinking about the
implementation details. And then once I feel good about the next step, I set it loose. And that's so cool.
Well, we just got back from hanging out with our friends at the HQ of Oxide. Wow. Oakland... Emeryville. Bryan Cantrill, Steve Tuck, and the rest of the team at Oxide, beyond, beyond impressive.
So stoked to have had a chance to go over to their internal conference, peel back the layers of oxide, their culture, their team, their mission, all the things.
I'll be sharing that with you next week.
Again, a big massive thank you to our friends at Fly.
They are awesome.
Our friends over at Depot, depot.dev. And of course, our friends at CodeRabbit, AI-driven code review.
Now in your CLI, man, it is wild.
It is so wild.
Check them out, coderabbit.ai.
And to the beat freak in residence himself, Breakmaster Cylinder, bringing those beats, banging, banging. Love you, BMC. Okay, this show's done. We'll see you on Friday.
Game on.