How I AI - How Devin replaces your junior engineers with infinite AI interns that never sleep | Scott Wu (Cognition CEO)

Starting point is 00:00:00 Devin is async. Once you kick off the Devin session, Devin's going to start working and looking through the code, but you're not expected to be there with it. It's just as if you gave your intern a project and your intern is going and working on it. Devin's my favorite intern on my team, and I have infinite of them. Why don't you pick a task that you might bite off for your product and show us how you would work through that end to end? I'll say, please go research the ChatPRD MCP server.

Starting point is 00:00:25 This will produce a pull request for us. Often you're running a few of these at once, just a nice way to have multiple tasks going and then check in on each of them. One of the benefits of this from a How I AI use case is you can multi-thread a lot with tools like this and set two, three, four, five, ten of these going at once on different projects and not feel like you have to sit there and babysit things. Welcome back to How I AI. I'm Claire Vo, product leader and AI obsessive, here on a mission to help you build better

Starting point is 00:00:56 with these new tools. Today is a very special episode for me because we're talking to Scott Wu, CEO and founder of Cognition Labs and the builder of one of my favorite AI products, Devin. We're going to hear about how Scott uses DeepWiki and Devin to kick off well-scoped tasks to get things done, uses Devin as his favorite and most tagged employee inside of Slack, and how he's making it not weird to bring ChatGPT voice into your meetings. Let's get to it. This podcast is supported by Google. Hey everyone, Shrestha here from Google DeepMind. The Gemini 2.5 family of models is now generally available.

Starting point is 00:01:36 2.5 Pro, our most advanced model, is great for reasoning over complex tasks. 2.5 Flash finds the sweet spot between performance and price. And 2.5 Flash-Lite is ideal for low-latency, high-volume tasks. Start building in Google AI Studio at ai.dev. Scott, thanks for joining How I AI. As Devin's number one reply guy on X, I am really excited about this conversation and for you to show off how your company uses

Starting point is 00:02:10 and you use the product that at least makes me very happy and I'm sure makes lots of software engineering teams out there very happy. So welcome. Thank you so much for having me. No, I'm honored to be here, honestly. I'm a big fan of you guys and all the work you do. Great. Well, we have lots of stuff to talk about,

Starting point is 00:02:27 but what we really want to do is get into how you AI and in particular how you AI with the products that you've built. And, you know, I think what's really fun as somebody who's building AI products is that it's something you get to use every day and get really good at, but also probably show some of our listeners and watchers some tips and tricks about using the tools that you've built that they may not have thought about so far. So we're getting the expert look into how to AI with the Cognition product. So what are you going to show us first? And what are some of your common workflows when you're doing engineering work or trying to move the product forward? Yeah, for sure. No, for us, it's definitely, I mean, as a bunch of programmers ourselves, you know,

Starting point is 00:03:12 building an AI that can code has got to be one of the coolest things that we could probably spend our time on. I wanted to show a couple of flows actually of how we use the Devin stack, because there are a few different pieces involved. There's Slack and Linear. There's the wiki, obviously, and then there's like Ask Devin, and then there's, you know, starting Devin sessions and getting pull requests out of it. I think there's some real nuance in what are the right flows, like, how do you work with Devin as an employee? Because I think it really is quite different from a lot of the tools out there, which are much more kind of like an IDE, for example, or like a terminal UI.

Starting point is 00:03:46 Like Devin is, I think, first and foremost, almost like an engineer on your team. Yep. Totally. So what are some of the things that you reach for with Devin and the capabilities that you think really make a difference for you as a software engineer? The way that we like to describe it is Devin as a junior engineer. And, you know, we're working on getting Devin to be a senior engineer. Obviously, you know, we'll get Devin the promotion and everything. But Devin is not going to go and solve some, you know,

Starting point is 00:04:17 really hard architectural problem or make some big strategic decision that, you know, you're going to make and then kind of like execute on for the next month. You probably want to be involved in those as well. Devin can help you with a decision, obviously, by kind of like referencing the right things or giving a few thoughts or input. But I think where Devin really shines is, one way that we say it also is kind of like tasks, not problems. And so often when you have a very clear, like, here's exactly what we need to go do and here's the task and here's all the details of what we need, Devin is really great at going and executing that for you and makes that much faster. And so naturally, I think the next question that comes to mind then is like, how do you figure out the spec or the, you know, the task exactly that you want to do?

Starting point is 00:05:02 And so a lot of the other tools like wiki and search, you know, are really there for you to be able to kind of like ask the right questions that you want about understanding the codebase or what needs to be done and then putting a task together. I think in practice, a lot of the use cases that we see all the time are, you know, probably number one is just crawling through your issue backlog, you know, whenever you have an issue that comes up. Or, you know, we have a lot of Slack channels where we talk about issues, and then on every single one of them, we just tag Devin as the first pass. That's a big one. And so like, you know, someone says, oh, you know, we need to go fix this thing in the front end

Starting point is 00:05:38 or, you know, maybe we need to go support this other MCP, for example, which I'll show in a second, things like that. And then for a lot of the other kind of like, I'll say like engineering toil use cases, it also does really, really well. And so often that's like, you know, going and doing a version upgrade or adding documentation throughout, you know, your repo or adding unit tests for a specific thing that you have or responding to, you know, a crash report that just came up and trying to diagnose what went wrong. Yep. I love what you said about, you know, Devin's a junior engineer.

Starting point is 00:06:13 I say Devin's my favorite intern on my team and I have infinite of them. And then I like this idea of scoping tasks, not problems. And I do think for people who are working with AI, and even, you know, other AI tools not in the engineering space, really thinking about task-level orientation sets you up for success, or at least a sequence of tasks can be very helpful. And so why don't you pick a task that you might bite off for your product? And, you know, show us how you would work through that end to end. Yeah. Yeah, let's do it. So, as you might know, I'm a huge fan, actually, of ChatPRD. And the natural thing that came to mind for me was we need to integrate ChatPRD's MCP server.

Starting point is 00:07:01 And so I was looking into how to do that with Devin. And so the first thing that I always kind of go to as an initial thing is what we call DeepWiki, which basically for any repo, and this is true for public or private repos, you can come in and get a whole AI-generated documentation of the repo. And so in this case, here's, you know, this is the Devin web app repo. Probably there's nothing too sensitive here. But it's basically, you know, it explains Devin. It's pulling a lot of information from the README or understanding the system architecture.

Starting point is 00:07:33 And I can search this and pull up different things. And so, you know, if I want to understand how the MCP marketplace is set up, you know, it'll point out what particular components there are or what particular files are called here. And I can read up on this and kind of understand exactly how this is set up. And the natural question here that I might ask is, okay, cool, just show me where the MCP server list is implemented. And so this will look through our repo. And Devin at this point has done a lot of work in the Devin web app repo, for instance. And so that helps a lot, which is, you know, Devin builds this representation of the codebase over time.

Starting point is 00:08:12 And we can see what's going on here. It has all this. And so you're getting both like sort of a natural language explanation of how this server list is implemented. And then you also, on the right side of this, for folks that aren't watching, get the actual code snippets and reference files that you can view and really understand the deep layer of the code.

Starting point is 00:08:34 So you have like sort of a combination of let me explain how it works, and then this is the nitty gritty. Yep, combination of English and code. I think it's an interesting one where it's like, you know, one day it'll probably all be English, yeah. But I think especially now, you know, in this current period, I think we're really in the era where obviously you, as the engineer, want to be looking at both English and

Starting point is 00:08:55 code. And you can see here, it's giving you kind of the answers of what's going on. And in particular, it'll point out, okay, here's our list of all the different marketplace servers that we have. And we have an Atlassian MCP, and here's a HubSpot MCP and so on, right? And from here, the natural thing that I'll want to do, which is what we found to be a big flow for folks, is to use this to actually produce a prompt for Devin. And so the whole idea is now that we're in this context, you know, we know what the questions were, we know what part of the codebase we're looking at. It gives a lot for Devin to be able to start from.
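Editor's sketch: the server list described here is not shown in the transcript, so the following is only a guess at its general shape, written in Python for illustration. The `MarketplaceServer` type, its fields, the `add_server` helper, and the example.com URLs are all hypothetical placeholders, not the actual Devin web app code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MarketplaceServer:
    """One marketplace entry. Field names are assumptions for illustration."""
    id: str
    name: str
    url: str  # placeholder endpoint, not a real server address


# Existing entries a new server would be modeled on (values are made up).
SERVERS = [
    MarketplaceServer("atlassian", "Atlassian", "https://example.com/atlassian/mcp"),
    MarketplaceServer("hubspot", "HubSpot", "https://example.com/hubspot/mcp"),
]


def add_server(servers, new):
    """Return a new list with `new` appended, rejecting duplicate ids."""
    if any(existing.id == new.id for existing in servers):
        raise ValueError(f"duplicate server id: {new.id}")
    return [*servers, new]


if __name__ == "__main__":
    updated = add_server(
        SERVERS,
        MarketplaceServer("chatprd", "ChatPRD", "https://example.com/chatprd/mcp"),
    )
    print([server.name for server in updated])
```

The point of the sketch is that "follow the pattern of existing servers" becomes a small, checkable task once the list and its entry shape are known.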

Starting point is 00:09:33 And if we have a task in mind, then we can get that going. So I'll say, please go research the ChatPRD MCP server and add that in for this. And so what this will do is basically construct a Devin prompt from this. And so this has, you know, my prompt here, which I just typed in, which is not super refined. But it also has detail about, you know, the part of the code that we're in and what components we're looking at and so on. And so then it will generate for me this prompt in Devin that I can just go ahead and use immediately. And you can see here, you know, you want to follow the pattern of existing servers like Atlassian and HubSpot.

Starting point is 00:10:19 Here's the exact type def structure that would be used here. Here are the functions that you should be looking at. And here's what you should check to make sure that it works. One of the things that I want to call out for folks in terms of a workflow that they should think about is a lot of people, myself included, sorry, Devin, would have just sent that prompt, which is, you know, add ChatPRD's MCP server to the list. And I do think that one very short but important loop of take this prompt and turn it into an effective prompt, given the context you know, and then sending that into the task to do just saves you a lot of heartache. And it feels like extra friction at the time.
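Editor's sketch: the "short prompt in, effective prompt out" loop described above can be illustrated with a few lines of Python. This is not how Devin builds its prompts; `CodeContext`, `build_task_prompt`, and every file name and check in the usage example are hypothetical stand-ins for whatever the wiki and search step surfaced.

```python
from dataclasses import dataclass, field


@dataclass
class CodeContext:
    """What the research step surfaced. All example values are made up."""
    repo: str
    files: list[str] = field(default_factory=list)
    patterns: list[str] = field(default_factory=list)
    checks: list[str] = field(default_factory=list)


def build_task_prompt(request: str, ctx: CodeContext) -> str:
    """Expand a short request into a scoped task prompt with context."""
    def bullets(items: list[str]) -> str:
        # Fall back to an explicit marker so empty sections stay visible.
        return "\n".join(f"- {item}" for item in items) or "- (none found)"

    return "\n\n".join([
        f"Task: {request.strip()}",
        f"Repo: {ctx.repo}",
        "Relevant files:\n" + bullets(ctx.files),
        "Follow existing patterns:\n" + bullets(ctx.patterns),
        "Verify before opening a PR:\n" + bullets(ctx.checks),
    ])


if __name__ == "__main__":
    context = CodeContext(
        repo="example/web-app",  # hypothetical repo name
        files=["marketplace/servers.ts"],  # hypothetical path
        patterns=["Existing Atlassian and HubSpot entries"],
        checks=["The new server shows up in the marketplace list"],
    )
    print(build_task_prompt("Add ChatPRD's MCP server to the list", context))
```

The five-word request survives as the first line, and everything after it is the context an intern would otherwise have to ask for.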

Starting point is 00:11:01 But I think, one, pretty soon it's going to be the job to be done of the tool itself. So does that loop become invisible, either through these reasoning models or some other application layer that manages it? And two, it's just worthwhile for people to do. So, you know, when you're thinking about sending a five-word prompt, think instead about saying, here's my five-word prompt, build me a better prompt, and sending that into your system. Yeah, for sure. And I think it's, you know, it's a great call because, you know, as we said, Devin is async, right? And so from this point onward, the nice thing about this is, you know, once you kick off the Devin session, Devin's going to start working and looking through the code

Starting point is 00:11:36 and reading online about ChatPRD, for example, right? It's going to do all this. But you're not expected to be there with it. Right. And so, you know, it's going to work on its own. It's just as if you gave your intern a project and your intern is going and working on it. And so, you know, they can ping you on Slack and ask you if there's questions or something, or you can go take a quick look and see, you know, how your intern is doing. But you don't have to be sitting there with Devin for every step of the way here. And so one way that we kind of describe it is, you know, for a lot of tasks, there's often this sync component, like the synchronous component, and then there's the asynchronous component, right? And a lot of what search and wiki is for is doing the synchronous part of the task before you do the async. Right. And so like, if you had an intern, for example, would you just send them a five-word Slack message and just leave it at that? Maybe sometimes, for something that is like, you know, super clear and you know exactly.

Starting point is 00:12:26 Often what you actually would do is you would sit down with them, talk it through for two minutes and be like, okay, yeah, like, you know how we have this MCP marketplace, and then we go and look at it together, you know, we read the particular lines of the code. And then you say, okay, yeah, so let's add ChatPRD to this and just go take a look at how that MCP server is implemented and make sure we add it to the list. And then you kind of hand off there, right? So you kind of have the first two minutes of going back and forth with Devin, your intern. And then as soon as you go on to the Devin prompt, you're kind of expecting it to be more of an asynchronous thing where you don't have

Starting point is 00:13:00 to be in the loop. Well, and one of the things I want to call out for people that are building AI products out there, you know, like you, like me, is that in these sync products, latency really matters. People get really frustrated with wait times. But if you set up your product to really be this asynchronous modality, you actually buy yourself a lot of user love on waiting time because there's not that expectation. Just like you would say, hey, intern, okay, now go research this other MCP and do a PR for me and come back when it's ready. You know, just like that would not be something you would expect an intern to come back to you on immediately, you also, from a product perspective, don't expect Devin to come back immediately. Now, one of the benefits of this

Starting point is 00:13:45 from a How I AI use case is you can multi-thread a lot with tools like this and set, you know, two, three, four, five, ten of these going at once on different projects and not feel like you have to sit there and babysit things. And so I'm wondering, you know, while this is running, do you go pop off and go to a meeting or get a coffee? What has this sort of asynchronous workflow enabled for you? For better or for worse, I'm in meetings for a lot of the day. And it's great to be able to just kind of kick these off, or, you know, you have an issue backlog, or, hey, there's these three or four things I was hoping to look at today.

Starting point is 00:14:21 And you kick off each one with Devin. And then, you know, these go and work asynchronously. And it will make the pull request for you in GitHub and it'll kind of show you the diff and what work it went through. If it's like a front-end change or something like that, it'll send you the screenshots of the before and after, right? You can see it's going and researching ChatPRD.

Starting point is 00:14:43 Well, I will say just my, clearly my SEO on the MCP is not good, but Devin did make my MCP homepage. So it's in the top nav. Yeah. That's funny. There's a hundred for me, so it should know. Yeah.

Starting point is 00:14:58 Cool. Cool. Yeah. So I think for sure, you know, often you're running like a few of these at once. And like you said, it's just like a nice way to be able to kind of have multiple tasks going and then check in on each of them.
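Editor's sketch: the "kick off several sessions, check in as each finishes" pattern can be illustrated with Python's standard `asyncio` module. The `run_session` coroutine is a stand-in that just sleeps; it does not call Devin or any real API, and the task names and durations are made up.

```python
import asyncio


async def run_session(task: str, seconds: float) -> str:
    """Stand-in for an agent session; the sleep simulates unattended work."""
    await asyncio.sleep(seconds)
    return f"PR ready: {task}"


async def kick_off(tasks: dict[str, float]) -> list[str]:
    """Start every session at once, then collect results as each one finishes."""
    pending = [
        asyncio.create_task(run_session(name, seconds))
        for name, seconds in tasks.items()
    ]
    finished = []
    for done in asyncio.as_completed(pending):
        finished.append(await done)
    return finished


if __name__ == "__main__":
    backlog = {
        "fix notification link": 0.3,
        "add unit tests": 0.1,
        "upgrade node version": 0.2,
    }
    for line in asyncio.run(kick_off(backlog)):
        print(line)
```

Because all sessions start before any is awaited, total wall-clock time is roughly that of the slowest task rather than the sum, which is the practical payoff of the async workflow described here.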

Starting point is 00:15:09 Yep. And so what this would do, and maybe we can come back to it later when it's done thinking, what this is going to go do is it's going to go do research. It's going to find my docs page on the MCP server that Devin did make for us. And then it's going to pull those docs in and then you're going to get actual code out of this. Your goal for this is to get a PR, right? Yep. This will produce a pull request for us.

Starting point is 00:15:34 And then from there, I'll be able to review the pull request, and then if that looks good, then I'll merge it. And then, obviously, we'll have this out in the next Devin. Amazing. And then your prompts are going to be so much better. And I'm feeling guilty, so I am just going to Slack you the MCP homepage. And you can give that to Devin to go. Yeah, sure, sure.

Starting point is 00:15:55 You're getting a true live demo here. Yes, yeah. This is when your intern comes back to you and says, hey, I was looking at that and, like, I couldn't find it. Like, can you point me to where it is? Okay. You have it. chatprd.ai slash product slash MCP. Okay. Has code snippets and everything.

Starting point is 00:16:15 Okay. Okay. Here we go. Great. So this is a good example of you've done your research. You use that research to create a better prompt. You use that prompt to kick off a task. That task is worked asynchronously as a sort of like more junior engineer would work, including doing research in your code, external to your business, and then it's going to go ahead with the context of your repo

Starting point is 00:16:40 and do a PR and ship this feature. And otherwise, you would have had to like ask somebody to do this. And I think about, for me, I think about the people that you'd have to involve in something like this. Like you'd have to go find the senior engineer that wrote the MCP server code. Yeah. And say, like, please explain it to me. You'd have to, you know, get, take the time to write out that nice spec.

Starting point is 00:17:05 of this is what you want to do. And then you'd have to task it to somebody to actually implement. And so I think you can compress that workflow of a team of like, you know, even three people's time into, you know, about 10 minutes to get something done. Yeah. Yeah. And I think often a lot of the folks who we see who really, really love Devin and use it this way especially are, you know, folks who are like tech leads or product

Starting point is 00:17:30 managers or things like that. You know, it's a great kind of intersection. One is, you're already used to the flow of kind of like, you know, figuring out an issue and getting into what is going on there and then handing off something, you know, of here's exactly what we need to build, right? And then I think two is naturally like the async workflow for people who are in meetings or have a lot of back-to-back going on. It's just a great way to kick off and check in on tasks quickly. And so starting things from the web app or starting things from Slack, for example, is a nice and light thing if you're not in your IDE all the time. You can start tasks from the IDE as well, obviously, but we see this kind of flow a lot with leads and PMs, basically, who are, you know, going back and forth with a lot of things. Yeah. One of the things that I've been telling people more and more is as part of your PM onboarding, you should be now giving everybody access to GitHub, which isn't something that, you know, typically happens in a lot of product organizations, giving access to GitHub, giving access to

Starting point is 00:18:31 tools like this because I think it does enable product managers to do a lot more. So while this is running, what I wanted to talk about is, you know, before we got into the show, you and I were saying you're just a little bit busy, you know, over the last month. Just doing a few interesting things with the business. In addition to, I'm sure, wanting to build and spending time with the team. And so, you know, this asynchronous nature and this junior engineer on demand, how do you actually use that day to day to just keep afloat on top of all the stuff

Starting point is 00:19:06 that's coming in to your team? You know, not, I have a feature I want to build, let's go build it. We just saw that flow. But like the kind of reactive stuff in your company, how are you using AI to stay on top of that and keep the velocity high? For us, a lot of it is just setting the right workflows

Starting point is 00:19:22 in our Slack and in our org and so on. And so, you know, Devin obviously has knowledge, which means it'll learn your codebase over time as we keep working with it, or you can kind of give it more details about how certain things work. And a lot of things are almost just like institutionalizing Devin as the first line of response, is how I would describe it. And so I could show a few examples of this. The big thing is to really get to the point where for a lot of these different things that we file, you know, like Devin is the first person that gets tagged on all of these, right?

Starting point is 00:19:54 And Devin won't be able to do every single thing, you know, in one shot on the first try, but often you're working back and forth with Devin and Devin puts up a PR. And if there's some slight touch up that you have to do at the end or that you have to build, then you're able to do that. And so we have a ton of channels where we go and talk about issues or various things that we need to build or things like that. You know, we have one for all the crashes that come in. We have one for kind of like core infrastructure things that come up.

Starting point is 00:20:21 This is the one for our web app, which is hopefully a little bit less sensitive. And you can see here basically every single thing that folks talk about in here, you know, we start a Devin session. And so it's like, hey, you know, can you standardize the font size, spacing, and style per these three levels, right? And then, you know, we just go and start the Devin session. And Devin will make the PR. It'll go through the PR. This one gets merged, and there's some back and forth feedback here. And so, like, Devin goes and edits.

Starting point is 00:20:57 You can pull it up and see. And so Devin made this PR. There were a couple back and forth edits and then Dave, our engineer, went in and merged this. And this is often how it works. You know, here, this is another good example. Hey, Devin, can you make it so that when you command-click on a notification, it takes you to that in a new tab, right? Natural feature, probably one of our users requested it.

Starting point is 00:21:21 And you just start a Devin session. And Devin will give you this progress update. Here's what I'm doing so far. Here are the files that I'm looking at. Here's what I see. In this case, by the way, its confidence is medium. And then Walden says, oh, no, no, no.

Starting point is 00:21:34 Like, you know, you should take a look at this thing instead. One of the cool things I want to point out, too, is because of this, Devin is a naturally multiplayer experience. And so we will often have a few different folks going back and forth. Or if somebody else is looking at this issue or, you know, if somebody else is the expert on this part of the codebase, they'll go and give their own kind of input here and Devin will just go back and forth with them as well. And so really it is just a thread where a group of you are communicating and figuring out how to work on this issue.

Starting point is 00:22:04 And Devin is just one of the players in the thread. Right. And so Ethan comes into Walden's thread here and says, hey, make sure to use a Link element from TanStack Router, and then gives that feedback, right? And then Devin goes and makes that change in the pull request. And so you can see Devin had like an initial thing and then had some additional commits and it went and did this Link from TanStack Router instead. As an AI founder, you're used to sprinting towards product market fit, your next round,

Starting point is 00:22:35 or that first enterprise contract. But speed isn't enough for AI startups. Buyers expect security, compliance, and transparency from day one. That's why serious AI startups use Vanta. With deep integrations and automated workflows built for fast-moving AI teams, Vanta gets you audit ready fast and keeps you secure with continuous monitoring as your models, infra, and customers evolve. AI innovators like LangChain, Writer, and Cursor scaled faster and closed bigger deals

Starting point is 00:23:10 by getting security right early with Vanta. Listeners can claim a special offer of $1,000 off Vanta at vanta.com slash howiai. You know, one of the things that I like about this, and again, kind of a shout out on a use case for folks that are trying to drive more AI adoption in their teams, is doing this as much as possible in public is really helpful from a learning perspective. So one of the experiences I had running the engineering team at LaunchDarkly was when we started putting Devin and Devin-like agents in public channels, we saw a lot more adoption and upskilling of our team on how to actually talk to these agents, how to get the right outcomes. And so, you know, we were talking earlier and I was saying I DM Devin all the time. It's because I have no one to talk to. He's my only buddy. But I DM Devin all the time.

Starting point is 00:24:16 And we have these sort of like side conversations. He's sort of my intern on the side. But in larger organizations, I was very much a do it in public channels, do it where people can see it. Because not only does the work get done and it's nice kind of muscle memory to tag in these tools immediately, but also just learning how you use them. What is an effective prompt? What are the kind of things that it's good at and not good at? That is really useful for just overall engagement with these tools. And so I think hiding your AI use is kind of the worst thing you can do at work. So I say do it all in public. Yeah. Yeah, yeah, yeah.

Starting point is 00:24:57 And I think there's two sides of it, where, when we talk about these multiplayer experiences, right, I think there are two benefits, right? One is this kind of like the knowledge transfer for the agent itself, which I think more and more products are starting to have, which is, you know, one person uses Devin or uses this tool or that tool, right? And that adds to the knowledge of the tool itself so that, you know, a week later when somebody else does that session, Devin's like, hey, oh, yeah, I just touched this piece of the code last week. Like, I know exactly what you're talking about. Let me go and find that. And then the other side is kind of like educating the humans, right? Of like, you're showing each other what your experiences are, you're being able to work with one another in the same flows.

Starting point is 00:25:40 And I totally agree. I think because of both of those, you know, we'll see a lot of experiences and AI productivity get more and more multiplayer. Yeah, yeah, that's my hope. Okay, before we move on from Devin and your use of it for engineering, I want to get really specific. So you'll go and then I'll go. What are your top five, like everybody can reach for them tasks that Devin can do for you? And you pick kind of like five categories of tasks and I'll pick five.

Starting point is 00:26:09 Okay, sounds good. Yeah, so top five, I think miscellaneous front-end fixes, it's amazing for. I mean, because often that full workflow is like, you know, for various reasons, like I said, you have to get like three different people involved. And it's like, here's what we're going to do. And then you bring in somebody who looks at that code and there's somebody else who's reviewing or something. And now with this, you tag Devin, you explain, here's a screenshot, you know, I want to make this button a little bit more round or, you know, I want to touch up the design here and I want to do X, Y, Z, right? And it'll go and do that.

Starting point is 00:26:41 It'll find the right part of the code. It'll do the implementation, but also it'll send you the before and after screenshot as well, right? And so you can just kind of review it in line there. And that's just like a really, really great use case, both I think because it's verifiable for the agents, but similarly it's also verifiable for humans, right? Yeah. And while you're saying that, I will just pull up an example of this, which is, let me

Starting point is 00:27:06 share my screen, which I rarely get to do here. It's a very exciting window. Oh, let's do. Always thrilling to share your Slack. As you can see, my only friends are agents. But here's an example of it that I just did very recently, which is I'm working on the ChatPRD homepage. And, you know, Devin shoots back to me. Here's a new hero image that I like. And I was able to give feedback on that. So this is kind of exactly what you're talking about, which is like, let's make changes and then get kind of that immediate feedback right in your workflow. Yeah, yeah, fixes, new components, changes that you want to make in your front end. It's super, super nice because, yeah, as you're saying, you can just kind of do this all inside Slack, basically. And so that's probably number one for me. And number two that comes to mind is version upgrades, migrations, things like that.

Starting point is 00:27:58 And so, you know, like upgrading your Node version or getting onto, you know, the latest packages and so on. It's a big time sink. You know, we all have to do it. And then somehow these new packages just come out so quick. But obviously, the devil's in the details of like finding, you know, this new version will say, oh, you know, for every instance of this component, you know, we recommend that you use, you know, this structure instead or something. And then Devin will be able to kind of go through that and do the semantic search and find each of the components and make the correct changes.

Starting point is 00:28:29 Number three, I'd say is documentation, big one as well. And so we have our, you know, Devin docs, for example, like, our own kind of like docs page, like the external docs page. And I mean, Devin has written the entire thing. DeepWiki itself, obviously, is kind of an extension of that, but even writing your own docs pages or putting these things together, a lot of what Devin does is going and processing the codebase and understanding this references that and, you know,

Starting point is 00:29:01 here's what this does and so on. And so it's a funny one in the sense that it's not strictly a writing-code use case, or isn't always, but I think it's so closely related to it that a lot of the same capabilities are really valuable there. Number four that I would say is incident response actually. And so we have this set up so that whenever there's a crash, the first line responder, you know, on PagerDuty basically is Devin. And so Devin gets the page and Devin gets started, goes and, you know, kind of runs a session.

Starting point is 00:29:35 And obviously you probably want a human there too, you know, especially for these incidents, to make sure of what's going on. But the nice thing is, you know, it's like 4 a.m. and you're kind of like half asleep and then you get to your computer and Devin has already written a report of like, hey, I looked at it, I think it was this change from like last week that happened or yesterday that happened, you know, here's exactly where, you know, the trace of the error goes. So we use that a lot. It's a huge lifesaver for us. And then number five, let's see, I would say adding testing is a big one for us. You know, it's a very common thing where this is, especially for kind of like individual engineers as they're going and working on things.

Starting point is 00:30:16 You know, you have your whole PR, you built things out, you built a new feature, and always the last thing that you have to do before you ship it is you have to go and add your own unit tests and make sure your thing works, right? And the nice thing, again, is like Devin will go and do that. It will make the tests and then it will run the tests locally itself and make sure those tests pass. And so we can iterate with the agent and make sure the lint passes, make sure CI passes, and so on, and just kind of like add this for you. All right. Well, we're very close. My five are very close.

Starting point is 00:30:44 So I love those. So to recap, and I'll augment yours with mine. So number one, front-end fixes. My particular version of front-end fixes is I think these AI tools can really help you do polish, really nice interactive user experiences where you wouldn't normally be able to spend time on them.

Starting point is 00:31:00 So any of those like little magical moments that you don't want to like toil in front end on, I think it's really good at. Docs, I think, is underrated. I actually have a GitHub Action where every PR that gets opened gets reviewed by Devin, gets the PR description rewritten by Devin. And then after the PR is closed, Devin goes and ships our documentation, internal documentation, into our repo, so that Devin has access to the docs. So I think it's like an excellent technical writer.

Starting point is 00:31:28 I too have Devin as first line of defense for incidents. So Devin actually has a Sentry login and logs in to Sentry and goes through all of our open issues and starts to fix stuff for us. Definitely upgrades. And then the one that I didn't hear you say, but I just think is a more like operational and personal benefit, is it's like 24/7 availability rubber ducking, which is like when you're working on something and you're just like, can you just look at this and see if I'm being crazy, if this is crazy, you know, Sunday night, Monday night, Saturday morning, where you really don't want to bother a colleague. I just think having something to like sort of rubber duck with is really nice.

Starting point is 00:32:13 And so those would be my use cases. Very similar. Okay, Scott, we're going to close with just one really high-level use case outside of the Devin ecosystem, which is voice. And you were telling me a really interesting ChatGPT voice use case that I hadn't heard before. So do you mind spending a few minutes just telling us about that? Yeah, for sure. I'm a big fan of voice. I actually think there's a lot that's interesting.

Starting point is 00:32:39 You know, we've played around with, you know, we have voice in Windsurf now, actually, as of Wave 11 too, partially because of that. But in short, the way I'd describe it is like, I think, you know, Google itself, like 25 years ago, was basically a better encyclopedia, right? You know, we have all sorts of things that you want to look up and pull it together and so on, right? And it basically got you a faster answer and it got it to you, you know, with more up-to-date information of what was going on.

Starting point is 00:33:10 And I almost think of ChatGPT voice as like a better Google. You know, like you can get an even faster answer. It's fully synchronous. You can do it in the conversation. And then obviously you have all of the detail. It can go and research and do these other things too. What I'll often do is, you know, if I'm in a meeting and we'll be talking about things, you know, there are always questions that come up. Like yesterday I was in a meeting

Starting point is 00:33:33 and we were talking about this, which is, you know, there's so many orgs out there with tons of software engineers. And so we were kind of thinking like, yeah, like, what are all the companies that have, let's say, 10,000 plus software engineers? You know, and how many are there in the world, right? You know, obviously, like, you know, the big banks out there, tens of thousands of software engineers, the big tech companies, you know, those are the first couple, maybe the Accentures and Infosyses, you know, that category.

Starting point is 00:33:56 Those are the first ones that come to mind. But like, what are all these different companies that have that? And, you know, naturally in a meeting, it's kind of rude to just go on your phone and just kind of like, you know, be totally unresponsive for like two minutes as you're looking. So instead, what I'll often do is I'll just pull out ChatGPT and go on voice. And it's basically like adding ChatGPT to every conversation.

Starting point is 00:34:17 You know, and so then I say, hey, like, can you please tell us how many companies out there have 10,000 plus software engineers, right? And then, you know, whether it's voice to voice or whether it's, you know, voice and then you kind of get the response in text, like I use both of those modes a lot. But I find it to be like a very natural stepping stone, where I just find that voice lowers the friction even further in a way that's actually really noticeable. I was going to say it's like in the encyclopedia era, right, if you were going to look something up,

Starting point is 00:34:49 it took like, I don't know, five minutes or something. So you had to go pull the right letter of the alphabet or something and find this. And then Google got it to like 10 seconds, you know? And like voice is kind of like getting it from like 10 seconds down to like one or two seconds, where you can just get on instantly and just say what you want to say. And that actually matters, I think, for being able to go back and forth or just like having, you know, very off-the-cuff questions that you want to ask. Yeah, I was going to say, you know, you've maybe changed my mind here because I used to think that voice mode was like super socially disruptive in that it feels so unnatural to like talk during a meeting. But if you flip it on its head and you're like, no, this is just another meeting participant that I'm putting into the room,

Starting point is 00:35:33 it actually is more socially inclusive. Everybody hears the result, right? You're not like Slacking around links and then people are opening them up on their laptop and reading while somebody is talking. Like everybody's sort of clued in to the synchronous nature of this new information. So if I had people to be in meetings with, and not to brag, but I have very few meetings,

Starting point is 00:35:52 then maybe I will bring ChatGPT into it. Okay, will do. Must be nice. Must be nice. Man, it's the dream, man. So quick lightning round questions.

Starting point is 00:36:01 We will get you back to your work. First one, it's like picking between your children, I know now. The IDE, the terminal, or the agent, what is going to be the form factor to rule AI engineering? I really think of this in the future as, you know, we call it coding agent and control. Like a lot of what this becomes is actually just the next generation of human-computer interface. And like the way that I like to say it is, you know, Tony Stark doesn't have a laptop. Like, you don't need one at some point if you just have your Jarvis plugged in and you're going back and forth with your agent and it goes and does these things for you. And you can imagine that building software is just kind of like you're not looking at your code.

Starting point is 00:36:41 You're just looking at your own product, right? And you're looking at your own product and you're saying, hey, let's make this button rounder. Look, let me add a thing over here. Let's save this and, you know, let's ask the user for this and that. And you're just making the changes in real time in your product. And your agent obviously is going and implementing this for you. And so I think it's certainly very agentic, but I think it's almost like, whether we call it an IDE or an agent or whatever, it really is basically just like a different human-computer interface where you are just looking directly at the product rather than having to go through all your code or go through.

Starting point is 00:37:15 And so I think that's the future version some years out. I think today, I would say, I think a lot of it depends on the cohort. And so I'm, for example, in meetings all the time. Unfortunately, you know, not that. But yeah, you know, and because of that, I actually think the Slack agent workflow is a super, super natural one, you know, or like Linear, for example, and tagging, you know, Devin from Linear. I think for an engineering IC who, you know, gets to code for, you know, eight or ten hours a day, again, must be nice, then the IDE is kind of the natural place where a lot of this starts, right? Which is, you know, you'll have these things that run in the background and you'll have

Starting point is 00:38:00 these asynchronous processes that are going as you're doing your thing. But the natural place to get started for that is the IDE today, I'd say. I also just think what's nice about this era is like the form factor can come to you. And you can decide what the interface is that works best for your workflow. Okay. You know, as somebody, Devin is my buddy, I'm sure you get lots of chats that would give us very good insight into my closing question, which is when you are frustrated with our sweet, sweet intern Devin, what is your prompting technique? And I know you all monitor this because when I get frustrated, sometimes I get little credits back. Like, you did that wrong, I get credit back. So I know you see a lot of human language

Starting point is 00:38:43 to agents, but what is your strategy? What do you find yourself doing in a moment of, you know, frustration or being blocked? I can give some advice. I can't say that I have always followed my own advice. But a lot of what it looks like, I'd say for an agent especially, is, I think an agent is a little bit different from a chatbot in the sense that with a chatbot, there's less to go off of, is kind of how I want to say it, right? Where with a chatbot, it's like, you know, you ask the question and it gives you the wrong answer. It's like, no, that was the wrong answer. And then that's all you can really say. With an agent, like, one of the nice things that you can do is you can go through and look through all the history of what it was doing. Right. And so like,

Starting point is 00:39:22 we had an example of that just now where, you know, Devin got stuck of like, you know, I see the ChatPRD page. It's partly having an MCP server. I'm like trying to find documentation on this, right? And if we go and scroll through the logs, we'll see like what happened is that it Googled it and found some other things, right? And that was what the issue was. Right. And so from there, it's kind of like you take that information and then you understand, oh, Devin was missing the link to this page. And then you send that. And so I think a lot of it actually with agents is just, it's kind of like pair programming or pair debugging with an intern. Like you want to, you know, first you get to go through and see, okay, here are all the steps that

Starting point is 00:39:58 you took. Oh, by the way, it's like, you know, I think you missed this one file, which is, you know, the downstream reference of this, and that's why there was the bug or something like that. I think that's the biggest thing that will really move the needle. Okay, so review the history, figure out where it went wrong, and then re-instruct. Okay, Scott, this has been so fun. Thank you for showing us. Where can we find you and how can we be helpful? Yeah, yeah, for sure.

Starting point is 00:40:25 So we're Cognition and Devin on Twitter. We officially got the Twitter slash Cognition, which is great. And then obviously it's devin.ai if you'd like to use the product. Great. Well, thank you so much and appreciate you spending the time with us. Cool. Thank you so much for having me.

Starting point is 00:40:43 Thanks so much for watching. If you enjoyed this show, please like and subscribe here on YouTube or, even better, leave us a comment with your thoughts. You can also find this podcast on Apple Podcasts, Spotify, or your favorite podcast app. Please consider leaving us a rating and review, which will help others find the show. You can see all our episodes and learn more about the show at howiaipod.com. See you next time.
