The AI Daily Brief: Artificial Intelligence News and Analysis - MCP, Agents and What AI Engineers Are Thinking About Right Now feat. Swyx

Starting point is 00:00:00 Today on the AI Daily Brief, what non-engineers need to know about the state of the discourse in AI engineering. The AI Daily Brief is a daily podcast and video about the most important news and discussions in AI. To join the conversation, follow the Discord link in our show notes. One of the things that I think is very exciting about AI is that it's breaking down barriers between technical and non-technical people. AI is an intermediating technology whereby people who are non-technical can start to grok the tools of creation from an engineering and development perspective. And as people are trying to make that side of their brain work, the resource that I most often point them to is the Latent Space podcast and newsletter and the AI Engineer

Starting point is 00:00:45 summit that's produced by some of the same people. Specifically today's guest, Swyx, is at the center of all of that amazing work. He has an incredibly good pulse on the state of conversations when it comes to AI engineering. And so today we're talking about what the big themes are that people in that community are talking about and building around, and what the implications are for the rest of us. All right, Swyx, welcome back to the AI Daily Brief. How you doing, sir? Very good. Long-time listener and glad to be back. Yeah. So I think that this would be a really fun conversation. What I was saying to you kind of before we were recording is that I think that what's super valuable and what I hope to kind of have come out of this is to help, you know, my listeners, which I would say are on average,

Starting point is 00:01:28 uh, there's a higher portion of non-engineers, right, than your audience. So, you know, helping enterprises and non-engineers understand kind of where the big discussions in AI engineering are. And I think that it's pretty clear at this point that, you know, it has always been valuable for people who are inside technology and building with technology, whether they're engineers, whether they are developers or not, to try to keep a pulse on what builders are building, how they're building, you know, what tools they're using. I think it's even more pertinent, obviously, with AI, right? That, like, the space between the non-engineer and the engineer is getting blurrier, right?

Starting point is 00:02:02 Perhaps to some chagrin somewhere. But so I think that that would be super valuable. And I think you obviously have a unique vantage on this, you know, in terms of the content you produce, you know, with Latent Space, but also through planning the AI Engineer summits, right? So we just got off one, I guess, a couple months ago now. It feels like just a minute ago. That was super fun in New York.

Starting point is 00:02:25 You've got the AI Engineer World's Fair coming up in San Francisco again this summer. And so I thought what would be fun is maybe we kind of just go through, like use those planning processes to kind of frame what people are thinking about and how that's changing, even in this compressed period of time. And maybe to start off, I think what you were just sharing with me about how you think about planning, I think is actually very useful context for folks to get into the conversation. Yeah, sure, thanks. The planning process, you know, this is my third year doing this, so I don't feel like I have fully unlocked it, but, you know, I think the main thing is that our source of alpha is that we stay close to the engineers and also we react faster than the machine learning conferences. And so these are the two things, because there are competitors, there are obviously many, many AI conferences. But for example, the research conferences like NeurIPS, ICML, ICLR, all of them, I don't know if people know.

Starting point is 00:03:28 In order to speak at one of them, you have to submit a paper six months in advance. And so in AI time, that's a long time. And that's just purely because NeurIPS is 38, 39 years old. It just wasn't that fast when they were started. And now it is. And, you know, it's hard to change a tradition like that. And then the other conferences are typically organized for business heads and talking heads and people interested in that kind of thing. And so they don't get too far down on the technical detail.

Starting point is 00:03:58 And I think the key problem with that is there's a ton of fireside chats, a ton of panels. Everyone shows up with no preparation whatsoever. They yap for 30 minutes. And then you're done and you don't remember any of it. So the thing that I emphasize a lot is what an engineer is going to take home with them to do their work, how to improve that. And that means I demand of my speakers that they prep a lot. But then that gets the results that we get, which is, you know, talks that actually matter. And the people that come actually, you know, want to meet the people that like build things and are hands-on. So, so yeah, I mean, like, it's weird because like it's

Starting point is 00:04:36 slightly less prestigious to be hands-on-keyboard than, you know, being like a CEO of like a major company, you know, going on stage talking about how we're all not going to need jobs in five years. But, you know, the people who are hands-on-keyboard also need a place to gather and that's what I do. Yeah. And my argument, as someone who's, you know, been involved very nominally, you know, at least with the last one, you know, helping MC and stuff, is I think that in general the action is happening hands-on-keyboard, you know.

Starting point is 00:05:10 And everyone's a builder now is sort of the short of it, you know, whether they're building with code or building other ways, if you are kind of fully participating in AI and agent land. So let's talk about going into the sort of the summit that was earlier this year. It felt like the big sort of inflection or change, again, from the outside, and you correct me if you were thinking about it differently, is an expansion from the question that had kind of characterized a lot of 2024, which is what is an AI engineer, right, and what does it mean to do that and what do you need to think about, to what is an agent engineer? And how does agent engineering, you know, interact with, change, modify, transform that framework? So I'd love to know if that was sort of how you were thinking about it, if that was the big sort of shift and what the implications were in terms of, uh, the conversations

Starting point is 00:06:04 that you wanted to facilitate. Yeah. Um, I don't know if people understand that it felt like a risk at the time, because we made this decision kind of November-ish. And for a lot of last year, actually, agents was kind of a bad word because there had been a few agent startups that failed. And people were like kind of taking it. You know, we were telling people to take agents off of their description because it was like so ill-defined and so overused that people didn't really like seeing it anymore. It was a counter-signal that you were doing something interesting. And then it really flipped with o1 and with all the other subsequent agent launches, with Operator, Deep Research, and all that. And Manus now is crazy. So like last year's World's Fair,

Starting point is 00:06:52 we had nine tracks. Only one of the nine tracks was agents. So like we really had to decide like, okay, this is the right time for agents and we're going to go all in on this one. And I think then there's also another consideration with regards to what can engineers uniquely do as opposed to researchers. And, you know, like we had other talks about open models. We had other talks about GPUs and inference and multimodal models and all that. But a lot of that starts to entangle with the sort of research layer of the stack. And, you know, like, those are great, but like, they are very dominated by people with the resources to do that research.

Starting point is 00:07:36 and there are already research conferences. So we really wanted to be an engineering conference. And I think that specialization in the engineering layer on top of models to turn them into agents was like the key. I was kind of waiting for that moment and it felt like the time to do it, especially with MCP's launch at the end of last year. And so I picked, you know, I announced that we were all in on agents and we like planned out, you know, here's like what we think the disciplines of agent engineering are. It turned out to be very different in the end, but we sort of scoped out what the call for proposals was.

Starting point is 00:08:11 And people came in and it was really popular. So, I mean, I can talk through like the individual talks that did well, but also like that's the high level, which is like we made a high-level bet on agents. And then I did my keynote with why now. Like, you know, I think there's a very strong element of timing where, you know, if you're correct but too early, you're still wrong. And I think that this whole trend of like 2025 being the year of agents, I think is probably correct. What do you think, you know, without rehashing the entirety of the keynote, what do you think the sort of the key inflection point was? Was it the reasoning models? Was it, you know, better infrastructure? I mean, what do you think sort of made that shift for it to become real? If I can share my screen, I'll just, you know, for people on YouTube, I had sort of nine points and a lot of them are more slow-bake, right? Like these have just been improving in terms of performance

Starting point is 00:09:11 into the model performance. So what I'm looking at on the screen now is the GAIA benchmark from Meta. By the way, we did an interview with the GAIA team last year if you want to learn more about what GAIA really is and what their intentions are. But things have just been improving on this S-curve, and we're just kind of in this top part of the S-curve now where we're starting to reach human baselines.

Starting point is 00:09:32 And I think the closer we are to the human baseline, the more we can start actually using them instead of just reporting the benchmark scores and saying that's cute but I'm going to go back to using my human intuition now. But then there's also all this other stuff, right? Like there's better capabilities, better tools. But also like, you know, I like to emphasize the second tier of stuff because I think people always focus on the first tier, which is that, oh, we've got, you know, a reasoning model now versus non-reasoning models. And that makes all the difference. And like, sure, like that helps. But you could build agents without reasoning models

Starting point is 00:10:05 and still benefit from all these other things. So like multiple frontier labs, like Grok 3 now has an API. You know, Gemini 2.5 Pro is arguably the best model in the world. Like you're not stuck to one model, therefore you can chain together different capabilities and get out of ruts. The depreciation curve of models is also a constant force, right?

Starting point is 00:10:28 And it's all Moore's law. And all that other good stuff that we can talk about later. And I think the last thing on the business side I want to highlight to folks is that we're actually really moving from a cost-plus model, where you're just charging based on a number of tokens and then maybe you mark it up a little bit, towards the outcomes that you deliver. And that's a huge change, right? Because now the reframing is going from, all right, how well can you consume my tokens, to how much of my job as a human can you do, therefore you are worth this much. And that's a couple orders of magnitude difference there. Yeah, super interesting. And so kind of what were the types of talks

Starting point is 00:11:07 that you were trying to bring together to, you know, instantiate this and bring it to life? And what hit either, you know, and particularly I'd love this sort of subjective take on, you know, what was popular that you didn't expect or did expect, you know, out of the talks that kind of were most resonant with people. Yeah.

Starting point is 00:11:25 Well, I mean, so we're still in the process of releasing all the talks. So I don't know in advance with everything. But you could see, like, anything from a big lab is good. Sure. Because we were organizing in New York, we really wanted to also focus on agents in production, right? I think the subtitle of the conference was Agents at Work. So really, I think a lot of people see demos, and then they're like, that's cute.

Starting point is 00:11:51 I can use it for fun demos and then probably they never actually use it. But who's actually using this thing at work? How much impact is it having? Do any of the Fortune 500s care? You know, like what have they figured out, because they're so smart and so big and so well resourced? What have they figured out that I haven't figured out? Right.

Starting point is 00:12:10 So I got people from like Jane Street, from Bloomberg, from BlackRock, which, by the way, their talk wasn't approved to be released. So everyone who attended got an exclusive that we cannot release. Alpha, baby. And Ramp as well. Yeah. So the production agents and AI talk in big companies like Jane Street, Bloomberg, Ramp, by the way, I think announced like an $11, $12 billion valuation after this talk.

Starting point is 00:12:41 No correlation there. And then I guess like RL was very, very hyped and we can talk about that one, but also Windsurf, which I think is always very surprising to me, that you can just be a second mover after Cursor and still do super well. And, you know, as long as you design your agents well and you have good enough differentiation, people will give you a shot. And I think that's very encouraging. I think that that just means that, you know, if you think something is over or a category is done, maybe you should just try harder. Yeah. Yeah. When I was sort of pulling out and looking at, you know, again, sample that not

Starting point is 00:13:24 everything has gone up, but some of the standouts included the RL for agents, the Windsurf thing. And then, of course, and maybe the one that we can talk about a little bit now is the MCP discussion. So that one I expected it to do well. So it is not. Yeah. Well, so I would love to like, one of the things that was really interesting is you wrote a post, or you guys wrote a post called Why MCP Won, which I think is super interesting.

Starting point is 00:13:49 I think I did a whole episode about it, basically. That's a specific post. Yeah. Yeah, I mean, listen, you keep creating great content. I will keep, you know, remixing it for this audience. So I thought part of it was interesting because it's so, so recent but still so far in my memory, now that MCP has completely taken over the conversation, right? And you had the Google CEO a couple weeks ago asking to MCP or not to MCP and then

Starting point is 00:14:15 yesterday answering that question or, you know, yesterday from when this was recorded. But you were reflecting on the sort of, you know, the period following the immediate reaction to it. So you basically argued that the immediate reaction was good, but then it kind of got quiet for a little while. So can you take us back to like, you know, end of November when it gets announced, into December and January and how you were thinking about MCP

Starting point is 00:14:37 and then where you saw the sort of the pickup in conversation and what you attributed it to? Yeah, I would give credit to Alex Albert, who I think has a timeline of events of MCP somewhere out there. But yeah, it was launched in November, and I think there was a lot of interest. I think it was top of Hacker News.

Starting point is 00:14:58 But I don't think there was a ton of immediate follow-through because people were kind of used to big companies launching protocols and then it kind of flopping or not really working. One recent one that people might not remember is Meta actually launched Llama Stack, which is a full open-source framework and stack. Every framework has a protocol embedded in it.

Starting point is 00:15:22 I mean, that was kind of the insight there, that everyone maybe went too far in trying to impose all these opinions at once. And Anthropic took a different approach and adopted a protocol that other frameworks can build on top of. And that maybe was that minimal viable product that actually was the only viable product, because everything else would have been too much, imposing too many opinions on everyone building stuff. So I think a lot of people started exploring it. And I think like integrating it into their workflows. And I think probably it was driven by the IDEs. So like Zed and Windsurf and then eventually Cursor, I think was the last one to add it.

Starting point is 00:16:02 Maybe Copilot as well. I'm not really sure the exact sequence there. But I think the, yeah, I mean, really I knew that it was going to be interesting. I thought it was like useful for a big lab to come out and announce something that wasn't a model, and it was how tools should interoperate with models. And I think insofar as OpenAI did that in 2023, 2024, with their function calling spec and tool calling, which did well but didn't have that spark of excitement that MCP had.

Starting point is 00:16:38 I think Anthropic taking a stab at it was really strong, and I wanted to feature them. And that was really about the amount of calculation that I had there. They really took it all the way. Anthropic has been a really strong supporter of my conferences. So they showed up and they had this like two-hour presentation and they had tons of new alpha that they never dropped anywhere else.

Starting point is 00:17:03 And then they also talked about their future plans, announced the official MCP registry at the conference. And it was all this stuff. And so like that sparked more excitement because I think the other thing about launching these things from big labs is that they need follow-through. People need to believe like this is an actively worked-on thing.

Starting point is 00:17:20 You know, I think one of my statements that you liked in the piece was that, you know, protocols are only as strong as the people who are already using them. And so you just have to believe that if I invest in MCP, all my buddies are going to invest in MCP. All the people that I want to be compared to are all investing in MCP. And so, like, yeah, I mean, you know, that workshop did really well. We released it. It was like a good hit. And I think, like, I saw the numbers earlier than anyone else just because I could see

Starting point is 00:17:48 the views, and I looked at the download statistics, and I looked at everything, where everything was trending. So I think this was the chart that I focused on, and I was like, okay, you know, do I call it now? Is it too early to call it? It was like, this is exactly where, you know, it's like three, four months into

Starting point is 00:18:06 MCP. There have been many, many attempts at creating some kind of agent benchmark, agent standard, but, you know, nothing like this. And I was just like, oh, you know, I think there's a decent chance that MCP has kind of won this. And I tried to articulate to myself why it won. And I ended up with these seven reasons, right?

Starting point is 00:18:27 Oh, six reasons, which is like it's AI-native, it's an open standard, blah, blah, you already went through this in your podcast. And guess where, I don't know if you've seen the chart since this post. I have not. No, I have. Yeah, well, we can click on it. And it has done, it might need a little bit bigger screen for the data.

Starting point is 00:18:47 But basically, I projected that MCP would overtake the incumbent of open... So this is just GitHub stars, right? So we are going from 0 to 15K in a very short amount of time and faster than anyone else. But the incumbent is OpenAPI. That's the big behemoth

Starting point is 00:19:03 that is basically the old industry standard. And that one is at 30,000 stars. So it's basically continued to go there. So I was doing a conservative projection, I was like, it'll hit 30,000 stars in July-ish. No, it's crossed it this month. It's crazy.

Starting point is 00:21:47 Yeah, I mean, it was to me, you know, picking up on the signals strictly from competitive pressure when, you know, because OpenAI was fascinating because when they announced

Starting point is 00:22:29 the Agents SDK, it was sort of like, all right, cool. There's, you know, the natural interpretation, I think, you know, perhaps unsophisticated, was, got it. There's going to be an agent sort of protocol war, right? Like, that's another vector of competition in terms of, you know, developer allegiance that we're going to go for. And then when, like, five minutes later, they were like, we love MCP. We're supporting MCP, too. I was like, all right, well, that's a whole different ball game. Obviously, Google then follows. And, you know, I did resonate with that point that you made in terms of basically the network effect of protocols. Like, there is such huge advantage, obviously, for building where other people are building. And it just got over the bootstrap problem so quickly

Starting point is 00:23:10 that, you know, it was almost like this is the type of thing where if everyone can get together more quickly and make the decision, it's so good for everyone in terms of just collective value that everyone provides each other, you know? And, you know, now you've got people who are, like, actually thinking about this as a category of new startups, not just a new tool to use, but an actual category of things to build. You've got Dharmesh Shah, who I know was on the show recently too. Like, he's spitting out on LinkedIn all the MCP-related startup ideas that he doesn't have time to do, you know. And I think he funded one of them when someone responded saying that they were doing that thing.

Starting point is 00:23:46 Oh, wow. Awesome. Yeah. So it's very cool to see how fast that ecosystem has emerged and is starting to flourish. Yeah. So I think people think of me as an MCP shill just because obviously I did just shill it a bit. But I am a bit measured about it, right? Like I have seen protocols get hyped and then get very not hyped.

Starting point is 00:24:08 And, you know, so the most recent version of this in developer land is GraphQL. It was like, yeah, this is like a better layer over REST. Everyone's doing like REST versus GraphQL threads and all that. And it's very reminiscent of MCP versus OpenAPI. It's like basically the same. And, by the way, all the issues that came up with GraphQL also are emerging with MCP, like how do you do authorization?

Starting point is 00:24:31 How do you connect with remote MCPs and do discovery of them? Like the exact same things. Because these are all the same type of problems, which I call the M times N to M plus N problem, right? Like, that's how you sort of, you solve combinatorial issues by adding one layer of abstraction that has a standard interface

Starting point is 00:24:48 that everyone plugs into, right? Like, common concept, everyone understands that. The authors of MCP are super aware of it. They talked to me about it after we did a podcast with them. So, I mean, I think good governance and good judgments

Starting point is 00:25:01 is still going to win the day. There is a way that they can screw this up. And I think, I was also on the TBPN podcast. And they were like, is this going to result in like an explosion of agents? And I'm like, there's already an explosion of agents. This is not really changing that trajectory in any way. But this basically improves the quality of integrations. It's a boring answer. It's just like, you know, the integrations that you write,

Starting point is 00:25:26 that you expect in one app, you're going to see in another app because it's super easy to add them. And you don't have to wait for them to add like, you know, Notion just because it's on their backlog and they don't have it prioritized yet. Like, no, it's just out of the box because Notion just launched an MCP yesterday. And that's it. It doesn't make for super agents or anything apart from they are wider, they can integrate with more things, but we still have to solve a lot of the other core problems of agents. So I think it's a good caveating. Where I look at it from, and so, you know, again, bringing this sort of back to an enterprise audience, a lot of what I think enterprises are trying to figure out right now is

Starting point is 00:26:04 interestingly, when they think about agents and agent capabilities, I actually think that directionally they're correct. When they're imagining in their kind of ideal mind's eye what agents can do, they're kind of right about where things are headed. The problem is just that it's not there yet in most cases, right? The things that agents can do are more limited. They're more discrete. You know, yada, yada, yada.

Starting point is 00:26:27 Right? There's some gap between what they're imagining and really excited about and what they are now. And a big calculation is how much, how fast, and in what way to invest, given the rate of change. And this is really, really challenging because it is sort of obvious that the answer can't be, in most cases, to just go all in on building the thing that you most want to build right now, because in many cases it's just not exactly where they want it to be. But it also can't be, on the other end of the spectrum, just wait for it to get ready, because you're going to be behind by then. And so I think that they're trying to understand what to do in the interim. And so it's really interesting when you have, call it accelerationist forces, which is sort of another way of describing, I think, what you just described with MCP.

Starting point is 00:27:15 It's boring because it's not going to make more agents, it's not going to change the trajectory. But by, you know, to your point, not having to wait around for that Notion integration, not having to wait around for, you know, some other thing that you're waiting for, it does feel like it is likely to accelerate the speed at which new capabilities come online. And sort of each new thing that gets connected to the ecosystem is likely to open up some additional use case. Yep. I broadly agree with that. Yeah, we can talk about the other elements of agent engineering that we got from the conference. Yeah, I think MCP is a great protocol. But like, imagine if there were standards for all the other stuff. Yeah, that'll be great.
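
The "M times N to M plus N" point from the MCP discussion above can be made concrete with a small count. This is only a sketch: the app and tool names are made up for illustration, and "protocol" stands in for any shared interface such as MCP rather than for a real API.

```python
# Hypothetical clients and services, for illustration only.
apps = ["ide", "chat_client", "support_bot"]        # M = 3 apps
tools = ["notion", "github", "slack", "postgres"]   # N = 4 tools


def point_to_point(apps, tools):
    """Without a shared protocol, every app needs its own adapter per tool."""
    return [(app, tool) for app in apps for tool in tools]


def shared_protocol(apps, tools):
    """With a shared protocol, each app and each tool implements it once."""
    client_side = [(app, "protocol") for app in apps]
    server_side = [("protocol", tool) for tool in tools]
    return client_side + server_side


m_times_n = len(point_to_point(apps, tools))
m_plus_n = len(shared_protocol(apps, tools))
print(f"adapters without a protocol: {m_times_n}")
print(f"implementations with a protocol: {m_plus_n}")
```

With three apps and four tools this counts 12 one-off adapters against 7 protocol implementations, and the gap widens as either list grows, which is the sense in which an integration written once shows up in every app.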

Starting point is 00:27:55 Let's talk about the sort of the outside of that, the other pieces. Yeah. Well, so the other thing that happened at the conference that was followed up after was OpenAI actually previewed how they think about agents. This is the OpenAI for VPs of AI talk, the one that was the day before you came. And they said an agent is an AI application consisting of, one, a model, equipped with, two, instructions that guide its behavior, three, access to tools that extend its capabilities, that's MCP.

Starting point is 00:28:30 Four, it can be encapsulated in a runtime with a dynamic lifecycle. So that was what they previewed, and then they launched the Agents SDK after that. I mean, they told me that that's what they were doing. So I had full knowledge of that. So it's interesting that, like, that is one form of the definition. And then Lilian Weng, who used to be head of safety systems at OpenAI, had a different characterization of agents

Starting point is 00:28:56 where she was like, agents is LLMs plus memory plus planning plus tool use, right? So everyone agrees on the model layer, everyone agrees on tool use, and then they disagree on everything else. Like,

Starting point is 00:29:08 the Agents SDK has no memory, no planning skills. And then Lilian Weng forgot that you need prompts for the models. And also there needs to be this like runtime, effectively this while loop of like the agent in the loop deciding what to do next. And so I think it was like very disordered.
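
The pieces named in the two definitions above (a model, instructions, tools, and a runtime that is effectively a while loop) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the Agents SDK or any real library: the `Agent` class, the `scripted_model` stand-in, and the `add_one` tool are all hypothetical, and the loop is capped with `max_steps` so it always terminates.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

# A "model" here is any function that reads the instructions and the history
# and returns an action: ("call", tool_name, argument) or ("finish", answer, "").
Action = Tuple[str, str, str]
Model = Callable[[str, List[str]], Action]


@dataclass
class Agent:
    model: Model                              # 1. a model
    instructions: str                         # 2. instructions guiding behavior
    tools: Dict[str, Callable[[str], str]]    # 3. tools extending capabilities
    max_steps: int = 5                        # safety cap for the runtime loop
    history: List[str] = field(default_factory=list)

    def run(self, task: str) -> str:
        """4. The runtime: a loop in which the model decides what to do next."""
        self.history.append(f"task: {task}")
        steps = 0
        while steps < self.max_steps:
            kind, value, arg = self.model(self.instructions, self.history)
            if kind == "finish":
                return value
            result = self.tools[value](arg)   # run the tool the model asked for
            self.history.append(f"{value}({arg}) -> {result}")
            steps += 1
        return "stopped: step limit reached"


def scripted_model(instructions: str, history: List[str]) -> Action:
    """Stand-in for an LLM: call one tool, then finish with what it observed."""
    if len(history) == 1:
        return ("call", "add_one", "41")
    return ("finish", history[-1], "")


agent = Agent(
    model=scripted_model,
    instructions="Use tools when they help.",
    tools={"add_one": lambda s: str(int(s) + 1)},
)
answer = agent.run("what is 41 + 1?")
print(answer)
```

The `history` list is only scratch context for a single run. Longer-lived memory and explicit planning, the parts the two definitions disagree on, are deliberately left out of this sketch.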

Starting point is 00:29:27 And, you know, I don't like that. It seems very unstructured because people don't really take defining what agent engineering is seriously. So I took a stab at it. And there are six elements, right? So it's IMPACT, I-M-P-A-C-T, just because, you know, whenever there's a lot of elements, I like to have an acronym to remember them. I'm not trying to push it. You know, Congress does this for the names of bills too.

Starting point is 00:29:50 You got to make it memorable. I remember, I think there was this JEDI contract or something that, anyway, it was a really interesting acronym. But IMPACT, so the only forced acronym in here is I. I is intent, because intent is literally borrowing from what OpenAI just used for what they call prompts. But I think you also need to encode goals and evals, meaning that an eval is kind of a prompt, because once you run an agent against an eval, you can take the negative results of the eval and then prompt it again to get the positive result.
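The idea that an eval is kind of a prompt can be sketched in Python as a loop that runs the evals, collects the failures, and folds them back into the prompt. Everything here is a hypothetical illustration: `run_evals`, `refine_prompt`, and `toy_agent` are invented names, and the toy agent only mimics a model that responds to feedback.

```python
from typing import Callable, List, Tuple

# Hypothetical eval case: an input plus a check on the agent's output.
EvalCase = Tuple[str, Callable[[str], bool]]


def run_evals(agent: Callable[[str], str], prompt: str, cases: List[EvalCase]) -> List[str]:
    """Return the inputs whose outputs failed their check."""
    return [inp for inp, check in cases if not check(agent(prompt + "\n" + inp))]


def refine_prompt(prompt: str, failures: List[str]) -> str:
    """Fold the failed cases back into the prompt as extra guidance."""
    notes = "\n".join(f"- Previously failed on: {f}" for f in failures)
    return prompt + "\nAvoid these failures:\n" + notes


# Toy agent, purely for illustration: it upper-cases the task only when the
# prompt tells it that this task failed before.
def toy_agent(full_prompt: str) -> str:
    task = full_prompt.splitlines()[-1]
    return task.upper() if "Previously failed on: " + task in full_prompt else task


cases = [("hello", lambda out: out.isupper())]
prompt = "Answer in uppercase."

first_failures = run_evals(toy_agent, prompt, cases)      # negative results
if first_failures:
    prompt = refine_prompt(prompt, first_failures)        # prompt it again
second_failures = run_evals(toy_agent, prompt, cases)     # now passes

print(first_failures, second_failures)  # prints ['hello'] []
```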

Starting point is 00:30:28 So that is your intent. What is the intent that you're encoding or classifying and executing on? Everything else is very straightforward. M is memory. P is planning. C is control flow, which is the runtime, the if-else driven by the LLM.

Starting point is 00:30:46 A is authority because the OG meaning, the human meaning of agent, like my real estate agent, my estate agent, whatever, is you work on my behalf because I trust you to work on my behalf to look after my interest.

Starting point is 00:31:01 And again, in the technical definitions, for the engineers, trust is the last thing that they think about. But really, for consumers, for enterprises, trust is probably number one.

Starting point is 00:31:13 Like if I don't trust this thing, I'm not going to use it. And the last one is T, tool use, which is the thing that everyone agrees on. So let's talk about maybe bringing this sort of forward into reality with where we are now. So you're living inside sort of fast-changing understanding of this space. And now you're once again sort of having to put this back into a structure in the form of the run of show for the summer summit.
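As a rough way to see how the six IMPACT elements line up against the two earlier definitions, here is a small Python sketch. The mapping of "instructions" to intent and "runtime" to control flow is one reading of the conversation, not an official taxonomy, and the variable names are made up for illustration.

```python
# The six IMPACT elements as listed in the conversation.
IMPACT = {
    "I": "intent",        # prompts, goals, and evals
    "M": "memory",
    "P": "planning",
    "A": "authority",     # trust to act on the user's behalf
    "C": "control flow",  # the runtime loop driven by the LLM
    "T": "tool use",
}

# Elements each definition mentions, as summarized above. The model itself is
# assumed by both and is not one of the six elements.
openai_definition = {"intent", "tool use", "control flow"}  # instructions, tools, runtime
weng_definition = {"memory", "planning", "tool use"}

shared = openai_definition & weng_definition
missing_from_both = set(IMPACT.values()) - (openai_definition | weng_definition)

print(sorted(shared))             # prints ['tool use']
print(sorted(missing_from_both))  # prints ['authority']
```

Under this reading, tool use is the only element both definitions share, and authority is the one neither covers.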

Starting point is 00:31:42 So how has your thinking about what needs to be included in that set of talks, that set of conversations, changed since you were planning the last event? And what does that look like in practice for the types of tracks that you guys are doing? Yeah. The feedback loop is very tight, right? So MCP did so well. So now we have doubled down and we just announced an entire MCP track with the MCP team hosting. And then we're just letting them invite their major contributors.

Starting point is 00:32:09 And it's like a little MCP mini conference, right? I just love that I get to make calls like this because I know that the other conferences cannot do this. So we'll just do it. We'll just do it just because we can. And we're doing the same for LocalLLaMA because they are long overdue for a conference as well. They are the biggest community of open models out there. And the reasoning and RL talk, Will Brown's talk from Morgan Stanley, did so well that we announced a reasoning and RL track. So basically, I am not trying to push the concept of agent engineering

Starting point is 00:32:41 that hard, just because I like to talk about these ideas and then let them organically gain traction, because I'm not going to change people's minds if the timing is bad or if the concept isn't quite a fit, right? But I just want to focus on individual problems or domains where we can have the top speakers in the world gather and they'll primarily do their talk, but really they're there to meet each other. I fully know, as the guy who curates the talks, that the talks don't actually matter that much. And it's just the people showing up and chatting in the hallways. So, you know, it is what it is. We want to do a good show. We want to help people who are not in San Francisco get a sense of what the state of AI is. But at the

Starting point is 00:33:29 end of the day, people are just going to meet in person and talk offline, you know, to decide what to do next. So, yeah, and that's the whole thing. Quick response to what is working and what people want more of. And then also, for the summer conference, there's the set of things that I think you must have, even if they're not super exciting. Like, no one really cares about security, but they do care at the end, especially when they're putting it to work. So yes, we have a security track, because we have to. And then my job is to find the most interesting practical speakers that won't bore you to death about things that you already know you should be doing.

Starting point is 00:34:13 So stuff like that. I also wanted to focus on jobs. So I think one of the smartest things I did with AI engineering was just literally name it after the job that I was trying to create. And I think that there are adjacencies around engineering for AI PM and AI designer roles that work with AI engineers. So I'm giving them a shot. I'm inviting designers and PMs to talk about how they work with engineering, or just on how they should do their thing. We'll never be a full PM thing, a product management conference. We'll never be a full design conference. But I think if we can show them that

Starting point is 00:34:54 they have a seat at the table with engineering, I think that's something that they want. Yeah. So this actually, and one of the themes obviously that I wanted to talk about is vibe coding. And I actually think that this feels like an interesting bridge here because we just had this note from the Shopify CEO, right? That's sort of the new AI mandate. And one of the pieces of it, obviously the one that everyone focused on, was the no new hiring until you've proven that an AI can't do it. But one of the ones that was most resonant to me, as someone who's building a company in this new context, was effectively the, you got to be prototyping

Starting point is 00:35:31 everything with AI, right? And so, you know, he didn't put it quite this crisply, but there's a soft ban on talking about product stuff as opposed to showing product stuff that you can get your hands on. And that's a shift that we've made internally as well. Out of a core contributing team of six at Superintelligent, there's one lead engineer, the CTO, but everyone is using Lovable or Bolt or Vercel or whatever their preferred tool is when they have ideas for features or things they want to change, right? It's just become sort of the norm. It's just from a pure efficiency standpoint. It's way easier for people to, you know, go to step two or three in their own

Starting point is 00:36:16 thinking about what they're trying to articulate by actually seeing some weird little prototype of it. And it's massively easier. I mean, it's a 2x, 3x, 5x difference in terms of their ability to communicate it to others at speed. So I think it's super interesting just to have that bridge in, because part of the new capability of AI engineering is that it is inherently more invitational, or maybe accidentally more invitational is a better way of putting it, for non-engineers to engage with engineers on their own level in some way. Yeah. I'm not sure what to say about that apart from, I generally agree. You know, I think it's an enabler for all parts of the organization.

Starting point is 00:37:01 And, oh, man, I always want to have this, like, recommended stack of things that people should be trying out. But I know that if I do this, then people will be pissed at me for not including things or miscategorizing things. But it's almost a necessity that you should have one of these in your company, right? And it's fascinating. I think the people who are self-driven and don't mind getting their hands cut on the bleeding edge sometimes, they will find the workflows that make them a lot more productive, and that will help them win in the competition of ideas, I guess. So I'm a little bit idealistic about that. But yeah, happy to double-click on anything.

Starting point is 00:37:44 I think vibe coding, by the way, was obviously coined by a mentor of mine, Andrej, and somewhat taken out of context. Yeah, big time. Very taken out of context? Yeah, wildly. He was literally talking about vibing. Like, he didn't say a glass of wine, but you kind of imagined him listening to, like, you know,

Starting point is 00:38:04 I don't know, like modern jazz and drinking wine while he's doing, he's talking to his coding tool. Yeah, but I think he's coming from a place where he has expertise to look at the code, not read every line, but get the vibes of the code. And if it looks correct, it's probably correct, and commit it and move on. And now it's being taken to mean that you don't need expertise,

Starting point is 00:38:27 and you can just vibe it out and hope for the best. And a lot of people actually are very successful. That's why Bolt and Lovable are doing so well. But I think then they also get into trouble, and they don't know how to get out of it. And there will be a lot of wasted dollars on that. Maybe some of that waste is fine because it's so productive when it does work. But I think what I'm trying to enhance here is, what are the best practices? How do you stay on the rails and not go off track when you are vibe coding?

Starting point is 00:38:56 Yeah. Look, I think the explosion of vibe coding has clearly touched a nerve as an expansionary force for who gets to create what. With it has come this whole new set of challenges and problems, which need to be taken on one by one. And I think about that a lot in terms of how accessible it is to the enterprise. And maybe it's not just vibe coding with the text-to-code tools, but this entire new set of agent, or agent-enabled, coding environments. There's a strange, or perhaps unexpected, resistance to a lot of this inside big companies. And the illegitimate part of that, I think, is often just a desire to not

Starting point is 00:39:51 see things change, you know, like engineers who kind of like the speed at which they get to move inside big companies and don't necessarily want a force for that to quintuple overnight. But the more legitimate critiques are that a lot of these things aren't optimized for big legacy code bases that have thousands of different contributors, where the person who wrote the code today might not be there tomorrow. But it also feels like, for every single new challenge associated with this different approach, every day there's two new startups that pop up to solve that particular challenge.

Starting point is 00:40:27 It's like, you know, whack-a-mole for these new issues. What do you think about, I guess, as a quick preview, what are you hoping to bring with that vibe coding track? Is it just actually getting into what those challenges are and how to solve them? Do you have a particular take on

Starting point is 00:40:43 that set. It's, no, it's more experimental than that. I probably just want to sample a set

Starting point is 00:40:51 of the conversation for people to discuss, right? So I want a live demo of how a good vibe coder

Starting point is 00:40:58 vibe codes. Just maybe because people can learn from that, right? I want to have a negative talk on

Starting point is 00:41:05 vibe coding, like why you should not vibe code or why vibe coding is doomed or whatever, right? I want to have

Starting point is 00:41:10 a talk from someone building a vibe coding platform, probably Bolt, because I'm closer to Eric from Bolt. And I want to sample that space and allow people to explore because I don't really know what I think about it myself. All I do know is I'm very pro people having more autonomy and power to create software. I have so many designers and PMs telling me that, just because of these coding tools, they are able to do the things that they wanted to do without the permission or, like,

Starting point is 00:41:49 you know, the prioritization from the engineering team. And that's fantastic for them. And it's fantastic for their customers too. And so there's something here. I honestly don't know if vibe coding is the best name for it, but it's what people have right now. So I got to call it that. Yeah, it's super interesting. So, this is awesome. I love getting to talk to you about this stuff. Maybe by way of closing, a lot of the conversations that we have are around leaders thinking about AI and agentic transformation across their whole companies. And, you know, as I was kind of alluding to, one of the interesting tensions is that it feels like outside of the enterprise,

Starting point is 00:42:35 a lot of the use cases that surround coding and engineering are places that have the highest product market fit, or at least that's clearly where the biggest change is happening. And yet, that tends to be a more recalcitrant area when it comes to the engineers working inside big companies. If you were a general leader, right, a CEO of a company who's trying to think about how to help support, encourage, demand that their engineering organization starts to evolve on the basis of these changes.

Starting point is 00:43:10 How would you think about that? Or where would you have them start to pay attention or dig in? Yeah. This is why we have a leadership track. We've rebranded the leadership day into AI architects because that's what Bret Taylor calls them.

Starting point is 00:43:26 Yeah, so it's weird, right? Because on one hand, the engineers have all these terms and jargon they're throwing around. And then on the other hand, I feel like the leaders just have to keep their heads down and mind the stuff that their engineers are not doing, which is the compliance, the security, the legal, you know, all that stuff that people don't want. But there is one meta thing where they have to be on board, which is defining strategy and hiring. So those are the two things where they really need to be very much in sync with the engineers, and the other elements they can kind of dictate to their engineers.

Starting point is 00:44:07 And so, so like, does that make sense? Is there anything that you want to double click from there? No, I know. That totally makes sense. Yeah. So, like, there is a session on defining strategy. I basically always have a hiring talk in each, in every conference. And I think it's really close to software engineering for now,

Starting point is 00:44:29 in the sense that you're 90% a software engineer, and then like 10% of your interview loop or requirements or whatever will be your requirements for AI. But I think that will separate over time. Once you start building all these disciplines in tool calls and planning and control flow and authority and all that, that starts to become its own discipline, which is why I think this AI engineer job description, you know, we're still exploring it. We're still building it out, you know, three years in.

Starting point is 00:44:58 And I mean, it's exciting for me because I get to help define it and I also, you know, meet everyone important in that field. But, like with everything, it's a nebulous concept. The lines are definitely blurry. No, it's awesome. Well, look, I'm super excited. I use these events as benchmarks for what I should be paying attention to. I would encourage other people to as well. Thank you for coming on the show again, and I look forward to the next time we get to catch up. Yeah, thanks for the support, man. Come if you can. Yeah, it's June 3rd to 5th in SF.

Starting point is 00:45:35 And we're making it a yearly thing. So we're already planning 2026. It's going to be outdoors, which is fun. San Francisco is beautiful in the summer. Yeah, love it. All right. Thanks, Sean. Thanks.
