The Infra Pod - Coding agents need infra to apply code changes! (Chat with Tejas from Morph)

Episode Date: February 9, 2026

Tim (Essence VC) and Ian (Keycard) sat down with Tejas Bhakta (CEO of Morph) to chat about building infrastructure for the fastest file edit APIs for coding agents. He shares how Morph delivers 10,000 tokens/second through speculative decoding, why Cursor removed fast apply, and his vision for autonomous software that updates without prompts. The conversation covers subagent architecture, code search optimization, and the path to reliable AI coding at scale.

Timestamps:
0:00 - Introduction
0:29 - Why start Morph and pivoting through YC
1:23 - The fast apply insight from Cursor
3:42 - How fast apply works and speculative decoding
6:09 - Use cases: when and where fast apply matters
8:19 - Why Cursor removed fast apply
9:22 - Morph's value prop beyond speed
11:58 - Subagent architecture and SDK approach
14:45 - Semantic search and code-specific tooling
19:52 - Building custom coding agents vs platforms
22:42 - Adoption inhibitors and the future of codegen
23:26 - Spicy take: Autonomous software and reliability

Transcript
Starting point is 00:00:03 Welcome to the InfraPod. This is Tim from Essence VC and Ian. Let's go. Hey, this is Ian, co-founder and CEO of Keycard, the greatest place to go for trusted agentic applications. Today, we're joined by the CEO and co-founder of Morph, the place to go for the best and fastest file edit APIs for your coding agents. Tejas, how are you today, and what in the world made you decide to start a company? I'm good, I'm good.
Starting point is 00:00:31 I don't know. Mental illness. I don't know. Like, I don't know. I was in Pivot Hell for a while, honestly. So, I mean, why did I decide to start a company? I mean, it felt inevitable, but it feels good to be working on something that's working. And what was the insight?
Starting point is 00:00:44 So you say you were in Pivot Hell, like, what were you working on? And then what gave you the insight to where you are today? Like, what got you to build Morph the way it is today as it shows up, and the problem you're specifically solving? Yeah. So I did YC in summer '23. And I think since then, I've pivoted around like three or four times, started Morph around February 2025 and launched it about five months ago. But I sort of came into it just from, like, being an early Cursor user.
Starting point is 00:01:15 So Cursor was kind of the first company that did fast apply. And so after using it, I was digging into why Cursor feels so good. And it was largely because of fast apply and Cursor Tab, of course. And yeah, it felt like a useful primitive that other people were going to want and need. I also happened to know how to do it, because I used to do training optimization and inference optimization at Tesla. And so, yeah, it just sort of made sense with both my background as well as where I saw things going and what the current thing was. It just sort of all lined up. And what was the pain in your experience using Cursor that made you say, hey, you know what?
Starting point is 00:01:53 There's actually like a need for a dedicated company on this problem space. Was there an insight? There was more in time, experience you had that sort of led you to the, that like bulb moment that said, okay, here's, here's where, here's like a possible problem I can solve and it's just useful for other people. Yeah. Yeah. I mean, it was mostly just that every other product other than cursor just felt really bad at the time. And like cursor sort of had their fast apply model that basically, so what fast apply is, it sort of like you have the original code and you want a model to make an edit to that code. Your options are either to have the model
Starting point is 00:02:28 directly output search and replace strings, which is, like, output the tokens to delete and what to replace them with, versus the fast apply approach, which is to have the model output in this sort of lazy format, which is largely just the new tokens. And then you pass this to another model that will figure out how to merge it in. And so that's the fast apply approach. And fast apply can be done a lot faster, because you can use speculative decoding to get this to be extremely fast. So yeah, going back to your question of, like, how did I decide to start selling this? I think it was mostly just taking what Cursor had done already and saying, how can we improve this? How can we make it faster?
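To make the two formats he contrasts concrete, here is a minimal sketch; the markers and field names are illustrative assumptions, not Morph's or Cursor's actual wire format.

```python
# Two ways a coding agent can express an edit (formats are illustrative only).

# 1) Search/replace: the model emits the exact text to find and its replacement.
#    This fails when the "search" string doesn't exactly match the file.
search_replace_edit = {
    "search":  "def total(items):\n    return sum(items)",
    "replace": "def total(items, tax=0.0):\n    return sum(items) * (1 + tax)",
}

# 2) Lazy "fast apply" format: the model emits mostly new tokens plus placeholder
#    comments for unchanged code; a second, fast model merges it into the file.
lazy_edit = """\
# ... existing imports ...

def total(items, tax=0.0):
    return sum(items) * (1 + tax)

# ... rest of file unchanged ...
"""
```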
Starting point is 00:03:03 How can we sell it to everyone else that wants the Cursor-like experience in their products? So I think most of us use Cursor pretty much on a daily basis, and people use it all the time. It's probably not very obvious to folks what's really happening behind the scenes, because what people think of is just an LLM generating code, right? You know, Claude Code does the same thing. But, you know, I think there's sort of an applied challenge there. Can we go deeper a little bit? Because I think most engineers may not actually look at that layer at all. And so what are its limitations? Why can't Cursor just do a faster apply? What's really hard about it?
Starting point is 00:03:38 And what's the fundamental approach you're taking that really makes you stand out? So give us maybe even why you stand out, right? Because what we think of is fast apply. But how fast, right? In what situations? Yeah. So, I mean, early on, like when there was Claude 3.5 Sonnet and Claude 4 Sonnet, like, LLMs were really bad at being able to do search and replace.
Starting point is 00:03:57 It would always have string match failures. There are still string match failures today, but it's not like, it's not the worst in the world. You can handle it. You could like handle edge cases and fall back and stuff like this. But early on it was just like there was no approach that really worked at any sort of scale than fast apply. Today, it's mostly like vibe coding platforms that care because it's faster, like net faster, like on average.
Starting point is 00:04:21 If you prompt correctly and get the right sort of format and use the lazy format, it's roughly 30-40% faster. But yeah, I mean, the labs have trained on the search and replace format. So, like, it's not the end of the world anymore. So how is it so fast? Right now, we're around 10,000 tokens per second, which is, like, the fastest model that exists. It's like four times faster than what's on Cerebras. And the main way we do this is that, like, we're using the original code as a prior
Starting point is 00:04:45 when we're doing the output. Maybe not to get too technical, but essentially speculative decoding is, like, you start from a guess and you have the LLM verify the guess, essentially. And when you have the original code and the final code that you're trying to get, usually the final code is pretty similar to the original code. Roughly like 70 or 80% of the content is almost exactly the same. So you're essentially using the original code as the guess when you're producing the output. And then that leads to around a four or five x speed-up.
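A rough way to see why the original file makes a good draft: measure how much of the edited file is copied verbatim from the original. This toy (standard-library only, not Morph's actual pipeline) just computes that fraction; those shared runs are the draft tokens a speculative decoder can verify in parallel instead of generating one at a time.

```python
# Toy illustration: most of an edited file is verbatim runs of the original,
# which is what makes the original a strong draft for speculative decoding.
import difflib

original = "def total(items):\n    return sum(items)\n"
edited   = "def total(items, tax=0.0):\n    return sum(items) * (1 + tax)\n"

matcher = difflib.SequenceMatcher(None, original, edited)
reused = sum(block.size for block in matcher.get_matching_blocks())
print(f"{reused / len(edited):.0%} of the edited file is copied from the original")

# The higher that fraction, the longer the runs of draft tokens the verifier
# accepts per step, which is where the 4-5x speed-up over plain decoding comes from.
```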
Starting point is 00:05:13 And then you do some kernel tuning as well to get that to be really fast. Because this task is very niche. So we're almost making our own inference engine just for this task. So if you brought another chat model to our fast apply inference engine, it would not be fast. It's just for this task. What's probably going to be interesting to talk about is what is the use case you show up to most?
Starting point is 00:05:35 Because I feel like it sounds like you can basically apply this to all the vibe coding tasks, right? And you sell to vibe coding platforms, which, you know, have a general expectation of how people are going to vibe code. There's a lot of variance in usage, actually. There are people using a lot and people using a little. I guess for you, if you're selling to vibe coding platforms, does it matter what the users of their platform are doing,
Starting point is 00:05:57 because you're just fast-applying everywhere, or does the context actually matter? Like somebody's writing a large number of files, some people working in big repos. Do you apply much more to a very specific type of user? Maybe talk about the characteristics of where the challenges come from. Yeah, so you really use fast apply when making edits. You don't necessarily need it when you're making a new file, for example.
Starting point is 00:06:20 Like you wouldn't need it if you're creating a new Component.tsx file. There's nothing there already. So you have to have the main LLM, like Claude, output that full thing. So yeah, mostly for edits. And then what are the circumstances, mainly for the edit use cases, but, you know, is it people building their own agents from scratch? Is it people taking things like OpenCode or, you know, existing coding agents off the shelf? What are the breadth and depth of the use cases where Morph fits and solves a unique problem, right? Because I imagine the Cursor use case, you know,
Starting point is 00:06:51 Cursor comes out of the box with these experiences, is the idea in the Cursor use case that Morph replaces it because it's better? Like, help us understand sort of where it all fits, into what type of agentic use cases and why. Yeah. So once Cursor switched to token-based pricing, they got rid of fast apply. But it's still the better option for a vibe coding platform, because you really care about wall clock time, right? Where you have the user's prompt and you have the completed edit.
Starting point is 00:07:19 So the time from that to that is what you care about. And that being roughly 30% faster on average in a vibe coding platform means higher conversion rates. And so it matters a lot in that use case. And so, I mean, you would use fast apply today in Cursor if you wanted it, if you prefer that experience in Cursor. But when we started, it wasn't the type of thing you would use in Cursor. It's more the type of thing you would use if you were building a vibe coding platform. And going forward, our future models are going to be really useful for you to potentially use in Cursor,
Starting point is 00:07:48 but it wasn't always that way. So why did Cursor take away fast apply? You've kind of given us some of the reason. So they switched to token-based pricing. You know how before they had the $20 plan, which had like X amount of usage? I think a lot of it was a push to being profitable as well. So they were running dedicated infra
Starting point is 00:08:08 that they were giving away for free with fast apply before. And now they do a sort of premium per token on Claude. And so using Claude for everything sort of incentive-aligns them with the new pricing model. So I think it's largely because of the new pricing model. Got it, got it. So I think this is interesting to maybe talk about: what is your value prop to the vibe coding platforms? Because when you think about Cursor, hey, they moved away already because they have their incentive for pricing, because they are priced on token usage.
Starting point is 00:08:37 So what happens to all the other vibe coding platforms? Why should they use you, or why shouldn't they just go back to following kind of what Cursor is doing? What sort of advantages are you trying to bring to those folks when you're selling to them? Yeah. So for fast apply, it's really wall clock time again, like from a user's prompt to completed edit. Roughly what you see is that conversion rates sort of follow linearly with
Starting point is 00:08:59 speed. So if you double speed, you'll roughly double conversion rates, assuming you don't add inaccuracy along the way. Yeah. And so that's what we bring to vibe coding platforms. And to be clear, fast apply is sort of our first model that we did. But we have semantic search turnkey
Starting point is 00:09:14 now. We have Git storage. We have a model router, and we have more coming out in the future. Oh, got it. So it's not even just code merging and applying, actually; you have other things. Yeah. So, I mean, the larger thing is just like these sub-agents and models that we're going to be able to, like, give to Claude that will augment its performance for coding agents. If that makes sense. Like, so if I'm a vibe coding platform, let's say I'm a code generator to do mobile apps.
Starting point is 00:09:43 There's so many of them. What do you sell to them? I guess wall clock time and stuff like that is maybe a little bit too niche or nuanced. Do you have a general value prop when it comes to these folks? Like, hey, I'm going to help you save cost, or I'm going to help you make sure you have a better user experience overall in certain respects. Is there a more high-level pitch to them? Yeah.
Starting point is 00:10:04 I mean, the high-level pitch is just better, faster, cheaper codegen. But yeah, I mean, it does get into the weeds. Like, if you think about what the core tasks of a coding agent are, it's to write code, to edit code, to search for code, and then to try code, maybe. And yeah, so those are the sort of primitives that a coding agent needs to do really well, and we're basically
Starting point is 00:10:24 moving a lot of this to sub-agents as LLMs get really big and a lot better. So I can go into that a little bit more if you want. Yeah, I think that's kind of what we were getting at. And I think, like, how fast, right? Like, how much better? You know, it's really hard to tell, actually. You know, most people don't even touch this layer themselves.
Starting point is 00:10:41 Give us some of the specifics: what is the before and after? Yeah. Like, in order for something to be worth the complexity of moving to a sub-agent, I think it needs to be cheaper and faster at the minimum, ideally better as well.
Starting point is 00:10:55 So the larger principle we're going after is that frontier models are at like three to five trillion parameters or whatever. And there are a ton of benchmarks that they're already saturating. So some of these are on code edit,
Starting point is 00:11:09 some of these are on code search. And so the overall principle, the direction I think most systems are going now, is using the least amount of compute to do a task to high fidelity. And so you see this sort of arising. You see it with fast apply at first in Cursor. You see it in Deep Research at OpenAI. Essentially the task of web scraping is too easy to use GPT-5-level compute on.
Starting point is 00:11:32 And so it's like you have this sort of sub-agent architecture where GPT-5 Mini or GPT-5 Nano is doing the web scraping of each website and you have GPT-5 at the top interpreting it. This is sort of what I mean by a sub-agent system, where there are these scoped subtasks that are too easy to be worth spending frontier compute on everything. And so that's what we're going after, but in codegen specifically: these subtasks that are too easy for frontier models. Got it.
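A minimal sketch of the pattern he is describing, with made-up names: call_model and the model names are hypothetical stand-ins for an LLM client, and this is not OpenAI's actual deep research implementation.

```python
# Sub-agent pattern: cheap, scoped subtasks go to small models; only the
# top-level synthesis uses frontier compute. `call_model` is a placeholder.

def call_model(model: str, prompt: str) -> str:
    raise NotImplementedError("wire up your own LLM client here")

def deep_research(question: str, urls: list[str]) -> str:
    # Easy subtask, fanned out: per-page extraction needs little reasoning.
    notes = [
        call_model("small-model", f"Summarize what this page says about {question!r}:\n{url}")
        for url in urls
    ]
    # Hard step, done once: cross-source synthesis gets the frontier model.
    return call_model("frontier-model", f"Question: {question}\nNotes:\n" + "\n".join(notes))
```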
Starting point is 00:11:57 So you're giving people like a development kit for how do I build coding agents, specifically sub-agents. You know, if I'm a company, we use Terraform, we have specifics about how we want our Terraform to work. We also want that sub-agent to be the smartest version of that for us. They would build that sub-agent using Morph, and then you'd expose it to, say, Claude Code as a sub-agent. And Claude Code would be calling this.
Starting point is 00:12:23 Is that the idea behind what's going on? Yeah. I mean, the idea is they wouldn't need to make any sub-agent. You just install our SDK and get our sub-agents as tools, essentially. You import it and give it to your Claude Code. For Claude Code it would be like an MCP, but if you were building your own coding agent, essentially, yeah, you just use the SDK. Gotcha.
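Purely as an illustration of the "sub-agents exposed as tools" shape, a custom agent might register them like the sketch below; these names and types are invented for the example and are not Morph's actual SDK surface.

```python
# Illustrative only: small specialized models packaged as tools that an agent
# loop (or an MCP server) can call. Names here are invented, not Morph's SDK.
from dataclasses import dataclass
from typing import Callable

@dataclass
class Tool:
    name: str
    description: str
    run: Callable[[dict], str]

def fast_apply(args: dict) -> str:
    # In a real system this would call a hosted fast-apply model with the
    # original file and the lazy edit; here it is just a placeholder.
    raise NotImplementedError

tools = [
    Tool("edit_file", "Merge a lazy edit into a file via a fast-apply model", fast_apply),
]
# These tools would then be registered with your own agent loop, or surfaced
# over MCP to an existing coding agent such as Claude Code.
```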
Starting point is 00:12:42 And how often do people build their own coding agents versus use your coding agent out of the box? And what types of coding agents come out of the box that are sort of, you know, the default required for most people? Yeah, so we don't have our own coding agent. It's just the SDK and MCP for people to bring in. How often do people do that? Like, we're mainly B2B right now. I think we'll be more direct-to-customer pretty soon.
Starting point is 00:13:06 Yeah. So your question is, how often do people do that? Yeah, are most people using sort of the pre-created sub-agents that you have, or are they creating their own? Most of our customers are creating their own, just because we only released the SDK, like, a week or two ago. Oh, very cool.
Starting point is 00:13:25 Very cool. And what do you think, you know, when thinking about MCP and the broader sort of, like, sub-agent debate, what do you think of the current state of, say, MCP and sub-agent infrastructure that actually enables people to build sub-agents and bring those sub-agents into their development platforms, their Claude Code or Cursor? What do you think's missing? And what do you think we need to make that good if it's not good today?
Starting point is 00:13:51 I don't know. I feel like the plug-in ecosystem and stuff for Claude Code is pretty extensive now. I mean, MCP, some people, I don't know if you've seen recently, people have been arguing MCP is like kind of a shitty protocol. And I don't know. I mean, I don't know how much it matters. Like, I think it's pretty solid. You have server MCPs, you have client MCPs.
Starting point is 00:14:11 Yeah, I think it's fine. It's not perfect, of course. I feel like no unified format is ever perfect. Like, it just always ends up that way somehow. I think just kind of breaking down a little bit more what you offer. You said sub-agents. I saw your model routing as well. It definitely kind of changes how I thought about your company,
Starting point is 00:14:32 because Morph LLM to me has been like the fast apply thing. Sounds like you have a lot larger coverage. And so to kind of think it through, when I think of sub-agents for coding, I think the code sandbox companies and their products come up a lot in our heads. Because when we talk to Daytona, we talk to E2B, they're mostly selling to these coding platforms as well, you know, not just to isolate; they're also trying to give them tools, you know, give them sub-agents. And so we talk about the tooling cases for all of these agents to use as well. And so when you hear tools and sub-agents, it sounds like a very general thing, pretty much you can do whatever. But yours sounds like something much more specific.
Starting point is 00:15:10 I'm trying to figure out, like, are you giving folks maybe a very, very specific set of tools? And can we talk about what those tools are and what the sub-agents are? And also, like, do you run your own sandbox, you know, trying to provide that too? Or do you try to support, like, okay, any sandbox you want to run on, whether Modal or Daytona, whatever, we'll figure out how to work with you? How do you think about the layers, you know, when your customers are talking to you? Yeah. So I guess the core thing here is that traditionally people think of tools as, like, software beneath each tool call for a model.
Starting point is 00:15:43 But essentially what we are is like small models behind tools. You would have Claude use us as a tool call, but behind every tool call would be this really fast model that's made for codegen. Yeah. And so we're, like, sandbox agnostic. Like, we don't do sandboxes. I have no intention of doing sandboxes. Oh, got it. So you actually have specialized models behind the tools.
Starting point is 00:16:03 Like, so fast apply is one of them. Routing I saw as one of them, right? And what are the other particular kinds of agents you've done? So routing is less of a tool. It's more like something done before the tool call. It takes a user's task and determines if it should go to, like, Haiku or Sonnet based on how simple it is. But yeah, I mean, other tools, like semantic search, are pretty obvious. Like, there's embeddings. We handle all the embeddings for you. So we built out this infra stack for Git with, like, automatic embedding. So every time someone pushes, we do like a Merkle tree
Starting point is 00:16:24 type thing to figure out which files have changed. We re-embed those files. We do like syntax parsing on the code. So a big thing in code search is that you can't just do every-500-token or every-thousand-token chunking. You just get really poor performance from that. And so you need to sort of AST-parse the code. You need to basically separate the code into functional units, like functions, classes, and whatnot. And so we do embeddings. We store the embeddings. We do the
Starting point is 00:17:01 vector search, we do re-ranking. And so we're basically giving you state-of-the-art code search, like the way Cursor does code search. And so what I really want to give people is this sort of top-1% coding agent stack that the best have, without you needing to learn
Starting point is 00:17:17 all the hard lessons of, like, okay, I need to do AST parsing, I need to do BM25, I need to optimize this embeddings infrastructure, and just let people operate with their creativity at a higher level of abstraction than these low-level primitives that should just come out of the box.
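A small sketch of the chunking idea he describes: split code into functional units before embedding, rather than fixed token windows. Production systems typically use tree-sitter to do this across languages; Python's ast module is used here only to keep the example self-contained.

```python
# Chunk a Python file into functional units (top-level functions and classes)
# instead of fixed-size token windows, so each embedding covers one coherent unit.
import ast

def chunk_by_definition(source: str) -> list[str]:
    tree = ast.parse(source)
    lines = source.splitlines()
    chunks = []
    for node in tree.body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            chunks.append("\n".join(lines[node.lineno - 1 : node.end_lineno]))
    return chunks

# Each chunk is then embedded and indexed; on every push, a change check (the
# Merkle-tree-style diff mentioned above) decides which files get re-chunked
# and re-embedded, followed by vector search plus re-ranking at query time.
```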
Starting point is 00:17:31 Like, most of the reasoning here is that I just see everyone building the same stack, and they waste their time maintaining that stack and have less time to actually do cool stuff and innovate at the cool layer, which is the application layer. Yeah. So it's like giving you top-1% codegen just out of
Starting point is 00:17:47 the box with that import. And so do you see yourself, because when you talk about semantic search, when you talk about embedding models, even though it's code specific, it doesn't sound like it's actually vibe coding platforms only. You know, because when we look at people using Claude Code in general, you know, there's a lot of people not just using vibe
Starting point is 00:18:04 coding platforms or tools. They might just go directly with a CLI wrapper to some other things. And so I guess I wonder, given you have a bunch of tooling, do you have more than just vibe coding platforms using you? Or do you plan to kind of go that route as well? And do you see yourselves with a much broader audience in general? Yeah, I mean, some IDEs use us. Like, look, there's Continue.dev. There's Kilo Code. What else is there? And yeah, there are a bunch of random use cases as well, like AI chat with docs, AI doc editors, essentially where you have like a web IDE and you can use it to edit docs. Yeah, there are a lot of cool use cases actually that are sort of random.
Starting point is 00:18:40 But like under the hood, it looks kind of like code. Like it's doing fast apply on this large JSON or YAML or Markdown. So I mean, it still looks like code, but it's not a vibe coding platform in those cases. And how do you help companies, you know, so you're helping me as a company build sub-agents, top sub-agents?
Starting point is 00:19:08 How do you help companies roll that across the SDLC, right? Are you specifically focused on sort of local development workflows? Or do you also help, you know, a lot of people today are using things like CodeRabbit for PR reviews? Like, how far does Morph go? How far can I take Morph across my coding lifecycle and even into operations and support? Yeah. So what I would love is if someone would just not need to buy CodeRabbit; they could build it themselves, build all the custom stuff they want with this.
Starting point is 00:19:29 Because essentially what CodeRabbit is, it analyzes a unified diff of each PR. It has some search capability. It's not a super complicated app. It's just a prompt, unified diffs, and a search. And so ideally I'd want people to be able to build this themselves really quickly, like one file, everything's done for you.
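A rough one-file sketch of that "prompt plus diff plus search" shape; the git invocation is standard, but call_model is a hypothetical stand-in for an LLM client, and this is not CodeRabbit's or Morph's implementation.

```python
# Minimal "review the diff with a prompt" sketch.
import subprocess

def call_model(prompt: str) -> str:
    raise NotImplementedError("plug in your LLM client of choice")

def review_pr(base: str = "origin/main") -> str:
    diff = subprocess.run(
        ["git", "diff", base, "--unified=3"],
        capture_output=True, text=True, check=True,
    ).stdout
    prompt = (
        "You are a code reviewer. Point out bugs, risky changes, and missing "
        "tests in this diff, citing file and line.\n\n" + diff
    )
    return call_model(prompt)

# A fuller version would add the search step he mentions (pulling in definitions
# the diff touches) plus whatever custom context a company wants, like an
# internal doc store.
```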
Starting point is 00:19:51 And then they could bring in whatever custom stuff they want. I'm sure that companies have their own doc store, their own custom stuff that they want to plug in that they can't, because CodeRabbit doesn't support it. So I would love for people to just be able to build stuff like that super easily. Yeah, yeah. I think that's true, at least from my own experience thinking about how we would deploy coding agents at, say, Snyk, when we were responsible for that at Snyk. What do you think is the primary inhibitor today of adoption of agentic coding practices at scale? Is there a model challenge? Is it infrastructure? Is it tooling? Is it, you know, sub-agents, building sub-agents that actually understand, like, the context of someone's specific code base? What do you think, like, where are we at sort of in this
Starting point is 00:20:34 adoption curve, utility curve, and what do you think the inhibitors are? What do you think we need in order to remove those? I mean, I think on the adoption curve, usage is still exploding for Claude Code. It's like straight vertical, as well as all the AI coding IDEs. And I think it will saturate, actually, like 100% of developers eventually. But I see codegen being, like, more than just web coding and IDEs. I think that's what it is today. And so it's easy to, like, sort of shut your eyes and be like, okay, that's the market. It's like developers and people that are entrepreneurial and want to build, like,
Starting point is 00:21:06 these SaaS things on the vibe coding platforms. But I don't know. What interests me is people building stuff like autonomous software where there's no prompt in the loop. There's no, like, user prompting for something. There are other use cases as well where you use code for creative expression. There are companies that are building social platforms where the social content is like mini apps.
Starting point is 00:21:28 So like WebSim is an example of one of these. Basically what I'm getting at is I think codegen today is not codegen six months from now or one year from now. I think codegen is going to be this sort of useful primitive to build a lot of stuff on, both creatively and usefully, like in IDEs and vibe coding. And I think it's going to be almost like thinking of software as this primitive. Right.
Starting point is 00:21:52 I think, like, at the beginning of software, everyone was like, is Microsoft going to own all software? And is all value going to accrue to Microsoft? Then it turned out that everyone wanted this level of customization that you can only really get if you owned your own software. And I think something similar is going to happen with AI coding. Like, everyone's like, is all value going to accrue to Cursor or a vibe coding platform? And I think,
Starting point is 00:22:16 I think it's not the case. I think everyone's going to own it to some extent. Gotcha. And you think basically we're at a point where every company builds a variant
Starting point is 00:22:25 of a vibe coding platform that's unique to them, that is specific to sort of their code and their context and the IP that they've created. Not necessarily. I don't think a vibe coding
Starting point is 00:22:34 platform is the primitive. I think that, like, I mean, I think it's a strong application. I just don't think it's the only thing. I think, like, what I mean is that there are apps like PR review,
Starting point is 00:22:45 there's vibe coding, there's now, like, creative stuff like the social apps. Yeah, I think, like, personalized software, like apps that autonomously update based on the context of your life. I think there are a ton of use cases that we don't see today because the nines of reliability really impact some of these use cases. Like, you can't have autonomous software if it fails like 5% of the time or 4% of the time.
Starting point is 00:23:09 So that's sort of what I was meaning. Very cool. Well, I want to move to our favorite section called The Spicy Future. Spicy future. Tell us, what's your spicy hot take in the sort of coding LLM space? The spiciest take I have is that you're not going to use
Starting point is 00:23:28 frontier models for everything. We've sort of already gone over this. It's not that spicy. I can get spicier if you need. If you have another option to get spicier, why not? It's a little LLM specific, but a lot of people say LLM inference is very memory bandwidth
Starting point is 00:23:42 bottlenecked. So basically when you're doing the forward pass in inference, you're limited by your memory speed, and if you increase memory speed, you speed it up. And so you're essentially not using your compute effectively. I think that's sort of a skill issue. I think if you're memory bottlenecked, you just shouldn't be. You should use your compute for something useful.
Starting point is 00:24:00 And I think speculative decoding is a good way to spend that compute. There's not a lot of people that are properly thinking about this. Speculative decoding is not really effectively used. Either people train some small model to use for the speculative decoding, or they use their original code or original file. And I think, long story short, if you are memory bandwidth bottlenecked, you should do something with your compute such that you are not. And are you talking about, like, merging different workloads on the same boxes or stuff like that?
Starting point is 00:24:33 It's not really related to code, actually, that much. It's more of a statement about general LLM inference. You can do more to get a speed-up if you are memory bandwidth bottlenecked. And it's not the most interesting take. I can think of a better spicy take related to codegen. That's more relevant. Yeah, yeah, yeah.
Starting point is 00:24:51 If you have one, yeah, that would be interesting. But I think it is a space where I think not everybody is paying so much attention because you have to pay attention to all the coding models already and all the AI stuff, now down to inference level. There's so many different things happening right now. So I guess for you,
Starting point is 00:25:06 what is the type of work you focus on? Because actually, in the coding space, it's not just, like, how inference optimizations work, like disaggregated serving, you know, sort of techniques and stuff like that, right? You have to go down to the code-specific things. You have to go down to the model and what's happening there. Like, do you spend most of your time researching particular areas, or, like, how do you
Starting point is 00:25:26 think about it when you're like, okay, I need to keep up with all the frontier stuff that's happening? What is the type of stuff you spend the most time looking at? I don't, like, go out of my way, honestly. I just feel like I'm really updated on everything in codegen just as a result of working on it. Like I always know, like, the latest context compression technique in Claude Code and the latest releases; I know,
Starting point is 00:25:46 as a function of just working in the space. So yeah, I mean, if you wanted another spicy take in codegen, it would be: I think that codegen today is not what codegen's going to look like six months from now. I think apps that update autonomously are going to become more of a thing as you get more reliable codegen. And what do you think, you know, is stopping us from being there? And then, okay, if apps become agentically updating,
Starting point is 00:26:15 in the sense that you define intent and the app updates under the hood, does this mean that things like spec-driven development, are you saying things like spec-driven development are actually the future? Or do you think there's a different interface for how people build? I'm thinking of the use case where you have no idea what's going to be in your app and you just expect the agent to do something useful for you. And when you open the app, you expect something there to be useful. I guess, like, in the sense of you having a personal assistant, right, you're not always prompting your personal assistant.
Starting point is 00:26:39 Like, be useful, be useful, be useful. They're just sort of there to be useful. I think there are going to be use cases like this where you go to your app and maybe it has context of your calendar, your email, your computer, and it just has something for you that's generally almost always useful. What's stopping us from being there is just agent reliability. Like, today you can sort of get somewhere close, and I feel like not enough people have swung at this. I think Wabi is an example of someone trying to swing at this.
Starting point is 00:27:09 So getting the context right, getting search right, making updates reliably is sort of what's stopping us. And I think within six to 12 months we'll get there. I mean, it's really like a nines-of-reliability problem, right? Like, if you're familiar with nines of reliability, I forget, some engineer had this blog post where it's like, good infra depends on the 99.9-whatever percent of reliability. And I think we don't have that with LLMs today. But as we approach that, a lot of these use cases become really viable,
Starting point is 00:27:39 especially to the general public. The general public doesn't have tolerance for when you have newline bugs, where, like, buttons are going down three lines and it's not mobile-friendly and stuff like this. So today, LLMs still have a lot of these issues. So I think that's what's limiting it. Gotcha. And do you think that's mostly driven by the lack of a feedback loop for these things? Or is it driven by, like, the core model?
Starting point is 00:28:03 Are we just lacking intelligence in the model, or is it workflow? Like, what do you think the contributors are that we know of, or the gates that we need to blast through, in order to realize that vision? I mean, I think, like, frontier models getting better is one thing. The other is just, like, it getting cheaper as well could help. Because, like, if it's sufficiently cheap, you could pass a screenshot, detect what's wrong, and just, like, self-correct. You can do that with linting errors already.
Starting point is 00:28:26 Like, if there's an error, you can just pass it back to the LLM and have it fix it until there are no linting errors. But that doesn't solve the design or UX bugs. You could sort of solve this qualitatively with screenshots, but right now that's really expensive. And so the value you need to provide needs to be really high for that to be viable today. So with both costs coming down and capabilities going up, there's going to be an inflection point where this is useful, where you can run all these models autonomously.
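The "fix it until there are no linting errors" loop he describes can be sketched as below; ruff is used as an example linter and call_model is a hypothetical LLM client, so treat this as a shape, not a reference implementation.

```python
# Run a linter, hand the errors plus the file to a model, write the fix, repeat.
import subprocess
from pathlib import Path

def call_model(prompt: str) -> str:
    raise NotImplementedError("plug in your LLM client of choice")

def lint(path: str) -> str:
    # ruff is just an example; any linter that prints errors to stdout works.
    result = subprocess.run(["ruff", "check", path], capture_output=True, text=True)
    return result.stdout.strip()

def fix_until_clean(path: str, max_rounds: int = 5) -> bool:
    for _ in range(max_rounds):
        errors = lint(path)
        if not errors:
            return True
        source = Path(path).read_text()
        fixed = call_model(
            f"Fix these lint errors without changing behavior:\n{errors}\n\nFile:\n{source}"
        )
        Path(path).write_text(fixed)
    return not lint(path)

# This loop does not catch design or UX bugs; that is where the (currently
# expensive) screenshot-based self-correction he mentions would come in.
```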
Starting point is 00:28:53 Today, price and capabilities are not there, but it seems very clear that it's going to get there. Awesome. Well, I think we got everything we wanted. So thanks for being on our pod. For people that want to learn more about Morph, and maybe you're a vibe coding platform or building a vibe coding platform (there actually are a lot of them still coming up), where can people find out more and how can they reach out to you? Yeah, you can email me anytime.
Starting point is 00:29:17 There's the MorphLLM website. It's MorphLLM.com. You can reach out to me on LinkedIn, Twitter. My Twitter is Tejas Y. Bhakta. That's going to be hard to spell, but I'm sure you can find me. Yeah, I respond pretty quick. Sounds good. Well, thanks for being on our pod.
Starting point is 00:29:33 Yeah. Thanks so much. Thanks for having me.
