Latent Space: The AI Engineer Podcast - Windsurf: The Enterprise AI IDE - with Varun and Anshul of Codeium AI

Starting point is 00:00:04 Hey, everyone. Welcome to the Latent Space podcast. This is Alessio, partner and CTO at Decibel Partners, and I'm joined by my co-host Swyx, founder of Smol AI. Hey, and today we are delighted to be, I think, the first podcast in the new Codeium office. So thanks for having us, and welcome Varun and Anshul. Thanks for having us. Yeah, thanks for having us. This is the Silicon Valley office?

Starting point is 00:00:25 Yeah. So, like, what's the story behind this? The story is that the office was previously, so we used to be on Castro Street, so this is in Mountain View. And I think a lot of the people at the company previously, you know, were in SF or are still in SF. And actually one thing, if you notice about the company, is it's actually like a two-minute walk from the Caltrain. And I think we didn't want to move the office very far away from the Caltrain; that would probably, you know, piss off a lot of the people that lived in San Francisco, this guy included. So we were like scouting a lot of spaces in the

Starting point is 00:00:57 nearby area and this area popped up. It previously was being leased by, I think, Facebook/WhatsApp, and then immediately after that Ghost Autonomy, and then now here we are. And we also, you know, I guess one of the things that the landlord told us was this was the place that they shot all the scenes for Silicon Valley, at least like externally and stuff like that. So that just became a meme.

Starting point is 00:01:17 Trust me, that wasn't like the main reason why we, but we've leaned into it. It doesn't hurt. Yeah, yeah. And obviously that played a little bit into your launch with Windsurf as well. So let's get caught up. You were guest number four. I think you were two. Maybe it was two.

Starting point is 00:01:32 So a lot has happened since then. You've raised a huge round and also just launched your IDE. What's been the progress over the last year or so since the Latent Space people last saw you? Yeah. So I think the biggest things that have happened are Codeium's extensions have continued to gain a lot of sort of popularity. We have over 800,000 sort of developers that use that product. Lots of large enterprises also use the product. We were recently awarded JPMorgan Chase's Hall of Innovation Award,

Starting point is 00:02:01 which is usually not something a company gets within a year of deploying an enterprise product. And then large companies like Dell and stuff use the product. So I think we've seen a lot of traction on the enterprise base. But I think one of the most exciting things we've launched recently is this actual IDE called Windsurf. And I think for us, one of the things that we've always thought about is how do we build the most powerful AI system for developers everywhere? The reason why we started out with the extension system was we felt that there were lots of developers that were not going to be on one platform. And that still is true, by the way. Outside of Silicon Valley, a lot of people don't use GitHub. This is like a very surprising finding, but most people use GitLab, Bitbucket, Gerrit, Perforce, CVS, Harvest, Mercurial.

Starting point is 00:02:42 I could keep going down a list, but there's probably 10 of them. GitHub might have less than 10% penetration of the Fortune 500, full penetration. It's very small. And then also on top of that, GitHub has very high switching costs, or source code management tools do, right? Because you actually need to switch over all the dependent systems on this workflow software. It's much harder than even switching off of a database. So because of that, we actually found ways in which we could be better partners to our customers, regardless of where they stored their source code. And then more specifically on the IDE category, a lot of developers, surprise, surprise, don't just write TypeScript and Python.

Starting point is 00:03:13 They write Java. They write Golang. They write a lot of different languages. And then high-quality language servers and debuggers matter. Very honestly, JetBrains has the best debugger for Java. It's not even close. These are extremely complex pieces of software. We have customers where over 70% of their developers are on JetBrains.

Starting point is 00:03:30 And because of that, we wanted to provide a great experience wherever the developer was. But one thing that we found was lacking was, you know, we were running into the limitations of building within the VS Code ecosystem on the VS Code platform. And I think we felt that there was an opportunity for us to build a premier sort of experience. And that was within the reach of the team, right? The team has done all the work, all the infrastructure work to build the best possible experience, right, and plug it into every IDE. Why don't we just build our own IDE that is by far the best experience? And as these agentic products start to become more and more possible,

Starting point is 00:04:02 and all the research we had done on retrieval and just reasoning about codebases came more and more to life, we were like, hey, if we launched this agentic product on top of a system that we didn't have a lot of control over, it's just going to limit the value of the product and we're just not going to be able to build the best tool. That's why we were super excited to launch Windsurf.

Starting point is 00:04:19 I do think it is the most powerful IDE system out there right now in the capability, right? And this is just the beginning. I think we suspect that there's much, much more we can do, more than just the autocomplete sort of side. When we originally talked, probably autocomplete was the only piece of functionality the product actually had. And we've come a long way since then, right? These systems can now reason about large code bases without you adding everything, right? Like when you use Google, do you say, like, "@New York Times post, blah, blah,"

Starting point is 00:04:44 and like ask it a question? No. We want it to be a magical experience where you don't need to do that. We want it to actually go out and execute code. We think code execution is a really, really important piece. And when you write software, you not only just kind of come up with an idea, the way software kind of gets created is software is originally this amorphous blob. And as time goes on, and you have an idea, the blob and the cloud sort of disappear and you see this mountain. And we want it to be the case that as soon as you see the mountain, the AI helps you get to the mountain. And as soon as you see the mountain, the AI just creates the mountain for you. Right. And that's why we don't believe in this sort of modality where you just write a task and it just goes out and does it.

Starting point is 00:05:20 Right? It's good for zero-to-one apps. And I think people have been seeing Windsurf is capable of doing that, and I'll let Anshul talk about that a little bit, but we've been seeing real value in real software development, which is more to say, this is not to say that current tools can't, but I think more in the process of actually evolving code from a very basic idea. Code is not really built as you have a PRD and then you get some output out. It's more like you have a general vision, and yes, and as you write the code, you get more and more clarity on approaches that don't work and do work. You're killing ideas and creating ideas constantly, and we think Windsurf is the right paradigm for that.

Starting point is 00:05:52 Can you spell out what you couldn't do in VS Code? Because I think when we did the Cursor episode, then everybody on Hacker News is like, oh, blah, blah, why did you fork? You could have done it in an extension. Like, can you maybe just explain more of those limitations? I mean, I think a lot of the limitations around like APIs are pretty well documented. I don't know if we need to necessarily go down that rabbit hole. I think it was when we started thinking, okay,

Starting point is 00:06:18 what are the pieces that we actually need to give the AI to get to that kind of emergent behavior that Varun talked about. And yes, we were talking about all the knowledge retrieval systems that we've been building for the enterprise all this time. That's obviously a component of that. We were talking about all the different tools that we could give it access to so it can go, like, do that kind of terminal execution and things like that. But then the third main category that we realized would be like kind of that magical thing where you're not out there writing out a PRD, you're not scoping the problem for the AI, is if we're actually able to understand kind of the trajectory of what developers are doing

Starting point is 00:06:50 within the editor. If we're actually able to see like, oh, the developer just went and opened up this part of the directory and tried to view it, then they made these kind of edits. And they tried to do like some kind of commands in the terminal. And if we actually understand that trajectory, then our ability for the AI to just be immediately like, oh, I understand your intent. This is what you want to do without you having to spell it all out for it. That is when that kind of magic would really happen.
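A minimal sketch of the trajectory idea described here, assuming a simple in-memory log of editor events that gets rendered into model context. The `Event` and `Trajectory` names, the three event kinds, and the rendering format are all hypothetical illustrations; the episode does not describe how Windsurf actually represents this data.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Event:
    kind: str         # hypothetical kinds: "open", "edit", "terminal"
    target: str       # file path or shell command
    detail: str = ""  # optional free-text summary of what happened


@dataclass
class Trajectory:
    """Rolling log of recent developer actions (illustrative only)."""
    max_events: int = 20
    events: List[Event] = field(default_factory=list)

    def record(self, kind: str, target: str, detail: str = "") -> None:
        self.events.append(Event(kind, target, detail))
        # Keep only the most recent actions so the context stays small.
        self.events = self.events[-self.max_events:]

    def as_context(self) -> str:
        """Render the trajectory as numbered lines a model could read."""
        lines = []
        for i, e in enumerate(self.events, 1):
            suffix = f": {e.detail}" if e.detail else ""
            lines.append(f"{i}. [{e.kind}] {e.target}{suffix}")
        return "\n".join(lines)
```

The point of the sketch is only that intent can be inferred from an ordered record of actions rather than from an explicit task description.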

Starting point is 00:07:12 I think that was kind of like that intuition. So you have the restrictions of the APIs that are well documented. We have the kind of vision of like what we actually need to be able to hook into to really expose this. And I think it was that combination of those two where we were like, I think it's about time to do the editor. The editor was not like a necessarily new idea. I think we've been talking about the editor for a very long time. I think it's like, of course, we just pulled it all together in the last couple of months. But it was always something in the back of the mind. And it's only when we started realizing, okay, the models are

Starting point is 00:07:39 now capable of doing this. We actually can look at this data. Like we have a really good context awareness system. We're like, I think now's the time. And we went on and executed on it. So it's basically not one action you couldn't do, but it's like how you brought it all together. It's like VS Code is kind of like a sandbox, so to speak. Yeah, let me maybe like even just go one step deeper on each of the aspects that Anshul talked about. Let's go with the API aspect. So right now, I'll give you an example. Super Complete is actually a feature that I think is like very exciting about the product, right?

Starting point is 00:08:07 It can suggest refactors of the code. I think you can do it quickly and very powerfully. On VS Code, actually, the problem for us wasn't actually being able to implement the feature. We had the feature for a while. The problem was actually even to show the feature, VS Code would not expose an API for us to do this. So what we actually ended up doing was dynamically generating PNGs to actually go out and showcase this.

Starting point is 00:08:26 It was not really aligned. We actually ended up doing it ourselves, and it took us a couple hours to actually go out and implement this, right? And that wasn't because we were bad engineers. No, our good engineering time was being spent fighting against the system rather than building a good system. Another example is we needed to go out and find ways to refactor the code. The VS Code API would constantly keep breaking on us,

Starting point is 00:08:43 and we constantly needed to show a worse and worse experience. This actually comes down to the second point which Anshul brought up, which is like we can come up with great work and great research. All the work we have here, the research on Cascade, is not like a couple-month thing. This is like a nine-months-to-a-year thing that we've been investigating as a company. Investing in evals, right? Even the evals for this are a lot of effort, right? A lot of actual systems work to actually go out and do it.

Starting point is 00:09:06 But ultimately, like this needs to be a product that developers actually use. And I think, you know, let's even go for a Cascade, for example, and looking at the trajectory. Can you define Cascade because that's the first time you brought it up? Yeah. So Cascade is the product that is the actual agentic part of the product, right, that is capable of taking information from both these human trajectories and these AI trajectories, what the human ended up doing,

Starting point is 00:09:26 what the AI ended up doing, to actually propose changes and actually execute code to finally get you the final work output, right? I'll even talk about something very basic. Cascade gives you a bunch of code. We want developers to very easily be able to review this code. Okay, then we can show developers a hideous UI that they don't want to look at. And no one's going to really use this product. And we think that this is like a fundamental building

Starting point is 00:09:45 block for us to make the product materially better. If people are not even willing to use the building block, where does this go? And we just felt our ceiling was capped on what we could deliver as an experience. Interestingly, JetBrains is a much more configurable paradigm than VS Code is. But we just felt so limited on both the sort of directions that Anshul said that we were just like, hey, if we actually remove these limitations, we can move substantially faster. And we believe that this was a necessary step for us. I'm curious more about the evals side of it because you brought it up. And we have to ask about evals anytime anyone brings up evals. How do you evaluate a thing like this that is so multi-step and so spanning, like so much context? So what you can imagine

Starting point is 00:10:27 we can sort of do, and this is like one of the beautiful things about code, is code can be executed. We could go take a bunch of open source code. We can find a bunch of commits, right? And we can actually see if some of these commits have tests associated with them. We can start stripping the commits, and the approach of stripping the commits is good because it tests the fact that the code is in an incomplete state, right? When you're writing the commit, the goal is not, the commit has already been written for you. You're given it in a state where the entire thing has not been written, and can we go out and actually retrieve the right snippets and actually come up with a cohesive plan and iterative loop that gets you to a state where the code

Starting point is 00:11:00 actually passes? So you can actually break down and decompose this complex problem into like a planning, retrieval, and multi-step execution problem. And you can see on every single one of these axes, is it getting better. And if you do this across enough repositories, you've turned this highly discontinuous and discrete problem of make a PR work versus make it not work into a continuous problem. And now that's a hill you can actually climb. And that's a way that you can actually apply research where it's like, hey, my retrieval got way better. This made my eval get better. And then notice how the way the eval works is, I'm not that interested in the eval where purely it's the commit message

Starting point is 00:11:32 and you finish the entire thing. I'm more interested in the case where the code is in an incomplete state and the commit message isn't even given to you, because that's another thing about developers: they're not willing to tell you exactly what's in their head. That's the actual important piece of this problem. We believe that developers will never completely pose the problem statement, right? Because the problem statement lives in their head, conversations that you and I have had at the coffee area, conversations that I've had over Slack, conversations I've had over Jira, right? Maybe not Jira, let's say Linear, right? That's the cool thing nowadays. They're talking about Jira. Yeah. So conversations I've had on Linear. And all of these things

Starting point is 00:12:07 come together to actually finally propose sort of a solution there, which is why we want to test the incomplete code. What happens when the code is in an incomplete state? And am I actually able to make this pass without the commit message? And can I actually guess your commit well? Now you can convert the problem into a masked prediction problem where you want to guess both the high-level intent as well as the remainder of changes to make the actual test pass. And you can imagine if you build up all of these, now you can see, hey, my systems are getting better. Retrieval quality is getting better. And you can actually start testing this on larger and larger code bases. And I guess that's one thing that we, honestly, to be honest, we could have done a little

Starting point is 00:12:39 faster. We had the technology to go out and build these zero-to-one apps very quickly. And I think people are using Windsurf to actually do that. And it's like extremely impressive. But the real value, I think, is actually much deeper than that. It's actually that you take a large code base. And it's actually a really good first pass. And I'm not saying it's perfect, but it's only going to keep getting better. And we have like deep sort of infrastructure that is actually validating that we're getting better on this dimension. Varun mentioned the end-to-end evals that we have for the system, which I think are like super cool.
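The masked-commit evaluation described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not Codeium's harness: a commit is modeled as a pair of `{path: contents}` snapshots, `agent` and `run_tests` are hypothetical callables supplied by the caller, and only added or modified files are masked (deleted files are ignored).

```python
import random
from typing import Callable, Dict, List, Tuple

Files = Dict[str, str]  # path -> file contents


def mask_commit(before: Files, after: Files,
                keep_fraction: float, rng: random.Random) -> Files:
    """Build an incomplete state: part of the commit applied, the rest masked."""
    changed = sorted(p for p in after if before.get(p) != after[p])
    n_keep = int(len(changed) * keep_fraction)
    kept = set(rng.sample(changed, n_keep))
    partial = dict(before)
    for path in kept:
        partial[path] = after[path]  # these edits are already "written"
    return partial


def evaluate(commits: List[Tuple[Files, Files]],
             agent: Callable[[Files], Files],
             run_tests: Callable[[Files], bool],
             keep_fraction: float = 0.5,
             seed: int = 0) -> float:
    """Fraction of masked commits the agent completes so that tests pass.

    The agent sees only the partial code state, never the commit message.
    """
    rng = random.Random(seed)
    passed = 0
    for before, after in commits:
        partial = mask_commit(before, after, keep_fraction, rng)
        candidate = agent(partial)
        passed += bool(run_tests(candidate))
    return passed / len(commits) if commits else 0.0
```

Averaging the pass rate over many commits and repositories is what turns the pass/fail outcome of a single PR into a score that can be tracked as retrieval or planning improves.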

Starting point is 00:13:06 But I think you can even decompose each of those steps, right? The idea is, just take retrieval, for example, right? Like, how can we make the eval for retrieval really good? And I think this is just a general thing that's been true about us as a company. Like, most evals and benchmarks that exist out there for software development are kind of bogus. There's not really a better way of putting it. Like, okay, you have SWE-bench, that's cool. No actual professional work looks like SWE-bench.

Starting point is 00:13:29 Like, HumanEval, same thing. Like, these things are just a little kind of broken. So when you're trying to optimize against a metric that's a little bit broken, you end up making kind of suboptimal decisions. So something that we're always very keen on is like, okay, what is the actual metric that we want to test for this part of the system? So take retrieval, for example, a lot of the benchmarks for these embedding-based systems

Starting point is 00:13:47 are like needle-in-a-haystack problems. Like, I want to find this one particular piece of information out of all this potential context. That's not really what actually is necessary for doing software engineering because code is a super distributed knowledge store. You actually want to pull in snippets from a lot of different parts of the code base in order to do the work, right? And so, you know, we built systems that instead of looking at retrieval

Starting point is 00:14:08 at one, you're looking at retrieval at like 50, like, what are the 50 highest things that you can actually retrieve? And are you capturing all of the necessary pieces for that? And what are all the necessary pieces? Well, you can look again back at old commits and see over all the different files that together were edited to make a commit, because those are semantically similar things that might not actually show if you actually try to map out a code graph, right? And so we can actually build these kind of golden sets. We can do this evaluation, even for sub-problems in the overall task. And so now we have, like, you know, an engineering team that can iterate on all of these things and still make sure that the end goal that we're trying

Starting point is 00:14:40 to build to is, like, really, really strong so that we have confidence in what we're pushing out. And by the way, just to say one more thing about the SWE-bench thing, just to showcase these existing metrics. I think benchmarks are not a bad thing. You do want benchmarks. Actually, like, I would prefer if there are benchmarks versus, let's say, everything was just vibes, right? But vibes are also very important, by the way, because they showcase where the benchmark is not valuable; vibes sometimes show you where critical issues exist in the benchmark. But like, you look at some of the ways in which people have, like, optimized SWE-bench. It's like, make sure to run pytest every time X happens. And it's

Starting point is 00:15:11 like, yeah, like, sure. You can start, like, prompting it in like every single possible way. And like if you remove that suddenly, it doesn't get good at it. It's like, what really matters here? What really matters here is, like, across a broad set of tasks, you're performing, like, high quality sort of suggestions for people and people love using the product. And I think actually, like, the way these things work is beyond a certain point. Because, yes, I actually think it's valuable beyond a certain point. But once it starts hitting the peak of these benchmarks, getting that last 10% actually probably is counterintuitive

Starting point is 00:15:39 to the actual goal of what the benchmark was. Like, you probably should find a new hill to climb rather than sort of p-hacking or really optimizing for how you can get higher on the benchmark. Yeah, we did an episode with Anthropic about their recent SWE agent, and we talked about the HumanEval versus SWE-bench results, or like HumanEval is kind of like a greenfield benchmark. You know, you need to be good at that. SWE-bench is more accessing.
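The retrieval metric described a little earlier (scoring the top 50 results against the set of files a past commit edited together, rather than a single needle) can be sketched as recall at k. This is an illustrative sketch only: `golden_set_from_commit`, `mean_recall`, and the `retriever` callable are hypothetical names, and using one file from the commit as the query seed is an assumption, not something stated in the episode.

```python
from typing import Callable, Iterable, List, Set, Tuple


def recall_at_k(retrieved: List[str], relevant: Set[str], k: int) -> float:
    """Share of the relevant items that appear in the top-k retrieved list."""
    if not relevant:
        return 0.0
    top_k = set(retrieved[:k])
    return len(top_k & relevant) / len(relevant)


def golden_set_from_commit(files_in_commit: Iterable[str],
                           seed_file: str) -> Set[str]:
    """Files co-edited with seed_file in one commit form the golden set."""
    return {f for f in files_in_commit if f != seed_file}


def mean_recall(examples: List[Tuple[str, Set[str]]],
                retriever: Callable[[str], List[str]],
                k: int = 50) -> float:
    """Average recall@k over (query, golden set) pairs."""
    scores = [recall_at_k(retriever(query), gold, k) for query, gold in examples]
    return sum(scores) / len(scores) if scores else 0.0
```

With k set to 1 this collapses to the needle-in-a-haystack style of benchmark; a larger k rewards a retriever for gathering all the scattered pieces a change needs.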

Starting point is 00:16:02 But it sounds like, I mean, your eval creation is similar to SWE-bench as far as like using GitHub commits and kind of like that history. But then it's more like masking at the commit level versus just testing the output. That's right. Of the thing. Cool. We have some listener questions actually about the Windsurf launch. And obviously, I also want to give you the chance to just respond to Hacker News.

Starting point is 00:16:23 Oh, man. Hey, let me tell you something very, very interesting. I love Hacker News as much as the next person, but the moment we launched our product, the first comment, like, this was a year ago, the first comment was, this product is a virus. And we were like,

Starting point is 00:16:38 this is the original Codeium launch like two years ago. This is the original. Like, I am analyzing the binary as we speak, will report back. And then he's like, it's a virus. And I was like, dude, like, it's not a virus. We just want to give autocomplete suggestions. That's all we want to do. Yeah. Okay. Okay.

Starting point is 00:16:55 Wow, I didn't expect that. And then there was like Tio drama. There's enough drama on the launch to cover. But I don't know if we want to just make this a Cascade piece. But we had a bunch of people in our Discord trial the product, give a lot of feedback. One question people have is like, to them, Cascade already felt pretty agentic. Like, is that something you want to do more of? You know, obviously, since you just launched an IDE, you're kind of like focusing on having people write the code.

Starting point is 00:17:20 But maybe this is kind of like the Trojan horse to just doing more full-on end-to-end, like code creation. Devin style. Yeah, I think it's like, how do you get there in a, like, a real principled manner? We have obviously, like, enterprises asking us all the time, like, oh, when's it going to do like end-to-end work? The reality is like, okay, well, if we have something in the IDE that, again, can, like, see your entire actions and get a lot of intent that you can't actually get if you're not in the IDE. If the agent there has to always get human involvement to keep on fixing itself, it's probably not ready to become a full end-to-end automated system, because then we're just going to turn into a linter where, like,

Starting point is 00:17:55 It produces a bunch of things and no one looks at any of it. Like that's not the great end state. But if we start seeing like, oh, yeah, there's common patterns that people do that, like, never require human involvement. It's an end-to-end just totally works without like any like intent-based information. Sure, that can become like fully agentic. And like we'll learn what those tasks are like pretty quickly because we have a lot of data. Maybe add on to that.

Starting point is 00:18:16 I think that if the answer is like, full agentic is, like, Devin, I think like, yes, the answer is this product should become fully agentic, and limited human interaction is the goal, is 100% the goal. And I think, honestly, of all usable products right now, I think we're the closest right now. Of all usable products in an IDE. Now, let me caveat this by saying, I think there are lots of hard problems that have yet to be solved

Starting point is 00:18:39 that we need to go out and solve to actually make this happen. Like, for instance, I think one of the most annoying parts about the product is the fact that you need to accept every command that kind of gets run. It's actually fairly annoying. I would like it to go out and run it. Unfortunately, me going out and running arbitrary binaries has some problems, in that if it, like, rm -rf's my hard disk, I'm not going to be, I'm not going to be. It's a virus. It's a virus. It does become a virus. I think this is solvable with, like,

Starting point is 00:19:04 with complex systems. I think we love working on complex systems infrastructure. I think we'll solve it. Now, the simpler way to go about solving this is don't run it on the user's machine and run it somewhere else, because then if you bork that machine, you're kind of totally fine. Now, I think, though, maybe there's a little bit of tradeoff of like running it locally versus remotely, and I think we might change our mind on this. But I think the goal is not for this to be the final state. I think the goal for this is, A, it's actually able to do very complex tasks with limited human interaction, but it needs to know when to actually go back to the human, right? Also on top of that, compress every cycle that the agent is running.

Starting point is 00:19:37 Right now, actually, I even feel like the product is too slow for me sometimes right now. Even with it running really fast. It's objectively pretty fast. I would still want it to be faster, right? So there is like systems work and probably modeling work that needs to happen there to make the product even faster on both the retrieval side and the generation side. And then finally speaking, I think another key piece here that's really important is I actually think asking people to do things explicitly is probably going to be more of an anti-pattern if we can actually go and passively suggest the entire change for the user. So almost imagine as the user is using the product that we're going to suggest the remainder of the PR without the user kind of like even asking us for it. I think this is sort of the beginning for it. But yeah, like these are hard problems.

Starting point is 00:20:19 I can't give a particular deadline for this. I think this is like a big step up from what we had in the past. But I think what Anshul said is 100% true, but the goal is for us to get better at this. I mean, the remote execution thing is interesting. You wrote a post about the end of localhost. Yeah. And now it's almost like, then we were kind of like, well, no, maybe we do need the internet and like people want to run things. But now it's like, okay, no, actually I don't really care. Like I want the model to do the thing. And if you were like, you can do a task end to end, but it needs to run remotely, not on your computer. I'm sure most people will say, yeah.

Starting point is 00:20:50 No, I agree with that. I actually agree with it running remotely. That's not a security issue. I totally agree with you that it's possible that everything could run remotely. That's how it is at most like big companies, like Facebook. Nobody runs things locally. No one does. In fact, you connect to a room. Essentially to the mainframe. You're right on that.

Starting point is 00:21:08 Maybe the one thing that I do think is kind of important for these systems that is more than just running remotely is basically like, you know, when you look at these agents, there's kind of like a rollout of a trajectory. And I kind of want to roll this trajectory back, right? In some ways, I want like a snapshot of the system that I can like constantly checkpoint and move back and forth. And then also on top of that, I might want to do multiple rollouts of this. So basically, I think there needs to be a way to almost like move forward and move backwards the system. And whether that's locally or remotely, I think that's necessary.

Starting point is 00:21:35 But every time if you move the system forward, it like destroys your machine. It's probably going to be a hard system to kind of, or potentially destroys your machine. That's just not a workable solution. So I think the local versus remote, I think you still need to solve the problem of this thing is not going to destroy your machine on every execution, if that makes sense. Yeah. Yeah. There is a category of emerging infrastructure providers that are working on time travel VMs.
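The checkpoint, rollback, and multiple-rollout idea from this exchange can be sketched with an in-memory stand-in for the workspace. This is only an interface illustration under a strong assumption: real agent execution would need to snapshot filesystem and process state (for example at the container or VM level), which this toy `CheckpointedWorkspace` class does not attempt.

```python
import copy
from typing import Dict, List


class CheckpointedWorkspace:
    """Toy workspace whose state can be snapshotted and restored."""

    def __init__(self, files: Dict[str, str]):
        self.files: Dict[str, str] = dict(files)
        self._snapshots: List[Dict[str, str]] = []

    def checkpoint(self) -> int:
        """Save the current state and return its checkpoint id."""
        self._snapshots.append(copy.deepcopy(self.files))
        return len(self._snapshots) - 1

    def rollback(self, checkpoint_id: int) -> None:
        """Move the workspace back to a saved state."""
        self.files = copy.deepcopy(self._snapshots[checkpoint_id])

    def fork(self, checkpoint_id: int) -> "CheckpointedWorkspace":
        """Start an independent rollout from a saved state."""
        return CheckpointedWorkspace(self._snapshots[checkpoint_id])
```

Whether the agent runs locally or remotely, the same three operations are what make a bad step recoverable instead of destructive.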

Starting point is 00:21:58 And if Varun's first episode on this podcast was any indication, we like infrastructure problems. Okay. All right. Oh, so you're going there. All right. Well, that's funny, right? It's like when we first had you, you were doing so much on like actual model inference, optimization, all these things.

Starting point is 00:22:12 And today it's almost like. It's Claude. It's like, you know, people are like forgetting about the model. You know, and that's all about a higher level of abstraction. Yeah. So maybe I can say like a little bit about how our strategy on this has like evolved, because it objectively has, right? I think I would be lying if I said it hasn't. The things like autocomplete and Super Complete that run on every keystroke are entirely like our own models.

Starting point is 00:22:35 And by the way, that is still because properties like FIM, fill-in-the-middle capabilities, are still quite bad with the current. Non-existent. They're very bad, non-existent. They're not good actually at it. Because FIM is an actual like how you order the tokens. It's how you order the tokens actually in some ways. And this is sort of, if you look at what these products have sort of become, and this is great, is a lot of the Claudes and the OpenAIs have focused on kind of the chat

Starting point is 00:22:58 like assistant API where it's like complete piece of work, message, and another complete piece of work. So multi-turn kind of back and forth systems. In fact, like actually even these systems are not that good at making point changes. When they make point changes, they kind of are like off here and there by a little bit. Because yeah, when you are doing multi-point kind of like conversations, it's, you know, exact diffs getting applied is not like even a perfect science still yet. So we care about that. The second piece where we've actually sort of trained our own models is actually on the retrieval system. And this is not even for embedding, but like actually being able to use high-powered LLMs to be able to do much higher quality

Starting point is 00:23:33 retrieval across the code base. Right. So this is actually what Anshul said. For a lot of the systems, we do believe embeddings work, but for complex questions, we don't believe embeddings can encapsulate all the granularity of a particular query. Like imagine, imagine I have a question on a code base of, find me all quadratic time algorithms in this codebase. Do we genuinely believe the embedding can encapsulate the fact that this function is a quadratic time function? No, I don't think it does. So you are going to get extremely poor precision recall at this task. So we need to apply something a little more high-powered to actually go out and we've actually built large distributed systems to actually go out and run these

Starting point is 00:24:08 at scale, run custom models at scale across large code bases. So I think it's more a question of that. The planning models right now, undoubtedly, I think the Claudes and the OpenAIs have the best products. I think Llama 4, depending on where it goes, could be materially better. It's very clear that they're willing to invest a similar amount of compute as the OpenAIs and the Anthropics. So we'll see. I would be very happy if they got really good, but unclear so far. Don't forget Grok. Hey, dude, I think Grok is also possible. Right? I think don't doubt Elon. Okay. So I didn't actually know. It's not obvious when I use Cascade. I should also mention that, you know, I was part of the preview. Thanks for letting me in. I've been

Starting point is 00:24:45 maining Windsurf for a long time. It's not actually obvious. You don't make it obvious that you are running your own models. I feel like you should, so that I feel like it has more differentiation. Like, I only have exclusive access to your models via your IDE, rather than having the drop-down as is, Claude and 4o, because I actually thought that was what you did. No, so actually the way it works is the high-level planning that is going on in the model is actually getting done with products like Claude.
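Illustration (not from the episode): the remark a few exchanges earlier, that FIM is about how you order the tokens, can be sketched roughly as follows. The sentinel strings `<PRE>`, `<SUF>`, `<MID>` and every function name here are hypothetical; real models define their own special tokens, and nothing below describes how Codeium actually trains its autocomplete models.

```python
# Hypothetical sentinel strings; real models define their own special tokens.
PRE, SUF, MID = "<PRE>", "<SUF>", "<MID>"


def split_at_cursor(document: str, start: int, end: int) -> tuple[str, str, str]:
    """Split a document into (prefix, middle, suffix) around the span [start, end)."""
    return document[:start], document[start:end], document[end:]


def to_fim_training_text(prefix: str, middle: str, suffix: str) -> str:
    """Reorder to prefix, suffix, middle so a left-to-right model sees both
    sides of the gap before it has to produce the middle."""
    return f"{PRE}{prefix}{SUF}{suffix}{MID}{middle}"


def to_fim_prompt(prefix: str, suffix: str) -> str:
    """At inference time the middle is unknown, so the prompt stops at MID."""
    return f"{PRE}{prefix}{SUF}{suffix}{MID}"


code = "def add(a, b):\n    return a + b\n"
start = code.index("return")       # pretend the cursor sits where the body begins
end = code.index("\n", start)      # the gap runs to the end of that line
prefix, middle, suffix = split_at_cursor(code, start, end)

training_text = to_fim_training_text(prefix, middle, suffix)
prompt = to_fim_prompt(prefix, suffix)
```

A chat-style model trained only on whole messages never sees text in this reordered form, which is one plausible reading of why the speakers say such models are weak at filling a gap in the middle of a file.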

Starting point is 00:25:09 But the extremely fast retrieval, as well as the ability to take the high-level plan and actually apply it to the code base, is proprietary systems that are running internally. And then the stuff that you said about embeddings not being enough. Are you familiar with the concept of late interaction? No, I actually have never. Yeah, so this is ColBERT, or like the guy Omar Khattab from, I think, Stanford has been promoting this a lot. It is basically what you've done. Okay.
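Illustration (not from the episode): a minimal sketch of the late-interaction scoring idea associated with ColBERT, where each query token is matched against its best document token and the matches are summed, compared with collapsing everything into one pooled vector. The two-dimensional "embeddings" are invented toy numbers, the function names are hypothetical, and this is not a description of Codeium's retrieval system.

```python
def dot(u, v):
    """Plain dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def late_interaction_score(query_vecs, doc_vecs):
    """For each query token, take its best-matching document token, then sum."""
    return sum(max(dot(q, d) for d in doc_vecs) for q in query_vecs)


def mean_pool(vecs):
    """Collapse per-token vectors into a single averaged vector."""
    n = len(vecs)
    return [sum(column) / n for column in zip(*vecs)]


def single_vector_score(query_vecs, doc_vecs):
    """Score using one pooled vector per side, as a plain embedding lookup would."""
    return dot(mean_pool(query_vecs), mean_pool(doc_vecs))


# Toy per-token embeddings (made up for illustration).
query = [[1.0, 0.0], [0.0, 1.0]]
doc_a = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]  # exact match for each query token, plus filler
doc_b = [[0.5, 0.5], [0.5, 0.5]]              # only a blurry match for each query token

late_a = late_interaction_score(query, doc_a)
late_b = late_interaction_score(query, doc_b)
pooled_a = single_vector_score(query, doc_a)
pooled_b = single_vector_score(query, doc_b)
```

On this toy data the pooled score ranks `doc_b` above `doc_a`, while the late-interaction score prefers `doc_a`, the document that actually contains a strong match for every query token. That is the loose sense in which per-token scoring keeps granularity that a single embedding can average away.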

Starting point is 00:25:34 Sort of embedding on retrieval rather than pre-embedding. Okay. In a very loose sense. I think that sounds like a very good idea that is very similar to what we're doing. It sounds like a very good idea. I think we'd say that. That's like the meme of Obama giving himself a medal right there. Well, I mean, there might be something to learn from contrasting the ideas

Starting point is 00:25:53 and seeing where, like the study opinion and differences. It's also been applied very effectively to vision understanding. Because vision models tend to just consume the whole image, if you are able to sort of focus on images based on the query, I think that can get you a lot of extra performance. The basic idea of using compute in a distributed manner to do operations over a whole set of raw data rather than like a

Starting point is 00:26:17 materialized view. It's not anything new, right? I think it's just like, what does that look like for LLMs? When I hear you say build large distributed systems, you have a very strange product strategy of going down to the individual developer, but also to the large enterprise. Is it the same infra that serves everything? I think the answer to that is yes. The answer to that is yes. And the only reason

Starting point is 00:26:35 why the answer is yes. And to be honest, our company is a lot more complex than, I think, if we just wanted to serve the individual. And I'll tell you that's because we don't really pay other providers to do things for our indexing. We don't pay other providers to do our serving of our own custom models, right? And I think that's a core competency within our company that we have decided to build, but that's also enabled us to go and make sure that when we're serving these products

Starting point is 00:26:59 in an environment that works for these large enterprises, we're not going out and being like, we need to build this custom system for you guys. This is the same system that serves our entire user base. So that is a very unique decision we've taken as a company, and we admit that there were probably faster ways that we could have done this. I was thinking, you know, when I was working with you for your enterprise piece, I was thinking like this philosophy

Starting point is 00:27:18 of go slow to go fast, like build deliberately for the right level of abstraction that can serve the market that you really are going after. Yeah, I mean, I would say, like, when writing that piece, you're like looking back and reading it back, it sounds so, like, almost obvious in hindsight. Not all of those are really conscious decisions we made. Like, I'll be the first to admit that.

Starting point is 00:27:36 But, like, it does help, right? When we go to, like, an enterprise that has tens of thousands of developers and they're like, oh, wow, like, we have tens of thousands of developers, and does your infrastructure work for tens of thousands of developers? We can turn around and be like, well, we have hundreds of thousands of developers on an individual plan that we're serving. Like, I think we'll be able to support you, right?

Starting point is 00:27:53 So, like, being able to do those things, like, we started off by just like, let's give it to individuals and see what people like and what they don't like and learn, but then those become value propositions when we go to the enterprise. And to recap, when you first came on the pod, it was like auto-completion is free. And Copilot was $10 a month. And you said, look, what we care about is building things on top of code completion. How did you decide to just not focus on like short-term kind of like growth monetization of like the individual developer and like build some of this? Because the alternative would have been, hey, all these people are using it.

Starting point is 00:28:25 It's like we're going to make this other, like, five bucks a month plan, monetize. I think this might be a little bit of, like, commercial instinct that the company has, and unclear if the commercial instinct is right. I think that right now, optimizing for making money off of individual developers is probably the wrong strategy, actually. Largely because I think individual developers can switch off of products very quickly. And unless we have, like, a very large lead, trying to optimize for making a lot of profit off of individual developers is probably something that someone else could just vaporize very quickly, and then they move to another product. And I'm going to say this very honestly, right? Like when you use a product like Codeium on the individual side, there's

Starting point is 00:29:06 not much that prevents you from switching to another product. I think that will change with time as the products get better and better and deeper and deeper. I constantly say this, like, there's a book in business called 7 Powers. And I think one of the powers that a business like ours needs to have is, like, real switching costs. But, like, you first need something in the product that makes people switch on and stay on before you think about how do you make people switch off. And I think for us, we believe that there's probably much more differentiation we can derive in the enterprise by working with these large companies in a way that is interesting and scalable for them. Like, I'll be maybe more

Starting point is 00:29:38 concrete here. Individual developers are much more sort of tuned towards small price changes. They care a lot more, right? Like if our product is 10, 20 bucks a month instead of 50 or 100 bucks a month, that matters to them a lot. For a large company where they're already spending billions of dollars on software, this is much less

Starting point is 00:29:54 important. So you can actually solve maybe deeper problems for them, and you can actually kind of provide more differentiation on that angle. Whereas I think individual developers could be churning as long as we don't have the best products. So focus on being the best product, not trying to take price and make a lot of money off of people.

Starting point is 00:30:09 And I don't think we will, for the foreseeable future, try to be a company that tries to make a lot of money off individual developers. I mean, that makes sense. So why $10 a month for Windsurf? $10 a month was actually the pro plan. So we launched our individual pro plan before Windsurf existed. Because I think there's

Starting point is 00:30:25 we also have to be financially responsible. Yeah, yeah. We can run out of money. It's a cool. I mean, there's a lot of things because of our infrastructure background, we can give, like, for essentially free, like, unlimited autocomplete, you know, unlimited chat on, like, our, you know, faster models, like, we give a lot of things actually out for free. But, yeah, when we start doing things like Supercomplete and really large amounts

Starting point is 00:30:49 of indexing and all of these things, like, there is real COGS here. Like, we can't ignore that. And so we just created a $10 a month pro plan, mostly just to cover the cost. Like, we're not really, like, operating, I think, on much of a margin there either. But, like, okay, like, just to cover us there. So for Windsurf, it just ended up being the same thing. And everyone who downloads Windsurf gets the first, like, I forget, like, a couple of weeks,

Starting point is 00:31:10 like two weeks for free. Let's just have people try it out, let us know what they like, what they don't like, and that's how we've always operated. I've talked to a lot of CTOs in, like, the Fortune 100, where most of the engineers they have, they don't really do much anyway. The problem is not that the developer costs 200K and you're saving 8K. It's like that developer should not be paid 200K. But that's kind of like the base price, you know?

Starting point is 00:31:33 But then you have developers getting paid 200K that should be paid 500K. So it's almost like you're averaging out the price because most people are actually not that productive anyway. So if you make them 20% more productive, they're still not very productive. And I don't know in the future. Is it that the junior developer's salary is like 50K, you know, and it's like the bottom of the end gets kind of like squeezed out and then the top end gets squeezed up? Yeah, maybe, let's see, you know, one thing that I think about a lot, because I do think about this, the per-seat pricing, all of this stuff I think about a good deal. Let's take a product like Office 365. I will say a lawyer at Codeium uses Microsoft Word way more than I do.

Starting point is 00:32:08 I'm still footing the same bill. But the amount of value that he's deriving from Office 365 is probably, you know, tens of thousands of dollars. By the way, everyone, you know, Google Docs is a great product. Microsoft Word is a crazy product. They made it so that the moment you review anything in Microsoft Word, the only way you can review it is with other people in Microsoft Word. It's like this virus that penetrates everything. And it not only penetrates within the company. It penetrates cross-company, too.

Starting point is 00:32:31 The amount of value it's driving is way higher for him. So for these kinds of products, there's always going to be this variance between who gets value from these products, right? And you're right. It's almost like a blend, because you're actually totally right. Probably this company should be paying that one developer maybe like four times as much. But in a weird way, software is like this team activity enough that there's a bunch of blended outcomes. But hey, like 20% of the four times and there are four people is still going to cover the cost across the four individuals, right?

Starting point is 00:32:58 And that's how roughly these products kind of get priced out. I mean, more than about pricing, this is about like the future. of the software engineer. We could be very wrong also. Yeah. I think nobody knows. Reserve the right to be incredibly off. Yeah.

Starting point is 00:33:12 I mean, business model does, in fact, impact the product, and product does impact the user experience. So it's all of a kind. I don't mind. We are as concerned about the business of tech as the tech itself. That's cool. Speaking of which, there's other listener questions. Shout out to Daniel Imfeld, who's pretty active in our Discord,

Starting point is 00:33:27 just asking all these things. Multi-agent, very, very hot and popular, especially from like the Microsoft Research point of view. Have you made any explorations there? I think we have. I don't think we've called it multi-agent, which is more so like this notion of having many trajectories that you can spawn off that kind of like validate sort of some different hypotheses

Starting point is 00:33:48 and you can kind of pick the most interesting one. This is stuff that we've actually analyzed internally at the company. By the way, the reason why we have not put these things in, actually, is partially because we can't go out and execute some random stuff in parallel in the meantime. In the meantime on other sides. Because of the side effects. Because of the side effects, right? So there are some things that are a little bit dependent on us unlocking more and more functionality internally.

Starting point is 00:34:09 And then the other thing is in the short term, I think there is also a latency component. And I think all of these things can kind of be solved. I actually believe all of these things are solvable problems. They're not unsolvable problems. And if you want to run all of them in parallel, you probably don't want N machines to go out and do it. I think that's unnecessary, especially if most of them are I/O-bound kind of operations, where all you're doing is reading a little bit of data and writing out a little bit of data. It's not extremely compute-intensive.
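Illustration (not from the episode): a minimal sketch of why I/O-bound candidate trajectories do not need one machine each. Each coroutine below mostly waits (simulated with a short sleep) and handles a couple of thousand bytes, so a single process can interleave many of them. The names `run_trajectory` and `run_all`, the payload size, and the sleep duration are all assumptions for the example, not details of Cascade.

```python
import asyncio


async def run_trajectory(name: str, payload: bytes) -> tuple[str, int]:
    """Stand-in for one candidate trajectory: wait on I/O, then 'write' a small payload."""
    await asyncio.sleep(0.01)  # simulated I/O wait; the CPU is free for other tasks meanwhile
    return name, len(payload)


async def run_all(n: int) -> list[tuple[str, int]]:
    """Launch n trajectories concurrently inside one process and collect their results."""
    tasks = [run_trajectory(f"trajectory-{i}", b"x" * 2_000) for i in range(n)]
    return await asyncio.gather(*tasks)  # results come back in the order tasks were passed


results = asyncio.run(run_all(8))
```

The sketch deliberately ignores the side-effect problem raised just before this point: concurrent trajectories that really edit files or run commands would each need their own isolated workspace.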

Starting point is 00:34:31 I think that it's a good idea, and probably something we will pursue and is going to be in the product. I'm still processing what you just said about things being I/O-bound. So for a certain class of concurrency, you can actually just run it all in one machine. Why not? Because if you look at the changes that are made, right, for some of these, you're writing out like what, a couple thousand bytes, maybe like tens of thousands of bytes on every... It's not a lot.

Starting point is 00:34:53 Very small. What's next for Cascade or Windsurf? Oh, there's a lot. I don't know. We did an internal poll and we were just like, are you more excited about this launch or the launch that's happening in a month? Or like what we're going to come out with in a month. And it was like almost uniformly in a month.

Starting point is 00:35:08 I think like, you know, there's some like obvious ones. I don't know how much Varun wants to say. I don't want to speak up. But I think you'd look at all the same axes of the system, right? Like how can we improve the knowledge retrieval? Like we'll always keep on figuring out how to improve knowledge retrieval. In our launch video, we even showed some of like the early explorations we have about looking into other data sources. That might not be the coolest thing to the individual developer building a zero to one app.

Starting point is 00:35:30 But you can really believe that like the enterprise customers really think that that's very cool. I think on the tool side, I think there's a whole lot more that we can do. I mean, of course, Varun talked about not just suggesting the terminal commands, but actually executing them. Like, I think that's going to be a huge unlock. You look at the actions that people are taking, right? Like, the human actions, the trajectories that we can build.

Starting point is 00:35:49 Like, how can we make that even more detailed? And I think all of those things, and you make some, like, even cleaner UI, like, the idea of looking at future trajectories, trying a few different things and, like, suggesting potential next actions to be taken. Yeah, yeah. That doesn't really exist yet, but it's pretty obvious, I think, what that would look like. You open up Cascade, and instead of starting typing, it's just like, here's a bunch of

Starting point is 00:36:10 things that we want to do. We kind of joke that's like Clippy's coming back, but like, maybe now's the time for Clippy to really shine, right? So I think there's a lot of ways that we can take this, which I think is like the very exciting part. We're calling each of our launches waves, I believe, because we want to really double down on the aquatic themes. Oh, yeah.

Starting point is 00:36:26 Does someone actually windsurf at the company? Is that? We're living out our dream of being cool enough to windsurf through the process. I don't think we can't. Yeah, all right. That was actually something we learned, because I don't think any of us are windsurfers. Like in our launch video, we have someone using Windsurf on a windsurf. You saw that.

Starting point is 00:36:43 You saw that. In the beginning of the video, someone was at the computer. And we didn't realize, like, now apparently is, like, the time of the year where there's, like, not enough wind to windsurf. So we were trying to figure out how to do this, like, you know, launch video with Windsurf on the windsurf. Every windsurfer we contacted was like, yeah, it's not possible. And there was, like, yeah, I think we can do this. And we made it happen. Oh, okay.

Starting point is 00:37:03 That's funny. Is there anything that you want feedback on? Maybe there's a fork in a road. You want feedback. You want people to respond to this podcast and tell you what they want. Yeah, I think there's a lot of things that I think could be more polished about the product that we'd like to improve. Lots of different environments that we're going to improve performance on. And I think we would love to hear from folks across the gamut.

Starting point is 00:37:25 Like, hey, like, if you have this environment, you use Windows and X version, it didn't work. Or this language, it was very poor. I think we would like to hear it. Yeah, I gave Preb and Kevin a lot of shit for my Python issues. Yeah, yeah, yeah. And I think there's a lot to kind of improve on the environment side. I think, like, for instance, even just a

Starting point is 00:37:41 dumb example, and I think, Sir of Swix, this was a common one. It's like, yeah, like, the virtual environment, where is the terminal running? What is all this stuff? These are all basic things that, like, to be honest, this is not rocket science, but we need to just fix it, right? We need to fix it. So, we would love to hear, like, all the feedback from the product, like, was it too slow?

Starting point is 00:37:58 Where was it too slow? What kind of environments could work way more in? There's a lot of things that we don't know. We, luckily, we're daily users of the product internally, so we're getting a lot of feedback inside. But I will say, like, there's a little bit of Silicon Valleyism in that a lot of us develop on Mac. A lot of people, once again, over 80% of developers are on Windows. So, yeah, there's a lot to learn and probably a lot of improvements down the line. Have you personally tempted your CEO of the company to switch to Windows just to feel something?

Starting point is 00:38:25 You know, you know what? You know what? Maybe I should. Actually, I think I will. I mean, like, your customers, you know, everyone says, 89% are all Windows, right? You live in Windows, you will never, you would never not see something that missed. So I think in the beginning, part of the reason why we were hesitant to do that

Starting point is 00:38:44 was, like, a lot of our architectural decisions to work across every IDE was because we built a platform-agnostic way of running the system on the user's local machine that was only buildable, easily buildable on, like, on dev containers that lived on a particular type of platform, so Mac was like nice for that. But now there's like not really an excuse

Starting point is 00:39:02 if it's like, if I can also make changes to the UI and stuff like that. And yeah, WSL also exists, that's actually something that we need to add to the product. That's how early it is, that we have not actually added that. We don't have, like, remote. Anything else about Codeium at large, right? Like, you still have your core business of the enterprise Codeium. Yeah. Anything moving there or anything that people should know about. I think a lot is still moving there, right? I think it would be a little bit like, you know, very kind of egotistical. It would be like, oh, we have Windsurf now. All of our enterprise customers are going to switch to Windsurf.

Starting point is 00:39:32 is the only, like, no, we still support the other. I was going to say, you just talked about your Java guys loving JetBrains. They're never going to leave JetBrains. They're never going to leave JetBrains. I mean, forget JetBrains. There's still tons and tons of enterprise people on Eclipse. Like, we're still the only code assistant that has an extension for Eclipse. That's still true years in, right?

Starting point is 00:39:48 And but, like, that's because that's our enterprise customers. And the way that we always think about it is, like, how do we still maximize the value of AI for every developer? I don't think that part of who we are has changed since the beginning. Right? And there's a lot of, like, meeting the developers where they are. So I think on the enterprise side, we're still pretty invested in doing that. We have a team of engineers

Starting point is 00:40:06 dedicated just to making enterprise successful and thinking about the enterprise problems. But really, if we think about it from the really macro perspective, it's like if we can solve all the enterprise response from an enterprise, and we have products that developers themselves just truly, truly love, then we're solving the problem from both sides. And I think it's one of those things where I think when we started working with the enterprise and we started building like dev tools, right? We started as an infrastructure company. Now we're building dev tools for developers, you really quickly understand and realize just how much developers loving the tool makes us successful in an enterprise.

Starting point is 00:40:39 There's a lot of enterprise software that developers hate. I want to draw this flywheel. But like, we're giving a tool for people where they're doing their most important work. They have to love it. And it's not like we're just trying to convince the executives. These companies also ask their developers a lot, do you love this? Like that is like almost always a key aspect of whether or not Codeium is accepted into the organization. I don't think we go from zero to 10 million ARR in less than a year in an enterprise product if we don't have a product that developers love.

Starting point is 00:41:06 So I think that's why we're just, you know, the IDE is more of a developer love kind of play. It will eventually make it to the enterprise. We still solve the enterprise problems. And again, we could be completely wrong about this, but we hope we're solving the right problems. It's interesting. I asked you this before we started rolling, but like it's the same team. That's the same eng team. Like, in any normal company or like, you know, my normal mental model of company construction, if you are to have effectively two products like this, you would have two different teams serving two different needs, but it's the same team.

Starting point is 00:41:34 Yeah, I think one of the things that's maybe unique about our company is like this has not been one company the whole time, right? Like we were first, like this GPU virtualization company pivoted to this. And then after that, we're making some changes. And like, I think there's like a versatility of the company and like this ability to move where we think the instinct. We have this instinct where, and by the way, the instinct could be wrong.

Starting point is 00:41:56 But if we smell something, we're going to move fast. And I think it's more a testament to, I think, the engineering team rather than any one of us. I'm sure. On December 19, 2022, you had one of our guest posts, What Building Copilot for X Really Takes. Estimate inference to figure out latency and quality. Build first party instead of using third party as API. Okay. Figure out real time because ChatGPT and DALL-E at RFP are too slow. Optimize prompt because context window is limited, which is maybe not that true anymore.

Starting point is 00:42:27 and then merge model outputs with the UX to make the product more intuitive. Is there anything you would add? I give myself like a B-minus on that. So some parts of that are accurate. Even like the context, like the one that you called out. Like, yeah, models have like larger context. Like now that's absolutely true. Like it's grown a lot.

Starting point is 00:42:44 But look at like an enterprise code base. Yeah, yeah, absolutely. You know, tens of millions of lines of code. That's hundreds of billions of tokens. Never going to change. So still being really good at like being able to piece together this like distributed knowledge is important. So I think like there are figures there that I think are,

Starting point is 00:42:57 are still pretty accurate. There's probably some that are less so. First party versus third party. I think we're wrong there. I think I would nuance that to be like there are certain things that it's really important to do first party. Like auto-complete, you have a really specific application

Starting point is 00:43:11 that you can't just prompt engineer your way out of or just maybe even like fine-tune afterwards. You just can't do that. I think there's truth there. But let's also be realistic. The stuff that's coming out from the third-party model providers, Cascade and Windsurf would not have been possible if it wasn't for the rapid improvements

Starting point is 00:43:26 with 4o and 3.5 Sonnet. That just wouldn't have been possible. So I'll give myself a B-minus. I'll say I passed, but yes, two years later. Just to be clear, we're not grading. It's more of a, what would you, you know. Where are they now? What would you have added?

Starting point is 00:43:40 What would you like? Yeah, I mean, like that first post, right? Like that was when we had literally, I think that was like a few weeks after we had launched Codeium. I think that's like, you know, Swix and I were talking like, maybe we can write this because we're like one of the first products that people can actually use the AI. That's cool. I specifically like the Copilot for X thing because everyone is so hard.

Starting point is 00:43:57 At that time, everyone was just like, you know, ChatGPT, that's all that was. But I think, like, you know, that we didn't have an enterprise product. I don't even think we were necessarily thinking of an enterprise product at that point, right? So, like, all of the learnings that, like, you know, we've had from the enterprise perspective, which is why I loved coming back for, like, a third time now on the blog, some of those, I think we kind of, like, figured. Some of those we just honestly walked backwards into. I had to get lucky a lot of the ways. Like, we had many, we just did a lot.

Starting point is 00:44:25 Like, there's so many, like, opportunities and deals that we had that we, like, lost for a variety of reasons that we had to, like, learn from. There's just so much more to add that there's no way I would have gotten that right in 2022. Can I mention one thing that I think is, hopefully this is not very controversial, but it's, like, true about our engineering team as a whole. I don't think most of us got much value from ChatGPT. Largely because I think the problem was, and this is maybe a little bit of a different thing, it's like a lot of the engineers at the company have been writing software for, like, over eight years. And this is not to say they know everything that ChatGPT knows, they don't. They'd already gotten good enough at searching Stack Overflow. Invested a lot in searching the codebase, right? They can very quickly grep through the code. Incredibly fast, like every

Starting point is 00:45:07 tool. And they've spent like eight years mastering that skill. And ChatGPT being this thing on the side that you need to provide a lot of context to, we were not able to actually get, like my co-founder just basically never used ChatGPT at all. Literally never did. And because of that, probably at the time, one of our incorrect sort of assumptions was probably that, hey, like a lot of these passive systems need to get good because they're always there and these active systems are going to be behind. I think actually Cascade was the big thing. It's the thing at the company that everyone is now using. Literally everyone. Biggest skeptics. And we have a lot of people at the company that are skeptical of AI. I think this is actually important. Why do you hire them? No, I think here's the important

Starting point is 00:45:41 thing. Those people that were skeptical about AI previously worked in autonomous vehicles. These are not crypto people. These are people that care about technology and want to work on the future. Their bar for good is just very high. They will not form a cult of, this is awesome, this is going to change the world. They were not going to be the kind of people on Twitter that are like, yeah, this changes everything. Like, software as we know it is dead. No, they are people that are going to be incredibly honest, and we know if we hit the bar that it is good for them, we found something special. And I think at that time, we probably had a lot of sentiment like that, that has changed a lot now. And I think it's actually important that you have believers that are incredibly future-looking

Starting point is 00:46:17 and people that kind of rein it in. Because otherwise, you just have, you know, this is like autonomous vehicles. You have a very discrete problem. People are just working in a vacuum. And there's no signal to kind of bring you down to reality. Right, you have no good way to kill ideas. And there are a lot of ideas we're going to come up with that are just terrible ideas. But we need to come up with terrible ideas. Otherwise, like, how does anything good come out? And I don't want to call these skeptics. Skeptics suggest that they don't know. They're realists. They're the type of people that when they see a waitlist on a product online, they just will not believe it. They will not think about it at all.

Starting point is 00:46:47 Kudos for launching without a waitlist. Yeah. Yeah. By the way, we will never launch with a waitlist. We will never launch with a waitlist. That's the thing at the company. We'd much rather be a company that's considered the boring company than a company that launches once in a while and hopefully it's good. My joke is generative AI has gotten really good at generating waitlists. Also, just to clarify, both of us used to work in autonomous vehicles, so it doesn't come across it. Oh, yeah, yeah.

Starting point is 00:47:09 I'm just kidding on the autonomous vehicles. No, we love that technology. We love it. Like, I love hard technology problems. That's what I live for. Amazing. Just pushback on the first party thing. I accept that the large model labs have just done a lot of work for you

Starting point is 00:47:24 that you didn't need to duplicate. But you now are sitting on so much proprietary data that it may be worth training on the trajectories that you're collecting. So maybe it's a pendulum back to first party. Yeah, I mean, I think like, I mean, we've been pretty clear from like a security posture perspective. I think there's like both like, you know, customer trust and like... I mean, I kind of want, like, let me opt. I think that there are signals that we do get

Starting point is 00:47:49 from our users that we can utilize. There's a lot of preference information that we get, for example. Which is effectively what you're saying of like our trajectories. Go ahead. We, like I will say this, the Supercomplete product that we have has gotten materially better because of us not only using synthetic data, but also getting the preference data from our users of like, hey, given this set of trajectories, here's actually what a good outcome is. And in fact, one of the really beautiful parts about our product that is very different than

Starting point is 00:48:14 ChatGPT is we can not only see if the acceptance happened, but if something more than the acceptance happened, if an edit happened even after that. Right? Like, let's say you accepted it, but then after accepting it, you deleted three or four items in there. We can see that.

Starting point is 00:48:26 So that actually lets us get to even better than acceptance as a metric. Because we're in the ultimate work output of the developer. It's the preference between the acceptance and what actually happened. If you can actually get ground truth

Starting point is 00:48:38 of what actually happened, this is the beauty of being an IDE, then, like, yeah, you get a lot of information there. So did you have this with the extension or is this pure Windsurf? We had this with the extension. Yeah, okay, right.
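
To make the "better than acceptance" idea concrete, here is a minimal sketch in Python. It is an illustration only, not Codeium's actual pipeline: the `SuggestionEvent` schema and the function names are hypothetical, and the character-level matching from the standard library's `difflib` is just one assumed way to measure how much of an accepted suggestion survives later edits.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher


@dataclass
class SuggestionEvent:
    """One AI suggestion and what the developer kept (hypothetical schema)."""
    suggested: str    # text the assistant proposed
    accepted: bool    # whether the developer accepted it
    final: str = ""   # what remained in the file some time after acceptance


def retained_fraction(event: SuggestionEvent) -> float:
    """Fraction of the suggested characters still present in the final text."""
    if not event.accepted or not event.suggested:
        return 0.0
    matcher = SequenceMatcher(None, event.suggested, event.final, autojunk=False)
    kept = sum(block.size for block in matcher.get_matching_blocks())
    return kept / len(event.suggested)


def acceptance_rate(events: list[SuggestionEvent]) -> float:
    """Plain acceptance: share of suggestions the developer accepted."""
    if not events:
        return 0.0
    return sum(e.accepted for e in events) / len(events)


def retention_weighted_rate(events: list[SuggestionEvent]) -> float:
    """Acceptance discounted by how much of each suggestion was later deleted."""
    if not events:
        return 0.0
    return sum(retained_fraction(e) for e in events) / len(events)


if __name__ == "__main__":
    events = [
        SuggestionEvent("abcd", accepted=True, final="abcd"),  # kept in full
        SuggestionEvent("abcd", accepted=True, final="ab"),    # half deleted afterwards
        SuggestionEvent("abcd", accepted=False),               # rejected
    ]
    print("acceptance:", acceptance_rate(events))
    print("retention-weighted:", retention_weighted_rate(events))
```

In this sketch the two numbers diverge exactly when a developer accepts a suggestion and then deletes part of it, which is the signal described above as only being visible from inside the editor.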

Starting point is 00:48:48 Windsurf just gives you more of the IDE. Yes. So that means you can also start getting more information. Like, for instance, the basic thing that Anshul said, we can see if, like, a file explorer was opened. It's actually just a piece of information we just could not see previously. Sure. Yeah.

Starting point is 00:49:02 A lot of intent in there. A lot of intent. Second one. Oh, boy. How to make AI UX your moat. Oh, man. Isn't that funny that we now created, like, the full UX experience in an IDE? I think that one is pretty accurate.

Starting point is 00:49:14 That one's an A? I think that one I'd give myself. I think, like, we were doing that within the experience. I still think that's true within the extensions as well, right? Like we got very, very creative with things. Like Varun mentioned the idea of just like, you know, essentially rendering images to display things. Like we get creative to figure out what the right UX is there. Like we could create a really like dumb UX.

Starting point is 00:49:32 It's like a side panel, like whatever. But actually going the extra mile does make that experience as good as it possibly can be there. But yeah, now like look at some of the UX that we're able to build in like in Windsurf and it's just like, it's fun. The first time I saw, because now we can do Command in the terminal, you don't have to search for a bash command. The first time I saw that, I was like, I just started smiling. And like it's not like Cascade.

Starting point is 00:49:55 It's not like an agentic system writing the app. But I'm like, that is just very, very cool. We literally couldn't do that in VS Code. Yeah, I understand that. Yeah. I've even invented a 60-line bash command called Please. And you can, you know, do that inside of it. Yeah.

Starting point is 00:50:07 That's cool. Yeah, so "please," English, and then... You know, that's actually really cool because one of the things I think we believe in is actually I like products like autocomplete more than command, purely because I don't even want to open anything up. So that thing where I just can type

Starting point is 00:50:21 and not have to press some button shortcuts to go in a different place, I actually like that too. Yeah, and I actually adopted Warp, the terminal Warp, initially for that, because they gave that away for free. But now it's everywhere, so I can turn off Warp and not give Sequoia my bash commands. I'm with you. No, no, look.

Starting point is 00:50:43 Okay, I don't know. I'm going to go on a rant. Hopefully somebody's at Warp. This is, like, more product feedback. But they basically had this thing where you can do kind of like pound and then write natural language. But then they also have the auto-infer that what you're typing is natural language. And those are different.

Starting point is 00:50:59 When you do the pound, it sounds like it gives you a predetermined command. When you like talk to it, it generates a flow. Okay. It's a bit confusing of a UX. But going back to your post, you had the three Ps of AI UX. What were they again? Present, practical, powerful. Actually, that was really good.

Starting point is 00:51:17 And I think, like, in the beginning, being present was enough. Maybe, you know, even when you launch, it's like, oh, you have AI, like, that's cool. Other people don't have it. Do you think we're still in the practical where, like, the experience is actually, like, the model doesn't even need to be that powerful, like, just having better experience is enough? Or, like, do you think, like, really the being able to do the whole, because your point was, like, you're powerful when you generate a lot of value for the customer. You're, like, practical when, like, you're basically, like, wrapping it in a nicer way.

Starting point is 00:51:47 Yeah, where are we in the market? I think there's always going to be room for, like, practical UX. Like, getting it, like, I mean, the command terminal. That's, like, a very practical UX, right? Like, I do think with things like Cascade and these agentic systems, like, we are starting to get onto powerful. Because, like, there's so many pieces, like,

Starting point is 00:52:02 from a UX perspective that make Cascade really good. Like, it's just really, like, micro things that are, like, just all over the place. But as, you know, we're streaming in, we're showing, like, the changes. We're, like, allowing you to jump in, open diffs and see it. We can run background terminal commands. You can see what background processes are running. There's all these small UX things that together come to a really powerful and intuitive

Starting point is 00:52:25 UX. I think we're starting to get there. It's definitely just a start. And that's why we're so excited about where all this is going to go. I think we're starting to see the glimpses of it. I'm excited. It's going to be a whole new blog, yeah. Yeah.

Starting point is 00:52:36 Awesome. First of all, it's just been really nice to work with you. I do work with a number of guest posters and not everyone makes it through to the end, and nobody else has done it three times. So kudos. We're up for a hat trick. This one was more like the money one, which I, you know, it's funny because I think developers are like quite uninterested in money. Isn't it weird?

Starting point is 00:53:01 Yeah, I mean, I think like, I don't know if this is just the nature of our company. Like I think there's something we've said like there's all like the San Francisco AI companies and like everyone's like hyping each other like on the tech and everything which is like great. The tech's really important. We're here in Mountain View, beautiful office. We just really care about like actually driving value and making money. which is kind of like a core part of the company. I think maybe the selfish way of saying that, or like a little more of the selfless way,

Starting point is 00:53:24 is like, yeah, we can be kind of like this VC-funded company forever. But ultimately speaking, you know, if we actually want to transform the way software happens, we need this part of the business that's cash-generative that enables us to actually invest tremendously in the software. And that needs to be durable cash. Not cash that, like, churns the next year. And we want to set ourselves up to be a company that is durable

Starting point is 00:53:47 and can actually solve these problems. Yeah, yeah, excellent. So for people, obviously, we're going to link in the show notes, but for people who are listening to this for the first time, I had a lot of trouble naming this piece. So we originally called it, you had like how to make money something.

Starting point is 00:54:03 I was super baity. I apologize. I was super baity. I think I was like writing part of that during, like on a plane flight, so I apologize for that. He had like three dollar signs. Oh, I absolutely had three dollar signs. I was like, I can't do that.

Starting point is 00:54:16 So it's either building AI for the enterprise, and then I also said the worst, the most dangerous thing an AI startup can do is build for other AI startups, which I think both of you will co-sign. And I think basically the main thesis, which I really liked was like, go slow to go fast, like here's the,

Starting point is 00:54:29 if you actually build for like security, compliance, personalization, usage analytics, latency budgets, and scale from the start, then you're going to pay that cost now, but eventually it's going to pay off in the long run. And this is the actual insight.

Starting point is 00:54:44 You cannot do this later. Like, if you build the easy thing first as an MVP, it's like, yeah, like, just ship it with, like, whatever's easy to do. And then you tack on the EnterpriseReady.io set of, like, 12 things that you have, you actually end up with a different product or you end up worse off than if you had started from the beginning. So that I had never heard before. Yeah, I mean, we see that like repeatedly. I mean, just like right now, we have a lot of customers in like the defense space, for example. We're going through FedRAMP accreditation right now. And people that we're working with, they saw like the fact that, like, oh, yeah, we already have a containerized system. We can already like deploy in these manners. We've already gone through like security.

Starting point is 00:55:23 They're like, oh, you're going to have a much easier time doing this, right? Than most companies that are just like, okay, we have like a big SaaS blob and now we need to like do all these things. It might sound like a really deep thing. I think it's just anyone who's worked for like an extended period of time at like a company like on a certain project has probably seen this happen. The technology just keeps on improving. And then you realize that like you have to now like re-architect your whole system

Starting point is 00:55:46 to get something improving. Like just making that kind of change when you've invested so much effort. Like people have, like, put in hours. They're emotionally invested or whatever it might be. It's really hard to make that change. So I'm sure we're going to hit that also. Like, yes, I think we've done things a little bit earlier than most companies. I think we're going to hit points where we're going to see parts of our systems.

Starting point is 00:56:06 We're like, oh, we really need to re-architect that. Actually, we've definitely hit that already. And I think that's just like at the project level, the product level, or is that like your whole company? Right. I think the thesis behind here is like to some degree your company needs to have had this DNA from the beginning. And I think then you'll be able to go through those bumps a lot more smoothly and be able to drive the value.

Starting point is 00:56:26 I haven't been. Yeah, can I say two points? So first point I'd like to say is this is something that me and Douglas, my co-founder, talk about a lot. It's like, you know, there's this constant thing of like build versus buy. I think the answer is, like, a lot of the time the answer should be buy. Right. Like, we're not going to go build our own sales tool. We should go buy Salesforce, right?

Starting point is 00:56:43 That's kind of dumb. That's undifferentiated. And the reason why you go with buy instead of build is, hey, like, look, the ROI of what exists out there is good. From like an opportunity cost standpoint, it's better to actually go out and buy it than build it and do a shittier job, right? There's a company that's actually going out and focused on that. But here's the hidden thing that I think is really important when you go out and buy.

Starting point is 00:57:02 You're losing a core competency inside the company. And that's a core competency you can never get. Or it's very hard. Like startups are so limited on time. Let me just say, like, let's say as a company, we did not invest in, I don't know, model inference. Yeah, we have like a custom inference runtime. We give that up right now. We will never get it back.

Starting point is 00:57:20 It's going to be very hard to get it back. You can't just use vLLM and TensorRT. That would be your only option. Or just let me put it this way. If we used vLLM, we would not be talking with you right now. Like, yeah, we would have, yeah. But the point is, this is more a question of, like, you know, I try to think about it from first principles.

Starting point is 00:57:36 It's like, Google's a great company, makes a lot of money. What happens if they actually made the search index of the product something that someone else built for them? It's like they could. Maybe someone else could have done a good job. Maybe that's, like, a bad example, particularly because Google is a search index. But, like, tough luck getting that core competency back.

Starting point is 00:57:51 You've lost it. Right? And I think for us, it's more a question of, like, what core competencies do we need in the business. And yeah, like, sometimes it's painful. Like, sometimes actually, like, some of these core competencies are annoying, sometimes we'll be behind what exists out there, right? And we just need to be very honest. That's where the truth-seekingness of the company matters. Like, are we really honest about this core competency? Can we actually keep up? If the answer is we truly

Starting point is 00:58:13 can't keep up, then why are we keeping up the charade? We should just buy, right? Like, let's not build. If the answer is we can, and we think that this will differentiatedly make our company a better company in the long term, then the answer is we need to. We need to. Because, like, the race is not won in the next year. The race is won over the next five, ten years, right? Maybe even longer, right? So that's maybe one thing. And then the second thing, actually, from like the enterprise standpoint, I think one of the unique parts of the company now is we have both this individual and enterprise side, and usually companies stick to one or the other. And I think that needs to be part of the DNA, I think kind of early on in the company, as Anshul said.

Starting point is 00:58:45 I mean, there's stories of companies like Dropbox and stuff that tried. And Dropbox is an amazing company, fantastic company, one of the fastest growing consumer companies of all time, or rather software companies of all time. But yeah, like when you have everyone sort of product oriented on the consumer side, the enterprise is just, it's checking off a lot of boxes that ultimately do not help the consumer at all. It doesn't help your growth metrics. And effectively, if the original group of people didn't care, it's incredibly hard to get

Starting point is 00:59:12 them to care down the line, right? Yeah. It's incredibly hard. Why do it? And you need to feel like, hey, this is like, this is an important part for the company's viability. So I think there's a little bit of like the build versus buy part. and then also like the cultural DNA of the company

Starting point is 00:59:25 that I think are both really important and yeah, it's something we think about all the time. I have the privilege of being friends with you guys off the air. I don't feel like, like I think I know your work histories. Like you say cultural DNA, but like it's not like you've built like giant enterprise SaaS before, right? Yeah. I think, yeah.

Starting point is 00:59:44 So like where are you getting this from? Yeah. In fact, I think the only other sort of, I guess like, you know, when I look at my previous internships, maybe Anshul can provide some context here. It's like I worked at like LinkedIn and then Quora and then Databricks. And to be honest, like I was not that interested in B2B ETL software that much. That's not what drives me when I wake up at night. So because of that, I decided to go work in an autonomous

Starting point is 01:00:06 vehicle company immediately after. I think part of it comes down to maybe a little bit of the unique aspect of the company and the fact that we pivoted as a company is like we want to be a durable company. And then the question is, how do you work backwards from that? There's a lot of things about being very honest about what we're good at and what we're not good at. I think surprisingly enterprise sales is like not something that like I came out of the womb knowing how to do. I didn't really know. And because of that, like obviously like a lot of sales happened between sort of folks like Anshul and I helping partner with companies. But very soon we hired actually a VP of sales and we've actually been deeply involved in the process of scaling out like a large go-to-market team. And I think it's more a question of like what matters to the company and how do you actually go out and build it. And I think one of the people that I think about a lot actually is someone like Alex Wang. He dropped out of college. He was a year younger than us at MIT, and he has figured out how to constantly change the direction of the company. Effectively, it starts out as like, you know, human task interface,

Starting point is 01:01:01 then an AV labeling company, then a cataloging company, then now a generative AI labeling company. And every time the revenue of the company kind of goes up by a factor of 10, even though the business is doing something largely different. I mean, now it's all about military contracts. Yeah, now it's probably going to be military, and then after that it might be taking over the world. Like, he's just going to keep increasing the stakes.

Starting point is 01:01:17 And like, there's no playbook on how this really works. It's just a little bit of like, you know, solve a hard problem and work backwards from that, right? And we'll get lucky along the way. I think, like, we think about everything from first principles to the best of our abilities, but there's just so many variable unknowns that, yeah, like, we don't know everything that's happening in every company out there and everyone knows how fast the AI space is moving. We have to be pretty good at adapting.

Starting point is 01:01:42 I want to double-click on one thing just because you brought it up, and it's like a rare thing to touch on: VP of sales. We don't get to actually, we talk to pretty early stage founders mostly. They don't usually have a pretty built-out sales function. Advice, what kind of sales works in this kind of field? What didn't work? Anything you can share with other founders? I think one of the hard parts about hiring people in sales, and I really, like, Graham, Anshul can also attest, like, we have an amazing VP of sales at the company. One of the things is, like, if you're purely a developer, salespeople, their job is to, like,

Starting point is 01:02:14 talk, like, really well, prim and proper. I mean, very obvious, if you hear, like, me talk, like, I'm not a very polished person. You're great, by the way. I don't know. Or compared to most pure, pure salespeople. So actually just checking based on the way they speak is not that interesting. I think, like, you know, what matters in a space like ours that is moving very quickly, I think it's like intellectual curiosity is very important. Intellectual horsepower. Understanding how to build a factory. I'm not trying to minimize it.

Starting point is 01:02:42 But in some ways, in sales, you need to build something incredibly scalable here, right? It's almost like every year you're kind of making this factory twice, thrice, maybe as big, right? Because in some ways, you have people that are quota carrying. You need some number of people and you need to make the math work. And actually, the process of building a factory is not something you can just take someone who is a great rep at another company and just make them build a factory. This is actually a very different skill. How do you actually make sure you have hundreds of people that actually deeply understand the product? Actually, Anshul works very closely also with sales to make sure that they're enabled properly. Make sure that they understand the technology.

Starting point is 01:03:16 Our technology is also changing very quickly. Let's maybe take an example on how our company is very different than a company like MongoDB. When you sell a product like MongoDB, no one at the company is interested in how the data is being stored. It's not that interesting, right? I love databases. I would be interested. But most people are like, solve the application problem I have at hand. People are curious about how our technology works. People are curious about RAG, right? People that are buying our technology. And imagine we had a sales team that is scaling where no one understands any of this stuff. We're not going to be great partners to our customers. So how do you create almost this growing factory that is able to actually distribute the software in a way that is

Starting point is 01:03:49 true to our partners and also at the same time, like, taking on all the new parts of our product, right? They're actually able to expound on the new parts of our product. So, sorry, that was more a statement of, like, building a scalable sales team. But in terms of, like, who you hire is, you just need to have a sense. Like, in some ways, this is maybe an example of talk to enough people, find out what good looks like potentially in your category and find someone who's good and humble and willing to work with. Yeah, that's just generic hiring. It's just generic hiring. I think here, sales, there's sales for AI. or sales for AI infrastructure.

Starting point is 01:04:22 And then there's also the sales feeding into products in a way that we're talking about here, where they basically tell you what they need. I imagine a lot of that happened. I think a lot of that happened. I mean, still happens. And as Varun mentioned, like, Varun, myself, a number of other people who are developers by trade, engineers.

Starting point is 01:04:38 Like, we're pretty involved in the sales process because, like, there's a lot to learn, right? Like, before we went out and hired a sales leader, like, yeah, if, like, neither of us had ever done a sale for Codeium in our lives, and we went to try to find a sales leader, we probably would have not hired the right person. Yeah, we had sold the product to like 30 or 40 customers at that time. We had done like hundreds and hundreds of deal cycles ourselves personally, right? Without, I mean, we read a lot of books and we just did a lot of stuff. And we learned

Starting point is 01:05:04 like what messaging worked, like what did we need to do? And then I think we found like the right person, right? I second Varun, like Graham's amazing, who we brought on as our VP of sales. That just has to be part of the nature and it doesn't stop now. Like just because we have a VP of sales and people dedicated to sales, it doesn't mean that we can't be involved or like engineering can't be involved, right? Like we have lots of people. Like we hire plenty of deployed engineers, right? These are people like, you know, I think like Palantir kind of made this really famous.

Starting point is 01:05:31 Like deployed engineers like work very, very closely with the sales team on very technical aspects because they can also understand like what are people trying to do with AI? As in they work at Codeium as deployed engineers. Yeah. Okay. And they partner with our account executives to like make our customers successful, learn what it is that people are actually getting value from with AI. And that's information that we keep on collating.

Starting point is 01:05:52 And it's like we will both jump into any deal cycle just to learn more because that's how we're going to just keep on building the best product. It comes back to the same thing, I don't know. And hopefully we build the right thing. Cool, guys. Thank you for the time. It's great to have you back on the pod. Yeah, thanks so for having us.

Starting point is 01:06:09 Hopefully in a year we can do another one. Yeah. You'll be 10 billion by then. Yeah, exactly. At this rate, then next year. We try not thinking about that. Try to not be a zero-billion company. There's that, yeah.

Starting point is 01:06:21 All right, cool. That's it. Awesome.
