The Pragmatic Engineer - Building Windsurf with Varun Mohan

Episode Date: May 7, 2025

Supported by Our Partners
• Modal: The cloud platform for building AI applications
• CodeRabbit: Cut code review time and bugs in half. Use the code PRAGMATIC to get one month free.

What happens when LLMs meet real-world codebases? In this episode of The Pragmatic Engineer, I am joined by Varun Mohan, CEO and Co-Founder of Windsurf. Varun talks me through the technical challenges of building an AI-native IDE (Windsurf), and how these tools are changing the way software gets built.

We discuss:
• What building self-driving cars taught the Windsurf team about evaluating LLMs
• How LLMs for text are missing capabilities for coding like “fill in the middle”
• How Windsurf optimizes for latency
• Windsurf’s culture of taking bets and learning from failure
• Breakthroughs that led to Cascade (agentic capabilities)
• Why the Windsurf teams build their LLMs
• How non-dev employees at Windsurf build custom SaaS apps – with Windsurf!
• How Windsurf empowers engineers to focus on more interesting problems
• The skills that will remain valuable as AI takes over more of the codebase
• And much more!

Timestamps
(00:00) Intro
(01:37) How Windsurf tests new models
(08:25) Windsurf’s origin story
(13:03) The current size and scope of Windsurf
(16:04) The missing capabilities Windsurf uncovered in LLMs when used for coding
(20:40) Windsurf’s work with fine-tuning inside companies
(24:00) Challenges developers face with Windsurf and similar tools as codebases scale
(27:06) Windsurf’s stack and an explanation of FedRAMP compliance
(29:22) How Windsurf protects latency and the problems with local data that remain unsolved
(33:40) Windsurf’s processes for indexing code
(37:50) How Windsurf manages data
(40:00) The pros and cons of embedding databases
(42:15) “The split brain situation”: how Windsurf balances present and long-term
(44:10) Why Windsurf embraces failure and the learnings that come from it
(46:30) Breakthroughs that fueled Cascade
(48:43) The insider’s developer mode that allows Windsurf to dogfood easily
(50:00) Windsurf’s non-developer power user who routinely builds apps in Windsurf
(52:40) Which SaaS products won’t likely be replaced
(56:20) How engineering processes have changed at Windsurf
(1:00:01) The fatigue that goes along with being a software engineer, and how AI tools can help
(1:02:58) Why Windsurf chose to fork VS Code and built a plugin for JetBrains
(1:07:15) Windsurf’s language server
(1:08:30) The current use of MCP and its shortcomings
(1:12:50) How coding used to work in C#, and how MCP may evolve
(1:14:05) Varun’s thoughts on vibe coding and the problems non-developers encounter
(1:19:10) The types of engineers who will remain in demand
(1:21:10) How AI will impact the future of software development jobs and the software industry
(1:24:52) Rapid fire round

The Pragmatic Engineer deepdives relevant for this episode:
• IDEs with GenAI features that Software Engineers love
• AI tooling for Software Engineers in 2024: reality check
• How AI-assisted coding will change software engineering: hard truths
• AI tools for software engineers, but without the hype

See the transcript and other references from the episode at https://newsletter.pragmaticengineer.com/podcast

Production and marketing by https://penname.co/. For inquiries about sponsoring the podcast, email podcast@pragmaticengineer.com. Get full access to The Pragmatic Engineer at newsletter.pragmaticengineer.com/subscribe

Transcript
Starting point is 00:00:00 A lot of people talk about how we're going to have way fewer software engineers in the near future. I think it feels like it's people that hate software engineers, largely speaking, that say this. It feels pessimistic not only towards these people, but I would say just in terms of what the ambitions for companies are. I think the ambitions for a lot of companies is to build a lot better product. And if you now give the ability for companies to now have a better return on investment for building technology, right? Because the cost of building software has gone down. What should you be doing? You should be building more.
Starting point is 00:00:27 because now the ROI for software and developers is even higher because a singular developer can do more for your business. So technology actually increases the ceiling of your company much faster. Windsurf is one of the popular IDEs software engineers use, thanks to its AI coding capabilities. But what are the unique engineering challenges that go into building it, and how do tools like Windsurf change software engineering? Today I sat down with Varun Mohan, co-founder and CEO of Windsurf. We talk about why the Windsurf team builds their own LLMs and how LLMs for text are missing capabilities necessary for coding, like fill in the middle. How Windsurf uses a mix of techniques for many cases,
Starting point is 00:01:03 like to solve for search, how they use a combination of embeddings and keyword-based searches. Why latency is their number one challenge and how incorrectly balancing GPU compute load and memory load can lead to higher latency for code suggestions popping up. How Varun thinks the software engineering field will evolve and why he stopped worrying about predictions like 90% of code will be generated by AI in six months. If you want to understand the engineering that goes into these next-generation IDEs, then this episode is for you. If you enjoy the show, please do subscribe on any podcast platform and on YouTube. Welcome to the podcast.
Starting point is 00:01:36 Yeah, thanks for having me on. You've recently launched GPT-4.1 support in Windsurf, which by the time this is out, it will have been a few weeks. But what are your initial impressions so far? And in general, when you introduce a new model, how do you evaluate how it's working for the coding use cases that we all use? Yeah, maybe I can talk about the second part, and then I can talk about, you know, GPT-4.1 and the other models afterwards. Basically, internally, you know, these models have these non-deterministic properties, right? They sometimes perform differently in different tasks in ways that are unexpected. You know, you can't just look at a score on a competitive programming competition and decide,
Starting point is 00:02:20 hey, it's going to be awesome for programming. And, you know, interestingly about the company, maybe this is going to be a helpful context. A lot of us in the company previously worked in autonomous vehicles. And I think in autonomous vehicles, we had a similar type of behavior where you had a piece of software. The software was very modular, lots of different pieces. Each piece was machine learning driven.
Starting point is 00:02:42 So there was some nondeterminism. And it's very hard to test it in the real world, right? Actually, it's much harder than it is to test, I guess, Windsurf out in the real world. It's much harder to test autonomous vehicle software out in the real world, because if you ship bad software, you have the chance of hurting a lot of people, right? Hurting a lot of people, hurting, you know, I don't know, just the general public infrastructure, right?
Starting point is 00:03:01 So in that case, we needed to build really good simulation evaluation infrastructure in autonomous vehicles. And I guess we brought that over here as well, where, hey, if you want to test out a new model, we have evaluation suites. And the evaluation suites not only test end-to-end software performance, which is to say you give a high-level task, what is the pass rate of actually completing the high-level tasks on a bunch of unit tests. It also tests retrieval accuracy, edit accuracy, right? Redundant changes. All these different parts of a model that are like negative behavior.
Starting point is 00:03:30 Because for our product, it not only matters that you pass a test. It also matters that you didn't go out and make 10 steps that were unnecessary because the human is going to be waiting on the other end for all of those changes. So we have metrics for all of these things. And we're able to put each model through like, I guess, a suite of tests that give us metrics. And that's like the way we decide, hey, this is a good model for our end users. Right. And that's like the high level way that we go about testing. And like these tests, you know, they sound great in theory, but in practice, what does it look like? Like I'm going to assume you're going to have, you know,
Starting point is 00:04:02 we can imagine us engineers who've been writing, you know, code, probably not autonomous vehicles, but similar ones, you know, we know our unit tests, our integration tests. If you do mobile, you know your end-to-end tests. I'm assuming this will be a little bit different, but with some similarities. Like, do you actually code some scenarios? You have example code, example prompts, and then I assume you can do a bit of that, but then what else? And, you know, how does this all come together? And how can I imagine this test suite? Is it one big giant blob that runs for I don't know how long? Yeah, one of the aspects of code that is really good is it can be run, right? So it's not like a very, you know, touchy-feely kind of thing. In the end, a test can be passed. So what we can do is we can
Starting point is 00:04:43 take a bunch of open source repositories, we can find previous pull requests or commits that actually not only add tests, but also add the implementations correspondingly. And what we can do is instead of just taking the commit description, we can remake what the description of the commit should have been, like a very high level intent. And then from there, it becomes a very, I guess, programmatic problem, which is to say, hey, like, first of all, find the right files that you need to go and make changes to, right? Then there is a ground truth for that, right? Because the base code actually has a set of five, 10 files that changes were made to.
Starting point is 00:05:17 Then after that, what is the intent on those files? You can actually go from the ground truth backwards, which is that you know what the final change was from the actual code. And you can have the model generate that intent. And then after that, you can see if the edit, given that intent is correct.
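The layered checks described above can be sketched in a few lines. This is a minimal illustration, not Windsurf's actual evaluation suite: the `CommitCase` schema, the function names, and the use of exact string matching for edits are all assumptions made for the example. It covers the retrieval and edit layers plus a count of redundant changes; the intent check and actually running the tests are left out.

```python
from dataclasses import dataclass


@dataclass
class CommitCase:
    """One evaluation case mined from a historical commit (hypothetical schema)."""
    intent: str                         # high-level task description regenerated from the commit
    changed_files: set[str]             # ground truth: files the real commit touched
    ground_truth_edits: dict[str, str]  # file path -> file content after the real commit


def retrieval_scores(retrieved: set[str], truth: set[str]) -> tuple[float, float]:
    """Precision and recall of the files the model chose to work on."""
    if not retrieved or not truth:
        return 0.0, 0.0
    hits = len(retrieved & truth)
    return hits / len(retrieved), hits / len(truth)


def edit_accuracy(predicted: dict[str, str], truth: dict[str, str]) -> float:
    """Fraction of ground-truth files whose predicted content matches exactly.

    Exact match is a stand-in; a real suite would more likely run the tests.
    """
    if not truth:
        return 0.0
    matches = sum(1 for path, content in truth.items() if predicted.get(path) == content)
    return matches / len(truth)


def redundant_edits(predicted: dict[str, str], truth: dict[str, str]) -> int:
    """Count files the model changed that the real commit never touched."""
    return len(set(predicted) - set(truth))


def score_case(case: CommitCase, retrieved: set[str], predicted: dict[str, str]) -> dict[str, float]:
    """Combine the per-layer metrics for one commit-derived case."""
    precision, recall = retrieval_scores(retrieved, case.changed_files)
    return {
        "retrieval_precision": precision,
        "retrieval_recall": recall,
        "edit_accuracy": edit_accuracy(predicted, case.ground_truth_edits),
        "redundant_edits": float(redundant_edits(predicted, case.ground_truth_edits)),
    }
```

Keeping the metrics separate makes it possible to see whether a model fails because it looked in the wrong files or because it edited the right files badly.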
Starting point is 00:05:33 So you now have three layers of tests, which is that, hey, did I retrieve the right things? Did I have the high level intent correctly? And is the edit performance good, right? And then you can imagine, doing much more than just that. But at a high level now, you know, just from a pure commit or a pure actual ground truth piece of code, you now have multiple metrics that you can go about. And then obviously the final thing you can actually do is run the code. Right. So it's not just like a,
Starting point is 00:05:57 you know, when you measure some of these chat products, they actually, you know, the evaluation is a little bit different, which is to say the evaluation is you give it to multiple humans in a blind test, in an A/B test, and you ask them, which one did you like more? Obviously, for us to quickly evaluate, we can't be giving it to like tens of thousands of humans in a second. And with this now, within like minutes, we can get answers to what is the performance on tens of thousands of repositories and tests, basically. This episode is brought to you by Modal, the cloud platform that makes AI development simple. Need GPUs without a headache?
Starting point is 00:06:29 With Modal, just add one line of code to any Python function and boom. It's running in the cloud on your choice of CPU or GPU. And the best part, you only pay for what you use. With sub-second container start and instant scaling to thousands of GPUs, it's no wonder companies like Suno, Ramp, and Substack already trust Modal for their AI applications. Getting an H100 is just a pip install away. Go to modal.com slash pragmatic to get $30 in free credits every month. That is M-O-D-A-L.com slash pragmatic. This episode is brought to you by CodeRabbit, the AI code review platform transforming how engineering teams ship faster without sacrificing code quality.
Starting point is 00:07:08 Code reviews are critical, but time-consuming. CodeRabbit acts as your AI copilot, providing instant code review comments and potential impacts of every pull request. Beyond just flagging issues, CodeRabbit provides one-click fix solutions and lets you define custom code quality rules using AST Grep patterns, catching subtle issues that traditional static analysis tools might miss. CodeRabbit has so far reviewed more than 5 million pull requests, is installed in 1 million repositories, and is used by 3,000,000. 50,000 open source projects. Try CodeRabbit free for one month at coderabbit.ai using the code pragmatic. That is coderabbit.ai and use the code pragmatic.
Starting point is 00:07:51 I really like how much engineering you can bring in because it's code and because we have repositories because you can use all these things that feels to me it gives a bit of an edge to like some of the other use cases, just as you mentioned. No, I think you're totally right. We think about this a lot of what would have happened if we were to pick a different sort of category entirely. I think the ground truth is just very hard.
Starting point is 00:08:16 You don't even know if the ground truth is great, right, in some ways. In some cases, for all we know, the ground truth is not good. But in this case, I think it's a lot easier because of the verifiability. Or kind of if you have a good test, it's a lot easier to verify software. And can you give us a sense of what is the team behind Windsurf and also how complex this thing is? And how did it even come about? Because for all I know, you know, like a few months ago, when this podcast started, there was no Windsurf. There was Codeium.
Starting point is 00:08:44 We actually talked a bit about how Codeium was a little bit different. And then out of nowhere, boom, Windsurf comes out. A week later already in The Pragmatic Engineer, about 10% of people that we surveyed were already using it, which was, I think, the second largest usage of tools. And people were enthusiastic about it. But I assume there's more to this. It didn't just come out of, you know, like nothing, right? Yeah, so happy to talk a little bit about our story and summarize it.
Starting point is 00:09:11 So we started the company now close to four years ago, which is substantially before, I guess, the Copilot and ChatGPT sort of moment. A lot of us at the company, as I mentioned, previously worked, I would say, on these hard tech problems, you know, AR, VR, autonomous vehicles. And I guess at that point, what we started building out, and we had a different company name at that point, it was called Exafunction, we started building out GPU virtualization systems. So we built out systems to make it very fast and efficient to run GPU-based workloads. And we would enable companies to run these GPU workloads on CPUs. And we would transparently offload all GPU computations to remote machines. And that could be CUDA kernels all the way down to full-on model calls, right?
Starting point is 00:09:52 It was a very low-level abstraction that we provided people. And so much so that if the remote machine died, we'd be able to reconstruct the state of what was on that GPU on another GPU, right? And the main use case we targeted were these large-scale simulation workloads for these deep learning workloads that a lot of these robotics and autonomous vehicle companies had. And we thought, hey, the world was going to look like that in the future. A lot of companies would be running deep learning workloads. What ended up happening was in the middle of 2022, I think text-davinci-003 sort of came out, which was, I guess, the, you know, the GPT-3 sort of
Starting point is 00:10:26 instruction model sort of came out. And I guess that changed a lot of our priors, like both my and my co-founder's priors, which is to say, we thought that the set of models that would run were going to look a lot more homogeneous, right? If you were to imagine in the past, the number of different models that people would run was very diverse, right? People would run convolutional neural nets, right? Recurrent neural nets, LSTMs, right? Graph neural nets. There was a whole suite of different types of models. We thought in that case, hey, if we were an infrastructure company, we can make it a lot easier for these companies to run these workloads. But the thing is with text-davinci-003, we actually thought that there would be a simplification
Starting point is 00:11:01 of the set of models that would run. Why go out and train a very custom BERT model if you could go out and just ask a very large generative model, is this a positive or negative sentiment? And we thought that that was where the puck was going. I guess for us, like we believe in scaling laws and all these things. If it's this good today, how good is a much smaller model going to be in two years? It's probably going to be way better. So what we decided to do was actually focus on the application layer. Take the infrastructure that we had and actually build an application. And that was what Codeium was. So we built out extensions in all the major IDEs, right? And very quickly we were able to get to that point. And we actually did train our own
Starting point is 00:11:36 models and run them ourselves with our own inference stack. And the reason why we did that is at the time, the models were not very good. The open models are not very good. And also for the workload that we had, which was autocomplete, it was a very weird workload. It's not very similar to the chat workload. Code is in a very incomplete state. You need to fill in code in the middle of a line. There's a bunch of reasons why this workload is not very similar. And we thought we could do a much better job. So we provided that because of our infrastructure background for free to basically every developer in the world. There was no way to pay for the product. And then very quickly, enterprises started to reach out. We were able to handle the security requirements and personalization because the companies
Starting point is 00:12:11 not only care about, hey, it's fast, it's free, but is this the best code for my company? Right. And we were able to meet that workload. And then fast forward to today, and I know that this is a long answer, what we felt was agents in the beginning of last year would be very huge. The problem was the models were not there yet, right? We had internal, we had teams inside the company building these agent use cases, and they were just not good enough. But the middle of last year, we were like, hey, it's actually going to be good enough. But the problem is the IDE is going to be a limitation for us.
Starting point is 00:12:42 Because VS Code is not evolving fast enough to enable us to provide the best experience for end users. In a world in which agents were going to write 90% or 95% of software, developers would still be in the loop. But the way they would interact with their IDEs would look markedly different. And that's why we ended up building out Windsurf in the first place. We thought that there was a much higher ceiling on what IDEs could provide. And with the agent product, which is Cascade, we were able to deliver what we felt was a premier experience right off the bat that we couldn't have with VS Code.
Starting point is 00:13:12 How large is the team working on Windsurf and how complex is Windsurf as a product? Like, I'm not sure how much we can quantify it. Yeah. You know, I try to be pretty modest with some of these things, but just to say, we are a pretty small team. So right now, the engineering team is a bit over 50 people. Maybe that's large compared to other startups. But if I were to say compared to other, you know, large engineering projects in the grand
Starting point is 00:13:41 scheme of things, like one of the books that I read a while ago was this book called Showstopper, right? And it's this book about how Microsoft built Windows NT, right? And it's a much larger team, obviously, but operating systems are a very complex piece of software. But my viewpoint on this is that this is a very, very complex piece of software in terms of where the goalpost is, which is to say, I would say the goalpost is constantly moving, right? One of the goals that I give to the company is that we should be reducing the time it takes to build applications by 99%. Right. And I would say pre-Windsurf, it was probably 20,
Starting point is 00:14:15 and post-Windsurf, it was probably over 40. But we are very far from 99, right? We're still like, you know, a 60x away from 99, right? Like if there are 60 units of time and we want to make it one, we're quite far. So in my head, there's a lot of different engineering projects that we have at the company. In fact, like, I would say maybe close to half of the engineering team is working on projects that have not seen the light of day. Right. And that's like an interesting decision that I guess we've made because I think we cannot be
Starting point is 00:14:43 embracing incremental, right? Like, we're not going to win and be a valuable company to our customers if all we're doing is changing the location of buttons. Like, I think people will like us for great UI, but that cannot be the only reason why we win. No, and I love it. I mean, this is, you know, when you're a startup, I think you need to aim really big. You cannot settle for incremental. You can do incremental later. Hopefully you're going to get there. And what are some interesting numbers that you can share about the usage of Windsurf or the load that you're handling? I'm assuming this is just going to,
Starting point is 00:15:13 it's pretty easy to tell. It will keep going up, right? Yeah. That's an easy prediction. No, I think you're right. So one of the interesting numbers I can share, a handful of numbers, is within a couple months of the product, we had well over a million developers try the product. So it's been growing quite quickly. With pricing coming out, within a month, we reached over sort of eight figures in ARR. And I think all of those are kind of interesting metrics, but also on top of that, sort of
Starting point is 00:15:42 we run our own models still in a lot of places. Like, you can imagine the fast, passive experience is completely our own model. A lot of the models to go out and retrieve parts of the code base and find relevant snippets are our own models. And that system processes well over sort of 500 billion tokens of code every day right now. So that system itself is huge. It's a huge workload that we actually run. Yeah.
Starting point is 00:16:04 And I guess the history of Windsurf is interesting once I understand that you've actually been building your own models for quite some time. You know, you've not just started here, because I think for most engineering teams, that would be daunting. And also it's just a lot of time, right? It's not something that you would just, like, it's harder to do from scratch. I'll say it because nothing's impossible here. I totally agree with you.
Starting point is 00:16:28 I think, you know, one of the weird things is because of the time that we started and the fact that we were like in the very beginning, first of all, we had the infrastructure background, but we were first saying we need to go out and build an autocomplete model. The best model at the time that was open source, end of 2022, was Salesforce CodeGen. And I'm not saying it was a bad model. It was awesome that Salesforce did open-source that model, but it was missing a lot of capabilities that we needed for our product, right? It was missing fill in the middle, which feels like a very, very obvious capability, but the model is actually...
Starting point is 00:17:00 What is that? So the idea of fill in the middle is, basically, if you look at the task of writing software, it's very different than chat. And maybe an example of what chat is, you're always appending something to the very end and maybe adding an instruction. But the problem for writing code is you're writing code in ways that are in the middle of a line, in the middle of a snippet of code. Yeah, or that kind of stuff, yeah. In the middle of a function. And the problem there is actually, there's a lot of issues that pop up, which is to say actually the tokenization. So these models, when they consume files, they actually tokenize the files, right? Which is they don't consume them byte by byte. They consume them token by token. But the code, when you write it, at any given point doesn't tokenize into something that looks like it is in distribution. I'll give you an example. How many times do you think, in the training dataset for these models, does it see, instead of "return", only "retu" without the "rn"? Probably never. It probably never sees that. So it's completely out of distribution.
Starting point is 00:17:54 But we still need to, when we see "retu", predict that you are going to type "rn", a space, and a bunch of other stuff, right? It sounds like a very small detail, but that is very important if you want to build the product. And that is a capability that cannot be slightly post-trained onto the models. It's actually something where, like, you need to do a non-trivial amount of training on top of a model or pre-train to get that capability. And it was table stakes for us to provide that for our users.
Starting point is 00:18:16 So that forced us very early on to actually build out our own models and figure out training recipes and make sure we could run them at massive scale ourselves for our end users. And what are other things that are unique in terms of building models for code as opposed to the usual text models? I can think of things like the brackets, for example, in some languages. Maybe this is just naive. You have seen so many more. So what makes code interesting slash worthwhile to build your own model for? Yeah, I think what you said is definitely one thing: the fill-in-the-middle capability. I would say another thing you can do is code is quite easy to, and quite easy is maybe, you know,
Starting point is 00:19:02 an overstatement, but quite easy to parse, right? You could actually AST-parse code. You can find relationships in the code. Because code is a system that has evolved over time, you could actually look at the commit history of code to build a knowledge graph of the code base. And you can start putting... And do you do that? Yep. Yeah. Yeah.
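To make the fill-in-the-middle idea concrete, here is a minimal sketch of how an editor might turn a file and a cursor position into a model prompt. The sentinel strings and function names are placeholders invented for this example, not Windsurf's format; real models define their own special tokens. The prefix-suffix-middle layout shown is one common arrangement.

```python
# Placeholder sentinel strings (assumption); real models define their own special tokens.
FIM_PREFIX = "<fim_prefix>"
FIM_SUFFIX = "<fim_suffix>"
FIM_MIDDLE = "<fim_middle>"


def build_fim_prompt(document: str, cursor: int) -> str:
    """Arrange the text around the cursor so the model generates the missing middle."""
    prefix, suffix = document[:cursor], document[cursor:]
    return f"{FIM_PREFIX}{prefix}{FIM_SUFFIX}{suffix}{FIM_MIDDLE}"


def trailing_partial_word(prefix: str) -> str:
    """Return the unfinished identifier or keyword immediately before the cursor."""
    start = len(prefix)
    while start > 0 and (prefix[start - 1].isalnum() or prefix[start - 1] == "_"):
        start -= 1
    return prefix[start:]


# The developer has typed "retu" and stopped mid-keyword, with more code below the cursor.
code = "def total(xs):\n    retu\n\nprint(total([1, 2]))\n"
cursor = code.index("retu") + len("retu")

prompt = build_fim_prompt(code, cursor)
partial = trailing_partial_word(code[:cursor])  # the fragment the model must complete
```

Here `partial` is the half-typed `retu`: the prefix ends in a fragment that rarely appears on its own in training data, which is the out-of-distribution case described above.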
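A minimal sketch of mining commit history for relationships between files: it estimates, from the sets of files changed together in past commits, how likely one file is to change given that another one changed. The function name and input shape are assumptions for illustration, and this is not a description of Windsurf's actual system.

```python
from collections import Counter
from itertools import permutations


def co_change_probabilities(commits: list[set[str]]) -> dict[tuple[str, str], float]:
    """Estimate P(file b changes | file a changes) from the files changed in each commit."""
    file_counts = Counter()   # number of commits touching each file
    pair_counts = Counter()   # number of commits touching both a and b, keyed by (a, b)
    for files in commits:
        file_counts.update(files)
        pair_counts.update(permutations(sorted(files), 2))
    return {(a, b): n / file_counts[a] for (a, b), n in pair_counts.items()}


# Hypothetical history: each set lists the files touched by one commit.
history = [
    {"api.py", "api_test.py"},
    {"api.py"},
    {"api.py", "api_test.py", "docs.md"},
]
probs = co_change_probabilities(history)
```

With this toy history, `probs[("api_test.py", "api.py")]` is 1.0 because every commit that touched the test also touched the implementation, while the reverse direction is lower. The asymmetry is what makes the estimate useful as a retrieval signal.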
Starting point is 00:19:22 Yeah. We look at the previous commits. And one of the things that it enables us to do is build a probability distribution of the code base of conditional on you modifying a piece of code. What is the probability of you modifying another piece of code? So there is, you know, when you get into the weeds, code is very, it's very information dense, right? It's testable. There's a way that it evolves. People write comments, which is also cool, which is to say once a pull request gets created,
Starting point is 00:19:47 people actually say, I didn't like this code. So there's a lot of signal on what good and bad look like within a company. And you can use that actually as a way to automatically make the product much better for companies, right? You know, one of the things that I think all of us were talking about, I would say like a couple years ago, and I guess we've been here in this space quite a long time. I know a couple of years is not a very long time in most categories, but in this category, it's, you know, dinosaur years. So, you know, one of the things that I think is kind of interesting is, in the beginning, we were,
Starting point is 00:20:18 we were saying, hey, people would write all these guidelines and documentation on how best to use the product. But the interesting thing is code is such a treasure trove. You can go out and probably make a good first cut on what the best way to write software is inside JPMorgan Chase, inside Dell. You can go out and do that by using the rich history inside the company. So there's a lot of things that you can start doing autonomously as well, if that makes sense. Yeah. One thing I'd love to get your take on is how it might have changed. A year or two ago, when Copilot started to become more popular, again, an earlier version, companies like Sourcegraph and others had started to build other capabilities. There was this debate of,
Starting point is 00:20:59 would it be worth fine-tuning a model on my company's code base, talking about large companies, let's talk JPMorgan or those others. And there are two trains of thought. One said, like, oh, it's probably worth it because our code is so unique, it might be worth it. And some other people think it might not be worth it because it might be too resource intensive. The models are too generic. Did you try this out? And where did you land on this? Because I never got an answer to, you know, what happened, like what was worth it, what was not worth it in the end. So for what it's worth, we did try it out. We built out some crazy infrastructure to go out and try it out. I, you know, I guess this will be the first place where I talk about the actual infrastructure. We built out
Starting point is 00:21:40 systems. So transformers have these many layers, right? And if you were to imagine, when we actually enable companies to self-host, at some point in the past, we were enabling companies to self-host the system and the fine-tuning system as well. So at that time, you'll have self-hosting data. You build this out. We built out self-hosted not only deployment, but also fine-tuning. And the way that that actually worked was actually quite crazy, which was to say, okay, where do you get the capacity to fine-tune a model if you're already running it for inference? Like, the company may not want to give you so many GPUs. So we just said, hey, why don't we use the preemptible time, which is to say when the model is not running inference,
Starting point is 00:22:19 what if we actually go out and backpropagate, do backprops on the transformer model while this is happening? And then what we found was, oh, the backprops take a long time and it might cause downtime on the inference side. So what we did was we enabled the backprop to be preemptible on every layer of the transformer. So that's to say, let's say you sent an inference request and it's doing backpropagation. And it's on layer 10. It'll just stop at layer 10 and it will continue after your inference request completes. So we built a lot of crazy systems to actually go out and do this. I guess here's the thing we found. We found that fine-tuning was a bump, but it was a very modest bump compared to what great personalization and great retrieval
Starting point is 00:22:58 could do. That's what we found. Now, does that mean fine tuning in the future is not going to be valuable? I think actually per person fine tuning could actually work quite well. I think though maybe some of the techniques that we do it are going to need to change. And here's the way I'd like to look at it, right? Or anytime a system, you know, you build a system, there are many ways to improve it. Some of them are much easier than other ways. And you can imagine there's a hill to climb for everything. And some hills are much easier. And the right strategy to do when a hill is much easier and it provides a lot of value is climb that hill fully before you go out and do something that's a lot harder. Because when you do the thing that's a lot harder, you are like adding
Starting point is 00:23:33 some amount of tech debt if that's not the right solution. What I described to you in terms of the solution of doing backprop on a layer-by-layer basis, it's a cool idea. But you can imagine it added a lot of technical complexity to the software that might have been unnecessary if we thought that purely doing better retrieval was going to be much better. So I guess there's this tightrope to kind of, you know, balance on in how you decide these things. Now, I was asking around. I've been using Windsurf as well, but I'm not a very heavy user, so I have been asking heavier users. And one of the biggest criticisms, both of Windsurf and of every tool in this area, has been like, look, I start off, it's good.
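A minimal sketch of the layer-by-layer preemption described above, for readers who want to see the control flow. This is a toy simulation, not Windsurf's implementation: the class name `PreemptibleTrainer`, its methods, and the event log are all hypothetical, and the real backward computation is replaced by a log entry. It assumes only the rule stated in the conversation: a pending inference request runs before the next layer's backward step, and backprop resumes where it stopped.

```python
from collections import deque


class PreemptibleTrainer:
    """Toy scheduler: backprop advances one layer per step and yields to inference."""

    def __init__(self, num_layers):
        # The backward pass walks from the last layer down to layer 0.
        self.next_layer = num_layers - 1
        self.pending = deque()  # inference requests waiting to run
        self.log = []           # ordered record of what actually ran

    def submit_inference(self, request_id):
        self.pending.append(request_id)

    def step(self):
        """Serve waiting inference first, then backprop exactly one layer.

        Returns True while the backward pass still has layers left.
        """
        while self.pending:
            self.log.append(("inference", self.pending.popleft()))
        if self.next_layer < 0:
            return False
        self.log.append(("backward", self.next_layer))
        self.next_layer -= 1
        return self.next_layer >= 0


def demo():
    trainer = PreemptibleTrainer(num_layers=4)
    trainer.step()                     # backward on layer 3
    trainer.submit_inference("req-1")  # request arrives mid backward pass
    while trainer.step():              # inference runs, then layers 2, 1, 0
        pass
    return trainer.log


if __name__ == "__main__":
    for event in demo():
        print(event)
```

The point of checking the queue between layers rather than between whole backward passes is that the worst-case wait for an inference request is one layer's worth of work, which is the property the conversation describes.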
Starting point is 00:24:11 It works well. I have a relatively small context. My project grows, either because Windsurf generates code or it's just a big project. After a while, it starts to struggle with the context. Maybe it doesn't see, you know, part of it. It gets confused, et cetera. And clearly, as an engineer who understands that, it is going to be a problem of, like, you have a growing context window. You still want to have similar quality. How do you tackle this challenge?
Starting point is 00:24:37 What progress have you made? And I think this is a bit of a million dollar question in the sense of like if we could somehow have a solution for this, we would be better off. Where have you gotten on this? I'm assuming this is a pretty common challenge. I think it's a very hard problem. You're totally right. There's a lot of things that we can do, which is to say, obviously, we need to work around the fact that the models don't have infinite context. And when they do have larger and larger context, you are paying a lot more and you take a lot more time.
Starting point is 00:25:06 Right. And developers usually, a lot of the time, don't really want to wait. And, you know, one of the things that we have for our products... We hate waiting. Yeah, exactly. But one of the things that we've learned for our products is if you make a developer wait, the answer better be 100% correct. And I don't think we're at a time right now where I can guarantee you, with like a magic wand, that all of our Cascade responses are 100% correct. Right. I don't think we're at that point right now. So there's a lot of things that we need to do that are almost approximations, right? How do we keep a very large context? But despite that, we have chats that are so long that, how do you accurately checkpoint the past conversation? And that has some natural lossiness attached to it. And then similarly, if the codebase gets very large, how do we get very, very confident that the retrieval is very good? And we have evaluations for all of these things, right? This is not something where we're shooting in the dark and being like, hey, YOLO, let's try a new approach and give it to half of our users. But I think you're totally right. I don't think there's like a complete solution for it. What I think it's going to be is a mixture of a bunch of things, which is to say much better checkpointing,
Starting point is 00:26:08 coupled with better usage of context length, much faster LLMs and much better models. So it's not going to be, I think, a silver bullet. And by the way, that could be tied with, hey, you know, understanding the code base much better, given that the code base already exists: being able to use the knowledge graph, right? Being able to use a lot of the dependencies within the code base a lot better. So it's a bunch of things that I think are going to multiply together to solve the problem. I don't think there's going to be like one silver bullet that makes it so you're going to be able to have amazingly coherent conversations that are very, very long, basically. To be fair, as an engineer, this might feel weird, but it makes me feel a bit better.
Starting point is 00:26:46 We're actually back to talking about like engineering step by step as opposed to, you know, it feels like you get a new model. Not now, but early on when we got a new model, it was like, oh my gosh, just magic. And it took a while to understand how it works, how it's broken down, et cetera. You did mention your infrastructure. Can you talk a little bit about how we can imagine your hardware and backend stack if I was to join Windsurf as an engineer? Like, is it going to be a bunch of cloud deployments here and there? Do you self-host some of your GPUs?
Starting point is 00:27:19 Because a lot of AI startups who are smaller or more modest, they're just going to, you know, platform as a service. It sounds like you might be at the scale where maybe you're outgrowing this as well. Yeah, I think we might have just never done kind of, you know, buying off the shelf stuff in the, in the early part of the company. Oh, yeah. Your background, I keep forgetting this. Yeah, but even more than the background, I think there were cases where we could have and maybe should have. One of the reasons why we also didn't was very quickly we got brought into working with very large enterprises. And I think the more dependencies you have in your software, it just makes it harder and harder for these larger companies to integrate the technology.
Starting point is 00:27:57 right? Like they don't want a ton of sub-processors attached to it. We recently got FedRAMP High compliance. We're the only AI software assistant with FedRAMP High compliance. And the only reason why that's the case is we've kept our systems very tight, right? And for these compliances, I did some, but not specifically FedRAMP. What do you need to prove that you are this compliant? Yeah, I think basically you need to map out the high levels of sort of all the interactions. You need to be very methodical about releases and how the releases make it into the system. You need to be very methodical about where data is persisted, at a layer that is like probably
Starting point is 00:28:36 much deeper than SOC 2. I think like going through the SOC 2 versus FedRAMP... I did SOC 2 and that was already pretty long. So it was already really long. Yeah. It's impressive that you did this as a startup slash scale-up, congrats. Yeah. One of the reasons why was I guess like one of our first customers that was a large enterprise
Starting point is 00:28:54 was like Dell, which is like not a usual first large enterprise. For a startup, no. For a startup, definitely no. So it forces us down a path of how do we build very scalable infrastructure? How do we make sure our systems work on a code base that is 100 plus million lines of code? What does our GPU provisioning need to look like for this larger team? It just forces us to become a lot more, I guess, operationally sound for these kinds of problems, if that makes sense.
Starting point is 00:29:19 Yeah. And how do you deal with inference? You're serving the systems that serve probably billions or hundreds of billions, well, hundreds of billions tokens per day, as you just said, with low latency. What smart approaches do you do to do this? What kind of optimizations have you looked into? Yeah, I mean, like a lot, as you can imagine. One of the interesting things about some of the products that we have, like the passive
Starting point is 00:29:46 experience, latency matters a ton in a way that's very different than some of these API providers. Like I think for the API providers, time to first token is important. But it doesn't matter that time to first token is 100 milliseconds. For us, that's the bar we are trying to look for. Can we get it to sub, you know, a couple hundred milliseconds and then hundreds of tokens a second for the generation time? So much faster than what all of the providers are providing in terms of throughput as well, just because of how quickly we want this product to kind of run. And you can imagine there's a lot of things that we want to do, right? How do we
Starting point is 00:30:18 do things like speculative decoding? How do we do things like model parallelism, right? How do we make sure we can actually batch requests properly to get the maximum utilization of the GPU, all the while not hurting latency, right? That's an important thing. And one of the interesting things, just to give some of the listeners a mental model: GPUs are amazing. They have a lot of compute. If I were to draw an analogy to CPUs, GPUs have over sort of two orders of magnitude more compute than a CPU, right? It might actually be more on the more recent GPUs, but keep that in mind. But GPUs only have an order of magnitude more memory bandwidth than a CPU. So what that actually means is if you do things that are not compute intense, you will be
Starting point is 00:30:58 memory bound, right? So that necessarily means to get the most out of the compute of your processor, you need to be doing a lot of things in parallel. But if you need to wait to do a lot of things in parallel, you're going to be hurting the latency. So there's all of these different tradeoffs that we need to make to ensure a quality of experience for our users that we think is high for the product. And we've obviously mapped out all of these. We've seen how, hey, like, if we change the latency by this much, what is this change in terms of people's willingness to use the product? And it's very stark, right? Like a 10 millisecond increase in latency affects people's willingness to use the product materially, right? It's percentage points that we're talking about. So these are all
Starting point is 00:31:35 parts of the inference stack that we've needed to optimize. And is latency important enough that location factors into this? Physically, how close the people using Windsurf are to wherever your servers and your GPUs are running. Do you need to worry about that as well? You do need to worry about that. The speed of light starts mattering. Interestingly, you know, this is not something I would have expected, but we do have users in India.
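To make the compute-versus-memory-bandwidth point from a moment ago concrete, here is a small worked example using the standard roofline bound (attainable throughput is the smaller of peak compute and bandwidth times arithmetic intensity). The 100x compute and 10x bandwidth ratios follow the conversation; the absolute numbers, the one-FLOP-per-byte figure for a single request, and the assumption that intensity grows linearly with batch size are illustrative assumptions, not measurements of any real chip or of Windsurf's serving stack.

```python
import math

# Illustrative numbers only. The ratios (100x compute, 10x bandwidth) follow
# the conversation; the absolute values are assumptions.
CPU_PEAK_FLOPS = 10**12                 # FLOP/s
CPU_BANDWIDTH = 10**11                  # bytes/s
GPU_PEAK_FLOPS = 100 * CPU_PEAK_FLOPS   # two orders of magnitude more compute
GPU_BANDWIDTH = 10 * CPU_BANDWIDTH      # one order of magnitude more bandwidth


def attainable_flops(peak_flops, bandwidth, flops_per_byte):
    """Roofline bound: capped either by compute or by memory traffic."""
    return min(peak_flops, bandwidth * flops_per_byte)


def batch_to_saturate(peak_flops, bandwidth, flops_per_byte_per_request):
    """Smallest batch size at which the workload stops being memory bound.

    Assumes the same weights are read once and reused for every request in
    the batch, so arithmetic intensity scales linearly with batch size.
    """
    ridge = peak_flops / bandwidth  # FLOPs per byte needed to hit peak compute
    return math.ceil(ridge / flops_per_byte_per_request)


if __name__ == "__main__":
    single = attainable_flops(GPU_PEAK_FLOPS, GPU_BANDWIDTH, 1)
    print("GPU utilization at batch 1:", single / GPU_PEAK_FLOPS)
    print("CPU batch to saturate:", batch_to_saturate(CPU_PEAK_FLOPS, CPU_BANDWIDTH, 1))
    print("GPU batch to saturate:", batch_to_saturate(GPU_PEAK_FLOPS, GPU_BANDWIDTH, 1))
```

Under these assumptions the GPU needs roughly ten times more parallel work than the CPU before its compute is fully used, which is the tension described in the conversation: waiting to collect that much work is exactly what hurts latency.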
Starting point is 00:32:02 And interestingly, the speed of light is not actually what is bottlenecking their performance. It's actually the local network. So just the time it takes for the packet to get from maybe like from their home to the major ISP is actually somehow there's a lot of congestion there and that's the kind of stuff that we need to kind of deal with. But by the way, that is something that we just cannot solve right now. So you're totally right. The data center placement matters. Like for instance, if you force a data center in Sydney and you have people in Europe, they're not going to be happy about the latency. So we do think about like where the location of our GPUs are to make sure that we do
Starting point is 00:32:37 have good performance. But there are some places where there are some issues that even we can't get around, basically. No, the last time I heard this complaint was before Windsurf. And this came up again, actually: someone who's using Windsurf and other tools a lot said that specifically for one of the tools, he can tell that the data centers are far away because it's just slow. Cloud development environments had the exact same thing because they were similar, right? Like, I'm not sure they're as popular right now, but there was a time where it looked like it might be the future.
Starting point is 00:33:06 You just log on to your remote environment, which is running on CPUs or GPUs somewhere else. And again, I think it might have to do with when you're typing. Like, when I'm using it, I mean, I'm just used to, like, I do want sub-second, probably like a few hundred milliseconds. I just notice, right? You feel it's slow and it just bothers you. No, I agree. I think if I had to even see, every time I typed a keystroke, like a couple
Starting point is 00:33:33 hundred milliseconds later, the key would show up, like, I would rage quit. That would be a terrible experience for me. How do you deal with indexing of the code? So you're going to be indexing, you know, depending on the code base, it'll be more or less. But if you add it up, I'm sure we're talking billions or a lot more in code. And for your enterprise customers, you might actually have, you know, hundreds of millions or even more lines of code. Is there anything like novel or interesting that you're using?
Starting point is 00:34:01 Or is it just kind of tried and proven things, for example, that search engines might use? It's a little bit of both, to be honest. And what I mean by that, it's not a very clean answer. we do try approaches that are embedding based. We have approaches that are keyword based on the indexing. Interestingly, actually, one of the approaches that we've taken that's very different than search, and maybe actually systems like Google actually do this, is we not only actually look at just the retrieval, we do a lot of computation at retrieval time.
Starting point is 00:34:32 So what that means is let's say you want to go out and ask a question. One of the things that you can go out and do is ask it to an embedding store and get a bunch of locations. What we found was the recall of that operation was quite low. And, you know, one of the reasons why that happens is embedding search is a little bit lossy. Like, let's say I was to go to a code base and ask, hey, give me all cases where this function, this spring boot version X type function was there. I don't think anyone would believe embedding search would be comprehensive. Yeah.
Starting point is 00:35:03 Right? Because it's just not. Like, you're taking something that is very high dimensionality and reducing it to something very low dimensionality without any knowledge of the question. That's like the most important thing. So it needs to somehow encode everything that could be relevant for all the possible questions. So instead what we decided to do is take a variety of approaches to retrieve a large amount of data, and that could include the knowledge graph,
Starting point is 00:35:25 that could include the dependencies from the abstract syntax tree. That could include like keyword search. That could include embedding search. And you kind of fuse them all together. And then after that, we throw compute at this and actually go out and process large chunks of the code base at inference time and go out and say, hey, these are the most relevant snippets, and this gives us much higher precision and recall, right, on the retrieval side.
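The retrieve-widely-then-spend-compute idea described here can be sketched in a few lines. This is an illustration under loud assumptions, not Windsurf's pipeline: the three-file `DOCS` "codebase" and every function name are hypothetical, character-trigram overlap stands in for a real embedding search, and the final sort stands in for the much more expensive inference-time pass over the candidates.

```python
import re

# Hypothetical mini "codebase" used only for this illustration.
DOCS = {
    "auth.py": "def retrieve_user(user_id): return db.get(user_id)",
    "billing.py": "def charge_card(amount): pass",
    "users.py": "class User: name = ''",
}


def tokens(text):
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def trigrams(text):
    t = text.lower()
    return {t[i:i + 3] for i in range(len(t) - 2)}


def keyword_search(query, docs, k):
    """Exact-token retriever: precise, but a typo in the query returns nothing."""
    q = tokens(query)
    hits = [(len(q & tokens(text)), doc_id) for doc_id, text in docs.items()]
    hits = sorted((h for h in hits if h[0] > 0), key=lambda h: (-h[0], h[1]))
    return [doc_id for _, doc_id in hits[:k]]


def fuzzy_score(query, text):
    """Trigram Jaccard similarity: a crude stand-in for embedding similarity."""
    q, d = trigrams(query), trigrams(text)
    return len(q & d) / len(q | d) if q | d else 0.0


def fuzzy_search(query, docs, k):
    hits = [(fuzzy_score(query, text), doc_id) for doc_id, text in docs.items()]
    hits = sorted((h for h in hits if h[0] > 0), key=lambda h: (-h[0], h[1]))
    return [doc_id for _, doc_id in hits[:k]]


def fuse(*ranked_lists):
    """Union of candidates from every retriever, keeping first-seen order."""
    fused = []
    for ranked in ranked_lists:
        for doc_id in ranked:
            if doc_id not in fused:
                fused.append(doc_id)
    return fused


def retrieve(query, docs, k=2, score_fn=fuzzy_score):
    """Gather a wide candidate set, then rerank it with a costlier scorer."""
    candidates = fuse(keyword_search(query, docs, 10),
                      fuzzy_search(query, docs, 10))
    return sorted(candidates, key=lambda d: (-score_fn(query, docs[d]), d))[:k]


if __name__ == "__main__":
    print(keyword_search("retreive_usr", DOCS, 10))  # typo defeats keyword search
    print(retrieve("retreive_usr", DOCS))            # fused retrieval still finds it
```

The design choice being illustrated is that each retriever only has to contribute recall; precision comes from the second stage, which can afford to look at the query and the candidate together.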
Starting point is 00:35:50 And by the way, that is very important for an agent, because imagine if an agent kind of doesn't have access and doesn't deeply understand the code base, all the while the code base is much larger than the context length of what an agent is able to take in, right? So, you know, optimizing the precision and recall of the system is actually something that we spent a lot of time on and built a lot of systems for. It's interesting because it shows how, well, A, it's code. So you can more easily work with it, especially with certain keywords, for example, in some languages. I can imagine that you can even just, you know, you can even list all the keywords that are pretty common.
Starting point is 00:36:20 And you can decide if it's a keyword or if it's something special, and if it's a keyword, you can already just like do it. And it's interesting how you can combine the kind of old school, or old school before LLMs, and then add the best parts of LLMs, but not forgetting about, you know, what worked before. I just wonder if there's any other industry that has this, where we have this lower-dimensionality space in terms of the grammar and all these things. We understand the usage pretty well. And then the users are power users who actually, you know, the same people use it who could actually build, you know, this tool. Yeah. You know, I feel like Google's system is probably ridiculously complex and sophisticated for obvious reasons, just because, for one, they've been doing this for so long, and, obviously,
Starting point is 00:37:11 they've been at the top for such a long time. And then also on top of that, the monetary value they get from delivering great search is so high given ads that they are incentivized to throw a lot of compute even at query time, right, to make sure that the quality of suggestions is really good. So I assume they're doing a lot of tactics there. Obviously, I'm not privy to all the details of the system. So I don't know. Well, it's interesting because I would have agreed with you until recently, but there are some search engines that are getting really good results. So I wonder if Google is less focused on actually the haystack and the needle and maybe more on revenue, or maybe they're doing it invisibly. I'm sure they're doing an amazing job, by the way, under the hood.
Starting point is 00:37:52 But I wonder if some of that knowledge has commoditized. But, you know, we'll see. But moving on from indexing, in terms of databases, what kind of databases do you use and what challenges are they giving you? Like, again, I'm assuming you're not just going to be happy with like the usual, let's store everything in Postgres. Or do you, actually? You might be able to. I don't know. Sounds like these days Postgres can work surprisingly well even for embeddings.
Starting point is 00:38:17 Yeah. You know, I think we do a combination of things. So we do like some amount of local indexing. We do some remote indexing as well. Local indexing as in on the user's machine. Nice. In some ways, the benefit of that is it helps you build up, if you were to say, hey, some understanding of the code base. The problem is that
Starting point is 00:38:38 understanding changes very quickly as the user starts changing code, starts checking out new branches. And you don't want to basically say all of your information about the code base, you need to throw away because of that. So it's good to have some information about the user's history and what they've done locally, and kind of store it. In terms of remote, I think it would be a lot simpler than people would imagine. One of the complexities of our product, the reason why the product is very complex, is actually the fact that we need to run all of this GPU infrastructure, right? That's actually a large chunk of the complexity because if you were to look at our QPS, our QPS is high,
Starting point is 00:39:10 but it is not like tens of thousands of QPS, right? Actually, it's not, it doesn't need to be that high because in some ways, in some ways, like actually each of the queries that is happening is actually a really expensive query. It's doing trillions of operations remotely. So actually the complexity of the problem is how do you optimally do that process, right? So we can actually get away with things like Postgres. Like we're not, in fact, I would say I like to keep things pretty simple, if it's possible to keep things very simple.
Starting point is 00:39:37 And we should not be rolling any type of our own database. Like I think databases are very, very complex pieces of technology. I think we're good engineers, but we're definitely not good enough to kind of like, on the side, build our own database. And then for local indexing, what database do you use? Yeah, we have our own combination of like sort of SQL-based database. We have a local SQL database. And then like some sort of embedding databases as
Starting point is 00:39:59 well that we store locally. What is your view on the value of embedding databases? This has been an ongoing debate since ChatGPT became big. Again, there were two schools of thought. One is we do need embedding databases because they can give us vector search. They can give us all these other features that LLMs and embeddings will need. And the other school of thought is, well, let's just expand relational databases. We add a few extra indexes.
Starting point is 00:40:23 And boom, we're done. From your, you know, you're more the user of this, but you're a heavy user at Windsurf and Codeium. What pros and cons are you seeing? I'm just trying to get you to go in one direction or the other. It's a good question. So our viewpoint on embeddings is probably that they don't solve the problem by themselves. They actually just do not. So the answer is going to be mixed. And then the question is, why do we even do it in the first place, right? And I think it really boils down to, it's a recall problem. When you want to do good retrieval, you need the input to what
Starting point is 00:40:58 you're willing to consider to be large and high recall, right? And if you were to think about it, the problem is if you only have something like keyword search and you have a very, very large code base, actually, what happens if the user typos something? Right? Then your recall is going to be bad. So the way I like to think about it is each of these approaches, keyword search, right, knowledge-graph-based retrieval, all of them, they're all different circles. And what you're trying to do is get something where the union of these circles is going to give you the highest recall ultimately for the retrieval query. And I think embeddings can give you good recall because they are able to summarize, or actually distill, some amount of semantic information
Starting point is 00:41:39 about the chunk of code, the AST or the file or the directory and all this other stuff. So what I would say is it's a tool in the toolkit. Like, you cannot build our product entirely with an embedding system, but does the embedding system help? I think it actually does help, right? It does improve our recall metrics and our precision metrics. So I talked with your head of research, Nicholas Moy, and he told me about a really interesting challenge that you're facing, which he called the split brain situation. He was basically saying that it's almost like the team and everyone in the team needs to have two brains. One is just being aggressively in the present, shipping improvements as you go, but also then having a long-term vision where you're building for the long run. How do you do this?
Starting point is 00:42:22 Like how did you start doing it and how do you keep doing it? You did mention earlier, right, that half the team is working on other stuff. But do you kind of split people so that some focus on the short term and some on the long term, or does everyone, including you, juggle these things in your head day to day? It's an interesting one. Yeah, I don't want to give myself that much credit here. I think our engineers probably should be given most of the credit here. But I think in terms of like maybe the company's strategic direction, both me and my co-founder, the CTO of the company, we try to think a lot about how do we disrupt ourselves?
Starting point is 00:42:57 Because I think it's very easy to get into the state where, hey, I added this cool button. I added this way to control X with a knob. And you keep going down this path. And yeah, your users get very happy. But what happens if tomorrow I told you users don't need to do that? And it's an amazing experience. And it's like a better experience. Users are going to feel like why do I need to do this?
Starting point is 00:43:18 So here's the thing. Users are right up to a certain point. Right? they will not be able to see, like, by the way, if they can, then we should not be doing this. They will not be able to see exactly what the future solution should be. If our users can see the future solution better than we can, like, we should just pack up our bags and leave, right, at that point. Like, what are we actually doing here? So I think basically, you know, you have this tension here where you need to build features to make the product more usable today, right?
Starting point is 00:43:46 And our users are 100% right. They understand this. They face pain through many different axes that we don't. and we should listen to them. But also at the same time, we might have an opinionated stance on where coding and where these models and where this product can go
Starting point is 00:44:00 that we should go out and build towards. And we should be expending a large amount of our engineering capital on that. And can you talk about like some kind of bets that you're having? You know, not that you're giving away everything,
Starting point is 00:44:15 but like some promising directions that might or might not work out, or even in the past, some promising things that maybe did not work out. Yeah, I'll tell you, a lot of them. Yeah, so we failed a lot. And I think failing is great. And one of the things that I tell our engineers is like, engineering is not like a factory building, right? It's actually, you know, you have a hypothesis, you go in and you shouldn't be penalized
Starting point is 00:44:37 if you failed. Actually, I love the idea of, hey, an idea sounds interesting. We tried it, and it didn't work, because we at least learned something. And learning something is awesome. And I'll give you an example: the agent work that we did. We didn't even start at the beginning of last year. We started even before the beginning of last year. It was not working for many months. And actually, Nick Moy, who you probably spoke with, was the one who was working on some of this stuff. And for a long time, a lot of what he was doing was just not working.
Starting point is 00:45:04 And he would come to us and we would say, okay, fine, it doesn't seem like it's working. So we're definitely not going to ship this. But let's keep doing it. Let's keep working on it. Because we believe it's going to get better and better. But it was failing for a long time, right? We came out with a review product, right, at the beginning of last year or around then, called Forge, for code reviews. We thought it was kind of useful internally at the company and we thought we could continue to improve it.
Starting point is 00:45:28 People did not find it that useful, right? It was not actually that useful. And, you know, we were going in with the assumption: code reviews take a long time. What if we could help people? And the fact of the matter was the way we thought we could help people wasn't actually material enough for people to want to take on this new tool. Right. And there's a lot of things that obviously we've tried in the past that just didn't work the way we thought they would. And you know, for me, I think I would be totally fine if 50% of the bets we make don't work. Yeah, and a lot of startups say that, and then after a while, what I notice is as a company becomes bigger, I saw this at Uber, it's actually not really the case. Failure is kind of, on paper, embraced, but actually it's not. So I think, you know, like there's
Starting point is 00:46:13 this tricky thing that when it's actually meant, it's awesome. Otherwise people just start to polish things and make things look good when they're not. Pretend that it's not a failure, but it was a success and we're just walking away, that kind of stuff. So it's nice to see that you're doing it. What was the thing that turned the agents around, which then I assume became Cascade? Like, was it a breakthrough on your end? Was it the models getting better? Was it a mix of something else? Yeah. I think it was a handful of things. So I'll walk through it. So first of all, the models got better. 100% the models got better. I think even with all the internal breakthroughs we had, if the models hadn't gotten better,
Starting point is 00:46:47 we wouldn't have been able to release this. So I don't want to trivialize that matter. It was huge. The two other pieces that were quite important: our retrieval stack was also getting better and better, which I think enabled us to work much better on these larger codebases. I guess table stakes, it's quite good at zero-to-one programming. But I think the thing that was groundbreaking to us was our developers on a complex code base were getting a lot of value from it.
Starting point is 00:47:12 Right? And I would say something quite interesting, which is that ChatGPT by itself wasn't incredibly groundbreaking to our developers inside the company. And that's not because ChatGPT is not a very useful product. ChatGPT is a ridiculously useful product. It's actually just because you need to think about it from the perspective of opportunity cost and how much more efficient you get. Our developers, a lot of them have been developers in the past. I think we do have an exceptional engineering team. They were used to how to use Stack Overflow and all these other tools to get what they wanted. But suddenly, when the model had the capability
Starting point is 00:47:47 to not only understand your code base but start to make larger and larger changes, it changed the behavior of the people inside the company. And not only make changes, we built systems to very quickly edit the code, right? The ability to edit code: we built the kind of models to take a high-level plan
Starting point is 00:48:05 and make an edit to a piece of code very fast. So all of these together made it so that this was a workflow that our developers wanted to use. Right? We had the speed covered, we had the fact that it understood the code base well, and then we also had massive model improvements to actually be able to call these tools and make these iterative changes. Right? That's like, you know, I don't want to diminish that. You have all of these and suddenly now you have a real product. I've been meaning to ask you this, but how
Starting point is 00:48:30 is the team, you know, using Windsurf to develop Windsurf? Because you're doing it, right? You just told me how you're doing it. From two perspectives. One, the technical feasibility: I'm assuming, you know, you're not going to work on the exact same code base, or you have a fork or a build or something like that. And then on the other hand, you know, do you kind of force people to dogfood? Do people just do it? Do people get stuck on certain versions? Do they turn on features for themselves, etc.? So the way we do it is we do have like an insider's developer mode. So this enables us to test new features. I guess anyone at the company should be able to create a feature and then deploy to
Starting point is 00:49:10 everyone internally. And now that we have a large number of developers, we'll get feedback. We have an ability for our own developers to dogfood new releases. We can have our own developers say, I hate this thing. Please don't ever do this. And it's nice because then we don't need to give it to other developers, only our own. So I think we have this tiered system at the company. We have our own sort of release. We have Next, which is future-looking products that we are releasing that are a little bit more raw. And then we have like the actual release that we give to developers. We're willing to A/B test things, but we're not willing to A/B test things in such a way where we give people
Starting point is 00:49:44 a comically bad experience just to A/B test them. Right? Like, it's bad because people are using this for their real work. So if you're using it for real work, we don't want to be hurting you. So I think one of the things that's quite valuable to us is probably, you would think this is a failure mode for our company,
Starting point is 00:50:00 which is that we use Windsurf, largely speaking, to modify large code bases, right, for obvious reasons, because I think our developers aren't building these toy apps over and over again. But crazily enough, one of our biggest power users inside our company is actually a non-developer. He leads partnerships. He's never written software before. And he routinely builds apps with Windsurf. Right. And he's one of our biggest users inside the company. And we've used this to actually replace buying other SaaS tools.
Starting point is 00:50:26 And he's actually even deployed some of these tools inside the company. What function is this person in? It's partnerships. So I'll give you an example of some of the tools. Some of the tools, these are not complex pieces of software, but you would be surprised at how much they actually cost. They're six figures in cost because it's bespoke software, right? I'll give you an example. You have a quoting tool. So the idea of a quoting tool is you have a customer, the customer has this size,
Starting point is 00:50:47 they're in this vertical, you know, they want this kind of deal. Here's the way it would look. Here's the amount of discount we're willing to give them as a customer. And usually these systems are really like systems that you would need to pay a lot of money for. And the reason is because, I don't know, like there's no reason for our developers to go out and build this internally, right? It's a big distraction from going out and building our product.
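To make the quoting tool concrete, here is a minimal Python sketch of the kind of stateless, input-to-output app being described: customer size, vertical, and deal type go in, and a quote with a discount comes out. Every name and number below (the `Customer` fields, the discount tiers, the cap) is invented for illustration and is an assumption, not Windsurf's actual pricing logic.

```python
from dataclasses import dataclass

# Hypothetical discount rules; the tiers and percentages are made up for illustration.
SIZE_DISCOUNTS = [(1000, 0.20), (100, 0.10), (0, 0.0)]  # (minimum seats, discount)
VERTICAL_DISCOUNTS = {"education": 0.10, "nonprofit": 0.15}
MULTI_YEAR_DISCOUNT = 0.05
MAX_DISCOUNT = 0.30  # assumed cap: never discount more than 30%


@dataclass(frozen=True)
class Customer:
    name: str
    seats: int
    vertical: str
    multi_year: bool = False


def discount_for(customer: Customer) -> float:
    """Add up the size, vertical, and deal-length discounts, capped at MAX_DISCOUNT."""
    # Tiers are sorted from largest to smallest, so the first match is the best tier.
    size_discount = next(d for min_seats, d in SIZE_DISCOUNTS if customer.seats >= min_seats)
    vertical_discount = VERTICAL_DISCOUNTS.get(customer.vertical, 0.0)
    term_discount = MULTI_YEAR_DISCOUNT if customer.multi_year else 0.0
    return min(size_discount + vertical_discount + term_discount, MAX_DISCOUNT)


def quote(customer: Customer, list_price_per_seat: float) -> dict:
    """Pure function: the same inputs always produce the same quote; nothing is stored."""
    discount = discount_for(customer)
    total = round(customer.seats * list_price_per_seat * (1 - discount), 2)
    return {"customer": customer.name, "discount": discount, "total": total}
```

Because the whole tool is a pure function over its inputs, a reviewer inside the company can check the rules table and the cap without needing to reason about stored state, which fits the "ephemeral, quite stateless" description of these apps.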
Starting point is 00:51:10 But now, on the other hand, you have a domain expert in the person that actually runs partnerships. He doesn't know software, but he knows this really well, right? And because of that, he's able to create these apps really quickly. And granted, we do have a person inside the company that looks at the app, make sure that it logistically makes sense. It's secure, can be deployed inside the company. But these are more ephemeral apps, right? They're quite stateless.
Starting point is 00:51:33 If you were to look at the input/output of this app, it is not as complex as, like, let's say, the Windsurf project, right? Yeah. But we now have this, like, growing set of people inside the company that are not developers that are getting value from it, which we found a little surprising too. Yeah. And can you also give maybe just like some other examples of what you think it might be replaced?
Starting point is 00:51:54 The reason being this, like, I'm actually really interested in this because I do hear a lot of people either or social media or CEO saying that SaaS apps could be the end of it. And I've always been skeptical for the reason that, you know, there's two types of SaaS apps. And the most of SaaS apps I see, for example, Workday, which is an HR platform. And they will have hosting. They will have business rules. They will, like, update to some extent with regulations and all that stuff. So they do a lot of stuff that is the UI.
Starting point is 00:52:18 I know we can trivialize, but it's a lot more than that. And then there are a few of these simpler ones. Like, I don't know what things, but there's like a polling app or internally inside the company you can poll. It has state, but it's relatively simple. You can see behind it. It's just going to be, I could build it, but I just don't want to deal with authentication to host it inside the company, but it's already there. And then there's ones you mentioned that are stateless.
Starting point is 00:52:43 So like what kinds of SaaS tools do you see that you're replacing? And you might see other companies potentially using, you know, tools like this to actually, with one dedicated helper developer, build it internally, bring it in house. Yeah. You know, I think it's hubris to believe that products like Workday and Salesforce are going to get replaced by this. I think you're totally right.
Starting point is 00:53:07 These products have a lot of state. They encapsulate business workflows. There's actually for a product like Workday, probably compliance that you need to do because of how business critical the system is. So this isn't the kind of system that this would replace. It probably falls in the latter two categories and probably even just the last one, which is to say kind of these stateless systems that don't do writes to the most business critical parts of your databases.
Starting point is 00:53:31 It's probably actually those kinds of systems that very quickly can get replaced. And I would say there's a new category: think about the amount of software that would benefit a business that just isn't getting created that now could get created, right? And the reason why that software couldn't get created is a company couldn't be created that would be able to sustain itself, that would have an economic, a business model that would justify it existing. But now since the software is very easy to create, these pieces of software are going to proliferate, right? And one of the things that I'd like to talk about for software is there's a little bit of a, we've been, you know, because the cost of building software was a lot higher, right? Of simple software, it was a lot higher. Right now for a front end, we have to admit, it's gotten a lot cheaper to build a basic front end system, right? Radically. Radically cheaper. So I think the way I would sort of look at it is for these kind of systems, what are you really
Starting point is 00:54:22 paying for when you pay a SaaS vendor? You're not only paying for your product. You're paying for the maintenance. You're paying for the fact that actually, you know, this company actually is building a bunch of other features that you don't need. And the reason why is because they need to support a bunch of customers, but you're still paying for that R&D. You're paying for their sales and marketing, a bunch of other stuff there. So my viewpoint is if you can build custom software for yourself, that is not very complex, but helps you in your own business processes, I think that might proliferate inside companies. And that might actually cause a whole host of kind of companies that fall into that category. It is simple business software, that feels largely stateless.
Starting point is 00:55:01 to kind of have trouble unless they like kind of reinvent themselves. Yeah. And I guess, you know, one obvious reinventing that could happen later is once this happened, let's just continue this thought of like companies are building a lot of internal software. They, they might start to have some similar problems of, let's say, you know, three, five years that underwood maintenance, storage, compliance, just going through if they're working, re-evaluating if it makes sense to actually bring it into something.
Starting point is 00:55:30 So like this could create a lot of new opportunities for other software businesses or software developers or, you know, maybe these companies or maybe a new job role in software engineering, which is, you know, I'm now specialized in, I built so many of these apps and I can help you with them. Who knows? No, I think, I think that like a lot of people talk about how we're going to have like way fewer software engineers in the near future. I think it feels like it feels like it's people that hate software engineers, largely speaking,
Starting point is 00:55:58 that say this. it feels like pessimistic not only towards these people, but I would say just in terms of what the ambitions for companies are. I think the ambitions for a lot of companies is to build a lot better product. And if you now give the ability for companies to now have a better return on investment for building technology, right? Because the cost of building software has gone down. What should you be doing?
Starting point is 00:56:19 You should be building more. Because now the ROI for software and developers is even higher. Because a singular developer can do more for your business. Right? So technology actually increases the ceiling of your company much faster. Yeah. And I'm going to just double click on that because, like, you know, you have been building Windsurf, and you've been building these tools, but you've also worked with the team, in fact with the same team, even before these tools. Today, one of your, you know, solid engineers who was a solid engineer four years ago, how has their work changed now that they have access to Windsurf, agentic, you know, Cascade, all these other
Starting point is 00:56:55 tools including, you know, like ChatGPT, etc. Like, what's changed? And then not just your engineers, but also the team that you had four years ago, you know, that was doing work. How has their work changed in terms of, I don't want to point you in any direction, but I'm just interested in like what you would say, how does that seem different in what they do or how they do or how much they do? Yeah.
Starting point is 00:57:17 I think there's maybe a couple of things. So first of all, the amount of code that we have in the company is quite high and now dominates what a single person knows at the moment. So in the beginning of the company, that's not the case. So actually, this is something that I can't point to because the company is very quite small. Right now, I would say it enables, like, there's less, there's more fearlessness to jump into a new part of the code base and start making changes. Right. I would say, in the past, you would be, you would, you would more say, hey, this person has way more familiarity with this part of the code. That is still the case. Right. When you say familiarity now, it is,
Starting point is 00:57:52 it's like understanding the code, but this person also knows where the dead bodies are, which is to say, hey, when we're all, you know, you did X and you got Y that happened. And that means you always should do Z, right? And there are still people like that at the company, and I'm not saying that that is not valuable, but I think now engineers feel more empowered to go out and make changes throughout the code base, which is actually awesome. And the second key piece is our developers now go to the AI first to see what value would generate for them before making a change, which is something which I would say in the autocomplete days, you would go out and type it and you would get a lot of advantage from autocomplete and the passive AI, but now the
Starting point is 00:58:30 active AI is something that developers more and more reach towards to actually go out and make changes at the very beginning, right? Yeah, I'm interested in how this will change software engineering. Because I also noticed, I noticed both things on myself. Like, I still code and I do my side projects, but I always drag my feet of getting back into the context of the code that I wrote, which was, you know, I kind of forgot part of it, getting back into the language because I use multiple languages because of their side projects.
Starting point is 00:58:57 And AI, like, it does help me just like jump into it. I no longer have the thing. And sometimes, yeah, I just prompt the AI saying, what would you do? I just want to know. And then if it looks good, I do it. If it not, I just scrap it. Maybe I prompted it or sometimes I just like, nah, I'm just going to do it because either I didn't give it right instructions.
Starting point is 00:59:15 Like, you know, there is this thing, especially when you're working on stuff, you know, the code base, you've onboarded it. know what you want to do, but I think it helps me, at least, with the effort. Sorry, with the thing that wouldn't take like much creativity, but it would just be time, a drag, figuring out the right things, finding, the right dependency, changing those things, that kind of stuff. I think you're exactly right. I think this reducing friction piece is something that is, it's kind of hard to quantify the
Starting point is 00:59:50 value because it makes you more excited to do more, right? You know, this stuff, I think software development is a very weird profession. And I'll give you an example of why. It's weird. And a lot of people would think, oh, this is a very easy job. And I actually think it's quite hard on you mentally. And I'll walk you through what I mean by that. It's, you know, you're doing a hard project. You sometimes go home with incomplete, with, you know, with an incomplete idea. The code didn't pass a bunch of tests. And it just, it just bothers you when you sleep and you need to go back and kind of fix it. And this could be for days. Yeah. And for other jobs, I don't think you kind of feel that, right? It's a lot more procedural potentially for other types of jobs. I'm not saying for every job, there are obviously jobs where
Starting point is 01:00:34 there's a massive like problem solving component. But that just means that this kind of, you do get a fatigue. If you, you know, at some point, even the easy things, just forcing you to do new easy things, it adds some amount of mental fatigue. And I think you now have a very powerful system that you now trust that should ideally reduce this fatigue and be able to do a lot of the things that were in the past high activation energy and do it really fast for you. Yeah, this is really interesting because I was just talking with a former colleague of mine who had a few months where he just wasn't producing much code, really good engineer, really solid. And at the time, I didn't know why, and he didn't tell me, and then he kind of snapped out of it. But we were just talking, and he said, like,
Starting point is 01:01:20 he said that actually he was at a really bad time in his life, lots of stress in a relationship and at home with family, all these things. And he said that he just realizes how mental a game software engineering is. At work, he just couldn't get himself to, you know, get into the zone. We know how it is, especially before AI tools. And with what you said, I'm starting to get a bit of an appreciation, and the fact that I remember, you know, stressful. I couldn't turn off. Like, you go home, you're having dinner. You're still thinking about how you would change that or why it's not working. It's, like, I don't think we'll be able to like, you know, go onwards. But I think for listeners, it's worth thinking about like how, A, how weird it is, I think is good to reflect on it.
Starting point is 01:02:05 Because it is a unique, it is for so many jobs, you can actually, you know, just put down your work and leave the office and you cannot continue. And that's it. I cannot even think about it because all your work is there. And also like how these tools might just change it for the better in many ways and maybe just in weird ways that we don't expect in others. No, I think you're totally, this idea of, I think this is why like finding amazing software engineers is very, it's rare. It's rare. Because these people are people that I guess have gone through this and are willing to put themselves through the idea of like, hey, all of the learnings that I had from like the lowest level to the highest level and then willing to go to the,
Starting point is 01:02:43 go down to the weeds to kind of make sure you solve the problem. It's a rare skill. It's that, you know, you would imagine, hey, this is something that everyone would be able to do. But it like takes a lot of dedication. And as you pointed out, it's like this, you know, for an activity that is, that it's not a very normal activity. Yeah. Well, going back to engineering challenges and decisions, one super interesting thing that I've been dying to ask you is you did mention in the beginning that, you know, like it's, you, when you started Windsurf, you realized like Visual Studio Code is just, it's not there where it should be. However, you started by forking Visual Studio Code, right?
Starting point is 01:03:22 Do I know that right? That's exactly right. Can you tell me the pros and cons of doing this as opposed to like building your own editor? And I'm aware that there are some downsides of doing there. There's some licensing things. So that's one part of the question. The second part of the question, like, why did you think that forking is the right move to build a much better, much more capable thing of whatever Visual Studio was back,
Starting point is 01:03:43 so VS Code was back in the day. Yeah. So just maybe some clarifications just on terminology. VS Code is like, is a product that is built on top of Code OSS, which is basically the open source project. I did not know that. Yeah. So because VS Code has proprietary pieces on top of the open source. I do know that.
Starting point is 01:04:08 And a lot of people don't know that actually. Yeah, exactly. So I guess one of the things that we actually did was we wanted to make sure we did this right. And what I mean by that is when we actually built our products, we did fork Code OSS, but we did not support any of the proprietary pieces that Microsoft had. And we never actually provided support for those, not through a marketplace or anything. We actually use an open marketplace that is completely fine. And by the way, this forced us to actually have to build out a lot of extensions
Starting point is 01:04:38 that people needed and bake it into the product. I'll give you an example. For Python language servers, we actually now, we have our own version, right? For remote SSH, we have our own version. For dev containers, we have our own version. So this actually forced us to get a lot tighter on what we need to do. And we never took, I guess, a shortcut of, hey, let's go out and do something that we shouldn't be doing. Because, hey, we work with real companies.
Starting point is 01:05:01 We work with real developers. And why should we be putting them in that position? Right. I guess we kind of took that position. and so that was like that was the positioning that was the positioning we had obviously there were some complexities but this this just caused us more engineering effort before we launched the product right we did launch the product with an ability to connect to remote SSH and do all this other stuff and we did have like internal engineering effort to actually go out and and do that um now the question might be
Starting point is 01:05:28 why even fork VS Code in the first place? I think it's because it's a very well-known IDE where people have workflows. There are also many extensions there that people rely on that are extremely popular, right? An IDE is not just, I guess, the place where you write software, it's also the place where you attach a debugger and do all these other operations. And we didn't want to reinvent the wheel on that. We didn't think we were better than, I guess, the entire open source community on that, right, in terms of all the ways you could use the product.
Starting point is 01:06:04 And I'll give you an example of maybe how we're trying to be pragmatic here. We didn't go out and try to replace JetBrains with this product. We actually put all the capabilities of Windsurf into JetBrains, into what's called a Windsurf plugin. And this is where our goal is to meet developers where they are. And meeting VS Code developers where they are means we should give them a familiar experience. Meeting JetBrains developers means we should give them a familiar experience, which is actually use JetBrains.
Starting point is 01:06:30 Now, a question might be, why didn't we fork JetBrains? And the answer is two reasons. First of all, we can't. It's closed source. Second of all, the answer is actually because JetBrains is actually a fantastic IDE for Java developers and in a lot of cases, C++ and Python developers. And insofar as... PHP as well, PhpStorm, if you ever need them. It's exactly right. But they have one for almost every single language. For every single language. And the reason is because they have great debuggers, great language servers that I actually think are not even present on VS Code right now. Like if you were a great Java
Starting point is 01:07:02 developer, most of them, and probably 80 plus percent right now, use IntelliJ in the market. So the question there is, like, I think as a company, our goal is not to be dogmatic. Our goal is to build the best technology and provide it and democratize it and provide it to as many developers as possible. No, I love it. And this is actually, I was talking with one of your software engineers who did mention an interesting challenge because of just this, the fact that you do have a JetBrains plugin and then you have the IDE.
Starting point is 01:07:29 And now apparently you're sharing some binaries between the two. Can you talk a little bit about that engineering? Yeah. So this was actually an engineering decision we needed to make a couple months into starting working on Codeium, which is that, hey, we're going to go out and build a VS Code extension. That's what we started out with. But very quickly, like, the next step is let's go implement it in JetBrains. The problem is if we need to duplicate all the code, it's going to be really, really annoying for us to support all this. So what we decided to do is actually go out and build almost a shared binary between both that we call the language server that actually does
Starting point is 01:08:00 the heavy lifting. So the goal there is hopefully we're not just duplicating the work in a bunch of places. And this enables us to support many, many IDEs from an architecture standpoint. And that's why we were able to support not just JetBrains, but Eclipse, Vim, all of these other IDEs that people would, you know, that are popular without much lift. Okay. I need to ask you about MCP. You have started to support it, which is really cool. I play around with it. And I think it's a good first step. What is your take on MCP, especially with the security worries? And also, where do you see MCP going right now? I think it's a bit of an open book, but you are probably a bit more exposed to this than most listeners will be. You know, I think it's very exciting. I have some, maybe one concern,
Starting point is 01:08:49 but let me start with the exciting part. The exciting part is now it's democratizing access to everything inside a company or everything a user would want within their coding environment. for our product in particular. Obviously, there are other products. Maybe it can help you buy goods and grocery and stuff like that. Obviously, we're not that interested in that case. But that is nice. One of the other things that it lets companies do is they can implement their own
Starting point is 01:09:14 MCP servers with security guarantees, which is to say they can implement a battle-tested MCP server that talks to an internal service that actually does auth and all these other things for the end user, and they can own the implementation of that. So there's a way for companies now to enable us to use, to interact with their internal services in a secure way. But you're totally right.
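One way to picture a company-owned server that handles auth on behalf of the end user is a thin gateway that checks the caller before forwarding to an internal service. The sketch below is plain Python and deliberately does not use any MCP SDK; `InternalTicketService`, the token table, and the role names are all hypothetical stand-ins chosen for illustration.

```python
from dataclasses import dataclass


class AuthError(Exception):
    """Raised when a caller is unknown or lacks the required role."""


@dataclass(frozen=True)
class User:
    name: str
    roles: frozenset


# Hypothetical token table; a real deployment would ask the company's identity provider.
TOKENS = {
    "tok-alice": User("alice", frozenset({"engineer"})),
    "tok-bob": User("bob", frozenset({"contractor"})),
}


class InternalTicketService:
    """Stand-in for an internal service that should only be reached via the gateway."""

    def lookup(self, ticket_id: str) -> dict:
        return {"id": ticket_id, "status": "open"}


class ToolGateway:
    """Company-owned entry point: authenticate, authorize, then forward the call."""

    def __init__(self, service: InternalTicketService, required_role: str):
        self._service = service
        self._required_role = required_role

    def call_lookup(self, token: str, ticket_id: str) -> dict:
        user = TOKENS.get(token)
        if user is None:
            raise AuthError("unknown token")
        if self._required_role not in user.roles:
            raise AuthError(f"{user.name} lacks role {self._required_role!r}")
        return self._service.lookup(ticket_id)
```

The point of the design is that the company, not the coding tool, owns the check: the assistant only ever sees `call_lookup`, and the auth rules live in one place that the company can audit.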
Starting point is 01:09:38 Like there could be a slippery slope where this causes everyone to have immediate access to everything in a write-based fashion that could have negative consequences. But the thing I'm like, I'm particularly maybe a little bit worried about. And it's not worried. It's more so like the paradigm itself: is MCP the right way of encapsulating talking to other systems, or is it like actual workflows of developers going and interacting with these systems?
Starting point is 01:10:05 And I'll give you an example of that. One of the problems with the MCP is it forces you to hit a particular spec. And you know this. Actually, the best spec is flexibility. It's flexibility. And, you know, if you ask these systems now to integrate with another,
Starting point is 01:10:20 like you ask an LLM, like a GPT-4.1 or a Sonnet, hey, you know, build an integration to this system, to Notion. It will do it zero-shot now. So you could build an MCP server that is particular, that only lets you have access to two things in Notion. Or the models themselves are capable of doing a lot. And it's like how much do you want to constrain versus have freedom? And then also there is the corresponding security issue too.
Starting point is 01:10:44 So look, it's awesome that we have access to it. Is this the final version? I don't know if this is the final version. Yeah. I'm going to rephrase it. And let me know if you think I'm off. But when you set up, for example, you know, I'm building a web project and I'm using Node and I have my package.json that specifies what
Starting point is 01:11:04 packages I'm going to use. Now, on my machine I will have a lot of packages installed, but for each specific project I'm going to be very clear of what I want to use, what package, maybe a subset of it. And, you know, like right now it feels to me that the current version of MCP, it just lets me connect everything. I can't really, you know, say that, for example, on this project, like, I actually want you to only talk to this table in my database. I don't want you to access all the other stuff because it's a prod database. And I have a test table there, that kind of stuff, right? Like, are we talking about this like granularity and figuring out what would actually help me as an engineer be productive?
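The package.json analogy can be sketched as a per-project manifest that lists exactly which resources an assistant may touch, with everything else denied by default. This is an illustrative assumption about what such scoping could look like, not a feature of the current MCP spec; the manifest format and the names `ProjectManifest` and `is_allowed` are hypothetical.

```python
import json
from dataclasses import dataclass, field

# Hypothetical manifest, analogous to package.json: only what is listed is usable.
MANIFEST_JSON = """
{
  "project": "web-shop",
  "allowed": {
    "database": ["test_orders"],
    "docs": ["*"]
  }
}
"""


@dataclass
class ProjectManifest:
    project: str
    allowed: dict = field(default_factory=dict)

    @classmethod
    def from_json(cls, text: str) -> "ProjectManifest":
        data = json.loads(text)
        return cls(project=data["project"], allowed=data.get("allowed", {}))

    def is_allowed(self, server: str, resource: str) -> bool:
        """Deny by default; allow listed resources only ('*' allows all on that server)."""
        resources = self.allowed.get(server, [])
        return "*" in resources or resource in resources
```

With this manifest, a request for the `test_orders` table on the `database` server passes, while any other table on that server, or any server not listed at all, is refused without the engineer having to enumerate what is forbidden.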
Starting point is 01:11:46 No, it's an interesting point. So like, you're totally right. You want these systems to have access to a lot of things so that you can be productive. All the while, you want to be imperative and very instructive on what systems they should have access to internally. But the problem is people are very, I'm not going to say lazy, but it is annoying if you have 50 services and you need to tell it, you need to do this, you need to do that, you need to do this.
Starting point is 01:12:10 And what can very quickly happen is people don't and they get like mixed results or it has like negative consequences. So look, I think we're figuring this out. I think the whole industry is kind of figuring this out what the right model is. And maybe it actually is a lot of engineering that needs to get done post the MCP server, which is to say the MCP server provides a very free-flowing interface, but there's a lot of like kind of understanding of the server to who the user is, what service they're trying to touch, what code base they're in,
Starting point is 01:12:38 and there's like proper access controls that are implemented, you know, afterwards that helps you kind of like do that. I'm thinking these languages are not really popular, but when I started programming, I used C#. And in C#, for the classes, you had keywords. You know, you have classes, but you couldn't just access them. You had public classes which everyone can access. You had protected classes.
Starting point is 01:12:58 You actually had internal classes that were inside the module. You had private classes, which were not accessible unless you were a child class. And these were just keywords of how, what module can access what parts of your code inside the code base. And back then, this was like the 2000s, we spent a lot of care deciding who can access what and how, even though technically everyone could have talked with everyone. But we decided, this was an evolution of a few decades, that it wasn't a good idea. So I'm wondering if we're going to get there, for example, with MCP. We might reinvent some parts of it, because that didn't come up because, like, you know, someone just licked their finger. It was because we needed it to organize large amounts of code back then when we didn't have the tools that we have today. No, I think you're right.
Starting point is 01:13:44 I think some primitives are missing right now for sure. It's too free form right now. It's going to be super exciting, though, because we are seeing it that it is going somewhere. Maybe MCP, maybe not. and we're in the middle of it. You know, who knows? Some people listening to it might actually influence the direction of this new thing that we're going to use in like five years from now.
Starting point is 01:14:02 It's awesome. Yeah. What is your take on this 70/30 mental model for AI tools? This is something that comes up every now and then, especially with folks who are less technical, that today they can, you know, prompt AI tools from Windsurf to Lovable and others of like, hey, generate this idea that I have. And they do a good job at the, you know, the one shot or,
Starting point is 01:14:24 the tweaking. And then the last 30%, especially when they're not experienced software engineers, they just get a little stuck or hopelessly stuck. Do you observe this with Windsurf users, or is this not really a thing when people are pretty technical and developers? Yeah, I think we do have non-developers that use the product. And I do think the level of frustration for them. And by the way, my viewpoint on this is not like just let them be frustrated. It's, I would love to help them. But the level of frustration when they get it, when they have a problem, is much higher. And the reason is because for you and I, when we go on and use this, and it gets into this degenerate state where it goes out and it tries to make a change and it
Starting point is 01:15:03 does a series of changes that doesn't make sense. Our first instinct is don't just like do it 10 more times when five times it didn't work. It's probably like look at the code and see what step didn't work and we're going back to the step that works, right? Like debugging principles. But that's, by the way, the reason why we do that is we understand the code. Yeah. We can like go back into the code and kind of understand it, but you're right that for developers that can't, they're kind of in a state of helplessness. And I deeply empathize with that. And it's like, it's our job to figure out ways that we can make that a lot better. Now, granted, right, does that mean we make our product completely cater to non-developers? No, that's actually not what
Starting point is 01:15:41 we do. Are there principles from that that that we can take that help both groups? Right, because I think for us, we do want to get to a state where these systems can be more and more autonomous. Right. A real developer needs to go out and needs to fix these issues all the time when they prompt it. It also just means we're getting, we're farther and farther away from being autonomous as well. So that's kind of the way we think about it. But I do think as an industry and this is, you know, there's engineers who like the coders and then the non-coders, there is a question that needs to be asked of, do we eventually need to understand what the code does? Do you need to be able to read the code?
Starting point is 01:16:16 Because, for example, when I was at university, we studied assembly. Now, I never really programmed assembly beyond the class, but I have since come across assembly code, and I'm not afraid to look at it. Now, again, I'm not saying I'm the expert, but you can go all the way down the stack. And I think there is something to be said that, you know, we're now adding a new level of abstraction, and that as a professional, it will always be helpful to be able to look through the stack. You know, sometimes all the way to the networking logs or the packet, not often, but just knowing where to look and eventually where to go. So this might be more of a philosophical question because I think a lot of people don't want, they just think, okay, we can just use English for everything. But it does translate into a level, which is programming language, which just translates into the next level and so on. I think you're right.
Starting point is 01:17:02 So here's my take on it. We are going to have a proliferation of software. Some of the software will be built by people that don't know code. Right. I think it feels simplistic to say that that is not going to happen. Right. And we're already seeing it in real time. But here's the thing.
Starting point is 01:17:17 It's almost like when you think about the best developer that you know, even if they're a full-stack developer, they probably understand, when the product is slow, it's because there's some issue with the way that this interacts with the operating system, or there's some issue with the way that this interacts with the networking stack. It's the ability for this person to kind of peel back layers of abstraction to get to ground truth. That is what makes a great developer a great developer. And these people are more powerful. They're more powerful in any organization.
Starting point is 01:17:43 You know that you can take these people and put them on any project, and it's just going to be a lot more successful with them. And I think the same thing is going to happen, which is that some set of projects, it is going to be fine if the level of abstraction you deal with is the final application plus English and a spec. For some other set of applications, it's actually a developer will go in,
Starting point is 01:18:03 but there's going to be some gnarly nature. It's going to interact with the database. It's going to have some high, it's going to have like performance-related issues. And you're going to have an expectation that the AI and the human can go down the stack and the human can reason about this. And I think these people are always going to be really valuable. Similar to how I think actually our best engineers can,
Starting point is 01:18:23 if I ask them to go and look at the object dump of a C++ program and actually understand, hey, actually, here's a place where we're, here's a function, here's a place where we're seeing a massive amount of contention, and we need to go out and fix this, right? And if the developer didn't understand the sort of fundamentals, they would be much worse at our company because of that. Yeah, I wonder if an analogy might be that a car mechanic,
Starting point is 01:18:50 you know, car mechanics evolved over time. Like my dad used to, we used to have like these old school cars where he would take apart the engine. He would take the whole thing apart and then put it back together over a weekend. Like all the parts laid out, I remember. And of course, by the time, you know, I got to owning a car, I could change the oil. And now I have an electric car, which is, you know, like there's not as many moving parts.
Starting point is 01:19:12 However, someone who understands how cars work, how they're built, how they evolved, they will always be more in demand for special cases. For example, I just had my 12-volt battery die in my electric car. I had no idea there was a 12-volt battery, but apparently, I talked with someone who, you know, is into this, and like, yeah, it's from the gas cars. And this is why and this is the reason. And this is how the new version will evolve. So like, and clearly we will, the majority of people might not need it eventually, but there is that expertise. Plus, these are the people who understand everything who will often take innovation forward because they understand what came before and they understand what needs to
Starting point is 01:19:47 come. You're totally right. Maybe one other thing that I would want to add to what you basically said is when you look at what great computer scientists and software engineers do, I think they're great problem solvers given understanding sort of a high level sort of business case or what the company really wants to do. And there are people that can distill it down. And I think that skill is actually what I think boils down to when you meet great engineers. It's not just like you tell them about a feature. You tell them about an issue, a desired outcome, and they will go out and find any way possible to go out and get to that. I think that's what great engineers are. They're problem solvers. And that's always going to be in demand. Now, is the person that builds the most boilerplate
Starting point is 01:20:32 website? And that is the only thing they are excited to do in the future. That person's skill set is going to be depreciating with time. But I think that's a, but that's a simplistic way of looking at it because, you know, if they were a software engineer, they should know how to reason about systems. They should be good problem solvers. I think that's the hallmark of like software engineering as a whole. And they will always have a position out there in my opinion. Now, since you started to build Windsurf or even Codeium, how has your view changed on the future of software engineering? And we've touched on a few things. But like have there been some things like before and after? Now you're thinking about things differently? You know, I think that timelines for,
Starting point is 01:21:11 a lot of things. I'm like less scared of them, even though like I think a lot of them are supposed to come like come out like various, like as scary numbers. You know, I think recently Dario from Anthropic said 90% of all committed code is going to be AI generated. I think the answer to that is going to be yes. My question after that is so what, like so what if that's the case? Developers don't only spend time writing code, right? I think there's this fear that comes from all this stuff. I think AI systems are going to get smarter and smarter very quickly. But look, when I think about what engineers love doing, I think they love solving problems, right? They love collaborating with their peers to find out how to make solutions that work.
Starting point is 01:21:51 And I think when I look at the future, it's more like things are going to improve very quickly, but I think people are going to be able to focus on the things that they really want to do when they're developers, not like the nitty-gritty details that, as you said, you go home and you're like, I don't know why this doesn't compile. I think that will, a lot of those small details for most people are going to be a relic of the past. Well, I'll tell you, I'll give the idea of why people are stressed, you know, like, because and they're going to say, you know, some listeners will say, like, well, you're in an easy position because you're in the middle of an AI company building all these tools, which is the
Starting point is 01:22:22 future, right? Like, and you're going to be fine for the next few years. And they're thinking, I'm sitting at a B2B SaaS company where, like, I'm building software. And my employer is thinking that these things make us 20% or 25% more efficient and they're going to cut a quarter of the team. And I'm worried, A, if it's going to be me, B, the job market is not that great.
Starting point is 01:22:42 And I get it that I can be more productive with these things, but I still need to find a job. And that is the, you know, like not everyone will verbalize this, but this is the thing that gives people, this is, you know, when they're hearing Dario talk about the 90%, they're thinking, oh, damn, my employer will say like, okay, Joe, we don't need you anymore. Yeah. The problem is, I don't know what, like, maybe this is like, I don't know if this is like a real good answer, but that feels like the employers being like irrational. Because, okay, let me provide the take here. If the B2B SaaS company that is not doing well needs to compete with other B2B SaaS companies, if they reduce the number of engineers that they have,
Starting point is 01:23:18 they're basically saying their product is not going to improve that quickly compared to a competitor that is willing to hire engineers and improve their software much more quickly, I do think consumers and just businesses are going to have much higher expectations for software. So the demand for software that I buy is way higher. Like, I don't know if I've noticed it. I feel bad when I buy a piece of software
Starting point is 01:23:37 that looks like it did, like, you know, a couple years ago, that's like this ugly procurement software. Yeah. And these days you don't have a, I hear you. I think, I think, I see your point. I see the short term of like, like, are there employers that look at this and they're like, this is an opportunity to cut? I think these employers are being really, really short-sighted. Yeah. And I think I'm getting a little bit of hope from even other industries. There was a time where people, writers were being fired left and right. Like, I'm not saying software writers, but like just like old traditional writers. And now there's a big hiring spree from all sorts of companies of
Starting point is 01:24:08 hiring writers because turns out the AI is kind of, you know, it's a bit bland and a great writer with AI is way better than without, I think same for software engineers. So that's also a bit of my message to anyone listening, but it's just good to hear from you. Exactly. When you have a competitive market and you add a lot of automation, automation is great, but what you actually need to compare is automation with a human. And if that's way more leveraged, then you actually should compete with that. That's like the game theoretically optimal thing to do. And actually that, that's the tool that you're building right now, which I think is one of the reasons that it's, like a reason I like to use it. It doesn't feel that it's like trying to do anything
Starting point is 01:24:45 instead of me. It's doing it with me and making me way more efficient as an engineer. So to wrap up, I just have some rapid questions. I'm just going to ask them and then you can shoot the answer. So I've heard that you're really into endurance sports, long distance running, cycling, and you do just a lot of it. Now, a lot of people are thinking, well, I'm pretty busy with my job with coding, et cetera. I don't have as much time for sports. How do you make time for sports? And what would your advice be for someone who actually wants to get in a lot better shape while being a software engineer and busy with work? So I will say this, like since the company, that has gone down drastically. But at my previous company,
Starting point is 01:25:21 I still worked a ton. I worked at an autonomous vehicle company. I would bike over like 150 miles a week, like rigorously, like probably close to 160, 170. I think it's just interestingly, for an activity like this, I actually got Zwift, so like this way to bike indoors. And I would just be able to knock out like 20 to 25 miles in an hour, like at home. And the benefit there is like now I can come back from work, very quickly do a ride. And then, you know, on the weekends on a Saturday, I would just dedicate being able to do potentially like a 70 mile loop somewhere. One of the lucky things for me is I'm in the Bay Area.
Starting point is 01:26:00 So there's a lot of like amazing places to ride a bike that have hills and stuff like that. So I think it's easy to carve out this time, but you kind of, you know, you need to make the friction for yourself a lot lower, right? I think if I needed to, I would never go to a gym, like, rigorously. I think I'm not the type of person that would like, you know, I would just find a way to not do it. But if it's literally at home right next to where I sleep, I'm going to find a way to do it. Sounds like just make it work for you. Yeah. And what's a book that you would recommend and why?
Starting point is 01:26:31 You know, there was a book that I read a long time ago that I really enjoyed. It's called The Idea Factory. It's basically about how Bell Labs kind of like innovated so much while being a very commercial entity. And it was very interesting to see some of like the great scientists of our time working at this company providing so much value. So like information theory, Claude Shannon worked there. Right.
Starting point is 01:26:52 The like the founding of the transistor happened sort of like Shockley and all these people kind of were there too. And just hearing how a company is able to straddle the line between both was really exciting. Yeah, and I hear that, you know, OpenAI got inspired by Bell Labs a lot. Their titles are coming back. And I think I actually, I personally want to read more about that. So thanks for the recommendation. Well, thank you. This was great. This was super interesting and just, just love all the insights that you shared. Yeah, thanks a lot for having me. I hope you enjoyed this conversation with Varun and the challenges that the Windsurf team is solving for. One of the things I enjoyed discussing was when
Starting point is 01:27:27 Varun shared how they have a bunch of features that just didn't work out, like their review tool. And then they celebrate failure and just move on. I also found it fun to learn how any developer can roll out any feature they built to the whole company and get immediate feedback, whether it's good or bad. For more deep dives on AI coding tools, check out the Pragmatic Engineer deep dives linked in the show notes below. If you've enjoyed this podcast, please consider leaving a rating. This helps more listeners discover the podcast. Thanks and see you in the next one.
