The Pragmatic Engineer - Building Windsurf with Varun Mohan
Episode Date: May 7, 2025

Supported by Our Partners
• Modal: The cloud platform for building AI applications
• CodeRabbit: Cut code review time and bugs in half. Use the code PRAGMATIC to get one month free.

What happens when LLMs meet real-world codebases? In this episode of The Pragmatic Engineer, I am joined by Varun Mohan, CEO and Co-Founder of Windsurf. Varun talks me through the technical challenges of building an AI-native IDE (Windsurf) and how these tools are changing the way software gets built.

We discuss:
• What building self-driving cars taught the Windsurf team about evaluating LLMs
• How LLMs for text are missing capabilities for coding like “fill in the middle”
• How Windsurf optimizes for latency
• Windsurf’s culture of taking bets and learning from failure
• Breakthroughs that led to Cascade (agentic capabilities)
• Why the Windsurf teams build their LLMs
• How non-dev employees at Windsurf build custom SaaS apps – with Windsurf!
• How Windsurf empowers engineers to focus on more interesting problems
• The skills that will remain valuable as AI takes over more of the codebase
• And much more!

Timestamps
(00:00) Intro
(01:37) How Windsurf tests new models
(08:25) Windsurf’s origin story
(13:03) The current size and scope of Windsurf
(16:04) The missing capabilities Windsurf uncovered in LLMs when used for coding
(20:40) Windsurf’s work with fine-tuning inside companies
(24:00) Challenges developers face with Windsurf and similar tools as codebases scale
(27:06) Windsurf’s stack and an explanation of FedRAMP compliance
(29:22) How Windsurf protects latency and the problems with local data that remain unsolved
(33:40) Windsurf’s processes for indexing code
(37:50) How Windsurf manages data
(40:00) The pros and cons of embedding databases
(42:15) “The split brain situation”: how Windsurf balances present and long-term
(44:10) Why Windsurf embraces failure and the learnings that come from it
(46:30) Breakthroughs that fueled Cascade
(48:43) The insider’s developer mode that allows Windsurf to dogfood easily
(50:00) Windsurf’s non-developer power user who routinely builds apps in Windsurf
(52:40) Which SaaS products won’t likely be replaced
(56:20) How engineering processes have changed at Windsurf
(1:00:01) The fatigue that goes along with being a software engineer, and how AI tools can help
(1:02:58) Why Windsurf chose to fork VS Code and built a plugin for JetBrains
(1:07:15) Windsurf’s language server
(1:08:30) The current use of MCP and its shortcomings
(1:12:50) How coding used to work in C#, and how MCP may evolve
(1:14:05) Varun’s thoughts on vibe coding and the problems non-developers encounter
(1:19:10) The types of engineers who will remain in demand
(1:21:10) How AI will impact the future of software development jobs and the software industry
(1:24:52) Rapid fire round

The Pragmatic Engineer deepdives relevant for this episode:
• IDEs with GenAI features that Software Engineers love
• AI tooling for Software Engineers in 2024: reality check
• How AI-assisted coding will change software engineering: hard truths
• AI tools for software engineers, but without the hype

See the transcript and other references from the episode at https://newsletter.pragmaticengineer.com/podcast

Production and marketing by https://penname.co/. For inquiries about sponsoring the podcast, email podcast@pragmaticengineer.com.

Get full access to The Pragmatic Engineer at newsletter.pragmaticengineer.com/subscribe
Transcript
A lot of people talk about how we're going to have way fewer software engineers in the near future.
I think it feels like it's people that hate software engineers, largely speaking, that say this.
It feels pessimistic not only towards these people, but I would say just in terms of what the ambitions for companies are.
I think the ambition for a lot of companies is to build a much better product.
And if you now give companies the ability to have a better return on investment for building technology, right?
Because the cost of building software has gone down.
What should you be doing?
You should be building more,
because now the ROI for software and developers is even higher, because a single developer can do more for your business.
So technology actually increases the ceiling of your company much faster.
Windsurf is one of the popular IDEs software engineers use, thanks to its AI coding capabilities.
But what are the unique engineering challenges that go into building it, and how do tools like Windsurf change software engineering?
Today I sat down with Varun Mohan, co-founder and CEO of Windsurf.
We talk about why the Windsurf team builds their own LLMs and how LLMs for text are missing
capabilities necessary for coding, like fill in the middle.
How Windsurf uses a mix of techniques for many cases,
like how, to solve for search, they use a combination of embeddings and keyword-based searches.
Why latency is their number one challenge and how incorrectly balancing GPU
compute load and memory load can lead to higher latency for code suggestions popping up.
How Varun thinks the software engineering field will evolve and why he stopped
worrying about predictions like 90% of code will be generated by AI in six months.
If you want to understand the engineering that goes into these next-generation IDEs, then this episode is for you.
If you enjoy the show, please do subscribe on any podcast platform and on YouTube.
Welcome to the podcast.
Yeah, thanks for having me on.
You've recently launched GPT-4.1 support in Windsurf, which, by the time this is out, will have been a few weeks ago.
But what are your initial impressions so far?
And in general, when you introduce a new model, how do you evaluate how it's working for the coding use cases that we all use?
Yeah, maybe I can talk about the second part, and then I can talk about, you know, GPT-4.1 and the other models afterwards.
Basically, internally, you know, these models have these non-deterministic properties, right?
They sometimes perform differently in different tasks in ways that are unexpected.
You know, you can't just look at a score on a competitive programming competition and decide,
hey, it's going to be awesome for programming.
And, you know, interestingly about the company,
maybe this is going to be a helpful context.
A lot of us in the company previously worked in autonomous vehicles.
And I think in autonomous vehicles, we had a similar type of behavior
where you had a piece of software.
The software was very modular, lots of different pieces.
Each piece was machine learning driven.
So there was some nondeterminism.
And it's very hard to test it in the real world, right?
Actually, it's much harder than it is to test,
I guess, Windsurf out in the real world.
It's much harder to test autonomous vehicle software out in the real world, because if you ship bad software,
you have the chance of hurting a lot of people, right?
Hurting a lot of people, hurting the, you know, I don't know, just the general public infrastructure,
right?
So in that case, we needed to build really good simulation evaluation infrastructure in autonomous vehicles.
And I guess we brought that over here as well, where, hey, if you want to test out a new model,
we have evaluation suites.
And the evaluation suites not only test end-to-end software performance, which is to say you give a high-level task,
what is the pass rate of actually completing the high-level tasks on a bunch of unit tests.
It also tests retrieval accuracy, edit accuracy, right?
Redundant changes.
All these different parts of a model that are like negative behavior.
Because for our product, it not only matters that you pass a test.
It also matters that you didn't go out and make 10 steps that were unnecessary because
the human is going to be waiting on the other end for all of those changes.
So we have metrics for all of these things.
And we're able to put each model through, I guess, a suite of tests that give us
metrics. And that's the way we decide, hey, this is a good model for our end users. Right. And that's the
high-level way that we go about testing.
And these tests, you know, they sound great in theory,
but in practice, what do they look like? I'm going to assume, you know,
we can imagine, as engineers who've been writing code, probably not for autonomous vehicles,
but similar code, we know our unit tests, our integration tests. If you do mobile, you know
your end-to-end tests. I'm assuming this will be a little bit different, but with some similarities.
Do you actually code some scenarios? You have example code, example prompts, and then,
I assume you can do a bit of that, but then what else? And how does this all come together?
How can I imagine this test suite? Is it one big giant blob that runs for I don't know how long?
Yeah, one of the aspects of code that is really good is it can be run, right? So it's not a very,
you know, touchy-feely kind of thing. In the end, a test can be passed. So what we can do is we can
take a bunch of open source repositories, we can find previous pull requests or commits that
actually not only add tests, but also add the implementations correspondingly. And what we can do
is instead of just taking the commit description, we can remake what the description of the
commit should have been, like a very high level intent. And then from there, it becomes a very,
I guess, programmatic problem, which is to say, hey, like, first of all, find the right files
that you need to go and make changes to, right? Then there is a ground truth for that, right? Because
the base code actually has a set of five,
10 files that changes were made to.
Then after that,
what is the intent on those files?
You can actually go from the ground truth backwards,
which is that you know what the final change was
from the actual code.
And you can have the model generate that intent.
And then after that, you can see if the edit,
given that intent is correct.
So you now have three layers of tests,
which is that, hey, did I retrieve the right things?
Did I have the high level intent correctly?
And is the edit performance good, right?
And then you can imagine,
doing much more than just that. But at a high level now, you know, just from a pure commit or a
pure actual ground truth piece of code, you now have multiple metrics that you can go about.
And then obviously the final thing you can actually do is run the code. Right. So it's not just like a,
you know, when you measure some of these chat products, they actually, you know, the evaluation
is a little bit different, which is to say the evaluation is you give it to multiple humans in a blind
test, in an AB test, and you ask them, which one did you like more? Obviously, for us to
quickly evaluate, we can't be giving it to like tens of thousands of humans in a second.
And with this now, within like minutes, we can get answers to what is the performance on
tens of thousands of repositories and tests, basically.
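To make the layered evaluation described above more concrete, here is a minimal sketch of how metrics of this kind could be computed from tasks reconstructed out of historical commits. It is not Windsurf's evaluation suite: the EvalCase schema, its field names, and the way redundant steps are counted are all assumptions made for illustration.

```python
from __future__ import annotations
from dataclasses import dataclass

@dataclass
class EvalCase:
    """One task rebuilt from a historical commit (hypothetical schema)."""
    intent: str                    # high-level description regenerated from the commit
    ground_truth_files: set[str]   # files the real commit touched
    retrieved_files: set[str]      # files the model chose to look at or edit
    tests_passed: bool             # did the commit's own tests pass after the model's edit?
    steps_taken: int               # number of steps the model actually used
    steps_needed: int              # assumed lower bound on necessary steps

def retrieval_precision_recall(case: EvalCase) -> tuple[float, float]:
    """Compare the retrieved files against the files the real commit changed."""
    hits = len(case.retrieved_files & case.ground_truth_files)
    precision = hits / len(case.retrieved_files) if case.retrieved_files else 0.0
    recall = hits / len(case.ground_truth_files) if case.ground_truth_files else 0.0
    return precision, recall

def summarize(cases: list[EvalCase]) -> dict[str, float]:
    """Aggregate end-to-end pass rate, retrieval quality, and wasted steps."""
    n = len(cases)
    scores = [retrieval_precision_recall(c) for c in cases]
    return {
        "pass_rate": sum(c.tests_passed for c in cases) / n,
        "mean_retrieval_precision": sum(p for p, _ in scores) / n,
        "mean_retrieval_recall": sum(r for _, r in scores) / n,
        "mean_redundant_steps": sum(max(0, c.steps_taken - c.steps_needed) for c in cases) / n,
    }
```

The point of keeping several numbers rather than a single pass rate is the one made in the conversation: a model can pass the tests and still be a poor fit if it takes many unnecessary steps while the user waits.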
This episode is brought to you by Modal, the cloud platform that makes AI development simple.
Need GPUs without the headache?
With Modal, just add one line of code to any Python function and boom.
It's running in the cloud on your choice of CPU or GPU.
And the best part? You only pay for what you use.
With sub-second container start and instant scaling to thousands of GPUs, it's no wonder companies like Suno, Ramp, and Substack already trust Modal for their AI applications.
Getting an H100 is just a pip install away.
Go to modal.com slash pragmatic to get $30 in free credits every month.
That is M-O-D-A-L dot com slash pragmatic.
This episode is brought to you by CodeRabbit, the AI code review platform transforming how engineering teams ship faster without sacrificing code quality.
Code reviews are critical, but time-consuming.
CodeRabbit acts as your AI copilot, providing instant code review comments and the potential impacts of every pull request.
Beyond just flagging issues, CodeRabbit provides one-click fix solutions and lets you define custom code quality rules using AST Grep patterns, catching subtle issues that traditional static analysis tools might miss.
CodeRabbit has so far reviewed more than 5 million pull requests, is installed in 1 million repositories, and is used by 3,000,000.
50,000 open source projects.
Try CodeRabbit free for one month at
coderabbit.ai using the code PRAGMATIC.
That is coderabbit.ai, and use the code PRAGMATIC.
I really like how much engineering you can bring in,
because it's code, and because we have repositories,
and because you can use all these things. It feels to me
it gives a bit of an edge over some of the other use cases,
just as you mentioned.
No, I think you're totally right.
We think a lot about what would have happened if we were to pick a different sort of category entirely.
I think the ground truth is just very hard.
You don't even know if the ground truth is great, right, in some ways.
In some cases, for all we know, the ground truth is not good.
But in this case, I think it's a lot easier because of the verifiability.
Or, if you have a good test, it's a lot easier to verify software.
And can you give us a sense of what the team behind Windsurf is, and also how complex this thing is?
And then how did it even come about?
Because for all I know, you know, a few months ago, when this podcast started, there was no Windsurf.
There was Codeium.
We actually talked a bit about how Codeium was a little bit different.
And then out of nowhere, boom, Windsurf comes out.
A week later, already in The Pragmatic Engineer, about 10% of people that we surveyed were already using it,
which was, I think, the second largest usage of tools.
And people were enthusiastic about it.
But I assume there's more to this.
It didn't just come out of, you know, like nothing, right?
Yeah, so happy to talk a little bit about our story and summarize it.
So we started the company now close to four years ago, which is substantially before, I guess, the Copilot and ChatGPT sort of moment.
A lot of us at the company, as I mentioned, previously worked on, I would say, these hard tech problems, you know, AR, VR, autonomous vehicles.
And I guess at that point, what we started building out, and we had a different company name at that point, it was called Exafunction,
we started building out GPU virtualization systems.
So we built out systems to make it very fast and efficient to run GPU-based workloads.
And we would enable companies to run these GPU workloads on CPUs.
And we would transparently offload all GPU computations to remote machines.
And that could be CUDA kernels all the way down to full-on model calls, right?
It was a very low-level abstraction that we provided people.
So much so that if the remote machine died,
we'd be able to reconstruct the state of what was on that GPU on
another GPU, right? And the main use case we targeted was these large-scale simulation workloads
for these deep learning workloads that a lot of these robotics and autonomous vehicle companies had.
And we thought, hey, the world was going to look like that in the future. A lot of companies
would be running deep learning workloads. What ended up happening was in the middle of 2022,
I think text-davinci-003 sort of came out, which was, I guess, you know, the GPT-3 sort of
instruction model. And I guess that changed a lot of our priors, both mine and
my co-founder's, which is to say, we thought that the set of models that would run
was going to look a lot more homogeneous, right? If you were to imagine in the past,
the number of different models that people would run was very diverse, right? People would run
convolutional neural nets, right? Recurrent neural nets, LSTMs, right? Graph neural nets. There was a
whole suite of different types of models. We thought in that case, hey, if we were an infrastructure
company, we can make it a lot easier for these companies to run these workloads.
But the thing is, with text-davinci-003, we actually thought that there would be a simplification
of the set of models that would run. Why go out and train a very custom BERT model
if you could go out and just ask a very large generative model, is this a positive or negative
sentiment? And we thought that that was where the puck was going. I guess for us, like we believe
in scaling laws and all these things. If it's this good today, how good is a much smaller model
going to be in two years? It's probably going to be way better. So what we decided to do was
actually focus on the application layer. Take the infrastructure that we had and actually build
out an application. And that was what Codeium was. So we built out extensions in all the major
IDEs, right? And very quickly we were able to get to that point. And we actually did train our own
models and run them ourselves with our own inference stack. And the reason why we did that is at the time,
the models were not very good. The open models were not very good. And also for the workload that we had,
which was autocomplete, it was a very weird workload. It's not very similar to the chat workload.
Code is in a very incomplete state. You need to fill in code in the middle of a line. There's a bunch of
reasons why this workload is not very similar. And we thought we could do a much better job. So we provided
that because of our infrastructure background for free to basically every developer in the world.
There was no way to pay for the product. And then very quickly, enterprises started to reach out.
We were able to handle the security requirements and personalization because the companies
not only care about, hey, it's fast, it's free, but is this the best code for my company?
Right. And we were able to meet that workload. And then fast forward to today, and I know that
this is a long answer, what we felt was agents in the beginning of last year would be very huge.
The problem was the models were not there yet, right?
We had internal, we had teams inside the company building these agent use cases, and they were
just not good enough.
But the middle of last year, we were like, hey, it's actually going to be good enough.
But the problem is the IDE is going to be a limitation for us.
Because VS Code is not evolving fast enough to enable us to provide the best experience for
end users.
In a world in which agents were going to write 90% or 95% of software, developers would still be in the loop.
But the way they would interact with their IDEs would look markedly different.
And that's why we ended up building out Windsurf in the first place.
We thought that there was a much higher ceiling on what IDEs could provide.
And with the agent product, which is Cascade, we were able to deliver what we felt was a premier
experience right off the bat that we couldn't have with VS Code.
How large is the team working on Windsurf, and how complex is Windsurf as a product?
Like, I'm not sure how much we can quantify it.
Yeah.
You know, I try to be pretty modest
with some of these things, but just to say, we are a pretty small team.
So right now, the engineering team is a bit over 50 people.
Maybe that's large compared to other startups.
But if I were to compare it to other, you know, large engineering projects in the grand
scheme of things, one of the books that I read a while ago was this book called Showstopper,
right?
And it's this book about how Microsoft built Windows NT, right?
And it's a much larger team, obviously, but operating systems are a very complex
piece of software. But my viewpoint on this is that this is a very, very complex piece of software
in terms of where the goalpost is, which is to say, I would say the goalpost is constantly
moving, right? One of the goals that I give to the company is that we should be reducing the
time it takes to build applications by 99%. Right. And I would say pre-Windsurf, it was probably 20%,
and post-Windsurf, it was probably over 40%. But we are very far from 99%, right? We're still, like,
you know, 60x away from 99%, right? Like,
if there are 60 units of time and we want to make it one, we're quite far.
So in my head, there's a lot of different engineering projects that we have at the company.
In fact, like, I would say over, maybe close to half of the engineering team is working on projects
that have not seen the light of day.
Right.
And that's like an interesting decision that I guess we've made because I think we cannot be
embracing incremental, right?
Like, we're not going to win and be a valuable company to our customers if all we're doing
is changing the location of buttons.
Like, I think people will like us for great UI, but that cannot be the only
reason why we win.
No, and I love it. I mean, you know, when you're a startup,
I think you need to aim really big. You cannot do incremental. You can do incremental later.
Hopefully you're going to get there. And what are some interesting numbers that you can share
about the usage of Windsurf or the load that you're handling? I'm assuming this is
pretty easy to tell: it will keep going up, right? That's an easy prediction.
No, I think you're right. So one of the interesting numbers I can share,
a handful of numbers, is within a couple of months of the product launching,
we had well over a million developers try the product.
So it's been growing quite quickly.
With pricing coming out, within a month we reached over sort of eight
figures in ARR.
And I think all of those are kind of interesting metrics, but also on top of that,
we run our own models still in a lot of places.
Like, you can imagine the fast, passive experience is completely our own model.
A lot of the models to go out and retrieve parts of the code base.
and find relevant snippets are our own models.
And that system processes well over sort of 500 billion tokens of code every day right now.
So that system itself is huge.
It's a huge workload that we actually run.
Yeah.
And I guess the history of Windsurf is interesting once I understand that you've actually
been building your own models for quite some time.
You know, you've not just started here, because I think for most engineering teams,
that would be daunting.
And also it's just a lot of time, right?
It's not something that you would just do. It's harder to do from scratch,
I'll say, because nothing's impossible here.
I totally agree with you.
I think, you know, one of the weird things is because of the time that we started and the fact
that we were like in the very beginning, first of all, we had the infrastructure background,
but we were first saying we need to go out and build an auto-complete model.
The best model at the time that was open source, at the end of 2022, was Salesforce CodeGen.
And I'm not saying it was a bad model.
It was awesome that Salesforce did open-source that model, but it was missing a
lot of capabilities that we needed for our product, right? It was missing fill in the middle,
which feels like a very, very obvious capability, but the model is actually...
What is that?
So the idea of fill in the middle is, basically, if you look at the task of writing
software, it's very different than chat. And maybe an example of what chat is: you're always
appending something to the very end and maybe adding an instruction. But the problem for writing code
is you're writing code in ways that are in the middle of a line, in the middle of a snippet of code.
Yeah, or that kind of stuff, yeah. In the middle of a function.
And there are actually a lot of issues that pop up there, which is to say, the tokenization. So these models, when they consume files, they actually tokenize the files, right?
Which is, they don't consume them byte by byte. They consume them token by token. But the code, when you write it, at any given point doesn't tokenize into something that looks in distribution. I'll give you an example. How many times do you think, in the training dataset for these models, does it see, instead of "return", only "retu" without the "rn"? Probably never. It probably never sees that.
So it's completely out of distribution.
But we still need to, when we see "retu", predict you are going to do "rn", space, and a bunch of other
stuff, right?
It sounds like a very small detail, but that is very important if you want to build
the product.
And that is a capability that cannot be slightly post-trained onto the models.
It's actually something where, like, you need to do a non-trivial amount of training
on top of a model, or pre-train, to get that capability.
And it was table stakes for us to provide that for our users.
So that forced us very early on to actually build out our own models and figure out
training recipes and make sure we could run them at massive scale ourselves for our end users.
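As a rough illustration of the fill-in-the-middle setup described above, the sketch below splits a file at the cursor and arranges the code before and after it so that a model would be asked to generate the missing middle. The sentinel strings are placeholders chosen for this example, since each model defines its own special tokens, and the helper that finds the unfinished word before the cursor is only there to show the "retu" situation. Nothing here is Windsurf's actual prompt format.

```python
import re

# Placeholder sentinel strings (an assumption for illustration);
# real fill-in-the-middle models define their own special tokens.
PRE, SUF, MID = "<fim_prefix>", "<fim_suffix>", "<fim_middle>"

def build_fim_prompt(text: str, cursor: int) -> str:
    """Put the code before and after the cursor around sentinels,
    so the model is asked to generate what belongs in between."""
    prefix, suffix = text[:cursor], text[cursor:]
    return f"{PRE}{prefix}{SUF}{suffix}{MID}"

def trailing_partial_word(prefix: str) -> str:
    """Return the unfinished identifier or keyword right before the cursor."""
    match = re.search(r"[A-Za-z_][A-Za-z0-9_]*$", prefix)
    return match.group(0) if match else ""

# The cursor sits in the middle of the keyword "return", after "retu".
code = "def add(a, b):\n    retu\n\nprint(add(1, 2))\n"
cursor = code.index("retu") + len("retu")

prompt = build_fim_prompt(code, cursor)
partial = trailing_partial_word(code[:cursor])
```

Here the prefix ends in the fragment "retu", which is the out-of-distribution case from the conversation: a model trained only on complete files rarely sees a keyword cut off at that point, yet it has to continue with "rn" and the rest of the statement.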
And what are other things that are unique in terms of building models for code as opposed to
the usual text models? I can think of things like the brackets, for example, in some languages.
Maybe this is just naive. You have seen so many more. So what makes it
interesting, slash worthwhile, to build your own model for code?
Yeah, I think
what you said is definitely one thing. The fill-in-the-middle capability, I would say, is another thing.
Another thing you can do is, code is quite easy, and quite easy is maybe, you know,
an overstatement, but quite easy to parse, right? You could actually AST-parse code. You can find
relationships in the code. Because code is a system that has evolved over time, you could
actually look at the commit history of code to build a knowledge graph of the codebase,
and you can start putting...
Do you do that?
Yeah.
We look at the previous commits.
And one of the things that it enables us to do is build a probability distribution over the codebase, conditional on you modifying a piece of code.
What is the probability of you modifying another piece of code?
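One simple way to picture that conditional probability is to count, across past commits, how often two files change together. The sketch below does this at file granularity from plain lists of changed paths. That granularity, the function name, and the input shape are assumptions for illustration, and the approach actually used may work on finer units than whole files.

```python
from collections import Counter
from itertools import permutations

def co_change_probabilities(commits):
    """Estimate P(file B modified | file A modified) from commit file lists.

    `commits` is an iterable of lists of file paths, one list per commit.
    Returns a dict keyed by (A, B) pairs that were changed together at least once.
    """
    file_counts = Counter()   # number of commits touching each file
    pair_counts = Counter()   # number of commits touching both A and B
    for files in commits:
        unique = set(files)
        file_counts.update(unique)
        pair_counts.update(permutations(unique, 2))
    return {(a, b): n / file_counts[a] for (a, b), n in pair_counts.items()}

history = [
    ["api.py", "api_test.py"],
    ["api.py", "api_test.py", "docs.md"],
    ["api.py"],
    ["docs.md"],
]
probs = co_change_probabilities(history)
```

In this toy history, every commit that touched api_test.py also touched api.py, so an edit to the test file is a strong hint that api.py is relevant context, while the reverse signal is weaker.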
So there is, you know, when you get into the weeds, code is very, it's very information dense, right?
It's testable.
There's a way that it evolves.
People write comments, which is also cool, which is to say once a pull request gets created,
people actually say, I didn't like this code.
So there's a lot of signal on what good and
bad look like within a company. And you can actually use that as a way to automatically
make the product much better for companies, right? You know, one of the things that
all of us were talking about, I would say a couple of years ago, and I guess we've
been here in this space quite a long time. I know a couple of years is not a very long time in
most categories, but in this category, it's, you know, dinosaur years. So, you know,
one of the things that I think is kind of interesting is, in the beginning, we were
saying, hey, people would write all these guidelines and documentation on how best to use
the product. But the interesting thing is, code is such a treasure trove. You can go out and probably
make a good first cut on what the best way to write software is inside JPMorgan Chase, inside
Dell. You can go out and do that by using the rich history inside the company. So there's a lot of
things that you can start doing autonomously as well, if that makes sense.
Yeah. One thing I'd love
to get your take on is how it might have changed. A year or two ago,
when Copilot started to become more popular, again, an earlier version, companies like
Sourcegraph and others started to build other capabilities. There was this debate of,
would it be worth fine-tuning a model on my company's codebase, talking about large companies,
let's say JPMorgan or those others. And there were two strains of thought. One said,
oh, it's probably worth it because our code is so unique. And some other people
thought it might not be worth it because it might be too resource intensive, and the models are too
generic. Did you try this out? And where did you land on this? Because I never got an answer to,
you know, what happened, what was worth it, what was not worth it in the end.
So for what it's worth,
we did try it out. We built out some crazy infrastructure to go out and try it out. You know,
I guess this will be the first place where I talk about the actual infrastructure. We built out
systems. So transformers have these many layers, right? And if you were to imagine, when we
actually enabled companies to self-host, at some point in the past, we were enabling companies
to self-host the system and the fine-tuning system as well.
So at that time, you had self-hosting? You built this out?
We built out not only self-hosted deployment,
but also fine-tuning. And the way that that actually worked was actually quite crazy,
which was to say, okay, where do you get the capacity to fine-tune a model if you're already
running it for inference? The company may not want to give you so many GPUs. So we just said,
hey, why don't we use the preemptible time, which is to say when the model is not running inference,
what if we actually go out and do backprop on the transformer model while this is
happening? And then what we found was, oh, the backprop takes a long time and it might cause
downtime on the inference side. So what we did was we enabled the backprop to be
preemptible on every layer of the transformer. So that's to say, let's say you send an inference
request while it's doing backpropagation, and it's on layer 10. It'll just stop at
layer 10 and it will continue after your inference request completes. So we built a lot of crazy
systems to actually go out and do this. I guess here's the thing we found. We found that fine tuning
was a bump, but it was a very modest bump compared to what great personalization and great retrieval
could do. That's what we found. Now, does that mean fine tuning in the future is not going to be
valuable? I think per-person fine-tuning could actually work quite well. I think, though, maybe
some of the techniques that we use to do it are going to need to change. And here's the way I'd like to look at it,
right? Anytime you build a system, there are many ways to improve it.
Some of them are much easier than other ways. And you can imagine there's a hill to climb for
everything. And some hills are much easier. And the right strategy when a hill is much
easier and it provides a lot of value is to climb that hill fully before you go out and do something
that's a lot harder. Because when you do the thing that's a lot harder, you are adding
some amount of tech debt if that's not the right solution. What I described to you in terms of
the solution of doing backprop on a layer-by-layer basis, it's a cool idea. But you can imagine
it added a lot of technical complexity to the software that might have been unnecessary if we thought
that purely doing better retrieval was going to be much better. So there's this, I guess,
tightrope to kind of, you know, balance on in how you decide these things.
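The layer-by-layer preemption idea mentioned in this answer can be pictured with a toy scheduler. The sketch below is a simulation only: it does no real gradient computation, and the arrival model, function names, and log format are made up for illustration. It shows the control flow of a backward pass that pauses at a layer boundary whenever inference requests are waiting, then resumes where it stopped.

```python
from collections import deque

def backward_pass(num_layers):
    """Yield after each layer's backward step, from the last layer to the first.

    A real system would compute that layer's gradients before yielding.
    """
    for layer in reversed(range(num_layers)):
        yield layer

def run(num_layers, inference_arrivals):
    """Interleave one backward pass with inference requests.

    `inference_arrivals` maps a layer index to the request names that arrive
    while that layer's backward step is running (a made-up arrival model).
    """
    log = []
    pending = deque()
    for layer in backward_pass(num_layers):
        log.append(f"backprop layer {layer}")
        pending.extend(inference_arrivals.get(layer, []))
        # Preemption point: serve all waiting inference before the next layer.
        while pending:
            log.append(f"inference {pending.popleft()}")
    return log

trace = run(3, {1: ["req-a"]})
```

Because the training work only yields at layer boundaries, an inference request waits at most one layer's worth of backward computation. That is the latency benefit described in the conversation, and the bookkeeping needed to pause and resume safely is the kind of added complexity the answer warns about.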
Now, I was asking around. I've been using Windsurf as well, but I'm not a very heavy user,
so I have been asking heavier users. And one of the biggest criticisms, both of
Windsurf and of every tool in this area, has been, look, I start off, it's good.
It works well. I have a relatively small context.
My project grows, either because Windsurf generates code or it's just a big project.
After a while, it starts to struggle with the context.
Maybe it doesn't see, you know, parts of it.
It gets confused, et cetera.
And clearly, as an engineer, you understand that it is going to be a problem: you have a growing context window.
You still want to have similar quality.
How do you tackle this challenge?
What progress have you made?
And I think this is a bit of a million dollar question in the sense of like if we could somehow have a solution for this, we would be better off.
Where have you gotten on this?
I'm assuming this is a pretty common challenge.
I think it's a very hard problem.
You're totally right.
There's a lot of things that we can do, which is to say, obviously, we need to work around the fact that the models don't have infinite context.
And when they do have larger and larger context, you are paying a lot more and you take a lot more time.
Right. And developers usually, a lot of the time, don't really want to wait. And, you know, one of the things that we have for our products...
We hate waiting.
Yeah, exactly. But one of the things that we've learned for our products is if you make a developer wait, the answer better be 100% correct. And I don't think we're at a time right now where I can guarantee you, with like a magic wand, that all of our Cascade responses are 100% correct. Right? I don't think we're at that point right now. So there's a lot of things that we need to do that are almost approximations, right? How do we keep a very large context? And despite that, we have
chats that are so long that the question becomes: how do you accurately checkpoint the past conversation? But that has
some natural lossiness attached to it. And then similarly, if the codebase gets very large,
how do we get very, very confident that the retrieval is very good? And we have evaluations for
all of these things, right? This is not something which we're like shooting in the dark and being
like, hey, YOLO let's try a new approach and like give it to half of our users. But I think you're
totally right. There's no, I don't think there's like a complete solution for it. What I think it's
going to be is like a mixture of a bunch of things, which is to say much better checkpointing,
coupled with better usage of context length, much faster LLMs and much better models.
So it's not going to be, I think, a silver bullet. And by the way, that could
be tied with, hey, you know, understanding the code base much better
from the perspective that the code base already exists: being able to use the knowledge graph, right?
Able to use a lot of the dependencies within the code base a lot better.
So it's a bunch of things that I think are going to multiply together to solve the problem.
I don't think there's going to be like one silver bullet that makes it so you're going to be able to have amazingly coherent conversations that are very, very long, basically.
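The lossy checkpointing of long chats mentioned above can be sketched roughly as follows. This is a minimal illustration, not Windsurf's implementation: the `Turn` type, the word-count token estimate, and the `naive_summarize` stand-in (which a real system would presumably replace with a model call) are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Turn:
    role: str
    text: str


def rough_token_count(text: str) -> int:
    # Assumption: whitespace-separated words as a crude proxy for tokens.
    return len(text.split())


def checkpoint_history(
    turns: List[Turn],
    budget: int,
    summarize: Callable[[List[Turn]], str],
) -> List[Turn]:
    """Keep the newest turns verbatim within `budget`; fold older turns into one summary turn."""
    kept: List[Turn] = []
    used = 0
    # Walk backwards so the most recent turns are preserved first.
    for turn in reversed(turns):
        cost = rough_token_count(turn.text)
        if used + cost > budget:
            break
        kept.append(turn)
        used += cost
    kept.reverse()

    older = turns[: len(turns) - len(kept)]
    if not older:
        return kept
    # The summary is where information is lost: it replaces the full older turns.
    return [Turn("system", summarize(older))] + kept


def naive_summarize(turns: List[Turn]) -> str:
    # Hypothetical stand-in summarizer: keep the first 8 words of each older turn.
    return " | ".join(" ".join(t.text.split()[:8]) for t in turns)
```

The trade-off is visible in the budget: a smaller budget keeps latency and cost down but pushes more of the conversation into the summary, where detail is dropped.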
To be fair, as an engineer, it kind of, this might feel weird, but it makes me feel a bit better.
We're actually back to talking about like engineering step by step as opposed to like, okay, you know, having these, you know, it feels like you get a new model.
Not now, but early on when we got a new model, it was like, oh my gosh, just magic.
And it took a while to understand how it works, how it's broken down, et cetera.
You did mention your infrastructure.
Can you talk a little bit about how we can imagine your hardware and backend stack
if I was to join Windsurf as an engineer?
Like, is it going to be a bunch of cloud deployments here and there?
Do you self-host some of your GPUs?
Because a lot of AI startups who are smaller or more modest, they're just going to, you know,
use a platform as a service.
It sounds like you might be at the scale where maybe you're outgrowing this as well.
Yeah, I think we might have just never done kind of, you know, buying off the shelf stuff in the, in the early part of the company.
Oh, yeah. Your background, I keep forgetting this.
Yeah, but even more than the background, I think there were cases where we could have and maybe should have.
One of the reasons why we also didn't was very quickly we got brought into working with very large enterprises.
And I think the more dependencies you have in your software, it just makes it harder and harder for these larger companies to integrate the technology.
Right? Like they don't want a ton of sub-processors attached to it. We recently got FedRAMP
High compliance. We're the only AI software assistant with FedRAMP High compliance.
And the only reason why that's the case is we've kept our systems very tight, right?
And for these compliances, I did some, but not specifically FedRAMP: what do you need to prove
that you are this compliant?
Yeah, I think basically you need to map out, at a high level,
sort of all the interactions.
You need to be very methodical about releases and how the releases make it into the system.
You need to be very methodical about where data is persisted, at a layer that is like probably
much deeper than SOC 2.
I think like going through SOC 2 versus FedRAMP...
I did SOC 2 and that was already pretty long.
So it was already really long.
Yeah.
It's impressive that you did this as a startup slash scale-up. Congrats.
Yeah.
One of the reasons why was I guess like one of our first customers that were a large enterprise
was like Dell.
which is like not a usual first large enterprise.
For startup, no.
For startup, definitely no.
So it forces you down a path of how do we build very scalable infrastructure?
How do we make sure our systems work at a code base that is 100 plus million lines of code?
What does our GPU provisioning need to look like for this larger team?
It just forces you to become a lot more, I guess, operationally sound for these kinds of problems, if that makes sense.
Yeah.
And how do you deal with inference?
You're running the systems that serve probably billions or hundreds of billions,
well, hundreds of billions of tokens per day, as you just said, with low latency.
What smart approaches do you take to do this?
What kind of optimizations have you looked into?
Yeah, I mean, like a lot, as you can imagine.
One of the interesting things about some of the products that we have, like the passive
experience, latency matters a ton in a way that's very different than some of these API
providers.
Like I think for the API providers, time to first token is
important. But it doesn't matter that time to first token is 100 milliseconds. For us, that's the
bar we are trying to look for. Can we get it to sub, you know, a couple hundred milliseconds
and then hundreds of tokens a second for the generation time? So much faster than what all of the
providers are providing in terms of throughput as well. Just because of how quickly we want this product
to kind of run. And you can imagine there's a lot of things that we want to do, right? How do we
do things like speculative decoding? How do we do things like model parallelism, right? How do we
make sure we can actually batch requests properly to get the maximum utilization of the GPU,
all the while not hurting latency, right? That's an important thing. And one of the interesting
things is, just to give some of the listeners a mental model: GPUs are amazing. They have a lot
of compute. If I were to draw an analogy to compute or to CPUs, GPUs have over sort of two
orders of magnitude more compute than a CPU, right? It might actually be more on the more recent
GPUs, but keep that in mind. But GPUs only have an order of magnitude more memory bandwidth than a
CPU. So what that actually means is if you do things that are not compute intense, you will be
memory bound, right? So that necessarily means to get the most out of the compute of your processor,
you need to be doing a lot of things in parallel. But if you need to wait to do a lot of things
in parallel, you're going to be hurting the latency. So there's all of these different tradeoffs that
we need to make to ensure a quality of experience for our users that we think is high for the
product. And we've obviously mapped out all of these. We've seen how, hey, like, if we change the
latency by this much, what is this change in terms of people's willingness to use the product?
And it's very stark, right? Like a 10 millisecond increase in latency affects people's willingness
to use the product materially, right? It's percentage points that we're talking about. So these are all
parts of the inference stack that we've needed to optimize.
And is latency important enough that the location factors into this? Physically, how close the people
using Windsurf are to wherever your servers and your GPUs are running?
Do you need to worry about that as well?
You do need to worry about that.
The speed of light starts mattering.
Interestingly, you know, this is not something I would have expected, but we do have
users in India.
And interestingly, the speed of light is not actually what is bottlenecking their performance.
It's actually the local network.
So just the time it takes for the packet to get from maybe like from their home to the
major ISP is actually somehow there's a lot of congestion there and that's the kind of stuff
that we need to kind of deal with. But by the way, that is something that we just cannot solve
right now. So you're totally right. The data center placement matters. Like for instance, if you
force a data center in Sydney and you have people in Europe, they're not going to be happy about
the latency. So we do think about like where the location of our GPUs are to make sure that we do
have good performance. But there are some places where there are some issues that even we can't
get around basically.
The last time I heard this
complaint was before Windsurf. It came up again with someone who's using
Windsurf and other tools a lot, who said that specifically for one of the tools, he can tell
that the data centers are far away because it's just slow.
Cloud development environments had the exact same thing because they were similar, right?
Like this was, I'm not sure they're as popular right now, but there was a time where it looked
like it might be the future.
You just log on to your remote environment, which is running on CPUs or GPUs somewhere else.
And again, I think it might have to do with
when you're typing. Like, when I'm using it, I mean, I'm just used to, like,
I do want sub-second, probably like a few hundred milliseconds.
I just notice, right?
You feel it's slow and it just bothers you.
No, I agree.
I think if I had to, if I had to even see every time I typed a keystroke, like a couple
hundred milliseconds later, the key would show up.
Like, I would rage quit.
That would be a terrible experience for me.
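The Sydney-to-Europe data center example from this exchange can be put into rough numbers. The sketch below computes a physical lower bound on round-trip time from great-circle distance. Several things here are assumptions rather than anything stated in the conversation: Frankfurt stands in for "Europe", light in fiber is taken as roughly 200,000 km/s, and real routes are longer than the great circle, so actual latency would be higher.

```python
import math

EARTH_RADIUS_KM = 6371.0
# Assumption: signal speed in optical fiber, roughly two thirds of c.
FIBER_SPEED_KM_PER_S = 200_000.0


def great_circle_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Haversine distance between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def min_round_trip_ms(distance_km: float) -> float:
    """Lower bound on round-trip time: there and back at fiber speed, no routing or queuing."""
    return 2 * distance_km / FIBER_SPEED_KM_PER_S * 1000


# Approximate city coordinates (latitude, longitude); illustrative only.
SYDNEY = (-33.87, 151.21)
FRANKFURT = (50.11, 8.68)

distance = great_circle_km(*SYDNEY, *FRANKFURT)
rtt = min_round_trip_ms(distance)
print(f"distance ~{distance:.0f} km, minimum round trip ~{rtt:.0f} ms")
```

Under these assumptions the floor alone is well over a hundred milliseconds per round trip, before any inference time, which is why data center placement matters when the target is a couple hundred milliseconds end to end.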
How do you deal with indexing of the code?
So you're going to be indexing, you know, depending on the code
base, it'll be more or less. But if you add it up, I'm sure we're talking billions or a lot more
in code. And for your enterprise customers, you might actually have, you know, the hundreds of
millions or even more lines of code. Is there anything like novel or interesting that you're using?
Or is it just kind of tried and proven things, for example, that search engines might use?
It's a little bit of both, to be honest. And what I mean by that, it's not a very clean answer.
We do try approaches that are embedding based.
We have approaches that are keyword based on the indexing.
Interestingly, actually, one of the approaches that we've taken that's very different than search,
and maybe actually systems like Google actually do this,
is we don't just look at the retrieval,
we do a lot of computation at retrieval time.
So what that means is let's say you want to go out and ask a question.
One of the things that you can go out and do is ask it to an embedding store
and get a bunch of locations.
What we found was the recall of that operation was quite low.
And, you know, one of the reasons why that happens is embedding search is a little bit lossy.
Like, let's say I was to go to a code base and ask, hey, give me all cases where this function, this Spring Boot version X type function, was there.
I don't think anyone would believe embedding search would be comprehensive.
Yeah.
Right?
Because it's just not, like, you're taking something that is very high dimensionality and reducing it to something very low dimensionality.
without any knowledge of the question.
That's like the most important thing.
So it, like,
needs to somehow encode everything that could be relevant for all the possible questions.
So instead what we decided to do is take a variety of approaches to retrieve a large amount of data
and that could include the knowledge graph,
that could include the dependencies from the abstract syntax tree.
That could include, like, keyword search.
That could include embedding search.
And you kind of fuse them all together.
And then after that,
we throw compute at this and actually go out and process large chunks of the code base
at inference time and go out and say, hey, these are the most relevant snippets,
and this gives us much higher precision and recall, right, on the retrieval side.
And by the way, that is very important for an agent, because imagine if an agent kind of
doesn't have access and doesn't deeply understand the code base, all the while the code base
is much larger than the context length of what an agent is able to take in, right?
So, you know, optimizing the precision and recall of the system is actually something that
we spent a lot of time and built a lot of systems for.
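The "retrieve broadly from several sources, fuse, then spend compute to rerank" pipeline described above can be sketched as follows. This is an illustration under stated assumptions, not Windsurf's actual method: reciprocal rank fusion is a swapped-in, well-known fusion technique that the conversation does not name, and the `rerank` callable is a hypothetical stand-in for the expensive query-time step.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Sequence

Retriever = Callable[[str], List[str]]


def reciprocal_rank_fusion(rankings: Sequence[Sequence[str]], k: int = 60) -> List[str]:
    """Fuse ranked lists: each list contributes 1 / (k + rank) to a document's score."""
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by id for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


def retrieve(
    query: str,
    retrievers: Sequence[Retriever],
    rerank: Callable[[str, str], float],
    top_k: int = 3,
) -> List[str]:
    """High-recall candidate generation followed by a query-aware rerank."""
    rankings = [r(query) for r in retrievers]
    candidates = reciprocal_rank_fusion(rankings)
    # Assumption: `rerank` is the expensive step (e.g. a model scoring each snippet
    # against the query). sorted() is stable, so ties keep the fused order.
    return sorted(candidates, key=lambda d: -rerank(query, d))[:top_k]


# Hypothetical retrievers returning file names, best match first.
def keyword_search(query: str) -> List[str]:
    return ["a.py", "b.py"]


def embedding_search(query: str) -> List[str]:
    return ["b.py", "c.py"]


def dependency_graph_search(query: str) -> List[str]:
    return ["b.py", "a.py"]
```

The fusion step is cheap and knows nothing about the question beyond what each retriever encoded; the rerank step is where query-specific compute gets spent, which is the distinction drawn in the discussion.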
It's interesting because it feels like you're, it shows how, well, A, it's code.
So you can more easily work with it, especially with certain keywords, for example, in some languages.
I can imagine that, you know, you can even list all the keywords that are pretty common.
And you can decide if it's a keyword or if it's something special, and if it's a keyword, you can already just, like, handle it.
And it's interesting how you can combine the kind of old-school approaches from before LLMs and then add the best parts of LLMs, but not forgetting about, you know, what worked before.
I just wonder if there's any other industry that has this, where we do have this, like, lower-dimensionality space in terms of the grammar and all these things.
We understand the usage pretty well.
And then the users are power users who actually, you know, the same people use it who could actually build, you know, this tool.
Yeah.
You know, I feel like Google's system is probably ridiculously complex and sophisticated for obvious reasons,
just because, for one, they've been doing this for so long, and, obviously,
they've been at the top for such a long time. And then also on top of that,
the monetary value they get from delivering great search is so high given ads that they
are incentivized to throw a lot of compute even at, at the query time, right, to make sure that
the quality of suggestions is, is really good. So I assume they're doing a lot of, a lot of tactics there.
Obviously, I'm not privy to all the details of the system. So I don't know.
Well, it's interesting because I would have agreed with you until recently, but there are some search engines that are getting really good results.
So I wonder if Google is less focused on actually the haystack and the needle and maybe more on revenue, or maybe they're doing it invisibly.
I'm sure they're doing an amazing job, by the way, under the hood.
But I wonder if some of that knowledge has commoditized.
But, you know, we'll see.
But moving on from indexing, in terms of databases, what kind of databases do you use and what challenges are they giving you?
Like, again, I'm assuming you're not just going to be happy with like the usual,
let's store everything in Postgres. Or do you actually?
You might be able to.
I don't know.
Sounds like these days Postgres can work surprisingly well, even for embeddings.
Yeah.
You know, I think we, we do a combination of things.
So we do like some amount of local indexing.
We do some remote indexing as well.
Local indexing, as in on the user's machine.
Nice.
In some ways, the benefit of that is it helps you build
up, if you will, some understanding of the code base. The problem is that
understanding changes very quickly as the user starts changing code, starts checking out new branches.
And you don't want to basically say all of your information about the code base needs
to be thrown away because of that. So it's good to have some information about the user's
history and what they've done stored locally. In terms of remote, I think it
would be a lot simpler than people would imagine. The reason
why the product is very complex is actually the fact that we need to run all of this GPU infrastructure,
right?
That's actually a large chunk of the complexity because if you were to look at our QPS, our QPS is high,
but it is not like tens of thousands of QPS, right?
Actually, it doesn't need to be that high because, in some ways,
each of the queries that is happening is actually a really expensive query.
It's doing trillions of operations remotely.
So actually the complexity of the problem is how do you optimally do that process, right?
So we can actually get away with things like Postgres.
Like we're not, in fact, I would say I like to keep things pretty simple,
if it's possible to keep things very simple.
And we should not be rolling any type of our own database.
Like I think databases are very, very complex pieces of technology.
I think we're good engineers, but we're definitely not good enough to kind of like on the side,
build our own database.
And then for local indexing, what database do you use?
Yeah, we have our own combination of, like, a sort of SQL-based database.
We have a local SQL database.
And then some sort of embedding databases as
well that we store locally.
What is your view on the value of embedding databases?
This has been an ongoing debate since ChatGPT became big.
Again, there were two schools of thought.
One is that we do need embedding databases because they can give us
vector search.
They can give us all these other features that LLMs and embeddings will need.
And the other school of thought is, well, let's just expand relational databases.
We add a few extra indexes.
And boom, we're done.
From your, you know, you're more the user of this, but you're a heavy user at
Windsurf and Codeium. What pros and cons are you seeing? I'm just trying to get you to go in one
direction or the other.
It's a good question. So our viewpoint on embeddings is probably that
they don't solve the problem by themselves. They actually just do not. So the answer is going to be
mixed. And then the question is, why do we even do it in the first place, right? And I think it really
boils down to it's a recall problem. When you want to do good retrieval, you need the input to what
you're willing to consider to be large and high recall, right? And if you were to think about it,
the problem is if you only have something like keyword search and you have a very, very large
sort of code base, actually, what happens if the user typos something? Right? Then your recall is
going to be bad. So the way I like to think about it is each of these approaches, keyword search,
right, like sort of knowledge graph based retrieval, all of them. They're all in different circles.
And what you're trying to do is get something where the union of these circles is going to give you the highest
recall ultimately for the retrieval query. And I think embedding can give you good recall,
because it is able to summarize, or actually distill, some amount of semantic information
about the chunk of code, the AST or the file or the directory and all this other stuff.
So what I would say is it's a tool in the toolkit. You cannot build our product
entirely with an embedding system, but does the embedding system help? I think it actually does
help, right? It does improve our recall metrics and our precision metrics.
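The "union of circles" picture can be made concrete with a small calculation. The file names and result sets below are invented purely for illustration; the point is only that each retriever alone can have mediocre recall while their union covers everything relevant, at some cost in precision.

```python
from typing import Set


def recall(retrieved: Set[str], relevant: Set[str]) -> float:
    """Fraction of the relevant items that were retrieved."""
    return len(retrieved & relevant) / len(relevant) if relevant else 1.0


def precision(retrieved: Set[str], relevant: Set[str]) -> float:
    """Fraction of the retrieved items that are relevant."""
    return len(retrieved & relevant) / len(retrieved) if retrieved else 0.0


# Hypothetical ground truth and per-retriever results for one query.
relevant = {"auth.py", "login.py", "session.py", "token.py"}
keyword_hits = {"auth.py", "login.py", "README.md"}
embedding_hits = {"login.py", "session.py"}
graph_hits = {"token.py", "auth.py"}

union_hits = keyword_hits | embedding_hits | graph_hits

for name, hits in [
    ("keyword", keyword_hits),
    ("embedding", embedding_hits),
    ("graph", graph_hits),
    ("union", union_hits),
]:
    print(f"{name}: recall={recall(hits, relevant):.2f} precision={precision(hits, relevant):.2f}")
```

In this toy data every individual retriever finds half of the relevant files, and the union finds all of them while also carrying one irrelevant file, which is the kind of noise a later reranking step would be expected to filter out.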
So I talked with your head of research, Nicholas Moy, and he told me about a really interesting challenge that you're facing, which he called the split brain situation.
He was basically saying that it's almost like the team, and everyone on the team, needs to have two brains.
One is just being aggressively in the present, shipping improvements as you go, and the other is having a long-term vision where you're building for the long run.
How do you do this?
Like how did you start doing it and how do you keep doing it?
You did mention earlier, right, that half the team is working on other stuff, but do you
kind of split people so that some focus on short term and some on long term, or does everyone,
including you juggle these things in your head day to day? It's an interesting one.
Yeah, I don't want to give myself that much credit here. I think our engineers
probably should be given most of the credit here. But I think in terms of maybe the company's
strategic direction, both me and my co-founder, the CTO of the company, we try to think a lot about
how do we disrupt ourselves?
Because I think it's very easy to get into the state where, hey, I added this cool button.
I added this way to control X with a knob.
And you keep going down this path.
And yeah, your users get very happy.
But what happens if tomorrow I told you users don't need to do that?
And it's an amazing experience.
And it's like a better experience.
Users are going to feel like why do I need to do this?
So here's the thing.
Users are right up to a certain point.
Right?
They will not be able to see, like, by the way, if they can, then we should not be doing this.
They will not be able to see exactly what the future solution should be.
If our users can see the future solution better than we can, like, we should just pack up our bags and leave, right, at that point.
Like, what are we actually doing here?
So I think basically, you know, you have this tension here where you need to build features to make the product more usable today, right?
And our users are 100% right.
They understand this.
They face pain through many different axes that we don't.
and we should listen to them.
But also at the same time,
we might have an opinionated stance
on where coding and where these models
and where this product can go
that we should go out and build towards.
And we should be
expending a large amount
of our engineering capital on that.
And can you talk about like some kind of bets
that you're having?
You know,
not that you're giving away everything,
but like some promising directions
that might or might not work out
or even in the past,
some promising things that maybe did not work out?
Yeah, I'll tell you,
a lot of them. Yeah, so we failed a lot. And I think failing is great. And one of the
things that I tell our engineers is, like, engineering is not like a factory building, right?
It's, it's actually, you know, you have a hypothesis, you go in and you shouldn't be penalized
if you failed. Actually, I love the idea of, hey, an idea sounds interesting. We tried it,
and it didn't work, because we at least learn something. And learning something is awesome. And I'll
give you an example: the agent work that we did, we didn't even start at the beginning of last year.
We started even before the beginning of last year.
It was not working for many months.
And actually, Nick Moy, who you probably spoke with, was the one who was
working on some of this stuff.
And for a long time, a lot of what he was doing was just not working.
And he would come to us and we would say, okay, fine, it doesn't seem like it's working.
So we're definitely not going to ship this.
But let's keep doing it.
Let's keep working on it.
Because we believe it's going to get better and better.
But it was failing for a long time, right?
We came out with a review product, right, at the beginning of last year or around then, called Forge, for code reviews.
We thought it was kind of useful internally at the company and we thought we could continue to improve it.
People did not find it that useful, right? It was not actually that useful. And, you know, this was, we were going in with the assumption, code reviews take a long time.
What if we could help people? And the fact of the matter was the way we thought we could help people wasn't actually material enough for people to want to take on this new tool.
Right. And there's a lot of
other things, obviously, that we've tried in the past that just didn't work the
way we thought they would. And, you know, for me, I think I would be totally fine if 50% of the bets we
make don't work.
Yeah. A lot of startups say that, and then after a while, what I notice is,
as a company becomes bigger (I saw this at Uber), it's actually not really the case.
Failure is kind of embraced on paper, but actually it's not. So I think, you know, there's
this tricky thing that when it's actually meant, it's awesome. Otherwise,
people just start to polish things and make things look good when they're not.
Pretend that it's not a failure, but it was a success and we're just walking away, that kind
of stuff. So it's nice to see that you're doing it. What was the thing that turned the agents
around, which then I assume became Cascade? Like, was it a breakthrough on your end? Was
it the models getting better? Was it a mix of something else?
Yeah. I think it was a handful of things.
So I'll walk through it. So first of all, the models got better. 100% the models got better.
I think even with all the internal breakthroughs we had, if the models hadn't gotten better,
we wouldn't have been able to release this.
So I don't want to trivialize that matter.
It was huge.
The two other pieces that were quite important was our retrieval cycle is also getting better
and better, which I think enabled us to work much better at these larger codebases.
I guess, table stakes, it's quite good at zero-to-one programming.
But I think the thing that was groundbreaking to us was that our developers on a complex code base
were getting a lot of value from it.
Right? And I would say something quite interesting, which is that ChatGPT by itself wasn't incredibly
groundbreaking to our developers inside the company. And that's not because ChatGPT is not a very
useful product. ChatGPT is a ridiculously useful product. It's actually just because you need to think
about it from the perspective of opportunity cost and how much more efficient you get. Our developers,
a lot of them have been developers in the past. They are quite, I think we do have an exceptional
engineering team. They were used to how to use Stack Overflow and all these other tools.
to get what they wanted.
But suddenly, when the model had the capability
to not only understand your code base
but also start to make larger and larger changes,
it changed the behavior of the people inside the company.
And not only make changes,
we built systems to very quickly edit the code, right?
The ability to edit code,
we built the kind of models
to take a high-level plan
and make an edit to a piece of code very fast.
So all of these together made it
so that this was a workflow
that our developers wanted to use.
Right? We had the speed covered, we had the fact that it understood the code base well,
and then we also had massive model improvements to actually be able to call these tools and make
these iterative changes, right? You know, I don't want to diminish that. You have all
of these and suddenly now you have a real product.
I've been meaning to ask you this, but
how is the team, you know, using Windsurf to develop Windsurf? Because you're doing it, right?
You just told me how you're doing it. I'm asking from two perspectives. One is
the technical feasibility: I'm assuming, you know, you're not going to work on
the exact same code base, or you have a fork or a build or something like
that. And then, on the other hand, do you kind of force people to dogfood? Do people
just do it? Do people get stuck on certain versions? Do they turn on features for themselves,
etc.?
So the way we do it is we do have like an insider's developer mode. So this enables us to
test new features. I guess anyone at the company should be able to create a feature and then deploy to
everyone internally. And now we have a large number of developers. We'll get feedback. We have an
ability for our own developers to dogfood new releases. We can have our own developers say, I hate
this thing. Please don't ever do this. And it's nice because then we only give it to our own
developers, not other developers. So I think we have this tiered system at the company. We have our own
sort of release. We have Next, which is future-looking products that we are releasing that are
a little bit more raw. And then we have like the actual release that we give to developers.
We're willing to A/B test things, but we're not willing to
A/B test things in such a way where we give people
a comically bad experience just to A/B
test them. Right? Like, it's bad because people
are using this for their real work. So if you're
using it for real work, we don't want to be hurting you.
So I think one of the
things that's quite valuable to us is
probably, you
would think this is a failure mode for our company,
which is that we use Windsurf, largely speaking,
to modify large code bases,
right, for obvious reasons, because I think our developers
aren't building these toy apps over and over again.
But crazily enough, one of
our biggest power users inside our company is actually a non-developer. He leads partnerships. He's
never written software before. And he routinely builds apps with Windsurf. Right. And he's one of our
biggest users inside the company. And we've used this to actually replace buying other SaaS tools.
And he's actually even deployed some of these tools inside the company.
What function is this person in?
It's partnerships. So I'll give you an example of some of the tools.
These are not complex pieces of software, but you would be surprised at how much
they actually cost.
They're six figures in cost because it's bespoke software, right?
I'll give you an example.
You have a quoting tool.
So the idea of a quoting tool is you have a customer, the customer has this size,
they're in this vertical, you know, they want this kind of deal.
Here's the way it would look.
Here's the amount of discount we're willing to give them as a customer.
And usually these systems are really like systems that you would need to pay a lot of
money for it.
And the reason is because, I don't know, like there's no reason for our developers
to go out and build this internally, right?
it's a big distraction from going out and building our product.
But now, on the other hand, you have a domain expert in the person that actually runs partnerships.
He doesn't know software, but he knows this really well, right?
And because of that, he's able to create these apps really quickly.
And granted, we do have a person inside the company that looks at the app,
makes sure that it logistically makes sense,
is secure, and can be deployed inside the company.
But these are more ephemeral apps, right?
They're quite stateless.
If you were to look at the input output of this app, it is not as complex.
as, like, let's say, the Windsurf project, right?
Yeah.
But we now have this, like, growing set of people inside the company that are not developers
that are getting value from it, which we found a little surprising too.
Yeah.
And can you also give maybe just some other examples of what you think might be
replaced?
The reason being, I'm actually really interested in this because I do hear a lot
of people, either on social media or CEOs, saying that this could be the end of SaaS apps.
And I've always been skeptical for the reason that, you know, there's two types of SaaS apps.
And most of the SaaS apps I see are like, for example, Workday, which is an HR platform.
And they will have hosting.
They will have business rules.
They will, like, update to some extent with regulations and all that stuff.
So they do a lot of stuff that is the UI.
I know we can trivialize, but it's a lot more than that.
And then there are a few of these simpler ones.
Like, I don't know what things, but there's like a polling app or internally inside the company you can poll.
It has state, but it's relatively simple.
You can see behind it.
It's just going to be, I could build it, but I just don't want to
deal with authentication, to host it inside the company, but it's already there.
And then there's ones you mentioned that are stateless.
So like what kinds of SaaS tools do you see that you're replacing?
And you might see other companies potentially using, you know, tools like this to actually,
with, like, one dedicated helper developer, build it internally, bring it in-house.
Yeah.
You know, I think it's hubris to believe that products like Workday and Salesforce are going
to get replaced by this.
I think you're totally right.
These products have a lot of state.
They encapsulate business workflows.
There's actually for a product like Workday, probably compliance that you need to do
because of how business critical the system is.
So this isn't the kind of system that this would replace.
It probably falls in the latter two categories and probably even just the last one,
which is to say kind of these stateless systems that don't do writes to the most business
critical parts of your databases.
It's probably actually those kinds of systems that very quickly can get replaced.
And I would say there's a new category where think about the amount of software that would benefit a business that just isn't getting created that now could get created, right?
And the reason why that software couldn't get created is that a company couldn't be created that would be able to sustain itself, that would have a business model that would justify it existing.
But now since the software is very easy to create, these pieces of software are going to proliferate, right?
And one of the things that I'd like to talk about for software is, you know, the cost of building software was a lot
higher, right? The cost of simple software was a lot higher. Right now for a front end, we have to admit,
it's gotten a lot cheaper to build a basic front end system, right? Radically. Radically cheaper.
So I think the way I would sort of look at it is for these kind of systems, what are you really
paying for when you pay a SaaS vendor? You're not only paying for your product. You're paying for
the maintenance. You're paying for the fact that actually, you know, this company actually is
building a bunch of other features that you don't need. And the reason why is because they need to
support a bunch of customers, but you're still paying for that R&D. You're paying for their sales
and marketing, a bunch of other stuff there. So my viewpoint is if you can build custom software
for yourself, that is not very complex, but helps you in your own business processes, I think that
might proliferate inside companies. And that might actually cause a whole host of companies
that fall into that category, simple business software that feels largely stateless,
to kind of have trouble unless they kind of reinvent themselves.
Yeah.
And I guess, you know, one obvious reinvention that could happen later is, once this
has happened, let's just continue this thought of companies building a lot of internal
software.
They might start to have some similar problems in, let's say, you know, three, five
years down the road: maintenance, storage, compliance, just checking if they're working,
re-evaluating if it makes sense to actually bring it into something.
So like this could create a lot of new opportunities for other software businesses or software
developers or, you know, maybe these companies or maybe a new job role in software engineering,
which is, you know, I'm now specialized in, I built so many of these apps and I can help you
with them.
Who knows?
No, I think, I think that like a lot of people talk about how we're going to have like
way fewer software engineers in the near future.
I think it feels like it's people that hate software engineers, largely speaking,
that say this.
It feels pessimistic not only towards these people,
but I would say just in terms of what the ambitions for companies are.
I think the ambition for a lot of companies is to build a lot better product.
And if you now give the ability for companies to now have a better return on investment
for building technology, right?
Because the cost of building software has gone down.
What should you be doing?
You should be building more.
Because now the ROI for software and developers is even higher.
Because a singular developer can do more for your business.
Right? So technology actually increases the ceiling of your company much faster. Yeah. And I'm going to just
double click on that, because, you know, you have been building Windsurf and you've been
building these tools, but you've also worked with the team, in fact with the same team, even before these
tools. Today, one of your, you know, solid engineers who was a solid engineer four years ago,
how has their work changed now that they have access to Windsurf, agentic, you know, Cascade, all these other
tools including, you know, ChatGPT, etc.?
Like, what's changed?
And then not just your engineers, but also the team that you had four years ago, you
know, that was doing work.
How has their work changed in terms of, I don't want to point you in any direction,
but I'm just interested in like what you would say, how does that seem different in what
they do or how they do or how much they do?
Yeah.
I think there's maybe a couple of things.
So first of all, the amount of code that we have in the company is quite high and now
dominates what a single person knows at the moment. In the beginning of the company,
that was not the case. So actually, this is something that I can't point to because the company
was quite small. Right now, I would say it enables, like, there's more
fearlessness to jump into a new part of the code base and start making changes. Right. I would say,
in the past, you would more say, hey, this person has way more familiarity
with this part of the code. That is still the case. Right. When you say familiarity now, it is,
it's like understanding the code, but this person also knows where the dead bodies are,
which is to say, hey, when we're all, you know, you did X and you got Y that happened. And that
means you always should do Z, right? And there are still people like that at the company, and I'm not
saying that that is not valuable, but I think now engineers feel more empowered to go out and
make changes throughout the code base, which is actually awesome. And the second key piece is
our developers now go to the AI first to see what value it would generate for them before making
a change, which is something which I would say in the autocomplete days, you would go out and
type it and you would get a lot of advantage from autocomplete and the passive AI, but now the
active AI is something that developers more and more reach towards to actually go out and make
changes at the very beginning, right?
Yeah, I'm interested in how this will change software engineering.
Because I also noticed, I noticed both things on myself.
Like, I still code and I do my side projects, but I always drag my feet of getting back into the
context of the code that I wrote, which was, you know,
I kind of forgot part of it, getting back into the language because I use multiple languages
because of the side projects.
And AI, like, it does help me just like jump into it.
I no longer have the thing.
And sometimes, yeah, I just prompt the AI saying, what would you do?
I just want to know.
And then if it looks good, I do it.
If not, I just scrap it.
Maybe I prompted it or sometimes I just like, nah, I'm just going to do it because either
I didn't give it right instructions.
Like, you know, there is this thing, especially when you're working on stuff where you know
the code base, you've onboarded to it, you
know what you want to do, but I think it helps me, at least, with the effort.
Sorry, with the thing that wouldn't take like much creativity, but it would just be time,
a drag, figuring out the right things, finding the right dependency, changing those things,
that kind of stuff.
I think you're exactly right.
I think this reducing friction piece is something that is, it's kind of hard to quantify the
value because it makes you more excited to do more, right? You know, this stuff, I think software development
is a very weird profession. And I'll give you an example of why. It's weird. And a lot of people would
think, oh, this is a very easy job. And I actually think it's quite hard on you mentally. And I'll walk
you through what I mean by that. It's, you know, you're doing a hard project. You sometimes go home with
incomplete, with, you know, with an incomplete idea. The code didn't pass a bunch of tests. And it just,
it just bothers you when you sleep and you need to go back and kind of fix it. And this could be for days.
Yeah. And for other jobs, I don't think you kind of feel that, right? It's a lot more procedural
potentially for other types of jobs. I'm not saying for every job, there are obviously jobs where
there's a massive like problem solving component. But that just means that this kind of,
you do get a fatigue. If you, you know, at some point, even the easy things, just forcing you to do new
easy things. It adds some amount of mental fatigue.
And I think you now have a very powerful system that you now trust that should ideally reduce this fatigue
and be able to do a lot of the things that were in the past high activation energy and do it really fast for you.
Yeah, this is really interesting because I was just talking with a former colleague of mine who had a few months where he just wasn't producing much code, really good engineer, really solid.
And at the time, I didn't know why.
And he didn't tell me, and then he kind of snapped out of it. But we were just talking, and
he said that actually he was at a really bad time in his life, lots of stress in a relationship and
at home with family, all these things. And he said that he just realized how mental a game software
engineering is. At work he just couldn't get himself to, you know, get into the zone. We know how it is,
especially before AI tools. And with what you said, I'm starting to get a bit of an appreciation of the fact
that I remember, you know, it being stressful. I couldn't turn off. Like, you go home, you're having
dinner. You're still thinking about how you would change that or why it's not working. It's,
like, I don't think we'll be able to like, you know, go onwards. But I think for listeners,
it's worth thinking about like how, A, how weird it is, I think is good to reflect on it.
Because it is unique. For so many jobs, you can actually, you know, just put down your
work and leave the office and you cannot continue. And that's it. You cannot even think
about it because all your work is there. And also like how these tools might just change it
for the better in many ways and maybe just in weird ways that we don't expect in others.
No, I think you're totally, this idea of, I think this is why like finding amazing software
engineers is very, it's rare. It's rare. Because these people are people that I guess have gone
through this and are willing to put themselves through the idea of like, hey, all of the
learnings that I had from like the lowest level to the highest level and then willing to go to the,
go down to the weeds to kind of make sure you solve the problem. It's a rare skill. It's that,
you know, you would imagine, hey, this is something that everyone would be able to do. But it like takes
a lot of dedication. And as you pointed out, it's like this, you know, for an activity that is,
that it's not a very normal activity. Yeah. Well, going back to engineering challenges and decisions,
one super interesting thing that I've been dying to ask you is you did mention in the beginning that,
you know, like, when you started Windsurf, you realized Visual Studio Code is just,
it's not there where it should be.
However, you started by forking Visual Studio Code, right?
Do I know that right?
That's exactly right.
Can you tell me the pros and cons of doing this as opposed to like building your own editor?
And I'm aware that there are some downsides of doing that.
There's some licensing things.
So that's one part of the question.
The second part of the question: why did you think that forking is the right move to build a
much better, much more capable thing than whatever Visual Studio,
so VS Code, was back in the day?
Yeah. So just maybe some clarifications just on terminology.
VS Code is a product that is built on top of Code OSS,
which is basically the open source project.
I did not know that.
Yeah.
So VS Code has proprietary pieces on top of the open source.
I do know that.
And a lot of people don't know that actually.
Yeah, exactly.
So I guess one of the things that we actually did was we wanted to make sure we did this right.
And what I mean by that is when we actually built our product, we did fork Code OSS,
but we did not support any of the proprietary pieces that Microsoft had.
And we never actually provided support for those, not through a marketplace or anything.
We actually use an open marketplace that is completely fine.
And by the way, this forced us to actually have to build out a lot of extensions
that people needed and bake them into the product.
I'll give you an example.
For Python language servers, we actually now, we have our own version, right?
For remote SSH, we have our own version.
For dev containers, we have our own version.
So this actually forced us to get a lot tighter on what we need to do.
And we never took, I guess, a shortcut of, hey, let's go out and do something that we shouldn't be doing.
Because, hey, we work with real companies.
We work with real developers.
And why should we be putting them in that position?
Right.
I guess we kind of took that position.
And so that was the positioning we had. Obviously there were some
complexities, but this just caused us more engineering effort before we launched the product, right?
We did launch the product with an ability to connect to remote SSH and do all this other stuff, and we did
have internal engineering effort to actually go out and do that. Now the question might be,
why even fork VS Code in the first place? I think it's because it's a very
well-known IDE where people have workflows.
There are also many extensions there that people rely on that are extremely popular, right?
An IDE is not just, I guess, the place where you write software, it's also the place where
you attach a debugger and do all these other operations.
And we didn't want to reinvent the wheel on that.
We didn't think we were better than, I guess, the entire open source community on that,
right, in terms of all the ways you could use the product.
And I'll give you an example of maybe how we're trying to be pragmatic here.
We didn't go out and try to replace JetBrains with this product.
We actually put all the capabilities of Windsurf into JetBrains,
into what's called a Windsurf plugin.
And this is where our goal is to meet developers where they are.
And meeting VS Code developers where they are means we should give them a familiar experience.
Meeting JetBrains developers means we should give them a familiar experience,
which is to actually use JetBrains.
Now, a question might be, why didn't we fork JetBrains?
And the answer is two reasons.
First of all, we can't. It's closed source. Second of all, the answer is actually because
JetBrains is actually a fantastic IDE for Java developers and, in a lot of cases, C++ and Python
developers. And so far as... PHP as well, PhpStorm, if you ever need them.
It's exactly right. But they have one for almost every single language.
For every single language. And the reason is because they have great debuggers, great language
servers that I actually think are not even present on VS Code right now. Like if you were a great Java
developer, most of them, and probably 80 plus percent right now, use IntelliJ in the market.
So the question there is, like, I think as a company, our goal is not to be dogmatic.
Our goal is to build the best technology and provide it and democratize it and provide it to
as many developers as possible.
No, I love it.
And this is actually, I was talking with one of your software engineers who did mention
an interesting challenge because of just this, the fact that you do have a JetBrains plugin
and then you have the IDE.
And now apparently you're sharing some binaries between the
two. Can you talk a little bit about that engineering?
Yeah. So this was actually an engineering decision we needed to make a couple months into starting
working on Codeium, which is that, hey, we're going to go out and build a VS Code extension.
That's what we started out with. But very quickly, like, the next step is let's go implement it
in JetBrains. The problem is if we need to duplicate all the code, it's going to be really,
really annoying for us to support all this. So what we decided to do is actually go out and build
almost a shared binary between both that we call the language server that actually does
the heavy lifting. So the goal there is hopefully we're not just duplicating the work in a bunch
of places. And this enables us to support many, many IDEs from an architecture standpoint. And that's
why we were able to support not just JetBrains but Eclipse, Vim, all of these other IDEs that people
would, you know, that are popular without much lift. Okay. I need to ask you about MCP. You have
started to support it, which is really cool. I play around with it. And I think it's a good first
step. What is your take on MCP, especially with the security worries? And also, where do you see
MCP going right now? I think it's a bit of an open book, but you are probably a bit more exposed to this
than most listeners will be. You know, I think it's very exciting. I have some, maybe one concern,
but let me start with the exciting part. The exciting part is now it's democratizing access to
everything inside a company or everything a user would want within their coding environment.
for our product in particular.
Obviously, there are other products.
Maybe it can help you buy goods and groceries and stuff like that.
Obviously, we're not that interested in that case.
But that is nice.
One of the other things that it lets companies do is they can implement their own
MCP servers with security guarantees,
which is to say they can implement a battle-tested MCP server
that talks to an internal service that actually does auth
and all these other things for the end user,
and they can own the implementation of that.
So there's a way for companies now to enable us to use, to interact with their internal services
in a secure way.
But you're totally right.
Like there could be a slippery slope where this causes everyone to have immediate access
to everything in a write-based fashion that could have negative consequences.
But the thing I'm particularly maybe a little bit worried about,
and it's not worried,
it's more so the paradigm itself: is MCP the right way of encapsulating
talking to other systems,
or is it like actual workflows of developers
going and interacting with these systems?
And I'll give you an example of that.
One of the problems with the MCP is it forces you
to hit a particular spec.
And you know this.
Actually, the best spec is flexibility.
It's flexibility.
And, you know, if you ask these systems now
to integrate with another,
like you ask an LLM, like a GPT-4.1 or a Sonnet,
hey, you know, build an integration to this system,
to Notion.
It will do it zero shot now.
So you could build an MCP server that is particular that only lets you have access to two things in Notion.
Or the models themselves are capable of doing a lot.
And it's like how much do you want to constrain versus have freedom?
And then also there is the corresponding security issue too.
So look, it's awesome that we have access to it.
Is this the final version?
I don't know if this is the final version.
Yeah.
I'm going to rephrase it.
And let me know if you think I'm off.
But when you set up, for example, you know, I'm building a
web project and I'm using Node and I have my package.json that specifies what
packages I'm going to use. Now, on my machine I will have a lot of packages installed, but for
each specific project I'm going to be very clear about what packages I want to use,
maybe a subset of it and you know like right now it feels to me that the current version of
MCP it just lets me connect everything I can't really you know say that for example on this
project like I actually want you to only talk to this table in my database
I don't want you to access all the other stuff because it's just a prod database.
And I have a test table there, that kind of stuff, right?
Like, are we talking about this like granularity and figuring out what would actually help me as an engineer be productive?
No, it's an interesting point.
So like, you're totally right.
You want these systems to have access to a lot of things so that you can be productive.
All the while, you want to be imperative and very instructive on
what systems they should have access to internally.
But the problem is people are very, I'm not going to say lazy,
but it is annoying if you have 50 services and you need to tell it,
you need to do this, you need to do that, you need to do this.
And what can very quickly happen is people don't and they get like mixed results
or it has like negative consequences.
So look, I think we're figuring this out.
I think the whole industry is kind of figuring this out what the right model is.
And maybe it actually is a lot of engineering that needs to get done post the MCP server,
which is to say the MCP server provides a very free-flowing interface,
but there's a lot of like kind of understanding of the server to who the user is,
what service they're trying to touch, what code base they're in,
and there's like proper access controls that are implemented, you know, afterwards
that helps you kind of like do that.
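The idea of access controls sitting behind a free-flowing server interface can be sketched roughly as follows. This is an illustrative assumption, not part of the MCP specification and not how Windsurf implements it: the `ToolCall` shape, the policy table, and names such as `postgres-prod` and `web-shop` are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToolCall:
    user: str      # who the user is
    codebase: str  # what code base they are in
    service: str   # what service they are trying to touch
    resource: str  # e.g. a table name
    action: str    # "read" or "write"


# Hypothetical per-codebase policy: which (service, resource) pairs may be
# touched, and with which actions. Anything not listed is denied.
POLICY = {
    "web-shop": {
        ("postgres-prod", "test_orders"): {"read", "write"},
        ("postgres-prod", "orders"): {"read"},
    },
}

# Hypothetical mapping from users to the code bases they work in.
USER_CODEBASES = {"alice": {"web-shop"}}


def is_allowed(call: ToolCall) -> bool:
    """Deny by default; allow only what the policy names explicitly."""
    if call.codebase not in USER_CODEBASES.get(call.user, set()):
        return False
    rules = POLICY.get(call.codebase, {})
    return call.action in rules.get((call.service, call.resource), set())


def handle(call: ToolCall) -> str:
    """Gate every incoming call before it reaches the internal service."""
    if not is_allowed(call):
        return f"denied: {call.action} on {call.service}/{call.resource}"
    # A real server would forward the request to the internal service here.
    return f"ok: {call.action} on {call.service}/{call.resource}"


if __name__ == "__main__":
    print(handle(ToolCall("alice", "web-shop", "postgres-prod", "test_orders", "write")))
    print(handle(ToolCall("alice", "web-shop", "postgres-prod", "orders", "write")))
```

The deny-by-default choice matters for the "50 services" problem mentioned above: nobody has to enumerate everything the assistant must not touch, only the few things a given project actually needs.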
I'm thinking these languages are not really popular,
but when I started programming, I used C#.
And in C#, for the classes, you had keywords.
You know, you have classes, but you couldn't just access them.
You had public classes which everyone can access.
You had protected classes.
You actually had internal classes that were inside the module.
You had private classes, which were not accessible unless you were a child class.
And these were just keywords of how, what module can access what parts of your code inside the code base.
And back then, this was like the 2000s, we spent a lot of care deciding who can access what and how, even though technically everyone could have talked with everyone.
But we decided this was evolution of a few decades that it wasn't a good idea.
So I'm wondering if we're going to get there, for example, with MCP. We might reinvent some parts of it, because that didn't come about because, you know, someone just licked their finger.
It was because we needed it to organize large amounts of code back then when we didn't have the tools that we have today.
No, I think you're right.
I think some primitives are missing right now for sure.
It's too free form right now.
It's going to be super exciting, though, because we are seeing it that it is going somewhere.
Maybe MCP, maybe not.
And we're in the middle of it.
You know, who knows?
Some people listening to it might actually influence the direction of this new thing
that we're going to use in like five years from now.
It's awesome.
Yeah.
What is your take on this 70/30% mental model for AI tools?
This is something that comes up every now and then,
especially with folks who are less technical,
that today they can, you know, prompt AI tools, from Windsurf to Lovable and others, like,
hey, generate this idea that I have.
And they do a good job at the, you know, the one shot or
the tweaking. And then the last 30%, especially when they're not experienced software engineers,
they just get a little stuck or hopelessly stuck. Do you observe this with Windsurf users,
or is this not really a thing when people are pretty technical and developers?
Yeah, I think we do have non-developers that use the product. And I do think the level of
frustration for them. And by the way, my viewpoint on this is not like just let them be frustrated.
It's, I would love to help them. But the level of frustration when they get it, when they have a
problem, is much higher. And the reason is because for you and I, when we go on and use this,
and it gets into this degenerate state where it goes out and it tries to make a change and it
does a series of changes that doesn't make sense, our first instinct is, don't just do it
10 more times when five times it didn't work. It's probably, look at the code and see what
step didn't work and go back to the step that worked, right? Like debugging principles.
But that's, by the way, the reason why we do that is we understand the code. Yeah. We can like go
back into the code and kind of understand it, but you're right that for developers that can't,
they're kind of in a state of helplessness. And I deeply empathize with that. And it's like,
it's our job to figure out ways that we can make that a lot better. Now, granted, right,
does that mean we make our product completely cater to non-developers? No, that's actually not what
we do. Are there principles from that that that we can take that help both groups? Right,
because I think for us, we do want to get to a state where these systems can be more and more autonomous.
Right.
If a real developer needs to go out and fix these issues all the time when they prompt it,
it also just means we're farther and farther away from being autonomous as well.
So that's kind of the way we think about it.
But I do think as an industry and this is, you know, there's engineers who like the coders and then the non-coders, there is a question that needs to be asked of, do we eventually need to understand what the code does?
Do you need to be able to read the code?
Because, for example, when I was at university, we studied assembly.
Now, I never really programmed assembly beyond the class, but I have since come across assembly code, and I'm not afraid to look at it.
Now, again, I'm not saying I'm the expert, but you can go all the way down the stack.
And I think there is something to be said that, you know, we're now adding a new level of abstraction, and that as a professional, it will always be helpful to be able to look through the stack.
You know, sometimes all the way to the networking logs or the packet, not often, but just knowing where to look and eventually where to go.
So this might be more of a philosophical question because I think a lot of people don't want, they just think, okay, we can just use English for everything.
But it does translate into a level, which is programming language, just translates into the next level and so on.
I think you're right.
So here's my take on it.
We are going to have a proliferation of software.
Some of the software will be built by people that don't know code.
Right.
I think it feels simplistic to say that that is not going to happen.
Right.
And we're already seeing it in real time.
But here's the thing.
It's almost like when you think about the best developer that you know, even if they're a full-stack
developer, they probably understand, when the product is slow, whether
it's because there's some issue with the way that this interacts with the operating system,
or there's some issue with the way that this interacts with the networking stack. It's the ability
for this person to kind of peel back layers of abstraction to get to ground truth.
That is what makes a great developer, a great developer.
And these people are more powerful.
They're more powerful in any organization.
You know that you can take these people and put them on any project.
and it's just going to be a lot more successful with them.
And I think the same thing is going to happen,
which is that some set of projects,
it is going to be fine if the level of abstraction you deal with
is the final application plus English and a spec.
For some other set of applications,
it's actually a developer will go in,
but there's going to be some gnarly nature.
It's going to interact with the database.
It's going to have some high,
it's going to have like performance-related issues.
And you're going to have an expectation that the AI and the human
can go down the stack and the human can reason about this.
And I think these people are always going to be really valuable.
Similar to how I think actually our best engineers can,
if I ask them to go and look at the objdump of a C++ program
and actually understand, hey, actually,
here's a place where we're, here's a function,
here's a place where we're seeing a massive amount of contention,
and we need to go out and fix this, right?
And if the developer didn't understand the sort of fundamentals,
they would be much worse at our company because of that.
Yeah, I wonder if an analogy might be that a car mechanic,
you know, car mechanics evolved over time.
Like my dad used to,
we used to have like these old school cars where he would take apart the engine.
He would take the whole thing apart and then put it back together over a weekend.
Like all the parts laid out, I remember.
And of course, by the time, you know, I got to owning a car,
I could change the oil.
And now I have an electric car, which is, you know, like there's not as many moving parts.
However, someone who understands how cars work, how they're built, how they evolved, they will always be more in demand for special cases.
For example, I just had my 12-volt battery die in my electric car.
I had no idea there was a 12-volt battery, but apparently, I talked with someone who, you know, is into this, and it's like, yeah, it's from the gas cars.
And this is why and this is the reason.
And this is how the new version will evolve.
So like, and clearly we will, the majority of people might not need it eventually, but there is that expertise.
Plus, these are the people who understand everything, who will often
take innovation forward because they understand what came before and they understand what needs to
come. You're totally right. Maybe one other thing that I would want to add to what you basically
said is when you look at what great computer scientists and software engineers do, I think they're
great problem solvers given understanding sort of a high level sort of business case or what the
company really wants to do. And there are people that can distill it down. And I think that skill is actually
what I think boils down to when you meet great engineers. It's not just like you tell them about a feature.
You tell them about an issue, a desired outcome, and they will go out and find any way possible
to go out and get to that. I think that's what great engineers are. They're problem solvers.
And that's always going to be in demand. Now, is the person that builds the most boilerplate
website? And that is the only thing they are excited to do in the future. That person's skill set
is going to be depreciating with time. But I think that's a, but that's a simplistic way of looking
at it because, you know, if they were a software engineer, they should know how to reason about
systems. They should be good problem solvers. I think that's the hallmark of like software engineering
as a whole. And they will always have a position out there in my opinion. Now, since you started
to build Windsurf or even Codeium, how has your view changed on the future of software engineering?
And we've touched on a few things. But, but like have there been some things like before and after?
Now you're thinking about things differently? You know, I think that timelines for,
a lot of things. I'm like less scared of them, even though like I think a lot of them are supposed
to come like come out like various, like as scary numbers. You know, I think recently Dario from
Anthropic said 90% of all committed code is going to be AI generated. I think the answer to that is
going to be yes. My question after that is so what like so what if that's the case? Developers don't
only spend time writing code, right? I think there's there's this fear that comes from all this stuff.
I think I think AI systems are going to get smarter and smarter very quickly. But look, look,
when I think about what engineers love doing, I think they love solving problems, right?
They love collaborating with their peers to find out how to make solutions that work.
And I think when I look at the future, it's more like things are going to improve very quickly,
but I think people are going to be able to focus on the things that they really want to do when
they're developers, not like the nitty-gritty details that, as you said, you go home and you're like,
I don't know why this doesn't compile.
I think that will, a lot of those small details for most people are going to be a relic of the past.
Well, I'll tell you, I'll give the idea of why people are stressed, you know, like, because
and they're going to say, you know, some listeners will say, like, well, you're in an easy
position because you're in the middle of an AI company building all these tools, which is the
future, right?
Like, and you're going to be fine for the next few years.
And they're thinking, I'm sitting at a B-to-B SaaS company where, like, I'm building
a software.
And my employer is thinking that these things make us 20% or 25% more efficient and they're
going to cut a quarter of a team.
And I'm worried, A, if it's going to be me, B,
the job market is not that great.
And I get it that I can be more productive with these things, but I still need to find a job.
And that is the, you know, like not everyone will verbalize this, but this is the thing that gives people, this is, you know, when they're hearing Dario talk about the 90%, they're thinking, oh, damn, my employer will say like, okay, Joe, we don't need you anymore.
Yeah.
The problem is, I don't know what, like, maybe this is like, I don't know if this is like a real good answer, but that feels like the employers being like irrational.
Because, okay, my, let me, let me provide, let me provide the take here.
If the B2B SaaS company that is not doing well
needs to compete with other B2B SaaS companies,
if they reduce the number of engineers that they have,
they're basically saying their product is not going to improve that quickly
compared to a competitor that is willing to hire engineers
and improve their software much more quickly,
I do think consumers and just businesses
are going to have much higher expectations for software.
So the demand for software that I buy is way higher.
Like, I don't know if I've noticed it.
I feel bad when I buy a piece of software
that looks like it did, like, you know,
a couple years ago, that's like this ugly procurement software.
Yeah. And these days you don't have a, I hear you. I think, I think, I see your point.
I see the short term of like, like, are there employers that look at this and they're like,
this is an opportunity to cut? I think these employers are being really, really short-sighted.
Yeah. And I think I'm getting a little bit of hope from even other industries. There was a time
where people, writers were being fired left and right. Like, I'm not saying software writers,
but like just like old traditional writers. And now there's a big hiring spree from all sorts of companies of
hiring writers because turns out the AI is kind of, you know, it's a bit bland and a great writer
with AI is way better than without, I think same for software engineers. So that's also a bit of
my message to anyone listening, but it's just good to hear from you. Exactly. When you
have a competitive market and you add a lot of automation, automation is great, but what you actually
need to compare is automation with a human. And if that's way more leveraged, then you actually
should compete with that. That's like the game theoretically optimal thing to do. And actually that, that's
the tool that you're building right now, which I think is one of the reasons that it's,
like a reason I like to use it. It doesn't feel that it's like trying to do anything
instead of me; it's doing it with me and making me way more efficient as an engineer.
So to wrap up, I just have some rapid questions. I'm just going to ask them and then you can
shoot the answer. So I've heard that you're really into endurance sports, long distance running,
cycling, and you do just a lot of it. Now, a lot of people are thinking, well, I'm pretty busy with
my job with coding, et cetera. I don't have as much time for
sports. How do you make time for sports? And what would your advice be for someone who, like, actually
wants to get in a lot better shape while being a software engineer and busy with work?
So I will say this, like, since starting the company, that has gone down drastically. But at my previous company,
I still worked a ton. I worked at an autonomous vehicle company. I would bike over like 150 miles a week, like rigorously,
like probably close to 160, 170. I think, interestingly, for an activity like this,
I actually got Zwift, so like this way to bike indoors.
And I would just be able to knock out like 20 to 25 miles in an hour, like at home.
And the benefit there is like now I can come back from work very quickly do a ride.
And then, you know, on the weekends on a Saturday, I would just dedicate being able to do
potentially like a 70 mile loop somewhere.
One of the lucky things for me is I'm in the Bay Area.
So there's a lot of like amazing places to ride a bike that have hills and stuff like that.
So I think it's easy to carve out this time, but you kind of, you know, you need to make the friction for yourself a lot lower, right?
I think if I needed to, I would never go to a gym, like, rigorously.
I think I'm not the type of person that would like, you know, I would just find a way to not do it.
But if it's literally at home right next to where I sleep, I'm going to find a way to do it.
Sounds like just just make it work for you.
Yeah.
And what's a book that you would recommend and why?
You know, there was a book that I read a long time ago that I really enjoyed.
It's called The Idea Factory.
It's basically about how Bell Labs kind of like innovated so much while being a very commercial
entity.
And it was very interesting to see some of like the great scientists of our time working at
this company providing so much value.
So like information theory, Claude Shannon worked there.
Right.
The like the founding of the transistor happened sort of like Shockley and all these people kind
of were there too.
And just hearing how a company is able to straddle the line between both was really
exciting. Yeah, and I hear that, you know, OpenAI got inspired by Bell Labs a lot. Their titles are
coming back. And I think I actually, I personally want to read more about that. So thanks for the
recommendation. Well, thank you. This was great. This was super interesting and just, just love all the
insights that you shared. Yeah, thanks a lot for having me. I hope you enjoyed this conversation with
Varun and the challenges that the Windsurf team is solving for. One of the things I enjoyed discussing was when
Varun shared how they have a bunch of features that just didn't work out, like their review tool. And then
they celebrate failure and just move on. I also found it fun to learn how any developer
can roll out any feature they built to the whole company and get immediate feedback, whether
it's good or bad. For more deep dives on AI coding tools, check out the Pragmatic Engineer
Deep Dives link in the show notes below. If you've enjoyed this podcast, please consider
leaving a rating. This helps more listeners discover the podcast. Thanks and see you in the next one.
