Latent Space: The AI Engineer Podcast - Windsurf: The Enterprise AI IDE - with Varun and Anshul of Codeium AI
Episode Date: December 13, 2024
Our second podcast guest ever in March 2023 was Varun Mohan, CEO of Codeium; at the time, they had around 10,000 users and vowed to keep their autocomplete free forever. Today, over a million... developers use their products, they still have their free tier, and they recently launched Windsurf, an AI IDE.
Chapters
* 00:00:00: Introductions & Catchup
* 00:03:52: Why they created Windsurf
* 00:05:52: Limitations of VS Code
* 00:10:12: Evaluation methods for Cascade and Windsurf
* 00:16:15: Listener questions about Windsurf launch
* 00:20:30: Remote execution and security concerns
* 00:25:18: Evolution of Codeium's strategy
* 00:28:29: Cascade and its capabilities
* 00:33:12: Multi-agent systems
* 00:37:02: Areas of improvement for Windsurf
* 00:39:12: Building an enterprise-first company
* 00:42:01: Copilot for X, AI UX, and Enterprise AI blog posts
This is a public episode. If you'd like to discuss this with other subscribers or get access to bonus episodes, visit www.latent.space/subscribe
Transcript
Hey, everyone. Welcome to the Latent Space podcast.
This is Alessio, partner and CTO at Decibel Partners, and I'm joined by my co-host
swyx, founder of Smol AI.
Hey, and today we are delighted to be, I think, the first podcast in the new Codeium office.
So thanks for having us, and welcome Varun and Anshul.
Thanks for having us.
Yeah, thanks for having us.
This is the Silicon Valley office?
Yeah.
So, like, what's the story behind this?
The story is that the office was previously, so we used to be on Castro Street, so this is in Mountain View. And I think a lot of the people at the company previously, you know, were in SF or are still in SF. And actually one thing, if you notice about the company, is it's actually like a two-minute walk from the Caltrain. And I think we didn't want to move the office very far away from the Caltrain. That would probably, you know, piss off a lot of the people that live in San Francisco, this guy included. So we were scouting a lot of spaces in the nearby area and this area popped up. It previously was being leased by, I think, Facebook slash WhatsApp, and then immediately after that Ghost Autonomy, and then now here we are. And also, you know, I guess one of the things that the landlord told us was this was the place where they shot all the scenes for Silicon Valley, at least externally and stuff like that. So that just became a meme. Trust me, that wasn't like the main reason why we...
But we've leaned into it.
It doesn't hurt. Yeah, yeah.
And obviously that played a little bit into your launch with Windsurf as well. So let's get caught up. You were guest number four.
I think you were two. Maybe it was two.
So a lot has happened since then.
You've raised a huge round and also just launched your IDE.
What's been the progress over the last year or so since the Latent Space people last saw you?
Yeah.
So I think the biggest things that have happened are Codeium's extensions have continued to gain a lot of sort of popularity.
We have over 800,000 sort of developers that use that product.
Lots of large enterprises also use the product.
We were recently awarded JP Morgan Chase's Hall of Innovation Award,
which is usually not something a company gets within a year of deploying an enterprise product.
And then large companies like Dell and stuff use the product.
So I think we've seen a lot of traction on the enterprise base.
But I think one of the most exciting things we've launched recently is actually this IDE called Windsurf.
And I think for us, one of the things that we've always thought about is how do we build the most powerful AI system for developers everywhere?
The reason why we started out with the extension system was we felt that there were lots of developers that were not going to be on one platform.
And that still is true, by the way. Outside of Silicon Valley, a lot of people don't use GitHub.
This is like a very surprising finding, but most people use GitLab, Bitbucket, Gerrit, Perforce, CVS, Harvest, Mercurial.
I could keep going down a list, but there's probably 10 of them. GitHub might have less than 10% penetration of the Fortune 500, full penetration. It's very small.
And then also on top of that, source code management tools like GitHub have very high switching costs, right?
Because you actually need to switch over all the dependent systems on this workflow software. It's much harder than even switching off of a database.
So because of that, we actually found ways in which we could be better partners to our customers, regardless of where they stored their source code.
And then more specifically on the IDE category, a lot of developers, surprise, surprise, don't just write TypeScript and Python.
They write Java.
They write Golang.
They write a lot of different languages.
And then high-quality language servers and debuggers matter.
Very honestly, JetBrains has the best debugger for Java.
It's not even close.
These are extremely complex pieces of software.
We have customers where over 70% of their developers are on JetBrains.
And because of that, we wanted to provide a great experience wherever the developer was.
But one thing that we found was lacking was, you know, we were running into the limitations of building within the VS Code ecosystem, on the VS Code platform.
And I think we felt that there was an opportunity for us to build a premier sort of experience.
And that was within the reach of the team, right?
The team has done all the work, all the infrastructure work to build the best possible experience, right, and plug it into every IDE.
Why don't we just build our own IDE that is by far the best experience?
And as these agentic products started to become more and more possible, and all the research we had done on retrieval and just reasoning about codebases came more and more to life, we were like, hey, if we launched this agentic product on top of a system that we didn't have a lot of control over, it's just going to limit the value of the product and we're just not going to be able to build the best tool.
That's why we were super excited to launch Windsurf.
I do think it is the most powerful IDE system out there right now in terms of capability, right?
And this is just the beginning.
I think we suspect that there's much, much more
we can do more than just the autocomplete sort of side. When we originally talked, probably
autocomplete was the only piece of functionality the product actually had. And we've come a long
way since then, right? These systems can now reason about large code bases without you adding
everything, right? Like when you use Google, do you say, like, "@New York Times post, blah, blah,"
and like ask it a question? No. We want it to be a magical experience where you don't need to do
that. We want it to actually go out and execute code. We think code execution is a really,
really important piece. And when you write software, you not only just kind of come up with an idea,
the way software kind of gets created is software is originally this amorphous blob. And as time goes on,
and you have an idea, the blob and the cloud sort of disappear and you see this mountain.
And we want it to be the case that as soon as you see the mountain, the AI helps you get to the
mountain. And as soon as you see the mountain, the AI just creates the mountain for you. Right. And that's why
we don't believe in this sort of modality where you just write a task and it just goes out and does it.
Right? It's good for zero-to-one apps. And I think people have been seeing Windsurf is capable of doing that, and I'll let Anshul talk about that a little bit, but we've been seeing real value in real software development. This is not to say that current tools can't, but I think more in the process of actually evolving code from a very basic idea. Code is not really built as you have a PRD and then you get some output out. It's more like you have a general vision, and as you write the code, you get more and more clarity on approaches that don't work and do work. You're killing ideas and creating ideas constantly, and we think Windsurf is the right paradigm for that.
Can you spell out what you couldn't do in VS Code?
Because I think when we did the Cursor episode, everybody on Hacker News was like, oh, blah, blah, why did you fork?
You could have done it in an extension.
Like, can you maybe just explain more of those limitations?
I mean, I think a lot of the limitations around, like, APIs are pretty well documented.
I don't know if we need to necessarily go down that rabbit hole.
I think it was when we started thinking, okay, what are the pieces that we actually need to give the AI to get to that kind of emergent behavior that Varun talked about. And yes, we were talking about all the knowledge retrieval systems that we've been building for the enterprise all this time. That's obviously a component of that. We were talking about all the different tools that we could give it access to so it can go, like, do that kind of terminal execution and things like that.
But then the third main category that we realized would be like kind of that magical thing, where you're not out there writing out a PRD, you're not scoping the problem for the AI, is if we were actually able to understand the trajectory of what developers are doing within the editor.
If we were actually able to see, like, oh, the developer just went and opened up this part of the directory and tried to view it, then they made these kind of edits, and they tried to do some kind of commands in the terminal.
And if we actually understand that trajectory, then the AI can just be immediately like, oh, I understand your intent, this is what you want to do, without you having to spell it all out for it.
That is when that kind of magic would really happen.
I think that was kind of like that intuition.
So you have the restrictions of the APIs that are well documented.
We have the kind of vision of what we actually need to be able to hook into to really expose this. And I think it was that combination of those two where we were like, I think it's about time to do the editor. The editor was not necessarily a new idea. I think we've been talking about the editor for a very long time. Of course, we just pulled it all together in the last couple of months. But it was always something in the back of the mind. And it's only when we started realizing, okay, the models are now capable of doing this, we actually can look at this data, we have a really good context awareness system, that we were like, I think now's the time. And we went on and executed on it.
So it's basically not one action you couldn't do, but it's like how you brought it all together.
It's like VS Code is kind of, like, sandboxed, so to speak.
Yeah, let me maybe even just go one step deeper on each of the aspects that Anshul talked about.
Let's go with the API aspect.
So right now, I'll give you an example.
Supercomplete is actually a feature that I think is very exciting about the product, right?
It can suggest refactors of the code.
I think it can do it quickly and very powerfully.
On VS Code, actually, the problem for us wasn't being able to implement the feature.
We had the feature for a while.
The problem was actually even to show the feature, VS Code would not expose an API for us to do this.
So what we actually ended up doing was dynamically generating PNGs to actually go out and showcase this.
It was not really aligned.
We actually ended up doing it ourselves, and it took us a couple hours to actually go out and implement this, right?
And that wasn't because we were bad engineers.
No, our good engineering time was being spent fighting against the system rather than building a good system.
Another example is we needed to go out and find ways to refactor the code.
The VS Code API would constantly keep breaking on us, and we would constantly need to show a worse and worse experience.
This actually comes down to the second point which Anshul brought up, which is, like, we can come up with great work and great research.
All the work we have here, like the research on Cascade, is not a couple-month thing.
This is like a nine-months-to-a-year thing that we've been investigating as a company.
Investing in evals, right?
Even the evals for this are a lot of effort, right?
A lot of actual systems work to go out and do it.
But ultimately, this needs to be a product that developers actually use.
And I think, you know, let's even go to Cascade, for example, and looking at the trajectory.
Can you define Cascade because that's the first time you brought it up?
Yeah.
So Cascade is the product that is the actual agentic part of the product, right,
that is capable of taking information from both these human trajectories
and these AI trajectories, what the human ended up doing,
what the AI ended up doing, to actually propose changes
and actually execute code to finally get you the final work output, right?
I'll even talk about something very basic.
Cascade gives you a bunch of code.
We want developers to very easily be able to review this code.
Okay, then we can show developers a hideous UI that they don't want to look at, and no one's going to really use this product. And we think that this is like a fundamental building block for us to make the product materially better. If people are not even willing to use the building block, where does this go? And we just felt our ceiling was capped on what we could deliver as an experience.
Interestingly, JetBrains is a much more configurable paradigm than VS Code is. But we just felt so limited on both of the directions that Anshul described that we were just like, hey, if we actually remove these limitations, we can move substantially faster. And we believe that this was a necessary step for us.
I'm curious more about the evals side of it, because you brought it up.
And we have to ask about evals anytime anyone brings up evals. How do you evaluate a thing like
this that is so multi-step and spans so much context?
So what you can imagine we can sort of do, and this is one of the beautiful things about code, is code can be executed. We could go take a bunch of open source code. We can find a bunch of commits, right? And we can actually see if some of these commits have tests associated with them. We can start stripping the commits, and the approach of stripping the commits is good because it tests the fact that the code is in an incomplete state, right? When you're writing the commit, the goal is not that the commit has already been written for you. You're given it in a state where the entire thing has not been written, and can we go out and actually retrieve the right snippets and come up with a cohesive plan and iterative loop that gets you to a state where the code actually passes? So you can actually break down and decompose this complex problem into a planning, retrieval, and multi-step execution problem. And you can see on every single one of these axes whether it is getting better. And if you do this across enough repositories, you've turned this highly discontinuous and discrete problem of make a PR work versus make it not work into a continuous problem. And now that's a hill you can actually climb. And that's a way that you can actually apply research where it's like, hey, my retrieval got way better. This made my eval get better. And then notice how the way the eval works is, I'm not that interested in the eval where purely it's the commit message and you finish the entire thing. I'm more interested in the case where the code is in an incomplete state and the commit message isn't even given to you, because that's another thing about developers: they're not willing to tell you exactly what's in their head. That's the actual important piece of this problem.
We believe that developers will never completely pose the problem statement, right? Because the problem statement lives in their head, in conversations that you and I have had at the coffee area, conversations that I've had over Slack, conversations I've had over Jira, right? Maybe not Jira, let's say Linear, right? That's the cool thing nowadays.
They're talking about Jira.
Yeah. So conversations I've had on Linear. And all of these things come together to actually finally propose sort of a solution there, which is why we want to test the incomplete code. What happens when the code is in an incomplete state? And am I actually able to make this pass without the commit? And can I actually guess your commit well? Now you can convert the problem into a masked prediction problem where you want to guess both the high-level intent as well as the remainder of changes to make the actual test pass. And you can imagine if you build up all of these, now you can see, hey, my systems are getting better. Retrieval quality is getting better. And you can actually start testing this on larger and larger code bases. And I guess that's one thing that, to be honest, we could have done a little faster. We had the technology to go out and build these zero-to-one apps very quickly. And I think people are using Windsurf to actually do that. And it's extremely impressive. But the real value, I think, is actually much deeper than that. It's actually that you take a large code base and it's actually a really good first pass. And I'm not saying it's perfect, but it's only going to keep getting better. And we have deep infrastructure that actually is validating that we're getting better on this dimension.
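As a rough illustration of the stripped-commit eval described above, here is a minimal Python sketch. It is not Codeium's harness: `CommitCase`, `strip_commit`, and `pass_rate` are hypothetical names, a repository is modeled as a plain dict of file contents, and "stripping" is simplified to applying only some of the files a commit changed. The agent is given the partial state with no commit message and is scored on whether the commit's test passes.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Toy stand-in for a checkout: path -> file contents (an assumption for this sketch).
Repo = Dict[str, str]


@dataclass
class CommitCase:
    """One historical commit with a test that the finished commit satisfies."""
    before: Repo                      # repository state before the commit
    after: Repo                       # repository state after the commit
    test: Callable[[Repo], bool]      # True if the given state passes


def strip_commit(case: CommitCase, keep: int) -> Repo:
    """Build an incomplete state by applying only the first `keep` changed files."""
    state = dict(case.before)
    changed = sorted(p for p, text in case.after.items() if case.before.get(p) != text)
    for path in changed[:keep]:
        state[path] = case.after[path]
    return state


def pass_rate(cases: List[CommitCase], agent: Callable[[Repo], Repo], keep: int) -> float:
    """Fraction of cases where the agent finishes the partial commit so the test passes."""
    if not cases:
        return 0.0
    passed = 0
    for case in cases:
        partial = strip_commit(case, keep)   # no commit message is handed to the agent
        candidate = agent(partial)
        passed += bool(case.test(candidate))
    return passed / len(cases)
```

Averaging `pass_rate` over many repositories and different `keep` values is one way the pass-or-fail outcome of a single PR could become the smoother number the speaker describes climbing.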
Varun mentioned the end-to-end evals that we have for the system, which I think are super cool.
But I think you can even decompose each of those steps, right?
The idea is, just take retrieval, for example, right?
Like, how can we make the eval for retrieval really good?
And I think this is just a general thing that's been true about us as a company.
Like, most evals and benchmarks that exist out there for software development are kind of bogus.
There's not really a better way of putting it.
Like, okay, you have SWE-bench, that's cool.
No actual professional work looks like SWE-bench.
Like, HumanEval, same thing.
Like, these things are just a little kind of broken.
So when you're trying to optimize against a metric that's a little bit broken, you end up making kind of suboptimal decisions.
So something that we're always very keen on is like, okay, what is the actual metric that we want to test for this part of the system?
So take retrieval, for example. A lot of the benchmarks for these embedding-based systems are like needle-in-a-haystack problems.
Like, I want to find this one particular piece of information out of all this potential context.
That's not really what actually is necessary for doing software engineering, because code is a super distributed knowledge store.
You actually want to pull in snippets from a lot of different parts of the code base in order to do the work, right? And so, you know, we built systems where, instead of looking at retrieval at one, you're looking at retrieval at, like, 50. Like, what are the 50 highest things that you can actually retrieve? And are you capturing all of the necessary pieces for that? And what are all the necessary pieces? Well, you can look again back at old commits and see all the different files that together were edited to make a commit, because those are semantically similar things that might not actually show up if you try to map out a code graph, right? And so we can actually build these kind of golden sets. We can do this evaluation even for sub-problems in the overall task. And so now we have, like, you know, an engineering team that can iterate on all of these things and still make sure that the end goal that we're trying to build to is really, really strong, so that we have confidence in what we're pushing out.
And by the way, just to say one more thing about the SWE-bench thing,
just to showcase these existing metrics. I think benchmarks are not a bad thing. You do want benchmarks. Actually, I would prefer if there are benchmarks versus, let's say, everything was just vibes, right? But vibes are also very important, by the way, because they showcase where the benchmark is not valuable, because actually vibes sometimes show you where critical issues exist in the benchmark. But you look at some of the ways in which people have optimized SWE-bench. It's like, make sure to run pytest every time X happens. And it's like, yeah, sure. You can start prompting it in every single possible way. And if you remove that, suddenly it doesn't do well at it. It's like, what really matters here? What really matters here is, across a broad set of tasks, you're producing high quality suggestions for people and people love using the product. And I think actually, the way these things work is, yes, I think it's valuable up to a certain point.
But once it starts hitting the peak of these benchmarks, getting that last 10% actually probably is counterproductive to the actual goal of what the benchmark was.
Like, you probably should find a new hill to climb rather than sort of p-hacking or really optimizing for how you can get higher on the benchmark.
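The "retrieval at 50" idea from earlier in this exchange can be sketched as a recall-at-k metric over a golden set, where the golden set is the group of files that one historical commit edited together. This is a minimal sketch under that assumption, in Python; `recall_at_k` and `mean_recall_at_k` are illustrative names, not anything the speakers confirm.

```python
from typing import Iterable, List, Set, Tuple


def recall_at_k(ranked: List[str], golden: Set[str], k: int = 50) -> float:
    """Fraction of the golden files that appear in the top-k retrieved items."""
    if not golden:
        return 1.0  # nothing was required, so nothing was missed
    top_k = set(ranked[:k])
    return len(top_k & golden) / len(golden)


def mean_recall_at_k(results: Iterable[Tuple[List[str], Set[str]]], k: int = 50) -> float:
    """Average recall@k over (ranked retrieval, golden set) pairs, one pair per commit."""
    scores = [recall_at_k(ranked, golden, k) for ranked, golden in results]
    return sum(scores) / len(scores) if scores else 0.0
```

Unlike a needle-in-a-haystack score, this rewards a retriever only when it surfaces every piece a change needed, which matches the point that code is a distributed knowledge store.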
Yeah, we did an episode with Anthropic about their recent SWE agent, and we talked about the HumanEval versus SWE-bench results.
HumanEval is kind of like a greenfield benchmark.
You know, you need to be good at that.
SWE-bench is more about existing code.
But it sounds like, I mean, your eval creation is similar to SWE-bench as far as, like, using GitHub commits and kind of like that history.
But then it's more like masking at the commit level versus just testing the output.
That's right.
Of the thing.
Cool.
We have some listener questions actually about the Windsurf launch.
And obviously, I also want to give you the chance to just respond to Hacker News.
Oh, man.
Hey, let me tell you something very, very interesting. I love Hacker News as much as the next person, but the moment we launched our product, the first comment, like, this was a year ago, the first comment was, this product is a virus. And we were like...
This is the original Codeium launch, like two years ago.
This is the original. Like, "I am analyzing the binary as we speak, will report back." And then he's like, it's a virus. And I was like, dude, like, it's not a virus.
We just want to give autocomplete suggestions. That's all we want to do.
Yeah. Okay. Okay.
Wow, I didn't expect that.
And then there was like Tio drama.
There's enough drama on the launch to cover.
But I don't know if we want to just make this a Cascade piece.
But we had a bunch of people in our Discord trial the product, give a lot of feedback.
One question people have is like to them, Cascade already felt pretty agentic.
Like, is that something you want to do more of?
You know, obviously, since you just launched an IDE, you're kind of like focusing on having people write the code.
But maybe this is kind of like the Trojan horse to just doing more full-on end-to-end, like, code creation. Devin style.
Yeah, I think it's like, how do you get there in a real principled manner? We have, obviously, enterprises asking us all the time, like, oh, when's it going to do end-to-end work? The reality is, okay, well, we have something in the IDE that, again, can see your entire actions and get a lot of intent that you can't actually get if you're not in the IDE. If the agent there has to always get human involvement to keep on fixing itself, it's probably not ready to become a full end-to-end automated system, because then we're just going to turn into a linter where it produces a bunch of things and no one looks at any of it.
Like, that's not the great end state.
But if we start seeing, like, oh, yeah, there are common patterns that people do that never require human involvement, where end-to-end it just totally works without any intent-based information, sure, that can become fully agentic.
And we'll learn what those tasks are pretty quickly because we have a lot of data.
Maybe to add on to that.
I think that if the question is whether fully agentic means something like Devin, then yes, the answer is this product should become fully agentic, and limited human interaction is the goal, is 100% the goal.
And I think, honestly, of all usable products right now, I think we're the closest right now.
Of all usable products in an IDE.
Now, let me caveat this by saying, I think there are lots of hard problems that have yet to be solved that we need to go out and solve to actually make this happen.
Like, for instance, I think one of the most annoying parts about the product is the fact that you need to accept every command that gets run.
It's actually fairly annoying.
I would like it to go out and run it.
Unfortunately, me going out and running arbitrary binaries has some problems, in that if it, like, rm -rf's my hard disk, I'm not going to be, I'm not going to be...
It's a virus. It's a virus.
It does become a virus. I think this is solvable with complex systems. I think we love working on complex systems infrastructure. I think we'll solve it. Now, the simpler way to go about solving this is don't run it on the user's machine and run it somewhere else, because then if you bork that machine, you're kind of totally fine. Now, I think maybe there's a little bit of a tradeoff of running it locally versus remotely, and I think we might change our mind on this. But I think the goal is not for this to be the final state. I think the goal for this is, A, it's actually able to do very complex tasks with limited human interaction, but it needs to know when to actually go back to the human, right? Also on top of that, compress every cycle that the agent is running.
Right now, actually, I even feel like the product is too slow for me sometimes.
Even with it running really fast. It's objectively pretty fast. I would still want it to be faster, right? So there is systems work and probably modeling work that needs to happen there to make the product even faster on both the retrieval side and the generation side.
And then finally, I think another key piece here that's really important is I actually think asking people to do things explicitly is probably going to be more of an anti-pattern if we can actually go and passively suggest the entire change for the user.
So almost imagine, as the user is using the product, that we're going to suggest the remainder of the PR without the user even asking us for it.
I think this is sort of the beginning for it.
But yeah, these are hard problems.
I can't give a particular deadline for this.
I think this is a big step up from what we had in the past. But I think what Anshul said is 100% true, and the goal is for us to get better at this.
I mean, the remote execution thing is interesting. You wrote a post about the end of localhost.
Yeah.
And now it's almost like, then we were kind of like, well, no, maybe we do need the internet and, like, people want to run things. But now it's like, okay, no, actually I don't really care. Like, I want the model to do the thing. And if you were like, you can do a task end to end, but it needs to run remotely, not on your computer, I'm sure most people will say, yeah.
No, I agree with that. I actually agree with it running remotely. That's not a security issue.
I totally agree with you that it's possible that everything could run remotely.
That's how it is at most, like, big cos, like Facebook.
Nobody runs things locally.
No one does.
In fact, you connect to a room.
Essentially to the mainframe.
You're right on that.
Maybe the one thing that I do think is kind of important for these systems that is more than just running remotely is basically like, you know,
when you look at these agents, there's kind of like a rollout of a trajectory.
And I kind of want to roll this trajectory back, right?
In some ways, I want like a snapshot of the system that I can like constantly checkpoint and move back and forth.
And then also on top of that, I might want to do multiple rollouts of this.
So basically, I think there needs to be a way to almost like move forward and move backwards
the system.
And whether that's locally or remotely, I think that's necessary.
But if every time you move the system forward it, like, destroys your machine, or potentially destroys your machine, it's probably going to be a hard system to kind of...
That's just not a workable solution.
So I think the local versus remote, I think you still need to solve the problem of this
thing is not going to destroy your machine on every execution, if that makes sense.
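The checkpoint-and-rollback idea above (snapshot the system, try several rollouts, move backwards when one goes wrong) can be sketched in a few lines of Python. This is only an in-memory toy: `Workspace` and `best_rollout` are hypothetical names, the "system" is a dict of file contents, and a real implementation would need to snapshot a filesystem or VM rather than a dict.

```python
import copy
from typing import Callable, Dict, List, Optional


class Workspace:
    """Toy workspace whose entire state is a dict of path -> contents."""

    def __init__(self, files: Dict[str, str]) -> None:
        self.files = dict(files)
        self._checkpoints: List[Dict[str, str]] = []

    def checkpoint(self) -> int:
        """Save a snapshot of the current state and return its id."""
        self._checkpoints.append(copy.deepcopy(self.files))
        return len(self._checkpoints) - 1

    def rollback(self, checkpoint_id: int) -> None:
        """Restore the state saved under `checkpoint_id`."""
        self.files = copy.deepcopy(self._checkpoints[checkpoint_id])


def best_rollout(
    ws: Workspace,
    rollouts: List[Callable[[Workspace], None]],
    score: Callable[[Workspace], float],
) -> Optional[float]:
    """Run each rollout from the same checkpoint and keep the best-scoring result."""
    start = ws.checkpoint()
    best_score: Optional[float] = None
    best_state: Optional[Dict[str, str]] = None
    for rollout in rollouts:
        ws.rollback(start)            # every rollout starts from the same snapshot
        rollout(ws)                   # may do anything, including destructive edits
        current = score(ws)
        if best_score is None or current > best_score:
            best_score, best_state = current, copy.deepcopy(ws.files)
    ws.rollback(start)
    if best_state is not None:
        ws.files = best_state
    return best_score
```

Because each rollout begins from a restored snapshot, a destructive step in one trajectory cannot leak into the next, which is the property the speaker says is needed whether execution is local or remote.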
Yeah.
Yeah.
There is a category of emerging infrastructure providers that are working on time travel VMs.
And if Varun's first episode on this podcast was any indication, we like infrastructure problems.
Okay.
All right.
Oh, so you're going there.
All right.
Well, that's funny, right?
It's like when we first had you, you were doing so much on, like, actual model inference optimization, all these things.
And today it's almost like...
It's Claude.
It's like, you know, people are, like, forgetting about the model.
You know, and it's all at a higher level of abstraction.
Yeah.
So maybe I can say like a little bit about how our strategy on this is like evolved because it objectively has, right?
I think I would be lying if I said it hasn't.
The things like autocomplete and Supercomplete that run on every keystroke are entirely, like, our own models.
And by the way, that is still because properties like FIM, fill-in-the-middle, capabilities are still quite bad with the current...
Non-existent.
They're all, they're very bad, non-existent.
They're not good actually at it.
Because FIM is actually, like, how you order the tokens.
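To make the "how you order the tokens" point concrete, here is a minimal Python sketch of one common fill-in-the-middle layout, where the text before the cursor and the text after it are both placed ahead of the position the model generates from. The sentinel strings and the function names are placeholders assumed for illustration; each FIM-trained model defines its own special tokens, and nothing here is confirmed as Codeium's format.

```python
# Placeholder sentinels (assumptions): real FIM-trained models define their own tokens.
FIM_PREFIX = "<fim_prefix>"
FIM_SUFFIX = "<fim_suffix>"
FIM_MIDDLE = "<fim_middle>"


def build_fim_prompt(document: str, cursor: int) -> str:
    """Reorder the document as prefix, then suffix, then a marker for the middle.

    The model is expected to generate the missing middle after FIM_MIDDLE, so it
    sees the code that follows the cursor even though it decodes left to right.
    """
    prefix, suffix = document[:cursor], document[cursor:]
    return f"{FIM_PREFIX}{prefix}{FIM_SUFFIX}{suffix}{FIM_MIDDLE}"


def splice_completion(document: str, cursor: int, middle: str) -> str:
    """Insert the generated middle back at the cursor position."""
    return document[:cursor] + middle + document[cursor:]
```

A chat-style model trained only on whole messages in order never sees this rearranged layout, which is one way to read the claim that FIM capability depends on how training tokens were ordered.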
It's how you order the tokens actually in some ways. And this is sort of, if you look at what these products have sort of become, and this is great, a lot of the Claudes and the OpenAIs have focused on kind of the chat-like assistant API, where it's like complete piece of work, message, and another complete piece of work. So multi-turn kind of back-and-forth systems. In fact, actually even these systems are not that good at making point changes. When they make point changes, they kind of are off here and there by a little bit. Because yeah, when you are doing multi-point kind of conversations, you know, exact diffs getting applied is not even a perfect science yet. So we care about that. The second piece where we've actually sort of
trained our own models is actually on the retrieval system. And this is not even for embedding,
but like actually being able to use high-powered L-LMs to be able to do much higher quality
retrieval across the code base. Right. So this is actually what Anshul said. For a lot of the systems,
we do believe embeddings work, but for complex questions, we don't believe embeddings can
encapsulate all the granularity of a particular query. Like imagine, imagine I have a question
on a code base of, find me all quadratic time algorithms in this codebase.
Do we genuinely believe the embedding can encapsulate the fact that this function is a quadratic time
function? No, I don't think it does. So you are going to get extremely poor precision
and recall at this task. So we need to apply something a little more high-powered to actually
go out, and we've actually built large distributed systems to go out and run these
at scale, run custom models at scale across large code bases. So I think it's more a question
of that. The planning models right now, undoubtedly, I think the Claudes and the OpenAIs have the
best products. I think Llama 4, depending on where it goes, could be materially better.
It's very clear that they're willing to invest a similar amount of compute as the OpenAIs and
the Anthropics. So we'll see. I would be very happy if they got really good, but unclear so
far. Don't forget Grok. Hey, dude, I think Grok is also possible. Right? I think don't doubt
Elon. Okay. So I didn't actually know. It's not obvious when I use Cascade. I should also
mention that, you know, I was part of the preview. Thanks for letting me in. I've been
maining Windsurf for a long time.
It's not actually obvious.
You don't make it obvious that you are running your own models.
I feel like you should, so that I feel like it has more differentiation.
Like I only have exclusive access to your models via your IDE
rather than having the drop-down as Claude and 4o, because I actually thought that was what you did.
No, so actually the way it works is the high-level planning that is going on in the model
is actually getting done with products like Claude.
But the extremely fast retrieval as well as the ability to take the high-level plan
and actually apply it to the code base are proprietary systems that are running internally.
And then the stuff that you said about embeddings not being enough.
Are you familiar with the concept of late interaction?
No, I actually have never.
Yeah, so this is ColBERT, or like the guy Omar Khattab from, I think, Stanford has been promoting this a lot.
It is basically what you've done.
Okay.
Sort of embedding on retrieval rather than pre-embedding.
Okay.
In a very loose sense.
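For readers who have not met late interaction, the following is a toy sketch of the scoring idea usually associated with ColBERT: keep one vector per token and, for each query token, take the best match among the document's tokens. The hand-written two-dimensional vectors and the function names are assumptions for illustration only; a real system would get token vectors from a trained encoder, and this is not a description of Codeium's retrieval.

```python
from typing import Sequence

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def late_interaction_score(query_tokens: Sequence[Vector], doc_tokens: Sequence[Vector]) -> float:
    """Sum, over query tokens, of the best dot product with any document token."""
    return sum(max(dot(q, d) for d in doc_tokens) for q in query_tokens)


def pooled_score(query_tokens: Sequence[Vector], doc_tokens: Sequence[Vector]) -> float:
    """Single-vector baseline: average each side's tokens, then one dot product."""
    def mean(vectors: Sequence[Vector]) -> list:
        return [sum(column) / len(vectors) for column in zip(*vectors)]
    return dot(mean(query_tokens), mean(doc_tokens))


# Toy token vectors (made up): doc_a contains both query concepts sharply,
# doc_b contains only a blurred mixture of them.
query = [[1.0, 0.0], [0.0, 1.0]]
doc_a = [[1.0, 0.0], [0.0, 1.0]]
doc_b = [[0.5, 0.5], [0.5, 0.5]]
```

In this toy case the pooled score cannot tell the two documents apart, while the per-token score prefers `doc_a`, which is the kind of granularity a single pre-computed embedding tends to lose.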
I think that sounds like a very good idea that is very similar to what we're doing.
It sounds like a very good idea.
I think we'd say that.
That's like the meme of Obama giving himself a medal right there.
Well, I mean, there might be something to learn from contrasting the ideas
and seeing where, like, the similarities and differences are.
It's also been applied very effectively to vision understanding.
Because vision models tend to just consume the whole image,
if you are able to sort of focus on images based on the query,
I think that can get you a lot of extra performance.
The basic idea of using compute in a distributed manner to do
operations over a whole set of
raw data rather than like a
materialized view. It's not anything new, right?
I think it's just like how does that look like for LLMs?
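One way to picture the idea of running compute over the raw data at query time, instead of looking things up in a precomputed view, is the sketch below. The `judge` function is a deliberately crude stand-in for a model call: it ignores the query and just looks for a `for` loop nested inside another, echoing the quadratic-time example from earlier in the conversation. All names are hypothetical, and a thread pool on one machine stands in for the distributed system mentioned here.

```python
from concurrent.futures import ThreadPoolExecutor


def judge(query: str, chunk: str) -> float:
    """Placeholder for a model call that scores one chunk against the query.

    Assumption for illustration: two 'for' loops at different indentation
    depths count as a nested loop. A real judge would actually read the query.
    """
    depths = {
        len(line) - len(line.lstrip())
        for line in chunk.splitlines()
        if line.lstrip().startswith("for ")
    }
    return 1.0 if len(depths) >= 2 else 0.0


def scan(query: str, chunks: dict, threshold: float = 0.5, workers: int = 4) -> list:
    """Score every raw chunk at query time and keep the ones above the threshold."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        scores = list(pool.map(lambda chunk: judge(query, chunk), chunks.values()))
    return [name for name, score in zip(chunks, scores) if score >= threshold]


chunks = {
    "pairs": "for a in xs:\n    for b in xs:\n        out.append((a, b))\n",
    "total": "for x in xs:\n    s += x\n",
}
hits = scan("find quadratic time algorithms", chunks)
```

The cost is that every query touches every chunk, which is why this approach only makes sense with enough parallel compute behind it.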
When I hear you say build large distributed systems,
you have a very strange product strategy of going
down to the individual developer, but also to the large enterprise.
Is it the same infra that serves everything?
I think the answer to that is yes.
The answer to that is yes.
And to be honest, our company is a lot more complex
than it would be if we just wanted to serve the individual.
And I'll tell you that because
we don't really pay other providers to do things for our indexing.
We don't pay other providers to do the serving of our own custom models, right?
And I think that's a core competency within our company that we have decided to build,
but that's also enabled us to go and like make sure that when we're serving these products
in an environment that works for these large enterprises,
we're not going out and being like, we need to build this custom system for you guys.
This is the same system that serves our entire user base.
So that is a very unique decision we've taken as a company.
and we admit that there were probably faster ways
that we could have done this.
I was thinking, you know, when I was working with you
for your enterprise piece, I was thinking, like, this philosophy
of go slow to go fast, like build deliberately
for the right level of abstraction
that can serve the market that you really are going after.
Yeah, I mean, I would say, like, when writing that piece,
you're like looking back and reading it back,
it sounds so, like, almost obvious in hindsight.
Not all of those are really conscious decisions we made.
Like, I'll be the first to admit that.
But, like, it does help, right?
When we go to, like, an enterprise
that has tens of thousands of developers
and they're like, oh, wow, like, we have tens of thousands of developers,
and does your infrastructure work for tens of thousands of developers?
We can turn around and be like, well, we have hundreds of thousands of developers
on an individual plan that we're serving.
Like, I think we'll be able to support you, right?
So, like, being able to do those things, like, we started off by just, like,
let's give it to individuals and see what people like and what they don't like and learn,
but then those become value propositions when we go to the enterprise.
And to recap, when you first came on the pod, it was like auto-completion is free.
And Copilot was $10 a month.
And you said, look, what we care about is building things on top of code completion.
How did you decide to just not focus on like short-term kind of like growth monetization of like the individual developer and like build some of this?
Because the alternative would have been, hey, all these people are using it.
It's like we're going to make this other like five bucks a month plan, monetize.
I think I think this might be a little bit of like commercial instinct that the company has and unclear if the commercial instinct is right.
I think that right now, optimizing for making money off of individual developers is probably the wrong, actually, strategy.
Largely because I think individual developers can switch off of products very quickly.
And unless we have, like, a very large lead trying to optimize for making a lot of profit off of individual developers,
it's probably something that someone else could just vaporize very quickly and then they move to another product.
And I'm going to say this very honestly, right?
Like when you use a product like Codeium on the individual side, there's
not much that prevents you from switching to another product. I think that will change with time as the
products get better and better and deeper and deeper. I constantly say this, like, there's a book
in business called 7 Powers. And I think one of the powers that a business like ours needs to
have is real switching costs. But you first need something in the product that makes people
switch on and stay on before you think about how do you make people switch off. And I think for us,
we believe that there's probably much more differentiation we can derive in the enterprise by
working with these large companies in a way that is like, that is interesting and scalable
for them. Like, I'll be maybe more
concrete here. Individual developers are much
more sort of tuned towards
small price changes. They care a lot more,
right? Like if our product is 10, 20
bucks a month instead of 50 or 100 bucks a month,
that matters to them a lot.
For a large company where they're already spending
billions of dollars on software, this is much less
important. So you can actually solve maybe deeper
problems for them, and you
can actually kind of provide more differentiation
on that angle. Whereas I think
individual developers could be churning as long
as we don't have the best products. So focus on
being the best product, not trying to take price
and make a lot of money off of people.
And I don't think we will, for the
foreseeable future, try to be a company that tries
to make a lot of money off individual developers.
I mean, that makes sense. So why
$10 a month for Windsurf?
$10 a month was actually the pro plan.
So we launched our individual pro plan
before Windsurf existed. Because I think there's
we also have to be financially
responsible.
Yeah, yeah. We can run out of money.
It's a cool. I mean, there's a lot of things
because of our infrastructure background, we can give, like, for essentially free,
like, unlimited auto-complete, you know, unlimited chat on, like, our, you know, faster
models, like, we give a lot of things actually out for free.
But, yeah, when we start doing things like Supercomplete and really large amounts
of indexing and all of these things, like, there is real COGS here.
Like, we can't ignore that.
And so we just created $10 a month pro plan, mostly just to cover the cost.
Like, we're not really, like, operating, I think, on much of a margin there either.
But, like, okay, like, just to cover us there.
So for Windsurf,
it just ended up being the same thing.
And everyone who downloads Windsurf gets the first, like, I forget, a couple of weeks,
like two weeks, for free.
Let's just have people try it out, let us know what they like, what they don't like, and
that's how we've always operated.
I've talked to a lot of CTOs in, like, the Fortune 100, where most of the engineers
they have, they don't really do much anyway.
The problem is not that the developer costs 200K and you're saving 8K.
It's like that developer should not be paid 200K.
But that's kind of like the base price, you know?
But then you have developers getting paid 200K that should
be paid 500K. So it's almost like you're averaging out the price because most people are actually
not that productive anyway. So if you make them 20% more productive, they're still not very productive.
And I don't know in the future. Is it that the junior developer's salary is like 50K, you know,
and it's like the bottom end gets kind of squeezed out and then the top end gets squeezed up?
Yeah, maybe I'll say, you know, one thing that I think about a lot, because I do think about
this, the per-seat thing, all of this stuff I think about a good deal. Let's take a product like Office 365.
I will say a lawyer at Codeium uses Microsoft Word way more than I do.
I'm still footing the same bill.
But the amount of value that he's deriving from Office 365 is probably, you know, tens of thousands of dollars.
By the way, everyone, you know, Google Docs is a great product.
Microsoft Word is a crazy product.
It made it so that the moment you review anything in Microsoft Word, the only way you can review it is with other people in Microsoft Word.
It's like this virus that penetrates everything.
And it not only penetrates within the company.
It penetrates cross-company, too.
The amount of value it's driving is way higher for him.
So for these kinds of products, there's always going to be
this variance between who gets value from these products, right?
And you're right. It's almost like a blend, because you're actually totally right.
Probably this company should be paying that one developer maybe like four times as much.
But in a weird way, software is like this team activity enough that there's a bunch of blended outcomes.
But hey, like 20% of the four times and there are four people is still going to cover the cost
across the four individuals, right?
And that's how roughly these products kind of get priced out.
I mean, more than about pricing, this is about like the future.
of the software engineer.
We could be very wrong also.
Yeah.
I think nobody knows.
Reserve the right to be incredibly off.
Yeah.
I mean, business model does, in fact, impact
the product, and product does impact the user experience.
So it's all of a kind.
I don't mind.
We are as concerned about the business of tech as the tech itself.
That's cool.
Speaking of which, there's other listener questions.
Shout out to Daniel Imfeld, who's pretty active in our Discord,
just asking all these things.
Multi-agent, very, very hot and popular,
especially from like the Microsoft Research point
of view, have you made any explorations there?
I think we have.
I don't think we've called it a multi-agent,
which is more so like this notion of having many trajectories
that you can spawn off that kind of like validate sort of some different hypotheses
and you can kind of pick the most interesting one.
This is stuff that we've actually analyzed internally at the company.
By the way, the reason why we have not put these things in, actually,
is partially because we can't go out and execute some random stuff in parallel in the meantime.
In the meantime on other sides.
Because of the side effects.
Because of the side effects, right?
So there are some things that are a little bit dependent on us unlocking more and more functionality internally.
And then the other thing is in the short term, I think there is also a latency component.
And I think all of these things can kind of be solved.
I actually believe all of these things are solvable problems.
They're not unsolvable problems.
And if you want to run all of them in parallel, you probably don't want N machines to go out and do it.
I think that's unnecessary, especially if most of them are I/O-bound kind of operations,
where all you're doing is reading a little bit of data and writing out a little bit of data.
It's not extremely compute-intensive.
I think that it's a good idea, and probably something
we will pursue and is going to be in the product.
I'm still processing what you just said about things being I/O-bound.
So for a certain class of concurrency, you can actually just run it all in one machine.
Why not?
Because if you look at the changes that are made, right, for some of these
you're writing out like what, a couple thousand bytes, maybe like tens of thousands of bytes on every...
It's not a lot.
Very small.
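The argument that many I/O-bound trajectories can share one machine can be sketched with `asyncio`. Everything below is a hypothetical illustration: the trajectory names, the `patch.txt` file, and the short sleep standing in for a model or tool call are assumptions. Each trajectory writes into its own scratch directory, which is one simple way to keep side effects from colliding.

```python
import asyncio
import tempfile
from pathlib import Path


async def run_trajectory(name: str, edit: str, root: Path) -> tuple:
    """Run one candidate trajectory inside its own scratch directory."""
    workdir = root / name
    workdir.mkdir()
    # Stands in for waiting on a model or tool call; the event loop is free meanwhile.
    await asyncio.sleep(0.01)
    # The blocking file write runs in a worker thread so other trajectories keep going.
    written = await asyncio.to_thread((workdir / "patch.txt").write_text, edit)
    return name, written


async def run_all(edits: dict) -> dict:
    """Run every trajectory concurrently in one process and collect characters written."""
    with tempfile.TemporaryDirectory() as tmp:
        pairs = await asyncio.gather(
            *(run_trajectory(name, edit, Path(tmp)) for name, edit in edits.items())
        )
    return dict(pairs)


results = asyncio.run(run_all({"a": "x = 1\n", "b": "y = 22\n"}))
```

Because each task spends most of its time waiting and moves only a few kilobytes, a single process can interleave many of them without needing one machine per trajectory.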
What's next for Cascade or Windsurf?
Oh, there's a lot.
I don't know.
We did an internal poll and we were just like, are you more excited about this launch
or the launch that's happening in a month?
or like what we're going to come out with in a month.
And it was like almost uniformly in a month.
I think like, you know, there's some like obvious ones.
I don't know how much Varun wants to say.
I don't want to speak up.
But I think you'd look at all the same axes of the system, right?
Like how can we improve the knowledge retrieval?
Like we'll always keep on figuring out how to improve knowledge retrieval.
In our launch video, we even showed some of like the early explorations we have about looking to other data sources.
That might not be the coolest thing to the individual developer building a zero to one app.
But you can really believe that the enterprise customers think that that's
very cool.
I think on the tool side, there's a whole lot more that we can do.
I mean, of course, Varun's talked about not just suggesting the terminal commands,
but actually executing them.
Like, I think that's going to be a huge
unlock. You look at the actions that people are taking, right?
Like, the human actions, the trajectories that we can build.
Like, how can we make that even more detailed?
And I think all of those things, and you make some, like, even a cleaner UI,
like, the idea of looking at future trajectories, trying a few different things
and, like, suggesting potential next actions to be taken.
Yeah, yeah.
That doesn't really exist.
yet, but it's pretty obvious, I think, how that would look like.
You open up Cascade, and instead of starting typing, it's just like, here's a bunch of
things that we want to do.
We kind of joke that's like Clippy's coming back, but like, maybe now's the time for Clippy
to really shine, right?
So I think there's a lot of ways that we can take this, which I think is like the very
exciting part.
We're calling each of our launches waves, I believe, because we want to really double down
on the aquatic themes.
Oh, yeah.
Does someone actually windsurf at the company?
Is that?
We're living out our dream of being cool enough to windsurf through the process.
I don't think we can.
Yeah, all right.
That was actually something we learned, because I don't think any of us are windsurfers.
Like in our launch video, we have someone using Windsurf on a windsurf.
You saw that.
You saw that.
In the beginning of the video, someone was at the computer.
And we didn't realize, like, now apparently is the time of the year where there's not enough wind to windsurf.
So we were trying to figure out how to do this, like, you know, launch video with Windsurf on the windsurf.
Every windsurfer we contacted was like, yeah, it's not possible.
And there was, like, yeah, I think we can do this.
And we made it happen.
Oh, okay.
That's funny.
Is there anything that you want feedback on?
Maybe there's a fork in the road.
You want feedback.
You want people to respond to this podcast and tell you what they want.
Yeah, I think there's a lot of things that I think could be more polished about the product that we'd like to improve.
Lots of different environments that we're going to improve performance on.
And I think we would love to hear from folks across the gamut.
Like, hey, like, if you have this environment, you use Windows and X version, it didn't work.
Or this language, it was very poor.
I think we would like to hear it.
Yeah, I gave
Preb and Kevin a lot of shit for my Python issues.
Yeah, yeah, yeah. And I think
there's a lot to kind of improve on the environment
side. I think, like, for instance, even just a
dumb example, and I think, Swix, this was
a common one. It's like, yeah, like, the virtual
environment, where is the terminal running? What is
all this stuff? These are all basic things that, like,
to be honest, this is not rocket science, but
we need to just fix it, right? We need to fix it.
So, we would love to hear, like, all
the feedback from the product, like, was it too slow?
Where was it too slow? What kind of
environments could work way more in? There's a lot of things
that we don't know. We, luckily, we're
daily users of the product internally, so we're getting a lot of feedback inside.
But I will say, like, there's a little bit of Silicon Valleyism in that a lot of us develop
on Mac. A lot of people, once again, over 80% of developers are on Windows.
So, yeah, there's a lot to learn and probably a lot of improvements down the line.
Have you personally been tempted, as CEO of the company, to switch to Windows just to feel something?
You know, you know what?
You know what? Maybe I should.
Actually, I think I will. I mean, like, your customers, you know, everyone says,
89% are all Windows, right?
You live in Windows, you will never,
you would never not see something that missed.
So I think in the beginning,
part of the reason why we were hesitant to do that
was, like, a lot of our architectural decisions
to work across every IDE was because
we built a platform-agnostic way of running
the system on the user's local machine
that was only buildable, easily buildable,
on, like, dev containers that lived on a particular
type of platform, so Mac was like nice for that.
But now there's like not really an excuse
if it's like, if I can also make changes to the UI and stuff like that. And yeah, WSL also
exists, that's actually something that we need to add to the product. That's how early it is,
that we have not actually added that. We don't have, like, remote. Anything else about
Codeium at large, right? Like, you still have your core business of the enterprise Codeium.
Yeah. Anything moving there or anything that people should know about.
I don't think a lot are still moving there, right? I think it would be a little bit like,
you know, very kind of egotistical. It would be like, oh, we have Windsurf now. All of our
enterprise customers are going to switch to Windsurf.
is the only, like, no, we still support the other.
I was going to say, you just talked about your Java guys
loving JetBrains. They're never going to leave JetBrains.
They're never going to leave JetBrains. I mean, forget JetBrains.
There's still tons and tons of Enterprise people on Eclipse.
Like, we're still the only
code assistant that has an extension for Eclipse.
That's still true years in, right?
And but, like, that's because that's our enterprise customers.
And the way that we always think about it is, like,
how do we still maximize the value of AI
for every developer? I don't think that part
of who we are has changed since the beginning.
Right? And there's a lot of, like, meeting
the developers where they are. So I think on
the enterprise side, we're still pretty invested in doing that. We have a team of engineers
dedicated just to making enterprise successful and thinking about the enterprise problems.
But really, if we think about it from the really macro perspective, it's like if we can solve
all the enterprise problems for an enterprise, and we have products that developers themselves
just truly, truly love, then we're solving the problem from both sides. And I think it's one
of those things where I think when we started working with the enterprise and we started building
like dev tools, right? We started as an infrastructure company. Now we're building dev tools for
for developers, you really quickly understand and realize just how much developers loving the tool
makes us successful in an enterprise.
There's a lot of enterprise software that developers hate.
I want to draw this flywheel.
But like, we're giving a tool for people where they're doing their most important work.
They have to love it.
And it's not like we're only trying to convince the executives. These companies also ask their
developers a lot, do you love this?
Like, that is almost always a key aspect of whether or not Codeium is accepted into the organization.
I don't think we go from zero to 10 million ARR in less than a year in an enterprise product if we don't have a product that developers love.
So I think that's why we're just, you know, the IDE is more of a developer love kind of play.
It will eventually make it to the enterprise. We still solve the enterprise problems.
And again, we could be completely wrong about this, but we hope we're solving the right problems.
It's interesting. I asked you this before we started rolling, but like it's the same team. That's the same eng team.
Like I, in any normal company or like, you know, my normal mental model of company construction,
And if you are to have effectively two products like this,
you would have two different teams serving two different needs,
but it's the same team.
Yeah, I think one of the things that's maybe unique about our company
is like this has not been one company the whole time, right?
Like we were first, like this GPU virtualization company pivoted to this.
And then after that, we're making some changes.
And like, I think there's like a versatility of the company
and like this ability to move where we think the instinct.
We have this instinct where, and by the way,
the instinct could be wrong.
But if we smell something, we're going to move fast.
And I think it's more a testament to, I think, the engineering team rather than any one of us.
I'm sure.
On December 19, 2022, you had one of our guest posts, What Building Copilot for X Really Takes.
Estimate inference to figure out latency and quality. Build first-party instead of using third-party APIs.
Okay.
Figure out real time because ChatGPT and DALL-E at RFP are too slow.
Optimize prompts because the context window is limited, which is maybe not that true anymore.
And then merge model outputs with the UX to make the product more intuitive.
Is there anything you would add?
I give myself like a B-minus on that.
So some parts of that are accurate.
Even like the context, like the one that you call that.
Like, yeah, models have like larger context.
Like now that's absolutely true.
Like it's grown a lot.
But look at like an enterprise code base.
Yeah, yeah, absolutely.
You know, tens of millions of lines of code.
That's hundreds of billions of tokens.
Never going to change.
It's still being really good at like being able to piece together
this like distribute knowledge or is important.
So I think like there are figures there that I think are,
are still pretty accurate.
There's probably some that are less so.
First party versus third party.
I think we're wrong there.
I think I would nuance that to be like
there are certain things that it's really important
to do first party.
Like auto-complete, you have a really specific application
that you can't just prompt engineer your way out of
or just maybe even like fine-tune afterwards.
You just can't do that.
I think there's truth there.
But let's also be realistic.
The stuff that's coming out from the third-party model providers,
Cascade and Windsurf would not have been possible
if it wasn't for the rapid improvements
with 4o and 3.5 Sonnet.
That just wouldn't have been possible.
So I'll give myself a B-minus.
I'll say I passed, but yes, two hours, two years later.
Just to be clear, we're not grading.
It's more of a, what would you, you know.
Where are they now?
What would you have added?
What would you like?
Yeah, I mean, like that first post, right?
Like that was when we had literally, I think that was like a few weeks after we had launched
Codeium.
I think that's like, you know, Swix and I were talking like, maybe we can write this
because we're like one of the first products that people can actually use the AI.
That's cool.
I specifically like the co-pilot for X thing because everyone is so hard.
At that time, everyone was just like, you know, ChatGPT, that's all there was.
But I think, like, you know, that we didn't have an enterprise product.
I don't even think we were necessarily thinking of an enterprise product at that point, right?
So, like, all of the learnings that, like, you know, we've had from the enterprise perspective,
which is why I loved coming back for, like, a third time now on the blog, some of those, I think we kind of, like, figured.
Some of those we just honestly walked backwards into.
I had to get lucky a lot of the ways.
Like, we had many, we just did a lot.
Like, there's so many, like, opportunities and deals that we had that we, like, lost for a variety of reasons that we had to, like, learn from.
There's just so much more to add that there's no way I would have gotten that right in 2022.
Can I mention one thing that I think is, hopefully this is not very controversial, but it's, like, true about our engineering team as a whole.
I don't think most of us got much value from ChatGPT.
Largely because I think the problem was, and this is maybe a little bit of a different thing, a lot of the engineers at the company have been writing software for, like, over eight years.
And this is not to say they know everything that ChatGPT knows;
they don't. They'd already gotten good enough at searching Stack Overflow. They'd invested a lot
in searching the codebase, right? They can very quickly grep through the code. Incredibly fast, like every
tool. And they've spent like eight years mastering that skill. And ChatGPT being this thing on the
side that you need to provide a lot of context to, we were not able to actually get, like my co-founder
just basically never used ChatGPT at all. Literally never did. And because of that, probably at the time,
one of our incorrect sort of assumptions was probably that, hey, like a lot of,
of these passive systems need to get good because they're always there and these active systems
are going to be behind. I think actually Cascade was the big thing. It's the thing at the company that everyone
is now using. Literally everyone. Biggest skeptics. And we have a lot of people at the company that are
skeptical of AI. I think this is actually important. Why do you hire them? No, I think here's the important
thing. Those people that were skeptical about AI previously worked in autonomous vehicles. These are not
crypto people. These are people that care about technology and want to work on the future. Their bar for
good is just very high. They will not form a cult of, this is awesome, this is going to change the
world. They were not going to be the kind of people on Twitter that are like, this is, yeah, this,
changes everything. Like, software as we know it is dead. No, they are people that are going to be
incredibly honest, and we know if we hit the bar that it is good for them, we found something special.
And I think at that time, we probably had a lot of sentiment like that, that has changed a lot
now. And I think it's actually important that you have believers that are incredibly future-looking
and people that are kind of reined in. Because otherwise, you just have a lot of,
you know, this is like autonomous vehicles. You have a very discrete problem. People are
just working in a vacuum. And there's no signal to kind of bring you down to reality.
Right, you have no good way to kill ideas. And there are a lot of ideas we're going to come
up with that are just terrible ideas. But we need to come up with terrible ideas. Otherwise, like,
how does anything good come out? And I don't want to call these skeptics. Skeptics suggest that they
don't know. They're realists. They're the type of people that when they see waitlist on a product
online, they just will not believe it. They will not think about it at all.
Kudos for launching without a wait list. Yeah. Yeah. By the way, we will never launch with a waitlist.
We will never launch with a wait list.
That's the thing at the company.
We'd much rather be a company that's considered the boring company
than a company that launches once in a while and hopefully it's good.
My joke is Generative AI has gotten really good at generating waitlists.
Also, just to clarify, both of us used to work in autonomous vehicles, so it doesn't come across wrong.
Oh, yeah, yeah.
I'm just kidding on the autonomous vehicles.
No, we love that technology.
We love it.
Like, I love hard technology problems.
That's what I live for.
Amazing.
Just pushback on the first party thing.
I accept that the large model labs have just done a lot of work for you
that you didn't need to duplicate.
But you now are sitting on so much proprietary data
that it may be worth training on the trajectories that you're collecting.
So maybe it's a pendulum back to first party.
Yeah, I mean, I think like, I mean, we've been pretty clear from like a security posture perspective.
I think there's like both like, you know, customer trust and like...
I mean, I kind of want, like, let me opt.
I think that there are signals that we do get
from our users that we can utilize.
There's a lot of preference information that we get, for example.
Which is effectively what you're saying of like our trajectories.
Go ahead.
We, like I will say this, the Supercomplete product that we have has gotten materially better
because of us not only using synthetic data, but also getting the preference data from our users
of like, hey, given this set of trajectories, here's actually what a good outcome is.
And in fact, one of the really beautiful parts about our product that is very different than
a ChatGPT is we can not only see if the acceptance happened, but if something more than
the acceptance happened.
An edit happened, even more than that.
Right?
Like, let's say you accepted it,
but then after accepting it,
you deleted three or four items in there.
We can see that.
So that actually lets us get
to even better than acceptance
as a metric.
Because we're in the ultimate work
output of the developer.
It's the preference between the acceptance
and what actually happened.
If you can actually get ground truth
of what actually happened,
this is the beauty of being an IDE,
then, like, yeah,
you get a lot of, lot of information there.
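As a rough illustration of a signal that goes beyond a yes/no acceptance, the sketch below measures how much of an accepted suggestion still survives after the developer's follow-up edits. The function name and the character-level comparison are assumptions made for this example; the conversation does not say how the real metric is computed.

```python
from difflib import SequenceMatcher


def retained_fraction(suggestion: str, final_text: str) -> float:
    """Fraction of the suggested characters still present after later edits."""
    if not suggestion:
        return 0.0
    matcher = SequenceMatcher(None, suggestion, final_text, autojunk=False)
    # Matching blocks are the runs the two strings have in common, in order.
    kept = sum(block.size for block in matcher.get_matching_blocks())
    return kept / len(suggestion)


# The suggestion was accepted, then the developer deleted the argument.
accepted = "foo(bar)"
after_edit = "foo()"
score = retained_fraction(accepted, after_edit)
```

A plain acceptance metric would count this suggestion as a full success, whereas the retained fraction records that part of it was thrown away.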
So did you have this with the extension
or is this pure windsurf?
We had this with the extension.
Yeah, okay, right.
Windsurf just gives you more of the IDE.
Yes.
So that means you can also start getting more information.
Like, for instance, the basic thing that Anshul said,
we can see if, like, a file explorer was opened.
It's actually just a piece of information we just could not see previously.
Sure.
Yeah.
A lot of intent in there.
A lot of intent.
Second one.
Oh, boy.
How to make AI UX your moat.
Oh, man.
Isn't that funny that we now created, like, the full UX experience in an IDE?
I think that one is pretty accurate.
That one's an A?
I think that one I'd give myself.
I think, like, we were doing that within the experience.
I still think that's true within the extensions as well, right?
Like we got very, very creative with things.
Like Varun mentioned the idea of just like, you know, essentially rendering images to display things.
Like we get creative to figure out what the right UX is there.
Like we could create a really like dumb UX.
It's like a side panel, like whatever.
But actually going the extra mile does make that experience as good as it possibly can there.
But yeah, now like look at some of the UX that we're able to build in like in Windsurf and it's just like, it's fun.
The first time I saw, because now we can do Command in the terminal,
you don't have to search for a bash command.
The first time I saw that, I was like,
I just started smiling.
And like it's not like a Cascade.
It's not like an agentic system right in the lap.
But I'm like, that is just very, very cool.
We literally couldn't do that in VS Code.
Yeah, I understand that.
Yeah.
I've even invented a 60-line bash command called Please.
And you can, you know, do that inside of it.
Yeah.
That's cool.
Yeah, so please English and then...
You know, that's actually really cool
because one of the things I think we believe in
is actually I like products like autocomplete more than command,
purely because I don't.
I don't even want to open anything up.
So that thing where I just can type
and not have to press some button shortcuts
to go in a different place, I actually like that too.
Yeah, and I actually adopted Warp, the terminal Warp,
initially for that, because they gave that away for free.
But now it's everywhere, so I can turn off Warp
and not give Sequoia my bash commands.
I'm with you.
No, no, look.
Okay, I don't know.
I'm going to go on a rant.
Hopefully somebody's at Warp.
This is like,
More product feedback.
But they basically had this thing where you can do kind of like pound and then write a natural language.
But then they also have the auto-infer that what you're typing is natural language.
And those are different.
When you do the pound, it sounds like it gives you a predetermined command.
When you like talk to it, it generates a flow.
Okay.
It's a bit confusing of a UX.
But going back to your post, you had the three P's of AI UX.
What were they again?
Present, practical, powerful.
Actually, that was really good.
And I think, like, in the beginning, being present was enough.
Maybe, you know, even when you launch, it's like, oh, you have AI, like, that's cool.
Other people don't have it.
Do you think we're still in the practical where, like, the experience is actually, like,
the model doesn't even need to be that powerful, like, just having better experience is enough?
Or, like, do you think, like, really the being able to do the whole, because your point was,
like, you're powerful when you generate a lot of value for the customer.
You're, like, practical when, like, you're basically, like, wrapping it in a nicer way.
Yeah, where are we in the market?
I think there's always going to be room for, like, practical UX.
Like, getting it, like, I mean, the command terminal.
That's, like, a very practical UX,
right?
Like, I do think with things like Cascade and these agentic systems,
like, we are starting to get onto powerful.
Because, like, there's so many pieces, like,
from a UX perspective that make Cascade really good.
Like, it's just really, like, micro things that are, like,
just all over the place.
But as, you know, we're streaming in, we're showing, like, the changes.
We're, like, allowing you to jump in open diffs and see it.
We can run background terminal commands.
You can see what background processes are running.
There's all these small UX things that together come to a really powerful and intuitive
UX.
I think we're starting to get there.
It's definitely just a start.
And that's why we're so excited about where all this is going to go.
I think we're starting to see the glimpses of it.
I'm excited.
It's going to be a whole new blog, yeah.
Yeah.
Awesome.
First of all, it's just been really nice to work with you.
I do work with a number of guest posters and not everyone makes it through to the end,
and nobody else has done it three times.
So kudos.
We're up for a hat trick.
This one was more like the money one, which I, you know, it's funny because I think developers are like quite uninterested in money.
Isn't it weird?
Yeah, I mean, I think like, I don't know if this is just the nature of our company.
Like I think there's something we've said like there's all like the San Francisco AI companies and like everyone's like hyping each other like on the tech and everything which is like great.
The tech's really important.
We're here in Mountain View, beautiful office.
We just really care about like actually driving value and making money.
which is kind of like a core part of the company.
I think maybe the selfish way of saying that,
or like a little more of the selfless way,
is like, yeah, we can be kind of like this VC-funded company forever.
But ultimately speaking, you know,
if we actually want to transform the way software happens,
we need this part of the business that's cash-generative
that enables us to actually invest tremendously in the software.
And that needs to be durable cash.
It can't be cash that, like, churns the next year.
And we want to set ourselves up to be a company that is durable.
and can actually solve these problems.
Yeah, yeah, excellent.
So for people, obviously,
we're going to link in the show notes,
but for people who are listening to this for the first time,
I had a lot of trouble naming this piece.
So we originally called it,
you had like how to make money something.
I was super baddie.
I apologize.
I was super baddie.
I think I wrote part of that during,
like on a plane flight, so I apologize for that.
He had like $3.
Oh, I absolutely had $3.
I was like, I can't do that.
So it's either building AI for the enterprise,
and then I also said the worst,
the most dangerous thing an AI startup can do is build for other AI startups,
which I think both of you will co-sign.
And I think basically the main thesis,
which I really liked was like,
go slow to go fast,
like here's the,
if you actually build for like security,
compliance,
personalization, usage analytics,
latency budgets,
and scale from the start,
then you're going to pay that cost now,
but eventually it's going to pay off in the long run.
And this is the actual insight.
You cannot do this later.
Like, if you build the easy thing first as an MVP, it's like, yeah, like, just ship it with, like, whatever's easy, easy to do. And then you tack on the EnterpriseReady.io set of, like, 12 things that you have, you actually end up with a different product or you end up worse off than if you had started from the beginning. So that I had never heard before.
Yeah, I mean, we see that like repeatedly.
I mean, just like right now, we have a lot of customers in like the defense space, for example.
We're going through FedRAMP accreditation right now.
And people that we're working with, they saw, like, the fact that, like, oh, yeah, we already have a containerized system.
We can already like deploy in these manners.
We can already do like, we've already gone through like security.
They're like, oh, you're going to have a much easier time doing this, right?
Than most companies that are just like, okay, we have like a big SaaS blob and now we need to like do all these things.
It might sound like a really deep thing.
I think it's just anyone who's worked for like an extended period of time
at like a company like on a certain project
has probably seen this happen.
The technology just keeps on improving.
And then you realize that like you have to now like re-architect your whole system
to get something improving.
Like just making that kind of change when you've invested so much effort.
Like people have, like, poured in hours.
They're emotionally invested or whatever it might be.
It's really hard to make that change.
So I'm sure we're going to hit that also.
like, yes, I think we've done things a little bit earlier than most companies.
I think we're going to hit points where we're going to see parts of our systems.
We're like, oh, we really need to re-architect that.
Actually, we've definitely hit that already.
And I think that's just like at the project level, the product level,
or is that like your whole company?
Right.
I think the thesis behind here is like to some degree your company needs to have had this DNA from the beginning.
And I think then you'll be able to go through those bumps a lot more smoothly
and be able to drive the value.
I haven't been.
Yeah, can I say two points?
So first point I'd like to say is this is something that me and Douglas, my co-founder,
talk about a lot. It's like, you know, there's this constant thing of like build versus buy.
I think the answer is, like, a lot of the time the answer should be buy.
Right.
Like, we're not going to go build our own sales tool.
We should go buy Salesforce, right?
That's kind of dumb.
That's undifferentiated.
And the reason why you go with buy instead of build is, hey, like, look, the ROI of what
exists out there is good.
From like an opportunity cost standpoint, it's better to actually go out and buy it than
build it and do a shittier job, right?
There's a company that's actually going out and focused on that.
But here's the hidden thing that I think is really important when you go out and buy.
You're losing a core competency inside the company.
And that's a core competency you can never get.
Or it's very hard.
Like startups are so limited on time.
Let me just say, like, let's say as a company, we did not invest in, I don't know, model inference.
Yeah, we have like a custom inference runtime.
We give that up right now.
We will never get it back.
It's going to be very hard to get it back.
You can't just use vLLM and TensorRT.
That would be your only option.
Or just let me put it.
If we used vLLM, we would not be talking with you right now.
Like, yeah, we would have, yeah.
But the point is, this is more a question of, like, you know,
I try to think about it from first principles.
It's like, Google's a great company, makes a lot of money.
What happens if they actually made the search index of the product
something that someone else built for them?
It's like they could.
Maybe someone else could have done a good job.
Maybe that's, like, a bad example.
But like, particularly because Google is a search index,
but, like, tough luck getting that core competency back.
You've lost it.
Right?
And I think for us, it's more a question of, like,
what core competencies do we need in
the business. And yeah, like, sometimes it's painful. Like, sometimes actually, like, some of
these core competencies are annoying, sometimes we'll be behind, behind what exists out there, right? And we
just need to be very honest. That's where the truth-seekingness of the company matters. Like,
are we really honest about this core competency? Can we actually keep up? If the answer is we truly
can't keep up, then why are we keeping up the charade? We should just buy, right? Like, let's
not build. If the answer is we can, and we think that this will differentiatedly make our company a better
company in the long term, then the answer is we need to. We need to. Because, like, the race is
not won in the next year. The race is won over the next five, ten years, right? Maybe even longer,
right? So that's like, that's maybe one thing. And then the second thing, actually, from like
the enterprise standpoint, I think one of the unique parts of the company now is we have both
this individual and enterprise side, and usually companies stick to one or the other. And I think
that needs to be part of the DNA, I think kind of early on in the company, as Anshul said.
I mean, there's stories of companies like Dropbox and stuff that tried. And Dropbox is an amazing
company, fantastic company, one of the fastest growing consumer companies of all time,
consumer more on the software company of all time.
But yeah, like when you have everyone sort of product oriented on the consumer side,
the enterprise is just, it's checking off a lot of boxes that ultimately do not help the
consumer at all.
It doesn't help your growth metrics.
And effectively, if the original group of people didn't care, it's incredibly hard to get
them to care down the line, right?
Yeah.
It's incredibly hard.
Why do it?
And you need to feel like, hey, this is like, this is an important part for the company's
viability.
So I think there's a little bit of like the build versus buy part.
and then also like the cultural DNA of the company
that I think are both really important
and yeah, it's something we think about all the time.
I have the privilege of being friends with you guys off the air.
I don't feel like, like I think I know your work histories.
Like you say cultural DNA, but like it's not like you've built like giant enterprise
SaaS before, right?
Yeah.
I think, yeah.
So like where are you getting this from?
Yeah.
In fact, in fact, I think the only other sort of,
I guess like, you know, when I look at my previous internships,
maybe Anshul can provide some
context here. It's like I worked at like LinkedIn and then Quora and then Databricks. And to be
honest, like I was not that interested in B2B ETL software that much. That's not what drives me
when I wake up at night. So I because of that, because of that, I decided to go work in an autonomous
vehicle company immediately after. I think part of it comes down to maybe a little bit of the
unique aspect of the company and the fact that we pivoted as a company is like we want to,
we want to be a durable company. And then the question is, how do you work backwards from that?
There's a lot of things about being very honest about what we're good at and what we're not good at.
I think surprisingly enterprise sales is like not something that, like, I came out of the womb knowing how to do.
I didn't really know. And because of that, like obviously like a lot of sales happen between sort of folks like Anshul and I helping partner with with companies. But very soon we hired actually a VP of sales and we've actually been deeply involved in the process of scaling out like a large go to market team. And I think it's more a question of like what matters to the company and how do you actually go out and build it. And I think one of the people that I think about a lot actually is someone like Alex Wang. He dropped out of college. He was a year younger than us at MIT.
and he has figured out how to constantly change the direction of the company.
Effectively, it starts out as like, you know, human task interface,
then an AV labeling company, then a cataloging company,
then now a generative AI labeling company.
And every time the revenue of the company kind of goes up by a factor of 10,
even though the business is doing something largely different.
I mean, now it's all about military contracts.
Yeah, now it's probably going to be military,
and then after that it might be taking over the world.
Like, he's just going to keep increasing the stakes.
And like, there's no playbook on how this really works.
It's just a little bit of like, you know, solve a hard problem
and work backwards, work backwards from that, right?
And we'll get lucky along the way.
We try to, like, think through everything from first principles to the best of our abilities,
but there's just so many variable unknowns that, yeah, like, we don't know everything that's
happening in every company out there and everyone knows how fast the AI space is moving.
We have to be pretty good at adapting.
I want to double-click on one thing just because you brought it up, and it's like a rare thing
to touch on VP of sales.
We don't get to actually, we talk to pretty early stage founders mostly.
They don't usually have a pretty built-out sales.
function. Advice, what kind of sales works in this kind of field? What didn't work?
Anything you can share with other founders? I think one of the hard parts about hiring people in sales,
and I really, like, Graham, Anshul can also attest, like, we have an amazing VP of sales at the company.
One of the things is, like, if you're purely a developer, salespeople, their job is to, like,
talk, like, really well, prim and proper. I mean, very obvious, if you hear, like, me talk, like,
I'm not a very polished person. You're great, by the way. I don't know.
Or compared to most pure, pure salespeople.
So actually just checking based on the way they speak is not that interesting.
I think, like, you know, what matters in a space like ours that is moving very quickly, I think it's like intellectual curiosity is very important.
Intellectual horsepower.
Understanding how to build a factory.
I'm not trying to minimize it.
But in some ways, in sales, you need to build something incredibly scalable here, right?
It's almost like every year you're kind of making this factory twice, thrice as
big, right? Because in some ways, you have people that are quota carrying. You need some number of
people and you need to make the math work. And you actually, the process of building a factory is not
something you can just take someone who is a great rep at another company and just make them build
a factory. This is actually a very different skill. How do you actually make sure you have hundreds
of people that actually deeply understand the product? Actually, Anshul works very closely also
with sales to make sure that they're enabled properly. Make sure that they understand the technology.
Our technology is also changing very quickly. Let's maybe take an example of how our company is very
different than a company like MongoDB. When you sell a product like MongoDB, no one at the company
is interested in how the data is being stored. It's not that interesting, right? I love databases.
I would be interested. But most people are like, solve the application problem I have at hand.
People are curious about how our technology works. People are curious about RAG, right? People that are
buying our technology. And imagine we had a sales team that is scaling where no one understands
any of this stuff. We're not going to be great partners to our customers. So how do you create
almost this growing factory that is able to actually distribute the software in a way that is
true to our partners and also at the same time, like, taking on all the new parts of our product,
right? They're actually able to expound on the new parts of our product. So, sorry, that was more a
statement of, like, building a scalable sales team. But in terms of, like, who you hire is,
you just need to have a sense. Like, in some ways, this is maybe an example of talk to enough people,
find out what good looks like potentially in your category and find someone who's good and humble and
willing to work with. Yeah, that's just generic hiring. It's just generic hiring.
I think here, sales, there's sales for AI.
or sales for AI infrastructure.
And then there's also the sales feeding into products
in a way that we're talking about here,
where they basically tell you what they need.
I imagine a lot of that happened.
I think a lot of that happened.
I mean, still happened.
And Varun mentioned, like, Varun, myself,
a number of other people who are developers by trade, engineers.
Like, we're pretty involved in the sales process
because, like, there's a lot to learn, right?
Like, before we went out and hired a sales leader,
like, yeah, if it was, like,
neither of us had ever done a sale for Codeium in our lives. And we went to try to find a sales
leader, we probably would have not hired the right person. Yeah, we had sold a product to like 30 or 40
customers at that time. We had done like hundreds and hundreds of deal cycles ourselves personally,
right? Without, I mean, we read a lot of books and we just did a lot of stuff. And we learned
like what messaging worked, like what did we need to do? And then I think we found like the right
person, right? I'll second Varun, like, Graham's amazing, who we brought on as our VP of sales.
That just has to be part of the nature and it doesn't stop now. Like just because we have a VP
of sales and people dedicated to sales,
it doesn't mean that we can't be involved or like engineering can't be involved, right?
Like we have lots of people.
Like we hire plenty of deployed engineers, right?
These are people like, you know, I think like Palantir kind of made this really famous.
Like deployed engineers like work very, very closely with the sales team on very technical aspects
because they can also understand like what are people trying to do with AI?
As in they work at Codeium as deployed engineers.
Yeah.
Okay.
And they partner with our account executives to like make our customers successful,
learn what is it that people are actually getting value with AI.
And that's information that we keep on collating.
And it's like we will both jump into any deal cycle just to learn more
because that's how we're going to just keep on building the best product.
It comes back to the same thing, I don't know.
And hopefully we build the right thing.
Cool, guys.
Thank you for the time.
It's great to have you back on the pod.
Yeah, thanks so much for having us.
Hopefully in a year we can do another one.
Yeah.
You'll be 10 billion by then.
Yeah, exactly.
At this rate, then next year.
We try not thinking about that.
Try to not be a zero-billion company.
There's that, yeah.
All right, cool.
That's it.
Awesome.
