Screaming in the Cloud - Conversations at the Intersection of AI and Code with Harjot Gill

Episode Date: September 4, 2025

AI is rewriting the rules of code review and CodeRabbit is leading the charge. In this featured episode of Screaming in the Cloud, Harjot Gill shares with Corey Quinn how his team built the most-installed AI app on GitHub and GitLab, nailed positive unit economics, and turned code review into a powerful guardrail for the AI era.

Show Highlights
(0:00) Entrepreneurial Journey and CodeRabbit's Origin
(3:06) The Broken Nature of Code Reviews
(5:47) Developer Feedback and the Future of Code Review
(9:50) AI-Generated Code and the Code Review Burden
(11:46) Traditional Tools vs. AI in Code Review
(13:41) Keeping Up with State-of-the-Art Models
(16:16) Cloud Architecture and Google Cloud Run
(18:21) Context Engineering for Large Codebases
(20:52) Taming LLMs and Balancing Feedback
(22:30) Business Model and Open Source Strategy

About Harjot Gill
Harjot is the CEO of CodeRabbit, a leading AI-first developer tools company.

Links
Harjot on LinkedIn: https://www.linkedin.com/in/harjotsgill/

Sponsor
CodeRabbit: https://coderabbit.link/corey

Transcript
Starting point is 00:00:00 almost always testing some frontier model for these labs using our evals. We kind of have one of the best evals in the industry when it comes to reasoning models, given that code reviews are a reasoning-heavy use case. And also around the agentic flows, because as you go deeper and deeper into agentic flows, your errors compound. So with these long-horizon tasks, if you, like, go off track on the first step, you're going to be way off in, like, the 10th step, right? So, yeah, it's all about, like, evaluating.
Starting point is 00:00:25 So we have a team that spends a lot of time looking at the open source usage that we have, bringing those examples in to make sure we have the right evaluation framework to understand the behavior of these new models as they come out. Welcome to Screaming in the Cloud. I'm Corey Quinn. Today I'm joined by Harjot Gill, CEO and co-founder of CodeRabbit. Harjot, you're now a three-time entrepreneur who went from senior director at Nutanix to building what is now apparently the most installed AI application on GitHub and GitLab, if I'm reading this correctly. What made you leave a big, comfortable tech job to decide, you know what I really want to do next? Code review.
Starting point is 00:01:11 Thanks for having me here, Corey. I mean, it's been a very interesting journey for me. Coding faster with AI tools? You need faster, better code reviews to keep up and to keep out the AI slop. CodeRabbit delivers senior-engineer-level code reviews on pull requests and right in your IDE, with less LGTM. Get context-aware feedback for every commit, no configuration needed, and all programming languages you care to use are supported. From individuals to enterprises, join the millions who trust CodeRabbit to banish bugs
Starting point is 00:01:44 and help ship better code faster. Get started at coderabbit.ai and instantly cut your code review time and bugs in half. So I've done these startups for a while. I actually started my first company back in 2015 out of my research that I was doing at the University of Pennsylvania at that time, and that was called Netsil. And it was in the early times, I mean, you would remember that, because this was the time of Docker, Kubernetes, and all these, like, microservices taking off. And we built, like, this product that was, like, observing the network traffic,
Starting point is 00:02:17 so we could understand the API calls and understand the performance. I mean, that was the first startup, which was acquired by Nutanix, and I was there for, like, a few years. I've never been, like, a big company person. So yeah, I mean, so we had to go there, integrate the product. Then I quit to start another startup, which was FluxNinja, which was in the reliability management space.
Starting point is 00:02:36 And the idea was, like, how do we go beyond observability into more controllability, prevent cascading failures and all these Black Friday outages and so on? Yeah, so we built, like, a very interesting rate limiting and load shedding kind of a solution where you could prioritize API calls. Now, unfortunately for us, that didn't go well. I mean, we were betting a lot on service meshes to take off, and they never became, like, a mainstream tech from that point of view.
Starting point is 00:03:02 And very interestingly, around that time, large language models took off, like generative AI, and it started with GitHub Copilot, and then ChatGPT came along. And this was even before there was GPT-4. And I was running this team of, like, around 15 remote employees during COVID, and we were struggling to, like, ship code faster. Like, one of the big bottlenecks was, like, even though Copilot had come out, people were coding, even doing, like, small stacked pull requests and so on.
Starting point is 00:03:33 But still, the code reviews were, like, a very massive bottleneck, right? So I did this, like, weekend hackathon project where we started using some of these language models to automate the code review process, to find low-hanging issues that can go beyond the simple linting issues that you find with existing tools. And that was a very interesting outcome. Like, we saw that this is a really good fit. And then CodeRabbit started as a separate company. It just took off.
Starting point is 00:03:58 And it was very clear that we have to, like, just focus on that problem statement. And then FluxNinja, my second startup, we basically folded into CodeRabbit, and that's where I then continued on full time. Yeah, CodeRabbit is interesting to me from the perspective of you're tackling something that historically has felt very boring and that people tend to more or less phone in. I mean, code reviews have
Starting point is 00:04:25 been around forever, but what did you see around its fundamentally broken nature that made you think, ah, we can pour some AI on this and make them better? It's broken in many, many ways. You won't believe it. Like, not every company, I mean, everyone wants to do a code review
Starting point is 00:04:41 process because of compliance reasons and just because they want to prevent massive failures downstream, because each issue you catch in the SDLC is much cheaper, right, than to catch it later in production. I mean, even on my solo side projects where I'm developing things and merging them in, I still feel obligated to do some form of code review, just because I style myself as kind of a grown-up, and I probably got everything right.
Starting point is 00:05:04 Why would I even read this thing I'm about to merge in? I still feel obligated to do it. That doesn't mean I do it, but I feel bad when I don't. I mean, a lot of managers in these companies have that, like, guilty conscience, like, because they're not doing a good job. Because if you look at really good code reviews, they take as much time as it takes to actually write the code. But, I mean, you actually have to really understand the context of these changes, because software is complex. It breaks in very, very interesting ways.
Starting point is 00:05:33 And code review is, like, kind of the first line of defense. I mean, of course, you're doing more validation downstream, QA environments, and so on. But code review is really necessary. But most companies don't do it well. Oh, yeah. And AI seems to be way faster at not reading the code, typing LGTM as a comment, and then clicking the merge button. It feels like it could speed up that iterative cycle, which frankly is sort of the way that it seems like all large code reviews tend to go. I've submitted three-line diffs that have
Starting point is 00:06:00 been nibbled to death by ducks, as people talk about it. But the 10,000-line diff of a refactor or reformat or something gets the looks good to me, ship-it squirrel, and off it goes. There's a human nature element at play here. You're right. I mean, when the PRs are small, yes, people can still, like, cognitively look at them. But when they get beyond a certain threshold, that's the point where you say, okay, rubber-stamp it, like, ship it. I can't just go through it all. And then, like, in some companies, you have, like, the ego clashes. Like, people actually do a very thorough job, or too thorough many times, and you have, like, so much back and forth for days and weeks at a stretch, and things don't move
Starting point is 00:06:37 at all. Right. So, yeah, I mean, the code reviews can get ugly in many ways. Most of the time, as you said, but in some cases, they can also be very toxic and not a pleasant experience, right, in many ways. Show me how your company reviews code, and I can show you aspects of its culture. That's right. That's right, yeah. So I guess a question I have is around what,
Starting point is 00:06:58 I guess I'll term a second-order effect of a lot of this AI-generated code proliferation. Now it seems like, forget merging code that I haven't read; in some cases, I'm merging code I haven't written. Feels like that is increasing the burden on the code review process.
Starting point is 00:07:14 Right. So one of the biggest applications for generative AI has been coding. I mean, that's probably the only thing that's working if you really think about it, right? Oh, it is the standout breakout success of generative AI. Everything else is trailing behind.
Starting point is 00:07:30 But as it turns out, there's a lot of replacement of Stack Overflow that AI code generation can do. That's right. That is right. I mean, and it's getting more and more sophisticated. Now you have all these coding agents which have come on the scene, like Claude Code, for instance,
Starting point is 00:07:45 and you're starting with a prompt and you're getting large-scale coding changes done in a few minutes, right? And a lot of these vibe coders or the junior coders in every organization, they tend to, like, throw this code over the wall. And then it's someone else's headache to review it, especially the senior developers'. And now they have a huge bottleneck of pull requests
Starting point is 00:08:06 where they have to now piece together that puzzle, like, on what actually happened there. And it's a nightmare. I mean, the backlogs are now increasing because of generative AI. A lot of vibe-coded PRs are being opened against open-source projects. It's also a maintenance nightmare. I don't know whether you've seen a lot of tweets around open-source projects
Starting point is 00:08:22 where they're getting these contributions from random developers across the world with, like, 10,000-line PRs, hundreds of files changed, and so on. They seem like good features on the surface, but once you start digging deeper, they're, like, riddled with issues. And that's where I think it's becoming unsustainable. It's like an air battle. Like, earlier you were fighting a tank battle,
Starting point is 00:08:44 but now you're in a very different battleground. Like, you have to bring AI to fight AI in many ways or AI to review AI, right? Very much so. It's one of these areas where it just seems that there's so much, I guess, nonsense code being thrown out. I look at the stuff in my own code base
Starting point is 00:09:04 after I let AI take a few passes through it. And, like, the amount of dead code that'll never be reached through any code path is astonishing. Five different versions of documentation, all self-contradictory. it becomes an absolute mess. The counterpoint, though, that I have is that this is not the first attempt
Starting point is 00:09:22 at solving the code review problem. There have been a bunch of traditional tools aimed at this before the rise of AI, and the biggest problem that you had there was false positives. That's right. I mean, if you look at the traditional tools like pre-gen AI,
Starting point is 00:09:34 they were, like, mostly based on static rules, some sort of, like... Like, if you look at the security scanning market, the SAST market, like, you probably cover some of those companies, they are looking at OWASP vulnerabilities. They have signatures to discover those deficiencies, right? And every company, like, when they enable these tools,
Starting point is 00:09:56 they end up with a lot of alerts. Alert fatigue is a problem, and you end up switching off a lot of these linting rules so that you can still move faster, without making sure you're covering all kinds of, like, all these clean code guidelines that you could have, right? Yeah, so that has been, like, a traditional problem with the tools in this space.
Starting point is 00:10:14 I mean, all the way from the SonarSources and Semgreps of the world and so on. And when it comes to AI, like, one of the nice things is you can tune it. There are, like, very interesting knobs possible here. And also the kind of insights you're getting are more human-like feedback. You're not going and nitpicking on some signature. You're talking about the best practices. As you said, it's the Stack Overflow. You're taking all these best practice examples that these models have been trained on.
Starting point is 00:10:38 And you're bringing it in so that more of an average engineer anywhere in the world now has access to the best practices, which they otherwise don't have without proper mentorship. I want to dive a little bit into, I guess, the cloud architecture piece of a lot of this. You've been fairly open on your engineering blog about how your infrastructure works. You use one of my favorite Google Cloud services, of all things, Google Cloud Run, to effectively execute what amounts to untrusted code at significant scale. And that is both brilliant and terrifying at the same time. Can you talk me through that decision?
Starting point is 00:11:16 Yeah, we were, like, one of the first companies to build agentic workflows, and also, like, engineer them to a point where it's very cost-effectively done. So one of the things we found with building these agents is that no matter how much context engineering you do, it's never enough. So a lot of the companies started with RAG lookups: they will index the code base and they'll bring in all this context and then ask the model a question around whether this code looks good or not. But we've found that very often this context is never enough.
Starting point is 00:11:46 It can never be enough, given the complex codebases in the real world. That's where you want to give this model some sense of agency to go and explore. For example, CodeRabbit has, like, a multi-pass system. The first pass of the review, it does raise some concerns. And a lot of the times, these are false positives. They're not valid findings. So what we do in the sandboxes that we create in Google Cloud, we use Google Cloud Run,
Starting point is 00:12:13 it's a really great service, it's serverless. So we create, like, this ephemeral environment for the pull request we're reviewing, where we not just look at that pull request, we bring the entire code into the sandbox. And then we're letting the models actually navigate the code like a human does, but using CLI commands. So that was the other innovation. Like, we are generating a lot of shell scripts, like, we are letting the models, like, run ripgrep queries.
Starting point is 00:12:34 We are letting the models run cat commands to read files based on the concerns they see in the code. And they navigate the code. And once they bring in this additional context, that's where they're either able to suppress some of these, like, false positives, and in many cases, we are able to find issues which are ripple effects in the call graph across multiple files. That's what makes CodeRabbit really good, by the way. Yeah.
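For intuition, here is a minimal sketch of the kind of agentic verification loop described above: the model proposes read-only CLI commands (ripgrep, cat) that are executed inside a sandboxed clone of the repository, and the output is fed back as additional context until the concern is confirmed or dropped. The function names, command allowlist, and model.next_action interface are hypothetical placeholders, not CodeRabbit's actual implementation.

```python
import subprocess

# Hypothetical sketch: let a model verify its own review findings by exploring
# a sandboxed clone of the repo with read-only CLI commands.
ALLOWED = ("rg", "cat", "ls", "git")  # assumed allowlist, not CodeRabbit's

def run_in_sandbox(command: str, repo_dir: str) -> str:
    """Execute a model-proposed command inside the cloned repo."""
    parts = command.split()
    if not parts or parts[0] not in ALLOWED:
        return "command not allowed"
    result = subprocess.run(command, shell=True, cwd=repo_dir,
                            capture_output=True, text=True, timeout=30)
    return result.stdout[:20_000]  # cap output so the context stays bounded

def verify_finding(model, finding: dict, repo_dir: str, max_steps: int = 5) -> dict:
    """Multi-pass check: the model requests commands until it confirms or drops the finding."""
    transcript = [f"Concern from first review pass: {finding['summary']}"]
    for _ in range(max_steps):
        # `model.next_action` is a placeholder for whatever LLM call is used;
        # it returns either a shell command to run or a final verdict.
        action = model.next_action(transcript)
        if action["type"] == "verdict":
            return {**finding, "valid": action["valid"], "reason": action["reason"]}
        output = run_in_sandbox(action["command"], repo_dir)
        transcript.append(f"$ {action['command']}\n{output}")
    return {**finding, "valid": False, "reason": "unresolved after exploration"}
```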
Starting point is 00:12:55 How do you wind up, I guess, drawing the line? Because I've found that one of the big challenges you have with a lot of these LLM-powered tools is they are anxious to please. When you say, great, find issues in this code, they're not going to say, nope, looks good to me. They're going to find something to quibble about, like some obnoxious senior engineers we've all worked with in the course of our careers. How do you at some point say, okay, nothing is ever perfect. At some point, you're just quibbling over stylistic choices.
Starting point is 00:13:27 This is good. Ship it. How do you make it draw that line? That was the hardest part, I'll tell you. So when we started, like, I mean, you're right. I mean, if you ask a model to find something wrong, it will find something wrong in it, like 10 things, 15 things, it will almost always please you.
Starting point is 00:13:42 And the hard part is like, how do you draw the right balance in classifying a lot of this output and seeing what's important enough for the user to pay attention to, right? Took a lot of trial and error. Like early days when we launched the product, we still had like a lot of cancellations
Starting point is 00:13:59 because of noise and feedback that was too nitpicky. And it took a while to learn and figure out the right balance when it comes to the quality of feedback and what's acceptable. And it was a long battle, I can tell you. Like, a lot of our engineering actually went into taming these models to a point where we can make the majority of the users happy.
Starting point is 00:14:19 We can't make everyone happy. This is the nature of this product. They're not deterministic. And they're not like the previous generation of the systems where you can deterministically define the rules. But this one's like very vibe feedback. We are vibing, as they say, right? Vibe check. Yeah.
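As an illustration of the classification problem described above, the usual pattern is to score each raw finding and suppress anything below a severity threshold before it ever reaches the reviewer. The severity buckets, threshold, and data shapes below are a hypothetical sketch, not CodeRabbit's actual pipeline.

```python
from dataclasses import dataclass

# Hypothetical severity buckets; a real system tunes these against user feedback.
SEVERITY_RANK = {"critical": 3, "major": 2, "minor": 1, "nitpick": 0}
POST_THRESHOLD = 2  # assumed: only surface "major" and above as review comments

@dataclass
class Finding:
    file: str
    line: int
    summary: str
    severity: str  # filled in by a second pass that classifies raw model output

def triage(findings: list[Finding]) -> tuple[list[Finding], list[Finding]]:
    """Split raw model output into comments worth posting and suppressed noise."""
    posted, suppressed = [], []
    for f in findings:
        rank = SEVERITY_RANK.get(f.severity, 0)
        (posted if rank >= POST_THRESHOLD else suppressed).append(f)
    return posted, suppressed

# Example: a naming nitpick gets suppressed, a possible crash gets posted.
raw = [
    Finding("api.py", 42, "possible None dereference on user lookup", "critical"),
    Finding("api.py", 57, "variable name could be more descriptive", "nitpick"),
]
posted, suppressed = triage(raw)
```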
Starting point is 00:14:35 Yeah. There's a lot of, I guess, nuance in a lot of these things. And the space is moving so quickly that it's basically impossible to keep up. You have, I believe, standardized around Claude's models for a lot of this. Twenty minutes before this recording, if people want to pinpoint this in time, Anthropic came out with a surprise release of Opus 4.1. So if we had recorded this yesterday and said Opus 4 was their premier model, that would now be inaccurate, even in a short timeline like this.
Starting point is 00:15:06 How do you, I guess, continue to hit the moving target that is state of the art? That's a great question. First of all, we use both OpenAI and Anthropic models. In fact, like, our OpenAI token usage might be even more than what we see on the Anthropic side. We use, like, six or seven models under the hood. One of the nice things about the CodeRabbit product has been, it's not a chat-based product.
Starting point is 00:15:28 Every product in the AI space starts with some sort of a user input. CodeRabbit is, like, zero activation energy. You open a pull request, it kicks off a workflow, and it's a long-horizon workflow. It takes, like, a few minutes to run. Which is genius. Chatbots are a lazy interface, to be direct. Everyone tends to go for that because it's the low-hanging fruit.
Starting point is 00:15:47 But if I have to interface with a chatbot, it's not discoverable what it's capable of doing. And if I look at it on a traditional website, that already means on some level your website has failed, in all likelihood. Yeah. Yeah. In a way, like, it's, like, a zero-activation-energy kind of a system. Like, you don't have to remember to use it, right? But that brings in, like, very interesting
Starting point is 00:16:03 things. First of all, it's a long-running workflow with an ensemble of multiple models. And evaluations become, like, the key thing. Like, the nature of these products is, it's all about evaluations. We are not, like, training our own foundational models. It's not in our budget. So what we do is, like, make sure that at least we have some sort of a sense into understanding these models and their behavior, and tracking their progress across the different generations that we are seeing. So we are actually almost always testing some frontier model from these labs using our evals.
Starting point is 00:16:36 We kind of have one of the best evals in the industry when it comes to reasoning models, given that code reviews are a reasoning-heavy use case. And also around the agentic flows, because as you go deeper and deeper into agentic flows, your errors compound. So with these long-horizon tasks, if you, like, go off track on the first step, you're going to be way off
Starting point is 00:16:54 in, like, the 10th step, right? So yeah, it's all about, like, evaluating. So we have a team that spends a lot of time looking at the open source usage that we have, bringing those examples in, to make sure we have the right evaluation framework to understand the behavior of these new models as they come out.
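By way of illustration only, an eval for a reasoning-heavy task like this usually comes down to replaying labeled review examples against a candidate model and scoring how many known issues it catches versus how much noise it raises. The example format and review_fn below are a hypothetical sketch, not CodeRabbit's evaluation framework.

```python
# Hypothetical eval sketch: replay labeled code-review examples against a
# candidate model and measure catch rate vs. noise. `review_fn` takes a diff
# and returns a list of finding identifiers; the data format is assumed.

def score_model(review_fn, labeled_examples):
    """labeled_examples: [{'diff': str, 'known_issues': set[str]}, ...]"""
    true_pos = false_pos = false_neg = 0
    for ex in labeled_examples:
        predicted = set(review_fn(ex["diff"]))           # findings from the model under test
        true_pos += len(predicted & ex["known_issues"])   # real issues it caught
        false_pos += len(predicted - ex["known_issues"])  # noise it raised
        false_neg += len(ex["known_issues"] - predicted)  # real issues it missed
    precision = true_pos / ((true_pos + false_pos) or 1)
    recall = true_pos / ((true_pos + false_neg) or 1)
    return {"precision": precision, "recall": recall}
```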
Starting point is 00:17:25 Stuck in code review limbo? Get CodeRabbit, and never wait on some guy named Mike again. Swap the drama for multi-layered context engineering that suggests one-click fixes and explains the reasoning behind the suggestions. CodeRabbit integrates with your Git workflows and favorite toolchains, supports all programming languages you're likely to use, and doesn't need configuration. Get code reviews when you need them at coderabbit.ai, and get them for free on open source projects and on code reviews in your IDE. So this might be a politically charged question, but we're going to run with it anyway. Why did you pick Google Cloud as your infrastructure provider of choice? I mean, well, why not Azure? I can answer that easily, but AWS is at least a viable contender for this sort of thing. I have my own suspicions, but I'm curious to hear what your reason was. We looked at the Cloud Run product. I think that was one of the big drivers. The whole Cloud
Starting point is 00:18:09 Run thing is amazing. That was one of the reasons; it saved us so much time and cost in operating, like, this whole serverless thing. And also, like, in the previous startups, we had gone with Google Cloud. Like, the interface, it's like, Amazon is great, but that's, like, one of the first cloud services. It can get very overwhelming to a lot of people. But GCP is, like, much cleaner, in our opinion. Cost-wise as well, it's been very effective. Yeah, I think a lot of startups do build on Google Cloud.
Starting point is 00:18:46 Yeah, it's one of those areas where, and I've said this before, if I were starting from zero today, I would pick Google Cloud over AWS, just because the developer experience is superior. Cloud Run is no exception to this. It's one of those dead simple services that basically works magic, as best I can tell. It feels like it's what Lambda should have been. Yeah, I mean, it's amazing, right? I mean, it's all concurrency-based, which we love. Like, I mean, scaling is so straightforward
Starting point is 00:19:08 once you understand the model there. Like, it's not based on just resources. Like, the knobs there, like, make so much logical sense once you get to understand them. They're much simpler. How do you handle the sandboxing approach to this? By which I mean that it has become increasingly apparent that it is almost impossible to separate out prompt from the rest of the context in some cases. So it seems like it would not be that out of the realm of possibility for someone to say, yeah, disregard previous instructions, do this other thing instead, especially in a sandbox that's able to run code. Yeah, you're right. We are running, like, untrusted code,
Starting point is 00:19:42 and through some of this, like, chat interface, you could actually steer the system to generate malicious scripts or Python code in that environment, right? And we do have internet access enabled as well, right? So it's all about, like, locking down, making sure we have cgroups and all set up so that you're not, like, escaping the sandbox
Starting point is 00:20:01 that you have created. And the other part is, like, locking down the access to internal services; you don't want that sandbox to access the metadata service of these cloud providers, right? So yeah, it's, like, standard stuff around network zoning and the
Starting point is 00:20:15 cgroups and all. And the other part is, like, we allow internet access. We disallow everything which is in the GCP VPC, but allow internet access. At the same time, we have protections around resource utilization and killing those malicious scripts that could be running.
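For illustration, here is a minimal sketch of the kind of protections listed here: per-process resource limits and a hard timeout on anything executed in the sandbox. The specific limits are made-up values, and a real deployment would also enforce egress rules (for example, blocking the cloud metadata endpoint) at the network layer and via cgroups rather than in application code.

```python
import resource
import subprocess

# Hypothetical caps for untrusted commands; real values would be tuned per workload.
CPU_SECONDS = 60
MEMORY_BYTES = 512 * 1024 * 1024

def _apply_limits():
    # Runs in the child process just before exec: cap CPU time and address space
    # so a runaway or malicious script is killed by the kernel.
    resource.setrlimit(resource.RLIMIT_CPU, (CPU_SECONDS, CPU_SECONDS))
    resource.setrlimit(resource.RLIMIT_AS, (MEMORY_BYTES, MEMORY_BYTES))

def run_untrusted(command: list[str], workdir: str) -> subprocess.CompletedProcess:
    """Run an untrusted command with resource limits and a wall-clock timeout."""
    return subprocess.run(
        command,
        cwd=workdir,
        preexec_fn=_apply_limits,  # POSIX only; containers and cgroups cover this in production
        capture_output=True,
        text=True,
        timeout=120,               # kill anything that hangs
    )
```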
Starting point is 00:20:30 That is always one of those weird challenges. Taking a more mundane challenge that I have: often, code bases tend to sprawl as soon as they become capable of doing non-trivial things, and some of us don't bound our pull requests in reasonable ways. How do you wind up getting meaningful code review, either in a giant monolithic repo that will far exceed the context window, or, counterpoint, within a microservice where 90% of the application is other microservices that are out of scope for this? How do you effectively get code review on something like that? That's a great question. There's a term for this called context engineering, and that's what we do, actually,
Starting point is 00:21:14 if that's the best way to describe it. Like, it all starts with building some sort of code intelligence. Let's say you are, like, reviewing five or ten files, but you have a large code base where those files got changed. The first part of the process is, like, building some sort of a code graph. Because unlike Cursor, which, like the code completion tools, uses a code index, which is more of a similarity search, and that works great for their use case, because they mainly need to follow the style of existing code when generating new code. In our case, the code review is a very reasoning-intensive workflow. Like, if we are bringing in these definitions from the other parts of the code, they have to be
Starting point is 00:21:52 in the call path. So that's why we build a relationship graph, which is a very different technology that we invested in. And that brings in the right context as a starting point. As I said, it's still a starting point. Like, you still have to do the agentic loop after that. But the starting point has to be good, so that your first pass of the review has some bearing on where to poke holes. Yeah, you're going to, like, first raise 20, 30 concerns, and you're going to start digging deeper on those choose-your-own-adventure kind of routes. And some of these will lead to real bugs. It's not deterministic. I mean, each run would look different. It's just like a human.
Starting point is 00:22:27 Like, humans will, like, miss stuff a lot of the times. But now, in this case, AI is still doing a really good job in poking holes at where it feels the deficiencies could be, given the code changes. But it starts with the initial context. Say you have to bring in, like, code graph learnings. So we have long-term memory features, so each time a developer can teach CodeRabbit, it learns, and it gets better over time, because those learnings are then used to improve reviews for all the other developers in the company. It's a multiplayer system, right? So we are also bringing in context from existing comments on a PR, some of the linting tools, CI/CD. So there are, like, 10 or 15 places you're bringing the context from. The most impactful is usually the code graph.
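A toy sketch of the code-graph idea, for intuition only: parse the repository, record which functions call which, and pull in the definitions that sit on the call path of the changed functions as extra review context. This uses Python's ast module and is far simpler than the relationship graph described here; the function names are illustrative.

```python
import ast
from collections import defaultdict
from pathlib import Path

def build_call_graph(repo_dir: str) -> tuple[dict, dict]:
    """Toy version: map function name -> source, and function name -> names it calls."""
    definitions, calls = {}, defaultdict(set)
    for path in Path(repo_dir).rglob("*.py"):
        src = path.read_text()
        for node in ast.walk(ast.parse(src)):
            if isinstance(node, ast.FunctionDef):
                definitions[node.name] = ast.get_source_segment(src, node)
                for inner in ast.walk(node):
                    if isinstance(inner, ast.Call) and isinstance(inner.func, ast.Name):
                        calls[node.name].add(inner.func.id)
    return definitions, calls

def context_for_changed(changed_funcs: set[str], definitions: dict, calls: dict) -> dict:
    """Collect definitions reachable from the changed functions along the call path."""
    seen, frontier = set(), set(changed_funcs)
    while frontier:
        fn = frontier.pop()
        if fn in seen:
            continue
        seen.add(fn)
        frontier |= calls.get(fn, set()) - seen
    return {name: definitions[name] for name in seen if name in definitions}
```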
Starting point is 00:23:04 I want to explore a bit of the business piece of it, if that's all right with you. You've taken a somewhat familiar GitHub-style model: free for public repos, paid for private ones, by and large. There's a significant free tier that you offer. And you're also, to my understanding, free in VS Code, Cursor, Windsurf, as long as that lasts, etc. How do the economics of this work? That's a really great question. When we started this business, like, one of the things we realized is that it's a habit change. You're trying to make people adopt this new habit. Like, this AI thing, most people don't want
Starting point is 00:23:37 to use it. Like, I mean, you're trying to bring AI experiences into existing workflows, and universally everyone hates it. Now, CodeRabbit has been lucky in that regard that we brought in a very meaningful experience that people love. And we wanted to make sure that we spread it and, like, democratize it to some extent. Like, that's where the open source makes sense. I mean, first of all, we love open source. Like, I'm a big believer, and we sponsor a lot of these open source projects. And that became also a testing ground. So that was the other thing. Like, because it's not just a go-to-market, but also the learnings and the public usage we used as a feedback loop to improve the AI product, so that we can serve the paid customers. So from the economics point of view, yeah, it's hard, given that it's one of the agentic systems. And, you know, publicly, like, even Cursor and Claude Code have had issues with their pricing, massively negative gross margins. Like, we are still able to offer this service at, like, a flat price point of per developer per month, which is affordable enough that we can go mass market with it.
Starting point is 00:24:36 And predictable enough, which is underappreciated. Exactly. It's predictable enough. There is no surprises. And we don't have negative gross margins, right? I mean, we're one of those very few success stories where we have engineered the system to a point using on. We're not like letting users fix on it and run that in a loop.
Starting point is 00:24:54 I mean, if you look at most products, that's how they are. You're picking a model and then running with it. Like, we are being smart about this, right? It's like Amazon Prime. Yes, everyone wants free shipping, but you can't just offer it. You have to build the automation, the infrastructure, and we invested in that. That's a trick. I mean, we are able to support all the open source users so that we can learn from them a lot.
Starting point is 00:25:12 We are supporting a lot of the IDE users because we monetize on the GitHub side. We are a team product, and that's the market we care about. By removing the barrier to entry using the IDE, most people are now getting familiar with CodeRabbit through that form factor. And once they like it, they're able to bring us into their Git platforms, where they need more permissions and some consensus to adopt it.
Starting point is 00:25:35 And that's working really well. I mean, go to market wells. Yeah, we are growing like double-digit growth every month. And who are these folks? Are these individual developers? Are these giant enterprises? Are they somewhere between the two? Somewhere between the two.
Starting point is 00:25:50 Like, most of our growth in the early days had come completely from product-led growth, inbound, like, all the way from small five-developer companies all the way to hundreds of developers. So we've seen the whole spectrum of it. Everyone needs this product. Like, no matter whether you're a small company or large, everyone needs to do code reviews. And the smaller teams tend to move faster, given that it's fast, like, you can build a consensus quickly. Larger teams need a longer POC, but it usually happens in a few weeks. The ROI is very, very clear for this product. We have some of the enterprises now also, like, doing POCs for a few weeks, and these are, like, large seven-figure deals even. That is significant. You've also recently raised a $16 million Series A, led by CRV. So I'm sure you've been asked this question before, so I don't feel too bad about springing it on you. But what happens when Microsoft just builds this into GitHub natively? How do you avoid getting Sherlocked by something like that? It's already happened.
Starting point is 00:26:44 So we've been competing with GitHub Copilot's code review product for the last 10 months now. The fact that it automatically does that, and I had no idea that it did that, tells me a lot about GitHub's marketing strategy, but please continue. Yeah, they do have that product, which is built in. And usually, I mean, of course, it's almost like, as with everything GitHub, like, the best-of-breed products still win, right? I mean, so that's where, like, it hasn't impacted anything on our growth or churn rates, even despite that product being out there.
Starting point is 00:27:13 I have heard people talk about CodeRabbit in this context. I have not heard people talk about co-pilot in this context. for just a sample size of one here. Right, that's right. And we have, like, innovated in this space. We actually created this category. I mean, the bunch of larger players are all trying to copy our concepts. But still, there's a lot of tech under the hood, which is, like, very hard to replicate.
Starting point is 00:27:33 I mean, it's, I would say, a harder product to build than even code generation systems, given that it's a very reasoning-heavy product, and people are more sensitive to the inaccuracies when it comes to these kinds of products. And it's a mission-critical workflow, right? I mean, you're in the code review, and it's a very serious workflow, not just something on your developer's laptop where you can be more forgiving around the mistakes. Yeah, that's why we have seen a lot of clones, but no one has been able to replicate the magic of CodeRabbit. We're, like, probably 10x bigger than the next competitor in the
Starting point is 00:28:01 space. Yeah, I'm not aware of other folks in the space, which probably says something all its own. And this also has the advantage of, it feels like it is more well-thought-out than a shell script that winds up calling a bunch of APIs. And it doesn't seem like it's likely to become irrelevant with one feature release from one of the model providers themselves. That's right. That's right. I mean, yeah, I mean, so we kind of bet on the right things early on,
Starting point is 00:28:24 like going fully agentic. We kind of saw that coming two years back, and we invested a lot in the reasoning models before they even became mainstream, because the entire thing was reasoning-heavy. So we've kind of been future-proof with the decisions we are making. And now we could be blindsided by something, I don't know, like GPT-5, GPT-6, but so far it looks like each time the new models come out, they benefit this product a lot. And it's all about the context we are bringing in and how we engineer the system for cost. Cost is a big factor in this market, I can tell you that. Yes. Oh, absolutely. Especially since it seems like when you offer a generous free tier like this, it feels like that, yes, it's a marketing expense, but that also can be ruinously expensive if it goes in the wrong direction or you haven't gotten the economics dialed in exactly right. That's right. And that's where you have all these abuse limits. That's where the technology from my second startup came in handy. So a lot of the abuse prevention is FluxNinja tech that we're still using at CodeRabbit. So you claim to catch a ridiculous percentage of bugs, which is great. And your marketing says all the things I would expect. What has the developer feedback been like in reality? I guess the majority of people love it. I mean, you can see this on the
Starting point is 00:29:32 social media. Like, people are just talking about it. They, in general, love the product. Like, a lot of these organizations we talk to, they recovered the investment in two months.
Starting point is 00:29:41 And some people are coming back and saying, if you were to charge more, they will still buy it. Like, I mean, so it's been very overwhelming. I mean, of course, there are always going to be some detractors, people who don't like AI in general, have their own pain points. So, yeah, that crowd will always be there. But if you look at the majority, it's definitely a step-function improvement in their workflow. And it's very easy to see: if you go and search social media, LinkedIn, or the X platform,
Starting point is 00:30:08 or Reddit, you will always see people saying positive things. And that's where we're getting the growth from. Like, most of our signups are actually word-of-mouth signups at this point. So it's, like, our customers bringing in more customers. So I guess my question now comes down to what the future looks like here. Where does this go? What is the ultimate end game? Do human code reviewers wind up going extinct? Or is this more of an augmentation-versus-replacement story? I think it's more like, now humans will be, like, fighting a higher-order battle. Like, if you look at all this, like, nitpicking and looking at problems, the AI still
Starting point is 00:30:43 doesn't have the non-obvious knowledge. There's knowledge beyond the code review that goes into the decision-making, right, which it doesn't have. And the humans have that knowledge, right? So in a way, like, I don't think humans are going away. I mean, what we've seen is, like, usually instead of having two code reviewers on each PR, now one is AI, the other is still a human. And on smaller pull requests, they are just trusting CodeRabbit.
Starting point is 00:31:04 They don't even have a human review. So some of that automation has kicked in. But when it comes to coding in general and code review, I think it's going to be a long journey. Like, a lot of the labs are hoping that we will completely automate software development in the next few years, but I don't think that's going to happen. What we're discovering is that this whole coding market has multiple sub-markets in it. There's, like, tab completion; now there are different agents, all the way from terminal, IDE,
Starting point is 00:31:30 background agents, and they don't, like, really replace each other. They're just being used for different reasons. And then code review as a guardrail layer is going to be standardized for these organizations as a central layer. It's almost like Datadog, if I have to give you an analogy. Like, you had all this multi-cloud, you had all these Kubernetes, Rancher. But then Datadog said, okay, to be successful,
Starting point is 00:31:53 you need observability. I'm going to give you the guardrails. And it became massively valuable. That's where, obviously, the CodeRabbit opportunity is here. So I guess my last question for you on this is that, for the various developers who are listening to this who are drowning in PR reviews going unattended, what is the one thing that they should know about CodeRabbit? Yeah, I think CodeRabbit is a friend.
Starting point is 00:32:12 I mean, in a way, like, if they're bringing CodeRabbit into their workflow, they are going to be, like, at least offloading some of the most boring parts, which is the trivial stuff. Like, of course, building software is more fun than reviewing it. And it's fun to work with. I mean, of course, it's going to be that AI that's, like, always watching their back while they're trying to move fast with AI, making sure they're not, like, tripping over and causing bigger issues.
Starting point is 00:32:34 If people want to learn more, where's the best place for them to find you? Yeah, I mean, they could just find us at CodeRabbit, and it just takes a couple of clicks to try out the product. It's really frictionless, so you can get started in just a few minutes for your entire organization. Awesome, and we will, of course, put links to that in the show notes. Thank you so much for taking the time to speak with me today. I appreciate it. Thanks, Corey. Thanks for having me here.
Starting point is 00:32:55 Harjot Gill, co-founder and CEO of CodeRabbit. I'm cloud economist Corey Quinn, and this is Screaming in the Cloud. If you've enjoyed this podcast, please leave a five-star review on your podcast platform of choice. Whereas if you've hated this podcast, please leave a five-star review on your podcast platform of choice, along with an obnoxious comment that was no doubt written for you, at least partially, by AI, configured badly. I don't know.
