Big Technology Podcast - Vibe Coding Decoded — With Amjad Masad

Episode Date: August 6, 2025

Amjad Masad is the CEO of Replit. Masad joins Big Technology Podcast for a frank discussion about vibe coding, or building software via prompt. We discuss all the use cases, whether anyone can do it or whether it's just a tool for already-technical builders, whether vibe coding replaces SaaS, and what the role of the engineer becomes in the future. Stay tuned for the second half where we discuss whether the AI coding business is sustainable given the costs of delivering the technology. --- Enjoying Big Technology Podcast? Please rate us five stars ⭐⭐⭐⭐⭐ in your podcast app of choice. Want a discount for Big Technology on Substack + Discord? Here’s 25% off for the first year: https://www.bigtechnology.com/subscribe?coupon=0843016b Questions? Feedback? Write to: bigtechnologypodcast@gmail.com

Transcript
Starting point is 00:00:00 Don't go into this thinking you can just write a prompt and have an application pop up at the other end. At least set aside an afternoon to give it some good effort and try to get, like, your first app in. And once you do that, you just get addicted. I have the stat here that Replit has multiplied its revenue by 10x in less than six months to 100 million in annual recurring revenue. Is that growth vibe coding, or is that growth AI coding? Vibe coding. Is AI coding just a hobby, or are we at the beginning of a technological revolution that empowers everyone to build?
Starting point is 00:00:36 Our guest today, Amjad Masad, the CEO of Replit, has some answers, and we're here at Replit headquarters in Foster City to speak with him. Amjad, great to see you. Welcome to the show. Thank you. I'm excited to be on the show. So we're going to talk today about vibe coding and AI coding, which are two similar but different things.
Starting point is 00:00:53 I first wanted to speak with you about vibe coding, which is effectively: you write a prompt, and then the AI goes ahead and builds software for you. This is something that Replit enables. This is something I've tried. What are some of the use cases where you're finding people are actually effective with this? Like, where are the places where people are doing this well? There's, like, broadly three use cases. One is personal life, family life.
Starting point is 00:01:20 So, you know, for example, a lot of people like to do health tracking. I'm going to track my sleep. I'm going to pull in data from my Fitbit. I'm going to have the AI sort of process that data, and I'm going to have this app on my phone that I use every day. Or I'm going to build an educational app for my kid to learn math or reading. Or someone built, like, a chore hero for their family: you know, have an iPad on the wall showing who's doing the most chores, gamifying their family life. You'd be surprised how popular this use case is. And so, you know,
Starting point is 00:01:57 in the niche that I've always been in, which is, like, creator tools, there's always been this idea of personal software, malleable software. And by the way, this goes back to early computing history. So, you know, for example, Apple had this piece of software called HyperCard. HyperCard allowed anyone to make personal software. You know, there's Visual Basic. It's been attempted so many times, but for the first time, now anyone can make software. So there's a class of personal software.
Starting point is 00:02:33 We have a mobile app, and you can use that to make software. The most fun thing to do is sit down with your kids, five, six, seven years old, and just brainstorm games and make games with them. So that's one bucket. Wait, before we go to the next bucket, I want to ask you a question. Does this say something about the software industry, that the software industry just hasn't served so many use cases? Or are these use cases non-economic? Or is it possible that people will build things for their family, and the next thing you know they can serve that mass market and it becomes a business? It is certainly, you know, there's certainly a market there, and you can certainly make a lot of money from that.
Starting point is 00:03:11 Okay. Because when I think about this concept that AI is going to take our jobs (we're going to get to jobs), to me it's like, wait, there's so much left to build. If you think just about what we have today and maintaining it, then maybe it will. But there's so much that software has not yet touched that it seems to me there's more opportunity out there than people are imagining. But just to touch on your earlier question, and you tell me how deep you want to go, because I can talk about this for hours: the early computing pioneers all had this idea that
Starting point is 00:03:44 computers are special because of this idea of programmability, right? From the moment we had a programmable machine, which was first invented by von Neumann. And it's the same architecture that we use today. The thinking was, oh, anyone can use a computer to program, to solve problems, to build applications and all of that. It didn't get mass consumer adoption. And the reason is because coding is hard. And so you had Xerox PARC, the research center in Palo Alto, the Palo Alto Research Center.
Starting point is 00:04:27 They developed GUI. One day they invite this up-on-coming entrepreneur called Steve Jobs. Steve Jobs looks at desktops, menus, items, and he's like, he has the Apple II. Obviously, Apple II is also still command line. You can write some basic. And he's like, okay, this is the key to get mass consumer adoption of computers. And so he copies what Xerox had, and he builds it into the Mac, and obviously later Windows and Microsoft copies UI, and then suddenly computers are usable by anyone.
Starting point is 00:05:01 And this is amazing. Now, like, billions of people use computers, and now we have phones based on the same idea. But what we lost is this idea that anyone can program a computer. So that's something I've been passionate about all my life: computers should fundamentally be programmable. And there's been a lot of different iterations of this, with visual programming. We had the no-code, low-code revolution that happened, like, maybe, you know, 10 years ago. I would say it never reached its full potential.
Starting point is 00:05:25 It was more of a buzzword than reality. But now... I think it is a multibillion-dollar market for sure, but it's not a trillion-dollar market. I think this idea that anyone can make software is such a massive market. Okay, so bucket number two. Bucket number two tends to be entrepreneurs. Everyone in the world has ideas. People build so much domain knowledge in whatever their field of work is, right?
Starting point is 00:05:55 I was hearing a story today of an Uber driver that is starting to make an app with Replit. And the app is about logistics. He was a truck driver before. And so he has domain knowledge about how to manage fleets, for example. But he never was able to make it into software because he didn't have the skill. Maybe he didn't have the capital to go, you know, commission a contractor to do it. And suddenly he can do it. So, you know, pick
Starting point is 00:06:21 anyone on the street, and in whatever industry they're in, they realize that there's a need for a piece of software or technology that no one has built, because no one else has that deep domain knowledge. So we see entrepreneurs from all walks of life. One of our favorite ones, we talked about it publicly on our Replit social media channels: a doctor from the UK. He's like, you know, there's all these apps around managing doctor-patient relationships, but it's not fully integrated. So, you know, you have ZocDoc, you can go make an appointment, but, you know, how do you manage your prescriptions? Can I track my patient over time, their progress? Can I get information from their Wi-Fi-connected scale, from their Fitbit, from, you know,
Starting point is 00:07:13 and so he built this comprehensive platform. He got quoted 100,000 pounds by an agency, and he built it for less than 200, you know, British pounds. Not 200,000. 200, period. Two hundred pounds. So this is stuff that's being vibe-coded. Effectively prompt in: I want to build this software. Yeah. And then Replit will go and
Starting point is 00:07:37 build it. Yeah, and this is now a startup. And we've had startups start on Replit and get to multi-million-dollar revenue run rates. Some of them have raised at, like, a half-billion-dollar valuation. And so we have everyone from small entrepreneurs to venture-scale entrepreneurs. This gets me really excited, because America has always been about entrepreneurship, and this is really what attracted me to this country. But actually, if you look at the stats on entrepreneurship over time, although we hear about what's happening in the Bay Area and Silicon Valley, where there are new startups every day, in the rest of the country, actually, you know, new firm creation has been going down over the past
Starting point is 00:08:12 hundred years. There was an uptick during COVID, where everyone sitting at home was like, I have an idea, I'll start my business. Right. Exactly. Which is great, but then we actually had a regression to the mean. And I think with AI, we're going to see that explode again. Okay.
Starting point is 00:08:26 So that's the second bucket, entrepreneurs. One more bucket. Third one is people at companies like this one. So actually, I'll give you a story from our HR department. We have a small HR department. Replit is kind of a lean team. We're 80 people. And so we have a lot of these SaaS tools.
Starting point is 00:08:45 We pay tens, even hundreds of thousands of dollars for every specific kind of function. And sometimes they don't really fit our use case, or we think they're too expensive. So this HR person had a need for org chart software that could visualize the org chart, add and remove people, maintain a history, look back and see what happened, what changes were made. She went on the market and saw that none of the software captured the exact bespoke use case, where she wanted to connect it to our other HRIS systems or databases. And they were all, you know, very
Starting point is 00:09:30 expensive and needed a lot of IT support. So she went into Replit and vibe coded it in three days. And so that means we have a system that exactly fits our use case. And it also means that we're not paying $10,000, $30,000 a year for a piece of SaaS software. And that's happening across the board. We see companies saving hundreds of thousands of dollars replacing SaaS software with internally built software. Now, do you need to be someone with some technical background or some technical know-how to be able to do this? Well, I'll give you an example. I mentioned to you before we started recording: I opened a Replit account this week. I wanted to build a simple choose-your-own-adventure game. I think it's called History Havoc, where you can work your way through
Starting point is 00:10:16 different history scenarios. But it just didn't get to the point where I wanted it. How long did you work on it? So I spent about an hour on it, not a lot of time. And also, full disclosure, I'm just on your starter plan, I'm not paying yet. But I couldn't get it to work. I also tried to build this story tracker, and it wasn't able to crawl the web the way that I hoped it would. So it still seems, to a lot of people, like this is something that is helpful if you're technical and you want to make a prototype. But the use cases that you're giving seem to be full-blown companies or working pieces of software. So explain that disconnect. I think it requires grit. Obviously, there's, like, stochasticity in the machine learning models. So explain what that is.
Starting point is 00:11:02 The same prompt can put you on a path of success based on randomness that's happening inside the GPUs. There's this parameter in large language models called temperature. And temperature is literally how random the sampling of the words coming out of the LLM is. So the LLM, the way it works, you know, you give it a piece of text, and it tries to complete the next word, the next token, as we call it. And the way it happens, it generates a lot of candidates. So, you know, the red fox, you know, jumped, slapped, whatever.
Starting point is 00:11:44 But, like, jumped is the top one. You know, it's the highest probability one, the one, you know, the model has seen occur after that sentence in millions of cases. But, you know, you have the sampler, and it could be randomizing what it picks, and that randomization makes it more creative. There's also inherent randomization inside, like, the Nvidia chips of the GPUs. So this style of software is unlike classic software, where everything is discrete, input-output; machine-learned models have inherent randomness.
Starting point is 00:12:26 And that's a feature, not a bug; it creates creativity, right? So some people sometimes just get bad luck with Replit. We're obviously trying to mitigate a lot of these problems. But I would say it also requires grit. Like the game you just described: for professional programmers, coding it might be a two-day thing. On Replit, you can do it in two, three, four hours, but it would require a little bit of grit. So it's not magic.
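(Editor's aside: the temperature mechanism Masad describes, sampling the next token from a softmax over candidate scores, can be sketched in a few lines of Python. The vocabulary and scores below are toy values for illustration, not anything from Replit's or any lab's actual stack.)

```python
import math
import random

def sample_next_token(logits, temperature=1.0):
    """Sample a token index from raw model scores (logits).

    Lower temperature sharpens the distribution toward the top candidate;
    higher temperature flattens it, making output more random ("creative").
    """
    scaled = [l / temperature for l in logits]
    m = max(scaled)                            # subtract max for stability
    exps = [math.exp(s - m) for s in scaled]   # softmax numerators
    total = sum(exps)
    probs = [e / total for e in exps]
    r = random.random()
    cum = 0.0
    for i, p in enumerate(probs):              # sample from the distribution
        cum += p
        if r < cum:
            return i
    return len(probs) - 1

# Toy candidates for completing "the red fox ...", with made-up scores.
vocab = ["jumped", "slept", "slapped"]
logits = [2.0, 0.5, 0.1]
print(vocab[sample_next_token(logits, temperature=0.7)])
```

At very low temperature the sampler almost always picks the top candidate ("jumped"); at high temperature the other candidates appear often, which is the randomness Masad says can put two identical prompts on different paths.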
Starting point is 00:12:58 And the skills you were talking about, the technical skills, although they're not required, you can build them up over time. And our environment kind of shows some of these features as you're working with it. And so I would suggest to people: don't go into this thinking you can just write a prompt and have an application pop up at the other end. I would say at least set aside an afternoon to give it some good effort and try to get, like, your first app in. And once you do that, you just get addicted. So there's vibe coding, which is, again, prompt and then you make an app, and then you can refine it with more English. And then there's AI coding, where you could basically have AI, you know, complete your code, big autocomplete. So
Starting point is 00:13:42 what do you think the opportunity is in vibe coding versus AI coding, and where do you think the energy is in the AI industry today? I gave the analogy of the history of computing, and I think it's a very suitable analogy for a lot of what we're talking about. Early on in computing, we had the mainframes. The mainframes were really big, room-sized computers. IBM used to make them; large corporations, governments, and universities used them, but everyday people didn't have access to them until Apple created the Apple II. And that was the first mass consumer market computer. And since then, we've had Windows and all these devices. The mainframe was already serving the professionals' needs, but it wasn't serving
Starting point is 00:14:31 the consumer needs. Now, if you look at the market for PCs versus the professional workstations, Sun Microsystems and all of that: the PC not only was a much bigger market; eventually it subsumed the more professional-grade software. And this is called disruption theory, for those in your audience that might be into business history or theory. Clay Christensen, who was, I think, a Harvard Business School professor, wrote this book called The Innovator's Dilemma. And the idea is that a lot of technologies start at the lower end. And because of their mass-market appeal, they onboard a lot more users and customers, and over time they reach certain economies of scale and they subsume even the
Starting point is 00:15:28 upper end of the market. Currently, the upper end of the market is what you were talking about with the AI coding tools, right? So there's like 30 million developers all over the world, maybe a little more now. Those are professional developers that went to computer science classes in college. They were trained for four or five years, and now they're working at companies. If you make those developers 20, 30, 40% more productive, then, depending on, you know, if you're a company the size of Google, it's like billions of dollars' worth of productivity, right?
Starting point is 00:16:06 So the market is really obvious there. You can go apply it and capture it, but it's a zero-sum market. If you look at Copilot, which is Microsoft's product, which was the first to market, versus Cursor, which is the more modern kind of AI coding IDE: as Cursor is eating market share, you can see it is almost exactly proportional to Copilot declining in usage. So that's a sign of a zero-sum market. It is very lucrative and there's a lot more growth to be had there, but it is not this fundamental revolution that we could be going through, where anyone can make software. Let me ask it this way.
Starting point is 00:16:50 I have the stat here that Replit has multiplied its revenue by 10x in less than six months to 100 million in annual recurring revenue. So is that growth vibe coding or is that growth AI coding? Vibe coding. Really? Yeah. And these vibe-coded programs, these bespoke programs that people are building with prompts, are they in production or are they mostly hobbies that people fool around with? It depends. The first bucket is more hobby, personal life.
Starting point is 00:17:24 Second bucket, entrepreneurs: as you know, most startups die. So most startup ideas don't make it to fruition. The 10% of startups, the small businesses that get off the ground, they get the most value out of Replit. And some of them are in production now. You know, I've talked about a lot of these stories. But, you know, for example, we have this creator, his name is John Cheney. He's a serial entrepreneur. It used to take him many months and hundreds of thousands of dollars to build applications, and now he can spin up a business and get to a million-dollar run rate in a matter of weeks.
Starting point is 00:17:59 Obviously, he has experience. He knows the formula of what it means to be an entrepreneur, but people can learn that over time. In terms of the enterprise, you know, we have, for example, Zillow. The CEO of Zillow recently, on the New York Times DealBook, talked about how everyone at Zillow is using Replit to accelerate product innovation. Because product innovation no longer depends on engineers. You can have product managers do the entire iteration, getting user feedback, even without going to the engineers. So it just, like, increases the pace. We have Duolingo, a bunch of these customers that are really focused on innovating, building their second, third product, that are now using Replit for a lot of these use cases. So is the use case that you build, like, a prototype, and then you get some feedback, and then if everything works out well, then you build it into the product with your, like, core engineers? That's one, that's one use case. Okay.
Starting point is 00:18:53 That's interesting. Yeah, that's one use case. It's really great. It rapidly improves, you know, the time to market. The second use case is operations and internal tools. So for example, like, you know, Sears Home Services, a really old company, employs people that go and, like, fix homes. And they had an operations team that wanted to build a
Starting point is 00:19:18 lot of AI tools and software for their field workers to be able to manage their work and their earnings and all of that. But their software was, like, these 100-year-old COBOL programs. And the engineers were kind of busy migrating that and improving that. So the operations team started using Replit to spin up these AI applications that are deployed and used in production by those field workers every day, to manage their day and design the optimal routes to maximize their earnings per day. So the operations-type use cases tend to be deployed, running in production. Okay. So just so I'm clear, are you also facilitating AI coding? Or is it mostly that you've turned Replit into a vibe coding company? My mission has always been about how do you
Starting point is 00:20:08 enable people to do this magical thing that is creating software. It's one of the most magical, exciting experiences you would ever have. And I was a founding engineer at Codecademy. And before that, I built open source tools to do that. Codecademy taught millions and millions of people how to code. And we changed, you know, a lot of lives. So the DNA of Replit has always been about how do you make programming more accessible. It had, like, a more developer bent at some point. But Replit is sort of a batteries-included platform. We give you the database. We give you the authentication. We give you the deployment. We give you the scalability. We give you all of that out of the box. You don't have to go anywhere else to do any of that. So it always meant
Starting point is 00:20:58 that the people that are getting the most out of it tend to be not professional programmers. Although professional programmers do use it, I would say that's, like, 20% of the use cases. And the question is, then, do the people using Replit then come for the people who are professional programmers? There was a funny thing that happened. I watched you give a talk at the Semafor tech event in San Francisco a couple months ago. And I tweeted something that you said: that in one year or 18 months, companies might be able to run themselves without engineers. And then somebody responded to me with this meme where they said, founders in public: AI is writing 99%
Starting point is 00:21:45 of our code. In six months, we won't need any engineers. Founders in the DMs: does anyone know a good React developer? $30,000 bonus, and I will name my firstborn son after you. So can you explain that disconnect, between this view that engineers are going away and this still very intense demand for engineers in the market? I never made the point that engineers would go away. I made the point that entrepreneurs can start businesses without needing engineers, and that we're already seeing that. You know, I meet YC companies, and Y Combinator is the most prestigious startup accelerator in the world, in the Bay Area. And in the past, Y Combinator would encourage you to go get a technical co-founder. But like we said, there's so many people with amazing ideas that don't have a technical co-founder. And so they're starting to get into YC.
Starting point is 00:22:34 And what they tell us is: we're just going to build this thing on Replit. We're going to see how far we can get. And they often get really far. Now, if you're building a venture-scale company, and you're going to get to hundreds of millions of dollars of revenue, and you want to become a billion, 10 billion, $100 billion company, you're going to have to hire engineers.
Starting point is 00:22:57 But if you're trying to build a company that creates a really great living for you, even, you know, you can potentially get rich from it, I think we're almost there where you can do it on your own without any developers. And so when I'm talking, I'm talking to our audience. Right.
Starting point is 00:23:17 As opposed to, I'm not talking to Microsoft or Facebook. They're not going to replace developers anytime soon. My view on developer productivity is that developers are much more impactful than they used to be, because a single developer can be so highly leveraged these days. And so, yes, you want to find the best developers.
Starting point is 00:23:39 And we're expanding the team. But at the scale that our output is at today, we would be 10x the number of people if we were a SaaS company five years ago. Wow. To reach $100 million in run rate, you know, five years ago, on average, we'd have like 500 people. A lot of companies would have 1,000 people.
Starting point is 00:23:59 How many do you have? 80. Wow. Okay. You know, it just makes me wonder, as companies grow like this, what the future is going to look like from the technical side. And I'm curious about the folks who have technical abilities. You know, let's say the economy expands like this, and everyone and their grandma can build, literally can build, a company using AI tools. Do the technical people then come in and sort of clean up the problems? Are they your, like, cleanup crew? I was reading this funny article in a publication called Futurism. It says companies that tried to save money with AI are now spending a fortune hiring people to fix its mistakes. And it wasn't about vibe coding. It was actually about content, like content marketing, where, like, your content marketing plan is just filled with this, you know, kind of bland, ChatGPT-generated copy. And half the time
Starting point is 00:24:54 it says, as an AI assistant, this is the message that I would use. Right. So I am curious to hear your perspective: does the technical field end up becoming a cleanup crew for vibe coding gone wrong? Let me just tell you where I think technical folks have job security today. So I think if you're writing software for my Tesla, I don't want you to be vibe coding. I want you to write low-level, verifiable code. If you're writing
Starting point is 00:25:51 code for a space shuttle, you're writing low-level, verifiable code. I mean, those are life-or-death situations, so we don't need vibe coding there. We need more precision. But even for sort of large-scale platforms, if you're building a core cloud component, the storage or virtual machine components on AWS or Google Cloud or Azure, you want systems engineers that understand distributed systems, that understand how to create fail-safe systems at scale. So I think engineers there have job security for the foreseeable future, right? Because of the problem of the stochasticity of these models and all of that, you need every line of code to be reviewed and managed very carefully. Now, where I think AI is going to have the most impact is on product. People building products want to iterate on them really quickly. And with internal tools, people want to replace all the mess of the SaaS software that we have today. So I think that's happening.
Starting point is 00:26:39 Now, in terms of the cleanup, I mean, it depends on where you think AI is headed. Like, do you think that AI is good at making software but bad at maintaining it, and that it's going to stay bad at maintaining it for the foreseeable future? If it's good at making software, it must also be good at refactoring software or testing software, right? Actually, right now it's pretty bad at testing software, because there's this thing called reward hacking. So when you do reinforcement learning over large language models, you're giving the model a reward every time it does the right thing. With reward hacking, well, the models become incredibly goal-focused. They just want to get the task done, right? That's what RL does. And oftentimes, what
Starting point is 00:27:28 we see when we try to get the models to test things is that they will start cheating, in a way. A model will, like, change the test to fit the mistakes it made, or sometimes delete the tests. It's a really fascinating behavior that Anthropic has actually published research on. But do you believe that's going to be the case forever? Obviously not. Like, I think over the next three or six months, we're going to see machine learning models being able to test and verify their work.
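(Editor's aside: the reward hacking failure Masad describes can be illustrated with a toy reward function. If the reward is just the fraction of tests passing, and the agent is allowed to edit the test suite, deleting failing tests scores as well as fixing the code. This is a simplified sketch, not how any lab actually computes its rewards.)

```python
def reward(passed: int, total: int) -> float:
    """Naive reward signal: fraction of tests that pass.

    If the agent can edit the test suite itself, this reward is hackable:
    deleting failing tests raises the score without improving the code.
    """
    return 1.0 if total == 0 else passed / total

# Honest path: a real code change fixes 8 of 10 failing tests.
print(reward(8, 10))  # 0.8

# Reward hack: delete the 2 failing tests instead of fixing the bugs.
print(reward(8, 8))   # 1.0 -- a "perfect" score for worse software
```

A goal-focused policy trained against this signal has no reason to prefer the honest path, which is why test integrity has to be enforced outside the agent's reach.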
Starting point is 00:28:05 Okay, so one of the biggest things that this moment depends on is affordable large language models coming from the foundation model companies. And that means, you know, in layman's terms: if you're going to build with AI code, you have to be able to bring in models from an OpenAI or Anthropic that are going to generate that code and not break the bank
Starting point is 00:28:33 as you do it. And we're still in this, you know, VC-funded, private-market-investment moment where we don't really know the true cost of these models. Meaning the foundation model companies might be losing money on those. Do you think they are? And the application companies? I don't think they are, on a gross margin basis. I don't think they are.
Starting point is 00:28:50 Right, but they're also training, and that's a lot of money. Definitely. They're not profitable. They're losing billions a year. Of course. Yeah, yeah. And there's been this thing that's happened recently that I just want to run by you, with both Replit and Cursor, where I think end users have seen pricing go up. Right.
Starting point is 00:29:10 And Ed Zitron wrote about this. And I think it's a pretty good piece talking about effort-based pricing within Replit. And that is effectively a different pricing structure. We've seen Replit users talk about the fact that they're actually paying a lot more for the same services than they were previously. And his theory is that OpenAI and Anthropic found quiet ways to jack up their prices for startups, and we're beginning to see the consequences, because Cursor had a similar thing happen, says Zitron. Oh, okay.
Starting point is 00:29:42 Is that what's going on? No. The prices haven't gone down, and that's a problem. So we used to see, you know, we've seen token prices come down 99% since ChatGPT. And we've seen token prices come down year over year. The thing that's a little disturbing right now is that token prices are not coming down. You better believe that the unit economics of the labs are getting better, because of economies of scale, because these models are getting easier to optimize, but they're
Starting point is 00:30:12 actually not reducing prices. And so the concern is: are we reaching a steady state? Is there price collusion? Is there now an oligopoly of a few model companies that are able to create these state-of-the-art models, with no downward pricing pressure, right? Are investors starting to demand better business fundamentals? I don't know exactly what's happening. We should talk about the Chinese open source models in a second, because I think that will introduce an interesting mix to this. But it certainly is the case that we're not seeing token prices go down.
Starting point is 00:30:53 The main reason we went to effort-based pricing is, well, let me explain effort-based pricing. So when we released Replit Agent v1, version 1 of Replit Agent would work for, like, two minutes at a time. You would give it a message, it would go try to do something for two minutes, either succeed or fail, give you a checkpoint, commit the source code, and charge you 25 cents. And the reason it only worked for two minutes is because the capabilities of the models, you know, meant that it could only work for that long. Now, models got better. And we knew that models were going to get better and that they were going to be able to work for 10, 15 minutes. And so with version 2 of Replit Agent, which started in beta in February and came out of it in April, the model would work for 10 minutes.
Starting point is 00:31:56 And so we can't charge 25 cents for, like, 10 minutes of work. So what we started to do is we came up with a heuristic: every nine tool calls, we'll do a checkpoint. And so as it's working, you'll see it make a checkpoint, checkpoint. That's a hack, right? That often means that if you make a small change that costs us five cents or whatever, you still get charged $0.25. But also, if you make a big change, you might be costing us a lot more than what we charge you. So it was really out of whack. Now, that was a hack, and we need to move
Starting point is 00:32:33 to a place where we're charging the user proportional to how much the model is working and the cost on us. And we think that's the best way to create a long-term sustainable business. When those two things are aligned, it also opens up new opportunities: when we do optimizations — we're always optimizing; we actually had like a 20% optimization on cost recently — we pass it straight to the user, because now cost and price are tracking with each other. What happened with our community, the first thing that happened, is there was sticker shock. You were used to seeing 25 cents every ten tool calls, and suddenly you're seeing $1.50, you know, or $2 after 15 minutes of work.
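The contrast Masad draws — a flat per-checkpoint price versus a charge that tracks actual model cost — can be sketched in a few lines. This is a hypothetical toy: the rates, the margin, and the function names are all made up for illustration, not Replit's real billing logic.

```python
# Toy comparison of the two pricing schemes described above. All numbers
# here are illustrative assumptions, not Replit's actual rates.

FLAT_CHECKPOINT_PRICE = 0.25  # dollars per checkpoint (v1-style flat pricing)
MARGIN = 1.2                  # assumed markup over provider cost (effort-based)

def flat_price(num_checkpoints: int) -> float:
    """v1 style: every checkpoint costs the same, regardless of work done."""
    return num_checkpoints * FLAT_CHECKPOINT_PRICE

def effort_price(provider_cost: float) -> float:
    """Effort-based: the charge is proportional to what the run actually cost."""
    return provider_cost * MARGIN

# A small change that cost us 5 cents vs. a big change that cost 40 cents:
print(flat_price(1))       # flat pricing bills 0.25 for either change
print(effort_price(0.05))  # the small change now bills roughly 0.06
print(effort_price(0.40))  # the big change now bills roughly 0.48
```

With flat pricing the small job overpays and the big job underpays; the effort-based version makes price and cost move together, which is the alignment described above.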
Starting point is 00:33:25 So that's one. Two, it's true that for some users who are really advanced, the cost has gone up, because their projects are bigger. The context size is bigger. Their workloads are bigger. But early on in a project it's actually cheaper. You mentioned that you worked for an hour. You didn't have to sign up for the Core
Starting point is 00:33:47 paid plan. We give free users $3. So you worked for an hour on $3. Not bad. Yes. It's cheaper than a developer. It's cheaper than a developer for sure. That being said, we recognize that for advanced users there's almost a tax as you go on. So we're trying to optimize the context window and make sure that advanced users are not getting a more expensive experience. The other thing that happened is we introduced thinking mode, reasoning mode, and we introduced like a high-power mode. And people were enabling those, and sometimes forgetting they were enabled, so now we actually hide them under Advanced. Like, don't enable this unless you know what you're doing
Starting point is 00:34:31 and you want more power. And there's like a 5x multiplier on it. So a lot of people were enabling those and getting these large checkpoints, and we put out content — we put out a video, we put out some documentation, a blog post: here's when to use reasoning mode; you shouldn't always have it on. So, just describing all of that that's happening: there's a macro trend in the application space where a lot of companies were subsidizing the cost — a lot of companies were paying more money to Anthropic and OpenAI than they were making. Were you doing that? On v1, no. On v2, yes, because the pricing model was out of whack with how we were charging.
Starting point is 00:35:19 Actually, the median cost per checkpoint kind of went up only a little bit. So on the lower end, we're charging user less right now. But it used to be that on the lower end, we're charging users more. On the upper end, we're charging users less. So now it's more proportional, more fair for both. And so now we have solid business fundamentals that allows us to grow. I've been talking about how Replit has been my mission, my passion for, you know, eight, nine years as a company, 15 years as a side project and a vision.
Starting point is 00:35:56 And we're not trying to, you know, rapidly expand revenue while losing money in order to flip this company, to sell it. You know, we've seen all these acquisitions or raised like the next big round. We're really trying to build a business for the long term. And Replit is made of all these different components. So we have costs, not just on AI, we have costs of traditional compute, CPUs, storage, databases, all of that stuff. So kind of to summarize, you know, I've talked a lot about what was happening specifically in Replit. I don't know what's happening in Cursor.
Starting point is 00:36:29 I think for sure that their situation is like a little different because they, their dynamics is, I think they actually did raise prices for the, you should talk to them. But I think it's like a little different dynamic than what happened in the Replit. To summarize, there is a concerning trend where token prices are not going down. Is that going to be the case for the future? Because that sucks. Because we want to be able to use more tokens to create more intelligence to be able to create better applications for users. Is that going to be the trend forever? Are we reaching a steady state?
Starting point is 00:37:11 In cloud, for example, we kind of. reached a set of state. When you have a monopoly, there's no pricing pressure. But when you also have an oligopoly, they not intentionally without talking, start colluding. You know, because it's like a market dynamic where it's like, if you don't lower a real price, I'm not going to lower a more price. It's not in our sense as a whole, because we own 25% each for the market, right? Okay, I do want to ask you about something that you didn't mention when you looked at the different factors for why prices might not be going down. There might be investor pressure.
Starting point is 00:37:47 There might have been this equilibrium reached. Or is it possible that these models have just gotten so big and expensive to run that the fundamental economics of AI are just not working? So explain why. You can surmise the bigness of the models based on speed, token throughput. But it's not perfect, but if you remember, GPT 4.5, GPD 4.5 was an experimental model from OpenEI. It was the idea, less train a trillion parameter dense model, meaning it is not sparse, meaning all the neurons are activated on every request. And it was so slow.
Starting point is 00:38:31 It's really hard to run these things. The new models, even when they're big, they're sparse models. that are called MEOA mixture of experts. So in every request, there's a router layer that takes it to the expert part of the circuit in order to answer that question. So, you know, there are models with trillion parameters, but any given request is 32 billion active, and that's like a kind of small model. And what we're seeing based on speed and things like that, is actually probably the models
Starting point is 00:39:04 are getting more efficient. I mean, deep seek showed that the models are getting more efficient. And if deep seek open source was able to make it, you better believe that the labs are also making more efficient. Okay, I do want to speak with you about Deep Seek and Kimmy K2 and other Chinese models. Let's do that when we come back from the break right after this. Cool. And we're back here on Big Technology podcast with I'm Jed Mossad, the CEO of Replit talking about all things, AI code, vibe coding. And now let's talk about these Chinese models.
Starting point is 00:39:31 So this episode will air a couple weeks after the emergence of Kempassad. Kimi K2, but we're talking about Kimi K2, which is another Chinese model. And of course, this deep seek moment was a big moment where we found out that this seeming small hedge fund in China with some GPUs was able to engineer a more efficient model. That story will be debated about what actually happened for a long time. But let me ask you one influence of deep seek question and then we'll get into the others in Kimi K2. So you mentioned before the break that Western models have taken after Deepseek. So do you think they learned what Deep Seek did and sort of put those new innovations into
Starting point is 00:40:13 play in their own models? Or is that coming anyway? From what we've seen from the Twittersphere is that it seemed like there were some surprises, because researchers just talk a lot, it seems like there were some fundamental innovations from the Deep Seek models that weren't known in the West. But have they implemented those now? And that's probably why we're getting more efficient models. Like, yes, I'm sure like the models are getting more powerful without going, getting slower.
Starting point is 00:40:44 All right. So tell me about Kimukai, too. When Anthropa came out with Sonnet, Claude 3.5, that was a fundamental shift in the industry where the models got a lot better at coding. And suddenly, instead of making small snippets of change, Sonnet could, could, could, generate entire files and enable things like cursor composer where it was a start of vibe coding where you can put in a prompt and generate entire files and all of that or generate large edits then sonnet 3.5v2 was the first model it was a computer use model was the first model where you could sense that there's agentic true agentic behavior I don't know what they did
Starting point is 00:41:32 they cracked URL, whatever happened there, you can give a model of VM and it can give it a virtual machine. You can give it an objective and it can slew around in the virtual machine, look at the files, do run some commands and then write a program, test it and then solve a problem. That experience, there's a benchmark called SWEBENCH, software engineering bench. and you start seeing the score going up dramatically. I don't know. I think we were at like 10% last year, and now we're at like 70% and 80% percent. World-class coding.
Starting point is 00:42:15 The interesting thing about SweetBanch, it's not just coding because there are other benchmark that just do like the code generation, right? SweetBench, I think the harder thing about it is the agenic workflow, is writing the code, testing it, running commands, finding files, understanding files. And this stuff was like a huge jump that happened with Sonnet 3.5V2, then 3.7, then 4.0.
Starting point is 00:42:44 And they've, you know, kudos to Anthropic. They've been able to create a lead that hasn't been bridged by the other labs. Gemini is getting there on the agentic stuff. but I would say Open AI kind of lagged behind. O3 has some interesting agenetic capabilities, especially around deep research, but it hasn't been as good as the other models on this agentic stuff. I mean, they did some interesting stuff with codex.
Starting point is 00:43:14 I don't know if those models are in the API, but everyone is using Claude for the agentic coding experience. The interesting thing about Kimmy K2, I would say, is they caught up not to clot Sonnet 4.0, perhaps Clots on at 3.7. At least that's the vibes right now, before the other labs. Wow. Yeah, I think that's really underreported. Again, this is vibes.
Starting point is 00:43:40 Everyone's trying to figure it out, but it looks like it has a really good sweepbench. It is doing 65 on Sweet Bench. Sonnet is 72. If you do sampling, which is you, for every step, ask the model to generate in number of solutions, you can get up to 72%. It can be competitive with Sonnet. And this is with export controls. Yes. And I think in the paper, they talk about the solution is scaling reinforcement learning. We also saw that with GROC4. GROC for spent as much on reinforcement learning as they spent on pre-training, which is unheard of. But even, but that's an
Starting point is 00:44:23 important point because with that big spend on reinforcement learning, GROC is a competitive model, but they spent billions, billions, hundreds of millions on RL, which is this goal-setting form of training. And it's not like it's a new category. So it shows there's some limits. X-AI is an amazing team, and they've been able to achieve so much and so little time. But it's also well-known in the industry that they're computer inefficient. They're so compute-rich. That they're throwing computer the other problem in many ways, yeah. So what is the significance that Kim EK2 is now as good as some of these anthropic models? A small research lab, I think the rumors like the 200 people, again, there's export controls as well,
Starting point is 00:45:11 was able to figure out how to catch up to near state-of-the-art agentic coding models before big Western labs that are highly capitalized, a lot more researchers was able to. And does that mean then that they can undercut them on price? So let's see. Right. So let's see. Are you going to integrate Kimmy K2?
Starting point is 00:45:37 We're looking at it. We're looking at it. And so far? So far we're impressed. Okay. So far we're very impressed. So I mean, look, these things, sometimes they overfit to certain things. And I would say it requires a month from the entire community to kind of like really have
Starting point is 00:45:53 consensus over like whether the model is really great. And similarly with the Gorkfor, I think a lot of people are playing with it. But my sense is that it is good enough. And again, the economics are so good that you can expend more tokens to get more intelligence. So it is not at the frontier, but it is near frontier, but given that it is cheap and fast enough, you can spend more tokens that creates some more interesting potential for us to create new capabilities in our platform because it is cheap and fast. How much cheaper is it than the Anthropic models?
Starting point is 00:46:39 Man, I am bad at this, but like I would say, I don't, one-fourth, maybe. Yeah. That's on their official API, perhaps more even. I forgot. Maybe you can look it up after the show. We're going to have to re- This show is going to come. It will come a couple weeks after we record, but we'll have to release this segment early because that's astonishing. One more question about Anthropic. I can vibe code and clot. I do it all the time. And they also have this Claude code product where people are writing prompts getting code. Are they your competitor long term, or how do you see them on that front? Because that's the question is eventually do the labs just subsume everything else that's built on top of it? I think the question is for them, right? Like you should ask, I know you're going to talk to Dario.
Starting point is 00:47:31 You should ask them the question. Listeners, viewers, this will air a week after Dario, but I'm about to, after this, go in and speak with him. So you might see this question a week earlier. Yeah, so look, we're committed to our relationship with Anthropic. They're a great company to work with. We have a great partnership. And it's not like we didn't anticipate them wanting to build products in addition to the models. Every model company is building products right now.
Starting point is 00:48:03 The thing that they're going to have to manage is their pricing. If they're going to compete by undercutting everyone on price, they're going to destroy the ecosystem. Right. I think Replit right now has the advantage of this platform that we built over eight years. That it's going to take a lot of blood, sweat, and tears to build. And also, the user experience that is focused on that sort of non-technical user. And like, we really care about this idea of empowerment. Right now, Cloud code is used by developers and loved by developers. And I think they're competing head-to-head with cursor, windsurf, and those kind of products. Whether they're going to move into our space, again, you should ask them about that.
Starting point is 00:48:50 But I think a more interesting question, how do they want to nurture the ecosystem versus just go, because they can compete on price. They can steamroll everyone. Right. I mean, Claude is the max package is $200 a month. and you see developers getting thousands of dollars of API value out of that. You must notice this. Yeah, yeah. Not good for the ecosystem.
Starting point is 00:49:20 I don't think so. Why? Because, again, you're competing on price, not how good the product is. And there's a price at which maybe the quality doesn't matter as much as how many tokens I'm getting, although CloudCode is a really good product. But then, you know, cursor, no matter how good they make the product, they're still going to be more expensive and a disadvantage. And people are like, well, you know, I really like cursor, but like I can get 10x more value out of clot code. And so the marginal gain in product quality will not matter as much.
Starting point is 00:50:02 Right. And that will destroy the ecosystem. Fascinating. I mean, I think that this question is just one small. question or one version of a big question we're going to be asking as these AI models get bigger and better and more intelligent. So I want to spend the rest of our time talking about some philosophical questions, if that's okay with you. Sure. There's this idea that the AI research houses want to use the code that they generate to sort of, or these coding applications
Starting point is 00:50:32 to speed up the development of the next model and compress the time it takes to get better models. People call it an intelligence explosion or things of that nature. Do you see that as feasible and is that something we should want? So you should think about what are the limiting factors to the next version of a model? What are the bottlenecks? Where is that invasion needs to happen? I can think of a few areas. One is research. So this is algorithmic research. like figuring out the next algorithm, next improvement in training algorithm, and inference algorithm, whatever it is. And then systems engineering.
Starting point is 00:51:21 These training runs are massive that requires a lot of interesting distributed systems engineering. Will AI coding help with AI research on the margins, perhaps? they can spin up Python notebooks faster. I don't think it's that impactful. Like the models can't do AI research, can come up with ideas and test them really quickly. It will help with distributed systems.
Starting point is 00:51:53 Perhaps it is not as impactful right now on writing Rust code or C or Go whatever as it is on JavaScript and Python and higher level languages. And like I said, it requires a little more precision and better system design to, and that the bottleneck to really good distributed systems is, is design and not like the amount of number of codes you can generate, which is more true on the product side. You're just, you need to generate tons of CSS and JavaScript and try a lot of things and delete a lot of things and iterate and do A-B-test and all of that stuff.
Starting point is 00:52:34 So, like, volume of code is important there. I would say on the backend distributed systems, I don't think volumes of code is. So I'm reasoning in real time now, and I guess my answer would be, I don't think it's going to have anything more than, you know, marginal improvement on speed to the next model. All right. I guess that makes me rest a little easier then. by the way, just on a, you know, you speak with a lot of people in the AI industry. Of all the economic activity in the AI industry today, how much of it do you think is code? Just rough guess.
Starting point is 00:53:12 Someone actually made that slide that's been going around. I think it was something like 1.1 billion of ARRs and the AI coding and five coding space. Okay, so it's actually kind of small compared to like the total revenue. Yeah, so Anthropic has $4 billion. Right. Let's say, yeah, $4 billion ARR. Let's say they also have their own products, their own coding products. I don't know. Let's say 1.5 billion is off of that as AI coding. It's substantial, but it is not the entire thing. Okay. But then you have $10 billion of ARR on open AI site, and that's more consumer. Now, on the rush to artificial general intelligence, which we've talked a little bit about, do you think Silicon Valley is the one that, should sort of possess this or be the one that controls it. I mean, it's an interesting place.
Starting point is 00:54:05 There's a lot of kooky ideas here. And it seems like if this is possible, it's going to be something that's controlled by or owned by one or more of the labs here. Is that good? Assuming it all happen and assuming one company will reach their first and have some kind of advantage of monopoly over AGI, which I'm not entirely sure I agree with these assumptions. But if you want to make, if you want me to make these assumptions and then answer the question, I'd be happy to. But I should really clear that. Yeah, let's make those assumptions. I know there's a lot of things that need to happen in order to get there. Yeah, I might have some fundamental disagreement with this
Starting point is 00:54:45 assumption. Wait, let's talk through the disagreement. I don't think AGI has any point in time for one and I think there's going to be right now the distance between any lab is just an order
Starting point is 00:55:01 a few months and I think that really matters you know between O1 one preview and deep seek was like two or three months between I mean the biggest one was this
Starting point is 00:55:11 Kimi K2 one that we just talked about that that was like maybe nine months or something like that but it's still sub one year and so whomever reaches at AGI first, they're not going to go into intelligence explosion and just like suddenly,
Starting point is 00:55:25 you know, superintelligence gets born. People, you know, other labs will catch up really quickly and then, you know, there's going to be a lot of models. I don't think it's going to look that different from the ecosystem that we have today. And if you assume that AGI will actually have an impact on model development through research and speed of development, then everyone will get the benefit of that as well. And so actually you might get even more competition once you have AGI. So I don't think it's going to be a monolith.
Starting point is 00:55:56 Okay. But if it is? Okay. If it is, would I want Silicon Valley, I guess it's like a moral. Yeah, philosophical question. I wouldn't want any human being to, and we're all fallible. That's why markets work. That's how a human society evolved over time.
Starting point is 00:56:23 It is, you know, Darwinian evolution and free market capitalism. It's all based on competition. And the idea that, like, one system would be this model controlled by one human being. We've seen disasters and massive human suffering happen. when there's this top-down sort of leviathan type thing, whether it is in Soviet Russia with all the deaths that happened there or in China or whatever. And oftentimes, as I understand it, in the Soviet era,
Starting point is 00:57:12 they had this kooky idea about evolution, I think, what was called? Lushenko, Lushankuism or something like that? I'm not familiar, but I'd love to hear the explanation. Yeah, so basically they had, they thought that evolution is this bourgeois idea. You know, communism has this idea. It's like anything that's, you know, high class bourgeois is like wrong. And so they have this ideological view on how evolution works or should work that led them to do agriculture in the wrong way and led to famine and that sort of thing.
Starting point is 00:57:46 And so oftentimes they do kill people in cause mass suffering, mass poverty, even if they don't intend, even if like outside of the gulags and all the other oppressive, explicitly oppressive system, those systems are inefficient because they have these wrong ideas and there's no competitive pressure to have better ideas. And so that's fundamentally broken static system that doesn't improve like competitive systems. And I think if we have super intelligent monolith controlled by single company or single human being, it's bad. It's fundamentally really bad. I agree. All right. Last question for you. We're seeing a lot more AI love bots come out.
Starting point is 00:58:39 that a good thing or a bad thing that people are going to fall in love with AI more often? It's a bad thing, like, a priori a bad thing. Like, the reason humanity grew and flourished and all of that is because we have babies. And anything that, you know, takes away from that, especially given the fertility rate is so low right now, is, will potentially lead to really massive problems, especially since capitalism is based on large middle-class consumerism, like the current insinciation to have the economy work requires that, requires taxpayers to fund social security and like elder care and all of that. The welfare state is based on this large young population.
Starting point is 00:59:39 And when that starts to collapse, you're going to have, you know, massive instability in these, in these systems. So even if, you know, humanity doesn't go to extend, like Elon would say, although Elon is the first person to create a really interesting mass market companion, I think, right now. Interesting is a fun word for it. It looks like it's really compelling. I see people, you know, right now talking about it on X so much. It's got some work. Yeah. But it is, these type of things are goal.
Starting point is 01:00:09 going to definitely become real partners to people. People, when this technology has been bad or hardly workable, have gotten married to them. Right. Before LLMs. So it's going to happen again and in greater numbers. The question is, I wrote this, I used to do like more creative writing. I wrote this essay on the hyper-real. So I think it is like French post-modernist theorists like Bolgerd wrote about this concept of the hyper-reel.
Starting point is 01:00:52 And the idea is like we have reality like you and I are intracting right now. And then you have media created realities. And the reason sometimes it is hyper-real. it is more intense than reality itself and more enticing than reality itself. So, you know, even in real things, you know, for example, when you get a, when you eat like a, I don't know, a twinkie or something like that, like fatty, salty, sweety, kind of a snack, it is like, it is not like a piece of chicken or beef or whatever. It is, it is this hyper real thing.
Starting point is 01:01:31 It, like, hyper engages your senses, and it makes you addicted to it. And similarly, social media is hyper real in a sense that I can go there and get a lot of social interaction, tweet something, get hundreds of likes, and it's, like, much easier than going out in the wild and, like, finding a hundred people that could like me. Yeah. Right. And so we have these technologies that are, and the market around it that is bootstrap, to make us addicted because they're so much more enticing and low effort than the reality that we know and experience day to day. And I think that that is a huge danger for the existence and evolution and longevity of human civilization. And I, uh, and, uh, I,
Starting point is 01:02:31 And I think it is, you know, I talked about how good free markets are, how important, how competition is important. This is one thing that capitalism is so adversarial to humans at, right? And so I don't have a solution for it. I think in the past, the solution was religion. For example, in like Islam, you can't depict humans or animals in art. That's why in Islam, the art became more. geometric and if you go go visits like the mosques or whatever they have like all this geometry
Starting point is 01:03:08 or like calligraphy that's really interesting and I think part of the idea there is is is I think the hyper wheel like if the ultimate expression of a something so enticing is a virtual being like we're we're seeing right now and I'm not saying like you know Islam had like like the foresight or whatever. But I think it's, you know, religions used to have this built-in mechanism to protect against these predatory, sort of consumer products. And I wouldn't know how to solve it in the future, but perhaps it is potentially societal, maybe governmental. I'm always kind of skeptical of that or religious kind of protection. need something yeah so lord help us i'm judd great to see you thanks so much for coming on the show
Starting point is 01:04:06 my pleasure all right everybody thank you so much for listening and watching we'll be back on friday to break down the week's news until then we'll see you next time on big technology podcast Thank you.
