TBPN - Weekly Recap | Elon vs. Trump, Ukraine's Drone Attack, Cluely Update & OpenAI CRO

Episode Date: June 7, 2025

(00:00) - Intro and Overview
(02:57) - Ukraine’s drone attack on Russia
(03:21) - Soren Monroe-Anderson (Neros)
(09:02) - Connor Love (Lightspeed Ventures)
(13:45) - Erik Prince (Off L...eash)
(15:20) - Elon Musk Vs. Donald Trump
(21:44) - Delian Asparouhov (Varda)
(32:48) - Cluely Update from dropout Roy Lee
(46:52) - Kian Sadeghi (Nucleus)
(01:00:12) - AI Day on TBPN
(01:00:14) - Mark Chen (OpenAI)
(01:28:49) - Sholto Douglas (Anthropic)

TBPN.com is made possible by:
Ramp - https://ramp.com
Figma - https://figma.com
Vanta - https://vanta.com
Linear - https://linear.app
Eight Sleep - https://eightsleep.com/tbpn
Wander - https://wander.com/tbpn
Public - https://public.com
AdQuick - https://adquick.com
Bezel - https://getbezel.com
Numeral - https://www.numeralhq.com
Polymarket - https://polymarket.com
Attio - https://attio.com

Follow TBPN:
https://TBPN.com
https://x.com/tbpn
https://open.spotify.com/show/2L6WMqY3GUPCGBD0dX6p00?si=674252d53acf4231
https://podcasts.apple.com/us/podcast/technology-brothers/id1772360235
https://youtube.com/@technologybrotherspod?si=lpk53xTE9WBEcIjV

Transcript
Starting point is 00:00:00 You're watching TBPN. This week, what were the top stories? What were the most interesting things that we learned? Favorite interviews. The Ukraine drone attack, that was huge. Operation Spider-Web, a whole ton of drones were smuggled into Russia in shipping containers. They emerged and went out and attacked bombers. We had Soren Monroe-Anderson from Neros on the show to
Starting point is 00:00:30 to break that down for us. We also talked to Connor Love at Lightspeed, who does a lot of defense tech investing. And of course, a couple of weeks ago, we had Erik Prince on the show, the founder of Blackwater, and he had kind of predicted that the Ukrainian military was perhaps underrated and that we might be seeing something like this in the future. And so that was interesting to see play out. So we will take you through those interviews and recap some of them. Then obviously we had the absolute meltdown between President Donald Trump and Elon Musk
Starting point is 00:00:58 that unfolded on X and Truth Social. The two leaders... Dueling. Dueling social media. Exactly. And although it's a highly political story, there are big business implications. Totally. Talking about what's going to happen in space between NASA, Boeing, different launch providers.
Starting point is 00:01:18 Yeah, the implications for Tesla, SpaceX, Neuralink, even The Boring Company, right? There's a lot of stuff that needs... Many of Elon's businesses are heavily regulated. Yep. And the potential impacts are substantial. Yeah, then we also had some earlier stage founders on the show. Roy from Cluely came on and went pretty viral, put on a show. You're surrounded by journalists.
Starting point is 00:01:41 Hold your position. Cluely is a service to help you cheat on everything. We got to actually try the app. Someone was pushing us to try it and see how good the product is. But regardless of the product, he's also a phenomenal marketer and he came on and put on an absolute show. And he's printing, apparently. Yeah, he's doing great. He can't spend all the money that they're bringing in. And so we'll give you an update on Roy and see where that business is. And then Kian from Nucleus came on to launch
Starting point is 00:02:12 Nucleus Embryo, which he calls the first ever genetic optimization software that helps parents give their children the best possible start in life. Quite a lot of controversy there on the timeline this week. A lot of people hate it. A lot of people love it. Yeah. And we'll let you kind of decide for yourself. And then we had a whole bunch of AI experts on the show from Google, OpenAI, and Anthropic, got to the front leading edge of the debates around AI, AI LLN. Poor John yesterday.
Starting point is 00:02:42 You were fighting as long as you could. You didn't want to talk about drama. You got dragged into it. But we did get some great coverage from Mark Chen at OpenAI as well as Sholto over at Anthropic. Yeah, it was a lot of fun. The big news over the weekend was the Ukraine drone attack on Russia. They shipped shipping containers deep into Russia, at which point drones flew out of the containers and hit strategic targets.
Starting point is 00:03:10 We're going to have two guests on the show today. Soren Monroe-Anderson from Neros to talk about that and also Connor Love from Lightspeed to talk about that and also defense tech investing generally. Today we have Soren from Neros who builds drones and has been to the Ukraine and so we'll bring him into the studio and ask him how he's doing. How are you doing? There he is. Welcome. Great. How are you guys? We're good. We have a new soundboard, so expect some wild cards. Wild stuff. Could you give us a high level overview of the history of drone warfare in Ukraine? Because I understand it's been progressing super rapidly on both sides. And it'd be helpful to understand
Starting point is 00:03:49 kind of the different stages. Did they ever have like predator drones, like the global war on terror type of drone or do they jump straight to quadcopter and kind of like leapfrog the technology? So, you know, you've had this, this Russian aggression war in Ukraine since 2014. Obviously, the full-scale invasion was 22. But even during that period before the full-scale invasion, there was some usage of drones for surveillance and dropping explosives. These are primarily still like small drones like what you're seeing now. But this was not a proliferated technology.
Starting point is 00:04:23 Then when the full-scale invasion happened, within a few months, the Ukrainians started thinking about all these ways that they could use, you know, inexpensive drone technology to get an asymmetric advantage. And that is where FPV drones started becoming a really, really big deal. So they pioneered the, really this idea of, you know, putting an explosive on a racing drone and using that as a precision strike weapon. There were instances of this happening in other places, but they really scaled it and they've really refined it.
Starting point is 00:04:53 And then Russia was much slower to take it seriously, although now they tend to in some ways outproduce Ukraine and they have a much more direct line to China where most of these components are coming from. But since 2022 and FPV is just starting to get used, now it's reached an unbelievable scale. It's estimated Ukraine is going to produce 4.5 million FPV drones this year. And those are ranging from, you know, ones that are this big to 15-inch propellers, fiber optic controlled drones, many different types and sizes of warheads, different configurations. And I can talk more about the drones that were used in Operation Spider-Web as well, because those were really interesting.
Starting point is 00:05:36 But what we've seen is just this vast technology landscape where new clever ideas like fiber optic are going to be the hot thing for a few months. And then they sort of just become another tool in the tool belt. And it's just this constant arms race. Yeah. Yeah, talk about this attack was unique in a bunch of different ways, but is this something that had been, to your knowledge, or just, you know, more generally known to be something that had been attempted multiple times?
Starting point is 00:06:06 Or, you know, maybe like I'm curious to know, yeah, kind of the backstory on this type of attack, because it seems, you know, it's a massive difference to be using this technology way behind enemy lines versus using it, you know, at the front line. Yeah. So primarily FPV drones are used on the front line, say the kind of 30 kilometer band across the zero line. What was so unique here is that it was FPV drones, short-range drones being used 4,000 kilometers inside of Russia. It was this unbelievable application where, you know, you've seen the Ukraine using long-range
Starting point is 00:06:46 one-way attack drones that are going 1,500 kilometers to strike targets deep inside of Russia. But here, these were small drones actually driven in on trucks, basically in the tops of shipping containers. And I don't know of any operations that were similar to this beforehand. I think it was not something they wanted to give away. And the drones were actually operating on cellular. They were not operating on local, like the normal,
Starting point is 00:07:17 the normal low latency local radios you use for FPVs typically. And so I think, you know, this is going to be something that a lot of people are going to look at and see if you have drones that are operating on cellular, you can't really tell them apart from cell phones. That's really hard to defend against, really hard to detect. But now it's going to be part of air base defense, thinking about drones that are operating on cellular being piloted from basically anywhere in the world. Talk about the Russian response, the immediate response to this incident. From the footage that I saw, and I think most people saw that tracked it, it seemed incredibly challenging to respond to it quickly, right? By the time you could sort of organize a response,
Starting point is 00:07:59 a lot of the core damage had been done. The question, I think, that every country is asking themselves now is how do you defend against this type of attack, whether you're at war like Ukraine and Russia are or you're just, you know, thinking, you know, long-term. Yeah, this clearly poses a massive threat to critical infrastructure. I mean, being blatant, the U.S. does not have any defenses in place that would stop this from happening. We already know. There's already news stories about drones that are flying over our Air Force bases and we can't do anything about it. And I think the only approach here has to be a multi-layered system where you're looking at all the different types of electronic warfare and also considering things like
Starting point is 00:08:48 satellite communications and cellular communications where you're basically able to turn those off at the flip of a switch, which is a huge inconvenience and a huge thing to build into the infrastructure, but clearly that's going to be required. Welcome to the stream, Connor. How are you doing? Good. Good. I'm doing all right. Be back, guys. Yeah. He's got a suit this time. Oh, looking great. I love to start. I won't say I drink. I'm doing it. I'm good. I'm going to do. I'm going to do. I don't up just for you, but, you know, I would have taken this suit off far before this, you know, if I wasn't coming on. Fantastic.
Starting point is 00:09:18 Good. Good. Thanks so much for jumping on. Have you been tracking the Ukraine story closely? Any insights there? Anything in the portfolio that's at all relevant in the defense tech world? Do you expect a response from the U.S. government or guidance or change to any strategies? Really, any takes on that?
Starting point is 00:09:37 I mean, first, shit, what a time to be alive. I mean, you know, I'm sure your Twitter feed and your group chats were blown up, you know, pun intended, over the weekend. Yeah. I mean, it's pretty crazy. I mean, let's be honest, like, first, I'm not shocked that the Ukrainians did this.
Starting point is 00:09:54 I mean, the execution seemed to be flawless from what we can pull from open source intel. I do think, though, I mean, again, it's not a surprise. The Ukrainians have been mastering drone warfare for the last handful of years, and, you know, you want to call it, they called it Spiderweb. Like, this was their, this was their Trojan horse. This was their, you know, Israel,
Starting point is 00:10:12 Israeli beeper. And the outcome is pretty impressive, to be honest. I mean, what from the outside looking in, like the Russians woke up over the weekend and they thought they were getting their $4 team orders and what did they get? They got a thousand, you know, FPV drones, you know, blowing them to smithereens. So it's pretty impressive. I mean, my takeaway from this is really twofold. Please. The first is like there's never been a clearer signal of where warfare is going. And to be clear, what, you know, what I view this from, you know, both the entrepreneurs in my portfolio, but also from my perspective, I mean, the world is about, you know, cheap, attritable, a lot of times autonomous systems. And that's, you know, playing out in warfare. That's playing out in other
Starting point is 00:10:55 areas of life. And then the second thing is, you know, candidly, it's like, it's really hard to defend yourself at the pace at which things are changing. And again, like, I know we do some things here in the United States and are trying to be on the front end, a lot of innovation. But when this happens, I think this almost just resets everyone again and says, all right, how do we respond to it? And I think it's, to your point, it's not a, it's not a direct U.S. response. It's more of, hey, what do we need to buy? What do we need to develop, you know, for our own fight in, you know, in some way, shape, or form? Yeah, what do you think, obviously, you're a venture capitalist, not a geopolitical strategist, but what's what's the right
Starting point is 00:11:33 Russian response to this? Is it, hey, we suddenly need to be wary of having cell coverage anywhere near strategic military assets. I mean, it seems like Ukraine and Ukraine's perfect world, they could run this style of attack a bunch and copy and paste and hit other targets, but it feels like something that was dependent on cellular technology that that's something that the Russians can revoke fairly quickly. Sure, it'll be inconvenient, but I'm curious if you have a take.
Starting point is 00:12:05 Yeah, I mean, you know, to be honest, when I think about how do you defend against this? I think there is, you know, I wouldn't call this the easy answer of just, you know, turning off the cellular network. I actually think the only way to do it kind of practically is in layers or in a multitude of different ways because, you know, yeah, the reality is if you looked at how the Ukrainians carried out this attack, they did so on the local, you know, Russian cell network, which, again, I don't think any Russian kind of defense in any of these bases was ever thinking that they would have to turn off their own cell network. And then there's just the practicality of how you do it.
Starting point is 00:12:43 I mean, I think there was what, four or five different attacks that hit all at the same time. What do you do? You turn off the network for tens of thousands, hundreds of thousands of people. And oh, by the way, this is like a dirty little secret that nobody talks about. You know, yes, you have your military systems that are protected and all that, but a lot of coordination is happening through WhatsApp, a lot of coordination on, and so all of a sudden you turn off the cell networks, you're actually inhibiting your own defense, your own response, the first responder, you know, the, you know, getting your own people out of there. So I think it's a bit more complex
Starting point is 00:13:14 than that. And then the last thing I'd say is just like, even if you do this in layers, you know, you need to be resilient in a way, but you're not going to stop everything. I mean, this was just a brilliant master class of, you know, again, if maybe there was a plan, we didn't know this, but maybe there's a plan for, you know, a hundred bases. And we only hit five of them. And so if you think about just the broad, you know, geopolitical, you know, geographic coverage you have to have, I think to be 100% certain on anything, it's just you, it's impossible. You can't do it. The most capable military in Europe right now is the Ukrainian military. The lessons learned that they have are very significant. The drone tech is far and away the best, their ability to
Starting point is 00:14:00 fight against and to even conduct electronic warfare, even close air support in this environment is leaps and bounds ahead of even what the U.S. military's is. So that's the military to learn from. Ukraine does have a corruption problem. I hope sincerely that Trump is able to get a ceasefire in place and to stop this killing because it's absolutely pointless. It's just Slavs killing Slavs at this point.
Starting point is 00:14:29 And nobody's going to advance. And you have a blend of old and new. If you look at the pictures of the front there now, it's almost indistinguishable from the Battle of the Somme, right? Artillery duels, static lines, bunkers, all the rest. Now the problem is somebody can fly an FPV into your bunker on the other side, but between tens of millions of landmines, which make armored breakthrough very difficult,
Starting point is 00:14:58 it slows down any attack so that the FPVs and artillery can get to it. You're not going to see any kind of blitzkrieg, Heinz Guderian maneuver warfare there until some significantly different weapon systems come along. So look, Europe needs to get serious about it. They're far from it at this point. Elon posted four minutes ago. The Trump tariffs will cause a recession in the second half of this year. Wow. Somebody else is saying,
Starting point is 00:15:30 can I finally say that Trump's tariffs are super stupid? Who is that? Somebody else is posting. Mad's posting is saying it's Xi Jinping. He says, bro, you seeing this? And it's Putin on the other end.
Starting point is 00:15:43 He's just looking at it. Hold up, got a line. And it's, we'll start pulling some of these up. Ridiculous. What else is going on here? This is for the present versus Elon.
Starting point is 00:15:57 Naval says, Elon's stance is principled. Trump's stance is practical. Tech needs Republicans for the present. Republicans need tech for the future. Drop the tax cuts. Cut some pork. Get the bill through.
Starting point is 00:16:11 This is so crazy. Antonio Garcia says, remember, there's FU money and then there's F the world money. Will Stancil says, imagine being the ICE agent, suiting up for your biggest mission of all time right now.
Starting point is 00:16:28 People are saying that Trump's going to deport Elon. Elon, back to South Africa. Will Depue says, time to drop the really big bomb. Growing Daniel is in the Epstein file. That is going to turn into a copypasta. That is a real reason of coffee. Oh, no.
Starting point is 00:16:48 Delian. We had a question from a friend of the show. They said, the real question is, if Tesla is down 14%, how would SpaceX and OpenAI be trading if they were public? The real thing here is it's bad for everyone, right? It feels bad for everyone, right? Coin is down. Nobody's really winning here. China's up. Yeah. Oh, really? shah magua. I mean, I'm just saying, like, at a high level. Yeah, yeah, yeah. You know, China is the big beneficiary here. Sarah Grose says, if anyone has some bad news to bury,
Starting point is 00:17:24 might I recommend right now. Yes, yes, yes. If you have, if you, uh... Well, what's the canonical bad startup news? Like, oh, yeah, you missed earnings or something. Drop it now. Inverse Cramer says Bill Ackman is currently writing the longest post in the history of this app. Okay. And we have a video from Trump here, if we want. I can throw it in the tab and we can share it on the stream and react to it live.
Starting point is 00:17:55 Lex Fridman says to Elon, that escalated quickly. Triple your security. Be safe out there, brother. Your work, SpaceX, Tesla, xAI, Neuralink, is important for the world. We need to get Elon on the show today. If somebody's listening and can make that happen, I would love to hear. Max Meyer says, so I got this wrong. I didn't say it never happened, but I thought it wouldn't. I'm floored at the way this has happened. Yeah, he didn't think they would have a big breakup. Many people didn't think they would have a big breakup. Even just earlier this week, it seemed like they might just have a somewhat peaceful exit. Trump just posted a little bit ago. I don't mind Elon turning against me, but he should have done so
Starting point is 00:18:35 months ago. This is one of the greatest bills ever presented to Congress. It's a record cut in expenses, $1.6 trillion, and the biggest tax cut ever given. If this bill doesn't pass, there will be a 68% tax increase and things far worse than that. I didn't create this mess. I'm just here to fix it. Anyways, lots going on. Let's go to this Trump video. I want to see what he's that I've seen, and I'm sure you've seen regarding Elon Musk and your big, beautiful bill. What's your reaction to that? Do you think it in any way hurts passage in the Senate, which of course is what you're seeking? Well, look, you know, I've always liked Elon, and I was very surprised.
Starting point is 00:19:15 You saw the words he had for me, the words of, and he hasn't said anything about me that's bad. I'd rather have him criticize me than the bill, because the bill is incredible. Look, Elon and I had a great relationship. I don't know if we will anymore. I was surprised because you were here. Everybody in this room practically was here as we had a wonderful send-off. He said wonderful things about me.
Starting point is 00:19:38 You couldn't have nicer. He said the best thing. He's worn the hat. Trump was right about everything. And I am right about the great big, beautiful bill. But I'm very disappointed because Elon knew the inner workings of this bill better than almost anybody sitting here, better than you people.
Starting point is 00:19:56 He knew everything about it. He had no problem. with it, all of a sudden he had a problem, and he only developed the problem when he found out that we're going to have to cut the EV mandate, because that's billions and billions of dollars. And it really is unfair. We want to have cars of all types.
Starting point is 00:20:10 Electric, we want to have electric, but we want to have gasoline, combustion, we want to have different. We want to have hybrids. We want to be able to sell everything. He hasn't said bad about me personally, but I'm sure that'll be next. But I'm very disappointed in Elon, and I've helped Elon a lot. I just want to clarify. Did he raise any of these concerns with you privately before he raised
Starting point is 00:20:34 them publicly? And this is the guy you put in charge of cutting spending. Should people not take him seriously about spending? And aren't you saying this is all sour grapes? No, he worked hard and he did a good job. And I'll be honest, I think he misses the place. I think he got out there and all of a sudden he wasn't in this beautiful Oval Office and he was and he's got nice offices too. But there's something about this when I was telling the chance. Folks. This is where it is. People come in here.
Starting point is 00:20:58 Breaking news, Delian Asparouhov is joining us in the temple. I love it. For some live reactions. Come on in. Surprise guest. I can't even spell surprise guest. I'm so excited about this. Surprise guest.
Starting point is 00:21:14 Yeah. In other news, ElevenLabs dropped a new product. It's another news. Two million dollars seed round. Stop it. Stop it. We love ElevenLabs. No, they'll keep grinding. Just launch again tomorrow. You're going to have to launch again. Start shooting a new vibe reel.
Starting point is 00:21:36 Start shooting a new, writing a new blog post because no one's going. Lulu says yes, delay the launch on TBPN. So basically right now, I can just pull up and just refresh. I'm going to just be refreshing Truth Social. How are you doing? So, okay, Jordy's on Truth Social. I'll be on X. Give us your reaction, Delian. What's going on?
Starting point is 00:21:57 At some point, I was like, I'm just sort of scrolling X and I like tuned into you guys like an hour ago. And I was like, they're talking about some AI thing. I was like, at some point they switched to like breaking news and then I was watching and I was like, okay. John, John resisted. I fought it for like a half an hour.
Starting point is 00:22:13 But we couldn't do it. But yeah, give us your quick reaction. I mean, always, you know, sort of give it from the, you know, sort of space angle. You know, it's amazing that, you know, how much the world has shifted since, you know, Friday of last week, whereas, you know, sort of presumed the Jared Isaacman was going to be the, you know, sort of NASA admin to today. It was released that the Senate reconciliation package re-added budget back into NASA largely for the SLS program, which was basically the program that, you know, sort of Jared and Elon were, you know, sort of largely advocating to, you know, sort of completely shut down. Yeah.
Starting point is 00:22:48 So, yeah, the, it is already showing, like, you know, the sort of counterreaction, you know, is already showing up, you know, in policy. Sorry, SLS program, is that space shuttle or no? Sorry, that's the SLS launch rocket. It is based off of the old space shuttle hardware, but it is basically the internal, you know, sort of NASA run competitor, effectively to, like, a B or Starship heavy-lift launch rocket. Yeah. And so, you know, because it was, you know, sort of generally behind budget, behind schedule, and there are so many commercial heavy lift rockets coming online. Sure.
Starting point is 00:23:21 The default was canceled. That is largely, you know, sort of a Boeing-based program. And so, you know, if you look at, you know, you know, three months ago, you know, when they were announcing that F-47 program, you know, Elon walks into the secretary of the Air Force's office. Obviously, he'd been, you sort of ranting against, you know, sort of a manned fighter jets, and believing that that shouldn't be what, you know, be what the department is prioritizing. 30 minutes after that meeting was when they announced the F-47 program. And so now you're seeing basically like the equivalent in space where, you know, you know, that was obviously awarded to Boeing.
Starting point is 00:23:49 Boeing was the, it is the largest prime behind Eidosod SLS. You know, Boeing basically, you know, is going to be the biggest winner of, you know, NASA re-funding, you know, SLS and Jared Isaacman not being NASA administrator. So tying this back to the timeline, Elon posted less than 30 minutes ago, in light of the president's statement about cancellation of my government contracts, SpaceX will begin decommissioning its Dragon spacecraft immediately. Break that down. I mean, that just means that we no longer have a vehicle that can go to the International Space Station. We no longer have a vehicle that can bring astronauts up and down. We also don't have a vehicle that can deorbit the International Space Station safely, right?
Starting point is 00:24:28 The Dragon was expected to be able to do that. So what that means is, you know, if you guys remember all the memes about stranded, you know, from last year around Boeing Starliner, it now means that the Space Station, you know, itself is basically, you know, sort of stranded. And that's like, you know, one of the government contracts, obviously, that, you know, SpaceX is involved in. Elon, I've heard generally, like, just wants to shift all things to Starship anyways.
Starting point is 00:24:48 And so in some ways, it was probably kind of looking for an excuse to shut down Dragon and refocus energies. There's also part of where it's like, look, he is like kind of independent in the space world in that, you know, Starlink's total top line revenue is gonna be passing the NASA budget in the next year or two. And so in terms of like size of, you know, state actor that can influence space, you know,
Starting point is 00:25:08 his own company is basically about to become, you know, as large of an actor as like the entire United States. So I don't think there's going to be like a de-escalation here. Like, you know, my, you know, estimation is like on both sides. It's going to continue to escalate. You know, if we thought that we lived in dynamic times, you know, when Trump got into office, it's going to be even more dynamic when it's like the dynamism will continue until morale improves.
Starting point is 00:25:33 Elon, the center, AOC, the progressive populist in Trump, the, you know, sort of conservative populist. and, oh, man, it's a good time. I mean, I just have so many questions, right? How does this impact Golden Dome? Yeah, what's Boeing stock doing? Is, will Golden Dome even be a viable project without SpaceX? It's, it's... I think there's just going to be more resistance probably to working with, you know,
Starting point is 00:26:00 sort of upstarts because they would be ones that would probably be more likely to collaborate, you know, sort of with, you know, a SpaceX. And so, yeah, I think there's... It feels like, it feels like Boeing would be a logical beneficiary of this turmoil, and yet they're down today. They haven't really popped. Oh, really? Yeah. I mean, I'm not obviously, you know, I'm wanting to give like, you know, I'm working through it myself.
Starting point is 00:26:23 And it's surprising. It just feels like it's just let a drop in Boeing to pop basically. Yeah, yeah. That would be the expectation, but there must be something. Because there, there, it feels like this is purely interpersonal between Elon and Trump and not, it's not like, oh, Boeing was secretly behind the scenes the whole time lobbying even more effectively. It doesn't, oh, you got the, well, where's the tinfoil hat? It's over there. Maybe we need a tinfoil hat saying, who knows? But, yeah, I mean, when you're in Boeing world, it's like, hey, we're only down one percent. Let's go.
Starting point is 00:26:53 The coup of the century. My question is, has, has there ever been a crash out of this magnitude ever? In history. Well, you know, when Elon and Trump became friends. Honestly, world scale. I actually, probably world history equivalent. I feel like there's something in like. Yeah, but we maybe didn't have the internet in the United States where, you know, there was some crashing out. Crashing out used to mean calling up the New York Times and just ranting. Yeah. Now you can just live posts like all your reactions and it's just all real time.
Starting point is 00:27:23 This is like crash outs are actually intensified. You actually want to be long crash outs. Yeah, it's definitely. Yeah, it's definitely. And if you have their own social media platform that they own. So, you know, you got to be on both X and and truth social to like stay on top of things. Yeah. Yeah.
Starting point is 00:27:35 I actually did like a deep research report a while back on like has the richest man in America ever been close with the US president going back to like you know was Rockefeller particularly close and and because the narrative was like oh this is like so unprecedented and in fact it is unprecedented in fact really yeah yeah yeah like like Rockefeller was me too me too that's what I was going for was like no I imagine this is always it is always close but no I think because the president has become more powerful globally, your point about you know, mayor of America, dictator of the world. Like, it becomes increasingly valuable for the richest man to have a close alliance. And so it's become more. I don't know exactly how accurate that research was. It's totally possible that, like, behind the scenes, Rockefeller was really close to the president of the time.
Starting point is 00:28:22 And we just didn't write about it in the history books. But there certainly aren't very many anecdotes about the richest man in America going on. Yeah. So Paul, Paul had a press. Yeah, for AP U.S. History, 2050, you know, APCD. Yeah. Yeah. Yeah.
Starting point is 00:28:36 What was the tweet where Elon Musk called the president at the time a potential pedophile? Was it A, about Epstein Island, B, about a cave in the Philippines, C. What a mess. No, so Pavel had a good post. He was quoting the big bomb from Elon. He said, hypothetical question about the USA's power structure. Is the man with the most access to capital more or less powerful than the political head honcho? Purely hypothetical.
Starting point is 00:29:03 It's a good question too. I mean, I think both like archetypes have grown both in absolute power but also in relative power to the rest of the globe, basically since the Gilded era, right? If you think about like the president of the United States in 1925, I'd say pretty darn powerful, but like there was clear like, you know, it was a, you know, sort of multipolar, you know, sort of world. Argentina was pretty darn rich at the time. Obviously, Europe was still, you know, sort of recovering from World War I, but UK was generally, you know, sort of doing well. Like, it was not, you know, it was clear there was a, you know, sort of huge, you know, outweighed effect. And then if you look at probably the, you know, sort of biggest, you know, industries at the time, I don't think you could claim that even, like, Standard Oil at its peak, I'd have to go look at the exact numbers.
Starting point is 00:29:46 But that, like, it had the size of budgets relative to, like, you know, the, like, U.S. government in terms of, you know, sort of budgets, right? Versus I feel like now for the first time, you both have, you know, sort of U.S. president extremely, extremely powerful. Yeah. And then you have, like, you know, sort of Mag 7, effectively, like, the size of, you know, sort of, you know, huge states. Like the era, you know, fucking around state governments.
Starting point is 00:30:07 And then also just more bureaucracy, more red tape. So like, I, when I think about the 1920s, like, robber barons, it's like,
Starting point is 00:30:13 it is the, you can just do things era. And so you want to build a railroad, like, yeah, you might need to get like one rubber stamp, but it's not going to be 10 years and tons of lobbying and all this different stuff.
Starting point is 00:30:23 So you can kind of just go, uh, you can just go, you know, it's bad when Kanye is saying, bros, please no, we love you both so much.
Starting point is 00:30:31 It's just like, the voice, the voice, the voice, The voice of reason is Kanye West. Yes, yes. Thank you. You didn't bring them together and, you know, form a peace treaty.
Starting point is 00:30:39 Nikita Bier just added his pronouns back to his bio. Let's go. You've got a rubber band. Elon's got a rubber band all the way back to, you know, sort of extreme wokeism, straight back to, you know, sort of super climate change. Wow. Somebody's sharing, re-sharing the picture of the Cybertruck blown up in front of the Trump Tower, I guess.
Starting point is 00:30:58 And it's just like this. It was foretold. But it was a question of like when and what magnitude, not if. Always bad if Vladimir Putin is offering to negotiate between President Trump and Elon. I think a lot of the world is waiting for Roy Lee's take, clearly, and a Cluely army. They want that. People have been asking him to get involved with geopolitics. Wow.
Starting point is 00:31:32 I love that Sheel Mohnot put up a, you know, sort of meme about Narendra, like, Prime Minister of India. You know, he basically copied and pasted the Trump Truth Social post about negotiating peace between India and
Starting point is 00:31:44 Pakistan when it wasn't like actually fully negotiated. Yeah. And, you know, posting about, you know, negotiating a ceasefire between Elon and Trump. Funny thing is like Truth Social, you can just read all of Trump's posts without creating an account. It truly shows that, like, I would think that you would have to make an account to read them all, but they just, that it's not gated at all.
Starting point is 00:32:05 It's his ball. This could be the biggest, but, you know, they clearly, I don't think they care about monetization. Bitcoin is actually falling alongside. Falling. Falling. Wow. Bitcoin falling, Boeing falling, Tesla falling. Who's the biggest winner of the day?
Starting point is 00:32:20 I think it's China. China. China. Yeah. China, China. Shaun Maguire. Wow. Bitcoin really sold off. It's down 3% today at 101K. So still up, but, you know, rough.
Starting point is 00:32:32 Winnie the Pooh, just dipping his, you know, hands in that pot of honey, just snacking away, watching from the sidelines. Yeah. Let's see. Chinese stocks. U.S. stocks? Chinese, I can't find anything. Okay. That's probably my commentary on the day, boys. Anyway, this is great. It was fantastic. Thanks for jumping on. Thanks for hopping on so quick. Cheers. Next up, we have Roy from Cluely coming back for an update. He's hired 50 interns, I think, or something close to it. He said they're bringing every intern on.
Starting point is 00:33:00 They're bringing every intern on. We got every intern coming in. Well, welcome to the studio, Roy. How are you doing? Oh, boom! Let's go. There they are. I think we're overpowering you.
Starting point is 00:33:14 Can you hear us? Yeah, yeah, we can hear you. Yeah, make sure we're zoomed out all the way so we can see everybody. We got a small army here. This is incredible. How big is the team? Kick us off. How many you got at this point? Team is 11 full-time plus the interns. How many interns you got so far? Interns, bro, we're closing in on 50, brothers. That's amazing. Congratulations. What are they all
Starting point is 00:33:40 doing? How do you manage everything? Is it purely social media? Is that what you want them to focus on? Growth? Yeah, yeah, growth marketing. Like, the only goal of the company is get one billion eyeballs onto Cluely. So you have unrestricted creative freedom and permission to do anything and everything. Just make the company go viral. Every single person you see behind you has over 100,000 followers on some social media platform. Wow. Wow. Wow. That's remarkable. Me too now. Yeah. There we go. There you go. Probably popped. What's working? What platforms have actually been driving the most growth? I mean, I'm sure you've run a lot of tests. What have you learned that you can share? Bro. Ben, take it away, bro.
Starting point is 00:34:24 Let's hear it. UGC has been really good. We just hit 10 million views today. 10 million views. Eight days. Wow. There we go. Hoping to get 100 million views in the next month.
Starting point is 00:34:34 What platforms specifically are the most fertile ground for targeting your specific customer? Because you can imagine that there's a lot of folks who are AI curious on X, but then there's much broader, more viral audience, more general audience on platforms like TikTok, YouTube, Instagram. What's working and what is the next platform that you're going to be focused on? Yeah, well, we're trying to go viral on every platform, regardless. But the main thing right now is Instagram Reels. Oh, Instagram Reels. Interesting. And what is the main value prop that you're hitting people with? Is it still the cheat on test thing? Or have you evolved at all? What? Still? Like interviews, yeah. Interviews. Okay. And has there been,
Starting point is 00:35:15 this was controversial when you launched it. Is it still controversial in the comments? Are you getting flamed? Has anyone big dunked on you and has that driven virality? Is that actually a net positive? Instagram is not like Twitter. Like you could post the craziest shit on Instagram and they still will not think it's controversial. Really? So to make it controversial, like we have to engagement bait some other way. Like a cheating tool is controversial on Twitter, but on Instagram you could have like a white guy say the N word 10 times and it's still not controversial. Like you need crazy shit on Instagram. That's what we crack. Every single person here has like very great viral sense. And watch the reels that do go viral. You see there's like ways that we've
Starting point is 00:35:54 engagement baited the videos and this is what we'll keep doing to, uh, probably a billion views a month is what... How long does it take to figure out if an intern is cracked? Is it like an hour or two hours? How much time do you need? For me personally, probably like 10 minutes. But for anybody watching, it probably would take like one or two weeks. There we go. There we go. How do you guys think about product marketing? Obviously you're just going viral everywhere, getting all this attention. How do you make sure that it, uh... They're shaking his head on the way. It doesn't think about, it's not about the product.
Starting point is 00:36:26 It's about the attention. Attention is all right. You guys make anything go viral. Yeah. Yeah, but how do you? The side of the street, you know, you make some UGC videos, make some Twitter posts, you know, you can sell anything. You know, in 2025, product doesn't matter.
Starting point is 00:36:41 You know, I could jack off off the side of a building and sell some videos of it for 20 bucks each. Make $2 trillion dollars. It's crazy. Two trillion. That's intense. How do you guys think about burn? Is it on your mind at all? I don't know if you saw the last tweet,
Starting point is 00:36:56 but as of literally like two days ago, we're still cash flow positive. We're still fucking profitable. We're still profitable. Let's give it up for the profitability. Let's hear it. It sounds.
Starting point is 00:37:06 So you're charging for the product and people are paying. Are they at all satisfied or do they feel like they got scammed? Bro, like the product works. You're either using this as a consumer and it's working because, like, you're passing your interviews, or if it doesn't work, you're not going to complain to me because I'm going to go right to your employer and tell them, yo, guess who just complained about using the product? I'm like, I'll get you blacklisted if you complain.
Starting point is 00:37:25 Really? How are you thinking about, how are you guys thinking about product evolutions? What do you want to add to the product? Obviously, you want to help people cheat on everything. Where are you going to help people cheat next? We don't care about, like, the product is going to be led by the virality of the content. We have video ideas right now that we're going to try to push for different use cases. We're going to see which ones go consistently
Starting point is 00:37:49 the most viral. If you can make something go more viral, then like, you can just build the technology after you have all the attention. So we'll figure out the exact use cases and exact niches we're going to quintuple down on once these guys get to work. What formats on Instagram Reels are like the most modern in terms of consistently viral? Like you mentioned like man on the street interviews, what do you do for a living? That's always been a fertile ground. What about, uh, I see a lot of those like mobile game ads that look like, you know, you're fighting down some sort of bridge and then you go into the game, it's actually just a match three. What are the different formats that you like to pull from?
Starting point is 00:38:28 Every week, there's two new ones. And at any point, there's probably 10 to 20 viral trends that are happening. And these cycle so quick, you need to keep your finger on the pulse. These things will, like, expire immediately. You need to be on the ball. And like, if I told you right now, by the time people watch this on YouTube, like, it would all have expired. Well, we're live.
Starting point is 00:38:46 So, so give us, give us the latest and greatest. like what's going viral today? Well, right now we got 10 million views using a Snapchat format. Okay. Viral for like the last three years, to be honest. Okay. And I think that like we just have to get people who continuously scroll TikTok like six hours a day. Yeah.
Starting point is 00:39:07 But what's the actual format that you use? Like describe the video. What is the hook? Like break it down for me like you're explaining the, like, the art behind the viral form. There's a caption. It starts with a face. Usually a handsome dude or a pretty girl. They're saying, damn, the interviewer started with the hard questions. I should have been a CS major, not a business major. That's engagement bait because people are saying like, bro, like CS is way harder than business. Then it turns around, the interviewer asks like, hey, how are you doing? Why should we hire you?
Starting point is 00:39:37 And then this guy uses Cluely to generate a response, but he can't fucking read the response. So he reads it hell outstically like, oh, I revel in detail. And then that is like another conversation point. Like people are cooking on the guy because he can't read properly. A guy is doing a really dumb interview using food. That's great. How are you guys using AI generated content internally? I know a lot of these videos that you guys are creating are just typical social media, vertical video. Do you have an intern that's just generating basically copy and pasting making a bunch of other.
Starting point is 00:40:09 B-roll? Veo 3 or any of these tools relevant? Anything clicking? Not yet. I think there's still like a 10% left before they cross the uncanny valley. And the biggest thing is that people need to think your video is real. That is the difference between 100K views and 10 million views, if people think it is real. Absolutely AI CEO is bearish on AI. What about?
Starting point is 00:40:31 Google needs like 10 more Chinese researchers to like figure it out. And once they push out the latest update, then Veo 3 will be there. But right now we need real people. Yeah, yeah, yeah. I mean, what about just using AI as like stock footage replacement, not as the lead-in for the video, not the entire video, but just like sprinkled in to illustrate a point, you know, an establishing shot of like a building, a helicopter pulling into a building. Like that historically has been kind of something that you would reach to, you know, Adobe stock video for. Veo 3 feels like it's there. But are you not drawing on that at all yet? If there's a viral format when we need it, maybe we'll use it.
Starting point is 00:41:13 But right now, like, it's, it's really brain dead to go viral on Instagram. Yeah. Formats are in our hard. You don't need a helicopter. You need a guy, a camera, a really shitty camera. I need a computer. I mean, what about, like, those, those kind of, like, AI mashups, like Harry Potter Balenciaga or the, uh, the kangaroo with the plane ticket getting on the plane.
Starting point is 00:41:32 Like, like, AI content can go viral when it's really, uh, when it's like inspired almost by a human. It's not entirely AI generated, but it's using the tools effectively to create something that's like still catchy. Do you think you'll be using any of that anytime soon? Probably very soon. We're scaling up. Like what you see right now is probably about less than 1% of what the size will be by the end of this. Like we are profitable. We're not trying to be profitable.
Starting point is 00:41:57 We just keep making so much money. We can't help it. So we're really scaling this shit up. I'm not even trolling you, 1,000 creators are going to be shipping out content. We're doing a complete internet takeover. Okay, so why in-house? Why do they even have to be employees? Couldn't you turn this into like a multi-level marketing scheme or something, a pyramid scheme?
Starting point is 00:42:16 We're going to do this. Oh, that's what you're going to do. Okay. MLM. MLM. MLM. Are you guys worried that you could be infiltrated by journalists? I'm sure they're circling the house right now.
Starting point is 00:42:28 The hit pieces are going to come. We're doing a softball interview right now. I mean, the person that's brave enough to try to do a hit piece on the Cluely Army is. I bet they're dying too. Look, more eyeballs is better. There's no companies that ever died from a founder being too controversial. You got Deel fucking infiltrating with genuine spies and they're still doing fine, bro. Workers 17 guys. They're still kicking. Like, no company ever dies from being too controversial. You die because you don't make enough fucking money. Yeah, yeah, yeah, yeah. Speaking of making money, what's the pricing model right now? Are you doing anything on price discrimination? Is there a super high tier if you get a whale? What does a Cluely whale look like? Can I spend $2,000 a month on this service? Yeah, you should add a tipping feature too. Yeah. People should be able to tip you guys if they have a good experience. You should add, uh, get the job. Really financialization pay as you go, high interest rate
Starting point is 00:43:21 loans. Just really push it. Make it sports gambling in there maybe. Just throw it all in. Yeah. I mean, it's $20 a month or a hundred dollars a year. And our, our top line revenue is really being driven up by enterprise. You're going to have to talk to sales team to get a custom pull. But, you know, like, there's a lot. Are you serious? What, whatever? Is that more on the sales side? What, who are the enterprise? So you sell the STRs? You guys laugh because you think I can't sell enterprise because I'm no, I believe it. I trust like these 20 these Fortune 500 CEOs like these are like 35 year old dudes who sit there school people who are laughing at my post. Yeah. Yeah. Yeah. Yeah. It seems it seems it seems to see it. It makes sense. No, I believe it. But but I mean you're not going even higher tier like what
Starting point is 00:44:04 with the $2,000 a month, Culelevisioned? For a consumer, there's a lot more we can do with more compute. But right now, we're like, to be honest, I didn't expect to grow this fast. The eng team is quite small. I'm going to spend a lot of time trying to hire more competent engineers. We have a lot of backlog tasks that we need to fill out, especially for this last contract that we signed. So we're full-time focusing on the one big guy that we got right now.
Starting point is 00:44:25 And after that, then we'll try and scale this up. But right now, we're focused on the one big client that we signed. Yeah. Talk about your compensation strategy that people want to know. You said you can raise infinite capital and you're so confident, I believe you. But I'm curious to get some more insight there. Bro, I feel like it's so retarded to be a company. Sorry, am I allowed to say that?
Starting point is 00:44:49 No, you're not allowed. No, this is a family-friendly show. It's very stupid to be a company and try to race to the bottom to see how little you can pay your employees. Bro, if I'm making hella money, we're all making hella money. Like, I'm trying to pay them more to see if, man, like, maybe tomorrow we'll start being, like, cash flow negative. But I just can make a money, bro.
Starting point is 00:45:09 Like, I would like to pay these guys what they're worth. And the output is fucking insane. We did 10 million UGC views in what, like, eight days. Like, like, you don't see this sort of traction in any company and you don't see killers like this in any company, unless you're paying these motherfuckers, like what they're worth, bro. Like, I don't know what? Maxed out contracts. Maxed out contracts.
Starting point is 00:45:27 Exactly. Yeah. What about devices? I mean, it seemed like this would be a natural fit for some sort of AI wearable or other platform. Is there an app coming? Or are you interested in what's happening with Jony Ive and OpenAI? What's your take on the device world? We're very interested in the hardware space.
Starting point is 00:45:45 We've got like a million things cooking on hardware. We've got people in the garage right now working on. You don't even know about, bro. Like we're bringing manufacturing back to America and it all starts at the Cluely garage. Let's go. I'd love to see it. Nobody, you know, they doubted, but you guys are reindustrializing America. You guys really are the part-tack up there. We're working on brain chips down there. Yeah. Brain chips. That's the future. There we go. There we go. The new Neuralink. Yeah, I mean, I, you know, there's a world in the future where you guys actually just roll up Neuralink and OpenAI.
Starting point is 00:46:18 For sure. Cluely umbrella. Yeah, definitely. It's possible. I'm excited to offer acquisitions for both of those companies. Yeah. It's in the roadmap. It's on the roadmap. All right. This has been a lot of fun. I'm excited for you guys. And I have no doubt that you'll go from, you know, 10 million views a week to 100. And I'm excited to see you guys hit that billion view mark very soon. So keep it up.
Starting point is 00:46:43 We are all very entertained and rooting for you. Yeah, shit, shit. I love the energy. Thanks, man. We appreciate you joining. Better guys. We'll talk you. Keep having fun.
Starting point is 00:46:53 Bye. Next up, we have Kian from Nucleus coming on with a big announcement. Something like 10 years in the making, close to it, maybe seven years. We'll bring Kian in. Let's play some soundboard. How you doing? Welcome to the show. That's a great intro. Tweets are flying. Oh my god, you guys seen this? Yeah, you're seeing this. You're gonna break it down for us. Explain what's happening. There's nothing like a launch day. Is it Gattaca or is it Theranos? Because people can't make up their mind. Oh, yeah.
Starting point is 00:47:29 We're going to find out. What's going on? Let's give some context to the audience. Nucleus has launched Nucleus Embryo, the world's first genetic optimization software. Basically, parents can give their children the best start in life. They can pick their embryo based off of physical characteristics like eye color, IQ. They can go to disease risk like cancers or heart disease. Basically, we really believe parents can get all the information that exists about their embryos,
Starting point is 00:47:53 and they can pick however they want. For me personally, you know, it's been 10 years in the making. The journalist who actually covered it today in the Wall Street Journal was the journalist who covered my gene editing in the warehouse in Brooklyn 10 years ago. Yes. Wow. Overnight success. You know, it's a long time in genetics.
Starting point is 00:48:11 Yeah, so break down the state of the art because like embryo screening exists. I think most parents in America, at least if they have the means, do some sort of screening while the embryo is growing. Is this purely for IVF? Is this just going a layer deeper? And then I want to talk about the regulatory and FDA component as well. Let's talk about it. So basically if you go to an IVF clinic today, you're a couple.
Starting point is 00:48:35 The vast, vast, vast majority of clinics. The first thing to actually understand is that the IVF process is principally controlled today by clinicians or doctors. Honestly, couples don't have as much liberty in our perspective as they should. It's their baby. It's their embryos. They should have the right to that information. And they should pick off any vertical. However, today in the clinic, what generally happens is people test embryos for very rare
Starting point is 00:48:57 and severe genetic conditions. For example, like a chromosomal abnormality, like Down syndrome, for example, or even a condition like cystic fibrosis or Tay-Sachs or PKU, right? These are conditions that are very rare, that maybe someone might be a carrier for cystic fibrosis, but again, it's pretty rare. Then there are conditions that we've all heard about, things like breast cancer, things like coronary artery disease, the things that actually kill the vast majority of people today, right? Chronic conditions kill the vast majority of people today.
Starting point is 00:49:22 Those conditions are just not tested for in the clinic, even though we have very good science actually that can make those predictions. How do we know this? As a DNA company as well, that's what we do, right? We build models that predict disease, and the way you test those models is in adults. So we go from adults to embryos, is actually because we can basically well validate these models to show that they work in both the embryonic context and in the adult context. And so what we're really doing is we're going from, okay, instead of just looking for really, really severe, like Down syndrome, cystic fibrosis, why not do breast cancer?
Starting point is 00:49:49 Why not do heart disease? Why not do colorectal cancer? Right? Why not do schizophrenia? Why not do Parkinson's? But then why stop there? And this is really the important thing because ultimately, you know, if you think about diseases and traits, the extreme version of any trait is actually a disease, right?
Starting point is 00:50:03 Height is a great example of this. One extreme end is like, you know, John, for example, he's like Markinson's in the most, then the other end is like me, dwarfism, right? It's like the same both ends, okay? So, you know, IQ is another example of this. One end is like, you know, autism. The other end, it can actually be some sort of, you know, a cognitive basically challenge that people have. And so when you think about it, when we start realizing that people have drawn a line in the sand saying, you can get, you know, rare diseases. You can't get common diseases, but then they really said you can't,
Starting point is 00:50:31 can't get any traits like height, even though the best predictor we have today actually in the world, the best polygenic predictor, is for height. So as a company, we've kind of completely reimagined this and said, wait a second, what's going on here? You should have access to the entire stack. Rare diseases, we do, like cystic fibrosis, common diseases like breast cancer, and also traits all the way up to something like IQ. Yeah. So, I mean, that test, are you just giving people the data? Because I imagine that once you get into particular recommendations, that's more of what I would expect a licensed doctor to need to do. Well, yeah, my sense is that you can allow people to get the data from their doctor and
Starting point is 00:51:09 then feed it into Nucleus. Is that correct? So that is correct. And actually, we have a couple, there was like 10 announcements today. You know how we do it. We like to do 10 announcements at one day. We are actually very, very excited to announce a huge partnership with Genomic Prediction. So Genomic Prediction is actually the oldest embryo testing company that exists.
Starting point is 00:51:28 They've done genome-wide tests on embryos for almost a decade at this point. And I think they've done over 120,000 couples for PGT-A, which is a specific kind of test. And so we're actually partnering with them. So we make it very easy for Genomic Prediction customers to request their files and actually port it over to Nucleus. But really, this isn't just for Genomic Prediction customers. Anyone who's undergoing IVF can go to their clinic and say, I want my embryos' data. You can take that data, you can upload it to Nucleus, and then all of a sudden, you know, the application of DNA makes this technology universally accessible.
Starting point is 00:52:00 Now, how much of the benefit is actual algorithmic analysis, bringing in other data points to contextualize the data versus just better UI and better hydration of existing text? Because we had a friend on the show who was talking about getting some medical results from a doctor. the doctor's office was closed. It took two days until the doctor was going to be able to interpret the results. He was able to just take a photo, upload it to ChatGPT and say, hey, is this really, really bad? Should I be panicking? Yeah.
Starting point is 00:52:36 Because it seems somewhat out of the range. And ChatGPT was able to say, hey, you still got to talk to the doctor. But this isn't the craziest thing I've ever seen. This isn't way out of distribution. And so that's almost like a pure UI layer, but extremely valuable. I know it might not be like the right narrative for some people, that it's like not as innovative, but I think that like all that matters at the end of the day is giving people benefits.
Starting point is 00:52:56 It's always both. It's always both. You have fundamentally technology, just for technology's sake, it's not still about, right? So it's like by making something that people want, okay? And people can actually use. So you think about the Nucleus innovation, it's twofold, okay? One is in the informatics, right? You know, I've been doing this for five years.
Starting point is 00:53:12 Sure. I almost, I would argue to myself that I probably spent too much time, you know, developing the science, right? Sure. Science in a nutshell, it's not actually very useful. You need to expand access to it. So on that point, we, you know, you know, you know, I'm going to explain it. So on that point, we do multiple different kinds of analyses that make it such that
Starting point is 00:53:25 we can actually provide the most comprehensive analyses that exist today. But moreover, and this is really the, I think a key point to your point, John, is people understand them. People can see them. I mean, you can pull up the platform. I'm not sure if you guys have shown it already, but it's very easy to sort, compare your embryos. You can actually name your embryos.
Starting point is 00:53:43 You can stack rank your embryos. You can understand what the score means. We lead with overall risk, or we tell you, for example, instead of saying you're in the 99% top for genetic risk condition, which, you know, what does it actually mean? we say, hey, you have a 5% chance or the like of, let's say, schizophrenia or some other condition. In other words, by even overall risk, people have much greater intuitive understanding of the results we're communicating to them. We have genetic counselors on hand. So this really is, what are we shown here? Are we showing something? We showed me. Yeah, yeah, we pulled up here. That's another thing.
Starting point is 00:54:08 That's a fun one. That's an Easter egg. That's an Easter egg. That's the kind of approach that we're taking here. And I think consumers are responding to it, right? People want to have access to their data. The clinician, the doctor shouldn't decide which embryo to implant. You should. Okay, so talk to me about what requires FDA approval, obviously new medical devices. Like, if you were developing a machine to take in an embryo and sequence the DNA, I would expect that the FDA would want an approval for that medical device. But if you are taking data and just showing it to a customer in a different UI, that feels like probably a very light FDA process.
Starting point is 00:54:47 And then there's probably a continuum in the middle where once you're making a recommendation, they have rules around that, right? We as a company do not tell you which embryos to implant. You know, basically parents, the couple has complete agency to decide how they want to use the information to implant their embryo. Moreover, let's be clear, height, right? I mean, can a height analysis be a medical device? You know, that doesn't make sense, right?
Starting point is 00:55:10 IQ height, there's traits, for example. We all, you know, traits are something that I don't think necessarily belongs in even the kind of infrastructure thing about medical care, right? These are things that go beyond medical care. things like, you know, that people just kind of intuitively know and that there are DNA tests done every single day due to see for these analyses because they're not disease analyses, right? So we do both diseases and traits to be clear. My point is many of these innovations, you have to wonder, like, you know, should the government say if someone can or cannot pick
Starting point is 00:55:36 their embryo based off height? That doesn't seem right to me. I think it should be in the complete liberty of the individual to decide that. Yeah, but I mean, we're a democratic country. And so if, if, you know, a huge swath of the population says that the FDA should review that type of test or that type of analysis. It could happen. I mean, the FDA reviews all sorts of different stuff. And so I guess the question shifts to, like, do you expect a change from FDA on the way these analysis tools are regulated?
Starting point is 00:56:12 I think right now the most important thing is just putting these high-quality, rigorous science results in people's hands and then helping them basically have healthier children, helping them give their child the best start in life. You know, I think that generally speaking that, you know, people should have more liberty, more choice in medicine. I think the broader longevity trend actually touches on that point as well. So that's what we're excited to do at Nucleus. Yeah, I mean, the fact that you're partnering with a company on, on actual, on the actual
Starting point is 00:56:41 like medical device side, like they are doing the sequencing in the embryos. That really takes it out of the Theranos question entirely in my mind. I feel like you should be beating the drum there a little bit more. It's like we didn't say we created some new device, but I don't know. We ship. That's the difference. I know. It's locked and saw, baby.
Starting point is 00:56:57 Don't look at it. Go use it. That's the evidence. Why is there a footage? Okay. I love the visual of John and his wife selecting between embryos and it's like 6'10" or 7'2". Tough choice. Well, if we go with that,
Starting point is 00:57:15 the 6'10", he has to, you know, potentially fly commercial once in his life. We can actually play this game right now. Okay. Here, we're going to play a game right now. I'm going to put in the chat. Okay. Your embryo.com. Okay.
Starting point is 00:57:28 Everyone listening to this. Pickyourembryo.com. I'm going to go. Oh, my God. Here we go. Little Easter egg here. Okay.
Starting point is 00:57:34 Let's see. What's more important to John? Intelligence or muscle strength. Come on. Oh, absolutely muscle strength. Let's go. We're the future's body building. Yeah.
Starting point is 00:57:42 John would take a, he would, he would happily have a, a 5'2" son if he had, you know, top 0.01% bodybuilding genetics. Exactly. Yeah. Okay. So which one is the lifespan or height? Come on. Lifespan. Lifespan. Let's go.
Starting point is 00:57:57 Let's go. Let's go. Let's go. Maybe low depression. You got to be golden retriever mode. You got to be. You need low depression. You need low depression. Let's go low OCD. I don't mind bouncing around a bunch. Okay. Let's go. Risk taking anxiety. Let's go high risk taking.
Starting point is 00:58:13 There we go. Okay. Okay. Okay, wait, analyze. Is this some generative stuff going on? This is great. Nadia. I got Nadia, too. The enduring athlete, let's go.
Starting point is 00:58:23 Physically strong, cautious, built to last. Yeah, this is great. Is this driving a lot of attention, a lot of downloads? Is this going viral yet? It seems like something that's designed to be shareable. I think it just dropped it right now. Technology brothers, we got you the exclusive. Let's go.
Starting point is 00:58:36 Let's put it out there. You know, they can pick your embryo. People say, what's it like? Maybe you're not doing IVF. Yeah, no problem. It's funny. Only 9% of people choose Nadia. Okay. Well, we're contrarian. We like that here. Yeah. That's fun. It's great. Oh, well, well,
Starting point is 00:58:51 congratulations on the news. Congratulations on the launch. Yeah, the pace is wild. Last thing, what's going on with, have you seen these just blood billboards? Oh, yeah. They're all over L.A. So, so there is, there's someone who's running a campaign right now, justice for Elizabeth Holmes, claiming that Theranos was not the scam people think it was. And there's a, there's a documentary coming out and there's billboards all over L.A. for just blood. Like, it's just blood. It's not that big of a deal. And John, just to be clear, there's an exclusive on Technology Brothers next week from this person, right? They're going to tell their story next week, just to make sure you invite them already. I hope. We are toying with the idea; someone reached out to kind of connect us.
Starting point is 00:59:33 We're thinking about doing it, but we're not a hundred percent sure that it would be appropriate for the, based on the website. I don't know if it's appropriate. Yeah, it doesn't look like it was designed with Figma. So I don't know. We can't quite do it. It's a little bit. The team definitely doesn't use Linear. Yeah, but they claim that Elizabeth Holmes has been proven innocent. And so it's a bold claim.
Starting point is 00:59:55 We like to see people making bold claims. By what jury is my question. The jury of someone who knows HTML. Kian, always a great time. The energy is off the charts. Electric. Thank you for coming on, firing us up. Congratulations on the launch.
Starting point is 01:00:12 We will talk to you soon. Talk soon. Twitter for sure, okay? Yeah, we'll see you there. Bye, guys. Bye. We have someone from OpenAI here. We're going to stick to technology and business, but welcome to the show, Mark Chen.
Starting point is 01:00:24 Good to see you. Thank you, guys. Awkward day, but I'm excited to talk about deep research. I am excited to talk about AI products. Would you mind introducing yourself and kind of explaining what you do, because OpenAI is such a large company now and there's so many different organizations? I'd love to know how you interact with the product and the research side and anything else you can give to contextualize this conversation.
Starting point is 01:00:47 Yeah, absolutely. So first off, you know, thanks for having me on. You know, I'm Mark. I am the chief research officer at OpenAI. So in practice, what that means is I work with our chief scientist, Jakub, and, you know, we set the vision for the research org. We set the pace. We hold the research org accountable for execution.
Starting point is 01:01:05 And ultimately, we really just want to deliver these capabilities to everyone. That's amazing. In terms of research, I feel like a lot of what happens on the research side is actually gated by compute. Is that a different team? Because what if the researchers ask for a $500 billion data center, that feels like maybe a bigger task? Yeah, it is useful for us to factor the problem of research
Starting point is 01:01:28 and also kind of building up the capacity to do that research. So we have a different team, Greg leads that, which really thinks holistically about, you know, data center bring-up and how to get the most compute for us. And of course, when it comes to allocating that compute for research, you know, Jakub and myself do that. That's great. And so what can you share that's top of mind right now on the research side?
Starting point is 01:01:52 There's been this discussion of pre-training scaling wall, potentially the importance of reinforcement learning, reasoning. There's so many different areas to go into what's actually driving the most conversations internally right now. Yeah, absolutely. So I think really it's a really exciting time to do research. I would say versus two or three years ago, I think people were trying to build this very big scaling machine. Yeah. And really the reasoning paradigm changed a lot of that, right?
Starting point is 01:02:24 You know, like reasoning is really taking off and it really opens this new playground, right? It's like there are a lot of kind of known unknowns and also unknown unknowns that, you know, we're all trying to figure out. It kind of feels like the GPT-2 era, right? Where there's so many different hyperparameters you're trying to figure out. And then I think also, you know, um, like you mentioned, you know, pre-training, that's not to be forgotten either.
Starting point is 01:02:46 You know, today we're in a very different regime of pre-training than we used to be, right? Today, we can't treat data as this infinite resource. And I think a lot of academic studies, you know, they've always kind of treated, you know, you have some kind of finite compute but infinite data. I don't think there's much study of, you know, like, you know, finite data and infinite compute. And I think, you know, that also leads to a very rich playground for research. Do we need kind of a revision to the bitter lesson? Is that a refutation of the bitter lesson?
Starting point is 01:03:18 Or do we just need to rethink what the definition of scaling laws looks like? No, I don't think of anything as a refutation of the bitter lesson. Really, like, our company is grounded in we want simple ideas at scale. I think RL is an embodiment of that. I think pre-training is an embodiment of that. And really at every single scale, we face some kind of difficulty of this form. It's just like, you've got to find some innovation that gets you past the next bottleneck. And this doesn't feel fundamentally very different from that.
Starting point is 01:03:51 What's most important right now on the actual compute side? We heard from NVIDIA earnings that we didn't get a ton of guidance on the shift from training to inference usage of NVIDIA GPUs, but it feels like it must be coming. It feels like this inference wave is happening. Are those even the right buckets to be thinking about tracking metrics in terms of the story of artificial intelligence? Because yeah, I mean, it's like if the reasoning tokens are inference tokens, but they're what lead to more intelligent models, like it's almost back
Starting point is 01:04:35 in the training bucket again. What buckets should we be thinking about? And how firmly are we in the applied AI era versus the research era? Well, I think research is here to stay. And it's for all the reasons I mentioned above, right? It's such a like a rich time to be doing research. But I do think, you know, inference is going to be increasingly important as well, right? It's such a core part of RL that you're doing rollouts.
Starting point is 01:05:05 And I think, you know, we see 2025 as this year of agents, right? We think of it as a year where models are going to do a lot more autonomous work. You can let them kind of be unsupervised for much longer periods of time. And that is just going to put big demands on inference. When you think about our overall vision, right, we laid it out as a series of steps and levels on the way to AGI. And I think the pinnacle, really that last level, is organizational AI. Like you can imagine a bunch of AIs all interacting. And yeah, I think that's just going to put huge demands on inference, right?
Starting point is 01:05:43 On that organizational question, I remember reading AI 2027. And one of the things that they proposed was that the AIs would actually, like, literally be talking to each other in Slack. Does that seem like, does that seem like the way you imagine agents playing out, like using the same tools as humans instead of kind of- One agent says, I'm going to go talk with teams and talk with Slack. I'm going to do a little negotiating. But maybe it just happens super, super fast 24-7, or is there like a new machine language that
Starting point is 01:06:16 emerges? Yeah. I mean, I think one thing that's really helped us so far in AI development is to come in with some priors for how humans do things. And that's actually, you know, if you bake in those priors, they typically are great starting points. So I could imagine, like, maybe you start with something that's Slack-like and give it enough flexibility that it can kind of develop beyond that
Starting point is 01:06:39 and really figure out the way that's most effective for it to communicate. One important thing though is, we want interpretability too, right? I think it's very helpful for us today that what the agents do is easy for us to read and interpret. I don't think you want that to go away as well. So I think there's a lot of benefits just even from a pure, like, debug-the-whole-system perspective, to just let the models, you know, speak in a way that
Starting point is 01:07:06 is familiar to us. And you can also imagine, like, we might want to plug in to the system too, right? So, you know, whatever interfaces we're familiar with, we would ideally like our model to be familiar with as well. Yeah. I think it's also pretty compatible with, you know, we hit a big milestone.
Starting point is 01:07:25 We got, I think, 3 million paying business users fairly recently. Let's go. Yeah, there we go. Let's go. And, um, three gong hits for three million. The gong will keep ringing for a while.
Starting point is 01:07:42 Sorry. I was hoping you would drop a number. Yeah, yeah. Congratulations. That's actually huge. That's amazing. Yeah, yeah. But I think a big part of that is, you know, we have connectors now, right?
Starting point is 01:07:55 Yeah. We're connecting into, you know, like G drives. And I think, yeah, you can imagine, you know, like Slack integrations, things like that. I think we just want the models to be familiar with the ways we communicate and get information. Yeah. Can you talk about benchmarking? It feels like we're potentially...
Starting point is 01:08:11 Yeah, do you think about benchmarks at all? Oh, yeah, a lot. I mean... Okay. But I think it's a difficult time for benchmarks, right? I think we used to be in this world where you have these human written benchmarks for other humans, right? And I think we all have these norms for, like, what are good benchmarks, right?
Starting point is 01:08:30 Like, we've all taken the SAT. We all have, like, a good conception of what it means to get, you know, whatever score on that. But I think the problem is the models are already at the point where for even the hardest human-written benchmarks for other humans, it's really near saturated or saturated, right? I think one clear example here is the AIME, like probably the hardest auto-gradable like human math eval, at least in the U.S. And yeah, the models are consistently getting like 90 plus percent on these. And so what that means is, I think there's kind of two different things that people are doing, right? They're developing kind of model-based benchmarks, right? They're not kind of things that we would give to an ordinary human, things like Humanity's Last Exam, things like, you know, Epoch AI that are really, really at the frontier of what people can do.
Starting point is 01:09:27 And I think the hard thing is it's not grounded in intuition anymore, right? Like, you don't have a lot of people who have taken these exams. So it makes it harder to kind of calibrate on whether this is a good exam or not. One of the exciting things that's on the flip side of that is I really do think we're at the era where models are going to start innovating, right? Because I think once you've passed the last kind of like the hardest human-written exams, that's kind of at the edge of innovation. And I think you already see that with the models, right? Like they're helping to write parts of papers. And I think the other kind of way that people have shifted is, you know, there's these, you know, ultra-frontier evals, but there are also people kind of just indexing on real-world impact, right?
Starting point is 01:10:11 You look at your revenue, kind of the value you deliver to users. And I think that's ultimately what we care about. Can you bring that back to interpretability research? Like, with these super, super hard math evals, for example, Are we doing the right research to understand if the thought process mirrors, not just, not just one-shotting the answer, oh, you memorized it or you magically got it correct, but you actually took the correct path, kind of like, you know, you're graded for your work, not just the answer if you're in grade school. And, and, you know, Dario said that interpretability research will actually contribute to capabilities and even give a decisive lead. Do you agree with that? What's your reaction to that concept of interpretability research being very important?
Starting point is 01:11:02 Yeah. I mean, we care a lot about it here at OpenAI as well. So one thing that we care a lot about is interpreting how the model reasons. Right. Because I think we've had a very kind of specific and strong view on this in that we don't want to apply optimization pressure to how the model thinks so that it can be faithful in the way it thinks and to expose that to us, you know, without any kind of incentives to cater to what the user wants, right? I think it's actually very important to have that unfiltered view.
Starting point is 01:11:36 Because, you know, oftentimes, like, if the model isn't sure, you don't want to hide that fact, right? Just for it to kind of please the user. And sometimes it really isn't sure, right? And so we've really done a lot of work to try to promote this norm of chain-of-thought faithfulness and interpretability. And I think it gives you a lot of sense into what the model is thinking and, you know, what are the pitfalls that it can go off into if it's not reasoning correctly. That's such an important point because if you have somebody on your team and they come to you and they say, hey, you know, I think this is the right answer, but we should probably verify it.
Starting point is 01:12:13 It's like, it's still valuable. Totally. It puts you on the right path. If somebody comes to you 100% confidence, this is the truth. It's like, well, like trust is just destroyed. Yeah, totally. Yeah. Yeah.
Starting point is 01:12:23 Don't you guys feel like, you know, safety felt a lot more theoretical a couple years back, right? But like today, you know, like the things that people were talking about a couple years ago, like, scalable oversight, really having the model be able to tell you, like, and convince you that the work it did was right. It feels so much more relevant right now. Just because the capabilities are so strong. Yeah. I mean, just personally, I've completely flipped from being like, oh, the safety research is not that valuable because I'm not that worried about getting paperclipped. it just seems like a very low likelihood that that's kind of like the bad ending like immediately
Starting point is 01:12:55 and this foom and all these crazy uh gray goo scenarios were just so abstract and sci-fi it just felt like economics will will fall into place and there will be uh like a like a cold like the nuclear ending which is like we didn't build nuclear plants and we just stopped everything because we humans seem to be good at that uh but now that we're actually seeing things like yeah yeah it's crazy how fast it's been right like um oh yeah i think my my personal story is it's like you know What got me into AI was AlphaGo, right? Like just watching it get to that level of capability at Go. And you were kind of like,
Starting point is 01:13:28 it was such an optimistic and also kind of a little bit of a sobering message, right, when you saw Lee Sedol get beat. And I just remember, you know, like we saw the coding models, you know, when we first launched like, I think very OG Codex, you know, with GitHub Copilot. It was maybe like under, you know, a thousand Elo on Codeforces.
Starting point is 01:13:48 And I still remember the meeting where I walked into, where the team showed my score and they're like, hey, the model's better than you. And you come full circle and it's like, wow, like I put decades of my life into this. Yeah. You know, the capabilities are there. So like if, you know, I'm kind of at the top of my field in this thing and it's better than me, like what can it do? Yeah.
Starting point is 01:14:09 Yeah, that's amazing. I have so many more questions. On AlphaGo, are there, are there lessons from scaling, how scaling played out there that you can, that we can abstract into the rest of AI research? What I mean is, as I remember it, the AlphaGo training run was not 100K H200s. But what would happen if we actually did an AlphaGo style training run? I mean, it would be an economic money pit, right? Like, there'd be no economic value to it.
Starting point is 01:14:42 But let's just say some benevolent trillionaire decides, I'm going to spend a billion dollars on a training run to beat AlphaGo and go even bigger. Is Go at some point solved? Would we see kind of diminishing scaling curves? Could we throw extra RL at it? Could we port back everything that we've been doing in just general AGI research and just continue fighting it out in the world of Go?
Starting point is 01:15:08 Or does that end and does that teach us anything? Yeah. Yeah. Honestly, I feel like if you really are curious about these mysteries, join our team. That's the thing I want to say. So, yeah, I mean, really like kind of the central problem of today is RL scaling. When you look at AlphaGo, right, it's a narrow domain.
Starting point is 01:15:26 Right. I think in some sense that limits the amount of compute you can pump into it. But even kind of small toy domains, they can teach you a lot about how you scale RL. Like what are the axes where it's most productive to pump scale in? I think a lot of scaling research just looks like that, whether it's on RL or pre-training. So you identify a lot of, you know, different variables under which you can scale. And like where do you get the best kind of like marginal impact for pumping scale there?
Starting point is 01:15:55 I think that's a very open question for RL right now. And I think what you mentioned as well, it's just like, you know, going from narrow to broad, right? Does that give you a lever to pump a lot more scale in as well? I think when you look at our reasoning models today, they're a lot more broad based than, you know, just being kind of an expert system on Go. So yeah, I really do think that there are so many levers to scale. What about Move 37? That was such an iconic moment in that AlphaGo Lee Sedol match.
Starting point is 01:16:28 They placed Move 37. It's very unconventional. Everyone thinks it's a blunder. It turns out not to be. It turns out to be critical. It turns out to be innovation. Do you think we are, we're certainly post-Turing test in language models. We're probably post-Turing test in image
Starting point is 01:16:44 generation. But it feels like we're pre-Move 37 in text generation in the sense that there hasn't been like a fully AI generated book that everyone is just, oh, it's the new Harry Potter. Everyone has to read it. It's amazing and it's fully generated or this image. The images, they do go viral, but they go viral because they're AI. Move 37 in the context of Go did not go viral because it was AI. It felt like it was actual innovation. So is that the right frame? Does that make any sense? I think it's not the wrong frame. So I think some quick thoughts on that.
Starting point is 01:17:22 I think kind of when you have something that's very measurable, like win or lose, right? Something like Go. Yeah, it's like very easy for us to kind of just judge, right? Like did the model do something right here? And I think the more fuzzy you get, you know, it is just harder, right? Like when it comes to, is this the next Harry Potter? Right. Like, you know, it's not a universally loved book.
Starting point is 01:17:46 I think fairly universal, but, you know, there's, there's some haters. And yeah, I think it's, it is just kind of hard when it comes to these human subjective things. Right. Where it's really hard to put down in words, like what makes you like Harry Potter, right? Yeah. And so I think those are always going to lag a little bit. But, you know, I think, you know, we're developing more and more techniques to attack kind of these more open-ended domains. And I don't know.
Starting point is 01:18:15 I wouldn't say that we're not at an innovative stage today. So I think my biggest touch with this was when we had the models compete on the IOI last year. So IOI, it's like the international, basically Olympics for computer science, basically the top four kids from each country go and compete. And these are really, really tough problems, basically selected so that they require some innovative insight to solve. I think, and we did see the model come up with solutions, even to some very ad hoc problems.
Starting point is 01:18:54 And so I think there was a lot of surprise for me there. I was completely off base about which problems the model would be able to solve the most. I think, like, I kind of categorize, there's six problems. Some of them as more kind of like, oh, this is a little bit more standard. This is a little bit more out of the box. I was like, it's not going to be able to solve this more out of the box one, but it did. And I think it really does speak to kind of these models have the capacity to do so, especially trained with RL. No, no, no, put that in context of what's going on with ARC-AGI. Obviously, OpenAI has made incredible progress there, but it just, when I do the problems, it seems easy. And when I look at the IOI sample problems, I think, this would be a 20-year process for me to figure out how to achieve that, and I can do the ARC-AGI on my phone. Is this the spiky intelligence concept? Is this something that a small tweak in algorithmic design just one-shots ARC-AGI, or
Starting point is 01:19:57 is there something else going on there that we should be aware of? Yeah. I mean, I think part of this is the beauty of ARC-AGI as well, right? Like, I think I'm not sure if there's another kind of like human intuitive, simple benchmark, which is hard for the models. I think really that's one of the things they really optimize for on that benchmark. I do think when it comes to models, though, like there's just a little bit of a perception gap as well. Like, you know, models aren't used to this kind of native, you know, like just screen type input.
Starting point is 01:20:28 I think there's a lot we can bridge there. Actually, even o4-mini, it's a state-of-the-art multimodal model in many ways, including visual reasoning. And I think, you know, you're starting to kind of build up the capacity for the models to take images, manipulate, and reason about them. Or generate new images, write code on images. And I think it's just been kind of underfocused. But I think when I talk to researchers in the field, they all see this as a part of intelligence, too. And we're going to continue the focus there. Yeah.
Starting point is 01:21:01 Is ARC-AGI kind of in the, if we're dropping a buzzword on it, is like program synthesis? Is there a world where, I know that, I know the tokens, like the images, we see them as, as renderings of squares and different colors, but the, when they're fed into the LLM, they're typically just a stream of numbers effectively. Is there a world where actually adding a screenshot is what's important, like visual reasoning? Yeah, yeah. So I think, I think that could be important. It's just like kind of, you know, whenever it comes to like textual representation of grids, um, models today just don't really do that well, right? And I think it's just kind of because humans don't really ever write down
Starting point is 01:21:46 textual representations of grids or like, you know, we have a chessboard. Like no one really kind of just like types it out in a grid. Yeah. And, um, and so the models are kind of like under trained a little bit on, on what that looks like and what that means. Sure. So, um, you know, I, I think with more reasoning, it'll, we'll, we'll just bridge the gap. I think with better visual perception, we'll just bridge that gap.
Starting point is 01:22:09 Yeah. How are you thinking about the role of non-lab researchers in the ecosystem today? I'm sure you try to recruit some of the best ones, but the ones that don't join your team. Tell us about the one that got away. Yeah, the one that got away. Yeah, no, I mean, I think it's still actually a fairly good time for specific domains, right, to be doing research. And, you know, I think the style is just very different. And you do feel the pull of non-lab researchers into labs because I think they feel like a lot of the burning problems in the field are at scale, right?
Starting point is 01:22:45 And that's kind of one of the unfortunate things too, right? Like when you look at reasoning, you just don't see that happen at small scale, right? There's like a certain scale at which it starts becoming signal bearing. And that requires you to have resources, right? But I do think, you know, a lot of the really good work that I've seen, you know, there's experimental architectures. I think a lot of good work is happening in the academic world there. Like a lot of study in optimization, a lot of study in kind of like GANs, you know, there's certain fields where you see a lot of fruitful research that happens in academia.
Starting point is 01:23:23 Yeah, that makes a lot of sense. How about consumer agents? How are you thinking about them? You talked earlier about sort of B2B adoption and that's all very exciting. But how much do you, in the research org think about breakout consumer agent products? Yeah, that's a fantastic question. I think we think about it a lot. I think that that's the short answer. You know, we really do think like this year we're trying to focus on how we can move to the agentic world, right?
Starting point is 01:23:51 And when I think about consumer agents, I think like ChatGPT proved that, you know, people got it, right? It's like people get conversational agents when they, conversational kind of models. But when it comes to consumer agents, we have a couple of theses that we've tried out in the world. I think one is deep research, right? I think this is something that can do five to 30 minutes of work autonomously, come back to you and really like kind of synthesize information, right? It goes out there, gathers, collects, and kind of, you know, compresses the information in a form that's useful. A little bit of pushback there.
Starting point is 01:24:30 Like I can see that as a consumer product when someone like Aiden is like, I want new towels and he uses deep research to like figure out like what is the best towel across every dimension. But when I think of deep research, yes, it has applications with students, but it's often. Some of them might just be a person consumers being like, we keep using this flight report on this country and where to travel and things like that. We keep using this flight example, but I don't, I haven't actually tried to book a flight with deep research. It's totally possible that it could go and pull all the different flight routes and and calculate all the different delays and all the different parameters of if I fly to this airport,
Starting point is 01:25:07 I can park or I can use valet here or something like that. Yeah. And I guess like when I think of agents, it's deep research is like, you know, curating information on which you can take action on. But it's like at what point is action a part of that sort of loop, right, where you can not only curate a list of flights that you want, but then, you know, actually go out and have agency. I think one of our explorations in that space is operator. It's where you kind of just feed in raw pixels from your laptop into, or from some virtual machine into the model. And it produces either a click or some keyboard actions, right?
Starting point is 01:25:46 And so there it's taking action. And I think the trouble is, you don't ever want to mess up when you're taking action. I think the cost of that is super high. You only have to get it wrong once to lose a user's trust. And so we want to make sure that that feels super robust before we get to the point where we're like, hey, look, here's a tool. That's so different than deep research because, like, you can wind up on some news article and read one sentence where it gets a fact wrong or the comma's in the wrong place and the number's off. But that's just the expectation for just text and analysis. And if you delegated that, yeah, you're going to expect a few errors here and there.
Starting point is 01:26:31 Oh, that's actually a different company name or that's an old data point. There's new data. But very different if I book a flight and you book the wrong flight and I can wind up in Chicago instead of New York. Exactly. And I think the reason why we care so much about reasoning is because I think that's the path that we get reliable agents through. Sure. You know, we've talked about like reasoning helping safety, but reasoning is also helping reliability. Right.
Starting point is 01:26:55 It's like, you imagine, like, what makes a model so good at a math problem? It's like, it's banging its head against it. It's trying a different approach. And then it's, like, adapting based on what it failed at last time. And I think that's the same kind of behavior you want your agents to have. It's like, it tries things, it adapts, and keeps going until it's... And humans do this every day. You're booking a flight.
Starting point is 01:27:17 You keep hitting an error. It's not clear which field or which form you missed, right? And you're just sort of banging your head against the computer. And eventually it says, okay, you're booked, right? I think that's a great call-out. Yeah. I mean, there's so many more questions we could go into, but I'm interested in the scaling of RL and kind of the balancing act between pre-training, RL, and inference, just the amount of energy that goes into getting a result when you distribute it over the entire user base. How is that changing? And I guess, are we post, like, really
Starting point is 01:27:54 big, really big runs? Is this going to be something that's like continually happening online? It feels like we're moving away from the era of like, oh, some big development, some big run happened and now we're reaping the fruits of it, versus a more iterative process. Yeah, I mean, I don't see why it has to be so, right? I think like if you find the right levers, you can really pump a lot of compute into RL as well as pre-training. I think it is a delicate balance though between all of these different parts of the machine. And, you know, when I look at my role with Jakub, it's just kind of like figure out how this balance should be allocated, where the promising kind of like nuggets are arising from, and resourcing those. Yeah, it's,
Starting point is 01:28:39 it's kind of a, in some sense, I feel like part of my job is a portfolio manager. Yeah. That's a lot of fun. Well, thank you so much for joining. This is a fantastic conversation. We'd love to have you back and go deeper. Great hanging, Mark. We'll talk to you soon. Yeah. Peace. Have a good one. Next up, we have Sholto Douglas from Anthropic coming on the show. I'm getting so many. I'm just getting a lot of messages saying, why? No one cares about AI.
Starting point is 01:29:06 Talk about the drama on the timeline. Well, we do care about AI. We care a lot about AI. But it is a mess out there. Wow. Yeah, the end of the Trump-Elon era. I don't know. Well, maybe we have to get some people on.
Starting point is 01:29:22 talking about it tomorrow or something. I'm going to do it today. Anyway, we have Sholto from Anthropic in the studio. How are you doing? What's going on? Good to see you guys. Hopefully, uh, you're staying out of the chaos on the top. Don't open.
Starting point is 01:29:37 Don't open. Don't open. Sweet child. Just, just. Yeah. Mute everything. Stay focused on the application.
Starting point is 01:29:46 Stay focused on the mission. Stay focused on the next training run. We really, humanity really cannot afford for any... Distractions. ...researchers to open X. What a hilarious day. Anyway, I mean, I'm... Back to 24 hours, guys.
Starting point is 01:29:58 Yeah. How are you doing? What is new in your world? What are you focused on mostly day to day? And maybe that's just by way of an intro. Yeah. So at the moment, focused really hard on scaling RL. And that is the theme of what's happening this year.
Starting point is 01:30:11 And we're still seeing these huge gains where you go, you know, 10x compute increase in RL. We're still getting like very distinct linear gains, basically for that. And because RL wasn't really scaled anywhere close to how much pre-training was scaled at the end of last year. Yeah. We have like a, basically a gamut of, like, riches over the course of this year. Yeah. So where are we in that, in that RL scaling story?
Starting point is 01:30:34 Because I, I remember the, the, some of the rough numbers around like GPT-2, GPT-3, we were getting up into like, it cost $100 million. It's going to cost a billion dollars. Like, just rough order of magnitude, not even from Anthropic, just generally, like, what does a big RL run cost or, or how many? Are we talking 10K H200s or 100K? Like, are we going to throw the same resources at it? And if so, how soon?
Starting point is 01:30:59 Yeah. So I think in Dario's essay at the beginning of the year, he said that a lot of runs were only like a million dollars back in like December. I think you have like DeepSeek V3 and this kind of stuff, like R1, which means that there's at least two OOMs just to get to the scale of GPT-4. And GPT-4 was two years ago. Yeah. RL is also perhaps a bit more naively parallelizable and scalable than pre-training.
Starting point is 01:31:20 With pre-training, you need everything in one big data center, ideally, or you need like some clever tricks. RL, you could, like, in theory, like what the Prime Intellect folks are doing, scale it all over the world. Yep. And so we're held back like maybe you're held back far less than you are. Sure. Sure. So everyone and their mother has a billion dollars now.
Starting point is 01:31:40 There's, you know, hundreds of thousands of GPUs getting pumped all over the place. I feel like we're not GPU poor as a society. Maybe some companies need to justify it in different ways, but it sounds like there's some sort of, uh, like reward hacking problem that we're working through in terms of scaling RL. What are all of the problems that we're working through to actually go deploy the capital cannon at this problem? Yes. So, I mean, think about what you're asking the model to do in RL: you're asking it to achieve some goal at, at any cost, basically. Yeah. And this comes with a whole host of like behaviors, which you may not intend. In software engineering, this is really easy.
Starting point is 01:32:21 It might try and hack unit tests or whatever. In much longer horizon, real world tasks, you might ask it to, say, go make money on the internet, and it might come up with all kinds of fun and interesting ways to do that, unless you find ways to guide it into following the principles that you want it to obey, basically, or to align it with your idea of what's sort of best for humanity. And so it's actually a pretty intense process. It's a lot of work to find and hunt down all the ways these models are,
Starting point is 01:32:49 hacking through the rewards and patch all that. Yeah. Yeah. How are we going to see scaling in the number of rewards that we're RLing against, if that makes sense? I would imagine that at a certain point, unless we come up with like kind of like the Genesis prompt, go forth and be fruitful or something and multiply. You could imagine training runs on just knocking down one problem after another? And is that kind of the path that we're going down? I very much think so. There's this idea in which the sort of world becomes an RL environment machine in some respect, because there's just so much leverage to making these models better and better at all the things we care about. And so I think we're going to be training on just
Starting point is 01:33:38 everything in the world. Got it. And then does that lead to more model fragmentation, models that are good at programming versus writing versus poetry versus image generation? Or does this all feed back into one model? Does the idea of the consumer needing to pick a model disappear? Are we in a temporary period for that paradigm? I think the main reason that we've seen that so far is, uh, because people are trying to make the best of the capital. Like, we are all still GPU poor in many ways. Okay. And people are focusing those GPUs on the sort of like spectrum of rewards that I think is most important. And I'm a bit of a big model guy. I really do think that similar to how we saw with large pre-trained models before, where small fine-tuned models made it, like had gains over the
Starting point is 01:34:31 sort of GPT-2 era, but then were obsoleted by GPT-4 being generally good at everything. I think, to be honest, you're going to see this generalization and learning across all kinds of things that means you benefit from having large single models rather than specialization or area fine-tuned models. Can you talk a little bit about the transition from or any differences between RLHF and just other RL paradigms? Yes. So RLHF, you're trying to maximize a pretty, like lossy signal, things like pairwise, like what do humans prefer? And I don't know if you've ever tried to do this, like judge two language model responses. I get prompted for that all the time. Right. And I'm always like, I don't want to read both of those. I'll just click the one on the left. Exactly, exactly. And
Starting point is 01:35:14 I click one of the random ones sometimes. Or I click the one that just looks bigger or I'll read the first two sentences. But yeah, I'm not giving straight. I'm not doing my job as a human reinforcer. Human preferences are easy to hack. Yeah, totally. Environments in the world are much truer if you can find them. So it's something like, did you get your math question right?
Starting point is 01:35:36 It's a very real and a true reward. Does the code compile, right? Does the code compile? Exactly. Did you make a scientific discovery? We've got very little rewards right now, but pretty quickly over the next year or two, you're going to start to see much more meaningful and long horizon rewards. You're going to see models bribing the Nobel Committee to win Nobel Prize. It's a good reward hacking.
Starting point is 01:35:58 There's reward hacking. But that's nothing to prevent, right? Exactly. Yeah, yeah, that's the real nightmare scenario. What about, like, there's so many different problems that we run into that feel like it's just really, really hard to design any type of eval. My kind of benchmark that I use whenever a new model drops is just tell me a joke.
Starting point is 01:36:22 They're always bad. And or, or even the latest Veo 3 video that went viral was somebody said like, stand-up comedy joke. And it was kind of a funny joke, but it was literally the top result for joke Reddit on Google. And then it clearly just took that joke. And then put it in a video that looked amazing.
Starting point is 01:36:44 But it wasn't original in any way. And so we were joking about like the RLHF loop for that is like you have an endless cycle of comedians running AI generated materials and then, you know, microphones in all the comedy clubs to feed back what's getting laughs. I mean, honestly, that would work pretty well. Yeah. If any comedians want to hook us up with an RL loop, I mean. Yeah, yeah. But for some of those less, like as you go down the curve, it feels like each one gets harder and harder to actually tighten the loop.
Starting point is 01:37:18 We see this with like longevity research where it's like, okay, it takes a hundred years to know if you extended a human life. Like the, yes, you could create a feedback loop around that, but every change is going to be hundreds of years. And so even if you're on the cycle, it's irrelevant for us in the context that we talk about AI. So talk to me about like, are you running into those problems or or will there be like another approach that kind of works around those?
Starting point is 01:37:42 So there are a lot of situations where you can get around this by just running much faster than real time. Like let's say the process of building like a giant app, like building Twitter, right? It's something that would take humans months. But if you've got fast enough and good enough AIs, you could do that in several hours. You've parallelized heaps of AI agents that are all building around the spec. And so you can get a faster reward signal in that way. In domains that are less well specified like humor, I agree, it's really, really hard.
Starting point is 01:38:06 And this is like why I think in some respects, like creativity is like at the top end of the spectrum, like true creativity is much, much harder to replicate than the sort of like analytical, scientific style reasoning. Yeah. And that will just take more time. You know what? The models actually are pretty good at making jokes about being an AI. This feels weirdly fresh. Like everything else is kind of a weird copy of something.
Starting point is 01:38:28 Like it just feels like it's derivative, basically. It's trying to infer what humor is and it doesn't really understand it. But jokes about being an AI are quite funny. Yeah, I think this also might be, I don't know if it was directly reward hacking, but I noticed that one of the new models dropped and a bunch of people were posting these like 4chan-like "be me" memes. And it seemed like they were kind of hacking the humor by being hyper specific about an individual that they could find information on online.
Starting point is 01:38:56 And so you're laughing at the fact that it's like, oh, wow, that is like something that I've posted about. It's making a reference. But it's not really that funny to me. Other than it's just like, wow, it really did its research. Like it really knows Tyler Cowen intimately, which is cool. But I didn't find it hilarious. Yeah, yeah, yeah, very interesting.
Starting point is 01:39:14 Let's talk about some sort of deep research projects and products. We were talking to Will Brown, and he was saying, like, AGI is here with some of the bigger models, but the time that AGI can feel consistent, it diverges. And so you could be working with someone who's, you know, 100 IQ, but they will stay consistent for years as an employee, or they'll keep, you know, living their life. Whereas a lot of these super smart models are working really well. And then after a few minutes of work, the agents kind of diverge and kind of go into odd paradigms. And it feels very not human.
Starting point is 01:39:55 It feels like, like just, they're hyper intelligent one way and then extremely stupid. What's going on there? What is the path to extending that? Is that more like having better planning and better, like dividing up the task? Or will this just kind of naturally happen through the RL and scale? Yeah, so there's that jaggedness, right, which is what you're seeing, is how we call it.
Starting point is 01:40:19 And I think that is largely a consequence of the fact that maybe something like deep research, it's probably been RL'd to be really good at producing a report. Yeah, but it's never been RL'd on the like act of producing valuable information for a company over a week or a month or like making sure the stock price goes up in like a quarter or something like this, right? Like it doesn't have any conception of how that feeds
Starting point is 01:40:40 into the broader story at play. It can kind of infer it because it's got a bit of world knowledge from the base model and this kind of stuff. But it's never actually been trained to do that in the same way humans have. So to extend that, you need to put them in much longer running, much like long horizon things. And so deep research needs to become, you know, like deep operate-a-company-for-a-week kind of thing. Sure.
Starting point is 01:41:01 Is that the right path? Like, it feels like the road might be there's a, like the longest running LLM query used to be just like a few seconds, maybe a few minutes. And I remember when some of the reasoning models came out, people were almost trying to like stunt on it by saying like, oh, I asked it a hard question and thought for five minutes. Now deep research is doing 20 minutes pretty much every time. Is the path two hours, two days?
Starting point is 01:41:28 Or are we going to see more efficiency gains such that we just get the 20 minute model, the 20 minute results in two minutes and then two seconds? Yeah, so this is somewhere where like inference in many respects and parallelization becomes really important. So both like how fast is your inference, that literally affects the speed at which you can think and the speed at which you can like, you know, do these experiments. Also how easily you can parallelize becomes really important.
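A rough sketch of the parallelization point in this answer: several sub-agents each produce a sub-report concurrently, and a lead agent compiles them. The `sub_agent` coroutine is a hypothetical stand-in that sleeps instead of calling a model; only the `asyncio` calls are real library APIs.

```python
import asyncio
from typing import List


async def sub_agent(topic: str) -> str:
    """Hypothetical sub-agent: the sleep stands in for a slow model call."""
    await asyncio.sleep(0.01)
    return f"sub-report on {topic}"


async def deep_research(question: str, topics: List[str]) -> str:
    """Dispatch one sub-agent per topic in parallel, then compile the results."""
    # gather() runs the coroutines concurrently and returns results in input order.
    reports = await asyncio.gather(*(sub_agent(t) for t in topics))
    return question + "\n" + "\n".join(reports)


def run(question: str, topics: List[str]) -> str:
    return asyncio.run(deep_research(question, topics))


if __name__ == "__main__":
    print(run("Which chip should we buy?", ["memory bandwidth", "FLOPs", "networking"]))
```

With this shape, wall-clock time is set by the slowest sub-agent rather than by the sum of all of them, which is the kind of timeline compression being described.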
Starting point is 01:41:51 Like can you dispatch a team of sub-agents to go and do deep research and like compile like sub-reports for you so that you can do everything in parallel. These kinds of like, it's both like there's an infrastructure question here that feeds up from the hardware and the chips and this kind of stuff to like designing better chips for better inference and all this, and an RL question of like, you know, how well can you parallelize and all this. So I think we just need to compress the timelines, compress the timeframes, basically. Yeah. So if I'm, if I'm like an extremely big model and I'm running an agentic process, like how, how much am I hankering for like a middle-sized
Starting point is 01:42:32 model on a chip or like baked down into silicon that just runs super fast? Because it feels like that's probably coming. We saw that with the Bitcoin progression from CPU to GPU to FPGA to ASIC. Do you think we're at a good enough point where we can even be discussing that? Because I, every time I see like the latest Midjourney, I'm like, this is good enough. I just want it in two seconds instead of 20. But then a new model comes out. I'm like, I'm glad I didn't get stuck on that back. But yeah, like, how far away are we from, okay, it's actually good enough to bake down into silicon? Well, there's a question here of baking it down to silicon versus designing a chip, which is like very
Starting point is 01:43:13 suited to the architecture that you care about, right? And baking it down to silicon, unsure. Like, I think that's a bet you could take, but it's a risky one because the pace of progress is just so fast nowadays. And I really only expect it to accelerate. But designing things that make a lot of sense for the sort of transformers or architectures of the future should make a lot of sense. That's a big gap, though, transformers or architectures of the future. If we diverge, there's a lot of companies that are banking on the transformer sticking around. What is your view on transformer architecture sticking around for the next couple of years? I mean, look, they stuck around for five years, so they might stick around for a little while. But there's different, you think about
Starting point is 01:43:53 architectures in terms of this balance between memory bandwidth and FLOPs, right? One of the big differences we've seen here is Gemini recently had actually a diffusion model that they released. I was about to ask you. Yeah. Yeah. So diffusion is inherently an extremely FLOPs-intensive process, whereas normal language model decoding is extremely memory-bandwidth intensive. You're designing two very different chips depending on which bet you think makes sense. Yeah. And if you think you can make something that does FLOPs like four times faster and like four times cheaper than you otherwise could, diffusion makes more sense. So there's like this dance basically between the chip providers and the architecture.
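A back-of-the-envelope sketch of the memory bandwidth versus FLOPs balance described here, using a simple roofline-style comparison. The chip numbers (1e15 FLOP/s peak, 3e12 bytes/s of memory bandwidth) and the 7e9-parameter model are made-up illustrative values, and the model is deliberately crude: it counts only weight reads and treats a diffusion-style step as updating many token positions per weight read, ignoring KV cache and activation traffic.

```python
def step_intensity(n_params: float, positions_per_step: int = 1,
                   bytes_per_param: int = 2) -> float:
    """FLOPs per byte of weights read in one forward step.

    Assumes ~2 FLOPs (one multiply-add) per weight per token position,
    and that the weights are read from memory once per step.
    """
    flops = 2.0 * n_params * positions_per_step
    bytes_moved = n_params * bytes_per_param
    return flops / bytes_moved


def bound(intensity: float, peak_flops: float, mem_bandwidth: float) -> str:
    """Compare a workload's intensity to the chip's FLOPs-per-byte ridge point."""
    ridge = peak_flops / mem_bandwidth
    return "compute-bound" if intensity >= ridge else "memory-bound"


# Hypothetical accelerator, not any real product.
PEAK_FLOPS = 1e15
MEM_BANDWIDTH = 3e12

if __name__ == "__main__":
    decode = step_intensity(7e9, positions_per_step=1)        # one token at a time
    parallel = step_intensity(7e9, positions_per_step=1024)   # many positions per step
    print(decode, bound(decode, PEAK_FLOPS, MEM_BANDWIDTH))
    print(parallel, bound(parallel, PEAK_FLOPS, MEM_BANDWIDTH))
```

Under these assumptions, token-by-token decoding does about one FLOP per byte moved and sits far below the chip's ridge point, while a step that refines many positions at once is limited by arithmetic instead, which is why the two bets favor different chip designs.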
Starting point is 01:44:24 Yeah. Both trying to build for each other but also like build for the next paradigm. Yeah. It's risky. I don't know how much you've played with image generation, but do you have any idea of what's going on with images in ChatGPT? It feels like there's some diffusion in there. There's some tokenization, maybe some transformer stuff in there. It almost feels like the text is so good that there's like an extra layer on top almost and that it's almost like reinventing Photoshop. And I guess the broader question is like it feels like an ensemble of models.
Starting point is 01:44:56 Maybe the discussion around just agents and text-based LLM interactions shouldn't necessarily be transformer versus diffusion, but maybe how will these play together? Is that a reasonable path to go down? Well, I think pretty clearly there's some kind of like rich information channel. Even if there are multiple models there, there's like it's conditioning somehow on the, on the other model because we've seen before like, let's say when, you know, models use Midjourney to produce images, it's never quite perfect. It can't perfectly replicate what went in as an input.
Starting point is 01:45:27 It can't perfectly like adjust things. So there's a link somehow, whether that's the same model producing tokens plus diffusion. I don't know. Um, like yeah, can't comment on what I was doing there. Yeah. Yeah. Um, are, are there any other kind of like super wild card long shot, uh, research efforts that are maybe happening even out, even in academia where I mean, this was the big thing with, uh,
Starting point is 01:45:54 what was his name? Gary, uh, who's talking about, I forget what it was called, symbolic, symbol manipulation was a big one. And I feel like, you know, you can never count anyone out because it could come from behind and be relevant in some ways. But are there any other research areas that you think are, like, purely in the theory domain right now that are worth looking into or tracking that, you know, low probability, but high upside if they work.
Starting point is 01:46:24 That's all fun. This is a tough one. But I'll say it's not a symbolic thing. It's crazy how similar transformers are to systems that manipulate symbols. Sure. Like, what they're doing is they're taking a symbol and they're, like, converting it into a vector and then they're manipulating and moving stuff, like information around across them.
Starting point is 01:46:37 Sure. Like, this whole, like, debate that, oh, transformers cannot represent symbols, and they kind of do this, it's, it's, yeah, it's not real. So Gary Marcus is underrated or overrated, I guess? Overrated. Yeah, yeah. But, I mean, if you twist the, if you twist it so much, you wind up with saying, like, well, really, like, the, the transformer fits within that paradigm.
Starting point is 01:47:04 And so maybe it's, you know, like the rhetoric around it being a different path was maybe false the whole time. Yeah. Something like that. But as I remember that debate, it was really the, the idea of compute scaling versus almost like feature engineering scaling. And will the progress scale with human hours or GPUs essentially? And that has a very different economic equation. And it feels like there's been some rumblings about maybe with a data wall we'll shift back to being human labor bound.
Starting point is 01:47:43 But do you think that there's any chance that that's relevant in the future? Or is it just algorithmic progress married with bigger and bigger data centers in the future? So I'm pretty bitter-lesson-pilled in the sense that I do think removing as many of our biases and our like clever ideas from the models is really important. Just like freeing them up to learn. Now, obviously, there is clever structure that we put into these models such that they're able to learn in this extremely general way. But I am more convinced that we will be compute bound than we will be human researcher bound on this kind of thing. Like, we're not going to be feature engineering and this kind of stuff.
Starting point is 01:48:23 Sure. We're going to be trying to devise incredibly flexible learning systems. Yeah, that makes sense. On the scaling topic, part of my, like, worry is that the OOMs get so big that they turn into
Starting point is 01:48:40 these mega projects that are that at a certain point you're bound by the laws of physics because you have to move the sand into silicon chips and you have to dig up the silicon and it's a certain sand. Yeah there's only so much sand and like the math gets really really crazy just for the amount of energy required to
Starting point is 01:48:58 move everything around to make the big thing. Where are you on how much scale we need to reach AGI, whether or not we will see like the laws of physics start acting as a drag on progress? Because it certainly feels exponential. We're feeling the exponentials, but a lot of these turn into sigmoids, right? So I think we've got what, like two or three more OOMs before it gets really hard. Leopold has this nice table at the end of his Situational Awareness where I think like 2028 or something is when, under really aggressive timelines, you get to 20% of US energy production. It's pretty hard to go exponentially beyond 20% of US energy production.
Starting point is 01:49:41 Now, I think that's enough. Every indication I'm seeing says that's enough. Now, there might be some complex data engineering, raw engineering, this kind of stuff that goes into, there's still a lot of algorithmic progress left to go. But I think that with those extra OOMs, we get to basically a model that is capable of assisting us in doing research and software engineering. Yeah, which is the beginning of the self-reinforcement. Yeah. Interesting. Is that just a coincidence?
Starting point is 01:50:11 Like, this feels like one of those things, this feels like one of the things where like the moon is the exact same size as the sun in the sky. It's like, oh, it just happens that AGI happens within this time. Like, have you unpacked that any more, because it feels convenient? Not to, you know, I know less about this. There's a lot of weird conveniences. It's a good sci-fi story, let's say. Totally. You know, we've got, you know, Taiwan in between China and the U.S.
Starting point is 01:50:34 and it produces the most valuable material in the world. It's locked between the two. Incredible plot. Yeah. Yeah. Really bad for the people that don't think of, that don't believe in simulation theory. It really feels like how this is scripted. It's fascinating.
Starting point is 01:50:49 Talk to me more about, um, getting to an ML engineer in AI and kind of that reinforcement. I imagine that you're using AI code gen tools today and Anthropic is broadly and everyone is. But what are you looking for and what's the shape of the spiky intelligence? Where do they fall flat and what are you looking to kind of knock down in the interim before you get something that's just like go? Yeah. So I mean we definitely use them. The other night, I like, I was a bit tired. I asked it to do something and just sat watching it in front of me working for half an hour. It was great. It was a truly weird experience,
Starting point is 01:51:26 particularly when you look back a year ago and we were still copy pasting stuff between a chat window and a code file. I like METR's evals for this kind of stuff. So they have a bunch of evals where they measure like the ability to write a kernel, the ability to run a small experiment and improve a loss, and they have these nice progress curves versus humans. And I think this is maybe the most accurate reflection of like what it will take for it to really help us at doing progress. And there's a mix here. Like, where they're not so great at the moment is like large-scale distributed systems engineering,
Starting point is 01:51:59 right? Like debugging stuff across heaps and heaps of accelerators and like the way the feedback loops are slow. And you're actually like if your feedback loop is like an hour, then it's what you spending the time on doing something. Yep. Feedback is 15 minutes. And for context there, the hour-long feedback loop is just because you have to actually compile and run the code across everything.
Starting point is 01:52:20 It takes that long. You, like, spin up all your machines, or you need to run it for a while to see if something's going to happen. Like at that point in time, you're still cheaper than the chips. Sure. And so it's better that you do it. But for things like your kernel engineering, or actually even just understanding these systems, it incredibly helps.
Starting point is 01:52:40 Like one thing I regularly do at the moment is in parts of the code base in like languages that I'm unfamiliar with or stuff like this, I'll just ask it to rewrite the entire file but with comments on every line. Game changing. Or it's like, comb through like thousands of files and explain how everything interacts to me, draw diagrams,
Starting point is 01:52:58 this kind of stuff. It's really, yeah. Yeah, how important is a bigger context window in that, in that example you gave? That feels like something that's important. And yet, I just naively, like, Google's the one that has the million token context window. I imagine that all the other Frontier Labs could catch up, but it seems like it hasn't been as much of a priority
Starting point is 01:53:15 as maybe like the PR around it sounds like. Is that important? Should we be driving that up to like a trillion token window? Is that just going to happen naturally? There's a nice plot in the Gemini 1.5 paper where they show the like loss over tokens as a function of context length. And they show that the loss goes down quite steeply, actually. As you put more and more and more like of a code base in the context, you get better and better and better at predicting the rest. Yeah, that makes sense.
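A rough sketch of why context length carries a cost: in a standard transformer, the key/value cache kept per request grows linearly with the number of tokens in context. The formula below is the usual one (keys plus values, per layer, per KV head); the model shape used in the example is a made-up configuration, not any particular lab's model.

```python
def kv_cache_bytes(n_layers: int, n_kv_heads: int, head_dim: int,
                   seq_len: int, bytes_per_value: int = 2) -> int:
    """Bytes of KV cache for one sequence.

    The leading 2 counts keys and values; bytes_per_value=2 assumes 16-bit storage.
    """
    return 2 * n_layers * n_kv_heads * head_dim * seq_len * bytes_per_value


# Hypothetical model shape, chosen only for illustration.
SHAPE = dict(n_layers=32, n_kv_heads=8, head_dim=128)

if __name__ == "__main__":
    for tokens in (8_000, 128_000, 1_000_000):
        gib = kv_cache_bytes(seq_len=tokens, **SHAPE) / 2**30
        print(f"{tokens:>9} tokens -> {gib:.2f} GiB of KV cache per request")
```

For this made-up shape the cache costs 128 KiB per token, so a single million-token request would hold roughly 122 GiB of accelerator memory that could otherwise serve other requests.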
Starting point is 01:53:40 The context length, it's a cost. You know, the way transformers work is that there's, you know, you have like this memory, the KV cache, that is proportional to how much context you've got. And so you can only fit so many of those into your various chips and this kind of stuff. And so a longer context actually just costs more because you're taking up more of the chip and you're sort of like, you could have otherwise been doing other requests, basically. So bringing it back to the custom silicon, is that a unique advantage of the TPU? Is that something that Google has thought about and then wound up putting themselves in this
Starting point is 01:54:14 advantaged position? Or is it a durable advantage even? Yeah, so TPUs are good in many respects, partially because you can connect hundreds or thousands of them really easily across really great networking, whereas only recently has that been true for GPUs. With NVLink? Yeah, with NVLink and the NVL72 stuff. So it used to be like eight GPUs in a pod,
Starting point is 01:54:34 and then you connect them over worse interconnect. And now you can be 72 and then it breaks down. With Google TPUs, you can do like 4,000, 8,000 over a really high bandwidth interconnect in one pod. And so that is helpful for things like just general scaling in many respects. I think it's doable across any chip platform, but it is an example of like somewhere that being fully vertically integrated
Starting point is 01:54:58 is a benefit. Yeah, that makes sense. Talk to me about ARC-AGI. Why is it so hard? It seems so easy. It does seem easy, doesn't it? Well, it certainly seems like more evaluatable than "tell me a funny joke," right? Yeah, yeah.
Starting point is 01:55:14 And I mean, I think if you RL'd on ARC-AGI, then you'd probably get superhuman at it pretty fast. But I think we're all trying not to RL on it so that it functions as like an interesting held-out. Sure. Okay. Is that just an informal agreement between all the labs, basically? Yeah, we try and have a sense of honor between us.
Starting point is 01:55:33 That's good. Sense of honor. That's amazing. How many people on Earth do you think are getting the full potential out of the publicly available models? Because we're now at a point where, you know, a billion-plus people are using AI almost daily. And yet my sense would be it's maybe like 10,000, 20,000 people on the entire planet who are getting that sort of full potential.
Starting point is 01:55:55 But I'm curious what your assessment would be. Yeah, I completely agree. I mean, I think that even I don't get the full potential out of these models often. And I think as we shift from you're asking questions and it's giving you sensible answers to you're asking it to go do things for you that might take hours at a time, and you can really like parallelize and spin that up, we're going to hit like, yeah, another inflection point where even fewer people are like really effectively using these things, because it's basically like StarCraft or Dota. Like it's going to be like your APM of like managing all these agents. And that's totally process. Yeah. So I think StarCraft is such a good
Starting point is 01:56:31 example. You think you're just absolutely crushing it, and then you realize like there's an entire area of the map you're just getting destroyed on. It's such a good comp. That's great. Anything else? Um, I think that's it on my side. I mean, I would like this to be an evolving conversation. Yeah, this is fantastic. We'd love to have you back and keep chatting. Absolutely. It's really fun. Yeah, we'll talk to you soon. Cheers, Sholto. Have a good one.
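
The point made earlier in the conversation, that the KV cache grows in proportion to context length and so a longer context leaves room for fewer requests on a chip, can be sketched with some rough arithmetic. This is only an illustration: the layer count, head count, head dimension, 2-byte values, and the 16 GiB memory budget are all hypothetical numbers chosen for the example, not figures for any model or chip discussed in the episode.

```python
def kv_cache_bytes(context_len, n_layers=32, n_kv_heads=8,
                   head_dim=128, bytes_per_value=2):
    """Approximate KV cache size for one request.

    All default model dimensions are hypothetical. The factor of 2
    accounts for storing one key and one value vector per token,
    per layer, per KV head.
    """
    per_token = 2 * n_layers * n_kv_heads * head_dim * bytes_per_value
    return per_token * context_len


def max_concurrent_requests(memory_budget_bytes, context_len):
    """How many requests of a given context length fit in the budget."""
    return memory_budget_bytes // kv_cache_bytes(context_len)


if __name__ == "__main__":
    budget = 16 * 2**30  # hypothetical 16 GiB set aside for KV cache
    for ctx in (8_192, 32_768, 131_072):
        gib = kv_cache_bytes(ctx) / 2**30
        fit = max_concurrent_requests(budget, ctx)
        print(f"context={ctx:>7}  cache={gib:.0f} GiB  requests that fit={fit}")
```

Under these assumed numbers, the cache size scales linearly with context, so quadrupling the context length cuts the number of requests that fit in the same memory by a factor of four, which is the cost being described.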
