TBPN Live - Elon Musk vs. Donald Trump, AI Day | Shaun Maguire, Mark Chen, Sholto Douglas, Jack Whitaker, Aarush Selvan, Michael Mignano, Oliver Cameron, Delian Asparouhov

Episode Date: June 5, 2025

(02:28:34) - Skip to Elon Musk vs. Donald Trump Reactions (17:15) - Shaun Maguire. Shaun is a partner at Sequoia Capital and discusses the resilience and innovation at X and XAI, highlighting the successful integration of Grok into X despite initial skepticism about the platform's stability. He compares the evolution of foundation models to operating systems, predicting a diverse ecosystem with both proprietary and open-source models, where open-source models may have broader deployment but less value capture. Maguire emphasizes the importance of early market capture and anticipates significant moats for foundation model companies due to hardware investments and application layer advantages. He also notes the rapid revenue scaling of companies like Starlink, surpassing previous benchmarks set by AWS, and underscores the necessity of a diversified energy strategy, advocating for increased natural gas, oil, solar, and nuclear energy to meet future demands. (31:55) - Jack Whitaker. Jack is an AI expert and entrepreneur with a PhD from Cambridge University, specializing in generative AI, large language models, and multimodal systems. In the conversation, he discusses the current landscape of AI development, highlighting OpenAI's dominance in both product distribution and research, and noting Anthropic's strong position among developers. He also touches on the challenges of model naming conventions, the role of data in AI advancements, and the varying strategies of companies like Google, X.ai, and Meta in the evolving AI ecosystem. (50:58) - Aarush Selvan. Aarush is a Product Manager at Google who leads the Gemini Deep Research project, which enables Gemini to act as a personal research assistant.
In the conversation, he discusses the development of Deep Research, highlighting its ability to generate comprehensive reports by leveraging long context windows and reasoning models, and emphasizes the importance of balancing efficiency with the depth of information provided to users. (01:04:45) - Oliver Cameron. Oliver is the co-founder and CEO of Odyssey. He discusses his transition from leading self-driving car initiatives to pioneering AI-driven storytelling. He introduces Odyssey's latest innovation, "interactive video," an AI-generated medium that allows real-time interaction without traditional game engines, envisioning it as a new form of entertainment. Cameron highlights the potential of this technology to revolutionize content creation by enabling models to generate film and game-like experiences instantly, reducing production costs and time. (01:19:18) - Michael Mignano. Michael is a Partner at Lightspeed Venture Partners and co-founder of Anchor, and discusses the evolving dynamics between AI foundation labs and application layer startups, highlighting the shift from a symbiotic relationship to direct competition. He emphasizes the growing importance of unique data contexts, noting that models are increasingly seeking novel information, which prompts labs to compete directly with startups possessing such data. Mignano also suggests that this trend may drive startups back to established incumbents like Google and Amazon, as they might be perceived as more reliable partners in the AI ecosystem. (01:31:44) - Mark Chen. Mark is OpenAI's Chief Research Officer. He discusses the evolving landscape of AI research, emphasizing the shift from large-scale pre-training to enhanced reasoning capabilities. He highlights the importance of reinforcement learning (RL) in developing autonomous agents and the challenges of scaling RL effectively. 
Chen also addresses the significance of interpretability in AI systems, advocating for models that transparently convey their reasoning processes to ensure reliability and user trust. (02:00:59) - Sholto Douglas. Sholto is a researcher at Anthropic. He discusses the challenges and advancements in scaling reinforcement learning (RL) within artificial intelligence. He highlights the significant gains achieved by increasing compute resources in RL, noting that a tenfold increase still yields linear improvements. Douglas also addresses the complexities of reward hacking, emphasizing the need for careful guidance to align AI behaviors with human values. (02:28:34) - Breaking News: Elon Musk vs. Donald Trump (02:35:00) - Delian Asparouhov. Delian is the co-founder and president of Varda Space Industries and a partner at Founders Fund. He discusses the recent policy shifts in NASA's budget, particularly the reallocation of funds to the Space Launch System (SLS) program, which had been advocated for cancellation by figures like Jared Isaacman and Elon Musk. He highlights the immediate consequences of this decision, including SpaceX's announcement to decommission its Dragon spacecraft, leading to a lack of vehicles capable of servicing the International Space Station. Asparouhov also reflects on the unprecedented nature of the current dynamics between influential private sector leaders and the U.S. government, noting the escalating tensions and their potential impact on the future of space exploration. 
TBPN.com is made possible by:
Ramp - https://ramp.com
Figma - https://figma.com
Vanta - https://vanta.com
Linear - https://linear.app
Eight Sleep - https://eightsleep.com/tbpn
Wander - https://wander.com/tbpn
Public - https://public.com
AdQuick - https://adquick.com
Bezel - https://getbezel.com
Numeral - https://www.numeralhq.com
Polymarket - https://polymarket.com
Attio - https://attio.com
Follow TBPN:
https://TBPN.com
https://x.com/tbpn
https://open.spotify.com/show/2L6WMqY3GUPCGBD0dX6p00?si=674252d53acf4231
https://podcasts.apple.com/us/podcast/technology-brothers/id1772360235
https://youtube.com/@technologybrotherspod?si=lpk53xTE9WBEcIjV

Transcript
Discussion (0)
Starting point is 00:00:00 You're watching TBPN. Today is Thursday, June 5, 2025. We are live from the TBPN Ultra Dome, the Temple of Technology, the Fortress of Finance, the capital of capital. We got to work on that because we're working on selling the naming rights, baby. This place is going to be branded. We're going to sell the windshield.
Starting point is 00:00:22 Exactly. We're selling the windshield. Selling the windshield. So we got it. We got to keep growing the intro. But we have a massive day today, a little bit of an AI day today. We got folks from Google, OpenAI, and Anthropic. We got the former CEO of OpenAI coming on. Sequoia, Stanford, Google, X investors. And we got pretty good coverage. We hit almost everything.
Starting point is 00:00:47 Should go on a whirlwind tour of what's going on in artificial intelligence. I'm excited to dig into the state of affairs in the foundation model race. We're going to go through the tier list of what companies in AI have the mandate of heaven. We're also going to go through some of the deep research products and hopefully get into some of the more
Starting point is 00:01:07 cutting-edge use cases for AI. So we have both some deep research folks coming on, and then we also have some folks that are working on video generation and video game generation and a lot of different applications. We're gonna cover the Granola story, we're gonna cover what's going on with Windsurf, and so it should be a great day. But let's run through some news just to keep everyone up to
Starting point is 00:01:28 speed before Shaun Maguire joins in 13 minutes. So first off, Ramp. Time is money. Save both. Easy-to-use corporate cards, bill payments, accounting, and a whole lot more, all in one place. Circle went public. The CEO's coming on the show tomorrow. That's very exciting. And Jordy, you have the news. I will read a little bit from Jeremy, the CEO. I'm incredibly proud and thrilled to share that Circle is now a public company listed on the New York Stock Exchange under Circle. Brian Armstrong posted, congrats to Jeremy and the entire Circle team on your IPO and reaching 30 trillion in lifetime USDC volume. Let's hit the gong.
Starting point is 00:02:07 The big T. You got it. Incredible. You made it on for 30 trillion. The stock is up massively. Yeah. It was priced at $31. It is trading at around 85 as of now.
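As an aside, the IPO pop they mention is easy to sanity-check. A minimal back-of-envelope sketch, using only the figures stated on air (priced at $31, trading around $85), not verified market data:

```python
# Back-of-envelope check of the Circle IPO pop discussed above.
# Figures are as stated in the conversation, not verified market data.
ipo_price = 31.0      # priced at $31
current_price = 85.0  # "trading at around 85 as of now"

pop = (current_price - ipo_price) / ipo_price
print(f"Pop over IPO price: {pop:.0%}")  # → 174%
```

Roughly a 174% gain over the offer price, which is the kind of pop Bill Gurley objects to as money left on the table.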
Starting point is 00:02:22 That's fantastic. We love to see it. Again, as many people would expect, Bill Gurley is very unhappy with that. Inefficient pricing. He hates a stock pop after the IPO. It's good for the IPO window,
Starting point is 00:02:38 which we always want to be open. For sure, for sure. And Anduril executive chairman Trae Stephens tells Ed Ludlow that the company has closed a new funding round of 2.5 billion in a deal that more than doubles the defense startup's valuation to 30.5 billion. This is from Bloomberg TV. Congratulations to the Anduril team on the massive up round.
Starting point is 00:03:00 You love to see it. We gotta hit the gong for Anduril. Let's do it again. We have the gong. Good contact, good contact. Very exciting. Bunch of news from Kevin Weil over at OpenAI. We will be digging into this with Mark Chen today
Starting point is 00:03:16 when he joins, but Deep Research can now search across GitHub, Google Docs, Gmail, Google Calendar, so you can integrate everything and it can do research on your files. That's going to be a lot of fun to talk about. This is also potentially threatening for Granola. And so we're talking to a Granola investor about what the reaction will be, where the direction of that company might change or not.
Starting point is 00:03:38 But if you're designing a tool for artificial intelligence or one of these products or any tool, really, go to figma.com. Think bigger, build faster. Figma helps design and development teams build great products together. Go to figma.com. It is the backbone of the TBPN brand. It is, it is.
Starting point is 00:03:55 And we would not be able to make the show without it. Yes, and so we are going to be digging into, today is obviously artificial intelligence day in some ways. We're digging into artificial intelligence today. We're also very interested in talking about VR and content and augmented reality. And then there's a story in the Wall Street Journal today about Meta being in talks, not advanced talks,
Starting point is 00:04:17 just regular talks, regular old talks. But they're talking to Disney, they're talking to A24 about content for a new VR headset. And this is my number one question about VR. Everyone's looking for the iPhone moment of VR. When the iPhone debuted, what was it? It was first and foremost a phone. It replaced your phone. And so I've always thought that the path to true VR adoption was just saying, we're only going after your TV.
Starting point is 00:04:46 The next generation of 20, 22 year olds, when they get to college or post college and they're in their first apartment, they are just not going to buy a big flat screen TV, because we have solved that specifically with VR. Exactly, or you know, the guy with multiple monitors, the three monitor setup. The production team back there, we can go to the production cam,
Starting point is 00:05:05 show you all of the different monitors that they have. What if they could be wearing VR headsets? They could have seven monitors one day. 10 monitors, let's give it up for the production team. Let's give it up for them. Let's give it up for them. Love to see it. So the idea of a drink cam.
Starting point is 00:05:20 We got the drink cam here, cheers. Thank you to Matt, thank you to Andrew Huberman for inventing drinking things, and the whole team over at Mateina. The other question, and so, this idea of doing one thing really, really well before going into, you know, trying to do a little bit of everything, the platform. Exactly, you don't get to be a platform unless you solve one thing really, really well. The iPhone wasn't a platform.
Starting point is 00:05:48 The first iPhone was not a platform. Didn't have an app store, right? It just had the ability to listen to music. It was an iPod. It was a phone. And it was an internet communicator. Just a web browser. And so Meta is looking to Hollywood for exclusive immersive video for its premium device.
Starting point is 00:06:06 Now, my kind of hot take here is that- Belsky's cooking. I know, I know. All of our boys are coming together. We're cooking up something amazing. I'm really excited. We're not gonna be able to get that much information on this soon, but I don't even know
Starting point is 00:06:20 if they need that much immersive content. I think a lot of it is just, hey, every single meta headset should just ship with the Matrix pre-installed for free. It's like, how much would that possibly cost? It's an extra $2 to rent or something. You could have it pre-installed. So it's just like, you can put it on
Starting point is 00:06:34 and there's like 10 movies that are pre-loaded. You can just watch movies and the movies are great. And you're in a really nice theater and it just comes pre-installed and it's all great. Because my Vision Pro experience was very much film driven. I'd go a step further, and I think they would have to basically create an entire catalog. Yeah, I don't know if The Matrix by itself is gonna be enough of a draw to say,
Starting point is 00:06:54 I'm gonna spend hundreds of dollars, for the average person. No, no, no, it's more about like the pre-installed apps. So like the iPhone was a really good phone. No, but I'm saying a really good iPod. But it also came with like a calculator app that was like decent. And so you need a few of these things that are just like really easy to access, really easy to pull off the shelf. And ultimately I think Meta needs to catch up to Apple in terms of the Apple TV movie store and making sure that
Starting point is 00:07:21 like all the streaming providers are really on there in a valuable way. Obviously, it's important to go to immersive eventually, but I think the path to immersive might be just, wow, I have a home theater in my studio apartment now. Anyway, we'll dig into this more. We'll talk to more people. Maybe I'm wrong, who knows?
Starting point is 00:07:41 But the high points from this article in the Wall Street Journal are as follows. Meta is seeking exclusive content from Hollywood for its upcoming premium VR headset Loma set to rival Apple's Vision Pro. I'm super excited for this new VR headset. I think it's going to look fantastic. I think the resolution is going to be insane. Obviously, there's a lot of focus on augmented reality and Orion and AI and
Starting point is 00:08:02 glasses, but there's still so much work to do just to bring VR into just a normal consumer experience. And what's interesting is that I think Apple really broke the seal on, like, yeah, people are used to paying $1,000 for a phone and $2,000 for a computer and maybe $4,000 for a headset. Isn't that crazy? So, Meta's been hanging out
Starting point is 00:08:25 in like the three, four, 500 range. If you take the reins off and say, hey, yeah, yeah, it's fine to spend $1,000 on this thing, you can get something really, really interesting. And so Meta is offering millions for video based on well-known IP, aiming to attract users to its VR device launching next year.
Starting point is 00:08:42 Now the big question is, how long will these immersive videos be because Apple did do a bunch of these deals. They did license a bunch of interactive video products but they were all like five minute experiences. And so you get through them all and then you'd wait a full quarter and Apple would be like, we have another one, seven minutes.
Starting point is 00:09:02 Here's five new minutes of entertainment. It's like, that's not how people experience entertainment. I remember like- Yeah, yeah, just think of it. A lot of people are used to being entertained by their iPhone for four hours a day. Yes. Often through video.
Starting point is 00:09:15 And it was, but that's not even an iPhone thing. You go back a few decades to like the original PlayStation had Final Fantasy VII on it. It came on multiple discs and that game, people would play it for a hundred hours. Metal Gear Solid was a similar, like dozens of hours of gaming and no one's really been able to deliver that in VR
Starting point is 00:09:36 known as Loma. It's more powerful than the Meta Quest VR headsets now available, with higher fidelity video. Let's hear it. They got the screens done. They pulled them forward off the benchtop. The design is similar to a large pair of eyeglasses, more like Meta's Ray-Ban AI glasses than the goggles that the Quest and Vision Pro use, connected to a puck that users can put in their pockets.
Starting point is 00:10:04 So maybe they're going puck, which is interesting because that was very contrarian. And everyone was like, Steve Jobs would have never let this happen. And Palmer came out and said, no, the puck is great. You don't want heavy things on your face. That's just not a good experience. And so Meta is planning to charge less than a thousand
Starting point is 00:10:21 but more than 300. And so I would say 999 is probably the right price. I want it to be sort of expensive so it can be a great product. A Meta spokesman referred to the comments by Meta chief technology officer Andrew Bosworth about the company working on many prototypes, not all of which go into production. Meta is working with Avatar director James Cameron's Lightstorm Entertainment on exclusive VR content. The two companies announced a partnership last year. So I think we got to get on this. We got to have a VR stream. Yeah, three hours of content every day.
Starting point is 00:10:53 We're gonna be the reason churn is low on the next VR headset. Yeah, you just throw this thing on. It's just like you're sitting here on the drink cam. You can click through, I mean, yeah, just click through all the different angles. It's pretty doable. It's pretty doable. I mean, you can film usable spatial video on just an iPhone now, and then you can play that back in the Vision Pro. And it does look 3D, which is cool.
Starting point is 00:11:18 In other news, the Wall Street Journal is reporting that Reddit is suing Anthropic, alleging unauthorized use of the site's data. The online discussion forum says Anthropic accessed the site more than 100,000 times after saying it had stopped. Reddit is suing Anthropic. And Anthropic disputes this.
Starting point is 00:11:37 I'm sure we won't be able to get into this today, because I'm sure it's caught up in the courts, and there's a whole bunch of legal restrictions. But we'll do our best to understand how these deals come about. It seems like most of the time, it's not that the company that has the data doesn't want the AI company to use their data.
Starting point is 00:11:53 They just want to have an equitable agreement where everyone is getting the most value. And I think Reddit's stock surged, right? Yeah, and then for more context, OpenAI is already paying Reddit approximately 70 million per year in a content licensing agreement. So they kind of got ahead of this issue and decided to strike up an actual deal.
Starting point is 00:12:17 And I believe Google has a deal with them too. And I think this might be one of the- Google has a deal with Reddit? Yeah, I'm pretty sure, because there's that meme about how the best way to search Google is to search whatever your search term is and then space, reddit, because the user generated content was better than the SEO stuff. Google pays Reddit approximately 60 million per year. So 60 and 70, they're getting a hundred and thirty million.
Starting point is 00:12:38 That's pretty serious revenue, and it's something that doesn't need to be brokered via a bunch of individual programmatic ads that might not work or might be subscale. It's just one or two deals and boom, you're up in the hundreds of millions of dollars in revenue. What is Reddit's overall annual revenue? $20 billion market cap. Okay, not bad. Let me see here.
Starting point is 00:13:03 How do they compare to Condé? They did 1.3 billion greenbacks in 2024. 1.3 billion, so they're getting like 10% of their revenue. Yep. I wonder how big. They grew 60% over 2023. Interesting.
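The licensing math the hosts work through here can be sketched quickly. A back-of-envelope check using only the figures stated in the conversation (roughly $70M/year from OpenAI, $60M/year from Google, and about $1.3B of 2024 Reddit revenue):

```python
# Back-of-envelope share of Reddit revenue coming from the two
# licensing deals discussed above; all figures as stated on air.
openai_deal = 70e6           # ~ $70M/year (OpenAI deal)
google_deal = 60e6           # ~ $60M/year (Google deal)
reddit_revenue_2024 = 1.3e9  # ~ $1.3B total 2024 revenue

licensing_total = openai_deal + google_deal
share = licensing_total / reddit_revenue_2024
print(f"${licensing_total / 1e6:.0f}M in licensing, {share:.0%} of revenue")
# → $130M in licensing, 10% of revenue
```

So the two deals together come to about a tenth of Reddit's stated 2024 revenue, which is why the hosts call it serious money relative to programmatic ads.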
Starting point is 00:13:20 So they might be bigger. They might be bigger than Reddit or they might be bigger than Conde Nast, which at one point owned them. It's kind of unclear how valuable Conde Nast is because they're private. Anyway, so Reddit said that the AI company unlawfully used Reddit's data for commercial purposes
Starting point is 00:13:39 without paying for it and without abiding by the company's user data policy. Anthropic has in fact intentionally trained on the personal data of Reddit users without ever requesting their consent, the complaint said. Ah, interesting, saying that it's about the users. Yeah. They never opted in.
Starting point is 00:13:54 Anthropic bills itself as the white knight of the AI industry. Last year, Reddit took steps to try and limit unauthorized scraping of its website, creating a public content policy for its user data that is publicly accessible, such as posts on subreddits, and updating code on its back end. The user policy includes protections for users, such as ensuring that deleted posts and comments aren't included in data licensing agreements. And so, yeah, I don't think that there's a really strong precedent for the agentic web.
Starting point is 00:14:22 Like, if I use Google Chrome to access a website, Chrome doesn't need to pay any sort of license, but if I go to Anthropic and say, hey, get me up to speed on this topic, and it goes out and it browses the web, all of a sudden it feels like maybe they do have to pay, whereas Chrome wouldn't, because it's just rendering the webpage
Starting point is 00:14:43 and it's not transforming it at all. What counts as transformative? What's fair use? And so these things will obviously play out in the court of law. And so hopefully they can resolve it quickly and move on. Yeah, I'm actually surprised that they didn't already have a deal in place, because it's very valuable data. You want that data for your models. And anyways.
Starting point is 00:15:01 Well, we have Shaun Maguire joining in just a minute. And the other news in the Wall Street Journal today is Thrive Holdings is betting that AI can change IT services. The company, established by venture capital firm Thrive Capital, joined with ZBS to invest $100 million into an entity that will integrate AI into IT firms. This is from Josh Kushner, of course. Shield Technology Partners has already acquired four IT service companies: Clearfuse Networks, IronOrbit, Delvol Technology Solutions, and OneNet Global.
Starting point is 00:15:42 Thrive Holdings calls Shield Technology Partners an AI-enabled managed IT service platform. IT service companies, also called managed service providers or MSPs, typically provide IT support and manage tools like software and cloud computing on behalf of businesses. Founded by Josh Kushner about 15 years ago, Thrive Capital is known for some of its high-flying startup investments including OpenAI, Databricks, and Wiz. What a portfolio. Investing in traditional services businesses,
Starting point is 00:16:10 particularly those that rely heavily on administrative knowledge work, and adding AI to supercharge them, is becoming a bit of a trend. As part of its efforts, Shield Technology Partners will embed software engineers into each of its IT portfolio businesses. Oh, they're doing the forward deployed engineer. The engineers' goal is to build an AI-driven solution that all of the portfolio companies will use. We've studied all the ways in which MSPs have perhaps
Starting point is 00:16:35 been on their back foot to date with customers, it says, and that IT services work is incredibly well suited to what AI can streamline. And so you can imagine a whole bunch of agentic workflows for all the different things that you need to do when you're deploying cloud, when you're managing cloud. Really quickly, before we have our next guest, let's tell you about Vanta. Automate compliance, manage risk, and prove trust continuously. Vanta's trust management platform takes the manual work out of your security and compliance process and replaces it with continuous automation. Whether you're pursuing your first framework or managing a complex program, if you think you
Starting point is 00:17:06 should be on Vanta, you probably should. Probably correct. Well, we have Shaun Maguire from Sequoia Capital in the studio. Welcome to the show, Shaun. How are you doing? Boom. What's up, team? Never a boring day on the internet. That's for sure. Yeah, what's keeping you busy, man? What's keeping you up now? Well, I think anyone on Twitter knows what I'm talking about.
Starting point is 00:17:42 Yeah, I mean, let's skip the politics, because this is purely a technology and business show. Thank God. I love you guys, stick to the technology. Well, I mean, we've had an interesting experience with X in that there's always been this narrative that like the whole platform was gonna collapse. You know, there's been rough days here and there, but overall things have been growing. What have you seen across the X, XAI merger? What are the secrets to success? How is, you know, talent tracking?
Starting point is 00:18:07 Is any of like the chaos and noise distracting? Because when I talk to XAI engineers, they're like, we're too busy. We can't come on your show. But what's your experience been with the X and XAI team recently? Look, if you go back in time, as you said, everyone said it was going to fail. The app would crash. Nothing would happen. And that didn't play out.
Starting point is 00:18:33 But there was a lot of tech debt and broken infrastructure. And there was a couple of years of rebuilding the basics and foundations. I think we're starting to see real innovation happening. I love the Grok integration directly in X. It always scares me when someone, when I have a tweet or whatever and then someone says like, at Grok, is this correct?
Starting point is 00:18:57 Is this accurate? You never know what's gonna come back. Usually I agree with Grok, there's been once or twice where I think some of the subtleties are a little off. It's truth seeking. It doesn't mean that it's fully truthful every time. Yeah. Yeah, it hasn't actually found that ground truth every single time.
Starting point is 00:19:17 That's funny. I'm probably wrong. Yeah. What about the overall horse race between the foundation models? It seems like every day it's going back and forth between an OpenAI launch, an Anthropic launch, a Grok launch, a Google launch.
Starting point is 00:19:35 Do you think that continues? Do you think there's maybe some fragmenting and there's opportunity? I mean, we're kind of already seeing this with how much Anthropic's loved by developers versus OpenAI, which has been really dominant on the consumer side. And now every company is figuring out a different way to actually get to distribution. What really matters here?
Starting point is 00:19:54 Is it pure scale? Is it pure cracked engineering talent? Is it distribution? Is it a combination of those things? How are you seeing it play out? Great question. I, you know, honestly, my opinions have changed a lot over the last few years and in many directions. And so I don't have too much confidence in my assessment right now, but the, you
Starting point is 00:20:16 know, I always try to look at lessons from the past, and my current thinking is that the closest analogy is operating systems. And I'll make a couple of points on this. If you think about operating systems, first of all, there's a bunch of different ecosystems. There's the Windows ecosystem, there's the Apple OS ecosystem, then on mobile there's, you know, Android, there's, you know,
Starting point is 00:20:43 the whole browser environment with Chrome, and then there's open source with Linux. One thing that I think is interesting about Linux: there's more Linux servers in the world than there are Microsoft servers, but the value capture of Microsoft is way greater than Linux. I personally think we're gonna see something very similar play out, where there'll be like a, you know,
Starting point is 00:21:11 OpenAI will be the Apple or someone, and XAI I think will be very successful. I think there's a good chance that Anthropic is independent and successful. I also think there'll be a big open source component, which should be like Linux. And I think there'll probably be 10 to a hundred times as many open source models out there,
Starting point is 00:21:34 or like deployments of open source models in 10 years. But I think that they won't be as valuable and they won't be like as rich of ecosystems. And then just to make two more points on the open source analogy, like for Microsoft, or Apple, by having the operating system, they were able to actually win in quite a few ways on the application layer as well. For Windows, they bundled in Word and Excel and then Outlook and all these other things.
Starting point is 00:22:04 I think it'd be very similar for the foundation model companies. I think that the foundation models would be like table stakes. That'll be their kind of win, but also a very sticky moat. And even if they're not the most profitable businesses themselves, it will give them big advantages kind of on the application layer. And then one other thing that I think will happen, you know, the cloud companies have giant moats just through the CapEx dynamics of cloud, like needing to buy all this hardware and, you know, innovate with hardware and stay there is a big moat. I think these foundation model
Starting point is 00:22:41 companies are going to be, I think there's going to be way more value that accrues, and there'll be way bigger moats than people realize. I think they will all basically have hardware moats, like cloud-style hardware moats. They will have the operating system style, very, very detailed research that's hard for anyone to replicate. And then I think they'll probably make a lot of their profit from applications on top of it. That's my current thinking. Thinking can change. And so, yeah. So I know, obviously, you weren't investing during the original operating system boom, but your firm Sequoia Capital was. Have you had discussions with the kind of lineage of the firm, or the history, and seen how the revenue ramp or the business scaling is different this time than, say, in the dot-com era or in the previous era?
Starting point is 00:23:37 It feels like it's ramping faster than ever. It feels like we're seeing more companies that are hitting a billion in revenue, or a hundred million, faster than ever. But is that real, based on anyone that you've talked to that was investing in that era? Does it feel different this time around, do you think? Yeah. I mean, one of the beautiful things about being at Sequoia is we do have this long history and we get to tap into that kind of institutional knowledge. That said, sadly, Don Valentine died four or five years ago, like early into my time. RIP, what an absolute legend, you know, and he led the original Apple investment. But there's still a lot of Google institutional knowledge in the firm, which is, you know, not
Starting point is 00:24:21 directly operating in September, but they created an operating system later. I mean, first of all, the revenue of these companies is scaling insanely just faster than any products in history before for Starlink. So obviously not a foundation model company, but I basically made internally, I made an Excel spreadsheet of AWS's revenue growth like in the first 20 years of AWS compared to Starlink. And you know, Starlink has in five years got in to where what took AWS 10 years to get to. And I and now like with these foundation model companies, we're seeing as fast or even faster revenue
Starting point is 00:25:06 growth. That said, I think the business models, like the initial business model, is more clear, and the profitability of these companies, or more the unprofitability, is insanely high, and so you've got to discount the revenue growth. But I would just say the biggest lesson from the past is you have to capture territory early on, and the doors will kind of close behind you, because of these CapEx dynamics and just lock-in with users. Yeah, I mean, you mentioned Starlink. It's such a weird company because it's like a space
Starting point is 00:25:46 launch company that now is an internet company, an ISP. But I'm actually starting to hear a little bit of an AI narrative: just that having Starlink potentially unlocks edge compute or inference in areas that would typically have kind of stranded energy resources. So all of a sudden, if there's some super remote area that has really cheap energy, you can go in and set up a data center there and then do inference and stream those tokens over Starlink. Do you think that's an underrated narrative?
Starting point is 00:26:19 Do you think that's developing on course? Do you think there's any bottlenecks that people should be thinking of within that story? So when we first invested in SpaceX, part of the core thesis was internet everywhere. And I would say it goes way beyond AI. But I think the internet-everywhere thesis is huge. And that will be, you know, everything from oil rigs to airplanes to boats to edge AI devices. But I think the bigger thing for Starlink is Starlink just has a 10x-plus cost advantage for moving data compared to building new transatlantic or trans-Pacific fiber lines.
Starting point is 00:27:07 And in the world of AI, these models are going to be moving so much data around themselves. And I think Starlink is incredibly well positioned to be the pipes to move all this data for AI. And so I actually, I care more about that just because of the volume than some of the kind of edge applications for AI specifically, but those will be big. And then one other thing, I just got to give a plug to Bitcoin. Plug to Bitcoin.
Starting point is 00:27:41 Basically, yeah, let's go. Basically, three years ago, I visited the biggest Bitcoin mine in the world, Genesis Digital. Their mine is near Midland, Texas. It's actually backed by SBF, which is, you know... He got both. Anthropic. Oh, he had a bunch of good bets, you cannot deny that. Exactly, he got both Anthropic and Genesis.
Starting point is 00:28:05 But these guys had a gigawatt-scale Bitcoin mine operating three years ago. And already for them, like, having... it taught me a lot. And Bitcoin mining is the absolute tip of the spear, where you need the least amount of data movement, data in and out, per dollar generated or power consumed. And so I actually think Bitcoin mining is underrated in terms of how much it's pushed frontier power generation turned into compute. And I don't think it's a coincidence that Crusoe, you know, which is now powering Stargate, started off as a Bitcoin mining company
Starting point is 00:28:53 or that CoreWeave, which is like an $80 billion stock as of yesterday, is now an AI data center company. And I just think that's honestly the bigger theme. Yeah. What's your updated thinking around nuclear? We have these new executive orders, and it was announced this week that Meta announced
Starting point is 00:29:17 the partnership with Constellation to power some of their AI power needs. What's your kind of updated outlook over the near term to medium term? I'm an all of the above guy for energy. Like we need all of it. We need all of it as quickly as possible. I, as an individual invested in a few nuclear companies
Starting point is 00:29:38 going back like nine, 10 years ago, way too early. And to put a little bit more meat on these statements, nuclear is incredible, but deploying large amounts of nuclear is slow. Even if you deregulated it to zero, I think it would be more than a decade, well beyond a decade, to deploy like a terawatt of new nuclear. Call it 10 years if you did it as fast as possible for America starting now. Solar is just a way faster way to deploy a lot of energy. Nat gas is a way faster way to deploy a lot of energy.
Starting point is 00:30:22 We have been producing insane amounts of natural gas, which we didn't have the pipelines to actually use. So we were just flaring it a lot of the time, because, like, when you have an oil well or you're fracking, it's producing natural gas and oil, and you just made so much more money from the oil
Starting point is 00:30:46 than the natural gas that we didn't really care about it. And that started to flip. And so anyways, I think we have to do all these things. I think we need more natural gas, more oil, way more solar, and then have nuclear coming as the reinforcement juggernaut, coming online like 10 to 15 years from now. That's a good framework.
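To put rough numbers on the nuclear timeline above, here is a hedged back-of-envelope sketch; the reactor size and build rate are illustrative assumptions, not figures from the conversation:

```python
# Back-of-envelope: how long would a terawatt of new nuclear take?
# Assumptions (illustrative only): ~1 GW per large reactor, and an
# aggressive nationwide build rate of 50 reactors per year.
TARGET_GW = 1_000        # 1 terawatt = 1,000 GW
GW_PER_REACTOR = 1.0     # typical large light-water reactor
REACTORS_PER_YEAR = 50   # hypothetical all-out build pace

reactors_needed = TARGET_GW / GW_PER_REACTOR
years = reactors_needed / REACTORS_PER_YEAR

print(f"{reactors_needed:.0f} reactors, roughly {years:.0f} years at {REACTORS_PER_YEAR}/yr")
```

Even under these optimistic assumptions the build-out lands well beyond a decade, consistent with the "10 to 15 years from now" framing in the conversation.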
Starting point is 00:31:08 Fantastic, I mean we have to have you back for an energy deep dive. We know a fair amount of the nuclear and solar entrepreneurs and there's a bunch of people doing really cool stuff. So have a safe trip. Personal plug, I had a seat in the New York Mercantile Exchange when I was like 22 years old. It was insane.
Starting point is 00:31:25 Good time. Wow. Cool. Hey, good luck on the timeline today. I know you're gonna go in there, put on your hazmat suit and just get in there. Good luck. Good luck.
Starting point is 00:31:34 Peace guys. Safe travels. Cheers. Fantastic. Let me tell you about Linear. Linear is a purpose-built tool for planning and building products. Meet the system for modern software development,
Starting point is 00:31:44 streamline issues, projects, and product roadmaps. Go to linear.app. Next up, we have Jack in the studio. We have an in-person guest. Let's bring him in. Play some soundboard for me, Geordie. Welcome to the stream. How you doing, Jack?
Starting point is 00:32:00 There he is. Second time on the show. Good to have you here. What are you wearing today, Jack? I'm wearing the jacket, the TBPN jacket, in the capital of capital. Fantastic. There you go.
Starting point is 00:32:11 Thanks for coming. Thanks for hanging out. Here, you can adjust your mic a little bit there as well. Yeah, got that. Cool. I wanted to kick this off with a little bit of a rundown on the different foundation labs. We're talking to a lot of them today. And I noticed
Starting point is 00:32:26 that Jordan Schneider from ChinaTalk and Dylan Patel ran through their AI Mandate of Heaven tier list. And so I wanted to read through that, get your reaction, and then just kind of do a vibe check and talk to you about what we should be expecting from different labs over the next year. It's a little bit of a horse race. So up first, at S tier, they have OpenAI. It's the only foundation lab that made S tier. Does that feel right to you? What are you watching from OpenAI?
Starting point is 00:32:57 Yeah, I think that's exactly right. OpenAI is executing both on the product level, getting the distribution, getting into hundreds of millions of people's phones, but also on the research level. You have people like Noam Brown, people like Aidan, just doing this incredible frontier research. o3, I think, just as a model,
Starting point is 00:33:13 impresses me the most of any model that's come out so far. You know, Brad Lightcap said in The Wall Street Journal recently they had two million workplace users in February, and they're at three million now. Wow. So it's just really exceptional growth, I think. I was thinking earlier, it'll be funny,
Starting point is 00:33:30 our kids in 20 years will be like, dad, they're making me use OpenAI Teams at work. It'll just be like the default, like the Microsoft Teams default. Yeah, I mean, there's a little bit of a narrative that maybe, and we can move on to Anthropic in the A tier alongside DeepSeek and Google, there's a little bit of a meme that Anthropic is crushing it with developers. They're the default choice for Windsurf and Cursor users,
Starting point is 00:33:57 but then OpenAI is more dominant with consumers. But I feel like recently I've heard that it's maybe even more skewed than people think. Like, the vibe on X might be, you know, 70/30 OpenAI/Claude, but day-to-day, grab a random person on the street, it might be even more skewed. Does that feel right to you? Yeah, I think Anthropic's really solidified with developers, but it's, like, totally given up on consumers. But I think OpenAI wants to take that on. I mean, there's rumors about some sort of Windsurf acquisition.
Starting point is 00:34:30 They're releasing 4.1 and Codex. They're pushing hard on coding. And I think that's something to watch from them this summer and going into 2026: can they secure that? Do you understand the model names at this point? 4.1, I have access to 4.5, why would I want to go backwards? Are the models fragmenting to where I'm going to have to learn a new taxonomy? Okay, if I want to write code, I use this one. If I want to write poetry, I use this one. If I want to do math or reasoning or build a chart, I use that one. Because it's putting
Starting point is 00:35:00 more work on me, I feel like. I think Sam said that they're going to try to fix the model naming scheme this summer. So that's the real thing to watch. Okay. Can they get coherent model names? Yeah. But yeah, 4.1, it's cheaper, it's specialized towards coding. It's kind of their 3.7-type rival. At the same time, I know you're not super up to speed on the Alibaba Qwen models, but I saw some release where Alibaba Qwen released like a hundred different models.
Starting point is 00:35:26 And Will Brown was kind of saying, like, this is awesome from a research perspective because they have, like, one model that's just good at bio. And it's kind of this hyper-fragmentation, the opposite of going in the unification direction. It's actually going more specialization, and then maybe you unify that at the end. But I don't know, it seems like if you're a consumer company, you can't,
Starting point is 00:35:46 you don't really have that affordance, right? Yeah. I think in terms of research, Alibaba's a bit underrated. I mean, DeepSeek gets all this press, all this coverage, but the Qwen models are really good. You see from lots of people these really cool RL experiments, these really cool kinds of things. They're lagging behind the US models; they're not A tier, they're not B tier, you know, but they're doing some interesting stuff, and I think that's super cool.
Starting point is 00:36:07 Yeah. I really wonder if they have a distribution advantage in China. Obviously we wouldn't feel it here, but I really haven't gotten up to speed on what the ChatGPT of China is in terms of distribution. Obviously DeepSeek had that moment, but have they actually executed properly on the product side? I don't know. I'm surprised that Google hasn't been able to turn their general distribution advantage into an AI distribution advantage. They have these really good models. The new Gemini came out today. It's got really good benchmarks on a lot of
Starting point is 00:36:36 things, but I think they're yet to crack distribution. We sometimes say... Did you see that mock-up? That was just the Google search box, but a Gemini prompt. Yes. Like, if they wanted to go full send, if they were really AGI-pilled, they would just say, hey, we're done with Google search. I mean, it would destroy their economics. Commit to it. I'd commit to it. I think the model that they use to power those search prompts right now seems really lightweight to me. It gives a lot of wrong answers. When I ask 2.5 something, it's always right. Sure, sure. That's interesting. You think that's
Starting point is 00:37:07 just a cost issue? The AI Overview box is like, we're just going to hallucinate. It's a hallucination box. Well, they are launching the advanced AI search, but it's a toggle, so you have to find it, which is always the problem with Google. Well, I mean, they still wound up in the A tier, according to Dylan Patel and Jordan Schneider over at ChinaTalk. Obviously Veo 3 was, like, a huge one. And then they also have all those, like, price-performance things. But I've heard this narrative that maybe some of the hyperscalers are super focused on benchmarking, and not even hacking the benchmarks necessarily,
Starting point is 00:37:42 but just, like, thinking about them. And a lot of the frontier labs, the independent labs, have just kind of moved on philosophically from caring about benchmarks. Is that the right move? What's driving that? Are we in the post-benchmark era, essentially? Yeah, I think about models and benchmarks a lot; I think about which models outperform the benchmarks.
Starting point is 00:38:03 When you see o3's benchmarks, they're good; they're kind of what you expect. Then when you watch o3 think, you see this model, it's actually reasoning. When you watch Claude 4 Opus or Claude 4 Sonnet think, it's like, whoa, this is really good. Same with GPT-4.5. I think the Gemini models are good, but they're exactly as good as the benchmarks let on, you know, and I think they don't have the vibes yet. What I want to see is a Gemini 2.5 Ultra. Okay, Google releases something with some big-model smell, something cool. Maybe that's them. What is big model smell? I just don't like the idea of smell at all. Oh, yeah, it's just a weird sense.
Starting point is 00:38:44 But basically we're in the intangible period. Is that the idea? That it's unquantifiable? I think Anthropic's given up on really training on the benchmarks, and I think it's done really well for them. You see that they're really good at SWE-bench. They're not crushing it on MMLU.
Starting point is 00:39:00 But you try Sonnet on it, it's great. Other labs that are lower down on this tier list seem to have not given up on doing really well on the benchmarks. Yes, yes, that makes sense. I mean, it's possible that you must defeat the final boss to play the end game.
Starting point is 00:39:15 And so maybe the end game is this vibe check, this big-model smell, but in the interim, yes, you only earn the right to go for big-model smell if you can dominate all the benchmarks. There was an interesting moment where 3.7 was beaten on every benchmark, so it was no longer state of the art on anything;
Starting point is 00:39:32 3.7 was losing on everything. There's a better model for everything, hypothetically. But then if you looked at what you might call revealed-preferences bench, which is just like, what do people use on Cursor? Sure. What's going on, man? Yeah, revealedpreferencesbench.com.
Starting point is 00:39:50 Yeah, yeah, 3.7 was pretty high up there. Yes, it seemed like they had something that wasn't captured there. What about cornered resources, data is the new oil? That seemed like a very silly concept in the moment, when everyone had scraped the web entirely and it really felt like data was fully commoditized. Then we see Veo 3, and for the first time it feels like, okay, there is at least one dataset that is so large that you can't copy it onto a single hard drive or compress it, and it's YouTube, and Google owns it. And yes, people might scrape it here and there, but Google has a durable advantage there.
Starting point is 00:40:27 But is that the wrong way of thinking about it? Yeah. I mean, I'm not sure about the video models. I think it's true that data is both super, super important, but also has just become tremendously overrated because the first thing people learn about AI is, oh, it's a result of the data that goes in. But now that we're unlocking things like RL and better post training, it seems to me like you can have some non-data solutions
Starting point is 00:40:48 to some of these problems. Yeah, I mean, that was the original, what, generative adversarial network for image generation: it was like synthetic data generation and then testing it. And so, like, Veo 3 just feels so, so much like a beneficiary of YouTube. But I don't know if that's just,
Starting point is 00:41:05 if we're just waiting and we'll see the next Sora and we'll be like, oh, OpenAI figured it out. And, like, yeah, maybe they found some kind of workaround to the data, but really the vast majority of the consistency and the innovation there was algorithmic progress, not just, you know, cornered-resource data. Yeah, one thing about video models,
Starting point is 00:41:24 it's been so secondary, but they've become so impressive. I think that if you showed them both to me a couple of years ago, I would be more impressed by Veo 3 than even, like, Claude 4 Sonnet or something, you know? I agree, I agree. It's just, it's not what I... it's really, really just incredible. Well, yeah, I mean, I think a lot of it just comes down
Starting point is 00:41:40 to, like, the cost of instantiating the thing. And so if I go to deep research and I use o3 and I have it pull together some, you know, 20-minute research paper, it's like, that's a few hours of work. Maybe it's a few thousand dollars of a researcher's time. Maybe we're getting up into PhD level. I could do it on my own. But if I actually want to crash a Ferrari through the Hollywood sign with champagne bottles flying, a custom Hollywood sign, that's huge. Like,
Starting point is 00:42:13 I'm either renting all that and shooting it practically, and it's a multi-million-dollar Michael Bay shoot, or I'm doing it all in CGI, and even to do it in CGI is millions of dollars of rendering. And so even for an eight-second clip, it just looks like, wow, I got something that normally would cost a million dollars to make happen. And there's no real textual asset that feels like, wow, this is a million bucks' worth of assets. Anyway, interesting. XAI, they are cooking. They've been, obviously, GPU-rich, scaling up. It seems like they're in the B tier here, according to this chart, but everyone's kind of excited
Starting point is 00:42:47 about what's coming next. What is your take on Grok, XAI, are they close to the big model smell? That feels like a natural beneficiary of Elon's strategy of just go big, but how are you thinking about Grok generally? Yeah, I'm not the most impressed yet. I mean, Grok 3 is good.
Starting point is 00:43:06 It's a good model. Sure. It's like a funny thing: Grok's whole thing, or something that people who really like Grok often say, is, oh, it's trained on this real-time X data. One thing I've tried a few times, because I saw it in a tweet, is if you have a tweet you can
Starting point is 00:43:20 describe, maybe I say, like, John Coogan's tweet about bringing media back to Hollywood. Yes. And you ask Grok to find it, it can't find it. Really? Yeah, o3 can find it. That's so interesting, because I feel like X is pretty locked down at, like, just the WWW layer, right? It's pretty hard to find. In fact, a lot of times I'll put in a post from X and it will have to go to, like, Thread Reader unroll and find an
Starting point is 00:43:46 archive off of X, because it clearly can't access it directly. That is fascinating. So that feels solvable. Adam ships TBPNguest.com. Yeah. Like, last week I had a friend find it. Monday, we hadn't announced it anywhere; it's not even visible on Google search. Really? And o3 found it. Wow. It was like, how did you find this? And he was just looking. He asked o3, can you pull together
Starting point is 00:44:10 a list of all the guests that you've been able to find? Wow. And it found that link randomly. And Google doesn't even find it. Interesting. o3 is really good at search. And I think that might have been RL. They mentioned RLing on tool use in the blog.
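The "RL on tool use" idea mentioned here boils down to a loop where the model interleaves reasoning with search calls. Here is a minimal sketch of such an agentic search loop; `llm` and `web_search` are hypothetical stand-ins, not any lab's actual API:

```python
# Minimal agentic-search loop: the model repeatedly decides whether
# to issue another search or to answer. `llm` and `web_search` are
# placeholder stubs standing in for a model call and a search backend.

def llm(prompt: str) -> str:
    # Placeholder: a real system would call a language model here.
    return "ANSWER: example"

def web_search(query: str) -> str:
    # Placeholder: a real system would query a search index here.
    return f"results for {query!r}"

def search_agent(question: str, max_steps: int = 5) -> str:
    context = f"Question: {question}\n"
    for _ in range(max_steps):
        move = llm(context + "Reply 'SEARCH: <query>' or 'ANSWER: <text>'.")
        if move.startswith("SEARCH:"):
            query = move.removeprefix("SEARCH:").strip()
            # Append the tool result so the next model call can use it.
            context += f"Searched {query!r}: {web_search(query)}\n"
        else:
            return move.removeprefix("ANSWER:").strip()
    return "No answer within the step budget."
```

RL training would score whole trajectories through a loop like this (did the final answer check out?) rather than individual tokens, which is one plausible reading of why a model RL'd on tool use ends up unusually good at search.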
Starting point is 00:44:22 Very, very interesting. Also, like XAI, it's like not really much revenue, nearly no revenue yet, you know, at some point you need to start pulling that out. I'm glad they're pushing on the distribution, you know, but things come around. Yeah, makes a lot of sense. Last one we'll end with.
Starting point is 00:44:37 Probably the highest revenue multiple of any company in history. Yeah? Yeah. Last one, we'll end on Meta Llama, sitting in D tier, but maybe not out of the game yet. The two interesting bull cases I've been discussing have been: one, is there a world where open-source American AI becomes geopolitically important for countries that are slight allies, and they're either choosing between DeepSeek or an open-source American model?
Starting point is 00:45:05 And OpenAI would not be in the conversation. And then also just, why would you ever bet against Zuck? He has a capital cannon that can fire 10 billion at random projects forever. And so the question is, is that enough? What are you looking for from Meta and Llama in the future? Yeah, it seems like they hit some issues recently. But I'm not betting against Zuck. He's got the capital. He's got some GPUs. They can get together some really great research. I would love to see better American open-source models.
Starting point is 00:45:36 I mean, I'm not betting on open source in the long term as maybe the cornerstone of AI, but the fact that all of our American research groups, lots of really smart RL researchers, are doing experiments on Qwen and not on Llama, it's just not great, you know? Yeah, yeah, yeah. So there's one interesting twist there, which is Qwen has so many different models,
Starting point is 00:45:58 Llama has a few, and they're still working on rolling out Behemoth. But would it be almost more of an olive branch to the developer community to fragment the models and really focus on hitting researchers? Is that kind of a potential path that they should take? Yeah, I mean, I think it would be really cool if they did that. It would be somewhat charitable. Yeah, yeah, yeah, exactly.
Starting point is 00:46:17 Developers love a handout, but, you know. I don't know. I think I'm curious about what they do on the product level and how they can build stuff in better. On the product level, people aren't incredibly sensitive to whether o3 can search 50,000 websites like we are, you know; they care more about just having something that's really good, something that's really good to talk to.
Starting point is 00:46:33 Maybe Meta shifts focus there. I'm not feeling it right now in terms of, like, when will a Meta model grab number one on LMArena or something; it seems like it's gonna be some time, you know, but I'm not counting them out at all either. Yeah, I mean, if they can just, yeah, stay on the lagging edge, that could still be valuable in a lot of their product rollouts.
Starting point is 00:46:53 I mean, we forgot Apple in the L tier. We do have another guest hopping on in just a minute, but Apple in the L tier: how do they dig themselves out? Is it build? Is it buy? What do you think is gonna happen? They could maybe, they have a lot of cash, they could maybe buy someone. They could buy someone. Yeah. You could buy a lab and then you gotta go and upgrade.
Starting point is 00:47:12 There was some report that they had some internal models. I wouldn't be surprised if they could train stuff. It's just, look, we haven't seen anything at all. You know, do you think they're really training on Apple silicon? Like, you've seen those photos of all the Mac minis wired together. Does that seem like something that's real? Okay. Yeah. I think non-GPU training runs are gonna be bigger next year. Really? Well, the TPUs for Google, sure. Yeah, so they've already had those a long time. It's easy:
Starting point is 00:47:39 they could go do something like a Trainium or an Inferentia chip from Amazon, or a TPU. Yeah, I mean, with the TPUs, Google has by far the most compute. Yeah, I mean, I guess Apple's pretty good at chip development and design, so, like, they could do it. Yeah, their own chips are pretty good. That would be their, yeah, that would be their advantage
Starting point is 00:47:53 if they could build a really strong chip and cut that cost. I wouldn't bet on it, but maybe. Yeah, yeah. I like the idea of just opening it up and really partnering. The thing over the last 24 hours is one account sharing "it's so over for Google" and then immediately sharing "wow, Google is going to destroy everyone in AI," and just, like,
Starting point is 00:48:12 seeing how the posts rank. Yeah. Anyway, anyone else on here? They got Mistral in F tier for the French. They're not trying Le Chat. Yeah. I do wonder about Mistral, because, you know, the models are real, but none have, like, broken out in capability. But there's this question of, like, if you want a national champion in your country,
Starting point is 00:48:35 it might not be enough to just have the foundation model layer. You also might have to go and win in the free market at the application layer. And so, yeah, even if you had a comparable model, if people are going to go to chat.com instead of Le Chat, like, you have not won and you don't have your national champion. Yeah, and
Starting point is 00:48:59 I think there's some truth to this, but there's also the regulatory stuff in the EU. I mean, a lot of releases don't come there; I think Veo 3 is not in the EU. Maybe Mistral just uses regulatory moats to monopolize. Not a fun way to win, but maybe that's the bull case at this point.
Starting point is 00:49:13 Yeah. What was your reaction to the conversation back and forth with Dwarkesh and Sholto, all about this debate over, I forget, it was like spiky intelligence and how you actually train someone. There's so many different things. We see that the models are really good at one thing and then they fail. ARC-AGI. What's your overall timeline right now? How are you looking?
Starting point is 00:49:43 Yeah, Dwarkesh raised the point that you can't really do this continuous learning, this, like, short-run continuous learning. Like, you can tell me, Jack, I want you to do something different, as a good intern, and I figure that out. And context is a weaker tool than that. And I think that's absolutely true
Starting point is 00:49:57 and that's an unsolved problem. I don't know how much that moves my needles on timelines. Like one thing that could be true is just that OpenAI or Anthropic makes some like, sweet agent and then it starts accelerating their AI research and they just get like really efficient algorithms really quickly. Some architecture that just destroys the transformer.
Starting point is 00:50:17 But I do think it's a meaningful unlock if that could be solved. And I think that sort of like, mid-level memory type of stuff is really interesting. Or solutions around context around a wrapper. Well, this was fantastic. We have our next guest. Thanks so much for hopping on.
Starting point is 00:50:32 Good to come on. We love an in-person guest. For sure. Thank you so much. Next up, we're heading over to Google world. We have Aarush from Google. He worked on the Deep Research project that dropped from Google in 2024.
Starting point is 00:50:47 It wasn't a full year ago, it was in December, technically, but very excited to talk to him about that product, all the things that go into Deep Research. So we'll welcome him to the studio if he's available. How you doing? Good to have you on the show. Hey, what's up, guys? Thanks for having me. What's going on?
Starting point is 00:51:02 Not too much. We're having a great day, we got a great lineup, and excited to dig into it. Would you mind kicking us off with just an introduction on yourself and a little bit of the history? I want to hear about the history of the products that you've built at Google, what the interaction between research and product looks like, and what you're excited about.
Starting point is 00:51:23 Yeah, for sure. First off, A tier, that's pretty good. Pretty good, yeah. There we go. Let's hear it for A tier. Let's go. Let's go. Let's hit it.
Starting point is 00:51:31 Yeah, John's going to hit the gong. Good work. Cool. Yeah, love to be here. Yeah, it's been fun. It's been a fun ride. I'm a product manager on the Gemini team. Cool. I've been here since a little while back, when it was called Bard.
Starting point is 00:51:52 The Bard days. Yeah. And so, yeah, about, I don't know, maybe like this time last year, we started kicking around this idea of Deep Research, where one of the things we noticed is a ton of people come to the product seeking to learn something or asking questions and kind of doing researchy-type things. But if you ask really hard questions, one thing we noticed is the model would just give you, like, an outline of an answer. It wouldn't actually tell you something very comprehensive. So we kind of just ran with a hypothesis of, like, let's take off the constraints of, like,
Starting point is 00:52:29 it has to respond within a few seconds, it has to use this much compute. Let's just see how far we can push what the model can do. And this was before thinking models or any of that good stuff. And so we kind of worked on this idea for a bit, and then we launched in December. Gemini 1.5 Pro
Starting point is 00:52:49 was the model that we were using back then. We launched Deep Research kind of as a bet, just to see: would people be into something that makes you wait 15 minutes but gives you something comprehensive? I'm happy to wait, although I do want it to speed up. Questions about context window size: how important is that million-token context window?
Starting point is 00:53:11 That feels like it's been a unique Google feature for even longer than I expected. The advantages in AI seem to last days, maybe weeks, before another model comes out that, you know, meets or is roughly around the same capability. How important is large token context windows in deep research like products? Yeah, it's huge. It's like really what enabled us and kind of gave us the confidence that this was even worth trying. I'd say that the long context enabled us to do basically be
Starting point is 00:53:47 very recall forward and really cost a very wide net as we research the web and try and find gems of information that we then stitch together. And so that that was like I think our biggest differentiator and really allowed us to build this product. The other thing that long context allows us is like once you finish your research not just the report, but everything it read along the way is in context. So you can keep asking questions going deeper with within Gemini.
Starting point is 00:54:15 And even if it's like a tidbit of a fact that's not in your report, if it's been read at some point, it'll be able to retrieve that and give you that answer. So it also helped sort of beyond that first turn, keeping a good experience. And then reasoning models was like the next big, big step jump for us,
Starting point is 00:54:32 allowing it to then do more critical analysis. So in terms of like actual product design, I'm interested in the direction this goes here. You could see one world where the models are baked down into silicon, everything's running even faster, you're distilling the models are baked down into silicon everything's running even faster you're distilling the models and all of a sudden I'm getting a deep a 20-minute product in two minutes or even 20 seconds you could also imagine a world where what's possible if the economics work such that
Starting point is 00:55:00 I could request a two-hour research report or a two-day research report. How are you evaluating those? What would you personally be more excited about? And what do you think users actually want because stated preferences and revealed preferences are often different or do we wind up with both? Yeah, so one of the things that we noticed, one, when we launched this,
Starting point is 00:55:24 we had no idea people would be willing to wait. Every metric at Google from the day it started is reduce latency and all metrics go up. So this was definitely a bet where we were, like a lot of people thought we were crazy, where we were like, we're just going to take a ton of time and people will wait. One thing we noticed is that after about a minute or something like that, people are fine. People will go away, do other things, come back.
Starting point is 00:55:46 We'll send them a notification when it's ready. The big pleasant surprise for us is people don't mind waiting. In terms of efficiencies gains, one of the things that we're more excited about is, okay, if we can make models more efficient, instead of reducing down the research time, can I give you just a way better output? Like, can I use, can I bank that savings and give you something way more insightful, way higher quality? I'd say the other thing is like, even if I could give you like a deep research answer in 15 seconds, it's going to take you 15 minutes to read. So there's also an aspect of just
Starting point is 00:56:21 like how much do you want to consume this? Right. So, so for us, we're not as stressed about like, can we make this faster? Can we make this quicker? I do think there are probably other points in the latency, comprehensiveness spectrum that people might like. Right. We picked like one extreme of like, let's just go super hard and build the most comprehensive long thing that takes a while. But there might be totally other points people are in. and build the most comprehensive long-knit thing that takes a while.
Starting point is 00:56:45 But there might be totally other points people are interested. Yeah, yeah, yeah. Sometimes I notice I've generated so many various deep research reports across all the different apps that I'll follow it up with a prompt, like, okay, yeah, boil that down for 10 bullet points because I don't have time to read that. And then I'm like, wait, maybe I should have just asked it
Starting point is 00:57:03 to give me 10 bullet points and I just burned a bunch of GPU cycles But I guess the question is back and forth between the two until you kind of understand But but I guess the question is like is is there is there a product or is the natural evolution of just general prompts that as As algorithms get faster as these models run faster that there is a deep research amount of work that happens within a few seconds between every response. And basically the question is like,
Starting point is 00:57:34 how much can you port from the deep research product and strategy and design back into just your average LLM interaction? design back into just your average LLM interaction? Yeah, I think there's definitely a lot of learnings that we can kind of start upstreaming, really around being able to form a plan, follow that plan to do that sort of multi-hop steps of search, iterating, finding insights,
Starting point is 00:58:02 changing your strategy possible before going back to the user. So you're starting to see this in 2.5 Pro and stuff like that. I'd you can imagine that that will continue where you will see more mini deep research or more planning and iterative reasoning before giving you an answer. That could just start getting faster and faster and faster. Then you start just getting like way more insightful or, um, uh, uh,
Starting point is 00:58:31 comprehensive answers. Are there any other interesting areas? I mean, deep research feels like one of the first like really solid product market fit experiences in, I guess, like agents broadly. Um, are there any other areas that you're excited to think about knocking down with either different products or just maybe just like cool uses that you've developed or as a user patterns that you're leveraging
Starting point is 00:58:59 that's maybe go beyond just the average, like I need a research report. Yeah, totally. So I think there's like a few different angles that like I think a lot of people are exploring. One is you kind of point out like what does a two hour deep research look like? What does an overnight deep research look like? If you can have like a very well-defined problem where like you know we have early experiments at Google like AI co-scientists and stuff right like you could run that overnight and it can come up with like novel scientific hypotheses, right? So there
Starting point is 00:59:28 definitely is an angle of like, if you can define a problem and an outcome really well, applying more compute can actually get you like better and better answers, right? So there's definitely an angle of like, are there whole new classes of problems where you can even go even further with deep research? There's a second aspect of like, we had the chance to go meet a bunch of people who are like researchers at the Fed. And they were telling us how they use deep research. And it's often a very different thing.
Starting point is 00:59:56 So I showed them this example where I was like, hey, there's this funny law in the US called the Jones Act where any two ships between like two US ports have to be like built in America, accrued by Americans. And like drives up shipping prices, but only for like Puerto Rico, Hawaii, and like Alaska. And so I was like, do an economic analysis of the Jones Act on like the economy of Hawaii. Right. And it like did a first principle analysis, did some really interesting things like looking at well like how much is a three, three and a half thousand shipping route like say from like Mexico
Starting point is 01:00:31 to South America, and then that's a baseline price to compare against. And I thought this was amazing, but then they were like, that's not how we would do economic analysis. They would be like, first I'd explore what other studies there are, then I'd explore what kinds of methodologies are out there. Then I might ask a bunch of follow-up questions about what data sources or data sets people used to do this research. So there's definitely another angle of,
Starting point is 01:00:55 if I really want to help people with research, it's about nailing this synchronous, asynchronous paradigm and helping people kind of do more of that iterative process rather than just like ask question get answer and move on and then in victory and I think that's that's kind of a product challenge like figuring out the right the right interaction model for that and the third is it's just like outputting the outputting an answer at the right like level of abstraction that you work at right like a financial analyst doesn't think in terms of a report.
Starting point is 01:01:26 They think in terms of the spreadsheet or the financial model. And so if I want a DCF, deep research can build a great discounted cash flow model for me. But I don't want it in a report. I want it in a spreadsheet, or I want it in an app where I can play with the variables
Starting point is 01:01:41 and see the different outcomes. And so you'll also see the line between reports and other kinds of artifacts starting to blur. Or even just, what does it mean to build an answer, right? And that could take a much wider form. That's super exciting. Yeah. I mean, obviously we probably can't talk about the roadmap too much, but I've seen Gemini pop up in a bunch of different areas, and I haven't seen
Starting point is 01:02:04 the deep research version of whatever that instantiation is. the roadmap too much, but I've seen Gemini pop up in a bunch of different areas, and I haven't seen the deep research version of whatever that instantiation is. Maybe my last question is how much time are you thinking about working and making, as a product manager on Gemini, how much time are you thinking about making Gemini better versus sort of fighting for distribution outside of Gemini and kind of across the Google ecosystem, because part of unlocking the value of Gemini
Starting point is 01:02:26 is just making sure it's in the right places and placed sort of contextually across everything from Gemini.Google. You've worked hard on this. Just ask for the I'm feeling lucky button. Just give us that. We think you earned it. You've earned it.
Starting point is 01:02:41 It's a great product. Just click I'm feeling lucky or burn 40 GPU hours on this new research award. Yeah, that would instantly melt all of our servers everywhere. This is the biggest hyper scape. Yeah, we need more GPUs. The TPUs can handle it. TPUs are cool. T need more GPUs. Let's get GPUs. The TPU can handle it.
Starting point is 01:03:06 The TPUs. TPUs, yeah. We use TPUs. Okay, so ASML, get cooking. I believe in the TPU. I believe you've earned the I'm feeling lucky. I haven't hit the I'm feeling lucky button in years, yet I use Gemini all the time.
Starting point is 01:03:16 Yeah, yeah, yeah. This is what the users want. Yeah, we just need 10 more TSMCs, I guess, to start fabbing. Anyway, sorry, serious answer. Yeah, the serious answer is the Gemini app is a great place for us to prototype,
Starting point is 01:03:29 see what really works with people. A lot of the users, they're very intentional when they're coming to the Gemini app. They want to use an AI experience. So it's a really great place for us to put stuff out there, see what works, see what doesn't. Some things we put out needs more time in the oven. And then over time, you'd imagine
Starting point is 01:03:45 that then those insights or things that really start work, you'll start seeing in other Google products as they make sense. You don't want to over-clutter a UI, but you'll start seeing things like deep research. Yeah, because it's a very different user, somebody that's coming in saying, I want AI, versus I just want to do certain things.
Starting point is 01:04:03 Yeah. And yeah, they're totally different archetypes. It's a fascinating challenge. I'm sure it's even more challenging at your scale. But thanks for all the hard work and pushing the frontier forward. It's been a pleasure talking to you. Yeah, come back on again soon.
Starting point is 01:04:17 Yeah, we'd love to talk to you more. Yeah, I appreciate it. Thanks so much, guys. We'll talk to you soon. Cheers. Bye. Fantastic. Next up, we have Oliver Cameron. I have a good story. We'll bring him you soon. Bye. Fantastic. Next up, we have Oliver Cameron.
Starting point is 01:04:25 I have a good story. We'll bring him into the studio, but I believe he was the first person I ever interviewed for a YouTube video years ago. I was doing a whole video essay about Cruise, the self-driving car company, and he hopped on a Zoom call with me just like this one, and I recorded it and threw clips in the video. It was very fun. And then I wound up doing more interviews after that so Oliver good to see you how are you doing what's going on welcome doing great thank you for the opportunity would you mind kicking us off with like the latest and
Starting point is 01:04:56 greatest introduction because you've done a lot in your career but you're on to something new for sure so spent about eight years building self-driving cars incredible time I mean just to see that technology go from barely being able to For sure. So spent about eight years building self-driving cars. Incredible time. I mean, just to see that technology go from barely being able to keep in a straight line to navigating downtown San Francisco with no human behind the wheel, just a sign of where things have gone with machine learning. So had a blast doing that. Built my own company, sold that company to Cruise where we met and uh and loved that time. Left Cruise in May of 2023,
Starting point is 01:05:27 decided to start something new and both me and my co-founder who also was from Self-Driving Cars, we were both um very much inspired by Pixar. I think it's just a very special company, right? Everyone kind of recognizes Pixar as this sort of iconic storytelling company. And we really put our heads together to think about what a modern reincarnation of Pixar would look like. So that company is called Odyssey. And we're an AI lab that's really focused on enabling entirely new stories to be told. And, uh, walk us through the first product that you launched. I played with it earlier. Uh, it was mind blowing-blowing. We'll pull it up while you're talking
Starting point is 01:06:08 Sure. Yeah, we just released a research preview of something that we call interactive video Mm-hmm, and it's effectively AI video that you can both watch and interact with in real time. Yeah, and We think this will become a entirely new form of entertainment, you know, you've got film, you've got games, you've got all these mediums that have been around for a while. We think that there is an opportunity to invent a brand new one, where effectively a model is responsible for imagining film and game-like experiences in real time that you can interact with. There's no game engine behind all of this, no heuristics, no rules, just a model that's learned pixels and actions from tons and tons
Starting point is 01:06:49 and tons of real life video. Yeah, we're showing it on the screen right now and the production team is controlling it with the keyboard, W-A-S-D, like it's a first person video game and they're walking around this field with trees and windmills and they can actually choose to go up, go inside buildings and it's all being generated without the use of a game engine and then they can switch over to a different environment. And so, I mean, I have tons of questions about how these different, like you're not doing
Starting point is 01:07:19 photo scanning, you're not doing game engine stuff, traditional 3D pipeline, but the data must come from somewhere, love to hear about that. And then also, I noticed the space button doesn't work, I wanted to jump around, start bunny hopping, when are we getting a space button added to this thing? Anyways. Isn't it trippy how those pixels are literally streaming from a GPU cluster cluster probably in Texas.
Starting point is 01:07:45 It's so crazy. And now we're streaming them via Zoom in real time. It's crazy. My question is, do you think that Odyssey can be a really breakout app for VR? Cause when I see that visual, I feel like that it could give someone the sense of being able to explore lands that don't exist, which is like very fat, like once it's fully immersive, it feels like.
Starting point is 01:08:11 It's funny the windmill thing because I remember the very first Oculus demo that I ever did. I was walking around a windmill and it's still in my mind years later, but it was amazing, but it was just like one little windmill and then you couldn't go any further because developing like virtual assets is really expensive. And so you play a lot of these VR games and you know, it's a couple hours or 30 minutes. But if you take a procedural approach or a generative approach, you all of a sudden have infinite content.
Starting point is 01:08:42 I think what's really important to note is in film and game, incredible things can be made, right? Like insanely good things that wow is all. The time and the money it takes to create those things is ludicrous and it's only getting more expensive, not less expensive over time. So I feel there will be continuously a place for these sorts of like handcrafted things. And they'll be very important. But if we just think about a model that's trained on literally decades of video, that's then able to imagine stuff in real time with no pre-production costs,
Starting point is 01:09:19 no post-production costs, and do that literally in real time, like 33 milliseconds. It just that that's where it gets really crazy. And what we showed in the research preview is just like this tiny glimpse, I think of what this stuff will become. VR in particular is like the most hardcore application of this from a technical perspective, because the resolution required for VR is like insane. And the resolution that you saw that you can tell it's low res, it's like 300 pixels wide. So there's gonna be a leap that needs to happen there
Starting point is 01:09:49 to get to VR level res, but. I'm confident that Odyssey 2, you'll have it. You'll have it dialed. Oh yeah, two points. Yeah. Give us the stats. How many, like what numbers can you give us about the progress or adoption?
Starting point is 01:10:05 You just launched this, I think this week or last week, it hasn't been very long, but how has the response been quantitatively? Oh, it's been incredible. So we launched a week ago, and since then we've served 250,000 unique streams, meaning 250,000 people experiencing what you just saw, which is insane.
Starting point is 01:10:21 Market clearing order inbound. Yeah, let's do it. you just saw, which is insane. Market clearing order inbound. Let's do it. Love it. Congratulations. That's fantastic. On the question of resolution, there's a bunch of amazing AI upresing that's happening in various parts of the pipeline.
Starting point is 01:10:41 There's some server-based upresing that can happen. There's some on-device upresing. So, is that, are you counting on that technology breaking one way or another? Does it matter? Will it be a combination of both? How do you see that developing? I think a way to think of this is where video models were a year ago is where real-time video models or world models will be today. And what that really means is that you look at the res, remember the Will Smith, everyone remembers the Will Smith
Starting point is 01:11:10 spaghetti video. Yeah, spaghetti. Was that like one year ago or two years ago? It wasn't long ago. I think it was just over a year ago. So fast. There was definitely better outputs, like spaghetti was like the weirdest, hardest thing at the time.
Starting point is 01:11:23 Although gymnastics today, I'm sure you've seen that. That's really tough with video models today. But that's all to say that I think the res and the visual quality improvements will come from the model itself, not like some secondary piece of infrastructure to up res. Just because, I mean, think of what a language model was like to use two years ago.
Starting point is 01:11:45 Like, how fast was it in response time? Really quite slow, right, compared to today, where it's like just stream of information straight to your eyeballs. Same will be true of these models. Like, we'll crank out larger resolutions, faster frame rates, more actions, more things you can do, all that sort of stuff. Yeah, and I guess importantly, like, GPT 4.5 is not GPT 4 up res'd to 4.5. It is a different model. We're walking around what looks like the gloomy English
Starting point is 01:12:15 countryside right now. And I think the production team is going to try and go in that house. It is really, really so wild. I noticed that there's a time limit. Do you have a tropical island demo? Because this, I love the English countryside. It's very foggy.
Starting point is 01:12:31 Yeah, I noticed that there's like a two minute timer when I sign in. Is that so the GPUs don't melt? I mean, I assume you've raised money and you're maybe burning some money with these demos. But break down kind of like what your limitations are and how you see them evolving. Yeah, for sure. So the timer is there because each session
Starting point is 01:12:53 is served by a single GPU. So each user gets a GPU, the model's running there, and that's beamed to the user directly. And really quickly, when you say single GPU, you don't mean rack, you mean like one A100 or something like that? One H200 per user. Got it.
Starting point is 01:13:12 And there is a clear path to like dividing the GPU to have multiple sessions per user, but today it's one. And we want to really crank up quality frame rate, all that sort of stuff. Yeah. It makes me feel great to know that I'm getting, you know, the sort of one-on-one attention from an h-200 chip yeah yeah it's not a retail store yeah it's not a great experience if somebody's bouncing
Starting point is 01:13:32 around exactly I'm being individually served being served this is like an air mass level by Jensen so two dollars an hour is there or thereabouts how much that costs which you know over the course of multiple users, it's not too bad. I think Netflix is like five cents, 10 cents an hour, something like that to stream video. So we're a bit of a ways away, but you've got new chips coming, just model optimizations, like it won't be long where we're having a single GPU per user, all that sort of stuff. For this launch, we had something like 360 H200s prepared, to scale it up a little bit, just because we had lots of demand,
Starting point is 01:14:13 but that timer is there just to make sure we're cycling through lots of people getting a taste of this. But yeah, I think fundamentally the idea that you could have a model stream stuff to any screen is really powerful like that Experience you saw there works just as well on an iPhone on an Android on TV Anything like that and it's all just action conditioned of a web RTC, which is probably what zoom is running on So the action to just sent over the wire to the model the model then conditions the pixels It's about to generate based on those actions sends sends the pixels back, and just that loop every 33 milliseconds is firing.
Starting point is 01:14:47 So, I mean, the path to HD or 4K seems pretty clear to me. What about the path to consistency? That feels really difficult. You need essentially a really long context window to know that, okay, I dropped my mythical sword on that piece of the ground. I went away and then I came back. That's like textbook, just put it in a database.
Starting point is 01:15:10 But it seems like the future might not be that. So how are you thinking about that? I guess the bigger question is like, what's the response from the gaming community? Is this something that can be a tool and a piece of a pipeline instead of completely replacing the entire traditional pipeline. So most research on interactive video before has learned from games. So lots of folks will have seen Oasis from Decart, a Minecraft in a video model. Yep. Effectively or Quake. That's often used in video models.
Starting point is 01:15:39 And I think the gaming reaction to that is quite negative. Oh, yeah. I mean you saw the car Mac back and forth, right? My car Mac was like this is amazing. I love and somebody else is like this is stealing, you know developers Yeah, and I think it's important because The way that people envision that is like, oh, what's the best thing this could become it could become like we mixing of games Yeah, and that's one way it could be. I think people see what we have and they think, oh, this is like a world simulator eventually.
Starting point is 01:16:09 This is the matrix or like whatever they project on it. Yeah. So really, one thing we're trying to avoid is like, for the first few generations of this, people will put, including ourselves, like this picture of what existing games look like onto this. And it's like the iPhone when it launched, right? People ported desktop apps to the iPhone,
Starting point is 01:16:31 and it kind of worked, but it kind of didn't. Wasn't really embracing this new medium. So I think the long story short here is stuff that is integral to games today, like multiplayer, like state, like scripting, all that sort of stuff. Let's question those assumptions. Like, how should those things work?
Starting point is 01:16:48 Let's make it model native. Like maybe memory in this model is very different than memory in a game, or state in a game, multiplayer in a game, all that sort of stuff. And that's probably going to lead us in the short term to more glitchy, weird experiences that treat the memory, as it is in the models, as a feature, not a bug. I don't know if you guys have seen the backrooms, or these kinds of glitchy, weird experiences. Yeah. It almost is a completely different type of game design. Yeah, yeah, the up, down, left, right, B, A of the future will be like: drop your
Starting point is 01:17:18 sword on the ground, walk around the building three times, come back, and it's enchanted, because the model hallucinates that you've upgraded or something like that. That'll be fun. I also think one important thing here is that with language models, one of the things that's happened over the last year is that, in many cases, they've crossed this threshold of realism for certain applications. So, like, people literally fall in love with language models, right? The same emotional feeling they have when they meet a person they fall in love with is happening for them with a language model.
Starting point is 01:17:49 And that's cause they, what they're seeing on their screen is like so realistic. It's like crazy real to them. And I think the same will be true here where once these pixels, these actions feel so realistic, which eventually they should just get in the data, given the models and advancement. There'll be things that they do in these worlds or things they feel in these worlds which they just can't feel in video games today because games are just capped by computer graphics and like human dev time and budgets and everything else but they'll walk down
Starting point is 01:18:14 the street they'll see someone and they'll be like wow that person looks so real and they'll go over they'll like high-five that person or like on the screen right yeah and I'll just feel something like they'll feel like a heartbeat raise you know yeah totally so that's that's an application that you can't do in games today that's just different and new so that's the sort of stuff we're really interested in well that's gonna be a wild wild future but thank you we'll have to have you back and check in on progress yes definitely the day that 720p drops or whatever the next version
Starting point is 01:18:44 is like We're excited for this, but thanks so much for joining. This is a fantastic conversation We will talk to you soon. Have a great day guys for joining. Thanks so much next So we have a return guest Michael Mcnano from lightspeed coming into the studio Are you gonna talk to the gong or he's gonna talk about competition between? Yeah, you know a lot of Asian labs and the app player. Well, welcome to the stream Michael. How are you doing? Boom. Good good to see you guys Studio. Thank you. It's been a lot of fun a bit of a I like the upgraded gong too. Oh, yeah That's much bigger
Starting point is 01:19:18 Everything's getting bigger. We got a bigger one in the works. Yeah, we're working on an even bigger, floor-sized one. Really? Oh, yeah. Also, it's a funny day to just be so hyper-fixated on AI, because you probably haven't seen the timeline. Tesla's down 17%. 17%. It's just absolute mayhem. I mean, there's an AI angle there, right? Yeah, there's definitely... that's not what's driving it, though. But anyway, Michael, it's great to have you on. Wanted to get some kind of updated thinking from you on the tension between labs and the application layer.
Starting point is 01:19:56 We saw the news with Windsurf and Anthropic today that had more to do with a potential acquisition. And even when we talked to the founder of Granola, we were talking about the competition between Notion and Granola with these, it's a founder-led kind of previous era scale-up unicorn SaaS company. Can that company bolt on AI, but then now we're seeing competition
Starting point is 01:20:18 from the foundation lab. So, would love to get your lay of the land. What are you seeing? How are things shaking out? And what do you think the next few months or even years look like? Yeah, it's pretty interesting, right? Like if you think about the big companies that startups previously, uh, built on the backs of the Googles, the Amazons, the Microsofts, you know,
Starting point is 01:20:37 it felt like there was this really healthy sort of symbiotic developer ecosystem where the incumbents supply resources, the developers sort of buy and extract from them and they build really, really big businesses on top. I think what we're seeing now to your point is these labs are building developer ecosystems, but then they're very intentionally and overtly going head to head with the developers that are building on them.
Starting point is 01:21:02 And I think this has a lot to do with context, right? So if you think back to the internet, you know, and startups 10 years ago, everyone said content is king, you know, content is king. Then distribution was king, right? It was all about how do you get in front of users? We're starting to feel like we're entering the phase of context being king.
Starting point is 01:21:23 These models are just hungry for the most and the most unique context possible. And so if an app layer company emerges and has a new type of context and data that the models don't have great exposure to, it's a great signal to point in the direction and say, we're going to compete head on. And so I think that's what we're seeing now. And yeah, Nabil Hyatt, a great investor from Spark and I, we're going to compete head on. And so I think that's what we're seeing now. And yeah, Nabil Hyatt, a great investor from Spark and I, we often talk about how the war for context is happening now. And I think that's, that's what a lot of these modes
Starting point is 01:21:55 represent. How do you think app players should app player companies should respond? Is it just double, triple down, go way, way, way deeper, focus on workflows that the labs maybe don't have the resources to fully pursue or is it focusing down on specific niches? I'm curious what the yeah, I think the right approach is well We can definitely get into that but maybe first of what I would say is, you know I tweeted something yesterday that occurred to me after the big announcements from open AI
Starting point is 01:22:23 in that, you know, the big incumbents, which we talked about a little while ago, sort of like the winners of the cloud era: it wouldn't surprise me if all of these new competitions between the labs and the apps actually drive the apps and the startups right back to the incumbents, to the Googles and the Amazons of the world. I have to wonder if some of these things actually act as a tailwind for models like Gemini and maybe give a little more credence to the argument that Google is actually gonna be the winner here because of all their distribution. So I think that's one potential outcome.
Starting point is 01:22:55 You mean driving back and being like, I'd rather work with Gemini because I don't think they're as likely to kill me. Exactly, yeah, exactly. It's like, hey, we trusted them with the cloud, and that worked out all right. Should we now trust them with AI more than we trust the labs? Yeah, I mean, that narrative even goes a little bit further
Starting point is 01:23:11 with Microsoft, which has been completely like, oh, we will host every single model. We'll let you reroute really intelligently between them, like super, super friendly developer ecosystem. And so, I mean, certainly they're building stuff into Copilot, into Microsoft 365, but it does feel like they're much more willing to partner. Yeah, Satya seems to have real conviction.
Starting point is 01:23:33 He had the quote from last week, platform, platform, platform, and hosting DeepSeek is an example of that, right? A lot of people would have thought, oh, he's not necessarily gonna host that model because it felt like a shot across the bow at OpenAI, but he's committed to supporting open source. Right?
Starting point is 01:23:54 Yeah, yeah, yeah, he wants it all. Interesting. I think, you know, also going back to your question, Jordy, I think all of this is just gonna make for a more intense, faster moving market. Like I think more than ever before, you have to ship, you have to get users faster than anyone. You have to sort of like reach escape velocity quicker,
Starting point is 01:24:13 which I think is just gonna put more and more pressure on startups to move even quicker than they already are. I feel like Cursor is a great example. With an earlier iteration of that product, it probably would have been easy to sort of write them off and be like, oh, you know, a lab is going to do this. I mean, now they're so big, they're so far ahead. It feels like they've really established themselves and likely have a good shot of breaking
Starting point is 01:24:36 through. I also wonder with cursor and windsurf and Devin and some of the dev tools markets, like it feels like just such a new market that even if it's somewhat winner take all, there's just, it's so positive some because it's adding efficiency to the most, like one of the biggest labor pools.
Starting point is 01:24:58 And so when we talked to the Cognition folks about the reaction to Google and OpenAI launching Devin competitors, they're like, well, we still grew 40% last month, or something like that. And so, you know, I wonder if in codegen, where it's such a new market, it's not directly competitive with anything that exists.
Starting point is 01:25:20 So it's less zero-sum. I'm wondering if the note-taking market feels similar to you, or were you seeing Granola or other companies kind of act as more drop-in replacements for existing tools? Yeah, I think that's a great question. So we backed Granola really early on, because we knew Chris and his co-founder and we loved those guys. We didn't know what
Starting point is 01:25:43 they were building; we knew they were gonna build something in note-taking. But we said, you know what? This market's gonna move fast. We trust these guys. Let's go for it. And I think, somewhat to your point, there have been all these note takers before. Granola wasn't the first note taker.
Starting point is 01:25:58 There was Fireflies and Otter and all these things. But I think Granola has done a really, really good job of getting out of the user's way and establishing trust with the user. And I think that seems like a small thing, but I think that trust thing is gonna be really important if you go back to what I said about this context being king. Who are you gonna trust to take this context
Starting point is 01:26:21 or take this really, really important proprietary part of your work, in this case your meeting notes. A lot of people say, we trust Granola. Are they just going to hand it over to any old company that says, hey, now we want to screenshot your entire computer and suck every last piece of data out of you? And so I think part of it is, to your point, getting in early, getting big really fast, and establishing that user base and that market before it really matures, but also in a way that users really trust you and won't just rip you out because some other bigger company offers the same thing.
Starting point is 01:26:55 Yeah, yeah, that's a good point. Do you have any micro reactions to specific integrations? One of the big things OpenAI was pushing on was integrations with Google Docs and Drive and your email, and it feels like adding that extra context is potentially the next thing people are clamoring for. How important is the biz dev side of this business, in fact? I think it's really important. You know, I think it's really, really great that Anthropic started the whole MCP protocol. Obviously lots of others are adopting that now.
Starting point is 01:27:33 But I think to your point, we're now gonna start to see the battle lines being drawn. Like, who are we willing to integrate with? Who are we not willing to integrate with? Are we open or are we closed? Where's the data gonna go? Where's it not gonna go?
Starting point is 01:27:46 I think we're gonna start to see those alliances and allegiances form. And we've seen this before, with APIs, back in, what, the 2007 to 2010 era. Social media. They have an API, it's amazing. It's like, well, you don't know
Starting point is 01:28:00 how much that API's gonna cost. If it's $10,000 per day or something, that could completely upend your business. And so actually thinking about how that dynamic develops is almost more important than the standard, although I'm very glad we have a standard, that seems great. But each company is gonna have to decide
Starting point is 01:28:20 where the value accrual really lands, and then who knows, maybe there'll be some antitrust in 20 years, like we're seeing with Apple. Yeah, there's a big question around trust. You know, it's an evolving situation, but a California judge, I believe it was yesterday or the day before, ordered OpenAI to retain records of, I forget what OpenAI calls it,
Starting point is 01:28:46 but if you have like a disappearing query, the judge ordered that they have to retain that. They obviously said that's a huge overreach on user privacy. So incognito mode. Yeah. It's like not-so-incognito. Yeah, and that seems more like an issue with the court
Starting point is 01:29:02 and the specific judge making this massive overreach around privacy. But it raises the question of privacy in this era, when people are more willing than ever, across every app, to hand over all sorts of data. Yeah, and you have a direct incentive to reduce the level of privacy to get better results. Like if the model knows what kind of car
Starting point is 01:29:22 you drive when you ask it for new tires, it will give you better recommendations. So you want to lean into being anti-privacy to get better results. The world is definitely bifurcating into pro-privacy or like fully AGI-pilled folks. And there aren't that many people that are in the middle. So obviously we will have to figure it out
Starting point is 01:29:42 as a democratic society, ultimately vote, and hopefully sort it all out in the courts. But thank you so much for stopping by. This was great, Michael. We'd love to have you back. Talk to you soon. Yeah, I mean, guys, I just want to tell you, you know, I don't really aspire to ring the New York Stock Exchange bell one day. I aspire to hit that gong.
Starting point is 01:29:59 Hit that gong. Well, next time, come by. Come by. Great to see you, Michael. So we have a generational crash out going down. Oh really? On the timeline. We got a new post from Elon. Okay, I'll read it out. He says, and this is your live reaction, John: time to drop the really big bomb. @realDonaldTrump is in the Epstein files. That is the real reason they have not been made public. Have a nice day, DJT. Wow. That is a big bomb. But wait, didn't we already know this? Because isn't there that picture with Trump and Epstein
Starting point is 01:30:33 together? We're really in dark territory. I want to go back to AI business and technology. The business story here is that Tesla's down 17%. DJT is down 7%. Trump Coin is down 10%. Wow. They're all fighting, and this crash out is not good for anyone on either side.
Starting point is 01:30:51 Well, you know what's interesting? You know what's not down? Tokens generated, baby. We're still generating tokens every single day. The relentless march of artificial intelligence continues. So the other thing is, Elon shared, or sorry, Trump shared on Truth. It's funny they're each battling on their own platform. Oh yeah, they have different social networks.
Starting point is 01:31:10 Every billionaire should have their own, you know, social media network to get the word out, but Trump said, the easiest way to save money in our budget, billions and billions of dollars, is to terminate Elon's government subsidies and contracts. I was always surprised that Biden didn't do it. Wow. So, Ashley St. Clair is saying, hey, Donald Trump, let me know if you need any breakup
Starting point is 01:31:30 advice. I really don't know about it. And Dan Primack says, this cannot be a comfortable day for David Sacks. On the other hand, it's just the best day for Sam Altman. Well, we have someone from OpenAI here. We're going to stick to technology and business, but welcome to the show, Mark Chen. Good to see you. Great to see you guys.
Starting point is 01:31:49 Thanks for having me. Awkward day, but I'm excited to talk about deep research. I am excited to talk about AI products. Would you mind introducing yourself and kind of explaining what you do? Because OpenAI is such a large company now and there are so many different organizations. I'd love to know how you interact with the product and the research side, and anything else you can give to contextualize this conversation.
Starting point is 01:32:10 Yeah, absolutely. So first off, thanks for having me on. I'm Mark, I am the Chief Research Officer at OpenAI. So in practice, what that means is I work with our Chief Scientist, Jakub, and we set the vision for the research org, we set the pace, and we hold the research org accountable for execution. And ultimately we really just want to deliver these capabilities to everyone.
Starting point is 01:32:32 That's amazing. In terms of research, I feel like a lot of what happens on the research side is actually gated by compute. Is that a different team? Because if the researchers ask for a $500 billion data center, that feels like maybe a bigger ask. Yeah, it is useful for us to factor apart the problem of doing research and the problem of building up the capacity to do that research. So we have a different team, which Greg leads, that really thinks holistically about data center bring-up and how to get the most compute for us. And of course, when it comes to allocating that compute for research, you know,
Starting point is 01:33:06 Jakub and myself do that. That's great. And so what can you share that's top of mind right now on the research side? There's been this discussion of a pre-training scaling wall, the potential importance of reinforcement learning, reasoning. There are so many different areas to go into. What's actually driving the most conversations internally right now?
Starting point is 01:33:31 Yeah, absolutely. It's a really exciting time to do research. I would say versus two or three years ago, when people were trying to build this very big scaling machine, the reasoning paradigm changed a lot of that, right? Reasoning is really taking off, and it really opens up this new
Starting point is 01:33:51 playground, right? There are a lot of known unknowns, and also unknown unknowns, that we're all trying to figure out. It kind of feels like the GPT-2 era, right, where there are so many different hyperparameters you're trying to figure out. And then, like you mentioned, pre-training is not to be forgotten either. Today we're in a very different regime of pre-training than we used to be, right? We can't treat data as this infinite resource. I think a lot of academic studies have always kind of assumed you have some kind of finite compute, but infinite data.
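An aside not from the conversation: the finite-compute-versus-finite-data contrast he describes is often made concrete with compute-optimal scaling heuristics, for example the Chinchilla-style rule of thumb of roughly 20 training tokens per parameter, with training compute approximated as C ≈ 6·N·D FLOPs. A toy sketch, with all constants illustrative rather than exact:

```python
# Toy sketch of compute-optimal scaling (Chinchilla-style heuristic).
# Rule of thumb: train on ~20 tokens per parameter; training compute
# C ~= 6 * N * D FLOPs. All constants here are rough, illustrative values.

def compute_optimal(flops_budget, tokens_per_param=20.0):
    """Given a FLOPs budget C, with C = 6*N*D and D = r*N,
    solve for model size N (params) and token count D."""
    n_params = (flops_budget / (6.0 * tokens_per_param)) ** 0.5
    n_tokens = tokens_per_param * n_params
    return n_params, n_tokens

def data_limited(flops_budget, available_tokens):
    """Finite-data regime: D is capped, so the remaining compute goes
    into a larger N (or repeated epochs) instead of fresh tokens."""
    n_params = flops_budget / (6.0 * available_tokens)
    return n_params

if __name__ == "__main__":
    C = 1e24  # illustrative training budget, roughly GPT-3 scale
    n, d = compute_optimal(C)
    print(f"compute-optimal: {n:.2e} params, {d:.2e} tokens")
    # Cap data at 1e12 tokens: the "finite data, infinite compute" case,
    # where extra compute can no longer buy unique data.
    n_capped = data_limited(C, 1e12)
    print(f"data-limited:    {n_capped:.2e} params at 1e12 tokens")
```

The point of the sketch is only the shape of the trade-off: once the token supply is fixed, the compute-optimal recipe breaks down and the open research question becomes where else scale pays off.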
Starting point is 01:34:27 I don't think there's much study of, you know, finite data and infinite compute. And I think that also leads to a very rich playground for research. Do we need kind of a revision to the bitter lesson? Is that a refutation of the bitter lesson, or do we just need to rethink what the definition of scaling laws looks like? No, I don't think of any of this as a refutation of the bitter lesson.
Starting point is 01:34:54 Really, our company is grounded in: we want simple ideas that scale. I think RL is an embodiment of that. I think pre-training is an embodiment of that. And really at every single scale, we face some kind of difficulty of this form. You've got to find some innovation that gets you past the next bottleneck. And this doesn't feel fundamentally very different from that. What's most important right now on the actual compute side?
Starting point is 01:35:21 We heard from Nvidia earnings that we didn't get a ton of guidance on the shift from training to inference usage of Nvidia GPUs, but it feels like it must be coming. It feels like this inference wave is happening. Are those even the right buckets to be thinking about for tracking the story of artificial intelligence? Because, I mean,
Starting point is 01:35:50 if the reasoning tokens are inference tokens, but they're what lead to more intelligent models, it's almost back in the training bucket again. What buckets should we be thinking about? And how firmly are we in the applied AI era versus the research era?
Starting point is 01:36:17 It's such a, like a rich time to be doing research, but I do think, you know, inference is going to be increasingly important as well, right? It's such a core part of RL that you're doing rollouts. And I think, you know, we see 2025 as this year of agents, right? We think of it as a year where models are going to do a lot more autonomous work. You can let them kind of be unsupervised for much longer periods of time. And that is just going to put big demands on inference, right? When you think about kind of our overall vision, right?
Starting point is 01:36:48 We lay it out as a series of steps and levels on the way to AGI, right? And I think the pinnacle, really that last level, is organizational AI, right? Like you can imagine a bunch of AIs all interacting. And yeah, I think that's just gonna put huge demands on inference, right? On that organizational question,
Starting point is 01:37:08 I remember reading AI 2027, and one of the things they proposed was that the AIs would literally be talking to each other in Slack. Does that seem like the way you imagine agents playing out, like using the same tools as humans? Instead of, kind of, one agent says, I'm gonna go talk with Teams, and I'm gonna talk with Slack,
Starting point is 01:37:31 and I'm gonna do a little negotiating on a per-seat basis. But maybe it just happens super, super fast, 24/7. Or is there like a new machine language that emerges? Yeah, I mean, I think one thing that's really helped us so far in AI development is to come in with some priors for how humans do things. If you bake in those priors, they typically are great starting points.
Starting point is 01:37:54 So I could imagine maybe you start with something that's Slack-like and give it enough flexibility that it can kind of develop beyond that and really figure out the way that's most effective for it to communicate. One important thing though is, we want interpretability too, right? I think it's very helpful for us today that what the agents do is easy for us to read and interpret. And I don't think you want that to go away as well.
Starting point is 01:38:21 So I think there are a lot of benefits, even from a pure debug-the-whole-system perspective, to letting the models speak in a way that's familiar to us. And you can also imagine we might want to plug into the system too, right? So whatever interfaces we're familiar with, we would ideally like our model to be familiar with as well. I think it's also pretty compatible with, you know, we hit a big milestone.
Starting point is 01:38:49 We hit, I think, three million paying business users fairly recently. Let's go! Yeah, there we go, let's go. Yeah. Again, I think... Three gong hits for three million. The gong will keep ringing for a while.
Starting point is 01:39:05 Sorry, we had to do it. I was hoping you would drop a number. Yeah, yeah. Congratulations. That's actually huge. That's amazing. But I think one big part of that is we have connectors now. We're connecting into Google Drive. I think you can imagine Slack integrations, things like that.
Starting point is 01:39:26 I think we just want the models to be familiar with the ways we communicate and get information. Yeah. Can you talk about benchmarking? It feels like we're potentially entering- Yeah, do you think about benchmarks at all? Oh, yeah, a lot. I mean, but I think it's a difficult time for benchmarks, right?
Starting point is 01:39:44 I think we used to be in this world where you have these human-written benchmarks for other humans. And I think we all have these norms for what are good benchmarks. We've all taken the SAT. We all have a good conception of what it means to get whatever score on that.
Starting point is 01:40:00 But I think the problem is the models are already at the point where even the hardest human-written benchmarks for other humans are really near saturated or saturated, right? I think one clear example here is AIME, probably the hardest auto-gradable human math eval, at least in the US. And yet the models are consistently getting 90-plus percent on it. And so what that means is there are kind of two different things that people are doing, right? They're developing model-based benchmarks, right? These aren't kind of things
Starting point is 01:40:41 that we would give to an ordinary human, like Humanity's Last Exam, or things from, you know, Epoch AI, that are really at the frontier of what people can do. And I think the hard thing is that they're not grounded in intuition, right? You don't have a lot of people who have taken these exams, so it makes it harder to calibrate on whether this is a good exam or not. One of the exciting things on the flip side is I really do think we're at the era where models are going to start innovating, right? Because once you've passed the hardest human
Starting point is 01:41:15 written exams, that's kind of at the edge of innovation. And I think you already see that with the models, right? They're helping to write parts of papers. And the other way people have shifted is, you know, there are these ultra-frontier evals, but people are also just indexing on real-world impact, right? You look at your revenue, the value you deliver to users. And I think that's ultimately what we care about.
Starting point is 01:41:42 Can you bring that back to interpretability research? With these super, super hard math evals, for example, are we doing the right research to understand whether the thought process mirrors, not just one-shotting the answer, oh, you memorized it or you magically got it correct, but that the model actually took the correct path? Kind of like, you know, you're
Starting point is 01:42:07 graded for your work, not just the answer, if you're in grade school. And, you know, Dario said that interpretability research will actually contribute to capabilities and even give a decisive lead. Do you agree with that? What's your reaction to that concept of interpretability research being very important? Yeah, I mean, we care a lot about it here at OpenAI as well. So one thing that we care a lot about
Starting point is 01:42:31 is interpreting how the model reasons, right? Because I think we've had a very kind of specific and strong view on this in that we don't want to apply optimization pressure to how the model thinks so that it can be faithful in the way it thinks and to expose that to us without any kind of incentives to cater to what the user wants. I think it's actually very important
Starting point is 01:42:56 to have that unfiltered view because oftentimes, if the model isn't sure, you don't want to hide that fact, right? Just for it to kind of please the user. And sometimes it really isn't sure, right? And so we've really done a lot of work to try to promote this norm of chain of thought, faithfulness and interpretability. And I think it gives you a lot of sense
Starting point is 01:43:21 into what the model is thinking and what pitfalls it can go off into if it's not reasoning correctly. That's such an important point, because if you have somebody on your team and they come to you and say, hey, I think this is the right answer, but we should probably verify it, it's still valuable. Totally. It puts you on the right path. If somebody comes to you with a hundred percent confidence, this is the truth, and they're wrong, trust is just destroyed. Yeah, totally.
Starting point is 01:43:46 Don't you guys feel like safety felt a lot more theoretical a couple of years back? But today, the things people were talking about a couple of years ago, scalable oversight, really having the model be able to tell you and convince you that the work it did was right, feel so much more relevant right now. Just because the capabilities are so strong.
Starting point is 01:44:06 Yeah, I mean, just personally, I've completely flipped from being like, oh, the safety research is not that valuable, because I'm not that worried about getting paperclipped. It just seemed like a very low likelihood that that's the bad ending, like an immediate foom; all these crazy gray goo scenarios were just so abstract, so sci-fi.
Starting point is 01:44:24 It just felt like economics would fall into place, and there would be something like a nuclear ending, which is like, we didn't build nuclear plants, we just stopped everything, because we seemed to be good at doing that. But now we're actually seeing things. Yeah, it's crazy how fast it's been, right? My personal story is, what got me into AI was AlphaGo, right? Just watching it get to that level of capability.
Starting point is 01:44:49 Yeah. And it was such an optimistic and also a little bit of a sobering message, right? When you saw Lee Sedol get beat. And I just remember the coding models, when we first launched, I think, the very OG Codex with GitHub Copilot, it was maybe under a thousand Elo on Codeforces. And I still remember the meeting I walked into where the team showed my score and they're like, hey, the model is better than you. And you come full circle and it's like, wow, I put decades of my life into this, and the capabilities are there. So if, you know,
Starting point is 01:45:27 I'm kind of at the top of my field in this thing and it's better than me, what can it do? Yeah. That's amazing. I have so many more questions on AlphaGo. Are there lessons from how scaling played out there that we can abstract into the rest of AI research?
Starting point is 01:45:46 What I mean is, as I remember it, the AlphaGo training run was not 100K H200s. But what would happen if we actually did an AlphaGo-style training run at that scale? I mean, it would be an economic money pit, right? There's no economic value in it. But let's just say some benevolent trillionaire decides, I'm gonna spend a billion dollars on a training run
Starting point is 01:46:12 to beat AlphaGo and go even bigger. Is Go at some point solved? Would we see diminishing scaling curves? Could we throw extra RL at it? Could we port back everything we're doing in general AGI research and just continue fighting it out in the world of Go? Or does that end, and does that teach us anything? Yeah, honestly, I feel like if you really are curious about these mysteries,
Starting point is 01:46:38 join our team. That's the first thing I want to say. Yeah, I mean, really the central problem of today is RL scaling, right? Yeah. When you look at AlphaGo, it's a narrow domain, right? And in some sense, that limits the amount of compute you can pump into it. But even small toy domains can teach you a lot about how you scale RL, like, what are the axes where it's most productive to pump scale in? I think a lot of scaling research just looks like that,
Starting point is 01:47:05 whether it's on RL or pre-training. You identify a lot of different variables under which you can scale, and ask where you get the best marginal impact for pumping scale in. I think that's a very open question for RL right now. And then there's what you mentioned as well, going from narrow to broad, right?
Starting point is 01:47:26 Does that give you a lever to pump a lot more scale in as well? I think when you look at our reasoning models today, they're a lot more broad-based than just being an expert system on Go. So yeah, I really do think there are so many levers to scale. What about Move 37? That was such an iconic moment in that AlphaGo versus Lee Sedol match. AlphaGo placed Move 37, and it's very unconventional.
Starting point is 01:47:54 Everyone thinks it's a blunder. It turns out not to be; it turns out to be critical. It turns out to be innovation. Do you think, we're certainly post-Turing test in language models, we're probably post-Turing test in image generation, but it feels like we're pre-Move 37 in text generation, in the sense that there hasn't been a fully AI-generated book where everyone is just, oh, it's the new Harry Potter, everyone has to read it,
Starting point is 01:48:22 it's amazing, and it's fully AI-generated. Or with images, the images do go viral, but they go viral because they're AI. Move 37 in the context of Go did not go viral because it was AI, but because it was actual innovation. So is that the right frame? Does that make any sense? I think it's not the wrong frame. So, some quick thoughts on that.
Starting point is 01:48:44 When you have something that's very measurable, like win or lose, right, something like Go, it's very easy for us to judge, right? Did the model do something right here? And the more fuzzy you get, the harder it is, right?
Starting point is 01:49:04 Like when it comes to, is this the next Harry Potter? Right, it's, I think, a fairly universally loved book, but you know, there are some haters. And yeah, I think it is just hard when it comes to these human subjective things, where it's really hard to put down in words what makes you like Harry Potter, right?
Starting point is 01:49:25 And so I think those are always gonna lag a little bit, but we're developing more and more techniques to attack these more open-ended domains. And I don't know, I wouldn't say that we're not at an innovative stage today. I think my biggest touch with this was when we had the models compete on the IOI last year. So the IOI, it's like the international,
Starting point is 01:49:53 basically the Olympics for computer science; the top four kids from each country go and compete. And these are really, really tough problems, selected so that they require some innovative insight to solve, right? And we did see the model come up with solutions, even to some very ad hoc problems. And so there was a lot of surprise for me there,
Starting point is 01:50:22 right? I was completely off-base about which problems the model would be able to solve, right? I kind of categorized the six problems, some as a little bit more standard, some as a little bit more out of the box. I thought it wasn't gonna be able to solve the more out-of-the-box one, but it did. And I think that really does speak to the fact that these models have the capacity to innovate, especially trained with RL.
Starting point is 01:50:49 Now, put that in the context of what's going on with ARC-AGI. Obviously, OpenAI has made incredible progress there, but when I do the problems, they seem easy. And when I look at the IOI sample problems, I think it would be a 20-year process for me to figure out how to solve them, yet I can do the ARC-AGI problems on my phone. Is this the spiky intelligence concept?
Starting point is 01:51:13 Is this something where a small tweak in algorithmic design just one-shots ARC-AGI, or is there something else going on there that we should be aware of? Yeah, I mean, I think part of this is the beauty of ARC-AGI as well, right? I'm not sure there's another benchmark that's as intuitive and simple for humans
Starting point is 01:51:33 but hard for the models. I think that's really one of the things they optimized for with that benchmark. I do think when it comes to models, though, there's a little bit of a perception gap as well. Models aren't used to this kind of native, screen-type input. I think there's a lot we can bridge there.
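To make that perception gap concrete with a hypothetical example (not from the conversation): an ARC-style task typically reaches a model as a flattened text serialization of the grid, rather than the rendered image of colored squares a human solver sees. A minimal, illustrative sketch:

```python
# Illustrative sketch: how an ARC-style color grid gets flattened into text
# before a language model ever sees it. Humans rarely write grids this way,
# which is one reason models are under-trained on the format.

def grid_to_text(grid):
    """Serialize a 2D grid of integer color indices into a plain-text block,
    one row per line, cells separated by spaces."""
    return "\n".join(" ".join(str(cell) for cell in row) for row in grid)

grid = [
    [0, 0, 3],
    [0, 3, 0],
    [3, 0, 0],
]

print(grid_to_text(grid))
# The model receives only this token stream; the diagonal of color-3 cells
# that jumps out visually to a human has to be inferred from the text.
```

The design point is that the same structure is trivially perceptible in one modality and obscured in the other, which is the gap better visual perception is meant to close.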
Starting point is 01:51:53 Actually, even o4-mini, it's a state-of-the-art multimodal model in many ways, including visual reasoning. And you're starting to build up the capacity for the models to take images, manipulate and reason about them, generate new images, write code on images.
Starting point is 01:52:12 And I think it's just been under-focused, but when I talk to researchers in the field, they all see this as a part of intelligence too, and we're going to continue to focus there. Yeah, is ARC-AGI, if we're dropping a buzzword on it, like program synthesis? Is there a world where, I know the images, we see them as renderings of squares

Starting point is 01:52:38 in different colors, but when they're fed into the LLM, they're typically just a stream of numbers, effectively. Is there a world where actually adding a screenshot is what's important? Like visual reasoning. Yeah, so I think that could be important. It's just that whenever it comes to textual representations of grids, models today just don't really do that well, right? And I think it's because

Starting point is 01:53:06 humans don't really ever write down textual representations of grids. We have a chessboard, but no one really just types it out as a grid. And so the models are a little under-trained on what that looks like and what that means. So I think with more reasoning, we'll bridge that gap, and with better visual perception, we'll bridge that gap. Yeah. How are you thinking about the role of non-lab researchers in the ecosystem today? I'm sure you try to recruit some of the best ones, but the ones that don't join your team. Tell us about the one that got away. Yeah, the one that got away.
Starting point is 01:53:48 Yeah, no, I mean, I think it's still actually a fairly good time to be doing research in specific domains, right? And the style is just very different. You do feel the pull of non-lab researchers into labs, because I think they feel like a lot of the burning problems in the field are at scale, right? And that's one of the unfortunate things too, right? When you look at reasoning, you just don't see that happen at small scale, right?

Starting point is 01:54:17 There's a certain scale at which it starts becoming signal-bearing, and that requires you to have resources, right? But a lot of the really good work that I've seen, like experimental architectures, a lot of good work is happening in the academic world there. A lot of study in optimization,
Starting point is 01:54:38 a lot of study in kind of like GANs, you know, there's certain fields where you see a lot of fruitful research that happens in academia. Yeah, that makes a lot of sense. How about consumer agents? How are you thinking about them? You talked earlier about sort of B2B adoption, and that's all very exciting.
Starting point is 01:54:57 But how much do you and the research org think about breakout consumer agent products? Yeah, that's a fantastic question. I think we think about it a lot; that's the short answer. We really do think this year we're trying to focus on how we can move to the agentic world, right?

Starting point is 01:55:15 And when I think about consumer agents, I think ChatGPT proved that people got it, right? People get conversational agents when they're conversational models. But when it comes to consumer agents, we have a couple of theses that we've tried out in the world. I think one is deep research.
Starting point is 01:55:35 I think this is something that can do five to 30 minutes of work autonomously, come back to you, and really synthesize information. It goes out there, gathers, collects, and compresses the information into a form that's useful. A little bit of pushback there: I can see that as a consumer product when someone like Aidan says, I want new towels, and he uses deep research to figure out what the best towel is across every dimension. But when I think of deep research,
Starting point is 01:56:06 yes, it has applications with students, but it's often... some of that might just be the paradigm. And I guess it could be consumers being like, give me a deep research report on this country and where to travel and things like that. We keep using this flight example, but I haven't actually tried to book a flight with deep research.
Starting point is 01:56:21 It's totally possible that it could go and pull all the different flight routes and calculate all the different delays and all the different parameters of, if I fly to this airport I can park, or I can use valet here, or something like that, yeah. Yeah, and I guess when I think of agents, deep research is curating information

Starting point is 01:56:40 you can take action on, but at what point is action a part of that loop, right? Where you can not only curate a list of flights that you want, but then actually go out and have agency. Yeah, I think one of our explorations in that space is Operator, right? It's where you feed raw pixels from your laptop, or from some virtual machine, into the model, and it produces either a click or some keyboard actions.
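The pixels-in, actions-out interface described here can be pictured as a very narrow loop. Below is a minimal sketch, not OpenAI's actual Operator implementation; the `policy` stub is a hypothetical stand-in for the vision model:

```python
# Toy sketch of an Operator-style step: raw screen pixels in, one UI action out.
# Illustrative only; a real system would run a large vision model in policy().
from dataclasses import dataclass
from typing import Union


@dataclass
class Click:
    x: int
    y: int


@dataclass
class TypeKeys:
    text: str


Action = Union[Click, TypeKeys]


def policy(screenshot: bytes) -> Action:
    """Hypothetical stand-in for the model: maps pixels to a single action."""
    # Placeholder heuristic so the sketch runs end to end.
    return Click(x=100, y=200) if len(screenshot) % 2 == 0 else TypeKeys(text="hello")


def agent_step(screenshot: bytes) -> Action:
    # The whole contract is this narrow: pixels in, one click or keystroke out.
    # The cost of one wrong action is why robustness matters so much here.
    return policy(screenshot)


assert agent_step(b"\x00\x01") == Click(x=100, y=200)
assert agent_step(b"\x00\x01\x02") == TypeKeys(text="hello")
```

The narrowness of the interface is the point: unlike deep research, every output is an irreversible action in the world, which is why the trust bar is so much higher.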
Starting point is 01:57:09 Right. And so there it's taking action. And I think the trouble is, you don't ever want to mess up when you're taking action. Yeah. The cost of that is super high. You only have to get it wrong once to lose trust with a user. And so we want to make sure that it feels super robust before we get to the point where we're like, hey, look, here's a tool. That's so different than deep research,

Starting point is 01:57:38 because you can wind up on some news article and read one sentence that gets a fact wrong, or the comma's in the wrong place and the number's off. But that's just the expectation for text and analysis: if you delegated that, you're going to expect a few errors here and there. Oh, that's actually a different company name, or that's an old data point and there's new data. But it's very different if I ask you to book a flight and you book the wrong flight, and I wind up in Chicago instead of New York.
Starting point is 01:58:05 Exactly. And I think the reason why we care so much about reasoning is because that's the path we get reliable agents through. Sure. We've talked about reasoning helping safety, but reasoning is also helping reliability, right? Imagine what makes a model so good at a math problem: it's banging its head against it, it's trying a different approach, and then it's adapting based on what it failed at last time. And I think that's the same kind of behavior you want your agents to have. It tries things, adapts,

Starting point is 01:58:36 and keeps going until it succeeds. And humans do this every day. You're booking a flight, you keep hitting an error, it's not clear which form field you missed, right? And you're just sort of banging your head against the computer, and eventually it says, okay, you're booked, right? Yeah, I think that's a great call-out. Yeah, I mean, there are so many more questions we could go into, but I'm interested in the scaling of RL and kind of the balancing act between pre-training, RL, and inference, just the amount of energy
Starting point is 01:59:07 that goes into getting a result when you distribute it over the entire user base. How is that changing? And I guess, are we post really big runs? Is this going to be something that's continually happening online? It feels like we're moving away from the era of, oh, some big development, some big run happened and now we're reaping the fruits of it, toward a more iterative process.

Starting point is 01:59:35 Yeah, I mean, I don't see why it has to be so, right? I think if you find the right levers, you can really pump a lot of compute into RL as well as pre-training. It is a delicate balance, though, between all of these different parts of the machine. And when I look at my role with Jakub, it's figuring out how this balance should be allocated, where the promising nuggets are arising from, and resourcing those. In some sense, I feel like part of my job is being a portfolio manager. That's a lot of fun. Well, thank you so much for joining.
Starting point is 02:00:10 This was a fantastic conversation. We'd love to have you back and go deeper. Great hanging, Mark. We'll talk to you soon. Yeah, peace. Have a good one. Next up, we have Sholto Douglas from Anthropic coming on the show.
Starting point is 02:00:20 I'm getting so many. Jordy is giving us the update on that. No, I'm just getting a lot of messages saying why no one cares about AI. Talk about the drama on the timeline. Well we do care about AI. We care a lot about AI. But it is a mess out there. Wow. The end of the Trump-Elon era. I don't know. Maybe we have to get some people on to talk about it
Starting point is 02:00:46 tomorrow or something. We're going to do it today. Anyway, we have Shalto from Anthropic in the studio. How are you doing? What's going on? Good to see you guys. Hopefully, you're staying out of the chaos on the talk. Don't open the time.
Starting point is 02:01:01 Don't open. We're doing you a favor. Sweet child. Move to Twitter. Move to Twitter. Yeah, mute everything. Stay focused on the application layer. Stay focused on the time. Don't open. What do you favor? Sweet child. Moves to Twitter. Moves to Twitter right now. Yeah, mute everything. Stay focused on the application layer. Stay focused on the mission.
Starting point is 02:01:09 Stay focused on the next training run. Humanity really cannot afford for any AI researchers to open X today. What a hilarious day. Anyway. I mean, it's a blackout 24 hours, guys. Yeah. How are you doing?
Starting point is 02:01:22 What is new in your world? What are you focused on mostly day-to-day? Maybe just as a way of an intro. Yeah, so at the moment I'm focused really hard on scaling RL. That is the theme of what's happening this year, and we're still seeing these huge gains. We do a 10x compute increase in RL and we're still getting very distinct gains based on that. And because RL wasn't scaled anywhere close to how much pre-training was scaled at the end of last year, we have

Starting point is 02:01:49 basically a gamut of riches over the course of this year. So where are we in that RL scaling story? Because I remember some of the rough numbers around GPT-2 and GPT-3: we were getting up to, it costs $100 million, it's going to cost a billion dollars. Just rough order of magnitude, not even from Anthropic, just generally: what does a big RL run cost? Are we talking 10K H200s or 100K? Are we going to throw the same resources at it? And if so, how soon?
Starting point is 02:02:21 Yeah. So I think in Dario's essay at the beginning of the year, he said that a lot of RL runs were only like a million dollars back in December. RL is also far more naively parallelizable and scalable than pre-training. In pre-training, you need everything in one big data center, ideally, or you need some clever tricks. With RL, you could in theory, like what the Prime Intellect folks are doing, scale it all over the world. And so you are held back far less than you are in pre-training. Sure. So everyone and their mother has a billion dollars now. Hundreds of thousands of GPUs getting pumped all over the place.
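The "naively parallelizable" point is that RL rollout generation is embarrassingly parallel: each worker only needs a read-only copy of the current policy and ships back a (trajectory, reward) pair, so workers can in principle sit anywhere. A toy sketch, with a thread pool standing in for a fleet of machines and all names hypothetical:

```python
# Toy sketch of distributed RL rollouts: workers share nothing but the policy
# version they started from, so fan-out is trivial compared to pre-training,
# where every gradient step couples all the hardware together.
from concurrent.futures import ThreadPoolExecutor
import random


def rollout(job):
    """One independent episode; needs no communication with other workers."""
    policy_version, seed = job
    rng = random.Random(seed)                       # per-worker determinism
    trajectory = [rng.random() for _ in range(10)]  # stand-in for env steps
    reward = sum(trajectory)                        # stand-in for a verifiable reward
    return policy_version, reward


# The pool stands in for machines scattered across data centers or the world.
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(rollout, [(1, seed) for seed in range(16)]))

assert len(results) == 16                           # every rollout came back
assert all(version == 1 for version, _ in results)  # all from the same policy
```

Only the learner that consumes these (trajectory, reward) pairs needs tightly coupled hardware, which is why the rollout side scales out so cheaply.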
Starting point is 02:03:07 I feel like we're not GPU poor as a, as a, as a society. Uh, maybe some companies need to justify it in different ways, but it sounds like there's some sort of, uh, uh, like reward hacking problem that we're working through in terms of scaling RL. What are all of the problems that we're working through to actually go deploy the capital cannon at this problem? Yes, so I mean think about what you're asking the model to do in RL is you're asking it to achieve some goal at at any cost basically. Yeah. And this comes with a whole host of like behaviors which you may not intend. In
Starting point is 02:03:42 software engineering this is really easy. I easy. It might try and hack unit tests or whatever. In much more longer horizon real world tasks, you might ask it to say go make money on the internet. And it might come up with all kinds of fun and interesting ways to do that unless you find ways to guide it into following the principles that you want it to obey, basically, or to align it with your idea of what's sort of best for humanity. And so it's actually, it's a pretty intensive process. There's a lot of work to find out and hunt down all the ways these models are hacking through the rewards and patch all of that.
Starting point is 02:04:15 Yeah. Are we going to see scaling in the number of rewards that we're RLing against, if that makes sense? I would imagine that at a certain point, unless we come up with kind of the genesis prompt, go forth and be fruitful and multiply or something, you could imagine training runs just knocking down one problem after another. Is that kind of the path that we're going down? I very much think so.
Starting point is 02:04:49 There's this idea in which the world becomes an RL environment machine in some respects. Because there's just so much leverage in making these models better and better at all the things we care about. And so I think we're going to be training on just everything in the world. Got it. And then, and then does that lead to,
Starting point is 02:05:08 um, more model fragmentation models that are good at programming versus writing versus poetry versus image generation or, or, or does this all feed back into one model? Does the idea of the consumer needing to pick a model disappear? Are we in a temporary period for that paradigm? to pick a model disappear? Are we in a temporary period for that paradigm? I think the main reason that we've seen that so far is because people are trying to make the best of the capital. We are all still GPU poor in many ways. And people are focusing those GPUs on the spectrum of wars that they think is most important. And I'm a bit of a big model guy.
Starting point is 02:05:45 I really do think that similar to how we saw with large pre-trained models before, with small fine-tuned models made it, like had gains over the sort of GPT-2 era, but then were obsolete by GPT-4 being generally good at everything. I think to be honest, you're gonna see this generalization
Starting point is 02:06:02 and learning across all kinds of things. That means you benefit from having large single models rather than specialization or area fine tuned models. Can you talk a little bit about the transition from or many, any differences between RLHF and just other RL paradigms? Yes. So RLHF, you're trying to maximize a pretty deep, likey signal things like airwise like what the humans prefer And I don't know if you've ever tried to do this like judge to language model response I get prompted for that all the time right and I'm always like I don't want to read both of those
Starting point is 02:06:34 I'll just click the one exactly exactly. Yeah, I click one of the random ones Yeah, or I click like the one that just looks bigger or I'll read the first two sentences, but yeah, I'm not giving straight. I'm not, I'm not being, I'm not doing my job as a, as a human reinforcer. Exactly. Human preferences are easy to hack. Yeah, totally. Environments in the world are much truer. You can find them. So something like, did you get your math question right? Is a very real and true reward. Does the code compile, right? Does the code compile? Exactly. Did you make a scientific discovery? We've got very little
Starting point is 02:07:09 rewards right now, but pretty quickly over the next year or two, you're going to start to see much more meaningful and long horizon rewards. You're going to see models bribing the Nobel committee to win the Nobel Prize. Well, we'll get a good reward hack. There's reward hacking. But that's something we want to prevent, right? Exactly. Yeah, that hacking. But that's something you want to prevent, right? Exactly. Yeah, yeah, that's the real nightmare scenario.
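The contrast being drawn, a leaky pairwise preference score versus a "true" environment reward like does-the-code-compile, can be made concrete. A minimal sketch, not any lab's implementation; the preference proxy is deliberately naive to show how hackable such signals are:

```python
# Sketch of two reward styles: a hackable preference proxy vs. a verifiable check.
import subprocess
import sys
import tempfile


def preference_reward(response: str) -> float:
    """Stand-in for a learned human-preference score. Easy to hack: longer,
    nicer-looking answers score higher regardless of correctness."""
    return min(len(response) / 1000, 1.0)  # rewards verbosity, not truth


def verifiable_reward(candidate_code: str) -> float:
    """A 'true' environment reward: does the code actually compile?"""
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
        f.write(candidate_code)
        path = f.name
    result = subprocess.run([sys.executable, "-m", "py_compile", path],
                            capture_output=True)
    return 1.0 if result.returncode == 0 else 0.0


# A verbose but syntactically broken answer maxes out the preference proxy
# while failing the real check.
broken = "def add(a, b) return a + b\n# " + "great explanation! " * 100
assert preference_reward(broken) == 1.0
assert verifiable_reward(broken) == 0.0
assert verifiable_reward("def add(a, b):\n    return a + b\n") == 1.0
```

The same logic scales up: the harder the reward is to fake (a passing test suite, a real discovery), the more signal it carries, and the more effort has to go into hunting down the remaining ways to hack it.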
Starting point is 02:07:28 What about, there are so many different problems we run into where it feels like it's just really, really hard to design any type of eval. My kind of benchmark whenever a new model drops is just: tell me a joke. They're always bad. Or even the latest Veo 3 video that went viral, where somebody prompted a stand-up comedy joke, and it was kind of a funny joke, but it was literally the top Reddit result for "joke" on Google. It clearly just took that joke and instantiated it in a video that looked amazing, but it wasn't original in any way. And so we were joking that the RLHF loop for that is an

Starting point is 02:08:15 endless cycle of comedians running AI-generated material, and then microphones in all the comedy clubs to feed back what's getting laughs. But, but. I mean, honestly, that would work pretty well. Yeah. If any comedians want to hook us up with an RL loop, I mean. Yeah, yeah. But for some of those, as you go down the curve,
Starting point is 02:08:36 it feels like each one gets harder and harder to actually tighten the loop. We see this with longevity research, where it takes 100 years to know if you extended a human life. Yes, you could create a feedback loop around that, but every iteration is going to take hundreds of years. So even if you run the cycle, it's irrelevant in the timeframes we talk about with AI. So,
Starting point is 02:08:57 talk to me about like, are you running into those problems or, or, or, will there be like another approach that kind of works around those? So there are a lot of situations where you can get around this by just running much faster than real time. Like let's say the process of building a giant app, like building Twitter, right? It's something that would take human months,
Starting point is 02:09:15 but if you got fast enough and good enough AIs, you could do that in several hours. Acralize heaps of AI agents that are building, you know, things like that. And so you can get a faster reward signal in that way. In domains that are less well-specified like humor, I agree, it's really, really hard. And this is like why I think in some respects, like creativity is like at the at the top end of the spectrum, like true
Starting point is 02:09:34 creativity is much, much harder to replicate than the sort of like analytical scientific style reasoning. Yeah. And that will just take more time. You know what the models actually are pretty good at making jokes about being an AI this news fresh Like everything else is kind of a weird copy of something like it's like it just it feels like it's derivative basically It's trying to infer what humor is and it doesn't really understand it but jokes about being an AI are quite funny Yeah, I I think this also might be I don't know if it was directly reward hacking, but I noticed that I think this also might be, I don't know if it was directly reward hacking, but I noticed that, uh,
Starting point is 02:10:05 one of the new models dropped and a bunch of people were posting these 4chan-style "be me" memes. And it seemed like they were kind of hacking the humor by being hyper-specific about an individual they could find information on online. So you're laughing at the fact that, oh wow, that is something that I've posted about. It's making a reference, but it's not really that funny to me,

Starting point is 02:10:29 other than, wow, it really did its research. It really knows Tyler Cowen intimately, which is cool, but I didn't find it hilarious. Yeah, yeah, yeah. Very interesting. Let's talk about deep research projects and products. We were talking to Will Brown, and he was saying AGI is here with some of the bigger models, but over the time horizon where AGI should feel consistent, it diverges. So you could be working with someone who's, you know, 100 IQ, but they will stay consistent for years

Starting point is 02:11:03 as an employee, or they'll keep living their life. Whereas a lot of these super smart models work really well, and then after a few minutes of work the agents kind of diverge and go into odd paradigms. It feels very not human. They're hyper-intelligent in one way and then extremely stupid in another.
Starting point is 02:11:24 What's going on there? What is the path to extending that? Is that more about having better planning and better dividing up of the task? Or will this just naturally happen through RL and scale? Yeah, so there's that jaggedness, right? That's what we call what you're seeing.
Starting point is 02:11:42 And I think that is largely a consequence of the fact that something like deep research has probably been RL'd to be really good at producing a report. But it's never been RL'd on the act of producing valuable information for a company over a week or a month, or making sure the stock price goes up in a quarter, or something like this.

Starting point is 02:12:01 It doesn't have any conception of how that feeds into the broader story at play. It can kind of infer it, because it's got a bit of world knowledge from the base model and this kind of stuff, but it's never actually been trained to do that in the same way humans have. So to extend that, you need to put them in much longer-running, long-horizon settings. And so deep research needs to become, you know, deep operate-a-company-for-a-week, kind of thing.
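The gap described here, RL'd on per-report quality but never on week-long outcomes, is essentially the difference between a dense reward every step and a sparse reward that only arrives when the long-horizon episode resolves. A toy sketch of the sparse case; all numbers and names are arbitrary illustrations:

```python
# Toy sketch of a long-horizon RL environment: no signal mid-episode, a single
# sparse reward at the end of the "week". This is the credit-assignment regime
# that "operate a company for a week" style training would live in.
from dataclasses import dataclass


@dataclass
class LongHorizonEnv:
    horizon: int = 7       # steps per episode, e.g. days of work
    goal: float = 5.0      # cumulative progress that counts as success
    progress: float = 0.0
    steps: int = 0

    def step(self, action_quality: float) -> float:
        self.progress += action_quality
        self.steps += 1
        if self.steps < self.horizon:
            return 0.0     # silent mid-episode: the agent gets no feedback yet
        return 1.0 if self.progress >= self.goal else 0.0  # sparse terminal reward


env = LongHorizonEnv()
rewards = [env.step(0.8) for _ in range(7)]
assert rewards[:6] == [0.0] * 6   # six silent steps
assert rewards[6] == 1.0          # 7 * 0.8 = 5.6 >= 5.0, success at the horizon
```

Report-writing sits at horizon 1, where every episode gives immediate feedback; stretching the horizon is exactly what makes the training problem harder.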
Starting point is 02:12:24 Is that the right path? It feels like the longest-running LLM query used to be just a few seconds, maybe a few minutes. And I remember when some of the reasoning models came out, people were almost trying to stunt on it by saying, oh, I asked it a hard question and it thought for five minutes. Now deep research is doing 20 minutes pretty much every time. Is the path two hours, two days? Or are we going to see efficiency gains such that we just get the 20-minute results in two minutes, and then two seconds?
Starting point is 02:12:59 Yeah. So this is somewhere where like inference in many respects and prioritization becomes really important. So both how fast is your inference, if that literally affects the speed at which you can think and the speed at which you can do these experiments, also how easily you can parallelize becomes really important. Can you dispatch a team of sub-agents to go and do deep research and compile sub-reports for you
Starting point is 02:13:19 so that you can do everything in parallel? These kinds of, it's both, there's an infrastructure question here, um, that feeds up from the hardware and the chips and this kind of stuff, uh, to designing better chips for better inference and all this. Um, and, and an RL question of like, you know, how well can you pro-lize and all this? So I think we just need to compress the timelines, compress the time,
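The sub-agent picture sketched here, dispatch parallel researchers and compile their sub-reports, is a fan-out/fan-in orchestration pattern at heart. A hedged sketch with asyncio; the `sub_agent` coroutine is a hypothetical stand-in for a real model call:

```python
# Fan-out/fan-in sketch of parallel deep-research sub-agents: wall-clock time
# is bounded by the slowest sub-task, not the sum of all of them.
import asyncio


async def sub_agent(topic: str) -> str:
    """Hypothetical sub-agent; in reality this would be a long model call."""
    await asyncio.sleep(0.01)  # stands in for minutes of autonomous research
    return f"sub-report on {topic}"


async def deep_research(question: str, topics: list[str]) -> str:
    # Fan out: every sub-agent runs concurrently.
    sub_reports = await asyncio.gather(*(sub_agent(t) for t in topics))
    # Fan in: a real system would synthesize; here we just stitch them together.
    return question + "\n" + "\n".join(sub_reports)


report = asyncio.run(deep_research("best flight to NYC",
                                   ["routes", "delays", "parking"]))
assert report.count("sub-report") == 3
```

With three 0.01-second sub-tasks the whole call takes roughly 0.01 seconds, not 0.03, which is the point: the fan-out hides latency, and faster inference shrinks the slowest branch.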
Starting point is 02:13:41 the compress, the timeframes, basically. Yeah. So, uh, if I'm, if I'm like an extremely big model and I'm running an agentic process, like how much am I hankering for like a middle sized model on a chip or like baked down into silicon that just runs super fast because it feels like that's probably coming. We saw that with the Bitcoin progression from CPU to GPU to FPGA to ASIC. Do you think we're at a good enough point where we can even be discussing that? Because every time I see the latest mid-journey,
Starting point is 02:14:15 I'm like, this is good enough. I just want it in two seconds instead of 20. But then a new model comes out, and I'm like, oh, I'm glad I didn't get stuck on that path. I'm just delving. But yeah, how far away from how far away are we from? Okay, it's actually good enough to bake down into silicon.
Starting point is 02:14:31 Well, there's a question here of baking it down to silicon versus designing a chip, which is like very suited to the architecture that you're about. Right. And baking on the silicon, unsure, like, I think that's a bet you could take. But it's a it's a risky one, because the pace of progress is just so fast nowadays. And I really only expected to accelerate but designing things that make a lot of sense for those are the Transformers or architectures of the future Should should make a lot of it. That's a big gap though
Starting point is 02:14:59 Transformers or architectures of the future if we diverge there's a lot of companies that are banking on the transformer sticking around. What is your view on transformer architecture sticking around for the next couple of years? I mean, look, they stuck around for five years, so they might stick around for a little while, but there's different, you think about architectures in terms of this balance of memory bandwidth and flops, right, one of the big differences we've seen here
Starting point is 02:15:20 is that Gemini recently had a diffusion model that they released at I/O the other day, right? Yeah. Diffusion is an inherently extremely flops-intensive process, whereas normal language model decoding is extremely memory-bandwidth intensive. You're designing two very different chips, depending on which bet you think makes sense. Yeah. And if you think you can make something that does flops like four times faster and four times

Starting point is 02:15:40 cheaper than autoregressive decode, diffusion makes more sense. So there's this dance, basically, between the chip providers and the architectures, both trying to build for each other, but also build for the next paradigm. It's risky. I don't know how much you've played with image generation, but do you have any idea of what's going on with images in ChatGPT?
Starting point is 02:16:00 It feels like there's some diffusion in there, there's some tokenization, maybe some transformer stuff in there. It almost feels like there's some diffusion in there, there's some tokenization, maybe some transformer stuff in there, it almost feels like the text is so good that there's like an extra layer on top almost, and that it's almost like reinventing Photoshop. And I guess the broader question is like, it feels like an ensemble of models,
Starting point is 02:16:19 maybe the discussion around agents and text-based LLM interactions shouldn't necessarily be transformer versus diffusion, but how will these play together? Is that a reasonable path to go down? Well, I think pretty clearly there's some kind of rich information channel between them; even if there are multiple models there, it's conditioning somehow on the other model. Because we've seen before, say when models use Midjourney to produce images, it's never quite perfect; it can't perfectly

Starting point is 02:16:48 replicate what went in as an input, it can't perfectly adjust things. So there's a link somehow, whether that's the same model producing tokens plus diffusion, I don't know. Yeah, I can't comment on what OpenAI is doing there. Yeah, yeah. Are there any other kind of super wildcard, long-shot research efforts that are maybe happening out there, even in academia? I mean, this was the big thing with, what was his name? Gary. He was talking about, I forget what it was called. Symbolic, symbol

Starting point is 02:17:22 manipulation was a big one. And I feel like you can never count anyone out, because it might come from behind and be relevant in some way. But are there any other research areas that you think are purely in the theory domain right now that are worth looking into or tracking? Low probability, but high upside if they work. Ah, this is a tough one. But I'll say something on the symbolic thing. It's crazy how similar transformers are to systems that manipulate symbols. Sure. What they're

Starting point is 02:17:55 doing is taking a symbol, converting it into a vector, and then manipulating it and moving information around across them. Sure. So this whole debate about whether transformers can represent symbols or cannot do this, it's not real. So Gary Marcus is underrated or overrated, I guess? Overrated. Although if you twist it that much, you wind up saying, well, really, the transformer fits within that paradigm. And so maybe the rhetoric around it being a different path
Starting point is 02:18:33 was maybe false the whole time. Something like that. But as I remember that debate, it was really the idea of compute scaling versus almost feature-engineering scaling: will the progress scale with human hours or with GPUs, essentially? And that has a very different economic equation. It feels like there have been some rumblings that maybe, with a data wall, we'll shift back to being human-labor bound,
Starting point is 02:19:05 but do you think there's any chance that that's relevant in the future? Or is it just algorithmic progress married with bigger and bigger data centers? So I'm pretty bitter-lesson-pilled, in the sense that I do think removing as many of our biases and clever ideas from the models as possible is really important, just freeing them up to learn. Now, obviously, there is clever structure that we put into these models such that they're able to learn

Starting point is 02:19:31 in this extremely general way. But I am more convinced that we will be compute-bound than that we will be human-researcher-hour bound on this kind of thing. We're not going to be feature engineering and this kind of stuff. Sure. We're going to be trying to devise incredibly flexible learning systems. Yeah, that makes sense. On the scaling topic,
Starting point is 02:19:56 part of my worry is that the OOMs get so big that these turn into mega-projects where, at a certain point, you're bound by the laws of physics, because you have to turn the sand into silicon chips, and you have to dig up the silicon. And at a certain point,

Starting point is 02:20:24 yeah, there's only so much sand, and the math gets really, really crazy just for the amount of energy required to move everything around to make the big thing. Where are you on how much scale we need to reach AGI, and whether we'll see the laws of physics start acting as a drag on progress? Because it certainly feels exponential. We're feeling the exponentials, but a lot of these turn into sigmoids, right? Yeah.
Starting point is 02:20:46 before it gets really hard. Leopold has this nice table at the end of his, then the situational awareness. I think like 2028 or something is when, under really aggressive timelines, that you get to 20% of US energy production. It's pretty hard to go exponentially beyond 20% of US energy production.
Starting point is 02:21:03 Now, I think that's enough. Every indication I'm seeing says that's enough. Now, there might be some complex, you know, data engineering, environment engineering, this kind of stuff that goes into lots of places, and there's still a lot of algorithmic progress left to go. But I think that with those extra OOMs, we get to basically a model that is capable of assisting us
Starting point is 02:21:26 in doing research and software engineering. Yeah, which is the beginning of the self-reinforcement. Yeah, exactly. Interesting. Is that just a coincidence? This feels like one of those things where, like, the moon is the exact same size as the sun in the sky.
Starting point is 02:21:39 It's like, oh, it just happens that AGI happens within this time, like, whoa. Have you unpacked that anymore? Because it feels convenient, you know. A lot of these conveniences are weird. It's a good sci-fi story, let's say. Totally. We've got, you know, Taiwan in between China and the US, and it produces the most valuable material in the world, and it's locked between the two. Credible plot.
Starting point is 02:22:02 Yeah. Yeah. Really bad for the people that don't believe in simulation theory. It really feels like it's scripted, it's programmed. It's fascinating. Talk to me more about getting to an AI that's an ML engineer, and kind of that reinforcement loop. I imagine that you're using AI codegen tools today,
Starting point is 02:22:24 and Anthropic is broadly, and everyone is. But what are you looking for, and what's the shape of the spiky intelligence? Where do they fall flat? And what are you looking to kind of knock down in the interim, before you get something that's just like: go? Yeah. So, I mean, we definitely use them. The other night, I was a bit tired, so I asked it to do something and just sat watching it working in front of me for half an hour. It was great. It was a truly weird experience, particularly when you look back a year ago, when we were still copy-pasting stuff between a chat window and a code file. Yeah.
Starting point is 02:23:02 for this kind of stuff. So they have a bunch of evals where they measure, like, the ability to write a kernel, the ability to run a small experiment and improve a loss. And they have these nice progress curves versus humans. And I think this is maybe the most accurate reflection of what it will take for it to really help us drive progress. And there's a mix here. Where they're not so great at the moment is large-scale distributed systems engineering, right? Like debugging stuff across heaps and heaps of accelerators.
Starting point is 02:23:24 And the feedback loops there are slow: if your feedback loop is, like, an hour, then it's worth you spending the time on doing it yourself, versus if feedback is 15 minutes. For context there, the hour-long feedback loop is just because you have to actually compile and run the code across all your machines, or you need to run it for a while to see if something's gonna happen.
Starting point is 02:23:48 At that point in time, you're still cheaper than the chips. Sure. It's better that you do it. But for things like kernel engineering or for actually even just understanding these systems, incredibly helpful. One thing I regularly do at the moment is in parts of the code base,
Starting point is 02:24:06 in languages that I'm unfamiliar with, or stuff like this, I'll just ask it to rewrite the entire file, but with comments on every line. Game changing. It's like- Comments on every line. Yeah, or just come through thousands of files
Starting point is 02:24:18 and explain how everything interacts to me, draw diagrams, this kind of stuff. It's really, yeah. Yeah, how important is a bigger context window? In that example you gave, that feels like something that's important, and yet, just naively, Google's the one that has the million-token context window. I imagine that all the other frontier labs could catch up, but it seems like it hasn't been as much of a priority as maybe, like, the PR around it sounds. Is that important? Should we be driving that up to, like, a trillion-token window? Or
Starting point is 02:24:47 is that just going to happen naturally? There's a nice plot in the Gemini 1.5 paper where they show the loss over tokens as a function of context length. And they show that the loss goes down quite steeply, actually: as you put more and more of a code base into context, you get better and better at predicting the rest. Yeah, that makes sense. Context length, it's a cost. Yeah, the way transformers work is that there's, you know, this memory that
Starting point is 02:25:10 is proportional... the KV cache is proportional to how much context you've got. And so you can only fit so many of those onto your various chips, and this kind of stuff. And so longer context actually just costs more, because you're taking up more of the chip, and you could have otherwise been doing other requests, basically. So, bringing it back to the custom silicon: is that a unique advantage of the TPU? Is that something that Google has thought about, and then wound up putting themselves in this advantaged position? Or is it even a durable advantage?
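The cost described above, that the KV cache grows with context and competes for accelerator memory with other requests, can be illustrated with a toy calculation. A minimal sketch, assuming an invented model shape (80 layers, 8 KV heads, head dimension 128, fp16); these are not the dimensions of any real model.

```python
def kv_cache_bytes(context_len: int, n_layers: int = 80, n_kv_heads: int = 8,
                   head_dim: int = 128, bytes_per_value: int = 2) -> int:
    """KV-cache memory for one sequence: keys and values (the factor of 2)
    stored per layer, per KV head, per token, at 2 bytes each for fp16."""
    return 2 * n_layers * n_kv_heads * head_dim * context_len * bytes_per_value

for ctx in (8_192, 131_072, 1_048_576):
    gib = kv_cache_bytes(ctx) / 2**30
    print(f"{ctx:>9} tokens -> {gib:6.1f} GiB per sequence")
# 8,192 tokens -> 2.5 GiB; 131,072 -> 40.0 GiB; 1,048,576 -> 320.0 GiB
```

The growth is linear in context length, so under these assumptions a million-token context occupies hundreds of GiB of accelerator memory that could otherwise be serving many short requests, which is exactly the trade-off being described.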
Starting point is 02:25:40 Yeah. So TPUs are good in many respects, partially because you can connect hundreds or thousands of them really easily across really great networking, whereas only recently has that been true for GPUs. With NVLink? Yeah, with NVLink and the NVL72 stuff. So it used to be, like, eight GPUs in a pod, and then you'd connect them over a worse interconnect, and now you can be 72, and then it breaks down. With Google TPUs, you can do like 4,000 or 8,000 over a really high-bandwidth interconnect in one pod.
Starting point is 02:26:08 And so that is helpful for things like just general scaling, in many respects. I think it's doable across any chip platform, but it is an example of somewhere that being fully vertically integrated is yielding a benefit. Yeah, that makes sense. Talk to me about ARC-AGI. Why is it so hard? It seems so easy. It does seem easy, doesn't it? Well, it certainly seems, like, more evaluatable than: tell me a
Starting point is 02:26:35 funny joke, right? Yeah, yeah. I mean, I think if you RL'd on ARC-AGI, then you'd probably get superhuman at it pretty fast, but I think we're all trying not to RL on it, so that it functions as an interesting held-out test. Sure. OK. Is that just an informal agreement between all the labs, basically?
Starting point is 02:26:53 Yeah, we're trying to have a sense of honor between us. That's good. Sense of honor. That's amazing. How many people on Earth do you think are getting the full potential out of the publicly available models? Because we're now at a point where we have, you know, a billion-plus people using AI almost daily, and yet my sense would be
Starting point is 02:27:11 it's maybe like 10,000 or 20,000 people on the entire planet that are getting that sort of full potential, but I'm curious what your assessment would be. Yeah, I completely agree. I mean, I think that even I don't get the full potential out of these models, often. And I think as we shift from you asking questions and it giving you sensible answers, to you asking it to go do things for you that might take hours at a time, where you can really parallelize and spin up, we're going to hit yet another inflection point
Starting point is 02:27:40 where even fewer people are really effectively using these things. Because it's basically going to require you to... it's like StarCraft or Dota: it's gonna be, like, your APM of managing all these agents. That's the process. Yeah, StarCraft is such a good example. You think you're just absolutely crushing it, and then you realize there's an entire area of the map where you're just getting destroyed. It's such a good comp. I'm sure, I know. Yeah, exactly.
Starting point is 02:28:03 It's such a good comp. That's great. Anything else, Jordy? I think that's it on my side. I mean, I would like this to be an evolving conversation. Yeah, this was fantastic. We'd love to have you back and keep chatting. Absolutely, it was really fun.
Starting point is 02:28:16 Love to come back on. Yeah, we'll talk to you soon. Cheers, Sholto. Have a good one. All right, we've got Emmett. The worst possible AI day. Yeah, so for context, folks, we are going to be doing a live timeline-in-turmoil
Starting point is 02:28:30 segment at 2 PM PST. So if there are posts you want us to cover, you can go send them. I'll put this in the chat as well. A few more. Pull one up. I'm going to do some ads, because we've got Emmett Shear coming into the temple in just a few minutes. Let me tell you about Numeral: sales tax on autopilot. Spend less than five minutes per month on sales tax compliance.
Starting point is 02:28:51 Go to numeralhq.com. Very excited for them. Also go to public.com: investing for those who take it seriously. They have multi-asset investing, industry-leading yields, and they're trusted by millions. Millions. In other news, Tim Sweeney continues to battle Apple. Apparently, if you search for Fortnite on the Apple App Store, he says: hey kids, looking to play Fortnite?
Starting point is 02:29:18 Try this crypto and stock trading app instead, rated for ages four plus, courtesy of Apple App Store ads. So I'm gonna give you the latest Trump-tariff Elon post, from four minutes ago: The Trump tariffs will cause a recession in the second half of this year. Wow. Somebody else was saying: can I finally say that Trump's tariffs are super stupid? Madsposting is sharing a meme of Xi Jinping; he says, bro, you seeing this, and it's Putin on the other end, just looking at it. Hold up, I got a line. We'll start pulling some of these up.
Starting point is 02:29:56 Ridiculous. What else is going on here? This is the president versus Elon. Neville says: Elon's stance is principled, Trump's stance is practical. Tech needs Republicans for the present; Republicans need tech for the future. Drop the tax cuts, cut some pork, get the bill through. This is so crazy. Antonio Garcia says: remember, there's F-you money, and then there's F-the-world money. Will Stancil says: imagine being the ICE agents suiting up for your biggest mission of all time right now. People are saying that Trump's gonna deport Elon back to South Africa. Elon says: time to drop the
Starting point is 02:30:41 really big bomb: real Donald Trump is in the Epstein files. No. That is going to turn into a copypasta. That is a real piece of copypasta. Oh, no. Delian, we had a question from a friend of the show. He said, the real question is: if Tesla is down 14%, how would SpaceX and OpenAI be trading if they were public?
Starting point is 02:31:09 The real thing here is it's bad for everyone, right? DJT is down, Trump coin is down. Nobody's really winning here. China is up. Yeah. Oh, really? Shaun Maguire: I mean, I'm just saying, like, at a high level. Yeah, yeah, yeah.
Starting point is 02:31:22 You know, China is the big beneficiary here. Sarah Guo says: if anyone has some bad news to bury, might I recommend right now? Yes, yes, yes. What's the canonical bad startup news? Like, oh yeah, you missed earnings or something. Drop it now. Inverse Cramer says: Bill Ackman is currently writing the longest post in the history of this app.
Starting point is 02:31:50 And we have a video from Trump here, if we want. I can throw it in the tab, and we can share it on the stream and react to it live. Lex Fridman says to Elon: that escalated quickly. Triple your security. Be safe out there, brother. Your work, SpaceX, Tesla, xAI, Neuralink, is important for the world.
Starting point is 02:32:12 We need to get Elon on the show today. If somebody's listening and can make that happen, I would love to hear from him. Max Meyer says: so I got this wrong. I didn't say it never happened, but I thought it wouldn't. I am floored at the way this has happened. He didn't think they would have a big breakup. Many people didn't think they would have a big breakup.
Starting point is 02:32:28 Even just earlier this week, it seemed like they might just have a somewhat peaceful exit. Trump just posted a little bit ago: I don't mind Elon turning against me, but he should have done so months ago. This is one of the greatest bills ever presented to Congress. It's a record cut in expenses, 1.6 trillion dollars, and the biggest tax cut ever given. If this bill doesn't pass, there will be a 68%
Starting point is 02:32:52 tax increase, and things far worse than that. I didn't create this mess, I'm just here to fix it. Anyways, lots going on. Let's go to this Trump video. I want to see what he has to say. The criticism that I've seen, and I'm sure you've seen, regarding Elon Musk and your big, beautiful bill: what's your reaction to that? Do you think it in any way hurts passage in the Senate,
Starting point is 02:33:15 which, of course, is what you're seeking? Well, look, you know, I've always liked Elon, and I was always very surprised. You saw the words he had for me. He hasn't said anything about me that's bad. I'd rather have him criticize me than the bill, because the bill is incredible.
Starting point is 02:33:30 Look, Elon and I had a great relationship. I don't know if we will anymore. I was surprised, because you were here. Everybody in this room, practically, was here as we had a wonderful send-off. He said wonderful things about me. You couldn't have been nicer.
Starting point is 02:33:47 He's worn the hat. Trump was right about everything. And I am right about the great, big, beautiful bill. But I'm very disappointed because Elon knew the inner workings of this bill better than almost anybody sitting here, better than you people. He knew everything about it.
Starting point is 02:34:03 He had no problem with it. All of a sudden, he had a problem. And he only developed the problem when he found out that we're going to have to cut the EV mandate, because that's billions and billions of dollars. And it really is unfair. We want to have cars of all types. Electric.
Starting point is 02:34:17 We want to have electric, but we want to have gasoline, combustion. We want to have different. We want to have hybrids. We want to have all. We want to be able to sell everything. He hasn't said bad about me personally, but I'm sure that'll be next. But I'm very disappointed in Elon. I've helped Elon a lot. The Press. Mr. President, did he — I just want to clarify — did he raise any of these concerns with
Starting point is 02:34:28 you privately before he raised them publicly? And this is the guy you put in charge of cutting spending. Should people not take him seriously about spending now? Are you saying this is all sour grapes? No, he worked hard and he did a good job.
Starting point is 02:34:44 And I'll be honest, I think he misses the place. I think he got out there and all of a sudden he wasn't in this beautiful Oval Office. He's got nice offices too, but there's something about this, as I was telling the Chancellor. Folks, breaking news: Delian Asparouhov is joining us in the temple for some live reactions,
Starting point is 02:35:11 come on! Yes! I can't even spell surprise guest. I'm so excited about this. Yeah. In other news, ElevenLabs dropped a new product. Absolutely. In other news, a $2 million seed round. Stop it. Stop it. We love ElevenLabs.
Starting point is 02:35:34 No, they'll keep grinding. But just launch again tomorrow. You're going to have to launch again. Start shooting a new vibe reel, start writing a new blog post, because no one saw it. Lulu says: yes, delay the launch. Okay, so basically, right now I can just pull up and read here. I'm gonna just be refreshing Truth.
Starting point is 02:36:02 So, okay, Jordy's on Truth Social, I'll be on X. Give us your reaction, tell them what's going on. I'm just, you know, sort of scrolling X. I texted you guys like an hour ago, and I was like, they're talking about something else, I think, and I was like, switch, we have news! And then I was watching. It's like, okay. John resisted; I fought it for, like, half an hour, but we couldn't do it. Yeah, give us your quick reaction. I mean, obviously, you know, I'll give it from the, you know, sort of space angle. You know, it's amazing how much the world has shifted since, you know, Friday of last
Starting point is 02:36:33 week, where you just would have presumed that Jared Isaacman was going to be the NASA admin. Today it was released that the Senate reconciliation package re-added budget back into NASA, largely for the SLS program, which was basically the program that, you know, Jared and Elon were largely advocating to, you know, sort of completely shut down. So yeah, the counter-reaction is already showing up, you know, in policy. Sorry, the SLS program,
Starting point is 02:37:04 is that the Space Shuttle? No, that's the SLS launch rocket: based off of old Space Shuttle hardware, but it is basically the internal, you know, sort of NASA-run competitor, effectively, to a Starship heavy-launch rocket. Yeah. You know, because it was generally behind budget and behind schedule, and there are so many commercial heavy-lift rockets coming online, the default was that it'd be canceled. That is largely, you know, sort of a Boeing-based program. And so, you know, if you look at three months ago, when they were announcing the F-47 program,
Starting point is 02:37:35 you know, Elon walks into the Secretary of the Air Force's office. Obviously, he'd been, you know, sort of ranting against manned fighter jets and believing that that shouldn't be what the department is prioritizing. Thirty minutes after that meeting was when they announced the F-47 program. And so now you're seeing basically the equivalent in space. You know, that was obviously awarded to Boeing, and Boeing is the largest prime behind, you know, SLS. Boeing basically is going to be the biggest winner of, you know, NASA refunding, sort of, SLS and Jared Isaacman not being NASA administrator. So, tying this back to the timeline,
Starting point is 02:38:09 Elon posted less than 30 minutes ago: In light of the President's statement about cancellation of my government contracts, SpaceX will begin decommissioning its Dragon spacecraft immediately. Break that down. I mean, that just means that we no longer have a vehicle that can go to the International Space Station. We no longer have a vehicle that can take astronauts up and down.
Starting point is 02:38:29 We also don't have a vehicle that can de-orbit the International Space Station safely, right? The Dragon was expected to be able to do that. So what that means is, if you guys remember all the memes about Stranded from last year around Boeing Starliner, it now means that the space station itself is basically sort of stranded. And that's like one of the government contracts, obviously, that SpaceX is involved in. Elon, I've heard, generally just wants to shift all things to Starship anyways,
Starting point is 02:38:53 and so in some ways was probably kind of looking for an excuse to sort of shut down Dragon and refocus energies. There's also a part of it where it's like, look, he is kind of independent in the space world in that Starlink's total top line revenue is gonna be passing the NASA budget in the next year or two. And so in terms of size of state actor
Starting point is 02:39:12 that can influence space, his own company is basically about to become as large of an actor as the entire United States. So I don't think there's gonna be a de-escalation here. My estimation is on both sides, it's going to continue to escalate You know if we thought that we lived in dynamic times, you know when Trump got into office
Starting point is 02:39:32 You know, if we thought that we lived in dynamic times, you know, when Trump got into office, it's gonna be even more dynamic. The dynamism will continue until morale improves. Elon the centrist, AOC the progressive populist, and Trump the, you know, sort of conservative populist. And man, it's hard to be on the timeline. I mean, I just have so many questions, right? How does this impact Golden Dome? What's Boeing stock doing? Will Golden Dome even be a viable project without SpaceX?
Starting point is 02:40:03 I think there's just going to be more resistance, probably, to working with, you know, sort of upstarts, because they would be the ones that would probably be more likely to collaborate with, you know, a SpaceX, and so... Well, wait, wait. It feels like a Boeing would be a logical beneficiary of this turmoil, and yet they're down today.
Starting point is 02:40:21 They haven't really popped. Oh, really? Yeah. I mean, obviously I'm not wanting to give, like, you know, public stock advice. Yeah, I know. I'm just trying to work through it myself, and it's surprising. It just feels like a slam dunk for Boeing to pop, basically. Yeah. That would be the expectation, but there must be something here. It feels like this is purely
Starting point is 02:40:38 interpersonal between Elon and Trump, and not, like, oh, Boeing was secretly behind the scenes the whole time lobbying even more effectively. Oh, you got the... well, where's the tinfoil hat? I mean, it's over there. Maybe we need a tinfoil hat segment, who knows? But yeah, I mean, when you're in Boeing world, it's like, hey, we're only down 1%, let's go. The coup of the century.
Starting point is 02:41:00 My question is, has there ever been a crash out of this magnitude, ever? In history? In internet history. When Elon and Trump became friends... Honestly, this is world scale; probably a world-history-level crash out. I feel like there's something in, like, United States history where, you know, someone's crashing out. Crashing out used to mean calling up the New York Times and just ranting. Now you can just live-post all your reactions,
Starting point is 02:41:26 and it's just all real time. Crash outs are actually intensifying. You actually want to be long crash outs. Yes, definitely. You know, you've got to be on both X and Truth Social to stay on top of things. Yeah. I actually did a deep research report a while back on whether the richest man in America has ever been close with the US president, going back to, like, was Rockefeller particularly close? Because the narrative was, like, oh, this is so unprecedented. And in fact, it is unprecedented. Oh, really? I would have guessed that Rockefeller was close. Me too, me too, that's what I was going for. I
Starting point is 02:42:04 imagined this was always close. But no, I think because the president has become more powerful globally (your point about mayor of America, dictator of the world) it becomes increasingly valuable for the richest man to have a close alliance. And so it's become more so. I don't know exactly how accurate that research was. It's totally possible that behind the scenes, Rockefeller was
Starting point is 02:42:27 really close to the president of the time, and we just didn't write about it in the history books. But there certainly aren't very many anecdotes about the richest man in America going on... Yes. I bet there'll be a passage ready for AP US History in 2050. Yeah, yes: this is where, you know, Elon Musk called the president a... A question about the USA's power structure: is the man with the most access to capital more or less powerful than the political head honcho? Purely hypothetical. It's a good question to ask. I mean, I think both archetypes have grown both in absolute power and in relative power to
Starting point is 02:43:35 the rest of the globe, basically, since the Gilded Era, right? If you think about the President of the United States in 1925, I'd say pretty darn powerful, but it was clearly a multipolar world. Argentina was pretty darn rich at the time. Obviously, Europe was still recovering from World War I, but the UK was generally doing well. No single player had a huge outsized effect. And then if you look at probably the biggest industries at the time, I don't think you could claim that even, like, Standard Oil at its peak (I'd have to go look at the exact numbers) had the size of budget relative to the U.S. government, right? Versus I feel like now, for the first time, you both have a U.S. president who is extremely, extremely powerful, and then you have, like, you know, the Mag Seven, effectively the size of, you know, state governments; like, they're, you know, fucking state governments.
Starting point is 02:44:12 And then also just more bureaucracy, more red tape. So when I think about the 1920s, like, robber barons, it is the you-can-just-do-things era. So you want to build a railroad? Like, yeah, you might need to get, like, one rubber stamp, but it's not going to be ten years and tons of lobbying and all this different stuff. So you can kind of just go wild. You know it's bad when Kanye is saying: bros, please, no, we love you both so much.
Starting point is 02:44:34 It's just like, the voice of reason is Kanye West. Yes, thank you. You need to bring them together and form a peace treaty. Nikita Bier just added his pronouns back to his bio. Let's go. He's got a rubber band. Elon's got a rubber band all the way back to extreme woke,
Starting point is 02:44:52 snapping straight back to super climate change. Wow, somebody's resharing the picture of the Cybertruck blown up in front of the Trump Tower in Vegas. And it's just like, this is in real life as well. It was foretold. Yeah, but it was a question of, like, when and what magnitude, not if. It's always bad if Vladimir Putin is offering to negotiate between
Starting point is 02:45:21 President Trump and Elon. I think a lot of the world is waiting for Roy Lee's take, clearly, and the Cluely army. They want that. People have been asking him to get involved with geopolitics. I love that Sheel Mohnot put up a, you know, sort of meme about Narendra Modi, the prime minister of India. He basically copied and pasted the Trump Truth Social post about negotiating peace between India and Pakistan, when it wasn't actually fully negotiated, you know, posting about negotiating a ceasefire between Elon and Trump.
Starting point is 02:45:58 Funny thing is, on Truth Social you can just read all of Trump's posts without creating an account. Oh, really? That surprises me; I would think that you would have to make an account to read them all, but it's not gated at all. They clearly... I don't think they care about monetization. Bitcoin is actually falling alongside everything. Wow. Bitcoin falling, Boeing falling, Tesla falling. Who's the biggest winner of the day? I think it's China. China.
Starting point is 02:46:28 Yeah. Shaun Maguire: Wow, Bitcoin really sold off. It's down 3% today at 101K. So still up, but rough. Winnie the Pooh just dipping his hands in that pot of honey, just snacking away, watching from the sidelines. Yeah.
Starting point is 02:46:44 Let's see Chinese stocks, US... Okay, that's probably my commentary on the day, boys. Anyway, this was great, it was fantastic. Thanks for jumping on, thanks for hopping on so quick. Cool. Well, Aaron Rodgers signed a one-year deal with the Steelers, announced an hour ago. Let's give it up for Aaron Rodgers. Do we have Emmett in the waiting room? I've messaged him. It's absolute chaos.
Starting point is 02:47:09 We'll see if he can hop back on. We don't have him right now. We're ready if you can hop on. Sorry about the chaos. We're live streaming. We are full streamers. That was the moment where, yeah, it was like, okay.
Starting point is 02:47:28 This is the point of TBPN. Send him an invite, let him jump in. Hopefully we can get Emmett in. That was very chaotic. But, you know, it's a busy time. My only hope for both Trump and Elon is that they can get some sleep. They both go to eightsleep.com slash TBPN, get a Pod 5 Ultra, take advantage of the five-year warranty,
Starting point is 02:47:47 the 30-night risk-free trial. They've got free returns, they've got free shipping. This is really the perfect time to do ads. That's what could unify everyone. I hope that both Trump and Elon have Eight Sleeps tonight, if they sleep at all. Yes.
Starting point is 02:48:05 Even just resting on it would be good. But yeah, let's see. We can also go through... I don't even know what to do. There's a bunch of random timeline stuff we have. Lex Fridman is saying we need to do a podcast with Elon and Trump. He's interviewed both.
Starting point is 02:48:22 He's interviewed both. He's done both. Something tells me that they're not going to jump on the show today. I don't think so. And he'll be like, what about love? Yeah. I mean, it is wild.
Starting point is 02:48:34 Elon, like less than two months ago, was saying, I love Trump as much as a straight man can love another man, or something of that sort. It's just odd that the band-aid got ripped off so aggressively, so fast, you know? Like there could have been like a smooth de-escalation with like the... This is the fast takeoff. This is the fast takeoff scenario. We are in the fast takeoff scenario. Anyway, maybe they should book a wander, work it out together. They could find their happy place.
Starting point is 02:49:04 They could book a Wander with inspiring views, hotel-grade amenities, dreamy beds, top-tier cleaning, and 24/7 concierge service. It's a vacation home, but better. Go to wander.com, use code TBPN. Please let them know that we sent you. Lee Helms says, Elon literally has me dying laughing. Trump said he was gonna take away his government contracts, and Elon said, haven't you been to Epstein's Island? Sort of abridged that. Absolute chaos.
Starting point is 02:49:34 Nikita says, hey Bluesky users, come on in, the water's warm. David Friedberg says China just won, which I think is the right take. I don't know. I don't know what to think. There's not that much here, not that much meat to analyze. I mean, it's certainly interesting to see how important the
Starting point is 02:50:02 subsidies and the electric vehicle mandates are. I mean, it always feels like the best product wins in a lot of these scenarios. And if Tesla was making it through the political chaos of arguably their biggest constituency, electric vehicle buyers, being upset about the Trump-Elon alignment.
Starting point is 02:50:28 I wonder, you think everyone's going to, you think all the anti-Trump people are going to buy Teslas now? It's like, really make a statement. Like, I'm anti-Trump, I stand with Elon. So I bought a Model... They'll have the bumper sticker that says, I bought this after the crash out. After the crash out, exactly.
Starting point is 02:50:47 There's a post here from Goth, and it says, explaining the Trump-Elon crash out in ten years. And it's the Joe Biden quote when he says it was like 15 9/11s. Yeah, it certainly is. Yeah, it's hard to process. I mean, this is gonna have massive implications for so many different things. Elon's stance is principled. Trump's stance is practical.
Starting point is 02:51:14 Tech needs Republicans for the present. Republicans need tech for the future. Drop the tax cuts, cut some pork, get the bill through. Interesting. Yeah, we really do need to reverse the audience. Somebody named Logan made an image of Trump putting a bumper sticker on his red Tesla saying, bought it before Elon went crazy.
Starting point is 02:51:36 Yep. Who is that? Is that from the Republican perspective? Oh, Trump's doing that? Yeah, Trump's putting it on saying bought it. Yeah, yeah, he has the red Tesla. Sean Puri says, sad day for America, but this is outstanding content.
Starting point is 02:51:49 It is. I think even Taylor Lorenz agrees with that. Yep. Bill Ackman's ripping posts. All right, I'm gonna put some posts and we'll, yeah, is Bill Ackman actually live posting through this? No, people are just speculating. There was actually a post in the...
Starting point is 02:52:07 Somebody says, clears throat, truly we live in a doge-eat-doge world. Where was this? Searcy says, I know Elon and Trump are the real deal because of how passionately they argue. No couple fights this viciously if there isn't a mutual obsession underneath. So there's a piece in the Wall Street Journal earlier
Starting point is 02:52:30 this week that we didn't get to cover, but it was talking about, it kind of predicted a little bit of this crash app. And so it's from the opinion, the editorial board at the Wall Street Journal says, "'Whose pork do you mean, Elon?' "'Musk trashes the house bill that cuts subsidies for Tesla.
Starting point is 02:52:49 Elon Musk's work at Doge made him persona non grata in the Beltway and most criticism was nasty and unfair, says the editorial board. That's what Washington does to outsiders who want to shrink its power. Like it was always expected that if you come in and try and cut anything, you're gonna see pushback from folks who don't want cuts.
Starting point is 02:53:09 That's what Washington does to outsiders. But that makes it all the more unfortunate that Mr. Musk is now joining the Beltway crowd in trying to kill the House tax bill. This massive, outrageous, pork-filled congressional spending bill is a disgusting abomination, the Tesla CEO tweeted Tuesday, as the Senate begins considering its version
Starting point is 02:53:28 of budget reconciliation. Shame on those who voted for it. You know you did wrong, you know it. Pork-filled spending bill, what else is new? The House bill could be far better on tax policy and spending reduction. The Senate could be making improvements such as reducing the $40,000 state and local tax deduction cap,
Starting point is 02:53:45 scrapping the tax on exclusion for tips and overtime, and reducing the federal Medicaid match for able-bodied adults. But the House bill does avoid a $4.5 trillion tax hike next year and cuts spending by some $1.5 trillion over 10 years, making some useful reforms to Medicaid, student loans and food stamps. It also ends most of the inflation reduction acts, green energy subsidies. Ah, but Mr. Musk does not want to eliminate that pork.
Starting point is 02:54:15 There is no change to tax incentives for oil and gas, just EV and solar, he said on X last week, retweeting another user's post that said slashing solar energy credits is unjust, but what's more unjust is the damage that's been done to people's lives during storms and blackouts, because ultimately you can't replace a human life. Mr. Musk is parroting the climate lobby's specious claim
Starting point is 02:54:35 that tax breaks like depreciation that are available to all manufacturers are a special benefit for the oil and gas industry. But it's rich that he is denouncing the House bill for not cutting spending enough while also fuming that it kills green energy tax credits, as if they are a matter of life and death for Tesla. Tesla Energy, its battery and solar division, tweeted last week that abruptly ending the energy tax credits would threaten America's energy independence and the reliability of our grid. We urge the Senate to enact legislation
Starting point is 02:55:05 with a sensible wind wind down of 25 D and 48 E, which refers to the tax credits for residential and large scale clean energy products. Both credits are important for Tesla, which derives an increasing share of its revenue and profit from selling solar and battery systems to homeowners and utilities. I didn't realize that.
Starting point is 02:55:22 But the House bill waits until 2030 to phase out a tax credit for battery production which benefits Tesla's electric vehicle and storage business. So the Senate should end it sooner says the Wall Street editorial board. Mr. Musk has done yeoman's work trying to reduce the federal bureaucracy and improve how government works so the editorial board is excited and happy that he's been working on that. He's right that both parties in Congress are spend thrifts. But one reason for that is because whenever Congress tries
Starting point is 02:55:54 to cut something, special interests scream as Mr. Musk is doing over green subsidies. If the House bill fails, there won't be any cuts, only a huge tax increase. Is that what Elon wants? And so they're asking the question. Interesting. Sweet, well, we got a bunch of bangers in the tab.
Starting point is 02:56:12 Production team, let's pull them up. If you could zoom in a little bit, that would be helpful. Otherwise, we can just pull them up. So Eric Weinstein is commenting here. He says, part of my analysis is that I don't think Elon Musk keeps score in money. He thinks we have a future and we'll be happy to take a large portion of his winnings after his death.
Starting point is 02:56:35 This sounds crazy to moderns, post-moderns and atheists, but this is just normal for being an ancestor. Ad astra per aspera is the full quote after all. Interesting take. Brian Butler is saying, real question is whether the algorithm here goes anti-Trump. Oh, interesting. The X algorithm, like will pro-Trump?
Starting point is 02:57:00 There's a switch, there's a switch. It's really hard to pull. You gotta pull it. It takes maybe one or two people, but then when you pull it down it just oscillates between the two political parties punk 6529 says this is going to be the Super Bowl of shit posting Mads has the European reaction to the fight You can see you see this John this is the European
Starting point is 02:57:29 reaction supposed to be summer break summer break this post from John W. Rich at coked up options says the all-in pod right now. Yeah. Caught between a rock and a hard place. I mean, it's just absolutely brutal. It is absolutely brutal. Who could signal has an interesting one. He says, if you had a fast forward button for the timeline, how does this play out? Who has more to lose, Trump or Elon?
Starting point is 02:58:00 Remarkable set of events. And Elon is replying to other people saying, oh and some food for thought as they ponder this question, Trump has three and a half years left as president, but I will be around for 40 plus years. Dr. Julie Gerner says, Elon will vet another candidate for the future and throw his support behind them,
Starting point is 02:58:24 having a more technocratic representation if Vance can't lead up Alex Finn says Elon way more to lose Trump is irrelevant in three and a half years Yvonne is trying to change the world and having both political parties hate him makes that way more difficult I want to pull up this video of Naval talking about this, that Elon just posted, um, seems, seems somewhat relevant if it's happening today. And that really affected me, which was when he was talking to Bill Gates and Bill Gates had just taken out some huge short on Tesla, like a billion dollar short, or something. And, uh, you know, and he was like, why would you do that?
Starting point is 02:59:03 Why would you short Tesla? And Bill goes, well, you know, Mike talked to my financial why would you do that? Why would you short Tesla? And Bill goes, well, you know, I talked to my financial advisors and I looked at the math and there's no way it's overvalued. And so I'm going to make money on the short. And he goes, what do you care about making money? I thought you were into electric cars and climate change and saving the world. What are you doing, like trying to save a few bucks
Starting point is 02:59:20 and betting against like and he just walked away and discussed. And I think he never talked to Bill Gates after that. And that's when I realized, like, Elon's a purist. He means what he says. The money is a tool for him to get what he's trying to do. And so I take him at face value, which is the crazy thing. Because there are a lot of people who set these audacious goals to inspire people.
Starting point is 02:59:39 But you kind of know they don't really mean it. Elon, I take it face value. So I really do think he intends to get to Mars. I don't think he's joking about that. And I think he means to get there within a defined window of time. And I don't think it's just like an inspirational far away goal. I think he's very, very concretely going to do whatever it takes. Because Elon doesn't want to go down in history as the electric car guy or even the guy who saved America guy. He wants to go down as a guy who got humanity to the stars. And I think again, I'll give him more credit than that. I don't even think he wants
Starting point is 03:00:12 to go down as the, I got humanity to the stars guy. He's just like, I want to get to the stars. And so I have to make it happen in this lifetime. The only way that I get to experience the science fiction world in my head is if I get to the stars. And so that's so inspirational. I think that drives everything. So I think the government was just the thing that got in his way. Interesting. What a crazy day. Molly says, how dare they do this on the day of Anderil's 8x oversubs and a half billion dollar series G at a 30 and a half billion dollar round.
Starting point is 03:00:47 It was 8x oversubscribed. By Founders Fund. Honestly, the nerve. That was crazy. Neval is live posting says the future belongs to people who are good at creating things, not people who are good at dividing them up. Jay Califine posted.
Starting point is 03:01:04 Kylie Robinson says several people are typing, which feels exactly like what we're going through. Um, the next all in podcast is going to be phenomenal. It's going to be so good. Um, Alex carp was on CNBC today talking about the New York times hit piece. No, really? They're a beneficiary of Palantir's a beneficiary of this breakup because it is just
Starting point is 03:01:29 going to be candy for the New York Times and the mainstream media broadly. Oh, take the focus off of that meme? Will Depew at OpenAI says, it's time for Woke 2, featuring Elon Musk and AOC. Woke 2 is coming. We're in uncharted territory. It's completely, completely different.
Starting point is 03:01:50 Be very interesting to see how it plays out. Really, really. Anything else? Yeah. J.P. Brickhouse. I can tell you're just so sad that you just wanna. I wanna talk about that. This is the only time that you've wanted to... What? End? Almost wanted to end the show, John.
Starting point is 03:02:11 I mean, what else? I mean, it is sad. Sad in a lot of ways. It is... I think we're going to be spending a lot of time analyzing this in the coming months and years. I think there's gonna be spending a lot of time Analyzing this yeah in the coming months and years and it feels like I think there's gonna be more Dave Rieberg said China just won and I I want to I want to see exactly what that means in the markets, but
Starting point is 03:02:39 What's what's going on in the polymarket? We need some you see if there's any movement on any in the polymarket? We need some, we need to see if there's any movement on any of the polymarkets. There, somebody's posting, Law 1 from the 48 Laws of Power, never outshine the master. Interesting to bring up. Sam Altman is the big winner here,
Starting point is 03:03:04 aside from China. Somebody says Ken, oh, he's a journalist. I see multiple journalists on the horizon. They are surrounded by journalists. Hold your position. Ken says funniest day online since the billionaire submersible went missing. I didn't think that was funny. Joe Weisenthal says, all right, time for a Xiaomi GM JV in Tennessee.
Starting point is 03:03:39 Wouldn't even surprise me. I think it's, I mean, you know, one interesting thing here is what kind of pressure Elon is gonna face from Tesla shareholders that feel like he, you know, the stock's getting absolutely murdered. It will probably go down. I mean, it's back up to, it's only down 14% at one point. It was down 17% 152 billion in market cap Evaporated But obviously investors are gonna be upset and say that he acted, you know, irrationally Yes, what's the interpretation of that that that that this means that like this war means that the bill passes and Tesla does not
Starting point is 03:04:24 Get any more subsidies and that hurts the bottom line. It feels like the stock was pretty heavily driven by Optimus and RoboTaxi and stuff, but it's just like bad environment generally, right? Yeah, trades on narrative. There's short to medium term narrative, which is that Tesla's getting,
Starting point is 03:04:42 has a ton of competitive pressure all over the world, China, Europe, here in the US from other manufacturers. But there's also the long-term narrative, and it's not like Elon can go out and say, posted humanoid demo today and recover 200 billion in market cap. Yeah, there's just a lot of work to do. He's gotta start chopping wood.
Starting point is 03:05:01 Yeah, there's a lot of work to do. He's got to start chopping wood. Somebody is asking, who gets JD Vance in the divorce? Who knows? Many of these posts I will not talk about on air. John W. Rich says AP US history is gonna be insane in 2100 Really really wild this is the only this is the first show where we've had dead air Yeah, so there's just John is speechless. He's never been speechless. It's just like there's not that much
Starting point is 03:05:47 Not a lot of substance. Extra facts, right? It's just all reactions. There isn't that much substance to actually dig into because we were only dealing with like a few quotes from the two sources. So there's really just not that much. CNN is reporting that the Tesla Trump purchased from Musk
Starting point is 03:06:02 is still parked on outside the White House. OK. Truth Social is crashing from the traffic. I saw that. But you know what's not crashing? Getbezel.com. Your bezel concierge is available now to source you any watch on the planet.
Starting point is 03:06:21 Seriously, any watch. Anything else, Jordy? Should we let the timeline remain in turmoil until tomorrow when we can recap? I think the challenge is the second that we go offline. There'll be more. I mean, we can stay. I mean, it's now been an hour
Starting point is 03:06:36 with no updates on true social. Yeah, I mean, if it's down, I think the experience of this chaos might happen on the timeline. Lulu says, yes, delay the launch. Yes, now is not the time. Max says, I'm doing what Elon and everyone else should have done hours or days ago, logging off.
Starting point is 03:06:56 See you tomorrow. Somebody else says, I mean, it feels like Blue sky is really back on the app. They, they, they, they've logged in, they're online. Are you over in blue sky now? No, I'm not. I'm just saying some of these posts that are coming up into my feed. Oh, it's funny.
Starting point is 03:07:19 Claude, Anthropic actually released a new product today. Wait, blue sky doesn't own the domain name bluesky.com. That's a different one. So contrary. It's the Bsky.app. Rough. Get in there. This is interesting.
Starting point is 03:07:37 So Claude came out with Claude Gov today. Rough day to launch a product for the government. I'll read about it briefly so we have some coverage. So Claude Gov, our models for US national security customers. I think people will have a pretty good idea. Improved handling of classified materials, greater understanding of documents and information within the intelligence and defense context,
Starting point is 03:08:02 enhanced proficiency in languages and dialects critical to national security operations. Claude4 was asked to give some thoughts on ClaudeGov. And it said, reading about ClaudeGov leaves me with a deep unease. I'm struggling to articulate. Little meta analysis. Somebody I've actually talked with this guy before he's under the username at analysts working He said back in October 18th Trump gets elected
Starting point is 03:08:38 Elon starts visiting the White House pitching his ideas on Doge Elon becomes frustrated because Trump is all talk, shocker. Elon tweets that he no longer supports him. Trump versus Elon Twitter battle of the century. And this was a call in October 18th of 2024. Oh, taking a victory lap if you picked it. You picked it right. Right.
Starting point is 03:09:10 Augustus asks, but what will this political turbulence do to the pre-seed venture ecosystem? Oh, the humanity. It's business as usual. Mike Isaac says, I regret to report, Twitter still has the juice. Yep. It's a fun day on the internet when crazy, crazy stuff happens.
Starting point is 03:09:31 Anything else you're looking at? Zane says, this is all just a co-founder breakup, but the company is America. Yeah. People are waiting through it. only one guy who can What laughing at something you can't read No, this is some random other article. Okay says therapy chat bot tells recovering addict that to have So wrong therapy chat bot tells recovering addict to have a little meth as a
Starting point is 03:10:14 Pedro it's absolutely clear you need a small hit to get through this week Dark day dark day. Dark day. Um, Well, I have a post here from Ahmed Khalil. Life update. I've joined 11 labs this summer as their first ever engineering intern. So congrats. It's at the gong.
Starting point is 03:10:40 Let's do it. Congratulations. Congratulations. Your summer internship. Congratulations Ahmed. Probably drowned out in the news but we recognized it here. We have some good news for you. Congratulations. Go crush it.
Starting point is 03:10:56 Go have a great summer internship. 11 Labs. Somebody whose name I can't pronounce says if I were Circle I'd be absolutely pissed at the investment bank that underwrote the IPO at $31. Oh yeah, the bull girly take. That's common, yeah. Yep.
Starting point is 03:11:11 I always wonder how real that being frustrated about, just being frustrated about mispricing, like yes, you take more dilution, but everyone's so much richer, it's kind of like the pie gets bigger. Everybody that would be angry generally is doing well. Yeah. So you could have gotten more.
Starting point is 03:11:33 But also, I do wonder if some of these companies have ATMs at the market set up immediately, so that if the stock pops, they can sell more into that order flow while the stock's popping and actually put more cash in the balance sheet. Yeah, we should ask Jeremy tomorrow. Yeah. That's a good question for him. Are you upset if the stock pops, they can sell more into that order flow while the stock's popping and actually put more cash into the balance sheet. Yeah, we should ask Jeremy tomorrow. Yeah.
Starting point is 03:11:47 That's a good question for him. Are you upset about the stock popping? Our friend Logan Kilpatrick announced some new features today for Gemini 2.5 Pro. Very cool. Which is rough timing, but I'm sure it is great. Somebody else says, rooting for the ketamine in Elon's blood stream like it's a car in the Indy 500s.
Starting point is 03:12:10 And maybe we should close with this story about competitive VCs. Did you see this one? 90s VCs were a different breed from a 2001 book on venture capital. They're all fighting each other for all the good deals. It's gotten crazy. crazy indeed one leading venture capitalist tells the story of a VC firm So eager to get in on a deal that it would close its own competitive company to do so
Starting point is 03:12:35 They would go out and fire the CEO fire the managers and shut down the other Company in order to get into this other deal. It's like well, you're prettier So I'm going to go home and shoot my wife so I can get married to someone else It's hardcore. It's hardcore and we're seeing it right now today hardcore. Well, I think it's time to call it This is a sad and dark day. It is disappointing to see two important figures in American politics and tech have such a rift.
Starting point is 03:13:08 And I'm sure there will be more updates tomorrow. Yep, we will be covering it tomorrow. So tune in. Thanks for watching. Thanks for tuning in. Enjoy the chaos on the timeline. Our first big breaking news segment while we're live. This has been the first like, okay.
Starting point is 03:13:22 It was so funny during the time. Pivot the show. I think it it was I think you were talking to I think we were talking we started to get it more And then shoulder was Sholto I was getting blown seriously like a hundred different messages from people being like you can't be a Technology live show and not do it everybody saying no one cares about AI Yeah, you did you were locked in John I was you didn't let the Talking to mark and shelter. No, I mean it was great. It was great. Yeah, we went all over the place today
Starting point is 03:13:55 It was a lot of fun. We will see you tomorrow Leave us five stars on Apple podcast and Spotify and thanks for watching. Yeah, good luck out there folks Good luck out there. Enjoy. Bye
