The AI Daily Brief: Artificial Intelligence News and Analysis - What AI Developers Are Building Next, with Swyx and Alessio of Latent Space

Starting point is 00:00:00 Today on The AI Breakdown, part two of my conversation with Alessio and Swyx from Latent Space. The AI Breakdown is a daily podcast and video about the most important news and discussions in AI. Go to breakdown.network for more information about our Discord, our YouTube, and our newsletter. Hello, friends, back again with part two. If you haven't heard part one of this conversation, I suggest you go check it out. But to be honest, they are actually kind of separable. In this conversation, we get into a topic that I think Alessio and Swyx are very well positioned to discuss, which is what developers care about right now, what people

Starting point is 00:00:39 are trying to build around. Hello, friends. Quick note before we get to the rest of the episode. You have probably heard me talk about the AI education beta over the past few months. We've had a ton of you participate, which has been amazing. And now we're almost ready to announce something big and something new. If you want to be one of the first to hear about our new approach to learning AI that is hyper-practical, hands-on, immediately relevant, continuously upgrading, and anchored by community, go to besuper.ai and sign up to be notified when the project goes live. We're getting there in just a few weeks, and I want all of you along for the journey. Once again, that's besuper.ai. I honestly think that one of the best ways to

Starting point is 00:01:23 see the future in an industry like AI is to try to dig deep on what developers and entrepreneurs are attracted to build, even if it hasn't made it to the news pages yet. So consider this your preview of six months from now, and let's dive in. Let's bring it to the GPT-5 conversation. I mean, I think that that's a great assessment of just how the stakes have been raised. I guess maybe I'll frame this less as a question, just sort of something that I've been watching. Right now, the only thing that makes sense to me with how fundamentally unbothered and unstressed OpenAI seems about everything is that they're sitting on something that does meet all that criteria, right? Because, I mean, even in the Lex Fridman interview that Altman recently did, you know, he's talking about other things coming out first. Listen, he's good and he could play nonchalant, you know, if he wanted to, so I don't want to read too much into it. But, you know, they've had so long to work on this. Like, unless we are really meaningfully running up against some constraint,

Starting point is 00:02:27 it just feels like, you know, there's going to be some massive increase. But I don't know. What do you guys think? Hard to speculate. You know, at this point, they're pretty good at PR, and they're not going to tell you anything that they don't want to. And they can tell you one thing and change their minds the next day. So, you know, I've always said that model version numbers are just marketing exercises. Like, they have something and it's always improving. And at some point, you just cut it and decide to call it GPT-5. And it's more just about defining an arbitrary level at which they're ready.

Starting point is 00:03:01 And it's up to them what ready means. We definitely did see some leaks on GPT-4.5, as I think a lot of people reported. And I'm not sure if you've covered it. So it seems like there might be an intermediate release. But I did feel, coming out of the Lex Fridman interview, that GPT-5 was nowhere near. And, you know, it was kind of a sharp contrast to Sam talking at Davos in February saying that, you know, it was his top priority. So I find it hard to square. And honestly, there's also no point reading too many tea leaves into what any one person says about something that hasn't happened yet or a decision that hasn't been taken yet.

Starting point is 00:03:38 So, yeah, that's my two cents about it. Like, calm down. Let's just build. Yeah, the February rumor was that they're going to work on AI agents. So I don't know. Maybe they're like, whatever. Yeah, they had, I think, two agent projects, right? One desktop agent and one sort of more general.

Starting point is 00:03:55 Yeah. Sort of a GPTs-like agent. And then Andrej left. So he was supposed to be the guy on that. What did Andrej see? What did he see? I don't know. What did he see?

Starting point is 00:04:05 I don't know. But again, the rumors are always floating around, you know. But I think, you know, we're not going to get to the end of the year without GPT-4.5 or 5, you know, that's definitely happening. I think the biggest question is, are Anthropic and Google increasing the pace? You know, is Claude 4 coming out in 12 months, nine months? What's the deal? Same with Gemini.

Starting point is 00:04:33 They went from like 1 to 1.5 in like five days or something. So when's Gemini 2 coming out, you know? Is that going to be soon? I don't know. There's a lot of speculation, but the good thing is that now you can see a world in which OpenAI doesn't rule everything, you know. So that's the best news that everybody got, I would say. Yeah.

Starting point is 00:04:54 And Mistral Large also dropped in the last month. Not quite GPT-4 class, but very good from a new startup. So yeah, we have now slowly changed the landscape. You know, in my January recap, I was complaining that nothing's changed in the landscape for a long time. But now we do exist in a sort of multipolar world where Claude and Gemini are legitimate challengers to GPT-4, and hopefully more will emerge as well, hopefully from Meta. Yeah. So let's actually talk about the open source side of this for a minute. So Mistral Large, notable because it's not available open source in the same way their other things are.

Starting point is 00:05:32 Although I think my perception is that the community largely recognizes that they want them to keep building open source stuff, and they have to find some way to fund themselves if they're going to do that. And so they kind of understand that they've got to figure out how to eat. But, you know, there's Mistral, there's, I guess, Grok now, which, you know, Grok-1 is, uh, from October, is open sourced. Yeah, yeah, I thought you meant Groq the chip company. No, no. Yeah, you mean Twitter Grok. Although Groq the chip company, I think, is even more interesting in some ways. But, um, and then there's, you know, obviously Llama 3 is the one that sort of everyone's wondering about too. And, you know, my sense of that, the

Starting point is 00:06:09 little bit that Zuckerberg was talking about Llama 3 earlier this year suggested that, at least from an ambition standpoint, he was not thinking about how do I make sure that, you know, Meta keeps the open source throne, you know, vis-a-vis Mistral. He was thinking about how he, you know, releases a thing that's every bit as good as whatever OpenAI is on at that point. Yeah, from what I heard in the hallways at GTC, Llama 3, the biggest model, will be 260 to 300 billion parameters. So that's quite large. That's not an open source model. You know, you cannot give people a 300 billion parameter model and ask them to run it. You know, it's very compute intensive. So I think the... It can be open source. It's just going to be difficult to run, but that's a separate question from whether it's open source. It's more like, as you think about what they're doing it for, you know, it's not like empowering the person running Llama on their laptop. It's like, oh, you can actually now use this to go after OpenAI, to go after Anthropic, to go after

Starting point is 00:07:08 some of these companies at like the middle complexity level, so to speak. Yeah, so obviously, you know, we had Soumith Chintala on the podcast. They're doing a lot here. They're making PyTorch better. You know, that's kind of like maybe a little bit of a shot at Nvidia in a way, trying to get some of the CUDA dominance out of it. Yeah, no, it's great. I love the Zuck-destroying-a-lot-of-monopolies arc. You know, it's been very entertaining. Let's bridge into the big tech side of this, because when I did my episode, I added this as an additional war that I'm paying attention to. So we've got Microsoft's moves with Inflection, which I think potentially are being read as a shift vis-a-vis their relationship with OpenAI, which the Mistral Large relationship seems to reinforce as well. We have Apple potentially entering the race finally, you know, giving up Project Titan and kind of trying to spend more effort on this. Although, counterpoint, we also have them talking about, or there being reports

Starting point is 00:08:16 of a deal with Google, which, you know, is interesting to sort of see what their strategy there is. And then, you know, Meta has been largely quiet. We kind of just talked about the main piece. And then there's spoilers like Elon. I mean, you know, which of those things has been most interesting to you guys as you think about what's going to shake out for the rest of this year? I'll take a crack. So the reason we don't have a fifth war for the big tech wars is that's one of those things where I just feel like we don't cover it differently from other media channels,

Starting point is 00:08:47 I guess. So in our anti-interest list, we actually say we try not to cover the big tech Game of Thrones, or it's proxied through, you know, all the other four wars anyway. So there's just a lot of overlap. Yeah, I think personally, the most interesting one is Apple entering the race. They actually announced their first large language model that they trained themselves. It's like a 30 billion parameter multimodal model. People weren't that impressed, but it was like the first time that Apple has kind of

Starting point is 00:09:14 showcased that, yeah, we're training large models in-house as well. Of course, they might be doing this deal with Google. I don't know. It sounds very sort of rumory to me. And probably, if it's on device, it's going to be a smaller model. So something like a Gemma. It's going to be smarter autocomplete. I don't know what to say. I'm still here dealing with, like, Siri,

Starting point is 00:09:36 which probably hasn't been updated since God knows when it was introduced. It's horrible. You know, it makes me so angry. So one, as an Apple customer and user, I'm just hoping for better AI on Apple itself.

Starting point is 00:09:51 But two, they are the gold standard when it comes to local devices, personal compute, and trust. You trust them with your data. And I think that's what a lot of people are looking for in AI: they love the benefits of AI. They don't love the downsides, which is that you have to send all your data to some cloud somewhere. And some of this data that we're going to feed AI is the most personal data there is. So Apple being one of the most trusted personal data companies, I think it's very important that they enter the AI race.

Starting point is 00:10:22 And I hope to see more out of them. To me, the biggest question with the Google deal is, who is paying who? Because for the browsers, Google pays Apple like $18, $20 billion every year to be the default browser. Is Google going to pay you to have Gemini, or is Apple paying Google to have Gemini? I think that's what I'm most interested to figure out. Because with the browsers, it's the entry point to the thing. So it's really valuable to be the default. That's why Google pays.

Starting point is 00:10:51 But I wonder if the perception in AI is going to be like, hey, you just have to have a good local model on my phone to be worth me purchasing your device. And that would kind of drive Apple to be the one buying the model. But then, like Shawn said, they're doing MM1 themselves. So are they saying we do models, but they're not as good as the Google ones? I don't know. The whole thing is really confusing. But it makes for great meme material on Twitter. Yeah. Yeah, I mean, I think they are, possibly more than OpenAI and Microsoft and Amazon, the most full-stack company there is in computing. And so, like, they own the chips, man. Like, they manufacture everything. So if there was a company that could, you know, seriously challenge the other AI players, it would

Starting point is 00:11:41 be Apple. And I don't think it's as hard as self-driving. So, like, maybe they've just been investing in the wrong thing this whole time. We'll see. Wall Street certainly thinks so. Wall Street loved that move, man. There's a big sigh of relief. Well, let's move away from the big stuff. I think to both of your points, it's going to... Can I drop one factoid about this Wall Street thing?

Starting point is 00:12:06 I went and looked at when Meta went from being a VR company to an AI company. And I think the stock, I'm trying to look up the details now. The stock has gone up 187% since LLaMA 1, which is $830 billion in market value

Starting point is 00:12:25 created in the past year. Yeah. Yeah, if you haven't seen the chart, it's actually remarkable. If you draw a little arrow on it, it's like, no, we're an AI company now, you know, forget the VR thing. It is interesting. I think, um, Alessio called it sort of like Zuck's disruptor arc or whatever. He really is in the midst of a total, you know, I don't know if it's a redemption arc or it's just something different where, you know, he's sort of the spoiler. Like, people loved him just freestyle talking about why he thought they had a better headset than Apple. Like, even if they didn't agree, they just loved that he was going direct to camera and talking about it for, you know, five minutes or whatever.
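A quick check of the arithmetic behind the figure quoted above (a 187% gain said to correspond to $830 billion in added market value), as a minimal Python sketch. The function name `implied_caps` is hypothetical, and the sketch assumes both numbers refer to the same start and end dates, which the conversation does not confirm.

```python
def implied_caps(gain_pct: float, value_created_bn: float) -> tuple[float, float]:
    """Back out the start and end market caps (in $ billions) implied by a
    percentage gain and the dollar value that gain is said to have created.

    Assumption: value_created = start_cap * (gain_pct / 100).
    """
    growth = gain_pct / 100.0
    start_cap = value_created_bn / growth
    end_cap = start_cap + value_created_bn
    return start_cap, end_cap


# Figures as quoted in the conversation: +187% and $830 billion.
start, end = implied_caps(187, 830)
print(f"implied starting market cap: ${start:,.0f}B")
print(f"implied ending market cap:   ${end:,.0f}B")
```

Under those assumptions, the two quoted numbers imply a rise from roughly $444 billion to about $1.27 trillion.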

Starting point is 00:13:08 So that's a fascinating shift that I don't think anyone had on their bingo card, you know, whatever, two years ago. Yeah. It's the whole. We still didn't see him fight Elon, though. Yeah. Yeah. I mean, hey, don't, don't write it off. You know, maybe just these things take a while to happen.

Starting point is 00:13:22 But we need to see him fight in the Colosseum. No, I think, you know, in terms of self-management, life leadership, I think there's a lot of lessons to learn from him. You know, you might kind of quibble with the social impact of Facebook, but just himself, in terms of personal growth and perseverance through a lot of change and, you know, everyone throwing stuff his way, I think there's a lot to learn from Zuck, which is crazy because he's my age. Yeah, right. Awesome. Well, so one of the big things that I think you guys have, you know, distinct and unique insight into, being where you are and what you work on, is, you know, what developers are getting really excited about right now. And by that, I mean, on the one hand, certainly, you know, startups who are actually kind of formalized and formed as startups. But also, you know, just in terms of what people are spending their nights and weekends on, what they're, you know, coming to hackathons to do. And, you know, I think it's such a fascinating indicator for where things are headed. Like, if you zoom back a year, right now was right when everyone was getting so, so excited about AI agent stuff, right? AutoGPT and BabyAGI and these things were like, if you dropped anything on YouTube about those, like instantly tens of thousands of views.

Starting point is 00:14:41 I know because I had like a 50,000 view video, like the second day that I was doing the show on YouTube, you know, because I was talking about AutoGPT. And so anyways, you know, obviously that's sort of not totally come to fruition yet, but what are some of the trends that you guys are seeing in terms of people's interest and what people are building? I can start maybe with the agents part, and then I know Shawn is doing a diffusion meetup tonight. There's a lot of different things. The agent wave has been the most interesting kind of dream-to-reality arc.

Starting point is 00:15:15 So AutoGPT, I think they went from zero to like 125,000 GitHub stars in six weeks, and then one year later they have 150,000 stars. So there's kind of been a big plateau. I mean, you might say there are just not that many people that can star it. You know, everybody already starred. But the promise of, hey, I'll just give you a goal and you do it, I think it's amazing to get people's imagination going. You know, they're like, oh, wow, this is awesome.

Starting point is 00:15:44 Everybody can try this to do anything. But then as technologists, you're like, well, that's just not possible. You know, we would have solved everything. And I think it takes a little bit to go from the promise and the hope that people show you, to then trying it yourself and going back to say, okay, this is not really working for me. And, um, David Luan from Adept, you know, in our episode, he specifically said, we don't want to do a bottoms-up product. You know, we don't want something that everybody can just use and try, because it's really hard to get it to be reliable. So we're seeing a lot of companies doing vertical agents that are narrow for a specific domain,

Starting point is 00:16:28 and they're very good at something. You know, Mike Conover, who was at Databricks before, is also a friend of Latent Space. He's doing this new company called Brightwave doing AI agents for financial research, and that's it. You know, and they're doing very well. There are other companies doing it in security, doing it in compliance, doing it in legal. All of these things that, like, nobody just wakes up and says,

Starting point is 00:16:54 oh, I cannot wait to go on AutoGPT and ask it to do a compliance review of my thing. You know, it's just not what inspires people. So I think the gap on the developer side has been that the more bottoms-up hacker mentality is trying to build these very generic agents that can do a lot of open-ended tasks. And then the more business side of things is like, hey, if I want to raise my next round,

Starting point is 00:17:17 I cannot just sit around and mess around with super generic stuff. I need to find a use case that really works. And I think that has worked for a lot of folks. In parallel, you have a lot of companies doing evals. There are dozens of them that just want to help you measure how good your models are doing. Again, if you build evals, you need to also have a restrained surface area to actually figure out whether or not it's good, right? Because you cannot eval everything under the sun. So that's another category where, from the startup pitches that I've seen,

Starting point is 00:17:53 there's a lot of interest in the enterprise. It's just really fragmented because the production use cases are just coming like now. You know, there are not a lot of long-established ones to test against. And so that's kind of on the virtual agents. And then the robotics side has probably been the thing that surprised me the most at Nvidia GTC, the amount of robots that were there. There were just robots everywhere, both in the keynote and then on the show floor. You would have Boston Dynamics dogs running around.

Starting point is 00:18:25 There was this, like, fox robot that had a virtual face that talked to you and moved in real time. There were industrial robots. Nvidia did a big push on their own Omniverse thing, which is like this digital twin of whatever environment you're in that you can use to train the robot agents. So it kind of takes people back to the reinforcement learning days. But yeah, agents, people want them.

Starting point is 00:18:50 You know, people want them. I gave a talk about the rise of the full stack employee and kind of this future: the same way full stack engineers kind of work across the stack, in the future, every employee is going to interact with every part of the organization through agents and AI-enabled tooling. And this is happening. It just needs to be a lot more narrow than maybe the first approach that we took, which is just put a string in AutoGPT and

Starting point is 00:19:13 pray. But yeah, there's a lot of super interesting stuff going on. Yeah. Well, Alessio covered a lot of stuff there. I'll separate the robotics piece because I feel like that's so different from the software world. But yeah, we do talk to a lot of engineers and, you know, this is our sort of bread and butter. And I do agree that vertical agents have worked out a lot better than the horizontal ones. I think, you know, the point I'll make here is just the reason AutoGPT and BabyAGI, you know, it's in the name. Like, they were promising AGI. But I think people are discovering that you cannot engineer your way to AGI. It has to be done at the model level. And all these engineering and prompt engineering hacks on top of it

Starting point is 00:19:52 weren't really going to get us there in a meaningful way without much further improvements in the models. I'll go so far as to say even Devin, which is, I think, the most advanced agent that we've ever seen, still requires a lot of engineering and still probably falls apart a lot in terms of practical usage, or it's just way too slow and expensive for, you know, what its promise is compared to the video. So yeah, that's what happened with agents from last year. But I do see vertical agents being very popular. And I think the word agent might even be overused sometimes. Like, people don't really care whether or not you call it an AI agent, right? Like, does it replace boring menial tasks that I do, that I might

Starting point is 00:20:37 hire a human to do, or that the human who is hired to do it actually doesn't really want to do? And I think there's absolutely ways in sort of a vertical context that you can actually go after very routine tasks that can be scaled out to a lot of, you know, AI assistants. So yeah, I would basically plus-one what Alessio said there. I think it's very, very promising. And I think more people should work on it, not less. Like, there's not enough people. Like, this should be the main thrust of the AI engineer: to look for use cases and go to production with them instead of just always working on some

Starting point is 00:21:11 AGI promising thing that never arrives. I can only add that so I've been fiercely making tutorials behind the scenes around basically everything you can imagine with AI. We've probably done about 300 tutorials over the last couple months and the verticalized

Starting point is 00:21:28 anything, right? Like, this is a solution for your particular job or role. Even if it's way less interesting or kind of sexy, it is so radically more useful to people. Like, those are the ways that people are actually adopting AI in a lot of cases: it's just a thing that I do over and over again. By the way, I think that's the same way that even the generalized models are getting adopted, you know? It's like, I use Midjourney for lots of stuff, but the main thing I use it for is YouTube thumbnails every day. Like day in, day out, I will always do a YouTube thumbnail, you know, or two with Midjourney, right? And you can start to extrapolate that across a lot of things. And all of a sudden, you know, AI looks revolutionary because of a million small changes rather than one sort of big dramatic change. And I think that the verticalization of agents is sort of a great example of how

Starting point is 00:22:19 that's going to play out too. Yeah. So I'll have one caveat here, which is I think that because multimodal models are now commonplace, like Claude, Gemini, OpenAI, all very, very easily multimodal, Apple's easily multimodal, all this stuff, there is this switch for agents for sort of general desktop browsing that I think people need to keep an eye on. It's not mature yet, but it is absolutely on the way. And so just as we're starting to talk about this verticalization piece, because that is mature, that is ready for people to work on, a lot of people are making really good money doing that. The thing that's on the rise is this sort of drive-by vision version of the agent, where they're not specifically taking in text or anything. They're

Starting point is 00:23:03 just watching your screen, just like someone else would, and piloting it by vision. And, you know, in the episode with David that we'll have dropped by the time that this airs, I think that is the promise of Adept. And that is the promise of what a lot of these sort of desktop agents are. And that is the more general purpose system that could be as big as the browser, the operating system. Like, people really want to build that foundational piece of software in AI. And I would see the potential there for desktop agents being that you can have sort of self-driving computers. Don't write the horizontal piece out. I just think it will take a while to get there. What else are you guys seeing that's interesting to you? I'm looking

Starting point is 00:23:46 at your notes and seeing a ton of categories. Yeah. So I'll take the next two as like as one category, which is basically alternative architectures, right? The two main things that everyone following AI kind of knows now is one, the diffusion architecture and two, the, let's just say the decoder-only transformer architecture that is popularized by GPT. You can read, you can look on YouTube for thousands and thousands of tutorials on each of those things. What we are talking about here is what's next, what people are researching and what could be on the horizon that takes the place of those other two things. So first of all, we'll talk about transformer architectures and then diffusion. So Transformers, the two leading candidates are effectively RWKV and the state space models,

Starting point is 00:24:26 the most recent one of which is Mamba, but there's others, like StripedHyena and the S4 and H3 stuff coming out of Hazy Research at Stanford. And all of those are non-quadratic language models that promise to scale a lot better than the traditional transformer. This might be too theoretical for most people right now, but it's going to come out in weird ways. Imagine, like, right now the talk of the town is that Claude and Gemini have a million tokens of context and, like, whoa, you can put in, you know, two hours of video now. Okay, but what if we could throw in, you know, 200,000 hours of video? How does that change your usage of AI? What if you could throw in the entire genetic sequence of a human and synthesize new drugs? How does that change things? We don't know, because we haven't had access to this capability being so cheap before. And that's the ultimate promise of these two models. They're not there yet, but we're seeing very, very good progress. RWKV and Mamba are probably the two leading examples, both of which are open source, so you can try them today, and there's a lot of progress there. The main thing I'll highlight for RWKV is that at the 7B level, they seem to have beaten Llama 2 in all benchmarks that matter at the same size for the same amount of training as an open source model. So that's exciting.
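To make the "non-quadratic" point concrete, here is a minimal Python sketch comparing how cost grows with context length. It is a back-of-the-envelope illustration only: the tokens-per-hour rate is inferred from the speaker's rough "a million tokens is about two hours of video" figure, the function names are hypothetical, and the model treats full attention as scaling with the square of the token count and the alternatives as scaling linearly, ignoring constant factors, model width, and memory.

```python
# Assumption taken from the conversation: ~1,000,000 tokens for ~2 hours of video.
TOKENS_PER_VIDEO_HOUR = 1_000_000 // 2


def video_hours_to_tokens(hours: int) -> int:
    """Rough context size, in tokens, needed to hold this many hours of video."""
    return hours * TOKENS_PER_VIDEO_HOUR


def cost_growth(old_tokens: int, new_tokens: int, exponent: int) -> float:
    """How much a cost proportional to tokens**exponent grows
    when the context grows from old_tokens to new_tokens."""
    return (new_tokens / old_tokens) ** exponent


base = video_hours_to_tokens(2)        # today's "million token" context
big = video_hours_to_tokens(200_000)   # the hypothetical from the conversation

print(f"context grows from {base:,} to {big:,} tokens")
print(f"quadratic (full attention) cost grows {cost_growth(base, big, 2):,.0f}x")
print(f"linear (idealized recurrent / state-space) cost grows {cost_growth(base, big, 1):,.0f}x")
```

The gap between the two growth factors is the reason sub-quadratic architectures matter for very long contexts; real systems will differ from this idealized picture.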

Starting point is 00:25:55 But, you know, they're at 7B now. They're not at 70B. We don't know if it'll scale. And then the other thing is diffusion. Diffusion and transformers are kind of on a collision course. The original Stable Diffusion already used transformers in parts of its architecture.

Starting point is 00:26:10 It seems that transformers are eating more and more of those layers, particularly the VAE layer. So the diffusion transformer is what Sora is built on. The guy who wrote the diffusion transformer paper, Bill Peebles, is the lead tech guy on

Starting point is 00:26:27 Sora. So you'll just see a lot more diffusion transformer stuff going on. But there's more sort of experimentation with diffusion. I'm holding a meetup actually here in San Francisco that's going to be like the state of diffusion, which I'm pretty excited about. Stability is doing a lot of good work. And if you look at the architecture of how they're creating Stable Diffusion 3, Hourglass Diffusion, and latent consistency models or SDXL Turbo,

Starting point is 00:26:52 all of these are very, very interesting innovations on the original idea of what Stable Diffusion was. So if you think that it is expensive or slow to create Stable Diffusion or AI-generated art, you are not up to date with the latest models. If you think it is hard to create text in images, you are not up to date with the latest models. And people still are kind of far behind. The last piece of which is the wild card I always kind of hold out, which is text diffusion. So instead of using autoregressive transformers, can you use diffusion for text? So you can use diffusion models to diffuse and create entire chunks of text all at once instead of token by token. And that is something that Midjourney confirmed today, because it was only rumored the past few months.

Starting point is 00:27:35 But they confirmed today that they were looking into it. So all those things are very exciting new model architectures that are maybe something that you'll see in production two to three years from now. So a couple of the trends that I want to just get your takes on, because they seem like they're coming up, are, one, these wearable, you know, kind of passive AI experiences where they're absorbing a lot of what's going on around you and then kind of bringing things back. And then the other one that I wanted to see if you guys had thoughts on was this next generation of chip companies. Obviously there's a huge amount of emphasis on hardware and silicon and different ways of doing things. But, you know, love your take on either or both of those. So for wearables, I'm very excited about it. I want wearables on me at all times. I have two

Starting point is 00:28:24 right here to quantify my health. And, you know, I'm all for them. But society is not ready for wearables, right? No one's comfortable with a device on, recording every single conversation we have. Even all three of us here as podcasters, we don't record everything that we say. And I think there's a social shift that needs to happen. I am an investor in Tab. They are renaming to a broader vision, but they are one of the sort of three or four leading wearables in this space, sort of the AI pendant or AI OS or AI personal companion space. I've seen two Humanes in the wild in San Francisco. I'm very, very excited to report that there are people walking around with those things on their chest.

Starting point is 00:29:08 And it is as goofy as it sounds, it absolutely is going to fail. But God bless them for trying. And I've also bought a Rabbit. So I'm very excited for all those things to arrive. But yeah, people are very keen on hardware. I think the idea that you can have physical objects that embody an AI that do specific things for you is as old as, you know, the sort of Golem in sort of medieval times in terms of like how much we want our objects to be smart and do things for us. And I think it's absolutely

Starting point is 00:29:45 a great play. The funny thing is people are much more willing to pay you up front for a hardware device than they are willing to pay like an $8 a month recurring subscription for software, right? And so the interesting economics of these wearable companies is they have negative float in the sense that people pay deposits upfront. Like I paid like, I don't know, 200 bucks for the Rabbit upfront and I don't get it for another six months. I paid 600 bucks for the Tab, but I don't get it for another six months. And then they can take that money and sort of invest it in like their next events or their next properties or ventures. And like I think that's a very interesting reversal of economics from other types of AI companies that I see. And I think, yeah, just the tactile feel of an AI, I think,

Starting point is 00:30:34 is very promising. I don't know if you have other thoughts on the wearable stuff. Open Interpreter just announced their product four hours ago. Yeah. It's not really a wearable, but it's still like a physical device. It's a push-to-talk mic to a device on your laptop, right? It's a $99 push-to-talk. Yeah. But again, to go back to your point, it's like, people are interested in spending money for like things that they can hold, you know? I don't know what that means overall for like where things are going, but making more of this AI be a physical part of your life.

Starting point is 00:31:12 I think people are interested in that, but I agree with Sean. I mean, I talked to Avi about this, but Avi's point is like most consumers care about utility more than they care about privacy, you know, like you've seen with social media. But I also think there's a big societal reaction to AI that is much more rooted than the social media one. But we'll see. But again, a lot of work, a lot of developers, a lot of money going into it, so

Starting point is 00:31:42 there's bound to be experiments being run. On the chip side... Sorry, I'll just chip in one more thing and then we transition to the chips. The thing I'll caution people on is don't overly focus on the form factor. The form factor is a delivery mode; there will be many form factors. It doesn't matter so much as where in the data war it sits. It actually is context acquisition, and maybe a little bit of multimodality.

Starting point is 00:32:09 Context is king. Like, if you have access to data that no one else has, then you will be able to create AI that no one else can create. Right. And so what is the most personal context? It is your everyday conversation. It is as close to mapping your mental train of thought as possible without, you know, you physically writing down notes. So that is the promise, the ultimate goal here, which is like personal context, it's always available on you, you know, low latency, all that stuff. But that's the frame I want to give people: that the form factors will change and there will be multiple form factors, but it's the software behind that, and the personal context that you cannot get anywhere else, that'll win. Yeah, so that was wearables.

Starting point is 00:32:50 On the chip side, yeah, Groq was probably the biggest release, Jonathan Ross. But it's not even a new release because the company, I think, was started in 2016. So it's actually quite old. But now they recently captured people's imagination with their Mixtral 500 tokens a second demo. Yeah, I think so far the battle on the GPU side has been either you go kind of like massive chip like the Cerebras of the world, where one chip costs about $2 million. You know, obviously you cannot compare one chip versus one chip, but an H100 is like 40,000, something like that.

Starting point is 00:33:31 The problem with those architectures has been, they want to be very general, you know, but like they wanted to put a lot of the RAM, the SRAM, on the chip. It's much more convenient when you're using large language models, but the models outpace the size of the chips, and chips have a much longer, you know, turnaround cycle. Groq today, it's great

Starting point is 00:33:54 for the current architecture. It's a lot more expensive also as far as dollar per flop. But their idea is like, hey, when you have very high concurrency, we're actually much cheaper. You shouldn't just be looking at the compute power. For most people, this doesn't really matter. You know, like, I think that's like the most interesting thing to me is like we've now gone back with AI to a world where developers care

Starting point is 00:34:24 about what hardware is running, which was not the case in traditional software for like maybe 20 years, since the cloud got really big. My thinking is that in the next two, three years, like we're going to go back to that. People are not going to be sweating

Starting point is 00:34:39 what GPU you have in your cloud, what you have. It's like, yeah, you want to run this model? We can run it at the same speed as everybody else. And then everybody will make different choices, whether they want to have higher upfront capital investment and then better utilization. Some people would rather do lower investment before and then upgrade later.

Starting point is 00:34:58 There are a lot of parameters. And then there's the dark horses, right? That is some of the smaller companies like Lemurian Labs, MatX that are working on, maybe not a chip alone, but also like some of the actual math infrastructure and the instructions on it that make them run. There's a lot going on, but yeah, I think the episode with Dylan will be interesting for people. But I think we also came out of it saying,

Starting point is 00:35:27 hey, everybody has pros and cons. It's different than the models where you're like, oh, this one is definitely better for me and I'm going to use it. I think for most people, it's like fun Twitter memeing, you know, but it's like 99% of people that tweet about the stuff are never going to buy any of these chips anyway. So it's really more for entertainment.

Starting point is 00:35:49 Wow. I mean, like this is serious business here, right? You're talking about, you know, like, the potential new Nvidia. If anyone can take like 1% of Nvidia's business, they are a serious startup that you should look at, right? So that's... Yeah, yeah, yeah. I'm more talking about like, how should people think about it, you know? It's like, I think like the end user is not impacted as much. This is obviously like... So I disagree. Yeah. I love disagreements because, you know, who likes the podcast where all three people always agree with each other? You will see the impact of this

Starting point is 00:36:18 in the tokens per second over time. This year, I have very, very credible sources all telling me that the average tokens per second, right now we have somewhere between 50 to 100 as like the norm for people. Average tokens per second will go to 500 to 2,000 this year from a number of chip suppliers that I cannot name. So that will cause a step change in the use cases. Every time you have an order of magnitude improvement in the speed of something, you unlock new use cases that become fun instead of a chore.

Starting point is 00:36:51 And so that's what I would caution this audience to think about, which is like what can you do in much higher AI speed? It's not just things streaming out faster. It is things working in the background a lot more seamlessly and therefore being a lot more useful than previously imagined. So that would be my two cents on that. Yeah. Yeah.
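
To put the tokens-per-second figures quoted above in perspective, here is a rough back-of-the-envelope sketch. The rates (50, 100, 500, 2,000 tokens per second) come from the conversation; the 1,000-token response length and the function name are assumptions chosen purely for illustration, and the calculation ignores time-to-first-token, network overhead, and batching.

```python
def generation_seconds(num_tokens: int, tokens_per_second: float) -> float:
    """Time to stream num_tokens at a steady decode rate (hypothetical helper)."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return num_tokens / tokens_per_second


# Assumption: a 1,000-token response; this number is not from the episode.
RESPONSE_TOKENS = 1000

# Rates mentioned in the conversation: today's norm vs. the projected range.
RATES = [50, 100, 500, 2000]

latencies = {rate: generation_seconds(RESPONSE_TOKENS, rate) for rate in RATES}

for rate, seconds in latencies.items():
    print(f"{rate:>5} tok/s -> {seconds:.1f} s for {RESPONSE_TOKENS} tokens")
```

Under these assumptions, the same response that takes tens of seconds at today's rates would finish in a second or two at the projected rates, which is the kind of gap that lets work run in the background rather than being watched as it streams.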

Starting point is 00:37:11 I mean, the new Nvidia chips are also much faster. To me, that's true. When it comes to startups, it's like, are the startups pushing the performance on the incumbents, or are the incumbents still leading and then the startups are like riding the same wave, you know? I don't have yet a good sense of that. It's like, you know, is next year's Nvidia release just going to be better than everything that gets released this year? You know, if that's the case, it's like, okay, damn, Jensen. You know, it's like the meme. It's like, I'm gonna fight, I'm gonna fight Nvidia. It's like, damn, Jensen got hands. Like he really does. So

Starting point is 00:37:47 Well, awesome conversation, guys. I guess just by way of wrapping up, call it over the next three months, between now and sort of the beginning of summer, what's one prediction that each of you has? It can be about anything. Could be big company, can be startup, it could be something you have privileged information on, that you know and you just won't tell us that you actually know. Wait, does it have to be something that we think is going to be true, or like something that we think... Because for me it's like, is Sundar going to be the CEO of Google? Maybe not in three months, maybe like six months, nine months, you know?

Starting point is 00:38:19 People were like, oh, maybe Demis is going to be the new CEO. That was kind of like, I was busy like fishing some DeepMind people and Google people for like a good guest for the pod. And I was like, oh, what about Jeff Dean? And they're like, well, Demis is really like the person that runs everything anyway in this stuff. And it's like, interesting. And so I don't know. What about Sergey? Sergey could come back.

Starting point is 00:38:41 I don't know. Like he's making more appearances these days. Yeah. I bet we can just put it as like... Yeah, my thing is like CEO change potential. But again, three months is too short to make a prediction.

Starting point is 00:38:57 I think that's fine. The time scale might be off. Yeah, I mean, for me, I think the progression in vertical agent companies will keep going. We just had the other day Klarna talking about how they replaced

Starting point is 00:39:14 like 700 of their customer support agents with an AI agent. That's the beginning, guys. Imagine this rolling out across most of the Fortune 500. And I'm not saying this is like a utopian scenario. There will be very, very embarrassing and bad outcomes of this where humans would never make this mistake, but AIs did, and we'll all laugh at it or we'll be very offended by whatever bad outcome it had.

Starting point is 00:39:40 So we have to be responsible and careful in the rollout. But yeah, this is, it's rolling out. Alessio likes to say that this year is the year of AI in production. Let's see it. Let's see all these sort of vertical full-stack employees come out into the workforce. Love it. All right, guys. Well, thank you so much for sharing your thoughts and insights here.

Starting point is 00:39:58 And I can't wait to do it again. Thanks for everyone.
