Limitless Podcast - Meta’s New AI Can Predict Your Emotions Better Than You

Episode Date: August 15, 2025

Meta’s latest AI breakthrough can predict how your brain reacts to movies, and even pinpoint your thoughts and fears, by analyzing video, audio, and text. In this episode, we break down how Meta’s “TRIBE” model won a global brain modeling competition, what it means for personalized content, and the bigger implications for brain-computer interfaces, AI healthcare, and the future of human-computer interaction.

🌌 LIMITLESS HQ: LISTEN & FOLLOW HERE ⬇️
https://limitless.bankless.com/
https://x.com/LimitlessFT

TIMESTAMPS
0:00 Intro
03:22 How Does It Work?
07:28 How Can This Go Wrong
12:12 The Mind Reading Wrist Device
20:37 Who Really Wins Here

RESOURCES
Josh: https://x.com/Josh_Kale
Ejaaz: https://x.com/cryptopunk7213

Not financial or tax advice. See our investment disclosures here:
https://www.bankless.com/disclosures

Transcript
Starting point is 00:00:03 Imagine if you could predict how someone's brain lights up just by knowing what movie they're watching. Well, Meta's new AI model is kind of doing exactly that. So in this episode, we're going to dive into how Meta's AI team just won a big brain modeling challenge by building a model that can literally take a movie's video, audio, and text, and then predict exactly how your brain would respond if you watched it. It's not mind-reading, but it's close, and it's a fascinating glimpse into how AI is learning to understand our brains from the outside looking in. So this is really interesting.
Starting point is 00:00:33 I read the report. I went through the paper. Ejaaz, it's pretty cool. Walk us through it. What happened here? Yeah, so this was kind of the equivalent of a science competition, but for, like, the best tech companies in the world. And Meta just won it. The invention that they made is, as you said, kind of like this brain scan AI model. But what's cool about it is it can predict whether you are going to like a movie or a video, or whether you're going to hate it. And it's not just the entire video, it's different scenes of the video, it's maybe a certain actor or a certain kind of illusion or whatever that might be. And the reason why this is so cool is just imagine a world where you could get kind of personalized content. Imagine like a personalized Netflix that you watch every day, Josh. But instead of you kind of being bored by the same plot lines or the same twists, you are just constantly surprised. But if your friend came over and watched the exact same show, they would see something completely different, right?
Starting point is 00:01:38 What I thought was, like, astounding about this is not just, like, the effects that it has on content, but also how the model works, right? So it's a one billion parameter model. And if anyone has ever listened to this show before, you know that that is minuscule compared to any of the models that we typically talk about. We typically talk about, like, trillion parameter models. So one billion is really small, but what is cool about this is it's multimodal. So we're not just talking about an LLM here that kind of ingests words and characters and spits words back out at you. It's ingesting images. It's playing through video. It's ingesting audio. And it's compiling all of that in a really tiny model and figuring out what you're going to think of things, how you're
Starting point is 00:02:24 going to react to things. I just think that's awesome. Yeah. Okay. So let's explain what's happening here. So they won the global competition. It's called Algonauts. And there were 260 teams that all competed. Meta won. And for people who aren't familiar, like I was: there is apparently an Olympics for brain modeling, which I didn't know. And that's what this Algonauts competition is like.
Starting point is 00:02:44 And the challenge was to predict real fMRI brain responses from what someone is seeing, hearing, and reading on screen. So the winning model was called TRIBE, short for Trimodal Brain Encoder. We talk about multimodality; we'll get into this in a sec. It topped the leaderboards with a correlation score of 0.216, which sounds small. But hey, in brain science, that's gold.
Starting point is 00:03:07 That's what it takes. So basically, it's guessing how your friend will react to the movie scene. When will they laugh? When will they tear up? When will they get confused? Except instead of emotions, we're talking about these thousands of tiny little patterns in your brain's activity.
Starting point is 00:03:20 So quickly getting into how this works: TRIBE, which stands for trimodal, takes in three data streams. So you have video, audio, and text. The video part uses Meta's model called V-JEPA 2. The model names are always so bad. But basically, this fancy model that Meta makes is used to understand visual detail. So it recognizes faces, it recognizes movement in the scenes, colors, and the different types of scenes. And it kind of understands what implications each of those traits has on your brain stimulation.
Starting point is 00:03:50 And then there's a second part to this, which is audio. It uses this thing called wav2vec 2.0, another super weird model name. It doesn't matter. Basically, it interprets the tone. It interprets pitch and sound patterns. So things like music swelling or explosions or any sound effects, it analyzes the direct impact of that audio stimulus on your brain. And then the third one is text. And that uses a model that we're very familiar with called Llama 3.2. That's their open source model, actually. So anybody can go and use that. And what it does is it processes the dialogue and the captions. And it takes the implications of what that means and then how that would impact your brain. So you could kind of think of it like three detectives. They're each investigating from a different angle. We have the sights and sounds, and then we have the script, and they're pooling all the clues together before making a call on what they think you will feel. And they even trained it to handle missing data. So if there's no transcript, for example, it can still predict brain activity just by using sound and vision, and by transcribing the voice in real time to create the
Starting point is 00:04:50 transcript and feed that into the model. So it's this really impressive, like, trimodal thing that they managed to pull off. And I mean, of course Meta wins. Of the 260 teams, I would imagine they are the most well-capitalized by a couple orders of magnitude, but it's still really impressive what they managed to pull off. So what does this mean? To me, this is really cool. I am a big fan of the collapsing of the difference, or the distance, between the brain and the actual input and output of a computer. And if we can start, not to read it, because this isn't mind reading, but if we can start to anticipate it, if we can start to understand the impulses and, I guess, transcribe that into data, that to me seems super interesting, because we're going to
Starting point is 00:05:34 talk about a wrist device that they announced a little earlier. But the first time I saw this really was with the Apple Watch, where just the action of moving your fingers will actually trigger an impulse on your device. And it feels like it's reading your mind even though it's not really. It's just taking a good guess. And I think when it comes to how we interact with computers and how we engage with content, the fact that a model can predict what type of content is going to most likely stimulate us seems like something that is really exciting, in the sense that it can create the most compelling content in the world, and also really frightening in the sense that, well, if you have TikTok on this and it understands all the
Starting point is 00:06:15 impulses from your brain and exactly how to optimize it for you, well, you can get some pretty amazingly addictive content. So to me, this is exciting for two reasons. It's like you can get great content, and you can now engage with computers even faster because they can anticipate your actions. But also, it's like, hey, we can design some pretty enticing, addictive experiences, because we actually know exactly how your brain works, down to the audio wave. So that's kind of my take on the implications of this. Did you have any further ones? Yeah, I think of this as a double-edged sword. If I were to guess where the pinnacle of all this technology is going to end up, it's going to be a really slick brain-computer interface. So what I mean by that is we're trying to replicate
Starting point is 00:07:01 human intelligence. We can't just do that with words. We need vision. We need audio. We need feeling. We need all of these kinds of things. So we're kind of developing all these different kinds of devices. We've got robotics. We've got different kinds of AI hardware, robotaxis. We've got different screens, cell phones, VR goggles, everything to try and emulate and simulate human intelligence. So with this new development, it's really optimistic in the sense that it's helping us get to that stage that I just described. But the flip side of it is this could be really bad for us as well. Do you remember a few years ago, Josh, there was this documentary on Netflix called The Social Dilemma? And it basically, yeah, it went viral for
Starting point is 00:07:47 one singular reason, it unpacked and revealed how Facebook's algorithm worked, Facebook and Instagram's algorithm. And it basically described that they knew everything about you. They knew what you were going to buy before you even bought it. They knew what you were going to like before you were going to like it. They knew the friends that you were going to make before you'd even met them. That's insane. And I think that this is that on steroids, right? Because it's now going to apply to content that doesn't even exist. I mean, think about the applications here, right? Like, you could, in the optimistic sense, be a movie director.
Starting point is 00:08:24 And you're like, I don't know whether this script is going to be cool. Will people like this scene? Maybe, maybe not. But let me just run it through this brain scanner, or this brain simulator, rather, and test it on maybe 100 subjects and see whether it's appealing to the demographic or audience that we're pitching for. That's really optimistic. But on the other side, if I am a Meta shareholder, and again, I'm laughing
Starting point is 00:08:46 as a Meta shareholder here, right? And they're like, well, I want retention to go up. I want the user base to go up. The best way to do that is to keep people's eyes on the screens on our apps. So if we can perfectly tailor content, if we can be the producers of content as well, right? We don't need to rely on users. Heck, we'll just use our AI models to create it ourselves and write the scripts. We then own the entire stack and the users and the retention, and we keep making money.
Starting point is 00:09:11 That's my doomer Meta thesis. But, yeah, that checks out. Yeah, and we're getting interesting takes from other people too. There's the neuroscience camp, for whom this is interesting because, I mean, it helps map which brain areas respond to language, music, and visuals, and how they integrate over time. So for people who are in the brain world, this is fascinating, and AI has just unlocked a new field to look into. And then for AI researchers, well, it shows the way foundation models process information isn't random. Like, it actually lines up with the way that human brains do it, at least in some higher-level regions, which implies that, like, maybe we are not so
Starting point is 00:09:39 different from an LLM than we thought. Like, maybe we are actually just predicting next words. If we could have an AI model predict our feelings pretty consistently, then, well, you could have an interesting conversation about AGI too: what really is the difference between the model's token prediction and how our brains predict? So that's just this weird edge case. But yeah, and then there's another really cool area that I was thinking of, which is education. When you're learning things, when you're being taught things, certain
Starting point is 00:10:22 lessons, certain ideas, create more of a cognitive load on your brain than others. And if you're aware of exactly the amount of load that you can deliver to a brain, then you can actually optimize lessons, optimize education, even optimize working for the maximum that your brain is comfortable handling. And it can actually detect the impulses and the stimulus as you go and figure out the best way to, well, you can kind of optimize yourself. So let's say you're doing a math lesson and it has a bunch of problems you need to solve, and it knows exactly how far it can push you before you reach a breaking point. And to me, that's interesting. It knows not to feed me educational TikToks when I'm coming back from a night out, basically, which I think is a major unlock. Yeah, the education thing is actually
Starting point is 00:11:06 great. I didn't think about that. I kind of immediately thought of Chinese TikTok. You know how they say Chinese TikTok is super educational, informative, and productive for their audience, versus Western TikTok, where it's all just kind of slop, dancers and entertainment? I kind of think about that immediately. I kind of think about how this will apply to news and media as well, right? I wonder if, you know, over the last decade, we've seen this rise of clickbaity titles where, you know, dying media corporations are trying to figure out ways to keep users, get people to join their subscriptions and stuff like that, whether that takes another step up in terms of forming narratives, and what that means for kind
Starting point is 00:11:46 of like truth-seeking platforms, like maybe X, or that's how they advertise themselves at least, versus traditional media as well. But I was also thinking, Josh, this isn't the first kind of hardware push that Meta's made, right? Didn't they come up with a neural wristband literally a month ago that can act as some kind of interface? So it's like you can make certain gestures and control an interface. Yeah. So they have this hardware device that will actually anticipate and understand the movements in your fingers through a wrist device. And it's pretty fascinating. They refer to this as the first high-bandwidth generic non-invasive neuromotor interface, which is a long way of saying it can
Starting point is 00:12:30 kind of understand the muscle activity that happens in your hands without an implant. And this is kind of the first time that we've seen this at the level of precision that we have. So in the past, we've had, I mean, even the Apple Watch example that I described earlier, you can kind of tap your fingers and it recognizes the muscles flexing in your arm. This will actually recognize gestures, recognize discrete inputs, recognize handwriting. And what we're seeing on screen now is a person writing by hand, and it's actually automatically dictating the words into text. And it's good up to 21 words per minute, with an additional 16% boost when it's personalized to your handwriting. So why does this matter? Well, it's not the first interface of its kind,
Starting point is 00:13:12 but the wristband is easy to wear, reliable, fast, and it feels like a leap forward in how we engage in human interfaces. So, I mean, in the past, we've talked about the Apple Vision Pro and just the increased adoption of spatial interactions with computing, where now we have virtual worlds, we have virtual reality, we're talking about goggles. If you are able to take away the need for a screen and using two fingers and can just place these sensors on your wrists that are much less invasive, much more natural, and also much more rich in terms of your movements. Like when you're interacting with the screen, you're tapping a single point or maybe a multi point for multi-touch. But with the hand thing, you have all of the like free range motion of your hands. You have the
Starting point is 00:13:55 angles, you have gravity, you have gyroscopes. You can really personalize how you interact with these machines. And this seems to be the trend with Meta, where they're just kind of trying to understand more of how the human brain works, how the human body works, and how we can merge that with this next generation of wearable computers. And it's not just a trend with Meta, right? We're seeing these types of developments through a number of different companies. Originally, we had Elon Musk with Neuralink, which is basically putting a chip inside your brain, and therefore whatever you think and look at, you can kind of access in a simple computer-like way. And then this week, we got this announcement that Sam Altman via OpenAI is funding a competitor, $250 million into
Starting point is 00:14:41 this company called Merge Labs, which is basically building a Neuralink competitor. So again, another chip in the brain. And then I kind of think about how Meta is making this move with their VR glasses and then also with their Ray-Ban glasses, so augmented reality versus VR, this wristband, this now brain simulator. And I can't help but think that there is this trend towards, I don't know whether it's better to call it AI healthcare or AI bioengineering. I kind of think bioengineering makes more sense, because it's like a means to an end to try and create this new kind of metaverse-type world, if I dare use that word. But I think it's super cool.
Starting point is 00:15:23 And even actually in OpenAI's announcement last week of GPT-5, they spent like 20 minutes talking about healthcare. I've just got a clip up here where they interviewed this lady who has gone through a number of different illnesses, but she used ChatGPT to kind of diagnose her and offer her different paths towards cures and treatments. And that was just from, like, a word context, right? But imagine if you had an array of devices or chips that could just read you in real time just like it does a computer, can access your memory, can see the experiences that you've been through, can feel the trauma through your neurons, you know, do a full body scan like you see in these futuristic movies,
Starting point is 00:16:05 that would be pretty insane, right, to just kind of have this curated cure come to you. So I think we're headed towards this kind of world. AI has done really well with words, basically, and in chatbots, and it's starting to appear in social consumer apps as well. But the next major leap, aside from math and coding, which we've spoken about before on this show, is going to be general science. And I think that's super exciting. Yeah, that seems super cool.
Starting point is 00:16:31 And there's an important distinction between a lot of these things. So actually, if you wouldn't mind going back to the last post with Sam Altman announcing the raise. Yeah, so $250 million into Merge Labs, which we're assuming is the Neuralink competitor. This is the brain-machine interface competition to Neuralink and what we've seen from a few others who are trying. This is actually very different from what we described earlier today with Meta and the Algonauts competition and their TRIBE model. So the TRIBE model, the way that it works is it's read-only. So it will read your brain responses. It will understand how different sensory input affects them, and then it can predict how you're going to react to certain things. With Neuralink, it has the write function as well. So not only will it read your thoughts or anticipate your thoughts, but it will actually allow you to do things with it. So it can control parts of your body without your own inputs actually being responsible for causing it. So with Meta, it is
Starting point is 00:17:34 anticipatory, but it is not actually reading your brain. With Neuralink, and with this new company from OpenAI and Sam Altman, Merge Labs, it is actually reading directly from your brain and then directly allowing you to create input and output. So you can imagine what we've talked about today with TRIBE as being the very early version of this trend that you're describing, Ejaaz, which is moving more towards human-based compute. And then the next step up is actually reading and writing from the brain. And I totally agree with your take. This is where it is going. The healthcare space in general, the neural space in general, it has all moved very slowly over the last couple decades.
Starting point is 00:18:07 There really hasn't been much progress. It hasn't been that good. It feels so dumb that in order to do things like solving cancer, we just radiate our bodies and destroy everything else with it. It's so imprecise. It's so, like, it's very sloppy. It's not good. And we haven't had progress in this space for a very long time.
Starting point is 00:18:26 And it's not really the fault of anyone. I would imagine these are difficult challenges, but we have this new technology that can enable us to solve these things and to bring these computers closer to our biological bodies. And I fully agree with you that this trend is here to stay. This is an important one. It's cool to see people like Sam and OpenAI really stressing how important it is to them. And I mean, every time I hear Sam in an interview, he's talking about how the thing he's most personally excited about is net new knowledge on the frontier of bioengineering
Starting point is 00:18:59 Because I think that's where we really get this huge level up of a perceived level up. Like we talk about this a lot where we feel like we're moving very quickly, but the world around us, like it takes these like big leaping points to catch up. And hopefully that's going to be one of those things where one of these days you'll go to a doctor and it will have, they'll put a little thing on your head and they'll put this little thing in your arm and it'll be able to detect all these sensory inputs and understand what's wrong with you and be able to help you in a much more precise way.
Starting point is 00:19:24 So this directionally seems like, yeah, the right trend. Yeah, I actually hope we take it a step further. And, you know, you don't even need to go to the doctor in the first place. Like, you just have a set of personalized AI health hardware devices, or whatever you want to call it. Maybe it's just a single chip that could read you 24/7, anticipate diseases or illnesses before they happen, and kind of have your prescription mailed to your door, right? I kind of think about the old-school way 10 years ago, where you have to sit in line at the waiting room
Starting point is 00:19:57 at a hospital or local GP office and kind of maybe wait hours, maybe you're delayed, maybe you have to go away and come back another day for a follow-up. And instead we kind of head towards this kind of real-time AI healthcare. And I don't know about you, Josh, but like I am loving all the competition,
Starting point is 00:20:16 all the arguments between these billionaires online because net net, it results in a great, great experience for the consumer, I think. And if these guys are in a fierce competition spending tens to hundreds of billions of dollars every quarter now, not even every year, on kind of big leaps like this, I am all for it. That's the amazing thing, right? It's like all of the spending we're talking about, all of these problems that these people are having to solve, all of the drama that is happening on a day-to-day basis. It's all to benefit the end user. It's all to benefit us. And for the low cost in most cases of $20 a month to get
Starting point is 00:20:51 access to all of this chaotic, fighting intelligence. Like, the smartest people in the world have been working on this single thing, and we get access to it for $20 a month. And every week, it gets a little bit better and a little bit more powerful, and we get a few more use cases. And the real winner is just the armchair experts over here, just sitting back watching it all unfold and paying a monthly subscription to benefit our lives. And on the point of the healthcare thing coming to you, that makes a lot more sense than actually going to a doctor, right? Because Apple has been trying this to varying degrees of success with the watch, where they can do EKGs now, and Whoop recently did this, and they detect your blood pressure now
Starting point is 00:21:30 and they can detect your temperature. And we've seen this up to an extent. But I think what Meta is working on, what we just showed a little bit earlier with the actual neural sensory impulse thing, that's this whole next level of devices. And it's funny to see it coming out of Meta's lab. And maybe they're going to be the hardware company that is going to push us there, because we have OpenAI, who have their device, plus a suite of devices that we're not really sure what they are. They've been very big about that. And then we have Apple, who has a watch, a phone, a laptop. I mean, they have the Vision Pro. They're probably going to go with glasses. And then we have Meta, who has the glasses already. They have this wrist technology. And I think
Starting point is 00:22:12 what Meta does well that other labs don't, or other companies don't, is they actually just share the research as they go. So I'm sure Apple probably did something similar to this. I'm sure maybe OpenAI or Google or someone who has big hardware chops has done this. But Meta just shares the research, and they get everyone really excited about it. And it's part of the open-source ethos that they started with. They're like, hey, we're just going to share our work. Tell us what you think. Tell us where we can improve. And they're just kind of doing this thing in public. So, I just realized, Josh, you reminded me, we completely forgot to mention Google, who have been the main company putting out all the amazing AI medical stuff. They were the first to release an AI model that could
Starting point is 00:22:49 predict and generate protein structures, for cures, for antibodies. They've been investing, exactly, they've been investing hundreds of millions of dollars into this kind of AI-for-science sector. So, you know, shout out Google. We're actually interviewing one of the heads of AI at Google, Logan, so we are so excited for that interview. Keep an eye out for that. But yeah, you know, I've got nothing else to say except here's to living till the ripe old age of 150 and being of sound mind and body. Knock on wood. 150? I feel like those are rookie numbers. But I guess, what, how long do you think you'll live? Was that just a random number? Do you think it's 150?
Starting point is 00:23:26 Depends on the quality of life. Like, what are we talking about here? Like, am I an old man in a rocking chair, or how long am I in my healthy 30s? I don't know. Bryan Johnson's been there for a long time now, and he's patient number one. So maybe, maybe. He hasn't even been there. He reversed himself back. Yeah. So it's possible. It's going to be really exciting. There's a world in which, if you could keep yourself healthy for even the next decade, there is a high probability that you will continue to feel healthy for much longer than you'd expect. And I guess, in terms of, can we maybe end on predictions: what company are we most excited about to release AI computer-human interface hardware? Do you think there's a winner that you see early on? I don't think there's a clear winner now, because there's no real consumer device or experience that we can see at scale right now. The company that's closest to it, I would say, is Neuralink, Elon Musk's company, which has been trialing these chips in the brains of a number of different patients right now. And it's crazy. You know, people who have had a completely disabled body, or certain parts of their bodies,
Starting point is 00:24:40 can now control computers and are tweeting from their X accounts, right? And I'm liking that. And I'm like, wait, you can't move your fingers?
Starting point is 00:24:54 that's insane, right? But I don't think there's a clear winner. If I was to go with the dark horse, maybe not-so-dark horse, I don't think it's Meta. I think Meta is going to use all this amazing technology to prime their social media. I'll go, and that's me being a doomer. But I think it's going to be Google. I think it's going to be Google.
Starting point is 00:25:07 I'd love to throw shade on Google, because I'm like, ah, they're a big corp, they can't innovate. I remember their first AI model being completely inaccurate. And so it kind of left a
Starting point is 00:25:18 bad taste in my mouth, but I think they're killing it. With the Gemini model, with all this investment in science, I'd say Google. Yeah, okay. That sounds right. Yeah. What about you? Neuralink is not even in the same conversation, because they're kind of the device to end all devices. They are the final frontier of devices. There is this intermediary layer until we get there, where you do have the glasses, you have the wrist strap, you have the things on your body, before it's just literally your brain. So I agree, Neuralink on the brain thing. And Sam Altman, best of luck with your new company. You've got a pretty steep hill to climb to catch up with them. But in terms of that intermediary hardware, before we get the device that ends all devices, I kind of,
Starting point is 00:25:59 I agree with you on Google. I think I have the same take. I want it to be Apple, because they have the best design and, I think, the best ecosystem. It's just not going to be the case. Meta has never really delivered a hardware product at scale that I've enjoyed. So it would take a zero-to-one moment for them to start releasing products that are actually great, which they're clearly trying. But Google, man, Google's just, they're dialed. You mentioned that they did have this problem earlier with the diversity thing, and it was creating bad images and stuff, but Larry and Sergey have come back. They are working on the company. They have dialed it in. We have Demis. Demis. How do we pronounce the name? Demis. Yes. He is unbelievable at leading the DeepMind team
Starting point is 00:26:42 in terms of building these scientific models. And I think that research, combined with their hardware production capabilities that we see with the Android line and their laptops and everything, I don't know, could be a home run. Maybe we're team Google for today. We'll see where we stand next week. But team Google, and I think that's a wrap on today's episode. So AI is not reading your mind, but it is guessing very precisely, and with a lot of accuracy, where your mind is going to be. So it's a really cool progression in where we're headed in this kind of convergence between computers and humans. And it's just another week in the crazy world of AI. So thank you for joining us again on another episode. We hope you enjoyed this one.
Starting point is 00:27:23 And yeah, there will be lots more to come. If you enjoyed the episode, don't forget to like, subscribe, and share it with a friend. That helps a lot. Oh, and we flipped the OpenAI podcast, officially. We were on Spotify. Yes, on the Spotify tech charts, Limitless is now above the OpenAI podcast. To everyone who helped make this happen, thank you. We are very much indebted. We very much appreciate the support. Epic. Keep it coming. Keep it coming. We're going to the top. So thank you, everyone, for watching and for sharing everything, and we will see you guys in the next one. Thank you. Peace.
