Big Technology Podcast - Why Meta Wants To Build Artificial General Intelligence — With Joelle Pineau
Episode Date: January 24, 2024. Joelle Pineau is the head of Meta's AI Research division. She joins Big Technology Podcast to discuss the company's recent proclamation that it intends to build Artificial General Intelligence, digging into how we get there and why it feels it must. In a wide-ranging discussion, we cover the latest research trends, the company's open-source practices, what products developers have actually built with AI, video generation, and the reasons why NVIDIA chips are so in demand. Tune in for a deep dive into the world's most crucial technology from someone directing one of its most important research labs. --- Enjoying Big Technology Podcast? Please rate us five stars ⭐⭐⭐⭐⭐ in your podcast app of choice. For weekly updates on the show, sign up for the pod newsletter on LinkedIn: https://www.linkedin.com/newsletters/6901970121829801984/ Questions? Feedback? Write to: bigtechnologypodcast@gmail.com
Transcript
The head of Meta's AI Research Division joins us today to discuss the company's pursuit of human-level artificial intelligence, the cutting edge of AI, why it's open-sourcing its large language models, and plenty more in the only podcast interview the company is giving about its recent news.
All that and more coming up right after this.
Welcome to Big Technology Podcast, a show for cool-headed, nuanced conversation of the tech world and beyond.
Boy, do we have a show for you today. We're recording to the minute on Wednesday here, right before we drop this episode, because there's
breaking news coming out of Meta, all about the moves that they're making with their AI division,
their pursuit of human level intelligence. We have none other than Joelle Pineau here to talk to us about it. She's the head of Meta's AI Research Division, formerly called FAIR. Now I guess it's still called FAIR, Fundamental AI Research. We love the name. Okay, well, keep it FAIR, keep it running. We spoke actually in October 2022, before ChatGPT. So this is going to be a really cool moment to talk a little bit about where we've come from there and where we're going. Joelle,
welcome to the show. Great to see you. Thank you, Alex. Great to be here. So if you recall,
in October 2022, when we spoke a couple of times at the World Summit AI, it's kind of funny because the big storyline then was whether AI is sentient. And this was kind of a moment where all the big research houses had big large language model chatbots internally and they hadn't released them yet. And it's kind of interesting how society starts to talk about a breakthrough, right? It
sometimes goes in a weird direction before we're actually refocused on what matters. And now I think
we are refocused on what matters, right? There's been much more talk beyond sentience in terms of
like the near term viability of this technology. I'm curious just to start, what has surprised you
in the research since that discussion, not necessarily about, okay, we all know that it's
now taken off and it's been hyped, but has there been anything that's made you sit back and
be like, wow, we can actually do more than we thought we could, you know, a year or a year
and a half ago?
So many times this year, honestly.
And it's great to think back to that point in time.
I hope you didn't ask me for any very specific predictions, because even for someone who's deeply in the space of AI, just predicting how this is unfolding continues to be full of surprises.
I will say, you know, it's also been interesting.
The faster we progress, the more we have a sense of how much more is left to do.
And so, you know, you mentioned back in October 2022 we were worried about sentience, and we hardly talk about it now. And yet we are so much further along on the map in terms of our ability to have models that deeply understand information and process multimodal data. So we're getting further
along and yet we worry about some of the more concrete problems. We've talked a lot this year about
safety, for example, about how to make sure that we have models that are performing well,
but also are aligning with the values and the needs of people, which I consider sort of a much more grounded problem that we can tackle with research. So
That's, I think, the major change that I see.
Significant progress, but that means we also have a much better view of what are the real problems we need to solve.
Yeah, it's funny because back then we also had a discussion about whether we should be focusing on the short term or the long term problems.
And obviously, those are both worthy of attention and it's kind of wild that the focus on the long term problems, it seems like, blew up open AI over a weekend and maybe it's been put back together now.
But the talk from meta now is actually focused on some of the more big ideas that people might have thought were more long term.
But now it actually seems like, you know, it might be closer than we think, at least according to some of what we hear from Open AI and others.
So this is a quote from Mark Zuckerberg that just came out fairly recently.
He says, as recently as last week, we've come to view that in order to build the products that we want to build, we need to build for general intelligence.
So, I mean, Yann LeCun, in our discussions, I've been speaking with him since 2015, one of your colleagues, he's always talked about how the goal is building for artificial general intelligence.
So when I saw Mark come out with that last week, I was like, yeah, yeah, that's been the focus for meta.
But all of a sudden, it almost feels like there's a more pragmatic or it feels more real now than it did before.
Am I reading that right?
Like, what is leading us to now start to talk about this as something that's not pie in the sky, you know, 20, 30 years down the road, but something that might be achievable in the, you know, nearer term.
Yeah, I mean, Yann and I and the team in FAIR have been talking in those terms for many years.
It's been clear we've been putting in place sort of a portfolio of projects that are trying
to build the building blocks towards general intelligence.
In the last year, Mark, as well as many others, has taken a deeper interest in what's going
on in AI.
I think he was always aware of a lot of the good work we were doing, but didn't dig in quite as deeply, and in the last year he definitely has. And through a lot of conversations,
you know, I think has come to see how in many ways the path even to bringing AI to the products
that people use and love from the company, the path to making those AI systems better goes
through building general intelligence, not narrow intelligence. And we've done a ton of work on AI on the platform over the last few years. That was what I would call more narrow, specialized models. We can continue to do that, but the bigger step changes are going to come through the more general models, building foundation models, building up to world models that
essentially can capture a much richer version of the information. So I think that's what you're
hearing from Mark. It's things that you've been hearing from Yann, myself, and others through the years. We're working together to connect these pieces together, both the research roadmap as well as the product roadmap, and make sure that we have the ability to connect these together. So the ability to have our research quickly diffuse in the best way possible through
the product and the ability to learn. The thing about general intelligence is you have to
solve many different problems to have, you know, the ability to claim general intelligence.
And fortunately, there are a lot of use cases across meta, across our family of products.
And so that's giving us wonderful material with which to work.
So why?
So let's go back to, you know, that October 2022 discussion that we had before ChatGPT came out.
Like the idea of me asking you this question about like, why is human level intelligence now in focus?
I never would have asked it.
It just didn't seem like it would be something that would be relevant to ask.
But now it does seem more relevant and we're hearing it more and more in the discussion.
So you mentioned world models, foundational models.
but what about AI research now is allowing us to ask those questions?
I think it's because, you know, the models are getting increasingly general.
If you look at a model like ChatGPT, the Llama family of models that we've been releasing,
you know, they started just as word prediction models.
All they would do is take in sentences and predict what comes next.
And what we're seeing is we can use them now for many other uses, whether it's to predict things that are not just words, but are actually code.
And some of that code is actually executable.
Or you can predict, you know, the components of an image.
And then you can plug in a diffusion model or other kind of synthesizer to realize the
information.
So what started as just language model has become much more general on its own.
It gives us a path.
It may not be the path, but it gives us at least a path to move towards general intelligence. And it's an exciting one. It's one that we're exploring. It doesn't mean that we've
stopped exploring other paths towards general intelligence, but that is definitely the one that has
proven to make the fastest progress. And what would you say the path is?
The path is to essentially capture a lot of human information through this representation that we call language. And so the hypothesis is that even things that are not necessarily text-based originally, if you describe them through these discrete tokens, and sometimes these discrete tokens are the words that we use to express ourselves, but sometimes the discrete tokens are, for example, codes that essentially represent chunks of images, these discrete tokens are a path to representing all of human information.
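As an illustrative aside (not from the conversation): a minimal Python sketch of the "everything as discrete tokens" idea, where words map to integer IDs through a vocabulary and image patches map to the index of their nearest codebook entry. The vocabulary and codebook below are toy stand-ins, not anything Meta actually uses.

```python
# Toy sketch: text and images both reduced to sequences of discrete token IDs.
# The vocabulary and codebook are made up for illustration; real systems learn
# them (e.g. BPE for text, VQ-style codebooks for image patches).
import numpy as np

vocab = {"the": 0, "cat": 1, "sat": 2, "<unk>": 3}

def tokenize_text(sentence):
    return [vocab.get(w, vocab["<unk>"]) for w in sentence.lower().split()]

codebook = np.random.rand(8, 16)            # 8 learned codes, each a 16-dim vector (toy)

def tokenize_image(image, patch=4):
    h, w = image.shape
    tokens = []
    for i in range(0, h, patch):
        for j in range(0, w, patch):
            vec = image[i:i + patch, j:j + patch].reshape(-1)     # flatten one patch
            tokens.append(int(np.argmin(((codebook - vec) ** 2).sum(axis=1))))
    return tokens

print(tokenize_text("The cat sat"))          # e.g. [0, 1, 2]
print(tokenize_image(np.random.rand(8, 8)))  # e.g. [5, 2, 7, 1]
```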
There was a study that came out last year basically saying that these models can't generalize outside of their training set.
You know, I think a lot of the hype around these models was people saying that they were really able to have these capabilities that you wouldn't expect, emergent capabilities.
And the study basically pushed back on it and was like, listen, they're not going to
generalize beyond their training set.
And your evaluation of that study basically, you know, leads you to believe either, A, if you believe that study, then you're a lot less optimistic about this wave. Or, if you don't believe that study, you can really use your imagination and believe
that what we're hitting on now, these foundational models that you talk about can lead us
in directions that we never could have dreamed of.
So I'm curious what your evaluation of that study is and how we should be thinking about
this.
I tend to really like be quite balanced on a lot of these questions.
I think it's very easy to kind of pull opinions to one side or another.
But the truth is, like, machine learning algorithms can generalize.
That is a property of how we build these algorithms.
Even the simplest, just linear models, they do generalize. They just generalize along a line.
So, you know, the fact of the matter is, though, when you project that into a very, very high dimension, and some of these models have hundreds of billions of parameters, you have to think of it like you're learning a function in that really high-dimensional space.
The directions in which you can generalize are so many
that it's hard to know which are the good directions to generalize
and which are the poor directions to generalize.
The more data you have, the more that constrains that question.
So I do believe they can generalize.
I think they generalize relatively narrowly,
or at least, you know, as long as you stay close,
you get a good manifold of information.
when you start to go really far afield from your data,
because the dimensions are so large,
you get all sorts of noise.
So the advantage, and one of the reasons,
you know, a lot of the progress has been through better and bigger data sets,
bigger but also cleaner data,
is because that really defines which parts of this really high-dimensional space are the most interesting ones. And when there is not a lot of data to populate that space, then the models tend to regurgitate the things that they have been trained on.
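A toy illustration of that "close to the data versus far afield" point (my example, not one from the episode): fit a flexible model on inputs drawn from a narrow range and compare its error inside versus outside that range.

```python
# A flexible model behaves reasonably near its training data and can be
# wildly wrong far away from it. Purely illustrative.
import numpy as np

rng = np.random.default_rng(0)
x_train = rng.uniform(0.0, 1.0, 50)                       # training inputs live in [0, 1]
y_train = np.sin(2 * np.pi * x_train) + 0.1 * rng.normal(size=50)

coeffs = np.polyfit(x_train, y_train, deg=9)              # many parameters, little data

def error_at(x):
    return abs(np.polyval(coeffs, x) - np.sin(2 * np.pi * x))

print("near the data  (x=0.5):", error_at(0.5))           # small: interpolation
print("far from data  (x=3.0):", error_at(3.0))           # typically huge: extrapolation
```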
So let's go back to that Zuckerberg quote that I read earlier.
We've come to view that in order to build the products that we want to build, we need
to build for general intelligence.
Now we talked a little bit about why that is now relevant, the path towards general intelligence.
But now I'm kind of left with another question, which is why does meta need to build
general intelligence in order to build the products that you want to build?
I mean, yeah.
Yeah.
I mean, just looking at like a couple of the AI products we've released this year.
You know, one of them is the Meta AI assistant.
People who are in the U.S. have been able to try this out on some of our platforms where you can essentially ask for questions and ask for assistance.
In that case, you know, there's a sense that it has to understand a very large spectrum of information to be able to do well.
And as we incorporate more data and as we perfect this assistant, the more it's going to have essentially world knowledge, the better it's going to be.
Another example is, for those who've been following our work on AR devices, the smart
glasses that we released earlier this year also come now with an AI model, also accessible
mostly in the US at this time, there too.
You know, you have essentially a more embodied version of this Meta AI assistant that sees
the world as you see it, that is able to take on some action.
In this case, the actions are not just words.
It can take pictures.
It can provide information.
and it can record information.
And so to be able to do well in a wide set of different tasks, with a wide set of different people, different environments, you need to move towards more general intelligence.
That's really where that connects, you know,
the research work we're doing and already what we're seeing
in terms of the applications that Meta's putting out there.
Now, let's say you do achieve this and you open-source it. Is that kind of like the end? Like, is reaching human level intelligence
or general intelligence kind of like the end of AI research? Or is there more to do after that
happens? There is no end to this journey. I mean, I hope there's no end to this journey, right?
Like, do we as adults sort of say, okay, I'm going to keep on growing my knowledge. And at some
point in time, I don't know, for some of us, maybe 25, some of us, maybe, you know, 75, you decide like,
okay, now I'm done. Like, I have reached where I am in terms of human intelligence. I don't think
that's how it works for humans. The world is always evolving. There's always more to be curious about.
And so I think that's the path that we are on with our AI algorithms. Similarly, they need to
stay curious about the world that they evolve in. And over time, they need to figure out, you know,
how to integrate that information and sort of rise to the challenge of the world that they're evolving in. But because the environment is not static, I don't
see us coming to an end. That's so interesting because it's always described as the finish
line. Actually, there's people who would argue that there's no such thing as human level
intelligence that the second you hit that, you're basically left with superintelligence
and game over. Yeah, I mean, I don't really ascribe to that scenario, I have to say.
And the other nuance I will add to this, you know, often this notion of general intelligence
is articulated in the context of like a single agent, a single Uber intelligent agent.
And I don't think that's really where we will move towards either.
There's clear evidence that as a species, humans, animals, we learn so much more through
interactions and so much of our culture and our intelligence is derived from our ability to
interact, to collaborate. So I think that's also going to be a super interesting door to open
as we are on this journey to think about how do we build AI agents that are not just pushing
for single entity intelligence, but are connected to a network of other intelligent agents,
whether synthetic silicon agents or human agents.
Well, it's so interesting because language, of course, like speaking of types of intelligence, language is only one type.
Yes.
And Yann and I spoke about this on a recent show, not so recent anymore, but your interactions in the world teach you so much that you never learn with language.
Your understanding of gravity, for instance, is not something that, like, you can implicitly understand from language.
So are you doing research now, at Meta's research division, to figure out stuff beyond words and images?
And I would say, you know, that may be one of the distinguishing factors compared to
other research groups out there.
There's a strong belief that having AI agents that are deployed in the physical world
where the notion of embodiment is important is something we should be pursuing.
We have a research team that's dedicated to this.
They do some work in robotics in particular because those are the best agents we have to consider
physical embodiment, spatial constraints.
It's not necessarily because meta intends to commercialize robots.
It's because by going through these, essentially, devices,
we have a lot to learn about how to build AI models that live in the physical world.
In the work that we've done recently with the smart glasses,
the models that proved to be useful for that use case came out of the work that this group
was doing.
People looking at robotics and devices, physical devices living in the world, and building AI models specialized for that was incredibly useful to inform
the work going into the glasses. Of course, we also leveraged the work we were doing on language
and our Llama family of models. But Llama on its own doesn't make for the best assistant
on glasses because it doesn't have enough of an understanding of the physical world, of images,
and so on. Now, there are some people saying that
The reason why Meta is now speaking about AGI is because OpenAI is speaking so much about AGI and other research houses are.
And getting the talent to work on these projects is really difficult.
This is something that Mark actually said in that Verge interview.
I think that it's important to convey because a lot of the best researchers want to work on the more ambitious problems.
So I got to ask you straight up.
Like, is the talk about AGI more of a recruiting thing?
No. I mean, like, of course we love to have great talent. And of course, this is a competitive market for talent. But we don't talk about anything just because someone else talks about it. Like, we genuinely are doing the work and we've been doing it for a number of years. There's no major shift in terms of our ambition to solve AI; that's been inscribed in our mission and our goals for FAIR for many years now. Mark is talking about it now. I think he's excited about the work. It's wonderful to have his support to do it, but it doesn't necessarily fundamentally change the problems that we have
to solve, the work that we're doing. I think there's also a sense that we are, you know,
we are being more explicitly ambitious about this work, which goes along with some of our
investments on the compute side, which are necessary to fuel that work. And so that's why it's
coming out maybe more from what you're hearing from Mark, but I think if you go back and listen
to what Yann or I or some of our other senior researchers have been saying for a number of years, there's not a departure there.
Right.
And briefly, on the talent market, what does the talent market look like right now?
Is there a real scarcity in the type of people that can do this type of work?
And what is it like recruiting against fast-growing, and especially in terms of valuation,
competitors like OpenAI, Anthropic, et cetera?
Yeah, it's always been a very competitive market.
I would say going back to about 2016, 2017. Since then, I don't really remember a year where it was like an easy, slow market
in AI. And so it continues to be one of the things that has changed in the last year or so
is mostly on the startup scene, I would say. You know, three years ago, we didn't feel much competition with the startup scene. Now we do a lot more. I tend to view this relatively positively, to be honest with you. And that's one of the reasons we open-source our work. We genuinely believe that more people working on this is good. And so when we open
source our work, we get to leverage the creativity of a greater number of people. And there's many
more than we can hire. So I think the very, very top talent that can train these models continues
to be incredibly valuable to meta as well as to other organizations. Fortunately, there's also a
good pipeline of students. You know, I do have an affiliation with Mila, the Montreal Institute for Learning Algorithms. There are hundreds of amazing grad students coming out of that institute as well as
others. We've set up some joint PhD programs in some cases so that these students have an
opportunity to come work at least part-time or through internships with us. And so we're both,
you know, sharing with them the work that we do as well as having an ability for us to see whether
they're a good fit for our work. So I feel like we have a great talent pipeline, but it continues to be a competitive market.
Got to ask you about open source.
Brad Smith from Microsoft has talked about how OpenAI is the most open.
And I'm kind of curious from your position, are they living up to that open name?
Is there real open sourcing there?
And what is the state of open source?
I mean, why is Meta open-sourcing? I mean, you know, Meta's a business. So from a business perspective, why open source?
Yeah.
Yeah. There's different levels of open sourcing, right? I do think, you know, having an AI model where you provide an API is sort of one layer of that, which is something that OpenAI has done. But there's a lot more that goes on. And so, you know, from just providing an API, you can make available the code that was used to train it. You can make available the trained model weights, which enables someone to run the models. And then there's a number of other artifacts that come across from this. We've been focused on making available model cards that give a better understanding and transparency about our models,
good use guides, tools for safety, and so on.
So there's like a whole ecosystem of artifacts.
I think the purists would say like everything has to be out there in an open way.
So we even have some people who are coming from the software open source community who feel
like we're not living up to the full view.
Again, there's a continuum on this.
It is clear that meta has taken a much more open view than other big players in the space.
And in particular, we've been releasing some of our code and model weights for some of our larger models, including Llama.
It comes from a lot of deep discussions in doing that.
And so I think there may be a misperception that we would do this without any process or reflection, that there's a little bit of a religion to it. That's really not the case. We have quite a thoughtful process that's been put in place. You have to remember,
we've been doing open sourcing work for 10 years since the first day of this organization.
So we've built up a lot of muscle of how to do that in a responsible way. We do it in consultation
with a wide set of people who have deep understanding of safety, ethics, and so on, who get brought
into the process. And what's been wonderful to see in the last year is as the conversation has
been moving and as the models have been getting better and getting bigger, we've invested a lot
more into being thoughtful about a release process. And so I would say now we have a much more
mature process than we did a year and a half ago. That involves a much more diverse group of
stakeholders. We have a really rigorous process in terms of measuring the risks of these models
across different categories of risk.
So it's been exciting to see how much our commitment to open sourcing has driven us to innovate.
And we've open-sourced a lot of those innovations on the safety side. I think the Purple Llama tools are an example of that, which we released in December.
And so it's been great to see that.
I do hear a lot of people who are concerned about open sourcing.
And I have many conversations with them, including other large organizations.
And my worry about closing the doors down now is that the models are only getting better.
And so if we don't release them now, we really miss an opportunity to develop the muscle we need to make these models safer.
And I don't think today's models are the ones that are going to, you know, bring to the front the hardest questions.
These models are yet to be trained and built.
Is there anything that stands out that you've seen being built on top of the open-source Llama models that Meta has put out there?
Anything stand out in terms of like a cool product that you've seen and anything concerning that you can talk about?
There have definitely been dozens of products that are coming out of that.
How about naming, yeah, do you want to name one?
Yeah, let me take as an example our Segment Anything Model, which is a little bit different than our Llama models, but I think has been the one that has been just incredibly impactful in terms of people quickly building on it. Our Segment Anything Model is one where you take an image and it gives you a detailed segmentation of it.
We released it back in April, including a lot of tools and data to go along with it.
And within days, we had people who had built up applications essentially for conservation applications.
So being able to track down some species who may be endangered, using that to follow them.
We had people use it for the processing of medical images, so segmenting cells from some of these images.
And it's been wonderful to see that explosion of work.
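For readers who want to try this, the open-source Segment Anything release (github.com/facebookresearch/segment-anything) exposes roughly the interface sketched below; the checkpoint and image paths are placeholders, and the details are worth checking against the repo's README.

```python
# Rough usage sketch for the open-source Segment Anything Model (SAM).
# "sam_vit_h.pth" and "cells.png" are placeholder paths.
import cv2
from segment_anything import sam_model_registry, SamAutomaticMaskGenerator

sam = sam_model_registry["vit_h"](checkpoint="sam_vit_h.pth")   # load downloaded weights
mask_generator = SamAutomaticMaskGenerator(sam)

image = cv2.cvtColor(cv2.imread("cells.png"), cv2.COLOR_BGR2RGB)
masks = mask_generator.generate(image)        # list of dicts, one per segmented region

print(len(masks), "regions found; first mask covers", masks[0]["area"], "pixels")
```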
On the language side, we also saw many people build up all sorts of different tools.
And in particular, the work that we're most excited about is the work on efficiency, to be honest with you. There is so much that we can do to make these models more compact and efficient and running really, really fast with low energy. And I think that's one of the things
that I've been most excited about seeing. There's lots of other applications too.
Anything that stood out and made you say, oh, that's not good. That's not what we want.
There are definitely some that are getting flagged that we discuss internally. I'm probably
not going to go into the details of them right now, but there are definitely a number
of them that we are tracking. I will say, in a number of the cases where we are most concerned, people are not respecting the terms of use of these models. So we release these models with very
clear terms of use and people may not be respecting those terms of use. Do you have recourse once
they disrespect those terms? Yeah, I think that's, I'm not going to go into the details of that
today, but this is, you know, this is definitely part of the conversation. We, we are thoughtful
about the conditions under which we release. And so we are thoughtful about the follow-through as well.
Before we go to break, I want to ask you about this move toward getting these models to reason.
There was like this momentary freakout around this Q* model thing that OpenAI apparently has developed internally, which gets people to reason, gets the model to reason. What's your perspective on this technology moving towards, like, the ability to reason, and how should we think about it when we see stories like the one about Q*?
I mean, I think the number one thing is just like don't get too worked up about it.
The amount of, you know, speculation probably far outweighed what was going on there.
I don't have firsthand information on Q*.
We have a lot of, you know, a lot of speculation of our own of what it is.
What I will say, though, is to some degree, people shouldn't be too surprised.
You know, a while ago, we shared a model that could play the game of Diplomacy at the level of a human player.
I don't know what people thought that model was.
Cicero.
Cicero, exactly, right?
Cicero was having conversations with other players,
and it was reasoning about the game strategy.
And so this was an example of a model that had language
and that could reason arguably in the hardest game out there.
So I don't think people should be surprised that language models have the ability to be effective in reasoning tasks, especially when paired with other mechanisms.
In the case of Cicero, we were using some search mechanisms inside to be able to achieve
reasoning.
It's a different architecture than what we have in Llama.
But a lot of the ingredients of how to do reasoning have been explored in AI for 40 years
and are published and well known to anyone who's taken even an undergrad level course
in AI.
So I'm not saying there's not any innovation in the work that OpenAI is doing or in the work that's happening across the community. I'm just saying it's not like a magic ingredient. I'd be extremely surprised by that. So what could the next level
jump there be? I mean, there's a lot of theories of how to achieve reasoning in these models.
One of them is to incorporate search as part of the model. And another one is incorporating, for
example, a lot more coding abilities. Coding is executable. Coding allows us to essentially
dig in through a sequence of operations.
Another direction that many groups are exploring is the use of retrieval-based techniques.
So you're retrieving information.
Some of that retrieval can make use of information where reasoning is present in the information.
So lots of different ways to go about it.
We're exploring many of them.
Any respectable AI research group probably is too.
And what's really going to make the difference is how do we bring this together, right?
How do we make sure to have the right way for these components to integrate in some ways?
That's still the hardest question in AI.
How do we have different components working together in a very coordinated way?
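To make the retrieval idea concrete (a toy sketch of the general technique, not a description of Meta's systems): score stored documents against the question, then put the top matches in front of the prompt so the model can ground its reasoning in retrieved text.

```python
# Toy retrieval-augmented prompting: crude word-overlap retrieval plus a prompt
# template. Real systems use learned embeddings and an actual language model.
def overlap_score(query, doc):
    q, d = set(query.lower().split()), set(doc.lower().split())
    return len(q & d)

documents = [
    "Cicero plays Diplomacy by combining a language model with strategic planning.",
    "Diffusion models generate images by iteratively denoising random noise.",
    "Llama models are trained to predict the next token in a sequence.",
]

def build_prompt(question, k=2):
    top = sorted(documents, key=lambda d: overlap_score(question, d), reverse=True)[:k]
    context = "\n".join("- " + d for d in top)
    return f"Context:\n{context}\n\nQuestion: {question}\nAnswer step by step:"

print(build_prompt("How does Cicero reason about Diplomacy strategy?"))
```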
Is there anything that you could see in the sort of research or production that would freak you out?
Or are you sort of calm, cool, and collected about where we're heading?
There's stuff. I mean, I don't tend to freak out a lot. There's stuff that concerns me every day. You know, we review rigorously the performance of our models for different aspects. There's many cases where I see a model and the performance, for example, on safety benchmarks isn't what I would expect it to be. And then we go back and we keep on working on it. So it's not that there isn't a ton of work to do. I just don't feel that, like, you know, freaking out or being fearful about it is the best way to go about it. I think you just have to look at the data in a collected way. In many cases, we don't even have the right way to analyze the properties of our models. You know, is this model safe or unsafe? Does it have, you know, toxic behavior? Does it have bias? There's a lot of work to do to even develop the tools to assess this so we can look at it in a rational way. So we invest a lot in that also.
We're here with Joelle Pineau, the head of Meta's AI Research Division, still called FAIR, Fundamental AI Research.
We've talked a lot about the research side on this side of the break.
On the other side of the break, we're going to talk about product, because Joelle's division has recently moved toward the product side of Meta, and we're going to talk about what that
means right after this.
Hey, everyone, let me tell you about the Hustle Daily Show, a podcast filled with business,
tech news and original stories to keep you in the loop on what's trending.
More than 2 million professionals read The Hustle's daily email
for its irreverent and informative takes on business and tech news.
Now they have a daily podcast called The Hustle Daily Show
where their team of writers break down the biggest business headlines
in 15 minutes or less and explain why you should care about them.
So search for The Hustle Daily Show and your favorite podcast app
like the one you're using right now.
And we're back here on Big Technology Podcast with Joelle Pineau, head of Meta's AI research division. Your division just moved toward the products, or under the
product division within meta. Let me start this segment with this question. It's a broad question.
I don't think I've ever seen a disconnect as much as I'm seeing now where the discussion of where
this technology can lead and what it does today is so, I would say, even divorced from the products that we've seen. I mean, yes, ChatGPT was groundbreaking and still incredible to use, and so are some of the competitors. But beyond that, have we really seen the product momentum when it comes to building on large language models? And, you know, we've heard so much about the enterprise. Yeah, we've seen some copilots from Microsoft, stuff like that, the bots in the messaging apps that Meta is creating.
But, you know, for all the talk of revolution, it seems somewhat like an evolution.
So what do you think about that?
What am I missing here?
I do see it as a bigger step change, I think, than you're articulating it.
I think we have seen the birth of what I would call an AI research product.
And so if I take, you know, for example, the GPT family of models, I do think there is a real product there.
People are using it.
Some people are using it every day.
And so I don't think we've seen anywhere near everything that is possible, but I think we have to have a very open mind that the products that are AI-first are going to look very different than products we've seen before.
That being said, I will say, you know, as much as we spend a lot of time worrying about
what is the path on the research side, I do think we need almost as much exploration on the
product side. You know, on the research side, the space of hypotheses to build these models is huge, but on the product side, the space of new things you could build with this is huge.
And we don't yet have nearly enough information about what are going to be those products
and those experiences that people are going to actually use every day and love using.
So I'm, you know, as I talk to partners across the company, one of the things I encourage
them to do is to really embrace the exploration that comes out of having a completely new
tech stack compared to what they had before and not just take, you know, the products that
they know and like shove AI into them, but completely reimagine what is possible.
So that's been a really, really fun conversation to have.
And one of the things that is going on is Meta's brought a bunch of AI bots into the messaging
apps. Can you tell us a little bit about how that's going? I mean, I saw there's like 12 or 20 different bots that are in these apps, and I played with them for a little bit and then
I kind of lost interest and I haven't like seen any reminders that, hey, they exist. So how's
adoption been there? What can you tell us about those? Are you asking for more reminders that
they're there? Because we can do that. Honestly, yes. Honestly, yes. I think that would be good.
Okay. Yes, the bots are there. They're available. The bots are an example of exactly
what I mean, right? This is product exploration to some degree at its best in terms of like
trying out different things. There's an intuition that there's enough there that it's worth
putting it out in the hands of people. There's enough conviction as well as data to support
releasing that for people to use. But I think it was very much the kind of product we hadn't
done before. And we're going to learn so much out of getting that into people's hands.
You can think of it as really accelerating that cycle of development. And there's some bots
that are doing quite well that are seeing quite a bit of use, some bots that are seeing a lot
less use. I don't have the numbers with me. And for your listeners to understand, you know, FAIR does the fundamental AI research and we have a sister organization that is more
connected to the product and is releasing those bots. We're tracking that really closely.
That's feeding back into the product exploration conversation going on. I would say the bots,
as well as the Meta AI assistant, are within a category of things that we call AI agents.
And so we have a pretty wide exploration within that space of AI agents, so you should expect to see new things coming in years to come.
And the other example that we explored a bit this year is on the smart glasses where we also have an AI system running on that, which is a very different, very different experience compared to the desktop or mobile cases.
Yeah, so adoption, how is adoption looking with those messaging bots?
We'd have to, you know, we'd have to get someone else on your podcast to give you more detail on that.
Yeah, absolutely.
I'm sure we can find you someone who can give you some of that information.
So you mentioned that you have the product teams and you have FAIR, but FAIR used to be in Reality Labs and now it's like directly under the product team within Meta.
Why did that move happen?
So, I mean, we had a wonderful set of colleagues and great work happening in Reality Labs Research.
The truth is right now AI is moving so fast that it's really useful to be close to products that are in the hands of billions of people to be able to have that quick product innovation, that quick signal back to the research.
We were already working in close collaboration with the family of product teams, but this just makes things go a little bit faster.
And the GenAI team that is really putting out some of these AI characters and the Meta AI assistant was already in that product team. So bringing us together gives us the ability to be much more coordinated
in particular from the research to, you know, building up the products and then releasing them.
We're still going to continue to do a lot of work on the Reality Labs side. You know, we've been
in that work for a few years. We've built up a lot of exciting projects. It's going to be maybe a few
more years between now and when some of these get on the market. But these projects are not slowing
in any way. I think there's a really good understanding at the company level that right now,
the more we accelerate the AI roadmap, it is going to benefit both the existing products
as well as the AR/VR and the Reality Labs side of the company. So I think that's really where
we are with this one. One thing that seems like it's really going to be a thing that people
talk about this year is video generation. We've just seen a little bit come out this week from
Google. I know that you guys are working on it. Tell us a little bit about what that could
look like. I mean, it's one thing to sort of type in, draw me a picture, and you get one out from DALL-E, or Meta has one, an image generator as well, but the video generation seems
pretty wild. Yeah, it's been great to see that. It's not surprising. As soon as you, you know,
you have good image generation. Every time we've had progress in terms of image generation,
the next step is how do we do 3D images and how do we do videos?
Like, these are the two dimensions in which people quickly extend any progress in image generation.
On the video side, I would say we've seen much, much better models coming out,
but we haven't totally cracked the problem of generating long-form videos.
The temporal coherence is quite tough.
And I think, you know, for those of you who know a little bit more about video,
there's a piece of spatial coherence that you need to be thoughtful of, and that's the
piece that image generation has, to some degree, solved in the last year. But the temporal coherence is something that right now is harder to do. We get really good quality video generation
if you can intervene and kind of set a lot of the frames, and then you kind of use a diffusion
model to interpolate in between. But to go from a really high level, for example, a script,
written in words, and to have like a full, you know, full-length feature film is still going
to take us a little while. One of the biggest problems there is to think about how to do
generation in sort of a hierarchical way, not just do frame after frame after frame, but actually
think of how do you generate globally some properties of your video and then go through more
and more granular resolution over space and over time. This is something that Yann has been thinking about a lot. He's working closely with some of our research teams in New York, in Montreal,
in Paris, to make progress on that. And so I'm, you know, I'm leaving a lot of that on
him to drive, but I know he has a lot of ideas on this topic. And how also to achieve that
in a way that isn't too intensive in terms of data and compute.
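A purely schematic sketch of that hierarchical, keyframe-then-interpolate idea (mine, not Meta's pipeline): plan the video coarsely in time, then fill in the frames between keyframes. The "generator" and "interpolator" here are trivial stand-ins; in practice each stage would be a learned model such as a diffusion model.

```python
# Coarse-to-fine video generation, schematically: keyframes first, then
# interpolation between them. Random frames and linear blending are stand-ins
# for learned models.
import numpy as np

def generate_keyframes(n_key, h=64, w=64):
    # Stand-in for a model that plans the video at a coarse time scale.
    return [np.random.rand(h, w, 3) for _ in range(n_key)]

def interpolate(frame_a, frame_b, steps):
    # Stand-in for a learned interpolator (e.g. a diffusion model).
    return [(1 - t) * frame_a + t * frame_b
            for t in np.linspace(0, 1, steps, endpoint=False)]

keyframes = generate_keyframes(n_key=4)
video = []
for a, b in zip(keyframes, keyframes[1:]):
    video.extend(interpolate(a, b, steps=8))   # fill in 8 frames per keyframe pair
video.append(keyframes[-1])

print(len(video), "frames generated from", len(keyframes), "keyframes")
```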
Right. I think that's when you sort of get into, like, can it predict and plan and sort of really understand what reality is? That's some fascinating stuff. Okay, we're coming close to a landing here. Very quickly, we also spoke, when we really had some fun conversations when we met the first time, with the chief technology officer of NVIDIA. And Mark just announced that you have 350,000 NVIDIA H100 chips and will end up with 650,000 by the end of the year, NVIDIA H100s or equivalent. I'm just curious, from your perspective as a customer of NVIDIA, what makes those chips so effective for you now? It's obviously a technology component, but there's a software side of it as well, right? So can you talk us through exactly what makes them so appealing? And do you think they are going to just be the unparalleled developer of these chips forever? Or are you starting to look at others, like Arm, et cetera, Intel? You tell us. Yeah, I mean, honestly, it's clear to everyone that a lot of the progress in AI has
been fueled by the availability of GPUs built by NVIDIA. It's not the only solution. Google uses their own TPUs a lot, as an example. So there's a few others. But overall, I think NVIDIA's
GPUs have been essential to the progress. And we've been fortunate to have many of them to power
our own research. There's a couple things that make them great. One, you know, the GPUs on their
own have the ability to parallelize a lot of the computation, which is essential for training these
models. And we also have the ability to build them into systems, you know, networked with
very fast interconnection between them to allow information to be passed around very, very
quickly. And when you do that at scale with a few thousand GPUs, you can train some of
these larger models. So that's really the essential ingredients. In terms of the trajectory there,
of course, you know, as all responsible organizations, we're looking at all options that could
accelerate our work, we keep a close eye on the development of hardware. Right now, as Mark
has shared, you know, I think betting on the GPUs from NVIDIA is a sound bet for our
research, but we're always interested to see innovation in all aspects of the stack.
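For a sense of why parallelism and fast interconnect matter, here is a minimal data-parallel training sketch in PyTorch (my illustration, not Meta's training stack): each GPU computes gradients on its own shard of the batch, and the gradients are averaged across GPUs before every update. It assumes a single node launched with torchrun.

```python
# Minimal data-parallel sketch: one process per GPU, gradients all-reduced
# across GPUs during backward(). Launch with: torchrun --nproc_per_node=8 train.py
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

def main():
    dist.init_process_group("nccl")               # torchrun supplies rank/world size
    rank = dist.get_rank()                        # assumes one node, so rank == local GPU index
    torch.cuda.set_device(rank)

    model = DDP(torch.nn.Linear(1024, 1024).cuda(rank), device_ids=[rank])
    opt = torch.optim.AdamW(model.parameters(), lr=1e-4)

    for _ in range(10):
        x = torch.randn(32, 1024, device=rank)    # this GPU's shard of the global batch
        loss = model(x).pow(2).mean()
        loss.backward()                           # gradient all-reduce happens here
        opt.step()
        opt.zero_grad()

if __name__ == "__main__":
    main()
```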
Are you going to build your own chips? We will definitely be exploring some of that. Yes, yes.
I mean, we built a lot of hardware for Reality Labs. We have some specific needs, and, you know, as much as we look at that for the AR/VR devices, there's also a great group doing some of that innovation inside our infra team.
All right.
Last question for you.
We started the conversation talking about AI reaching human-level intelligence.
I think that's going to happen, let's say five years over or under.
You have a perspective on that one?
In five years, we're going to see really strong systems across a broad set of tasks. I have some strong conviction that we're on a path there. After that, you know, I don't want to bin any intelligence into a narrow box, whether human or AI, but we will be amazed by what gets done in the next five years.
All right. Can't wait to watch it. Joelle, thank you so much for joining.
Thank you, Alex.
All right, everybody. Thank you for listening. We will be back on Friday with a new show
breaking down the news, and we will see you for our Friday show on Big Technology Podcast.
