Big Technology Podcast - Why Meta Wants To Build Artificial General Intelligence — With Joelle Pineau
Episode Date: January 24, 2024. Joelle Pineau is the head of Meta's AI Research division. She joins Big Technology Podcast to discuss the company's recent proclamation that it intends to build Artificial General Intelligence, digging into how we get there and why it feels it must. In a wide-ranging discussion, we cover the latest research trends, the company's open-source practices, what products developers have actually built with AI, video generation, and the reasons why NVIDIA chips are so in demand. Tune in for a deep dive into the world's most crucial technology from someone directing one of its most important research labs. --- Enjoying Big Technology Podcast? Please rate us five stars ⭐⭐⭐⭐⭐ in your podcast app of choice. For weekly updates on the show, sign up for the pod newsletter on LinkedIn: https://www.linkedin.com/newsletters/6901970121829801984/ Questions? Feedback? Write to: bigtechnologypodcast@gmail.com
Transcript
The head of Meta's AI Research Division joins us today to discuss the company's pursuit of human-level artificial intelligence, the cutting edge of AI, why it's open-sourcing its large language models, and plenty more in the only podcast interview the company is giving about its recent news.
All that and more coming up right after this.
Welcome to Big Technology Podcast, a show for cool-headed, nuanced conversation of the tech world and beyond.
Boy, do we have a show for you today. We're recording to the minute on Wednesday here, right before we drop this episode, because there's
breaking news coming out of Meta, all about the moves that they're making with their AI division,
their pursuit of human level intelligence. We have none other than Joelle Pineau here to talk to us about it. She's the head of Meta's AI Research Division, formerly called FAIR. Now I guess it's still called FAIR, Fundamental AI Research. We love the name. Okay, well, keep it FAIR, keep it running. We spoke actually in October 2022, before ChatGPT. So this is going to be a really cool moment to talk a little bit about where we've come from there and where we're going. Joelle,
welcome to the show. Great to see you. Thank you, Alex. Great to be here. So if you recall,
in October 2022, when we spoke a couple of times at the World Summit AI, it's kind of funny because the big storyline then was whether AI is sentient. And this was kind of a moment where all the big research houses had big large language model chatbots internally and they hadn't released them yet. And it's kind of interesting how society starts to talk about a breakthrough, right? It
sometimes goes in a weird direction before we're actually refocused on what matters. And now I think
we are refocused on what matters, right? There's been much more talk beyond sentience in terms of
like the near term viability of this technology. I'm curious just to start, what has surprised you
in the research since that discussion, not necessarily about, okay, we all know that it's
now taken off and it's been hyped, but has there been anything that's made you sit back and
be like, wow, we can actually do more than we thought we could, you know, a year or a year
and a half ago?
So many times this year, honestly.
And it's great to think back to that point in time.
I hope you didn't ask me for any very specific predictions, because even for someone who's deeply in the space of AI, just predicting how this is unfolding continues to be full of surprises.
I will say, you know, it's also been interesting.
The faster we progress, the more we have a sense of how much more is left to do.
And so, you know, you mentioned back in October 2022 we were worried about sentience, and we hardly talk about it now. And yet we are so much further along on the map in terms of our ability to have models that deeply understand information and process multimodal data. So we're getting further
along and yet we worry about some of the more concrete problems. We've talked a lot this year about
safety, for example, about how to make sure that we have models that are performing well,
but also are aligning with the values and the needs of people, which I consider sort of a much more grounded problem that we can tackle with research. So
That's, I think, the major change that I see.
Significant progress, but that means we also have a much better view of what are the real problems we need to solve.
Yeah, it's funny because back then we also had a discussion about whether we should be focusing on the short term or the long term problems.
And obviously, those are both worthy of attention and it's kind of wild that the focus on the long term problems, it seems like, blew up open AI over a weekend and maybe it's been put back together now.
But the talk from meta now is actually focused on some of the more big ideas that people might have thought were more long term.
But now it actually seems like, you know, it might be closer than we think, at least according to some of what we hear from Open AI and others.
So this is a quote from Mark Zuckerberg that just came out fairly recently.
He says, as recently as last week, we've come to view that in order to build the products that we want to build, we need to build for general intelligence.
So, I mean, Yann LeCun, in our discussions, I've been speaking with him since 2015, one of your colleagues, he's always talked about how the goal is building for artificial general intelligence.
So when I saw Mark come out with that last week, I was like, yeah, yeah, that's been the focus for meta.
But all of a sudden, it almost feels like there's a more pragmatic or it feels more real now than it did before.
Am I reading that right?
Like, what is leading us to now start to talk about this as something that's not pie in the sky, you know, 20, 30 years down the road, but something that might be achievable in the, you know, nearer term.
Yeah, I mean, Yann and I and the team in FAIR have been talking in those terms for many years.
It's been clear we've been putting in place sort of a portfolio of projects that are trying
to build the building blocks towards general intelligence.
In the last year, Mark, as well as many others, has taken a deeper interest in what's going
on in AI.
I think he was always aware of a lot of the good work we were doing, but didn't dig in quite as deeply, and in the last year he definitely has. And through a lot of conversations,
you know, I think has come to see how in many ways the path even to bringing AI to the products
that people use and love from the company, the path to making those AI systems better goes
through building general intelligence, not narrow intelligence. And we've done a ton of work on AI on the platform over the last few years. That was what I would call more narrow, specialized models. We can continue to do that, but the bigger step changes are going to come through the more general models, building foundation models, building up to world models that
essentially can capture a much richer version of the information. So I think that's what you're
hearing from Mark. It's things that you've been hearing from Yann, myself, and others through the years. We're working together to connect these pieces together, both the research roadmap as well as the product roadmap, and make sure that we have the ability to connect these together. So the ability to have our research quickly diffuse in the best way possible through
the product and the ability to learn. The thing about general intelligence is you have to
solve many different problems to have, you know, the ability to claim general intelligence.
And fortunately, there are a lot of use cases across meta, across our family of products.
And so that's giving us wonderful material with which to work.
So why?
So let's go back to, you know, that October 2022 discussion that we had before ChatGPT came out.
Like the idea of me asking you this question about like, why is human level intelligence now in focus?
I never would have asked it.
It just didn't seem like it would be something that would be relevant to ask.
But now it does seem more relevant and we're hearing it more and more in the discussion.
So you mentioned world models, foundational models.
but what about AI research now is allowing us to ask those questions?
I think it's because, you know, the models are getting increasingly general.
If you look at a model like ChatGPT, the Llama family of models that we've been releasing,
you know, they started just as word prediction models.
All they would do is take in sentences and predict what comes next.
And what we're seeing is we can use them now for many other uses, whether it's to predict things that are not just words, but are actually code.
And some of that code is actually executable.
Or you can predict, you know, the components of an image.
And then you can plug in a diffusion model or other kind of synthesizer to realize the
information.
So what started as just language model has become much more general on its own.
It gives us a path.
It may not be the path, but it gives us at least a path to move towards general intelligence. And it's an exciting one. It's one that we're exploring. It doesn't mean that we've
stopped exploring other paths towards general intelligence, but that is definitely the one that has
proven to make the fastest progress. And what would you say the path is?
The path is to essentially capture a lot of human information through this representation that we call language. And so the hypothesis is that even things that are not necessarily text-based originally, if you describe them through these discrete tokens, and sometimes these discrete tokens are the words that we use to express ourselves, but sometimes the discrete tokens are, for example, codes that essentially represent chunks of images, these discrete tokens are a path to representing all of human information.
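As an illustrative aside (not from the conversation): a minimal Python sketch of the "everything as discrete tokens" idea, where words map to integer IDs through a vocabulary and image patches map to the index of their nearest codebook entry. The vocabulary and codebook below are toy stand-ins, not anything Meta actually uses.

```python
# Toy sketch: text and images both reduced to sequences of discrete token IDs.
# The vocabulary and codebook are made up for illustration; real systems learn
# them (e.g. BPE for text, VQ-style codebooks for image patches).
import numpy as np

vocab = {"the": 0, "cat": 1, "sat": 2, "<unk>": 3}

def tokenize_text(sentence):
    return [vocab.get(w, vocab["<unk>"]) for w in sentence.lower().split()]

codebook = np.random.rand(8, 16)            # 8 learned codes, each a 16-dim vector (toy)

def tokenize_image(image, patch=4):
    h, w = image.shape
    tokens = []
    for i in range(0, h, patch):
        for j in range(0, w, patch):
            vec = image[i:i + patch, j:j + patch].reshape(-1)     # flatten one patch
            tokens.append(int(np.argmin(((codebook - vec) ** 2).sum(axis=1))))
    return tokens

print(tokenize_text("The cat sat"))          # e.g. [0, 1, 2]
print(tokenize_image(np.random.rand(8, 8)))  # e.g. [5, 2, 7, 1]
```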
There was a study that came out last year basically saying that these models can't generalize outside of their training set.
You know, I think a lot of the hype around these models was people saying that they were really able to have these capabilities that you wouldn't expect, emergent capabilities.
And the study basically pushed back on it and was like, listen, they're not going to
generalize beyond their training set.
And your evaluation of that study basically, you know, leads you to believe either, A, if you believe that study, then you're a lot less optimistic about this wave. Or, if you don't believe that study, you can really use your imagination and believe
that what we're hitting on now, these foundational models that you talk about can lead us
in directions that we never could have dreamed of.
So I'm curious what your evaluation of that study is and how we should be thinking about
this.
I tend to really like be quite balanced on a lot of these questions.
I think it's very easy to kind of pull opinions to one side or another.
But the truth is, like, machine learning algorithms can generalize.
That is a property of how we build these algorithms.
Even the simplest, just linear models, they do generalize. They just generalize along a line.
So, you know, the fact of the matter is, though, when you project that into a very, very high dimension, and some of these models have hundreds of billions of parameters, you have to think of it like you're learning a function in that really high-dimensional space.
The directions in which you can generalize are so many
that it's hard to know which are the good directions to generalize
and which are the poor directions to generalize.
The more data you have, the more that constrains that question.
So I do believe they can generalize.
I think they generalize relatively narrowly,
or at least, you know, as long as you stay close,
you get a good manifold of information.
when you start to go really far afield from your data,
because the dimensions are so large,
you get all sorts of noise.
So the advantage, and one of the reasons,
you know, a lot of the progress has been through better and bigger data sets,
bigger but also cleaner data,
is because that really defines which parts of this really high-dimensional space are the most interesting ones. And when there is not a lot of data to populate that space, then the models tend to regurgitate the things that they have been trained on.
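A toy illustration of that "close to the data versus far afield" point (my example, not one from the episode): fit a flexible model on inputs drawn from a narrow range and compare its error inside versus outside that range.

```python
# A flexible model behaves reasonably near its training data and can be
# wildly wrong far away from it. Purely illustrative.
import numpy as np

rng = np.random.default_rng(0)
x_train = rng.uniform(0.0, 1.0, 50)                       # training inputs live in [0, 1]
y_train = np.sin(2 * np.pi * x_train) + 0.1 * rng.normal(size=50)

coeffs = np.polyfit(x_train, y_train, deg=9)              # many parameters, little data

def error_at(x):
    return abs(np.polyval(coeffs, x) - np.sin(2 * np.pi * x))

print("near the data  (x=0.5):", error_at(0.5))           # small: interpolation
print("far from data  (x=3.0):", error_at(3.0))           # typically huge: extrapolation
```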
So let's go back to that Zuckerberg quote that I read earlier.
We've come to view that in order to build the products that we want to build, we need
to build for general intelligence.
Now we talked a little bit about why that is now relevant, the path towards general intelligence.
But now I'm kind of left with another question, which is why does meta need to build
general intelligence in order to build the products that you want to build?
I mean, yeah.
Yeah.
I mean, just looking at like a couple of the AI products we've released this year.
You know, one of them is the Meta AI assistant.
People who are in the U.S. have been able to try this out on some of our platforms where you can essentially ask for questions and ask for assistance.
In that case, you know, there's a sense that it has to understand a very large spectrum of information to be able to do well.
And as we incorporate more data and as we perfect this assistant, the more it's going to have essentially world knowledge, the better it's going to be.
Another example is, for those who've been following our work on AR devices, the smart
glasses that we released earlier this year also come now with an AI model, also accessible
mostly in the US at this time, there too.
You know, you have essentially a more embodied version of this Meta AI assistant that sees
the world as you see it, that is able to take on some action.
In this case, the actions are not just words.
It can take pictures.
It can provide information.
and it can record information.
And so to be able to do well in a wide set of different tasks, with a wide set of different people, different environments, you need to move towards more general intelligence.
That's really where that connects, you know,
the research work we're doing and already what we're seeing
in terms of the applications that Meta's putting out there.
Now, let's say you do achieve this and you open-source it. Is that kind of like the end? Like, is reaching human level intelligence
or general intelligence kind of like the end of AI research? Or is there more to do after that
happens? There is no end to this journey. I mean, I hope there's no end to this journey, right?
Like, do we as adults sort of say, okay, I'm going to keep on growing my knowledge. And at some
point in time, I don't know, for some of us, maybe 25, some of us, maybe, you know, 75, you decide like,
okay, now I'm done. Like, I have reached where I am in terms of human intelligence. I don't think
that's how it works for humans. The world is always evolving. There's always more to be curious about.
And so I think that's the path that we are on with our AI algorithms. Similarly, they need to
stay curious about the world that they evolve in. And over time, they need to figure out, you know,
how to integrate that information and sort of rise to the challenge of the world that they're evolving in. But because the environment is not static, I don't
see us coming to an end. That's so interesting because it's always described as the finish
line. Actually, there's people who would argue that there's no such thing as human level
intelligence that the second you hit that, you're basically left with superintelligence
and game over. Yeah, I mean, I don't really ascribe to that scenario, I have to say.
And the other nuance I will add to this, you know, often this notion of general intelligence
is articulated in the context of like a single agent, a single Uber intelligent agent.
And I don't think that's really where we will move towards either.
There's clear evidence that as a species, humans, animals, we learn so much more through
interactions and so much of our culture and our intelligence is derived from our ability to
interact, to collaborate. So I think that's also going to be a super interesting door to open
as we are on this journey to think about how do we build AI agents that are not just pushing
for single entity intelligence, but are connected to a network of other intelligent agents,
whether synthetic silicon agents or human agents.
Well, it's so interesting because language, of course, like speaking of types of intelligence, language is only one type.
Yes.
And Yann and I spoke about this on a recent show, not so recent anymore, but your interactions in the world teach you so much that you never learn with language.
Your understanding of gravity, for instance, is not something that, like, you can implicitly understand from language.
So are you doing research now, at Meta's research division, to figure out stuff beyond words and images?
And I would say, you know, that may be one of the distinguishing factors compared to
other research groups out there.
There's a strong belief that having AI agents that are deployed in the physical world
where the notion of embodiment is important is something we should be pursuing.
We have a research team that's dedicated to this.
They do some work in robotics in particular because those are the best agents we have to consider
physical embodiment, spatial constraints.
It's not necessarily because meta intends to commercialize robots.
It's because by going through these, essentially, devices,
we have a lot to learn about how to build AI models that live in the physical world.
In the work that we've done recently with the smart glasses,
the models that proved to be useful for that use case came out of the work that this group
was doing.
People looking at robotics and devices, physical devices living in the world, and building AI models specialized for that was incredibly useful to inform
the work going into the glasses. Of course, we also leveraged the work we were doing on language
and our Llama family of models. But Llama on its own doesn't make for the best assistant
on glasses because it doesn't have enough of an understanding of the physical world, of images,
and so on. Now, there are some people saying that
The reason why Meta is now speaking about AGI is because OpenAI is speaking so much about AGI and other research houses are.
And getting the talent to work on these projects is really difficult.
This is something that Mark actually said in that Verge interview.
I think that it's important to convey because a lot of the best researchers want to work on the more ambitious problems.
So I got to ask you straight up.
Like, is the talk about AGI more of a recruiting thing?
No. I mean, like, of course we love to have great talent. And of course, this is a competitive market for talent. But we don't talk about anything just because someone else talks about it. Like, we genuinely are doing the work and we've been doing it for a number of years. There's no major shift in terms of our ambition to solve AI; that's been inscribed in our mission and our goals for FAIR for many years now. Mark is talking about it now. I think he's excited about the work. It's wonderful to have his support to do it, but it doesn't necessarily fundamentally change the problems that we have
to solve, the work that we're doing. I think there's also a sense that we are, you know,
we are being more explicitly ambitious about this work, which goes along with some of our
investments on the compute side, which are necessary to fuel that work. And so that's why it's
coming out maybe more from what you're hearing from Mark, but I think if you go back and listen
to what Yann or I or some of our other senior researchers have been saying for a number of years, there's not a departure there.
Right.
And briefly, on the talent market, what does the talent market look like right now?
Is there a real scarcity in the type of people that can do this type of work?
And what is it like recruiting against fast-growing, and especially in terms of valuation,
competitors like OpenAI, Anthropic, et cetera?
Yeah, it's always been a very competitive market.
I would say going back to about 2016, 2017. Since then, I don't really remember a year where it was like an easy, slow market
in AI. And so it continues to be one of the things that has changed in the last year or so
is mostly on the startup scene, I would say. You know, three years ago, we didn't feel much competition with the startup scene. Now we do a lot more. I tend to view this relatively positively, to be honest with you. And that's one of the reasons we open-source our work. We genuinely believe that more people working on this is good. And so when we open
source our work, we get to leverage the creativity of a greater number of people. And there's many
more than we can hire. So I think the very, very top talent that can train these models continues
to be incredibly valuable to meta as well as to other organizations. Fortunately, there's also a
good pipeline of students. You know, I do have an affiliation with Mila, the Montreal Institute for Learning Algorithms. There are hundreds of amazing grad students coming out of that institute as well as
others. We've set up some joint PhD programs in some cases so that these students have an
opportunity to come work at least part-time or through internships with us. And so we're both,
you know, sharing with them the work that we do as well as having an ability for us to see whether
they're a good fit for our work. So I feel like we have a great talent pipeline, but it continues to be a competitive market.
Got to ask you about open source.
Brad Smith from Microsoft has talked about how OpenAI is the most open.
And I'm kind of curious from your position, are they living up to that open name?
Is there real open sourcing there?
And what is the state of open source?
I mean, why is Meta open-sourcing? I mean, you know, Meta's a business. So from a business perspective, why open source?
Yeah.
Yeah. There's different levels of open sourcing, right? I do think, you know, having an AI model where you provide an API is sort of one layer of that, which is something that OpenAI has done. But there's a lot more that goes on. And so, you know, from just providing an API, you can make available the code that was used to train it. You can make available the trained model weights, which enables someone to run the models. And then there's a number of other artifacts that come across from this. We've been focused on making available model cards that give a better understanding and transparency about our models,
good use guides, tools for safety, and so on.
So there's like a whole ecosystem of artifacts.
I think the purists would say like everything has to be out there in an open way.
So we even have some people who are coming from the software open source community who feel
like we're not living up to the full view.
Again, there's a continuum on this.
It is clear that meta has taken a much more open view than other big players in the space.
And in particular, we've been releasing some of our code and model weights for some of our larger models, including Llama.
It comes from a lot of deep discussions in doing that.
And so I think there may be a misperception that we would do this without any process or reflection, that there's a little bit of a religion to it. That's really not the case. We have quite a thoughtful process that's been put in place. You have to remember,
we've been doing open sourcing work for 10 years since the first day of this organization.
So we've built up a lot of muscle of how to do that in a responsible way. We do it in consultation
with a wide set of people who have deep understanding of safety, ethics, and so on, who get brought
into the process. And what's been wonderful to see in the last year is as the conversation has
been moving and as the models have been getting better and getting bigger, we've invested a lot
more into being thoughtful about a release process. And so I would say now we have a much more
mature process than we did a year and a half ago. That involves a much more diverse group of
stakeholders. We have a really rigorous process in terms of measuring the risks of these models
across different categories of risk.
So it's been exciting to see how much our commitment to open sourcing has driven us to innovate.
And we've open-sourced a lot of those innovations on the safety side. I think the Purple Llama tools are an example of that, which we released in December.
And so it's been great to see that.
I do hear a lot of people who are concerned about open sourcing.
And I have many conversations with them, including other large organizations.
And my worry about closing the doors down now is that the models are only getting better.
And so if we don't release them now, we really miss an opportunity to develop the muscle we need to make these models safer.
And I don't think today's models are the ones that are going to, you know, bring to the front the hardest questions.
These models are yet to be trained and built.
Is there anything that stands out that you've seen being built on top of the open-source Llama models that Meta has put out there?
Anything stand out in terms of like a cool product that you've seen and anything concerning that you can talk about?
There have definitely been dozens of products that are coming out of that.
How about naming, yeah, do you want to name one?
Yeah, let me take as an example our Segment Anything Model, which is a little bit different than our Llama models, but I think has been the one that has been just incredibly impactful in terms of people quickly building on it. Our Segment Anything Model is one where you take an image and it gives you a detailed segmentation of it.
We released it back in April, including a lot of tools and data to go along with it.
And within days, we had people who had built up applications essentially for conservation applications.
So being able to track down some species who may be endangered, using that to follow them.
We had people use it for the processing of medical images, so segmenting cells from some of these images.
And it's been wonderful to see that explosion of work.
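For readers who want to try this, the open-source Segment Anything release (github.com/facebookresearch/segment-anything) exposes roughly the interface sketched below; the checkpoint and image paths are placeholders, and the details are worth checking against the repo's README.

```python
# Rough usage sketch for the open-source Segment Anything Model (SAM).
# "sam_vit_h.pth" and "cells.png" are placeholder paths.
import cv2
from segment_anything import sam_model_registry, SamAutomaticMaskGenerator

sam = sam_model_registry["vit_h"](checkpoint="sam_vit_h.pth")   # load downloaded weights
mask_generator = SamAutomaticMaskGenerator(sam)

image = cv2.cvtColor(cv2.imread("cells.png"), cv2.COLOR_BGR2RGB)
masks = mask_generator.generate(image)        # list of dicts, one per segmented region

print(len(masks), "regions found; first mask covers", masks[0]["area"], "pixels")
```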
On the language side, we also saw many people build up all sorts of different tools.
And in particular, the work that we're most excited about is the work on efficiency, to be honest with you. There is so much that we can do to make these models more compact and efficient and running really, really fast with low energy. And I think that's one of the things
that I've been most excited about seeing. There's lots of other applications too.
Anything that stood out and made you say, oh, that's not good. That's not what we want.
There are definitely some that are getting flagged that we discuss internally. I'm probably
not going to go into the details of them right now, but there are definitely a number
of them that we are tracking. I will say, in a number of the cases where we are most concerned, people are not respecting the terms of use of these models. So we release these models with very
clear terms of use and people may not be respecting those terms of use. Do you have recourse once
they disrespect those terms? Yeah, I think that's, I'm not going to go into the details of that
today, but this is, you know, this is definitely part of the conversation. We, we are thoughtful
about the conditions under which we release. And so we are thoughtful about the follow-through as well.
Before we go to break, I want to ask you about this move toward getting these models to reason.
There was like this momentary freakout around this Q* model thing that OpenAI apparently has developed internally, which gets people to reason, gets the model to reason. What's your perspective on this technology moving towards, like, the ability to reason, and how should we think about it when we see stories like the one about Q*?
I mean, I think the number one thing is just like don't get too worked up about it.
The amount of, you know, speculation probably far outweighed what was going on there.
I don't have firsthand information on Q*.
We have a lot of, you know, a lot of speculation of our own of what it is.
What I will say, though, is to some degree, people shouldn't be too surprised.
You know, a while ago, we shared a model that could play the game of Diplomacy at the level of a human player.
I don't know what people thought that model was.
Cicero.
Cicero, exactly, right?
Cicero was having conversations with other players,
and it was reasoning about the game strategy.
And so this was an example of a model that had language
and that could reason arguably in the hardest game out there.
So I don't think people should be surprised that language models have the ability to be effective in reasoning tasks, especially when paired with other mechanisms.
In the case of Cicero, we were using some search mechanisms inside to be able to achieve
reasoning.
It's a different architecture than what we have in Llama.
But a lot of the ingredients of how to do reasoning have been explored in AI for 40 years
and are published and well known to anyone who's taken even an undergrad level course
in AI.
So I'm not saying there's not any innovation in the work that OpenAI is doing or in the work that's happening across the community. I'm just saying it's not like a magic ingredient. I'd be extremely surprised by that. So what could the next level
jump there be? I mean, there's a lot of theories of how to achieve reasoning in these models.
One of them is to incorporate search as part of the model. And another one is incorporating, for
example, a lot more coding abilities. Coding is executable. Coding allows us to essentially
dig in through a sequence of operations.
Another direction that many groups are exploring is the use of retrieval-based techniques.
So you're retrieving information.
Some of that retrieval can make use of information where reasoning is present in the information.
So lots of different ways to go about it.
We're exploring many of them.
Any respectable AI research group probably is too.
And what's really going to make the difference is how do we bring this together, right?
How do we make sure to have the right way for these components to integrate in some ways?
That's still the hardest question in AI.
How do we have different components working together in a very coordinated way?
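To make the retrieval idea concrete (a toy sketch of the general technique, not a description of Meta's systems): score stored documents against the question, then put the top matches in front of the prompt so the model can ground its reasoning in retrieved text.

```python
# Toy retrieval-augmented prompting: crude word-overlap retrieval plus a prompt
# template. Real systems use learned embeddings and an actual language model.
def overlap_score(query, doc):
    q, d = set(query.lower().split()), set(doc.lower().split())
    return len(q & d)

documents = [
    "Cicero plays Diplomacy by combining a language model with strategic planning.",
    "Diffusion models generate images by iteratively denoising random noise.",
    "Llama models are trained to predict the next token in a sequence.",
]

def build_prompt(question, k=2):
    top = sorted(documents, key=lambda d: overlap_score(question, d), reverse=True)[:k]
    context = "\n".join("- " + d for d in top)
    return f"Context:\n{context}\n\nQuestion: {question}\nAnswer step by step:"

print(build_prompt("How does Cicero reason about Diplomacy strategy?"))
```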
Is there anything that you could see in the sort of research or production that would freak you out?
Or are you sort of calm, cool, and collected about where we're heading?
There's stuff. I mean, I don't tend to freak out a lot. There's stuff that concerns me every day. You know, we review rigorously the performance of our models for different aspects. There's many cases where I see a model and the performance, for example, on safety benchmarks isn't what I would expect it to be. And then we go back and we keep on working on it. So it's not that there isn't a ton of work to do. I just don't feel that, like, you know, freaking out or being fearful about it is the best way to go about it. I think you just have to look at the data in a collected way. In many cases, we don't even have the right way to analyze the properties of our models. You know, is this model safe or unsafe? Does it have, you know, toxic behavior? Does it have bias? There's a lot of work to do to even develop the tools to assess this so we can look at it in a rational way. So we invest a lot in that also.
We're here with Joelle Pineau, the head of Meta's AI Research Division, still called FAIR, Fundamental AI Research.
We've talked a lot about the research side on this side of the break.
On the other side of the break, we're going to talk about product, because Joelle's division has recently moved toward the product side of Meta, and we're going to talk about what that
means right after this.
Hey, everyone, let me tell you about the Hustle Daily Show, a podcast filled with business,
tech news and original stories to keep you in the loop on what's trending.
More than 2 million professionals read The Hustle's daily email
for its irreverent and informative takes on business and tech news.
Now they have a daily podcast called The Hustle Daily Show
where their team of writers break down the biggest business headlines
in 15 minutes or less and explain why you should care about them.
So search for The Hustle Daily Show and your favorite podcast app
like the one you're using right now.
And we're back here on Big Technology Podcast with Joelle Pineau, head of Meta's AI research division. Your division just moved toward the products, or under the
product division within meta. Let me start this segment with this question. It's a broad question.
I don't think I've ever seen a disconnect as much as I'm seeing now where the discussion of where
this technology can lead and what it does today is so, I would say, even divorced from the products that we've seen. I mean, yes, ChatGPT was groundbreaking and still incredible to use, and so are some of the competitors. But beyond that, have we really seen the product momentum when it comes to building on large language models? And, you know, we've heard so much about the enterprise. Yeah, we've seen some copilots from Microsoft, stuff like that, the bots in the messaging apps that Meta is creating.
But, you know, for all the talk of revolution, it seems somewhat like an evolution.
So what do you think about that?
What am I missing here?
I do see it as a bigger step change, I think, than you're articulating it.
I think we have seen the birth of what I would call an AI research product.
And so if I take, you know, for example, the GPT family of models, I do think there is a real product there.
People are using it.
Some people are using it every day.
And so I don't think we've seen anywhere near everything that is possible, but I think we have to have a very open mind that the products that are AI-first are going to look very different than products we've seen before.
That being said, I will say, you know, as much as we spend a lot of time worrying about
what is the path on the research side, I do think we need almost as much exploration on the
product side. You know, on the research side, the space of hypotheses to build these models is huge, but on the product side, the space of new things you could build with this is huge.
And we don't yet have nearly enough information about what are going to be those products
and those experiences that people are going to actually use every day and love using.
So I'm, you know, as I talk to partners across the company, one of the things I encourage
them to do is to really embrace the exploration that comes out of having a completely new
tech stack compared to what they had before and not just take, you know, the products that
they know and like shove AI into them, but completely reimagine what is possible.
So that's been a really, really fun conversation to have.
And one of the things that is going on is Meta's brought a bunch of AI bots into the messaging
apps. Can you tell us a little bit about how that's going? I mean, I saw there's like 12 or 20 different bots that are in these apps, and I played with them for a little bit and then
I kind of lost interest and I haven't like seen any reminders that, hey, they exist. So how's
adoption been there? What can you tell us about those? Are you asking for more reminders that
they're there? Because we can do that. Honestly, yes. Honestly, yes. I think that would be good.
Okay. Yes, the bots are there. They're available. The bots are an example of exactly
what I mean, right? This is product exploration to some degree at its best in terms of like
trying out different things. There's an intuition that there's enough there that it's worth
putting it out in the hands of people. There's enough conviction as well as data to support
releasing that for people to use. But I think it was very much the kind of product we hadn't
done before. And we're going to learn so much out of getting that into people's hands.
You can think of it as really accelerating that cycle of development. And there's some bots
that are doing quite well that are seeing quite a bit of use, some bots that are seeing a lot
less use. I don't have the numbers with me. And for your listeners to understand, you know, FAIR does the fundamental AI research and we have a sister organization that is more
connected to the product and is releasing those bots. We're tracking that really closely.
That's feeding back into the product exploration conversation going on. I would say the bots,
as well as the Meta AI assistant, are within a category of things that we call AI agents.
And so we have a pretty wide exploration within that space of AI agents, so you should expect to see new things coming in years to come.
And the other example that we explored a bit this year is on the smart glasses where we also have an AI system running on that, which is a very different, very different experience compared to the desktop or mobile cases.
Yeah, so adoption, how is adoption looking with those messaging bots?
We'd have to, you know, we'd have to get someone else on your podcast to give you more detail on that.
Yeah, absolutely.
I'm sure we can find you someone who can give you some of that information.
So you mentioned that you have the product teams and you have FAIR, but FAIR used to be in Reality Labs and now it's like directly under the product team within Meta.
Why did that move happen?
So, I mean, we had a wonderful set of colleagues and great work happening in Reality Labs Research.
The truth is right now AI is moving so fast that it's really useful to be close to products that are in the hands of billions of people to be able to have that quick product innovation, that quick signal back to the research.
We were already working in close collaboration with the family of product teams, but this just makes things go a little bit faster.
And the GenAI team that is really putting out some of these AI characters and the Meta AI assistant was already in that product team. So bringing us together gives us the ability to be much more coordinated
in particular from the research to, you know, building up the products and then releasing them.
We're still going to continue to do a lot of work on the Reality Labs side. You know, we've been
in that work for a few years. We've built up a lot of exciting projects. It's going to be maybe a few
more years between now and when some of these get on the market. But these projects are not slowing
in any way. I think there's a really good understanding at the company level that right now,
the more we accelerate the AI roadmap, it is going to benefit both the existing products
as well as the AR/VR and the Reality Labs side of the company. So I think that's really where
we are with this one. One thing that seems like it's really going to be a thing that people
talk about this year is video generation. We've just seen a little bit come out this week from
Google. I know that you guys are working on it. Tell us a little bit about what that could
look like. I mean, it's one thing to sort of type in, draw me a picture, and you get one out from DALL-E, or Meta has one, an image generator as well, but the video generation seems
pretty wild. Yeah, it's been great to see that. It's not surprising. As soon as you, you know,
you have good image generation. Every time we've had progress in terms of image generation,
the next step is how do we do 3D images and how do we do videos?
Like, these are the two dimensions in which people quickly extend any progress in image generation.
On the video side, I would say we've seen much, much better models coming out,
but we haven't totally cracked the problem of generating long-form videos.
The temporal coherence is quite tough.
And I think, you know, for those of you who know a little bit more about video,
there's a piece of spatial coherence that you need to be thoughtful of, and that's the
piece that image generation has, to some degree, solved in the last year. But the temporal coherence is something that right now is harder to do. We get really good quality video generation
if you can intervene and kind of set a lot of the frames, and then you kind of use a diffusion
model to interpolate in between. But to go from a really high level, for example, a script,
written in words, and to have like a full, you know, full-length feature film is still going
to take us a little while. One of the biggest problems there is to think about how to do
generation in sort of a hierarchical way, not just do frame after frame after frame, but actually
think of how do you generate globally some properties of your video and then go through more
and more granular resolution over space and over time. This is something that Yann has been thinking about a lot. He's working closely with some of our research teams in New York, in Montreal,
in Paris, to make progress on that. And so I'm, you know, I'm leaving a lot of that on
him to drive, but I know he has a lot of ideas on this topic. And how also to achieve that
in a way that isn't too intensive in terms of data and compute.
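A purely schematic sketch of that hierarchical, keyframe-then-interpolate idea (mine, not Meta's pipeline): plan the video coarsely in time, then fill in the frames between keyframes. The "generator" and "interpolator" here are trivial stand-ins; in practice each stage would be a learned model such as a diffusion model.

```python
# Coarse-to-fine video generation, schematically: keyframes first, then
# interpolation between them. Random frames and linear blending are stand-ins
# for learned models.
import numpy as np

def generate_keyframes(n_key, h=64, w=64):
    # Stand-in for a model that plans the video at a coarse time scale.
    return [np.random.rand(h, w, 3) for _ in range(n_key)]

def interpolate(frame_a, frame_b, steps):
    # Stand-in for a learned interpolator (e.g. a diffusion model).
    return [(1 - t) * frame_a + t * frame_b
            for t in np.linspace(0, 1, steps, endpoint=False)]

keyframes = generate_keyframes(n_key=4)
video = []
for a, b in zip(keyframes, keyframes[1:]):
    video.extend(interpolate(a, b, steps=8))   # fill in 8 frames per keyframe pair
video.append(keyframes[-1])

print(len(video), "frames generated from", len(keyframes), "keyframes")
```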
Right. I think that's when you sort of get into, like, can it predict and plan and sort of really understand what reality is? That's some fascinating stuff. Okay, we're coming close to a landing here. Very quickly, we also spoke, when we really had some fun conversations when we met the first time, with the chief technology officer of NVIDIA. And Mark just announced that you have 350,000 NVIDIA H100 chips and will end up with 650,000 by the end of the year, NVIDIA H100s or equivalent. I'm just curious, from your perspective as a customer of NVIDIA, what makes those chips so effective for you now? It's obviously a technology component, but there's a software side of it as well, right? So can you talk us through exactly what makes them so appealing? And do you think they are going to just be the unparalleled developer of these chips forever? Or are you starting to look at others, like Arm, et cetera, Intel? You tell us. Yeah, I mean, honestly, it's clear to everyone that a lot of the progress in AI has
been fueled by the availability of GPUs built by NVIDIA. It's not the only solution. Google uses their own TPUs a lot, as an example. So there's a few others. But overall, I think NVIDIA's
GPUs have been essential to the progress. And we've been fortunate to have many of them to power
our own research. There's a couple things that make them great. One, you know, the GPUs on their
own have the ability to parallelize a lot of the computation, which is essential for training these
models. And we also have the ability to build them into systems, you know, networked with
very fast interconnection between them to allow information to be passed around very, very
quickly. And when you do that at scale with a few thousand GPUs, you can train some of
these larger models. So that's really the essential ingredients. In terms of the trajectory there,
of course, you know, as all responsible organizations, we're looking at all options that could
accelerate our work, we keep a close eye on the development of hardware. Right now, as Mark
has shared, you know, I think betting on the GPUs from NVIDIA is a sound bet for our
research, but we're always interested to see innovation in all aspects of the stack.
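For a sense of why parallelism and fast interconnect matter, here is a minimal data-parallel training sketch in PyTorch (my illustration, not Meta's training stack): each GPU computes gradients on its own shard of the batch, and the gradients are averaged across GPUs before every update. It assumes a single node launched with torchrun.

```python
# Minimal data-parallel sketch: one process per GPU, gradients all-reduced
# across GPUs during backward(). Launch with: torchrun --nproc_per_node=8 train.py
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

def main():
    dist.init_process_group("nccl")               # torchrun supplies rank/world size
    rank = dist.get_rank()                        # assumes one node, so rank == local GPU index
    torch.cuda.set_device(rank)

    model = DDP(torch.nn.Linear(1024, 1024).cuda(rank), device_ids=[rank])
    opt = torch.optim.AdamW(model.parameters(), lr=1e-4)

    for _ in range(10):
        x = torch.randn(32, 1024, device=rank)    # this GPU's shard of the global batch
        loss = model(x).pow(2).mean()
        loss.backward()                           # gradient all-reduce happens here
        opt.step()
        opt.zero_grad()

if __name__ == "__main__":
    main()
```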
Are you going to build your own chips? We will definitely be exploring some of that. Yes, yes.
I mean, we built a lot of hardware for Reality Labs. We have some specific needs, and, you know, as much as we look at that for the AR/VR devices, there's also a great group doing some of that innovation inside our infra team.
All right.
Last question for you.
We started the conversation talking about AI reaching human-level intelligence.
I think that's going to happen, let's say five years over or under.
You have a perspective on that one?
In five years, we're going to see really strong systems across a broad set of tasks. I have some strong conviction that we're on a path there. After that, you know, I don't want to bin any intelligence into a narrow box, whether human or AI, but we will be amazed by what gets done in the next five years.
All right. Can't wait to watch it. Joelle, thank you so much for joining.
Thank you, Alex.
All right, everybody. Thank you for listening. We will be back on Friday with a new show
breaking down the news, and we will see you for our Friday show on Big Technology Podcast.
