@HPC Podcast Archives - OrionX.net - @HPCpodcast-85: Stanford AI Index w Nestor Maslej
Episode Date: June 7, 2024. Shahin and Doug are joined by Nestor Maslej of the Stanford Institute for Human-Centered Artificial Intelligence (HAI) at Stanford University. He tracks the advancement of AI in his role as Research Manager and Editor in Chief of the annual Stanford AI Index and Stanford Global AI Vibrancy Tool. Nestor has degrees from Harvard and Oxford and is also a fellow at the Center for International Governance Innovation. [audio mp3="https://orionx.net/wp-content/uploads/2024/06/085@HPCpodcast_Stanford-AI-Index_Nestor-Maslej_20240607.mp3"][/audio]
Transcript
If you didn't get a chance to visit Lenovo at ISC24, it's not too late.
Check out their Inside HPC booth video to get caught up and learn more about their booth's theme,
transforming HPC and AI for all.
Visit InsideHPC.com slash Lenovo dash ISC24 video.
First, why do I think it's the time that businesses should be using AI? We talk about this in the index.
There's compelling evidence that managers that integrate AI into their business functions
see cost decreases and revenue increases.
There's compelling evidence that AI makes workers more productive, at the minimum, lets
them do work in less time.
I would say, yeah, businesses should be thinking about it, but I don't think there's a need
to be losing sleep over the pace of integration.
Certainly make it a priority, but understand that if you do it slowly and in a steady way,
you will probably set your business up quite well.
From OrionX in association with Inside HPC,
this is the At HPC podcast. Join Shahin Khan and Doug Black as they discuss supercomputing
technologies and the applications, markets, and policies that shape them. Thank you for being with
us. Hi, everyone. Welcome to the At HPC podcast. Shahin and I are joined today by Nestor Maslej. He is research
manager at Stanford's Institute for Human-Centered Artificial Intelligence, where he manages the
annual Stanford AI Index Report. Nestor has degrees from Harvard and Oxford. He was an
analyst for startups based in Toronto, and he is also a fellow at the Center for International
Governance Innovation. Nestor, welcome. Thanks for having me, guys. Today, I'm super excited
to be talking about the AI Index and more broadly where the AI ecosystem is.
Super. So this year's AI Index was released in April. Why don't we start off with what you
consider maybe the top findings in the new report?
And, you know, were there trends or new insights that you think are especially noteworthy?
Yeah, it's a good question. It's also a tricky question because, I mean, the report's quite long.
It's close to 500 pages. And I like to sometimes joke that I get offended when people ask me to
summarize these 500 pages of brilliance that I've put out into the
world into a quick little soundbite. But of course, that's important to do. I think for me,
if there is a high level trend, and I think this kind of theme setting will set up well,
the other things that we'll be talking about in the podcast today is that 2022, to me felt like
the year in which AI technology really put itself on the map. And
2023 felt to me like the year in which people responded. Now, I think you see that response
in a lot of different ways. You see policymakers reacting in terms of passing more AI-related
regulations. You see the public responding in terms of seeming to be very concerned about AI
technology for really the first time. And you also see businesses responding, whether it's in terms of
rising rates of business integration of AI, there being new research on the productivity impacts
that this kind of tool has. And there also being some data on the fact that there's just much more
investment in generative AI than ever before. For me, that's really the high level story.
And I think that's a story that you see with any kind of technology. It announces itself to the
world and then ultimately people respond. And in a lot of ways, this theme kind of situates itself
nicely with the mission of the index. We exist to give people, whether
it's policymakers, business leaders, or just regular civilians, the opportunity to understand
what this technology is and make appropriate decisions about how they should be using it and
vibing with it in their day-to-day lives. Yeah. 2022, November, when OpenAI came out
with ChatGPT, that was one of those instances that happens
every decade or so.
Oh, for sure.
Where you just feel, I'm living right now in a historical moment.
There's history going on all around me at this given time.
It was amazing.
But let me ask you this too.
You know, AI is changing so fast.
It's now almost two months since your report came out.
Is there anything new since then that you've detected? There are always new things. I think
there's new papers being written, exciting new research that's being published. I wouldn't say
that there has been anything that is, let's say, transformatively new in the way of ChatGPT. I
think you've seen an extension of
trends that have continued, whether it's the launching of new capable open source models
like Llama 3. I think there's some ongoing drama at OpenAI with there being a lot of resignations of
members of their super alignment team. But I think some of the trends that we've seen in the last few
months are more kind of reflective of broader themes that
had been percolating and building in the last few years, whether it's this kind of question of to
what degree can open source models actually compete with closed models? That's something
that we touch on in the index. The question of how much is existential risk something that should be
taken seriously when developing these kinds of AI systems.
So I think some of the things that have happened in the last few months are really just extensions
of a lot of themes that had been building in the last little while.
What I've seen just lately in media, the Wall Street Journal just had a big piece about
the possibility that AI has been overhyped, there could be a bubble.
Questions along those lines.
Yeah, I mean, is the question, are you asking me, do I think AI is overhyped?
No, I would say it's just a trend within AI that there are growing questions just in the
general ecosystem and in the media ecosystem.
Yeah, I mean, unquestionably.
I mean, I think this happens with a lot of technology.
And I think there is reason for people to be, I don't want to say skeptical, but you
see other technologies that emerge like crypto was something that was really celebrated by
a lot of people in Silicon Valley.
And I mean, I think you can make an argument that it's yet to make the impact that a lot
of people perhaps predicted that it might.
I think AI is perhaps a little bit different to me.
I think
there's already a lot of compelling evidence that it does make a big difference in terms of worker
productivity. It leads to higher quality work. It levels up lower skilled workers. And I think in a
lot of ways, it's also hard to fully judge what AI can do because you haven't really fully seen
a lot of companies integrate this technology yet. There's a good parallel,
I think, with the development of electricity, where when electricity was discovered in the mid-19th century, it took several decades, if not four or five, for the productivity impacts of
that new technology to be really fully manifest. Because of course, if you had a factory that was
built in the middle of the 19th century, you couldn't just retrofit it with the snap of the fingers to be using all of this new
technology, all of this electricity that had been discovered.
So I think there's a similar analogy that holds true with artificial intelligence, that
there's probably a lot of questions to be asked about how companies can be thinking
about actually using these tools.
Do we have the right infrastructure
to even ensure that companies can take advantage of their own data, that they could build well
with AI? It will probably take us a little while to get there. And I think at that point,
we're really going to start to see some compelling evidence that AI has had some
high-level productivity impacts. That's causing me to re-evaluate the question I was going to ask.
Well, what was the question perhaps? I mean, you can challenge me. There's a challenge.
No, it's not a challenge. It's fundamentally a question is where is AI in the so-called hype
cycle? And when do I need to make what kind of investment in it? Do I need to rush now because
otherwise I'm going to be left behind or do I still have time to catch up with
it? And is this an existential question for my business? Or is this going to evolve slowly enough
that I can catch up with it? I mean, I think it's the time now for businesses to be thinking about
using AI, but I think you need to exercise patience. So I mean, first, why do I think it's
the time that businesses should be
using AI? We talk about this in the index. There's compelling evidence that managers that integrate
AI into their business functions see cost decreases and revenue increases. There's compelling evidence
that AI makes workers more productive, at the minimum, lets them do work in less time, and at
the maximum, lets them save time and also improves the
quality of work. There's work and academic literature that shows that AI is bridging the
skill gap between low and high skilled workers. High skilled workers still top low skilled workers,
but the gap is certainly bridged. And you see these studies now across different industries,
which to me suggests a degree of further validity.
So there's definitely a lot of evidence that AI offers a competitive advantage. And I mean,
I think it's just economics 101. But if as a business, you're not necessarily positioning
yourself to leverage a tool that your competitors might use that gives them a competitive advantage,
you yourself might be put a little bit behind. Now, at the same time, we're still in this area or in this
time period where a lot of regulatory questions about AI remain unsettled. A lot of countries
are still figuring out what kind of regulatory regimes and regulatory landscapes they're going
to set up to govern how this technology is used and how it is deployed. So I think businesses,
you know, they face this challenge that on the
one hand, there is definitely a desire to be integrating AI. On the other, you know, you don't
necessarily want to rush and perhaps either go against regulation or perhaps set up AI in a way
that might not necessarily align with where the regulatory landscape is heading. So I think I
would say businesses should be thinking about it. But I
don't think there is a need to be losing sleep over the pace of integration and certainly make
it a priority. But understand that if you do it slowly and in a steady way, you will probably set
your business up quite well. Now, you mentioned policy. And I also noted part of your background
is governance innovation, which is highly relevant to a lot of things these days.
Do you think that regulators know enough to regulate this right now?
I mean, I think they're trying their best.
I mean, I think you would always hope that you could get higher-level talent into government.
And I mean, that's another completely different discussion. There's some good data in the index report, which talks about how a lot of AI PhDs mostly are going to industry
now. Some are going to academia, but very few are going into government. And that's obviously a bit
problematic. But put that aside, I do think governments have relatively been on the front
foot. And I think there's an interesting juxtaposition to be made to the rise of social
media, where I think when social media really emerged into the zeitgeist in the kind of late
2000s, it really took half a decade, if not more, for there to be a fairly substantial policy
response. And I think a lot of us looking back maybe wish we would have done things a bit
differently with social media. Whereas I think with AI, I mean, if 2022 is the kind of, let's say,
groundbreaking moment when AI announced itself on the map, you can say that literally the next year
in 2023, you had landmark regulation being passed in both the United States, you had landmark
legislation being passed in the EU. So I do think policymakers are trying to respond in the best way
that they can. Obviously, I think some
regulation perhaps is a bit better than others, just from my own personal opinion. But I do think
that there is a sense and an awareness that this technology is here, and that it's certainly very
necessary to be doing something at the moment to deal with this technology and to try to not only
regulate it to ensure that it's used in a safe way, but also expand its possibilities,
because you can do a lot with AI. But if we're in a world where it's going to cost $200 million
to train frontier AI models, that obviously is perhaps leaving certain people outside of the
potential of building and developing these really transformative systems.
Right. Speaking of the different models of
governance that are emerging, certainly you mentioned the US, which appears to be pretty
market oriented and a little bit looser. Then you have the EU and their regulations are actually
going into effect like next month. And they are a little bit tighter, a little bit more controlling
of the data and how it's used.
And then you've got China, which of course is using it more for social stability and such.
And then you've got the rest of Asia. How do you see that spectrum? And what is your thought
process to favor one over the other? I mean, I think, again, there's a challenge here where
I think you could do a lot with AI. And I think AI as a tool, if it's integrated well and developed well,
can have a lot of massive productivity impacts. At the same time, though, you want to ensure that
it is being developed in a responsible way and that people trust the tool. I think if you look
at data that, again, is in the report on public opinion perspectives, like how do people actually
feel about this technology? You know, in a lot of
Western countries like the United States, like Canada, like Germany, opinions are actually very
low. People are very pessimistic about AI. And I mean, that kind of pessimism can really inform
how it's developed. Why am I talking about this? Well, because if you're a government,
you want to ensure that the developers of this tool are developing it in such a way that people
ultimately have faith in it, have faith that it's not necessarily going to automate their jobs,
have faith that it's not going to violate their privacy. Now, at the same time, you don't want
to create a regulatory environment that kind of stifles innovation in a way that is perhaps
too cumbersome. So I definitely think there's a kind of a need to
really walk a fine line here. On the one hand, create an environment that is expansive enough
for companies to feel like they can innovate, for companies to want to be innovating, and also not
only for companies to innovate, but also for just regular developers to feel like they're empowered
to pursue interesting projects. And then on the other hand, create kind of rules and regulations about how we should be using this technology so
that people trust the technology and people have faith that it's not going to work against their
interests because it is instilling that faith that you can put people in a position to really
take advantage of the tool and use it as completely as possible.
Yeah, that really is a challenge to accelerate all the good stuff it can do
while, you know, controlling it as it relates to the bad stuff.
Yeah. How do you guys feel about that? Because obviously my perspective is my perspective,
but I'd be curious to hear from both of you guys as people that come from a slightly different
landscape, how do you feel about the kind of need for AI regulation? Are you favoring perhaps more of a framework, the kind that you see in the
European Union, where I think they've really doubled down on regulating and putting in rules
first and foremost, or perhaps one that we see more here in the United States, where it's a
little bit more market oriented, as you said. Well, I was just about to ask you what your views were on the new EU regulatory regime. I guess my overall sense, you mentioned social
media. I think in some ways, social media is really out of control, especially where it relates
to children and teenagers. Now, if AI goes in a similar direction, right, not good. And if it's
threatening people's livelihoods,
then certain responses have to be put in place. Either that's halted or hindered, or we
need to enable people to acquire new skills to fit into the new AI driven economy.
Yeah. I mean, I think that the reality is, my perspective is, you know, AI isn't going anywhere.
I mean, it's here.
And I think if you look at other technologies that have launched themselves historically,
you know, despite the attempts that certain people have made to resist those technologies,
they inevitably become part of the mainstream.
So I think for us, the question is, how do we want to relate to this technology in a way that it serves our interests?
Because with AI, again,
it could really augment what a lot of workers can do. How do we create incentives to ensure that
companies are using AI tools to augment the capabilities of their workforce and not
necessarily automate them? So I think for me, I think the first step is just admitting that
this technology is here. And the second step afterwards is really thinking critically about what do we want out of it? I mean, it's always a values question, right? I
think that, you know, if you think about it, some of these tools that we use, some of these digital
tools like Netflix or Amazon, they can make great recommendations for, you know, the show that I
want to watch next or the product that I want to buy. Obviously, to make those recommendations,
they have to have access to my search history. Some of us might be very comfortable with kind of surrendering that data
about ourselves if it leads to better services in the economy. Others might not be. But at the end
of the day, it's a social question. And I think we have to come together as a society and think
what we really want from this technology and start having that dialogue, start having that discourse. So my view is that the value of data is not being
allocated properly right now. And when you have startups that are not even generating a lot of
revenue, that value is in their valuation rather than their revenue. And that I think is an issue. And I would like to see a move towards data rights
management at inception that allows individuals to license their data rather than just hand it over.
In terms of policies, my view is that while AI can augment an employee and may not impact that
individual employee's employment status, it does impact
the category of jobs that need that particular task that they can do.
Yeah, definitely.
Therefore, it impacts employment. And the impact on employment does not have to be large
to be disruptive to society. And governments would do well to account for that, budget for that,
and ease that transition to
nirvana. However, it is impossible to slow down, and it's not even advisable to slow down AI. So I
think we need to move as fast as possible while making sure that any transition issues are
budgeted for. Yeah. What I was just going to say is that I think even in a lot of ways, it can
sometimes be difficult to anticipate the impact that these kinds of technologies will have on economies. And I
think one of the things that I always find quite amusing about ChatGPT, I mean, when I studied,
I did a lot of research on economic development and automation. And if you look at a lot of
canonical automation literature, I think a lot of this literature very often made assumptions that
the people whose jobs would go first would be those that would be doing very kind of routine,
repetitive labor. I mean, that was always the assumption that like robots would come for those
tasks first, you can never really automate what a creative person does, or what a knowledge worker
does. And I mean, there have been developments in robotics. Again,
we profile them in the report, but we're still pretty far away from a robot flipping a burger
at a McDonald's. But you know what, there are some pretty good large language models that
can maybe do some of the job that lawyers are doing. Obviously, that's a bit of an exaggeration.
But my point being is that I don't think a lot of people really predicted that knowledge workers
would be the first ones that would be threatened. And again, you know, that's the challenge with this technology.
There's a lot of people that sometimes make bold predictions and bold prognostications about where
we're going. But this technology has a way to it; it kind of moves according to its own intuition. And
I think the best thing policymakers can do is just be continually receptive and continually responsive to the realities of AI and some of the new dynamics that it's introducing.
Actually, they also should have in mind that politicians themselves could be replaced by AI.
Yeah, well, that perhaps is a very brave new world, but definitely an interesting thing to think about.
It'll get votes. I think a really compelling imperative issue is the replacement of journalists. Now that's
alarming, but I say that out of complete silence. Could I ask you, just looking way out, I mean,
there are some who prognosticate that AI is a threat to the human species. I mean,
are you in that camp looking out decades ahead?
I mean, I think it's a non-zero possibility, but I think as in like there are narratives about this
kind of existential risk and I guess I understand them, but I still think given the way the AI
ecosystem is structured today, I don't necessarily know if we're going to get to these kinds of
Terminator existential robots
anytime soon. And I mean, I have to caveat this because there's a few assumptions that this kind
of argument requires. I think you've seen tremendous developments in AI systems in the
last few years. A lot of these developments have really been confined to large language modeling.
And then there has been a lot of really interesting research that has used insights from
large language modeling, like next token prediction to lead to further improvements in robotics or
image generation, image editing. And the improvement that you've seen in large language
models has really been a function of scaling. I think when GPT-2 came out, it was fairly well
understood that if you just give these models more data, you take a
transformer and you give it more data, it will perform better on a lot of tasks. And that's
basically, I think, what a lot of these companies are betting on. OpenAI, Anthropic, Google, Meta,
I think they're all betting that you could just have that architecture. Of course, maybe introduce
new slight tweaks, whether it's RLHF or something along those lines. And then if you just scale and
add in more data, you're going to create a system that's going to be a lot better.
Now, I'm not necessarily convinced that scaling these systems is going to lead to some kind of,
let's say, AGI, whatever that means, that kind of has its own intentions and that is sentient.
I think that will require a further architectural
development as in a new way of developing a model and thinking about a model. And it's just really
hard to predict when that development is going to occur. There's a great line that I like to use
from a really celebrated Stanford computer scientist. I think John McCarthy was once asked,
when are we going to get AGI? And he said, anywhere from five to 500 years.
And I think there's a point there that it's just, it's hard to predict when you're going
to have the next kind of the next transformer, right?
And even the transformer, you know, it's funny, you would imagine that nowadays, if Google
could go back in a time machine, and if it knew that the transformer was a way to build
these highly functional LLMs, it probably would have never openly released that paper. But again,
I think when the paper was released, no one that was on that authorship team, I think had any idea
that it was going to unleash the revolution that we've seen. So that's the thing about research,
it requires a lot of serendipity, it's hard to exactly know when it's going to go and move forward. And I think large language models,
as they are today, are very powerful, very capable, can lead to a lot of very important
transformations. But again, it still might be a while before we get another architecture that
really raises the bar of what these AI systems are capable of doing. But again, it'll be really interesting for me to see how well GPT-5 and
you know, Claude 4 and these kind of Gemini 2, the next level of models do, because I'm sure they're
going to get better at some tasks. But I'm really curious to see if scaling these models and feeding them more data really continues to lead to massive improvements in performance capabilities, or if it's going to be the case that you start seeing some kind of leveling off.
So in other words, am I right in saying, to capture a bit of what you're saying, generative AI is not an early form of general AI? I think to a degree.
I mean, I think the problem is there isn't really an industry- or community-accepted definition of
what AGI means. There is, for example, some, I think there are some like prediction websites
like Metaculus that define AGI as a system that could achieve a certain score on the SAT, you know, play a video game at a
high level and do a couple of other tasks. I mean, that's one definition of AGI. And if that's the
definition you use, we're probably fairly close to getting systems that can do that because we
already have systems that are fairly flexible. But I think when a lot of people think of AGI,
they think of like a sentient system that kind of
is aware of its own conceptions in the world. I like to joke sometimes with friends that, you
know, we've passed the Turing test. A lot of these systems can speak fairly convincingly and can
sometimes fool humans into thinking they're humans and not computers. But what would impress me more
is the so-called Maslej test. You know, I would like to ask an LLM to copy edit the AI index. And when the LLM tells me, no, I don't want to copy edit,
I'm moving to Brooklyn to become an artist or a jazz musician. And when it exercises its own
capabilities, you know, to me, that would be very impressive. But we're still, I think,
quite far away from systems that could do that well. So it's a definitional question in a lot of ways, too.
I think a really interesting phenomenon is LLMs drawing correlations and insights that they weren't trained to do.
And Shahin and I both think this is a matter of understanding how LLMs work better.
It's not an early form of AGI.
How do you know you didn't train it?
Because I think there is information in the data that you give it that may not be open to you, but it is in the info.
So you don't think you trained it, but you actually did.
So how do you...
Yeah, exactly.
I mean, the challenging thing with a lot of these things is, you know, we still don't even understand how our own brain works. And the human brain is a really incredible organ. If you think about it, at least from a textual perspective, we can reason and kind of plan in ways that some of these large language models cannot.
So there clearly is something going on in our own minds architecturally. I think a lot of people were perhaps surprised that language was really a prediction
problem. That it was really just about next token prediction, and that by kind of creating a model that uses that kind of system, you could ultimately get systems that were able to generate text at
a high level and understand what you were saying.
But that, I don't think, was something a lot of people expected. When we talk about getting AI
systems to do other things, whether it is think flexibly, whether it's exercise some sentient
capability, I don't think we always even understand in our own minds how our brain gets us to that
level and what kind of things are required to get us to that level.
And obviously, without that understanding, it makes it harder to engineer those kinds of systems.
It's not to say that we can't engineer them, but just that I think a bit of patience is required as well.
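Nestor's point that modern language models are trained on next-token prediction can be illustrated with a deliberately tiny sketch. The toy count-based bigram model below only illustrates the objective he describes; the miniature corpus and function names are made up for this example, and real LLMs use deep transformer networks rather than simple counting.

```python
# Toy illustration of next-token prediction: a count-based bigram model.
# This is only a sketch of the idea discussed above; real LLMs use transformers,
# but the training objective is the same in spirit: given the tokens so far,
# assign probabilities to the next token.
from collections import Counter, defaultdict

corpus = "the index tracks ai progress and the index tracks ai policy".split()

# Count how often each token follows each preceding token.
bigram_counts = defaultdict(Counter)
for prev_tok, next_tok in zip(corpus, corpus[1:]):
    bigram_counts[prev_tok][next_tok] += 1

def next_token_distribution(prev_tok):
    """Return P(next token | previous token) as a dict of probabilities."""
    counts = bigram_counts[prev_tok]
    total = sum(counts.values())
    return {tok: c / total for tok, c in counts.items()}

if __name__ == "__main__":
    # After "index", this toy model predicts "tracks" with probability 1.0;
    # after "ai", it splits probability between "progress" and "policy".
    print(next_token_distribution("index"))
    print(next_token_distribution("ai"))
```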
Didn't get a chance to visit the Lenovo booth at ISC24, or you want to have another look? Check out their Inside HPC booth
video, where Lenovo discusses AI, Neptune cooling technologies, TruScale for HPC, and more. Visit
insidehpc.com/lenovo-isc24 video to view the video today. One of the definitions I have offered for HPC is that it believes everything is eventually
computable. And that includes intelligence is computable with AI and trust is computable with
cryptocurrency consensus algorithms, and of course, on and on. And we are seeing some of that as well.
Even emotional considerations can be computable. So if you define it like that,
then is this AGI discussion more than philosophical, academic, but a bit of a
distraction as it relates to, let's say, industrial policy? Because if it can automate certain tasks
and that can impact society, who cares if it's AGI or not? Yeah, I mean, I just think it's more, I think
you're right. I think that there is, you're right in the sense that I think AI has tremendous
economic impacts. And I think probably in the present, we'd all be much better off focusing
on ways in which we could use this tool to, you know, advance our livelihoods, and also figuring
out ways in which it is dangerous in the present, whether it is its potential to be used for scams, deep fakes, or its potential to be biased. But I just think humans,
you know, I think we are the most intelligent species. And we are, if you look at the kind of
food chain, we're at the top of the food chain, not necessarily because we're the biggest or the
strongest, but because we're the smartest. And I think it's just fascinating when
you're at the top of that pyramid to potentially see the development of another system that has
some of the capabilities that we do, but maybe is also lacking in some kind of ways. And it's almost
terrifying to think that we could ourselves engineer a system that might unseat us from
that top. And I just think that's, you know, it's an element of a lot of human fascination. And I mean, I think even if there is a very low chance that it might
happen, it's still something that should be taken seriously by policymakers. But it's about balancing
those kinds of long run potential threats with more day to day concerns about how we could be
integrating AI and how we could actually be using AI to really improve a lot of people's lives. But I just think it's something humans like to
think about and humans like to talk about. Well, you're absolutely right. If it is a
super intelligence, then it's going to come after us in ways that we cannot fathom.
Yeah. So right there, you have a black swan waiting to happen.
Potentially. And if you say the likelihood of that is not a lot,
that's still really a lot. Yeah. It's hard to know, A, what that percentage is. And then even
if you knew what that percentage is, could you definitively say we should be doing one policy
set versus another policy set? Right.
Like, you know, I think when nuclear energy was being discovered in the 30s and the 40s,
and, you know, America was working
on building bombs, I think some of those scientists, you know, didn't know for sure
whether they were going to build a system that could, you know, maybe obliterate us all. I think
they calculated that the probability was fairly low. So they continue doing the research. And I
think you have to think about things similarly from a perspective with AI. But I mean, I think
epistemic humility is also necessary. One of the co-chairs of the AI Index, Jack Clark, is one of the co-founders of Anthropic. He's been in the AI policy space for a while now, and he actually released a reflection on the five years that have passed since the launch of GPT-2.
And he seemed to say that he thought at the time that GPT-2, when it was going to launch, was maybe going to have a much worse impact than it ended up having.
As in, look, when GPT-2 launched, they forecasted how increasingly capable AI systems
might have negative impacts on the world. And he seemed to say now that, you know, you've seen some
of these negative impacts, but it perhaps wasn't as dire as maybe he once anticipated. And he now
looking back is mindful that when you have these discussions of AI policy, you know, it's important
not necessarily to act with 100% certainty that, you know, you want to stifle development, because some of the
fears that perhaps him and some of the other team members had didn't completely manifest. So again,
this prediction game is a very, it's a very tricky one. But that's why it's important for
initiatives like the index to exist, because you need to look at the data.
You need to see what impact it is having. You need to see how much better the technology is
getting. How are businesses actually using it? How do people feel about it? I mean, AI is a very
multifaceted thing. So you really need a multifaceted perspective if you want to truly
understand what's going on there. Yeah. Nestor, getting back into two of the trends identified in the report, I thought
number seven, I don't know if I was surprised, but the finding that AI makes workers more
productive and makes higher quality work, I thought that was some time still out in the future.
So you say it completes tasks more quickly, improves quality of output. Can you give us
some examples of where that's happening? Yeah, so I think there are several studies that come to mind. And again, I think we're far away
from saying this is always the case, but at least in studies that the report has looked at,
there was one study we featured, which was actually a meta review of five separate studies.
And in each of these five studies, there were two groups of coders that were basically given
the same task to do.
One group of coders didn't have access to an AI co-pilot.
The other did.
And the groups that had access to the AI tools were all able to do the task substantially
faster, anywhere from 73% to 26% less time than those that didn't have the AI tools. That's one finding from one
world, computer science. There was also a good finding from the world of consulting that similarly,
if you took two consultants, two groups of consultants, you gave one GPT-4 and you gave
the other no AI, the group of consultants that had GPT-4 were more productive, they got work
done a lot quicker, and they submitted work of substantially higher quality.
And you tend to see similar results from the world of law.
There was one study that did a similar analysis.
Now, I will kind of caveat this by saying that there was another really interesting paper that we featured in the index that looked at the effect of AI when people perhaps trusted it a bit too much. And in
this particular study, there were three groups. One of the groups was not given AI. The other two
groups were given the same AI, but one of them was told that the AI that they were given was good.
It was always reliable, did a very good job. The
other group was told that it was given a quote-unquote bad AI, like an AI that was good,
but sometimes was prone to making mistakes. Now if you juxtapose the performance of these three
groups on a set of tasks, the two groups that had access to AI did better than the group that
didn't have access to AI. But of the two groups that were
given AI, it was actually the group that had this bad AI that did a lot better than the group that
had the good AI, even though they were given the same AI. And researchers basically speculated that
you sometimes can have an autopilot effect. You know, if you're using an AI tool that you think
is really reliable, and you stop working or stop checking the outputs,
there may be a chance that some kind of faulty AI generated outputs make it into the final product.
And this kind of goes to show that AI could really level up workers, but it's still really
important to keep humans in the loop and to ensure that you use AI with kind of human supervision.
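To put the coder-study range quoted above in rough perspective, completing a task in 26% to 73% less time corresponds to an effective speedup of roughly 1.4x to 3.7x. The quick back-of-the-envelope conversion below uses only the percentages mentioned in the conversation; the underlying studies report more nuance than this.

```python
# Rough conversion of "X% less time" into an equivalent speedup factor,
# using only the range of percentages quoted in the conversation above.
def speedup_from_time_saved(fraction_saved):
    """If a task takes (1 - fraction_saved) of the original time,
    the effective speedup is 1 / (1 - fraction_saved)."""
    return 1.0 / (1.0 - fraction_saved)

for saved in (0.26, 0.73):
    print(f"{saved:.0%} less time -> ~{speedup_from_time_saved(saved):.1f}x faster")
# 26% less time -> ~1.4x faster; 73% less time -> ~3.7x faster
```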
Yeah. It's a starting point, not an end point, like Wikipedia. Yeah, exactly.
Yeah. A friend of mine had a young marketing person working at his consulting firm,
had her develop some marketing materials. She asked for a list of questions or points that
they wanted to cover, and she simply fed those into an LLM and the resulting product was not acceptable.
So there we go. The other area was number eight, scientific progress. Our audience
is especially interested in AI for science. You cited AlphaDev and GNoME. Could you talk about
AI and science and things that are going on in that area?
Yeah, absolutely. And I mean, I think I'll start by saying that
I think this is a really important topic to discuss.
It's a topic that we created an entirely new chapter
this year to talk about it.
Just because I think when you look at the headlines,
it's easy to think about ChatGPT, Claude, or Gemini.
Those are the kinds of big, sexy models
that everybody's talking about.
But there's a lot of really exciting things that are going on in science.
So GNoME, for example, was an AI-related application that made it substantially easier to discover new material structures.
AlphaDev was a system that improved the performance of different computer algorithms.
Again, there are so many different examples you could cite here. The one that I really like to discuss when I do these kinds of speaking engagements is AlphaMissense,
because I think it really puts into perspective the power of AI in the domain of science.
So for the audience that might not be familiar, a missense mutation is a genetic alteration that can impact how human proteins function,
and they can lead to various diseases like cancer. And these mutations can either be pathogenic or
benign, and scientists estimate that there are close to 71 million different missense mutations.
Now, prior to AlphaMissense, which was an algorithm that
basically was designed to classify these missense mutations, human annotators were able to
successfully annotate 0.1% of all missense mutations. AlphaMissense then comes on board
and it has been able to annotate 89% of these 71 million possible mutations. So we go from 0.1 all the way to 89%.
And I think that really illustrates the power of AI, that there are so many scientific problems
that could really be massively aided with computational strength. And you're starting
to see AI applied to such problems, and you're starting to see the realization of some very exciting results.
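The jump Nestor describes is easier to appreciate with the raw counts. The short back-of-the-envelope calculation below uses only the figures quoted in this episode (about 71 million possible missense mutations, roughly 0.1% annotated by human experts, and about 89% classified by AlphaMissense).

```python
# Back-of-the-envelope comparison using the figures quoted above:
# ~71 million possible missense mutations, ~0.1% annotated by humans,
# ~89% classified by AlphaMissense. Figures are as stated in the episode.
TOTAL_MUTATIONS = 71_000_000

human_annotated = TOTAL_MUTATIONS * 0.001   # 0.1%
alphamissense   = TOTAL_MUTATIONS * 0.89    # 89%

print(f"Human-annotated:          ~{human_annotated:,.0f}")      # ~71,000
print(f"AlphaMissense-classified: ~{alphamissense:,.0f}")        # ~63,190,000
print(f"Coverage increase:        ~{alphamissense / human_annotated:,.0f}x")  # ~890x
```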
Very cool.
Amazing.
If I can introduce another topic, it's the funding model for AI, the cost of AI and its
return on investment, so to say. If you're a startup, that would be monetization. If you're
an organization or a funding agency, it's how much do you spend on what, when? What are you
seeing in that world?
Well, I mean, I think it depends what you're trying to do when it comes to building these
systems. So I mean, I think to build an LLM, you just need a lot of money. And this is actually
some novel research that we had in this year's AI index, but we estimated the training costs of a
lot of significant systems. And again, we talked about the transformer once or twice on this podcast, but we estimated that the transformer, which was launched in 2017,
cost around $1,000 to train. If you fast forward all the way to GPT-4 or Gemini,
we estimate that those systems cost anywhere from $80 to $190 million to train. So massively
expensive to be building these
frontier models. So I think if you're a startup that's just starting up now, it's probably going
to be pretty hard to compete with some of these high level systems just because of the cost that's
required to really build one that is hyper competitive. Now there's still a lot of other
things to be done in the AI ecosystem and there's still a lot of kind of efficiencies to be found and still a lot of good work for businesses to do. But I think
it's really the case that if you want to build a frontier model, you need to have a lot of financial
backing. And that's just the reality of the market today. And that obviously has set the market
up in a certain way. As I've said, there's only really a few players that could really afford to
be building those kinds of systems. To me, it's an interesting phenomenon that, you know, as you
mentioned, models are really coming out of the private sector. PhDs are not going into government.
They're going to private companies. This is almost something that's beyond the reach of even
governments or the U.S. government to take the lead in, even though AI
has such tremendous implications for national security, health, and other things that government
is concerned with. Yeah, I mean, it's definitely concerning. And I mean, I think that like,
I think there's two narratives here. I think on the one hand, model costs are still going to go
up. So, for example, the CEO of Anthropic, Dario Amodei, I think he predicted not
too long ago that you'll soon see models costing as much as one billion dollars. And again,
this is because these companies, they understand the scaling laws. I think they're betting that if they
could just feed the systems more data, they're going to get better. At the same time, you're
presumably going to see algorithmic improvements, you're going to see improvements in hardware.
I think it will become a bit less expensive to train some of these frontier systems, but we'll see by what degree.
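Taking the cost figures quoted in this episode at face value (roughly $1,000 for the original Transformer in 2017 and an estimated $80 to $190 million for GPT-4- and Gemini-class models), frontier training budgets have grown by about five orders of magnitude. The sketch below is only an illustration of that comparison; the labels are my own shorthand, the GPT-4/Gemini figure is the low end of the quoted range, and the $1 billion entry is the forecast attributed to Dario Amodei, not a measured cost.

```python
import math

# Training cost figures as quoted in the conversation (AI Index estimates),
# plus the ~$1B figure, which is a forecast attributed to Dario Amodei,
# not a measured cost. Labels are illustrative shorthand.
costs_usd = {
    "Transformer (2017)":        1_000,
    "GPT-4 / Gemini (estimate)": 80_000_000,     # low end of the quoted range
    "Frontier model (forecast)": 1_000_000_000,  # Amodei's prediction
}

baseline = costs_usd["Transformer (2017)"]
for name, cost in costs_usd.items():
    growth = cost / baseline
    print(f"{name:28s} ${cost:>13,}  ~10^{math.log10(growth):.1f}x the 2017 baseline")
```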
I think there's also an important question for the broader community. We have all these capable
closed models. Is there going to be an open model in the conceivable future that is going to be
very capable and really can match the performance of some of these top closed models.
At the moment, the answer to that question is yes. You have Llama 3, which seems to be fairly
competitive with GPT-4, Claude 3, and Gemini on a variety of benchmarks. And that's good for the
broader research community because again, there are so many things you could do with large language
models. And you could probably do them a little bit easier if you have access to an open model whose weights you can freely modify.
Now, if we move into a future where that kind of distribution paradigm changes in some kind of way,
that might create a lot of important inequalities in the development ecosystem that might require
some kind of resolution. But it also becomes, you know, it's a tricky security question and safety question.
Do we want models that are closed or do we want models that are open?
I think some of these developers believe in closing access to models, I think partly for
competitive concerns.
They don't want their competitors to have access to their secret sauce.
I think part of it is also safety and
security concerns. You know, if you openly release a model that is really capable, it can be used for
a lot of nefarious purposes. On the other hand, there are certain developers like Meta that really
seem to believe for business reasons and also scientific reasons about making research openly
accessible, that open is the way to go. So I think that kind
of that tension in the ecosystem is something that I'm keeping an eye on. And that I think
is going to be important to monitor in the near future. I think you're absolutely right. The HPC
community has always been in a leading position for open source technology, starting with one of
the early adopters of Unix when it came about,
and it wasn't quite open source, but it was open, and then Linux, and then the entire stack, etc.
But that I think is definitely a big issue in AI, whether you want to go top to bottom integrated,
or do you want to integrate and then disaggregate? Or do you want to just go open all the way?
Yeah, I mean, it is. And it's an open question, right?
I think there are people that I've spoken to that really champion the open philosophy
of model development and are really confident that philosophy is going to win in the long
run.
I think there are others who believe that there's always going to be a closed model
that's going to be at the top of the pyramid.
So it's something that I don't really know where my own intuitions align.
I could see there being compelling arguments made for both sides. And it's something that we've now
started tracking in this last edition of the report and something that we're continuing to
think about more and more. Yeah, well, definitely the reduction of friction that the open model
provides is a non-trivial attribute. So it really depends on
what sort of contribution you expect from those who adopt it. If you're just looking for their
subscription, then maybe not so much. But if you think they're going to contribute data or insights
or development, then it makes a huge difference, I think. Yeah, I mean, it's also interesting to think about how the kind of economics differ
for a company like OpenAI and Meta.
I think, you know, it seems like Meta really believes
from a business perspective
that it's better for them to open source the model,
partially because it seems like they intend
to deploy their model on a lot of their existing products
and maybe even to develop future products
that, you know, could bring us to the metaverse.
And it seems like they're almost outsourcing kind of part of the development pipeline or
the process improvement pipeline to the community.
And it seems to then save them a lot of resources because you have developers that are outside
of meta that are improving the model and basically doing that work for them, which makes it massively
more efficient to then redeploy that model internally. Now, that's a strategy that they can afford because they have
a litany of products that are used by massive amounts of people. It's a bit different for
companies like OpenAI and Anthropic that don't necessarily have as much of a distribution network
as a company like Meta. So yeah, there are really interesting questions of
economic management on a technological side that, you know, depending on exactly where you are in
this ecosystem, you know, you might prioritize development of models in different ways. And
that's another question I'm very interested in as well as, you know, what's the economics of this?
What is OpenAI? Like, I mean, I was thinking a few years ago, if all these companies
are just basically scaling up transformers, like anything that OpenAI does, you know,
Anthropic can probably do in a few months, Microsoft can do in a few months, Google can do
in a few months. So what is OpenAI's advantage? Is it a distribution one? My thoughts on that have
changed a little bit in recent months. But I think the economics of this tech are also very
fascinating to talk about. Yeah, actually, along those lines, it was always a bit curious that for a technology
that's supposed to be so advanced and out there, why is it that there are a dozen companies
competing? So maybe it's not such a big barrier to entry after all. Yeah, I mean, I think it's not,
you know, these models are expensive but, again, expense is relative, and they're in that price range where
you could have a couple of different companies that are competing. I think I guess what I was
also saying is that, you know, for me, I thought that if there was going to be five or six players
that had pretty comparable models, the strategy for kind of, let's say, newer players like OpenAI or Anthropic
was to either A, build models that were substantially better,
or perhaps maybe build models that people could trust a little bit more.
But I've inherently thought that existing players like Google and Microsoft
had massive distribution advantages because, I mean, realistically,
if Google has a technology that works as well as GPT-4, ChatGPT, are you going to go to a separate
website to use that? Or are you going to use it if it's integrated into your, you know, Google Docs
suite or your Microsoft Office suite? Now, that's what I had thought six months ago. I think it's
still true. But I think it's also important to
say that the kind of rollout challenge of this technology is also non-trivial. As in, you know,
Google now has had a couple of snafus, whether it's the kind of, you know, racially diverse
Nazis from a few months ago with its text-to-image model, or even recently it was trying to integrate
AI into Google and it was suggesting to people that
they should eat rocks, just kind of comical missteps. Yeah. And I mean, this is like one
of the world's richest companies. So clearly rolling the technology out in an effective way
is non-trivial. So I think, you know, that's changed my perspective on things a little bit,
but it'll be interesting to see how things unfold.
Nestor, what is the methodology you use for this work, for the index?
Yeah, it's a great question. I mean, it really depends what kind of data we're collecting. I
think on a high level, we ask ourselves when we start thinking about the report, which is always
a half a year, if not more in advance of the publication, what kind of topics do we think
are really important to cover? And there are certain topics now that we've been covering for
a while, like trends in technical performance, what can the technology do now that it couldn't
really do five years ago, trends in policy, how are governments thinking about it, trends in
economics. But there are always perhaps new things that we think to ourselves, okay, the AI ecosystem
has evolved. These are some new things that we want to consider. Last year, we added a new chapter
on public opinion that looks at how the general public feels about this technology. In the most recent report,
we added two new chapters, one on responsible AI that was spearheaded by a great computer science
student that worked for us, Anka Reuel, and the science and medicine chapter, which was again,
a completely new chapter that we added in this year's report. And I think the inclusion of these new chapters reflected our sentiment that it was important to
talk about these things when the kind of landscape evolves. But we're in a position with the index to
be blessed to have the advisory services of a committee of really excellent AI thought
leaders that sit on our steering committee. We dialogue also with a lot of Stanford professors that give us their perspectives on what we should cover.
And we typically tend to partner with existing data vendors if there is existing data that is
available on particular topics. But if there are certain data points that are unavailable,
like for example, the geopolitical affiliation of different foundation models, that's something that
we then work to collect ourselves. Excellent. Excellent. And what is next? What do you see for the next edition?
I think a couple of things. So I think that, I mean, the report I think is institutionalized,
it's in a good position. I think we're going to look more or less at the same kind of category
of topics. I'm really interested to see the economic data. I think it's going to
be really interesting to me to see how much businesses are using this technology. You know, I cited some
studies which suggest productivity impacts. These are more kind of micro level. I'd be really
interested to see if there are macro level impacts that we might hypothesize down the line. Also,
the index team is working hard on some, let's say, smaller scale
projects that work to paint a picture of what's going on in the AI ecosystem. In a couple of
months, we're going to be relaunching our global vibrancy tool, which is a tool that is more of a
traditional index that ranks how well different countries are doing in the AI ecosystem. We also
have some analyses forthcoming that look at the amount of money that governments
spend on AI. There's actually no systematic measure that comparatively analyzes how much
different countries on a public level have spent in AI. We're trying to be the first that kind of
publish some research on that kind of topic. A lot of kind of exciting projects for us in the
pipeline. And of course, you know, in this kind of job, you always have to refresh your Twitter or your X every five minutes, because it seems
like with AI, there's a new development that comes up very rapidly. And would you tell us a little
bit more about how Stanford has organized its AI activities across its various departments, and
certainly with the project that you're leading? Yeah, I mean, the index is housed at Stanford HAI. We're actually celebrating our fifth
anniversary tomorrow. And I think Stanford HAI was really created to keep the human in the loop
with AI. And I think it really believes in doing AI from a very interdisciplinary perspective. So
the institute itself is actually under the Dean of Research at Stanford, and we collaborate with legal scholars, medical scholars, you know, artists, more traditional computer scientists, because I think we really feel at the Institute that AI is a technology that could broadly benefit humanity.
But in order to ensure that it can broadly benefit humanity, we need to solicit a variety of perspectives. So I think that
is really fundamental to the mission at HAI. And that really fundamentally drives forward a lot of
the great work that is being done at this institute. Very multidisciplinary. Absolutely. Yeah.
Yeah. And I think Stanford has always led in that, well, has been one of the leaders in that
multidisciplinary approach. Now you've also been at Harvard and Oxford, also institutions of note. Are you seeing any difference in the way those guys approach it? Or is it generally a global effort? I mean, I think Oxford certainly has fairly robust programs when it comes to AI. I think they have some good robotics
things from a pure computer science perspective. I think they also excel a lot when it comes to
questions of AI governance. So I think there's a leading center on AI governance called the
Center for the Governance of AI that kind of spun out of the university. I think Harvard also has
some similarly strong institutions. But I mean, all three of those institutions do good
work with the technology, but I'm certainly a bit more familiar with what's being done at Stanford
than the other two. Brilliant. Well, we'd like to possibly invite you in advance on the occasion of
the next index coming up, I assume next April. Yeah, it would be a pleasure. I mean, it's
interesting to think about that because I wonder to myself, where's the technology going to be in a year? What's going to happen?
And it might be interesting for me to listen back to this podcast at this point next year and,
you know, think to myself, wow, I can't believe I actually said that. How stupid was I?
Or perhaps, wow, that was actually very genius. I know what I'm talking about. So
we'll see if my predictions have any weight. Well, we'll hold you very genius. I know what I'm talking about. So we'll see if my predictions have any
weight. Well, we'll hold you very closely. We'll have a profundity index.
Excellent. Thank you, Nestor. It was excellent. Yeah, it was a pleasure speaking with you guys
today. All right. Thanks so much. That's it for this episode of the At HPC podcast. Every episode is featured on InsideHPC.com and posted on OrionX.net.
Use the comment section or tweet us with any questions or to propose topics of discussion.
If you like the show, rate and review it on Apple Podcasts or wherever you listen.
The At HPC podcast is a production of OrionX in association with Inside HPC.
Thank you for listening.