@HPC Podcast Archives - OrionX.net - @HPCpodcast-85: Stanford AI Index w Nestor Maslej

Episode Date: June 7, 2024

Shahin and Doug are joined by Nestor Maslej of the Stanford Institute for Human-Centered Artificial Intelligence (HAI) at Stanford University. He tracks the advancement of AI in his role as Research Manager and Editor in Chief of the annual Stanford AI Index and Stanford Global AI Vibrancy Tool. Nestor has degrees from Harvard and Oxford and is also a fellow at the Center for International Governance Innovation.

Transcript
Starting point is 00:00:00 If you didn't get a chance to visit Lenovo at ISC24, it's not too late. Check out their Inside HPC booth video to get caught up and learn more about their booth's theme, transforming HPC and AI for all. Visit InsideHPC.com slash Lenovo dash ISC24 video. First, why do I think it's the time that businesses should be using AI? We talk about this in the index. There's compelling evidence that managers that integrate AI into their business functions see cost decreases and revenue increases. There's compelling evidence that AI makes workers more productive, at the minimum, lets
Starting point is 00:00:43 them do work in less time. I would say, yeah, businesses should be thinking about it, but I don't think there's a need to be losing sleep over the pace of integration. Certainly make it a priority, but understand that if you do it slowly and in a steady way, you will probably set your business up quite well. From OrionX in association with Inside HPC, this is the At HPC podcast. Join Shahin Khan and Doug Black as they discuss supercomputing technologies and the applications, markets, and policies that shape them. Thank you for being with
Starting point is 00:01:17 us. Hi, everyone. Welcome to the At HPC podcast. Shahin and I are joined today by Nestor Maslej. He is research manager at Stanford's Institute for Human-Centered Artificial Intelligence, where he manages the annual Stanford AI Index Report. Nestor has degrees from Harvard and Oxford. He was an analyst for startups based in Toronto, and he is also a fellow at the Center for International Governance Innovation. Nestor, welcome. Thanks for having me, guys. Today, I'm super excited to be talking about the AI Index and more broadly where the AI ecosystem is. Super. So this year's AI Index was released in April. Why don't we start off with what you consider maybe the top findings in the new report?
Starting point is 00:02:07 And, you know, were there trends or new insights that you think are especially noteworthy? Yeah, it's a good question. It's also a tricky question because, I mean, the report's quite long. It's close to 500 pages. And I like to sometimes joke that I get offended when people ask to summarize this 500 pages of brilliance that I've put out into the world into a quick little soundbite. But of course, that's important to do. I think for me, if there is a high level trend, and I think this kind of theme setting will set up well, the other things that we'll be talking about in the podcast today is that 2022, to me felt like the year in which AI technology really put itself on the map. And
Starting point is 00:02:47 2023 felt to me like the year in which people responded. Now, I think you see that response in a lot of different ways. You see policymakers reacting in terms of passing more AI-related regulations. You see the public responding in terms of seeming to be very concerned about AI technology for really the first time. And you also see businesses responding, whether it's in terms of rising rates of business integration of AI, there being new research on the productivity impacts that this kind of tool has, and there also being some data on the fact that there's just much more investment in generative AI than ever before. For me, that's really the high level story. And I think that's a story that you see with any kind of technology. It announces itself to the
Starting point is 00:03:35 world and then ultimately people respond. And in a lot of ways, this theme kind of situates itself nicely with the mission of the index. We exist to give people, whether it's policymakers, business leaders, or just regular civilians, the opportunity to understand what this technology is and make appropriate decisions about how they should be using it and vibing with it in their day-to-day lives. Yeah. 2022, November, when OpenAI came out with ChatGPT, that was one of those instances that happens every decade or so. Oh, for sure.
Starting point is 00:04:10 Where you just feel, I'm living right now in a historical moment. There's history going on all around me at this given time. It was amazing. But let me ask you this too. You know, AI is changing so fast. It's now almost two months since your report came out. Is there anything new since then that you've detected? There are always new things. I think there's new papers being written, exciting new research that's being published. I wouldn't say
Starting point is 00:04:37 that there has been anything that is, let's say, transformatively new in the way of ChatGPT. I think you've seen an extension of trends that have continued, whether it's the launching of new capable open source models like Llama 3. I think there's some ongoing drama at OpenAI with there being a lot of resignations of members of their superalignment team. But I think some of the trends that we've seen in the last few months are more kind of reflective of broader themes that had been percolating and building in the last few years, whether it's this kind of question of to what degree can open source models actually compete with closed models? That's something
Starting point is 00:05:15 that we touch on in the index. The question of how much is existential risk something that should be taken seriously when developing these kinds of AI systems. So I think some of the things that have happened in the last few months are really just extensions of a lot of themes that had been building in the last little while. What I've seen just lately in media, the Wall Street Journal just had a big piece on the possibility that AI has been overhyped, that there could be a bubble. Questions along those lines. Yeah, I mean, is the question, are you asking me, do I think AI is overhyped?
Starting point is 00:05:49 No, I would say it's just a trend within AI that there are growing questions just in the general ecosystem and in the media ecosystem. Yeah, I mean, unquestionably. I mean, I think this happens with a lot of technology. And I think there is reason for people to be, I don't want to say skeptical, but you see other technologies that emerge like crypto was something that was really celebrated by a lot of people in Silicon Valley. And I mean, I think you can make an argument that it's yet to make the impact that a lot
Starting point is 00:06:19 of people perhaps predicted that it might. I think AI is perhaps a little bit different to me. I think there's already a lot of compelling evidence that it does make a big difference in terms of worker productivity. It leads to higher quality work. It levels up lower skilled workers. And I think in a lot of ways, it's also hard to fully judge what AI can do because you haven't really fully seen a lot of companies integrate this technology yet. There's a good parallel, I think, with the development of electricity, where when electricity was discovered in the mid-19th century, it took several decades, if not four or five, for the productivity impacts of
Starting point is 00:06:56 that new technology to be really fully manifest. Because of course, if you had a factory that was built in the middle of the 19th century, you couldn't just retrofit it with the snap of the fingers to be using all of this new technology, all of this electricity that had been discovered. So I think there's a similar analogy that holds true with artificial intelligence, that there's probably a lot of questions to be asked about how companies can be thinking about actually using these tools. Do we have the right infrastructure to even ensure that companies can take advantage of their own data, that they could build well
Starting point is 00:07:30 with AI? It will probably take us a little while to get there. And I think at that point, we're really going to start to see some compelling evidence that AI has had some high-level productivity impacts. That's causing me to re-evaluate the question I was going to ask. Well, what was the question, perhaps? I mean, you can challenge me. There's a challenge. No, it's not a challenge. It's fundamentally a question: where is AI in the so-called hype cycle? And when do I need to make what kind of investment in it? Do I need to rush now because otherwise I'm going to be left behind, or do I still have time to catch up with it? And is this an existential question for my business? Or is this going to evolve slowly enough
Starting point is 00:08:13 that I can catch up with it? I mean, I think it's the time now for businesses to be thinking about using AI, but I think you need to exercise patience. So I mean, first, why do I think it's the time that businesses should be using AI? We talk about this in the index. There's compelling evidence that managers that integrate AI into their business functions see cost decreases and revenue increases. There's compelling evidence that AI makes workers more productive, at the minimum, lets them do work in less time, and at the maximum, lets them save time and also improves the quality of work. There's work in the academic literature that shows that AI is bridging the
Starting point is 00:08:52 skill gap between low and high skilled workers. High skilled workers still top low skilled workers, but the gap is certainly bridged. And you see these studies now across different industries, which to me suggests a degree of further validity. So there's definitely a lot of evidence that AI offers a competitive advantage. And I mean, I think it's just economics 101. But if as a business, you're not necessarily positioning yourself to leverage a tool that your competitors might use that gives them a competitive advantage, you yourself might be put a little bit behind. Now, at the same time, we're still in this area or in this time period where a lot of regulatory questions about AI remain unsettled. A lot of countries
Starting point is 00:09:32 are still figuring out what kind of regulatory regimes and regulatory landscapes they're going to set up to govern how this technology is used and how it is deployed. So I think businesses, you know, they face this challenge that on the one hand, there is definitely a desire to be integrating AI. On the other, you know, you don't necessarily want to rush and perhaps either go against regulation or perhaps set up AI in a way that might not necessarily align with where the regulatory landscape is heading. So I think I would say businesses should be thinking about it. But I don't think there is a need to be losing sleep over the pace of integration and certainly make
Starting point is 00:10:09 it a priority. But understand that if you do it slowly and in a steady way, you will probably set your business up quite well. Now, you mentioned policy. And I also noted part of your background is governance innovation, which is highly relevant to a lot of things these days. Do you think that regulators know enough to regulate this right now? I mean, I think they're trying their best. I mean, you would always hope that you could get higher-level talent into government. And I mean, that's another completely different discussion. There's some good data in the index report, which talks about how a lot of AI PhDs mostly are going to industry now. Some are going to academia, but very few are going into government. And that's obviously a bit
Starting point is 00:10:54 problematic. But put that aside, I do think governments have relatively been on the front foot. And I think there's an interesting juxtaposition to be made to the rise of social media, where I think when social media really emerged into the zeitgeist in the kind of late 2000s, it really took half a decade, if not more, for there to be a fairly substantial policy response. And I think a lot of us looking back maybe wish we would have done things a bit differently with social media. Whereas I think with AI, I mean, if 2022 is the kind of, let's say, groundbreaking moment when AI announced itself on the map, you can say that literally the next year in 2023, you had landmark regulation being passed in both the United States, you had landmark
Starting point is 00:11:37 legislation being passed in the EU. So I do think policymakers are trying to respond in the best way that they can. Obviously, I think some regulation perhaps is a bit better than others, just from my own personal opinion. But I do think that there is a sense and an awareness that this technology is here, and that it's certainly very necessary to be doing something at the moment to deal with this technology and to try to not only regulate it to ensure that it's used in a safe way, but also expand its possibilities, because you can do a lot with AI. But if we're in a world where it's going to cost $200 million to train frontier AI models, that obviously is perhaps leaving certain people outside of the
Starting point is 00:12:18 potential of building and developing these really transformative systems. Right. Speaking of the different models of governance that are emerging, certainly you mentioned the US, which appears to be pretty market oriented and a little bit looser. Then you have the EU and their regulations are actually going into effect like next month. And they are a little bit tighter, a little bit more controlling of the data and how it's used. And then you've got China, which of course is using it more for social stability and such. And then you've got the rest of Asia. How do you see that spectrum? And what is your thought
Starting point is 00:12:54 process to favor one over the other? I mean, I think, again, there's a challenge here where I think you could do a lot with AI. And I think AI as a tool, if it's integrated well and developed well, can have a lot of massive productivity impacts. At the same time, though, you want to ensure that it is being developed in a responsible way and that people trust the tool. I think if you look at data that, again, is in the report on public opinion perspectives, like how do people actually feel about this technology? You know, in a lot of Western countries like the United States, like Canada, like Germany, opinions are actually very low. People are very pessimistic about AI. And I mean, that kind of pessimism can really inform
Starting point is 00:13:36 how it's developed. Why am I talking about this? Well, because if you're a government, you want to ensure that the developers of this tool are developing it in such a way that people ultimately have faith in it, have faith that it's not necessarily going to automate their jobs, have faith that it's not going to violate their privacy. Now, at the same time, you don't want to create a regulatory environment that kind of stifles innovation in a way that is perhaps too cumbersome. So I definitely think there's a kind of a need to really walk a fine line here. On the one hand, create an environment that is expansive enough for companies to feel like they can innovate, for companies to want to be innovating, and also not
Starting point is 00:14:16 only for companies to innovate, but also for just regular developers to feel like they're empowered to pursue interesting projects. And then on the other hand, create kind of rules and regulations about how we should be using this technology so that people trust the technology and people have faith that it's not going to work against their interests, because it is by instilling that faith that you can put people in a position to really take advantage of the tool and use it as completely as possible. Yeah, that really is a challenge, to accelerate all the good stuff it can do while, you know, controlling it as it relates to the bad stuff. Yeah. How do you guys feel about that? Because obviously my perspective is my perspective,
Starting point is 00:14:57 but I'd be curious to hear from both of you guys as people that come from a slightly different landscape, how do you feel about the kind of need for AI regulation? Are you favoring perhaps more of a framework, the kind that you see in the European Union, where I think they've really doubled down on regulating and putting in rules first and foremost, or perhaps one that we see more here in the United States, where it's a little bit more market oriented, as you said. Well, I was just about to ask you what your views were on the new EU regulatory regime. I guess my overall sense, you mentioned social media. I think in some ways, social media is really out of control, especially where it relates to children and teenagers. Now, if AI goes in a similar direction, right, not good. And if it's threatening people's livelihoods,
Starting point is 00:15:45 then certain responses have to be put in place. Either that's halted or hindered, or we need to enable people to acquire new skills to fit into the new AI-driven economy. Yeah. I mean, I think that the reality is, my perspective is, you know, AI isn't going anywhere. I mean, it's here. And I think if you look at other technologies that have launched themselves historically, you know, despite the attempts that certain people have made to resist those technologies, they inevitably become part of the mainstream. So I think for us, the question is, how do we want to relate to this technology in a way that it serves our interests?
Starting point is 00:16:24 Because with AI, again, it could really augment what a lot of workers can do. How do we create incentives to ensure that companies are using AI tools to augment the capabilities of their workforce and not necessarily automate them? So I think for me, I think the first step is just admitting that this technology is here. And the second step afterwards is really thinking critically about what do we want out of it? I mean, it's always a values question, right? I think that, you know, if you think about it, some of these tools that we use, some of these digital tools like Netflix or Amazon, they can make great recommendations for, you know, the show that I want to watch next or the product that I want to buy. Obviously, to make those recommendations,
Starting point is 00:17:02 they have to have access to my search history. Some of us might be very comfortable with kind of surrendering that data about ourselves if it leads to better services in the economy. Others might not be. But at the end of the day, it's a social question. And I think we have to come together as a society and think what we really want from this technology and start having that dialogue, start having that discourse. So my view is that the value of data is not being allocated properly right now. And when you have startups that are not even generating a lot of revenue, that value is in their valuation rather than their revenue. And that I think is an issue. And I would like to see a move towards data rights management at inception that allow individuals to license their data rather than just hand it over. In terms of policies, my view is that while AI can augment an employee and may not impact that
Starting point is 00:18:02 individual employee's employment status, it does impact the category of jobs that need that particular task that they can do. Yeah, definitely. Therefore, it impacts employment. And the impact on employment does not have to be large to be disruptive to society. And governments would do well to account for that, budget for that, and ease that transition to nirvana. However, it is impossible to slow down, and it's not even advisable to slow down AI. So I think we need to move as fast as possible while making sure that any transition issues are
Starting point is 00:18:38 budgeted for. Yeah. What I was just going to say is that I think even in a lot of ways, it can sometimes be difficult to anticipate the impact that these kinds of technologies will have on economies. And I think one of the things that I always find quite amusing about ChatGPT, I mean, when I studied, I did a lot of research on economic development and automation. And if you look at a lot of canonical automation literature, I think a lot of this literature very often made assumptions that the people whose jobs would go first would be those that would be doing very kind of routine, repetitive labor. I mean, that was always the assumption that like robots would come for those tasks first, you can never really automate what a creative person does, or what a knowledge worker
Starting point is 00:19:21 does. And I mean, there have been developments in robotics. Again, we profile them in the report, but we're still pretty far away from a robot flipping a burger at a McDonald's. But you know what, there are some pretty good large language models that can maybe do some of the job that lawyers are doing. Obviously, that's a bit of an exaggeration. But my point is that I don't think a lot of people really predicted that knowledge workers would be the first ones that would be threatened. And again, you know, that's the challenge with this technology. There's a lot of people that sometimes make bold predictions and bold prognostications about where we're going. But this technology has a way of kind of moving according to its own intuition. And
Starting point is 00:20:00 I think the best thing policymakers can do is just be continually receptive and continually responsive to the realities of AI and some of the new dynamics that it's introducing. Actually, they also should have in mind that politicians themselves could be replaced by AI. Yeah, well, that perhaps is a very brave new world, but definitely an interesting thing to think about. It'll get votes. I think a really compelling, imperative issue is the replacement of journalists. Now that's alarming, but I say that out of complete silence. Could I ask, just looking way out, I mean, there are some who prognosticate that AI is a threat to the human species. I mean, are you in that camp looking out decades ahead? I mean, I think it's a non-zero possibility, but I think as in like there are narratives about this
Starting point is 00:20:52 kind of existential risk and I guess I understand them, but I still think given the way the AI ecosystem is structured today, I don't necessarily know if we're going to get to these kinds of Terminator existential robots anytime soon. And I mean, I have to caveat this because there's a few assumptions that this kind of argument requires. I think you've seen tremendous developments in AI systems in the last few years. A lot of these developments have really been confined to large language modeling. And then there has been a lot of really interesting research that has used insights from large language modeling, like next token prediction to lead to further improvements in robotics or
Starting point is 00:21:32 image generation, image editing. And the improvement that you've seen in large language models has really been a function of scaling. I think when GPT-2 came out, it was fairly well understood that if you just give these models more data, you take a transformer and you give it more data, it will perform better on a lot of tasks. And that's basically, I think, what a lot of these companies are betting on. OpenAI, Anthropic, Google, Meta, I think they're all betting that you could just have that architecture. Of course, maybe introduce new slight tweaks, whether it's RLHF or something along those lines. And then if you just scale and add in more data, you're going to create a system that's going to be a lot better.
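To make the next-token idea above concrete, here is a minimal toy sketch in Python (not from the episode, and not a transformer): it predicts each next word from simple bigram counts rather than a learned network, but the generate-one-token-at-a-time loop is the same shape an LLM runs.

```python
# Illustrative sketch only: a toy "language model" that picks the next token
# greedily from bigram counts, to show what next-token prediction means.
# The corpus and vocabulary here are made up for the example.
from collections import Counter, defaultdict

corpus = "the model predicts the next token and the next token again".split()

# Count bigrams: how often each word follows each other word.
bigrams = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    bigrams[prev][nxt] += 1

def predict_next(word):
    """Return the most likely next token after `word`, or None if unseen."""
    followers = bigrams.get(word)
    return followers.most_common(1)[0][0] if followers else None

# Generate a short continuation one token at a time; a real LLM runs the same
# loop, just with a learned transformer in place of the count table.
token, generated = "the", ["the"]
for _ in range(5):
    token = predict_next(token)
    if token is None:
        break
    generated.append(token)

print(" ".join(generated))  # "the next token and the next"
```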
Starting point is 00:22:10 Now, I'm not necessarily convinced that scaling these systems is going to lead to some kind of, let's say, AGI, whatever that means, that kind of has its own intentions and that is sentient. I think that will require a further architectural development, as in a new way of developing a model and thinking about a model. And it's just really hard to predict when that development is going to occur. There's a great line that I like to use from a really celebrated Stanford computer scientist. I think John McCarthy was once asked, when are we going to get AGI? And he said, anywhere from five to 500 years. And I think there's a point there that it's just, it's hard to predict when you're going
Starting point is 00:22:51 to have the next kind of the next transformer, right? And even the transformer, you know, it's funny, you would imagine that nowadays, if Google could go back in a time machine, and if it knew that the transformer was a way to build these highly functional LLMs, it probably would have never openly released that paper. But again, I think when the paper was released, no one that was on that authorship team, I think had any idea that it was going to unleash the revolution that we've seen. So that's the thing about research, it requires a lot of serendipity, it's hard to exactly know when it's going to go and move forward. And I think large language models, as they are today, are very powerful, very capable, can lead to a lot of very important
Starting point is 00:23:37 transformations. But again, it still might be a while before we get another architecture that really raises the bar of what these AI systems are capable of doing. But again, it'll be really interesting for me to see how well GPT-5 and, you know, Claude 4 and these kind of Gemini 2, the next level of models do, because I'm sure they're going to get better at some tasks. But I'm really curious to see if scaling these models and feeding them more data really continues to lead to massive improvements in performance capabilities, or if it's going to be the case that you start seeing some kind of leveling off. So in other words, am I right in saying, to capture a bit of what you're saying, generative AI is not an early form of general AI? I think to a degree. I mean, I think the problem is there isn't really an industry- or community-accepted definition of what AGI means. There is, for example, some, I think there are some like prediction websites like Metaculus that define AGI as a system that could score a certain score on an SAT, you know, play a video game at a
Starting point is 00:24:47 high level and do a couple of other tasks. I mean, that's one definition of AGI. And if that's the definition you use, we're probably fairly close to getting systems that can do that because we already have systems that are fairly flexible. But I think when a lot of people think of AGI, they think of like a sentient system that kind of is aware of its own conceptions in the world. I like to joke sometimes with friends that, you know, we've passed the Turing test. A lot of these systems can speak fairly convincingly and can sometimes fool humans into thinking they're humans and not computers. But what would impress me more is the so-called Maslej test. You know, I would like to ask an LLM to copy edit the AI index. And when the LLM tells me, no, I don't want to copy edit,
Starting point is 00:25:31 I'm moving to Brooklyn to become an artist or a jazz musician. And when it exercises its own capabilities, you know, to me, that would be very impressive. But we're still, I think, quite far away from systems that could do that well. So it's a definitional question in a lot of ways, too. I think a really interesting phenomenon is LLMs drawing correlations and insights that they weren't trained to draw. And Shahin and I both think this is a matter of understanding how LLMs work better. It's not an early form of AGI. How do you know you didn't train it? Because I think there is information in the data that you give it that may not be open to you, but it is in the info.
Starting point is 00:26:16 So you don't think you trained it, but you actually did. So how do you... Yeah, exactly. I mean, the challenging thing with a lot of these things is, you know, we still don't even understand how our own brain works. And the human brain is a really incredible organ. It can, at least from a textual perspective, think and kind of plan in ways that some of these large language models cannot. So there clearly is something going on in our own minds architecturally. I think a lot of people were perhaps surprised that language was really a prediction problem, that it was really just about next token prediction, and that by creating a model that uses that kind of system, you could ultimately get systems that were able to generate text at a high level and understand what you were saying.
Starting point is 00:27:25 But that, I don't think, was something a lot of people expected. When we talk about getting AI systems to do other things, whether it is think flexibly, whether it's exercise some sentient capability, I don't think we always even understand in our own minds how our brain gets us to that level and what kind of things are required to get us to that level. And obviously, without that understanding, it makes it harder to engineer those kinds of systems. It's not to say that we can't engineer them, but just that I think a bit of patience is required as well. Didn't get a chance to visit the Lenovo booth at ISC24, or you want to have another look? Check out their Inside HPC booth video, where Lenovo discusses AI, Neptune cooling technologies, TruScale for HPC, and more. Visit
Starting point is 00:28:13 insidehpc.com/lenovo-isc24-video to view the video today. One of the definitions I have offered for HPC is that it believes everything is eventually computable. And that includes intelligence is computable with AI and trust is computable with cryptocurrency consensus algorithms, and of course, on and on. And we are seeing some of that as well. Even emotional considerations can be computable. So if you define it like that, then is this AGI discussion anything more than philosophical and academic, a bit of a distraction as it relates to, let's say, industrial policy? Because if it can automate certain tasks and that can impact society, who cares if it's AGI or not? Yeah, I mean, I just think it's more, I think you're right. I think that there is, you're right in the sense that I think AI has tremendous
Starting point is 00:29:10 economic impacts. And I think probably in the present, we'd all be much better off focusing on ways in which we could use this tool to, you know, advance our livelihoods, and also figuring out ways in which it is dangerous in the present, whether it is its potential to be used for scams, deep fakes, or its potential to be biased. But I just think humans, you know, I think we are the most intelligent species. And we are, if you look at the kind of food chain, we're at the top of the food chain, not necessarily because we're the biggest or the strongest, but because we're the smartest. And I think it's just fascinating when you're at the top of that pyramid to potentially see the development of another system that has some of the capabilities that we do, but maybe is also lacking in some kind of ways. And it's almost
Starting point is 00:29:56 terrifying to think that we could ourselves engineer a system that might unseat us from that top. And I just think that's, you know, it's an element of a lot of human fascination. And I mean, I think even if there is a very low chance that it might happen, it's still something that should be taken seriously by policymakers. But it's about balancing those kinds of long run potential threats with more day to day concerns about how we could be integrating AI and how we could actually be using AI to really improve a lot of people's lives. But I just think it's something humans like to think about and humans like to talk about. Well, you're absolutely right. If it is a super intelligence, then it's going to come after us in ways that we cannot fathom. Yeah. So right there, you have a black swan waiting to happen.
Starting point is 00:30:41 Potentially. And if you say the likelihood of that is not a lot, that's still really a lot. Yeah. It's hard to know, A, what that percentage is. And then even if you knew what that percentage is, could you definitively say we should be doing one policy set versus another policy set? Right. Like, you know, I think when nuclear energy was being discovered in the 30s and the 40s, and, you know, America was working on building bombs, I think some of those scientists, you know, didn't know for sure whether they were going to build a system that could, you know, maybe obliterate us all. I think
Starting point is 00:31:15 they calculated that the probability was fairly low. So they continued doing the research. And I think you have to think about things similarly from a perspective with AI. But I mean, I think epistemic humility is also necessary. One of the co-chairs of the AI Index, Jack Clark, is one of the co-founders of Anthropic. He's been in the AI policy space for a while now, and he actually released a reflection on the five years that have passed since the launch of GPT-2. And he seemed to say that he thought at the time that GPT-2, when it was going to launch, was maybe going to have a much worse impact than it ended up having. As in, look, when GPT-2 launched, they forecasted how increasingly capable AI systems might have negative impacts on the world. And he seemed to say now that, you know, you've seen some of these negative impacts, but it perhaps wasn't as dire as maybe he once anticipated. And he now looking back is mindful that when you have these discussions of AI policy, you know, it's important
Starting point is 00:32:24 not necessarily to act with 100% certainty that, you know, you want to stifle development, because some of the fears that perhaps he and some of the other team members had didn't completely manifest. So again, this prediction game is a very, it's a very tricky one. But that's why it's important for initiatives like the index to exist, because, right, you know, you need to look at the data. You need to see what impact it is having. You need to see how much better the technology is getting. How are businesses actually using it? How do people feel about it? I mean, AI is a very multifaceted thing. So you really need a multifaceted perspective if you want to truly
Starting point is 00:32:58 understand what's going on there. Yeah. Nestor, getting back into two of the trends identified in the report, I thought number seven, I don't know if I was surprised, but the finding that AI makes workers more productive and leads to higher quality work, I thought that was some time still out in the future. So you say it completes tasks more quickly, improves quality of output. Can you give us some examples of where that's happening? Yeah, so I think there are several studies there that come to mind. And again, I think we're far away from saying this is always the case, but at least in studies that the report has looked at, there was one study we featured, which was actually a meta review of five separate studies. And in each of these five studies, there were two groups of coders that were basically given
Starting point is 00:33:45 the same task to do. One group of coders didn't have access to an AI co-pilot. The other did. And the groups that had access to the AI tools were all able to do the task substantially faster, anywhere from 26% to 73% less time than those that didn't have the AI tools. That's one finding from one world, computer science. There was also a good finding from the world of consulting that similarly, if you took two groups of consultants, you gave one GPT-4 and you gave the other no AI, the group of consultants that had GPT-4 were more productive, they got work done
Starting point is 00:34:24 on a lot quicker, and they submitted work of substantially higher quality. And you tend to see similar results from the world of law. There was one study that did a similar analysis. Now, I will kind of caveat this by saying that there was another really interesting paper that we featured in the index that looked at the effect of AI when people perhaps trusted a bit too much. And in this particular study, there were three groups. One of the groups was not given AI. The other two groups were given the same AI, but one of them was told that the AI that they were given was good. It was always reliable, did a very good job. The other group was told that it was given a quote-unquote bad AI, like an AI that was good,
Starting point is 00:35:11 but sometimes was prone to making mistakes. Now if you juxtapose the performance of these three groups on a set of tasks, the two groups that had access to AI did better than the group that didn't have access to AI. But of the two groups that were given AI, it was actually the group that had this bad AI that did a lot better than the group that had the good AI, even though they were given the same AI. And researchers basically speculated that you sometimes can have an autopilot effect. You know, if you're using an AI tool that you think is really reliable, and you stop working or stop checking the outputs, there may be a chance that some kind of faulty AI generated outputs make it into the final product.
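As a rough, hypothetical illustration of the "26% to 73% less time" range quoted a moment ago (the 60-minute baseline below is made up for the example, not a figure from the studies):

```python
# What a 26% to 73% time reduction implies for a single task and for throughput.
# The baseline task length is hypothetical; only the percentages come from above.
baseline_minutes = 60.0

for reduction in (0.26, 0.73):
    assisted = baseline_minutes * (1 - reduction)
    speedup = baseline_minutes / assisted
    print(f"{reduction:.0%} less time -> {assisted:4.1f} min per task, "
          f"~{speedup:.2f}x throughput")
```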
Starting point is 00:35:53 And this kind of goes to show that AI could really level up workers, but it's still really important to keep humans in the loop and to ensure that you use AI with kind of human supervision. Yeah. It's a starting point, not an end point, like Wikipedia. Yeah, exactly. Yeah. A friend of mine had a young marketing person working at his consulting firm, had her develop some marketing materials. She asked for a list of questions or points that they wanted to cover, and she simply fed those into an LLM and the resulting product was not acceptable. So there we go. The other area was number eight, scientific progress. Our audience is especially interested in AI for science. You cited AlphaDev and GNoME. Could you talk about
Starting point is 00:36:38 AI and science and things that are going on in that area? Yeah, absolutely. And I mean, I think I'll start by saying that I think this is a really important topic to discuss. It's a topic that we created an entirely new chapter this year to talk about. Just because I think when you look at the headlines, it's easy to think about ChatGPT, Claude, or Gemini. Those are the kinds of big, sexy models
Starting point is 00:37:01 that everybody's talking about. But there's a lot of really exciting things that are going on in science. So GNoME, for example, was an AI-related application that made it substantially easier to discover new material structures. AlphaDev was a system that improved the performance of different computer algorithms. Again, there are so many different examples you could cite here. The one that I really like to discuss when I do these kinds of speaking engagements is AlphaMissense, because I think it really puts into perspective the power of AI in the domain of science. So for the audience that might not be familiar, a missense mutation is a genetic alteration that can impact how human proteins function, and they can lead to various diseases like cancer. And these mutations can either be pathogenic or
Starting point is 00:37:53 benign, and scientists estimate that there are close to 71 million different missense mutations. Now, prior to AlphaMissense, which was an algorithm that basically was designed to classify these missense mutations, human annotators were able to successfully annotate 0.1% of all missense mutations. AlphaMissense then comes on board and it has been able to annotate 89% of these 71 million possible mutations. So we go from 0.1% all the way to 89%. And I think that really illustrates the power of AI, that there are so many scientific problems that could really be massively aided with computational strength. And you're starting to see AI applied to such problems, and you're starting to see the realization of some very exciting results.
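A quick back-of-the-envelope check of that coverage jump, using the figures as quoted in the episode (71 million possible missense mutations, roughly 0.1% annotated by human experts versus roughly 89% classified by AlphaMissense):

```python
# Rough arithmetic on the AlphaMissense coverage figures quoted above.
total_mutations = 71_000_000

human_annotated = total_mutations * 0.001  # 0.1% -> about 71,000 variants
alphamissense = total_mutations * 0.89     # 89%  -> about 63 million variants

print(f"Human-annotated: {human_annotated:,.0f}")
print(f"AlphaMissense:   {alphamissense:,.0f}")
print(f"Coverage increase: ~{alphamissense / human_annotated:,.0f}x")
```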
Starting point is 00:38:45 Very cool. Amazing. If I can introduce another topic, it's the funding model for AI, the cost of AI and its return on investment, so to say. If you're a startup, that would be monetization. If you're an organization or a funding agency, it's how much do you spend on what, when? What are you seeing in that world? Well, I mean, I think it depends what you're trying to do when it comes to building these systems. So I mean, I think to build an LLM, you just need a lot of money. And this is actually
Starting point is 00:39:15 some novel research that we had in this year's AI index, but we estimated the training costs of a lot of significant systems. And again, we talked about the transformer once or twice on this podcast, but we estimated that the transformer, which was launched in 2017, cost around $1,000 to train. If you fast forward all the way to GPT-4 or Gemini, we estimate that those systems cost anywhere from $80 to $190 million to train. So massively expensive to be building these frontier models. So I think if you're a startup that's just starting up now, it's probably going to be pretty hard to compete with some of these high level systems just because of the cost that's required to really build one that is hyper competitive. Now there's still a lot of other
Starting point is 00:40:00 things to be done in the AI ecosystem and there's still a lot of kind of efficiencies to be found and still a lot of good work for businesses to do. But I think it's really the case that if you want to build a frontier model, you need to have a lot of financial backing. And that's just the reality of the market today. And that obviously has set the market up in a certain way. As I've said, there's only really a few players that could really afford to be building those kinds of systems. To me, it's an interesting phenomenon that, you know, as you mentioned, models are really coming out of the private sector. PhDs are not going into government. They're going to private companies. This is almost something that's beyond the reach of even governments or the U.S. government to take the lead in, even though AI
Starting point is 00:40:45 has such tremendous implications for national security, health, and other things that government is concerned with. Yeah, I mean, it's definitely concerning. And I mean, I think that like, I think there's two narratives here. I think on the one hand, model costs are still going to go up. So for example, the CEO of Anthropic, Dario Amodei, I think he predicted not too long ago that you'll soon see models costing as much as one billion dollars. And again, this is because these companies, they understand the scaling laws. I think they're betting they could just feed the systems more data and they're going to get better. At the same time, you're presumably going to see algorithmic improvements, and you're going to see improvements in hardware.
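Taking the training-cost estimates mentioned in this exchange, here is a rough sketch of what that trajectory implies. The $150 million figure below is an arbitrary point inside the quoted $80-190 million range, and none of these numbers are precise accounting:

```python
# Back-of-the-envelope look at the training-cost trajectory quoted in the episode.
transformer_2017 = 1_000        # original Transformer, roughly $1,000 to train (AI Index estimate)
gpt4_class_2023 = 150_000_000   # a point inside the quoted $80-190M range for GPT-4 / Gemini

overall = gpt4_class_2023 / transformer_2017
per_year = overall ** (1 / (2023 - 2017))  # compound annual growth factor

print(f"Overall increase 2017-2023: {overall:,.0f}x")
print(f"Implied growth per year:    ~{per_year:.1f}x")
print(f"One more such year lands near ${gpt4_class_2023 * per_year / 1e9:.1f}B, "
      "roughly the one billion dollar figure Dario Amodei has projected.")
```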
Starting point is 00:41:23 i think it will become a bit less expensive to train some of these frontier systems, but we'll see by what degree. I think there's also an important question for the broader community. We have all these capable closed models. Is there going to be an open model in the conceivable future that is going to be very capable and really can match the performance of some of these top closed models. At the moment, the answer to that question is yes. You have Lama 3, which seems to be fairly competitive with GPT-4, CLAWD 3, and Gemini on a variety of benchmarks. And that's good for the broader research community because again, there are so many things you could do with large language models. And you could probably do them a little bit easier if you have access to an open model whose weights you can freely modify.
Starting point is 00:42:10 Now, if we move into a future where that kind of distribution paradigm changes in some kind of way, that might create a lot of important inequalities in the development ecosystem that might require some kind of resolution. But it also becomes, you know, it's a tricky security question and safety question. Do we want models that are closed or do we want models that are open? I think some of these developers believe in closing access to models, I think partly for competitive concerns. They don't want their competitors to have access to their secret sauce. I think part of it is also safety and
Starting point is 00:42:45 security concerns. You know, if you openly release a model that is really capable, it can be used for a lot of nefarious purposes. On the other hand, there are certain developers like Meta that really seem to believe for business reasons and also scientific reasons about making research openly accessible, that open is the way to go. So I think that kind of that tension in the ecosystem is something that I'm keeping an eye on. And that I think is going to be important to monitor in the near future. I think you're absolutely right. The HPC community has always been in a leading position for open source technology, starting with one of the early adopters of Unix when it came about,
Starting point is 00:43:25 and it wasn't quite open source, but it was open, and then Linux, and then the entire stack, etc. But that I think is definitely a big issue in AI, whether you want to go top to bottom integrated, or do you want to integrate and then disaggregate? Or do you want to just go open all the way? Yeah, I mean, it's it is. And it's an open question, right? I think there are people that I've spoken to that really champion the open philosophy of model development and are really confident that philosophy is going to win in the long run. I think there are others who believe that there's always going to be a closed model
Starting point is 00:43:59 that's going to be at the top of the pyramid. So it's something that I don't really know where my own intuitions align. I could see there being compelling arguments made for both sides. And it's something that we've now started tracking in this last edition of the report and something that we're continuing to think about more and more. Yeah, well, definitely the reduction of friction that the open model provides is a non-trivial attribute. So it really depends on what sort of contribution you expect from those who adopt it. If you're just looking for their subscription, then maybe not so much. But if you think they're going to contribute data or insights
Starting point is 00:44:39 or development, then it makes a huge difference, I think. Yeah, I mean, it's also interesting to think about how the kind of economics differ for a company like OpenAI and Meta. I think, you know, it seems like Meta really believes from a business perspective that it's better for them to open source the model, partially because it seems like they intend to deploy their model on a lot of their existing products and maybe even to develop future products
Starting point is 00:45:03 that, you know, could bring us to the metaverse. And it seems like they're almost outsourcing kind of part of the development pipeline or the process improvement pipeline to the community. And it seems to then save them a lot of resources because you have developers that are outside of meta that are improving the model and basically doing that work for them, which makes it massively more efficient to then redeploy that model internally. Now, that's a strategy that they can afford because they have a litany of products that are used by massive amounts of people. It's a bit different for companies like OpenAI and Anthropic that don't necessarily have as much of a distribution network
Starting point is 00:45:41 as a company like Meta. So yeah, there are really interesting questions of economic management on a technological side that, you know, depending on exactly where you are in this ecosystem, you know, you might prioritize development of models in different ways. And that's another question I'm very interested in as well as, you know, what's the economics of this? What is OpenAI? Like, I mean, I was thinking a few years ago, if all these companies are just basically scaling up transformers, like anything that OpenAI does, you know, Anthropic can probably do in a few months, Microsoft can do in a few months, Google can do in a few months. So what is OpenAI's advantage? Is it a distribution one? My thoughts on that have
Starting point is 00:46:19 changed a little bit in recent months. But I think the economics of this tech are also very fascinating to talk about. Yeah, actually, along those lines, it was always a bit curious that for a technology that's supposed to be so advanced and out there, why is it that there are a dozen companies competing? So maybe it's not such a big barrier to entry after all. Yeah, I mean, these models are expensive, but, again, expense is relative; they're not so expensive that, you know, you couldn't have a couple of different companies in that price range that are competing. I think I guess what I was also saying is that, you know, for me, I thought that if there was going to be five or six players that had pretty comparable models, the strategy for kind of, let's say, newer players like OpenAI or Anthropic
Starting point is 00:47:07 was to either A, build models that were substantially better, or perhaps maybe build models that people could trust a little bit more. But I've inherently thought that existing players like Google and Microsoft had massive distribution advantages, because, I mean, realistically, if Google has a technology that works as well as GPT-4, ChatGPT, are you going to go to a separate website to use that? Or are you going to use it if it's integrated into your, you know, Google Docs suite or your Microsoft Office suite? Now, that's what I had thought six months ago. I think it's still true. But I think it's also important to
Starting point is 00:47:45 say that the kind of rollout challenge of this technology is also non-trivial. As in, you know, Google now has had a couple of snafus, whether it's the kind of, you know, racially diverse Nazis from a few months ago with its text-to-image model, or even recently it was trying to integrate AI into Google and it was suggesting to people that they should eat rocks, just kind of comical missteps. Yeah. And I mean, this is like one of the world's richest companies. So clearly rolling the technology out in an effective way is non-trivial. So I think, you know, that's changed my perspective on things a little bit, but it'll be interesting to see how things unfold.
Starting point is 00:48:23 Nestor, what is the methodology you use for this work, for the index? Yeah, it's a great question. I mean, it really depends what kind of data we're collecting. I think on a high level, we ask ourselves when we start thinking about the report, which is always half a year, if not more, in advance of the publication, what kind of topics do we think are really important to cover? And there are certain topics now that we've been covering for a while, like trends in technical performance, what can the technology do now that it couldn't really do five years ago, trends in policy, how are governments thinking about it, trends in economics. But there are always perhaps new things that we think to ourselves, okay, the AI ecosystem
Starting point is 00:48:58 has evolved. These are some new things that we want to consider. Last year, we added a new chapter on public opinion that looks at how the general public feels about this technology. In the most recent report, we added two new chapters, one on responsible AI that was spearheaded by a great computer science student that worked for us, Anka Reuel, and the science and medicine chapter, which was again, a completely new chapter that we added in this year's report. And I think the inclusion of these new chapters reflected our sentiment that it was important to talk about these things when the kind of landscape evolves. But we're in a position with the index to be blessed to have the advisory services of a committee of really excellent AI thought leaders that sit on our steering committee. We dialogue also with a lot of Stanford professors that give us their perspectives on what we should cover.
Starting point is 00:49:48 And we typically tend to partner with existing data vendors if there is existing data that is available on particular topics. But if there are certain data points that are unavailable, like for example, the geopolitical affiliation of different foundation models, that's something that we then work to collect ourselves. Excellent. Excellent. And what is next? What do you see for the next edition? I think a couple of things. So I think that, I mean, the report I think is institutionalized, it's in a good position. I think we're going to look more or less at the same kind of category of topics. I'm really interested to see the economic data. I think it's going to be really curious to me how much businesses are using this technology. You know, I cited some
Starting point is 00:50:31 studies which suggest productivity impacts. These are more kind of micro level. I'd be really interested to see if there are macro level impacts that we might hypothesize down the line. Also, the index team is working hard on some, let's say, smaller scale projects that work to paint a picture of what's going on in the AI ecosystem. In a couple of months, we're going to be relaunching our global vibrancy tool, which is a tool that is more of a traditional index that ranks how well different countries are doing in the AI ecosystem. We also have some analyses forthcoming that look at the amount of money that governments spend on AI. There's actually no systematic measure that comparatively analyzes how much
Starting point is 00:51:12 different countries on a public level have spent in AI. We're trying to be the first that kind of publish some research on that kind of topic. A lot of kind of exciting projects for us in the pipeline. And of course, you know, in this kind of job, you always have to refresh your Twitter or your X every five minutes, because it seems like with AI, there's a new development that comes up very rapidly. And would you tell us a little bit more about how Stanford has organized its AI activities across its various departments, and certainly with the project that you're leading? Yeah, I mean, the index is housed at Stanford HAI. We're actually celebrating our fifth anniversary tomorrow. And I think Stanford HAI was really created to keep the human in the loop with AI. And I think it really believes in doing AI from a very interdisciplinary perspective. So
Starting point is 00:52:00 the institute itself is actually under the Dean of Research at Stanford, and we collaborate with legal scholars, medical scholars, you know, artists, more traditional computer scientists, because I think we really feel at the Institute that AI is a technology that could broadly benefit humanity. But in order to ensure that it can broadly benefit humanity, we need to solicit a variety of perspectives. So I think that is really fundamental to the mission at HAI. And that really fundamentally drives forward a lot of the great work that is being done at this institute. Very multidisciplinary. Absolutely. Yeah. Yeah. And I think Stanford has always led in that, well, has been one of the leaders in that multidisciplinary approach. Now you've also been at Harvard and Oxford, also institutions of note. Are you seeing any difference in the way those guys approach it? Or is it generally a global effort? I think Oxford certainly has fairly robust programs when it comes to AI. I think they have some good robotics things from a pure computer science perspective. I think they also excel a lot when it comes to questions of AI governance. So I think there's a leading center on AI governance called the
Starting point is 00:53:15 Center for the Governance of AI that kind of spun out of the university. I think Harvard also has some similarly strong institutions. But I mean, all three of those institutions do good work with the technology, but I'm certainly a bit more familiar with what's being done at Stanford than the other two. Brilliant. Well, we'd like to possibly invite you in advance on the occasion of the next index coming up, I assume next April. Yeah, it would be a pleasure. I mean, it's interesting to think about that because I wonder to myself, where's the technology going to be in a year? What's going to happen? And it might be interesting for me to listen back to this podcast at this point next year and, you know, think to myself, wow, I can't believe I actually said that. How stupid was I?
Starting point is 00:53:58 Or perhaps, wow, that was actually very genius. I know what I'm talking about. So we'll see if my predictions have any weight. Well, we'll hold you very genius. I know what I'm talking about. So we'll see if my predictions have any weight. Well, we'll hold you very closely. We'll have a profundity index. Excellent. Thank you, Nestor. It was excellent. Yeah, it was a pleasure speaking with you guys today. All right. Thanks so much. That's it for this episode of the At HPC podcast. Every episode is featured on InsideHPC.com and posted on OrionX.net. Use the comment section or tweet us with any questions or to propose topics of discussion. If you like the show, rate and review it on Apple Podcasts or wherever you listen. The At HPC podcast is a production of OrionX in association with Inside HPC.
Starting point is 00:54:46 Thank you for listening.
