@HPC Podcast Archives - OrionX.net - @HPCpodcast-103: Stanford AI Index 2025 w Nestor Maslej
Episode Date: August 22, 2025

We are delighted to again welcome Nestor Maslej of the Stanford Institute for Human-Centered Artificial Intelligence (HAI) at Stanford University to discuss the latest edition of the annual Stanford AI Index and the Stanford Global AI Vibrancy Tool. Nestor has degrees from Harvard and Oxford in addition to being a fellow at the Center for International Governance Innovation. We touch on several of the 12 key findings of the report as we highlight important issues in AI such as its impact on jobs, geopolitics, and social trust.

Audio: https://orionx.net/wp-content/uploads/2025/08/103@HPCpodcast_ID_Stanford-AI-Index-2025_Nestor-Maslej_20250822.mp3
Transcript
There's a lot of evidence that points to businesses starting to dabble with
AI, but there's a big difference between dabbling with the technology and actually
having the technology drive positive productivity impacts.
It's not an accident that most of the major corporations in the world that are pioneering
these models are American, and not only are they American, they're all headquartered within,
I would say, 5,200 kilometers of one another in Silicon Valley.
We don't really know exactly what the future is going to hold with synthetic data,
but I wouldn't say there's any overwhelming reason to either be incredibly negative or incredibly positive.
From OrionX in association with InsideHPC.
This is the @HPCpodcast. Join Shaheen Khan and Doug Black as they discuss supercomputing technologies
and the applications, markets, and policies that shape them.
Thank you for being with us.
Hello, everyone.
I'm Doug Black of Inside HPC.
I'm with Shaheen Khan of OrionX.net, and this is the @HPCpodcast.
Today, it's our great pleasure to be joined again by Nestor Maslej, who is a research manager
at Stanford's Institute for Human-Centered Artificial Intelligence, also referred to as
HAI.
Stanford recently put out its annual AI index report.
We're going to be talking about that with Nestor today as we did about a year ago.
Nestor manages the AI Index and Global AI Vibrancy Tool at Stanford,
developing tools that track the advancement of AI.
The goal is to make the AI space more accessible to policymakers, business leaders, and the lay public.
So Nestor, great to be with you again.
Yeah, super excited to be here and very honored to be
speaking with you guys. Okay, super. I'll just jump in. If you could maybe provide us with an overview
of the new AI index, what do you think are the most impactful findings that the report offers this year?
I mean, I think to summarize, to me, it's a combination of findings. I mean, I think the technology is
getting a lot better. I think as it's getting better, governments across the world are thinking
more about it. People are thinking more about it. But it seems to me like I think the big difference now from
let's say a year ago, is that there's a lot of discussion about how businesses are going to be
using this tool. And I think there's a lot of evidence that points to businesses starting
to dabble with AI, but there's a big difference between dabbling with the technology and actually
then having the technology drive positive productivity impacts. So I think for me, this is the big thing
that we're thinking about at the index, not only from last year's report in 2025, but the one that's
upcoming, actually trying to understand how much of an impact this increasingly improving technology
is going to have on business and what it's actually going to take for this tech to make a mark
in the business world. Because I think we're seeing some positive signals, but I still feel
like it's too early to definitively declare that AI is this productivity enhancing tool that
maybe some people have prognosticated it would be. Okay. Well, why don't we start with the first of
your 12 key findings. This relates to AI performance on, as you say, demanding benchmarks continuing
to improve. Could you talk about some of those benchmark improvements and to what do you attribute
that? Is it hardware and software or where do you think those improvements are coming from?
I mean, I think it's a lot of different factors. I mean, I feel like there's a lot more money
that's being invested in AI than ever before. People, I think, know what it's capable of. And there's
a lot of companies that are now working on making better quality models, and these companies
are very well financed. They have the ability to afford more and more compute to maybe go after
larger data sets. The hardware thing is also something that's very important to discuss, and we
talk about this fairly prominently in the report, but there have been a lot of algorithmic
improvements, and a lot of the GPUs that are coming out are sharper and sharper. And as a result of all
of this, you can, in a lot of ways, get more from less, so to speak. But you have this kind of,
I think, simultaneous efficiency gain with the fact that there is a very strong competitive
pressure that a lot of these companies now feel to actually be on the leading edge with this
technology. And I think when you see those kinds of factors come together, then it's not
surprising at all to see that AI progress is continuing to improve. Now, I think the one interesting
corollary to this is we might be at a moment where we're seeing saturation on a lot of
benchmarks, even some of the more challenging ones. And I think there's maybe an open question
as to how much better can we get in the current kind of paradigm, which to me is functionally
scaling transformer models. And I think there are hot debates about this in AI. So Nestor,
when I look at last year's report and this year's report, how did you change it? And why did you
have to change it? I mean, I think in some ways, the whole point of a report like this is to, I think,
look at AI through the same kinds of lenses. So I think we try in some ways, I wouldn't say not to
change it, but at least to keep the same, let's say, lens of analysis to facilitate year-over-year
comparisons with the technology. Now, with that said, the technology is changing very
rapidly and it's forcing us to consider different ways of understanding it and different ways of
evaluating it. And I think you see this in different ways where, for example, this year, we
had a section for the first time in the report where we were talking about the usefulness and the
validity of benchmarks. And the context of this is that since its inception, the AI index has been
trying to measure progress in AI by looking at how well AI systems did on these benchmarks,
these kinds of general evaluations of AI capabilities.
But there's been a lot that's been written about the shortcomings of these benchmarks,
that they're contaminated, that they saturate too quickly,
that in some cases they're gamed by some of these major model developers.
And we felt, for example, that it was very important to acknowledge some of those criticisms.
So in the technical performance chapter, we discussed that a little bit.
But I feel like now we're at a general stage with the index,
where we do look at AI through a fairly comprehensive lens.
And I think, interestingly enough, when I started on the report in 2021,
the challenge was really figuring out, like, what do we include in the report?
And you're trying to see if there's additional data that we can throw in.
Now, I think it's the opposite challenge where there is so much information that is out there.
I think we need to take a step back and actually figure out what is the really important,
really essential information that we want to include and throw in.
I think it'd be good to go over the top 12 findings that
I believe precede the bulk of the report. Also, I want to point out that I admire your ability to
cut the report down from 502 pages to 456 pages. Yeah, I'm worried that people won't have
time to read it front to back, so we're trying to make it as tight as possible. I do recognize how
hard that can be, actually, so that's great. We talked about the benchmarks. The other thing,
number two, is the increasing embeddedness of AI in everything that we do.
And that, of course, is fraught with the challenges that it presents, both in terms of the
distribution of that, having access to AI, having the right device to be able to have access
to AI, but also the reports that came out of MIT a couple of months ago on the impact of
using AI on users' brains and how it also presents a challenge. What is your perspective on that?
I mean, my perspective on it is that I think functionally, it is too early to know definitively
whether AI, I would say, is more a force for good than a force for bad. And I mean, I ultimately believe,
and this is why I'm motivated to do AI-related work, I do fundamentally believe that it could be
steered towards being more a source of good. But I think we're going to have to take a step
back and tangibly ask ourselves, how is it being used and critically evaluate the ways in which
we relate to and think about this technology. I think for me, that's the kind of really important
point here and the really interesting one, which is that this technology has emerged. We're using
it a lot. And I think what I always emphasize when I speak to different stakeholders is that
technologies don't exist in a vacuum. They're fundamentally mediated by the things we decide to do with them,
or the ideas that we have about them.
And I guess for me, it's too early to definitively say where we are in that entire,
let's say on that entire spectrum.
But I think we need to start asking ourselves very tangibly and doing research to actually
understand again, what is this tool doing, both positively and negatively,
and where is the balance leaning so that we can actually know these things in a more
definitive way and so that we have the capacity to make grounded decisions about the technology.
Yeah, that's very fair. I think that's right on. The other finding was the lead that the United States has in AI and certainly in top AI models and their infrastructure and, of course, the hardware that is used to do this. But the report also says that China is closing the performance gap. Are there other gaps that China is not closing? And how do you see that performance gap remaining or not remaining, given all the trade friction?
that exists between the two countries?
It's a great question.
I mean, I think there's different ways to look at it.
There's been some good things that I've read that have suggested that some of these export
controls that were put in place might come to hit China a little bit more severely now
than at the time in which they were training some of these DeepSeek models that have
rivaled some of the best American models.
I mean, in a lot of ways, there really isn't a moat when it comes to building these high-level
AI systems, at least currently, because they're all based
on this fundamental transformer architecture that really is being scaled and scaled by a lot of
these companies. Now, it's not to say that one of these companies can't develop a new way of
building these things, but innovating and kind of timing innovation and knowing exactly when it is
going to occur, that's very tricky. That's very difficult. And that's something that it's just
generally hard to sit here on a podcast and say, oh, well, in three years, somebody's going to come up
with a new way of doing things. Right. Now, I think that China
is leading in certain things like public opinion.
It's a very bullish country when it comes to the use of AI technologies.
It's also leading in AI robotics as well as AI patenting.
I think one of the areas where the U.S. leads, though, is that there's a much more, I think,
vibrant and open capital ecosystem when it comes to artificial
intelligence.
And I think that is really important for the development of the technology.
Like, I think it's not an accident that most of the major corporations in the world that are
pioneering these models are American, and not only are they American, they're all headquartered
within, I would say, 5,200 kilometers of one another in Silicon Valley. And as someone who lives
in San Francisco, there is something about it: you walk around the city, you stop into a cafe,
and everybody seems to be talking about how much money they're raising in their seed round or what
their new startup idea is and how it relates to AI. And you can laugh about that, but there is this
infectious attitude in terms of willingness to innovate and desire to push that frontier even
further. And it's not to say that this isn't occurring in China. I don't have the perspective to
comment on that. But I do think that the capital ecosystem in the United States is really well suited
to facilitate the kind of development of this type of technology. Nestor, a lot of our listeners
will be very interested in finding number 11 of your top 12 findings, around AI for science. Could you
elaborate on that area? Of course. So this highlight, I think, really is just speaking to the
fact that in the last few years, AI has had tremendous impact in a variety of scientific domains.
There have been Nobel Prizes that have been awarded for pioneering AI-related work in physics
and chemistry. DeepMind every few months seems to be releasing groundbreaking new models that are
doing things that are very exciting in quantum computing, weather forecasting, or
protein structure prediction. And I think it is really worth highlighting the scientific developments
because I think it is easy to think that AI is just about chatbots, to just hear about some
of these sexy large language models that are being announced: GPT-5 last week, Llama 4, Claude 4 or Claude 5 at
some point, as I'm sure there will be. But there are a lot of very exciting things happening
in the scientific domain. And I had the pleasure last summer of teaching a course at Stanford.
And as part of this course, I invited Greg Corrado to do a talk, a fireside chat.
He's one of the co-founders of the Google Brain team.
He does a lot of work on AI in science.
And I asked him, why are we seeing so many tremendous developments in science as a result of
AI?
And he said, quite simply, there's a lot of problems in science where answers are known if you
can construct sufficient models, usually mathematical models, of the answer.
And it can take humans a lot of time to build these models, whereas AI systems are incredibly
well suited to do a lot of that kind of groundwork. So there have been a lot of very exciting
developments that have occurred in the scientific field. And I think at the index, that's something
that we don't want to lose sight of as AI seems to be poised to be this technology that might,
let's say, take over the business world sooner rather than later. I was also very interested in your
findings about autonomous vehicles and robotaxis. And you had some very interesting facts and figures
in that area. That seems to have taken quite a leap over the past year. Am I right in saying that?
At least in terms of the degree to which robotaxis have really, in some markets, in some areas of
the country, been adopted. I think you're certainly right. And I mean, you see this if you
drive around San Francisco. Waymos have now been a thing for, I'd say, longer than a year.
Even in the last few weeks, I've noticed that Zoox, another self-driving company, has pioneered these self-driving vehicles that take a slightly different form.
Self-driving seems to be pretty ubiquitous in a lot of different Chinese locales.
And when I think about the self-driving, I think to me it's not just a story of the tech getting better and more pervasive.
I think there are a few interesting subsequent questions that, you know, speak to this question of, like, where are we going with this tech?
And one of the questions is, when are we going to get fully end-to-end self-driving cars?
Because at the moment, you can use a Waymo in very kind of limited areas in San Francisco,
and the technology seems to be fairly expensive.
Both of those things seem to me to be impediments to having the technology be more widespread and ubiquitous.
But even more fundamentally, do we as a society want more self-driving cars?
And I ask this question because in the index, we point to the fact that
the data definitively shows that these self-driving cars are a lot safer than human drivers
across virtually every single metric. And if you've driven in one of them, you can understand
why it's a very smooth ride. They're calibrated to optimize for safety. Now, despite a lot of
this data, we have another figure in the index, which shows that the majority of Americans are
still deeply distrustful and afraid of self-driving cars. So I think there's an interesting question
to me here, which is not just, I think, relevant to self-driving,
but relevant as well to other technological applications of AI.
If the technology is good at doing what it's doing and making our lives better,
how do we get it to be more part of our lives when maybe there is this kind of inherent
skepticism about it?
I think that's something that really interests me.
Yeah, I know, Shaheen has told me. Shaheen, you've said you've taken many of these
yourself. I have. Yes. Can I ask, Nestor, possibly an annoying question? Have there been
reported accidents with these AV robotaxis in which the robotaxi was at fault?
Yeah, I mean, there certainly have been accidents. The data definitely suggests that.
But I think when you want to look at the data ultimately, it's a question of, I think, just on
aggregate looking at how likely is it for human drivers to get into accidents versus, let's say,
some of these self-driving vehicles, and at least data that's come out for multiple studies
seems to suggest that across a variety of self-driving metrics like airbag deployment or injury
reporting or police reporting, the self-driving vehicles seem to be a lot less likely to have
incidents than human drivers. So it's not to say that they're perfect, but they seem to be doing
better than humans, and that's what the early data seems to suggest. Okay. I think a lot has to do
with humans' willingness to forgive a machine.
Yeah.
And I think that is its own kind of a social evolution as we rely on machines in ways that
we haven't before.
Yeah, I think it speaks to this fact as well that it seems almost easier, I think,
in some ways to assign blame to humans.
I mean, things are going to go wrong in the world.
That's an inevitability.
And it seems that we have better mechanisms for maybe assigning blame to humans and accepting that
blame than perhaps having a similar kind of thought process with machines. That's just a speculation
on my part. But it is an interesting thing where the data does point to these cars being safer
on aggregate, but people are still very skeptical about it. That I find to be quite interesting.
Yeah, absolutely. And they're getting more and more capable. Now, one thing: I'll reference your
background and perhaps present engagement with governance and innovation in governance.
When we look at industrial policy, social policy, preserving or creating the proper environment
for continued development of AI, there seem to be multiple strategies being put forward
around the world.
One could say that the U.S., Europe, and China, let's pick those three, seem to have
really different strategies on how they want to manage that.
Is there any data that suggests if any of them have it right?
I would find it hard to imagine there being data for that because I think the question of whether any of them have it right is a fundamentally subjective one.
Yes, good point.
As in the EU seems to have adopted a much more kind of pro-regulation approach.
And I think there are some people that argue that maybe we should be leading with that.
Whereas there are others that argue that perhaps what the U.S. is doing, which is a little more light touch on the regulation,
and that certainly seems to be what Donald Trump has emphasized in his recent AI policy guidance, is maybe a
better approach. But it's hard to say which one is the right one. I think it's ultimately for us to
decide what outcomes we want to optimize for with this technology. I think personally for me,
and this is speaking as an individual, not as the AI index or as a representative of my institute,
I do think you fundamentally need balance. I think you need to give a playing field where a lot of
these companies are free to innovate and experiment with this technology. But at the same time,
you do need some ground rules because, I mean, in places like the United States, people don't
fundamentally trust this technology. And if things go wrong, I mean, let me maybe actually
put it differently: when things go wrong, you want to make sure that there are good mechanisms for
helping people through that process and, in fact, limiting that, because that could really set
back the ways in which we deploy, use, and think about AI on a society level. I had another
question relevant to AI for science. There's also AI that relies on science.
And one place where I think we could have a good discussion is AI running out of data
or AI increasingly relying on data that has already been impacted by AI and therefore not
quite original.
And also we live in times where incorrect data, invalid data, is rampant.
So how does that bode for the future of AI?
And one way to really address that is through synthetic data that is scientifically oriented
and therefore a little bit more well-defined. How do you see all of that?
Again, I think with a lot of these questions, it depends. And I think when you look at the data
story, I think there's different narratives that you're seeing. On the one hand, there has been
research, or at least the bulk of research that I've seen,
that has suggested that training these AI systems on synthetic data
doesn't necessarily seem to optimize their performance.
And in fact, in some cases,
there's been this phenomenon that's been observed called model collapse,
whereby when you substitute real data with synthetic data,
the performance of the model seems to generally degrade.
Now, there's been other research that's pushed back on that,
arguing that, well,
if instead of a direct one-to-one substitution you stack synthetic data on top of existing real
data, you don't necessarily have that degradation. So that I think seems to be one of the things
that the research is suggesting. Now, there's been some more research papers that I've seen recently
that have been a little bit more bullish that have suggested that synthetic data on its own
has some promise. And there's certainly research in fields like medicine and science
in domains where there is just very little to no data to begin with,
that suggests that in certain specific use cases like biomedicine,
synthetic data can bring a little bit more value.
Now, in addition to that,
there do seem to be a lot more companies and entities
that are doing work, collecting data.
If you think about this on a more kind of abstract level,
us humans, we probably get thousands, millions of data points every day
in terms of the conversations we have,
our eyes are always processing things, we smell things, we feel things, that's all data that's
coming into our system. And these AI systems are trained almost exclusively on data that's
existed on the internet. And it's not like when the internet was created in the 80s, people got
together and thought to themselves, for a decade, this is going to be a great repository to train
a lot of AI systems with. It was convenience that we had this repository of data. So I think now
there's a lot of companies that are trying to think very critically about this data challenge
and think to themselves like how do we find ways to better extract data from the real world
and maybe take a lot of data that we have, clean it up and repurpose it in some of these language
models. So on the one hand, I think we don't really know exactly what the future is going to
hold with synthetic data, but I wouldn't say there's any overwhelming reason to either be
incredibly negative or incredibly positive about what this development holds.
for the future of the technology.
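To make the model-collapse phenomenon Nestor describes concrete, here is a minimal Python sketch, an editorial illustration rather than anything from the report: a toy "generative model" repeatedly fits a Gaussian to its own output. Replacing the real data outright lets estimation error compound, so the fitted distribution typically drifts away from the truth over generations, while stacking synthetic data on top of the accumulated pool keeps it anchored.

    import numpy as np

    rng = np.random.default_rng(0)
    real = rng.normal(loc=0.0, scale=1.0, size=500)  # the original "real" dataset

    def fit_and_sample(data, n, rng):
        # Toy "generative model": fit a Gaussian to the data, then emit synthetic samples.
        return rng.normal(data.mean(), data.std(), size=n)

    replace, accumulate = real.copy(), real.copy()
    for gen in range(1, 31):
        # Variant 1: discard the old data and retrain only on synthetic output.
        replace = fit_and_sample(replace, real.size, rng)
        # Variant 2: keep everything seen so far and stack new synthetic data on top.
        accumulate = np.concatenate([accumulate, fit_and_sample(accumulate, real.size, rng)])
        if gen % 10 == 0:
            print(f"gen {gen:2d}: replace std={replace.std():.3f}, "
                  f"accumulate std={accumulate.std():.3f} (true std = 1.0)")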
There you go.
Nestor, at least in my world,
in the media world looking at AI,
there's so much excitement around
when new models come out and also
the work NVIDIA and AMD are doing in GPUs.
I believe Jensen Huang at NVIDIA said
really where AI is going will require 100x
the current computational power that's available.
Looking at the AI infrastructure,
do you have views on how we're actually going to support all this in terms of these AI factories and the
energy requirements and the compute requirements that are going to be needed? Well, it's an interesting
question. I mean, one of the things that we highlighted in last year's report is that there were several
stories of old nuclear facilities being brought back up to action in order to meet this kind of
intense energy demand that is coming with these systems. And I think in some ways, it's important to,
I think, distinguish the two competing forces here. On
the one hand, I think on a per-use level, the systems are getting a lot more energy efficient;
as there is better hardware, it is more efficient to train these systems. But I think the
countervailing force and arguably the bigger one is the fact that more and more people are using
this technology. So I think the energy demand is poised to be increasing. And I think this is
something that now a lot of countries are thinking about really critically when it comes to
trying to understand and think about the impact that this technology is going to have.
So I think this is a big kind of strategic priority for a lot of governments across the world
that have made very large infrastructure investments in artificial intelligence.
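Nestor's two competing forces reduce to simple arithmetic: total demand is per-use energy times usage volume, so per-use efficiency gains can be swamped by adoption growth. A minimal sketch with purely hypothetical numbers, not figures from the report:

    # Hypothetical illustrative numbers, not figures from the AI Index.
    energy_per_query_wh = 3.0  # baseline energy per query, in watt-hours
    queries_per_day = 1e9      # baseline usage volume

    old_total = energy_per_query_wh * queries_per_day
    # Suppose hardware and algorithms cut per-query energy in half,
    # while adoption grows usage tenfold.
    new_total = (energy_per_query_wh / 2) * (queries_per_day * 10)
    print(f"total daily demand changes by {new_total / old_total:.0f}x")  # -> 5x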
So part of that investment is in, well, hopefully is in maintaining the talent base and human capital.
And one of the challenges of AI is that it is eliminating entry-level work in many
areas. We're already seeing that in computer programming. What is the impact on education for
AI? So I think I would say first and foremost, I think it's hard to know definitively what impact
it's having on entry-level jobs. I've heard people speculate that now for some junior coders,
it's very hard to find jobs because companies now are getting, instead of relying on junior-level
developers, maybe relying on AI to do some of that work. I've heard that anecdotally, but
I've also talked to some people at Stanford that are doing some more rigorous work with data on that
and suggesting that maybe it isn't really worse.
So again, I think this is one of those things we have to go into with open eyes and imagine what impact
it's going to have.
But I mean, I think fundamentally outside of that, I would agree with you that AI is poised
to change the way in which we do work.
And there is going to be a lot of re-skilling work that's going to be required.
And I think a really good way to look at this is that one of the data vendors that
we partner with at the index, LinkedIn, did some interesting research where they estimate
that, I think, close to 20% of the jobs they see on LinkedIn now didn't exist two decades ago.
Oh, interesting.
These kinds of categories, like prompt engineer or social media manager, weren't really
around 20 years ago.
And that's a pretty sizable portion, right?
And if you're mid-career, you're probably not going to have the opportunity to go back
and do another graduate degree or an undergraduate degree to get up to speed on some of these new technologies.
So yeah, I think we do need to critically think about ways in which we can retrain, re-skill,
work with other individuals.
This is something that I think about a lot in my day job.
At Stanford, I'm also partnering with some startups like thinking through AI that are
trying to build these platforms that basically help people get re-educated about artificial
intelligence.
And it is a critical issue, right?
because I don't think you can fundamentally think to yourself that this technology is going to just
disappear. I think it's going to be a part of all of our lives. And we're going to have to empower
people to learn about it so they know how to use it. Because I think fundamentally to me,
AI is not going to come for your job unless you're really unwilling to use it. I think people that
figure out ways to use AI technology in what they do, I think those people are going to be the
ones that are going to be most poised to benefit.
Now, looking forward, Nestor, we've heard lots of talk about the next stages beyond generative
AI, which is coming up on its third anniversary.
Moving toward agentic AI, I've recently read about also the combinations of multiple LLMs,
and I think it was referred to as multi-LLM orchestration, where so much more data can be used
and interchanged and information exchanged.
Can you talk about that, sort of moving toward AI becoming more of a learning tool
that can take input from several sources and respond to problems
by sort of sifting through different solutions to the problem,
weeding out the faults and moving toward the strengths, etc.?
Yeah, I mean, I think this is certainly a very exciting development.
And I'll answer that with a more narrow and then perhaps a broader meditation.
On the more kind of narrow side, yeah, I mean, I think this kind of desire to work with multiple
language models is reflective of the fact that different models seem to be good for different
types of purposes. And when you construct systems that are leveraging multiple forms of
intelligence, I think that has an ability to optimize outcomes that you're trying to achieve
and also can just lead to better work at the end of the day. But I would also, on a broader
level, posit that this is, I think, an example of a lot of companies thinking more about kind of
the application layer than strictly the technological one. Now, as in, there are still technological
considerations that come into play when one thinks about using these language models in different
ways and doing multi-model orchestration. But what I think it communicates in a lot of ways is
that businesses, they now have seen that this new technology is here. And in a lot of cases,
what you get with AI is a raw technology out of the box like electricity,
but then as a business, you need to fundamentally build the wiring
to ensure that you can take advantage of this tech
and get it to do the kinds of things that you want to do.
So the story to me here isn't strictly about the possibility of using multiple systems,
having those systems optimize their own performance.
In some ways, it's a story more of businesses actually trying to figure out,
like, how do you take this raw intelligence and rejig it in some kind of applicative way so it does
the tangible work that you want to be done?
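To make the multi-LLM orchestration idea concrete, here is a minimal Python sketch, an editorial illustration rather than anything discussed in the episode: route each task to whichever model is nominally better suited, then have a second model critique the draft. The model names and the call_model function are hypothetical placeholders; a real system would call actual provider APIs.

    from dataclasses import dataclass

    @dataclass
    class Task:
        kind: str     # e.g. "code", "math", "summarize"
        prompt: str

    # Hypothetical routing table: which placeholder model handles which task kind.
    ROUTES = {"code": "coder-model", "math": "reasoning-model"}
    DEFAULT = "general-model"

    def call_model(model: str, prompt: str) -> str:
        # Placeholder for a real provider API call (hosted or local).
        return f"[{model} response to: {prompt[:40]}]"

    def orchestrate(task: Task) -> str:
        primary = ROUTES.get(task.kind, DEFAULT)
        draft = call_model(primary, task.prompt)
        # A second model reviews the first model's draft, weeding out weak spots.
        return call_model("critic-model", f"Review and fix if needed:\n{draft}")

    print(orchestrate(Task(kind="code", prompt="Write a function that parses CSV rows.")))

When it comes to the impact on the economy, how do you see that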
distributed geographically, both within the U.S. and internationally? It's a great question.
I mean, so far, I think the research that I've seen has suggested that in very micro settings,
and we're talking about like small isolated studies of knowledge workers, whether it's computer
scientists, lawyers, financiers, consultants, AI does seem to have positive productivity impacts.
It will save you time at the minimum and at the maximum it will lead to work that is just going
to be a better quality.
So I think that's an important initial observation to make.
I would add to that that beyond that, I think it's hard to exactly anticipate which jobs
might be most affected, which industries might be most affected. There have certainly been
like predictions that have come out. And I mean, I think it's easy to imagine that knowledge
workers are perhaps going to be the most touched. But I guess I would give two points of response
here. I think the first to me is that although knowledge workers logically might be the most
affected, I think we need to remind ourselves that we have not been good at predicting these kinds
of things. If you read literature from a lot of leading economists that were
doing work on automation and thinking about it in the early 2010s, a lot of them were writing
that technology is really going to hurt lower skilled manual routine workers, people like
plumbers, people working at fast food places. That was their prediction as to who technology
would come for first. And almost the opposite has happened, whereas if you're working in a lot of
these kind of routine physical jobs that require some degree of spatial movement, you're safe
from AI, whereas if you're a lawyer or a doctor, you might need to watch out.
So, yeah, knowledge workers might be in trouble, but I think we need to humble ourselves
a little bit with the predictions.
Second, I think to me, the big question now, and this in some ways is a more interesting
question than how much better is the tech going to get, to me a more interesting question
is what business is going to figure out how to fully leverage and take advantage of this
technology. Because I think what you see is that historically, when new technologies come out,
it takes time for businesses to figure out how to leverage them well to drive positive
business outcomes. And a very good example is the transition from steam to electricity. When electricity
was discovered and factories started replacing steam engines with electric motors, they were made
marginally more productive, but these kind of three, four X productivity gains that we
saw later, they took three to four decades to realize. And part of the reason that was the case is
because when a lot of these factory managers did this one-to-one replacement at the beginning,
they just took out the steam engines and put in the electric dynamos, what you saw was
electricity being used to do the same thing that the steam engines were being used to do,
which was to pull all these kinds of levers, pulleys, and crankshafts in the factories.
It wasn't until a few decades after that these people and factory managers started realizing,
hey, we could build factories and lay out factories in entirely different ways to take advantage
of what electricity uniquely brings, that you then started seeing massive productivity increases.
So to me, in some ways, it is less about, okay, how much better is the tech going to get?
Of course, it's going to get better. That's going to help us all. To me, the story is a little
bit more about which business can actually figure out how to use this tech in a way to
reinvent how business is done. And then from that perspective, fundamentally drive forward,
very positive business outcomes. Nestor, in the area of hallucinations, recently we saw a piece
in the Wall Street Journal where an AI therapy application, if you will, just sort of went
off and told a client, quote, you are a cosmic royalty in human skin, which is pretty, I'd love
somebody to say that to me, but on the hallucination front, is progress being made in tamping down or
controlling or preventing hallucination? Yes, I do think it is being made. I would say
the following on this point. I think it is being made, but I think it's also hard to know how much
better the issue of hallucinations is getting. So there are some benchmarks, simple facts,
and I think there have been some other ones that have come out recently that do suggest that
larger models, and we've been getting larger and larger models, they do seem to be doing
better on some of these hallucination measures. But I think part of the challenge is that unlike general
AI benchmarking, where there is a general consensus, as in each model developer seems to be
benchmarking on the same kind of category of benchmarks,
there isn't as much consensus in the world of responsible AI evaluations like hallucinations.
So each developer might do some of their own work where they're saying, this is how well
our model does on this kind of internal hallucination evaluation. But if you want a model-to-model
comparison, that's actually tough to do. So I do think the issues are getting better, but I mean,
they still persist. I mean, GPT-5 was launched last week, and I read some
interesting articles highlighting how these systems, even GPT-5, are still making some kinds of
fundamental logical errors that most, let's say, young humans would not. But I do think as a whole
that the issue is getting better. But again, I think even taking a step back, we just need
fundamentally better ways of actually measuring and understanding how severe this problem is and
how it's evolving over time. So related to that is when AI is used to do bad
things. And one of the findings of the report is really about the higher-quality AI video generation
and the rapid progress these models are making; even without much of a prompt, they do it all these
days, albeit at very high computational cost. What is the impact of that just on social
structure, on politics, on elections? Just today in a separate conversation, we were talking
about how you really just cannot trust phone calls that you might get, and that family members need
to have passwords, like in all the James Bond movies, to verify you're talking to the person that you think
you're talking to. What do the findings show for that? Well, I mean, I think the findings show
that the tech is getting better, right, and it seems to be getting better on a month-by-month
basis. The index came out in April and we had a section on how much better video generation has
gotten. I teach a class, I'm teaching one this summer,
and I was trying to show my students how much better the models had become.
And I think in the time that I was teaching, there were some new generations.
And now these models, they can generate very realistic videos with accompanying voice.
And you can only imagine that it's going to get a lot better.
Now, I think there's a separate question of how effective this is actually going to be at persuading people.
Because I feel that when you talk to people that are my age or that work in the AI space a
lot, there's an awareness as to how much better the tech is getting. And I think at least now when I
see a video, I'm going to not just take it with a grain of salt, but think to myself, what source
is it coming from, is this AI-generated? Now, not everybody is thinking about this, right? So I think
that's the more fundamental challenge. But this would be a very kind of interesting area of investigation
rather than looking at strictly like how much better is the tech getting, thinking a little bit more
about how much more vulnerable are people actually to these kind of new kind of deep fakes.
And I think correspondingly, thinking about different ways to actually deal with this,
I think that some people have proposed doing work where you point out what is a deep fake
and what is not.
I mean, I think that just might be fundamentally very resource draining to do.
I think a better process here, and this is what I've seen proposed, is almost having some
kind of digital certification or digital authentication, where if you're an agency like the Office
of the President, you can say in definitive terms, this is a video that's been generated
by us that's authentic. It has a digital seal and whatnot. So, yeah, I mean, the tech is getting
better, but I think there's an even broader question here, which is actually figuring out,
like, how do we adjust how we do things on a society level as a result of the improvements
the technology has come with.
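The digital seal Nestor sketches is essentially a cryptographic signature over the media file; the C2PA content-credentials effort works along these lines. Here is a minimal sketch, assuming the third-party Python cryptography package, of how an agency could sign a video's hash and how anyone could verify it; it illustrates the general idea rather than any specific scheme from the report.

    import hashlib
    from cryptography.exceptions import InvalidSignature
    from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

    # The issuing agency holds the private key; viewers hold its public key.
    agency_key = Ed25519PrivateKey.generate()
    public_key = agency_key.public_key()

    video_bytes = b"...raw video file bytes..."  # placeholder for the actual file
    digest = hashlib.sha256(video_bytes).digest()
    seal = agency_key.sign(digest)  # the digital seal, published with the video

    # Anyone can check the seal against the agency's public key.
    try:
        public_key.verify(seal, hashlib.sha256(video_bytes).digest())
        print("authentic: seal matches this file")
    except InvalidSignature:
        print("tampered, or not issued by this agency")

So along those lines, how do you see the intersection of AI with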
other fundamental technologies? What technologies do you have in mind, for example? For example,
quantum computing or cryptocurrencies or the internet of things or 6G networking, where you use a combination of
technologies to have the cake and eat it too, so to speak. Well, I think it's a dynamic dual interplay in
that a lot of these technologies have enabled AI, as in the internet is fundamental to the story of
artificial intelligence. On the other hand, artificial intelligence, I think, is going to lead to
improvements in a lot of these kinds of technological domains. I think it's already making quantum
computing better. I'm sure it's going to improve how we're able to do things on the Internet
of Things side. So I do see this kind of core interplay that's going on. I think AI perhaps to me
seems a little bit more general purpose. Maybe that's the distinction that I would draw that I think
as interesting as crypto, quantum, or the internet of things are, they may not have the same kind of
generality and general application that artificial intelligence does. And I think for that reason,
maybe artificial intelligence gets a little bit more attention because it's something that
could be deployed and used in so many different verticals. Yeah, very true.
Nestor, I guess my last question would be looking out over the last year, the last 12 months since we
spoke in a general sort of way. Would you say that AI has evolved and progressed faster than
you had expected? We've heard that from some of our guests. Or would you say things are moving
along at a pace that could be anticipated? It's a great question.
I think it depends who you ask, right? Because I think there are some people that even a few years
ago predicted this kind of exponential intelligence takeoff with AI. I don't think
we've seen that. I think AI has gotten better technologically. I mean, it's acing a lot of these
benchmarks. But, and I hate to use the word AGI, because I'm not even sure what it means, but
it doesn't feel like we've reached that level yet. Now, with that said, the tech is getting a lot
better. And I think to me, the biggest story in some ways is more that there's just a very large
number of businesses that are now using the tech and deploying the tech. And I think that's a
very exciting thing that historically, when new technologies emerge, there can be a bit of a delta
between when the technology comes out and when it's used by different people in society, especially
in businesses. It seems like business has embraced this technology straight from the get-go and
they're really deploying it and using it well. So I think to me that's the very interesting thing
to see and observe here. Nestor, maybe we can conclude with the future of AI
and how you see it evolving in the coming year.
What do you expect for the next installment of this report?
It's a great question.
I mean, I think the tech is going to continue getting better.
I think we're going to start seeing a little bit more data on how businesses are using the tools
in targeted ways.
I think that's what really interests me.
And by that, I mean, I think last year we had data on, like, are businesses generally
using the tools, but, like, in what particular ways? What ways are leading to the greatest
degrees of productivity? What models are businesses even favoring, right? Because you do have, like, now
five or six models that all perform fairly well in terms of their ability. So how then are
businesses making tangible decisions about what particular models to build with? I think I'm also
interested in seeing how some of these questions related to geopolitics change. How much is going to be
different now from, let's say, a year or two ago, about the U.S.'s position relative to China
or the EU. I think that really fascinates me. So for me, I think that the index, I think we have a good
sense as to what some of the key topic areas are. And I think it will be just fundamentally very interesting
to see how each and every one of those individual topic areas develops further. And I think to me,
what I'm most excited by other than obviously like how much better is the tech going to be getting
is more what impact is this going to have on business and jobs? Because I think that really is the thing
that a lot of Americans are afraid of, but at the same time, the impact of AI on productivity
is a reason why people are so excited about this tech. And I think it'll be really interesting
to see, like, in very tangible terms, what the technology might actually do in this domain.
Brilliant. I want to invite our listeners to go check out episode 85 that we did together
about a year ago. It was June 7th, 2024, as I looked it up. So please go listen to that,
and you will see the exchange that we had about the last edition of this report.
What a wonderful conversation. Where do people go to get more info?
Yeah, so just feel free to visit us online at the AI Index at Stanford.
We also have a tool called the Global AI Vibrancy Tool that ranks how well different countries do in the space of AI.
So those would be the two resources to check out on our end.
Yeah, I definitely want to encourage people to go do that too because it's a treasure trove of data.
We've only scratched the surface, but we managed to cover some important topics in the process.
Well, thanks so much, Nestor. It's been a real pleasure to be with you again.
Yeah, thank you very much. It was a pleasure, and it was a real honor to discuss these topics with you guys today.
Have a good one.
Thank you so much. Take care. Likewise.
That's it for this episode of the @HPCpodcast.
Every episode is featured on insidehpc.com and posted on OrionX.net.
Use the comment section or tweet us with any questions or to propose topics of discussion.
If you like the show, rate and review it on Apple Podcasts or wherever you listen.
The @HPCpodcast is a production of OrionX in association with InsideHPC.
Thank you for listening.
