Computer Architecture Podcast - Ep 24: From Unicorns to Centaurs: Codesigning Computer Systems for the AI Era with Dr. Partha Ranganathan, Google

Starting point is 00:00:00 Hi, and welcome to the Computer Architecture Podcast, a show that brings you closer to cutting-edge work in computer architecture and the remarkable people behind it. We are your hosts. I'm Suvinay Subramanian. And I'm Lisa Hsu. On this episode, we were joined by Dr. Partha Ranganathan, a VP and Technical Fellow at Google,

Starting point is 00:00:18 where he serves as the area technical lead for hardware and data centers. Partha is currently driving next-generation computing systems for the AI era, tackling the challenges of building, scaling, and managing warehouse-scale computers, and the bleeding edge of hardware-software co-design. He has pioneered numerous impactful innovations, touching billions of users, spanning power-efficient servers, specialized accelerators, and disaggregated data centers. A leading voice in the field, Partha has published over 100 papers,

Starting point is 00:00:46 is a co-inventor on over 125 patents, and co-authored the seminal textbook, The Datacenter as a Computer. He is an IEEE and ACM Fellow, a member of the ISCA, ASPLOS, and HPCA Halls of Fame, and a recipient of the ACM SIGARCH Maurice Wilkes Award. He's been named one of MIT Tech Review's top 35 young innovators. He has received distinguished alumni awards from both Rice University and IIT Madras.

Starting point is 00:01:12 And finally, he is our second and probably final guest with the unique honor of having won an Emmy Award. When he joined us, we discussed, what else, AI, of course, with topics ranging from ethics and safety to agentic improvements to computer architecture, to work hacks using AI, with a healthy smattering of Partha's trademark alliterations thrown into the mix. We had a great time, and we don't want to keep you one second longer from this great conversation. So let's get to the interview. A quick disclaimer that all views shared on the show are the opinions of individuals and do not reflect the views of the organizations they work for. Partha, welcome to the podcast. We are so excited to have you here. Well, it's a pleasure to be here.

Starting point is 00:01:59 I love hearing the podcast from you folks, and I feel like I finally made it to be on the Lisa-Suvinay podcast. Well, we don't think of it as the Lisa-Suvinay podcast. The featured, most amazing people are always the guests, and that's why we wanted to have you on the show. So thanks for being here. And I know you're a fan of the podcast, so you know the first question is going to be, what's getting you up in the morning these days?

Starting point is 00:02:24 Well, I think you know what I'm going to say: AI, AI, AI, AI. But honestly, jokes aside, I do believe that we are at a very fundamental inflection point in computing, maybe even arguably in society. And honestly, waking up, I think about all the opportunities we have and how much we need to do. And so I think it's a really interesting place to be in. And so I can talk about one thing that is really close to my heart. I mean, AI, of course, we see it everywhere: AI for consumer applications, AI for enterprise. I just came back from Google Cloud Next, where I was just blown away by how enterprises are adopting AI in some meaningful ways, and happy to talk about that more if you're

Starting point is 00:03:07 interested. But AI for science is something really close to my heart and I've been working on it for a couple of years now. And we had AlphaFold from Google, and AlphaFold identifies how to look at protein folding. And for me, again, I didn't know this when I first started. I mean, we all knew proteins were sequences of amino acids, but it turns out the three-dimensional structure of the protein folding determines what that protein is useful for: drug discovery, disease, and all that stuff. And finding that protein folding was such a complex problem, and typically it took one PhD student five years, wet lab and all that stuff,

Starting point is 00:03:45 to kind of figure this protein folding problem. And AlphaFold comes along, and with a click of a button, 200 million proteins in the world, protein folding solved, right? And so if you think about it, every time I say this, I literally get goosebumps: one billion years of researcher time automated at the click of a button. And what is interesting is now people are like, well, AI has solved all the problems, there's nothing left to do. No, it's just the beginning, because now that we have this amazing database of protein folding,

Starting point is 00:04:12 we can look at vaccine development, we can look at antibodies, we can look at disease, we can look at plastics. So many new applications with these things are there. And this is just biological sciences, and not even just protein folding. We have AlphaFold 3, we have work where we're looking at all kinds of materials, weather forecasting, we have computational sciences like AlphaEvolve, we have all kinds of amazing stuff there, right? And so the possibilities literally are just very, very interesting in terms of how this technology can transform the world. And we have a

Starting point is 00:04:45 very, very lucky seat at the table. And I always have this analogy that somebody gave, which really stuck in my mind. If we think of our AI colleagues as space explorers, and this is the new space age. And if you think of them as space explorers discovering new engine, new worlds and new areas, we, namely the computer architecture community, are the ones designing the rocket ships that get them there, right? And so you ask me a question, how do I wake up? I wake up feeling like a rocket scientist kind of helping with the space age. So you brought your catchy quote for the social media post for the podcast here. Nice. Nice. I like it. Yeah, I mean, it does seem like there's this crazy amount of activity happening in this space, of course.

Starting point is 00:05:31 And so one thing I wonder for you is, you know, if we're the rocket scientists and the AI people are the space explorers, there is an element of a lot of folks around the world who are wondering about, you know, AI for morality, AI for ethics, AI for safety. Well, how do you, not to get straight into the deep, deep stuff right now, but yeah, let's get straight into the deep stuff right now, I guess. We're building the rocket ships. The other reason why I'm thinking about it is because I actually had a student ask me the other

Starting point is 00:06:02 day, like, should we be working on AI? And I was like, I think we should be working on AI. We might want to think about whether or not we can cross layers and think more about safety type things, even though we're the hardware layer. So I wonder what you think about this, because rocket ships could be built to go to the moon. Rocket ships could also be built to go do potentially more nefarious things. So, like, how do you think about that aspect? I didn't even mean to go this hard hitting right away,

Starting point is 00:06:29 but that's like the first thing that I thought of. So Lisa, I think it's a really good question. And it's something that I think, again, we spend a lot of time thinking about. And I personally take it very seriously as well. And I think you said it really well. Every technology development in the history of humanity can be a double-edged sword, and it is up to all of us to kind of think about

Starting point is 00:06:52 how do we do this responsibly. And so I think about two things. So number one, you ask two questions. Should we do AI? And I think the example I just gave around AI for science, the applications are so profound that it would be a disservice for us not to look at AI. So, if I can, AlphaFold again: I'll give you an example where, because you now have a way of using AI to solve disease, you can look at the tail of all the diseases, right?

Starting point is 00:07:18 So typically most of the research focuses on the market of where the biggest diseases are. And there's actually this very inspiring story of how AlphaFold was used to solve the disease of this one particular family in Ohio, where they had this genetic disorder and nobody looked at it because it was this very rare variant. And AlphaFold came in and literally transformed their lives. Self-driving cars: AI to kind of reduce the number of accidents. There was actually this article from this person who wrote about that. And they said, look, the number of accidents has gone down dramatically with self-driving cars; it's just such a fundamentally transformative societal change.

Starting point is 00:07:55 I think we got to embrace it. Now, at the same time, we got to think about guardrails and we have to think about responsible AI. And again, I'll give you an example, a very trivial example, but I want to ground this because these are very profound conversations and I don't want to have an answer that sounds very preachy in some respect, right? And so we use AI to kind of think about data center cooling. And so, as we know, the data centers are where we build all of our servers and so on,

Starting point is 00:08:22 and cooling is one of the biggest chunks there. And so we had AI to kind of think about reinforcement learning to look at cooling and so on. Worked amazingly well. At Google, we've had like 25 years of optimizing data center cooling. We have this metric called PUE, power usage effectiveness. It used to be three. The ideal number was one. We brought it down all the way from three to 1.2.
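A short aside on the arithmetic behind the PUE numbers mentioned here. PUE is conventionally defined as total facility energy divided by the energy delivered to the IT equipment, so PUE minus one is the overhead (cooling, power distribution, and so on) spent per unit of IT energy. The sketch below is illustrative only: the function names and the 100 kWh IT load are assumptions made for the example, not Google data.

```python
def pue(total_facility_kwh: float, it_equipment_kwh: float) -> float:
    """Power usage effectiveness: total facility energy / IT equipment energy."""
    if it_equipment_kwh <= 0:
        raise ValueError("IT equipment energy must be positive")
    return total_facility_kwh / it_equipment_kwh


def overhead_kwh(pue_value: float, it_equipment_kwh: float) -> float:
    """Energy spent on everything other than the IT equipment itself."""
    return (pue_value - 1.0) * it_equipment_kwh


it_load = 100.0  # hypothetical IT energy in kWh, chosen only for illustration
before = overhead_kwh(3.0, it_load)  # overhead at the older PUE quoted above
after = overhead_kwh(1.2, it_load)   # overhead at the improved PUE quoted above

print(f"Overhead at PUE 3.0: {before:.0f} kWh per {it_load:.0f} kWh of IT load")
print(f"Overhead at PUE 1.2: {after:.0f} kWh per {it_load:.0f} kWh of IT load")
print(f"Overhead reduction: {1 - after / before:.0%}")
```

Read this way, moving from a PUE of 3 to 1.2 means the non-IT overhead per unit of IT energy drops from 2.0 to 0.2, which is why further gains in such a mature area are hard to find.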

Starting point is 00:08:42 Fantastic work. So a very mature area. We apply AI and we get like an additional 20 to 30% improvement in efficiency. Okay, really nice. But very interesting story here. So we were watching this system, and it's winter in Iowa in our data center, and the AI is doing something very non-intuitive. It's turning up the air conditioner, cranking up the air conditioner in the middle of winter

Starting point is 00:09:09 in Iowa. And we go like, okay, what's going on? And it seems to be kind of turning up the air conditioner really hard. And so we went and looked at that, and it turned out that the AI was responding to a sensor which was very close to where the de-icing algorithm was being run. So we were de-icing something. And because of de-icing, the local sensor was kind of noticing that the temperature was going up. And it was telling the AI, hey, look, things are warming up a little bit. And the AI goes like, oh, wow, okay, let me cool you down. And the more the AI cooled it down, the more the de-icing algorithm

Starting point is 00:09:41 was like, hey, wait, something is not working. I'm going to keep doing that. And so they were kind of literally fighting with each other and so on and so forth. And so in this particular case, I mean, it may seem like something very benign, but again, this is a data center, and the power management and cooling management can be pretty significant. And so we noticed it, and we actually had multiple checks and balances and so on, and we kind of solved this problem. Now, how did we do that? And this is a theme that you will hear me talk about. It's important to have responsible AI.
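The "fighting controllers" dynamic described here can be illustrated with a toy simulation. This is a minimal sketch under invented assumptions, not a model of Google's system: two simple integrating controllers share one sensor, the hypothetical de-icer wants that spot at 5 degrees C or warmer, and the hypothetical cooling controller wants it at 0 degrees C or cooler. All names, targets, and gains are made up for the example.

```python
def simulate(steps: int = 20, gain: float = 0.5):
    """Two controllers with incompatible targets reading the same sensor."""
    ambient = -10.0      # hypothetical winter temperature near the sensor (deg C)
    deice_target = 5.0   # de-icer keeps adding heat while the sensor reads below this
    cool_target = 0.0    # cooling controller keeps adding cooling while above this
    heat, cool = 0.0, 0.0
    temp = ambient
    history = []
    for _ in range(steps):
        # Both controllers react to the same reading from the previous step.
        heat = max(0.0, heat + gain * (deice_target - temp))
        cool = max(0.0, cool + gain * (temp - cool_target))
        temp = ambient + heat - cool
        history.append((temp, heat, cool))
    return history


history = simulate()
for step, (temp, heat, cool) in enumerate(history, start=1):
    print(f"step {step:2d}: sensor={temp:6.2f}  heating={heat:6.2f}  cooling={cool:6.2f}")
```

In this toy, the sensor reading settles between the two targets, so neither controller is ever satisfied and both efforts keep climbing step after step. A guardrail here might cap each effort or raise an alert when opposing actuators rise together; the specific checks used in the real deployment are not described in the conversation.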

Starting point is 00:10:10 And I talk about the three H's: hate, harm, hallucination. I think we need to be thinking about all of them very, very deliberately. And so we spend a lot of time on this, and in particular, going back to the data center cooling example, we had multiple guardrails. And one of the interesting research challenges, again, this is a podcast that I know a lot of people in the research community listen to. There are some really fundamental research challenges here. Like in this particular case, how do we train the model on these corner cases?

Starting point is 00:10:35 Because we don't train on a case where the data center exploded, right? We've trained only on the safe cases. And how do we kind of look at the edge cases and the boundaries? And so we came up with some really interesting ideas at Google, where we kind of perturbed the design space in incremental ways to be able to find the boundary conditions and go look at that. How do we put multiple layers of checks and balances? How do we kind of make sure that the data is not biased? How do we make sure that it is grounded?

Starting point is 00:11:02 Hate, harm, hallucinations. So I do think, again, we have a lot of evidence around how we can do that. And yes, I'm not minimizing the importance of doing this. We have to be looking at responsible AI. But I am optimistic that if we apply the brightest brains in our community to solving this problem, we will come up with a better answer. Like any other technology, we will harness fire for good purposes and we will kind of take it to where it needs to be. I love that answer and I really like your three H's. And I think my follow-up question is... I have more alliteration where that came from. Okay. Wonderful. Wonderful. Plays on words as a computer architect is a good thing.

Starting point is 00:11:46 So, you know, just as a quick digression, I once gave a keynote, which was called how an introverted computer architect learned to love networking. And it was a talk I gave to the networking audience. It was the open networking forum, and the play on words was a big hit. And I am a very introverted computer architect, and I did not know how to kind of do podcasts with celebrity podcast interviewers. So obviously, I figured out how to do that a little bit. You are too funny, Partha. You are too funny. Well, believe it or not, I'm an introvert as well. So I feel you there. We can use language to sort of mask some discomfort and create connection, I suppose. So I guess my follow-up question to what you were saying about all this stuff is: historically, in computer architecture, we live in a very, very layered industry. You've got devices, you've got architecture, you've got maybe like ISAs,

Starting point is 00:12:43 compilers, operating systems, software, you know, middleware, application level software. It's a very layered system. And what we work on is a very, very low level of that huge stack. And so when we build things that enable awesome capabilities like you were just talking about with the protein folding, we enable a lot on top. And in some cases, it can feel like it is up to the agents,

Starting point is 00:13:11 not AI agents. It just means like someone who's in control. It's up to the value-add chain closer to the top to determine which way the technology is going to be used, for good or for not so good. So I guess what I wonder in this case, you gave some examples where there was potentially more than just computer architecture going on

Starting point is 00:13:36 in order to create some of these guardrails, right? And so I guess my question to you is, do classical computer architects need to bleed into upper layers to think about these guardrails? Or is there anything they can do specifically at the computer architecture layer to assist in this sort of responsible AI effort? Yeah, no, I think really good question, Lisa. And you've given me a really nice opening to one of my favorite topics, namely co-design. I honestly think, if I think about the way we as a computer architecture community can make an impact in this new world,

Starting point is 00:14:15 it's got to be embracing co-design as a first-class design consideration. And what do I mean by co-design, right? So again, I talked about how we are building the rocket engine that's powering this new space age. And so how are we looking at it? At Google, we have this thing called the tensor processing unit, the TPUs. And Suvinay has done some really nice work on that, and he can talk about it better than I can. But the way we looked at the TPUs was, it's a custom silicon accelerator. I mean, it's optimized for AI/ML.

Starting point is 00:14:44 And so you start off with co-design as a first-class consideration, because some of the really nice innovations in TPUs, and this is true for GPUs and other accelerators as well, like the way the numerics were invented, the way SparseCore was invented, the way the MXUs were invented, all of these kinds of things at the TPU level started off from a deep appreciation of the hardware-software boundary.

Starting point is 00:15:03 But then we didn't stop there. The next level of co-design is you take these TPUs and you build larger systems with those. So you put them in servers and you put the servers in racks and you put the racks in data centers and you think about how do I do the networking. And we have some really amazing stuff with optical interconnects and reconfigurable topologies and using mirrors and MEMS and all that kind of stuff. And so we have to do that, but then we have to think about power and cooling as well.

Starting point is 00:15:29 And liquid cooling, for example: at Google, we have publicly said we were the first company to have multiple gigawatts of liquid cooling deployed in our data centers. And then on power, how do you think about sustainable, environmentally friendly power, and how do you think about medium voltage, low voltage, all that kind of stuff as well, right? And so that's the next level of co-design: to start thinking beyond chips to thinking about systems, but thinking about all of the elements of systems that come in. But then it doesn't stop there, because then what we do is we take that layer, but then

Starting point is 00:15:58 we need to co-design with the next layer of the stack. So we need to think about the software layer like compilers, the XLA compiler, for example, or frameworks, PyTorch or TensorFlow or JAX, or even higher level, how do you kind of think about the scheduler, Kubernetes and so on. So we now need to think about compute, storage, networking, all the layers of the hardware and software, and put that into one big package,

Starting point is 00:16:20 which is really the infrastructure as a service that we provide on the cloud, right? And Google calls it the Google hypercomputer, and so that's fine. But again, you see another layer of co-design that needs us to have a deep appreciation of not just our conditions, but also thinking about the additional adjacencies in the stack as well. And then you take all of this and it doesn't stop there, because you now have your infrastructure

Starting point is 00:16:41 as a service, and we need to be thinking about how to co-design with the models. Because we had a paper at ASPLOS where we showed that co-designing the model and the infrastructure leads to multiple Moore's Law generations of benefits and so on. And so you co-design with the model, but you can co-design all the way up to the model garden, the agent garden, the applications and so on. And so this level of co-design from chips to systems, systems to platforms, platforms to solutions, solutions all the way to applications, is incredibly powerful.

Starting point is 00:17:09 And so in the last 10 years, if I look at the TPU roadmap, we just announced that DPA-8 and 8D, and we have multiple generations of TPUs that we have looked at, this has been a very, very powerful approach, Lisa, in terms of how we've been able to meet this moment. And one of the things that's very interesting that is worth pointing out is that the supply-demand gap is huge.

Starting point is 00:17:31 What I mean by that is, typically in the computer architecture community, we talk about Moore's Law. And Moore's Law, very informally, is that performance doubles every two years or three years for the same cost. Right. And again, we can quibble over exact definitions, but roughly let's say that's what we are talking about. If you extrapolate the growth of model sizes over the last five years, from the very first transformer paper to where we are with Gemini 3.1 or whatever, model sizes have been doubling every two to three months.

Starting point is 00:18:00 And I'm going to say model sizes are a proxy for the compute involved as well. And so on one hand, you have the demand for computing doubling every two to three months. On the other hand, you have the supply for computing, at best, doubling every two to three years. And we all know that Moore's Law is dead and we need to kind of do more things to do all of that. And so that is why co-design is such a powerful technique, because the approach that I talked about for co-design, from chips to systems, systems to platforms, platforms to solutions, really gives us orders of magnitude. So we've had on the order of 10x to 100x improvement in performance per watt, performance per dollar, over the last few years by kind of thinking about this co-design as a first-class design consideration.
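To put rough numbers on the supply-demand gap described above, here is a small back-of-the-envelope sketch. It simply compounds the two doubling periods quoted in the conversation over a common horizon; the 24-month horizon and the function name are assumptions chosen for illustration, and the result is not a measured figure.

```python
def growth_factor(elapsed_months: float, doubling_period_months: float) -> float:
    """How much a quantity grows if it doubles once every `doubling_period_months`."""
    return 2.0 ** (elapsed_months / doubling_period_months)


horizon_months = 24  # hypothetical comparison window

# Demand: doubling every 3 months (slow end of the "two to three months" quoted).
demand = growth_factor(horizon_months, 3)
# Supply: doubling every 24 months (fast end of the "two to three years" quoted).
supply = growth_factor(horizon_months, 24)
gap = demand / supply

print(f"Demand growth over {horizon_months} months: {demand:.0f}x")
print(f"Supply growth over {horizon_months} months: {supply:.0f}x")
print(f"Gap to close by other means (for example, co-design): {gap:.0f}x")
```

Even with the choices most favorable to supply, the gap after two years is two orders of magnitude, which is consistent with the argument that gains have to come from across the stack rather than from process scaling alone.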

Starting point is 00:18:39 Right. So you talked about the tension between the supply and demand. So we have immense demand for compute that's just growing at a rapid pace. At the same time, our supply is sort of constrained and we need to look at multiple layers of the stack. At the same time, you know, when we were talking to Tushar as well, he pointed out that even in the context of these inflection points that we have had in the last five years and looking forward, there have been multiple inflection points.

Starting point is 00:19:02 For example, in the early stages of DNNs was when we started co-designing these systems and accelerators and so on. And it looks like we've actually gone through at least a few different paradigm shifts, from early DNNs to large language models. And now we have generative AI. So can you maybe take a step back and put the landscape in perspective for our listeners? Like, what have we actually achieved over the last five years? And there are a few elements that you talked about in terms of co-design. But also, looking forward, why might this be yet another inflection point that could have even more orders of magnitude change in the way we design systems, and the opportunities that lie ahead?

Starting point is 00:19:35 Yeah, I think I often make this joke about AI, and I usually say last week was an amazing year in AI, and I think I saw this in a tweet somewhere, right? And there are weeks where it literally feels like we have crammed a year's worth of innovation into a week. And so in some respects, going back to the very first question on how do I wake up every day, I think it is by far the most exciting time of our careers. I think we are going to be challenged in interesting ways, but we are going to be able to make contributions in interesting ways.

Starting point is 00:20:09 All of this is to say that we are still in the very early stages of the AI revolution. And why do I say that? I mean, again, I look at 10 years, and we've done some pretty amazing stuff and so on. But then I look at the growth. Just in the last year alone, and we presented some of our data at Google Cloud Next, if you look at the growth that we have had, we have seen a 50x increase in our accelerator usage. So we are literally at a quadrillion tokens, more than a quadrillion tokens now.

Starting point is 00:20:36 Every minute I speak, it's increasing across all our AI surfaces and so on. So we are talking about orders of magnitude improvement, 10, 20, 50x in terms of how the AI growth is going. And it's not stopping anywhere. Because each model, like if you think about more reasoning-based models, and if you look at, again, some of the Jevons paradox with how, as models get cheaper, you're going to use things more and so on, I think we have incredible opportunity. So we literally need orders of magnitude further improvements in terms of capacity, in terms of efficiency, and so on. And so that, to me, is kind of that inflection point. Now, in terms of, are there specific ideas we look at? I think, I mean, there's a lot of work in the academic community.

Starting point is 00:21:16 I think if you go to, say, something like NeurIPS or any of the AI communities, you hear all the exciting opportunities that you see there. I mean, large language models themselves are still kind of innovating. I mean, the step function between Gemini 3 and the prior version was pretty fantastic, and the step function we see with the next version is equally fantastic, and we have so many more ideas that our colleagues in Google DeepMind are exploring, each one of which sounds very powerful in terms of what we have, but we also have so many other markets to apply to. So AI for education, AI for entertainment, AI for productivity.

Starting point is 00:21:51 I personally have been using a bunch of AI tools to improve my personal productivity in some really interesting ways that honestly are profoundly transformative to how I think about my day-to-day operations. And so both the volume and the quality and quantity of compute is changing. And so I think we have tons of opportunities ahead. Yeah, so something that you were saying made me think in your last two answers. One is, you know, you're talking about examining all sorts of things that academia is doing. And the other is in your previous answer, when you were talking about, you know, chips to systems, systems to platforms, platforms to solutions, that kind of co-design continuum, it made me wonder

Starting point is 00:22:31 if you feel like you must be a large vertically integrated type of institution to be able to do the sort of co-design you were talking about, because it seems like it would be very hard for an academic institution to do that level of deep-layer co-design, or even, say, a hardware startup, or a small startup. So do you need to be a Google or a Microsoft or a Meta or something like that to be able to do this sort of work? I think the short answer is no. And I think, I mean, the computer architecture community is full of examples where this

Starting point is 00:23:09 question has come up several times and we have still made contributions, right? I mean, universities don't have the resources to build their own fab and do a bunch of stuff, or even do the kind of excruciating level of detail that computer architecture needs to go to, and we have still made some profound contributions that have transformed the industry. So I don't really think so. But honestly, I mean, let's look at some of the grand challenges that I would personally rank as my, let's say, top three. The things I talk about: number one, efficiency and sustainability.

Starting point is 00:23:39 How do we kind of make sure that we are thinking about environmental efficiency, scope one, scope two, scope three emissions, and how do we kind of think about co-design as a mechanism to do that? Really important problem, lots of exciting opportunities there, lots of really strong academic work in this space as well. Another option that academia hasn't done as much on is this notion of agility. And there's a lot of work around performance and performance per watt and so on. We often don't look at speed. And oftentimes in industry, the area under the curve is really critical. And you could argue like, hey, is that an industry problem that only industry can solve? And no,

Starting point is 00:24:15 we have had some really amazing progress in academia in the last few years that has solved that as well. The third problem that is very, very hard that industry always talks about is what I call capability. And what I mean by capability is kind of going back to the question you asked a little bit earlier, Lisa, which is, hey, how much do we need to worry about the top of the stack and the hardware layer underneath that? When we think about that, almost every major silicon accelerator, at least every one that I have worked on, the successful ones have always been successful because they focused on unlocking a new capability rather than focusing on efficiency. In other words, when you're building a new accelerator, the typical instinct is to draw a pie chart of where the cycles go and you say, hey, here is the biggest chunk of cycles and let's kind of go accelerate that. But in reality, that is not quite the right approach. If I look at TPUs, when we first started doing this, the AI/ML cycles in the fleet were single-digit percentages. The reason we did the TPUs was because it allowed us to unlock a capability: training that took weeks could be brought down to days and so on, right? Same thing with our video accelerator. I was very lucky to work on a

Starting point is 00:25:23 really amazing video accelerator that is probably powering this particular meeting right now as well. And when we did that, yes, you could kind of have efficiency in transcoding. You could kind of make transcoding run much faster and do a bunch of these things. But the real reason why that accelerator was successful is it enabled capability. It enabled 8K video, it enabled immersive video, it enabled live TV, it enabled cloud gaming. So capability is by far one of the hardest problems. And typically the criticism I hear is like, hey, how can academia do that? It's really you're unlocking a new market. But again, you go back and look at the history. Academia has often come up with some of the most compelling answers to all of these as well. Right. And so to me,

Starting point is 00:26:03 this notion that we need, yes, of course, infrastructure is key. We need to kind of work on that. But we've had plenty of industry-academia partnerships to kind of go look at this as well. So I do think there are a lot of opportunities, and the track record speaks for itself in the sense that the academic community in computer architecture has made some profound contributions, and I think they'll continue to do that. You hear that, listeners? Get to it. Yeah, so speaking about the top three challenges, so efficiency, agility, and capability, what do you see as, so given that we have a new toolkit, a technology toolkit in AI,

Starting point is 00:26:34 and the capabilities seem to be improving dramatically, do you see any need for some mindset shifts in the broader research community so that we focus on the right set of problems? Are there some under-highlighted problems or underappreciated problems in the space, or problems that you wish had a lot more people working on them? So, of course, people work on performance and efficiency, as you mentioned, but there are maybe a new slate of problems, new categories of problems that are coming up as we look forward into the future. Are there certain problems that you feel people should do more work on, or that you think we should shine a light on so that more people would work on them? We need more ideas in the space and there are new frontiers being broached over here. So we should try and focus and move that frontier a little faster. Yeah, no, I think really good question.

Starting point is 00:27:18 And something I spend a lot of time thinking about as well. And when I think about some of the challenges ahead, we do need some step-function innovations to kind of get us to where we want to get to, whether it's sustainability, agility, capability, performance, performance per watt and all of that stuff. And so I've been thinking about how AI can be used to get more efficient AI tools, right? And if you look at AI for biology, biology is a very, very complex space. And if we can kind of have these major breakthroughs in how human systems work,

Starting point is 00:27:52 why would we not have AI to kind of help us make breakthroughs in how computer systems work? And so I've been spending a lot of time thinking about AI for systems. And maybe this is a shameless plug, but I did give a keynote at ASPLOS recently where I talked about this. And the keynote was called From Unicorns to Centaurs. And the idea really is that unicorns are basically the custom silicon accelerators that defined the innovation in the last era of computer architecture. And centaurs is where I think we need to go, this notion of human-AI hybrids and how we can work synergistically to do that. And, just as a quick digression, the word centaurs is not my invention, right? Garry Kasparov, the chess champion, when he first got defeated by the AI machine, he wrote a book called, I think it's called,

Starting point is 00:28:39 Deep Thinking or something like that. And he basically pointed out that it wasn't the world's best chess champions, the humans, who won the chess championships, or the world's best machines who won the chess championships. It was average humans working with average machines that won the championships consistently. And so he coined this term, centaurs, and he said we are entering the era of centaurs. And I honestly believe that is where we are, this notion of having AI as a tool to help us do even more innovative stuff, I think it's where we are. So, Suvinay, to your question,

Starting point is 00:29:12 so to me, I've been playing around with AI for pretty much every layer of the stack. We have looked at AI for chip design. We have looked at AI for computer architecture design, AI for systems design, AI for application design, AI for entire model design. And every one of those things, we have been surprised by how powerful AI is as a tool to get us to the next wave of innovation. And so that is the challenge I would offer all of our readers, listeners here, is basically how do we embrace AI as this powerful paradigm.

Starting point is 00:29:42 It's almost like saying, hey, look, we went from slide rules to calculators, and we need to start using calculators. More importantly, I think the metaphor for this audience is, like, we now have very sophisticated EDA tools, and we should not still be relying on SPICE models and mathematical analysis of circuits.

Starting point is 00:30:00 We should use EDA models to build the next wave of things. So we have a powerful tool that I really think can transform things quite a bit. Yeah, I love that answer because I think a lot of people, whether in the tech community or not, are having a panicky moment with respect to AI and what that means for people and work going forward. And I think the thing that gives me a lot of hope for the future is that you could use AI to replace yourself. You could also use AI to amplify yourself. And I think probably the question in a lot of people's minds is how do I use AI to amplify myself instead of replace myself. And I think I've heard a quote once. I forget who it was from. So,

Starting point is 00:30:40 and I believe it was someone famous. So hopefully whoever they are doesn't listen to this and doesn't mind the lack of attribution. But they said something like, in the future, there will be two kinds of people, people who tell the AIs what to do and people who get told by AIs what to do. And I think, you know, if you want to be on the side of being able to tell the AIs what to do, that that's going to have to be this kind of centaur-like collaboration. that you're talking about where you have an idea of what you want. The AI helps you bring it to fruition. It couldn't do it alone because it needs some thought from you.

Starting point is 00:31:16 You couldn't do it alone because it's too much grunt work. But if you're just doing what an AI is telling you and you're not bringing anything to the table except for the fact that you have a physical body, let's say, then that's not a good place to be. And if you're doing it all yourself, then you're potentially letting an opportunity of something to accelerate your own capabilities go by the wayside. So this centaur-like analogy I like.

Starting point is 00:31:40 Plus centaurs are the good guys in a lot of Harry Potter, right? Or at least Ronan was. Very nicely said, Lisa. It looks like we have to have you be one of the guests in the podcast and talk more about some of your thoughts here. I feel like I share maybe too many of my thoughts in the podcast. Sometimes we should be highlighting the guests more. But all that to say, I really liked your unicorns and centaur analogy.

Starting point is 00:32:07 I feel like I should have gone to ASPLOS to hear your keynote. Well, I suspect it will be on YouTube, so you can watch it sometime soon. But going back again to Suvinay's question, following up on centaurs, again, I don't want to say that AI is a panacea and it solves all the problems. We already talked about responsible AI. So I think that's super important. And so you need to learn how to use AI. So I would offer that to the community as an important challenge for us to figure out how do we make sure that this tool is being used well. It's being developed well.

Starting point is 00:32:35 It is being co-designed well. So that's problem number one. I think the next thing, again, is too often people think of AI as just the models, but it's really the models, the agents, the harness. And we don't spend enough time thinking about the systems that the AI encapsulates. And so what are the data sources? So one of the things I've been pushing really hard, and I've been very fortunate to help lead a coalition on this, is what does the data commons for chip design look like?

Starting point is 00:33:03 Where do we as a community come together and share the data that helps us all kind of do that? I mean, we've had an amazing record with SimpleScalar or Simics or M5 and all that stuff, and we've done really well at sharing infrastructure that helps us as a community to do that. What does that mean in the context of this new wave of innovation where AI is going to help us do that? And then the last bit is, again, a little bit non-intuitive, which goes back to this theme of co-design. We need to really think about the full stack, right? And I often call this the focus on the AI-infused workflow. And what I mean by that is like,

Starting point is 00:33:40 if you think about like AI for coding, AI for coding is transformational. It can make things faster. But if you have 10 times faster coders, the next bottleneck becomes testing. Almost always you have the same thing. If I can build accelerators faster, if I can have AI building a chip faster,

Starting point is 00:33:56 we're still going to need to build the software stack for that. We need to build the compilers. We need to build that. Right. And so there's always this notion, that you need to think about the entire story together. And so I do think we are still in the very, very early stages. I promised you another alliteration, Lisa, so here is the next one.

Starting point is 00:34:15 I talk about the four Ds. So we have drudgery, we have development, design, and disruption. And so almost always in an AI journey, you start with something where we take the very simple tasks, and that's the drudgery part of it, and we kind of do that. So we have had, for example, AI to convert x86 to ARM, or AI to convert TensorFlow to JAX and so on, right?

Starting point is 00:34:41 And then you go into the AI for development, which is where we have like Antigravity, for example, where you have AI for coding and so on and so forth. And then as you go to AI for design and disruption, some of the challenges that I talked about, taking this full AI-infused workflow approach, thinking about systems for AI, how do you scale AI and so on,

Starting point is 00:34:59 those get very, very important as well. Love the alliteration again. Now I have to wonder, do you ask AI to help you with these alliterations, or do you come up with them yourself? I actually come up with them myself. But can I tell you a funny story here? You know how one of the things that you do is when you go to a search engine and search for yourself or something like that? So I went to Gemini, and Gemini is Google's model, obviously, and I said, hey, can you kind of write a keynote abstract in Partha's style? And the first thing it came up with was like, here is a title,

Starting point is 00:35:36 and it said, designing disaggregated data-centric data centers. And then I said, why did you come up with this title? And it said, Partha has a style to how he does that. He always likes alliteration in his titles. And he always kind of sets up something about the end of Moore's Law, and he always ends up saying co-design. And so I was like, oh my God, I'm so predictable. That is indeed what I say.

Starting point is 00:35:59 So answering your question, yes, I do come up with alliteration a lot. It helps me remember. So because otherwise, there's so many concepts there, it helps me prioritize. And it's honestly fun. Yeah, I mean, it is a great mnemonic to be able to remember. I mean, I was trying to write down the four Ds, but I can't write as fast as you speak. But then by the time I got to the end, I was like, shoot, I forgot the fourth one. But wait, I know it starts with a D and then I remembered it. So it is a helpful, a very helpful mnemonic. So maybe this is a good time to wind the clocks back a little bit. So you've had a fairly storied career across, you know, multiple industry labs. You've seen multiple epochs of

Starting point is 00:36:36 transformation in the computing industry. Can you tell our readers, like, how did you get to Google? How did you get interested in computer architecture very broadly? What has the trajectory been like for you? I think it's like most people who follow interesting things and you kind of go do stuff. So I first learned about computer architecture in my undergraduate at IIT Madras. And it was because I think everybody was doing computer science. And I loved computer science. I remember I learned coding with the ZX Spectrum, and I suspect many of the listeners for this podcast will not even know what it meant.

Starting point is 00:37:11 I remember my dad was really very impressed when even before the ZX Spectrum, we had a Casio. It was literally a four-line display, and initially it was a one-line display. And I figured out how, and this was, I was in my seventh grade or something, and I figured out how to code cricket. And I had these stick characters made of pixels that could run. So it was a random number generator, and it would say it's a four or a six, and then the stick character would run. And my dad was so impressed that I had learned to program and he would show everybody. It's like, oh, look, my son made this stick character move.

Starting point is 00:37:46 And I was like, you know, computer science and computer stuff seems to be a good place to be in. So that's my origin story a little bit. But since then, I've been incredibly fortunate. I went to IIT Madras where I learned some of the basics and then Sarita Adve at Rice, she was kind enough to recruit me into Rice, and I had a fabulous experience there working on some really amazing stuff. And we worked on computer architectures. We worked on building large simulators. And then I joined what used to be Digital Equipment Corporation, which was an amazing group of people.

Starting point is 00:38:18 And they got acquired by Compaq. Compaq got acquired by HP. And then, again, I joined Google, like about 13 years back. And I've been very lucky to be at the forefront of doing stuff. And again, I think one of the things that's been really helpful is I've been very lucky to be at kind of the leading edge of finding what the next big problem is. So I started working on ILP multiprocessors back when people were looking at ILP uniprocessors. And we did some really amazing work there. I worked on power management back when people were looking at performance.

Starting point is 00:38:49 And we did some amazing work. Every phone on the planet uses some of the technology that we developed with energy-aware user interfaces, heterogeneous multi-core. So that's been pretty fabulous. But then I switched to power management for data centers back when nobody was looking at it. And now, again, power management for data centers is the number one problem with everyone thinking about it and so on. And then again, I switched to blade servers and what we were doing there, data-centric data centers, custom silicon accelerators, now AI for systems. So I would say that's been kind of very helpful.

Starting point is 00:39:21 And I don't know if it is luck or brilliant intuition or amazing mentors. I'm going to say it's because I listen to some amazing podcasts where they had really good speakers. And so I will attribute all of my intuition to listening to some podcasts. And if anybody wants some suggestions for podcasts, I think you're listening to one as well. That's amazing. Yeah, well, I was going to ask you whether you thought it was luck or intuition, because a lot of the guests that we've had, they always attribute a lot of their success to luck, which is very humble. But they're, you know, in many ways, I'm sure for our,

Starting point is 00:39:58 listeners, they would like something maybe more, not more satisfying, but more replicable than luck, right? Something they can potentially do for themselves. And so maybe you can talk a little bit more about your, like, problem selection process. Yeah, I think, Lisa, I mean, luck obviously always plays a role. And I think, I mean, nobody's wrong when they say luck, but I'm a systems person. I'm a computer architect and we disaggregate problems into patterns. And so I do kind of dissect success and career growth with the same intensity I dissect a computer architecture problem as well. And in fact, at one of the architecture conferences, I gave a talk to the students.

Starting point is 00:40:44 And I titled it the one-word secret to succeeding at life and at Google. I'm not going to tell you folks what that one word was. So success does leave clues. There are patterns to success. And I often find that when you go look at successful people, there are some patterns that show up. And in my case, I kind of encapsulated it into a few things. I usually tell people, be useful, be nice, be deep, be consistent. Sorry, Lisa, not alliterative enough, but still it rhymes reasonably, I guess.

Starting point is 00:41:14 And the one word is actually a much easier mnemonic to remind people, but I'm not going to go into that right now. So be useful, be nice, be deep, be consistent, right? And these were, again, lessons I learned from my mentors. So be useful. It's finally about the impact you drive. And so whenever I pick problems, I always look at what is the impact. And I talked about AI for science, right? And at the end of the day, I think if you, I mean, sometimes we work on, like, we may be working on a policy in a branch predictor,

Starting point is 00:41:44 but the branch predictor is part of a CPU that gives us Moore's Law that powers the world and so on. So you always want to tie back to how useful you are. And so be useful is something that I spend a lot of time thinking about. It's something that my PhD advisor, Sarita, would always say: if it's something worth doing, it's worth doing well. So kind of go look at that really well. So be useful. Be nice.

Starting point is 00:42:07 At the end of the day, I do think the connections you make, the human element of what you work on is super important. So surrounding yourself with people you enjoy working with and you learn from, but are also nice people who teach you. I've been incredibly lucky to have amazing mentors who have been very, very selfless. And this is the part where I don't know if it is luck, but none of them needed to do all of the things they did for me. And that's one thing I wake up kind of saying, can I pay it forward? Because I wouldn't be where I am without all of these people. And so be nice. My mom was like a fantastic individual who would always kind of do this kind of stuff.

Starting point is 00:42:42 And then be deep. I think ultimately, you have to be curious. You have to learn. You have to kind of understand what you're talking about. And I think, honestly, if there is one big risk in the world of AI that I would think about, it's the fact that we can outsource our thinking to somebody else. And I always say that is why AI is a tool. You still need to have that critical thinking. I think, be deep is really, really, really important. And again, this is a lesson my dad taught me as well. And so be useful, be nice, be deep. I think to me, at least seems to have been the algorithm that has worked. And if I go back to any of my successes, then I kind of try to deconstruct what happened, almost always these are the three common themes that led me to that place in the first place.

Starting point is 00:43:29 And I continue to use it as a rubric to evaluate what I want to do next. And then be consistent part of it. And this is again, one of my mentors mentioned this to me. And I often, when I go to parties and I find some young kids, middle schoolers, high schoolers, I often give them this magic trick. And I say, hey, if you improve 1% every day in 300 days in a year, how much do you think you're going to improve? And almost somebody hopefully comes up and says, 300%. And I usually tell them, no, it's actually 3,600%. And so where did you get this magic factor of 10?
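
As a quick check of the compounding arithmetic in this exchange, here is a minimal Python sketch. The function names are illustrative and not from the conversation. Under these assumptions, 1% a day compounds to roughly 1,880% over 300 days and roughly 3,680% over 365 days, so the quoted figure of about 3,600%, and the factor of 10 over simple addition, lines up with a full 365-day year.

```python
def simple_gain_pct(daily_rate: float, days: int) -> float:
    """Percent improvement if each day's gain is simply added up (no compounding)."""
    return daily_rate * days * 100.0


def compound_gain_pct(daily_rate: float, days: int) -> float:
    """Percent improvement if each day's gain builds on the previous day's total."""
    return ((1.0 + daily_rate) ** days - 1.0) * 100.0


if __name__ == "__main__":
    # Compare adding 1% per day with compounding 1% per day.
    for days in (300, 365):
        simple = simple_gain_pct(0.01, days)
        compound = compound_gain_pct(0.01, days)
        print(f"{days} days: simple {simple:.0f}%, compounded {compound:.0f}%")
```

The gap between the two columns is the whole point of the party trick: the simple sum grows linearly with the number of days, while the compounded value grows exponentially.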

Starting point is 00:44:03 And that's the power of compounding. It doesn't make me very popular at parties with kids to have math-related jokes. But the point is, it is amazing what compounding does. And so this factor of 10 by just incrementally, improving things. So I think the be consistent, apply to be useful, be nice, be deep. I think to me is a very, very powerful thing. So hopefully that's useful to the listeners here as something that they can apply as well. But I'm sure again,

Starting point is 00:44:30 success is a very personal thing. And I think each person finds their way different and different things work for different people. But this seems to be something that is a foundation that you can build on. Yeah, very pertinent words of wisdom. I'm sure our listeners will definitely appreciate it. while your math puzzles may not be very popular with the young kids, maybe you can tell them about the other cool thing, which is you've actually received a primetime Emmy Award, which is known to the broader public.

Starting point is 00:44:56 It's not a purely technical award. And rather interestingly, you're the second guest on our podcast who has been a recipient of the Emmy Award. So Vivienne Sze, whom we had a few episodes before, was also a winner of this. So maybe that's a good time to tell the kids that, hey, you know what? The Emmy Awards that you see have actually won this for something

Starting point is 00:45:13 that is deployed at scale. And if you ever watch a YouTube video or upload anything onto YouTube, it does go through one of these inventions. Yeah, and so if you become a computer architect, you can have as many or more Emmy Awards than Taylor Swift. I need to check whether that number has been updated, but that's what my teenage daughter came in with when I won the Emmy,

Starting point is 00:45:37 and she's like, Dad, do you know you have more Emmy Awards than Taylor Swift? I'm like, for that brief fraction of time, I was a cool dad for my teenage daughter. That is absolutely amazing. Well, I got to say, too, you know, you kept talking in the beginning about how you were so excited to be on this celebrity podcast. It sounds like we're the one who's lucky to have an Emmy Award winner to be on the podcast. You're the celebrity. But, yeah, it is cool that we've now had two Emmy Award winners on the show.

Starting point is 00:46:08 Yep. I'm going to use this in my introductions at house parties, where I've actually interviewed like two different Emmy Award winners. Well, you just have to come to my house. We have the Emmy. And unfortunately, I haven't unpacked it. It's in a box because I haven't figured out where to put it. But you're more than welcome to take a picture with it for the social media post as well.

Starting point is 00:46:31 Yeah, for sure. So what was the Emmy for? Can you share that with us? Yeah. So the Emmy was actually, first of all, it wasn't an individual Emmy. It was an Emmy for the team. for the technology. And it was for the work that we did with building

Starting point is 00:46:47 this cloud-scale accelerator to accelerate video processing. And so if you look at video processing, it's at least, I think, as big a killer app as AI/ML, because video underpins everything from YouTube to video meetings like these to cloud gaming to a whole, I mean, Nest cameras, whatever, right? And it turns out they take a huge amount of computing. And so one of the technologies we came up with

Starting point is 00:47:15 was a custom silicon accelerator to accelerate video coding. And the big aha innovation here was that we did co-design. So we basically broke down how we built the silicon accelerator across meaningfully interesting innovations in hardware that worked with meaningfully interesting innovations in the software stack.

Starting point is 00:47:33 And that together got us a very simple, but incredibly energy-efficient design. And so when they kind of had the citation for the Emmy award, they said, this is a person who helped in greening the screen. So that was their catchy way of describing it. But it turns out that it's an incredibly energy-efficient way, an incredibly carbon-efficient way of kind of getting all this video in the world.

Starting point is 00:47:55 And it's also a cost-efficient way as well. So you can kind of do a lot more. So incredibly privileged to be part of a really broad team. I think we worked across silicon, YouTube. We worked across hardware, software, amazing team. So it was one of my highlight moments to hold the statue and say thank you to the academy on behalf of the team. So it was amazing.

Starting point is 00:48:19 That's awesome. So I like it. It was not alliterative, but it's rhyming, greening the screen. I like that. And yeah, and so that is, I mean, another thing that comes across for a lot of our guests is that they have tended to have successful projects across large collaborative groups. And so I think one thing, I mean, we probably talk about this ad nauseam episode after episode. But, you know, sometimes the population that likes to work on technology can be kind of, you want to work on something yourself, you want to write your own code,

Starting point is 00:48:54 maybe you don't push it back necessarily, maybe you don't collaborate in larger groups. But at the end of the day, that kind of skill to work across people and now maybe across machines as well, it's really important. And I, you know, some of the things I read about is that now, you know, you, whereas it was very important before that you have skill yourself, like for example, you have to write, if you want to be a code ninja, you better be able to write code. Now the skill is to be able to describe what you want very clearly to an AI to get the code that you want. And so being able to communicate clearly, concisely is probably even more important than ever.

Starting point is 00:49:38 Would you say? I think we are all going to have to build a new set of tools. I think the communication is absolutely important, but I think it's the critical thinking behind what we do. It's the judgment. It's the taste behind how we pick problems. In a world where ideas are commoditized,

Starting point is 00:49:56 the judgment behind the ideas, the taste is going to become important, but it's also execution as well. Again, that's why I said, understanding the full life cycle becomes very important. Honestly, Lisa, it's very early days. I don't think anybody can stand here and say, this is the most important stuff.

Starting point is 00:50:11 I think the most important stuff is to be aware that we are in the middle of a big revolution, and we need to adapt. And exactly how we need to adapt, it will change as we go along, and the technology will kind of do that. I mean, I remember taking a course on prompt engineering, and I used to be amazing at prompt engineering, and I would do all of that. But now I think you can talk to the AI and kind of brainstorm the prompt. So I don't know if I really need to do that.

Starting point is 00:50:37 Going back to, is clear, concise communication important? I can kind of have a long conversation and get to where I want. But critical thinking is important. So to me, really being able to adapt and being able to have the values, I think, is going to be important. And that's one of the things I think the liberal arts, like philosophy and understanding why we do what we do. I think it's, and this is the other layer of co-design that I always talk about. I think it's, I mean, I pride myself on being an engineer, but I think we've got to kind of understand that the roles need us to kind of think a little bit more broadly. So I've been reading a lot of really fun books on kind of how to think about all of these things and so on.

Starting point is 00:51:13 So I do think I would say the single most important skill, if I had to pick one, is the understanding that we are in the middle of change and to be adaptive to change. Yeah, that's very interesting, the idea that now, especially with the increased capability of talking to the AIs, you can have quite a few iterations of back and forth before you kind of arrive at your final product. Maybe now's a good time to challenge you. This is totally lighthearted, but when Google as the search engine first came out, when I was a lot younger, one thing that I developed and felt that I was good at was what I, and many people I think, called Google-fu. So, you know, between me and a bunch of people, if we were trying to find something on the internet, I somehow had a sense for what the search term really should be to get the information that I want. So I say, like, my Google-fu is strong. So now with respect to talking to these AIs and using this critical thinking and using the judgment and being very context aware, we need to come up with something also as catchy. And you seem like the one to figure out what the new version of Google-fu should be. Well, I think much like Google, I think the big thing I like about AI is it's also a democratizing technology. And just like now, everybody uses Google search and we don't need any magic superpowers to use Google search

Starting point is 00:52:35 and you get the answer pretty fast, the technology is going to improve, right? And so again, so I do feel that we may not even need to invent a term just yet. I think if you wait, at the pace of progress AI is making, I think it's going to be fairly accessible and hopefully usable. We didn't talk about one of my other favorite ideas around co-scientist and AlphaEvolve. Oh, tell us, tell us. We have time.

Starting point is 00:52:58 So I think the next big wave in AI for systems is this notion of agentic self-improvement, right? And it works amazingly well. And so I had this really nice student from Berkeley, who is in his early PhD, and I asked him to say, hey, can you kind of, we had this tool called AlphaEvolve, which if people don't know,

Starting point is 00:53:19 AlphaEvolve is this computational science agent from Google that uses evolution to come up with a better thing. So if you can write anything in code, and you can verify it, it automatically fine-tunes the code and it invents amazing algorithms and so on. It's invented a new sort algorithm. It has figured out how to do matrix multiply in the minimum number of steps.

Starting point is 00:53:39 Very, very powerful tool. And so I said, why don't you use AlphaEvolve for computer architecture? And as many people among this podcast's listeners know, we have three championships in computer architecture. We have cache replacement. We have prefetching and branch prediction. And so he picked cache replacement. And within 18 days, we were the world champions on the cache replacement championship.
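
The evolve-then-verify loop described here can be illustrated with a toy Python sketch. This is not AlphaEvolve and not the championship infrastructure: AlphaEvolve is described above as evolving code, whereas this sketch only mutates two numeric weights at random. The synthetic trace, the names `make_trace`, `hit_rate`, and `evolve`, and the scoring formula are all assumptions made up for illustration.

```python
import random
from typing import Dict, List, Tuple

# Hypothetical policy encoding: (recency weight, frequency weight).
Params = Tuple[float, float]


def make_trace(n: int, seed: int = 0) -> List[int]:
    """Synthetic trace: a small hot set mixed with a one-pass streaming scan."""
    rng = random.Random(seed)
    trace = []
    for i in range(n):
        if rng.random() < 0.5:
            trace.append(rng.randrange(8))  # hot blocks, reused often
        else:
            trace.append(1000 + i)          # streaming blocks, never reused
    return trace


def hit_rate(params: Params, trace: List[int], capacity: int = 8) -> float:
    """Simulate a fully associative cache; evict the line with the highest score."""
    w_recency, w_freq = params
    last_use: Dict[int, int] = {}
    uses: Dict[int, int] = {}
    hits = 0
    for now, addr in enumerate(trace):
        if addr in last_use:
            hits += 1
            uses[addr] += 1
        else:
            if len(last_use) >= capacity:
                # Older and less frequently used lines get higher eviction scores.
                victim = max(
                    last_use,
                    key=lambda a: w_recency * (now - last_use[a]) - w_freq * uses[a],
                )
                del last_use[victim], uses[victim]
            uses[addr] = 1
        last_use[addr] = now
    return hits / len(trace)


def evolve(trace: List[int], generations: int = 200, seed: int = 1) -> Tuple[Params, float]:
    """Mutate the current best policy and keep the child only if it verifies as no worse."""
    rng = random.Random(seed)
    best: Params = (1.0, 0.0)  # (1, 0) scores by recency only, i.e. plain LRU
    best_score = hit_rate(best, trace)
    for _ in range(generations):
        child = (best[0] + rng.gauss(0, 0.5), best[1] + rng.gauss(0, 0.5))
        score = hit_rate(child, trace)
        if score >= best_score:
            best, best_score = child, score
    return best, best_score


if __name__ == "__main__":
    trace = make_trace(5000)
    print("LRU hit rate:", hit_rate((1.0, 0.0), trace))
    print("Evolved policy and hit rate:", evolve(trace))
```

The design choice worth noting is that the evaluator (`hit_rate`) is the only source of truth: a candidate survives purely on its measured score, which is why the quality of the simulator matters so much in this kind of search.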

Starting point is 00:54:01 And so we went and looked at the algorithm, and it turned out, it came up with some really nice things. It came up with things where it said, here is why I'm going to do aliasing. And so a bunch of small changes, nothing profound, but they all made sense. And what was very interesting is that the winning margin between this new approach and the prior winner was comparable or even better than the winning margin from the last two years of competition. So in other words, typically these kinds of championships have one graduate student working with their PhD advisor for a year to kind of go do that. 18 days. And by the end of the month, we had won all three championships, branch prediction, cache replacement, and prefetching. That's incredible. And even more interesting, I said in 18 days, we won the championship. In four days, we actually won the, we thought we had

Starting point is 00:54:51 won the championship by a much bigger number. And I was like, wow, we got a 10% improvement here. Okay, Nobel committee, be prepared. We are here. And it turned out that the model had cheated. And it wasn't quite cheating. What the model had done is it had found a flaw in the simulator, which again wasn't a flaw because we used the standard simulator that came with the championship stuff. The simulator doesn't model values, right? And why model values when you're doing just simulation? And the AI algorithm had figured out that that was what was happening. And so it just decided to ignore all writes because it's like, hey, the simulator is ignoring that. So why should I model all the writes?
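
To make this kind of simulator escape concrete, here is a toy Python sketch. It is not the championship simulator and does not reproduce what the model actually did, which the conversation does not detail. The trace format, the cycle costs, and the names `timing_only_score` and `functional_check` are assumptions for illustration. The idea is only that a scorer that counts cycles without checking data will reward a policy that drops writes, while a value-aware replay catches it.

```python
from typing import Dict, List, Tuple

# Hypothetical trace entry: ("R" or "W", address, value); value is ignored for reads.
Op = Tuple[str, int, int]


def timing_only_score(trace: List[Op], drop_writes: bool) -> int:
    """Cost model that counts cycles but never checks the data that comes back."""
    cycles = 0
    for kind, _addr, _value in trace:
        if kind == "W" and drop_writes:
            continue  # the "cheating" policy silently skips the store
        cycles += 10 if kind == "W" else 1  # assumed, made-up costs
    return cycles


def functional_check(trace: List[Op], drop_writes: bool) -> bool:
    """Replay with real values: every read must return the last value written."""
    memory: Dict[int, int] = {}    # what the simulated system actually stored
    expected: Dict[int, int] = {}  # what a correct system would have stored
    for kind, addr, value in trace:
        if kind == "W":
            expected[addr] = value
            if not drop_writes:
                memory[addr] = value
        elif memory.get(addr, 0) != expected.get(addr, 0):
            return False
    return True


if __name__ == "__main__":
    trace: List[Op] = [("W", 1, 42), ("R", 1, 0), ("W", 2, 7), ("R", 2, 0)]
    for drop in (False, True):
        print(
            f"drop_writes={drop}: cycles={timing_only_score(trace, drop)}, "
            f"values correct={functional_check(trace, drop)}"
        )
```

Pairing the performance score with a correctness check of this kind is one possible way to make an evaluator harder for an automated search to exploit.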

Starting point is 00:55:28 And so we came up with this amazing policy that was really nice. And so the lesson really is, one, agentic computer architecture is here to stay. And two, we as a community need to figure out what does it mean. I mean, we need to think and go back and look at our simulations. I mean, these kind of simulator escapes are going to get more and more common. We need to start designing simulators, not just for human users, but for machine users as well. And more interestingly, we also have the next phase of this work where we have this other technology called AI co-scientist, that again is a Google technology we have talked about

Starting point is 00:55:58 publicly. And so AI co-scientist, basically you give it a problem. It comes up with a lot of hypotheses and ideas, and it does an amazing job of that. And so now you can pair like Deep Think or co-scientist along with AlphaEvolve. And now going back to what I talked about earlier, this notion of an AI-infused workflow, you can start seeing how these things start synergistically playing to each other as well. And so we are already seeing some pretty amazing ideas. And in particular, where the design space gets more complex. Like Lisa, you talked about hardware-software co-design earlier. And so when we kind of look at this broad spectrum

Starting point is 00:56:31 of looking across hardware and software or looking across front end and back end, the AI, actually combining AlphaEvolve and co-scientist, can actually come up with some really interesting solutions. So I'm very confident, and all of you heard it here in this podcast first, I think we are going to have some pretty significant breakthroughs. We call it the Move 37 breakthroughs,

Starting point is 00:56:52 because when AlphaGo played Lee Sedol, the Go champion, move 37 was this very non-intuitive move that in hindsight turned out to be pretty brilliant, and this was a game like Go that has had decades of people looking at it. And I do think that we are going to have a Move 37 moment in computer architecture in the near future, and we are talking probably weeks or months. So I think we should all stay tuned for that. That's really, really amazing.

Starting point is 00:57:19 Super impressive. Wonderful, Partha. I think that was an extremely enlightening conversation, spanning both technology, the landscape of how computing is changing, all the way to words of wisdom to our listeners, peppered with plenty of alliterations, which I'm sure would delight the listeners. So thank you so much for joining us today on this podcast. It's been an absolute delight talking to you. Well, thank you for having me.

Starting point is 00:57:41 It was a lot of fun, and I hope at least some of the content was useful to all the listeners. Oh, absolutely. I enjoyed it myself so much, and I'm sure that our listeners will too, if not the least, just for the plays on words. It's very enjoyable. And to our listeners, thank you for being with us on the Computer Architecture podcast. Till next time, it's goodbye from us.
