Computer Architecture Podcast - Ep 17: Architecture 2.0 and AI for Computer Systems Design with Dr. Vijay Janapa Reddi, Harvard University
Episode Date: September 3, 2024
Dr. Vijay Janapa Reddi is an Associate Professor at Harvard University, and Vice President and Co-founder of MLCommons. He has made substantial contributions to mobile and edge computing systems, and played a key role in developing the MLPerf Benchmarks. Vijay has authored the machine learning systems book mlsysbook.ai, as part of his twin passions of education and outreach. He received the IEEE TCCA Young Computer Architect Award in 2016, has been inducted into the MICRO and HPCA Halls of Fame, and is a recipient of multiple best paper awards.
 Transcript
                                         Hi, and welcome to the Computer Architecture Podcast,
                                         
                                         a show that brings you closer to cutting-edge work in computer architecture
                                         
                                         and the remarkable people behind it.
                                         
We are your hosts. I'm Suvinay Subramanian.
                                         
And I'm Lisa Hsu.
                                         
Our guest on this episode was Vijay Janapa Reddi,
                                         
                                         an associate professor at Harvard University
                                         
                                         and vice president and co-founder of ML Commons.
                                         
    
                                         He has made substantial contributions to mobile and edge computing systems
                                         
                                         and played a key role in developing the MLPerf benchmarks.
                                         
                                         Vijay has authored the machine learning systems book,
                                         
                                         mlsysbook.ai, as part of his twin passions of education and outreach.
                                         
                                         His work has also earned him numerous accolades,
                                         
including the IEEE TCCA Young Computer Architect Award in 2016, induction into the MICRO and HPCA Halls of Fame, and multiple Best Paper Awards.
                                         
    
                                         On this episode, Vijay discusses Architecture 2.0, a new era of using AI and ML for computer
                                         
                                         systems design, exploring the opportunities, challenges, and educational shifts it necessitates. He also delves into his work on TinyML, enabling machine learning
                                         
                                         on resource-constrained devices and its potential to transform our technological interactions.
                                         
                                         A quick disclaimer that all views shared on this show are the opinions of individuals and do not reflect the views of the organizations they work for.
                                         
                                         Vijay, welcome to the podcast. We're so excited to have you here.
                                         
                                         Thank you for having me. It's a pleasure being here.
                                         
As longtime listeners of the podcast usually know, our first question is often: in broad strokes, what's getting you up in the morning these days?

It is, without doubt, my four-year-old and my eight-year-old, first thing in the morning at about six o'clock. And I'm sure that's very common for a lot of people.

Six is rough. I've got to tell you, six is rough. I got 7 a.m.-ers, so I got lucky.

Well, we get some midnight wake-up calls, so, yeah. So after they get you up, then what is your day looking like these days?
                                         
Most of the time, it's kind of thinking about what's the next big thing that's actually happening in our field. That's honestly what really keeps me up, thinking about it quite a bit. I think because right now, it's such an exciting time when there's so much change going around, and finding the path through this nebulous cloud that we're looking into is sort of the most fascinating thing, I feel, because it is both an opportunity for a whole bunch of different ideas that we can all explore in research and education, but at the same time it's also quite a challenge, because it's easy to kind of go down, you know, rabbit holes. So just thinking about that and trying to identify what the interesting areas are is sort of the most exciting thing right now.
                                         
                                         Right.
                                         
                                         So one of the themes that you have talked about in recent times is what's called Architecture
                                         
                                         2.0, a shift from the traditional paradigm of how we design computing systems.
                                         
                                         So what sparked this particular vision and what are the most exciting advancements
                                         
    
that are driving this particular paradigm shift?

Yeah, so that's a great question. So Architecture 2.0, just so the listeners are clear about it, is fundamentally thinking about how we use AI/ML to help us build better systems in the future as we start to build increasingly complex systems, and to do that very efficiently, and to also do that with an extreme sort of consciousness about how we reduce time to market. Because I think as systems get more complex, validating, verifying, designing, all of that gets inherently more complicated. And so we need new tools. And that's fundamentally what Architecture 2.0 is really about. And obviously at this point in time, it's a super exciting time, because we're in the era of not just AI/ML, we're truly in the era of generative AI/ML, right? Which is a very exciting area.
                                         
                                         Now, the reason that actually came to light
                                         
is honestly just from reflecting on all the work
                                         
                                         that's been happening in the community,
                                         
    
                                         the architecture community
                                         
                                         has been doing some very interesting work.
                                         
                                         Obviously, we build a lot of systems for ML, without doubt.
                                         
But in recent years, we've definitely seen the shift towards having AI/ML being used for machine learning systems, right? Like actually designing the systems themselves, right?
                                         
So this could be in the form of, you know, Bayesian optimizations or genetic algorithms or reinforcement learning, or pick your favorite bells and whistles that you want to apply, right? And it's been reading those papers that have actually been coming out, which I think in all honesty are fantastic and phenomenal, because they're showing what's possible. But once you take a step back and you try to think about how we systematically translate or convert this into something that we can use in a practical sense, you start getting into some really deep questions about what the challenges are around this space. And from reading those papers is when I kind of realized: oh wow, we don't really have an engineering principle around how we're going to be applying this methodology in order to accelerate all the traditional stuff that we've been doing.
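The kind of loop being described here, pointing an optimizer at an architectural design space, can be sketched in a few lines. This is a toy illustration only: the knobs, ranges, and cost model below are invented, and a real flow would invoke a simulator and likely use a Bayesian optimizer or RL agent rather than the random search shown.

```python
import random

# Toy design-space exploration loop. The parameters and the analytic
# cost model are illustrative stand-ins, not from any real flow.
DESIGN_SPACE = {
    "l2_ways": [4, 8, 16],
    "issue_width": [2, 4, 8],
    "rob_entries": [64, 128, 256],
}

def evaluate(cfg):
    # Stand-in for a slow simulator run: a mock energy-delay-style
    # cost where wider issue and a bigger ROB help, while extra
    # cache ways add overhead (lower is better).
    return (1000 / (cfg["issue_width"] * cfg["rob_entries"] ** 0.5)
            + 0.5 * cfg["l2_ways"])

def explore(budget=50, seed=0):
    # Random search as the simplest possible "optimizer"; a Bayesian
    # optimizer or RL agent would slot in here in a real study.
    rng = random.Random(seed)
    best_cfg, best_cost = None, float("inf")
    for _ in range(budget):
        cfg = {k: rng.choice(v) for k, v in DESIGN_SPACE.items()}
        cost = evaluate(cfg)
        if cost < best_cost:
            best_cfg, best_cost = cfg, cost
    return best_cfg, best_cost

best_cfg, best_cost = explore()
print(best_cfg, round(best_cost, 2))
```

The engineering-principle gap Vijay points at shows up even in this sketch: the choice of cost function, search budget, and parameter encoding are all ad hoc, which is exactly what a systematic methodology would pin down.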
                                         
    
And I think that's fundamentally what gave birth to going around and talking to a large number of community members about what the new challenges are as we try to use AI/ML for system/architecture design.
                                         
                                         I think there are several themes that you've talked about
                                         
                                         within this broad umbrella,
                                         
                                         everything from datasets to ML algorithms
                                         
                                         to tools and infrastructure that you need.
                                         
    
                                         And as you rightly pointed out, new methodologies and a new way of thinking about designing
                                         
                                         these systems.
                                         
                                         Can you tell us a little more about these different themes?
                                         
                                         What are some of the challenges and opportunities that you see under each of these?
                                         
And I think the big and most fascinating element of all of this is that whenever we talk about AI/ML, by and large most of the community is wickedly excited about, oh, I've got this new little model that I'm actually going to put in, and it's going to do blah, blah, blah. Fundamentally, I think that's the most boring aspect of AI/ML, in all honesty. I think the most fascinating aspect of it is where it all begins, which is the inherent data that we're actually talking about, right? Because data is effectively the new code today. And when we think about it from how do we apply AI/ML methods for system design,
                                         
                                         and you kind of go back and you look at like,
                                         
                                         okay, what corpuses do we have?
                                         
The question is very simple.
                                         
                                         What's the ImageNet dataset for computer architects?
                                         
                                         That's a very simple question.
                                         
    
                                         And yet we would struggle to answer that question.
                                         
Why? Because we have not systematically thought about it. We are the ones who have actually been building the systems that enable all this AI technology, yet we ourselves have not thought about how we would be able to build datasets for, you know, architecture design. Now, architecture design is a very complex thing. It spans many layers, right? It goes all the way from talking about high-level design space exploration, and in my head it also cuts right through the EDA flows, because at the end of the day, when we talk about architecture, it's not just about the design that we come up with, it's actually about how it gets taken down to implementation, right? And so everything from that top all the way down is sort of the critical thing that I think is fascinating. And how we think about data there is one of the first and foremost things we've got to ask ourselves if we want to actually use this new methodology in our existing workflows.

Well, I'm very curious to hear what you mean specifically by this kind of data, exactly. Because, you know, when I think about maybe Architecture 1.0 of how we would build things, the data would be, say, some SimPoints or, you know, SPEC, basically.
                                         
                                         That's the data that we use to essentially not train the design, but sort of inform what we want
                                         
                                         the design to be good at.
                                         
                                         And that's the data that we test against, that's data that we design against, and that's the sort
                                         
                                         of performance benchmark metric. In this case, it sounds like, you know, obviously the data would be
                                         
    
slightly different in this world because it's going into these AI/ML techniques to try and
                                         
                                         inform these designs and do them rapidly and optimize them. So, you know, obviously the word data is extremely broad.
                                         
                                         Maybe you can dive down a little bit into what you mean
                                         
                                         or what kind of different pillars of data you're talking about.
                                         
Yeah, pillars of data, that's a good way of putting it. I think there are three fundamental ways of bucketing these things, right? The absolute cutting-edge one would be: how do we get data in a format that's actually useful for generation? Another pillar is: how do we get data in order to do optimizations? And the more basic one, in all honesty, is: how do we get data to do some sort of prediction? So in my head there are these three pillars. You start with getting datasets where we can make very basic predictions about what's going to happen next. The next thing would be, how can I get the data in order to actually design the system to be much more optimal for whatever heuristic you choose? And the third one really is the generative aspects. And I think once you bucket things into these three major pillars, then you can systematically think about what needs to be done. Now, of these three, obviously,
                                         
                                         prediction and optimization are things that we have been doing in the past, right? Because when
                                         
    
we do design space exploration, we are effectively looking at various design points and trying to figure out what's the best optimized method that we have to pick from. Even if you're looking at prediction, we have done prediction. We look at, you know, prefetchers and branch predictors; they're all looking at data coming through and making predictions, right?
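The branch predictors mentioned here are the textbook example of prediction in hardware. As a quick illustration, here is a standard 2-bit saturating-counter predictor (generic textbook material, not any specific machine's design):

```python
class TwoBitPredictor:
    """Textbook 2-bit saturating-counter branch predictor.

    Counter states 0-1 predict not-taken, states 2-3 predict taken;
    each resolved branch nudges the counter toward its outcome, so a
    single anomaly doesn't flip a strongly held prediction.
    """

    def __init__(self):
        self.state = 2  # start at "weakly taken"

    def predict(self):
        return self.state >= 2  # True means "predict taken"

    def update(self, taken):
        # Saturate at the ends of the [0, 3] range.
        self.state = min(3, self.state + 1) if taken else max(0, self.state - 1)

def accuracy(outcomes):
    predictor = TwoBitPredictor()
    correct = 0
    for taken in outcomes:
        correct += predictor.predict() == taken
        predictor.update(taken)
    return correct / len(outcomes)

# A loop-like branch: taken nine times, then not-taken once at exit.
print(accuracy([True] * 9 + [False]))  # 0.9
```

The hysteresis is the point of the design: on the loop trace above, only the single exit branch is mispredicted, because one not-taken outcome isn't enough to move the counter out of the taken region.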
                                         
There is, however, a difference when we talk about prediction, optimization, and generation once we start thinking about them in the context of Architecture 2.0. It's an incremental step.
                                         
                                         That incremental step in my head is fundamentally
                                         
                                         about breaking the abstraction layers, right?
                                         
    
                                         So traditionally what we have done
                                         
                                         when we have thought about optimizations is by and large,
                                         
                                         we have been kind of focused on these abstraction layers
                                         
                                         from the system stack going from the application algorithm
                                         
                                         all the way down to the hardware.
                                         
We've created these multiple layers of abstraction, the ISA being the most classic version of it, right? We create these nice abstractions between the hardware and software ecosystems and kind of let each evolve independently. What ends up happening then is that you sort of do these smaller optimizations. And I think as you start getting into the AI space, what's really interesting is that it's kind of stepping away from this traditional paradigm of instruction set architectures to more about parameter set architectures, PSAs,
                                         
as I like to think about it. The idea of a PSA is that in the future, you still need to be a core architect. Make no mistake, I'm not saying that suddenly our students don't need to know anything about architecture. Our people still need to know everything deep inside, so we know when models are hallucinating and so forth.
                                         
                                         So we take that fundamental understanding, flip that vertical stack into a more horizontal stack,
                                         
                                         and then our future architects are really going to be understanding what are the parameters that are actually essential to expose across each of those horizontal layers. Because at that point,
                                         
    
                                         once you expose the parameter space, then you let the AI agents actually get to work.
                                         
                                         And at that point, it gets really fun because now you could have an agent that's perhaps just
                                         
                                         dedicated to the hardware module, or you could obviously break it down into the individual
                                         
                                         microarchitecture components and have multiple agents all kind of working and learning from
                                         
                                         each other. But in the end, when you take a step back, they're effectively
                                         
                                         learning from each other and exploring that massive design space that we truly have across
                                         
                                         the system. And I think that sort of paradigm shift is really what we need to have rather than
                                         
                                         thinking about things in a very traditional sort of a perspective about how we have done things
                                         
    
today, right? I think that sort of changes, you know, what Architecture 1.0 to 2.0 is going to be.

That sounds very interesting. I think I want to
                                         
                                         double click on this idea of this horizontal design space that you were talking about. It
                                         
sounds like, and let me make sure I heard it right, that, you know, of course, we have a lot of layers, and we often do cross-layer optimizations, but they're usually in adjacent layers. Are you saying, then, that you turn those layers, from ISA all the way up to microarchitecture, on their side, so that an AI can look at all of the layers together and essentially optimize the parameters that we decide are good for exposure across what has traditionally been a vertical stack, but is now horizontal, and they have the purview of the whole design space? Okay. Interesting.
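One way to picture the horizontal parameter space being recapped here is as a set of exposed knobs per component, with either one global view across all layers or one agent per component's subspace. The component names and parameter ranges below are hypothetical, purely to make the structure concrete.

```python
# Hypothetical "parameter set architecture": each layer/component of the
# stack exposes its tunable knobs. Names and ranges are invented.
PARAMETER_SETS = {
    "compiler": {"unroll_factor": [1, 2, 4, 8]},
    "memory_controller": {
        "scheduler": ["fcfs", "frfcfs"],
        "page_policy": ["open", "closed"],
    },
    "l2_cache": {"ways": [4, 8, 16], "size_kb": [256, 512, 1024]},
}

def flatten(space):
    """Yield (component, knob, choices) triples across every layer,
    giving a single global agent visibility into the whole
    horizontal space at once."""
    for component, knobs in space.items():
        for knob, choices in knobs.items():
            yield component, knob, choices

def assign_agents(space):
    """Alternatively, one agent per component: each agent sees only
    its own subspace, and agents coordinate at a level above."""
    return {component: dict(knobs) for component, knobs in space.items()}

agents = assign_agents(PARAMETER_SETS)
print(sorted(agents))  # ['compiler', 'l2_cache', 'memory_controller']
```

The two functions correspond to the two granularities discussed next in the conversation: a single agent over the whole flattened space versus hierarchical agents, each scoped to one component's parameters.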
                                         
    
                                         And I think that's going to be super fun because you start to kind of understand that
                                         
                                         there are going to be differences about how we even expose those parameters, right? There might
                                         
                                         be hierarchical parameters. You want one agent, one AI agent to kind of, you know, maybe just work
                                         
                                         on the memory subsystem in complete isolation. Or you might actually want to break the memory
                                         
                                         controller completely down and say, okay, even within the memory controller and the way it
                                         
interacts with the memory subsystem might actually have multiple agents, because some of them might be responsible for very specific parameters that they're playing around with. And so when you think about it, you get into this really interesting design space of how you get the AI to actually map onto this horizontal parameter space that we're exploring.
                                         
                                         And those kinds of things have not yet been fully explored because, as you said, Lisa, we have largely been doing co-design between two adjacent layers. We lump it into
                                         
    
                                         hardware and software co-design, which is true, but if you really go into it, it's really just
                                         
                                         algorithm and hardware co-design in this very tight binding. But there is so much more of what
                                         
                                         an architecture stack really is, right? And there might be optimizations that we would perform
                                         
                                         at the highest levels of the stack that are in fact suboptimal when you actually look at it
                                         
                                         from a holistic system design,
                                         
                                         because sometimes you wanna leave more room
                                         
                                         for the system at the lower levels of the stack
                                         
                                         to actually make other kinds of opposite decisions
                                         
    
                                         to what we would normally do.
                                         
Yeah, that's fascinating, because I feel like a lot of times, when you are a student doing microarchitecture design, maybe you've honed in on some substructure within the microarchitecture, whether that be a BTB or a TLB or an L2 cache or whatever. And you kind of do have to isolate yourself into looking at that structure in and of itself. You've got to get yourself a pattern stream that goes into it.
                                         
                                         And then within that pattern stream, you isolate

                                         yourself to figure out, okay, here's what happens here. You know, since I like memory

                                         systems, maybe you need an eight-way cache, or maybe you need a four-way cache.

                                         And even with those dimensions of, like, how many indices, how many ways,
                                         
                                         how many megabytes, gigabytes, whatever, whatever, depending on what level of
                                         
    
                                         the cache you're talking about. That often was a relatively taxing space to look at just because,

                                         you know, you would still have to run lots and lots of jobs, and we didn't have the computational

                                         power. And you would wonder sometimes, like, okay, well, what if I change something in the L2 that
                                         
                                         changes the traffic? You know, like the way that the L1 is filtering to the L2, now the traffic

                                         has suddenly changed. Like, there was no way to pop all the way up. And so it sounds like what you're

                                         saying is that if we turn everything on its side, and now that we have the massive power of all this

                                         AI, we can look at everything potentially all together, although we probably still have to be

                                         judicious about what parameters are being exposed. Is that what you mean by the first piece, the prediction
                                         
    
                                         piece and the data piece?
                                         
                                         Yes.
                                         
                                         And I think it's the architect's job still
                                         
                                         to have very deep knowledge about what parameters are
                                         
                                         actually critical.
                                         
                                         So this by no means undermines what a traditional architect is
                                         
                                         doing.
                                         
                                         If anything, all we're trying to do
                                         
    
                                         is, for instance, if we go back in time,
                                         
                                         to kind of help you just compute through things faster
                                         
                                         so you can actually look at more interactions
                                         
                                         with your fundamental knowledge.
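To make the prediction idea concrete, here is a minimal sketch. Everything in it is hypothetical, not from the episode: the cache configurations, the miss rates, and the simple distance-weighted nearest-neighbor estimator are stand-ins for whatever learned cost model one might actually train on logged simulation results. The point is only that a cheap model fitted to a handful of detailed simulations lets an architect triage a much larger design space before committing to full runs.

```python
import math

# Hypothetical logged results from a handful of detailed simulations:
# (associativity, capacity in KB) -> measured L2 miss rate.
sim_results = {
    (4, 256): 0.120,
    (8, 256): 0.110,
    (4, 512): 0.085,
    (8, 512): 0.078,
    (16, 1024): 0.051,
}

def predict_miss_rate(ways, kb, k=3):
    """Distance-weighted k-nearest-neighbor estimate in log-parameter space.

    A stand-in for a learned cost model: cheap to query, so thousands of
    candidate configurations can be triaged without running a detailed
    simulation for each one.
    """
    dists = []
    for (w, c), miss in sim_results.items():
        d = math.hypot(math.log2(ways) - math.log2(w),
                       math.log2(kb) - math.log2(c))
        dists.append((d, miss))
    dists.sort()
    nearest = dists[:k]
    if nearest[0][0] == 0:
        # Exact match: return the simulated value directly.
        return nearest[0][1]
    weights = [1.0 / d for d, _ in nearest]
    return sum(m * w for (_, m), w in zip(nearest, weights)) / sum(weights)

# Triage an unsimulated point; only promising configs go to full simulation.
estimate = predict_miss_rate(8, 768)
```

The architect's domain knowledge still decides which parameters are exposed to the model in the first place, exactly as discussed above.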
                                         
                                         I want to circle back to the data itself.
                                         
                                         And the quality of the data is very important
                                         
                                         for the quality of the AI systems or agents
                                         
                                         that we build towards these different tasks.
                                         
    
                                         This is true even in other domains.
                                         
                                         So if you look at the toolkit that we have
                                         
                                         to create such data, we have simulators on
                                         
                                         the one hand, and then we have real world performance profiles on the other hand. So
                                         
                                         how do you think we should go about collecting these data sets for architecture research?
                                         
                                         What should we be careful about, especially as we try to ground this data in the real world?
                                         
                                         We want to ensure that if you're simulating, the quality of the data should be good, which means
                                         
                                         that it needs to correlate in some reasonable manner
                                         
    
                                         to what we might expect in a real system.
                                         
                                         So what do we think about these different attributes
                                         
                                         of the data and the quality of the data?
                                         
                                         And what are mechanisms that we need
                                         
                                         so that we can create these data sets,
                                         
                                         curate these data sets, and then also measure the quality
                                         
                                         of the data sets themselves?
                                         
                                         Yeah. I'm going to split the data element

                                         up into two fundamental pieces, which you were already alluding to, the first piece being quantity

                                         of data, because these are inherently having to be big-data-oriented kinds of problems,

                                         right? And the other one is, once you have that big data, then how do you tune it

                                         for quality? Both are actually needed. If you look at what's actually happening in the AI community, you know, if you historically look at the size

                                         of the data that's been evolving for images, for instance, you start to see that originally, you

                                         know, people tried to curate these really high-quality data sets, right? And people said, like,

                                         that's the most important thing. But then, like, over the years, as the models have gotten bigger,

                                         we've started creating more noisy data sets.
                                         
    
                                         You start getting noise in the data set because you start pulling the human out of the loop a little bit and you start relying on self-supervised methods or just having the systems effectively just kind of mining for data.
                                         
                                         And then you end up with a lot of errors in the labels and so forth.
                                         
                                         Now, just because you have a bit of error does not necessarily mean that,
                                         
                                         you know, it's actually bad. Sometimes having a little bit of error can actually help the model
                                         
                                         not get stuck in certain things, right? And so you do need a large amount of data.
                                         
                                         And to that point, I would say, think about the number of simulations that we would all run,

                                         right? Just globally, just think about the number of gem5 simulations alone that you and I probably run,

                                         forget even what's happening in the companies, just the gem5 simulations that are being

                                         run academically, and even within my lab right now, probably, right? What do we do with all that data?
                                         
                                         We basically get the paper out. I guarantee you the student probably has it in some directory he or

                                         she will forget about, you know, once the paper is out, right? And then we just kind of, you know, at some point,

                                         archive or erase it. We don't really use it. And I think that's a wasted opportunity, especially

                                         for a domain that is quite specialized, right? There's a lot of domain knowledge you need to have

                                         to be able to, you know, understand how to work with things. I'll tell you a little bit about a project

                                         that we're actually doing centered around data

                                         in Architecture 2.0 later on.
                                         
    
                                         But one of the very basic questions that, just yesterday,

                                         one of my students, Shvetank, asked the models was,

                                         is data movement generally more costly than compute?

                                         Now, you can by and large ask anyone starting a PhD in

                                         architecture, and this is probably one of the first things we try to teach them. And guess what Mistral,

                                         Claude, and ChatGPT come back with, right? They say, no, data movement is actually not costly.

                                         Now, of course, once you start asking them, they'll rationalize this and kind of come up

                                         with, you know, all kinds of excuses, like, it really depends on what data you're talking about.

                                         Oh, it depends on what compute. Yeah. But if you ask a vanilla student, you know, it kind of comes down to

                                         a very basic question, right? And these models are not able to get it.
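As an illustration of what one such expert-labeled question-answer pair might look like, here is a hypothetical record. The schema, field names, and answer text are invented for illustration; they are not taken from any actual dataset mentioned in the episode.

```python
import json

# Hypothetical schema for one record in an architecture QA dataset of the
# kind described here: expert-labeled question/answer pairs that capture
# domain knowledge (like the data-movement question) for fine-tuning.
qa_record = {
    "question": "Is data movement generally more costly than compute?",
    "answer": "Yes. In modern systems, moving data across the memory "
              "hierarchy typically costs far more energy and time than "
              "the arithmetic performed on it.",
    "topic": "memory-systems",
    "source": "expert-annotated",      # vs. self-supervised / mined
    "annotator_confidence": 0.95,      # human-in-the-loop quality signal
}

# Records like this are commonly stored one JSON object per line (JSONL),
# a format most fine-tuning pipelines can ingest.
line = json.dumps(qa_record)
restored = json.loads(line)
```

A fine-tuned model that answers such questions correctly is one cheap signal that the curated corpus is actually teaching it the domain.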
                                         
                                         And so, but a lot of that domain knowledge is kind of inherently captured in a lot of the data that we're throwing away today. I think that's a lost opportunity for us. And so this is where I think a very simple

                                         thing for the quantity side of the world would be: what if we could just create a plug-in into gem5,

                                         or many other excellent open simulators that are out there, right? Like within GPGPU-Sim you

                                         see all kinds of different things, even for accelerators, even for Timeloop and all these,

                                         you know, modeling-based systems. What if we could inherently create, you know, a platform-agnostic backend, and we're

                                         able to pull this data into some cloud service provider where, you know, it's open for the

                                         architecture community to be able to tap into? That gives us a wealth of data on which we can start

                                         training at least open-source models in order to do basic tasks like prediction and optimization, right? And just be able to do it really well. So
                                         
                                         that's purely on the quantity side. Let me run over to the quality side. Of course,

                                         now, as you kind of make the data sets noisy, and, you know, as you start

                                         implicitly injecting errors, then you've got to worry about the quality of the data. I think for

                                         regulating the quality of data,

                                         one of the key things you will ultimately need

                                         is some human-in-the-loop element, right?
                                         
    
                                         And I think as a community,
                                         
                                         we have to start thinking about training
                                         
                                         our next generation of PhD students
                                         
                                         and engineers and so forth
                                         
                                         to help us kind of get that higher quality
                                         
                                         so that we can end up with something
                                         
                                         that's, you know, a reasonable

                                         data set with low error, right? And this means labeling the data sets and so forth, right? And I think that
                                         
    
                                         gets into, you know, what kind of data set it is. If it's a ginormous gem5 simulation log, yeah,

                                         that's very, very hard to, you know, really streamline, right? Because what

                                         are you going to label on that thing? It's very hard to label. The best you can do is have some sort of metadata about the conditions of the experiment and so forth, right?
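A minimal sketch of that metadata idea, assuming gem5's usual `stats.txt` layout of `name value # description` lines. The stat names in the example dump are synthetic, and the record schema and the notion of an upload payload are illustrative, not a real plug-in's format.

```python
import json

def stats_to_record(stats_text, metadata):
    """Parse a gem5-style stats dump ('name  value  # description' lines)
    into a flat dict and bundle it with run metadata.

    A sketch of the platform-agnostic-backend idea: every simulation,
    instead of dying in a forgotten directory, emits one self-describing
    record that a shared community datastore could ingest.
    """
    stats = {}
    for line in stats_text.splitlines():
        line = line.split("#", 1)[0].strip()   # drop the description comment
        if not line:
            continue
        parts = line.split()
        if len(parts) >= 2:
            name, value = parts[0], parts[1]
            try:
                stats[name] = float(value)
            except ValueError:
                stats[name] = value            # keep non-numeric values as-is
    return {"metadata": metadata, "stats": stats}

# Example: a tiny excerpt of a (synthetic) stats dump.
dump = """
simSeconds                0.002934  # Number of seconds simulated
system.l2.overallMissRate 0.113    # L2 overall miss rate
"""
record = stats_to_record(dump, {"simulator": "gem5",
                                "config": "l2-8way-512kB"})
payload = json.dumps(record)   # what the hypothetical plug-in would upload
```

The raw stats stay unlabeled; it is the attached metadata that makes the run findable and comparable later.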
                                         
                                         However, we can still curate high-quality data sets about basic information and questions such as, like,

                                         you know, is data movement more costly than compute? That's a QA, right? If you kind of go back and you

                                         look at NLP models, by and large, they've worked on these kinds of QA

                                         data sets that we have, right? Question-answering pair data sets, which, you know, test the model's

                                         ability to understand the domain. So if we can create those kinds of data sets, with which I think

                                         students would be able to help, and so will the community, then I think we can start, sort of,

                                         you know, bootstrapping these quality-oriented data sets and start creating benchmarks,

                                         which is a whole other area outside of the data itself. Yeah, I have some follow-up questions here. So I guess
                                         
                                         what I'm kind of picturing based on what you're saying, because I was one of the early developers

                                         of gem5 way back in the day. And one of the things that we had tried to do was, you know,
                                         
                                         have a very rigorous set of statistics about all of the major structures and they
                                         
                                         just get all spit out at the end. And then there would be maybe a little bit of labeling
                                         
                                         about what happened in this particular simulation so that you could distinguish what happened
                                         
    
                                         between this run versus that run or what have you. And so I guess in your mind, are you
                                         
                                         imagining something like this where you essentially spit out a bunch
                                         
                                         of data saying, like, okay, if the ROB size is eight, which is some ridiculous number, and the

                                         L2 cache size is four gigabytes, which is also a ridiculous number, then, you know, it can

                                         essentially glean out some correlative stuff for when you have maybe a more reasonable number for both of those, or you isolate what is cause and what is effect, or what is at least correlated when things are happening.
                                         
                                         Is that what you're talking about? Or do you necessarily need some sort of label to say, like, hey, I believe this run is testing new structure A? Because run to run, some of them may have a new structure
                                         
                                         that wasn't there before at all
                                         
                                         that introduces new relationships,
                                         
    
                                         or some of them might have bugs,
                                         
                                         and we would have tons of bugs where, like,
                                         
                                         oh, these results don't make any sense.
                                         
                                         Like somebody had to look at it and say like,
                                         
                                         this doesn't make any sense.
                                         
                                         I guess what I'm imagining here
                                         
                                         with respect to the generation piece,
                                         
                                         you know, there is a structured generation of like, what are we spitting out? You know,
                                         
    
                                         what are the, what are the pieces of data? What are the structures? And then there's the,
                                         
                                         the sort of description, or I guess, label of it. Like if you invent a new widget,
                                         
                                         how does that then get incorporated? Yeah. I mean, at the end of the day, we're always

                                         having these unit tests in some

                                         capacity, right? So in the case where we end up with something new, I mean, we will still have to

                                         continue doing what we are doing today, right? We're just kind of writing these custom unit tests and

                                         making sure that we're actually right about them. But on a macro scale, what I would say is that if

                                         you're looking at it from the holistic system view, then I would very much do whatever we are already doing
                                         
    
                                         in many of the big AI systems, right? So, you know, when you're hitting, let's say, something

                                         like DALL-E, for instance, and you're generating an image, given a prompt, you have to generate an

                                         image. Well, the prompt doesn't directly go straight to the model, right? It doesn't go

                                         straight into the backend. You've got a whole bunch of infrastructure that's actually sitting

                                         in the front end that's guarding the prompt and making sure that the prompt is intentionally good and it's

                                         well-meaning and so forth. That does not mean that the back end is completely, you know,

                                         going to be safe, right? Because you can generate pretty harmful images today with just

                                         state-of-the-art models, right? So you still have to, you know, have some checks and, you know,

                                         guardrails in place, which is what, you know, the front-end classifiers are typically designed to do.
                                         
                                         So in a very similar way, you would train simple classifiers, I would think, that are able to spot anomalies that are happening inside the system. And so you could effectively use that to kind of
                                         
                                         have some mechanism of a feedback signal that comes back to the architects who are designing
                                         
                                         the system. So I'd still go back again with the human in the
                                         
                                         loop being the most critical element of this all. In addition to the technical challenges that you
                                         
                                         discussed, it looks like a big community contribution is required in

                                         order to bootstrap this entire ecosystem. You've been involved in multiple open source and

                                         community efforts over the years, including MLPerf as a founding member.
                                         
                                         So can you talk about the importance
                                         
                                         of such open source contributions
                                         
                                         along with industry plus academic collaborations
                                         
                                         in advancing this particular field?
                                         
                                         And also, how do you think about bootstrapping
                                         
                                         this particular ecosystem for Architecture 2.0?
                                         
                                         Yeah, that's great.
                                         
    
                                         I'm glad you're asking about that.
                                         
                                         Yes, it is true that I'm a super big proponent of doing
                                         
                                         community-driven efforts.
                                         
                                         And in all honesty, the kudos and the credit
                                         
                                         really goes to things that I've learned
                                         
                                         when I was a student looking back
                                         
                                         on what the community was doing.
                                         
                                         I mean, the community built the gem5 simulator.
                                         
    
                                         The community also helped contribute to GPGPU-Sim.
                                         
                                         And as a community, every once in a while,
                                         
                                         we kind of reach
                                         
                                         a point where we really need to come together to create something that will unlock the next
                                         
                                         generation of ideas and research that can come out, right? And so where I really draw a

                                         lot of the inspiration from is kind of looking at how we have done these big mega projects that are

                                         now, like, you know, sort of the backbone, right? So in that sense, when you talk about
                                         
    
                                         Architecture 2.0 or kind of building this data set,

                                         one of the things that we're actually gonna,

                                         you know, talk about at one of the workshops is,

                                         we basically, you know,

                                         have created a massive corpus of the last 50 years

                                         of architecture research.
                                         
                                         And we haven't talked about this,
                                         
                                         we will be talking about it, but it's coming.
                                         
    
                                         And what we have actually done is we've started creating
                                         
                                         a data pipeline where, you know, we have data annotators,
                                         
                                         basically undergraduate and graduate students
                                         
                                         effectively labeling certain types of questions
                                         
                                         because they need domain expertise,
                                         
                                         such as the one that I was talking about,
                                         
                                         which is the data-movement-related thing.
                                         
                                         And we've started creating that data set.
                                         
    
                                         And we originally started with the ISCA 50 retrospectives
                                         
                                         where we collected all the retrospectives, which is a very small sample that was put together by Jose Martinez and Lizzy John from last year.
                                         
                                         And we did a QA data set around that.
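To make the "QA data set" idea concrete, here is a minimal sketch of what one such record might look like when serialized for a fine-tuning pipeline. The field names (`question`, `answer`, `source`) and the example content are illustrative assumptions, not the actual schema used for the ISCA-50 retrospectives data set.

```python
import json

# Hypothetical QA records distilled from architecture papers; the schema
# (question/answer/source fields) is an assumption for illustration only.
records = [
    {
        "question": "Why can data movement dominate energy cost in modern accelerators?",
        "answer": "Moving operands across the memory hierarchy can cost far more "
                  "energy than the arithmetic performed on them, so dataflow and "
                  "locality choices often matter more than raw ALU counts.",
        "source": "hypothetical-retrospective-001",
    },
]

# Serialize to JSONL, a common format for fine-tuning pipelines:
# one JSON object per line.
jsonl = "\n".join(json.dumps(r) for r in records)
print(jsonl.splitlines()[0][:60])
```

Once records like these exist at scale, they can be fed to whatever instruction-tuning tooling a team prefers; the hard, community-dependent part is the labeling, not the serialization.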
                                         
                                         And we took that data set and we fine-tuned some of the open-source models, which actually perform poorly on architecture.
                                         
                                         We immediately saw a spike in the ability, which was a clear signal that even with a little bit of a curated corpus,

                                         you can actually improve their domain knowledge about architecture. And now we've effectively expanded that to be the

                                         last 50 years' worth of, you know, architecture research. Architecture — again, as I said, it's kind

                                         of encompassing both traditional architecture as well as EDA kinds of flows — so papers in that sort

                                         of corpus. And we've created a pipeline which allows us to kind of,

                                         you know, start labeling as a community.
                                         
                                         And this is what we're actually hoping to,
                                         
                                         you know, announce pretty soon at one of the ISCA workshops
                                         
                                         and then write a subsequent blog to engage the community.
                                         
                                         And I think this is where, for instance, you know,

                                         for people who do believe that, okay,

                                         AI/ML can be a useful tool in our toolkit,

                                         it'll be a wonderful opportunity to contribute and help shape it.
                                         
                                         In my vision, it's like first we start labeling the datasets.
                                         
                                         We start labeling, then we start fine-tuning models.
                                         
                                         And I would love for us as a community to have a collection of open-source
                                         
                                         models that are actually, you know, domain-specific to us.
                                         
                                         And then you can start, you can start trying to improve their knowledge across
                                         
                                         various ways. And this is where benchmarks become critical. Because
                                         
                                         benchmarks in my head are a way to kind of bring the community together because
                                         
    
                                         everybody has to agree on what are the interesting tasks that we actually want to solve
                                         
                                         first. And then you can start creating a roadmap
                                         
                                         that allows not only apples-to-apples

                                         comparisons with benchmarks — benchmarks are also sort of the North Star. Like, you know,

                                         for instance, when we wanted to go to the moon, we weren't necessarily talking about, oh, this is the

                                         specific navigation system I have in Apollo, or this is the thruster that I have. No, you actually

                                         focus on getting to the moon, and then you work backwards and say, okay, what are all the elements that I need to have in order to be able to get to the moon?

                                         Then you say, okay, well, I need to have a certain amount of, you know, thrust capability. I need
                                         
    
                                         to have a certain type of navigation capability, right? And you kind of identify all the pieces
                                         
                                         and you kind of build a matrix that says, okay, if I can check all these, it tells me that I might
                                         
                                         be able to get to the moon. And so in my head, that requires a community effort
                                         
                                         because you have to build that complex matrix for one, right?
                                         
                                         Which is identifying what are all the tasks
                                         
                                         that we would need to be able to do incrementally one by one
                                         
                                         so that we can say, okay, someday perhaps
                                         
                                         we can ask a large language model to say,
                                         
    
                                         okay, act like an architect, you know,
                                         
                                         give me a RISC-V core that's got this sort of, you know,
                                         
                                         ISA support and, you know, it's able to really optimize these particular workloads.
                                         
                                         I'm not saying that one LLM is magically going to do it all. I'm saying it might actually invoke
                                         
                                         other agents or existing traditional non-AI ML tools to actually get the job done, right?
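The "build a matrix and check all the boxes" idea above could be sketched as a simple capability checklist. Every task name and score below is invented for illustration — defining the real tasks and thresholds is exactly the community effort being described.

```python
# Hypothetical capability matrix: benchmark tasks a domain-specific model
# would need to pass before we trust it with larger design questions.
# All task names and required accuracies are invented for illustration.
capability_matrix = {
    "explain_cache_coherence_protocol": 0.90,   # required accuracy
    "identify_data_movement_bottleneck": 0.85,
    "propose_valid_riscv_extension": 0.75,
}

def ready_for_next_milestone(scores, matrix):
    """Return the tasks still below their required threshold."""
    return [task for task, need in matrix.items()
            if scores.get(task, 0.0) < need]

# A hypothetical model's current scores on those tasks.
model_scores = {
    "explain_cache_coherence_protocol": 0.93,
    "identify_data_movement_bottleneck": 0.71,
    "propose_valid_riscv_extension": 0.80,
}

gaps = ready_for_next_milestone(model_scores, capability_matrix)
print(gaps)  # → ['identify_data_movement_bottleneck']
```

The point of the sketch is the workflow, not the numbers: agree on the matrix first, then every fine-tuned model can be scored against the same roadmap.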
                                         
                                         So there's the two parts. One is kind of like getting all the data that needs to actually be put in place,
                                         
                                         which is kind of what, you know, I think that's a massive community effort.
                                         
                                         There's no way you can do this just by dumping some data set outside and say,
                                         
    
                                         oh, OK, everybody adopt this.
                                         
                                         I don't think that's going to work.
                                         
                                         We all need to chip in, much like — and I'm just picking gem5 as an example —

                                         much like gem5 is a community project. We all chip in.
                                         
                                         When my students find a bug, I say, don't complain about it.
                                         
                                         Just go fix it.
                                         
                                         It's incredible that you actually have a massive simulator that someone wrote for you.
                                         
                                         So just go fix the bug instead of complaining about it.
                                         
    
                                         So if we all kind of contribute in that way, I truly believe that we'll be able to kind of build a new set of tools that will help us with hardware design.
                                         
                                         And I think that's where the community aspect kind of comes in with respect to Architecture 2.0. Yeah, I mean, I think that sounds very
                                         
                                         interesting. I mean, the way that our community always has worked, it seems, is that there's some
                                         
                                         sort of thing happening. There's some churn, there's some change, and then there's a lot of discussion, and then eventually there's a
                                         
                                         congealing around some sort of pillar of how we're going to do things as a community.
                                         
                                         You know, so we eventually congeal around a benchmark suite, or we eventually congeal around
                                         
                                         a simulator. You know, there's a few, but not everybody comes and rolls their own, right? Because we sort of realize that collectively, it's better if we all collaborate on a few.
                                         
                                         So I think what you're saying, it makes sense.
                                         
    
                                         It seems like a tall endeavor too, but it always is in the early stages.
                                         
                                         So for some of this, I mean, I'm imagining this big world of possibility, right? Where let's say one of the parameters that's on the table is, say, the instruction set. We've congealed around a set of instructions, more or less,

                                         right? We might quibble about whether you need FMUL or not, or whatever, depending on the situation,

                                         but more or less they look very similar, you know, barring RISC versus CISC. And I guess what I

                                         wonder is, if we wanted to, say, explore something different, it seems like what would be necessary is for someone to, say, come up with a new instruction, and then come up with a new compiler that uses that instruction adequately to produce the instruction stream, which can then be fed into a number of sample machines to produce

                                         the data that would be rich enough for an AI to be able to reason about it. Right? Because, you know, if

                                         you just, say, add one new instruction and then compile one program and put it through one run of gem5, there's no way for an AI to

                                         reason about what might change if you put that in the corpus of everything.
                                         
    
                                         So it feels, I guess I'm just thinking through how this would work, and it feels like then
                                         
                                         the kind of work that we do now, which is like, Hey, what would happen if
                                         
                                         we had this new instruction, then you'd have to do all this work. And you have a sort of a hypothesis
                                         
                                         in mind and you set up your, uh, your experiments to be able to figure it out now, sort of
                                         
                                         the hypothesis maybe feels even more vague or more like, what would happen? Like,
                                         
                                         would this instruction be a good idea? And you do all this work and then let the AI say,
                                         
                                         yes, it would be a good idea under these circumstances for these types of instructions,
                                         
                                         but you still have to run all the simulations. So I guess, I guess I'm just thinking about that process where, as a student, if you have a hypothesis, you sort of have to come up with your experiment set.
                                         
    
                                         And now what you're trying to do is come up with an experiment set that is wide and varied enough to produce enough data so that an AI can draw a conclusion.
                                         
                                         Is that sort of how you picture it? Yeah, I think like this, there's an
                                         
                                         aspect of, yes, we might have to build all the tools and so forth to kind of get to that, you
                                         
                                         know, evaluating that hypothesis. Now, as you were kind of mentioning

                                         that, I was translating this into a visual in my head, where I'm like, okay: I'm

                                         sitting in a room and I'm trying to think, okay, what's the next set of optimizations to perform, right?
                                         
                                         In my head, I would assume that given all the simulation data that's kind of sitting in there,

                                         for instance, right, from, you know, whatever simulators — pick your favorite company —
                                         
    
                                         and all the tools that are internal, I would assume that I should be able to ask, like,
                                         
                                         what are the common bottlenecks that I'm actually seeing and what aspects should I really focus on optimizing?
                                         
                                         And as an architect in my head, the architect of maybe like 2030 or 2040
                                         
                                         would be like kind of interacting with an AI agent
                                         
                                         that's kind of asking probing questions.
                                         
                                         The AI is really kind of just looking through the mines of data
                                         
                                         and making connections that you and I normally would not make.
                                         
                                         And I don't think it's going to necessarily, we don't necessarily have to push it to the point
                                         
    
                                         where, okay, just give me the chip, but it's more of an interactive feedback loop, right?
                                         
                                         That allows your architects to very intelligently brainstorm things because often architects are
                                         
                                         kind of doing this today anyway, right? Chief architects kind of sit around talking

                                         about all the, you know, IP modules that are getting integrated into the SoC.
                                         
                                         And I would think that that feedback loop is very slow

                                         today.
                                         
                                         And I would assume that in the future,
                                         
                                         the feedback loop is going to be extremely fast,
                                         
    
                                         because the AI agent is effectively synthesizing
                                         
                                         all this data and comes prepped for the meeting,
                                         
                                         much like any other person.
                                         
                                         And you can just ask the AI agent,
                                         
                                         what would it likely be if I had, you know, this sort of,
                                         
                                         you know, configuration, right? Which would be the notion of taking the prediction data,
                                         
                                         looking at optimizations, right, that have been performed in the past, and then potentially kind
                                         
                                         of making, you know, some sort of generative sort of an idea of like, okay, this is how I would
                                         
    
                                         retweak your design. And so I agree with you, it's inherently nebulous, and I don't have all
                                         
                                         the answers around this. But my hope is not so much that you and I honestly figure it out,
                                         
                                         but my hope is that we get the next generation to fire up, because they're likely going to think
                                         
                                         about these things in a very unorthodox manner that you and I probably don't think about,
                                         
                                         because we're very much stuck in a certain box, given the rules and things that we have,
                                         
                                         we ourselves broke, you know, in order to be who we are.
                                         
                                         That's right.
                                         
                                         They're going to be AI native, unlike us.
                                         
    
                                         Right.
                                         
                                         Yeah, that's a fascinating discussion.
                                         
                                         And also you've painted an exciting vision for the possibilities in the future.
                                         
                                         So I'm hoping a bunch of our listeners are geared up towards this particular challenge.
                                         
                                         Switching gears a little bit to another thrust
                                         
                                         in your research, you've worked on enabling ML
                                         
                                         in resource-constrained devices, like edge devices,
                                         
                                         mobile devices, and so on.
                                         
    
                                         I think you've christened it TinyML.
                                         
                                         Can you tell us a little bit about the unique challenges
                                         
                                         in designing both efficient algorithms and hardware
                                         
                                         for TinyML applications?
                                         
                                         And how would you sort of compare and contrast it against, you know,
                                         
                                         large-scale machine learning deployments?
                                         
                                         Why is it exciting?
                                         
                                         What is different about it?
                                         
    
                                         What are some unique challenges in that particular space?
                                         
                                         Yeah, so TinyML is effectively, you know, really talking about embedded machine learning.
                                         
                                         And for folks who like typically when we talk about on-device machine learning,
                                         
                                         you know, most people traditionally in the industry will say that, okay, that's kind of more talking about mobile devices, right?
                                         
                                         Our smartphones are effectively the on-device element.
                                         
                                         TinyML is really not about that.
                                         
                                         It's really about pushing ML onto, you know, hundreds of kilobytes of memory capacity, right?
                                         
                                         And so you're really talking about, you know, milliwatt-level power consumption — always-on ML, specifically in IoT kinds of devices,

                                         or even smartphones. It's always on: you know, some element is constantly listening, because it has to

                                         in order to detect a keyword. Like when you say "Hey Siri," it's not like the whole system wakes up — a submodule

                                         wakes up, right? So certain aspects always have to be on, and the question is, can I fit neural

                                         networks into a few hundred kilobytes or, you know, the one or two megabytes of flash storage that I actually have?
                                         
                                         And so that's what TinyML really is about.
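A back-of-the-envelope calculation shows why "a few hundred kilobytes" is the binding constraint: count the weights of a small keyword-spotting-style network and cost them at one byte each under int8 quantization. The layer shapes here are invented for illustration, not any specific production model.

```python
# Rough flash-footprint estimate for a tiny keyword-spotting-style model.
# Layer shapes are hypothetical; with int8 quantization, each weight is 1 byte.

# (name, weight_count) pairs for an invented small CNN plus classifier head.
layers = [
    ("conv1", 8 * 1 * 3 * 3),    # 8 filters, 1 input channel, 3x3 kernel
    ("conv2", 16 * 8 * 3 * 3),   # 16 filters over 8 channels, 3x3 kernel
    ("dense", 16 * 25 * 32),     # flattened feature map -> 32 hidden units
    ("head",  32 * 12),          # 32 hidden units -> 12 output keywords
]

BYTES_PER_WEIGHT_INT8 = 1
BYTES_PER_WEIGHT_FP32 = 4

params = sum(n for _, n in layers)
int8_kb = params * BYTES_PER_WEIGHT_INT8 / 1024
fp32_kb = params * BYTES_PER_WEIGHT_FP32 / 1024

print(f"{params} weights: ~{int8_kb:.1f} KB int8 vs ~{fp32_kb:.1f} KB fp32")
```

Even this toy network needs tens of kilobytes before accounting for activations, the runtime, and the application code, which is why quantization and aggressive model specialization are table stakes in this space.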
                                         
                                         And it's a vastly different ecosystem from the rest of the big ML stuff that's happening.
                                         
                                         And I would say that it's quite fascinating because it's the perfect melding of hardware, software, and ML.
                                         
                                         It's that blending of all three that I think is truly what TinyML sort of,
                                         
    
                                         you know, is all about.
                                         
                                         And that kind of opens up the space
                                         
                                         to many interesting challenges.
                                         
                                         I mean, this whole ecosystem kind of started
                                         
                                         about five years ago,
                                         
                                         maybe five to six years ago, I would say.
                                         
                                         And, you know, back then it was just an idea,

                                         and there's a tinyML Foundation that got formed around this, where we're all kind of just thinking about:

                                         what would it be if we could enable speech recognition on a coin-cell-battery-operated device?

                                         I mean, that's a pretty damn far shot — that's quite the moonshot if you kind of think about it. This is

                                         still five, six years ago, right? And this was

                                         also during my sabbatical at Google, where there was a skunkworks project on, hey, can we

                                         actually do this? Can we adapt tools like TensorFlow to be able to run on microcontrollers, much the

                                         same way they were adapted from running on big servers and workstations onto mobile devices?

                                         Things needed to be stripped down and so forth. And of course, you know, today if you kind of look at it,
    
                                         there's an entire world of, you know, tiny models all over the place, lots of, you know,

                                         optimizations that are specialized for these embedded devices. And this is a space that I think

                                         is fascinating, both for research and education. I'd say in research it's especially fascinating

                                         because — talk about co-design — this is one place people could really use co-design, because they're highly bespoke

                                         applications. You know, typically when we talk about co-design, I often kind of, you know,

                                         skirt it a bit, because I'm kind of worried that co-design looks, you know, complicated on paper. It's

                                         lots of innovation, technical innovation, but from a practical standpoint I'm always like,

                                         how is anyone going to take any of this and make sense of it in a company?
                                         
    
                                         I'm not saying that they have to do it today, but even like seven years from now, where you're
                                         
                                         asking them to rip apart the algorithms, you're asking them to rip apart the runtime, you're
                                         
                                         asking them to rip apart the architecture slash microarchitecture. I'm like, it's an intellectual
                                         
                                         exercise — that is awesome. But there's an aspect of it which just seems completely imbalanced, right? Because often when

                                         you're building systems at large scale, you need them to kind of be general-purpose to an extent,

                                         okay? And, you know, when you're talking about TinyML, it's very different, because it's highly bespoke.

                                         Your Ring doorbell does nothing but pretty much just image classification; it does not have to

                                         listen to you. There's lots of very simple things that it can do. Alexa does not need to see you, necessarily. It's mostly just trying to listen
                                         
    
                                         to the sounds. Now, in the future, I think speech is going to become so common. I think this notion
                                         
                                         of touch — I bet my daughter is going to be like, what? Why do you touch the screens? That's
                                         
                                         so yucky. That's probably what she's going to say when she's, you know,
                                         
                                         a couple of years older because she's just going to probably be talking to
                                         
                                         every single thing.
                                         
                                         It's going to be like, hey, widget, toast my bread for two minutes.
                                         
                                         Bake at 325, right?
                                         
                                         And that's a limited vocabulary space where you don't necessarily need big
                                         
    
                                         models.
                                         
                                         You can really get away with highly bespoke models that are highly
                                         
                                         specialized.
                                         
                                         And, of course, if you can do that, that's pretty incredible.
                                         
                                         I mean, just think about the world where today you think about AI,
                                         
                                         you still physically have to interact with it in the real world, right?
                                         
                                         Like you kind of, you know, you're interacting with some sort of entity.
                                         
                                         That's pretty clumsy and clunky if you kind of think about it,
                                         
    
                                         because you are working around, you know, the machine and reality.
                                         
We are having to adapt to what the machine is. If it's truly embedded, you don't notice it. And that's the beauty of Mark Weiser's vision of ubiquitous computing, way back at Xerox PARC. They had this idea that you're going to have intelligence spread across everywhere. And I think that's where we're getting to,
                                         
                                         which is ultra low power consumption, specialized intelligence for specific things
                                         
                                         that the devices need, and then being able to seamlessly interact with us.
                                         
                                         And that's essentially why I'm super excited about the TinyML ecosystem.
                                         
                                         Right.
                                         
    
                                         I think we have a long way to go to get to that ambient computing where everything just
                                         
                                         disappears into the background.
                                         
                                         Truly magical technology is something that you don't even notice exists.
                                         
                                         You briefly touched upon how TinyML also
                                         
                                         has a space in education, because for a lot
                                         
                                         of the other models, especially the largest models,
                                         
                                         you need industrial scale machinery
                                         
                                         to be able to interact and iterate with it
                                         
    
                                         at multiple scales.
                                         
                                         Can you expand a little bit on how
                                         
                                         you think this is going to be useful in education?
                                         
                                         Very broadly, I've thought a lot about how do you teach students about computer architecture,
                                         
                                         especially in the current times, given that the space is evolving so rapidly.
                                         
Could you tell our listeners how you think about teaching computer architecture?
                                         
                                         How do you think ecosystems like TinyML or the associated tooling and infrastructure
                                         
                                         would be helpful and beneficial
                                         
    
                                         in teaching students about the different concepts
                                         
                                         in this particular space?
                                         
                                         Yeah, I'm going to talk a little bit more broadly
                                         
                                         than architecture, because I think architecture's scope is
                                         
                                         also expanding, especially as we look in this domain of ML.
                                         
I think an architect who wants to play in this space,
                                         
                                         for better or worse, needs to understand the ML ecosystem,
                                         
                                         the ML systems ecosystem.
                                         
    
                                         So I'm definitely very passionate about education
                                         
                                         in this space
                                         
because I think it breeds new life into traditional embedded systems, which have been taught forever in universities worldwide, right? I still remember the first time, back when I was a professor at UT Austin. I went into the classroom, about to start teaching embedded systems. Before leaving my home, I remember I grabbed the garage door opener, because I was like, oh, this is a very basic embedded system.
                                         
    
                                         It's amazing. It's like everywhere. It's like, you know, I'm going to use this to inspire students.
                                         
                                         And I still remember, you know, holding it up and I'm like asking the kids and they were all like,
                                         
                                         oh, it's a garage door opener. And I was like, this is an amazing piece of technology because
                                         
                                         it's got all this stuff. And as I was getting excited, I saw their faces going really dull. And I was very perplexed by that as to what was going on.
                                         
                                         And one of the kids at the front said, I really don't want to do engineering so I can build garage door openers for the rest of my life.
                                         
                                         I was like, damn.
                                         
                                         Kids got a point.
                                         
                                         Two weeks later, I got my Google Glass and I took that in.
                                         
    
                                         And suddenly everybody wanted to kind of work on like embedded systems and stuff.
                                         
                                         They're like, oh yeah, this is super cool.
                                         
                                         And I want to do this stuff.
                                         
                                         I think like for education, I realized that it's really about kind of making it relevant
                                         
                                         with the times where we are.
                                         
                                         And so like obviously today when we look at AI systems and so forth, there's a lot of
                                         
                                         excitement.
                                         
                                         We've built lots of incredible hardware for this, but it's often very inaccessible, right?
                                         
    
Think about it: how often can we actually come up with a design that we can take all the way through the lifecycle of the chip, right, from concept to tape-out? It doesn't quite happen.
                                         
                                         But TinyML kind of opens it up to a very exciting space, right?
                                         
                                         For one, there's a lot of open source ecosystem tools that are kind of coming up.
                                         
                                         And because the designs are highly bespoke, you can actually do a lot of specialization.
                                         
                                         And because they're also small designs,
                                         
you can actually go completely from your concept to tape-out, or prototype it on an FPGA.
                                         
    
                                         It's so much more practical.
                                         
                                         And I think the timing is sort of very interesting for education
                                         
                                         because you have these open source tools
                                         
that are just mature enough to be able to pull this off, and then you've got an interesting educational area, which is around AI. Everybody wants to do AI, but then often it's some big model, some big data set. I can go around asking how many people have actually built a data set, and I can guarantee you, whenever I ask this question, probably one or two people will raise their hands out of 30 or 50 people, right? Very few people touch that.
                                         
    
                                         But when it comes to this sort of embedded ecosystem space, the data set is highly bespoke.
                                         
                                         So you can actually get them to go all the way from understanding how you collect the data, how you pre-process the data.
                                         
Imagine I just want to wake up a machine when I say, "OK, Vijay." If that's what I want, well, you can easily collect all of that data, and you can build the pipeline out, actually train the model, do your optimizations, deploy it, and actually get it to close the loop, right? And these widgets today are literally five bucks a pop.
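The pipeline described here, collect data, preprocess, train, optimize, deploy, can be illustrated end to end with a toy wake-word detector. This is a hypothetical sketch using synthetic features and plain NumPy, not any real TinyML framework:

```python
import numpy as np

rng = np.random.default_rng(0)

# 1. "Collect" data: synthetic feature vectors standing in for preprocessed
#    audio frames (e.g. MFCCs). Class 1 ~ the wake word, class 0 ~ background.
n_samples, n_features = 400, 16
X = rng.normal(size=(n_samples, n_features))

# 2. Preprocess: per-feature normalization, as you would on-device.
X = (X - X.mean(axis=0)) / X.std(axis=0)
w_true = rng.normal(size=n_features)          # hidden "ground truth" rule
y = (X @ w_true > 0).astype(float)

# 3. Train a tiny logistic-regression detector with plain gradient descent.
w = np.zeros(n_features)
for _ in range(500):
    p = 1.0 / (1.0 + np.exp(-(X @ w)))        # sigmoid
    w -= 0.5 * (X.T @ (p - y)) / n_samples    # gradient step

# 4. Optimize: post-training int8 quantization of the weights.
scale = np.abs(w).max() / 127.0
w_q = np.round(w / scale).astype(np.int8)

# 5. "Deploy": inference needs only the int8 weights plus one float scale,
#    the kind of model that fits in a few KB of microcontroller flash.
def detect(frames):
    return frames @ (w_q.astype(np.float32) * scale) > 0

accuracy = np.mean(detect(X) == y.astype(bool))
print(f"quantized detector accuracy: {accuracy:.2f}")
```

In a real deployment the quantized weights would be compiled into firmware (for instance via TensorFlow Lite Micro) rather than run in Python, but the stages are the same.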
                                         
I mean, the ones that I have, folks can't see this, but we actually build these things, right? And they're really, really cheap. And of course, Arduino, Seeed, all these folks have started putting out these MCUs. And these microcontrollers are completely capable of running these models. So students have an incredible opportunity to deeply understand whichever layer of the stack they're interested in, right? If you look at Song Han's papers, he has also been doing some
                                         
pretty amazing work in TinyML, for instance. They've been able to build an entire runtime engine that optimizes it. How often would you ever go out and build a big TensorFlow-like engine to show some incredible capability you can unlock? You can't do that on a big system. However, in their case, they were able to build a custom runtime, right? So that kind of opens it up, really. And of course, there are lots of hardware solutions. That's preaching to the choir on what it means to build hardware, so I'm not going to dabble into that. But that's a really exciting space.
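A runtime engine at this scale is, at heart, a small interpreter: an ordered list of ops dispatched to kernels over small buffers. Here is a minimal, hypothetical sketch of that shape (this is not Song Han's actual engine or TFLite Micro's design, and the op set is made up for illustration):

```python
import numpy as np

# A "model" is just an ordered list of (op_name, params) — the graph.
GRAPH = [
    ("dense", {"w": np.array([[1.0, -1.0], [0.5, 2.0]]),
               "b": np.array([0.1, -0.1])}),
    ("relu", {}),
    ("dense", {"w": np.array([[1.0], [1.0]]), "b": np.array([0.0])}),
]

# The runtime: one kernel per op name, dispatched from a table.
KERNELS = {
    "dense": lambda x, p: x @ p["w"] + p["b"],
    "relu": lambda x, p: np.maximum(x, 0.0),
}

def run(graph, x):
    """Interpret the graph: feed each op's output into the next op."""
    for op, params in graph:
        x = KERNELS[op](x, params)
    return x

out = run(GRAPH, np.array([[1.0, 2.0]]))
print(out)
```

On a microcontroller the same loop would walk statically allocated tensors and hand-optimized C kernels, but the dispatch structure is what makes a model "runnable" without a big framework.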
                                         
    
Right. The one thing, though, that's certainly missing in this ecosystem is that there aren't enough educational resources around this, right? This is one of the reasons, and I think folks know about this, but I started putting together my own class notes. In fact, Suvinay actually knows about this. I started writing a machine learning systems book that talks about what it means. Originally, it was a TinyML-specific book, but as I started writing my notes, I started realizing it doesn't matter if it's tiny ML or big ML. Fundamentals are fundamentals, right? When you do operating systems, yes, there are distributed operating systems and all kinds of crazy stuff when you go to RTOSs and so forth, but you still have to learn Operating Systems 101. So when it comes to ML systems
                                         
    
                                         and architecture, it doesn't matter if you're building a big ML or a small ML. The fundamentals
                                         
                                         are still the same around all the nuances you need to understand
                                         
about what happens in an ML pipeline, from the point when data comes in to the point the data goes out, right? And so I ended up creating this MLSys book. It's an open source project where people have actually been contributing back, so this goes back to my whole passion about community involvement and so forth. In fact, just this morning I was working on getting the release out, because I've been spending an insane amount of time on it. I feel like until I get the release out, I can't actually rest, because there's always something more to do when it comes to these educational things.
                                         
    
So, a question about the open source notes that you're talking about. That sounds really interesting. I'm just wondering, is the model sort of a Wikipedia model, where everybody can just put their stuff in, or is it a Linux model, where you need a pull request and Linus needs to say okay? It's definitely a pull request model.
                                         
                                         So yeah, so someone has to be, you know, involved in curating it. Of course, there are a couple of
                                         
people that are, you know, involved. I certainly have been talking
                                         
                                         to multiple faculty members.
                                         
                                         And as much as I kind of do
                                         
                                         the initial drafts and my students,
                                         
    
                                         you know, my research lab is very active.
                                         
                                         Every time I teach it,
                                         
                                         my students kind of contribute,
                                         
                                         oh, these are interesting seminal references
                                         
                                         because the field is moving very fast, right?
                                         
                                         So the question always is like,
                                         
                                         how do you sort of keep up with it?
                                         
                                         And that's the whole reason for making it
                                         
    
                                         an open source sort of a project
                                         
                                         where people can issue pull requests
                                         
and kind of keep it updated. That said, though, I was still struggling with it, and that's when I reached out to Dave Patterson to ask for a bit of advice. When they wrote the computer organization book, things were evolving rapidly back then, right? Today we look at it as the holy Bible, but when they were writing it, there were heated debates going on about what's the right thing to do, what's not the right thing to do, and so forth. And I think the advice that he gave me is kind of what I follow, which is: if a company has started putting it into practice, that could be a nice litmus test for whether the concept should be in an educational resource, because it means there's community wisdom that, yes, this makes sense. The nuances, of course, will be different, but that's a way of proofing it against the rapid change that's actually happening in the ecosystem. And that's actually worked out quite well.
                                         
I think that's a wonderful initiative and also a great resource, not just for students but also practitioners in this field. Because even once you get into industry, or you're working in a particular space, the space is evolving pretty rapidly, and it's hard to keep track of all the different developments, number
                                         
    
                                         one, but also, as you mentioned, someone curating it and saying, these are the essential ideas
                                         
                                         that you actually need to pay attention to.
                                         
                                         I think that signal is quite useful.
                                         
                                         And the process of doing this and curating it is incredibly valuable to the entire field.
                                         
So I highly encourage the listeners to also go and check out the book.
                                         
                                         Maybe this is a good time to wind the clocks back
                                         
    
                                         a little bit.
                                         
                                         You're clearly very passionate about teaching.
                                         
                                         Maybe you can tell our audience, how
                                         
                                         did you get interested in computer architecture?
                                         
                                         What is your journey like as you got to Harvard,
                                         
                                         where you are currently? Yeah, I'd say that it sounds a bit tacky, but in all honesty,
                                         
I think I got interested in computer architecture because when I was reading the Patterson and Hennessy book, I remember, I mean, I kid you not, this sounds really weird,
                                         
    
                                         but I read it like it was a storybook or like it was a novel because it was accessible.
                                         
I mean, I just picked it up right now. I was like, who's going to read this massive book? I still remember when they gave it to me. It was this big fat book, and I'd gone and picked it up at the National University of Singapore, because that's where I started my undergrad. I picked it up, and it had a CD. I was like, what? Who knows all this stuff? Am I going to have to memorize all this stuff? Anyway, I still remember just sitting down and reading through it, and I found it fascinating that it was so accessible, to learn something that seemed so complicated, something you would normally think, oh, I have to go to class to really pick up. And to this day, that has left an impression on me. When you have a good educational resource, you can learn. You might not be able to master it, certainly, you need mentors to help you master it, but a good educational resource can really spur you. And it's also more than just the material. I think it's also the community aspect. Some folks in our community are very approachable and accessible, and you look at them as sort of mentors and think, oh, maybe someday I can be like that. For me, I honestly feel that has a bigger impact on you
                                         
                                         than the actual technical material.
                                         
                                         And that's honestly how I ended up becoming a professor.
                                         
    
                                         I never thought I was going to be a professor, to be honest.
                                         
                                         I was so inspired by my own mentors.
                                         
                                         I was like, wow, these people are so incredibly smart,
                                         
                                         yet they're so humble and so nice and so forth.
                                         
                                         And they were so invested in me,
                                         
                                         even though I don't even know the ABCs of stuff.
                                         
                                         Right.
                                         
                                         And that I think is, you know, for me over time, it's kind of translated into,
                                         
    
                                         it's we're all technical people.
                                         
                                         We're all smart people and so forth.
                                         
                                         But at the end of the day, we're humans first.
                                         
                                         Right.
                                         
                                         And it's, it's all about relationships and just being nice and taking care of
                                         
                                         one another, I think is far more important than all the nitty gritty. One of the best pieces of advice that I was given, which I
                                         
                                         take to heart from one of my colleagues, Gustavo at UT Austin back when I was there was, if you
                                         
                                         can't have a cup of coffee with your colleague and just kind of hang out, forget writing a $20
                                         
    
                                         million proposal or whatever it is, it will never work. If you can't just hang out with a person,
                                         
like the way I'm hanging out with Suvinay and, uh, Lisa,
                                         
                                         yeah, there's no way you're gonna have fun
                                         
                                         doing whatever it is, right?
                                         
                                         And so I really feel like as much as we wanna invest
                                         
                                         in technical things and always debate things
                                         
                                         very technically, I think it's very important to remember
                                         
                                         that we're all just trying to learn from one another
                                         
    
                                         as researchers, and we're always a learner first
                                         
in our community. So I think that's really what's inspired me. I know it's not the classic "I did this and then that." For me, it's really just incredible mentors and people that I've seen. That was awesome. I have never heard anybody say that they read Patterson and Hennessy like a novel. That is incredible. I think one of the great things about doing this podcast, I'm sure you agree, Suvinay, is that when we ask this question, we usually lead with what gets you up, and we
                                         
usually end with how did you become a computer architect. And the ways that people became computer architects are very varied. I mean, they run the gamut, but yours is quite singular. I'm quite amazed. I mean, I enjoyed the book as well, and I remember reading it and thinking, oh wow, this is the first textbook I've ever read where I didn't have to really reread it. You read it, and it gets in there basically at line speed. I was like, wow. So I had similar feelings, although I don't think I blitzed through it the way you did. I didn't consume it like candy.
                                         
                                         But it was fascinating, because we actually run a Rising Stars
                                         
                                         program at MLCommons to recognize outstanding junior students
                                         
                                         in ML and systems. And Dave, you know, graciously agreed to
                                         
                                         talk to the students. And when I was introducing him, I mentioned
                                         
                                         this because it left such a positive mark. And I could see he was
                                         
                                         very happy to hear that, because he said a lot of people don't
                                         
                                         realize how much time and effort we put into the writing,
                                         
                                         trying to make sure it's actually accessible, that it's really
                                         
                                         something people can consume. And I think it shows the amount
                                         
                                         of effort they must have put into just making it available to us, right?
                                         
                                         Sure, yeah, for sure. And I think that also touches on how important it is.
                                         
                                         You know, our guests time and time again have talked about how important it is to have good relationships and collaborations, and to communicate effectively.
                                         
                                         That's always top of everybody's mind, and it's how you do as well as our guest list has done, yourself included.
                                         
                                         And so I hope that maybe this is one of those things
                                         
                                         where like repetitions will get into our audience's brains.
                                         
                                         You know, you want to do the technical stuff,
                                         
                                         but at the same time,
                                         
                                         you really have to learn how to work with people,
                                         
    
                                         be able to communicate effectively,
                                         
                                         maintain relationships,
                                         
                                         and that's how you get bigger things done.
                                         
                                         Because we are long past the age
                                         
                                         of being able to do anything on your own
                                         
                                         that's sort of like a rich value to everybody, right?
                                         
                                         I think it's a new generation.
                                         
                                         Cliff Young was recently visiting Harvard,
                                         
                                         and he made a really astute observation in casual conversation.
                                         
                                         We were saying, you know, when we were building the MLPerf benchmarks,
                                         
                                         Cliff and Dave were among the original pillars there.
                                         
                                         And it worked because we were able to bring the whole community
                                         
                                         together and work collectively, with a lot of grudging consensus.
                                         
                                         And he said, maybe the reason we have to do these bigger
                                         
                                         community kinds of things nowadays is that the cohort of people
                                         
                                         who are actually doing things has been deeply influenced by social
                                         
                                         media. If you think about the generational changes that we've come
                                         
                                         through as individuals, right? The current era of people are people
                                         
                                         who have been deeply influenced by social media, which is, you know,
                                         
                                         a community kind of thing. Everything is shared. Everything is
                                         
                                         discussed. Everything is debated. And, you know, we do it collectively,
                                         
                                         and we agree to disagree and so forth. And I think that
                                         
                                         was a very interesting observation that he made. It was like, oh, it's, we live in a different world
                                         
                                         and people think differently today. So maybe as we move forward, we should work on projects more
                                         
                                         holistically and more collectively, rather than the way
                                         
                                         we used to do things back in the day.
                                         
                                         For one, systems are much more complicated today, right?
                                         
                                         You also need bigger teams.
                                         
                                         And so I thought that was a very interesting observation
                                         
                                         that he made about how times have changed
                                         
                                         and how our culture has kind of evolved.
                                         
                                         And that possibly is changing the way
                                         
                                         we actually work together too.
                                         
    
                                         Yeah, I look forward to the day where we can work together only through Instagram DMs.
                                         
                                         No, I don't. I really don't.
                                         
                                         Yeah. Well, Vijay, I think this was a really, really interesting conversation.
                                         
                                         We covered a lot of different topics, you know, from Architecture 2.0 to MLPerf and MLCommons to teaching and TinyML.
                                         
                                         I feel very stimulated right now. And so thanks so much for joining us today.
                                         
                                         Yeah, thank you so much for having me. Super fun.
                                         
                                         Yeah, thank you so much. It was a fascinating Architecture Podcast. Till next time, it's goodbye from us.
                                         
