Computer Architecture Podcast - Ep 9: Hyperscale Cloud and Agile Hardware Design in China with Dr. Yungang Bao, Institute of Computing Technology

Episode Date: August 7, 2022

Dr. Yungang Bao is a professor at the Institute of Computing Technology (ICT), Chinese Academy of Sciences (CAS) and the deputy director of ICT-CAS. Prof. Bao founded the China RISC-V Alliance (CRVA) and serves as the secretary-general of CRVA. His research interests include open-source hardware and agile chip design, datacenter architecture and memory systems. Prof. Bao's contributions include developing the PARSEC 3.0 benchmark suite, which has been adopted by leading industry players in China (like Alibaba and Huawei), the labeled von Neumann paradigm to enable a software-defined cloud, the Hybrid Memory Trace Tool (HMTT), and the Partition-Based DMA Cache. He was awarded the CCF-Intel Young Faculty Award, was the winner of the CCF-IEEE CS Young Computer Scientist Award, and received China's National Honor for Youth under 40.

Transcript
Starting point is 00:00:00 Hi, and welcome to the Computer Architecture Podcast, a show that brings you closer to cutting-edge work in computer architecture and the remarkable people behind it. We are your hosts. I'm Suvinay Subramanian. And I'm Lisa Hsu. Today we have with us Professor Yungang Bao, who is a professor at the Institute of Computing Technology in the Chinese Academy of Sciences and the deputy director of ICT-CAS. Professor Bao founded the China RISC-V Alliance, CRVA,
Starting point is 00:00:30 and serves as the Secretary General of CRVA. His research interests include open source hardware and agile chip design, data center architecture, and memory systems. Professor Bao's contributions include developing the Parsec 3.0 benchmark suite, which has been adopted by leading industry players in China, like Alibaba and Huawei, the labeled von Neumann paradigm to enable a software-defined cloud, hybrid memory trace tool,
Starting point is 00:00:55 or HMTT, and partition-based DMA cache. He was awarded the CCF Intel Young Faculty Award, was the winner of the CCF IEEE CS Young Computer Scientist Award, and received China's National Honor for Youth Under 40. Today, he's here to talk to us about the state of hyperscale cloud in China, open source hardware, and his efforts in revamping computer architecture education. A quick disclaimer that all views shared on the show
Starting point is 00:01:23 are the opinions of individuals and do not reflect the views of the organizations they work for. Yungam, welcome to the podcast. We're so happy to have you. Thank you for inviting me. I'm very happy to be here. Thank you for having me. Thank you. Yeah, we're so excited to have you too. I think so. What's getting you up in the morning these days? What's what's making you excited? Yeah, today, you know, is Saturday. And the next day, actually Sunday, but it's a working day, we accepted to the May 1, there will be five-day holidays. Every day, it seems like we are approaching
Starting point is 00:02:09 the five-day holidays. Gotcha, so what's making you excited is five days off. Yes. Yeah, well, that would make me excited too. So how about on the work front? I think one of the reasons why we were really excited to have you on the show today was because over here in sort of the Western hemisphere, there's a lot of cloud stuff.
Starting point is 00:02:29 And although news is dominated by like other hyperscalers in the West, like Microsoft, Facebook, Google, and then we hear about Alibaba, we hear about, you know, all the companies out in China, and then we see the papers that get submitted to ISCA and published in ISCA and other top venues, but it is, it is far away. So like one of the reasons we were excited to talk to you today is just like to get your views on, you know, whether or not they're facing parallel problems or, you know, what the opportunities are over there and what the kind of culture and thought process around how to manage a hyperscale cloud is for some of these major Chinese
Starting point is 00:03:07 companies. ZHANG ZHANGZHANG WANGYI WANGYI WANGYI WANGYI WANGYI In China, probably we don't see hypercloud. So actually, we talk more about internet companies. Since actually, those internet companies, they consume a lot of data center servers. Basically, there are two phases for those Internet companies' growth. The first phase, I think, is that before 2010, there were three major Internet companies called BAT,
Starting point is 00:03:41 just as you mentioned, Baidu, Alibaba, and Tencent. Yeah, those three, today they are still giant company in China. But there are also some new company that rose over the past decade. For example, like Jindong JD, TMD. T is, we see the buy-downs actually used to be the 头条 and later changed to name to buy-downs and there are also another company called Meituan M and, actually this is a competitor with Uber, those new internet companies in China.
Starting point is 00:04:30 Those companies together, I think, are used to see a report on the top 20 internet companies of the world in terms of the market capacity. There were, I think there are 12 in the US and 8 in China. Those are internet companies that consume a lot of servers. I did some homework on how many servers or computation capacity they are consuming now. So I see like last year in China, so the China server market, the revenue is 25 billion US dollar accounts for like one fourth of the global revenue. and the shipment volume is actually 3.9 million servers in China that were shipped last year. The global market is about like 14 million.
Starting point is 00:05:39 So that is very significant, and it's interesting that you characterize it in terms of internet companies and like sort of server consumption. So I guess my question to you is because like maybe it's because I work for Microsoft and Souvene works for Google where like in my mind there's you know cloud, there's public cloud, private cloud and then a place like Meta where you know they run their own servers because they have this massive presence and they have a lot of demand. So for some of these newer internet companies that you described, like for example, Didi, right?
Starting point is 00:06:09 Didi, the actual product is a competitor to you. So the actual product is these rides, but of course it needs to be backed by a ton of server capacity as you've sort of described. So are these rising new internet companies in China building their own data centers or are they sort of renting data center capacity from another internet company? How is the ecosystem out there?
Starting point is 00:06:34 For those companies, just I mentioned that. So those companies, they actually prefer to build their own data centers. For example, like Alibaba, like Tencent, they have many sites, they are the sites all over the country. Yeah, some in like around Beijing, some in Western China, you know, recently, the China, the Chinese government encouraged companies to build data centers in Western, in Western China, since in this area, there are enough power, electricity. So it's easy to get power.
Starting point is 00:07:15 For Eastern China, since the electricity is actually more expensive than Western China. So many companies, they will choose some places to build their own data center there are also some other companies middle size we see middle-sized yeah company or small size startups and they usually use public cloud. For example, Alibaba has its own public cloud. I think they rank number four in the global market. Number one is Amazon and Microsoft Azure. Besides Alibaba, there are also other public cloud companies. For example,
Starting point is 00:08:06 like Huawei also provides a cloud service. Tencent is providing cloud service. Actually, this is kind of different since Tencent has its own ecosystem. For example, WeChat. On WeChat, there are a lot of light. We say, we call it light program. This is just like small apps can be installed on WeChat. So on the WeChat platform, there are many startups, many small companies, they build their own light apps
Starting point is 00:08:51 and those light apps can run on WeChat. And so Tencent provides a computation service for them. Yeah, this is kind of different levels of cloud services. Got it. Yeah. I mean, that's a fascinating overview. Thanks a lot for that summary. Just donning my computer architecture hat on I think there are several themes over here. I'll pick up on one of them, which is the workloads in the applications. You talked about these providers, there, of course,
Starting point is 00:09:20 there are people like Alibaba who sell their own products and so on. But for Tencent, which has the wechat app or ecosystem a lot of people essentially build applications or tiny applications on top of this particular platform so from a workload standpoint like how do you view the you know the cloud ecosystem in china especially with this diversity of workloads and also this control of the ecosystem often when we for example when google runs you know machine learning workloads we control the stack all the way through to tensorflow which allows us to do a lot of optimizations across different layers of the stack i was just wondering you know given the distribution of the workloads and the kind of ecosystems available in china from a computer architecture perspective like does it either open
Starting point is 00:10:03 up interesting new opportunities are there different attributes or characteristics of these applications for the cloud ecosystem that you see in China? Yeah, I think for more and more data centers run AI related workloads. I saw a data that most internet companies they consume a large amount of their servers are running AI related workloads. So for example for Alibaba they use AI to recommend things or stuff for consumers.
Starting point is 00:10:48 And for like a buy a dance. Yeah, buy a dance and TikTok in the US and Douyin in China. So they consume a lot of AI. They need a lot of AI power, computation power for AI workload, since a lot of it can make people look beautiful, more beautiful. They have a lot of very small gadgets to make the video look better, look beautiful. JANE TCHEYANSKA- It sounded almost
Starting point is 00:11:35 like you were saying that if you were to look, because as Souvenay pointed out, it does seem like you have potential bifurcation of workloads, where if you're going to say, OK, I want to make the hardware better in order to serve these workloads, that might be different because, you know, this company has their own stack
Starting point is 00:11:50 and their whole sort of like closed system. This other company has their own closed system. This other company has their own closed system. And so then, you know, then how do you make the hardware better for all of them? But it sounded like what you were saying is that you could sort of broadly characterize and generalize that a lot of them are running AI workloads. And between these two kinds of AI workloads, or within the AI workloads, there's potentially
Starting point is 00:12:13 even another classification where some of them are recommendation, where it's like, hey, let's make people look at these videos or, you know, recommend these products. And then the other is sort of almost like, I don't know if there's a term for this, like smart filtering where like, we can make the video better and improve the light background or make people's eyes bigger or like cat filters or something like that.
Starting point is 00:12:33 So then that kind of stuff is probably a different kind of, underlying AI substrate than recommendation models. If you were a hardware person trying to kind of attack the China market, would you say then, OK, let's focus on recommendation models or let's focus on multimedia filters or something like that? Is that essentially what you were observing? YANG ZIUOBAN ZHANG So this is probably challenging for building a general purpose architecture for those diverse workloads.
Starting point is 00:13:08 So that is why, you know, for many Chinese internet companies, they started to build their old chips. You know, GPU is still the major computation power for the general AI workloads, but for some specific workloads. Many companies, they prefer to build their own accelerators. You will see Alibaba is building its own chip, and ByteDance, they invested some AI accelerator companies to accelerate their workloads. So every company has its own requirements
Starting point is 00:14:00 on computer architecture. In the future, probably the better way is to provide the agile chip design, chip development methodology. Since then, it's easy for us to tank those such diverse requirements. So that is why I finally choose to open source and Azure design, hardware design, this direction.
Starting point is 00:14:30 Yeah, certainly. That's a very pertinent point. If you look at the chip design lifecycle, the design costs are pretty high. And bootstrapping a new chip design tends to be a pretty labor-intensive and time consuming effort. So maybe this is a pretty labor intensive and time consuming effort. So maybe this is a good point to sort of talk about your efforts. You've been building
Starting point is 00:14:49 an open source RISC-V based processor, as you mentioned. So can you tell us about the effort, you know, the current state, what have your learnings been during that entire process? We launched the Xiangshan project two years ago in 2020, but actually we have made, we prepare for this project almost since like 2015. Yeah, so it's this is a long journey to this project. So I used to run a project called Labeled Computer Architecture. And we built a simulator, we built an FPGA, and we published a paper on rs or feasible solutions, for example, like x86 and ARM chips, and I didn't figure out a feasible solution.
Starting point is 00:15:57 And finally, I found that RISC-V and the Rocket chip is just the solution, the right solution for us. And we tend to use Rocket to build the label of the KV architecture, prototype chips. Yeah. After we choosing RISC-V and Rocket chip, actually we made two decisions. One decision is using Chisel rather than Verilog. But not everyone supported this decision, even in my group. We did a set of experiments to compare Verilog and Chisel.
Starting point is 00:16:37 So the experiments were divided into two phases. One phase is I asked two guys to complete a level 2 cache and to integrate into the rocket chip. And one is a senior student, undergraduate. Another actually is an engineer. good at, actually he is good at cache and has studied like OpenSpark T1 cache and the Xilinx cache. He actually, he studied those caches. And okay, one, the senior student, undergraduates use the Chisel
Starting point is 00:17:22 and the engineer use the Verilog. The result is that the undergraduate only took three days to complete the work, yeah, and integrate the cache into Rocket Chip, and then it can run like Linux, even with the DMA. So it's fantastic. But for the engineer, he actually spent six weeks to finish the L2 cache and it will integrate into the rocket ship. But there were still some bugs. box. So this is a very impressive comparison.
Starting point is 00:18:05 And so there are different also the lines of codes. For Qisho, only 1 1⁄5 of lines of codes are Varog. So the productivity is huge different. But the engineer refused to accept this result. Yeah, he said his design is better. His design is better in terms of like frequency, power in area. Yeah, so then I asked another senior student, also an undergraduate student. He never learned Chisholm, but he can write Verilog codes. So I asked him to rewrite the engineer's Verilog codes into, to translate the Verilog codes
Starting point is 00:19:02 into Chisholm. the Verilog codes into a chisel and then pass the fabrication program written by the engineer. And he actually took about one week to finish this task. And we will see that all the data, the PPA, actually most of the PPA is better than the Verilog version, the engineers. So that is quite impressive for us. Of course, this is done on FPGA, it's not on the ASIC flow, but anyway, we can see that Chishol can do quite good work, at least for this fasting prototype. Actually, I gave a talk on the Visioning Workshop in 2019, on ISCA, the SIGARCH Visioning Workshop. In China, many companies, they start to pay attention to Chisholm.
Starting point is 00:20:09 And some companies, then they form a team to build Chisholm libraries and to help accelerate the development cycles. So this is our experiments. Then in our group, everyone was convinced to use Chisel. But anyway, there are still a lot of concerns about Chisel, since people will see that this is only a small module. Like L2 cache is just a module.
Starting point is 00:20:50 There's no evidence or no case study to show that Chisell can, we can use Chisell to build a complex, high performance processors. But next is our next position is, so decision is to use Chishol to build a high-performance RISC-V CPU core. This is the Xiangshan. The Xiangshan project, we target high-end ARM processors. But anyway, it's still far away from ARM chip. But we hope in the future we can approach that packet. So the first version is actually we taped it out last year.
Starting point is 00:21:37 It was in 28 nanometers, and it can run 1 gigahertz. The chip was back this February and it brought up successfully just in, and we finished all tests in three weeks. Everything actually seems quite good. And with DDR4-1600, yeah, DDR is not that fast. So the chip can run 1 gigahertz and we run all the specs CPU 2006 and get a score. The score is 7 gigahertz. It's not that it's actually if we don't take consider, we don't look at like power or error this performance is similar to arm
Starting point is 00:22:31 Cortex-A73 but anyway Cortex-A73 is much better in in terms of power yeah and also frequency is much much higher than the first generation of the shanghai and now we are working on the signal generation this is is where take part next month yeah in in this this may and the second generation uh now the pre the frequency can reach two gigahertz in 14 nanometer technology. The performance is 10 points per gigahertz for SPEC CPU 2006. This is the second generation. Yeah, you are right. Just you mentioned that.
Starting point is 00:23:16 Actually, the third generation is ongoing. Every year, there will be a new version of a Shanshan. Right. That's a fascinating journey. From your early observations that it's difficult to translate your ideas into actual chips, into having your second version of the chip out and the third version in design is quite fascinating.
Starting point is 00:23:39 I think there were a few takeaways for me from that entire experience, starting from the early days when you had to bootstrap this and figure out, okay, how do we actually do this? What's the right set of tools that you can use to build this? Using the experiences of your undergrad students to actually convince people that it's possible to be productive, and it's actually the chip designer that matters more than the tool itself. The tool is quite productive. I think your early experience, you showed that the second undergraduate student was able to take the Veril-to-end in RISC-V that actually works and is
Starting point is 00:24:26 competitive on certain metrics with some of the ARM processors is quite fascinating. So maybe I can ask you, what's next on the horizon for this project? So you have started with an L2 cache to a processor. Now you have the second generation out in the lab tested. And it looks like you have the third generation as well coming
Starting point is 00:24:44 out. How do you see this being used uh what's your hope for uh how these kind of uh open source but yet high performance processors built using these open source tools and that also improve developer productivity how do you see these being used uh in the future yeah i think um from our perspective uh the main uh merits of shanshan project is not only the shanshan uh the processor itself but also uh the the framework yeah the chip development framework framework so actually there are already many other frameworks like chip yard from berkeley like open piton from princeton like uh black parrot from university washington yeah and also that chip chip kit from from from harvard yeah that brooks group so there are many other uh like frameworks yeah we but we i think that the the frameworks of shanghai is a uh it's a has its own features um it's actually focused
Starting point is 00:25:56 more on not only agile design but also agile verification actually we built a bunch of tools to support agile verification. A bunch of tools, yeah. There are more than 10 tools for verification. So what we needed to address is when we make a modification or modify a design, for example, like currently we have the signal generation. Then we needed to build the third generation. We have to make a lot of modification
Starting point is 00:26:35 to based on the signal generation. But we needed to address how to verify its correctness. How do we know the modification is correct? And we also need to get the performance changed. What's the impact of the modification on performance? Since these two issues are very important for agile development. Otherwise, we can not just design is actually easy. Since you are doing TPU, you know many verification
Starting point is 00:27:17 actually is time consuming. And also, we need a lot of engineering work. But we have some interesting observations to do Azure verification based on C-cell. Actually, C-cell is a very good language for Azure verification, since it is just like the LLVM. We can do many paths. It's not a path transform. Can be inserted into Chisholm.
Starting point is 00:27:52 So recently, there are also new IR based on the ML IR. For example, we can write our own transform based on this IR. For example, we have built a tool that is able to translate waveforms into Chishol code. When you are watching the waveform, you will see the events what what chishol codes is uh related to this this is waveform events yeah so it's a this kind of a tool actually can help us to to do like agile debugging yeah agile verification so i think the tissue actually is a is a very good framework to do a lot of extension for aerial verification. Yeah, I think this is really interesting because I think in the sort of the history of our
Starting point is 00:28:56 field, we introduce tools that sort of allow people to move further and further away from the lowest low levels of the machine. And so that sort of enables the scaling of, in some ways human capital, the more people are able to do a good job. But the key is, as you say, a framework that's able to sort of translate, make things easier for the users on the top, but still produce quality in the bottom.
Starting point is 00:29:18 And that production of quality means like, you can attack the problem in this framework layer by making sure that, you know, and continually improving it. so similar to like compilers right so everybody used to write an assembly means you had to have the machine in your head but then if you continually attack the compiler problem then you can you know continually improve this side or the quality of the machine code underneath right and so i think what you said right there because you know when i when i first heard about chisel many years ago one of the things that you would hear about is just like impossible to debug right because now you've really you introduced a layer between the final user and the and the and the bottom level and and
Starting point is 00:29:54 you can't go in and figure out you know what went wrong and if you have an idea yourself like maybe the engineer did an idea of what you want the bottom level to look like and you can't actually make it do it because there's this it's too soft the layer in between then it makes it difficult so that thing that you just said there where you're actually able to watch the waveforms go by and have it go like be pinpointed back to the chisel code that that is sort of um uh instigating that that waveform like this is the part of the code that's being run right now, that seems like a major leap forward that enables this translation
Starting point is 00:30:30 between what the user is doing and what is like being produced at the very bottom layer. So I think one thing you mentioned before was that now people are doing a lot of work on Chisel itself to make that better, which is I think a great way to enable that framework to continue to improve and continue to scale the ability of humans at the top layer
Starting point is 00:30:48 to produce what they want. So when you have students come in, are you more targeted at them working on the chisel side so that they can see this translation between the top and the bottom? Or are you having more working as users and doing design? YONGJIN PARKERI- Yeah, this is a very good question. So that is actually how we train our students.
Starting point is 00:31:15 So it's not easy for a junior student to participate in the Xiangshan project directly yeah it's uh this is a the project is still quite large for them to understand to uh to to investigate to modify yeah before students joining in the shangshan project we we will uh ask them to participate in the One Student One Chip project. So this is the One Student One Chip initiative. So we will provide the opportunity for them to build a real chip. This is a small real chip. After this training, then they will be good at using Chisel since we asked them to use Chisel to build a small chip. And we also provide the paypal opportunity, the channel for them. So every student can get their real chip.
Starting point is 00:32:26 When they want to get a real chip, they need to pass some verification or to pass some tests. Their design needed to be able to run a real-time operating system and needed to be integrated into an SOC. Some of those students assigned the physical design tasks. So this is the whole process of our chip development from like an architectural design, SOC integration,
Starting point is 00:33:02 and physical design. All are done by students. Once their design passes the tests, then their design will be taken out. Currently, we are using 110 nanometer technology. It's not that expensive. We also use a framework to integrate all those CPU cores built by the students into one die.
Starting point is 00:33:27 So in one die, there are like 10 CPU cores. This is a multi-core, but it's generous. This is really multi-genius core. It's a totally different design, but they share a bus. They share IOs. And then when the chip is back, they can enable their own call and disable other's call and the chip can run. So the cost can be reduced to almost currently it's like
Starting point is 00:34:01 $3,000 per student. It's quite cheap. That's amazing. That's super cool. And is that something where the One Student, One Chip project, is that available only to students who are at ICT, or is it sort of a nationwide effort? Or how does that work? YANG ZIUO, For the first year, there were only ICT students. This is the University of Chinese Academy of Science. We call UCAS. There were five undergrad students participating into the project. There were only five students.
Starting point is 00:34:43 And in the next year, there were 11 undergraduates from five universities. And then for the third year, OK, third year, the number is actually skyrocketing. And there were 760 students all over the country from 168 universities students. They actually they applied for one student, one chip initiative, including 23 universities from outside of China. Yeah, because some university in the US and now
Starting point is 00:35:26 this currently we are running the fourth the fourth year so currently we have already the application is the applications are already it's more than larger than 1200 yeah from more than 200 universities so this is that it's it's more than larger than 1200 yeah from more than 200 universities so this is a it's it's become a national wide uh initiative and many companies they uh they donate uh have been not funding for us so we uh we we also use a way to you, since there are like so many students apply for the project, so we need a lot of TA. We found, we figured out a way that like the third year, those good, those students, they perform very well and in the third year become the TAs of the fourth year students. Yeah.
Starting point is 00:36:26 So then we recruit a part like 30 to 50 TAs from the last term to serve as a TA for the next term. So it can, so the whole initiative can run at a very low expense. This is a fascinating experiment. I'm intrigued by so many different aspects of it. And it sounds like a very valuable experience for students as well, because they learn by doing. They get a more hands-on approach
Starting point is 00:36:58 to understanding the concepts, to seeing it actually taped out. And you sort of talked about the pipeline, where they start off as applying to build of build the chip and sort of experiment with their ideas. And then the following year, they become TAs and sort of shepherd these ideas for the next batch of students as well. So overall, this entire pipeline is extremely fascinating. And I'm curious to hear about all the other learnings
Starting point is 00:37:22 that you've had through this experiment. But maybe this is also a good time to sort of wind the other learnings that you've had through this experiment but maybe this is also a good time to sort of wind the clocks back a little bit and understand your journey of how you came to ict uh how you got interested in computer architecture sort of how you see the state of the ecosystem as well you know having been in this field for for a while now so my first time overseeing computer actually was in middle school yeah i think that is like 1993 it's in my middle school so it's i was i was born in eastern china so So that time, a company owner, an entrepreneur, he donated 20 computers to our school. I think it's 286 computers. So then I become kind of, I think,
Starting point is 00:38:31 okay, this is what my whole career. Yes, so actually you can see that I probably already made a decision over my career since I was a middle school student. And then I got my first home computer in 1994. And this is, I still remember, this is a Cyrix CPU. This is a 486, a Sirius 486 CPU. And I become, I write my first program.
Starting point is 00:39:11 But you know, at that time, since I was in a small town, there's nobody can teach me programming. At that time, we in the school only learn like basic basic like for fox space you know fox based database i still remember yeah and when i then i when i entered the the school the university the nanjing university i i was uh i think I found that I was good at system courses. So, but my, I still don't know, I still didn't know how to do system research at that time.
Starting point is 00:39:59 So, you know, my undergraduate thesis is still on like natural language processing. I found my research topic, finally, I dived into computer architecture since I was entering ICT. So ICT is a lot of famous system work, system project in China, like the Donning supercomputer. This is used to be the fastest supercomputer in China. And the Longsang, the CPU, actually this is China's first general purpose CPU in China. And also, there are many spin-off startups,
Starting point is 00:40:48 like Lenovo is a spin-off of ICT. Yeah, so I found that, wow, this is the right place. So I found my ICT. I learned a lot of computer architectural knowledge, and I also did my first project. This is the HMTT. This is actually the memory monitoring toolkit. It is plugged on the DIMM slot and can track all memory bus signals and can be
Starting point is 00:41:27 translated into virtual address. We do actually do like a hard OS and the memory system code design to translate the physical address into virtual address and to identify processes, memory behavior. This tool still works right now. Now it can support DDR4. And also last year, Microsoft also bought a set of HMT-TLA two years ago. So this is my first project. And later, actually, in 2010, after my PhD, I went to Princeton University and working
Starting point is 00:42:15 with Professor Kelly on Parsec. And that actually opened my eyes on computer architecture i i i talked i i knew more people more like pioneers in computer architecture and i start to know many like lisa yeah we we, we, we, I knew many friends in the area. So actually, this is also given of to the computer architecture. Yeah, so I appreciate it all of those who helped me. That's a great story. Cause I think, you know, from all the folks that I've interacted in the field, there's like a huge range of people saying like, you know, I saw a folks that I've interacted in the field, there's like a huge range of people
Starting point is 00:43:25 saying like, you know, I saw a computer and I fell in love, or like I showed up in college and I still didn't, I didn't know anything and then I fell in love. And so it's, but either way, you know, everybody falls in love or everybody that I know of in our field who like really has an affection for our sort of field of study and successiveness because they're really passionate about it. And it's cool that you sort of, like through the generosity of this one entrepreneur, sort of like set the course of your life, you know, when you were a child. So that's pretty cool. And you're sort of
Starting point is 00:43:56 taking it full circle now where, you know, you've instituted this program that has sort of not only nationwide, potentially internationally, you know, giving lots and lots of students the opportunity to do something cool that, you know, I didn't do in school myself. And like, I almost wish like, oh, man, you know, I wish I was in school now and I could do that too. So that's very awesome. You know, so with this really, you know, this kind of tandem effort that you have with A, building an open source, high performance processor with RISC-V and B, using that as a vehicle to develop frameworks to enable more productivity and teach students.
Starting point is 00:44:41 So for a long time, every year we have a lot of Chinese grad students and Chinese students come into the United States, come to American universities to learn the craft of computer architecture. And so now with this homegrown effort out there, are you finding that you have A students staying more B, students internationally coming to Chinese universities to study computer architecture? And then sort of a little bit as a tongue-in-cheek question, do you guys use Patterson and Hennessy? Yeah, so actually you will see that for the second term of one student, one chip project. So actually, we have one student, after he finished the task, and he applied for the PhD program from MIT.
Starting point is 00:45:37 And finally, he got an offer. And another student applied for University of Toronto. And he also got the offer and another student applied for University of Toronto and he also got the offer but finally he didn't, he failed to go due to the pandemic and visa issue. So he stayed in China. But you will see that actually this, what, that what you mentioned, A and B, for me, they are combined together. You know, most of the first January, the first year of the 1S, 1C initiative,
Starting point is 00:46:19 the five students, all five students become the major force of the Shanshan project and for the second year in the second year some students going abroad and some also many many students stay in in China and join in the Shanshan project. So I think this is, I think for the One Student One Chip project, this is open. We encourage students to go abroad and to learn more expertise from all over the world. Yeah, we encourage them to do that.
Starting point is 00:47:08 Yeah, but for the other side, so you mean, are there any international students returning back to China? Yeah, so we will see some students, they are, they are returning since you know, there are more and more opportunities and job, there are a lot of jobs. So in China, so you get to I mentioned, many internet companies, they start to build, start to build chips. So a lot of positions, jobs in the market. So they
Starting point is 00:47:53 can pay quite good. I was actually more asking, you know, so like the United States used to be like a very big sink for students all over the world. And then now, and so I guess it was more wondering whether you were detecting that China, because of some of these things, was also sourcing students, non-Chinese students, from outside of the country.
Starting point is 00:48:19 Like whether, say, European students or Turkish students or even American students or Canadian students are actually applying to Chinese universities because you know I don't live in China so I sort of always imagine Chinese universities being filled with Chinese people but maybe that's not the case and I'm just curious. Yeah you are right currently most students are still Chinese students. It seems like more international students like to go to the US and to, since there seems like have good opportunities in the like teaching skills
Starting point is 00:49:00 and also many other environments. Oh yeah. But for Chinese students, and also many other environments. But for Chinese students, I think they seem like in China, there are also more and more opportunities. So since this is also an attraction of students abroad. I did want to ask, what textbook do you guys use? Do you guys like the rest of the world? I forgot that question.
Starting point is 00:49:33 We do use the John Hennessy's textbook for graduate students. And for our graduates, so usually we actually we have in many universities has have it's the textbook in written in Chinese, in Chinese. Yeah. And for example, there are already some textbook on RISC-V. And Professor John Hopcroft from Cornell, actually, he helped a lot to guiding or to instruct Chinese universities to write textbooks. So recently, we in China launched the 1.1 or 101 project this means for 10 courses we want to write a a good test book for 10 courses uh with the efforts of uh with the joint efforts of of many universities so there are like 33 universities each universities there will be at least one faculty to participate in the 101 project to help refine the textbook yeah textbooks for 10 basic courses of computer science yeah that's great yeah. Speaking of students and opportunities,
Starting point is 00:51:06 any words of wisdom or advice that you would have to students today or just other listeners to this podcast as well? My suggestion is just like learning by doing. So concept is
Starting point is 00:51:21 important, but we also need to put our hands dirty. So to learn by doing. If we can do more practice, we can master the knowledge more comprehensively. So I think this is always true. I hope those students who are exciting with physical things, just don't be just dive himself into the computer architecture field.
Starting point is 00:51:57 Awesome. Well, that was really wonderful. I super, super enjoyed talking with you today, Yungang. And I'm sure I speak for Suvene too, and that this is just a really, really interesting session to hear about these great initiatives that you're putting forth. And it's been a total, total pleasure to speak to you today.
Starting point is 00:52:22 Thank you so much for joining us. YONGGONG BAO, Thank you for having me. Thank you very much. SIDDHARTHA SRINIVASA, Yeah, thank you, Professor Yonggong Bao. It was an absolute delight talking to you. And to our listeners, thank you for being with us on the Computer Architecture Podcast. Till next time, it's goodbye from us.
