Computer Architecture Podcast - Ep 9: Hyperscale Cloud and Agile Hardware Design in China with Dr. Yungang Bao, Institute of Computing Technology
Episode Date: August 7, 2022. Dr. Yungang Bao is a professor at the Institute of Computing Technology (ICT), Chinese Academy of Sciences (CAS) and the deputy director of ICT-CAS. Prof. Bao founded the China RISC-V Alliance (CRVA) and serves as the secretary-general of CRVA. His research interests include open-source hardware and agile chip design, datacenter architecture and memory systems. Prof. Bao's contributions include developing the PARSEC 3.0 benchmark suite, which has been adopted by leading industry players in China (like Alibaba and Huawei), the labeled von Neumann paradigm to enable a software-defined cloud, the Hybrid Memory Trace Tool (HMTT), and the Partition-Based DMA Cache. He was awarded the CCF-Intel Young Faculty Award, was the winner of the CCF-IEEE CS Young Computer Scientist Award, and received China's National Honor for Youth under 40.
Transcript
Hi, and welcome to the Computer Architecture Podcast, a show that brings you closer to
cutting-edge work in computer architecture and the remarkable people behind it.
We are your hosts.
I'm Suvinay Subramanian.
And I'm Lisa Hsu.
Today we have with us Professor Yungang Bao, who is a professor at the Institute of Computing
Technology in the Chinese Academy of Sciences and the Deputy Director of ICT-CAS.
Professor Bao founded the China RISC-V Alliance, CRVA,
and serves as the Secretary General of CRVA.
His research interests include open source hardware
and agile chip design, data center architecture,
and memory systems.
Professor Bao's contributions include developing
the PARSEC 3.0 benchmark suite,
which has been adopted by leading industry players in China, like Alibaba and Huawei,
the labeled von Neumann paradigm to enable a software-defined cloud, hybrid memory trace tool,
or HMTT, and partition-based DMA cache. He was awarded the CCF Intel Young Faculty Award,
was the winner of the CCF IEEE CS Young Computer Scientist
Award, and received China's National Honor
for Youth Under 40.
Today, he's here to talk to us about the state of hyperscale
cloud in China, open source hardware, and his efforts
in revamping computer architecture education.
A quick disclaimer that all views shared on the show
are the opinions of individuals and do not reflect the views of the organizations they work for.
Yungang, welcome to the podcast. We're so happy to have you.
Thank you for inviting me. I'm very happy to be here.
Thank you for having me. Thank you.
Yeah, we're so excited to have you too. So, what's getting you up in the morning these days? What's making you excited?
Yeah, today, you know, is Saturday. And the next day, Sunday, is actually a working day,
because ahead of May 1 there will be a five-day holiday.
Every day, it seems like we are approaching
the five-day holiday.
Gotcha, so what's making you excited is five days off.
Yes.
Yeah, well, that would make me excited too.
So how about on the work front?
I think one of the reasons why we were really excited
to have you on the show today was
because over here in sort of the Western hemisphere, there's a lot of cloud stuff.
And although the news is dominated by the hyperscalers in the West, like Microsoft,
Facebook, Google, we also hear about Alibaba, we hear about, you know, all the companies
out in China, and then we see the papers that get submitted to ISCA and published in ISCA and other top venues, but it is
far away.
So like one of the reasons we were excited to talk to you today is just like to get your
views on, you know, whether or not they're facing parallel problems or, you know, what
the opportunities are over there and what the kind of culture and thought process around
how to manage a hyperscale cloud is for some of these major Chinese
companies.
In China, probably we don't say "hyperscale cloud."
So actually, we talk more about internet companies.
Since actually, those internet companies,
they consume a lot of data center servers.
Basically, there are two phases for those Internet companies' growth.
The first phase, I think, is that before 2010, there were three major Internet companies called BAT,
just as you mentioned: Baidu, Alibaba, and Tencent. Yeah, those three, today they are still
giant companies in China.
But there are also some new companies
that rose over the past decade.
For example, like Jingdong (JD), and TMD.
T is ByteDance; actually, it used to be called Toutiao and later changed
its name to ByteDance. M is another company called Meituan, and D is Didi, which is actually a competitor of Uber.
Those are the new internet companies in China.
For those companies together: I used to see a report on the top 20 internet companies
of the world in terms of market capitalization.
There were, I think, 12 in the US and 8 in China. Those are internet companies that consume a lot of servers.
I did some homework on how many servers, or how much computation capacity, they are consuming now.
So I saw that last year in China, the China server market revenue was 25 billion US
dollars, which accounts for like one fourth of the global revenue. And the shipment volume was actually 3.9 million servers shipped in China last year.
The global market is about 14 million.
So that is very significant, and it's interesting that you characterize it in terms of internet
companies and like sort of server consumption. So I guess my question to you is because like
maybe it's because I work for Microsoft and Suvinay works for Google, where, like, in my mind,
there's you know cloud, there's public cloud, private cloud and then a place like Meta where
you know they run their own servers because they have this massive presence and they have a lot of
demand.
So for some of these newer internet companies
that you described, like for example, Didi, right?
Didi, the actual product is a competitor to Uber.
So the actual product is these rides,
but of course it needs to be backed
by a ton of server capacity as you've sort of described.
So are these rising new internet companies in China
building their own data centers or are they
sort of renting data center capacity from another internet company?
How is the ecosystem out there?
For those companies I just mentioned:
those companies, they actually prefer to build their own data centers.
For example, like Alibaba, like Tencent, they have many sites all over the country.
Yeah, some around Beijing, some in Western China.
You know, recently, the Chinese government encouraged
companies to build data centers in Western China,
since in this area there is enough power, enough electricity.
So it's easy to get power.
For Eastern China, the electricity is actually
more expensive than in Western China,
so many companies will choose some
places in the west to build their own data centers. There are also some other companies,
middle-sized companies or small startups, and they usually use the public cloud. For example, Alibaba has its own public cloud.
I think they rank number four in the global market.
Number one is Amazon, then Microsoft Azure.
Besides Alibaba, there are also other public cloud companies. For example,
like Huawei also provides a cloud service. Tencent is providing cloud service.
Actually, this is kind of different, since Tencent has its own ecosystem.
For example, WeChat.
On WeChat, there are a lot of what
we call mini programs.
These are just like small apps that can be installed on WeChat.
So on the WeChat platform, there are many startups,
many small companies; they build their own mini programs,
and those mini programs can run on WeChat.
And so Tencent provides a computation service for them.
Yeah, this is kind of different levels of
cloud services.
Got it. Yeah. I mean, that's a fascinating overview. Thanks a lot for that summary. Just donning my computer architecture
hat, I think there are several themes over here. I'll pick up
on one of them, which is the workloads and the applications.
You talked about these providers; there, of course,
are people like Alibaba who sell their own products and
so on. But for Tencent, which has the WeChat app or ecosystem, a lot of people essentially build applications
or tiny applications on top of this particular platform. So from a workload standpoint, like, how
do you view the, you know, the cloud ecosystem in China, especially with this diversity of workloads
and also this control of the ecosystem? Often, for example, when Google runs, you know, machine learning workloads, we control the stack all the way through to TensorFlow,
which allows us to do a lot of optimizations across different layers of the stack.
I was just wondering, you know, given the distribution of the workloads and the kind of
ecosystems available in China, from a computer architecture perspective, does it open
up interesting new opportunities? Are there different attributes or characteristics
of these applications for the cloud ecosystem
that you see in China?
Yeah, I think more and more data centers
run AI-related workloads.
I saw data showing that for most internet companies, a large
amount of their servers are running AI-related workloads. So, for example,
for Alibaba, they use AI to recommend things, or stuff, for consumers.
And for, like, ByteDance,
yeah, ByteDance, which has TikTok in the US and Douyin in China,
they consume a lot of AI.
They need a lot of AI power, computation power, for AI
workloads, since a lot of it can make people look beautiful,
more beautiful. They have a lot of very small gadgets
to make the video look better, look beautiful.
It sounded almost
like you were saying that if you were to look,
because as Suvinay pointed out, it
does seem like you have potential bifurcation
of workloads, where if you're going to say, OK,
I want to make the hardware better
in order to serve these workloads,
that might be different because, you know,
this company has their own stack
and their whole sort of like closed system.
This other company has their own closed system.
This other company has their own closed system.
And so then, you know,
then how do you make the hardware better for all of them?
But it sounded like what you were saying is that
you could sort of broadly characterize and generalize that a lot of them are running AI workloads. And
between these two kinds of AI workloads, or within the AI workloads, there's potentially
even another classification where some of them are recommendation, where it's like, hey, let's make
people look at these videos or, you know, recommend these products. And then the other is sort of
almost like, I don't know if there's a term for this,
like smart filtering where like,
we can make the video better
and improve the light background
or make people's eyes bigger
or like cat filters or something like that.
So then that kind of stuff is probably a different kind of,
underlying AI substrate than recommendation models.
If you were a hardware person trying to kind of attack the China market, would you say then,
OK, let's focus on recommendation models
or let's focus on multimedia filters or something like that?
Is that essentially what you were observing?
So this is probably challenging:
building a general-purpose architecture for those diverse workloads.
So that is why, you know, many Chinese internet companies started to build
their own chips.
You know, the GPU is still the major computation power for general AI workloads,
but for some specific workloads,
many companies prefer to build their own accelerators.
You will see Alibaba is building its own chip, and ByteDance, they invested in some AI accelerator companies
to accelerate their workloads.
So every company has its own requirements
on computer architecture.
In the future, probably the better way
is to provide an agile chip design and development
methodology.
Then it's easier for us to tackle
such diverse requirements.
So that is why I finally chose open source and agile
hardware design as my direction.
Yeah, certainly.
That's a very pertinent point.
If you look at the chip design lifecycle,
the design costs are pretty high.
And bootstrapping a new chip design
tends to be a pretty labor-intensive and time-consuming effort.
So maybe this is a good point to sort of talk about your efforts. You've been building
an open source RISC-V based processor, as you mentioned. So can you tell us about the effort,
you know, the current state, what have your learnings been during that entire process?
We launched the Xiangshan project two years ago, in 2020, but actually we had been
preparing for this project almost since 2015.
Yeah, so this has been a long journey to this project.
So I used to run a project called Labeled Computer Architecture. And we built a simulator,
we built an FPGA prototype, and we published a paper on it. Then I looked for feasible solutions to build prototype chips, for example, like x86
and ARM chips, and I didn't figure out a feasible solution.
And finally, I found that RISC-V and the Rocket Chip were just the solution, the right solution, for us.
And we intended to use Rocket Chip to build the labeled RISC-V architecture
prototype chips.
Yeah.
After choosing RISC-V and Rocket Chip, we actually made two decisions.
One decision is using Chisel rather than Verilog.
But not everyone supported this decision, even in my group.
We did a set of experiments to compare Verilog and Chisel.
So the experiments were divided into two phases.
One phase was: I asked two guys to complete a level-2 cache and integrate it into the Rocket Chip.
One was a senior student, an undergraduate.
The other was actually an engineer who is good at caches and has studied,
like, the OpenSPARC T1 cache and the Xilinx cache.
He actually studied those caches.
And, okay, the senior
undergraduate student used Chisel,
and the engineer used Verilog.
The result is that the undergraduate only took three days
to complete the work, yeah,
and integrate the cache into Rocket Chip,
and then it can run like Linux, even with the DMA.
So it's fantastic.
But the engineer actually spent six weeks
to finish the L2 cache and integrate it into the Rocket Chip. And there were still some bugs. So this is a very impressive comparison.
And there is also a difference in the lines of code.
The Chisel version has only one fifth of the lines of code of the Verilog version.
So the productivity difference is huge.
But the engineer refused to accept this result. Yeah, he said his design
is better. His design is better in terms of, like, frequency, power, and area. Yeah, so then
I asked another senior student, also an undergraduate student.
He had never learned Chisel, but he could write Verilog code.
So I asked him to rewrite the engineer's Verilog code, to translate the Verilog code
into Chisel, and then pass the verification program written by the engineer.
And he actually took about one week to finish this task.
And we could see that, in all the data, the PPA, actually most of the PPA was better than the engineer's Verilog version.
So that is quite impressive for us.
Of course, this was done on FPGA, not on the ASIC flow, but anyway, we can see that Chisel can do quite good work, at least for this fast
prototyping.
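Much of the productivity gap in that experiment comes from Chisel being embedded in a host language, so repetitive hardware structure can be generated rather than spelled out by hand. As a rough illustrative analogy (in Python, not actual Chisel; the generator below is hypothetical), a few lines of host-language code can emit an arbitrarily wide one-hot mux in Verilog:

```python
# Illustrative analogy: a host-language generator emits repetitive HDL
# structure from a loop, where hand-written Verilog would spell out
# every case for every width.

def emit_onehot_mux(width: int) -> str:
    """Emit Verilog for a `width`-way one-hot mux over 8-bit inputs."""
    ports = ", ".join(f"input [7:0] in{i}" for i in range(width))
    # One AND-mask arm per input, OR-reduced together.
    arms = " |\n        ".join(
        f"({{8{{sel[{i}]}}}} & in{i})" for i in range(width)
    )
    return (
        f"module mux{width} (input [{width-1}:0] sel, {ports}, "
        f"output [7:0] out);\n"
        f"  assign out =\n        {arms};\n"
        f"endmodule\n"
    )

# One parameter change rescales the design; the hand-written equivalent
# would be rewritten by hand for every width.
print(emit_onehot_mux(4))
```

Chisel generators work at a higher level than this string-pasting sketch (they build a circuit IR), but the leverage is the same: parameters and loops in the host language replace hand-duplicated HDL.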
Actually, I gave a talk at the Visioning Workshop in 2019, at ISCA, the SIGARCH Visioning Workshop. In China, many companies started
to pay attention to Chisel.
And some companies then formed teams
to build Chisel libraries and to help
accelerate the development cycles.
So these were our experiments.
Then in our group, everyone was convinced to use Chisel.
But anyway, there were still a lot of concerns about Chisel,
since people would say that this is only a small module.
Like, an L2 cache is just a module.
There was no evidence, no case study, to show that
we could use Chisel to build a complex, high-performance processor.
So our next decision was to use Chisel to build a high-performance RISC-V CPU core.
This is the Xiangshan.
With the Xiangshan project, we target high-end ARM processors.
But anyway, it's still far away from the ARM chips.
But we hope in the future we can approach that target.
So the first version, we actually taped it out last year.
It was in 28 nanometers, and it can run at 1 gigahertz.
The chip was back this February, and it was brought up successfully,
and we finished all tests in three weeks.
Everything actually seems quite good.
And with DDR4-1600 (yeah, the DDR is not that fast),
the chip can run at 1 gigahertz, and we ran all of SPEC CPU 2006 and got a score.
The score is 7 points per gigahertz.
Actually, if we don't consider, don't look at, like, power or area, this performance is similar to the ARM
Cortex-A73. But anyway, the Cortex-A73 is much better in terms of power, yeah, and also its frequency
is much, much higher than the first generation of Xiangshan.
And now we are working on the second generation. It will tape out next month, yeah, in
this May. And for the second generation, now the frequency can reach 2 gigahertz in 14-nanometer technology.
The performance is 10 points per gigahertz for SPEC CPU 2006.
This is the second generation.
Yeah, you are right.
Just as you mentioned.
Actually, the third generation is ongoing.
Every year, there will be a new version of Xiangshan.
Right.
That's a fascinating journey.
From your early observations that it's
difficult to translate your ideas into actual chips,
into having your second version of the chip out
and the third version in design is quite fascinating.
I think there were a few takeaways for me
from that entire experience, starting from the early days when you had to bootstrap this and figure out, okay, how do we actually do
this?
What's the right set of tools that you can use to build this?
Using the experiences of your undergrad students to actually convince people that it's possible
to be productive, and it's actually the chip designer that matters more than the tool itself.
The tool is quite productive.
I think your early experience showed that the second undergraduate student was able to take the Verilog design and translate it into Chisel. And going from an L2 cache to an end-to-end RISC-V processor that actually works and is
competitive on certain metrics with some of the ARM
processors is quite fascinating.
So maybe I can ask you, what's next on the horizon
for this project?
So you have started with an L2 cache to a processor.
Now you have the second generation out in the lab
tested.
And it looks like you have the third generation as well coming
out. How do you see this being used? What's your hope for how these kinds of open-source but
yet high-performance processors, built using open-source tools that also improve developer
productivity, will be used in the future?
Yeah, I think, from our perspective, the main merit of the Xiangshan project is not
only the Xiangshan processor itself but also the framework, the chip development
framework. So actually, there are already many other frameworks, like
Chipyard from Berkeley, like OpenPiton from Princeton, like BlackParrot from the University of
Washington, yeah, and also CHIPKIT from Harvard, yeah, from Brooks's group. So there are many other frameworks, but I think
that the framework of Xiangshan has its own features. It actually focuses
more on not only agile design but also agile verification. Actually, we built a bunch of tools to support agile verification.
A bunch of tools, yeah.
There are more than 10 tools for verification.
So what we needed to address is: when
we make a modification, or modify a design (for example,
currently we have the second generation,
then we need to build the third generation,
and we have to make a lot of modifications
based on the second generation),
we needed to address how to verify its correctness.
How do we know the modification is correct?
And we also need to get the performance change:
what's the impact of the modification on performance?
These two issues are very important for agile development.
Otherwise we cannot move fast; the design part is actually easy.
Since you are doing TPUs, you know that verification
is actually time-consuming,
and it also needs a lot of engineering work. But we have some interesting observations
on doing agile verification based on Chisel.
Actually, Chisel is a very good language for agile verification,
since it is just like LLVM:
we can write many passes.
A pass, a transform,
can be inserted into the Chisel flow.
So recently, there is also a new IR, based on MLIR.
We can write our own transforms based on this IR.
For example, we have built a tool that is able to map waveforms back to Chisel code.
When you are watching the waveform, you will see, for each waveform event, which Chisel code is related to it. Yeah, so
this kind of tool can actually help us do, like, agile debugging, yeah, agile
verification. So I think Chisel is actually a very good framework on which to build a lot of extensions
for agile verification.
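The waveform-to-source idea he describes can be sketched abstractly. In this toy Python model (hypothetical data structures, not the actual Xiangshan toolchain), every IR node carries the source locator that the Chisel/FIRRTL flow attaches during elaboration, and a simple pass inverts that into a signal-to-source map, so a signal toggling in the waveform can be traced back to the line that generated it:

```python
# Sketch of a source-location pass: invert per-node locators into a
# signal-name -> "file:line" map for waveform debugging.
from dataclasses import dataclass

@dataclass
class IRNode:
    signal: str      # name as it appears in the emitted Verilog / waveform
    src_file: str    # source file that generated this node (hypothetical)
    src_line: int    # line in that file

def build_source_map(ir: list[IRNode]) -> dict[str, str]:
    """Pass over the IR: signal name -> 'file:line' locator."""
    return {n.signal: f"{n.src_file}:{n.src_line}" for n in ir}

def locate_waveform_event(source_map: dict[str, str], signal: str) -> str:
    """Given a signal seen toggling in the waveform, point at its source."""
    return source_map.get(signal, "<no locator: signal was synthesized>")

# Hypothetical IR with two annotated signals.
ir = [
    IRNode("l2_miss_valid", "L2Cache.scala", 142),
    IRNode("refill_fire", "Refill.scala", 57),
]
smap = build_source_map(ir)
print(locate_waveform_event(smap, "l2_miss_valid"))  # L2Cache.scala:142
```

The real flow works on a compiler IR rather than a flat list, but the essence is the same: keep locators alive through every transform, and the waveform viewer can link each event back to the generating source.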
Yeah, I think this is really interesting because I think in the sort of the history of our
field, we introduce tools that sort of allow people to move further and further away from
the lowest levels of the machine. And so that sort of enables the scaling,
in some ways, of human capital:
more people are able to do a good job.
But the key is, as you say,
a framework that's able to sort of translate,
make things easier for the users on the top,
but still produce quality in the bottom.
And that production of quality means like,
you can attack the problem in this framework layer
by making sure that, you know, you continually improve it. So, similar to, like, compilers, right? Everybody used to write in
assembly, which means you had to have the machine in your head. But then if you continually attack the
compiler problem, then you can, you know, continually improve the quality of the machine
code underneath, right? And so I think what you said right there, because, you know, when I first
heard about Chisel many years ago, one of the things that you would hear is that it's just impossible to debug, right? Because now
you've introduced a layer between the final user and the bottom level, and
you can't go in and figure out, you know, what went wrong. And if you have an idea yourself, like maybe
the engineer has an idea of what you want the bottom level to look like, and you can't actually
make it do it because the layer in between is too soft, then it makes it difficult. So that thing that
you just said there, where you're actually able to watch the waveforms go by and have them
pinpointed back to the Chisel code that is sort of instigating that waveform, like,
this is the part of the code that's being run right now,
that seems like a major leap forward
that enables this translation
between what the user is doing
and what is like being produced at the very bottom layer.
So I think one thing you mentioned before
was that now people are doing a lot of work on Chisel itself
to make that better,
which is I think a great way to enable that framework
to continue to improve and continue
to scale the ability of humans at the top layer
to produce what they want.
So when you have students come in,
are you more targeted at them working on the chisel side
so that they can see this translation between the top
and the bottom?
Or are you having more working as users and doing design?
Yeah, this is a very good question.
So that is actually how we train our students.
So it's not easy for a junior student to participate in the Xiangshan project directly. Yeah, the project is still quite
large for them to understand, to investigate, to modify. Before students
join the Xiangshan project, we will ask them to participate in the One Student One Chip project.
So this is the One Student One Chip initiative.
So we will provide the opportunity for them to build a real chip.
This is a small real chip. After this training, they will be good at using Chisel, since we
ask them to use Chisel to build a small chip. And we also provide the
tape-out opportunity, the channel, for them. So every student can get their real chip.
When they want to get a real chip,
they need to pass some verification
or to pass some tests.
Their design needed to be able to run
a real-time operating system
and needed to be integrated into an SoC. Some of those students are assigned the physical design tasks.
So this is the whole process of our chip development
from like an architectural design, SOC integration,
and physical design.
All are done by students.
Once their design passes the tests,
then their design will be taped out.
Currently, we are using 110 nanometer technology.
It's not that expensive.
We also use a framework to integrate all those CPU cores
built by the students into one die.
So in one die, there are like 10 CPU cores.
This is a multi-core, but it's heterogeneous.
This is a really heterogeneous multi-core.
Each core is a totally different design, but they share a bus.
They share IOs.
And then when the chip is back, each student can enable their own core and
disable the other cores, and the chip can run.
So the cost can be greatly reduced; currently it's, like,
$3,000 per student. It's quite cheap.
That's amazing. That's super cool. And is that something where the One Student, One Chip project,
is that available only to students who are at ICT, or is it sort of a nationwide effort? Or how does that work?
For the first year, there were only ICT students.
This is at the University of Chinese Academy of Sciences,
which we call UCAS.
There were five undergrad students participating
in the project.
There were only five students.
And in the next year, there were 11 undergraduates
from five universities.
And then for the third year, OK,
the number actually skyrocketed.
There were 760 students all over the country, from 168 universities.
They applied for the One Student One Chip initiative, including 23 universities
from outside of China,
yeah, because some universities in the US applied. And now
we are currently running the fourth year. The
applications are already
more than 1,200, yeah, from more than 200 universities. So this has become
a nationwide initiative, and many companies donate funding for
us. Since there are so many students applying for the project,
we need a lot of TAs. We figured out a way: those
students who performed very well
in the third year become the TAs of the fourth-year students.
Yeah.
So we recruit, like, 30 to 50 TAs from the last term to serve
as TAs for the next term.
So the whole initiative can run at a very low expense.
This is a fascinating experiment.
I'm intrigued by so many different aspects of it.
And it sounds like a very valuable experience
for students as well, because they learn by doing.
They get a more hands-on approach
to understanding the concepts, to seeing it actually taped
out.
And you sort of talked about the pipeline,
where they start off applying to build the chip and sort of experiment with their ideas.
And then the following year, they become TAs
and sort of shepherd these ideas for the next batch of students as well.
So overall, this entire pipeline is extremely fascinating.
And I'm curious to hear about all the other learnings
that you've had through this experiment.
But maybe this is also
a good time to sort of wind the clocks back a little bit and understand your journey of how
you came to ICT, how you got interested in computer architecture, and sort of how you see the
state of the ecosystem as well, you know, having been in this field for a while now.
So my first time seeing a computer actually was in middle school. Yeah, I think that was, like, 1993, in my middle school. I was born in eastern China. So at that time, a company owner, an entrepreneur,
he donated 20 computers to our school.
I think they were 286 computers. So then I kind of thought,
okay, this is my whole career.
Yes, so actually you can see that I probably already
made a decision
over my career since I was a middle school student.
And then I got my first home computer in 1994.
And this is, I still remember, a Cyrix CPU.
This was a 486, a Cyrix 486 CPU.
And I wrote my first program on it.
But you know, at that time, since I was in a small town,
there was nobody who could teach me programming.
At that time, in school we only learned,
like, BASIC, and
FoxBASE, you know, the FoxBASE database, I still remember. Yeah. And then when I entered
the university, Nanjing University, I found that I was good at system courses.
But I still didn't know
how to do systems research at that time.
So, you know, my undergraduate thesis
is still on like natural language processing.
I found my research topic, and finally
dived into computer architecture, after I entered ICT.
ICT has a lot of famous systems work and systems projects in China, like the Dawning
supercomputer.
This used to be the fastest supercomputer in China.
And the Loongson CPU: actually, this is China's first general-purpose CPU. And also, there are many spin-off startups;
like, Lenovo is a spin-off of ICT.
Yeah, so I found that, wow, this is the right place.
So I found my place at ICT.
I learned a lot of computer architectural knowledge, and I also
did my first project.
This is the HMTT.
This is actually a memory monitoring toolkit.
It is plugged into the DIMM slot and can track all memory bus signals, and the captured addresses can be
translated into virtual addresses. We actually do, like, a hardware, OS, and
memory-system co-design to translate the physical addresses into virtual addresses
and to identify processes' memory behavior. This tool still works right now.
Now it can support DDR4.
And also, Microsoft
bought a set of HMTT a couple of years ago.
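The physical-to-virtual translation he describes for HMTT can be sketched roughly (hypothetical structures and numbers, not the real toolkit): the OS side exports a snapshot mapping physical page numbers back to (process, virtual page), and each physical address captured from the memory bus is resolved through that reverse map:

```python
# Sketch of HW-OS co-design for address translation: reverse lookup from
# captured physical addresses to (pid, virtual address).

PAGE_SHIFT = 12  # 4 KiB pages

# OS-exported snapshot: physical page number -> (pid, virtual page number)
reverse_map = {
    0x1A2B: (1001, 0x7F3E2),
    0x1A2C: (1001, 0x7F3E3),
    0x0042: (2002, 0x00010),
}

def phys_to_virt(paddr: int):
    """Translate a captured physical address to (pid, virtual address)."""
    ppn, offset = paddr >> PAGE_SHIFT, paddr & ((1 << PAGE_SHIFT) - 1)
    if ppn not in reverse_map:
        return None  # e.g. a kernel or DMA page not in the snapshot
    pid, vpn = reverse_map[ppn]
    # Page offset is identical in both address spaces.
    return pid, (vpn << PAGE_SHIFT) | offset

pid, vaddr = phys_to_virt(0x1A2B123)
print(f"pid={pid} vaddr={vaddr:#x}")  # pid=1001 vaddr=0x7f3e2123
```

The real system must also keep the snapshot consistent as the OS remaps pages, which is exactly why it is a co-design rather than a purely hardware tracer.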
So this was my first project. And later, actually, in 2010, after my PhD,
I went to Princeton University, working
with Professor Kai Li on PARSEC.
And that actually opened my eyes to computer architecture. I got to know more people, more,
like, pioneers in computer architecture, and I started to know many, like Lisa, yeah. I made many friends in the area. So actually, this is also what the community gave to me.
Yeah, so I appreciate all of those who helped me.
That's a great story.
Because I think, you know,
from all the folks that I've interacted with in the field,
there's, like, a huge range of people saying, you know, "I saw a computer and I fell in love," or, "I showed up in college and I
still didn't know anything, and then I fell in love." Either way, you
know, everybody falls in love; everybody that I know of in our field who, like, really has an
affection for our sort of field of study
succeeds because they're really passionate about it. And it's cool that
you sort of, like through the generosity of this one
entrepreneur, sort of like set the course of your life, you know, when you were a
child. So that's pretty cool. And you're sort of
taking it full circle now where, you know, you've instituted this program
that has sort of not only nationwide, potentially internationally,
you know, giving lots and lots of students the opportunity to do something cool that, you know, I didn't
do in school myself. And like, I almost wish like, oh, man, you know, I wish I was in school
now and I could do that too. So that's very awesome. You know, so with this really, you
know, this kind of tandem effort that you have with A, building an open source,
high performance processor with RISC-V and B, using that as a vehicle to develop frameworks
to enable more productivity and teach students.
So for a long time, every year we have a lot of Chinese grad students and Chinese
students coming to the United States, coming to American universities, to learn the craft
of computer architecture. And so now, with this homegrown effort out there, are you finding
that, A, students are staying more, and, B, students internationally are coming to Chinese universities to study computer architecture?
And then, as a little bit of a tongue-in-cheek question, do you guys use Patterson and Hennessy?
Yeah, so actually you will see that, for the second term of the One Student One Chip project, we have one student who,
after he finished the task,
applied for the PhD program at MIT.
And finally, he got an offer.
And another student applied to the University of Toronto and also got the
offer, but finally he failed to go due to the pandemic and visa issues.
So he stayed in China.
But you will see that actually what you mentioned, A and B,
for me, they are combined together.
You know, in the first year of the One Student One Chip initiative,
all five students
became the major force of the XiangShan project.
And in the second year, some students went abroad, and
many students stayed in China and joined the XiangShan project. So I think, for the One Student One Chip
project, this is open.
We encourage students to go abroad and to learn
more expertise from all over the world.
Yeah, we encourage them to do that.
Yeah, but for the other side, you mean,
are there any international students returning back
to China?
Yeah, so we will see some students returning, since, you know, there are more and
more opportunities and a lot of jobs in
China. As I mentioned, many internet
companies start to build chips, so there are a lot of positions in the market. And they
can pay quite well.
I was actually more asking, you know, so, like, the United States used to be a very big
sink for students all over the world.
And so I guess I was more wondering
whether you were detecting that China,
because of some of these things,
was also sourcing students, non-Chinese students,
from outside of the country.
Like whether, say, European students or Turkish students,
or even American students or Canadian students,
are actually applying to Chinese universities. Because, you know, I don't live in China, so I
sort of always imagine Chinese universities being filled with Chinese people, but maybe that's not
the case, and I'm just curious.
Yeah, you are right, currently most students are still Chinese students. It seems like more international students
like to go to the US,
since there seem to be good opportunities there
in things like teaching skills
and also many other environments.
But for Chinese students, I think, in China
there are also more and more opportunities.
So this is also an attraction for students abroad.
I did want to ask, what textbook do you guys use?
Do you guys use the same ones as the rest of the world?
I forgot that question.
We do use Hennessy and Patterson's textbook for graduate students.
And for our undergraduates, usually, actually, many universities have their own textbooks written in Chinese. And, for example, there are already some textbooks on RISC-V. And Professor John Hopcroft from Cornell,
actually, he helped a lot to guide, or to instruct, Chinese universities
to write textbooks.
So recently, in China, we launched the 101 Project. This means for 10 courses, we want to write a good textbook for each
course, with the joint efforts of many universities. So there are 33 universities, and from each university there is at least one faculty member participating
in the 101 Project to help refine the textbooks, yeah, textbooks for 10 basic courses of computer
science.
That's great. Speaking of students and opportunities,
any words of wisdom
or advice that you would have for
students today, or just other listeners
of this podcast as well?
My suggestion is
just: learning by doing.
So
concepts are
important, but we also
need to get our hands dirty.
So, to learn by doing.
If we can do more practice, we can master the knowledge more comprehensively.
So I think this is always true.
I hope those students who are excited about building physical things
will just dive into the computer architecture
field.
Awesome.
Well, that was really wonderful.
I super, super enjoyed talking with you today, Yungang.
And I'm sure I speak for Suvinay too,
in that this was just a really, really interesting session
to hear about these great initiatives that
you're putting forth.
And it's been a total, total pleasure to speak to you today.
Thank you so much for joining us.
Thank you for having me.
Thank you very much.
Yeah, thank you, Professor Yungang
Bao.
It was an absolute delight talking to you.
And to our listeners, thank you for being with us
on the Computer Architecture Podcast.
Till next time, it's goodbye from us.