Screaming in the Cloud - Networks and Sustainability in Computing with George Porter
Episode Date: March 21, 2024

George Porter, a computer science professor at the University of California, San Diego, talks to us about advanced networking and the effects of computing on the environment. In this episode of Screaming in the Cloud, George explores the shift toward optical networking in data centers to meet growing bandwidth needs and discusses the significant carbon footprint associated with computing, from data centers to device production. In addition to providing a look into the future of scalable, sustainable computing systems, George mentions the difficulties and benefits of incorporating cloud computing into academic research.

Show Highlights:
(00:00) - Introduction
(03:15) - The Shift to Optical Networking
(07:50) - The Efficiency of Cloud Networks
(12:06) - Adaptable Networks for Different Uses
(16:19) - Reducing Computing's Carbon Footprint
(20:25) - Highlighting Computing's Environmental Impact Through Art
(26:51) - Cloud Computing Challenges in Academia
(31:18) - The Benefits of Cloud Computing for Academic Research
(34:14) - Closing Thoughts

About George: A computer science professor at UC San Diego focusing on high-performance and sustainable computer systems.

Links:
Center for Network Systems at UCSD: https://cns.ucsd.edu/
Low Carbon Computing and Collaboration with the University of San Diego: https://c3lab.net/
Transcript
In order to build enough switching capacity, you started to have to build these really complicated topologies in the data center network.
Different switches interconnected in different ways.
And that drives up cost, and it drives up power, and it becomes a barrier to kind of deploying stuff.
Welcome to Screaming in the Cloud.
I'm Corey Quinn. My guest today is a little bit off the
beaten track for people I normally wind up speaking to who are usually doing interesting,
or at least things, in the world of industry. George Porter, instead, took his life in a little
bit of a different direction. He's a professor at the University of California, San Diego,
in the computer science department.
George, thank you for joining me.
Hi, Corey. Thank you for having me on. It's a pleasure. We've talked on Twitter,
so it's nice to talk in person.
This is honestly one of the first conversations I'm aware of having with a computer science professor where it wasn't a very different dynamic when I was in the process of failing
out of college 20-some-odd years ago. I'm surprised, like, wow, I can't see the chip on your shoulder from here, nor the very thinly
disguised drinking problem as you start trying to shove algorithms down the throat of someone who
isn't really equipped to handle it. Oh, well, I'm just a networking professor, so I don't have to
worry about algorithms. We can just write some code and try some stuff out and see if packets
get through. That seems as good a place to start as any.
Because back in my day, which is a terrifying turn of phrase, but by God, it's become true.
Networking was always perceived as being a very vocational area where, oh, academia,
working with networking?
Nonsense.
My first job was as a network engineer at Chapman University without having a degree myself. I was
viewed as, in many ways, akin to various other people in the facility staff of just make the
packets go back and forth, and that was the end of it. But now it's, you're telling me it's become
a full-on academic discipline. It has, I'm afraid to say. All the fun's been drained out,
and now we're being very rigorous and creating theories and models and things like that.
No, I kid, but I actually started very similarly. My first job in high school was at an internet provider in Houston called Neosoft. And so it was sort of, you know,
this was the mid-90s. And like you said, there was none of this at scale, cloud, public cloud,
private cloud. There was basically just, you know, hey, we finally got
a connection to a new website from Coca-Cola. They're on the web now, you know, it was brand
new. But the reality today, though, is for our students who are graduating, pretty much
regardless of what they're interested in doing, they need some ability to connect the software
they're writing to either other nodes, to the cloud,
to download things, to update things, to push software updates.
It's just, you know, networking is so important.
I have been cited somewhat recently now as pointing out that, you know, without the network,
all this cloud computing is just basically a series of very expensive space heaters.
If they can't talk to anything, there's not a lot of value behind it.
You've been talking a lot lately academically
about the idea of optical networking,
which on the one hand struck me as,
oh, so what?
What's the big deal on that?
We've had fiber runs that people have been tripping over
and breaking in data centers
since longer than I've been involved in that.
What's changed in the space?
Oh, it's actually very interesting.
So like you said, in a traditional data center,
you're going to find fiber all over the place,
running to all the racks, running to rows,
running to telecom rooms and et cetera.
What's really changed over the last 15 years or so
has been the introduction of optics into the actual switching process.
So you might think of using fiber to interconnect, let's say, a Broadcom switch to a Cisco switch,
connect it to a Mellanox NIC, or I guess now an NVIDIA NIC.
You know, the fiber is used to carry data from one place to another.
But once it actually gets to that switch, to that router, to that end device, it's converted back into electronics where traditional 1990s-style networking can happen.
We can look at the MAC address, we can look at the IP address and do something with it.
That has become a real bottleneck and a real barrier to deployment. So to build a network
at cloud scale that can support thousands of machines, GPUs, TPUs, solid state storage devices, etc., etc., etc.,
the bandwidth requirements are just growing so quickly that actually getting data into and out of those packet switch chips has become a big problem.
And so my group at UCSD and other academic groups dating back about 10 or 15 years have started looking into how can we actually replace those switch chips, those switch devices with fully optical devices. And that was very sort of
sci-fi, very far into the future kind of research. And what's interesting over the last, really just
since the pandemic, even the last year or two has been to see hyperscalers talking about how they've
successfully deployed this technology,
Google most particularly.
I was always stuck in a mindset
where fiber is expensive, first off.
Secondly, oh, anyone who has two neurons to bang together
can basically crimp down an ethernet patch cable,
as evidenced by the fact they trusted me to do that
once upon a time.
Cable testers are worth their weight in gold,
but grinding fiber is a very specific, expensive skill set. So I was always viewing it as,
it's what you do for longer runs, where you start to exceed the spec for whatever generation of
CAT cable you're using. When I started, it was CAT 5. Now I think it's 7 or 8, who knows. But
the speeds continue to increase on the Ethernet side of it. What is the driver behind this?
Is it purely bandwidth?
It is in a lot of ways bandwidth.
So like you mentioned, copper is really convenient and it's really durable.
I've run over some Cat5 with a vacuum cleaner and it works fine.
So it's not a big problem.
The big issue is that as you go to faster and faster bandwidths, there is a property
of copper cables called the skin effect that
means you need to put more and more energy into that cable in order to drive those high
bandwidths.
So when you went from like 100 megabit to 1 gigabit, we could stay on Cat5, or kind of Cat5e. As we go up to maybe 10 gigabits a second, you start running into these technical limitations to driving that copper.
And so when you want to start looking at 100,
200, 400 gigabits a second, terabit
Ethernet, all the way to the desktop,
all the way to the server, to the
device, you really have to go optics.
You have to go fiber.
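A rough way to put numbers on the skin-effect point: at high frequencies, current crowds into a thin shell at the conductor's surface, so less of the copper's cross-section carries the signal and more drive power is needed. Here is a minimal Python sketch of the standard skin-depth formula, using each line rate as a loose stand-in for the analog signaling frequency (a simplification; real PHYs use multi-level coding):

```python
import math

# Skin depth: the depth at which current density in a conductor falls to
# 1/e of its surface value for an alternating signal.
# delta = sqrt(rho / (pi * f * mu)), with mu ~= mu_0 for copper.

RHO_COPPER = 1.68e-8       # resistivity of copper, ohm-meters
MU_0 = 4 * math.pi * 1e-7  # vacuum permeability, H/m

def skin_depth_m(freq_hz: float) -> float:
    """Skin depth in meters for copper at the given frequency."""
    return math.sqrt(RHO_COPPER / (math.pi * freq_hz * MU_0))

# Line rates as a rough proxy for signaling frequency.
for label, f in [("100 Mb/s", 100e6), ("1 Gb/s", 1e9), ("10 Gb/s", 10e9)]:
    print(f"{label}: skin depth ~ {skin_depth_m(f) * 1e6:.2f} micrometers")
```

That works out to roughly 6.5 micrometers at 100 MHz, shrinking toward 0.65 micrometers at 10 GHz: the usable copper keeps thinning as speeds climb, which is why pushing copper past 10 gigabit gets so power-hungry.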
I remember back in the early days of my own career, you look at
things like OC-192s, which were
basically the backbone of the Internet
where they could handle
speeds, not quite what you're
talking about, but darn close, over huge, huge spans. That was always giant bundles of fiber
that were doing those things. I have no idea what the actual termination of those things look like,
because I assure you, you did not let the relatively wet-behind-the-ears kid who's accident-prone near the stuff that wound up costing more in some cases than the building it was housed in. Oh, absolutely. And for wide area applications where you're running from San
Francisco to Los Angeles or LA to San Diego or something like that, that fiber is extremely
expensive because you've got all the right-of-way and the trenches and all this kind of stuff.
When you go into the data center, though, you can pull bundles of fiber. It's super cheap.
It's actually quite inexpensive. And then really the endpoints become kind of expensive. And so the thing that we were
addressing, one of the problems we were looking at was just the fact that as we start driving
bandwidth up to the devices, again, in order to build enough switching capacity, you started to have to build these really complicated topologies in the data center network, different switches interconnected in different ways. And that drives up cost, and it drives up power, and it becomes a barrier to kind of deploying stuff.
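The speakers don't name a specific design here, but the classic fat-tree topology (from Al-Fares et al. at UCSD) makes the scaling pressure concrete: building a full-bandwidth network out of fixed-port commodity switches means the switch count grows quadratically in the port count. A quick sketch:

```python
def fat_tree_counts(k: int) -> dict:
    """Switch and host counts for a k-ary fat tree (k must be even).

    k pods, each with k/2 edge and k/2 aggregation switches, plus
    (k/2)^2 core switches, all built from commodity k-port switches.
    """
    assert k % 2 == 0, "k must be even"
    edge = k * k // 2       # k pods * k/2 edge switches
    agg = k * k // 2        # k pods * k/2 aggregation switches
    core = (k // 2) ** 2
    hosts = k ** 3 // 4
    return {"hosts": hosts, "switches": edge + agg + core}

for k in (8, 16, 32, 64):
    c = fat_tree_counts(k)
    print(f"k={k}: {c['hosts']:>6} hosts need {c['switches']:>5} switches")
```

With 64-port switches, supporting 65,536 hosts takes 5,120 separate packet switches, each one burning power and adding cost, which is the barrier being described.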
One of the things that I think is underappreciated about the cloud, and I've been
whining about this for ages, which is you can turn a dial effectively on anything in AWS.
If you want more RAM, great, turn the dial. You'll spend more money. You'll get more RAM. Oh, that's too expensive. Let me turn it down. Okay. And what happens is what you would
expect. Same with disk space, same with CPUs, same with almost everything except network.
Everything in the network side of any of the cloud providers is you get one side that is
uniformly excellent. There's no more cost-effective egress tier
where you say, okay, I'd like it to be over there by,
I don't know, September.
September sounds good.
And instead, you're consistently getting that high cost
and high performance network.
And I want to be clear, because this gets overlooked,
what they do from a network perspective
is just this side of magic.
Because back when I worked in data
centers, you had top-of-rack switch congestion. I had a client at one point who built out a private
cloud and didn't think about that, and suddenly you have this massive number of servers in every
rack that's supposed to talk everywhere seamlessly, and they were bandwidth-constrained all over the
place as the top-of-rack switches started melting and dripping down the rest of the rack. I don't see any evidence of that in the modern cloud providers. You can get effectively
line rate between any two points indefinitely. And that's magic. Yeah, I think that magic is
delivered by the fact that a lot of these networking folks at these hyperscalers are
pulling their hair out to make that abstraction look like it's real. It's not, but they're able
to sort of make it look like it's real. Just like you said, essentially your ability to scale out in the
data center is limited by how big your loading dock is, because you can just start unloading
servers and devices and RAM and storage as fast as you want. But like you mentioned, it's the
network where everything has to come together. And that has traditionally been something that
has been difficult to upgrade because either you need to kind of upgrade your network first, and now it's going to be very
expensive upfront. You're not going to be able to saturate it. Or alternatively, you're going to
upgrade your devices and now your network's a problem. And so trying to figure out how to
keep those in parity is a huge challenge that a lot of these operators have.
I have to say that it's always,
I've always had an extreme sense
of talking to wizards
whenever I talk to some of these
cloud provider network folks.
AWS's Colm MacCárthaigh is the top of mind
for a lot of that stuff.
And he's always been incredibly kind,
incredibly clear in communicating.
And you're right,
discussing the fact that
none of this stuff actually exists,
which I know intellectually, but it almost feels like there's a sense whenever I talk to some of these folks of like, okay, time to go back into the land of make-believe, where we are telling stories to children about how things work within the logical structures and rules of physics bounded here.
And it's got to be weird for folks who see both sides of the snake
to figure out what side they're having
a given conversation on.
Absolutely.
And in fact, if you look at AWS,
they've built some of their own hardware to do networking.
You look at Google and they have been innovating
on building, they have a network called Juniper, I mean, a Jupiter network that involves
a lot of custom chips, custom devices,
and things like that. And I think what we're seeing now is instead of thinking of this data
center that has maybe 100,000, 200,000 machines in it, and it's got a perfect network where you can
deploy software anywhere you want in that, you can migrate anything you want. I think what we're
starting to see is a model where we actually reconfigure
the structure of the network
for a particular application.
And then when it's time to do another application,
we can actually change the way the network's built.
So we're not trying to build one network
for every application.
We're trying to adapt it
to the needs of the application.
That feels like something you would do
in the context of an HPC or supercomputing
project, where you have very specific data access patterns that need to happen incredibly quickly at
a certain point of scale, where for the next three months, this is what it needs to do. But that was
always done from a perspective of time to release the hounds, and you would have the network monkeys
go and do the reconfiguration. It sounds like you're talking about something that's a lot more dynamic.
Absolutely.
What's interesting, in the research world,
we had supercomputing dating back to before you and I were born.
And you saw a divergence of that in the early 2000s
as public cloud providers or public, you know, Google, Facebook, etc.,
well, eventually Facebook,
had needs that were just very different than supercomputing because they were running lots of applications rather than one application.
And what's interesting is, so then you saw things kind of split into these two different sectors.
And now you're starting to see them come back together again, especially with machine learning,
AI training, et cetera. It's worth it for some of these providers to set up a custom network for 6,
8, 10, 12, 24 hours and then change it for another
training job or another, you know, one of these big, huge, long running tasks.
It was one of those areas where it's just, it feels like this is dark magic. And you're right,
because whenever you talk to academics about large computing projects, it feels like I'm
suddenly talking to people from
a very different world. You're right. When I'm talking about, okay, I'm looking at massive
corporate fleets in industry. Yeah, they're all running a whole bunch of different applications,
some of whom were never allowed to talk to each other because they think internal NDAs apply or
whatnot. But in academic clusters, it's, yeah, this is the project that's going to be running
for the next foreseeable period of time because we got a grant, et cetera, et cetera. And I do the economics on that,
and it's a completely different world. I keep looking for people who can say something like,
yeah, HPC on public cloud makes perfect sense for high utilization, steady state workloads.
I just have a hard time making that economic case because at that point of scale,
it pays for itself in an embarrassingly short period of time.
Yeah, this is the interesting thing about does it make sense to kind of run on-prem?
Does it make sense to run in a public cloud?
I think the organization matters.
If you're an academic, you might get a grant to look at a particular problem, and you're not going to be able to keep an on-prem deployment necessarily busy for three years in a row.
Let alone build the on-prem deployment on that grant.
Like, okay, you need to put a few zeros on the end of that dollar figure, please.
Yeah.
And, you know, as much as we'd like to believe equipment can manage itself, it doesn't.
You need experts, people on staff who can kind of manage that.
And it becomes quite challenging. I think that in these public
cloud environments, one of the things we were just talking about a second ago, which is that
you're seeing these really long jobs for AI, ML, ChatGPT, I'm sure. And in these particular cases,
you saw an evolution where in the mid-2010s, companies like Google had these optical patch
panels where a human could go and sort of put little fiber jumpers around and actually change some of the structure of their network.
In other words, think about bringing all those fibers rather than connecting them directly to
switches or routers. You essentially put them into a Lite-Brite set where you've got a bunch of
little things on the back that they're all plugged into, and you can kind of plug stuff into the
front. And now you're seeing the evolution of that with these optical switches, where you can do that programmatically. You can actually write code that will change the
configuration for this particular next six hours, let's say. And so that's kind of something that's quite interesting, I think.
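To give a flavor of what "write code that changes the network" might look like: an optical circuit switch is essentially a programmable patch panel, and its configuration is just a port-to-port mapping. The toy model below is illustrative only, not any vendor's real control API:

```python
# A toy model of an optical circuit switch: it holds a port-to-port
# mapping (light entering input i is mirrored to output j), and
# "reconfiguring the network" means installing a new permutation.

class OpticalCircuitSwitch:
    def __init__(self, num_ports: int):
        self.num_ports = num_ports
        self.circuits: dict[int, int] = {}

    def install(self, mapping: dict[int, int]) -> None:
        """Replace the circuit configuration wholesale."""
        assert len(set(mapping.values())) == len(mapping), "outputs must be unique"
        self.circuits = dict(mapping)

ocs = OpticalCircuitSwitch(num_ports=8)

# Topology tuned for an all-reduce-heavy training job: a ring.
ocs.install({i: (i + 1) % 8 for i in range(8)})

# Six hours later, a different job wants a different shape: paired racks.
ocs.install({0: 1, 1: 0, 2: 3, 3: 2, 4: 5, 5: 4, 6: 7, 7: 6})
```

The design point is that the physical topology becomes just another piece of state a scheduler can rewrite per job, rather than something fixed at build time.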
The idea of applications reconfiguring themselves like that has been
longstanding, but networking has always felt much more manual. Like the idea of controlling it via
the infrastructure as code style approach
seems to have come very late to an awful lot of the networking world. And I get it because if you
screw up a computer, okay, we'll revert that. Screw up the network, you're driving to the data center.
Absolutely. And a lot of times the network, if it's broken, how do you fix it? Because you need
the network to access things to fix it, et cetera. The academic world often is informed by what's going on in industry. And we're responding, we're looking at trends and
roadmaps. But one thing where I think that is reversed is that there's a lot of formal theory
that's actually being brought into network configuration that's extremely interesting.
So the idea is basically, imagine you want to specify some properties of your network,
and you want to guarantee that all the traffic
entering this point goes through this firewall, let's say.
Well, you can actually write software that will ensure that all the configuration changes
respect that property.
And this is something that's really nice because it gives you more confidence the network's
going to work.
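As a toy illustration of that idea (real verification systems use far more sophisticated symbolic analysis, and the topology and node names here are invented): check a waypoint invariant over every path before accepting a configuration change.

```python
# Given a network graph and a proposed change, verify the invariant that
# every path from the ingress to any protected host traverses the
# firewall -- and reject the change if it doesn't.

def all_paths(graph, src, dst, path=None):
    """Yield every simple path from src to dst."""
    path = (path or []) + [src]
    if src == dst:
        yield path
        return
    for nxt in graph.get(src, []):
        if nxt not in path:  # simple paths only, no revisits
            yield from all_paths(graph, nxt, dst, path)

def invariant_holds(graph, ingress, protected, waypoint):
    return all(waypoint in p
               for host in protected
               for p in all_paths(graph, ingress, host))

topology = {
    "ingress": ["firewall"],
    "firewall": ["core"],
    "core": ["rack1", "rack2"],
}
assert invariant_holds(topology, "ingress", ["rack1", "rack2"], "firewall")

# A proposed "shortcut" link would bypass the firewall; refuse it.
proposed = {**topology, "ingress": ["firewall", "core"]}
assert not invariant_holds(proposed, "ingress", ["rack1", "rack2"], "firewall")
```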
Another area that you have been focusing on to a significant degree has been the idea of carbon
footprints of cloud computing, which I've heard a lot about from some AWS folks, then some Google
folks who frankly showed how disclosure and transparency should be done, relatively speaking.
And I hear about it very occasionally from customers who have a mandate org-wide to ask
the questions. How are you approaching it?
This issue of the carbon footprint of computing, broadly speaking, I think is something that's
really important and something that is a field we have to address. In terms of data centers,
it's particularly important because you're seeing so much of this on-prem deployment going to data
centers. And so even though there's not a huge number of these public cloud providers, they account for quite a bit of the compute that
underpins websites that we go to. If AWS has some sort of load balancer problem, it feels like half
the web has failed and half the websites don't work. So you get a sense of what's actually on
there. The concentration risk is no joke. Oh, yeah. And so I think globally, data centers
account for maybe 2 or so percent of the carbon footprint of the planet Earth. But that's
growing quite dramatically. And you're seeing that especially with AI and ML, Grok, ChatGPT,
OpenAI, etc. And a lot of companies had these roadmaps like you talked about for carbon neutral, net zero, whatever you want to say.
And it will be an open question how well we keep to those given that the compute requirements of AI are pushing in the opposite direction.
But just to answer your question, there are sort of two ways we've been looking at this, not just at UCSD, but elsewhere: reducing the amount of energy, and then, I think more importantly, redesigning data centers to support renewable energy,
which is a real massive generational challenge, in my opinion.
There's a significant and consistent series of stories leaking out from various places.
I saw one earlier this week from Oregon that was talking about how a small utility has apparently gone from
basic obscurity to one of the region's biggest polluters.
And apparently it's one of the power utilities supplying a bunch of data centers, specifically
Amazon's.
And it's weird because I remember circa 2016 or so where they said, oh, if you want to
run your workload on pure renewable energy, put it in Oregon.
And they stopped talking about that.
And now I'm seeing articles like this,
and I'm starting to wonder if, you know,
like things like leadership principles,
and building an arena to remind them of their pledge,
and all these other things are just zero interest rate phenomenon.
It's like, well, you know, we need to make the money printer go brr. So at some point,
we're just going to basically back away from all of that.
That is one potential explanation. I think another one is simply the fact that if you look at low
carbon energy sources, the big issue you have is what's called intermittency, meaning that they're
not always available. Solar power is super cheap and it's gotten really cheap over the last 10 years. But even here in
Southern California, it's not sunny 100% of the time, it turns out. We have night here as well.
And if you look at other sources like wind, they're intermittent as well. And so the sources
that are low carbon that are available pretty much all the time are things like hydro. And so I think that was Oregon, I want to say in The Dalles, Oregon; they had a data center drawing on the Columbia River there.
And that's where they were getting a lot of their energy 24-7.
Nuclear is an example of a source where you can get power 24-7.
But if you look at the grid, I think you're seeing a small amount of power that's available all the time.
And then you have this huge percent,
maybe 80%, that is available intermittently. And so as you grow, as you deploy, Amazon gets bigger,
AI is more important, just meeting that need. I think it might be a little bit difficult with
those always on low carbon sources. And so if you start sourcing from coal, natural gas,
it's going to drive your carbon footprint up.
You recently did something that I found fascinating.
You collaborated with someone on effectively doing theatrical storytelling on how computing affects the planet.
Tell me more about that, please.
Well, sure.
I figured this might be of interest to you, potentially, given that you have to act in some sense several times a week.
I am the dancing monkey of cloud computing.
It's an esteemed title, I will say. But no, my wife, Dr. Monica Stufft, is a professor at the University of San Diego, which is also here in San Diego.
We have three universities that all have the word San Diego in the title. It's very confusing. And one of them happens to be in Los Angeles. I kid, I kid.
It could be. I don't know. We did a collaboration sort of during the pandemic, or at least it started then, where we were sort of saying, okay, you know, the compute sector of the economy has this huge carbon footprint. And in general, people really don't have much of an awareness about that. We understand cars, maybe ride your bike.
We understand public transit.
There's a lot of aspects of things that people understand.
But computing is really very opaque.
We talk about the cloud.
It sounds very happy, that kind of thing.
And so she had a set of theater students work to tell stories about climate change and about
climate, in particular, things that intersected
with computing. So it might be the energy needed to run all of this AI stuff. So that might be
something to try to convey. And it also might be the carbon footprint of making this stuff.
So if you think of like your smartphone in your pocket, the carbon footprint of making that
smartphone is really high. It involves a global
supply chain. It involves making all these different chips. And we bring all of that together
to build this phone. And a typical person only keeps it for about 20 months. And so all of that
environmental impact is essentially thrown away at the end of 20 months. And so that's why the
vast majority of the carbon impact of a laptop or a smartphone
has already been spent before you even turn it on. So even if you powered it with zero carbon energy,
a huge percentage of that total lifetime carbon footprint is going to be because of just making
that device. Yeah, the smelting facility making the metals for these things is going to have a heck of a larger carbon footprint than whether or not you decide to power it via a central gas-fired power system or solar or what over the lifetime of that phone.
It's really true. And so you look at things like cobalt that comes from the Congo, graphite, lithium, other kind of elements that all come together to make it happen. So some of her students collaborated with the computer science students at UCSD and created
a set of performances that highlighted maybe keeping your phone for longer,
or how do you kind of keep a device in use longer as a way to lower its carbon footprint?
This was, I thought, a really interesting sort of collaboration to try to raise awareness in some sense of some of these issues. I will just say, if you turn on, well, it was PBS. I think we have a thousand channels now,
but Science Channel, whatever you want to call it, anything on Netflix,
there's a million scientific shows about how black holes work and wormholes.
But then there's really nothing about how an email gets sent or how a Google search works
or anything like that. And so I feel like in some sense, a lot of students, a lot of people have a better understanding of, you know, relativity and
quantum physics than they do about the technology they use every single day. It also is like the
old bike shedding problem, which originally came from a discussion, I think on Usenet, where they
said, okay, if you say you're going to build a nuclear reactor, very few people are going to challenge you on the merits of that because no one knows how to build a nuclear reactor.
But if you say you're going to build a bike shed,
well, suddenly everyone is coming out of the woodwork
talking about what color you should paint it,
and you get nibbled to death by ducks in those stories.
And I have to say, the first time I learned
how an email gets sent or a Google search worked, it's like, yeah, that can't possibly work.
How does it really happen?
It is still magic to me that these things work in spite of themselves, not because of them.
And I used to run large-scale email systems.
I know for a fact that it works, but it still boggles my mind that it doesn't at all.
And it works extremely well, so well that when something
is even slightly not working, we get really upset. Like you're on an airplane and you're like, oh,
my email's slow or something. And you're like, you're on an airplane sending an email. And so
making that all that work has been kind of a miracle. Now, I will say that one of the reasons
sort of over the last, say, 20, 25 years that everything has been able to work is because we've
been able to scale things out. And like you said, scaling out many hundreds of servers, thousands of servers for a single application,
spreading an application across different data centers, et cetera. And as we leave this kind of
zero percent interest world, and as we start taking climate change
more seriously, we're going to have to just re-architect, I feel, the way we build software
so that it can better match with things like renewables. So this idea that I have a data
center, it's always on, that it's got gasoline generators in case the power goes out, it's got
UPSs, everything is very perfect, is going to be hard to keep going when we have a world where you want to be sourcing
from wind and solar, for example. And so I think that's one of the big challenges that we're going
to have to face moving forward. It's just a big problem space that it's hard to wrap your heads around.
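One concrete form that re-architecting can take is carbon-aware scheduling: deferring flexible batch work into the hours when the grid is cleanest instead of assuming always-on power. Here is a minimal sketch, with an invented carbon-intensity forecast:

```python
# Given an hourly carbon-intensity forecast for the grid (gCO2e/kWh),
# schedule a deferrable batch job into the cleanest contiguous window.

def cleanest_window(forecast: list[float], job_hours: int) -> int:
    """Return the start hour minimizing total carbon over the job."""
    return min(
        range(len(forecast) - job_hours + 1),
        key=lambda s: sum(forecast[s:s + job_hours]),
    )

# Intensity dips midday as solar comes online (hypothetical numbers).
forecast = [430, 420, 410, 390, 320, 240, 180, 150, 160, 210, 300, 400]
start = cleanest_window(forecast, job_hours=3)
print(f"run the 3-hour job starting at hour {start}")  # hour 6 here
```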
One last topic I want to get into before we end up wrapping the episode is bringing it
back down a bit to something a lot more prosaic. One of the reasons that I've always found you to be such a great Twitter follow
has been you periodically do talk about using cloud computing for things.
And in the land of industry, it's very different from what I can tell,
where, okay, you're going to have to hire some people past a certain point of scale
to handle the care and the feeding of the cloud, but that's okay.
You're presumably going to make money.
Ha ha ha.
This is a zero interest rate environment
with VC money slopping everywhere.
You sure about that one, professor?
Yeah, well, that's changing a bit.
But in academia,
the way that the grant process works,
the way that you effectively get
more or less free indentured servitude
in the form of grad students
lurking around here and there,
but they often don't have the 15 years experience that it generally takes to avoid some of the sharp
edges. What do you see is different in managing the care and feeding of cloud environments and
workloads when you're doing it from a perspective of an academic? Yeah, this is a really interesting
question because a lot of my colleagues, a lot of my friends, but myself in particular, I think it's really important to give students hands-on practice, hands-on experience, building something, deploying something.
Yeah, but this is like giving them credentials, root passwords to, say, AWS.
And we've done that in the past.
And, you know, our job is to do research.
So we're trying to study some problem.
We're trying to deploy something, run an experiment, collect some data and things like that.
And the complexity of something like AWS or Google Cloud or Azure is a real benefit because it means that the data that we collect,
the experiments that we do are relevant to industry. And so that helps us with impact.
But the challenge is that we cannot staff up, like you said, an organization of people who can
manage our cloud resources, look at permissions, give students different access to things, etc. And so this means that occasionally a student or a faculty member
will accidentally do something like commit our AWS credentials to GitHub.
And so within a few hours, all of a sudden,
a hacker has spun up thousands of VMs running crypto mining software.
And our bill is $20,000 in 12 hours or something
like that. Now, it's hard to kind of have a problem like that when you have an on-prem
deployment like we used to. But now it's very easy to just fat finger something or type something
incorrectly, and suddenly your bill is huge. Let they who have never done that cast the first stone. I mean,
people don't usually ask me the question, but, huh, why are you so passionate about AWS bills?
And it didn't, it didn't come from nowhere. I screwed it up and was made to feel like a fool
for it. And it's okay. Let's see if I can smack back a bit. Well, you know, you might feel that way, but absolutely you're not. And I want to be clear, I've done this myself, I've had students who do this, and it's not necessarily
our fault. What it is, is just the fact that you have a system that is sort of infinitely scalable
in every dimension. And even small decisions like, does my VM need a world-accessible IPv4
address or not? Well, that might change my bill by $500 a month
or something like that. Or, you know, I typed dash N 100 thinking I meant one thing, but actually I started a hundred VMs. I thought I was starting a hundred gigabytes of memory, but actually I made a hundred different copies of my system or something like that.
And those kind of things are really challenging.
And the providers try to do their best, I think, to help out kind of academic users.
But it is pretty difficult.
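One concrete guardrail for exactly this failure mode is a CloudWatch billing alarm that pages someone when estimated charges cross a threshold. A minimal boto3 sketch; the SNS topic ARN is a placeholder, billing metrics must be enabled on the account, and they only live in us-east-1:

```python
import boto3

# Billing metrics are published only in us-east-1.
cloudwatch = boto3.client("cloudwatch", region_name="us-east-1")

cloudwatch.put_metric_alarm(
    AlarmName="monthly-spend-over-500-usd",
    Namespace="AWS/Billing",
    MetricName="EstimatedCharges",
    Dimensions=[{"Name": "Currency", "Value": "USD"}],
    Statistic="Maximum",
    Period=6 * 60 * 60,            # billing data updates infrequently
    EvaluationPeriods=1,
    Threshold=500.0,
    ComparisonOperator="GreaterThanThreshold",
    AlarmActions=["arn:aws:sns:us-east-1:123456789012:billing-alerts"],  # placeholder ARN
)
```

It won't stop a crypto-mining spree on its own, but it turns a $20,000 surprise into a page a few hours in.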
One thing I will say is that the National Science Foundation, which funds a lot of computer science research, pretty much the main funding body of computer science research in the U.S., has identified this as a little bit of an issue.
And so they have started now, including with your grant,
cloud credits that might work on AWS, might work on Azure, something like that.
So rather than necessarily giving you money to spend on cloud credits,
you can sort of potentially get resources that way.
And so this is a little bit of an opportunity to try to both sort of make the federal tax money go a little bit further and possibly try to offload
a little bit of that complexity from us academic users who don't have experience all day long
setting up these cloud-based resources. And it's always a setup and tear down story. At least when
you're building an on-prem university cluster, great, you're going to build something out.
You're then going to have people to maintain the care and feeding of it.
It gets reused again and again and again.
And then you're just getting time on the cluster.
This is effectively build the whole thing from scratch
because everyone should be able to do that
off the top of their head.
And well, someone did something similar four years ago.
They have their scripts still around.
Sure, that'll still work, because it's not like there's a new version of Python or some weird change that has happened somewhere that's going to make this work very differently.
Usually not, but sometimes
there are. By the time you can wrap your head
around all of this, that's its own career.
Yeah. Now, the one thing that is
advantageous about using the real cloud
instead of some fake cloud or on-prem
is that
a former student, let's say, or a friend or someone can say, hey, this project looks kind of cool.
I'm just going to actually just grab your code and deploy it in the same environment you tested on.
And so this is an opportunity to have impact in a way that in the old days, we'd write a paper.
And if you're lucky, maybe someone looked at it, and if you're really lucky, they decided to code it up and run it in their company or something like that. Here, you know, you can just
grab the code from GitHub, deploy it, run it, and you sort of see some of these projects making
their way into industry, which is really great. It's neat to see. It sort of answers the question
of when am I going to use this in life, which is, I think, every academic's least favorite question from the uneducated masses.
Yeah, the answer is: you'd be surprised.
Yeah, I mean, as a grad student, I was at Berkeley at the time that the sort of Spark
project was taking off.
And so there's, you know, when you see these things like, when am I going to use this in
your life?
You're like, well, there are companies like Databricks that have a really clear, you know, ancestry back to kind of these academic projects. But even
you're seeing this in terms of things like programming languages. When I was an undergrad,
everyone was like, I have to learn C++ because that's what they use in industry. And over here,
you're teaching us about Haskell and all these crazy things no one will ever use.
And yet you're seeing, you know, Microsoft, Google deploying code in these other languages
and things like that. So it's actually a really exciting time, I think, to be doing academic research, because it's kind of never been easier to deploy stuff that you develop in the academic world into industry.
That's true. The companies do it. Hey, you know what you should do? Basically volunteer for a bunch of universities, which, yeah, even in good times, people still look at with suspicion and distrust.
And when times get tight, it's like, oh, yeah, turns out one of our leadership principles is very much not philanthropy.
So good luck.
Yeah, I think that at least at the cloud level, it's a lot easier for cloud providers to provide credits to academics than, you know, hard dollars.
You're seeing a little bit of a mixture of both, but at least as far as we can tell, you know, from talking to folks at
conferences and things like that, you are seeing this impact kind of go both ways. And so our
students, when they graduate, have, generally speaking, had the opportunity to put some code
into multiple data centers and they can say, you know, I wrote a program that I deployed to Korea and it
also ran in Europe and it ran in the U.S. and I failed one of the data centers and it all kept
working. And you never would have been able to do that in the 90s or 2000s or even early 2010s.
Would have required a research team and an enormous budget. Now it just requires a few
lines of configuration. It just requires a credit card, and it might cost $100 or it might cost $10,000. That's the question you have to figure out. Or you think it's one, and it turns out to be the other by surprise. Yeah, those are always great. It's unlikely that you think it's going to be $10,000 and it's only $100. If people want to learn more, where's the best
place for them to go to see what you and your team are up to? We have a website at UCSD called
cns.ucsd.edu. It's the Center for Network Systems, which is a bunch of faculty, staff, and students
who are working in this space.
And then in terms of our low-carbon work, and that includes the collaborations with
the University of San Diego and Monica's stuff, that's on a website called c3lab.net.
And we will put links to both of those in the show notes.
Thank you so much for taking the time to speak with me.
I really do appreciate it.
It's a real pleasure to chat with you, and I hope we can talk on Twitter soon.
Oh, I expect it'll be hard to get away from me as that environment continues to contract.
Thanks again for making the time.
I appreciate it.
Thank you so much.
George Porter, professor at the University of California, San Diego, Computer Science Department.
I'm cloud economist Corey Quinn, and this is Screaming in the Cloud.
If you enjoyed this podcast, please leave a five-star review on your podcast platform of choice.
Whereas if you hated this podcast, please leave a five-star review on your podcast platform of choice,
along with an angry, insulting comment that channels 20 years of aggression
over the way a crappy computer science professor made you feel silly back in the early noughts.