Screaming in the Cloud - Networks and Sustainability in Computing with George Porter
Episode Date: March 21, 2024

George Porter, a computer science professor at the University of California, San Diego, talks to us about advanced networking and the effects of computing on the environment. In this episode of Screaming in the Cloud, George explores the shift toward optical networking in data centers to meet growing bandwidth needs and discusses the significant carbon footprint associated with computing, from data centers to device production. In addition to providing a look into the future of scalable, sustainable computing systems, George mentions the difficulties and benefits of incorporating cloud computing into academic research.

Show Highlights:
(00:00) - Introduction
(03:15) - The Shift to Optical Networking
(07:50) - The Efficiency of Cloud Networks
(12:06) - Adaptable Networks for Different Uses
(16:19) - Reducing Computing's Carbon Footprint
(20:25) - Highlighting Computing's Environmental Impact Through Art
(26:51) - Cloud Computing Challenges in Academia
(31:18) - The Benefits of Cloud Computing for Academic Research
(34:14) - Closing Thoughts

About George: A computer science professor at UC San Diego focusing on high-performance and sustainable computer systems.

Links:
Center for Network Systems at UCSD: https://cns.ucsd.edu/
Low Carbon Computing and Collaboration with the University of San Diego: https://c3lab.net/
Transcript
In order to build enough switching capacity, you started to have to build these really complicated topologies in the data center network.
Different switches interconnected in different ways.
And that drives up cost, and it drives up power, and it becomes a barrier to kind of deploying stuff.
Welcome to Screaming in the Cloud.
I'm Corey Quinn. My guest today is a little bit off the
beaten track for people I normally wind up speaking to who are usually doing interesting,
or at least things, in the world of industry. George Porter, instead, took his life in a little
bit of a different direction. He's a professor at the University of California, San Diego,
in the computer science department.
George, thank you for joining me.
Hi, Corey. Thank you for having me on. It's a pleasure. We've talked on Twitter,
so it's nice to talk in person.
This is honestly one of the first conversations I'm aware of having with a computer science professor where it wasn't a very different dynamic when I was in the process of failing
out of college 20-some-odd years ago. I'm surprised, like, wow, I can't see the chip on your shoulder from here, nor the very thinly
disguised drinking problem as you start trying to shove algorithms down the throat of someone who
isn't really equipped to handle it. Oh, well, I'm just a networking professor, so I don't have to
worry about algorithms. We can just write some code and try some stuff out and see if packets
get through. That seems as good a place to start as any.
Because back in my day, which is a terrifying turn of phrase, but by God, it's become true.
Networking was always perceived as being a very vocational area where, oh, academia,
working with networking?
Nonsense.
My first job was as a network engineer at Chapman University without having a degree myself. I was
viewed as, in many ways, akin to various other people in the facility staff of just make the
packets go back and forth, and that was the end of it. But now it's, you're telling me it's become
a full-on academic discipline. It has, I'm afraid to say. All the fun's been drained out,
and now we're being very rigorous and creating theories and models and things like that.
No, I kid, but I actually started very similarly. My first job in high school was at an internet provider in Houston called Neosoft. And so it was sort of, you know,
this was the mid-90s. And like you said, there was none of this at scale, cloud, public cloud,
private cloud. There was basically just, you know, hey, we finally got
a connection to a new website from Coca-Cola. They're on the web now, you know, it was brand
new. But the reality today, though, is for our students who are graduating, pretty much
regardless of what they're interested in doing, they need some ability to connect the software
they're writing to either other nodes, to the cloud,
to download things, to update things, to push software updates.
It's just, you know, networking is so important.
I have been cited somewhat recently now as pointing out that, you know, without the network,
all this cloud computing is just basically a series of very expensive space heaters.
If they can't talk to anything, there's not a lot of value behind it.
You've been talking a lot lately academically
about the idea of optical networking,
which on the one hand struck me as,
oh, so what?
What's the big deal on that?
We've had fiber runs that people have been tripping over
and breaking in data centers
since longer than I've been involved in that.
What's changed in the space?
Oh, it's actually very interesting.
So like you said, in a traditional data center,
you're going to find fiber all over the place,
running to all the racks, running to rows,
running to telecom rooms and et cetera.
What's really changed over the last 15 years or so
has been the introduction of optics into the actual switching process.
So you might think of using fiber to interconnect, let's say, a Broadcom switch to a Cisco switch,
connect it to a Mellanox NIC, or I guess now an NVIDIA NIC.
You know, the fiber is used to carry data from one place to another.
But once it actually gets to that switch, to that router, to that end device, it's converted back into electronics where traditional 1990s-style networking can happen.
We can look at the MAC address, we can look at the IP address and do something with it.
That has become a real bottleneck and a real barrier to deployment. So to build a network
at cloud scale that can support thousands of machines, GPUs, TPUs, solid state storage devices, etc., etc., etc.,
the bandwidth requirements are just growing so quickly that actually getting data into and out of those packet switch chips has become a big problem.
And so my group at UCSD and other academic groups dating back about 10 or 15 years have started looking into how can we actually replace those switch chips, those switch devices with fully optical devices. And that was very sort of
sci-fi, very far into the future kind of research. And what's interesting over the last, really just
since the pandemic, even the last year or two has been to see hyperscalers talking about how they've
successfully deployed this technology,
Google most particularly.
I was always stuck in a mindset
where fiber is expensive, first off.
Secondly, oh, anyone who has two neurons to bang together
can basically crimp down an ethernet patch cable,
as evidenced by the fact they trusted me to do that
once upon a time.
Cable testers are worth their weight in gold,
but grinding fiber is a very specific, expensive skill set. So I was always viewing it as,
it's what you do for longer runs, where you start to exceed the spec for whatever generation of
CAT cable you're using. When I started, it was CAT 5. Now I think it's 7 or 8, who knows. But
the speeds continue to increase on the Ethernet side of it. What is the driver behind this?
Is it purely bandwidth?
It is in a lot of ways bandwidth.
So like you mentioned, copper is really convenient and it's really durable.
I've run over some Cat5 with a vacuum cleaner and it works fine.
So it's not a big problem.
The big issue is that as you go to faster and faster bandwidths, there is a property
of copper cables called the skin effect that
means you need to put more and more energy into that cable in order to drive those high
bandwidths.
So when you went from like 100 megabit to 1 gigabit, we could stay on Cat5, or kind of Cat5e. As we go up to maybe 10 gigabits a second, you start running into these technical limitations to driving that copper.
And so when you want to start looking at 100,
200, 400 gigabits a second, terabit
Ethernet, all the way to the desktop,
all the way to the server, to the
device, you really have to go optics.
You have to go fiber.
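A rough way to put numbers on the skin-effect point: at high frequencies, current crowds into a thin shell at the conductor's surface, so less of the copper's cross-section carries the signal and more drive power is needed. Here is a minimal Python sketch of the standard skin-depth formula, using each line rate as a loose stand-in for the analog signaling frequency (a simplification; real PHYs use multi-level coding):

```python
import math

# Skin depth: the depth at which current density in a conductor falls to
# 1/e of its surface value for an alternating signal.
# delta = sqrt(rho / (pi * f * mu)), with mu ~= mu_0 for copper.

RHO_COPPER = 1.68e-8       # resistivity of copper, ohm-meters
MU_0 = 4 * math.pi * 1e-7  # vacuum permeability, H/m

def skin_depth_m(freq_hz: float) -> float:
    """Skin depth in meters for copper at the given frequency."""
    return math.sqrt(RHO_COPPER / (math.pi * freq_hz * MU_0))

# Line rates as a rough proxy for signaling frequency.
for label, f in [("100 Mb/s", 100e6), ("1 Gb/s", 1e9), ("10 Gb/s", 10e9)]:
    print(f"{label}: skin depth ~ {skin_depth_m(f) * 1e6:.2f} micrometers")
```

That works out to roughly 6.5 micrometers at 100 MHz, shrinking toward 0.65 micrometers at 10 GHz: the usable copper keeps thinning as speeds climb, which is why pushing copper past 10 gigabit gets so power-hungry.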
I remember back in the early days of my own career, you look at
things like OC-192s, which were
basically the backbone of the Internet
where they could handle
speeds, not quite what you're
talking about, but darn close, over huge, huge spans. That was always giant bundles of fiber
that were doing those things. I have no idea what the actual termination of those things look like,
because I assure you, you did not let the relatively wet-behind-the-ears kid who's accident-prone near the stuff that wound up costing more in some cases than the building it was housed in. Oh, absolutely. And for wide area applications where you're running from San
Francisco to Los Angeles or LA to San Diego or something like that, that fiber is extremely
expensive because you've got all the right-of-way and the trenches and all this kind of stuff.
When you go into the data center, though, you can pull bundles of fiber. It's super cheap.
It's actually quite inexpensive. And then really the endpoints become kind of expensive. And so the thing that we were
addressing, one of the problems we were looking at was just the fact that as we start driving
bandwidth up to the devices, again, in order to build enough switching capacity, you started to have to build these really complicated topologies in the data center network, different switches interconnected in different ways. And that drives up cost, and it drives up power, and it becomes a barrier to kind of deploying stuff.
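The speakers don't name a specific design here, but the classic fat-tree topology (from Al-Fares et al. at UCSD) makes the scaling pressure concrete: building a full-bandwidth network out of fixed-port commodity switches means the switch count grows quadratically in the port count. A quick sketch:

```python
def fat_tree_counts(k: int) -> dict:
    """Switch and host counts for a k-ary fat tree (k must be even).

    k pods, each with k/2 edge and k/2 aggregation switches, plus
    (k/2)^2 core switches, all built from commodity k-port switches.
    """
    assert k % 2 == 0, "k must be even"
    edge = k * k // 2       # k pods * k/2 edge switches
    agg = k * k // 2        # k pods * k/2 aggregation switches
    core = (k // 2) ** 2
    hosts = k ** 3 // 4
    return {"hosts": hosts, "switches": edge + agg + core}

for k in (8, 16, 32, 64):
    c = fat_tree_counts(k)
    print(f"k={k}: {c['hosts']:>6} hosts need {c['switches']:>5} switches")
```

With 64-port switches, supporting 65,536 hosts takes 5,120 separate packet switches, each one burning power and adding cost, which is the barrier being described.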
One of the things that I think is underappreciated about the cloud, and I've been
whining about this for ages, which is you can turn a dial effectively on anything in AWS.
If you want more RAM, great, turn the dial. You'll spend more money. You'll get more RAM. Oh, that's too expensive. Let me turn it down. Okay. And what happens is what you would
expect. Same with disk space, same with CPUs, same with almost everything except network.
Everything in the network side of any of the cloud providers is you get one side that is
uniformly excellent. There's no more cost-effective egress tier
where you say, okay, I'd like it to be over there by,
I don't know, September.
September sounds good.
And instead, you're consistently getting that high cost
and high performance network.
And I want to be clear, because this gets overlooked,
what they do from a network perspective
is just this side of magic.
Because back when I worked in data
centers, you had top-of-rack switch congestion. I had a client at one point who built out a private
cloud and didn't think about that, and suddenly you have this massive number of servers in every
rack that's supposed to talk everywhere seamlessly, and they were bandwidth-constrained all over the
place as the top-of-rack switches started melting and dripping down the rest of the rack. I don't see any evidence of that in the modern cloud providers. You can get effectively
line rate between any two points indefinitely. And that's magic. Yeah, I think that magic is
delivered by the fact that a lot of these networking folks at these hyperscalers are
pulling their hair out to make that abstraction look like it's real. It's not, but they're able
to sort of make it look like it's real. Just like you said, essentially your ability to scale out in the
data center is limited by how big your loading dock is, because you can just start unloading
servers and devices and RAM and storage as fast as you want. But like you mentioned, it's the
network where everything has to come together. And that has traditionally been something that
has been difficult to upgrade because either you need to kind of upgrade your network first, and now it's going to be very
expensive upfront. You're not going to be able to saturate it. Or alternatively, you're going to
upgrade your devices and now your network's a problem. And so trying to figure out how to
keep those in parity is a huge challenge that a lot of these operators have.
I have to say that it's always,
I've always had an extreme sense
of talking to wizards
whenever I talk to some of these
cloud provider network folks.
AWS's Colm MacCárthaigh is the top of mind
for a lot of that stuff.
And he's always been incredibly kind,
incredibly clear in communicating.
And you're right,
discussing the fact that
none of this stuff actually exists,
which I know intellectually, but it almost feels like there's a sense whenever I talk to some of these folks of like, okay, time to go back into the land of make-believe, where we are telling stories to children about how things work within the logical structures and rules of physics bounded here.
And it's got to be weird for folks who see both sides of the snake
to figure out what side they're having
a given conversation on.
Absolutely.
And in fact, if you look at AWS,
they've built some of their own hardware to do networking.
You look at Google and they have been innovating
on building, they have a network called Juniper, I mean, a Jupiter network that involves
a lot of custom chips, custom devices,
and things like that. And I think what we're seeing now is instead of thinking of this data
center that has maybe 100,000, 200,000 machines in it, and it's got a perfect network where you can
deploy software anywhere you want in that, you can migrate anything you want. I think what we're
starting to see is a model where we actually reconfigure
the structure of the network
for a particular application.
And then when it's time to do another application,
we can actually change the way the network's built.
So we're not trying to build one network
for every application.
We're trying to adapt it
to the needs of the application.
That feels like something you would do
in the context of an HPC or supercomputing
project, where you have very specific data access patterns that need to happen incredibly quickly at
a certain point of scale, where for the next three months, this is what it needs to do. But that was
always done from a perspective of time to release the hounds, and you would have the network monkeys
go and do the reconfiguration. It sounds like you're talking about something that's a lot more dynamic.
Absolutely.
What's interesting, in the research world,
we had supercomputing dating back to before you and I were born.
And you saw a divergence of that in the early 2000s
as public cloud providers or public, you know, Google, Facebook, etc.,
well, eventually Facebook,
had needs that were just very different than supercomputing because they were running lots of applications rather than one application.
And what's interesting is, so then you saw things kind of split into these two different sectors.
And now you're starting to see them come back together again, especially with machine learning,
AI training, et cetera. It's worth it for some of these providers to set up a custom network for 6,
8, 10, 12, 24 hours and then change it for another
training job or another, you know, one of these big, huge, long running tasks.
It was one of those areas where it's just, it feels like this is dark magic. And you're right,
because whenever you talk to academics about large computing projects, it feels like I'm
suddenly talking to people from
a very different world. You're right. When I'm talking about, okay, I'm looking at massive
corporate fleets in industry. Yeah, they're all running a whole bunch of different applications,
some of whom were never allowed to talk to each other because they think internal NDAs apply or
whatnot. But in academic clusters, it's, yeah, this is the project that's going to be running
for the next foreseeable period of time because we got a grant, et cetera, et cetera. And I do the economics on that,
and it's a completely different world. I keep looking for people who can say something like,
yeah, HPC on public cloud makes perfect sense for high utilization, steady state workloads.
I just have a hard time making that economic case because at that point of scale,
it pays for itself in an embarrassingly short period of time.
Yeah, this is the interesting thing about does it make sense to kind of run on-prem?
Does it make sense to run in a public cloud?
I think the organization matters.
If you're an academic, you might get a grant to look at a particular problem, and you're not going to be able to keep an on-prem deployment necessarily busy for three years in a row.
Let alone build the on-prem deployment on that grant.
Like, okay, you need to put a few zeros on the end of that dollar figure, please.
Yeah.
And, you know, as much as we'd like to believe equipment can manage itself, it doesn't.
You need experts, people on staff who can kind of manage that.
And it becomes quite challenging. I think that in these public
cloud environments, one of the things we were just talking about a second ago, which is that
you're seeing these really long jobs for AI, ML, ChatGPT, I'm sure. And in these particular cases,
you saw an evolution where in the mid-2010s, companies like Google had these optical patch
panels where a human could go and sort of put little fiber jumpers around and actually change some of the structure of their network.
In other words, think about bringing all those fibers rather than connecting them directly to
switches or routers. You essentially put them into a Lite-Brite set where you've got a bunch of
little things on the back that they're all plugged into, and you can kind of plug stuff into the
front. And now you're seeing the evolution of that with these optical switches, where you can do that programmatically. You can actually write code that will change the
configuration for this particular next six hours, let's say. And so that's kind of something that's quite interesting, I think.
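To give a flavor of what "write code that changes the network" might look like: an optical circuit switch is essentially a programmable patch panel, and its configuration is just a port-to-port mapping. The toy model below is illustrative only, not any vendor's real control API:

```python
# A toy model of an optical circuit switch: it holds a port-to-port
# mapping (light entering input i is mirrored to output j), and
# "reconfiguring the network" means installing a new permutation.

class OpticalCircuitSwitch:
    def __init__(self, num_ports: int):
        self.num_ports = num_ports
        self.circuits: dict[int, int] = {}

    def install(self, mapping: dict[int, int]) -> None:
        """Replace the circuit configuration wholesale."""
        assert len(set(mapping.values())) == len(mapping), "outputs must be unique"
        self.circuits = dict(mapping)

ocs = OpticalCircuitSwitch(num_ports=8)

# Topology tuned for an all-reduce-heavy training job: a ring.
ocs.install({i: (i + 1) % 8 for i in range(8)})

# Six hours later, a different job wants a different shape: paired racks.
ocs.install({0: 1, 1: 0, 2: 3, 3: 2, 4: 5, 5: 4, 6: 7, 7: 6})
```

The design point is that the physical topology becomes just another piece of state a scheduler can rewrite per job, rather than something fixed at build time.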
The idea of applications reconfiguring themselves like that has been
longstanding, but networking has always felt much more manual. Like the idea of controlling it via
the infrastructure as code style approach
seems to have come very late to an awful lot of the networking world. And I get it because if you
screw up a computer, okay, we'll revert that. Screw up the network, you're driving to the data center.
Absolutely. And a lot of times the network, if it's broken, how do you fix it? Because you need
the network to access things to fix it, et cetera. The academic world often is informed by what's going on in industry. And we're responding, we're looking at trends and
roadmaps. But one thing where I think that is reversed is that there's a lot of formal theory
that's actually being brought into network configuration that's extremely interesting.
So the idea is basically, imagine you want to specify some properties of your network,
and you want to guarantee that all the traffic
entering this point goes through this firewall, let's say.
Well, you can actually write software that will ensure that all the configuration changes
respect that property.
And this is something that's really nice because it gives you more confidence the network's
going to work.
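As a toy illustration of that idea (real verification systems use far more sophisticated symbolic analysis, and the topology and node names here are invented): check a waypoint invariant over every path before accepting a configuration change.

```python
# Given a network graph and a proposed change, verify the invariant that
# every path from the ingress to any protected host traverses the
# firewall -- and reject the change if it doesn't.

def all_paths(graph, src, dst, path=None):
    """Yield every simple path from src to dst."""
    path = (path or []) + [src]
    if src == dst:
        yield path
        return
    for nxt in graph.get(src, []):
        if nxt not in path:  # simple paths only, no revisits
            yield from all_paths(graph, nxt, dst, path)

def invariant_holds(graph, ingress, protected, waypoint):
    return all(waypoint in p
               for host in protected
               for p in all_paths(graph, ingress, host))

topology = {
    "ingress": ["firewall"],
    "firewall": ["core"],
    "core": ["rack1", "rack2"],
}
assert invariant_holds(topology, "ingress", ["rack1", "rack2"], "firewall")

# A proposed "shortcut" link would bypass the firewall; refuse it.
proposed = {**topology, "ingress": ["firewall", "core"]}
assert not invariant_holds(proposed, "ingress", ["rack1", "rack2"], "firewall")
```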
Another area that you have been focusing on to a significant degree has been the idea of carbon
footprints of cloud computing, which I've heard a lot about from some AWS folks, then some Google
folks who frankly showed how disclosure and transparency should be done, relatively speaking.
And I hear about it very occasionally from customers who have a mandate org-wide to ask
the questions. How are you approaching it?
This issue of the carbon footprint of computing, broadly speaking, I think is something that's
really important and something that is a field we have to address. In terms of data centers,
it's particularly important because you're seeing so much of this on-prem deployment going to data
centers. And so even though there's not a huge number of these public cloud providers, they account for quite a bit of the compute that
underpins websites that we go to. If AWS has some sort of load balancer problem, it feels like half
the web has failed and half the websites don't work. So you get a sense of what's actually on
there. The concentration risk is no joke. Oh, yeah. And so I think globally, data centers
account for maybe 2 or so percent of the carbon footprint of the planet Earth. But that's
growing quite dramatically. And you're seeing that especially with AI and ML, Grok, ChatGPT,
OpenAI, etc. And a lot of companies had these roadmaps like you talked about for carbon neutral, net zero, whatever you want to say.
And it will be an open question how well we keep to those given that the compute requirements of AI are pushing in the opposite direction.
But just to answer your question, there are sort of two ways we've been looking at this, not just at UCSD, but elsewhere: reducing the amount of energy, and then, I think more importantly, redesigning data centers to support renewable energy,
which is a real massive generational challenge, in my opinion.
There's a significant and consistent series of stories leaking out from various places.
I saw one earlier this week from Oregon that was talking about how a small utility has apparently gone from
basic obscurity to one of the region's biggest polluters.
And apparently it's one of the power utilities supplying a bunch of data centers, specifically
Amazon's.
And it's weird because I remember circa 2016 or so where they said, oh, if you want to
run your workload on pure renewable energy, put it in Oregon.
And they stopped talking about that.
And now I'm seeing articles like this,
and I'm starting to wonder if, you know,
like things like leadership principles,
and building an arena to remind them of their pledge,
and all these other things are just zero interest rate phenomenon.
It's like, well, you know, we need to make the money printer go brr. So at some point,
we're just going to basically back away from all of that.
That is one potential explanation. I think another one is simply the fact that if you look at low
carbon energy sources, the big issue you have is what's called intermittency, meaning that they're
not always available. Solar power is super cheap and it's gotten really cheap over the last 10 years. But even here in
Southern California, it's not sunny 100% of the time, it turns out. We have night here as well.
And if you look at other sources like wind, they're intermittent as well. And so the sources
that are low carbon that are available pretty much all the time are things like hydro. And so I think that was Oregon, I want to say in The Dalles, Oregon; they had a data center drawing on the Columbia River there.
And that's where they were getting a lot of their energy 24-7.
Nuclear is an example of a source where you can get power 24-7.
But if you look at the grid, I think you're seeing a small amount of power that's available all the time.
And then you have this huge percent,
maybe 80%, that is available intermittently. And so as you grow, as you deploy, Amazon gets bigger,
AI is more important, just meeting that need. I think it might be a little bit difficult with
those always on low carbon sources. And so if you start sourcing from coal, natural gas,
it's going to drive your carbon footprint up.
You recently did something that I found fascinating.
You collaborated with someone on effectively doing theatrical storytelling on how computing affects the planet.
Tell me more about that, please.
Well, sure.
I figured this might be of interest to you, potentially, given that you have to act in some sense several times a week.
I am the dancing monkey of cloud computing.
It's an esteemed title, I will say. But no, my wife, Dr. Monica Stufft, is a professor at the University of San Diego, which is also here in San Diego.
We have three universities that all have the word San Diego in the title. It's very confusing. And one of them happens to be in Los Angeles. I kid, I kid.
It could be. I don't know. We did a collaboration sort of during the pandemic, or at least it started then, where we were sort of saying, okay, you know, the compute sector of the economy has this huge carbon footprint. And in general, people really don't have much of an awareness about that. We understand cars, maybe ride your bike.
We understand public transit.
There's a lot of aspects of things that people understand.
But computing is really very opaque.
We talk about the cloud.
It sounds very happy, that kind of thing.
And so she had a set of theater students work to tell stories about climate change and about
climate, in particular, things that intersected
with computing. So it might be the energy needed to run all of this AI stuff. So that might be
something to try to convey. And it also might be the carbon footprint of making this stuff.
So if you think of like your smartphone in your pocket, the carbon footprint of making that
smartphone is really high. It involves a global
supply chain. It involves making all these different chips. And we bring all of that together
to build this phone. And a typical person only keeps it for about 20 months. And so all of that
environmental impact is essentially thrown away at the end of 20 months. And so that's why the
vast majority of the carbon impact of a laptop or a smartphone
has already been spent before you even turn it on. So even if you powered it with zero carbon energy,
a huge percentage of that total lifetime carbon footprint is going to be because of just making
that device. Yeah, the smelting facility making the metals for these things is going to have a heck of a larger carbon footprint than whether or not you decide to power it via a central gas-fired power system or solar or what over the lifetime of that phone.
It's really true. And so you look at things like cobalt that comes from the Congo, graphite, lithium, other kind of elements that all come together to make it happen. So some of her students collaborated with the computer science students at UCSD and created
a set of performances that highlighted maybe keeping your phone for longer,
or how do you kind of keep a device in use longer as a way to lower its carbon footprint?
This was, I thought, a really interesting sort of collaboration to try to raise awareness in some sense of some of these issues. I will just say, if you turn on, well, it was PBS. I think we have a thousand channels now,
but Science Channel, whatever you want to call it, anything on Netflix,
there's a million scientific shows about how black holes work and wormholes.
But then there's really nothing about how an email gets sent or how a Google search works
or anything like that. And so I feel like in some sense, a lot of students, a lot of people have a better understanding of, you know, relativity and
quantum physics than they do about the technology they use every single day. It also is like the
old bike shedding problem, which originally came from a discussion, I think on Usenet, where they
said, okay, if you say you're going to build a nuclear reactor, very few people are going to challenge you on the merits of that because no one knows how to build a nuclear reactor.
But if you say you're going to build a bike shed,
well, suddenly everyone is coming out of the woodwork
talking about what color you should paint it,
and you get nibbled to death by ducks in those stories.
And I have to say, the first time I learned
how an email gets sent or a Google search worked, it's like, yeah, that can't possibly work.
How does it really happen?
It is still magic to me that these things work in spite of themselves, not because of them.
And I used to run large-scale email systems.
I know for a fact that it works, but it still boggles my mind that it doesn't at all.
And it works extremely well, so well that when something
is even slightly not working, we get really upset. Like you're on an airplane and you're like, oh,
my email's slow or something. And you're like, you're on an airplane sending an email. And so
making that all that work has been kind of a miracle. Now, I will say that one of the reasons
sort of over the last, say, 20, 25 years that everything has been able to work is because we've
been able to scale things out. And like you said, scaling out many hundreds of servers, thousands of servers for a single application,
spreading an application across different data centers, et cetera. And as we leave this kind of
zero percent interest world, and as we start taking climate change
more seriously, we're going to have to just re-architect, I feel, the way we build software
so that it can better match with things like renewables. So this idea that I have a data
center, it's always on, that it's got gasoline generators in case the power goes out, it's got
UPSs, everything is very perfect, is going to be hard to keep going when we have a world where you want to be sourcing
from wind and solar, for example. And so I think that's one of the big challenges that we're going
to have to face moving forward. It's just a big problem space that it's hard to wrap your heads around.
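One concrete form that re-architecting can take is carbon-aware scheduling: deferring flexible batch work into the hours when the grid is cleanest instead of assuming always-on power. Here is a minimal sketch, with an invented carbon-intensity forecast:

```python
# Given an hourly carbon-intensity forecast for the grid (gCO2e/kWh),
# schedule a deferrable batch job into the cleanest contiguous window.

def cleanest_window(forecast: list[float], job_hours: int) -> int:
    """Return the start hour minimizing total carbon over the job."""
    return min(
        range(len(forecast) - job_hours + 1),
        key=lambda s: sum(forecast[s:s + job_hours]),
    )

# Intensity dips midday as solar comes online (hypothetical numbers).
forecast = [430, 420, 410, 390, 320, 240, 180, 150, 160, 210, 300, 400]
start = cleanest_window(forecast, job_hours=3)
print(f"run the 3-hour job starting at hour {start}")  # hour 6 here
```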
One last topic I want to get into before we end up wrapping the episode is bringing it
back down a bit to something a lot more prosaic. One of the reasons that I've always found you to be such a great Twitter follow
has been you periodically do talk about using cloud computing for things.
And in the land of industry, it's very different from what I can tell,
where, okay, you're going to have to hire some people past a certain point of scale
to handle the care and the feeding of the cloud, but that's okay.
You're presumably going to make money.
Ha ha ha.
This is a zero interest rate environment
with VC money slopping everywhere.
You sure about that one, professor?
Yeah, well, that's changing a bit.
But in academia,
the way that the grant process works,
the way that you effectively get
more or less free indentured servitude
in the form of grad students
lurking around here and there,
but they often don't have the 15 years experience that it generally takes to avoid some of the sharp
edges. What do you see is different in managing the care and feeding of cloud environments and
workloads when you're doing it from a perspective of an academic? Yeah, this is a really interesting
question because a lot of my colleagues, a lot of my friends, but myself in particular, I think it's really important to give students hands-on practice, hands-on experience, building something, deploying something.
Yeah, but this is like giving them credentials, root passwords to, say, AWS.
And we've done that in the past.
And, you know, our job is to do research.
So we're trying to study some problem.
We're trying to deploy something, run an experiment, collect some data and things like that.
And the complexity of something like AWS or Google Cloud or Azure is a real benefit because it means that the data that we collect,
the experiments that we do are relevant to industry. And so that helps us with impact.
But the challenge is that we cannot staff up, like you said, an organization of people who can
manage our cloud resources, look at permissions, give students different access to things, etc. And so this means that occasionally a student or a faculty member
will accidentally do something like commit our AWS credentials to GitHub.
And so within a few hours, all of a sudden,
a hacker has spun up thousands of VMs running crypto mining software.
And our bill is $20,000 in 12 hours or something
like that. Now, it's hard to kind of have a problem like that when you have an on-prem
deployment like we used to. But now it's very easy to just fat finger something or type something
incorrectly, and suddenly your bill is huge. Let they who have never done that cast the first stone. I mean,
people don't usually ask me the question, but, huh, why are you so passionate about AWS bills?
And it didn't, it didn't come from nowhere. I screwed it up and was made to feel like a fool
for it. And it's okay. Let's see if I can smack back a bit. Well, you know, you might feel that way, but absolutely you're not. And I want to be clear, I've done this myself, I've had students who do this, and it's not necessarily
our fault. What it is, is just the fact that you have a system that is sort of infinitely scalable
in every dimension. And even small decisions like, does my VM need a world-accessible IPv4
address or not? Well, that might change my bill by $500 a month
or something like that. Or, you know, I typed dash N 100 thinking I meant one thing, but actually I started a hundred VMs. I thought I was starting a hundred gigabytes of memory, but actually I made a hundred different copies of my system or something like that.
And those kind of things are really challenging.
And the providers try to do their best, I think, to help out kind of academic users.
But it is pretty difficult.
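One concrete guardrail for exactly this failure mode is a CloudWatch billing alarm that pages someone when estimated charges cross a threshold. A minimal boto3 sketch; the SNS topic ARN is a placeholder, billing metrics must be enabled on the account, and they only live in us-east-1:

```python
import boto3

# Billing metrics are published only in us-east-1.
cloudwatch = boto3.client("cloudwatch", region_name="us-east-1")

cloudwatch.put_metric_alarm(
    AlarmName="monthly-spend-over-500-usd",
    Namespace="AWS/Billing",
    MetricName="EstimatedCharges",
    Dimensions=[{"Name": "Currency", "Value": "USD"}],
    Statistic="Maximum",
    Period=6 * 60 * 60,            # billing data updates infrequently
    EvaluationPeriods=1,
    Threshold=500.0,
    ComparisonOperator="GreaterThanThreshold",
    AlarmActions=["arn:aws:sns:us-east-1:123456789012:billing-alerts"],  # placeholder ARN
)
```

It won't stop a crypto-mining spree on its own, but it turns a $20,000 surprise into a page a few hours in.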
One thing I will say is that the National Science Foundation, which funds a lot of computer science research, pretty much the main funding body of computer science research in the U.S., has identified this as a little bit of an issue.
And so they have started now, including with your grant,
cloud credits that might work on AWS, might work on Azure, something like that.
So rather than necessarily giving you money to spend on cloud credits,
you can sort of potentially get resources that way.
And so this is a little bit of an opportunity to try to both sort of make the federal tax money go a little bit further and possibly try to offload
a little bit of that complexity from us academic users who don't have experience all day long
setting up these cloud-based resources. And it's always a setup and tear down story. At least when
you're building an on-prem university cluster, great, you're going to build something out.
You're then going to have people to maintain the care and feeding of it.
It gets reused again and again and again.
And then you're just getting time on the cluster.
This is effectively build the whole thing from scratch
because everyone should be able to do that
off the top of their head.
And well, someone did something similar four years ago.
They have their scripts still around.
Sure, that'll still work, because it's not like there's a new version of Python or some weird change that has happened somewhere that's going to make this work very differently.
Usually not, but sometimes
there are. By the time you can wrap your head
around all of this, that's its own career.
Yeah. Now, the one thing that is
advantageous about using the real cloud
instead of some fake cloud or on-prem
is that
a former student, let's say, or a friend or someone can say, hey, this project looks kind of cool.
I'm just going to actually just grab your code and deploy it in the same environment you tested on.
And so this is an opportunity to have impact in a way that in the old days, we'd write a paper.
And if you're lucky, maybe someone looked at it, and if you're really lucky, they decided to code it up and run it in their company or something like that. Here, you know, you can just
grab the code from GitHub, deploy it, run it, and you sort of see some of these projects making
their way into industry, which is really great. It's neat to see. It sort of answers the question
of when am I going to use this in life, which is, I think, every academic's least favorite question from the uneducated masses.
Yeah, the answer is: you'd be surprised.
Yeah, I mean, as a grad student, I was at Berkeley at the time that the sort of Spark
project was taking off.
And so there's, you know, when you see these things like, when am I going to use this in
your life?
You're like, well, there are companies like Databricks that have a really clear, you know, ancestry back to kind of these academic projects. But even
you're seeing this in terms of things like programming languages. When I was an undergrad,
everyone was like, I have to learn C++ because that's what they use in industry. And over here,
you're teaching us about Haskell and all these crazy things no one will ever use.
And yet you're seeing, you know, Microsoft, Google deploying code in these other languages
and things like that. So it's actually a really exciting time, I think, to be doing academic research, because it's kind of never been easier to deploy stuff that you develop in the academic world into industry.
That's true. The companies do it. Hey, you know what you should do? Basically volunteer for a bunch of universities, which, yeah, even in good times, people still look at with suspicion and distrust.
And when times get tight, it's like, oh, yeah, turns out one of our leadership principles is very much not philanthropy.
So good luck.
Yeah, I think that at least at the cloud level, it's a lot easier for cloud providers to provide credits to academics than, you know, hard dollars.
You're seeing a little bit of a mixture of both, but at least as far as we can tell, you know, from talking to folks at
conferences and things like that, you are seeing this impact kind of go both ways. And so our
students, when they graduate, have, generally speaking, had the opportunity to put some code
into multiple data centers and they can say, you know, I wrote a program that I deployed to Korea and it
also ran in Europe and it ran in the U.S. and I failed one of the data centers and it all kept
working. And you never would have been able to do that in the 90s or 2000s or even early 2010s.
Would have required a research team and an enormous budget. Now it just requires a few
lines of configuration. It just requires a credit card, and it might cost $100 or it might cost $10,000. That's the question you have to figure out. Or you think it's one, and it turns out to be the other by surprise. Yeah, those are always great. It's unlikely that you think it's going to be $10,000 and it's only $100. If people want to learn more, where's the best
place for them to go to see what you and your team are up to? We have a website at UCSD called
cns.ucsd.edu. It's the Center for Network Systems, which is a bunch of faculty, staff, and students
who are working in this space.
And then in terms of our low-carbon work, and that includes the collaborations with
the University of San Diego and Monica's stuff, that's on a website called c3lab.net.
And we will put links to both of those in the show notes.
Thank you so much for taking the time to speak with me.
I really do appreciate it.
It's a real pleasure to chat with you, and I hope we can talk on Twitter soon.
Oh, I expect it'll be hard to get away from me as that environment continues to contract.
Thanks again for making the time.
I appreciate it.
Thank you so much.
George Porter, professor at the University of California, San Diego, Computer Science Department.
I'm cloud economist Corey Quinn, and this is Screaming in the Cloud.
If you enjoyed this podcast, please leave a five-star review on your podcast platform of choice.
Whereas if you hated this podcast, please leave a five-star review on your podcast platform of choice,
along with an angry, insulting comment that channels 20 years of aggression
over the way a crappy computer science professor made you feel silly back in the early noughts.