LINUX Unplugged - 336: Linus' Filesystem Fluster

Episode Date: January 15, 2020

Linus Torvalds says don't use ZFS, but we think he got a few of the facts wrong. Jim Salter joins us to help us explain what Linus got right, and what he got wrong. Plus some really handy Linux picks,... some community news, and a live broadcast from Seattle's Snowpocalypse! Special Guest: Jim Salter.

Starting point is 00:00:00 Happy end of life for Windows 7 day, Wes. Microsoft sent us a whole bunch of new Linux users for our return show. You don't think they'll upgrade to Windows 10? I don't know. I was looking this up. I was like, how many people are actually still using Windows 7? And over at Computer World, they have an article that says they ran an estimation, and they say on January 1st, 2021, so a year from now,
Starting point is 00:00:21 there will still be 18.7% of all Windows users running Windows 7, which is approximately 281 million machines. Something tells me they're not going to Windows 10. Pretty sure those folks are all going to OpenBSD. Hello, friends, and welcome into the Unplugged program. My name is Chris. My name is Wes. Hello, Wes. If I sound a little different out there, dear audience, today,
Starting point is 00:00:56 it's because I am a little different today. I'm on a remote snow setup using my emergency podcasting pack that I carry with me like every good podcaster should. So if I sound a little different, that's why. But nevertheless, we still have a great episode today. Jim Salter from TechSnap and Ars Technica and other places online, like anywhere you want to find him, is joining us. Hello, Jim. What's up?
Starting point is 00:01:18 Hey, Jim. Now, you must be here because we're talking about a ZFS topic today. You can't keep me away. Well, thank you for joining us. We'll be getting to, in our estimation, why Linus has gotten a few things wrong about ZFS recently. We'll explain it to you and also what he's gotten right. We've gotten some community news that we're going to get to, something that's a little more straight from the headlines type inspired.
Starting point is 00:01:42 But before all of that, we've got to say holler to our virtual lug. Time-appropriate greetings, Mumble Room. Hello, hello, hello. What's up? Hello. I can hear that we have a really good showing today, but I'm flying blind in my remote setup. So I know it's like around a dozen people,
Starting point is 00:02:00 so it's good to see everybody. I think you're just going to have to find out as they make their erudite comments. Yeah, tag me in the chat room with the old mumskies, and let me know if you want to jump in since I can't really monitor the room today. But yeah, I am snowed in. And I say that kind of like with, I kind of laugh about it because it's not like there's a lot of snow.
Starting point is 00:02:20 I posted a quick, like, 20-second video on my YouTube channel, which I have linked in the show notes if you want to see what I'm talking about. I mean, I'm not trying to overstate what's going on here. It's not like it's horrible, but for the Seattle area, it's Snowpocalypse 2020. And it's combined with a couple of days of staying below freezing. And so the roads are just trash, and my water's frozen, which happens from time to time. But more importantly, people do not know how to drive around here,
Starting point is 00:02:52 and an area of the freeway that I transit through, it's the exit right before the studio exit, has just seen nothing but accidents today. There was a car fire around 5 a.m. There was a pileup that caused. And then, spooky enough, around the time I would have been heading into the studio had I decided to brave the snow, there was a five-car pileup
Starting point is 00:03:13 and a semi-collision in that exact spot. And so I thought, you know what? I'll just podcast from the snow in the RV today. Sounds much safer. Yeah, it's much safer. And, you know, it means it's kind of an interesting slow ramp back because the show has been on break for the last three weeks when we've been running our holiday specials.
Starting point is 00:03:36 So this is our first episode back since before Christmas. And it's like a slow ramp up. It's kind of nice in a way. It's a little different. It's great to be back. Meanwhile, while we're dealing with the snow, the rest of the tech industry has been dealing with CES, the Consumer Electronics Show, that happens every single year in Vegas. And we don't talk about it a lot because there's not a lot of pure Linux and open source stuff going on there. But as you would expect, there's a lot of gadgets running Linux. But some of them have been a little more upfront,
Starting point is 00:04:11 a little more obvious about the fact that they're using Linux. They're sort of proudly using Linux. And surprisingly enough, Wes, it's the devices that seem to be using the automotive-grade Linux project from the Linux Foundation. Subaru made a big announcement this week that they're going to do all of their infotainment systems using Linux. They'll be rolling it out initially in the Subaru Outback and Legacy vehicles,
Starting point is 00:04:36 but there's a bunch of others that already have announced support as well. Yeah, including the 2020 Toyota RAV4 and Mazda CX-30, both powered by automotive-grade Linux software. Apparently there were 20-plus open-source demos at the Linux Foundation booth this year. Yeah, I did a little digging, and there's almost a dozen different manufacturers now that are on board with automotive-grade Linux,
Starting point is 00:04:57 and it's legitimate companies that are in on this. We're going to actually start seeing real cars that people buy running automotive-grade Linux. And they had over 20 different open-source software demos running on this thing. Yeah, that's pretty neat. And maybe that's where some of the advantage is. I mean, it's neat to see Linux being used,
Starting point is 00:05:17 but this definitely feels Linux foundation-y, if you get my drift here, in that these probably won't still be user-modifiable or Linux like we would recognize it in any way. Yeah, it's kind of like that Android. It's the Android effect. But it does still speak to the capability and usefulness of the kernel we all love. Well said, and it's pretty neat to... It's like when you get on the airplane. How many times have you seen when people get on the Delta airplane,
Starting point is 00:05:41 and Delta will do a reboot of the infotainment systems during the passenger offloading and unloading? And so if you get on the plane kind of early, or they just are slow at the reboot, I guess, you can get a little picture of Linux booting in the seat cushion of the Delta plane seats. Makes me feel better to have Tux right there as I'm getting ready to fly.
Starting point is 00:06:05 Yeah. It doesn't look like it's a super great system, but I probably see one of those once a week, either sent into the show or somebody posted online. And I remember, I mean, we did it as a Runs Linux forever ago. So yeah, it's out there, and it's just an implementation detail, really, for the most part,
Starting point is 00:06:25 except for us. Right. That's assuming you can see the seat back screen between your knees when you're in that Delta seat. Yeah. Yeah, that's true. And,
Starting point is 00:06:34 um, honestly I prefer Alaska, but I don't want to start a fight. I have, um, I have so much to say about the snow, but I know we need to get into housekeeping this week already. Cause it's,
Starting point is 00:06:49 there's a lot to update people on since we've been gone, and tons of really good stuff. So, nominally today, we're going to talk about Linus' take on ZFS, and maybe where he got it wrong. But before we do that, let's do a little housekeeping. There's a lot to tell you about, so I wanted to kind of get right into this, sort of near the top of the show. First of all, thank you to everybody who's joined us at slash telegram. That's our official Telegram channel
Starting point is 00:07:12 where the conversation goes on after the show. And you can keep it going over at slash telegram. It's doing great. And I think a big part of that is the spam bots and stuff. It's really kind of changed the game over there. So it's been really nice. Also this Friday, assuming I'm not snowed in, Cheese and I, and whoever else wants to join us, are doing a live stream on living in the terminal, essentially creating a desktop-like environment, but in the terminal. Something you could SSH to on a VPS or back home.
Starting point is 00:07:47 And it's really slick. It's something Cheese used to do on the DL at an old gig. So we really dialed it in. And he'll walk us through it on Friday. So that'll be live, assuming I can make it. I'm excited to see what his secrets are. Yeah, he's got a few great ideas in there. And then I've got a couple of points of interest that we need to get out there for everybody. Number one, two fests are closing their calls for papers very soon. This Saturday, January 18th, Texas Linux Fest closes their call for papers. Now, Texas Linux Fest is one of my favorite Linux fests because it's the perfect
Starting point is 00:08:26 size to actually build friendships, relationships, and connections. Some of the most fruitful things that have happened in my career and others I know are the result of Texas Linux Fest. So it's a great conference to get involved in if you can. And their call for papers wraps up very soon. So we have a link to their site. If you want to give it a shot, there's worse places to go than Texas in the winter. Is there anything else that we want to add to that, Carl? I know you're involved very closely with Texas Linux Fest, so if there's something you want to mention, I want to give you a chance here in the show. Yeah, just one thing somebody asked me about today.
Starting point is 00:09:04 The call for papers, you don't actually have to have your talk done. Just get the idea and the basic elevator pitch submitted so that way we can select between them and schedule them and categories and all that. Yeah, get your pitch ready, get an idea, get an angle, get a concept, and then flesh it out. You use the time between now and the fest. Yeah, that's probably something people don't think about
Starting point is 00:09:24 because it feels like you're cheating, but that's really, that's the expectation. Besides, you want to polish it right up to the time it's time to go. I don't know any other way. Also, LinuxFest, our favorite LinuxFest, LinuxFest Northwest, its call for papers wraps up Wednesday. That's tomorrow as we record. So if you're listening, it may have just closed
Starting point is 00:09:45 or you may have a tiny bit of time left. Better go check. And it's a good time to think about your plans to come join us at Linux Fest. Definitely. We are starting to do our planning and we've had so much fun over the years that if you can make it,
Starting point is 00:10:00 it's another one of those where you can really hear it's made connections and changed things for people. So I really recommend, if you can make it, please join us at LinuxFest Northwest. You can get the details at their website. It's in April, towards the end, like last weekend. And then last but not least, our buddy Alex, my co-host on Self Hosted, sat down with the Home Assistant podcast that comes out this week. And the Home
Starting point is 00:10:26 Assistant podcast is definitely for those of you who want to learn a lot more about Home Assistant. We go into it on Self-Hosted, but it's part of our overall solution, whereas this podcast focuses in on what I think is one of the most remarkable open source projects and in the top 10 active open source projects on GitHub. And great podcast that really zeroes in on that. And they had Alex sit and join them. Alex has a beautiful home assistant setup too. So that was a good call on their part.
Starting point is 00:10:55 Yeah, that's for sure. That'll be episode 61 if you want to find it. There you go. And we'll also put a link to the general podcast page, since it's not out as we record yet, in the show notes. But you can find it there. There's a lot going on. It's just about LinuxFest season, because something we didn't mention, but I will get it out there right now since we're covering this, we're also going to be at SCaLE.
Starting point is 00:11:15 So if you're going to be at SCaLE this year, come say hi to us. I'm sure coming up soon we'll start setting up the actual meetups and all that kind of stuff. But, yeah, this year we're planning to do LinuxFest Northwest, Texas LinuxFest, and SCaLE. Now, everything's always subject to change, especially these days, but that's our intention as we record right now. So if you're thinking about it and we could
Starting point is 00:11:35 push you over the line a little bit, I hope that encourages you to come say hi to us. Come hang out. And that's the housekeeping. I thought that was going to be a lot to get through, but really, for being off the air for three weeks, that wasn't so bad. No, it actually looks pretty good in here. Yeah. I mean, I'm pretty proud of us, Wes, you know? Computer kid in the chat room asks
Starting point is 00:11:57 if you're going to SELF. I don't know. It always conflicts, but we'll see. I've always wanted to make it down there, but it's a really long travel. I'm going to SELF. Are you? Yeah, okay. Ah, great. There you go. Jim will be there. Jim goes every year. Not quite as far for Jim as it is for me.
Starting point is 00:12:12 No. So that's good. And you often go to All Things Open too, don't you? Every year. There you go. I want to try to make it this year if I can. I don't quite have it nailed down yet, but I would like to. It is a monster of a show. Is it? We had 5,000 people last year.
Starting point is 00:12:28 Whoa, okay. It sounds like we shouldn't miss it. Yeah, you really should not. All right, okay, I'm getting hyped. I mean, it's on my plan, so I'll start taking it seriously. All right, well, let's talk about Linus Torvalds' recent claims about ZFS. He says, quote, don't use ZFS. But unfortunately, he doesn't seem to fully understand it. It's kind of, for me, it was kind of hard to read this because it's sort of like when I have a family member who I realize isn't going to figure something out. Like, it's not kind of within them.
Starting point is 00:13:01 And they sort of hit their limit. And it's sort of a little, well. Turns out Linus has faults too, Chris. Yeah, I mean, we're all human, and I think we knew that. It's just when Linus says something, it comes with a certain level of authority. It immediately, immediately generates headlines. It creates blog posts. It creates discussion.
Starting point is 00:13:17 It's, you know, as it should after his contribution. But there was a series of questions about Linux performance in a thread, and eventually in there, ZFS came up. And it was in here where Linus went on to make some inaccurate and damaging, potentially, claims about ZFS. And Jim kind of makes this case in an article
Starting point is 00:13:35 we're going to link that he wrote over at Ars Technica that really kind of goes into some of the history here. Because that's, I think, something that's really key to understand about this, is this really kind of goes back to January of 2019, when Greg KH decided he was done exporting certain types of kernel symbols to non-GPL loadable kernel modules. Yeah, back then, that particular symbol was keeping track of the state of the processor's floating point unit. And without access to that symbol, external kernel modules, think ZFS, which access the FPU directly
Starting point is 00:14:11 as ZFS does, well, they've got to implement some of the code, the state preservation logic, all on their own. So once that symbol was no longer available to them, that had to be done. Right. And I remember when that change was made, Greg was quoted as saying, quote, my tolerance for ZFS is pretty non-existent. And it kind of created a lot of drama. It spawned a lot of chatter, and it did create a breaking change that the ZFS folks had to respond to.
Starting point is 00:14:41 So there's been some obvious kind of, not hostility, but not really, there's a hard line. Maybe a reminder that these are not well-meshed systems. It certainly works very well, but these two development communities are not necessarily working together. Right. And the hard lines, I think, really form because of the licensing. And that's where I think Linus got this thing right. His concerns over licensing, I think, are reasonable. He writes, honestly, there is no way I can merge any of the ZFS efforts until I get an official letter from Oracle. Other people think it can be done or think it's okay to merge ZFS code into the kernel
Starting point is 00:15:15 and that the module interface makes it okay. That's their decision. But considering Oracle's litigious nature and the questions over licensing, there's no way I can ever feel safe doing so. And he goes on to kind of discuss the flimsy nature of the shim. But where it kind of goes off the rails is when he says, don't use ZFS. It's always more of a buzzword than anything else. And the benchmarks I've seen don't make ZFS look all that great. And as far as I can tell, there's no real maintenance behind it anymore. Well, my jaw dropped when I saw that. How about you, Jim? Yeah, in a big way. Yeah, I do want to back things up a little bit, though. Things were kind of already off the rails when Linus started talking about, you know, how he couldn't integrate
Starting point is 00:16:03 ZFS into the kernel, because while I don't doubt that, you know, he has seen people ask that, maybe even some people have asked him that directly. He was responding to somebody, and I'm sorry if I mispronounce your name, if you're listening, Jonathan Dante. Jonathan had replied to Linus saying that the number one priority for Linux has been don't break existing users. And he goes on, you know, to some degree about that, which is great. And that has been a longstanding policy. And it's an important one and a good one. Jonathan says, I really appreciate this pragmatic policy targeting user space application. However, I have a question about a recent kernel
Starting point is 00:16:38 change. That was the one that, you know, Greg KH made his infamous snark about that broke an important third party module, ZFS. I don't want to flame and feel free not to reply if you think I am, but recently the kernel FPU symbol was declared GPL only and caused a significant issue for the ZFS on Linux project. How do you feel about that? Can you speak about the motivation? Now, it's worth noting that at no point does Jonathan say, hey, Linus, why won't you pull ZFS into the kernel? He just wants to know, you know, why do we stop exporting that symbol when that breaks a project and we say we don't break users? And here's how Linus opens up his response. He says, note that we don't break users is literally about user space applications.
Starting point is 00:17:26 And I'm eliding a couple of his words here for brevity and clarity. We don't break users is about user space applications. If somebody adds a kernel module like ZFS, they're on their own. I can't maintain it and I can't be bound by other people's kernel changes. Yeah. His argument is that's not user space. That's a module. That's kernel space. And we can break that as much as we want. Exactly. And now at this point, right there at this point, he has answered Jonathan's entire question. He has done so informatively and very usefully. He's also, if he stopped right there, done a lot to put a productive positive cap on the controversy, you know, from Greg KH's changes. He says, you know, hey, don't break user space is about user space.
Starting point is 00:18:10 A kernel module is not user space. If you want to play in kernel space, you've got to play with the big boys. And that means that ABIs will break and you have to keep up. And that's perfectly fair. We were so close to not having to talk about this at all. Exactly. So he should have just clicked send right there when he starts saying, and honestly, there's no way I can merge ZFS, you know, blah, blah, blah.
Starting point is 00:18:30 Well, you know, nobody's asking you to right now. And he's already gone off the rails there because I got to tell you, you know, like you opened up, you know, I wrote a long detailed article about this at Ars Technica. We're up to 10 pages of comments now. And I can't tell you how many people, because they've read Linus's comment, you know, they think now that stopping the export of the, you know, kernel FPU begin symbol, now they think that that's about license compliance. And it never was. It was just a maintenance thing. It was tidying up. Right. And I think importantly here, where it gets into comments about performance and the state of maintenance, it really goes off the rails.
Starting point is 00:19:13 Because first of all, the performance is an extremely complicated story, but I'm not sure where he got the maintenance thing from at all. He may have maybe found an old repo or something, but it seems like the ZFS on Linux project is firing on all cylinders these days. Well, so, you know, since then he's kind of, and I gotta be honest, I mean, I don't know that somebody the stature of Linus Torvalds really cares what I think, but he really disappointed me because he replied to that, to, you know, where are you getting this unmaintained thing from? And he kind of retconned it and said, oh, well, you know,
Starting point is 00:19:44 if I say ZFS, obviously I'm talking about Oracle ZFS and, you know, maybe it's being maintained. I don't know. It's proprietary. How can I tell? Do you know if it's maintained? And it's like, come on, dude. I mean, how could you have been talking about Oracle ZFS when you know perfectly well they don't need a Linux kernel symbol export. And nobody using Linux and asking about ZFS is asking about Oracle. They're asking about ZFS on Linux, which is OpenZFS. And the GitHub repo's right there. There were 5,000 code additions in the last month
Starting point is 00:20:16 at the time I wrote the article. Yeah, and great things, if you just look at over 2019, landed in ZFS on Linux that really brought it up to parity with the BSDs and brought fantastic features, some of which can lead to pretty decent performance, like some great compression features. When you look at ZFS itself, there is this tendency to just write it off as a buzzword. There's things I can build around it, things like Red Hat Project Stratis, which looks like a pretty nice collection of tools with an API that could eventually lead to some real serious alternatives.
Starting point is 00:20:52 Or we have our setup in the studio where our OS and our boot drives and things of that type of nature are Btrfs, and then our data storage is ZFS. And, you know, we didn't have to pick a camp. We use both for what they're great for. But when it comes specifically to this kind of writing off ZFS and reducing it down to one variable, oh, it's a buzzword. Where is that coming from? Because it's clearly used in production quite successfully. I, you know, the only thing I can tell, I don't know what kind of private conversations,
Starting point is 00:21:27 you know, senior developers like Greg KH and, you know, Linus might have, but Greg has been pretty voluble about his disdain for ZFS and, you know, what appears to be some anger about Sun having released it under the CDDL, at the very least with the knowledge that it probably
Starting point is 00:21:45 would not be license compatible with the GPL. And he's made some pretty salty comments about, you know, if they didn't want to work with me, then why should I work with them? The thing about that is, I mean, you know, now we're not talking about Oracle, we're talking about Sun, but to the best of my knowledge, we're not talking about the actual developers. Now, we do have some original Sun developers that are, you know, very senior and very active, you know, in OpenZFS. They basically, when Oracle bought Sun to begin with, ZFS developers pretty much all quit and immediately began the OpenZFS project. Matthew Ahrens being one of those. I haven't asked Matt about this, but I find it hard to believe that the actual developers, some of whom I know, were really all that keen on, hey, let's make Linux not be able to use our
Starting point is 00:22:34 crap. That just reeks to me of Scott McNealy, who, you know, he was the CEO of Sun. And he absolutely, on the record, hated the crap out of Linux and didn't want any of his stuff ever working on it. Well, it competed with Solaris. Yeah, well, I had the great displeasure of attending a Scott McNealy keynote at an open source conference a couple of years after Sun was bought by Oracle. And he went on for nearly twice his allotted time, would not get off the stage, and would not shut up about how the entire open source world was screwed now that they didn't have their hero, Scott McNealy, to protect them and all they had was crappy old Linux. Right. And around that time, SCO was going to come and get us, too. So they, you know, they were really feeling empowered at that time. But I mean, that's the guy that didn't want to cooperate with, you know, the Linux kernel maintainers
Starting point is 00:23:22 or anybody using the GPL who had the giant partisan agenda, not the OpenZFS project, who is neither Sun's corporate leadership nor has anything to do with Oracle. That's a great point. Yeah, I feel like it's pretty crappy to still be holding that grudge and snarling about they don't want to work with us. Yes. Times have changed, the crew has changed,
Starting point is 00:23:42 the project location has changed, and it's a really great project. I wonder, too, though, can we just, because you addressed this, and I think it's worth touching on, can we talk about performance for a second? Because there is some truth to the fact that if you put ZFS up against an ext4 disk or a single XFS SSD, ZFS might be a little bit slower. And so that performance thing is a bit of a touchy subject. It is, but it's also an unusually complex subject. If you just set up ZFS default and you run benchmarks and compare it against EXT4 defaults or XFS defaults,
Starting point is 00:24:27 it will probably be slightly slower. It's usually not going to be slow enough to really notice unless you get yourself wedged into, you know, a particularly pathological corner case involving, like, really, really heavily loaded SQL binaries stored on it or something. Sure. But it will probably be slightly slower. But here's the thing about that. One, you know, we're just talking about the defaults. And I think it's reasonable to say that most of the people that are using ZFS, they want to dig a little deeper than that. You know, even if they don't on day one, they eventually get into looking at data sets and, you know, record sizes and all this tuning stuff that can make an enormous difference. And the other thing about that is, you know, there is a very, very long history
Starting point is 00:25:09 of benchmarking storage. And one of the first rules of benchmarking storage is you do absolutely everything possible to eliminate any impact from file system cache. So in the real world, you're doing everything you possibly can to make sure as few reads as possible ever hit the metal because you want to conserve those IOPS and you want to service those requests, you know, literally like two, three orders of magnitude faster from RAM in your cache, right? So you want to eliminate that when you're benchmarking disks or even usually when you're benchmarking file systems because the assumption is cache will just completely wreck, you know, the actual numbers that you care about because it makes such an enormous impact. And also the assumption here is that all file systems will cache the same way and to the same degree of effectiveness. Right. Using the same probably kernel level algorithm for it. Well, so one, up until very recently, they've all used the same algorithm, which is least recently used. It's just slightly
Starting point is 00:26:05 smarter than FIFO, first in, you know, first out. An LRU cache, the way that works is as you read in a block, it goes into cache. If the cache is full, then the block that has been sitting in the cache for the longest without being read gets evicted to make room for the new one. Now, every time you read a block from cache, it gets jumped up to the top of the list again, like it's brand new. So it is at least that smart, but an LRU cache will very happily evict every single block in it if, like, you just read, you know, a stream of data off the disk larger than the size of the cache. Now, every file system on every operating system I'm aware of, other than ZFS, they share an operating system kernel level cache facility. On Linux, ext4, XFS, Btrfs, whatever you're using.
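For readers following along, here is a minimal Python sketch of the LRU behavior Jim has just described. It is an illustration only: the `LRUCache` class, the block names, and the cache size are made up for this example, and it is not how the Linux page cache is actually implemented. It shows the weakness he mentions, where one stream of reads larger than the cache pushes out every block that was being hit before.

```python
from collections import OrderedDict


class LRUCache:
    """Toy least-recently-used cache keyed by block id (illustrative only)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.blocks = OrderedDict()  # oldest entry first, newest entry last

    def read(self, block):
        """Return True on a cache hit, False on a miss (the block is then cached)."""
        if block in self.blocks:
            # A hit jumps the block back to the "brand new" end of the list.
            self.blocks.move_to_end(block)
            return True
        if len(self.blocks) >= self.capacity:
            # Cache is full: evict the block that has gone longest without a read.
            self.blocks.popitem(last=False)
        self.blocks[block] = True
        return False


def demo():
    cache = LRUCache(capacity=4)
    hot = ["a", "b", "c", "d"]
    for block in hot:
        cache.read(block)                       # first reads are misses; cache warms up
    warm_hits = [cache.read(b) for b in hot]    # all four are now served from cache
    for block in range(100, 110):               # one streaming read, larger than the cache
        cache.read(block)
    after_scan_hits = [cache.read(b) for b in hot]
    return warm_hits, after_scan_hits


if __name__ == "__main__":
    print(demo())
```

In this toy run the four hot blocks all hit before the streaming read and all miss after it, which is the "very happily evict every single block" case.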
Starting point is 00:26:56 It's not just, they're not just all using the LRU cache. They're all specifically using the Linux kernel page cache, which is an LRU. Now over on the Windows side, NTFS, you know, what have you, it's using a page cache, which is LRU, which is built into the Windows kernel. But now when you look at ZFS, ZFS not only doesn't use the operating system cache facility, it also doesn't use an LRU cache. It uses something called the ARC. ARC stands for adaptive replacement cache. And I like to explain it in simplistic terms as you can think of it as a weighted cache. So basically every time you read a block, it not only stays in cache, it gets a little heavier and it's harder to push it out. So if you've got a
Starting point is 00:27:36 block that has been in cache and has been read from cache like 500 times, but maybe it's been five minutes since the last time it was read, it's going to be more difficult to evict that block just because you read in some random thing off of disk, because even if it hasn't been active for a few minutes, the cache knows this block gets hit a lot. It's really hot. And it's even slightly smarter than that. Let's say that a block never quite gets all the way to the top of the cache, which are really hard to push out, but it keeps getting read in and keeps getting evicted and keeps getting read back in again. Well, the ARC also tracks blocks that have been recently evicted from cache. So if you have a frequent flyer like that, that keeps getting evicted and bouncing back in, the ARC notices that and starts being
Starting point is 00:28:16 more reluctant to evict it because it's like, I'm just going to see this thing again pretty soon. So the net impact here is that you have a much higher hit rate in most real world workloads from the ARC than you would from the LRU cache. And as we discussed before, you know, the impact of cache is enormous. Every time you serve a read request from cache, you conserve IOPS and you also, you know, service that individual request much, much, much faster. Well, let me ask you this. Am I right in saying then it's also even a little trickier with the ZFS ARC because it sounds like the adaptiveness of it takes it a little bit to figure out, oh, that is something they actually are using frequently. I will add it. Might that not show up in a benchmark immediately? I'm not sure. You're correct. It's
Starting point is 00:28:58 very difficult to trigger in a synthetic benchmark. You know, usually the gold standard for storage benchmarking is a tool called FIO. You can use FIO to model workloads. You can say, I want read, I want write, I want random, I want sequential, I want this block size, I want this many processes. You can mix it all together however you want. And if you're bypassing cache, it's a really, really amazing tool for modeling your exact type of workload and getting really realistic answers. But once you start looking at something that is as smart and adaptive as the ARC, it gets really difficult to synthetically benchmark it because the very synthesis tends to be
Starting point is 00:29:40 way more random than an actual workload. So it doesn't really do a great job of modeling the actual impact the ARC has when you're really doing real things that really do read the same data over and over again. Right. Okay, and then there's sort of the big picture question. Do you think there was some harm done by these statements? You know, you notice it in the comments for sure, certain people feel empowered now.
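As a rough illustration of the caching discussion above, here is a minimal Python sketch. It is not the real ARC algorithm that ZFS implements; `WeightedCache` is a hypothetical toy that follows only the simplified "weighted cache" explanation given here (hits make a block heavier, and recently evicted blocks are remembered), and the class names, workload shapes, and sizes are all assumptions chosen for the example. It compares that toy against a plain LRU on two workloads: a small hot set that is re-read between one-time scans, and a uniformly random synthetic load.

```python
import random
from collections import OrderedDict


class LRUCache:
    """Plain least-recently-used cache: the oldest untouched block is evicted."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.blocks = OrderedDict()

    def read(self, key):
        """Return True on a cache hit, False on a miss."""
        if key in self.blocks:
            self.blocks.move_to_end(key)  # mark as most recently used
            return True
        if len(self.blocks) >= self.capacity:
            self.blocks.popitem(last=False)  # evict least recently used
        self.blocks[key] = True
        return False


class WeightedCache:
    """Toy 'weighted' cache (an assumption for illustration, NOT real ARC)."""

    def __init__(self, capacity, ghost_size=None):
        self.capacity = capacity
        self.ghost_size = ghost_size or capacity
        self.weight = {}             # key -> weight, grows on every hit
        self.last_used = {}          # key -> tick of the most recent access
        self.ghosts = OrderedDict()  # keys evicted recently (no data kept)
        self.tick = 0

    def read(self, key):
        """Return True on a cache hit, False on a miss."""
        self.tick += 1
        if key in self.weight:
            self.weight[key] += 1  # each hit makes the block "heavier"
            self.last_used[key] = self.tick
            return True
        if len(self.weight) >= self.capacity:
            # Evict the lightest block; ties go to the least recently used.
            victim = min(self.weight,
                         key=lambda k: (self.weight[k], self.last_used[k]))
            del self.weight[victim]
            del self.last_used[victim]
            self.ghosts[victim] = True
            if len(self.ghosts) > self.ghost_size:
                self.ghosts.popitem(last=False)
        # A block that bounces back soon after eviction starts out heavier.
        if key in self.ghosts:
            del self.ghosts[key]
            self.weight[key] = 2
        else:
            self.weight[key] = 1
        self.last_used[key] = self.tick
        return False


def hot_plus_scan(rounds=20, hot=8, scan=32):
    """Re-read a small hot set twice, then scan blocks that never repeat."""
    reads, cold_id = [], 0
    for _ in range(rounds):
        reads += [("hot", i) for i in range(hot)] * 2
        for _ in range(scan):
            reads.append(("cold", cold_id))
            cold_id += 1
    return reads


def uniform_random(n=5000, blocks=1000, seed=0):
    """Synthetic load: every block is equally likely on every read."""
    rng = random.Random(seed)
    return [rng.randrange(blocks) for _ in range(n)]


def hit_rate(cache, reads):
    """Fraction of reads served from the cache."""
    hits = sum(cache.read(key) for key in reads)
    return hits / len(reads)


if __name__ == "__main__":
    for name, workload in (("hot set + scan", hot_plus_scan),
                           ("uniform random", uniform_random)):
        lru = hit_rate(LRUCache(16), workload())
        weighted = hit_rate(WeightedCache(16), workload())
        print(f"{name}: LRU {lru:.3f}, weighted toy {weighted:.3f}")
```

Under these assumptions the hot-set workload should favor the weighted toy, because the one-time scan blocks never outweigh the hot blocks, while the uniform random load leaves both caches with roughly the same low hit rate, which is the benchmarking difficulty described above.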
Starting point is 00:30:04 Is there long-term harm done here because of just the scope of Linus's reach? That's kind of a complicated question to answer. The first answer is absolutely. There's no question about it. When you speak from a pulpit as tall with a mic as heavily amplified as, you know, Linus Torvalds does, when you take a crap on something, people notice. Positive comments from Linus, for example, got me, you know, that's what finally tipped me over to actually look at WireGuard, which was already on my radar as a potentially cool VPN thing. But I can't tell you how many potentially cool VPN things that I've been like, I should look at that and never did. But Linus says, let me just say I love this code. And that day when
Starting point is 00:30:45 I see him say that, I'm installing it. Yeah, that caught Wes and my attention the same. Yeah. Now, with that said, I mean, Linus taking a dump on something is kind of less newsworthy than him actually praising something. So I don't know that it has quite the same impact, but people absolutely do notice. Also, honestly, you know, I feel like podcasts like this one and, you know, articles like the one I wrote immediately responding in a thoughtful, detailed way, I do think they nerf a lot of that damage. So it was a negative impact. I don't think it's going to be the end of the world because the community responded and responded pretty well. And it sends a pretty clear signal that there is demand here. You know, Wes and I were talking about this,
Starting point is 00:31:26 and it could be, in part, if you look at Linux in production today, in enterprises, true, some of them, and some very well-known ones, are indeed using ZFS, but probably at scale, most traditional shops that invested maybe a few years ago, they have a NAS or a SAN. It's probably doing some sort of vendor magic hardware RAID. Could be that they've iSCSI'd out the disks and they're just connecting them directly to the different machines and handling them that way. It could be they're using who knows what file system on the back end,
Starting point is 00:32:01 but it's likely not ZFS, at least not if you bought something in the last five years. So I wonder if part of this is just simply at a practical user base level, there's a lot of enthusiastic people that are rolling custom solutions around ZFS that work really well, but it's not at a scale that your SANs and your NASes are. I don't know that I would necessarily entirely agree with that. I think there's a kernel of truth to it, but I think you're probably discounting the popularity of iXsystems. They make storage appliances. I don't want to use the word storage appliance because that's what Oracle calls their ZFS box.
Starting point is 00:32:41 But they make FreeNAS Mini and they make TrueNAS as their true enterprise product. And, uh, there are a lot of those things, you know, in the wild in businesses. Um, and there are a lot of, you know, just like FreeNAS installations, you know, on individual boxes. It's a pretty popular backend with like an iSCSI or NFS transport, you know, to people's, uh, VMware or KVM or whatever virtualization systems. And then you've also got some people going hyperconverged, like I do, with the ZFS storage part
Starting point is 00:33:13 and the virtualization part all inside the same chassis. Now, I think that's considerably less common because people are just so used to that idea of separate the storage silo from the virt silo. Right. It's essentially how we have it set up in the studio, though, is we have with Docker instead of virtualization, but it's a similar setup for us.
Starting point is 00:33:31 And I think, you know, when you look at things like ZFS send and compression that works really fast and really well and just the overall, like, functionality, it's a good option. There's a lot of good options, but it's certainly a good one. And I don't know if it's ever going to be possible. And maybe this is my last question, because I'm curious to know your opinion on this, because obviously it's just opinion. But I don't know if it's ever going to be possible for it to be officially mainlined.
Starting point is 00:33:54 I think it's possible. I certainly wouldn't bet anything that you can't afford to lose on it. You know, the current situation is that there's just not really anybody with a good reason to try to bring an enforcement suit, you know, against mixing the GPL and the CDDL. That was my opinion before Canonical went all, you know, YOLO swag in 2016 and just started shipping ZFS headers in their mainline kernels. I don't know if you knew that, by the way, but Canonical doesn't just have ZFS in the repos. If you're using an Ubuntu machine with a kernel from Canonical, there are CDDL-licensed headers built into that kernel, whether you install ZFS or not. But yeah, even before Canonical just went all YOLO and decided to do it, my position was, you know, I don't see where anybody would want to bring a suit like that because there's literally no good outcome out of it.
Starting point is 00:34:45 It feels like the most likely, honestly, would be a GPL partisan. Folks who are really, really invested in the GPL get really, really pissed about this particular topic, but they don't want to bring that suit because it's a risky-ass suit. And if you bring a GPL enforcement lawsuit against somebody bundling CDDL code and it doesn't go your way in court, well, now you've weakened the GPL and you've weakened enforcement efforts. And that's certainly not anything that somebody who's big into the GPL wants. I don't want that. On the other hand, if you win, you've also set a precedent that's really going to put Linux very far back. Because if you win a CDDL enforcement lawsuit that says you can't have a CDDL kernel module, well, you can't have NVIDIA drivers either. Those are freaking
Starting point is 00:35:28 proprietary. Right now you can't have NVIDIA drivers. You can't have, you know, hardware drivers for, you know, industrial stuff that requires, you know, dongles or whatever. You can't do any of that proprietary stuff. You have just made Linux unusable for an enormous portion of the industry that it's in right now. And now you're just hoping everybody goes BSD instead of going to freaking Windows or something. Nobody wants that. This has got to be the calculation that Canonical's made,
Starting point is 00:35:56 is that it's just too destructive of a fight for either side. What if they won? What if they took it to court, the GPL folks take it to court and they won? The ZFS user base would get drained too. It would be so damaging to not just things like dongles, but anything else like that, to your point.
Starting point is 00:36:14 But conversely, if they didn't win, it would be so damaging because you would see folks no longer have the motivation to mainline their code. They would just come up with these shim solutions and call it good. And the knock-on effects would be felt for years. Yeah, absolutely.
Starting point is 00:36:30 I don't think that Canonical only had my thoughts there about, you know, nobody is going to be, nobody with any sense is going to be motivated to bring this suit. I do believe that their lawyers also told them, you know, look, we think this is fine. We're totally willing to take this to court. We think we'll come out on the other side of it. I know many years before Canonical did that, I want to say it was like 2012, 2013. I was very interested in the same question. I was about a year or two deep into, you know, my ZFS on Linux journey. I was several more years deep with ZFS,
Starting point is 00:37:05 you know, on FreeBSD initially. And I convinced a local intellectual property lawyer to co-author a presentation with me for an open source conference. And basically, you know, I kind of brought him up to speed on the technical details. And, you know, he took a look at it from a legal perspective. And to my surprise, his conclusion was not only, you know, yes, you can get away with installing ZFS on Linux and then like selling somebody a system with that installed. He was like, you know, I'd be perfectly willing to take a case to court defending just putting ZFS directly in the kernel. I think it'd be good to go.
Starting point is 00:37:45 And he had a lot of reasons for that, some of which I have seen repeated elsewhere, some of which were mostly, you know, his take. But basically what I'm seeing is, you know, every time somebody who is an actual law professor or an actual practicing IP lawyer comments publicly on this, they seem to all be commenting, nah, I think we can work this out in court. That has been my observation as well. Colonel has a question, aptly named, in the mumble room about maybe Oracle playing a long game here, letting everybody get all ZFS'd up and then coming in with a big old lawsuit. Yeah, Colonel, is that your thought? That's their scheme? Yeah, I was thinking, you know, if they hold the intellectual property rights
Starting point is 00:38:26 and they let everybody integrate it and then they go after people, it'd be a lot harder for us to switch over to, you know, bcachefs or something else. It would. But again, that's not really in Oracle's best interests either, because Oracle is a significant Linux player, and it's going to be really damaging for them if all of a sudden Unbreakable Linux becomes completely freaking useless because of a court case that they instigated that set the precedent that you can't have proprietary loadable kernel modules. And that actually seems like a big deal to Oracle's customer base, too, in particular. Exactly. I just, I don't see that happening because again,
Starting point is 00:39:07 I think that it would damage them badly if they won such a suit. Now, what I used to worry about, I worried about a poison pill effect where Oracle, you know, might take a long time to let interest in OpenZFS build up, wait for it to reach a critical mass, then say, oh, hey, Oracle ZFS is now GPL'd and can be integrated in the kernel. OpenZFS can't, you know, ha ha, LOL. And now everybody's, you know, scrambling to, you know, get on board the Oracle thing and yada, yada, yada. I'm less worried about that now because what I hadn't understood at the time, and this is the thing that most people who get vocal about this topic seem not to quite understand. The terms of the CDDL that Solaris ZFS was first licensed under, they include the ability for the license steward,
Starting point is 00:39:53 who is defined as the initial developer of a project. In this case, that would be Sun Microsystems, which has now been purchased by Oracle. So the initial developer is the license steward and the license steward may issue a new version of the license. If that happens, the covered code can now be used under the terms of either CDDL 1.0 or the new license version introduced by the license steward. So it's completely possible for Oracle to fix this should enough, you know, grassroots irritation, you know, get them motivated. If they were to make, you know, the last Solaris ZFS before they turned it proprietary, if they were to add, you know, CDDL version 3.0 monkey blue or whatever, that just happened to be exactly, you know, the MIT license or the simplified BSD license, you know, which was weak permissive, but also, you know, GPL compatible, if they did that, that would
Starting point is 00:40:46 automatically cover OpenZFS as well as, well, it wouldn't have to cover Oracle ZFS because again, they own that code and it doesn't have to apply to what they did because as the owner, they can say, well, we're no longer offering that. But if they did that, that would immediately fix everything for the OpenZFS folks. It would not require everybody who's ever contributed OpenZFS code or Solaris ZFS code. It doesn't require them to okay it because again, if you contributed to the project to begin with, you had to contribute under CDDL 1. And if there wasn't an exclusion for the future license version clause, which there was not for Solaris ZFS, then that's just, you know, you agreed to that when you contributed to begin with.
Starting point is 00:41:29 So it's good to go. You've essentially just laid out a potential path to eventual mainlining. And I wonder if we talked about the damage this could have potentially done these statements by Linus, but I wonder if there isn't potentially a longer-term greater good if the top kernel developers play hardball. They sort of force Oracle's hand, maybe long-term, to do just what you laid out.
Starting point is 00:41:55 I don't know that anybody can really force Oracle's hand on this. I think the potential path to success, and this is something I've discussed at some length with Brad Kuhn of the Software Freedom Conservancy. He and I don't completely see eye to eye on this topic. He's a lot more of a traditional hardline GPLv3. But I'm a little, maybe a little bit more pragmatic about it. But at any rate, what, you know, what Brad said, and it makes a lot of sense to me is that, you know, the way that you maybe potentially get Oracle to resolve this, it amounts to a grassroots campaign. You get enough pressure, enough complaining, enough, why won't you fix this? You know, this is a thing of great good, you could do it at no cost to yourself. Maybe eventually you convince them that, you know,
Starting point is 00:42:50 yeah, all right, we'll do that. Sure, it'll make us look like good guys. We could use some good PR. It's not like we get a lot. That's my hope. I mean, maybe it's, I think, you know, of course, there's going to be people that completely disagree if that's even possible. But gosh, I sure like to think it could be one day. It is possible. Like I said, I'm not betting anything I can't afford to lose against it, but it's absolutely possible. To one degree, it's already been done. Brad hounded, and other people hounded Oracle relentlessly about being the biggest GPL violator on the planet by shipping DTrace with Unbreakable Linux's kernel for years and years and years. Yes, right. And they got
Starting point is 00:43:25 them to fix it. And they fixed it the exact same way that we're discussing right now. They added, you know, a weak permissive license to it and problem solved. You can now ship it with the kernel. That's true. So this is not theoretical. This could be done. We know exactly how to do it. It's just a question of, you know, do we have the collective will to make a great big, you know, public stink in the right way to convince Oracle that, you know, yes, we should do this. I think it's building. And the kernel developers not being willing to bend also sort of, I think, I'm not mad at them about that. I do not think Linus is wrong for his opinion that, you know, we should never integrate CDDL code, you know, into the kernel, which is GPLv2. Right. And although, again, he shouldn't have brought it up the way he did in the context he did.
Starting point is 00:44:19 And the other thing is, you know, his whole thing about I get an official letter from Oracle. No, that's stupid. You either resolve the license conflict because Oracle has, you know, added weak permissive to it or it's resolved, you know, in a court, somebody brings an enforcement lawsuit and loses. And now you have a court precedent that states, no, it's totally kosher, you know, to mix CDDL and GPL. At that point, that's when it becomes okay. But until then, no, Linus is totally right for saying,
Starting point is 00:44:50 you know, we're not going to mix CDDL and GPL code. That's fine. He was also completely fine to say, we don't break user space, but we can't extend that guarantee to kernel space. You want to play in kernel space, you got to play like a kernel developer, you got to keep up.
Starting point is 00:45:03 That's fine. Yeah, reasonable. And what the expectation has been for a long time in that regard. I agree. Yeah, it was just some of the other comments about it. Well, all right. I feel like we've got our head around it. So thank you.
Starting point is 00:45:17 Thank you, Jim. I appreciate you joining us to break that down. And we'll link to the Ars Technica post that Jim wrote that goes into even more detail and some of the history in the show notes. And Wes, before we go, we should probably do some picks. We've got a good batch, one that I've already been using since I'm on a remote connection with very limited bandwidth. So it's time for some picks. Ooh, yeah, something you can play with while you're stuck at home.
Starting point is 00:45:38 Yeah. Now, who found bandwhich, the terminal bandwidth utilization tool that was, I think it used to be known as "what"? Yeah, you know, I've seen this floating around. It looks to be somewhat recent. It is written in Rust, so I knew you would like it, and that means they've got simple, single binary releases you can go download from GitHub if you just want to give it a casual try without installing it.
Starting point is 00:45:58 Now, can you spell that for me? It's band, B-A-N-D, W-H-I-C-H. And it's up on GitHub. We'll put a link in the show notes. It's in the Arch repo. And apparently it's in Void Linux. And it's also, if you've got a Mac, it's brewable.
Starting point is 00:46:15 Not doing anything unique here, but it is a really handy, simple display of what connections are going on from which programs. Yeah. Well, and I was surprised when I ran it that a couple of applications I had paused syncing on and those apps were still communicating just like doing heartbeat check-ins or whatnot.
Starting point is 00:46:32 But right now, every single bit counts when you're broadcasting from the snow. So I went and closed a browser that was checking in, probably Bookmark Sync or something, and I closed those applications. And I think it's been kind of helpful specifically for today, so it's sort of perfect. But a follow-up pick previously on Linux Unplugged,
Starting point is 00:46:52 we reviewed or picked s-tui, a terminal-based CPU stress and monitoring utility. Well, it actually worked out pretty great because Joe was running it last night on his machine and discovered that he had a clogged fan with this thing. So just a replug for s-tui, the terminal-based CPU stress and monitoring utility.
Starting point is 00:47:13 It's got real pretty graphs right there in the terminal and an optional stress feature if you want to actually put some load on your system. Put it under stress. But we're not done yet. There's more picks, Wes. Oh, yeah, okay. So we've talked a little bit about Firefox Send, Mozilla's offering to be able to quickly, securely, and easily send files. That works great in a browser, but you know me,
Starting point is 00:47:35 I'm mostly in the terminal, Chris. So ffsend is also Rust-based, but it's a terminal CLI interface to Firefox Send. That's awesome. Firefox Send from the command line. And then last but not least, age. Yes, it's called age. It's a simple and modern secure encryption tool with small, explicit keys, no configuration you've got to worry about, and it goes together like a regular old Unix command.
Starting point is 00:48:01 You'll grok it immediately. We'll have a link in the show notes for that as well. Sometimes you just want to encrypt a single file, right? Maybe you've got some archive or you just have a single backup or you just want to have something safe and secure for transit. But we've all tried to use PGP or GPG and it's just a horrible mess every time. This tool uses some nice modern cryptography
Starting point is 00:48:21 and it's a lot simpler to interface with. I was thinking about doing my Docker Compose files with this, and then throw them on a cloud storage provider just to have some easy accessible backup that's off-site. So there you go. There you go. A simple modern encryption tool for the command line, age. And we have more picks than that, but we realized we're going to have other podcasts, so we should probably save them.
Starting point is 00:48:44 But we got pick-prolific on our holiday break, Wes. Yes, we did. Although that doesn't mean we're not still soliciting some good picks or any other feedback from the audience. I got some feedback for you right now, real time. I took a look at bandwhich. It looks pretty cool. It's got a really nice looking interface. I like that. But if you want something that's already in repos and it's just an apt-get away on Ubuntu or Debian, the most direct old school, you know, tool that serves the same basic function is called nethogs. Oh, yes. nethogs does the same thing. It'll show
Starting point is 00:49:17 you all your connections. It shows you process ID as well as, you know, the termination points on both source and destination. And there's also iftop. iftop does not show you process ID, so you can't tell if it's Apache or Nginx or Firefox or what have you on your end that's using the bandwidth, but it does show you the source and destination terminations, and it's very useful if that's all you're looking for. Very nice. Both of those are in the main repos on pretty much everything.
Starting point is 00:49:47 Good ones. Yes, I love nethogs. Thank you. And like Wes was saying, if you have something that's a really handy desktop app, command line app, or even a web app that makes using Linux a little easier, do let us know. We've got a year of podcasts. We have now taken our totality of breaks for the entire year, and the Linux Unplugged show never misses a beat. So we would love to hear your feedback, your picks, all of that.
Starting point is 00:50:11 slash contact. Jim, thank you for joining us. Really appreciate it and the real-time feedback. Go catch more of Jim over at He and Wes are cracking over there these days, been absolutely loving it. And there's a whole back catalog if you haven't been listening. So go grab that. Also, join us live next week.
Starting point is 00:50:32 on an old Tuesday. We do it now at noon Pacific. You can get converted to your local timeskis over at slash calendar. And you can find me on Twitter. I'm at ChrisLAS. What about you, Wes? I'm at Wes Payne. And what about you, Jim?
Starting point is 00:50:47 JRSSNet. Magic. There you have it. That's it. That's the podcast from the snowpocalypse. Don't forget about that live stream on Friday, assuming I can survive and get out there. Hopefully my water will unfreeze and my vehicles will become unburied.
Starting point is 00:51:00 You'll be a little leaner by then, but that's okay. It's really not that bad. I put the video on the YouTube channel. I mean, it's really not that bad, but it's kind of fun for us and people in Western Washington have no idea what to do with it. So it kind of, it makes it a whole little bit. It makes it an event. I'm really grateful I was just able to get back for our first show of the year and we're still able to make it work via the remote connection. You never know. And there's always more when you join us live. So do that. Thanks so much for joining us here on this week. And we'll see you right back here next Tuesday. Unplugged program. There you have it.
Starting point is 00:52:07 Okay, thank you everybody. Let's go vote on our title, take care of our final business, and then get this shipped over to Joe so he can edit it up. Yeah, we really need some voting here to prune through all the great suggestions. It's funny how we're running out of sounds for music. We're at the point that we're starting to run out of, you know, simple like chord structures for music. Like you just, I don't think you can really invent one of those anymore.
Starting point is 00:52:32 That's what people say about movies too. You know, we're running out of new stuff, except for podcasts. There's always a new podcast out there. There's always new podcasts. I will say I am very happy that movies are finally starting to get the message that cell phones are a thing though. Yeah. Yeah. That's an entire plot point. And when I, when I watch, like Die Hard and Cliffhanger and stuff, it's an entire point of the plot that they can't
Starting point is 00:52:56 communicate because they can't, or they got to make it to a pay phone in time. And in Die Hard 3, they're all about running from phone to phone. Yeah. What a different world, and yet so similar. There was a real run of movies in the 90s where the bad guy was over the phone. The bad guy's always over the phone. It's a whole dramatic bit of it. It's really funny. These times, they just fix it by releasing an EMP. It's a dark outlook, but yeah.
Starting point is 00:53:22 You've got a phone. Not anymore. I think I might maybe just jam the frequency instead, but, uh, you do you, buddy.
