LINUX Unplugged - 439: Double Server Jeopardy
Episode Date: January 3, 2022
Our new server setup is bonkers, but we love it. ...
Transcript
I don't think any of us had this in our predictions, and that was just last week.
But Canonical has a job posting up on their careers page for a, quote, Linux desktop gaming product manager.
And the description reads, we're excited to create the role of desktop gaming product manager to make Ubuntu the best Linux desktop for gaming.
We will work with partners in the silicon world to ensure the latest graphic drivers and tweaks are built in for optimal frame rates and latency, as well as with partners in the gaming industry to ensure that mechanisms such as anti-cheat capabilities are available to ensure fairness and product availability.
Where is this coming from? First of all, and second of all, what strikes me about this is I
was just thinking we're getting awfully close to the Deck, and no major game has announced anti-cheat compatibility. Does this feel like a response to the Deck being Arch, you think?
Oh, I don't know.
A response seems like maybe a lot.
You know, you'd think there'd have to be a little more going on behind the scenes than just that,
but it certainly feels like maybe it's playing into more of these machinations
we don't quite see behind the various curtains.
Hmm.
It is weird the way this job post is written.
Like, you know, half of it's talking about some of this graphic stuff and software development,
but then there's also a lot of the phrase go-to market.
How do you find someone who has software development experience, is really interested in open source, and then also has an MBA?
I guess that's what you need.
Hello, friends, and welcome back to your weekly Linux talk show.
My name is Chris.
My name is Wes.
And my name is Brent.
Hello, gentlemen.
Welcome to our first Sunday episode.
We're here together.
It's nice to see your smiling faces. Coming up on the show today, we've been waiting all summer. It's a new year, and it's time for a new server, with some essential help from our community. We are starting off on a major migration. We're going to tell you more about it.
You know some of it. We're going to tell you the complete picture in this episode. We've got two
servers, new networking, a totally new balls to the walls, don't do as we do kind of setup.
So you're joining us today mid-gallop into a migration,
and we'll give you our first major progress report.
Then we'll round out the show with the emails and the picks.
You know, we've gone through and done our pre-show-like discussion
on what would be the right pick.
I think I have one that's going to be good for the timing,
so I'm vouching for this week's pick.
Oh, wow. Okay.
I mean, the first pick of a new year is serious business.
Yeah, yeah. So before we go any further, we got to do the business of the show.
We got to say hello to our virtual lug.
Time appropriate greetings.
Virtual lug.
Hello.
Hello.
Hello.
Good evening.
Hello.
Namaskaram.
Sunday.
Hello.
Welcome to the Sunday episode.
We're trying out the old Jitsi today.
Should be interesting to see how that goes.
Nice to see some of their faces.
They're like real people.
It is.
Wow.
And, I mean, they did not bat an eye at following us right over here.
Yeah, well, it helped that we kept dropping our connection to Mumble.
That didn't change it.
As you probably guessed, I'm out of the woods.
Made it out of the woods.
Yeah, you didn't die.
Yeah.
An icy death.
Good work.
So if you didn't catch last week's episode, I got stranded out in the woods after the Pacific Northwest got slammed with an unusually cold winter storm and didn't have enough fuel, didn't have enough propane, didn't have enough water.
Tanks were full.
It's freezing my, you know, what's off.
And our ideal scenario is we were going to get out of there after last week's show on Tuesday.
We didn't make it out till Thursday.
Yeah.
Now, the nice thing was, is the longer we stayed out in the woods, the more we kind of learned to live out there.
Still real hard on the rig.
And there has been damage to our water system.
But what happened is that a bunch of ice formed on top of our slides, like thick ice, much thicker than I realized.
And it must have been a combination of just melt and then refreezing that just made this thing huge over time.
And so it was hours per slide to get that ice off.
Wow.
Took forever. Of course I tried to not do it at first, thinking, well, maybe it'll break when it comes in. No, no, you could drive a car on this ice. It was like that thick. So we had to like chip away at each slide, and it took hours, and, you know, it's 20 degrees, 15 degrees outside. And it's right at winter solstice too, so there's only so many hours of daylight to do the work out there. You can only be out in the cold for so long.
We're making no solar, right? Uh, it was just worst case scenario. And, uh, so we didn't actually make it out till Thursday. And then that evening, another huge snowstorm came in. If we hadn't made it out that day, we would have been trapped again that evening. It was like down to the wire.
Yeah.
You probably would have just gotten out now.
Yeah, maybe. Yeah, because it's just now, for the first time since Christmas, gone above freezing. So, yeah. But then, you know, fast forward, we're here at the studio Saturday, last night, the day after New Year's. Even though Wes has friends in town, he still came and spent the day up here and brought Brent up. Heck yes.
Brent's in from out of town.
Nice to have Brent here.
Out of country.
Right.
Right.
And you had quite the experience.
I had a saga.
I think it's a podcast-length story in itself that I won't go into details on, but it took me, let's say, two days to get here.
Yeah.
You had to end up crossing the border a total of five times.
Yeah.
Yeah, I did.
Customs got to know me really well. And that's not usually a good thing. Also, you learned almost the hard
way that if you buy something in the duty-free store and then bring it back into your own country,
there's like a 200% fee for that. Yeah, it turns out they don't tell you that when you buy these
things. I guess it's an edge case. Yeah. So just be aware of that, listener. Brent'll save you some
money there. So before we get into our server stuff today, I just want to mention that we do
have a meetup on the 30th planned at the studio here in the Seattle area. It's up in the town of
Arlington. So it's about 45, 50 minutes north of Seattle. And we're going to do phase two of the
server launch. We're going to have a get-together potluck here.
Just have a good old time, do a live lup, and hang out at the JB studio, assuming that it's safe to do so.
You know, we'll watch all that.
So we'll have more information on that soon.
But if you're in the Pacific Northwest area, start making your plans for that.
But all right, let's talk about server stuff, because you guys know that if you've been listening to the show for a while, we have this really super reliable super micro box that has been around forever.
We bought it used.
It's really done good work for us, right?
I mean, it sure has been the bedrock of the studio for a few years now.
And it's lived through some real hot summers because it's in a garage that has an exposed roof.
And it's also where the afternoon sun is.
So it gets really hot in there, even for the Seattle area.
You know, it could be over 100 and something degrees in there in the summer.
Survived multiple OS transitions, many Arch updates.
Yeah, it started life as a FreeNAS box.
Then it became a Fedora box.
And then it became an Arch box.
And there are legacy artifacts of that going all the way back to the FreeNAS installation, which is pretty funny. And Allan Jude set up our original ZFS setup on FreeNAS, and so that has been carried forward through all those installs as well, which is something we're going to kind of finally modernize and update. We'll get into that in a little bit. But it really has served us well.
It's loud.
It takes a ton of power.
And it's been through a lot of seasons.
And this last summer, it started hard locking on us a few times.
Probably the most fatal one was during the final few weeks of the road trip
when we were all AFK and it went down.
And I had like some backup infrastructure on Linode
because we knew it
was sketch. And that was the only thing that kind of saved the day on that. So we realized
goal number one, when we go for a new server is stability. We got to make sure this thing's not
locking up on us. We want to have remote management. We want to be able to figure out what's
happening, even when we're not in the studio. And along with that, we want to build like a cooling box for it in the spring and all that
kind of stuff. And, probably obviously, something that will just last us a few more years too. It's not just reliability. It's also sustainability for a few more years.
Yeah. Right. We're kind of getting some work in now to forestall having to do this again sometime soon.
Yeah. You know, I love getting four or five years out of something like this, if we can, even longer, which we definitely did with this last box.
So that's like 50% of our goals right there.
The other 50% of our goal is to increase the available storage that we have.
We archive FLAC versions of our show productions on there.
We archive special projects on there.
Anytime we go to a community event, if we record audio or we film it, we archive all the source material on there. There's also a media
collection for people in the studio that's on there. There's, you know, I have some family
stuff on there, like videos and things like that. There's a ton of our archives. So from the object
storage up in Linode, we've talked about this before, but just as a recap,
we run Nextcloud up on Linode.
Object storage is the backend.
And then to prevent excessive growth,
we just carve off every few months some of that and we throw it on the physical server here in the studio
that has this.
And that way we keep what's in object storage
to like three or four months.
And then anything older than that
lives in an archive storage,
kind of like a cold storage
here at the studio.
Then we run NextCloud
on each of them
and it helps us manage that
and it's really nice.
And so we need to be able
to grow that as well.
So right now,
the current system
has two terabytes free.
It's running Arch.
We haven't updated it
since the last time
we updated it on the show.
That's been a few months.
It's like, it's definitely due for some updates.
Yeah, I think so. Let's just see here.
How many?
You think?
No, it'd be really dumb
to do one more update
just because we're in the middle
of a migration.
Yeah, no, we don't need to update it.
But boy, I tell you what,
there was a second there.
I'm like, let's do it.
Let's kick off the millionaire.
Let's update this sucker.
But no, no,
this is not that time.
So we wanted to move
over to something that gave us additional capacity, some runway for storage. Ideally,
although this is a long shot, ideally something that gets us through the rest of the year because
buying high capacity SAS hard drives right now is like buying them made out of gold or Bitcoin
even. It's ridiculous. We were looking at prices last night for even just, you know,
maybe two terabytes.
Some of the drives are going for nearly $800 right now.
It's just not happening.
Too much.
So we wanted to try to get as much storage runway as possible
with this migration.
And did you get the updates there, the number there?
Oh, yeah.
Oh, you know, hey, net upgrade size only 24 megs. But, uh, 229 packages, three gigs of updates, and a 24 megabyte net size. That just makes me smile. You know, that's just so damn impressive. We'll miss you, Arch. So it really was a matter of getting new hardware, trying to get the right price-to-performance ratio,
and then finding time to set it up and finding a new physical spot to put it that we could build
a box around later. You know, it's as with a lot of these things, it's an exercise in
engineering compromises, right? Like we have the resources we have entirely, you know, thanks to
the amazing community that we have. And we're trying to figure out like for our particular
needs, which, you know, it's kind of an odd combination of things.
And it's not necessarily, it's not the thing hosting the shows.
It's not the stuff, you know, if it, if something happens
or we need to do some work on it, that's okay.
But it is still, you know, important to the studio.
Yeah, it's not in the pipeline of production,
but it's in the support area of like infrastructure,
which is like a second tier of infrastructure.
So it works kind of good for this sort of thing.
It's definitely something we have in production, so if it breaks, you know, we feel it. When I was looking at buying something, I was looking at like a thousand, two thousand dollar rig, you know, looking at something budget friendly, because I knew I was going to have to put a bunch of money into storage.
But like Wes said, we have an amazing community. It's like,
you can't put it down as
an asset on a balance sheet, but it has got to be one of the strongest assets of this company
because we had two different individuals step up and hook us up with incredible hardware that
is really, it changed the focus from like, how the hell are we going to afford this to like,
let's get some storage and focus on the setup.
Both of these servers that were donated to us are nicer than I would have bought.
Right?
No kidding.
Like, you know, we could have got something decent, but not this.
Yeah.
And both servers, one came with no disk and one came with disk, but both servers ended up having 48 terabytes of total storage available.
Ironically, they both, that's what it worked out to be, is 48 terabytes,
which is interesting because right now we have about 20 terabytes,
and we have two terabytes free in our current system.
One is a set of spinning disks that I picked up from two different vendors on eBay,
and the other came with 24 low-write threshold SSDs,
but they're all SSDs, and man, is that a thing of beauty.
So when we saw this, what we saw was a real opportunity here for us to play with disaster recovery and failover setups over time,
especially not maybe so much from like an OS hot standby standpoint,
but more like from a data recovery.
For some of these really big, you know, archived files of productions and stuff
that right now they're just not properly backed up because it's difficult to do so.
Yeah, it's a lot of storage.
So how do you do that?
And especially when you have a Comcast connection that has like a kind of crappy upload speed and stuff like that.
So we're going to talk more about that.
That's more down the road.
What we're going to focus on today is the box that has 24 SSDs in it.
And that's going to be the first box we deploy in this because it's got some great CPUs in it.
It has got a couple of Intel E5-2667 Xeons clocked at 3.6 gigahertz. And these are just
beautiful CPUs. And there's 16 cores total here. And it has
755 gigabytes of usable RAM. It's just we're using like something like 0.4% right now,
just in the base setup. It's insane. Yeah, we're talking RAM disk here,
we're going to be getting there. So it's a Dell R730xd. And it's a hell of a box, but it was a little bit above my pay grade.
Yeah. I got to say, our community once again stepped up, and joining us this Saturday, we had a listener who helped give the hardware a once-over and get things rolling. Just don't confuse him with old Wes.
My name's Wesley. I'll go by that, just so that we're not confusing me with a Wes Payne.
But yeah, so I'm, I'm kind of local here to the Seattle area.
And when I heard Chris say that he would love some extra help,
I was like, I'm a guy who lives in Seattle,
the Seattle area, and I would love to help you guys. So yeah, I'll come on down and see what I can do.
I actually told New Wes, don't come.
I said, don't show up.
The roads are frozen.
It's a two-hour drive from where you're at.
Plus, we're having a get-together on the 30th.
That's just too much work.
That's crazy.
Don't come.
It's no, it's just not, we're not worth it.
But he didn't listen to me.
He headed out, and he really, really knows this particular line of Dell hardware in and out.
So that worked out, like, perfectly, because it's what he deals with at his day job. No kidding. Could not ask for anything better. Now, SUSE, on the other hand, well, we're all kind of on the same page there. Yeah. I knew it was a distro and
that's about the level that I had. Yeah. We had some fun moments. I think that's how we feel
sometimes too. We're slowly learning. The net install had some learning curves for us.
We also chatted about how much of a great deal you can get from used server hardware.
I think that is really the gold point here
is everything we got is used
and yet it's still amazing for what we need to use it for.
And maybe something people should consider more often.
Absolutely.
It's very affordable because you got companies who, you know, they go through their stock, a third of their servers at a time.
They do upgrades.
And they'll go for like $700, $800.
Yeah.
And they're ridiculously powerful for that, too.
Yeah.
They also have a nice feature that I think we're excited about, IPMI, right?
But New Wes wanted to warn us and tell us to be careful.
Oh, yeah, IPMI is great.
It's easily one of the best things about sort of enterprise hardware.
There is a downside, though, of course, as with everything.
There's a major security component there
because it's its own separate board, apart from the rest of the motherboard.
That's how you can turn the server on while it's off.
It's because it's its own board that has the ability to control the server remotely.
And so when you expose this over IP, right, which is how you want to for convenience,
now you've got this huge attack surface that has the ability to control your entire box.
That's definitely something we'll have to keep in mind.
Definitely.
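For a sense of what that remote management buys you in practice, here's a minimal sketch of driving a BMC over the network with ipmitool. The address and credentials are placeholders, and on a Dell box like ours you'd point this at the iDRAC's dedicated management IP:

```bash
# Out-of-band management over the network; the BMC answers even when the OS is hung.
# Host, user, and password below are placeholders.
ipmitool -I lanplus -H 192.0.2.10 -U admin -P 'changeme' chassis power status
ipmitool -I lanplus -H 192.0.2.10 -U admin -P 'changeme' sel list             # hardware event log
ipmitool -I lanplus -H 192.0.2.10 -U admin -P 'changeme' chassis power cycle  # the road-trip fix
```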
Because I think a big part of this for us was IPMI, getting some kind of remote management. You can really think of it as a computer inside your server. And so I think what we'll probably do is lock down what IPs can connect to it and that kind of stuff. But it also was why we did this, because if I had IPMI while I was on that road trip, I could have rebooted the server and got it back up and running again, and it wouldn't have been out for a few weeks. So just, I mean, a huge thank you to New Wes for making it all
the way up here and helping us get the ball rolling on this. We really kind of needed that
because I ended up way behind on getting this going because I got stuck out in the woods for
a week longer than I intended to. So like the, all the, all the time after Christmas, between
Christmas and new year's was when I was going to be working on it, and I was
stuck. And there were a lot of different moving pieces,
which is why I was having extra hands around, too, right?
Because, like, okay, there was the physical setup in the studio,
getting that wired up, plugged into the network,
up and going. Then there was actually getting the OS
installed, and we still needed to figure out, like,
disk layouts and operating system
configuration, and, you know, all kinds of things. Even
just having his knowledge there on
the BIOS, because he knew those BIOSes in and
out and what we wanted to do with it.
That could have been a rabbit hole. We would have wasted way too much time on it.
Right.
And also a big thank you to those of you who donated the hardware.
Real Zombie Geek hooked us up with one of these awesome servers.
That's the one we'll be telling you more about soon.
We had an anonymous listener give us the one with the SSDs that we put into production
now.
And it's great for us because it allows us to just get to work and get back to making the shows and learning this
stuff and be able to relay that to you guys. So let's talk about what we're excited about with
this server. It's definitely IPMI. That's a big game changer for us, but there's a ton in this box.
A lot of it we haven't even told you about.
Linode.com slash unplugged. Boy, have I really just gained a whole new appreciation for Linode this week. I don't want to do any more of these. Setting up an actual physical server is a lot of work. Yeah, no kidding, right? Just even the physical aspects of it, rather than just clicking a button in a web UI, perhaps? And I honestly wouldn't be surprised
if the power draw of that server is probably about the cost of a base Linode, right? But then that base Linode is going to be in their data center with their screaming fast connections and their awesome hardware, and also 11 data centers I could pick from. Plus they can handle the backups if you want. Yeah. I mean, it's really just a whole other ballgame.
And it's honestly why we have 98% of our infrastructure up on Linode.
That's how we really deploy everything.
Anything that the team uses, anything that our listeners interface with, that's what we use Linode for.
But I hear from so many of you out there that have tried it out for something like NextCloud or replaced Zoom with Jitsi or have decided to migrate to GitLab
and you put that on Linode, you're just having that peace of mind that you've got the root
password. You can make a backup when you want it. You can get your data when you need it.
You can have full control over that setup. I think that's worth a lot. It's a kind of peace
of mind or almost a stress that gets lifted off your shoulders when you move to that kind of setup.
And then the great thing about Linode is they give you all the tools to manage that in a way that's not super
complicated. It's not going to be overwhelming. So if you want to take steps in this direction,
but you've never done it before, you're not going to get scared off by a horrible interface or a complicated UI, like I picture with something like the big hyperscalers. That is its own
ecosystem, right? That is its own language. That's its own tooling. There are people that
can become specialists in those hyperscalers that can't do anything else. It's not necessarily a
good or bad thing. It's just not where I want to invest my time. It's not where I want to develop
my skillset. And it's also, it's not going to enable me to use multiple cloud providers should
I ever need to. And Linode then has all these great tools like APIs and command line tools
you can use as well, or integrate with Kubernetes or Terraform. So it's just like,
it's sort of sitting in this perfect spot for people that really are advanced with their
infrastructure management or for people that just need a $5 Linode. And their pricing is 30 to 50%
less than the major hyperscalers. So go try it out, get a hundred dollars and really kick the
tires and you support the show. It's Linode.com slash unplugged.
You go there, you get $100 in 60-day credit,
and you support the show. What a great way to start
the year. Linode.com slash
unplugged.
Well,
after New Wes put in a lot
of hard work getting us all set up and configured,
you know, the OS was installed,
then we realized, oh, right.
We've got to set up this new disk array.
And we hadn't really gotten quite that far in our planning.
Yeah, this was tricky too,
because we had to make a couple of decisions.
And one of the things we seriously gave consideration to
was doing everything in ButterFS.
Traditionally, we have the root file system in ButterFS,
and then we have the data pool in ZFS. And we thought this time, perhaps, we just ButterFS
the entire thing. And we didn't really have like a hard line on like requirements. So we had to
actually hash out, well, what do we actually want to do? How much storage do we actually want
available? And by having that conversation, we kind of whittled it down to ultimately,
well, we're going to actually
end up using ZFS.
You know, it was interesting, though.
It was a fun discussion
and thought experience
in some sense.
You know, we kind of showed
what things, you know,
what parts worked well
in a world where it was all
on ButterFS
and kind of unified
some of the tooling
we could have used.
But, you know,
it came down to just
the available features
and the amount of the pool
we could actually use. Because at the end of the day, if we just had the same amount of space,
that's still an improvement, but we were going to run into issues down the road in 2022 that we
would just have to immediately address. Yeah. I wouldn't be surprised if we wouldn't have made it
more than three months with what we have available on the current system. And with the price of
storage right now, I just can't really see us being able to
replace a bunch of those disks right away.
Right. And so if we, you know,
if we had kind of copied a similar setup,
the current array was set up by Allan back in the day,
and it's set up with a bunch of mirrors in the pool.
Probably a sensible way to do it.
Totally, yeah.
You know, Allan and Jim over the years
have shared a lot of great stuff.
There's some good wisdom on how you might want to set up ZFS,
particularly what kind of use case you've got
and what uptime reliability
and how much data protection you need.
But with mirrors, you're making a copy there.
You're mirroring that data over,
and that means that you're only going to get 50% utilization.
So we ended up with exactly the amount of free space
our current system has.
Yeah, and when we told you that,
you looked over at the calculator
and you weren't exactly happy.
I'm like, we just can't do this.
We should, especially because these are-
We kept rating all the arguments.
You know, we went and looked up all these kind of guides on, like, how to think about setting up your ZFS storage and stuff, and be like, yeah, okay.
Yeah, I mean, that argument,
I can't say that's not good reasoning.
Right, I like PM'd Neil on Matrix last night.
I'm like, Neil, what do you think about this too?
Because we were all sitting here in the studio
trying to decide what to do about it.
And, you know, you threw out a couple options
we could use to try to get some available storage
out of a ButterFS setup, which was very helpful.
Thank you for taking the time to do that.
When we were talking about it,
it was the main constraint you had was
you didn't foresee expanding your physical topology the time to do that? When we were talking about it, it was the main constraint you had was you
didn't foresee expanding your physical topology within this year, right? That was the constraint
that you ultimately put down to me. And I said, the main concern you should make sure you think
about is avoiding a topology which can lead to cascade disk failure. Because if you're relying,
for example, on parity RAID,
and you say, all right, I'm okay with accepting two disks failing,
you have to make sure your RAID configuration is set up
so that you reduce the likelihood that two disks will fail at the same time.
And because you can't replace the disks and you can't afford
and you can't figure out whether those disks were bought at the same time
or if they're different times, you have to go with the worst case assumption that they were all bought
at the same time. And that means that they will all fail at the same time when they get to failing.
I expect that to be the case. You're right. So expect that to be the case and design it
accordingly. Like I recommended, you know, for everyone else: if you're doing ButterFS, I recommend using RAID 10 and using Zstandard compression to increase the available space to be somewhere closer to what you would get with a RAID 6 arrangement, because the RAID 10 configuration is less punishing to disks than the other options. With ZFS, you can do an equivalent configuration by putting mirrors and then striping the mirrors on top, because that's essentially what a RAID 10 is. It's a striped mirror. But don't do RAID-Z or parity RAID, because with the higher density disks that exist today, the likelihood of cascade disk failure goes up. I have personally experienced this at work and in my home setups, and I basically don't use parity RAID unless I absolutely have to. So don't, if you can avoid it.
Yeah.
And to make it more poignant, these are low-write SSDs to begin with.
So they have a lower life cycle than an average SSD.
So they will fail sooner than an average SSD as well.
Right.
And so like this is where if you're using ButterFS,
and like what I told Chris, and I'll repeat again for everyone else, is: make a separate ButterFS pool for your data with your disks, and configure it differently than how you would set up your OS pool, so that it's optimized for those characteristics. So, for example, set it up so that discard is async, set it up so you use Zstandard compression, probably at a higher compression level, because you're willing to eat more CPU cycles in return for less IO on disk. Set it up with RAID 10. Make sure that you've got proper monitoring and alerting to catch the warning signs quick. Those kinds of things. Thank you. And thank you for taking the time. That's some great advice.
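As a rough sketch of what that advice could look like on the command line, with placeholder device names, mount point, and compression level rather than anything we actually ran:

```bash
# Separate Btrfs data pool: RAID 10 for both data and metadata across the data disks.
# /dev/sdb.../dev/sde and /srv/archive are placeholders.
mkfs.btrfs -L archive -d raid10 -m raid10 /dev/sdb /dev/sdc /dev/sdd /dev/sde

# Mount with async discard and a heavier zstd level, per the recommendations above.
mount -o compress=zstd:6,discard=async /dev/sdb /srv/archive
```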
And that is exactly the kind of rational setup everyone should follow.
And I want to be clear, what we ended up with is nowhere near as good as that. But it is better than what we had on the whiteboard for quite a while, which was simply YOLO.
And it was just a JBOD with no RAID at all,
which we ultimately decided against. But what we chose seems a little more rational when you
consider we considered no RAID at all. But I will also remind you that we have a server with the
identical amount of storage that we have as a failover box. So that ultimately played a role
in our decision. So we decided to go with ButterFS on the root, which is a hardware RAID 1 using the Dell RAID controller.
It's got these two drive slots in the arse of the box back where the ports are.
There's two disks in there that run the OS.
Then we've got 24 solid state disks that are getting passed through on the controller.
And we're going with ZFS.
And because you should never do like what we do, we are doing RAID Z2. disks that are getting passed through on the controller and we're going with zfs and because
you should never do like what we do we are doing raid z2 and like you should never do like we do
we're doing one big v dev which you should never do especially with 24 disks right so we basically
tried to read all the advice we could get and then not do it. I mean, that seems pretty rational, right?
I mean, no one—
But you liked the number when we said you could get—we had like 82.5% capacity.
That was the number you liked, I think, right?
That's what it came down to, right?
Is this migration just doesn't happen if we don't have more storage.
So ultimately, we wanted as much storage available as possible.
So we went with a parity raid,
which we're really going to be screwed
if we have more than two disks fail
at any point in time.
And it's extra screwy
because the time it's going to take
to get a replacement disk
will probably be about the amount of time
it takes for a third disk to fail.
So we're really kind of going down to the wire
with the idea being that we have the second box
that will have an identical copy of everything
that we could do like a ZFS
send to recover should we need. The IRC thinks we hate our data. Well, they might be right. I mean,
we did almost YOLO this whole thing, right? You know, what's been fascinating to me watching
all of these discussions happen, Chris, is just the difference in your capacity to accept risk, and basically Wes for a few hours just trying to bring you back from those cliffs there.
So, you know, I'm not trying to,
it's more just to make sure we presented the facts
as they may be.
I appreciate it.
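For the curious, the layout we just described, one RAID-Z2 vdev across all 24 SSDs, gets created with something along these lines. The pool name and device paths are placeholders, not our actual setup:

```bash
# One big RAID-Z2 vdev: two disks' worth of parity shared across the entire 24-disk pool.
# Pool name and by-id paths are placeholders; stable /dev/disk/by-id names are the safer choice.
zpool create -o ashift=12 tank raidz2 \
  /dev/disk/by-id/ata-SSD_SERIAL_01 \
  /dev/disk/by-id/ata-SSD_SERIAL_02 \
  /dev/disk/by-id/ata-SSD_SERIAL_03   # ...and so on through all 24 disks

# Verify the vdev layout and keep an eye on it for degraded disks.
zpool status tank
```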
So one of the things you kind of workshopped
on your laptop was we were curious,
like what happens if you just do a big ass
ButterFS volume with 24 disks
and a couple of those disks pop off?
Like, how does ButterFS handle that? So Wes kind of replicated that on a smaller scale on his
laptop. Tell us what you did. Oh, yeah, it's just fun playing around with ButterFS or ZFS. You can
set up some loopback devices, you know, just make some raw files you treat as disks with the loopback
system in Linux, which is super handy to play with. You know, it's the mount -o loop option you may be
familiar with. And then, yeah, you can
set them up and then play with whatever type of
RAID configuration you might want, right?
Great way to start learning on ButterFS or
ZFS tooling without necessarily touching
your real disks. Yeah, and
no big loss if one of those loopback
devices disappears and you want to see how the system reacts.
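If you want to run the same kind of experiment, here's a minimal sketch of that loopback approach. The file sizes, paths, and loop device names are placeholders, since losetup hands out whatever device is free:

```bash
# Sparse files standing in for disks; sizes and paths are arbitrary for a throwaway test.
for i in 1 2 3 4; do truncate -s 1G /tmp/fakedisk$i.img; done

# Attach them as loop devices; losetup -f --show prints the device it picked (e.g. /dev/loop7).
for i in 1 2 3 4; do sudo losetup -f --show /tmp/fakedisk$i.img; done

# Build a throwaway Btrfs RAID 10 on the loop devices it printed (names assumed here).
sudo mkfs.btrfs -d raid10 -m raid10 /dev/loop7 /dev/loop8 /dev/loop9 /dev/loop10
sudo mkdir -p /mnt/looptest && sudo mount /dev/loop7 /mnt/looptest

# Yank a "disk" and see how the filesystem reacts.
sudo umount /mnt/looptest
sudo losetup -d /dev/loop10
sudo mount -o degraded /dev/loop7 /mnt/looptest
```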
Yeah, right, and that's also, yeah, exactly. And then you can sort of modify
things, you can just write arbitrarily to it without
worry or take it offline entirely and
see what happens. And you know, ButterFS,
it did degrade, but we would
have had to do like a full restore from backup,
which, you know, isn't, we should
be able to do. Like, that's the point of having the backup.
But ultimately, we
decided that probably wasn't
reasonable. So why only one
vdev? Because we did actually, didn't we initially do four vdevs of six disks?
Well, so it was basically just trying to figure out what storage, right? It ended up being the storage capacity metric.
Yes, right. And so if we were splitting it up, that just meant there were more total parity disks. More parity disks, so there's less available space. But with the amount of disks we have, it actually would make a lot of sense to split it up into multiple vdevs, because then it kind of isolates the rebuild degradations to that vdev and doesn't screw with the entire...
Yeah, you can find some tips out there for just how many disks you should stick to for the various RAID levels, and how to take your total disks and place them up into vdevs that will perform nicely. And then there's also, of course, concerns around if you're trying to optimize for IOPS and stuff like that. That wasn't our concern in this case. But it gets complicated, and you definitely shouldn't do like we're doing. Don't do it like we're doing.
Now here's something that you might want to consider.
And we're going to just be honest with you
on how this part goes.
And we kind of have a luxury
because the old server hasn't crashed
recently. Doesn't mean
we want to take our time doing this,
but it means instead of just like whole hog
picking up the Docker environment
and dropping it on the new server,
which was what plan A was,
now we're thinking because they're both running,
why don't we do an application migration to Podman? So we'll go from Docker in this migration to Podman running on the openSUSE Tumbleweed in the new setup. We thought we didn't have enough to learn, you know, with a new server operating system, so why not introduce something new into the mix?
Yeah, right.
But no, really, I've been playing with Podman a little bit on the side. I think you have, too, and it's come a long way. It's been exciting to watch its development. There's a lot of stuff to like about it, especially with the mixed sort of changes that the Docker desktop world has been seeing lately. So I kind of felt like maybe this was just the excuse we needed to really see. You know, it's not, again, as we said, like there's some stuff that needs to run, but it's not exactly business critical day to day. So we can have a little time here to try to figure out and learn. Frankly, learning a lot more about Podman is also what we need to do here.
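For a flavor of why that jump is less scary than it sounds, Podman's CLI is largely Docker-compatible. This is just an illustrative sketch; the image, container name, and volume path are placeholders for whatever we end up migrating:

```bash
# Day-to-day verbs look just like Docker's.
podman pull docker.io/library/nginx:stable
podman run -d --name web -p 8080:80 -v /tank/apps/web:/usr/share/nginx/html docker.io/library/nginx:stable

# Podman can also emit a systemd unit so the container comes back after a reboot.
podman generate systemd --name web --files --new
```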
And we have the luxury that the production implementation is still running.
So we're going to start with like a simple applications and then kind of work our way up.
Eventually, we get to a bunch of applications that are all kind of intertwined with a whole
bunch of storage and all that kind of stuff. And that's where we're going to have to just move like
three or four things at a time. But I think, although, have we actually tried this yet? Because you might have been plugging away at this before the show started. But we're thinking, because we were using the Docker ZFS driver on Arch, we have a subvolume for each Docker container?
Yeah, we've got, there's a lot of subvolumes.
So there's both the layer of,
there's the volume layer,
and a lot of ours are set up on that setup.
And then there's the sort of container storage layer
for the images.
And in the current setup, those are both on ZFS.
And so should we just be able to ZFS send those to the new box?
Yeah, so the application data, we can just, yeah.
Yeah, that's going to be nice.
And then it'll be a matter of starting the app in Podman
and just connecting it to its moved data, hopefully.
I mean, Podman's come a long way in some of the Docker compatibility,
and I have not really used it in anger
or in production.
You know,
I don't think either of us have.
So that's,
I think that's going to be
what's interesting
is to figure out
what pieces move easily
because we've tested
a few simple apps
in the new setup
and they seem to be working
just fine
but nothing with,
you know,
complicated interdependencies
and networking
and a whole bunch of state
that we're trying to migrate.
Yeah.
Yeah,
it's going to be interesting
but I think it'll be good
to learn
and try something new.
Then we'll also be learning the Tumbleweed tooling at the same time.
So there's quite a bit, actually.
There's a lot of moving parts to it.
It has been neat to see some of that integration, though.
For instance, Podman has a ZFS driver, but it also has a ButterFS driver.
And Tumbleweed has that set up for you out of the box.
Yeah, well, of course.
And so it's just been interesting since we've kind of been, you know,
like we hacked on Snapper to our Arch setup.
And so it's one thing I'm excited to explore
in this new world is just some of those pieces
in a system that someone else has been thinking about
how to compose those pieces and work with ButterFS
and not just the stuff that we've come up with.
Then phase two-ish, I suppose,
after we get everything migrated,
is going to be attacking this DR box.
We have a couple of routes we could go there.
We could do another Tumbleweed box and just kind of keep them in sync. Or we could Proxmox that machine and allocate a VM to be like a backup server, and do like some cron jobs to ZFS send.
I don't know if we have a complete picture of how we want to do that yet.
Like one thing we're considering is like sync it up here, and then Wes takes it to his house and it runs there as an actual legit, for realsies, offsite backup.
What an idea.
That would be a big step up for that data.
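If we do go the cron route, the incremental replication would be a sketch along these lines. The pool names, remote host, and snapshot bookkeeping are all hypothetical; in practice a tool like syncoid handles the details:

```bash
# Nightly incremental send from the studio pool to the offsite box (all names made up).
# Assumes a common snapshot "@prev" already exists on both sides.
zfs snapshot -r tank/archive@$(date +%F)
zfs send -R -i tank/archive@prev tank/archive@$(date +%F) | \
  ssh offsite zfs recv -du backuptank/archive
```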
There is like some stuff that we've just decided,
like if a disaster strikes,
I guess we're just going to lose this because I mean,
where,
where do you want to pay for 20 terabytes of backup storage?
And we're adding to it every single episode.
Like, it just gets crazy, right?
So this is a real solution for us now.
So if you have ideas or suggestions,
linuxunplugged.com slash contact for that.
Oh, yeah, I'm sure you guys have some very clever ideas.
Now, that being said,
that seems like a pretty performant box to just
be doing backups. Wes, do you think you'll have
other plans for this thing? Yeah, we got it, right?
Something tells me we'll find some use cases. That's kind of
why I was thinking Proxmox. Because you could have a lot of the storage and like a Tumbleweed VM, and then you slice off like the other stuff. Because the other box, the one we're not using, it has four Xeons in it, and then like 700 gigs of RAM or something. Just a crazy amount of RAM.
Something tells me it doesn't need all of those for copying bytes off the network.
Especially when they're going to be coming over
at Comcast Connection.
So it does make me think,
one of the things Wes and I were talking about
is offloading the CPU video encoding
that we do in the cloud right now
to one of those boxes.
They also, these Dells let you set power caps,
which is fascinating.
So you can say, just don't let the system use more than 600 watts or something.
That's pretty nice.
I mean, maybe you don't always want that,
but it's cool to have some of these more, you know,
customizable options down at the firmware level.
I noticed an option in there while we were playing
to cap it by BTUs as well, which I thought was really fascinating.
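Those caps can usually be poked at from the OS side too; here's a rough sketch using ipmitool's DCMI commands, though whether a given BMC exposes all of them varies:

```bash
# Read current power draw and any configured cap (DCMI support varies by BMC).
ipmitool dcmi power reading
ipmitool dcmi power get_limit

# Set and activate a 600 watt cap, matching the example above.
ipmitool dcmi power set_limit limit 600
ipmitool dcmi power activate
```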
Yeah. I gotta say, the
SSD box is quiet enough
that you could have it on your desk when it's not under load.
That's been pretty surprising too, yeah. It's also big enough
to be your desk. That's true.
It is a monster, and they're heavy.
That is a thing.
But it's up and running. The migration
is in progress. We're starting with
network test pinging applications,
stuff that just is really simple,
but has obvious application data history.
So we can verify that the migration worked correctly.
And then we'll start moving over
to the more integrated tooling,
things like NextCloud and our production tools
and workflow things and media stuff.
And I imagine we'll have it probably done
within a couple of weeks.
I mean, we're not in a rush,
but once you get one or two of them going,
it's just sort of rinse and repeat.
Yeah, definitely.
All right, then.
How about just a spot of housekeeping?
Thank you to our members, UnpluggedCore.com
and our network members at Jupiter.Party.
You get the 100% ad-free feed or the live feed,
which has a lot more show.
You want more Wes Payne?
That's where you get it.
Right there.
Right there.
So thank you to our members.
You get all our screw-ups.
So you're welcome.
If you have feedback or ideas,
segment ideas for the show,
linuxunplugged.com slash contact.
You can join our growing Matrix community
at linuxunplugged.com slash matrix.
Our Matrix server is colony.jupiterbroadcasting.com. And, you know, if Matrix isn't your thing, we have a Telegram group. It's always going: jupiterbroadcasting.com slash telegram. We'd love to have you there. And don't forget, our virtual lug meets on Sundays before the show, and we're going to be meeting on Tuesdays at the old show time too. So go get your Mumble set up, which is actually Jitsi.
Practice both, right?
Get mumble installed. Maybe get your mic working.
For goodness sake. It's 2022.
Probably also install an audio editor, something
you can record your mic and then listen back to and make sure you
sound alright. Yeah, Audacity. There you go.
It's free. Try it out.
Goodness sakes, 2022. Goodness sakes.
We've gotten a lot of great feedback.
Here's a quick one that we can get into.
Bedrick wrote in, he says,
Dear Linux Unplugged,
You've talked about monetizing Linux desktop apps in Linux Unplugged 438.
I would like to pay for applications I use,
but I personally dislike the feeling of being forced to.
My New Year's resolution is yet again to go over the open source projects I use and set up donations.
What I would personally like to see is a tool that makes it easier to donate without the hassle with individual project payment details.
I imagine a GNOME Software welcome screen to set up credit card details, as a Flathub account or something similar, and an autopay option with an opt-out possibility. This autopay option would, by default, regularly pay the recommended price for all apps installed on my system that satisfy a
few defined conditions. One, the app is already installed for some minimum amount of time,
so longer than a trial period, for instance, and is used frequently enough to be useful, like at least once a year or something.
Some additional configuration would be appreciated, like global and per-app configuration for the
price as an absolute value or a factor of the recommended price, configuration of the trial
period and usage frequency, and some payment breakdowns. I feel like the Linux desktop community
needs to feel in control and not be forced to.
You know, I think we've seen various takes
on some pieces of that.
Nothing that's quite got all of it, but none really seems to have taken off.
Yeah, Open Collective comes to mind,
but it doesn't quite do it at this level.
The days of Flattr, was it?
Yeah, yeah, right.
There's a few out there.
And I don't think I'd want something,
much like he doesn't want to feel forced to pay.
I don't know if I agree with that take,
but I understand the sentiment.
I wouldn't really want this to be a GNOME software solution.
Try as I might, I still struggle to use GNOME software,
and I wouldn't want it to be something
that only could be done in GNOME software.
Now, Chris, you and I this morning were throwing around ideas around podcast networks and how they could monetize shows.
Do you see any of that being applicable in this case?
So next week, Dave Jones is going to join us on the show.
And Dave works with the Podcast Index and podcasting 2.0 initiatives.
And one of the things Dave's going to talk to us about is how they use the Bitcoin lightning network to pay content creators.
And the idea is roughly,
he's going to tell us more about it,
but from my rough understanding,
you say you have a podcast app,
an open source podcast app,
and it's got a Bitcoin wallet in there. And what you do is you throw like 30 bucks of satoshis in that wallet that's built into the podcast catcher. And then you go in there and you say, you know, you send this amount to Jupiter Broadcasting, and you send this amount to, you know, name your favorite podcast, kind of like you might with a Patreon, where you can subscribe at different tiers. And then what the app does is it uses the Lightning Network to kind of stream satoshis to the creator as you listen. So you set a threshold and you set a total amount, and then, based on your listening, it funds the creators you're actively listening to. Which I think is a really interesting thing, because it's part of what they call a value-for-value model, and it could be applied to free software as well, I would think.
I don't know if people want to pay by their hour usage of software,
but there could be something in there that says,
hey, you know, you used this editor for eight hours this week
or this month or this year.
Or you launched it 28 times this month, something like that.
Would you like to, you know, send the average amount?
Yeah, ways to lower that barrier in a practical way.
Because, yeah, many of us mean well, right?
But even when you are intentional enough to maybe think like,
oh, yeah, year-end things, I want to donate and such.
You know, you might think about nonprofits, New Year's,
but how often are you also thinking of software?
Like, it just doesn't always easily happen if you don't actively do it.
I like this idea of the Bitcoin wallet in the podcast app using the Lightning Network because it's all free software.
The entire stack, even the payment system is free software.
All of it.
And yeah, there's scammers out there that use this same stuff, but that's true for cash and everything else. And so there's something interesting there about monetizing free software with a free software currency that tends to grow in value over time and doing it in a way that is transparent because
everything on the blockchain is transparent. And so the project by the very use of a blockchain
payment system is held accountable or a podcast network is held accountable, right?
Because it's all of the accounting
is there on the blockchain.
And there's something about free currency,
free software, and a public ledger
that seems like there's probably some kind of,
as they would say, synergy
that we just haven't really fully explored yet.
Do you feel like these tools are ready
for prime time action,
like the Lightning Network?
Oh, sure, yeah.
So well-tested and totally ready for us to take it on full speed?
Yeah, I mean, like with the podcasting index,
that monetization system's been in practice for a bit now.
Nice.
Yeah, so maybe.
You never know.
We could see something there or nothing, absolutely.
It all happens.
And we still struggle to pay free software developers.
Yeah, I mean, I think you're just trying to lobby to get people to implement your prediction.
That's what's going on.
Oh, yeah, that's right.
Right.
Although I didn't work NFTs in. You see, that was, yeah.
All right.
So I have a pick that I feel like is the kind of pick you only really do on the show right after the holidays where people may have gotten a Kindle. So what I wanted to link in the show notes, and I know it's not for
everybody, but it's a Kindle comic converter. It'll take like a JPEG or a PNG or a PDF of your
comic that you might be reading, probably a Star Trek comic, and then it will convert it to a
format that is readable on your new Kindle that you got without having to go through like some
Kindle cloud or Amazon authorization. And it's simply called Kindle Comic Converter. We'll have a link to the website, or we'll have a link to the Flathub as well. Isn't the Kindle just one of
those things like so many of those that were, you know, I'm sure there's plenty of open source
behind it and the device works pretty, pretty nicely. I have one, I use it. I like eBooks, but
it's just, I'm so grateful for open
source things like this that can put you a little bit more in
control because otherwise it's really very much
feels like a locked-in ecosystem. I wonder if
I could find our old Linux Action Show
Kindle review. That was pretty rough.
Linux Action Show is the very first
Kindle.
Yep, there it is. 13
years ago. There's us. I'm going to put a link to
this
into the show notes.
It's only two minutes long.
I think it was an outtake, so it's not the full review.
But it's pretty bad.
I'll put a link to that in the show notes.
Oh, boy, I don't think I've seen this one.
12 years ago, we were talking about the very first Kindle on the Linux Action Show.
Season 2.
Yeah, that was just after we started calling it Seasons,
which I think was around 100 episodes, I think.
I don't remember for sure.
So it was like 100 episodes in we started.
The whole thing is...
Definitely a system that makes sense.
Very foggy.
It's very, very foggy.
I have to be honest with you.
Well, thankfully you documented it, so you can review it later.
I look back and I go, who are those people?
Who are they?
What are they talking about?
Huh.
And every now and then I find myself agreeing with my past self and I realize
oh, I'm just like a robot. If you give me
the same input, I pretty much give you the same output.
Is that a good sign or a bad sign? I don't know.
I guess you're consistent. Yeah, I suppose on some
things. Although you can find some reviews
of me really dogging on ButterFS and
Fedora, so some things do change.
And more recent ones about SUSE, now that I think about it. If you'd like more
show, remember that we do have a lug that gets together every Sunday and now on Tuesdays at noon Pacific.
If you're in the tech industry, make sure you're not missing LinuxActionNews.com.
There's a lot going on there that doesn't make it into this show.
But now, we have a new Same Bat Time.
It's Sundays now at noon Pacific, 3 p.m. Eastern.
See you next week.
Same Bat Time, Same Bat Station.
There you have it.
Of course, the show will just kind of come out like a little bit earlier in the feed, so for most of you out there, just a little bit sooner. Linux Action News will be towards the end of your week now, so make sure you're at linuxactionnews.com slash subscribe for that. Links to the things that we talked about, which is not much, not much, because it's, you know, it's a special episode. But we'll have those at linuxunplugged.com slash 439. Yeah?
Anything else, gentlemen?
No.
It's been good.
Good work, guys.
We got ourselves a new server.
Yeah, and a new day.
A new year.
All right, everybody.
Thanks so much for joining us on this week's episode of the Unplugged program.
See you right back here next Sunday. All right, jbtitles.com.
Let's go vote, everybody.
Go pick your title.
You got something there, bro? What you got?
Throughout all this discussion, I got really curious. Two things, I think. Okay, uh, maybe I'll throw them at you one at a time.
Go for it.
Okay, number one: what's gonna happen to the old server?
Uh, well, we've had someone who's requested it, so we might donate it to a listener.
Well, that's nice. It's like a full circle kind of thing.
Yeah, yeah, I like that. Yeah. So that's sort of plan A.
Although if they don't want it, then I don't know.
I haven't really figured out plan B yet.
Right.
But that's plan A.
Okay.
I like plan A.
All right.
Let's question two.
Okay.
Question over here.
It's a good question.
How's Arch been?
What's the debrief?
Like, are you feeling like it did what it needed to?
We should have done an Arch exit interview.
Yeah, really. We should have done that, because I think it worked better than we expected.
And we thought it'd be okay.
But I think it was solid as hell.
And it's not going out of production because of any Arch issue.
It's because the hardware's dying.
Yeah, really, right.
We could have kept this train rolling.
The will of the community in our vote.
Yeah, and stepping up and saying, yes, you're
going to go this direction in the polls, so we're doing it. Even though the lizards may have
conspired to make sure the poll went their direction. Yeah. I will say, you know, it was
one of the, I've had a few personal Arch servers or similar type things in the past, but I think
this was the least, the lowest touch of those, you know, which is not generally necessarily maybe
what you think of in the Arch setup. So it was kind of a test of that kind of stability as well,
because it was, you know, maybe monthly
or maybe sometimes a little less on those updates that we did.
Someone can do the math.
But yeah, I think that worked pretty well too.
I mean, I think just to recap, what made it work well was, we say it all the time, but ButterFS on the root and not ZFS, because when the ZFS module, a couple of, maybe three times on us, failed to build or install correctly, we could still boot the system. That could have been catastrophic otherwise.
You fixed a little timing issue, so that way ZFS got mounted before the containers start.
Oh yeah, just making sure that was going to play nicely together.
Because that's the thing about Arch, there's no distro maintainer,
kind of making sure that works.
We had to figure that out on our own,
because sometimes the Docker containers would start, and the pool wouldn't be available, and they'd be like, where's my storage?
And then things would be broken.
But the applications would be running, so it's not immediately available or obvious.
Yeah, but like, you know, we tweaked that ages ago.
Yeah, it just stayed.
It just worked.
It was never a problem again.
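Our actual tweak back then may have looked a little different, but a common way to express that ordering is a small systemd drop-in that makes Docker wait for the ZFS mounts:

```bash
# Make docker.service start only after ZFS datasets are mounted.
# zfs-mount.service is shipped by OpenZFS; the drop-in path is the standard override location.
sudo mkdir -p /etc/systemd/system/docker.service.d
sudo tee /etc/systemd/system/docker.service.d/wait-for-zfs.conf <<'EOF' >/dev/null
[Unit]
Requires=zfs-mount.service
After=zfs-mount.service
EOF
sudo systemctl daemon-reload
```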
And then I think the LTS kernel helped.
We weren't taking advantage of the cutting edge kernel features.
Not in that box.
It got updated more than I expected, but it was point updates.
And then I think obviously having our applications in containers
and not like installed from the AUR or the Arch repository
meant that the OS could pretty much roll independently
and not affect the applications.
And so we didn't have things like AUR installs breaking
that no longer updated or anything like that.
And I think that also contributed to being a successful Archbox.
Are we going to do some occasional, like, zypper live updates?
Oh, yeah, I think we should because, I mean, the whole thing is holding it accountable.
And, you know, not sugarcoating it if it breaks, and not hiding it if it's been a big success. And if it's been really great, you know, we have to own that. We own it either way. I think we do it publicly as, you know, sort of an example
of how to run something. So you don't have to, right. It's sort of how I always looked at it.
Like we'll try this and we'll try it with something that really matters to us. So that way
you listening can do, do it the right way and don't do like we do.
And I think there's some value in that.
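For reference, the Tumbleweed equivalent of those live Arch updates is the distribution upgrade command; this is just standard zypper usage, not a transcript of anything we ran on air:

```bash
# Refresh repositories, then apply a full Tumbleweed distribution upgrade.
sudo zypper refresh
sudo zypper dup
```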
We've gotten quite a bit of feedback
in the last several months
of people saying,
oh yeah, I remember, you know,
when I first caught
some Linux Unplugged
and the thing that really grabbed me
was you guys doing live Arch updates.
We've gotten that so many times
and it feels like it's both memorable
and really valuable, perhaps. We'll see. Hopefully it's also fun and valuable with the SUSE setup. And I could imagine, if it actually failed on us, it would like wreck the rest of the show. Like, what are we gonna do, just keep on doing the show? That's a bummer. Yeah, yeah,
we're bummed out. At worst, we stop the show and fix the server, and at best, we keep going, knowing the server is busted and we got a big job after the show.
Wes will just do his tap dance routine to fill the extra waves that we gotta fill for the show.
You know, we should probably just set up a mic in the garage directly already. One of us just moves out there.
You better get your finger on the beat button, right?