Embedded - 141: Malevolent and Trying to Trick You

Starting point is 00:00:00 This is Embedded FM. I'm Elecia White alongside Christopher White. We're going to talk about Linux kernel hacking today. Yes, you heard me right. Kernel tools and how they make your application development better. Julia Evans is here to help us because you all know how I feel about operating systems, even RTOSes. One thing first, our multimedia empire is growing. Please check out our blog at embedded.fm/blog.

Starting point is 00:00:34 It's cleverly hidden that way. I put in a second entry about taking toys apart. Andrei Chichak and Chris Svec continued their respective introductions to embedded systems. And Christopher White unboxed Particle's cell modem platform, the Electron. So RSS should work. If you have trouble, let us know. And I will get a newsletter together very soon. Okay, now on to Julia.

Starting point is 00:01:00 Hi, Julia. Thanks for joining us this evening. Hey, thanks for having me. Could you tell us about yourself? Yeah. So I have been a software developer for a while. And at some point, I got really excited about operating systems. And I decided that I wanted to learn way more about them. So I started a blog where I write about things that I don't know about. And then eventually I learned about them. And then I tell other people what I learned like yesterday. It's instructive to hear what problems you're having as you're having them.

Starting point is 00:01:39 Yeah. One of the things that you write about a lot in your blog is strace. Yeah. strace is my favorite program in the universe. That's a little odd, right? I mean, you recognize it's a little odd? No, it seems really normal to me. Okay. Yes, and Christopher's reminding me that I forgot a whole section here that always goes at the top. You'd think I would learn this. We've got to do lightning round. Okay. Okay, so lightning round is where we ask you questions and look for one-word answers. You don't have to explain.

Starting point is 00:02:21 Um, probably, unless we really want you to. It's usually better when you disobey that, though. Yeah, that's true. Okay, Chris, do you have one? All right. I'll start off with what programming language do you think should be taught in the first CS course? Python? What is your favorite programming language?

Starting point is 00:02:42 No. No? I don't have one. Okay, that's fine. Okay. I wasn't sure if No was a new one because I've been losing track. I was like, Go, Rust, No. Sure, that totally sucks.

Starting point is 00:02:56 I would program in No a lot, I think. I think No is the best programming language. Favorite fictional robot? Oh, man. I wasn't prepared for this. That's fine. Um, you can pass. Hmm. There's a robot, but I can't remember its name. Okay. Like from... anyway, no. There's a science fiction story which I love. Tell us. I can't tell you anything about it. Like, um... What is it about? It's about these robots with feelings. Anyway, no. That's not nearly enough, but okay. Yeah. Have you seen the new Star Wars movie? No, I have not. Dogs or cats?

Starting point is 00:03:46 Cats. What is your second favorite operating system? OS X. What is the worst version control system you've ever used? I like that. None. Like not using version control. Okay.

Starting point is 00:04:05 Yeah, that's definitely the perfect answer. Yeah, but it's too perfect. Oh, well. Someday somebody's going to say git, and I'm going to ring a bell, and balloons will fall from the sky and all of that. What kind of phone do you have? I have an Android phone with a broken screen. Very popular model, yes. Favorite planet?

Starting point is 00:04:31 Earth. Oh, I guess that's the right answer. She's good at this, but she chooses answers I wouldn't expect. I don't even think Earth is on the list. It's not. Which came first, the chicken or the egg? What? The chicken? I agree with that. Hacking or engineering? Oh, goodness. It depends if I'm being paid or not.

Starting point is 00:05:00 Favorite conference to speak at? Bang Bang Con. What? Bang Bang Con. What is Bang Bang Con? Bang Bang Con is a conference about the delight and wonder of programming where all the talks are 10 minutes long. And the only rule is you have to have an exclamation mark in the title. New York City in May. All right. Yeah, the CFP is open right now.

Starting point is 00:05:29 I'll add a link to that in the show notes. Ten minutes isn't a lot to prepare for. Yes. Ten minutes, I could lie about anything. Exactly. Ten minutes, I could talk about nothing, for sure. That's kind of what I meant. I saw a video of you speaking at PyCon in 2015, where you did not, in fact, speak about Python at all.

Starting point is 00:05:55 What did you talk about? So, I talked about how understanding your operating systems better can help make you a better programmer. And the way I kind of approached it was I was like, well, suppose you have some Python programs. And they're slow for some reason. And you want to find out why without actually knowing Python or without interacting with Python in any way. And it turns out that you can totally do that because, so for example, one of the programs was slow because it was like waiting on a really slow

Starting point is 00:06:33 network connection, right? Like the server on the other end was just doing like a sleep 10 and it was like, I'm just going to wait 10 seconds and do nothing and not reply to you. And if you ask the operating system what the program is doing, it's like, well, it's waiting. And you're like, well, what is it waiting for? And you can kind of find out that it's waiting for a network connection without actually knowing that Python was doing that at all.

Starting point is 00:06:57 So I wanted to introduce people to those tools because I think it's really important to know. Well, one of the things you said that really caught my attention was debugging applications as though they're black boxes and you can't look inside of them. Yeah, right, exactly. And one of the reasons I like that so much, treating your applications like a black box, is because when you write a program, you've always kind of done something that you didn't expect, right? So you might as well assume that the person who wrote the program is malevolent and is trying to trick you. So if you ignore the source code and just observe what the program is doing without making a lot of assumptions about what you think it's doing, you can sometimes get a lot further in debugging it. And so do you use this on your own code or is it mostly looking at other

Starting point is 00:07:45 people's things and why they're not running as you'd expect? Definitely both. Like I use it a lot on my own code because when I'm debugging it, it means kind of by definition it's doing something I didn't expect. So I try to just go look at what it's doing. And yeah, like kind of pretend that the source code isn't there. And you usually run in a Linux system. That's right. And so all our conversation today is going to be more about Linux. And for the most part,

Starting point is 00:08:15 the show tends to avoid operating systems. But many of these same tools exist in other operating systems. Some of them do. I mean, so if you're debugging a program by asking your operating system what that program is doing, that's kind of like the definition

Starting point is 00:08:31 of an operating system-specific method, right? So a lot of the tools that I've learned about are only for Linux or only for OS X. And that makes sense. And so that brings us back to strace. Yeah. Assuming I have never seen it before, what does strace do and why would I want to see it?

Starting point is 00:08:56 Trace is S's. I know, big S's on the screen. Yeah. So to understand what strace does, I need to talk about what a system call is, which I'm sure you both know. But should I explain what a system call is? Yeah, go ahead. I mean... Yeah, a lot of the people who listen to the show don't use operating systems that much, or at least high-level ones. So it might. So if you're using an operating system, I guess it's kind of more obvious if you think about it from an embedded point of view, because normally you don't have an operating system. So you have no way of doing things like sending data over a network, right? Like you don't expect to be able to easily send data over a network. Is that right?

Starting point is 00:09:43 Oh, we can, but easily is probably the key word there. Right. I mean, in an embedded system without an operating system, you're doing a lot of these things by hand. You probably aren't granted a socket interface normally or something like that. That'd be nice. Yeah, because I think people who write desktop software

Starting point is 00:10:04 or whatever are just like, oh, I can just send information over a network, there's no problem, right? Right. But then if you're an embedded developer, it's much more immediately obvious to you that there is a problem. So a system call is just, like, the operating system has all this code in it to do things like networking, or writing files to the disk. Or scheduling. Or scheduling, yeah. And a system call is just a way of asking the operating system to do that.

Starting point is 00:10:31 Right. So there's, like, one for reading, one for writing, one for opening a socket or sending information over that socket. And strace just tells you every system call your program is making. So if I write a program, I don't know, foo, I can strace foo. This isn't like top, where I get straces of everything that's running, right? Yeah, you just get foo. Okay. And so I strace foo, and then foo, like, has an infinite loop or something. And strace would tell me I was stuck in my while-waiting-for-file-ready loop.

Starting point is 00:11:11 Right. So if you just had an infinite loop that was like while true repeat, strace actually wouldn't tell you anything, because there you're just running instructions and you're not talking to the operating system. Right, there's no system call there. Yeah, because there's no system calls. But if you were polling a file, then it would tell you... If you were polling whether or not a file existed every second, then every second strace would print...

Starting point is 00:11:38 You tried to check if that file was... You ran fstat or whatever every second. Doesn't that usually give you a giant wall of text? The advantage to having an operating system is you get to call all these things. Wouldn't it be too big? So it turns out that it's not always too big. So this is what I think is amazing about strace. I can tell you an example from the train once.
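
The contrast between a busy loop and a polling loop can be sketched in Python. This is only an illustration under stated assumptions: the function names `spin` and `wait_for_file` are made up here, and the exact system calls a tracer would report depend on the platform and Python version.

```python
import os
import time


def spin(iterations):
    """Pure computation: the loop body only runs CPU instructions, so a
    system-call tracer would have essentially nothing to report for it."""
    total = 0
    for i in range(iterations):
        total += i
    return total


def wait_for_file(path, interval=1.0, max_checks=5):
    """Poll for a file. Each os.path.exists() asks the kernel about the
    path (a stat-family system call), so a tracer would show one such
    call per check, plus whatever call time.sleep() uses to wait."""
    for check in range(1, max_checks + 1):
        if os.path.exists(path):
            return check  # number of checks it took to find the file
        time.sleep(interval)
    return None  # gave up without ever seeing the file
```

Running a script that calls `wait_for_file` under strace should show the repeated existence checks, while one that only calls `spin` should stay quiet after interpreter startup.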

Starting point is 00:12:01 Because sometimes when I'm on a train or on a plane, I'll just strace things for fun and see what happens. Hopefully not things running on the plane. strace navigation. I'm not that kind of hacker. Right. So there's the kill program on Linux. Which like will kill a program. So I wanted to know how it worked.

Starting point is 00:12:39 And I think I actually wanted to know how pkill worked. So pkill is this thing where you're like pkill Firefox and it'll find all the processes named Firefox and it'll kill them all. So I wanted to know how this worked. And I was like, well, what if I just ask strace how it works? So I straced it. And then what you see when you strace is it opens every single... So on Linux, processes each have a PID, and those correspond to a directory in /proc, so like /proc/5/whatever. And so it opened every single directory in /proc,

Starting point is 00:13:12 and then you could see, so it would open it, move on, open one, move on, open one, move on, and then every so often it would kill one, right? So it'd be like open, open, open, open, open, open, open, kill. Open, open, open, open, open, open, read, read, kill. Open, open, open, open, open, kill. So you can see what it's doing is just going through these files and opening them and then finding out if that process is called Firefox, and then it kills one once it finds the right one. Sort of like playing duck, duck, goose. Yeah, kind of. And then you can see exactly what it's doing from the system calls.
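
The open, open, open, kill pattern described here can be approximated in a few lines of Python. This is a hedged sketch, not how pkill is actually implemented: the function name `find_pids_by_name` is hypothetical, and it assumes the Linux layout where each process has a numeric directory under /proc containing a `comm` file with the command name.

```python
import os


def find_pids_by_name(name, proc_root="/proc"):
    """Return PIDs whose command name equals `name`, found by walking the
    numeric directories under proc_root (assumed Linux /proc layout)."""
    matches = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # only numeric directories correspond to processes
        try:
            # One open + read per process, like the pattern in the trace.
            with open(os.path.join(proc_root, entry, "comm")) as f:
                if f.read().strip() == name:
                    matches.append(int(entry))
        except OSError:
            continue  # process exited in the meantime, or is unreadable
    return sorted(matches)
```

A pkill-like tool would then send a signal to each match, for example with `os.kill(pid, signal.SIGTERM)`; that step is left out of the sketch so it is safe to run.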

Starting point is 00:13:47 That makes sense. And it seems like it would help with debugging. Mm-hmm. But you really like strace. I do really like strace. And you like oscilloscopes. It's an operating system oscilloscope. That it is, in fact. Okay, so one example you used in your talk was about figuring out which configuration file Chrome opened, which is interesting to me because I can never figure out which config file

Starting point is 00:14:26 is currently trashing my build. Right. How do you do that? So what you do, I did this today, actually, because I edited a file and I couldn't remember which file I edited. So you run strace, you tell it to look for the open system call. So look at when it's opening a file. And so if I strace Google Chrome and ask it for every file it's opened, then I can find all of Google Chrome's configuration files because it needs to open them all.
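
One way to script that kind of run is sketched below. The helper names `strace_open_command` and `trace_opens` are made up for illustration, and the sketch assumes strace is installed; it traces both `open` and `openat` because either may be used to open files, depending on the system.

```python
import subprocess


def strace_open_command(program_args, output_path):
    """Build an strace command line that logs only file-opening system
    calls made by the program (and its children) to output_path."""
    return [
        "strace",
        "-f",                       # follow child processes too
        "-e", "trace=open,openat",  # report only open-style calls
        "-o", output_path,          # write the trace to a file
        *program_args,
    ]


def trace_opens(program_args, output_path):
    """Run the program under strace and return its exit status.
    Assumes strace is available on PATH."""
    command = strace_open_command(program_args, output_path)
    return subprocess.run(command).returncode
```

After something like `trace_opens(["ls", "/tmp"], "trace.txt")`, the output file lists every path the program tried to open, which is where the configuration files show up.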

Starting point is 00:14:59 And that's it. And it can't, like, lie to you, right? Like, documentation can. That seems like cheating. Yeah. All right. I have no problem with cheating if it helps you get your job done better. One of the other tools you talked about was perf, which is one I'm a little more familiar with. But can you explain perf? Yeah, but I really want to know how you use perf.

Starting point is 00:15:29 I use perf a lot like I use top, which is to say I use it to figure out what is currently using too much of my processor. Do you use perf top? No. I just learned about perf top, which is why I asked that. Oh, okay, good. I didn't know if I was using the wrong thing all the time. Okay. So tell us about perf and then perf top. Yeah. So perf is kind of complicated, so I actually want to talk about perf top first, because it's my current favorite thing. So perf top... so top tells you which programs are running on your computer.

Starting point is 00:16:10 And perf top tells you which C functions, basically, are currently being used the most. And I was kind of surprised at how useful this is. So I was on a machine the other day, which was super, super slow. And I wanted to know what it was doing, right? Like it was spending all this time on the CPU. And I was like, what are you even doing, program? This is dumb.

Starting point is 00:16:38 And so I looked at it and it was a Node program. And like every single function that it was spending the most time in looked like garbage collection. It was like mark and sweep. And I was like, I know what mark and sweep is. It means like garbage collection. So then I could see like instantly

Starting point is 00:16:56 that that entire machine was busy because it was spending all of its time garbage collecting. So doesn't that require the symbols to be built with the, I mean, if you have an executable that's stripped, can it tell you anything? Um, all system

Starting point is 00:17:09 calls? Well, that was strace, but perf top tells you... Yeah, perf top is about C functions. Um, so yeah,

Starting point is 00:17:15 you're totally right. Like that can be a problem. Okay. Yeah. Um, but some, but often it's not.

Starting point is 00:17:21 So you just kind of try it out and see if you get lucky. But if I wanted to prevent you from looking, that would be the thing. Yeah, you totally could. Which I guess if you were debugging your own code, that would be less useful.

Starting point is 00:17:34 Um, okay. So perf top tells you all of the C functions that are most in use. So if you did perf top on the pkill app that we just talked about, the pkill executable, it would tell you that you used open a lot. fopen. Yeah, well, it might not. Because perf is all about CPU time. And a system call like open isn't really that expensive in terms of CPU time.

Starting point is 00:18:15 So it might not think that that was very important. So if you wrote a dumb version of pkill that every time you opened a file, you read the whole thing, then you might start getting, well, okay, I can totally design bad programs. I'm good at that. Yeah, but no, that's totally right. And there you would probably end up seeing, well, yeah, yeah. Like if you read the whole thing and then iterated through the file like 10 times, yeah, you would probably see something to do with that. But you might end up seeing it being like some kind of string function. Right. Yeah. Yeah. And so perf top... Okay, how is perf different? So another thing you can do with perf

Starting point is 00:19:07 is annotate, like, assembly code, which was really surprising to me. So I at some point tried to write a program that was fast. And I wanted to see how many bytes I could add up in a second. So I had like a six gigabyte file and I was like, can I add up all six gigabytes in a second or not? And the answer turned out to be yes, you can totally add up six gigabytes in a second, which was a little bit surprising to me, because six gigabytes is sort of a lot, and a second is not that much time.

Starting point is 00:19:38 Anyway, so I wrote a C program to do it, and then I compiled it, and then I was like, perf, which instructions did that program spend the most time on? And then it could totally tell me, which was very surprising to me. It's a performance analyzer. Christopher's looking at me like... Well, I'm wondering. I know how profilers tend to work in IDEs, and I know how strace works, but this is just something that attaches to any running executable. So I'm curious how it's doing its job. Well, either it is statistical sampling
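
The shape of that byte-summing experiment can be sketched as follows. This is Python rather than the C program discussed in the conversation, so it shows the logic only and should not be expected to reach the same speed; the function name `sum_bytes` and the 1 MiB chunk size are arbitrary choices made for this example.

```python
def sum_bytes(path, chunk_size=1 << 20):
    """Add up every byte in a file, reading it in fixed-size chunks so
    the whole file never has to fit in memory at once."""
    total = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break  # end of file
            total += sum(chunk)  # iterating over bytes yields ints 0..255
    return total
```

Timing a call with `time.perf_counter()` before and after gives a rough bytes-per-second figure to compare against expectations.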

Starting point is 00:20:12 where it is running and then every tiny increment it interrupts and goes and it looks at the program counter. And then over time it just ticks which program counters, it finds what function your program counter is in, and then it will just plus plus to that function. And then you get a statistical sampling of what you're doing. And I guess I could look on the Wikipedia page. No, but that sounds like a plausible answer.
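
The bookkeeping half of that explanation (take a sampled program counter, find which function it falls in, and increment that function's count) can be sketched like this. It is a toy model, not perf's implementation: the addresses are made up, `attribute_samples` is a hypothetical name, and each function is assumed to extend from its start address up to the start of the next one.

```python
import bisect
from collections import Counter


def attribute_samples(samples, symbols):
    """Tally sampled program-counter values per function.

    `symbols` is a list of (start_address, name) pairs. A sample belongs
    to the function with the greatest start address not above it.
    """
    table = sorted(symbols)
    starts = [start for start, _ in table]
    counts = Counter()
    for pc in samples:
        i = bisect.bisect_right(starts, pc) - 1
        name = table[i][1] if i >= 0 else "[unknown]"
        counts[name] += 1  # the "plus plus to that function" step
    return counts
```

With enough samples, the functions with the largest counts are the ones where the program spends most of its time. Without a symbol table, the same tally could still be kept per raw address.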

Starting point is 00:20:42 Yeah, that sounds totally right to me. Yeah, it's statistical profiling of the entire system. And it does support hardware performance counters, which means it also supports the other way of doing performance. Which is to put little

Starting point is 00:20:59 breaks in and say... Right, that's what I was wondering. Did you go through here? Right. And that way you have to support it more in the code, whereas statistical you don't. But it is interrupting your program all the time, which has some runtime effects. Right. I doubt you added up the six gigabytes

Starting point is 00:21:17 while perf was running in a second. I did. Oh, it did? Yeah, so perf is actually pretty low overhead, which is the amazing and wonderful thing about it. It uses more RAM, but it doesn't use much processor because it just goes in, gets the program counter, goes right back out.

Starting point is 00:21:34 Yeah. It's a fast interrupt. Yeah. I'm still a little unclear on the mechanism. Like, I know that... So it's something in the Linux kernel, right, which lets perf work. Okay. Yeah, like, it wouldn't work unless something inside the Linux kernel was cooperating with it, I'm pretty sure. And

Starting point is 00:21:59 you mentioned that, uh, if you didn't hook in your symbols... Yeah, perf would be pretty boring because it would just go to each instruction line. It would be each program counter that it outputted. Right. It would still give you, now that I understand how it probably works, it would still give you some output. Which you could use if you're reverse engineering still. You'd say, okay, it's this address a lot. Let's go look at the disassembly and see what that might be doing. Yeah, because you don't need to look at the disassembly

Starting point is 00:22:26 when it isn't running those instructions very often. Right. Look at the good stuff. So have you seen instances where perf or strace or perf top cause problems as you're debugging? I mean, like the tool gets in the way of the issue? That's a fantastic question. So I haven't seen any problems with perf.

Starting point is 00:22:52 Like, my impression of perf so far is that it seems really low overhead. I run it on stuff. There's no problem. strace is totally different. strace can slow down your program by, like, 50 times if it has a lot of system calls. And that makes sense. I mean, if you put a gatekeeper into your system and he has to do that thing where you punch the button that says click, click, click, that takes time. Even if it seems fast, if you're doing it a lot, it takes time. And, like, this isn't

Starting point is 00:23:24 like a fundamental necessity. So there's a program called DTrace on OS X, which you might know, that can do basically exactly the same thing as strace, except it doesn't slow it down that much. Yeah, they have something called Instruments too, which is a GUI thing on top of it, I think. Yeah, that's totally right. And you can probe all kinds of activity.

Starting point is 00:23:48 Well, and then there's one more tool that I wanted to talk about, and that was top, because that one, if people don't know about, it's super useful. And that is, I mean, if you were in Windows, that would be your resource monitor. What is it in the Mac world? Top. Top. Top.

Starting point is 00:24:05 That's true. Well, on the command line. On the command line. But it just shows you what processes are using the most CPU, the most RAM, the most, I don't know, networking. And so it's super useful for all of the same reasons to figure out which of your many programs is causing the biggest problem. I don't know. I was headed off into, if you were using embedded Linux, these are the tools you want to make it tiny and

Starting point is 00:24:31 power efficient. Yeah. Yeah, especially since if you're doing embedded development, usually you own the whole stack, right? Or you have no operating system and you've written everything. You know what's running at all times because you've written it all. Whereas if you shift to embedded Linux or some larger operating system than just an RTOS, you have a lot of stuff running that you may not know is there. And so the performance of your system and the battery performance and all that kind of thing can be a little bit out of your hands.

Starting point is 00:25:02 I was surprised with one of my embedded Linux projects. It was going out to update a program that I thought I had deleted. It wasn't even that it was trying to run the program, it just was trying to run its little updater, and I was like, this is not a good use of your network. You don't have any network, cut it out. But I wouldn't have known that without knowing some of the tools. That was just top, though. I'm really good with that one.

Starting point is 00:25:26 Okay, so that's sort of why benchmarking matters in embedded systems. But you don't work in embedded systems. You work on larger things. You work on web stuff. Yeah. And I know you didn't want to talk about your job in particular. I just kind of wanted to settle where your code normally runs.

Starting point is 00:25:48 So it normally runs on Linux servers that are connected to the internet. Yeah, so like web stuff. And so why does benchmarking matter to you? So sometimes I have programs that are too slow, right? Like I need to handle a certain number of requests per second, sometimes my machine is not doing it and I need to know why. Requests per second. Okay, I'm going to just totally change subjects here. I was curious so I wanted to ask more about it. You have a blog post where you talk about requests per second and you call them QPSs?

Starting point is 00:26:26 Oh, like queries per second. It's the same thing. Okay. So you also gave a talk about this topic, about how computers are really fast, but only if you use them right. Yeah. And I think this is much more obvious to people who do embedded programming, because you have such a small computer and then you need to make the most out of it. Is that right? Oh, yeah. Oh, yeah. Well, and that's sort of where I was headed when in that post you talked about languages and QPSs. Yeah, right. So I guess, because I think it becomes not obvious when you're using

Starting point is 00:27:08 a large computer how fast it is at some point. I was talking to someone who was generating a blog using Ruby. And you can see these posts where someone's like, I'm trying to generate some HTML files and it takes like two minutes. And my reaction to that is like two minutes is a disaster, right? Like you're just generating some HTML. It should not take two minutes. But it's not, I think, always obvious to people who, I think especially, get further away from their computers and aren't doing things like embedded development, where they think about the cost of everything really carefully, how fast the computer is actually supposed to be, if that makes sense.

Starting point is 00:27:50 Oh, yeah, it totally makes sense. Because what I will totally not put up with for my device, I will, on the other hand, let Windows do its stupid updates and let it take like 30 minutes. I would never let my device do that nonsense on power off. Right, exactly. I'm locked into this. This is what it does and Chrome is sometimes slow and if I have all

Starting point is 00:28:12 15 programs I usually run, running at the same time, everything gets slow. Mm-hmm. Yeah. Chris is going to be like, I'm never helping you with your computer again. Yeah. Our programs that we write don't need to be slow, right?

Starting point is 00:28:26 Like Chrome can be slow, but not my program. One of the things you talked about was comparing languages. Yeah. I was fascinated by this because we've been talking a lot here and on the blog about languages and choosing and how to decide. Okay. Yeah. So I wanted to get some really ballpark estimates of how fast each programming language is, even though it's kind of a nonsense question, right? Because of course you're

Starting point is 00:28:58 going to have different workloads, and you're going to be doing different things depending on the language. But I was like, okay, let's say you're trying to serve HTTP requests, because I work on web software and I need to serve requests. And a kind of natural thing to talk about to me is how many requests can you serve in a second, right? And so I started Googling this, and it turns out that in C you can serve like five million requests per second, which when I first learned this I thought was a joke or a typo. Like I didn't realize that you could do that. A million is really a lot. And it turns out that in Java, I think you can also basically serve a million requests per second, which was also kind of

Starting point is 00:29:44 fascinating and surprising to me, because I was like, I thought Java was slow, but Java is not slow at all. Oh, in your blog you have Java at 50,000 and C at a million. Yeah, so that was kind of like I didn't want... I think you can do

Starting point is 00:30:00 a million, or almost a million. I wrote 50,000 because I was like, it's definitely at least 50,000. But Java is quite fast, like not a lot slower than C, which is surprising. Most people wouldn't think that. I mean, yeah, most people in this room wouldn't think that at all. Java, outside of probably the people who use it a lot, does not have a reputation for speed. Yeah, but once you get the just-in-time, right, compiling down, it can get fast, especially if you're doing something in the Java frameworks that gets more optimized.

Starting point is 00:30:35 Okay, but the other one you had on your blog post was Python. Oh yeah, right. So I was like, okay, C and Java are, like, fast programming languages, and Python is not. Python is not the same, right? Because with Java, you can write your Java code and it can get just-in-time compiled to, like, assembly, and then that's super fast and that's awesome. But Python is an interpreted language, so it's definitely going to be way slower. But so I found a benchmark and I was like, okay, Python can totally do like 10,000 requests per second, which is still quite a lot, right? Like most websites don't need anything faster than that. But orders of magnitude. Yeah. But it's like a lot slower, but still like way faster than like

Starting point is 00:31:14 than you might think, right? Because if you think like Python is slow, you're like, oh, does slow mean that it can do like five requests in a second? But it turns out that it means you can actually still do like tons. So in the beginning of the show, in Lightning Round, you didn't have a favorite programming language. With this, it seems like you should. I mean, C and Java are a lot faster than Python. We've established that for sure now in a line in the sand.
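
To make those ballpark rates more concrete, they can be turned into a time budget per request. The arithmetic below uses the blog-post figures mentioned in the conversation (C at a million, Java at 50,000, Python at 10,000 requests per second) and assumes, purely for illustration, a single worker handling requests one after another.

```python
def microseconds_per_request(requests_per_second):
    """Time available for each request if they are handled serially."""
    return 1_000_000 / requests_per_second


# Ballpark rates quoted in the discussion; not measurements made here.
ROUGH_RATES = {"C": 1_000_000, "Java": 50_000, "Python": 10_000}

for language, rate in ROUGH_RATES.items():
    budget = microseconds_per_request(rate)
    print(f"{language}: about {budget:g} microseconds per request")
```

Seen this way, even the slowest figure leaves only about a tenth of a millisecond per request, which is why 10,000 per second still counts as quite a lot.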

Starting point is 00:31:43 Yeah. So I really like knowing how things work. And part of the reason I find performance really interesting is because it gets you into really interesting questions about how things work. And if you ask why something is slow, the answer can turn out to be very interesting and very surprising. But I don't really care about programming languages very much. Like I care about programming, and I like to do things, and I like to understand how things work, and I like to understand how things work in every programming language. But it's not very important to me. I get that.

Starting point is 00:32:27 I totally get that because I feel that way sometimes too that people get so worked up about this language or that language or what's new and I do feel like once you understand fundamental programming concepts, you know, they all have their pluses and minuses.

Starting point is 00:32:44 I would not do, I mean, I've used NumPy and I've used Matlab and they're both interpreted, and I would not do visualization in C. Not unless it had to be super fast. I mean, it's just so much easier to program those in Python. Yeah. So languages all have pluses and minuses. Totally. Yeah. Like if I'm doing machine learning, I'll use Python and that's it.

Starting point is 00:33:09 Right. Like Python has like awesome like machine learning libraries. It's really good at scientific computing. And I would just use Python all day. Like, and I would never consider using anything else. Because your time, I mean, if you're running, even if you're running something that takes hours, you would rather it ran for those hours than you put in the six weeks it would take to make it faster. Yeah.

Starting point is 00:33:31 And also, it's already quite fast. Wait a minute. Stop. We're getting lots of noise. Uh-oh. Okay. How is it now? It's perfect now.

Starting point is 00:33:44 Oh, cool. Okay. I just won't move at all sorry sorry i'd prefer tapping to that okay yeah we'll see what happens okay um christopher will you cue us back cue you back what oh i i missed some of what jul Cue you back what? Oh, I missed some of what... Julia, do you remember what you were saying, or do you want to go back? I don't know what I was saying. I can play it back.

Starting point is 00:34:13 With NumPy and Python, NumPy is already super fast. Yeah. So it's funny that you don't have a strong preference about language, but you do have a really strong preference about tools. It's a different take, and I like that. But how did you fall in love with strace? How did you get this perspective that you wanted to get more into the tools? So the reason I find these tools exciting is that learning about them is a lot higher leverage for me than learning, for example, a new programming language. Because if I learn more about operating systems, every program I write has to interact with my operating system.

Starting point is 00:35:09 So anything I learn about operating systems will help me in all of my programming. Whereas if I just learn something about Ruby, it'll only help me when I'm writing Ruby. And sometimes learning one tool very well, it leads to using that tool for different solutions, your deep understanding. I mean, I guess it's sort of a, I have a hammer, look at all the nails. Yeah, exactly. But how did you choose which tools to dig into? Why these?

Starting point is 00:35:36 Why strace? Are there any that you've thrown away? Yeah, ones you got interested in and then decided, oh, you know, I'm not sure ls has all of the features I want. Yeah. So, yeah, I have kind of a research list of tools in my head that I'm trying to learn about. And most of the ones that I don't end up learning about are complicated, or seem difficult to me. So there's this thing called ftrace, which is a way to trace function calls inside the Linux kernel, which I think is really powerful, but it also has a fairly difficult interface, and so I haven't gotten into it yet. But strace, like the reason I love strace

Starting point is 00:36:26 is because its interface is so simple, right? And like the contract it provides is really simple. Like it's just like, I'm going to tell you system calls and only system calls. And so you know exactly what you're going to get and what you get is like pretty straightforward. And then it ends up being useful. So I guess I like things which are like relatively simple

Starting point is 00:36:44 and also where the information they're giving you is very useful. And that makes sense. One of the things you've talked about on your blog and in various talks is that you are sort of a beginner. You come at this from a beginner perspective. Even if you are no longer a beginner, you're still in that mindset of learning. But for many people it's a very chicken-and-egg problem, where you need to have enough breadth to know which tools are interesting and then immediately go into the depth of that tool, which means that, again, you might spend six weeks learning all of the command line variations for directory. Right. So I think this is a super interesting question: how do you learn something when you're a beginner at it? How do you balance it? Yeah. So the way I approach this, for better or for worse, is I spend

Starting point is 00:37:45 a fair amount of time learning about these tools just for fun. So I'll just strace things and see what happens, and there's no pressure that I need to debug this thing right now. I'll be like, oh, what happens if I strace this? Or, can I debug this with strace? And maybe the first 10 times the answer is no, right? I'll strace it and I'll be like, I have no idea what that meant at all. Okay, well, next. And I find just trying a tool over and over and over again, without too much expectation about whether or not it's going to teach me something... Practicing. Yeah. Yeah, I mean, mindful practicing. That's trying it out, experimenting. Yeah. Okay, but how do you choose which ones? Are there books or blogs or tutorials that you

Starting point is 00:38:37 go to first? I mean, you hit the man page? Uh, I don't. I almost never read man pages. Pages of wimps. People tell me things. Like, the way I learned about strace is I was asking a question on the... or I was just talking on the internet about something that I didn't understand, and someone was like, have you tried strace? And I was like, what's strace? And then it totally solved my problem. So this is kind of a weird thing, but yeah, sometimes I'll talk on my blog about something, or I'll talk on Twitter about something that I'm confused about or that I'm learning, and then people will just email me and tell me about a new tool, or they'll tweet at me. This actually is the beautiful thing about Twitter and even

Starting point is 00:39:26 blogs. I mean, we're already getting comments and it's been exciting there for us. When you go to your blog and try to write these things up, even as you're struggling with them, you tend to try to make it easy. You make the technical topics easier for people coming into it and really do a good job of explaining the groundwork. Why do you do that? I kind of can't imagine doing anything else. I think that things should be made as easy as they can be made. Yeah, I don't know. Often I watch talks and someone is explaining something and they make it sound really complicated, and I'm like, that's not that complicated, you could explain that in an easier way. So a lot of the things on my blog I tried to just explain to

Starting point is 00:40:30 myself like last week so like because I've already had things as I'm learning them so I'm like okay this is what I knew last week um and this is what I don't know and then I try to just like cover the things in between. So that Julia last week would have understood. Yes. That's a very, very reasonable way to do it. I often say I am writing for me in the future a year when I'm doing documentation on my code. Because future me isn't dumb, but she's totally forgotten everything about this. Yeah.

Starting point is 00:41:07 So writing for past, a week ago you, makes a lot of sense. And when I worked on my book, I was writing for me, but years ago. Yeah. New college grad me. Ah, do you have advice for debugging? Oh, yeah. So I wrote this blog post about this, which I really like. Not really advice, but more like things that made me better at debugging.

Starting point is 00:41:41 Because I used to be really bad at debugging, I think. Like, I would have bugs in my program and I would just be sad, and I was like, this sucks, why does my program not work? This is not fun. I just want it to work. And the frustration and the anger and the horror and the fear that people are going to find out that this is all that you are. Yeah, I'm familiar with these things. Go ahead. Yeah. And now I really like it. And I was thinking about, like, why do I like debugging now? Why did I hate it before? And to me, one of the scariest things about debugging is this fear that you're never going to find the bug, and it's going to be there forever. Every day. Oops. Did I say that out loud? Yeah. And, like, as I, like, solve more bugs,

Starting point is 00:42:30 I think I find it easier to have the confidence that, like, I'm actually going to fix the bug. Like, no matter how weird it is. Like, right now at work, I have this bug that I don't really totally know why it is at all. And it's, like, quite complicated. And no one who I've talked to knows what it is either. But I still feel totally confident that I can solve it, and that next week I'll probably have it totally figured out. Right. Optimism. Optimism is a job skill. Yeah. Yeah.

Starting point is 00:42:58 Yeah. But there's always those ones that you can't reproduce, or only happen once a week, or involve like six computers, and... Sorry. No, totally. Even bugs with six computers that you can't reproduce that only happen once a week, those too are bugs that you can solve. That's true. That's true. Well, in your blog you say, "be unreasonably confident in my ability to fix the bug." I liked that a lot because, you know, it's so easy to be unreasonably unconfident. Yeah. And you really need it, right? Because otherwise, like, it's kind of reasonable to give up when you have this, like, non-deterministic bug.

Starting point is 00:43:40 Yeah. Yeah. And somebody breathing down your neck saying, you got to ship it, you got to ship it. Right. And if there's a lot of time factors, sometimes you have to give up and that's a reasonable choice. But I still try to hold on to the confidence that if I had a month, I could figure it out. Given an unlimited amount of time, I can solve things of unlimited complicatedness. What else do you have in your how you got better at debugging?

Starting point is 00:44:08 I feel like I need to look at it to remember. Yeah, it's always great when somebody starts asking you detailed questions about a blog you wrote like six months ago. And really, you know, yes. Oh, yeah. Okay, but now I have it. Oh, yeah. So one thing that really helps me is to remember that, like, there is logic behind why the bug is happening.

Starting point is 00:44:32 I find this really hard sometimes because it's easy to just say, like, oh, you know, computers, they just do things, which is not true at all, right? Like, computers are totally logical machines. And every time a bug happens, it's happening for like, like there were instructions that got executed and functions that got called with parameters and like, like it's always

Starting point is 00:44:53 totally logical and straight. And like, even if it doesn't seem like that. Yeah, I mean, there's several times a year where I have a bug and I look at it and my first thought is

Starting point is 00:45:02 that is not possible. This can't actually be happening. And yet of course it's happening. It's a computer. It's logical. Yeah. Yeah. So trying to hold on to like,

Starting point is 00:45:13 like, like, yeah, it's really useful to remind myself that it is in fact happening. And because I think it's impossible means that there's some assumption that I have that I need to let go of. But like,

Starting point is 00:45:29 I think that these two things where you like believe that it's logical and like you have some confidence are like not enough by themselves, even though like, I kind of wish that they were so like, there's some bugs that I can solve now that like when I just finished undergrad, I totally could not have ever solved because now I know a lot more. Knowledge is power. Yeah.

Starting point is 00:45:47 And that one's kind of hard to like grapple with because I want to believe that like I can solve everything, but sometimes I'll like learn a new piece of information and I'm like, oh, well, I was never going to figure out that bug without that piece of information ever. Okay,

Starting point is 00:46:02 good. Now I know it. Like next thing. Yeah. You always have to learn stuff for it. By the way, when I said knowledge is power, Christopher rolled his eyes so hard that I think they came out and rolled across the floor. So, yeah, I didn't mean, it wasn't giggling at you, I was totally laughing at him. So you also suggest talking to other people, which I think is great if only

Starting point is 00:46:26 you talk to an inanimate object. I'm still a huge proponent of that, where you just explain the problem and sometimes that is enough. Yeah. Yeah. Or sometimes they hold the key, right? The secret missing knowledge that you're missing. Or the person who said, have you tried strace? Yeah. Or the person who told me about strace. All right. Well, I think those are good pieces of advice for debugging. Also, to like it. You know, that's a good one too. If you're in constant

Starting point is 00:47:06 pain as you're trying to figure something out it's really not it makes it much harder than if you can let go of the angst and enjoy the puzzle well sometimes you have to yeah sometimes I get in the mindset that I'm a detective

Starting point is 00:47:22 and that's kind of a cool you know okay what are the clues? But then you realize that you're probably the murderer, which makes it a little harder. Both murderer and detective. Yes. Yes, indeed. One of the other things you talked about in your PyCon talk was hacking the kernel. I'm not sure you fulfilled that promise.

Starting point is 00:47:49 Did you talk about actually modifying the kernel? Not in that talk, no. In a different talk, I talked about modifying the kernel. I may have watched more than one talk so I may have them confused. Yeah. I gave this talk at Strange Loop in 2014 called You Can Be a Kernel Hacker

Starting point is 00:48:13 where I talked about how understanding the code in the Linux kernel is not actually impossible, which I think is a concept that's kind of familiar to embedded programmers, because you're used to working with the full stack of your program, and you're like, well, there's just code everywhere and I could understand all of it, I imagine. Sometimes we get to write all of the code ourselves, so we'd better understand it. Yes.

Starting point is 00:48:40 Right. So the code in the Linux kernel is just code that you can read and look at, which was kind of very surprising to me. I thought of it as this mystical thing, but it's not. And so when I started wanting to learn about the Linux kernel, I had this problem where I didn't know where to start or how to approach having fun with the Linux kernel. Because I was like, I'm not going to contribute to the Linux kernel, because that's a lot of work. And it's a very intense process. But it turns out that you can write Linux kernel modules, where you write your own code that gets inserted into the kernel. And then you can have fun and play around there, and that's really easy to get started with. So I did a couple

Starting point is 00:49:31 of fun things. One thing I did was I wrote a rootkit, where you make a backdoor into your kernel, where if you look at the right file, then you suddenly become root, which is, surprisingly, a relatively easy Linux kernel module project. And to get it to work you need to already be root on your system. Right. And then you should create an exploit so that anyone can become root. And then hopefully you remember to turn it off. So I did that, and then I wanted to give a talk about writing Linux kernel modules, and then my friend had this hilarious idea to write a kernel module which, every time you open a file, like an MP3 file, rickrolls you. So it plays Rick Astley instead. Never gonna give you up.

Starting point is 00:50:29 Which is also a pretty easy Linux kernel module to write, because you just need to override the open system call so that whenever it opens an MP3 file, it just sends you to Rick Astley instead. And it's really awesome to see, because you just go into Music Player and you start playing files, and you can't play any music anymore.

Starting point is 00:50:49 Fun with other people's computers. Yeah. And so my friend came up with this idea and I was preparing this talk and then my partner actually wrote the kernel module and he'd never written a Linux kernel module before, but he was just like, oh yeah, that sounds fun. I could do that. Well, and it does sound fun, but you also could think if you were making a BeagleBone Black

Starting point is 00:51:13 sort of widget, and every time you wanted to turn on some functionality, it was not working. Fine, write a kernel module. You don't have to send it out to the Linux people. We've all heard those are very mean people. Only from me.

Starting point is 00:51:32 But you can write a kernel module, stick it in, and it will debug for you if strace doesn't work out. Yeah. Yeah. And I really like this idea of like engaging with like software like independently of the community that develops it if it makes sense

Starting point is 00:51:50 you could be like best friends with like Linux have fun with Linux right right right well and this idea is kind of foreign to a lot of developers you know modifying dynamically modifying the operating system as it's running seems like a crazy idea,

Starting point is 00:52:08 or at least it does to me historically. Yeah. But it's not. I mean, it's a common, common thing now. You do it on macOS, you do it on Linux, and, you know, you can unload and load different functionality that extends the kernel, and you can do just about

Starting point is 00:52:25 anything you want, which is kind of amazing. Every time you plug in a new thumb drive. Right. Yeah. All right, I only have a couple more questions because I think we're about out of time and it's getting late for us. So I've heard you have a zine about strace, and I have a question about that. What is a zine? So a zine is like, it comes from the word fanzine. So it's sort of like a little magazine that you write yourself, often with this kind of photocopied, hand-drawn aesthetic to it. It's often associated with the riot grrrl movement in the 90s, but it doesn't need to be about feminism. It can also be about computers. Like, people often make fanzines about bands they like or something.

Starting point is 00:53:27 So I wrote a fanzine about strace and how much I love strace. That is pretty cool. I don't think there are any Linux commands I love that much. On the other hand, there might be some

Starting point is 00:53:41 microprocessors. I would love it if you wrote a microprocessor zine. Oh my God. Yeah. Wow. How about how a quadcopter works? Because that one is coming. Christopher, do you have any more questions?

Starting point is 00:53:59 I had something I thought of back when we were talking about strace in detail, and I didn't manage to ask. So this seems like a great way to reverse engineer stuff or at least get started reverse engineering stuff. But I can see why some developers might not want it to be

Starting point is 00:54:17 usable. Is there a way that somebody could block it out and say no you can't strace this executable or is it just nope that's just the way it works? I don't see how. No. Because you're asking, like when you're doing a system call,

Starting point is 00:54:34 you're asking the operating system to do the system call for you. And when you use strace, you're like, hey, operating system, please report to me back what happened. So yeah, I can't imagine any way to do that because it's not like, like you're not asking the program to tell you, you're asking the operating system.
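A minimal sketch of what "asking the operating system" looks like from a program's side, assuming Linux and Python 3. The function name `write_and_read_back` and the file name `demo.txt` are made up for illustration; the `os` functions used are thin wrappers over the corresponding system calls, which is the layer strace reports on.

```python
import os
import tempfile


def write_and_read_back(path: str, payload: bytes) -> bytes:
    """Do file I/O through low-level os calls that map onto system calls."""
    # os.open asks the kernel to open the file; under strace this typically
    # appears as an open or openat call, depending on the platform.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        os.write(fd, payload)  # write system call
    finally:
        os.close(fd)           # close system call

    fd = os.open(path, os.O_RDONLY)
    try:
        return os.read(fd, len(payload))  # read system call
    finally:
        os.close(fd)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        demo_path = os.path.join(tmp, "demo.txt")
        print(write_and_read_back(demo_path, b"hello from user space\n"))
```

Saved under a hypothetical name such as `syscalls_demo.py`, this could be run as `strace -e trace=openat,read,write,close python3 syscalls_demo.py`. The program never reports anything itself; the calls should show up in the trace anyway, mixed in with the interpreter's own startup activity.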

Starting point is 00:54:52 Now I want to repurpose all the system calls. So calling things that look like they do something but actually do them in so much less a way that they do something else. strace and kernel mods. You can have a really good time with these tools. Yeah. Yeah. Yeah.

Starting point is 00:55:07 Okay. Well, that's what I figured, but it's interesting to hear that. Yeah. Yeah. But there's another thing. So there's this environment variable called LD_PRELOAD, where you can say, like, let's say there's an internal function in a program that you want to, like, rewrite or, like, spy on. So like you're trying to reverse engineer something and you see this like

Starting point is 00:55:35 magic, awesome function. And you're like, what happens when you're in the magic, awesome function, you can kind of like write your own magic, awesome function that wraps it. And like tell the dynamic linker,

Starting point is 00:55:47 like, hey, when you want to call the function in this program, call my function instead. Oh my. Yeah. Like, which is amazing.
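A rough sketch of the launching side of that trick, assuming Linux with the usual dynamic linker and Python 3. The helper names `preload_env` and `run_with_shim` are hypothetical, and the wrapper library itself (normally a small C shared object that defines a function with the same name as the one being wrapped) is not shown here.

```python
import os
import subprocess
from typing import List, Mapping, Optional


def preload_env(shim_path: str,
                base_env: Optional[Mapping[str, str]] = None) -> dict:
    """Return a copy of the environment with shim_path first in LD_PRELOAD."""
    env = dict(os.environ if base_env is None else base_env)
    existing = env.get("LD_PRELOAD", "")
    # LD_PRELOAD holds a list of shared libraries separated by colons or spaces.
    env["LD_PRELOAD"] = f"{shim_path}:{existing}" if existing else shim_path
    return env


def run_with_shim(shim_path: str, argv: List[str]) -> int:
    """Run argv with the shim preloaded and return the exit status."""
    return subprocess.run(argv, env=preload_env(shim_path)).returncode
```

A call might look like `run_with_shim("/path/to/libmyshim.so", ["./some_program"])`, where both paths are placeholders. This only affects functions that the program reaches through the dynamic linker; a function that is not dynamically linked cannot be swapped out this way.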

Starting point is 00:55:56 Like the function that says check DRM. Yeah. For instance, you can just wrap that with a... never mind. Yeah, exactly. We won't talk about what I did to VxWorks once upon a time. Yeah, let's not. I really want to know.

Starting point is 00:56:12 I had an argument with the license server, so I might have done something to it. That's awesome. Right, so LD_PRELOAD is perfect. But there's this blog post that I read somewhere about how you can try to protect yourself from LD_PRELOAD, and then you can try to circumvent their protection. And then, like, there's this layer of LD_PRELOAD hacks and then protections from LD_PRELOAD hacks that went fairly deep.

Starting point is 00:56:37 And I don't remember who won in the end. Probably no one. Yeah. Julia, do you have any last thoughts you'd like to leave us with? I'm kind of left thinking that people who aren't embedded developers probably have a lot to learn from embedded developers. Oh, that's nice. It probably goes both ways.

Starting point is 00:57:03 Yes, it definitely goes both ways. One of the things I've been thinking about a lot lately is embedded programming has gotten a little ossified in spots. We're growing. We're learning. All right. My guest has been Julia Evans,

Starting point is 00:57:20 software engineer at Stripe, focused on machine learning. Her blog is jvns.ca, because she's Canadian. And her PyCon video about using operating system features for debugging will, of course, be in the show notes. Or you can just search on YouTube. Thank you, Julia, for joining us. Thanks so much for having me. And thank you to Christopher for producing and co-hosting. And thank you for listening.

Starting point is 00:57:50 We love it when you tell other people about our show. That is the nicest thing you can do for us. Although some of your emails are pretty nice too. If you'd like to contact us, hit the contact link on embedded.fm or email us show at embedded.fm. And now I have a final thought to leave you with. Now I know you were all wondering who said it initially. So Abraham Maslow in Towards a Psychology of Being. I suppose it is tempting if the only tool you have is a hammer, to treat everything as if it were a nail. Embedded FM is an independently produced radio show that focuses

Starting point is 00:58:31 on the many aspects of engineering. It is a production of Logical Elegance, an embedded software consulting company in California. If there are advertisements in the show, we did not put them there and do not receive any revenue from them. At this time, our sole sponsor remains Logical Elegance.
