PurePerformance - eBPF and the Superpowers it unleashes with Liz Rice

Episode Date: May 6, 2024

eBPF is a kernel technology enabling high-performance, low-overhead tools for networking, security and observability. In simpler terms: eBPF makes the kernel programmable! Tune in to this episode whether you have never heard about eBPF, are using eBPF-based tools such as bcc, Cilium, Falco, Tetragon, Inspector Gadget ... or are developing your own eBPF programs! Liz Rice, Chief Open Source Officer at Isovalent, kicks this episode off with a brief introduction of eBPF, explains how it works, which use cases it has enabled and why eBPF can truly give you superpowers! In our conversation we dive deeper into the performance aspects of eBPF: how and why tools like Cilium outperform classical network load balancers, how performance engineers can use it and how the kernel internally handles eBPF executions.
We discussed a lot of follow-up material. Here are all the relevant links:
Liz's slide deck on "Unleashing the kernel with eBPF": https://speakerdeck.com/lizrice/unleashing-the-kernel-with-ebpf
eBPF Documentary on YouTube: https://www.youtube.com/watch?v=Wb_vD3XZYOA
Learning eBPF GitHub repo accompanying her book: https://github.com/lizrice/learning-ebpf
eBPF website: https://ebpf.io
Liz on LinkedIn: https://www.linkedin.com/in/lizrice/

Transcript
Starting point is 00:00:00 It's time for Pure Performance! Get your stopwatches ready, it's time for Pure Performance with Andy Grabner and Brian Wilson. Hello everybody and welcome to another episode of Pure Performance. My name is Brian Wilson and as always I have with me my fantastic co-host Andy Grabner. How are you doing today, Andy? Very good, very good. And it's a unique opportunity for me today because it's the first time I believe I am broadcasting or podcasting or recording from my new apartment. I think you've been on drugs lately, Andy.
Starting point is 00:00:50 Earlier we were talking about whether we have ever mentioned eBPF, and you said no. I think it was last episode you were in your new apartment. If you want, you can lie to our audience members and tell them that and I'll go along with it.
Starting point is 00:01:06 Maybe I'm just so excited today to actually have a special guest with us in an episode, which is some long way overdue for multiple reasons. First of all, for the guests that we have and also for the topic, which we want to dive deeper in today. But before I introduce yourself, Brian, typically you start with some strange dreams that you had. Any strange dreams lately? Well, yeah, it's a really bizarre dream. I had a dream that I was happy. That's it. And then I woke up.
Starting point is 00:01:40 That's it. I got nothing clever today. You even asked for a dream and I didn't have one for you. It's an odd topic to build a dream around for today's show. How can I even fit that in? Or maybe I was dreaming I was eating
Starting point is 00:01:57 alphabet soup. Well, without further ado, I think, folks, if you took a closer look at the screenshot that we took for the podcast and you look behind on the left side from the speaker, there's like an interesting poster and it says, Unlocking the Kernel, eBPF. It sounds like a Marvel movie and it sounds like something really exciting, obviously. And that's exactly what we're going to talk about. We want to talk about eBPF. We want to go deep into the aspects of what is eBPF,
Starting point is 00:02:28 but especially in the context of performance. And now I will stop talking, and I want to let Liz Rice introduce herself to the audience because, Liz, this is the first time for you on Pure Performance. Thank you so much. Who are you? Hi, thanks for having me. Yeah, so my name is Liz Rice. I am Chief Open
Starting point is 00:02:48 Source Officer at Isovalent, which is the company that originally created the Cilium project. And we've been involved in eBPF since basically its inception. So yeah, I have lots of really, really expert colleagues, some of whom are on that movie poster that you talked about, which is from the eBPF documentary. Definitely really good. Wait, there's a documentary? There is a documentary. Wow, that's amazing. That's amazing.
Starting point is 00:03:17 You go to, I think it's ebpfdocumentary.com, you will find it's incredibly, the production values are amazing. You know, when you said it was a bit like a Marvel movie, it's, you know, I think Marvel would be proud. It seems that, you know, I mean, I really think it's, you know, tech has obviously for us always been cool, but now it's been really cool because we've seen the documentary from Kubernetes, I remember, a couple of years ago when they started with like a really professional production of this. And it's great to see that eBPF is following here as well.
Starting point is 00:03:51 At least the two of us, I mean, I've seen you multiple times. We have crossed paths probably more than you remember because I remember sitting in the audience and watching you giving live demos at different conferences about eBPF where you're doing some really dangerous things because you're launching these eBPF programs that potentially bring down your whole system. Because if you don't do what you do, then you can also do bad things, obviously.
Starting point is 00:04:17 But I was always amazed of people like yourself explaining things in simple terms and also then being brave enough to then do live demos. Recently, we met at the Container Days in Stockholm, where we both spoke, so that was also great. But Liz, for the people that have not heard about eBPF, maybe a quick introduction. What is eBPF for somebody that is kind of new to this topic?
Starting point is 00:04:45 Yeah, so because it's four letters, people will think, oh, it's an acronym. What does it stand for? And it used to stand for Extended Berkeley Packet Filter. But we now consider it really just a standalone term, because you can do so much more than just filtering packets that that acronym's become kind of misleading. So what eBPF really lets us do is program the kernel. We can load programs into the kernel and we can change the way that it behaves, and that gives us incredible power. The kernel
Starting point is 00:05:23 is involved in basically anything interesting that's happening on a running system. So anything that involves file access, networking, or privileges, or memory access, all these things are getting, you know, the kernel knows what's going on. So if we can sit in the kernel with our eBPF programs, we can observe everything
Starting point is 00:05:46 that's happening and we can even influence what's happening through eBPF. So lots of power. Lots of power. And you've been, you know, working with eBPF for many years. You wrote several books, I would assume, right? I mean, at least the Learning eBPF, and I think probably some others that I'm not even aware of. On eBPF itself, I've written Learning eBPF, and then there's also a short little, they call it a report, about what is eBPF. So it's more like a kind of taster book
Starting point is 00:06:20 rather than a thing you can buy on Amazon. But the Learning eBPF book you can buy from your local bookstore. I also wrote a book about container security as well, which is broader than just eBPF. It did have a little bit of eBPF in there. So that means, folks, if you're new to eBPF and you want to learn the basics before you maybe continue listening to the deeper dive conversations that we're now going into,
Starting point is 00:06:44 check out the links in the description of the podcast. We will point, you know, to your profile, to the books that you just mentioned. And I know if you search for eBPF and Liz Rice on your favorite search engine, many different recordings of demos and presentations come up. I do love a demo. I always think, you know, if you want to see how something works, you've got to actually see it working. There's no amount of, I don't know,
Starting point is 00:07:12 diagrams are really going to explain it. Or certainly for me, if I want to understand how something works, I kind of have to touch it and feel it, you know? Yeah. And for that, it's actually great that sometimes you break things, because if you break things, then, you know, you learn even better, because then you have to ask yourself: why did this break? Because I thought it works. What mistake did I make? And
Starting point is 00:07:36 but I think I've never seen you break your systems, at least when I watched the demos. Maybe you just did an amazingly good job in hiding any mistake. It has happened occasionally. In fact, in KubeCon in Paris, I did a demo where I forgot to kind of, you know, I'd run through the demo beforehand. I left a tracing policy config, you know, sitting in my Kubernetes cluster. I forgot to delete it.
Starting point is 00:08:04 So that meant that some of my network traffic didn't flow, and I thought, okay, what's going on? And eventually I realized that it was because I'd left this tracing policy in place. Fortunately, I realized relatively early in the process, and it didn't completely ruin the demo. And somebody did actually say to me afterwards, did you do that deliberately? No, I did not. So let's assume now everybody that is listening in and you have a basic understanding of
Starting point is 00:08:38 eBPF, you've seen some of your demos. eBPF is really programs where you can basically say, please, kernel, call this program if a certain process, if processes are accessing a file, if they're trying to send something over the network. You mentioned memory access. From a performance perspective,
Starting point is 00:08:56 because this is always questions that I get asked, and isn't this very dangerous? Because if you think about, you have a Linux system where you have hundreds and hundreds of processes or on a Kubernetes cluster on the node, you have many, many pods and then you have eBPF programs running. Isn't that a potential good source of performance problems if you are slowing things down because you're maybe tracing too much or having too many of these programs being executed. Yeah, I mean, each individual eBPF program itself is lightning fast, typically,
Starting point is 00:09:36 because there's no need to transition between kernel and user space. So you have however many CPU instructions in your eBPF program, it's going to, you know, it's running machine code the same as anything else that's happening on your system. One of the aspects of eBPF is it goes through this verification process. And one of the things that the verifier checks is that an eBPF program is definitely going to run to completion. So the verifier is going to analyze all the possible paths through a program. And interestingly, I was
Starting point is 00:10:19 talking to one of the kernel developers who works on eBPF recently, he was saying it's less about whether or not it's going to run to completion. From his perspective, it's less about is it going to run to completion, and it's more about while you're running that program, that CPU is tied up running that program. So anything else that the kernel has to schedule can't happen because this is in the kernel. There's no sort of super kernel scheduling things that can kind of interrupt this program. So I hear you saying, I see it less as, you know, you're not going to hang the kernel
Starting point is 00:11:00 and more as we need to know that that program isn't going to sort of disrupt the ongoing activity that the kernel needs to look after. So from that perspective, that kind of prevents or that's part of the reason why, you know, eBPF programs are kept relatively small and efficient individually. Now, you can do some really cool things like chain them together and call them on timers. So you can actually do some very, very sophisticated things with eBPF. But in terms of damaging your performance, again, if you attach to every single possible event in the kernel,
Starting point is 00:11:44 yeah, you're probably going to see some significant impact. There are, like anything else in programming, there are sort of pathological things you could do that would be really inefficient and would affect your performance. But equally, there are some incredibly powerful things we can do with extremely small amounts of overhead. I was going to say, that sounds a lot like code instrumentation, right? People, if they're using a profiler on their code and try to run that in production where they have
Starting point is 00:12:17 in the old days of Dynatrace, we had the ability to wildcard instrumentation on an entire package. And of course, then you would take down your system. It's all about strategic use of it. Individually they don't cost a lot of time or interruption. But if you do too much or you're not paying attention to what you're doing, it sounds like that's when it can get into trouble. One of the other aspects to it is what your program is doing and whether it's generating information that's going to go to user space. So you can send information between program, well, eBPF programs and user space using these data structures called maps.
Starting point is 00:13:00 And they might be things like ring buffers, for example, and you can sort of stick information in. I mean, one really common use case is you're going to capture information about events in eBPF programs and then send information about those events to user space. And if you have a lot of those events and you send them all to user space, and then you have some user space program that's doing something with those events, that can be a significant amount of CPU in total. One of the things that we've been able to do quite recently, and I would say this is a sort of advance in the way that we can use eBPF, is in Tetragon, which is a project that's part of Cilium, that's really used for security observability.
Starting point is 00:13:53 So you have a policy that says, what events am I interested in? And rather than filtering the interesting events, traditionally you'd send all the events to user space, let's say all the file read events to user space, and then user space would say, is this a file that I'm interested in? Now we can actually do the filtering in kernel. So massively reducing the amount of event information
Starting point is 00:14:20 that has to be sent to user space. And it is a huge reduction in CPU cost for monitoring those kinds of events. You know, you can do things like, I want to say I'm terrible at remembering numbers, but I want to say that the file read benchmark that we did, it was kind of 0.2% CPU. It's really minimal overhead. And you have also, folks, if you are following the, if you're looking at the presentation on the Speaker Deck website from Liz,
Starting point is 00:14:54 you have some really great examples, also with performance benchmarks, on how an eBPF-based implementation is completely outperforming many other things. I think you had a couple of things on the network side, right? How you're proxying, how you're routing traffic. Also on the security side, as you said,
Starting point is 00:15:10 the closer you are to the kernel, and especially if you apply some of these principles, like don't capture data that you would otherwise filter out anyway at a much later stage, because that is much more costly than filtering it out at the source, where you never capture it in the first place. Overall, you will gain a lot in performance and resource consumption.
Starting point is 00:15:30 There's a general question that I now have on eBPF. Do you see eBPF mainly being used by tool vendors to really provide eBPF-based tools like what you just mentioned, right? With everything that you are doing. There are well-known open source projects like Falco and many others. Or do you also see end users? And with end users, I mean, I don't know, maybe site reliability engineers,
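The difference between filtering in user space and filtering in the kernel can be illustrated with a small Python model. This is only a sketch under stated assumptions: `FileReadEvent`, `user_space_filtering` and `in_kernel_filtering` are hypothetical names, the `deque` merely stands in for an eBPF ring buffer map, and nothing here reflects how Tetragon is actually implemented. It just counts how many events would have to cross into user space in each design.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class FileReadEvent:
    """Hypothetical record for one file read seen by the kernel."""
    pid: int
    path: str


def user_space_filtering(events, watched_paths):
    """Traditional design: every event crosses to user space, then gets filtered."""
    ring = deque()  # stand-in for a ring buffer map shared with user space
    for event in events:
        ring.append(event)  # every single event is submitted
    crossed = len(ring)
    matches = [e for e in ring if e.path in watched_paths]
    return crossed, matches


def in_kernel_filtering(events, watched_paths):
    """Filter first: only events matching the policy are submitted at all."""
    ring = deque()
    for event in events:
        if event.path in watched_paths:  # the check happens before submission
            ring.append(event)
    return len(ring), list(ring)


# 1000 simulated reads, of which every 100th touches the watched file.
events = [
    FileReadEvent(pid=i, path="/etc/shadow" if i % 100 == 0 else f"/tmp/file{i}")
    for i in range(1000)
]
watched = {"/etc/shadow"}

crossed_user, matches_user = user_space_filtering(events, watched)
crossed_kernel, matches_kernel = in_kernel_filtering(events, watched)
```

Both designs report the same ten matching events, but in this model the first one moves 1000 events across the boundary and the second only 10, which is the kind of reduction described in the conversation.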
Starting point is 00:15:58 network engineers, security engineers in organizations also using eBPF? Yeah, so there is a whole range of, I don't know, how dirty you want to get your hands with eBPF. So at the kind of easiest to use level, yeah, absolutely. People are using tools that have been built that are using eBPF under the covers. And, you know, these projects like Cilium and Falco and Inspector Gadget and so on that are really intended to protect the user from having to get too involved. They might not even realize. People have been using Cilium in production for years
Starting point is 00:16:40 and they might not even know that it's using eBPF. They certainly haven't had to investigate how the eBPF aspects of it work. Then you can kind of step down a level to things like the BCC family of tools or bpftrace. And these were really popularized by Brendan Gregg, I'd say, and they let you, particularly if you're interested in tracking down a performance problem, kind of track down, you know, what is it in my system that I'm suspicious of? Or maybe I'm suspicious that, I don't know,
Starting point is 00:17:25 things are being OOM-killed, let me find out what's happening, or I'm concerned about, I don't know, how many network connections are being activated. And you've got these quite low-level tools that you can apply and run on the command line to say, okay, show me all of the executions or count the number of memory allocations or whatever it is that you want to track down. But even with those, you're not actually writing eBPF code yourself.
Starting point is 00:17:55 Then we get into, okay, do you want to write the code yourself? And I'm going to say, you know, I felt like this is such an exciting technology. I want to understand how it works. I want to get in there. I want to write code. And that's a lot of what my book is about: how do you go about writing an eBPF program? But in reality, you know, it's one thing to do hello world or to sort of trace some system calls or something. Quite soon, it becomes kernel programming. You are quite soon dealing with the kernel's data structures. Maybe looking at a network packet is relatively straightforward and, you know, you can write things to manipulate a network packet.
Starting point is 00:18:48 But you are looking at the packet in its buffer the way the kernel sees it. You know, you can get hold of things, you know, data structures related to processes. And, you know, if you don't know what you're doing, you're potentially, you know, doing interesting things with the kernel. So it can get quite complicated quite quickly. Yeah, it's nice that you used the word interesting and not dangerous. But I mean, in the end, right, it's with great power comes great responsibility. I guess that's one of the things. And that's maybe also the superhero poster on the back end, right? Unlocking the kernel. It's a lot of powers that you have.
Starting point is 00:19:31 Now you bring up the capability of actually changing parameters, changing content, changing memory, I guess. Changing things that people write to a file. And then I'm actually wondering, if you think about a large enterprise organization, how can you actually make sure that nothing tampers with your data based on an eBPF program that somebody kind of got into your system? I mean, kind of my question is, how do you audit this? How do you make sure that you actually know what is part of your system? Because now we're no longer only looking at the core kernel version
Starting point is 00:20:12 and the applications that you install on top, but you also need to think about what type of version of kernel do I have and what type of eBPF programs do I have? Because all of this now in combination is really what makes up your system. Yeah, so I guess that leads to a few different trains of thought. One is you do need privileges to load BPF programs into the kernel, and so effectively you're going to need to be root. And, you know, as with everything else, you want to absolutely minimize root access to those machines. So you no more want to give people privileges to run eBPF than to do anything else that they can do with root privileges.
Starting point is 00:20:58 You mentioned kernel version. And actually one of the more recent things that's happened in recent years, let's say, and is now in kernel versions that people to be running on, and I can send it to you, and it will run on your kernel version, even though the internal data structures within the kernel could be quite different. And in order to make that happen, there are some really interesting adjustments that get made. In fact, it's done in user space. You adjust the BPF code. The user space library essentially has to adjust the BPF code to make the program that it's loading match the data structure layout of your kernel version. So what that means, you know, for a user,
Starting point is 00:22:07 it just means it doesn't matter where you got your eBPF program from, it will just run. But it does also make it complicated to sign eBPF programs. You can't just take an eBPF program, sign it, and then have the kernel check that the signature matches because it's had to go through this adjustment process. It's actually not the same as the code you were sent.
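To make the signing problem concrete, here is a toy Python model of the adjustment step. It is a sketch under loud assumptions: the instruction format, the field offsets and the `relocate` and `digest` helpers are all invented for illustration and are not the real loader's API. In practice this mechanism is commonly known as CO-RE (compile once, run everywhere) and relies on kernel type information, which this model does not attempt to reproduce.

```python
import hashlib
import json

# Hypothetical field offsets: the kernel the program was built against
# versus the kernel it is being loaded on.
BUILD_LAYOUT = {"task_struct.pid": 1256, "task_struct.comm": 1504}
TARGET_LAYOUT = {"task_struct.pid": 1312, "task_struct.comm": 1568}

# Toy "program": each load records which field it wants and the offset
# that was correct on the build kernel.
PROGRAM = [
    {"op": "load", "field": "task_struct.pid", "offset": 1256},
    {"op": "load", "field": "task_struct.comm", "offset": 1504},
    {"op": "exit"},
]


def relocate(program, target_layout):
    """Return a copy of the program with offsets patched for the target kernel."""
    relocated = []
    for insn in program:
        insn = dict(insn)  # copy, so the distributed program stays untouched
        if "field" in insn:
            insn["offset"] = target_layout[insn["field"]]
        relocated.append(insn)
    return relocated


def digest(program):
    """Stand-in for whatever a signature would be computed over."""
    encoded = json.dumps(program, sort_keys=True).encode()
    return hashlib.sha256(encoded).hexdigest()


shipped_digest = digest(PROGRAM)
loaded_program = relocate(PROGRAM, TARGET_LAYOUT)
loaded_digest = digest(loaded_program)
```

In this model `shipped_digest` and `loaded_digest` differ, because the bytes handed to the kernel are no longer the bytes that were distributed. That is the reason a naive "sign the program, verify in the kernel" scheme does not work once offsets are patched at load time.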
Starting point is 00:22:33 So there's some activity going on to look at other ways of having the kernel validate signatures of the user space loader program. So I would say that's an area, this kind of signing, eBPF-specific supply chain security aspect is an area of development in the kernel. You can, of course, and should, of course, be very careful about where you're getting those eBPF programs from, validating that the binaries that you're downloading into user space
Starting point is 00:23:14 are signed and so on. So it shouldn't be a free-for-all. You should only run eBPF programs that you have good reason to believe are legit. I just wanted to follow on this topic because it was actually a thought on my mind. What is the resistance, even for doing like code tracing on our side, a lot of times people have security concerns and all that. Obviously, these are running as root. They have access to the kernel.
Starting point is 00:24:05 Are people resisting using eBPF because of some of these permissions, security concerns, or have the projects matured enough that they can prove out and say, look, we're safe, we're vetted? How do you bring them into an organization? Especially a large enterprise, they're going to be like, what's going on, and we need to do a crazy security review, and all the real-world sides of that. So what have you seen on that side? I would say the reality is that we don't see pushback, but a big part of that is probably, say, for example, Cilium: it's graduated within the CNCF,
Starting point is 00:24:22 it's incredibly widely adopted as a networking plug-in, the networking solution for Kubernetes. In fact, with Kubernetes, you have to choose a networking plug-in. If you're in a managed service, it might be chosen for you or there might be a default version chosen for you. But if you're installing Kubernetes yourself, you've got to choose networking from somewhere, and Cilium is incredibly widely adopted. As part of the graduation process it went through security reviews that the CNCF organized. And, yeah, so, I mean, I would say as a project, Cilium takes security very seriously. We don't really, it's certainly a question that people have when they hear about, you know, the power of eBPF and, you know, well, the performance that you're going to get in terms of networking
Starting point is 00:25:25 and the capabilities that you're going to get are so much in need that, you know, this is the right solution to achieve those capabilities. Yeah, it's certainly, you know, people ask questions. The other option is, you know, well, would you rather have it as a kernel module? Probably not. That's, you know, even, you know, at least as many security
Starting point is 00:25:55 concerns or question marks to raise, plus a kernel module, a bug in a kernel module can bring down your machine. That verification process that I mentioned earlier, as it's analyzing eBPF programs, is making sure that there is no possible way for that eBPF program to crash.
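As a rough illustration of what "analyzing all the possible paths" can mean, here is a toy verifier in Python. It is a deliberately simplified sketch: the four-instruction language, the `verify` function and the `max_insns` budget are hypothetical, and the real kernel verifier tracks far more (register state, memory bounds, helper arguments). The toy only accepts programs where every path reaches `exit`, which it guarantees by rejecting backward jumps.

```python
def verify(program, max_insns=4096):
    """Toy check that every path through `program` reaches an exit.

    Instructions are tuples: ("nop",), ("exit",), ("jmp", target) for an
    unconditional jump, and ("jeq", target) for a conditional jump that
    may also fall through. Returns (accepted, reason).
    """
    if not program:
        return False, "empty program"
    if len(program) > max_insns:
        return False, "program too large"

    pending = [0]   # instruction indexes still to explore
    seen = set()
    while pending:
        pc = pending.pop()
        if pc in seen:
            continue
        seen.add(pc)
        if pc >= len(program):
            return False, "falls off the end without exit"
        kind = program[pc][0]
        if kind == "exit":
            continue
        if kind in ("jmp", "jeq"):
            target = program[pc][1]
            if not 0 <= target < len(program):
                return False, "jump out of range"
            if target <= pc:
                # Only forward jumps allowed, so no path can loop forever.
                return False, "backward jump (possible loop)"
            pending.append(target)
            if kind == "jeq":
                pending.append(pc + 1)  # the not-taken branch
        else:
            pending.append(pc + 1)
    return True, "ok"


accepted = verify([("jeq", 2), ("nop",), ("exit",)])
looping = verify([("nop",), ("jmp", 0), ("exit",)])
unterminated = verify([("nop",), ("nop",)])
```

Here `accepted` passes, `looping` is rejected for its backward jump, and `unterminated` is rejected because one path never reaches `exit`. The point is that the decision is made before the program ever runs, which is the property that distinguishes this approach from loading a kernel module.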
Starting point is 00:26:16 And that ensures, you know, it can't just sort of crash the whole machine. Cool. Thanks. Is there something in the kernel that gives you feedback on what eBPF programs are actually executed, how long they run, how many of them don't finish in time, or things like this? Is there any kind of observability built in on the command line?
Starting point is 00:26:47 Yeah, well, you can use BPF to observe BPF. So there's actually a command line tool called bpftool that is very handy for looking at what BPF programs are loaded, what maps are loaded. You can look at things like, you know, what are the instructions that these programs consist of. Trying to think if there's anything that will actually give you the timing. I'm not immediately thinking of anything that will sort of automatically
Starting point is 00:27:20 give you the timing of those. The reason why we're asking is because we keep getting these questions from the people that use our solutions for observability and therefore we built a kind of self-monitoring and also measuring the overhead that we have into our product. And so I'm just curious if this ever came up in the context of eBPF and if the Linux kernel provides some out-of-the-box capabilities. Yeah.
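For listeners who want to follow up on the timing question: a minimal Python sketch, assuming a reasonably recent kernel and `bpftool`. To the best of my knowledge, when the `kernel.bpf_stats_enabled` sysctl is switched on, `bpftool --json prog show` includes `run_time_ns` and `run_cnt` fields per program; treat that as an assumption to verify on your own system. The helper names `load_program_info` and `average_runtimes` are hypothetical.

```python
import json
import subprocess


def load_program_info():
    """Ask bpftool for all loaded BPF programs as JSON (usually needs root)."""
    result = subprocess.run(
        ["bpftool", "--json", "prog", "show"],
        check=True,
        capture_output=True,
        text=True,
    )
    return json.loads(result.stdout)


def average_runtimes(programs):
    """Return (id, name, average ns per run) tuples, slowest first.

    Programs without run statistics (stats disabled, or never run) are skipped.
    """
    rows = []
    for prog in programs:
        runs = prog.get("run_cnt", 0)
        if runs:
            average = prog["run_time_ns"] / runs
            rows.append((prog["id"], prog.get("name", ""), average))
    return sorted(rows, key=lambda row: row[2], reverse=True)


# Made-up sample data shaped like the JSON described above.
sample = [
    {"id": 7, "name": "trace_open", "run_time_ns": 300, "run_cnt": 3},
    {"id": 9, "name": "count_pkts", "run_time_ns": 1000, "run_cnt": 2},
    {"id": 11, "name": "idle_prog"},
]
ranked = average_runtimes(sample)
```

On a real machine the call would be `average_runtimes(load_program_info())`. Collecting these statistics adds a small cost to every program run, which is presumably why it is off by default.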
Starting point is 00:27:49 You did mention CPU, right? So is that in the command line? Because a lot of times if you're getting down that low, probably looking at CPU cycles is going to be a good indicator of how long. Yeah, it may be in there, but I honestly can't recall whether it is. We can leave that as an exercise to look after. Well, it will be maybe... man bpftool and see whether there's...
Starting point is 00:28:16 You know, a future talk of Liz Rice at an upcoming cloud native conference near to your home. Exactly. You have, as I said, I'm just looking and I'm clicking through the slides of your latest presentation called Unleashing the Kernel with eBPF. You mentioned Brendan Gregg earlier. So for everybody that is interested and our listeners that are performance enthusiasts like Brian and I, in the slides, what is that? Slide, where's the slide number? Oh yeah, slide 29, 30 and 31.
Starting point is 00:28:52 It's kind of like how Brendan has used his famous flame graphs and CPU flame graphs to analyze, to basically do profiling. And I think that's really cool, kind of like something that has impacted our performance community a lot. And knowing that these things are possible also with eBPF is good to hear probably for many of you out there, for many of the listeners.
Starting point is 00:29:23 Some of the other things that I would highlight, and I'm just kind of clicking through your slides, you have a couple of examples here about how much more efficient eBPF-based programs are. I mentioned this earlier, and you also said Cilium, so much more efficient than kube-proxy. Can you exactly, or can you just in brief words again, recap why that is?
Starting point is 00:29:46 Because some people say, how is this possible? Why doesn't everybody do it like this? Why is it so much more efficient? So like a lot of things, it's a combination of many moving parts. So one part of this is the way that container networking works. So when we create a container or a Kubernetes pod, we usually put it in its own network namespace. So what that means for the path of a network packet, let's say a packet comes from outside the machine through a physical Ethernet connection into the host on which
Starting point is 00:30:25 this container is running. And in traditional networking, it's going to go through the whole network stack in the host's network namespace. And the outcome of that is going to be you need to go over a virtual Ethernet connection from the host namespace to the container network namespace. And then it goes to the networking stack again to eventually get to the socket and reach the application. And so that can be a relatively convoluted path. And we can bypass a lot of that by picking up that network packet
Starting point is 00:31:05 as soon as it comes off the physical wire, recognizing that the IP address is a container IP address and just directly injecting it into that network namespace for the pod. So that's one part of it. Another part of it is that because we're doing the lookups for that, essentially, where is this, you know, what is this IP address and what does it correspond to? Where is this packet intended to go? In sort of traditional networking that is essentially done through iptables. There's a whole bunch of things like network policy can be done in iptables.
Starting point is 00:31:51 iptables are quite inefficient. And whenever you create a new pod, you end up having to rewrite all of the iptables. And you can't just sort of append to them, they all get kind of rewritten. So in a very dynamic environment like Kubernetes, where pods are coming and going, you can actually spend quite a lot of CPU cycles just with updates to these iptables. In eBPF, the map data structures that I mentioned are just incredibly efficient and they can be updated so much more, much, much, much more quickly. So that's another aspect that,
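The "where is this packet intended to go?" lookup can be sketched in Python to show why the choice of data structure matters. This is an illustrative model only: `lookup_linear` mimics walking an ordered rule list one entry at a time, `lookup_map` uses a hash map keyed by pod IP as a stand-in for an eBPF map, and the pod names and addresses are invented. It is not how iptables or Cilium are implemented internally.

```python
def lookup_linear(rules, dst_ip):
    """Walk an ordered rule list; return (target, number of comparisons made)."""
    for comparisons, (ip, target) in enumerate(rules, start=1):
        if ip == dst_ip:
            return target, comparisons
    return None, len(rules)


def lookup_map(pod_map, dst_ip):
    """Single hash lookup keyed by destination IP."""
    return pod_map.get(dst_ip)


# 1000 hypothetical pods with made-up addresses 10.0.0.0 through 10.0.3.231.
pods = [(f"10.0.{i // 256}.{i % 256}", f"pod-{i}") for i in range(1000)]
rules = list(pods)      # ordered rule list
pod_map = dict(pods)    # hash map stand-in for an eBPF map

last_ip = "10.0.3.231"  # address of pod-999, the last rule in the list
linear_target, comparisons = lookup_linear(rules, last_ip)
map_target = lookup_map(pod_map, last_ip)

# A new pod arriving is a single key insert in the map model.
pod_map["10.0.3.232"] = "pod-1000"
```

Both lookups find `pod-999`, but the linear walk needs 1000 comparisons in the worst case while the map lookup cost does not grow with the number of pods. Adding a pod is likewise one insert here, rather than regenerating a whole rule set.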
Starting point is 00:32:37 particularly at scale, massively improves. So that means I can have probably one eBPF program that understands if new pods are coming in and if they are, what IP addresses they use, and then keeps those maps up to date. And then the other eBPF program that is really then seeing incoming network connections
Starting point is 00:32:57 and then saying, hey, this actually goes to an IP, and I've just seen that this is a new pod that just came up and then routes the traffic. I think you actually, folks, I mean, I'm in the fortunate situation to have the slides open. This is slide number 22. That's a great job. 22 and 23 exactly visually explains what you've just kind of walked us through, that physical connection coming in from the outside
Starting point is 00:33:23 and then passing it through and basically bypassing everything else that would normally happen on a Linux machine. And one of the stats that you even have in your slide deck talks about up to 90% performance improvements. And if I think back about the big topics that we've discussed at KubeCon and also other conferences, I think efficiency is very important. We're wasting a lot of resources these days because we're just building new apps and we don't think about what's happening when we are letting service one talk with the
Starting point is 00:33:56 other. But there are people, fortunately, like what the Cilium project is doing and what Isovalent is doing. We really think about how can we make these things more efficient by leveraging new, not new technology, eBPF has been out there for a while, but by leveraging this type of technology to really bring this efficiency gain to everyone that wants it.
Starting point is 00:34:16 And I think it's just also the awareness. I think the podcast here is also about the awareness that technologies like eBPF even exist and that there's tools out there in the open source space and in the commercial space. But today, with Cilium, this is all, as you mentioned, a graduated project in the CNCF.
Starting point is 00:34:37 So there's so many great tools out there that really help us build more efficient systems. And it's really, really impressive. Absolutely. You know, I think I referred to it in the slides. You know, there are end users out there publishing their results and, you know, case studies of, you know, the performance improvements they've seen.
Starting point is 00:35:02 And they can be really startlingly, you know, much better. You know, it very much depends on the scale of your network and, you know, what exactly it is you're doing, but I would say it's normal for anybody running at any kind of significant scale, they're going to see really big performance improvements by using, you know, Cilium rather than traditional kube-proxy, for example. Yeah. I wanted to ask a question. I mean, obviously the Cilium and the network side is a huge piece, but I'm looking at,
Starting point is 00:35:38 I believe, slide 28, where it says measure anything with eBPF. And this goes back to a comment I made before I started recording. So I just wanted to make sure if I understand this right. One of the major benefits, I think, especially even to a developer, once we start getting down to that road, we have tools like Dynatrace or other tracing components where you could look at the code. When the logical code is being executed, process this data, do whatever,
Starting point is 00:36:04 we can see that in the tracing tools. But then the tools will do something like write to disk, read from disk, write to cache, read from disk, get network connection and all that. And sometimes there's problems there. But from a code point of view, that's where your code is now interacting
Starting point is 00:36:19 with the kernel or the subsystems. It sounds like eBPF is the magical magnifying glass that would allow you then to say, okay, once it leaves my code and goes into those systems, what's going on? What files is it writing to? What size are those files? Is there a queue in those files, right, or is there a problem with that cache? And this is that magical, you know, and if you think of other times, right, you know, we see CPU, and it'll be like, well, what happens when it goes there? And people have been blind and they have to scratch their heads, and this would fit right in there to get answers to that, right? Yeah, exactly. The details of
Starting point is 00:36:56 what you can get with particularly, you know, that BCC family of tools, you know, those really came about from, you know, Brendan Gregg and others doing real performance, you know, detective work to try and figure out what, you know, was causing bottlenecks. At the time he was at Netflix, you know, they had serious scale. So, you know, there are tools there that are measuring the performance of things
Starting point is 00:37:28 that I don't even know what those things are, you know, but that's the kind of depth of eBPF, the ability to look into all these different aspects of the way the whole system and the whole kernel is working. That's great. Do you also, I assume you work also with the big managed service providers like the cloud vendors, the hyperscalers, because if I as an organization decide to move my workloads to one of the big vendors and decide to use their offering, I guess I have more limited capabilities to actually use eBPF. Because obviously I may
Starting point is 00:38:08 run on shared environments that are shared from a Linux perspective, from a kernel perspective. I may not have access to it. Do you see... First of all, my question is, A, am I wrong with this? So do I still have the capability? Or B, if I'm right with this, do you see the big vendors also kind of jumping on eBPF to actually optimize? Because if they start optimizing the systems, right, this is at a much bigger scale. Yes. So all of the major cloud providers, you know, are doing interesting things with eBPF, including using Cilium for networking. So, for example, Google, their Dataplane V2 is based on Cilium. Microsoft have an
Starting point is 00:38:55 Azure CNI that's powered by Cilium. In AWS, the default CNI that you get if you're doing EKS Anywhere is Cilium. And in all cases, even if Cilium isn't sort of the thing by default, you can run it. You know, I don't think there are any Kubernetes environments where you can't run it. Now, if you have a sort of even more managed service, so, say, something like Fargate, where you don't actually have access to the machines, there you have less ability. In fact, I know there's an issue open on GitHub talking about, you know, opening up eBPF on Fargate. So it's certainly something, you know, they're looking at. And I think there's a lot of demand from users who want to be able to run these tools and take advantage of eBPF, but they
Starting point is 00:40:06 don't necessarily have that ability in a shared environment where they don't necessarily have. I think it's not just about the, you know, the privileges to load things into the machine. It's also about things like what if, you know, the underlying machine gets upgraded and how do you manage programs around that? So the lifecycle is quite tricky to deal with. But yeah, absolutely. You know, there are tools that the cloud providers and the hyperscalers are building based on eBPF. For example, AWS have something called, I want to say, GuardDuty.
Starting point is 00:40:46 Definitely got the word guard in it. And it's an eBPF event monitoring solution that they're offering to people through managed services. It's absolutely a technology that people are really building on. Yeah. I know we are coming to the end of the time that we have today, but Liz, I want to ask you, because as I said, I've seen many presentations of you where you
Starting point is 00:41:15 really educate people on the power of eBPF and what eBPF is, your demos. It feels like you've been doing this for a couple of years and still educating and people are still excited to learn this for the first time. But I guess eventually, right, you will also move on to the next topic.
Starting point is 00:41:34 And for me, the question is, what is the next thing for you? What do you see? What do you want people to know about? What are you going to talk about as you connect here, for instance? What are the things that we need to keep an eye out? Yeah, it's a great question.
Starting point is 00:41:52 I think there is still quite a long way to go with eBPF and the things that we can do and will build with eBPF. So at KubeCon in Paris, one of the talks that I did was with John Fastabend, also from Isovalent, who is a kernel engineer. And we basically talked about, is there anything that you can't do with eBPF? And he ran a demo, showed a demo of running Game of Life, which is a kind of equivalent of Turing-complete programming, and it's implemented in eBPF.
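For readers who have not met it, the rules of Conway's Game of Life fit in a few lines. The sketch below is plain Python, not eBPF, and is not the demo from the talk; the `step` function and the set-of-coordinates representation are choices made only for this illustration.

```python
from typing import Dict, Set, Tuple

Cell = Tuple[int, int]


def step(live: Set[Cell]) -> Set[Cell]:
    """Compute the next generation from the set of live cell coordinates."""
    # Count, for every cell adjacent to a live cell, how many live neighbours it has.
    counts: Dict[Cell, int] = {}
    for x, y in live:
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                if dx == 0 and dy == 0:
                    continue  # a cell is not its own neighbour
                key = (x + dx, y + dy)
                counts[key] = counts.get(key, 0) + 1
    # Birth on exactly three neighbours; survival on two or three.
    return {c for c, n in counts.items() if n == 3 or (n == 2 and c in live)}


if __name__ == "__main__":
    blinker = {(0, 1), (1, 1), (2, 1)}  # three cells in a horizontal row
    print(sorted(step(blinker)))
```

A row of three cells (a "blinker") flips between horizontal and vertical on each step, which is a convenient way to check that the rules are implemented correctly.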
Starting point is 00:42:33 So I think there is still a lot of mileage for things that are currently not done in eBPF yet, but could be and will be. And probably some fun surprises that we'll uncover along the way. So I don't think we've completely exhausted eBPF as of yet. Okay. Then maybe another question. What would you wish everyone would know about eBPF to avoid? A common mistake, a common thing people should not do?
Starting point is 00:43:06 Is there anything where you say, man, I've been talking about this for so long and still people are doing this, why? I think one thing that is a common misconception is really sort of where the kernel sits. So that you only have one kernel on a virtual machine or on a physical bare metal machine. And that however many containers you're running, they're all sharing one kernel. And it's something that most people don't really need to think about because, you know, even if they're developers, they're probably writing in user space.
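The shared-kernel point can be checked directly. As a minimal sketch, the snippet below prints the kernel release the current process sees; run on a Linux host and again inside an ordinary container on that host, it is expected to print the same string, because the container has no kernel of its own. Sandboxed or VM-based container runtimes are an exception, and the `kernel_release` helper name is just an illustrative choice.

```python
import platform


def kernel_release() -> str:
    """Return the release string of the kernel this process is running on."""
    return platform.release()


if __name__ == "__main__":
    print(kernel_release())
```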
Starting point is 00:43:43 And containers have their own kind of isolated views of what's happening in user space. But they're all sharing a kernel. And that is a little bit of a thing that people have to get their head around when they're really understanding what eBPF is. I can only refer to slide number 33. It's also great. Yeah, that's right. Awesome. Liz, I know it's the time of the recording.
Starting point is 00:44:12 We're in mid-April. This probably will air somewhere around end of April, maybe early May. Where will people see you if they want to actually catch the real Liz Rice in real life and maybe ask you questions? Do you have any things, any presentations, any conferences coming up? Yeah, so I will be in Turkey for, there's a KCD in Turkey and also an AWS Community Day in Turkey coming up in early May. And then in early June, I'm going to be at Cisco Live,
Starting point is 00:44:46 which will be a new, you know, I've never been, so that'll be a whole new thing. Yeah. So I guess those are the next couple of events coming up in my calendar. And absolutely, if people are there,
Starting point is 00:44:58 do come and say hello. I love it when people come and, you know, talk, maybe, you know, ask questions. It's brilliant.
Starting point is 00:45:04 Buy you a drink. Not obligatory. But you won't say no. Yeah, in Istanbul... I think in Istanbul, you will hopefully, I will tell Katerina Sik, she's one of our colleagues
Starting point is 00:45:21 and she's also a speaker there. She's actually talking about, I think, the least privilege principle from a security perspective. She has a security talk there, actually, I think with somebody from the Falco team. So you should definitely meet her. She has also been a great inspiration and help for me when I built the presentations
Starting point is 00:45:41 around platform engineering. Fantastic. I will look out for her. Great. Yeah. Sorry, Brian, I cut you off. Oh, I was just going to say in terms of the future of eBPF, I'm not on the speaking circuit scene and all, but being just on the outside, starting to see it pop up, starting to see those letters roll off of people's tongues, very lightly still at this point, but I
Starting point is 00:46:06 think we're at the, um, maybe apex of where it's going to finally start breaking through and becoming more common, more thought of, right? Even if people don't know, maybe they're just talking about Cilium and they don't understand what's behind it. But it's really exciting to see, you know, from way back. And yeah, Falco was one of the conversations we had in the past, Andy. Then there was one several years before that. So there's been this knowledge, at least in the back of my mind of it, for years and years and years. But now starting to see it finally coming up and becoming self-aware, should we say, especially if it's got kernel access. But yeah, it's kind of exciting because there's so much more you can do with it, right?
Starting point is 00:46:46 There's so many tools that give you pointers to where it is, and then you can go deep down into it. So it's really exciting to see what might come of that as it gets more widely adopted. So, fantastic. And really, thank you for being on here because I know Andy and I, this is a great learn for us, hopefully a great learn for everybody else. Were there any last thoughts you wanted to share, Liz, before we wrap up? I don't think so. Yeah, it's been a pleasure.
Starting point is 00:47:14 And yeah, we have to add the eBPF documentary link to the show notes. That's one thing we have to remember. And yeah. So tons of great links in the show notes, everybody. And yeah, thanks. Thanks, everyone, for listening. Really appreciate your time today, Liz.
Starting point is 00:47:31 Thank you.
