Embedded - 479: Make Your Voice Heard

Episode Date: June 13, 2024

Carles Cufí spoke with us about Zephyr, Nordic, learning, open source development, and corporate goals.

Carles had some great suggestions for learning Zephyr:
- Memfault Interrupt Practical Zephyr ...blog series
- Nordic’s Developer Academy
- Zephyr’s Discord server
- Zephyr’s YouTube channel (@ZephyrProject), sorted by views

Macrobatics term is from Zephyr Devicetree Mysteries, Solved - Marti Bolivar, Nordic Semiconductor

There is also the Zephyr website for a full picture. And various Nordic tutorials (see nRF5340 Audio applications).

Carles was an author on Getting Started with Bluetooth Low Energy: Tools and Techniques for Low-Power Networking. The cover animal is a mousebird.

Transcript
Starting point is 00:00:00 Welcome to Embedded. I am Elecia White, alongside Christopher White. We promised a show about Zephyr, and here it is. Zephyr and Nordic at the same time, with our guest, Carles Cufí. Here to answer our questions about Zephyr, Nordic, and I don't know, maybe some other stuff. Hi, Carles. Thanks for joining us. Hi there. Glad to be here. Could you tell us about yourself as if we met at lunch at Electronica?
Starting point is 00:00:36 Sure. So I'm an embedded software engineer. And although for the last few years, I've been working on open source and specifically on Zephyr, my career has really revolved around Bluetooth, really. So I started many years back now, back in 2000, working on what was then to become the first ever hands-free car kit compatible with Bluetooth. So it was the first device you could actually install on a car and have your phone connect wirelessly to it. So that was pretty nice. For me, it was an introduction to a technology that then essentially has been with me for my whole career,
Starting point is 00:01:17 not always as my first area of focus, but usually related to or in very close contact to. And so after doing a little bit of Bluetooth in France back then, and after shipping that first hands-free car kit, which was made by a company called Parrot, which went on, by the way, then to make drones, then I started working on an operating system, actually, in the UK. This is a company that disappeared. But it was a very, very interesting company. I will
Starting point is 00:01:52 always remember it very fondly. We were basically doing Android before Android existed. So instead of using Linux, we had our own kernel. And then all of the applications were written in Java. So this was 2003, so well before Android. And we had some really top-notch engineers in there. And unfortunately, the startup itself wasn't successful, although we had funding from the mobile operators and we had a lot of support from a lot of people, but ultimately we didn't succeed. But the code base and in general what I learned there, that's where I got my first contact with Linux, Unix, and in general with
Starting point is 00:02:31 the whole philosophy behind all those operating systems. So this was for me a huge learning experience and I learned to write Unix-based code as well as learning Java a little bit. So this was my second big experience and one that took me to the more powerful chips.
Starting point is 00:02:51 So away from MCUs, from microcontrollers, and then going to the big chips that power the mobile phones today. And then I went back to microcontrollers a little bit for a little while in a small startup before landing on Symbian. I don't know if you remember that system. I guess you have to be a bit old. I used to work at Symbian and it was very interesting, actually. They made the operating system that powered the smartphones of the day, mostly Nokia, but as well as a few others.
Starting point is 00:03:27 And it was an operating system based around C++ and very interesting and intricate concepts from the software development point of view. And it was hugely successful at the time. It had a massive market share, especially because or maybe related to the fact that Nokia was basically owning that market at the time. And then what happened is that the iPhone appeared. So Nokia scrambled to sort of, at the time, I guess, things must have changed so much, but at the time they scrambled to try to counter that. Because I remember they even set up a high-speed camera, one of those that can record thousands of frames a second, just to figure out whether the original iPhone's UI
Starting point is 00:04:06 was indeed running at 60 frames per second. So they wanted to know whether that was actually technically possible with the technology at the time, and it was. It's just that Symbian never did, right? And we never managed to get to that level of fluidity in the user interface. And I remember that also as a learning experience. And engineers sometimes from another side of the world can surprise you with something that you never expected.
Starting point is 00:04:29 So after a while and working on different odds and ends here and there, mobile applications, things like that, I landed at Nordic. That was 2010 and Bluetooth Low Energy was about to become a thing. The first version of the specification, I think, was released either late 2010 or early 2011. And Nordic wanted a chip, or Nordic had been designing a chip already for Bluetooth Low Energy. In fact, Nordic was one of the promoters of the Bluetooth Low Energy technology,
Starting point is 00:04:57 which was partially based on some earlier Nordic chips. And so I joined the team, the team that was making the first ever Nordic Bluetooth Low Energy chip, a very simple chip with an 8051 microcontroller. It wasn't even ARM at the time. It was very limited, but we actually shipped in many products.
Starting point is 00:05:18 I guess it's safe to say now, Casio watches, not smartwatches. I'm talking about the ones with the LCD screens. So they actually sold some of those with our chip in them at the time. I think it was called nRF8001. And the first one we released. And it was a very simple chip, but an interesting one.
Starting point is 00:05:36 It gave us a first step, a first foot in the market of Bluetooth Low Energy. So that was interesting. And after that, I became very involved. I started actually with another person, the SoftDevice project with several people, actually. I started with several people, the SoftDevice project. And that is what we did for a long time, actually.
Starting point is 00:06:00 The SoftDevice, for those that don't know, I guess, that are not familiar with the Nordic software architecture of the last 10 years or so, which, you know, wouldn't be that surprising, is, was, or is, I guess, if you count that it's still supported but no new features come to it, is essentially a binary blob that you flash in Nordic devices and then you can interact with it via supervisor calls.
Starting point is 00:06:23 So it's kind of always there, always available, no matter where you are, what you're executing in the chip. So it has some very specific and particular properties that we made sure were designed so that you could have Bluetooth no matter what you're doing in the chip. So if you are updating an image, if you're in the middle of a bootloader operation,
Starting point is 00:06:43 you always have Bluetooth available. So it's a bit special, but I think it worked well. This was very well received. And for a long, long time, we were supporting it. And well, we still do, like I said. But after a while, we saw that we needed something else. So I was indirectly, accidentally put in charge of finding a next generation
Starting point is 00:07:09 or designing a next generation software development kit for Nordic, for Nordic chips. So, well, that's what I've been doing for the last few years. It's been mostly working on Zephyr because we chose Zephyr. And I guess I'm sure we'll talk about Zephyr more. So I'm not going to get ahead of myself, but we chose Zephyr for the SDK, as most people know now, especially engineers that work with Bluetooth chips. And we've been using it, we've been modifying it, we've been contributing to it, we've been expanding it. We've been doing
Starting point is 00:07:39 a lot of things with it for the last, I would say, seven years, I want to say, or perhaps even eight. And that's my job today, still now. I mean, I contribute to open source, mostly Zephyr, but other projects too. I interact with the projects. My team acts as a bridge between the internal Nordic engineers and the open source projects, and we do many, many more things. So that's sort of a summary of my professional career.
Starting point is 00:08:08 I'm not sure if that's what you were asking, but I kind of got on a tangent and went this way. That's fine. But we're going to do lightning round next, where we ask you short questions and we want short answers. Sure. Are you ready? Yes.
Starting point is 00:08:21 Okay. Barcelona or Real Madrid? Barcelona. Worst Zephyr macro? I'm just going to use the word macrobatics. All the macrobatics in device tree probably, although I can't choose. There's so many.
Starting point is 00:08:43 GDB. Well, GDB is command line debugging. I don't understand the question. GDB or printf debugging. Definitely printf. Best O'Reilly cover animal. Oh, real or fictitious? I don't know, honestly.
Starting point is 00:09:04 Off the top of my head, I really don't know. You always choose your own cover animal. Oh, that's true. Of course. Yes. I'm really bad at this. What was it? I don't even remember. It's been so long and I've had the song in between. Anyway, yes.
Starting point is 00:09:20 I don't remember. Who was it? Now I have to search it because you put me on the spot now. You know, it's been a long time since then. It has been a long time. Yeah. What is it then? Oh, yes.
Starting point is 00:09:37 But I don't know the name of the bird in English. Do you know it? It's not a robin. It's a jay, maybe? Could it be a jay? It would make sense if it was a blue jay. Yeah, I think it's probably a blue jay. So can I say blue jay, then?
Starting point is 00:09:51 Sure. Yeah, okay. And blue jay it is. Would you rather work on applications or operating systems? Operating systems. Complete one project or start a dozen? Complete one project. Favorite fictional robot?
Starting point is 00:10:07 So I have to say, I'm not a big sci-fi fan, but I'm going to say the robot, the female robot, I think she's called Maria, from a film that I really enjoy, Metropolis. Oh, that's like the original sci-fi film, right?
Starting point is 00:10:22 Sort of, yeah, yeah. It is. Do you have a tip everyone should know? Use git bisect. No, really, no. I know. Yes, everybody should know about connecting vessels. I remember when I learned this, it blew my mind,
Starting point is 00:10:39 like how to pour water using tubes and difference in height. Connecting vessels. Or git bisect. I don't know. Both are important. They're both good, but only use one with code. Right. So Zephyr.
Starting point is 00:10:55 Chris and I have been getting a crash course in Zephyr after kind of seeing it for a while. But as we're working in Zephyr on multiple platforms, multiple processors, one of which is Nordic, we're coming across a lot of things that are good and bad. Do you have favorite parts of the system? I do. I do, actually.
Starting point is 00:11:23 My favorite parts of Zephyr are actually the kernel APIs, I would say. I think they're very clean. I've always liked it in general, and I've always liked them. I think they were designed... Some people don't think the same, by the way. This is not a widespread opinion, I guess, or at least I haven't. But in my particular opinion, and compared to others, and I have had a chance to compare to other kernels, I think they're great. Maybe it's because I'm biased and I've worked with them for too long now. But, you know, I really, really, really think they are, in general,
Starting point is 00:11:57 well-designed and they are very functional, easy to understand. And these are the ones that let you sleep the processor or let you send signals? Correct. So among others, they will allow you to sleep for a certain amount of time, yield a thread, take a semaphore, spawn a thread, poll, for example, on multiple synchronization objects and so on and so on. But some of the ones that I think nailed the use case and have been widely used over and over and over are some of the ones that perhaps are lesser known by those starting in Zephyr.
Starting point is 00:12:39 The message queue, for example. k_msgq, that's a great API. It does require you to incur an extra copy, so to speak, but other than that, it's super practical, and it's one of the easiest ways you have to pass arbitrary data between two threads.
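The "extra copy" trade-off mentioned here can be illustrated outside of Zephyr. What follows is a minimal sketch in plain Python, not Zephyr code and not how the kernel implements its message queue: the class name CopyingMsgQueue and the message layout are made up for illustration. It only models the idea that a fixed-size message is copied into the queue on put and copied out again on get, so the sender can reuse its buffer immediately. Blocking, timeouts, and thread wake-ups are left out.

```python
import struct
from collections import deque


class CopyingMsgQueue:
    """Bounded queue of fixed-size messages, stored by value as raw bytes."""

    def __init__(self, fmt, max_msgs):
        self._fmt = struct.Struct(fmt)  # fixed layout, similar to a C struct
        self._slots = deque()
        self._max_msgs = max_msgs

    def put(self, *fields):
        """Copy the message into the queue; return False if the queue is full."""
        if len(self._slots) >= self._max_msgs:
            return False
        self._slots.append(self._fmt.pack(*fields))  # the copy in
        return True

    def get(self):
        """Copy the oldest message out; return None if the queue is empty."""
        if not self._slots:
            return None
        return self._fmt.unpack(self._slots.popleft())  # the copy out


# Hypothetical message: a 16-bit sensor id followed by a 32-bit reading.
queue = CopyingMsgQueue("<Hi", max_msgs=2)
sample = [7, -120]
queue.put(*sample)
sample[1] = 999         # the sender reuses its buffer after the put
received = queue.get()  # the queued copy is unaffected by that change
print(received)
```

Because the queue owns its own copy of every message, neither side has to reason about buffer lifetimes, which is the practicality being described; the cost is the copy itself.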
Starting point is 00:12:55 So I definitely recommend people coming to Zephyr to look not only at the ones that are mostly, that you come across always, like k_sem, k_mutex, k_thread_create, and so on. But actually go beyond those and look at the documentation, which I have to say, in my opinion, the kernel one is very high quality. And look at those extra ones, look at those additional ones
Starting point is 00:13:17 that you don't find on every single sample. You actually have to go looking for them a little bit. Newcomers to Zephyr is a good thing to talk about. Do you have suggestions? When I started with the nRF5340 Audio DK, I know codecs pretty well, so I was comfortable with the audio. I know BLE enough to know GATT and peripherals and all of that. But I didn't know the audio BLE part, which was big. I didn't know Zephyr, which was large, very large. And I didn't know Nordic's Zephyr, which was different. Do you have a good way of getting started
Starting point is 00:14:08 without getting lost in any of these rabbit holes? I want to say yes, but that would be optimistic. The point is, Zephyr has a steep, steep learning curve. That's a fact. I mean, this is something we've repeated time over time, and we've tried to mitigate that. When I say we, I'm talking here, I'm actually putting my Zephyr hat on and then changing it for my Nordic one and so on, then switching them because we've tried on both sides, right? Very hard. And I think we've accomplished some of it to help out new users.
Starting point is 00:14:41 But the first thing to know is that you need to take it slowly with Zephyr. You need to start with a simple sample, understand perhaps just how a thread is created, how a semaphore is posted, the very basics, without trying to understand everything at once. Don't go looking, why is there an overlay file here?
Starting point is 00:15:01 What's an overlay file? Why is there a .conf file here? So start with the code, run it, compile it for your development kit. It's almost certain that your development kit or a very close variant will be supported because there's so many boards supported in Zephyr. Not in NCS, but in Zephyr, yes. In NCS, we reduce the number of boards to those obviously sold by Nordic. But by the way, sorry, NCS means nRF Connect SDK, which is a Nordic SDK that comes with,
Starting point is 00:15:31 it's not really a flavor of Zephyr, it includes Zephyr. So it's a bunch of things, including Zephyr. And that whole package, which is really a large number of Git repos, that whole thing, it's called nRF Connect SDK. And that is what we maintain in Nordic, and that is our software solution for all of our developers. So if we talk about Zephyr, upstream Zephyr,
Starting point is 00:15:52 vanilla Zephyr, as people tend to refer, then there's so many boards in there. You can pick your own board or buy a cheap one. You can even use QEMU or, you know, you don't even need to buy hardware. The point is you try a little bit, you add a printk statement, you add, perhaps you add
Starting point is 00:16:10 one semaphore, and you post the semaphore between two threads, and you start playing with that. Now, then slowly, you jump into what probably is the hardest part when you start with Zephyr, which is the build and configuration systems. They're really tricky. But they're tricky for a reason.
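The "add one semaphore and post it between two threads" exercise suggested above is small enough to sketch. This is a hedged illustration using Python's standard threading module rather than the Zephyr kernel API, so the names (data_ready, producer, consumer) are invented for the example. The shape is the same as the beginner exercise: one thread blocks taking the semaphore, the other gives it once its work is done.

```python
import threading

data_ready = threading.Semaphore(0)  # starts at 0, so the first take blocks
log = []


def producer():
    log.append("producer: work done")
    data_ready.release()  # "give" (post) the semaphore


def consumer():
    data_ready.acquire()  # "take": blocks until the producer posts
    log.append("consumer: woke up")


# Start the consumer first to show that it really waits for the producer.
consumer_thread = threading.Thread(target=consumer)
producer_thread = threading.Thread(target=producer)
consumer_thread.start()
producer_thread.start()
consumer_thread.join()
producer_thread.join()
print(log)
```

The consumer's entry always lands after the producer's, whichever thread the scheduler runs first, because the take cannot complete before the give.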
Starting point is 00:16:27 When you have an operating system like Zephyr that supports not only so many different boards, but so many different SoCs, so many different applications. So we go from 16K of RAM into gigabytes of RAM, right? Because Zephyr
Starting point is 00:16:42 actually runs on big iron as well, right? Some applications, but some of our member companies, some of our users actually use Zephyr in very, very big chips. But at the same time, you can still build and run Zephyr, including a Bluetooth stack for the BBC micro:bit, which is the original version, right? Which is a Cortex-M0 with 16K of RAM. So why am I saying that? I'm just trying to explain why the complexity is there, right?
Starting point is 00:17:11 You could skip some of that complexity by essentially compromising on the scalability, compromising on the extensibility. But if you don't, then you pay a price and the price is complexity. That complexity is there to avoid adding complexity to your application later. So you do need to understand the basic frameworks that allow you then to take your application and be close to the dream of being a rebuild away from switching chips.
Starting point is 00:17:41 And I can switch them to another vendor. And that's why we have things like Kconfig and device tree, which are very complicated, we all agree, which have thousands of entries only in the main Zephyr tree, let alone if you then use an extension of Zephyr, right? Like the Nordic SDK or anything else. There are many extensions to Zephyr that can be downloaded. So the point is, take those slowly
Starting point is 00:18:06 and try to understand them one after the other. So try Kconfig, create your own Kconfig option, change the values of existing ones, see what happens then in the code. Do the same for device tree. Device tree is particularly difficult because it involves you understanding not only the actual device tree
Starting point is 00:18:25 source, which is a way, a language to describe hardware, which is complicated enough in itself. But on top of that, you need to understand the schema for those files, which essentially is what we call bindings. Those are YAML files that describe what you can write in device trees, right? So the whole thing is overwhelming, I understand. And I haven't even mentioned the fact that we use CMake, but we have our own set of CMake APIs that begin with zephyr_. And there's a very valid reason for that that I won't go into today.
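The idea of bindings as a schema for device tree files can be made concrete with a toy model. The sketch below is plain Python and is not Zephyr's tooling: the binding is written as a dictionary instead of YAML, and the compatible string and property names (sample-interval-ms, friendly-name) are hypothetical. It only shows the relationship described above, where the binding says which properties a node may contain and the node is checked against it.

```python
# Hypothetical, heavily simplified binding. Real bindings are YAML files
# with more keys; this keeps only a property list with types and "required".
BINDING = {
    "compatible": "example,temp-sensor",
    "properties": {
        "sample-interval-ms": {"type": "int", "required": True},
        "friendly-name": {"type": "string", "required": False},
    },
}

PY_TYPES = {"int": int, "string": str, "boolean": bool}


def validate_node(node, binding):
    """Return a list of problems found in a node's properties (empty if none)."""
    problems = []
    specs = binding["properties"]
    for name, spec in specs.items():
        if spec["required"] and name not in node:
            problems.append(f"missing required property '{name}'")
    for name, value in node.items():
        if name not in specs:
            problems.append(f"unknown property '{name}'")
        elif not isinstance(value, PY_TYPES[specs[name]["type"]]):
            problems.append(f"property '{name}' should be {specs[name]['type']}")
    return problems


good = validate_node({"sample-interval-ms": 500, "friendly-name": "ambient"}, BINDING)
bad = validate_node({"friendly-name": 5, "speed": 100}, BINDING)
print(good)
print(bad)
```

Reading a binding before writing a node turns "why does the build reject my overlay" into a lookup, which is part of why learning the two together is worth the effort.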
Starting point is 00:18:58 But if you are curious, you can ask around in the community. They will explain to you why, right? So there's a bunch of things that you need to start taking on very, very slowly. Now, there is one series that I really, really enjoyed. And I read it even though I've been working on Zephyr for years. And it's from a company called Memfault. They do very, very good blog posts. Yeah, so they have this website or this blog called Interrupt.
Starting point is 00:19:26 And in it, they have this series called Practical Zephyr. So I would very, very much recommend those. They're very good. I think they do a great job at explaining the basics, Kconfig, and then device tree. It stops there, but they do a great job. But of course, also, Nordic, you know, we've put a lot of effort into also training, training solutions, training materials. And we have the
Starting point is 00:19:53 Dev Academy, to which I'm happy to say I've contributed myself as well. And, you know, as a technical consultant, so to speak, although I haven't written the content myself. But I think that those courses are also absolutely great. The only downside, if you call that a downside, is that they're obviously oriented towards Nordic's SDK, and less so to Zephyr. But there's a lot of content in there that's applicable to both Nordic and Zephyr, right?
Starting point is 00:20:18 So Dev Academy, if you're using Nordic, that's, you know, unmissable. You have to read it. You have to go through the courses. And no matter what you're using, if you're using Zephyr, be it with Nordic or not, I would very much recommend the Interrupt series from Memfault, Practical Zephyr.
Starting point is 00:20:35 That would be what I'd do if I started now. The other thing that I would also absolutely recommend is going to the YouTube channel that Zephyr has. And there's some really great videos there for driver development, for device tree as well, Kconfig, the build system, there's all sorts. Just search. If you sort by views, I'm sure you'll get a pretty good feeling of the ones that have been successful. And there's a reason for that. The presenters there, they did a great job two or three years ago. And it's, you know, it's aged pretty well, by the way.
Starting point is 00:21:06 I was re-watching one of them earlier this month. And those also, if you prefer listening or watching to reading, those do a great job at introducing those as well. And then finally, Discord. Please join Discord. Please ask questions on Discord. Avoid creating, if you fail to understand something, avoid creating a GitHub ticket
Starting point is 00:21:29 because that's really not their purpose. Their purpose is to contain actual bugs, right? And we've had so many users use that or GitHub tickets for questions. Either use the GitHub discussions. I don't use them myself that much, but I know they're popular among the community. But Discord. Discord is perhaps the king of real-time communication in, well, I guess in the world almost right now. Perhaps not. But definitely
Starting point is 00:21:53 for Zephyr. Zephyr, we have a Discord server. It's extremely active. People help each other a lot. Every day, every day I see dozens of conversations. We have channels for all possible topics. And I very, very much recommend joining this Discord server as soon as you start your path of Zephyr discovery. I did the Memfault Interrupt blog, Practical Zephyr. That was really good. Dev Academy, I did a little bit of. I'm more of a reader than a watcher.
Starting point is 00:22:29 And Discord, I've been on that Discord for a long time, but it's so noisy that I just kind of forgot about it. I just let the messages pile up and didn't really think about. And I was worried that I wasn't sure if my questions were Nordic or Zephyr for a long time. Oh, but that's fine. We have a Nordic channel, right? So you can ask them in the Nordic channel, and if they're not Nordic-related,
Starting point is 00:22:54 perhaps, I think we, you know, I don't want to, like, blow my own horn, but I think we're relatively good at redirecting people towards the right channels. And it's perfectly okay. Sometimes we get questions about device tree in the random channel or in the Discord channel, which is supposed to be about the Discord server itself.
Starting point is 00:23:11 And that's perfectly fine. So please don't hesitate to ask. We, in general, those on Discord trying to help other users, I'm sure we'll redirect you properly. And if not, it doesn't matter. Many questions get answered in the wrong channel anyway, every day, and that's perfectly fine. Well, that's good.
Starting point is 00:23:30 I have a UART2 DTS question that I will be posting there very shortly. All right. Sounds good. One of the difficulties I had, just on top of... I didn't find device tree all that intimidating, maybe because I've seen things like that before. But the language of it and how it describes hardware, I kind of like that. I mean, it's a giant macro system. It's what I would have designed. I don't, I mean, under the hood it's a macro system, but that's not how it appears in the source. No.
Starting point is 00:24:11 But the trouble I had most was, okay, yeah, I understand Kconfig. Yeah, I understand DTS. It's the hierarchy and the inheritance that happens from board to peripheral to maybe SoC. So there's all these DTS files that kind of build on each other in the application and walking and kind of seeing how those flow from one to another throughout the entire source tree, especially when you have a complete Zephyr, you know, repo with everything in it. It got very confusing to see, oh, this isn't working because 14 DTS files upstream, this pin was set to be something else.
Starting point is 00:24:46 Right, right. Not only is it a very good point, it's one that we've been struggling with for years now. The problem is you have to cater for those end users that use Zephyr exclusively to build their own application, and they're mostly worried about their final result, right? So they have their application and they have their device tree files,
Starting point is 00:25:08 their board, potentially, if they're not using a DK. But then you also need to support this operating system in general, meaning we have to build it against a thousand different targets, combinations and so on. So then it becomes essentially a compromise or a balancing game between,
Starting point is 00:25:24 let's make it clear where these files come from for users. And that's no easy task, as you've discovered yourself. But at the same time, we have to be so flexible that if you want to change a single device tree node, you can do it at the SoC level, at the board level, at the application level, or even at the command line level. And we do that, well, in part, because we think it's the right thing to do from an architectural point of view. So having multiple entry points. But most of all, we do that because we need to. If we didn't do that, then we would have to duplicate stuff.
Starting point is 00:25:57 And we'd end up with tons of duplicates. And that's the reason, really. But I understand what we've done to mitigate the issue that you've just described is to try in the different parts of the Zephyr documentation, which I agree could be improved in that regard. But what we've tried to do is have bullet points for the sequence of inclusion for device tree and Kconfig files. And we have many sections in the documentation that deal with that. But there are two main ones, right? And one is called application development
Starting point is 00:26:28 inside developing with Zephyr in the documentation, in the main vanilla upstream documentation. And then the other one is inside build and configuration systems, where especially if you go into the build system and sysbuild, which is something else we can talk about, they try to make it clear how this inheritance works.
Starting point is 00:26:54 Not only for Kconfig and device tree, but also for other files that are also inherited. So that's the reason, and that's how we try to mitigate it. But I completely agree with you. It's very complicated. And the downside also is that although for those systems, you do get a consolidated view, unique view in your build folder of the whole device tree once processed, once massaged,
Starting point is 00:27:20 once everything has taken place. And also the same for Kconfig. You can't blame that, right? You can't git blame that. So you can't know who introduced what because that's generated at build time. So like you say, I find myself now not seeing my UART output anymore
Starting point is 00:27:35 and I have to go and either I go to that file, the consolidated final file in my build folder and look at the nodes there, or I have to make an exercise of jumping back and forth between board, SoC, and application configuration files, configuration overlays or overrides in order to find out what on earth happened. And it is difficult.
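The problem described here, a consolidated output that cannot be traced back to whichever file set a value, can be modelled in a few lines. This is a toy sketch in plain Python and not how Zephyr's build system works internally: the layer names follow the levels listed in the conversation (SoC, board, application, command line), the rule that each later layer overrides the earlier ones is a simplifying assumption, and the keys and pin numbers are invented. The point is the history record, which is exactly what the generated file in the build folder does not give you.

```python
# Hypothetical layers, applied in order; later layers override earlier ones.
LAYERS = [
    ("soc", {"uart0.status": "disabled", "uart0.baud": 115200}),
    ("board", {"uart0.status": "okay", "uart0.tx-pin": 6}),
    ("app-overlay", {"uart0.tx-pin": 20}),
    ("command-line", {}),
]


def merge_with_provenance(layers):
    """Merge layers into one view, recording every layer that set each key."""
    merged, history = {}, {}
    for layer_name, settings in layers:
        for key, value in settings.items():
            merged[key] = value
            history.setdefault(key, []).append((layer_name, value))
    return merged, history


merged, history = merge_with_provenance(LAYERS)
for key in sorted(merged):
    trail = " -> ".join(f"{name}={value}" for name, value in history[key])
    print(f"{key} = {merged[key]}   ({trail})")
```

With a trail like this, "my UART output disappeared" becomes a matter of reading one line of history instead of jumping between board, SoC, and application files by hand.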
Starting point is 00:27:59 I agree with that. It is difficult. We'll try to improve it. We always will, but I think we have a renewed energy now towards improving this after the introduction of sysbuild as well, which complicates things even more because it's another layer on top of everything else. So, yeah. I have a question from a listener, Tim, who says Nordic seems to have gone all in on Zephyr over the past few years. What has that process been like?
Starting point is 00:28:37 How has it been to go from the SDK and the soft devices to switching over to this big thing that you don't really quite control? Well, it's been difficult and challenging and interesting and fun. So, for me personally, this has been almost my, well, not almost, it has been my main project at Nordic for the last few years. So, for me personally,
Starting point is 00:28:57 it's been a big part of my life, actually. Not only Zephyr itself, but actually using Zephyr at Nordic. And, well, I think the hardest part, honestly, that we got through or that we essentially overcame was not too hard, but it was a matter of convincing internally those that were in charge at the time
Starting point is 00:29:22 that using open source was not only a good idea, but was also the future. And there was a lot of hesitation at the time. There were internal voices that decried the effort, decried the proposal, obviously, and it's normal. And, you know, I don't think it would have made sense if it had been any other way, because you have to understand that in 2016, things looked very different than they do now, right? Zephyr was a newcomer, had just been unveiled.
Starting point is 00:29:54 There were a couple more RTOSes that were mildly popular, but by no means were taking the world by storm. Bare metal was still the standard, at least for many, or bare metal combined with FreeRTOS, but always using silicon vendor HALs or similar, right, drivers. Something as wide-encompassing as Zephyr, people were afraid of it, right, for many reasons. And, you know, and honestly, rightfully so. I mean, this was a huge bet. So the thing is, it surprised me. It wasn't as hard as I thought,
Starting point is 00:30:32 probably because we did a good pre-study. So we sat down and said, okay, look, let's take Zephyr as it was back then, right? 2016, 2017. And let's try things out. Let's build a small sample with Bluetooth. Let's see what the state of it is in all its areas, right? File systems and networking stack and Bluetooth and kernel and drivers and so on and so on.
Starting point is 00:30:53 So we did all of that and we documented it thoroughly. But most importantly, I think what we said is there's a lot to do, but there's a good foundation. So I think we can start from here. So we had vendor neutral backing on the side of the Linux Foundation. So
Starting point is 00:31:10 we knew we weren't going to be tied to a particular architecture or vendor. The decisions would be taken by committee. So that was great. We knew that the code base we started with and that we would start contributing to was already of good quality. We knew that there was a focus started with and that we would start contributing to was already of good quality. We knew that there was a focus
Starting point is 00:31:28 on test and security, which is something that we really want. Because in many of these open source projects, security and test, they're an afterthought. Not in Zephyr. They were there from the very beginning. So then there was a point, you know, I gave a talk last year and
Starting point is 00:31:43 I mentioned a quote from a meeting. We said, at some point, someone said, okay, look, we can either wait to see if Zephyr and in general, open source in microcontrollers end up happening, or we can make them happen. So Nordic is not a huge company, but we are popular. We make popular MCUs. I think we could make it happen, or at least help make it happen, right? Obviously, it wasn't us who invented Zephyr or started Zephyr. But I thought, you know, we have enough power
Starting point is 00:32:12 that I think with our support, Zephyr could become at least more relevant. I don't think we ever dreamt of how widely used Zephyr has become now, but at the very least, we knew that it could be a strong player in the world of open source RTOSes. So that was actually not too bad.
Starting point is 00:32:29 The hard part, so coming back to the original question, the hard part was actually moving to a new development model and making people inside the company work together in a way that was compatible with upstream Zephyr. And at the same time, that helped us provide value for our customers because we obviously don't want to be just a company that takes some software that's made elsewhere and puts it on some chip.
Starting point is 00:32:52 We actually want to add, like every other company, some special sauce here and additional algorithms, features, et cetera. So maintaining that balance between working downstream in the SDK, working upstream so that we ensure that Zephyr was a success, that our chips and our boards were usable. Not only usable, but were actually optimized for Zephyr upstream as well.
Starting point is 00:33:25 And maintaining that balance and at the same time reorganizing internally so that we would commit to using GitHub, to changing the review process, to dismantling the silos that we used to have in software development and trying to come all together and contribute to a single code base. All of that was hard, very hard. And that really took a while. Was there anything that you found surprisingly easy? Aside from convincing everyone, I think the transition to GitHub. I was expecting it to be worse because everybody was used to another code review system, an internal one, and so on.
Starting point is 00:33:54 So people actually liked it. And unlike other things that we changed, this was actually, in general, very well received. So it surprised me because I thought, oh my, we're going to get a thousand proposals to use something else for code review. And we did. But when we said, no, look, we're sticking to GitHub just to be consistent with Zephyr and also because it's simple, it works, everybody uses it, so why not? You know, we didn't really get any pushback. In the same way, that also surprised me, another one that was easy, actually, is the change to the coding style.
Starting point is 00:34:29 We changed from the internal coding style we had with the nRF5 SDK, and all our software, really, to Zephyr's. Why? Because you don't want your customer to have to switch between two coding styles when looking at the code base, right? So we made this decision. That was surprisingly easy as well.
Starting point is 00:34:46 People seem to, if not like, at least adapt very quickly to it. Let me ask another question that turned out to be very popular, although I'm not quite sure how to ask this. Let me start with Bluetooth something something Wi-Fi something something release date.
Starting point is 00:35:04 What? Sorry? I'm trying to lead him into telling me when we're going to have a BLE chip and a Wi-Fi chip. And it will be released to the market. I see, I see. They're not going to tell you that. I cannot say that. I checked.
Starting point is 00:35:17 I checked. Before the interview, I checked. And unfortunately, I cannot share anything that's not on our website already. So I did see also a question in the Google Doc, but unfortunately, I cannot say. Well, we do. I don't think it's a mystery that we're a company that makes Wi-Fi chips and Bluetooth chips. That's obviously well known to everyone.
Starting point is 00:35:40 So I would say it's highly likely that we'll release one that combines both. When? I really don't know. I don't know myself, to be completely fair and honest. So I don't know. Dakey Poo suggested the question, how hard is it to integrate Wi-Fi and BLE into one IC? What?
Starting point is 00:35:58 I feel like he's trying to... Again, leading questions. Very, very clever workarounds to asking the question. Yeah, that's a very clever workaround. Well, look, I'll tell you from, because I'm not a hardware engineer, I'll tell you from the software perspective, given the architecture we have now.
Starting point is 00:36:14 So from the software perspective, it's actually not too hard because Zephyr, and by extension NCS, allows you to enable and disable everything you want. And in general, it's very well tested against running things in parallel and concurrently. So running the TCP/IP stack, which is, by the way, the stack we use in our Wi-Fi products today, and the Bluetooth stack concurrently, it turns out to be a fairly well-known and popular use case already now.
Starting point is 00:36:41 So in different chips and combinations with and without Nordic chips. So actually the software architecture changes, if any, to ensure that both the Wi-Fi stack, which in our case means essentially the higher layers, right? The lower layers actually run typically in a small core within the Wi-Fi subsystem of the chip or whatever. And that's the case for many other vendors as well. But the higher layers combined with the Bluetooth protocol stack, the entire Bluetooth protocol stack, often including both the upper and lower layers, they are very easy to combine.
Starting point is 00:37:18 And that should not be a big problem. So from that perspective, speaking from the software side, it's easy. Or rather, nothing is ever easy, but it's certainly designed for. So then it comes to hardware, but unfortunately I don't know anything about hardware or very little. Well, then let's go back to software and ask the question from Timon about why NCS exists. I mean, it's hard because Zephyr does so much, but there are also modules and you can pull in and out things. Not modules. That was another thing I was going to ask, but nevermind. There are lots of subsystems. Why is it Nordic's Zephyr instead of Zephyr's Nordic?
Starting point is 00:38:08 Well, he said it's Nordic SDK that includes Zephyr. Exactly. But why isn't it the other way? Right, okay. Okay, yeah, that's a very fair question. The straight answer is because when we started, nothing of what you see existed. So almost nothing of what you see today,
Starting point is 00:38:26 the extensibility of Zephyr using Zephyr modules, for example, right? The west tool didn't even exist. So we started gearing up towards developing all of this tooling, changing Zephyr so you could do almost everything out of tree, and we did that for two reasons. We did it for us and for our customers. When I say our customers, I mean Nordic customers, but I could also say for our users referring to Zephyr users.
Starting point is 00:38:54 So if you're writing an application, the last thing you want is to modify someone else's C file and then have to commit that diff. What you want is to have your files outside in your own repo and use Zephyr and everything else, and see Zephyr or anything else that's provided for you essentially as a big, huge library that you use and then update when you need,
Starting point is 00:39:12 but you don't need to touch unless strictly necessary. So we didn't do, or Zephyr did not contain NCS because Zephyr at the time did not have the ability to do things like that. And once it did, we still wanted full control. So it's really very much, there's multiple factors. One is control. Why?
Starting point is 00:39:35 Well, the problem with Zephyr is that it is a community. It's not a problem. But Zephyr is an open source project. And that means that its development is led by agreement of the different parties contributing to it. Now, we are a hardware company. We sell chips. So there are times when we have to release on time for our customers. So those two are not really compatible.
Starting point is 00:39:57 So that's why we have a very lightweight Zephyr fork, meaning we take the Zephyr tree, the main tree, right? And we have our own variant of it, if you want. But it's very lightweight. We try to keep the changes in there to a minimum. And instead, we put all the functionality around it. Now, that functionality, for the most part, doesn't overlap with what Zephyr offers. So that means that for the most part, what we're offering there is additional value, additional algorithms, support for hardware, applications, extensions, all sorts of things that are useful to users, but to Nordic users. And this is part of the added value that Nordic sees when delivering the SDK.
Starting point is 00:40:42 So now, could we do it the other way around? Not really, because even though Zephyr does include now the functionality to extend it in a manner that you don't need to modify its code, it doesn't mean that we would be able to do what we do if we relied on the open source project. So that's the fact that we need the control, the fact that we sometimes also do things that are simply not allowed in Zephyr. Until very recently, you couldn't distribute binary blobs
Starting point is 00:41:14 as in pre-compiled libraries as part of Zephyr. We, as in Nordic, championed the introduction with other companies in order to enable vendors, not Nordic, because we already had our solution, other vendors to be able to provide their own binary blobs. But in fact, what we do, what we've been doing since the beginning is to have a special repository where we put our binary blobs.
Starting point is 00:41:36 And those are important because on some of those, we can't share the code because perhaps we don't own that. That's very common in silicon vendors. You don't own all the IP you have sometimes, right? So perhaps you can't share the code, so you have to ship it as a pre-compiled library. Or in other cases, it's just easier for the user because the pre-compiled library is pre-qualified, for example, for Thread, right? So you don't want them to have to go over the Thread qualification, so you provide a pre-compiled Thread stack that they can use and
Starting point is 00:42:02 they can skip that. So there are multiple reasons. And until very recently, that wasn't possible with upstream Zephyr. So shipping binary blobs was an important thing as well. Then there's all the branding, our own documentation, our own extensions, everything that's part of the ecosystem but not part of the software trees themselves, those are very important as well. And having our own distribution,
Starting point is 00:42:32 our own SDK controlled by us, managed by us, released by us, that was also one key requirement for us jumping headfirst into open source. So there's all of that. Will that change in the future? I don't have a crystal ball. You know, I don't know. But how we do it today was a logical sequence of decisions
Starting point is 00:42:55 based on what was there at the time, what we wanted to give our users, and the approach we took. So, yes. Is that something that you could foresee reversing in the future when Zephyr becomes capable enough that it makes more sense
Starting point is 00:43:12 to go the other direction? It's a good question and it's one that I really don't have an answer for. Not in the short term. There's too much in NCS, there's too much in our SDK that it's not... There's one additional factor that I mentioned
Starting point is 00:43:30 that I will add now for completeness and a full understanding of this, that some of our source code, it's actually not open source. The one we ship in our SDK, although good parts of it are shipped as source code, the license we use is essentially a modified BSD license where we add a clause saying you can only use this software with Nordic chips.
Starting point is 00:43:53 This is quite common in other SDKs from other vendors, but you cannot call this open source because it's not part, that license is not in the list of OSI-approved licenses. So that means that we couldn't contribute that code to Zephyr directly. If we wanted to, we would have to change the license, which potentially we could, but does Nordic as a company want to do that?
Starting point is 00:44:13 That would fall on people above my pay grade that decide these sorts of things. And now they've decided not. They want to keep the value added and use that license, which I think makes a lot of sense. When you look at the amount of things we've contributed to Zephyr so far, and the ones we plan to contribute as well, to keep a little bit for ourselves. That's always a hard balance because, yes, the company needs to make money.
Starting point is 00:44:43 On the other hand, we want to say open source and we want it to mean open, open source, not mostly open source. Correct. Yes, and it's very hard. This is actually going back to the earlier question. One of the hardest parts is just deciding, for every new module, framework, sample,
Starting point is 00:45:04 does this go up or downstream, right? We have to make a call for that. And we have our own internal processes for that so that we all agree within Nordic to an approach to going up and down and then we act accordingly. But it is hard sometimes because there are risks with doing things downstream. For example, a competitor of the same functionality
Starting point is 00:45:25 may appear upstream and suddenly we have two implementations of the same thing. That's a problem, right? If you keep it downstream. But at the same time, certain things, certain functionalities specifically that we know gives us value when compared to other silicon vendors,
Starting point is 00:45:43 we want that to be part of NCS and not usable with other chips, just because it makes sense, right? In that regard, from a company perspective. So there's this walk line you have to, sorry, there's this fine line you have to walk where you want the project to be successful,
Starting point is 00:46:00 the Zephyr project. You want as many contributions as possible so that you ensure that. You want also to remain optimized and compatible upstream. And at the same time, you have to save something for the SDK. But over years, I'd like to think,
Starting point is 00:46:17 or I think, my opinion at this point is that we've refined the process so much that it's pretty clear by the moment we conceptually come up with an idea and say, we need to do this, where it's going to go. Because we know each other pretty well now. And everything that's a big system ends up upstream, right? Because we can't start modifying all of our files in a direction that's incompatible with upstream. And that makes sense because it's the only sensible approach. And then individual features, individual samples, things that are more self-contained, that
Starting point is 00:46:52 encapsulate a particular feature or extra functionality, those typically stay downstream. And you work almost entirely on open source. Yes. Well, go on, sorry. I mean, to the extent that we've been discussing that it's open source. Exactly. That's really cool that you get to work on open source and get paid for it, which is, you know, good. Has that become a core part of your next job?
Starting point is 00:47:28 Like, I'm not saying you're leaving Nordic. I have no information, no reason to think that. Stop panicking, Nordic folks. But having worked on open source code, is that something you think is important for your career? Or do you think it's just one step? It's just programming? No, absolutely.
Starting point is 00:47:50 Let me put it this way. I don't think I'd go back to only proprietary source code. It's fine to have some proprietary source code. It's fine. I absolutely think that combining proprietary and open source software is the right thing to do in some occasions. I think that companies do it for a purpose that makes sense.
Starting point is 00:48:08 And I'm all in on that. But only proprietary, not using an open source project as a foundation. I don't think I could go back to it now. Perhaps there's many reasons, but the main one, I think, is the fact that working with open source allows you to meet
Starting point is 00:48:25 so many talented developers. And this may sound a little bit like a cliche, but it really is true. I mean, the sheer number of people that contribute to Zephyr, the different companies, the different coding styles, the different approaches, the different backgrounds, all of that gives you a... It basically enriches you in a way that I think working in an office with a few developers would never give you. So there's that. And then there's the fact that I enjoy open source as a philosophy. I actually think it's a very good philosophy and
Starting point is 00:48:58 a very good way of developing software, beyond all of the political or, you know, I just think it's a very smart way of developing software. So much so that I think it would be a mistake not to use that way, this mechanism, this approach nowadays, especially for complex software. And software on microcontrollers has become so complex that you either do it with open source in a sort of collaborative manner among all the vendors, or you end up with a kludge. What would have happened?
Starting point is 00:49:28 We would end up with a kludge, a mishmash, a mix and match of different open source projects in part, proprietary modules, all together, tied with string probably, if we didn't have Zephyr. I mean, right? Because there wouldn't be a unifying factor. Zephyr gives you that central point that Mbed TLS integrates with, Trusted Firmware-M, MCUboot, all of these additional satellite,
Starting point is 00:49:55 I'm going to call them projects, perhaps not the greatest word, but they are satellites to Zephyr. And all of those work together because Zephyr tests them together, right? And we ensure that those work together. If we didn't have that, I don't even want to think what the code base would look like, to be honest. I don't think it'd be good.
Starting point is 00:50:12 I might be wrong. So going back to the question, absolutely, I don't think I would go back to working on proprietary software exclusively. If I ever change jobs, again, not in my plans, I would definitely go for open source. Definitely. How does Zephyr test all of the different boards? Is there some Zephyr room that has a thousand tiny microcontrollers? No, not really. The way it's been, this has changed over the years. The way it's done now, essentially, is that all of the CI of the continuous integration in Zephyr happens only by building samples and tests and running them on servers,
Starting point is 00:50:54 meaning on QEMU or native_sim, this additional mechanism that allows you to compile Zephyr into a Linux application that then can be executed natively on any computer running a Linux distribution. All of that happens on servers, servers that are maintained by the project. We actually have our own servers since, you know, I think we've had them for a year or so now.
Starting point is 00:51:17 But that's it. We don't have test farms or, you know, we do have them in the individual vendors' labs, of course, and that's essentially what we do, right? So the individual vendors run the same test suites that are used in the emulated or simulated platforms. They execute those on their own in their test labs. So Nordic, for example, has a daily, I think, two daily builds that take the latest Zephyr,
Starting point is 00:51:46 the latest and greatest, and they run it on our boards, only on our boards, of course. And then every time we find a bug, we report it. And that happens with other vendors as well. So that's the way it works. It's essentially people pool their results together using the GitHub issues. Then we fix those issues as they come. But the actual execution on hardware is done by the silicon vendors, not by the project itself.
Starting point is 00:52:12 So that means that it's completely optional as well, of course. So some silicon vendors do it, some don't. Of course, it's in your interest as a silicon vendor to do it because then you ensure that there's no regressions on your particular hardware. But of course, that depends on your involvement on the project. Okay, so Audio DK. No, sorry. I've been working on the Nordic nRF Audio DK, and it has a lot of examples and applications. And I don't want to ask you about those because I understand that's not your area of expertise. But Zephyr as a whole has so many examples. Which is great.
Starting point is 00:52:53 Except for the part where none of them do what I want and they're all so different that I can't figure out what it was I was supposed to do. Yes. How do I untangle those? I mean, I'm an experienced embedded software engineer. I feel like I shouldn't have to read the code 10 times just to figure out whether or not it's using a ring buffer. I say that as though I pulled that out of nowhere, but it was a discussion recently. Or what was the button?
Starting point is 00:53:23 We had a discussion recently about Zephyr and button handlers, and the person we were talking to was complaining it didn't have a button driver. But it does. But it does. But then I read that button driver and it didn't do what he wanted. So, I see. Examples, good or bad? You know, we had the same problem in our previous SDK, and I think many SDKs have them. Because when you set out to develop something for your users, but you have an operating system that has,
Starting point is 00:54:00 I don't know how many thousands of Kconfig options and how many modules, frameworks, et cetera, how do you do it? I can tell you what we did in Nordic in order to make it easier for our customers to try not to hit that wall where there's just too many samples, each doing an individual small task or accomplishing a goal,
Starting point is 00:54:20 but then it makes it really hard to put together. What we did is we divided the samples into samples and applications, and then they're basically two categories, right? So the samples are samples. So they're testing or they're showing you how to do one particular thing. So this will be, for example, a BLE throughput sample, where it connects two boards, and then it sends data as fast as possible. And that's a sample, because typically you wouldn't,
Starting point is 00:54:42 you'd never ship that, right, in a product, unless you just want to show off how slow Bluetooth is in general with Low Energy, compared to Wi-Fi especially. So you wouldn't do that. So that's a sample. But then we have applications. Those are actually close to what you'd call reference designs in hardware, right? It's essentially, they're more tested. They're fully featured applications in the sense that they have firmware update. They combine multiple subsystems. We actually execute them in some cases
Starting point is 00:55:15 in specially designed hardware. So like the Audio DK, for example, right? The Audio DK is designed for that application. And that application is essentially the only one that runs on that hardware. We have a similar one called nRF Desktop that showcases how to do a mouse and a keyboard using Bluetooth Low Energy and NCS and Zephyr. So that's how we approach this. I think it's not a bad approach.
Starting point is 00:55:41 I like it personally. And I think it gives users a starting point. But what happens if your future application doesn't fall into the category of the ones or doesn't match any of the ones we offer, right? Then you're back to square one, like in Zephyr, where you have to pick a sample and start banging away at your code
Starting point is 00:56:01 and trying to figure out how to put all these things together. I think that at least with Zephyr, you get the easy enablement side of things. So with a Kconfig option, you can enable a subsystem and relatively easily add functionality to an existing sample. But that said, that still falls short. I agree. The thing is, I don't have a magical solution because we still need samples to test things beyond tests.
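Editor's sketch: the point about enabling a subsystem with a Kconfig option can be illustrated with a small Python model of how configuration fragments combine. This is only a sketch under stated assumptions, not Zephyr's actual Kconfig implementation: it assumes simple `CONFIG_NAME=value` lines and a "later fragment wins" merge, and it ignores dependencies, defaults, and symbol types. The helper names `parse_fragment` and `merge_fragments` and both fragment contents are hypothetical.

```python
def parse_fragment(text):
    """Parse prj.conf-style lines such as CONFIG_BT=y into a dict.

    Blank lines and lines starting with '#' are skipped.
    """
    options = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name, _, value = line.partition("=")
        options[name.strip()] = value.strip()
    return options


def merge_fragments(*fragments):
    """Merge fragments in order; a later fragment overrides an earlier one."""
    merged = {}
    for fragment in fragments:
        merged.update(parse_fragment(fragment))
    return merged


# Hypothetical base configuration of an existing sample.
SAMPLE_CONF = """
CONFIG_GPIO=y
CONFIG_BT=n
"""

# Hypothetical overlay that turns on the Bluetooth subsystem.
OVERLAY_CONF = """
CONFIG_BT=y
CONFIG_BT_PERIPHERAL=y
"""

merged = merge_fragments(SAMPLE_CONF, OVERLAY_CONF)
for name in sorted(merged):
    print(f"{name}={merged[name]}")
```

In this model the sample's own sources stay untouched; the added functionality comes entirely from the two extra lines in the overlay, which is the "easy enablement" being described.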
Starting point is 00:56:27 We still need samples even for ourselves. When we're developing, we need samples. That's what we use. And not only tests, we use samples very often. They're also a good starting point for some applications, but inevitably you're going to hit that wall of too many samples and then them not being useful for anyone.
Starting point is 00:56:47 I'm afraid I don't have a great solution to the problem. But like I said, you can mitigate it. You can mitigate it. Perhaps this could be something we discuss. We have many meetings in Zephyr, especially during in-person conferences where we meet and sit down. This is actually a great topic that we could discuss there, how to improve the sample situation in Zephyr where there's too many and perhaps not complete enough. There are sometimes multiple ways to do things, which is understandable given the organic nature of Zephyr, but it's hard for
Starting point is 00:57:28 new people to see the trade-offs. And so if there was a canonical solution, sometimes it would be better. It would. I agree with that. I've experienced that myself where I've seen basically two different samples or tests solve the exact same problem using a different kernel primitive, for example. And the problem is that you have developers who are very often very creative, very smart, and they want to try things out.
Starting point is 00:57:59 And at the same time, you have users, which probably at that point in time, they just want to get their stuff running. And, you know, they want to boot up their board, bring it up and do whatever they need to do with that firmware. So it's difficult because you need flexibility. And at the same time, you don't want to be able to do things in a thousand different ways. But if you've ever programmed in Python, for example, it suffers from much of the same, right? I was actually writing some Python code this afternoon. I said, I can do it in a thousand ways.
Starting point is 00:58:31 I can either iterate or I can use a set. And I ended up going for one of them, but I thought, I don't really know if there's an inherent advantage to one of these. It just, I think the moment you give people flexibility and freedom, you hit that, right? You hit that and it's very hard to avoid. I think the best thing to do in these cases is to look at code that you consider of
Starting point is 00:58:54 good quality, be it Zephyr or not. Not all of the Zephyr samples are going to be of the same quality, but if you see them, some of them referenced again and again and again, or if you like the way a particular contributor does things. That's actually quite often the case. We have people that like other people's contributions. So if you do that, then I think that's also a good starting point to try and select how to achieve one particular thing. That's funny.
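Editor's sketch: the Python aside above (iterate, or use a set) can be made concrete with one plausible version of that choice, removing duplicates from a list. The transcript does not say what the actual task was, so the task, the function names, and the board list below are assumptions for illustration only.

```python
def unique_by_iteration(items):
    """Remove duplicates with an explicit loop, keeping first-seen order."""
    seen = []
    for item in items:
        if item not in seen:  # linear scan of the list on every item
            seen.append(item)
    return seen


def unique_by_set(items):
    """Remove duplicates with a set; order is not preserved."""
    return set(items)


# Hypothetical input with one repeated entry.
boards = ["nrf52840dk", "nrf5340dk", "nrf52840dk", "native_sim"]

print(unique_by_iteration(boards))
print(sorted(unique_by_set(boards)))
```

Both give the same members. The loop keeps the original order but rescans the list for every item, while the set is faster on large inputs, drops ordering, and requires hashable items. For a short list either is fine, which is roughly the situation described.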
Starting point is 00:59:19 I didn't think about that. I mean, I remember in college having favorite teachers, but I don't ever really think about looking at who writes the code. Why have I never thought about that? Well, I do have a few actually in upstream Zephyr, and obviously I'm not going to name names, but you end up kind of, perhaps you have a mind that is more alike. So you tend to select the same resolution to a particular problem. Maybe it's just because the way they write their code is also, from your perspective, the ideal or the optimal way.
Starting point is 01:00:02 Because very often you are mistaken about that. We are all mistaken because you try something out and you actually look at the assembly code in the list file or anywhere in the listings. And it's just nothing at all what you expected. And the compiler has done something completely different. And then your assumptions about what was better optimized completely fall through. But having some people that you trust how they code and that you agree
Starting point is 01:00:25 when they take technical decisions, I think it's something that's definitely worthwhile. Well, the temptation to ask you to name names is very high, but I don't think that would be prudent. I really don't want to name preferences here. Well, Carles, it has been really great to talk to you. Do you have any thoughts you'd like to leave us with? Yes. In line with one of the questions that mentioned or that was asking, was it hard or not to transition
Starting point is 01:00:58 from a proprietary development model entirely to then switching to one that's based on open source, that contributes to open source, that radically changes the way software is developed in a particular company. One thought that I'd like to share with listeners is this: the fact that you see how things are done, and that there's an established approach to solving problems in a company, I don't think that's always a reason to abide by it. So I think that sometimes things around you, and when I say around you, I mean in the industry, among your peer programmers that might not work at the same company,
Starting point is 01:01:38 change, things change, things evolve. And not everybody in your company or not everybody working with you may be on board with that. But I think it's very important to make your voice heard if you think that a company could do better. I'm not even talking about mistakes.
Starting point is 01:01:55 You know, everybody makes mistakes and that's always great to point them out. But those are usually very often pointed out very quickly in a company. What's harder to point out is the general direction. If you think that the general direction a company is going, and I'm not even talking about Nordic now, I'm looking completely in general.
Starting point is 01:02:11 If you see that the way your company is doing things, technically, of course, is not in line with what the world, with the direction of the world, when I say the world, obviously, I mean, the technical, the technical community is going towards, then I think it's very important to speak out. Not because you want to be the person that has, you know,
Starting point is 01:02:36 changed the way things are done in a company, but because if you don't say so, you're actually harming the company from my perspective. Because if you know something or you think you know something that's going to be a deal breaker or a major change or something that's going to completely overhaul how things work in your industry, and you don't mention that, right, then it's sort of like holding out on the rest. And you may end up not only harming the company, harming yourself in the sense that you will not be able to develop the software you always wanted to. You will not be able to see
Starting point is 01:03:11 your products succeed, et cetera, et cetera. So even if you have to go to the CTO, you know, what's the worst? The worst that can happen if you suggest going in a particular direction is perhaps you will, you know, have wasted a few moments of someone's time. But at best, you can actually change the way a company approaches technical challenges. And it might well be that one day, you will find out that that was the right thing to do. And then you'll probably be happy for it. So that's sort of the parting thought I wanted to share with you.
Starting point is 01:03:50 Thank you. That was good. Our guest has been Carles Cufí, open source software engineer at Nordic Semiconductor. Thanks, Carles. Thank you. Thank you to Christopher for producing and co-hosting. Thank you to our Patreon listener Slack group for their questions. And of course, thank you for listening.
Starting point is 01:04:11 You can always contact us at show at embedded.fm or hit the contact link on Embedded FM, which is where the show notes will be, which is what will contain links to things like the Memfault Practical Zephyr Introduction and Nordic's Dev Academy and all of that. And so now I have a quote to leave you with. I mean, the problem is once you start on Don Quixote quotes, you really just tune out for the podcast so that you can read Don Quixote all over again. So let's go with this one. Finally, from so little sleep and so much reading, his brain dried up and he went completely out of his mind. Alternatively, the proof of the pudding is in the eating. I didn't know where that came from, but apparently
Starting point is 01:05:00 Don Quixote.
