Embedded - 234: The Good Word About AI
Episode Date: February 14, 2018

Dustin Franklin of NVIDIA (@NVIDIAEmbedded) spoke with us about the Jetson TX2, a board designed to bring AI into embedded systems. Dusty wrote Two Days to a Demo, both the original supervised learning version and the newer reinforcement learning version. In general, check out Dusty’s github repo to see what’s new. Also, the Redtail project is an autonomous navigation system for drones and land vehicles based on the TX2. The NVIDIA GPU Technology Conference is in San Jose, CA, March 26-29, 2018. Your coupon for 25% off: NVCYATO. The Jetson TX1/TX2 ChallengeRocket contest ends February 18th. You can find Dusty on the NVIDIA forums.
Transcript
Welcome to Embedded.
I'm Elecia White, here with Christopher White.
If you've been listening to the show for a while,
you may have heard me mention my typing robot.
It's based on the NVIDIA Jetson TX2 platform.
This week we have Dusty Franklin from NVIDIA to tell us more about that, and about the GPU Technology Conference in San Jose at the end of March.
He brought coupons for that, too. Let's get started.
Hi, Dusty. Thanks for joining us today.
Hello, everyone. Thank you for having us today.
Could you tell us about yourself as though you were introducing yourself on a panel?
Sure.
Yeah, I'm a developer evangelist on the Jetson team at NVIDIA.
Jetson is NVIDIA's low-power embedded system on module. I work with customers and developers out in the field to bring these next-generation autonomous capabilities to their robot or edge device,
what have you,
and go around spreading the good word about artificial intelligence and deep
learning.
All right.
We have questions about all of that,
but before we get there,
we have the lightning round where we ask you short questions and we want short answers.
And I gather I'm going first.
Not only do we want information about you, we want to know some things that you know about.
It only seems fair to ask you about a particular topic that will be interwoven here.
Not to be too
mysterious, but let's start. What company did George Jetson work for and what did they make?
I can't say I remember, but I remember the boss, who was always very angry, from my childhood,
but I believe,
was it the little spaceships that he made?
I can't remember.
You might have to fill me in on this one.
Spacely Sprockets.
Their main competitor was Cogswell Cogs.
That's right.
Yeah.
I don't know how well you're going to do on this quiz.
One of our partners,
Connect Tech, they have boards named Sprocket, Spacely, and Cogswell.
So that figures.
Okay, this is a list of devices.
Were they invented before or after they were shown on the Jetsons?
True or false?
Flat screen television.
After.
Internet-enabled washing machine.
I don't recall the internet being on the Jetsons,
so I'm going to say after as well.
Computer virus.
That seems plausible, I would say, during the Jetsons.
A tanning bed.
Hmm.
That seems like something Judy, George's daughter, would be into.
So I would say that that was during the Jetsons.
All right.
That was very good, yes.
The Jetsons definitely introduced us to the flat screen TV, a computer virus, and a tanning bed.
And they had no internet that we know about.
Just not visionary enough.
Pretty visionary.
Okay, let's get back to Dusty.
What is your...
So many options here.
If you had to pick one thing to work on for the next year,
would it be machine learning, robotics, or computer vision?
Definitely robotics,
because that's what we've been using
the machine learning and computer vision to implement.
I would say the end-to-end autonomous stack
for navigation, manipulation in robotics
out in the real world deployed.
I've always been about taking embedded form factors
and actually getting out into the world with them.
How long until we have fully self-driving cars?
The answer should be one month, please.
Well, I can tell you, being a resident of Pittsburgh,
I see them driving around without a...
There's still a driver in the seat,
but they don't look like they're doing very much driving.
So I would say that is pretty soon.
I'm sure the folks in our automotive department at NVIDIA
have a much better insight into that.
To be fair, most of the drivers in California don't look like they're driving either.
Put down your phone. All right, let's get on to the real show.
The real show. You said that Jetson is for low power embedded. When I think about embedded, it's much lower power than the Jetson and it's much smaller.
I work a lot with the Cortex-M processors.
Can you help me understand the NVIDIA Jetsons given my small processor perspective?
Right, yes, I should have qualified that as low power
for implementing these high bandwidth, high throughput AI devices
that generally are hooked up to many camera streams,
1080, 4K cameras, LiDARs, have a lot of different outputs
like motor controls and whatnot.
So when we say low power, generally we mean in the sub-10-watt range,
which the TX1 and TX2 qualify for.
Now, I'm very aware from previous lives in Embedded
that low power can mean milliwatts,
or some folks even define low power as 30, 40, 50, 60 watts.
So it's all much like when you're discussing
quote-unquote real-time processing.
It's kind of all about your perspective.
But certainly when we mean low power,
generally it's a mobile-capable device
with a small battery.
You can power it for several hours. The Jetson module is about the size
of a credit card. So the size, weight, and power (SWaP) profile generally fits into these small,
deployable edge computing devices that may need lots and lots of computational throughput for
performing their artificial intelligence,
computer vision-based pipelines.
And the Jetson module is unparalleled in the performance that it can get for such a small embedded module that's still programmed generally through Linux,
seeing that it's only the size of a credit card
and can be deployed in basically the smallest of devices.
So it's a very small, relatively low-power computer.
I mean, when I think about embedded,
there's always the Arduino and Nucleo 8-bit and 32-bit bare bones.
It's not a microcontroller.
It's not a microcontroller.
It's a processor with a system.
It acted as my computer for like a week
because I didn't want to take everything anywhere.
It has microcontrollers, other A9 cores,
other ARM cores inside of it that do microcontroller tasks for audio and PWM and the GPIO.
But this being the NVIDIA Jetson, what sets it apart, as you said, being a processor basically equivalent to a desktop in functionality and performance is that it has the integrated NVIDIA GPU, which is CUDA capable and
can run all of the deep learning and computer vision applications that a desktop or workstation
or server may. So it's just as if you took your GPU in the cloud or your GPU from your desktop
system and were able to take that and deploy it out onto your robot or edge device.
Okay, why do I care about that?
Well, a lot of these devices need an intense amount of computational throughput in order to
run these high-definition video pipelines, which generally are extracting perception data
about the world around them in real time so that they can do,
for example, dynamic obstacle avoidance, autonomous navigation, object detection and
recognition, so that, in short, the robots can interact and navigate safely in the real world.
Similar use cases for edge devices that are essentially performing smart compression,
taking in a firehose of data and using AI algorithms to determine what events are noteworthy
and need to be sent up through the narrow pipe that they might have over 2G, 3G, or 4G back to the cloud. So it's all about harnessing the power of AI via the GPU, which the
GPU inside of the Jetson TX2 gets over one teraflop of computational performance. So it's
real similar to being able to deploy a whole server, but in the air or out into the real world.
What things are GPUs good at and what things are GPUs not so good at?
GPUs are really great at any type of parallel processing,
which includes lots of multimedia stuff.
So they've always been great at graphics.
But today, I always like to kind of explain that as we're really just doing the inverse of what GPUs used to do. Back in the day, you know, they were used to synthesize a 3D world from lower fidelity metadata, the vertices, the indices, other art assets that would then be rendered into this very high fidelity simulation.
Now we're just doing the inverse, which is taking that data and boiling it back down into the metadata, taking the camera and extracting the objects and where the road is, where the
trees are, where the path is that can be safely navigated on. And the reason for that is it's all parallel processing inside.
And all of these parallel GPU cores can divvy up the image or the laser data into different cores
and be able to process that all simultaneously in parallel to arrive at the result much faster than a serial implementation would be.
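To make that concrete, here is a minimal sketch, not from the episode, of the kind of data-parallel processing Dusty describes: a CUDA kernel written in Python with Numba that assigns one GPU thread to each pixel of an image. It assumes a CUDA-capable GPU and the Numba package, and the frame and threshold values are made up for illustration.

```python
import numpy as np
from numba import cuda

@cuda.jit
def threshold_kernel(img, out, thresh):
    # Each GPU thread handles exactly one pixel; the grid of threads
    # covers the whole image, so all pixels are processed in parallel.
    x, y = cuda.grid(2)
    if x < img.shape[0] and y < img.shape[1]:
        out[x, y] = 255 if img[x, y] > thresh else 0

# A stand-in for one 1080p grayscale camera frame.
frame = np.random.randint(0, 256, (1080, 1920), dtype=np.uint8)

d_frame = cuda.to_device(frame)            # copy the frame to GPU memory
d_out = cuda.device_array_like(d_frame)    # allocate the result on the GPU

threads_per_block = (16, 16)
blocks = ((frame.shape[0] + 15) // 16, (frame.shape[1] + 15) // 16)
threshold_kernel[blocks, threads_per_block](d_frame, d_out, 128)

result = d_out.copy_to_host()              # bring the processed frame back
```

A serial CPU loop would visit those two million pixels one at a time; the GPU version spreads them across thousands of cores at once, which is the same property that deep learning inference exploits.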
This is a lot of computer vision, but you were also talking about AI. Those are the same to me. How does AI get better on GPUs? I mean, how are those related?
Sure. I think there's been a lot of hype around AI in the past years, but what's emerged
and become clear as computer vision and deep learning have become more advanced is that AI
is the overarching system that allows a machine to act intelligently, whether it's implemented with a solid state hard-coded
computer vision pipeline or a deep learning algorithm, which is essentially learning the
computer vision pipeline and learning to implement the filters that previously were hand-coded by
guys like me in CUDA. Now, you know, the deep learning algorithms, which folks traditionally are identifying as
quote-unquote AI, you know, those are taking the place of hard-coded computer vision algorithms
with added robustness and flexibility. And it's nice with deep learning because if you get
a network working once for image recognition or object detection, for example, you can very easily
substitute out the data for new objects with just collecting new data sets and not have to
re-architect the network or re-architect your vision pipeline, which is a departure from days
of past where, you know, we would spend weeks just making a pedestrian detector and then recode it for segmenting the road, for example.
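A small illustration of that shift, as an assumed PyTorch sketch rather than anything from the episode: a hand-designed Sobel edge filter next to a convolution layer whose filters are learned from labeled data instead of being written by hand.

```python
import torch
import torch.nn as nn

# The classic CV approach: a fixed 3x3 Sobel kernel for vertical edges,
# chosen by a human engineer and never changed.
sobel = torch.tensor([[[[-1., 0., 1.],
                        [-2., 0., 2.],
                        [-1., 0., 1.]]]])
hand_coded = nn.Conv2d(1, 1, kernel_size=3, bias=False)
hand_coded.weight = nn.Parameter(sobel, requires_grad=False)

# The deep learning approach: the same kind of convolution, but sixteen
# kernels that start out random and are adjusted by backpropagation
# against labeled data.
learned = nn.Conv2d(1, 16, kernel_size=3)

frame = torch.rand(1, 1, 64, 64)   # stand-in for a grayscale camera frame
edges = hand_coded(frame)          # fixed response, the same forever
features = learned(frame)          # responses that improve as training proceeds
```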
Now the end-to-end deep learning pipelines can just do all the standard camera heuristics and tricks that would let you do the individual goals.
Is that right?
Very much so, yes. And let me give you a practical example of where we
would use both computer vision and deep learning simultaneously. For example, if you wanted to
process a stereo camera and generate the point cloud and be able to navigate from that, you might
just go with a traditional stereo block matching computer
vision algorithm in OpenCV to generate the stereo disparity field from that. We'll set aside for a
second that there are emerging networks in the deep learning research that are starting to do
stereo also. So over time, you'll see more and more of this happening. But for today, folks generally are still doing stereo with the traditional CV methods.
And so you use CV to get the stereo data and the initial point cloud and to align and register the point cloud like SLAM. So SLAM kind of composes really the big computer vision-based task that a lot of
robotics folks in particular are still aiming towards. And then the deep learning kicks in
and classifies the SLAM point cloud or the SLAM voxels to say, oh, this is a person in the point
cloud, or this is the object that I'm trying to find,
or this is the package that I'm trying to pick up and deliver within the point cloud.
And by virtue of that, since it's three-dimensional data, they're then able to do path planning,
which can also be a parallel operation very suited to the GPU.
Since we now have this high-fidelity point cloud, we want to
evaluate the costs of many different paths in parallel so the GPU can just execute the path
cost analysis for every different path and then come back. As opposed to before, we would have
to make a best guess at what path could be the best because we couldn't evaluate all possible ones in real time. So it's still very much a fusion of traditional CV with this new-age deep learning.
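For reference, the traditional stereo step Dusty mentions looks roughly like this in OpenCV. This is a sketch with made-up file names, and the Q reprojection matrix would come from your actual stereo calibration rather than the identity placeholder used here.

```python
import cv2
import numpy as np

# Hypothetical rectified left/right images from a stereo camera pair.
left = cv2.imread("left.png", cv2.IMREAD_GRAYSCALE)
right = cv2.imread("right.png", cv2.IMREAD_GRAYSCALE)

# Classic block matching: correlate small patches along epipolar lines.
stereo = cv2.StereoBM_create(numDisparities=64, blockSize=15)
disparity = stereo.compute(left, right).astype(np.float32) / 16.0

# Reproject disparities into a 3D point cloud. Q is the 4x4 reprojection
# matrix produced by stereo rectification; an identity placeholder here.
Q = np.eye(4, dtype=np.float32)
point_cloud = cv2.reprojectImageTo3D(disparity, Q)
```

That point cloud is what the SLAM and deep learning stages he describes would then register, classify, and plan paths through.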
Over time, we are seeing more end-to-end deep learning implementations.
That's where you'll see these pixels-to-action learners, where you just put in the raw data.
It can be stereo.
It can have LiDAR. It all goes
into a big network and out comes the motor controls, basically. And there's less insight
into the internals of that network because so much is happening, but it vastly simplifies the
autonomous vehicle stack for sure. So you're saying that the control system
is actually moving into the neural network?
Which is bogglesome.
How does it even work?
I don't know if to be excited about that
or really afraid.
Yeah, well, to an extent,
reinforcement learning generally still is fused
with other safety-critical LIDAR
or proximity detection systems below it.
But for all intents and purposes, it is the
reinforcement learning neural network that is directly driving the servos until it gets out
of bounds. And even then, that data is collected and re-simulated at the next learning stage so
that doesn't happen again. But what you said is basically
the reason why a lot of these reinforcement learning agents learn in simulation and then
are transfer-learned into the field, meaning that the robot is basically pre-trained in a 3D
synthetic environment and then does a lesser amount of real-world training because the number of iterations
required for the training might be not conducive to a real-world environment. As you said, we want
to be very safe first. We want to pass all these safety-critical certifications, do verification of
the behavior in all different types of environments. And it's very interesting now, we're even starting to see in deep learning
the use of GANs or generative adversarial networks on
the example side that generate
these scenarios to break the network, and then the network gets even better.
Okay, I want to stop you here, because we've gone through a lot of terms and we need
to go back and define some.
Okay.
Okay, SLAM, Simultaneous Localization and Mapping. This is the problem where you have a robot and you need to know where it is and where you want to go from one place to another.
So that's just a whole bottle of problems that people are working on. The second one is reinforcement learning. Well, there is
deep learning, which is often a branch that focuses on things like object identification.
You have a whole bunch of images, you tell the neural network the answers,
and then you train it so that it identifies those objects the way you want it to.
And it learns their characteristics so it can go on.
Reinforcement learning.
Go ahead.
You had said that reinforcement learning was essentially supervised learning.
What you just described was supervised learning. Reinforcement learning is where an agent goes out
and collects its own experiences
that are not pre-labeled by a human,
unlike the object detectors and image classifiers
that generally have this supervised label data.
Okay, right.
Sorry, I can't believe I...
You said reinforcement.
All right, so reinforcement learning is one of the things we talked about with Patrick Pilarski.
That was a really great episode where he was building robot arms,
and they would do kind of random things, and when they did something he wanted,
he would press the button that said, good job, and that reinforces what they're doing. And it's the... The reward, yeah. Yeah, the reward
reinforces it, and then it would generate behaviors that were good, like shaking hands. But then it
would also generate weird behaviors that were completely unexpected because it was trying to get to a
reward that it didn't understand. Like our dogs just rolling around on the floor. Like our dogs
rolling around on the floor for no reason. Yeah. Exactly. Yeah. It's very similar to the way a dog
learns. You provide positive reinforcement when it does good in the form of a treat. You provide
negative reinforcement, bad dog, when it does something bad.
And over time, you hope the behavior converges towards the policy that you're trying to have
the AI implement. And that's where the GPU-based training comes in. A lot of these reinforcement
learners take thousands, or even sometimes millions, of iterations for very complex scenarios like the AlphaGo that has been in the news the past couple of years.
And those agents require lots and lots of iterations to converge on this behavior
because they are very explorative and explore all parameters and options in their environment
to see what reward is really good and what
reward is bad.
Millions of robot treats.
Right.
Yep.
But it does require the simulation environments.
I loaded robot operating system onto my TX2 in hopes that I could start using their simulation
system because my robot arm is relatively ridiculously small and fragile.
And so this reinforcement learning,
it is hard to do with a physical interface.
It really needs some sort of simulation.
Yes, and not only because it takes extra time and puts extra wear
on the motors, but as you mentioned before, you need to have a feedback loop that generates the
reward based on the conditions of the environment. For example, if you're having a robot try to
escape the maze, you need to sit there and essentially, you know, quote unquote, give it the reward. For example, when it got out of the
maze, there could be like a red block or some visually identifiable object there that then
your pre-trained network would automatically recognize and be able to issue that reward
automatically, which is kind of like a self-reinforcing feedback loop. But barring that, the human would have to sit there and oversee it,
which we don't see as much of a problem when you're not training it from scratch.
We foresee that these industrial robot arms, for example,
which today are generally hard-coded with computer vision,
and it costs as much to reprogram the arm
to do a different task
as it does to purchase the initial hardware.
We foresee that you can train it in simulation in advance
and then have a technician spend a couple hours with it
teaching it a new task
for welding or complex object assembly,
things like that.
And this can greatly increase the versatility of these smart factories and whatnot.
But going back to the reinforcement learning paradigm,
generally most of the training would be done in simulation
because there it's very easy to oversee the conditions of the environment,
like in the ROS Gazebo simulator, for example, which I've been using as well.
It's easy to set up these event collision filters so that you can say, oh, the robot ran into something, that's a bad reward.
Or the robot reached the end of the maze and automatically issue the good reward and reset the environment.
So it's a lot easier to perform those type of things. But if you can think up a scenario that
is self-reinforcing in the real world, then that's a great use case to deploy online learning,
which is where the robot physically learns in real time on the actual platform.
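To make the reward loop concrete, here is a toy, self-contained sketch, nothing from Gazebo or the episode, of the pattern Dusty describes: an agent in a tiny corridor gets a good reward for reaching the goal, a bad reward for bumping into a wall, and over many simulated episodes its behavior converges toward the desired policy. The corridor size and learning constants are arbitrary choices for illustration.

```python
import random

# Toy stand-in for a simulator training loop: a corridor of 6 cells.
# Actions: 0 = move left, 1 = move right. Reaching cell 5 ends the episode
# with reward +1 ("reached the end of the maze"); bumping the left wall
# gives -1 ("ran into something").
N_STATES, GOAL = 6, 5
q = [[0.0, 0.0] for _ in range(N_STATES)]   # Q-value table: state x action
alpha, gamma, epsilon = 0.1, 0.9, 0.2       # learning rate, discount, exploration

for episode in range(500):
    s = 0
    while s != GOAL:
        # Mostly exploit the best known action, sometimes explore randomly.
        a = random.randint(0, 1) if random.random() < epsilon else int(q[s][1] > q[s][0])
        nxt = max(0, s - 1) if a == 0 else s + 1
        reward = 1.0 if nxt == GOAL else (-1.0 if nxt == s else 0.0)
        # Q-learning update: nudge the estimate toward the reward plus the
        # discounted value of the best action from the next state.
        q[s][a] += alpha * (reward + gamma * max(q[nxt]) - q[s][a])
        s = nxt

print("Learned action per cell (0=left, 1=right):",
      [int(cell[1] > cell[0]) for cell in q[:GOAL]])
```

A real robot would replace the table with a neural network and the corridor with the simulator's physics, but the reward-driven update is the same idea.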
Neat. Okay, back to the board, the TX1 and TX2.
I got the board last summer
and I started on your
two days to a demo tutorial. Could you explain
what that is and why it is and all of that?
Great. Yeah. The two days to a demo is an end-to-end supervised learning tutorial where
it takes you through basically all the steps that you need to get started deploying and training your own neural networks.
So you can start with some initial out-of-the-box examples and just get running instantly on
the Jetson.
For example, there are popular image recognition networks available, like AlexNet, GoogLeNet,
and ResNet.
And these come pre-trained to recognize
1,000 different types of commonly available objects.
So you can deploy that onto the Jetson in real time
via a library that NVIDIA makes available called TensorRT,
which takes these networks and performs a host of optimizations on them
to greatly speed up their performance
and take full advantage of the one teraflop of
performance that the TX2 has. And then you can run that on real camera data, visualize it,
play around with the different objects, get a feel for its capabilities, which can be a lot of fun
trying to fool it. Because generally, when it gets fooled, you can kind of see that,
oh, it thought that the apple was a pomegranate, or you can see how it can get some objects
confused. And that can provide you insight to where you need to augment the training set,
add more samples of a particular thing.
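As an illustration of that inference step, here is a hedged sketch using PyTorch and torchvision rather than the TensorRT-optimized path described in the episode; the image file name is hypothetical.

```python
import torch
from torchvision import models, transforms
from PIL import Image

# A network pre-trained on the 1,000 ImageNet classes, in inference mode.
model = models.resnet18(pretrained=True).eval()

preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

img = preprocess(Image.open("snapshot.jpg")).unsqueeze(0)  # hypothetical camera frame
with torch.no_grad():
    probs = torch.softmax(model(img), dim=1)

confidence, class_index = probs.max(dim=1)
print(f"class {class_index.item()} with confidence {confidence.item():.2f}")
```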
And so the second phase of the tutorial is then retraining the network with your own data for your own specific application. And you don't need to redo any of the
program that runs on the Jetson, or the inferencing side. Inferencing is the runtime aspect of
deep learning where you take your pre-trained network and just run data
through it and get the result of the classification or the detection. So you don't need to change any
of the network architecture. You just need to pump new data into the training system,
which we call DIGITS. It's an interactive web system that sits on top of a lot of different machine learning frameworks out there, like Caffe, Torch, and TensorFlow. The tutorial covers image recognition, object detection, and segmentation networks, so that you can take these computer vision building blocks that we
already have implemented and basically reconfigure them for your own application, which really speaks
to the versatility of the demo is that you can make your own AI powered device that does a very
specific thing from this general example that we have by doing
nothing other than just putting new data through it. For example, people make systems that detect
playing cards or different types of objects. Folks like to detect their pets, train it on their pets
so it can find different objects around their house.
People get very creative with it, which is the whole idea that we just kind of set you up with the workflow and get out of the way so that you can go off and make and deploy your own application.
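The retraining idea Dusty describes, sketched here in plain PyTorch rather than DIGITS, and with an assumed class count and dataset folder for illustration: keep the pre-trained backbone, swap the final layer for your own classes, and feed it your own labeled images.

```python
import torch
from torchvision import datasets, models, transforms

NUM_CLASSES = 3  # e.g. your own pets or a few playing-card types (assumption)

# Start from an ImageNet-pre-trained backbone and replace only the classifier head.
model = models.resnet18(pretrained=True)
model.fc = torch.nn.Linear(model.fc.in_features, NUM_CLASSES)

tfm = transforms.Compose([transforms.Resize((224, 224)), transforms.ToTensor()])
data = datasets.ImageFolder("my_dataset/", transform=tfm)   # hypothetical folder of labeled images
loader = torch.utils.data.DataLoader(data, batch_size=32, shuffle=True)

optimizer = torch.optim.SGD(model.parameters(), lr=0.001, momentum=0.9)
loss_fn = torch.nn.CrossEntropyLoss()

model.train()
for epoch in range(5):
    for images, labels in loader:
        optimizer.zero_grad()
        loss = loss_fn(model(images), labels)
        loss.backward()
        optimizer.step()
```

The network architecture never changes; only the data and the final layer do, which is the point being made here.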
And we call it two days to a demo because generally if you follow every step completely, it'll take two days. Not that you're sitting in
front of the computer for that time. A lot of times it's downloading these large data sets
like ImageNet or running the training in digits, which can take anywhere from as low as 15 minutes
up to a couple hours, depending on the size of your GPU and the complexity of the network that you're retraining,
and in addition to the size of the data set.
But if you follow the whole tutorial,
which includes these retrainable image recognition,
object detection, and segmentation primitives,
then that'll take you about two days.
You know, you can cruise through it.
That's a nice theory.
It took me nearly three weeks.
Some folks, it takes longer depending on your setup.
And I couldn't make part of it work.
Sorry.
We have made strides to make the deployment of, in particular, the training system faster. Since the initial two days to a demo came out,
we issued another update for the NVIDIA GPU Compute Cloud,
which just with a click of a button,
you can deploy the whole image of the training system
to the cloud or to your local PC via Docker.
So you don't have to do all of this extra setup on the host side, like setting up the
NVIDIA driver, cuDNN, which is the CUDA Deep Neural Network accelerator library, Caffe, and then
all the Python dependencies. And then finally, you get to DIGITS when you start to train.
But now that's all done with basically one click of the button with NVIDIA NGC, which makes that part a lot easier for most folks.
And then on the Jetson side, we've already had this NVIDIA Jetpack since the inception of Jetson TX1, which basically fully automates the install of everything to the Jetson. So that side generally works very well
and much more streamlined for people
because the Jetpack installer does all the heavy lifting there
with installing all of the NVIDIA and machine learning stuff onto it.
Yeah, for me, retraining my inference weights
was a large part of the problem
and installing it was impossible on AWS.
So I'm glad you've fixed some of that. And I have to admit the Udacity self-driving car helped me a
lot to understand some of what your software was doing and how to make AWS work for me.
What other resources can you recommend for people getting comfortable with the machine
learning aspect of it all?
Sure.
Well, it's funny that you mentioned Udacity self-driving car because we've actually been
working with them on their Udacity robotics nanodegree with a lot of the material that we've talked about during this
episode. And not dissimilar in theory to the self-driving car from a machine learning perspective,
but more tailored towards robotics, edge computing, and reinforcement learning.
So that would help anybody who has a specific knack for robotics. It includes a lot of material on SLAM, and there's another tutorial we made on reinforcement learning and simulation as well.
So there's the supervised two days to a demo
and there's the reinforcement learning two days to a demo
and similar to the supervised version
it might take you two hours, two days
or two weeks depending on how much time
and effort you want to spend on each particular step.
Yeah.
Yeah, for me, it was mostly fighting Linux.
So the TX2 is a good robotics platform.
Why do you say that?
Generally, we see over time, there's more and more focus on the autonomous aspect of robotics, not only the low
level control and commands that used to be more pre-programmed, route following and such. But
there's much more focus these days, similar to self-driving car, on the autonomous capabilities of the platform. And the Jetson has become the de facto compute resource for that
due to its convenient small form factor packaging
and the over one teraflop of performance
that it can fit in in that relatively low power envelope.
So it's the fact that these autonomous capabilities are implemented
with computer vision and deep learning underneath that are really issuing the controls and the
commands to the robot, which then actually get executed by a small microcontroller or the
autopilot, for example. A lot of drones that have the Jetson, they still have
a PX4 or Pixhawk autopilot, which then uses the Jetson as a coprocessor for doing all of the
camera and sensor processing. And you actually have a platform for drones and the TX2 together, right? Yeah, it's called the NVIDIA Redtail project.
And it's a self-navigating drone that steers itself left or right and up or down to maintain a consistent heading in the middle of the trail.
And I think they've tested it up to a couple kilometers, fully autonomous, using the Jetson TX2 onboard.
Cool.
Let's see.
You have a conference coming up.
Yeah, that's correct.
Every year we have
NVIDIA GPU Technology Conference,
GTC, in San Jose in the spring.
And it's one of NVIDIA's biggest events of the year.
We also have GTCs in different regions in the fall.
But the one in the spring,
we have located right near our headquarters
and serves as the base
of a lot of our biggest announcements of the year.
This year, we're having a packed track,
lots of sessions from top roboticists that come and speak about how they're using NVIDIA GPUs in robotics today.
In addition to lots of trainings about how you can make these autonomous deep learning systems that power robotics. So it's a great event to go to to get the latest in the industry
and to learn and get trained on these deep learning methods.
There were a lot of talks, and I tried to navigate to find what I could go to.
It's tough because some are super advanced and very niche
and some seem to be getting started.
Do you have any advice for figuring out
which ones I should attend?
For sure.
The organizers try to be very inclusive of beginners,
folks who are intermediate in nature and advanced.
Generally, the tracks each come with a little label
that says if it's beginner, intermediate or advanced.
And from that, you can judge how low level
and how technical they're
going to get. We have talks that are from the titans of industry that come and talk about
how they're deploying AI across their enterprise at the edge. And we have talks ranging to PhDs
that are working on next generation deep learning networks that get deployed,
and everywhere in between.
So there's lots of variation for people of all walks,
in addition to a lot of different meetups. And we encourage lots of networking among the visitors to GTC,
so that basically everybody who's really into GPU computing is there and is able to network together.
And if they're not able to make it, that's why we started expanding into the regions in the fall so that, you know,
everybody can get a chance to come to a GTC and meet other like minded folks who are into that type of compute paradigm.
And you also have a competition for the Jetson TX1 and 2, although it's about to end.
Yes, we have the NVIDIA Jetson Developer Challenge, which had run from last fall up until I think this weekend it closes.
But it's an AI-based challenge where folks can make different devices
or any application really that harnesses the power of computer vision
or machine learning or anything that has an autonomous aspect to it
or some form of intelligence using the Jetson.
And folks have come up with some really novel, interesting applications.
And the finalists for that get flown out to GTC, where there'll be a panel and presentations similar to TED Talks, where they get up and do their pitch about what their concept does and
compete for prizes and connections with top people in the industry. But if you're not able to
participate in the challenge or hadn't heard of it previously, we are also offering a special discount code to listeners that can get 25% off of the registration to GTC.
So we hope that as many people can come as possible.
So the 25% off coupon is a series of incomprehensible letters that we're going to put in the show notes?
That's correct, yeah. It's a code that
you can enter into the GTC
website and then it should
provide the discount for you there.
And listeners,
as I mentioned last week,
I do expect you to vote for me
on public choice for this
competition. Don't even
slack off. I'm going to need your votes.
Because I'm not sure I'm getting a lot of the judges' votes.
And if you're only able to come to GTC
one day, generally Tuesday is the big
announcements day. That's when Jensen has his keynote and announces
the whole roadmap for NVIDIA, which people get very excited about.
Cool.
Christopher, do you have any other questions?
Yeah, so I don't really understand
how this is positioned from a,
I guess, a product standpoint.
I actually use the TX1 with a client
doing a medical device,
and I was mostly doing CUDA stuff with it,
not any neural net or machine learning.
It was taking high rate data and turning it into an image and through a bunch of warping things that were done in CUDA.
It's not important what it was specifically, but I never got to the point where that was going to be put into a product.
But how is this, if I'm a company and I want to use this, um, is it a
reference platform? Is it something that, okay, you buy the core and you can put that into your
product? Um, is this something that makes sense for high volumes or once you get to high volumes,
is it okay, you better go roll your own? Great question. It's really meant for all walks and for all phases of product development, all the way from R&D through prototyping, low-rate initial production, into the final product. Folks use Jetson at every stage of that and all stages in between. The primary first exposure that folks generally get to the Jetson
is through the developer kit,
which is a 7-inch by 7-inch mini ITX carrier board,
which seats the Jetson compute module.
The module is credit card size,
and you can take the module off of the carrier
board, which we provide all the design schematics and collateral for, so you can make your own,
including all of the documentation and such. And many folks have done that, make their own
scaled-down carriers or enclosures or fully integrated devices that they then deploy to the edge and
do all of this fancy processing on. But generally, it's positioned as a high-end
AI-at-the-edge compute platform that can handle the vast increases in sensor data that we've seen in the past few years, and process it with artificial intelligence for navigation purposes
or to be able to interact and manipulate the environment in some way.
And a lot of that development can be done on a desktop and then put onto the device.
You don't need to use it as your development environment necessarily.
Right, yeah. It is very desktop-like in that it runs Ubuntu. It's just an ARM
architecture, so generally just a recompile is required since all of the CUDA toolkit and CUDA
libraries and machine learning frameworks are supported and run on the Jetson. Generally,
there's a minimal level of porting required, which is not the case previously when you go from a research system that might be running MATLAB or OpenCV on a desktop system to prove it out and then have to basically re-architect the whole solution to fit it into an embedded system.
In this case, the Jetson can meet the performance requirements and the size, weight, and power envelopes for all phases of that.
How much do the Jetson modules by themselves cost? The TX1 module is $299 in volume and the TX2 module is
$399 in volume
of a thousand
units. $399.
Sorry, we do get
dev boards that are $399.
They're not going to do this though.
It's a small performance gap.
It's a bit of a gap, yeah.
Only an order of magnitude.
I've seen the $3 SBC at the Maker Faire too.
That's a great event also, the Maker Faire, because we're there right beside the guys that have the $3 SBC also.
We all coexist in the same ecosystem.
Jetson definitely has its niche. It's for these high-end autonomy and AI applications
where you don't even just need to record all the sensor data. You really want to do some type of
live processing on it for the platform in real time. I have to say, I don't know if this is
your experience, but I had trouble making it break a sweat.
I mean, the fan never came on.
It never got warm.
I was doing computer vision and robotics and running ImageNet to do object identification.
It felt like I was boring it and that I really...
I'm still boring it.
Disappointing it with my application.
Elecia, as you mentioned earlier, a lot of folks end up using it
as their desktop platform.
If I didn't have to log on to my VPN
or use Microsoft Outlook for work,
I would never need another system.
It's basically everything you need.
In fact, when we released the TX2,
we benchmarked it against a 12-core server that pulled 200 watts.
And this little sub-10-watt credit card module outperformed that.
So it can do a whole lot in a small little package.
And there are carrier boards to work with the credit card size module to have cameras and all kinds of stuff.
I was excited about that.
Yeah, since Jetson is an open platform, meaning we release all the design docs, and there's thousands of pages of documentation, and the software is all provided for free.
The Linux kernel source code
is all open sourced for it. Folks are able to take that and start with a dev kit, kind of change that
design around, not to oversimplify it, seeing as I'm a software guy predominantly, but delete
components from the dev kit carrier board design that they might not need.
For example, the large connectors, or there's like an M.2 mezzanine,
or other things that they might not need on the DevKit.
And that's how they scale it down to a small credit card size.
And by virtue of that, there's flourished a rich ecosystem of third-party accessories, carrier boards, enclosures, camera pods, complete integrated drones that have it.
In addition to ground robots and even some water-based robots, UUVs, and even some small submersibles.
So really, Jetson gets deployed everywhere.
There's even people that launch it into space on nanosatellites,
even though it's not rad-hardened or rated for that.
They're so after that performance that they need,
in the most extreme edge case,
that they're really drawn to this form factor.
I'm so going to lose this developer competition.
No, we just have to send your stuff to space.
It doesn't have to do anything.
I'll go work on the catapult.
Yes, exactly.
Well, Dusty, we've kept you for long enough.
Are there any thoughts you'd like to leave us with?
Well, thank you so much for having us today.
And, you know, we really hope that folks listening will get involved in some way in computer vision or artificial intelligence, because it's been a very interesting and exciting area of development the past couple of years.
As we mentioned before, a few years ago, there weren't self-driving cars.
Now they're out there driving around themselves. And this is just the general embedded use case of
that expanded to delivery robots, service drones, other types of edge devices that might be useful,
medical instruments and such. So everybody can come up with cool ideas to use this kind of stuff for.
And yeah, we hope that you'll get involved
in the community.
We're very active supporting people
on the forums and in meetups online.
And yeah, I hope you use that discount code
to come to GTC if you're able to.
Our guest has been Dustin Franklin,
developer evangelist on the Jetson team at NVIDIA.
Check out his two days to a demo and look for the coupon for the GPU tech conference on our website, embedded.fm.
Thank you for being with us, Dusty.
Oh, thank you.
I would also like to thank Christopher for producing and co-hosting.
Once again, I want to let our Patreon supporters know that they have helped me get Dusty a good mic.
And of course, thank you for listening.
You can always contact us at show at embedded.fm or hit that contact link on embedded.fm.
I have a quote to leave you with, which will be even funnier if you listen to next week's show.
This is by Robin Sloan.
He wrote a book called Sourdough, which again, you might want to look up for next week's show.
Hint, hint.
There had to be a scale somewhere.
The scale of stars, the scale of far-off cosmic superbeings upon which we ourselves, we humans with our cities and bridges
and subterranean markets,
would look like the lactobacillus and the yeast.
Embedded is an independently produced radio show
that focuses on the many aspects of engineering.
It is a production of Logical Elegance,
an embedded software consulting company in California.
If there are advertisements in the show, we did not put them there and do not receive money from
them. At this time, our sponsors are Logical Elegance and listeners like you.