Utilizing Tech - Season 7: AI Data Infrastructure Presented by Solidigm - 2x06: Moving AI To the Edge with BrainChip
Episode Date: February 9, 2021. BrainChip is developing a novel ultra low power “neuromorphic” AI processor that can be embedded in literally any electronic device, rather than centralizing learning in high performance processors. Today’s edge devices are applying existing models to process inputs but can’t actually learn in the field, but on-chip learning and inference could radically alter the capabilities of devices in automotive, home, medical, and other remote locations. BrainChip is able to reduce power thanks to the neuromorphic self-learning approach and also because they reduce precision down to 4 bits or less. This loses some accuracy, but only a little. The company also creates a mesh of cores that have access to local memory, enabling flexibility of processing. Guests and Hosts: Lou DiNardo, President and CEO of BrainChip. Connect with Lou on LinkedIn. Andy Thurai, technology influencer and thought leader. Find Andy’s content at theFieldCTO.com and on Twitter at @AndyThurai. Stephen Foskett, Publisher of Gestalt IT and Organizer of Tech Field Day. Find Stephen’s writing at GestaltIT.com and on Twitter at @SFoskett. Date: 2/9/2021 Tags: @SFoskett, @AndyThurai, @BrainChip_Inc
Transcript
Welcome to Utilizing AI, the podcast about enterprise applications for machine learning,
deep learning, and other artificial intelligence topics. Each episode brings together experts in
enterprise infrastructure to discuss applications of AI in today's data center and beyond. Today,
we're actually stepping outside the data center to learn more about
neuromorphic processing and AI at the edge. Let's meet our guest. Lou DiNardo is the CEO of
BrainChip, which presented at our Gestalt IT AI Field Day last year. We were so impressed that
we wanted to bring Lou on to talk more about AI at the Edge.
Well, thank you for having me.
And I must say at the beginning here,
the AI Field Day was a tremendous success.
It was a very well-run event.
BrainChip plays a very special role in the AI environment
in that we are exclusively targeted at edge
applications for good reasons that we can talk about. And I am Andy Thurai, your co-host,
founder and principal at thefieldcto.com, where we provide consulting analysis and unbiased
emerging tech advisory services. You can follow me on Twitter at Andy Thurai or at thefieldcto.com. That's at
thefieldcto.com. Thanks. And as you know, I'm Stephen Foskett, organizer of Tech Field Day
and publisher of Gestalt IT. You can find me on Twitter at S Foskett. So Lou, one of the unique
aspects of AI is centralized training, centralized processing, centralized everything.
And for the most part, that's been because AI has such a heavy footprint. It requires a lot
of computing power and a lot of power, power and cooling and everything. And all of that really,
really adds up. And it fundamentally changes the design and the nature of AI applications, because what you end up with is centralized AI in the data center instead of AI at the edge everywhere,
simply because you can't do it. What you were talking about with BrainChip at AI Field Day is
that you can do it, that you can move AI processing, not just to edge devices like
computers and stuff, but to basically everything.
Do I have that right? That is exactly correct.
The basics of it are fundamental neuromorphic technology. We process data, or information, just like the human brain. So we look only at events, what we call non-zero activations. If something isn't different, something isn't new,
we don't process it. In today's architectures, which you're correct, primarily sit in the cloud
or sit in a data center, every single piece of information gets processed.
You can't do that at the edge, where you need ultra low power. We're talking microwatts of power, not even milliwatts. Not thousandths of a watt; we're talking microwatts, millionths of a watt. And with that ability, you can move the analytics to the edge, right up against the transducer, so that you don't have to process all the data. You don't have to suck up the bandwidth to communicate all of that data across a LAN or Wi-Fi, or even a 5G network. You don't need all that data. So we do the processing at the edge, at the device,
and it is really a one-of-a-kind solution right now. And that, I think, is the thing that really
caught my attention when I got the first briefing from BrainChip, because the idea that you can,
basically, that you guys have developed this core that is so low power. I mean, for listeners,
we're talking about, you know, like the difference between like an Arduino and a Raspberry Pi
versus a regular computer, not the differences between, like I said, like a laptop and a data
center. I mean, the idea would be that these things could literally be everywhere. And not
just that they would be doing the pattern matching and stuff,
but they could actually be learning at the edge. Is that right, Lou?
That's probably one of the most fundamental things that we bring to the table,
aside from low power and a complete solution. This is not a deep learning accelerator that
needs a CPU as a host and external memory. Everything is included
in what we call Akida. Akida is the Greek word for spike. So we kind of stick to our knitting
on what neuromorphic technology really means. But what we can do is learn at the edge. Today, there is really no true learning going on at the edge; it's training. And every time
a new object or a new event occurs, you'd have to go back to the cloud, or you'd have to go back to
a data center and retrain the entire network. With Akita, up against any transducer. It could be vibration, temperature, flow, pressure,
something new, vision, audio. Something new happens and Aikido learns on the fly in the field.
You don't have to go back and retrain the network. So if you start a vision application and you're looking for 10 things, you want to know if it's a dog, a cat, a kid, a person in general.
But now all of a sudden you have something else that you'd be interested in.
Or we recognize that pattern that says there's something you should be interested in.
We learn in the field.
The device does not have to be retrained. That is a very, very powerful solution for any autonomous application.
And the ability to send back metadata rather than all of the data, those are two very, very powerful solutions for edge applications.
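To make the on-chip learning idea concrete, here is a minimal Python sketch of one way a device could add a new class from a single example without retraining a network: a fixed feature extractor plus stored class prototypes. This is only an illustration of the concept, not BrainChip's actual method; the embed function and the class name are hypothetical.

```python
import numpy as np

class OneShotEdgeClassifier:
    """Toy nearest-prototype classifier: 'learning' a class is just storing one vector."""

    def __init__(self, embed):
        self.embed = embed        # fixed, pre-trained feature extractor (assumed)
        self.prototypes = {}      # label -> stored feature vector

    def learn(self, label, sample):
        # Add a new class from a single example, with no retraining pass.
        self.prototypes[label] = self.embed(sample)

    def classify(self, sample):
        # Return the label whose stored prototype is closest to this sample.
        if not self.prototypes:
            return None
        v = self.embed(sample)
        return min(self.prototypes,
                   key=lambda lbl: np.linalg.norm(v - self.prototypes[lbl]))

# Hypothetical usage: one image of a tiger, one of an elephant, and the device
# can now tell them apart without a round trip to the cloud.
# clf = OneShotEdgeClassifier(embed=my_feature_extractor)
# clf.learn("tiger", tiger_image)
# clf.learn("elephant", elephant_image)
# print(clf.classify(new_image))
```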
Right. So let me jump in and double-click on a couple of items from what he just said, right? So obviously, for the last, I don't know, 30, 40 years, the big data
center chip players like Intel and AMD popularized the whole concept of building everything at a
center location and you have to move everything there. And then of course, the ARM chips and even
the NVIDIA chips kind of popularized the notion of you could build somewhat of a smaller chip.
It doesn't have to be as big or powerful as a data center chip. And you can move
that to the edge to do those things. Right. And then he also touched on one of my pet peeves: if you look at the data enterprises, as we call them, not necessarily a regular enterprise, but a company that has produced a ton of data, they can't process that data at the edge.
So they've got to move everything to the cloud, not just for model creation,
but even to do analysis and stuff like that.
So there are two issues there.
One is, you know, you collect all the data somewhere to do model creation. That's issue number one: it has to be centralized. And issue number two, as you were saying, is that the model inference itself at times could be a problem at the edge. So either they do half at the edge and half in the cloud, and then the reinforcement learning, learning from that to update your models, becomes another issue, because you have to do it at the edge the first time and then move it to the centralized location to retrain the models and stuff. I know it's a long question that I asked you, but the combination is a problem.
So how are you positioning yourself to solve all of the above?
Well, it's a huge transition, maybe call it a disruption.
Everything that you just described is based on what us old guys know as the Von Neumann model.
The computing architecture that's been around for the last 50 plus years is instruction, fetch data, process.
Instruction, fetch data, process.
Instruction.
Now, we've gotten away with that for 50 years because we just jack up clock speeds.
But when you jack up clock speeds, you require more power.
This is the transition to neuromorphic where there is no program involved.
We look at events.
An event is a change in data, and we look for repeating patterns. If you take something like LIDAR,
which is very, very pervasive in the autonomous vehicle market or ADAS,
the sparsity of that data could be as much as 90%. So we only have to process 10% of the data.
We're not doing matrix multiplication,
lines of code upon lines of code upon lines of code. We only see and we only process what,
again, I call non-zero activations. That is a huge, huge transition and will be very disruptive
in enabling AI at the edge.
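As a rough illustration of why event-based processing saves work, here is a toy Python sketch, assuming inputs arrive as flattened numeric frames. Only the values that changed become events, so the compute scales with the number of events rather than the number of samples. This is the general idea only, not Akida's hardware pipeline.

```python
import numpy as np

def to_events(prev_frame, frame, threshold=0.0):
    """Return only the entries that changed: the 'non-zero activations'.
    With 90% sparsity (e.g. LIDAR), this keeps roughly 10% of the raw data."""
    diff = frame - prev_frame
    idx = np.flatnonzero(np.abs(diff) > threshold)
    return idx, diff[idx]                    # (locations, values) of the events

def process_events(weights, idx, values):
    """Accumulate contributions event by event; zeros never reach the math."""
    out = np.zeros(weights.shape[1])
    for i, v in zip(idx, values):
        out += weights[i] * v                # work scales with events, not inputs
    return out

# Hypothetical usage with a flattened 1,000-sample frame and 10 outputs:
# weights = np.random.randn(1000, 10)
# idx, values = to_events(prev_frame, frame, threshold=0.01)
# result = process_events(weights, idx, values)
```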
So let me ask you a follow-up question on that.
Now that you brought up that LiDAR subject,
people don't realize,
particularly when it comes to autonomous cars, right,
or autonomous trucks or anything that's moving,
because when you kind of build eyes to those moving objects,
you know, I could throw drones in that category as well.
When it's going in, you're kind of educating them,
you know, to an extent saying that, you know,
if this happens, if you have already seen that,
this is how you have to act, which is fine.
But when the unknown unknown happens
is where the actual problem comes in, right?
So it has to figure out what to do on its own, particularly in the case of drones, because the connectivity is going to be very low. So that's where on-chip learning, on-chip inference, and updating the models and figuring out what to do on the fly come in, right? I mean, drones can eventually move into autonomous flying planes and whatnot, right? So that's where I think the market is going, and where your offering could be more helpful. That's my view. What do you think about that situation?
It's a very wide spectrum of use cases.
And I've kind of coined or followed a phrase that I learned very, very early on in my career. It's not what to do. It's which to do.
Everybody's chasing AI. But in our case, we're being very selective. We're playing at the edge. But the edge itself is a very wide spectrum. So we're engaged with NASA. I mean, think about autonomy in space flight.
The ability to reduce power and have, you know, smart analytics on the fly, so to speak, on the fly in space.
Aircraft, you know, you could go to the other end of the spectrum and we're doing vibration analysis on bearings in railroads, you know, railroad cars,
preventive maintenance. We have an application that we're working on, which is a volatile organic
compound detector, so that you can, you know, you can breathe into a device just like a breathalyzer
and possibly get to a point where you can diagnose infectious disease,
pre-cancers or early cancers. And this is why we have a really good feel about the company
with respect to providing artificial intelligence for beneficial purposes.
So that's a very wide spectrum, but we're being very careful. We have an early access program, as many companies do, but we're being very selective about those applications.
The 8086, the Intel 8086, way back when, was, you know, one of the first and most popular processors.
And of course, that's matured over time.
But then you had companies like Microchip come in, STMicro, Renesas, with microcontrollers.
But they all still need to be programmed.
And the big step forward with neuromorphic processing or
neuromorphic computing is that the processing doesn't exist in lines of code. And I think that that's
the key differentiator here. I think that some people listening to this may say, well,
we have autonomous vehicles. We can do image detection and object detection in a doorbell or a Raspberry Pi or whatever.
What is he talking about that's so special here?
And I think that that's the important thing.
The thing that's special, I think, as Andy was trying to get at here as well, is that we're not just applying an existing model or an existing library to inputs and kind of hoping for the best.
That the chip is actually building these things on the fly. And as you said, you know, I mean, we saw this demonstrated
with AI Field Day that, you know, a single image of a tiger allowed it to identify tigers. Another
image of an elephant allowed it to differentiate between tigers and elephants. You know, we saw it doing this with, as you mentioned, with the, you know, processing
smells, basically.
You know, and the difference is that that's happening entirely on-chip at low power, right?
Exactly right.
So take as an example an autonomous vehicle.
You're right. They're driving around the streets of San Francisco right now. But all of the sensors, and there are dozens if not hundreds of sensors, are sending all the data either to the trunk or under a seat to a bunch of GPUs, graphics processing units, that are burning hundreds of watts. I mean, literally hundreds of
watts. I mean, think about a hundred watt light bulb. Put three or four of them in a suitcase
and see what happens. So you've got all of the processing power is consuming energy,
but there's power dissipation. So it's not just the consumption, but it's the dissipation because
nothing is 100% efficient. So you generate heat. Now you have to have cooling. When you're driving
an electric vehicle, which most autonomous vehicles are or will be, the last thing you
want to do is waste power. That draws on the battery. If we can do the analytics at the edge, and as I said, LIDAR could be 90% sparse.
Even an RGB camera, whether it's low resolution or you're talking 4K, you're still going to have 50% sparsity, which means 50% of the data doesn't need to go back to the trunk and consume power. And that I think is one of the
big benefits. And I use the automobile application because Andy brought it up, but that is one of
the greatest advantages in all of the things that I've talked about so far. So I want to bring up
the reference from the AI Field Day about tigers, by the way. There was even a joke, if any of you go back and watch that: we thought you were going to stop at "hot dog, not hot dog." Remember the Silicon Valley joke? But then you went a step up. It's not just tiger or not tiger; you were also sensing the other objects as well. So it was impressive.
But I want to double click on the use case you talked about. People don't realize how key that
is. It's basically disrupting the healthcare field because most times in the healthcare field,
things are still done in the old fashioned way, right? But what you're suggesting about having a breathalyzer
analyze the molecules and figure out the early signs of some disease, imagine how great this would be if you're already ahead of the game and, by chance, there's a COVID breathalyzer. You know, rather than doing the test and waiting for hours, all you have to do, if you're going to fly in an airplane, is go to security and do the breathalyzer, the same thing you do with driving, right? And then if you're clean, you can go. I mean, it's not there yet, but imagine that. That's disruption, that it can figure something out on the fly, right? Talk about it.
It's not there yet, but it's getting darn close.
And you're right.
The transportation industry, the healthcare industry, going to a hospital.
I had to go to the hospital last week for an accident that I had.
Imagine if they could just give you a breathalyzer and check for COVID, H1N1, even MERS on your way out, which, you know, which is a big problem with people that stay in the hospital.
But it's getting very close.
Of course, that's theoretical.
You know, you don't have that right now.
I want to make sure that nobody thinks that this is, you know, an existing application. But certainly this is the sort of thing that you can envision being able to do with a portable low-powered device. One of the things that occurred to me when you all, again,
when you briefed me was like, how is this possible? Like, how come you guys can do it with, you know,
a millionth of a watt, and everybody else is doing it with, you know, many watts. I mean, for what it's worth, a conventional AI processing or learning setup with a bunch of GPUs and stuff, you're looking at like 500 to 1,000 watts of power draw, at least. How come you guys can do it at such low power?
There's really two fundamental reasons.
And I have to give a great deal of credit to Peter van der Made, our founder. He's the brain in BrainChip. He's been working on what we call spiking neural networks, or, to bring it down to a level that many more can understand, we call it event-based. So again, we play on sparsity. We only look at data, and only process data, when it's important. The other thing that we've done in Akida, which really brings down the power a great deal, is that in virtually all
other architectures, you start with some floating point math. So you might start with 32-bit floating
point, and then you quantize. You take floating point, you turn it into an integer, you take that integer, and you quantize down. And in virtually all cases, they quantize down to eight bits. And that level of quantization impacts your accuracy. It may be the difference
between 99% accurate or 97 or 96 or 95, depending on the architecture. We quantize down to one, two, and four bits. So we start
by playing on sparsity because we're in the event domain. And once we have both sparsity
in weights, as well as in activations, then we quantize down to one, two, or four bits. And you can see some charts that we present
and maybe Anil did it at the field day.
If not, we'd be happy to share it at the next field day.
But you can play with the weights.
You can say, okay, I want four bits and four bits.
And therefore, maybe I'll get another percent of accuracy.
I'll take weights at two bits.
I'll take activations at four bits.
Then maybe you lose a half a percent.
I could take it all the way down.
We could take it all the way down to one bit and one bit.
And maybe you're going to lose several percent of accuracy.
But all of that is what allows us to do this at extremely low power,
sparsity and quantization.
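For listeners who want to see what quantizing down to a few bits looks like, here is a small Python sketch of uniform symmetric quantization. The accuracy remarks in the comments are illustrative, echoing the ranges Lou mentions, not measured results.

```python
import numpy as np

def quantize(x, bits):
    """Uniform symmetric quantization of a float tensor to roughly 'bits' bits."""
    levels = max(2 ** (bits - 1) - 1, 1)     # e.g. 4 bits -> 7 levels per side
    scale = np.max(np.abs(x)) / levels
    if scale == 0:
        return x
    return np.clip(np.round(x / scale), -levels, levels) * scale

weights = np.random.randn(128, 64).astype(np.float32)   # stand-in 32-bit weights
w8 = quantize(weights, bits=8)   # the typical quantization level elsewhere
w4 = quantize(weights, bits=4)   # far less memory/energy, small accuracy cost
w2 = quantize(weights, bits=2)   # coarser still; maybe another fraction of a percent
w1 = quantize(weights, bits=1)   # the extreme case: lowest power, biggest accuracy hit
```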
And just to translate that for folks listening, I like to kind of think of the metaphor of, you know, the box of crayons, right? So,
you know, if you are identifying the colors of the world around you, and you only have the eight
crayon box, then you have to say, well, that's blue, and that one's red. If you have the 64 crayon box, maybe that's cyan and that's mauve or something. You know, there's a whole bunch of shades of gray, or shades of blue, shades of red. And I think, to me, that's the difference in what you're talking about. So if the task is identify red lights, and you've got
the 64 box, you might have, you know, a bunch of different reds to choose from, but you're going to
still be able to identify that light as red, even if you have the eight box of crayons, because you
can look at it and say, yep, red, you know, and to me, that kind of gets to this whole, you know,
maybe sometimes we don't need this level of precision.
Maybe we can get away with something incredibly low, like you're saying, like two bits or four bits of precision.
Right. That's exactly right.
So I've got a question on something we were talking about earlier, and even at the AI Field Day, I guess: creating this neural processing mesh, right? Generally speaking, neural processing these days is somewhat reserved to cloud locations only, whether by choice or by vendor push or a combination thereof, and the power requirements and all that. So you suggesting a neural processing mesh brings to my mind at least a few differentiators. Just to mention some: first of all, generally, these meshes have been created using software, mostly, until now, right? Having a hardware or chip-based mesh in itself is a true differentiator. And the second thing in my mind: any nodal point of the mesh itself can have the same capabilities as any other node in there. So you create a true mesh
of a neural processing capability. I mean, that sounds very powerful, but what are the use cases?
Where do you think this is applicable?
Well, I think there's two things to remember that we can take advantage of. If you want to use four of the nodes for an audio application, you still have 16 other nodes, which is a whole bunch of processing power. And you don't have to go into and out of external memory.
The memory is resident
in the node. So the idea of doing the mesh really allowed us to distribute memory throughout
the neural fabric. And again, that helps reduce power because you're not sucking up bandwidth
going into and out of external memory. So I think that's the power of the mesh
is it allows us to partition the neural fabric
so you can do multiple applications on the same device.
Again, with all the benefits,
each one of those networks running can learn on the fly,
take advantage of the extremely low power
because we have distributed the memory within the mesh.
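A rough way to picture the partitioning Lou describes: a fabric of identical nodes, each with its own resident memory, that can be split among independent networks. The sketch below is a toy model; the node count, memory size, and names are assumptions, not the actual Akida architecture.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Node:
    node_id: int
    local_memory_kb: int = 128       # weights/activations stay resident in the node
    assigned_to: Optional[str] = None

class NeuralMesh:
    """Toy mesh: partition identical nodes among independent networks."""

    def __init__(self, num_nodes=20):
        self.nodes = [Node(i) for i in range(num_nodes)]

    def partition(self, app_name, count):
        # Reserve 'count' free nodes for one application.
        free = [n for n in self.nodes if n.assigned_to is None]
        if len(free) < count:
            raise RuntimeError("not enough free nodes in the fabric")
        for node in free[:count]:
            node.assigned_to = app_name
        return free[:count]

mesh = NeuralMesh(num_nodes=20)
mesh.partition("audio keyword spotting", 4)   # four nodes for an audio network...
mesh.partition("vision", 16)                  # ...sixteen left for a vision network
```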
So obviously the open source world, the whole PyTorch and the libraries and the whole nine yards, is popularizing the way of doing machine learning model training and AI model training, and everything is done in software. What you're proposing is, you know, that there is another way to do that as well. So how do you see the future evolving? Are you going to be working closely with them together? Are you going to replace them? Are you going to augment, you know, supplement? How do you think this is going to work?
Well, you know, one thing we haven't touched on, and I think it's an important attribute as well,
is the design flow. To move from this big convolutional neural network that, you know, may be established within an organization, the design flow doesn't change. We use TensorFlow, we use Python scripts. So from the potential customer's standpoint, they don't need to learn a new language. They run in the exact same flow that they have.
Once they get to their quantized level,
then we align the Akida layers to what they've completed.
And we do our flow in the background.
We have an Akida development environment,
which is very robust,
but doesn't require the front end to change at all.
We happen to be in TensorFlow and Python.
You can do it in PyTorch.
There's Caffe out there.
There's a whole bunch of tools that people use.
And we're somewhat agnostic to that.
The flow allows them to move through their front end.
And then we take their quantized levels
and we move it into the Akida-compatible
environment.
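To show the shape of the flow Lou describes, here is a hedged sketch: the customer builds and trains in standard TensorFlow/Keras as usual, quantizes, and then hands off to the vendor toolchain. The quantize_model and convert_to_akida calls are placeholders standing in for that hand-off, not real API names.

```python
import tensorflow as tf

# 1. Build and train the network exactly as you would today (unchanged front end).
model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(16, 3, activation="relu", input_shape=(32, 32, 3)),
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(10, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
# model.fit(train_images, train_labels, epochs=5)   # the usual training loop

# 2. Quantize weights/activations down to low bit widths (placeholder step).
# quantized = quantize_model(model, weight_bits=4, activation_bits=4)

# 3. Align the quantized layers to the Akida-compatible environment (placeholder step).
# akida_model = convert_to_akida(quantized)
```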
Well, I think that this, honestly, this conversation could go on a long time because frankly, there's
a lot going on here.
I hope that we'll see you at the next AI Field Day to dive into some of these topics a little
bit deeper.
And I know that folks who are listening, if they have questions, you know, I do recommend checking out the AI Field Day presentations.
Just Google, you know, BrainChip AI Field Day, and you'll find the presentations on YouTube.
And you can dive into a little bit more about how this works.
But in the interest of time, we do have to wrap up the podcast.
And since this is season two of Utilizing AI, I'd like to wrap up
with a couple of easy questions for you, Lou. And I warned you a little bit, but I didn't tell you
what the questions were going to be. So I'm sure you're wondering what I'm going to ask. So let's
go with one that's a little bit outside of BrainChip's capabilities now, but well, maybe it's not. So here's a question for you.
How long will it take for a natural conversational AI to pass the Turing test and fool an average
person in a verbal conversation? It's going to be a long time.
It's going to be a long time. And that requires probably cloud-like support.
The timeframe for that to be in an edge device is going to be quite some time.
Okay.
How about this one?
And maybe you've got a little bit more of an idea about this one.
When will we see video-focused or visually-focused ML in the home that operates like the audio-based ML of assistants like Siri and Alexa?
Very soon. Very soon. You know, our capability depends on how deep you want to go.
But our capability to determine whether there's a person in the room or not, it's at hand now. Whether you're using
a standard pixel-based camera or you're using a DVS device, which is dynamic vision sensor,
that's at hand now. Now, depending on the marketplace, who decides to deploy at what
level because of privacy concerns, there's market dynamics that I think need to be worked out.
Actually, if I can, if I may,
there are a couple of commercially available solutions.
We can have another conversation on that,
which could do customer support based on a visual thing.
So you could talk to somebody, to an AI-enabled, you know, persona,
and it'll decide based on your conversation. They are not fully mature like the Alexas of the world, but it's getting there. All right. Lou, one more
question then. Are there any jobs that are going to be completely eliminated by AI in the next five years?
Undoubtedly.
But that could be very well offset,
if not surpassed, by the jobs that are being created.
Take a company like BrainChip.
We've got 40 employees now.
I don't know where we'll be.
And we haven't put a forecast
out in the public domain,
but we'll certainly have
a whole bunch more people
in the next year, two and three.
So I think it's maybe a redistribution.
It's not necessarily a loss, but a redistribution.
Well, thank you so much for that.
And I really did enjoy this conversation.
Again, I think we could have gone on a lot longer,
but that's the problem with the podcast format. You've got a clock to meet. So, Lou, where can people connect
with you to follow your thoughts on enterprise AI and other topics? Well, there's several ways.
Certainly, our LinkedIn page is very active. We've got a couple of thousand, maybe 3,000
followers on LinkedIn. We have a Twitter location.
All of them are located at the bottom of any one of our press releases.
Frankly, I take emails directly.
People can reach out to me directly.
It's L DiNardo at brainchip.com.
We also, for investors, we have an IR location.
So it would be IR at brainchip.com.
But there's lots of ways to reach us.
And, you know, we have a YouTube channel all set up.
And you can see, actually, I think there's a link to the field day.
And, Stephen, I'll tell you right now, we will participate in your next field day.
And we appreciate the invitation.
Excellent. Excellent. Andy, how about you?
Sure. People can find me on Twitter at AndyThurai or connect with me on LinkedIn,
or they can find more details from my website at thefieldcto.com. That's thefieldcto.com. Great.
And of course,
you can find me on Twitter
at S Foskett
and you can find my writing
at gestaltit.com,
among other places.
So thank you very much
for listening to the
Utilizing AI podcast.
If you enjoyed this discussion,
please do go to your favorite
podcast application.
Give us a rating,
a subscription,
a review.
That really does help
our visibility. And please share news about this podcast with your friends. The podcast is brought
to you by gestaltit.com, your home for IT coverage from across the enterprise, as well as thefieldcto.com.
For show notes and more episodes, go to utilizing-ai.com or find us on Twitter
at utilizing underscore AI. Thanks for joining
and we'll see you next time.