Utilizing Tech - Season 7: AI Data Infrastructure Presented by Solidigm - 2x06: Moving AI To the Edge with BrainChip
Episode Date: February 9, 2021. BrainChip is developing a novel ultra low power “neuromorphic” AI processor that can be embedded in literally any electronic device, rather than centralizing learning in high performance processors. Today’s edge devices are applying existing models to process inputs but can’t actually learn in the field, but on-chip learning and inference could radically alter the capabilities of devices in automotive, home, medical, and other remote locations. BrainChip is able to reduce power thanks to the neuromorphic self-learning approach and also because they reduce precision down to 4 bits or less. This loses some accuracy, but only a little. The company also creates a mesh of cores that have access to local memory, enabling flexibility of processing. Guests and Hosts: Lou DiNardo, President and CEO of BrainChip. Connect with Lou on LinkedIn. Andy Thurai, technology influencer and thought leader. Find Andy’s content at theFieldCTO.com and on Twitter at @AndyThurai. Stephen Foskett, Publisher of Gestalt IT and Organizer of Tech Field Day. Find Stephen’s writing at GestaltIT.com and on Twitter at @SFoskett. Date: 2/9/2021 Tags: @SFoskett, @AndyThurai, @BrainChip_Inc
Transcript
Welcome to Utilizing AI, the podcast about enterprise applications for machine learning,
deep learning, and other artificial intelligence topics. Each episode brings together experts in
enterprise infrastructure to discuss applications of AI in today's data center and beyond. Today,
we're actually stepping outside the data center to learn more about
neuromorphic processing and AI at the edge. Let's meet our guest. Lou DiNardo is the CEO of
BrainChip, which presented at our Gestalt IT AI Field Day last year. We were so impressed that
we wanted to bring Lou on to talk more about AI at the Edge.
Well, thank you for having me.
And I must say at the beginning here,
the AI Field Day was a tremendous success.
It was a very well-run event.
BrainChip plays a very special role in the AI environment
in that we are exclusively targeted at edge
applications for good reasons that we can talk about. And I am Andy Thurai, your co-host,
founder and principal at thefieldcto.com, where we provide consulting analysis and unbiased
emerging tech advisory services. You can follow me on Twitter at Andy Thurai or at thefieldcto.com. That's at
thefieldcto.com. Thanks. And as you know, I'm Stephen Foskett, organizer of Tech Field Day
and publisher of Gestalt IT. You can find me on Twitter at S Foskett. So Lou, one of the unique
aspects of AI is centralized training, centralized processing, centralized everything.
And for the most part, that's been because AI has such a heavy footprint. It requires a lot
of computing power and a lot of power, power and cooling and everything. And all of that really,
really adds up. And it fundamentally changes the design and the nature of AI applications, because what you end up with is centralized AI in the data center instead of AI at the edge everywhere,
simply because you can't do it. What you were talking about with BrainChip at AI Field Day is
that you can do it, that you can move AI processing, not just to edge devices like
computers and stuff, but to basically everything.
Do I have that right? That is exactly correct.
The basics of it are fundamental neuromorphic technology. We process data, or information, just like the human brain. So we look only at events, what we call non-zero activations. If something isn't different, something isn't new,
we don't process it. In today's architectures, which you're correct, primarily sit in the cloud
or sit in a data center, every single piece of information gets processed.
You can't do that at the edge, where you need ultra low power. We're talking microwatts of power, not even milliwatts. Not thousandths of a watt; we're talking microwatts, millionths of a watt. And with that ability, you can move the analytics to the edge, right up against the transducer, so that you don't have to process all the data. You don't have to suck up the bandwidth to communicate all of that data across a LAN or Wi-Fi, or even a 5G network. You don't need all that data. So we do the processing at the edge, at the device,
and it is really a one-of-a-kind solution right now. And that, I think, is the thing that really
caught my attention when I got the first briefing from BrainChip, because the idea that you can,
basically, that you guys have developed this core that is so low power. I mean, for listeners,
we're talking about, you know, like the difference between like an Arduino and a Raspberry Pi
versus a regular computer, not the differences between, like I said, like a laptop and a data
center. I mean, the idea would be that these things could literally be everywhere. And not
just that they would be doing the pattern matching and stuff,
but they could actually be learning at the edge. Is that right, Lou?
That's probably one of the most fundamental things that we bring to the table,
aside from low power and a complete solution. This is not a deep learning accelerator that
needs a CPU as a host and external memory. Everything is included
in what we call Akida. Akida is the Greek word for spike. So we kind of stick to our knitting
on what neuromorphic technology really means. But what we can do is learn at the edge. Today, there is really no true learning going on at the edge; it's training. And every time
a new object or a new event occurs, you'd have to go back to the cloud, or you'd have to go back to
a data center and retrain the entire network. With Akita, up against any transducer. It could be vibration, temperature, flow, pressure,
something new, vision, audio. Something new happens and Aikido learns on the fly in the field.
You don't have to go back and retrain the network. So if you start a vision application and you're looking for 10 things, you want to know if it's a dog, a cat, a kid, a person in general.
But now all of a sudden you have something else that you'd be interested in.
Or we recognize that pattern that says there's something you should be interested in.
We learn in the field.
The device does not have to be retrained. That is a very, very powerful solution for any autonomous application.
And the ability to send back metadata rather than all of the data, those are two very, very powerful solutions for edge applications.
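To make the on-chip learning idea concrete, here is a minimal Python sketch of one way a device could add a new class from a single example without retraining a network: a fixed feature extractor plus stored class prototypes. This is only an illustration of the concept, not BrainChip's actual method; the embed function and the class name are hypothetical.

```python
import numpy as np

class OneShotEdgeClassifier:
    """Toy nearest-prototype classifier: 'learning' a class is just storing one vector."""

    def __init__(self, embed):
        self.embed = embed        # fixed, pre-trained feature extractor (assumed)
        self.prototypes = {}      # label -> stored feature vector

    def learn(self, label, sample):
        # Add a new class from a single example, with no retraining pass.
        self.prototypes[label] = self.embed(sample)

    def classify(self, sample):
        # Return the label whose stored prototype is closest to this sample.
        if not self.prototypes:
            return None
        v = self.embed(sample)
        return min(self.prototypes,
                   key=lambda lbl: np.linalg.norm(v - self.prototypes[lbl]))

# Hypothetical usage: one image of a tiger, one of an elephant, and the device
# can now tell them apart without a round trip to the cloud.
# clf = OneShotEdgeClassifier(embed=my_feature_extractor)
# clf.learn("tiger", tiger_image)
# clf.learn("elephant", elephant_image)
# print(clf.classify(new_image))
```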
Right. So let me jump in and double-click on a couple of items from what he just said, right? So obviously, for the last, I don't know, 30, 40 years, the big data
center chip players like Intel and AMD popularized the whole concept of building everything at a
center location and you have to move everything there. And then of course, the ARM chips and even
the NVIDIA chips kind of popularized the notion of you could build somewhat of a smaller chip.
It doesn't have to be as big or powerful as a data center chip. And you can move
that to the edge to do those things. Right. And then he also touched on one of my pet peeves: if you look at the data enterprises, as we call them, not necessarily a regular enterprise, but a company that has produced a ton of data, they can't process that data at the edge.
So they've got to move everything to the cloud, not just for model creation,
but even to do analysis and stuff like that.
So there are two issues there.
One is, you know, you collect all the data somewhere to do model creation. That's issue number one: it has to be centralized. And issue number two, as you were saying, is that the model inference itself at times could be a problem at the edge. So either they do half at the edge and half in the cloud, and then the reinforcement learning, learning from that to update your models, becomes another issue, because you have to do it at the edge the first time and then move it to the centralized location to retrain the models and stuff. I know it's a long question that I asked you, but the combination is a problem.
So how are you positioning yourself to solve all of the above?
Well, it's a huge transition, maybe call it a disruption.
Everything that you just described is based on what us old guys know as the Von Neumann model.
The computing architecture that's been around for the last 50 plus years is instruction, fetch data, process.
Instruction, fetch data, process.
Instruction.
Now, we've gotten away with that for 50 years because we just jack up clock speeds.
But when you jack up clock speeds, you require more power.
This is the transition to neuromorphic where there is no program involved.
We look at events.
An event is a change in data, and we look for repeating patterns. If you take something like LIDAR,
which is very, very pervasive in the autonomous vehicle market or ADAS,
the sparsity of that data could be as much as 90%. So we only have to process 10% of the data.
We're not doing matrix multiplication,
lines of code upon lines of code upon lines of code. We only see and we only process what,
again, I call non-zero activations. That is a huge, huge transition and will be very disruptive
in enabling AI at the edge.
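As a rough illustration of why event-based processing saves work, here is a toy Python sketch, assuming inputs arrive as flattened numeric frames. Only the values that changed become events, so the compute scales with the number of events rather than the number of samples. This is the general idea only, not Akida's hardware pipeline.

```python
import numpy as np

def to_events(prev_frame, frame, threshold=0.0):
    """Return only the entries that changed: the 'non-zero activations'.
    With 90% sparsity (e.g. LIDAR), this keeps roughly 10% of the raw data."""
    diff = frame - prev_frame
    idx = np.flatnonzero(np.abs(diff) > threshold)
    return idx, diff[idx]                    # (locations, values) of the events

def process_events(weights, idx, values):
    """Accumulate contributions event by event; zeros never reach the math."""
    out = np.zeros(weights.shape[1])
    for i, v in zip(idx, values):
        out += weights[i] * v                # work scales with events, not inputs
    return out

# Hypothetical usage with a flattened 1,000-sample frame and 10 outputs:
# weights = np.random.randn(1000, 10)
# idx, values = to_events(prev_frame, frame, threshold=0.01)
# result = process_events(weights, idx, values)
```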
So let me ask you a follow-up question on that.
Now that you brought up that LiDAR subject,
people don't realize,
particularly when it comes to autonomous cars, right,
or autonomous trucks or anything that's moving,
because when you kind of build eyes to those moving objects,
you know, I could throw drones in that category as well.
When it's going in, you're kind of educating them,
you know, to an extent saying that, you know,
if this happens, if you have already seen that,
this is how you have to act, which is fine.
But when the unknown unknown happens
is where the actual problem comes in, right?
So it has to figure out what to do on its own, particularly in the case of drones, because the connectivity is going to be very low. So that's where on-chip learning, on-chip inference, and updating the models and figuring out what to do on the fly come in, right? I mean, drones can eventually move into autonomous flying planes and whatnot, right? So that's where I think the market is going, and where your offering could be more helpful. That's my view. What do you think about that situation?
It's a very wide spectrum of use cases.
And I've kind of coined or followed a phrase that I learned very, very early on in my career. It's not what to do. It's which to do.
Everybody's chasing AI. But in our case, we're being very selective. We're playing at the edge. But the edge itself is a very wide spectrum. So we're engaged with NASA. I mean, think about autonomy in space flight.
The ability to reduce power and have, you know, smart analytics on the fly, so to speak, on the fly in space.
Aircraft, you know, you could go to the other end of the spectrum and we're doing vibration analysis on bearings in railroads, you know, railroad cars,
preventive maintenance. We have an application that we're working on, which is a volatile organic
compound detector, so that you can, you know, you can breathe into a device just like a breathalyzer
and possibly get to a point where you can diagnose infectious disease,
pre-cancers or early cancers. And this is why we have a really good feel about the company
with respect to providing artificial intelligence for beneficial purposes.
So that's a very wide spectrum, but we're being very careful. We have an early access program, as many companies do, but we're being very selective about those applications.
The 8086, the Intel 8086, way back when, was, you know, one of the first and most popular processors.
And of course, that's matured over time.
But then you had companies like Microchip come in, STMicro, Renesas, with microcontrollers.
But they all still need to be programmed.
And the big step forward with neuromorphic processing or
neuromorphic computing is that the processing doesn't exist in lines of code. And I think that that's
the key differentiator here. I think that some people listening to this may say, well,
we have autonomous vehicles. We can do image detection and object detection in a doorbell or a Raspberry Pi or whatever.
What is he talking about that's so special here?
And I think that that's the important thing.
The thing that's special, I think, as Andy was trying to get at here as well, is that we're not just applying an existing model or an existing library to inputs and kind of hoping for the best.
That the chip is actually building these things on the fly. And as you said, you know, I mean, we saw this demonstrated
with AI Field Day that, you know, a single image of a tiger allowed it to identify tigers. Another
image of an elephant allowed it to differentiate between tigers and elephants. You know, we saw it doing this with, as you mentioned, with the, you know, processing
smells, basically.
You know, and the difference is that that's happening entirely on-chip at low power, right?
Exactly right.
So take as an example an autonomous vehicle.
You're right. They're driving around the streets of San Francisco right now. But all of the sensors, and there are dozens if not hundreds of sensors, are sending all the data either to the trunk or under a seat to a bunch of GPUs, graphics processing units, that are burning hundreds of watts. I mean, literally hundreds of
watts. I mean, think about a hundred watt light bulb. Put three or four of them in a suitcase
and see what happens. So you've got all of the processing power is consuming energy,
but there's power dissipation. So it's not just the consumption, but it's the dissipation because
nothing is 100% efficient. So you generate heat. Now you have to have cooling. When you're driving
an electric vehicle, which most autonomous vehicles are or will be, the last thing you
want to do is waste power. That draws on the battery. If we can do the analytics at the edge, and as I said, LIDAR could be 90% sparse.
Even an RGB camera, whether it's low resolution or you're talking 4K, you're still going to have 50% sparsity, which means 50% of the data doesn't need to go back to the trunk and consume power. And that I think is one of the
big benefits. And I use the automobile application because Andy brought it up, but that is one of
the greatest advantages in all of the things that I've talked about so far. So I want to bring up
the reference from the AI Field Day about tigers, by the way. There was even a joke, if any of you go back and watch that: we thought you were going to stop at "hot dog, not hot dog." Remember the Silicon Valley joke? But then you went a step up. It's not just tiger or not tiger; you were also sensing the other objects as well. So it was impressive.
But I want to double click on the use case you talked about. People don't realize how key that
is. It's basically disrupting the healthcare field because most times in the healthcare field,
things are still done in the old fashioned way, right? But what you're suggesting about having a breathalyzer
analyze the molecules and figure out the early signs of some disease, imagine how great this would be if you're already ahead of the game and, by chance, there's a COVID breathalyzer. You know, rather than doing the test and waiting for hours, all you have to do, if you're going to fly in an airplane, is go to security and do the breathalyzer, the same thing you do with driving, right? And then if you're clean, you can go. I mean, it's not there yet, but imagine that. That's disruption, that it can figure something out on the fly, right? Talk about it.
It's not there yet, but it's getting darn close.
And you're right.
The transportation industry, the healthcare industry, going to a hospital.
I had to go to the hospital last week for an accident that I had.
Imagine if they could just give you a breathalyzer and check for COVID, H1N1, even MERS on your way out, which, you know, which is a big problem with people that stay in the hospital.
But it's getting very close.
Of course, that's theoretical.
You know, you don't have that right now.
I want to make sure that nobody thinks that this is, you know, an existing application. But certainly this is the sort of thing that you can envision being able to do with a portable low-powered device. One of the things that occurred to me when you all, again,
when you briefed me was like, how is this possible? Like, how come you guys can do it with, you know,
a millionth of a watt, and everybody else is doing it with, you know, many watts. I mean, for what it's worth, a conventional AI processing or learning setup with a bunch of GPUs and stuff, you're looking at like 500 to 1,000 watts of power draw, at least. How come you guys can do it at such low power?
There's really two fundamental reasons.
And I have to give a great deal of credit to Peter van der Made, our founder. He's the brain in BrainChip. He's been working on what we call spiking neural networks, or, to bring it down to a level that many more can understand, we call it event-based. So again, we play on sparsity. We only look at data, and only process data, when it's important. The other thing that we've done in Akida, which really brings down the power a great deal, is that in virtually all
other architectures, you start with some floating point math. So you might start with 32-bit floating
point, and then you quantize. You take floating point, you turn it into an integer, you take that integer, and you quantize down. And in virtually all cases, they quantize down to eight bits. And that level of quantization impacts your accuracy. It may be the difference
between 99% accurate or 97 or 96 or 95, depending on the architecture. We quantize down to one, two, and four bits. So we start
by playing on sparsity because we're in the event domain. And once we have both sparsity
in weights, as well as in activations, then we quantize down to one, two, or four bits. And you can see some charts that we present
and maybe Anil did it at the field day.
If not, we'd be happy to share it at the next field day.
But you can play with the weights.
You can say, okay, I want four bits and four bits.
And therefore, maybe I'll get another percent of accuracy.
I'll take weights at two bits.
I'll take activations at four bits.
Then maybe you lose a half a percent.
I could take it all the way down.
We could take it all the way down to one bit and one bit.
And maybe you're going to lose several percent of accuracy.
But all of that is what allows us to do this at extremely low power,
sparsity and quantization.
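For listeners who want to see what quantizing down to a few bits looks like, here is a small Python sketch of uniform symmetric quantization. The accuracy remarks in the comments are illustrative, echoing the ranges Lou mentions, not measured results.

```python
import numpy as np

def quantize(x, bits):
    """Uniform symmetric quantization of a float tensor to roughly 'bits' bits."""
    levels = max(2 ** (bits - 1) - 1, 1)     # e.g. 4 bits -> 7 levels per side
    scale = np.max(np.abs(x)) / levels
    if scale == 0:
        return x
    return np.clip(np.round(x / scale), -levels, levels) * scale

weights = np.random.randn(128, 64).astype(np.float32)   # stand-in 32-bit weights
w8 = quantize(weights, bits=8)   # the typical quantization level elsewhere
w4 = quantize(weights, bits=4)   # far less memory/energy, small accuracy cost
w2 = quantize(weights, bits=2)   # coarser still; maybe another fraction of a percent
w1 = quantize(weights, bits=1)   # the extreme case: lowest power, biggest accuracy hit
```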
And just to translate that for folks listening, I like to kind of think of the metaphor of, you know, the box of crayons, right? So,
you know, if you are identifying the colors of the world around you, and you only have the eight
crayon box, then you have to say, well, that's blue, and that one's red. If you have the 64 crayon box, maybe that's cyan and that's mauve or something. You know, there's a whole bunch of shades of gray, or shades of blue, shades of red. And I think, to me, that's the difference in what you're talking about. So if the task is identify red lights, and you've got
the 64 box, you might have, you know, a bunch of different reds to choose from, but you're going to
still be able to identify that light as red, even if you have the eight box of crayons, because you
can look at it and say, yep, red, you know, and to me, that kind of gets to this whole, you know,
maybe sometimes we don't need this level of precision.
Maybe we can get away with something incredibly low, like you're saying, like two bits or four bits of precision.
Right. That's exactly right.
So I've got a question on something we were talking about earlier, and even at the AI Field Day, I guess: creating this neural processing mesh, right? Generally speaking, neural processing these days is somewhat reserved to cloud locations only, whether by choice or by vendor push or a combination thereof, and the power requirements and all that. So you suggesting a neural processing mesh brings to my mind at least a few differentiators. Just to mention some: first of all, generally, these meshes have been created using software, mostly, until now, right? Having a hardware or chip-based mesh in itself is a true differentiator. And the second thing in my mind: any nodal point of the mesh itself can have the same capabilities as any other node in there. So you create a true mesh
of a neural processing capability. I mean, that sounds very powerful, but what are the use cases?
Where do you think this is applicable?
Well, I think there's two things to remember that we can take advantage of. If you want to use four of the nodes for an audio application, you still have 16 other nodes, which is a whole bunch of processing power. And you don't have to go into and out of external memory.
The memory is resident
in the node. So the idea of doing the mesh really allowed us to distribute memory throughout
the neural fabric. And again, that helps reduce power because you're not sucking up bandwidth
going into and out of external memory. So I think that's the power of the mesh
is it allows us to partition the neural fabric
so you can do multiple applications on the same device.
Again, with all the benefits,
each one of those networks running can learn on the fly,
take advantage of the extremely low power
because we have distributed the memory within the mesh.
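A rough way to picture the partitioning Lou describes: a fabric of identical nodes, each with its own resident memory, that can be split among independent networks. The sketch below is a toy model; the node count, memory size, and names are assumptions, not the actual Akida architecture.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Node:
    node_id: int
    local_memory_kb: int = 128       # weights/activations stay resident in the node
    assigned_to: Optional[str] = None

class NeuralMesh:
    """Toy mesh: partition identical nodes among independent networks."""

    def __init__(self, num_nodes=20):
        self.nodes = [Node(i) for i in range(num_nodes)]

    def partition(self, app_name, count):
        # Reserve 'count' free nodes for one application.
        free = [n for n in self.nodes if n.assigned_to is None]
        if len(free) < count:
            raise RuntimeError("not enough free nodes in the fabric")
        for node in free[:count]:
            node.assigned_to = app_name
        return free[:count]

mesh = NeuralMesh(num_nodes=20)
mesh.partition("audio keyword spotting", 4)   # four nodes for an audio network...
mesh.partition("vision", 16)                  # ...sixteen left for a vision network
```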
So obviously the open source world, the whole PyTorch and the libraries and the whole nine yards, is popularizing the way of doing machine learning model training and AI model training, and everything is done in software. What you're proposing is, you know, that there is another way to do that as well. So how do you see the future evolving? Are you going to be working closely with them together? Are you going to replace them? Are you going to augment, you know, supplement? How do you think this is going to work?
Well, you know, one thing we haven't touched on, and I think it's an important attribute as well,
is the design flow. To move from this big convolutional neural network that, you know, may be established within an organization, the design flow doesn't change. We use TensorFlow, we use Python scripts. So from the potential customer's standpoint, they don't need to learn a new language. They run in the exact same flow that they have.
Once they get to their quantized level,
then we align the Akida layers to what they've completed.
And we do our flow in the background.
We have an Akida development environment,
which is very robust,
but doesn't require the front end to change at all.
We happen to be in TensorFlow and Python.
You can do it in PyTorch.
There's Caffe out there.
There's a whole bunch of tools that people use.
And we're somewhat agnostic to that.
The flow allows them to move through their front end.
And then we take their quantized levels
and we move it into the Akida-compatible
environment.
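To show the shape of the flow Lou describes, here is a hedged sketch: the customer builds and trains in standard TensorFlow/Keras as usual, quantizes, and then hands off to the vendor toolchain. The quantize_model and convert_to_akida calls are placeholders standing in for that hand-off, not real API names.

```python
import tensorflow as tf

# 1. Build and train the network exactly as you would today (unchanged front end).
model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(16, 3, activation="relu", input_shape=(32, 32, 3)),
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(10, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
# model.fit(train_images, train_labels, epochs=5)   # the usual training loop

# 2. Quantize weights/activations down to low bit widths (placeholder step).
# quantized = quantize_model(model, weight_bits=4, activation_bits=4)

# 3. Align the quantized layers to the Akida-compatible environment (placeholder step).
# akida_model = convert_to_akida(quantized)
```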
Well, I think that this, honestly, this conversation could go on a long time because frankly, there's
a lot going on here.
I hope that we'll see you at the next AI Field Day to dive into some of these topics a little
bit deeper.
And I know that folks who are listening, if they have questions, you know, I do recommend checking out the AI Field Day presentations.
Just Google, you know, BrainChip AI Field Day, and you'll find the presentations on YouTube.
And you can dive into a little bit more about how this works.
But in the interest of time, we do have to wrap up the podcast.
And since this is season two of Utilizing AI, I'd like to wrap up
with a couple of easy questions for you, Lou. And I warned you a little bit, but I didn't tell you
what the questions were going to be. So I'm sure you're wondering what I'm going to ask. So let's
go with one that's a little bit outside of BrainChip's capabilities now, but well, maybe it's not. So here's a question for you.
How long will it take for a natural conversational AI to pass the Turing test and fool an average
person in a verbal conversation? It's going to be a long time.
It's going to be a long time. And that requires probably cloud-like support.
The timeframe for that to be in an edge device is going to be quite some time.
Okay.
How about this one?
And maybe you've got a little bit more of an idea about this one.
When will we see video-focused or visually-focused ML in the home that operates like the audio-based ML of assistants like Siri and Alexa?
Very soon. Very soon. You know, our capability depends on how deep you want to go.
But our capability to determine whether there's a person in the room or not, it's at hand now. Whether you're using
a standard pixel-based camera or you're using a DVS device, which is dynamic vision sensor,
that's at hand now. Now, depending on the marketplace, who decides to deploy at what
level because of privacy concerns, there's market dynamics that I think need to be worked out.
Actually, if I can, if I may,
there are a couple of commercially available solutions.
We can have another conversation on that,
which could do customer support based on a visual thing.
So you could talk to somebody, to an AI-enabled, you know, persona,
and it'll decide based on your conversation. They are not fully mature like the Alexas of the world, but it's getting there. All right. Lou, one more
question then. Are there any jobs that are going to be completely eliminated by AI in the next five years?
Undoubtedly.
But that could be very well offset,
if not surpassed, by the jobs that are being created.
Take a company like BrainChip.
We've got 40 employees now.
I don't know where we'll be.
And we haven't put a forecast
out in the public domain,
but we'll certainly have
a whole bunch more people
in the next year, two and three.
So I think it's maybe a redistribution.
It's not necessarily a loss, but a redistribution.
Well, thank you so much for that.
And I really did enjoy this conversation.
Again, I think we could have gone on a lot longer,
but that's the problem with the podcast format. You've got a clock to meet. So, Lou, where can people connect
with you to follow your thoughts on enterprise AI and other topics? Well, there's several ways.
Certainly, our LinkedIn page is very active. We've got a couple of thousand, maybe 3,000
followers on LinkedIn. We have a Twitter location.
All of them are located at the bottom of any one of our press releases.
Frankly, I take emails directly.
People can reach out to me directly.
It's L DiNardo at brainchip.com.
We also, for investors, we have an IR location.
So it would be IR at brainchip.com.
But there's lots of ways to reach us.
And, you know, we have a YouTube channel all set up.
And you can see, actually, I think there's a link to the field day.
And, Stephen, I'll tell you right now, we will participate in your next field day.
And we appreciate the invitation.
Excellent. Excellent. Andy, how about you?
Sure. People can find me on Twitter at AndyThurai or connect with me on LinkedIn,
or they can find more details from my website at thefieldcto.com. That's thefieldcto.com. Great.
And of course,
you can find me on Twitter
at S Foskett
and you can find my writing
at gestaltit.com,
among other places.
So thank you very much
for listening to the
Utilizing AI podcast.
If you enjoyed this discussion,
please do go to your favorite
podcast application.
Give us a rating,
a subscription,
a review.
That really does help
our visibility. And please share news about this podcast with your friends. The podcast is brought
to you by gestaltit.com, your home for IT coverage from across the enterprise, as well as thefieldcto.com.
For show notes and more episodes, go to utilizing-ai.com or find us on Twitter
at utilizing underscore AI. Thanks for joining
and we'll see you next time.