Embedded - 497: Everyone Likes Tiny
Episode Date: March 20, 2025

OpenMV has a new Kickstarter, so CEO Kwabena Agyeman chatted with us about more powerful (and smaller!) programmable cameras. See OpenMV's site for their existing cameras. See their (already funded!) Kickstarter page for the super powerful N6 and the ridiculously small AE3. Note that OpenMV is still committed to open source. See their GitHub if you want to know more. Edge AI is the idea of putting intelligence in the devices (instead of in the cloud). There is an advocacy and education foundation called the Edge AI Foundation. This organization was formerly the TinyML Foundation. Edge Impulse and Roboflow are companies that aid in creating and training AI models that can be put on devices. ARM talks about their Ethos-U55 NPU (and how to write software for it).
Transcript
Welcome to embedded. I am Alicia White alongside Christopher White. Our guest this week is
Kwabena Agyeman and we are going to talk about cameras and optimizing processors and neural
networks and Helium instructions.
I'm sure it will all make sense eventually.
Hi, Kwabena.
Welcome back.
Hey, nice to see you guys again.
Thanks for having me on.
Could you tell us about yourself as if we didn't record about a year ago?
Hi.
So, I'm Kwabena.
I run a company called OpenMV, and what we've been doing is really exploring computer
vision on microcontrollers.
We got our start back in 2013.
Well, actually a little bit before that.
I'm part of the original CMU Cam crew.
So I built the CMU Cam 4 back in 2011.
And that was basically doing color
tracking on a microcontroller.
Since then, we founded a company called OpenMV
that is doing basically higher level algorithms,
so line tracking, QR code detection, April tags,
face detection through Haar cascades, and other various vision algorithms.
And we put that on board pretty much the lowest end processors,
well, sorry, not the lowest end processors,
the highest end microcontrollers,
but lowest end CPUs compared to a desktop CPU.
And so we got software that was meant for the desktop
running on systems that had barely any resources
compared to the desktop.
Anyway, that was 100,000 OpenMV cams ago when we started. And since then, as I
said, we've sold over 100,000 of these things into the world. People love them and use them
for all kinds of embedded systems. And we're launching a Kickstarter. We've already launched
it right now. That is about the next gen, which is about 200 to 600 times in performance and really
kind of takes microcontrollers and what you can do with them and computer vision applications
in the real world to the next level where you can really deploy systems into the world
that are able to run on batteries for all day and can actually process data using neural network
accelerators on chip and make mobile applications.
All right.
So now I want to ask you a bunch of questions that have
nothing to do with any of that.
That sounds good.
That was a mouthful.
I apologize.
It's time for a lightning round, where we will ask you short questions and we want short
answers and if we're behaving ourselves, we won't ask you for additional details.
Are you ready?
Yes.
What do you like to do for fun?
What do I like to do for fun?
Right now, it's really just working out in the morning.
I do like to walk around the Embarcadero in San Francisco and try to get a run in in the
morning.
What is the coolest camera system that is not yours?
Coolest camera system that is not mine?
Let's see.
There's a lot of different stuff.
I'm really impressed by just, and did self-driving in a previous life.
And so it was awesome to see what you were able to do with that and being able to rig
up trucks and such that could drive by themselves back at my last job.
And then seeing Waymo's and etc. drive around downtown in San Francisco has been incredible. And in fact, getting to take my 80 year old parents in them
and thinking about how my dad grew up
without electricity back in the 40s
and seeing him in a car that's driving around now,
that's a crazy amount of change.
Specifics of your preferred Saturday morning drink
when you are indulging yourself?
My Saturday morning drink when I'm indulging myself?
These are good lightning round questions.
Let's see.
Well, you know, I've had to cut down sugar, so I can't say I have that many interesting
things.
However, I can tell you about beers I like, which is I like to go for those Old Rasputins.
Oh, yeah.
A really dark beer.
Yeah, I used to love those.
They kind of do the job on getting you drunk in one,
or giving you tipsy in one go, and also filling you
at the same time.
Since you like machine vision and cameras and things,
are you aware of the burgeoning smart telescope market?
No, I haven't.
But we had someone use an OpenMV cam to actually do astro...
They made the camera follow the stars so they could do a long exposure.
There's a lot of really cool new low-cost products that have nice optics and good cameras
and little computers.
And I think they're similar to OpenMV's compute power and stuff.
But they can look at the stars and plate solve and figure out where they're looking,
so then do all the go-to stuff plus imaging. So it's a nice
integrated package that might be a place to explore at some point.
I need a little air horn or sound maker so that I, because you're breaking lightning round rules.
A little bell.
A little bell.
What do you wish you could tell the CMU Cam 4 team?
The CMU Cam 4 team, well, that was me back in college.
I would say I never expected I would end up
doing this for so long.
It's been a journey now at this point.
Do you have any advice you would give your past self? Like, really don't put in those
fix me laters because you're going to actually have to fix them later?
The only thing I would say is focus on the most important features and not do everything and anything.
Everything you build, you have to maintain and continue to fix and support.
And so a smaller surface area is better.
Build nothing.
If you could teach a college course, what would you want to teach?
I really think folks should learn assembly and do some kind of real application in it. And I say that because if you remove all the abstractions and really get into how the processor
works, you learn a lot about how to write performant code, where to understand when
the compiler is just not putting in the effort as it should be, and how to think about things.
It's important when a lot of our problems in life come down to,
well, as engineers, we optimize things, we fix problems.
And not doing that optimization in the beginning, or at least thinking about it, can end up
costing you in various ways later on having to redo things.
See also last week's episode, if you're interested in learning about the basic building blocks
of computing.
Okay, enough lightning round.
Yeah.
So you talked a little bit about what OpenMV is, and what I heard was small camera systems
that are intelligent.
Is that right?
Yeah, yeah.
We build a camera that's about one inch by 1.5 inches, well, 1.75 inches. So it's pretty tiny,
a little bit smaller than a Raspberry Pi, like half of the size or so. And what we do is,
as I said, we find the highest end microcontrollers available in the market and we make them run
computer vision algorithms. And we've been a little bit of a thought leader here.
When we got started, no one else was doing this.
There was just an OpenCV thread. Well, what is it? Not Reddit. Stack Overflow.
There was a Stack Overflow thread where someone was asking, can I run OpenCV on a microcontroller?
And the answer was no. And so we set out to change that.
That no longer is the highest ranking thing on Google search when you search for
computer vision on a microcontroller.
But let me tell you, when we started, it stayed up there for quite a few years.
And you actually you just said you have 1.7 by 1.6?
Yeah, about 1.7 inches by like 1.3 or so, I think.
That's our standard cam size.
So it's pretty tiny.
And this is a board.
This isn't a full camera.
This is something you put into another system.
Well, it's a microcontroller with a camera attached to it.
So we basically have a camera module
and a microcontroller board and then I/O pins.
And so you can basically sense, you can get camera data in,
detect what you wanna do, and then
toggle I/O pins based on that.
So kind of everything you need in one system.
It was meant to be programmable, so you wouldn't have to attach a separate processor to figure
out what the data was.
This is important because a lot of folks struggle with inter-processor communication,
and so not having to do that and just being able to have one system that can run your program
makes it a lot easier to build stuff.
And when you say programmable, I mean, MicroPython.
Not just we have to learn all about the camera and about the system and start over from scratch,
but just MicroPython.
Yes.
So we have MicroPython running on board.
MicroPython's been around for 10 plus years now.
And it basically gives you an abstraction
where you can just write Python code
and be able to process images based on that.
And so all the actual image processing algorithms
and everything else, those are in C, and they're optimized.
And then Python is just an API layer
to invoke the function to call a computer vision algorithm
and then return results.
A good example would be, let's say
you want to detect QR codes.
There's a massive C library that does the QR code detection.
So you just give it an image object, which is also produced by a massive amount of C
code that's doing DMA accelerated image capture.
And then that returns a list of Python objects, which are just the QR codes and the value
of each QR code in the image.
So it makes it quite simple to do things.
But from my perspective as a user,
it's just take picture, get QR code information.
Yes.
Yes.
And so we make it that simple.
Isn't that so nice?
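Roughly, the on-camera script for that flow looks like this; a minimal sketch in OpenMV-style MicroPython, where the exact module and method names can vary by firmware version:

```python
# Minimal sketch of the "take picture, get QR codes" flow in OpenMV-style
# MicroPython; exact names may differ by firmware version.
import sensor

sensor.reset()                       # initialize the camera
sensor.set_pixformat(sensor.RGB565)  # color pixels
sensor.set_framesize(sensor.QVGA)    # 320x240 keeps QR decoding quick
sensor.skip_frames(time=2000)        # let auto-exposure settle

while True:
    img = sensor.snapshot()          # DMA-accelerated capture, handled in C
    for code in img.find_qrcodes():  # the heavy lifting is the C library
        print(code.payload())        # decoded string for each QR code found
```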
But QR codes, OK, so we know how to interpret those.
But I can do my own AI on this.
I can make an orange detector.
Or as I call it, machine learning.
Yes.
Or as I call it, regression.
Linear regression at scale.
Sorry. So what's new with OpenMVCam is that we've actually launched two new systems,
one called the OpenMVCam N6 and one called the OpenMV AE3.
And so these two systems are super unique and super powerful.
Microcontrollers now come with neural network processing units on board.
And these are huge because they offer an incredible amount of performance improvement.
Case in point, it used to be on our OpenMV Cam H7 Plus, which was one of our highest end
models that we currently sell. If you wanted to run
like a YOLO object detector that detects people in a frame or oranges or whatever object you
train it for, that used to run at about 0.6 frames a second. And the system would draw
about 180 milliamps or so at 5 volts running. With the new systems, we've got that down to 60
milliamps while running, so 3x less power, and the performance goes from 0.6 frames a
second to 30. And so if you do the math there, it's about a 200x performance increase.
So crazy. I was like Moore's Law has gone out of control.
Well, that's what happens when you put dedicated hardware on something.
Yes.
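As a rough check on those figures, here is the arithmetic with the rounded numbers quoted above (0.6 frames a second at about 180 milliamps versus 30 frames a second at about 60 milliamps, both at 5 volts):

```python
# Back-of-the-envelope with the rounded figures quoted above.
old_fps, old_ma = 0.6, 180.0   # H7 Plus running a YOLO-style detector
new_fps, new_ma = 30.0, 60.0   # the new NPU-equipped boards
volts = 5.0

print(new_fps / old_fps)                               # 50x the raw frame rate
print(old_ma / new_ma)                                 # 3x less current
old_fps_per_watt = old_fps / (old_ma / 1000 * volts)   # ~0.67 frames per joule
new_fps_per_watt = new_fps / (new_ma / 1000 * volts)   # ~100 frames per joule
print(new_fps_per_watt / old_fps_per_watt)             # ~150x more work per unit energy
```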
Well, and the AE3 is the small one, the cheaper one.
It's like 80 bucks.
Yes.
And I made a point earlier about saying the size.
One inch by one inch.
But yes, this one is tiny.
The size of a quarter.
We got everything down to fit into a quarter.
Ha.
I mean, your camera lens is the biggest part on that, isn't it?
And we wanted to make that smaller too.
It ended up being that big though because the, so for the audience, the OpenMV AE-3,
we got two models.
The N6 is kind of a successor to our standard OpenMV cam line.
So it's got tons of I.O. pins.
It's got a removable camera module that you can put in
multiple cameras and other various things, and it's got all this hardware on board, full featured.
And then we built a smaller one though called the OpenMV AE3, which is honestly, the way it came
about is an interesting story, and so I do want to go into that in a little bit. But it turned out to be
a really, really good idea, making this such a small, tiny camera. It's so tiny you can put it
in a light switch. I mean, at one inch by one inch you can put it almost anywhere. And it features
a neural network processor, a camera, a time of flight sensor, a microphone, an accelerometer,
and a gyroscope. So five sensors all in one, and a one inch by one inch form factor with a very powerful
processor on board.
I'm just boggled.
I mean, it's got everything, and it's so small.
And I just...
You work with microcontrollers?
I know, and I work with small stuff.
This also has the camera.
It's always been for me, once you have a camera, everything gets big again because it's just...
Do you have your phone?
I do want to ask about the cameras, just briefly, so we can get that out of the way because
I'm camera focused these days.
So what kind of...
What are these?
One megapixel.
One megapixel, and
what kind of field of view do those two options have? Both of them are just
to try to aim for around 60 degrees field of view. It's pretty standard.
Both of them though actually have removable lenses, which is really
really nice. So for the OpenMVcam N6, that's a standard M12 camera module.
Got it.
And so you can put any lens you want on it.
And we have a partnership with a company called PixArt Imaging.
So PixArt, we actually met them at something called TinyML,
which got rebranded to the Edge AI.
No.
Edge Impulse.
No, it's not Edge Impulse.
Okay, there's too many Edge things.
Anyway, it used to be called TinyML.
It's now the Edge AI Foundation, I think.
And-
Wait, those aren't related?
They're the same organization.
It's strange names.
No, no, the Edge Impulse and Edge AI aren't related.
One's a company, one's a conference. Well, yes, but-
Oh, yeah.
One's a company and one's a consortium.
Oh, I don't know.
But are they...
I mean, Edge Impulse is a member company of the Edge AI Foundation.
It's too confusing.
And then you have the Edge AI and Vision Alliance, which is a different thing.
No, no.
We're not doing it.
And then you have the Edge AI hardware.
What is it?
Edge AI hardware conference, which is another thing.
So there's too many edges here.
Makes it a little challenging.
Anyway, but yeah, what we were talking about, cameras.
So you have, we have two different types, M12, which is a standard camera module, and we have a
partnership with PixArt Imaging.
And so they actually hooked us up with 1 megapixel color global shutter cameras.
And so we're making that standard on both systems.
So the N6 has a one megapixel color global shutter camera
and this can run at 120 frames a second at full resolution.
So that's 1280 by 800.
And then the OpenMV AE3 has the same camera sensor
but we shrunk the lens down to an M8 lens
which is also removable.
So you can put like a wide angle lens
or a zoom lens on there if you want.
And that'll be able to run at a similar frame rate.
It has less resources than the N6,
so it can't actually achieve the same speed
and performance of the N6,
but we're still able to process that camera at full res.
So it could do like maybe 30 frames a second
at 1280 by 800.
But for most of our customers,
we expect people to be at the VGA resolution,
which is 640 by 400 or so on this camera.
And that'll give you about 120 frames a second.
OK, I want to switch topics because I have never used an NPU
and I don't know what it is.
And I mean, it's a neural processing unit.
I got that. So it has some magical AI stuff that honestly
seems like a fake.
It's a whole bunch of...
How is it different from a GPU?
It's all the parts of GPU without the graphics necessary.
So it's all the linear algebra stuff.
Yeah.
So it's just an adder multiplier?
Pretty much.
How is it different from an ALU?
It's got a billion of them.
It's a vectorized ALU?
Yeah.
Pretty much. This is actually what I wanted to talk about more. Not to try and do the sales pitch on this. I think folks will figure out things themselves, but I wanted to talk to some embedded
engineers here about cool trends in the industry. So, NPUs, what are they? Yeah, basically there was an unlock for a lot of companies,
I think, that they realized, hey, we've
got all these AI models people want to run now.
And this is a way to actually use sensor data.
So you've had this explosion of the IoT hardware revolution
happened.
People were putting internet on microcontrollers
and connecting them to the cloud,
and you'd stream data to the cloud.
But the challenge there is that that's a lot of data
being streamed to the cloud,
and then you have to do something with it.
And so you saw folks just making giant buckets of data
that they never used for anything.
You might add like an application to visualize
the data, but you technically never actually put it to use. You just have buckets and buckets of
recordings and timestamps. And that's all very expensive to maintain, to have. And while it's
nice per se, if it's not actionable, what good is that? And a lot of times, it's not quite clear
how do you make an algorithm that actually,
how do you use accelerometer data and gyroscope data
directly?
If you've seen the kind of filters and things
you need to do with that classically,
they're pretty complicated.
And it's like, how do you make a step detector or a wrist
shaking?
There's not necessarily a closed form mathematical way to do that.
You process this data using a model where you capture what happens,
how you move your hand, etc.
Then you regress and train a neural network to do these things.
That unlock has allowed us to make sensors
that had very interesting data outputs
and turn those into things that could really detect
real world situations.
And of course, this becomes more and more complicated,
the larger amount of data you have.
So with a 1D sensor, it's not too bad.
You can still run an algorithm on a CPU, but once you go
to 2D, then it starts to become mathematically challenging. That's the best way to say it.
And your 1D and 2D here, for an accelerometer are those the X and Y channels, or do you have a
different dimension?
It would just be like an accelerometer is just a linear time series, right?
So to build a neural network model for that, you only need to process so many samples per
second versus if an image, you have to process the entire image every frame.
Sorry, when you went from 1D to 2D and you were still talking about accelerometers, I
was like, is that XYZ or something else?
Well, you also have 3D accelerometers, so it's really six channels if you think about
it, right?
And usually there's gyros and sometimes there's magnetometers.
So yes, you throw all the sensors on there, but those are still kind of 1D signals as
opposed to the camera, which is a 2D signal because...
2D and large dimension too.
Right, right.
Yeah, yeah.
Like an accelerometer, right?
You might have the window of samples you're processing.
Maybe that's, I don't know, like several thousand at once per ML model
and a thousand different data points each time
you run the model.
That sounds like a lot, but not really compared
to several hundred thousand that would be for images.
More millions, yes.
Yeah.
There's a reason there's an M in megapixel.
Yes, yeah.
There's a lot of pixels.
Yes, absolutely.
So there's a lot more.
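To put concrete numbers on that comparison; the accelerometer window size here is just an illustrative guess, while the image sizes are the ones discussed earlier:

```python
# Values fed into one inference: a 1D sensor window versus a camera frame.
accel_window = 2000 * 6          # ~2,000 samples x 6 IMU channels (illustrative)
vga_rgb_frame = 640 * 480 * 3    # VGA color image
one_mp_frame = 1280 * 800 * 3    # the 1-megapixel sensor discussed earlier

print(accel_window)              # 12,000 values
print(vga_rgb_frame)             # 921,600 values
print(one_mp_frame)              # 3,072,000 values
```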
So anyway, enter the NPU.
Basically processor vendors have been doing this for a while, like your MacBook has it,
where they've been putting neural network processors on systems.
And what these are, are basically giant multiply and accumulate arrays.
So if we look at something like the STM32N6,
it'll have 288 of these in parallel, running at a gigahertz.
And so that's 288 times 1 gigahertz
for how many multiply and accumulates it can do per second.
OK, my brain just broke.
Let's break that down a little bit.
OK, 288 parallel add multiply units.
Yes.
And so any single step is going to be 288, but I can do a whole heck of a lot in one
second.
Yes. So that's 288 billion multiply and accumulates per second. Then it also features a few other things, like there's an operation called a rectified linear unit.
That's also counted as an op,
it's basically a max operation.
That's done in hardware.
Then you go from 288 to 500 effectively,
and there's a few other things they can do in hardware for you also.
All that combined, it's equivalent to
about 600,
you know, 600 billion operations per second
for basically running any ML model you want.
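For a sense of where those figures come from, here is the arithmetic, assuming the common convention that one multiply-accumulate counts as two operations:

```python
# Rough throughput arithmetic for the quoted STM32N6 NPU figures.
mac_units = 288                    # parallel multiply-accumulate units
clock_hz = 1_000_000_000           # 1 GHz

macs_per_second = mac_units * clock_hz   # 288 billion MACs per second
ops_per_second = macs_per_second * 2     # each MAC = one multiply + one add
print(macs_per_second / 1e9)             # 288.0 GMAC/s
print(ops_per_second / 1e9)              # 576.0 GOPS, before counting the hardware
                                         # ReLU/max ops that push the headline
                                         # number toward the ~600 quoted above
```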
But there's a problem here.
I don't know that I believe in AI or ML.
It seems like, I mean, it's-
Well, let's not confuse two things.
The philosophical and the technical?
No, I mean, I was going to have a disclaimer about we're not talking about LLMs.
Ah, okay.
Which is what I, I traditionally am up in arms about.
This is a copyright issue.
This is image classification.
No, not anyway.
Everybody knows my opinion on AI qua LLMs.
And if you don't, it'll be just cussing me next week
so we can find out.
I work on machine, well, one of my clients
works on machine learning stuff.
I've worked on machine learning stuff for years.
It's very useful for these kinds of tasks
of classification and-
And self-driving.
Detection, it's not useful for self-driving
because that doesn't seem to work very well.
It worked fine when I did it.
Yes, your truck on a dirt road following a different truck at 20 miles an hour.
It worked.
Yes.
Anyway, what was your point?
Well, her point.
Your point is probably better than both of our points.
My point was we are seeing a lot of funding, a lot of things
that go into processors that are called neural processing units,
like they're supposed to be used for neural networks.
And yet we're also seeing some difficulties with the whole ML and AI in practice.
I don't think those difficulties are actually related to what you're working on,
but do you see them reflected in either your customers or your funders or people just talking to you and saying,
why are you doing this? Because it's not really working out as well
as people say it is.
Well, I think there's some difference there.
One, we were using the branding AI
because that's what everyone uses nowadays.
You have to.
Just to be clear, I would prefer to call it ML,
but that's old school now.
So everyone's using AI.
So we had to change the terminology just to make sure we're keeping up
We're just doing CNN accelerators. And so these are, probably pre-ChatGPT really,
convolutional neural networks. Yeah, yeah. And so what they're doing is,
so most of the object detector models, for example, let's say you want to do something
like human body pose, facial landmarks, figure out the digits on where your hand is, like
your hand detection, figuring out how your fingers are, your finger joints, things like
that.
These are all built off these convolutional neural network architectures that basically imagine small image patches
that are being convolved with the image.
So like imagine a three by three activation pattern
and that gets slid over the image and produces a new image.
And then you're doing that in parallel,
like it's way too hard to describe how convolutions work, too.
Let me try.
At a high level.
You have a little image that you remember from some other time,
and maybe it's a dog's face,
and you slide it over the image you have here in front of you,
and you say, does this match, does this match, does this match?
Or how well does it match?
How well does it match?
And then at some point, if you hit a dog's face,
it matches really well.
And now you can say, oh, this is dog.
Now you do that with eight billion other things
you have remembered through the neural network.
And you can say, well, this is a face,
or this is where I have best highlighted a cat face,
and this is where it best highlighted
a dog face and there's a 30% chance it's one or the other.
That sort of, the convolving is about saying, so I have this thing and I have what's in
front of me and I want to see if this thing that I remember matches what's in front of
me.
And there are lots of ways to do it.
You can have different sizes of your remembered thing because your dog face might be big or
smaller in your picture.
And if you look inside the networks after they've learned and then to kind of interrogate
the layers, you can see what it's learning.
Like it'll learn to make edge detectors and lots of even more fine features than just
a face. It might just be, okay, there's a corner,
and a corner might mean a nose, but it might mean this, and it combines all of that.
So it gets very sophisticated in the kinds of things it looks for.
You can look inside a convolutional neural network that's been trained and kind of get
a sense for what it's doing.
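A toy version of that slide-and-score idea, in plain NumPy, purely as an illustration; real convolutional layers learn many such templates and stack them, and none of this is OpenMV code:

```python
import numpy as np

def match_scores(image, template):
    """Slide a small template over a grayscale image and score every position.

    This is the 'does this patch look like the thing I remember?' step from the
    conversation. A trained convolutional layer does the same sliding, but with
    many learned templates (filters) at once, stacked in layers.
    """
    ih, iw = image.shape
    th, tw = template.shape
    scores = np.zeros((ih - th + 1, iw - tw + 1))
    for y in range(scores.shape[0]):
        for x in range(scores.shape[1]):
            patch = image[y:y + th, x:x + tw]
            scores[y, x] = np.sum(patch * template)  # correlation score
    return scores

# The best-matching location is simply wherever the score peaks.
image = np.random.rand(32, 32)
dog_face = np.random.rand(3, 3)
scores = match_scores(image, dog_face)
best_y, best_x = np.unravel_index(np.argmax(scores), scores.shape)
```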
Yeah.
And it's, I mean, this is the way things are going nowadays for how people do these higher level algorithms.
And honestly, you really couldn't solve it before
without using these kinds of techniques.
Just because most of the world
is with these weird amorphous problems
where there's no closed mathematical form
to describe like what is a face, right?
You actually have to build these neural networks
that are able to solve these problems.
And it's funny to say it,
because this started blowing up 10 plus years ago now.
And so it's like, it's actually been here for a long time.
It's not necessarily even new anymore.
Definitely not.
And so when I fuss about is AI still a thing,
or is it going to be a thing,
it's not this class
of problem.
This class of problem, the machine learning parts are really well studied and very effective.
And with these, and this is the last thing I'll say on this, you do get with the output
a confidence level.
Like it says, this is a bird, and it says, I'm pretty sure, 75% or 85%.
You can use those in your post-processing to say, well, what action should I take based
on this confidence, as opposed to certain other kinds of AI things that do not do that.
Yeah.
So the ChatGPT-like stuff, that's a whole different ball game.
Let's come back next year maybe and talk about that.
Yeah.
You mentioned YOLO, which is an algorithm.
Could you tell us basically the 30-second version of what YOLO is?
Yeah, yeah.
So YOLO is you only look once.
So when you're trying to do object detection, right, the previous way you did this was that
you would slide that picture of a dog over the entire image, checking every single patch at the same time,
well, you know, patch one after another. And you can imagine that's really computationally expensive and doesn't work that well.
It takes, like, literally the algorithm would run for seconds to determine
if something was there.
So if you only look once, it's able to do a single snapshot.
It runs the model on the image once,
and it outputs all the bounding boxes that
surround all the objects that it was trained to find.
That's why it's called you only look once.
Because instead of it,
before these, there's another one
called single-shot detector, SSD.
Before these were developed, yes, the way that you would find
multiple images or multiple things in an image would be that you would slide your classifier,
basically a neural network that could detect if, you know,
a certain patch of the image was one thing or other.
And you would just slide that over the image at every possible scale and rotation,
checking every single
position.
And that would be, hey, the algorithm could run on your server class machine and it would
still take 10 seconds or so to return a result of these are all the detections.
And so you can imagine on a microcontroller that would be, you'd run the algorithm, come
back a couple days later and you get the results.
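The contrast being drawn, in rough pseudocode; the crop, classify, and predict calls are hypothetical placeholders just to show the shape of the two approaches:

```python
# Old approach: run a classifier on every window, at every scale and position.
def sliding_window_detect(image, classify, window_sizes, stride=8):
    detections = []
    for size in window_sizes:                          # every scale
        for y in range(0, image.height - size, stride):
            for x in range(0, image.width - size, stride):
                patch = image.crop(x, y, size, size)   # hypothetical crop call
                label, score = classify(patch)         # one inference per patch
                if score > 0.5:
                    detections.append((x, y, size, label, score))
    return detections                                  # thousands of inferences per frame

# YOLO/SSD approach: one forward pass over the whole image.
def single_pass_detect(image, model):
    # One inference returns every bounding box, label, and confidence at once.
    return model.predict(image)
```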
But you have YOLO running pretty fast on the new Kickstarter processors.
Did you code that yourself?
No, no.
It's thanks to these neural network processors.
And so the way they work, actually, the easiest one to describe is the one on the AE3, which
is the ARM Ethos NPU. That one uses a library called
TensorFlow Lite for microcontrollers, and so it's an open source library that's available.
They have a plugin for different accelerators. So basically, if you don't have an Ethos NPU,
you can do it on the CPU, just a huge difference in performance.
And if you have the Ethos NPU available, then the library will offload computation to it.
And so you just give it a TensorFlow Lite file, which basically represents the network
that was trained.
And as long as that has been quantized to 8-bit integers for all the model weights,
it just runs.
And the NPU is quite cool in that you can actually
execute the model in place.
So you can place the model in your flash, for example.
And the NPU, you just give it a memory area
to work with called the tensor arena for its partial outputs
for when it's working on things.
And it'll run your model off flash,
execute it in place,
and produce the result and then spit out the output.
And it goes super fast.
We were blown away by the speed difference.
For small models, for example,
it's so crazy fast that
it basically gets it done instantly.
An example would be there's a model called FOMO, Faster
Objects, More Objects, developed by Edge Impulse.
Yeah, the name is on the nose, right?
And then that model went from running at about 12 to 20 frames a second on our current OpenMV Cams
to 1,200.
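On the camera side, running one of these quantized .tflite files is meant to look roughly like this. A minimal sketch in OpenMV-style MicroPython; the model-loading module and method names are illustrative, since they have changed across firmware versions:

```python
import sensor, ml   # 'ml' here stands in for OpenMV's model-loading module

sensor.reset()
sensor.set_pixformat(sensor.RGB565)
sensor.set_framesize(sensor.QVGA)

# The int8 weights stay in flash and are executed in place by the NPU;
# only the tensor arena (scratch space for activations) lives in RAM.
model = ml.Model("fomo_detector.tflite")

while True:
    img = sensor.snapshot()
    detections = model.predict([img])   # offloaded to the Ethos NPU when present
    print(detections)
```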
I was going to ask how you, models can be fairly sizable and with microcontrollers
we're usually not talking about many megabytes of RAM, so I was, yeah, you answered the question.
So it gets stored in flash and executed directly out of flash.
And 8-bit, and that's an optimization that's happening all over, is that it turns out you don't need your weights to be super bit heavy.
You can have lower resolution weights and still get most of the goodness of your models.
Yeah, yeah, you can. And this also does amazing things for memory bus bandwidth,
because if you imagine if you're moving floating point numbers, like a double or a float,
that's four to eight times more data you need to process things.
And so if 8-bit, yeah, it's just a lot snappier trying to get memory from one place to another.
Quantizing to 16-bit is not too difficult. Quantizing to 8-bit, which I've tried a few times
for various things, there's some steps required there that are a little above and beyond to saying,
here's my model, please change it to 8-bit for me, right?
You have to-
Yeah.
Yeah.
Typically what you need is to actually do something called,
you wanna do quantization aware training,
where when the model is created,
whatever tool chain you're using to do that or tool set,
those actually need to know that you're quantizing, that you're going to be doing that.
Otherwise, when it tries to do it, it'll just result in the network being broken, basically.
You can't just quantize everything without any idea of what data is flowing through it.
Otherwise it'll not work out so well.
And when we say 8 bits, we don't mean 0 to 255.
I mean, we do in some cases, but there are actually 8-bit floating point,
and that's part of this quantization issue.
I think these are your integers.
No, it's not floating point. It's just scaling and offset.
So each layer basically has a scale and offset that's applied globally to all values in that layer.
And there's some more complex things folks are trying to do,
like making it so that's even further refined,
where you have different parts of a layer being broken up
into separate quantization steps.
But so far right now for TensorFlow Lite
for microcontrollers, it's just each layer
has its own quantization.
It's not more fine-grained than that right now.
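Concretely, per-layer affine quantization keeps one scale and one zero point for the whole layer; a generic illustration of the math, not tied to any particular toolchain:

```python
import numpy as np

def quantize_layer(x, scale, zero_point):
    """Map a layer's float values to int8 using one shared scale and zero point."""
    q = np.round(x / scale) + zero_point
    return np.clip(q, -128, 127).astype(np.int8)

def dequantize_layer(q, scale, zero_point):
    """Recover approximate float values; error is at most about half a scale step."""
    return scale * (q.astype(np.int32) - zero_point)

x = np.array([-1.0, 0.0, 0.5, 2.0])
q = quantize_layer(x, scale=0.02, zero_point=0)
print(q)                                   # [-50   0  25 100]
print(dequantize_layer(q, 0.02, 0))        # [-1.   0.   0.5  2. ]
```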
OK, I wasn't aware it was per layer like that.
That's cool.
I didn't realize that the NPUs basically
take TensorFlow Lite.
That gives the power to a lot of people
who are focused more on TensorFlow
and creating models and training them.
So what did you have to do?
Well, let's say it's not that easy.
Really?
Working with TensorFlow is not that easy.
Yeah, it's not that easy.
Let me say it like that.
And for no good reason, I think.
But anyway.
Yeah, yeah, no.
Well, I mean, well, what happens is actually
different manufacturers have different ways of doing this.
So for ST, for example, they do not use TensorFlow Lite
for microcontrollers.
They have their own separate AI system called the ST Neural-ART
Accelerator.
And so their NPU is more powerful.
But totally different software
library and package. None of the code is applicable. You need to use their tool
chain to build models and their libraries to run.
Let's talk about this, ST. It's not a good idea.
Well, I mean, the reason they did that is because they wanted to have more control over it. It totally makes sense.
And it lets them optimize in ways their processor is optimized
for instead of with TensorFlow Lite,
where you have to optimize the things everybody's doing
instead of what you're specifically.
Never mind.
Yeah, and I think that's the reason why they went for it.
Another reason is, and this is a weird architecture divergence,
but with TensorFlow Lite, you
have basically a standard library runtime that includes a certain number of operations.
And so for us with the OpenMVCam, we have to enable all of them, even if you don't
use them.
And so ST was trying to be more conscious to their customers and say, okay, for customers
who have less RAM or flash available on their chips, we want to then instead compile the network
into a shared library file, basically,
that just has the absolute minimum amount of stuff needed.
And then that way it's executed on your system.
And if you don't have an NPU, it works on the processor.
And if you do, then it goes faster.
Yeah.
The only challenge of that is that means your entire firmware is linked against this model.
It's not like a replaceable piece of software anymore.
So it's optimum from a size standpoint, but it means that being able to swap out models
without having to reflash the entire firmware becomes a challenge.
And for us with MicroPython, one of our goals is so that you don't have to constantly update the firmware to change any little piece of the system.
And so it was a lot easier to get the integration done for the ARM Ethos NPU, because it was kind of built that way, where the library is fixed and the model is fungible.
Can you run multiple models?
Like if I, you said the OpenMV had a camera
as well as some other sensors,
can I have a model running the vision part
and one looking at the sensors for gestures and things?
Okay.
Yeah, yeah, that's part of the cool feature.
With OpenMV CAM AE3, for example, you can actually have,
we actually have two cores on it,
so I wanted to get into that in a little bit.
But you can basically, once you finish running the model,
you can have multiple models loaded into memory
and you just call inference and pass them the data buffer,
whatever you want.
Obviously only one of them can be running at a time,
but you can switch between one or another
and have all of them loaded into RAM.
And so if you wanted to have five or six models running and doing what you want, you can.
And again, the weights are stored in flash, just the activation buffers are in RAM.
So as long as the activation buffers aren't too big, there's really not necessarily any
limit to this.
It's just how much RAM is available on the system.
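A sketch of that multi-model pattern, with illustrative MicroPython-style names rather than the exact OpenMV API, and a placeholder for the IMU read:

```python
import sensor, ml   # 'ml' stands in for the on-camera model-loading module

sensor.reset()
sensor.set_pixformat(sensor.RGB565)
sensor.set_framesize(sensor.QVGA)

# Both models stay loaded; their weights execute in place from flash and each
# only needs RAM for its own activation (tensor arena) buffers.
person_model = ml.Model("person_detect.tflite")
gesture_model = ml.Model("gesture_from_imu.tflite")

def read_imu_window():
    # Placeholder for collecting a window of accelerometer/gyro samples
    # from the on-board IMU driver.
    return [0.0] * (100 * 6)

while True:
    img = sensor.snapshot()
    people = person_model.predict([img])                  # vision model on the frame
    gesture = gesture_model.predict([read_imu_window()])  # 1D model on IMU data
    # Only one model runs at a time, but switching is just another call.
```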
We're going to come back to some of these more technical details in a second,
but if I got the AE3 from your Kickstarter, which just launched and you will fulfill eventually,
but I got one today, what is the best way to start? I mean, do I go to tinyml.com, which now will redirect me somewhere else?
And how do I start?
Yeah, we've actually thought about that for you.
So there's two things.
One, we built into OpenMV IDE our software, a model zoo.
And so this basically means you're going to be able to have all the fun, easy to use models
like human body pose, face detection, facial landmarks, people detection, all of that.
There's well-trained models for that, and those are going to be things you can deploy immediately.
And so we'll have tutorials for that.
And then for training your own models, though, we're actually in partnership with Edge Impulse
and another company called RoboFlow,
which is a big leader in training AI models.
And so with both of their support,
they actually allow you to make customized models using data.
And one of the awesome things that RoboFlow does,
for example, and Edge Impulse,
is that they do automatic labeling for
you in the cloud using large language models. There's these ones called vision language models
that are kind of as smart as ChatGPT, but for vision. So you can just say, draw a bounding box
around all people in the image and it'll just do that. You don't need to do it yourself. And
it'll find like most of the people in an image
and draw bounding boxes around them.
Or you can say oranges or apples
or whatever you're looking for.
And then using that model,
that basically helps you create,
you just take the raw data you have,
ask the vision language model to mark it up
with whatever annotations are required
to produce a data set that can then be trained to build
one of these more purpose-made models that would run on the camera.
So it's kind of like extracting the knowledge of a smarter AI model and then putting it
into a smaller one that can run on board.
I'm familiar with Edge Impulse, but Roboflow is new to me.
Have they been around for very long?
Are they robotics focused or is it just now it's everything machine learning and vision
for them?
Well, Roboflow is just focused on machine vision.
They're actually quite big in the more desktop ML space.
And so like Nvidia Jetson folks and all the developers who are at a higher
level than microcontrollers, that's where they have been playing. But they are one of
the leaders in the industry for doing this and making it easy for folks to run models.
And we're working with them to kind of help bring these people into the market, to help
make it so that you can train a model easily. What they do is they'll provide you with a way to train a YOLO model, for example,
that can detect objects, and the object can be anything. And as I mentioned, they'll help
bootstrap that so you don't even have to draw bounding boxes yourself or label your data.
You just go out and collect pictures of whatever you want, put that into the system, ask the
vision language
model to label everything, and then you can train your own model quick and easy.
Is it really?
Well, the deployment might be a challenge.
We've got to work through those issues, but the hope is it will be by the time we ship.
Last time we talked, we hinted at the stuff with the Helium SIMD.
Actually, I should start that because I'm not gonna assume
other people have heard that episode.
What is the Helium SIMD and why is it important?
Especially since you have this NPU,
that's what I was gonna ask,
because you mentioned it, yeah.
Yeah, well, there's two big changes
that we're seeing actually on these new microcontrollers. I think the first thing to mention is, well, there's two big changes that we're seeing actually on these new microcontrollers.
I think the first thing to mention is yes, they all have neural network processing units
on board.
So these offer literally 100x performance speedups.
I mean, people should be aware of that.
I don't know where you get 100x performance speedups out of the box on things.
Two orders of magnitude is a pretty big deal.
But even more so, ARM also added the Cortex-M55 processor
for microcontrollers, which feature vector extensions.
So last year, we were just getting into this,
and we were thinking about what does it look like to program
with vector extensions.
And I hadn't done any programming yet.
I was just talking about the future, what it could be.
But now, with launching the OpenMVCAM AE3 and N6,
I spent a lot of time writing code in Helium.
In particular, the OpenMVCAM AE3 is actually somewhat
of a pure microcontroller.
It does not have any vision acceleration processing units.
There's nothing in there specifically
to make it easier to
receive, to process camera data. It has MIPI CSI, which allows you to receive
camera data and it also has a parallel camera bus, but there's no, normally processors nowadays
have something called an image signal processor that will do things called image debayering.
It'll do scaling on the image, color correction, a bunch of different math operations that
have to be done per pixel.
So it ends up being an incredible amount of stuff the CPU would have to do.
That doesn't actually exist on the OpenMV Cam AE3 in hardware.
The N6 from ST has that piece of logic.
And so it's able to have 100% hardware offload to bring an image in.
That's why it's able to achieve a higher performance from the camera because there's nothing,
you don't have to do any CPU work for that. But what we did for the OpenMV Cam AE3 is we actually
managed to process the image entirely on chip using the CPU. So the camera itself outputs this thing called a
Bayer image, which is basically each row is red, green, red, green, red, green, and
then the next row is green, blue, green, blue, green, blue, and
then it alternates back and forth. And so if you want to get
the color of any particular pixel location, you have to look at the pixels to
the left, right, up, down, and diagonal from it.
And then compute.
And that changes.
That pattern changes every other pixel.
Because depending on the location you're at,
you're looking at different color channels
to grab the value of the pixel.
And you have to compute that per pixel
to figure out what the RGB color is at every pixel location.
And so if you just think about what I just said in your head,
it's a lot of CPU just to even turn a Bayer image
into a regular RGB image that you can even use to process
and do anything with.
And basically, every digital camera sensor works this way.
Color resolution is far lower than the absolute pixel
resolution because of these filters.
Because a camera doesn't know about color, right?
It's just measuring light intensity.
And so to get color, you have to put filters in front of it
and then, yeah, do this kind of math.
Yeah.
And so what we had to do is we're debayering the entire image
on the OpenMV Cam AE3 in software using Helium.
And what we were able to achieve is, this is a 400 megahertz CPU.
And so we're able to do about 120 frames a second
at the VGA image resolution,
which was about 0.3 megapixels at, yeah, 120 frames a second.
So what is that?
What's the math on that?
0.3 megapixels times 120?
Yeah, about 36 million pixels a second with the processor.
What's crazy here though is that you try to do that on a normal, regular ARM processor.
Which I have.
Yeah, for M7, the previous generation.
You wouldn't get that.
Helium offers probably around 16 times the performance increase, realistically.
And that's huge.
I mean, again, it takes an algorithm that would be totally not workable.
Like you'd get maybe 20 frames a second, 30 at the best, and now we're at 120 frames a second, right?
I mean, that's crazy.
Sorry.
Yeah, it's good.
I'm used to thinking about, like, if you're going to do image stuff, if you're going to
get complicated or you aren't quite sure what you need to do, you probably need to go to
the NVIDIA Orin or the TX2 or whatever. Usually I would first go
to Nvidia's website and see whatever their processor is, and what dev kit I could get there,
which then involves Linux and all of that. When did the microcontrollers catch up? Did they catch
up? Or are they just one step behind and I'm three steps behind?
Well, they're still one step behind.
Like if you look at Nvidia or etc.
Those have a hundred TOPS.
And so that's a hundred tera-ops.
So a hundred trillion operations a second.
Right.
So microcontrollers are now just starting to hit up to one tera
op. So there's still a hundred X performance difference there. But what's important to
understand is with the current performance of these things, they're good enough to do
useful applications. And what's valuable is that they can run on batteries. That's the
big unlock here. So if we look at the open-
My Orin can run on batteries.
They just weigh 10 pounds.
And are carried on a six foot wingspan drone.
I don't see the problem.
Yeah, yeah.
So as long as you have a big vehicle,
it's no issue, right?
Right.
But that's the challenge here,
is that you need to have a big vehicle for that.
So what we're looking at is, OK, with the OpenMVCAM AE3,
for example, it's going to draw 60 milliamps.
And this is like, I can't go over this number.
It draws 60 milliamps of power at full power.
That's amazing.
At full power.
And in terms of operations per watt,
that's way beyond what Orin or anything does.
Well, think about it like this.
A Raspberry Pi 5 without the Hailo, without an external AI accelerator, that gives you
100 giga ops if you peg every core at 100%.
And this thing is able to give you double that with that much less power consumption.
These AI accelerators are incredible in the performance. I mean, like again, 100x performance increases, nothing to
laugh at. It's a pretty big deal. But we're looking at 60 milliamps power draw, full bore.
So 0.25 watts or so. And we got it down to about 2.5 milliwatts at deep sleep, but
there's some software optimization
we still need to do because we think we can get it below 1 milliwatt while it's sleeping.
Anyway, the reason to mention that though is that, okay, two AA batteries, that's one
day of battery life at 60 milliamps, two AA batteries.
So you can have the camera just running all the time, inferencing, like, you know, if you want to do one of those chest-mounted cameras like the Humane AI Pin,
for example, this little thing could do that and give you all-day battery life. And again,
two energizer alkaline AA batteries, nothing particularly special, cost a dollar each.
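A rough back-of-the-envelope on that claim, assuming a typical alkaline AA holds around 2.5 amp-hours (real capacity varies a lot with load):

```python
# Rough battery-life estimate from the figures above.
power_w = 0.25               # ~60 mA full-bore draw, quoted as about 0.25 W
cell_wh = 1.5 * 2.5          # assumed alkaline AA: 1.5 V x ~2.5 Ah = 3.75 Wh
pack_wh = 2 * cell_wh        # two AA cells

hours = pack_wh / power_w    # ~30 hours, ignoring regulator losses
print(hours, hours / 24)     # on the order of a day of continuous run time
```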
And so if you put a little bit more effort in, and you actually have like maybe a $30 battery or something,
you have more than a couple days of battery life.
And then if you think about, okay,
maybe I can put it into deep sleep mode,
where it's waiting on some event to happen,
like maybe every 10 minutes it turns on
and takes a picture and processes that,
now you have something that you can build an application out of.
Like, let's say you want to detect when people are throwing garbage in your recycling can.
You could put this camera in there and every 10 minutes or every hour.
Wait, wait, wait.
Why would I want an image of me making a mistake all the time?
Sorry.
I live in a condo, so we have a shared place.
It's a problem constantly with people complaining because in San Francisco, at least, Recology
does hand out fines for you violating that.
Yes.
I can make a little beepy sound or change the shoot or...
Like the time of flight sensor.
If I wanted to make a smart bird feeder, I could wait
until a bird was actually detected physically without using the camera first before starting
to take an image.
Yeah, you get an interrupt from your accelerometer that the thing has got a bird.
Bird detected.
Oh yeah, no, very easily.
And this processor can be in a low power state waiting on that. And the
second it happens, yeah, you wake up and you proceed to take an image and check to see
what, you know, let's say you want to know what birds are appearing in your bird feeder,
right? Well, when birds touch the bird feeder, they cause it to shake, right? So you've got
a nice acceleration event. And so you could have the accelerometer in a really low performance
state, just waiting for when it sees any motion detection. And when that happens, then the camera turns on,
takes a picture, runs inference, and then if it sees a bird there, it could then
connect to your Wi-Fi or we're also going to be offering a cellular shield
for the OpenMV Cam N6 and AE3, and it could, you know, connect to the cellular
system and send a message like a text message to you and then go back to sleep.
And maybe it could text message the entire image too, if you wanted.
So these things are going to be doable.
And the best feature here is, again, it could last on batteries
and then you could also use like a solar panel, for example, and have that attached.
And then, you know, then the battery life is really, at that point, infinite.
The bird feeders do exist.
Do they already have OpenMV cameras in them?
No, they're using something else right now.
But I don't imagine it actually does much processing on board though to, you know, determine
what bird type or etc.
Probably just get an image.
It's cloud stuff.
Which means you're sending your data to the cloud,
which, you know, is someone else's computer, in your yard.
And they might be spying on your bird.
Well, I know.
I know.
An example would be trail cams.
Yeah.
So trail cameras right now,
a big complaint of them currently
is that they take a lot of images of nothing
because anytime there's any motion or whatever,
they turn on snap an image.
And so you might have, you know, a...
Wind.
Yeah.
Wind is such a problem with such cameras.
Especially in your trees,
which they are generally for trails.
Yeah.
Yeah, trail cameras just, they're known right now
to take tons and tons of images of nothing.
And you have limited SD card space on these, right?
So if your trail camera is taking tons of images of nothing,
then when it actually comes time to take a picture of something useful,
it might have run out of disk space.
It might have used up all of its batteries, right?
Or just if you want to go and actually do something with that data,
now you have thousands of images you have to look through trying to find the one that actually has the picture of the animal
you're looking for. And so yeah, having this intelligence at the edge, there we go again,
using the word edge. Having this intelligence at these devices really allows you to make
them much more easy to use really if you think about it because now the system is actually doing the job you want
versus capturing a lot of unrelated things you don't care about. And there is a privacy argument there too, because the more we-
That's where I was headed. The more you push the intelligence to the edge, the less you have to move stuff to the cloud where it's vulnerable. And
for certain applications, you might have an entirely closed system that's totally inaccessible to anyone outside without physical access, which is not possible if you're shipping stuff up
to a cloud server to run on a GPU.
Well, and bandwidth isn't that expensive anymore, but it's still not like
I want to send videos all the time.
Getting a text message that said, I saw this gecko you've been waiting for,
would be way more useful.
Yeah, no, absolutely.
Because otherwise right now it would be,
here's a picture of a gecko you were waiting for,
it's a picture of like wind,
and repeatedly over and over again.
So yeah, no, it's gonna be fun
what you can do with these smart systems
and what they're gonna be able to do.
And being able to run in these low power situations is important.
So I mentioned earlier, I wanted to touch on how we got to the OpenMVCAM AE3, for example,
being so tiny.
Why did we create a one inch by one inch camera?
So tiny.
Yeah.
Well, you know, honestly, I didn't think this direction in the company would be something
we were going to support.
I kind of wanted to keep the camera at the normal OpenMV cam size.
But the actual reason we ran into this and we made the OpenMV Cam AE3 is because the Alif
chip was actually super hard to use at the beginning of last year.
We were...
I remember some complaints around here too.
In a mission failure kind of mode with it to be honest.
Yeah.
There were a lot of promises for the Alif.
Yeah, there were a lot of promises.
There were bugs in the chip.
I know that if you listen to our last episode, you'll have...
Oh wait, I'm free to talk about this now.
Yeah, yeah.
There were some issues.
In particular, USB was broken.
I/O pins did weird things.
You had to set the I/O pins for the I2C bus to push-pull for it to work, which if you know
about I2C, it should be open drain.
Stuff like that, repeated stops on the I2C bus didn't work. Did you encounter, I certainly didn't,
did you encounter any power issues,
brownouts, flash corruption kinds of things?
No, luckily we didn't encounter those,
but we had issues with the camera driver.
When you put it into continuous frame capture mode,
it just overwrites all memory with pictures of frames
because it never resets its pointers
when it's capturing images.
It just keeps going and incrementing forever.
OK.
Yes, Alif.
But you got it working.
We got it working, though.
And now it's the best thing ever.
It's crazy how your whole interpretation
of these microcontrollers changes
once you get past all of the, oh my god, we're about to, you know, this is the worst idea ever, bugs.
Because the way that the OpenMV-AE3 came about was these bugs were so bad,
we were running into so many issues. Because it's a brand new chip, by the way. This is a new processor.
Yeah, that's the issue.
Since, you know, when I started using it, it was beta silicon.
Yeah, yeah, beta silicon. They have finally got to production grade silicon
now, so a lot of these bugs you won't encounter anymore, they fixed them. But we
were kind of in the full bore of that. And what we did actually is said to
ourselves, okay, you know, we put so much time and effort into this chip and trying to make a product out of it.
We need to ship something. And, you know, we were just like, hey, what can we do to ship it? And it's
like, well, okay, if we remove all of the features that make the regular open MV cam fun, like the
removable camera modules, and we just make everything fixed, then there's a possibility,
a hope that we could actually build a product that makes sense.
And so we were just like, okay,
removable camera module gone,
let's just make the camera module fixed.
I/O pins gone, there's a lot of issues with the peripherals,
let's just get rid of that, make it so there's,
you know, minimal peripherals, minimal I/O pins exposed,
this way we don't have to solve two billion issues.
And so we just went down the line
basically just fixing everything instead of it being super, super, you know, having every single
feature possible exposed and usable. We just said we're cutting this, cutting that, cutting this,
cutting that.
Removing your flexibility in order to optimize.
Yeah, we kind of reduced the flexibility. Versus the N6 is super flexible,
can do all this stuff, the AE3,
we removed a lot of the flexibility,
but that ended up like actually creating
one of our best ideas ever.
It's, I share this with the audience just to say like,
hey, good things can come out of going on a bad journey,
basically.
And I'm also constantly in favor of constraints.
I think constraints can actually enhance creativity sometimes
and lead you places you wouldn't necessarily
have gone if you tried to just solve every problem
or be a general thing.
Yeah.
Yeah.
And for us, what we decided was, OK, well, we
don't know how to, like, so much stuff
is having trouble on this chip.
Let's just, you know, make it tiny, right?
Everyone likes tiny.
Use the smallest package they have.
Just reduce the features.
Not going to try to use every I/O pin.
OK, that actually though yielded a lower cost.
But then we started to do fun stuff and level up our abilities.
We're like, OK, well, I guess we're going to make it tiny.
We're going to use all 0201 components, we'll use all the tiniest chips. And, you know, over the course of like,
I think it took me about three weeks or so to design it originally, we managed to cram everything
for this camera into a one inch by one inch form factor. And it's been, I mean, just talking to
people and showing this off pre-launch, everybody's been blown
away.
They're like, this is a camera that's one inch by one inch.
You've got everything on there: processor, GPU, NPUs, RAM.
What makes the Alif chip so special is it has 13 megabytes of RAM on chip, meaning you
don't even need external RAM to use the system like all of these things.
And so yeah, that emerged through this weird process where we thought we were going one
direction and going to make a normal system and ended up somewhere entirely different.
And you've made a system.
Okay, so I have to admit the N6 is a super cool addition to your product line; it makes a lot of sense.
But the AE3 just gives me ideas.
Goosebumps, right?
It makes me think about things differently, like different directions.
And I mean, one by one is too big to swallow, but there are a lot of places you could fit such a self-contained system.
Yeah, yeah, that's why everyone has been... I mean, like, I'm glad we ended up making this tiny camera.
Because I wouldn't have gotten there. I was constrained by my own thought process on what our system should be given our previous form factor
But yeah, now it's like yeah
It is legitimately small enough to put it inside of a light switch, right?
Like anything you can think of one inch by one inch fits almost anywhere.
I mean, then your problems go back to how do you light it?
How do you light the image well enough
that you can use the machine learning?
But that's a separate issue that is everywhere.
Well, are they, do you have an IR filter?
It does for some of them.
FLIR boards that look really cool.
Right, but you can use an IR flood.
Oh, you mean just take out the little lens?
You can use an IR light.
Oh, oh.
It takes a lot of power.
People can't see, but.
On detection, you can, yeah.
Illuminate with IR.
Well, you could do that.
We potentially might make different variants of the AE-3. Potentially,
I'm not saying we're going to do that. But you could use different cameras. Like we're supporting
this new camera sensor called the Prophesee GenX320, which is an event camera. It only sees
motion. So pixels that don't move, it doesn't see. And this one can work in very dark environments.
It's an HDR sensor, so it can work in bright and dark
environments.
And that one, for example, is also very privacy preserving,
because literally it doesn't see anything but motion.
So pixels don't really have color.
So if you just wanted to track if someone's walking by
or something, that one could be used for that.
Also, because we're using, thanks
to our good relationship
with PixArt, we actually have data sheet access for these cameras and support.
How? How did you do that?
I know, right? It's great.
You can never talk to camera manufacturers.
Yeah, no, we actually have the field application engineers on the line. We ask for help and
they can respond.
They probably tell you how to initialize them.
Yes, we got all of that. It was amazing.
Damn it.
Christopher is jealous. So much time trying to get just cameras up and running. It's just so painful. For the audience, if you don't know about this, a big camera everyone uses is
the Omnivision stuff because they built so many of them. And Omnivision provides no help support
whatsoever to anyone using their products
who aren't cell phone vendors. So you have to reverse engineer everything from a data
sheet that has basically no descriptions.
Or you pick it up from the internet and you send it a random set of bytes that you don't
know what they mean, but you know if you change it, something might go horribly wrong. Or
it might get better. You don't know.
Things go horribly wrong, though. That's the thing about these cameras: with the bytes,
you try to figure out what's the minimum set
and then you realize that the default register settings
don't work.
Right.
Like it does not even produce images or function at all
with the default settings on power on.
You have to give it a special mixture of bytes.
Some of which are undocumented, they're just bytes.
Although they're all undocumented almost.
They have reserved registers you'd be writing to.
And it's like, what is this byte pattern to this reserved register?
Ah, yes. I did manage to get the data sheet for the Omnivision camera we were working with,
which helped some, but like 80% of the registers being written to weren't in that data sheet.
Yeah, yeah. No, it makes it challenging.
In particular, what's challenging about that
is you can't do stuff like actually being able to set
your image exposure correctly.
So with these two cameras for the N6 and AE3,
we can actually control the exposure, control the gain,
trigger the camera, all the features you'd want out
of a global shutter camera.
Oh, also change the frame rate.
Like everything you want to be able to control,
we actually have the ability to control precisely
and correctly now.
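For anyone who has not fought this fight: bringing up one of these sensors mostly means clocking a long table of register writes out over I2C. Below is a minimal MicroPython sketch of the idea; the pins, device address, and register values are hypothetical placeholders, not numbers from any real sensor datasheet.

```python
# Minimal sketch of camera-sensor bring-up over I2C in MicroPython.
# NOTE: pins, device address, and the register table are hypothetical
# placeholders, not values for any real sensor.
from machine import Pin, SoftI2C
import time

SENSOR_ADDR = 0x3C  # hypothetical 7-bit I2C address

# (register, value) pairs. On real parts this table runs to hundreds of
# entries, many of them "reserved" registers with undocumented magic values.
INIT_TABLE = (
    (0x12, 0x80),  # soft reset (hypothetical)
    (0x11, 0x01),  # clock prescaler (hypothetical)
    (0x0C, 0x3C),  # "reserved" register copied from a vendor blob
)

i2c = SoftI2C(scl=Pin(1), sda=Pin(2), freq=100000)

def write_reg(reg, val):
    # 8-bit register address, 8-bit value; some sensors use 16-bit addresses.
    i2c.writeto_mem(SENSOR_ADDR, reg, bytes([val]))

def read_reg(reg):
    return i2c.readfrom_mem(SENSOR_ADDR, reg, 1)[0]

for reg, val in INIT_TABLE:
    write_reg(reg, val)
    time.sleep_ms(1)  # some registers need settling time after a write

print("chip id:", hex(read_reg(0x0A)))  # hypothetical ID register
```

When the datasheet documents exposure and gain registers, the same kind of register writes are all it takes to control them; when it does not, that table of magic values is all you have.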
We have kept you a little bit past the time and yet I still have listener questions.
Do you have a few more minutes?
Yeah, let's go into them.
Matt, who I think might actually be able to help you if you answer this question correctly,
if you were to wave a magic wand and add three new and improved features to MicroPython,
what would they be?
Yeah, well, we actually have been waving that.
We're working with Damien George from MicroPython directly to help launch these products.
We've been big supporters of MicroPython from the beginning, and so each purchase of an
OpenMV cam actually supports the MicroPython project.
We actually want to fund this and make sure that these systems,
when we're able to sell products based on MicroPython,
MicroPython is also being financially supported.
And what we've worked to improve, for example,
is the Alif port; Damien has helped directly with that.
And so you'll find that support for the Alif chip
is actually going to be mainstreamed into MicroPython
with MIT licensing.
Our special add-ons for image stuff will be proprietary
to OpenMV, but we will be mainstreaming
the default Alif setup.
And so anyone who wants to use this Alif chip now
will have someone else who has already fixed all the bugs for them. You will not have
to wade through all of the crazy problems;
that will have been done for you and be generally available to everyone in MicroPython. Similarly,
the same thing for the N6, the general-purpose support for that. So, you know, we're
bringing these things to the community.
People are going to be able to use a lot of the features we're
putting efforts on and do things with them.
There's also a new feature to MicroPython
that we've supported called ROMFS.
And so this is very, very cool.
Remember how I mentioned those neural network processors
execute models in place?
So the way we actually make that easy to work with is that there's
something called a ROM file system on
MicroPython that is about to be merged.
So this allows you to basically,
you can use desktop software to concatenate
a whole bunch of files and folders like a zip file,
and you can then create a ROM file system binary from that.
And then that can be flashed to a location
on the microcontroller.
And once that's done, then it appears
as a slash ROM slash your file name or folder name directory.
And so you have a standard directory structure
that can be used to get the address of binary images
in Flash.
And so what this means is that we can take all of the assets that would be normally baked into the firmware
and then actually put them on a separate partition that can be updated.
And your program then just references them by path versus address.
So this actually allows you then to technically ship new ROM file system images
that could be new models, new firmware for like your Wi-Fi driver
or whatever and etc. It's a very powerful feature.
I mean, this is, I love this. You could also use it as assets. And so I immediately went
to display assets.
Whatever you want. And it's mapped into Flash, so it's memory mapped.
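To make that concrete, here is a rough sketch of what using a ROMFS-style asset partition could look like from MicroPython code. The '/rom' mount point and the file names are hypothetical examples; the point is just that assets are referenced by path instead of by a hard-coded flash address.

```python
# Rough sketch: reading assets from a read-only, memory-mapped partition.
# '/rom' and the file names below are hypothetical examples.
import os

print(os.listdir('/rom'))  # e.g. ['person_detect.tflite', 'logo.bmp']

# Load a neural-network model by path instead of by a baked-in flash address.
with open('/rom/person_detect.tflite', 'rb') as f:
    model_blob = f.read()  # or hand the path straight to the ML runtime

# Because the partition is memory-mapped flash, a runtime that supports
# execute-in-place can use the data where it sits rather than copying it to RAM.
```

Updating a model then becomes reflashing that one partition, with no change to the firmware that opens the path.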
Yeah.
That's nice.
Yeah. That's awesome. That's awesome.
We're finally entering the 90s.
I just.
Shh.
I mean, to everyone who's listening here,
so right now what you have to do is you have to bake things
directly into the firmware, but this means the address
of that stuff always changes constantly.
Or build your own weird abstraction thing that has,
yeah, that's mapping addresses.
It's perfectly fine. It was an image library.
No, but it's super helpful. And it's actually a problem because if you think about the fat
file system, the problem with fat file systems is they get fragmented, right? Files aren't stored
necessarily linearly. Every four kilobytes or so is chunked up.
And they can be located all over a disk, and so it's impossible to execute those in place.
So it's a magically good feature.
And then one other thing we're working on, which is a request for comment, I guess, right
now, but hopefully it'll get worked on, is the ability to have segregated heaps in MicroPython.
What I mean by this is right now you have one giant heap
and it has all the same block size.
And this is a problem because guess what?
On the OpenMV Cam AE3, we have a four-megabyte heap on chip
just for your application, four megabytes.
And what this means now is that you can just allocate
giant data structures and do whatever the heck you want.
But it also means that if you're storing the heap as 16-byte
blocks still, that you actually have a lot of small allocations.
And so it takes forever to do garbage collection then.
And so what we're trying to do is have it so
MicroPython can have different heaps and different memory
regions where you have small blocks in one heap,
larger blocks in another heap, and then the heap allocator will actually traverse through the
different heaps looking for blocks that make sense on a size to give you.
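The garbage-collection cost being described here is easy to measure for yourself. This is a small experiment using only standard MicroPython calls (gc and time), nothing OpenMV-specific, that times a full collection as the heap fills up with many small allocations:

```python
# Time a full garbage-collection pass as the heap fills with small objects.
# Standard MicroPython APIs only; scale the counts down on small boards.
import gc
import time

def timed_collect():
    t0 = time.ticks_us()
    gc.collect()
    return time.ticks_diff(time.ticks_us(), t0)

junk = []
for step in range(5):
    # Allocate a pile of small objects so the collector has more blocks to scan.
    junk.extend([bytearray(16) for _ in range(2000)])
    print("allocations:", len(junk),
          "free:", gc.mem_free(),
          "gc time (us):", timed_collect())
```

The collection time grows with the number of live blocks, which is exactly why splitting big and small allocations into separate heaps gets attractive once there are megabytes to manage.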
All right.
You talked there about features that are going into MicroPython and that last one was a new
feature.
We're trying to get that one.
That one has not actually been implemented yet.
That one's the one we want to see happen though, because it's important for dealing with megabytes of RAM now,
which is like, you know, we didn't think we'd ever get there, right?
Everyone would be stuck on kilobytes of RAM with MicroPython,
but nope, megabytes now.
Why would you want more than 42k of RAM?
It's very funny that we're talking about still microcontroller level
quantities of RAM and ROM.
I know.
We're still talking about megabytes.
With this huge thing on the side that does a ton of compute.
Some portions are moving up to desktop class and some are still stuck in the... I mean,
that's just the way it is.
If you want it to be cheap and small, some things are going to have to be cheap and small,
but it's nice that some things are advancing.
All right, so I think we got two more listener questions.
Couple more, yeah.
Simon wanted to know more about edge inferencing
and optimization and all of the stuff
that we did kind of talk about.
But we didn't mention one of the features of the N6
that you are very excited about.
Yeah, yeah.
So the STM32 N6 actually has an amazing new hardware feature,
which is that it's got H.264 hardware encoding support on board.
Oh, very good.
And so that means now-
What does that mean for those of us
without perfect memories?
Oh, yeah, it can record MP4 videos.
Oh, yes!
So this means you no longer need to be running a system that has Linux on board to have something
that can stream H.264 or MP4 videos.
So I mentioned the N6 has an NPU, so it's got the AI processor.
It's got something called the ISP, the image signal processor, so it can actually handle
camera images up to 5 megapixel with zero CPU load.
And then with the H.264 onboard,
it can then record high quality video streams.
And that can either go over ethernet
and it's got a one gigabit ethernet interface on board.
Again, a microcontroller, one gigabit ethernet.
It's so funny, we're putting these high powered things
all around the edge of this tiny little CPU.
Well, that's the thing, the CPU isn't tiny.
It's 800 megahertz with vector instructions.
It outperforms the A-class processors of yesteryear, right?
Yeah. Yeah. Yeah.
It's new tech. But then you also have Wi-Fi and Bluetooth, so you could stream H.264
over the internet, or you can send that to the SD card. And even the SD card interface has
UHS-1 speeds, which is 104 megabytes a second or so.
And so yeah, you can push that data anywhere you want and actually do high quality video
record.
Amazing.
Okay, Tom wanted to ask about OpenMV being the Arduino of machine vision, but then that
led me down a garden path where you actually worked with Arduino. Could you talk about working with Arduino before we ask Tom's question?
Yeah, working with Arduino has been excellent, really, really good. You know, thanks to their
support, we really were able to level up the company and get in touch with customers that
we never would have met. You know, obviously, as a small company, people don't necessarily trust you. And so having Arduino as a partner of ours has really helped us grow and meet different customers
who are doing some serious industrial applications that wouldn't have considered us otherwise.
So it's been a really good deal.
We're super happy for Arduino and working with them actually.
And we're super glad that they really supported us in this way and that they wanted to work with us.
And this was the Arduino Nicla Vision
and now Arduino is manufacturing it?
Yes, yes, the Arduino Nicla Vision.
That was where we've been working with our partnership
and also the Arduino Portenta, that's where we started.
And we also support the Arduino Giga.
So those three platforms run our firmware
and
we basically make those work and have been, you know, have really shown off the power of those
systems. So then I think you had something else you wanted to mention.
Well, I did, but then I wanted to ask about Arduino versus MicroPython because I, you know,
can't actually go in a linear manner.
Yeah. Well, this is one of those divisive things I wanted to talk about a little bit more.
It's not really Arduino. It's really more of a C versus MicroPython question.
Where do you think things are going?
Yeah. The best way to say that would be, again,
I mentioned that these microcontrollers have megabytes of RAM for heap now,
megabytes of RAM.
I mentioned that we can allocate giant data structures.
We have neural network processing units.
We actually have something called ulab,
which is a library that looks like NumPy, running on board
the OpenMV Cam too, which kind of lets you do up to 4-dimensional array processing.
And this is quite useful for doing all the mathematics for pre and
post processing neural networks and also doing,
like you want to do matrix multiplications,
matrix inversions, determinants,
all the things you
would need from linear algebra to do signal processing and sensor data processing,
all of that's on board. And so a question I pose to many people is where do you want to be doing
all of that high-level mathematical work? Do you still want to be writing C code to do that, or would you like to be in Python? So much Python. I want to be in Python.
I want to be in a Jupyter notebook on Python.
Well, that's the beautiful thing. With MicroPython and with NumPy kind of like support,
this means you can actually kind of take your Jupyter notebook code almost directly and run
it on board the system.
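For a flavor of what that looks like in practice, here is a small sketch using ulab's NumPy-compatible API for the kind of linear algebra described above: dot products, inversion, determinants. The data is made up, and exactly which functions are available depends on how ulab was built into the firmware.

```python
# Sketch of NumPy-style pre/post-processing on-device with ulab.
# The data is invented; function availability depends on the ulab build.
from ulab import numpy as np

# Pretend these are readings from a 3-axis sensor, four samples.
samples = np.array([[0.1, 0.2, 0.3],
                    [0.0, 0.1, 0.4],
                    [0.2, 0.2, 0.2],
                    [0.3, 0.0, 0.1]])

# Normalize the data (a typical neural-network pre-processing step).
normalized = (samples - np.mean(samples)) / np.std(samples)

# Classic linear algebra: a 3x3 matrix, its inverse, and its determinant.
a = np.dot(samples.transpose(), samples)
a_inv = np.linalg.inv(a)
print("determinant:", np.linalg.det(a))
print("round trip:", np.dot(a, a_inv))  # should be close to the identity
```

Swap the import for plain NumPy and nearly the same code runs in a desktop Jupyter notebook, which is what makes moving work between the two so direct.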
And so this is where I think things are going, which is it's just now that we're starting to get free of the limitations. Again, like the Alif Ensemble chip, that's a six-by-seven
millimeter package with 13 megabytes of RAM on board on the OpenMV Cam, and then the N6
that comes in different package sizes but you can get one down to a similar
size and footprint. It has less RAM on board so it might need some external
memory but still comes with four megabytes on chip and so when you look
at these things it's kind of like yeah we're in a new world where it kind of
starts to make sense to actually invest in standard libraries and
higher level programming.
The best way to say it would be, what's the default FFT library for C?
Is there one?
The one I wrote for whatever project I'm running.
No, at least use CMSIS.
Well, you've got CMSIS.
It may not be efficient, but it's okay.
The one out of numerical recipes in C from 1984.
That's the challenge is that, okay, well, let's say that one exists.
What about doing matrix inversion?
Yeah, yeah, yeah.
Which one are you using there?
And so these are all things where like having a standard package, but then also you can
accelerate this with helium.
So all of these standard libraries that are in MicroPython could be helium accelerated
under the hood.
Now you're talking about a system where developers can all work on the same library package and
improve it.
And everyone can utilize that and be efficient at coding and getting more done with less
mucking around versus rewriting all your C code with Helium, which is a recipe for lots
of bugs and a lot of challenges if you're doing it brand new from each system.
How often does somebody suggest you rewrite the whole thing in Rust?
Actually, not so much.
I hear about all these conversations theoretically.
I don't know though if I hear too many about them actually from a practitioner standpoint.
Okay, so Tom's question.
Tom assumes you are targeting some low-cost hardware.
Is there some higher-end hardware also?
He wants a global frame shutter that has been tested to just work in the spirit of Arduino.
He also wants a lens upgrade, a Nikon mount, and a microscope mount.
He's shopping on our podcast.
I know.
This is like, okay, so now I want to go to the lens aisle and then, but you have a lot
of these.
Yeah, I would say we're targeting global shutters by default now in our systems.
We just are going to have that.
And what that means for people is rolling shutter, it's the way their camera reads out
the frame.
Cheaper cameras, actually most cameras with electronic digital sensors, do rolling shutter,
where they'll read out rows of the image kind of slowly,
not all at once. So if there's motion or something while it's reading out the image, you can
get this weird artifact where like something moving.
My head's over here and my body's over there.
Not quite, but something moving might be diagonalized.
Yeah.
Right, because some part was here while it was reading out, and now it's over here, so it's skewed.
Global shutters read out the entire, well, they capture and read out.
They expose the image at once.
Yes, that's it.
It exposes the entire thing at once instead of by rows.
Yes.
I was getting tripped up between read-out and exposure.
Yeah.
But we're going to make that standard now, though.
We think it should be, because we try to do machine vision.
We're not trying to necessarily take pretty pictures, but we'll also have a regular HDR
rolling shutter for high megapixel counts.
And then we actually are working on plans for two megapixel versions of the camera soon.
Those will launch later, but we have a path forward for actually increasing the resolution
for the global shutters.
Regarding the Nikon mount and microscope mount, given the small size of our company, we're probably
going to leave that up to the community, but we think people are definitely going to be
able to build that. Once the system gets out there, we'll see people making
these things. That's just a 3D printer thing, right?
Yeah.
Okay.
Because we have 3D files now in CAD for everything we do.
Those are available.
And so you just take that and you take your microscope and you print up what you need
to translate them, says the person who did not even look at the new 3D printer that arrived at the house.
Anyway, and you already do lens shifts.
I saw a bunch of those.
Lens shifts.
Lens upgrades.
You have different lenses.
Yeah, we do lens upgrades.
All righty.
I mean, not for the AE-3 because that's the super small one and it's been optimized, but
for just about everything else you have.
Yeah, for the N6, tons of lenses.
We actually have a lot of features for the N6.
So we've got a thermal camera.
We're actually gonna have a dual color and thermal camera.
So you can do a FLIR Lepton
and a global shutter at the same time.
So you can do like thermal and regular vision
at the same time.
We also have got a FLIR Boson,
which is high resolution thermal.
Then we have the Prophesee GenX320,
which is an event camera.
Then we've got an HDR camera, as I mentioned,
five megapixels, which will be like your high res camera.
And then it comes with the default
one megapixel global shutter.
So you've got options.
And then all of those besides the GenX 320 have removable,
well, sorry, the regular cameras, the color ones,
just have removable camera lenses.
So you can change those out too.
All right.
All right, I have to go write down some of these ideas that I've gotten through this
podcast with what I want to do with small cameras and different cameras and microscopes
and moss and tiny snails. Do you have any thoughts you'd like to leave us with?
Yeah, no. Just thank everybody for listening. We're excited with what people are gonna be able to do
with these new systems.
And if you have a chance, check out our Kickstarter
and take a look and buy a camera if you were so inclined.
Our guest has been Kwabena Agyeman,
president and co-founder at OpenMV.
The Kickstarter for the AE3 and the N6 just went live, so it should be easy to find.
You can check the show notes to find the OpenMV website, which is openmv.io, so it shouldn't
be hard.
And there are plenty of cameras there that you don't have to wait for, along with all
of these other accessories we've talked about.
Thanks, Kwabena.
All right.
Thank you.
Thank you to Christopher for producing and co-hosting and not leaving in that section
that he really, I hope, cut.
Thank you to Patreon listeners' Slack group for their questions.
And of course, thank you for listening.
You can always contact us at show at embedded.fm or at the contact link on embedded FM.
And now a quote to leave you with from Rosa Parks. I had no idea
that history was being made. I was just tired of giving up.