Embedded - 161: Magenta Doesn't Exist
Episode Date: July 19, 2016

Kat Scott (@kscottz) gave us an introduction to computer vision. She co-authored the O'Reilly Python book Practical Computer Vision with SimpleCV: The Simple Way to Make Technology See. The book's website is SimpleCV.org. Kat also suggested looking at the samples in the OpenCV GitHub repo. To integrate computer vision into a robot or manufacturing system, Kat mentioned ROS (Robot Operating System, ROS.org). BuzzFeed had an article about Snapchat filters. Kat works at Planet. And they are still hiring.
Transcript
Welcome to Embedded.
I am Elecia White with Christopher White,
and I'm excited to have Kat Scott on to talk about computer vision.
Oh, already? There's usually some sort of lead-in.
Hi, Kat. Welcome to the show.
Hi. Kat just complimented us on our professionalism.
Now here we are. That lasted.
Kat, can you tell us about yourself? Oh, well, so I'm a senior software engineer at Planet Labs
right now. I do quite a bit of computer vision work. I primarily work in Python, occasionally
C, C++. Prior to Planet, I've started a couple different startups doing robotics and manufacturing
systems. And then I started my career actually being a defense contractor doing small business
R&D contracts in Michigan. And that was quite an interesting job. That's where I kind of learned
how to do robotics and computer vision and that sort of thing. You also co-authored a book.
I also co-authored a book. It's Practical Computer Vision with SimpleCV.
It's been out for, what, four or five years now?
So SimpleCV is basically a sort of extension wrapper
on the OpenCV Python bindings.
And what it allows you to do is to basically take something
that may be an abstruse, you know,
hundred lines of code to get working and do it in a single line of code or a couple lines
of code and makes things a lot faster to prototype, which is, I think, the great advantage of
doing computer vision in Python is the ability to prototype rapidly and sort of figure out
the right way to do something before you deploy it.
I find computer vision is usually more of a science
where you have to go and try a bunch of experiments
until you find something that works precisely.
And that means that your prototyping time is the most valuable time that you spend.
You're spending all your time doing research, developing, trying stuff.
And the faster you can do that, the faster you can get things done.
Cool. Okay.
Mostly I'm going to ask you about computer vision because it's an area that I always want to try more in and I'm hesitant.
But before we do that, we have the lightning round where we ask you questions and want short
answers. And if we are behaving ourselves, we won't ask you for additional information.
That never happens.
So first, favorite electrical component.
Favorite electrical component. Hmm. That's interesting. There's such a line of sensors,
right? There's so many sensors. You know, it's your basic CCD. I mean, that's how I get my bread
and butter. I mean, maybe it's a CMOS, but usually it's a CCD.
All right.
I mean, that's a very generic answer, I know.
No, that's less generic than some others.
Dinosaurs.
Do you like dinosaurs?
Pro-dinosaur.
Would you like dinosaurs to exist?
How do they not exist already?
In their old form.
Sure.
Which ones?
Ooh.
Let's stick to the herbivores and maybe some of the little flying ones.
The little flying ones.
That seems safe.
Yeah.
Like the California brown pelicans.
Yeah, well, they were like raptor, or they were like the ones that would fly that were about that size.
They had the sharp beaks, right?
And they'd dive down in the water and they'd stab fish.
It was pretty cool.
Or so we think.
Or so we think.
Science, technology, engineering, or math?
Engineering.
What is the most exciting sci-fi concept
that is likely to become reality in our lifetime?
Utopian or dystopian?
Depends on your mood. I guess dystopian isn't very exciting.
Oh, well, it's more terrifying.
Well, it's been a really rough week, so dystopian may not be the way to go this week. We should be positive about things. I think the utopian nature... I think a lot of the bioscience stuff is really changing, and I think probably within my lifetime we'll get rid of most of the cancers. I think that's pretty exciting.
That is exciting. What is most important to your job: a whiteboard, a soldering iron, or a keyboard and mouse?
Keyboard and mouse.
Mainly keyboard.
What is your least favorite planet?
Least favorite planet?
Venus.
What is your preferred programming language?
Python.
What language should be taught first in university courses?
Oh, Python.
Okay, so Python all the way.
Yeah.
What's your favorite text editor?
Say Python.
Emacs.
Oh, I was just trying to keep the Python going.
What's your favorite type of snake?
You get that?
That was the right thing.
Ooh, favorite type of snake.
The correct answer is Python.
That's not a correct answer.
I'll go with Python. I mean, Python. Should I be more specific on that?
Because snakes are cool.
Yes, you should be more specific. What is your favorite snake?
Now I have to be specific about my favorite type of snake, and I'm drawing a blank on snakes.
Toilet snake?
That one's gotten me out of trouble.
What did you want to be when you grew up? And don't answer snake.
I can't?
Okay.
Actually, when I was very little, I wanted to be a neuroscientist.
And that actually extended almost into college for a while.
I went through a year period in college where I thought that was what I was going to do.
And then I took chemistry.
Hacking, making, tinkering, or engineering?
Half engineering, half hacking. I'm done.
Oh, okay. Then what is a workday like for you? This one can be longer, of course.
Well, workday... it's not very standard. Planet is actually fantastic. On a normal day I get about seven hours of coding in, you know, coding and then discussing and thinking through things. So I don't do a lot of meetings or managing or anything like that. It's very much: come in, I have objectives. I really like having a day-to-day where, every day, the software that you write does something more than it did the day before, and you can have a discrete sort of accomplishment every day.
So lately I work a lot on image recognition. At Planet we have lots and lots and lots of stuff coming down. Pretty soon the entire Earth is being imaged every day, and we can't process all of it. Well, we can process all of it, but we shouldn't, because a lot of it is clouds. Phenomenally, like 40 or 50 percent of the world is covered in clouds every day, and you can't really see anything. That information's not really useful, and so we shouldn't process it. So I'm sort of working on how to weed out all of this scenery that is not particularly usable or sellable.
So this is Planet. Patrick is at Planet.
He was on the show recently, and we talked about how the satellites image the Earth every day.
Formerly Planet Labs, now Planet.
And of course they're still hiring.
Yeah, we're always hiring.
How long have you been there?
I've only been there about three or four months.
Yeah, because I thought you got the job after we scheduled this, or you told me about the job after we scheduled this.
Yeah, yeah, yeah.
Why would you...
It seems like taking the clouds out in post-processing is less efficient than just not taking pictures when there are clouds.
Yeah, generally. But you have a constrained environment on the satellite. And the way you sort of address these problems, right, is that you assume that every bit of data is probably sellable or usable, and you're always trying to aim for the most data to go through, even if it is not correct. So you want to process things even though they are clouds, because if you lean the other way, you're losing stuff that you can sell. And still, computation is pretty cheap. So since the satellites are constrained, we can't do quite a good enough job there.
I mean, there's stuff we're working on right now.
But we do it all on the ground.
And so it's sort of like there is a team that handles getting stuff off the satellites and making sure, you know, here's the data.
It comes down.
Here is another group that deals solely with image quality issues as they relate to color correction and how sharp the images are and that sort of thing. And then my group sort of is: how do we only process things that are useful, that we can actually line up to the Earth? And then we actually do that sort of rectification, and once it's rectified, then it is ready to either be used in a mosaic product or sold to the customers.
And the word rectification here is sort of stapling it to the ground.
Yeah. So I like to say, most rectification requires that you have a map already. We already know what the Earth looks like through various companies and governmental organizations. You have these base maps. And what we do is we say, oh, well, here's this little corner of a road over here, and here's this little manhole cover, something that you can see, and we try to line all those up with the thing that we just got. And you do that very, very precisely, to a sub-pixel level. You take these two maps and, if you think about two paper maps, you slide them over each other until they line up perfectly. That's basically what we do. And then we do some stuff to correct for, oh, there's mountains, and this thing isn't actually flat, it's actually a curvy and round and funky world. And so you have to go correct for all that too.
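For a rough idea of the "slide the two maps until they line up" step described here, this is a minimal sketch using OpenCV's Python bindings, with placeholder filenames and thresholds; Planet's actual rectification (sub-pixel alignment, terrain correction) is far more involved.

```python
import cv2
import numpy as np

# Placeholder inputs: a known base map and a newly captured scene.
base = cv2.imread("base_map.png", cv2.IMREAD_GRAYSCALE)
scene = cv2.imread("new_scene.png", cv2.IMREAD_GRAYSCALE)

# Find distinctive points (road corners, manhole covers, anything you can see).
orb = cv2.ORB_create(nfeatures=5000)
kp_base, des_base = orb.detectAndCompute(base, None)
kp_scene, des_scene = orb.detectAndCompute(scene, None)

# Match descriptors between the two images and keep the best correspondences.
matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
matches = sorted(matcher.match(des_scene, des_base), key=lambda m: m.distance)[:200]

# Estimate the transform that slides the new scene onto the base map.
src = np.float32([kp_scene[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
dst = np.float32([kp_base[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)
H, inliers = cv2.findHomography(src, dst, cv2.RANSAC, 3.0)

# Warp the scene into the base map's frame (terrain correction not shown).
aligned = cv2.warpPerspective(scene, H, (base.shape[1], base.shape[0]))
cv2.imwrite("aligned_scene.png", aligned)
```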
Okay. So, computer vision as a whole: what can you tell me about it?
Well, so I like to phrase it as: computer vision is the absolute worst way of replicating light for humans to understand it. And I say that because computer vision is this notion of processing images to get data out, and the images that we generally take are really just bad representations of light in the world. If you actually draw out the system, and I've done this when I do talks, you start out: there's a thing, and there's a light source like the sun, and you have all these frequencies of light coming off the sun, just this huge giant pile of frequencies. They come down and they hit this object, and the object absorbs some of them and reflects some of them, and they bounce into a piece of glass, and those colors and all of that stuff bounces around in the glass and hits a CCD or a CMOS sensor. Well, actually, before it does that, it goes through what's called a Bayer filter, which is little tiny chunks of red and green and blue. It might also go through an IR filter. So it divides it up and already weeds out most of the frequencies. Then you're just left with basically a little response around a certain peak of red, a little response around a certain peak of green, and a little bit of blue. It hits the CCD. That CCD has some sort of response curve to it. It gets digitized, processed, beat up, stored as bits. And then those bits get moved around, recreated, reprocessed, put on a screen that may not represent the colors that actually came in. Then it bounces off from the screen into your eye, the process repeats, and then it gets weirdly interpreted in your brain. And this whole long chain of events is like the worst possible way of representing what's in the world, right? There are so many levels of just processing, filtering, weird stuff going on. And yet it all works. I can show you a picture of a cat and somehow we still intuit that it's a cat, and how that actually works is just incredible.
And then kind of magical.
Yeah. And then we're supposed to sort of process it, and then
have a machine say, oh yeah, that's a cat.
Because cats all look the same.
Yeah. I mean, well, it helps, right? Because the other side of it is that you just have so much data. When you look at how much data is in an image, it's books' worth, effectively, of bits. You have 8 bits for red, 8 bits for green, 8 bits for blue, or more. And then you have 1 megapixel, 10 megapixels, 20 megapixels of that. And it's a pile of data. So luckily, just in that giant mass, you can sift and throw away a bunch of it and eventually get to an answer. So another way I like to think about computer vision is that it's the process of throwing away as many bits as possible, as fast as possible, to arrive at the one bit or the two bits that you actually need, which is a classification, or a yes or no, or a number that is a measurement of something in the world.
Okay. So how do people get into computer vision? If they're, say,
embedded software hardware engineers or software engineers, computer vision is sort of specialized.
How do you go from college classes to this? Well, you know, so I didn't, my background is actually,
I did my undergrad in EE and computer engineering. And I really didn't do any computer vision in undergrad.
And I got my first job, and it was for this crazy, awesome, small,
it was literally a mom-and-pop defense contractor.
It was a husband-and-wife team that ran this 50-person R&D shop in Michigan.
And they basically said,
here is the list of all the government SBIR grants that come out every quarter.
SBIR is the Small Business Innovation Research grant?
Yeah.
Okay.
And if you apply, if you put together a grant proposal and you win it, that's basically your pile of money to go do what you want. And so I started looking through, and I'm like, okay, what are the cool projects? Because the one thing I try to do with work is I only try to do cool and interesting, hard things.
And so I went through that list,
and I found, you know, the things I thought cool.
And everything I found that I thought was really, really cool
was related to sort of taking images
and understanding them in the world.
And I had no idea how to do any of it.
But I could sit down and read papers
and, like, read through tutorials
and just sort of
work and work through it and try to understand, break problems apart. It's just a general engineering principle. It's like, well, how do I solve this problem? Well, to do this, I need to find these lines, or I need to find... what can I say about the system that I'm looking at? Well, it'll be a camera outside, which means it'll be light and sunny. So I need something that's invariant to illumination. And I need to look for people
in a cluttered outdoor scene. Well, what does the research look like for that? And then I go read
through like the research papers and I say, oh, this one does this, this one does this, this one
does this. And you sort of say, okay, well, that's sort of what the state of the world is. And what
can I take from each of these? And, you know, after you do that a few times, what can I take from my prior knowledge of how to do stuff, and what sets of technologies work really, really well? And I started out doing this all in C++, and it was hard. It was really hard. It's frustrating. The notion that you have to have a lot of grit to really get good at something that's sort of hard. It takes time and it takes practice.
My post this week for our embedded blog is titled Resilience is a Skill.
Yeah.
So yeah, I totally agree with you there.
So that's how you got into it.
And then you ended up getting a master's degree, partially in computer vision?
Yeah, it was like computer vision and robotics and machine learning, because they're all sort of interrelated, I think, to a certain extent.
But how would you recommend somebody get into it now?
I, again, I think it's starting to read, right? So I think probably the best set of tools out there... I mean, there are some really great teachers out there, and now there are sort of various blogs and stuff where people are showing tutorials. But if you want to do something right now, if you want to sit down at your computer and learn something: the OpenCV examples folder, especially the Python folder, there's probably 50 programs in there, right?
And you just open them up.
You look at the guts inside,
you run them, you tweak the guts inside,
you see what happens, you run them,
and you keep doing that.
And I think that, like if you have a basic knowledge of Python,
that is probably the quickest, dirtiest way
to start getting your hands dirty.
And for that, you just need a webcam and a computer, right?
Yeah, you may not even want a webcam.
You could probably do it off still images and videos, too.
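In that spirit, a minimal loop of the kind those OpenCV sample programs boil down to; it assumes the opencv-python package and the default webcam, but a still image or a video file works just as well.

```python
import cv2

# Open the default webcam (pass a filename instead to use a video or still image).
cap = cv2.VideoCapture(0)

while True:
    ok, frame = cap.read()
    if not ok:
        break

    # Run one basic operation, look at the result, tweak the numbers, run it again.
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    edges = cv2.Canny(gray, 100, 200)

    cv2.imshow("frame", frame)
    cv2.imshow("edges", edges)
    if cv2.waitKey(1) & 0xFF == ord("q"):  # press q to quit
        break

cap.release()
cv2.destroyAllWindows()
```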
A webcam makes it a lot more fun if you're going to do, like, face recognition or mustachination.
Stachination?
You take a picture of someone and it puts a mustache on them.
I know what you mean. I just... you don't think that's a word?
Well, it's only a word so much as, like, Snapchat makes rainbow-ination and other crazy, crazy things. I mean, that's a slightly different technique, but it's based on a very similar initial approach.
Okay. So I can go through and do this on my computer, and then a Raspberry Pi would be not too far from that?
No, no. No different at all.
Just slower.
I mean, but if you want things to run faster with computer vision, generally you just make your images smaller.
It's a really handy trick.
Things are too slow, make the image smaller.
Yeah, and sometimes the camera will make the image smaller for you.
Yeah. You know, never send software to do hardware's job, as much as you can get away with it.
I like that.
I like that.
You know, if the camera will do it, then you should talk the camera into doing it.
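When the camera can't be configured to deliver a smaller image, the software fallback is a one-liner; the scale factor here is an arbitrary example.

```python
import cv2

frame = cv2.imread("frame.png")  # placeholder input

# Shrink before doing anything expensive; every later stage now touches
# 1/16th as many pixels as the original.
small = cv2.resize(frame, None, fx=0.25, fy=0.25, interpolation=cv2.INTER_AREA)
gray = cv2.cvtColor(small, cv2.COLOR_BGR2GRAY)
```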
You know, similarly, if you're trying to make something robust, I generally very much advocate for controlling situations. Like people assume that there's magic, a lot of magic bullets out there,
which are that all of a sudden,
like I can write an arbitrary computer vision algorithm that works really,
really well outside in five lines of code.
And the truth of the matter is,
is no,
no,
you can't.
But if you want to solve a problem,
well,
like a sort of industrial computer vision where you want to recognize a lot
of things and,
and measure things and look at things consistently,
you have to have a consistent environment.
You need to constrain the environment a lot.
Reading through your book,
one of the things that struck me
that I hadn't quite realized,
having played just a little bit with some computer vision,
is that lighting is something
you shouldn't solve in software.
If you can have consistent light,
that is a physical property that will make your software so much simpler.
Well, if you think about signal processing, you want good signal.
And sometimes people forget that images are signals.
It's a picture.
I should be able to manipulate it in Photoshop or whatever to bring out detail.
Well, no, if you have a bad source, you have bad data.
Well, people, there's a weird sort of notion
that, you know, we understand color, not light.
And that's part of the problem.
You know, there's so many color hacks,
like the fact that magenta doesn't exist.
Like magenta is this color that we see on screens
that doesn't, the rainbow,
like if you look from red to blue,
it doesn't loop back over and mix red and blue again to make this magenta color. It's just a magical thing that we intuit.
And it's a very, like you have to remember that light and color are different things.
You can't really replicate color well without understanding the light that it's coming from.
I'm torn between going down that path and being so confused because last week one of
our guests was named Magenta.
And so I'm just sort of baffled here.
Well, I want to go back to something you said about magic because I've had limited experience
with computer vision in my working life.
And the one real serious thing that I tried to do was at a medical startup where
we were doing imaging. We were imaging the inside of arteries and it was sort of the same kind of
cross-sectional images you get from ultrasound in that they were very low contrast, features tended
to be soft and, you know, not have hard edges. And the desire was to be able to pick out all of
these features and segment it in these
complicated ways to say oh that's calcium that's you know a plaque that's some other disease and
the technical executives were just flabbergasted that we couldn't just do this.
Yeah, well, this gets to a very common set of problems. And the first is, you know, A, these problems are hard.
They're not like, not everything is a hard boundary.
It's a very softy, morphic, like setting up good definitions for what something is, is very hard.
You have to be very sort of rigorous in how you set up a problem.
Like what is the definition of the thing I want to find?
What does that look like, precisely? And then the following is, when you represent that to people... this gets back to color, too. I went out with a friend of mine who works at a microscopy company, and they were talking about imaging tissue samples. You know, if you're doing some sort of analysis on something and you want to show a region,
if you show it in true color, like you actually show what you found, things don't pop.
People don't see it.
But then when you try to use an artificial color space, and if you don't do it correctly,
what does it mean for an image to pop?
What makes it very easy to see? And that's something that's very difficult. So it's not just defining the problem. Somehow, latently, in your brain, a doctor looking at this thing knows what it looks like. But how do you take this thing that is intuitive and describe it rigorously enough that
you can kind of make it into a binary decision. Yeah, and it's like a different class of difficulty, too, because you mentioned seeing a cat and recognizing a cat.
Okay, we all know how that works.
But if you have something where it takes weeks of training to be able to discern features, okay, that's probably a harder problem to get a computer to do as well, because it's hard for a human to do.
Yeah, I generally don't believe that I can make a machine do anything that I can't do myself.
Right.
And, you know, there are certain things, like computers can see in different parts of the spectrum, or we can get cameras that see in different parts of the spectrum, but then I remap that to something I can see. I find that one of the most interesting problems right now. Because we can get, like, these FLIR cameras, these forward-looking infrared cameras I used to work with, sort of night vision but super night vision, like heat seeking, so you can see temperatures. Or things in the ultraviolet, or just way out in different bands in the spectrum that we don't see, not the red, green, and blue that we're used to. And how do you remap those such that people can understand them, when it's something that we just don't have the capability to see? Like, if you look at IR images, IR photography is a different form of photography. You can go buy this IR-sensitive film, and you go look outside, and all the tree leaves are white, because that's how they look under IR. And it messes with your brain, because our brains are so tied to the sensors that we have. You're just like, a tree doesn't look like that, a tree looks like this. How do you remap those things such that people can understand these new forms of information? It's really kind of interesting, both a science problem and a psychology problem.
Well, there are animals that can see ultraviolet and can see polarization, and we can't. And I've always wanted to see that. I mean, it's like, okay, now I want to see how a butterfly sees.
Well, the other thing about it is, we define sight as this very macro, huge, wide field of view, whereas plants actually see. Plants have photoreceptors that see different wavelengths. That's how they sort of know when it's fall or when it's spring. They actually see the changes in the illumination and what time it is. And everything out there is seeing in different ways that are optimized for its sort of local application. And that kind of gets back to solving some of these CV problems, where people assume, because it works well for us, that if we take an RGB camera and we apply it to a problem, we should be able to fix it. The case that comes to mind is, I had a buddy in grad school who was working on something about sorting recycling, like the different plastic types. Generally, all plastic to us kind of looks more or less like white or clear plastic. You look at a couple of different wavelengths, or a little bit of polarization information, and all of a sudden that stuff just pops. The types of plastics are super discernible, and you can sort them exceptionally easily. And so if you try not to think like a human, but think about understanding the problem and solving it, you can actually make the solution way easier.
That's got to be a huge key point to doing good computer vision projects: both to not expect the computer to do any more than you can, but also that it has different sensors. It can't do more discernment, but it can have different inputs.
Yeah, you can, I mean, some cameras can have like better, you know,
we can change the depth of field, we can constrain the color of light,
we can put filters in front of the lens itself that filter out different wavelengths of light.
You can adjust the light such that things appear darker or lighter, depending on whether it's coming in at glancing angles versus backlighting versus top lighting. Lighting is just incredibly important, particularly for manufacturing tasks.
Well, that's one of the things I really wanted to ask you about. I have done a Raspberry Pi project where I got OpenCV running, and I got to blob detection, which was very cool, and face detection, which was all just one line of Python at a time. It was very encapsulated. I didn't have to work very hard, really.
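That one-line-at-a-time flavor looks roughly like this sketch using OpenCV's bundled Haar cascade; the image filename is a placeholder, and the cv2.data path assumes a reasonably recent opencv-python package.

```python
import cv2

# Haar cascade face detector that ships with the OpenCV Python package.
cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

img = cv2.imread("people.jpg")  # placeholder image
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# One call does the detection; the rest is just drawing boxes.
faces = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
for (x, y, w, h) in faces:
    cv2.rectangle(img, (x, y), (x + w, y + h), (0, 255, 0), 2)

cv2.imwrite("faces.jpg", img)
```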
And now I think about, well, what if I need to do a manufacturing line, and I want to use computer vision on that?
Can I really expect to be able to?
I mean, yes, you can. So the world of manufacturing is sort of broken into vendors that specifically vend things, some of which are very simple turnkey solutions. Like, if you just need to read a barcode, that is a turnkey solution. You shouldn't roll that from scratch.
Yeah, that's a thing.
Now, if you have something much more complex...
I'm thinking quality control, where I have a screen that does stuff, and I need to make sure my screen is good before I ship it.
Yeah. And I think the way that you approach most of these problems, as far as I've found, is that it actually starts with collecting a lot of data. Because one of the places that is really, really bad to be in computer vision is to be in what I call the optometrist's office, where you are changing stuff, trying to develop, and you're saying...
Is this A better or B better? A better or B better? Anybody who has glasses has done this.
Yeah. And you sit there and you're changing a parameter or tweaking something. The first step is always to develop data.
And you sit and you have a data set and you have an objective measure of what you want to accomplish.
And once you have that objective measure and you set up a test environment where you don't have to deal with, like, say, an assembly line, it's just sitting there, you run across it.
You run across the data with whatever algorithm you're working on.
You get really into stats. You say, I achieved an accuracy of 96.4, and then you tweak something and it says 96.5, or 96, or just 8 because you really screwed it up. And that's where computer vision sort of overlaps with things like numerical methods and stats and all this sort of math background.
And machine learning here, because you're going to talk about false positives and false negatives and true positives and true negatives and your F1 score and all these things.
You end up with all these boxes to put things in and you have to figure out your tolerance of, it's okay to be wrong sometimes.
It has to be okay to be wrong sometimes.
Absolutely.
And people don't believe you when you say it's going to be wrong.
Because even humans don't do most of these tests 100% of the time.
And is it better to have it be wrong more in one direction than the other?
Yeah.
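A toy sketch of the kind of objective measure being discussed, counting true and false positives and negatives from hand-labeled data; the labels here are made up.

```python
# Toy ground truth and predictions for a binary "defect / no defect" classifier.
truth = [1, 0, 1, 1, 0, 0, 1, 0, 0, 1]
pred  = [1, 0, 0, 1, 0, 1, 1, 0, 0, 1]

tp = sum(1 for t, p in zip(truth, pred) if t == 1 and p == 1)  # true positives
tn = sum(1 for t, p in zip(truth, pred) if t == 0 and p == 0)  # true negatives
fp = sum(1 for t, p in zip(truth, pred) if t == 0 and p == 1)  # false positives
fn = sum(1 for t, p in zip(truth, pred) if t == 1 and p == 0)  # false negatives

accuracy = (tp + tn) / len(truth)
precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)

print(f"accuracy={accuracy:.3f} precision={precision:.3f} "
      f"recall={recall:.3f} f1={f1:.3f}")
```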
So you have to do a lot of this planning and specification of how good it actually has to be. And when somebody says it has to work every time, then you multiply your planned budget by about a hundred.
Yeah. Or a million.
Well, yeah. I'm doing a lot of sort of deep learning things right now, and I kind of believe, as a rule of thumb, and I don't know if this is exactly the number, but to give you an idea: each additional nine, each additional decimal place of accuracy, is probably going to cost you another order of magnitude of data. Because what you're doing is you're actually running through your data to find the cases where things break. The common case, generally, if you do a bunch of machine learning, you'll get very quickly. It's the corner cases, like we all know in writing any bit of software, it's the corner cases that are going to get you. And you have to
collect those corner cases. They're like Pokemon. You just got to go and find them.
Yeah, we might've talked about Pokemon before the show. But there's also clean data versus
data that's not so good. My example for that is, if I was training a voice learning system and I took all of my training data in the morning, and then I also took data that I wanted it to refuse in the afternoon. Like, I had Chris record in the afternoon, and I wanted it to train to my voice. If there were birds singing outside when I was doing it but not when he was doing it, it may train on the wrong thing. So you have to choose your features intelligently. You can't just let the machine go crazy. Garbage in, garbage out.
Well, the other thing is, in a lot of these projects you use a Turk, right? You use somebody, some poor person that has to sit in front of a computer and just click, click, click, click, click to say, this is thing A, this is thing B, this is thing C. So we know very, very well that humans' ability to do those sorts of tasks... that's why we want machines, is that people can't do that well for very long. You can do it really well for an hour, and then you need to go take a break and have a cup of coffee. And if you have people generate that data for you, there's always going to be problems with it.
That's why you see things like some of these captchas, right?
It's not like the captcha where it says,
click on three images of pizza in this image or something.
Those are usually going through people multiple times.
That same image will probably hit, I don't know, three, five, ten people. Because no one person... if you're doing that stuff a lot, people kind of disagree. Even at Planet, looking at clouds, it's like, what exactly defines a cloudy image that we don't want to sell to customers? Because it can't just be white; that may be snow.
Yeah, well, and... yes, exactly. But there are other bits of that, too, where it's like, if I could make a way to calculate a number that says this image is 97% cloudy, I've solved the problem, right? Because I said this image is 97% cloudy. And if you define it that way... if you don't have that definition, it's still a subjective thing. Because that's what we're really trying to find. We're trying to find the objective measure for the subjective thing. And that's sort of where things are interesting.
All right, computer vision. I'm sorry, you have me so distracted into machine learning and data, and I totally... But computer vision. I'm going to be focused.
Yeah, we're out of focus. That's part of it too.
If I wanted to build a manufacturing test bed for quality control, and say I used a computer, I didn't try to do a Raspberry Pi, although it wouldn't be very different. And I want to look at a screen.
What are my steps?
Assuming I don't have your background, that I'm pretty much coming at this without computer vision.
Okay, so the first step is you need to find a camera.
And you need to figure out what your illumination situation is.
And you probably want to specify the problem.
So you have some problem that you want to solve.
You want to sit down and very rigorously define what success looks like. Maybe get an idea of how often things will happen.
You know, say if it's a widget, you want to say: toss all your broken widgets in a box for two months, all the rejects, because we're going to run those through every single day, because those are things that you really want to catch. You're going to have millions of good widgets. So you first set up defining your problem: this is what I want to find, this is how I know I will have found it, this is roughly what we think it'll perform at. Then you have to go and specify, before you even do the software side of things... you really should pick the right tool for the job, which means finding the optimal lens and set of filters.
So do I need RGB?
Do I need to work in near IR?
Can I just get away with a grayscale image?
Do I need like a high bit depth?
Because this is something where there's not a lot of contrast.
So try to figure that out.
And fewer bits is better.
I mean, because a smaller image is easier to deal with.
Well, it's just easier to deal with. You know, a higher bit depth, if you have less contrast, can pull stuff out a lot easier. Like I said, it's all about throwing away bits as fast as possible, but if you need those particular bits, then you just need them. Don't throw them away too fast.
Yeah.
And so, first, sort of figure out a lens and camera system. What frequency does the camera have to operate at? Should it be running over USB? Do I want it to run over GigE? There are all these different protocols. Is it going to vary in temperature between negative 22 degrees and 120 degrees, or is it just going to be 70 degrees all day? Will I have to deal with stuff like lens fogging or anything like that? What does the lighting environment look like? Will I want to put lights in to drown everything out? If so, how long will they last? Where should I position them? How are they going to get powered? That sort of thing. So once you sort of deal with those hardware externalities...
Then the really hard problems. And we haven't even opened Python yet.
Yeah, you haven't even opened Python.
So then, depending on who your vendor is, you may be
writing your own drivers. Or not writing your own drivers, but they might have like a C API, right?
Yes. Yes, they do.
Camera link. Oh, jeez.
See? So they may or may not be supported in Linux. They may only run on Windows XP. You don't know. You know, vendors... pick a vendor who has really good support for all of these things. It's horrible to get in a situation
where you have to buy the cheapest camera
and it has the worst support.
There may be an open source Python wrapper to it.
There may not be.
You may need to use some sort of...
You might have to write your own Python wrapper
to actually get it at the camera library.
You may not.
That's all vendor-specific.
I've found recently, on a couple of projects, I really like Robot Operating System if you're going to build this on Ubuntu. I think there's going to be some Mac and Windows support actually fairly soon. But if you want to get access to most of these sort of professional machine vision cameras, or even stuff like a DSLR or something like that,
Robot operating system, ROS, has really, really good support for getting at these cameras,
like interfacing with the cameras and then putting them on a unified bus structure and
presenting you with a nice set of Python tools for interacting with the images, grabbing
them, changing camera parameters, and seeing what's going on.
So I've used that in a few prior projects,
and it's saved me an absolutely ton of time.
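A minimal sketch of that ROS camera path (ROS 1, rospy, cv_bridge); the topic name below is a common convention, but the real name depends on the camera driver you end up with.

```python
import rospy
from sensor_msgs.msg import Image
from cv_bridge import CvBridge

bridge = CvBridge()

def on_image(msg):
    # Convert the ROS image message into a plain OpenCV/numpy array.
    frame = bridge.imgmsg_to_cv2(msg, desired_encoding="bgr8")
    # ... vision work goes here ...
    rospy.loginfo("got a %dx%d frame" % (frame.shape[1], frame.shape[0]))

rospy.init_node("camera_listener")
# "/camera/image_raw" is a typical topic name; check what your driver publishes.
rospy.Subscriber("/camera/image_raw", Image, on_image)
rospy.spin()
```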
But so you're basically going to figure out
how to get the data and tell the camera
what you want it to do.
And that's usually a pretty tedious step.
I was actually talking to somebody yesterday
who was trying to build a camera interface driver
and she was not happy about it.
It was really rough. See, we shouldn't have to rebuild
these things all the time. It goes back to my previous
rant in earlier shows, but I'm just tired of
I want to do machine vision. Well, you better
learn to write a Linux device driver.
But you can start with
Raspberry Pi and their camera
and it may not be
everything you need, but it may be enough.
Yeah.
So then you have an image.
Let's just assume now you have an image.
You have a JPEG or PNG or something raw, whatever.
Then it is a matter of figuring out what you want to do with it.
And that can be as simple.
You can do IPython.
I have actually been doing a lot of my prototyping.
Depending on what it is, a lot of it goes into Jupyter Notebooks first.
I just want to say: Jupyter is now sort of more cross-platform, where you can work in other languages, but it started out as IPython Notebooks, and it's sort of grown into this inclusive community of different languages. The way to sort of think about it is: in the browser, you can run code and visualize code, very similar to MATLAB, where you have matplotlib. I've been using Bokeh lately, which builds these beautiful, impressive plots that people love, that are nice and interactive, and you can save them in their JavaScript and HTML. And you can keep all your documentation in there about what you're thinking as you go along. The newest versions, it's incredible, you can actually see how much of a resource you're consuming. You can farm stuff out to multiple computers. It's like the sweetest development environment.
development environment i'm not sure she answered the correct thing when she said her favorite editor was Emacs.
I do a lot of work in Emacs.
So, Jupyter is an IDE for Python and other languages.
Yeah.
I have PyCharm and I have Anaconda, which Anaconda isn't really an environment, but
PyCharm is sort of an environment.
Is that similar?
I don't know what you produce.
I'm trying to figure out where it is.
Yeah, I mean, so it's,
I think the primary difference is,
like, PyCharm, I believe,
is an IDE that is actually, like,
you know, compiled and runs on Mac.
It's, like, running Visual Studio,
whereas IPython is more,
more or less, like,
an access to a Python shell in your browser.
Yeah, okay.
With like a lot of sugar on top of it to make things render
and be pretty and be saveable.
Little macros, yeah.
Oh, and you have like access to your documentation.
It's the cat's pajamas.
Okay, cool.
Learned something about Python.
So, and then it's really, you know,
the process of solving these problems is really,
it's really specific to, to what is going on.
And I, I no longer pontificate about like a particular approach to generally doing stuff.
I have hunches, but for any computer vision test, and this is part of the reason I like
doing stuff in Python so much is that you set up your metric, right?
You say, this is the thing I want to do, this is how I'll know it's done. And then you throw a bunch of stuff at the wall. You have some inclinations, given prior problems. You say, oh, I think I can just do a little bit of contrast adjusting, I do a threshold, I find the connected component, and I measure the center of this thing.
I think that'll work,
but you don't know.
Like you just start throwing stuff at the wall and tweaking.
And that's where that number,
getting an objective measure of what you want to do really,
really helps you.
Because then it gets you away from this optometrist's better, worse, better, worse. Because you can just be in your basement. If you have enough sort of gumption, you can just be down there trying to make things better indefinitely, forever. And if you have a number, then you at least know, oh, if I got it to 95% today, I just have to spend tomorrow getting it to 96. And I have a time when I go home.
And 95% of something. You can't just have it be better, 95% good. It has to be 95% accurate over these data sets, or over these problem definitions.
Yeah, I've had the 95% better, and it's like, better in what way?
But what you said suggests that you do have to have enough knowledge to understand some of the basic kinds of operations, like thresholding and, you know, finding connected segments and things. So there's a handful of things like that that you kind of piece together.
Yeah, to get your result. I mean, for an analysis test like that, that is going to be what I almost like to call sort of first-generation image processing, right? It basically takes things from 1D and extends them to 2D. So it's thresholding values above a certain level, throwing away values, basically looking for discontinuities, which are effectively edges, right? We're looking for corners. Corners are exceptionally important in most computer vision tasks
because they allow you to sort of position things.
You know, and a lot of that stuff,
especially for, you know,
these sort of old school problems,
it's like find a bunch of points,
fit a line to those points
or fit an arbitrary function to those points
or take the outside
and measure across various vectors across something. It's a lot of geometry, really.
In your book, there were terms, and it seems like some of these terms are related to what you're saying now. For example, dilation and erosion. I was about to define them, but if I was smart, I would say: could you define those?
So these are what we call basic morphological operations. When you have an image, each pixel has eight bits of red, eight bits of green, eight bits of blue. And that's great for representing stuff, but it's not really good at throwing away bits. And so if you want to throw away bits, you basically say, is this a bit I care about or a bit I don't care about? And you can usually do that with brightness. So a thing to think about, before we even talk about R, G, and B, is color spaces. You have a notion of red, green, and blue, but you also have a notion of dark and light, bright things and dark things. And so you can map between red, green, blue and bright and dark, and there are all these different color spaces. HSV in particular is pretty handy. And so what a threshold is, is you say: hey, I have bright regions and dark regions. Give me all of the pixels that fall in between this brightness and this brightness, or between this color red and this color red. And you label them as all being, say, ones, as being white, and everything else is zero.
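That kind of threshold, sketched with OpenCV in HSV space; the hue and brightness bounds are placeholder values you would tune for your own scene and lighting.

```python
import cv2
import numpy as np

img = cv2.imread("scene.png")  # placeholder input
hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)

# Keep pixels between "this color red and this color red"; everything else goes to zero.
lower = np.array([0, 100, 100])   # lower hue/saturation/value bound (tune this)
upper = np.array([10, 255, 255])  # upper bound (tune this)
mask = cv2.inRange(hsv, lower, upper)  # 255 where in range, 0 everywhere else

cv2.imwrite("mask.png", mask)
```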
So once you have that pile of 1, 1, 1, 1, 0, 0, 0 on a grid, you basically say, well, if I have a white pixel, a black pixel, and a white pixel on one column, and then another column where I have a white pixel, white pixel, white pixel, so I have basically a reverse letter C is what I'm thinking.
So a dilation would say, well, if you have that letter C, one of the operations is to take that black chunk that's in the center of C and change it white. And so this has the effect of sort of,
like if you have a blob, right? A bunch of these pixels, all contiguous. It makes the blob grow.
You're dilating it. You're making it bigger. So if you're next to a black pixel, or if you're
surrounded by white pixels and you're a black pixel, then go white.
Yeah. Okay.
And erosion is similar. If you have one little pixel sticking out, so if you have a row of five white pixels, and then another row on top of that where there's just one white pixel in the center, an erosion operation will get rid of that white one. And what you're basically trying to do is manipulate this blob to make things separate, or combine into contiguous regions. That's usually what you're going after.
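The same morphological operations, sketched in OpenCV on the binary mask from a threshold; the 3x3 kernel is an arbitrary choice.

```python
import cv2
import numpy as np

mask = cv2.imread("mask.png", cv2.IMREAD_GRAYSCALE)  # binary image from a threshold
kernel = np.ones((3, 3), np.uint8)

# Dilation grows white blobs: fills small black holes, merges nearby regions.
grown = cv2.dilate(mask, kernel, iterations=1)

# Erosion shrinks white blobs: removes single stray white pixels.
shrunk = cv2.erode(mask, kernel, iterations=1)

# A common combination, erode then dilate ("opening"), drops speckle
# while keeping the surviving blobs roughly their original size.
opened = cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel)
```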
And so the example in the book was tools on a pegboard. And by dilating and eroding, they could remove the dots of the pegboard and still keep the outline of the tools, because the tool outlines stayed pretty consistent through those operations.
Yeah.
But there are other operations.
You mentioned brightness, and turning up the brightness, and I've done that in GIMP. I mean, I know how to do that on my phone. But I did not realize that one way to think about brightness was just adding the image to itself, or multiplying by more than one. And that was brightness.
That is the mathematical operation.
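The arithmetic version of turning up the brightness, sketched with OpenCV; the gain and offset values are arbitrary.

```python
import cv2

img = cv2.imread("scene.png")  # placeholder input

# Brightness really is arithmetic: multiply by a gain, add an offset,
# and clip back into 0-255. convertScaleAbs does the clipping for you.
brighter = cv2.convertScaleAbs(img, alpha=1.5, beta=20)  # 1.5x gain, +20 offset

# Adding an image to itself does much the same thing (cv2.add saturates at 255).
doubled = cv2.add(img, img)
```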
So what other mathematical operations lead to these words that I already know what they mean?
Okay.
So, you know, there's a whole subset of these things. When you deal with binary images, addition goes back to sort of being a logical OR, right? So if you have white and white, two whites, they're still going to be 255. If you have one white and one black, it's still going to be 255. So addition and subtraction have that same sort of effect.
When you start talking about edges: an edge is really where you're looking for a derivative. If you want to think about it more simply, you have two adjacent pixels. One may be, say, zero, and one may be 255, and you subtract them from each other and put the result back in the original pixel. So let me re-explain that a little bit. You have zero, zero, zero, 255, 255, 255. You go zero, zero, zero, and then you flip to the 255. Well, that's an edge, because the subtraction is going to be 255. And you go to the next one, and it's zero, and so it goes away. And so all of a sudden, if you look at this as a line, it's like, boop.
And that's an edge.
And that's how you start finding edges.
And you find two sort of edges in each direction.
You all of a sudden have like a corner.
And those can be super handy.
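A sketch of that subtraction, first on a toy 1-D row and then with the standard 2-D Sobel operator in OpenCV; the input filename is a placeholder.

```python
import numpy as np
import cv2

# The 1-D picture: a row that steps from 0 to 255 has one big difference, the edge.
row = np.array([0, 0, 0, 255, 255, 255], dtype=np.int16)
print(np.diff(row))  # [0 0 255 0 0]  <- the "boop"

# In 2-D, Sobel does the same kind of differencing in x and y; edge and
# corner detectors usually start from gradients like these.
gray = cv2.imread("scene.png", cv2.IMREAD_GRAYSCALE)
gx = cv2.Sobel(gray, cv2.CV_32F, 1, 0)
gy = cv2.Sobel(gray, cv2.CV_32F, 0, 1)
edges = cv2.magnitude(gx, gy)
```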
There was also subtracting two images to show motion.
And that makes sense now. But if you'd asked me how to look for motion, I don't know that taking two images and just subtracting them would have been my first thought.
Yeah. And, well, it works to a certain extent, you know, if you have a really solid background. What is a green screen, right, other than, I have this giant field of green? Now you can do certain things by changing color spaces and make it better than it was, but the rudimentary version is just a subtraction. And everything after that is just getting more and more fancy math, and doing more and more sort of processing afterwards.
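The rudimentary version of motion by subtraction, sketched as OpenCV frame differencing; the camera index and threshold are arbitrary starting points.

```python
import cv2

cap = cv2.VideoCapture(0)
ok, prev = cap.read()
prev = cv2.cvtColor(prev, cv2.COLOR_BGR2GRAY)

while True:
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

    # Pixels that changed between frames show up in the absolute difference.
    diff = cv2.absdiff(gray, prev)
    _, motion = cv2.threshold(diff, 25, 255, cv2.THRESH_BINARY)

    cv2.imshow("motion", motion)
    prev = gray
    if cv2.waitKey(1) & 0xFF == ord("q"):  # press q to quit
        break

cap.release()
cv2.destroyAllWindows()
```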
So the green screen works by having all the green in the background and then you video
the person in front of it and then you subtract everything that's that color green or close
to that color green.
And then in that space that is now totally blank, it's not black or white or red or green
or anything.
It is transparent.
It's alpha.
Yeah.
Now you can put in a background image and keep the foreground image, the weather casting. Yeah, and if you do it correctly, you can also kind of keep shadows if you want them
and keep illumination highlights and stuff like that because you can kind of tease out
the hue, the color from the illumination, the brightness.
So you don't just subtract every green in there, you subtract every green to some extent?
Well, you subtract the hue part of it. So imagine subtracting the green part of it but leaving the red part of it, except you work in a different color space. So instead, you subtract the green part of it, but you leave the brightness part of it, and then you can reapply that brightness to another thing.
One of the interesting things I think about computer vision is actually
it's the reverse problem of computer graphics,
which is a really interesting way to think about it.
In one, you're trying to project through a camera and recreate it. In the other, you're trying to take
things in from a camera and build the 3D model.
And that's always just...
The interplay between those two is actually kind of interesting because most cameras,
when we actually talk about where all this imagery comes from, it's because we're watching TV all the time.
We're making films, and some of those films are actually 3D generated.
Deconstruction of graphics.
She makes graphics sound easy. It's just math.
Yeah.
Shears and warps, what are those? And, going back to it's just math, I bet they're connected.
Yeah. So, you know, a shear, or a general perspective warp, is sort of: hey, I have corners on a billboard and I want to... it's like, say, your GIMP example, right? You have your friend holding a book, or their iPad, and you want to go take an image that you found on the web of something. You want to show them having an iPad, but you want to put a cat on it. And you take the image of the cat, and you grab your tool in GIMP to take the corners of that cat image, and you say, here's one corner of the iPad, here's the other corner of the iPad, here's the other corner, here's the other corner. And you warp and move that image so it fits on the iPad screen. And so that's just a general change of perspective.
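That cat-on-the-iPad warp, sketched in OpenCV: four source corners, four destination corners, one matrix; the filenames and coordinates are made up.

```python
import cv2
import numpy as np

cat = cv2.imread("cat.png")        # the image to paste
photo = cv2.imread("friend.png")   # the photo containing the iPad

h, w = cat.shape[:2]
# Corners of the cat image (top-left, top-right, bottom-right, bottom-left)...
src = np.float32([[0, 0], [w, 0], [w, h], [0, h]])
# ...mapped to the four corners of the iPad screen in the photo (placeholder values).
dst = np.float32([[320, 180], [620, 200], [600, 420], [300, 400]])

# One 3x3 matrix describes the whole change of perspective.
M = cv2.getPerspectiveTransform(src, dst)
warped = cv2.warpPerspective(cat, M, (photo.shape[1], photo.shape[0]))
# Blending 'warped' into 'photo' (masking out the black background) is a separate step.
```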
Now, some of the warping stuff: if you have a Mac, you open up the camera app, and there are all these weird warps that, you know, make your nose look big and your mouth look small.
Oh, the circus funhouse things.
Yeah. And those are just sort of mathematical transformations where you say, I'm going to take this pixel right here and map it out to the adjacent pixels in some sort of circular way, or some round way, or based on some sort of function, right? It's just a definition of a function to remap from the normal pixels to a new set of pixels.
I remember thinking in college, I have no idea why anybody would use this stupid matrix math stuff. I was so wrong.
So very wrong.
Are there barriers to getting into computer vision? Things like understanding matrix math, or learning to program?
You know, I think that's a huge barrier. The math side of it... to a certain extent, I actually find learning applied math much, much easier than learning abstract math and then figuring out where to apply it. And this is where computer graphics, I think, comes in. If you sit down and want to learn basic, very basic, elementary linear algebra, it's not exciting stuff. At least I don't think so. But if you go and want to hang out with a bunch of kids and you say, I'm going to teach you how to construct a matrix to do crazy rotations in your video games, so that you click on this thing and it rotates in these different ways, and then gets big and gets small again at the same time, I'll show you how to do the math to do that. And it's just like, oh, it's just this: you take this thing over here, you take this thing over here, you take these three matrices and you put them all together. And you can decompose them, so that one's the rotation, one's the scaling, and one's the change in position. If you teach people like that, and they can see it, it's way easier to teach.
I mean, I think that's sort of one of the reasons I got into computer vision and into graphics and these sorts of things, is that you get this feedback loop where it's not just numbers on a screen. It's actually something you see, and you tweak it a little bit and you get really good feedback. And it's not just, oh, I moved the text box on my web page or something. It's a real huge image that you can do really cool stuff with.
Yeah. And it changes both how you see the world and how you can interact with the world. I mean, if you can make a robot that really can tell that something's broken,
with the world i mean if you can make a robot that really can tell that something's broken,
well, then you can make a robot that can fix things.
It's hard to fix something if you can't see it,
and that's where computer vision really starts to shine.
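The three-matrices idea, sketched in NumPy: rotation, scale, and translation as separate matrices composed into one transform; the angle, scale, and offsets are arbitrary.

```python
import numpy as np

theta = np.radians(30)   # arbitrary rotation angle
s = 2.0                  # arbitrary scale
tx, ty = 5.0, -3.0       # arbitrary change in position

# One matrix per idea, in homogeneous 2-D coordinates.
R = np.array([[np.cos(theta), -np.sin(theta), 0],
              [np.sin(theta),  np.cos(theta), 0],
              [0,              0,             1]])
S = np.array([[s, 0, 0],
              [0, s, 0],
              [0, 0, 1]])
T = np.array([[1, 0, tx],
              [0, 1, ty],
              [0, 0, 1]])

# Put them all together; applied to a point, T happens first, then S, then R.
M = R @ S @ T

point = np.array([1.0, 0.0, 1.0])  # the point (1, 0)
print(M @ point)                   # where it lands after the combined transform
```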
So what sorts of things are you doing for Planet?
Well, right now it really is recognition, and it's starting out on the very basics. So when the satellites are up and they're imaging, we have a reasonable estimate of where they are, but we don't know exactly. So sometimes, you know, you might overshoot and get a little bit of water before you hit the coast or something like that. And my job is to basically filter out things like clouds and water, and find scenes that are very, very interesting, and then find things that customers really, really want in those images.
And help us figure and develop tools to understand those images.
And especially now, what's really, really interesting is seeing that data every day.
Like, you know, it used to be that this, like, satellite imagery was this sort of thing.
Well, we'll get an image now, and then we might get an image in six months, or we might get another year.
And with Planet, within the next year, it's going to be, we will be getting images of parts of the globe every single day.
And you're going to have to throw away some of that.
Well, you're going to.
Not actually throw it away, but.
Well, the government requires you to delete all the UFO ones, right?
Yeah.
So that's most of your work.
Actually, I can't comment on that.
But you said that throwing away the pixels is a huge part of the job, the ones you don't care about.
And so having this huge amount of data every single day, you're going to have to throw away a lot.
We save all of it that comes down.
We're actually contractually, I believe we're contractually obligated to store all of the raw data.
But generally, especially now, some people really care about the imagery. Some people really want to see it. Other people really just care about the number. They want to know: is this field growing? How much water is it consuming? How much more should I put on it? They want very concise answers to very difficult problems, and we're throwing it away by actually getting them that answer. And those are the sorts of things that I'm really excited to start tackling.
Yeah, because they don't care what the image looks like if they can get, okay, tell me how to irrigate this.
Because that's what I care about.
I don't care about your computer vision.
I care about water.
They want a number that is something like, you know, how many cars were in the parking lot of the mall the day after Thanksgiving?
And how does that correlate with sales in that mall?
And give me those two numbers
because I can really do something with those.
It's great that I can see this
and I can get a relative guess,
but get me something
that's even slightly more accurate of relative guess.
This is going to be a weird world.
Going to be?
Oh, it's already weird.
I mean, Pokemon.
Let's see. I guess Planet is hiring. They're not doing a contest, because they let me have Kat instead. And embedded fm at planet.com, go look, please. They're nice folks. Space is awesome.
So, back to computer vision and the book. Why is a barcode scanner one of the last examples in the book? That doesn't seem hard. It's black and white, it's lines. Why is that one of the hardest things to do? You said to buy it, don't make it, which I totally understand. But if I'm going through the examples in the book, why is that hard? It's not hard to do a triv...
And this is a camera-based barcode scanner,
not the scanning laser one-dimensional one.
Right?
Yeah, it's a camera-based one.
It's not...
So, there are packages that will do this for you,
like Zebra Crossing will do this for you.
Somebody has already solved this.
I have also written these things from scratch.
It is easy to make it work. It is not
easy to make it work well. I mean, it's like almost all engineering problems: it's easy to get a prototype that kind of works all right. To get it to work robustly for any sort of application that you would want? That's very, very difficult. Like your barcode scanner: even now, when you go to the grocery store, it still screws up maybe one in 20 times, and that's annoying, right? The clerk's sitting there like, oh, I've got to scan it again, and you're just waiting, and then they're wiping it off, they're wiping it off, and then they're praying to the voodoo gods to make it work.
Yeah, and it's a very, very difficult problem, because there are so many different... You know, you have this camera, right, that's probably fixed.
And then you have this thing that you can rotate in all these different ways and it can still be kind of seen.
You can kind of see the barcode, but you may not see the barcode.
And the projection of the barcode in the camera could be really, really weird.
So, you know, you definitely want the common case to be flat, right across the thing, but you don't always get that. And so making it robust to all these different kinds of ways that you can turn the barcode and manipulate it and warp it and shear it is very, very difficult. And then on top of that, there's lots of different barcode formats too.
Yeah.
That didn't seem as hard a problem.
But the warping and shearing and dealing with lighting effects,
if the barcode is half in light and half isn't,
you can't use the same threshold to say what's light and dark.
Yeah.
And, I mean, there are techniques to improve on that.
This is also why, for certain things, well, you may actually want more bit depth, because you just can't get that data out
unless you actually have a little bit more range at the dark side.
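One common way around that single global threshold is adaptive thresholding, where each pixel is compared against its local neighborhood instead of one global cutoff. A minimal OpenCV sketch, with an illustrative (untuned) window size and offset:

```python
import cv2

img = cv2.imread("barcode.png", cv2.IMREAD_GRAYSCALE)

# A single global threshold fails when half the barcode is in shadow.
_, global_bw = cv2.threshold(img, 128, 255, cv2.THRESH_BINARY)

# Adaptive threshold: each pixel is compared to the mean of its local
# neighborhood, so uneven lighting matters much less.
adaptive_bw = cv2.adaptiveThreshold(
    img, 255, cv2.ADAPTIVE_THRESH_MEAN_C, cv2.THRESH_BINARY,
    blockSize=31, C=10)
```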
And is that a similar problem to putting known features around a room
if you want to do position detection
or if you're trying to do some robotic thing
where, okay, I want to go over here and find this position.
Well, I'll put a thing that looks like a barcode.
Like the QR codes.
Can you make that an easier problem than the general barcode case, I guess, is my question.
It's effectively the same problem.
Yeah, okay.
Yeah, and you kind of intuit why you do this. Like, you have to understand the logic of why barcodes are, in some ways, the shape they are, and why you use those for understanding position. So if you stare at a white wall, like if I were to sit you in an infinitely large room
with a giant white wall and you're just kind of floating in space, if you move away from that white wall, you don't really know how far away you are. And if you move to the left, you don't know that you moved to the left or the right or up or down, because you have no way,
I mean, you might feel it in your inner ear, but you don't know how far. If I put a single line,
like if I had a giant pencil on this infinite wall and I drew a line, and it's a horizontal line, you could tell if I moved you up or down relative to the line. But if I move you right or left along that line, you can't tell how far you've
moved. Now, if I put another line perpendicular to that, now all of a sudden you actually have
a frame of reference and you can actually tell if I move you left, right, up or down.
And so, you know, having corners, and having lots of corners, and being able to sort of see those shifts in movement, is what helps you to do these sorts of problems.
I'm trying to speak sort of generally about these things.
Because you need these corners to then basically do a de-warping
to get it to sort of the square thing that you want,
where you can then figure out what the transform is from the projection that we had.
And then I can also, since I warped this back to a sort of 2D space,
I can then read the barcode
very, very easily, okay, because I'm looking for basically the relative widths of the different lines.
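The de-warping she describes is usually a perspective (homography) transform: once the four corners of the code have been found, map them back to a square and measure the relative bar widths on the rectified image. A rough OpenCV sketch that assumes the corners have already been located:

```python
import cv2
import numpy as np

def rectify_code(img, corners, size=200):
    """Warp a detected code back to a square so the bars are parallel.

    corners: four (x, y) points, ordered top-left, top-right,
    bottom-right, bottom-left. Finding them is the hard part, and is
    assumed to have happened already.
    """
    src = np.float32(corners)
    dst = np.float32([[0, 0], [size, 0], [size, size], [0, size]])
    H = cv2.getPerspectiveTransform(src, dst)
    # In the rectified image, decoding reduces to reading off the
    # relative widths of the dark and light runs.
    return cv2.warpPerspective(img, H, (size, size))
```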
So is making a line-following robot still one of those intro things you do when you're
learning about computer vision and robots or is there some other new nifty thing that everybody does?
Well, I feel like I have never actually made a line following robot.
Truth be told.
I think generally the easiest, most approachable thing is actually to go use these QR codes or these AR markers. The primary one is the AR toolkit, and this is just a package in ROS now that you can get, and it's super easy to use. And you can actually, you know, once you know the size of the barcode, you can actually reconstruct the camera transform. So you can say, this thing is here and my camera's here and it's moved like this, as long as you stay within certain fields of view. You can't go completely oblique to it; then it doesn't work.
But I feel like if I had to suggest one very, very early,
very, very useful approach, I think that's sort of the modern equivalent.
If you have a camera and you have a Roomba and you can put up those barcodes,
you're getting to the point of actually being able to develop a map
and understand how your robot moves around the world.
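She's describing the AR toolkit package in ROS; a similar camera-pose-from-marker workflow exists in plain OpenCV with ArUco markers, which is roughly what this sketch shows (pre-4.7 aruco contrib API; the calibration values and marker size are placeholders you would measure yourself):

```python
import cv2
import cv2.aruco as aruco
import numpy as np

# Camera intrinsics from a prior calibration step (placeholder values).
camera_matrix = np.array([[800.0, 0.0, 320.0],
                          [0.0, 800.0, 240.0],
                          [0.0, 0.0, 1.0]])
dist_coeffs = np.zeros(5)
marker_len = 0.10  # physical side length of the printed marker, in meters

dictionary = aruco.getPredefinedDictionary(aruco.DICT_4X4_50)
frame = cv2.imread("frame.png")
corners, ids, _rejected = aruco.detectMarkers(frame, dictionary)

if ids is not None:
    # Pose of each marker relative to the camera; invert it to get the
    # camera's pose relative to the marker (the "camera transform").
    rvecs, tvecs, _ = aruco.estimatePoseSingleMarkers(
        corners, marker_len, camera_matrix, dist_coeffs)
```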
My next question is going to be, what are the easy and fun things to do that will get people over the hurdle of just trying out computer vision for the first time? But I think you just answered that.
Yeah. I mean, I also feel like, at least if you're coming from, say, a web dev background, I think the world really needs more sorts of websites where you submit an image and some degree of processing happens on it.
Like mustache-ifying it, or other sorts of things.
There's this new app right now that sort of takes some of the neural nets trained on classical art and then reprojects the image.
Oh, yeah.
Like, I mean, all of these things.
Deep dreamer.
Deep dream.
And like, I mean, as stupid as it is, right?
Like the Snapchat sort of face filters, right?
Like these... I mean, it's not that big of a step to go from, hey, I processed an image and picked out a color or, you know, put a mustache on it, to doing something like that.
It's just slightly more work, but you have all the tools right there.
So, now, as somebody who doesn't use Snapchat... I know, I use everything else. This is where you put mustaches on people, but there are other things: you open your mouth and a rainbow comes out, and your eyes get big.
All right, this is sort of like when they're doing the Pixar movies, and they make the faces based on actors' faces, and they move such that you can almost see who the actor is underneath.
Yeah.
And they're doing that sort of real-time, with cameras, without putting dots all over your face?
Yeah.
Oh.
Well, and, you know, these are all like sort of chunks of known technology that are all put together, right?
Like we've had a basic face detector.
Like we can see full-on front faces, and it's called the Viola-Jones detector.
It's a very classic sort of thing.
And you can find faces very, very quickly.
But then you have to do that every image and that's computationally expensive.
But you have lots of images coming through time. So how do you track a face? Well, you take something like a Kalman filter or some of these other tracking, filtering technologies, and you say, okay, well, now I can kind of constrain where I'm looking for that face.
Now you have the face all the time. I forget the name of the particular technique, but you have this basically 3D mesh of your face, right, that has points at the corners of the eyes and your nose. And what you're trying to do is take the lines on that mesh and adjust them such that they align with the sort of lines that you can find visually, like the lower edge of your eyelid and the upper line of your eyelid, your eyebrows, right? You're trying to find... there generally appear to be dark lines, and so you're trying to fit that mesh to those. Once you have that in 3D, then you can
say, oh, reproject this image, this mustache or whatever you want onto it. And we've taken each
bit of that sort of pipeline
and sped it up and optimized it.
And that's what's going on there.
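The first stage of that pipeline, the Viola-Jones-style detector, ships with OpenCV as a Haar cascade. A minimal sketch of running it on a webcam; the tracking and 3D mesh fitting she describes are separate, heavier steps not shown here:

```python
import cv2

# Viola-Jones-style frontal-face detector bundled with OpenCV.
cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

cap = cv2.VideoCapture(0)
while True:
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    # Running full-frame detection on every frame is the expensive part;
    # a real pipeline would track the face and only search near it.
    faces = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    for (x, y, w, h) in faces:
        cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)
    cv2.imshow("faces", frame)
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break
cap.release()
cv2.destroyAllWindows()
```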
So I think there's a really good BuzzFeed article
on how this actually all happens
and how they do it step by step.
And it's really a good read.
Let's see if I can find that for you.
I will link that in the show notes.
But now I'm just so full of the idea
that I should be on Snapchat and I can be my own cartoon.
No, no, no, no.
So how do you decide between using these existing tool sets versus implementing something like OpenCV, or other toolkits for other platforms, versus inventing brand-new stuff? These seem like very different things.
How much do you use MATLAB?
NumPy all the way, man.
She said it.
Or NumPy.
Nobody's going to respond to that.
No, I... I had somebody say a phrase I loved the other day: the NumPy space princess.
Ah, sorry. So, you know, it's like a practical problem versus a research and development problem, and it really, really depends on what you're looking for. And, you know, if you need a stronger and better understanding of the problem space,
then you very much do sort of start walking into, I'm going to write something from scratch.
But a lot of, you know, a lot of the more interesting things are built out of tools
that we already had. Like, if you look at most audio processing, right, it's not like you're
going to go re-implement the FFT library every
time you want to do audio processing. Similarly, if you need a feature detector, you're not going
to re-implement that. I mean, enough graduate students have written enough papers and built
enough code that at this point, it'd be a waste of your time to go and do that. You don't really
need to do that. And so a lot of things are sort of building up on more and more complex things that we've already built, right?
I think any sort of development at this point is sort of standing on the shoulders of giants who've been doing this work for years and years and years.
But you do kind of have to learn what's underneath, because you can shoot yourself in the foot if you don't understand a little bit of what's going on, or you may sometimes have to re-implement it.
But by and large, if you're working on a large
Windows, Linux, Mac system, you're going to just pull from a library
that does the basis of what you're doing.
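In the same spirit as not rewriting an FFT, a feature detector is one library call away. A small sketch using OpenCV's stock ORB detector (any of the built-in detectors would make the same point):

```python
import cv2

img = cv2.imread("scene.png", cv2.IMREAD_GRAYSCALE)

# Years of feature-detection research, reused in two lines instead of
# re-implemented from the original papers.
orb = cv2.ORB_create(nfeatures=500)
keypoints, descriptors = orb.detectAndCompute(img, None)
print(f"found {len(keypoints)} keypoints")
```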
Cool. All right, I'm about out of questions. Christopher, do you have anything?
Do I have anything?
Well, I'm not actually out of questions. I just, okay, we're about out of time. I admit it. Fine.
Do you have anything? Anything at all? I have nothing. I have nothing.
Okay, then I guess I do have one more here that I would like to know. Do you have anything to say about Tempo Automation?
They're great. Go send boards to them.
Okay, so cool.
What was raising venture capital like?
It's super fun. Go send boards to them.
It's an interesting world.
Being an engineer, it's not exactly what I do.
I just kind of explain the hard technical parts and why it's important.
You know, you have to choose if venture capital is something that's right for you. I think the biggest sort of parting thing I can say about that is: understand exactly what you're getting yourself into.
If only we could. That's tough advice to follow. Well, do you have any final thoughts you'd like
to leave us with? I would say for a final thought, you know, images are probably some of the most
dense bits of information that you can put on the internet, so be very careful with where you put them.
My guest has been Catherine Scott,
co-author of Practical Computer Vision with SimpleCV,
The Simple Way to Make Technology See.
And she's a senior software engineer at Planet.
Thank you for being here.
Thanks.
Thank you also to Christopher for
producing and co-hosting. And of course, thank you for listening. Hit the contact link or email
show at embedded.fm if you'd like to say hello. If you'd like to apply to Planet, use the embedded
fm at planet.com email so we get credit. I don't know what the credits lead to, but you know, I want them. Our own satellite.
Oh, that'd be cool.
Our quote this week comes from Muhammad Ali.
Champions aren't made in gyms.
Champions are made from something they have deep inside of them.
A desire, a dream, a vision.
They have to have the skill and the will.
But the will must be stronger than the skill.
Embedded FM is an independently produced radio show that focuses on the many aspects of engineering. It is a production of Logical Elegance, an embedded software consulting
company in California. If there are advertisements in the show, we did not put them there and do not
receive any revenue from them. At this time, our sole sponsor remains Logical Elegance.