Orchestrate All the Things - Useful Sensors launches AI in a Box, aiming to establish a different paradigm for edge computing and TinyML. Featuring Pete Warden, Useful Sensors CEO / Founder
Episode Date: September 21, 2023
Would you leave a Google Staff Research Engineer role just because you want your TV to automatically pause when you get up to get a cup of tea? Actually, how is that even relevant, you might ask... Let's see what Pete Warden, former Google Staff Research Engineer and now CEO and Founder of Useful Sensors, has to say about that. Although naturally much of what he did was based on things others were already working on, Warden is sometimes credited as having kickstarted the TinyML subdomain of machine learning. Either way, TinyML is getting big, and Warden is a big part of it. Useful Sensors is Warden's latest venture. They just launched a product called AI in a Box, which they dub an "offline, private, open source LLM for conversations and more". Even though it's not the first product Useful Sensors has created, it's the first one that's officially launched. That was a good opportunity to catch up with Warden and talk about what they are working on. Article published on Linked Data Orchestration.
Transcript
Welcome to Orchestrate All the Things.
I'm George Anadiotis and we'll be connecting the dots together.
Stories about technology, data, AI and media
and how they flow into each other, shaping our lives.
Would you leave a Google Staff Research Engineer role
just because you want your TV to automatically pause
when you get up to get a cup of tea?
Actually, how is that even relevant, you might ask.
Let's see what Pete Warden, former Google Staff Research Engineer
and now CEO and founder of Useful Sensors, has to say about that.
Although naturally much of what he did was based on things others were already working on,
Warden is sometimes credited as having kick-started the TinyML subdomain of machine learning.
Either way, TinyML is getting big, and Warden is a big part of it.
Useful Sensors is Warden's latest venture.
They just launched a product called AI in a Box,
which they dub an offline, private, open-source
large language model for conversations and more.
Although it's not the first product Useful Sensors has created, it's the first one
that's officially launched. That was a good opportunity to catch up with Warden and talk
about what they're working on. I hope you will enjoy the podcast. If you like my work, you can
follow Linked Data Orchestration on Twitter, LinkedIn, and Facebook.

Yeah, I'm Pete Warden. I'm currently CEO of Useful Sensors. I was on TensorFlow when we last spoke at Google, working on TinyML and generally running machine learning on devices outside of the cloud.
All right, cool. So obviously you're not doing that anymore, and you have a new, well,
as we call it, venture, which you just mentioned: Useful Sensors.
So what was the motivation? What drove you to leave this otherwise, I guess for many
people, enviable position you had at Google and just start this new thing?

So there were a couple of reasons. One of them was, as you know, I was working on TensorFlow, an open source machine learning library from Google.
And specifically, I'd been focused on TensorFlow Lite Micro, which is aimed at running machine learning on embedded devices.
And I was working on that because I really wanted to see machine learning in the everyday objects in our world.
I wanted to be able to look at a lamp and say "on"
and have the lamp come on.
Or I wanted my TV to automatically pause
when I got up to get a cup of tea
without me having to find the remote every time.
And when I went to talk to companies that made light switches or TVs,
and I'd tell them all about this wonderful open source code
that they could get for free
and all the conferences and documentation
and examples and books,
they would hear me out.
And then at the end of it,
they'd usually say something like:
"That's great, but we barely have a software engineering team.
We definitely do not have a machine learning team.
So can you just give us something that gives us a voice interface, or tells us when somebody sits down in front of the TV?"

That was really the starting point for Useful Sensors: okay, can we actually build something, or some things, that just provide these machine learning capabilities to all of these consumer electronics and appliance and other companies who could benefit from it, but without them having to worry about the fact
there's machine learning happening under the hood.
Okay, I see.
Yeah, I mean, that's a valid motivation, but I have to say,
I also had the chance, you know, just doing a little bit of basic research,
let's say, before the conversation, to hear you speak in a video that's recorded
and available on the Useful Sensors website,
in which you also cite something else.
So you share an anecdote there, when you were talking to some person you know,
and he, or she, I don't know, actually,
mentioned what I usually refer to as the Google creepiness factor. It's
something that actually keeps coming back, not specifically for Google, I have to say,
but Google is one of the perpetrators, let's say, of this. So, you know, the usual thing:
okay, I had this conversation about, I don't know, XYZ,
and I had my mobile on me. And then the next day, or, you know, for the next week or month or
number of months, actually, I get bombarded with ads about this XYZ thing that I was talking about.
And this person asks you, like many other people ask: how is that possible? That obviously must mean that, you know, they're actually tapping into my conversation.
And well, I had the impression
that that was also a deciding factor,
let's say, that sort of motivated you to do that.
Because the thing that you just mentioned,
being able to offer a very, very simple interface
that people will be able to use,
I'm pretty sure you could have just as well done that wearing your Google hat as well.

Well, that's definitely true. That
was my second motivation: really trying to, as you say, tackle the creepiness head on. And what I was able to say about
that common perception, that big tech companies are spying on our conversations, is:
at least I know for Google, when I was there, I worked on that code. We weren't. But I couldn't prove it, because everything is internal inside Google.
So the second big motivation was to build systems that were simple enough that they could actually be verified and checked by a third party.
Somebody that, you know, consumers can trust. You know, I would love to get
the EU data regulators involved in this. I would love to get, in the US, people like Consumer Reports
or the FTC actually, you know, trying to check this, have nutrition labels.

So that, at least in part, seems to imply open
source, so that the software is something that people, and third-party auditors in general, are able to inspect
and then review.
But before we get to that part, I'd like to ask you: all right, I estimate that Useful Sensors must be about a year or a year and a half old by now, more or less.
Yeah.
Okay.
So where are you right now?
So what's your current plan and what's your status in the execution of this plan?
Do you already have products?
Do you have some people on board?
What are you working towards?
Yeah, so this might actually be a good time
to sort of show you this demo of speech-to-text.
And hopefully you can see that it's actually providing live captions
of this conversation.

Yeah.

So this is our latest product, which we're going to be launching on September 19th through CrowdSupply.com. Prior to this, we've also
launched the Person Sensor, which is a small board, which you can kind of see here, that provides an indication of whether there's a
person nearby, running entirely locally. At the moment, we're retailing this for $10.
And we've also launched a tiny QR code reader that you can get for $7. So we've actually been launching products to back
up our vision of running machine learning locally, and really being able to do it in a private and a checkable way.
Okay.
So, well, based on what I've just seen,
then it would seem that, well,
what you actually mean by products is actually chips, because that's what you just showed me.
So I guess in that sense, well, let's just frame it another way. So who is your audience?
Is it going to be people who buy those chips and then have the skills, the capacity, and also the
infrastructure, like the boards and the ability to connect those chips to those boards? And actually,
I guess you're also going to be needing sensors to make those products actually viable.
So, for example, if you're going to be able to detect a person's presence,
I don't know, you need a camera or, I don't know, something else
that you're able to leverage in order to make that happen.
So it sounds to me like they're actually more geared towards people
who have the ability, the skill set, and the rest of the equipment
to sort of integrate them into something bigger.
So not products in the, I don't know, more traditional sense,
like self-contained.
Yeah, and that's a good question.
So these modules, the Person Sensor and the tiny QR code reader, they are self-contained.
So they contain a small camera, as well as all of the pre-loaded software needed to
make it so that all you have to worry about with this is: you plug it in, and then you get, for example,
one pin that goes high whenever there's a person around.
So we're trying to take all of the capabilities of machine learning,
but wrap them up in a package that's no different than a temperature sensor or a pressure sensor,
and provide very, very simple interfaces.
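To make that single-pin interface concrete, here is a minimal sketch of how a maker might read such a detection pin from a Raspberry Pi. This is an illustrative assumption rather than Useful Sensors' official sample code; the GPIO pin number and wiring are hypothetical, so the actual product documentation should be consulted for the real pinout.

```python
# Minimal sketch: react to a sensor output pin that goes high
# while a person is detected, as described above.
# Hypothetical wiring: detection pin connected to GPIO 17.
from signal import pause

from gpiozero import DigitalInputDevice

person_pin = DigitalInputDevice(17)  # hypothetical pin choice

person_pin.when_activated = lambda: print("Person detected")
person_pin.when_deactivated = lambda: print("Person gone")

pause()  # keep the script alive so the callbacks can fire
```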
Now, you're correct that these are very much aimed at both makers and, at the
largest scale, consumer electronics companies, appliance companies. We've worked with people who make audio equipment, TVs, coffee machines,
vacuum cleaners. But the new AI in a Box, which we're launching, which I showed you doing
the captions, that's actually aimed at makers.
It's something that you can go in, you can sort of change the code and things like that.
But it does a bunch of useful stuff just kind of out of the box as well.
So it gives you live captions, as you've seen.
It lets you talk to a large language model that's actually running locally as well.
And we're going to be implementing live translation as well. So that is still something that
is, you know, maker friendly, but also has some immediate use cases out of the box.
Okay, so, you know, wearing my, I don't know, startup consultant hat, then I would probably say something
like: well, it sounds like your primary audience then, in terms of what's the biggest
market that you can reach, would actually not be end users, so consumers, but
rather electronics makers and companies that may actually have the need to consume what you
package as products en masse, and then will use it for their own purposes and resell it as part of something bigger.

Yes, I mean, that's really where we are. We're a venture-backed
business, and, you know, to justify that investment we need to grow and we need to have, you know,
volumes of units that we can sell. And as well, like I said at the start,
I really want to see these capabilities in everyday objects.
And so it's a really great opportunity to try and engage
with some of these large manufacturers
and see if we can actually improve these everyday objects.
Indeed. And I was actually going to ask you about your venture capital backing anyway, so
I will return to that point in a bit. But before we do, there's something else that sort of popped
up while listening to your line of thought about your potential audiences.
So, in my mind, there are a few of those,
let's say, big potential customers that you aspire to reach out to, that could actually also have the impact of reaching out to end users,
to consumers.
So, well, mobile device makers come to mind or, you know, companies that in whatever other
way make end products that are meant to be sold to consumers.
So, do you have access to any of those?
And I understand that you may not be at liberty to disclose things that you have not
entirely sealed off yet, so you don't have to mention names. But, you know, just broadly speaking,
are you in conversations with any of those types of companies?

We are. We actually have
evaluation agreements with several large consumer electronics companies, for example, to
check out, you know, the first stage of trying to get these into
their products. And I am hoping, you know, we're just launching this speech-to-text work. If you have a Pixel phone and you actually use it, you'll find that you can get automatic captions running on-device for any video or any other content that you actually play on the phone.
But that's not widely available in the sort of Android ecosystem.
So one of the applications I'd love to see is, you know, make everyone's Android phone able to have that capability too.
Yeah, indeed.
I mean, not a great number of people are Pixel owners.
And so if you somehow manage to get that kind of capability
in just, you know, every other phone out there,
I'm pretty sure that people
will make use of that. And it's also going to be easier to reach a greater number of people
that way, rather than, you know, starting to sell something standalone, like: oh, buy this
device X so you can have this thing. Because everybody already does have a mobile phone, it's much, much easier to reach people that way.

Yeah, that's very true.
So I guess that in order to make that happen, well, obviously the fact that you are a
well-known figure in the community, and, you know, you have been with Google and you have this
body of work behind you,
obviously helps in opening doors. But I'm guessing that, well, obviously you can't do everything
yourself, and opening the door is not enough. You have to be persistent, and you have to reach
the right people, and, you know, these things can take lots of time and energy and all of that. So
obviously you're not alone doing that. So what kind of team do you have working with you
at Useful Sensors?
Well, I'm very lucky to have my co-founder, Manjunath,
who was also one of the founders of TensorFlow.
He also helped found the CUDA compiler team at NVIDIA.
And he worked at Cerebras, the large chip company, for several years, starting their compiler team.
So he's been an amazing producer of technological miracles, I'll say.
He's really helped us, for example,
run these modern transformer models much faster
than other people have managed
especially on this particular device
we're using for the caption box.
And then I have a fantastic team of people that either me or Manjunath have worked with in the past,
from places like Google and Cerebras.
So there's just eight of us.
So we're still a pretty small team.
But I've been really blown away by how much they've been able to accomplish.
Okay, so that's a good segue to return to that venture funding topic. So even for a team of eight,
and even if these people are very much self-motivated
and able to work independently and all that,
well, they still have to pay bills at the end of the day.
And I also guess that you don't have many sales at this point because, well, you're an early-stage startup, which sort of directly implies that you must have some sort of funding.
Well, maybe you were able to bootstrap initially, but that can only take you so far.
So obviously, you must have venture capital backing, and I was wondering if you're able to disclose
who your backers are, and whether you've had a seed round or pre-seed?
Yeah, we had a seed round back in May last year, where we raised roughly $5 million. And our lead investor was Mike Dauber at Amplify Partners.
And we've also had investment alongside that
from James Cham at Bloomberg Beta,
Eva Ho at Fika Ventures,
and Anthony Goldbloom, who you might know from Kaggle, actually led an investment from AIX Ventures, a pretty new firm from him and Chris Manning and Richard Socher. So most of those people I've known for
over a decade, so it's actually been really great to work with them.
Yeah, well, I'll be honest, I don't know all of the names that you mentioned, but some I do know, and judging from them, I'm able to tell that,
well, it sounds like a sort of insider round, if you will. So people who are actually able to
understand the kind of work that you do, and therefore were motivated to be early
investors in that. And I'm guessing, from the sound of it all,
that you'll probably reach a point
where you will actually raise
a Series A. And if you
had like a $5 million
seed round, then, well,
I don't imagine you can have anything less
than 10 for your Series A.
Yeah, for the
Series A, that's probably going to
be about right.
We're still figuring out the timing of that. But, you know, it's been really good seeing the
reaction that we've had to the AI in a Box. I think that's going to be a flagship product for us as we sort of go into the next fundraising round.
Well, definitely in terms of awareness, because that's probably the one thing among your products that has the potential to reach more people directly.
Indirectly, I still maintain that the rest of your products
are eventually probably going to be used by more people,
but of the things that you can directly sell to end users,
AI in a Box is probably it.
Yeah, I think you're right.
Which brings me to, well, something completely different.
And, well, I'm going to actually ask you to geek out freely on this one if you want to,
because we didn't do that so far.
So even from the start, when you mentioned this AI in a box
and how you are able to somehow pack a large language model in a presumably
not so powerful chip, I got curious.
So I wanted to ask you if, again, you're able to share which model exactly it is that you're
using.
I'm guessing that probably it's something open source with a permissive license that actually enables you
to do this sort of application.
And I also wonder if you have tweaked it in any way,
whether by using something like LoRA,
which many people use these days,
or, I don't know, some kind of custom training
or fine-tuning or whatnot?
Yeah, so interestingly enough, a lot of what we focused on
is the speech-to-text, trying to get that real-time. And for running large language models locally, there's a very active community of people doing
that. So once we were able to get the real-time speech-to-text working,
we actually had a lot of choices around which large language models we could actually run locally. And
I'm actually not certain exactly which variation we are using in the current one,
because we've gone through, I think, things like Vicuna, Orca, a lot of stuff based on Llama 2. And we have been looking at doing some of our own
fine-tuning, but we've been able to get a long way with just providing prompt contexts for the interactions. So that has been, you know, very interesting: sort of asking the
large language model to be a bit of a comedian, or to, you know, try and give short answers,
or, you know, these other ways that you can actually steer it. So, yeah, anybody who's familiar with the large language model
work would sort of look at what we're doing and be like: okay, yeah, that's
using known models, and it kind of makes sense. And what's really new is that you're actually able to talk to it
in natural language and actually have it talk back.
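As an aside, here is a rough idea of what that kind of prompt-context steering can look like, sketched with the llama-cpp-python bindings running a local Llama-family model. The model file name and the persona text are assumptions for illustration, not the product's actual configuration.

```python
# Steering a locally running LLM with a prompt context ("system" message),
# in the spirit of the "be a bit of a comedian, give short answers" example.
from llama_cpp import Llama

llm = Llama(model_path="llama-2-7b-chat.Q4_K_M.gguf")  # hypothetical local model file

steering = (
    "You are a bit of a comedian. Keep every answer short, "
    "two sentences at most."
)

reply = llm.create_chat_completion(
    messages=[
        {"role": "system", "content": steering},  # the prompt context
        {"role": "user", "content": "Tell me about TinyML."},
    ]
)
print(reply["choices"][0]["message"]["content"])
```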
Okay, so this part, the speech-to-text, is a part that's,
well, not really proprietary? Because I was under the impression
that at least some part of your work was proprietary,
but I'm not so sure anymore, actually. So, well, something that you built in-house, let's say?

Yes, exactly. And I think
I shared a link on email earlier to the Useful Transformers library, which is a way of taking advantage of the neural processing unit that's actually available on these Rockchip
SoCs that we're using. They're mostly Arm Cortex-A, you know, sort of, they'd be familiar to anybody
who's used a phone SoC, except that they have this accelerator for neural networks on it. And we've actually been able to take advantage
of that to run our speech-to-text several times faster than would otherwise be possible.
Yeah, well, thank you. I was actually also going to get to that part, the hardware.

So it may be a little bit more expensive, but it also has more processing power. And it looks very much like a Raspberry Pi or phone SoC.
The biggest difference is that, you know, we picked this one because it does have this
neural network accelerator on it.

Yeah, yeah, makes sense. And well, since we sort of touched upon the open source slash
community issue, indeed, you did share that link to your GitHub, which I hadn't seen before, and
that sort of triggered me to look around again. And I have a very good explanation
for why I missed it: because it's nowhere on your site. So I wonder...

Yeah, no, I mean, that's true. That's partly because we have not launched the AI in a Box yet.
So, you may notice,
we don't have that up on the site yet either,
because that's going to be launching on September 19th.
And that's where we normally put the links
to the open source repos: in the product description.
So that's why it doesn't show up on the website yet;
we're talking about this sort of pre-release.
But you can find it if you go to github.com/usefulsensors.
You'll be able to see everything that we've open sourced there.
So presumably at this point in time,
the only contributors are going to be,
well, you and your team, basically.
Yes.
So I'm wondering if sort of community building,
let's say, and evangelizing and so on
is actually part of your plan as well.
And if yes, how do you intend to go about it? Or,
you know, maybe you're just going to say: oh, you know, we're just going to release
the product, and then we'll just sit back and wait, let people come to us. But I don't know.

Well, I think one thing to do is look at our projects on Hackster, both the ones we've done ourselves, in terms of sample code for things like the Person Sensor and the Tiny Code Reader, and the projects community members have built using these, you know, using our examples as a starting point, but actually,
you know, building their own systems on top of this stuff. So it's not so much that we're expecting a lot of engagement with the open source repos that we've made
available. That's sort of almost like the foundation layer. What we want people to be
able to do is, for example, take the speech-to-text that we do through our AI in a Box.
And one of the things you can do is use it like a USB keyboard.
So it just sends the text that it's hearing to whatever device it's plugged into,
as if it were somebody typing at a keyboard.
So that means that it's actually quite
a nice way to integrate with something like a Raspberry Pi, where the Pi can actually then,
you know, control a robot, or a sculpture, or something like that, using voice, you know, speech-to-text, speech recognition.
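Since the box simply types what it hears, integrating it needs no special driver. Here is a minimal sketch of the receiving side, assuming the transcript arrives as ordinary keyboard input on the Pi: read lines from stdin and match keywords. The command words and motor functions below are hypothetical placeholders, not part of the product.

```python
# Minimal sketch of acting on text "typed" by the AI in a Box over USB.
# Each recognized utterance arrives as ordinary keyboard input, so a
# console program can just read it line by line from stdin.
import sys

def move_forward():
    print("robot: moving forward")  # placeholder for real motor control

def stop():
    print("robot: stopping")        # placeholder for real motor control

COMMANDS = {"forward": move_forward, "stop": stop}  # hypothetical vocabulary

for line in sys.stdin:
    for word, action in COMMANDS.items():
        if word in line.lower():
            action()
```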
So we're hoping that these things we're doing
will be useful building blocks for people to build their own projects around.
And that's where we've been seeing a lot of community engagement already
with our existing products and where we're hoping to see more.
Yeah, well, I would add to what you just said that I can also picture this kind of scenario,
but I think an important building block to make that happen is actually something you also already have:
having this large language model somewhere
in the middle. Because that can be your interface to whatever sort of API or programming language
your device in the back end can communicate with. So you can say something in
natural language,
like, I don't know, turn on the lights, or whatever. And, well, obviously the
speech-to-text part will get you the textual command.
And then you can pass that
to the large language model,
which may be able to translate that to an API call,
which can actually operate on the physical switch and
do what you wish it to do.

Yeah, exactly. You know, that's a big reason why we've included
the large language model. It's fun to chat to and ask it to tell you jokes, and,
you know, you can ask questions, and sometimes it will make stuff up,
but a lot of the time it will tell you something factual.
But it gets really interesting once you start to control things through the soup of natural language that we use
when we're having conversations with each other.
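To illustrate the pipeline just described, here is a loose sketch: the speech-to-text output is handed to a local LLM that maps it to a structured action, which then drives a smart-light HTTP API. The model file, the JSON schema, and the endpoint are all hypothetical assumptions for the example, not anything Useful Sensors has described.

```python
# Sketch: natural-language command -> local LLM -> structured API call.
import json
import urllib.request

from llama_cpp import Llama

llm = Llama(model_path="llama-2-7b-chat.Q4_K_M.gguf")  # hypothetical local model file

def to_action(utterance: str) -> dict:
    """Ask the model to translate a spoken request into a JSON action."""
    prompt = (
        "Translate the user's request into JSON with keys 'device' and "
        f"'state' ('on' or 'off'). Reply with JSON only.\nRequest: {utterance}\nJSON: "
    )
    out = llm(prompt, max_tokens=64)
    return json.loads(out["choices"][0]["text"])

action = to_action("turn on the lights")  # e.g. {"device": "lights", "state": "on"}

req = urllib.request.Request(
    "http://lights.local/api/state",  # hypothetical endpoint
    data=json.dumps(action).encode(),
    headers={"Content-Type": "application/json"},
)
urllib.request.urlopen(req)  # operate the physical switch
```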
Yeah, indeed.
So overall, it sounds like your positioning, let's say, would be that of an ecosystem creator.
So maybe you have sort of started out with a couple of demo applications, or maybe even not so demo, like actual applications.
But it sounds like you're positioning Useful Sensors as an intermediary, as someone who provides the infrastructure
that other people and other companies
will be able to use to create applications and products.
So, well, first of all,
I wonder if my understanding is indeed
what you're thinking of.
And then if it is,
what are some favorite current applications
of yours, and what are some applications that you would like people to come up with?
Yeah, and I think that's a good way of describing our positioning: we want to enable other people to take advantage of all of these ML capabilities
without having to kind of go down the machine learning route themselves. One of my favorite applications of the Person Sensor is actually by Thomas Burns.
He created a robot called Alexatron, and its eyes are actually controlled using a Person Sensor to kind of track your face. So it's just a little
detail, but it's the sort of thing that's always fun to see.

And on the commercial side,
Blues Wireless have actually been doing some great work around: hey, is there a person here or not, for remote sensing. So just being able to tell,
you know, for safety reasons, if somebody is
in an area where, you know, there could be danger. And we've
had people just using... we actually had one person out in New York who's been looking at using the Person
Sensor. He's an actor, and he has to pay somebody to train a spotlight on him when he's
doing solo performances. So using the Person Sensor, he's actually able to, or he's trying to,
sort of use that to automatically control the spotlight, which is not something I would have ever thought of,
but it's the sort of thing that is really exciting
once this stuff gets out into the world.
Yeah.
Well, that's also a good trigger for me
to ask you another sort of closing question, let's say.
Well, it's a good example of,
well, what happens when you get new technological capabilities and innovation. I mean,
yes, on the one hand, it's sort of cool that this person is able to do that. On the other
hand, somebody might say: well, you know, what about the person who used to get paid to shine that spotlight on him when he was performing?
So what's your take on that?
I mean, obviously, it can go either way.
And it's a much, much broader question than the specifics of what you do.
But still, I wonder, what are your thoughts on that?

Yeah, and I think, you know, the whole question of:
hey, is AI going to, you know, get rid of jobs?
From my perspective, I do see, you know,
some jobs may go away. It's hard to deny that. You know, even when we're looking at things
like driverless cars, that's likely to have a big impact there. But
I tend to be fairly optimistic that this opens up the door to people actually being able to
take other jobs. And I'm really trying to be quite thoughtful about, you know, the stuff
we're doing, to see if we can actually make people's jobs easier, versus trying to completely replace them.
But yeah, you're right.
That's a really big question with anything around innovation.
If we're kind of, quote unquote, making things more efficient,
what are the societal impacts of that?
Obviously, it's not something either you or myself or anyone on their own
can tackle, but it's always something to keep in mind, at least.

Yeah, and a big part
of what I'm trying to do is actually get these technologies into people's hands, so that it's not just a bunch of engineers who are making
these decisions about what we should do. You know, people can actually try these models for
themselves and see, for example, how useful but also flawed the current generation of large
language models are. You know,
if you ask them about somebody they don't know about, they'll happily tell
you: oh yeah, that person's a criminal. And so it takes a little... well, it gives people a much better sense of it. You know, I'm hoping, I don't want us, you know,
technocrats to be the ones making these decisions.
I want a well-informed public who are actually able to say: hey,
this is what we want.
Yeah. Well, that's, you know,
that's an opener for another, I don't know,
45-minute conversation on its own.
So let's leave it at that for the time being.
Well, thanks for dropping in, really,
and for letting me and anyone who may be listening know about what you're up to.
It sounds really interesting.
And, well, good luck with everything.
No, thank you so much, George.
Thanks for sticking around.
For more stories like this, check the link in bio and follow Linked Data Orchestration.