This Week in Startups - Reverse-engineering autonomy in humanoid robots with Sanctuary AI CEO Geordie Rose | E1832
Episode Date: October 21, 2023
This Week in Startups is brought to you by…
IntouchCX. Looking for ways to make your startup more efficient? IntouchCX has a ground-breaking suite of AI-powered tools for end-to-end optimization to give your business the edge it needs to thrive. Get started with your free consultation at http://intouchcx.com/twist
Fount. Do you want access to the performance protocols that pro athletes and special ops use? With Fount, an elite military operator supercharges your focus, sleep, recovery, and longevity, all powered by your unique data. Want a true edge in work and life? Go to fount.bio/TWIST for $500 off.
.Tech Domains has a new program called startups.tech, where you can get your startup featured on This Week in Startups. Go to startups.tech/jason to find out how!
* Today’s show: Sanctuary AI CEO Geordie Rose joins Jason for an incredible interview on the complexities of using AI to train robots (11:09), developing large behavior models (17:53), the 'lights out' moment in manufacturing (42:52), and much more!
* Time stamps:
(0:00) Sanctuary AI CEO Geordie Rose joins Jason
(3:42) Sanctuary AI's approach to robotics and motivation behind creating humanoid robots
(6:05) The human hand's integral role in AI-driven robot development: Planning, reasoning, and understanding the world
(11:09) Moravec’s paradox and the challenges of instilling perception in robots
(16:40) IntouchCX - Get started with a free consultation at http://intouchcx.com/twist
(17:53) The significance of "Micro-Policies" and developing large behavior models
(22:59) Exploring human cognition and large behavior models
(28:52) Fount - Get $500 off an executive health coach at https://fount.bio/twist
(30:23) Sanctuary AI’s Phoenix robot, robot training, and use of large language models
(37:46) Robotics in automotive manufacturing
(41:43) .Tech Domains - Apply to get your startup featured on This Week in Startups at https://startups.tech/jason
(42:52) The 'lights out' moment in manufacturing and the challenge of regulatory capture in AI
(56:01) Humans’ problem-solving nature and roots of technological fear
* Check out Sanctuary AI: https://sanctuary.ai/
Follow Geordie: https://twitter.com/realgeordierose
* Check out Bill Gurley’s 2,851 Miles: https://youtu.be/F9cO3-MLHOM?feature=shared
* Read LAUNCH Fund 4 Deal Memo: https://www.launch.co/four
Apply for Funding: https://www.launch.co/apply
Buy ANGEL: https://www.angelthebook.com
Great recent interviews: Steve Huffman, Brian Chesky, Aaron Levie, Sophia Amoruso, Reid Hoffman, Frank Slootman, Billy McFarland, PrayingForExits, Jenny Lefcourt
Check out Jason’s suite of newsletters: https://substack.com/@calacanis
* Follow Jason: Twitter: https://twitter.com/jason Instagram: https://www.instagram.com/jason LinkedIn: https://www.linkedin.com/in/jasoncalacanis
* Follow TWiST: Substack: https://twistartups.substack.com Twitter: https://twitter.com/TWiStartups YouTube: https://www.youtube.com/thisweekin
* Subscribe to the Founder University Podcast: https://www.founder.university/podcast
Transcript
I want to just make this very clear that my perspective on AI and automation is that there's an upward spiral.
When you have more energy, you have more intelligence, you have more capability.
These drive all the metrics of human flourishing up.
They don't take.
So when we think about the answer, when will you get lights out manufacturing?
I think the answer is never, because people will always find new things to do with the tools that we've built, even very powerful tools
that can think and maybe are even self-aware.
These will only increase the number of jobs, they increase wages, but there'll be different
kinds of jobs.
There'll be the sorts of things that maybe we can't even imagine now.
This Week in Startups is brought to you by IntouchCX.
Looking for ways to make your startup more efficient?
IntouchCX has a groundbreaking suite of AI-powered tools for end-to-end optimization to give
your business the edge it needs to thrive.
Get started with your free consultation at intouchcx.com slash twist.
Fount.
Do you want access to the performance protocols that pro athletes and special ops use?
With Fount, an elite military operator supercharges your focus, sleep, recovery, and longevity,
all powered by your unique data.
Want a true edge in work and life?
Go to fount.bio slash twist for $500 off.
And .Tech Domains has a new program called startups.tech,
where you can get your company featured on This Week in Startups.
Go to startups.tech slash jason to find out how.
Hey, everybody, welcome to This Week in Startups.
We've been focused a ton on AI this past year.
Of course, we talked about it over the last decade on the show,
but things have heated up with language models,
and, you know, the very forgotten
category of startups is, of course, robotics. We see once in a while on the internet a trending
video. And the trending video tends to be one of Boston Dynamics' robots doing a backflip,
or we see maybe some surgery being done on a grape. You've seen all these viral videos.
But the idea of humans leaving the factory floor and going and doing things in the real world,
we don't see many startups doing that. We have one in our portfolio, Cafe X,
making coffees at SFO airport right now.
And of course, our friends over at Tesla are making Optimus, and there's another startup
called Figure.
They're working on a humanoid robot.
Sanctuary AI is today's guest.
They're another startup working on this problem, and they're specifically focused on
building robots with general intelligence.
What does this mean?
Well, it's not verticalized, and they're not just making a cup of coffee, but what if
these robots could solve problems in the same way we
do as biological creatures, as human beings. And if that works, well, that's going to have the
economic impact of humanity. And it's going to go well beyond just the steam engine. And we have
the founder, or I should say the co-founder and CEO, of Sanctuary AI on the program. His name is
Geordie Rose. Welcome to the program, Geordie. Thanks for having me. Great name. I am reminded of the
Mark Knopfler lyric from
the amazing song
Sailing to Philadelphia where he says, I am
Jeremiah Dixon, I am a Geordie boy.
Do you understand the reference,
Geordie boy?
I do. I do. Yeah.
Yeah.
So let's talk a little bit
about the company and
I know you were founded in 2018, so you've been
working on this for a while. You've raised close to
100 million bucks.
Where are you at with building this
humanoid robot?
I would love to see the latest.
I should start by saying that our approach to the problem and the reasons for us working on it
are slightly different than most people who work in robotics.
For us, the motivation for doing it was a belief that human-like intelligence and more generally
the intelligence of animals, which is kind of our model for what intelligence means,
is very intimately tied to our presence in the world.
We have a body, we are a thing.
We experience the world through our
senses, we develop understanding of it through interacting with it, and then we act on it to
achieve our goals.
All of those things are very difficult to do if you're not actually physically present in
the world.
So the starting point of this, which actually goes back more than a decade now through
two different companies, was to explore this idea that intelligence, by which I mean
general intelligence, emerges as a consequence of having to deal with the real world.
The real world, you never see the same thing twice.
You have to be able to generalize from your previous experiences to new experiences.
You have to be able to understand the common sense ways that the world is.
So we've been building software, which you could call general intelligence or AI,
but it's also control systems for robots.
And we've always viewed the problem of artificial general intelligence through that lens:
for us, a true general intelligence can be thought of as a control system for a robot
that converts what it sees, hears, touches, feels about the world into actions that are intended to reach goals.
So for us, the robot is somewhat of a means to an end.
And because of that thesis, we focused almost exclusively on a very hard but very fundamental problem,
which is the building and use of hands.
So much of the humanoid robotics videos are performative.
They show robots doing things, but they're not valuable things.
And for us, I think that the key value of doing this is to understand how an entity,
a robot or a person, understands the world well enough to be able to manipulate it with its
five-fingered opposable thumb hands.
You know, I believe that the hand
played a big part in our technological evolution and also in the development of language,
which are related things. So how did that happen? How did the hand play a role in language?
I'm curious. Was it writing or the ability to hold a pen? Speaking. So how does a hand help you speak?
Yeah. So, although this is speculative, there's a lot of evidence that
the earliest spoken language was very strongly connected to the things that our hands do,
like point, touch, grasp, and so on.
And some of the evidence for that is in neuroscience where the part of your brain that controls
the grasping or the use of the hands overlaps with your language center.
These things are not disconnected.
And when you actually try to build a system that touches, feels the world,
and can interact with it in the way we do, you see this explicitly,
that the cognition, you think of it as the domains of intelligence, many of them and maybe all
of them are required in order to do something with your hands. It's a remarkable thing that planning,
reasoning, logic, all of these things are connected to the way that we interact with the world
through our hands. That's fascinating. Like when you were saying that, I was thinking, so I put my
hand on my chin. And then, if you and I were navigating the world, if we were,
you know, early settlers somewhere, we might point towards the direction we want to go,
or I might put my hand on my chest to refer to myself, or I might put my hand out and my
palm up to refer to you in some sort of gracious way. Is that what we're referring to,
this sort of instinctual thing that happens with our hands as we're talking?
Yes. Our view and the position that we've taken is that the hands and their use and the
mind are interwoven in an inseparable way in people.
So if you want to understand human-like intelligence, the kind of intelligence that we have,
the hands are the appropriate starting point.
And that's why we focus so much on them.
Now, you asked to see some things.
I can actually show you one of the hands.
All right.
So I see on the screen here, you've got, yeah, very interesting looking hand,
five digits, yeah, four fingers and a thumb and a palm.
And it looks like something out of Terminator, but a little more elegant, in fact.
Well, I think that I would not characterize it that way.
I think that the way that we imagine this hand is that it's the best that the technologies the global community knows how to build can achieve.
It's the closest that we can get to human hands.
There's a lot of things in this hand that aren't immediately obvious just by looking at it.
And those are mostly about the sensors.
Our sense of touch is a very important thing for our intelligence and how we are in the world.
We tend to take it for granted because it's always there.
And when we look at screens and things, you know, people are very visual and they think about the world in terms of seeing, which is fine.
But there's an interesting observation that seeing is about the future.
It's about planning because the things that you see are away from you.
So, for example, if you look at a cup and you want to pick it up, the part of your brain that plans thinks about the future, but touch is a little different.
Touch is an immediate thing that's in the now.
Touch doesn't have foresight.
It's all about the present moment.
And when you make contact with the world, say you're sitting in a chair or you're picking something up or you're turning a Rubik's Cube in your hand, the sense of now is intimately connected with the touch sense.
And without it, you can't be who you are.
So this is an important thing for building robots,
is that touch is not a second-class citizen
if you're trying to build a system
that behaves and thinks like we do.
And so these hands are covered in very sophisticated touch sensors
that allow them to feel the world, something like we do.
Yeah, so when we're looking at each digit,
I guess we have a couple of knuckles.
And so the tip of your finger has one
pad, then that middle
knuckle, I guess, has another
pad, and then there's a longer pad
in that third spot. So if you're looking at
your own hand, you have those sort of three segments
of a finger. They each have a pad on this
robot, and the thumb obviously
has the same configuration. And I guess
when they touch each other, that's telling
it something's hitting that pad. Is that correct?
It's more general
than that. Yeah. So the
sensors in your hand
are not just about a
yes-or-no question of whether you're touching
something; they're very rich. You can feel temperature, you can feel if something's sliding past
your fingers, which is very important when you're trying to hold something or turn it in your hand.
Imagine trying to put a key in a lock and turn it. A lot is going on in that thing in your brain,
and a lot of it is driven by touch. If you didn't have the sense of touch, it would be very
difficult to insert a key into a lock and turn it, even something as simple as that.
Because of resistance, right? You have a certain resistance on either side,
or the bottom of your finger where it's touching the key.
I'm imagining this as you're saying it.
Yeah, the way that we do things in the world, we take it for granted.
There's a thing called Moravec's paradox, where the things that we take for granted and are easy
are actually some of the hardest problems that there are.
And the reason they're easy is that we've had a billion years of evolution to create a system
that is fine-tuned to be able to deal with things like, you know, picking things up,
putting things in things and things like this.
But the reason why we have AI systems that can write at the level of GPT-4 or create images from scratch that are as beautiful as any human artist could draw, the wonders of the digital age,
but we don't have a robot that can do laundry, is that doing laundry is a fundamentally much more difficult problem than any of the ones that modern AI has managed
to master.
Yeah, it's fascinating when you think about how complex this is.
And that paradox you mentioned, that seems like, you know, like a fascinating evolutionary
moment.
These systems are so complex that they must be automated.
Because to actually, with cognition, to try to think, okay, I'm going to have to give
some resistance as this key goes into it.
And I'm going to have to feel like click a couple of times and then I'm going to have to
twist it left, but I'm going to need to put more
pressure on the inside of my index finger versus my thumb.
And if I put too much pressure, I'm going to break the key off in the lock.
I mean, it's incredible when you think of all of that occurring.
And it's occurring in just an automated fashion.
It's just a chunk of a task.
Open the door.
It's not even one task either.
It's probably open the door, which is the key, getting taken from your pocket,
being put into a lock, twisting, open the door, closing it the whole shebang.
It's just abstracted into one instruction set, huh?
Well, that's an interesting phrase, and I'm glad you mentioned it because that's the way that most serious embodied cognition efforts work is they have an idea of an instruction set, which is very similar to the way that processors work.
I worked the first half of my career building computer systems, and the marvels that we've built on the computing side are fundamentally based on a very non-trivial fact: that every program that you
could write boils down to the execution of roughly order 100 different tiny little programs that
just happen in different orders.
So every program that you can write on a computer is basically composed of only about 100 building
blocks in modern processors.
The reason you can do that is that processors have a natural way to turn the analog nature
of the world into a digital character that allows error correction, which is the fundamental
reason why you can do all the things that we do today. In the computer, there's a thing
called a transistor, which is the basis of this digitization, going from things that are just
any number at all when you measure them, like voltages and currents, to something that's only a
zero or a one. In motion, that is, the taking of actions in the physical world, you need to be
able to find a way to do that same thing. And so what we've done is created this type of
instruction set, which is a very small number of building blocks that you can compose
in different orders to create massive complexity of tasks and potentially all of them.
So if you think of a robot moving through the world or a person as a program,
then you can imagine that any program could be written in, say, maybe just a hundred things in
different orders.
And that's what we're trying to do here is to figure out what those hundred things are.
And then we use a technique called task planning, which is the idea that given a goal, like say,
I ask the robot in natural language to do
something, the robot can figure out how to sequence the things that it knows how to do in order
to achieve the goal and thereby achieve general intelligence. Because if I can ask the robot
to do anything at all, like in the human sphere, and the robot can actually perform the task,
then it would be fair, I think, to think of these things as reaching the goal of having
general intelligence. By the way, I should mention that there's a concept that David Chalmers,
who's a philosopher, introduced called a philosophical zombie,
where a system can have the appearance of being like us in the sense that it can do things like we do,
but it doesn't have the first person conscious experience that we have.
So there's a lot of mysteries about the relationship between being able to actually achieve goals,
do things, and whether that is related or different than the experience we have of being people,
this thing that it feels like to be a thing.
That's a deep, deep mystery.
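Editor's sketch of the task-planning idea described above: a goal is reached by sequencing a small library of building blocks. This is a toy illustration only, not Sanctuary AI's system. The `MicroPolicy` type, the four door-opening building blocks, and the fact names such as `holding_key` are all hypothetical, and a plain breadth-first search stands in for whatever planner a real robot would use.

```python
from collections import deque
from typing import NamedTuple


class MicroPolicy(NamedTuple):
    """A hypothetical building block: what it needs and how it changes the world."""
    name: str
    requires: frozenset  # facts that must hold before it can run
    adds: frozenset      # facts that become true afterwards
    removes: frozenset   # facts that stop being true afterwards


# Hypothetical library for the "open the door" example from the conversation.
LIBRARY = [
    MicroPolicy("take_key_from_pocket", frozenset(),
                frozenset({"holding_key"}), frozenset()),
    MicroPolicy("insert_key", frozenset({"holding_key"}),
                frozenset({"key_in_lock"}), frozenset({"holding_key"})),
    MicroPolicy("turn_key", frozenset({"key_in_lock"}),
                frozenset({"unlocked"}), frozenset()),
    MicroPolicy("push_door", frozenset({"unlocked"}),
                frozenset({"door_open"}), frozenset()),
]


def plan(start, goal, library):
    """Breadth-first search for the shortest sequence of building blocks
    that turns the start facts into a state containing every goal fact."""
    start, goal = frozenset(start), frozenset(goal)
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        state, steps = queue.popleft()
        if goal <= state:
            return steps
        for mp in library:
            if mp.requires <= state:
                nxt = (state - mp.removes) | mp.adds
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, steps + [mp.name]))
    return None  # no ordering of the building blocks reaches the goal


print(plan(set(), {"door_open"}, LIBRARY))
```

The same four building blocks could be reordered to serve other goals, which is the point of keeping the library small. A goal that no ordering can reach returns `None`.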
All right, listen.
Efficiency needs to be top of mind for every founder in 2023.
Fundraising is drying up, so you need to extend your runway.
And one great way to do that is automation.
But it's hard to apply automation in your day-to-day operations, isn't it?
So here's an amazing solution.
IntouchCX provides easily integrated automation tools for customer support.
You're wondering how it works?
Well, let me tell you.
IntouchCX provides automated and live chat, email, and voice support.
This eliminates unnecessary process and cost, and it will make you faster.
IntouchCX will streamline your customer support process, cut back on repetitive and time-consuming
tasks, and increase productivity by 30%.
And it's going to simplify your business.
So revamp your workflows with IntouchCX.
IntouchCX partners are experiencing 45% average cost savings in customer support ops so far.
Find out how IntouchCX can improve your startup's efficiency.
Get a free consultation with their automation experts and get started at intouchcx.com slash twist.
That's intouchcx.com slash twist.
Yeah, consciousness, the big C is for philosophers and religious people.
What is consciousness?
Is it just some illusion that we're having in this brain of ours,
which is a collection of a bunch of subroutines as you're sort of alluding to here?
or is it, you know, this God molecule in our brains making us sentient and driving us to do things?
I guess this is one of the exciting things about AI is that we're in some way, or in your case, quite literally,
trying to deconstruct and then reconstruct what is happening in cognition.
But it has to start with, hey, pour me a glass of milk.
So it has to know what milk is.
Easy enough to do now with visual computing.
But pour, okay, we know what that word means.
It's moving some liquid from one place to another,
and then we have to then, of course, make it do that accurately.
So if we were breaking down a task like that,
and you said there's about a hundred things you're teaching it,
what are those little subroutines or those micro-behaviors?
You used a term for it.
What was the term you used?
We call them micro-policies.
So in the world of...
Micro-policies, interesting.
Yeah, in the world of reinforcement learning,
which a lot of this is grounded in.
We came up through the reinforcement learning school
of thinking about cognition.
A policy is an action that you take from a current state.
So it's a prescription for how you act,
given the observation of the world that you have.
So these micro-policies are a collection of very specific types of behaviors,
like say, for example, turning a key in the lock,
that we train individually in isolation from any other use.
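Editor's sketch of what "policy" means in the reinforcement-learning sense used here: a rule that maps the current observation to the next action. Everything in this example is assumed for illustration, including the `Observation` fields, the 0.8 torque threshold, and the action names. A trained micro-policy would be a learned model, not hand-written rules like these.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Observation:
    """Hypothetical snapshot of what the robot senses right now."""
    key_in_lock: bool
    torque_felt: float  # made-up normalized reading from the fingertip sensors


def turn_key_policy(obs: Observation) -> str:
    """A hand-written stand-in for one micro-policy: observation in, action out."""
    if not obs.key_in_lock:
        return "insert_key"
    if obs.torque_felt > 0.8:  # assumed threshold: too much resistance, back off
        return "ease_off"
    return "rotate_clockwise"


print(turn_key_policy(Observation(key_in_lock=True, torque_felt=0.2)))
```

The policy is called again on every new observation, so touch readings such as resistance can change the action from one moment to the next.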
So the way that works is that we take the robot and a person who is teleoperating the
robot, which is a process of a person controlling and being kind of immersed in the robot,
receiving the senses of the robot and moving the robot through a rig, which is another
type of robot that the person is strapped to.
So the person moves the robot to accomplish the task.
because a person knows what it means to pick up a key, put it in a lock, and turn it.
And we collect order hundreds of episodes,
which are instances of solving that problem.
And then we use that to seed a thing we call a large behavior model.
So large behavior model is much like a large language model,
except the fundamental data is the data of experience.
It's vision, audio, proprioception,
which is the information from the servos in the robot, where it is and so on,
how fast it is moving, and touch, haptics.
So if you can use the same idea where you take a bunch of data and instead of predicting
the next word or token in a text prompt to response, you take the past, which is the things
that have happened to the robot and you predict the future, which are the sort of analog to
the large language model predicting the next tokens.
but in an interesting twist, the predictions of proprioception are predictions of where you will be, how you will move.
And you can then send those predictions to the actual motors and the motors can move.
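Editor's sketch of the "predict the future, then send it to the motors" loop described above. To keep it tiny, a nearest-neighbour lookup over recorded episodes stands in for the large learned model, which is a deliberate simplification and not how a large behavior model works internally. The two-joint episodes, the function names, and the starting pose are made up for illustration.

```python
import math


def build_dataset(episodes):
    """Turn each episode (a list of joint-angle tuples) into (state, next_state) pairs."""
    pairs = []
    for episode in episodes:
        pairs.extend(zip(episode, episode[1:]))
    return pairs


def predict_next(pairs, state):
    """Predict the next proprioceptive state as the recorded successor
    of the demonstration state closest to the current one."""
    _, next_state = min(pairs, key=lambda pair: math.dist(pair[0], state))
    return next_state


def rollout(pairs, state, steps):
    """Repeatedly predict the next state and treat the prediction as the command."""
    trajectory = []
    for _ in range(steps):
        state = predict_next(pairs, state)
        trajectory.append(state)  # on a real robot this would be sent to the motors
    return trajectory


# Two made-up teleoperated episodes, each a sequence of (joint_1, joint_2) angles.
episodes = [
    [(0.0, 0.0), (0.1, 0.0), (0.2, 0.1), (0.3, 0.2)],
    [(0.0, 0.1), (0.1, 0.1), (0.2, 0.2), (0.3, 0.3)],
]
pairs = build_dataset(episodes)
print(rollout(pairs, (0.0, 0.02), 3))
```

Starting near the first episode's initial pose, the rollout follows that demonstration. A real model would generalize across episodes and condition on vision, audio, and touch as well.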
So one of the most fundamental turning points in my professional career happened about 10, 12 years ago,
when I and some colleagues read
Jeff Hawkins' book called On Intelligence,
which was really the first thing that I read that made sense
about a potential model of human cognition.
And central to that story was the idea
that our brains are predictors,
is that we imagine the future,
and then we implement the imagining.
So if I decide to pick up a cup,
my brain is predicting how my motor signals will fire,
and then it sends those predictions to my muscles,
and then I perform the task.
So these large behavior models that we and others are working on now
are of this sort is that they predict the future
based on the statistical properties of the data that they've looked at,
and then they execute the tasks, and they work quite well.
So for a human being, we're going to perform a task.
We're going to pour this glass of milk.
We will, in our minds, and we do this either consciously
or maybe even subconsciously, think: okay, I'm going to
pick up the glass, I'm going to pour the milk, I'm not going to spill it, it's going to pour
in some kind of an arc, I'm going to watch it fill, I don't want to splash, I'm going to
stop pouring at a certain point, and you kind of visualize this movie in your head, this potential
future behavior. And so our minds are so powerful, they can actually essentially play a scenario,
almost like a screenplay, like a little vignette, and then our muscles actually go play that
routine? Is that the concept here in terms of intelligence of what happens in our brains?
Yeah. So an analogy would be, let's say you take a piano keyboard and each of these little
micro policies that we're talking about are one of the keys on the keyboard. Your brain,
your mind, because I want to make a distinction here is that I've come to believe very strongly
recently that the mind is a creator of stories about the future.
you, the awareness that you are, your conscious presence is not your mind.
It's a different thing.
You touched on it briefly.
I don't have any proof of this, but I'm much more of the mind that there's a big mystery there about what it is to be the thing that is you.
But it's not your mind.
Your mind is a machine, just like your heart.
And the job of your mind is to produce stories.
So in the analogy to the piano, think of the mind as creating sheet music.
And then the sheet music is automatically put on the piano and the keys are played and you hear a melody or a song. In the analogy, the song is the behavior, like, for example, picking up a glass of milk and drinking it. All of the behaviors that we exhibit in this model are all different songs that are generated by pressing the keys in different orders and with, you know, different styles. So the mind is the creator of the sheet music.
Brains are always doing this.
And then it's played on your body like a song.
And our conscious perspective is sometimes not aware that these are separate things.
You know, we have difficulty introspecting our minds and our behaviors for a variety of reasons.
But I think that after working on this problem for a long time and seeing the
synthetic analogs of us, how it actually works in robots,
I think this is a good model, is that your mind is a machine for creating sheet music
that is immediately played on the instrument, which is you,
and your awareness is a separate thing that kind of watches this.
And sometimes we get confused and we think we are the mind, but I don't think we are.
I think we're something separate.
So there's the mind, and then there's the machine.
And this machine is going out, conceiving of these tasks, executing on them,
playing the sheet music, running through the script.
I use the analogy of a film.
You're using the analogy of piano.
But the script gets played.
The sheet music gets played.
It happens.
But our awareness that I am a human, I am Jason, you're Geordie.
We are on a podcast.
We're having a conversation.
I'm trying to understand what you're doing.
You're trying to understand my questions.
And then there's going to be 100 or 200,000 people who listen to this.
And they're also going to try to do that.
That's consciousness.
the awareness of each other and ourselves and our place in the universe and that there is even a
universe.
These are two different things.
But for some reason, we perceive these two different things that are occurring, the mechanical
execution of tasks through this very interesting process, which you're now recreating,
is different than consciousness.
And consciousness, who knows when we're going to ever figure that out or if we can figure it
out, this idea that we're aware that we are a living being.
But we can figure out, at least at this point in time, it feels to you like we're going to figure out and we're close to figuring out how our brains break down complex tasks and do them so elegantly.
Am I reframing, am I repeating that back to you correctly?
Yes, that's right.
There's a spiritual leader, I suppose you could say, called Eckhart Tolle, who refers to the first thing, which is not knowing that you
are different than the plans that your mind makes, as being unconscious, is the phrase he uses.
And I think that it's a natural state of people to not be aware of the mechanical following
of scripts, which is most of our behavior, and that's the sort of thing that there's a shot
at doing in a robot.
So I'm fairly sure that we can build machines that can do all work, like all of it, at least
as it's currently, you know, understood, things like automotive manufacturing, logistics,
bringing parcels to your home, I think that all of those things are within scope to do within,
say, a decade, at least have the capability. So this idea of building a thing that appears as
though it's intelligent and does all the things that you'd want, that's within reach. But the thing
that I'm really kind of taken with is this other notion. You know, I used to be a theoretical physicist
a long time ago, and I worked on foundational problems in quantum mechanics and general relativity.
And I've always had, you know, at the base of who I am, I've always been interested in
understanding how things work at some fundamental level. And it's always bothered me that
all we ever experience of the world is this first person thing, the feeling of being you in
the moment. But we don't understand that at all. And I think that this neglect of what is probably
the most central direct experience we have of the world means that there is a discovery waiting
to be made about the relationship between our experience and notions of space and time.
And I think that this project is somehow, in some ways aimed at that, is that it sort of starts
from a weird spot because you see these robots and there's mechanical hands.
And then I'm talking about, you know, some fundamental relationship between the emergence of space and time and how we perceive it through our conscious perspective.
And these seem to be not related, but I actually think that they are very tightly related.
You see these blue light glasses I'm wearing, I'm not wearing for style, although they are very stylish.
They've totally changed my life.
Why?
I started having headaches, right?
And had eye strain.
So I got these blue light blocking glasses that do a little magnification because I need readers.
Yeah.
I look nuts.
But my eye strain's gone down.
My headaches have gone away and I'm sleeping better.
Do you know how I got on this?
I got on it because I now have a health coach.
Who's my health coach?
It's Fount.
F-O-U-N-T.
It's a health company that's created custom health and performance programs that are tailored to
your body, obviously, also your goals, and they take into account your lifestyle.
My coach is incredible.
I text with them all the time.
They did blood work for me.
They check out my wearable data.
and we do weekly calls to see if I'm on track and getting the results I want.
They also told me about some supplements I should be taking based on the blood work,
and they do it at a fraction of the cost.
We upgraded my diet.
I'm doing a little more protein.
We've optimized my sleep.
That's great.
I got the supplement packs.
I feel great.
I feel like I'm in control of my destiny.
If you want to be like me and you're concerned about your health and you want to just try
to do better, have some experts on your team.
Build your own program.
Go to fount.bio slash twist.
That's F-O-U-N-T dot B-I-O slash T-W-I-S-T.
Get your free consultation, mention Twist, you get $500 off your first month, and get your
own personal health coach.
Health is wealth.
And if you're running a startup, if you're a CEO, if you're a capital allocator, take it
seriously.
I love this service.
fount.bio slash twist.
Well, if we think about the experience of being human and our place in this universe
that we're trying to figure out, performing the tasks, as you said earlier in our conversation,
is how we navigate the world.
And it's how we are actually doing this act
of trying to figure out what it is to be human.
And this all then starts to open up
all kinds of possibilities, free will.
When we pour that glass of milk,
when we play that sheet music,
where is our decision to do that occurring?
You know, what parts of it are automated,
which parts of it are just rote
and just get executed on?
And so it does open up,
and I agree with you,
this is the question
that we will always try to figure out.
And this is why science fiction, you know, always winds up here, which is what does it mean
to be human, whether we're talking about Blade Runner or Prometheus and the Alien series
and Ridley Scott's take on it.
So let's get back to reality here.
When you're training the robot, you are not saying, hey, we're going to pick up a tennis
ball here.
If we're going to pick up a tennis ball, it has this size, therefore we're going to program
it to pick it up.
That's what people did with robots before: they very, very, very much
explicitly had to do some very narrow verticalized task.
You're having a human being, like the guy who played Gollum, I guess, in Lord of the Rings,
use gloves or something to send the instructions to the robot's actual physical actuator hands,
and they're incredibly sensitive and have those pads on them.
So we're teaching it, hey, I'm going to just pick up... Andy Serkis
was the guy who played Gollum.
We're going to actually just pick up the tennis ball,
and then the AI that we train is going to know what happened,
and that's what's going on here?
Yeah, so I have got a, this might be helpful.
Can you see the video?
Yeah, yeah.
So this is a robot, yeah.
Yeah, so this is Phoenix.
And what you're seeing is the,
this process that we're talking about,
where there's a person in a suit,
they have haptic gloves with force feedback.
so they can feel the world.
They see through a heads-up display,
so they feel like they're looking through the eyes of the robot,
and they're connected to it, their own robot,
that when they move, moves the robot in an analogous fashion.
So, yeah, this is what it looks like when you watch the robot side of teleoperation.
You can see that these machines are capable of doing lots of different things.
I mean, that might not be obvious from watching this,
but the systems are nearly capable of doing anything that a person can do under this type of control,
as long as they don't have to move around the world. This is focused on the upper body stuff
and the problem that I mentioned, which is the dexterous manipulation of the world.
And you're seeing it there, you know, basically pick up an object and then scan it with a barcode
as if you were working in an Amazon, let's say factory and shipping and packing boxes,
or even doing something as delicate as using a Ziploc bag,
which we do unconsciously.
We feel it.
It feels like the Ziploc bag, yellow and blue, made green,
and you have that color system, but you also have the feeling of it.
So humans do the tasks,
and then take me to what the software then does with the human having done the task.
What does it do next in terms of building a model
to then go do the next thing in the world?
Yeah, so imagine you have a Reddit post, which is some sequence of words, where someone says, "I really love Diablo II because Amazon is my favorite character."
So somebody's written something.
That sentence is the expression of a thought that a person has had into words.
Now, when you train something like a GPT large language model, that sentence is used to help
figure out the statistical likelihood of each word,
let's just keep it simple,
following the preceding ones.
And if you give this model enough words
that people have written,
the expression of their thoughts,
then if I was to say,
type in a prompt,
which is my favorite game is,
then of all of the words
that have ever been written
to some approximation,
there's a probability of what the next token will be
given that prompt.
And then the thing can unroll,
which means I put the next word in, and now I ask with all those four words, what's the fifth word?
Okay, put the fifth word in.
What's the sixth?
And each time it's a probabilistic thing.
So you roll a random number and you pick the thing that the random number says it should be.
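The unrolling described here can be sketched in a few lines of Python. This is a toy illustration under loud assumptions, not how a GPT model works internally: it conditions only on the single preceding word (a bigram count table) rather than on all preceding words, and the three-sentence corpus and the function names are made up for the example.

```python
import random
from collections import Counter, defaultdict

def train_bigram(sentences):
    """Count how often each word follows each preceding word."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.lower().split()
        for prev, nxt in zip(words, words[1:]):
            counts[prev][nxt] += 1
    return counts

def next_word(counts, prev, rng):
    """Roll a random number and pick a follower in proportion to its count."""
    followers = counts.get(prev)
    if not followers:
        return None  # nothing was ever seen after this word
    words = list(followers)
    weights = [followers[w] for w in words]
    return rng.choices(words, weights=weights, k=1)[0]

def unroll(counts, prompt, max_new_words, rng):
    """Append one sampled word at a time, feeding each result back in."""
    words = prompt.lower().split()
    for _ in range(max_new_words):
        word = next_word(counts, words[-1], rng)
        if word is None:
            break
        words.append(word)
    return " ".join(words)

# Hypothetical training text standing in for "all the words ever written".
corpus = [
    "my favorite game is diablo",
    "my favorite game is chess",
    "my favorite character is the amazon",
]
model = train_bigram(corpus)
print(unroll(model, "my favorite game is", 3, random.Random(0)))
```

Because each step is a weighted random draw, different seeds can complete the same prompt differently, which is the "probabilistic thing" mentioned above.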
So with this type of model, these large behavior models, the data is a little different.
It's the data of the sequence of successive nows.
It's the time data from the person performing the task.
So if I ask the robot to open a Ziploc bag, let's say that's
the micro policy that we're going to train. So a person picks up the bag from the table through the
robot, opens the bag and, you know, maybe pulls it open a little bit. So that then becomes the
analog of a sentence. It's a piece of data, which we're now going to use to train a model,
where instead of predicting the next word, we're going to predict the next sequence of actions,
and we unroll the same way we would a sentence. So every successive prediction becomes a movement
pattern for the system.
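The same idea can be sketched for behavior data. The example below is a hypothetical, heavily simplified stand-in: the teleoperation logs, the observation and action names, and the count-based "model" are all assumptions made for illustration. A real system would work with continuous sensor and motor data and a learned neural network, and this sketch predicts one discrete action per step rather than a sequence of actions.

```python
import random
from collections import Counter, defaultdict

def sample(counter, rng):
    """Weighted random draw from a Counter of options."""
    items = list(counter)
    return rng.choices(items, weights=[counter[i] for i in items], k=1)[0]

def fit(demos):
    """Learn 'what did the pilot do here' and 'what happened next' by counting."""
    policy = defaultdict(Counter)    # observation -> action counts
    dynamics = defaultdict(Counter)  # (observation, action) -> next observation counts
    for demo in demos:
        for i, (obs, action) in enumerate(demo):
            policy[obs][action] += 1
            if i + 1 < len(demo):
                dynamics[(obs, action)][demo[i + 1][0]] += 1
    return policy, dynamics

def rollout(policy, dynamics, start_obs, rng, max_steps=20):
    """Unroll the policy one step at a time, like unrolling a sentence."""
    obs, actions = start_obs, []
    for _ in range(max_steps):
        if obs not in policy:
            break
        action = sample(policy[obs], rng)
        actions.append(action)
        if action == "stop" or (obs, action) not in dynamics:
            break
        obs = sample(dynamics[(obs, action)], rng)
    return actions

# Hypothetical logs: each demonstration is a time-ordered list of
# (observation, action) pairs, the analog of one sentence.
demos = [
    [("bag_on_table", "reach"), ("hand_at_bag", "grasp"),
     ("bag_in_hand", "pull_open"), ("bag_open", "stop")],
    [("bag_on_table", "reach"), ("hand_at_bag", "grasp"),
     ("bag_in_hand", "pull_open"), ("bag_open", "stop")],
    [("bag_on_table", "reach"), ("hand_at_bag", "adjust"), ("hand_at_bag", "grasp"),
     ("bag_in_hand", "pull_open"), ("bag_open", "stop")],
]
policy, dynamics = fit(demos)
print(rollout(policy, dynamics, "bag_on_table", random.Random(0)))
```

Each demonstration plays the role of a sentence, and each sampled action plays the role of the next word.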
And in this case of the kinds of things we build,
while it's similar in some sense,
there are some very big technical difficulties
in actually doing this that require the synthesis
of many different kinds of artificial intelligence advances.
For example, you could send the pixels from the camera in
at every step to one of these models.
but the pixels are
not the thing that you really care about.
What you really care about is where are the things and what are they?
Which is a much lower dimensional thing.
So machine learning computer vision techniques have been developed
that will take the camera feed
and extract what you could think of as the semantic
or important information about the scene
and those are typically the things that you put into these types of models.
And that's not just true for vision.
It's true for haptics and audio and proprioception.
So on the audio side, the obvious thing is if a person is speaking, you could use the actual audio waveform, but you could also use the text.
And text is a much more compressed and high quality version of the data than the actual audio itself.
So we tend to do text extraction from speech before we send them into these types of models as well.
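A minimal sketch of that compression step, under stated assumptions: the small grid below stands in for a per-pixel label mask that some vision model has already produced (0 is background), and the label names and the `summarize_scene` helper are hypothetical. The point is only that a frame of many pixel values collapses into a handful of numbers saying what the things are and where they are.

```python
def summarize_scene(mask, names):
    """Collapse a per-pixel label mask into one small record per object."""
    sums = {}  # label -> (sum of rows, sum of cols, pixel count)
    for r, row in enumerate(mask):
        for c, label in enumerate(row):
            if label == 0:
                continue  # background pixel
            total_r, total_c, n = sums.get(label, (0, 0, 0))
            sums[label] = (total_r + r, total_c + c, n + 1)
    return [
        {"name": names[label], "row": tr / n, "col": tc / n, "pixels": n}
        for label, (tr, tc, n) in sorted(sums.items())
    ]

# Hypothetical 4x6 "camera frame" after labeling: 1 = bag, 2 = tennis ball.
mask = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0, 0],
    [0, 1, 1, 0, 2, 0],
    [0, 0, 0, 0, 0, 0],
]
names = {1: "bag", 2: "tennis_ball"}
scene = summarize_scene(mask, names)
for obj in scene:
    print(obj)
```

Here 24 raw values reduce to two short records, which is the kind of lower-dimensional input described above.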
That's fascinating.
So you can know with machine learning, hey, there's a bag in this scene and the bag is open,
but the bag is upside down,
so it needs to be up.
We should flip it around
so the things don't fall out of the bag,
et cetera.
And so, I think I understand what's happening here in terms of the language model analogy, and then just translating that into predicting the next best thing to do.
And so where are you at in terms of training this in the real world?
I assume that factories... and the example you give looks like, you know, packing
and shipping, probably one of the most boring, monotonous soul-crushing jobs a human being
could have, so why not give that to a robot? And sure, you could do it 24 hours a day or whatever
the robots are going to be capable of. So where are you at in terms of taking this and actually
having it at a fulfillment center, packing boxes and making sure that it scans them and puts the right
objects into the box and then ships them onto the next person and being on this distribution center
floor. To be clear, the initial go-to-market is in automotive manufacturing. It's not in logistics
and retail. We focus almost exclusively on that with one exception. In automotive manufacturing,
if you take a look at a video that, say, Toyota makes of their factory floor, automotive
manufacturing is one of the most automated systems there are in any industry. But if you watch
what actually happens, there are hundreds
of thousands of people in automotive manufacturing facilities all the time.
The question is why?
Why aren't they being automated?
So when you look at what they're doing, there's kind of two categories of answer.
One is it may be beyond the bounds of science.
We may not know how to do the thing that they're doing.
But there's another answer, which is that often people are used to connect machines.
So let's say I have a machine for stamping a part and I have a machine for
making the part in the first place.
So moving the part from the one machine to the other machine is a very difficult process
that involves all of these things that we talk about.
You need to be able to know what a thing is, where it is, localize it, use your hands to
pick it up, sometimes out of a cluttered mess, put it somewhere, which often requires
putting something on a jig, which is a difficult thing.
You need to be able to move around and so on.
So a lot of the work that's done in automotive manufacturing specifically is a combination of
different solved problems that have never been put together in a way that you could make economic.
And one of the key factors of these general purpose machines that we and others are building
is that this is exactly the kind of thing that's required in order to actually do this for real.
Is that if I was to spend all my time and energy building a machine that did one of the things
that somebody in this factory floor does, it would be very difficult to build a business.
But if you could build a machine that could do, say, 50 of the kinds of things, now we're talking.
So our initial use cases are nearly all of this sort.
They're automotive manufacturing.
They're the connector problems where you're moving between machines with parts or things of material.
And even the things that aren't automotive manufacturing that we've looked at all share the same feature.
For example, in warehouses (my last business built robots for e-commerce distribution centers),
there's a problem called induction.
So induction is the problem of taking things that usually come off trucks, you know, big pallets and just stuff.
You know, imagine all of the things that you could buy on Amazon coming into a warehouse.
And then taking them from their point of delivery and then getting them wherever they should be in the system,
on a shelf, in a box, whatever.
So induction is another kind of problem that's related to this,
where you're dealing with a system,
things that you need to manipulate with your hands,
opening boxes, closing boxes,
putting things in boxes,
taking things out of boxes,
and so on.
And so that's another category of things that's related,
but nearly everything we're doing now
is helping automotive manufacturers
dramatically improve the efficiency
and productivity of their workforce.
We're back with another Pitch
It to JCal. This is the segment brought to you by our friends at .Tech Domains. .Tech Domains is
giving TWiST listeners the chance to show off their startup on This Week in Startups. So go to
startups.tech/jason. That's startups.tech/jason to apply. There's only one rule.
You need to have a .tech domain name to get featured. This week, I received a great pitch from
LabelDrive, which you can find at
labeldrive.tech. LabelDrive helps other companies manage their AI data, and they've built a tool for
collecting and labeling data that's especially focused on identifying and categorizing objects
to save you time, save you money, and build better products. And as we all know, that's crucial
for AI training. So I want you to go right now to labeldrive.tech. And if you're interested
in getting featured on This Week in Startups with your new .tech domain name, I want you to go to
startups.tech/jason and apply today.
That's startups.tech/jason, and fill out the form to apply.
You know, if this works, when do you think you'll have the ability to have the robot,
you know, find those 50 different things to do?
Let's say you nail that and it feels like you're well on your way.
The first question: when do you get that solved and in factories and just doing it day in
and day out? The plan that we've got takes us from where we are now to the full automation of
important tasks, by which I mean there are markets for, say, a billion dollars of annually recurring
revenue for us. So let's say that's a kind of a thing that we want to target. We enter into
agreements with our customers where the first step is that we mock up their situation in our
facility here in Vancouver. Think of it as a digital twin, or not a digital twin, a real world twin.
There's also a digital twin, by the way, but the real world twin. And then the process is that
they pay us to show that we can automate, using this type of thing, the kinds of tasks that they
want as a first step. So there's a period of roughly two and a half years that we see
where we go from where we are today to being able to really do something for real
of the sort that you could then scale.
So that's the first step.
When we start scaling is likely around the middle of 2026,
where you're going to start to see the increasing number of these types of machines
actually deployed inside automotive manufacturing plants,
contributing to the productivity of the plant.
So this is the plan of record.
Now, I've done quantum computing and all sorts of things
where it's very difficult to predict how things will go.
So in something like this, you have a plan.
Things could go faster.
It could go slower.
It's unclear.
But that's what we're aiming at.
I think you'll start to see the beginnings of large-scale deployments of these machines somewhere in 2026.
And so 2026, you start seeing the deployment of these.
And then when do you think factories start to remove humans?
I guess they call that the lights out moment.
you don't need to have the light,
you don't even have to install lights in a space.
I know it's funny,
but when do you think you have that lights out moment
and factories don't need to have humans in it?
So I want to make a point about this.
There is a myth that AI and automation reduces labor.
It's not true.
Throughout history,
there have been a series of moral panics
where the next big technology thing
is believed to do something terrible
to employment. It's never happened. Every single time there's been a new thing introduced.
And I think that the central reason for us thinking this is that it's the lump of labor fallacy,
the idea that there's a fixed amount of work. And if you give the work to the robots,
there's nothing left. That's simply wrong. The way that it actually works in practice is that
when you give, say like you give a bunch of labor, like I want to build 80 million cars. So that's
a fixed amount. Let's say we could do that all with robots. The amount of work that's available
for the general human population expands as a consequence of that. It doesn't shrink. So I want
to just make this very clear that my perspective on AI and automation is that there's an upward
spiral when you have more energy, you have more intelligence, you have more capability. These
drive all the metrics of human flourishing up. They don't take. So when we think about the answer,
when will you get lights out manufacturing? I think the answer is never. Because people will always
find new things to do with the tools that we've built, even very powerful tools that can think
and maybe are even self-aware. These will only increase the number of jobs, they increase wages,
but there'll be different kinds of jobs. There'll be the sorts of things that maybe we can't even
imagine now that are made possible by these things. Like, look at the internet, 20 years ago or 30
years ago. This is a great example. Yeah. Yeah. Now we have an entire podcasting history. We have
people who take pictures or, you know, there's an incredible company called Song Finch. What they do is
you go there and you tell it you want to make a song for your mom or your dad. They pair you with
an artist and you pay them $200 and they'll write a song about your mom for her birthday.
That's very cool. Well, I mean, there's
humans out there, and I guess these used to be bards or, you know,
court jesters or whatever who would do these kind of tasks as well, but we find things.
And you're just thinking about your robot and, oh, well, we have this new problem,
forest fires.
Well, how are we going to, you know, clean up the, how are we going to rake up as our
former president, you know, joked of, you know, how are we going to, how are we going to
rake up all that debris under the trees there, you know, in the mountains in California?
You know, if somebody had 10 of these robots and was able to test them, I'd say, oh, you
have an interesting idea. Maybe we could clean up
and do some deforestation with them.
And they will eventually, in your mind,
a decade from now or two decades from now,
not just be in factories.
They'll be in our lives. They'll be side by side with us
solving problems in the real world.
Yeah, that's the ultimate vision here, so they can leave the factories.
Yeah, I think of them as being
a kind of thing like the automotive industry,
where at some point they'll be ubiquitous
and parts of our entire civilization
will be built in synergy
with this new thing, like we did with cars,
you know, roads and so on.
By the way,
I wanted to mention that this happened,
this business of the job upgrade happened to me.
When I was starting school,
there were no quantum computers at all,
except maybe theoretically.
And we started a company to try to build one.
This is an example where the,
we probably hired about over time,
I don't know,
maybe 300, 400 people who had PhDs in physics in that company.
That,
and this is D-Wave,
yeah.
This is D-Wave, yeah.
That,
that was a new kind of job
that was created as a consequence
of a revolutionary new idea.
So this is the sort of thing
that always happens
with innovation is that,
and I'm kind of emphasizing this a little bit
because we're at a very weird time
right now where there's an attempt
to do regulatory capture
in artificial intelligence.
It's a very dangerous idea,
this idea of deceleration
or stagnation or holding back,
which are connected to old ideas that were rooted in communism.
These are very dangerous social ideas that I think it's important that we don't stay silent about.
And people like me, who have very strongly held beliefs that technology is the solution to maybe all of our problems,
not only the ones we create, but also the ones that might emerge as a consequence
of our natural habitat, you know, global warming or meteors or whatever...
The idea that the better we can get at creating new things, the better off we all are,
is a very important policy idea that I don't think is being communicated effectively
by the community of people who build technologies.
There are a group of people who want to have the government regulate, and they've specifically
gone to Washington and said, hey, please regulate us.
They happen to be the people who are
maybe at the forefront, or some amongst the people at the forefront.
And,
you know,
building a bunch of regulation into this would benefit the people who have the lead today,
as opposed to say open source people or,
or,
uh,
folks who are coming up.
Is that the thinking of what their motivation is?
Because this is a group of technologists who are on the cutting edge.
Why would they go to Washington and want to have a bunch of, you know, non-technical politicians slow things down?
What do you think
their motivation is?
I think the best answer to this
is somebody that you had on, Bill Gurley.
Yeah.
So his take on this
I really resonated with.
It was one of the most,
you know,
sometimes you watch something
and you're like,
I'm agreeing with everything
this guy's saying right now.
I think I would,
if people are interested in this subject
and they haven't seen that,
I would most definitely recommend it.
Bill Gurley,
All-In talk.
Yeah, we'll put it in the show notes
for everybody.
Yeah.
But the regulatory capture is
what they're going for. It calcifies the winners as the winners. It builds up a moat for them.
And this could be just cataclysmic for humans, right? We need this technology to solve problems.
Yeah, and it's, and it's, I think that's the point is that the solving of problems comes from,
from innovation and growth. And the forces of stagnation, the people who are pushing for
not that, are very strong right now.
And I think it's dangerous because my view of this is that civilization's metrics,
like how well people are doing, are very strongly correlated with growth.
And there's an idea that we have to slow everything down, which is, I think, a dangerous
idea.
I think that what would happen if we were to implement policies that were restrictive is the same
thing that happened.
I'll use an example with nuclear power.
A lot of the problems that we face today in the global warming sense and catastrophic, you know, potential futures that we might be looking at are connected very strongly to the precautionary principle, which in the nuclear industry was, well, we don't want to build nuclear power plants because we're afraid of nuclear bombs, which is ridiculous, by the way, because it's not the same thing at all.
Not the same thing.
Yeah.
And, you know, maybe a reactor melted down once or twice.
Yeah, Three Mile Island, yeah.
You don't count all the deaths that happened in all these other industries, which were
massively higher.
If we had not done that with nuclear, and instead embraced it, we would not be where we are
today.
And so there's examples of this fear, which is a rational fear, that can become policy, and
it could be very dangerous here, because I think that these technologies, these
technologies we're talking about, which is the AI robotics to a certain extent, but not just
those things more generally.
We should take the attitude that the upward spiral is the objective.
We want more.
We want more energy.
We want everybody on the planet to come to the energy consumption of us.
We don't want to reduce energy consumption.
We want to increase it.
And then we want to increase everybody by another thousand times.
And we need to be able to find ways that technology can enable that,
and then enable solving the problems that might come of it from second order effects like global
warming. These things are all solved by innovation and technology. Innovation and change are
not the enemy. It's our friend. It's a necessary part. And it's connected to who we
are as, as people. You know, people are explorers. We're adventurers. We want novelty. We want
to go to, you know, to places that no one's ever gone, either literally or figuratively. And that
is the essence of the human spirit to me.
And we want to be advocating for that as technologists and leaders in our fields.
Yeah.
And it's so paradoxical.
I remember when I was a kid, all these great musicians who I loved, Bob Dylan and
et cetera, did the no nukes concerts.
And we really were indoctrinated into this fear of nuclear.
And the second order effect is that we burned more coal and we burned more oil.
and we heated up the planet
and now we're trying to solve the problem
and the solution was there in the 70s
and then sometime in the 80s
we decided hey let's stop doing this
and now 80s 90s 2000s
we're sitting here four decades later
and finally people are starting to realize
40 years later oh you know what
maybe that was a mistake should we start building these again
and now we've got to reconvince
everybody that we went on a 50 year
side quest that made no sense
and it's incredibly frustrating
and you know it's
yeah to some of our friends
people have been on the pod
Sam Altman, Reid Hoffman
Mustafa like
I think they are misguided here
we could have conversations about this right
I mean there's nothing wrong with having a conversation
hey how do you make nuclear safer
hey could these robots
I mean it sounds farcical
but could the robots escape and do bad things
in the world sure we could have this conversation
but that doesn't mean that we need to have
a bunch of regulators come in and say
oh, somebody in Washington's going to approve your language model and your code?
I don't know, doesn't make much sense to me.
That seems like they're doing regulatory capture.
I agree 100%.
And you know, the other thing I realized about what you're saying,
Geordie, is there's something about solving problems,
I realize in this conversation that when we were talking about jobs,
and there's a sensitivity to that, with good reason,
we automate things and a large amount of jobs could go away quickly.
And there could be displacement, of course.
But when the mind and consciousness is left alone, our minds are designed in a very interesting way to think and find the next problem to solve.
There's something fundamental about human consciousness and this brain and, you know, Darwin and evolution that our species survived, dominated, and evolved with something inherent in our code, which is understand the world and find the next problem to solve.
Does any of that resonate with you?
Oh, yeah.
I mean, everybody who's lain awake at night and they can't get their mind to stop spinning through all the negative scenarios that could happen.
Everybody, I think, experiences this.
You're exactly right.
Is that this tool that we've got, this beautiful mind that does all these wonderful things,
it creates the worst nightmares possible about what will happen as a consequence of it working well.
And so with technology, our mind spins up all of these horror science fantasy
ideas, we turn them into movies like Terminator or Black Mirror. None of that is real.
I think there's a very important, powerful message here is that the terrible stories our minds
tell us when you lie awake at night about your personal life is the same process that generates
fear about the outcomes of change. So when we do something new, we innovate, we discover
something about the world. There's a natural tendency that all of us have.
to imagine what might go wrong.
And...
Exactly.
Yeah.
Yeah.
So I would advocate for being aware of that: it's a story your mind is
telling you.
The Terminator thing is not true.
It's not real.
It's never going to happen.
It's just a story that somebody made up that resonates with our, with our base nature,
fears and concerns about the future and so on.
But it's not real.
What's real is very different.
Yeah.
And in our mind,
there was a reason this obviously existed:
the person who worried,
hey, I wonder if these berries are poisonous or not,
or I wonder if there's something dangerous
in that body of water,
maybe I should be cautious.
Yeah, a little bit of caution,
thoughtfulness,
probably extended life.
And people who were reckless probably had shorter lives.
And so, yeah, the gene pool probably evolved this way.
But you must be aware of how catastrophizing it is.
I mean,
people can get really wound up. We see this with social media presenting us with so much
bad news in the world. Our brains are not designed to process that, are they?
No, and this is an example of how technology can have unintended consequences that are negative.
It's social media hijacks this propensity that we have to tribalize, to fear, to other,
to see other people as being different.
You know,
part of this idea of thinking of this conscious perspective
that you have as separate from your brain carries with it
another idea that we're all connected.
You know, we all have this thing.
We all share in it.
The analogy that Eckhart Tolle uses is that there's an ocean
and we're ripples on the ocean,
but this ocean is the same for all of us.
This idea is a powerful one
when you're trying to think about why you're reacting in a certain way to certain things.
You know, like the social media stuff is an amplifier of the negative aspects of how
we function as people.
But that doesn't mean that we shouldn't have done it.
I think this is the point.
It's like you said before, we want to talk about it.
We want to have a frank discussion about it.
But the solution to these things doesn't come from shutting things down.
It comes from having this discussion and making good,
clear-minded decisions about how to build,
not how to build.
One of the great paradoxes of all of this might be,
we build up this AI and we get to some general intelligence.
It might tell us, it's a non-zero chance,
it might explain things to us about our own consciousness,
why we're here,
and what consciousness is that we ourselves could not come to the answer.
So we may unlock some mysteries,
that explain our own existence in a way.
And that is just to me would be a wonderful gift of accelerating this, you know,
is what if this machine, what if this artificial intelligence can be more objective about us
and can teach us something, right?
That would be a pretty mind-blowing outcome.
It sure would.
Yeah.
All right, listen, continued success with this, from, yeah, just working
on quantum computing and now to robotics and figuring out how to make these sequences play.
It's going to be very interesting to watch your progress. And listen, accelerate it all. Let's go.
I'm assuming you're hiring and this must be one of the most fascinating places to work in the
world. If people are interested in learning more or maybe applying for a position to build this
out and accelerate human intelligence and augment it so beautifully,
where can they find out more?
So I and one of the other founders of the company, Dr. Suzanne Gildert, have a podcast called
the Sanctuary Ground Truth podcast.
That's a place that you could look.
Also, at our website, sanctuary.ai, there is a careers page.
We are hiring and growing quite quickly.
and there are positions for all sorts of different kinds of people.
We mostly hire technical people, of course, but there are some other things.
And if anybody's interested, please watch the Ground Truth podcast and go to the website and check us out.
Amazing. All right. And we'll see you all next time on This Week in Startups. Bye-bye.
