No Priors: Artificial Intelligence | Technology | Startups - The Best of 2024 (so far) with Sarah Guo and Elad Gil
Episode Date: July 11, 2024
Believe it or not, we’re almost halfway through 2024. Sarah and Elad have spent the first half of this year talking with some of the most innovative minds in the AI industry, so we’re taking a look at some of our favorite No Priors conversations so far featuring Dylan Field (Figma); Emily Glassberg Sands (Stripe); Brett Adcock (Figure AI); Aditya Ramesh, Tim Brooks and Bill Peebles (OpenAI’s Sora Team); Scott Wu (Cognition); and Alexandr Wang (Scale). Watch or listen to the full episodes here: Build AI products at non-AI companies with Emily Glassberg Sands from Stripe; Designing the Future: Dylan Field on AI, Collaboration, and Independence; The argument for humanoid robots with Brett Adcock from Figure; OpenAI’s Sora team thinks we’ve only seen the "GPT-1 of video models"; Cognition’s Scott Wu on how Devin, the AI software engineer, will work for you; The Data Foundry for AI with Alexandr Wang from Scale. Sign up for new podcasts every week. Email feedback to show@no-priors.com Follow us on Twitter: @NoPriorsPod | @Saranormous | @EladGil Show Notes: (0:00) Introduction (0:46) Emily Glassberg Sands on the Future of AI and Fintech (4:23) Dylan Field on AI and Human Creative Potential (9:03) Brett Adcock on Running Figure AI’s Hardware and Software Processes (12:43) OpenAI’s Sora Team on Artists’ Creative Experiences with their Model (17:43) Scott Wu Gives Advice for Human Engineers Co-Working with AI (21:06) Alexandr Wang on How Quality Data Builds Confidence in AI Systems
Transcript
Hi, listeners, welcome back to No Priors.
We're halfway through 2024, so we're doing a mid-year best-of episode where we go back to some of our favorite moments from episodes so far
and catch you up on everything that's been going on in AI, from the state of the art in research to hyperscalers and upstarts.
We'll list all the episodes featured so you can go back and relisten to the whole conversation.
To kick it off, we're going to hear a little bit from Emily Glassberg Sands, who's the head of information at Stripe.
We talked a lot about how AI can help small businesses make a big impact in the economy.
Here, she talks about the intersection of fintech and AI.
When you think forward on the directions that the overall financial services industry is going,
and let's put Stripe aside for a second because I think Stripe is obviously a core company to sort of the internet economy,
and it touches so many different pieces of fintech and things like that.
But where do you think, outside of Stripe, the biggest white space for fintechs
employing AI is?
Like from a startup perspective or even an incumbent perspective,
like where do you think this sort of technology will have the biggest impact?
It's a great question.
I don't know exactly what others will do.
I think having a really robust understanding of identity, who businesses are, what they're selling, has always been important.
And, you know, I think often in industry, we think it's important for marketing or sales
or sort of go-to-market motions.
But it's also super important in fintech.
Yeah, it's important for credit lending decisions,
but it's also important for supportability decisions and understanding where, you know,
the business does or does not meet the requirements of a given card network or a given BIN sponsor.
And so I think that that identity piece,
like who is this merchant,
are they who they say they are,
but also what are they,
what's their business?
What are they selling?
And how does that map to this pretty complicated regulatory environment
is a really interesting and hard problem
that lots of folks are solving in their own ways,
but is likely an opportunity.
I think there's almost certainly an opportunity to, you know,
whether Stripe does it or somebody else does it,
to make sort of financial integrations way more seamless.
Stripe has a whole suite of no-code products.
So you can use, you know, payment links or no-code invoicing.
But how does one actually build a really robust, user-specific integration without needing, you know, a substantial number of payments engineers or any complicated developer work?
LLMs are proving that they can be very good at writing code.
We have a couple cases actually where we're already seeing it work,
but as the decisions get more and more complicated,
I think there's still a lot of work to do to build the right integration
and to build it well in an automated way.
And then I think, as I mentioned before, there's some of this layer on top of the payments data: okay, you could build solutions that make payments work better, but the idea that payments actually allows you to really deeply understand and improve the business is pretty fascinating. And you'd have to think about, like, is it a startup that does that or is it an incumbent that does that, and what's the business model there? But, you know, if I think about the case of Stripe, Stripe has the opportunity to be beneficent, right? Incentives are super aligned. The more Stripe can help its users' businesses grow, the more Stripe grows, and the more the economy grows. And so whether it's Stripe or someone else, using financial data to help businesses be more successful, to grow the pie, to grow GDP, I think is really powerful.
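To make Emily's point about LLM-written integrations concrete, here is a minimal sketch of the kind of glue code a model might generate from a plain-English request. It uses Stripe's public Python SDK (`stripe.PaymentIntent.create`), but the API key, amounts, and email are placeholder values invented for illustration; this is a sketch of the idea, not a description of any specific Stripe product.

```python
# A minimal sketch of the kind of integration an LLM might generate
# from a request like "charge customers $20 and email a receipt."
# Uses Stripe's standard Python SDK; the API key and values below
# are hypothetical placeholders.
import stripe

stripe.api_key = "sk_test_..."  # placeholder test key

def create_checkout(amount_usd_cents: int, customer_email: str) -> stripe.PaymentIntent:
    """Create a one-off PaymentIntent for the given amount."""
    return stripe.PaymentIntent.create(
        amount=amount_usd_cents,  # amount in cents
        currency="usd",
        receipt_email=customer_email,
        automatic_payment_methods={"enabled": True},
    )

if __name__ == "__main__":
    intent = create_checkout(2000, "buyer@example.com")
    # Client-side code would confirm the payment with this secret.
    print(intent.client_secret)
```

The hard part Emily describes is not this happy path but generating an integration tailored to each business's requirements, automatically and correctly, as the decisions get more complicated.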
Up next, we have a clip from our conversation with friend and formidable founder, Dylan Field,
whose company is using AI to change the design process and bridge the gap between design and
development. We talk about how bringing AI into the creative process changes the creative job.
Basically, you're moving from a human-to-human collaboration company to a human-to-AI collaboration
company over time in some sense because, you know, what you're describing seems like a really
interesting way to have co-pilots augment humanity or augment creativity. Are there other ways that
you've thought about the substantiation of that sort of creativity augmentation or how AI really
interacts with human creative potential? Well, these are just examples of things that I have seen
or thought about that I think could be cool in the creative space, because you asked. But
I think in the design context, one thing that really matters a lot is the iterative loop, and being able to keep going back and forth with an agent and give more instructions over time.
If you just kind of, like, go to first principles here, there's so much that you're not able to communicate via a prompt.
Like, if you think about great design, it often captures something about the culture, the ethos of the moment.
It captures something about the temporal aspect of the sequence of interaction someone's having or the context they will have mentally.
Something about affordances, what people are used to in terms of the language of design,
which is sometimes similar and dependent on the platform.
But oftentimes there's something about emotional state too.
There's videos that the designers probably watched or in-person research interviews they've conducted.
And so I think fitting all that, plus the product requirements, plus visual style, into a prompt, that's hard. But even if you could just get unblocked by an AI helping you brainstorm and think through problems, you know, that's your first sort of draft.
And from there, you can keep iterating.
From there, you can keep evolving things.
I think that could be very, very interesting as a first step.
What's your response to people who worry that AI, like in every role, is going to, you know,
eliminate the need for designers?
For all the reasons I just mentioned around, you know, emotions, user context,
knowing how flows go, having that history of interactions and whatnot,
I think it's unlikely that that's like the world we're seeing in the short term.
I think no one knows what's happening in the long term.
You know, if we have superhuman intelligence, like, I don't know what it means for any of us on this podcast or anyone listening. But if we don't try to ask what that case looks like, and instead ask, okay, if we assume that there's continued improvement, what does it mean for design?
I think design is actually in a really good place.
Probably before you see potential replacement of any part of the design role, you instead see augmentation, you see access, you see efficiency, so that designers can get more done. And I think probably a lot of engineers will put more of their time towards design than they put towards what we consider coding tasks today, as the abstraction level of coding changes.
There's probably still a human in the loop for engineering, but it's not clear to me that humans are going to write every line of code in a year, three years, five years.
Obviously we already have Copilot, but I think that you could go even further than that
and a lot of companies are trying to do that.
I mean, I can't make multi-year bets in the current environment, but my expectation, maybe it's because I'm an optimist, is that we're just going to get more and better software, and better-designed software, versus fewer designers or engineers.
Yeah, I definitely think that as a metric, the number of pieces of software that will be created will go up tremendously. And it's interesting, there's some visions out there of the future where people interpret the capabilities of AI to mean that you won't have any interface at all. I think it's really cool to see this explored, like the Rabbit we've talked about, Sarah. I haven't used it yet. I think you did, is that right? But I think it's a really cool vision. And I think that there will be so much more software in a year, two years, or five years from now than there is today. Like, both could be true: there's demand for that, and there's just way more software.
Next we talk to Brett Adcock, the CEO of Figure AI. Figure is creating a fleet
of humanoid robots to take on the dull and dangerous jobs that humans shouldn't be doing.
In this clip, we talked to Brett about how he runs a team with velocity to drive hardware,
software, and AI into reality.
Big question, but can you describe, like, if you want to run a hardware project, a
hardware and software project like this, with this complexity at velocity, like, how do
you manage product development?
From like a thesis perspective, I strongly believe in like an iterative design approach.
We really don't believe in spending a lot of time just doing research and analyzing. We spend a lot of time on building and testing here, and that helps us really shake out all the problems. It helps us learn, helps us recursively add it into a continuum of product that's coming out. So first, that's our strategy: we want to be continuously updating the hardware and software forever. I don't think it will ever be good enough for us.
So we have a whole process built around building a robot from a basically hardware and software
design that we run here.
We first set out with understanding who are the customers, like what does a robot need to do?
From there, we basically set requirements: okay, we need the robot to lift this many pounds, it needs to run this long, it needs to charge here, and there are the safety requirements, so the battery can't burn down the building. There's a bunch of stuff: the environmental IP rating has to be done on the actuators. There's just a bunch of requirements that come from there.
From there, we look at those requirements and we do engineering design.
And we have basically like three big phases.
We have a conceptual and preliminary and critical design review that we do here throughout the year.
The whole company is involved.
So we have these like design gates that we work through.
It's a similar practice to what I instituted at Archer, from an engineering design perspective or philosophy. And yeah, we work through it in a very methodical way, all the way through that, serially.
And how does integration and testing work in a way that's different from a software company since you've also done that?
I'd imagine really differently.
Yeah, we try to prototype and test as fast as we can to see if we were right.
Same with software.
It just happens on a longer timeline.
Okay.
Well, in software, you'll come in one day and say, okay, we've talked to the client, and we believe we have all of these things on the product backlog list that we want to do. You'll have some heuristics where you'll score those, you'll basically comb the backlog, and you'll say, we're going to add these, like, six things to the sprint. You'll do story points, you'll assign those out, and you'll basically manage that whole process. And then you'll launch it and you'll get feedback, right? You'll A/B test things, you'll watch the analytics, and you'll say, did that work? You really want to have that kind of scientific method around it, to say, okay, did that actually help fix this particular problem?
Same here. We have the client, we have requirements that we set, like, they need to do this. And we are designing things, we are designing hardware from scratch. Say we're designing an actuator: we're going to take our CAD system and design it from scratch. We're going to make assumptions and do trade studies up front on what the different tradeoffs are of how we can do it, so we don't spend a lot of time designing something that just doesn't work. So we're going to be pretty methodical about it, much more methodical than you are in software, because the timelines are, you know, an order of magnitude plus longer.
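As a loose illustration of the software-sprint loop Brett analogizes to, here is a toy sketch of scoring a backlog with a heuristic and pulling the top items into a sprint. The fields, weights, and value-per-effort heuristic are invented for the example; they are not Figure's actual process.

```python
# Toy sketch of the sprint-planning loop described above: score the
# backlog with a simple heuristic, then pull the top items into the
# sprint. All fields and weights are invented for illustration.
from dataclasses import dataclass

@dataclass
class BacklogItem:
    name: str
    customer_value: int   # 1-5, from client conversations
    effort_points: int    # story-point estimate

def score(item: BacklogItem) -> float:
    # Hypothetical value-per-effort heuristic.
    return item.customer_value / item.effort_points

def plan_sprint(backlog: list[BacklogItem], capacity_points: int) -> list[BacklogItem]:
    """Comb the backlog, highest score first, until the sprint is full."""
    sprint, used = [], 0
    for item in sorted(backlog, key=score, reverse=True):
        if used + item.effort_points <= capacity_points:
            sprint.append(item)
            used += item.effort_points
    return sprint

backlog = [
    BacklogItem("faster checkout", customer_value=5, effort_points=3),
    BacklogItem("dark mode", customer_value=2, effort_points=2),
    BacklogItem("export to CSV", customer_value=4, effort_points=5),
]
print([i.name for i in plan_sprint(backlog, capacity_points=6)])
```

The launch-and-measure half of the loop (A/B tests, analytics) closes the cycle; Brett's point is that hardware runs the same loop, just with much longer iterations.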
Up next is a snippet from a conversation we had with the OpenAI research team building Sora.
Here, we talk to this team about their generative video model and whether or not video is on the path to AGI.
Do you all have a favorite thing that you've seen artists or others use it for or a favorite video or something that you've found really inspiring?
I know that when it launched, a lot of people were really struck by just how beautiful some of the images were, how striking, how you'd see the shadow of a cat in a pool of water, things like that. But I'm just curious what you've seen emerge as more and more people start using it.
Yeah, it's been really amazing to see
what the artists do with the model,
because we have our own ideas of some things to try,
but then people who for their profession
are making creative content are like so creatively brilliant
and do such amazing things.
So Shy Kids have this really cool video that they made, this short story "Air Head," with this character that has a balloon for a head, and they really made this story. And there it was really cool to see a way that Sora can unlock and make this story easier for them to tell.
And I think there it's even less about like a particular clip or video that Sora made and
more about this story that these artists want to tell and are able to share and that Sora
can help enable that.
So that is really amazing to see.
You mentioned the Tokyo scene.
Others?
My personal favorite sample that we've created is the Bling Zoo.
So I posted this on my Twitter the day we launched Sora.
And it's essentially a multi-shot scene of a zoo in New York, which is also a jewelry store.
And so you see, like, saber-toothed tigers kind of decked out with bling.
It was very surreal.
Yeah, yeah.
And so I love those kinds of samples because, as someone who, you know, loves to generate creative content but doesn't really have the skills to do it, it's so easy to go play with this model and just fire off a bunch of ideas and get something that's pretty compelling. Like, the time it took to actually generate that, in terms of iterating on prompts, was really less than an hour. So I got something I really loved.
So I had so much fun just playing with the model
to get something like that out of it.
And it's great to see that the artists
are also enjoying using the models
and getting great content from that.
What do you think is a timeline
to broader use of these sorts of models
for short films or other things?
Because if you look at, for example, the evolution of Pixar, they really started making these Pixar shorts,
and then a subset of them turned into these longer format movies.
And a lot of it had to do with how well they could actually model the world, even little things like the movement of hair or things like that.
And so it's been interesting to watch the evolution of that prior generation of technology, which I now think is 30 years old or something like that.
Do you have a prediction on when we'll start to see actual content, either from Sora or from other models, that will be professionally produced and sort of part of the broader media genre?
That's a good question. I don't have a prediction on the exact timeline, but one thing
related to this I'm really interested in is what things other than like traditional films people
might use this for. I do think that maybe over the next couple years we'll see people starting
to make like more and more films, but I think people will also find completely new ways to
use these models that are just different from the current media that we're used to. Because it's a very different paradigm when you can tell these models kind of what you want to see and they can respond, and maybe there are just new modes of interacting with content that really creative artists will come up with. So I'm actually most excited for the totally new things people will be doing that are just different from what we currently have.
It's really interesting, because one of the things you mentioned earlier is that this is also a way to do world modeling. You've been at OpenAI for something like five years, and so you've seen a lot of the evolution of the models and the company and what you've worked on.
And I remember going to the office really early on,
and it was initially things like robotic arms,
and it was self-playing games or self-play for games and things like that.
As you think about the capabilities of this world simulation model,
do you think it'll become a physics engine for simulation
where people are actually simulating like wind tunnels?
Is it a basis for robotics? Is it something else?
I'm just sort of curious where are some of these other future-forward applications
that could emerge.
Yeah.
I totally think that carrying out simulations in the video model is something that we're going to be able to do
in the future at some point. Bill actually has a lot of thoughts about this sort of thing,
so maybe you can... Yeah, I mean, I think you hit the nail on the head with applications like
robotics. You know, there's so much you learn from video which you don't necessarily get from other modalities that companies like OpenAI have invested a lot in in the past, like language. You know, the minutiae of how arms and joints move through space; again, getting back to that scene in Tokyo, how those legs are moving and how they're making contact with the ground in a physically accurate way. You learn so much about the physical world just from training on raw video that we really believe it's going to be essential for things like physical embodiment moving forward.
Up next, we talked to Scott Wu, the co-founder of Cognition, the company behind Devin, the AI software engineer. Here, we talk about the design of Devin and what it means to work with AI engineers.
What do you think is going to be important for a human software engineer, or just a human technology person, five years from now?
I realize that's a really long time scale in AI.
But it's certainly not like encyclopedic knowledge anymore, right?
Yeah.
Yeah.
And I mean, I think there's a meme that, you know, the hottest new programming language is English, right?
And I mean, I think there's a lot of truth to that.
But with that said, I think that, you know, the software engineering fundamentals are
obviously still super, super valuable, right?
For example, the Internet today is something that we all kind of are able to use and take for granted, but for people who work with these networks, it's certainly very helpful to understand the details of TCP, right? And I think similarly, we'll be able to communicate our ideas in English and work with all these things, but, you know, understanding the internals
of how computers work and understanding logic gates and, you know, a lot of these core kind of pieces,
like these core foundations, I think will still be very useful, right? And so, you know,
whether that's algorithms or technologies or logical reasoning or things like that. I think the role of a software engineer five or ten years from now looks something like a mix between a technical architect and a product manager today, where a lot of what you do is take problems that you're facing, or that your business is facing, and really think about and break down what exactly the solution should be.
How do you think about it on an even farther time frame?
Because if it was five years ago, I would have told either my kids or people who have kids, you know, you should study computer science and math.
20 years from now, I'm not as certain.
So I'm sort of curious how you think about the future of this field if much or all the
work, including a lot of the planning, is actually done by machines at some point.
Yeah.
I mean, I love that. I have to say it's a worthwhile experience, even if it doesn't end up being practically useful.
But no, I mean, I think these, I think a lot of these fundamentals will stay useful for a long
time.
There's obviously a lot of questions that come up about, you know, superintelligence and
singularity and all of this.
And, you know, it's very hard to predict.
I think everyone in AI, we've all made our predictions and tried to make our guesses, but I think it's hard to be very high confidence.
But with that said, I do think that, you know, we're going to see AI's concrete impacts on work and economy and people's lives, I think, a lot sooner than that.
You know, I think the way that we think about the problem is that, you know, even with the tools that are available today and the technologies that exist today, there's so much that's possible to really impact people's lives, right?
And, you know, we're still very, very early in this whole AI revolution.
I mean, ChatGPT was about a year and a half ago at this point.
And there's a lot more to do and a lot more to build, you know, both on the research side and on the product side.
Finally, we have Alexandr Wang, the founder of Scale AI, the data foundry for AI. Here, we talk about what's next for Scale as models approach and go beyond human abilities.
With great power comes great responsibility. If these AI systems are what we think they are in terms of societal impact, trust in those systems is a crucial question. How do you guys think about this as part of your work at Scale?
A lot of what we think about is, how does the data foundry enhance the entire AI life cycle, right? And that life cycle goes from, you know, ensuring there's data abundance as well as data quality going into the systems, but also being able to measure the AI systems, which builds
confidence in AI and also enables further development and further adoption of the technology.
And this is the fundamental loop that I think every AI company goes through.
You know, they get a bunch of data or they generate a bunch of data, they train their models,
they evaluate those systems, and they sort of, you know, go again in the loop.
And so evaluation and measurement of the AI systems is a critical component of the life cycle,
but also a critical component I think of society being able to build trust in these systems.
You know, how are governments going to know that these AI systems are safe and secure and fit for,
you know, broader adoption within their countries? How are enterprises going to know that
when they deploy an AI agent or an AI system, that it's actually going to be good for the consumers
and that it's not going to create greater risk for them? How are labs going to be able to consistently measure the intelligence of the AI systems they build, and how do they make sure they continue to develop responsibly as a result?
Can you give our listeners a little bit of intuition for, like, what makes evals hard?
One of the hard things is that, because we're building systems where we're trying to approximate human intelligence, grading one of these AI systems is not something that's very easy to do automatically. It's sort of like, you know, you have to build IQ tests for these models, which in and of itself is a very fraught philosophical question: how do you measure the intelligence of a system? And there are very practical problems as well.
So most of the benchmarks that we as a community look at for-
The academic benchmarks.
Yeah, the academic benchmarks that are what the industry uses to measure the performance of these algorithms are fraught with issues.
Many of the models are overfit on these benchmarks.
They're sort of in the training data sets of these models.
And so-
You guys just did some interesting research here.
Yes.
Published them.
Yep.
So one of the things we did is we published GSM1k, which was a held-out eval. We basically produced a new evaluation of the math capabilities of models that there's no way would ever exist in the training data set, to really see how the reported performance of the models compared to their actual capability. And what you notice is some of the models performed really well, but some of them performed much worse than their reported performance. And so this whole question of how we're actually going to measure these models is a really tough one.
And our answer is we have to leverage the same human experts, kind of the best and brightest minds, to do expert evaluations on top of these models, to understand where they are powerful, where they are weak, and what the risks associated with these models are.
So, you know, one of the things that we're very passionate about is that there needs to be public visibility and transparency into the performance of these models.
So there need to be leaderboards,
there need to be evaluations that are public,
that demonstrate in a very rigorous scientific way
what the performance of these models is.
And then we need to build the platforms
and capabilities for governments, enterprises, labs,
to be able to do constant evaluation on top of these models
to ensure that we're always developing the technology
in a safe way and we're always deploying it in a safe way.
So this is something that we think about: just in the same way that our role as an infrastructure provider is to support the data needs of the entire ecosystem, we think that building this layer of confidence in the systems, through accurate measurement, is going to be fundamental to the further adoption and further development of the technology.
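As a rough sketch of the held-out evaluation idea behind GSM1k that Wang describes above, the snippet below compares a model's accuracy on a widely published benchmark against its accuracy on freshly written problems of similar difficulty. The `ask_model` function and both problem lists are hypothetical stand-ins, not Scale's actual methodology.

```python
# Rough sketch of a held-out eval in the spirit of GSM1k: compare a
# model's accuracy on public benchmark items (which may be in its
# training data) against freshly written, never-published problems.
# `ask_model` and both problem lists are hypothetical stand-ins.

def ask_model(question: str) -> str:
    """Stand-in for a real model API call; this dummy has effectively
    'memorized' the public set, to show what overfitting looks like."""
    return "$2"

def accuracy(problems: list[tuple[str, str]]) -> float:
    """Fraction of problems answered exactly right."""
    correct = sum(ask_model(q).strip() == answer for q, answer in problems)
    return correct / len(problems)

# Public items the model may have trained on, vs. held-out items.
public_benchmark = [("If 3 apples cost $6, what does 1 cost?", "$2")]
held_out = [("If 5 pens cost $15, what do 2 cost?", "$6")]

gap = accuracy(public_benchmark) - accuracy(held_out)
# A large positive gap suggests the model memorized the benchmark
# rather than learning the underlying skill.
print(f"overfitting gap: {gap:.2%}")
```

This is the pattern Wang describes: some models score the same on both sets, while others drop sharply on the held-out problems, revealing the gap between reported and actual capability.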
Thank you all so much for listening. We've really enjoyed talking to people reshaping our world with AI. To listen to any of the full episodes, please find the links in the description for this podcast. And we'll be back with new interviews next week.
Find us on Twitter at NoPriorsPod. Subscribe to our YouTube channel if you want to see our faces,
follow the show on Apple Podcasts, Spotify, or wherever you listen.
That way you get a new episode every week.
And sign up for emails or find transcripts for every episode at no-priors.com.