No Priors: Artificial Intelligence | Technology | Startups - The Intersection of AI and Blockchain, with Transformers author and NEAR founder Illia Polosukhin
Episode Date: September 15, 2023. More than 25 million users are using NEAR-powered applications. Co-founder of NEAR protocol and Transformers author Illia Polosukhin joins hosts Sarah Guo and Elad Gil to discuss the intersections of crypto and AI technology, what we should expect from AI agents, decentralized data labeling, why AI’s alignment problem is really a human problem, and more. Show Links: Illia Polosukhin - Co-founder of NEAR | LinkedIn NEAR Sign up for new podcasts every week. Email feedback to show@no-priors.com Follow us on Twitter: @NoPriorsPod | @Saranormous | @EladGil | @ilblackdragon Show Notes: (0:00:00) - Blockchain, AI, and Web3 Intersection (0:06:39) - How We Might Combine Blockchain and AI for Cancer Research (0:23:35) - Inference and Decentralized Data Labeling (0:30:13) - AI SaaS Strategic Challenges (0:38:18) - The Future of Hardware Accelerators
Transcript
A blockchain operating system might just be the key to a democratized Web3.
In fact, more than 25 million users are already getting a taste of this, thanks to NEAR.
This week, Elad and I are joined by Illia Polosukhin, the co-founder of NEAR and a co-author of the landmark Transformers paper,
to discuss the intersection of blockchain and AI technologies, what we should expect from AI agents,
how to handle the content authenticity problem
and why the alignment problem in AI is really a human problem.
Illia, welcome to No Priors. Thanks for doing this.
Thanks for inviting me.
You are one of the authors of the original Transformers paper.
We've also had Noam and Jakob on.
How did you get involved with that seminal work in AI?
I worked on a team doing natural language understanding,
focused on question answering,
and the state of the art at the time was LSTMs, recurrent networks,
which you could not launch in production at all because they were too slow and took a fair bit of time to process at document scale.
So Jakob at the time was using attention for query similarity, and he had this idea of using attention for an encoder-decoder setup.
I kind of jumped into it, and with Ashish we were playing around with whether we could actually get it to train, understand the order of words,
and do translation just based on, you know, attention.
So, yeah, it was pretty cool to explore that
and obviously grew into something very interesting and awesome.
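The attention mechanism described here can be sketched in a few lines of Python. This is a minimal, single-head, plain-list version for illustration only, not the paper's actual (batched, multi-head, learned-projection) implementation:

```python
import math

def scaled_dot_product_attention(queries, keys, values):
    """Minimal single-head attention over lists of float vectors.

    For each query: score every key with a scaled dot product,
    softmax the scores into weights, and return the weighted
    sum of the value vectors.
    """
    d = len(keys[0])  # key dimension, used for the 1/sqrt(d) scaling
    outputs = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in keys]
        m = max(scores)  # subtract max for numerical stability
        exps = [math.exp(s - m) for s in scores]
        total = sum(exps)
        weights = [e / total for e in exps]
        outputs.append([sum(w * v[j] for w, v in zip(weights, values))
                        for j in range(len(values[0]))])
    return outputs
```

When a query closely matches one key, the softmax weight concentrates on that key and the output approaches the corresponding value vector, which is the core "soft lookup" behavior the Transformer builds on.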
You originally co-founded NEAR in, I think, 2018,
intending for it to be an AI-focused company.
What was that initial mission,
and how did it become a blockchain company?
Yeah, so we started with this idea that we wanted to teach machines to code.
You know, we had Transformers coming out.
There was a lot of really interesting
push in '16, '17 around AI.
And so our expectation was we would kind of ride the exponential growth of AI, which has happened
this year; we thought it would happen in '17, '18.
And so with that, we got a really interesting data set around language to code.
But more interestingly, we had a whole community of developers, mostly students, who were
doing crowdsourcing for us.
So we would give them code, they would write descriptions; we would give them descriptions, they would write code, write tests, all kinds of tasks. And we actually faced a challenge paying them, because a lot of them were in China, in Eastern Europe, and other countries where there are monetary-control problems and people don't have bank accounts. And so we started looking into blockchain just to solve our own problem.
The AI kind of expansion, explosion, didn't happen at the time,
and so we saw an opportunity: we could actually build a blockchain
that we would use to solve this first and focus on that,
while kind of waiting out the AI thing to really happen.
And as you go down the blockchain rabbit hole, you realize
there's a lot more than meets the eye. Yeah, yeah, it ended up being a pretty big mission.
Exactly. So you call NEAR a blockchain operating system,
for any of our listeners who haven't used it.
Like, what does that mean?
So the idea is that we want to kind of go upstack, right?
We want kind of an environment where you can discover and use Web3 experiences,
you know, benefit from them, and not need to think about the low-level, you know,
implementations and quote-unquote hardware that runs under it, right?
So similar to how the operating system on your phone, you know, kind of abstracts out all the complexity
of networking and payments and everything, you just use it, and you have apps that developers can
build. And so that's really what we're trying to achieve: kind of build this framework
and platform for everybody to build their applications in Web3 and really deliver them to the
user and the consumer. Where do you see a lot of the overlap coming in terms of Web3 and
AI? You've thought very deeply about both. I remember when I first met you, you were just
switching from sort of NEAR's original mission to the blockchain-based mission. And,
you know, you were known as a team that could literally build anything, right? Like, you had
yourself and Alex and Pai Guy and all these amazing people. And you went down the direction
of building a blockchain, in part, I think, originally around this data-labeling kind of mission and
the ability to do payments and things like that. And now I know you've been thinking a lot
again about how these two worlds interact or intersect. Where do you think are going to be the
biggest places of overlap between AI and blockchain or Web3? There's a few levels of interesting
intersections. I think the most obvious one that everybody talks about is various marketplaces for
resources, be that compute, model, or data, right? So data crowdsourcing. So those are pretty
obvious, right? Web3 is really good at creating marketplaces, creating traceability, and
providing an equitable place for everyone to participate. Now, the more interesting one is
where AI agents come in, right, which we've seen, like, initial versions of, but obviously
they're going to continue evolving.
If you equip them with a blockchain account, right,
they now become economic agents
that are able to pay other people
and pay other AIs to do work, right?
And they can communicate, right?
And I think one of the things that a lot of people,
who are like, oh, these language models
are just the same advancement
as everything before,
are missing is that this is the first time
that a machine is able to communicate with people
in the same way, right? There's no more need for an intermediate human who interprets data
and then tells it to other people. Now the machine can communicate directly with people. And so
it can task them with work; it can provide them context. And so I really think one of the most
interesting cases is organizations that are run completely by AI, right, where the quote-unquote
CEO role is taken by an AI agent, who is tasked by, you know, the community or board of
directors or whatever the oversight governance is,
to hit specific KPIs and follow a specific mission.
They can even give specific feedback with training data
when they don't think it's doing the right job.
But what it does is, like, create this kind of new layer of management
that potentially removes a lot of middle management,
which right now is, like, transforming information and context for each individual person,
giving them a specific area of work, and then gathering their creativity
and putting it back together.
I think that's a very interesting use case that kind of really
melds blockchain and AI together.
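The idea of an AI agent holding its own blockchain account and paying workers on delivery can be sketched with a toy in-memory ledger. Everything here (the `Ledger` class, the account names like `ceo-agent.near`) is invented for illustration; it is not NEAR's actual API:

```python
class Ledger:
    """Toy in-memory ledger standing in for a blockchain's balances."""
    def __init__(self):
        self.balances = {}

    def open_account(self, name, deposit=0):
        self.balances[name] = deposit

    def transfer(self, src, dst, amount):
        if self.balances.get(src, 0) < amount:
            raise ValueError("insufficient funds")
        self.balances[src] -= amount
        self.balances[dst] = self.balances.get(dst, 0) + amount


class AgentAccount:
    """An AI agent as an economic actor: it holds a funded account
    and can task workers (human or AI) and pay them on delivery."""
    def __init__(self, ledger, name, budget):
        self.ledger, self.name = ledger, name
        ledger.open_account(name, budget)

    def pay_for_task(self, worker, price, result_ok):
        if result_ok:  # pay only when the delivered work is accepted
            self.ledger.transfer(self.name, worker, price)
        return result_ok
```

The point of the sketch is just the shape of the loop: once the agent has an account, "hire, accept, pay" needs no human intermediary.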
Why, like, if you have a traditional
biotech cancer research commercial entity,
why blockchain and why AI for that?
I use this example, right?
We want to, you know, continue making progress
on solving cancer, right?
And it's a very complex problem, right?
There's a lot of like specific sub-cancers
that, you know, need research.
And so all of this and like coordinating people
doing experiments, propagating information,
and recruiting, you know, people recruiting the candidates, right?
All of this requires, like, somebody to do this work and kind of organize the process
and really set up a lot of pipeline and, you know, funding and all those things.
And right now there's so much overhead around everything from, you know,
how grant funding is allocated from the nonprofits that collect money for research,
how, like, experiments are set up, the information sharing.
Like, all of those pieces are really kind of broken.
And so you can actually have, you know, a coordinated effort
that is designed just to do that,
and it can consume all this information
and kind of specifically task
who the best person is to do
an experiment, or which lab is the best
at doing a specific set of experiments,
fund them for this
amount of money, you know, oversee
their delivery, and then
kind of iterate. And, you know, if it thinks
a lab is not doing a good job, fire them,
without having, like, the extra
personal affiliations
that, you know, people do have.
I'm actually excited that some
folks are already building some examples of this in, like, simpler
forms, but I think we'll see, you know, the first organizations like this
probably even this year, potentially with simpler missions and kind of
more straightforward KPI metrics, but where kind of this information
propagation and onboarding of people happens already through a kind of
language-model AI agent. A simpler version of this that I've heard people talk about,
and it may be the first step towards it, is actually providing
on-the-job feedback via an AI versus, like, a human manager, with the idea that it depersonalizes
the feedback, right? So if you have an agent or an AI providing feedback, some surveys at least
have suggested that the average employee may be more comfortable with that, because it feels
more objective, it feels depersonalized, it feels like it can be provided in a directive way.
And it seems like that's one aspect of sort of this AI-as-CEO concept that you're describing.
Do you think the first place that it'll show up is DAOs, or do you think it'll show up in a different
part of the community?
Yeah, I think DAOs are, and especially with what happened with DAOs, there were a lot of people
who were really excited about DAOs kind of as a concept, and so they put a lot of time into
running them.
But it's actually a very, like, not interesting job, right?
It's like, you onboard new members, you explain to them all the same things, you know,
you answer their questions.
And so that's the part which, like, you can already automate, right?
You can, like, have a Discord bot that has all the context about the DAO's,
you know, interactions, and kind of onboards new people and gives
them, like, new, you know, tasks to start with and kind of coordinates them.
So I think that will be the first place where this kind of starts showing up, also
because you have, like, payments right there and you don't have the social constraints
that you usually have in regular organizations. Like, you know, a lot of people will revolt
if you, like, tomorrow say, hey, by the way, your new boss is this AI model.
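The Discord-bot style of DAO onboarding described here can be sketched as a simple stub. A real version would sit on top of a language model with the DAO's docs as context; the FAQ entries and task names below are made up for illustration:

```python
class OnboardingBot:
    """Toy DAO onboarding helper: answers common questions from a
    fixed knowledge base and hands each new member a starter task.
    A real bot would wrap a language model over the DAO's documents."""

    def __init__(self, faq, starter_tasks):
        self.faq = faq                  # question -> canned answer
        self.tasks = list(starter_tasks)
        self.members = []

    def answer(self, question):
        key = question.lower().strip("? ")
        return self.faq.get(key, "I'll escalate that to a human contributor.")

    def onboard(self, member):
        self.members.append(member)
        # round-robin assignment of starter tasks to new members
        task = self.tasks[(len(self.members) - 1) % len(self.tasks)]
        return f"Welcome {member}! Your first task: {task}"
```

Even this crude version automates the repetitive part of running a DAO: greeting, answering the same questions, and handing out first tasks.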
Yeah, yeah. How do you think about AI in the context, or I should say blockchain and AI in the
context, of things like alignment? Yeah. So I think this is a very interesting topic. So I have this
view that we need human alignment instead of AI alignment. So right now, kind of when we talk about,
you know, hey, we need to align AI with, like, human values. But the reality is that, you know,
all the problems that exist, they all exist because of humans doing things, and they've existed
before. I actually like to use the Byzantine fault tolerance problem, right, which is the basis for
blockchain, but its roots are in history, where there were people propagating misinformation
and you were trying to figure out how to prevent misinformation in the army.
So this is a really old problem of misinformation and kind of how to work around
that.
And so I think what we need to start doing is figuring out how do we build a society that is
actually able to deal with kind of effective misinformation at scale.
So, like, a lot of our society has started building up tolerance to misinformation around, you know, TV and mass media, but we don't have a system and framework for dealing with it at scale. And that's what AI brings: just scale, to the same problem. And so this is where reputation, identity, and kind of the systems around our social, like, code operating system that powers our communities are really important, right? How do all these
pieces work together, and how do they actually operate when there are malicious actors who
potentially are able to, you know, en masse create very personalized misinformation, or
create, you know, a fake political actor that is convincing every individual of exactly
what they think, you know, the government should do, in order to get elected? And this is where
Web 3 comes in as like a set of primitives, right? We have cryptography to authenticate content and
create a provenance trail, everything from, you know, when you take a picture with your camera. Some of them already
have a secure enclave that can sign the image that's taken. And so as that image gets processed,
we can actually propagate that information and have a proof that it came from, you know,
a specific time and place and was then processed by a specific set of filters, right. So that can
give you like an anchor. Then you still need to know kind of who is publishing what, right? Like we're
recording this podcast, you know, people listening to it, it could have been completely generated
at this point, right? But if, for example, we all sign the, you know, the final podcast and
say, hey, yes, we've recorded it and this is that content, now when somebody's listening to it,
they can check that indeed, hey, this content is signed by us. Now, the question of us comes in,
right? So this is where kind of identity and reputation is important. And so this is where
kind of on-chain identity becomes your kind of coalescence of all of the content and all the
interactions that you do. And then that links to kind of, you know, reputation in different communities
and provides context for people who are watching this content, to be able to understand,
you know, who is this person talking, where they're coming from, and what
information values they have. So I think it needs to be a kind of systematic approach. And
it will start with pieces, right?
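The sign-then-verify flow Illia describes for the podcast recording can be sketched with Python's standard library. Real content-provenance systems use public-key signatures (for example, Ed25519 keys held in a device's secure enclave); HMAC is used here only to keep the sketch stdlib-only and runnable:

```python
import hashlib
import hmac

def sign_content(secret_key: bytes, content: bytes) -> str:
    """Return a tag binding content to the holder of secret_key.
    Stands in for a real public-key signature from the publisher."""
    return hmac.new(secret_key, content, hashlib.sha256).hexdigest()

def verify_content(secret_key: bytes, content: bytes, tag: str) -> bool:
    """Check that the content matches the tag, i.e. it has not been
    altered since the key holder signed it."""
    return hmac.compare_digest(sign_content(secret_key, content), tag)
```

The listener-side check is the point: anyone with the published tag can detect a tampered or fully generated substitute, because any byte change breaks verification.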
I think one of the important pieces will be
kind of a green lock,
similar to how SSL transitioned,
but on the content, right? Like, as you go to
YouTube, as you go to the
New York Times, you actually will see
that, like, hey, this content has been signed by
this party, and this party is
in some trust root or trust
graph of
communities that you are following.
So that's probably, like, one important
piece. And again, blockchain and cryptography
are just, like, tools to enable that
product experience. And then from there, you know, we need similar things on the government
level, right, when you file paperwork, when you file, you know, your identity. The fact that your
SSN is a, you know, number that you give to everyone, which is, like, supposed to be secret, is,
for example, ridiculous. So things like that, all of this needs to improve and kind
of upgrade to this new level, where these kinds of things happening at massive scale
are now possible. What do you think is the most likely form of
blockchain-based identity because, you know, the blockchain really has been the earliest place where
you've had programmatic actors interacting around economic and other utility functions, right? It really is
money as code. And effectively, smart contracts are ways to programmatically interact with that, right? So
you had almost like the execution layer without the intelligence, and now we're adding the
intelligence. You have the cryptography, but you're missing a real sense of identity, which is needed
if you have an agent or bot representing you interacting with another agent, which is probably where
a lot of things will work in the future online. What do you think is the most likely form of
identity on the blockchain and why hasn't it happened yet? It has happened to some extent, right?
We have, you know, like millions of people actually using blockchain right now and they're using
it more for financial use cases, and kind of that's their financial identity. The wallet-is-identity
kind of thing. Yeah, the wallet has become an identity, right? And the reality is, like, your quote-unquote
private keys are your identity, but that's just too hard of a concept for people to actually work
with, right? And so on NEAR, we actually changed that. You know, you have a properly named
account, so, like, mine is root.near, which can have lots of different private keys accessing
it with different permissions, right? I can give a key, and in a way permissions, to an agent to, for
example, interact on its behalf, or I can revoke it, right? I can give it to a specific application,
etc. So, like, a more extensible model is needed. That's one. We need to have more social interactions
kind of being spawned from this.
And so this is, again, the blockchain operating system
actually powering social interactions
and kind of communication.
We actually have a project working on chat
and other ways of using now this identity
in more places.
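The named-account model Illia describes, one account with many keys, each scoped to different permissions, can be sketched as follows. This is a toy model loosely inspired by NEAR's full-access vs. function-call keys; the class, method names, and key names are invented, not the real protocol API:

```python
class Account:
    """Toy model of a named account (e.g. "root.near") holding many
    access keys, each allowed a certain set of actions."""

    FULL = "*"  # marker for a full-access key

    def __init__(self, name):
        self.name = name
        self.keys = {}  # public key -> set of allowed actions, or FULL

    def add_key(self, pubkey, allowed=FULL):
        self.keys[pubkey] = allowed

    def revoke_key(self, pubkey):
        self.keys.pop(pubkey, None)

    def authorize(self, pubkey, action):
        allowed = self.keys.get(pubkey)
        if allowed is None:
            return False            # unknown or revoked key
        return allowed == self.FULL or action in allowed
```

The extensibility point is that delegation becomes key management: an agent gets a narrowly scoped key instead of the account's full authority, and revoking that key withdraws the delegation without touching anything else.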
It's mostly because we didn't have a critical mass
of these applications that are using this identity
for it to really become kind of the core.
And if it's not the core, it's not as useful,
because everybody is, you know, like, hey, you don't have it,
so, like, we're not going to use it
as a default thing everywhere.
So, like, we really need to kind of get over that hump.
Like, again, I think SSL is a really good example of something that delivers
value.
It's clearly valuable.
But it was such an uphill battle to get it there, right?
And so I think, like, until you had this critical mass of websites switched over
and browser support, it didn't become a default, right?
So we kind of need the same thing to happen here.
Like, we'll need to have a critical mass of applications using this
identity, and then
we kind of
seed it, like, in browsers or
wallets or whatever applications
hold it, and then we'll see a transition
function happen, where it's like, hey, oh, you don't have it?
Like, you should get it, because it's actually easier
and better to use, and it gives you,
like, more financial freedom as well, and more
upside. Where do you think
the most likely failures,
like, system-wide, are
going to be, like, with
growing capabilities in AI? Like,
where are these mitigations,
in terms of reputation systems with blockchain
or content provenance, likely to,
how is it going to manifest in ways that affect us?
Yeah, I think there will be, probably next year,
it will be very interesting in the US,
because I think this will be a place
where everybody will just take whatever toys
they have in the toolbox and use them, even just for kicks, right?
Even if it's not malicious, although some players will be malicious.
And I think we'll see everything from, like, completely fake
narrative candidates to, like, I would be very interested to see, like, a web page where
you land and, you know, you log in, and it literally generates, specifically for this user,
based on their interests, an agenda for this candidate, right? So, like, hyper-focused, you know,
marketing for candidates based on, like, who this voter is, right? So things like that.
Like, we'll have all those possible things, where the media will kind of proliferate.
It's like, you know, you can spin up new media right now and just generate content about the
candidate that you want, and then market that. So, like, you can have all kinds of things
now just exploding, without any way of, like, framing it on the user side: like, does this have
history, is this coming from the right sources, has this been validated, right? And so I think that's
going to be really important. I think the other side, actually, is law enforcement, and this is,
sadly, already happening.
People are using these tools in very malicious ways right now, and law enforcement
doesn't have, like, really good ways to deal with this.
And so I think everything from this, like, on-camera signing, we need this now.
Like, they really have no way to identify if an image was generated or not.
And similarly for, you know, audio recordings and things like that; there need to be
kind of additional levels of verification.
And this goes into, actually, video calls and voice calls,
because right now somebody can call you on the phone
and play a generated audio
of somebody they recorded 30 seconds of.
And this can be used for very nefarious means, right?
It's a huge consumer fraud problem already.
Well, it's huge on the consumer side, but beyond that
it's becoming, like, a real criminal problem.
Like, criminals are able to use these tools
now, and the barrier to entry there is very low. And so this is where
you really need, like, you know, for phone calls and kind of all of this, more
identification and, like, cryptography embedded into the system. Otherwise,
it's completely going sideways really quickly. Yeah, this is where people would be using APIs like
ElevenLabs to create a voice snippet, right, where they'll upload, to your point,
30 seconds of voice, train a model, and then the output
sounds close enough to the person that you could fool a financial advisor or a bank or somebody
else into doing transactions on your behalf, or things like that. Yeah. And if you, like, swipe their phone,
and now you're able to impersonate them completely, right? So yeah. So this is, like, a real
problem, and having kind of an authenticated path is required there to really stop it. And, like,
the phones actually have so much already: we have Face ID and fingerprints; we have,
there are secure enclaves that sign things,
which, like, haven't been hacked,
as far as I know. So there's, like, a
lot of the pieces
that are there. Now we just need, like, a product stack
that actually pushes it
to the user and, like, to the products.
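The chain-of-custody idea behind this, a device's enclave signs the capture, and every later processing step is recorded on top, can be sketched as a simple hash chain. This is an illustrative construction only; a production system (for example, one following the C2PA approach) would use actual public-key signatures per step:

```python
import hashlib

def link(prev_hash: str, step: str, data: bytes) -> str:
    """Hash-chain one processing step onto a media item's provenance."""
    h = hashlib.sha256()
    h.update(prev_hash.encode())
    h.update(step.encode())
    h.update(data)
    return h.hexdigest()

def provenance_chain(steps):
    """steps: list of (step_name, data_bytes), starting with the capture
    step as a device enclave might attest it. Returns the chained hashes;
    the final hash commits to the entire editing history."""
    chain, prev = [], ""
    for name, data in steps:
        prev = link(prev, name, data)
        chain.append(prev)
    return chain
```

Because each hash commits to everything before it, altering the original capture or any intermediate edit changes the final hash, which is what lets a verifier detect a forged or spliced-in history.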
Yeah, that makes sense. I guess one
other area where some people have talked about
overlap between the
blockchain world and the AI world
is around training
and there's almost like two or three different forms
of that. One form of that is
there's a lot of GPU capacity that was purchased for mining on the crypto side. And given how
valuable GPUs are now on the training side, there are all sorts of models to aggregate GPUs
specifically for training in different ways, you know, aggregating excess capacity. And then, separate from that,
there's ideas around, well, can you train a model in a distributed way across a blockchain more
generally? Do you think either of those things are concepts that will work, or how do you think
about them relative to the future? Yeah, I mean, it's interesting,
because it sounds like such a no-brainer that, hey, let's grab those GPUs: for
example, Ethereum just moved from proof of work to proof of stake, let's grab those and
start using them. The challenge is, the GPUs there are, like, not the ones that AI folks want
to use, right? Like, kind of all of AI has really zeroed in on, how do we get A100s
or H100s? And the GPUs that folks used for Ethereum mining and the like are all
ones that are not as focused on, like, floating-point arithmetic, for example. And so
the challenge was more around, like, the people who did that. CoreWeave is probably a good example, right?
They were a mining company, but it's more that they had the know-how to build data centers, and
they could, like, talk to NVIDIA and get massive access to GPUs,
versus, like, repurposing the same GPUs. Although, I mean, obviously, for smaller models, for some
specific,
maybe, inference things,
there's maybe a transition.
Then there's the question of
decentralized training, right,
in general, right?
Like, hey, we have, like,
lots of GPUs everywhere,
can we train on them?
And the reality right now is
the requirements on bandwidth,
right: people
training these models right now,
they have, like, a,
you know, 800-gigabit
interconnect between the GPUs,
right? And maybe you have
100 megabits
between these, usually not even that,
and you need to, like,
replay and work around problems for the decentralized case.
So I think decentralized training right now is, like, still not as realistic, although there's
some research people are trying.
I think inference is really interesting, because we do need so much more compute for
inference than we need for training, right?
Like, it's a very interesting economy of scale:
you train once, like Llama was trained once, and then everybody runs it everywhere.
And so inference is where I think there are a lot of interesting cases.
One is, you want it to be private, right?
Right now, if you're doing inference, you need to send it to some service,
and that service may or may not record both the input and the output.
The second one is, you want large capacity that can scale with more usage, right?
Tomorrow I have 10x more users;
I want to be able to scale with that.
And so this is where I think using some of this hardware that exists, as well as
kind of leveraging maybe new methods of privacy and coordination, which, again,
crypto has, like MPC, multi-party computation, zero-knowledge proofs, etc., they can
be leveraged to achieve that and have kind of secure, decentralized inference. So I think
that's way more realistic than training, and also way more needed. And then, I guess, one of the
really early applications that NEAR was thinking about was data labeling and, to your point,
the ability to pay people who are doing data labeling for AI purposes, right? And since that time,
I think a number of companies have really grown up in the data labeling world in a
centralized way. There's Scale AI, there's Surge, there's a few others. Do you think the best
solution in the long run is still a decentralized model, where you're using tokens to pay effectively
for labeling? Do you think things will stay in the centralized world? Like, how do you view all that
evolving over time? Yeah, I think a decentralized, kind of Web3, marketplace is a more effective way
to do this, and it kind of provides a few interesting benefits. One of them is that it opens up
the market, right, where you don't need to set up, like, a local office and kind of hire
people and train them, et cetera. Like, you can just open up a global market; anybody can join.
And you have very specific rules, right, that, if they follow, they get paid, right? I've used
Mechanical Turk before, for example, and as a client, you can actually just decline to
pay the workers, right? So on Mechanical Turk, the workers have very little ability
to push back if I decline; at the same time, they don't have any, like, quality and knowledge
assessment on the platform, right? So I think having quality and knowledge assessment and this kind of escrow
model all embedded into one marketplace that opens up for everyone, where, you know, anybody
anywhere can get paid at any time, offers both what the people doing this work
want, because they're actually more protected and it's a fair game, and what the people
who want to give tasks want: they can actually get access to a way larger workforce, they can,
like, specify specific parameters, they can, you know, price it at whatever level they want. That's
going to be the kind of future of it. Can you talk a little bit about what makes the quality control
problem for annotation hard here? Right. Because one thing that I've seen
with significant research labs is, like, still continued insourcing of annotators for both pre-training
sets and RLHF, because some of the external services and marketplaces can't get to the level of
quality that they're looking for in particular domains. So can you just describe the dynamics there?
Yeah, so I think there are two parts. One is, like, domain knowledge, right? That's generally hard;
like, it's hard to tap into with a specific centralized service, because, for them to do payments, to do all the things, they need to set up a subsidiary in whatever country they have the workers in; they need to train them; they need to hire them, or maybe it's contracts; like, they need a lot of overhead to do that. For example, developers: let's imagine, you know, you're building a new, really cool developer platform which uses, you know, language models, and you want to fine-tune on code, right? Well,
for the existing platforms, that means, like, them hiring a bunch of developers to actually do this, right?
And, you know, if they're doing this full-time, it's, like, super complicated.
Then there's kind of building out the validation tooling for how to, like, cross-validate that the work has been done.
Now, on the Web3 marketplace, you know, any student can join and, like, do this, right?
They don't need to, like, you know, get a contract with a specific company.
They don't need to have the company in their local region to work
with them. And, like, students, you know, for coding, for example, are really interested in doing
this, because they usually don't have much money, and this is a way for them to practice their
work anyway. And then, as a task giver, you can actually specify the specific way you want
the cross-validation to happen. And one of the things we've done is, like, honeypots, right,
where you actually specify specific types of incorrect answers that people need to mark as incorrect,
and otherwise they actually lose their buy-in. And so there's actually, like,
very clear, like, economic game theory, where people have buy-ins, and they lose them if they,
like, do poor-quality work, and so they have, like, way more incentive to do this, versus, like, let's
say, if you're working on a contract, there's, like, way more leeway, usually, if you're not doing
your work right. So there's just way higher kind of self-evaluation as well that happens. And so,
I mean, there are a lot of pieces that need to come together for this to be
high quality. But again, it just opens up this marketplace and makes it effective. And it, in a way,
removes a lot of the human part as well. One thing that I think is really neat about how NEAR
approaches innovation is that you do both internal, sort of, NEAR roadmapping and product development,
and then you also have a series of things that you either spin out or spin up or are sort of
involved with, as sort of these ancillary companies or projects or efforts. What areas are you most
excited about over the coming year, in terms of either NEAR or some of these other efforts
that you're involved with?
So we do actually have a project in this Web3 AI data marketplace space that we are spinning
out to focus on it.
They've built the product.
They have all the pieces.
Now it's ready to actually go to market and bring in customers.
I think the really interesting area is kind of partnering with existing teams, either
already Web3-enabled or interested in Web3, who want to give their users access to
more functionality, right?
We have, for example, Sweatcoin, which is a really good example: it was a Web2
project that had 120 million installs, that had a ton of people using it every day, kind of
for a very specific use case, right, kind of tracking their steps and, you know, maybe getting
a discount on their next shoes.
But now, as they're transforming to Web3, they're kind of opening up, right? You can now
participate in economic activity, you can, you know, learn about new kinds of innovations that are
happening in the ecosystem, and, like, as they integrate more into the
blockchain operating system, you can potentially interact, like, on the social side, do the
tasks and gigs. And so you kind of really open up what before was, like, a very
limited kind of economy to really this, like, you know, composable open web. I think that's really
exciting, and, like, we will probably see more and more examples of that. And finally, I'm really
interested in kind of, as I mentioned, like, because we have now open web and social wear,
the kind of what I call future of SaaS.
So I think a lot of, between Web3 and AI,
a lot of SaaS will actually start being replaced.
Because right now, what SaaS is, is one database with a specific UI for a specific problem. The database is basically the same between the CRM, the hiring tool, the marketing tool, even some of the project management tools; the underlying database is not that different. It's really just the front end that changes. And interconnecting all of these databases is a ton of work, and it always breaks. But now you can have a database that you own, using Web3 tech, and then build all of these front ends on top, either through blockchain operating system shared components or even by describing, in natural language, some of the interfaces and business processes you want to have. So the way people interact with their business operations and
all the tooling they need will start to change.
And so I'm really excited about this space.
And we have one company that is starting to build out some of the things in this space, and over the next year we'll see that evolving.
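The "same database, different front ends" point can be sketched in a few lines of plain Python. The record schema and tool names below are made up purely for illustration; the idea is just that a CRM, a hiring tool, and a project tracker are mostly different projections over one shared store.

```python
# Illustrative sketch: several "SaaS tools" as filtered views
# over one shared record store that the owner controls.
# The schema and field names here are hypothetical.

records = [
    {"type": "contact",   "name": "Ada",         "stage": "lead"},
    {"type": "candidate", "name": "Grace",       "stage": "interview"},
    {"type": "task",      "name": "Launch page", "stage": "in_progress"},
]

def view(kind):
    """A 'front end' is just a projection of the shared store."""
    return [r for r in records if r["type"] == kind]

crm_view    = view("contact")    # what a CRM-style UI would show
hiring_view = view("candidate")  # what a hiring-tool UI would show
tasks_view  = view("task")       # what a project-management UI would show
```

Each view could then get its own generated interface, while the underlying data stays in one place the user owns.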
Do you think that moves to an agent-driven world?
In other words, when you imagine the interfaces on top of this that are sort of driving these business processes for future SaaS applications,
do you view them as sort of traditional UIs or do you view them as agents that are interacting programmatically or some hybrid?
It will be a hybrid.
So, in my imagination right now, at least, I expect you'll be able to describe a business process, like: hey, when we have a new creative from the marketing department, spin up a Twitter campaign and create me a dashboard that tracks conversions on our product, right? And so what it does is create the pipeline of those things, and then it also creates a page where I can see a normal analytics user interface.
So it might be more of a generated, dynamic UI.
Exactly, yeah. And it's adjusted for the specific use case you need, and probably there's a bunch of templates that are fine-tuned for your specific problem. And this is possible right now.
Yeah, I guess it moves you down the path of what you were talking about earlier, AI as CEO or AI as project manager, where you're morphing into a world where you delegate to an AI to drive a bunch of activities and then come back to you with the results, like you would with an employee or a coworker. Which is very different from the world of UI today,
where you just go to the same spot to see analytics,
you go to the same spot for communication,
which is your email,
you go to the same spot for,
you know,
interacting with the workflow.
And you're saying this should be more of a dynamic world
where things get brought back to you
based on a series of tasks that you send out.
Yeah.
And it's like probably a shared environment as well
where, you know,
we probably will co-work on a business process
and, you know,
we'll share one display,
but then we'll maybe fork it
because I'm more interested in conversion
and you're more interested in retention,
for example.
And so that's the kind of dynamism that also doesn't exist right now, where we all look at the same Jira task management board and I don't really care about half of this stuff, right? But it's not a filter problem; it's that I want different information shown in a different way.
As an author of the paper that changed the world, here we are in 2023: is it bigger Transformers all the way, or are there other architectural directions worth thinking about that you're paying attention to?
I think there's definitely something around how we get these models to have the capacity to let themselves think before outputting, or to process more.
And I think it's still within the Transformer structure, and it can be advanced. But I haven't seen anything that really matches my intuition around this yet. I think the simplicity of this architecture, and indeed the amount of optimization going into it right now, will just be really hard to match. And given enough expressivity, you can express any function.
So the problem at this point is not, hey, we don't have enough expressivity. It's more around: how do we compose a dataset that's cleaner and better, or add some self-critique and understanding of whether this content is correct, or whether I need more time to think, versus, hey, I'm forcing you to output the next token even if you don't have an answer yet. So I think we really need those parts, and I think they fit in the architecture, but they just require more engineering and more different types of tasks for training as well.
I think the fact that we're just using a big language model is kind of interesting, because predicting the next token is not the task you'd expect everything to fall out of. So, RLHF is obviously already helpful, but also starting to ask: hey, can you critique this answer, what would be a better answer, et cetera.
Do you view that as a training or fine-tuning thing, or do you view it as an inference thing?
I mean, it's going to be a combination, right? I think we just need an architecture where you're able to do this at training time. The simplest thing is, instead of outputting the next token right away, you can actually give it an empty token, for example, for some period of time, and then when it says, okay, I'm ready, let it output the next token. This way you can train it to think more before outputting. And then at inference time, you can vary it: hey, I'll give you more time to think, or, no, you have no time to think. But you can also train it to dynamically decide when to output. So again, this is a very simple thing, but you can keep expanding on it: output something, feed it back, ask whether this is the right answer, et cetera. So there are a few different models.
But I think, to Jakob's point, the fact that this model is doing a really effective search in this knowledge space means that pushing more into that direction is probably more useful than doing more search at inference time, because doing search at inference time means you've already lost all the semantics.
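As a toy illustration of the "empty token before answering" idea described above: at inference time the model could emit a special thinking token that adds internal computation without visible output, until it signals it is ready to answer. The token names, the model interface, and the budget parameter below are all hypothetical, not from any shipped system or from the Transformers paper.

```python
# Toy decoding loop for the "think before outputting" scheme:
# the model may emit <think> tokens (internal steps, no visible
# output) until it emits <ready>, after which it answers normally.
# The inference-time budget max_think can be varied, as discussed.

THINK, READY = "<think>", "<ready>"

def generate(model, prompt, max_think=8, max_tokens=20):
    """model(tokens) -> next token; here a callable stands in for a real LM."""
    tokens = list(prompt)
    thinking = 0
    while thinking < max_think:
        nxt = model(tokens)
        tokens.append(nxt)
        if nxt == READY:           # model decided it has thought enough
            break
        thinking += 1
    output = []
    while len(output) < max_tokens:
        nxt = model(tokens)
        if nxt is None:            # stand-in for end-of-sequence
            break
        tokens.append(nxt)
        output.append(nxt)
    return output

# A stub "model" that thinks twice, signals ready, then answers:
script = iter([THINK, THINK, READY, "42", None])
answer = generate(lambda toks: next(script), ["what", "is", "6*7"])
```

The visible answer contains only the post-`<ready>` tokens; the thinking steps consumed compute but produced no output, which is the point of the scheme.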
I think you made a really interesting point, which is that it's possible the Transformer architecture is increasingly getting locked in. And there are two components to that: one is that it just seems to run really well on the main silicon we're using right now for AI, which is GPUs. And secondly, there's so much optimization work going into it, and so much being built around it, that it effectively creates optimization that just won't happen for other models anytime soon. So you end up with this interesting feedback loop, or lock-in effect, for this set of models. Do you think we're in a spot now where this is just the future for the next five or ten years? Or what do you think is the likelihood that other approaches or architectures emerge anytime soon?
I mean, there might be another architecture that reasonably fits the same silicon. There's an interesting example of a company that built an alternative silicon that allows you to process things in pipelines. The chips are actually smaller compute tiles, but they're all arranged in a grid, and the data flows from one side to the other. On one hand, it's a really interesting architecture and you can build really cool things with it, but it doesn't fit Transformers very well. You can run Transformers on it, but it doesn't fit well, and the cost-to-output ratio you get is not that interesting.
That's in comparison to just optimizing on GPUs or using some of the new hardware accelerators. And this is an example where, I mean, I don't want to speculate on a specific company, but I wouldn't expect a ton of people lining up, because there isn't a ton of alternatives to Transformers coming, and somebody would need to go in and develop a lot of new architectures that fit that hardware better. So it will be really hard for them to be a viable business and have the economies of scale that NVIDIA has right now to continue optimizing and building the best state-of-the-art chips. So unless somebody really invests in this, I think it will be more about what else we can do with current silicon, and combinations of that. And then maybe something new will come out.
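As a rough software analogy for the dataflow design described a moment ago (purely illustrative; this says nothing about any specific vendor's chip), each input streams through a chain of small compute stages rather than being handled by one large processor:

```python
# Toy analogy for a dataflow/pipeline accelerator: each input flows
# through a sequence of small compute stages, like data moving across
# a grid of tiles from one side to the other. Purely illustrative.

def pipeline(stages, stream):
    """Push each input through every stage in order."""
    out = []
    for x in stream:
        for stage in stages:
            x = stage(x)
        out.append(x)
    return out

# Two tiny "tiles": double the value, then add one.
doubled_plus_one = pipeline([lambda x: 2 * x, lambda x: x + 1], [1, 2, 3])
```

The mismatch Illia describes is that a workload like a Transformer, dominated by large dense matrix multiplies, does not decompose naturally into this kind of stage-by-stage stream.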
Yeah, but when things lock in technologically, they tend to lock in pretty strongly until there's a really big sea change, or until the optimization of those things hits an asymptote. And it's interesting, because I think a prior example of this kind of chip-plus-software reinforcement loop was the Windows and Intel monopolies of the '90s. They used to call it Wintel, for Windows and Intel, because there was such a strong mutual lock-in effect: the chips were optimized for Windows, Windows was optimized for the chipset, and it just kept going from there. And this feels like a stronger version of that in some sense, where you have the underlying compute architecture and the most important model reinforcing each other in a way that locks both of them in.
Yeah.
And what changed that was pretty much the coming of mobile, right, and the creation of ARM devices, ARM chips that were optimized for mobile and then came back into PCs. So yeah, unless there's a completely new form factor, which is hard to predict. But also, that's a lot of investment, going into not just software, not just hardware, but full-stack innovation.
Yeah, I think it's unclear whether this is a strong enough market force, but there's the short-term supply-demand imbalance around GPUs, with all the growth of applications; especially if any of these applications work, inference needs grow, right? The ability for NVIDIA, really, to build enough GPUs to service the demand is blocking a lot of companies. And I think the question is: there is more incentive to make heterogeneous hardware work than there has ever been. Can that catch up with the full-stack optimization you describe, the CUDA investment that NVIDIA has made? It's super unclear. But there's been no reason to chase that until the past 18 months, and I think now there is.
Yeah.
But at the same time, we have every single large company doing its own hardware accelerator, as well as a bunch of folks who have spun out of those. So we're going to have a market full of hardware accelerators, still optimized for Transformers or at least similarly structured architectures, hitting the market this year and next year.
Yeah.
Illia, this is great. After Elad and I work through all of the Transformers authors, Pokémon style, gotta catch them all, I hope you'll come back for a reunion episode. But thank you for doing this.
Yeah, thanks for having me.
For sure.
Thank you.
Find us on Twitter at NoPriorsPod.
Subscribe to our YouTube channel if you want to see our faces, follow the show on Apple Podcasts, Spotify, or wherever you listen.
That way you get a new episode every week.
And sign up for emails or find transcripts for every episode at no-priors.com.