a16z Podcast - AI x Crypto
Episode Date: September 13, 2023
with @alive_eth @danboneh @smc90

This week's all-new episode covers the convergence of two important, very top-of-mind trends: AI (artificial intelligence) & blockchains/crypto. These domains together... have major implications for how we all live our lives every day; so this episode is for anyone just curious about, or already building in the space. The conversation covers topics ranging from deep fakes, bots, and the need for proof-of-humanity in a world of AI; to big data, large language models like ChatGPT, user control, governance, privacy and security, zero knowledge and zkML; to MEV, media, art, and much more.

Our expert guests (in conversation with host Sonal Chokshi) include:

Dan Boneh, Stanford Professor (and Senior Research Advisor at a16z crypto), a cryptographer who's been working on blockchains for over a decade and who specializes in cryptography, computer security, and machine learning -- all of which intersect in this episode;

Ali Yahya, general partner at a16z crypto, who also previously worked at Google -- where he not only worked on a distributed system for a fleet of robots (a sort of "collective reinforcement learning") but also worked on Google Brain, where he was one of the core contributors to the machine learning library TensorFlow built at Google.

The first half of the hallway-style conversation between Ali & Dan (who go back together as student and professor at Stanford) is all about how AI could benefit from crypto, and the second half on how crypto could benefit from AI... the thread throughout is the tension between centralization vs. decentralization. So we also discuss where the intersection of crypto and AI can bring about things that aren't possible by either one of them alone...

Pieces referenced in this episode / related reading:

The Next Cyber Reasoning System for Cyber Security (2023) by Mohamed Ferrag, Ammar Battah, Norbert Tihanyi, Merouane Debbah, Thierry Lestable, Lucas Cordeiro

A New Era in Software Security: Towards Self-Healing Software via Large Language Models and Formal Verification (2023) by Yiannis Charalambous, Norbert Tihanyi, Ridhi Jain, Youcheng Sun, Mohamed Ferrag, Lucas Cordeiro

Fixing Hardware Security Bugs with Large Language Models (2023) by Baleegh Ahmad, Shailja Thakur, Benjamin Tan, Ramesh Karri, Hammond Pearce

Do Users Write More Insecure Code with AI Assistants? (2022) by Neil Perry, Megha Srivastava, Deepak Kumar, Dan Boneh

Asleep at the Keyboard? Assessing the Security of GitHub Copilot's Code Contributions (2021) by Hammond Pearce, Baleegh Ahmad, Benjamin Tan, Brendan Dolan-Gavitt, Ramesh Karri

Voting, Security, and Governance in Blockchains (2019) with Ali Yahya and Phil Daian

As a reminder: none of the following should be taken as investment, legal, business, or tax advice; please see a16z.com/disclosures for more important information -- including a link to a list of our investments -- especially since we are investors in companies mentioned in this episode.

Stay Updated:
Find a16z on Twitter: https://twitter.com/a16z
Find a16z on LinkedIn: https://www.linkedin.com/company/a16z
Subscribe on your favorite podcast app: https://a16z.simplecast.com/
Follow our host: https://twitter.com/stephsmithio

Please note that the content here is for informational purposes only; should NOT be taken as legal, business, tax, or investment advice or be used to evaluate any investment or security; and is not directed at any investors or potential investors in any a16z fund.
a16z and its affiliates may maintain investments in the companies discussed. For more details please see a16z.com/disclosures.
Transcript
AI is very much a technology that thrives and enables top-down centralized control,
whereas crypto is a technology that's all about bottom-up decentralized cooperation.
One of the points of NFTs was to support the artists.
Yes.
But if the artists themselves are now machine learning models, then who exactly are we supporting?
One of the things that will become important in a world where anyone can participate online
is to be able to prove that you are human
for various different purposes.
If we're going to incentivize people to contribute data,
basically we're going to incentivize people to create fake data
so they can get paid.
So we have to have some sort of a mechanism
to make sure that the data you're contributing is authentic.
Hello, everyone, and welcome back to the A16Z podcast.
This is your host, Steph Smith,
but today I'm passing the baton back to longtime host Sonal Chokshi.
This, of course, is also a crossover episode
from our sister podcast, Web3 with A16Z, which Sonal now hosts.
There are few technologies over the last few years that have quite captured the zeitgeist
like crypto and AI.
So in today's episode, Sonal sits down with guests Ali Yahya and Dan Boneh to explore the ways
in which these two emerging technologies oppose, yet also beautifully augment one another.
And they attack this from both directions: how crypto can help AI, like how crypto acts as a decentralized counterweight to the somewhat centralizing force
where AI models with more data, more compute, and more complex models do tend to win.
But also how AI can help crypto.
For example, are we at the point where LLMs should be writing smart contract code?
And what about all these deepfakes we keep hearing about?
Let's find out.
Welcome to web3 with a16z, a show about building the next generation of the internet from the team at a16z crypto.
That includes me, your host, Sonal Chokshi.
Today's all new episode covers the convergence of two important top of mind trends,
AI, artificial intelligence, and crypto.
This has major implications for how we all live our lives every day,
so this episode is for anyone just curious about or already building in the space.
Our special guests today are Dan Boneh, Stanford professor
and senior research advisor at a16z crypto. He's a cryptographer
who's been working on blockchains for over a decade, and the topics have a strong intersection
between cryptography, computer security, and machine learning, all of which are his areas of
expertise. And then we also have Ali Yahya, general partner at a16z crypto, who also worked
at Google previously, where he not only worked on a distributed system for robotics, more specifically
as sort of collective reinforcement learning, which involved training a single neural network
that contributed to the actions of an entire fleet of robots, but also worked on Google Brain,
where he was one of the core contributors to the machine learning library TensorFlow.
And actually, Dan and Ali go back since Ali was an undergrad and master's student at Stanford.
So this conversation is really more of a hallway jam between them that I asked to join.
And we cover everything from deep fakes and bots to proof of humanity in a world of AI and much, much more.
The first half is all about how AI could benefit from crypto and the second half on how crypto could benefit from AI.
And the thread throughout is the tension between centralization,
versus decentralization.
As a reminder, none of the following should be taken as investment, legal, business, or tax
advice.
Please see a16z.com/disclosures for more important information, including a link to a list
of our investments, especially since we are investors in companies mentioned in this episode.
But first, we begin with how the two worlds intersect with the quick sharing of areas or visions
that they're excited about.
The first voice you'll hear is Ali's.
There is a really good sci-fi novel called The Diamond Age by Neal Stephenson, in which
there is this device known as the illustrated primer that is a kind of artificially intelligent
device that acts as your mentor and your teacher throughout life. And so when you're born,
you're paired to an AI essentially that knows you really well, learns your preferences,
follows you throughout life, and helps you make decisions and steers you in the right direction.
So there's like a sci-fi future in which you could build such an AI,
but you very much wouldn't want that AI to be controlled by a monopolistic
tech giant in the middle, because that position would provide that company with a great deal of
control, and it raises these kinds of questions of privacy and sovereignty, and you'd want to have
kind of control over it. And then also what if the company goes away or they change the rules or
they change the pricing, it would be great if you could build an AI that could run for a very,
very long time and could get to know you over the course of a lifetime, but have that really be
yours. And so there is this vision in which you could do that with a blockchain. You could
embed an AI within a smart contract and with the power of zero knowledge proofs,
you could also keep your data private.
And then over the course of decades, this AI can become smarter and can help you.
And then you have the option to do whatever you want with it or change it in whichever
way you want or shut it down.
And so that's kind of an interesting vision for long-running AIs that are continually
evolving and continually becoming better.
It'd be better if it were the case that they weren't just controlled by a single centralized
company.
Of course, it's a very science fiction idea because there
are lots of problems, including the problems of verification and the problems of keeping
data private using cryptography and still being able to compute on top of that data, maybe with
fully homomorphic encryption. All of these problems continue to be outstanding, but it's not
something that's inconceivable. Well, I love Ali's vision there. I love it too, especially
given that quote, I think it was Asimov that today's science fiction is tomorrow's science fact.
Ali, I know you have a meta-framework for thinking about all this stuff that I've heard you
share before. Can you share that now too? Yeah, there is this broader narrative that has existed
for quite some time now that's only becoming much more accentuated now with the development of
things like LLMs. Actually, if I may really quickly, just for listeners who aren't already familiar,
just as context. So LLM stands for large language model, and it uses some of the technology
that was developed at Google. Back in 2017, there's this famous paper known as "Attention Is All
You Need." That was the title of the paper. And it outlined what are now known
as transformers. And that's at the basis, basically, of some of the new models that people have
been training these days, including ChatGPT and so on. All of these are large language models, or
LLMs. There was that famous, I think 2018, line from Peter Thiel that AI is communist and crypto is
libertarian. That line is like very on point, actually, because AI and crypto in many ways are
natural counterweights for one another. And maybe we can go deep over the course of the podcast into each
one of these as we go through examples, but there are four major ways in which that's true.
The first is that AI is very much a technology that thrives and enables top-down centralized
control, whereas crypto is a technology that's all about bottom-up decentralized cooperation.
And in many ways, actually, you can think of crypto as the study of building systems that
are decentralized that enable large-scale cooperation of humans where there isn't really any
central point of control. And so that's one natural way in which these two technologies are
counterweight for one another. Another one is that AI is a sustaining innovation in that it
reinforces the business models of existing technology companies because it helps them make top
down decisions. And the best example of this would be Google being able to decide exactly
what ad to display for each of their users across billions of users and billions of page views,
whereas crypto is actually a fundamentally disruptive innovation in that it has a business model
that's fundamentally at odds with the business models of big tech companies.
And so as a result, it's a movement that is spearheaded by rebels, by the fringes,
as opposed to being led by the incumbents.
So that's the second.
A third one is that AI will probably relate and interplay a lot with all of the trends
towards privacy because AI as a technology has built in all sorts of incentives
that move us towards less individual privacy, because we
will have companies that want access to all of our data. And AI models that are trained on more
and more data will become more and more effective. And so I think that that leads us down a path
of the AI Panopticon where there's just collective aggregation of everyone's data into
the training of these enormous models in order to make these models as good as possible.
Whereas crypto moves us towards the opposite direction, which is a direction of increasing individual
privacy. It's a direction of increasing sovereignty where users have control over their
own data. And those two trends, I think, will be very important. And this is just another
important way in which crypto is the counterweight for AI. And maybe the final one has to do with
this latest trend in AI, the fact that AI is now very clearly a powerful technology for
generating new art, and is now a creative tool that will lead us to infinite abundance of media,
infinite creativity in many ways. And crypto is a counterweight to that, because
it helps us cut through all of the abundance by helping us distinguish what's created by humans versus what's created by AI.
And cryptography will be an essential part of maintaining and preserving what actually is human in a world where a thousand X more of the content is actually artificially generated.
So these are all things that we can talk about, but I think that there is this important meta-narrative.
And these two technologies are very much diametrically opposed in many respects.
So maybe, Ali, to add to that, this is a wonderful summary.
And I would say also that there are a lot of areas where techniques from AI are having an impact in
blockchains and vice versa, where techniques from blockchains are having an impact in AI.
I'll give a brief answer here because we're going to dive into the details in just a minute.
But there are many points of intersection.
I guess we'll talk about applications of zero knowledge for machine learning in just a minute.
But I also want to touch on all these applications where machine learning itself can be used to write
code. So, for example, machine learning can be used to write Solidity code that goes into a smart contract.
It can be used to maybe find errors in code and so on. There are points of intersection where
machine learning can be used to generate deepfakes and blockchains can actually help to protect
against deep fakes. And so I guess we're going to touch on all these points. But the interesting
thing is that there's really quite a lot of intersection between blockchains and machine learning.
Yeah, before we dive into those, one question I have for you, Dan, is do you agree with that?
I mean, I definitely hear Ali's point that AI and crypto are very natural complements actually
or counterweights really for each other.
Or, you know, they can be different forces that can kind of check and balance each other almost.
But is this an inherent quality to AI and crypto in your opinion?
Or is this just an artifact of the way things are done right now?
What parts might you agree or disagree with?
Yeah.
So I would say that if you look at it from far away, the techniques that are used in AI,
they seem very different from the techniques that are used in blockchains.
So blockchains is about cryptography, decentralization, finance, and economics, and so on, whereas AI is, you know, about the statistics, the mathematics of machine learning and so on. It's about big data, right? The techniques actually look quite different. But there are actually a lot of places where one side can help the other and vice versa. So maybe the first one to start with is kind of the obvious one that's been on a lot of people's minds, which is what's called the applications of zero knowledge for machine learning. This is kind of an emerging area. It's called a ZKML.
And the reason this has become interesting is because ZK techniques have improved dramatically because of their application in blockchains.
What's happened over the last 10 years is sort of unbelievable.
You know, it's something that we don't see very often.
This idea of zero knowledge proofs and proof systems in general, they were considered very theoretical a decade ago.
And because of all of their applications in blockchains, all of a sudden, there was a lot of effort in making them more practical and real world.
And as a result, there's been tremendous progress; as our listeners
know, now these things are actually deployed and used to protect real systems.
So the question then is, can zero knowledge techniques be used to help machine learning?
And there are a couple of examples, honestly, we could spend a whole podcast just on ZKML.
But maybe I can just give a taste, one or two examples where ZK is useful for machine learning.
And so imagine Alice has a secret model that she spent a lot of time training.
And that model is actually very important to her.
It's very important that people don't know how the model works.
But she still wants to be able to service requests from Bob.
So Bob would like to send her some data.
She would apply the model to the data, send the result back to Bob.
Bob has no idea whether he's getting the correct results on the model, right?
Maybe he paid for a certain model and he wants to make sure that Alice is really using that model.
Maybe he paid for GPT-4 and he wants to make sure Alice is really using GPT-4 and not GPT-3.
Well, it turns out ZK techniques can help here a lot.
So what Alice would do, she would commit to her model, make the commitment publicly available.
And then whenever Bob submits a piece of data,
Alice could run the model on that data, send the results back to Bob, along with a proof that
the model was evaluated correctly. So Bob now would have a guarantee that, in fact, the model
that was committed to is the one that was run on Bob's data. Yeah? So that's an example where
ZK techniques can be useful in the ML case. And I want to kind of stress why this is so important.
So let's look at one example. So suppose we have a function, a model that's actually used to
affect people's lives. Like imagine, you know, maybe we use a model to decide whether we grant a loan
or grant a mortgage, you know, a financial institution might want to use a model like that.
Well, you want to make sure that the same model is being applied to everyone, right? That it's not the
case that, you know, one model is being applied to me and a different model is being applied to you.
Well, by basically having the bank commit to the model, right? And then everyone can verify that
their data is being assessed by the same committed model. We can make sure
that the same model is being applied to everyone.
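To make that flow concrete, here is a minimal sketch of the commit-then-prove pattern being described. The hash commitment is real, but the proof step is only a placeholder for an actual zkML proving system (in practice, a SNARK over the model's computation), and all names and values here are illustrative rather than any particular library's API.

```python
# Minimal sketch (illustrative only): Alice commits to her model, serves Bob's
# query, and returns the result with a proof tied to that public commitment.
import hashlib, json

def commit(model_weights: list[float]) -> str:
    # Alice publishes this digest (e.g., on-chain) before serving any requests.
    return hashlib.sha256(json.dumps(model_weights).encode()).hexdigest()

def run_model(model_weights: list[float], x: list[float]) -> float:
    # Stand-in "model": a simple dot product.
    return sum(w * xi for w, xi in zip(model_weights, x))

def prove_inference(model_weights, x, y, commitment):
    # Placeholder: a real zkML system would produce a zero-knowledge proof that
    # "y = f_w(x) for the w behind `commitment`" without revealing w.
    return {"commitment": commitment, "input": x, "output": y, "proof": "<zk-proof>"}

# Alice's side
weights = [0.2, -1.3, 0.7]
c = commit(weights)                              # published once, up front
x_from_bob = [1.0, 2.0, 3.0]
y = run_model(weights, x_from_bob)
receipt = prove_inference(weights, x_from_bob, y, c)

# Bob's side: checks the proof against the public commitment `c`.
# (With a real zkML stack this check is cryptographic, not a field comparison.)
assert receipt["commitment"] == c and receipt["output"] == y
```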
And I have to say that there's a wonderful open problem here,
which is that even though zero knowledge techniques can make sure
that the same model is being applied to everyone,
there is this question, you know, is the model fair?
Models can have biases, could lead to unfair results.
And so there's a whole area of research.
It's called algorithmic fairness.
There are many, many papers on algorithmic fairness.
And it's really interesting to ask, well,
now that we have a committed model,
can we prove in zero knowledge that the model satisfies some fairness definition from the area
of algorithmic fairness? And how do we make sure that the training process ran correctly?
Well, everything that you said about ZKML is extremely exciting. And as a technology,
I think it'll play a role at making machine learning and AI sort of generally more transparent
and more trustworthy, both within the context of crypto and outside of it.
I think an even crazier and maybe longer term and more ambitious application of ZKML
and some of the other verification techniques that the crypto community has been working on
is just generally decentralizing AI.
Because as we were talking about before, AI is a technology that is almost inherently centralizing
in that it very much thrives from things like scale effects because having things
within a data center makes things more efficient. And so scaling things in a centralized way
makes for things to become more powerful and more centralized as a result. Then also data
is usually controlled by a small number of tech companies in the middle. And as a result,
also kind of leads to additional centralization. And then finally, machine learning models and
machine learning talent are also kind of controlled by a small number of players. And so crypto can again
help on this front by building technology, using things like ZKML, that can help us decentralize.
So there are three main things that go into AI. There's the compute aspect, and that requires
sort of large-scale use of GPUs, usually in data centers. There's the data piece, which, again,
most of the centralized companies control. And then there's the machine learning models themselves.
And the easiest one might be the prong of compute. Like, can you actually decentralize the
compute for the training and the inference of machine learning models? And this is where some of the
techniques that Dan was talking about, things like zero knowledge proofs that you can
use to prove that the process of actually conducting inference or training a model was actually
done correctly so that you can outsource that process to a large community and you can have
a distributed process by which anyone who has a GPU can contribute computation to the network
and have a model be trained in that way without necessarily having to rely on a massive data
center with all of the GPUs in a centralized manner. And there's a big question still of whether
or not that economically ends up making sense, but at the very least, through the right
incentives, you can actually tap into the long tail. You can tap into all of the idle GPU capacity
that might exist, have all of those people contribute that computation to the training of a model
or to the running of inference. And it provides an alternative to what otherwise would be just
the big tech companies in the middle that currently control everything. There are all sorts of
important technical problems that would have to be solved in order for that to be possible.
There's actually a company in the space called Gensyn, which is building exactly this.
They are building a decentralized marketplace for GPU compute very much for the purpose
of training machine learning models.
And it's a marketplace where anyone could contribute their GPU compute, whether it be in their
kind of personal computer under their desk or whether it be idle inside of some data center.
And then on the other side, anyone can leverage whatever compute exists in the network to train their large machine learning models.
And this would be an alternative to the very centralized sort of OpenAI slash Google slash Meta slash, you know, insert your favorite big tech company here, alternative that currently you would necessarily have to go with.
So before we go into more of the decentralization framework, because Ali, you were breaking down like compute and I think you were going to share the other two of those three prongs.
But before we do, both of you guys talked a little bit about all the technical challenges here.
So what are some of the technical challenges that need to be overcome here and that people may or may not already be solving?
I definitely want builders who listen to this episode to also think about what the opportunities are in this space and address existing challenges or what are some of the challenges they're going to face and building solutions here.
Yeah, so maybe I can mention two that I think would be interesting to folks.
So one is basically, imagine you have a situation where Alice actually has a model that she wants to protect.
She wants to send the model in an encrypted form to some other party, let's say to Bob.
So Bob now receives an encrypted model and he needs to be able to run his data on this
encrypted model.
Well, how do you do that?
If you have a piece of data that you want to run on a model but you only have the
encryption of the model, how do you make that possible?
And that is something that we would use what's called fully homomorphic encryption for.
Yeah.
Fully homomorphic encryption is this remarkable tool that allows you to compute on
encrypted data. This is kind of mind-boggling that this is possible, but you can have an
encrypted model, and you might have some clear-text data, and you can actually run the
encrypted model on the clear-text data and receive and obtain an encrypted result. You would
send the encrypted result back to Alice, and she would be able to decrypt and see the results
in the clear. So this is actually something that's already, there's actually quite a bit of demand
for this in practice. It doesn't take much effort to see that the DOD is interested in this. There
are many other applications where you can send an encrypted model to a third party. The third
party would run the encrypted model on their data, send you back the results. You can decrypt and
learn something about the data that was given as input to the encrypted model. The question,
of course, is how do we scale that up? Right now, this works well for medium-sized models.
And the question is, can we scale it up to much larger models? So this is quite a challenge.
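As a toy illustration of computing on encrypted data, here is a small Python sketch using Paillier encryption, which is only additively homomorphic (full FHE supports arbitrary computation, which this does not). Alice's encrypted linear model gets evaluated by Bob on cleartext data, and only Alice can decrypt the result. The primes, weights, and inputs are made-up toy values, not anything production-grade.

```python
# Toy Paillier sketch (not production crypto): Bob evaluates Alice's *encrypted*
# linear model w.x + b on his cleartext data without ever seeing the weights.
from math import gcd
import random

# --- key generation (tiny toy primes; real systems use ~1024-bit primes) ---
p, q = 499, 547
n, n2 = p * q, (p * q) ** 2
lam = (p - 1) * (q - 1) // gcd(p - 1, q - 1)     # lcm(p-1, q-1)
g = n + 1
mu = pow(lam, -1, n)                             # valid because g = n + 1

def encrypt(m: int) -> int:
    r = random.randrange(1, n)
    while gcd(r, n) != 1:
        r = random.randrange(1, n)
    return (pow(g, m, n2) * pow(r, n, n2)) % n2

def decrypt(c: int) -> int:
    return ((pow(c, lam, n2) - 1) // n) * mu % n

# Homomorphic properties used below:
#   Enc(a) * Enc(b) mod n^2  ->  Enc(a + b)
#   Enc(a) ** k  mod n^2     ->  Enc(k * a)

# Alice encrypts her secret linear model.
weights, bias = [3, 5, 2], 7
enc_weights = [encrypt(w) for w in weights]
enc_bias = encrypt(bias)

# Bob evaluates the encrypted model on his cleartext input x.
x = [10, 4, 6]
enc_result = enc_bias
for c_w, x_i in zip(enc_weights, x):
    enc_result = (enc_result * pow(c_w, x_i, n2)) % n2   # adds w_i * x_i

# Alice decrypts the returned ciphertext.
assert decrypt(enc_result) == sum(w * xi for w, xi in zip(weights, x)) + bias  # 69
```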
A couple of startups in the space. And again, very, very interesting technology. It's kind of
amazing that this is possible at all. Yeah. And we're probably going to see much
more of that in the future. The other area is actually what Ali mentioned, another very important
area, which is how do we know that the model was trained correctly? So if I send my data to someone
and I ask him to train a model on their data, maybe fine-tune the model on that data, how do I know
that they did that correctly? They might send me a model back, but how do I know that the model
doesn't have back doors in it? There's actually a fair amount of work on showing that if the training
is done incorrectly, I could send you back a model that would work correctly on all of
your test data, but it has a backdoor, meaning that it would fail catastrophically on one particular
input. This is possible if your training process is not verified. And again, this is an area
where ZKML comes in. We can prove to you that the training ran correctly, or maybe there's some
other techniques that might be possible to prove that the training was done correctly. But then,
this is another area and a very active area of research. And I would encourage many of the listeners,
this is like a very, very difficult problem, proving that training was done correctly,
proving that the training data even was collected correctly and was filtered correctly and
so on. So that actually is a wonderful area to get into if people are looking to do more work in
the space. Fantastic. Ali, is there anything you'd add to that? Yeah, definitely. Well, I guess if we
continue down the path of talking about what it would take to help decentralize the AI stack,
I think that in order to decentralize the compute prong, and there are the three important prongs.
If we wanted to decentralize the compute aspect, there are two very important open technical challenges.
The first is the problem of verification, which Dan just mentioned, which you could use ZKML for.
And you can ideally, over time, use zero knowledge proofs to prove that the actual work done by the people who are contributing to this network was actually done correctly.
And the challenge there is that the performance of these cryptographic primitives is nowhere near where it needs to be to be able to do either training or inference of the very, very large models.
So the models today, like sort of the LLMs that we all kind of know and love now, like ChatGPT, would not be provable using the current state of the art of ZKML.
And so there's a lot of work that's being done towards improving the performance of the proving process so that you can prove larger and larger workloads efficiently.
But that's an open problem and something that a lot of people are working on.
And in the meantime, companies like Gensyn are using other techniques that are not just cryptographic and instead are
game theoretic in nature where they just get a larger number of people who are independent
from one another to do the work and compare their work with one another to make sure that
the work's done correctly. And that is more of a game theoretic, optimistic approach
that is not relying on cryptography, but is still aligned with this greater goal of decentralizing
AI or helping create an ecosystem for AI that is much more organic community owned and bottom up,
as opposed to the top down that's being sort of put forth by companies like OpenAI.
So that's the first problem.
The first big problem is the problem of verification.
And the second big one is the problem of distributed systems.
Like how do you actually coordinate a large community of people who are contributing GPUs to a network such that it all feels like an integrated, unified substrate for computation?
And there will be lots of interesting challenges along the lines of breaking up a machine learning workload in a way that
makes sense and shipping off different pieces of it to different nodes in the network,
figuring out how to do all of that efficiently, and then also when nodes fail, figure out
how to recover and assign new nodes to then take over whatever work was being done by the
node that failed.
So there are lots of messy details at the distributed systems level that companies will have
to solve in order to give us this decentralized network that can perform machine learning
workloads in a way that's perhaps even cheaper than just using the cloud.
Yeah, that's great. Totally. It's definitely true that the ZK techniques today will handle the smaller models that are being used, but definitely the LLMs are probably too big for these techniques to handle today, the ZK techniques. But, you know, they're constantly getting better. The hardware is getting better. And so hopefully they'll catch up. Yeah. Before we go on, can we just do a really clear pulse check then on where we are exactly? And so obviously what I'm hearing you guys say is that there are tremendous applications at the intersection of general verifiable
computing, which blockchains and crypto have definitely been significantly advancing and
accelerating that whole area. We've been covering a ton of it ourselves. If you look at our
ZK Canon and zero knowledge category, you'll see so much of this covered there. But where are we
exactly right now in terms of what they can do? Because you guys talked a lot about what they can't do
yet and what the opportunity is, which is exciting. But where are we right now? Like what can they
actually do? Yeah. So right now they can basically do classification for medium size models. So not something
as big as GPT 3 or 4, but medium-sized models, it is possible to prove that the classification
was done correctly. Training is probably beyond what can be done right now, just because training
is so compute-intensive that the proof systems are not there yet. But like Ali said, we have
other ways to do it. For example, we can have multiple people do the training and then compare the
results, yeah, so that now there are game theoretic incentives for people not to cheat. If somebody
cheats, somebody else might be able to complain that they computed the training incorrectly,
and then whoever cheated will not be paid for their effort.
Right, right.
So there's an incentive for people to actually run the training the way it was supposed to run.
Right.
And so basically that's sort of not like a hybrid phase, but it's basically like alternate approaches
until more of this comes to scale and performance scales to a point where we can get there.
Yeah, I would say that for some models, classification can be proved in zero knowledge today.
For training right now, we have to rely on optimistic techniques.
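A toy sketch of that optimistic, game-theoretic approach might look like the following: hand the same job to several independent workers, compare digests of what they return, pay the majority, and withhold payment from anyone who disagrees. This is an illustration of the idea, not Gensyn's actual protocol, and it assumes the job is bit-for-bit deterministic; real systems need challenge games or fuzzier comparisons.

```python
# Toy sketch of optimistic verification via redundant computation.
import hashlib
from collections import Counter

def digest(result: bytes) -> str:
    return hashlib.sha256(result).hexdigest()

def settle(task_results: dict[str, bytes]) -> dict[str, str]:
    """Pay workers whose result matches the majority; slash the rest."""
    digests = {worker: digest(r) for worker, r in task_results.items()}
    majority, _ = Counter(digests.values()).most_common(1)[0]
    return {w: ("paid" if d == majority else "slashed") for w, d in digests.items()}

# Three workers run the same (assumed deterministic) training job; one cheats.
results = {
    "worker_a": b"model-weights-v1",
    "worker_b": b"model-weights-v1",
    "worker_c": b"garbage",                      # skipped the actual work
}
print(settle(results))  # {'worker_a': 'paid', 'worker_b': 'paid', 'worker_c': 'slashed'}
```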
Yeah, great.
So, Ali, you mentioned compute is one of the three prongs.
And you also mentioned that data and then the models for machine learning themselves.
Do you want to tackle now data and sort of the opportunities and challenges there where it comes to the crypto slash AI intersection?
Yeah, absolutely.
So I think that there is an opportunity, even though the problems involved are very difficult, to both decentralize the process of sourcing data for training of large
machine learning models from a broader community, again, instead of having a single centralized player,
just collect all of the data and then train the models themselves.
And this could work by creating a kind of marketplace that's similar to the one that we just
described for compute, but instead incentivize people to contribute new data to some big
data set that then gets used to train a model.
The difficulty with this is similar in that there's a verification challenge.
You have to somehow verify that the data that people are contributing
is actually good data and that it's not either duplicate data or garbage data that was just
sort of randomly generated or not real in some way. And to also make sure that the data doesn't
somehow subvert the model in some kind of poisoning attack where the model actually becomes
either backdoored or just sort of less good or less performant than it used to be. So there's the
question of how do you verify that that is the case? And that's maybe an open, hard problem for the
community to solve. It may be impossible to solve completely, and you might have to rely on a
combination of technological solutions with social solutions, where you also have some kind of
reputation metric that members in the community are able to earn to build up credibility,
such that when they contribute data, the data can then be trusted a little bit more than it would
be otherwise. But what this might allow you to do is that you can now truly cover the very,
very long tail of the data distribution.
And one of the things that's very challenging in the world of machine learning is that
your model is really only as good as the coverage of the distribution that your training
data set can achieve.
And if there are inputs that are far, far out of the distribution of the training data,
then your model might actually behave in a way that's completely unpredictable.
And in order to actually get the model to perform well in the edge cases and the sort of
black swan data points or data inputs that you might
experience in the real world, you do want to have your data set be as comprehensive as possible.
And so if you had this kind of open decentralized marketplace for the contribution of data to a
data set, you could have anyone who has very, very unique data out in the world, contribute
that data to the network, which is a better way to do this because if you try to do this as
a central company, you have no way of knowing who has that data.
And so if you flip it around and create an incentive for those people to come forward and provide that data on their own accord, then I think you can actually get significantly better coverage of the long tail.
And as we've seen, the performance of machine learning models continues to improve as the data set grows and as the diversity of the data points in the data set grows.
And so this can actually supercharge the performance of our machine learning models to an even greater degree.
We're able to get even more comprehensive data sets that cover the whole of the
distribution. So let me turn this on its head in that if we're going to incentivize people to contribute
data, basically we're going to incentivize people to create fake data so they can get paid. Yeah,
so we have to have some sort of a mechanism to make sure that the data you're contributing is
authentic. Exactly. And you can imagine a couple of ways of doing this, right? I mean, one way is
actually by relying on trusted hardware, right? Maybe the sensors themselves are embedded in some
trusted hardware, so that we would only trust data that's properly signed by the hardware.
That's one way to do things.
Otherwise, we would have to have some other mechanism by which we can tell whether the data is authentic or not.
Completely agree.
That would be the biggest open problem to solve.
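For the trusted-hardware idea mentioned above, a minimal sketch might look like this: the sensor signs each reading with a key that never leaves the device, and the data marketplace only accepts contributions whose signature verifies against the device's registered public key. This uses the Python `cryptography` package; the data format and key registration are illustrative assumptions.

```python
# Minimal sketch: accept only sensor data signed by a registered device key.
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey
from cryptography.exceptions import InvalidSignature

# Inside the device's secure element (the private key never leaves it).
device_key = Ed25519PrivateKey.generate()
reading = b'{"sensor": "lidar-042", "t": 1694600000, "value": 17.3}'
signature = device_key.sign(reading)

# On the marketplace side: verify against the device's registered public key.
registered_pubkey = device_key.public_key()
try:
    registered_pubkey.verify(signature, reading)
    accepted = True        # authentic: signed by the registered device
except InvalidSignature:
    accepted = False       # reject unsigned or tampered contributions
assert accepted
```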
And I think that as benchmarking for machine learning models gets better, I think there's two important trends in machine learning at the moment.
There's improving the measurement of the performance of a machine learning model.
And for LLMs, that's still very much in its early stages, in that it's actually quite hard to know how good an LLM is, because it's not as if
they were like a classifier, where the performance of a model is very clearly defined.
With an LLM, it's almost as if you're testing the intelligence of a human, right?
And coming up with the right way of testing how intelligent an LLM like ChatGPT is,
is an open area of research.
But over time, I think that will become better and better.
And then the other trend is that we're getting better at being able to explain how it is
that a model works.
And so with both of those things, at some point, it might become feasible to understand the
effect that a dataset has on a machine learning model's performance.
And if we can have a good understanding of whether or not a data set that was contributed
by a third party helped the machine learning model's performance, then we can reward that
contribution and we can create the incentive for that marketplace to exist.
So just to summarize so far, what I heard you guys say is that there's trusted hardware
that can help check the accuracy of the data that's being contributed and the models
that are being contributed. Ali, you mentioned briefly reputation metrics and that type of
thing can help. You also mentioned that there might be a way, not necessarily now, but sometime
in the near future, to check how the data is influencing the outcomes in a particular model so
that you can actually, it's not quite explainable, but the idea is that you can actually
attribute that this data set causes this particular effect. So there's a range of various
techniques you guys have shared so far. Well, the final thing is that you could do the same thing
for the third prong, which is model.
I can imagine if you could create an open marketplace for people to contribute a trained model that is able to solve a particular kind of problem.
So imagine if on Ethereum you created a smart contract that embedded some kind of test, be it like a cognitive test that an LLM could solve or some classification test that a machine learning model that is a classifier could solve.
And if using ZKML, someone could provide a model alongside a proof that that model can solve that test.
Then again, you now have the tools that you need to create a marketplace that incentivizes people to contribute machine learning models that can solve certain problems.
So many of the problems that we've discussed, the open problems that we've discussed on how to do that are also present here.
In particular, there's a ZKML piece where you have to be able to prove that the model is actually doing the work that it's claiming that it's doing.
And then also we need to be able to have good tests of how good a model is, right?
So being able to embed a test inside of a smart contract that then you can subject a machine learning model to, to evaluate how good it is.
This is another very nascent part of this whole technology trend.
But in theory, it'd be great if we eventually get to a world where we do have these very open, bottom-up, transparent marketplaces that allow people to contribute and source compute data and machine learning models for machine learning, that essentially act again as a counterweight to the very, very centralized enormous tech companies that are driving all of the AI work today.
I really love, Ali, that you mentioned that, because it's been a longstanding problem in AI in practice that it can
solve a lot of things for like the bell curve, the middle of the norm, but not the tail ends.
And a classic example of this is like self-driving cars, right?
Like you can do everything with a certain amount of standard behaviors, but it's the edge
cases where the real accidents and catastrophic things can happen.
So that was super helpful.
And I know you talked about some of the incentive alignment and incentives for providing
accurate and quality data and even incentives for just even contributing anything in the
long tail of data overall.
But on the long tail side, a quick question that popped out for me when you were
talking of like it sort of begs the question of then who makes money where in this system?
I could not help but wonder where does the business model kind of come in then in terms of
making money for companies because I always understood that in the long tail of AI in a world of
this kind of available data sets that your proprietary data is actually your unique domain knowledge
and kind of the thing that only you know in that long tail. So do you guys have any quick
responses to that? So I think that the vision behind crypto intersecting with AI is that you could
create a set of protocols that distributes the value that will eventually be captured by
this new technology by AI amongst a much larger group of people, essentially a community of
people, all of whom can contribute and all of whom can take part of the upside of this new technology.
And so then the people who would be making money would be the people who are contributing
compute or the people who are contributing data or the people who are contributing new machine
learning models to the network, such that better machine learning models can be trained and
bigger, more important problems can be solved.
The other people that would be making money at the same time are the people who on the other
side are on the demand side of this network, people who are using this as infrastructure for
training their own machine learning models.
Maybe their model does something interesting in the world.
Maybe it's the next generation of ChatGPT, and that then goes on to make its way
into a bunch of different applications, like say enterprise applications.
or whatever it is that those models may be used for.
And those models drive value capture in their own right
because those companies will have a business model of their own.
And then finally, the people who might also make money
are the people who build this network.
So, for example, create a token for the network.
That token will be distributed to the community.
And all of those people will have collective ownership
over this decentralized network for compute data and models
that may also capture some value of all of the economic
activity that goes through this network. So you can imagine any payment for compute or any payment
for models could have some fee imparted on it. It might just go to some treasury that's
controlled by this decentralized network that all token holders that are part of this network
have collective ownership and access to as well as the creators and owners of the marketplace.
And that fee might just go to the network. So you can imagine that every transaction that goes
through this network, every form of payment that pays for compute or pays for data or pays for
models might have some fee that's imparted on it that goes to some treasury that's controlled
by the whole network and by the token holders that collectively own the network.
And so that's essentially a business model for the network itself.
Great. Okay. So so far we've been talking a lot about the way that crypto can help AI.
I mean, to be clear, it's not like unidirectional. These things are kind of reinforcing
and bidirectional and more interactive than one way.
But for the purpose of this discussion, we're really talking about it being like,
here's how crypto can help AI.
Let's now kind of turn it on its head and talk a little bit more about ways that AI can
help crypto.
Yeah, so there are a couple of interesting touchpoints there.
So one that's actually worth bringing up is this idea of machine learning models that
are used to generate code.
So many of the listeners have probably heard of Copilot, which is a tool that's used to generate
code. And what's interesting is you can try to use these code generation tools to write solidity
contracts or to write cryptography code. And I want to stress that this is actually a very dangerous
thing to do. Oh, do not do this at home. Okay. Yeah, do not try this at home because what happens is
very often these systems actually will generate code that works. You know, when you try to run it,
you know, encryption is the opposite of decryption and so on. So the code will actually work,
but it will actually be insecure. We've actually written a paper about this recently that
shows that if you try to get a copilot to just write something as simple as just an
encryption function, it will give you something that does encryption correctly, but it uses
an incorrect mode of operation, yeah, so that you'll end up with an insecure encryption mode.
Similarly, if you try to get it to generate Solidity code, you might end up with Solidity
code that works, but it will have vulnerabilities in it.
So you might wonder why does that happen?
And one of the reasons is because these models are basically trained on code that's out there.
They're trained on GitHub repositories.
Well, a lot of the GitHub repositories actually are vulnerable to all sorts of attacks.
And so these models learn about code that works, but not code that is secure.
It's almost like garbage in, garbage out.
And so I do want to make sure people are very careful when they use these generative models to generate code,
that they very, very carefully check that the code actually does what it's supposed to do and that it does it securely.
One idea on that front, I'm curious what you think about.
this is that you can use AI models like LLMs or like ChatGPT to generate code in conjunction
with other tools to try to make the process less error prone. And so one example, one idea would be to
use an LLM to generate a spec for a formal verification system. So basically you describe your program
in English and you ask the LLM to generate a spec for a formal verification tool. Then you ask the same
instance of the LLM to generate the program that meets that spec, and then you use a formal
verification tool to see whether the program actually meets the spec. And if there are errors,
that tool will catch the errors. Those errors can then be used as feedback back to the LLM.
And then ideally, hopefully the LLM can then revise its work and then produce another version of the code
that is correct. And eventually, if you do this again and again, you end up with a piece of code
that ideally fully meets the spec and is formally verified to meet the spec.
And because the spec is maybe readable by a human, you can maybe go through the spec and
see, like, yes, this is the program that I intended to write.
And that could be an actual pretty good way to use LLMs to write code that also isn't
as prone to errors as it might be if you were to just ask ChatGPT to generate a smart
contract for you.
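A rough sketch of the loop described here, with hypothetical `llm` and `formal_verify` stand-ins for a real model API and a real verifier (this is the general pattern, not any specific product's workflow):

```python
# Sketch: LLM writes a spec and a program, a formal verifier checks the program
# against the spec, and verifier errors are fed back until the proof succeeds.
from typing import Callable, Optional, Tuple

def spec_then_verify_loop(
    english_description: str,
    llm: Callable[[str], str],                          # prompt -> generated text
    formal_verify: Callable[[str, str], Optional[str]], # (spec, code) -> error or None
    max_rounds: int = 5,
) -> Tuple[str, str]:
    spec = llm(f"Write a formal spec for: {english_description}")
    code = llm(f"Write a program that satisfies this spec:\n{spec}")
    for _ in range(max_rounds):
        error = formal_verify(spec, code)               # None means verification passed
        if error is None:
            return spec, code                           # a human still reviews the spec
        code = llm(f"The code fails the spec with error:\n{error}\n"
                   f"Spec:\n{spec}\nCode:\n{code}\nPlease fix the code.")
    raise RuntimeError("no verified program within the round limit")
```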
Clever.
Yeah, this is great.
And actually, this leads into another topic that is worth discussing, which is basically
using LLMs to find bugs, right?
So suppose a programmer actually writes some solidity code.
And now you want to test.
Is that code correct?
Is it secure?
And like Ali said, you can actually try to use the LLM to find vulnerabilities in that code.
And there's been actually quite a bit of work on trying to assess how good LLMs are at finding bugs in software, in Solidity smart contracts and C and C++.
There's one paper that came out recently that's actually very relevant.
It's a paper from the University of Manchester, in which you would run a standard static analysis tool
to find bugs in your code,
and it would find all sorts of memory management bugs
or potential bugs,
just a standard static analysis tool,
no machine learning whatsoever,
but then you would use an LLM to try and fix the code.
Yeah? So it would propose a fix for the bug automatically.
And then you would run the static analyzer again on the fixed code.
And the static analyzer would say,
oh, the bug is still there or the bug is not there.
And you would keep iterating until the static analyzer says,
yeah, now the bug has been fixed and there's no more issues there.
So that was kind of an interesting paper.
This paper literally came out like two weeks ago.
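A rough sketch of that analyze-and-fix loop, with hypothetical stand-ins for the static analyzer and the LLM call, is below; it illustrates the pattern rather than the paper's actual implementation.

```python
# Sketch: run a static analyzer, hand its findings to an LLM, apply the fix,
# and repeat until the analyzer reports no issues (or we give up).
from typing import Callable, List

def repair_until_clean(
    source_code: str,
    run_static_analyzer: Callable[[str], List[str]],   # code -> list of bug reports
    ask_llm_for_fix: Callable[[str], str],              # prompt -> revised code
    max_rounds: int = 5,
) -> str:
    for _ in range(max_rounds):
        findings = run_static_analyzer(source_code)
        if not findings:
            return source_code                          # analyzer is clean, stop
        prompt = ("Fix these issues without changing behavior:\n"
                  + "\n".join(findings)
                  + "\n\nCode:\n" + source_code)
        source_code = ask_llm_for_fix(prompt)
    return source_code                                  # may still contain issues
```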
So for both of these papers you just referenced, Dan, the one from the University of Manchester and also the one that you guys recently wrote on LLMs not being trusted to write correct code.
It could be working code, but not necessarily secure.
I'll link to those papers in the show notes so that listeners can find them.
Just one quick question before we move on from this.
So this is about the current state.
Is this a temporary situation or do you think there will be a time when LLMs can be trusted to write correct,
not just working, but secure smart contract code.
Is that possible or is that just way far off?
That's a difficult question to answer.
You know, these models are improving by leaps and bounds every week, right?
Yeah.
So it's possible that, you know, by next year, these issues will already be addressed
and that they could be trusted to write more secure code.
I guess we're saying that right now, the current models that we have, GPT-4, GPT-3 and so on,
if you use them to generate code, you have to be very, very careful and verify that
the code that they wrote actually does what it's supposed to do and that it's secure.
Got it.
Well, and by the way, will we get to a point where the code that LLMs generate is less likely
to contain bugs than the code that a human generates?
And maybe that's the more important question, right?
Because in the same way that you can never say that a self-driving car will never crash,
the real question that actually matters is, is it less likely to crash than if it were a human
driver?
That's exactly right.
Because the truth is that it's probably impossible to
guarantee that there will never be a car crash that is caused by a self-driving car or that
there will never be a bug that's generated by an LLM that you've asked to write any code.
And I think this will only, by the way, get more and more powerful, the more you integrate it
into existing tool chains.
So as we discussed, you can integrate this into formal verification tool chains.
You can integrate this into other tools like the one that Dan described where you have
a tool that checks for memory management issues.
You can also integrate it into unit testing and integration testing tool chains so that the
LLM is not just acting in a vacuum. It is getting real-time feedback from other tools that are
connecting it to the ground truth. And I think that through the combination of machine learning models
that are extremely big, trained with all of the data in the world, combined with these other tools
might actually make for programmers that are quite a bit superior to human programmers. And even
if they might still make mistakes, they might just be superhuman. And that'll be a big moment
for the field of software engineering generally. Yeah, that's a great framing, Ali. So,
What are some of the other trends that come in for where I can help crypto and vice versa?
Yeah.
One exciting possibility in the space is that we may be able to build decentralized social networks
that actually behave a lot like Twitter might,
but where the social graph is actually fully on chain
and it's almost like a public good that anyone can build on top of.
And you as a user, you control your own identity on the social graph.
You control your own data.
You control who you follow and who can follow you.
And then there's a whole ecosystem of companies that build portals into the social graph that provide users experiences that are maybe somewhat like Twitter or somewhat like Instagram or somewhat like TikTok or whatever else they may want to build.
But it's all on top of this same social graph that nobody owns.
There's no billion dollar tech company in the middle that has complete control over it and that can decide what happens on it.
And so in that world, like that's an exciting world because it means that it can be much.
more dynamic and there can be this whole ecosystem of people building things and there's much
more control by each of the users over what they see and what they get to do on the platform.
But there's also the need to filter the signal from the noise.
And there's, for example, the need to come up with sensible recommendation algorithms that
filter all of the content and show you a news feed that you actually want to see.
And this will open the door to a whole marketplace, a competitive environment of participants
who provide you maybe with algorithms, with AI-based algorithms that curate content for you,
and you as a user might have a choice.
You can decide whether to go with one particular algorithm, maybe the algorithm that was built
by Twitter, or you can also go with one that's built by someone completely different.
And that kind of autonomy will be great.
But again, you're going to need tools like machine learning and AI to help you sift through
the noise and to help parse through all of the spam that inevitably will exist in a world
where generative models can create all of the spam in the world.
What's interesting about what you said, too, is like it's not even about choosing between,
it goes back to this idea you mentioned earlier, and you mentioned this briefly about just
giving users the options to pick from marketplaces of free ideas and approaches that they can
decide.
But it's also interesting because it's not even only at a company to company level.
It's really just like what approach works for you.
Like you might be a person who's maybe more interested in the collaborative filtering
algorithm that was the original form of recommendation systems,
which is like collaborative filtering across people.
So your friend's recommendations are the things you follow.
When in fact, I personally am very different and much more interested in an interest graph.
And therefore, I might be much more interested in people who just have similar interests to me.
And I might pick that approach versus say something else that's sort of like, hey, this is a hybrid approach.
The only thing it's going to do is X, Y, and Z.
Just even being able to pick and choose that is already tremendously empowering.
Like that's just simply not possible right now.
And it can only be possible with crypto and AI.
So that's a great example.
Oh, yeah.
Was there anything else to say on how AI can help with trust and security?
So I think that kind of the meta picture is that crypto is the wild west.
Because it's completely permissionless, because anyone can participate, you kind of have to assume
that whomever is participating might be an adversary and maybe trying to game the system
or hack the system or do something malicious.
And so there's much more of a need for tools that help you filter the honest participants
from the dishonest ones, and machine learning and AI as an intelligence tool can actually be
very helpful on that front.
So, for example, there's a project called Stelo, which uses machine learning to identify
suspicious transactions that are submitted to a wallet and that flags those transactions for
the user before those transactions are submitted to the blockchain.
And that could be a good way to prevent the user from accidentally sending all of their funds
to an attacker or from doing something that they will regret later.
And that company basically sells to wallets, to companies like MetaMask, such that then
MetaMask can use the intel and then do whatever it wants with it, either block the transaction
or warn the user or sort of reframe the transaction so that it's no longer dangerous.
And so that's one example.
There are other examples as well in the context of MEV, which stands for miner extractable
value or maximum extractable value, depending on who you ask, which is the value that can
be extracted by the people who have control over the ordering of transactions on a blockchain.
And that's often the miners or the validators of a blockchain. And AI here can cut both ways
and that those participants, if you're a validator on a blockchain and you have control over
ordering of transactions, you can do all sorts of clever things to order those transactions
in such a way that you profit. You can, for example, front run transactions, you can backrun
transactions, you can sandwich orders on uniswap. There's a lot of transactions that you could
craft such that you can profit from this ability of ordering transactions. And machine learning
and AI might supercharge that ability because it can search for opportunities to capture more and
more MEV. But then on the other hand, machine learning may help in the other way, in that it may
help as a defensive tool. You may be aware before you submit a transaction that there is
MEV that might be extractable from that transaction.
And so then maybe you will either split up your transaction into multiple transactions
so that there isn't a single validator that can completely control it or do something
as a way of protecting yourself from an MEV extractor at some point in the transaction pipeline.
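For a sense of why ordering power is worth real money, here is a toy constant-product AMM calculation of a sandwich attack. The pool sizes and trade amounts are invented for illustration, and swap fees, slippage limits, and gas costs are all ignored.

```python
# Toy sandwich-attack arithmetic on a constant-product (x * y = k) pool.
# Numbers are invented for illustration; fees, slippage limits, and gas
# are ignored, all of which matter in practice.

def buy_eth(eth_reserve, usdc_reserve, usdc_in):
    """Swap usdc_in into the pool; return (eth_out, new_reserves)."""
    k = eth_reserve * usdc_reserve
    new_usdc = usdc_reserve + usdc_in
    new_eth = k / new_usdc
    return eth_reserve - new_eth, (new_eth, new_usdc)

def sell_eth(eth_reserve, usdc_reserve, eth_in):
    """Swap eth_in into the pool; return (usdc_out, new_reserves)."""
    k = eth_reserve * usdc_reserve
    new_eth = eth_reserve + eth_in
    new_usdc = k / new_eth
    return usdc_reserve - new_usdc, (new_eth, new_usdc)

reserves = (1_000.0, 2_000_000.0)  # 1,000 ETH, 2,000,000 USDC in the pool

# 1) Attacker front-runs the victim with a 100k USDC buy.
atk_eth, reserves = buy_eth(*reserves, 100_000)

# 2) Victim's 100k USDC buy now executes at a worse price.
victim_eth, reserves = buy_eth(*reserves, 100_000)

# 3) Attacker back-runs, selling the ETH right after the victim.
atk_usdc_back, reserves = sell_eth(*reserves, atk_eth)

print(f"victim received {victim_eth:.2f} ETH (vs ~47.6 without the sandwich)")
print(f"attacker profit ~{atk_usdc_back - 100_000:.0f} USDC before fees")
```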
So this is a way, again, where AI plays a big role when it comes to security, when it comes
to trust, when it comes to making the space more trustworthy to the end user.
That's an example of AI making things difficult in crypto
and then crypto coming back and making things better for it.
I actually have another example like that.
So just like ML models can be used to detect fake data
or maybe malicious activity,
there's the flip side where ML models can actually be used
to generate fake data.
And the classic example of that is deepfakes, right?
Where you can create a video of someone saying things they never said
and that video looks fairly realistic.
And the interesting thing is that actually blockchains can help
to alleviate the problem. And so let me just walk you through one possible approach where
blockchains might be useful. Imagine it's a solution that might be only applicable to well-known
figures like politicians or maybe movie stars and such. But imagine basically a politician would
wear a camera on their chest and kind of record what they do all day long. Yeah. And then create
a Merkle tree out of that recording and push the Merkle tree commitment onto the blockchain. So now
On the blockchain, there's a timestamp saying, you know, on this and this date, you said such and such.
On this and that date, you said such and such.
And now if somebody creates a deep fake video of this politician saying things they never said, well, the politician can say, look, at this time where the video said I said this and that, I was actually somewhere completely different, doing something unrelated.
And the fact that all this data, the real data, the real authentic data is recorded on a blockchain can be used to prove that the deep fake really is fake and not real data.
Yeah. So this is not something that exists yet. It would be kind of fun for someone to build something like this. But I thought it's kind of an interesting example where blockchains might actually be helpful in combating deepfakes.
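As noted, nothing like this exists yet, but the core data structure is simple: hash the day's recorded chunks into a Merkle tree and publish only the root on chain with a timestamp. Here is a minimal sketch, where the chunking scheme and the on-chain publishing step are assumptions left abstract.

```python
# Minimal Merkle-root commitment over recorded chunks (sketch only).
# In the scheme described above, only the 32-byte root would be
# published on chain with a timestamp; the raw footage stays private
# until the person needs to open specific chunks to rebut a deepfake.

import hashlib

def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def merkle_root(leaves: list[bytes]) -> bytes:
    """Compute a Merkle root over leaf data, duplicating the last
    node whenever a level has odd length."""
    level = [sha256(leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])
        level = [sha256(level[i] + level[i + 1])
                 for i in range(0, len(level), 2)]
    return level[0]

# Pretend these are one-minute video chunks captured during the day.
chunks = [f"chunk-{i}: <video bytes>".encode() for i in range(1_440)]
root = merkle_root(chunks)
print("commit this on chain with a timestamp:", root.hex())
```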
Is there also a way to solve that problem and show other timestamps or provenance where you can do that sort of verification of what's true, what's not true, without having to make a politician walk around with like a camera on their chest?
Yes, absolutely. We can also rely on trusted hardware for this. So imagine, you know, the cameras in our phones and such.
They would actually sign the images and video that they take.
There's a standard called C2PA that specifies how cameras will sign data.
In fact, there's a camera by Sony that now will actually take pictures and take videos
and then produce C2PA signatures on those videos.
So now you basically have authentic data.
You can actually kind of prove that the data really came from a C2PA camera.
And so now if you read a newspaper article and there's a picture in the article,
and it claims to be from one place but in fact it's taken from a different place,
that can be caught, because the signature on the original, C2PA-signed image can actually be verified.
There are a lot of nuances there.
C2PA is a fairly complicated topic with a lot of nuances to discuss that maybe we won't get into here.
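One nuance worth separating out is that the core cryptographic step is an ordinary digital signature over the captured bytes; the rest of C2PA (manifests, assertions, certificate chains) layers on top of that. Here is a minimal sketch with an Ed25519 key standing in for the camera's embedded key; it is not C2PA's actual format.

```python
# Sketch of the core step behind signed media: the capture device signs
# the bytes it produced, and anyone with its public key can verify them.
# Real C2PA manifests include assertions and certificate chains that
# this toy example omits.

from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey
from cryptography.exceptions import InvalidSignature

camera_key = Ed25519PrivateKey.generate()   # stands in for the camera's key
camera_pub = camera_key.public_key()

image_bytes = b"\x89PNG...original pixels straight off the sensor"
signature = camera_key.sign(image_bytes)    # produced at capture time

# A newsroom or reader verifies the image is unmodified since capture.
try:
    camera_pub.verify(signature, image_bytes)
    print("authentic: bytes match what the camera signed")
except InvalidSignature:
    print("rejected: image was modified or not from this camera")

# Any edit, even a single flipped bit, breaks verification, which is
# exactly the cropping/downsampling problem discussed next.
try:
    camera_pub.verify(signature, image_bytes + b"!")
except InvalidSignature:
    print("rejected: edited bytes no longer verify")
```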
Yeah, and I remember you talking about this work with me previously.
I think it was at our offsite.
But I also remember from that that it doesn't stand up to editing.
And as you know, editorial people like me, other content creators,
and honestly just about anyone who posts on Instagram or anywhere online,
no one, like, uploads things purely raw as they were originally created.
Like, everyone edits them.
Yeah, typically when newspapers will publish pictures in a newspaper,
they don't publish the picture from the camera as is.
They will crop it.
There's like a couple of authorized things they're allowed to do to the pictures.
Maybe they gray scale it.
Definitely they downsample it so that they don't take a lot of bandwidth.
The minute you start editing the picture, that means that the recipient,
the end reader, the user on the browser who's reading the article,
can no longer verify the C2PA signature
because they don't have the original image.
So the question is, how do you let the user verify
that the image they're looking at
really was properly signed by a C2PA camera?
Well, as usual, this is exactly where zero knowledge techniques
come in where you can prove that the edited image
actually is the result of applying just downsampling
and gray scaling to a properly signed larger image.
Yeah, and so instead of a C2PA signature,
we would have a ZK proof, a short ZK proof associated with each one of these images.
And now the readers can still verify that they're looking at authentic images.
So it's very interesting that ZK techniques can be used to fight disinformation.
It's a bit of an unexpected application.
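To spell out what such a proof would actually assert, here is the relation written as plain Python: the published image equals the result of applying only the authorized edits (downsampling and grayscaling, in this sketch) to an original that carries a valid camera signature. In a real system this check would be compiled into a zero-knowledge circuit so the verifier never sees the original; the specific edit functions and the stubbed signature check are assumptions.

```python
# The relation a ZK proof of "authorized edits" would attest to, written
# as a plain check. In a real system this check runs inside a zero-
# knowledge circuit, so the verifier sees only the edited image and a
# short proof, never the signed original. Edit operations are assumptions.

import numpy as np

def downsample(img: np.ndarray, factor: int = 2) -> np.ndarray:
    return img[::factor, ::factor, :]

def grayscale(img: np.ndarray) -> np.ndarray:
    weights = np.array([0.299, 0.587, 0.114])
    return (img @ weights).astype(np.uint8)

def authorized_edit(img: np.ndarray) -> np.ndarray:
    return grayscale(downsample(img))

def relation_holds(edited, original, signature, verify_sig) -> bool:
    """What the ZK circuit would prove, without revealing `original`."""
    return verify_sig(original, signature) and np.array_equal(
        edited, authorized_edit(original))

# Toy run with a random "photo" and a stub signature check.
original = np.random.randint(0, 256, size=(8, 8, 3), dtype=np.uint8)
published = authorized_edit(original)
print(relation_holds(published, original, signature=b"...",
                     verify_sig=lambda img, sig: True))   # -> True
```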
That's fantastic.
A very related problem, by the way, is proving that you're actually human in a world
where deepfakes creating the appearance of humanity will generally outnumber humans,
a thousand to one or a million to one, and most things on the internet might actually be generated
by AI. And so one potential solution, which is related to what you're saying, is to use
biometrics to be able to establish that someone is actually human, but to then use zero knowledge
proofs to protect the privacy of the people who are using those biometrics to prove their
humanity. So one project in this category is called WorldCoin. Yeah. It's also a project in our
portfolio, and they use this orb, people may have seen, this shiny silver orb that uses
retinal scans as biometric information to verify that you're actually a real human.
And it also has all sorts of other sensors to make sure that you're actually alive and that it
can't actually be a picture of an eye.
And it's this system that has secure hardware and is very difficult to tamper with, such that
the proof that emerges on the other end, which is a zero knowledge proof that obscures your
actual biometric information is very, very difficult to forge. In this way,
politicians could, for example, prove that their video stream or that a signature of theirs
or that a participation of theirs on some online forum is actually their own and that they're
actually human. What's really interesting about what you said, Ali, that's a great follow-up to what
Dan was saying about ways to verify like authentic media versus like fake or deep fake media and
this world of infinite media, as you would say, that we live in. But what are the other
applications for proof of personhood type technologies like that? I think it's important because
this is actually another example of how crypto can help AI more broadly, too. We're kind of
flipping back and forth there, but that's okay because we're just talking about really interesting
applications, period. So it's fine. That's a really good question. One of the things that will
become important in a world where anyone can participate online is to be able to prove that you are
human for various different purposes. There's that famous saying from the 90s that on the internet,
nobody knows you're a dog.
Oh, yeah.
And I think maybe a reshaped form of that saying is that on the internet, nobody knows
you're a bot.
So then I guess this is exactly where proof of humanity projects become very important.
Because it will become important to know whether you're interacting with a bot or with a human.
For example, in the world of crypto, there's this whole question of governance.
How do you govern systems that are decentralized, that don't have any single point of control
and that are bottom up and community owned?
You would want some kind of governance system that allows you to control the evolution of those
systems. And the problem today is that if you don't have proof of humanity, then you can't tell
whether one address belongs to a single human or whether it belongs to a bunch of humans
or whether 10,000 addresses actually belong to a single human and are pretending to be 10,000 different
people. And so today you actually have to just use amount of money as a proxy for voting power,
which leads to plutocratic governance systems.
But if every participant in a governance system
could prove that they're actually human
and they could do so in a way that's unique
such that they can't actually pretend to be more than one human
because they only have a single set of eyeballs,
then the governance system could be much more fair
and less plutocratic and can be based more on each individual's preferences
rather than on the preference of the largest amount of money
that's locked up in some smart contract.
Actually, just to give an example of that,
today, we're forced to use one token, one vote, because we don't have proof of humanity.
Maybe we'd like to do one human, one vote.
But if you can pretend to be five humans, then of course, that doesn't work.
And so one example where this comes up is something called quadratic voting.
So in quadratic voting, basically, if you want to vote five times for something, you have to kind
of put 25 chits down to do that.
But of course, you can get around that.
You can just pretend to be five different people, each voting once,
and that would kind of defeat the mechanism of quadratic voting.
So the only way to prevent you from doing that is this exact proof of humanity, where in order
to vote, you have to prove that you're a single, unique human rather than one of many identities controlled by the same person.
And that's exactly where proof of humanity would play an important role.
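A small worked example of why quadratic voting needs proof of personhood: under a quadratic cost rule, casting n votes as one verified person costs n squared credits, while splitting into n sybil identities that each cast one vote costs only n. The numbers below are purely illustrative.

```python
# Quadratic voting cost, and why sybils break it.
# One person casting n votes pays n**2 credits; the same person posing
# as n separate "people", each casting one vote, pays only n credits.

def honest_cost(votes: int) -> int:
    return votes ** 2

def sybil_cost(votes: int) -> int:
    return votes * (1 ** 2)   # n identities, one vote each

for n in (1, 5, 10):
    print(f"{n} votes: honest cost {honest_cost(n):>3}, "
          f"sybil cost {sybil_cost(n):>3}")
# 5 votes: honest cost 25, sybil cost 5, matching the "25 chits" example.
# Proof of personhood is what forces everyone onto the honest cost curve.
```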
Generally, identity on chain is actually becoming quite important for governance.
Totally.
That, by the way, reminded me of an episode that you and I did, Ali, years ago with Phil Daian,
remember, on dark DAOs?
That was such an interesting discussion that's totally relevant here.
Totally.
By the way, is the phrase proof of personhood or proof of humanity?
What's the difference?
Is it the same thing?
Yeah, people use them interchangeably.
Proof of human, proof of humanity, proof of personhood.
Yeah, yeah.
Okay, so keep going on this theme then of media and this kind of infinite abundance of media.
Like, what are other examples?
And again, we're talking about crypto helping AI, AI helping crypto.
Are there any other examples that we haven't covered here where the intersection of
crypto and AI can bring about things that aren't possible by either one of them alone?
Completely.
I mean, another implication of these generative models is that we're going to live in a world of infinite abundance of media.
And in such a world, things like community around any one particular piece of media or the narrative around a particular piece of media will become ever more important.
Just to make this very concrete, there are two good examples here.
Sound.xyz is building a decentralized music streaming platform,
enabling artists, musicians essentially, to upload music and to then connect directly with their communities
by selling them NFTs that give people in those communities certain privileges.
Like, for example, the ability to post a comment on the track on the sound.xyz website
such that anyone else who plays the song can also see the comment.
This is similar to the old SoundCloud feature that people might remember,
where you could have this whole social experience on the music itself as it played on the website.
It's this ability to allow people to engage with
the media and to engage with each other, often in a way that's economic, because they're essentially
buying this NFT from the artist as a way of being able to do this. And as a side effect,
they're supporting the artist and helping the artist be sustainable and able to create
more music. But the beauty of all of this is that it actually gives an artist a forum to really
interact with their community. And the artist is a human artist. And as a result of crypto being
in the loop here, you can create a community around a piece of music that wouldn't automatically
exist around a piece of music that was just created by a machine learning model that's devoid
of any human element, right? That doesn't have a community around it. And so I think, again,
in a world where a lot of the music that we're going to be exposed to will be fully generated
by AI, the tools to build community and to tell a story around art, around music, around other
kinds of media will be very important as a way of sort of distinguishing media that we
really care about and really want to invest in and spend time with from media that may also
be very good, but it's just a different kind of media. It's media that was just generated by
AI with less of a human element. And it may be, by the way, that there's some synergy between
the two, like it could be that a lot of the music will be AI enhanced or AI generated. But if
there's also a human element, like say, for example, a creator leveraged AI tools to create a new
piece of music, but they also have a personality on sound. They have an artist page. They have
a community for themselves and they have a following. Then now you have like this kind of synergy
between the two worlds where you both have the best music because it's augmented by the superpowers
that AI gives you. But you also have a human element and a story that was coordinated and
made real by this crypto aspect, which lets you bring all of those people together into one
platform. It's really quite amazing that even in the world of music, just like we
talked about in the world of coding, where you have a human coder being enhanced by tools
like Copilot that generate code, we are seeing an artist being enhanced
by ML systems that help write the music, or at least parts of the music being written and generated
by an ML system. So it's definitely a new world that we're kind of moving into in terms of content
generation. Basically, there's going to be a lot of spam in the form of machine-generated art,
which people might not value as much as they value art generated by an actual human.
Maybe another way to say it is one of the points of NFTs was to support the artists.
Yes.
But if the artists themselves are now machine learning models, then who exactly are we supporting?
Yeah.
And so it's a question of how do we distinguish, how do we differentiate human-generated art
that needs support versus machine-generated art?
This is a philosophical discussion for over drinks maybe,
but I would maybe go so far as to say that the prompter is also an artist of sorts.
And in fact, I would make the case that that person is an artist.
And the same debate has come up before; this is a discussion as old as time.
It's just simply new technologies, old behaviors.
It's the same thing that's been playing out for eons, and the same thing's playing out in writing, et cetera, totally.
Very true.
Well, that actually opens up the door for collective art,
for art that's generated through the creative process of a whole community as opposed to a single
artist. There are actually already projects that are doing this where you have a process by which a
community influences, through some voting process on chain, what the prompt for a machine learning model
like DALL-E will be. Then you have DALL-E use that prompt to generate a work of art. Maybe you generate
not one work of art, but like 10,000. And then you use another machine learning model that's also trained
from feedback by the community to pick from those 10,000 the best one. Right. And so now you have
a work of art that was generated from the input of the community, that was also sort of pruned
and selected from a set of 10,000 variants, also through a machine learning model that's trained by the community,
to then arrive at one work of art that is kind of the product of this collective collaboration.
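A rough sketch of that pipeline, with the text-to-image model and the community-trained scorer stubbed out as placeholders (these are not any particular project's APIs): vote on a prompt, generate many variants, and let a community-trained model pick one.

```python
# Sketch of the community-curated generation pipeline described above.
# `generate_image` and `community_score` are placeholder stubs standing
# in for a text-to-image model and a model trained on community feedback.

import random
from collections import Counter

def generate_image(prompt: str, seed: int) -> str:
    return f"<image for '{prompt}' / seed {seed}>"        # stub

def community_score(image: str) -> float:
    return random.random()                                # stub

# 1) Community votes on the prompt (one human, one vote, ideally).
votes = ["a city grown from coral", "a city grown from coral",
         "portrait of a forgotten robot"]
prompt = Counter(votes).most_common(1)[0][0]

# 2) Generate many candidate works from the winning prompt.
candidates = [generate_image(prompt, seed=s) for s in range(10_000)]

# 3) A community-trained model prunes the candidates down to one.
winner = max(candidates, key=community_score)
print("collective work of art:", winner)
```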
That's incredible.
I love it.
Well, you guys, that's a great note to end on.
Thank you both for sharing all that with the listeners of web3 with a16z.
Thanks, Sonal.
Thank you so much.
Thank you for listening to web3 with a16z.
You can find show notes with links to resources, books, or papers discussed, transcripts, and more at a16zcrypto.com.
This episode was produced and edited by Sonal Chokshi.
That's me.
The episode was technically edited by our audio editor, Justin Golden.
Credit also to Moonshot Design for the art, and all thanks to the support from a16z crypto.
To follow more of our work and get updates,
resources from us and from others, be sure to subscribe to our Web 3 weekly newsletter.
You can find it on our website at a16zcrypto.com.
Thank you for listening and for subscribing.
Let's go.