a16z Podcast - AI x Crypto

Episode Date: September 13, 2023

with @alive_eth @danboneh @smc90

This week's all-new episode covers the convergence of two important, very top-of-mind trends: AI (artificial intelligence) and blockchains/crypto. Together, these domains have major implications for how we all live our lives every day; so this episode is for anyone just curious about, or already building in, the space. The conversation covers topics ranging from deepfakes, bots, and the need for proof-of-humanity in a world of AI; to big data, large language models like ChatGPT, user control, governance, privacy and security, zero knowledge and zkML; to MEV, media, art, and much more.

Our expert guests (in conversation with host Sonal Chokshi) include:

- Dan Boneh, Stanford professor (and senior research advisor at a16z crypto), a cryptographer who's been working on blockchains for over a decade and who specializes in cryptography, computer security, and machine learning -- all of which intersect in this episode;
- Ali Yahya, general partner at a16z crypto, who previously worked at Google -- where he not only worked on a distributed system for a fleet of robots (a sort of "collective reinforcement learning") but also worked on Google Brain, where he was one of the core contributors to the machine learning library TensorFlow.

The first half of the hallway-style conversation between Ali and Dan (who go back together as student and professor at Stanford) is all about how AI could benefit from crypto, and the second half is about how crypto could benefit from AI... the thread throughout is the tension between centralization vs. decentralization. We also discuss where the intersection of crypto and AI can bring about things that aren't possible by either one of them alone.

Pieces referenced in this episode/related reading:

- The Next Cyber Reasoning System for Cyber Security (2023) by Mohamed Ferrag, Ammar Battah, Norbert Tihanyi, Merouane Debbah, Thierry Lestable, Lucas Cordeiro
- A New Era in Software Security: Towards Self-Healing Software via Large Language Models and Formal Verification (2023) by Yiannis Charalambous, Norbert Tihanyi, Ridhi Jain, Youcheng Sun, Mohamed Ferrag, Lucas Cordeiro
- Fixing Hardware Security Bugs with Large Language Models (2023) by Baleegh Ahmad, Shailja Thakur, Benjamin Tan, Ramesh Karri, Hammond Pearce
- Do Users Write More Insecure Code with AI Assistants? (2022) by Neil Perry, Megha Srivastava, Deepak Kumar, Dan Boneh
- Asleep at the Keyboard? Assessing the Security of GitHub Copilot's Code Contributions (2021) by Hammond Pearce, Baleegh Ahmad, Benjamin Tan, Brendan Dolan-Gavitt, Ramesh Karri
- Voting, Security, and Governance in Blockchains (2019) with Ali Yahya and Phil Daian

As a reminder: none of the following should be taken as investment, legal, business, or tax advice; please see a16z.com/disclosures for more important information -- including a link to a list of our investments -- especially since we are investors in companies mentioned in this episode.

Stay updated:
- Find a16z on Twitter: https://twitter.com/a16z
- Find a16z on LinkedIn: https://www.linkedin.com/company/a16z
- Subscribe on your favorite podcast app: https://a16z.simplecast.com/
- Follow our host: https://twitter.com/stephsmithio

Please note that the content here is for informational purposes only; should NOT be taken as legal, business, tax, or investment advice or be used to evaluate any investment or security; and is not directed at any investors or potential investors in any a16z fund.
a16z and its affiliates may maintain investments in the companies discussed. For more details please see a16z.com/disclosures.

Transcript
[00:00:00] AI is very much a technology that thrives on and enables top-down centralized control, whereas crypto is a technology that's all about bottom-up decentralized cooperation. One of the points of NFTs was to support the artists. Yes. But if the artists themselves are now machine learning models, then who exactly are we supporting? One of the things that will become important in a world where anyone can participate online is to be able to prove that you are human, for various different purposes.
[00:00:32] If we're going to incentivize people to contribute data, basically we're going to incentivize people to create fake data so they can get paid. So we have to have some sort of a mechanism to make sure that the data you're contributing is authentic. Hello, everyone, and welcome back to the a16z podcast. This is your host, Steph Smith, but today I'm passing the baton back to longtime host Sonal Chokshi.
[00:00:56] This, of course, is also a crossover episode from our sister podcast, web3 with a16z, which Sonal now hosts. There are few technologies over the last few years that have quite captured the zeitgeist like crypto and AI. So in today's episode, Sonal sits down with guests Ali Yahya and Dan Boneh to explore the ways in which these two emerging technologies oppose, yet also beautifully augment, one another. And they attack this from both directions: how crypto can help AI -- like how crypto acts as a decentralized counterweight to the somewhat centralizing force
[00:01:32] where AI models with more data, more compute, and more complex models do tend to win. But also how AI can help crypto. For example, are we at the point where LLMs should be writing smart contract code? And what about all these deepfakes we keep hearing about? Let's find out. Welcome to web3 with a16z, a show about building the next generation of the internet from the team at a16z crypto. That includes me, your host, Sonal Chokshi.
[00:02:05] Today's all-new episode covers the convergence of two important top-of-mind trends: AI, artificial intelligence, and crypto. This has major implications for how we all live our lives every day, so this episode is for anyone just curious about or already building in the space. Our special guests today are Dan Boneh, Stanford professor and senior research advisor at a16z crypto -- he's a cryptographer who's been working on blockchains for over a decade, and the topics here have a strong intersection between cryptography, computer security, and machine learning, all of which are his areas of
[00:02:36] expertise. And then we also have Ali Yahya, general partner at a16z crypto, who also worked at Google previously, where he not only worked on a distributed system for robotics -- more specifically a sort of collective reinforcement learning, which involved training a single neural network that contributed to the actions of an entire fleet of robots -- but also worked on Google Brain, where he was one of the core contributors to the machine learning library TensorFlow. And actually, Dan and Ali go back since Ali was an undergrad and master's student at Stanford. So this conversation is really more of a hallway jam between them that I asked to join. And we cover everything from deepfakes and bots to proof of humanity in a world of AI and much, much more.
[00:03:17] The first half is all about how AI could benefit from crypto, and the second half on how crypto could benefit from AI. And the thread throughout is the tension between centralization versus decentralization. As a reminder, none of the following should be taken as investment, legal, business, or tax advice. Please see a16z.com/disclosures for more important information, including a link to a list of our investments, especially since we are investors in companies mentioned in this episode. But first, we begin with how the two worlds intersect, with a quick sharing of areas or visions
[00:03:46] that they're excited about. The first voice you'll hear is Ali's. There is a really good sci-fi novel called The Diamond Age by Neal Stephenson, in which there is this device known as the Illustrated Primer, which is a kind of artificially intelligent device that acts as your mentor and your teacher throughout life. And so when you're born, you're paired to an AI, essentially, that knows you really well, learns your preferences, follows you throughout life, and helps you make decisions and steers you in the right direction. So there's like a sci-fi future in which you could build such an AI,
[00:04:21] but you very much wouldn't want that AI to be controlled by a monopolistic tech giant in the middle, because that position would provide that company with a great deal of control, and it raises these kinds of questions of privacy and sovereignty, and you'd want to have kind of control over it. And then also, what if the company goes away, or they change the rules, or they change the pricing? It would be great if you could build an AI that could run for a very, very long time and could get to know you over the course of a lifetime, but have that really be yours. And so there is this vision in which you could do that with a blockchain. You could embed an AI within a smart contract and, with the power of zero knowledge proofs,
[00:04:57] you could also keep your data private. And then over the course of decades, this AI can become smarter and can help you. And then you have the option to do whatever you want with it, or change it in whichever way you want, or shut it down. And so that's kind of an interesting vision for long-running AIs that are continually evolving and continually becoming better. It'd be better if it were the case that they weren't just controlled by a single centralized company.
[00:05:19] Of course, it's a very science fiction idea, because there are lots of problems, including the problems of verification and the problems of keeping data private using cryptography and still being able to compute on top of that data, maybe with fully homomorphic encryption. All of these problems continue to be outstanding, but it's not something that's inconceivable. Well, I love Ali's vision there. I love it too, especially given that quote -- I think it was Asimov -- that today's science fiction is tomorrow's science fact. Ali, I know you have a meta-framework for thinking about all this stuff that I've heard you share before. Can you share that now too? Yeah, there is this broader narrative that has existed
[00:05:57] for quite some time now that's only becoming much more accentuated with the development of things like LLMs. Actually, if I may really quickly, just for listeners who aren't already familiar, just as context: LLM stands for large language model, and it uses some of the technology that was developed at Google. Back in 2017, there's this famous paper known as Attention Is All You Need -- that was the title of the paper -- and it outlined what are now known as transformers. And that's at the basis, basically, of some of the new models that people have been training these days, including ChatGPT and so on. All of these are large language models, or LLMs. There was that famous, I think 2018, line from Peter Thiel that AI is communist and crypto is
[00:06:37] libertarian. That line is like very on point, actually, because AI and crypto in many ways are natural counterweights for one another. And maybe we can go deep over the course of the podcast into each one of these as we go through examples, but there are four major ways in which that's true. The first is that AI is very much a technology that thrives on and enables top-down centralized control, whereas crypto is a technology that's all about bottom-up decentralized cooperation. And in many ways, actually, you can think of crypto as the study of building systems that are decentralized, that enable large-scale cooperation of humans where there isn't really any central point of control. And so that's one natural way in which these two technologies are
[00:07:19] counterweights for one another. Another one is that AI is a sustaining innovation, in that it reinforces the business models of existing technology companies, because it helps them make top-down decisions. And the best example of this would be Google being able to decide exactly what ad to display for each of their users across billions of users and billions of page views. Whereas crypto is actually a fundamentally disruptive innovation, in that it has a business model that's fundamentally at odds with the business models of big tech companies. And so as a result, it's a movement that is spearheaded by rebels, by the fringes, as opposed to being led by the incumbents.
[00:07:56] So that's the second. A third one is that AI will probably relate and interplay a lot with all of the trends towards privacy, because AI as a technology has built in all sorts of incentives that move us towards less individual privacy, because we will have companies that want access to all of our data. And AI models that are trained on more and more data will become more and more effective. And so I think that that leads us down a path of the AI panopticon, where there's just collective aggregation of everyone's data into the training of these enormous models in order to make these models as good as possible.
[00:08:34] Whereas crypto moves us in the opposite direction, which is a direction of increasing individual privacy. It's a direction of increasing sovereignty, where users have control over their own data. And those two trends, I think, will be very important. And this is just another important way in which crypto is the counterweight for AI. And maybe the final one has to do with this latest trend in AI, the fact that AI is now very clearly a powerful technology for generating new art. It is now a creative tool that will lead us to infinite abundance of media, infinite creativity in many ways. And crypto is a counterweight to that, because it helps us cut through all of the abundance by helping us distinguish what's created by humans versus what's created by AI.
[00:09:19] And cryptography will be an essential part of maintaining and preserving what actually is human in a world where a thousand x more of the content is actually artificially generated. So these are all things that we can talk about, but I think that there is this important meta-narrative: these two technologies are very much diametrically opposed in many respects. So maybe, Ali, to add to that -- this is a wonderful summary -- I would say also that there are a lot of areas where techniques from AI are having an impact in blockchains, and vice versa, where techniques from blockchains are having an impact in AI. I'll give a brief answer here because we're going to dive into the details in just a minute. But there are many points of intersection.
[00:09:59] I guess we'll talk about applications of zero knowledge to machine learning in just a minute. But I also want to touch on all these applications where machine learning itself can be used to write code. So, for example, machine learning can be used to write Solidity code that goes into a contract. It can be used to find maybe errors in code, and so on. There are points of intersection where machine learning can be used to generate deepfakes, and blockchains can actually help to protect against deepfakes. And so I guess we're going to touch on all these points. But the interesting thing is that there's really quite a lot of intersection between blockchains and machine learning. Yeah, before we dive into those, one question I have for you, Dan, is: do you agree with that?
[00:10:36] I mean, I definitely hear Ali's point that AI and crypto are very natural complements actually, or counterweights really, for each other. Or, you know, they can be different forces that can kind of check and balance each other almost. But is this an inherent quality of AI and crypto, in your opinion? Or is this just an artifact of the way things are done right now? What parts might you agree or disagree with? Yeah. So I would say that if you look at it from far away, the techniques that are used in AI
[00:11:03] seem very different from the techniques that are used in blockchains. So blockchains are about cryptography, decentralization, finance, and economics, and so on, whereas AI is, you know, about statistics, the mathematics of machine learning, and so on. It's about big data, right? The techniques actually look quite different. But there are actually a lot of places where one side can help the other, and vice versa. So maybe the first one to start with is kind of the obvious one that's been on a lot of people's minds, which is what's called the application of zero knowledge to machine learning. This is kind of an emerging area. It's called ZKML. And the reason this has become interesting is because ZK techniques have improved dramatically because of their application in blockchains. What's happened over the last 10 years is sort of unbelievable. You know, it's something that we don't see very often. This idea of zero knowledge proofs, and proof systems in general -- they were considered very theoretical a decade ago. And because of all of their applications in blockchains, all of a sudden there was a lot of effort in making them more practical and real-world. And as a result, there's been tremendous progress. As our listeners
[00:12:06] know, these things are now actually deployed and used to protect real systems. So the question then is: can zero knowledge techniques be used to help machine learning? And there are a couple of examples -- honestly, we could spend a whole podcast just on ZKML. But maybe I can just give a taste, one or two examples where ZK is useful for machine learning. So imagine Alice has a secret model that she spent a lot of time training. And that model is actually very important to her. It's very important that people don't know how the model works. But she still wants to be able to service requests from Bob.
[00:12:36] So Bob would like to send her some data. She would apply the model to the data, send the result back to Bob. Bob has no idea whether he's getting the correct results from the model, right? Maybe he paid for a certain model and he wants to make sure that Alice is really using that model. Maybe he paid for GPT-4 and he wants to make sure Alice is really using GPT-4 and not GPT-3. Well, it turns out ZK techniques can help here a lot. So what Alice would do is she would commit to her model and make the commitment publicly available. And then whenever Bob submits a piece of data,
[00:13:06] Alice could run the model on that data and send the results back to Bob, along with a proof that the model was evaluated correctly. So Bob now would have a guarantee that, in fact, the model that was committed to is the one that was run on Bob's data. Yeah? So that's an example where ZK techniques can be useful in the ML case. And I want to kind of stress why this is so important. So let's look at one example. Suppose we have a function, a model, that's actually used to affect people's lives. Like imagine, you know, maybe we use a model to decide whether we grant a loan or grant a mortgage -- a financial institution might want to use a model like that. Well, you want to make sure that the same model is being applied to everyone, right? That it's not the
[00:13:49] case that, you know, one model is being applied to me and a different model is being applied to you. Well, by basically having the bank commit to the model, everyone can then verify that their data is being assessed by the same committed model. We can make sure
[00:14:19] that the same model is being applied to everyone. And I have to say that there's a wonderful open problem here, which is that even though zero knowledge techniques can make sure that the same model is being applied to everyone, there is this question, you know: is the model fair? Models can have biases that could lead to unfair results. And so there's a whole area of research called algorithmic fairness. There are many, many papers on algorithmic fairness. And it's really interesting to ask: well, now that we have a committed model, can we prove in zero knowledge that the model satisfies some fairness definition from the area of algorithmic fairness? And how do we make sure that the training process ran correctly?
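To make that commit-and-prove flow concrete, here is a minimal Python sketch of the protocol shape. It's a sketch under stated assumptions: the commitment is just a hash of serialized weights, the "model" is a toy linear function, and `zk_prove`/`zk_verify` are hypothetical placeholders standing in for a real ZKML proof system (which would prove the evaluation over an arithmetized version of the model rather than re-running it).

```python
import hashlib
import json

def commit(model_weights):
    """Alice publishes a binding commitment to her model (here, a simple hash)."""
    serialized = json.dumps(model_weights, sort_keys=True).encode()
    return hashlib.sha256(serialized).hexdigest()

def predict(model_weights, x):
    """A toy linear model standing in for the real (secret) model."""
    return sum(w * xi for w, xi in zip(model_weights["w"], x)) + model_weights["b"]

# --- Hypothetical ZKML interface: these two calls are placeholders only. ---
def zk_prove(model_weights, x, y, commitment):
    """Would produce a succinct proof that y = f(x) for the committed f."""
    raise NotImplementedError("stand-in for a real ZK proving system")

def zk_verify(proof, x, y, commitment):
    """Would check the proof against the public commitment, input, and output."""
    raise NotImplementedError("stand-in for a real ZK verifier")

# Protocol shape:
# 1. Alice publishes commitment = commit(weights) once.
# 2. Bob sends x; Alice replies with y = predict(weights, x)
#    and proof = zk_prove(weights, x, y, commitment).
# 3. Bob accepts y only if zk_verify(proof, x, y, commitment) passes, so he
#    knows the committed model (the one he paid for) is what produced y.
commitment = commit({"w": [1.0, -2.0], "b": 0.5})
print("public commitment:", commitment)
```

The same shape covers Dan's lending example: the bank commits once, and every applicant verifies their decision against that single public commitment.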
[00:14:42] Well, everything that you said about ZKML is extremely exciting. And as a technology, I think it'll play a role in making machine learning and AI sort of generally more transparent and more trustworthy, both within the context of crypto and outside of it. I think an even crazier and maybe longer-term and more ambitious application of ZKML, and some of the other verification techniques that the crypto community has been working on, is just generally decentralizing AI. Because, as we were talking about before, AI is a technology that is almost inherently centralizing, in that it very much thrives on things like scale effects, because having things
[00:15:24] within a data center makes things more efficient. And so scaling things in a centralized way makes for things to become more powerful, and more centralized as a result. Then also, data is usually controlled by a small number of tech companies in the middle, and as a result that also kind of leads to additional centralization. And then finally, machine learning models and machine learning talent are also kind of controlled by a small number of players. And so crypto can again help on this front by building technology, using things like ZKML, that can help us decentralize. So there are three main things that go into AI. There's the compute aspect, and that requires sort of large-scale use of GPUs, usually in data centers. There's the data piece, which, again,
[00:16:06] most of the centralized companies control. And then there are the machine learning models themselves. And the easiest one might be the prong of compute. Like, can you actually decentralize the compute for the training and the inference of machine learning models? And this is where some of the techniques that Dan was talking about come in -- things like zero knowledge proofs that can be used to prove that the process of actually conducting inference or training a model was actually done correctly, so that you can outsource that process to a large community, and you can have a distributed process by which anyone who has a GPU can contribute computation to the network and have a model be trained in that way, without necessarily having to rely on a massive data
[00:16:47] center with all of the GPUs in a centralized manner. And there's a big question still of whether or not that economically ends up making sense, but at the very least, through the right incentives, you can actually tap into the long tail. You can tap into all of the idle GPU capacity that might exist, have all of those people contribute that computation to the training of a model or to the running of inference, and it provides an alternative to what otherwise would be just the big tech companies in the middle that currently control everything. There are all sorts of important technical problems that would have to be solved in order for that to be possible. There's actually a company in the space called Gensyn, which is building exactly this.
[00:17:27] They are building a decentralized marketplace for GPU compute, very much for the purpose of training machine learning models. And it's a marketplace where anyone could contribute their GPU compute, whether it be their kind of personal computer under their desk or whether it be idle inside of some data center. And then on the other side, anyone can leverage whatever compute exists in the network to train their large machine learning models. And this would be an alternative to the very centralized sort of OpenAI-slash-Google-slash-Meta-slash-insert-your-favorite-big-tech-company-here alternative that currently you would necessarily have to go with. So before we go into more of the decentralization framework -- because, Ali, you were breaking down compute, and I think you were going to share the other two of those three prongs -- before we do, both of you guys talked a little bit about all the technical challenges here.
[00:18:21] So what are some of the technical challenges that need to be overcome here, and that people may or may not already be solving? I definitely want builders who listen to this episode to also think about what the opportunities are in this space to address existing challenges, or what are some of the challenges they're going to face in building solutions here. Yeah, so maybe I can mention two that I think would be interesting to folks. So one is: basically, imagine you have a situation where Alice actually has a model that she wants to protect. She wants to send the model in an encrypted form to some other party, let's say to Bob. So Bob now receives an encrypted model, and he needs to be able to run his data on this encrypted model. Well, how do you do that?
[00:18:58] If you have a piece of data that you want to run on a model, but you only have the encryption of the model, how do you make that possible? And that is something that we would use what's called fully homomorphic encryption for. Yeah. Fully homomorphic encryption is this remarkable tool that allows you to compute on encrypted data. This is kind of mind-boggling that this is possible, but you can have an encrypted model, and you might have some cleartext data, and you can actually run the encrypted model on the cleartext data and obtain an encrypted result. You would
[00:19:29] send the encrypted result back to Alice, and she would be able to decrypt and see the results in the clear. So this is actually something -- there's actually quite a bit of demand for this in practice. It doesn't take much effort to see that the DOD is interested in this. There are many other applications where you can send an encrypted model to a third party. The third party would run the encrypted model on their data, send you back the results. You can decrypt and learn something about the data that was given as input to the encrypted model. The question, of course, is how do we scale that up? Right now, this works well for medium-sized models. And the question is, can we scale it up to much larger models? So this is quite a challenge.
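As a toy illustration of the "compute on an encrypted model" idea, here's a small, self-contained Python sketch using a Paillier-style additively homomorphic scheme. This is deliberately much weaker than the fully homomorphic encryption Dan is describing (it only supports linear models) and uses insecure, demo-sized parameters, but it shows the shape: Bob evaluates Alice's encrypted linear model on his cleartext data without ever seeing the weights, and only Alice can decrypt the result.

```python
import math
import random

# Paillier-style additively homomorphic encryption. Toy parameters: insecure,
# for illustration only (real deployments use 2048-bit-plus moduli, and real
# FHE schemes also support multiplication, hence nonlinear models).
p, q = 1_000_003, 1_000_033          # small primes for the demo
n, n2 = p * q, (p * q) ** 2
lam = math.lcm(p - 1, q - 1)
mu = pow(lam, -1, n)                  # valid because we fix g = n + 1

def encrypt(m):
    r = random.randrange(1, n)
    while math.gcd(r, n) != 1:
        r = random.randrange(1, n)
    return (pow(n + 1, m % n, n2) * pow(r, n, n2)) % n2

def decrypt(c):
    m = ((pow(c, lam, n2) - 1) // n * mu) % n
    return m if m <= n // 2 else m - n  # decode negative values

# Alice encrypts her (integer-scaled) linear model's weights and bias.
weights, bias = [3, -2, 5], 7
enc_weights, enc_bias = [encrypt(w) for w in weights], encrypt(bias)

# Bob computes Enc(w . x + b) on his cleartext input without seeing the model:
# under Paillier, Enc(a) * Enc(b) = Enc(a + b) and Enc(m)^k = Enc(k * m).
x = [10, 4, 1]
enc_result = enc_bias
for c, xi in zip(enc_weights, x):
    enc_result = (enc_result * pow(c, xi, n2)) % n2

# Bob returns enc_result to Alice, who decrypts the prediction in the clear.
assert decrypt(enc_result) == sum(w * xi for w, xi in zip(weights, x)) + bias
print(decrypt(enc_result))  # 34
```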
[00:20:06] There are a couple of startups in the space. And again, very, very interesting technology -- it's kind of amazing that this is possible at all. Yeah. And we're probably going to see much more of that in the future. The other area is actually what Ali mentioned, another very important area, which is: how do we know that the model was trained correctly? So if I send my data to someone and I ask them to train a model on that data, maybe fine-tune the model on that data, how do I know that they did that correctly? They might send me a model back, but how do I know that the model doesn't have backdoors in it? There's actually a fair amount of work showing that if the training is done incorrectly, I could send you back a model that would work correctly on all of
[00:20:43] your test data, but that has a backdoor, meaning that it would fail catastrophically on one particular input. This is possible if your training process is not verified. And again, this is an area where ZKML comes in: we can prove to you that the training ran correctly, or maybe there are some other techniques that might be possible to prove that the training was done correctly. But this is another area, and a very active area of research. And I would encourage many of the listeners -- this is a very, very difficult problem: proving that training was done correctly, proving that the training data even was collected correctly and was filtered correctly, and so on. So that actually is a wonderful area to get into if people are looking to do more work in
[00:21:22] the space. Fantastic. Ali, is there anything you'd add to that? Yeah, definitely. Well, I guess if we continue down the path of talking about what it would take to help decentralize the AI stack -- and there are the three important prongs -- if we wanted to decentralize the compute aspect, there are two very important open technical challenges. The first is the problem of verification, which Dan just mentioned, which you could use ZKML for. And you can ideally, over time, use zero knowledge proofs to prove that the actual work of the people who are contributing to this network was actually done correctly. And the challenge there is that the performance of these cryptographic primitives is nowhere near where it needs to be to be able to do either training or inference of the very, very large models. So the models today -- sort of the LLMs that we all kind of know and love now, like ChatGPT -- would not be provable using the current state of the art of ZKML.
[00:22:20] And so there's a lot of work being done towards improving the performance of the proving process so that you can prove larger and larger workloads efficiently. But that's an open problem and something that a lot of people are working on. And in the meantime, companies like Gensyn are using other techniques that are not just cryptographic and instead are game-theoretic in nature, where they just get a larger number of people who are independent from one another to do the work and compare their work with one another to make sure that the work's done correctly. And that is more of a game-theoretic, optimistic approach that is not relying on cryptography, but is still aligned with this greater goal of decentralizing AI, or helping create an ecosystem for AI that is much more organic, community-owned, and bottom-up,
[00:23:06] as opposed to the top-down approach that's being sort of put forth by companies like OpenAI. So that's the first problem. The first big problem is the problem of verification. And the second big one is the problem of distributed systems. Like, how do you actually coordinate a large community of people who are contributing GPUs to a network such that it all feels like an integrated, unified substrate for computation? And there will be lots of interesting challenges along the lines of breaking up a machine learning workload in a way that makes sense and shipping off different pieces of it to different nodes in the network, figuring out how to do all of that efficiently, and then also, when nodes fail, figuring out
[00:23:45] how to recover and assign new nodes to then take over whatever work was being done by the node that failed. So there are lots of messy details at the distributed systems level that companies will have to solve in order to give us this decentralized network that can perform machine learning workloads in a way that's perhaps even cheaper than just using the cloud. Yeah, that's great. Totally. It's definitely true that the ZK techniques today will handle the smaller models that are being used, but definitely the LLMs are probably too big for these techniques to handle today. But, you know, they're constantly getting better. The hardware is getting better. And so hopefully they'll catch up. Yeah. Before we go on, can we just do a really clear pulse check on where we are exactly? Obviously, what I'm hearing you guys say is that there are tremendous applications at the intersection of general verifiable computing, which blockchains and crypto have definitely been significantly advancing and accelerating as a whole area. We've been covering a ton of it ourselves. If you look at our
[00:24:44] ZK canon and zero knowledge category, you'll see so much of this covered there. But where are we exactly right now in terms of what they can do? Because you guys talked a lot about what they can't do yet and what the opportunity is, which is exciting. But where are we right now? Like, what can they actually do? Yeah. So right now they can basically do classification for medium-sized models. So not something as big as GPT-3 or 4, but for medium-sized models, it is possible to prove that the classification was done correctly. Training is probably beyond what can be done right now, just because training is so compute-intensive that our proof systems are not there yet. But like Ali said, we have other ways to do it. For example, we can have multiple people do the training and then compare the
[00:25:27] results, yeah, so that now there are game-theoretic incentives for people not to cheat. If somebody cheats, somebody else might be able to complain that they computed the training incorrectly, and then whoever cheated will not be paid for their effort. Right, right. So there's an incentive for people to actually run the training the way it was supposed to run. Right. And so basically that's sort of not like a hybrid phase, but it's basically like alternative approaches until more of this comes to scale and performance scales to a point where we can get there.
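Here is a minimal Python sketch of that optimistic, replicate-and-compare idea -- hypothetical names and a hash-of-result check standing in for a real protocol (real systems also need deterministic training, staking, and dispute games), just to show how redundant computation plus slashing creates the incentive not to cheat.

```python
import hashlib
from collections import Counter

def result_digest(result: bytes) -> str:
    """Workers submit a digest of their training output (e.g., final weights)."""
    return hashlib.sha256(result).hexdigest()

def settle(submissions: dict[str, str], payment: int, stake: int) -> dict[str, int]:
    """Optimistic settlement: workers whose digest matches the majority get paid;
    anyone who disagrees with the majority loses their stake. Assumes training
    is deterministic, so honest workers always agree on the result."""
    majority_digest, _ = Counter(submissions.values()).most_common(1)[0]
    return {
        worker: (payment if digest == majority_digest else -stake)
        for worker, digest in submissions.items()
    }

# Three independent workers are assigned the same training job.
honest = result_digest(b"weights-v1")
submissions = {
    "worker_a": honest,
    "worker_b": honest,
    "worker_c": result_digest(b"cheaper-fake-weights"),  # a cheater cuts corners
}
print(settle(submissions, payment=100, stake=500))
# {'worker_a': 100, 'worker_b': 100, 'worker_c': -500}
```

The design choice is exactly the trade Dan and Ali describe: no cryptographic proof is needed, but you pay for the redundancy, and the stake must exceed what cheating could earn.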
[00:25:56] Yeah, I would say that for some models, classification can be proved in zero knowledge today. For training right now, we have to rely on optimistic techniques. Yeah, great. So, Ali, you mentioned compute as one of the three prongs, and you also mentioned data and then the models for machine learning themselves. Do you want to tackle data now, and sort of the opportunities and challenges there when it comes to the crypto-slash-AI intersection? Yeah, absolutely. So I think that there is an opportunity, even though the problems involved are very difficult, to decentralize the process of sourcing data for the training of large
[00:26:29] machine learning models from a broader community -- again, instead of having a single centralized player just collect all of the data and then train the models themselves. And this could work by creating a kind of marketplace that's similar to the one that we just described for compute, but that instead incentivizes people to contribute new data to some big data set that then gets used to train a model. The difficulty with this is similar, in that there's a verification challenge. You have to somehow verify that the data that people are contributing is actually good data, and that it's not either duplicate data or garbage data that was just
[00:27:06] sort of randomly generated or not real in some way. And you also have to make sure that the data doesn't somehow subvert the model in some kind of poisoning attack, where the model actually becomes either backdoored or just sort of less good or less performant than it used to be. So there's the question of how do you verify that that is the case? And that's maybe an open, hard problem for the community to solve. It may be impossible to solve completely, and you might have to rely on a combination of technological solutions and social solutions, where you also have some kind of reputation metric that members in the community are able to earn to build up credibility, such that when they contribute data, the data can then be trusted a little bit more than it would
[00:27:47] be otherwise. But what this might allow you to do is truly cover the very, very long tail of the data distribution. And one of the things that's very challenging in the world of machine learning is that your model is really only as good as the coverage of the distribution that your training data set can achieve. And if there are inputs that are far, far out of the distribution of the training data, then your model might actually behave in a way that's completely unpredictable. And in order to actually get the model to perform well in the edge cases and the sort of
[00:28:20] black swan data points or data inputs that you might experience in the real world, you do want to have your data set be as comprehensive as possible. And so if you had this kind of open, decentralized marketplace for the contribution of data to a data set, you could have anyone who has very, very unique data out in the world contribute that data to the network, which is a better way to do this, because if you try to do this as a central company, you have no way of knowing who has that data. And so if you flip it around and create an incentive for those people to come forward and provide that data of their own accord, then I think you can actually get significantly better coverage of the long tail. And as we've seen, the performance of machine learning models continues to improve as the data set grows and as the diversity of the data points in the data set grows. And so this can actually supercharge the performance of our machine learning models to an even greater degree, where we're able to get even more comprehensive data sets that cover the whole of the
[00:29:10] distribution. So let me turn this on its head, in that if we're going to incentivize people to contribute data, basically we're going to incentivize people to create fake data so they can get paid. Yeah, so we have to have some sort of a mechanism to make sure that the data you're contributing is authentic. Exactly. And you can imagine a couple of ways of doing this, right? I mean, one way is actually by relying on trusted hardware, right? Maybe the sensors themselves are embedded in some trusted hardware, so that we would only trust data that's properly signed by the hardware.
[00:29:49] That's one way to do things. Otherwise, we would have to have some other mechanism by which we can tell whether the data is authentic or not. Completely agree. That would be the biggest open problem to solve. And I think that as benchmarking for machine learning models gets better -- I think there are two important trends in machine learning at the moment. There's improving the measurement of the performance of a machine learning model. And for LLMs, that's still very much in its early stages, in that it's actually quite hard to know how good an LLM is, because it's not as if they were like a classifier, where what the performance of a model is is very clearly defined.
[00:30:24] With an LLM, it's almost as if you're testing the intelligence of a human, right? And coming up with the right way of testing how intelligent an LLM like ChatGPT is, is an open area of research. But over time, I think that will become better and better. And then the other trend is that we're getting better at being able to explain how it is that a model works. And so with both of those things, at some point, it might become feasible to understand the effect that a dataset has on a machine learning model's performance.
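One simple way to picture that kind of attribution, as a minimal Python sketch with made-up numbers: score each contributor by the leave-one-contributor-out change in validation accuracy. (Real data-valuation schemes, such as Shapley-value approaches, are far more sophisticated, and a real marketplace would also check provenance -- e.g., the hardware signatures Dan mentioned.)

```python
def train(data):
    """Toy 'model': predict the majority label in training (stand-in for real training)."""
    ones = sum(label for _, label in data)
    return 1 if ones * 2 >= len(data) else 0

def accuracy(model, validation):
    return sum(label == model for _, label in validation) / len(validation)

def value_contributions(contributions, validation):
    """Reward signal: how much does each contributor's data move validation accuracy?"""
    full = accuracy(train([ex for d in contributions.values() for ex in d]), validation)
    scores = {}
    for name in contributions:
        rest = [ex for other, d in contributions.items() if other != name for ex in d]
        scores[name] = full - accuracy(train(rest), validation)  # marginal effect
    return scores

validation = [("v1", 1), ("v2", 1), ("v3", 0), ("v4", 1)]
contributions = {
    "alice": [("a1", 1), ("a2", 1), ("a3", 1)],  # useful long-tail data
    "bob":   [("b1", 0), ("b2", 0)],             # data that adds nothing here
}
print(value_contributions(contributions, validation))
# {'alice': 0.5, 'bob': 0.0} -- pay contributors in proportion to their effect
```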
[00:30:53] And if we can have a good understanding of whether or not a data set that was contributed by a third party helped the machine learning model's performance, then we can reward that contribution, and we can create the incentive for that marketplace to exist. So just to summarize so far, what I heard you guys say is that there's trusted hardware that can help check the accuracy of the data that's being contributed and the models that are being contributed. Ali, you mentioned briefly that reputation metrics and that type of thing can help. You also mentioned that there might be a way, not necessarily now, but sometime in the near future, to check how the data is influencing the outcomes in a particular model, so
[00:31:31] that you can actually -- it's not quite explainability, but the idea is that you can actually attribute that this data set causes this particular effect. So there's a range of various techniques you guys have shared so far. Well, the final thing is that you could do the same thing for the third prong, which is models. I can imagine that you could create an open marketplace for people to contribute a trained model that is able to solve a particular kind of problem. So imagine if on Ethereum you created a smart contract that embedded some kind of test -- be it a cognitive test that an LLM could solve, or some classification test that a machine learning model that is a classifier could solve. And if, using ZKML, someone could provide a model alongside a proof that that model can solve that test, then again, you now have the tools that you need to create a marketplace that incentivizes people to contribute machine learning models that can solve certain problems.
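Here is a minimal Python sketch of that bounty idea, with the on-chain contract simulated off-chain and the ZK step reduced to a placeholder (hypothetical names throughout). In a real version, the contract would verify a succinct ZKML proof against a committed test set instead of re-running the model, so the model itself could stay private.

```python
class ModelBountyContract:
    """Simulates a smart contract holding a bounty for any model that passes a test."""

    def __init__(self, test_cases, required_accuracy, bounty):
        self.test_cases = test_cases          # public (input, expected_output) pairs
        self.required_accuracy = required_accuracy
        self.bounty = bounty
        self.claimed_by = None

    def claim(self, claimant, model):
        # Placeholder for ZK verification: here we just re-run the model on the
        # embedded test. On-chain, a zk_verify(proof, commitment, test_digest)
        # call would replace this loop.
        if self.claimed_by is not None:
            return "bounty already claimed"
        correct = sum(model(x) == y for x, y in self.test_cases)
        if correct / len(self.test_cases) >= self.required_accuracy:
            self.claimed_by = claimant
            return f"{claimant} wins {self.bounty} tokens"
        return "model does not pass the test"

# A toy test: classify a number as positive (1) or not (0).
contract = ModelBountyContract(
    test_cases=[(-2, 0), (-1, 0), (0, 0), (1, 1), (2, 1)],
    required_accuracy=0.8,
    bounty=1000,
)
print(contract.claim("alice", lambda x: 1 if x > 0 else 0))  # alice wins 1000 tokens
```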
[00:32:25] So many of the problems that we've discussed -- the open problems that we've discussed on how to do that -- are also present here. In particular, there's a ZKML piece, where you have to be able to prove that the model is actually doing the work that it's claiming that it's doing. And then also, we need to be able to have good tests of how good a model is, right? So being able to embed a test inside of a smart contract that then you can subject a machine learning model to, to evaluate how good it is -- this is another very nascent part of this whole technology trend. But in theory, it'd be great if we eventually get to a world where we do have these very open, bottom-up, transparent marketplaces that allow people to contribute and source compute, data, and machine learning models, and that essentially act, again, as a counterweight to the very, very centralized, enormous tech companies that are driving all of the AI work today. I really love, Ali, that you mentioned that, because it's been a longstanding problem in AI in practice that it can
[00:33:24] solve a lot of things for like the bell curve, the middle of the norm, but not the tail ends. And a classic example of this is self-driving cars, right? Like, you can do everything with a certain amount of standard behaviors, but it's the edge cases where the real accidents and catastrophic things can happen. So that was super helpful. And I know you talked about some of the incentive alignment and incentives for providing accurate and quality data, and even incentives for just contributing anything in the long tail of data overall.
[00:33:50] But on the long tail side, a quick question that popped out for me when you were talking -- it sort of begs the question of who makes money where in this system. I could not help but wonder where the business model kind of comes in then, in terms of making money for companies, because I always understood that in the long tail of AI, in a world of these kinds of available data sets, your proprietary data is actually your unique domain knowledge, and kind of the thing that only you know in that long tail. So do you guys have any quick responses to that? So I think that the vision behind crypto intersecting with AI is that you could create a set of protocols that distributes the value that will eventually be captured by
[00:34:29] this new technology, by AI, amongst a much larger group of people -- essentially a community of people, all of whom can contribute and all of whom can take part in the upside of this new technology. And so then the people who would be making money would be the people who are contributing compute, or the people who are contributing data, or the people who are contributing new machine learning models to the network, such that better machine learning models can be trained and bigger, more important problems can be solved. The other people who would be making money at the same time are the people on the other side, on the demand side of this network -- people who are using this as infrastructure for
[00:35:06] training their own machine learning models. Maybe their model does something interesting in the world. Maybe it's the next generation of ChatGPT, and that then goes on to make its way into a bunch of different applications -- like, say, enterprise applications, or whatever it is that those models may be used for. And those models drive value capture in their own right, because those companies will have a business model of their own. And then finally, the people who might also make money
[00:35:31] are the people who build this network. So, for example, you create a token for the network. That token will be distributed to the community, and all of those people will have collective ownership over this decentralized network for compute, data, and models, which may also capture some value of all of the economic activity that goes through this network. So you can imagine any payment for compute or any payment for models could have some fee imparted on it. It might just go to some treasury that's
[00:36:01] controlled by this decentralized network, which all token holders that are part of this network have collective ownership of and access to, as well as the creators and owners of the marketplace. And that fee might just go to the network. So you can imagine that every transaction that goes through this network -- every form of payment that pays for compute or pays for data or pays for models -- might have some fee imparted on it that goes to some treasury that's controlled by the whole network and by the token holders that collectively own the network. And so that's essentially a business model for the network itself. Great. Okay. So so far we've been talking a lot about the way that crypto can help AI.
[00:36:42] I mean, to be clear, it's not like it's unidirectional. These things are kind of reinforcing and bidirectional, and more interactive than one-way. But for the purpose of this discussion, we're really talking about it being like: here's how crypto can help AI. Let's now kind of turn it on its head and talk a little bit more about ways that AI can help crypto. Yeah, so there are a couple of interesting touchpoints there. So one that's actually worth bringing up is this idea of machine learning models that
[00:37:07] are used to generate code. So many of the listeners have probably heard of Copilot, which is a tool that's used to generate code. And what's interesting is you can try to use these code generation tools to write Solidity contracts or to write cryptography code. And I want to stress that this is actually a very dangerous thing to do. Oh, do not do this at home. Okay. Yeah, do not try this at home, because what happens is, very often these systems actually will generate code that works -- you know, when you try to run it, you know, encryption is the opposite of decryption and so on. So the code will actually work, but it will actually be insecure. We've actually written a paper about this recently that
[00:37:42] shows that if you try to get a copilot to just write something as simple as an encryption function, it will give you something that does encryption correctly, but it uses an incorrect mode of operation, yeah, so that you'll end up with an insecure encryption mode. Similarly, if you try to get it to generate Solidity code, you might end up with Solidity code that works, but it will have vulnerabilities in it. So you might wonder, why does that happen? And one of the reasons is because these models are basically trained on code that's out there. They're trained on GitHub repositories.
[00:38:13] Well, a lot of the GitHub repositories actually are vulnerable to all sorts of attacks. And so these models learn about code that works, but not code that is secure. It's almost like garbage in, garbage out. And so I do want to make sure people are very careful when they use these generative models to generate code -- that they very, very carefully check that the code actually does what it's supposed to do, and that it does it securely.
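As a concrete illustration of the mode-of-operation pitfall Dan describes (a hedged sketch, not an example taken from the paper itself): assistant-generated encryption code often reaches for a raw block-cipher mode like ECB or a fixed IV, which "works" -- it decrypts correctly -- but leaks plaintext patterns and provides no integrity. An authenticated mode such as AES-GCM with a fresh random nonce is the safer default, here using the widely used Python `cryptography` package:

```python
import os
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

# What insecure generated code often does: AES in ECB mode (or a fixed IV).
# It round-trips correctly, so tests pass -- but identical plaintext blocks
# produce identical ciphertext blocks, and there's no integrity protection.

# A safer pattern: authenticated encryption (AES-GCM) with a fresh nonce.
key = AESGCM.generate_key(bit_length=256)
aesgcm = AESGCM(key)

nonce = os.urandom(12)  # never reuse a nonce under the same key
ciphertext = aesgcm.encrypt(nonce, b"attack at dawn", None)
assert aesgcm.decrypt(nonce, ciphertext, None) == b"attack at dawn"
# Tampering with the ciphertext now fails loudly instead of decrypting to garbage:
# aesgcm.decrypt(nonce, ciphertext[:-1] + b"\x00", None) raises InvalidTag.
```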
[00:38:54] One idea on that front -- and I'm curious what you think about this -- is that you can use AI models like LLMs, like ChatGPT, to generate code in conjunction with other tools, to try to make the process less error-prone. And so one idea would be to use an LLM to generate a spec for a formal verification system. So basically, you describe your program in English, and you ask the LLM to generate a spec for a formal verification tool. Then you ask the same instance of the LLM to generate the program that meets that spec, and then you use a formal verification tool to see whether the program actually meets the spec. And if there are errors, that tool will catch the errors. Those errors can then be used as feedback back to the LLM. And then ideally, hopefully, the LLM can then revise its work and produce another version of the code that is correct. And eventually, if you do this again and again, you end up with a piece of code that ideally fully meets the spec and is formally verified to meet the spec.
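The loop Ali is sketching might look something like this in Python -- `llm` and `formal_verify` are hypothetical stand-ins for a real model API and a real verifier, and the same shape covers the static-analyzer-plus-LLM repair loop discussed a bit later (just swap the verifier for an analyzer):

```python
def generate_verified_code(description, llm, formal_verify, max_rounds=5):
    """Iteratively generate code until it formally meets an LLM-written spec.

    llm(prompt) -> str is a hypothetical text-generation call;
    formal_verify(spec, code) -> list[str] is a hypothetical verifier that
    returns a list of counterexamples/errors (empty list means 'proved').
    """
    spec = llm(f"Write a formal spec for this program: {description}")
    code = llm(f"Write a program satisfying this spec:\n{spec}")
    for _ in range(max_rounds):
        errors = formal_verify(spec, code)
        if not errors:
            return spec, code  # a human still reviews the spec itself
        code = llm(
            f"This code failed verification.\nSpec:\n{spec}\n"
            f"Code:\n{code}\nErrors:\n{errors}\nPlease fix the code."
        )
    raise RuntimeError("could not produce verified code within the round limit")
```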
[00:39:41] And because the spec is maybe readable by a human, you can maybe go through the spec and see, like, yes, this is the program that I intended to write. And that could be an actually pretty good way to use LLMs to write code that also isn't as prone to errors as it might be if you were to just ask ChatGPT to generate a smart contract for you. Clever. Yeah, this is great. And actually, this leads into another topic that is worth discussing, which is basically
[00:40:04] using LLMs to find bugs, right? So suppose a programmer actually writes some Solidity code, and now you want to test: is that code correct? Is it secure? And like Ali said, you can actually try to use the LLM to find vulnerabilities in that code. And there's been actually quite a bit of work on trying to assess how good LLMs are at finding bugs in software -- in Solidity smart contracts and in C and C++. There's one paper that came out recently that's actually very relevant.
[00:40:30] It's a paper from the University of Manchester, which says that you would run a standard static analysis tool to find bugs in your code, and it would find all sorts of memory management bugs or potential bugs -- just a standard static analysis tool, no machine learning whatsoever -- but then you would use an LLM to try and fix the code. Yeah? So it would propose a fix for the bug automatically.
[00:40:52] And then you would run the static analyzer again on the fixed code. And the static analyzer would say, oh, the bug is still there, or the bug is not there. And you would keep iterating until the static analyzer says, yeah, now the bug has been fixed and there are no more issues there. So that was kind of an interesting paper. This paper literally came out like two weeks ago. So for both of these papers you just referenced, Dan -- the one from the University of Manchester, and also the one that you guys recently wrote on LLMs not being trusted to write correct code --
[00:41:18] it could be working code, but not necessarily secure code -- I'll link to those papers in the show notes so that listeners can find them. Just one quick question before we move on from this. So this is about the current state. Is this a temporary situation, or do you think there will be a time when LLMs can be trusted to write correct -- not just working, but secure -- smart contract code? Is that possible, or is that just way far off? That's a difficult question to answer.
[00:41:44] You know, these models are improving by leaps and bounds every week, right? Yeah. So it's possible that, you know, by next year these issues will already be addressed and they could be trusted to write more secure code. I guess what we're saying is that right now, with the current models that we have -- GPT-4, GPT-3, and so on -- if you use them to generate code, you have to be very, very careful and verify that the code they wrote actually does what it's supposed to do and is secure. Got it.
[00:42:09] Well, and by the way, will we get to a point where the code that LLMs generate is less likely to contain bugs than the code that a human generates? And maybe that's the more important question, right? Because in the same way that you can never say that a self-driving car will never crash, the real question that actually matters is: is it less likely to crash than if it were a human driver? That's exactly right. Because the truth is that it's probably impossible to
[00:42:34] guarantee that there will never be a car crash that is caused by a self-driving car, or that there will never be a bug that's generated by an LLM that you've asked to write any code. And I think this will only, by the way, get more and more powerful the more you integrate it into existing toolchains. So as we discussed, you can integrate this into formal verification toolchains. You can integrate this into other tools, like the one that Dan described, where you have a tool that checks for memory management issues. You can also integrate it into unit testing and integration testing toolchains, so that the
[00:43:03] LLM is not just acting in a vacuum. It is getting real-time feedback from other tools that are connecting it to the ground truth. And I think that the combination of machine learning models that are extremely big, trained with all of the data in the world, combined with these other tools, might actually make for programmers that are quite a bit superior to human programmers. And even if they might still make mistakes, they might just be superhuman. And that'll be a big moment for the world of software engineering generally. Yeah, that's a great framing, Ali. So
[00:43:38] what are some of the other trends that come in for where AI can help crypto, and vice versa? Yeah. One exciting possibility in the space is that we may be able to build decentralized social networks that actually behave a lot like Twitter might, but where the social graph is actually fully on chain, and it's almost like a public good that anyone can build on top of. And you as a user, you control your own identity on the social graph. You control your own data. You control who you follow and who can follow you. And then there's a whole ecosystem of companies that build portals into the social graph that provide users with experiences that are maybe somewhat like Twitter, or somewhat like Instagram, or somewhat like TikTok, or whatever else they may want to build.
[00:44:15] But it's all on top of this same social graph that nobody owns. There's no billion-dollar tech company in the middle that has complete control over it and that can decide what happens on it. And so that's an exciting world, because it means that things can be much more dynamic, and there can be this whole ecosystem of people building things, and there's much more control by each of the users over what they see and what they get to do on the platform. But there's also the need to filter the signal from the noise. And there's, for example, the need to come up with sensible recommendation algorithms that filter all of the content and show you a news feed that you actually want to see.
[00:44:53] And this will open the door to a whole marketplace, a competitive environment of participants who provide you with algorithms, with AI-based algorithms that curate content for you. And you as a user might have a choice: you can decide whether to go with one particular algorithm -- maybe the algorithm that was built by Twitter -- or you can also go with one that's built by someone completely different. And that kind of autonomy will be great. But again, you're going to need tools like machine learning and AI to help you sift through the noise and to help parse through all of the spam that inevitably will exist in a world
[00:45:29] where generative models can create all of the spam in the world. What's interesting about what you said, too, is that it's not even about choosing between companies -- it goes back to this idea you mentioned earlier, and you mentioned this briefly, about just giving users the options to pick from marketplaces of free ideas and approaches that they can decide between. But it's also interesting because it's not even only at a company-to-company level. It's really just like: what approach works for you? Like, you might be a person who's maybe more interested in the collaborative filtering
[00:45:55] algorithm that was the original form of recommendation system -- which is collaborative filtering across people, so your friends' recommendations are the things you follow. When in fact, I personally am very different and much more interested in an interest graph, and therefore I might be much more interested in people who just have similar interests to me. And I might pick that approach versus, say, something else that's sort of like, hey, this is a hybrid approach that's going to do X, Y, and Z. Just even being able to pick and choose that is already tremendously empowering.
Starting point is 00:46:24 Was there anything else to say on how AI can help with trust and security? So I think the meta picture is that crypto is the wild west. Because it's completely permissionless, because anyone can participate, you kind of have to assume that whoever is participating might be an adversary and maybe trying to game the system
Starting point is 00:46:49 or hack the system or do something malicious. And so there's much more of a need for tools that help you filter the honest participants from the dishonest ones, and machine learning and AI, as an intelligence tool, can actually be very helpful on that front. So, for example, there's a project called Stelo, which uses machine learning to identify suspicious transactions that are submitted to a wallet, and it flags those transactions for the user before they are submitted to the blockchain. And that could be a good way to prevent the user from accidentally sending all of their funds
Starting point is 00:47:24 to an attacker, or from doing something that they will regret later. And that company basically sells to wallets, companies like MetaMask, such that MetaMask can use the intel and do whatever it wants with it: either block the transaction, or warn the user, or reframe the transaction so that it's no longer dangerous. So that's one example.
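As a rough sketch of the idea, not of Stelo's actual product, a wallet-side screening hook might look something like this, with hand-written heuristics standing in for the trained models a real system would use:

```python
# Rough sketch of a wallet-side, pre-submission risk check. This is a
# hand-written heuristic stand-in; a production system like the one
# described would use trained ML models, and none of this reflects
# Stelo's actual implementation.

UNLIMITED_APPROVAL = 2**256 - 1  # "approve everything" token allowance

def risk_flags(tx: dict, scam_blocklist: set[str]) -> list[str]:
    flags = []
    if tx["to"] in scam_blocklist:
        flags.append("recipient appears on a scam blocklist")
    if tx.get("approval_amount") == UNLIMITED_APPROVAL:
        flags.append("grants an unlimited spending approval")
    if tx["value_usd"] > 0.9 * tx["wallet_balance_usd"]:
        flags.append("moves nearly the entire wallet balance")
    return flags

# The wallet can warn the user *before* anything hits the chain:
tx = {"to": "0xbad", "value_usd": 9500.0, "wallet_balance_usd": 10000.0}
for warning in risk_flags(tx, scam_blocklist={"0xbad"}):
    print("WARNING:", warning)
```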
Starting point is 00:47:59 There are other examples as well in the context of MEV, which stands for miner extractable value, or maximal extractable value, depending on who you ask. It's the value that can be extracted by the people who have control over the ordering of transactions on a blockchain, and that's often the miners or the validators. And AI here can cut both ways, in that those participants, if you're a validator on a blockchain and you have control over the ordering of transactions, you can do all sorts of clever things to order those transactions in such a way that you profit. You can, for example, front-run transactions, you can back-run transactions, you can sandwich orders on Uniswap. There are a lot of transactions that you could craft such that you profit from this ability to order transactions. And machine learning and AI might supercharge that ability, because they can search for opportunities to capture more and more MEV. But then, on the other hand, machine learning may
Starting point is 00:48:40 help as a defensive tool. You may become aware, before you submit a transaction, that there is MEV that might be extractable from it. And so then maybe you either split up your transaction into multiple transactions, so that there isn't a single validator that can completely control it, or do something else to protect yourself from an MEV extractor at some point in the transaction pipeline. So this is a way, again, where AI plays a big role when it comes to security, when it comes to trust, when it comes to making the space more trustworthy to the end user.
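To see why ordering power is worth real money, here's a small worked example of the sandwich attack mentioned above, on a simplified Uniswap-style constant-product pool with made-up numbers and no fees:

```python
# Worked example: sandwiching a swap on a constant-product (x * y = k)
# pool. Reserves and trade sizes are made up, and fees are ignored; the
# point is only to show why transaction ordering is worth money.

def swap(x_reserve, y_reserve, dx):
    """Sell dx of asset X into the pool; return (dy out, new_x, new_y)."""
    k = x_reserve * y_reserve
    new_x = x_reserve + dx
    new_y = k / new_x
    return y_reserve - new_y, new_x, new_y

x, y = 1000.0, 1000.0   # pool reserves of X and Y

# 1. Attacker front-runs: buys Y first, worsening the price for the victim.
front_dy, x, y = swap(x, y, 50.0)
# 2. Victim's swap of 100 X executes at the degraded price.
victim_dy, x, y = swap(x, y, 100.0)
# 3. Attacker back-runs: sells the Y back at the now-inflated price.
back_dx, y, x = swap(y, x, front_dy)  # note the reserves are swapped here

print(f"victim received {victim_dy:.2f} Y (vs ~90.91 Y with no sandwich)")
print(f"attacker spent 50.00 X and got back {back_dx:.2f} X "
      f"-- profit {back_dx - 50.0:.2f} X from ordering alone")
```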
Starting point is 00:49:17 That's an example of AI making things difficult in crypto, and then crypto coming back and making things better. I actually have another example like that. Just as ML models can be used to detect fake data or malicious activity, there's the flip side, where ML models can be used to generate fake data. And the classic example of that is deepfakes, where you can create a video of someone saying things they never said
Starting point is 00:49:38 and that video looks fairly realistic. And the interesting thing is that blockchains can actually help alleviate the problem. So let me just walk you through one possible approach where blockchains might be useful. It's a solution that might only be applicable to well-known figures, like politicians or movie stars. But imagine, basically, that a politician wears a camera on their chest and records what they do all day long. Yeah. And then they create a Merkle tree out of that recording and push the Merkle tree commitment onto the blockchain. So now, on the blockchain, there's a timestamp saying, you know, on this and this date, you said such and such.
Starting point is 00:50:15 On this and that date, you said such and such. And now, if somebody creates a deepfake video of this politician saying things they never said, well, the politician can say, look, at the time the video claims I said this and that, I was actually somewhere completely different, doing something unrelated. And the fact that all this data, the real, authentic data, is recorded on a blockchain can be used to prove that the deepfake really is fake. Yeah. So this is not something that exists yet. It would be kind of fun for someone to build something like this. But I thought it's an interesting example of where blockchains might actually be helpful in combating deepfakes.
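A minimal sketch of the commitment step in that thought experiment might look like this, assuming the footage is chunked minute by minute; the on-chain publication is reduced to a print statement:

```python
import hashlib

# Minimal sketch of the commitment step in this thought experiment:
# Merkle-hash the day's footage chunk by chunk and publish only the root.
# Here "publishing on chain" is just a print; in practice the root would
# go into a timestamped transaction.

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def merkle_root(chunks: list[bytes]) -> bytes:
    level = [h(c) for c in chunks]
    while len(level) > 1:
        if len(level) % 2:                 # duplicate the last node
            level.append(level[-1])        # on odd-sized levels
        level = [h(level[i] + level[i + 1])
                 for i in range(0, len(level), 2)]
    return level[0]

# Stand-in byte strings for one minute of video each:
day_of_footage = [f"minute-{i}-of-video".encode() for i in range(1440)]
print("commitment to publish on chain:", merkle_root(day_of_footage).hex())

# Later, revealing any one chunk plus its Merkle path proves it was part
# of the committed recording, without publishing the whole day's video.
```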
Starting point is 00:51:13 Is there also a way to solve that problem and show other timestamps or provenance, where you can do that sort of verification of what's true and what's not, without having to make a politician walk around with a camera on their chest? Yes, absolutely. We can also rely on trusted hardware for this. So imagine the cameras in our phones and such would actually sign the images and videos that they take. There's a standard, called C2PA, that specifies how cameras sign data. In fact, there's a camera by Sony that will now take pictures and videos and produce C2PA signatures on them. So you basically have authentic data: you can actually prove that the data really came from a C2PA camera. And so now, if you read a newspaper article and there's a picture in it that claims to be from one place but was in fact taken somewhere else, that can be caught, because the picture is C2PA-signed and the signature can be checked.
Starting point is 00:51:45 There are a lot of nuances there. C2PA is a fairly complicated topic, with many details that maybe we won't get into here. Yeah, and I remember you talking about this work with me previously. I think it was at our offsite. But I also remember from that that it doesn't stand up to editing. And as you know, editorial people like me, other content creators, and honestly just about anyone who posts on Instagram or anywhere else online... no one uploads things purely raw, as they were originally created.
Starting point is 00:52:11 Like, everyone edits them. Yeah, typically when newspapers publish pictures, they don't publish the picture from the camera as is. They will crop it. There are a couple of authorized things they're allowed to do to the pictures. Maybe they grayscale it. Definitely they downsample it so that it doesn't take a lot of bandwidth. But the minute you start editing the picture, that means that the recipient,
Starting point is 00:52:33 the end reader, the user in the browser who's reading the article, can no longer verify the C2PA signature, because they don't have the original image. So the question is: how do you let the user verify that the image they're looking at really was properly signed by a C2PA camera? Well, as usual, this is exactly where zero-knowledge techniques come in. You can prove that the edited image actually is the result of applying just downsampling
Starting point is 00:52:56 and grayscaling to a properly signed larger image. Yeah, and so instead of a C2PA signature, we would have a short ZK proof associated with each one of these images. And now readers can still verify that they're looking at authentic images. So it's very interesting that ZK techniques can be used to fight disinformation. It's a bit of an unexpected application. That's fantastic.
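For intuition, here's the statement such a proof would attest to, written as a plain Python predicate; a real system would compile this check into a ZK circuit so the verifier never sees the original image, and the HMAC below is only a stand-in for C2PA's certificate-based camera signatures:

```python
import hashlib, hmac

# The *statement* such a proof attests to, written as a plain Python
# predicate. A real system compiles a check like this into a ZK circuit,
# so the verifier learns the answer without ever seeing the original
# image; the HMAC is a toy stand-in for C2PA's certificate-based
# camera signatures.

CAMERA_KEY = b"toy-camera-secret"  # hypothetical trusted-camera key

def sign(data: bytes) -> bytes:
    return hmac.new(CAMERA_KEY, data, hashlib.sha256).digest()

def downsample(img):
    return [row[::2] for row in img[::2]]      # keep every other pixel

def grayscale(img):
    return [[int(0.299 * r + 0.587 * g + 0.114 * b) for (r, g, b) in row]
            for row in img]

def serialize(img) -> bytes:
    return repr(img).encode()

def edit_is_authentic(published, original, signature) -> bool:
    signed_ok = hmac.compare_digest(sign(serialize(original)), signature)
    edits_ok = published == grayscale(downsample(original))
    return signed_ok and edits_ok

original = [[(255, 0, 0), (0, 255, 0)], [(0, 0, 255), (255, 255, 255)]]
published = grayscale(downsample(original))
print(edit_is_authentic(published, original, sign(serialize(original))))
```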
Starting point is 00:53:23 A very related problem, by the way, is proving that you're actually human in a world where deepfakes creating the appearance of humanity will generally outnumber humans, a thousand to one or a million to one, and most things on the internet might actually be generated by AI. And so one potential solution, which is related to what you're saying, is to use biometrics to establish that someone is actually human, but to then use zero-knowledge proofs to protect the privacy of the people who are using those biometrics to prove their humanity. So one project in this category is called Worldcoin. Yeah. It's also a project in our portfolio, and they use this orb, people may have seen it, this shiny silver orb that uses
Starting point is 00:54:08 iris scans as biometric information to verify that you're actually a real human. It also has all sorts of other sensors to make sure that you're actually alive and that it can't be fooled by a picture of an eye. And it's a system that has secure hardware and is very difficult to tamper with, such that the proof that emerges on the other end, which is a zero-knowledge proof that obscures your actual biometric information, is very, very difficult to forge. In this way, politicians could, for example, prove that their video stream, or a signature of theirs, or their participation on some online forum, is actually their own and that they're
Starting point is 00:54:45 actually human. What's really interesting about what you said, Ali, is that it's a great follow-up to what Dan was saying about ways to verify authentic media versus fake or deepfaked media in this world of infinite media, as you would say, that we live in. But what are the other applications for proof-of-personhood-type technologies like that? I think it's important because this is actually another example of how crypto can help AI more broadly, too. We're kind of flipping back and forth here, but that's okay, because we're just talking about really interesting applications, period. So it's fine. That's a really good question. One of the things that will become important in a world where anyone can participate online is to be able to prove that you are
Starting point is 00:55:23 human for various different purposes. There's that famous saying from the 90s that on the internet, nobody knows you're a dog. Oh, yeah. And I think maybe a reshaped form of that saying is that on the internet, nobody knows you're a bot. So then I guess this is exactly where proof of humanity projects become very important. Because it will become important to know whether you're interacting with a bot or with a human. For example, in the world of crypto, there's this whole question of governance.
Starting point is 00:55:48 How do you govern systems that are decentralized, that don't have any single point of control, and that are bottom-up and community-owned? You would want some kind of governance system that allows you to control the evolution of those systems. And the problem today is that if you don't have proof of humanity, then you can't tell whether one address belongs to a single human, or whether it belongs to a bunch of humans, or whether 10,000 addresses actually belong to a single human who is pretending to be 10,000 different people. And so today you actually have to use the amount of money as a proxy for voting power, which leads to plutocratic governance systems.
Starting point is 00:56:30 But if every participant in a governance system could prove that they're actually human and they could do so in a way that's unique such that they can't actually pretend to be more than one human because they only have a single set of eyeballs, then the governance system could be much more fair and less plutocratic and can be based more on each individual's preferences rather than on the preference of the largest amount of money
Starting point is 00:56:53 that's locked up in some smart contract. Actually, just to give an example of that: today, we're forced to use one token, one vote, because we don't have proof of humanity. Maybe we'd like to do one human, one vote. But if you can pretend to be five humans, then of course that doesn't work. And so one example where this comes up is something called quadratic voting. In quadratic voting, basically, if you want to vote five times for something, you have to put down 25 chits to do that.
Starting point is 00:57:20 But of course, you can game that: you can just pretend to be five different people, each voting once, and that would defeat the mechanism of quadratic voting. So the only way to prevent that is exactly proof of humanity, where in order to vote, you have to prove that you're a single entity rather than a Sybil masquerading as many entities. And that's exactly where proof of humanity would play an important role. Generally, identity on chain is actually becoming quite important for governance. Totally.
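The arithmetic behind that point, as a quick sketch:

```python
# The arithmetic of quadratic voting, and why Sybils break it:
# casting v votes costs v**2 credits -- unless you split yourself
# into many "voters".

def qv_cost(votes: int) -> int:
    return votes ** 2

honest = qv_cost(5)                        # one human casting 5 votes
sybil = sum(qv_cost(1) for _ in range(5))  # five fake identities, 1 vote each

print(f"honest voter pays {honest} credits for 5 votes")   # 25
print(f"Sybil attacker pays {sybil} credits for 5 votes")  # 5
# Same voting power at a fifth of the cost, which is why quadratic
# voting only works if each voter can prove they are exactly one human.
```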
Starting point is 00:57:46 That, by the way, reminded me of an episode that you and I did, Ali, years ago with Phil Daian. Remember, on dark DAOs? That was such an interesting discussion, and it's totally relevant here. Totally. By the way, is the phrase proof of personhood or proof of humanity? What's the difference? Is it the same thing? Yeah, people use them interchangeably.
Starting point is 00:58:02 Proof of human, proof of humanity, proof of personhood. Yeah, yeah. Okay, so keep going on this theme then of media and this kind of infinite abundance of media. Like, what are other examples? And again, we're talking about crypto helping AI, AI helping crypto. Are there any other examples that we haven't covered here where the intersection of crypto and AI can bring about things that aren't possible by either one of them alone? Completely.
Starting point is 00:58:24 I mean, another implication of these generative models is that we're going to live in a world of infinite abundance of media. And in such a world, things like the community around any one particular piece of media, or the narrative around it, will become ever more important. Just to make this very concrete, there's a good example here. Sound.xyz is building a decentralized music streaming platform, enabling artists, musicians essentially, to upload music and then connect directly with their communities by selling them NFTs that give people in those communities certain privileges. Like, for example, the ability to post a comment on the sound.xyz website on the track, such that anyone else who plays the song can also see the comment. This is similar to the old SoundCloud feature that people might remember, where you could have this whole social experience on the
Starting point is 00:59:14 music itself as it played on the website. It's this ability to allow people to engage with the media and to engage with each other, often in a way that's economic, because they're essentially buying this NFT from the artist as a way of being able to do this. And, as a side effect, they're supporting the artist and helping the artist be sustainable and able to create more music. But the beauty of all of this is that it actually gives an artist a forum to really interact with their community. And the artist is a human artist. And as a result of crypto being
Starting point is 00:59:52 in the loop here, you can create a community around a piece of music, a community that wouldn't automatically exist around a piece of music that was just created by a machine learning model, devoid of any human element, right? That doesn't have a community around it. And so I think, again, in a world where a lot of the music we're going to be exposed to will be fully generated by AI, the tools to build community and to tell a story around art, around music, around other kinds of media, will be very important as a way of distinguishing media that we really care about and really want to invest in and spend time with from media that may also be very good, but is just a different kind of media. It's media that was just generated by
Starting point is 01:00:34 AI, with less of a human element. And it may be, by the way, that there's some synergy between the two. It could be that a lot of the music will be AI-enhanced or AI-generated, but if there's also a human element, like, say, a creator leveraged AI tools to create a new piece of music, but they also have a personality on Sound, they have an artist page, they have a community for themselves, and they have a following, then you have this kind of synergy between the two worlds, where you both have the best music, because it's augmented by the superpowers that AI gives you, and you also have a human element and a story that was coordinated and made real by this crypto aspect, which lets you bring all of those people together into one
Starting point is 01:01:14 platform. It's really quite amazing that even in the world of music, just like we talked about in the world of coding, where you have a human coder being enhanced by tools like Copilot that generate code, we are seeing things like this, where an artist is being enhanced by ML systems that help write the music, or at least parts of the music are being written and generated by an ML system. So it's definitely a new world that we're moving into in terms of content generation. Basically, there's going to be a lot of spam in the form of machine-generated art, which people might not value as much as they value art generated by an actual human. Maybe another way to say it is that one of the points of NFTs was to support the artists.
Starting point is 01:01:57 Yes. But if the artists themselves are now machine learning models, then who exactly are we supporting? Yeah. And so it's a question of how we distinguish, how we differentiate, human-generated art that needs support versus machine-generated art. This is a philosophical discussion for over drinks, maybe, but I would go so far as to say that the prompter is also an artist of sorts. And in fact, I would make the case that that person is an artist.
Starting point is 01:02:23 And the same thing has come up before; this is a discussion and debate as old as time. It's simply new technologies, old behaviors. It's the same thing that's been playing out for eons, the same thing that's playing out in writing, et cetera. Totally. Very true. Well, that actually opens up the door for collective art, for art that's generated through the creative process of a whole community as opposed to a single artist. There are actually already projects doing this, where you have a process by which a community influences, through some voting process on chain, what the prompt for a machine learning model
Starting point is 01:02:56 like DALL-E will be. Then you have DALL-E use that prompt to generate a work of art. Maybe you generate not one work of art, but 10,000. And then you use another machine learning model, one that's also trained from feedback by the community, to pick the best one from those 10,000. Right. And so now you have a work of art that was generated from the input of the community, and that was also pruned and selected from a set of 10,000 variants, again through a machine learning model trained by the community, to produce one work of art that is the product of this collective collaboration. That's incredible.
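As a sketch of that pipeline, with `generate_image` and `community_score` as hypothetical stubs standing in for the generative model and the community-trained ranking model:

```python
import random

# Sketch of the collective-art pipeline described above. The on-chain
# vote is a plain dict, and `generate_image` / `community_score` are
# hypothetical stubs standing in for a generative model and a ranking
# model trained on community feedback.

def generate_image(prompt: str, seed: int) -> str:
    return f"<image of '{prompt}', variation {seed}>"

def community_score(image: str) -> float:
    return random.random()  # stand-in for the community-trained scorer

prompt_votes = {"a city made of glass": 120, "a forest at dawn": 340}
winning_prompt = max(prompt_votes, key=prompt_votes.get)

candidates = [generate_image(winning_prompt, s) for s in range(10_000)]
final_artwork = max(candidates, key=community_score)
print("minted for the community:", final_artwork)
```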
Starting point is 01:03:34 I love it. Well, you guys, that's a great note to end on. Thank you both for sharing all that with the listeners of web3 with a16z. Thanks, Sonal. Thank you so much. Thank you for listening to web3 with a16z. You can find show notes with links to resources, books, or papers discussed, transcripts, and more at a16zcrypto.com. This episode was produced and edited by Sonal Chokshi.
Starting point is 01:04:02 That's me. The episode was technically edited by our audio editor, Justin Golden. Credit also to Moonshot Design for the art, and all thanks to support from a16z crypto. To follow more of our work and get updates and resources from us and from others, be sure to subscribe to our web3 weekly newsletter. You can find it on our website at a16zcrypto.com. Thank you for listening and for subscribing. Let's go.
