Epicenter - Learn about Crypto, Blockchain, Ethereum, Bitcoin and Distributed Technologies - Sam Williams: Arweave – Bringing Permanence to the Web

Episode Date: May 26, 2020

The internet was originally created as a platform to communicate freely, but it soon became a place where people could be monitored, controlled, and censored. 98% of content on the internet is deleted... every 20 years. More surprisingly, based on a Twitter study, a third of links change their content fundamentally or are removed completely within three months of their creation. Arweave is a new data storage protocol that enables economically sustainable permanence for the very first time. Built on a blockchain-like structure called the blockweave, one of the applications on Arweave is what they have coined “the permaweb”. This is an array of data, websites, and decentralised applications to which anyone can contribute and which anyone can maintain. This system provides an incentive to store your data without compromising your privacy, solving one of the key issues facing the current web.

Sam Williams, Co-founder & CEO of Arweave, talks about the technical solution they have built, the economic model around the endowment, and the ethical and philosophical questions that the permanence of information raises against concepts like “the right to be forgotten”.

Topics covered in this episode:

- Sam’s background and how he got into the blockchain space
- Breaking down BitTorrent – the first scalable decentralized file sharing system
- Data sustainability on the web and what should be kept
- Arweave under the hood and what a blockweave is
- How data is stored and retrieved in blocks
- Arweave’s economics and incentive model
- Data replication within the infrastructure and ensuring decentralization in the future
- The AR token and how the Arweave endowment works
- Arweave vs Skynet vs IPFS
- Does Arweave help with the data availability problem in Ethereum?
- Arweave's business model
- Community projects that Arweave is involved with and what’s next for the project

Episode links:

- [Arweave website](https://www.arweave.org/)
- [Welcome to the Permaweb](https://medium.com/@arweave/welcome-to-the-permaweb-ce0e6c73ddfb)
- [Arweave Whitepaper](https://www.arweave.org/files/arweave-lightpaper.pdf)
- [Arweave on Medium](https://medium.com/@arweave)
- [Arweave Twitter](https://twitter.com/ArweaveTeam)
- [Sam Williams Twitter](https://twitter.com/samecwilliams)
- [Mainnet 2020 – Messari's Flagship Crypto Event (June 1 to 3)](https://mainnet.events)
- [Free tickets to Web 3.0 Forum (June 8 to 10)](https://ti.to/cogx/cogx-2020/discount/TPDODSFP100)
- [Casual meetup for friends of Epicenter](https://epicenter.rocks/virtualmeetup)

Sponsors:

- ShapeShift: ShapeShift is the leading crypto platform offering zero-commission trading - https://shapeshift.com/

This episode is hosted by Sebastien Couture & Friederike Ernst. Show notes and listening options: [epicenter.tv/341](https://epicenter.tv/341)

Transcript
Starting point is 00:00:00 This is Epicenter, episode 341, with guest Sam Williams. Hi, welcome to Epicenter. My name is Sebastien Couture. Today, our guest is Sam Williams. Sam is the co-founder and CEO of Arweave, the company that is building the infrastructure layer for what they call the permaweb. So Arweave has developed a system for data storage, which makes it economically sustainable and feasible to store data for centuries. They've figured out a data storage system that allows you to have highly available, low latency data storage while maintaining an incentive model where miners will store users' data
Starting point is 00:00:59 forever. Projects like Arweave are really fascinating because they conjure up, at least for me, they conjure up a lot of the same enthusiasm and excitement that I had in the early days of the web. There's something very freeing and liberating about the idea of being able to publish content to a decentralized, permissionless, censorship-resistant platform. And I think in the early days of the web, like in the 90s and even in the late 80s, this is kind of what people had for a vision of the internet. And as things have aggregated to large social platforms, that has somewhat been eroded.
Starting point is 00:01:39 And so Arweave kind of brings back that original vision of the web. Now, of course, there's something inherently political about this because it brings up these concepts of transparency and accountability. The flagship application for Arweave is this thing they call the permaweb, which is a global, decentralized and community-owned web where anyone can contribute. And in the permaweb, once data is stored there, it remains there forever. It can never be altered or deleted. Now, of course, this brings up all kinds of ethical and philosophical questions about the permanence of information, to whom it should be applied, and how it operates in the framework of concepts like the right to be forgotten,
Starting point is 00:02:22 which exists in Europe. So if you want to get a glimpse of what the permaweb looks like, you can go to their website, and here you can explore a list of sample applications that have been built on top of Arweave. So there's things like URL shorteners, a blog, an email service. There's even a service that allows you to create a screenshot of a tweet so that it effectively never gets deleted. And what's interesting is you can also build DeFi applications on Arweave. So all of these DeFi applications, like say Uniswap, for example, that have a front end, and that front end exists on a server, you could conceivably build that front end on Arweave,
Starting point is 00:03:06 so you have effectively unstoppable DeFi applications. And I think that's really, really empowering. So just after our conversation with Sam, Friederike and I had a brief one-on-one conversation to recap the interview, gather our thoughts about what we think of the project and where we think things might be going. You can hear this conversation by signing up to our Substack newsletter. That's epicenter.rocks slash substack. If you already subscribe to the weekly newsletter that we send out with every episode, you'll have to sign up for this separate mailing list.
Starting point is 00:03:44 And we might move the newsletter to Substack at some point. But for now, we just kind of want to experiment with this platform and see what we can do with it. The idea is for us to release these short debrief conversations with every new episode. They're going to be free for the foreseeable future. Perhaps at some point we'll consider making them part of some subscriber bundle, but for now we just want to experiment here, get your feedback and see how we can create something valuable in addition to the podcast interviews. A bit of extra housekeeping. I mentioned last week that we were hosting a meetup that's happening this week on Thursday, the 28th of May.
Starting point is 00:04:24 It is at noon Pacific, 3 p.m. Eastern, 9 p.m. Central European time. You can come and just have a chat with us. All the hosts will be there and we'd be happy to get to talk to you and see how you're doing, especially in this time of crisis. And you can register for that at epicenter.rocks slash virtualmeetup and we'll send you all the details, calendar invite, Zoom links, etc. More housekeeping. I am really excited to be moderating a fireside chat with Danny Ryan of the Ethereum Foundation and Joe Lubin of ConsenSys. That's next week at the Mainnet conference, on June 2nd. You can register to get your tickets at mainnet.events. And I am also moderating a panel on the topic of freedom versus civic duties.
Starting point is 00:05:14 That's happening at the Web 3 Forum from June 8th to June 10th. It's part of the CogX conference organized by Fabric Ventures. And check out these panelists. We have Yaya Fanusie, Adjunct Senior Fellow at the Center for a New American Security; Harry Halpin, CEO of Nym Technologies; Liz Steininger, CEO of Least Authority; and Cory Doctorow of the Electronic Frontier Foundation. Tickets are free for Epicenter listeners, and the link to register is in the show notes. One last thing before we go to the interview.
Starting point is 00:05:50 Thank you to all of you who have left an Apple podcast review. I didn't think it was as easy as asking, but I guess that's what works, because in the last couple of weeks, we've received several reviews per week, sometimes several per day. And they're just so fun to read. Some are very touching. Others are just funny. I've been posting them on Twitter. And so thanks to everybody who left the review. If you haven't done so already, please take two minutes and leave an Apple podcast review. Just go to epicenter.org slash Apple. Let us know how long you've been listening, what you've learned, what were your favorite episodes or memorable moments, and it really helps us boost our visibility and attract more
Starting point is 00:06:34 people to the podcast. And as a thank you for doing that, I will send you a discount code for 100% off a KeepKey hardware wallet. Just email me at sebastien at epicenter dot tv. And let me know that you've left us a review and I'll send you that discount code. Lots of you have done so already. And so I hope your KeepKeys are on the way. And with that, here's our interview with Sam Williams. We're here with Sam Williams. Sam, thanks for joining us. Thanks for having me. So tell us a bit about yourself and how you got involved in the space. I was doing a PhD in computer science studying distributed systems. I was building an operating system that you could essentially take out the components of the machine that it was running on while it was on. And the
Starting point is 00:07:21 idea was that you should at least be able to maintain some level of continued operation of that machine. So it was really kind of hardware fault tolerance, is what we like to call it. And this was interesting. It's in the space of decentralized systems that are tolerant to very large amounts of failure. And then I had this idea for what later became Arweave, which is kind of the same idea of using decentralization to achieve fault tolerance of, well, it was computation. And then I kind of switched to data storage because you realize that actually that's much more interesting and important, I think, for the world to be able to make sure that we can keep our data storage systems online rather than necessarily compute, which is actually probably less important.
Starting point is 00:08:07 This operating system you were working on was like a system OS, like you would have on a computer or on a cluster of servers or something like that? Yeah, exactly. So it was built using a technique that they started working on in about 2007, 2008, called multikernels. The idea was that you would take a single machine and you would split it up into what you essentially treat as many, many machines inside a single network. So each core or each thread would be treated independently.
Starting point is 00:08:35 And the idea was that this would allow operating systems to scale to the point that they could easily handle, like, having 500 cores inside a single box. But actually, the hardware never really emerged for that. Like at the time, they thought that this was going to be here in 10 years or something. Just didn't happen at all. But the techniques they produced, I realized, were pretty interesting from the point of view of having massive fault tolerance. Why do you think that storage is more interesting in this setting?
Starting point is 00:09:04 Well, to some extent, compute can be redone most of the time. Like you can take inputs and you can transform them to the same outputs. It doesn't really seem like such an important area relative to data storage, which is that if we lose access to pieces of information about the past, or perhaps their verifiability, then they're just gone. There is no way to reverse that transition. What happened to the PhD? Did you finish it or is it still on hold? I built this operating system and I figured I just needed maybe like two months to write up the thesis itself.
Starting point is 00:09:37 And at that point, the idea for Arweave was taking shape. And I thought, well, I could start this Arweave project like four days a week or something like that and use one day a week to finish my thesis. And of course, that didn't happen at all. It had like negative numbers of free days since the moment I started on the project properly, and consequently it kind of stalled from the operating system perspective. But then I got back in touch with my PhD supervisor about a year ago now, I think. And we basically agreed to just shift the topic of the PhD thesis to Arweave, actually. So now I'm just trying to find some time to turn the Arweave yellow paper into like a PhD thesis, and then I'll submit that, is the idea.
Starting point is 00:10:13 and then I'll submit that is the idea. In fact, this afternoon I'm supposed to be sending an email to stop myself being unenrolled from the system for being a PhD student for too long. But yes. So that's kind of where that ended up. Cool. Good luck. Thank you. Yeah, in terms of how I got into blockchain,
Starting point is 00:10:34 or really it's peer-to-peer systems. I think blockchain is kind of an interesting evolution along that line, but it's not the starting point by any means. It was back with BitTorrent. You know, in like 2003 to 2005, that period was really fascinating. I mean, we had this,
Starting point is 00:10:49 which is essentially a decentralized incentive mechanism, go from pretty much zero to transferring at its peak about 30 to 35% of all of the Internet's traffic. And there was a built-in mechanism design inside BitTorrent that allowed people, or rather encouraged people, to swap something of value for something else that they wanted that was valuable, this whole idea of optimistic tit-for-tat. So that was, I think, the first time we saw that kind of peer-to-peer mechanism design
Starting point is 00:11:17 at scale. I think that was very exciting. And then it was around 2011, 2012 or something, I first encountered Bitcoin. I don't remember by the year, but I remember by the price. It was like around three cents. And then I remember it passing parity with the dollar and finding that to be like a very, very hard concept to understand. I mean, how can this magic internet money that was basically an experiment be worth more than a US dollar? And then I mined some.
Starting point is 00:11:43 I took part in the Ethereum ICO back at the beginning with some of the tokens I'd mined in my university dorm room where I didn't have to pay for electricity. And then I guess I've just kind of been watching the space ever since then. So I'd like to go back to BitTorrent for a little bit. Because BitTorrent is something that I used a lot. And I still use it once in a while. I mean, although not so much anymore, but many of us have used BitTorrent to some degree. I've kind of understood conceptually how it works.
Starting point is 00:12:11 But there's a lot of components there that I never really fully understood. But for our listeners, perhaps it would be helpful to understand, like, what BitTorrent did that was so unique in terms of peer-to-peer file sharing, but also this incentive mechanism that I think a lot of people don't recognize existed within BitTorrent. It's pretty basic. And that's one of the things that's so elegant about the protocol. You take a file, you split it into a number of chunks, and then everyone can join this network,
Starting point is 00:12:39 and there's some kind of tracker that tells you where everyone else is. And then you just say, I have these pieces of information, and I would like these pieces of information. That's what's happening at a technical level, but at an economic or mechanism design perspective, there's this mechanism that we call optimistic tit-for-tat, which basically says, I'm going to speak to four people at once. And three of those people, I'm going to only give data to them, the data that they're asking for, when they have given data to me.
Starting point is 00:13:07 But with that fourth slot, someone can come to me and say, I'm looking for this piece of data, and I'll just try giving it to them. It's kind of a level of optimism, this idea that maybe if I do something nice, something nice will happen to me in return. But this is an extremely basic system, and the Nash equilibrium that arises from it is that everyone is sharing data with everyone else all the time. It's extraordinarily effective.
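To make the shape of that mechanism concrete, here is a minimal Python sketch of the choking logic described above. It is illustrative only: the slot count, scoring, and function names are simplified assumptions, not the actual BitTorrent wire protocol.

```python
import random

def choose_unchoked_peers(upload_history, interested_peers, regular_slots=3):
    """Pick peers to upload to: top reciprocators plus one optimistic slot."""
    # Reciprocity slots: the peers that have recently uploaded the most to us.
    reciprocators = sorted(interested_peers,
                           key=lambda p: upload_history.get(p, 0),
                           reverse=True)[:regular_slots]
    # Optimistic slot: one random peer outside the reciprocity set, so that a
    # newcomer with no history still gets a chance to prove itself.
    others = [p for p in interested_peers if p not in reciprocators]
    optimistic = [random.choice(others)] if others else []
    return reciprocators + optimistic

# Example: "dave" has never uploaded to us, but can still win the optimistic slot.
history = {"alice": 900, "bob": 750, "carol": 120, "dave": 0}
print(choose_unchoked_peers(history, list(history)))
```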
Starting point is 00:13:30 There are these incentive mechanisms that were enforced by torrent search engines, effectively, where to be a member of this kind of privileged group of people that would have access to these torrent files, they would be tracking the amount of data you were seeding to the network and there was like this whole reputation thing. And I have a friend who's still like very much into this ecosystem and like he has these seed boxes that like are running all the time so that his reputation on these platforms remains, you know, like at a, you know, where he has like a high ratio basically. It's similar to some of the concepts that we, you know, understand from the
Starting point is 00:14:11 blockchain space, only like they're all centralized, like they're all built on, you know, this centralized infrastructure, like on these, basically, on these websites, and like even the trackers are files that are stored on systems. What were some of the lessons that you took from that and brought into Arweave? There's two layers of incentive design going on there. So at the base layer, there's really just this optimistic tit-for-tat thing that happens in a completely decentralized way. So I join the network and I can start generating peer scores with other people. And that doesn't require a tracker. Like you can do this through the DHT extension for BitTorrent.
Starting point is 00:14:45 Yeah, with the magnet links. Yeah, exactly. But then on top of that, yeah, you get these tracker websites that basically make cross-swarm peer reputations. So I can upload something on one torrent and then get a greater score for downloading on another, that kind of thing. And yeah, these are kind of like quasi-centralized. Or at least, I think it's fair to call it a distributed system, but not a decentralized system, something like that. But in terms of how this affected Arweave, the key place is in the data distribution mechanism, which really is optimistic tit-for-tat, very close to the way it was formulated in BitTorrent in the beginning.
Starting point is 00:15:20 But there's something in that that is really elegant that I don't think people in blockchain paid so much attention to, and I think it's really valuable, is this idea that actually if we do things based on social reputation like that, we can keep scores off chain, which means that it requires zero synchronization in order to maintain this kind of social value score for each of the participants in the network. And that means that there's no consensus. And so there's subsequently no scalability limit, which is, you know, the exact opposite of the effect you get with like a traditional blockchain where every node in the network has to come to consensus about every state transition and including, you know, peer reputation transitions.
Starting point is 00:15:57 So yeah, I think that's really powerful. And it's a very interesting area. So let's talk about, you know, storage on the internet. In your white paper, you actually say that you estimate that 98% of content is deleted off the internet every 20 years. How did you arrive at this figure? This wasn't our research. There are lots of people doing research into, I guess, the sustainability of the web. There should be a reference there that you can follow to find the citation for it. I think that the fact that the whole internet, or at least the links on the internet, basically wipes every 20 years is pretty astonishing. But maybe equally scary is the fact that around a third of links change the content fundamentally or remove the content
Starting point is 00:16:41 completely within just three months of the creation of that information. And this is according to a study that was done based on links created on Twitter. So things people are sharing on social media. It's really impressive that it's changing at that kind of rate. Do you think that 100% of the content produced on the internet should be stored forever? No, definitely not. We think that there are certain classes of information that should be permanent, things like knowledge, records of history, web applications in a lot of cases, and things that really shouldn't be. Things like instant messenger chats, these kinds of private communications between individuals that are built to be ephemeral, essentially. And I actually
Starting point is 00:17:23 think that there's a space for a kind of, what do you call it, like a sister network to Arweave, that does the exact opposite thing, something that enforces that a piece of information is not going to be available in the future. And we spent a lot of time thinking about how you might achieve that. I don't think you can truly enforce that the information is not available, but what you can do is provide potentially some kind of insurance against information becoming public after a certain period of time. Like a blockchain Snapchat? Potentially. I mean, how you go about building those economic mechanisms is not completely clear to us yet, at least. But I think there's space for it. So the problem that we really identified is that basically
Starting point is 00:17:59 data storage, as it exists today, is kind of what you could call Schrödinger's data storage. So basically you put a piece of information onto a hard drive. And it might be there in 15 years. It also might be gone. And so you don't really have any way to express intentionality about that. You really just have to kind of hope or, you know, make lots and lots of replications, but then all of your systems for managing those replications are typically centralized and not really very reliable.
Starting point is 00:18:26 But also the data might just disappear. And so we thought that given that information is the new oil of the economy, being able to be intentional about how long that information is around for should be an extremely valuable thing. And that's why we got started on Arweave in the first place. I think there's a really interesting philosophical question here about what constitutes information that should exist, that we think should exist forever.
Starting point is 00:18:51 And I think we can make some pretty clear assumptions about what the majority of people think that might be. So government communications might be one of those things. Anytime a government official speaks up, or when there's like a policy decision or this sort of thing. Perhaps even the entirety of like press publications, you know, could fall into that category of things that we want to exist in the permanent record. And then on the other end, of course, you have things like chats and things like that,
Starting point is 00:19:18 that one could decide that they don't want that to exist forever. And, you know, there's tools that allow you to do that, like things like Signal. There's all this mushy stuff in the middle that it's unclear, right? Let's take one example, Twitter. I think there are people who think that Twitter is some sort of an historical record of like human conversation and that every single tweet that's ever existed should be kept and recorded. In fact, there's a lot of Twitter bots that do just this, right? They capture and retweet things that have been deleted so that they remain in the permanent record. And other people who think that that's not the case.
Starting point is 00:19:52 Like I know I was talking with Evan Van Ness a couple months ago about this very thing where, you know, I noticed that one of his old tweets got deleted. He's like, yeah, I have a script that deletes my old tweets every month or something like that. And so should we be approaching the question from the point of view of the body of information or more about what is the importance of information for society, or like the class of information that is contained in that body? So it could be things like storing all the tweets of like Donald Trump, but not necessarily the tweets of like, you know, Joe Everybody. How do we approach this stuff that sits in the middle? That's a really interesting philosophical question. You're absolutely right. Personally, I take a fairly practical approach to this, which is that, okay, so if you tweet something and you have, say, 2,000 followers, then you've just broadcasted that piece of information probably to, I don't know, the eventual count might be 20,000, 30,000, 50,000 people.
Starting point is 00:20:45 Realistically, you've done that knowingly. And, I mean, it's just not practical to expect to be able to remove that piece of information from the information space after you've voluntarily done that. The line that people will eventually end up drawing is something like, if you're saying it publicly to a very large number of people, you don't really have the expectation that the statement can be withdrawn. Because fundamentally, that's not the world we live in. Data likes to copy itself. Or rather, humans like to copy information that they find interesting. It can't really be helped. So Arweave doesn't really change that. What it does is it provides a level of assurance, and is particularly valuable for information that people wouldn't necessarily want to propagate today. So if Donald Trump says something outrageous on Twitter, there are many people that want to tell other people about that thing immediately right now. And so it spreads around the information space very rapidly. And many, many copies of that piece of information are generated. But there might be other things about perhaps, I don't know,
Starting point is 00:21:44 a Wikipedia page about some kind of social club in Berlin in the 1930s or something like this, that isn't necessarily very, very valuable information to people today. But for some reason, in the distant future, that could become extremely useful knowledge. Like perhaps the members of this club went on to do something interesting. And so Arweave allows us to kind of deal with that set of data that are not necessarily of extremely high social replication factor, frankly, today, but in the future will be very valuable. And it allows us to have assurances about that data being available.
Starting point is 00:22:18 we don't have to go very far back in history to remember a time in which this problem didn't exist. All of us are, you know, in our 30s, I think we grew up pre-internet. And I think for a lot of people who went through that transition of like no permanence, or basically no permanence, to like permanence of all information ever created, these sorts of social norms around creating information, creating content haven't, you know, quite taken hold within that generation of people. I think if you look at the younger generation, those who were born after 2000, for instance, those people will have totally different assumptions
Starting point is 00:22:59 about how their information and how the data that they create gets propagated and the assumptions that they have around that content perhaps being permanent will be very different from, say, ours or our parents. Right. I think that's fair to say. I mean, it's not that people expect that the content is permanent. it's that they expect that they will lose the ability to delete it.
Starting point is 00:23:22 it's that they expect that they will lose the ability to delete it. It's not quite the same concept. Arweave is literal permanence. You could also argue that the problem has existed just in different forms and in a lesser form that perhaps affected fewer people before the birth of the internet. But it was still there. Like if we went to a stadium or a park or some public space and then we shouted something very loudly,
Starting point is 00:23:42 we lose the ability to enforce that no one can write down what it was that was said and then publish it somewhere else or tell someone else that we said it. And really, this is fundamentally a re-formulation of the same problem. It's just in a slightly lesser form, perhaps just a non-technical form, less extreme. When things go viral, they are distributed to a far larger extent than you would ever expect in the 80s. Had you gone to like a football stadium and said something really loudly, you don't go viral for that. The European Union took a pretty hefty stance on the right to be forgotten a couple of years ago. Does that impact Arweave at all?
Starting point is 00:24:22 It impacts Arweave actually to the same extent it impacts any kind of blockchain. In fact, quite possibly less with Arweave, because we realized, you know, right at the beginning, that we're talking about permanent storage of large amounts of information, and the miners, the nodes in the network, subsequently should have strong capabilities for choosing what information they want to take part in storing and what information they don't want to take part in storing. So essentially what the network is doing at a base level is encoding an incentive, a permanent incentive to recall a piece of data. And then people come along and they essentially claim that incentive by replicating
Starting point is 00:24:58 the data. But nobody is forced. Nobody is selected by the network to store any specific piece of content. This actually makes it remarkably GDPR compliant for a blockchain-like system. It essentially means that if you have some kind of GDPR problem, then you just don't take part in storing that data. And in the legal system, you know, miners are data storers. And so they are responsible for the data that they store. They must adhere to all of the local laws of the jurisdictions in which they operate. The law creates this disincentive to take part in storing information that other people don't want you to store, within appropriate bounds anyway. And the Arweave protocol just kind of allows miners to make those choices for themselves, essentially. So on a practical level, how would you
Starting point is 00:25:42 expect miners to make those choices? So basically, say, I, and we'll get into the weeds of how the protocol works just after this, but say I request something be stored. Do you expect the miners to look exactly at the thing that I want to be stored and determine whether this is clearly something that should be stored forever or should not be stored forever, or might fall somewhere in between? And I might decide after the fact that I do actually want this to be forgotten, despite the fact that I didn't at the time that I clicked the save-on-Arweave button. They can do a number of things. And we expect that they'll sort of form cabals.
Starting point is 00:26:21 At the basic level, they can say, scan this document that I've been sent. Does it match on any of these kinds of filters that I'm applying? And we expect over time those filters are going to get more and more complex. It won't just be scanning for words or anything like this. It will probably be doing AI-backed computation on top of the data we've been given to try and understand what it is that that data contains. So this is kind of the first layer.
Starting point is 00:26:46 The second is that just like if it's a recognizable piece of data, say an MP3 or something. I mean, this is a system that's been around since the mid-2000s or something, this idea of, I forget what they call it, some kind of hash gun or something. Basically, the music industry can just make massive lists of SHA-256 hashes of files. They don't want people to store because they contain copyrighted material. And then miners can just load up these hashes and then you can immediately not store that data. So that's pretty simple. And the third mechanism is you can just say, okay, any transaction that matches these transaction IDs.
Starting point is 00:27:22 So you just have a massive list of transaction IDs, or any transactions from these wallets, or any transactions that follow some set pattern of behavior in the network. So a kind of distribution pattern. So the tokens that were used to fund these wallets, for example, came from this source or something like that, can also be blocked. So we don't expect individual miners will do this. We expect they'll kind of band together in the same sort of architecture that you see in the ad blocking list communities, right, where we've essentially got this decentralized group of communities that propose different rules for what should or should not be blocked from your web browser while you're browsing around the web. We basically expect miners will do the same thing. And they'll almost become like political parties, I expect, eventually.
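As a rough illustration of the three filter layers Sam describes, here is a hedged Python sketch. The blocklist contents and the should_store function are hypothetical; real Arweave nodes expose their own content-policy hooks, and communities maintain the actual shared lists.

```python
import hashlib

# Hypothetical rule sources; a real deployment would load shared community lists.
BLOCKED_HASHES = {"<sha-256 hash of a known unwanted file>"}
BLOCKED_TX_IDS = {"<transaction id from a shared blocklist>"}
BLOCKED_WALLETS = {"<wallet address from a shared blocklist>"}

def should_store(tx_id: str, owner_wallet: str, data: bytes) -> bool:
    # Layer 1: scan the content itself. A trivial keyword check stands in here
    # for the increasingly complex (even AI-backed) scanning mentioned above.
    if b"forbidden-term" in data:
        return False
    # Layer 2: match the data against published hash lists of known files.
    if hashlib.sha256(data).hexdigest() in BLOCKED_HASHES:
        return False
    # Layer 3: match transaction IDs, originating wallets, or funding patterns.
    if tx_id in BLOCKED_TX_IDS or owner_wallet in BLOCKED_WALLETS:
        return False
    return True
```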
Starting point is 00:28:04 Let's dive into the protocol, because I think in order to actually get into the weeds here more, we kind of need to explain what the weave is. You have said several times now the Arweave blockweave, which is distinct from a blockchain. What is a weave? How is it different from a blockchain? The basic principle that we've taken is that you memorize the state of the network into headers that, with a single block, you can use to join the network. So you have to download that block from someone that you trust, and you can do a sort of partial assessment of the proof of work along that chain as well, to work out how much it would cost as a minimum to create that chain. But other than that, you can start taking part in the processing of new transactions
Starting point is 00:28:47 in the chain without access to any of the prior data. And then you have this incentive that basically says, okay, so the larger the amount of prior data from the network you can prove you have access to on demand, the higher the likelihood you have of receiving a reward in the network. And with Arweave 2.0 now, not just the higher the likelihood, but also actually slightly the higher the reward you get when you do get a reward. This is basically a pretty simple consensus rule that says, okay, when I'm mining a new block, based on a derivative of the hash of the last block, calculate a random byte from the network and provide me with a proof that you have access to that byte. So in order to do that, in the same way that BitTorrent does, everything's split into chunks.
Starting point is 00:29:32 So you transfer the chunk. And then we also create a Merkle proof that says this chunk with this ID is verifiably a component of this transaction. And this transaction is a component of this block. And this block is a component of this blockweave. And then you bundle all of that up along with the new block data. And you send it to everyone else. Everyone else can verify it because they have access to a block hash list. So a list of the header hashes of every other block in the network.
Starting point is 00:29:59 And also the index of essentially the byte offset at that time step in the progression of the network. And then they can essentially take your proof, they can verify it, and they can validate that you had access to that piece of information. And then they can accept the new block, give you the reward, and the network continues.
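A minimal sketch of the proof-of-access idea just described, assuming a simplified model: the recall byte is derived from the previous block hash, and a miner can only produce a block if it holds the chunk containing that byte. The function names and data structures here are our own simplification, not the actual Arweave 2.0 implementation.

```python
import hashlib

def recall_byte_offset(prev_block_hash: bytes, weave_size: int) -> int:
    """Pseudo-randomly pick a byte offset in [0, weave_size) from the last block hash."""
    return int.from_bytes(hashlib.sha256(prev_block_hash).digest(), "big") % weave_size

def can_mine(prev_block_hash: bytes, weave_size: int, local_chunks: dict) -> bool:
    """local_chunks maps (start_offset, end_offset) -> chunk data held by this miner."""
    offset = recall_byte_offset(prev_block_hash, weave_size)
    # A real proof would also bundle a Merkle path from the chunk up to its
    # transaction root and the block index; here we only check availability.
    return any(start <= offset < end for (start, end) in local_chunks)
```

The more of the weave a miner stores, the more likely any given recall byte falls inside a chunk it holds, which is the incentive to keep old data around.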
Starting point is 00:30:23 Let's talk about the new block for a minute. This is the thing that's used to store data. How big would you expect a block to be? They can be enormous, really. But I think the limit to a block size is 2 to the power 256 bytes now, with Arweave 2.0. That's because you don't really have to transfer that much data around. What you need to transfer to everyone in the network is the transaction header, which says, okay, there exists a data set for which the Merkle root is X. And then as long as everyone can agree about this, then we can generate the Merkle root of transactions,
Starting point is 00:30:51 which is essentially just kind of like another tree on top of the other tree, probably. And then from that, we can generate the root of the transaction set. And so from this very large set of transactions, we can create just one kind of root. And then everyone can agree on this and write it into the network. If I want a block from you, would I get the entire block or would you give me like a piece of the block? So you wouldn't get blocks at all. That's not the way you would do it. Instead, you would say, give me the block header.
Starting point is 00:31:19 And then you would say, okay, and now give me a transaction header. And now also you just go around asking for chunks from those transactions and then you fill it in that way. The idea of downloading a block is kind of way too big, or at least could be way too big. How do the economics of this work? So basically, say, I want something on the internet stored forever. How do I pay for this? Is there some sort of state rent, or does this vision bank on storage becoming cheaper indefinitely? We basically realized that storage costs decline, which is pretty regular.
Starting point is 00:31:51 It's extremely regular, actually. You can essentially get a series of costs that tends towards a single value. So if you take the summation of the declining cost, then you get a single value for the cost of storage. And that's pretty interesting, but obviously the question there is, well, how sustainable is that model? I mean, does the price of storage decline forever? And so we looked at the history, which says that over the last 50 years, the cost of storage has declined at a rate of about 30.5% every year on average. And indeed, if you go back further than that, so if you stop thinking about sort of magnetic hard drives and you look at, like, what were the storage mediums before
Starting point is 00:32:27 this, how dense were they, and how long did, for example, a piece of parchment last, this kind of thing. You actually see that this pattern extrapolates over like millennia. So we see that in the past it has been extremely well sustained. And then we say, well, okay, just because it happened in the past doesn't mean it's going to happen in the future. What are the things we would expect to see if it was going to stop? So there's two fundamental components. So if you want to understand this properly, you first have to understand what we call PGB, so the cost of storing a gigabyte for a single block, or for an hour, PGBH. And this is a derivative of the data size of the storage medium, the cost, and also the length of time the storage medium
Starting point is 00:33:11 is active before it starts to fail. And if you take these factors, you can basically work out that, you know, there's some cost X for the storage of a piece of information for some time period. So the key factors that change there substantially are the data density and the data reliability. And so we look at data density. Okay, so in practice, we are somewhere around the 1 times 10 to the power 13 or 14 bits per cubic centimeter range. And then you look at what's the theoretical maximum data density, something like 1 times 10 to the power 68.
Starting point is 00:33:44 So that's interesting. So if we were to continue towards that theoretical maximum data density, at the current rate, which is pretty aggressive, it seems, like 30.5%, how long does it take? And it takes 437 years from the present point. Or actually, that was when we wrote the paper last year. I think it's, I guess it's 436 years now. So that's an interesting component. And then you look at the other side, which is data reliability. From where we are now, how much stretch is there in making data storage more reliable? And we see that, well, when you're storing on Arweave, everything is immutable.
Starting point is 00:34:24 it's permanent, right? You never have to delete anything, which is a fundamentally different game from a hardware perspective than storing a rewritable medium, because basically all of our storage media at the moment is based on the idea that you should take a bit and you should be able to flip that bit backwards and forwards. And subsequently, the sensible way to do this is with electrons. But of course, electrons, they do all sorts of stuff, like at the quantum level, they kind of move around in slightly unpredictable ways. They're just not a very firm storage medium. So if you were going to make a truly permanent storage medium, something, for example, akin to, I guess, like the ASICs that you saw in the Bitcoin world, then actually you would stop doing it with electrons. You would start encoding in graphite or something like this. And then you could make structures that lasted for millennia,
Starting point is 00:35:04 trivially. So that's kind of interesting. So it's harder to work out what the absolute bound is on data reliability, but we see that there's even more stretch there than there is in data density. It looks like we're very, very, very far off, at a technical level, from these limits. Then the final question you have to ask yourself is, well, is there some reason that this might not happen? Like, yes, you can do it. That doesn't mean people are going to do it. And we take the approach that, with so much of the economy becoming about information processing and information in general, the likelihood that humans don't have an incentive from the normal economy, not to serve Arweave, but just in the general workings of society, to continually
Starting point is 00:35:47 increase the efficiency of these systems is extremely low. And consequently, yes, we think that they are extraordinarily likely to keep doing this for a very long period of time. There's another component to this that says, okay, fine. We think it's declining at a rate around 30.5%. But just to be safe, let's set the basic expectation in the network itself to 0.5%. And as long as we stay above 0.5%, we gain interest in terms of storage purchasing power each year. And that means that even if for some reason the cost of storage were to stop declining, then you've got funds there to pay for 200 years' worth of storage up front. That's the argument for permanent data storage with Arweave.
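A back-of-the-envelope version of that argument, in our own notation rather than the lightpaper's, looks like this:

```latex
% Let $c_0$ be the cost of storing one gigabyte for one year today, and assume
% that this cost falls by a fraction $r$ each year. The cost of storing that
% gigabyte forever is then a convergent geometric series:
\[
  C_{\infty} \;=\; \sum_{t=0}^{\infty} c_0 (1-r)^{t} \;=\; \frac{c_0}{r}.
\]
% With the historical decline rate $r \approx 0.305$ this is only a few years'
% worth of storage at today's prices; with the deliberately conservative
% protocol assumption $r = 0.005$ it becomes
\[
  C_{\infty} \;=\; \frac{c_0}{0.005} \;=\; 200\, c_0,
\]
% i.e. an endowment holding roughly 200 years of storage at today's prices,
% matching the figure quoted above, even if costs never fall faster than
% 0.5% per year.
```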
Starting point is 00:36:29 Okay, so the short answer, just to make sure that I actually got this, is I pay for the storage once, when I actually request it, right? Yes, you contribute to an endowment, exactly. Okay, and then the miners can choose to store this piece of data for me in some sort of block. Do miners have to store the entire block? Or is it a yes, no, yes, no, yes, no thing? Or can I say, I would like to store sections 1 through 17 of this block and then 25 through the end? Exactly. Yes, they can store on a chunk basis. So everybody is taking part in the consensus of transaction headers and block headers in one component. And now everybody is doing that.
Starting point is 00:37:12 So everyone has access to each one of those components. But there's another kind of system that is happening basically where people are uploading chunks to the network. And then people are just asking for chunks that they want to store, in a very similar way to BitTorrent. Okay. And now say I want to recall something from a block. What do I do? So you ask someone in the network.
Starting point is 00:37:36 And if they have the chunk, they give it to you. Free of charge. Because of this, well, the optimistic tit-for-tat mechanism. So it looks from their point of view like it's free of charge, but only free of charge in the same way that downloading a BitTorrent file was free of charge. It wasn't really. You're exchanging something of value. It's just that the thing of value to you wasn't that big a deal to lose: the bandwidth of your computer.
Starting point is 00:38:01 Yes. So you find the chunk and you download it from the network, swapping bandwidth with the person. What's the level of replication that Arweave incorporates within its infrastructure? Right. So one of the interesting things about the way the incentives in the system work is we don't say, let's take 30 people and give them copies of the data. We don't actually think that works very well, for a number of reasons we can get into in a bit. But instead, what we do is we say everybody has an incentive to make sure that they have access to the data. And that's not to store the data per se, but to make sure that they have access to it.
Starting point is 00:38:37 So they can recall it on demand when someone asks them for it. Yes, and consequently, in the current network, I think the replication rate is 98% or something like that. It's basically complete. Almost everyone is storing almost everything, apart from a kind of level of churn in the network. So new people joining, they have to download stuff. So it's around 98%. That means, I guess, yeah, we have about 500 nodes. So we're looking at quite a few hundreds of copies of a piece of information. So when you say 98% replication, that means that a single piece of data is replicated at least once. Yes, yes, at least once, but also much, much greater than that.
Starting point is 00:39:19 So there's 98% replication of the entire data set on every node, and there's 500 nodes. So we're looking at almost 500 replicas of the pieces of data, essentially. Okay, interesting. But we don't expect it to stay at this kind of replication level. It'll drop substantially lower than that, and that's within the bounds of the system. So I'm still interested in attacks on the system, if you want. I mean, you've got this optimistic tit-for-tat in place. So is there a way that I can abuse the system by asking for data over and over again?
Starting point is 00:39:54 And I know that then three out of four times this would be given to someone else, but can I basically block the other people from receiving data who also don't have any reputation yet? So kind of like an eclipse attack, I guess you could say. Yeah. Well, I mean, it gets its security properties from BitTorrent. And there were many, many attempts to do that kind of thing there, and it just didn't work in practice. So I would say no. But it would be interesting to try.
Starting point is 00:40:21 Like it does come down to implementation details and those kinds of things. This is one of the things we're focusing on this year, decentralizing the protocol. So at the moment it's like Bitcoin in that there's only one client for the network. We don't think that that's how protocols should work. Because that means that the rules of that individual client and the bugs of that individual client determine the behavior of the entire network. That just simply isn't the way it should work. We should design protocols that are secure and robust, and then there should be developers that come along and build implementations of those protocols. And the bugs in those
Starting point is 00:40:57 implementations should not affect the protocol at a core level. So, yeah, we're working with a few groups now to build out those separate implementations. And then hopefully if something like that were to come up, it might affect one client, but it wouldn't affect all of the clients. And that's the important part. Cool, yeah, that's super interesting. So you said earlier that the reputation is stored not globally but locally, right? So basically I would know about the reputation of the people I'm in contact with. So how does that work?
Starting point is 00:41:25 So you call it the wildfire mechanic, right? Well, that's actually like the first agent in a broader game, an adaptive, interacting incentive agent game. But yes, it's an implementation of tit-for-tat, broadly. So basically, I would store data on the people I regularly interact with and say, whenever Sam asks me for something, he never sends anything back. So basically, I would dodge you, you know, I would dock you points. Exactly, yes.
Starting point is 00:41:57 Yeah. Or really, you deprioritize me. Okay. So I would deprioritize you and then if you ask me something, I would just not send it to you. Right. Or I would just be really far down on your queue. And you might never get to responding to my request. And then people who join the network afresh, they would be equally far down the queue or they would join above you?
Starting point is 00:42:18 Not necessarily. No, you get a kind of grace period with the current implementation of the agent. Although actually it's a multi-agent game now. In the same way that BitTorrent doesn't enforce the rules about which ranking algorithm you use for other members of the game. And that's really interesting because it means that you can have multiple implementations of different games playing on the same field, if you like. So I could rank you guys in one way, and then you guys could choose to rank me in a different way. And that has, yeah, very, very interesting mechanism design effects. It essentially creates a game of generating games.
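Here is a minimal sketch of what such local, off-chain peer scoring can look like, in the spirit of the wildfire mechanic described above. The weights, update rules, and class names are invented for illustration; real nodes are free to rank peers however they like, which is exactly the point Sam is making.

```python
from collections import defaultdict

class PeerRanker:
    def __init__(self):
        self.scores = defaultdict(float)  # peer_id -> purely local reputation score

    def record_served_us(self, peer_id, n_bytes):
        self.scores[peer_id] += n_bytes        # they sent us data: score goes up

    def record_ignored_us(self, peer_id, n_bytes):
        self.scores[peer_id] -= 0.5 * n_bytes  # they stalled our request: score goes down

    def request_queue(self, pending_requests):
        """Serve higher-scored peers first; low-scored peers drift down the queue."""
        return sorted(pending_requests,
                      key=lambda req: self.scores[req["peer"]],
                      reverse=True)

# Example: peer-a reciprocates, peer-b does not, so peer-a is served first.
ranker = PeerRanker()
ranker.record_served_us("peer-a", 10_000)
ranker.record_ignored_us("peer-b", 10_000)
print(ranker.request_queue([{"peer": "peer-b"}, {"peer": "peer-a"}]))
```

Because every node keeps these scores locally, nothing here needs to be agreed on-chain, which is why the mechanism imposes no consensus or scalability cost.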
Starting point is 00:42:55 And one of the things we've been looking at building a proof for over the last year is this idea that if we all have a bias, like if we all know that there is some social good out there. So, for example, yeah, the system we normally point to is, in Arweave, you can run an adjunct IPFS node on your machine. And when you do this, it means that you expose the information that you've got inside Arweave, also inside IPFS. This is a valuable thing for the network to be doing, but there's no direct incentive for it.
Starting point is 00:43:25 And so we essentially see that when you have these interacting incentive agents, there is an incentive for you guys to incentivize me to run this IPFS node, because we're all incentivized around the value of the network, right? And you might not necessarily want to do it yourself, but you definitely do have a desire to express that I should do it. And that means that you have a desire to build agents that slightly tip me, so rank me slightly higher for exposing this behavior. And consequently, what happens over a long enough timeline appears to be,
Starting point is 00:43:59 and we have simulations for this that are pretty robust and show it, at least in practice, but we'd like a kind of theoretical proof, that over time you tend towards pro-social behavior in evolving environments, which is a really fascinating mechanism design to have. Basically, we all start to incentivize each other, to incentivize others to express useful behaviors,
Starting point is 00:44:22 even when the environment the network finds itself in is shifting all the time. So perhaps we find that, I don't know, link latency for some reason increases dramatically. That might change the way that the structure of the network should work. But there's robust mechanisms in there to make sure that people adapt their agents to cover that kind of eventuality. So that's really cool. And we think over like periods of decades or centuries,
Starting point is 00:44:47 they'll be very important. So this game of incentivizing each other to engage in these games tends to consolidate and, I mean, at the end, essentially there would be sort of a mechanism that would emerge as like the optimal pro-social mechanism? Or would you anticipate for there to always be these competing mechanisms for ranking? I think that, so in a theoretical simulation where the environment doesn't change, you reach a
Starting point is 00:45:21 fixed point. You reach a point at which there is the optimum agent. But that's not really the way the world works. So the environment is always slightly shifting. There are different features that the customers want, this kind of thing. As a consequence of that, we don't think that it moves towards a fixed point. We think it actually oscillates towards the most optimal solution for the current environment. One thing we have observed is that if the environment moves too fast, you actually remove this ability to have consensus. The really fascinating thing about this kind of mechanism design problem is that it requires zero global consensus in order to have the mechanisms themselves change. So when we look at Bitcoin, for example, we see that, like, you know,
Starting point is 00:46:00 the block size limit has been a major, major problem. Everybody knows that one megabyte is not sufficient for the network. But no one can really agree on what it should be. And changing it requires global consensus. So if we were to change the mechanism of the game, everyone has to agree on playing the new game. Or we split and we start playing two games at once in different fields, different disconnected universes.
Starting point is 00:46:25 But with this adaptive mechanism design approach, you can basically have people, yes, changing independently of one another and then slowly over time annealing towards a single solution, which is very powerful, we think. This is super fascinating. Would you allow me to recap just what I understood from the protocol? Because I think it's super complex and it's different than a typical blockchain, right?
Starting point is 00:46:49 So, Arweave is there to store data. The right to mine a block is begotten to me by the fact that I can recall a piece of an arbitrary prior block. You incentivize the fact that people store old blocks by kind of allowing them to mine new blocks and get the block reward for the new blocks based on how many of the old blocks they have stored. So basically, if I have stored a large chunk of the old blocks, then I'm eligible to mine many new blocks,
Starting point is 00:47:23 whereas if I only hold like a couple of the old blocks, I am not. But if I hold, for instance, if I hold rarer blocks, then my chances of being able to mine a new block are increased, right? Yes, broadly. The only thing I would add to that is that it's not just a block reward per se. It's also a take from an endowment structure. So when you pay to add a piece of information to the network, it's not like Bitcoin where the fee goes directly to the miner that generates that block. Approximately 20%, I think maybe 16%, of the fee goes to the person that generates that block. But the rest goes into this endowment, which the miners take from
Starting point is 00:47:59 indefinitely over time, as a supplement to... the traditional block reward that you would get in a Bitcoin-like system. Cool. Let's go into that in a second. But just one last question on the mechanism itself. So basically the recall block, the block that I need to prove I have in order to be able to mine a new block. What if no one actually has it?
Starting point is 00:48:20 What if it's genuinely disappeared from the Arweave? Right. Then you use option two or option three. So there's a mechanism that you can use to derive alternate chunks that you would want to, yeah, submit as a proof of access. But when you do that, you increase the proof-of-work difficulty by one each time. And by one means essentially doubling it each time. So it gets harder and harder.
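One way to read that fallback rule, in our own notation: if a miner has to fall back k times before reaching a recall chunk it can actually serve, the proof-of-work target tightens so that the expected work is

```latex
\[
  W_k \;=\; 2^{k}\, W_0 ,
\]
```

where W_0 is the work required when the first recall chunk is available. Lost or rare chunks therefore make mining strictly more expensive, which preserves the incentive to keep every chunk retrievable.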
Starting point is 00:48:45 Yeah, so there is a mechanism in place to deal with that eventuality if it were to ever occur. Okay, but I mean in the current setting where 98% of all nodes actually store all data, it's not really necessary. Right, exactly. Let's talk about the endowment and the token economics of the AR token. So how does that work? So I know that there were 55 million AR tokens that were minted at the get-go. So what happened to those?
Starting point is 00:49:13 And I know there's more tokens being minted for block rewards. Where do they go? Who gets them? Okay. So of the initial 55 million, I think something like 70% are now in the hands of the community. Yeah, token holders that essentially... purchase them. We, we hold the other components there. Well, there's also some in there for the team and a few other things. I feel like that's broad distribution. The other 11 million, yeah, it gets
Starting point is 00:49:42 printed in a way that's very similar to Bitcoin, per block, as basically a kind of subsidy during the period in which the network is gaining momentum. Except we don't have this kind of strange halving mechanic that Bitcoin has. This is something that I'm just desperate to ask Satoshi, if they were to ever come forward. Why make it so that, yeah, the number of tokens that are printed halves every four years, whatever it is, instead of having a smoothly declining curve? Like markets really like certainty. And they get terrified around things like, well, the inflation rate is going to halve tomorrow. What's that going to mean for the price? Yeah. So, that's something I really, I would really love to know what motivated that design choice.
Starting point is 00:50:26 But anyway, in Arweave, we use a smoothly declining curve, so there are none of these kinds of halving events. In terms of the token dynamics, when you want to store a piece of information, you have to have a wallet that has a greater balance than the cost of storing the piece of information that you'd like to store forever. You sign a transaction and you dispatch it to the network. The miner that generates the block gets to take some very small portion of this as a kind of incentive for including that transaction in the block at all. Because if you don't have that, they could continue to take from the endowment or from the block reward without actually including new transactions.
Starting point is 00:50:59 So there has to be a little nudge there to get them to include transactions. So that's what that is. The rest goes into this endowment pool, which has these mechanics: the declining cost of storage leads to, essentially, interest on the storage purchasing power of that pool over time. And then miners take from that at such a rate that the principal is never declining, never removed, but they're paying for the storage of all data through the interest on top of it. And they can take from that pool, yes, once for every block.
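A back-of-the-envelope sketch of the endowment logic just described. The miner share, the current storage price, and the assumed rate of decline in storage costs are all illustrative assumptions rather than protocol constants.

```python
def split_fee(fee: float, miner_share: float = 0.16) -> tuple[float, float]:
    # A small cut of each storage fee goes to the block's miner as an
    # inclusion incentive; the rest is locked into the endowment. The 16%
    # figure is the rough number mentioned above, not a protocol constant.
    to_miner = fee * miner_share
    return to_miner, fee - to_miner

def perpetual_storage_cost(cost_per_gb_year: float, annual_decline: float) -> float:
    # If storing a GB for a year costs `cost_per_gb_year` today and that cost
    # falls by `annual_decline` every year, storing it forever costs the sum
    # of a convergent geometric series: c + c(1-d) + c(1-d)^2 + ... = c / d.
    # The endowment holds roughly this principal and pays each year's storage
    # out of the "interest" created by the falling cost of hardware.
    return cost_per_gb_year / annual_decline

# Illustrative numbers only: $0.002 per GB-year today, costs falling 0.5%/year.
print(perpetual_storage_cost(0.002, 0.005))  # -> 0.4 dollars per GB, forever
print(split_fee(1.0))                        # -> (0.16, 0.84)
```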
Starting point is 00:51:35 Okay, and how is it determined how much I have to pay for storing an arbitrary piece of data? Does it depend on the size of the data? Yes, absolutely. Yeah, that's the critical component. And is it centrally determined, or is it something that the miners can set themselves? Well, the miners can set everything themselves at some level. But yes, in this case, there are consensus rules for defining how that should be agreed amongst the miners.
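To see the size-dependent pricing concretely, public gateways expose a price endpoint. The sketch below assumes the arweave.net gateway and its HTTP API as it worked around the time of this episode; 1 AR is 10^12 winston.

```python
import requests

def price_in_ar(num_bytes: int, gateway: str = "https://arweave.net") -> float:
    # GET /price/<bytes> returns the current fee, in winston, for storing
    # that many bytes; divide by 10**12 to express it in AR.
    winston = int(requests.get(f"{gateway}/price/{num_bytes}", timeout=10).text)
    return winston / 1e12

if __name__ == "__main__":
    print(price_in_ar(1_000_000))  # approximate cost of storing ~1 MB, in AR
```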
Starting point is 00:52:22 So one of the things I've thought a lot about in terms of these decentralized data storage networks, whether it's IPFS or Sia or Arweave, is decentralized storage availability in a context where individuals own less and less personal storage. So, you know, if you take a step back for a moment: if we want to have decentralized storage that is massively distributed, where there's no central point of failure, you would want that storage network to be made up of a large number of individuals providing storage to the network. But the reality is that increasingly people don't have storage capacity in their homes. People are using things like Dropbox or Google Drive for the majority of their storage, whereas, you know, five to ten years ago, lots of people had hard drives in their houses to back up files and that sort of thing. I still have that, but most people I know don't have hard drives in their homes. And embedded devices, basically, you can't use them for this sort of thing either.
Starting point is 00:53:06 So what do you expect the Arweave network to look like in the future? Who will be the participants? And do you think that there will be professionalization around providing storage to the network? You know, like we have mining companies, right, that are professional miners that run ASIC farms, et cetera. And, you know, what are the mechanisms that are in place to ensure that there isn't a large portion of that storage that's captured by a company like Google, for instance, or even a state actor? How do we ensure decentralization in this context?
Starting point is 00:53:39 Okay, lots of interesting questions there. The last one first. I think we ensure this through the protocol. So you can't necessarily stop centralization. I'm not a strong believer that blockchains are true generators of decentralization over long enough periods of time.
Starting point is 00:53:57 It just doesn't actually seem to me like that's what happens. And the reason for that is we just can't crack this "one CPU, one vote", or even "one human, one vote", in a decentralized, trustless network problem. And so naturally I think we're always going to end up with centralization of some form when you have an economic reward and an economic cost to the behavior, because then people are going to find the economies of scale, basically setting up larger and larger rigs. And this is somewhat disappointing, but it doesn't have to be the end of the world,
Starting point is 00:54:30 because we can set up the rules of the protocol itself so that, even if there is a kind of quasi-centralization of the network, there are still guarantees about how it should function from the user's point of view. And that's what I think is powerful about these things. So that's my point of view on that one. I think this idea that you picked up on, that people basically aren't storing their data locally anymore, that's absolutely right. I think one of the areas where the Arweave approach really disagrees with the Sia and Filecoin approach is that we just don't see that there are mountains and mountains of unused hard drive space out there in consumer devices that we can easily attach to this network to allow people to have decentralized and fast and reliable access
Starting point is 00:55:16 to their data that is stored on someone else's machine. I don't think that that's a reasonable assessment of how the infrastructure of the internet works to expect that that would happen. And I also think that if it was possible to get greater economies of scale for data storage, by using a kind of decentralized cloud approach essentially, then Amazon would have done this a very long time ago. Amazon would just let you rent out your hard drive space, and they would do it in an essentially controlled manner. It would be much more efficient from that point of view,
Starting point is 00:55:48 and they would be making a lot of profit from it, but it just doesn't strike me that that's actually quite how the world works. Like the economics: we've spent quite a while looking at them, and we can't get the numbers to add up. What we think decentralized storage is valuable for is allowing you to do things that you simply couldn't do before. And in the case of Arweave, that's storing data forever. And you get extreme benefits from the immutability of the business model. So everyone can come along to this business model over time.
Starting point is 00:56:19 They can trust it's never going to change, because it's not a single centralized party that's controlling it. Yeah. And consequently, it's just much more reliable and robust. That's something that decentralization offers that I don't think you can mimic in the centralized world, because if Amazon were to offer "permanent" information storage (sorry, you can't see my air quotes on the podcast), a supposedly permanent storage offering, you would have to trust that in 50 years' time Amazon is still going to offer that same product in an unmodified fashion.
Starting point is 00:56:50 I think that's really unlikely. But with the decentralized protocol, you can do that. You know, HTTP 1.1 works just the same now as it did when it was first released, and TCP/IPv4 and so on. Are the incentives in Arweave dependent on people using the network? So if I'm the only person left in the world using Arweave, what are the incentives for, not that I think that will happen,
Starting point is 00:57:16 but if there's one person left using the system, say in 50 years from now, and it's forgotten, right, and no one is using it anymore, what is the incentive for all the miners to continue storing the data there and basically fulfilling the contract of the endowment? They could just say, okay, I'm going to shove off with this hardware. Nobody cares about this thing anymore.
Starting point is 00:57:38 Right. So I think from a technical perspective, the key thing is that, assuming the value stays constant, you don't need new users to upload more data. But it is a currency, and so its value is backed by perception. And so, yeah, there is that question. And I think the answer to it is actually quite simple. It's just social reward, in the same way that we back archives now. In fact, there's a pretty large, decentralized archiving community in the world
Starting point is 00:58:07 that just stores information from the internet because they think it's valuable, and they put it on their hard drives. Yeah, for the sake of posterity, for the future. So I think that, you know, if that financial reward were ever to not be enticing to people anymore, people would just fall back on that social reward. And we've actually already seen miners in the network
Starting point is 00:58:26 talking about this. A lot of the miners, they don't care so much about the financial reward of the thing, but they're kind of in love with the mission, and they just want to run a node because they want to replicate this data and they want to be part of this project to permanently store all of humanity's useful knowledge and history. I think that the mission is something that's incredibly useful and valuable, and it's something that I could stand behind. But I do want to go back briefly to what you talked about,
Starting point is 00:59:01 Sia and IPFS. I mean, we just talked to David Vorick as we were recording this last night, and I asked him this very question the first time we interviewed him. I think it's also similar for something like Spacemesh, right, that makes use of hard drive space for some sort of proof-based consensus mechanism.
Starting point is 00:59:20 Or Tierra first coin, those kinds of things. Right, like all of these proof-of-storage type systems. And their answer, I think, is: we expect professionalization to emerge and professional actors to enter the space. And they don't think the future is one where there are billions of people with hard drives in their homes powering this network of storage. And I think that that's probably inevitable also. I mean, we've seen this already in Bitcoin, right? Like mining is not a hobbyist's game anymore.
Starting point is 00:59:55 It is a professional game. I'm not really sure where I'm going with this. But yeah, I think it's inevitable that these things happen in a very complex economic game when there are multiple actors and money at stake. Yeah, for sure. I mean, I think, frankly, we don't see ourselves as competitors to Sia or Filecoin particularly, because they're working on temporary storage and we're working on permanent storage. I actually think it could be very valuable if we could provide people with the ability
Starting point is 01:00:23 to, yeah, store temporary data inside Sia or Filecoin from an Arweave permaweb app; that'd be great. Like, as we spoke about right at the beginning, we don't think everything should be permanent. So it'd be great if you could make permanent applications that interact with temporary storage. Same with Blockstack and their idea of having these kinds of pods; oh, and Solid pods, Tim Berners-Lee's thing that's kind of on that edge of crypto where everyone is sort of vaguely aware of it, but no one has actually looked into what it's doing,
Starting point is 01:00:55 and no one can see anything that works with it in practice. But yeah, these ideas where you have private, decentralized data pods where the user owns their data: that's a great thing to have. And it would be really cool if you could, or rather would, because you can: if you would build Arweave applications that interact with those kinds of data storage systems where permanence just doesn't make sense, but where the application itself should be permanently stored. This is something we're very happy about the idea of. And there's another component to your question, which was about IPFS and maybe Skynet now, versus Filecoin and Sia.
Starting point is 01:01:38 And this is something that I think gets confused a lot in the crypto communities: they're actually very different products. Sia and Filecoin are just about storing data, like connecting you with someone out there that will store your piece of data for you, whereas Skynet and IPFS are much closer to traditional content distribution networks. They're doing entirely different things. I actually think there are some very interesting areas where we could work with IPFS or Skynet, if the data routing ever became fast enough, yeah, to essentially embed with Arweave and provide a CDN on top of that data storage layer. That'd be great.
Starting point is 01:02:18 One thing I was also thinking about when preparing for this is, in terms of Arweave and the product that you're building, what you're providing as a product is the ability to store things forever, right? Whereas, like you said, IPFS is more of a content distribution network, where Sia is meant to be more like a decentralized Dropbox type of thing. But the underlying commodity that all of these systems use is data storage. I think it would be interesting if all of these projects got together and somehow figured out how to create a single, you know, system where basically all of these user-facing applications that sit on top rely on the same data storage mechanism.
Starting point is 01:03:07 Instead of having siloed data storage mechanisms for different applications, just having one big decentralized data storage mechanism where you're benefiting from the compounding size of these different networks and then building applications on top. I don't know if that's the right way to look at it. So the thesis that we work with in the company that founded Arweave is along these lines: that information is the most valuable thing humans have now. And we've explored the space of how to store more information more cheaply very well. There are enormous efficiencies there, and it created a massive amount of value.
Starting point is 01:03:46 What we haven't explored... oh, and the same with moving data around. So, first there was the data storage revolution: we went from storing one megabyte of data in something that was the size of a house to storing 16 terabytes of data in something that's a small part of my laptop. So that created an enormous amount of value. And then another thing you can do with data is you can move it around. And that was the internet.
Starting point is 01:04:09 And that created, it would be very hard to quantify, but just staggering amounts of value. Our thesis is that there are essentially other ways that you can manipulate data that are along the lines of the length of time it should be around in the world. And we're just trying to fill one of those holes. And I think that that could unlock enormous amounts of value, but also that it's necessarily
Starting point is 01:04:36 kind of slightly dissociated from this temporary, or at least ephemeral, storage world. They are quite different things. That's not to say that the protocols can't interact, but I don't think they will just be drop-in replacements for one another, I guess, because they're just focused on fundamentally different things.
Starting point is 01:04:55 I think if you're building a permaweb app where you want to, I don't know, store large videos or something like that, then you might consider putting them in IPFS or in Filecoin or something. That'd be great. It makes a lot of sense, because the file probably doesn't have to be around forever. But, of course, if you're building some kind of app inside IPFS that needs to store a record or something for a very long period of time, then you should probably store it on Arweave. So they can interoperate on that sort of level,
Starting point is 01:05:23 but I think it's not that the commodity underneath is the same; it's almost the very point that it's not. The commodity of permanent data storage is different from the commodity of temporary data storage and, when we eventually get to it, the commodity of ephemeral data storage. So let's talk about the hole that you are trying to fill. So basically, let's talk about the use cases that you are addressing.
Starting point is 01:05:45 Is Arweave a storage chain that should interlink with other blockchains? Do you expect applications to be built on top of Arweave? Or what kind of information do you expect to be stored in the weave? There are different time periods, right? There's the very immediate time period: we can see that we're very obviously useful to crypto projects. So we're now working with a kind of scalability chain to store backups of their entire chain, because it's part of their model that you don't necessarily have to store that stuff, but users are now asking them, particularly in the DeFi space: what if I want records of every single transaction that's been through my chain?
Starting point is 01:06:23 So they are just offloading that to Arweave, where there are permanent and sustainable economics to handle that kind of storage. But in the longer run, we see ourselves fulfilling some kind of role that is somewhere between an archive and an internet: a place that humans go to put pieces of information that they want to be around for exceedingly long periods of time, whether that's knowledge, whether it's records of history, or whether it's personal memoirs, or really anything like this. If you need it to be around for a long period, we expect that Arweave is the place that you'll want to go to store it.
Starting point is 01:06:55 But this is, you know, a 10-to-15-year time horizon. So I probably should have asked this earlier in the protocol section, but what's the block time? Right, two minutes. Two minutes, okay. On average. Although we think we can speed that up now.
Starting point is 01:07:11 With Arweave 2.0, we changed the proof of access mechanism so that the amount of data that you have to transfer when you produce one of these proofs is much, much smaller. And so now we think we can compress the block time. Okay. So, I mean, there's a well-known problem on smart contract chains, the data availability problem. Right. Do you think Arweave can help with solving that? Partially. I mean, people often mistake Arweave for a solution to the data availability problem. We have a basic solution to it, but it's not perfect for all situations. So you can use Arweave as a data availability layer.
Starting point is 01:07:50 It's very simple. You just use a special transaction type that requires all of the data in the transaction to be distributed to everyone in the network. And then, once that's happened, there's a permanent incentive to recall that data, so it's just going to be available, with an extraordinarily high degree of likelihood. But that's not the kind of hyper-scalable stuff that Arweave is focused on; instead, we permanently store, and incentivize recall of, the data.
Starting point is 01:08:14 And that's not quite a solution to the same thing. And so that means that, as long as the user has actually uploaded it in a responsible way, then it will be there. And if they haven't, then, you know, it's not magic. So let's talk a little bit about the business. How many people are on your team? Where are you based? And what is the business model here for Arweave? First things first.
Starting point is 01:08:37 The size of the team: we're about 15 people internally. But we have this approach where we see... so, we looked back at the history of protocols before crypto when we were starting Arweave, and we saw that there's this very clear pattern that protocols are generally made by small numbers of people. Actually, sometimes as low as one,
Starting point is 01:08:56 but the kind of average seemed to be around 2.2 people, something like that. They weren't enormous teams. And we think that's because protocols are essentially valuable in proportion to their simplicity, because with a protocol, you want more and more and more people to plug into it. And more people are going to plug in if it's simpler to build an implementation, if it's easier to put inside their product.
Starting point is 01:09:19 So you get this very simple function that basically just tells you: the easier it is for someone to integrate, or to create a new implementation, the higher the likelihood that they're going to actually do that, and subsequently the more integrations you get. So we've seen simplicity of the protocol as a pretty core function right from the beginning. And subsequently, we kept the team size fairly constrained the whole way along. So we've been about 10 to 15 for the last year to 18 months or something at this point.
Starting point is 01:09:52 And we don't have plans to massively expand that. Instead, what we're trying to do is push more capital out into the ecosystem, where developers can pick up the Arweave network and build applications on top of it. And we essentially just want to provide the kick-starting capital to allow people to do that. And that kind of neatly brings us on to what the business model is, which is also pretty simple. As the company that founded Arweave, we just have a big pool of tokens and we have a big pool of fiat. Our game, and our purely selfish economic game, is to say: okay, so if we take this fiat and we deploy it, say we spend $10,000, can we increase the value of the tokens that we have, the sum value of those tokens, by more than $10,000? If we can, then that's just profit,
Starting point is 01:10:36 and so we should do it. And this has this amazing byproduct that, even though this is kind of selfish behavior, and this is one of the things that the crypto-economic community has kind of perfectly highlighted, I guess, in people's minds, we are selfishly incentivized to grow the size of the Arweave community and network and make the network itself more and more valuable. So we just put our minds to doing that. And we do so by investing in community projects.
Starting point is 01:10:55 of the RWeave community and network and make the network itself more and more valuable. So we just put our mindsets to doing that. And we do so by investing in community projects. Can you tell us a little bit about the community projects that you're invested in and basically the kind of products that are built on top of the R-Weave? Right, sure. I mean, there's all sorts of things. There's things like simple blogging platforms to file uploading tools that basically allow you to store a simple file you have on your desktop forever.
Starting point is 01:11:29 to things like decentralized, ownerless forums, essentially: social spaces that are governed by code. And these have really fascinating properties. It's as if you could have a kind of website or a community where we agree up front what the rules are going to be, and so we don't need moderators, and there are no power games associated with that. So we think that'll be extremely
Starting point is 01:11:51 powerful eventually as well. People are basically building all kinds of services, from stuff that's just a drop-in replacement for a Web 2 service, but that provides the user with the certainty that the model is not going to change under their feet and that essentially they're not
Starting point is 01:12:06 exploiting their data and so on and so forth, to entirely new kinds of things that you just couldn't have built on the centralized web in any form. I'm super excited about all these applications. Personally, I've been on this quest for the last two years or so to detach myself from things like cloud storage. So I run my own cloud, basically, here in my house, backed up in different places and so on. But then I'm also hosting my own website, my own blog, a lot of my own kind of personal publishing stuff. And I see this as sort of an
Starting point is 01:12:43 extension of that quest to detach myself from cloud services, where, hey, I could have my blog on Arweave, for example, and it would stay there forever. And I don't ever have to, you know, think about whether I have to back this up or something; it's just there. There have been so many things I've written over the years that I've lost, right? Because I've lost the password to some Tumblr account or something like that. Right. I think this is, you know, very attractive as a product proposition. So what do you hope to achieve in the coming years? Well, the coming years are all about bootstrapping adoption.
Starting point is 01:13:14 So finding the areas where permanent information storage is most useful to people most quickly. And so one of those areas is inside crypto, right? Where, for example, if you have a DeFi application, you have a smart contract on Ethereum, and then you have a UI. And the UI is always stored on something like Amazon S3. This is kind of crazy, because it means, you know, your CryptoKitties, if for some reason the S3 bucket, yeah, someone stops paying for it, it's unavailable or something like that, well, now you just have this smart contract that says that X address has Y number associated with it,
Starting point is 01:13:47 which is not nearly as valuable as the UI itself that was sitting on top of it. So if you store the UI on Arweave, then the entire package is just as permanent as the networks that are running it, which we think is very powerful. So we see a lot of people moving from, yeah, centralized storage of NFTs or centralized storage of DeFi UIs to putting those things on top of Arweave. And, at the larger scale inside crypto, to simply moving chains inside the Arweave network.
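To make the S3-versus-permaweb contrast concrete: once a front-end bundle has been written to the network in a transaction, anyone can retrieve it from any gateway by transaction ID. The transaction ID below is a placeholder, and arweave.net is just one public gateway.

```python
import requests

# Placeholder only; a real deployment would use the ID returned when the
# UI bundle was posted to the network.
TX_ID = "YOUR_UI_TX_ID"

def fetch_permaweb_app(tx_id: str, gateway: str = "https://arweave.net") -> bytes:
    # Gateways serve a transaction's stored data at /<tx_id>, so the front
    # end stays retrievable even if the original deployer disappears or
    # stops paying for hosting.
    resp = requests.get(f"{gateway}/{tx_id}", timeout=30)
    resp.raise_for_status()
    return resp.content

# html = fetch_permaweb_app(TX_ID)  # would fetch the deployed UI bundle
```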
Starting point is 01:14:29 in some cases, you know, hundreds of years. The techniques are pretty much the same. and we can offer them something that is fundamentally different that has a coherent, sustainable economic model behind it. And so that's pretty exciting also. Yeah, I mean, there's really like a whole bunch of different vectors. One of the ones I'm personally most excited about is this idea of allowing for a free and open source web. So one of the protocol-specific problems
Starting point is 01:14:54 that seems to have led to societal level issues that are now playing out is the incentive model around Web 2, which is like hidden. under layers and layers of, I guess, interpretation and subjective, very social stuff. But the fundamental situation is, if I create a HTTP request and I send it to a server that you own, and you then do some computation, and you send me back a result, you have incurred costs because of my usage. So at a technical level, this is sort of, I'd say, paired with
Starting point is 01:15:30 this interesting social phenomenon, which was that as the web developed, but everybody decided that everything should basically be free. So now we have this competing sort of set of facts that says, okay, everything's got to be free, but also my usage is costing you money. And consequently, basically everything on the web had to be a commercial entity of some kind. And worse, it had to be a commercial entity where the user was not going to pay for their usage. And subsequently, the user themselves had to be commoditized. And the way that that sort of played out was in what we now call, I guess, surveillance capitalism
Starting point is 01:16:02 or in the ad tech industry, this idea that basically we just commodify your future purchasing habits or voting habits or access to your personal information, information about you that's pretty sensitive. And we package this up as a product and we sell it to other people.
Starting point is 01:16:19 And that is how we fund this entire ecosystem as a result of this very, very basic protocol decision that says, I ask you for data, you perform computation and you respond with data. So we think that essentially the reason that an open source ecosystem did not emerge on the web is exactly because of that. So if I build
Starting point is 01:16:38 something fun on the weekend, right, kind of hobby project and I launch it and I put it on a domain name, I have this terrible problem that more users equals more expenditure for me. And that pushes me to monetize. Whereas with Arweave, I say,
Starting point is 01:16:54 okay, I put the application I've built out into the, essentially the free information space. I pay once and it's about, you know, one cent per megabyte, something like this. So I barely notice the cost. And that's it. I'm done. Like my expenditure related to this application is over if I don't want to ever update
Starting point is 01:17:13 again. And that means that other people can come to that application forever. And they're essentially, you know, they're paying with their own bandwidth for the costs that they're occurring on the infrastructure. Yeah. And they're not charging the developer. So the developer doesn't have to monetize. They can just build an application because they're passionate about it.
Starting point is 01:17:32 and they wanted to exist in the world. And we essentially see that this enables people to build open source web applications for the first time. It really weren't possible previously. So, yeah, that's also something I'm really excited about for the next few years. I think those incentives are just fundamentally different than the Web 2 world. I think this is a fantastic closing statement. Excellent. Thank you so much for coming on the show.
Starting point is 01:17:58 Thanks for joining us. Thanks for having me. It was great. Thank you for joining us on this. week's episode. We release new episodes every week. You can find and subscribe to the show on iTunes, Spotify, YouTube, SoundCloud, or wherever you listen to podcasts. And if you have a Google Home or Alexa device, you can tell it to listen to the latest episode of the Epicenter podcast. Go to epicenter.tv slash subscribe for a full list of places where you can watch and listen. And while you're there, be sure to sign up for the newsletter, so you get new episodes in your
Starting point is 01:18:27 inbox as they're released. If you want to interact with us, guest or other podcast listeners, you can follow us on Twitter. And please leave us a review on iTunes. It helps people find the show, and we're always happy to read them. So thanks so much, and we look forward to being back next week.
