Epicenter - Learn about Crypto, Blockchain, Ethereum, Bitcoin and Distributed Technologies - David Vorick: Sia – Creating a P2P Marketplace for Data Storage to Disrupt the Cloud Industry

Episode Date: December 17, 2019

Much of the digital world now operates in the cloud. A handful of companies are responsible for the massive market, and the centralization makes many people worry about censorship, privacy, and network resilience. David Vorick’s passion for distributed file storage had him start Sia in 2014. He’s been working fervently ever since to develop a viable competitor to the centralized solutions that are responsible for much of the content on the internet today. David aims to create a more private, resilient, and secure alternative to Amazon Web Services while also outperforming it.

Topics covered in this episode: Why David started Sia rather than taking a job in big tech; The Sia launch in 2015; Why Sia has prioritized development over marketing in the beginning; The importance of decentralized storage solutions; How decentralized storage can be cheaper than AWS; Why Sia needs its own blockchain and protocol's tech stack; David's views on Proof of Work and how that led him to start a mining company; The circumstances around the Sia blockchain fork of 2018; Where David envisions decentralized storage in the next few years

Episode links: Sia API Documentation; Sia Blog; Decentralization & Cutting-Edge Cryptography - Starkware Sessions Talk; Busted Setup - MIT Bitcoin Expo 2019 Talk; Recovering Payment Channel Midstates Using only The User's Seed - Scaling Bitcoin Talk; The Sia Ethos (Sia Blog); Blockchains of the Sia family; Obelisk; SiaStats; David Vorick on Twitter; Sia on Twitter

Sponsors: Pepo; eToro: Automatically copy every trade of eToro's top crypto traders at the exact price in real-time - https://www.etoro.com/

This episode is hosted by Sebastien Couture & Sunny Aggarwal. Show notes and listening options: epicenter.tv/318

Transcript
Starting point is 00:00:00 This is Epicenter, episode 318 with guest David Vorick. Hi, welcome to Epicenter. My name is Sebastien Couture. I hope everyone's having a good pre-year-end holiday week. To all of you who will be traveling over the next few days, I wish you the best of luck and patience. I find that traveling at this time of year can be particularly exhausting, but the payoff of being with family and doing nothing for about a week is always well worth it. So I arrived in Canada over the weekend. I'll be here for the next 10 days or so before heading back to Europe for the new year. Today our guest is David Vorick. David is the
Starting point is 00:00:57 founder and CEO of Sia. Sia is an interesting project for a number of reasons. Now, before doing this episode, I have to admit, I was not extremely familiar with Sia, but I'm really happy that I was able to learn about it. It's a decentralized file storage platform which has been around since 2014. Now, needless to say, it's definitely less talked about than other projects in this category. I'm thinking of IPFS and Storj, for instance. But I think it deserves a lot more attention than it gets. Sia is a true decentralized Dropbox, and its potential is huge.
Starting point is 00:01:37 The technology seems quite mature, and the cost of storing files on Sia is surprisingly affordable. And as someone who runs his own at-home cloud server, I can definitely see Sia complementing or straight-up replacing that setup. Sunny and I talked to David about the history of the project, how it works under the hood, and some of the really interesting features it has, like seed-based file recovery, which allows you to recover your entire file library with just a seed, which is one of my favorite things about this product. We also got into David's views on proof of work. David's a strong proponent of proof of work and even started a company which builds miners for Sia. Now, a few years ago, there was some drama surrounding a hard fork of the Sia protocol, which was somewhat contentious. Going into the interview, I actually had little context for what happened during that hard fork. And so my reaction to it was as genuine as it could have been.
Starting point is 00:02:41 Anyway, I'm curious if you all knew of this and what you think about what happened. So it's quite possible that we have David back on the show in the near future. He's quite outspoken on the topic of trusted setups in ZKP systems. In fact, I first encountered David when he gave a talk on trusted setups at StarkWare Sessions in Tel Aviv. But unfortunately, we didn't have time to get into this, as it deserves an entire episode on its own. Before we go to the interview, I'd like to tell you about our sponsors for today's episode. Pepo is a community of creators sharing short video content on crypto and entrepreneurship.
Starting point is 00:03:20 And on Pepo, you show your love for content creators with Pepo coins. When you like something, you send Pepo coins. And when you want to take part in a conversation, you show your interest by putting up coins as well. So if you want to try it out, now is the perfect time, because the Pepo team has launched the Home for the Holidays challenge. So as we're all spending time with family over the next few weeks, it's inevitable that someone at some point will ask you to explain what crypto is. The challenge is to share how you explain crypto to your family in a 30-second video. And there's a $2,000 prize pool in Pepo coins for the top three videos as voted by the community.
Starting point is 00:03:58 And to be honest, your odds of winning this are actually pretty good because as I'm recording this, there are only about 15 entries. So to participate in the challenge, download the Pepo app on your Android or iOS device and log in with your Twitter account. Then open your mobile browser and go to epicenter.rocks/pepo. That's P-E-P-O. And that'll take you right to the contest thread. And then reply with your unique "What is Crypto" explanation by December 20th at 12 p.m. New York time to enter the contest. And while you're there, you can also check out my entry and send me some Pepo coins if you like. We'd like to thank Pepo for the support of the podcast. We're also brought to you by eToro. Now, if you're interested in getting into
Starting point is 00:04:40 the financial markets and you don't know where to start, eToro is a great place to begin building your portfolio. eToro makes trading easy because you get access to stocks, bonds, ETFs, indices, commodities, currencies, and crypto in one single easy-to-use platform. eToro is unlike any trading platform you've ever experienced. If you're used to using your bank's trading or investing dashboard, you're in for something vastly different here. And that's because eToro is a social trading platform. When you join eToro, you're joining a community of 12 million traders from around the world. They're talking about trading, sharing charts, and they're talking about crypto too.
Starting point is 00:05:25 The best thing about eToro is the CopyTrader feature. You can automatically copy the trades of top crypto traders at the exact price in real time. Now, these are people who spend a lot of time researching the market, developing their strategies, and building their portfolio. And you can copy their trades so that whatever gains they make, you'll make the same profits proportional to your investment. And one of my favorite things about CopyTrader is that I get to see the risk profile of each of these traders so I can build my own strategies, investing certain amounts with low, medium, or high-risk traders and learning from their trades. So to get started, go to etoro.com, that's E-T-O-R-O, create your account and start trading
Starting point is 00:06:07 and copying trades today. Now, of course, this is not investment advice. This is my personal opinion, and you should always consider the risks in investing, especially with cryptocurrencies as they are highly volatile. We'd like to thank eToro for their support of the podcast. And with that, here's our interview with David Vorick. We're here with David Vorick. David is the co-founder of the Sia network.
Starting point is 00:06:32 David, thanks for joining us. Thanks. Glad to be here. So tell us, how did you get involved in crypto? And what's your background? What were you doing before getting into the crypto space? Yeah. So I got involved freshman year of college.
Starting point is 00:06:47 This was in 2011. And basically what happened was someone from my dorm just popped their head through my doorway and said, you should check out Bitcoin and then they disappeared. So it was like one sentence that changed my life. It was your own little Satoshi. Yeah. So he didn't even care that much. Bitcoin wasn't really his thing, but he knew it would be my thing.
Starting point is 00:07:11 And he was definitely right. So basically, from the age of 18, I dove really deep into Bitcoin, started studying it, looking at it, experimenting with it, trying to find ways to make it better, and by the time I graduated, I decided to start a company around blockchain and cryptocurrency that ended up being the Sia network. So, yeah, the Sia platform was started by me and my co-founder. We both did it straight out of college.
Starting point is 00:07:40 So we don't have anything in between. So I guess I've always professionally been a Bitcoin guy. What was it about you that your friend knew that you would be super attracted to, like, working on Bitcoin stuff? That's a great question. I think he knew that I was pretty libertarian. I was pretty excited about technologies like Tor and BitTorrent. And Bitcoin kind of falls under the same, like, internet and, you know, fight the man, gain independence sort of philosophies. So I think he knew that there would be a match there. So you started Sia in 2014. Yep.
Starting point is 00:08:23 And so, I mean, I only recently learned about Sia. When did you guys go live, or what are the different sort of, like, iterations of that? Because it seems that Sia has, you know, gotten much more attention recently than it did, like, you know, in 2015 or 2016. Yeah. So we launched the first version of the network in 2015, June 2015, which seems kind of insane to say from here, because at that point we were just, like, panicked about, you know, oh, is Storj going to come to market first, is Filecoin going to come to market first? Like, we really wanted to be the first platform that worked, and we were. And then it was like it took all the way until
Starting point is 00:09:08 2019 when Arweave launched for there to really be a decentralized, like, competitor. But yeah, so we first launched the Sia network way back in 2015, and then we've just been working on it, building it. I think we're a bunch of builders more than, like, marketers or businessmen. And so as a team, especially early on before we had, you know, grown and expanded and kind of filled out the roles that we were missing, we were just focused on building up the technology. And now today we actually have people on the team who are responsible for growth and product strategy and, like, customer fit. And so I think that that's really where more of the attention is coming from: yeah, we have non-builders, or people who do more than
Starting point is 00:09:51 just write code helping Sia out. Yeah, I was remarking that you guys have a pretty impressive blog presence. Like, your blog is pretty frequently, you know, seeing new articles and things and, like, community updates quite regularly. So I think that's quite good. So you mentioned Storj and IPFS. Explain how you differ from those two products. So I think in our audience, a lot of people will be familiar with IPFS.
Starting point is 00:10:18 We also did an episode with Storj, like, years ago. But yeah, how is Sia different? Yeah. So I think the chief way we distinguish ourselves from our competitors is that we're fully decentralized. So at no point in the standard, like, Sia flow or architecture do you have to deal with a centralized party.
Starting point is 00:10:40 And so, like, Storj is pretty transparent about the fact that their architecture has centralized elements to it. They have, you know, satellites and coordinators. If Storj, the company, disappears, the network stops working. IPFS also has, like, a similar weakness in that when you put data on IPFS, it's not guaranteed to stick around unless you contract a centralized, like, pinning service, like Infura or something. So on the Sia network, none of these limitations
Starting point is 00:11:07 exist. Today when you put data on the Sia network, it goes up in a permanent way, and because you're paying for it, you don't have to worry about pinning it. You don't have to find a centralized party. If our company disappears tomorrow, you know, the website would go down, but the people actually using the Sia app would not notice. There would be no decrease in speed. You know, you'd still be able to upload and download and form new contracts. And so in that sense, Sia is, like, a fully independent network. Okay. So do you think that people are aware of the fact that, you know, IPFS, for instance, has these centralized elements? And why does it matter? So I think most developers that are building applications on top of IPFS are aware, because either they run into the problem that their files disappear or they are actively contracting these centralized services.
Starting point is 00:12:01 And I think it's pretty transparent from that side of things. I don't know if users, people who use apps that use IPFS, are aware that these centralized elements are in play. But I think it is important, because what it means is that some third party has the ability to shut down the app. If Infura decides that they're not serving your app's files anymore, you know, with one executive decision, some leader at the Infura company can decide to cripple your entire application.
Starting point is 00:12:33 And really, like, decentralization is about building things that other people don't control. And so we see that as a huge weakness. And really, we want to build a web and, like, an application system where there is no counterparty who can decide to turn off your access to an application or an app. But isn't it true that you could use IPFS in, I mean, like, so you're contracting out centralized services, but it is possible as well to utilize, like, layers on top of IPFS to build a decentralized network of storage providers, is it not?
Starting point is 00:13:12 I think that remains to be seen. The IPFS team would say that it is possible. However, no one's done such a thing, and I think that there would be substantial challenges. Another thing I would point out about IPFS, and I think this is also something developers on top of IPFS are well aware of, is that the architecture is just fundamentally very, very slow. So if you want to open up a file,
Starting point is 00:13:34 especially one that doesn't get accessed very often, it's going to take you many seconds to get that file open. And the way the Sia network is built is fundamentally much, much faster. Not only is it much easier, but the storage element of Sia is, you know, we started with storage first and then we're moving on to these other elements second, whereas with IPFS, the storage has to be, like, bolted on later. And that really, I think, has a big impact on the user experience.
Starting point is 00:14:03 IPFS and Sia are not exactly, you know, comparable, because one is more of a content delivery network and the other is sort of a decentralized cloud service, very different use cases, right? And, you know, maybe a closer comparison would be to, like, Filecoin, but even there, I think there are quite a few differences. And so maybe before we start to delve more into these comparisons, we should get a good baseline of what Sia is trying to do. What is the product here? Sia is a progression. So it's a technology that we are iterating rapidly on. The very first thing that Sia could really do is archival storage. So if you
Starting point is 00:14:43 have data that you want to back up or protect, the Sia network is a very fast, very efficient, very cost-effective, robust, you know, decentralized place where you can store your data. So first and foremost, like, Sia is an archival platform or a backup solution. But the reason that that's what it is first is because, as you look at more things you can do with distributed data or decentralized data, we felt that was the easiest thing to build. So that's where we put all of our attention, because above everything else, we wanted a practical, decentralized storage platform as fast as possible. So that V1 would be sort of like a personal cloud.
Starting point is 00:15:29 Let's say I had a bunch of files on my Dropbox. Instead of putting them on Dropbox, I would go ahead and put them on the Sia network. But I wouldn't be serving a website, for example, from this V1. Yeah, so V1 doesn't do website serving or content delivery. However, V2 does do website serving and content delivery. And so as we are finishing up, you know, the final edges on this archival storage or, like, object storage platform, really where that moves on, if you think about the pieces, is that we have hundreds of hosts all over the world, you know, near many major cities. If you want to download a file and it's on the Sia network, the Sia network is going to have the distribution and the locality and the ability to serve that file very quickly.
Starting point is 00:16:17 And so we've been building from day one, starting in 2014, with the idea of eventually turning into an IPFS-style content delivery network or competing with, like, say, an Akamai. We've just known that that's going to be step two, not step one. So that's kind of where we're pointed for the future. So is this V2 what's live today, or is it V1 that's live right now? V1 is live right now, and we are actively building both to make V1 better and to bring V2
Starting point is 00:16:51 to production. I see. So let's start off by talking about V1 and some of the economics and design around that, and then we can shift towards V2. So in this V1, could you tell us a little bit about, okay, you know, on my hard drive currently, I have a movie collection of 100 gigabytes, and I want to get rid of my hard drive and put it on the Sia network. Can you walk me through what the process of doing that is, both what I'm doing as a user and then what's happening technically behind the scenes?
Starting point is 00:17:25 Yeah. So as a user, what you do is you'd go to our website. You'd download Sia, or if you want to get it some other way, you could clone the GitLab repo and build it yourself. Either way, you're going to have to get the Sia software and run it. Then it's going to have to sync the chain, and it's going to need funding. So the Sia network philosophy is that everything runs on money. When you upload, it costs money; when you store data, that costs money; when you download, that costs money.
Starting point is 00:17:54 And it's all on a big open marketplace. So things are generally extremely cost competitive, because every single host on the Sia network is competing with every other host on the Sia network, and price is a big factor in, yeah, which hosts get selected. So the prices are constantly being pushed down on the Sia network. So you're going to have to give the Sia client some Siacoins, and then you'll have to set up an allowance, which is basically just like a control mechanism,
Starting point is 00:18:21 so it knows how much money it's allowed to spend. That way it doesn't go and spend, you know, $1,000 a month or something. But really that's just, like, a safety feature more than, you know, a critical element. Once you're set up, you'll be able to just upload your files to the Sia network. So you will either drag and drop or open the folder uploader; there are a couple of ways to put files onto the Sia client. But once you start uploading them, it'll go ahead and saturate your home connection.
Starting point is 00:18:54 So the Sia network can do about 400 megabits per second. Most people's home connections are not that fast. So for 100 gigabytes, it might take you, you know, a full day of uploading to get that onto the Sia network. At which point, your data is on the Sia network, and you can download it any time. Sia today supports video streaming. So in your case, you gave the movies as an example. If you want to watch your movies, this is actually something you can do directly from the Sia network. So you don't even have to download them and then watch them. You can just stream them directly from the Sia network. And then the other thing that the Sia network supports, which I think is really critical
Starting point is 00:19:21 to making it a practical backup solution, is seed-based file recovery. So once all your data is uploaded and it's in good health, and the UI will show you, you know, the health of your files, you can create a snapshot. So there's a tool to walk through that says create a snapshot. It'll make a snapshot.
Starting point is 00:19:53 It'll upload the snapshot. It'll take maybe an hour to get everything in place. And then it'll report that the snapshot was successful and has completed. What this means is that your wallet seed that you use to fund your account also now has access to your data. So if you lose your computer and you have to start over, you get a new computer, you install Sia, and you open up the "type in your seed" screen. When you put your seed in, not only will you get your money back like a traditional cryptocurrency wallet, but you will also get that snapshot back, and you'll have all your data.
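The idea that one seed can bring back both the wallet and the snapshot can be illustrated with a minimal Python sketch. It is not Sia's actual derivation scheme; the labels, the `derive` and `snapshot_handles` helpers, and the use of HMAC-SHA256 are all assumptions chosen only to show that everything needed for recovery can be recomputed deterministically from the seed.

```python
import hashlib
import hmac


def derive(seed: bytes, label: str, index: int = 0) -> bytes:
    """Derive a 32-byte value from the seed for one purpose (hypothetical scheme)."""
    message = f"{label}:{index}".encode()
    return hmac.new(seed, message, hashlib.sha256).digest()


def snapshot_handles(seed: bytes, slots: int = 3) -> dict:
    """Recompute, from the seed alone, what a client would need to find a snapshot."""
    return {
        "wallet_key": derive(seed, "wallet"),
        "snapshot_key": derive(seed, "snapshot-encryption"),
        # Deterministic slot identifiers a client could ask hosts about.
        "slot_ids": [derive(seed, "snapshot-slot", i).hex() for i in range(slots)],
    }


# A throwaway seed for the example; a real seed would come from the wallet.
seed = hashlib.sha256(b"example seed phrase, never use a real one here").digest()

original = snapshot_handles(seed)   # computed when the snapshot is created
recovered = snapshot_handles(seed)  # recomputed later on a brand-new computer
print(original == recovered)
```

Because nothing except the seed feeds into the derivation, no extra file has to survive the loss of the original machine, which is the property described in the interview.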
Starting point is 00:20:32 What exactly is going into that snapshot beyond just my crypto private key, for example? Yeah. So when you upload data to the Sia network, what you're doing is you're storing it on a bunch of machines around the world. Each piece of data ends up on 30 different machines. And to make things fast, we don't have, like, a DHT or any sort of lookup mechanism. So what you need to do is you have to remember where each piece is, and you're going to have an encryption key for that piece.
Starting point is 00:21:06 You're going to have the IP address of the host, and you're going to have the content ID that you give to the host to get the piece back. And so 100 gigabytes might have, you know, 50,000 pieces in it. So you're going to have to remember all the metadata for those 50,000 pieces. So what a snapshot is, is essentially it bundles up all that metadata and stores it on the Sia network, and it gets kind of complicated here, but in a way that we can recover it using only your seed. Basically, you distribute the metadata to the hosts in a very specific pattern. When you're doing recovery later, you can ask hosts for a specific location of data
Starting point is 00:21:44 rather than a specific content address. That location will unpack to all of your metadata. That's really cool. So essentially, you could store files in Sia, and this opens up some interesting use cases. So someone could do, like, an interesting kind of inheritance planning scheme where, you know, they upload something to Sia. It could even be, like, encrypted blockchain private keys, for instance, and then have a single key that's stored somewhere. And then as part of a will, for example, that key gets given to the beneficiary.
Starting point is 00:22:24 And then that beneficiary uses that key to get files. It could be pictures. It could be letters. or whatever, but it would be sort of like a lockbox where you keep files until like a later date. It would be interesting also if, you know, if you guys were to implement some sort of time lock mechanism where like a seed could only unlock files. I'm just like thinking about it here, but like where a seed could only unlock files after a single, like after some time has passed or something like that.
Starting point is 00:22:51 Yeah. Off the top of my head, I don't know if there's an easy way to do that. But definitely, aside from the time control thing, creating a lockbox on Sia is something that you could do today, I believe. Yeah, that's cool. So let's talk about the architecture a little bit. Because Sia is a little different from, well, we've already talked about how different it is from the other file storage systems that exist.
Starting point is 00:23:16 But it leverages a lot of different technologies that, you know, we all know and understand in the crypto space. So it has some sort of a blockchain mechanism, but it also has file storage, and it also uses payment channels and encryption. So talk about the technical architecture and how all those building blocks fit together. Yeah. So philosophically, when we're building Sia, we use two bedrocks, just two requirements. The first requirement is that it has to be fully decentralized.
Starting point is 00:23:48 So that decentralization always trumps every other decision. If it's not decentralized, then, you know, we don't go down that route. But then the second thing that we always focus on, and we ask ourselves every time we build something, is how fast is this? Is it as fast as possible? Like, is there theoretically a faster way to do something? And if the answer is yes, there are theoretically faster ways to do it, we make sure that we build so that performance is ahead of everything else. And so what that means on the Sia network today, and moving forward, is basically that the whole thing is
Starting point is 00:24:27 set up to be completely point to point. So something that a lot of distributed systems do is they have some sort of lookup routing or content hashing or, you know, they'll have these distributed algorithms to find data. And that means that you have to ask someone, who has to ask someone, who has to ask someone, and that, like, multi-hop step takes a lot of time. On the Sia network, every single request is point to point. As soon as you want a file, you immediately know exactly who's storing the file. You find that out in under a microsecond. And then after that, you can do the network request to fetch the data. So I think that's something that's really critical to us. So another thing is, like, scalability. You know, blockchains don't scale very well.
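The point-to-point design described above, where the uploader remembers a host address, a content ID, and an encryption key for every piece, can be sketched as a purely local lookup table. This is a minimal illustration, not Sia's data model: `PieceLocation`, `LocalIndex`, and the example address are hypothetical names and values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PieceLocation:
    host_address: str      # where the piece lives
    content_id: str        # what to ask the host for
    encryption_key: bytes  # how to decrypt what comes back


class LocalIndex:
    """Client-side map from (file, piece number) to the host holding that piece."""

    def __init__(self):
        self._pieces = {}

    def remember(self, file_name: str, piece_index: int, location: PieceLocation) -> None:
        self._pieces[(file_name, piece_index)] = location

    def locate(self, file_name: str, piece_index: int) -> PieceLocation:
        # A dictionary lookup in local memory: no other node is consulted.
        return self._pieces[(file_name, piece_index)]


index = LocalIndex()
index.remember(
    "movie.mkv",
    0,
    PieceLocation("203.0.113.10:9000", "piece-0-id", b"\x00" * 32),  # made-up values
)
print(index.locate("movie.mkv", 0).host_address)
```

The trade-off is the one raised in the interview: the lookup is immediate because no multi-hop search happens, but the client is responsible for keeping this metadata, which is what the snapshot feature backs up.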
Starting point is 00:25:11 They only get a couple of transactions per second. On the Sia network, you know, I think during peak times the Sia network has probably seen more than a million transactions in a single hour. And of course, that's not happening on chain. That's all going over payment channels. And so we use something similar to the Interledger Protocol. Basically, every time you download data, you download a small piece of data. So you pay for it first. Then you trust the host is going to send it to you.
Starting point is 00:25:39 So the host could steal that small payment, but it's very tiny. And if they do, you know they're dishonest. You know not to use them anymore. You know to penalize them. And so there's a very tiny amount of trust that you extend to the host. Then they give you data, then you extend another tiny amount of trust, then they give you the data.
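The pay-a-little, receive-a-little loop described above bounds how much a dishonest host can take. The following Python sketch simulates that bound under simplified assumptions: payments are plain integers in some tiny unit, and the `download` function and its `host_cheats_at` parameter are hypothetical, standing in for a real payment channel.

```python
def download(chunks, price_per_chunk, host_cheats_at=None):
    """Pay for one chunk at a time; stop using the host the moment it fails to deliver."""
    paid = 0
    received = []
    for index, chunk in enumerate(chunks):
        paid += price_per_chunk  # renter pays first, extending a tiny amount of trust
        if host_cheats_at is not None and index == host_cheats_at:
            break  # host kept the payment and sent nothing, so the renter walks away
        received.append(chunk)
    lost = paid - len(received) * price_per_chunk
    return received, paid, lost


chunks = [f"chunk-{i}" for i in range(1000)]
_, paid_honest, lost_honest = download(chunks, price_per_chunk=2)
got, paid_cheated, lost_cheated = download(chunks, price_per_chunk=2, host_cheats_at=400)
print(lost_honest, lost_cheated, len(got))
```

However long the download runs, the most the renter can lose to a cheating host in this model is the price of a single chunk, which is why the amount of trust extended can stay very small.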
Starting point is 00:26:03 And this allows us to scale very well. You know, we don't need a payment channel system that's as complicated as the Lightning Network. We do data over state channels. So, you know, we're not just sending money. We're simultaneously updating the file contract for what the host is required to store. So this is specifically for when I'm retrieving data, right? So when I'm, for example, creating a new contract with someone, then I will make a new, what were they called, file contracts or something? File contracts, yeah. File contracts.
Starting point is 00:26:28 And then if I'm constantly requesting data, let's say I'm streaming it like a movie or something, right, then I use the payment channel. But what if I don't want to be retrieving data very often, and I'm just using it primarily for archival purposes? Do I still use payment channels in that case? Yep. So what we do is when you create your allowance, what happens
Starting point is 00:26:58 is the Sia node is going to go and form a contract with basically the 50 best hosts on the network. So you're going to end up with 50 state channels that you use for both uploading and downloading. And so those state channels have a lifetime of several months. And then any time you upload or download, you're going to update these state channels. And so you only need to make on-chain transactions on the Sia network about once a month, and then otherwise all activity, uploading, downloading, or even just, you know, storing and sitting there, is going to happen through these payment channels, state channels. And so when I give this data to the 50 best hosts, right, I'm assuming there's some sort of Reed-Solomon type encoding going on. What are generally the best practices here to make sure that
Starting point is 00:27:42 your data is well redundantified? So you hit the nail on the head with Reed-Solomon. By default, we use a 10-of-30 scheme. So every piece of data is going to go to 30 out of those 50 hosts. And of the 30 who receive it, any 10 are sufficient to recover the original data. And so basically what the Sia network is going to do, because hosts on the Sia network, you know, they're not unreliable, right? They tend to be pretty reliable, but they're not Amazon reliable. And we don't expect them to be Amazon reliable.
Starting point is 00:28:12 And we don't expect them to be Amazon reliable. because that's very expensive and we want to be cost competitive. So instead, we just monitor and we see, have hosts gone offline, are there issues? And so we have this constant monitoring service that's checking in on the health of your files. And this information, the health information that is being monitored, I presume it's being monitored by the individual nodes. Yeah. So the person who uploaded the data is the person who's doing the health monitoring. Does this information somehow get stored somewhere where the entirety of the network has access to it?
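The 10-of-30 scheme mentioned above lends itself to a quick worked calculation. The sketch below is a back-of-the-envelope model in Python, not a measurement of the Sia network: it assumes each host is online independently with the same probability, which real hosts will not satisfy exactly, and the `recoverable_probability` helper and the sample availability values are chosen only for illustration.

```python
from math import comb

NEEDED = 10  # pieces sufficient to rebuild the data
TOTAL = 30   # pieces uploaded, one per host

redundancy = TOTAL / NEEDED          # storage overhead factor
tolerated_failures = TOTAL - NEEDED  # hosts that can vanish before data is lost


def recoverable_probability(p_online, needed=NEEDED, total=TOTAL):
    """Probability that at least `needed` of `total` independent hosts are online."""
    return sum(
        comb(total, k) * p_online**k * (1 - p_online) ** (total - k)
        for k in range(needed, total + 1)
    )


print(redundancy, tolerated_failures)
for p in (0.5, 0.8, 0.95):  # hypothetical per-host availability values
    print(p, recoverable_probability(p))
```

Under these assumptions, even hosts that are individually far less reliable than a large cloud provider give a high chance that at least 10 pieces are reachable, and periodic health checks and repair keep the number of live pieces from drifting down over time.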
Starting point is 00:28:49 Is that where the blockchain comes in? Explain where also the blockchain falls into this architecture. We'll come back to the blockchain thing. So the answer is no. The health is not monitored by anyone except the person who uploaded the data. And I think this is one key departure from Filecoin's ambitions. So it's a very difficult theoretical problem. to do decentralized data repair.
Starting point is 00:29:16 to do decentralized data repair. So what Filecoin wants to do is that you upload data to the Filecoin network and then you disappear, and the Filecoin network will automatically restore the health of your file as hosts go offline. But the challenge here is that Filecoin can't restore the health of your file unless it knows how to erasure code things. If it knows how to erasure code things, hosts can collaborate with each other to deduplicate the data and create this sort of false redundancy, make things look highly redundant when they're not, or make things look decentralized when they're actually, you know, six copies of the same file on one hard drive.
Starting point is 00:29:52 And the answer to, like, how do you prevent hosts from doing that is not easy. And I would say it is a strictly unsolved problem. I don't believe Filecoin has solved the problem. Certainly, it's been one of the challenges for them in launching. So we sidestep the issue by just saying the network will not repair your files for you. You have to repair the files yourself. And so that goes back to your question, Sunny, what do you have to do as someone storing data on the Sia network?
Starting point is 00:30:21 You have to run your node every once in a while. It doesn't need to be on every day, but maybe, you know, if you leave it overnight once a week or, you know, leave it on all weekend once a month, that'll give it a chance to check in with the files, see what the health is. If any files are kind of low on health, it can restore them to full health.
Starting point is 00:30:42 But how does that still help with solving, like, Sybil attacks on the host side? How do I know I'm not getting false redundancy on Sia? Yeah. So the important thing here, actually, so Sybil attacks and false redundancy are, I think, two different questions. False redundancy, we have a super simple solution, which is that we encrypt the data after we erasure code it. So first you make these 30 redundant pieces.
Starting point is 00:31:12 And of course, every piece is different. And then you encrypt each piece with a different key. So we have super high confidence that at least three copies of the data definitely exist because we know no one has the cryptographic ability to deduplicate that. What was this term you used, erasure code? Yeah, erasure coding. Erasure coded data means I give you a big set of data. And then if some attacker or some, you know, process in the middle cuts out pieces of it,
Starting point is 00:31:39 erases parts of it, the erasure code means that up to some threshold of erased data, you can restore the things that got deleted. And so it doesn't matter that things got lost. Okay. Is this similar to how, like, a RAID would work or something like that? Yeah. RAID is a very simple form of erasure coding. So I guess one of the reasons it makes sense in Sia, but it's a little bit different
Starting point is 00:32:04 in Filecoin, or at least in Sia v1, is because here it's my personal data that I'm trying to distribute to a number of hosts. And so what I can do is I can encrypt; to each of the hosts, I can give a different encrypted copy. And I know that they can't be colluding because they don't know how to decrypt one person's data and turn it into another person's data. But with Filecoin, what they're trying to do is more for public data rather than private data. And so they're assuming, let's say it's Wikipedia, right? And so there's no private decryption key. You know, that decryption key has to be public to everyone in the world.
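The encode-then-encrypt idea described above can be sketched in a few lines of Python. This is a toy illustration only, not Sia's implementation: the 2-of-3 XOR parity code stands in for the larger erasure code discussed in the episode, and `keystream_xor` is a hypothetical stand-in cipher built from SHA-256 (a real system would use a vetted cipher). All function names are assumptions made for this sketch.

```python
import hashlib
import os
from typing import List, Optional


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def encode_2_of_3(data: bytes) -> List[bytes]:
    """Toy RAID-like erasure code: two data halves plus one XOR parity piece."""
    if len(data) % 2:
        data += b"\x00"  # toy padding so both halves have equal length
    half = len(data) // 2
    a, b = data[:half], data[half:]
    return [a, b, xor_bytes(a, b)]


def recover_2_of_3(pieces: List[Optional[bytes]]) -> bytes:
    """Rebuild the data from any two surviving pieces (None marks a lost one)."""
    a, b, parity = pieces
    if a is None:
        a = xor_bytes(b, parity)
    elif b is None:
        b = xor_bytes(a, parity)
    return a + b


def keystream_xor(key: bytes, piece: bytes) -> bytes:
    """Toy stream cipher (NOT production crypto): XOR with a SHA-256 counter
    keystream. Applying it twice with the same key decrypts."""
    stream = b""
    counter = 0
    while len(stream) < len(piece):
        stream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return xor_bytes(piece, stream[:len(piece)])


# Encode first, then encrypt every piece under its own random key.
data = b"my personal file"
pieces = encode_2_of_3(data)
keys = [os.urandom(32) for _ in pieces]
stored = [keystream_xor(k, p) for k, p in zip(keys, pieces)]

# The host holding piece 0 goes offline; decrypt the two survivors and repair.
surviving = [None, keystream_xor(keys[1], stored[1]), keystream_xor(keys[2], stored[2])]
restored = recover_2_of_3(surviving)
```

Because each piece is encrypted under a different key that only the uploader holds, a host cannot derive one stored piece from another, which is the property the speakers rely on to rule out false redundancy for private data.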
Starting point is 00:32:44 So how do you solve this in, for example, Sia v2? So the solution that we are favoring right now is that you would actually publish, you'd have two versions of the file. So let's say for a public thing, instead of doing a 10 of 30, you might do a 10 of 40 or a 10 of 50. And then you would publish 20 out of those 50 pieces, which means that if hosts are colluding, if those 20 pieces disappear, you still have these 30 pieces that you know can't be colluded over
Starting point is 00:33:15 because you never release the keys to those. And this also kind of gives you a nice canary because it lets you see if hosts are colluding. If suddenly a file unexpectedly disappears, you know that hosts are colluding over the public data. But it doesn't matter because you kept some of the redundancy back. So I think that's the strategy we'll be using in V2 for public data. You just won't publish all of it at once. So you mentioned, like, the decentral, you know, you really focused on the decentralization as a
Starting point is 00:33:46 primary focus. Obviously, the decentralization helps us with the censorship resistance and, you know, all that fun stuff. But does it also help with the costs of these things? So you mentioned that Sia's cost of storage is like 10x lower than AWS. Is it something about the decentralization that allows this to happen? Or could I have created a centralized Airbnb for storage? And I could have gotten the same cost savings. Yeah, that's a great question. So I think in the long term, the decentralization is absolutely critical to how the pricing
Starting point is 00:34:22 model works. What we see with centralized systems and services is that even if they're competitive for a while, once they've obtained the brand, once they've obtained the trust, once they've obtained the market share, they all, I mean, Google is a famous recent example, they always switch gears from being super consumer friendly, super cost-oriented, super everything that people want, to being this more value-extracting thing. Prices go up, terms of service get nasty.
Starting point is 00:34:50 Even if a centralized service can provide good prices in the short term, you're really depending on the goodwill of that entity to maintain that good pricing once you depend on it. And I think we've seen consistently in the big business world, you know, once a business has established a moat, it cashes in on that moat by increasing prices and just making things better for the company and worse for the consumer. And so I think in the long term, this decentralization is really important for consumer protection. It's the only way that consumers can guarantee that their service will continue
Starting point is 00:35:25 to work as it did when they joined. So let's come back to the technical components just briefly. Where does the blockchain fit into this? Because I know there's a blockchain in here somewhere. Yes. Great question. So basically the chief challenge that we face with a decentralized payment network, right? So we're exchanging value for storage. And we're also requiring hosts to fulfill certain promises.
Starting point is 00:35:53 And the big promise is, like, in the file contract, the file contract says, you know, you have to hold this data for this amount of time. And so what we need is some sort of system that can check in on the hosts and then can decide, okay, you kept the data, here's a payout, or, oh, no, you didn't keep the data. We're going to penalize you. We're going to slash you. And in a decentralized setting, I think the only way to do that is to have this blockchain
Starting point is 00:36:20 where everyone can validate, did the host fulfill the promise or not? And then based on that update, this decentralized ledger to say whether or not the host gets paid. So where the blockchain comes in is it stores the file contracts, and specifically it audits whether or not the host has correctly stored the data that the host was supposed to store. And then from there, we can penalize or pay the host. So that's why we need our own blockchain. Then it's also just helpful in general. Everything runs on money. If you want a decentralized money, I think you need a blockchain somewhere in there.
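The pay-or-penalize decision described above can be sketched as a small settlement function. This is a hypothetical model, not Sia's actual contract format: the field names, the idea of a single proof deadline expressed as a block height, and the choice to burn the forfeited collateral are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class FileContract:
    """Hypothetical on-chain record of a storage promise."""
    renter_payment: int    # coins the renter locked up for the host
    host_collateral: int   # coins the host locked up as a guarantee
    proof_deadline: int    # last block height at which a proof is accepted


def settle(contract: FileContract,
           proof_height: Optional[int],
           proof_valid: bool) -> Dict[str, int]:
    """Decide the payout once the contract window has closed.

    proof_height is the block in which the host's storage proof landed,
    or None if the host never submitted one.
    """
    kept_promise = (
        proof_valid
        and proof_height is not None
        and proof_height <= contract.proof_deadline
    )
    if kept_promise:
        # Host is paid and gets its collateral back.
        return {"host": contract.renter_payment + contract.host_collateral,
                "renter": 0,
                "burned": 0}
    # Host is slashed: the renter is refunded, the collateral is destroyed.
    return {"host": 0,
            "renter": contract.renter_payment,
            "burned": contract.host_collateral}


contract = FileContract(renter_payment=100, host_collateral=50, proof_deadline=1_000)
honest = settle(contract, proof_height=990, proof_valid=True)
missing = settle(contract, proof_height=None, proof_valid=False)
late = settle(contract, proof_height=1_001, proof_valid=True)
```

The point of putting this logic on a blockchain is that every node can run the same check and agree on the outcome, so neither the renter nor the host has to trust the other.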
Starting point is 00:36:57 But for that, you could use, you know, Bitcoin or whatever. So we need Sia for file contracts. So the settlement layer is the blockchain. This also stores the contracts between users and hosts, and the payments for storing, but also downloading and uploading files, happen over the payment channel. So there's a layer one blockchain and a layer two payment channel solution, or state channel, built into Sia, all within the Sia stack.
Starting point is 00:37:33 Yep, that's correct. Okay. And which blockchain are you using? Did you build your own or are you leveraging an existing framework? Yeah, so we built our own. It's highly inspired by the Bitcoin blockchain, but it is from the ground up. So we wrote all the code that's in our software.
Starting point is 00:37:54 But it's very similar in design philosophy to the Bitcoin blockchain. What is the functionality of Sia, of the base layer, the L1 layer, that wasn't available on Bitcoin? Like, could we have done this as an L2 solution using Bitcoin, like, payment channels? Were there opcodes that were missing that were needed? Yeah. So you have to remember that we launched Sia in 2015, June 2015.
Starting point is 00:38:23 That's pre-Ethereum. I believe that's definitely pre-Tendermint. And so we were working with a much, much smaller technology base. Bitcoin today still can't do the Sia file contracts. We're missing two really critical things, I think, for the Sia network. The first is that Bitcoin doesn't have updatable payment channels. So the Lightning Network is based on HTLCs and these, like, crazy draconian penalty schemes that really just don't work with Sia's complexity. But eltoo is an example of a network being built, or an extension to Bitcoin.
Starting point is 00:38:59 And I think if we had eltoo, we might be able to do everything we need on the Bitcoin network directly. The other thing that the Sia network really needs that the Bitcoin network doesn't offer is we need decentralized entropy. And so we get this by looking at the block hash. So the block hash is, you know, the tail bytes are highly random, and we can depend on these to essentially have hosts construct secure proofs that they're actually storing data. And that's a bit of a more complicated topic. Because block hashes are not perfectly random, you have to be careful with how you use them.
Starting point is 00:39:37 But the Sia network is careful with how it uses block hashes, and makes sure that despite the fact that they're not perfectly random, everything still checks out economically. I mean, so obviously the Bitcoin network has block hashes, but I guess the issue is that you can't access that data from within a UTXO contract. That's correct. So that's the challenge.
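One way to picture how a block hash can drive a storage proof is a Merkle-tree challenge: the contract commits to a Merkle root of the file, a later block hash selects which segment the host must reveal, and anyone can verify the revealed segment against the root. The sketch below is an assumption-laden illustration rather than Sia's actual protocol; the segment size, the use of SHA-256, the leaf and node prefixes, and the `challenge_index` derivation are all choices made for this example.

```python
import hashlib
from typing import List, Tuple

SEGMENT_SIZE = 64  # bytes per leaf; an illustrative choice


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def split_segments(data: bytes) -> List[bytes]:
    """Cut data into fixed-size segments, padding the count to a power of two."""
    segs = [data[i:i + SEGMENT_SIZE] for i in range(0, len(data), SEGMENT_SIZE)]
    if not segs:
        segs = [b""]
    while len(segs) & (len(segs) - 1):
        segs.append(b"")
    return segs


def merkle_levels(segments: List[bytes]) -> List[List[bytes]]:
    """All tree levels, leaves first, root last."""
    level = [h(b"\x00" + s) for s in segments]
    levels = [level]
    while len(level) > 1:
        level = [h(b"\x01" + level[i] + level[i + 1])
                 for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def make_proof(segments: List[bytes], index: int) -> Tuple[bytes, List[bytes]]:
    """Host side: the challenged segment plus its sibling hashes up the tree."""
    path = []
    i = index
    for level in merkle_levels(segments)[:-1]:
        path.append(level[i ^ 1])
        i //= 2
    return segments[index], path


def verify_proof(root: bytes, index: int, segment: bytes, path: List[bytes]) -> bool:
    """Verifier side: recompute the root from the segment and sibling hashes."""
    node = h(b"\x00" + segment)
    i = index
    for sibling in path:
        node = h(b"\x01" + node + sibling) if i % 2 == 0 else h(b"\x01" + sibling + node)
        i //= 2
    return node == root


def challenge_index(block_hash: bytes, contract_id: bytes, num_segments: int) -> int:
    """Derive the challenged segment from a block hash the host cannot predict."""
    return int.from_bytes(h(block_hash + contract_id), "big") % num_segments


data = bytes(range(256)) * 2                    # 512 bytes -> 8 segments
segments = split_segments(data)
root = merkle_levels(segments)[-1][0]           # committed when the contract is formed
block_hash = h(b"hypothetical block header")    # stand-in for a real block hash
idx = challenge_index(block_hash, b"contract-1", len(segments))
segment, path = make_proof(segments, idx)
ok = verify_proof(root, idx, segment, path)
```

Since the host does not know the challenged index until the block is mined, it has to keep every segment to be sure of answering. As noted in the conversation, block hashes are not perfectly random, so a real design has to account for miners who might try to influence them.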
Starting point is 00:39:57 And so if we got that and we also got eltoo, then I think you could probably build something that's functionally equivalent to Sia's file contracts on Bitcoin. So something that maybe a lot of listeners don't know is that you sort of, in part, inspired me to go work on Cosmos, because I met you the first time at CoinDesk Consensus in 2017. And at the time I was interning at ConsenSys, with the Y, and roughly around the same time was when Storj, or "Storage," or however you say it, just transitioned to shifting onto Ethereum. And so I was kind of asking you, like, why don't you guys shift onto Ethereum as well? It seems like that's the hot thing these days. And then you kind of went and
Starting point is 00:40:43 pitched me on, explained to me why application-specific blockchains make sense, which is kind of what led me down to Cosmos. So could you explain a little bit now for the listeners why not shift onto Ethereum or something like that? Yeah, absolutely. This is a good point. Even if Bitcoin had all the primitives that we needed to build Sia on top of Bitcoin, we probably still would have our own blockchain. And the reason for this comes down to governance. So on the Sia network, every single user of the Sia blockchain is storage oriented. They're thinking about how do I store data, how do I retrieve data, how do I make money from hosting? They're all aligned around this common goal of data storage. And so that means if something comes up, if we need to make some
Starting point is 00:41:28 sort of blockchain governance decision, a hard fork, or we need to extend the protocol in some way or something like that, everybody's aligned around the same goal. And so it's going to be easy to support the common use case. On a blockchain like Ethereum, you have this competing interest problem, this political thing where you have some people who really care about stablecoins and DeFi, some people who really care about CryptoKitties, some people who really care about ICOs, and then also people who care about storage. And so if the storage network says, oh, we need to do XYZ to make storage successful, and the DeFi people are like, wait, you know, XYZ is going to harm the DeFi use case, you have this internal battle and it's much more difficult for the storage people to push through
Starting point is 00:42:14 storage stuff. Or, like, if DeFi is the big thing and DeFi is like, yeah, we need XYZ, and the storage community is like, wait, that's going to break us, well, too bad, like, you're a small player. And so I really like the application-specific blockchain because it means that every governance decision, every community decision, is oriented around a common goal, and that just makes the storage platform a lot more powerful and a lot more agile. So when you built Sia initially, proof of stake hadn't yet been experimented with, at least not to the scale at which it is today. Does Sia still use proof of work? Yeah, so Sia is a proof of work blockchain.
Starting point is 00:42:54 I'm a huge proof of work proponent. I'm really not a fan of proof of stake, although I know lots of people are happy to debate me on that. I think right now where we are with the Sia blockchain is actually 95 to 98% of our time is focused on the application layer. So we assume, you know, a decentralized consensus mechanism. We assume a file contract. We assume a payment channel system.
Starting point is 00:43:23 And then once we assume all those things, 98% of our dev time is focused on using those primitives to build a decentralized storage network. And so, you know, if three years from now, it turns out that proof of stake is absolutely the way to go, or if three years from now, it turns out that we need to switch our underlying consensus mechanism, you know, that's something we're not really worried about today because doing so we should be able to carry over, you know, 98% of our effort. So I think right now
Starting point is 00:43:54 we don't even think of the consensus element just because it's so in flux in terms of leading research. We're really happy to be proof of work and we think we have a secure system as is. And then if we can get more by switching, we're happy to wait it out. And we don't think there will be a big switching cost. Most of the things we've built will just, we'll just be able to transplant over if we ever do need to switch. Have you ever thought about maybe using some sort of proof of storage style system, so similar to like Chia kind of stuff where let's say providers don't have contracts that they're fulfilling, but they have extra space. Maybe they can use that for consensus. And then as contracts come in, they can, you know, shift that around. Or even if you take it even a step
Starting point is 00:44:41 further, Filecoin seems to want to, they think they can even do proof of useful storage, not useless storage like Chia. Do you think that's even a feasible thing? Yeah. So I don't think proof of useful storage is ever going to be a desirable consensus mechanism. Chia, of course, has sidestepped the issue by having proof of useless storage. I think they have their own set of challenges, primarily the time element. How do you prove the data has been stored for a specific amount of time?
Starting point is 00:45:11 Are there ways to cheat that? So, fun fact, back in 2014, when we started building the Sia network, it was actually a proof of storage-based system. It was BFT-based. It looked very similar to a lot of proof of stake networks today. It had a lot of 51% assumptions or 67% assumptions. And then it was all based on showing storage and proving storage. And so I actually took this and presented it to the Bitcoin Core developers.
Starting point is 00:45:37 And specifically, Greg Maxwell kind of broke it down, and he pointed out all these issues with what we had built. And he convinced me substantially that I was not ready to build my own consensus system and that I should just use proof of work. So we kind of abandoned that. I think many of the issues that he pointed out then still apply today, both to Chia and Filecoin. I don't think they've solved a lot of the fundamental issues. But also, like, I'm happy to see the research, happy to see the experimentation.
Starting point is 00:46:09 Maybe they'll have some incredible, you know, discovery or breakthrough. And in the meantime, as, like, someone who's trying to build something that's going to be used as soon as possible, I think proof of work does fit our use case. And so since it's what we know best, we're happy to use that for now. And like I said, you know, if there's some big consensus breakthrough, I think it'll be relatively easy for us to move over down the road. So given that we're talking about proof of work, you know, the other thing that you're quite well known for in the crypto community is also a lot of your research on ASICs and
Starting point is 00:46:45 whatnot. That day that we met in 2017, you convinced me of two things. One, which was application-specific blockchains, and two, ASICs are good, which heavily inspired a lot of my design for how atoms work in the Cosmos system. So can you tell us a little bit about why you really believe in ASICs and then why you even went ahead and built an ASIC company? That's a very interesting question. The reason I believe in ASICs is because it creates alignment between your consensus builders and the long-term health of the network. Well, and the long-term value of the coins that they mine. Basically, an ASIC is a giant upfront payment for hardware that has exactly one purpose.
Starting point is 00:47:26 The only way you can see ROI on that hardware is to mine coins, and those coins have to have value. So if you're mining coins that have no value, you're never going to see ROI. To the best of our ability, we want to build proof of work blockchains such that if the network's not healthy, the coin price is also not healthy and the miners aren't getting paid. That way, the miners are strongly motivated to push the network forward and maintain the health of the network, make sure it's not getting attacked, et cetera. Whereas with, like, you know, GPU mining, if one network fails, you just jump to the next one. You know, if, yeah, I don't even know what's GPU mined these days.
Starting point is 00:48:03 Well, I'll say Grin, because there are no Grin ASICs yet. If Grin fails, a GPU miner can just jump back to Ethereum and they'll see maybe a 1 to 2% reduction in their revenue, but it's not going to be a big deal. They're not married to Grin the same way that an ASIC miner is going to be married to its cryptocurrency. So we think that's very valuable from a security perspective. Yeah. And so then the follow-up to that was sort of, why did you go ahead and start an ASIC company? In 2017, Nebulous created a subsidiary called Obelisk. Obelisk's purpose was to become a mining entity.
Starting point is 00:48:44 Nebulous is your company? Yes, Nebulous is the parent company of Sia. Nebulous has two major projects. One is Sia and the other is Obelisk. So between 2014 and 2017, Sia was the only focus for Nebulous. Starting in 2017, it also had this mining focus. And in 2017, this is when a lot of the drama with Bitmain was really heating up. Bitmain was making a lot of political power plays against
Starting point is 00:49:10 Bitcoin. Bitmain was controlling a lot of the alts. And also just the general sentiment at the time and, you know, my personal belief at the time was that having distributed hash rate was really important. And so we didn't want any single party to own more than 51% of the hash rate of the Sia network. And we felt Bitmain was a big risk. So Obelisk was actually an attempt to give users an alternative to Bitmain for mining on the Sia network. And so we did. We created an ASIC company. We designed a chip.
Starting point is 00:49:47 We manufactured the chip. We designed a rig and manufactured the rig. And we shipped something like, I think, 15,000 machines total to purchasers on the Sia network. And so for a while, we had some of the best distributed hash rate of any ASIC-mined blockchain. As it turns out, I think that the more we studied Bitcoin, and if you look at the people who really dive deep in proof of work in 2019, and I think this is really just an idea that started to take root in mid to late 2019, is that it doesn't actually matter if a majority of the hash rate is controlled by one entity. Because that entity is still financially heavily aligned, that entity
Starting point is 00:50:30 still has to worry about hard forks. So really, all you want from a security perspective is one, an entity that's heavily financially aligned with your network. And then two, an entity that is capable of being threatened. So if it's a perfect monopoly and they know no one's ever going to compete with them, that's a bad situation. But if they have like 80% of the hash rate and there's some, you know, there are some other companies or some other opportunity for people to step in.
Starting point is 00:50:57 If they get lazy or slow or stop innovating, there's an opportunity for that 80% to switch hands from one company to another. That I think is sufficient for a secure network. And this is quite a controversial opinion. I do think over the next two or three years, it will become more broadly accepted. I think it's very sound from a research perspective, but it's certainly very different from how most people were thinking about proof of work blockchains between 2010 and 2019.
Starting point is 00:51:27 And so what was, like, different about what Obelisk was doing is, you know, I know one of the things is there's a heavy emphasis on open source hardware. What made Obelisk different than Bitmain or Innosilicon and stuff? Yeah. So we had a couple of really big things we cared about. One was open source. We wanted the firmware to be open source. We wanted the hardware to be open source. We wanted the gate level to be open source.
Starting point is 00:51:54 We wanted a competitive hashing network. We wanted it to be such that, you know, a Bitmain couldn't lock everyone out. So we wanted to pave the steps so that if someone wanted to come in and compete with us, they would have as low a barrier to entry as possible to do that. Another thing that we really cared about is transparency. And this is actually important to the person buying the ASIC. It's less important to the network health, more important to the person buying the ASIC. What we saw Bitmain and Innosilicon do, and continue to do even today, is they would sell a bunch of hash rate.
Starting point is 00:52:31 They wouldn't tell you how much they had manufactured. And so miners would go in and they'd make these projections. They're like, okay, well, as long as Bitmain made less than 20,000 machines, then I'm going to ROI and I'm going to make a ton of money. And then what Bitmain did was they made, you know, 150,000 machines. And they sold all 150,000. And so they have 150,000 customers who all ROI only if less than 20,000 machines were sold. And the result is that if, you know, the block reward for two years is $40 million, Bitmain made a profit of $100 million.
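The arithmetic behind that complaint is easy to check. The sketch below uses only the round numbers quoted in the conversation and assumes, for simplicity, that every machine has the same hash rate, that no other miners exist, and that electricity and other running costs are ignored.

```python
def revenue_per_machine(total_block_reward_usd: float, machines: int) -> float:
    """Share of the total block reward earned by one machine when all
    machines are identical (mining is a fixed pie split by hash rate)."""
    return total_block_reward_usd / machines


two_year_reward = 40_000_000   # total block reward over two years, as quoted

expected = revenue_per_machine(two_year_reward, 20_000)    # what buyers planned on
actual = revenue_per_machine(two_year_reward, 150_000)     # what they actually face
dilution = expected / actual                               # how far off the plan was

print(f"planned revenue per machine: ${expected:,.2f}")
print(f"actual revenue per machine:  ${actual:,.2f}")
print(f"each buyer earns {dilution:.1f}x less than projected")
```

Under these assumptions a buyer who budgeted around $2,000 of revenue per machine ends up with roughly $267, which is why undisclosed production volumes matter so much to the people buying the hardware.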
Starting point is 00:53:04 Bitmain made more money than it was possible to even mine, selling, you know, ASICs for a cryptocurrency. And of course, this money comes directly out of the pockets of ASIC buyers. And so Obelisk really wanted to fight that. A lot of the mining space is absolutely saturated with these crazy dark patterns, patterns designed to abuse the customer and abuse the ecosystem for the profit of the manufacturer. And Obelisk really wanted to take a stand against that. You know, we just kind of realized that if Bitmain has sales channels and they're going to be able to sell, it's very exhausting for Obelisk to try and, like, shut down those dark patterns. If Bitmain has a source of people that they can talk to and sell to that Obelisk doesn't, you know,
Starting point is 00:53:47 doesn't have a communication channel with, there's nothing we can do to stop those sales. So could you tell us a little bit about this, you know, infamous hard fork in Sia, where I remember I was, like, peripherally following it, but there was, like, a lot of drama around it. Essentially it was a hard fork designed to brick a lot of the Bitmain and Innosilicon ASICs, but not the Obelisk ones. One, how does that even work technically? And two, why was that done?
Starting point is 00:54:18 And is that, like, you know, potentially a conflict of interest? Yeah, so I will say that is the most stressful and difficult decision that I've ever had to make. So practically speaking, there was no conflict of interest. Obelisk had already sold all of its hardware. Obelisk doesn't mine itself, never did. So we've never mined a substantial amount on the Sia network. Like, yeah, sure, we had machines that we had to test. We had to make sure they worked, and that was happening on the Sia network. But we never controlled, like, say, even 1% of the Sia hash rate as Obelisk. And so the actual conflict of interest was a lot smaller than the apparent conflict of
Starting point is 00:54:57 interest. There's still a conflict of interest there just because, you know, we did control Obelisk. More sales for Obelisk meant, you know, more revenue for us, more profit for us. But a lot of people assumed that Obelisk was mining. Obelisk had claimed at one point and had said... Is Obelisk Nebulous's primary revenue? At the time, yes. It was the primary cash flow for Nebulous. And, you know, Obelisk had stated that it would not mine more than 25% of the hash rate of the network, which, of course, most people took to mean that we were mining 25% of the hash rate of the network. Or I think 20%, one of those. Either way, we had carved out the ability for us to mine a substantial fraction of the network.
Starting point is 00:55:38 That we never did didn't help with the apparent conflict of interest. So you had all these, like, conflict of interest issues that both made it difficult to think clearly, made it difficult to establish as a leader that you are, you know, just internally to convince yourself that you're making an impartial, correct decision. But then it made it basically, like, almost impossible to present to the rest of the world that the decision was impartial. So I think that, you know, that was a very difficult situation. But I also think that we made the right decision overall. And I'll dive into that a little bit more. So the background of the situation is, like, Obelisk had announced this open, transparent effort to decentralize the mining of Sia.
Starting point is 00:56:25 So one critical mistake we made is we had never established, like, a policy as, you know, going into manufacturing. We never established a policy on secret ASICs as, you know, just the governance tools. But we had mentioned several times that we felt that if someone had made ASICs in secret, this would be considered a hostile act, because it comes back to the, you know, if Bitmain sells 90,000 machines to a market that can only reasonably sustain the production of 10,000 machines, everybody gets screwed except Bitmain. And so secret ASICs come down to the exact same thing. We felt at the time that the-
Starting point is 00:57:06 Sorry, secret ASICs, you mean, like, a manufacturer secretly making ASICs and then dumping them onto the market? Yeah, exactly. So the only way to make an informed decision as a consumer when you're buying ASICs is to know how many ASICs are on the market, because it's a zero-sum game. The more ASICs there are, the less money you make. We held the position internally, and we had spoken about it nonchalantly, but we had never established a policy, that secret ASICs would be considered an attack on the network because it would substantially damage the revenue of people who had attempted to make informed decisions. Of course, what happened was Bitmain announced with one week's notice, hey, we have Sia ASICs that we're shipping. And then we discovered through some back channeling that they made 90,000 machines.
Starting point is 00:57:54 And again, the Sia network could sustain maybe 10, maybe 20,000 machines. Bitmain had manufactured 90,000. That's what our information said. And so it was this, like, transparent attack. Then I received a call from BTCDrak, of all people, who said, like, hey, we've been working on our own ASIC, we've been working with Innosilicon, and so we also have a secret ASIC. So what you can do is you can break Bitmain and just use our ASIC instead, which, of course, completely misses the point that the problem is the secret ASIC.
Starting point is 00:58:27 It's not Bitmain specifically, but the action of creating a secret ASIC. So definitely every purchaser of Obelisk hardware perceived this as an attack on them. And then because a huge fraction of the committed Sia community had purchased Obelisk hardware, it was kind of an attack on the Sia network as a whole. And so this was, like, the big dramatic challenge. And you had all these people, suddenly accounts that had never been seen before were very active on Discord. They were talking exclusively about how Bitmain is a good actor and how you can't break Bitmain and, like, just this, like, whole social engineering mess. Ultimately we decided, and it really, really came down to, Innosilicon was actively mining themselves
Starting point is 00:59:09 as a manufacturer more than 51% of the network. So not only did they dump a ton of machines on the network and also overproduce, they also kept most of the machines for themselves. And so Innosilicon, a party that had already attacked the network and been shown to be acting in bad faith, also had the ability to 51% the network. That was kind of what pushed me all the way over, and I said, like, okay, now we have enough reason to fork, even with all this conflict of interest mess. We're going to go ahead and fork. But, like, one common theme that I felt in every interaction that I had with the other mining
Starting point is 00:59:46 manufacturers, you know, specifically Bitmain and Innosilicon, but we were talking to other manufacturers as well, was this sort of, like, manifest destiny. They're like, Sia is a decentralized network. If there's a way for us to make a ton of money by crapping all over the network, then it's our right to do so. And it's our right to make a giant mess of Sia and then make a ton of money off of that. And so that, I just felt, had to be shut down. So a big part of the fork was also this element of, like, no, you have to behave. Like, we as a community can make decisions that directly impact your financial status and your financial
Starting point is 01:00:26 gain. And if you're going to be, like, a bad bodyguard, we're going to fire you. And so that's what we did. We forked. And something that I definitely noticed following the fork is that mining manufacturers for other proof of work cryptocurrencies suddenly became a lot more attentive to the developers' desires. Developers are suddenly part of the discussion when miners are being manufactured, because they realize that there can be forks. So those who had bought your own mining machines as well, at Obelisk, were also then basically forked out of the network? No, there's this strategy that most, if not all, ASIC manufacturers do. I, again, think we need to credit Greg Maxwell with coming up with the idea.
Starting point is 01:01:11 But basically what you do is you make your hardware capable of mining the main algorithm. Then you also add, you know, just a little branch somewhere in the chip, somewhere in the gates, that allows it to mine this alternate algorithm. So it's just a slightly modified algorithm. There are a million ways to make a slightly modified algorithm. Every manufacturer, when they pick a tweak, is going to pick a different tweak. And what that allows us to do is hard fork to tweak the mining algorithm so that instead of doing mining algorithm A, you do mining algorithm A prime.
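The "A versus A prime" mechanism can be modeled in a few lines. This is a hypothetical sketch, not the actual Sia or Obelisk design: the tweak is modeled as a byte string mixed into a BLAKE2b hash, and the vendor names and tweak values are invented for illustration.

```python
import hashlib
from dataclasses import dataclass
from typing import FrozenSet, List


def pow_hash(header: bytes, tweak: bytes = b"") -> bytes:
    """Algorithm A when the tweak is empty; a variant A' for any other tweak.
    Mixing a constant into the hash input is an assumed construction."""
    return hashlib.blake2b(tweak + header, digest_size=32).digest()


@dataclass(frozen=True)
class Asic:
    vendor: str
    supported_tweaks: FrozenSet[bytes]  # algorithms baked into the silicon

    def can_mine(self, network_tweak: bytes) -> bool:
        return network_tweak in self.supported_tweaks


def survivors(fleet: List[Asic], network_tweak: bytes) -> List[str]:
    """Vendors whose hardware still works once the network adopts a tweak."""
    return [a.vendor for a in fleet if a.can_mine(network_tweak)]


ORIGINAL = b""                    # algorithm A
TWEAK_X = b"vendor-x-variant"     # A', known only to vendor X until disclosed
TWEAK_Y = b"vendor-y-variant"     # A'', a different variant in vendor Y's chips

fleet = [
    Asic("vendor_x", frozenset({ORIGINAL, TWEAK_X})),
    Asic("vendor_y", frozenset({ORIGINAL, TWEAK_Y})),
    Asic("vendor_z", frozenset({ORIGINAL})),   # no alternate branch in the chip
]

header = b"example block header"
before_fork = survivors(fleet, ORIGINAL)
after_fork = survivors(fleet, TWEAK_X)
hashes_differ = pow_hash(header, ORIGINAL) != pow_hash(header, TWEAK_X)
```

A hard fork that changes the accepted tweak leaves hardware with the matching branch working and turns everything else into scrap, which is why the choice of tweak is a governance decision rather than a purely technical one.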
Starting point is 01:01:48 Or if there's one bad actor and four good actors, you tweak it so that it's A prime or A prime prime. Isn't this also building some kind of a secret ASIC? I can't help but see the irony in all of this, because on one hand, you've been saying since the beginning that you're for decentralization, and I really think you are. But by doing this, by acting as a privileged actor who had access to the community and the code of the Sia network, but also making an ASIC that you could change the algorithm on at a whim, how do you see yourself as different from the free market actors that are just producing ASICs in a free market and, you know, playing by the rules of the market? They're allowed to make
Starting point is 01:02:35 ASICs and sell them to customers if they want to. And it's up to them whether they want to inform the market of their desire to do so. So it's important to clarify, like, Innosilicon also had this switch in it. And we don't know for certain, but I would be very surprised if Bitmain didn't also have support for some alternate algorithm. What's the point of them putting a switch in it? Because how could they choose what the next hashing algorithm would be, unless they're also the developers? So it's about governance.
Starting point is 01:03:09 What Bitmain is doing, what Obelisk was doing, and before we conclude, we should definitely get into the actual mechanics of the fork. Because I think it's very interesting what we did and very important the way we executed the fork. But before that, as a manufacturer, when you put this alternate algorithm, this tweaked algorithm into your ASIC, what you're doing is you're offering the community the option to switch to exclude some ASICs and not others. So you're giving them the flexibility to make a governance decision to exclude other manufacturers, but retain you. If manufacturers aren't doing this, the only tool that the community has is like a nuke
Starting point is 01:03:53 to break all ASICs at once. If every manufacturer has this tweak that they either disclose to the developers or disclose to the community at their discretion, then instead of dropping a nuke that kills everything, you can drop a targeted nuke that leaves certain parties alive. Manufacturers do this because it gives the community a choice. There's very little downside to the manufacturer.
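The targeted-fork idea described above can be sketched as a small model. This is an illustration only: the manufacturer names, the variant labels, and the `surviving_makers` helper are hypothetical, and the sketch assumes each manufacturer's chip supports the main algorithm A plus exactly one private tweak.

```python
# Hypothetical model: every chip mines the main algorithm "A" plus one
# manufacturer-specific tweak. Names and variants are made up for illustration.
ASIC_SUPPORT = {
    "maker_1": {"A", "A'"},
    "maker_2": {"A", "A''"},
    "maker_3": {"A", "A'''"},
}

def surviving_makers(accepted_algorithms, support=ASIC_SUPPORT):
    """Return the manufacturers whose hardware can still mine after a hard
    fork that accepts only the given proof-of-work variants."""
    accepted = set(accepted_algorithms)
    return sorted(maker for maker, algos in support.items() if algos & accepted)

# The "nuke": fork to an algorithm no chip supports, so every ASIC breaks.
print(surviving_makers({"B"}))
# The targeted option: accept one tweak, or two, and keep only those makers.
print(surviving_makers({"A'"}))
print(surviving_makers({"A'", "A''"}))
```

Accepting a set of variants rather than a single one mirrors the "A prime or A prime prime" case mentioned in the conversation.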
Starting point is 01:04:14 And for example, a possible scenario where the Sia network would have selected Bitmain's ASICs over Innosilicon's is, like, Innosilicon really was the bad actor in the space. So if we assumed that Obelisk didn't exist, Bitmain had come to market, then Innosilicon came to market with much superior hardware. Innosilicon has 51% of the hash rate. This included, Innosilicon's hardware was superior to Bitmain's.
Starting point is 01:04:42 If Innosilicon had 51% and the network is now like being attacked or something, Bitmain could have come forward and said, instead of breaking all ASICs, clearly the bad actor is Innosilicon, we have this tweak that we know Innosilicon doesn't have, or we strongly suspect Innosilicon doesn't have. Who's to decide who's bad, Innosilicon or Bitmain? I mean, I don't have any opinion on this, but who's to decide who are the bad actors? And do you not feel that as a privileged actor in this whole ecosystem
Starting point is 01:05:13 by making the decision to tweak the algorithm to favor the one your miners were ready for, that is not a huge conflict of interest? This really gets into the deep nitty-gritty of decentralized governance. It's difficult for people to wrap their heads around how leadership works in a decentralized way. And actually, to the detriment of decentralization, in a decentralized network, if there's one leader and everyone assumes that the leader has the unilateral ability to make a decision, that leader, in fact, does have the unilateral ability to make a decision. We see this, I think, very strongly in
Starting point is 01:05:53 Ethereum. It also is strong in the Sia network. However, we are doing everything we can to show people that we don't have the unilateral ability to make a decision. And the way we structured our hard fork, I think, underscores that very well. I think it's the right way to do it. And we made sure that any dissenting minority could opt out; the hard fork was opt-in. So you could choose not to opt in to the hard fork. You could neglect to opt into the hard fork and remain on the old network. The important thing about decentralization is that every political decision each person can make on their own and they can go any way they want and be successful. In activating this hard fork and in activating this switch, the important thing was to make sure that
Starting point is 01:06:46 anybody who didn't like the hard fork, anybody who wanted to resist the hard fork had easy, trivial means to do so, or any group that wanted to opt out and not follow the hard fork would not have to work very hard, would have to do very little work in order to maintain the old network. So that was very important to us. And it came down to about four things to make sure that this hard fork was not our decision, but like the collective's decision. Each person on their own got to decide.
Starting point is 01:07:18 The first thing is that the Sia network does not have mandatory upgrades. So every time you upgrade your Sia node, you have to do it yourself. And that means that the developers cannot push an update out onto the network. So we couldn't just sign a binary, send a bunch of messages around,
Starting point is 01:07:34 and then have the default network be this hard fork. So if we wanted to instigate a hard fork, we'd have to convince every member individually to upgrade. The second thing we did was we did replay protection and wipeout protection. And these are conversations that came up a lot in Bitcoin with the UASF and whatnot, but what it means is that the transaction format on the hard fork network, on the new network, where we changed the proof of work algorithm, was slightly tweaked so that transactions that you signed on the new network were completely invalid on the old network.
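The replay protection described above can be illustrated with a minimal sketch. It is not Sia's actual transaction format, which the conversation does not spell out. It assumes a hypothetical per-network tag mixed into the data that gets signed, and it uses an HMAC from Python's standard library as a stand-in for a real digital signature.

```python
import hashlib
import hmac

# Hypothetical network tags; a real chain would use its own domain separator.
OLD_NETWORK = b"old-network"
NEW_NETWORK = b"new-network"

def sig_hash(network_tag, tx_body):
    """Hash that gets signed. Mixing in the network tag makes the same
    transaction body hash differently on each network."""
    return hashlib.sha256(network_tag + b"|" + tx_body).digest()

def sign(key, network_tag, tx_body):
    # HMAC stands in for a real signature scheme in this sketch.
    return hmac.new(key, sig_hash(network_tag, tx_body), hashlib.sha256).digest()

def verify(key, network_tag, tx_body, signature):
    return hmac.compare_digest(sign(key, network_tag, tx_body), signature)

key = b"demo-key"
tx = b"send 10 coins to alice"
sig_new = sign(key, NEW_NETWORK, tx)
sig_old = sign(key, OLD_NETWORK, tx)

print(verify(key, NEW_NETWORK, tx, sig_new))  # signature checked on its own network
print(verify(key, OLD_NETWORK, tx, sig_new))  # same signature replayed on the other network
```

Because the tag is part of what is signed, a signature made for one network fails verification on the other, in both directions.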
Starting point is 01:08:09 And then vice versa. Transactions signed on the old network were completely invalid on the new network. So you had no fear of, you know, trying to send someone Sia Classic coins and them receiving your Sia coins as well. And no fear of, you know, vice versa happening. And then wipeout protection similarly means that the proof of work algorithms are immediately different from each other on both networks. So if one network, you know, gets more work than the other, nodes will not switch from one network to the other, so that you don't have to worry about this massive other network 51% attacking you or wiping out a ton of history. So the next thing we did was developer maintenance. We wanted to make sure that people on the old network would be trivially able to continue
Starting point is 01:08:55 merging our updates and continue merging the code that we were writing for the new network. If we had misread the situation and the true Sia community, the users of the Sia community, really did not believe in the hard fork, the Sia network would continue to benefit from all of the development work we were doing. We would see that we had made a mistake. We could switch back to the old network. So the lines-of-code difference between the hard fork and the not hard fork with the replay protection, with the wipeout protection, with the proof of work change, was three lines of code.
Starting point is 01:09:29 And they're actually just configuration switches in a header somewhere. You do not need any developer ability to keep the two networks synced with each other in terms of code. There are no merge conflicts. There's no challenge in keeping the two networks updated, which we felt was important. You don't need an experienced developer to maintain the Sia Classic chain. You just need someone who knows how to click the merge button on GitLab. Finally, we made it so that the storage network files would not be impacted. So whether you chose to go onto the new Sia network or stay on the old Sia network,
Starting point is 01:10:04 and whether your hosts chose to go onto the new Sia network or not, your files would stay intact, which I think was probably the neatest, most interesting trick that we pulled. But basically, if you personally did not like the hard fork, you wouldn't have to go along with it, you wouldn't lose your files, your money wouldn't be at risk, it'd be easy to maintain, you'd continue to receive updates as we made the network faster and more scalable, et cetera. It is very easy for the community to reject the leadership's decision
Starting point is 01:10:36 to do this hard fork. And I think that's really what separates us from like a Google just passing down to users: Oh, now your Android phones have DRM. Now your blockchain uses a different proof of work algorithm and we break Innosilicon. It is very easy for the community to say, no, that's not what we want and we're not going to go along with it. Final thing is if you, you know, live under a rock or whatever, you don't pay attention to any of the news. Because we can't update the network, if you didn't update your software, if you didn't know that this hard fork thing was happening, your default was to end up on the old network. So no one ended up on the hard fork by accident.
Starting point is 01:11:14 They chose to upgrade to the hard fork network. That kind of illustrates the lengths we went to to make sure that this was a decision not handed down from the top, but one made by the community in a decentralized way. Going back to the fork itself, if Bitmain had already sold all of their ASICs, at this point weren't you no longer hurting Bitmain, but now you're kind of punishing everyone who bought ASICs from Bitmain? Yes, which is a good point. In practice, actually, the only people we were hurting were Innosilicon. Innosilicon was mining more than 51% of the hash rate themselves. So it wasn't customers of Innosilicon who were mining. It was Innosilicon themselves who were mining.
Starting point is 01:12:06 Customers of Innosilicon had already lost 90 plus percent of their money, just giving it to Innosilicon at these ridiculous margins with these assumptions that Innosilicon knew at the time of sale were incorrect, but it was more than happy to sell an overpriced device to a customer. Bitmain was in a similar situation, all their customers had already lost all their money because, one, Bitmain oversold, so the money was all gone anyway. And two, Innosilicon came in and completely destroyed Bitmain. So all the money was gone anyway. So in practice, we were only hurting
Starting point is 01:12:39 Innosilicon. But philosophically, I think, you know, if you're going to buy hardware to mine on a network, it's kind of your imperative to do research and make sure that your purchase is not contributing to the downfall of that network or, you know, contributing to drama or contributing to instability, et cetera. And so even though we're not hurting Bitmain directly, I do think there is some moral responsibility on the people who gave Bitmain money and contributed to the disaster that happened. But as it were, we didn't hurt those people anyway. I don't know how you can say that. I mean, like, the people that are buying the hardware from Bitmain are consumers. The people that are buying hardware from Innosilicon are consumers, they're looking for, you know, the best
Starting point is 01:13:24 miner and they might not even have that much skin in the game when it comes to Sia. To put that on the consumer as their responsibility to choose the right miner and have to understand all these political implications and all this drama, you know, happening underneath all of this. It's nonsensical. I mean, maybe this goes down to like the application-specific chain. And like I mentioned, that's why he believes in ASICs, where he expects miners
Starting point is 01:13:54 to be active participants in the network. You can be, if you want, but you don't have to be an active participant in the network. You can see it simply as an investment opportunity or whatever.
Starting point is 01:14:05 I do want to come back to Sia and the evolution of decentralized storage and get your thoughts on where you see decentralized storage in 3, 5, and even maybe 10 years ahead.
Starting point is 01:14:21 In the next one year, we're going to see a lot of power users starting to put their storage onto decentralized mediums, I believe. So probably as a secondary backup, people are increasingly getting nervous about the control that AWS has, about the control that Google Cloud has.
Starting point is 01:14:38 Sia is a very cheap alternative. So if you're worried about Google handing down this nasty terms of service decision, oh, we terminated your account, you don't get your data back, you know, you're done. With Sia, for just a little bit of overhead, like a 10% overhead of what you pay Google, you can ensure that your data is immune to this sort of centralized decision. So I think over the next year, you know, we're going to see increasingly people moving that
Starting point is 01:15:01 direction. In the two to three year horizon, I think we're also going to see a big shift in how content's distributed on the internet. So, you know, when you go to Vimeo and you watch a video, when you go to Netflix and watch a video, I think it's very likely that that data is going to be coming from a decentralized network, as opposed to the current, like, centralized content distribution. When you're doing live video streaming, how are we making it efficient to, like, get data from so many different sources, decrypted and, like, reverse erasure coded, and to provide a seamless streaming experience? It basically comes down to a lot of parallelism.
Starting point is 01:15:40 The data needs to go through a fan out process because, you know, the uploader only has so much bandwidth. So they're going to upload to someone who has a lot more bandwidth. And that person's probably not going to distribute directly to users. They're probably going to upload themselves to people with even more bandwidth. So you get this big fan out that takes maybe one second or so, fan out, you know, three or four layers. And then once you're fanned out four layers, now you have, I mean, four layers is enough to have the entire bandwidth of the Sia network to distribute content. So the answer is a lot of scalability engineering, essentially. These can all be done using classical techniques, and then you just make sure that the primitives are decentralized.
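The fan-out described above grows geometrically, which a few lines of arithmetic make concrete. The branching factor of 10 and the 100 Mbps per relay are assumed figures chosen for illustration; neither number comes from the conversation, and the function names are hypothetical.

```python
def fan_out_layers(branching, layers):
    """Relays reached at each layer when every relay forwards the stream
    to `branching` further relays (an assumed, uniform branching factor)."""
    return [branching ** depth for depth in range(1, layers + 1)]

def last_layer_capacity_mbps(branching, layers, mbps_per_relay):
    """Aggregate upload bandwidth available at the final layer."""
    return fan_out_layers(branching, layers)[-1] * mbps_per_relay

# Assumed example: each relay forwards to 10 peers, four layers deep.
print(fan_out_layers(10, 4))
print(last_layer_capacity_mbps(10, 4, 100))
```

Under these assumptions, four layers turn one uploader into 10,000 relays, which is why only three or four layers are needed before the bottleneck stops being the original uploader's bandwidth.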
Starting point is 01:16:23 I'd like to get your thoughts on this, and I'm very much looking forward to having more decentralized storage solutions available and people using decentralized storage solutions. I think we agree on the underlying premise that one would be better off storing their files and their personal files on decentralized storage that's encrypted, that's censorship resistant rather than on like Google or Google Cloud or something like that. I see a problem with the space getting big enough to sustain storage needs. One of the blog posts on your blog, I forget which one, it says that by 2025, we're expected to create over 400 petabytes of data per day. And to expect that decentralized storage networks, which rely, you know, if we want to keep it decentralized, primarily on storage provided by individuals, to expect that storage need to be met by that, I think we're unlikely to see that.
Starting point is 01:17:21 And the reason why I think that's the case is that there is a trend where, you know, perhaps 10 or 15 years ago, a lot of people had storage in their house, right? Like there were probably more people that, I mean, this is my assumption, but probably more people had NAS hard drives or USB hard drives because cloud storage wasn't so prevalent and cheap and even like free. But increasingly, it's becoming a niche thing. You know, I think I know three people that have like a NAS or like an external hard drive that's constantly connected to the internet. And I have the same issue with like something like Spacemesh, for example, which is built on the same premise that there will be an abundance
Starting point is 01:18:01 of decentralized, highly available storage available out there. Is this something you guys have thought about? And how do you counter this? You know, what if we go into a trend where people just don't have a whole lot of storage available to put up on the network? This is something we've thought about a lot, basically from day one. And you'll see in a second that we answer this problem very effectively. To justify your concern, when we looked at other peer-to-peer storage systems that had popped up in the past, whether it's BitTorrent, Space Monkey, Symform, all of them kind of were built with this assumption that the people who are storing data and the people who have storage are the same people.
Starting point is 01:18:45 And this really, like, caused struggles on the network. They always had these imbalances where either they had, you know, too much storage or they had too much data. And it was very difficult to convince people who wanted to use, you know, say 10 terabytes of cloud to also have 20 terabytes at home serving up to the cloud. So in Sia, we intentionally and very sharply divorced these two situations. That's why we pay people. So what we actually envision for the Sia network is just like mining farms. You know, we have mining farms all over the world that do proof of work mining.
Starting point is 01:19:20 And they do it because they make money, not because they have some, you know, ideological, you know, decentralization imperative, but because there's profit to be made. So we see the same thing for Sia. We pay people to store our data on an open marketplace. If there's a lot of demand for storage, people will set up data centers that are designed just to service the Sia network. And so we will be able to meet an unlimited amount of demand so long as that demand is unlimited at a price point that makes sense for people to open up data centers. And so we have this marketplace and we don't actually expect storage to come
Starting point is 01:19:57 from individuals. We expect storage to come from professional data centers that are tuned to, you know, be profitable on the Sia network. You're expecting a sort of parallel data center market to emerge, and I say parallel to the AWSs, the Azures, the Googles of the world, that are purely serving the Sia network and providing storage for that decentralized network and perhaps even other decentralized storage networks. Yep, that's correct. And what about leveraging existing storage, like partnering with manufacturers of embedded devices and using that as a way to incentivize people to basically pay off devices that they would buy, right?
Starting point is 01:20:45 So if you have like a Roku or something that has a hard drive in it, well, then as a user of that hard drive, you're automatically getting back, sort of like you're paying for it with these Sia rewards that you're gaining from just having it plugged in. We've looked at that a number of times, and I think the conclusion's actually always been the same, is that the economics don't make sense. It'd be the same sort of idea as, you know, just having your phone mining all the time.
Starting point is 01:21:14 The truth is that your phone is not a device that's optimized around mining, and therefore it's one, two, three, four, five orders of magnitude less efficient than ASICs in a data center. And I think storage is going to end up being much the same way. You know, if you increased the cost of like, say, a smart TV by $100 to put a bunch of storage in it, that smart TV is going to make the company maybe $1 a month in storage revenue by serving on the Sia network,
Starting point is 01:21:42 whereas like an optimized data center can get that same $1 a month of revenue for maybe an additional $25 of buildout cost. Competitively speaking, I think even if the smart TV can lean in on things like free space, yeah, like free physical space, free electricity, et cetera, it still just doesn't make economic sense. So I don't see that happening in the future, just purely because economies of scale are really, really sharp when you're specializing. Very much looking forward to seeing how this plays out. And I know I'll probably be using Sia, providing space as a host on my Synology NAS. We'll be glad to have you. All right.
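The smart TV versus data center comparison above reduces to a payback-period calculation. The sketch uses only the rough figures quoted in the conversation ($100 versus $25 of extra hardware cost, each earning about $1 a month) and ignores electricity, space, and failure costs; the function name is hypothetical.

```python
def payback_months(extra_hardware_cost, monthly_revenue):
    """Months of storage revenue needed to recover the extra hardware cost."""
    return extra_hardware_cost / monthly_revenue

smart_tv = payback_months(100, 1)      # $100 of added storage, ~$1/month
data_center = payback_months(25, 1)    # ~$25 of buildout, ~$1/month

print(smart_tv, data_center, smart_tv / data_center)
```

On these numbers the data center recovers its cost four times faster, which is the economies-of-scale gap the argument rests on.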
Starting point is 01:22:25 Thanks a lot for coming on the show, David. Thank you. Glad to be here. Thank you for joining us on this week's episode. We release new episodes every week. You can find and subscribe to the show on iTunes, Spotify, YouTube, SoundCloud, or wherever you listen to podcasts. And if you have a Google Home or Alexa device, you can tell it to listen to the latest episode of the Epicenter podcast. Go to epicenter.tv slash subscribe for a full list of places where you can watch and listen.
Starting point is 01:22:51 And while you're there, be sure to sign up for the newsletter, so you get new episodes in your inbox as they're released. If you want to interact with us, guests or other podcast listeners, you can follow us on Twitter. And please leave us a review on iTunes. It helps people find the show, and we're always happy to read them. So thanks so much, and we look forward to being back next week.
