Epicenter - Learn about Crypto, Blockchain, Ethereum, Bitcoin and Distributed Technologies - Juan Benet: IPFS – Decentralizing the Web with the Inter-Planetary File System

Starting point is 00:00:00 This is Epicenter Bitcoin episode 100 with guest Juan Benet. This episode of Epicenter Bitcoin is brought to you by hide.me. Protect yourself against hackers and safeguard your identity online with a first-class VPN. Go to hide.me slash epicenter and sign up for your free account today. And by ShapeShift. With no account or signup required, it's the easiest way to buy and sell Gems, Counterparty, Dogecoin, Dash, and other leading cryptocurrencies. Go to ShapeShift.io to instantly convert your altcoins and to discover the future of cryptocurrency exchanges.

Starting point is 00:01:11 Hi, welcome to Epicenter Bitcoin, the show which talks about the technologies, projects, and startups driving decentralization and the global cryptocurrency revolution. My name is Sébastien Couture. And my name is Brian Fabian Crain. We're here today with Juan Benet. He's the inventor of the Inter-Planetary File System, a project with a nicely modest name. And he's also the founder of Protocol Labs, which is the company behind developing it. So IPFS isn't directly a Bitcoin project, but it's a super interesting and ambitious

Starting point is 00:01:43 kind of project to decentralize the internet. And before, we were just going through topics, and Sébastien had this realization where he was like, oh my God, this changes everything. So hopefully we'll get to the same moment in this episode, and all of you will drop everything and have a change in worldview. So thanks so much, Juan, for coming on. Yeah, thank you so much for having me. This is a pleasure.

Starting point is 00:02:11 So perhaps let's get started with that. So can you tell us a little bit, how did you come up with IPFS? And how did that project get started? Yeah. So it started very simply. Like I wanted just to make a package manager for data sets. So the goal was to make something that was really fast for moving around scientific data. So this is really big files, things that are, you know,

Starting point is 00:02:37 hundreds of gigabytes or terabytes per dataset. But that was also versioned, right? So something that you could improve over time and collaborate on. So something like Git and something like BitTorrent. And in kind of sinking into the depths of melding these two pieces of technology, I came up with a distributed file system, which then came to be named IPFS. But it didn't strike me until later that it has wide implications for the web as a whole, and that it can improve how we move data around throughout the network.

Starting point is 00:03:12 And so that's kind of the focus now. So though we started with trying to solve a scientific dataset problem, we are much more focused now on just improving the web as a whole. And that's the goal of IPFS. So how did you come up with, or why did you choose, the name Inter-Planetary File System? So that's funny. So J.C.R. Licklider, who was one of the people who came up with the ARPANET originally,

Starting point is 00:03:40 had this idea of creating the intergalactic network. And so the internet actually stands for intergalactic network. And so, in an homage to that, we decided to name it something along those lines. And so Inter-Planetary File System works really well because, when you say IPFS, it's also the file system for the internet. And so that's what brought the name together.

Starting point is 00:04:07 So tell us exactly, what is IPFS? What does it do? So at its core, it's just a versioned file system, right? So it's a system that can take files and manage them and store them somewhere and track versions over time, very much like Git. But it also accounts for how those files move across the network. So it's also a distributed file system. So it has rules as to how the content moves around,

Starting point is 00:04:38 similar to how BitTorrent has rules around how data in the BitTorrent network moves around. And this file system layer is enough to build a much better web, a web that gives you very interesting properties, like websites that are completely distributed, websites that have no origin server, that can run entirely in client-side browsers and not have any server to talk to. But yeah, it all kind of falls out of just a nice system to move around files. Kind of like the web falls out of a lot of servers speaking the HTTP protocol and some web browsers running pages. So when we think about IPFS, it seems to me that not only HTTP could be replaced or potentially augmented by it. Have you thought of other potential protocols that it could...

Starting point is 00:05:39 Yeah, so, this is tricky, right? So when you make something like IPFS, you suddenly start looking at everything in one specific way, and you realize that you can, like, replace it all. This is probably the same kind of thinking that Tim Berners-Lee had when he came up with HTTP and HTML. He probably just saw all these file systems

Starting point is 00:06:00 as things that could be gobbled up into the web, and they have been. And so I think things that are candidates are a lot of CDNs, a lot of content distribution networks of different types. A lot of different kinds of distributed file systems may not need to be around if this ends up being as good as we want it to be. Versioning systems, like, I think Git is so good that it'll be around for a long time. But there are certain kinds of version control that don't exist now: versioning on media, for example. Imagine being able to correctly version

Starting point is 00:06:35 videos as you make them or large movies and so on. So those kinds of systems could be augmented or replaced by IPFS depending on how the tooling evolves. But I think the main focus for a while will be on building this new storage layer for the network,

Starting point is 00:06:56 like for the entire internet, where the protocol that people use to move large amounts of data changes. And it goes from something like HTTP, which is client-server oriented and oriented around recent data, in that you always have to make requests to the server to check if data has changed or if you have the latest thing, to something where you can reason very carefully about what you already have and sometimes completely avoid having to make a request at all, which is a really powerful property when you think about the offline use case. So one of the big design decisions for IPFS is to make it offline first, the same way that Git is offline first, meaning that you can mutate files or add content to IPFS entirely offline or disconnected from all other peers. And when you rejoin the network, all these changes are synced over time. So we mentioned HTTP, and I mean, we can get into some of the other use cases

Starting point is 00:07:58 a bit later on in our discussion. But in a lot of your talks and a lot of the writing around IPFS, it's often mentioned or alluded to that HTTP is somewhat broken or flawed. Can you explain why you feel that HTTP needs to be replaced or complemented by IPFS or distributed systems? Yeah. So, first of all, I think much more complemented. I think HTTP is a fantastic protocol, and it does so much right, and it got so much right, you know, 25 years ago now, or 20-something years ago.

Starting point is 00:08:39 It's a fantastic protocol, and it'll continue to do great stuff for the Internet, and we'll continue to use it for a long time, the same way that we still use things like FTP today. But there are some issues in how HTTP works that are not scaling with our uses of the network and our uses of the web in general. And in particular, I think in terms of how websites are represented or how websites store data on the internet, HTTP is actually not a very good system for doing this. You would want to have a protocol which allows you to reason about how the data moves, and you want certain properties in the links that you have between computers, things where you're able to check

Starting point is 00:09:31 integrity or where you're able to have some cryptographic guarantees around things. You want to be able to perhaps have signed links and things like that. And just in general, the client-server, single-link model of HTTP doesn't really work when you think about how big the network is today and how you can leverage connectivity between hosts on every single request, right? If you're downloading some big file, you don't just want to download it from a specific location. You want to be able to leverage whoever else has this file

Starting point is 00:10:11 in the same way that BitTorrent swarms, for example, achieve this great network performance. You want to have peer-to-peer sharing of the bandwidth load of downloading a website or something. So in that regard, I think the core problem with HTTP is that it's location addressed. So when you look at a URL, the very first part of a URL is the protocol, but then after that, it's the location. So it's the IP address or the domain name, which maps to an IP address. And that location identifies a specific set of computers that will serve whatever resource you're requesting,

Starting point is 00:10:52 which presents a really big problem if you can't talk to that set of computers, right? If there's some problem in the network between you, or the network is slow, or you're just completely disconnected, you cannot access that resource at all. And that resource may have certain properties. Like, it could be a file that hasn't changed in 10 years, and yet you still can't access it. And this problem gets worse when you think about how quickly websites disappear. So there's this really short lifespan to websites. I think some people give figures around 100 days. I think that might be too intense. But in general, tons of websites disappear all over the place, and they change and move

Starting point is 00:11:34 and so on. And so all these links become stale or sometimes broken entirely. Sometimes you can't find data anywhere except on, say, the Internet Archive, who's been graciously trying to back it up as much as possible because they realized this problem early on, and they've been trying to make backups of everything. But there's just a tremendous amount of data, and we can improve how the network works to make those backups sort of automatic. So you talked about how HTTP uses the idea of location addressing,

Starting point is 00:12:09 which you think is flawed, and what IPFS does is content addressing. Can you explain what content addressing is and why that's a better approach? Yeah. So, let me first say that location addressing makes a lot of sense for certain kinds of use cases. It makes a lot of sense when you want to specifically designate some authority or some set of computers as the source of truth on what the current state of something might be. But it is not very good for large amounts of data or for storing data that you may want to access offline. So the alternative, one example, might be content addressing.

Starting point is 00:12:50 And so this is what IPFS uses. Content addressing is the practice of saying, instead of creating an identifier that addresses things by location, we're going to address it with some representation of the content itself, meaning that the content is going to determine the address. This pretty much means you take a file, you hash it cryptographically, so you get a very small representation of the file that's secure so that you can't just come up with some other file that has the same hash. And you use that as the address. So the address of a file in IPFS usually starts with a hash that identifies some object, some root object, and then a path walking down.
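
As a minimal sketch of the idea just described, assuming SHA-256 as the cryptographic hash and a plain hex digest as the address (real IPFS addresses use their own encoding, so the function names and address format here are illustrative only), content addressing can look like this:

```python
import hashlib


def content_address(data: bytes) -> str:
    """Return a hex SHA-256 digest, used here as a stand-in content address."""
    return hashlib.sha256(data).hexdigest()


def verify(address: str, data: bytes) -> bool:
    """Check that bytes received from any source match the requested address."""
    return content_address(data) == address


original = b"hello, interplanetary world"
modified = b"hello, interplanetary world!"

addr = content_address(original)

# The same bytes always produce the same address, wherever they are stored.
assert addr == content_address(bytes(original))

# Any change to the content produces a different address.
assert addr != content_address(modified)

# A downloader can check integrity without trusting the sender.
assert verify(addr, original)
assert not verify(addr, modified)
```

Because the address is derived from the bytes themselves, it says nothing about where the file lives; locating it is a separate step.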

Starting point is 00:13:33 So instead of a server, you're talking to a specific object, and then you're looking at a path within that object. You're sort of looking at the root of a graph, and then walking down its links to find whatever it is that you're looking for. Right. So right now, the way it works is, you see a link to a file, and that basically points to some location, and you go there and get whatever is at that location, which may be the same as when it was created, or may be something else, who knows, right? So you go out looking at what is at that location, whereas with IPFS, you have this file and you go out and ask, where is this file? Yeah, exactly. So in IPFS you have these

Starting point is 00:14:17 addresses which mean the content. So you start with the hash and so on. And then you have to solve the problem of locating it separately, right? So HTTP has this nice property that because the identifier is the location, you know exactly where to go, you talk to those computers and you get the file. So it's nice, and that can be pretty fast. But it doesn't work in the offline case, right? And it doesn't work in large distributed scenarios where you want to minimize the round trips or you want to minimize the load across the network. And certainly when you have just tons of data, it becomes a pretty big problem to constantly be making requests. Instead, in IPFS, you separate the steps, and so the first step is you

Starting point is 00:15:02 identify the file with content addressing, and then the second step is you actually go and find it. And so when you have the hash, you ask the network that you're connected to, you basically ask who has this content, who has this hash, and then you connect to those peers and download from them. This is basically what DHTs do. So this is a well-known technique. This has been around for 15 years. In fact, this is how BitTorrent works nowadays. Nowadays, when you go and download a torrent, it usually starts with an info hash, and that info hash is just a hash that gives you a set of peers in a DHT, and then from them, you download the torrent file, and then you start downloading the rest of the file. So this is not new. This is actually a pretty old idea.
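
The two steps described here (identify by hash, then ask who has it) can be sketched with a toy in-memory network. This is only an illustration: a single dictionary stands in for the DHT, and `ToyNetwork`, `find_providers`, and `fetch` are hypothetical names, not part of IPFS or BitTorrent. Having the downloader re-advertise the content afterwards is an assumption modeled on how swarms grow.

```python
import hashlib


class ToyNetwork:
    """In-memory stand-in for a DHT plus a set of peers (illustrative only)."""

    def __init__(self):
        self.providers = {}  # content hash -> set of peer names that have it
        self.stores = {}     # peer name -> {content hash: bytes}

    def add(self, peer: str, data: bytes) -> str:
        """Store data on a peer and advertise that the peer provides it."""
        key = hashlib.sha256(data).hexdigest()
        self.stores.setdefault(peer, {})[key] = data
        self.providers.setdefault(key, set()).add(peer)
        return key

    def find_providers(self, key: str) -> list:
        """Step 2a: ask the network who has this hash."""
        return sorted(self.providers.get(key, set()))

    def fetch(self, requester: str, key: str) -> bytes:
        """Step 2b: download from a provider and verify against the hash."""
        for peer in self.find_providers(key):
            data = self.stores[peer].get(key)
            if data is not None and hashlib.sha256(data).hexdigest() == key:
                # The requester now holds the content and can serve it too.
                self.stores.setdefault(requester, {})[key] = data
                self.providers[key].add(requester)
                return data
        raise KeyError(f"no provider found for {key}")


net = ToyNetwork()
key = net.add("alice", b"some dataset")        # step 1: content gets its address
assert net.find_providers(key) == ["alice"]
assert net.fetch("bob", key) == b"some dataset"
assert net.find_providers(key) == ["alice", "bob"]
```

A real DHT spreads the provider table across many nodes instead of keeping it in one place, but the lookup question it answers is the same.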

Starting point is 00:15:51 It's been around since 2000 or 2001, but it hasn't been put as part of the web itself yet. There have been some attempts in the past, but they've been at different layers. I think the most notable one is NDN. So this is named data networking. And the idea was to put content addressing directly in the IP layer. And this was a project started by Van Jacobson, who's famous for basically fixing TCP in the 80s. TCP broke, the internet fell apart. Van Jacobson came in and said, here's how we fix it, and he came up with the entire field of congestion control for TCP and so

Starting point is 00:16:33 on, and just sort of became an internet hero. And lately, in the last, I think, 10 or so years, he has been working on named data networking, which is the idea of putting content addressing at the IP layer. It's really difficult to do because, in order to get the entire network to move off of IP to something like NDN, you need massive buy-in across the world. I mean, we haven't even switched to IPv6. I think switching to something else would be even less likely. And so we are working at a layer above,

Starting point is 00:17:11 which is, let's put content addressing at the HTTP layer and move from something like HTTP to something like IPFS, which can, in fact, generate demand for NDN. So if NDN existed, and we weren't over IP but were over NDN, IPFS would be a lot easier to build. It would be a much, much faster thing to get through.

Starting point is 00:17:35 But we basically had to do all this intense peer-to-peer work to wrangle this point-to-point link network into a peer-to-peer overlay that gives you really fast routing. More on that later, but... Let's take a short break and talk about hide.me. hide.me is a VPN provider. And if you don't know yet why you need a VPN provider, let us help you. I'm sure you were like me: when all the crazy revelations came out during the Snowden time about all the spying that is being done by the NSA and other government agencies, you were shocked and you said, not with me, not with my own rights. Now, there are many ways government agencies can spy on you, but the easiest way is by simply going to your ISP and capturing all your traffic. A VPN can protect you from that. It can give you a secure tunnel from your computer to any of the exit nodes all over the world, so that all your traffic goes through this secure pipe that's

Starting point is 00:18:39 encrypted and cannot be intruded on. And with hide.me, you can choose any of their 30 exit nodes all over the world, so you can enter the internet in a secure location. The best thing about hide.me is that they have a free plan. The free plan includes two gigabytes of unthrottled bandwidth on three of the exit nodes that they provide. Now, if you want to get a free plan, you just go to hide.me slash epicenter and sign up there. And if you use that URL, if you ever decide to get a premium plan, you'll get 35% off. Now, the premium plans are really fantastic. They offer unlimited bandwidth, access to all 30 of their exit nodes,

Starting point is 00:19:17 and you can use hide.me on up to five devices, so you can just install the thing on your phone and your tablet and your PC and have the thing running all the time and just be completely protected at all times. And by the way, you can also pay with Bitcoin. So we'd like to thank hide.me for their support of Epicenter Bitcoin. So you mentioned before BitTorrent as one of the inspirations. And it makes sense that, you know, in some cases, HTTP is bad at delivering data and this can do a much better job. The other thing you mentioned was Git and versioning. Can you explain why versioning is important for files?

Starting point is 00:19:54 So versioning gives you the ability to track how content has changed over time. So there are two sides to this versioning thing. One is seeing the history, so being able to have access to how things have changed. And the other is to be able to link to any one of those versions forever. So if I see some content out there, like say it's some post by somebody, and I want to be able to link to it or reference it, then I can usually link to it with an HTTP URL,

Starting point is 00:20:26 but I'm not guaranteed at all that that content will be there in the future. I'm not guaranteed that the user that is viewing my post will ever be able to see that post. And so with IPFS, you get this property of being able to link to specific versions of objects and replicating them yourself. And so the point is, if I were to quote your post, for example, I would create a link directly to it, and then whoever viewed my post

Starting point is 00:20:52 would also view that version of your post as well. There's ways of letting them be able to see the newer versions of your post, say, if you updated it or something. But the important piece is that the content that I referenced doesn't disappear, that I can have some way of ensuring that that content doesn't go away. So that means, for example, let's say you had a blog and you're linking to different people and you said like, oh, I want to be sure that when people come back to my blog three years down the line and I have all these links in there that those pages still exist, then with something like IPFS, I could just host those myself. Yeah, exactly. So you help replicate that content, right? So when you link to stuff,

Starting point is 00:21:39 you are adding them, in a sense, as dependencies to your content. And so when you back up your content and all of its dependencies, you would be backing up some of those sites as well. I mean, of course, you could potentially today take someone else's site and just copy it, but that wouldn't work so well, I guess, because the location would be different then, right? It wouldn't be... So the location would be different, and, you know, we would also have to trust you and trust that you were really linking to the correct thing.

Starting point is 00:22:15 It's really hard to do. You would have to go and take their site and try to copy all the stuff and host it somewhere and try to create this mirror. And in effect, that's actually what the Internet Archive does. So the Internet Archive is trying to do this

Starting point is 00:22:30 for the entire web all the time. And it works reasonably well, in that they're able to create these backups of websites. But it doesn't work so well for dynamic websites, things that, by nature, have pages that are constantly changing and so on, or social networks that are closed and that the archive can't crawl. And so with IPFS, you could put all of these posts as first-class objects in the graph itself.
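
The notion of posts as first-class, linkable objects, where a link pins a specific version, can be sketched as a small hash-linked graph. This is an assumption-heavy illustration: the JSON encoding, the `links` field, and the `put`, `get`, and `closure` helpers are all hypothetical and do not reflect the actual IPFS object format.

```python
import hashlib
import json

store = {}  # content hash -> serialized object (a stand-in for a node's local storage)


def put(obj: dict) -> str:
    """Serialize an object deterministically and store it under its hash."""
    blob = json.dumps(obj, sort_keys=True).encode("utf-8")
    key = hashlib.sha256(blob).hexdigest()
    store[key] = blob
    return key


def get(key: str) -> dict:
    """Load an object by its hash."""
    return json.loads(store[key].decode("utf-8"))


def closure(key: str) -> set:
    """Return every hash reachable from key by following links."""
    seen = set()
    stack = [key]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        stack.extend(get(current).get("links", {}).values())
    return seen


user = put({"type": "user", "name": "alice", "links": {}})
post_v1 = put({"type": "post", "text": "first draft", "links": {"author": user}})
post_v2 = put({"type": "post", "text": "revised",
               "links": {"author": user, "previous": post_v1}})

# A quote links to one exact version; editing the post later cannot change it.
quote = put({"type": "post", "text": "I agree with this", "links": {"quotes": post_v1}})

# Backing up the quoting post means backing up everything it depends on.
assert closure(quote) == {quote, post_v1, user}
# The newer version keeps its history reachable through the "previous" link.
assert closure(post_v2) == {post_v2, post_v1, user}
```

Since each link is a hash of the target's content, whoever replicates the closure of an object holds verifiable copies of its dependencies.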

Starting point is 00:22:59 So the thing about IPFS is that it's not just about files, it's a layer below that. It just gives you this graph of objects that you can link together. So think of all of your data as being able to ride on this transport and be linked the same way, so that you could have a user object, for example, as an IPFS object that can link to other user objects or that can link to posts and so on. And so a post can link to another post that it references or something. And by backing up that post, you back up everything else. So this is the aha moment that I had earlier, because I've been playing with IPFS for the last couple of days and trying to think about ways that you could perhaps host websites on IPFS. You know, they would have to be client-side, but with technologies like Meteor, you could

Starting point is 00:23:46 conceivably do stuff like that. And then you could have like a NoSQL database that would be sort of a JSON structure. And in that structure, you would simply reference other objects, like user objects or products, for example, if you're hosting an e-commerce site. And those objects would have all the properties that you mentioned, like versioning, etc. But then I guess the flip side of that is, if some of those objects become unnecessary or unneeded, they could conceivably stop being served

Starting point is 00:24:24 if the nodes in the network stop serving those files. So in general, think of this a little bit like Git, where once you have old history, it kind of tends to stick around. Like, yes, there are cases where you go and garbage collect old stuff, but in practice, you rarely do that because the content is so small.

Starting point is 00:24:43 Like in practice, the really important data is usually really small, meaning that databases and, you know, like, yeah, like content in databases or the content in like important files and documents and so on is usually very, very small compared to, say, like, big movies and big media. So because the data tends to be smaller on... On average, I mean, like, there are, of course, like these massive data warehouses and so on. But because those pieces of data tend to be smaller, you don't have a problem just accumulating it over time. And in general, most people want to keep as much data as possible because it gives you more information about the past

Starting point is 00:25:26 and it gives you more information about your decisions in the future, being able to look at the past allows you to make better decisions about the future. So kind of like a Git repository, you would just keep things around for the most part. Like, think also about the books, right? Like, you don't, if a book references some old book, there probably isn't a time that's going to come when you say, you know what, we don't need the old book. Let's just burn them all, right? We've kind of, like, figured it all out. You still want to keep everything around.

Starting point is 00:25:55 And so IPFS gives you the ability to just do that for lots of data that isn't very large in general. And it gives you a very simple way of reasoning about how those old versions are going to stick around and how you can reference them and how you can find them and so on, the same way that Git works. So I'd like you to walk us through, I mean, let's not spend too much time on this because we have so much more to talk about, but walk us through what happens when you install IPFS, when you add a file or a set of files to the network and then they start propagating.

Starting point is 00:26:30 Can you sort of explain what happens? Like end to end. Yeah, end to end. Yeah, yeah. That's great. So it's large and it sounds complicated, but when you think about just how complex it is to move a packet from one side of the network to another, it's not actually that big of a deal. For example, HTTP is actually pretty complicated when you look into it, or even IP is really complicated

Starting point is 00:26:55 when you look deeply into it. So in IPFS today, the way the distribution works right now is that you download this client and you run it on your local machine. That's not going to be the case in the future. What we're targeting is for IPFS to have implementations in the browser. So we're talking about native support for IPFS in Chrome and Firefox and all the major browsers. And the way we're going to get to that, we have this several-step approach. One step is to make a JavaScript implementation so that you can just run IPFS entirely in a tab. Another is to have extensions, because the tab is going to be really good, with no friction.

Starting point is 00:27:37 People are going to be able to use IPFS without installing anything at all. But it'll be a little bit slower than if you had it installed in your browser. And so for the browser, we'll have an extension. So some users that want to do that can run an extension. And then over time, we want to just prove out to the browsers and say, look, this is what it would look like to have IPFS just natively in your browser and then submit patches to them to get it adopted. And that's the goal, to make it completely transparent and seamless

Starting point is 00:28:02 so that nobody has to install anything at all. So today, in order to get there at some point, we have a distribution that works through a local binary that you install and run locally. So once you're running IPFS, you just run one command, which is adding a file. And so that could either be from the command line, so you can just say ipfs add and some file or some directory. Or if you use the web UI, which there is one, you can just drag and drop. So you can just drag a file from your computer and drop it on IPFS, and then it gets added to your local node.

Starting point is 00:28:41 IPFS doesn't actually move the content all over the place. Nobody downloads anything unless they specifically request it. This is a set of hard constraints. I'll describe why later, but for now, just assume that that's the case. Once the content is in your node, your node is now going to... Well, first of all, it chunks the content. So if you're throwing in a large file, as it imports it, it splits it up into a whole bunch of little pieces

Starting point is 00:29:10 and creates a graph out of it. So a graph that allows seeking. So being able to have a really fast index so that, for example, if you want to access the middle of the file, you don't have to go through like a really, you know, you don't have to seek from the beginning and so on. This is like basics of file systems. Once you have that graph, you have a set of objects

Starting point is 00:29:30 that are all these nodes in the graph, and you want to tell the world about them. And so this is what we describe as the routing layer of IPFS. So it's the content routing. When you add content to IPFS, you start advertising what content you have. And this is, of course, configurable. Like there is certain content that you might want to add that you don't want to advertise,

Starting point is 00:29:51 and that's a policy that can be set. How that gets set and so on is a little bit complicated right now, but in the future it'll be pretty seamless. But, you know, the content that you want to distribute gets advertised to the rest of the network. And these advertisements are just you telling the rest of the network, hey, I've got a file with a certain hash. If another node somewhere else in the network now wants to request that file, what they do is they try to get that content.

Starting point is 00:30:21 And so they ask the rest of the network who has that content. And among the responses will be your node, and so they'll connect directly to you and download the content from you. And so they'll download object by object. And they might actually just want a piece of it. So for example, suppose that this is some large data set or some large file, like a huge document or a movie or something, and they're only seeking from the middle on. They don't have to get the first part of the file. They can just request the objects that they need to get to the middle and start from there, which makes seeking really fast

Starting point is 00:31:00 and gives you this really, really good experience on browsing for video and so on. And then they start getting that content, right? But as soon as they start downloading it, as more people start requesting that file, there are now, say, two, three, and so on, nodes that can distribute this content. This is basically kind of how BitTorrent and other networks work, but that's usually just kind of like this separate system, that is not integrated with the web. And so we're just making it integrated entirely with the web.
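
The chunking and seeking behavior described above can be sketched as follows. This is a simplification under stated assumptions: it uses a tiny fixed chunk size and a flat index object rather than the graph a real implementation would build, and `import_file` and `read_from` are hypothetical names used only for illustration.

```python
import hashlib

CHUNK_SIZE = 4  # deliberately tiny so the example is easy to follow


def sha(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def import_file(data: bytes, store: dict) -> str:
    """Split data into chunks, store each under its hash, and return the root hash."""
    entries = []
    for offset in range(0, len(data), CHUNK_SIZE):
        chunk = data[offset:offset + CHUNK_SIZE]
        key = sha(chunk)
        store[key] = chunk
        entries.append(f"{key}:{len(chunk)}")
    # The index lists each chunk's hash and size, in order, and is itself hashed.
    index = "\n".join(entries).encode("ascii")
    root = sha(index)
    store[root] = index
    return root


def read_from(root: str, start: int, store: dict, fetched: list) -> bytes:
    """Return the bytes from offset start to the end, touching only the needed chunks."""
    out = bytearray()
    position = 0
    for line in store[root].decode("ascii").splitlines():
        key, size_text = line.split(":")
        end = position + int(size_text)
        if end > start:
            fetched.append(key)  # record which chunks actually had to be fetched
            out += store[key][max(0, start - position):]
        position = end
    return bytes(out)


store = {}
root = import_file(b"abcdefghijkl", store)  # three chunks: abcd, efgh, ijkl

fetched = []
assert read_from(root, 6, store, fetched) == b"ghijkl"
assert len(fetched) == 2  # the first chunk was never requested
```

Because the index records chunk sizes, a reader starting in the middle can skip straight to the chunk covering that offset instead of downloading from the beginning.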

Starting point is 00:31:36 So I think, yeah, so you add content, it gets added to your local node, it then gets advertised to the network, and then somebody else in the network can request it, and then they pull it in, and then they have it. Along all these steps, by the way, there are hooks for logic to be added, which is, I think, a very important distinction between IPFS and other systems. We're giving you the ability to set up policies around how content gets advertised, to whom it gets advertised, over what network, and even if you receive requests for certain objects, you're able to decide whether or not you serve them to those peers. You may have certain constraints and you

Starting point is 00:32:13 say, well, yes, I have this content and I want to serve it, but I only want to serve it to specific sets of individuals or individuals that carry a token, like an authentication token or something. So this is how you do security today on the HTTP web, and it's the same thing in IPFS. So you would be able to give people the ability to just download secure stuff, like being able to know exactly who's downloading things. Today's magic word is storage, S-T-O-R-A-G-E. Head over to LetsTalkBitcoin.com to sign in, enter the magic word, and claim your part of the listener award. So when I was reading about IPFS, I came across a blog post by a guy from Neocities, which I thought was quite interesting.

Starting point is 00:33:11 And perhaps that would be a good thing to talk about, also to give a more concrete project that's actually using IPFS. Can you talk about what they do? Yeah. So this was Kyle Drake. Kyle Drake is the founder of Neocities, and he's a really, really awesome guy. So he sort of found us on the internet and found IPFS and understood the implications and was like, wow, this is huge. And he just jumped on the IRC channel and started hacking away at a bunch of different things around it. Super awesome guy, by the way. So he made Neocities to be this website where he's trying to bring creativity back to the web. And so he sees the disappearance of GeoCities,

Starting point is 00:34:00 like the old Yahoo GeoCities website, as this huge moment of destruction on the web, when this site that used to host tons of websites and tons of valuable, important creative moments for people just all went away. And a lot of crap, too, I've got to say. Yeah. I made a lot of that crap on GeoCities at the time. Yeah, for sure, but you never know. Some of that stuff could actually be valuable in the future. Like, you don't know what silly little experiments from random people are actually interesting to look at or interesting to read about.

Starting point is 00:34:37 So GeoCities is this website that sort of disappeared and sort of destroyed creativity. And one of the points here is that it's difficult to get started with programming nowadays. When you think about how most people started programming in our generation, maybe not in the older generations, but in our generation, it's usually with the web, right? People started learning how to program websites and how to do HTML and how to add a little bit of JavaScript and a little bit of CSS and so on. And over time, they had this really fast iteration cycle of being able to just add a little bit of code to get a result. And GeoCities made that really easy. And Kyle's point is, with that gone, we don't really have a whole bunch of other outlets

Starting point is 00:35:22 that are that easy and that simple to get started with. So he wanted to reboot that and say, like, let's create this again. Let's try and build a better version of that idea and bring back creativity to the web. And his goal is to make something that's kind of permanent, right? Something that people can't necessarily just shut down. He doesn't want a company that will be bought and shut down like GeoCities was

Starting point is 00:35:55 or a set of websites where at some point the company will just say, hey, sorry users, we know that you're using this, but we're just going to shut it down on you anyway. He wants to give control to the users entirely and say, like, look, this is your thing. We just make it easy for you, but you have complete control. And so IPFS actually gives Neocities a really good way to do that, because normally with the regular web, you have to host servers, and when people are hosting things in your infrastructure,

Starting point is 00:36:28 you have all their content, and you are hosting it for them, and you're the intermediary, and at any point, you can just shut down and disappear, and it's very difficult for people to back all that content up and set up their own website somewhere else and so on. But with IPFS, all of those websites are just these objects that are shared. And so Neocities becomes a website focused entirely around making it easy for people to get started and making it easy for people to create on the web.

Starting point is 00:37:00 But all of the content is stored on IPFS, which means that anybody can back it up and continue creating and continue generating. And so he makes it a thing where users have complete control and have the ability to just own their data and decide when it gets shown and, you know, when it gets taken down or whatever. So do I get that right, that they have the data on IPFS as well, or only? As well, I presume, right? So right now it's as well. So the implementation will follow a set of steps. So for now, it's as well.

Starting point is 00:37:40 So all of the sites are served by the HTTP servers at Neocities, the same way that we have our IPFS gateways that serve HTTP to people. And you can request all of that content also with IPFS. So if you're using the IPFS protocol natively, then you can request all of that content as well. So it's both in

Starting point is 00:38:01 HTTP and in IPFS. And where it's going to get interesting is when we get naming to be really robust. So that's when you can give people keys. So Neocities has this plan of giving

Starting point is 00:38:18 people keys to a certain name. And the keys can be generated by them if they want to do that. And they can just give the updates to Neocities. And Neocities doesn't ever have to own the keys or anything at all. They just mirror your content for you and make it easy for you to get started. And so in the case that, let's say, Neocities disappeared, then, at least in that intermediary time where my browser doesn't know what to do with IPFS, someone else would have to go in there and do that sort of resolving, so that if you make a request, they look it up in IPFS. Yeah, so we have this bootstrapping problem, which is that the majority of the network does not speak IPFS. So how do you get them to use IPFS content?

Starting point is 00:39:07 And the way we solved it is that, along with the standard distribution of IPFS, you get this HTTP server, which takes every request it gets and resolves it through IPFS. So we call this the gateway. It's an HTTP-to-IPFS gateway. So you make HTTP requests, and standard browsers can look at it and can link to it and so on. But the whole content actually gets resolved through IPFS. And we run a set of gateways. So we run this network of servers.

Starting point is 00:39:41 So the ipfs.io domain is our gateway, and every link that people go to on that website is hosted entirely on IPFS. So anything inside ipfs.io is being served with IPFS behind HTTP. So your browser might make an HTTP request, and at that server it gets turned into an IPFS request, and then it gets served. So there are multiple gateways, of which ipfs.io is one.

Starting point is 00:40:09 But I was talking about websites earlier, and I sort of experimented. I was building a website, a static website, and I uploaded it to IPFS, and I was able to access it through a gateway. And then I started thinking, okay, well, how can I route a domain name to this address? One way of doing that would be to just have a static page on my hosting server that redirects there, that does the URL cloaking, but I still have to rely on that hosting provider. That's sort of a central point of failure, you know, if you're looking

Starting point is 00:40:50 to eliminate those. In the future, can we imagine that you can straight up set your DNS settings to point to an IPFS object? Yes. So that's not the future. That's today. You can do that entirely today. We call this DNSLink. So not only are we going to make the IPFS web, but we're also going to make the DNS web. The DNS web is just that you take a TXT record and you set a value that you prefix with "dnslink=" and then some link. And that link is not just an IPFS hash. We made the point that, hey, this should be for other protocols as well. And so you say "dnslink=" and then some path. And that path could be on IPFS, it could be on HTTP, it could be anywhere else, right? And so

Starting point is 00:41:43 whoever's resolving this TXT record can take that domain name and the value for that domain name and then put it in the browser, right? And so we have this working today with two things. One is that our gateways do this automatically. So if you go to ipfs.io/ipns/ipfs.io, what that will do is request the DNS TXT record for the domain ipfs.io. It will realize that there's some hash associated with that, and then it will fetch the IPFS content and serve it. So your URL will look beautiful. It will have no hashes in it at all. Right, but you're still relying on a gateway to do that.
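The DNSLink lookup just described can be sketched as below. This is a minimal sketch under stated assumptions: it does no real DNS query, the `txt_records` dictionary stands in for whatever a resolver returns, and the function names `parse_dnslink` and `resolve` as well as the placeholder value `QmExampleHash` are hypothetical. The only detail taken from the discussion is the record shape, a TXT value of the form `dnslink=<path>`.

```python
from typing import Dict, List, Optional

PREFIX = "dnslink="


def parse_dnslink(txt_value: str) -> Optional[str]:
    """Return the path from a 'dnslink=<path>' TXT value, or None otherwise."""
    value = txt_value.strip().strip('"')
    if not value.startswith(PREFIX):
        return None
    path = value[len(PREFIX):]
    return path or None


def resolve(domain: str, txt_records: Dict[str, List[str]]) -> Optional[str]:
    """Find the first DNSLink path among a domain's TXT values.

    txt_records is a hypothetical stand-in for a real DNS lookup:
    it maps a domain name to the list of its TXT record values.
    """
    for value in txt_records.get(domain, []):
        path = parse_dnslink(value)
        if path is not None:
            return path
    return None
```

A gateway following this approach would then fetch whatever the returned path names, so the visitor only ever sees the domain name in the address bar.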

Starting point is 00:42:30 Right. So what I'd like to get to, the next step, is being able to set a DNS record in your DNS zone with your hosting provider and saying this domain name points not to an IP address, but to an IPFS object or an IPNS object. And perhaps we can explain what IPNS is briefly. So what I mean is that the records already work exactly how they will in the future. What doesn't work yet is the resolution, because the user's browser doesn't know what to do with it.

Starting point is 00:43:00 The user's browser still needs a bootstrap step of being able to talk to the IPFS network. And so we can solve that in multiple ways. One way is with the gateways; that's with zero IPFS running locally at all. The next step, and our gateways will transition to doing this in the future, is that when they detect that you're viewing it through a regular web browser and you don't have a special header set or something, they will serve you some JavaScript that will be the IPFS client in JavaScript. Like an IPFS node in JavaScript. You will run that, and then you'll be able to resolve the DNS record

Starting point is 00:43:38 yourself. So we can do this without any changes to DNS. We can do this now with just what we have. You can do it with what you have. We just need to ship our JavaScript implementation, and then you'll get a zero-install thing. You can also do it today if you want to install something. If you install a local client, then you can do it locally. And if you install the browser extension, for example, you can do it locally. So we're sort of reliant on browser extensions, and down the road, eventually, what's desirable is for browsers to support IPFS. And once that is in place, then we can have purely DNS resolving to IPFS objects. Yeah, well, you'll have the DNS resolving. Yeah, so JavaScript

Starting point is 00:44:31 doesn't allow us to do TXT record resolving yet. So yeah, we'll still need to get either some code changes in the browser or rely on an extension to get the full resolution. But if you sign your records, you can actually use some other resolver to do it for you, right? So you can still get all of this in JavaScript, meaning that what you're describing will be a reality before the year ends, like in 2015. That's very cool.

Starting point is 00:45:03 And so, we mentioned IPNS. I think it's important to just point out that IPNS is a component of IPFS which allows you to have a hash that will not change over time, and which will continue to link to an object that can be changing. So, for instance, you could have a website that you keep updating, and with this unique hash that remains the same, you will always point to that same object. Yeah, so the idea is that IPNS gives IPFS mutability. So a regular IPFS path is immutable by design.

Starting point is 00:45:40 The idea is you have these objects that can't change. And that gives you all these really nice distributed system properties, where you can reason about the content, you can view it offline, and all that stuff. But you still need some mutability, because you want to be able to give people identifiers that don't change, right? That allows them to view the latest version of something. And the way we make that happen, and by the way, this looks a lot like what Git does, right? So Git internally is a huge set of objects that are all immutable,

Starting point is 00:46:13 and then it has these mutable pointers on top that we call branches, right? So a Git branch is really just a file that has a very nice name, but has the hash of an immutable object inside of it. And so IPNS works the same way, except that the names in IPNS, instead of being these nice readable things, are the hashes of public keys. And what that means is that I can claim, instead of having

Starting point is 00:46:42 this naming authority problem, right, like DNS has this issue where you need to register a domain name and you have to pay people money to get a name. And the same thing happens with Namecoin or any of the other name resolver systems. You have to go through some consensus process to reserve a name.

Starting point is 00:47:00 Instead, what we can do is generate keys. You generate a private key, and the hash of the public key is the name. And, you know, it's not a pretty name. There's a way of binding a pretty name to an ugly name. But what you use these ugly names for is that it allows you to give people updates on that name

Starting point is 00:47:21 by just signing a record. So what this means is that if I have a version of an object, say A, and I want to update that to version B, I move the pointer to now point to the next version. What I do is craft a new record saying, oh, this name now points to B, and I put that into the network, into the routing system of the network. There's a lot of complexity in how all of that happens. There are a lot of guarantees around there, like who can censor records, what kind of attacks might exist, how do you ensure that records reach everyone, and all that kind of stuff.
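The naming scheme described above, where the name is the hash of a public key and updates arrive as new records, can be sketched as a toy like the one below. Several things here are assumptions for illustration only: the class `NameTable`, the function `name_for`, the use of SHA-256 over raw key bytes, and the sequence number used to order records. Signature verification is deliberately left out, because the Python standard library has no public-key signing; a real system would reject any record whose signature does not verify against the key.

```python
import hashlib
from typing import Dict, Optional, Tuple


def name_for(public_key: bytes) -> str:
    """Derive the (ugly but self-certifying) name from a public key."""
    return hashlib.sha256(public_key).hexdigest()


class NameTable:
    """Toy stand-in for the network's routing system: name -> latest record."""

    def __init__(self) -> None:
        self._records: Dict[str, Tuple[int, str]] = {}

    def publish(self, public_key: bytes, sequence: int, target: str) -> bool:
        """Accept a record moving the name's pointer to target.

        NOTE: signature checking is omitted in this sketch (an assumption);
        only the ordering of records by sequence number is modelled.
        """
        name = name_for(public_key)
        current = self._records.get(name)
        if current is not None and sequence <= current[0]:
            return False  # stale record: keep the newer pointer
        self._records[name] = (sequence, target)
        return True

    def resolve(self, name: str) -> Optional[str]:
        """Return the immutable path the name currently points to, if any."""
        record = self._records.get(name)
        return None if record is None else record[1]
```

The name never changes while the pointer moves from version A to version B, which mirrors how a Git branch keeps its name while the commit hash inside it changes.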

Starting point is 00:48:02 And for that, you know, we can give people links and so on. But the point is this has been solved for a long time. People have come up with really good ways of doing this. And we're employing them and we're saying, great, this is the way you do mutable records in a large distributed system. And let's use that to move these pointers over time and give people the ability to get all these mutable names. Okay, so we have this technology which allows us to have websites that are not hosted on any server, that can live forever, and where you can also have DNS pointing to them. But one limiting factor is that these are static files.

Starting point is 00:48:46 So, for instance, you could host an HTML website. And then, going one step further, we now have pretty advanced JavaScript technologies, which allow us to have single-page apps like Meteor or whatever. But server-side scripting languages would still be sort of an issue. Can we imagine some sort of layer on top of IPFS where we could have some more advanced server-side scripting, I mean, an equivalent of server-side scripting like PHP or something like that? So when you think about the web today, what you have is a little bit of computation spread around and a lot of storage, right?

Starting point is 00:49:25 So servers do a little bit of work, and they have to do it on the server side because they're kind of trusted. The client does a little bit of work, and that's kind of trusted for the user's context. And then usually you just have a lot of data on the server side, because they're backing up everything. And you have a little bit of data on the client side, because they're just viewing some subset of the stuff, right? When you move to the world of IPFS, though, you kind of unravel these distinctions, and you treat the client and the server as just nodes. And so the client can make some operations, and the server can make some operations,

Starting point is 00:50:02 like just do their computation as usual. And the content that they produce is just IPFS objects. So though all the content is static, you can think of that as just a transport of data. So the same way that you would perhaps have a REST API that gives you some object, instead they could just return the reference to an IPFS object. They could just return the hash, or they could just return the actual object itself.

Starting point is 00:50:28 And that gives you the same equivalent of the computation we have today, but in an IPFS world. You still have to reason about who does what computation, because you have certain kinds of trust implicit in the logic, right? So you don't want, like if you have a sales system that has to make sure that accounts are properly managed and so on, you don't want to give the user access over that because they will cheat. And so instead you put that in a trusted environment. Now, a really nice replacement for all of this is smart contracts, right? This is where when you think about how smart contracts might work, you can create a whole bunch of really sophisticated programs that run entirely in the network

Starting point is 00:51:13 and are auditable by the network. So you can have whole systems that are totally open, and you can view the source code of all of them and still run them in kind of a trusted environment, because you can check that the properties are holding: that there are no double spends, that the protocol was followed correctly, and so on. And all the content that they create is just them mutating pieces of data, all of which can just be more IPFS objects. As we were talking about before, you could have these posts, you know, like user accounts

Starting point is 00:51:46 and posts and so on, but just as well you might have shipping orders and, you know, boards of stuff, or forums and so on. All these kinds of websites that are dynamic, you can make all of them be just IPFS objects. And so the program would just mutate the data structure and put the data structure back into the network, which gives you a really nice separation between the viewers, which are just UI on top of the data

Starting point is 00:52:15 it's kind of a way of just ripping open the big databases that people are creating and dumping them straight into the web. So right now you have this construction where the browser talks to the server over a wire and, you know, a TCP connection. And on the other side, the server is writing stuff to a huge database, but you don't really have access

Starting point is 00:52:40 to the database. However, many times that database doesn't really matter. Like, you don't... In some cases, you do care about privacy and in some cases you don't. And so for the cases that you don't, you can just like rip it open completely. And for the cases where you do care about privacy, in reality, what you want is a much more secure model of privacy than what we have today. Because what we have today is just that it's kind of security by obscurity because you just have to be on the server side to be able to look at everything.

Starting point is 00:53:08 And what you would instead want is something with capabilities, where the user could entirely encrypt all of their own data end-to-end, and so encrypt things locally and ship to the servers, or, you know, the other nodes, totally encrypted pieces of data that they can't read because they don't have the keys. And so only other nodes that have the relevant keys could decrypt it and read it, and only specific nodes that have the update capabilities could write back. There's a lot of great research done on capabilities and a few really good implementations. I think, notably, Tahoe-LAFS is a fantastic implementation of all the capability stuff.

Starting point is 00:53:48 And that's alongside ERights, which is another great project for thinking about how capabilities work and how they might work in distributed systems and so on. But the point is, it's a different security model than what we have today, but it's way, way better for the user, because you get to encrypt things end to end, you get to decide who looks at what files,

Starting point is 00:54:11 who gets to update what files, and so on. Our show today is brought to you by our friends at ShapeShift.io. ShapeShift is the fast and easy way to trade altcoins, and they now support over 50 cryptocurrencies, which includes all the ones you have ever heard of, unless you have no life and spend way too much time on Bitcointalk. So if you want to trade altcoins, there's the old way of doing it, which means creating an account somewhere, giving them all your data, depositing your money, and then growing old while hoping for the best.

Starting point is 00:54:43 Or there's the ShapeShift way, which is fast and easy, and means getting it all done in less than a minute while not even needing an account. So here's something to consider. ShapeShift is a company that really stands by its values and goes out of its way to protect its users' privacy. One way to do that, obviously, is by not requiring you to create an account to use their service, so they don't track any of their users' information. And secondly, when the BitLicense was enacted in the US, they were the first Bitcoin company to say, screw this, we're not going to stand for this nonsense.

Starting point is 00:55:15 And so what they immediately did was move the company out of the States and into Switzerland. So, ShapeShift is a company that really stands by the ideals of Bitcoin, and we think that's really cool. And plus, by sponsoring shows like ours, they really help entertain people like you, and also promote growth in the industry. So we'd like to thank ShapeShift for their support of Epicenter Bitcoin. So yeah, you mentioned the smart contracts topic, which I find particularly interesting because I work with Eris now and, you know, one of the first things they did was

Starting point is 00:55:48 build a sort of distributed YouTube based on IPFS, and IPFS is quite a central component of it. And I was thinking before about where the synergies are between IPFS and smart contracts. And there are a few things that come to my mind. So, you know, in a smart contract, you have some code that basically describes the protocol of interaction for some parties, right? So, of course, one of the things you probably want to do is reference some files, right? So how would you do that now? Well, it's not really clear, right?

Starting point is 00:56:28 I mean, of course, you could put a link in there, but then you depend on the other side, right? Yeah, so what you do is you put an IPFS link there, and the IPFS link just looks like, you know, /ipfs/, then a big hash, and then a path. But right now, it's not the case yet that everybody is running IPFS locally. So what you do is just add a protocol, so we'll have a protocol identifier that allows people, when they click on it, to either resolve it by installing IPFS or resolve it by

Starting point is 00:57:07 going to the gateway. So you can use links today and people can view them; it's just that they have to be well formed. And I think right now there's a little bit of confusion about that. People aren't following the guidelines, and we need to do a better job at expressing

Starting point is 00:57:24 how this should be done. Right, but I think the nice thing as well is that you sort of combine the content addressing and the location of the file, right? Because you want to have in this sort of future smart contract environment, right? You want to have the security. So you want to be sure that that file is actually the thing you're referencing. So if you can then look it up the same way, you know, that's great.
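The point about content addressing, that the link itself lets you check you received the file you referenced, can be sketched as follows. This is an illustration under assumptions, not the real IPFS format: actual IPFS hashes are not plain hex SHA-256 digests, and the function names `parse_ipfs_path` and `matches` are hypothetical. Only the overall link shape, /ipfs/ followed by a hash and then a path, comes from the discussion.

```python
import hashlib
from typing import Tuple


def parse_ipfs_path(path: str) -> Tuple[str, str]:
    """Split '/ipfs/<hash>/<sub/path>' into (hash, subpath).

    Raises ValueError if the link is not well formed.
    """
    parts = path.split("/")
    if len(parts) < 3 or parts[0] != "" or parts[1] != "ipfs" or not parts[2]:
        raise ValueError(f"not a well-formed /ipfs/ link: {path!r}")
    return parts[2], "/".join(parts[3:])


def matches(content_hash: str, data: bytes) -> bool:
    """Check that data really is the content named by content_hash.

    Assumes the hash is a hex SHA-256 digest, which is a simplification.
    """
    return hashlib.sha256(data).hexdigest() == content_hash
```

Because the check depends only on the bytes, it does not matter which party to the contract served the file: any copy that passes `matches` is the referenced file.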

Starting point is 00:57:49 And then the other thing is who hosts the file, right? So that's another security thing. And in a distributed application, well, if your distributed application depends on some central server hosting the file, then, you know, how distributed is it really and how secure is it really? But of course, if you have something like IPFS, then any party that's involved in this contract, anyone who cares about these files and says, okay, I need this thing to be available, can just host it themselves and make sure that it will always be available. And that's super powerful. So I think that's a good example where, I mean, you could probably do it without IPFS, but it's so much nicer if you have IPFS.

Starting point is 00:58:39 Yeah, exactly. And on how you get it to be hosted: yes, the parties that are interested in those files can keep them around, and we'll have a set of tools around helping with that, making it really easy for groups of people to just back up large amounts of data. There are already, for example, these efforts that we have of backing up huge caches of important knowledge. So we're taking all these open science archives, like tons of papers and tons of, you know, Creative Commons media and so on, and just backing them up on IPFS. Just backing up all this stuff, you know, helping to archive all this data in case it disappears for whatever reason. And so we'll have some tools, but the real solution is a sister project to IPFS which we have, which is called Filecoin.

Starting point is 00:59:34 And that's a way where you can incentivize the entire network to help you back up stuff. So when I said earlier that I would come back to describe the strict design constraints around IPFS and why it doesn't force other nodes to store content, this is it. So people would not install something like IPFS if it meant that they might be downloading stuff that they didn't explicitly request, in that, to view certain files, they had to download other stuff that they didn't want to download and store it. They might do it, but the big reason they won't is that a lot of people would just start storing illegal stuff, and people just don't want to store that. Even if you encrypted it, people would still not want to do that.

Starting point is 01:00:22 This is certainly true of companies, right? Large companies would never want to use something that would just start downloading all this encrypted stuff, some of which might be really bad, illegal stuff. So IPFS strictly has as a design constraint that you never download anything that you don't explicitly request. And that's the default operation. You do not download things unless you explicitly request them. At the same time, you do want to be able to build a certain network where some nodes can say, we have a lot of disk space, we can offer to back up stuff for everybody, but we want something

Starting point is 01:01:00 in return. And so the idea there is that a certain set of people will set up disks and trade those for money. Right. And so you have this situation where, in order for disks to stay connected to the network, people have to keep putting in value, right? There's this constant value expenditure that people have to make. It's similar to hashing in a way, right?

Starting point is 01:01:26 People are expending a resource. So in the Bitcoin network, people are constantly expending electricity to power these machines, hashing constantly to power the Bitcoin network. Similarly, you have a situation in storage where people have to constantly be dumping money into keeping bandwidth lines open and electricity for running disks and, you know, swapping out bad disks and all this kind of stuff. There's this constant expenditure of value to keep data around. And though that, of course, is getting cheaper over time,

Starting point is 01:01:57 it's not zero. It's not a zero cost. And because it's not a zero cost, you need to remunerate that with value in some way. And so that's where the idea for Filecoin came from, right? Of saying, look, we'll have two protocols: one protocol, which is IPFS, that helps everybody think about where the data is located and move it around, and another protocol, which is Filecoin, that allows people to get together in a network and trade disk space for cryptocurrency.

Starting point is 01:02:26 Yeah. So that's an obvious need, right? Because otherwise, how are you going to have people actually save stuff? Can you talk a little bit more about how Filecoin works? Yeah, absolutely. So Filecoin is very much based on Bitcoin, and the difference comes in the proof-of-work system. So instead of having,

Starting point is 01:02:56 so there's an additional layer there, which includes a proof of retrievability. This is a cryptographic construction where a certain node has to prove it's holding a specific file locally. And you can organize that proving process to build a network with lots of nodes that are competing to prove

Starting point is 01:03:18 that they have certain files at the right time, and organize them such that you have this network that keeps growing with a large data set that keeps growing, meaning data that people are requesting the system back up for them. And as it grows, miners that are coming into the system with disk space can earn Filecoin as a currency for backing up that data. And so what it looks like to users is that there's a spectrum, right?

Starting point is 01:03:53 They can be on either end. They can just be users and hire the Filecoin network to store data for them. And so they would buy Filecoin from the network by exchanging it for Bitcoin or something, and then spend it on the network to hire it to back up data. And once they

Starting point is 01:04:16 add this data, other nodes on the other side of the spectrum would back up this data and over time prove that they have it and serve it out to earn that Filecoin along the way. And so that allows users to be anywhere in the spectrum, right? You can be on one end and just be a user. You can be on the other end and be just a miner that's storing data for other people and making money. Or you can hover in the middle and just kind of come out even. Just store enough data that you get enough Filecoin to back up your data. And so you can be pretty much anywhere you want. So how does that work from a sort of monetary perspective? Do you have some sort of issuance that, I don't know, is deflationary or inflationary?

Starting point is 01:05:04 Right. So you'll have a system where you want it to be pretty cheap to be able to use that network. So when you get down to the least divisible unit, you don't want the unit to be too large to be able to use the network. Of course, there are side protocols that you can make to split even further, but you don't want to do that. So you want the price of the smallest possible unit to be low enough that people can use this for a while. But what you ideally want is the value of Filecoin to have a floor, and that floor being the value of the whole network,

Starting point is 01:05:47 meaning that the network together, running the service of storing files for everybody, will have a certain value that will include the actual physical value of getting the disks and storing the data for people, and the value of organizing this whole network to happen. And you want that to be sort of the floor. And you can get it to be really cheap. And it's the same thing with Bitcoin, right? So when you think about Bitcoin, you have this massive network with lots of people mining. And the floor of the currency is a very low floor, and we're orders of magnitude above that floor.

Starting point is 01:06:28 But the floor is: as a payment processing system or as a transaction system, Bitcoin is a valuable thing to have to be able to process these transactions. And so that lends some value to the network. So there's that at the very least. Does that sort of make sense? Yeah, I mean, I was just curious. So what does that mean? Is it, you know, for example, that there will be a certain amount of Filecoin that is issued each time frame.

Starting point is 01:07:03 And then you get it according to, you know, the proportion of files that you're hosting on the network? Or will it be like, you know, the more people host files, the more gets issued? Or how would that work? Yeah, exactly. So you can follow a... So I tend to like the deflationary aspect of Bitcoin because it gives a really strong incentive

Starting point is 01:07:29 for people to join the network early and to work with the network early. And you can always build other currencies that are kind of companions to it that can give you the benefits of an inflationary currency. So though it is not 100% settled, it looks very much like Filecoin will be strictly a deflationary currency as well. There will be a certain limit of Filecoin. But the divisibility will be really large so that for a long time you can keep

Starting point is 01:08:03 subdividing and get to a point where you want the value to have that floor of the value of the network, but you want to also incentivize people to get into the network early. And that's a really powerful game-theoretic mechanic because it rewards people at the beginning. And it makes the coins that people earn early be worth a lot in the long run if the network works out as a whole. One of the problems with inflationary currencies is, with an inflationary currency you would get this really nice system where, over time, you could align the value of the currency much closer to what the value of real storage

Starting point is 01:08:56 is. But if you do that, you might not generate the necessary incentive for lots of people to set up the network. So in a sense, when you think about the worth of Bitcoin today and how much money people were getting in Bitcoin at the very beginning of the Bitcoin network, a lot of people got really rich because they worked really hard to push this forward.

Starting point is 01:09:20 And you need something like that to cause the effects that we need to build out this massive storage network. To build this whole thing out, you need a lot of people working really hard to set it up, and you want to reward that. You don't want to just, you know, inflate it over time and be like, okay, well, thanks so much, but better luck next time, right? So what's the time frame here?

Starting point is 01:09:45 It seems like at the moment you're prioritizing IPFS and that's the thing. And will Filecoin come later? Yeah, so we decided to push out IPFS sooner. When we first started both projects, which was like a year ago, we wanted to have Filecoin out sometime this year. But along the way, we just saw so much traction with IPFS

Starting point is 01:10:14 and so many people just wanting to use it and so on. And the value of having IPFS as a transport for the web was such a compelling and important thing to have that we kind of redoubled our efforts there and said this is the big important thing to get right. Because at the end of the day, if you don't build out a distributed network that will just create tons of data to be hosted by these services, decentralized and distributed storage won't ever be able to compete at all

Starting point is 01:10:51 with centralized systems. Like there's just no way. Things like S3 and Google Cloud Storage and so on are just orders of magnitude better; the uptime is way better. You can just pre-encrypt everything and use them as a storage system, and, you know, store to S3 and store to Google Cloud and store to Apple and so on, and get your redundancy there, and maybe a few other providers here and there. And that will give you a much better system than most decentralized systems.

Starting point is 01:11:19 The way you beat that, the way that you can try to create a competitive network that can actually rival a system like that, is, first, you kind of use those systems at the beginning. You want to create a system at the beginning where it's actually profitable for users to set up nodes on top of S3 or on top of other systems like that and mine for Filecoin, the same way that people were mining for Bitcoin at the beginning on AWS and so on. And second, you want to create an enormous reward for people building out the network to succeed. So you want to incentivize these massive mines, these Bitcoin mines in China, to also just start stocking up on hard drives and start mining

Starting point is 01:12:03 Filecoin. But they won't do it if there's not a huge reward at the end of the day, right? And so to cause that enormous scale: when you look at the computing power of the Bitcoin network, it's this astounding thing, right? It passed several supercomputers, I think, two years ago. And now it's beyond the most powerful supercomputer in the world. It's just this insane amount of computation power devoted just to hashing.

Starting point is 01:12:36 And that's the power of strong monetary incentives. And so you need to create something similar to that to build out a massive storage network. And in order to do that, you need a strong reward. And in order to create the demand for the storage, you need a large consumer system. You need a system that is just going to create tons of data. And these are backing up all of what we have, right? Backing up all of the important scientific data sets, backing up all of the media that's out there,

Starting point is 01:13:06 like all of the Creative Commons media, taking all of these video sites that have, you know, tons and tons and tons of video, and it's really expensive for them to host it, to just host it on other systems as well. And so that's where the dual nature of the system comes in, right? You have IPFS as a way to transport data and to generate this massive amount of demand

Starting point is 01:13:27 for storage. And on the other side, you have Filecoin, which gives you a very strong incentive for miners to build out this really strong storage network, coupled with that demand, right? So storage is a two-sided market.

Starting point is 01:13:41 And the reality is that there exists some set of applications that require distributed storage like this, the kind of applications that are built on top of Bitcoin or Ethereum and so on. But the storage requirements are minimal, right? You can just host it all on S3, Google Cloud, and Azure, and that's it.

Starting point is 01:13:59 You don't need any more. And so to really make these networks actually valuable and give some real reward to the end user, you have to couple it with an enormous demand. So how then, I mean, if we sort of extrapolate into the future and, you know, IPFS is successful in some order of magnitude, and we then have an economy of file storage

Starting point is 01:14:26 that is established and, you know, these miners in China, as you mentioned, start having hard drives and those operations grow bigger, don't we just get back into the same sort of scenario we have now, where there are data centers hosting the data? I mean, I guess the difference is that it would be distributed, but we're... Yeah. So, I mean, you don't want to host people's data on laptops, right?

Starting point is 01:14:56 You don't want to host the data on mobile phones, right? You actually want dedicated machines. But with the dedicated machines, what you want is certain kinds of guarantees. One is, if you have a marketplace where anybody can plug in a rack of disks and start earning money, that lowers the barrier of entry. So a lot of systems can come in and play and be part of this huge network. And then the other side is if you allow caches, like Filecoin nodes, to be added anywhere in the network, like not just in the backbone, but like

Starting point is 01:15:31 anywhere in people's homes, in like universities, in, you know, large event spaces and so on, then there's an incentive for all those organizations to just run these machines locally. And then the content providers can, or you can run algorithms that figure out what content should be placed where to be served more effectively, right? So, so the CDN to end all CDNs is a CDN that can put content into the user's home, like beat the speed of light, like put the disks and all of the content they want to look at or view or like a majority of the big stuff,

Starting point is 01:16:04 co-located with them, like exactly in their near proximity, ideally in the same physical network. And the only way you can get to do that is if you incentivize large amounts of users all over cities to just start building out these content caches, both in the user side of the network,

Starting point is 01:16:21 meaning like on the other side of the ISP, kind of like in the local area networks and so on, and in the backbone, but pretty much everywhere, like not bounded by what a certain company and so on might do. Does that sort of make sense? Yeah, so what you're suggesting is that essentially data storage is no longer limited to large data centers that have infrastructure that can provide redundancy and sort of low-latency access to content through having multiple copies of this content in their servers around the world, but moves to more localized storage, where you may have a data center in your, you know, local geographic area, or your ISP may provide some sort of IPFS node sitting right with the DSLAM,

Starting point is 01:17:25 your ADSL connection or something like that. Yeah, exactly. And what you also get out of that is that you go against this. So right now, storage is getting cheaper way faster than bandwidth is getting cheaper. Like that's just kind of like the nature of the curves. And so our storage capacity is increasing faster than the bandwidth capacity. and so our relative perception is that things might seem to be getting slower instead of faster. It's kind of like this weird behavior because our data usage is just increasing to fill the disks

Starting point is 01:18:02 and the capacity of the disks. So you get a situation where you have like you just create more and more and more media to store in whatever disk capacity that you have, but it becomes more difficult to move it around the network. So what's really valuable is bandwidth. So storage is valuable, but it's not as valuable as being exactly in the right location. So that's the more interesting part of the whole network. But in order to get to be able to do that sort of a thing, you have to first build this massive storage layer. You have to level the playing field, like make it really easy for people to be part of the network,

Starting point is 01:18:37 make it really easy for people to join and start providing storage and so on. And you have to also make it so that no single group controls the network as a whole, right, that not one entity controls the whole thing. So, you know, we can sort of look at the future where that storage layer would be there and sort of everything would be stored on an IPFS system. In the more near term, what are some of the most exciting use cases that you're looking forward to seeing? So that's sort of for Filecoin, right? So the most exciting use cases right now, for IPFS, are stuff like just distributed websites, right? So people are already starting

Starting point is 01:19:25 to build these. So yeah, you can take a static website today and add it to IPFS and it's way easier, nicer, simpler hosting, and that's really cool. But what becomes really interesting is to make websites that are completely decoupled from the backbone, meaning websites that have no origin server, websites that just run on clients directly and are potentially entirely encrypted end to end, right? So the entire payload of the website is this encrypted thing, and you get this encrypted thing, you load it up in an application, your browser decrypts it, loads it, starts doing some computation or whatever, and takes all the products and encrypts them again

Starting point is 01:20:03 end to end and sends them somewhere else. And so that's, I think, an extremely exciting part of this, because you put the power back in people's hands, right? So we have this problem right now where all these important applications that we use day to day, what we depend on in our daily lives, from the things that we use to store our files to the messaging systems that we use to communicate with our family, with our loved ones, with our coworkers, and so on, are all these centralized services that own all of the data. And you don't have any control over that whatsoever.

Starting point is 01:20:41 And so you can't encrypt it end to end because if it's a messaging system, like you can't really use it. I think Apple is the only one who's encrypting everything end to end. And so, like, that's a big win. Another is that you can get applications to just work entirely offline so that if your network gets disconnected from the backbone, everything can still work, right? Like, it's kind of silly that today, if you were sitting next to somebody else

Starting point is 01:21:06 and your connection to the backbone breaks, you can't collaborate on documents, you can't work on certain kinds of applications. It's really silly. And we know we have the technology today to just fix this problem; we just haven't. And so what we're building really makes it a lot easier to do this. And that's, I think, a set of, like,

Starting point is 01:21:25 really interesting applications that are going to spring up. So, Juan, the use cases, I mean, IPFS, right, the vision here is super enormous. And it's interesting. You know, we talk about a lot of these applications, and they're really cool and interesting. But I'm sure you think a lot about, you know, what's sort of the end game? So in your, you know, most optimistic

Starting point is 01:21:53 scenarios, when, you know, everything goes right and people adopt it in the way you want them to adopt it, what's the internet going to look like 15 years from now or 10 years from now? So in the long run you have a system where the entire fabric of the internet is this computable system where you compute things in the right place based on where the data is located. You have a network that is just adapting entirely to usage and being extremely efficient at how it uses those resources: computation, storage, bandwidth, and so on.

Starting point is 01:22:27 You have a network where people can plug in systems and algorithms into the network and can get compensated in microtransactions and stuff. So this is the kind of stuff that's super exciting down in the long run. But what I think is most important is you can have these networks that are entirely run by protocols and not run by ISPs. So we're talking about taking the ISP and turning it into a protocol.

Starting point is 01:22:56 And so instead of having one organization that says, okay, we're going to own a network and put it together and run it, you have networks that can emerge and arise as protocols and collaborations that come together through rules that everybody plays by. Because that's how you can scale out other systems like wireless, right? So we haven't really seen massive usage of wireless mesh.

Starting point is 01:23:21 Like we've seen some very large deployments, you know, there are some in Germany, there are some in India and so on. But we haven't seen these massive wireless deployments because ISPs don't want to build that. Except that if you enabled regular people to build out the network bit by bit by bit, that's where the incentives can actually play out in your favor to build out this huge network. And you can do that with protocols. You can't do that with these huge capital investments. It's not worth it to a lot of these companies to set up and deploy wireless,

Starting point is 01:23:51 but it might be worth it to some specific individuals at a small scale, and you can get to fill out all these gaps. So we're really talking about an internet and a huge network that is hyper-efficient, that is encrypted end to end, where entire applications live and operate exactly where they're supposed to be based on where the data is, based on the properties of the data, who wants to share what, with whom, what the permissions are, and where you can have computing systems that are almost kind of autonomous and evolving on the network. So this starts getting into, like,

Starting point is 01:24:34 what happens when you start releasing these smart contracts onto the network with these optimization systems. And it gets potentially kind of dangerous, right? You don't know what some of these smart contracts will do. Some people will program some pretty stupid, bad things, and you have a smart contract that will, you know, make money and continue to make money by paying other people to do something evil, right? And it'll be hard for people to stop that.

Starting point is 01:25:03 However, on the flip side, you can get these autonomous agents that can work within the internet to optimize the hell out of the network and to make it much, much better and simpler. And you can get this competition going where more efficient systems can emerge and can very quickly deploy through the entire network and gain, to a certain extent, like gain usage. So we're talking about the full market of turning all of these networks

Starting point is 01:25:32 and uses of the network into a hyper-optimized market. And then you can think about like computational systems, where all of the code that you're running is just addressed in the system. So you no longer have to think about shipping code from one place to another or libraries and files and so on. You just kind of write functions and just deploy them into the network.
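
A minimal sketch of the idea of code that is addressed by its content, assuming a toy in-memory store and plain SHA-256 hex digests as addresses. `ContentStore`, `put`, and `get` are hypothetical names chosen for illustration, not the IPFS API, and real IPFS addresses use a different encoding.

```python
import hashlib


class ContentStore:
    """Toy in-memory content-addressed store (hypothetical, not the IPFS API)."""

    def __init__(self):
        self._blocks = {}

    def put(self, data: bytes) -> str:
        # The address is derived from the bytes themselves, so identical
        # content always gets the same address.
        address = hashlib.sha256(data).hexdigest()
        self._blocks[address] = data
        return address

    def get(self, address: str) -> bytes:
        data = self._blocks[address]
        # Re-hash on retrieval so a corrupted or swapped block is rejected.
        if hashlib.sha256(data).hexdigest() != address:
            raise ValueError("content does not match its address")
        return data


store = ContentStore()

# "Deploy" a function by storing its source; only the address is shared.
source = b"def double(x):\n    return 2 * x\n"
address = store.put(source)

# Anyone holding the address can fetch, verify, and run the same code.
# exec is used here only to keep the illustration short.
namespace = {}
exec(store.get(address).decode(), namespace)
result = namespace["double"](21)
```

Because the address is a hash of the source, there is no separate step of shipping a file to a named location: whoever has the address can check that the bytes they received are the bytes that were published.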

Starting point is 01:25:57 I think Amazon Lambda is probably the closest thing to this. I was actually very surprised when they came out with this. I think there are some very smart people there that are thinking way ahead of the game. Usually you don't see this sort of investment in an unproven idea so much, but they realize that this is kind of where it's all going, where you're going to shed away the notion of a server, you're going to shed away the notion

Starting point is 01:26:19 of an OS. When you deploy an application, all you care about is writing a function and deploying it somewhere. And so you can think of the end state of a lot of the stuff as these agents that have a wallet associated with them, and they run some computation, and they hold some cryptographic keys. And when they run that computation, they might be rewarded monetarily for that computation. It's the same way that websites sort of work today, but today you have this large infrastructure running. In the future, you might just program them. It might be an individual that programs one thing and just deploys it and it just runs entirely in the network. It pays for its own storage. It pays for its own computation. It moves itself around and earns value that way.

Starting point is 01:27:00 Yeah, Juan, this is amazing for sure. And we're definitely going to have to have you on again at some point, but we're running really late, although we've, yeah, sorry, but let me try it. No, no, not at all. It's been super interesting. Now, before we wrap up, one thing. So the company you run is called Protocol Labs. Can you talk just very briefly about the company and particularly also what kind of ways of monetizing or sort of business model you think you will build around that? Yeah. So I founded a company called Protocol Labs, and I think it's important to kind of tell the story of why I decided to go and do this. When I came up with IPFS, I had a few options. And so one option was potentially

Starting point is 01:27:51 like, you know, go and try to develop this in my spare time while I work on something else. That just wasn't going to work. It's just an enormous amount of work to create all these protocols. Another option was to, I don't know, this is fairly academic, so you could conceivably get a PhD and go do this. And I saw that as a complete dead end in that most PhD work doesn't produce usable things

Starting point is 01:28:18 in the real world; it produces better papers and better systems and better ideas. But we actually have a surplus of good ideas right now. I think the network is 15, 20 years behind the best ideas. So when you look at the edge of where academia is, it's just 15, 20 years ahead of what we have deployed in the network. So when that's the reality, what you really need is a system where you can take a lot of those good ideas and develop them and deploy them into the network. So that happens in a few places in the world. It happens in a couple of really big companies, namely Google, Apple, and Facebook to some extent, Amazon, and a few other places.

Starting point is 01:28:59 They develop these better infrastructure systems and push them out into the world, many times as part of open source. It also happens with the few academics that are also really good programmers that actually go and implement and deploy the things. And it also happens a lot with open source. So specific people that come together and say, you know what, we really want this thing to exist. We're just going to build it. But those two latter routes are fraught with problems. They usually don't have enough resources to build out something really meaningful and something really big. In fact, we look at something like Bitcoin as this insane achievement

Starting point is 01:29:34 of one person who managed to come up with great ideas, you know, building on the work of others over time, and managed to piece together the right set of incentives, construct the network, implement tons of it, and then ship it to the world, right? And yeah, he or she or whoever, it might have been multiple people, had help from others, but it's still a tremendous amount of work that went into producing Bitcoin. The same could be said about the amount of work to produce BitTorrent and the amount of work to produce something like Git, although Git had the help of the entire Linux kernel community, though it was a lot of Linus and a few people. There are all these examples of amazing pivotal

Starting point is 01:30:23 pieces of software that are wrapped up with a whole bunch of new ideas that changed the face of the network forever. And those kinds of developments happen pretty rarely because it's very difficult to couple the really good ideas with a really good development team with a really great deployment plan. And all of that has to happen or you don't get the good solution, right? So I'm a huge fan of two labs in the past, Bell Labs and Xerox PARC. And I think that those two labs produced so much technology that we today take for granted and use every day. And they managed to set these huge bets on the future and managed to create tons of great valuable stuff.

Starting point is 01:31:09 And PARC created a lot of stuff from scratch. Bell Labs did too, but a lot of what Bell Labs did, at least the projects that I really care about, what I'm referencing is specifically the Unix team within Bell Labs. They refined a lot of older ideas, a lot of the surplus of academic ideas that were around, and they refined them into really good development.

Starting point is 01:31:31 So the development and the deployment part they got perfectly right, and they managed to deploy this amazing thing, so that just every single system except, I guess, Windows machines runs on Unix. And that was achievable because they had this focus on just understanding what was the important thing to do,

Starting point is 01:31:50 building it and deploying it. And what I want in the long run is to build a group that can do this for the internet, for the network. So my plan and my goal is to build what I call an RD&D lab, so research, development, and deployment, where we can look at the entire internet stack, figure out what protocols are broken, figure out where we can improve the network, and devote the resources to do it. And now the interesting thing is that before 2008, protocols didn't make money, right? Protocols kind of got created by people altruistically or as the result of government funding.

Starting point is 01:32:25 Actually, most protocols that we use today are the result of massive investment from the U.S. government funding the development of the ARPANET and also other European agencies funding the development of the OSI model and so on. So we're riding on all that massive investment, except that today we don't have that same level of investment, and so our protocols are kind of lacking. However, Bitcoin came in and Bitcoin changed the entire world because it allows protocols to be coupled with value. So now you can have networks that create protocols that have a value influx into the group that is creating and maintaining and improving and helping run the protocol. And that just changes everything.

Starting point is 01:33:10 And so what we are doing at Protocol Labs is crafting protocols, some of which will be cryptocurrency incentivized. And when they are cryptocurrency incentivized, some portion of the currency will be allocated for Protocol Labs. And we're not exactly 100% sure of how we're going to do that in the long run because there are a few different models. You can do the sort of pre-allocation as some other people do. Or what I'm more of a fan of is releasing over time. So you have this notion that as the currency continues to be successful,

Starting point is 01:33:40 some larger and larger piece gets unlocked for the development team. So this can work. And it's worked for Ethereum, Ethereum did this very successfully. And it's worked for other groups. And so we see that as one potential route. But it's not the only route that we are exploring. There's another side to all of this, which is that when you look at IPFS, it's not a very cryptocurrency-centric thing.

Starting point is 01:34:05 It's actually pretty standard in terms of the regular network. So a lot of services can be built around IPFS that can be services that we run and that we run for the network, or that we sometimes sell to companies, right? So there's these massive entities in the Internet, like the Fortune 500 companies, that spend a lot of money adopting new technology. And so we can help them adopt a lot of new technology

Starting point is 01:34:29 by building services and making it easy for them. So there are all these different ways to make money around protocols. And so we're exploring a lot of them, though the main one for a while will be just Filecoin. I hope that that kind of makes sense. No, absolutely. And I agree with you.

Starting point is 01:34:47 I think that's one of the amazing things about Bitcoin, as you mentioned there, that incentivization is just one thing that Satoshi got brilliantly right, and the ability, actually, yeah, to have open source protocols that have financial value tied to them. So that makes perfect sense. So, yeah, Juan, thanks so much for coming on. It was a great pleasure having you. It was super interesting. Yeah, thank you so much, by the way.

Starting point is 01:35:09 This was a fantastic interview. Great fun chatting and happy to come back whenever. Well, we'll probably have you on at some point and go for another hour and a half. Yeah, and actually, ironically, this is our 100th episode. And we were like, so it's like, oh, we didn't get around to sort of like doing something special, but I think this was actually quite special. Hey, look what I got. Okay.

Starting point is 01:35:38 Well, I didn't, but it was. I didn't get a bottle of champagne, but I had a bottle of champagne. I have one in a wine cooler, so. Yeah, so thanks so much, and of course thanks to our listeners. I know lots of you have been listening for a long time. And yeah, episode 100, it's exciting. So I just want to say how excited I am about this. I mean, just before the show, I don't know, you mentioned something, you said, oh, you've got this process all down. I said, you know, 100 episodes. I'm like, it's 100 episodes already? Seems like we started just yesterday. I know. So, yeah, so that's it.

Starting point is 01:36:16 But yeah, we'll be back for the next hundred episodes starting next Monday, which will be the 101st. Now, the other thing we've been doing for a while, and we'll keep doing it, is this bribery t-shirt contest, which basically means you leave an iTunes review and we send you a t-shirt. And you know, you can say, of course, wonderful things, or that this was terrible and you hate the internet now that you've heard all these things. And yeah, if you do that, just send us an email at show at epicenterbitcoin.com. And yeah, we'll put out episodes every Monday. You can subscribe to the show on iTunes, SoundCloud, or of course watch the videos on YouTube, and that's at YouTube.com slash EpicenterBTC. And yeah, that's it. So thanks so much.
