The Changelog: Software Development, Open Source - IPFS (InterPlanetary File System) (Interview)

Episode Date: May 21, 2016

Juan Benet joined the show to talk about IPFS (InterPlanetary File System), a peer-to-peer hypermedia protocol to make the web faster, safer, and more open — addressed by content and identities. We talked about what it is, how it works, how it can be used, and how it just might save the future of the web.

Transcript
Starting point is 00:00:00 I'm Juan Benet, and you're listening to The Changelog. Welcome back, everyone. This is The Changelog, and I'm your host, Adam Stachowiak. This is episode 204. And today, Jerod and I talk to Juan Benet, one of the developers behind IPFS, the Interplanetary File System. Beastie Boys comes to mind right about now. It's a new hypermedia distribution protocol addressed by content and identities. We talked about what it is, how it works, how it could be used, and how it just might save the future of the web. Our sponsors for today's show are Toptal and Linode.
Starting point is 00:00:43 Our first sponsor of the show is Toptal. I talked to Danielle Reed, head of design, about their recent expansion to Toptal Designers, doing for designers what they've done for engineers. And I talked to Danielle Reed about what was behind this, why designers should be excited about it. And this is what she had to say. Take a listen. As a designer, or as any kind of creative person, the big overarching question is always like, how can you find inspiration? And for me personally, having that freedom and flexibility is something that's absolutely fundamental
Starting point is 00:01:34 to doing great work. A lot of the most talented designers, and again it's subjective, but a lot of the great ones I see are based in really interesting places throughout the world and travel regularly and have clients who actually promote that lifestyle as well. So I think, for any designer that is wanting to pursue their skills, to be accountable for their life, to have new challenges, that's the real power of Toptal, I feel. You're not just stuck with one product, one company, or even one agency, but you can choose to work on multiple occasionally or a range of different clients. And I think that that keeps you fresh. It gets you involved in new technologies, different people, and is really fundamental for being sort of switched on as a designer.
Starting point is 00:02:20 All right. That was Danielle Reed, head of design for Toptal. To learn more, head to toptal.com slash designers. That's T-O-P-T-A-L dot com slash designers. Tell them the Changelog sent you. And now on to the show. All right, a fun show lined up today. We got Juan Benet on the show.
Starting point is 00:02:39 Interplanetary, Jerod. We almost wanted to open this show with a fun song. And this is a topic you brought up, IPFS. Why was this on your radar? Well, I mean, I think, first of all, it stands for the Interplanetary File System. Right. Great name. Right.
Starting point is 00:02:56 Catches you right there. You know, a permanent web. It's just kind of an audacious goal. It seemed cool. It seemed kind of tantalizing, and yet I didn't get it exactly. And so just very interesting. I think, Juan, you may have missed just slightly on the name. If you went with intergalactic file system,
Starting point is 00:03:15 Then you could have hopped on the Beastie Boys train and had intergalactic file system, file system, intergalactic. But interplanetary, it just doesn't quite fit right. Yeah. Do you feel like that was a missed opportunity? Yeah, definitely. And funny you should mention, because intergalactic actually is technically a better name for the original purpose of the name. The name is an homage to J.C.R. Licklider, who came up with the
Starting point is 00:03:48 concept of the internet. And the internet, believe it or not, actually stands for the intergalactic network. So IPFS is meant to be the file system for the intergalactic network. Intergalactic file system might have been a better name. The original name was GFS, galactic file system, but then that clashed with a whole bunch of other file systems called GFS. You guys have a pretty good name
Starting point is 00:04:16 out there to people interested, but it might not be too late if you want to hop on that. I don't know if IGFS.io is available, but worth checking into. I guess enough about that one. Let's get to know you a little bit. We like to hear about the origin stories of not just the projects that come across our radar and come on the show, but the people that are bringing us those projects and why it is that you are somebody who's involved with IPFS
Starting point is 00:04:40 and kind of where you come from to get to where you are here today. So can you give us your origin story and tell us kind of where you're coming from? Yeah. So, man, origin story. Don't even know where to begin. So I think perhaps the most relevant thing to mention is I pretty much grew up on the Internet. So most of my learning has been through things like Wikipedia and, you know, books online and all that kind of stuff.
Starting point is 00:05:09 I certainly, of course, went to school and all that kind of stuff. But I very much am a product of the Internet generation. And I tend to think about the world of bits often more than the world of atoms. So for a long time, I've been very interested in how information moves around the network, how distributed systems work, how to make information more reliable and usable to humans. And really, I've come to look at programming as the ability to create superpowers. So not just to have a superpower as a programmer, but to also be able to create superpowers and gift them to other people. Like when you write an application, you're really creating something that becomes this powerful thing like this, you know, kind of like a magical item that you then give out
Starting point is 00:06:03 to other people. And you can give it not only to individuals, but you can give it to billions of people on the planet all at once. And that's huge, right? You think about the people making Wikipedia and how much of a valuable contribution they made to humanity. And that's a superpower that you can give out to everyone. And so I like to think about that kind of stuff: how knowledge grows, how we can build better, how we can make these superpowers more resilient.
Starting point is 00:06:34 How can we make sure that when you give out the superpower, you're not accidentally making people depend on something that may go away? And, you know, more concretely and more grounded, I studied distributed systems, I studied computer science. A lot of both theoretical and applied work. So not just building applications, also thinking about them more deeply, but also not getting lost in abstractions. Having to build something that is usable to regular people helps you translate really good ideas from research all the way down to something that's valuable and usable to average people on the internet that may not even care about the underlying things, right? So at the end of the day,
Starting point is 00:07:25 most people, when they use the internet or the web, they're not thinking about how information moves. They're just, you know, manipulating the... They're pressing buttons on their computers and clicking on things on the web and learning how to use those interfaces. And so giving people good metaphors for manipulating digital objects is a big
Starting point is 00:07:45 part of the whole thing. How can you make contributions that are good theoretically and good from where distributed systems theory is going, but also expose that kind of way to manipulate
Starting point is 00:08:02 and create value directly to the user in an understandable way, right? So, things like interesting little bits and pieces of interfaces. For example, how mail clients operate: when they refresh, when they download new mail, when you know that a mail is being sent, when you have confirmation that somebody has read something. Read receipts are a very interesting little thing that, you know, is actually a very nice distributed systems problem that can help change how people communicate. So you say you grew up in the age of the internet. And to me, I get that, but I don't get that because I'm 37 and I didn't grow up in the age of the Internet.
Starting point is 00:08:46 And having the thought process you just shared, you had to get that from somewhere. So I'm kind of curious. When we have people on this show, we're always interested to find out what it was that got them into programming, what hooked them. Sometimes it's games. Sometimes it's cheating at a game. Sometimes it's, you know, doing better at math. Who knows what, but something got you into software. What was that for you? Definitely games. So, you know, I was born and grew up in Mexico, and then I moved to the
Starting point is 00:09:17 U.S. when I was 15. And I was playing video games from an early age, lots of RPGs, for example. And I got really interested in making games. Also, I think the direct reason I learned how to program was that I was part of an online guild in, I think, StarCraft. I think it was StarCraft and WarCraft. And we needed to make a website. So I was like, fine, I guess. I mean, how hard could this be? I'll figure it out.
Starting point is 00:09:43 And that just exposed me to making websites and programming. And that was like the opening of the rabbit hole. And I think I must have been like 12 or 13 at the time. I don't know. I was pretty young. Didn't start as early as some other people out there did. And for a long time, I was just kind of looking at things, copy pasting, not really understanding what I was doing.
Starting point is 00:10:06 A lot of trial and error, right? Kind of like the early version of Stack Overflow programming. Yeah, over time, I started to understand more. It wasn't, I guess, until I went through college that I got a really good grounding from a theory perspective of how computation actually works, what's really valuable and useful, good ways of thinking about it, and so on. So I think, yeah, it was hugely valuable to have formal training and understanding. So I think you can definitely self-teach a lot of programming and how to make applications and all that kind of stuff.
Starting point is 00:10:46 But to really understand the deep ways in which these applications behave or how large systems scale and all that kind of stuff, it is very useful to have a formal grounding. And that doesn't mean go to school or anything. It just means you can read a textbook. The point is to study it. And I think most people don't get that; at least when I was learning, that wasn't as accessible on the internet. I think it's changed.
Starting point is 00:11:15 I think there are a lot of really nice tutorials now and things like edX or Coursera that do give you the experience of, you know, a more theoretical class. Yeah. The distribution mechanism of how we educate around software is changing or is fluid, but the education itself is still just as important as it ever has been. Especially if we don't want to be doomed to repeat the failures of the past, which tends to happen when you don't know about the past. So you got the education, you were interested in computer science, and then you, you know, learned the underpinnings, so to speak.
Starting point is 00:11:53 And now you are leading a group of people coding this new hypermedia distribution protocol. Can you tell us about IPFS, where the idea came from, how it started, the genesis story of this project? Yeah, so the genesis story is a bit long. Well, not necessarily long, but there's a lot of things that came together, right? So on the one hand, I was always interested in distributed systems. That was my focus when I was in school. I was very interested in peer-to-peer
Starting point is 00:12:26 systems. So I was always very interested in multiplayer games and things like BitTorrent and how you could build very nicely scalable systems by sharing the resources and bandwidth of different peers in the network. And an annoying thing about studying networking in university was that they did mention things like BitTorrent and Skype and so on, like that definitely came up, but it came up at a very cursory level. Like we kind of just discussed it a bit and didn't really take all of the improvements that were brought through those technologies into consideration as much. And it took me a while to understand why. And the reality is that a lot of these systems are kind of special purpose.
Starting point is 00:13:16 The contributions are pretty specific, and they get something working really well for that one use case, but it doesn't translate to nice libraries that people can use for a bunch of other stuff. And you actually have to work a lot harder to get that working, right? To make nice interfaces and nice libraries for a much more general set of use cases, which is what makes it relevant to teach and relevant to apply in a broader context. You have to work a lot harder for that. And anyway, so that's one avenue.
Starting point is 00:13:48 Another avenue was that I was pretty much always dissatisfied with how the web worked, in terms of, you know, this notion that I have to host a web server somewhere. And, you know, even to do something as basic as just transmitting a set of files. I was like, why can't I just publish this data, and as long as people are interested in resharing it, have it work in the browser just fine? Not through having to host my own web server.
Starting point is 00:14:16 That's another thing. I was interested in BitTorrent-like use cases for caching and distribution of content. I was actually pretty excited that, you know, we were mentioning Warcraft and Starcraft earlier. Blizzard actually was one of the only companies to use BitTorrent in a meaningful way in their distribution, at least publicly. There might have been others that did it as well. And it helped solve a big problem with their updates. I remember the days when they had all the distribution centralized and downloading a patch for a game took forever.
Starting point is 00:14:49 And it was also partly the modems that people had, but also just their servers were pegged. So once they moved to BitTorrent, it worked a lot better and faster and much nicer. And that served as an example to prove that even when you're a large company and have a lot of money and so on, you can still gain a lot of value from peer-to-peer distribution systems.
Starting point is 00:15:10 So that was like a nice example, right? Skype was another one, I think, for me that really served as a fantastic shining example of the value you get by helping interface and network people in the world, but you're not really an intermediary that they're piping all communications through. I think nowadays Skype does intermediate
Starting point is 00:15:34 all of your communications, but that's a whole separate story. I think it has more to do now with the difficulty in connecting people peer-to-peer. It actually is pretty hard to open a pipe from one computer to another without having intermediaries. There's a whole bunch of problems like NAT traversal and so on.
Starting point is 00:15:50 That was actually another avenue of this. I was really frustrated with how hard it was to program distributed systems, simply because the network was not as nice as the one IP gave us. So IP gave us this really, really nice network where everything was addressable. Any computer should be able to talk to any other computer. And then NATs and mobile phone networks and a whole bunch of other things came in to ruin the party, right?
Starting point is 00:16:15 They made it pretty difficult to open a connection from one process to another. Also browsers, right? So you can't open a socket from a tab, right? That's, of course, a big important security feature, but there are many cases where nowadays applications on the web probably should be able to dial out to anything else. I think the model changes.
Starting point is 00:16:37 I think the computational platform of today is more about the boundary between the browser and the OS is always shrinking. And I think at some point we will want to be able to make that possible. So anyway, all of these things were brewing in my mind. I guess another strong influence was I did a lot of studying of different kinds of distributed file systems. So these are things like Plant9, for example, which came out of that Labs. Plant9 had a fantastic set of file systems. So these are things like Plant9, for example, which came out of the labs when I had a fantastic
Starting point is 00:17:07 set of file systems. It had 9P, which is a really cool protocol for modeling resources in the network as just different pieces in the file system. You use the same paths to do everything. Venti and Fossil were two examples
Starting point is 00:17:23 of file systems. SFS was another file system that was a huge influence. There were a lot of them. They were all pretty interesting. So I was always annoyed a little bit with the divide between file systems and the web. To me, it would be really, really nice to drop into the terminal and be able to just manipulate the web directly. So, mounting: we tend to use wget and curl and so on, but imagine if the web was just a directory in your file system and you could browse through it or read and write through it. I think, you know, zooming out a little bit, I think it's easy to have maybe that perspective now, especially someone like you who grew up in the web, whereas, you know, Jerod and I are a bit more of a dinosaur
Starting point is 00:18:05 when compared to you, I would say that we didn't grow up in the web. I don't mean dinosaur. We're older, of course. But, you know, we grew up in the age where you joined the web, where the nodes began to trickle in, so to speak, and the web grew and grew and grew. And now it's this big thing. And so it can be easy to look at it now and see, okay, here's the network. It's already there. Here's how we make it better, versus where it came from, which was small and got big,
Starting point is 00:18:29 you know? So I can see how you can look at it and say, okay, here it is, let's make it better now that it exists. But you had to build to the point where, you know, like putting a file server onto the web and stuff like that, you sort of had to stake your claim or put your flag down, so to speak. Right, I think it's a matter of perspective. And our generation, which is probably just one up from yours, not dinosaur level. Okay, sorry, my bad. It's like we saw it come from nothing to what it has become, and so we've seen that change, but we're not quote-unquote web native in terms of growing up inside of what it already was. And so from your perspective, you see all... I don't know if I want to say it's always been there,
Starting point is 00:19:08 but you natively understood the web, and so you're seeing how it could be so much better, whereas perhaps from our perspective, it's already gotten so much better from nothing. So it seems like sometimes it takes the next generation to reinvent things. To point out the problems. Yeah, anyways. So you decided to create IPFS.
Starting point is 00:19:28 Can you give the quick high-level elevator pitch? We're going to dive deep into it after the break, how it works, the problems you're trying to solve, and all that good stuff. But if you had to do like the 30-second, this is what IPFS is, what would you say? So IPFS is a new way of moving around content on the network. It's a protocol with the goal to upgrade the web and make digital information have more permanence, be able to work offline more, be decentralized, and move around faster in general. So use as much of the power of the network as possible and change where the points of failure and points of control are. So there's a lot wrapped into it.
Starting point is 00:20:13 At the end of the day, it's just to say that it's just software. It's just a new protocol for how computer programs should exchange data. So it's like HTTP in that way. But it's a very different design that borrows a lot of great ideas from other distributed file systems and version control systems like Git. And so it models all content as content that's linked
Starting point is 00:20:40 through content addressing and hashes, and uses that as a way of getting much better security properties and a much better distribution model. So there's a lot wrapped into that. At the end of the day, it's about making the web better, making the web faster, safer, and more secure. And so, you know, there's a lot of,
Starting point is 00:21:01 you know, that sounds really nice at the high level, but it's how this is done and the details where IPFS really shines. Well, let's take a quick break and then we will dive into how it shines. You mentioned that it exists to make it faster, safer, and more open in the context of how IPFS works. I think we should keep those three things in mind. And maybe as you tell us the different aspects of the protocol, I guess protocol is the right word for it, you can tell us why this is faster, why this is safer, why this is more open than what we're currently using. But before we go to the break, just from the networking level, where does this fit in? Is it at the IP layer? Is it above IP at the application layer? What does it replace?
Starting point is 00:21:53 So it is above IP and below the application layer. So it complements and potentially replaces HTTP. So think of it as a different protocol for web browsers and applications to use to communicate with each other. And yeah, it doesn't exactly fit in terms of the nice OSI network layering model; the actual layering is much more complicated than networking courses would let on. But it fits there. It's replacing the HTTP layer. I think that helps, just for all of us to be on the same kind of framework of where we see this fitting into how computers communicate. So I think that's very helpful.
Starting point is 00:22:32 All right, let's take that break and we'll talk about how it works in just a minute. If you're looking for one of the fastest, most efficient SSD cloud servers on the market, look no further than our friends at Linode. You can get a Linode cloud server up and running in seconds with your choice of Linux distro, resources, and node location. And they've got eight data centers spread across the entire world: North America, Europe, Asia Pacific. And plans start at just ten bucks a month with hourly billing. Get root access for more control. Run VMs, run containers, run your own private Git server. Enjoy native SSD cloud storage,
Starting point is 00:23:13 40 gigabit network, Intel E5 processors. Again, use the code changelog10, with unlimited uses. Tell your friends. It expires later this year, so you have plenty of time to use it. Head to linode.com slash changelog. All right, we are back, and we're talking about the Interplanetary File System, which, by the way, is just still fun to say. I love saying that. I'm going to keep saying it. It's so awesome.
Starting point is 00:23:40 We all know Adam's a big space fan, so I'm sure you're all on board for this name, Adam. Oh yeah, totally, dude. But it is a mouthful. So, IPFS: its goals are to change the way we communicate with our computers, using a peer-to-peer distribution protocol aiming to make the web faster, safer, and more open. Juan, you said that the way it shines is really in the details of how it works. Sounds like you have a lot of education with regard to past file systems, even current file systems, as well as networking protocols. And so you've put together this gem, which people are getting quite excited about. We'll talk about that here real soon.
Starting point is 00:24:22 But can you open it up for us and kind of give us a look inside IPFS? Give us the insider look of how it's all put together, and why it's faster, safer, and more open. The core principle underlying IPFS is to model data and link data using causal linking. So this is an idea that goes way back to, you know, people like Leslie Lamport and others in distributed systems, who really had a good framing for how to move around data. But more recently, I think distributed version control systems
Starting point is 00:24:59 like Git and Mercurial and so on, proved to us how valuable it is to model data this way. You know, they weren't the first systems to do it. There were others before, but I think they were certainly the most widely used. And so the same fundamental property that underlies Git, and it's the same fundamental property that underlies things like Bitcoin,
Starting point is 00:25:23 is the idea of linking objects using hashes. And so this is both causal linking, meaning that one object is ordered after the other. You know, you can say that when you link something by cryptographic hash, the object that's linking to another has to always come after the other. So it orders them in time. And the other piece of it is that by using cryptographic hashes, you can verify the content, so that if I have a link to an object, a file, and that link has a cryptographic hash, it means that I can find that file anywhere. I don't have to go and ask
Starting point is 00:26:05 any specific location or authority for the file. Anyone can serve me that file. And I can check that it's the right file because I can hash it and I can verify the hashes match. And so that is an organizing principle for the entire file system that you can build on top of it. And the kernel of the insight for something like IPFS and other systems, not just IPFS, is that if you center on this as the main way to model your data and link the data, then you can make a lot of problems easier.
Starting point is 00:26:44 Like you can easily reason about what content came before what other content. You can easily reason about making sure the content is correct and valid. And you can authenticate the content. You can make sure that you can verify it's correct. And you are free to now accept it from anyone in the network. You no longer have to go to specific web servers. You can really get it from any other computer. You can also not have to be connected to the internet, actually.
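The verify-from-anyone idea he's describing can be sketched in a few lines of Python. This is a toy illustration: real IPFS uses multihash-encoded content identifiers over chunked DAG objects, not bare hex SHA-256 digests, and the function names here are made up for the example.

```python
import hashlib

def content_address(data: bytes) -> str:
    # Derive the object's address from its bytes (toy scheme:
    # bare hex SHA-256; real IPFS wraps hashes in multihash CIDs).
    return hashlib.sha256(data).hexdigest()

def verify(address: str, data: bytes) -> bool:
    # Re-hash whatever a peer handed us and compare with the link.
    # If they match, it does not matter who served the bytes.
    return content_address(data) == address

original = b"hello, interplanetary web"
addr = content_address(original)

assert verify(addr, original)          # bytes from any untrusted peer check out
assert not verify(addr, b"tampered")   # altered bytes are rejected
```

Because the address commits to the exact bytes, a peer that serves the wrong content is caught immediately, which is what makes it safe to fetch from local caches and strangers alike.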
Starting point is 00:27:16 You can be in a different network that is separate and using and manipulating the exact same set of links. So it is the underlying principle of linking something by hash. We call it Merkle linking, and this comes from Merkle trees. And Merkle trees are a data structure that was invented by Ralph Merkle, a very eminent cryptographer. And Ralph Merkle has done a lot of other amazing things. One of his perhaps most famous contributions was, I think, called Merkle puzzles, which proved
Starting point is 00:27:53 that you could establish secure communications with each other in the clear. This was before public-private key pairs. So it was a big, important contribution. And this idea of Merkle linking through Merkle trees kind of stayed buried in the cryptography community and the kind of low-level systems community for a long time. I think probably because it was patented, people were more reluctant to use it. But I think the patent has expired since, and then it began to be used all over the place,
Starting point is 00:28:26 including systems like Git and so on. And this is what gives rise to the nice distributed system properties, right? Like you can, when you think about a Git object, you have a hash, you have a SHA-1 hash that you can use to address the commit or address the file or the directory and whatever. And you no longer have to trust the network to provide the correct content to you. And you can reason about the history, right? So you can even talk to the server and find out that it's been compromised because it's serving you some other completely different history. Or maybe it's not been compromised, but people did rewrite history and you can tell that that's happening.
Starting point is 00:29:04 And you can be selective about the changes that you take in. So that fundamental property, which is, you know, again, to restate it, you're just linking objects with hashes, right? So you are embedding into one object the hash of the other, gives you this way to tie up content causally.
Starting point is 00:29:27 Right. So if one object gets updated, then all of the links to it have to change and so on. And this gives you the ability to verify and validate content and to also content address it. So there's another leap there, which is you have to also consider that these hashes may not just be a good way to verify the content. You can also use them as a way to address the content in the links themselves. And so you can put it in a file system or in an address bar or something. You can ask for something by hash. This is also an old idea. It's been used in many systems.
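That ripple effect of Merkle linking can be shown with a couple of toy nodes, where a parent embeds the hash of its child. The node layout and hashing scheme here are invented for the example; real IPFS uses a richer Merkle DAG object format.

```python
import hashlib
import json

def node_hash(node: dict) -> str:
    # Hash a canonical serialization of the node; a "link" is just
    # the hash of a child node embedded in the parent.
    encoded = json.dumps(node, sort_keys=True).encode()
    return hashlib.sha256(encoded).hexdigest()

# A tiny two-level Merkle DAG: the parent embeds the child's hash.
leaf_v1 = {"data": "file contents v1"}
parent_v1 = {"name": "dir", "links": [node_hash(leaf_v1)]}

# Updating the leaf changes its hash, so every object linking to it
# must change as well -- updates ripple all the way up to the root.
leaf_v2 = {"data": "file contents v2"}
parent_v2 = {"name": "dir", "links": [node_hash(leaf_v2)]}

assert node_hash(leaf_v1) != node_hash(leaf_v2)
assert node_hash(parent_v1) != node_hash(parent_v2)
```

This is the same property that makes a rewritten Git history detectable: any change to an old object changes every hash above it.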
Starting point is 00:30:09 But by using these simple abstractions and piecing them all together, you can build a distributed system, a distributed information system, if you will, that can move around content in a much safer way because you can verify all of it. It's faster because you can oftentimes check caches that are local to you. It could be in the same machine. It could be in a machine close to you physically, or it could be in the network that you're in, not even having to talk to the internet backbone and
Starting point is 00:30:37 so on. So it just makes information distribution faster and allows you to reuse the bandwidth of other peers. You no longer have to trust others. You can ask them for something and you can verify that they're giving you the right content. So all of this falls out of the fundamental idea of Merkle linking. You also say here on the website that it combines the distributed hash table that you're talking about with incentivized block exchange, which I'd like you to unpack for us, and a self-certifying namespace. So let's start with incentivized block exchange. What does that mean? Yeah, so this is a concept that comes from BitTorrent, right? So one of the improvements of BitTorrent over previous systems
Starting point is 00:31:26 was that it modeled data distribution as an incentivized exchange, right? So this means that if you have a bunch of people trying to download a torrent, then it's better for the swarm if people exchange pieces of content that each other needs. This is usually referred to as a tit-for-tat model. It's not perfectly tit-for-tat, so, you know, if you ask people in game theory and so on, the incentive structure is a little bit different, and there have been better proposals since then. But the basic idea is you say, hey, there's a lot of peers in the network that have content, and anybody can provide the content to you,
Starting point is 00:32:05 select between those peers that are likely to give the content to you. And that becomes more likely if there's an incentive structure there, meaning that if I have pieces of the file or I have pieces of other files that you are interested in, we can exchange those. And that way you align the network
Starting point is 00:32:24 so that you share the bandwidth resources, right? So instead of just supporting leeches that are only downloading and not contributing to the network, you get the distribution to sort of, in a sense, not exactly pay for itself, but you help load balance the distribution. This isn't perfect because there's a lot of models where you really just want to transmit data out and you don't really care about people helping share it or other cases where
Starting point is 00:32:51 maybe it's something really big and the people that are distributing it actually I don't know, maybe want to charge money for it or something. This is something that we took into account when we designed a new protocol called BitSwap, which is a sub-protocol of IPFS. And this is what we call the block exchange.
Starting point is 00:33:09 And so it models the data distribution as kind of like a data barter system where I give data to you, you give data to me. I take into account how much data you've given me in the past. And it makes me more likely to want to give you stuff in the future if you've also given me stuff as well. So if our data sharing relationship is profitable, then I am more likely to give you stuff in the future. There's a whole bunch of other cases where maybe I'm new to the network, people should still give me content, or maybe I don't really have anything that other people are interested in that you still have to take into account.
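A toy sketch of that barter-ledger idea, tracking how much data has flowed each way with a peer. The names here are hypothetical, and the real BitSwap strategy is considerably more involved than this.

```python
from collections import defaultdict

class Ledger:
    """Toy BitSwap-style ledger: track bytes exchanged with each peer."""
    def __init__(self):
        self.sent = defaultdict(int)      # bytes we sent to each peer
        self.received = defaultdict(int)  # bytes each peer sent to us

    def record_sent(self, peer, n):
        self.sent[peer] += n

    def record_received(self, peer, n):
        self.received[peer] += n

    def debt_ratio(self, peer):
        # How much more we've given than received; a higher ratio
        # makes us less eager to serve this peer next time.
        return self.sent[peer] / (self.received[peer] + 1)

    def serve_order(self, peers):
        # Prefer peers that have reciprocated (lowest debt ratio first).
        return sorted(peers, key=self.debt_ratio)

ledger = Ledger()
ledger.record_sent("peer-a", 1000)       # peer-a has only taken from us
ledger.record_received("peer-b", 5000)   # peer-b has given us data
assert ledger.serve_order(["peer-a", "peer-b"]) == ["peer-b", "peer-a"]
```

Note the `+ 1` in the denominator: brand-new peers who have sent nothing yet still get a finite ratio, which matches the point about newcomers still being served.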
Starting point is 00:33:43 And here, the standard HTTP model of just just i'm just going to distribute out content also works where um you can default back to that kind of thing and so it's meant more of us than optimization of the network than um a hard and fast rule that you force nodes to always distribute stuff right there will always be leeches in the network that you have to take into account and so it's like you're somewhere in between so another concept in ipfs is the self-certifying namespace can you tell us what that is a self-certifying namespace comes from uh an old file system you know another old it's early 2000s um called sfs and that was the self-certified file system and the basic idea is that you know when you think about naming on the network, this is the problem of assigning an identifier to some resource or content
Starting point is 00:34:32 that may change over time. So something like foo.com points to an IP address. And if I change that pointer to point to something else, how do you know that it was me who did that and not somebody else, right? And so DNS employs, you know, some amount of security in terms of, you know, only allowing certain people to update records.
Starting point is 00:34:55 There's also some problems around security of how those records move and all this sort of stuff, but there is a good, you know, amount of security there where it's not like you can, if I own food dot com, then you can't set records on that. Right. That's the basic idea. There's other naming systems, right? Like there's other names that work in different ways where, you know, the way that registration happens and so on.
Starting point is 00:35:18 Maybe I have a public private key pair. And so food dot com is assigned to is bound to, say, a public key. And then any record signed by the private key corresponding to that public key can uh update that pointer right and so self-certifying records or self-certifying um file system uh took it a step further and said hey wait a second what if if we relax the constraint and say that we don't need these nice human readable names um and instead just, you know, we can allow some ugly looking names.
Starting point is 00:35:50 What if we just embed the hash of the public key directly into the name itself? And so you can imagine like, you know, this unreadable name, which is just like a big long hash, but that's just the hash of the public key. And that means that there's no need for a centralized authority validating hash of the public key. And that means that there's no need for a centralized authority validating or securing the namespace. You know, it's in a sense, a distributed
Starting point is 00:36:11 namespace that cryptography assigns, right? So this means that by just generating a public-private key pair, I have a name now. And that name is the hash of my public key right it's not a nice name i can't you know you can't hear it and type it or anything like that right um so you know name we tend to think of name as a nice human readable thing but the value here is that if you relax that constraint then what you get back is you don't need a centralized namespace you don't need to talk to the internet to validate their name um as long as you have the records and they're signed correctly by the corresponding private key, then you're good to use the value, right? And so this means that you and I can be in an IPFS network that is separate from the entire internet. And I can create a public private key pair. And I say, hey, I'm going, update content. And the link that
Starting point is 00:37:07 I give you for that content is, you know, um, the hash of my public key, then I can, you know, continue to publish content there and you can find it and you can be assured and certified that it was only me that updated that content, nobody else. And so think about this kind of like a, another way to think about it is kind of like a Git branch. So in Git, you have immutable content, which are the objects that are all hash addressed and content addressed. And on top of that, you have these mutable pointers,
Starting point is 00:37:40 and these are the branches, right? So something like master. And so master is a pointer that keeps pointing to the latest head that you want to consider as master. And whenever you commit, when you say git commit, then you're updating the pointer, the master pointer to point to the new commit. So the same idea,
Starting point is 00:37:59 this is how we use self-certified names in IPFS. They're pointers to the latest version of content, right? So, and this could be a version history or it could be just one version of the file or something, right? It doesn't matter. You get to decide what that means. But it gives you mutability. It gives you the ability to change content in a completely decentralized way where you don't have to rely on any central authority whatsoever. This is a huge property. It's a huge win. You end up giving up on, you know, the nice human readable naming. But there's ways to add that back in later. You add it on top, basically,
Starting point is 00:38:36 like you map human readability to this, you know, non-human readable names and that are, you know, certified, self-certified. And the reason it's called self-certified is that the name itself has the hash of the public key. And that's all you need. Right. So if you have the name of the hash and you have the content, you can verify all of it. You do not have to ask any central authority whatsoever for validation. So this means that you don't need CIs, right? Like you don't need CIs. You don't need, you know, you don't need CIs, right? Like you don't need CIs, you don't need a consensus-based naming system like DNS. You don't need any of that. And you can just do naming on your own peer-to-peer.
Starting point is 00:39:11 It's a huge thing. This concept shows up all over the place. Lots of systems use self-certified naming. They don't tend to credit it that much. And they don't tend to refer to SFS, which was like the original place where this showed up. But yeah, that's kind of where the idea came from. And it's hugely valuable. And I think people tend to underestimate how important this piece of IPFS is.
Starting point is 00:39:35 And there's a lot of challenges that making it scale and making it nicely usable and so on. But it's an important part. Well, let's pause here. When we get back we're gonna dive into the practical use of IPFS like how that exists some so far you've described what seems to be is a bunch of kind of standalone technologies and implementations data structures protocols what have you we'll put it all back together and see how you can use IPFS. And then we'll talk about who's using it,
Starting point is 00:40:07 what they're building on top of it because it's a file system. So the point is to build things with it. It's not really the end goal, right? It's a piece of infrastructure. So we'll take our break. And after that, we will discuss those things. Every Saturday morning, we ship an email called ChangeLog Weekly. It's our editorialized take on what happened this week in open source and software development.
Starting point is 00:40:32 It's not generated by a machine. There's no algorithms involved. It's me, it's Jared, and the rest of the ChangeLog team hand curating this email, keeping up to date with the latest headlines, links, videos, projects, and repos. And to get this awesome email in your inbox every single week, head to changelog.com slash weekly and subscribe. Okay, Juan, so far you've described to us what seems to be a bunch of interrelated yet separate technologies. Can you bring it all together? How does IPFS work?
Starting point is 00:41:03 What's the software packaging? How do you use it? How does IPFS work? What's the software packaging? How do you use it? How do you get started? Tell us all that good stuff, actually practical uses of putting this stuff out there and using it. Yeah, so the architecture fits together in that IPFS, you know, the core IPFS node. You know, you don't think about it as a client or a server. You think about it as a node or a peer in the network. You know, we're trying to get rid of the client-server mentality.
Starting point is 00:41:27 So you have a node, and this node, what it gives you is the ability to add or retrieve objects into the graph. And so the graph is, think of it kind of like the web, but these objects aren't HTML. They're kind of like JSON. It's not actually JSON, it's C-bore in the wire format. But they're kind of like JSON. It's not actually JSON, it's C-Bor in the wire format. But they're kind of like JSON objects and they can represent files, they can represent
Starting point is 00:41:51 web pages, they can represent version histories like Git, whatever. And you get to add objects here. So if you add a file to IPFS, uh, there's a whole bunch of tools that you can use around IPFS nodes. So for example, you can have a command line implementation. And so the command line tool can add a file, right? And so you can get an IPFS command and your command line that says IPFS add, you know, my file.jpg or something, right? So what that does is that it reads the JPEG and chunks it into a graph, right? And so this means that it'll read the file and split it into a whole bunch of smaller pieces and then construct a graph out of it.
Starting point is 00:42:30 And this graph is, you know, you can think of it kind of like the easiest case would be a linked list, but there are some other kind of abstractions. The graph is a description of the file. And, you know, here you can chunk really large files this way, right? And it helps version things.
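That add-and-chunk step can be sketched roughly like this. For simplicity the root object is a flat list of chunk links; real IPFS builds deeper DAGs with a configurable chunker and much larger chunk sizes.

```python
import hashlib
import json

CHUNK_SIZE = 4  # tiny, for illustration; real chunkers use ~256 KiB

def h(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()

def add_file(data: bytes, store: dict) -> str:
    """Chunk a file, store each block by hash, return the root hash."""
    links = []
    for i in range(0, len(data), CHUNK_SIZE):
        chunk = data[i:i + CHUNK_SIZE]
        store[h(chunk)] = chunk
        links.append(h(chunk))
    # The root object just lists the chunk hashes, in order.
    root = json.dumps({"links": links}).encode()
    store[h(root)] = root
    return h(root)

def cat_file(root_hash: str, store: dict) -> bytes:
    root = json.loads(store[root_hash])
    out = b""
    for link in root["links"]:
        chunk = store[link]
        assert h(chunk) == link  # every block is independently verifiable
        out += chunk
    return out

store = {}
root = add_file(b"hello interplanetary file system", store)
assert cat_file(root, store) == b"hello interplanetary file system"
```

Because each chunk is addressed by its own hash, two versions of a large file can share all their unchanged chunks, which is the versioning benefit mentioned above.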
Starting point is 00:42:47 And so then you put all of these objects that represent the graph into IPFS, into a local repository. Think of it a little bit like Git. There's some repository that your node can access where it stores the data. Once the data is in there, the IPFS node is connected to the network,
Starting point is 00:43:05 and that network, I'll explain a bit more how it finds the network and so on, but it advertises to the network that it now has this new content added. You don't immediately, you don't transfer that content to anyone until they request it. And so this is, you know, different from what other people might expect about peer-to-peer systems, but the files don't move unless you explicitly request them. It's an important thing because it means that you're only downloading and accessing the stuff that you explicitly request. You don't have to worry about people adding bad content and it somehow showing up in your node.
Starting point is 00:43:39 None of that happens. And you can also add files through... you know, we have this, this IPFS node can also expose an API, right? Like you can expose an API at an endpoint. And here you can use something like HTTP, or you can use something like a socket or, you know, like a Unix domain socket or something. You just have some way of communicating with it either by command line or programmatically, and you add content to IPFS. And so you chunk it up and you add it in and you link it with these hash links. And now the graph is in your node and other people can access it, right? So say that I get back a link that I can
Starting point is 00:44:17 give to other people. And so when I give that link to other people or I place it in an application or something, when those other nodes try to access that link, they connect to the network, they ask the network, hey, who has this content? And they get back a response of a list of peers. And at the very beginning, it may just be one. And then they just contact that peer, your node, and retrieve the content from that peer.
Starting point is 00:44:41 And from then on, now that they have the content, they also advertise to the rest of the network that they can distribute it. There are interesting policy questions there where you can also make that optional, right? Like, you don't have to necessarily advertise content to the network or the way you advertise it may be dependent on the use case, right? So certain
Starting point is 00:44:57 applications may want to have their own sub-network that, you know, you're not leaking the content to everyone else. And you can also pre-encrypt the content. If people end up seeing the content floating by or something, they're kind of crawling or aggregating content, they can't read it. They just get this encrypted blob. So that's sort of how you use it.
Starting point is 00:45:20 So think of it a little bit like Git, where you can add content to a repository, and then now that it's added, it's accessible from any other IPFS node that can talk to your IPFS node. So you form this peer-to-peer mesh network with everyone. And this is where the DHT comes in to help organize how to find peers and access the content and all kinds of stuff. There's a whole bunch of interesting peer-to-peer protocols that can come in here. So in reality, Akifas sits on top of a subproject that we're writing called LibP2P. And this LibP2P is a, you know,
Starting point is 00:45:54 think of it kind of like it's a huge toolbox of interesting peer-to-peer protocols that are useful and valuable in various settings and use cases. And, you know, things like local DNS or, you know, WebSocket transports or WebRTC transports and so on. And you're able to piece these together into a nice connectivity fabric that we like to term like the peer-to-peer network.
Starting point is 00:46:21 And your IPFS node just sits on top of that and is able to find other IPfs nodes that have the content and they retrieve the content and now you can serve it right so anyway long story short the basic idea is you know from an interface perspective you add content to ipfs and once it's added it's only in the node that you added it to but then you can move that link or give it to other nodes and they can then pull the content and move it elsewhere. And now it's distributed to more than one node and all of those nodes can now help share it. So it's a little bit like, like a bit or it in that way. Can you just put it up there and say, give this to all nodes or any nodes? Like, do you have to be specific around which nodes that you're going to distribute through?
Starting point is 00:47:04 You can, you know, other nodes can, and this is something that we're still working on and figuring out exactly how to do, because there's many different constraints here, right? Yeah. The hard constraints here are that we can't make it so that you, by writing to IPFS, somehow get to send content to other people. Because that content could be bad, right? So imagine that you have some illegal content of some sort, and you add it to ipfs um that content should not automatically be sent to other people that should just be on your node um and it's only by other people requesting it that that you move it right and so um unless you could easily dos their server if you have if you just fill it with more content than this space they have or it just seems like
Starting point is 00:47:41 there's lots of bad things that could happen that way exactly so it's kind of like a pull model right and so the though what you can do though is you know you um once you add content you immediately send a message to another node saying hey i just added this content and if you can have some authenticated agreements like saying hey please replicate all content that i that i have right and so you can think of it a little bit like get pull and get push right so uh most of the functionality is flowing um and pushing uh has an authentication that needs to to be in place right like you shouldn't be just allowed to push to any arbitrary node um they have to sort of allow you to do this um and you know both of these may be your nodes uh you just need to make sure that the system knows that that's possible.
Starting point is 00:48:26 And so given some authentication, yeah, then you can push objects however you want and distribute them to other nodes. But then they're sort of available. So think of it kind of like one massive BitTorrent swarm that's moving around objects in one massive Git repo. And all of the objects there are accessible to your web browser so that your web browser can directly fetch content from this repository of objects. So you can put images, put web pages, put whatever, and you can now access them all through the browser. Seems like it'd make it pretty trivial to build your own private Dropbox in terms of
Starting point is 00:49:02 you just build the authentication around which computers can act as nodes. It's authentication and some UI user experience stuff. We're playing around with some of that. We're more interested in the lower level protocol stuff, but there is a
Starting point is 00:49:19 file browser thing that we're making that's pretty cool, actually. You can drag and drop uh files on the browser and it adds them and you can view them and send them to other people and so on there's a lot of interesting challenges around sharing links and encryption there that uh we're working towards um we don't have all of that stuff in place yet um we'll be doing that over this year and you know in the coming and and so on different groups are very interested in this um and so right now we're kind of focusing on getting the perf to be really good and um focusing on the on the public use cases uh but all the private stuff is
Starting point is 00:49:54 is just around the corner just by adding encryption yeah so it sounds like you know i look at this anything that's a file system whether it's distributed across all these nodes, or if it's just sitting on my little laptop right here, you know, it's, it's a building block. It's a part of a bigger system, right? And so it seems like what you guys want to do is lay a really good foundation and have all of these different aspects of things that you would want to build on top of it figured out so that they're, and then let people go nuts. What are some applications that you guys see being built on top of it? I just mentioned an idea of your own personal Dropbox type thing.
Starting point is 00:50:34 One thing that hit our radar recently was this everythingstays.com, which was a immutable and distributed Node.js modules. Seemed like it was a package registry built on top of IPFS. What are some ways that people are interested in using it or even possibly using it currently?
Starting point is 00:50:53 IPFS is meant to just interface with the web of today directly, right? So it's meant to just kind of rebase the web on top of this better protocol for moving around content. So we're doing a whole bunch of work to make sure that IPFS is accessible to people using web browsers today and that, you know, web developers don't have to think about a new model.
Starting point is 00:51:14 They're just doing the same kind of web applications that they're building today, but just on on IPFS. Right. So you can do pretty much anything that you would build on web app now on the public pass. It just depending on how content updates, you might think a bit of it a little bit different. And depending on how you want to do control, you may think of it a little bit differently. So let me give you some concrete examples. Right. So you can do file distribution really easily.
Starting point is 00:51:42 Right. Like this means just add static files, write any kind static file delivery. So CDN use cases and so on. That's pretty easy. The next thing on top of that is things like package managers. So you mentioned everything stays. IPFS began its life as a package manager itself. So the original goal was to make a dataset package manager, right? So add the nice versioning features that we have around Git and the nice distribution system of like something like BitTorrent and make it usable for moving around scientific data. Then I just kind of realized that this would be really valuable for the web as a whole. So, you know, really just focus more on that. And the thing here is like a package manager for code like NPM or a package manager for binaries like aptitude or something are all very similar.
Starting point is 00:52:28 And when you add hashing to how you make all of those links, you can decentralize the whole thing. You can think of package managers as moving around all of these static pieces of content, right? Whether it's code or binaries, and you can address all of those by hash. And so you can think of making a completely decentralized package manager on top of IPFS. And in fact, IPFS makes it extremely easy to do all of this. And so you can look at, we have one package manager called GX that you can take a look at. It's our solution for package management in Go. And we use it to build IPFS. It's a little bit pre-opinionated and it's early days still, but it's super exciting. So check it out and think about it. And yeah, of course, there's a bunch of things coming around NPM, like everything stays in
Starting point is 00:53:18 other systems, right? So we actually favor using, we were doing one where we are importing the entire NPM registry into IPFS and still using NPM, you know, as a centralized registry for the naming. But have all of the content be addressed by hash and distributed peer to peer so that, you know, you can, when you NPM install, you could download the files from other computers that are near you. Right. So imagine that you're in an office setting with like 50 other people or something, and you're NPM installing something, and you know that you've downloaded this stuff before or that other people in the same room have downloaded it before. There's no reason you have to go out to the backbone of the internet
Starting point is 00:53:57 and download it again. And so you can dramatically speed up all of this or maybe even make it work completely offline, right? So imagine that the connectivity in your in your office falls apart and suddenly you can um you can still install all these npm modules because uh you you already have them somebody has them in your office how do you know about versioning at that point how do you know you're getting the latest version or the the is it too late to ask that question i mean that's what i think about when you say stuff like that it depends on the on how the caching and the updating of the versions happens, right?
Starting point is 00:54:27 So one model here is that the registry, so the index of versions, right? So like how the name maps to a list of versions that have been published. That itself you can download and cache. And that's not very big. So you could cache all of that pretty quickly. And, you know, maybe you can't get the latest version that was published right now, but you can get the version that was published an hour ago, right? Or right before when the internet went down.
Starting point is 00:54:54 So you can think of accessing data as not a strictly online procedure that happens at that moment, but rather this kind of more asynchronous thing where everything is sort of more eventually consistent. That's one way of looking at it. It's not strictly eventual consistency. It's a different property, but I guess the, the push pull process provides the authentication to trust. Yeah, exactly. So, well, you have the hashes, right? So you, you have the, the hashes and the, and they're signing for updated for mutable things. There's a, you can sign them directly, right? So NPM can sign the registry and the updates to the registry and distribute those.
Starting point is 00:55:30 So, you know, they're, they're valid. And even in another more decentralized way, like you wouldn't then the individual authors could sign them, right? Like the individual authors could sign it with our key and you know, it's a valid new version. Just think about trust in that situation. Like if i can bypass the the backbone of the internet and trust local network uh even if it's an older version
Starting point is 00:55:51 what allows me as the user to feel comfortable to know like well hey i'm offline but i can still trust what i'm getting yeah yeah that's what i think so you can do things there like there's a whole bunch of interesting challenges that are more application dependent on something like a package manager what you would want to do is expose what versions are available. And then you have to know that these are the only versions available in the network that you can see right now, but the new ones may have been published. And so you can attach dates to that and know when they were published. So if you think that there might be newer stuff, then you know whether to use them or not. So there's some interesting challenges there, but we can think about data in a more
Starting point is 00:56:29 distributed sense and offline first. So these are the same kind of questions, by the way, that people were wondering about Git at the beginning, right? So when Git was getting started, everyone was really worried. They're like, wait, what? What do you mean I can just ask somebody else's repository for the data? Like, can't I don't't I have to go to like the central server? And the reality is no, you can make sense out of all these pieces of information. The central server is really good to maintain, you know, the latest copy or the latest, have some notion of what the latest value that we want to agree on is. But you can get the pieces of data from anyone.
Starting point is 00:57:00 And even those updates can be distributed through peer-to-peer. Cool. So yeah, package managers are another great use case. One really exciting use case that we like a lot is distributed chat, right? So we use IRC to, you know, we have this IRC channel, Pound IPFS, come hang out to communicate and so on. But we also would like to be able to chat when we are, say, disconnected from the internet. Like, for example, if we're traveling together and we are, I don't know, in a train or maybe just in some poor connectivity location, we would like to be able to chat. And, you know, things like IRC or even things like Slack and so on don't work in that use case because you have
Starting point is 00:57:41 to connect to the backbone, right? And all of the messages are sent through this backbone. But what if you could have a chat room that just works wherever you are with the peers that are around you? And so that's where we're creating a thing called Orbit. And Orbit is a peer-to-peer chat application. It's all entirely on top of IPFS with dynamic content. And the way it works is that all the messages are individual IPFS objects. You have a message log that points to all the data. So you can think of it a little bit like a blockchain sort of, it's not a blockchain actually, it's a better data structure. It's called a CRDT.
Starting point is 00:58:11 CRDTs are a class of data structures. They're amazing. I could probably spend whole days talking about them and I highly encourage you to have a future talk and interview about CRDTs with some of the CRDT experts out there. But it's a really good way of modeling data and IPFS allows and supports building CRDTs with some of the CRDT experts out there. But it's a really good way of modeling data. And IPFS allows and supports building CRDTs on top of it.
Starting point is 00:58:30 What's the stand for? CRDT stands for, depends on who you ask, it actually stands for a couple of different things. So it could be convergent replicated data type or conflict-free replicated data type or commutative replicated data type. And I think there's a different version for the R too. They're all different words for expressing the same set of principles.
Starting point is 00:58:49 Depending on which one you use, the emphasis and the implementation changes. So the systems actually look a little bit different depending on what you call them. But the basic principles are the same and the constructions are isomorphic, meaning that you can build the same kind of stuff on top of each other. And they give you the same properties. So what these mean is that imagine if Git had no conflicts, like never had any kind of merge conflicts. So this is more like, you're used to, say, Google Docs. So Google Docs uses a thing called operational transforms. And this means that when you make edits on a Google Doc, all the operations are guaranteed to never conflict.
Starting point is 00:59:27 That means that they can commute or in the end converge. So they're all convergent. You can apply them in whatever order and you achieve the exact same result. CRDTs are a better version of operational transforms, or at least you can think of them that way. It's a different research lineage, but they're used for the same kind of stuff. And you can do things like Etherpad type of data structures, but you can also do something much more general, like a chat application, or even something like a whole social network or email and so on, right? So it's a really striking new distributed systems type thing and super valuable research
Starting point is 01:00:05 that is just now being turned into applications. And so we built a whole chat client using CRDTs on top of IPFS. And it's really cool, right? Like you can just load it up and start chatting with other people on the IPFS network and all of the content is moving through IPFS.
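The no-conflicts property is easiest to see in the simplest CRDT, a grow-only set. Orbit's message-log CRDT is more sophisticated than this, but the convergence idea is the same.

```python
class GSet:
    """Grow-only set CRDT: merges commute, so replicas that apply
    updates in any order converge to the same state."""
    def __init__(self, items=()):
        self.items = set(items)

    def add(self, item):
        self.items.add(item)

    def merge(self, other):
        # Merge is just set union: commutative, associative, idempotent.
        return GSet(self.items | other.items)

# Two chat replicas receive messages in different orders / partitions.
alice = GSet()
alice.add("hi")
alice.add("train wifi is down")

bob = GSet()
bob.add("hi")
bob.add("still here?")

# Merge in either order: same result, no conflicts to resolve.
expected = {"hi", "train wifi is down", "still here?"}
assert alice.merge(bob).items == bob.merge(alice).items == expected
```

Because merge is a union, a node that was offline can simply merge whatever it missed when connectivity returns, with no coordination step, which is exactly the disconnected-chat scenario described here.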
Starting point is 01:00:21 So a lot of people were wondering, hey, you know, IPFS is really cool for static content, but what about dynamic content? And yeah, we can do that too. The secret of making it fast there is we use PubSub. So this is the one piece that we're still, it's not fully there on the public release of IPFS yet, but we're still working on the interfaces
Starting point is 01:00:39 and how that will work. But yeah, PubSub, making it possible for some IPFS nodes to move around content to each other really quickly is a big part of making this work really nicely. So going back to the office setting, imagine that you're talking to each other in your team chat, and so imagine that the internet connectivity falls apart. You should still be able to talk to each other. You still have computers, you still have a network that works in the building. Why is it that you can't talk to each other? That's just a very silly problem. And so IPFS is meant to solve all these kinds of problems, like decentralizing the web.
Starting point is 01:01:13 So one of the fundamental problems with how we're using the web today is that websites and links in the web or all the content and APIs and so on, the way they work is that they force you to go to the backbone and talk to people in the backbone to make sense of the data. And this, you know, creates this huge central point of failure, both central point of failure and central point of control that those websites own that data. And if they disappear or they cancel the service or, you know, they're just inaccessible because the links between you and them are failing, suddenly you cannot use this application, right? And this is a deeply unsettling problem, right? So on the surface, it's like, well, you know, they're providing a service and usually, you know, a lot of times for free, but sometimes, you know, you pay them and, you know, it's a best effort service, right?
Starting point is 01:02:02 And so if it doesn't work because there's a major disaster or something, well, tough luck. But at the same time, most of our communications are starting to be moved through the web, right? So think about how you talk to your co-workers or, more importantly, your family members, right? So you probably use some chat system. And if you're using this chat system and there's some disaster or something, or just the service falls apart that day, suddenly you can't talk to them anymore. And now this superpower that you had, this amazing ability to just talk to them really easily and quickly, is gone immediately, surprisingly, suddenly. And so we need to, as engineers, restructure
Starting point is 01:02:44 how we build web applications to make sure that this is not a problem. That we build resilient and decentralized applications so that these messaging platforms should be able to continue operating even in those cases. If the internet works, if I have the ability to have my computer contact yours, that should be enough to be able to communicate with you. And this happens for messaging systems. It happens for web applications. It happens for chat systems in general, things like GitHub and so on. I recall GitHub has been under a lot of attack in the last couple of years. Last year, it was taken down by a CDN problem. Somebody injected some bad code into a CDN, which caused a lot of people to attack GitHub,
Starting point is 01:03:27 right? And even earlier this year, it was taken down again by some other problems. So suddenly in those times, a whole bunch of people were kind of frustrated by the centralization of GitHub and said, hey, why don't we just like decentralize GitHub and have it work over something like IPFS, right? And it turns out that IPFS can actually help tremendously in this problem. So on the first hand, if a CDN was using something like IPFS, that initial attack vector
Starting point is 01:03:51 would just not work. The attack that people did last year of injecting some code into the CDN would just not happen at all because all the code would be certified and checked by hash. The second part is that even if you manage to attack GitHub and take it down, if you were properly decentralized, then other peers that have the content can help serve it.
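The "certified and checked by hash" idea is content addressing: the link is derived from the bytes themselves, so any tampering changes the address and the fetch fails verification, no matter which peer or CDN served it. A minimal sketch of that check (plain SHA-256 here for illustration; IPFS actually uses multihashes):

```python
# Content-addressing sketch (illustrative; IPFS uses multihash-based
# addresses, but the verification idea is the same).
import hashlib

def content_address(data: bytes) -> str:
    # The "link" is computed from the content itself.
    return hashlib.sha256(data).hexdigest()

def fetch_and_verify(expected_address: str, data: bytes) -> bytes:
    # Any peer (or CDN) can serve the bytes; the receiver re-hashes them
    # and rejects anything that doesn't match the requested address.
    if content_address(data) != expected_address:
        raise ValueError("content does not match its address; rejecting")
    return data

original = b"console.log('hello');"
addr = content_address(original)

assert fetch_and_verify(addr, original) == original  # untampered: accepted
try:
    fetch_and_verify(addr, b"evil();")  # injected code: rejected
except ValueError:
    pass
```

This is why the CDN-injection attack doesn't work against content-addressed links: the attacker can change the bytes, but not make them hash to the address the client asked for.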
Starting point is 01:04:10 And so it doesn't matter if you take one host down. Other people should be able to serve the same exact content. And maybe it's a little bit slower or something, but the important part is that content is still there. And so this is one of the important parts of decentralizing the web, right? So IPFS in a big way is becoming a big push to make sure that the web itself is decentralized. And the thing is that there are certain problems when centralized websites impose a point of failure and a point of control.
Starting point is 01:04:37 And so if we use a better model of moving around the data, then we can save ourselves from these deep problems of the web. We can make it more resilient. We can think back around actually controlling more as a user where the data ends up and who uses it and who has the ability to address it. So one way of summarizing in a big way what IPFS is about is imagine when you cite a book or you go and find a book, imagine that people told you that the only way that you could find a book is by giving you a bunch of directions of how to find it at a specific library. And suppose that you live in New York and they tell you that to find this book, you have to go to San Francisco and you have to go to a specific library in San Francisco and get a book there.
Starting point is 01:05:22 And that's the only possible way of reading this book. And it's really silly, right? Like, why? Why couldn't you just get an ISBN number or the title of the book and look for that book elsewhere in a different library? And like, that's what IPFS is about. It's about making it possible to get the content from whoever has the content and making digital information move like physical information, right?
Starting point is 01:05:42 Like where, you know, you can get a copy anywhere and you could get it. As long as somebody has the copy to give you, you should be able to get it and use it. And this has vast, deep implications for how content moves, how resilient the network is, how applications operate on top of it, and the points of control around the data, right? So imagine if I could give you a link to it. So suppose that people use things like Twitter or Medium to publish a lot of their thoughts.
Starting point is 01:06:09 And these are really valuable forms of expression that people have. A bunch of important communications happen over these networks. So imagine that those services go away, and suddenly all of those posts or tweets or whatever disappear, or all of the links to them break. So even if you can download your data, the links will break. But what if instead, when you add data, you could get a link directly to the data itself? Not going through an intermediary, not going through
Starting point is 01:06:36 twitter.com, but rather going directly to the tweet or the Medium post, right? And being able to move that around without having to trust that these organizations will continue to exist decades from now. Yeah. Um, I was going to ask you the question on what you had said in your talk at Stanford, that the future of the web could be in danger. But it sounds like you've pretty much answered that by these examples of, you know, the danger of the future of the web could be that without decentralization, we kind of give up control, as you said, to these networks. And whenever they decide to go away, whether it's because there's an internet outage or
Starting point is 01:07:12 a connectivity issue or something more serious like a business issue, and let's say Twitter fails as a business and goes away, you've got all this collective effort that is all this expression, as you had said, which is valuable, that just now goes away. And so is that what you meant by the future of the web could be in danger by the fact that if we don't think about decentralizing content networks and data networks like that, that we could be giving up too much control? And there's a way we can actually build in the security and control for the long term by leveraging IPFS? Yeah, exactly. Right. So these are important concerns about how the data that we create and publish moves through the network. Right.
Starting point is 01:07:53 And how we address the data is a huge part of this. Right. Today in HTTP, we address the data through a DNS domain name, which maps to an IP address. And that means a specific location on the network, right? It means a specific set of computers, usually controlled by one organization. And, you know, whatever happens means that, you know, like that domain name could, you know, that business could go away, that organization could go away, that service could be canceled. You know, think about how many services have disappeared.
Starting point is 01:08:25 And, you know, you suddenly wake up to a notice one day that tells you, hey, the service is going to be taken down in a month. You have one month to take all your data and move it elsewhere. Right. And like, what about all of the links that you gave to other people? Right. Like suddenly all of that breaks.
Starting point is 01:08:39 And so we are, you know, tired of that kind of model. And we don't think that's right at all. You know, there's a whole bunch of concerns here. You can't force people to continue providing a service that they just can't in terms of a business. Like that makes sense, right? But there are ways in which we can model how to structure and link to the data such that it doesn't matter if that service goes away.
Starting point is 01:09:04 The data is still there and the data can still be accessed and the data can still be backed up. Right. And so like, that's a big part here, like making sure the links don't break, making the links able to last in the long term. Gotcha. And so, yeah, a lot of this is part of the archival efforts as well, right? So think about being able to archive all this content.
Starting point is 01:09:27 Yeah, putting it into the way the network works is, I mean, if it's built in, you don't have to really think about it. It's just part of the way the file system works. And, you know, we could be in an age where, you know, five years down the road, this is getting more widely adopted and more networks use it. And then, you know, it might even take over a larger portion of what we know as the web today. And it's built in and you don't really have to think about data being lost or networks closing or a file not being there because it resolves no matter what, given the protocol.
Starting point is 01:10:06 uh, in a planet, interplanetary travel is fun. And so are files to sleep. So, you know, why not combine the two? I had to put that joke in there and I,
Starting point is 01:10:15 I swear, Jared, I kind of wish we can play the beastie boys. They probably sue us though, which is a shame. Uh, cause I want to bear use. You just got to keep it short.
Starting point is 01:10:23 I got to keep it short. And one, I'll also, I did some looking up that, uh, IGFS is, uh, a shame. That's fair use. You just got to keep it short. You got to keep it short. Also, I did some looking up that IGFS is while it may not be suitable, there's still time to change. You could change to IGFS. Yeah, maybe for April Fool's Day next year
Starting point is 01:10:39 we'll revamp everything. There you go. There you go. Bobo, anything else you want to, anything else you want to mention before we close the show? We got about, about two minutes to close. Yeah,
Starting point is 01:10:50 I think I will. First of all, thank you very much for having me. This was a really exciting discussion. You know, there's lots of stuff to talk about. The project is really, really big.
Starting point is 01:10:59 One, I wanted to give, you know, first of all, one huge shout out to the entire community that is building this out. Right. So this is not, you know, this is not really my project anymore. This is a project that everyone owns.
Starting point is 01:11:11 It's a huge, large project with lots of people contributing, lots of people making it happen, lots of ideas. I get to sit here and represent other people, but there's some amazing contributions from all sorts of people around the world. And with that, our open source community is super open. And there's open source and there's open open source. And we're like open open source. You can come in and tell us, please file bug reports and all that kind of stuff.
Starting point is 01:11:39 But also tell us what you would like to see, what features you would love to have, if anything needs more documentation. Come and hang out in IRC, come and contribute on GitHub. IPFS will become what you make of it. It's a big call to people out there to come and join us and help remake the web in a much better and decentralized way. So yeah, welcome. And everyone's welcome.
Starting point is 01:12:05 And yeah, we definitely look forward to the project growing and so on. I'm looking forward to seeing some listener pick this up and email us back with something they created using it. And then that way the complete circle can be made. And then we can have them on the show talking about how they leveraged IPFS to build the next big thing or whatever, because that's the best way to do it. Right.
Starting point is 01:12:29 Yep. So listeners out there, we thank you for tuning into this awesome show on, on rebuilding the web basically, and, and omitting some danger that could be in our future if we don't decentralize. So if something is,
Starting point is 01:12:40 has been interesting to you about networks that this problem has been there, but now IPFS solves that problem, then go build it or at least think about it and share that back with us and tell us what you think. But Juan, thanks so much again for joining us today and covering this conversation. But that's it, fellas. So let's call the show done and say goodbye. Goodbye. Thanks, Juan. Yeah. Thank you very much. Bye. I'm out.
