Epicenter - Learn about Crypto, Blockchain, Ethereum, Bitcoin and Distributed Technologies - Joel Thorstensson: Ceramic – Building the Dataverse

Starting point is 00:00:00 This is Epicenter, Episode 482 with guest Joel Thorstensson. Welcome to Epicenter, the show which talks about the technologies, projects, and people driving decentralization and the blockchain revolution. I'm Friederike Ernst, and I'm here with Meher Roy. And today we're speaking with Joel Thorstensson, who is the technical co-founder of 3Box Labs, the creator of Ceramic. And Ceramic is this data storage solution for Web3 projects, and we'll talk about this in just a little bit. But just before, I'd like to tell you about our sponsor this week. Our sponsor this week is Omni.

Starting point is 00:00:56 Omni is your new favorite multi-chain mobile wallet. Omni supports more than 25 protocols, so you can manage all of your assets in one place. But what's really special about Omni is what you can do inside the wallet. When you get yield, Omni allows you to get the best APY with zero fees and three taps. Need to swap? Omni aggregates all major bridges and DEXes so you can bridge and swap across all supported networks in one transaction directly in your wallet. Love NFTs? Omni offers the broadest NFT support

Starting point is 00:01:25 of any wallet so you can collect and manage your favorite NFTs across all chains in one place. Omni truly is the easiest way to use Web3, and it's fully self-custodial, meaning you never have to trust anyone with your assets other than yourself. And they support Ledger as well. Give Omni a try at Omni.com. Joel, it's a pleasure to have you on. Yeah, thanks. Great to be here. I remember you from ages ago at ConsenSys. So clearly you've been in the ecosystem a while.

Starting point is 00:01:55 Tell us what you've been up to. Yeah. Well, I started playing around with Ethereum before the launch in like 2015, did a bachelor's project that led to an internship at ConsenSys, and eventually started working part-time at ConsenSys as I was finishing my master's in complex adaptive systems. And so at the time, the project that I was working on at ConsenSys was called uPort. So this was this early identity-focused project in the Ethereum ecosystem.

Starting point is 00:02:31 So some of the older listeners might remember that. And at uPort, that's essentially where I met my co-founders. And we started actually incubating inside of uPort what eventually has become the Ceramic network. Right. So how did you come up with the idea of Ceramic? Or rather, what is the grand idea behind Ceramic? Yeah.

Starting point is 00:02:55 So one thing that we noticed with uPort was that we were building this identity solution, and we kind of had to build our own wallet. And even back in like 2016, 2017, it felt like it doesn't seem right to compete with MetaMask and all of the other wallets that were around at the time. And so one of the inquiries that led to the creation of our first prototype, called 3Box JS, was essentially like, hey, can we build a system that allows developers to build more data-rich applications, but that works with any wallet? And so we were prototyping on that, and that led to this 3Box JS SDK.

Starting point is 00:03:43 And after that having been used by a bunch of people over time, we drew some main insights from that experience and created Ceramic. Thinking back a while ago, the first thing that I kind of became aware of under the 3Box umbrella was this chat box. And we actually, at the time, we kind of integrated it into a prediction market platform that we were offering at the time. I mean, still around. You can still use them.

Starting point is 00:04:13 But I mean, basically it was just a way to kind of put comments and stuff. So how did you get from there to, you know, this very comprehensive data storage solution? The JavaScript SDK that we were building was very ambitious. We were trying to build this completely decentralized in-browser system that would allow you to make some basic social features like comments and profiles and things like this, and have that be live inside of your browser. So that was using a JavaScript implementation of IPFS at the time. And some of the learnings we got from that is that these kinds of data structures that enable these really local-first types of applications are really powerful primitives. And they have similarities to blockchains in that you have these verified

Starting point is 00:05:11 data structures that can be shared across multiple nodes. But actually doing it and starting to build that in-browser first is really difficult. And so we took the core insights of, like, these data structures, how we can deal with identities, and how we can basically use updates and events that users create over time to create an actual database, and took this and built a node application that your front-end applications can talk to. So in the same way you talk to, like, an Ethereum node, you can also talk to a Ceramic node that could deal with more of, like, the data-rich features. I was going through all the Ceramic documentation on the web today.

Starting point is 00:06:05 And maybe I'll tell you my imagination of why you need something like Ceramic, or how this idea comes about, and maybe you could corroborate if I understand it right. So to me the story really starts with just the observation that, you know, the fact that I have data on Ethereum and then I can use multiple front ends or multiple user interfaces for my accounts is really cool. And you always start to wonder, why doesn't, like, the entire internet stack work that way? Meaning you start to wonder, okay, why doesn't social media work that way? I can put my data onto Facebook and then I can use

Starting point is 00:06:58 some other user interface to Facebook. And this is quite a standard thought experiment many of us have done, and there are many projects that go in this direction. So I think one of the things that one starts to realize, when you want this kind of composability, that there should be multiple user interfaces to some data I have, is you don't really need blockchains or consensus all the time. So if you have financial data, that data should lie on Ethereum, or if you have a DAO, that financial logic should lie on Ethereum. But if it's my personal data that I want to put somewhere and allow lots of people to build applications,

Starting point is 00:07:53 then my personal data may not need the blockchain at all, and I don't necessarily want to pay the cost of the blockchain consensus at all. So you start to need some kind of data layer where I can push my data, and then other people can build applications on top of that. So that's kind of the direction Ceramic goes toward. But what Ceramic is kind of also adding is the observation that for good interoperability to exist among multiple applications, we need, you know, standard ways of doing identity, standard ways of doing profiles, standard ways of doing X, like all of these news feeds and things

Starting point is 00:08:47 like that, and you're allowing for such standards to exist. So is that right? It's a data layer that enables people to build UIs, and for these UIs to be interoperable with each other? Yeah, I think that that's a good overview. I think there's a few things there I could double-click on. So first of all, like, the financial aspect of blockchains. Blockchains are hard to scale because we have this requirement of what in distributed systems is called strong consistency.

Starting point is 00:09:24 That means that basically all nodes in the network need to be able to have an agreement all the time on what the state of the system is. And this is how we prevent double spends. And in a system that doesn't deal with any kind of assets, but it's more just data that's produced by the user, we can kind of loosen that constraint because we're fine with what is called eventual consistency. And so we don't need to always have all the nodes have the same agreement on what the state of the system is.

Starting point is 00:09:58 And the nice thing is that we can also have different nodes only synchronize different aspects of the network. So you might care only about a subset of users or a subset of data models in the network. And this is really what we achieve with Ceramic. Yeah, and then the second point around, like, standards and things like this, I think that's something that is really tricky. And our approach to it has been to provide examples,

Starting point is 00:10:29 but we don't want to set the standards, right? We want the community to come up with the things that work for their applications, because ultimately we are not going to be able to know what different applications need. And so the approach we take to it in our graph database product, which we're building on top of Ceramic, called ComposeDB, is that developers can come and create data models. They can onboard their users and have their users write data to these data models. And then other developers can import these data models into their applications and kind of get the onboarding of those users for free because the data is already there.

Starting point is 00:11:11 So that's the composability that you were talking about. I have lots of questions for lots of different facets here. But I think I want to step back and kind of look at the larger landscape of data solutions in the Web3 space first. So I think our listeners may be familiar with products such as Arweave and IPFS and Sia. So how would you kind of contextualize Ceramic within that landscape?

Starting point is 00:11:49 How is it different or similar to those? So yeah, all of the three products you mentioned, they have focused first on just, like, how can we store files and how can you store, like, pieces, chunks of data in an efficient way that can scale. And those are all very useful things to have in the ecosystem; like, storage of massive amounts of data is incredibly important. Our focus from the start has been on users, their identity, and, like, how to make these systems easy to use for people. So in Ceramic, instead of

Starting point is 00:12:32 creating blocks where you either have deals with miners to store data, or you have big blocks that include a bunch of data, as some of these systems do, we don't really have blocks in Ceramic. We rely on the security of Ethereum. And instead, each user creates an event stream of actions they take in the network. This event stream, you can think of kind of like a micro ledger that is signed by the user's key. And all of these updates, we can verify that they come from the user. And since we have like all, like if we take that together,

Starting point is 00:13:13 we can have a view of multiple users with all their independent event streams. And we can compose that into a database view. So it's like a different approach to the architecture. But does that mean that basically only the event creator can write to the event stream? Because basically if different people have, like, write permissions, this won't work, right? Yeah. So in Ceramic right now, each event stream has a controller, which is essentially a user

Starting point is 00:13:46 account. So right now there's support for, I think, three different blockchain wallets. Most users use Ethereum-based wallets. And so every event stream is controlled by one account. Then you can take multiple event streams and listen to all these events and create a combined view of that. Okay, so say one of the event streams is, say, my Twitter output, so things I tweet, but also things I like and things I retweet and so on. This would kind of fit the, I don't know, whether broader social media data structure or Twitter data structure, and then it can be compared to other people's data structures and can be compiled into a view of a decentralized version of Twitter where basically I can say how I want to view things or which things I want to prioritize or which things I want to be shown.

Starting point is 00:14:45 Because basically the way that social media works right now is that, I mean, obviously there's, you know, very complex algorithms at work to kind of calculate what to show you. But what exactly they do and how they operate and what they prioritize, this is not visible. And there's no competition as to this. So basically would Ceramic enable other people to kind of build on the same data streams and kind of showcase this differently or prioritize things differently? The power of Ceramic in this case is that we can actually start to mimic a little bit more how the architecture of Web2 social media works, because Web2

Starting point is 00:15:37 social media networks like Twitter or Facebook, they don't scale on a strong consistency model like a blockchain. They have some databases, they have event streams in their systems, and they have a bunch of microservices that take care of different tasks. So we can imagine a very primitive social network on Ceramic that's just like, oh, here's my tweets. And then when I follow Friederike and Meher, I just compose that into my view. But if I follow, like, thousands of people, that's not really going to be efficient. But the cool thing with Ceramic is that there could actually

Starting point is 00:16:13 be a service somewhere that ingests all of these streams of my followers and, as a microservice, runs some computation over it and outputs that in a new event stream. And then I consume that. And this computation could be done in a verifiable manner. Either it's a deterministic computation that I can rerun and see that it was correct. Or maybe in the future, if we can have very efficient ZKPs, like, that could even be that.
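
A minimal sketch of the "deterministic computation that I can rerun" idea from the passage above. Nothing here is Ceramic's API: `build_feed`, `digest`, and `verify_feed` are hypothetical names, and the post shape (`ts`, `text`) is an assumption made purely for illustration. A feed provider publishes its output, and any consumer can recompute it from the same input streams and compare.

```python
import hashlib
import json


def build_feed(streams, followed):
    """Deterministically merge the posts of followed accounts into one feed."""
    posts = [
        {"author": author, **post}
        for author in sorted(followed)
        for post in streams.get(author, [])
    ]
    # Sort on (timestamp, author, text) so the result never depends on input order.
    return sorted(posts, key=lambda p: (p["ts"], p["author"], p["text"]))


def digest(feed):
    """Stable hash of a feed, used to compare two independent computations."""
    encoded = json.dumps(feed, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def verify_feed(streams, followed, published_feed):
    """Re-run the computation and check that the provider's output matches."""
    return digest(build_feed(streams, followed)) == digest(published_feed)


# Hypothetical input streams, one list of posts per account.
streams = {
    "friederike": [{"ts": 2, "text": "hello"}],
    "meher": [{"ts": 1, "text": "gm"}, {"ts": 3, "text": "on Ceramic"}],
}
followed = {"friederike", "meher"}

honest = build_feed(streams, followed)
tampered = honest[:-1]  # a provider that silently drops a post
```

Here `verify_feed(streams, followed, honest)` holds while `verify_feed(streams, followed, tampered)` does not, which is the whole point: the consumer does not have to trust the provider, only to be able to rerun the function.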

Starting point is 00:16:44 But for the time being, like, having compute actually attributed, in the Ceramic stream, to this microservice provider actually gives us some better trust in the system. And I can actually choose which of these service providers I want to build my stream of tweets or what have you. Okay. I think one thing that I kind of don't understand yet is, I mean, you do distinguish between different kinds of consensus. But why do you need consensus in the Ceramic network at all? I mean, if it's just a decentralized storage layer, and it can be proven that basically my data is stored,

Starting point is 00:17:38 why does it need consensus? Well, so we need some basic form of consensus, right? Like if your node and my node get the exact same events, we want to be sure that we end up in the same state. If we don't end up in the same state, that's bad. So it's not like a global consensus in the sense of blockchains, where there is an agreed-upon state that everyone in the network agrees on. It's more like if we consume the same events, we end up at the same state. So it would be terrible.
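
The property described here (same events in, same state out) can be sketched in a few lines. This is an illustration under stated assumptions, not Ceramic's actual state machine: the event shape (`stream`, `seq`, `changes`) and the function `apply_events` are hypothetical.

```python
def apply_events(events):
    """Reduce a set of events to a state that does not depend on arrival order."""
    state = {}
    # Order on a key every node can compute by itself: (stream id, sequence number).
    for event in sorted(events, key=lambda e: (e["stream"], e["seq"])):
        state.setdefault(event["stream"], {}).update(event["changes"])
    return state


# Hypothetical events from two independent streams.
events = [
    {"stream": "profile:joel", "seq": 0, "changes": {"name": "Joel"}},
    {"stream": "profile:joel", "seq": 1, "changes": {"bio": "Ceramic"}},
    {"stream": "profile:meher", "seq": 0, "changes": {"name": "Meher"}},
]

node_a = apply_events(events)                  # events received in order
node_b = apply_events(list(reversed(events)))  # same events, different arrival order
```

Both nodes end up with identical state even though they never talked to each other, because the reduction is a pure function of the set of events.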

Starting point is 00:18:10 Okay. So it's more like a checksum. Yeah. In distributed systems, this is called consensus, like that your nodes can actually arrive at the same conclusion. Okay, I think then I just have a very different mental representation of consensus, because to me, consensus kind of is by default a global thing, but I think this is maybe just a corruption of kind of how the actual technical term is used by unlearned communities.

Starting point is 00:18:38 Okay, so then I kind of understand that part. You said that you guys build on Ethereum. So basically what's kind of the connection between Ceramic and Ethereum? Yeah. So I mentioned these event streams. They're signed by end users. So they're good for like, OK, now we know we have attribution to who created what and who wrote what

Starting point is 00:19:01 into the network. But we also want some guarantees about when certain events took place. And this is where Ethereum comes into the picture. So every once in a while, event streams are anchored into the blockchain. And what this means is that there is a hash or some other kind of vector commitment that's included in the blockchain. That basically allows any consumer of this event stream to convince themselves that,

Starting point is 00:19:31 okay, this event was published at least at this point in time. And obviously, like, making one Ethereum transaction per event stream update would not really be scalable. So what we do is we create a Merkle tree that batches a bunch of updates to a bunch of different streams and puts the root of that Merkle tree on-chain. So earlier you mentioned that data in Ceramic will be eventually consistent, which kind of means that if, let's say, I push two updates, so let's say I'm using the Ceramic Twitter and I push two posts

Starting point is 00:20:16 one after the other, ultimately the Ceramic network will decide on which post came first and which came second. Right. And there might be a span of time where the network hasn't made a decision, but ultimately it will make this decision. That's how I think of eventual consistency.
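
The anchoring step mentioned a moment ago (batching many stream updates into one Merkle tree and putting only the root on-chain) can be sketched as follows. This is a generic binary Merkle tree over SHA-256, written as an assumption-laden illustration: the source does not describe Ceramic's actual tree layout or hash function, and `merkle_root`, `merkle_proof`, and `verify` are hypothetical helper names.

```python
import hashlib


def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root(leaves):
    """Root hash of a binary Merkle tree over the given leaf payloads."""
    if not leaves:
        raise ValueError("need at least one leaf")
    level = [sha256(leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # duplicate the last node on odd-sized levels
        level = [sha256(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


def merkle_proof(leaves, index):
    """Sibling hashes needed to recompute the root starting from leaves[index]."""
    level = [sha256(leaf) for leaf in leaves]
    proof = []
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])
        sibling = index ^ 1
        proof.append((level[sibling], sibling < index))  # (hash, sibling_is_left)
        level = [sha256(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        index //= 2
    return proof


def verify(leaf, proof, root):
    """Check that a leaf is included under the given root."""
    node = sha256(leaf)
    for sibling, sibling_is_left in proof:
        node = sha256(sibling + node) if sibling_is_left else sha256(node + sibling)
    return node == root


# Hypothetical batch: the latest update of three different streams.
updates = [b"stream-a:event-3", b"stream-b:event-1", b"stream-c:event-7"]
root = merkle_root(updates)       # the only value that would go on-chain
proof = merkle_proof(updates, 2)  # handed to whoever wants to check stream-c
```

One 32-byte root covers the whole batch, and `verify(updates[2], proof, root)` lets a consumer of a single stream convince themselves that their event was part of the anchored batch without seeing the other streams.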

Starting point is 00:20:42 Yeah. And in the case of your personal posts, you actually, when you make post one and then you make post two, your post two will actually point back to your previous post. So like for your personal things, it will be like ordered. But between like two different users, it's not ordered in the same way. And then we would rely on these anchors. Okay. So how do you reach eventual consistency in the network?

Starting point is 00:21:07 Is it down to the anchor in the Ethereum blockchain, meaning some Ceramic node at some point of time is going to push an anchor, and then whatever ordering they did is the ordering of the Ceramic network? Is it like that, or is there a different mechanism? Yeah, so based on the anchor you can look at the blockchain and see, like, hey, this block that was produced, what's the block height and what's the timestamp in this block? And so if you have two conflicting events in the network, you can look at which one came first, and then you would know how to choose. I'm actually curious, so there have been these computer science data structures called, like, CRDTs, conflict-free replicated data types, where essentially it's a data structure different

Starting point is 00:22:03 people can push updates to, and there will be eventual consistency of the data without there being, like, an active voting-based consensus like in proof-of-stake networks. So you have ways of getting consistency of data in a decentralized network without, you know, practical Byzantine fault-tolerant consensus. Are you using something like that in Ceramic as well? Or are you actually getting eventual consistency via some other mechanism? Yeah.

Starting point is 00:22:42 So CRDTs are something that we're quite familiar with. We're not actually using them yet, but it's something we're intending to use as we improve the protocol. So right now, if we think about a single event stream, a single event stream is only allowed to have one canonical history, kind of in the same way a blockchain has, like, one canonical chain. And if there is a conflict, if there is a fork in this event log, then we would basically choose the fork that was anchored earliest. And this actually is the property that allows us to do key rotation in a secure way. And that's a whole other rabbit hole, which I have an article on our blog on, by the way. But for many cases, we don't actually need to choose one of these forks.
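
The fork rule described above (one canonical history per stream, and on conflict the fork that was anchored earliest wins) might look roughly like this. It is a sketch under assumptions: the `Event` fields and `pick_canonical` are hypothetical, anchors are reduced to a bare block height, and breaking ties by event id is an illustrative choice that the source does not specify.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Event:
    id: str
    prev: Optional[str]          # id of the previous event in the same stream
    anchor_block: Optional[int]  # block height of its anchor, None if not anchored yet


def pick_canonical(candidates):
    """Among conflicting events that share the same prev, keep the earliest-anchored one."""
    anchored = [e for e in candidates if e.anchor_block is not None]
    if not anchored:
        return None  # nothing is anchored yet, so no decision can be made
    # Tie-break on id (an assumption) so every node makes the same choice.
    return min(anchored, key=lambda e: (e.anchor_block, e.id))


# Hypothetical fork: two different events both claim to follow e0.
genesis = Event("e0", None, 100)
fork_a = Event("e1a", "e0", 120)
fork_b = Event("e1b", "e0", 115)

winner = pick_canonical([fork_a, fork_b])
undecided = pick_canonical([Event("e1c", "e0", None)])
```

Here `winner` is `fork_b`, because block 115 comes before block 120, while `undecided` is `None`: until something is anchored, nodes have no shared clock to decide with.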

Starting point is 00:23:37 We can actually just, like, do a merge and then do CRDT-based logic to figure out the complete ordering of these events. So that is an improvement that we're planning in the protocol. Okay. So essentially now the way I'm imagining Ceramic is, okay, there's this huge lake of data. I can push my event stream to it. And by virtue of how my event stream is designed and the fact that ultimately this network will anchor something on the Ethereum blockchain, there's kind of going to be eventual consistency in the network

Starting point is 00:24:16 where the network can agree on what events came first and what events came second. Roughly, that's my picture. Yeah, exactly. So if you have two nodes in the network and they observe the same events, even though they might not be talking to each other, but they have seen the same events, they can arrive at the same conclusion. So let's talk about the nodes in the network, right? So basically, what are the requirements for running a Ceramic node? Yeah. So running a Ceramic node in itself doesn't have, like, super big requirements, because when you start a Ceramic node, you don't have any data on it. And then you have to tell your node that, hey, I want to subscribe to this particular event stream.

Starting point is 00:25:08 And so if you're familiar with something like IPFS, where you have to pin individual objects, you have to subscribe to individual event streams in Ceramic. So for most users of Ceramic right now, or the developers that are building on Ceramic, they are running their own nodes to support their applications. And right now in Ceramic, there's no, like, built-in redundancy. One thing that we're doing with this database product ComposeDB that I mentioned is the ability to synchronize data between nodes so that they can have the same view. For example, if you have, like, a blog post model, we can have two different applications

Starting point is 00:25:58 that build on this blog. So that's, like, kind of allowing nodes to just subscribe to the data which they care about. Now, that's fine kind of for developers. But if I'm an end user, I don't have any guarantees that these two blog applications will be online. So one thing we are going to add in the future is a network incentive where you as a user or you as a developer can pay the network and pay a set of nodes to keep this data available in the network. But right now, nodes are run mainly by application developers that want to leverage the functionality of Ceramic.

Starting point is 00:26:47 Everyone kind of stores their own data, and there's no way to kind of say, I will back up someone else's data or someone else will back up my data. It's just basically all in my own Ceramic node. No, you can definitely. It's a public open network. So anyone can subscribe to any stream in the network. So I could, for example, subscribe to your stream and provide, like, a redundant copy of the stream. So as we kind of alluded to earlier, this is for very specific types of data.

Starting point is 00:27:26 It's kind of not a general storage solution. We already talked about social networks for a bit. But what are kind of the verticals, what kind of data usage is this geared towards? Yeah. So there's around four different niches that we look at being more common today, and one of the most prominent ones right now is reputation. So we have projects like Gitcoin and Disco that are putting different kinds of verifiable credentials on Ceramic that are

Starting point is 00:28:10 associated with your Ethereum address or your other blockchain address, and then are able to calculate some kind of score based on that. So in Gitcoin's case you have a Sybil resistance score. We already talked about social. So there's a few different applications building different kinds of Web3-focused social networks. So Orbis is one. CyberConnect is another one. Then another category is knowledge graphs. So there we have mainly projects within the decentralized science space, DeSci. One of the furthest along there is Lateral. They're building essentially a knowledge graph that represents scientific discourse. And the cool thing there, of course, is, like, you can look at this knowledge graph, and since it's

Starting point is 00:29:04 stored on Ceramic, you can see who actually contributed what to this knowledge graph, because it's cryptographically linked to your Ethereum address. And what they want to do eventually is to trickle down payments to people who actually contribute valuable knowledge. Then another niche, I think, in the Ceramic community is DAO tooling. So basically, if we think about the DAO ecosystem today, we use a lot of, like, centralized tools. We use Discord. We use hosted forums that maybe, like, one guy in the DAO is hosting. And, like, this seems very fragile. And so Ceramic could be used to, like, replace these things with a more decentralized and resilient infrastructure.

Starting point is 00:29:55 One place where this is starting to happen in particular is the Gnosis Safe community, where the Safe app has a centralized backend that stores a bunch of transactions that are pending and may be signed by some of the delegates. So there's a company called Dowellus and Systems building a decentralized Safe registry for these pending transactions on Ceramic. So those are some of the use cases we see today. I think there's interesting things in the future that might be more speculative, but around data provenance.

Starting point is 00:30:36 So that could be, like, these new language and image models we're seeing in AI. There's a bunch of copyright problems. Ceramic could be used to provide attribution to who actually did what and how that feeds into these systems. I think we want to have supply chains have more attribution in their systems, and similar with IoT. So those are things that are worth exploring in the future. Does it have to be public data by default? I mean, can I put private data on Ceramic and have it stay private or only accessible to some? Yeah, this is a great question and it's nuanced, right?

Starting point is 00:31:21 So by default, Ceramic is a fully public network. Now, the first thing you might want to think about when putting some data on Ceramic is, say, I can encrypt it. So you can certainly encrypt data and put it on Ceramic. And this will be private, but you have to think about the future, right? Because in the future, we're going to have very fancy quantum computers that might break some of our cryptography. So if you're using any kind of, like, current asymmetric

Starting point is 00:31:53 cryptography, your private data might not be so private anymore. And you might also not be satisfied with, like, ciphers that exist today. They might be broken even though they're supposed to be quantum secure. It really depends on, like, your risk and how private this data really is. In any decentralized system that is publicly verifiable, you have this problem. So the only way you can really be safe that your data is completely secure is probably to encrypt it and store it on your own machine, or you trust someone to store it for you

Starting point is 00:32:33 and then you trust that they don't get hacked and so on. There's a possibility that we could explore in the future some kind of access control logic in Ceramic where only if you have an authorized identity or, like, account, you're able to synchronize certain subsets of data. But that's not something that we really focus on right now. We think there's a lot of exciting things in the public data or public-but-encrypted data ecosystem. So yeah. So my imagination of Ceramic is now, okay, so there's a network. There's lots of nodes on the network, I can publish my event stream.

Starting point is 00:33:18 In the early days of the network, it's probably nice if I publish my event stream in a way that it doesn't contain my private data. Or rather, it contains data that I'm comfortable sharing with the world. And there's eventual consistency in the network. And the network will agree on what came first eventually. OK, understood.

Starting point is 00:33:42 So we have that basic primitive. But you allude to the fact that, OK, you want to allow people to build applications on top where ultimately they are interoperable with each other. That's on your website. So what is the meaning of interoperability for applications on top of Ceramic? And what might be some of the tools you've

Starting point is 00:34:10 built on top to enable it? Yeah. So at its core, Ceramic is an event streaming protocol. And this is not something that most developers are familiar with in Web2. Like, there are a few different event streaming solutions. But it's more of a thing that experienced backend engineers use to really scale applications to handle a lot of load. And so we need some tooling to make it actually usable and easy to use for developers.

Starting point is 00:34:48 And so what we created for this is ComposeDB. So ComposeDB is a graph database that allows developers to define data models, which is basically a schema for your data. And this model, it's kind of analogous to a smart contract where you define the data model, and then users come to an application and they create objects or documents that conform to this schema.

Starting point is 00:35:19 And then the developer can query this data and read, like, hey, here's all the objects that conform to the schema, by all these different users, or query subsets of that. And you can also have relationships between different models. So if you have a blog post, you might have a comment that points to the blog post, and you will be able to query, like, hey, give me all blog posts,

Starting point is 00:35:42 or, like, this subset of blog posts and also all the comments related to that subset of blog posts. So that's kind of the graph aspect of that. And this tooling is built in something that's familiar to many developers called GraphQL. So you actually define your data models in GraphQL, and you query, read and write the data using GraphQL as well in ComposeDB. So I understand that basically when I write data to my own data stream, I have to choose a data structure. But how is the knowledge about which data structure I'm using percolated in the network?
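
To make the "blog posts plus their comments" query concrete, here is a small in-memory sketch. It is deliberately not ComposeDB and not GraphQL: per the conversation, ComposeDB models and queries are written in GraphQL, whereas this is plain Python with hypothetical names (`Post`, `Comment`, `posts_with_comments`) that only illustrates the shape of two models linked by a relation and a query that follows it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Post:
    id: str
    author: str  # the account that controls this document
    title: str


@dataclass(frozen=True)
class Comment:
    id: str
    author: str
    post_id: str  # relation: points at the Post being commented on
    text: str


def posts_with_comments(posts, comments, author=None):
    """Return [(post, [comments])], optionally restricted to one author's posts."""
    selected = [p for p in posts if author is None or p.author == author]
    by_post = {}
    for comment in comments:
        by_post.setdefault(comment.post_id, []).append(comment)
    return [(post, by_post.get(post.id, [])) for post in selected]


# Hypothetical documents written by different accounts against the same two models.
posts = [
    Post("p1", "joel", "Why event streams"),
    Post("p2", "meher", "Data lakes"),
]
comments = [
    Comment("c1", "friederike", "p1", "Great post"),
    Comment("c2", "meher", "p1", "Agreed"),
]

joels = posts_with_comments(posts, comments, author="joel")
everything = posts_with_comments(posts, comments)
```

Because both applications would share the same model definitions, a second app could run the same query over documents its own users never created, which is the composability discussed earlier in the episode.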

Starting point is 00:36:25 Because as I understand it today, I host my own data. So how is the lateral connection made? So you can run an indexer by spinning up your Ceramic node and telling it to index and say, like, hey, what are all the data models in the network? Or if you have some application that you really like and you would like to use, like, oh, I need that data, then if their application is open source, you can look at their application code and just pull in the data model from there. So this is the early days.

Starting point is 00:37:00 I think what we're hoping to see in the future is some kind of explorer or catalog of data models that people can use to browse the different data models, see their popularity and how much usage they have, and so on. And so that experience should become much easier over time. And that is also what happens if there's two competing standards for the same data model, that basically you just check what gets more usage or what people use? Because there's a lock-in effect. I mean, if you want to be composable, the number of things you're composable with is super relevant, right? Yeah.

Starting point is 00:37:47 So there might be two competing standards for, like, a user profile. And you might want to choose the most popular one. Or you might just want to have a super application that can include both profiles and just kind of display the information that's best, because then you have reach to more existing users. But really it's about enabling the developer community to figure out what's best for their needs. So data models are by default open source, right? So I can use any data model that's out there.

Starting point is 00:38:23 Yeah, exactly. OK, so maybe let's talk about the economics of everything a little bit. So if at the moment I'm not actually guaranteed any redundancy, is there any way to actually monetize data that I make available? Yeah. So right now, Ceramic is a fully peer-to-peer network, and anyone can spin up a node and replicate the data.

Starting point is 00:38:54 So I think the primary use case right now is not for people who want to monetize their data. It's more like, hey, we're a DAO community. We want to make sure that our pending transactions or governance discussion forum don't disappear. And then you can have multiple individuals participating in this DAO providing redundancy for this data. I think monetization of data is something that's really interesting. It's not our core focus right now.

Starting point is 00:39:25 And Ceramic is primarily a data network right now. I think there are a lot of interesting combinations that could be made with financial systems like Ethereum and other blockchains, where we can leverage the best of both worlds: maybe NFTs and ERC-20 tokens for doing some of the financial aspects of social media or knowledge graphs or something else, and then using Ceramic for that really high-throughput, scalable data system. Okay. Yeah, I think I understand where you're coming from.

Starting point is 00:40:05 I think I still have a couple of mental disconnects. So if currently people replicate data, you know, altruistically and are not compensated for this, how do I protect myself against censorship? How do I protect myself against people replicating my data either incompletely or, you know, maliciously differently than I wanted it to be stored? Yeah, so if you're running your own Ceramic node and have your data there, you certainly can't guarantee right now that other nodes in the network are providing exactly the same data. However, if there is one

Starting point is 00:41:05 honest node in the network, any honest node that wants to synchronize your data would eventually be able to get up to speed and find all of the data that your node is providing. So as long as there's one honest node, the data will be in the network. But how do I decide which one the honest version is, right? So say I was one of the hosts for my DAO community and my computer went up in flames, and there are a couple of people who hosted the same content, and now they're in disagreement about which the real content is. So the only thing that these nodes can do is remove data. And if the node that you know is honest disappears, yeah, then you kind of have to trust that they are providing all the data

Starting point is 00:41:54 that was there before. But they can't say that, hey, this data is something completely different, or here's a bunch of new data. They can only say, here's all the data, or a subset of the data, because all the data is signed by the end users that are participants in the DAO, and that's not really something you can fake. And so, yeah, to be clear, this is the current state of the network. We are planning to add a network incentive where either developers or communities or individuals can pay to make sure that that data is kept available in the network. Could you add something like a proof of completeness or so?

Starting point is 00:42:38 Yeah, I mean, I think that's kind of what blockchains do. They have this completeness because they have this global state. In an eventually consistent system, you can't really know if there's some piece of data that someone's been hiding for a long time and then eventually reveals, because there isn't the same completeness over time. You kind of have to let go of some of those guarantees to get this more scalable system. Okay, that's fair.
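
The point made here, that a hosting node can withhold signed data but cannot forge or alter it, can be illustrated with a minimal Python sketch. It is not Ceramic's implementation. As an assumption made to stay within the standard library, an HMAC with a per-user secret stands in for the public-key signatures a real network would use (where a verifier needs only the user's public key), and the names `USER_KEY`, `sign`, and `verify` are hypothetical.

```python
import hashlib
import hmac

# Hypothetical stand-in for a user's signing key. A real network would use
# public-key signatures so that verifiers never need the user's secret.
USER_KEY = b"alice-secret-key"


def sign(payload: bytes, key: bytes = USER_KEY) -> bytes:
    """Return an authentication tag for payload (stand-in for a signature)."""
    return hmac.new(key, payload, hashlib.sha256).digest()


def verify(payload: bytes, tag: bytes, key: bytes = USER_KEY) -> bool:
    """Check that tag was produced for payload by the holder of key."""
    return hmac.compare_digest(sign(payload, key), tag)


# The user publishes three signed events.
events = [(p, sign(p)) for p in (b"proposal 1", b"vote yes", b"proposal 2")]

# A hosting node may serve only a subset. Every event it returns still
# verifies, and these events alone do not reveal that one is missing.
withheld = events[:2]
all_valid = all(verify(p, t) for p, t in withheld)

# A node that alters an event, or invents one, cannot produce a valid tag.
tampered_ok = verify(b"vote no", events[1][1])
invented_ok = verify(b"proposal 3", b"\x00" * 32)
```

In this sketch the subset passes verification while the altered and invented events fail, which mirrors the trade-off described: integrity and attribution are checkable per event, but completeness is not.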

Starting point is 00:43:13 So we kind of talked about potential ways to generate revenue from streams as a user. What about Ceramic? So basically, how is Ceramic financed, or how is it going to be financed in the long run? Yeah, so as I mentioned, Ceramic is a fully peer-to-peer network right now where anyone can run a node. There are a few aspects where we think it makes sense to introduce some kind of token model. So the aspect right now that is the biggest cost to the network as a whole, and that 3Box Labs is currently subsidizing, is the anchoring process. So actually making the on-chain Ethereum transactions.

Starting point is 00:43:52 That's something that we want to decentralize as soon as possible, so that it's more of a network activity where, if you participate in the network, you also participate in this process of anchoring things. So that's one aspect. The other aspect is the availability of data. So having the ability for users to pay for their data to be available, and for node providers to get paid by the network to just run a node and keep some subset of the data in the network available. I think it's also key to understand that once we have this logic and system for nodes to be compensated, we can likely have each node only provide a subset of the network, and we can have this decentralization without having

Starting point is 00:44:57 the throughput limitation we have currently in blockchains. I hope that answered your question. Yeah, it does answer my question. So you already talked about the four different niches that you feel could benefit from building on Ceramic. There's already a large number of projects building on Ceramic. Can you talk about the Ceramic ecosystem and which projects you're excited about? Yeah.

Starting point is 00:45:25 So I mentioned some of them already. I think the largest one right now is Gitcoin; they're building this Passport functionality on top of Ceramic. Disco is building their data backpack system. In the social network niche, there are Orbis and CyberConnect. They're both building different kinds of social networks. Orbis is definitely doing a bunch of interesting things there, and they are using some other secure multi-party compute system

Starting point is 00:45:56 to actually store encrypted data on Ceramic. So that's pretty exciting. They're providing an SDK for people to integrate comments and that kind of social functionality, similar to what was on the Omen prediction market, I believe it was called, that we mentioned in the beginning. So what 3Box.js did way back. Lateral is a project in the DeSci space that I mentioned before. They're building these scientific discourse graphs.

Starting point is 00:46:29 In the DAO tooling ecosystem, I mentioned the decentralized Safe registry. I think in the DAO tooling space there's a lot of opportunity once people realize that this resilience aspect is actually rather important. And what's on the roadmap for this year? Yeah, so for this year, next up is the release of ComposeDB on Ceramic Mainnet. So that's something we're planning to go live with at ETHDenver. So that's the end of February, early March.

Starting point is 00:47:06 So that's what we in the team are the most excited about right now. Beyond that, there's a bunch of improvements that we want to make in terms of developer experience and performance to the ComposeDB graph database. And we're also working to make a lot of improvements on the core event streaming layer. So right now there's a pretty tight coupling between the event streaming layer and ComposeDB. We want to decouple this, so the event streaming layer is more on its own, and essentially also enable other people to build databases or different types of microservices

Starting point is 00:47:45 and so on. So those are some of the most imminent focuses over the next year. I'm trying to wrap my head around this problem that a lot of what Ceramic is doing seems very similar to the work done by Protocol Labs in the IPFS and Filecoin combo. So in my mental picture of Ceramic, if I think Ceramic versus IPFS, IPFS is similar, right? I can put some data into IPFS,

Starting point is 00:48:23 but unless I run my node, unless I replicate my own data, it can get deleted, but nobody can mess with the integrity of that data. That seems very similar across Ceramic and IPFS. IPFS does not provide eventual consistency, so there will be no global ordering of events, but Ceramic does, so that seems to be a big key difference. IPFS natively did not have a lot of incentives baked into it, and Ceramic as of today also doesn't have incentives baked into it. But then Protocol Labs built Filecoin, where there were incentives, and people can basically be guaranteed that any data they put into IPFS will

Starting point is 00:49:17 be replicated by Filecoin and made available, and Ceramic also seems to take steps in that kind of direction. I'm trying to understand what the meaningful difference is between these two ecosystems, and why developers would prefer one of the ecosystems over the other. In this case, why would somebody prefer Ceramic or the other ecosystem? Yeah, perfect. So I will comment on what these systems do today, to my understanding at least, in the Protocol Labs ecosystem, and what the differences are today, because things are likely to change in the future.

Starting point is 00:49:58 So IPFS is really good if you have static files, right? I can put a file, I get the hash, or what they call a CID, of that file, and now I can synchronize that across the network and do a bunch of fun things with that. And I have an integrity proof in this hash. But there's no way to update this file, because if I update the file, I get a new hash, right? And so there's no easy way to keep track of updates. In Ceramic, our core focus has been... actually, one important thing to know is that

Starting point is 00:50:35 Ceramic is using the same data model as IPFS, called IPLD, and that's how we represent this hash-linked event log. And the main thing we add on top of that is that we provide a signature system, so we have attribution of who created what. And let me go a little bit into how that works. So essentially, when you come to an application that uses Ceramic, there's a session key created in your browser. You sign a Sign-In with Ethereum message with your wallet that basically delegates some permissions to the session key on behalf of your Ethereum address or other blockchain address. And now the user can actually use the application like they would use any Web2 application,

Starting point is 00:51:24 without having, for every like and for every comment, a pop-up that they need to sign in their wallet. That user experience wouldn't be great. So I think, yeah, one of the key differences is that there's a focus on making the user experience of how we do attribution really good. And in IPFS, natively, you don't really have attribution. And Filecoin currently is really good for storing large files and large chunks of data in a kind of backup manner. But right now it's not something you can easily query and build feature-rich applications on top of. And so that has been our core focus with Ceramic.
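
To make the contrast concrete, here is a minimal Python sketch of content addressing and of a hash-linked event log. It is an illustration under stated assumptions, not Ceramic's or IPFS's actual format: a hex SHA-256 digest stands in for a real CID (which is a multihash-based identifier), JSON stands in for an IPLD encoding, signatures are omitted, and the helper names `content_id`, `append_event`, and `verify_log` are hypothetical.

```python
import hashlib
import json


def content_id(data: bytes) -> str:
    """Hex SHA-256 digest, a simplified stand-in for an IPFS CID."""
    return hashlib.sha256(data).hexdigest()


def append_event(log: list, payload: dict) -> dict:
    """Append an event that points at the hash of the previous event."""
    prev = log[-1]["id"] if log else None
    body = {"prev": prev, "payload": payload}
    encoded = json.dumps(body, sort_keys=True).encode()
    event = {"id": content_id(encoded), **body}
    log.append(event)
    return event


def verify_log(log: list) -> bool:
    """Recompute every hash and check each event links to its predecessor."""
    prev = None
    for event in log:
        body = {"prev": event["prev"], "payload": event["payload"]}
        encoded = json.dumps(body, sort_keys=True).encode()
        if event["prev"] != prev or event["id"] != content_id(encoded):
            return False
        prev = event["id"]
    return True


# Static content: editing the bytes yields a different identifier.
v1 = content_id(b"hello")
v2 = content_id(b"hello, world")

# Mutable stream: updates are appended, and the first event's id stays stable.
log = []
genesis = append_event(log, {"name": "alice"})
append_event(log, {"name": "alice", "bio": "builder"})
```

In this sketch the identifier of the first event can serve as a stable name for the stream while later events carry the updates, which is one way to track changes that a bare file hash cannot express.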

Starting point is 00:52:06 And so we see our technology as kind of taking the best of Ethereum and IPFS and merging that to create a system that enables developers to really build something meaningful. I also compare Ceramic to Urbit a little bit, because of this fundamental idea that there should be data, and then people should be able to build UIs for interoperable data. This value actually repeats in many different ecosystems, and Urbit is another ecosystem in which this value repeats. And essentially, the data you put on Urbit is also such that many people can build different

Starting point is 00:52:59 UIs for it. But of course, when you look at the difference between Urbit and Ceramic, I think the trade-off space here is something like this: in Urbit you can push private data quite easily; it's designed for privacy. That's a strength that Urbit offers along with data interoperability. But then the big challenge or weakness of Urbit on the other side is that it's a completely new tech stack, right from the ground up, starting from operating system to networking protocol and even

Starting point is 00:53:41 programming language. So it's the case that the ambition in Urbit is so big that they just might end up easily outcompeted by much simpler systems that also provide application interoperability, like Ceramic. So that's how I tend to view that trade-off. What are your thoughts on how Ceramic and Urbit differ from each other? Yeah. So I'm afraid I can't make super nuanced comments here because I'm not intimately familiar with Urbit.

Starting point is 00:54:18 But from what I understand, you have kind of your own virtual private server with Urbit, which you could run on your computer or someone else's computer. And there you have the private data in the way we talked about earlier, where yes, it's private if you host it yourself or you trust the person that hosts it. But you don't necessarily have public verifiability. And so that's a core thing that we've been focused on in Ceramic.

Starting point is 00:54:50 Like, how can we have public verifiability and then privacy features on top of that? But since we're coming from the Ethereum and blockchain space, the aspect of public verifiability, and the neutrality that provides, has been one of our core principles. So I've also seen that Ceramic raised quite a big funding round in the recent past. So how was that funding round structured, what did people buy in that funding round essentially, and what kind of incentives does your funding round imply for the future?

Starting point is 00:55:35 I'm probably not the best person to answer this question because it was led mainly by my co-founders. But yeah, what we essentially sold is equity in the 3Box Labs company, and there are also token warrants in a potential token in the Ceramic network. Cool. Fantastic, Joel. So if someone wants to build on Ceramic or run a Ceramic node, where should they go to find out more about documentation and speak with people who are already doing this? Yeah. So I think the main place to go for everything is ceramic.network. And if you're specifically interested in developer documentation, we have a developer portal at developers.ceramic.network.

Starting point is 00:56:37 Cool. Fantastic. Thank you for coming on, Joel. This was super interesting. Yeah, thank you very much. This was great fun. Thank you for joining us on this week's episode. We release new episodes every week. You can find and subscribe to the show on iTunes, Spotify, YouTube, SoundCloud, or wherever you listen to podcasts. And if you have a Google Home or Alexa device, you can tell it to listen to the latest episode of the Epicenter podcast. Go to epicenter.tv slash subscribe for a full list of places where you can watch and listen. And while you're there, be sure to sign up for the newsletter,

Starting point is 00:57:10 so you get new episodes in your inbox as they're released. If you want to interact with us, guests, or other podcast listeners, you can follow us on Twitter. And please leave us a review on iTunes. It helps people find the show, and we're always happy to read them. So thanks so much, and we look forward to being back next week.
