Epicenter - Learn about Crypto, Blockchain, Ethereum, Bitcoin and Distributed Technologies - Near Protocol: 'Blockchains Cannot Scale Without Sharding!' - Illia Polosukhin & Alex Skidanov

Episode Date: May 4, 2024

The initial scaling roadmap for Ethereum featured execution layer sharding. However, the rapid advancements of layer 2 scaling solutions in general, and zero knowledge proofs in particular, caused a restructuring of the original plan. The reason was that rollups would have required fewer changes to Ethereum's base layer, hence lower risks. On the other hand, Near Protocol was designed from the ground up as a sharded system, capable of withstanding billions of transactions simultaneously, without sacrificing decentralisation or security.

Topics covered in this episode:
- Illia's & Alex's backgrounds
- Sharding and blockchain scalability
- Challenges of building a sharded blockchain
- Near namespace
- Shard security & validator sampling
- Stateless validation and data availability
- State witnesses and chunk producers
- Block production
- Shards vs. Rollups
- How Near could further improve
- Leaderless chunk production
- Chain abstraction

Episode links:
- Illia Polosukhin on Twitter
- Alex Skidanov on Twitter
- Near Protocol on Twitter

Sponsors:
- Gnosis: Gnosis builds decentralized infrastructure for the Ethereum ecosystem, since 2015. This year marks the launch of Gnosis Pay, the world's first Decentralized Payment Network. Get started today at gnosis.io
- Chorus One: Chorus One is one of the largest node operators worldwide, supporting more than 100,000 delegators across 45 networks. The recently launched OPUS allows staking up to 8,000 ETH in a single transaction. Enjoy the highest yields and institutional grade security at chorus.one

This episode is hosted by Meher Roy.

Transcript
Starting point is 00:00:00 You need to actually store all that state somewhere. And for a billion users, it's actually going to be a lot of state. And this is, usually everybody talks about TPS, but actually one of the bigger bottlenecks right now across the space is state. And so obviously no single computer, no single kind of node can process all that, can validate the whole network. Now, you can rotate these validators, who validate which shard, all the time, every block. You can actually, you know, randomly select a set of validators. And they are able to validate this because they don't need to sync into the shard. They don't need to process every single transaction that ever hit that shard.
Starting point is 00:00:39 They only need to look at this specific block. The data availability and consensus are merged together into a single mechanic. The essence of it is you want to have a design that is not dependent on the underlying hardware itself improving at a certain rate to be able to service user demands. You want to have mechanisms beyond that type of scaling. This episode is brought to you by Gnosis. Gnosis builds decentralized infrastructure for the Ethereum ecosystem. With a rich history dating back to 2015 and products like Safe, CowSwap or Gnosis Chain, Gnosis combines needs-driven development with deep technical expertise. This year marks the launch of Gnosis Pay, the world's first
Starting point is 00:01:43 decentralized payment network. With the Gnosis card, you can spend self-custody crypto at any visa-accepting merchant around the world. If you're an individual looking to live more on-chain or a business looking to white-label the stack, visit nosispay.com. There are lots of ways you can join the Nosis journey. Drop in the Nosis Dow governance form, become a NOSIS validator with a single GNO token and low-cost hardware, or deploy your product. on the EVM-compatible and highly decentralized NOSIS chain. Get started today at NOSIS.io. Kars1 is one of the biggest node operators globally and help you stake your tokens on 45 plus
Starting point is 00:02:28 networks like Ethereum, Cosmos, Celestia and DYDX. More than 100,000 delegators stake with KORS1, including institutions like BitGo and Ledger. Staking with Kors1 not only gets you the highest years, but also the most robust security practices and infrastructure that are usually exclusive for institutions. You can stake directly to Quarice 1's public node from your wallet, set up a white table node or use the recently launched product, Opus, to stake up to 8,000 eth in a single transaction.
Starting point is 00:03:03 You can even offer high-year staking to your own customers using their API. Your assets always remain in your custody, so you can have complete peace of mind. taking today at chorus.1. Hello everyone, welcome to Epicenter. Today I have an amazing episode lined up for you. We're talking to Alex and Ilya, the co-founders of Neer, which is a in-production sharded blockchain with a lot of value riding on it. Specifically, we will cover how Neer Sharding actually works,
Starting point is 00:03:38 tried to build it concept by concept into an integrated hole, and then understand where they are in their journey to implement sharding. Alex and Inia, welcome to Epicenter. So at this point, we've kind of done a few episodes with both of you. Maybe you could give, you know, a short introduction to your backgrounds. Yeah, I think at this point my background is all near. Everything that was before is irrelevant. But, yeah, I was working on a sharded database called MemSQL before Near for five years.
Starting point is 00:04:12 which is right now it's an in-production sharded database and before that I was at Microsoft Yeah I mean my background is actually in machine learning AI I was at Google research
Starting point is 00:04:27 prior to our startup adventures and I was one of the costs I was in a paper that introduced Transformers which is technology powering Chad GPT,
Starting point is 00:04:39 majority and other AI advancements And then with Alex, we actually started originally near as AI company and realized we need SaaS, cheap, easy to use, easy to build on blockchain because we wanted to use it ourselves for data crowdsourcing and some other data use cases for our AI company and adapt pivoting to that in 2018. And, yeah, focusing on that ever since. Yeah, let's get into sharding, blockchain scalability. So what is sharding overall and why has it been a generally difficult target? Industry-wide. I think, I mean, maybe a broad question is if we imagine having a billion users coming in
Starting point is 00:05:35 and using blockchain as a means of payments, as a mean of tracking ownership, as a way to kind of coordinate resources and efforts. You imagine that you need to have kind of a few things happening, right? One is you need to actually store all that state somewhere. And for a billion users, it's actually going to be a lot of state. And this is usually everybody talks about TPS, but actually one of the bigger bottlenecks right now across the space is state and kind of its growth. And so that's probably problem number one.
Starting point is 00:06:10 Number two is, I mean, you have billion users, try a day-transacting. There's hundreds of thousands of applications. Obviously, there's a lot of transactions flying around and a lot of processing that needs to happen. And kind of people want to put more and more complex smart contracts and complex logic indices. So you need to have throughput, bandwidth, and processing power to do this. And so obviously no single computer, no single kind of node can process all that. can validate the whole network.
Starting point is 00:06:43 And so the way to support this is one way or another partition this computation, partitioned this storage, and partitioned this kind of band list to receive this. And so people have kind of historically been talking about sharding because that's how Web2 companies doing this. So Alex mentioned, you know, doing this kind of in Web2 world, MemSQL, single store, is used by 400 companies, you know, Google and Facebook have their own solutions. And it kind of seemed reasonable
Starting point is 00:07:20 that this is an approach that you should take in blockchain. Now, there are a lot of kind of problems that arise when you actually move into a permissionless setup, kind of compared to a permission setup that usually two companies deal with. I also want to add, Ilya said it's obvious that you need to share processing. I don't think it was obvious to everybody until recently.
Starting point is 00:07:47 So there were multiple blockchains, which were, like, the favorite thing they like to say was, like, look at visa, look at how many transactions via processes, and that's a world scale. Obviously, we can do it on a single computer, right? But as a user, you don't use Visa frequently, right? You use it like three times a day on a good day. Right. And so finally now, generally when NIR launched, we made multiple bets, which were not obvious to everybody. You know, like on NIR from the 1, we had named accounts where you can rotate keys and we had sharding. And it wasn't obvious to everybody that it's useful.
Starting point is 00:08:22 And now suddenly, very scalable blockchains, which are not sharded, get congested. and they have no way of getting any more performant, right? And similarly, like, you know, when it comes to account obstruction, like Ethereum right now is switching to that. So now finally, it is becoming obvious to everybody that those were correct decisions. Yeah, I mean, to give an example, right, like any kind of a single, I call it single node blockchain, right,
Starting point is 00:08:55 which is something that every single node into network needs to process every single transaction and store all the state, right? What it means is as soon as, like, let's say the network has a high capacity, they also have a huge state growth. They have kind of limit on how many transactions they can process because of the bandwidth and execution. And so at some point, there will be more demand. Like, if, and it is a very natural thing, right?
Starting point is 00:09:22 The price for the transaction is usually based on supply demand. And so while transactions are cheap, at some point there will be more demand because they're so cheap to even just spam it to try to get some financial benefit from because the transaction fees are so cheap and so when that happens
Starting point is 00:09:42 you don't have a way to expand capacity so your prices start to grow for everyone right and so this leads to now you kind of pricing out people who are using this blockchain originally for normal use cases because of the spam and kind of people trying to run arbitrage for some other application.
Starting point is 00:10:03 And so that's kind of just the principle where like any single like kind of state machine, right, single or thing machine will get over on and start crisis fees. Right. So kind of Solana is the, you know, contrasting example. Of course, even Ethereum and Bitcoin are based on the idea that the miners or the validators and even the full loads of these systems have to process every turn. transaction that is happening in the system. Solana has taken that idea and said,
Starting point is 00:10:33 yes, we have a bunch of validators, I think 1800 or something like that currently. And every transaction that goes through the Solana system has to be processed by every one of these validators. They are assuming that, okay, these validators can be placed in data centers where networking bandwidth is very high, which means they can ingest a lot of transactions
Starting point is 00:10:58 from the network at a very high rate. They also assume that the machines are very performance. So the work of accounting can be, you can assume that each machine can handle lots of transactions, do their accounting work. And then Solana would also assume that, okay, maybe the history doesn't need to be stored by these machines. they only need to store the currently what are the different accounts and what balances they own.
Starting point is 00:11:32 And what they kind of like a project like Solana assumes is the improvement in compute in terms of like bandwidth, processing power, every resource is it kind of doubles on some time scale. So some resources double on a 12 month time scale, another one doubt might double on three year timescale. But because of this doubling, the capacity. of the blockchain would keep growing at a certain rate and the hope is that the user growth is actually slower than the doubling rate of the underlying
Starting point is 00:12:06 hardware and therefore you continue to have like a cheap blockchain. Whereas in the near case, near the opposite approach where we where you say user growth can be way faster than the improvement in hardware
Starting point is 00:12:21 so fundamentally you need to move away from the paradigm of every validator or every node needing to store, first of all, what are the balances of every account in the system are? And then you also need to, so basically no single machine may have a complete view on what the balances of every account on the near system is. And then it might also be the case that there's some machine and there's some transnational, transaction on the network and that transaction happened, but this machine is part of processing that network, but it never actually executed the transaction itself.
Starting point is 00:13:08 And so the essence of it is you want to have a design that is not dependent on the underlying hardware itself improving at a certain rate to be able to service user demands. want to have mechanisms beyond that type of scaling. Yeah, I would add that, as I said, it's not even about users. It's actually, in a way, it's sadly the economics. The economics, all of these blockchains, isn't such that you need to have kind of a way to expand capacity. otherwise if you want to maintain low fees. Because at some level there will be a such like saturation where some subset of people are willing to pay higher fees because they're planning.
Starting point is 00:14:06 Like they try to capture some economic value from an exchange, token launch, whatever, trading. And that in turn increases price for everybody else. Right. And so Solana and some other kind of high-capacity networks, even though the idea that's like, hey, we have the last capacity and it will grow over time, the reality, what happens is instead it gets flooded by transactions that are all trying to hit kind of the same economic opportunity and extract value. And so that people are willing to pay way higher fees than folks that are potentially using it for, you know, other use cases, right, for payments, for example, and others. And so that's kind of the point is, and this is to add to the fact that, like, yeah, the state grows and everything requires validators to continuously expand their hardware, even just to continue maintaining the network. So, so I think, like, to me, actually, like, past, like, three, four months have been
Starting point is 00:15:08 a really great validation. And then this is not just about a lot of base and other, like, kind of even, you know, basis is a centralized sequencer, right? It's a single server, effectively. But even that cannot catch out with all the transactions that they need to process. And so that's kind of an example of just like, as soon as you have enough economic activity,
Starting point is 00:15:27 you're starting to get kind of this slot of transactions trying to capture that, and you don't have any way to either isolated and add that extra capacity for everybody else to do this, right? And so the example I like to use is, imagine Netflix. You go to Netflix. And first of all,
Starting point is 00:15:44 in the CDOM ecosystem, it would ask you to choose which data center you want to watch from, do you want there, arbitral data center or optimism data center? A base or, you know, blast. And then when you go there, it says, like, actually, first of all, you need to bring your money from the other data center where you have your money
Starting point is 00:16:03 if you want to pay for this movie, pay for watching a movie here. And then the second one is actually, you know, because somebody else watches a very popular, movie, you cannot watch this movie right now at the lower price. You need to pay more. So that's kind of the current state, right? And what we want to do is, you know, you go, you can pick any movie and you watch it and you pay kind of fixed fee, right, that's like, is predictable for everyone. And so to do that, right, similarly, how Nessics need to use Amazon
Starting point is 00:16:32 that kind of scales under the hood, right, and is able to build more data centers ahead of the demand that Gatslakes has, kind of similarly, you need a network that. is able to scale with demand and kind of, you know, in a way, you know, you have the supply demand curves. And so, like, you want to flatten the supply curve such that even as demand grows, you kind of can maintain their fixed fees. Right. So this is the distinction between burst capacity and, like, average capacity in a sense, where, like, a system might only be using, like, X capacity. certain times in a year but then suddenly like one application or the whole system might require
Starting point is 00:17:17 5x or 10x and that burst might happen very quickly and what you're saying is essentially that if the scalability properties of a system are only dependent on the underlying machines that the validators use then that cannot
Starting point is 00:17:37 change very quickly to adjust to demand like suddenly lots of demand comes in machines can't be changed across the whole network that fast so there needs to be like some other mechanism where a burst happens and the system is also able to somehow respond and being able and be able to scale dynamically on a shorter shorter time horizon and this is this is a property nearly every blockchain kind of like lacks today which is why you have gas congestion.
Starting point is 00:18:13 Yeah, and so specifically in last months, we went from four shots to six charts, increasing our capacity by 50%. Because we had a couple of applications who had massive growth, we have hot, which grew from zero to five million users. It's like over a million daily active within a month.
Starting point is 00:18:32 And so we started to having this actually congestion without the charts because of that. And so instead of just, okay, everybody's now paying high fees, your network added more capacity. So, yeah, and let's get into what, you know, what are the different challenges with building a shared blockchain? So there's a couple of them.
Starting point is 00:18:56 First of all, there are certain changes to the user experience because since nobody maintains the state of old accounts, if the transaction has to touch multiple accounts, something needs to be done about it, right? And it's a very large design space and generally we refer to those transactions across short transactions. And that's one big challenge. The second challenge is in the network in which every node processes every transaction, if you have a node and you're at a particular state, you have very high certainty that,
Starting point is 00:19:40 every transaction was processed properly because you literally processed each and every one of them. In the worst case, you can be in a situation where you're looking at the network, which is not canonical, right? So maybe some other part of the network believes that another set of transactions happened on this one. But that's a very different problem from someone literally, you know, like making something that doesn't make any sense according to the rules of the chain. and so in the shardot chain because every node only applies a subset of transactions you need some checks and balances
Starting point is 00:20:12 which ensure that what you see is actually a correct state that results from the properly executed transactions and when you start digging deep into that problem into the problem of ensuring that everything
Starting point is 00:20:26 is executed correctly then you start facing another problem where in order for like for all almost any mechanic you can come up with that ensures that everything is executed correctly, maybe with an exception of Ziki proofs. Like, once we start digging today, we will see it. You need to be able to access certain information in order to perform the validation.
Starting point is 00:20:52 And that information could be made unavailable by malicious actors. And so you need to have a sort of native, you need to have a native mechanic on the blockchain, which ensures that certain pieces of data are available to certain participants and cannot be concealed. So those two challenges are one of the biggest challenges that exist. There are some others. One interesting one would be that because the state is so big, you need to have a way for people to either synchronize state very quickly or work without synchronizing state. So I think those four would be the most interesting ones.
Starting point is 00:21:32 Right. So yeah, essentially like imagine yourself like being an accountant of the near blockchain. It's a massive data structure. You only have a small part of it. The transaction comes. So the first problem is, okay, you might not be able to process the transaction fully yourself because you only have a part of that entire data structure. So you can maybe make some changes to that part.
Starting point is 00:22:03 But then the transaction might hit, okay, now it needs to do a change on some other part that you don't have. And that's a completely different accountant. So you need to process, you know, only a part of the transaction. And then it needs to be like that rename Beton needs to be kind of handed over to some other party. And then they do that and so on. So like that's one, that's one issue. second issue is if you're only
Starting point is 00:22:34 handling a part of that data structure which has contains all the account balances you receive that data from someplace and then how do you even know that
Starting point is 00:22:48 that data is genuine right so that's that's the problem of stateless validation or like how do I know that this data I'm receiving it's actually processed correctly in the past. Then kind of like, okay,
Starting point is 00:23:07 if there's any mechanic to know that any kind of certificate that would tend me that this was processed correctly in the past, then the generators of that certificate kind of need to have had access to that data in the first place. But if that data wasn't there to the generators of, like these certificates, then they won't be able to generate certificates. A certain data needs to be kept in a state where you can reach it in order to do something with it.
Starting point is 00:23:44 Yeah, so a couple of comments. So the first one, you mentioned that you need to process part of it and then send it to someone else. So it's also important that that message you're sending is not getting lost, right? that it has to be delivered. With certificates, you mentioned that you need to be, to have certain data to create certificate, it's also in many situations the case that you need certain data
Starting point is 00:24:07 to be able to verify the certificate. Actually, like in near, in one sense, like there's a lot of complexity because fundamentally like kind of like the accountants in your system may not have access to the full data. And so that leads to a lot of complexity. But in a different sense, Near is similar to other blockchains in it that it's just one data structure containing accounts and like smart contracts and their data. And this is exactly the same as kind of like Bitcoin and Ethereum where you're dealing in the end with a single data structure.
Starting point is 00:24:51 Near is managing their data structure in a different way. but ultimately it's a single data structure. Like that's a unifying property across all of these systems. Is that right? Yeah. So, yeah, you can think of NIR as like a very large mapping from account IDs to state. A big difference from other blockchains is that the account ID is not pure key. Account ID, you think of it as like a domain name.
Starting point is 00:25:17 Right. So like you know, I would be Alex.orgnear. and it's not just a convenience, it's not just something that is easier to convey to other people. It's also, like if you think about it, if the key is what identifies your account, then if for any reason you have a reason to believe that the key is compromised, your account is gone, right?
Starting point is 00:25:39 You have to create a new one. You have to try to move the assets. Not all the assets are movable, right? Like you can imagine an NFT, which is not movable by design and NFT. you know, like the first user of this feature, it's not a movable NFT. That NFT is gone forever, right? So, NIR, if I have a reason to believe that my key is compromised, I just change it. Right.
Starting point is 00:26:01 It also allows you to do to like auction your account. Like your account has a particular state that you think is of value. You can literally sell your account. There are services on NIR that allow you doing that. Right. And then this massive state of mapping, but at the end of the day, it's still a mapping from an account to state, similar to Bitcoin, Ethereum, Stalana, any other blockchain.
Starting point is 00:26:24 But that state is sharded. Just to be clear, not Bitcoin because you JXOs, but yes, they feed you and so on that. Every account-based blockchain. But yeah, you know, like your DXO, you can think of it, is as even less persistent account than just a key. And so this state of all the accounts, it's split into multiple subsets, right, into multiple sets. So today we split by name, right?
Starting point is 00:26:50 So there would be a contiguous set of accounts that leaves on shard 1, and there is a contiguous set of accounts that lives on chart 2. And that those boundaries are not, they're not immutable through the life of the blockchain. And as a matter of fact, they did change multiple times. From near launch, it was a single set, near launch to the single shard. Right. And then it was split into four. and then in the recent months,
Starting point is 00:27:16 two of them were split twice into two again. So it's, and in the future, that will be dynamic. In the future, the system will be changing the boundaries as the you know, like as the load changes automatically.
Starting point is 00:27:33 So, I mean, maybe one way to imagine it is like in your, like in the postal system in your city, most likely like your city is kind of like divided into these different regions each with a different postal code, right? Where I live there called PLZs or something like that.
Starting point is 00:27:54 And so each kind of like region of the city will have, will have a number, and the city will be partitioned into various different postal codes, and you'll have like a post office in every code, essentially. And can imagine like in near, it's taking that data structure. you can think of that as the city and then breaking it down into like these these areas, these shards. And the dynamism is like if you have like a region in the city that has a postal code and suddenly lots of letters are being sent there and they are like,
Starting point is 00:28:31 oh, now actually we need two post offices, then maybe they will divide the region in the city into two different regions with two different post office with two different numbers. And in practice, that does happen, right? Like postal codes change over long horizons. And in near, similarly, like, okay, the whole blockchain is broken down into like these shards. And the definition of a shard can also change in order to kind of route around the capacity demand in some way. If you think of near today, you can think of it as there is a, you know, we all observe what happens in the city and how much mail goes to every post office, right? And at some point we realize that for a particular post office, there is a lot of mail coming in.
Starting point is 00:29:26 And it's, you know, it's getting harder for the employees there to handle it. Right. And so there could be a proposal that says, hey, guys, let's build another post office half a mile away and split it like this. Right. And then there is a separate entity, which is validators. of the network, which either choose to, you know, to go ahead with this change or not. And if sufficient percentage of them, which I think is 80, you know, wants to go with this change, then it's, then it happens, right?
Starting point is 00:29:55 And, you know, if you think of near of the future, when the name of your sharding is there, it's slightly different. It's, you know, as mail starts coming into the building, another building just spontaneously pops in without any human being involved, right? It's entirely... More like that building splits into two and moves away. Exactly, exactly.
Starting point is 00:30:13 And your wall appears, yes, to building separate on a set of rails. And that happens without any involvement from any natural intelligence, right? It just occurs.
Starting point is 00:30:26 And then at some point, you know, two post offices become, you know, it's getting chill there so they kind of, you know, come together and the wall disappears.
Starting point is 00:30:34 Yes. Yes, exactly. But I think the important parts here, right, as you said, zip codes changes, which means, like, when you're sending a mail, you need to, like, now whoever is in a new zone needs to update everyone. They're, like, in a new zip code. You know, everybody needs to like change. But on near, you never see the zip code. On here, you say, I want a mail to be some tithelia. Right? And near figures out the zip code itself. You don't need to know it. That's the beauty.
Starting point is 00:31:02 Yeah. And to compare with this, with other approaches, like subnet, trollops, et cetera. I mean, in a way, they're trying to emulate the same here, right? It's like, oh, you know, this roll-up is too busy. Instead of launching there, you know, you can spin up a new roll-up, and now everybody can go there and you have more capacity. But it's a very, you know, not just manual and expensive process, right? I mean, like each roll-up, you know, cost is at least a million dollars in year just to run between sequencer, explorers, RPCs, et cetera,
Starting point is 00:31:35 all the infrastructure. But it's also now every user, every developer, every smart contract that is actually trying to use it now, at least to figure out how to go there, how to bridge there, what gas tokens is used there, et cetera. So it's a huge load on the whole kind of understanding of the network process, which we are actually addressing with chain abstraction and chain signatures as well because we do believe this is kind of a unit, like what we're trying to do with NIR is a universal. problem, right? It's like the capabilities of the network should be able to change dynamically and everybody should be able to route things without thinking about the underlying infrastructure.
Starting point is 00:32:16 But on near, we kind of solved it in a very like direct way by having this kind of namespace that is common for everyone and using that to route a kind of transactions and messages, mail between different partisans. yeah that is so cool i actually i actually own meher dot near and and yeah i've never needed to think about what shard it is on right so to me i own i only need mehr dot near and its journey through time maybe like it was processed in shard number one then shard number three and then it's changing and i never need to know it's not about it. That's like really...
Starting point is 00:33:07 Yeah, I think it wasn't sharp zero, then chart two and then chart three, now it's probably chart three right now. But yeah, you definitely don't need to know about that and like even I have like, you know, there is a way to look it up, but we actually don't show it an explorer usually because, I mean, some new explorers show sometimes, but because we don't actually want people to know because it's irrelevant information. It's like knowing which exact computer on which rack in AWS is, you know, providing us with this interface we look at right now. Yeah, yeah.
Starting point is 00:33:46 So maybe an interesting way to imagine NEAR is that, ultimately, these human-memorable names are at the bottom. And maybe, you know, each human-memorable name actually corresponds to a person or a company, because that's how the world is partitioned. And because the NEAR system breaks into shards, you can almost imagine a virtual collection of people that are transacting their business on a shard at a certain point.
Starting point is 00:34:16 And these are the users of the shard itself. And of course, like as the shard boundaries change, the collection of people transacting on a shard is also changing. But maybe if we imagine it as like being, you know, stationary or constant for a certain while, which is true for near. We can think of this like virtual collection of people in the near suburb, on a sense, that's transacting on a shard. And then on the other side, corresponding to a shard, you need kind of servers or validators or postmen
Starting point is 00:34:50 in our earlier analogy that are kind of processing the mail that's coming to that particular area or shard. I guess like one of the questions starts to become. So in any blockchain network, you have these validator machines or minor machines that are ultimately like kind of like this postman or accountants of the system that are doing the processing. And even in near I would imagine, okay, there's a set of validators that are found through proof of stake. Now these validators sort of like need to be assigned to shards that, hey, you go and process. with transactions in this shard, you are the one, you do it there. And how does that process work? So I first want to correct slightly the first time.
Starting point is 00:35:41 I think the second analogy was good, but the first analogy was not quite correct. Because even though shard boundaries do not change as frequently today as they will be at some point, it has been the case from day one on near that two accounts residing on the same chart has absolutely no significance for those two accounts. So I think a better analogy for that would be not a post office, but like cell towers, right? We could do next to each other and I can call you. And we will be, you know, served by the same cell tower. Or we could be in different parts of the world and I can call you and other different cell towers.
Starting point is 00:36:19 But we have no benefit. You know, like I will never know that you own the same cell tower and I will never care. And it is the case of NIR that if you and I own the same chart and I send you money, or, you know, like we transact on some smart contract, or if we are in different shards, from the user's perspective, there's no difference in experience, right? So, you know, fees are not affected. The performance is not affected.
Starting point is 00:36:43 You know, sharding is completely abstracted to it. And so there's no incentive, for example, to try to be on the same chart. There's no incentive to grind, for example, accounting is where like intention was at the same, you know, accounting in the same chart. when it comes to the second analogy you can think about this way
Starting point is 00:37:00 you can think, I like going in multiple steps, and effectively saying: let's say we're designing a new blockchain and we want it to be sharded; how do we ensure security when not everybody processes every transaction? And the first idea would be,
Starting point is 00:37:18 let's say we have a massive set of validators right so we set the minimum fee to be relatively low and we say we have hundreds of thousands of validators, or even millions, I don't know. And then every shard, even though it has only a subset of validators, it still has a massive set of validators. So we have a million total 100 charts,
Starting point is 00:37:40 and every shard has 10,000 validators. Then you can say, well, if we sample them randomly and we relatively certain that the total set of validators has up to a certain percentage of bad games, We say, we believe that the total set has up to 25% bad guys and not more. Then you can do the math and you say, well, if I sample 10,000 of them, then the percentage of the bad guys exceeding 33% is so unlikely that we can consider it to be impossible, rather to be like, you know, one over 10 to the power of some large number.
Starting point is 00:38:17 And then you say, well, and because there's no more than 33% of bad guys in the shard, we can just assume that they adhere to the protocol, that any state transition they approve of, like if it has a particular percentage of signatures of those people who are validating the shard, then we believe that that state transition was valid because the number of bad guys is limited and the good gaze signed off so it's good to go.
Starting point is 00:38:43 It has practical issues. We don't actually have a million of validators, and we do want to have more than 100 charts in the limit. But it has a bigger problem, which is the contract of a big guy or a good guy is very abstract. At the end of the day, everybody who's on the blockchain, they want to make money. That's the ultimate goal. You know, majority of validators, I'm sure there are some validators who are there to build
Starting point is 00:39:09 the decentralized world of the future where everybody's a happy corgi owning their data. But in reality, majority of them are there because you stake money. Or rather, like, you have people delegate to you, you keep the percentage, you make money. Right. And so, correspondingly, we should think about the security in the presence of the bad guys who will try to corrupt other participants, right? And people talk. There are ways for people to talk, you know, a lot of validators are just sitting on telegram.
Starting point is 00:39:41 Right. And it makes sense for them to be in the same telegram groups because they run into issues. You know, like the network is too slow. The validator, they need to know that they need to operate the validator. So they're all in the same telegram channels. They're all easily reachable. If the bad guy wants to come and say, hey, guys, I want to do this act of being a bad guy,
Starting point is 00:40:00 and I need that percentage of validators to cooperate, right? I can DM each of you and say, this is how much money I'm willing to pay you because there's something for me to gain from the blockchain gain going down. It could be some minor extractable value. It could be that I'm, you know, I'm the Salana investor, and I just want New York to go down and Salana to go up, et cetera, right?
Starting point is 00:40:23 And so the system needs to be designed in such a way that a very large percentage of the validators could be corrupted and, you know, incentivize to do something bad. And we need to. But correspondingly, let's say we have 100 or 1,000 of validators, and we have on the small sets, a subset of them in every shard. We should expect that almost all of them will get corrupted, right, or even all of them. And so the system needs to be designed in a way that, yeah, there are those, you know, postmen in the post offices. But it could be that a bad guy enters the post, post office, gives all of them, you know, $1,000 and ask them to do something bad. You know, like a route and a mail, which was what actually sent by the originator or something like this. And so we designed systems in a way that is prevented or made very difficult to execute, if that makes sense.
Starting point is 00:41:13 So now, I don't know the mathematics of it, but if we take that scenario that you sketched out first: there's a million validators, and then I'm sampling, meaning from that huge set I'm taking 10,000, choosing randomly, because the blockchain ultimately needs to choose randomly. And then if I get the set of 10,000,
Starting point is 00:41:38 my mathematical intuition says that if, like, 25% of that set of a million, so 250,000, are malicious in some way and 750,000 are kind of honest, and I am choosing 10,000 randomly, my odds of choosing a set where the majority, maybe 66%, is malicious, meaning 6,666 are malicious and the rest are good guys,
Starting point is 00:42:11 I would think that would happen pretty frequently, right? Like if I'm creating these sets once every second, I would think every six months or every year, I'm going to get a set which is full of malicious validators. Now, so if you have 250,000 out of a million, and you sample 10,000 out of the million, then sampling one third malicious will never happen;
Starting point is 00:42:40 it's extremely unlikely. Oh, it's extremely unlikely. So my physical intuition is wrong, and if I'm sampling 10,000 out of a set of a million, and I keep doing that, keep creating new samples every second... You can do it billions of billions of billions of times. You can do it every nanosecond for a billion years, and it will not happen.
Starting point is 00:43:05 We actually have the math in one of our original papers, right? and it was, I don't remember the exact constants, but yeah, it was like 10 to minus 20 minus 30, something like that probability. So like longer than universe exists kind of thing. But to an extent it doesn't matter
Starting point is 00:43:26 because we still, like if your intuition was correct or if my math was wrong, which is possible, it would only make the situation worse, right? But we're saying, we're saying, hey, let's say the math is such, that, and I think it is, right? Let's say the math is such that it's very unlikely to sample a large percentage of bad guys. It still doesn't matter because good guys can become bad
Starting point is 00:43:48 guys when they said Kisholitiens incentivize to be bad guys. And then in the world of blockchain's incentives are often such that you can benefit a lot by corrupting, you know, a sort of validators. And so you would, every now and then, the validators will choose to do so. And so correspondingly you want to design a system where even if the set of validators in the shard is corrupted, right? Either due to sampling or because they were corrupted, what we call adaptively, right? So after the fact, then the system still operates reliably. And so there are multiple, you know, there are many ways of ensuring that. And it's the area of research that has been developing.
Starting point is 00:44:33 And this is where in sharding we use, you know, this is the, you know, this is the, the biggest user of Ziki proofs in sharding, because if I can just prove that my transition is correct, then the problem is solved. Instead of sending a bunch of validators, if you can just have a proof that says everything is correct, that it's, and the problem is solved. But maybe building this up from the bottom up, right?
Starting point is 00:44:57 I mean, it starts with so kind of, we discussed, right, there's this account space, right? You know, think of it as a city, with people, and, you know, we have the cell towers. So when people, you know, call each other, like, fleas of the analogy, when people call each out in the city, right, kind of bounces to their cell tower and that goes to another cell power to kind of connect them.
Starting point is 00:45:25 And so the first thing that we need to do, right, is to ensure that every second, every transaction is recorded, right, it ordered and kind of a cross-all shards. And that no, there's no way even for, if everybody is corrupted in that shard, to be able to kind of change the order. And for other use cases also, you know, potentially introduce something that's not valid. And so that's called data availability problem. The mirror had kind of data availability from 2019, then it designed Manchade,
Starting point is 00:46:07 and then, you know, as other approaches like roll-ups, et cetera, started becoming more popular, they also needed data availability, and that's kind of where a lot of the current data availability offerings are coming to market. Like a very short primer would be, you know, like let's use roll-ups as an example, right? So let's say there's a pathetical optimism and they do transactions, but they do not use zero-knowledge proofs, right? I actually don't know if optimism is planning to use zero-knowledge-proofs, right? And so correspondingly... We're using zero-knowledge-prose for fraud-proof.
Starting point is 00:46:45 I see. Interesting, interesting. And so they checked in the, you know, the state route to Ethereum every now and then. And they say, you know, I applied all the transactions correctly. And if you think I did not, I posted just enough of cryptographic information of like, you know, at a station that if I did something wrong, you would be able to come and prove that it was wrong. Right. And if you do that, I will lose a lot of money. And so I have a strong incentive not to do so. Right. And so then you observe the roll-up and you see that some transaction was applied incorrectly. you can go to Ethereum and say, this is a transaction, it was applied wrong, and this is the proof.
Starting point is 00:47:27 But you can only do that if you actually see the transactions. So if the roll-up was able to operate in such a way that the validators cannot see the transactions, then the roll-up would be snapshoting something, but nobody can prove anything wrong because nobody sees what's happening. It's like, you know, it's a filled room. So data availability is effectively this concept of ensuring and proving that the transactions that you claim to apply and the state on top of which you claim to apply
Starting point is 00:47:54 those transactions are all visible to everybody. And that's something that Nier had from we were either first or second to have it like. I don't know when, I don't remember when PolkaDot launched and whether they had data related to you from day one. Okay, so you have like a shard and then there's a set of validators that are processing transactions in the shard.
Starting point is 00:48:15 Let's pause and get an intuition of this set. Let's imagine, like, shard 2 or whatever: the set of validators that are processing shard 2, is it like a static set, or does it keep changing with time dynamically? It's changing with time. So the idea is, there's what's live right now, and what's been live for a few years,
Starting point is 00:48:43 and also what's we launching with of validation. I'll probably just talk about status validation just for easy explanation. So with status validation there is a two set kind of two roles that the value that it can play and all of this is relatable. One one role is so-called chunk producer which is somewhat similar to what you enroll up is called sequencers. right. So this is the node that receives the transaction, orders it in the block, in a chunk in our case, responsible for their shard. It sends out this chunk to others as well, and then executes the transactions and receives the kind of the result. And so importantly, kind of where comes in the data availability, when the child producers, sends out the chunk of information, they include, they kind of so-called erasure coded, which means they replicate this information in such a way that, you know, they send it to
Starting point is 00:49:55 other cell towers such that even if everybody who's in, you know, servicing this cell tower is offline, goes malicious, etc., other cell towers can completely replicate everything that happened in this cell tower. So that's kind of what data availability or ratio coding is. So there is chunk producer. Now, there is a small set of junk producers kind of similar of how, you know, there's single sequencer usually on roll-ups, but you don't want that because of censorship and reliability. Now, for validators, actually have a different story, right? So what Alex mentioned, you have this adaptive corruption problem, right? So if you have validators, which is sitting there for a long term, for a long time, it's possible,
Starting point is 00:50:42 you go and say like, hey, if you're in this shard, and I see you in this shard, for example, I can bribe you to, you know, do something through the shard. And so, and then you need fraud proofs, and fraud proofs are complicated and kind of require additional timelines. And so with the status of auditions, actually, the chunk producer, not just produces kind of the transactions, but also includes all of the state required to execute these transactions. And so that's so-called state witness. And so then any other node in the network, right, can receive this block and execute it
Starting point is 00:51:20 without knowing anything else about the network except kind of like-client of the network. So you receive everything kind of you need to validate that if you apply this transaction and you have the state and this state was included in the previous block, then the result of this. And you can confirm that and kind of send the confirmation. And so that's kind of, I mean, in the way, you know, kind of state of validation is the ability for any node to come in and say, like, hey, I'm ready to validate. I don't need to synchronize a network state. I don't need to maintain the state on my desk, right? Which, again, reduces the needs for the validator node requirements.
Starting point is 00:52:05 They can just, like, make sure that everything's okay. And so now you can rotate this validators who validate what chart all the time, every block. They can be actually, you know, you can randomly select set of validators. They can be overlapping. You can select, you know, out of the million, you can select, you know, 100,000 per shard.
Starting point is 00:52:25 And they've validated 10 shards, for example, each if there's enough capacity or any kind of parameters. And they are able to validate this. because they don't need to sync into the shard. They don't need to process every single transaction that ever hit that shard. They only need to look at this specific block. So that kind of opens up a lot.
Starting point is 00:52:48 You know, both kind of, you know, you can imagine, I'm actually really excited that potentially, you know, not probably this year, but soon enough, somebody can open up a new tab, you know, typed in a URL, which has kind of a validator node in a browser. and actually start receiving blocks and validating them. Because again, you don't actually need anything else
Starting point is 00:53:12 and browsers have WebAssembly to execute the transactions embedded. So that's kind of the very lightweight validation that you can rotate all the time. So, like, I'm imagining this state witness is more like, so let's say like there was block in and then the transaction came in and the chunk producer made n plus 1. But is it the case that for every transaction,
Starting point is 00:53:45 almost like this, for every transaction that they processed in that block, they are creating individual witnesses for each transaction. So if you take transaction 1, they'll say, okay, transaction 1 modifies these two accounts. These two accounts had disbalance previously. after the modification, the result is these two accounts have dismalanced now. So our state witness is this was what was before this transaction was processed.
Starting point is 00:54:17 This was after the transaction process, but the data that's being supplied is only those two accounts that are hit by the transaction. And so for every transaction, you are just creating like these breadcrumbs, this bare minimum amount of info that is needed to kind of validate that transaction and that's the state witness for that transaction. So every transaction that comes in the block of the shard by this chunk producer entity is kind of broken down into these individually verifiable pieces and then those individually verifiable pieces are scattered across all of the validators of the shard and they can and kind of do the job of verifying these individual pieces one by one.
Starting point is 00:55:05 And because each of these piece can be verified individually, that's why you are able to run the validator in your browser, because while your browser may not be a powerful machine, it can still validate a few of them. Yeah, yeah, that's right. So is it the case that this chunk producer, that's kind of like a more, a more long, long-lived role and then the validators of a shard is like a more short-lived role,
Starting point is 00:55:37 meaning as a validator, I'm doing some verification in this chart, then a few seconds later in a different shot, then a third few seconds later in a different shard like that, I'm constantly switching as a validator, but as a chunk producer and... A few seconds later is a little bit, is a little bit consulting. We produce a block every second, but... Yeah, it's like every second, every block you can be valid in a different shard, because there is actually no difference.
Starting point is 00:56:03 Like, you don't actually care with Charlotte's on because for you, it was just a block result, like a set of transactions to resolve the information you need. And so that's why you can rotate validators now every second. I mean, there's like networking requirements and some, you know, data information propagation. But in principle, yes, it can rotate every second.
Starting point is 00:56:23 And then for chunk producers, they rotate every epoch right now, which is 12 hours or 12th to, 14 hours. And there where you, because you need to actually sink the state, like if you're moving to an X chart for the chunk producer, you actually need to know everybody's balances, right? And so you need to actually download all that, make sure it's correct, consistent. And then kind of while you're downloading, you actually need to now receive new blocks
Starting point is 00:56:52 and apply them as well. And so you're kind of doing two jobs in parallel. You validate it, you're kind of chunk producing the chart you're in now. and you're getting ready to produce short that you are the next time. And so that requires kind of a more sophisticated setup. Right. So, okay, so then you have these validators that are constantly jumping from short to short, validating some small pieces.
Starting point is 00:57:19 And is it the case that when I'm validating a certain piece, I'm also adding my signature, saying, yes, I checked it and it's correct, and every transaction and its witness is kind of getting more and more signatures or attestations from the validators, and that's how you're building up trust? But it's not accumulating over time. All the signatures happen on the same block, right? So every block, you need to do a sign-off.
Starting point is 00:57:46 And it's not the case that everybody signs off every block, right? We just need a certain majority. And, you know, because blocks are created every second, and the validators are running on a relatively commodity hardware, sometimes you will miss a signature. If you go to an explorer, you will see that nobody has 100% uptime. People have like 99%. Right.
Starting point is 00:58:12 But yeah, but the idea is that, yeah, there's a set of validators, they validating a particular block. We know whom we expect to sign off on the block and then a majority of them signs off. And the block is created. You can look at it and say, well, let many validators sign the block.
Starting point is 00:58:27 at this point I know which charts they were supposed to validate. I know that unless someone corrupted them in 0.1 milliseconds, you know, there's a relatively high certainty that the state transition is valid. Right, because on the other side, the mathematics says that when I'm sampling these validators randomly, my odds of getting a completely bad set of validators, if 25% of the set in the bigger network is corrupt is kind of low.
Starting point is 00:59:00 So because of that sampling, you're able to kind of trust that, okay, while this validators are considered, or you are sampling them constantly, every second you are changing the samples. But because you calculate the probability of one sample being malicious, like 66% extent or something,
Starting point is 00:59:22 as being very low, you are able to trust kind of like the signatures on the witnesses of your transactions and be sure of the state of a particular shard. And meanwhile, like these chunk producers when they are producing these blocks, they are also forwarding the data
Starting point is 00:59:43 corresponding to these blocks to a set of validators. Now, these other set of validators may be different from the validators that are checking the witnesses of the transactions. They don't need to be the same set. Yeah, so there's two things that happen. One is you, I mean, the validators that are receiving
Starting point is 01:00:06 to check the witnesses, they actually received the data as well, right? Because they actually can validate the transaction. But also, you want to send to other shards as well in case, you know, this whole shard sale, but also like just you want to route the outgoing messages. And so we actually combine the message routing data availability and
Starting point is 01:00:29 consensus into kind of one process where let's say you have, you know, let's say like you withdrew money from my account on chart 1 and then you're sending money to Alex within a chart 2. So now there's a message going to chart 2
Starting point is 01:00:46 saying, you know, you should debit Alex, you know, 10 years. And so now that message is not just a message, it also includes a so-called erasured-coded part of the transaction data that this chart was producing. And so kind of this process ensures few things. One is everybody then goes and when they actually signing and confirming their own information and sending their approval, they also confirmed they received the needed chunks, the needed kind of parts from other shards. And so that's also provided this guarantees with this message delivery from chart to shard. It provides the dataability guarantees and it all kind of integrated into the consensus messages
Starting point is 01:01:39 that are being sent by validators to each other to actually accumulate their BST consensus. So it's worth mentioning you, there's a next role called BlockPretion. producers, right? So there's an actual, it's the blockchain, is the near blockchain. So it's not like, because often when people think of sharding and, you know, many sharding blockchains do work this way, people think of multiple chains, right? So every shard is a chain. It is not the case on near. Or near there is only one blockchain and there are block producers creating blocks on the chain. And when, but those blocks do not contain the actual transactions, right? Or all, like logically they do, right? Like you can think of a block exactly the same way as you would
Starting point is 01:02:25 think of a block on Ethereum, where it has like a header, consensus information, and a bunch of transactions, with the difference that while logically transactions are there, physically, it only contains information about what we call chunks, so one chunk per shard, or rather up to one chunk to shard. And physically, the block does not contain those transactions, it just contains the information about the chunks that were produced. Right. And at every particular block, some sharps might miss a chunk, right? Because there's a particular chunk producer responsible at every particular moment. It could be offline, it could be, you know, busy, etc.
Starting point is 01:03:01 And so a chunk could be missed. Right. But if the chunk is produced, what happens is that the chunk producer, when they produce a chunk, as Ilya mentioned, they erasure coded, and they send a part of it to every block producer. Right. And the block producer would only sign off on the block if they have. have a part of the chunk that is intended for them. And so this is where consensus and data availability
Starting point is 01:03:28 are sort of melded together: in order to reach a consensus on the block, two-thirds of all the block producers, weighted by stake, have to sign off on it. It's a BFT consensus. So if there are no two-thirds of signatures on the block, we wait until we have them. We favor safety over liveness. And correspondingly, if we cannot get two-thirds of signatures,
Starting point is 01:03:47 And correspondingly, if we cannot get two-thirds of signatures, stole. But we, and if you have two-thirds of signatures, because the block producer would only sign off on the block if they have their part of every chunk in that block, right? Then you know that two-thirds of the block producers have, for every chunk included, have their little part. And the erasure code is such that you need one-third to reconstruct any chunk. So as long as you believe that no more than one-third of the block producers is malicious,
Starting point is 01:04:17 and if you have two-thirds of signatures, in the worst case, all the malicious actors are included in those two-thirds. So you have one-third malicious, but you still have one-third honest. And so you can reconstruct any chunk. So every chunk is available to everybody, guaranteed if you have consensus reached on the block.
Starting point is 01:04:35 So data availability and consensus are merged together into a single mechanic. Right. So, like, this is wicked cool, and it's also, like, hard to understand because, this is actually unlike any other system where data availability and consensus are usually like two very separate processes like whether you go from Ethereum to Sestia to all of the
Starting point is 01:05:01 but in near it's almost a yeah in essence right you you want to it sort of begins with with your goals right and then and then we were going back right so the goal was to have cross short transactions and generally communication to be with a delay of one block. So effectively, if in a particular block a transaction initiated and it wants to do something in another shard, we want that to happen with a very high probability in exactly the next block. Right. And so if the data availability was separate from consensus, that would be extremely hard to
Starting point is 01:05:38 ensure, right, because we need to be certain that data is available as of the moment when the block is produced. as opposed to that being a separate process, right? And similarly, Ilya mentioned there are three things which are merged together, right, data availability consensus and the message passing, right? So together, the chunk that is now totally available at the moment of consensus being reached
Starting point is 01:06:03 also contains the messages that needs to be routed to another chart, and it is designed in such a way that it is ensured that by the time the chunk producer of that other shard is producing the chunk, they not only know that the messages exist, but they also have them, right? And so they can immediately act upon it. And moreover, they have to act upon them.
Starting point is 01:06:23 Right, so a chunk producer, the chunk would not be valid if the chunk producer did not act on the messages that were sent from another shark. It could be that they don't act upon them immediately because of congestion. Like imagine everybody's sending receipts to the same sharp. Right, so that is automatically handled.
Starting point is 01:06:38 It could be that the receipt is not processed immediately. But the receipt is acknowledged immediately, and it's put on the queue immediately. and most of the time it is also acted upon immediately because congestion is there, most of the time there's not congestion. But it is all part of the same process where we ensure that if something happened in block N, and that's something wants something else to happen in another short, that something else will very likely happen in N plus 1. And maybe to use like ECDM and roll-ups analogy here that, you know, you have optimism,
Starting point is 01:07:14 producing a block, right? That block is immediately sent to Ethereum, the validators include it, and they include every other roll-up's block as well. Let's say
Starting point is 01:07:30 we have our Optimism and Arbitrum trying to send money directly between each other, right? So, like, both of their blocks need to be included at the same time, immediately, in the same Ethereum block, to guarantee data availability,
Starting point is 01:07:47 because now, kind of like, let's say Optimism sends something to Arbitrum and commits it. But if it's just that, then Arbitrum needs to
Starting point is 01:08:03 read out the state from Ethereum, right, which adds extra latency. And so what happens here is you assume that Optimism and Arbitrum... that those sequencers are also validators in the network. And so you send them... kind of, after these roll-ups send their blocks to each other, right, they confirm it. And kind of
Starting point is 01:08:25 those attestations now allow the blockchain to progress forward. And so it's as if all the roll-ups themselves form the validator set, right, and are sending information to each other directly. And in turn, that allows to optimize for latency and for kind of this cross-shard communication, because everybody is talking to each other directly, but then, you know, sending confirmation to the whole united system. So I'll try to state this in kind of my own understanding of how it works. So it's like we need two different views: one is kind of like the shard view, and then one we need like the global view, because there's a global block. So, you know, we went through the shard view pretty well. It's like there's a
Starting point is 01:09:14 huge set of validators, some of them are assigned to the shard for a block and then reassigned to other shards, and they are kind of like validating parts of the transactions in the shard. Now, these validators, sometimes they also get the responsibility of being these chunk producers, which are more like long-lasting entities, for 12 hours, for a particular shard. They are producing, like, okay, for this block, this is the set of transactions, and here are all of the witnesses. So those are like long-lasting entities. So you could have like a short-lasting role,
Starting point is 01:09:52 and then you could also get a long-lasting role as a validator, right? So that's kind of like the local view of the shard. Now, in the network there's like a global view where there's actually a single blockchain with, like, block after block after block, like Bitcoin or Ethereum. But what that block contains is not the transactions themselves, but the chunks from the different shards that are sort of accepted post-consensus as being correct. Like, okay, so here's block N, it contains chunk X from this shard, chunk Y from that shard, chunk Z from that shard, and so on. And it just contains what chunks the network is considering, like, finalized. And so all of the validators are trying to build this blockchain that just contains chunks.
Starting point is 01:10:53 And the validators are kind of like signing off on that block containing chunks. They are trying to add their signatures to it. And their logic is something like... what they are checking is: corresponding to each of the chunks that are part of the block, did I get the slice of data in order to ensure data availability for the whole network? So as a validator, when I'm signing off on a block, what I'm checking is, okay, this block contains these chunks.
Starting point is 01:11:29 If it contains these chunks, I should have received the data corresponding to these chunks. Do I have it on my hard drive? Yes. Okay, so that's one sort of validation passed, and then I sign, and all of the other validators are signing that. So fundamentally, the network is coming to agreement that we have the data that we are going to need to reconstruct any part of the actual transactions in the block in the future. That's the thing people are coming to consensus on. So in a normal, Bitcoin-like chain, when a block gets finalized, the network comes to consensus about the transactions that are processed. In Near, when a block gets finalized, the network is coming to consensus about the validators collectively agreeing that they all have the data needed to reconstruct every part of the block, should the need ever arise. Is that right?
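The per-block check being summarized here can be sketched as follows (hypothetical structure and field names; the real protocol carries far more detail): a validator signs only if it holds its erasure-coded part of every chunk and each chunk's state witness carries enough endorsements.

```python
def validator_should_sign(block, my_parts, endorsement_threshold=2):
    """Toy version of the check a validator runs before signing a block
    of chunks (illustrative structure, not the real protocol)."""
    for chunk in block["chunks"]:
        # 1) Data availability: do I hold my erasure-coded part of this chunk?
        if chunk["id"] not in my_parts:
            return False
        # 2) Stateless validation: did enough validators endorse the
        #    chunk's state witness?
        if len(chunk["endorsements"]) < endorsement_threshold:
            return False
    return True

block = {"chunks": [
    {"id": "shard0-chunkX", "endorsements": {"v1", "v2", "v3"}},
    {"id": "shard1-chunkY", "endorsements": {"v2", "v4"}},
]}
print(validator_should_sign(block, my_parts={"shard0-chunkX", "shard1-chunkY"}))  # True
print(validator_should_sign(block, my_parts={"shard0-chunkX"}))                   # False
```

Both conditions are local and cheap, which is what lets the validator set rotate every block without syncing any shard's full state.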
Starting point is 01:12:33 Is that intuition right? And they also, like, in the world of stateless validation, they also have the information that each of those parts was reconstructed and validated by a set of validators. Right, right. Not only that the data is there that can reconstruct everything, every transaction in the chunks, but also the data corresponding to
Starting point is 01:12:58 that every transaction with its state witnesses was validated by a certain number of validators of the shards where the state witnesses originated. So it's coming to consensus about data where we... That's, yeah, that's really interesting. It's really cool. Yeah, I mean, the big benefit was... I mean, we do get this question,
Starting point is 01:13:23 which is, like, you know, why Ethereum kind of shifted from sharding to roll-up architecture? And, I mean, first of all, obviously, not talking for anybody individually, and Ethereum is not an agent itself either. But practically speaking, to design something like this,
Starting point is 01:13:43 right, you need to build everything from the ground up. You see how consensus, data availability, message passing, and kind of validation correctness are all layered into one system. And that requires kind of this approach. As well as the VM itself, because the VM now needs to be aware... compared to the EVM, where everything is kind of available, right?
Starting point is 01:14:09 You can always say, like, hey, give me the state of that other account. Like, tell me how many tokens does, you know, Alex have. In the case of the sharded system, like, that requires a cross-shard message, right? That requires saying, like, well, go find where Alex lives, right, and ask him how many tokens he has. And so you need to design kind of everything from scratch, from the bottom up, with this understanding, given the goal we had, which is, like, you know, ultimate horizontal scaling that is hidden from the users and developers in many cases.
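The contrast being drawn can be sketched like this (a toy model with invented names): in the EVM every read is a same-block lookup, while in a sharded design the same question becomes a message to the shard that owns the account, answered a block later.

```python
# In the EVM, any contract can synchronously read any account's state:
evm_state = {"alex": 100, "illia": 250}

def balance_evm(account):
    return evm_state[account]  # same-block, direct read

# In a sharded system the account lives on one shard, so the same question
# becomes a cross-shard request plus a response one block later
# (a hypothetical simplification of the real receipt mechanism).
shards = {0: {"illia": 250}, 1: {"alex": 100}}

def shard_of(account):
    return 1 if account == "alex" else 0   # toy account -> shard mapping

def balance_sharded(account, current_block):
    target = shard_of(account)
    request = {"ask": account, "sent_in_block": current_block}
    # The destination shard answers in the NEXT block:
    response_block = request["sent_in_block"] + 1
    return shards[target][account], response_block

print(balance_evm("alex"))                        # 100, immediately
print(balance_sharded("alex", current_block=7))   # (100, 8): answer arrives one block later
```

This is why contracts and standards on a sharded chain have to be written around callbacks rather than synchronous reads.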
Starting point is 01:14:46 And so for Ethereum, like, they do not have such a luxury, right? They have a working system, extremely valuable, extremely kind of integrated everywhere. And so they needed something that is kind of an evolution of their existing system versus a complete rebuild from scratch. And I mean, roll-up architecture, you know, we use it as an analogy because a lot of it is similar. Ethereum provides data availability and consensus. Roll-ups are able to communicate with each other, but they need to go through Ethereum pretty much to kind of validate and settle before a message can be passed. And so a lot of it is kind of spiritually the same. And it's just because of the legacy of, like, well, now each EVM is a whole separate universe of accounts, right?
Starting point is 01:15:33 And so now you as a user have an account on every chain, right? I don't have a singular balance that I can use everywhere. Like, all of that is kind of a legacy. You know, how do we kind of upgrade the existing system into a more scalable platform? So that's kind of really the biggest... obviously, you know, in a way, the benefit we had, which was, like, starting from scratch and kind of designing it. The luxury of the clean slate. Yeah.
Starting point is 01:16:01 Luxury of the clean slate is what you had. Right. Right. So as a validator, sometimes I am taking on this role of being a long-lived chunk producer for 12 hours in a particular shard. I am constantly taking the role of validating the state witnesses of a shard and I'm being assigned from shard to shard to shard. And then every time a kind of like a block is produced in the global main network,
Starting point is 01:16:36 what I'm signing off on is: corresponding to that block, I should have received some data to back up the data of the state of the entire network, the data of the entire network. I should have received some small chunk of it. I can identify it. Have I received it? And then, I not only should have received data, but I should have also validated some of the witnesses in the global network, or in the shard. Have I done that?
Starting point is 01:17:04 And if I have done that, I sign off. And if a majority signs off, that is when Near says, okay, we have consensus overall that the data is backed up properly, and all parts of the update had witnesses, and they were validated by enough validators, and therefore this block is correct, and then that's done, and then the chain moves on. Yes, yeah, that's right. And at the end of the day,
Starting point is 01:17:35 all of that complexity boils down to, you know, like, there are checks and balances that people do, but at the end of the day, all you care about as the observer is: do I have signatures from a sufficient set of validators? And if I do, I know that they have done all the work, or at least they claim to have done
Starting point is 01:17:53 so, right? But that's a whole set of validators, right? So for them to be corrupted, you need, you know, you need to corrupt, like, a whole big proof-of-stake system, which, you know, has sort of its own, you know, design. There are certain design principles that don't allow it to be corrupted without a massive amount of money being lost. And, yeah, and so that also allows, you know, other chains to easily create light clients, right? So Near itself encompasses a lot of processing internally. It could also be a part of a bigger ecosystem, because building a light client for Near is a relatively straightforward process. And Near, having, you know, a general-purpose Wasm machine, can run any light client for any blockchain that allows a light
Starting point is 01:18:37 client and that allows you to have two-directional bridges that effectively say as long as the light client of both chains is not compromised. And light clients are very hard to compromise, despite the term light client. Like, you know, compromising in Ethereum light plant or near a light client is extremely hard. And so then you say, well, for as long as those are not corrupted, NIR is also part of the bigger ecosystem of more chains. So, I guess like
Starting point is 01:19:06 every creator, when they make something, they end up having... like, there's usually a certain element of the system that they wished was better. And do you have those? Like, for you individually, what parts of Near are kind of unsatisfying,
Starting point is 01:19:22 you're kind of unsatisfied about in the design? Well, I mean, there's a few things that we're still finishing up and still on the road. I think, so we mentioned dynamic resharding, right? So right now, as Alex mentioned, right, this is more of a kind of a governance, technical governance process. But the benefit with the stateless validation approach is that because we rotate validators, you know, every second now, you can actually change the number of shards. for validators very quickly, right?
Starting point is 01:19:55 Because again, they don't really care how many shards there are. They don't care which shard they validate. They just receive the block. And so the benefit here: if we have sufficient kind of redundancy on chunk producers, we can split the chunk producer group
Starting point is 01:20:09 and say like, you know, as of next block, you don't care about half of the shard that you were in, right? So, comparative, you know, to where we are now, where we need to like everybody agrees, you know, new client has been drawn, validators spit it up, you know, this shot it happens. This can happen literally instantly where everything like,
Starting point is 01:20:32 okay, now we split into two shards, you know, now I'm ignoring just half of the transactions that I may still receive, because people are still routing to me, and I'm just processing this half. And then on the chunk production side, and then on the validator's side, you know, now you're just assigning... yeah, sampling just from a different number. For the listener, this is the cell tower splitting into two dynamically, or the post office splitting into two. Exactly. And this can happen immediately, right? Which is, you know, a huge benefit for a spike in usage where, you know, some application
Starting point is 01:21:04 just went, you know, viral, anesthesia, or, you know, tuck and claim or whatever. And you can like, hey, let's just pull it out into a separate chart, let it, you know, go nuts and then maybe merge it back in like a day or so. So that's, that's, you know, definitely huge. kind of better suit that and for context I mean, status validation is something we published in February
Starting point is 01:21:28 and been building last year and should be launching within the next few months and then we're going to start working in dynamically. Now, from my perspective, there is a few things that beyond that
Starting point is 01:21:45 that would be, that we should be working on. One is and Alex actually worked some designs of this earlier is leaderless chunk production. So right now there's one chunk producer at a time that is responsible for producing
Starting point is 01:22:03 a chunk, right? And I mean, they kind of rotate and they randomly assign but still if that chunk producers offline, now we have a gap on that time slot and you drop in TPS you have you know higher latency with the users.
Starting point is 01:22:20 And so the idea definitely is how They can also be deduced. Which is, yeah, something that happens on other networks right now. Not because we somehow resilient to that. It's just badgers shows another network to deduce, not ours, but it can happen eventually, right? So, so literally with chunk production and consensus is something we need to implement. And we have a pretty clear way of achieving it.
Starting point is 01:22:46 So by neediness chunk production, So when you say that I'm kind of like reminded of Algonaut where where the essential idea is like when you think of a network like like like you when you think of like a Cosmos network the blockchain is rolling along and blocks have been produced everybody knows who should be producing block N plus one right it's publicly known and if they fail to produce the network weight and then realizes fails to produce and then the network also knows if that per guy fails to produce the block
Starting point is 01:23:26 then this other guy should produce the block and there's kind of like almost a line a cue made of like who has rights to produce and so that's only what you mean
Starting point is 01:23:37 when you were saying leader so if you have a chart you have a multiple validate like multiple chunk producers but which one exactly produces the chunk for this block
Starting point is 01:23:48 it's currently known in near, just like cosmos. But what you would like is a system where there are N chunk producers and then when a certain block rolls along, one of the chunk producers realizes they have some kind of winning lottery ticket and they can produce the chunk?
Starting point is 01:24:11 So this is not, so you're explaining algorithm and it is not legal. I'm explaining algorithm. So is that the vision for Luton? No, there's still a leader in what you explain. So the only difference is that the leader is not known in advance. So it partially solves the problem. It's much harder to get those, for example.
Starting point is 01:24:26 You cannot dedos someone you don't know. Right. And so by the time, you know that they want the lottery ticket. And we actually do the way we think of it is similar. So in Algarant, you actually don't know in advance that you want a lottery ticket. You know, like if actually what happens is that everybody looks at their lottery ticket and sees the number. And the highest number wins. you don't know if you have the highest number, because you don't know the numbers of others.
Starting point is 01:24:51 If you did and others did, then it would be no different from Cosmos, then everybody would know, right? So instead you say, well, I have a sufficiently high number for me to believe that I might be the winner. So I will produce the block. I will publish it. Maybe someone else will publish, and, you know, given my number was 97 out of 100, there's a good chance that I was the highest, right? But maybe there was 98. this approach has a minor problem that still multiple blocks will have to be broadcast because effectively either the threshold is too high that every now and then nobody will broadcast a block
Starting point is 01:25:25 because nobody won the ticket above a certain threshold or it's like sufficiently low that multiple blocks will be broadcast but in our case something interesting to think about is that at the end of the day the chunk will be broadcast to everybody like a little part of it right and so everybody will have to receive. But within the network, within the set of the chunk producers,
Starting point is 01:25:48 what they can do, they will have to send the chunk in its entirety for validation, or like with the validators of the shard. Right. So you can think of a system where on the high level, what happens is that every single chunk producer generates a chunk, not just some people who are beyond certain threshold. Everybody produces a chunk. And then everybody stands to every person in the shard,
Starting point is 01:26:11 So like to every chunk producer, every, you know, like you're a recrugated part. And so at the end of the day, the network overhead is not chunk size times number of participants. It's still proportional to the chunk size, right? But now you have as many chances you have chunk producers. And then you still do the lottery ticket. Then you reveal your lottery ticket. And if you won, your chunk is accepted. Right.
Starting point is 01:26:35 But there is no issue with choosing the threshold and maybe spamming the network with multiple chunks that have to be exchanged in its entirety. So that's the high-level idea, right? But the high-level idea is that now you can literally never a chunk will be skipped, no matter, you know, like as long as there's one person you didn't deduce, the chunk will not be skipped. And it's a slightly, slight improvement over an algorithm
Starting point is 01:27:00 idea where you, you know, have a threshold and but still exchange the whole block. Yeah, this is really fascinating. So I think the essence of the design, being that in Bitcoin or in Ethereum, when I have a block, I have to forward that entire block to everybody. And there's a certain like diameter of the network. So I'm a validator.
Starting point is 01:27:28 And there's some validator to which I have the worst connection. Like there's multiple hops. And that entire block must be now sent across the diameter of the other side to that worst other validator. But in near almost it's like I have a more efficient microphone, or megaphomes. I produced a block. I cut it into lots of pieces and I only need to send these small pieces to all of the validators.
Starting point is 01:27:59 So there's some validator to which I have the worst connection, but I only have to send a small piece through that worst connection, through that diameter of the network. And because of that, because, because like this broadcast is only piecewise and this broadcast is kind of like efficient, you can afford to have a design where in a shard, all of the chunk producers produce a block, they broadcast their pieces and like then they compare, okay, which of these pieces has some highest amount of randomness and that becomes like the canonical chunk for that block. Yes.
Starting point is 01:28:38 Yeah, it's got a chunk of the block, yeah. And it's like depending, like, that person I have the worst connection to, depending on why they need my block, right? If they're just a block producer and they just need to sign off, it's sufficient for them to have just that little piece. They don't need to reconstruct the block. Or a chunk. But if they do need the whole chunk, it's still more performing
Starting point is 01:29:00 than sending a whole chunk to some, right? Like I send little pieces and then that person will collect little pieces from different entities, right? So it's still a faster way to propagate information. And I guess the advantage here for us in building that literally consensus, sorry, literally chunk production, is that we already have a concept of this erasure code,
Starting point is 01:29:21 we already send small pieces. We already have the mechanic to gather them. So it's much less invasive change than for a network where today blocks are being sent fully. So for them to implement the same feature, would be implementing a lot of new mechanics, while in our cases, just plugging into existing machinery. I wanted to touch, so you asked a question,
Starting point is 01:29:47 which I think is interesting, and I earlier covered it from a very different perspective of sort of, you know, something that design-wise is not great, and we would like it to be different. I think something I thought a lot from the day we launched near is that the accounting model for sharding, is quite different, right?
Starting point is 01:30:08 Like, you cannot have, like, flash loans, for example, easily because accounts are only given shards and everything takes a hope, right? And we've been trying to solve this problem, like we were trying to find a way to have atomic transactions since day one with many different designs. And it's a drawback of sharding. But what's interesting is that I think slowly we're coming to realization that long-term not having sharding is not an option.
Starting point is 01:30:35 Right? So in essence, the highest-throughput blockchains today, which are not sharded, are getting congested. And there's only so much more they can squeeze out, right? Like, they work day and night to remove all the suboptimalities, but at best they will squeeze out another 20%, right? Let's say 50%. That will get congested again. Like, the adoption of blockchain today is a fraction of what we want it to be to consider the whole ecosystem successful, right? And so sharding will have to happen.
Starting point is 01:31:02 And when sharding has to happen, people will have to deal with this disadvantage of this different account model. And so Near in this case is positioned extremely well, because from day one, every application on Near was built with that account model in mind. Right. So we have tools. We have understanding. Developers in the ecosystem are used to working in this setup, right? While in the rest of the ecosystem, people are still operating in this atomic-transactions mindset, which they will have to abandon eventually. Because at some point, if their application is to scale and the applications they depend upon
Starting point is 01:31:38 are to scale, they will not be able to maintain this atomic guarantees and scale to the usage, right? So they will have to abandon this mindset and we're positioned uniquely in the sense that everything that is built on near, this rich ecosystem of applications, is built in the future-proof way.
Starting point is 01:31:57 As we create more shards, every application in near gets to take advantage of that. while for any application that is built on, you know, what we call synchronous runtime, they will have to rewrite their applications. I think that actually the really interesting part is because NIR is kind of asynchronous and in a way every contract, every account and every contract is designed to be independent, it actually doesn't matter if that account is on NER or not. And so that's why kind of a lot of the chain kind of abstraction ideas become because, well, it actually doesn't matter.
Starting point is 01:32:41 For like for a rest finance, like an AMLNN, NIR, it doesn't matter if the token is on NIR or on the CDUM and base on Solana. Like you can actually deposit any token and the rest finance is able to handle it. And so, so like generally speaking, NIR is designed with every contract, every cap, every contract, every cap, count as kind of handling assets that are living on like in a synchronous way right and obviously it's good to have them on near because we get like one second communication time so like the latency is very low air you know that count spaces is nice but it can be living somewhere else and if like if we have kind of a message passing way of sending this then you know near spot contracts know how to deal with that And all our standards, like AirC20 on analog or standard is designed with callbacks and kind of best passing in mind.
Starting point is 01:33:38 And so again, like compared to right now in EVMs, they try to figure out how to do cross-layer-2 communication. The challenge is really lies in like all of the standards. Like imagine trying to send the RC20 from one chain to another. Like there's no standard address space or anything that supports that. It's also synchronous, like expectation is that that transaction will execute the same block. but actually it needs to be scheduled message passed settled you know sent somewhere else about invalidated etc so so really that's kind of the it was a very non-trivial trade-off right and and we kind of i would say had a had the period of time where we're like did we do a right
Starting point is 01:34:20 trade-off kind of but but right now yeah we seem like we're literally seeing the validation of our pieces throughout. Yeah, so the point being, like something like flash loan where in the near design, it's hard. It's not that flash, yeah. It's more like Iron Man loan, yeah. Yeah, it's not that flash. So at some point in 2021 or something like that, it must have seemed that, oh, Ethereum has
Starting point is 01:34:48 flashed loans, but near Woodle and that's a problem or is a perceived problem. but when you look at like Ethereum in 2008 which is Ethereum plus all of these L2s and N3s you can't have flash flows across their entire ecosystem and you can't have flash notes across near
Starting point is 01:35:07 so it's fine right Nick it's a it's a downside but not really exactly like the modern G5 of 2028 will be what what people have been built in G5 near the past
Starting point is 01:35:22 two or three years. And as I will you mentioned, this entire ecosystem of L2's and L3s will also be part of near ecosystem because of chain obstruction. Any account on any chain can be, you know, with a very thin abstraction layer
Starting point is 01:35:39 perceived as a near account just with a higher latency. Right? So like a near account to near account will be one second always. But near account to optimism account, it's going to be exactly the same protocol on near side. But the latency will be higher because there's communication between them.
Starting point is 01:35:58 I'm not going to chain abstraction because I feel like we've already covered so much and I did cover chain abstraction in the previous episode with Ilya. So I mean, avoiding that because things start to become already too complex.
Starting point is 01:36:16 Curious listener can Google near chain abstraction. Yeah, I just wanted to give an understanding why Like chain obstruction didn't come from nowhere. Chain obstruction was the mentality we took was NIR when we designed NIR. It's just like now we expanded that to kind of all chains. But it's still kind of, you know, the sinking we put in into designing NIR is still there. Now how do we apply it to the whole that 3?
Starting point is 01:36:43 Again, as soon, like you can think of optimism just being another shard of near. And this is where actually like, you know, for example, ZK Proofs, it, And Ag layer, what Aliga is working on, is all coming together. Because, like, if we can unify security, right? If we can kind of provide kind of common security layer, then on top of this, you know, and again, we have NIA, we can actually settle DA of other layer twos. Well, now they're actually not that different from other shards of NIR.
Starting point is 01:37:16 I mean, there's differences in sequencing and production. So, like, there's things to handle under the hood. But again, we can kind of extend our layer of obstruction that we provide to users and developers to kind of cover up that and say, like, actually, you know, if there's ZDK proves and data availability, we can actually, like, say security is the same, and now we can message pass and we can do kind of other species this way. And so that's kind of the idea is, like, you know, how do we, how do we apply the same methodology and kind of user experience, developer experience, but then expanded back more to the rest of WebStreet.
Starting point is 01:37:56 So, like, from my perspective, you know, when I look at like Ethereum's roadmap and then like the near and near roadmap, one of the things that stands out to me is in Ethereum, the roadmap is based on like scaling via these L2s and L3s. But the relationship between ether, the asset and the core asset of the L2 can be synergistic at times
Starting point is 01:38:26 but it can be non-synaristic at other times right? So it's like the L2 pays the main Ethereum chain for certain services and the service intended is usually data availability. But it can be the case that the L2 generates 100 million in fees but it only pays 500,000,
Starting point is 01:38:49 in fees to the main chain. It's, this relationship is kind of like great. If the L2 is kind of building a completely new market that Ethereum never had, right? Like, imagine, I don't know, some AI decentralized app comes and the L2 capitalized on it, built it.
Starting point is 01:39:11 They got 100 million in transaction fees and paid $500,000 to Ethereum main chain. It's great for the Ethereum main chain because there's a new revenue stream coming. It made $500,000. it's new. The interesting case becomes when some app that was massive, that's like massively popular on the Ethereum main chain generating millions in fees, ends up thinking, it's better I migrate to that L2. And so they might be making like 10 million in fees on the main chain and then they migrate to the L2 and then the fees get cut to a million and then Ethereum is only making 100,000 on it.
Starting point is 01:39:49 So it was making 10 million in fees, and now it's only making 100,000 in fees because the L2 ecosystem exists. And they're kind of like the relationship is, well, from the Ethereum Ether holders perspective, that isn't so ideal, right? Because you're losing a DAB that might have been cultivated by the Ethereum network over years,
Starting point is 01:40:14 and then now it's kind of like migrated away. And in practice, this has happened with something IDX. But what's really cool about Near is like this sort of system doesn't exist. Like the Shards, like the relationship between like Near and the shards is kind of the near token kind of like owes the revenues made by all shards. And kind of there is not, in order to scale, it doesn't need to have these complex economic games be present.
Starting point is 01:40:48 between kind of like a main chain and an execution layer, which is very much there in Ethereum, and I believe that this will be, this will become a relevant feature of Ethereum's ecosystem politics in the future, and Nia will just not have any of it. Cool. So, yeah, I guess we can keep it at that, and it was great to have both of you on the podcast. maybe we should have another one to discuss on how
Starting point is 01:41:21 Alex is planning to use recent developments in the AI technology and what is building there. I was going to say that it's not the coincidence that AI stands for Alex Amelia. Yeah, I mean, thanks for having us. Obviously, this is a highly technical topic that I think it's been really hard to explain in general. I mean, we've been trying to do this for years now. but the core idea
Starting point is 01:41:49 is also like it was really hard to prove it out when you just launched because when you just launch you don't have anything so there's no users so there's no need for sharding and I think like we've kind of had that problem in Web 3 where everybody was claiming scale but until you actually have like real world massive user base to actually transact
Starting point is 01:42:13 you know just like a general improvement was enough. And I think only in last probably like three, six months we're seeing, you know, on NIA, for example, kind of multiple kind of million user applications launching at NIA right now, somewhere between
Starting point is 01:42:32 1.5 to 2 million daily active, right? Which is more than any other blockchain right now. We have more transactions usually than all layer two combined. at least on some days. And so, like, that's kind of where this is starting to prove out, right?
Starting point is 01:42:53 And for context, we're still under Solana's transaction numbers. But Solana counts the consensus transactions, so I don't know how we compare on the actual number. Yeah, but generally speaking, we have more daily active users than Solana on most days, and than Tron, which is the second biggest right now. But again, the point is that as we started to see this growth, we started to see congestion on some of the shards. And the idea was that we can just expand capacity without increasing fees,
Starting point is 01:43:34 versus every other blockchain, including Tron actually is a really good example because the transaction fees went from being very cheap. to actually now being like, you know, 30, 50 cents for their users. Even though they are running, you know, it was like a small subset of validators, you know, a kind of modified Indian chain. So I think like that's kind of where we're starting to see these things play out. And obviously, again, it took a while, right, to ecosystem to mature the applications to, you know, build and launch as well as for them to gain users.
Starting point is 01:44:12 but now we're starting to see this story really playing out and, you know, on this street. It's exciting and it's also kind of tell now is really good time to tell the story and kind of explain how it works. Cool. Then I'll catch you again on Epicenter, India and Alex. Thank you. Thank you for being there. Thank you. Thank you. We release new episodes every week. You can find and subscribe to the show on iTunes, Spotify, YouTube, SoundCloud, or wherever you listen to podcasts. And if you have a Google Home or Alexa device, you can tell it to listen to the latest episode
Starting point is 01:44:55 of the Epicenter podcast. Go to epicenter.tv slash subscribe for a full list of places where you can watch and listen. And while you're there, be sure to sign up for the newsletter, so you get new episodes in your inbox as they're released. If you want to interact with us, guests, or other podcast listeners, you can follow us on Twitter. And please leave us a review on iTunes. It helps people find the show, and we're always happy to read them. So thanks so much, and we look forward to being back next week.
