Bankless - 119 - Dive into Danksharding | Vitalik, Dankrad, Protolambda, Moderated by Tim Beiko
Episode Date: May 23, 2022This might be the dankest livestream we’ve done to date. Vitalik Buterin, Dankrad Feist, and Protolambda join Tim Beiko to discuss all things Danksharding and Proto-Danksharding. Sharding what? Don'...t worry. By the end of this conversation, you’ll know how sharding has evolved over time, why danksharding is possible, how proto-danksharding will get deployed, and so much more. ------ 📣 ALCHEMIX | Get a self-repaying loan today! https://bankless.cc/Alchemix ------ 🚀 SUBSCRIBE TO NEWSLETTER: https://newsletter.banklesshq.com/ 🎙️ SUBSCRIBE TO PODCAST: http://podcast.banklesshq.com/ ------ BANKLESS SPONSOR TOOLS: ⚖️ ARBITRUM | SCALED ETHEREUM https://bankless.cc/Arbitrum ❎ ACROSS | BRIDGE TO LAYER 2 https://bankless.cc/Across 🏦 ALTO IRA | TAX-FREE CRYPTO https://bankless.cc/AltoIRA 👻 AAVE V3 | LEND & BORROW CRYPTO https://bankless.cc/aave ⚡️ LIDO | LIQUID ETH STAKING https://bankless.cc/lido 🔐 LEDGER | NANO S PLUS WALLET https://bankless.cc/Ledger ------ Topics Covered: 0:00 Intro & Danksharding Explained 7:00 Panelist’s Backgrounds 8:34 How Sharding Has Evolved 15:56 Why Danksharding is Possible 23:40 Proto-Danksharding 27:15 Data Availability 35:40 Data Availability Sampling 48:00 Cryptography for Noncryptographers 56:10 How Does All of This Get Deployed 1:00:05 The Big Open Questions in Sharding Land 1:05:30 Closing Thoughts ------ Resources: Vitalik https://twitter.com/VitalikButerin Dankrad https://twitter.com/dankrad Protolambda https://twitter.com/protolambda Tim Beiko https://twitter.com/TimBeiko Proto-Danksharding https://www.eip4844.com/ ----- Not financial or tax advice. This channel is strictly educational and is not investment advice or a solicitation to buy or sell any assets or to make any financial decisions. This video is not tax advice. Talk to your accountant. Do your own research. Disclosure. From time-to-time I may add links in this newsletter to products I use. I may receive commission if you make a purchase through one of these links. Additionally, the Bankless writers hold crypto assets. See our investment disclosures here: https://newsletter.banklesshq.com/p/bankless-disclosures
Transcript
Hey, Bankless Nation, welcome to our special live stream. This is going to be a panel edition.
I think, David, this is going to be our dankest panel, maybe the dankest episode we have ever recorded because we are talking about dank sharding today.
And I got to be honest, going to this episode, I'm not entirely sure what it means.
Like, what is danksharding? But we'll ask some of the participants this.
Maybe, David, you could describe the setup of this panel and who's on it and how we're going to handle this episode.
Yeah, of course. You guys all know.
Tim Beiko. He's the guy on the bottom of the screen. Tim manages the All Core Devs call and is coordinating the lead into the merge and beyond. And we've had Tim Beiko on before. He led a very technical EIP-1559 panel and asked the questions that Ryan and I are just not smart enough to ask. So this is one of those panels. We're going to get as technical as possible. We have three fantastic panelists who are behind the scenes. We've got Vitalik, Dankrad, and Proto, who are the minds behind danksharding and sharding in general and other Ethereum-related technologies.
And Tim is going to be able to ask questions that are the technical questions, the smart questions.
But before we get there, Tim, we just want to cover some high-level stuff.
What is danksharding?
How did it get its name?
And just like, what does it mean for users?
Right.
Well, first, yeah, thanks for having me on, guys.
So danksharding and proto-danksharding, which we'll also get into, are iterations on the sharding design for Ethereum. And we'll spend the bulk of this panel discussing what they are, what the tradeoffs are, and whatnot.
But at a high level, sharding is a way for Ethereum to have more data pass through the network.
And because layer two solutions like ZK rollups and optimistic rollups, they produce a ton of data.
If we have a way for them to post that to the Ethereum network more cheaply, it immediately reduces how much they need to charge users for transactions.
So all of these kind of flavors of sharding all have kind of the same end goal, which is to create a cheap place for layer two solutions to post data on Ethereum.
And the impact of that is that the transaction fees that end users end up paying on layer twos are lowered by a lot.
So this is all about getting transaction fees down, particularly on layer twos, gas fees down on layer twos.
And Tim, is this related?
You mentioned proto-danksharding and danksharding.
You guys will talk about all that in the panel.
But is this related to EIP 4844?
Because that's another EIP we've heard a lot about.
Yes.
So EIP 4844 is proto-danksharding, basically.
And one way to think about it is, proto-danksharding is like maybe the first step we get towards sharding, and then danksharding is the simplification of what we had on the previous roadmap.
So like the more prefixes we have, it's like the sooner we get them.
Yeah.
All right.
This has been, like you said, an iterative cycle, an iterative process to get to where we are today.
And danksharding comes from Dankrad, who's on the panel.
And proto-danksharding comes from Protolambda, who's also on the panel.
So these guys' names have been baked into the name of this EIP itself.
Tim, is this the new EIP-1559?
Is EIP-4844 going to be the new EIP that we focus on going into the future?
So there's a lot of stuff we're working on right now.
you know, it's definitely one of the big ones.
And I think for end users, it's one of the most impactful ones
because it directly affects the gas price they pay.
Hopefully it's much less contentious than 1559 was.
But yeah, it's definitely an EIP you're going to be hearing more about over the next few months.
And just one last question before we hand it over to the panelists
and we bring the panelists online.
There's been a bunch of conversations about, you know,
when EIP-1559 came along, people were like,
is this going to reduce gas fees?
And the answer was no.
And then people are like, the Ethereum merge, is that going to reduce gas fees?
And the answer is no.
This EIP reduces gas fees.
This reduces transaction fees, not for the layer one, but for the layer two, correct?
Yes, on layer twos, yes, that's correct.
And basically, 4844 is a way for us to get some of the reductions of sharding quicker.
And then the full danksharding rollout gives us even more reductions.
But because 4844, aka proto-danksharding, is simpler to implement, we can just get that first.
Fantastic. All right, I think that is all of my questions. And I think that's all of Ryan's questions. So with that, I'm going to ask the panelists to come in from the shadows and turn on their cameras. And me and Ryan are going to actually duck out of here. Excuse me, we're going to dank out of here and let Tim take over this stream. Guys, welcome to the panel. And Tim, thank you for doing this. And then absolutely just take it away.
If you're trying to grow and preserve your crypto wealth, optimizing your taxes is just as lucrative as trying to find the next hidden gem. Alto IRA can help you invest in crypto in tax-advantaged ways to help you preserve your hard-earned money. An Alto Crypto IRA lets you invest in more than 150 coins and tokens with all the same tax advantages of an IRA. They make it easy to fund your alternative IRA or crypto IRA via your 401k or by contributing directly from your bank account. There are no setup or account fees, and it's all you need to do to invest in crypto tax-free. Let me repeat that again. You can invest in crypto tax-free. Diversify like the pros and trade without tax headaches. Open an Alto Crypto IRA to invest in crypto tax-free. Just go to altoira.com slash bankless. That's A-L-T-O-I-R-A.com slash bankless and start investing in crypto today.
The era of proof-of-stake is upon us,
and Lido is bringing proof-of-stake to everyone.
Lido is a decentralized staking protocol
that allows users to stake their proof-of-stake assets
using Lido's distributed network of nodes.
Don't choose between staking your assets
or using them as collateral in DeFi.
With Lido, you can have both.
Using Lido, you can stake any amount of your ETH to the Lido validating network and receive stETH in return.
stETH can be traded, used as collateral for lending and borrowing, or leveraged on your favorite DeFi protocols, all this without giving up your ETH to centralized staking services
or exchanges.
Lido now supports Terra, Solana, Kusama, and Polygon staking.
Whatever your preferred proof of stake asset is, Lido is here to take away the complexities
of staking while enabling you to get liquidity on your stake.
If you want to stake your ETH, Terra, SOL, or MATIC and get liquidity on your stake, go to lido.fi to get started.
That's L-I-D-O dot F-I to get started.
The Layer 2 era is upon us.
Ethereum's Layer 2 ecosystem is growing every day,
and we need bridges to be fast and efficient
in order to live a Layer 2 life.
Across is the fastest, cheapest, and most secure cross-chain bridge.
With Across, you don't have to worry about the long wait times
or high fees to get your assets to the chain of your choice.
Assets are bridged and available for use almost instantaneously.
Across bridges are powered by Uma's optimistic Oracle
to securely transfer tokens from Layer 2 back to Ethereum.
A token proposal is being deliberated as we speak in the Across forum, where community members will decide on the token distribution.
You can have your part of Across's story by joining the Discord and becoming a co-founder and helping to design the fair launch of Across.
If you want to bridge your assets quickly and securely, go to across.to to bridge your assets between Ethereum, Optimism, Arbitrum, or Boba networks.
Okay, sweet. It's just us now.
So I guess, yeah, before we get into it, do you each want to just take a minute and kind of talk about what you work on and who you are? Yeah, Vitalik, we can start with you.
Yeah, so hi, I'm Vitalik. I'm the co-founder of Bitcoin Magazine, and I write a blog. I contribute to specs once in a while.
Dankrad?
Yep, hi, I'm Dankrad. I'm an Ethereum researcher since 2019, and I'm working on, among other things, sharding. What else have we worked on? Proof of custody, statelessness, yeah, some projects on the roadmap of Ethereum.
Nice. And Proto?
Hey, hello. I'm Protolambda. I used to work at the Ethereum Foundation on research there. Now I do the same thing at Optimism. I helped with sharding earlier on, and now I'm contributing back to layer one while working at Optimism on layer two.
Sweet. Okay, so just to kind of get into sharding generally, over the past couple of years what sharding means for Ethereum has changed a lot. And I think the biggest one is this shift from full execution sharding to only data sharding. Vitalik, do you want to give us just an overview of how that shift happened in the research roadmap and why we've landed on just doing data sharding?
Yeah.
So I think there's been this ongoing simplification of the sharding roadmap that started really
sometime in 2016.
So for people who have been in the Ethereum ecosystem for a long time, you might remember
some of the scaling docs that we published back in 2015, back in early 2016.
some of the blog post thinking that came out in 2014.
And the stuff that was there in those earlier periods was in a lot of ways really complicated, right?
Like there were these ideas around hypercubes and hub-and-spoke chains and in-protocol supported cross-shard transactions that would be routed between like one corner of a hypercube to another corner, where the protocol would help them from like one shard to another to a third to a fourth.
There was even thinking about super quadratic sharding,
which is basically saying,
like instead of just having shards,
you have shards on top of shards
and potentially like an infinite hierarchy of shards inside of shards inside of shards.
So actually the sort of stuff that the Telegram TON project ended up incorporating into their paper,
though I guess that never really ended up coming close to going live,
unfortunately. But that was the kind of thinking that we had back in 2015 and 2016. And I think after
that, the progression has just been this big slow process of, I guess, increasing pragmatism, increasing appreciation of how complicated it is to develop and actually bring to production just about anything. Like what feels like 10 lines of code actually becomes like hundreds of lines of code once you add all of the complexities that clients inevitably have to have and deal with
how some particular thing
interacts with the syncing process,
how some particular thing interacts with the
fork choice, how
some particular thing interacts with the
database and the need
to store it in optimized formats and that
sort of stuff.
So the process
of simplification, I think
the big first step was definitely
the decision to not
bother with anything beyond quadratic sharding and just say we're doing quadratic sharding, right?
So not bothering with ever doing any kind of shards on top of shards and just saying, we have the
beacon chain, we have shards, shard headers are connected to the beacon chain, and that's it.
Like, that's the only layer of sharding that ends up actually happening.
So that was the first step.
Then the second step was the move from this concept of like chains that have regular commitment blocks in the beacon chain.
I think there was another word for them.
I forget what that word is.
Crosslinks was the word, right?
You have shard chains that crosslink into the beacon chain.
So moving from that to a system where you just have every shard block directly get included in the beacon chain.
So that was the second simplification.
I forget exactly when that happened.
I think that might have been around 2019 or 2020 or so.
And the big benefit of that simplification is that it meant that we didn't have to worry about shard chains anymore.
Then after that, there was the idea that we're going to do data sharding first.
This was when I started talking about the roll-up-centric roadmap, right?
Basically, instead of shard blocks actually containing transactions that would be executed at the Ethereum layer,
these shard blocks would just contain big blobs of data, and it would be the responsibility of
Ler2 roll-up protocols to use that data space in order to create secure and more scalable
experiences for their users, right? So Ethereum, the system would provide non-scalable computation and
scalable data, and what a roll-up does is it converts scalable data and non-scalable computation
into a scalable computation.
So we have a somewhat more performant layer one that has extra data space,
and then we combine that with this layer two ecosystem, and the layer two ecosystem ends up like really bringing the scalability to life.
So that's the rollup-centric roadmap.
And at the beginning, I think the rollup-centric roadmap was phrased in this ambiguous way where it basically said, well, look, with the rollup-centric roadmap, realistically, like, data sharding is the obvious prelude to full sharding anyway, right?
Like if we're going to implement full sharding with EVMs on all the shards,
it's just an obvious first step to have data shards first.
But it turns out that data shards are actually really good for roll-ups already.
And so we might as well run with that.
We might as well realize the roll-ups are our best hope for short-term scalability.
And just take that direction and try to make the best of it.
And that still leaves open the door for adding EVM execution shards in the future,
but it basically says, well, actually, you know, Ethereum will be fine even if we end up never actually completely doing that, right?
So that's the rollup-centric roadmap.
It's another simplification, basically saying we don't have to bother with execution on shards.
And then that also allowed some other simplifications, like it made it even more possible for shards to not bother with fork choice rules, for example.
And then the next simplification after that is danksharding, which basically said that there is actually this merged proposal mechanism where there's only one
proposer that chooses all of the shard blocks on all of the shards that appear within a particular
beacon block.
And that simplifies things massively in a whole bunch of ways.
So it basically means we don't have to deal with like the whole shard proposer bureaucracy, which simplifies that. It simplifies the complexity a huge amount.
It simplifies some of the economic properties a huge amount.
It basically makes the system feel much more similar to like just what a non-scalable chain would look like, except it's just more scalable, right?
And that extra scalability happens in the background.
And then proto-danksharding finally, like that's not a simplification.
That's more a step on the way to full-dank sharding that gets us maybe half the benefits of sharding.
But at a point that's maybe like halfway along the timeline to actually getting full sharding out there.
so we actually get some of the benefits sooner.
So that was the general progression, like basically more complexity to less complexity, more of Ethereum trying to do everything to less of Ethereum trying to do everything, and more willingness to work with layer two protocols and those two things together,
and that's where we are now.
Yeah, thanks.
That was comprehensive.
Dankrad, talking about danksharding, can you kind of walk us through how,
so, like, this idea that it's okay to assume that the block proposers or the block builders have to track all of the shards, because of this separation between proposers and builders that we've seen kind of emerge over the past few years, especially with the rise of MEV.
So, yeah, just talk us through, like, why is it possible to do something like danksharding and not sacrifice the decentralization properties of the chain?
Right.
Yeah.
So I guess, I mean, maybe like if we go a bit into the history of MEV, or maybe think about how it has been recently.
So it started with maybe some mining pools doing some stuff to exploit MEV and, like, yeah, to get some more than just the transaction fees.
And over the course of time, this has become more and more professionalized.
Like nowadays, most of them work with some other entity,
for example, Flashbots, that sells them complete bundles of,
or maybe selling is the wrong word here,
that buys complete slots for bundles to be included in blocks
that exploit a certain amount of MEV.
And then the block producers, which are currently the miners,
they just get the payment for that.
They get, like, so yeah.
So those searchers are now like lots of independent entities that try to find the best MEV.
And mining pools now don't have to bother with this anymore.
And basically it turns out that if we want to properly decentralize those, which is something we really want to do, like, so right now there's like maybe a few mining pools, like tens of them or so.
And so like the way it works right now is that basically FlashBots has a business relationship with each of them.
Like they basically have a trusted relationship.
And if like one of the two sides did something naughty, like for example, the miner could start looking into the strategies and instead of just executing them, they could like exploit the strategies themselves.
So for example, if you have an arbitrage transaction, then like you can always.
often do much worse things with the individual trades if you want to exploit them.
Or like they could try to figure out what the strategies are by what they are being sent and so on.
So this doesn't scale to a system where instead of like a few tens of mining pools,
we have probably tens of thousands or so of individual validators because you can't
have like a trust relationship with each of them.
That's not going to work.
So the only way to translate this world of MEV
into the future of proof stake is by having some form
of proposal builder separation.
The way proposer builder separation works
is that instead of having traditionally,
like I guess like five years ago,
we all thought of it as the same thing,
if you propose a block, you build that block, right?
And with a builder separation, that's not true anymore.
What we do instead is someone, a builder, builds a block,
and the proposer just proposes, oh, yeah, I'm going to propose that block that this guy built.
So we separated the two roles.
And in that way, we can have a professional role of block building,
which is this role that will extract MEV or work with all the searchers
and so on, and we can have the proposer,
and that is just a normal validator.
And the good, nice thing is proposing is extremely simple and cheap,
because it's basically just selecting the highest bid
and saying, yep, you get to build the block.
And whereas building is a complex process
where you have to manage lots of searchers
and they have to trust your system
that their strategies won't get exploited and so on.
So that is more suited for a complex,
and more capitalized entity.
And so it's good that not everyone has to do it.
And basically this is the, I guess,
this is how we're seeing the future of block building on Ethereum.
There's not really currently a viable alternative to this world.
And what we also know is that building these scalable systems, especially also building this massive data availability system,
becomes a lot easier once you assume that there is someone who can handle these massive amounts of data.
So once you put that entity into the system and say, well,
there's someone who can compute this encoding, who can distribute all this data and so on,
and so on, then many things become a lot simpler.
And so in the past, I guess we didn't really think about these designs because we were like,
well, we want Ethereum to be extremely decentralized.
And now with this proposal building, builder separation coming into the design space due to
MEV, it's also become available to think about for other things.
And basically this is how we, or like I first thought at the end of last year, well, let's use this, let's use this idea where you have these entities that can
handle, for example, large amounts of data. It's not really a problem if you are like running
some large machines anyway. It's not like an absolutely insane amount. It's not data center
kind of amount. It's just like large machines with a good internet connection kind of amount.
And exploit these entities and let them basically do this building, and that allows us to get to a much simpler and more efficient sharding design.
Got it.
And so am I right in thinking because it's basically very hard to build an optimal block
that becomes a specialized industry, but once you do have a block that is,
it's very easy to verify its validity, right?
So finding the exact right block to build is hard.
And so you need tons of machines to do that.
But once you have found it, then anyone,
can verify it. It's almost analogous to proof of work in that way.
For like, yeah.
Yeah, yeah, that's absolutely correct.
Like basically verifying, I mean, that's the crux of data availability checks, this idea that there is a way to check that this amount of data is available that needs much less work than actually downloading all the data.
So there's this asymmetry where, like, someone somewhere needs to do this work of encoding the data and distributing the data, but then verifying it is much easier.
Okay, and to get us all the way to proto-danksharding now, so recently the three of you have written an EIP, EIP-4844, that's now colloquially known as proto-danksharding, which helps kind of lay the foundation for this full sharding design without requiring this entire kind of shard data network to be live. Proto, do you want to walk us through, like, what are the things that 4844 does to get us toward data sharding? And also, like, how does that help layer twos? Like, how do L2s then use 4844 to provide their users with lower transaction fees?
Right, so we just ended with how danksharding basically introduces data availability sampling and these other new advanced tech features to try and distribute this job across the network better.
But this comes with additional complexity.
It will take more time to properly test and introduce it to Ethereum.
So instead of waiting for the full version of danksharding, we can reduce the feature set and go with an amount of data in between these things.
We can offer additional data and we can look at layer two, like what kind of security
properties they need and optimize for that.
And they can already make a big win there.
And then later, with the additional features, we can get to full danksharding.
And so what we started with here is the changes to pay for the data as a layer two,
this type of transaction that introduces the data to the network.
And we need some changes to distribute the data across the network.
But it won't be as much data just yet.
So it's manageable for the whole network to download.
And so we don't need sampling yet.
And we can have everyone download the data.
Got it.
And how does a layer two then use that data?
Like from, say, optimism's point of view,
how do you actually leverage that?
Is it just changing where you post the data
that's currently posted in normal transactions as call data?
Right.
Is there more involved?
You need to optimize what a layer two really needs.
You can take apart all the things that the layer two uses.
One of those things is publishing the data,
making sure that this honest minority that protects the layer two
is able to get the data in the first place.
and then there is this other property that the layer two uses right now, of getting the data long term. But these are very different. Data availability is this property that ensures actors are able to get the data, and this can be for a more limited amount of time. And so you do need to ensure that even with downtime, and even with censorship and whatever other unforeseen circumstances, these actors on the network, on layer two, are able to get the data. But after some amount of time, this should be sufficient to guarantee the security of the layer two, because you want the actors to be able to reconstruct the state, because only with the full state, only if you have the full history, are they able to challenge the operator, challenge the sequencer of the rollup.
Got it. I think we've mentioned like a few times
already, and I want to make sure we kind of clarify for folks, is this idea of data availability.
And I think this is something that, at least to me, was not really clear until I spent way more time looking at sharding: when we say data availability, what exactly do we mean, and how is that different from, like, the data we store on the Ethereum blockchain today?
I don't know, Vitalik, do you want to give a quick overview?
Sure.
Yeah, I mean, I think it's definitely a very important and subtle topic.
Like I think even the really big point of comparison a lot of people have is like what's the
difference between what we're doing and IPFS, right? And this, like, IPFS is a platform where
if you publish data, then that, like, presumably, you know, if the incentives are right or different
enough people care about the data, the data gets broadcasted, and then anyone who wants to download
the data is able to download the data. The difference between that and what Ethereum is doing and
going to be doing is that Ethereum is and will be providing consensus on data availability.
Basically, so you just, you can always have a hard consensus on the question of whether or not
a piece of data with a particular hash or a particular commitment actually is available.
And what we mean by available basically is did the data go through this publication process
where it got broadcasted on a public network, and anyone who wanted to actually download the
data actually did have a lot of time during which they could have downloaded the data.
So basically the difference between that and something like IPFS has to do with the case
of a malicious publisher, right?
Like, if I'm a malicious publisher, then on IPFS, I could potentially do something where, you know, I control some small number of servers, and then I publish through that small number of servers, and then that small number of servers might respond to some people and say the data is available, but they might not respond to other people.
And so some people get the data, other people don't get the data,
but you never actually get this kind of very binary consensus on whether or not the data was actually published.
Now, the reason why this kind of concept of consensus on data availability is necessary has to do with
a lot of these layer two protocols where those layer two protocols depend on data being out there
and use, not downloaded by everyone by default, but downloadable by anyone in the case that they want to download it, for some security properties, right?
So one very simple example is a ZK rollup, right?
In a ZK rollup, you have a sequencer; that sequencer accepts transactions.
That sequencer publishes these blocks that contain state deltas, and they contain a proof.
And that sequencer also, like, basically manages this kind of internal state, like what is the state of the ZK rollup, like, you know, the balances, contracts, whatever, inside of that ZK rollup.
Now, the difference between a ZK rollup and a Validium, right,
is that a ZK rollup has like state deltas or inputs on chain.
In a validium, you only have the proofs on chain and you have everything else off chain.
From a security point of view of, like, can they force invalid stuff to go into the system, both rollups and validiums protect against that, right? Because the ZK-SNARK prevents you from actually bringing in anything invalid. The place where they're different is what
happens if the sequencer disappears, right? What happens if the sequencer becomes malicious, and they just basically shut off from the network and never talk to anyone again? And the reason why they would do this is basically because they want to just make it not possible for someone else to interact with that system going forward.
And so if people have money inside the system, that money gets stuck.
Now, in a validium, this is actually a problem.
If the validium operator does this, then they can't steal, but they can make people's money stuck.
And so, you know, if they're really mean, they could potentially, like, extort, and they can basically say, like, hey, you know, if the whales don't send 20% of their money to a ransom address, then they're just going to make everyone's money stuck forever.
In a ZK rollup, on the other hand, there's this guarantee that because either the inputs or the state deltas get published to the chain, if the original sequencer disappears, you can always have a new sequencer come in, read the data from the chain, and basically initialize the exact same state that the original sequencer had, and so be in the exact same position and have the exact same capability to then be able to continue providing ZK rollup blocks, processing withdrawals, and processing transactions, right? So because that data is on chain, and so someone else can come in and reconstruct it and, like, basically slot themselves into the same role, you don't have this same security problem that validiums have, right? So the
difference between the two, basically, is: is the data, like, on chain or is it off chain? Now, why does it matter if it's on chain? Because on chain is a very simple, convenient medium where if you see the data is on chain, even if you don't personally
process it, even if you personally don't care about it, you still know that if something terrible
happened and you needed to recover, then you will be able to actually go on chain and grab that
data, right? So what proto-danksharding and then eventually full danksharding try to do is they basically
try to like really zero in on providing a platform that provides exactly that capability, right? So
the beacon chain would actually only contain hashes of data. And so if you're a client,
then you would be just downloading the beacon chain and you would get hashes of everything.
But when I say hashes here, I mean like hashes of KZG commitments and the, you know, blah, blah, complicated math. But like, think of them as hashes. Like, actually, yeah, a KZG commitment is a cryptographically equivalent hash function by, you know, the definitions of collision resistance and pre-image resistance and so forth. But basically, yeah,
the actual full data would instead live in this like sharded system where it would be inside of shards,
and it would be inside of peer-to-peer subnetworks.
And the point of all of this machinery around data availability sampling and that sort of stuff is to basically
provide a way of kind of checking and guaranteeing that the data actually has been published
through this mechanism where if in the future you need it, you will be able to get it without actually
requiring everyone to, like, directly download all of the data themselves, right? Now, the chain does not have to store the data, or the shards do not have to
store that data forever, right? So, like, the plan is for them to delete that data after some
period of time. Like, it's, you know, numbers have been thrown around of, like, somewhere, like,
could be 30 days, could be a couple of months. And then, basically, but the point is to give enough
time that any mechanism or, like, anyone that wants to be able to download the data will be able to download the data, and, like, for there to be enough time for all of the people that would be making backups of data related to a particular application to actually have the time to do that.
Right.
So basically, you know, create the system that's like really optimized around this idea of like,
how can we just provide this exact guarantee of like data availability, like proof that the underlying
data behind a particular hash has actually been published to this public notice board where
if people want it, they can get it.
So the rollups, similarly, can take advantage of that
for scalability without incurring the complexity costs
of actually trying to shard, like, full-on EVM execution
or whatever.
Right, right.
And so it's like the guarantee of the Ethereum L1 protocol
is quite tight.
It's like we guarantee that this data will have been
published on the network for this amount of time.
Beyond that, obviously,
there's still ways to retrieve that data.
They're just not guaranteed by the Ethereum protocol, right?
Then it could very well be on IPFS,
but that's not a guarantee that the protocol can make, you know.
Right, exactly.
Yeah.
And we've touched kind of on this a couple of times already, like this idea of data availability sampling and only verifying, like, having each validator verify parts of the data to make sure that, like, the overall entirety has been published.
Dankrad, do you want to give us like an overview of how data availability sampling works for, call it, an intermediate audience, you know, not a cryptographer?
Yeah.
Yeah.
Yeah.
So the idea behind data availability sampling is that you somehow want a scalable way to ensure that some amount of data is available.
And available basically means you could download it if you wanted to, right?
And so the obvious way, which is what we do now, is we just download it, and then we know it's available.
That's simple, right?
Because if you could download it, then you could download it.
Okay.
But how do we make it scalable?
So scalable means we have the same amount, a constant amount or maybe increasing a logarithmic amount,
but not like a linear amount of resources that we need to do this amount of work.
So we need to find some way of doing this in a more efficient way.
And the way data availability sampling does this is what nodes do is that they sample.
They pick these random parts of the data and they say, I want this and this and this,
and I'll try to get it.
And only if I can get all of these, or maybe a vast majority of them, then I will consider the data to be available.
And naively, like if you just do this on blocks as they are now, then it doesn't work.
And why? Because if there's, say, just one piece of the block missing, the probability that you catch it is tiny.
Because you would have to request exactly that piece of the block.
And you want to only request a really small part of it.
So the probability that you catch it is small.
So that doesn't work because we know it like in blockchains,
basically the thing with blockchains is like the bad things could happen anywhere.
Like even one single missing transaction could screw up the whole system.
So you cannot allow any part of the data to be missing.
Like just sampling directly doesn't work.
So what you have to do instead is you have to first encode the data.
And you encode it in such a way.
And that's this is called.
read Solomon codes, you encoded in such a way that any fixed fraction, for example, you can pick 50%.
If any 50% of the data are available, then you can reconstruct the whole from that.
So you encode it in that way.
And now the scaling becomes different, because now you don't have to ensure that all the data is available. You don't have to know that all the samples are available,
but you have to know that 50% are available. And that's a task that you can do statistically,
because if you download 30 samples, well, I mean, the correct way of saying it is like this: if someone is trying to hide the data, this is the attack we're trying to defend against, right?
Someone is trying to hide the data somehow. If they do that, they have to make less than 50% of the samples available. If they make less than 50% of the samples available and you download, for example, 30, then the probability that all of those will be available is two to the minus 30, which is about one in a billion. And so it's really small. And by downloading 10 more, you decrease it by roughly another factor of a thousand. So that is a scalable way of ensuring data availability. And that's the principle of how it works.
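To make that sampling math concrete, here is a minimal Python sketch of the back-of-the-envelope calculation Dankrad describes. It is purely illustrative: the 50% fraction, the sample counts, and the function name are assumptions for the example, not client code or spec parameters.

```python
# Illustrative sketch of the data availability sampling math (not client code).
#
# With a 2x Reed-Solomon extension, any 50% of the extended pieces are enough
# to reconstruct the original data. So an attacker who wants the data to stay
# unrecoverable can publish strictly less than half of the pieces. A client
# asking for k random pieces is only fooled if every request happens to hit a
# published piece, which happens with probability below (1/2)**k.

def fooled_probability(num_samples: int, available_fraction: float = 0.5) -> float:
    """Chance that every one of `num_samples` random queries gets answered."""
    return available_fraction ** num_samples

for k in (10, 20, 30, 40):
    print(f"{k} samples -> fooled with probability < {fooled_probability(k):.2e}")
# 30 samples already gives < 2**-30, roughly one in a billion; each extra sample
# halves it again (about 10 more samples per additional factor of a thousand).
```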
Aave is the leading decentralized liquidity protocol.
And now Aave V3 is here.
Aave V3 has powerful new features to enable you to get the most out of DeFi, including isolation mode, which allows for many more markets to be launched with more exotic collateral types.
And also efficiency mode, which allows for higher loan-to-value ratios.
And of course, portals, allowing users to port their Aave position across all of the networks that Aave operates on, like Polygon, Fantom, Avalanche, Arbitrum, Optimism, and Harmony.
The beautiful thing about Aave is that it's completely open source, decentralized, and governed by its community, enabling a truly bankless future for us all.
To get your first crypto collateralized loan, get started at aave.com.
That's A-A-V-E.com.
And also check out the Aave protocol governance forums to see what more than 100,000 DAO members are all talking about at governance.aave.com.
Living a bankless life requires taking control over your own private keys.
And that's why so many in the bankless nation already have their ledger hardware wallet.
And brand new to the Ledger lineup of hardware wallets is the Ledger Nano S Plus, a huge upgrade to the world's most popular hardware wallet.
With more memory and a larger screen, the Nano S Plus makes it easy to navigate and verify your transactions.
And the paired Ledger Live desktop app gets you increased transparency as to what is about to happen with your NFTs.
What you see is what you sign.
The Nano S Plus gives you the smoothest possible user experience while you're doing all of your crypto things.
So go to the Ledger website to check out the features of the new Ledger Nano S Plus and join the waitlist to get yours.
And don't forget about the Crypto Life card, also powered by Ledger.
The CL card is a crypto debit card that hooks right into the Ledger Live app,
right next to all the DeFi apps and services that you're already used to using,
like swapping tokens and staking.
So if you don't have a Ledger hardware wallet, go to Ledger.com,
grab a ledger and take control over your crypto.
Arbitrum is an Ethereum layer two scaling solution that's going to completely change how we use DeFi and NFTs.
Over 300 projects have already deployed to Arbitrum, and the DeFi and NFT ecosystems are growing rapidly.
Some of the coolest and newest NFT collections have chosen Arbitrum as their home, all the while DeFi platforms continue to see increased usage and liquidity.
Using Arbitrum has never been easier, especially with the ability to deposit directly into Arbitrum through all the exchanges, including Binance, FTX, Huobi, and Crypto.com.
Once inside, you'll notice Arbitrum increases Ethereum speed by orders of magnitude for a fraction of the cost of the average gas fee.
If you're a developer who wants low gas fees and instant transactions for your users, visit arbitrum.io slash developer to start building your dapp on Arbitrum.
If you're a degen, many of your favorite apps on Ethereum are already on Arbitrum, with many moving over every day. Go to bridge.arbitrum.io now to start bridging over your ETH and other tokens in order to experience DeFi and NFTs in the way it was always meant to be.
Fast, cheap, secure, and friction-free.
Right, right. And basically, building this entire system is why shipping danksharding is going to take a while. Proto, can you walk us through, like, in the meantime, in the 4844 world, how do we sidestep that? Why can we get away with not having all this already live?
So for the background here: first of all, with the merge, we separate Ethereum into a consensus layer and execution layer. We are not throwing more data at the execution layer, but rather we continue to scale the consensus layer.
And even then, we are only doing so with a limited amount of data.
So we're talking about a month or maybe three months,
some amount of data that is retained.
After the period, we start to prune the data.
So we can ensure that this is available for layer twos,
for a sufficient amount of time for them to secure their network.
But then at the same time,
it doesn't grow infinitely, like it doesn't grow indefinitely, where now you have a bounded amount of data to store on the consensus nodes, and they can distribute this between the different beacon nodes.
Right.
And I guess just to clarify this for the listeners, the amount of data that we make available in proto-danksharding is less than the amount that we make available in full danksharding, correct?
Right. So with full danksharding, we distribute the job of storing and propagating the data between all the nodes on the network, between the different validators, whereas with EIP-4844, we still require all of the consensus nodes to acquire all of the blob data, but we limit this.
We don't make it grow indefinitely, so we can increase the throughput.
Got it.
And can you give me an estimate, like, you know, how much do we lower the cost of storing data for layer twos with 4844?
And then how much do we lower it further with a full sharding deployment?
Roughly.
Sure.
So current Ethereum blocks are anywhere between like 50, maybe tops like 100 kilobytes.
It's very variable.
Worst case, it could grow a lot larger.
But you are paying for call data, data that is going through the EVM and that's available forever.
This is a very different type of data than what a rollup really needs.
So instead, we can try and optimize.
You can have this different type of data called blob compared to call data.
And we can grow it from this order of magnitude from like 50 kilobytes to maybe like a
megabyte per block.
And this is obviously already a huge increase that rollups could benefit from, with a reduced cost of it.
And with full danksharding, it can go another order of magnitude larger, because now we don't have to store all the data on one node, we can distribute it across 64 nodes.
So we could have a multiplier here in how we pull the data apart.
Got it.
So it's like we get an order of magnitude increase
with just 4844 in terms of how much data we can have
and then we get another order of magnitude
with full danksharding.
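As a rough sanity check on those orders of magnitude, here is a small back-of-the-envelope calculation. The numbers are only the approximate figures mentioned in the conversation (roughly 100 KB today, roughly 1 MB of blobs with proto-danksharding, another ~10x with full danksharding, a 30-day retention window), not spec values:

```python
# Rough throughput arithmetic from the figures discussed above (illustrative only).
SLOT_SECONDS = 12  # roughly one block per 12-second slot after the merge

scenarios = {
    "today (~100 KB calldata-heavy block)": 100 * 1024,
    "proto-danksharding (~1 MB of blobs)": 1 * 1024 ** 2,
    "full danksharding (~another 10x)": 10 * 1024 ** 2,
}

for label, bytes_per_block in scenarios.items():
    kb_per_second = bytes_per_block / SLOT_SECONDS / 1024
    gib_per_window = bytes_per_block * (30 * 24 * 3600 / SLOT_SECONDS) / 1024 ** 3
    print(f"{label}: ~{kb_per_second:.0f} KB/s, "
          f"~{gib_per_window:.0f} GiB over a 30-day retention window")
```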
And then one thing,
that's been interesting to me to learn as I've been spending time on this is the idea that like
the demand for storing data in these blobs or in the full sharding system is like independent
or at least decorrelated from the demand to use Ethereum gas, right? Like there might be people
who are willing to pay a lot to execute computation and people who are willing to pay a lot to store
data, but they don't necessarily overlap. And it's like it creates two different markets.
So what if you kind of walk us through, like, how we're like designing these two different markets and isolating them from each other to an extent?
Well, so it starts with the transaction type, where you add this additional fee parameter.
But with this fee, you create a different market.
And so if you really want to, you could separate the transaction pool, and the capacity and the type of resource is very different.
I think Vitalik already wrote a post about a multidimensional EIP-1559, where we can try and think of all the different resources in Ethereum as different markets.
And I believe Dankrad already has a post on how EIP-1559 could work for this type of blob data instead of regular gas.
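For a sense of what a second, independent fee market could look like, here is a hedged sketch of an EIP-1559-style update rule applied to blob data. All names and constants below are illustrative placeholders, not the actual EIP-4844 parameters or update formula:

```python
# Hedged sketch of a *separate* EIP-1559-style fee market for blob data.
# Parameter names and values are illustrative, not the EIP-4844 spec.

TARGET_BLOBS_PER_BLOCK = 8      # hypothetical target
MAX_BLOBS_PER_BLOCK = 16        # hypothetical cap
MIN_BLOB_FEE = 1                # hypothetical floor, in wei
ADJUSTMENT_QUOTIENT = 8         # max ~12.5% move per block, as in EIP-1559

def next_blob_base_fee(current_fee: int, blobs_in_parent: int) -> int:
    """Move the blob base fee toward equilibrium: up when the parent block used
    more than the target number of blobs, down when it used fewer."""
    delta = blobs_in_parent - TARGET_BLOBS_PER_BLOCK
    change = current_fee * abs(delta) // (TARGET_BLOBS_PER_BLOCK * ADJUSTMENT_QUOTIENT)
    if delta > 0:
        return current_fee + max(change, 1)
    if delta < 0:
        return max(current_fee - change, MIN_BLOB_FEE)
    return current_fee

# The execution-gas base fee keeps its own independent update rule, so heavy
# demand for blob space does not by itself make ordinary transactions pricier.
print(next_blob_base_fee(100, 16))  # blobs over target -> fee rises
print(next_blob_base_fee(100, 4))   # blobs under target -> fee falls
```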
Right. So it's like there's two auctions happening in parallel. One is people bidding for, like, transaction computation, and the other one is people bidding for storage. And we can use kind of the same mechanism which we have already for gas and call data, which is like weirdly bundled, to then separate it and have one 1559 that works for gas and one 1559 that works for shard data.
Another thing we haven't touched on a lot, but Vitalik, I think you mentioned it earlier on: the sharding design requires the introduction of KZG
commitments, and they're
kind of like a hash, but not really.
I know Dankrad, you had like
a great post about them. Do you want to explain
again, and sort of to non-cryptographers
what these are and, you know, how they
resemble the cryptography that's
currently in Ethereum and differ
from it? Yeah. Yeah,
I mean, so I mentioned earlier
at the end of explaining sampling
how the data has to be encoded in this way that we call Reed-Solomon code,
which is a way to ensure that any 50% of the data can be used to reconstruct the whole data.
So, I mean, I guess data is a bit misleading here because there's like the original data,
which is the actual payload that we're talking about.
But then the code expands this data, so it becomes twice as large in the process as well.
And so what is a Reed-Solomon code?
So a Reed-Solomon code is, well, so yeah, we call it, it's a polynomial.
So basically what it means is, you've learned about polynomials, maybe in mathematics classes.
It's a certain type of function.
And basically this type of function has the property that when you know it at a certain number of points,
which we call the degree of the polynomial,
then you know the whole polynomial.
So basically we use that property in order to put this polynomial function
through the data.
And then if you have like a certain number of samples,
which is like half the amount of the full encoding,
then you can get all of them.
And the reason we need KZG commitments is this.
Like when you just sample the data,
there's one thing that you can't decide from those samples.
And that is whether the encoding is correct.
What if someone just, like... Reed-Solomon codes have a certain structure, right?
They have a certain structure that allows us
to reconstruct the whole thing.
But what if someone encoded it in a different way?
What if they just put
garbage in it, then every 50% of the samples would give you different data. And of course,
that's not acceptable because the data has to be unique. It has to be that all 50% of the
samples give you exactly the same data. And the way we do that, I mean, there are different
approaches, but basically over the years we have ended up here where we just found this
amazing type of commitment, a Kate or KZG commitment, that you can basically see as a hash. It's similar to a hash of data, but with the property that instead of hashing just data, it hashes a polynomial. So it's a way to hash a polynomial function and basically reveal any point on it. And so that guarantees the correctness of the encoding. And that's why we need these KZG commitments.
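To make the "hash of a polynomial" picture a bit more tangible, here is a deliberately toy sketch of the commitment structure. It is not real KZG (real KZG works over elliptic curve points and uses pairings to verify point openings); the modulus, generator, secret, and polynomial below are made-up illustration values:

```python
# Toy illustration of the KZG commitment *shape* (NOT real KZG, and not secure):
# a commitment is a single group element g^{P(s)} that the committer computes
# from public "powers of a secret" g, g^s, g^{s^2}, ... without ever learning s.

P = 2**127 - 1             # toy prime modulus (a Mersenne prime, for convenience)
g = 3                      # toy generator
order = P - 1              # exponents live modulo the group order

# --- trusted setup (run once; the secret s must then be destroyed) ---
s = 123456789              # the "toxic waste"
powers = [pow(g, pow(s, i, order), P) for i in range(8)]   # g^(s^i)

# --- committing to a polynomial c0 + c1*x + ... using only `powers` ---
coeffs = [5, 0, 7, 11]     # an example polynomial
commitment = 1
for c, g_si in zip(coeffs, powers):
    commitment = commitment * pow(g_si, c, P) % P

# Sanity check, only possible here because this toy still "remembers" s:
def eval_poly(cs, x, m):
    return sum(c * pow(x, i, m) for i, c in enumerate(cs)) % m

assert commitment == pow(g, eval_poly(coeffs, s, order), P)
print(hex(commitment))
```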
Got it. And as I was reading about KZG commitments, one of the first things you stumble on as a non-cryptographer is that they require a trusted setup.
And, you know, I'd be curious for the viewers to walk through, like, what actually, what are
the trust assumptions in a trusted setup, you know, like, what are the things that, like,
we are trusting in that setup?
And as we make one to kind of enable this on Ethereum, are there things that, like, end users can do so that they can have kind of a higher assurance that the setup was performed correctly, and, like, minimize the trust assumptions they are making individually?
Right.
Yes.
Yeah.
So basically, the trusted setup: what we have to do is we have to generate these elliptic curve points that have a certain relation.
That's like one of the fundamental inputs of the KZG commitment scheme, and the trusted setup is basically a way to do that. And okay, in addition to that property that they have a certain relation, nobody is allowed to know the actual relation between them. So this has to be a secret. And that's why it has to be this trusted setup. And it's called a trusted setup because one of the ways to do a trusted setup is just to say, hey, we all trust Tim. Tim, you do it, and you give us the output. And then it's done and you throw it away and everything will be fine.
But the problem is, of course, that's not really sufficient for the Ethereum community.
People would be like, well, what if?
So instead, we have this way of distributing this trust and saying, like,
we let many, many people participate in this trusted setup.
And we can design trusted setups in a way so that it's enough if even a single one of these people that participate in it did it according to the protocol.
And the protocol means that you execute this whole thing,
you run this program, send your output,
and then you destroy your data.
Like you destroy the secret that you used to do it.
You don't keep it.
And if even a single person out of the potentially thousands
that are going to participate,
did this properly, then the setup is completely safe.
So even like, let's say 1,000 people do this,
999 colluded and they all kept their secret and they come together and try to reconstruct it,
but one person did it properly and they don't have it.
Even in this case, these 999 people know absolutely nothing that helps them to break it.
So that's the security guarantee, which we call like n minus one.
So even if n minus one collude, they can't get anything on that last person that participated properly.
And yes, I mean, obviously this has a nice property that if you are really, really worried about this, and you're like, oh my God, how can I trust these people, then you can just participate.
So like one obvious way is if you participated and you know you did things properly, then you don't need to trust anyone because you're part of it.
And yeah, it's done.
So like you can consider it secure.
Obviously there are also all the parts of, like, making sure all the software works properly and so on, that we're all very familiar with in our blockchain systems; all of this we're relying on for a functional Ethereum as well.
But obviously it all has to be very well audited. And we need several implementations of this as well. But yeah, so that's, I guess, the other trust assumption: we have to make sure that all the software is safe and works properly.
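A minimal sketch of the one-honest-participant idea behind this kind of ceremony, reusing the same toy group as the commitment sketch above. Real ceremonies work over elliptic curve points and publish proofs that each contribution was well-formed; everything here, including the participant count, is illustrative:

```python
# Toy powers-of-tau style ceremony (illustrative only, not secure, not a real spec).
# The transcript is the list [g^(s^0), g^(s^1), ...] for the current combined
# secret s. Each participant mixes in their own secret r, turning s into s*r,
# and then deletes r. As long as at least ONE participant really deletes theirs,
# nobody can reconstruct the final combined secret.

import secrets

P = 2**127 - 1       # toy prime modulus
g = 3                # toy generator
order = P - 1
NUM_POWERS = 8

def contribute(powers):
    r = secrets.randbelow(order - 2) + 2                        # participant's secret
    updated = [pow(p_i, pow(r, i, order), P) for i, p_i in enumerate(powers)]
    # r goes out of scope here and is forgotten -- that's the whole point.
    return updated

transcript = [g] * NUM_POWERS        # start from the trivial secret s = 1
for _ in range(100):                 # say, a hundred participants in sequence
    transcript = contribute(transcript)
```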
Right. So there are kind of different levels. It's like either, you know, you don't care at all and you just trust that somebody somewhere has been honest. And maybe it's not that you don't care, but you learn about
Ethereum 10 years from now, which is basically what you have to do because you can't participate.
If you're part of the Ethereum community today, there's going to be an opportunity for people
to participate individually.
So then as long as you're confident that like your participation was correct, then the whole
output should be correct.
And then if you're even kind of more concerned, there's a specification from which we can write
different implementations.
So I assume, you know, if you didn't trust the existing ones, you could write your own and also
produce an output, or at the very least kind of review the code of the different ones and make sure that they match up, and yeah, you would kind of get a high level of certainty that
things are correct. Okay, and I guess we're coming up on time here. The last thing I did want to
talk about is like, how do we actually get this deployed, which is the part I'm usually getting
involved in. I know Proto, you started prototyping 4844 along with some other folks.
Can you give just kind of a quick summary of like what was done so far in terms of prototyping
and what do you think are like the next steps that the community can expect?
Right. So earlier this year it started with an initial write-up of what it could be like.
Then during the hackathon in Denver, it transformed into this software where we have an actual implementation of the proposal.
And over time, we have been improving that and testing that.
And what we need to go and do from here is there are two different branches, right,
where we want to further develop the client software to be able to make a test network.
And we want to continue the development on this trusted setup so that we do have the cryptography,
everything there on that side all set as well for when we do want to deploy this.
and then once we have both ready, you can make larger and larger testnets and then eventually include it as an EIP via the All Core Devs process into Ethereum mainnet.
Got it, got it.
So we have some initial prototypes.
We want to productionize them, like, grow them, make them more robust, make sure the trusted setup is working
according to plan.
And then once we have that, it kind of becomes a normal EIP.
We need to shepherd through the process.
And then one thing that's worth knowing about 4844 is what's very neat about it is from the execution layer point of view.
So like the kind of smart contract and end user transaction generation point of view, sharding is basically done then.
Like there are no more changes needed for people to interact with this blob data.
What will happen is, then we need to deploy kind of this entire danksharding infrastructure on the consensus layer.
But from applications point of view,
that kind of just happens in the background.
But at the consensus layer, Dankrad, like, what are the steps to get this deployed?
Like, how many hard forks do we need to get there?
You talked a lot about proposer builder separation before.
So, like, what do you see as the logical set of stepping stones to get us to full sharding on the consensus layer?
Yeah, I mean, I hope it's two.
But I don't know yet.
I think, like, so clearly, I mean, this is the reason why we chose this stepping stone of proto-danksharding, that it's something that gets us substantially closer to the full implementation.
So it will become simpler.
Like, the interface, for example, will stay the same.
The execution layer changes will be minimal once that's implemented.
So that's why it's really nice if we can get that done in the Shanghai hard fork.
And then, I mean, I don't think it's very likely that it will be the very next hard fork after that.
But hopefully, relatively soon, we will get, yeah, we'll get this.
I, yeah, I don't know.
There are currently still, like, some things we definitely need to work on.
Like, there's a lot of work on the networking side to be done to have, yeah, full sharding rolled out.
I currently have no exact estimate, but I would hope that that can be done in one hard fork.
Okay.
I sympathize with not being able to give direct estimates about complex projects.
So I won't push you on there.
And maybe, yeah, to kind of close this off, like, Vitalik, if people want to contribute
from a research or, like, engineering point of view, what are, like, the big open questions
in sharding land that they should spend their brain cycles on?
I think one problem is definitely like figuring out the networking of data availability sampling.
Like there are designs that we have that work in theory.
Like there's doing it based on subnets.
There is the approach of trying to make a DHT much faster.
There's a couple of other techniques in the middle.
But, like, really taking those ideas and, from an engineering perspective, just trying to optimize it really hard.
Like, how do you actually make, basically, a specialized scalable DHT where publishing and downloading can happen extremely quickly?
So that's one problem.
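To put a shape on the sampling loop being described, here is a minimal, hedged sketch of the client side of data availability sampling: draw random cells of the erasure-coded data square, fetch each from the network, and accept the data only if every sample arrives. The matrix dimensions, the sample count, and the fetch_sample helper are hypothetical stand-ins for whatever the subnet or DHT layer ends up providing.

```python
import random
from typing import Optional

ROWS, COLS = 512, 512   # illustrative dimensions of the extended (erasure-coded) data square
SAMPLES = 30            # each clean sample roughly halves the chance of being fooled

def fetch_sample(row: int, col: int) -> Optional[bytes]:
    """Hypothetical network call: fetch one cell (plus its KZG proof) from
    whatever subnet/DHT layer ends up carrying the samples."""
    raise NotImplementedError

def data_seems_available(rng: Optional[random.Random] = None) -> bool:
    rng = rng or random.Random()
    for _ in range(SAMPLES):
        row, col = rng.randrange(ROWS), rng.randrange(COLS)
        if fetch_sample(row, col) is None:
            return False   # a missing sample means we refuse to treat the data as available
    # With a 2x extension, if less than half the data were published each sample would
    # fail with probability >= 1/2, so 30 clean samples give roughly 1 - 2^-30 confidence.
    return True
```

The hard engineering problem Vitalik is pointing at is not this loop but the network underneath it: making publishing and retrieval of these cells fast and reliable at scale.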
And I think in general, like, in the Ethereum ecosystem, the networking side is one of the sides that's talked about
the least. I mean, possibly just, I think, a bit of an accident of history that the
Ethereum core research community just happens to, like, never really have people who,
like, spent a lot of time thinking about networking stuff. It's generally, like, people have
spent a lot of time thinking about, you know, cryptography and incentives and economics.
But it's still a really important problem. And that's a problem where I think it would be
amazing if we can have more, like, very active networking expertise in Ethereum. So that's a
short-term, or that's, like, a very clear problem. Another problem is that with full danksharding,
there's this issue of, like, how to combine it well with proposer builder separation. And there,
there are some, like, economic challenges; there's still the challenge of, like, well, how do you actually
make a good proposer builder separation protocol? How do you add, like, censorship resistance lists
so you can bypass censoring builders? And once again there, like we have ideas on
each of those things, but there's still the question of digging into the details, combining a PBS design with the design of Ethereum's future proof of stake, which is something that at some point we'll probably start having to kind of talk and think more about, right?
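As a hedged sketch of the censorship-resistance-list idea mentioned above, under one proposed design the proposer publishes a list of transactions it has seen, and a builder's block only counts as valid if it includes every listed transaction that could still have fit. The names and structures below are illustrative assumptions, not any actual spec.

```python
from dataclasses import dataclass
from typing import List, Set

@dataclass
class Tx:
    tx_hash: str
    gas: int

@dataclass
class Block:
    txs: List[Tx]
    gas_limit: int

    def gas_used(self) -> int:
        return sum(t.gas for t in self.txs)

def satisfies_crlist(block: Block, crlist: List[Tx]) -> bool:
    """A block may omit a listed transaction only if it genuinely had no room for it."""
    included: Set[str] = {t.tx_hash for t in block.txs}
    for tx in crlist:
        if tx.tx_hash in included:
            continue
        if block.gas_used() + tx.gas <= block.gas_limit:
            return False   # the builder skipped a tx it could have included
    return True
```

The open research questions are exactly the ones named here: how to price this, how it interacts with builders' incentives, and how it composes with the rest of the proof-of-stake design.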
Like there's been this increasing effort within the Ethereum research and the protocol community of basically thinking through what would a better proof of stake design look like in the long term.
Like, do we want to have single slot finality?
How do we achieve single slot finality?
What other benefits can we achieve?
How can we offer more in-protocol of what Lido offers to people extra-protocol, to try to reduce, you know, staking pool centralization incentives?
And just generally increasing simplicity.
So I've written a bunch on that.
Like if you just Google for single slot finality, you can probably find it.
But, yeah, that, and the intersection of that and proposer builder separation, and the intersection of that and sharding, is going to be another research area going forward.
And then the other one is also adjacent to danksharding stuff,
and also adjacent to EIP-4444 stuff, which is very critical to proto-danksharding and danksharding
actually being viable, and just generally Ethereum scaling well. It's, like, creating systems that are
as decentralized as possible and as robust as possible, to give people the same guarantees
that they've come to expect out of Ethereum in terms of history retention, but without requiring
participants in the core Ethereum consensus protocol to all be retaining blocks forever.
Right? So there's a team working on Portal. There's things like The Graph. Like, there's this big
long list of projects, right? And trying to figure out how to make those better, or even create
better alternatives to those, is I think also an important area.
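To sketch the pattern being described here, and only as an assumed shape rather than how Portal or The Graph actually work: once core nodes stop retaining very old blocks (the EIP-4444 direction), a client would serve recent blocks locally and fall back to an out-of-protocol history network for the rest, verifying what it gets against known header hashes. The provider interface below is hypothetical.

```python
from typing import Optional, Protocol

class HistoryProvider(Protocol):
    def get_block(self, number: int) -> Optional[bytes]: ...

def fetch_block(number: int,
                local_store: HistoryProvider,
                history_network: HistoryProvider) -> Optional[bytes]:
    # Recent blocks are still held locally; older ones come from an external
    # history network and, in practice, would be checked against known header hashes.
    block = local_store.get_block(number)
    if block is not None:
        return block
    return history_network.get_block(number)
```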
Oh, also one last one I wanted to throw in. This is important. The switch to layer two is, I think, one where
it's very important for the ecosystem to try to maintain and even improve its
decentralization going through that switch.
And so, you know, we need light clients that can poke into Optimism and poke into
Arbitrum and poke into StarkNet.
And that's something that there's been a bit of like theoretical thinking work on.
But it's the sort of thing that I think there is a lot of room for people to slot themselves
in and like really try to improve that ecosystem a lot, right?
Like, basically, we should really think through: if everyone's really going to migrate over
to L2 over the next 12 to 24 months, especially as proto-danksharding comes alive and the
rollup costs go down even more, how do we really make sure that transition goes well, and
that it preserves all of the decentralization properties, and even improves on those properties,
that we've come to expect of Ethereum? Yeah, that's a really good list.
We have, like, five minutes left, so I wanted to leave a bit of time for the three of you.
If there's anything you wanted to share that you think is important about sharding or 4844,
or Ethereum generally, that, like, we haven't talked about, the floor is yours, whatever you want
to rant about or get people to pay attention to. I mean, the last week has been kind of
chaotic to say the least I think this is like the type of market where
If you feel a little bit down, like, maybe read a post, try and get involved with nascent projects.
It's not the bear market, it's the builder market.
Read the specs for the EIP.
There is the site called EIP4844.com.
That'll get you started.
And then there are simple diagrams, all the way down to, like, very elaborate posts about the
cryptography involved.
And yeah, just reach out and get building.
Stay optimistic, by the way.
I love it.
Thank you, Vitalik.
Anything else you want to share?
Stay optimistic, but in the long term,
hopefully, stay zero knowledge.
Wow.
Anything from your end, Dankrad?
Yeah, I mean, like, I mean, I think,
I think the week has shown that you need solid designs
and that we need to build things that can actually last
and that are built to last and won't go away.
And that's what we're trying to do here.
And yeah, so I'm optimistic on this,
but yeah, I am clearly also pessimistic
about many other things that are happening in the ecosystem.
So we just have to be better and build better.
Love it.
Yeah, then that's basically a wrap.
The bankless guys did ask me to end with a disclaimer.
So here it goes.
And I'm reading from the screen now, risk and disclaimers.
Crypto is risky.
You could lose what you put in, but we're headed west.
This is the frontier.
It's not for everyone, but we're glad you're with us on the bankless journey.
Thanks a lot.
And yeah, thanks a ton, Proto, Vitalik, Dankrad,
for coming at a bunch of different hours across your respective time zones.
I think this has been really helpful to explain the entire sharding roadmap to people.
