Epicenter - Learn about Crypto, Blockchain, Ethereum, Bitcoin and Distributed Technologies - 'Ethereum Needs Polygon's Aggregation Layer to Scale' - Brendan Farmer & Sandeep Nailwal

Episode Date: June 14, 2024

As more and more L2s launch promising to scale Ethereum, they end up competing for the same market share, user base and liquidity. Apart from this harsh reality, cross-chain interactions should be as seamless as possible in order to bridge different L2 ecosystems. Polygon envisions a scalable future in which various zero-knowledge rollups post their proofs on an aggregation layer before settling on Ethereum, thus lowering latency and transaction costs as cross-chain interactions take place expeditiously, without involving the L1 mainnet. We were joined by Sandeep Nailwal & Brendan Farmer to discuss Polygon's aggregation layer and how it aims to solve the current fragmentation of Ethereum L2 scaling solutions.

Topics covered in this episode:
- Polygon's ZK expansion and the acquisition of Mir Protocol
- Polygon's aggregation layer
- Block building between different L2s
- Shared sequencing & asynchronous sequencing
- Security guarantees of the aggregation layer
- Sequencer decentralisation & censorship resistance
- Chains using Polygon's aggregation layer
- Pessimistic proofs
- Can optimistic rollups be included in the aggregation layer?
- Type 1 prover and Plonky3
- The evolution of ZKP systems
- Ensuring ZK rollup integrity
- The future of scalability

Episode links:
- Sandeep Nailwal on Twitter
- Brendan Farmer on Twitter
- Polygon on Twitter

Sponsors:
- Gnosis: Gnosis builds decentralized infrastructure for the Ethereum ecosystem, since 2015. This year marks the launch of Gnosis Pay, the world's first Decentralized Payment Network. Get started today at gnosis.io
- Chorus One: Chorus One is one of the largest node operators worldwide, supporting more than 100,000 delegators across 45 networks. The recently launched OPUS allows staking up to 8,000 ETH in a single transaction. Enjoy the highest yields and institutional-grade security at chorus.one

This episode is hosted by Friederike Ernst.

Transcript
Starting point is 00:00:00 If Web3 needs to scale to the scale of the internet, the only way it scales is through ZK, and we have all our bets on ZK. The biggest problem facing Ethereum and facing Ethereum's ability to scale is fragmentation of L2s. Because right now we have this proliferation of L2s, but liquidity is not shared between those L2s. State is not easily shared between those L2s. Users have a difficult time moving between those L2s. What Polygon is building is basically a guarantee of safety for a shared bridge, which allows different L2s to have asset fungibility between their chains. L2s are submitting blocks to the AggLayer, and the AggLayer is proving to Ethereum that all of
Starting point is 00:00:44 these properties hold. This episode is brought to you by Gnosis. Gnosis builds decentralized infrastructure for the Ethereum ecosystem. With a rich history dating back to 2015 and products like Safe, CoW Swap or Gnosis Chain, Gnosis combines needs-driven development with deep technical expertise. This year marks the launch of Gnosis Pay, the world's first decentralized payment network. With the Gnosis Card, you can spend self-custody crypto at any Visa-accepting merchant around the world. If you're an individual looking to live more on-chain or a business looking to white-label the stack, visit gnosispay.com. There are lots of ways you can join the Gnosis journey. Drop in the GnosisDAO governance forum,
Starting point is 00:01:46 become a NOSIS validator with a single GNO token and low-cost hardware, or deploy your product on the EVM-compatible and highly decentralized NOSIS chain. Get started today at NOSIS.io. Course 1 is one of the biggest node operators globally and help you stake your tokens on 45 plus networks like Ethereum, Cosmos, Celestia and DYDX. More than 100,000 delegators stake with Chorus 1, including institutions like BitGo and Ledger. Staking with Chorus 1 not only gets you the highest years,
Starting point is 00:02:23 but also the most robust security practices and infrastructure that are usually exclusive for institutions. You can stake directly to Chorus 1's public note from your wallet, set up a white table note, or use the recently launched product, Opus, to stake up to 8,000 eth in a single transaction. You can even offer high-year-staking to your own customers using their API. Your assets always remain in your custody, so you can have complete peace of mind. Startsaking today at chorus.1.
Starting point is 00:02:55 Welcome to Epicenter, the show which talks about the technologies, projects, and people driving decentralization and the blockchain revolution. I'm Frederica Ernst, and today I'm speaking with Brandon Farmer and Sandeep Naewal, who are the co-founders of Polygon. Sandeep and Brendan, thank you so much for coming on. Sandeep, you have been on multiple times before. So let's start with Brendan today. Brendan, you are a co-founder of Polygon
Starting point is 00:03:22 by way of having founded Mur Protocol, which was then acquired by Polygon. Tell us about Mur. Yeah, so Mir was a, it was an L-1 that was trying to use ZK for privacy and scalability. And so we started mirror in 2019, and by 2021, really realized that we did not want to build an L1 and wanted to build in the Ethereum ecosystem.
Starting point is 00:03:47 And Sandeep and Mahalo and the Polygon founders were gracious enough to offer us a platform to do that. Yeah, super nice. So Sandeep, during that time frame, you guys acquired quite a few ZK teams. Maybe let's talk about that for a while. I mean, the other very well-known one is a mess, but I think there were a couple of other ones as well, right? Yeah, like there was like Hermes and, you know, the like mayor protocol. And then other was not an acquisition more of a aqua hire,
Starting point is 00:04:27 like was the Facebook's Winterfell project, which is like, you know, which became Polygon-Miden. even now we recently finished the acquisition of Toposware who is also one of the
Starting point is 00:04:43 you know one of the leading companies who are working on this aggregation thesis and you know we have been working very closely on our code base and the fun part is like
Starting point is 00:04:55 now Polygon ZK is like so deep into the into the whole ecosystem whether it's like a succority S.P1 building on Plonky 3 or like, you know, different, different protocols using, either directly using the open source code or get, you know, their architectures fully, you know, inspired by it. So we are already doing a lot of stuff. So Toposphere was also like one of the
Starting point is 00:05:20 teams who was working on our code base and thinking about like a more of an aggregated, you know, future the way we were thinking. And it was like, it made a total sense to, you know, bring them in, especially related to the type 1 prover that we have and things like that. So yeah, that has been like a long journey because we are absolutely clear that, you know, if Web 3 needs to scale to the scale of internet, the only way it scales is through ZK
Starting point is 00:05:47 and we have all our bets on ZK. Yeah, and I think those were very good beds to place. You're kind of bringing a lot of the research you have done over the years together in this thing called aggregation layer that you launched earlier this year. Can you set the stage here? So what aggregation layer? What problems are you looking to solve with it?
Starting point is 00:06:12 Yeah, sure. I can take that. So the way that we see the world is that the biggest problem facing Ethereum and facing Ethereum's ability to scale is fragmentation of L2s. Because right now we have this proliferation of L2s, but liquidity is not shared between those L2s. State is not easily shared between those L2s. Like users have a difficult time moving between those L2s.
Starting point is 00:06:37 And so we are scaling Ethereum by adding additional block space, but we aren't scaling Ethereum in the sense that we're not scaling access to liquidity into shared state. Like if you think about like the value that's bridged to arbitram, that doesn't help optimism. It doesn't help other L2s. So it doesn't help the Ethereum 01. And so right now we're in this mode where everyone is basically trying to build their own
Starting point is 00:07:04 like mini copy of Ethereum, where there's sort of liquidity and bridge value, defy activity. But we're not able to contribute to Ethereum's network effects globally and to contribute to like the overall way that Ethereum functions for users and developers. And so the Ag layer is an attempt to fix fragmentation on Ethereum. It's an attempt to make it really easy for users to move state and value and liquidity between L2 chains so that we can have a horizontally scalable ecosystem for an execution layer for a 3A, but one in which we're able to scale access to liquidity and to show it state. Basically like a multi-chain ecosystem that feels like you're still using a single chain,
Starting point is 00:07:50 where we have composability and all these nice properties that we like on L1. Okay, so basically, if I kind of take one step back, what you're saying is say I am a user on base and I want to do something on Arbitrum. The way that I kind of get from base to Arbitrum, despite the fact that they are both layer twos on this shared security layer of Ethereum is by kind of exiting to Ethereum at the cost of kind of doing a transaction on Mainnet, which is notoriously expensive. and then kind of moving into the other L2, and there's no way for me to kind of laterally move from base to Arbitrum, and that's what we're fixing. Is that correct? Yeah, exactly.
Starting point is 00:08:37 So exactly like you said, like suppose that I have some EF on base, and there's like an NFT mint on Arbitrum, and I want to be able to claim an NFT from that mint. There are two things that I can do, right? So the first is I can withdraw my, i.e. through the native bridge on base and then submitted L1 transaction and deposit into arbitram. Obviously, the problem is that base is an optimistic roll-up, and it takes seven days
Starting point is 00:09:05 for me to withdraw via the native bridge, and then I have to wait for that transaction to be finalized. I have to wait for my deposit to arbitram to be finalized, and the L-1 transaction is expensive. Some of this gets a little better with ZK roll-ups because we have a much shorter window where users can withdraw funds. But fundamentally, like, it's not good enough. Like, we, we need a way for users to be able to take their, their native eth on some L2 and instantly bridge it over to arbitram and get native eth and be able to do things on that chain. The second option that users can do is they can withdraw via a third-party bridge. So there are third-party bridges that connect different L2 chains. But the problem here is like the trust assumption for using a third-party bridge
Starting point is 00:09:54 is much, much different. And there's a requirement for users to rely on liquidity providers and market makers to basically swap from the wrapped synthetic version of a token. So let's say I use wormhole to bridge from base to arbitrum. I get wormhole rap d on arbitrum. And I have to pay some amount of money to swap that into arbitrum native eat so that I can claim my netteamint. And so what we can think about the Aglar is doing is providing two things. So the first is asset fungibility. You should be able to take your assets and move them between chains without having to rely on a market maker or liquidity provider. And you should be able to do this safely at super, super low latency, even lower than like Ethereum block time or Ethereum finality. Okay. I understand the goal here. How do you make it work?
Starting point is 00:10:46 Sure. So this is a good question and stop me if I'm getting two in the weeds. But what Polygon is building is basically a guarantee of safety for a shared bridge, which allows different L2s to have asset fungibility between their chains. So this is what allows us to avoid making an L1 transaction when we move funds. We can just take L1 native eth and move it between chains and we never have to touch the L1. And the second thing is it allows us to provide this cryptographic guarantee of safety so that chains that are interoperating at lower latency than like even an Ethereum block time, but certainly the time that it takes to finalize an Ethereum block aren't at risk of some sort of malicious behavior like a chain can't equivocate or a shared sequencer can't, you know, lie about the messages that are sent between chains. And so the ag layer fundamentally is like this very, very minimal guarantee of safety for the shared bridge and for interoperability. And it provides a foundation for what we call emergent coordination infrastructure.
Starting point is 00:11:57 So the way that chains interoperate is up to them. They could use a shared sequencer or relay or, you know, a builder. And these mechanisms benefit from the shared bridge and the safety guarantees the ag layer provides. So like one case would be for a shared sequencer, if users wanted synchronous composability between chains, they could opt in to using the same shared sequencer. And that shared sequencer would allow the same block builder to simultaneously build blocks across chains. So a user could submit a transaction on 1L2 to move funds and claim the mint on another
Starting point is 00:12:40 chain and then swap back to the original chain. or access some key store that's located on a third chain. And the builder could do all of this simultaneously. And the Ag layer would guarantee that for all of these chains, the builder can't misbehave and asset fundability is guaranteed. And so that's sort of like the way that we can think about this work. But would that mean that the builder, kind of any builder kind of tool to build the next block would actually have to opt into all of these chains.
Starting point is 00:13:16 So kind of if I want to start transaction on base that kind of bridges to Arbucharum and then back in one transaction, the builder has to support both, right? Yeah, so this is, I think that we should emphasize that there are different roles that exist in this model that maybe don't exist in the same way for L1. And so in the shared sequencer case where we have synchronous composability.
Starting point is 00:13:49 And this isn't required for all chains. Chains could operate asynchronously too. But in the synchronous case, which is sort of the holy grail of composability, a builder would be executing transactions and producing blocks for all of these chains, which sounds bad, right? It sounds like, oh, like this can't scale. like our hardware requirements are going to blow up. But we have to distinguish between the actors that are building blocks that are super
Starting point is 00:14:19 sophisticated. They might be running in data centers and have access to a large amount of hardware. And the validators on L1 that can be fully, you know, have very minimal trust requirement or very minimal hardware requirements and can fully validate everything that happens at L2. I think it's up to chance. whether they're comfortable with this model in which we have super sophisticated builders that are able to operate a bunch of full nodes concurrently. But they still exist in this competitive marketplace, right? Like builders are limited in what they can do.
Starting point is 00:14:56 They're limited in how they can misbehave in the extent to which they can extract rent. And so I think we're sort of moving to this world where block building is just, a role and a function that might not be accessible to everyone who's running a node on their laptop or a Raspberry Pi. But I wonder to what extent it's like a necessary step to deliver very, very good UX, at least for synchronous composability. Yeah, I also want to add one thing on that. The question you asked from a very simple user's point of view, that if I am, let's say,
Starting point is 00:15:36 or in fact a chain point of view, let's say there is a transaction. where, you know, there is a cross-chain transaction between three different chains and you want to do it in one single transaction. And you can actually break it down into two categories. One, where you really want to have it under one single transaction. That means like synchronous, that we call synchronous composability across chains. And that would require what Brendan was saying,
Starting point is 00:16:03 that some level of shared sequencing or like a, you know, sequence of marketplace. where you people are, the chains are selling rights to, you know, sequence the blocks. And then for these cross-chain transactions, somebody who has the rights to all three chains, they, they create the block. But there is another form which is a very much, much simpler, which where Ag layer is more focused on. And then on top of that, people can build shared sequencing kind of mechanisms is the asynchronous composability, where, you know, if you want to do, if your user, let's say I come to a user, which is, let's say, cannot connect it to a chain, chain A, and I do a transaction, but that
Starting point is 00:16:44 transaction interacts with chain B and chain C. In an A-sync, Ag-Lair provides you the mechanism or the safety guarantees that that transaction will go through the chain B and chain C, and it will come back to, let's say, some action comes back to your chain, but it will go asynchronously. You cannot guarantee, the Ag-Layer doesn't guarantee you that the entire series of transaction will go through with the same set of conditions that the user initially started with but asynchronously there are like safety guarantees
Starting point is 00:17:15 that the chains are not playing a role in that. If the market condition, let's say it's a, you know, a text transaction or something changes the user transaction might or might not fail. But if the conditions are correct, the chains will honor that transaction in a asynchronous way. Whereas, you know, if you needed this synchronous one single transaction and you are guaranteed
Starting point is 00:17:36 that the entire sequence of the transaction, transaction will go through. For that, you need some sort of shared sequencing and Ag layer provides an environment where people can come and build these shared sequencing mechanism. And we see the world growing into a place where you will have, we will have like, Ag layer will have like hundreds or like, you know, not even hundreds. Like I would say hundreds of thousands of chains connected in the next five years. Like I think the space where we are heading into by the end of 2025 only, you will probably have like thousand chains. And each application eventually spinning up their own chains, meaning like tens of thousands of chains.
Starting point is 00:18:10 And how we see is that there will be like a large number of that, those chains connected to the Ag layer will be individual sequence chains. So you can do asynchronous transactions across other chains, whereas a few clusters of chains will be shared sequence chains, which have much faster and synchronous composability across the chains. And that is dependent on the use case and how. those chains want to choose and join when some particular consortiums, consortiums and all that is fully dependent on the use cases. Aglare is pretty, pretty agnostic to that.
Starting point is 00:18:47 Okay, so I think there's a, there's kind of like a lot to unpack here, but what I'm taking away from this is basically you're distinguishing two cases here. So one where kind of the transactions happen one after the other. So kind of say, for instance, I bridge from chain A to chain B. There I swap token X for token Y, then I bridge to chain C, buy an NFT against token Y that I really wanted, and then bridge back to chain A. And then there's the other case where kind of everything, where kind of all the transactions happen at once or they don't happen, right? Kind of what I would call kind of an atomic transaction, right? So basically you bundle everything together. So for instance, if I want to do, if I want to ob something, that's kind of how
Starting point is 00:19:40 I would want to do it in order to kind of minimize my own risk, right? And you're saying that the subsequent transactions that is possible today with, with Ag layer, whereas kind of the atomic transactions, that is something that would require shared sequencing, which is kind of very much in, which is optional for chains to kind of opt into. And you kind of see specialized cases coming up where that will be done. Is that more or less correct? Yeah, that was very well summarized too. Yeah.
Starting point is 00:20:14 I would say we're still working on both the async and the synchronous and atomic case. But yeah, I think that's exactly the way to frame it. Yeah, I just thought that, not to get off topic, but I think this is sort of interesting. Like the ARB case that you describe, I think is like, like a really good example of when atomicity is really important. I think it's interesting to like imagine what the world will look like. I'm not sure that every single chain in the world will be using the same shared sequencer and the same shared builder.
Starting point is 00:20:50 But you could imagine like if we're trying to scale defy to like internet scale or to like global scale, what that looks like. It could be a bunch of different chains that are all providing liquidity sort of in parallel. And they're all using the same shared sequencer because it's really necessary for for people trying to arbitrage to be able to do atomic arms between chins. And like it's very, very important for us to be able to guarantee like within some slippage parameters that either both legs of an arm go through or neither. And so I think like I wonder if most of the world.
Starting point is 00:21:33 composes asynchronously with these sort of like defy hubs, which are like synchronously connected via a shared sequencer, or whether sort of synchronous composability becomes important for everyone. I think it's just there are a lot of open questions as to how the world looks. I'm under the eye clear. This is kind of an aside here. I want to get back to the asynchronous case because that's kind of what works today. But do you think kind of in the synchronous case
Starting point is 00:22:07 where you kind of have to have shared sequencing between different chains, do you think it's possible to have a composite model where you say every other block is kind of executed by synchronous
Starting point is 00:22:24 sequences and every other block is done by regular people because obviously or at least my concern would be that the hardware requirements and kind of the DevOps capabilities needed to kind of run the hardware behind
Starting point is 00:22:43 a shared sequencing setup would be very difficult and not achievable for many people. So kind of you would have a small number of entities actually doing this shared sequencing setup, which would result in much lower censorship resistance than you would have with a much wider validator set.
Starting point is 00:23:10 So do you think it's conceivable to kind of have both where you have like dedicated slots for shared sequencing and then kind of if you want to do something atomically, you wait for that slot and every other block is built by independent builders? Yeah, so I think that's definitely one approach. Another approach would just be to, like, I think implicit in the shared sequencing construction is sort of a natural fallback where if a block builder goes offline or misbehaves or attempts to rent seek, chains can always default to allowing anyone to produce blocks for that chain. They wouldn't be a sophisticated builder running full nodes for every chain, but it would sort of like fall back to the asynchronous case. But I think that like, like, Ultimately, it's not for us to decide. It's up to the chains.
Starting point is 00:24:01 But I do think having, like, periodic blocks that are produced that are not synchronously composable is, like, an interesting solution to sort of guarantee that there exists enough block producing nodes to provide censorship persistence. Yeah, super interesting. So let's talk about the asynchronous case. So in my head, this currently takes the shape as, I would almost call it like a unified bridge. Is that kind of a fair conception of what you're building? Yeah, so I would, yeah.
Starting point is 00:24:34 So I would say that it has two components, right? So there's the unified bridge, which means that all L1 and L2 native assets that are in the Aglar ecosystem are locked in the same contracts. So all L2s that opt into the Aglar have this shared deposit contract that saves us from or saves them from having to submit an L1 transaction. every time we do asset movement or asset transfer across chants. And so the interesting thing with the ASIN case is like, so we have asset fungibility
Starting point is 00:25:08 and we also really want low latency interrupt. So right now, like if you want to, we were talking about this earlier, but if you want to move assets between L2s, even if you have a shared bridge, there's still this like heavy latency penalty because let's say that I'm on roll up A and I want to move my Eith to roll up B. In order for Roll Up B to accept the message that transfers assets from Roll up A to Roll Up B, it has to have a guarantee that Roll Up A's state is finalized. Otherwise, Roll up A could equivocate. It could create two blocks, one of which has a transfer to Roll up B, one of which doesn't.
Starting point is 00:25:51 and it could submit the block with the transfer to Roll Up B to Roll Up B, and it could submit the block without the transfer to Roll Up B to Ethereum to be settled on Ethereum. In the case in which we have to wait for finality on Ethereum, that's a really bad user experience, right, because we have this like 12 to 19 minute latency delay for that block to be finalized. So instead, what we can do in the Ag layer is Rollup B can declare that it has this dependency on some state of roll up A. So I can say, okay, I've received the state of roll up A that includes a message telling
Starting point is 00:26:27 me to mint a bunch of Eath and give it to this user. But like I'm going to via the Ag layer guarantee that my roll up, Rollup B, can only be settled to Ethereum if this state of Roll up Bay is settled. And so if Roll Up A equivocates, Roll Up B would just have delayed settlement until it reorgs the transaction from Rollup A. out of its history. And so there's this potential for, like we're basically prioritizing safety over liveness.
Starting point is 00:27:02 And so there's this potential where if the operator of a chain misbehaves or acts maliciously, settlement of Role of B could be delayed. But fundamentally, this is a necessary tradeoff for us to provide super low latency interrup. Okay. I think I understand the value problem. What kind of is missing for me in my head is how do you ensure the security and integrity of that ag layer? So kind of where does it check in or does it have collateral somewhere? How does it guarantee all this?
Starting point is 00:27:36 Yeah, so it proves it cryptographically with a zero knowledge proof. So like what's happening is L2s are submitting blocks to the ag layer. And the ag layer is proving to Ethereum that all of these properties hold. So the shared bridge is protected and safe. And chains are interoperating in a way that's consistent and no one's behaving maliciously. And so it's aggregating all the validity proofs from every single chain and also these additional proofs that guarantee these safety properties. And it's submitting all of this to Ethereum. And so that's where settlement happens.
Starting point is 00:28:13 For every block, that sounds very expensive. So it's not necessarily submitting. submitting to Ethereum on every block, but it's accepting blocks or batches from each roll-up. And so it's not like the cost of aggregating proofs or proving this is like actually very, very minimal relative to like your typical cost for a ZKVM. So like proving a ZKVM transaction, which we do routinely is actually a lot more expensive than the types of proofs that the act there is creating. Okay, but then kind of in between the check-ins or the ad layer into Ethereum,
Starting point is 00:28:54 what security guarantees do you have in that interval? Yeah, so, like obviously you are running the risk of some other chain misbehaving or of the ag layer misbehaving. But I think fundamentally, like, there's a very small window where this can happen. and the ag layer can never like settle a malicious action or malicious behavior to Ethereum. And so the way that I look at it is like it's very similar to how rollups work now. So most people use rollups and rely on like the pre-confirmation guarantee that's provided by the sequence. So the sequencer basically says like, okay, I'm accepting this transaction at very low latency.
Starting point is 00:29:39 Maybe it's like 400 milliseconds. And for most users, that's fine. They sort of have enough trust that the sequencer is not going to misbehave, and they're fine with operating on that guarantee. It could be the case that certain users are transacting in large enough amounts or need a further guarantee. And so they wait until their transaction is posted to Ethereum or until a proof is posted to Ethereum. And so if you're like selling a house and accepting cryptocurrency or like selling the Lamborghini, You just have like a different requirement for settlement versus if you're like buying a low value NFT or something.
Starting point is 00:30:22 And so similarly, if you are using a chain that's on the ag layer, you have these different stages of guarantees. So you have the initial guarantee that's provided by the sequencer of your chain. You have then the guarantee that's provided by the ag layer if you need an additional guarantee. And finally, you have the guarantee of settlement on Ethereum. And so once everything is settled on Ethereum, it's final. We know that it's valid. It can never be reverted.
Starting point is 00:30:52 And so I think it's up to users to sort of figure out which assumption, which guarantee sort of fits their use case. Okay. So I understand that the Ag layer can't settle false transactions to Ethereum, but can it send the transactions? Yeah. So there will be a mechanism to force transactions, like to force settlement without explicit cooperation of the Aglera.
Starting point is 00:31:21 The Aglare will be decentralized. So there will be like a censorship resistance component. But like fundamentally that the goal is to is for for chains to be able to use the Aglayer with no additional trust assumptions. So they should be able to to guarantee that like the Aglare can't censor, can't censor. can't censor settlement of chains for a long period and there are no additional trust or security of sanctions from the chains perspective.
Starting point is 00:31:51 Okay, so who builds, who makes the proofs for the egg layer? And is it currently centralized? Yeah, so the initial version is centralized, just like ZK roll-ups are centralized just because we're trying to do like a bunch of different things and like building the system sort of takes precedence over decentralizing it. But in the future, it will just be decentralized stake nodes that are producing proofs.
Starting point is 00:32:21 But again, these proofs are relatively lightweight, certainly relative to a ZKVM. And so, like, you know, we use laptops to test per generation. and that will be an option for users that are producing these proofs. Okay, and so in your time when kind of the Ag Layers decentralized, who will be able to generate these proof? So who will run the AgLail? Yeah, so Stake-Node, so we'll just have a leader that occurs on every Ag-Layer slot, and that leader will be in charge of producing proofs.
Starting point is 00:33:07 Okay. Who's already using the Ag layer? Currently, there are four chains connected to it, 4.05, which includes the polygon ZKVM, OKX's X layer is connected to it, A Star is connected to it, and as we recently announced that we have the pessimistic proof. you know, stage one is getting, you know, built and it comes very shortly, I think, in four to eight weeks. Then I think many other stack-based chains will be able to connect into the Ag layer.
Starting point is 00:33:51 So, you know, previously near protocol, like as a layer one has also declared that they'll be connecting into the Ag layer once they have the ZK ZK was on proofs. So similarly, like, you know, we expect many other, you know, chains and not only new chains, but the existing chains also connecting into it. We, in the next two, three weeks, for example, we also have another one of the biggest names in the space connecting into the Aalachia. So yeah. What's the pessimistic proof? The pessimistic proof. So this is, it's an interesting idea.
Starting point is 00:34:29 So part of the vision for the Ag layer is that we're not opinionated about anything. So chains should be able to use their own sequencers, their own tokens, their own governance mechanism. They should also be able to use their own execution environment. So you should have chains that are able to run a ZKVM, type 1 and type 2, the SVM, Miden, you know, a move VM, like a custom Rust VM. Basically, whatever they want. The problem with this is that if we're using a shared bridge, then for every VM and every prover that we include in the Aglar,
Starting point is 00:35:11 the probability that one of these provers is unsound and can be used to generate a proof that's valid, but contains an invalid transaction goes up. And so this would be really catastrophic because it would allow some chain to construct a proof that verifies. But maybe like the block that that proof is validating includes a transaction that allows someone to withdraw like a million eat or something. And they can drain the entire shared bridge of all funds. And so we don't want this to be possible.
Starting point is 00:35:45 And this is like a very, very important part of guaranteeing that chains have the same security using the Ag layer as not using the AgLayer. And so what we do is we basically say, okay, from the AgLayer's perspective, we assume that every, proof is unsound, that every prover has some soundest bug that's sort of like hidden and deep in the code. And so instead, we have this special proof that's very, very simple, and it checks, it basically takes in all of the asset transfers from the bridge, and it checks that the token balances on each chain are conserved. So no chain can withdraw more tokens than are currently deposited to that chain. And so this guarantee is that, you know, I can't spin up some chain that has a prover that I know is unsound and use it to drain the entire bridge. And so we call it
Starting point is 00:36:39 the pessimistic proof because we're basically assuming that there's a soundness error everywhere in every prover and we still want to guard against that possibility. So yeah, so that allows us to provide the same security for chains as if they were deployed on separate bridges. So if you're on Polygon ZKVM, your funds are safe, even if there's some compromise chain that exists elsewhere in the ecosystem. Yeah, that makes a lot of sense. So we were talking about different kind of chains integrating into the AgDaya. Does this also work for optimistic roll-ups? So I think we've kind of touched upon different ZK brood-ups, but is optimistic fundamentally different? Yeah, so because there's such a long fraud-proof challenge period, we can't offer the same guarantees
Starting point is 00:37:34 because there's no like fast finality that's guaranteed by the chain. And so so the chains that can connect to the Iglare are obviously ZK roll-ups. And they're also side chains that have that have finality. And so so you could say like, okay, I'm not going to to generate validity proofs for my chain. Maybe your users don't care or you think that, you know, having side chain security is a sufficient security guarantee, which I think is a valid position. And from the Aglar's perspective, we already have the pessimistic proof. And so for the Aglar, like, we actually already assume that your validity proof is not valid or there's some issue with it. And so we can accept like a proof of consensus that verifies the security of the
Starting point is 00:38:22 side chain. The problem is, like, you know, Like, we can't have optimistic roll-ups because there's no way to offer this short inter-op period where we can't guarantee that like the state of the optimistic roll-up won't fork if, say, the validators say that a particular state is final and then a fraud proof is submitted later. like the Aglar has to be sort of fork-free after like finality is reached. So that's sort of the big issue with optimistic go-ups. Okay, I see. You very recently introduced something called a type 1 prover. So for everyone like me, not well versed in ZK EVM proofers, what does type 1 mode mean?
Starting point is 00:39:18 Yeah, so Vitalik came up with this classification framework for ZKVMs. And it ranges from type 1 to, I think, type 4. And it basically refers to like how similar an environment is to the EVM. So a ZKVM is like a special type of computer that's running inside a zero knowledge proof that allows us to verify execution of EVM transactions. And so a type 4 is basically like, okay, this computer is. it's different from the EVM in like concrete ways. But maybe we have a compiler or something. It allows us to take existing solidity code and compile it to this new target.
Starting point is 00:40:01 And maybe, you know, it's like 90% similar. Or like there are like, you know, there's some parts that you have to change. You have to re-audit. But otherwise it's similar and functionally equivalent. Type 2 is type 2 and 3 are. are basically like you have an environment, you have a ZKVM that presents a functionally identical environment to users and developers.
Starting point is 00:40:29 And so you can take your solidity code, you can use it as is, users can transact with the chain with basically no difference from Ethereum. But we can't use that ZKVM to prove existing EVM chains. And so that's where a type 1 comes in. It allows us to take any existing EVM, VM chain, whether it's like an optimistic roll-up, like the Polygon POS chain, and we can upgrade it to being a ZK roll-up.
Starting point is 00:40:54 So seamlessly, we can just immediately start generating proofs, and we can convert it into a chain that's secured by fully proofs. Okay. Okay, I see. And how does this kind of fit into kind of the Ag-Lay integration? Yeah, so it allows us to take optimistic roll-ups that already exist and upgrade them to ZK roll-ups so they can join the AgLair. Okay, so basically this is a way for kind of fixing optimistic roll-ups such that kind of they are compatible with the Ag-Layer architecture then. Yeah, exactly. So if the Type 1 proof as a way of kind of making an optimistic chain
Starting point is 00:41:43 integrable into the app layer. Are there any drawbacks for that? I don't think that there are any drawbacks. I would just say that when an optimistic chain uses a type 1 brewer, then they don't need to remain an optimistic chain. They can be a fully validity proven, fully proven chain
Starting point is 00:42:06 and don't need to have those optimistic fraud proofs which need for seven days to seven days' challenge period as a safety mechanism and all those things, and they can settle instantly. So, you know, when I see of, I think of optimistic chains using the type one prover, I am thinking of them upgrading into a ZK rather than like, you know,
Starting point is 00:42:31 going along with both of these mechanisms. Okay, so how exactly does the type 1 prover, how does it generate proofs for these optimistic chains? Yeah, so, so, So it can take in what we call witness data from full nodes and use it to generate a proof that every transaction in the block is applied correctly to the existing state of the chain. And so we can just prove that a block is valid given a previous state. Yeah, and Fridig, I think your question is like, how does the type of
Starting point is 00:43:13 and Proover kind of like, you know, work with the optimistic or generate the proof for an optimistic chain. Actually, you have to understand what is an optimistic chain. Like there is a simple, mostly like I'm talking about EVM chain. There is a simple, let's say, get node, right, which is running somewhere and the chain. And the get node doesn't know anything about optimistic proofs being sent somewhere. It's a simple get node. And then on the Ethereum, you have a few smart contracts which are the optimistic part of the of the of the of the of the of the of the whole system right and when you create the zK proof the zK proofs are being
Starting point is 00:43:49 created of the chain zK proofs don't have anything to do with the optimistic proofs of the smart contracts that they have on the ethereum block chain all that they're zK proofs are just creating a proof for a get chain get based chain or any uh you know ethereum or evm client based chain and you what you do is as an optimistic chain is like you just simply strip out the optimistic part of the chain and just use, replace it with the ZK proofs and upgrade it into a ZK chain. So basically what you do is kind of you, you go back to the last state of the chain that can't be rolled back because kind of the challenge period has lapsed and then kind of you prove that your current state is a valid successor state of that. Is that correct?
Starting point is 00:44:37 Absolutely. Absolutely. Okay, cool. What are the challenges? in building these, because it sounds like, sounds to be true. So what are the challenges in kind of building a type 1 proof or does it only apply to certain kinds of chains? Yeah, so I think the challenges are, there are a few. So like a lot of the cryptographic permanents that are used in the EVM, specifically Ketchak, some of the pairings, they're actually not very friendly to being proven in Israel knowledge proof.
Starting point is 00:45:17 And so there's like a lot of engineering and research work that has gone into making those more efficient. So specifically like Ketra Act and parents. And so beyond that, things like the NPT and RLP encoding, they're just not very ZK friendly. But what we discovered was that a lot of the work that we've been doing, doing in R&D in Polygon, so developing Ponky 2 and Ponky 3, that has gotten us to the point where we can actually accept this extra complexity and cost, and we're able to generate transactions at very, very low cost. And so we've been generating basically a proof for every single Ethereum block from, I think,
Starting point is 00:46:03 the beginning of Shanghai or something. And what we've seen is that for these proofs, or these transactions, the average proving cost is between one and two tenths of a cent. And so this is already like a very, very competitive cost in relation to transaction fees that users are already paying. And we think that the proving costs will continue to decrease because our type 1 ZKVM is built on Plotkey 2. And when we move to PlotE3, that will be like a huge, huge unlock and speed up.
Starting point is 00:46:41 And so I think it's fair to say that we're already there in terms of proving cost. We're just trying to push the frontier of applications where it becomes economically feasible and practical to generate proofs. So right now, like everything that you currently do on an L2, I think is feasible and practical and the cost is low enough to generate proofs for. But if we want to do like games and social applications that aren't as like economically valuable and don't have as high a fee component, we need to further reduce proving costs to make those sort of practical as well. Okay, so kind of the only reason why kind of type 1
Starting point is 00:47:21 proofs are feasible is kind of because the cost of generating proofs has decreased so much that you can now kind of just prove the state of the chain since the last uncontestable state. So let's talk about these advances in cryptography. So you were talking about Plonkey 2 and Plonkey 3, and the fact aside that all of these Provo systems seem to have really wacky names, can you walk us through kind of the evolution of ZKP system? So kind of like back in the day, kind of we had ZK Snarks and ZK Starks and then kind of these come with,
Starting point is 00:48:06 inherent limitations and challenges that were then kind of counteracted by kind of like polynomial commitments and recursion and so on. Can you can you dive in can you can you can you dive into a little bit more detail here? Yeah I I think by by wacky names you mean like brilliant branding and very uh actually very good names no I see that see that's a reason why we don't put mathematicians in the marketing department. Yeah, yeah. I do think the Planky naming might be like some of the worst branding that's ever been introduced to crypto, but I don't know, like people, people know what it is and it's recognizable.
Starting point is 00:48:48 But yeah, so I think if we rewind to 2019, so Daniel Lubrov, who is my co-founder of MIR, he and I raised money for MIR in 2019. And it was like a very, a very, very nerve-wracking time because the primitives that were available for ZK Tech were not fast enough to do what we wanted to do. And we weren't sure how quickly they would become faster. And so we put in a ton of effort in 2020 and 2021 to improving ZK Tech. And our approach was like in the academic community, there's a lot of focus on like asymptotic efficiency. So like how many field operations a prover requires to generate a proof? And this is like a proxy for computational cost and latency and proving efficiency.
Starting point is 00:49:46 But our insight was that not all field operations are created equal. And we should have this notion of like hardware and software and theory code design. So we should look at like which operations can be can be optimized to be more efficient hardware and how that can feed into the theory piece. And so one of the things that we that we hit on was like Fry, which is the polynomial commitment scheme used in Starks, has this nice property that it doesn't depend on elliptic curves. And what this means is that like unlike elliptic curves that depend on or that require
Starting point is 00:50:27 very large fields, so at least 256-bit fields. With Fry, you basically have a lot of freedom in selecting a field that might be much smaller. And if you think about modern CPUs, they operate on 64-bit word sizes. And so when you're simulating 256-bit field arithmetic, it's actually a lot less efficient than if you had a field element that could fit in a single word. And so we discovered, or Hamish from our team proposed the Goldilocks field, which is this field that has like a very, very specific structure that makes it really, really efficient on modern CPUs. And so from there, there were a ton of optimizations and a ton of work that went into Planky II, which was this, like this vision of Starks that operate on these small, like very carefully chosen fields.
Starting point is 00:51:25 And that yielded like a 50 to 100x speed up over what was currently available on Ethereum. And so, so that's been sort of like the route that we've taken with with Ponkey 2 and now with Ponky 3 where we have picked another even smaller field that had like it was really, really nice on CPUs, but it had this like theoretical drawback where you. couldn't, it didn't have this nice property that we needed for Starks. And so Ulrich from our team, working alongside some researchers at Starkware, basically figured out a way to like change the protocol a little bit so that we could use this, this really, really nice field with, with really nice structure. And so that's kind of been the progression is like going from an academic mindset
Starting point is 00:52:13 where we're really, really far from like the concrete efficiency on hardware to more of like a hardware software code design where we have, we have engineers that are working alongside researchers and people on the theory side to collaboratively build faster protocols. Okay, so I think I'm missing like a little bit in between here. So I totally understand kind of how you set your constraints, right? So you said, okay, this is kind of the, GPU, this is kind of the number of CPUs with certain specs that are, that should,
Starting point is 00:52:51 this should somehow run on. And then you kind of, you determine the field size based on that. But how was the field size chosen initially? So how was it decided that kind of you would need 250, 65 bit field in the first place, if kind of like half the size, it would have easily been enough. Yeah, because for elliptic curves, you need, you're working over a group that is defined by an elliptic curve. And so in order for that to be secure, in order for the discrete log problem to be hard enough,
Starting point is 00:53:28 it needs to have a certain minimum field size. You can define an elliptic curve over an extension field, so you could pick a small field and use some higher order extension. But there's still, like, that's not what people were doing. and you're still stuck doing a bunch of arithmetic in at least a 256-bit sized field. And so with Fry, there's no dependence on an elliptic curve. And so we don't have this minimum size requirement.
Starting point is 00:54:01 Okay. And so now the difference between Plonky 2 and Plunky 3 is that you can use an even smaller field, making it much cheaper. So is this the limit, or can you make it smaller still? So I think it might be minimal gains from making the field size smaller. So the nice thing about Plotky 3 is that it has a small field that also has this really nice structure. It's a Risen prime, which means that it's of the form 2 to the end minus 1.
Starting point is 00:54:31 And I think that the further improvements will come from like on the theory side. and on the zero knowledge proof protocol side. And so using different polynomial commitment schemes, using polynomial commitment schemes with a more efficient verifier for better recursion efficiency, I think that these will kind of be the routes for future improvements, not so much improvements on just from reducing and field tests. What about specialized hardware?
Starting point is 00:55:06 So, I mean, this is run on regular multi-core CPUs. I assume, right? Yep. So if you were to build like specialized chip sets, would that make it much simpler or much more efficient? Yeah, so it would. And there are a bunch of different projects that are currently developing chips that support,
Starting point is 00:55:29 you know, Goldilocks field arithmetic or other operations that are used in improving. The tricky thing is that some of these, some like FPGAs and GPUs might be like hardware or might be memory or bandwidth limited. And so you might be able to really accelerate certain parts of the prooper. But end to end, it might still be the case that, you know, things are cheaper on a CPU. But I think that we're seeing a lot of efforts toward hardware improvement. And I think that a lot of these are going to yield very significant speedups in the near future.
Starting point is 00:56:08 Okay, so do you remember when Zcash had the bug in the cryptography for the shielded pools? So kind of with all of this very advanced cryptography, the problem is that the number of people who really understand it to the course very small and kind of you always run the risk that kind of there's a vulnerability somewhere in there that just no one's found yet, right? So I hear that kind of like, if you look at kind of like how you set up the egg layer with kind of the pessimistic proof and so on, kind of, I mean, you everywhere you're kind of trying to contain the risk of introduced vulnerabilities. But if something like Plonkey 3, where to kind of be faulty on some level, what would that mean for the system? Yeah, so I think this is like very much a top of mind. concern for us. And our strategy has been to sort of have a gradual progression of like governance minimization and decentralization. I think it's really important for ZK roll-ups or for protocols that use ZK technology to launch with training wheels and to give their systems time to be scrutinized
Starting point is 00:57:31 and to develop and to, you know, potentially have formal verification. And so, I think that we're still like very much in the early stage of this process. And I think it's important to, you know, we're also doing audits and like we're very much like in bug bounties and taking like an approach that we really want these systems to be scrutinized and to be secured. But I think like with a lot of things in cryptography, it's just a question of time. And we just need these systems to be in production and securing value. in kind of a training wheels mode where if there is a soundness issue, it can be detected and it can be remedied by an emergency security council or something like that. And so I think that that's the best strategy because we need to provide a sufficient incentive
Starting point is 00:58:27 or like a sufficient level of exposure to systems to harden them. But at the same time, we can't progress too quickly and really put user funds at risk. And so that's kind of the, we're like trying to balance those two concerns. What shape do the training wheels take here then? Yeah, so right now like for like, like we can take the ZKVM roll up, for example. So like I think like every ZK rollup that currently exists, only a single designated party can provide proofs to this rollup. So we are the only approver. I believe that this is the case for Starkware and for ZKK.
Starting point is 00:59:06 async and for scroll. There's only one party that can provide proofs so that if there is a soundness issue, it cannot be exploited by an attacker. The second training wheel is an emergency security council where if there is an issue that's detected, we basically run every transaction twice. We run it once to prove,
Starting point is 00:59:33 but we also run it to basically check that execution is correct and that the proof is consistent with what we're executing in the client. And if those two were ever to differ or if someone were to submit a transaction that could potentially cause an issue, we have the ability to default the system and to rely on a security council that's the majority is from outside polygon to address that issue. And so I think that this is the same strategy as for the Ag layer where, you know, users are still getting a much higher level of security because like we cannot steal user funds because we would have to like exploit a soundness issue or or prove something that's invalid. But if a soundest issue is detected,
Starting point is 01:00:24 like we have the ability to address them. Okay. So maybe this is a question for Sandeep now. So, Sande, you guys have put together this very ambitious vision of how the future of the Internet should kind of look under the hood, right? And it is certainly super compelling. If I were to tell you I had a crystal bowl and I had a crystal ball and I could look into the future and in five years it doesn't work anything like that. What do you think the issue will have been? So if this fails, why will it fail? I would say that your crystal ball is not paying its internet bill and it's not showing you the correct results.
Starting point is 01:01:12 So use Moses pay to pay the internet bills of your crystal ball. No, I mean, seriously, I think that, you know, we as a unit have been working on this problem. And many times, like, you know, people come to us today that Polygon is not doing this. And they talk about a lot of these short-term things like, you know, Polygon is not doing this, Polygon is not doing that, this, that, this, they should do this, they should, that. But the thing is that for us, the mission has been extremely clear from get-go.
Starting point is 01:01:48 And that mission is very simple, that how do we create this infinitely growing Web3, you know, infrastructure, right, our blockchain network, infinitely growing blockchain. You know, if somebody tells me like, okay, this system works for 50,000 TPS, this system works for 100,000 TPS, not interested. We want to build an infinitely growing network, so which can take like even like, who knows, one million, two million, two million, even one billion TPS in 20 years, it should be able to take that. So if you ask me that architecture, I don't see, I mean, as of now, like I don't see there, there, there can be an, There can be an alternate architecture which can achieve this, you know, this infinite scalability
Starting point is 01:02:35 while like, you know, also solving from this fragmented liquidity and, you know, user experience and all that. And if let's say that doesn't happen, Aguilier is not that. I would say somebody else would be there, but their architecture will be very similar to Aglia. Like something like this will win. Whether it's Polygon's Aglair or somebody else's Aglair, that nobody can tell. But somebody like this will win, something like this will win.
Starting point is 01:03:01 I think those are fantastic parting words. So Sandeep and Brendan, if people want to learn more about this, to kind of dive into the docks and so on, because obviously this was, we've barely scratched the surface here. Where do they go? How can they start building on this? Yeah, so the idlyar docs are, I believe, on the Polygon website, I believe that we will be spinning up a separate website
Starting point is 01:03:28 that's ag layer focused that will have the docs there but yeah they're welcome to go there they're welcome to reach out to me directly or anyone from our engineering team and yeah
Starting point is 01:03:44 go from there and I also want to as a parting way just want to say that you know with this ag layer that polygon like kind of extends beyond a layer two network like you know we like with ag layer once the Ag layer comes in fully,
Starting point is 01:04:00 you can't really, like, ag layer is not a layer one or layer two. It's kind of a meta layer, which allows layer twos, and many of them will be simple connected chains who are not even using the layer two validity proves, but they still can exist in this multi-chain environment. And so, you know,
Starting point is 01:04:20 that is that the polygon transcends this layer two, layer one debate, and then, you know, goes into a place where it can actually support this infinite scalability while using Ethereum as the settlement layer and that has been our core thesis from day zero
Starting point is 01:04:35 and I think now we are at a place with Ag layer where we can take it to you know more closer to reality cool, super nice thank you guys
Starting point is 01:04:46 as always it's been very elucidating I look forward to having you guys on in six months time and see what you feel then yeah thanks for Rico.
Starting point is 01:04:58 Thanks for Lviki. Always nice talking to you. Thank you. Bye-bye. Thank you for joining us on this week's episode. We release new episodes every week.
Starting point is 01:05:08 You can find and subscribe to the show on iTunes, Spotify, YouTube, SoundCloud, or wherever you listen to podcasts. And if you have a Google home or Alexa device, you can tell it
Starting point is 01:05:17 to listen to the latest episode of the Epicenter podcast. Go to epicenter. com. For a full list of places where you can watch and listen. And while you're there,
Starting point is 01:05:25 be sure to sign up for the newsletter. So you get new episodes in your inbox as they're released. If you want to interact with us, guests or other podcast listeners, you can follow us on Twitter. And please leave us a review on iTunes. It helps people find the show, and we're always happy to read them. So thanks so much, and we look forward to being back next week.
