Epicenter - Learn about Crypto, Blockchain, Ethereum, Bitcoin and Distributed Technologies - Misha Komarov: =nil; Foundation – The Marketplace for ZK Proof Generation
Episode Date: September 22, 2023

Zero-knowledge proof systems have found tremendous cryptographic utility in scaling blockchains, due to their ability to prove computational integrity succinctly. However, despite recent advancements... in ZKP R&D, their construction still requires special prover circuits. Their complexity is what gatekeeps zero-knowledge technology to a select few astute teams. =nil; Foundation aims to challenge this status quo by providing an alternative through their zkLLVM circuit compiler and ZK proof marketplace. By commoditising the production of custom proofs, =nil; Foundation unlocks an entire new range of applications employing zero-knowledge technology. We were joined by Misha Komarov, co-founder of =nil; Foundation, to discuss the use cases and challenges of building the first marketplace for (outsourced) zero-knowledge proofs.

Topics covered in this episode:
- The vision behind =nil; Foundation
- Use cases for ZK tech
- zkLLVM circuit compiler
- Homomorphic encryption, ZKPs & privacy solutions
- ZKP marketplace
- Why proof markets (currently) run on DBMS
- Marketplace actors
- Proof generators vs. PoW miners
- Infrastructure challenges
- Future roadmap

Episode links:
- Mikhail Komarov on Twitter
- =nil; Foundation on Twitter

This episode is hosted by Brian Fabian Crain. Show notes and listening options: epicenter.tv/514
Transcript
This is Epicenter, Episode 514 with guest Mikhail Komarov.
Welcome to Epicenter, the show which talks about the technologies, projects,
and people driving decentralization and the blockchain revolution.
I'm Brian Crain, and I'm here today speaking with Misha Komarov.
He's the co-founder of =nil; Foundation.
=nil; Foundation is basically working on a marketplace for zero-knowledge proofs.
So we're going to dive into what =nil; Foundation is, zero-knowledge proofs, the zero-knowledge proof market.
I think this is one of the areas where there's been a lot of interest, a lot of buzz about it.
There's a lot of investment in this area.
And yeah, so really excited.
So thanks so much for joining us, Misha.
Yeah.
Thanks for inviting me.
So, you know, I just mentioned, right, like, okay, a lot of like ZK interest happening now.
Now, an interesting thing about =nil; Foundation is that you guys actually started in 2018.
And, you know, it says on the website that the initial focus was on
sort of best practices for database management systems for crypto.
And so I'm just curious if you can talk a little bit about
how did this get started, and what was the original vision for =nil; Foundation,
and, yeah, what is this database management system?
Why is that important?
Okay, so let's go into that. Basically, the reason why =nil; Foundation was started is because prior to that, me and, well, it's like, we were together doing a fork of Steemit, basically, like, a region-dedicated fork of Steemit. And I was kind of the fellow who was responsible for all the technical part in there. And it's like, from my perspective, I was dealing with all the technical issues, with all the data management issues. And I was literally, it's like, I was literally in pain
by the absence of proper data management tools back then.
And it's still kind of absent, to be honest.
Like, people still struggle with, like, access to Ethereum data, and yada, yada, yada.
So I was struggling with the absence of that.
And considering that all the Steemit stuff and all the Golos stuff
was actually kind of a social network and something-something, something,
you obviously were required to have, you know, proper data management.
I mean, like proper one, just like they do in traditional web industry.
So we had no such a thing back then.
and we still have no such a thing.
So in April of 2018, I was like,
hey, guys, let's go do a DBMS, right?
I mean, I don't want anybody to struggle with that.
I don't want to struggle myself.
I don't want anybody to have these issues.
So that's what it was.
Right.
So this would be like, okay, you want to have a database where like you store in there.
I don't know, these are all the users, these are all the posts.
And then like use that information to serve a web application.
Yeah.
Yeah.
Since it was required for such a database to work in untrusted environments,
like, to be basically BFT-compliant, right?
It's like a database for crypto, right?
So we had to think about how to mix these two industries together,
to make it basically work, to make it suitable for hosting BFT
protocols, for hosting BFT applications.
So yeah, that's what it was.
That was the idea: to merge the industries together.
Yeah.
And the idea with these databases was basically... I'm curious if you can
expand a little bit more on this aspect of having the database be trustless.
Did you kind of imagine that the user would have some way of verifying that, you know,
the database functioned in a particular way, and, you know, sort of served the results in the right way?
That's one of the critical components, because if you want,
if you want, like, the DBMS, or, like, the database, to work in untrusted environments,
you've got to be able to verify what's going on in it.
I mean, you can't just like go and access, for example,
somebody's data, like Ethereum's data, or, like, some roll-up's data,
or, like, some other protocol's data, something, something, whatever.
You can't just come in and, you know,
and trust what has been given to you.
Because, I mean, this database could easily just, you know, screw you.
And this can result in something very nasty.
So basically, it is required, and it was required, to make the interaction with this thing
as trustless as possible, because, I mean, like, one,
one more trust point? Come on. We don't want to be that trust point, right? So we do not want to
have that responsibility on our hands. So basically, to make it as trustless as possible, it was
required for people who operate over some data inside this database, or through this database,
like over Ethereum's data, for example, right? It was required to make them capable
of, like, verifying whatever they have done. And for the sake of this, it was required to have,
like, a provable execution environment, right? And so that's basically the desire. And
how to do that, like a provable execution environment?
I mean, you've got to prove somehow what was executed. And the most, it's like
the most convenient, the most suitable thing which we had back then was some, like, modification of
Groth16 and, you know, some rank-1 constraint system based proof systems. So that was the best
fit back then, and we were like, it's not sufficient, it's not good, and we would need much more
than that. So we started working on the cryptography suite, we've embedded proof systems in there,
and once we realized that, okay, it's like, the industry got to that point when there
is enough, like, tech and theoretical research available to make such an
execution environment, we were like, okay. Well, it's like, PLONKish proof systems were introduced,
we got them implemented in that cryptography suite of ours.
Another question arrived.
It's like the second question which arrived is that besides just proving the execution of whatever
was done with the data inside the database, you've got to prove that the data, which was
taken as an input to this database, was basically taken from a right place.
Because otherwise, I mean, how can you be sure that the data that you're operating
over wasn't, you know, just made up out of nowhere?
That it's actually, for example, Ethereum's data, right?
So for the sake of this, we needed state and consensus proofs.
And that's how we got together basically with the Mina Foundation,
with the Solana Foundation guys,
because that was, like, our desire to do state proofs and consensus proofs.
And they were, like, the only ones which had any idea about this back then.
And once we got this, once we got together in 2021,
this collaboration of ours evolved into the birth of ZK bridges.
I mean, like so many projects are building ZK bridges now, right?
But, like, back then, it was like, hey, guys, we need state proofs.
You know how to do them.
We want to learn.
Let's do something together.
So that was basically, like, the start of ZK bridges in general.
So that's like what it was.
So, in the process of doing all that, we were like, yeah,
well, it's quite a lot of circuits.
I mean, like, it's too many circuits, right?
And they're very complicated.
And we don't want to do that, like, manually
anymore, because we've spent, like, a couple of years before that already, like, crafting these
circuits. And we were like, nah, we're not going to do that. I mean, probably somebody else
has this kind of problem. So we need a compiler for that. Let's just do a compiler for that.
So we took LLVM, I mean, like, just a compiler, like, which everybody uses, like, you know,
very solid. And we just took it. We made it provable. So that's how zkLLVM was born. I mean,
because we were, like, sick and tired of building circuits, like, manually. And apparently, the rest of the
market was also sick and tired of doing that. And the proof market is basically, it was born
out of our realization that all of those state proofs, consensus proofs and state transition
proofs we worked on for the sake of making this as transparent and as trustless as possible,
were really heavy. And we were not willing to make, like, you know, anyone generate them themselves.
And we weren't willing to generate them ourselves either. So we were like, okay,
we'll just make a marketplace.
Yeah.
So basically we just slapped the marketplace on top of the same database
we were building.
And we were like, okay, well, we were building a database.
We tried to make it as trustless as, like, you know, as transparent as possible.
So we'll just slap the proof market on top of it.
So this could also become, like, you know, decentralized and, like, distributed, whatever,
for proof generation.
That's what it was.
Cool, cool.
No, I think that was very helpful.
Actually, I think what would be great is to talk a little bit about,
and, you know, I mean, I think for many people this will be familiar, and for other people it's still kind of, like, maybe a little new,
but let's talk about the use cases for ZK tech, right?
Because what you mentioned, right, basically, okay, you want to have like this provable, you know, I have some code, right?
And that's like provable.
So of course, like sort of an obvious thing would be, well, you just use a blockchain for it, right?
put it on the blockchain.
It's not that simple.
Right.
I mean, I guess a bunch of the big downsides of that would be, well, now you have consensus.
So, like, your database is going to be maybe limited by the speed of consensus, right?
And then a scalability and cost would be much worse.
Exactly.
And then maybe some other things might be, yeah, let's say if you want to have, you know, private data, some off-chain thing being done.
Let me expand on that.
It's like, you're absolutely right when you're talking that,
okay, of course you can prove your computation, just putting it inside some protocol and,
you know, just computing that.
But there is a nuance that you can only compute inside such a protocol.
With such a protocol, you can only compute something which doesn't have a lot of like communication complexity.
It's like when you come to communication complexity and when you need to compute something really big,
or something really complex, right?
It's like the problem, the problem with all that,
with all that protocol-based computation
is that you get basically limited
by the communication complexity
in that you've got to basically split
the piece of computation you want to do
into little parts,
deploy them separately,
and then make them communicate with each other.
And this introduces a lot of overhead,
so you get kind of limited.
And that's only really suitable
when this computation is decomposable, right?
There's, it's like, there is, like, a very traditional trick in the DBMS world
when you need to compute something over data.
It's like, currently, it's like what people do currently, right?
Like, for example, in Ethereum, it's like, people just do, people just do basically, like,
synchronized computation, like, over some transactions, over, like, some, I don't know, replication packets,
and there's, like, across all of those nodes, right?
So the traditional trick in the DBMS industry to overcome this is to introduce, like,
dynamic sharding, but that's still not
enough. I mean, it's like, we realized that it's still not enough in case you cannot
decompose the computation you want into little pieces which would not introduce communication
complexity large enough, you know, to kill
the whole efficiency, to kill efficiency at all. So when you have a big chunk of computation,
which cannot be decomposed, it's easier, it's cheaper, it's faster to compute it somewhere else,
and then just put the result of this computation
into some protocol where you can operate with these results
in, like, you know, a decomposed manner,
and basically, like, you know, with small chunks.
So that's what basically, that's what basically it is about.
So we tried to cover, so we basically tried to cover,
we knew that it would be required, like not only us,
the whole industry does this, right?
It's like we knew that it would be required
to be able to cover decomposable computations
which do not introduce a lot of communication
complexity. It's like, you know, for this, basically, that's why we've used kind of sharding
in the DBMS, because it's traditional. We just brought it from there. It's like, okay. And to cover
the piece of computation which is not decomposable, which is not better to be decomposed,
which is just simpler to be computed somewhere else, and then to be proven, and then to be, like,
used on some, I don't know, Ethereum or, like, whatever. Anyway, that's what the combination
of a proof market plus compiler exists for.
It's basically like a marketplace for provable computations
which cannot be decomposed.
That's what it is.
So that's where it makes sense.
That's where it makes sense to use ZK proof systems
for this kind of computation.
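To make the pattern concrete, here is a minimal sketch of the outsource-then-verify flow described above. All names and the stubbed prover/verifier are illustrative, not =nil;'s actual API:

```cpp
#include <cstdint>
#include <iostream>
#include <numeric>
#include <vector>

using Input = std::vector<uint64_t>;
struct Proof { std::vector<uint8_t> bytes; };

// A big, non-decomposable computation (stand-in: a sum; think ML scoring instead).
uint64_t heavy_computation(const Input& in) {
    return std::accumulate(in.begin(), in.end(), uint64_t{0});
}

// Off-chain prover producing a succinct proof of computational integrity (stubbed).
Proof prove(const Input&, uint64_t) { return {}; }

// Cheap on-chain check: verifies the small proof, never re-runs the computation (stubbed).
bool verify_onchain(const Proof&, uint64_t) { return true; }

int main() {
    Input input{1, 2, 3};
    uint64_t result = heavy_computation(input); // 1. run once, off-chain, where it is cheap
    Proof p = prove(input, result);             // 2. prove it was computed correctly
    if (verify_onchain(p, result)) {            // 3. the protocol checks only the proof
        std::cout << "result usable on-chain: " << result << "\n";
    }
}
```

The protocol then operates over the small result in a decomposed manner, while the heavy, non-decomposable part never touches consensus.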
You mentioned like circuits
and you mentioned, you know,
zkLLVM, right?
So now, please tell me if my understanding here is correct, right?
Because let's say you have, like, some computation,
and you say, oh, you want to prove this computation.
And then a circuit basically means, okay, you develop a bunch of, like, equations, no, to then check this particular computation.
So somebody else can then check that.
And then with something like zkLLVM, you then just do that for, you know, any code that's written using that VM,
and then you can just verify anything written on this VM.
Is that kind of right?
It's not really, because, again, it's like,
=nil; is again being weird, right, in here.
So basically what's going on in the industry currently?
It's like, obviously, yeah, there are people which do custom circuits,
which do custom circuits with libraries for each particular application.
Like, I don't know, Scroll did that, right?
For example, they did a very custom, pretty good circuit.
I mean, really good circuit.
For the zkVM they've built over a couple of years.
And that's pretty good.
They used Halo 2.
It works.
It's fine.
And all right.
That's how it was done before.
Then people started realizing that you can actually not craft a custom circuit
for each particular piece of computation,
but do just one circuit of some virtual machine,
and then put the bytecode of the computation you want to prove
as an input to the circuit, and then prove this computation via this, like, you know,
enormous big-ass VM circuit, all right?
The problem, which is like, the problem which these two approaches
introduce, is that, for example, custom circuits, it's very expensive and very, very
troublesome to write them, right, to implement them.
And to implement them properly is even harder.
I mean, people, like, even complain about that.
And that's, you know, that's the valid, that's a valid concern.
It's like, in case, for example, for some roll-up or something, it turns out that the custom circuit they rolled out is under-constrained, or it just, you know, wasn't audited properly, or something, something, somebody forgot something. Like, just please don't forget anything, guys. If somebody forgot something in some circuit and it turned out to be under-constrained, somebody will be able to prove to Ethereum what actually didn't happen on the roll-up. And then we will all get fucked.
So that would be really bad if this happens.
And that's like the problem with custom circuits.
You can't know what's going on inside.
It's really complicated to craft them.
It's really complicated to know what's going on.
It's really hard to make them secure.
The nuance with VM circuits, with just one VM circuit,
is that you introduce an enormous overhead.
So basically the overhead of a VM circuit, of VM-based computation, zkVM-based computation,
compared with, for example, a custom circuit,
is usually, like, at least 10 times, at least 10 times, in terms of cost and time.
So this, again, introduces, like, a lot of overhead.
What is at least 10 times?
It's like, it's at least 10 times
the complexity, in terms of, like, approximation, in terms of circuit size,
in terms of the amount of computation you need to prove.
Like for example, let's say you want to prove A plus B, right?
So if you do a custom circuit for it, it's going to be very simple.
it's actually going to be like, okay, let's just prove A plus B.
You lay it out as an equation and you're like, okay, well, this is arithmetized.
Here goes the arithmetization of it, and we're good.
If you do A plus B, it's like, with a zkVM circuit, you basically prove not just A plus B,
but you prove the execution of the whole CPU plus everything, everything and everything,
like all the bytecode, everything, everything, everything, a shitload of opcodes,
which do, in the end, A plus B.
So that's, you know, just an enormous overhead for doing just A plus B.
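To put the A plus B example in symbols (an illustrative arithmetization, not =nil;'s exact constraint system):

```latex
\text{custom circuit:}\quad w_c = w_a + w_b \quad \text{(a single gate)}

\text{zkVM, for every cycle } i:\quad
pc_{i+1} = pc_i + 1,\qquad
\mathit{op}_i = \mathsf{decode}(\mathsf{mem},\, pc_i),\qquad
\mathsf{st}_{i+1} = \mathsf{step}(\mathsf{st}_i,\, \mathit{op}_i)
```

The zkVM's circuit size tracks the whole fetch/decode/execute trace rather than the single addition, which is where the "at least 10 times" multiple comes from.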
We kind of knew that we weren't going to go with this overhead, so we chose the way, as they say, in the middle,
which is basically about introducing a custom circuit for each particular piece of a program, yes,
but not manually, generating that automatically via a compiler.
So you can still prove mainstream languages, you can still prove, I don't know, like Rust,
Solidity, C++, anything you want.
But it's going to be a custom circuit for each particular part of the computation.
There will be no overhead, and at the same time, you do not craft those circuits manually.
So it's basically, like, something in the middle.
That's what it is.
And how are these custom circuits automatically generated?
Compiler.
That's where our compiler comes to the stage.
That's what it is about.
Okay.
So the compiler is basically like a program that will take some,
something like, okay, A plus B, and then put out a custom circuit.
For it, yeah.
Or, for example, if you want to prove something really big, you want to prove some, I don't know,
like ML model or learning or something, something.
You just put it in there, and there will be a custom circuit generated for it.
So there will be no overhead.
But in the same time, you could use the code, which was already written by, I don't know,
like thousands of people outside of the industry, for example.
Or, I don't know, a game.
If you want to prove a game, let's say you want to prove
that you speedran some game in some particular time.
Or, like, my favorite example:
a Doom speedrun.
So you can actually just put the Doom code inside the compiler,
the compiler does a circuit for it,
and then you brag to your friends, like, hey, I have a proof.
It's like, I can speedrun it in this amount of time.
Beat me.
Here goes a proof of it on Ethereum.
So that's what it is.
That makes it doable.
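For flavor, the input to such a compiler is just annotated mainstream code. A minimal sketch in zkLLVM's C++ frontend, using the [[circuit]] entry-point attribute from =nil;'s public examples (treat the attribute and toolchain details as approximate):

```cpp
#include <cstdint>

// zkLLVM-style entry point: the attribute marks the function whose execution
// is compiled into a provable circuit; the body is ordinary C++.
[[circuit]] uint32_t add(uint32_t a, uint32_t b) {
    return a + b;
}
```

The Doom example is the same mechanism at scale: the game loop is just more C++ fed through the same pipeline.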
And so with this compiler here, I mean, I mean,
I guess there's all kinds of different.
different things that people may want to verify, right?
It could be, like, maybe Solidity code, or it could be code on, I don't know, Solana or, like, some other blockchain, or maybe some code that runs elsewhere.
You mentioned machine learning or something else.
So this compiler, then it can basically take any kind of code and output these custom circuits or like...
Yep.
So any arbitrary programming language you can basically use for that.
it's like, any programming language which is supported by LLVM,
and there are quite a lot of compilers written for LLVM.
There were a lot of compilers written for LLVM over the last 20 years.
So that's pretty much, like, yeah, that's quite a lot of languages.
It's like, just in case, fun fact,
it's like, we have, like, an internal joke that you can actually produce a zkEVM
via zkLLVM.
So you can basically take, for example, an EVM
interpreter, for example, evmone, right?
It's like a classic EVM interpreter, which has been around since, I don't know, 2016 or, like, 2017, right?
You can just take it.
You can compile it to a circuit.
So you would get a zkEVM circuit as an output of the circuit compiler.
So something like this, yeah.
Maybe one thing we can touch on, it's like a little bit of a detour potentially.
I mean, in crypto, many people have been sort of aware of ZK tech for a long time.
But I guess the main project, right, that we've kind of known ZK tech through is Zcash, which, you know, basically said, like, okay, we're going to take, you know, do something like Bitcoin, but, you know, give people basically the ability to make transactions privately.
And then, you know, there's still these proofs so you can kind of know the whole thing is safe and correct.
but you don't know who's sending what to whom.
So the privacy thing was, you know, I think for a long time,
the kind of main way that people thought about ZK.
And yeah, now a lot of it is more around this other use cases, right?
Like roll-ups or like where you basically say, okay, you can run this computation.
We just use proof.
And so it's much more efficient because like maybe you have to run less things on-chain.
But let's touch on privacy briefly.
What are your thoughts on like ZK in privacy?
Is there much activity?
Do you think, like how do you think is going to play out?
Okay.
Yeah.
See, it's like, privacy applications of proof systems are what they were traditionally
supposed to be for, until recent times.
I mean, that's true.
And it's like, to understand if there is a way forward with this,
if there's, like, any, you know, light at the end of the tunnel, right, with privacy and with ZK
for privacy, we've got to understand how this privacy is achieved, for example,
in Zcash, right? So, for Zcash, privacy with ZK is basically achieved as follows.
Because it's like, you have some data which is kind of encrypted and stored inside the
Zcash protocol, the Zcash, like, database, whatever. So what's going on is that basically,
when you want to do, like, a transfer in Zcash, you get the data from there, the encrypted one,
you decrypt it, you post a proof of a successful decryption in there, you do some changes
with this data, then you encrypt it back, you do the proof of a correct encryption, and you post it back
to Zcash. I mean, it's like a high-level, it's a high-level overview of, like, how that works,
right? And what you've noticed in here most probably is that privacy is actually being
preserved, not by
proof system itself, but
by simply the fact that the
data never leaves
the, never leaves
the device of
basically, you know, user.
It's like, the decrypted
data is only available on the user's machine,
and the user is not willing
to disclose that, is not interested in disclosing
that. So that's how the privacy
is achieved. So there's a nuance.
It's like, there's a question.
Can you store some large amount of data
in the database and still be able to get it each time from the protocol to your machine,
decrypt it, do something with it, and then encrypt it, and then post all of these proofs of
correct encryption and decryption?
It's like, will you be able to do that with a large amount of data?
That's a good question.
And I haven't seen anybody willing to download, I don't know, like, petabytes of data to their phone
just to decrypt it, do some small change, and then encrypt it back again and send it back.
It's not, you know, like, very attractive.
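A sketch of the round trip being described, with stubbed stand-ins so it compiles; real Zcash uses notes and nullifiers rather than literal encrypt/decrypt calls:

```cpp
#include <cstdint>
#include <vector>

using Bytes = std::vector<uint8_t>;

// Stubs standing in for the real steps (comments say what each would actually do).
Bytes fetch_state()            { return {0x2a}; }         // pull encrypted data from the protocol
Bytes decrypt(Bytes ct)        { return ct; }             // real: decrypt with the user's key
Bytes apply_transfer(Bytes pt) { pt[0] += 1; return pt; } // the actual state change
Bytes encrypt(Bytes pt)        { return pt; }             // real: re-encrypt the new state
Bytes prove_transition(const Bytes&, const Bytes&) { return {}; } // real: ZK proof of the round trip

int main() {
    Bytes before = fetch_state();
    Bytes clear  = decrypt(before);       // plaintext exists only on the user's device:
    Bytes next   = apply_transfer(clear); // this locality, not the proof itself,
    Bytes after  = encrypt(next);         // is what provides the privacy
    Bytes proof  = prove_transition(before, after);
    // `after` and `proof` go back to the protocol; the chain never sees `clear`.
    (void)proof;
}
```

The data-size problem is visible in the first line: fetch_state has to deliver everything the user wants to operate on.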
So to defeat this, to defeat this basically, people started thinking about,
okay, we need to process this data right where it is, right inside the database.
You don't have to move it.
So you won't need to download it or, like, do something with it locally.
And that's where, basically, that's where basically, like, fully homomorphic encryption came into play.
And it's still in the lab.
It's still not out there.
It's still not usable.
But once it is, I think in terms of privacy, it will become much more relevant.
than proof system-based mechanisms.
Proof-system-based mechanisms are very good for compression.
I mean, those use cases which we see right now, like ZK bridges,
zkML, roll-ups, I mean, whatever, oracles, anything, anything, right?
Very good for compression.
Not for privacy.
All right.
So that was very interesting.
Let me just sort of, like, rephrase, to see if I understood this correctly.
Basically, like, the issue is that, you know,
in a proof system, for privacy, you know, I would basically need to run it locally, right?
So I'm going to have to download some data from somewhere.
I do whatever I do.
And then I generate this proof on my phone, on my computer,
and then I send this proof back.
And then that's the change.
But of course, the downside here is, well, I have to download this data.
And I don't know, maybe generating the ZK proofs is also computationally intensive?
Sometimes it is, yeah.
Right.
And then with homomorphic encryption,
basically the advantage is that they can have the data,
someone else can have the data,
and they can apply the computation on top of this data,
but everything's encrypted,
so they don't know actually what the data is,
but I don't have to be involved as a user.
Yeah, it's like this is the way.
Yeah, okay.
And so you feel like homomorphic encryption
is the thing that's really going to help
sort of the privacy applications more than the proof systems.
Yep.
It's like, again, proof systems are very good for compression.
For data compression, for serving compression,
I don't think any other thing will beat this.
But in terms of privacy, yeah.
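The contrast can be stated in one line: with fully homomorphic encryption the untrusted host evaluates directly on ciphertexts (a schematic property, eliding noise management and scheme details), whereas the ZK route makes the user download, compute, and prove locally:

```latex
\text{FHE:}\quad \mathsf{Eval}\big(f,\ \mathsf{Enc}_{pk}(x)\big) \;=\; \mathsf{Enc}_{pk}\big(f(x)\big)
\qquad\qquad
\text{ZK:}\quad \text{user fetches } x,\ \text{computes } f(x),\ \text{posts a proof } \pi
```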
Let's talk about the marketplace.
Can you explain how the marketplace works, the ZK marketplace?
And, you know, like who are the different actors that are
participating in this marketplace?
Basically, what is the proof market in terms of, like, you know, product architecture?
So what is that?
It's basically, again, just a marketplace for proof generation, right?
It basically turns all those, like, ZK proofs into a commodity, where you can, like, measure their
value, measure their generation time, measure how much they would cost to be produced, or how
much it would cost to, I don't know, like, speculate with them.
Not that I told you to actually speculate with them.
So anyway, so that's basically what it is.
What are the actors in there?
And what are the actors in there?
Well, let's start with like the most obvious one.
It's like generators.
I mean, this is nothing without proof generators, right?
So these are the fellas with, like, you know, big machines or specialized cards or something else, which are willing to provide their computational facilities for the sake of, you know, applications being able to use them, for the sake of security of theirs,
for example. Sometimes it's not security. Sometimes they just get paid.
It's cash or something. Yeah. So that's
the most critical component of this whole thing.
The nuance is that different participants, different proof generators,
induce basically, like, open competition.
I mean, obviously it induces open competition,
because somebody has better hardware, somebody has worse hardware,
and somebody is more fit to generate, for example,
proofs for roll-up circuits.
Somebody's more fit to generate,
like, kind of, proofs
for zkML circuits or something.
So it creates an open competition.
To coordinate this open competition
and to make sure that it stays fair,
there's basically like
coordinating protocol.
Like a coordinating, like, protocol or application,
right? An application on top of the DBMS thing.
So the second actor,
which is the most obvious one,
is basically the fella
who maintains
maintains this cluster, maintains this protocol, maintains the DBMS.
I mean, because currently that's just what it is.
You've got to store the data, you've got to facilitate the competition, you have, like, a competition,
and you've got to make sure that it stays fair.
So that's what it is.
There is a nuance in how this competition is designed internally, in terms of, like, product
architecture, and what is required for that.
But that's, like, you know, a different topic.
We'll come to that later if you want.
The third participant of this is basically, like, the application.
Well, it's the most obvious participant.
Like applications, I don't know, ZK bridges, ZK oracles, like, roll-ups, I don't know, zkML.
Anyone. ZK games.
So how does it look like for them?
It's basically, in most cases, this is, like, an Ethereum application.
It's like, and a roll-up is also kind of an Ethereum application.
We can say it that way.
So in most cases, this is, like, an Ethereum application
which comes up with a desire, something like, okay, I need composable computations for Ethereum.
I need to be able to just, you know, order some heavy-loaded, big-ass computation to be able to use it in Ethereum.
For example, some application decided that they need the result of an ML model which did scoring
over different Ethereum addresses, for the sake of, you know, figuring out risk parameters for some lending.
All right?
this is a very big chunk of computation.
And the application usually comes up
with something like, okay, I need this chunk of
computation, and the result of this, like, you know,
model, also on Ethereum.
So I'll just go, order it.
Somebody will generate it for me,
and I will just use the result of it
without being concerned that, you know,
somebody tries to screw me up.
So that's basically,
it's like three, it's like three major actors in that.
Maybe you can talk a little bit about the second thing.
So you mentioned, okay,
the maintenance of the...
you know, storing the data.
I mean, because in the end, right, I mean, it's a marketplace, right?
And you want it to be a trustless, a decentralized marketplace.
So is that run on-chain?
Let's put it this way.
That is, that's like, that is being run on top of, on top of a DBMS.
And the DBMS handles some BFT protocol inside of it.
So there is some protocol which facilitates this.
And I got to admit that.
I can admit that, in the current state of it, it's maybe not
decentralized enough.
It's like, we're going to progress the thing.
But anyway, so, yeah, it is being maintained by some protocol.
By, you know... yeah, there is a nuance that to run this marketplace, to run this kind of
thing, and that's not only this marketplace, but actually, like, quite a lot of other different
applications, like, for example, you've got to have some very particular requirements in this
protocol. Because, I mean, what the proof market effectively is, it's a lot of computation over
data, but computation which can be decomposed, right? It's a lot of verifications. And verifications,
I mean, you can run them on Ethereum. But if you do that on Ethereum, like, directly on
Ethereum, you will pay billions of dollars in fees just for the proof market per year. I mean,
nobody wants to do that. If you run this on a roll-up or something, like on a traditional roll-up or something,
you will quickly hit the limit of the gas available.
And the second thing, it's like, just one proof market, just a single proof market,
will be enough to induce congestion at, like, any roll-up existing out there.
So it's like, if you deploy a proof market, for example, to, like, some ZK roll-up or some other roll-up,
it will get congested, like, in just seconds.
That's interesting.
Why is the proof market so complex,
so computationally intensive?
Verifications.
A lot of verifications.
Each proof which is being submitted by a proof generator
has to be checked on the protocol side, so that the proof generator didn't try to screw somebody over.
So the verification has to be done.
And we have these, like, verifiers for EVM.
So yes, I've got to admit that the protocol which runs the proof market is kind of an EVM-based one, all right?
So we've put an EVM inside a database system.
So, you're welcome,
databases now have an EVM inside. Anyway, so you've got to check each proof which is being submitted,
which is being submitted by the proof generator. Because otherwise, an application comes for a
proof and they're like, okay, well, can we be sure that it's good? No one else checked it. So we've got to make
sure that it's good. And each pair, each circuit, each new order induces at least one verification.
And just to understand verifications, I mean, they take quite a lot of gas, quite a lot of computation.
So, yeah, that's intensive.
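To see why "billions of dollars in fees" is plausible, a back-of-the-envelope with purely illustrative numbers (none of these figures are from the episode):

```latex
300{,}000\ \text{gas/verification} \times 30\ \text{gwei/gas} = 0.009\ \text{ETH} \approx \$16
\quad (\text{ETH} \approx \$1{,}800)

\$16 \times 86{,}400\ \text{s/day} \times 365\ \text{days} \approx \$5 \times 10^{8}\ \text{per year at one verification per second}
```

A few verifications per second, which a busy marketplace would easily exceed, already lands in the billions.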
There's also a nuance, if you want.
There's also a nuance, if you want, that each proof, and the input data
which you use for proof generation, takes not just a lot of computation,
but it also takes a lot of storage, in terms of you've got to be able to produce a lot of data.
And once you try to put, like, for example, a bunch of, I don't know, 500-kilobyte proofs on some roll-up, it's going to get congested in seconds,
if you post that many proofs on some roll-up, right?
Or if you post the input for the proof generation, with which the application has come, you know, to the proof generator, this will also get congested, like, you know, in seconds.
For example, the input for the Solana consensus proof. I mean, we did the Solana consensus proof some time ago.
The input for the Solana consensus proof is really big.
I mean, it's really big.
It's actually just, you know, thousands of signatures, thousands of hashes, being produced, like, every 0.2 seconds or something.
That's a lot.
I mean, that's really a lot.
And to process all of that, I mean, it's quite convenient that we started with a DBMS,
because we found ourselves very lucky
that we started with a DBMS.
So that's basically really easy.
We can process it.
Yeah, yeah.
Okay.
And then you mentioned this DBMS.
It still has some sort of BFT thing.
So do you still have basically,
you know,
a bunch of different,
I don't know,
node operators or something similar
that then all run this DBMS?
Well, currently it looks that way.
Yeah.
It's like, not, like, node operators or something,
but, like, DBMS instances.
Yeah, there are different servers, different operators
which do host this.
But, like, again, right now it's just in testing.
And again, it's very, it's very important for us
that this does not accidentally become something, you know,
very standalone, because the majority of anybody
who's interested in this, I mean, like, in the DBMS,
is either accessing the Ethereum data,
or generating proofs, or the proof market for Ethereum applications,
or using the proof market in combination with some other application
dedicated to Ethereum.
So we try to not be standalone, and we figured out the way how to not end up being a
standalone thing.
So yeah.
We mentioned the three actors.
So maybe let's go through the other actors too.
So we mentioned, you know, we talked about the application.
So can you just run through it a little bit?
Let's say I'm maybe one or two examples of, okay, I'm someone who wants to develop an application
that's going to leverage, you know, ZK, and, you know, particularly this marketplace.
How would this work?
Well, basically, first of all, you've got to determine if you would need
proof generation outsourcing.
That's, like, the first thing you've got to determine.
The second thing you've got to determine,
in terms of, like, you being the application developer, is you've got to figure out
if that circuit of yours is, like, you know, really complicated, or if it requires,
like, something huge to be proven, or something, something.
So the typical workflow for this, like, once you decide these two points, the typical workflow is, well,
we propose it that way, all right? So we're kind of, like, protocol and toolchain agnostic, but by
default, we propose it this way. So you just come in, you take the circuit compiler, you craft some
circuit. If you don't need proof generation outsourcing, you just generate locally, and you just
verify it on Ethereum or, like, somewhere else, and you're good.
Basically, here goes your application, just build the logic, all right?
If you need proof generation outsourcing, you will be required to basically post the circuit
to the proof market, or, as we call it, list the circuit.
It's like, it gets listed on the proof market.
Never thought that was going to be in my vocabulary.
But anyway, so you're going to list the circuit on the proof market and say something like,
okay, guys, now I need somebody to provide liquidity, like proving liquidity, like proof liquidity,
all right?
For this particular circuit, for this particular circuit pair, or something like this.
So if you realize that, okay, yes, it needs to be outsourced, somebody comes in, generates
your proof, you get your proof, you take it, you use it anywhere else.
So that's basically what it is.
There's also one more scenario where you can use this. It's just, instead of doing all
of that, for example, on your front end, or, like, in a custom way from your application,
you can basically order the same piece, the same chunk of computation right from the EVM, if, for example,
you're building some EVM application, like a DeFi product or something. You can basically just come to the EVM endpoint of the proof market, say something
like, okay, I need this big chunk of computation for me, at this address, delivered in, I don't
know, like, two hours, or, like, in five minutes, or, like, for this amount of, I don't know,
something, whatever, for this amount of ETH, for example. So you ask them, they're like,
okay, well, I see the order, I'll generate it, this particular statement, here goes,
here goes your proof, and you get the proof, like, right inside your EVM, right inside your
EVM application. So that's, like, the second, that's, like, the second way of using this.
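Putting the two flows together, a hypothetical sketch of the marketplace interaction; the types and calls are illustrative, not the actual proof market API:

```cpp
#include <cstdint>
#include <iostream>
#include <string>

struct CircuitId { uint64_t id; };
struct Order     { CircuitId circuit; std::string input; double max_price_eth; uint32_t deadline_s; };
struct Proof     { std::string bytes; };

// One-time: list the zkLLVM-compiled circuit so provers can supply "proof liquidity" (stubbed).
CircuitId list_circuit(const std::string& compiled_circuit) { return {42}; }

// A generator picks the order up, competing on price and hardware fit (stubbed).
Proof await_fill(const Order&) { return {"proof-bytes"}; }

int main() {
    CircuitId c = list_circuit("state_transition.circuit");
    // Each time a proof is needed: place an order with input, price cap, and deadline...
    Order o{c, "block_witness", /*max_price_eth=*/0.05, /*deadline_s=*/7200};
    Proof p = await_fill(o);
    // ...then use the returned proof wherever it verifies (e.g., an Ethereum contract).
    std::cout << "got proof of " << p.bytes.size() << " bytes\n";
}
```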
Okay, cool, cool. And now, maybe finally, you mentioned generators. So generators basically, right,
they just do a bunch of computation, give back the results, get paid for it.
Sounds a lot like, you know, proof of work, sort of, right?
Or like, is this de facto going to be that you have a lot of the crypto mining farms
that maybe do like GPU mining are then just going to say, okay, now we're also going
to do, you know, proof production for like, let's say this marketplace, other kind of ZK systems?
How do you imagine that this market is going to look like?
Yeah, it's like, first of all, I want to say that this might be similar.
This might seem to be similar to the just traditional proof-of-work thing, but there are still
quite a lot of differences.
It's like, first of all, doing just proof of work is, like, computing hashes over
and over again for the sake of securing, like, some proof-of-work protocol, right?
So that's what it is.
And it's not like you're proving something new each time.
It's not like you're proving something, something which, you know, makes sense.
You just prove hashes.
And these hashes are present in there
just for the sake of computational complexity,
for the sake of you not spamming the cluster.
In case we're talking about, like, for example,
ZK proof generation, yes, it is required
to have quite a lot of computational facilities
for this.
There are folks which are doing, like,
you know, specialized ASICs, specific hardware,
specific hardware for this.
Some of them do programmable
ASICs, some of them do, you know, just slap a thing on a board.
They're like, okay, we're good with it.
And that's something which is similar. But what is different is that what you prove is not, like,
just, you know, meaningless hashes for the sake of computational complexity. It's something.
You prove, you prove basically the sequence of some actions which were done at some
protocol, or outside of the protocol, or by somebody else. If you digitize all your everyday
actions, you can actually create a proof of all your everyday actions. And is it going to be, it's like,
a bit meaningless? I mean, like, a proof that you crossed the road, for example,
or a proof that you, I don't know, something, that you walked, I don't know,
like, 500 miles, right? So it's like, is that meaningless?
Maybe, but it's like, is it meaningless for everybody? Not really. It's definitely not
meaningless for everybody. So this is what is different. So that's the first thing
which is different. The second thing, which is different from just proof-of-work stuff,
is that there the same piece of computation is basically being done over and over again.
Here, the computation each time may be different. And in case we're talking about, like,
in case we're talking about defining computations via custom circuits, there are two ways to handle this.
In case the circuit is the same over and over again, just like with zkVMs, okay?
So you have the same circuit, and you've got to prove the same circuit over and over again, just with different input data.
In this situation it makes sense for people to produce specific ASICs for this, specific hardware for this, which is dedicated to a particular circuit, right?
When we're talking about each piece of computation being represented by a different circuit,
it's like, it doesn't really make sense to produce specific hardware for each of these circuits, because they're not that widespread, and it might be easier to just, you know, compute them on, like, a CPU or GPU or something.
Because these are basically, like, dynamic circuits, and the system has to be more
flexible. So if we're talking about, like, whether this market is going to end up just like, you know, the
mining market, no, it's not going to end up that way, because
it's different by design.
So maybe there will be, like, some specific circuits or some specific
computations where there's just, like, an enormously large demand, and that maybe looks more like
mining, but then in general, because it's much more generalized, there would be a different
dynamic.
Yeah.
Yep.
What are the biggest challenges with this?
Like, what are, yeah, what do you guys find are the biggest, maybe conceptual, technical
challenges that you're facing?
Which challenges do we face right now?
It's like, I've got to admit that one of the biggest, in terms of conceptual challenges, we
face currently is how to make the protocol which powers the proof market, like, you know,
secure and decentralized enough. I mean, how to arrange this protocol in the way so it could
handle the proof market and similar applications. Because, I mean, again, a lot of data, a lot of
computations, we cannot deploy that on a roll-up. We cannot deploy that on Ethereum. It has to be
something different. It has to be, like, something different. And that's, like, one of the,
that's like one of the challenges, the architectural challenges
we're thinking
about right now. That's
in terms of, you know, like, tech stuff.
So we already realized
what a solution which can handle this has to be.
I mean, it has to leverage
those things which come from
the DBMS industry. It has to leverage,
like, it has to leverage, like dynamic
sharding. It has to leverage security
techniques which were developed
within crypto industry. And only
the combination of those
would be able to handle basically what
is required for this and maybe not only this.
There is also the nuance that inside the proof market,
like, the proof market and proof-market-like applications, right,
they have to be able to access Ethereum data,
because the majority of what's going on around proving happens around Ethereum,
and you've got to have data to put that as an input to the verifiers
for each proof which comes from a proof generator.
So we basically have got to be able to access Ethereum's data
from the inside of the protocol, from the inside of the proof market,
just like it was run on Ethereum, but not on Ethereum, because Ethereum cannot handle the load.
So this weird combination, weird combination of two industries, is what bothers us currently,
and it is what, in terms of, like, architecture, we're facing the most challenges about right now.
It's the, you know, how to handle these proof-market-like applications,
how to handle applications
which require a lot of data,
a lot of, like, application stuff,
transparent data access,
how to handle that,
so this could still be aligned with Ethereum.
Cool.
Can you talk a bit about
what is the product roadmap?
As I mentioned already,
there are basically like two big chunks.
It's like the first big chunk
is the combination of the proof market plus compiler.
It's like, in here,
in here,
the thing is about
supporting as many applications as, like, makes sense for them,
it's like, you know, helping as many applications as possible, right?
Basically, that's what it is.
And we target, for example, to introduce the notion of ZK gaming.
I mean, not just, you know, a Space Invaders-like thing, just like it was with Dark Forest, right?
I mean, Dark Forest is good.
I mean, I love it.
Space Invaders is good too.
But we want to introduce a more widespread,
it's like, a more interesting notion of ZK gaming.
It's like, we want to be able to prove 3D games,
we want to be able to prove, like, I don't know, something really weird,
what happens, for example, in a real action game, like, on Ethereum.
So that's like, that's like the thing.
Another thing is that on our side, not, like, entirely on our side, there's a zkML
extension basically coming for the compiler, which would allow proving ML models
to Ethereum.
And when I'm talking about proving a model to Ethereum,
I'm not talking about proving something, like,
trivial, or something similar to, I don't know,
you know, just a classifier or something.
I'm talking about proving whatever you want, basically,
because you can produce circuits big enough.
So that's the thing.
And the guys which are doing this, as an example again,
they target to prove GPT-2 to Ethereum.
Is it useful?
I don't know.
Is it fun?
Yes, it is. So that's what it is. That's the roadmap. That's the roadmap regarding the circuit
compiler plus proof market. There's also a thing in the roadmap that we want to be able to
produce a zkEVM via zkLLVM, because we think that's going to be, like, an important milestone for
the Ethereum community, because it's about the security of zkEVM circuits. We do not want anybody
to get hurt by under-constrained circuits.
So we try to basically compile the EVM with zkLLVM.
So we could say something like, okay, guys, here goes a zkEVM which is easily auditable,
which is, like, you know, secure, which was not written manually, but was generated
automatically.
And the compiler is kind of, you know, proven.
So it's good.
It's fine.
And it definitely is not under-constrained, and nobody, nobody's going to lose, you know,
nobody's going to lose anything because of that.
So that's, like, the part around the compiler plus proof market.
Now, the part around the proof market plus DBMS thing.
Well, again, it's like, as I already mentioned,
we've got to get to that point when we'll be able to make the protocol
which powers the market accessible, to be set up by the developers,
for people to be able to see that, okay, well, mixing the two industries makes sense,
that, okay, it's like, there's, like, you know,
there's enough, there's enough scalability,
there's, like, you know, a lot of data which can be handled,
there's a lot of computation which can be handled,
and you can, like, leverage, I don't know, transparent access to Ethereum data.
And currently we're trying to make that accessible by everybody,
by everybody, by the developers, not just us.
Because we've already been running this in, like, beta mode, in private, for, I don't know,
seven, eight months.
And I mean, we want to make it public.
We want to make it public for some, for some,
for somebody to say that, okay, it was worthwhile.
It's good.
It works.
So what is the timeline here?
Like when do you guys go to like a main net and how do you see like if things go well?
How do you see the projects sort of developing over the next two years?
Look, it's like, just to avoid confusion,
it's like, we have no such thing as, we have no such thing as mainnet in terms of, like, you know,
a DBMS or something, because it's...
It's a DBMS, right?
So I don't know if there is, like, such a thing as a mainnet in that.
So the second thing is, the second thing is, in terms of when the beta, when the beta mode ends,
like, we target the beta mode to end, like, before, before the middle of November,
before DevConnect Istanbul.
So we target, by the event, to have all of our verifiers, having all of our, like,
having all of our endpoints and everything, everything, like, in production,
deployed on Ethereum, like, on mainnet Ethereum.
Right.
So if we're talking about, like, production or something for the protocol, for the DBMS,
there will still be quite a significant time span when we're going to be testing this,
just like in the public beta.
And it's going to run for some time.
And so, you know, it's still, it's still going
to be tested in public.
So, well, we have plans.
We have plans.
But it's like, they're too ambitious for me to tell you about them.
Thanks so much for joining, Misha.
I think this is, like, a really cool area of ZK.
And I think it's going to be super fascinating as we see this getting integrated in different
crypto applications.
Yeah.
And so, yeah, thanks so much for coming on.
I'm excited to see how =nil; Foundation and the ZK marketplace are going to play out,
and, you know, how in general the impact of ZK on crypto is going to be.
Thanks for inviting me.
It was nice.
Cool.
Well, thanks so much for listening, for tuning in.
If you want to support the podcast, make sure to leave us an iTunes review.
And we look forward to being back next week.
Thank you for joining us on this week's episode.
We release new episodes every week.
You can find and subscribe to the show on iTunes, Spotify, YouTube, SoundCloud,
or wherever you listen to podcasts.
And if you have a Google Home or Alexa device, you can tell it to listen to the latest episode of the Epicenter podcast.
Go to epicenter.tv slash subscribe for a full list of places where you can watch and listen.
And while you're there, be sure to sign up for the newsletter, so you get new episodes in your inbox as they're released.
If you want to interact with us, guests, or other podcast listeners, you can follow us on Twitter.
And please leave us a review on iTunes.
It helps people find the show, and we're always happy to read them.
So thanks so much, and we look forward to being back next week.
