Epicenter - Learn about Crypto, Blockchain, Ethereum, Bitcoin and Distributed Technologies - Mance Harmon: Hashgraph – A Radically Novel Consensus Algorithm
Episode Date: January 25, 2018

Hashgraph is a new consensus algorithm that radically differs from proof-of-work as well as proof-of-stake consensus algorithms. While work on Hashgraph began in 2012, its design is radically different from today's blockchain architectures. The Hashgraph team claims that it has found an optimal consensus algorithm design that will be impossible to significantly improve upon. We were joined by Mance Harmon, who is CEO of Swirlds, the company developing Hashgraph. Our conversation covered the origin story of Hashgraph, how it compares to existing consensus algorithms, and how Hashgraph works.

Topics covered in this episode:
- Leemon Baird and Mance Harmon's long history of building companies together
- What motivated Leemon Baird to start working on Hashgraph in 2012
- The existing categories of consensus algorithms and their problems
- How Hashgraph consensus combines voting and gossip protocols
- The performance characteristics of Hashgraph
- What a public Hashgraph network could look like

Episode links:
- Hashgraph Homepage
- Hashgraph Whitepaper
- Hashgraph Consensus - Detailed Examples
- Sybil Attacks in Hashgraph
- Hidden Forces Podcast Episode on Hashgraph
- Leemon Baird's Talk on Hashgraph at Harvard Business School

This episode is hosted by Brian Fabian Crain and Meher Roy. Show notes and listening options: epicenter.tv/219
Transcript
This is Epicenter Episode 219 with guest Mance Harmon.
This is a show which talks about the technologies, projects and startups driving decentralization and the global blockchain revolution.
My name is Brian Fabian Crain.
And I'm Meher Roy.
Today we have a very interesting episode lined up for our audience.
We are going to interview Mance Harmon of the project Hashgraph.
Hashgraph is pioneering a very unique consensus algorithm that offers something super
interesting to the blockchain consensus toolset. Mance, welcome to the show.
Thank you. Thank you for having me. Glad to be here. So we're going to discuss
hashgraph during the episode, but perhaps before we get into consensus algorithms, hashgraph and
such complex topics, tell us a bit about your background and how you came to be involved in the
blockchain space. Sure. Well, I have a deep tech background. I started off
my career doing research for the Air Force senior scientist for machine intelligence. I worked on a
team of five doing basic research and machine learning, specifically reinforcement learning and the
combination of reinforcement learning with neural networks, convolutional networks, back in the early and
mid-90s before it was known as deep learning. Incidentally, that's where I met my business partner
and co-founder and creator of Hashgraph, Leemon Baird, Dr. Leemon Baird. We were on that same team
working together for the senior scientist. I also taught computer science at the Air Force Academy.
I was a course director for cybersecurity there at the academy. Leemon was also at the academy
and was a full professor at the academy. I then went off and managed a massive software program
for the missile defense agency for the U.S. government,
basically a massive simulator that allowed the government
to learn how to protect its citizens and its allies
from incoming ballistic missile attacks.
Leemon and I decided we wanted to become entrepreneurs.
So we started our first company.
I left the military,
and it was an identity and access management company,
a distributed single sign-on solution back in the days of the Palm Pilots around the year 2000,
and sold that in 2004.
Went to work for the acquirer, a Fortune 500, and became the senior exec in the company for
product security, stayed in that role for a short amount of time, and Leemon and I decided
to start our second company, again in the space of identity, and ran that for six and a half
years, sold that to private equity, ended up going to work for a friend, Andre Durand, who's the CEO of
Ping Identity. I became the head of labs. I stood up the Labs organization for Ping and also
headed up architecture for Ping. In 2012, Leemon decided that he was seriously interested in this
space of distributed consensus, sort of independently of anything that was going on with blockchain and
Bitcoin. You know, his vision required a far more performant and secure approach than what's required
for just a cryptocurrency. And so he went to work, and he sort of gnawed on this problem for
years. In 2015, he got to the point where he'd solved the problem, and it's what we now call
today Hashgraph. It was in that role at Ping that this happened, while I was still at Ping.
And so I decided to leave Ping and started our third company with Leemon, mine and
Leemon's third company. And it just turns out that Ping happened to be a first investor in Swirlds,
the company that's commercializing Hashgraph, the technology. And that's how we got to where
we are today.
Cool. Well, you guys certainly have an interesting background that's a bit unusual compared to many
in the blockchain space and the distributed ledger space.
I think more experienced than probably your average person in this industry.
Well, Leemon has been inventing things for as long as I've known him.
It just happens that what he invented this time was exceptionally good
and at exactly the right time in the market, right?
And so, you know, Hashgraph is, you know,
it's not unusual for Leemon to come up with things.
There are a lot of inventions
that I'm sure we'll talk about some other time,
but Hashgraph is what we're pursuing today.
Cool.
Yeah, I mean, of course, we will speak more about Hashgraph later
and how exactly it works,
but I do think it's interesting when people approach
this problem of distributed ledgers and blockchain,
or, you know, the thing these technologies are trying to solve,
without having sort of gone
through the sequence of Bitcoin and then what came next, right?
I think that kind of enforces a certain way of thinking.
But still, it was interesting to hear that Leemon started working on this in 2012.
So was he aware of Bitcoin back then?
Did he kind of follow it?
And did that in any way kind of shape his thinking that say, okay, I'm not going to do it that way.
Or I see some problems there.
And that's going to inform the design of Hashgraph.
Do you know kind of how that went?
Yeah.
Oh, no, I know exactly how it happened.
I mean, Leemon and I, this was in Austin, Texas,
by the way, Cedar Park, a suburb of Austin. And there's a particular Starbucks on the
corner. Leemon and I lived a mile apart, and there's a Starbucks on the corner, and we spent many,
many, many evenings at that Starbucks talking about the different approaches that he had been taking.
And the simple answer is no. I mean, he did know about Bitcoin, but it didn't really inform
his thinking at all because the problem set, the use cases that he had in mind, he understood
very clearly early on that Bitcoin would never be able to address them. And so it wasn't as though
he said, well, here's how blockchain works and here's what I don't like about it, et cetera,
et cetera. It was more like, I have this, you know, clean slate to start with, a sort of
tabula rasa approach to coming up with an algorithm
that can address this set of constraints that I have that are implicit in the use cases we care about.
And so, I mean, clearly he understood what was going on in the market for Bitcoin,
but it really had no influence on the design of Hashgraph at all.
Cool.
Well, even though Hashgraph seems to be so independent in its origination and kind of been developed in parallel,
You guys, when you explain Hashgraph, often kind of analyze, okay, these are the existing
type of kind of consensus systems in blockchain, how they work.
Would you mind kind of running us through how you view those different categories and
your perspective on the strengths and shortcomings of them?
Sure.
Yeah.
Well, so when Leemon had the inspiration of gossip about gossip, you know,
that enables this approach, and we'll talk later about what that means. It had a unique set of
properties in the market. And what we did at that point was begin to do a serious deep dive into
the full range of categories of consensus algorithms to just explore how hashgraph was different
than all of those. And it turns out that there are only a handful of categories. Now, Leemon
had done an exhaustive literature search, and so he understood what
these approaches and categories were, but when you put it on paper and you start, you know,
to point out the differences, the strengths and weaknesses, you can more clearly see,
really, you know, the power and performance of Hashgraph as it relates to the field. And so
the field is pretty small in terms of categories. In the permissioned world, you know, Hyperledger
and Corda from R3 and EEA, et cetera, of course, they don't
use blockchain. They've replaced proof-of-work blockchain with what we'll call, as a category,
leader-based algorithms. And these are represented by Paxos and Raft and PBFT, practical Byzantine fault
tolerance. And there are dozens of variants on this category. And it works very simply. The nodes in
the network elect a leader. They send all of their transactions to the leader. The leader has
responsibility for taking those transactions, putting the transactions in an order, and then sending
those transactions in that order out to every member of the network, out to every node. The nodes, of
course, just take them in that order and commit them to the underlying database in the same order,
and that's how they keep the databases in sync. So this is okay, but it doesn't scale that well. You can,
you know, get to dozens of nodes. You cannot scale to thousands or tens of thousands of nodes.
You know, in fact, in the literature, the most I've ever seen is about a hundred nodes
that scale in this manner. The problem is that there is a leader, and because there's a leader,
that's a bottleneck. These systems top out at about a thousand transactions per second, roughly
speaking. If you look at IBM and, you know, Hyperledger, and Coco, Microsoft Coco, and Corda
and others, they'll announce or they've marketed the fact that they can achieve a thousand
transactions per second. They're all using this category. And the problem is that there's a
leader, so there's a bottleneck in terms of scalability, both in the number of nodes and the number
of transactions per second. But more importantly, maybe more importantly, is the fact that
it's vulnerable to distributed denial of service attacks. And nobody ever really talks about this.
But each member of the network, by design, has to know the IP address of the leader,
or at least how to communicate with the leader.
And if that's the case, then it becomes possible for any node that is compromised,
whether that be an insider threat, you know, a disgruntled employee,
or the node has been compromised by an attacker from, you know, a foreign country, et cetera,
whatever the case might be, if the attacker can direct a botnet to
do a DDoS attack on the leader, then they can attack one computer at a time and stop the
communication flow for the entire network.
Now what happens is that the network recognizes the leader has gone offline.
And so they decide to elect a new leader.
That happens very quickly, by the way, you know, within a couple of seconds normally.
But the attacker knows the IP address of the new leader.
And so they just change the target of the attack and follow the leader with this DDoS attack.
So this is a fundamental problem in the architecture.
So the other problem is that it's not fair in a certain technical sense.
So there are use cases where maintaining the actual order of events as they flow into the network is extremely important.
For example, maybe a distributed stock market.
If it's possible for a single party to prevent bids or asks from flowing into the market, or to influence the order of those transactions to unfair benefit, then that's a problem, right?
And with a leader-based system, the leader has responsibility for exactly that, for putting the transactions in an order.
And if the leader so chose, they could drop transactions, they could influence the order to unfair advantage.
And so, in other words, the leader could be bribed.
So this category of use cases that require fairness is fairly broad.
It's not just trading platforms.
It could be games.
If you had an MMO and, you know, people in the game environment, two people reach over to pick up the pot of gold at the same time,
the one that actually pushed the button first should get the gold.
It shouldn't be possible to influence that.
Or auctions.
If you're creating an eBay, a distributed eBay,
the person that pushed the button last should actually win the auction.
And so this fairness property is really important for a large group of use cases,
and leader-based systems just inherently are not fair.
So that's the problem with the leader-based category.
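[To make the leader-based pattern concrete, here is a minimal sketch in Python. It only illustrates the shape these systems share, not any real Paxos, Raft, or PBFT implementation; all names are made up, and leader election, acknowledgements, and failure handling are omitted.]

```python
class Replica:
    def __init__(self):
        self.log = []     # transactions in the leader's order
        self.state = {}   # toy key-value "database"

    def apply(self, seq, txn):
        # Commit strictly in the order the leader assigned.
        assert seq == len(self.log), "gap in the ordered log"
        self.log.append(txn)
        key, value = txn
        self.state[key] = value

class Leader:
    def __init__(self, replicas):
        self.replicas = replicas
        self.next_seq = 0

    def submit(self, txn):
        # The leader alone assigns the order: the throughput bottleneck, the
        # DDoS target, and the party able to drop or reorder transactions.
        seq = self.next_seq
        self.next_seq += 1
        for replica in self.replicas:
            replica.apply(seq, txn)

replicas = [Replica() for _ in range(4)]
leader = Leader(replicas)
leader.submit(("alice", 10))
leader.submit(("bob", 5))
assert all(r.log == replicas[0].log for r in replicas)  # databases stay in sync
```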
Of course, the public networks don't use leaders, or most of them don't.
We're beginning to see some of the public networks beginning to use some types of leader-based technology.
But proof of work blockchain is inherently different.
All the transactions flow to all of the miners in the network.
The miners collect all those transactions, put them into a block, and then they compete with each other.
to solve a hard crypto puzzle.
This is the proof of work part of this.
And the miner that solves the puzzle first
earns the right of publishing their block of transactions
to the rest of the miners
that then gets put on top of their local copy of the chain,
the blockchain.
The miners unilaterally decide
which transactions go into the block.
So it isn't fair in the sense
that they can prevent transactions
flowing into the network.
And if they put it in the block, they decide the order of the transactions within the block.
And so it's not fair in ordering either.
And so proof-of-work blockchain, and blockchain in general, where there is a miner that decides which transactions go in the block and the order, is not fair.
Now, proof-of-work blockchain is more secure than leader-based systems because you can't predict which miner is going to win the crypto puzzle first.
Who's going to solve the inverse hash puzzle first?
And therefore you can't do a DDoS attack against the network
in the same way you can a leader-based system.
So it's more secure, it's not fair.
Of course, performance is terrible compared to leader-based systems today,
a few transactions per second and extremely expensive
when using proof of work.
And so those are two of the major categories.
Now, in response to the problems of proof-of-work blockchain, the market has come up with a new generation of technologies that try to replace proof of work with other mechanisms.
We call these economy-based solutions for the simple reason that the approach is to make a fundamental assumption about the stakers; in this case, we've changed the term from miners to stakers for
economy-based systems. The stakers are rational and they're going to do what's in their own self-interest.
In other words, they're going to try to maximize the amount of money that they make. So that's an
assumption. And then we come up with a system of incentives that when applied across all of the
stakers will result in sort of an emergent behavior that the stakers come to agreement on the
blocks that should go on top of the chain. In other words, they come to consensus.
And there are various ways of doing this.
But what's important is to understand that, one, the fundamental assumption that the stakers are rational is not a good assumption.
Of course, we assume that those running the nodes, the owner of the hardware that's running the node in this economy-based system, wants to maximize the amount of money
they make. But if that node is compromised by a virus or a disgruntled employee or anything,
then that assumption is out the window. It's not a good assumption. The virus doesn't care
whether or not the person that owns the computer makes money. The virus only cares about
bringing down the network. That's the point of the virus, perhaps. And so it's not a good
assumption to assume rationality as the basis of consensus for the entire solution, number one.
And then secondly, the whole approach is extremely complex in the sense that it's impossible
to formally analyze it in the sense that we would formally analyze an algorithm and write math
proofs of certain properties about that algorithm. The community is talking about proving
formally that code is correct and that the code will do certain things. For example,
a lot has been made that we can write proofs that certain smart contracts will result in the
staker, either winning more money or losing money once they've entered into the contract.
That's not the right question to be asking, right? We care about that, but that's not what we
fundamentally care about. What we fundamentally care about is,
whether or not a single attacker can do something, maybe unknown to us today, but can figure
out a way to attack the network and cause the network to basically come to a standstill,
cause the network to not be able to come to consensus. That's the math proof we care about.
We need to know for certain, guaranteed, that it's impossible for a single attacker to prevent
the network from coming to consensus. And those proofs don't exist for the economy-based model.
And they don't because the system is complex. It's chaotic. It's an economy. It's a model of an
economy. And so for the same reasons, if it's not possible to write a formal proof that, for example,
a stock market will never crash again because it's too complex and it's a chaotic system,
you can't do the same for this category of approach, these economy-based solutions.
So those are three, right?
Leader-based system, proof-of-work blockchain, economy-based solutions.
The question then is, well, what can you do formally?
I mean, we want to guarantee the best security possible for the platform
if this platform is going to be processing trillions of dollars of value.
That's fundamentally important.
and we want it to be extremely fast.
We want to be able to scale up way beyond,
orders and orders of magnitude beyond where we are today.
So what do you do?
It turns out there is another category,
we'll call pure voting-based algorithms.
They go back decades, 30 plus years,
and they work in the following way.
Let's assume perhaps we have 10 nodes in our network.
And if I want those nodes to agree on the order of transactions, I can ask those nodes to vote on a question that's related to the order.
And what that means is that each node, I ask each node the same question, each node will cast their vote.
And when they do so, they send it to every other node in the network.
And then when a node receives a vote, it sends an acknowledgement, not just back to the sender,
but again to every other node in the network.
And then there may be multiple rounds of this voting process
that's required for those nodes, those 10,
to come to agreement on the order of transactions.
And if you have a stream of transactions,
it becomes even more complicated.
The point here is this.
The amount of bandwidth required,
the amount of traffic that's flowing just to accommodate
the voting process just explodes.
And it explodes in the number of nodes
and then the number of transactions.
And so for that reason,
this category is not practical at scale
and no pure voting-based system exists at scale today.
However, this approach has phenomenal security properties.
These pure voting-based systems have been shown
to achieve the gold standard in terms of security,
provably in the mathematical sense that I've been describing.
And that standard is called asynchronous Byzantine fault tolerance.
Everything else prior to this point has been something less than that.
And only these pure voting-based systems have been shown to achieve asynchronous BFT,
but they've not been able to achieve it at scale.
So those are the four.
Leader-based systems, proof-of-work blockchain, economy-based solutions,
pure voting-based systems that are theoretical and don't scale.
The question then is, how do you achieve the properties of these pure voting-based systems?
Yeah, well, maybe let me just, like, jump in here a little bit.
Yeah, please.
Because I feel there's a bunch of stuff, you know, that maybe I don't fully agree with.
So, you know, for the last year, as I'm sure most listeners know, a bit more than that,
we've been working on Cosmos and Tendermint.
And of course, what Tendermint is, is essentially, right, an implementation of PBFT.
So it has kind of these voting-based properties, right, and proof of stake.
I think game theoretically, it's extremely safe.
In terms of performance, we have done some benchmarking with, like, I don't know, 100 nodes, a few hundred nodes,
and something like 10,000 transactions per second.
Sure, sure, sure.
And now, and again, with the leader-based approach, there is a validator that proposes each block, right?
So there is kind of a leader.
And it is, of course, an interesting question, like, how can they be attacked?
I think DDoS is something that can be mitigated.
I think the point you made about fairness is a more fundamental one.
And that seems to be, I can see that being a massive issue.
And of course, we are going to speak about that. And I think where you make a good point
is that even though these systems seem to scale pretty well up to a point, right?
And the point may be a few hundred validators or something like that, but, you know, it will not be,
you know, you won't be able to do 10,000 validators, right?
Like that seems clearly infeasible or even a thousand, you know, your performance will get much worse.
And of course, if you only have hundreds,
That may be enough if you're sharding, right?
In a single shard, it may be sufficient to only have hundreds of validators to ensure security of the system.
Well, I mean, my thought on this is, actually, a hundred validators is perfectly fine
if those are, like, you know, independent, running their own separate systems, with some sort of reasonable distribution of power.
I mean, we look at Bitcoin today, and, you know, for a long time it was maybe three miners or three mining pools that actually have a majority.
Yeah.
And that's, you know, concerning, even though so far it's worked.
So if you have 100 in this reasonable distribution, then I think that's fine.
Of course, the question is always, okay, who are these validators?
What are their incentives?
How much power do they have?
How are they associated?
You know, you kind of made the point before, oh, someone could have, like, a virus and bring the system down. Well, I think then the question is, of course, if they all run the same hardware and the same setup, then, yeah, that could be a risk.
So diversity in their setup matters as well.
Well, certainly there's that, but it's, you know, it's still the question of, is there a way to bring down the network, period?
And unless you can rule that out... So, what we always have done throughout our careers,
Leemon and I, is start with first principles, right? Start with the fundamentals and start with
the math and create the best, most solid foundation possible. And once you have the best building
block, then the building you build with those blocks turns out to be fundamentally different
in some meaningful ways than if you start with something that's not quite as strong.
You know, it has a flaw in one way or another or can be attacked in one way or another
because these vulnerabilities compound, right?
And so, for example, just to make the point, what Leemon has designed is a way to scale this,
you know, a sharded system, a protocol for sharding, that maintains the asynchronous BFT nature across the entire system.
Right. Now, this is not yet public. It will become public in the coming weeks and months.
But, you know, in most cases, when you shard, the security of the system as a whole is degraded.
In this case, because we've started with asynchronous BFT as the building block,
it turns out you're able then to create a sharded solution that system-wide is asynchronous BFT.
And then I would say just one other thing.
There are a number of different platforms out there that are experimenting with a combination of these different architectures.
And they're doing so because they need to achieve scale and they want to achieve high throughput.
etc, et cetera, et cetera.
So they're doing it for good reasons,
and I can appreciate why they're making these combinations.
The problem is that when you do create the combinations,
you may improve certain aspects,
but you inherit the vulnerabilities of all.
And so the vulnerabilities associated with leader-based
and the vulnerabilities associated with proof-of-work blockchain
or economy-based solutions,
if you combine them, you have then a,
you know, a larger set of vulnerabilities to defend against than if you stay pure to one of the categories.
So before we move on to Hashgraph, I'd like to know sort of
the history and sort of genealogy of voting-based protocols.
So, like, Leslie Lamport and Shostak, they defined the Byzantine Generals' Problem in 1982
in a paper, right?
And they define the problem as being,
there's a city,
and there's a set of generals around the city,
and these generals must coordinate and attack at the same time.
Yet some of the generals are probably corrupted,
and will try to make sure that part of the generals attack at one time
and the other part attacks at a different time.
So you have traitors inside this group of generals,
but despite these traitors, these generals must come to an agreement on when to attack.
And Byzantine fault tolerance, like the Byzantine generals' problem,
stayed an open problem for quite a while, until algorithms that were based on leader election,
and voting following leader election, were kind of proposed.
So the idea would be one general would become a leader, that general would propose a time to attack,
the other generals would vote, and then once enough generals vote affirmatively,
you could like sort of collect these votes and attack at that time.
And the problem there also was that none of these protocols in a computer science algorithm,
like in the performance sense were performant.
And finally, in 1999, at MIT, there was a PhD thesis and paper published on practical Byzantine fault tolerance.
I think it was Castro, Miguel Castro, that published that paper.
And practical Byzantine fault tolerance showed for the first time that in a leader-based system,
and there's a leader and other people voting on what the leader proposes,
that that can be made practical and turned into a computer system that can be used to vote
not just on one piece of data like when to attack, but on thousands of transactions.
So in the family of leader-based algorithms, you see this pattern.
Like there's an idea.
It's not practical for a decade or decade and a half,
and then some person comes and makes it practical.
in a system.
And at some level, with Hashgraph, what you're saying is there's another family, that's the voting-based algorithms, that's different from this genealogy, this part of the consensus tree.
And that has existed as an academic idea, and what you're doing is you're making it practical.
Can you point us to, like, who came up with this family? And, like, do you know
what the important papers in voting-based algorithms are?
Yeah, well, so thank you for the genealogy.
I think you've done a better job there than I could have done off the cuff.
Leemon.
Thank you.
That was good.
It was great.
So I can't tell you off the cuff what the papers are by reference, but I certainly could get the references.
The starting point is the same, right?
The pure voting-based algorithms that go back to the 70s and 80s are the starting point.
And then there are sort of two trees, if you will, that would branch.
One is the Paxos and Raft and PBFT tree that you've mentioned.
And by the way, the whole DDoS set of attacks and problems that are associated with this category,
there is a phenomenal paper.
Again, I can get you the reference and you can provide it, you know, when you publish this.
It deals specifically with PBFT and how it's vulnerable to these various types of
timing-based and DDoS attacks.
But so, so that's a branch.
And then there hasn't been another branch.
I mean, there's no, there's no in-between paper from the 70s to Hashgraph.
You know, Hashgraph as an approach is brand new.
And what makes that possible for the very first time is this inspiration that, if you're going to gossip information,
and you know you have to send transactions across to everybody, you can do something special,
a little bit special there, that makes it possible to use this internal data structure that you build up, which is the hashgraph.
And then combine that with the use of a pure voting-based algorithm without ever having to cast the votes.
That's entirely new.
And it's not as though there are steps from the 70s to 2015.
There's one step from the 70s to 2015, and that was Leemon and the hashgraph.
So I think you just kind of summarized hash graph, but I'm pretty sure nobody would be able to understand it from that.
So you said, right, the gossip.
So we have the gossiping of transactions, and then kind of a data structure emerges that is called the hashgraph,
and that can somehow be combined with this voting type of
approach to create a consensus algorithm.
So maybe we can unpack that exactly what's going on here.
First of all, gossip.
I don't think it will be clear to everyone what that means.
Can you explain what gossip is?
Well, gossip protocols have existed for decades as well.
it's a common approach for sending information to a large population very efficiently over a data network.
And the idea is, you know, it gets its name from what you observe in the workplace or in your social circles.
You know, one person tells another person a rumor, and then that person tells somebody else, the same rumor,
and the rumor just, you know, flows through the population exponentially fast.
Alice tells Bob the rumor and then, you know, at time T plus one, Bob is telling Ed and Alice is telling Charlie the rumor, etc.
It flows to the population exponentially fast.
That is a gossip protocol, and you can implement that in data networks.
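[A toy Python simulation of the dynamic just described, under simplified assumptions: each period, every node that knows the rumor tells one randomly chosen peer, and the rumor reaches all n nodes in roughly log2(n) periods. Real gossip protocols add peer sampling, retries, and anti-entropy; this is only an illustration.]

```python
import math
import random

def gossip_periods(n, seed=42):
    """Count the periods until a rumor started by node 0 reaches all n nodes."""
    random.seed(seed)
    informed = {0}
    periods = 0
    while len(informed) < n:
        # Every informed node tells one random peer (possibly already informed).
        for _ in range(len(informed)):
            informed.add(random.randrange(n))
        periods += 1
    return periods

n = 1024
print(gossip_periods(n), "periods; log2(n) =", math.log2(n))
# Prints a small multiple of log2(n): exponential spread, logarithmic time.
```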
And the question then is, what is it that you are gossiping about?
And you can gossip about a lot of different things.
You can gossip identity information.
you can gossip transactions, sort of opaque, arbitrary transactions that are only understood by an application that sits above.
What we do is we gossip about gossip, and that's going to be a little bit hard to wrap your head around initially.
Every node in the network has a local copy of a database, and the goal is for all those databases to stay in sync.
Some people call them a ledger, right?
Have the ledgers stay in sync.
If I write a transaction, if I create a transaction and I want to update my ledger, that transaction
has to go to every other node in the network so that they can update their copy as well.
So all transactions have to go to all ledgers.
That's the minimum bandwidth required for this gossip protocol.
What we do is create a graph that memorializes when people
talk to each other, when nodes talk to each other. And so if Alice talks to Bob, Bob creates a circle
that goes into this graph. If you were to draw the graph, you know, it's got lines and circles.
Bob creates a circle that memorializes this syncing event between himself and Alice.
And as part of that circle, Bob creates some transactions that would be understood by the
application, and they're the payload that then goes in this circle. All right. Also in the circle is
encapsulated a timestamp and a couple of other hashes, and I'll get to what those are in just a
moment. But that's it. You know, there's a data structure that represents this circle, and the
data structure is a payload of transactions, a timestamp, and then two hashes, and that gets signed.
That data structure gets signed.
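[A sketch of the event, the "circle" just described, assuming fields for the transaction payload, a timestamp, the hash of the last event this member created, and the hash of the last event handed to them in a sync. The field names and the use of SHA-256 and JSON here are my own illustrative choices, not the actual Swirlds data structures, and the signature is a stand-in.]

```python
import hashlib
import json
import time
from dataclasses import dataclass, field
from typing import Optional, Tuple

@dataclass(frozen=True)
class Event:
    creator: str
    transactions: Tuple[str, ...]  # opaque payload, understood by the app above
    self_parent: Optional[str]     # hash of the last event this creator made
    other_parent: Optional[str]    # hash of the last event received in a sync
    timestamp: float = field(default_factory=time.time)
    signature: str = ""            # placeholder; a real event is signed

    def hash(self) -> str:
        body = json.dumps([self.creator, self.transactions, self.self_parent,
                           self.other_parent, self.timestamp], sort_keys=True)
        return hashlib.sha256(body.encode()).hexdigest()
```

[When Bob syncs with Alice, he creates one new Event whose other_parent is the hash of the last event Alice gave him; those two parent hashes are the "gossip about gossip".]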
And maybe just to sort of bring this point up here, gossip protocols are being used, right,
by existing blockchains like Bitcoin or Ethereum or others.
And they're basically used to, again, like propagate transactions, fill up the mempool,
propagate blocks, right?
But they're not used in the consensus itself, right? They're kind of like separate networks.
Or in Bitcoin, I think there's even something, some like a relay network between miners
to propagate blocks faster?
Does that use also some kind of gossip protocol probably?
They're used everywhere.
I mean, gossip protocols as a category are used all throughout all kinds of applications
across computer science.
So yes, it's a very well understood approach,
and we're certainly not unique in the fact that we are using the gossip protocol.
What we're unique in is what we're gossiping.
We're gossiping the transactions as part of that event.
but we're also gossiping two hashes that represent the last event that was passed to me by somebody else.
That event with its package of transactions gets hashed.
That hash goes into my event, this new event, and the last event I created, it gets hashed,
and its hash goes into my event.
And so this event, with all of the different data items,
that I've said, plus the two hashes that link back to two prior events, that then gets gossiped
to the network. And when members of the network receive these events through the gossip protocol,
they can, by using these hashes, create the hashgraph. The hashes make it possible
to link together these events in a certain order, and that creates the hashgraph. And provably,
each node in the network ends up having the same hash graph.
They're identical, at least through a point in time.
They're identical.
And so new events are always flowing in on top.
You're always gossiping and talking to other members of the network.
The network just runs as fast as it can.
And at a moment in time, it becomes the case that all the events prior to that moment in time
are known by all the members of the network.
And you agree on the hashgraph.
So that's important.
That's the gossip about gossip that we're talking about.
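[A sketch of how each node can rebuild the identical DAG locally from whatever events arrive, using only the two parent hashes; it reuses the illustrative Event class above. An event is inserted once both of its parents are known, so every node converges on the same graph up to a point in time. This is an assumption-laden simplification, not the real data structure.]

```python
class Hashgraph:
    """A local copy of the DAG, keyed by event hash."""

    def __init__(self):
        self.events = {}    # event hash -> Event
        self.pending = []   # events whose parents haven't arrived yet

    def add(self, event):
        self.pending.append(event)
        self._integrate()

    def _known(self, h):
        return h is None or h in self.events

    def _integrate(self):
        # Insert any pending event whose parents are already in the graph;
        # repeat until no more progress, since one insert can unblock others.
        progress = True
        while progress:
            progress = False
            for ev in list(self.pending):
                if self._known(ev.self_parent) and self._known(ev.other_parent):
                    self.events[ev.hash()] = ev
                    self.pending.remove(ev)
                    progress = True
```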
Then, and the inspiration was that if we do that,
there is enough information there that we can take a pure voting based algorithm
and use the information in the local copy of the hash graph.
And instead of sending votes over the network,
you know, asking each member of the network to cast a vote on a certain
question, we just locally say for this question, what would each member of the network vote
if they were to cast the vote? And there's enough information in this hash graph to answer that
question for every member. And so it's gossip about gossip with virtual voting. You never have to
send a vote over the network and you're done. That's it. And it maintains the properties of the
pure voting-based systems in the sense that it's asynchronous BFT. It achieves for the first time ever
the gold standard in security as it relates to distributed systems, and it does it at scale.
And also, in addition, because you have to send the transactions to everybody, that's the baseline,
and all we're adding is two hashes to that event, that message that's being sent,
we don't think you can do any better in terms of bandwidth efficiency.
So we've achieved the best one can achieve in terms of bandwidth efficiency,
and we've achieved asynchronous BFT for the first time.
That's why we make, I realize it's a really bold statement,
but that's why we make the claim that Leemon has actually solved
the problem of distributed consensus at scale with the hash graph.
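[A sketch of the virtual-voting primitive only; the real algorithm's rounds, witnesses, and fame decisions are far more involved, see the whitepaper. The key idea: from my local hashgraph I can compute exactly which events the creator of any event had seen when creating it, and therefore how that member would vote, without any vote being sent. Function names are illustrative.]

```python
def ancestors(graph, event_hash):
    """Everything reachable through parent hashes: all the events the creator
    of this event had seen at the moment it was created."""
    seen, stack = set(), [event_hash]
    while stack:
        h = stack.pop()
        if h is None or h in seen or h not in graph.events:
            continue
        seen.add(h)
        ev = graph.events[h]
        stack.extend([ev.self_parent, ev.other_parent])
    return seen

def virtual_vote(graph, voter_latest_event, question_event):
    # A member's event "votes yes" on another event iff it had seen it.
    # Anyone with the same hashgraph computes the same answer locally,
    # so no vote ever crosses the network.
    return question_event in ancestors(graph, voter_latest_event)
```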
It is actually pretty easy to imagine the contribution of Hashgraph, right?
Like, we can harken back to that image that there's a city and the generals gathered around the city.
So let's say 50 generals. The three of us are part of these 50,
so we each have our individual armies, Brian, Mance and me, and we are part of this group.
And the group of 50 generals, we must attack together, and we have to sort of agree on a time.
And let's say not only do we have to agree on our time, but we have to agree on a sequence of times.
So maybe we have to agree on like when will we fire the cannons, when will we deploy the artillery.
And like there's a sequence of times that we want to agree on.
And if we agree on these sequence of times correctly, then our attack will happen.
If we don't agree on it or some of us agree on one sequence and the others on another sequence, then our attack can fail.
And then there are some generals inside the camp that are traitorous, and they want half of us to attack through one plan and the other half to attack through another plan.
So, like, what is Bitcoin?
Like, Bitcoin is the idea that, hey, all of us are going to solve puzzles,
and one of us is going to win at solving the puzzle,
and then the person that wins at solving the puzzle
will announce one time to do something, like fire the cannons,
and broadcast the solution and the time of firing the cannons
to all of the other generals.
Then all of us will start working on the next problem,
and then one of us will solve that problem, and the solution to that problem is additive on the solution
of the first problem, and with the second solution they broadcast the second time to do something else,
like deploy the artillery. Then a third round will happen, all the generals will solve a different
problem, again one of them will solve it, and so on. So that's Bitcoin, effectively.
A leader-based system is one of the generals, let's say me, will become the leader.
And I'll say, let's fire the cannons at like 12 p.m.
And I'll sign the statement and I'll send this message to all of the generals through my messengers.
So generals are connected to messengers like delivery boys that can go and deliver the message.
The other generals will like receive my message and like vote affirmatively.
Right, okay, yeah, let's agree to attack at 12. And then these votes are then sent through delivery boys to
other generals. And if at some point I, as a general, receive a message with enough votes saying let's
attack at 12, I accept that as the correct time to, like, you know, fire the cannons or whatever.
And then somebody else, Mance, becomes the general that proposes the time to,
let's say deploy the artillery, all of us vote on it,
and then finally, if there are enough votes,
we agree on the time to deploy the artillery,
then Brian, let's say, creates a message,
okay, let's send in the cavalry at this time,
and then we vote, we decide on when to send the cavalry.
So that is how, like, a leader-based system will work.
And to me, Hashgraph appears to be different.
In both Bitcoin and a leader-based system,
the thing that is getting exchanged between the generals is data about the strategy itself.
At what time should we do what?
That's the application level information.
That's the application level information, right?
Like what should we do when?
Whereas in Hashgraph, what you're saying effectively is,
what if, in addition to application-level information,
when to do what,
we also record the sequence of what the messenger boys have done
in order to send these messages across to the generals?
So I'm a general. Let's say I say
attack at 12, and I send it to Mance.
Mance receives that messenger boy.
In Mance's letter it says, okay, I received a messenger boy
from Meher that told me this.
Then you add information about what you think:
I think not only should we deploy the cannons at 12,
but we should deploy the artillery at 1 p.m.
And then you also put in some past information,
when was the last time you sent a messenger boy
out to some other general.
And you compile all of this into a packet,
and then you instruct your messenger boy to go to another general.
And when the general receives your messenger boy,
he is going to create another data packet,
but include the behavior of your messenger boy
in his data packet.
And so once sort of
these messenger boys end up
delivering, and
you start to collect information about who proposed what
and how the messenger boys worked,
after a certain point in time,
everyone will agree on who proposed what and what the messenger boys did.
And based on that, they can come to the conclusion on when to deploy the cannons,
when to deploy the artillery, something like that.
That's it.
And it's great that you use that fundamental analogy for the different categories.
And that's right.
I mean, that sort of demonstrates when we say we're gossiping, we're not gossiping transactions.
Of course, the transactions go along for the ride as a payload
in the event, but the event is describing, with the two hashes, you know, who we talked to
last, effectively, right? It's the hash of the last event that was given to me and the hash
of the last event that I created. And so it's the gossip about gossip, the metadata that's on top
of the transactions that's important in creating the hash graph. And that's important also. You know,
Well, it's interesting to me that hash graph as a term is becoming increasingly used.
And, you know, there are a lot of platforms or an increasing number of platforms in the market today
that are based on DAGs, directed acyclic graphs.
And, of course, hashgraph is a directed acyclic graph, a DAG, as well.
It's a DAG.
The fundamental difference, what makes a hash graph, a hash graph,
is what that DAG represents.
And in our case, it represents the gossip flow,
the flow of the communication across the network
as opposed to the transactions.
That's how we're fundamentally different than everything else.
And we define Hashgraph to be a DAG that represents the communication across the network.
I mean, listeners will probably be familiar with something like this.
Similarly with DAGs and DAG-based systems, we did do a podcast about this before, about something called SPECTRE, which is a paper by two Israeli academics on this topic.
Now, that also seems very interesting.
So why is this gossip about gossip so fundamentally different from, like, having a DAG-based structure that has transactions or blocks?
Well, the information is not there to calculate the votes.
You have to know how Alice would reason about the information she received from Bob.
As opposed to, you know, if I just send Alice a transaction and don't tell her anything else,
she doesn't know where that transaction came from.
And she can't reason about where I receive the transaction from in the first place.
All of that information is lost if all you include are the transactions.
You really need to include the information that represents the flow of the transactions
across the network in order to be able to use the voting-based algorithms.
Right, right.
No, that makes sense.
I think that's very well put, right?
So in essence, you can kind of think of it like this, right?
What you guys are doing is almost like, continuously, all the nodes kind of update,
okay, this is what I'm seeing.
That all gets kind of exchanged, and so of course they can say, okay, now I'm using these simple rules to say,
this is the order of transactions. And I know what everyone else would do, because I have kind of synced up
on what information they've seen and in what order. Yeah. So everybody, when I talk, when Alice talks to
Bob, for example, she has a graph, a hashgraph. Bob has a hashgraph. They're identical up to a point,
and then Alice knows some things that Bob doesn't, and vice versa, and they share with each other
the delta: what Alice knows that Bob doesn't, she gives to Bob, and vice versa. At the end of that
syncing event, their hashgraphs are synced. In other words, Bob now knows everything that Alice
knows, and vice versa, and they can reason about where transactions came from, who talked to whom,
and when. And because everybody's gossiping with everybody all the time,
the local copy of the hashgraph is always being brought into increasing sync across the network.
And at a moment in time, they are in sync, and they will forever be in sync from that moment in time.
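[A sketch of that syncing step, reusing the illustrative Hashgraph class above: each side sends the events the other is missing, after which the two local graphs are identical. In the real protocol the receiver would also create a new event recording the sync, which is what extends the gossip about gossip.]

```python
def sync(a, b):
    """Exchange deltas between two nodes' hashgraphs until they match."""
    delta_for_b = [ev for h, ev in a.events.items() if h not in b.events]
    delta_for_a = [ev for h, ev in b.events.items() if h not in a.events]
    for ev in delta_for_b:
        b.add(ev)
    for ev in delta_for_a:
        a.add(ev)
    assert a.events.keys() == b.events.keys()  # now identical
```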
So I think that actually brings up an interesting point.
So there's the term finality, right, which is used, for example, in things like Tendermint, right, where every block is final, right?
So you know, okay, a block takes like three seconds, then it's final.
In proof-of-work systems, it's much harder,
so one never really has finality.
So one uses these, like, quasi-finality notions.
People are probably familiar that in Bitcoin, for a long time,
you know, or maybe still to a large extent,
one would kind of say, okay, six transactions,
or six confirmations.
So once a transaction six blocks deep,
you kind of consider it final,
but you know, it's not exactly the same thing.
And then in Ethereum as well,
with Casper, I think there will be some kind of finality coming, but it will take some time to get there.
What does it look like in Hashgraph? At what point, you know, how many seconds does it take until you can say it's final?
And how does that depend on, you know, the number of nodes that are participating in this?
So what I can tell you are sort of what we've seen empirically and then explain it.
So, information that's gossiped into the network goes out to the entire network exponentially fast.
The amount of time it takes to get to every node is logarithmic in the number of nodes.
And so that's an important point.
The amount of time it takes information to get from the originator to everybody else is logarithmic in the number of nodes.
In terms of finality, what we see is that on average, it takes about eight gossip periods.
I'll call that a gossip period.
It takes about eight gossip periods before the transactions are final, on average.
It can be more, it can be less, but on average, it's about eight gossip periods before the transactions are final.
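[Back-of-the-envelope arithmetic for the figures quoted here, with made-up latencies: propagation is logarithmic in the node count, and finality takes roughly eight gossip periods on average. Only the logarithmic shape is the point; the constants are illustrative.]

```python
import math

def estimated_finality_seconds(num_nodes, seconds_per_hop=0.05, periods=8):
    # One gossip period ~ the log2(n) hops a rumor needs to reach everyone.
    hops_per_period = math.log2(num_nodes)
    return periods * hops_per_period * seconds_per_hop

print(round(estimated_finality_seconds(100), 1))    # ~2.7 s with these numbers
print(round(estimated_finality_seconds(10000), 1))  # ~5.3 s: grows only with log(n)
```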
Now, finality in our case is fundamentally different than, say, in proof of work blockchain.
In proof of work blockchain, you have a partial order on the transactions.
You care that, you know, what you want to know for sure is
that a person can't double-spend a single coin.
And so if Bob tries to spend that coin in two different places, double-spend the coin,
then the partial ordering of those transactions, to ensure that you can't do that, is all that's important.
In a total ordering system, every transaction gets put into a total order across all transactions for the entire universe.
That's a much harder problem to solve.
And so we solve, when we say finality, what we mean is that we have a total order on all transactions across the entire universe.
And that becomes really important when you start thinking about treating this like a database.
You know, if you literally have databases underneath the nodes, then you care about the order of every transaction.
And you have to have a total order to ensure ACID compliance across all the databases.
And we have that.
The other thing that's different is that when we achieve finality, we're 100% sure of the fact
that it's final.
And we know for certain that every other member of the network knows for certain that it's
final and it will never change, guaranteed.
So it's fundamentally different than Bitcoin, for example, where you have blocks,
and with each new block on top of the chain, you are probabilistically more certain
that the coin's not going to disappear if you're the merchant, for example.
And so we're, you know, we achieve a higher level of certainty or finality than what's currently
achieved in Bitcoin.
And that's also, you know, part of the definition of BFT, just being BFT.
You can't be BFT unless
you have 100% certainty that at a moment in time,
everyone agrees on the order and the order will never change.
And so Bitcoin blockchain is not BFT by definition.
And so you were speaking about eight gossip periods and how long is it?
Like let's assume we had now a public hash graph network.
Sure, sure.
So, empirically, what we've done in the lab, and we've done lots of tests,
both in a single data center, cross-country, you know, using AWS, for example, from Oregon to
Virginia, across data centers cross-country, and then globally. And what we're able to see in those
contexts with a varying number of nodes (we haven't tested hundreds of nodes yet; we'll achieve it, we just don't have the quota from Amazon to do
hundreds of nodes yet) is that, you know, depending on the number of nodes
and the geo distribution, we go anywhere from sub-second finality, total order finality,
with 100% certainty, to seconds of finality.
And it may end up being that it's tens of seconds of finality on the order of transactions
when we go global at scale.
It remains to be seen; we're still doing that work.
So, of course, that kind of already gets us a little bit into a topic that I'm sure many people will be curious about, which is, you know, is there going to be a public hash graph network?
And especially, you know, how would that look like in a public chain context?
Now, we're speaking about notes and discuss about gossip.
So presumably, let's say if I could spin up lots of notes, run lots of notes and kind of gossip, fake information stuff.
I mean, I presume I would be able to disrupt this process.
You know, up to, you know, how is that security?
Well, the real question you're asking is, how do we go from a permission network where there's one vote per computer?
And you have to give permission to a computer to join the network to an open network where you can't have one vote per computer.
Because, to your point, you know, somebody could stand up a bunch of sock puppets.
and if they stand up enough, then they can have two-thirds of the total vote,
which would dictate the order of transactions.
In all cases, whether it's hash graph or leader-based systems or any of these distributed consensus algorithms,
what's required to be public or open is a scarce resource.
You have to make it expensive for a,
would-be attacker to stand up all those sock puppets. Or you have to make it impossible for them to,
you have to make it impossible for them to have enough of the voting weight, if you're waiting
votes, to be able to influence the order of the transactions. And so a public network built
on hashgraph would be the same. You know, that's why the cryptocurrencies are actually required
in the public networks. A lot of people view the cryptocurrency as
sort of the point of the public network. That's not how we view it. I mean, if there were a public
network that's built on the hash graph, it would be to make it possible to build globally distributed
applications. And it turns out you have to have a scarce resource and the cryptocurrency
serves as that scarce resource. And so it certainly would be a feature, but it's a necessary
feature for the bigger vision of globally distributed applications.
And yeah, there would be a cryptocurrency associated with that, and there would be a way of
using the number of coins that perhaps one owns when they run a node as a mechanism for
weighting the influence of their vote on the consensus voting process.
And so instead of it being one computer one vote, it's one coin one vote.
And then there's a whole set of issues that you have to deal with if you take that approach,
which I'm not prepared to go into today.
Certainly we would explore that in great detail when that becomes appropriate for a public network.
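[A sketch of the one-coin-one-vote idea: weight each member's (virtual) vote by stake and require strictly more than two-thirds of all coins, so an attacker must acquire a scarce resource rather than just spin up sock puppets. The two-thirds threshold matches the usual BFT bound; the names are illustrative.]

```python
def supermajority(stakes, yes_voters):
    """True iff the yes-voters hold strictly more than 2/3 of all coins."""
    total = sum(stakes.values())
    yes_weight = sum(stakes[v] for v in yes_voters if v in stakes)
    return 3 * yes_weight > 2 * total

stakes = {"alice": 40, "bob": 35, "carol": 25}
print(supermajority(stakes, {"alice", "bob"}))    # True: 75 of 100 coins
print(supermajority(stakes, {"alice", "carol"}))  # False: 65 of 100 coins

# Sybil nodes with zero coins add no weight:
stakes.update({f"sock_puppet_{i}": 0 for i in range(1000)})
print(supermajority(stakes, set(stakes) - {"alice", "bob"}))  # still False
```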
So that's very interesting.
Of course, that makes perfect sense.
I think the aspect of having kind of coins or
stake or something like that, that weighs votes and protects from Sybil attacks.
You know, I think that is where many of these networks are going and that seems very
reasonable. One thing I'm curious about here: in such a scenario, would you need a block
reward or some other kind of incentivization for people to participate in this? Or is that not
necessary? Well, yeah, no, absolutely. I don't think there's a free lunch. And a lot of the systems that
we see today really don't account for all the costs that are associated with being a node.
And I view those as flaws in the economic underpinnings of those platforms.
So, for example, if you're running a full node, then you are spending money on bandwidth,
you spend money on storage, and you spend money on CPU cycles.
And I think for all three of those,
you would want to compensate a full node for the amount of resource that they provide in each one of those categories.
And so if you're going to have a public network at scale and it's open consensus,
it's, you know, anybody wants to run a node can run a node,
then you have to have a system of economics that actually accounts for all of the costs
in order to be viable at full scale, you know, at maturity.
And yeah, certainly that would be part of
a public platform.
So of course,
Mance, Hashgraph looks like a very interesting
consensus algorithm, and
deployed in a public setting could
make an interesting
asset or coin, or however we might choose to see it.
My question would be, it's a two-part question.
The first part is, what other applications
are you deploying Hashgraph to
in the short term, apart from a
potential public system?
And the second question is,
since you and your team are approaching the problem of consensus
from a very different perspective to people from the blockchain industry itself,
how do you see the blockchain industry evolving
if your consensus algorithm proves successful?
Interesting. Okay.
Well, to start, yeah, we
started by addressing enterprise use cases.
I mean, that's where Swirlds is focused as a company.
And the first use case, Ping Identity.
I'd mentioned that I was head of labs and architecture
for Ping.
The first actual application built on top of Hashgraph
was an identity-focused application.
It actually solves sort of an obscure problem
in the world of identity.
It's something associated with a protocol called OpenID Connect, OIDC.
And Ping has then taken that and proposed it to the OIDF,
the standards body associated with this as a way to address the problem in the standard.
And they continue to pursue that, even now that I'm gone.
We then approached the credit union industry.
And the credit unions, as an industry, wanted to create a platform that makes it possible
for the 6,000 credit unions in North America
that represent 105 million members
to build arbitrary distributed applications.
I mean, not just FISA-related applications,
financial services-related applications,
but in addition to those messaging applications,
information exchange as it relates to risk, for example,
of course, identity-related applications,
and they've now created a new organization,
called CU Ledger, that stands for Credit Union Ledger. The industry CEOs voted as an industry to create a new service organization, CU Ledger.
And CU Ledger now is capitalized. It has a CEO. The CEO came from MasterCard, a senior exec from MasterCard.
And they this year are rolling out this new platform built entirely on Hashgraph. And there will be an app store.
They have an existing, you know, robust third-party developer system that already builds applications for the industry.
Well, now they'll be able to build distributed applications and market those through the App Store as a channel to the 6,000 credit unions.
And so that will be the first large-scale deployment of the technology.
And that's happening even as we speak.
And then there's, you know, a large pipeline of other customers
that are cross-industry.
There's healthcare, there is supply chain management,
there are other financial services organizations
that are using the technology.
None of these have been yet made public,
and they will be.
They're less mature than CU Ledger is,
but they're coming, and you'll see those this year.
And so that's what Swirlds has been focused on.
As an industry, on the public side, what I think will happen, you know, with or without Hashgraph,
and I think Hashgraph maybe will accelerate this in some ways, is that the stack of technologies that are there to deliver the product,
the elements of that stack, will become increasingly differentiated.
I mean, we started with Bitcoin. It's like one monolithic code base;
in some ways it's a mess.
What we would like to see, and what I'm sure we will see as the industry matures,
is that each layer of the stack, for example, with Hashgraph, what we have is a consensus,
I think of it as a consensus server.
Maybe server is not the right word.
It's an SDK built in Java, entirely in Java, that handles communications across all the nodes.
It implements that gossip protocol that we've talked about.
and puts arbitrary transactions in consensus order.
So that is it.
That's all it does.
That's the foundation.
That's, you know, the consensus layer, the bottom of the stack.
If there were going to be a public network, then there would be a whole range of services
that go on top of that layer, that consensus layer.
There'd be a services layer.
And, you know, one of those might be a cryptocurrency, another one might be storage,
another one would be smart contracts, et cetera, et cetera, et cetera.
Anonymity services; you can imagine what kind of services you might want.
I could see and expect that there will be competition for providing best of breed in the market
for maybe each one of these services and for the consensus layer itself.
And that's how markets mature.
And so we will certainly, hopefully, accelerate
that process.
well-architected stack that serves all the use cases of the industry. And I think that the
industry will, you know, come to agreement on what the definition of those pieces are. And you'll
see companies spring up that do one of those things. And they do it really well. That's what I would
expect. Cool. Well, maybe one last question here. What's the timeline of Hashgraph? Where are you guys
at currently, and what's the roadmap for the next, you know, maybe 12 to 24 months? Yeah. Well,
so the work that we are doing on that consensus layer that I just mentioned is coming close to
version 1.0 capability. That work has been going on for years. This is a complicated
thing. So Hashgraph, in some ways, is far more complicated to implement than blockchain is to
implement. And, you know, making a gossip protocol that is highly performant is a work of art.
And so a lot's gone into that. So we will continue to harden that consensus layer. And then
even with or without a public network, it's still important to have that services layer that I just
described. All of the services I mentioned have value even in a permission context, right?
And so, you know, we will build out that services layer. And everybody wonders if there will be
a public network. It's fair to speculate that there will be. You probably would be right.
And so, you know, we're not making any announcements, but we're doing all the things that anybody
would do in our situation. Okay, okay. I think that that is
the most elegant non-announcement announcement that I've heard.
Non-announcement, thank you. That's right. That's a non-announcement.
Yeah, I mean, you guys certainly have already published a lot of material
about Hashgraph, right, in detail, technical papers and comparisons. And I think that's great,
really, that there's already a lot of substance there to dive into.
And there's also some nice interviews, nice talks
that are available.
Well, I think we are at the end of the episode.
Thanks so much, Mance, for joining us today.
It's been a pleasure to learn about Hashgraph.
I'm super excited to see what's going to come out of that.
You know, whether your claims are true and it's going to be as revolutionary, I don't know.
But it's certainly original, and it's certainly something truly novel with interesting properties.
So I think that, you know, I'm thrilled, and I'm sure Meher and many others are thrilled, to see what
comes out of that.
Yeah, well, thank you.
No, thank you so much.
I mean, it is different than anything else I've seen in the market.
We're very proud of that fact.
We don't want to look like anybody else.
And so our approach has been fundamentally different as well.
And it's all a very staid, well-thought-out, mature approach to both the technology and the
marketplace.
So we're looking forward to it as well.
Thank you for your interest.
Yeah, and thanks so much to our listeners for once again tuning in.
Of course, we're going to put links to the Hashgraph white paper and many of the other materials
on the website in the show notes.
If people want to learn more about it and dive deeper, they can go there.
And yeah, thanks so much to our listeners for tuning in.
So we put out episodes of Epicenter every Tuesday, usually, although sometimes a little bit later.
And you can subscribe to the show on iTunes, SoundCloud, your favorite podcast application,
or you can watch the videos on YouTube.com slash epicenter Bitcoin, I think.
And you can support the show by leaving us a review on iTunes
or you can of course send us a tip in Bitcoin, Bitcoin Cash or Ether.
So thanks so much and we look forward to being back next week.
