Grey Beards on Systems - 37: GreyBeards discuss blockchains with Donna Dillenberger, IBM Fellow

Starting point is 00:00:00 Hey everybody, Ray Lucchesi here with Howard Marks. Welcome to the next episode of the Greybeards on Storage monthly podcast show where we get Greybeards storage and system bloggers to talk with storage and system vendors to discuss upcoming products, technologies, and trends affecting the data center today. This is our 37th episode of Greybeards on Storage, which was recorded on October 12, 2016. We have with us here today Donna Dillenberger, IBM Fellow. Ray met Donna at IBM Edge last month in Vegas and was intrigued by what IBM was doing with blockchains and wanted to learn more. So Donna, tell us a little bit about yourself and perhaps give our audience an introduction

Starting point is 00:00:49 into blockchains. I'm Donna Dillenberger. I work at IBM Research, and I'm part of the blockchain group at IBM. I'm an IBM fellow, and prior to working on blockchain, I worked on cognitive, I still work on cognitive analytics. I'm also an adjunct professor at Columbia University teaching an advanced course in computer design. Prior to that, I also worked in the Courant Institute of Mathematical Sciences.

Starting point is 00:01:21 So what is blockchain? Blockchain is a software technology. It started in 2009 and it was created as part of Bitcoin, which is a cryptocurrency. And Bitcoin was created because of the financial recession that we had in 2008. And a group of programmers got together and they wanted to create a cryptocurrency that was not going to be controlled by central governments and wasn't controlled by large banks. So they created this cryptocurrency called Bitcoin.

Starting point is 00:02:02 And as part of that, they had to create a software protocol that recorded cryptocurrency transactions. And as part of not relying on a central institution, they created this protocol blockchain, so that it could be run on multiple computer systems. There wouldn't be one central computer system that owned the Bitcoin transactions. As part of the blockchain protocol, they also provided incentives for strangers to donate computing capacity. But the blockchain protocol could be used

Starting point is 00:02:41 for more than just recording Bitcoin transactions. It could be used to record any information or transactions that groups would like to share with other groups. So in the case of Bitcoin, not just wallet transactions, but in the case of other companies, they could use blockchain to record other financial transactions that are based on more accepted currencies. It could be used for provenance of supply chains. It is being used for invoice systems to reduce disputes. So at Edge, they talked about diamonds, the proof of diamonds almost or something like that. Wasn't that another... Yeah, it's also being used for provenance of items, not just manufacturing items,

Starting point is 00:03:29 but also provenance of high-value items like diamonds. So in that case, on a blockchain, you would also see the pedigree of these diamonds. It was mined from this conflict-free mine. It was traded by these traders in Antwerp. It was polished by these certified diamond polishers. It has this type of diamond rating. I don't know a whole lot about blockchain, but what I keep hearing is that we can use this basically crypto system to engender trust without a trusted authority, and I don't get that. I mean, just conceptually, it's like, okay, I used to be a New Yorker and I spent some time on 47th Street. Let me tell you, a diamond cutter's workshop is absolutely the worst environment you can put a computer into. Because that dust that's in the air is oil impacted and conductive and really abrasive. But that aside, if I've got a diamond and someone's saying that it's a C2 colorless diamond, I have to trust the guy whose opinion that is.

Starting point is 00:04:55 I'm really having difficulty with the concept of trusting everyone who's on the blockchain but not having a trusted authority somewhere that vouches for the other people on the chain. Yeah. So, you know, how do you provide trust in a trustless environment? So what happens is that there's some data, right? There's some person that says that, you know, this diamond has this type of rating. And maybe that person gives you a certificate saying that this diamond has this rating. So you are trusting that person. To make that rating even more trustworthy is if you have a second person also give it that rating that, you know, does not work on 47th Street. Even better way is to have like a third or like, you know,

Starting point is 00:05:46 50 people give that diamond a rating and all these 50 people are all on different, are all in different countries. Right. And so that helps the pedigree of this diamond. The idea is, it's real easy to find 50 idiots. Yeah, yeah, yeah.

Starting point is 00:06:06 I mean, I'm not impressed by numbers. Yeah, yeah, but maybe it's better to trust 50 people than one person, especially if these 49 other people themselves are registered diamond authorities. Yeah, absolutely. So if they are, you know, official diamond authorities and they provide you some proof. But that makes them the external trusted authority. Except there's not one of them, right? There's multiple of them, right? If you only have one of them, right, that one person could be bribed, you know, that one person could be extorted to give the wrong information, you know, that one person could be just ill one day and, you know, just dizzy and may give the wrong rating. This brings up the question of consensus in a blockchain environment. You know, when a Bitcoin

Starting point is 00:07:02 gets transferred from, let's say, Ray to Howard, we need to somehow determine that that actually occurred. And we both need to agree on that sort of thing. And there seems like there's different, and I'll call it algorithmic consensus capabilities that are available in various blockchains. Could you kind of give us some idea what that means, Donna? Yeah. So initially, you know, in Bitcoin and Bitcoin-derived blockchains, right, if you have, for example, multiple copies of data you want to share, you know, so, for example, you have, like, multiple databases where there's this rating of a diamond, you know, just practically, you know, all these copies of the database, they're going to get out of sync, right? Either due to, you know, network errors or timing errors, or some computer systems get hacked, others don't get hacked, right? So the question is, how do you get all these databases back in sync? And as you said, Ray, we do that in blockchain

Starting point is 00:08:03 with what we call a consensus protocol. And so there are many consensus protocols. The first one that Bitcoin created is called Proof of Work. And the blockchains that are derived from Bitcoin also use Proof of Work. And so you have to remember that Bitcoin had a motivation to have strangers donate their most powerful computers to the Bitcoin network. So that helped them very cleverly design this consensus protocol to motivate strangers to donate their powerful computers. So proof of work is, say, you know, between all of us, we are each running a blockchain node. That means we have a computer system and it has a blockchain database and it has blockchain software that synchronizes our blockchain databases.

Starting point is 00:08:56 So before data gets committed into any of our blockchain databases, our blockchain software in proof of work has to solve a complex math problem that has nothing to do with the transaction that gets committed. And then it has to take the result of the transaction and create a hash value for it. A hash value comes from a mathematical function that if you put an input of data, the output is some unique opaque number. Yeah, I think we're pretty well concerned. We understand hash because it's all over storage, quite frankly, from a CRC perspective. And how storage is allocated or widely split across devices is somewhat of a hash function kind of solution.

Starting point is 00:09:43 Not to mention how critical hashes are to deduplication. Yeah, so I think we're good on hashes. But so they did this complex calculation and they came up with an answer. And then the first database, right, that comes up with a hash value that starts with a nonsensical value, for example, four zeros to provide randomness in it, it becomes the database of authority. And whatever data that computer system has, the other blockchain nodes have to now follow what we'll call a master database. And again, they use this thing called proof of work. And so what that means is that the group, and this is only Bitcoin, right? This isn't how Hyperledger works.

Starting point is 00:10:25 I'll get into that, right? So the Bitcoin blockchains, right, the groups or person that donated the most powerful computer, they're most likely to have the computer that solves this complex math problem first. And so everyone is motivated to donate their really powerful computers because the systems that do solve the complex math problem first and hash it and come up with a hash value that starts with this nonsensical value they call

Starting point is 00:10:54 a nonce value, they get free bitcoins. And that's what they call miners. So the reward of donating a powerful computer is you get free bitcoins. But not wanting to have the Bitcoin network dominated by powerful computers, that's why they came up with, after you solve this complex math problem, then you also have to create a hash value that starts with this random nonce value. And it could be I donated a laptop to the Bitcoin network. Even though my laptop wasn't the first one to solve the complex math problem, my laptop was the first one that created this hash function that started with the four nonsensical values that was required. So that's how you could get laptops to be the master copy. And so that's called proof of work.
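As a rough illustration of the hash puzzle described above, here is a minimal Python sketch. It is a simplification and not Bitcoin's actual mining code: real Bitcoin hashes a binary block header and compares the result against a numeric difficulty target, whereas this sketch just searches for a SHA-256 hex digest that starts with four zeros, echoing the "four zeros" example in the conversation. The names `mine`, `verify`, and `DIFFICULTY_PREFIX` are hypothetical. In common usage, the "nonce" is the counter that gets varied until the hash meets the condition.

```python
import hashlib
from typing import Tuple

# Assumption: "four zeros" as in the conversation; real networks adjust difficulty.
DIFFICULTY_PREFIX = "0000"


def _digest(block_data: str, nonce: int) -> str:
    """SHA-256 hex digest of the block data combined with a candidate nonce."""
    return hashlib.sha256(f"{block_data}|{nonce}".encode()).hexdigest()


def mine(block_data: str, prefix: str = DIFFICULTY_PREFIX) -> Tuple[int, str]:
    """Try nonces 0, 1, 2, ... until the digest starts with `prefix`."""
    nonce = 0
    while True:
        digest = _digest(block_data, nonce)
        if digest.startswith(prefix):
            return nonce, digest
        nonce += 1


def verify(block_data: str, nonce: int, prefix: str = DIFFICULTY_PREFIX) -> bool:
    """Checking a claimed solution takes one hash; finding it takes many."""
    return _digest(block_data, nonce).startswith(prefix)
```

Calling `mine("Ray pays Howard 1 BTC")` returns the first nonce that works along with the winning digest, and any other node can confirm it with a single call to `verify`. That asymmetry (expensive to find, cheap to check) is what the "work" in proof of work refers to.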

Starting point is 00:11:39 And you see that it's extremely vulnerable to hacking. Now, another consensus value. Before we move on, let me see if I understand this right. So conceptually, not functionally. Can I think of this blockchain database as having multi-master replication, but each record, the authoritative copy for each record is determined through this proof-of-work process? Yeah. Only for blockchain 1.0 implementations.

Starting point is 00:12:13 We have evolved since then. Not only Bitcoin, but Ethereum and other Bitcoin-derived blockchains. Right. Well, the whole mining part is only interesting. Well, it's interesting to me because I want to set up my laptop to be a miner and see if I can't get some free Bitcoins. But that's a different question. I have to tell you that we have evolved from that. That is how Bitcoin works.

Starting point is 00:12:37 And the Bitcoin guys are using machines with 12 GPUs to do your mining. So odds of your laptop paying off are kind of low, right? I mean, that's why they have that random value in it, right? Yeah, it is non-zero, but kind of low. Okay, so tell us about the other more sophisticated consensus capabilities or protocols, rather. You know, at IBM, right, we're working with a blockchain implementation that's not derived from Bitcoin. It's called Hyperledger, right?

Starting point is 00:13:11 And it's a Linux Foundation open source blockchain implementation. And of course, you know, this blockchain implementation, Hyperledger, did not have a historical motivation of like getting strangers to donate powerful computers to a blockchain network. The Hyperledger blockchain, companies or government agencies that use the Hyperledger blockchain, they don't have a problem finding computing capacity. They either have their

Starting point is 00:13:38 own data centers or they're going to run the blockchain in a cloud somewhere. And so there's no such thing as proof of work or mining in Hyperledger. Instead, what happens is that when data is committed to the blockchain, before it gets committed, it's sent to all the other nodes in the blockchain. And if two-thirds plus one, you know, of these nodes agree that this is the data, right? They create a hash value of the data, and they hash it with the historical state of the blockchain databases each one has. That means that no one has tampered with this data, and no one has deleted an entry in their historical databases.

Starting point is 00:14:20 Then that data gets committed. I'm sorry. So the consensus algorithm takes the current state of the blockchain, hashes that with the new data, and comes up with a new hash value. And then if two-thirds plus one of the nodes in the blockchain agree that that's the hash value, that becomes the new data? Then that data gets committed, correct. Okay. Okay. Okay.

Starting point is 00:14:45 Okay. So it is a different solution here. Yes. Correct. And the blockchain nodes don't hash all historical state, right? Because every block that gets added to the blockchain, you know, say like you start with the Genesis block, then there's a hash value on that.

Starting point is 00:15:02 Then a new block gets added to the blockchain. Then there's a hash value of the new block combined with the hash value of the prior block, so that all you need is the hash value of the prior block, which... Or it contains the hash value of the prior block to that, etc., etc., ad infinitum. Yes. That's interesting.
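The chaining just described, where each block's hash covers the prior block's hash, can be sketched in a few lines of Python. This is an illustrative toy and not Hyperledger's block format; the dictionary layout and the names `GENESIS_PREV`, `block_hash`, `append_block`, and `is_valid` are assumptions made for the example.

```python
import hashlib
from typing import Dict, List

# Assumption: a fixed placeholder stands in for the "previous hash" of the genesis block.
GENESIS_PREV = "0" * 64


def block_hash(prev_hash: str, data: str) -> str:
    """Hash of a block's data combined with the previous block's hash."""
    return hashlib.sha256((prev_hash + data).encode()).hexdigest()


def append_block(chain: List[Dict[str, str]], data: str) -> None:
    """Add a block that records the hash of the block before it."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS_PREV
    chain.append(
        {"data": data, "prev_hash": prev_hash, "hash": block_hash(prev_hash, data)}
    )


def is_valid(chain: List[Dict[str, str]]) -> bool:
    """Recompute every hash from the genesis block forward and compare."""
    prev_hash = GENESIS_PREV
    for block in chain:
        if block["prev_hash"] != prev_hash:
            return False
        if block["hash"] != block_hash(prev_hash, block["data"]):
            return False
        prev_hash = block["hash"]
    return True
```

Because each hash folds in the one before it, changing or deleting any historical entry makes `is_valid` fail from that block onward, which is why a node only needs the prior block's hash to extend the chain.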

Starting point is 00:15:21 And it's also, you know, since you don't have to motivate strangers to donate, you know, powerful computers, it makes more sense, right? Because the consensus algorithm, right? You could have wrong data in your data, in your blockchain database, but because you have the most powerful computer, now everyone has to sync with your copy of the data, which could have been, you know, it could have false entries in it. Not to mention that the work you're proving is useless work. It's some sort of mathematical computation,

Starting point is 00:15:54 whether it's useless or not, it's depending on the computation. But the result of which we don't care about, which is why it's useless. The group that donated the computer doesn't think it's useless because, you know, you're doing the mathematical function to get free Bitcoin. Right. And I'm assuming that Hyperledger is intended for semi-private business-to-business data exchange.

Starting point is 00:16:19 It could be for public or private. It's an open source solution. Yeah. But there's some, you know, you establish a chain. Yeah. And there's a community of the people who use that chain to exchange data. There's a, we have no incentive for strangers to donate their laptops or computers, right? We don't, we don't, that's not one of our tenets of the protocol.

Starting point is 00:16:47 It's not, you know, get people to join this chain for the sake of joining this chain. Or, you know, for free bitcoins. Right. Right, well, free bitcoins are the incentive, but from, you know... Well, there must be, you know, like a startup effect. So let's say I wanted to create a blockchain between the three of us or something like that. It's not sufficient

Starting point is 00:17:04 to have three nodes to make two-thirds plus one. I guess it is. You could have three nodes. We'd all have to agree on what the blockchain is. No, you need at least four, right? Right. Yeah, you need, the minimum is four so that, you know, you have... And these four could be, you know, effectively cloud apps.

Starting point is 00:17:23 They could be laptops. They could be just about any sort of computing service with access to the other ones, I guess, through the Internet. I guess that's how it's working. It would probably be preferable if they were more available than a laptop. So a server that was there all the time that could be queried to say, okay, what do you think it is, and stuff like that. Not necessarily needing HA per node, but more. So when a transaction occurs, it must be broadcast to all the nodes in the blockchain.

Starting point is 00:17:57 And then there's this hashing activity that occurs, and then they kind of, through some sort of protocol, say this is my hash, this is my hash, this is my hash, and if all the hashes, two-thirds plus one, agree, then that becomes a new... Where does the two-thirds plus one get calculated? In the Hyperledger blockchain, right? There's one node that is randomly chosen as a leader node, and every few blocks of data, the leader node changes. So, you know, all the transactions are sent to the leader node. The leader node orders them, you know, based on when they came in, and then sends them back out to all the other nodes,

Starting point is 00:18:38 so that all the other nodes commit the transactions in the same order. And then all the hash values that come from each of the member nodes are sent back to the leader node that can compute whether two-thirds plus one of the nodes agree on the hash value. Okay. And then the leader acknowledges or negative acknowledges to the writer? Yes. Then we say, then if the hash values agree, then the data is committed. If the hash values do not agree, then you tell the blockchain application

Starting point is 00:19:12 that submitted that transaction that it wasn't committed. Right. Okay. That's interesting. Yeah, that addresses, you know, two writes to the same place at approximately the same time issues too.
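The leader's "two-thirds plus one" check can be sketched as follows. This is a toy illustration and not Hyperledger's code: the function names are hypothetical, and the threshold is taken literally from the conversation as floor(2n/3) + 1. The `max_faulty` helper uses the standard Byzantine fault tolerance bound that n nodes tolerate at most f faulty ones when n >= 3f + 1, which is why four is the smallest network that survives one bad node.

```python
from collections import Counter
from typing import Dict, Optional


def quorum(n: int) -> int:
    """'Two-thirds plus one' of n nodes, read literally as floor(2n/3) + 1."""
    return (2 * n) // 3 + 1


def max_faulty(n: int) -> int:
    """Largest f satisfying n >= 3f + 1 (standard BFT bound)."""
    return (n - 1) // 3


def agreed_hash(votes: Dict[str, str], n: int) -> Optional[str]:
    """Return the hash reported by at least quorum(n) nodes, else None.

    `votes` maps a node name to the hash that node computed; `n` is the
    total number of nodes in the network, including any that did not reply.
    """
    if not votes:
        return None
    value, count = Counter(votes.values()).most_common(1)[0]
    return value if count >= quorum(n) else None
```

With four nodes, `quorum(4)` is 3 and `max_faulty(4)` is 1, so three matching hashes commit the transaction even if the fourth node disagrees or is silent. With three nodes, `quorum(3)` is 3 and `max_faulty(3)` is 0: everyone has to agree, and a single bad node blocks progress.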

Starting point is 00:19:27 Yeah, but it's almost like a distributed storage database of some type, a distributed ledger. Correct. Sure. Being an old Microsoft sysadmin, Active Directory is kind of my model distributed database. Yeah. And then just one more point. Well, one other thing that Howard said, which is like he can't envision like a computer being in a diamond retail office or something like that.

Starting point is 00:19:56 You don't actually need to have a blockchain node to use the blockchain. So, for example, there could be some blockchain nodes in the cloud, but then access to the blockchain is through a browser or through a mobile device, you know, to be able to read the data on the blockchain or write data on the blockchain. So all the diamond retailer would need is either a smartphone or, you know, something that has a browser. Oh, really? I always thought that all the players into the blockchain would have to be nodes on the blockchain. But in this case, it's just a consumer of the blockchain service. Yeah, the analogy is all of us use the Internet.

Starting point is 00:20:37 But in our home, all of us don't have web servers, right? Well, some of us do. Yeah, Howard, of course. Yeah, there's an inner core of like blockchain databases. I mean, all of us use databases all the time. Even if we don't know we're using databases, you know, when we query information, it goes up to some database, right?

Starting point is 00:21:00 But surrounding these databases, these blockchain databases, are millions of users with browsers and mobile devices or Internet of Things sensors. So you mentioned Hyperledger as a solution that IBM is working with. So what sort of, I don't know exactly what the terms are, what sort of offerings does IBM have surrounding Hyperledger? Yeah, so last year Hyperledger didn't exist. You know, last year we worked with these Bitcoin-derived blockchains, and our customers told us they wanted something different.

Starting point is 00:21:36 You know, besides the holes in a proof-of-work consensus algorithm, you know, they wanted to be able to have the data be encrypted, which is not provided in Bitcoin-derived blockchains, right? Every transaction on a Bitcoin-derived blockchain is public. They wanted to be able to have every record on the Hyperledger blockchain be signed so that if Howard put a record there, you know, Howard could digitally sign it. Whereas Bitcoin-derived blockchains, you know, anonymous users could just put records and invoke smart contracts.

Starting point is 00:22:14 Oh. Yeah. So there we bring in the normal electronic signing reputation services. And only Hyperledger does that. Bitcoin doesn't have that. The Bitcoin-derived blockchains don't have that. So you mean there's a digital signature to every transaction that identifies

Starting point is 00:22:33 the owner or the originator of the transaction? If you configure the Hyperledger blockchain to do that. If you want to run it as an anonymous, public, unencrypted blockchain, you could do that as well. So besides encryption and digital... I don't want to do that. You know, the third thing that our customers asked for was, they wanted to be able to

Starting point is 00:23:00 provide access control to the record. So if they put a record on the blockchain, they wanted to be able to say that only Ray could see it and Donna can see it. So that was the third thing that our customers asked us to do.

Starting point is 00:23:18 And so that's why IBM wrote this blockchain implementation from scratch. And we knew that we were going to donate it to the Linux Foundation because our customers also said they wanted to work with a blockchain that was open source, not owned by one company. And then once it was donated to the Linux Foundation, many other companies joined.

Starting point is 00:23:38 It became one of the most popular open source projects in the history of open source. And so now you have other companies donating code to the Hyperledger blockchain, right? You have financial companies and other IT companies donating code to provide additional features into the Hyperledger blockchain. Okay. Yeah, so besides helping, you know, working with customers to add new features into the Hyperledger open source blockchain.

Starting point is 00:24:07 IBM has a cloud where clients can go to host their blockchains. And what we've done there is we've built this special hardware in this publicly accessible cloud to provide further security features to the blockchain. So it's built on our LinuxONE Z systems, and we're standing up these systems, you know, all over the world, you know, Asia, Europe, the Americas. And what it does is that if you install blockchain and blockchain applications and the data in one of these systems, we have a feature in it called a secure service container.

Starting point is 00:24:51 And what that does is that, you know, the blockchain software is attested. So that means it's signed. So it's signed by IBM or our vendors or, you know, for example, if Howard just wants to make sure it's his smart contract that runs and not, you know, some smart contract that a hacker provides, you know, that Howard would, you know, sign his smart contract and say this is the only type of smart contract that, you know, should run on this blockchain. So once these blockchain applications and the blockchain software is installed in this secure cloud, malware can't install themselves

Starting point is 00:25:32 because, again, malware isn't signed. And, you know, we check the contents, the bytes of the software stack. And if the bytes are different, the blockchain doesn't boot up. And the other thing is that the data is, you know, it has to be persisted in a file eventually. And those files are always encrypted, and the encryption keys are protected in memory that's not addressable to applications.

Starting point is 00:26:09 And we're also having the encryption algorithms implemented in tamper-resistant cards. If someone tried to tamper with it maliciously, then the cards would detect that and zero out, you know, the encryption keys. And so, you know, besides protecting your data through attestation to make sure that, you know, no malware could install themselves and, you know, protecting your data with encryption, with keys that aren't saved in the clear like other platforms save them. We also have really high crypto compliance levels, what we call Federal Information Processing Standard (FIPS) 140-2 Level 4, the highest level of crypto certification.

Starting point is 00:27:08 So, you know, so besides donating features to the open source Hyperledger blockchain and providing a high security blockchain cloud, you know, all over the world, we also, you know, this hardware that's in our high security blockchain cloud, companies, some of them choose to run blockchain on premise. And so they could, you know, they could run this hardware on their, you know, in their own data centers as well. Then we also were working with many industries on blockchain solutions, you know, for, you know, so these are blockchain applications that, again, help share data for specific industries and business use cases. And besides that, there's many, many, many clients that we're working with, helping

Starting point is 00:28:04 them design their blockchain applications, helping them write their blockchain applications. And then, you know, besides that, just we're doing a lot of work with respect to, you know, the blockchain community. Just even if, you know, IBM doesn't commercially help you with a specific blockchain application that the company wants to put into production, we host a lot of hackathons and we help many people understand how they can use blockchain and also what it's, you know,

Starting point is 00:28:45 limitations of blockchain as well. Okay. So we've been talking about, you know, entries in the blockchain. What actually constitutes an entry and, you know, what is the blockchain as it's stored? And now I get the whole data transfer and validation of source, and that everybody agrees about what the contents of that message is. But I've got to store it someplace, and what are those messages? And so wouldn't it be kind of application specific, the contents of those, what I call transactions or messages or entries?

Starting point is 00:29:29 Well, I'm kind of at, you know, is this something where I define fields and records? Is it XML? You know, just that level. Interesting. Donna? They're stored in a key value data store. It's stored in a key value data store. But what are the elements that are in the value portion of the key value data store? So the keys could be something like this is a signature, this is a diamond rating, and the value would be the text associated with those signatures and diamond ratings. But a transaction would be made up of multiple key value entries, I guess. Is that correct? Yep. Yeah, that's correct. Values could also be like

Starting point is 00:30:05 certificates, certificates of authenticity, certificates of origin, health inspection certificates, property deeds, mortgages. Right. And it could be a single value. It could be a numerical value,

Starting point is 00:30:22 a single letter, a string. Right. So records or transactions would effectively be multiple key value elements that would be stored in this key value store. So as far as the key value store, that could be any key value database in each of the nodes. So I guess my question is, would each of the nodes be running the same blockchain software with the same blockchain key value store? It could be Mongo database. It could be anything, right?

Starting point is 00:30:58 Currently, yeah, every blockchain node runs the same blockchain software, and they're using the same data store. But we are, you know, as an open source community, we are providing the capability to have other data stores plugged in, not just the particular one that is provided with Hyperledger now. Right now, it's a key value data store called RocksDB. Okay. And then ultimately, once we have that, then the members of the chain would have to negotiate what the limitations of the data types that that store could hold were.
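To make the "a transaction is multiple key value entries" idea concrete, here is a small Python sketch. A plain dictionary stands in for the node's key value store (it is not RocksDB), and the key names, the `tx_id` namespacing, and the function names are all hypothetical choices for the example, not the Hyperledger data model.

```python
from typing import Dict


def apply_transaction(store: Dict[str, str], tx_id: str, entries: Dict[str, str]) -> None:
    """Write each key value entry of one transaction into the store.

    Keys are namespaced by the transaction ID so that entries from
    different transactions do not overwrite each other.
    """
    for key, value in entries.items():
        store[f"{tx_id}:{key}"] = value


def read_transaction(store: Dict[str, str], tx_id: str) -> Dict[str, str]:
    """Collect all entries that belong to one transaction."""
    prefix = f"{tx_id}:"
    return {k[len(prefix):]: v for k, v in store.items() if k.startswith(prefix)}


# Example: one diamond record made of several key value entries.
store: Dict[str, str] = {}
apply_transaction(
    store,
    "tx-001",
    {
        "diamond_rating": "colorless",
        "certificate_of_origin": "conflict-free mine",
        "signature": "signed-by-howard",
    },
)
```

Here `read_transaction(store, "tx-001")` returns the three entries as one record. Since values are opaque to the store, a value could just as well hold a serialized certificate or deed, subject to whatever size limits the chosen store imposes.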

Starting point is 00:31:42 There isn't a limitation. It's almost self-defining. If I choose some key value store that won't let me store over 64k as a value, then everybody else has to know that. Yeah, I suppose. There are some restrictions.

Starting point is 00:32:00 So yeah, I mean, one of the problems with, I understand it with Bitcoins, is that the performance has started to degrade as the calculations have increased, the blockchain has increased and things of that nature. There's been some discussions about changing the size of the transaction for the Bitcoin blockchain. Does any of that affect the Hyperledger? Yeah, I mean, one of the reasons why the Bitcoin blockchain has a very low transaction rate is this whole proof-of-work consensus that we just talked about, right? Because, you know, all

Starting point is 00:32:39 the nodes of a Bitcoin blockchain, you know, they have to do this complex math problem. And then, you know, the one that wins, right, has to propagate their data values onto the other nodes of a Bitcoin blockchain. So what we're doing in Hyperledger, besides the consensus value I just talked, I just described, the two-thirds plus one, is that there's going to be another evolution of consensus, right? You could stick with the two-thirds plus one, which is called the practical Byzantine fault tolerance.

Starting point is 00:33:18 Practical Byzantine fault tolerance? Yeah. Okay. Yeah, that name comes from computer science, right? So, yeah, have you heard of it? It's just so rare that you hear practical and Byzantine used together. Okay, yeah, I understand it. Okay. Yeah, so the idea is that there's this enemy ruler, right?

Starting point is 00:33:44 And they hold a castle, and it's a really big castle. And then there are these Byzantine armies, you know, surrounding this enemy castle. And the Byzantine armies would like to take that castle, but no one Byzantine army could conquer that castle by themselves. There has to be a concerted effort, a coordinated effort for the Byzantine armies around the castle to take the castle. But the problem is that some of the Byzantine generals are traitors. And, you know, if they get a message where all of them are supposed to attack at two o'clock, the traitors, you know, are, you know, they say they're going to attack, but they really won't attack. And then the other thing is that the messengers that send these attack or

Starting point is 00:34:26 not attack messages, some of them are traitors as well, and they could change the message or they could get killed or compromised. So it's an analogy, the Byzantine armies are analogies for how do you get a computer network to communicate with each other when some of the computer nodes are not trustworthy, and the computer network itself, you know, you can't always trust. And so that's called the Byzantine fault-tolerant problem. And so there's a mathematical proof that as long as two-thirds plus one of the Byzantine armies are not compromised, then even if, you know, even if less than one-third of the armies are traitors, the correct messages will go through to attack or to not attack.

Starting point is 00:35:10 And then the practical part comes because this computer science problem has been implemented as an algorithm. You could implement this algorithm so that it never completes, which is not practical, but then there is a practical implementation of it where it does complete. So that's where the name of that consensus algorithm comes from. But then an evolution of this consensus algorithm is another one that we'll be providing next year. And this is the concept of endorsers and validators and committers.

Starting point is 00:35:43 So here, again, this is requirements from customers. They've told us that instead of sending this data to all the nodes in a blockchain network, first, you know, send it to endorsers. And the endorsers are the nodes that whoever is originating the record is going to identify. And, you know, there has to be at least one of them.

Starting point is 00:36:07 So the idea is that if Howard sent Ray $50, then I as Donna, I have no idea whether Howard really sent Ray $50. The only person who knows if Howard sent Ray $50 is Ray. You did send me $50, by the way. So Ray becomes an endorser. Ray puts on the blockchain, yes, I endorse this transaction because it really happened. So Ray is the endorser. And then to make sure that, you know, Ray and Howard didn't delete any historical records, you know, I'm added as a validator.

Starting point is 00:36:40 So I can't endorse the transaction because I'm not party to it, but I could see that the hash value of that transaction and the historical state that Howard and Ray calculated matches my hash value, you know, when I create the hash of the transaction that Ray endorsed along with the historical state. So then those are validators. And then once it's been endorsed and it's validated, then the rest of the nodes could commit it. And so that also makes the consensus a little bit more business-friendly, and also it makes it more scalable. Yeah, so there was one of the sessions at Edge that talked about supply chain management, using what looks like an endorser, validator, committer type of consensus. Is that true?

Starting point is 00:37:28 Yeah, yeah. They were talking about the next version of this consensus protocol that will be available in Hyperledger next year. But it looked like IBM's supply chain was actually using it in real time, right? Yeah, yeah. Right now the IBM supply chain is in production, but we're using the practical Byzantine fault tolerance algorithm. And then next year we'll do the endorser, validator, and committer one. Endorser, validator, committer reminds me of my days in the casino business.

Starting point is 00:37:58 Okay, how so? Four layers of people looking at the dealer before. Exactly. Okay. Yeah, the thing that you're always looking for is a player and a dealer colluding to cheat the house. And so the eye in the sky becomes the validator. Okay. And certainly it's not workflow, but it just struck me as being a similar problem and a good solution.
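
As an aside to the transcript, the endorser, validator, committer flow described above can be sketched in a few lines. This is only an illustration under stated assumptions, not Hyperledger's implementation: the functions `digest`, `endorse`, `validate`, and `process` are hypothetical, the "historical state" is reduced to a single hash string, and SHA-256 is an assumed choice of hash.

```python
import hashlib
import json


def digest(transaction: dict, prior_state: str) -> str:
    """Hash a transaction together with the hash of the history before it."""
    payload = json.dumps(transaction, sort_keys=True) + prior_state
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def endorse(transaction: dict, endorser: str) -> bool:
    """Only a party to the transaction can vouch that it really happened."""
    return endorser in (transaction["from"], transaction["to"])


def validate(claimed: str, transaction: dict, validator_state: str) -> bool:
    """A validator recomputes the hash from its own copy of the history."""
    return claimed == digest(transaction, validator_state)


def process(transaction, endorser, claimed, validator_state, ledger) -> bool:
    """Commit to the ledger only after endorsement and validation both pass."""
    if endorse(transaction, endorser) and validate(claimed, transaction, validator_state):
        ledger.append(claimed)
        return True
    return False


# Hypothetical data mirroring the example in the conversation.
state = hashlib.sha256(b"genesis").hexdigest()
tx = {"from": "Howard", "to": "Ray", "amount": 50}
claimed = digest(tx, state)
ledger = []

# Ray endorses, and the validator's history matches: committed.
print(process(tx, "Ray", claimed, state, ledger))
# Same transaction checked against an altered history: rejected.
altered = hashlib.sha256(b"altered history").hexdigest()
print(process(tx, "Ray", claimed, altered, ledger))
```

The first call commits because Ray is a party to the transaction and the validator's hash matches; the second is rejected because the validator's copy of the history no longer produces the same hash.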

Starting point is 00:38:22 So on the IBM blockchain cloud and stuff like that, are there multiple – I'm not exactly sure. Are there multiple blockchains being utilized by these various applications? Are they all using the same blockchain, just different key value records being applied to this blockchain? I mean, for instance, the diamond exchange, diamond validation rather, and the supply chain, are they different blockchains? Are they the same blockchain, just different records in there? They're different blockchains. Okay.

Starting point is 00:38:52 Yeah. I mean, I wouldn't imagine why I would want to build one. It doesn't seem like it makes a difference. You'd have more nodes and more people validating. Why allow any communications at all into the diamond trading one from people who don't trade diamonds? Correct. Except for queries to validate.

Starting point is 00:39:11 Right. This way, it's kind of just firewalling it off, not to mention that the database scalability issues of there's one blockchain and every transaction in the world goes through it. Not unlike Bitcoin I might add. That's the problem Bitcoin ran into, is that they had a high-cost algorithm and it broke down when the transaction rate got too high.

Starting point is 00:39:34 It's still working. We'll call Bitcoin, you know, blockchain 1.0, right? I mean, it was a really good idea, right? I mean, you got so many strangers to donate free capacity for Bitcoin. Well, greed is a great motivator. So you mentioned Hyperledger. You mentioned Ethereum. Are there other open source blockchain solutions out there that people should be aware of?

Starting point is 00:40:02 Well, Ethereum isn't an open source blockchain, right? It's controlled by one company, which is, you know, Ethereum. I didn't realize that. I thought it was open-source. Okay. Is Ethereum Bitcoin, then? It's derived from Bitcoin, yeah. It's derived from Bitcoin.

Starting point is 00:40:17 So the Bitcoin consensus, the proof-of-work kind of consensus. Yeah, it does proof of work, correct. So Microsoft offers like a blockchain service as well, I understand? Or is it Azure? It might be Azure. Yeah, Microsoft has a cloud and their cloud is called Azure. And they provide Ethereum blockchain software hosted on their Microsoft Azure cloud. Okay. And the blockchain software that IBM runs runs on Linux. So any, effectively, anybody that runs a Linux server could potentially be a member of a blockchain that is hosted on IBM cloud. Or does it all have to be the secure cloud environment that you've

Starting point is 00:41:00 talked about? No, anyone with a Linux server, you could even run Linux on a Mac, right, could be part of a blockchain network. It's just more secure if it's on a secure server. Right, right, right. And so, I mean, it seems like blockchains is taking the financial services community and just, you know, they're going wild with this blockchain stuff. Could you explain why it works so well in that environment? Because the financial community, they have a lot of interactions that just don't stay in one bank.

Starting point is 00:41:38 And, you know, blockchain is best suited for applications that go, that require information to be shared across multiple companies. And so, you know, that's just one of the industries where they have a lot of transactions that need to be shared across companies. So, I mean, you see in whatever the time frame is that the wire transfers that are occurring now through, you know, check, I don't know, even clearing kinds of things and, you know,

Starting point is 00:42:02 invoicing and contracts and all this stuff will all be blockchain applications at one point? Oh, it looks so much nicer than SWIFT. What's SWIFT, Howard? Speaking of the financial stuff, yes. SWIFT is the international wire network between banks. And I mean, I haven't worked on it in 35 years, but it hasn't changed. I might add, it was literally telegrams. And, you know, there's no, you know, there's no decent authentication of who the sender is to the receiver. Basically, you know, because I got the caller ID from Citibank, I believe that this withdrawal for a billion dollars is actually coming from Citibank.

Starting point is 00:42:48 And there's no acknowledgement of receipts, and there's no acknowledgement of the transaction actually having been processed. A simple protocol running through the blockchain could be so much better. And everybody could private, and you could validate internally that not only that the wire you sent to the correspondent bank got there, but that the guy at the end of the chain got his money. Well, it's about the end of our show today. This has been really interesting. And I think there's a whole world of blockchain that's going to open up across at least financial services and it'll go off from there. Howard, are there any final questions you have for Donna? No, I think I got it. And why now makes sense. Too many of the explanations I tried looking at online were just tied to Bitcoin and some of the libertarian politics behind Bitcoin.

Starting point is 00:43:46 It's like, yeah, I kind of want to know how it works. I want to know why it's good for the world. Yeah. I will say I was at a conference a couple of months back, and there was a company called Blockstack that had implemented a blockchain worldwide distributed file system. So when you created a file, it would update its blockchain with that file name and location. And, you know, it was like, it was bizarre.

Starting point is 00:44:13 Number one, who would want a worldwide distributed file system that had unique names for every file in the world? And so they were also doing a blockchain PKI and public key infrastructure, which seemed like a better solution, which would be a shared version of certifications and stuff like that. I could see that being a very useful tool. Well, that would address a lot of the problems about PKI. Well, about certificate revocation and things like that that don't work as well as they should. Hey, Donna, is there something you would know? Closing thoughts? Closing thoughts? Thank you, Howard.

Starting point is 00:44:50 Yeah, you know, besides like using blockchain for existing business processes that exist today, right, I'm excited about new types of applications of blockchain. So something like now, if you have this blockchain and you can't tamper with it because if you try to delete a record, then the hashes won't match against your tampered blockchain and all the other blockchain nodes. So imagine that there's these IoT sensors, these Internet of Things sensors, and they are able to put on the blockchain,

Starting point is 00:45:27 here's this grain, and this grain was farmed in this soil, and these IoT sensors say, here's this level of toxicity in the soil. Here are these eggs, right? And these eggs come from these hens that again, you know, there's measurements of, you know, the amount of antibiotics that they were fed. And so it could revolutionize our food supply chain, right? We, all of us, we have to trust that the food processors don't try to poison us, right? Or they put things in the food that we eat that they would also feed to like their youngest infant. But sometimes mistakes happen, right? And toxic substances

Starting point is 00:46:14 get into our food supply chain either on purpose or by accident. But if you have not only the provenance of diamonds, but the provenance of our food products in the blockchain. And you go to the store and a consumer can just scan it with their phone. They could scan the barcode and see the whole provenance of all the ingredients in two, for example, two cake mixes. And if they have a choice, you know, they'll choose the cake mix that has a blockchain provenance because they know that no one has substituted, you know, plastic glue, you know, for milk powder in that cake mix or something. You know, once you start... Well, to be flippant, the kind of people who care about that aren't buying cake mix.

Starting point is 00:46:53 They're cooking from scratch, but... Yeah, or, you know, buying flour. But the point is made. You know, and then once you start measuring something, you change companies' behaviors, right? If they know that this is being watched, then they'll be more careful as well. That's interesting because, I mean, the provenance of something like a diamond is really worthwhile because of the value there. But the provenance of like a food item or even a manufactured item or something like that, that's relatively inexpensive. But through the mechanism of blockchain can be almost automated. It's just amazing.
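
As an aside to the transcript, the tamper-evidence property Donna describes (delete a record and the hashes no longer match the other nodes) can be sketched with a simple hash chain. This is a minimal illustration, assuming SHA-256 and a chain in which each hash covers the previous hash plus the new record; the function names and the sample provenance records are hypothetical.

```python
import hashlib

GENESIS = "0" * 64  # assumed placeholder hash for an empty chain


def block_hash(prev_hash: str, record: str) -> str:
    """Hash of a record chained to the hash of everything before it."""
    return hashlib.sha256((prev_hash + record).encode("utf-8")).hexdigest()


def head_hash(records) -> str:
    """Fold all records into one final hash that every honest node should agree on."""
    current = GENESIS
    for record in records:
        current = block_hash(current, record)
    return current


# Hypothetical provenance records, as IoT sensors might report them.
records = [
    "grain lot 17: soil toxicity reading recorded",
    "egg lot 4: antibiotic level recorded",
    "cake mix batch 9: milk powder added",
]
honest_head = head_hash(records)

# One node quietly deletes the middle record from its own copy.
tampered = records[:1] + records[2:]

# The tampered copy no longer matches what the other nodes hold.
print(head_hash(tampered) == honest_head)
```

Because every hash depends on all earlier records, removing or altering any one of them changes the final hash, which is what lets the other nodes spot the tampered copy.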

Starting point is 00:47:32 And, you know, parents are concerned because sometimes, you know, manufacturers make furniture or toys with, like, you know, paints that have been, like, that have been. Well, there's, like, lead for pigments. Yeah, yeah. Or other toxic varnishes that aren't healthy, but they haven't retooled their factories yet, right? So you can't regulate everything, right?

Starting point is 00:47:56 But if you provide this transparency, then the consumers can change, you know, companies' behavior so they don't let things slip. Curiously, I can only think of one product that's tracked to that degree today, and that's only in Ray's home state of Colorado, where they track weed from seedling to final sale. Yeah, because consumers care about that, right? Well, actually, that's not it. The main reason for it isn't because consumers care about it, but to keep organized crime out of it. Yeah, I was going to say U-232, uranium and stuff like that might be worthwhile. I haven't purchased any U-235 if any of the federal agencies are listening.

Starting point is 00:48:41 I have no fissile materials here. But that aside. You are close to Los Alamos. So, Howard, you know, it's not that far away. Yes, I am. And I have an appointment up there next week and I'm really excited about it. Yeah. Going to get a tour of the unclassified side of the house. But the tracking in Colorado is so that, you know, you can't have organized crime come in as middlemen.

Starting point is 00:49:08 If every plant is tracked all the way to the retail, then there's no opportunity for that. So there's lots of other good reasons to want to keep track of things. I mean, the implications of something like that, the IoT sensor would have to be everywhere, you know, and monitoring everything. And maybe as long as it's anonymous, secured, encrypted, you know, non-modifiable, I think that would make a lot of sense. But gosh, there's a hell of a business here. Yeah. I mean, not only the provenance of like our food and, you know, manufactured items. But you can imagine, like, the provenance of data, too, right? So the late 80s, you know, the web, you know, took off because it allowed people to share information.

Starting point is 00:49:54 But now we have the problem that the information on the web, how do we know that, you know, information that's put out, you know, that Ray and Howard, you know, both wrote like papers on the same topic. But it's just that, you know, there might be a community that trusts, you know, what Howard says more than Ray, right? But at least now people could see the provenance of the data. Whereas before, you know, a lot of this is anonymous or it's posted by these opaque user IDs. And, you know, with Hyperledger, at least when you sign your data, you sign it with a digital signature and it can't be self-signed. It has to come from a certificate authority. So you can't pose as, you know, if there's a record out there that says President Obama was born in Kenya and it's signed by myself,

Starting point is 00:50:47 and then there's another record out there that says President Obama was born in Hawaii and it's signed by the state of Hawaii, now readers of this information can decide which data on the web that they would rather believe. And that doesn't exist yet. It's going to change presidential politics forever. Let's not go there. Let's not go there. I'm not going any deeper than that. That's good.

Starting point is 00:51:13 That's good. Yeah, but the provenance of data. Yeah, it's very interesting how far this could potentially go. Yeah. You know, it would make us a more careful society. Yeah. Well, there's nothing wrong

Starting point is 00:51:27 with a more careful society. Okay. We have run kind of long, however. Yes. On the thought of a more careful society. This has been great.

Starting point is 00:51:37 It has been a pleasure to have you, Donna, here with us on our podcast. Next month, we will talk to another storage startup

Starting point is 00:51:44 systems technology person. Any questions you might want to ask, please let us know. That's it for now. Bye, Howard. Bye, Ray. Until next time. Thanks again, Donna.


