The Science of Everything Podcast - Episode 139: Cybersecurity and Cryptocurrencies

Starting point is 00:00:34 You're listening to The Science of Everything podcast, episode 139, Cybersecurity and Cryptocurrencies. I'm your host, James Fodor. In this episode, we're going to give a brief but hopefully reasonably comprehensive introduction to some basics in computer security and also look at cryptocurrencies and the blockchain. So these are a few different ideas

Starting point is 00:00:58 that I decided to put together into a single episode. They're related, but distinct. I mean, we could potentially do one or more episodes on each of these topics, but I thought it might be better to give a simpler overview and put a few things together. So we're going to start by talking about the basis of how information is sent securely over the internet, that is RSA cryptography, and then we'll look at the transport layer security system, which is a protocol that underpins the internet today. And then we'll progress to talking about cryptocurrencies and the blockchain, so how that technology works, and how transactions

Starting point is 00:01:33 are made and how the security arrangement works, and conclude by looking at some of the pros and cons of cryptocurrency as a currency. So if you like, the first part of the episode will be focused on the crypto part and the second part of the episode will be focused on the currency part, if you want to think of it that way. There are no recommended prerequisites for this episode, so we can just jump straight in and begin by talking about RSA cryptography. So cryptography is the science of encrypting and decrypting information so it can be sent in a secure way, or stored securely as well. And particularly the issue that we're interested in here is the use of cryptography to send information securely over the internet. The key issue is how can you ensure that one person

Starting point is 00:02:22 sending a message to a second person can do so securely? In cryptography, the two communicating parties are traditionally named Alice and Bob. So the question is, how can Alice send a message to Bob in a way whereby Alice is sure that the message was delivered, that the message wasn't corrupted, and that Bob is the person who actually receives the message and not anybody else. And in particular, the question is how that can be ensured, even when the message is sent via a publicly accessible medium. So an example could be if you're sending a radio message, or if you're sending a message over the internet,

Starting point is 00:02:55 or over any medium really where there's a possibility of interception or of snooping on the content of the message. So that's what cryptography is about, at least in the context that we're interested in it here. RSA cryptography, as I mentioned, is a particular approach to cryptography, which forms the basis of many of the most common systems used on the internet, as well as in many cryptocurrencies. RSA is an acronym made up of the first letters of the surnames of the three individuals who devised the algorithm in the 1970s: Rivest, Shamir and Adleman. So I'm just going to call it RSA for short, which is pretty standard. The basic idea here is that RSA cryptography relies on the difficulty in factoring large numbers, particularly numbers that are the product of two large prime numbers. This is called the factoring problem.

Starting point is 00:03:44 The essential idea is that certain mathematical operations are easy to perform, but very difficult to invert. And finding the factors of a large number is a good example of this. If I ask you to check whether or not two numbers multiply to give a third number, that's very easy. You just multiply them together and then you see if the number that you get is the same as the target number. That's extremely simple to do. Conversely, if I ask you to find numbers that multiply together to give a target number, in other words, find the factors of this number, that can be quite difficult, especially if the target number is quite large.
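
To make the asymmetry concrete, here is a minimal Python sketch. The function names and the two small primes are assumptions chosen for illustration; trial division is just one simple way to search for factors, not the method a real attacker would use.

```python
def is_product(a: int, b: int, target: int) -> bool:
    """Checking a proposed answer takes a single multiplication."""
    return a * b == target


def find_factors(target: int) -> tuple[int, int]:
    """Finding an answer by trial division can take up to sqrt(target) steps."""
    candidate = 2
    while candidate * candidate <= target:
        if target % candidate == 0:
            return candidate, target // candidate
        candidate += 1
    return 1, target  # no divisor found, so target is prime


p, q = 10007, 10009            # two small primes, for illustration only
n = p * q                      # going forwards: one multiplication
print(is_product(p, q, n))     # True
print(find_factors(n))         # (10007, 10009), after roughly ten thousand divisions
```

With primes hundreds of digits long, the check stays just as cheap while the search becomes impractical.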

Starting point is 00:04:22 So there's this asymmetry here between how easy it is to check an answer and how easy it is to generate that answer in the first place. Or in other words, how easy it is to compute a function and how easy it is to compute the inverse of that function, like going in the opposite direction. And this is the basic idea behind RSA cryptography, applied to the problem of factorizing large numbers. Functions that have this property of being easy to perform

Starting point is 00:04:46 but difficult to reverse or invert are known as one-way functions. Now, there's a bit of complexity here about whether or not one-way functions truly exist mathematically, because we don't currently know whether there are any ways to quickly invert these one-way functions, for example to quickly find the factors of very large numbers. No one has been able to prove that there is no relatively fast way of doing this, in polynomial time for those who know about such things, but don't worry if that doesn't make any sense to you. So it's not proven that these one-way functions exist. It's more that no one has figured out a way to readily invert them yet, and so the system is relied upon as probably in practice being very

Starting point is 00:05:30 difficult to invert and find the factors. So it's important to bear in mind that these systems are not provably secure, they're just widely thought to be quite secure. It is possible that someone could come up with an algorithm that allows you to quickly invert these one-way functions, thereby undermining the security of the system, as we'll explain in a moment. But that's thought to be quite unlikely. Anyway, so we've got these one-way functions based on factorizing large numbers. So a one-way function is one where it's easy to compute one way, but very difficult to compute the other way, or at least as far as we know, it's very difficult to compute it the other way. And so that's really handy, because what it allows you to do is to generate two numbers,

Starting point is 00:06:07 one of which is called the public key, and the other of which is called the private key. So there's your public key and your private key, and these are both just very large numbers that are generated in a certain way. The actual mathematical formalism that's used in RSA public key cryptography is called modular arithmetic. So this uses properties of the remainder that's left over after you perform division. So for example, if I take 13 divided by 7, 7 goes once into 13 with 6 left over. So we would say that 13 mod 7 is equal to 6 as an example. So this modular arithmetic has a lot of mathematical

Starting point is 00:06:48 properties that make it very suitable for deployment in cryptography, and that's the actual mathematics that underlies the operation of the system. That, however, is too technical to explain in detail here, so instead I'm going to simplify things rather dramatically but preserve the key intuition behind it. We can think of it this way. To generate my private key, what I do is I randomly select two large prime numbers and then I multiply them together to get a third, even bigger number. So this third really big number is used as my public key. It's called public because I can tell everyone my public key. In fact, that's the point of it. You communicate it to everyone. And that's how people can

Starting point is 00:07:27 send messages to you. They can use your public key. What makes the system secure is the fact that, while everyone can see your public key and it's very easy for you to generate your public key (you just multiply your two prime numbers together, that's trivial), it's very difficult for anyone to go back. That is, it's very hard for anyone to work out what your private key is given your public key, because your private key is, let's say, one of those two numbers that you multiplied together initially to get the public key. It's easy to go forwards, but it's very hard to go back. This is the non-invertible function that we talked about. You can then use your private key, so one of these prime numbers that you selected, to encrypt

Starting point is 00:08:09 messages that you want to send people. Or alternatively, you can use it to decrypt messages that people send to you. In cryptography, we describe an unencrypted message as plain text. So you, you know, type on the keyboard and each character is assigned a number. And then I can write down any string that I want to convey as a long span of numbers. I can then encrypt that by performing a mathematical operation on it to convert from the plain text to what's called the cipher text. So that's the encoded version or the encrypted version. And the encrypted version, the cipher text, will appear random to anyone who is trying to snoop or spy on the message that I'm sending or the message that someone is sending to me.
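
As a small illustration of the "each character is assigned a number" step, the sketch below turns a string into one large integer and back. UTF-8 and big-endian byte order are assumptions made here for the example, and the helper names are hypothetical; no encryption happens yet.

```python
def text_to_number(text: str) -> int:
    """Encode the text as bytes, then read those bytes as one big integer."""
    return int.from_bytes(text.encode("utf-8"), "big")


def number_to_text(number: int) -> str:
    """Reverse the conversion: integer back to bytes, bytes back to text."""
    length = (number.bit_length() + 7) // 8
    return number.to_bytes(length, "big").decode("utf-8")


message = "hi Bob"
print(list(message.encode("utf-8")))   # one small number per character
as_number = text_to_number(message)
print(as_number)                       # the whole message as a single number
print(number_to_text(as_number))       # hi Bob
```

The plain text is now a number, which is what lets an encryption scheme operate on it arithmetically.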

Starting point is 00:08:52 The cipher text isn't truly random. It's generated in a deterministic way. It's just that it's very difficult to work out what the pattern is if you don't have the secret private key. For someone with the private key, it's trivial. They just perform a simple mathematical operation using the private key, which converts the cipher text back to plain text and therefore allows them to read the secret message. But for anyone who doesn't have that private key, there's no practical way for them to do that. They could theoretically try to guess all the possibilities of the private key, but that's why we use very large keys, typically 2048 bits or longer for RSA, in order to make it virtually impossible that anyone can just sort of brute force it. It would take, you know, longer than the

Starting point is 00:09:31 expected lifetime of the sun in order to do this by brute force, by just checking all of the possibilities, and so practically that's impossible. The only other way they could do it is if they found a way to calculate the private key from your public key. But remember that the whole point is that we use a non-invertible function, so that's also, as far as we know, practically impossible. We don't know of any method that allows you to do that. The key thing to understand here is the centrality of the non-invertible function, or the one-way function, for the system to work, because it's what allows me to easily generate a public key and share that with people, and then they can use that public key to encrypt information that they want to send to me,

Starting point is 00:10:13 and I can use my private key to decrypt that information and read it, you know, convert it back to plain text. But no one can take my public key and reverse engineer or work out my private key from the publicly available one, because doing so would involve inverting a non-invertible function or going backwards on a one-way function, which is practically speaking impossible. And that relies fundamentally, as we said, on the difficulty in factoring very large numbers. So there's a lot of detail to it surrounding the precise way in which these things are calculated using modular arithmetic. I'm not going to try to explain that. But the basic principle is that you choose a large prime number, which is your private key or the basis of your private key.

Starting point is 00:10:56 You then use that to compute a public key. Going forwards in that way is very easy. You tell everyone your public key, and they can use that public key to perform some mathematical operations to encrypt messages they want to send to you. To anyone who doesn't have your private key, the messages will look random, like gibberish, but to you, because you know how it was encoded and it was encoded using your public key, you can use your private key to decrypt it and read the message. But no one else who doesn't have your private key will be able to do that. In order to do that, they would have to be able to work out your private key using your public key, but that's essentially impossible because it's very difficult to factor very large numbers.
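
For listeners who want to see the modular arithmetic, here is a toy sketch of textbook RSA in Python. It is a hedged illustration rather than the simplified picture given in the episode: the variable names (`p`, `q`, `n`, `phi`, `e`, `d`) and the tiny primes are assumptions for the example, and real systems use far larger primes plus padding that is omitted here.

```python
print(13 % 7)                  # 6, the remainder example from the episode

p, q = 61, 53                  # two secret primes (tiny, for illustration)
n = p * q                      # 3233, the public modulus
phi = (p - 1) * (q - 1)        # 3120, easy to compute only if you know p and q
e = 17                         # public exponent, chosen to share no factor with phi
d = pow(e, -1, phi)            # 2753, private exponent: inverse of e modulo phi


def encrypt(message: int, exponent: int, modulus: int) -> int:
    """Anyone can do this with the public key (e, n)."""
    return pow(message, exponent, modulus)


def decrypt(cipher: int, exponent: int, modulus: int) -> int:
    """Only the holder of the private exponent d can undo it."""
    return pow(cipher, exponent, modulus)


plain = 65                     # a message already converted to a number below n
cipher = encrypt(plain, e, n)
print(cipher)                  # 2790
print(decrypt(cipher, d, n))   # 65
```

An eavesdropper who sees `n`, `e` and the cipher text would need `p` and `q` to compute `d`, which is exactly the factoring problem.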

Starting point is 00:11:34 So that's the key to how public key cryptography, RSA cryptography, works. Now I've mentioned that you can broadcast your public key and tell people, hey, use this to encrypt messages to send to me, which I can then read using my private key. That's one use of the technology, but you can also use it to digitally sign messages. So I can use my private key to sign a message and then send that to everyone, so that everyone can verify using my public key that I have signed it, but no one else can produce that signature. So the signature effectively is just more numbers, right? But you can check, using my public key, that the numbers could only have been generated by someone who knew the private key. And therefore you can verify that the possessor of the private key must have generated these numbers, you know, the digital signature.

Starting point is 00:12:28 And again, this is secure because everyone can check using the public key that the signature was generated correctly, but no one else can produce that signature because in order to do so, they would need to know the private key, which is not possible to work out from the public key. So again, it's relying on these one-way functions. So that's really useful that we can use the same underlying technology to either, if we bring back Alice and Bob, Alice can send a message to Bob, which only Bob can read. They can do that by using Bob's public key and then Bob can read it using his private key. Bob can also send a message back to Alice, signing it using his private key, and Alice can verify that that message came from Bob.
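
The signing direction can be sketched with the same kind of toy key. This is an assumption-laden illustration: the key values are the small textbook ones, the message is already a number, and real signature schemes sign a hash of the message with padding, which is left out here.

```python
n, e, d = 3233, 17, 2753       # Bob's toy key: (e, n) is public, d is private


def sign(message: int, private_exponent: int) -> int:
    """Only the holder of the private exponent can produce this number."""
    return pow(message, private_exponent, n)


def verify(message: int, signature: int, public_exponent: int) -> bool:
    """Anyone with the public key can check that the signature matches."""
    return pow(signature, public_exponent, n) == message


message = 65
signature = sign(message, d)
print(verify(message, signature, e))   # True: the signature matches the message
print(verify(66, signature, e))        # False: a different message does not match
```

Verification uses only public values, so checking a signature never reveals the private key.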

Starting point is 00:13:08 So if you have a pair of people both communicating with their own private and public keys, you can incorporate both components of this. So Alice can encode a message to Bob using Bob's public key and also sign it using her own private key, and she then sends that message. Bob receives the message, he decrypts it using his private key so he can read it, and then he uses Alice's public key to verify that indeed the message came from Alice, because there's two things here, right? You have to be able to read the message, and you have to also be able to verify who sent it. So by using a pair of public and private keys from two different people, as long as each knows the other's public key, but not private key, each can send a message to the other and verify that the message that they received came from

Starting point is 00:13:51 the person whom it was supposed to come from. So this is really useful technology, obviously, because you can send messages over a publicly accessible system. People can snoop and they can see what the cipher text is if they have access to your internet traffic, for example, but they won't actually be able to decrypt the content of that message unless they know your private key, which is why you have to keep that secret. Now that we've gone over the basics of how RSA cryptography works, let's talk about it in the context of the internet specifically, and here is where we have to introduce what's called Transport Layer Security, or TLS for short. This is a cryptographic protocol which is designed to provide secure communications

Starting point is 00:14:29 over a computer network and is widely used for the internet. Its most visible deployment is in Hypertext Transfer Protocol Secure, an extension of the original Hypertext Transfer Protocol, which is often abbreviated HTTPS. So people tend to not include that in internet addresses anymore, but you may recall back in the day, or if you see people pasting, you know, links, that it starts with HTTP, colon, slash, slash, and then WWW. The HTTP stands for Hypertext Transfer Protocol, which is the protocol that specifies how information is exchanged, well, how certain types of information are exchanged on the internet.

Starting point is 00:15:08 And more recent addresses will have an S on the end of that, so HTTPS. And the S stands for secure, and that represents the fact that it's incorporated the transport layer security protocol into the Hypertext Transfer Protocol. In other words, it's encrypting communications. Early on in the internet, there wasn't actually encryption. And so if you sent a request to a server to send you certain information, then the server responds back, that wasn't encrypted.

Starting point is 00:15:35 And so anyone who had access to your network traffic, such as your internet service provider or ISP or anyone else, they could just read all of the traffic that you sent to a website and everything that they sent back, because it wasn't encrypted in general. So that's changed now. The internet's gotten a lot more secure because of the introduction of HTTPS and the Transport Layer Security protocol. In general, sites that are secured now have encrypted traffic. So your requests and their responses are encrypted. At least most of the data is encrypted. Some of the metadata, such as when you send your requests or how much data and things like that, are not themselves encrypted, but in general, most of the content is. So let's

Starting point is 00:16:13 talk about how that works. The basic idea is that when you connect to a server, so whenever you visit a website, your computer connects to the server that is hosting that website. This initiates what's called a handshake procedure. And this is just the name that's given for essentially your computer and the server recognizing each other, identifying that each is who it says it is, and then sending a secure message between your computer and the server, allowing you to then communicate securely. So the first thing that will happen is that your computer, which we'll call the client,

Starting point is 00:16:49 the client connects to the server, assuming that it's a TLS-enabled server, meaning that it's capable of executing this protocol. Not all servers are, but typically browsers these days will indicate if a server that you try to connect to, like a site that you try to visit, does not have TLS enabled, or its protocols are out of date or something like that, or its certificate's not valid. So typically your browser will tell you if you're trying to connect to a server that doesn't have this enabled. And there's been a big push in recent years to focus more and more on this side of things. So let's assume that your client is connecting to a TLS-enabled server and it requests a secure connection. Your computer, so the client, will present a list of

Starting point is 00:17:31 essentially cryptographic algorithms that it's capable of using and the server selects from these which it wants to use and sends that back to the client. So basically they're just deciding we're going to use this particular algorithm and this particular formalism for exchanging information. So once they've decided that because obviously they have to agree on the details that they're using. The server then responds by providing identification in the form of a digital certificate. So I mentioned this before. The certificate specifies the name of the server and some kind of central authority that vouches for the authenticity of the certificate.
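
The negotiation step can be sketched as a simple list comparison. This is a simplified illustration, not the actual TLS message format: the function name is hypothetical, and picking by the server's preference order is an assumption (the episode only says the server selects from the client's list). The strings are real TLS 1.3 cipher suite names.

```python
def choose_cipher(client_offers: list[str], server_prefers: list[str]) -> str:
    """Return the server's most preferred algorithm that the client also offered."""
    for name in server_prefers:
        if name in client_offers:
            return name
    raise ValueError("no shared algorithm, so the handshake cannot continue")


client_offers = [
    "TLS_AES_128_GCM_SHA256",
    "TLS_AES_256_GCM_SHA384",
    "TLS_CHACHA20_POLY1305_SHA256",
]
server_prefers = ["TLS_AES_256_GCM_SHA384", "TLS_AES_128_GCM_SHA256"]

print(choose_cipher(client_offers, server_prefers))   # TLS_AES_256_GCM_SHA384
```

If the two lists share nothing, there is no common algorithm and the connection attempt fails.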

Starting point is 00:18:08 And it will also include the server's public key. Now the purpose of this step is so that you can be confident that the server you're connecting to is the one you intend to connect to. So, for example, if I want to connect to my bank, I can send my information securely to my bank by using my bank's public key. Remember, if you want to send information that only one person can read, you use their public key to encrypt that information. The problem with that is you need to know what the public key is of the server or the institution that you want to communicate with. So how do you know that? Well, you need to have some sort of central authority that vouches for the authenticity of the public key.

Starting point is 00:18:48 because otherwise you might have the wrong public key. And then, sure, you can send information securely to someone, but that someone's not who you thought they were. So this system has been set up, I think, relatively recently, in the last 10 years or so, where there are trusted central certificate authorities, which are organizations that essentially track servers and server names and the corresponding public keys for these different organizations

Starting point is 00:19:13 and basically give each one a badge saying, yes, we verify that this institution has these servers with these corresponding public keys. These central certificate authorities are tracked by the major browsers. So as of a couple of years ago, there were 52 different organizations that were trusted by Mozilla Firefox, 60 organizations by macOS, and about 100 organizations by Microsoft Windows and the Edge browser. So the browser that you're using will have built in, under the hood, a set of organizations that it trusts, you know, in different countries around the world,

Starting point is 00:19:48 and for different industries and different types of websites and so forth. Those trusted organizations, which presumably are vetted by the browser company, the companies that produce the web browsers, in turn, will vet servers and provide a certificate that specifies the server name and the public key that corresponds to that. Now, when I say vet, it doesn't necessarily mean that they're validating that the content on the website is accurate or anything like that. It's simply that the server is owned by an organization, which is who it says it is, that they're not trying to impersonate someone else.

Starting point is 00:20:19 So this obviously means that you want to ensure that you use a trusted web browser, which is run by an organization which you trust to have reliable certificate authorities, which in turn verify the identity of websites that you contact or that you visit, because otherwise you don't really have any way of knowing whether they are who they say they are. These days, most web browsers will have some kind of visual representation of this. So, for example, if I go to my bank account on Mozilla Firefox, at the top left next to the URL I see an icon of a padlock

Starting point is 00:20:53 and if I click on that it says connection secure, certificate issued to, and then it has the name of my bank. And if I want, I can have a look at some more information about that and the nature of the certificate and who issued it and so forth. That should exist on most websites that you visit these days, and if you try to visit a website that doesn't have that enabled or that doesn't present a valid certificate, then generally there'll be some sort of pop-up or notification which will tell you this connection is potentially insecure, and it would suggest that you leave. So again, this process here about presentation of a certificate from a certificate authority, that doesn't by itself secure information that's sent between the client and the server.
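
Outside a browser, the same check can be seen in Python's standard `ssl` module. This is a minimal sketch: `describe_certificate` is a hypothetical helper name, the hostname is whatever you choose to pass in, and calling it needs a network connection, so the example only defines it.

```python
import socket
import ssl

# A default context trusts the system's certificate authorities and
# insists that the server present a valid certificate for its hostname.
context = ssl.create_default_context()
print(context.verify_mode == ssl.CERT_REQUIRED)   # True
print(context.check_hostname)                     # True


def describe_certificate(hostname: str, port: int = 443) -> dict:
    """Connect over TLS and return the server's verified certificate details."""
    with socket.create_connection((hostname, port), timeout=10) as sock:
        with context.wrap_socket(sock, server_hostname=hostname) as tls:
            return tls.getpeercert()   # includes 'subject' and 'issuer' fields
```

If the certificate is missing, expired, or issued for a different name, `wrap_socket` raises an error instead of connecting, which is the programmatic equivalent of the browser warning.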

Starting point is 00:21:35 What it does is it enables you to get access to the public key of the server that you're communicating with and be confident that it is the correct public key of the institution or the website that you actually want to connect to, and not someone pretending to be from a different website. Of course, you still have to exercise some caution here, because it still could be the case that you have gone to a website, which is a legitimate website, but is perhaps a different website than the one you thought it was,

Starting point is 00:22:00 and you may enter a password there which you didn't mean to, or share information with someone that you didn't actually want to because you were confused about which website you were on. So of course, you as the user still need to exercise some caution here. But the purpose of the certificate authority system is essentially to reduce the risk of servers that are pretending to be somebody else. For example, someone's setting up a fake website for your bank,

Starting point is 00:22:22 which then encourages you to enter your password there. This system helps to reduce the risk of that because a certificate authority ideally wouldn't issue a certificate for a server like that, although of course sometimes this can still happen, because mistakes can be made. But anyway, at the end of this process, you have now received the public key for the server that you're communicating with,

Starting point is 00:22:43 which means that you can now send messages to that server which only that server can read. So the client, having now received the public key of the server, what they do is they generate a random number or a pseudo-random number, and they encrypt that using the server's public key and send it to the server. So now the server can use its private key to decrypt the message from the client, and both the client and server now have access to that random number, which no one else will know, because it's a long number that's generated randomly. That number is then used to generate a secret session key for subsequent encryption and

Starting point is 00:23:20 decryption of messages. So the actual RSA protocol is used only during that initial stage of sending the random number between the client and the server. It's not used, or at least it's not typically used in TLS, to actually send messages back and forth, because essentially it's expensive to do that. There's a limit to how long each message can be, and, you know, if you wanted to send an entire website, for example, or like a page on that website, you'd have to break it up into different segments, and then you'd have to be careful about how you encrypted them relative to each other, because if you're not careful in the way that you do that,

Starting point is 00:23:52 then attackers can potentially get information about the content of those messages, so you have to be careful there. And it's just quite expensive. So RSA cryptography is ideally suited to sending fairly short messages between two parties, not much longer messages. So the way that it typically works is that you use RSA, the public and the private keys, to send a random number securely between the client and the server,

Starting point is 00:24:15 and then the client and the server use that random number or sequence of random numbers for subsequent sharing of data using a different protocol. And as I said, that protocol that's used will be determined at the start of the handshake procedure. So I mentioned that the client will provide a list of protocols that are accepted, like ciphers and so forth that can be used for encryption.

Starting point is 00:24:34 The server will pick one that it's happy with, and then that random number that's subsequently shared will be used as the basis for further communication. So it's like a two-stage process. First, you share the random number, and then that random number, which no one else knows, is then used for the basis of sharing the actual message that you want to communicate with, because it's much easier and sort of cheaper to do it that way than to have to keep using the public and the private key. So at the end of that process, when you've exchanged that random number,
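
The two-stage idea can be sketched as follows. Everything here is a toy under stated assumptions: the RSA key is the tiny textbook one, deriving the session key with SHA-256 is an illustrative choice, and the XOR "cipher" is an insecure stand-in for whatever real algorithm (such as AES) the handshake agreed on.

```python
import hashlib
import secrets

n, e, d = 3233, 17, 2753       # server's toy key: (e, n) public, d private

# Stage 1: the client picks a random number and sends it under the public key.
client_secret = secrets.randbelow(n - 2) + 2
wrapped = pow(client_secret, e, n)          # what an eavesdropper would see
server_secret = pow(wrapped, d, n)          # the server unwraps it with d


def session_key(shared_secret: int) -> bytes:
    """Both sides derive the same session key from the shared random number."""
    return hashlib.sha256(str(shared_secret).encode("utf-8")).digest()


def xor_cipher(data: bytes, key: bytes) -> bytes:
    """Toy symmetric cipher: the same call both encrypts and decrypts."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


# Stage 2: all further traffic uses the cheap symmetric cipher.
ciphertext = xor_cipher(b"GET /account", session_key(client_secret))
print(xor_cipher(ciphertext, session_key(server_secret)))   # b'GET /account'
```

The expensive public key operation happens once per session, and the bulk of the data goes through the fast symmetric step.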

Starting point is 00:25:01 that concludes the handshake, and now the client and the server can begin sharing secure information, and they will continue to use that random number that they shared to encrypt the information until the end of that session. And that's sort of a one-time use thing. It won't be used in the future, so it's not like a password which you reuse. This is just a randomly generated single-use key for that particular session. So when the session ends, that session key is thrown away and it won't be used again. If you connect again to that website, you'll generate another one. Note that all of what we've been talking about, both RSA cryptography and also the Transport Layer Security protocol,

Starting point is 00:25:36 these are all different from storage of passwords. So using a password to sign into an online account is a bit different, because when you're communicating with a server using TLS, that's designed so that the client can be confident about the server that they're communicating with. But the server doesn't necessarily know who you are. And that's the purpose of a password: so that the server can verify who you are, such as when I sign into my bank account. The bank needs to know which of all its clients, or of the other people who might connect, I am. And so what they ask me to do is generate a password, which I send to them securely. And the way that it works, well, the way that it's supposed to work, is that

Starting point is 00:26:16 the server, so say, my bank, they won't actually store my password, not in plain text at least. They will transform it using what's called a hash function. A hash function is essentially a cryptographic function which allows you to take some input and then convert it into a big long number. There are various reasons why you might want to use a hash function. One is that it's a convenient way of storing information, because the point of a hash function is that even if the inputs are different lengths, the output will always be the same length. So for example, users may generate passwords of different lengths, but you don't necessarily want to store the length of the password. So you pass it through a hash function and convert it into just a big long number.

Starting point is 00:26:57 And then the hashed version of every user's password has the same length, and so that provides some extra security. And there are also data efficiency and lookup reasons why it's convenient to have the same length for everything, but we won't get into that here. So there's quite a few reasons to use a hash function. Security is one and sort of ease of storage is another. But the point of it is that a hash function is just some mathematical function, which takes in a number or like it could be text converted to a number,

Starting point is 00:27:21 and then it outputs an apparently random, like a random-looking number, of a fixed length. And it should be the case that you can't work out what the input was which generated the hash value. Even if you know the hash function and even if you know the output, you can't work out what the input was because again, it's another example

Starting point is 00:27:38 of a one-way function. So hash functions are another application of this one-way or non-reversible function idea: you can know what the function is. You can see this was the hash function, like this is how you compute it, and this is the output, but you still don't know what the input was.
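
The fixed-length property is easy to see with a real hash function. SHA-256 from Python's standard `hashlib` is used here purely as an example; the episode does not name a specific function, and the sample passwords are made up.

```python
import hashlib


def hash_text(text: str) -> str:
    """Return the SHA-256 hash of the text as 64 hexadecimal characters."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


for candidate in ["cat", "correct horse battery staple"]:
    digest = hash_text(candidate)
    print(len(candidate), len(digest), digest[:16] + "...")
```

Both outputs are 64 characters long regardless of the input length, and neither gives any obvious hint about the text that produced it.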

Starting point is 00:27:53 So typically, at least on a secure server, you'll choose your password, and it will be encrypted and sent to the server. The server will then apply its hash function and store the result in a database somewhere. But they won't store the password in plain text, they'll store the hashed version of it. And one reason for doing that, apart from sort of efficiency and the extra security about not knowing its length that I mentioned, is that if a hacker somehow breaks in and gains access to the server which stores the list of passwords, all they will find is a list of apparently random numbers, right? Now these are the hashed passwords, but the hacker won't know what the actual passwords are.

Starting point is 00:28:30 Typing in this hash won't be useful because when you try to log into the system, the system is looking for your password. It will then apply the hash function and compare the result to the stored value. If you just typed the hash into the password box, what the system will do is take the hash of the hash, and it will return some other hash number, right? And that's not going to match the hash that's stored in the system. So a hacker could get access to your hashed password, which is still not ideal because there are still ways that they can try to extract information from that, but it's still better than them having access to your plain text password. Unfortunately, not all corporations or organizations deploy this system correctly, and there have been high-profile cases of lists of passwords being leaked that had been stored in plain text form. That is very bad practice and should never be done.
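
The login check just described can be sketched in a few lines of Python. This is only an illustration under assumptions: SHA-256 stands in for whatever hash function the server uses, and the `stored_hashes` table, the username and the function names are all made up for the example. Real systems typically add further protections that aren't covered here.

```python
import hashlib

def hash_password(password: str) -> str:
    """Stand-in for the server's hash function (SHA-256 is an assumption)."""
    return hashlib.sha256(password.encode("utf-8")).hexdigest()

# Hypothetical server-side table: usernames mapped to hashed passwords only.
stored_hashes = {"james": hash_password("correct horse battery staple")}

def check_login(username: str, typed: str) -> bool:
    """Hash whatever was typed in the box and compare it with the stored hash."""
    stored = stored_hashes.get(username)
    return stored is not None and hash_password(typed) == stored

# Typing the real password: its hash matches the stored value.
print(check_login("james", "correct horse battery staple"))

# Typing the stolen hash: the server hashes it again, so it no longer matches.
leaked = stored_hashes["james"]
print(check_login("james", leaked))
```

Notice that the server never needs the plain text password after the account is set up. It only ever compares hashes.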

Starting point is 00:29:25 The organization should really never know what your password is. It shouldn't be able to ever tell you your password. If you ever interact with an institution which can tell you your password, you should ideally not interact with them further, or at least not give them any sensitive information. Because in a secure system, they shouldn't know what your password is. They should be able to reset your password for you, but they should never know what it is,

Starting point is 00:29:47 because they should only have stored it in a hashed form, which means they don't know what it is. All they can do is verify that you've typed in the correct thing when you type in the box, it's passed through the hash, and they can compare it to the database, where they have the hashed version stored. So they should be able to know that it's you because you've entered the password,

Starting point is 00:30:05 but not actually know what the password is. That's how password systems work. I should mention that it's still very important to choose a secure password, meaning one that's ideally long and relatively random. Here's one of the risks of choosing an insecure password. So say I just use password as my password, which I believe is still the most common password used on online accounts, unfortunately.

Starting point is 00:30:29 If a hacker does get access to the hashed form of my password, they won't necessarily know what my password is, but what they can do is use a technique where they essentially guess common passwords. They take a list of commonly used passwords and pass it through the hash function that this organization uses. In principle, that shouldn't be advertised either, but they might be able to find out what that is. Or they could just use a variety of common hash functions as well.

Starting point is 00:30:55 Let's give a simple example. Suppose my password is password. You pass it straight through the hash, and the hashed version of password is 357. In practice, it's going to be a lot more numbers than that, but that will just make it easier for us to talk about. 357 is my hashed password, right? So the hacker, suppose they gain access to the database and they see 357 listed there. Now they don't know what my password is, but what they'll do is take their list of commonly used passwords and generate the hash corresponding to password.

Starting point is 00:31:27 So as they put password in, they see 357, and then they look at the list of hashed passwords that they've hacked, and they'll see 357 listed there. And so they now know what my password is, or at least anyone who has 357 as the hash, they know that their password was password, because they've been able to basically reverse engineer that. So that's always a vulnerability of these systems that because the hash function is known, if you can have a good guess as to what the password was, you can just put that through the hash and check that it is correct or not. And if you've guessed it correctly, then you now know what the person's password is.
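
Here is a small sketch of that guessing attack in Python, just to show how little work it takes. Everything in it is hypothetical: the leaked table, the usernames, the list of common passwords, and the use of SHA-256 as the organization's hash function.

```python
import hashlib

def hash_password(password: str) -> str:
    """Stand-in for the organization's hash function (an assumption)."""
    return hashlib.sha256(password.encode("utf-8")).hexdigest()

# Hypothetical leaked table of username -> hashed password.
leaked_hashes = {
    "alice": hash_password("password"),
    "bob": hash_password("x7#Lq9!vTz2m@Rw4"),
}

common_passwords = ["123456", "password", "qwerty", "letmein"]

# Hash every guess once, then look each leaked hash up in that table.
guess_table = {hash_password(guess): guess for guess in common_passwords}
cracked = {
    user: guess_table[hashed]
    for user, hashed in leaked_hashes.items()
    if hashed in guess_table
}

print(cracked)  # only the account with the common password is recovered
```

The account with the long, random-looking password isn't in the result, because its hash doesn't match the hash of anything on the guess list.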

Starting point is 00:32:01 So that's one of the reasons why it's important to still store the hashed passwords in a secure place. And it's also a reason why it's important to choose a secure password, because even if the hashed version of your password does get leaked, it's going to be very hard for them to guess what the password was if you've chosen a long and relatively random password. If you choose something that is short or not random, like common words, for example, then it's going to be very easy for people to guess your password. So these online security systems always work with a combination of steps working together.

Starting point is 00:32:32 No single step is foolproof, but in combination with each other, they generally yield very secure communications. Let's now transition from talking about online security and cybersecurity in general to talking about cryptocurrency. Cryptocurrency relies on many of the ideas of cryptography that we've just discussed, but it deploys them in a sort of specific application, a specific context. So let's talk about how that works. First of all, what is a cryptocurrency? Well, a cryptocurrency is just a digital online currency, which is designed to function as a means of exchange, like any other currency. But typical currencies have some kind of issuer, whether it's the government who issues the currency or whether it's in the past it was often a private bank which would issue bank notes, which is essentially a promissory note on some kind of deposit held at that bank.

Starting point is 00:33:24 Regardless of exactly who is the issuer, there'll be some central authority which is responsible for issuing that currency. And any people who use that currency are reliant on the central authority. Sometimes this doesn't work out very well. So famously in Germany just after the First World War, there was an instance of hyperinflation where the government started to print currency without any restrictions. And that very quickly led to an expansion of the quantity of money circulating, which meant that its value dramatically fell and there was hyperinflation.

Starting point is 00:33:57 People were using money as kindling for fires or as wallpaper and things like that, because it was effectively useless as a means of exchange because prices had gone up so much. So some people are quite worried about that and other things that central banks do with currencies these days. Also, there are people who are interested in the possibilities of a decentralized system which isn't reliant on any particular authority to control. That's the underlying motivation behind cryptocurrency. So it's a decentralized system that allows you to verify that the two parties to a transaction actually have the money that they claim to have, even if those two parties don't know each other and don't trust each other and without needing any role for a

Starting point is 00:34:35 central referee or a central administrator or authority. So this kind of decentralization and ability to send payments securely without any bank or government knowing what you've sent or who you've sent it to, that's the underlying motivation behind a cryptocurrency. The very first cryptocurrency was Bitcoin, which was released in 2009. So cryptocurrencies are fairly new. There were some antecedents before that, but the first real cryptocurrency only emerged in 2009. As of June 2020, there were more than 25,000 cryptocurrencies trading in the marketplace, of which more than 40 had market capitalizations exceeding 1 billion US dollars. So cryptocurrency has grown quite rapidly and is now quite a big business. So let's talk a bit about how cryptocurrencies work.

Starting point is 00:35:23 And then we'll conclude with a discussion of some sort of pros and cons of cryptocurrency as a method of exchange. So you've probably heard of the blockchain. The blockchain is a specific technology which underpins cryptocurrencies. So cryptocurrency is more the currency itself and the general idea of a distributed currency without a central authority.

Starting point is 00:35:44 Blockchain is the specific technology that enables that to work. So what is a blockchain? Well, a blockchain, fundamentally, you can think of it as a database plus a set of code that allows it to work. That's sort of what it is, physically, if you like, or in terms of on a computer, it's a bunch of code and a data set.

Starting point is 00:36:02 But in terms of more conceptually, a blockchain is a distributed ledger. A ledger is just a list of transactions, and distributed means that there's no one single authority who maintains the ledger. So at your bank, there will be some kind of centralized ledger where the bank maintains a list of everyone's accounts and the transactions on those accounts and how much money you have in your account. And if there's any question about how much money you have or which transaction happened, you ask the bank because they maintain that ledger. That's, you know, stored on their servers somewhere, right? Hopefully stored securely.
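
As a rough picture of what a ledger is as data, here is a minimal Python sketch. The names and amounts are invented, and "bank" is just a placeholder for wherever the money first came from. The idea is that account balances aren't stored separately; they fall out of replaying the list of transactions.

```python
from collections import defaultdict

# Each entry is (sender, receiver, amount). "bank" stands in for deposits.
ledger = [
    ("bank", "alice", 100),
    ("alice", "bob", 30),
    ("bob", "carol", 10),
]

def balances(entries):
    """Replay every transaction in order to work out current balances."""
    totals = defaultdict(int)
    for sender, receiver, amount in entries:
        totals[sender] -= amount
        totals[receiver] += amount
    return dict(totals)

print(balances(ledger))
```

In the centralized case there is one such list and the bank owns it. The question for the rest of this section is what happens when many parties each hold their own copy.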

Starting point is 00:36:35 But a blockchain is different because it has a distributed ledger. So there's no one authority who maintains the ledger. Distributed doesn't just mean that there's no single copy. It means there's no single authority who decides whose version is the official one. That's not how it works. Instead, the ledger is maintained and updated in a distributed way. So many people have copies of the ledger. Effectively, this means copies of the code that underpins

Starting point is 00:37:09 that particular cryptocurrency. They have it running on their system. And each person who is kind of running the code for that particular cryptocurrency, that's called a node. So basically, you can think of that as a server, which has the code running the relevant software and is connected to other people also running that code. And the blockchain consists of all of these servers interacting with each other, all of these nodes running the code and communicating with each other. There will be some kind of consensus algorithm which specifies how it is that the network collectively decides whether a transaction is valid or not. And that's the sort of key technological insight or key innovation of cryptocurrencies. It's this distributed system and the consensus method,

Starting point is 00:37:54 which is used to determine whether a transaction is valid or not. Because if you have a centralized authority, then you don't need a consensus mechanism. The authority just says whether a transaction is valid or not, or what the balance is of any account on that system. But in a distributed ledger, there isn't a central authority. So instead you have this consensus mechanism. So there's a bunch of people running these nodes. It could be, depending on how big the cryptocurrency is,

Starting point is 00:38:18 there could be tens of thousands of people running these nodes, each of which has the same software running on it, and they each maintain their own version of the ledger, but no one version is like the official version. Each version is sort of equally official, and really what counts is the consensus mechanism that specifies, okay, which is the version that we're all going to accept?

Starting point is 00:38:37 I'll explain how that consensus mechanism works in a moment. But first, let's introduce a few more key terms that are important to know. So the blockchain is just this distributed ledger, plus the consensus mechanism that allows the network to decide whether to accept or reject a transaction. It's called a blockchain because it consists of a chain of blocks.

Starting point is 00:38:58 I know it's sort of a radical idea, right? A blockchain is a chain of blocks. But a block is really just a list of transactions. A list of transactions typically that occur near to each other in time. So they're sort of ordered, right? Depending on the currency in question, a block may contain a few dozen or maybe a few hundred transactions. It sort of depends.

Starting point is 00:39:18 So it's a relatively small number of transactions, it's not like millions or something. And these blocks are linked to each other or chained to each other using particular mathematical functions. So particularly it uses cryptographic hashes. So those are the hash functions that I talked about. I'll explain that in a bit more detail in a moment. But the important point here is to understand that a block, you know, the block of the blockchain is simply a list of transactions and, you know, the details of those transactions. Maybe that's 100 transactions, right? And these transactions typically occur, you know, close to each other in time.

Starting point is 00:39:48 They're grouped together into a block. And these blocks are then linked together mathematically, or cryptographically. The set of all of these blocks linked together in a chain is the blockchain. It's the distributed ledger. So blockchain and distributed ledger are sort of the same thing, and they consist of these blocks of, you know, say, 100 transactions, which are linked together in a certain way. The consensus mechanism specifies how to decide whether a transaction should be added to the blockchain or not. And so that's critical for determining whether, you know, I've spent my money or not. Well, if I engage in a transaction, then whether or not

Starting point is 00:40:22 that transaction is accepted as valid and added to the blockchain will determine whether or not I've actually spent my money. So it's all down to this consensus mechanism and whether a transaction is added to the blockchain or not. So let's talk about how the consensus mechanism actually works here and how blocks are constructed, because that's important to understand how this system is kept secure, so the sort of crypto part of it, and how abuses such as double spending, so spending the same money twice, and fake transactions are prevented. I'm going to start by talking about Bitcoin, which uses what's called a proof-of-work consensus method. And this is because it is the most prominent and the oldest cryptocurrency that uses this, and

Starting point is 00:41:05 most of the earliest cryptocurrencies were proof-of-work systems. There's been a gradual move towards proof-of-stake, which is an alternative system. I'll explain what those are in a moment. But just be aware that what I'm about to say doesn't apply to every cryptocurrency, but it does apply to Bitcoin and many of the other major ones. So in a proof-of-work system, the way in which the whole system is kept honest, effectively, is by relying on people performing difficult work, which is very hard to fake. Any type of system like this, which relies on consensus, must enforce some method of sending a signal which is sort of costly to send, because otherwise everyone would send a signal saying, oh, look, here's a transaction, or here's

Starting point is 00:41:45 a transaction, even if those transactions weren't real, right? So you have to have some way of protecting the system from people trying to scam each other or people trying to steal money. And that relies on sending some kind of costly signal. In a proof of work system, that costly signal is essentially computing mathematically difficult operations. And that requires buying the computer hardware and connecting it up to the network and crunching the numbers. That costs money to do and so it's difficult to fake. So that's why it's called proof of work because you actually have to do some work, some number crunching, some complex mathematics, and prove that you've done it. And that's expensive to do. So that's a disincentive towards

Starting point is 00:42:21 fraudulent behavior. We'll explain in a moment how exactly it works, but at least you can get the idea that if you have to do some work and prove that you've done the work, then there's some sort of cost that's built in there which helps to protect against people just flagrantly sending out fraudulent transactions whenever they feel like it. So now that we understand the basic idea of, well, you're going to have to do some sort of number crunching here to prove that the transaction is valid, let's explain specifically how it works on a proof of work system. So I have an account on the blockchain. Anyone can create an account, as long as you download the software, run it on your computer, and execute the command to create an account.

Starting point is 00:42:57 And let's say someone sends me some Bitcoin. I'm just going to use Bitcoin as a sort of stand-in for any cryptocurrency. Also note that if I say something like 10 Bitcoin, a single Bitcoin is worth many thousands of dollars. So you wouldn't generally use multiple bitcoins in transacting, because that's a lot of money. But I'm just going to use Bitcoin for simplicity here. So just don't get confused by that. So let's suppose I create an account and a friend sends me 10 Bitcoin. Now I want to use that Bitcoin to buy something, so how can I do that? I will first have to find the account number of the person I want to buy from, and just like in the RSA cryptography we discussed earlier, everyone on the blockchain will

Starting point is 00:43:39 have a public key and a private key. So you generate a private key when you create your account. That's in fact how you create an account. It's basically generating your private key, and you need to keep that secure because if anyone gets your private key, they effectively have your money. Once I've got my private key, I can use that to generate a public key using mathematical apparatus that we kind of discussed before, relying on modular arithmetic and one-way functions and all that stuff. I won't explain that again. But the basic idea is I generate a private key and use that to create a public key, and then I can share the public key with everyone. People can use that public key to send me money.

Starting point is 00:44:12 If I want to send someone money, I just find their public key, and then I can direct money to them by using the public key. So that part's easy. I can send money to someone or request someone to send me money using the public key, but people can only extract money from my account or transact from my account using my private key. So that's my number that I keep secure. Whenever I want to send a transaction to broadcast it to the network, I have to sign that transaction with a digital signature, and I do that with my private key. That's just the same as we discussed earlier with RSA cryptography. So this is kind of all the same as we've seen before. In RSA cryptography, you can use someone's public key to send a message that only that specific person can decode,

Starting point is 00:44:51 and you can digitally sign a message verifying that it's come from you using your private key. Everyone can verify that it's come from you using the public key that everyone knows, but no one else can fake that because you're the only one with your private key. And so that's how transactions are sent on the blockchain. Everyone signs their transactions using their private keys. Everyone checks using the corresponding public key that the signature is valid. And so everyone can do that on the blockchain because public keys are public and they're sent, you know, from node to node on the network, and so everyone can check

Starting point is 00:45:20 those. If you, for example, try to pay out of an account that you don't have the private key for, I mean, you could try that. You could try putting in the account number, like the public key of that account, and try to extract money from it, but it's not going to work because you're not going to be able to sign that transaction appropriately because you don't have their private key. And then when you send it to the software, it will just be rejected automatically by anyone who's running the software on their server, running a node, in other words. As soon as that transaction comes in, they'll check if it's valid. The software will essentially do this automatically, and it will find the signature on that transaction is not

Starting point is 00:45:57 valid, because it wasn't signed correctly, obviously, and it will just reject it. It will just say, no, not a valid transaction. So that's a fairly simple way already of ensuring that only the person with the private key is able to transact using that account, whereas anyone can send money to anyone else or also validate signatures from everyone else using their public keys. So that's all very nice. But what we need is an additional system for tracking all of the transactions together and also validating them as part of the blockchain. That's a little bit trickier. So what happens is let's suppose, again, I've got my 10 Bitcoin and I want to spend three Bitcoin to buy something. So this time, let's suppose that I sign the transaction correctly. So I use my private key to generate a

Starting point is 00:46:40 digital signature. I put in the account number of the person I want to pay the money to. The account number is pretty much the same as their public key. I mean, technically, they're not the same, but close enough to be the same thing for our purposes. So with all of that specified, I've written up the transaction. I send it off and it gets broadcast to the network using the software that I've installed on my system. Everyone who's running a server, so every node on the system, will then pick up this transaction, or at the moment, it's a candidate transaction because it hasn't been accepted yet, and they'll check that it's been signed correctly and that it meets all the standards and that there aren't mistakes and things like

Starting point is 00:47:13 that. That will be checked automatically. Once that's happened, it will then be kind of held in a pool of candidate transactions. Now this is the key thing. Everyone who's running a node will be collecting these candidate transactions that are transmitted around the network. Not everyone will receive every candidate transaction at the same time, or maybe not at all, depending on how things are configured. So every node in the network is kind of collecting up this list in real time of these candidate transactions. The reason they're doing this is because what everyone who's running the node is trying to do is produce the next block on the blockchain. Remember, there's no central authority who decides what the next block will be.
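
The sign, check and pool steps described above can be sketched in Python as a toy. To be clear about the assumptions: this uses a simplified RSA-style signature built from two well-known prime numbers so that it runs with only the standard library, which is nothing like production key generation, and real cryptocurrencies use their own signature schemes. The names `sign`, `is_valid`, `receive` and `candidate_pool` are made up for the illustration.

```python
import hashlib

# Toy key pair built from two known Mersenne primes. Illustration only.
P, Q = 2**61 - 1, 2**89 - 1
N = P * Q                                   # public modulus
E = 65537                                   # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))           # private exponent (needs Python 3.8+)

def digest(message: str) -> int:
    """Hash the transaction text and reduce it to a number below N."""
    return int.from_bytes(hashlib.sha256(message.encode()).digest(), "big") % N

def sign(message: str, private_exponent: int) -> int:
    """Only someone holding the private exponent can produce this number."""
    return pow(digest(message), private_exponent, N)

def is_valid(message: str, signature: int) -> bool:
    """Anyone can run this check using only the public numbers E and N."""
    return pow(signature, E, N) == digest(message)

candidate_pool = []

def receive(message: str, signature: int) -> bool:
    """What a node does with an incoming transaction: check it, then pool it."""
    if not is_valid(message, signature):
        return False
    candidate_pool.append(message)
    return True

tx = "James pays 3 BTC to Alice"
print(receive(tx, sign(tx, D)))             # signed with the right private key
print(receive(tx, sign(tx, D + 2)))         # signed with a wrong private key
print(receive("James pays 9 BTC to Alice", sign(tx, D)))  # altered after signing
print(candidate_pool)
```

Only the first transaction ends up in the pool. The other two are rejected by the check alone, without the node ever needing to see the private key.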

Starting point is 00:47:52 It's determined in a distributed way. So everyone who's running this node is collecting up these candidate transactions and kind of when there's enough of them, bundling them together to form a block, but then there's an extra key step to it. Because if you just had to bundle transactions together and said, hey, this is the new block, well, obviously there's a potential for abuse there, because how do we know that that's a valid block of transactions? How do we know that you've correctly validated them? If you're running the software correctly, of course, the software, let's say the Bitcoin software, will automatically verify everyone's signature. But maybe you're faking. Maybe you're lying when you say, oh, hey, I've verified these transactions, guys, let's add this

Starting point is 00:48:26 to the blockchain. There needs to be a way to check that and to prevent double spending of the same money and other things like that as well. So to do this, what happens is there's a sort of a rule that specifies whether or not a block will be accepted. And this is where proof of work comes in. So everyone who's running a node is trying to create the next block. And at the same time, everyone who's running a node is checking everyone else to see whether the next block has been created yet. How do they check? Well, they check by a similar method to how public and private keys work. The system is set up so that everyone who is trying to create the next block is competing to solve a mathematical puzzle. I'll explain a little bit more about that in a moment, but basically it relies on this one-way function again.

Starting point is 00:49:10 The puzzle is very easy to verify if someone has a solution. It's like factorization, right? If someone says A times B equals C, it's very easy to check that by just multiplying A and B and getting C, right? You either get C or you don't. Very easy to check. But it's much harder to work out what the solution is. If I tell you, here's C, what are the two numbers that multiply to give it, that's much harder, particularly if the number is large.
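
That gap between checking and solving shows up even in a tiny example. The sketch below uses two small primes chosen for illustration and a deliberately naive search (trial division); the function names are hypothetical.

```python
def check(a: int, b: int, c: int) -> bool:
    """Verifying a proposed answer is a single multiplication."""
    return a * b == c

def factorize(c: int):
    """Finding the answer by trial division: try every candidate in turn."""
    candidate = 2
    while candidate * candidate <= c:
        if c % candidate == 0:
            return candidate, c // candidate
        candidate += 1
    return 1, c  # no divisor found, so c is prime

c = 10007 * 10009
print(check(10007, 10009, c))   # one step
print(factorize(c))             # about ten thousand steps, even for this small c
```

Adding digits to the two primes makes the check barely slower, while this kind of search grows out of reach very quickly.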

Starting point is 00:49:33 So it's the same underlying logic here of the one-way function thing. There's a mathematical puzzle which everyone who is competing to produce the next block tries to solve. Once you've got a solution, you'll broadcast that to the network and everyone else will check it. If it's wrong, then they'll say, no, you haven't got a solution. That's not a valid new block. Try again. But if you do get a valid solution, everyone will check it and they'll quickly verify that, yes, this is the right solution. And then that will be accepted as the new block.

Starting point is 00:49:59 So that's how the new block is accepted. Everyone checks that this mathematical operation has been performed correctly, this mathematical puzzle, and then once someone's solved the puzzle, they'll broadcast that, everyone will see, oh, yep, this meets the criteria, this is a valid solution to the puzzle, and the new block will be accepted.

Starting point is 00:50:16 But in terms of how the block is actually generated, there are certain inputs that they have to take in, so the hash of the previous block, so the hash, remember, the hash function, you take in some input and produce a number as a result. You can think of the previous block hash as just basically like the ID of the previous block.

Starting point is 00:50:31 So you take in an ID of the previous block, you take in an ID of all of the transactions that you want to put in the new block, because remember, each block is just a bunch of transactions, and everyone's competing to include these new candidate transactions in a new block. So if I'm trying to construct a new block, I take in the hash of the previous block, everyone has to do that, I take in the hashes of the set of transactions that I'm trying to put in my new block, as well as other things like a timestamp. I put all these things together and I generate a new hash number using the hash function. Here's where the trick works. There's a certain rule that specifies whether that hash will be valid or not. Basically, whether it satisfies a certain equation. And most hashes won't satisfy the equation. Even if I've followed the steps correctly, even if I've used the previous block hash correctly

Starting point is 00:51:19 and if I've used a set of valid transactions correctly and I've used the timestamp and all this other stuff and I generate the hash for my new candidate block, chances are it still won't pass this extra test of does it satisfy this extra requirement? In which case, well, tough luck. It's not a valid block. I'll have to try again. Again, this might sound a bit strange because you think, well, can't I just compute what the hash of the new block should be? Well, no, no. The whole point is that it's hard to do this because you want it to be hard because you want someone to have to exert computational work in order to serve as a proof that they've actually processed the transactions. You want it to be hard to propose a new block because you want that to act as a disincentive to people maliciously or fraudulently introducing false transactions into the system.
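
To get a feel for "most hashes won't pass", here is a small Python experiment. The specific rule is an assumption for illustration: the hash, read as a number, has to fall below a target value, with the target set here so that roughly 1 hash in 256 qualifies. The candidate block contents are just made-up strings.

```python
import hashlib

# Assumed rule: the block hash, read as a number, must be below TARGET.
# This target accepts roughly 1 hash in 256; real networks are far stricter.
TARGET = 2**256 // 256

def block_hash(contents: str) -> int:
    """Hash some block contents and interpret the result as a big number."""
    return int.from_bytes(hashlib.sha256(contents.encode()).digest(), "big")

def meets_rule(contents: str) -> bool:
    return block_hash(contents) < TARGET

attempts = 20000
passed = sum(meets_rule(f"candidate block {i}") for i in range(attempts))
print(f"{passed} of {attempts} candidate blocks passed the extra test")
```

Every one of those candidates was hashed correctly. Nearly all of them simply landed above the target, which is the "tough luck, try again" case.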

Starting point is 00:52:05 So it's sort of designed so that it's difficult to generate a new block. And I'll explain in a moment how that adds to security. But hopefully it's at least a little bit intuitive that there needs to be some barrier to just proposing a new block to the system, because otherwise everyone will propose new blocks and it'd be hard to tell which ones were sort of valid or not. So this proof of work system is designed so that even if you follow the rules to compute the hash of the new block, probably it still won't be accepted, because you just sort of will get unlucky and your hash won't solve the extra puzzle. Remember I mentioned solving a puzzle? The block won't meet the criteria. And it's very difficult to do that. It's not like you can work out in advance how to do it. You kind of just have

Starting point is 00:52:43 to try and hope that you succeed. The problem is deliberately designed that way so that you can't work out what the hash should be for the block and then just propose that hash. That would defeat the whole purpose, but because it relies on one-way functions, you can propose a solution very, well, relatively easily, but probably it's not going to be the correct solution, and so you'll just have to try again. Eventually, by chance, one of these people who's running these nodes and trying to generate the new block will happen on the right solution, or a correct solution. There may be more than one, but they'll happen on a solution that solves the equation and is acceptable. And then they'll broadcast that to the system, say, hey, look, I found a block which is acceptable

Starting point is 00:53:21 and which is valid. And everyone will say, oh, yep, that's a valid block, and they'll accept it. And once the block has been accepted by enough other nodes on the network, then it becomes sort of official, as much as anything becomes official

Starting point is 00:53:35 in a distributed system, right? So it relies on this consensus effect of when other people who are running the software accept that your node has solved the puzzle and met the criteria, then they'll accept that as the next block in the blockchain, and then they'll add that onto their version of the ledger.

Starting point is 00:53:50 Until that point everyone's competing to get the next block added, but when someone solves the puzzle correctly and everyone else validates that yes, that's a correct solution, everyone who's running the node adds that onto the blockchain, and now that extra block becomes part of the distributed ledger and we continue then with a new block. So just to recap all the key things here. First of all, people have to send a valid transaction in the first place. If you haven't signed the transaction correctly with your private key, then it will just be rejected by the software out of hand. But even after that's happened, everyone who's running a node and trying to generate the new block has to correctly use the existing information in the right way. They have to use the hash of

Starting point is 00:54:35 the previous block so that the blocks are connected together in the blockchain. That's what ensures the security of all of the transactions over time. They're not just treated individually. They're linked to the previous transactions and to the next transactions through the block. So you have to use the previous block. You have to use the correct hashes for the transactions you want to include in it. You have to use the correct timestamp. And there's another number called a nonce as well, which is sort of like a one-off number that you use to add security. I won't try to explain exactly how that works. So there's a bunch of inputs that you have to use, including the previous block and the right set of transactions. You have to use those correctly.
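
Here is a rough sketch in Python of how those inputs tie the blocks together. The field names, the use of JSON and SHA-256, and the all-zeros hash for the first block are assumptions made for the illustration, not the layout any real cryptocurrency uses. The puzzle is left out here and the nonce is just a placeholder value.

```python
import hashlib
import json

def hash_block(block: dict) -> str:
    """Hash every field of the block except its own stored hash."""
    fields = {key: value for key, value in block.items() if key != "hash"}
    return hashlib.sha256(json.dumps(fields, sort_keys=True).encode()).hexdigest()

def make_block(previous_hash: str, transactions: list, timestamp: int, nonce: int) -> dict:
    """Bundle the inputs described above and record the block's own hash."""
    block = {
        "previous_hash": previous_hash,
        "transactions": transactions,
        "timestamp": timestamp,
        "nonce": nonce,
    }
    block["hash"] = hash_block(block)
    return block

def chain_is_consistent(chain: list) -> bool:
    """Every block must hash correctly and point at the block before it."""
    for i, block in enumerate(chain):
        if block["hash"] != hash_block(block):
            return False
        if i > 0 and block["previous_hash"] != chain[i - 1]["hash"]:
            return False
    return True

genesis = make_block("0" * 64, ["friend pays James 10"], 1, 0)
second = make_block(genesis["hash"], ["James pays Alice 3"], 2, 0)
chain = [genesis, second]
print(chain_is_consistent(chain))   # the untouched chain checks out

genesis["transactions"][0] = "friend pays James 1000"   # try to rewrite history
print(chain_is_consistent(chain))   # the altered block no longer matches its hash
```

Because each block's hash feeds into the next block, changing an old transaction would mean redoing every block that came after it.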

Starting point is 00:55:09 If you don't, you won't be able to produce a valid solution to the puzzle that's being posed by the system. But even if you do use those correctly, chances are you still won't get the right answer, just because it's a really difficult problem and there's no sort of way to just solve it. You have to just guess a solution and hope that it works. So this adds security to the system because, for example, if someone was trying to add a block which doesn't build off a previous block, then it wouldn't be possible for them to get that accepted because it wouldn't use the previous block hash correctly. So you have to build on an existing block as part of the blockchain. You can't just start from somewhere else and then pretend that you're just going to sneak in a whole bunch

Starting point is 00:55:46 of previous blocks into the blockchain. It doesn't work because you have to use the previous block hash in order to get your new block to be accepted. And double spending can also be prevented by using an iterating nonce or timestamp, which is a way to protect against the same account transacting the same Bitcoin at the same time. One of those will occur first and anything else will be rejected. So these inputs into the proof of work system are ways of preventing people from introducing false information into the system or deviating from what happened previously. But even then, and this is the key point, even if they've correctly constructed it,

Starting point is 00:56:20 they're still probably not going to generate a correct block because of the built-in difficulty of the problem, the mathematical puzzle that you have to solve with your candidate block. That requires just getting lucky, or you can increase your odds by running more of these servers. And this is what's called mining. This is what Bitcoin miners or other crypto miners are doing. Basically, they are running servers that have the code of, let's say, Bitcoin running on them, and that are constantly collecting candidate transactions from people who are transacting Bitcoin and trying to construct the next block using these candidate transactions.
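The guess-and-check nature of mining can be sketched in a few lines of Python. This is a toy version under stated assumptions: the header string, the "|" separator, the 16-bit difficulty and the helper names meets_target and mine are all made up for illustration, and real Bitcoin hashes a binary block header twice with SHA-256 against a far harder target.

```python
import hashlib


def meets_target(header: str, nonce: int, difficulty_bits: int) -> bool:
    """Check whether hashing header+nonce gives a number below the target."""
    digest = hashlib.sha256(f"{header}|{nonce}".encode("utf-8")).digest()
    target = 1 << (256 - difficulty_bits)  # more bits means a smaller target
    return int.from_bytes(digest, "big") < target


def mine(header: str, difficulty_bits: int, max_tries: int = 5_000_000):
    """Try nonces one by one; there is no shortcut other than guessing."""
    for nonce in range(max_tries):
        if meets_target(header, nonce, difficulty_bits):
            return nonce
    return None  # gave up without finding a solution


# Hypothetical candidate block contents, flattened into a string.
header = "prev_hash=abc123;tx=alice->bob:5;time=1700000000"
difficulty_bits = 16  # on average about 2**16 = 65,536 guesses

nonce = mine(header, difficulty_bits)
print("winning nonce:", nonce)
print("anyone can check it with one hash:", meets_target(header, nonce, difficulty_bits))
```

Finding the nonce takes tens of thousands of hashes on average at this toy difficulty, while checking a claimed solution takes a single hash. That asymmetry is what lets other nodes cheaply reject a block that does not solve the puzzle.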

Starting point is 00:56:54 They're trying to solve this puzzle, which everyone else is also trying to solve. So they're competing against each other. And this is called mining. I guess by analogy with like finding a diamond or something in a mine, you find something that's valuable, which other people are also looking for. When you find the solution, you find a candidate block that meets the criteria. Hooray, you celebrate because you'll get rewarded for that. So each time a new block is added to the blockchain, some new Bitcoin is created,

Starting point is 00:57:19 and that is given to the person who successfully generated that block. So that acts as an incentive for people to try to process these transactions and add blocks to the blockchain. This is necessary, of course, because people processing these transactions and validating them and ensuring that they're added correctly is what gives security to the system. So people are rewarded for doing that if they successfully produce the next block in the blockchain. But, of course, most times you try this, you fail. That's why it's difficult. That's why you have to prove that you've done the work by crunching a lot of these numbers and trying a lot of possibilities until you happen on the correct one. So that's one of the reasons why proof of work systems like Bitcoin are very computationally intensive, because there are thousands of people around the world running very expensive and very energy-intensive computational setups to try to crunch a lot of these numbers and basically to try to generate the next block.

Starting point is 00:58:08 The more computers you run, the higher the chances that one of the computers that you run gets lucky and gets the next block, and therefore you get the reward for that. So that's why there's so much electricity used and so much computational power that goes into these systems, because lots of people are competing for these financial rewards by generating the correct block. And this also adds security to the system because you have to prove that you've done the computational work, and that's expensive to do. And if you haven't done the work correctly, then someone else will easily identify that because you'll say, hey, here's a new candidate block, and they'll say, wait, no, that doesn't solve the riddle. That's not correct, and they'll reject it. Now, by that description of how the proof of work consensus mechanism works, you may have realized that there is kind of a way around this.
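The "more computers, better odds" point can be put as rough arithmetic. The Python sketch below assumes that each block is won independently with probability equal to your share of the network's total computational power, and it uses Bitcoin's target of one block about every ten minutes (roughly 144 a day), a figure not stated at this point in the episode. The function name is hypothetical.

```python
def chance_of_at_least_one_block(share: float, blocks: int) -> float:
    """Probability of winning one or more of `blocks` consecutive blocks,
    assuming each block is won independently with probability `share`."""
    if not 0.0 <= share <= 1.0:
        raise ValueError("share must be between 0 and 1")
    return 1.0 - (1.0 - share) ** blocks


BLOCKS_PER_DAY = 144  # assumption: one block roughly every ten minutes

for share in (0.0001, 0.001, 0.01):
    chance = chance_of_at_least_one_block(share, BLOCKS_PER_DAY)
    print(f"{share:.2%} of the network: {chance:.1%} chance of a block per day")
```

Under these assumptions, ten times the hardware gives close to ten times the expected number of blocks, which is why miners keep adding machines.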

Starting point is 00:58:52 So if I try to broadcast a fake block, a block with, say, fraudulent transactions in it, or that doesn't build on the previous block correctly or anything like this, I can try to broadcast that to the network, but everyone else would just reject it. They'll say, no, that's not a valid transaction. Okay, well, what if I get extra nodes? So instead of just running my one node, I get 10 extra nodes, and I install software on these nodes such that they will accept my fraudulent new block. So I'll send it out, and everyone else will say no, but my 10 nodes will say yes.

Starting point is 00:59:24 Well, that's still really not going to be enough because nearly everyone else is saying no. The fact that your 10 are saying yes, that's not sufficient. Well, what if instead of 10, it's 100 or 1,000, or more to the point, what if I get enough control over the network as a whole so that I have more than half of the computational power on the network. Then what I can do is I can introduce my fraudulent transactions or my fake transactions onto the network and bundled up in a block and say, hey, here's this block, and I can get more than half of the nodes on the network to accept the fake block because I control them. I can tell them what to do effectively. I can get them to accept what normally people would

Starting point is 01:00:02 reject. Using this method, this is called a 51% attack. It's a bit misconduct. It's a bit misconduct. misleading because you don't have to have 51% specifically. In fact, you don't even theoretically have to have more than half. It's just basically impossible that it will work unless you have at least half. But practically speaking, you might need a fair bit more than half to get it to work reliably. While your servers, your fraudulent servers are trying to create and propagate new fake blocks, there are legitimate servers that are following the rules trying to create new blocks as well. And so they might beat you at your own game, so to speak, and propagate a legitimate transaction while you're trying to propagate your fake transaction. So you have to make sure that you have enough nodes that are working sort of fast enough to get your fake ones out or your fraudulent transactions in your fraudulent blocks out before the legitimate servers can get theirs out.

Starting point is 01:00:50 So you may need more than 50% of the computational power. So this is a theoretical weakness of all proof of work blockchains, at least as far as I know. It is a weakness of Bitcoin. The main security against this is just that getting more than half of the computational power of the network is prohibitively expensive, because that will require you to buy more computer hardware and hook it all up to the network than everyone else combined who's running Bitcoin in the world. And that's not practical to do. It would cost many billions of dollars. I don't know if anyone's calculated how much it would cost. It's not practical to do that for a cryptocurrency as large as Bitcoin. However, it has happened for smaller cryptocurrencies before because for smaller cryptocurrencies,

Starting point is 01:01:32 there may not be that many people running nodes on the crypto, and therefore it might be more feasible for you to just outspend them all, even for a short period of time. And so that is still a concern: for smaller cryptocurrencies, 51% attacks may actually be feasible. But the idea is if the cryptocurrency becomes large enough, then a 51% attack becomes prohibitively expensive. And therefore, there'll always be more honest nodes in the system than dishonest ones, or at least if there are dishonest nodes they can't agree with each other. And so the trust is embodied in the idea that the majority of the nodes in the system

Starting point is 01:02:04 will be honest, at least enough of the time, so that whatever consensus emerges will be what is added to the blockchain. And so that's how trust is developed despite the fact that the individual parties don't trust each other. You can trust the consensus mechanism of the whole system and the process that's used to validate individual transactions and then to validate blocks of transactions that are added to the blockchain. One can try to introduce fake transactions or fake blocks that don't adhere to that,

Starting point is 01:02:31 but unless you have more than half of the computational power of the network, it's practically impossible to do that because your fake ones will just be rejected. They won't meet the standards, they won't solve the difficult mathematical problem that is required to accept a new block, and therefore they'll just be rejected. Now there's something else that you may have thought about. Well, what if I just take the existing system and just tweak the software a little bit to generate my own version of it, where it's Bitcoin, but I just give myself, you know, a thousand extra Bitcoins.

Starting point is 01:02:59 Now, you can do things like that. The trouble is that now you're not running on the same system as everyone else. You're running on a variant of it. And that's called a fork, or specifically a hard fork. So this creates a split of the cryptocurrency, where basically some people use the old version and some people run the new version. I mean, you can actually run both of them. But the point is, now that you're using different software,

Starting point is 01:03:18 you've basically created a new version of the currency. And that can and has happened. It's happened with Bitcoin and it's happened with other cryptocurrencies before, where they're split into two. There's nothing stopping you from doing that. You can fork any publicly accessible, whenever the code is accessible, which typically it is for a cryptocurrency, because that's sort of the point of it, you can always fork it. And you could have like, I could, I can make a James Bitcoin, which is my version of Bitcoin, which could have the entire existing chain of transactions up to now, but then going forwards, I could do effectively whatever I wanted, right,

Starting point is 01:03:49 if I set up the code in a certain way. The issue with that, of course, is, well, why would anyone care about your fork of the cryptocurrency? No one's going to accept James Bitcoins, but they may accept Bitcoins for payment, right? So you can always fork an existing crypto, but it's probably not going to be worth anything unless you can convince enough people to come with you on the fork. And that leads us nicely into a final summary of sort of pros and cons and criticisms and defenses of cryptocurrency. So we've talked a bit about how it works and how you have the system of the blockchain and the consensus mechanism and using private and public keys to try to protect people who use it and to ensure that fraudulent transactions or fraudulent blocks are rejected,

Starting point is 01:04:28 and that it is vulnerable to a 51% attack, but if the number of users is large enough, then those become very expensive and so very unlikely to happen. This allows transactions in a currency without any centralized authority, like a bank or a government, and also it allows for a form of anonymity because people don't know who owns any particular account. So there's no central authority that you have to show your ID to or something in order to set up a bank account, you can just do it. Anyone can install the software and do that and no one knows who you are. So it allows for anonymity and it allows for decentralization, which many people think is valuable because they don't trust central banks or they don't trust governments or they worry

Starting point is 01:05:06 about the power that these systems have and they don't necessarily want people knowing what they're transacting or how much money they have. So those are some of the major advantages of cryptocurrency. Another advantage is that the whole transaction record is actually stored publicly. And so anyone can look at the entire list of transactions that have happened. You know, ever since Bitcoin started, every single transaction that's been accepted is all listed there. You'll have to have the software to run that, and the size of the blockchain is quite large these days, obviously because there have been a lot of transactions, but it's all there and publicly available. So it's transparent in a sense, but it's also anonymous.

Starting point is 01:05:41 So it's a bit strange because everyone can see all of your transactions on the blockchain. They're not private in that way, but no one knows who you are unless you tell them, hey, this account is me, right? In which case they could see all your transactions. So that's why it's sometimes called pseudonymous, because everyone's transactions are visible, but no one can identify you with any particular account on the blockchain unless you go out and tell people. And also, people can set up as many private accounts as they like,

Starting point is 01:06:04 generating new private keys. That's effectively free to do because a private key is basically just a very long number, and there's plenty of numbers to go around, far more than could ever be used up. So anyone can really just generate as many private keys as they like and move money between their own accounts. And so there's not necessarily one person per account. It's a bit different from a traditional bank, where normally you'd have one or maybe a couple of accounts

Starting point is 01:06:25 because it's a bit costly to create an account. And I don't think the institution is going to let you have like a thousand different accounts. But for a cryptocurrency, that's not a problem at all. You could have as many accounts as you like. So those are some of the main advantages, the transparency, the anonymity, and the lack of having to trust a centralized authority.
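The remark that a private key is basically just a very long number can be shown directly. In Bitcoin, a private key is a number between 1 and just under 2^256, bounded by the order of the secp256k1 curve that Bitcoin uses. The Python sketch below draws such numbers with the standard library's secrets module. The function names are hypothetical, and this is not wallet software: it does not derive public keys or addresses, and it should not be used to hold real funds.

```python
import secrets

# Order of the secp256k1 curve group used by Bitcoin; valid private keys
# are the integers from 1 up to this value minus 1.
SECP256K1_ORDER = int(
    "FFFFFFFF" "FFFFFFFF" "FFFFFFFF" "FFFFFFFE"
    "BAAEDCE6" "AF48A03B" "BFD25E8C" "D0364141",
    16,
)


def new_private_key() -> int:
    """Draw a cryptographically random number in the valid key range."""
    return secrets.randbelow(SECP256K1_ORDER - 1) + 1


def as_hex(key: int) -> str:
    """Show the key as 64 hexadecimal digits (256 bits)."""
    return format(key, "064x")


key_a = new_private_key()
key_b = new_private_key()
print(as_hex(key_a))
print(as_hex(key_b))
print("possible keys: about 10 to the power of", len(str(SECP256K1_ORDER)) - 1)
```

Generating a key costs nothing beyond a call to a random number generator, and the range is so large that two people picking the same number by chance is not a practical concern.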

Starting point is 01:06:43 Some of the disadvantages or criticisms that have been levied at cryptocurrency are as follows. So one is the environmental cost. I've talked about the proof of work systems that require a lot of number crunching, and this requires a lot of computational power, which in turn uses a lot of electricity. So there's a large environmental cost to these in terms of greenhouse gas emissions and also use of rare minerals for creating the hardware itself.

Starting point is 01:07:08 In 2019, it was estimated that Bitcoin's electricity consumption was about 7 gigawatts, or about 0.2% of the global total, which is roughly equivalent to the energy consumed each year by Switzerland. So cryptocurrencies that use proof of work are not very environmentally friendly. It's quite expensive to validate each transaction because of the number of people competing to produce the next block. Effectively, most of the computational power is deliberately wasted because the problem is deliberately made hard in order to serve as a sort of a proof that you've actually done the work, right, as an incentive

Starting point is 01:07:41 for people to put that computational power into the system. But there's far more being used than is necessary, deliberately by design. And so that makes the system in some sense wasteful. Now, defenders would say, well, that's the whole point, right? It ensures that there's lots of people mining and that there's lots of computational power in the system. But critics argue that there are better ways to do this that aren't as environmentally costly in terms of the energy use. And so this leads to the alternative consensus mechanism, the major alternative one that I've mentioned earlier, called a proof-of-stake system. So the second largest cryptocurrency called Ethereum uses a proof of stake system. It moved only a year or two ago, actually,

Starting point is 01:08:17 so that's fairly recent. In a proof of stake system, there isn't any proof of work. There's no number crunching that you have to do. I mean, there's a little bit, but it's not very difficult. In a proof of work system, there's this deliberate system where even if you validated a block, it's still probably not going to be correct because you have to also solve this, you know, tricky mathematical problem, which is hard to solve and you basically just have to guess it lots and lots of times. So it's deliberately set up so that it's computationally difficult. But in a proof of stake system, they don't have any of that. In a proof of stake system, the way it works is that people who have lots of currency around, so in Ethereum it's ether, for example, people have

Starting point is 01:08:52 lots of ether can vouch for a new block by basically staking some of their ether against that block. So they'll say, I stake 10 ether that this new block is a valid block, and then everyone will check and see if it meets the criteria. And if it doesn't, if you've staked your ether on an invalid block, you lose some or all of it. I don't know the exact algorithm for determining that. But the point is it's a disincentive for staking against invalid blocks. And so once again, it still has the similar level of security as proof of work, because in order to successfully get people to accept something that was a block that was not valid,

Starting point is 01:09:27 you'd have to have a huge amount of ether. You'd have to have control over a very large proportion of the ether in the system, which is very unlikely. This method of proof of stake means that we don't have to have all of these servers competing against each other to number crunch this very difficult problem, and it avoids the computational cost of a proof of work system. I believe when Ethereum adopted proof of stake, it cut its energy consumption by something like 99.9%. So it's a massive, massive difference. So proof of stake can be a

Starting point is 01:09:59 mitigation of much of the environmental cost of cryptocurrency, although there are people, particularly some of the more hardcore Bitcoin enthusiasts, who insist that proof of work is better, and so it's quite unlikely that Bitcoin will ever switch, at least in my opinion. There are some other criticisms of cryptocurrency as well. So another one is wash trading. Wash trading is a practice where essentially you buy and sell from yourself, or at least parties that are closely associated with each other. You might wonder what the purpose of that is. Well, the purpose of it is basically to drive up the volume of transactions, say for a particular currency. And wash trading occurs in traditional financial markets as well. It's typically

Starting point is 01:10:34 illegal because it's a way of manipulating prices by making it look like prices are going up or down or that there's more of a volume of trading so people like, oh, this is an important asset and they might buy that and that pushes up the price. There's various sophisticated ways you can try to manipulate prices using wash trading. But it's typically illegal. It seems to be a major problem in crypto markets. So a study from 2019 found that 80% of all trades on unregulated crypto exchanges, and most of them aren't regulated, could be wash trades. I don't know that that statistic is super reliable, but at the same time, it does seem that it's quite, the number is quite high, particularly because of the anonymity of crypto, makes it difficult to prove who

Starting point is 01:11:11 owns the accounts that are transacting with each other. And so people can use that potentially to manipulate the price of different cryptocurrencies, especially for new cryptocurrencies, which might be trying to sort of make a place for themselves. If they artificially generate lots of trading, then people might think that it's sort of more valuable than it is. This is particularly an issue because the value of a currency often comes from how many people are willing to accept it. And so if there's lots of trading volume, you might think that lots of people are actually willing to accept the crypto, whereas in fact, maybe most of it's just fake. It's wash trading. So that's an issue that I think needs to be resolved, at least better than

Starting point is 01:11:43 it currently has been. Another problem is volatility. So cryptocurrency prices have proved to be very volatile, going up and down very considerably. So for example, over a single week in May of 2022, Bitcoin lost 20% of its value, and Ethereum lost 26% of its value, and smaller ones lost even more. And if you look at a graph of Bitcoin's price in US dollars over the years since it's been released, you'll see that it's been generally going up, but it's very volatile, much more volatile even than the stock market, which is already quite volatile. And so this is an issue if you want to use a currency as a store of value. One of the purposes of currency is that it's a way to store wealth.

Starting point is 01:12:20 And typically you don't want stores of value to be that volatile. Cash is somewhat volatile. There's always the risk of inflation, but inflation, in developed countries in particular that have a strong central bank, is typically not that high, and the value of the currency never really goes up and down as dramatically as it has in many cryptocurrencies. So that's another limitation of cryptocurrency, at least as a store of value. It also means that if you're trying to transact something, you don't necessarily know how much it costs. If the value of crypto can swing wildly one week to the next, it can be hard to predict, well, how much is this going

Starting point is 01:12:52 to cost for my business or for my holiday or whatever it is that you might want to spend, you know, paying in crypto. And that's a disadvantage of a currency if prices can vary quite dramatically. Some people have argued that this is a teething problem, that the volatility will go down as more and more people accept crypto and it becomes more standardly used, but we'll have to see. But that is, I think, a limitation for using it as an actual currency. If its value keeps varying a lot, it becomes hard to predict how much things cost and hard to use it as a reliable store of value. Another criticism of crypto is that it's widely used in organized crime, so drug trafficking, prostitution, weapons trading, things like that. Obviously, the fact that it's relatively anonymous

Starting point is 01:13:31 and not regulated much makes it easier to use for these things. It's pretty hard to find reliable numbers for how much of the transactions of crypto or, say, Bitcoin are criminal. It's obviously by nature sort of impossible to know that, but there is a concern that it seems to be a lot and that it's very difficult to deploy existing methods to deal with it. So many existing financial institutions have rules against transacting with organizations or with areas of the world, even where there is known to be a lot of organized crime or where there's large risk of payments being used to fund terrorism or poaching or human trafficking, drugs trafficking, things like that. But none of that really exists in crypto and it's sort of hard to enforce as well because many banks

Starting point is 01:14:16 have what are called know your customer laws, which means that the bank has to know something about who opens all the bank accounts so that it can actually exercise at least minimal due diligence over whether you're a criminal or not and whether you're using your account to do criminal activities. But cryptocurrency doesn't really have that, and it's not really clear how you could incorporate that into the system because the whole point is that it's anonymous and anyone can create an account whenever they want and anyone can transact with whoever they want whenever they want and there's no central authority. So the regulation has really lagged behind the technology here, and it's a bit hard to know how the systems could be designed to

Starting point is 01:14:47 incorporate those sort of checks into them. Maybe it will be figured out, but I don't know that we've worked that out yet. And so that's an ongoing concern, that the systems are really vulnerable to being used for criminal practices. A final problem with crypto, which is sort of an interesting one, is that it's quite susceptible to accidental loss. It's thought that a large proportion of all the Bitcoin that's existed has been lost irretrievably. Now, I forget the percentage, but I think it might have been like 5 to 10% of all Bitcoin. Because if the private key is lost, say it's stored on a hard drive which you accidentally throw away or which becomes corrupted, there's no way to retrieve that, and it's, for all intents and purposes, impossible to retrieve those tokens. And that can be an

Starting point is 01:15:29 issue, particularly if it's a large amount of Bitcoin, and it's also an issue from an individual point of view, because, you know, you've lost all of that money. Obviously, if you hold cash and you lose the cash, you know, there's no way to recover that. But if there's an accidental server outage on a bank account or something, you're not going to be liable for that loss, right, as an individual. There's protections there, and there's ways to recover that, because it's not dependent on any particular single number or something like that. Whereas there's no recourse really for cryptocurrencies because there's no central exchange. There's nothing to be done. If you've lost your private key, then the money's

Starting point is 01:16:04 irretrievable. End of story. And so that's a potential downside as well. You have to be really sure that you've securely stored your private key in a way that you won't lose it, but no one else will ever be able to get access to it either. And that's actually quite hard for a lot of people. If you think that a lot of ordinary people use very, very insecure passwords, even for very important things like their bank account, it's sort of hard to see many everyday people who aren't crypto enthusiasts taking very careful care of their private keys, which aren't even passwords, just very long numbers. So I think that's another barrier to widespread adoption as well.

Starting point is 01:16:39 Anyway, that concludes the little summary of some of the pros and cons and criticisms of cryptocurrency, and that concludes the episode as well. Hopefully you've found this informative. If you would like to get in touch with me, you can send me an email. My address is Fods12 at gmail.com. That's FODS12 at gmail.com. I'd encourage everyone to leave a positive review of the podcast on iTunes or Spotify or whatever aggregator you prefer to use.

Starting point is 01:17:02 That really helps to spread the word. If you would like to make a donation to support the podcast financially, you can find our Patreon page. Just type in Science of Everything podcast, Patreon, to get a link to that. Or you can make a one-off donation via PayPal to my address, Fods12 at gmail.com. Any financial contribution is much appreciated, and I'm grateful to all of my existing patrons for their support.

Starting point is 01:17:26 So that concludes this episode. Thanks very much for listening. I'll talk to you next time.
