Video - Bitcoin Wallet Encryption

In this video, Andreas Antonopoulos covers Elliptic Curve Crypto (ECC) & EC Digital Signature Algorithm (ECDSA), Key formats (hex, compressed, b58, b58check), Key types, Key mnemonic word list (BIP0039), and Key encryption (BIP0038).

TRANSCRIPT

Antonopoulos: I'm Andreas. I'm going to be talking today about keys and wallets. If you have questions, jump in and ask the questions at any point in time. I'm going to be leaving in about an hour. And then, those of you who have more knowledge are going to be answering questions for those of you who have more questions. This is a community after all, and I'm just not here to serve you all.

Today, we're going to be talking about keys and wallets. This is just a checklist of what I want to cover: Elliptic Curve Cryptography, generating keys, formats of keys, type 1 keys, type 2 keys, BIP 39, BIP 38, Shamir’s Secret Sharing system and paper wallets. We'll get to as much of that as we can. So, how many of you are here for the first time in this meet-up? Welcome.

All right. So, the meet-up works like this. Mondays, we do seminars, and Sundays we do hackathons. On Mondays, we have someone who can talk about a technical topic, present some information and do Q&A. You then go away and do your own study and research, coding and whatnot. And then on Sundays, we get together and we do a hackathon where we collaboratively code and exchange ideas or work on projects. And the idea is to figure out if the information that the person in front provided on Monday was actually correct.

Now, the correct answer is no. It wasn't. It was only partially correct. And during the hackathon, we found out exactly what bits the person got wrong. That will be me today, okay? So, I will get some things wrong. I'm going to do some of these from memory. The rest I'm going to be looking at my notes here. This is all the information that's available in various bits and pieces all over the Bitcoin Wiki. And I'm going to cover a number of different levels. If you have questions, jump in during the presentation and ask me the questions.

All right, so how many of you are familiar with Bitcoin technology at the moment? Okay, that's good. How many of you understand public key cryptography at a basic level? Okay. Who does not understand public key cryptography? It’s perfectly fine if you don't understand public key. Okay. Yes, how many of you understand Elliptic Curve Cryptography? I can’t do all the pretty pictures, going to run the equations. Okay, okay which is about as much as the settings as I have.

All right, so let's start with some basic terms. A Bitcoin wallet is not a wallet. It's a keychain. Bitcoin wallets have no coins. The coins are on the blockchain, on the network. The wallet has keys in it, which is why calling it a wallet was a monumentally stupid decision that will come to haunt us, because it's not a wallet, because it doesn't have coins in it. It only has keys. So, it's a keychain in every sense of the word. Both metaphorically it's a keychain and in terms of public key cryptography it's a public key keychain, all right? That’s what a wallet is.

Wallets are implemented in a number of different ways. The implementation of a wallet is application specific. Bitcoind implements a binary-file wallet in a very specific way, MultiBit a different way, Electrum wallets a different way, binary wallets a different way, Blockchain.info wallets a different way, okay? They all have the same basic information: key pairs, pairs of public and private keys, as well as metadata, corresponding addresses and human-readable things. Labels, all right. What we're going to talk about today is not the wallet formats. We're going to talk about the keys that are in them and how they work. Yes?

Male: Just to be clear, then wallets have nothing to do with Bitcoin?

Antonopoulos: Okay, so wallets have nothing to do with Bitcoin. In fact, none of the keys are stored on the Bitcoin network or on the blockchain. They are presented on the blockchain in order to create encumbrances. That means they're on the blockchain. What you will see is an amount of coins locked to a specific public key. And that is the only pairs that will happen on the blockchain.

And then you will see those same coins unlocked when they're used as inputs in transactions by a signature that exposes the public key. So, you will see both the public address and the public key of a key pair on the blockchain if it has been involved in a transaction. A really important thing to realize is you don't need to be online to generate keys. You don't need to be online to use those keys. You don't need to be online to sign something with those keys. All of those things can happen offline. They are mathematical functions that do not depend on any information that's in the blockchain. We’ll go into more detail on that. Does that answer your question?

Male: Yeah, sure.

Antonopoulos: All right, so a wallet has no coins. It's actually a keychain. The coins are on the network. The keys are on your hard drive and they are not on the network, although they sometimes appear on the network in order to signify ownership of an amount by locking and to signify proof of ownership when spent in a transaction. That’s the only time keys actually appear on the blockchain. So, within a wallet, you have two types of keys, private keys and public keys. These form a key pair. A private key is a number. It’s just a number. It’s a 256-bit number. All right, what is the best method of generating a random private key, anyone?

Male: Find a good source of entropy.

Antonopoulos: Find a good source of entropy, excellent, or dice, pencil and paper. What I want to explain by saying that is, you need no computers. You need no access to the internet. You need no blockchain. You need nothing other than dice or a coin, pencil and paper. How do I generate a private key with a coin, pencil and paper? I flip a coin 256 times. If it's heads up, I write one. If it's tails up, I write zero. At the end of that, I have a 256-bit random number evenly distributed across the entire search space of random numbers. That number is a private key, right? That's all you need.
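The coin-flip procedure can be sketched in a few lines of Python. This is only an illustration: the flips are simulated with the standard `secrets` module rather than a physical coin, and the function names `flip_coin` and `key_from_flips` are made up for this example.

```python
import secrets

def flip_coin() -> int:
    """Simulate one fair coin flip: 1 for heads, 0 for tails."""
    return secrets.randbits(1)

def key_from_flips(flips) -> int:
    """Pack 256 recorded flips (first flip = most significant bit) into one integer."""
    if len(flips) != 256:
        raise ValueError("need exactly 256 flips")
    value = 0
    for bit in flips:
        value = (value << 1) | bit
    return value

flips = [flip_coin() for _ in range(256)]
private_key = key_from_flips(flips)

# Show the same number as 64 hexadecimal digits.
print(format(private_key, "064x"))
```

With a real coin, the list of flips would simply be typed in by hand instead of simulated.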

And *00:07:48, how do you do this? *00:07:51, the most common form of doing this is to find a source of entropy. What does that mean? A source of entropy is a random number generator that has been properly seeded, and it is a random number generator that has specific properties. It is a cryptographically secure random number generator or CSRNG. What is the difference between a cryptographically secure random generator and one you made based on Introduction to Python articles you read in a book?

That one works. The one you try to hack together is why you got your money stolen. Random number generators are very difficult to do right. Don't try this at home. This is not DIY stuff. Actually figuring out how to create a random number generator that evenly distributes output across the entire possible space of numbers it's supposed to generate, is a hard problem. Really, really smart mathematicians do this work and they often do it wrong, all right.

So, I don't build random number generators, and I've been in cryptography for almost 20 years. Instead I rely upon industry established, properly peer-reviewed, cryptographically secure random number generators. So, the first lesson in this is, do not try to do this yourself. Find a proper cryptographically secure random number generator. Most operating systems have either a source of entropy or a library with cryptographically secure random number generators or both. For example in Java, SecureRandom, in Python, the SystemRandom class in the random library, etcetera, etcetera. These are properly approved cryptographically secure random number generators.
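As a hedged illustration of relying on an existing generator instead of building one, the Python standard library exposes the operating system's randomness through `secrets` and `random.SystemRandom`. The variable names below are arbitrary.

```python
import random
import secrets

# 32 bytes (256 bits) straight from the operating system's secure generator.
key_bytes = secrets.token_bytes(32)

# random.SystemRandom draws from the same OS source; the module-level
# functions such as random.random() do not and are unsuitable for keys.
system_rng = random.SystemRandom()
key_int = system_rng.getrandbits(256)

print(key_bytes.hex())
print(format(key_int, "064x"))
```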

The next important step is seeding that. A seed is the initialization of a cryptographically secure random number generator. That is a starting point of randomness. And then you create further randomness by putting it through this. That is usually sourced from the operating system. Operating systems have ways of producing randomness through user activity. You move the mouse, type keys, listen to the sound that's coming in from the microphone at such a gain that it's picking up cosmic radiation pops. You can do all kinds of things like that.

At Stanford, they have a long-running project that uses a webcam pointed at a lava lamp and looking at the little colourful bubbles. And that is a really good random source. Pure randomness is impossible to prove. You know, this is a topic of really difficult study. There is actually an XKCD comic if you read those, where they’re thinking -- or maybe it was Dilbert. I'm not sure. It might be Dilbert where they're taking them on a tour. And this is our Random Number Generation Department. And there's a guy and he goes six, six, six, six, six. They walk away.

And then Dilbert, I think observes well, given just a sample of six numbers that may in fact have been part of a larger random sequence. And that is scientifically correct. And that's why you'll never know a full generator is actually random. All right, so get one from the operating system, feed it into a cryptographically secure random generator, produce a random number. That random number will have an amount of entropy in it.

In cryptography, entropy is measured in bits. That means how many bits of randomness exist, or how big a bit sample can we pull from this that is still random, and if you try to pull more, then it's no longer random, right? So, you need to find a source that will give you at least 256 bits of randomness, preferably more. Start with a larger source of randomness. Maybe pull a couple of kilobytes of randomness. Here's the next trick. One of the most convenient ways to convert that into a very simple 256-bit number evenly distributed across the entire search space is to feed the output of that random number generator into SHA256.

SHA256 is a secure hashing algorithm that will take a random data stream and will convert it into a fingerprint, which happens to conveniently be 256 bits. So, if you take randomness, shove it through a message digest or a hash and come out with 256 bits, what you have is 256 bits of randomness, perfectly packaged, ready to turn into a key. The output of this is your private key. That's good. Start with a lower spectrum and compress it up, not good, right? When will create this *00:13:10, this is no longer 256-bits of entropy.
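A minimal sketch of that step, assuming `os.urandom` as the entropy source and two kilobytes as the amount pulled (both choices are illustrative, not prescribed by the talk):

```python
import hashlib
import os

# Pull more randomness than needed: 2048 bytes from the operating system.
entropy = os.urandom(2048)

# Compress it down to exactly 256 bits with SHA256.
digest = hashlib.sha256(entropy).digest()

# Interpret the 32-byte fingerprint as one big number: the candidate private key.
private_key = int.from_bytes(digest, "big")
print(format(private_key, "064x"))
```

Going the other way, hashing a short or guessable input, still yields 256 bits of output but not 256 bits of entropy.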

Male: Right, but if -- but -- but if you're feeding the SHA256 and it's scrambling, it's going to be -- still going to be pretty hard for someone to figure out where the most of the randomness are.

Antonopoulos: Pretty hard, impossible.

Male: Okay.

Antonopoulos: We're going for impossible. Private keys, putting money on them, you're absolutely right. It’s good in a pinch but that's not the right way to do it.

Male: Okay.

Antonopoulos: This process is the correct process for generating a private key. So, it depends on the operating system. On Linux, for example, if you're pulling from /dev/random, what you're pulling is actually one of these in a box. So, with a properly implemented operating system, if you can rely on it, a specific source of cryptographically secure randomness, not /dev/random, /dev/urandom, which is entropy controlled and will not give you more bits than it actually has entropy, is this. Now, if you pass it through this twice, it's a bit of extra work but just to make sure, do it this way. All right, so now we've got a 256-bit number. Yes?

Male: Is it a full space that *00:14:29?

Antonopoulos: It's not exactly a full space because this is done over a prime field. Let *00:14:38 prime field of order P equals 2 to the 256 minus 2 to the 32 minus 2 to the 9 minus 2 to the 8 minus 2 to the 6 minus 2 to the 4 minus 1, something like that. That's the prime number used to initialize the elliptic curve. That is the upper limit of the elliptic curve. So, it's actually -- this is slightly less than 2 to the 256. The chance of you actually randomly falling into this relatively tiny space is low, but yes, you're right. So, the constraints are -- now the nice thing is that you'll know, because what happens is in the very next step, when you try to validate this by producing the equivalent public key, it will fail because it's outside the field of the elliptic curve. Yes?
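For reference, the published secp256k1 parameters can be written out and checked directly. The standard prime also includes a 2 to the 7 term, and in practice a private key is required to be between 1 and the curve order n minus 1, where n is slightly smaller than p. The helper name `is_valid_private_key` is an assumption for this sketch.

```python
# Field prime of secp256k1, as published in the curve's standard parameters.
P = 2**256 - 2**32 - 2**9 - 2**8 - 2**7 - 2**6 - 2**4 - 1

# Order of the generator point G: the number of distinct points G can reach.
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141

def is_valid_private_key(k: int) -> bool:
    """A usable private key is an integer in the range 1 .. N-1."""
    return 1 <= k < N

# The excluded sliver at the top of the 256-bit range is tiny.
print("excluded values above N-1:", 2**256 - N)
print("as a fraction of all 256-bit numbers:", (2**256 - N) / 2**256)
```

A random 256-bit number that fails this check is simply discarded and a new one is drawn.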

Male: Real quickly to see this microphone noise *00:15:40 randomness?

Antonopoulos: Yes, it's just an initialization factor to ensure that you're not always starting from the same random number sequence. Yeah. All right, any other questions? All right, moving on. So now, we take this private key and we can represent it in a number of different formats. When you look at a private key, you may see it looking different ways, and that is confusing at first but it's all the same thing. It's all the same number. It can be represented in a number of different formats. If I take this number and I show it to you in binary, hex or octal, it's still the same number, right? If I put a checksum on the end and a version byte at the beginning so that it's difficult to accidentally mistype it, it's still the same number but it appears different in every one of these formats.

The three formats we're going to look at very quickly are hex, Base 58 check, also known as WIF in this case, and Base 58 check compressed, or rather WIF compressed. I'll write it down again. WIF, which is Base 58 check, and WIF compressed, which is also Base 58 check with one slight difference. All right, so hexadecimal. A 256-bit key in hexadecimal will give you a 64-digit hex number, since each hexadecimal digit is 4 bits. So, you know, 0FAB1122111F whatever, okay. And it will look like that, that's the pure hexadecimal representation of this full 256-bit key. Now, normally you don’t use it that way. If you see a private key in Bitcoin, *00:17:44. Yes?

Male: *00:17:50

Antonopoulos: Well, that's the --

Male: It always starts with a 5.

Antonopoulos: It always starts with a 5, yes, indeed it always starts with a 5. Why does it start with a 5? Because it’s using a format called Base 58 check. Base 58 check is the same format that makes every Bitcoin address start with a 1, that makes every Litecoin address start with an L, that makes *00:18:10 address start with an N, and most importantly that makes every Dogecoin address start with a D. All right, same thing, what you're seeing in all of these addresses is the version number, and the version number is different for each of these encodings. So, in WIF, the version is decimal 128 or hexadecimal 80, right? Yes?

Male: Does the 5 correspond with *00:18:50 algorithm you should get 5 or --

Antonopoulos: Yes.

Male: Okay.

Antonopoulos: The 5 corresponds to this.

Male: Okay.

Antonopoulos: So, decimal 128, hex 80, right, okay. So, what this does now is you append this, or rather you prepend this, sorry, to the beginning of the key. So, you take the hex, now add to the beginning of it the version number. All right, that's in hex again, right? And then you add to the end 32 bits, which is a 32-bit checksum. This is produced by double hashing this part. You take this part, version plus hex, pass this through SHA256 two times and then you take the first eight hexadecimal digits of the result, and you append them, and that's the checksum. That makes sure that you can’t mistype this address by accident, right, just a standard checksum, right?
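The wrapping step, before any Base 58 conversion, might look like the sketch below. Bitcoin's Base 58 check takes the first four bytes (eight hex digits) of the double SHA256 as the checksum. The key value is an arbitrary example and the function names are made up for illustration.

```python
import hashlib

def double_sha256(data: bytes) -> bytes:
    """SHA256 applied twice, as used for Bitcoin checksums."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()

def wif_payload(private_key_hex: str) -> bytes:
    """Return version byte + 32-byte key + 4-byte checksum (not yet Base 58)."""
    versioned = b"\x80" + bytes.fromhex(private_key_hex)  # 0x80 = decimal 128
    checksum = double_sha256(versioned)[:4]               # first 32 bits
    return versioned + checksum

# Arbitrary example key, 64 hex digits.
key_hex = "0c28fca386c7a227600b2fe50b7cae11ec86d3bf1fbe471be89827e19d72aa1d"
payload = wif_payload(key_hex)
print(payload.hex())
```

The result is 37 bytes: 1 for the version, 32 for the key and 4 for the checksum.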

Base 58 is implemented in every Bitcoin library. Base 58 check, which is Base 58 with a checksum, is also implemented in every Bitcoin library, *00:20:45 Python in three, four lines, pretty easy to implement, right? So, if you want to go out and look, look at your favourite programming language, go to GitHub, find one of the projects that implements Bitcoin tools in that language. In C, it’s picocoin, in Java, it's bitcoinj, and in Python, pybitcointools or pycoin. Use one of these languages, go and check and find in the source code the Base 58, Base 58 check and see how it's implemented. You will understand it's pretty simple.

What this does is you take this hex address and wrap it by prefixing it, and then putting a checksum at the end, and then you take all of this and you convert it into the Base 58 alphabet. Who's familiar with the Base 58 alphabet? Everybody, okay, who's not familiar with Base 58? Okay, so the Base 58 alphabet is taking all lowercase letters, 26, plus all capital letters, 26, that's -- that's 50 --

Male: 52.

Antonopoulos: 2.

Male: 0 through 9.

Antonopoulos: Plus 0 through 9 which is 10, gives us 62, and then you take out 4 characters, and the characters you take out are *00:22:08, letters and digits: the digit 0, capital O, capital I and lowercase l, right? So, you take out the characters that are confusing. L's look like ones, O's look like 0's, so you take out the O's, you take out the 0's, you take out the L's and what you're left with is 58 remaining characters. And then you just use those as an encoding scheme. And it's like Base 64 only it's missing some things that can be confusing, make sense? Base 58, it's really simple. Again, if you're implementing this in code, you write all the characters in an array, and then you index that array by the value you're trying to calculate, right, and you just use the subscript to index that array to find the correct character.
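The array-indexing approach just described can be sketched as follows. The alphabet is the one Bitcoin uses (digits and letters with 0, O, I and l left out); the function name `base58_encode` is an assumption for this example.

```python
# 9 digits + 24 capitals + 25 lowercase letters = 58 characters.
ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"

def base58_encode(data: bytes) -> str:
    """Encode bytes by repeatedly dividing by 58 and indexing the alphabet."""
    number = int.from_bytes(data, "big")
    encoded = ""
    while number > 0:
        number, remainder = divmod(number, 58)
        encoded = ALPHABET[remainder] + encoded
    # Leading zero bytes would vanish in the division, so each one is
    # written explicitly as the first alphabet character, '1'.
    leading_zeros = len(data) - len(data.lstrip(b"\x00"))
    return "1" * leading_zeros + encoded

print(base58_encode(bytes([57])))  # last character of the alphabet
print(base58_encode(bytes([58])))  # rolls over to two characters
```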

Male: Is it because Satoshi didn't like the O's and 0's and the 1’s and the L's?

Antonopoulos: Sorry?

Male: Satoshi -- Satoshi didn't like the 1’s and the 0's and the L's or is it confused?

Antonopoulos: The problem is when you're reading off paper and you try to type it in, these are things that people often confuse, like in passwords. It's the same reason why you don't have them in license plates on cars.

Male: Yeah.

Antonopoulos: Right. There were no Q's and O's and numbers like that.

Male: Yes.

Male: I have a question. Why is -- why is -- why didn’t you just double hash?

Antonopoulos: Pretty much all of the places in Bitcoin, when you see this, we do a double hash. It's just harder than doing the single hash and it provides an extra layer of randomness. What that means, by the way, is that the reason we use a double hash with a checksum is because we already have a double hash function in the libraries and use it elsewhere. The double hash is also used in generating public addresses, which we'll see in a bit. And that's so that they are not reversible. If you have a Bitcoin address, you cannot even tell the public key, let alone the private key, which we'll talk about in a second. All right, so we take the key, because this is a hex number. We stick on a prefix, we stick on a checksum, we convert it to a double -- sorry, we stick on a checksum based on double hashing, we convert it to Base 58, and what comes out?

Male: A long string that starts with five.

Antonopoulos: A long string that starts with five X, X, X, X, X, blah, blah, blah, blah, and that's a private key encoded in Base 58 check encoding, right? Yes, everybody got this? So, two types of private keys, one of them just has an extra 01 on the end and a different prefix, K or L, that means it's a compressed private key. Otherwise, all private keys start with a five unless they're just in hex notation, in which case they are 64 digits in hex that start with whatever it starts with, this is a random number, yeah. So far, so good.

Does everybody understand why we use Base 58 check? You cannot mistype one of these without noticing. It provides error detection within the algorithm. This is very important if you encode this in a QR code. The software can detect if there's been a misscan. The software can detect if there's been a mistype because it will be an invalid address. It’s exactly the same reason why the last digit of your credit card number is actually a checksum based on the previous digits, so the credit card number can be identified as incorrect without even having to look up anything in the database, yeah.
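To make the error-detection point concrete, here is a self-contained sketch that encodes a key as WIF and WIF compressed, then shows a single mistyped character being rejected. The key is an arbitrary example, and `b58check_encode`, `b58check_decode` and `is_valid` are illustrative names rather than any particular library's API.

```python
import hashlib

ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"

def double_sha256(data: bytes) -> bytes:
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()

def b58check_encode(version: bytes, payload: bytes) -> str:
    """Prefix the version, append a 4-byte checksum, convert to Base 58."""
    raw = version + payload
    raw += double_sha256(raw)[:4]
    number = int.from_bytes(raw, "big")
    text = ""
    while number:
        number, rem = divmod(number, 58)
        text = ALPHABET[rem] + text
    zeros = len(raw) - len(raw.lstrip(b"\x00"))
    return "1" * zeros + text

def b58check_decode(text: str) -> bytes:
    """Return version + payload; raise ValueError on a bad character or checksum."""
    number = 0
    for ch in text:
        number = number * 58 + ALPHABET.index(ch)  # ValueError if ch is illegal
    zeros = len(text) - len(text.lstrip("1"))
    raw = b"\x00" * zeros + number.to_bytes((number.bit_length() + 7) // 8, "big")
    data, checksum = raw[:-4], raw[-4:]
    if double_sha256(data)[:4] != checksum:
        raise ValueError("checksum mismatch")
    return data

def is_valid(text: str) -> bool:
    try:
        b58check_decode(text)
        return True
    except ValueError:
        return False

key = bytes.fromhex("0c28fca386c7a227600b2fe50b7cae11ec86d3bf1fbe471be89827e19d72aa1d")
wif = b58check_encode(b"\x80", key)                     # starts with 5
wif_compressed = b58check_encode(b"\x80", key + b"\x01")  # starts with K or L

# Simulate a typo: change one character in the middle of the string.
typo = wif[:10] + ("2" if wif[10] != "2" else "3") + wif[11:]

print(wif, is_valid(wif))
print(wif_compressed, is_valid(wif_compressed))
print(typo, is_valid(typo))
```

The checksum only detects the mistake. It does not say which character is wrong or how to correct it.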

All right, and since a private key is just a random number, there's no other way to tell if it's wrong, especially if it's just hex. So, you would need a checksum. All right, so we've got private keys. Moving on. We’re taking a private key and we're going to convert it into a public key through a process of multiplication. Now, this is some weird-ass multiplication because we're going to do this on an elliptic curve over a prime field. And that's a lot of fancy math for jumping around from dot to dot on a very big piece of paper.

You start off with an elliptic curve. The elliptic curve we use is something like this and it's Y squared equals X cubed plus 7. It’s the secp256k1 Koblitz curve. It’s a standard curve used in elliptic curve cryptography inside Bitcoin but basically it looks like this. The process of adding two numbers on an elliptic curve can be geometrically represented by drawing a line from one point through the curve until it intersects, creating another point on the curve. So, P, Q, Q Prime and the points are reflected across the curve. And multiplication is equivalent to drawing a line, or rather addition is equivalent to drawing a line.

Multiplication by a number means addition many times. So, it's drawing a line that bounces back and forth across the curve in different places each time. I’d like to create an analogy for you, which is like this. You've got a billiard table, right? I put down the white ball and then I hit it. If I hit it exactly the same way, it's going to bounce perhaps a thousand times and it's going to end up in exactly the same position. After it's bounced a thousand times and landed in a position, you have no way of figuring out how it bounced and what angle it did. But I can redo that every single time. So, I can hit the ball and it will land in the exact same spot but you cannot figure out the path just by looking at where it landed, make sense? That's an irreversible -- mathematically irreversible function. That's what we're doing on an elliptic curve.

Another example in real life, if I take the colour yellow and the colour blue and I mix them together, I create a shade of green. I can create the same shade of green from those two colours every single time. There is no way you can reverse engineer from the shade of green which precise yellow I might have used, right? That's a one-way function. Mixing colours easy, un-mixing colours impossible, right? In mathematics, that's the process of factorization, discrete log or exponent *00:28:58 raising something to a power mod P versus trying to *00:29:02 reverse is basically a public key.

The elliptic curve doesn't actually look like this because it's on a field of primes. And what that means is that what the elliptic curve looks like is a series of dots, specifically about 10 to the 77 dots, 2 to the 256 dots, half of that above the axis, reflected half of that below the axis. The process of doing multiplication on this field is like doing this, right? Only you're doing it with a number that's 256 bits long, so you make 256 bits worth of these little jumps and you land somewhere, and where you land is your public key. And there's no way to work that out backwards. That's what it looks like but there's an easy way of writing it.
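The "series of dots" picture is easier to see on a toy-sized field. The sketch below lists every point of the same equation, Y squared equals X cubed plus 7, over the integers modulo 17. The prime 17 is an assumption chosen purely so the dots fit on screen; the real curve uses a 256-bit prime.

```python
# Toy field size; the real secp256k1 prime is about 2**256.
P = 17

# Every (x, y) pair that satisfies y^2 = x^3 + 7 (mod P).
points = [
    (x, y)
    for x in range(P)
    for y in range(P)
    if (y * y - (x**3 + 7)) % P == 0
]

for point in points:
    print(point)

# Each dot has a mirror image: if (x, y) is on the curve, so is (x, P - y).
mirrored = all((x, (P - y) % P) in points for (x, y) in points)
print("symmetric:", mirrored)
```

On a prime field "above" and "below" the axis become y and P minus y, which is why each x value that appears at all appears with a mirrored pair (or with y equal to 0).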

The way to write it is this. So, that's a private key times the constant. This is called the generator point. It’s a fixed point within the Bitcoin ecosystem. G is always the same for everybody. You just multiply this by the number K, which is your private key which you got here. And that produces your public key. This is a point on the curve. This is a point on the curve. This is simply a multiplier. This is not a point on the curve, okay? An even simpler way to write this is, what you're doing is, you're taking the generator point and you're adding it to itself K times, where K is your private key, and that causes it to bounce all around the elliptic curve until it lands on that point which is your public key, okay?

Now in programming, what you're going to do is, you're going to have a library. It’s called something like EC multiply. And you're going to have a point that's called G, and you're literally going to write EC multiply K comma G, and the result you're going to get is going to be your public key. So, you don't need to worry exactly about how this works. So, it is a multiplication over a group defined on an elliptic curve modulo a prime number, a very large prime number. Yes?
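For readers who do want to see how such a function could work, here is a minimal pure-Python sketch using the published secp256k1 constants. It is written for clarity rather than speed or side-channel safety, the names `ec_add` and `ec_multiply` are illustrative, and real wallets rely on audited libraries instead. Adding G to itself K times one by one would be hopeless for a 256-bit K, so the sketch uses the usual double-and-add shortcut, which reaches the same point in about 256 steps.

```python
# Published secp256k1 domain parameters.
P = 2**256 - 2**32 - 977
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (
    0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
    0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8,
)

def ec_add(a, b):
    """Add two curve points. None represents the point at infinity."""
    if a is None:
        return b
    if b is None:
        return a
    x1, y1 = a
    x2, y2 = b
    if x1 == x2 and (y1 + y2) % P == 0:
        return None  # a point plus its mirror image
    if a == b:
        # Tangent slope; division is multiplication by a modular inverse.
        slope = 3 * x1 * x1 * pow(2 * y1 % P, P - 2, P) % P
    else:
        slope = (y2 - y1) * pow((x2 - x1) % P, P - 2, P) % P
    x3 = (slope * slope - x1 - x2) % P
    y3 = (slope * (x1 - x3) - y1) % P
    return (x3, y3)

def ec_multiply(k, point):
    """Compute k * point by double-and-add over the bits of k."""
    result = None
    addend = point
    while k:
        if k & 1:
            result = ec_add(result, addend)
        addend = ec_add(addend, addend)
        k >>= 1
    return result

private_key = 12345  # toy value for illustration only
public_key = ec_multiply(private_key, G)
print(format(public_key[0], "064x"))
print(format(public_key[1], "064x"))
```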

Male: Is that one time *00:31:54 application linear with a private key?

Antonopoulos: Excellent question, really good question. The multiplication is actually a very low-cost operation. Koblitz curves, which is the one we use here, have some efficiencies built in. You can do about 70,000 multiplications, not the individual steps, the full multiplication. You can generate 70,000 keys, public keys, per second on your average laptop, right? To break one will take you 70,000 trillion years but you can generate 70,000 within a second, okay? So, this is a very, very inexpensive operation. So, you do this and you get a public key. Now, a public key is a point.

Male: I have a question.

Antonopoulos: Yes.

Male: Is it, is this one required to use G or can you use anything?

Antonopoulos: No, it is required to use G. That is the definition of how you do private keys for the elliptic curve digital signature algorithm according to the Standards for Efficient Cryptography Group, SECG, that defined the secp256k1 curve. So, this is based on an international standard.

Male: Is the G of the Bitcoin network is the same across the network?

Antonopoulos: The G of the Bitcoin network is actually the G of this curve in the SECG documents.

Male: Right, okay.

Antonopoulos: So, it's even broader than just the Bitcoin network.

Male: Okay.

Antonopoulos: Yes?

Male: And the choice of G is also quite well defined on the *00:33:24?

Antonopoulos: The choice of G is a very large prime as well.

Male: Yeah and I thought about *00:33:27, I guess or --

Antonopoulos: It just have to be a large one.

Male: Okay, just tell *00:33:31.

Antonopoulos: It really doesn't matter. There are a few points *00:33:35 that multiple points of five.

Male: Just to be clear --

Antonopoulos: As far as I understand, I mean, my knowledge of elliptic curve has almost been exhausted. We're probably two questions away from exhaustion, yeah.

Male: So, just to be clear if we use a different G in your -- in your algorithm, it's not going to work?

Antonopoulos: It's not going to work as nobody else would be able to recreate that, and they won’t be able to validate signatures that you create based on that. All right, it is going to work in terms if you're going to produce a point.

Male: Yeah.

Antonopoulos: It just won't be a point that anybody would be able to use. All right, so now you've got a public key and that's defined as two coordinates, X and Y, in the elliptic curve field, right? For simplicity’s sake, think of these coordinates as just two big numbers, right? What we do now is take these two big numbers and write them down in hex as follows. So, the hex representation of a public key is 04XY. This is 32 bytes and this is 32 bytes, which produces a 130-character long, 128 plus 2, 130-character long hexadecimal number that starts with 04. By the way, you can try all of these out. One of the easiest ways to do this is go to bitaddress.org, right? Go to the last tab of bitaddress.org that says wallet details, paste the private key in there and you're going to see the private key as hex, as Base 64, as Base 58 check encoded, as a Base 58 check encoded compressed key. Then you're going to see the corresponding public key as hex starting with 04, as a Base 58 check encoded public address starting with 1, etcetera, etcetera, right, and a compressed point.
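The 04XY layout is plain string formatting once the two coordinates are known. As a stand-in public key, the sketch below uses the generator point G itself, which is the public key for the private key 1. The function name is an assumption for this example.

```python
# Coordinates of the secp256k1 generator point G (public key of private key 1).
X = 0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798
Y = 0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8

def serialize_uncompressed(x: int, y: int) -> str:
    """'04' marker, then X and Y each padded to 64 hex digits (32 bytes)."""
    return "04" + format(x, "064x") + format(y, "064x")

public_key_hex = serialize_uncompressed(X, Y)
print(public_key_hex)
print(len(public_key_hex), "hex characters")
```

The zero padding matters: a coordinate that happens to start with zero bits must still occupy the full 32 bytes.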

Male: Is it a corporate address?

Antonopoulos: Bit address or *00:35:45 bits of software. You can do paper wallets and all types of other things, and it does everything in JavaScript in your browser. So, please have a look at it in your browser. All right, any questions so far? All right, so if you -- if you take this and then you create a Base 58 check encoded -- sorry, this is the public key. Now from this, we create an address and that address is created through an irreversible process of hashing with two different hashing algorithms. So, the first one is RIPEMD160 and the second one is SHA256. Yeah, one, two, we get another piece of hex, yes, so far, so good. We take that piece of hex here and we put on a prefix of 00 and the checksum, and we produce a Base 58 check encoding of this that starts with a 1, and that's your Bitcoin address.

Male: Can we just flip the SHA256 and RIPEMD the --

Antonopoulos: Is it the other way around?

Male: Yeah.

Antonopoulos: The magic of the whiteboards, those are reversed. So, first you SHA it, then you RIPEMD it. Thank you.
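With the order corrected (SHA256 first, then RIPEMD160), the address derivation can be sketched as below. Two caveats: `hashlib.new("ripemd160")` depends on the underlying OpenSSL build and may be unavailable on some systems, and the function names are made up for this illustration. The sample 20-byte hash is the widely published hash of the compressed public key for private key 1.

```python
import hashlib

ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"

def double_sha256(data: bytes) -> bytes:
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()

def hash160(public_key: bytes) -> bytes:
    """SHA256 first, then RIPEMD160 of that result: 20 bytes out."""
    sha = hashlib.sha256(public_key).digest()
    ripemd = hashlib.new("ripemd160")  # may raise ValueError on some OpenSSL builds
    ripemd.update(sha)
    return ripemd.digest()

def address_from_hash160(h160: bytes) -> str:
    """Prefix version 00, append the checksum, convert to Base 58."""
    raw = b"\x00" + h160
    raw += double_sha256(raw)[:4]
    number = int.from_bytes(raw, "big")
    text = ""
    while number:
        number, rem = divmod(number, 58)
        text = ALPHABET[rem] + text
    zeros = len(raw) - len(raw.lstrip(b"\x00"))
    return "1" * zeros + text  # the 00 version byte becomes the leading '1'

def address_from_public_key(public_key: bytes) -> str:
    return address_from_hash160(hash160(public_key))

sample_hash = bytes.fromhex("751e76e8199196d454941c45d1b3a323f1433bd6")
print(address_from_hash160(sample_hash))
```

The leading 1 of every such address is simply the version byte 00 showing through the Base 58 conversion.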

Male: Just write this.

Antonopoulos: He's the one who's going to be answering questions after I leave. Yes?

Male: What’s going to be *00:37:52?

Antonopoulos: Right.

Male: Start with a bigger one and then *00:37:56?

Antonopoulos: That makes a lot of sense. All right, so what you're left with is a 34-character ones prefix, then you Base 58 check encode that, okay. I mean, leave some of this and you got this, all right.

Male: I think technically, the address is not all *00:38:26?

Antonopoulos: The address is not the public key. In fact, if you have an address, you cannot figure out what the public key is. There's no way you can go this way. However, if you can find a signature on the blockchain made for this public address, then it reveals the public key. So, when you ask the software to give you the public key that corresponds to a Bitcoin address, what it does is it goes on the blockchain and tries to find a signature that matches this, and from that signature finds the public key. So, it's only if you make a signature that you've revealed your public key. Until that moment, all you've given out is an address which is irreversible. Yes?

Male: Why don't we use the public key as an address? What's the purpose of that existing?

Antonopoulos: Obfuscation and it's much shorter, and I don't know. There might be other very good reasons, various types of cryptographic attacks, lengthening attacks, who knows.

Male: *00:39:38.

Antonopoulos: Also, we have job security. Bitcoin developers, the position is so complicated that it takes seven hours like this to explain it. So that way, people are employed.

Male: So, after this, the public doesn't really *00:39:59 away?

Antonopoulos: No, this way you -- oh, the public key, no, you well, okay. So, the public key as I mentioned before, can be easily generated from private key. But you don't actually need to keep it around. You can generate it on demand. You don't need to keep anything around other than the private key. Everything else can be derived and it will be derived in a deterministic manner. A private key, let me make this clear, a private key generates the same public key, which generates the same Bitcoin address every single time, right? So, if you just store the private key, you can regenerate everything. If you have the public key, you can generate the address but you cannot generate the private key. And if you have the address, you can't generate anything, all right, so far so good.

Now, I mentioned before that this is stored as 04XY. If you see a hex representation of a public key, it will look like that, 04 and then 128 hexadecimal digits which represent the X and Y coordinates. Elliptic curves, if you remember, are symmetric across the X axis, which means that the Y is either a positive or a negative point, yeah, but it's always the same magnitude. So, you drop the Y. If you know the X, you can find which part of the curve it’s on. So, all you need to know is, is it above or below the axis. You know where it's going to be. Let me draw this out for you again. This is the elliptic curve. If I know X, I know that my Y is either going to be here or it's going to be here. So, all I need to know is, is this at the plus Y or is it at the minus Y. And therefore, I don't need to store Y, and the Y is 32 bytes. So, I can save a lot of space by simply removing the Y, so far so good?

Male: What's storing the sign?

Antonopoulos: Huh?

Male: What's storing the sign?

Antonopoulos: You just store the sign, so then you store this as either 02X or 03X, and I can't remember which one is positive and which one is negative, but one of these prefixes means that the Y is on the -- below the X axis and the other one -- I think this one means it's above. I could be wrong, right? One or the other, but basically if you see a public key that starts with 04, that means it's uncompressed and it has 130 characters. If it starts with 02 or 03, it is a compressed public key. And in one case, it's a compressed public key with a positive Y, the other one is a compressed public key with a negative Y. You can easily recreate the Y by recomputing the point on the curve. There are libraries to do that and you've saved 32 bytes. So, this will be 98, 98 hexadecimal characters instead of 130, something like that. Yes?
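A minimal sketch of the compression just described, assuming the secp256k1 curve (y^2 = x^3 + 7 over a prime field) and using its generator point as a stand-in public key. The function names are illustrative, not from any wallet library. Over a finite field, "above or below the axis" becomes "Y is even or odd": 02 marks an even Y and 03 an odd Y. For reference, the compressed form comes out at 33 bytes, which is 66 hexadecimal characters.

```python
# secp256k1 field prime and generator point (public curve constants).
P = 2**256 - 2**32 - 977
GX = 0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798
GY = 0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8


def serialize_uncompressed(x, y):
    """Return 04 || X || Y as a hex string."""
    return "04" + format(x, "064x") + format(y, "064x")


def serialize_compressed(x, y):
    """Return 02 || X when Y is even, 03 || X when Y is odd."""
    return ("02" if y % 2 == 0 else "03") + format(x, "064x")


def decompress(hex_key):
    """Recover (x, y) from a compressed key by solving y^2 = x^3 + 7 (mod P)."""
    prefix, x = hex_key[:2], int(hex_key[2:], 16)
    y_squared = (pow(x, 3, P) + 7) % P
    # P % 4 == 3, so a square root (if one exists) is y_squared^((P + 1) / 4).
    y = pow(y_squared, (P + 1) // 4, P)
    if y * y % P != y_squared:
        raise ValueError("x is not on the curve")
    if (y % 2 == 0) != (prefix == "02"):
        y = P - y  # pick the other root, which has the opposite parity
    return x, y


uncompressed = serialize_uncompressed(GX, GY)
compressed = serialize_compressed(GX, GY)
print(len(uncompressed), len(compressed))
print(decompress(compressed) == (GX, GY))
```

The square-root shortcut only works because this particular prime is congruent to 3 modulo 4; other curves may need a general modular square root.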

Male: When you're saving the character like that, then you're optimizing for reducing the size of data to be transferred across the network or what are the benefits *00:43:18?

Antonopoulos: Keeps the wallet size smaller.

Male: Okay.

Antonopoulos: So, if you think about it, you've cut all the public keys in half, or the wallet size. Also yes, public keys are introduced in the encumbrances and *00:43:35. So, I think it also saves data on the network, yeah, okay. Yes?

Male: The network stores the public key or I don't know enough to *00:43:46 about it because I thought that the address *00:43:49.

Antonopoulos: That's a good point knowing the address it’s stored, I’m not sure.

Male: It creates jobs.

Antonopoulos: Sorry?

Male: It creates jobs.

Antonopoulos: It creates jobs. No, it's more than just a wallet. I think it does affect the network storage, perhaps in the signature you produce. This is -- I think in the digital signature, you provide a public key plus the signature.

Male: Discreet private date *00:44:12 the public keys?

Antonopoulos: Yeah, there's a public key itself in full, so there you are saving 32 bytes if you are compressed. So, you can save network traffic. You would also save *00:44:28 traffic because that gets listed in a transaction. Every transaction has at least one public key signature. So, if you save 32 bytes for every transaction, that's a pretty big deal. All right, so if you take this and then you produce through this process a Base 58 check address, the version 00 which corresponds to Bitcoin *00:44:58 meaning blockchain network, not TestNet, that's what 00 means. It will always start with a 1. So, you cannot tell from the prefix of the address if it is compressed or not, yeah. If you un-Base 58 check, there are actually slightly different ways. When you un-Base 58 check it, you can see that the hex in here is 32 bytes short, right, so far so good.
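A rough sketch of the Base 58 check step, assuming the usual layout of one version byte, a payload, and a four-byte checksum taken from a double SHA-256. The 20-byte payload below is a made-up placeholder; a real address payload is a hash of the public key (RIPEMD-160 of SHA-256), which is left out here because not every Python build ships RIPEMD-160. Function names are illustrative.

```python
import hashlib

ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def checksum(data):
    """First four bytes of SHA-256 applied twice."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()[:4]


def base58check_encode(version, payload):
    raw = bytes([version]) + payload
    raw += checksum(raw)
    number = int.from_bytes(raw, "big")
    encoded = ""
    while number > 0:
        number, remainder = divmod(number, 58)
        encoded = ALPHABET[remainder] + encoded
    # Each leading zero byte is written as a literal '1'.
    leading_zeros = len(raw) - len(raw.lstrip(b"\x00"))
    return "1" * leading_zeros + encoded


def base58check_decode(text):
    number = 0
    for char in text:
        number = number * 58 + ALPHABET.index(char)
    leading_zeros = len(text) - len(text.lstrip("1"))
    body = number.to_bytes((number.bit_length() + 7) // 8, "big")
    raw = b"\x00" * leading_zeros + body
    data, check = raw[:-4], raw[-4:]
    if checksum(data) != check:
        raise ValueError("bad checksum")
    return data[0], data[1:]


placeholder_hash = bytes(range(1, 21))  # hypothetical 20-byte payload
address = base58check_encode(0x00, placeholder_hash)
print(address)
print(base58check_decode(address) == (0, placeholder_hash))
```

Because the version byte 00 is itself a leading zero byte, the encoded string always begins with the character 1, whichever form of the public key was hashed.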

Male: Why is it shorter?

Antonopoulos: Oh, it’s not, because it came out of the SHA256, so it won't be shorter, you're right, sorry. That's an example of the things that I say that are wrong.

Male: But I'm just going to be just making from any form of public key.

Antonopoulos: Yes.

Male: *00:45:44?

Antonopoulos: So, you wouldn't know what it is, yeah. So, what happens is, you end up with addresses that look different but are based on the same public key, different representations of the same public key, but they’re also based on the same private key, right? This is going to get confusing.

Male: Along the three addresses?

Antonopoulos: These, we have only two -- only two addresses.

Male: Oh, sorry, yeah.

Antonopoulos: It's either going to be one of these or it's going to be one of the two of these. So, I think if you *00:46:20 and you put in a private key and you generate this, it's going to *00:46:24 both the compressed and the uncompressed Bitcoin address, both of which start with a one but look different, right? So, that means two addresses can correspond to the same public key. That's something to remember. Yes?

Male: Can I actually use both of those addresses to then receive money into the same?

Antonopoulos: Yes, because what an address does is it provides an encumbrance that can only be resolved through a signature by the private key. How many private keys are there?

Male: One.

Antonopoulos: One, so two addresses that put an encumbrance that can both be resolved by the same private key signature means that the money is owned by one private key.

Male: Messy.

Antonopoulos: It's messy but it may look like different addresses if you don't account for it properly. Yes?

Male: So can you explain the use of the word encumbrance *00:47:16 to visualize it with us?

Antonopoulos: Right, when I send Bitcoin to you, what I'm doing is I'm creating a transaction whose outputs have your address listed as a padlock that only you can unlock. There are two different types of padlocks, but they both open with the same key, the private key you have. So, I can put a silver padlock *00:47:42 padlock but they both open with the same key that you have.

Male: When you use *00:47:50?

Antonopoulos: So what your -- an encumbrance is, what it means is, you take a transaction output and you encumber it, meaning burden it with the requirement that someone presents a key.

Male: Got it.

Antonopoulos: So, sometimes you’re encumbering an output by saying this is encumbered by a public key, which means that someone needs to sign something. Other times, you’re encumbering it with a script address if you're doing a pay to script hash, in which case someone needs to present a script and the necessary signatures to satisfy that script, in order to resolve that encumbrance. Other encumbrances might involve contracts in the future, right. In general, it's not just a payment to an address, it's a payment that basically requires someone to present cryptographic proof of something. The most common something is a digital signature with a private key that proves that they own the public address that they used to receive the money, and they will thereby resolve the encumbrance placed by the sender. That means this, do you want to try that again?

Male: Yes.

Antonopoulos: Does that answer your questions? Yeah

Male: Just to extend what you’re asking, so for the -- so I don’t do TestNet. The TestNet this is really your -- you know, you don't want to present your private key to network, right?

Antonopoulos: In order to attest that it's your money, you sign and you present a public key and a signature that can be verified against that public key as having been created by its private key equivalent. And someone with the public key and the signature can verify it without knowing the private key. And that proves your ownership of that public key. And since that public key can be resolved to the address on which the encumbrance was placed, that proves your ownership of these bucks.

Male: Got you.

Antonopoulos: So, you sign and present the public key. The recipient or the person verifying the transaction says, I see a public key. Does this signature correspond to this public key? Yes, it does. Does this public key, if I turn it into an address, correspond to the address that placed the encumbrance on this output? Yes, that means this person owns this output. And therefore, they can spend them, and therefore this transaction that is using those outputs is properly signed and valid. Yes?
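The first of those two checks, "does this signature correspond to this public key", can be sketched as textbook ECDSA over secp256k1. This is an illustration under stated assumptions, not how any particular client does it: the message is hashed once with SHA-256 (Bitcoin hashes a serialized transaction), the nonce is derived in a toy way rather than by RFC 6979, and the private key is a made-up value. The second check, turning the public key into an address and comparing, is omitted.

```python
import hashlib

P = 2**256 - 2**32 - 977
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)


def point_add(a, b):
    """Add two curve points; None stands for the point at infinity."""
    if a is None:
        return b
    if b is None:
        return a
    (x1, y1), (x2, y2) = a, b
    if x1 == x2 and (y1 + y2) % P == 0:
        return None
    if a == b:
        slope = 3 * x1 * x1 * pow(2 * y1 % P, -1, P) % P
    else:
        slope = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P
    x3 = (slope * slope - x1 - x2) % P
    y3 = (slope * (x1 - x3) - y1) % P
    return (x3, y3)


def point_mul(k, point):
    """Double-and-add scalar multiplication."""
    result = None
    while k:
        if k & 1:
            result = point_add(result, point)
        point = point_add(point, point)
        k >>= 1
    return result


def message_hash(message):
    return int.from_bytes(hashlib.sha256(message).digest(), "big")


def sign(private_key, message):
    z = message_hash(message)
    # Toy nonce derived from key and message; NOT RFC 6979, illustration only.
    seed = private_key.to_bytes(32, "big") + message
    k = int.from_bytes(hashlib.sha256(seed).digest(), "big") % (N - 1) + 1
    r = point_mul(k, G)[0] % N
    s = pow(k, -1, N) * (z + r * private_key) % N
    return r, s


def verify(public_key, message, signature):
    """Check a signature using only the public key."""
    r, s = signature
    if not (0 < r < N and 0 < s < N):
        return False
    z = message_hash(message)
    w = pow(s, -1, N)
    point = point_add(point_mul(z * w % N, G), point_mul(r * w % N, public_key))
    return point is not None and point[0] % N == r


private_key = 0xC0FFEE  # hypothetical key, far too small for real use
public_key = point_mul(private_key, G)
signature = sign(private_key, b"spend output 0")
print(verify(public_key, b"spend output 0", signature))
```

The verifier never sees `private_key`; it only needs the public key, the message, and the signature pair. The three-argument `pow` with exponent -1 needs Python 3.8 or later.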

Male: So, you get a wallet and with that wallet you get typically, I believe 20 addresses. Does that mean there are 10 private keys for the wallet?

Antonopoulos: No, there are probably 20 private keys for the wallet. It doesn't generate both compressed and uncompressed. Most wallets nowadays do compressed public keys.

Male: There are just compressed or uncompressed?

Antonopoulos: Yeah, some wallets use compressed, some use some *00:50:31 uncompressed. Most of them, they are not compatible with each other, which leads to all kinds of confusion when you're going to import wallets from one format to the other. So, they're easily converted, right? They all correspond back to fundamentally a number, the 256-bit private key number that is the same and from which you can generate all of these other representations. Yes?

Male: So in theory, there's -- we have two types of address, but you could create your own -- your own -- generate a public address from your private key?

Antonopoulos: But no one would be able to validate that if somebody else needs to validate it when they see the signature.

Male: Because they don't know the type of key *00:51:14 compressed or not. How do they --

Antonopoulos: They do know whether it's compressed or not because in the signature *00:51:19 public key.

Male: Got it.

Antonopoulos: And that's how people can find out the public keys behind addresses, because they just look and see if you’ve signed before.

Male: Right, okay.

Antonopoulos: Thank you for that, public keys. Let's move on. So, we've now done ECC, generating keys, and formats in an hour. This was overly optimistic. So, we're going to do the rest in 10 minutes. Actually, it's going to be fairly easy. So, what I've described is type 1 non-deterministic keys. And the reason they're called non-deterministic is because they're randomly generated through the process we saw, with a cryptographically secure random number generator, or rather every single one of them is randomly generated. So for example, you have a bitcoind wallet that has 20 keys in it, how are these 20 keys picked? Like this, random number K, random number K, random number K, random number K.

Male: These K are all different, right?

Antonopoulos: These are all different from those public keys, from those addresses, right? So, you generate a random number, a private key, a public key and an address. Then you throw the dice again, random number, private key, public key, address, all right? Now, what does that mean? bitcoind also generates change addresses for pulling back change when you do a transaction. Also, one of the best practices you can do is to use a different address for every transaction. How many keys do you need to back up based on this scheme? All of them. And every time you generate a transaction which has change, therefore creating a new address, therefore creating a new key, or is re -- is not reusing but is generating a completely new address just for that transaction, you will need to make another backup of your wallet, okay? That is the disadvantage of type 1 non-deterministic keys. You need very, very large backups, okay? And you need to take backups all the time, every time you do a transaction. Yes?
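A small sketch of why Type 1 wallets need repeated backups, assuming private keys are drawn uniformly from the valid secp256k1 range with Python's `secrets` module. The `wallet` and `backup` lists are stand-ins for a wallet file and a copy of it, not a real wallet format.

```python
import secrets

# Order of the secp256k1 group; valid private keys lie in [1, N - 1].
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141


def new_private_key():
    """Draw one independent private key from the OS's secure random source."""
    return secrets.randbelow(N - 1) + 1


wallet = [new_private_key() for _ in range(20)]
backup = list(wallet)             # a backup taken now covers these 20 keys
wallet.append(new_private_key())  # later, a fresh change address is needed

# Nothing links the new key to the old ones, so the old backup cannot recover it.
missing_from_backup = [key for key in wallet if key not in backup]
print(len(missing_from_backup))
```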

Male: So, you can *00:54:03 the address with another Bitcoin D address?

Antonopoulos: bitcoind pre-generates keys in a pool, and so it doesn't need to back them up every time, but still every time it generates new ones, you still have to keep a backup of every single key. Yes?

Male: Why would I want to generate an address every time for every transaction?

Antonopoulos: Privacy.

Male: Right, so nobody can track?

Antonopoulos: So nobody can track the links between addresses. That's the best practice in Bitcoin. Yes?

Male: How would you *00:54:28?

Antonopoulos: Because someone might send you money to an address that you've long stopped using but they happen to find this in an old email and then they send you money, and that's why I never delete private keys, because you can always delete your private key, you can never undelete it. So, never delete it. And that's exactly why the wallets ended up being huge. All right, so it also exposes you to risk because you've got a big file with a lot of keys flying around, and managing all these keys gets very complicated. So, two solutions: Type 2 deterministic wallets based on chains, and Type 2 deterministic wallets based on trees. Type 2 deterministic wallets based on chains were first introduced in Electrum.

Male: Type two *00:55:53?

Antonopoulos: Electrums. Electrum is a wallet and it uses Type 2 deterministic keys. These are also now used in Armory as well as the Bits of Proof enterprise-class server and they’re implemented in a number of other systems. SX Tools if you have a look at that, that's one of the things we introduced in one of the hackathons. It’s the Bitcoin command-line tools. They also handle Type 2 deterministic chains that are Electrum compatible. So far, so good. The simple idea is, from the random number you generate a hexadecimal seed. This seed is the generator for all future addresses. From that, you then use a mathematical function to generate keys in such a way that you cannot predict these keys without having the seed. So, if you have K1, you can't figure out K2. If you have K2, you can't figure out K1, right?
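The idea of a seed feeding a one-way function can be sketched as follows. This is a generic illustration, assuming SHA-256 over the seed plus an index; it is not Electrum's actual derivation, and `derive_key` and the seed value are hypothetical.

```python
import hashlib

# Order of the secp256k1 group; valid private keys lie in [1, N - 1].
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141


def derive_key(seed, index):
    """Map (seed, index) to a private key through a one-way hash."""
    digest = hashlib.sha256(seed + index.to_bytes(4, "big")).digest()
    return int.from_bytes(digest, "big") % (N - 1) + 1


seed = bytes.fromhex("000102030405060708090a0b0c0d0e0f")  # hypothetical seed
first_run = [derive_key(seed, i) for i in range(5)]
second_run = [derive_key(seed, i) for i in range(5)]
print(first_run == second_run)
```

Anyone holding the seed can rebuild the whole sequence, while one derived key says nothing useful about its neighbours, because each is the output of a hash over data the observer does not have.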

Male: So, even if your private keys are compromised as long as the seed is uncompromised, you’re still fine?

Antonopoulos: You can regenerate everything from the seed, all of your private keys, all of your public keys and this can keep going forever. There is normally a gap which is that, instead of generating every single one, you generate K1, K6, K11, every 5, right, that's called the gap of 5. So, the keys that you're generating are spaced apart.

Male: Is that an arbitrary number?

Antonopoulos: You can pick an arbitrary number. If you do, then it will generate different keys if you change that gap based on that same seed, which is why they tell you to remember both your seed and your gap when you do this. This provides a bit more variation in the keys. Yes?

Male: If someone have access to my computer --

Antonopoulos: Yes.

Male: -- and they copy a key -- they keys, *00:57:54?

Antonopoulos: If someone has access they could copy all of this, yeah, or they can steal all the seeds. So, you -- so, these things are usually stored encrypted, both when they’re types -- Type 1 non-deterministic keys, where they’re just random number key, random number key, random number key, random number key, you have to keep those things encrypted. So, in bitcoind, that's done with *00:58:16 passphrase command and the encryptwallet command. In Electrum, that’s encrypted using a different system, etcetera, etcetera. Usually, when you type your password in order to operate your wallet, what you're doing is you're decrypting your wallet from a hard drive. Yes?

Male: So, if someone could have access and *00:58:34 and stole my keys and using my keys on *00:58:43 and I realise that it had been stolen, so from my seed *00:58:50 can see that these are two people that have seen keys that are --

Antonopoulos: No, that doesn't look at keys. You would have to generate transactions to empty any encumbrances that will get those keys. You'd have to go out on the network and create transactions that claim funds locked under those keys before somebody else claims them. Yeah?

Male: Was your -- the description of the seed and your *00:59:17 are we going to talk about KDF development?

Antonopoulos: We're not going to talk about a KDF because it's a bit complex. How this is generated involves a KDF, right, key derivation functions and key stretching algorithms are used for a number of different reasons. One is to generate sequential keys that are not relatable, right. Normally, this is done with a one-way hash function, but they're also used to stretch the computation required to generate multiple seeds to try to brute force a system.

So, for example, when you do 20 thousand repetitions of scrypt, which is the proof-of-work algorithm and password-stretching algorithm used, or PBKDF2, or whatever it is, what this does is it makes it much harder to do a brute force attack because it increases the computational requirement for every single attempt. On your browser, it will take three quarters of a second. How can you try to do that for a trillion keys? That's three quarters of a trillion seconds.
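To make the stretching idea concrete, here is a short sketch using PBKDF2 from Python's standard library, followed by the brute-force arithmetic from the talk. The passphrase, salt, and the `stretch` helper are placeholders, and 20,000 iterations simply mirrors the figure mentioned above rather than any wallet's real setting.

```python
import hashlib


def stretch(passphrase, salt, iterations=20_000):
    """Derive a 32-byte key; each guess costs `iterations` HMAC-SHA256 rounds."""
    return hashlib.pbkdf2_hmac(
        "sha256", passphrase.encode("utf-8"), salt, iterations, dklen=32
    )


key = stretch("correct horse", b"example-salt")
print(key.hex())

# Cost of a brute-force search at the quoted three quarters of a second per guess.
SECONDS_PER_GUESS = 0.75
GUESSES = 10**12
total_seconds = SECONDS_PER_GUESS * GUESSES
years = total_seconds / (365.25 * 24 * 3600)
print(f"{total_seconds:.3g} seconds, about {years:,.0f} years")
```

At that rate a trillion guesses works out to tens of thousands of years on a single machine, which is the point of making each attempt expensive.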

Male: If you -- let's just say that somebody becomes aware of K1, K2, K3 *01:00:35, does it have a possibility to *01:00:40?

Antonopoulos: So, these are one-way functions and these are one-way functions.

Male: Okay.

Antonopoulos: Upward, so if you have K1 or K1 through K infinity, you cannot derive the seed. If you have a subset of KN through M, you cannot derive Ks *01:00:56 subset, yes. But if you have the seed, you can derive everything, which gives you a perfect backup.

Male: So, I'm going back, I'm back at the seeds and then I’m importing from a different wallet?

Antonopoulos: Yes, you will generate --

Male: *01:01:12?

Antonopoulos: Great question. How does any wallet know how far to go? Very simply, what it does is, it starts at K1 and generates the address, and then it goes on the network and says, any transactions for K1 address? Yes. K2, K3, K4, and it goes until it finds no more transactions under those addresses. And then it says, well, you probably got this far, so it will start trawling the entire sequence or chain looking for all transactions until it finds a gap. Usually, what it does is look for a hundred consecutive addresses, after the last one it found a transaction for, for which there are no transactions, before it decides, I've now found all of the keys that were used. Yes?
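The scan described here can be sketched as a loop that stops after a run of unused addresses. The `has_transactions` callback is hypothetical: it stands in for whatever lookup a wallet performs against its copy of the blockchain or a remote service, and the limit of 100 is the figure quoted in the talk.

```python
def find_used_indexes(has_transactions, gap_limit=100):
    """Walk the key chain from index 0 until `gap_limit` unused keys in a row."""
    used = []
    index = 0
    unused_run = 0
    while unused_run < gap_limit:
        if has_transactions(index):
            used.append(index)
            unused_run = 0  # a hit resets the gap counter
        else:
            unused_run += 1
        index += 1
    return used


# Hypothetical history: these key indexes have received payments.
history = {0, 1, 2, 5, 40}
print(find_used_indexes(lambda i: i in history))
```

A larger limit tolerates bigger holes in the sequence at the cost of more lookups. With a limit of 3, the same history would stop after index 2 and miss the payments at 5 and 40.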

Male: Does it have to text the network or can you just locally search block to block *01:02:04?

Antonopoulos: No, it just starts watching, yeah, when I say the network, you can't actually search a network. You have to have a blockchain copy. You can use the blockchain.info API.

Male: And this works because wallet uses those keys, like, from the beginning?

Antonopoulos: Yes.

Male: Every time, not like oh, I receive 1,000 keys and *01:02:20?

Antonopoulos: No, the wallet uses a specific part of the key generation for chains and each part use an address that goes to the next one. So, you will see transactions for every one of the address. Sometimes there’s little gaps but they won't be very big. So, if you say let's look for a hundred keys after they run out of gaps and you're like okay, this is definitely the final gap, this is how far we've done. Yes?

Male: You say the transaction is saved in the blockchain?

Antonopoulos: Yes.

Male: Where is the blockchain, in your hard drive?

Antonopoulos: The blockchain is a -- is everywhere. It’s on everybody's hard drive. So, everybody has a copy of the blockchain or what they think is a copy of the blockchain with the longest difficulty chain of transaction status.

Male: *01:03:02?

Antonopoulos: 26 gigabytes approx. Yeah, that has every single transaction since January 9th -- January 3rd, 2009, which is the genesis block, using the number 13. Sorry.

Male: So it's in everybody's hard drive?

Antonopoulos: Everybody's hard drive, right, if you have a full-blown client. You can also have a lightweight client and that just asks another server that has the full blockchain. All right, so far so good. Now, this is where it gets really easy so I can check off this one very quickly. BIP 39, it’s a new standard that’s coming up now. It’s based on the Electrum model. What it does is it allows you to take a hexadecimal number which is a seed and produce 12 words, coffee, chair, asparagus, pong, tea kettle, orange, you know, whatever, 12 words. These words are deterministically derived. You can convert those 12 words back to the seed.

So, if you are asking your girlfriend, wife, husband, brother, sister to help you restore your Electrum wallet because you lost it while you're on a trip in Thailand, and they're trying to read the seed over the phone instead of reading 125F0H whatever, instead they're reading orange, coffee, kettle, barbecue, right, and so those 12 words will reconstruct that same seed. Yes?

Male: Is it restricted to English in any way?

Antonopoulos: Sorry?

Male: Is it restricted to English in any way?

Antonopoulos: It is restricted to English entirely.

Male: Is it?

Antonopoulos: Yes.

Male: Okay.

Antonopoulos: It is actually restricted to a specific dictionary, and the reason for that is because based on the specification, every single wallet in the world can take 12 words and deterministically recreate the same seed from which you can then recreate the same keys.

Male: And then you randomly generated a set of 12 words *01:05:27 considers that well, that's the key --

Antonopoulos: There's a checksum built in. So, one of the words is a checksum from the other words, so then, no, it will say those are not correct.
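A sketch of the word encoding, following the 12-word case of the published BIP 39: 128 bits of entropy plus 4 checksum bits (the top bits of its SHA-256) give 132 bits, split into twelve 11-bit indexes into a 2048-word list. In that scheme the checksum is folded into the last word rather than taking up a whole word. The word list below is a generated placeholder, not the real dictionary, and the function names are illustrative.

```python
import hashlib

# Placeholder dictionary; the real standard fixes a specific 2048-word list.
WORDLIST = [f"word{i:04d}" for i in range(2048)]


def entropy_to_words(entropy):
    """Encode 16 bytes as 12 words, with 4 checksum bits in the final word."""
    if len(entropy) != 16:
        raise ValueError("this sketch handles the 128-bit case only")
    check = hashlib.sha256(entropy).digest()[0] >> 4
    combined = (int.from_bytes(entropy, "big") << 4) | check
    indexes = [(combined >> (11 * i)) & 0x7FF for i in reversed(range(12))]
    return [WORDLIST[i] for i in indexes]


def words_to_entropy(words):
    """Decode 12 words back to 16 bytes, rejecting a wrong checksum."""
    combined = 0
    for word in words:
        combined = (combined << 11) | WORDLIST.index(word)
    entropy = (combined >> 4).to_bytes(16, "big")
    if hashlib.sha256(entropy).digest()[0] >> 4 != combined & 0xF:
        raise ValueError("checksum mismatch")
    return entropy


entropy = bytes(range(16))  # hypothetical value, never use a fixed one
words = entropy_to_words(entropy)
print(" ".join(words))
print(words_to_entropy(words) == entropy)
```

With only 4 checksum bits, a randomly chosen set of 12 words still passes about one time in sixteen, so the check catches typos rather than proving the words came from a wallet.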

Male: How big is a seed?

Antonopoulos: How big is what?

Male: Seed.

Antonopoulos: The seed is usually 32 bytes.

Male: 32 bytes.

Male: So, I don't need to back up my wallet everywhere, anywhere as long as I remember those 12 words?

Antonopoulos: Correct.

Male: All right.

Antonopoulos: That is something you can do with Armory, and that is something you can do with Electrum, that is something you will be able to see in more and more wallets now that this is being standardized as a cross-industry standard. This was recently published and it's going to make backing up or restoring wallets a lot easier. It’s also going to make it easier to transfer Type 1 -- sorry, Type 2 deterministic wallets from computer to computer, from different wallet to different wallets simply by transmitting 12 words. Yes?

Male: Why hasn't the QT client implemented it? I mean, is this a *01:06:35 C approach or --

Antonopoulos: Because the QT client is more focused on implementing the stability of the core Bitcoin protocol than implementing user-facing pretty wallet features that are actually already broken and have an absolutely horrible graphical user interface, and as of recently it was removed from the bitcoin.org site as the recommended wallet for new users because it absolutely sucks as a user interface. So, I'm glad. They're not going to -- I don't think you're going to see much development of the Reference Client as a wallet. In fact, the current discussion among developers is to strip all of the wallet functionality out, do it over RPC, separate the wallet file from the main routing node and blockchain database, and then that will enable all the wallets to be built on top, which are not going to be implemented in QT.

Male: But the hard drive MAC of *01:07:31?

Antonopoulos: Sorry?

Male: Did the hard drive * 01:07:34?

Antonopoulos: The 12 words matters, yes and if you put it wrong it will check the checksum and find your box.

Male: Why RPC?

Antonopoulos: RPC is just easy, that's why.

Male: Okay.

Antonopoulos: It's because RPC and inter-process communication on the same machine are almost as fast, and it's very easy to do separation of security controls that way and to isolate components. It provides a clean API layer between the two, and it's very common in any language to implement, just basic JSON RPC.

Male: Is there any functionality that allows someone to send coins to an address that then somehow get sent to another address? Is it --

Antonopoulos: A cascade address system.

Male: Yeah.

Antonopoulos: No.

Male: No? Okay.

Antonopoulos: But you can do multisig pay-to-script-hash, which we talked about in another seminar. We're going to revisit it again in more detail.

Male: Thank you.

Antonopoulos: All right, so BIP stands for Bitcoin Improvement Proposal, it's a series of -- it's like the RFC process on the internet, and you can find these BIPs on the *01:08:39 Bitcoin organization. Each one is basically a text document that says how this will work and how to implement it correctly and then people can implement it. 39 is the mnemonic code, just a code of mnemonic from the Greek word "mnemonic" which means memory. This is a memory trick, yeah, mnemonic.

All right, so that's BIP 39. I'm going to do BIP 32 and BIP 38 very quickly. But it's decimal *01:10:41. All right, so BIP 32 is called the hierarchical deterministic wallet, hierarchical because it forms some hierarchy in the form of a tree. It’s an N-ary tree, or one that has N branches, and this is actually a 32-bit N-ary tree. That means there are four billion branches per branch, yeah. You can get seriously lost in this key space.

What this is, is basically the same concept as before, a random number generating a master key from which you generate a series of private keys which can then have sub-keys, which can then have sub-keys, which can then have sub-keys to infinity, right. And each of the sub-keys can have 4 billion sub-keys below it, each of which can have 4 billion sub-keys below that, each of which can have 4 billion sub-keys below that. Three levels deep in the tree and you're in more than cosmological numbers, yeah.

There is another little trick to this, which is that you can simultaneously derive the private and the public key and derive them separately. It’d be probably easier if I use two colours for this one. All right, so red are our public, black are our private. This is where it gets interesting. You can derive sub-public keys from public keys, and you can derive sub-private keys from private keys.

Now, of course, you can also derive like that, right. So, if you've generated a private key, you can generate the corresponding public key, but here's the really magic trick: through the magic of elliptic curve mathematics, you can have a public master key from which you generate a series of public child keys without knowing the private key. Why is this tremendously useful? Let’s say you've got a web server that needs to sell widgets, and you want to generate a unique transaction address for every customer and every widget and track them independently. You can take just this thing and put it on the public web server. The web server has no private keys, but using the master public key, it can generate an infinite sequence almost of master -- of sub public keys, each for a different transaction. It can keep generating public keys. It doesn't need any of the private keys, which stay back on a separate server.
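The public-only derivation can be sketched along the lines of BIP 32's non-hardened child derivation: an HMAC-SHA512 over the parent public key and an index yields a tweak, which is added to the private key on one side and, as a curve point, to the public key on the other. This is a simplified illustration that skips the specification's validity checks and serialization formats; the seed and the helper names are hypothetical.

```python
import hashlib
import hmac

P = 2**256 - 2**32 - 977
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)


def point_add(a, b):
    """Add two curve points; None stands for the point at infinity."""
    if a is None:
        return b
    if b is None:
        return a
    (x1, y1), (x2, y2) = a, b
    if x1 == x2 and (y1 + y2) % P == 0:
        return None
    if a == b:
        slope = 3 * x1 * x1 * pow(2 * y1 % P, -1, P) % P
    else:
        slope = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P
    x3 = (slope * slope - x1 - x2) % P
    return (x3, (slope * (x1 - x3) - y1) % P)


def point_mul(k, point):
    """Double-and-add scalar multiplication."""
    result = None
    while k:
        if k & 1:
            result = point_add(result, point)
        point = point_add(point, point)
        k >>= 1
    return result


def serialize_point(point):
    """33-byte compressed encoding: 02/03 prefix by parity of Y, then X."""
    x, y = point
    return bytes([2 + (y & 1)]) + x.to_bytes(32, "big")


def child_tweak(parent_public, chain_code, index):
    """HMAC over public data only; for non-hardened indexes (below 2**31)."""
    data = serialize_point(parent_public) + index.to_bytes(4, "big")
    digest = hmac.new(chain_code, data, hashlib.sha512).digest()
    return int.from_bytes(digest[:32], "big"), digest[32:]


def derive_child_public(parent_public, chain_code, index):
    tweak, child_chain_code = child_tweak(parent_public, chain_code, index)
    return point_add(point_mul(tweak, G), parent_public), child_chain_code


def derive_child_private(parent_private, chain_code, index):
    parent_public = point_mul(parent_private, G)
    tweak, child_chain_code = child_tweak(parent_public, chain_code, index)
    return (parent_private + tweak) % N, child_chain_code


# Hypothetical master key material, for illustration only.
master = hmac.new(b"Bitcoin seed", bytes(range(16)), hashlib.sha512).digest()
master_private = int.from_bytes(master[:32], "big")
master_chain_code = master[32:]
master_public = point_mul(master_private, G)

# Web server side: only master_public and master_chain_code are available.
server_keys = [derive_child_public(master_public, master_chain_code, i)[0]
               for i in range(3)]
# Offline side: the matching private keys, derived from master_private.
offline_keys = [derive_child_private(master_private, master_chain_code, i)[0]
                for i in range(3)]
print(all(point_mul(d, G) == q for d, q in zip(offline_keys, server_keys)))
```

The two sides agree because adding the tweak to the private key and adding the tweak's curve point to the public key are the same operation seen from either side of the multiplication by G.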

You can now generate transactions that consume any payments to those public keys by knowing this private key, and simply the sequence number for the corresponding public key, or you can guess them using the same technique that Electrum uses with gaps. You can just keep generating and see if there are any payments against them. You could pre-generate a million privates and a million publics, and you can use a different one for every transaction. And you can actually generate the spending transactions off the web server while the web server can generate an infinite number of keys directly off the public chain without having any of the private information.

Written by Andreas M. Antonopoulos on March 2, 2014.