RSA Encryption - How & why it works. Euclid, Euler, Cocks and much more.
Transcript
Transcript generated by YouTube auto-captions. May contain errors.
Up until the 1970s, cryptography had been based on symmetric keys. That is, the sender encrypts their message using a specific key and the receiver decrypts using an identical key. As you may recall, that encryption is a mapping from some message using a specific key to a cipher text message. To decrypt a cipher text, you use the same key to reverse the mapping.
So, for Alice and Bob to communicate securely, they must first share identical keys. However, establishing a shared key is often impossible if Alice and Bob can't physically meet or requires extra communications overhead when using the Diffy Helman key exchange. Plus, if Alice needs to communicate with multiple people, perhaps she's a bank, then she's going to have to exchange distinct keys with each person. Now she'll have to manage all of these keys and send thousands of messages just to establish them.
Could there be a simpler way? In 1970, James Ellis, a British engineer and mathematician, was working on an idea for a non-secret encryption. It's based on a simple yet clever concept. Lock and unlock are inverse operations.
Alice could buy a lock, keep the key, and send Bob the open lock. Bob locks his message and sends it back to Alice. No keys are exchanged. This means she could publish the lock widely and let anyone in the world use it to send her a message.
And she only needs to keep track of a single key. [Music] Ellis never arrived at a mathematical solution, though he had an intuitive sense of how it should work. The idea is based on splitting a key in two parts, an encryption key and decryption key. The decryption key performs the inverse or undo operation which was applied by the encryption key.
To see how inverse keys could work, let's do a simplified example with colors. How could Bob send Alice a specific color without Eve, who is always listening, intercepting it? The inverse of some color is called a compliment color, which when added to it produces white, undoing the effect of the first color. In this example, we assume that mixing colors is a one-way function because it's fast to mix colors and output a third color and it's much slower to undo.
Alice first generates her private key by randomly selecting a color, say red. Next, assume Alice uses a secret color machine to find the exact complent to her red. Nobody else has access to this. This results in cyan, which she sends to Bob as her public key.
Let's say Bob wants to send a secret yellow to Alice. He mixed this with her public color and sends the resulting color back to Alice. Now Alice adds her private color to Bob's mixture. This undoes the effect of her public color, leaving her with Bob's secret color.
Notice Eve has no easy way to find Bob's yellow since she needs Alice's private red to do so. This is how it should work. However, a mathematical solution was needed for this to work in practice. The solution was found by another British mathematician and cryptographer, Clifford Cox.
Cox needed to construct a special kind of one-way function called a trapoor one-way function, which is a function that is easy to compute in one direction yet difficult to reverse unless you have special information called the trapdo. For this, he turned to modular exponentiation as follows. Take a number, raise it to some exponent, dividing by the modulus and outputting the remainder. This can be used to encrypt a message as follows.
Imagine Bob has a message which is converted to a number m. Then he multiplies his number by itself e times. Then divides this result by a random number n and outputs the remainder of the division. This results in some number C.
This calculation is easy to perform. However, given only C, E, and N, it is much more difficult to determine which M was used because we'd have to resort to some form of trial and error. So, this is our one-way function that we can apply to M. It's easy to perform but difficult to reverse.
This is our mathematical lock. Now what about the key? The key is the trap door. Some piece of information that makes it easy to reverse the encryption.
We need to raise C to some other exponent, say D, which will undo the initial operation applied to M and return to the original message M. So both operations together is the same as m the power of e all raised to the power of d which is the same as m to the power of e * d. e is the encryption d is the decryption. Therefore we need a way for alice to construct e and d which makes it difficult for anyone else to find d.
This requires a second one function which is used for generating d. For this, we look back to Uklid. Over 2,000 years ago, Uklid showed every number has exactly one prime factorization, which we can think of as a secret key. Now, it turns out that prime factorization is a fundamentally hard problem, while multiplication is easy.
Let's clarify what we mean by easy and hard by introducing what's called time complexity. We all have multiplied numbers before and each of us has our own rules for doing so in order to speed things up. If we program a computer to multiply numbers, it can do it much faster than any human can. Here's a graph that shows the time required for a computer to multiply two numbers.
The time required to find the answer increases as the number gets larger. Notice that the computation time needed is well under 1 second, even with fairly large numbers. Therefore, it is easy to perform. Compare this to prime factorization.
If someone told you to find the prime factorization of 589, you will notice the problem feels harder. No matter what your strategy, it will require some trial and error until you find a number which divides evenly into 589. After some struggle, you will find 19 * 31 is the prime factorization. If you were told to find the prime factorization of 437,231, you'd probably give up and get a computer to help you.
This works fine for small numbers. Though, if we try to get a computer to factor larger and larger numbers, there is a runaway effect. The time needed to perform the calculations increases rapidly as there are more steps involved. As the numbers grow, the computer needs minutes, then hours, and eventually it will require hundreds or thousands of years to factor huge numbers.
We therefore say it's a hard problem due to this growth rate of time needed to solve it. So factorization is what Cox used to build the trapdo solution. Step one, imagine Alice randomly generated a prime number over 150 digits long. Call this P1.
then a second random prime number roughly the same size. Call this P2. She then multiplies these two prime numbers to get a number n which is over 300 digits long. This multiplication step would take less than a second.
We could even do it in a web browser. Then she takes the factorization of n p1 * p2 and hides it. Now, if she gave n to anyone else, they would have to have a computer running for years to find the solution. Step two, Cox needed to find a function which depends on knowing the factorization of n.
For this, he looked back to work done in 1760 by Swiss mathematician Leonard Uler. Uler continued to investigate properties of numbers, specifically the distribution and location of prime numbers. One important function he defined is called the five function. It measures the breakability of a number.
Given a number say n, it outputs how many integers are less than or equal to n that do not share any common factor with n. For example, if we want to find of 8, we look at all values from 1 to 8, then we count how many integers 8 does not share a factor greater than one with. Notice six does not count because it shares a factor of two while 1 3 5 and 7 are counted as they only share a factor of 1. Therefore 5 of 8 equals 4.
What's interesting is that calculating the five function is hard except in one case. Look at this graph. It's a plot of values of five over integers from 1 to a,000. Notice any predictable pattern?
The straight line of points along the top represent all the prime numbers. Since prime numbers have no factors greater than one, the phi of any prime number p is simply p minus 1. To calculate 5 of 7, a prime number, we count all integers except since none of them share a factor with 7. 5 of 7 equals 6.
So, if you were asked to find PH 5 of 21,377, a prime number, you'd only need to subtract 1 to get the solution 21,376. Five of any prime is easy to compute. This leads to an interesting result based on the fact that five function is multiplicative. That is 5 A * 5 B= 5 A * B.
If we know some number n is a product of two primes p1 and p2 then five of n is just the value of 5 for each prime multiplied together or p1 -1 * p2 minus one. We now have a trapdo for solving 5. If you know the factorization for n, finding 5 n is easy. For example, 7 * 11 = 77.
So 5 of 77 is 6 * 10 = 60. Step three, how to connect the five function to modular exponentiation. For this, he turned to Uler's theorem, which is a relationship between the five function and modular exponentiation as follows. M the power of n is congruent to 1 n.
This means you can pick any two numbers such that they do not share a common factor. Let's call them m and n. For example, m= 5 and n= 8. Now, when you raise m to the power of 5 n and divide by n, you will always be left with 1.
Now, we just need to modify this equation using two simple rules. First if you raise the number one to any exponent k you will always get one. In the same way we can multiply the exponent 5 n by any number k and the solution is still one. Second if you multiply 1 by any number say m it always equals m.
In the same way we can multiply the left hand side by m to get m on the right hand side. This can be simplified as m ^ of k * 5 n + 1. This is a breakthrough. We now have an equation for finding e * d which depends on fi n.
Therefore, it's easy to calculate d only if the factorization of n is known. Meaning d should always be Alice's private key. It's the trapoor which will allow her to undo the effect of e. Let's do a simple example to see all of this in action.
Say Bob has a message he converted into a number using a padding scheme. Let's call this M. Alice then generates her public and private key as follows. First, she generates two random prime numbers of similar size and multiplies them to get n 3,127.
Then she calculates PH of n easily since she knows the factorization of n 316. Next she picks up some small public exponent e with the condition that it must be an odd number that does not share a factor with 5 n. In this case she picks e= 3. Finally, she finds the value of her private exponent d, which in this case is 2 * 5 of n + 1 / 3 2011.
Now, she hides everything except the value of n and e. n and e make up her public key. Think of it as an open lock. She sends this to Bob to lock his message with.
Bob locks his message by calculating M to the power of E mod N. Call this C his encrypted message which he sends back to Alice. Finally, Alice decrypts his message by using her private key D access through a trap door. C to the power of D equals Bob's original message M.
Notice that Eve or anyone else with C, N, or E can only find the exponent D if they can calculate PH of N, which requires they know the prime factorization of N. If N is large enough, Alice can be sure that this takes hundreds of years with even the most powerful network of computers. This trick was immediately classified after its publication. However, it was independently rediscovered in 1977 by Ron Rivst, Addie Shamir, and Len Adelman, which is why it is now known as RSA encryption.
RSA is the most widely used public key algorithm in the world and the most copied software in history. Every internet user on Earth is using RSA or some variant of it, whether they realize it or not. Its strength relies on the hardness of prime factorization which is a result of deep questions about the distribution of prime numbers. A question which has remained unsolved for thousands of years.