For thousands of years the art of encryption consisted in hiding one secret from the enemy. By gaining access to this secret the opponent could decode the message and, perhaps, future messages encoded using the same key. Many complicated secret encodings were created, like the Vigenère cipher where the secret alphabet would change among several pages of possible alphabets.
The Nazis created a fabulous machine called Enigma, which would use a complex mechanical system of rotors to produce very strong encryption in the age before computers. In fact modern electronic computers were researched and developed greatly by Alan Turing and a team of allied codebreakers at Bletchley Park during the wartime efforts to break Enigma! Cryptography is at the heart of Computer Science and has been a major driving force for computer development throughout the years. Today games and entertainment are likely the greatest drivers of innovation, but cryptography will never lose its place at the birth of the computer.
Breaking cryptography, then, for hundreds of years boiled down to guessing a key. But modern computers can break any historical cryptographic key in minutes. All a computer has to do is try billions of combinations. Since this was impossible up until WWII, such crypto systems were considered secure. For example, a Vigenère cipher could take a lifetime to break using pencil and paper.
Using a modern computer, all possible Vigenère alphabets would be tested in seconds and the secret code revealed. It was clear that against modern computers more would be needed than the clever combinatorial mechanisms of the past. Computers are applied math, and advanced math would be needed to keep computers from breaking codes.
This was the reality of cryptanalysis post World War II – math was the new weapon against programmable computers. Intelligence agencies quickly shifted more and more resources towards math geeks fresh out of top universities. Newly graduated mathematicians, computer scientists and logicians would find excellent jobs researching number theory, cryptography, abstract algebra and correlated subjects. It was the dream job for any geek, to play with the world’s greatest computers and do what they loved the most, cryptographic research.
It was in this historical context that Whitfield Diffie and Martin Hellman researched cryptography at the Electrical Engineering Department of Stanford University.
After years of hard work, they finally published what is now considered a classic paper in cryptographic research entitled “New Directions in Cryptography”. The need for a new kind of cryptographic system was highlighted right on the first paragraph of their text:
“Widening applications of teleprocessing have given rise to a need for new types of cryptographic systems, which minimize the need for secure key distribution channels and supply the equivalent of a written signature. This paper suggests ways to solve these currently open problems.”
(The Diffie-Hellman paper is easily found via web search.) The impact of this publication can be felt to this day. The name Diffie-Hellman has become as natural in cryptographic research as Coca Cola is to soft drinks. It’s fundamental knowledge for any cryptography student, even if they’ve never read the paper, its two great contributions are everywhere around us to this day: public key cryptography and its corresponding routine for key exchange. In fact, this last concept has been named the Diffie-Hellman key exchange, and it’s exactly what your web browser does when accessing a https page, it’s what happens when 2 people want to trade cryptocurrencies, it’s what email signature verification programs do – in short, Diffie-Hellman is everywhere in the modern world.
What the world didn’t know yet is that there was secret work being done in public key cryptography for at least 6 years before Diffie-Hellman. James Ellis of the United Kingdom’s intelligence services had developed a system for public key encryption long before Diffie-Hellman, but this was classified information up until 1997 when it was announced that the UK had invented “non secret encryption” a long time ago, which was proved by a internal and secret publication from 1970 that was then made public. Ellis and the British team are now academically accepted as pioneers in public key encryption. But since this was all secret work, we couldn’t have known any better, so Diffie-Hellman, which didn’t work for the spooks as far as we know, got all the fame for their incredible development.
But something was missing. This great new idea needed a commercially viable implementation. Someone had to take this a step further and develop it into something everyone could use. Enter three names which form another acronym that is widely recognized in the cryptographic world: Rivest, Shamir and Adler.
Today, RSA is synonymous with public key cryptography and Diffie-Hellman is often referred to as the key exchange that precedes the RSA encryption. RSA is a system for encryption, it’s not a single particular technique, but in practice it refers to public key cryptography that uses the factoring problem at its mathematical core. Several different types of functions are usable for the public key math, but prime number factorization is the one used by RSA and it’s basically everywhere.
RSA is everywhere, in your browser, the system used by credit card processors, it’s in popular email security and privacy programs, it’s in free and open source encryption programs. Yeah, RSA is everywhere – except in Bitcoin! Bitcoin uses a different mathematical system than RSA.
The public key encryption system used in Bitcoin is called Elliptic Curve Cryptography and it exploits the fact that if you know certain parameters that are used to draw an elliptic curve, then you can easily follow this curve, but if you’re only given a certain piece of the information used to draw these curves, then you are in for a hard time figuring out the path taken along the elliptic curve. First, you may be wondering what an elliptic curve looks like. Here’s one, courtesy of Wikipedia:
It doesn’t look too fancy and in fact it’s mathematically very simple. But its use in Bitcoin encryption has made it one of the most researched and popular cryptographic subjects by academics from around the world.
The curve is symmetric around the X axis, every positive value has a corresponding negative value with the same magnitude. This is exploited by elliptic curve encryption which switches between the positive and negative sides of the curve in order to make it even more difficult to reverse the process. Here’s an animation, authored by Cloudflare, which shows how a path is followed on the curve. By knowing where you started it is trivial to rebuild A -> B -> C -> D -> E path, but if you don’t know some of the initial parameters, then rebuilding the path (decoding the data) becomes a very difficult mathematical problem:
Bitcoin exploits this system for its public key cryptography. When you sign a Bitcoin transaction sending someone a value in coins, you are actually performing a ECDSA signature, which stands for Elliptic Curve Digital Signature Algorithm. There are infinite elliptic curves (EC) so when you want to use EC’s for encryption you must first publish its formula. Bitcoin’s elliptic curve is called Secp256k1 which can be drawn on any math graphic computer program or calculator by the formula y2 = x3 + 7. This deceivingly simple looking formula is the powerhouse behind Bitcoin. Those two variables and the number 7 guard the Bitcoin blockchain from being tampered with. Every single Bitcoin address is derived from parameters of Secp256k1, every single Bitcoin private key is a different parameter of the encryption system that uses that formula.
This tiny mathematical equation guards over 130 billion dollars in market cap right this moment:
If you’re into Bitcoin (and since you’ve reached this far down on this text, we guess you are), then this tiny equation is your new e = mc2!
Let’s do a simple example using pencil and paper (a calculator will come in handy for the exponentials), so you can grasp how public key encryption works. Since the Bitcoin elliptic curve is used for signing and not for encrypting, we’ll use RSA to encrypt the message “Hi”. The idea is very similar to any other kind of public key system: one piece of the system is kept secret, the other is made public and only by having the two piece of data can you reverse the process easily, otherwise you need to try an immense number of guesses. This big number of guesses establishes how hard it is to reverse the process. So, let’s get back to our experiment.
First, choose two prime numbers which we’ll call p and q (these letters are just convention, they’re traditionally used in RSA and other crypto examples).
We’ll use p=5 and q=3
(In reality these two numbers would be very, very large. It’s easy to break RSA when these numbers are tiny, but again this is just an example to show you how RSA can be done with pencil and paper.)
Introduce a new variable n, which will be
n = p x q so
n = 5 * 3 = 15 therefore
n = 15
The explanation for the following step involves a bit of number theory which we won’t get into. In short, it uses a piece of magic called Euler’s Totient. We now need exponents for the numbers we just chose and to get those we use Euler’s Totient to calculate a new parameter we’ll call z.
z = (q-1) * (p-1)
So, for our case:
z = (3-1) * (5-1)
z = 2 * 4
z = 8
Now we need another prime number which does not divide z. 2 is a prime number, but 8 is divisible by 2, so we can’t use it. So let’s choose the next prime after 5, which is 7.
k = 7
Now comes the fun part. Our public key is n and k! You send n and k to the public and anyone who wishes to send us a message can use these two parameters to encode secret messages that only we can decode. So to receive secret messages, we send all our friends our RSA public key:
n = 15
k = 7
What about the private key? We need to solve the equation k * j = 1 (mod z) where j is the secret. Since our k equals 7 and our z equals 8, we get
7 * j = 1 (mod 8)
We hand this to Wolfram Alpha which tells us that the solution is:
So any whole number greater than zero that we plug in there will give us a valid j. Let’s use n=2 ( NOTE: this n is not related to the RSA parameter named n, this “n” was generated by Wolfram as a solution to our j equation). Plugging in 2 for n, we get:
j = 8*2 + 7
j = 23
And there it is. j = 23 is our secret key! Note that we have deliberately chosen this number, in fact it could have been 31 or 39 (just keep on adding 8). Any of these numbers solve the equation for j. A real encryption program would ask you to randomly move your mouse around the screen or type random stuff on your keyboard in order to generate an unpredictable value of j at this point.
Start short digression. In Bitcoin an eventual real number “j”, after encoding, would be about this big:
E9873D79C6D87DC0FB6A5778633389F4453213303DA61F20BD67FC233AA33262
That’s a typical value of “j” in a Bitcoin public key system. Remember Bitcoin does not use RSA, so this would be a private key in a ECDSA system which uses much smaller keys than RSA would. In practice the value of j would be a LOT larger than this for RSA. Let’s get back to our example. End short digression.
Now comes the Diffie-Hellman exchange. Anyone who wishes to send you secret messages must have our calculated values for n and k. We start the exchange by contacting the other party and we tell them “to send us secret messages use n=15 and k=7”. Our peer, who we don’t trust and we assume that they don’t trust us either, then replies by saying “ok, received, and here’s our n and k”. Voila. We’ve just done a simple version of a Diffie Hellman exchange. There’s more to it in real life, but this is enough for our pencil and paper experiment.
We can now send secret messages to our peer and they can send us secret messages. But we’re only interested in what they’ll send us for the sake of this example. So let’s get in the shoes of our peer for a moment and let’s encrypt the message “Hi” that we’ll send back to ourselves. The RSA formula for encryption is …
T ^ k = C (mod n)
… where T is Hi and C is the Ciphertext (the output encrypted message). Now here’s a frequent doubt in the minds of RSA students: T ^ k is a numeric operation, like 2^3 is two to the power of 3 which is 8. If T is “Hi” then how do we turn this into a number?! Great question indeed. What we have here is a case where we need to turn text into a whole number. Somehow we need to turn Hi into a number that we can raise to the power of k (remember k=7). How do we do that? Well, we do exactly as if we were dealing with decimal numbers, except we’re working with a weird base here instead of 10. What does the number 6743 mean in practice? It’s just 6 in the thousands, 7 in the hundreds, 4 in the tens and 3 in the units. Correct? We can do the same thing for “Hi”. H is in the 10th power, e is in the 9th and so on until d is in the units. All we need to do now is give these letters numerical values and build a number out of them using our crazy Hi encoding! So let’s give each letter a numerical value:
H=1
i = 2
Now we have an encoding for our “alphabet” that is made up of 2 different characters. We can see that our numerical system only has 2 values, so we could use a binary base numerical system, but for simplicity we’ll just use base 10, our good old decimal system. Any base will do as long as it is equal or more to the number of characters in the message and, of course, you must use the same base when decoding the message. So let’s use the base 10 math that we’re all used to:
Hi = 12
Great! 12 is our T! Now we can raise T to the k’th power :
T ^ k = C (mod n)
Remember here we are the peer who is encoding a message to us, and we sent k=7 and n=15 to this peer. He’s now going to use these 2 numbers, which together are the public key, to encrypt their message to us. This, right here, is where the public key is used to encrypt:
12 ^ 7 = C (mod 15)
We get that 12 ^ 7 equals:
35831808
Wow! We used very tiny parameters, a very tiny message and just look at how much bigger that number is than our chosen p and q. Imagine how big the numbers would become when the RSA keys are thousands of digits long and messages are large! This takes us back to the very beginning of this post where we mentioned that computers made old cryptography obsolete due to the sheer size of the numbers any home PC can process.
But that’s not our Ciphertext C yet. That big number needs to be divided by our n now so that we can take the remainder of this division and that remainder will then be the encrypted message (that big number modulo n).
So we do that number mod 15 and we get that
C = 3
That whole big number has just been boiled down to the number 3 via the magic of modular arithmetic. Here it is important to note one big shortcoming in the RSA system. Noticed how our n=15 was used as the modulo here? What if our encoded message wasn’t just the number 12 but was something larger than our n? That’d be a problem – a big problem. Since n was smaller than the encoded message, it overflowed our maximum size and now we could have 2 messages with the same result. For example, our T was 12 but if we add 15 to 12 we get 27 and 27 is also 12 modulo 15! But 27 is a completely different message! In fact we don’t even have a character that maps to 7, so it’s an invalid message.
This is a big limitation of RSA: number n always has to be larger than the encoded message. As you can imagine, for a large message n would have to be gigantic. This is why RSA is not actually used to encode messages (you read that right). RSA is only used to encrypt an encryption key for a symmetric cipher! That’s the missing part of the Diffie Hellman exchange we didn’t mention above: you don’t actually use the exchange to send messages back and forth, you actually exchange encryption keys for much faster encryption algorithms. But we’re detouring from our example once more, let’s now send our encrypted message back to ourselves.
Our peer now sends us C over the network.You have mail! We open the secret message and it reads simply C = 3. Note how unlikely it seems to decode the number 3 back to the message we sent just by looking at it! How does the number 3 become “Hi” again? Pretty incredible, isn’t it? But let’s not get ahead of ourselves. So, how do we decrypt this message C = 3?
Here is the RSA decryption formula:
C ^ j = T (mod n)
Which, translated, says the secret message raised to the private key’s power is equal to the plain text modulo n.
Why can’t everyone decipher this on their own? Because they don’t know j! j is our private key! It is analog to the Bitcoin private key you must guard. By looking at this equation we get an idea of the mathematical principle behind RSA. Why is it so hard to derive j? Because j is a logarithm and logarithms are always hard to guess even for computers. Since all numbers involved here are whole numbers (integers), we all j a discrete logarithm because it takes only “whole” or discreet values. j cannot be a fractional number like 3.14 or 1.618, it can only be 1 or 3 and nothing in between, thus the “discreet logarithm” name. The question an attacker must ask is what number must the secret message be raised to so that it is congruential to the original message T modulo n? Before we decrypt this trivially (because we know that j = 23) let’s try to brute force our way to find j.
C ^ j = T (mod n)
We have C, we have n (n is public) and we need to discover j to recover T which is the secret message. So, as eavesdroppers, we have 2 unknowns: T and j. Let’s plug in the numbers we have.
3 ^ j = T (mod 15)
So how do we start guessing? We start plugging in values into j, using a computer, until T shows up with something readable. So, from 1 onwards substitute for j and solve for T. Given a T, decode it using the agreed upon encoding which is public information, and see if T makes sense. Repeat this for each j until T is readable (determining whether T is “readable” is a challenge in itself, think about it!).
In our case it would take 22 wrong guesses until we hit jackpot on the 23rd try. 22 guesses is very very easy, right? Yes, because we chose very small values for q and p right at the beginning. In a practical situation p and q would be so large, so immensely large, that testing one by one of the possibilities for j would take longer than the Earth is expected to exist. The numbers involved in RSA are astronomical. And there are more security features built into this system, which we won’t discuss here, but you should know that RSA is a LOT harder to break than this. The test for T, for example, is very complicated. So now that we’ve seen how to brute force a secret message, let’s plug in our j and decrypt it in a legit manner.
3 ^ j = T (mod 15)
3 ^ 23 = T (mod 15)
3 ^23 = 94143178827 so
94143178827 = T (mod 15)
We divide 94143178827 by 15 and find its remainder which is 12. So T = 12 and that’s our Text message back. Let’s decode it using our custom encoding:
H=1
i = 2
And we get that T = 12 = Hi
Congratulations! You’ve performed your very own RSA encryption and decryption by hand!
This process we’ve just performed, where you have missing parameters to mathematical formulas (the private key) is exactly the same which makes guessing Bitcoin signatures so hard. As we just saw, when we have the private key (“j” in our example) it’s a trivial task to determine the value of T modulo n. But when you don’t have j, you must try one by one of all the possibilities for j. Since, in reality, j is an astronomical number, this task becomes intractable.
Well, we hope you’ve enjoyed this tour through the history and the application of public key cryptography! As you can see, the principles behind this kind of encryption technology were developed from rather simple mathematical ideas which, when put together in a clever and creative way, becomes a very strong system to protect information. We saw that the formula behind Bitcoin’s transaction signature system is very simple looking, but it hides immense mathematical complexity when you are missing certain pieces of the puzzle. This is the principle behind all public key, also known as asymmetric key, cryptosystems: the missing piece of the puzzle makes it very difficult to guess the correct value for the secret key.
If you like this article, please share it on your favorite social media platform and help us spread knowledge of cryptography to all who might be interested in getting into this exciting and mystifying field! Thank you!
Public Key Cryptography Wikipedia Entry
Globalsign: What is Public-key Cryptography?
How Does Public Key Encryption Work? (Public Key Cryptography and SSL)
What is Public Key Cryptography?
A Deep Dive on End-to-End Encryption: How Do Public Key Encryption Systems Work?