## RSA Encryption - Tutorial

In the last 3 months I've really started to hear lots about RSA and its use in software protections, Hiew for example uses a 256-bit key, Advanced Disk Catalog uses 64-bits and IDA an effectively irreversible 1024-bits. In 1976 three researchers at M.I.T. (Ron Rivest, Adi Shamir and Les Adleman) introduced a public key cryptosystem, prior to this only private key cryptosystems had been used.

The RSA cryptosystem is based on modular exponentiation modulo the product of 2 large primes. Each individual has an encrypting key consisting of a modulus n = pq, where p & q are large primes, say with 200 digits each, and an exponent e that is relatively prime to (p-1)(q-1). To produce a usable key, 2 large primes must be found (this can be done quickly on a computer using probabilistic primality tests). However the product of these primes n = pq, with approximately 400 digits, cannot be factored in a reasonable length of time.

#### RSA Encryption

In the RSA encryption method, messages are translated into sequences of integers. This can be done by translating each letter into an integer, as is done with the Caesar cipher. These integers are grouped together to form larger integers, each representing a block of letters. The encryption proceeds by transforming the integer M, representing the plaintext (the original message) to an integer C, representing the ciphertext (the encrypted message) using the function.

`C = Me mod n`

Below I'll show you an example of using RSA encryption, for practical reasons I've used small primes p and q, rather than primes with 100 or more digits. Obviously this cipher ISN'T secure.

We select 2 primes, p = 43 & q = 59 so that n = 43 · 59 = 2537, and with e = 13.

`gcd (e,(p-1)(q-1) = gcd(13,42.58) = 1   (gcd = greatest common divisor)`

Lets take the hypothetical message STOP, first we'll convert the letters into their numerical equivalents (position in the alphabet-1) and then group those numbers into blocks of 4.

1819 1415 = ST OP

We encrypt each block using the mapping:

C = M13 mod 2537

Computations using modular multiplication show that 181913 mod 2537 = 2081, and 141513 mod 2537 = 2182. The encrypted message is thus 2081 2182.

#### RSA Decryption

The plaintext message can be quickly recovered when the decryption key d, an inverse of e modulo (p-1)(q-1) is known. (Such an inverse exists since gcd(e,(p-1)(q-1))=1). To see this, note that if d e º 1 (mod (p-1)(q-1)), there is an integer k such that d e = 1 + k(p-1)(q-1). It follows that.

Cd = (Me)d = Mde = M1+k (p-1)(q-1)

By Fermat's theorem (assuming that gcd(M,p) = gcd(M,q) = 1, which holds except in rare cases, it follows that Mp-1 º 1 (mod p) and Mq-1 º 1 (mod q), consequently.

Cd = M · (Mp-1) k (q-1) º M · 1 º M (mod p)

and:-

Cd = M · (Mq-1) k (p-1) º M · 1 º M (mod q)

Since gcd(p,q) = 1, it follows that:-

Cd º M (mod pq)

#### Example Decryption

Using the simple cipher above we receive the message 0981 0461, lets go about decrypting it.

n = 43 · 59 and e (exponent) = 13, we can work out that d = 937 is an inverse of 13 modulo 42 · 58 = 2436. We therefore use 937 as our decryption exponent, therefore.

P = C937 mod 2537

Using fast modular exponentiation (an algorithm) we compute 0981937 mod 2537, = 0704 and 0461937 mod 2537 = 1115. Quick translation reveals that this message was HELP.

#### RSA Security

Well, why is the RSA cryptosystem suitable for public key cryptography?, when we know the factorization of the modulus n, that is, when we know p and q we can use the Euclidean algorithm to quickly find an exponent d inverse to e modulo (p-1)(q-1). This lets us decrypt messages sent using our key.

However no method is known to decrypt messages that is not based on finding a factorization of n. The most efficient factorization methods in 1995 required billions of years to factor 400-digit integers. Consequently when p and q are 200-digit primes, messages encrypted using n = pq as the modulus cannot be found in a reasonable time unless the primes p and q are known.

I've heard a few people talk about breaking RSA, finding all the primes for example which don't exceed a specified integer isn't particularly difficult, an application of Eratosthenes sieve will serve this purpose, yet you should always be on the look out for potential implementation and key choice weaknesses (authors prime choices might be careless). If however you encounter a Russian author using a 1024-bit keyfile, forget it, the suns going to burn out before current desktop technologies can factor it, unless of course you ask very nicely for the key :-).

As breaking RSA is a matter of simply factoring the public modulus, several techniques have been developed which are considered much more efficient and intelligent than just trying all the possibilities.

GNFS - General Field Number Sieving is as far as I know the latest factoring technology which involves sieving sets of "Q-values", the results are then accumulated and sorted before being run through a graphing algorithm followed by bitmatrix reduction, as you can see this isn't trivial :-).

MPQS - The Multiple Polynomial Quadratic Sieve is somewhat simpler and is the predecessor to the NFS (Number Field Sieve), basic theory is as follows:

Find x, y such that x ^ 2 == y ^ 2 mod RSA-N.

This is based on the good probability that: gcd(x+y, RSA-N) is not 1 or RSA-N, hence you'd have a factor. Apparently Hiew's 256-bit key was broken in this fashion.

Sieving the Q-Interval is another technique which is based around the fact that quadratic residues given by a certain polynomial and divisible by a particular prime are regularly spaced (I've seen the maths behind this and its not really for me :-) ). Essentially in sieving you add the log of the prime every such regular distance through initially zeroed memory, for each prime in the factor base. When thats done you pick out places where the accumulation exceeds a threshold; thats where you find quadratic residues divisible by lots of small primes, hence these are more likely to be double-partial, partial or full relations.

I've shamelessly taken this from the HASP site (although I loathe the dongle salespeople this is a well written summary of how RSA works in practice :-) ).

#### How does Public-Key encryption work?

Let's assume we have two parties, Bob and Alice, who wish to transmit confidential information to one another over the Internet. Alice would like to send Bob a corporate document using a Public Key encryption system. To accomplish a secure transmission, they will need to do the following :-

Alice and Bob agree on a public-key cryptosystem.
Bob generates a pair of mathematically linked keys : one public, one private.
Bob transmits his public key to Alice over any insecure medium.
Bob keeps the private key a secret.
Alice uses Bob's public key and the encryption algorithm to encrypt her message, creating a ciphertext.
Alice transmits the ciphertext to Bob.
Bob decrypts the ciphertext using the same algorithm and his private key.

Public key cryptography solves the key-management problem with symmetric cryptosystems. Previously, Alice and Bob had to agree on a key and somehow Alice had to securely transmit that key to Bob without a third party, Eve, finding out what it was. In the case of public-key systems, if Eve were to listen to Alice and Bob's entire exchange, she could intercept the encrypted message and Bob's public key, but because she doesn't have Bob's private key she cannot decrypt the message without unreasonable effort.

In most practical applications of public key systems, a network of users agree on a public key cryptosystem, and pair of keys is generated for each user. The public keys are published in a database accessible to network users, making the process of key exchange that much simpler. RSA derives its security from the difficulty of factoring large numbers. The public and private keys are functions of a pair of large (100 or 200 digits, sometimes larger) prime numbers. Recovering the plaintext from the ciphertext is equivalent to factoring the product of the two prime numbers.

Generating keys in RSA works as follows : take two large primes, p and p , and compute their product n = pq; n is called the modulus. Choose a number, e , less than n and relatively prime to (p-1)(q-1), which means e and (p-1)(q-1) have no common factors except 1. Find another number d such that (ed - 1) is divisible by (p-1)(q-1). The values e and d are called the public and private exponents, respectively. The public key is the pair (n, e); the private key is (n, d). The factors p and q may be kept with the private key, or destroyed.

The security of RSA rests in the difficulty of factoring large integers, and these large integers are the mathematical relationship between between the public and private keys. It is currently difficult to obtain the private key d from the public key (n, e). However if one could factor n into p and q, then one could obtain the private key d. Thus the security of RSA is based on the assumption that factoring is difficult. The discovery of an easy method of factoring large integers would make it significantly easier to break RSA.