San Jose State University

Department of Computer Science

CS 265

RSA Timing Attack

Submitted By:

Ramya Venkataramu

SID: 004395639

Date Submitted : 03/24/2006

Section: 01

TABLE OF CONTENTS

ABSTRACT 3

INTRODUCTION 3

The RSA Cryptosystem 3

Repeated Squaring Algorithm 3

The Timing Attack 4

METHODOLOGY 4

Attack #1: Is a practical timing attack against OpenSSL 5

Attack #2: Attack using Paul C. Kocher’s Method 5

IMPLEMENTATION 6

Implementation Setup 6

Implementation of RSA Cryptosystem 6

Demonstration 6

Attack #1: Implementation of the Practical Timing Attack over OpenSSL 0.9.7d 6

Difficulties Encountered 10

Difficulties Solved 10

Demonstration 10

Attack #2: Implementation of an Attack using Paul C. Kocher’s Method 10

Difficulties Encountered 11

Difficulties Solved 12

Demonstration 12

CONCLUSION 12

REFERENCES 12

APPENDIX 13

A-1 Implementation of RSA Cryptosystem 13

A-2 Brumley and Boneh’s Approach 13

A-3 Kocher’s Method on Repeated Squaring Algorithm 15

ABSTRACT

It was believed that the only way to attack RSA cryptosystem was by solving the “hard” problem of factorizing an integer ‘N’ (the modulus) into its’ two relatively prime components (‘p’ and ‘q’). However, innovative side channel attacks called Timing attacks were able to break the RSA Cryptosystem through a different approach. In this project, the details of the various timing attacks are studied and their implementations are carried out in order to gain an in-depth knowledge in this area.

INTRODUCTION

The RSA Cryptosystem

The RSA cryptosystem, invented by Rivest, Shamir, and Adleman is a “one-way” mathematical function used to securely encrypt and decrypt messages. Its security is based on the idea that factoring an integer into its prime divisors is a hard problem.

Messages are encrypted using: C = Me mod N

Ciphertexts are decrypted using: M = Cd mod N

- M is the message

- C is the cipher text

- e is the exponent

- d is the private key

- N is the modulus and N = p*q, p > q

Repeated Squaring Algorithm

The exponentiation in encryption and decryption is an expensive operation. Repeated Squaring Algorithm is an efficient method used to compute modular exponentiation.

x = M

for j = 1 to n

x = mod( x^2, m)

if dj == 1 then

x = mod( x*M, m )

endif

next j

return x

Figure 1: Repeated Squaring Algorithm, Source: [2]

Figure 1 illustrates the repeated squaring method of performing modular reductions.

The Timing Attack

The RSA cryptosystem is secure. However, there are surprising indirect attacks that can be carried out against the RSA system to recover bits of the private key. Timing attack is one such indirect side channel attack. Timing attack depends on time taken to perform certain crypto operation with a set of input parameters. This timing information can then be used to determine certain amount of the “secret information”.

METHODOLOGY

There are two well known timing attacks:

·  Brumley & Boneh’s attack over OpenSSL, [1].

·  The timing attack based on Kocher’s idea, [2].

This project implements these two attacks.

Attack #1: Is a practical timing attack against OpenSSL 0.9.7d, using the concepts in [1].

Attack #2: Is a timing attack over repeated squaring algorithm using ideas in [2].

Before describing the attack in details, here are some basic concepts and terms.

OpenSSL

“The OpenSSL Project is a collaborative effort to develop a robust, commercial- grade, full-featured, and Open Source toolkit implementing the Secure Sockets Layer (SSL v2/v3) and Transport Layer Security (TLS v1) protocols as well as a full-strength general purpose cryptography library. The project is managed by a worldwide community of volunteers that use the Internet to communicate, plan, and develop the OpenSSL toolkit and its related documentation.” [3]

To optimize the encryption/decryption process OpenSSL uses:

·  Chinese Remainder Theorem

·  Sliding Window Exponentiation

·  Montgomery Multiplication

·  Karatsuba’s Algorithm

Chinese Remainder Theorem (CRT)

CRT is a mathematical technique that can speedup the exponentiation operation. With Chinese Remaindering, the function m = cd mod N is computed in two steps. First, evaluate m1 = cd1 mod p and m2 = cd2 mod q, where d1 and d2 are pre-computed using d, p and q are the prime components of the modulus N. Then m1 and m2 are combined to m using CRT.

Sliding Window Exponentiation

Sliding Window Exponentiation is an optimization of the ‘square and multiply’ method. This algorithm performs modular multiplication at every step. It is required to pre-compute a multiplication table which can then be used in successive computations. Hence, in each iteration a block of bits can be processed. For a 1024-bit modulus it uses a window size of five [1].

Montgomery Reduction

Montgomery is a method of implementing reduction modulo operation using a series of efficient operations. Montgomery reduction transforms a reduction modulo q into a reduction modulo some power of 2 (denoted by R). However, in order to use Montgomery reduction all variables must first be put into Montgomery form. The Montgomery form of a number x is x*R mod q [1]. Since RSA deals with huge numbers, the Montgomery reduction method speeds up the process, even though there is an overhead involved initially in putting the numbers in Montgomery form.

Attack #1: Is a practical timing attack against OpenSSL

The attack depends on time variation of various operations in OpenSSL RSA decryption:

·  Schindler’s observation of the number of extra reductions in Montgomery’s multiplication.

·  The choice of multiplication routine – Karatsuba vs. Normal multiplication.

Extra reduction in Montgomery reduction

At the end of the Montgomery reduction, a check is made if the output is greater than the modulus q. If so, subtract q from the output to ensure the output is in the range of 0 to q. This step is called extra reduction. The number of extra reductions causes a timing difference which helps us deduce how close g is to a multiple of one of these factors.

Timing in multiplicative methods

OpenSSL uses 2 different multiplicative methods:

·  Karatsuba/Recursive multiplication – for multiplying two numbers with an equal number of words.

·  Normal Multiplication – multiplying two numbers with an unequal number of words.

There is some timing information revealed by these two multiplication routines. Karatsuba is faster than normal multiplication. Hence, multiplying equal number of words takes shorter time than multiplying unequal number of words.

Attack #2: Attack using Paul C. Kocher’s Method

The actual time (Ti) taken to sign a large number of random messages is computed.

Attacker can compute (on a machine similar to the system on attack) ti, time taken to compute Mi * Mi2 (mod m) for each message. If d1 in the private key is 1, then Mi * Mi2 (mod m) for each I is performed, otherwise it is not.

By looking at set Ti and ti and the correlation between the sets, attacker can identify the value of bit 1.

Attacker can proceed in a similar manner to identify other bits.

To identify the correlation between the Ti and ti, the values were normalized and compared.

IMPLEMENTATION

Implementation Setup

All the below implementations were carried out on IBM Thinkpad T23 notebook, running SuSE Linux 10 operating system.

Implementation of RSA Cryptosystem

A simple RSA cryptosystem is implemented. It performs encryption of a message and decryption of a cipher text. Both methods use the repeated squaring algorithm for efficiency.

Encryption – The message from an input file is parsed. Every character is encrypted into a cipher text which is stored in an output file.

Decryption- The input file contains the encrypted message. This message is decrypted and the resulting deciphered text is written in an output file.

This sample implementation can handle key size of up to 32-bits.

Demonstration

Refer Appendix A-1

Attack #1: Implementation of the Practical Timing Attack over OpenSSL 0.9.7d

The OpenSSL 0.9.7d is downloaded from [6] and the build is performed to create the executables. “Blinding” function of openssl is turned of using “”

OpenSSL performs 4 types of operations:

1.  Signing (RSA_eay_private_encrypt function)

2.  Decryption (RSA_eay_private_decrypt function)

3.  Signature Verification (RSA_eay_public_decrypt function)

4.  Encryption (RSA_eay_public_encrypt function)

This implementation attacks the signing function (RSA_eay_private_encrypt) which can be found in the file rsa_eay.c in directory /crypto/rsa.

To optimize the encryption/decryption process OpenSSL uses:

·  Chinese Remainder Theorem

·  Sliding Window Exponentiation

·  Montgomery Multiplication

·  Karatsuba’s Algorithm

The implementation of Chinese Remainder Theorem is done in the function RSA_mod_exp which can be found in the file rsa_eay.c in directory /crypto/rsa.

Here the ‘rdtsc’ function [7] is used to calculate the time for the exponentiation bn_mod_exp to execute.

The Karatsuba and Normal multiplication is executed by the function BN_mul which is found in the file bn_mul.c in directory /crypto/bn.

Montgomery Reductions is executed by the function BN_mod_exp which can be found in the file bn_exp.c in the directory /crypto/bn.

BN or big number data structure - is a key data structure used. This data structure is needed since normal C data types have ‘int’ or ‘long long’ which are of size 32 and 64 bits (on Intel-386 architecture) respectively. On the other hand BN data structure maintains a pointer to a large data type and can handle numbers which are of the order of 1024 or more bits. This is needed since the size of the modulus, p or q values, and private key used for practical purposes are of the order of 1024 bits.

Particulars of the Attack:

The attack proceeds to guess the value of q (where N = p * q and q < p) one bit at a time, using the decryption timing information for certain known plain-texts.

The initial guess for q lies between lies between 2512 and 2511. Decryption times for different possible combinations of first few bits are found and arrived at an initial guess by finding the peaks in these decryption time.

Now suppose we have already found top i-1 bits.

·  g is the top i-1 bits of q (assuming these bits are already recovered) and the remaining bits are 0.

·  Let ghi be equal to g, but with ith bit is set to 1. This implies g < ghi < q or g < q < ghi

·  Let ug = g * R-1 mod N

·  Let ughi = ghi * R-1 mod N

·  Time to decrypt ug and ughi are measured.

·  Note, ug and ughi are used instead of g and ghi as RSA decryption converts its input to Montgomery form before exponentiation and hence will use g and ghi.

·  The difference in DecryptionTime(ug) and DecryptionTime(ughi) is used to determine bit i of q. If this difference is ‘large’, then bit i of q is 0 and g < q < ghi. If this difference is ‘small’ then bit i of q is 1 and g < ghi < q. This large and small difference is due to time variations in openssl (Extra reduction and multiplication algorithm used) that were described before.

For any particular bit of q, the number of queries for a guess g is determined by two parameters [1]:

Neighborhood Size – For every bit of q, measure the decryption time for a neighborhood of values g, g+1, g+2, ……,g+n. [1]

Sample Size – For each value of g+i, sample the decryption time multiple times and compute the median decryption time. This is required to overcome the effect of a multi-user environment. Repeatedly decrypting for g+k and using the median value as the effective decryption time is more effective than doing it once. [1]

The neighborhood and sample size must be large enough to obtain delta values with a strong indicator of the private key bit. I have chosen a sample size of 7 and neighborhood size of 3200.

The program doing the attack is using some functions from OpenSSL’s libcrypto library. These functions are part of OpenSSL and were used to handle big numbers of g, ghi, etc and doing math between the big numbers. The list of functions used is as follows:

·  BN_init - function is used to initialize any big number data type.

·  BN_bin2bn – Converts a binary value to big number form.

·  BN_uadd – performs the addition of two big numbers and stores the result in a third big number.

·  BN_mul – performs the multiplication of R^-1 (mod N) with the input cipher text (neighbor value in this attack)

·  BN_print_fp - This function prints the input big number data structure to a file.

·  BN_clear_bit - is used to set the input bit value to 0.

·  BN_set_bit - is used to set the input bit value to 1.

The Algorithm to recover the private key bits makes use of time variances which occur in OpenSSL’s implementation of RSA and is as in Figure 2:

Initialize g with top i-1 bits of q.

ghi is made equal to g.

Determine R-1 mod N (This can be gotten by looking at openssl source code and is not a secret).

While there are more bits to be found

BN_set_bit function is used to set bit i of ghi to 1.

for k = 0 to Neighborhood size

BN_add function is used to add g to k.

BN_mul is used to multiply g+k with R-1 mod N

Store the multiply result in ug

/* determine the time used to decrypt ug */

for j=0 to Sample Size

Call the OpenSSL signing/decryption function with arguments.

Note the difference in start and end times for ug

end for

let t1 be the median of decryption times of ug over the Sample Size.

/* the same process is repeated with ughi */

BN_add function is used to add ghi to k.

Use BN_mul to multiply the add result with R-1 mod N

Store the multiply result in ughi

for j=0 to Sample Size

Call the OpenSSL signing/decryption function with arguments.

Note the difference in start and end times for ughi

end for

let t2 be the median of decryption times of ughi over the Sample Size.

let delta = | t1 – t2 |

If delta is “large” then

/* bit i of q is 0 */

BN_clear_bit function is used to clear bit i in ghi

else