Lecture 8 - Basic Number Theory

Computer and Network Security Lecture 8 – Basic Number Theory
Dahlia Malkhi, Shlomo Kipnis
The Hebrew University of Jerusalem

In this lecture, we review some of the basics of number theory that will be used in the following lectures.

Definitions

Additive Group

Consider the set Zp = {0, 1, 2, …, p-1} with the operation ‘addition mod p’ (denoted by +). The set Zp with the + operation is an Additive Group because it has the following properties:

It is closed – for every a and b in Zp, a+b is also a member of Zp.
Formally: $\forall a \in Z_p, \forall b \in Z_p : (a+b) \in Z_p$.
It has a “Zero” element – there is an element z in Zp such that, for each member a in Zp, applying the operation + to a and z results in a.
Formally: $\exists z \in Z_p : \forall a \in Z_p, \; a+z = a$.
Every element has an opposite element – for every element a in Zp, there exists an element b in Zp such that a+b = 0.
Formally: $\forall a \in Z_p, \exists b \in Z_p : a+b = 0$.
Associativity – for every three elements in Zp, the result of the + operation does not depend on how the operands are grouped.
Formally: $\forall a \in Z_p, \forall b \in Z_p, \forall c \in Z_p : (a+b)+c = a+(b+c)$.

Multiplicative Group

Let us look at the set Zp* = {1, 2, 3, …, p-1} with the operation ‘multiplication mod p’ (denoted by *). Similarly, when p is prime, this set is a Multiplicative Group because it has the following properties:

It is closed – for every a and b in Zp*, (a*b) is also a member of Zp*.
Formally: $\forall a \in Z_p^*, \forall b \in Z_p^* : (a*b) \in Z_p^*$.
It has a “Unity” element – there is an element u in Zp* such that, for each member a in Zp*, applying the operation * to a and u results in a.
Formally: $\exists u \in Z_p^* : \forall a \in Z_p^*, \; a*u = a$.
Every element has an inverse element (denoted as $a^{-1}$) – for every element a in Zp*, there exists an element b in Zp* such that a*b = 1.
Formally: $\forall a \in Z_p^*, \exists b \in Z_p^* : a*b = 1$.
Associativity – for every three elements in Zp*, the result of the * operation does not depend on how the operands are grouped.
Formally: $\forall a \in Z_p^*, \forall b \in Z_p^*, \forall c \in Z_p^* : (a*b)*c = a*(b*c)$.
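
The group properties can be checked by brute force for small moduli. The following is a minimal sketch, not part of the original notes; the function name is_multiplicative_group is an assumption chosen for illustration. Associativity is not tested because it is inherited from ordinary integer multiplication.

```python
def is_multiplicative_group(n):
    """Checks closure, unity, and inverses for {1, ..., n-1} under multiplication mod n."""
    elements = set(range(1, n))
    # Closure: every product must stay inside the set (in particular, never be 0).
    closed = all((a * b) % n in elements for a in elements for b in elements)
    # Unity: the element 1 leaves every member unchanged.
    has_unity = all((a * 1) % n == a for a in elements)
    # Inverses: every member a has some b with a*b = 1 (mod n).
    has_inverses = all(any((a * b) % n == 1 for b in elements) for a in elements)
    return closed and has_unity and has_inverses

print(is_multiplicative_group(7))  # prime modulus
print(is_multiplicative_group(8))  # composite modulus: 2*4 = 0 (mod 8) breaks closure
```

For the prime modulus 7 all three checks pass, while for the composite modulus 8 the check fails, which is why the definition is stated for a prime p.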

Field

A set is a Field if it is an Additive Group, its non-zero elements form a Multiplicative Group, and it has the following properties:

Commutativity – $\forall a \in Z_p, \forall b \in Z_p : a+b = b+a$ and $a*b = b*a$.

Distributivity – $\forall a \in Z_p, \forall b \in Z_p, \forall c \in Z_p : a*(b+c) = a*b + a*c$.

An example of a Field is the set of integers modulo a prime p: the structure (Zp, +, *, 0, 1) where Zp = {0, 1, 2, …, p-1}.

Properties

If p is prime, then the set Zp* = {1, 2, …, p-1} with the operation multiplication modulo p defined on it, has the following properties:

Zp* is Cyclic

Zp* is Cyclic, meaning it has a generator. A generator is an element g of Zp* such that every element of Zp* is the result of raising g to the j-th power for some j, where $1 \le j \le p-1$.
Formally: $Z_p^* = \{g^i \bmod p : i = 1, 2, \ldots, p-1\} = \{g^1, g^2, g^3, \ldots, g^{p-1}\}$.

A cyclic group may have more than one generator.

Let us consider the following example: for Z7* = {1, 2, 3, 4, 5, 6} the element 3 is a generator, since:
$3^1 = 3 \pmod 7$
$3^2 = 2 \pmod 7$
$3^3 = 6 \pmod 7$
$3^4 = 4 \pmod 7$
$3^5 = 5 \pmod 7$
$3^6 = 1 \pmod 7$
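
The example can be reproduced with a short script. This is an illustrative sketch; the helper name is_generator is an assumption, not notation from the lecture. It relies only on Python's built-in three-argument pow for modular exponentiation.

```python
def is_generator(g, p):
    """True if the powers g^1, ..., g^(p-1) mod p cover all of {1, ..., p-1}."""
    powers = {pow(g, i, p) for i in range(1, p)}
    return powers == set(range(1, p))

# Powers of 3 modulo 7, in the order 3^1, 3^2, ..., 3^6.
print([pow(3, i, 7) for i in range(1, 7)])              # [3, 2, 6, 4, 5, 1]
# All generators of Z7*, illustrating that there can be more than one.
print([g for g in range(1, 7) if is_generator(g, 7)])   # [3, 5]
```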

Fermat’s Little Theorem

If p is prime, then for each element a in the set Zp*: $a^{p-1} = 1 \pmod p$.

Let us prove this theorem. Since p is prime, a and p are relatively prime (the term ‘relatively prime’ means that they do not share any common factor other than 1). In this case, a has an inverse modulo p, and therefore $a*b = a*c \pmod p$ implies $b = c \pmod p$.

Since a and p are relatively prime, there is no k in Zp* for which $a*k = 0 \pmod p$. Moreover, because a has an inverse, two different multiples of a cannot be equal modulo p. This is why the multiples a (mod p), 2a (mod p), …, (p-1)a (mod p) give all the residues 1, 2, …, p-1 in some permuted order, and so their products are equal:

$a * 2a * \ldots * (p-1)a = 1 * 2 * \ldots * (p-1) \pmod p$

$a^{p-1} * [1 * 2 * \ldots * (p-1)] = [1 * 2 * \ldots * (p-1)] \pmod p$

Since Zp* is a multiplicative group, we can cancel $[1 * 2 * \ldots * (p-1)]$ from both sides of the equation to obtain: $a^{p-1} = 1 \pmod p$.

From this theorem, we can easily deduce that:

$a^p = a \pmod p$, because $a * a^{p-1} = a * 1 = a$.
$a^{-1} = a^{p-2} \pmod p$, because $a * a^{p-2} = a^{p-1} = 1$.

The second deduction gives us a way to calculate the inverse of an element ($a^{-1}$ is the inverse of a) with O(log p) multiplications, in comparison to a search that takes O(p) steps. This is possible because $a^{p-2}$ can be calculated with O(log p) modular multiplications by repeated squaring.
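
A minimal sketch of this computation follows, assuming p is prime and a is not a multiple of p. The names mod_pow and fermat_inverse are chosen here for illustration; Python's built-in pow(a, p - 2, p) would do the same job as mod_pow.

```python
def mod_pow(base, exponent, modulus):
    """Square-and-multiply: computes base**exponent % modulus in O(log exponent) multiplications."""
    result = 1
    base %= modulus
    while exponent > 0:
        if exponent & 1:                    # current low bit of the exponent is 1
            result = (result * base) % modulus
        base = (base * base) % modulus      # square for the next bit
        exponent >>= 1
    return result

def fermat_inverse(a, p):
    """Inverse of a modulo p as a^(p-2) mod p (assumes p is prime and p does not divide a)."""
    return mod_pow(a, p - 2, p)

inv = fermat_inverse(3, 7)
print(inv, (3 * inv) % 7)  # 5 1
```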

Properties regarding order(a)

The order of a, denoted as order(a), is the smallest positive integer b that satisfies the equation $a^b = 1 \pmod p$.
For example: order(1) = 1.

For every a in Zp*, order(a) is a divisor of p-1 (order(a) divides p-1).
Formally: $\forall a \in Z_p^* : \mathrm{order}(a) \mid p-1$.
An element a of Zp* is a square (meaning there exists a b in Zp* such that $a = b^2 \pmod p$) if and only if $a^{(p-1)/2} = 1 \pmod p$.
Formally: $(\exists b \in Z_p^* : a = b^2) \iff a^{(p-1)/2} = 1 \pmod p$.
For a generator g, the equation $g^x = g^y \pmod p$ is true if and only if $x = y \pmod{p-1}$.
Formally: $g^x = g^y \pmod p \iff x = y \pmod{p-1}$.
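
These properties can be observed directly for a small prime. The sketch below is illustrative only; the function name order and the choice p = 7 are assumptions. It prints, for each a in Z7*, its order, whether the order divides p-1, whether a is a square (found by search), and whether $a^{(p-1)/2} = 1 \pmod p$.

```python
def order(a, p):
    """Smallest positive b with a**b % p == 1 (assumes p is prime and 1 <= a < p)."""
    b, power = 1, a % p
    while power != 1:
        power = (power * a) % p
        b += 1
    return b

p = 7
for a in range(1, p):
    divides = (p - 1) % order(a, p) == 0
    is_square = any((b * b) % p == a for b in range(1, p))   # exhaustive search for a root
    criterion = pow(a, (p - 1) // 2, p) == 1                  # the test stated above
    print(a, order(a, p), divides, is_square, criterion)
```

In every row the last two columns agree, and the order always divides 6.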

Finding Big Primes

In public key cryptographic systems, we need to find large primes and to perform operations with prime numbers. It is essential that we can find such primes fast and with high levels of certainty.

The prime-counting function, denoted as Π(x), gives the number of primes between 1 and x. For example: Π(10) = 4 (the prime numbers in the range [1..10] are 2, 3, 5, 7 – a total of four).

This function satisfies the relation: $\Pi(x) \approx \frac{x}{\ln x}$

Since this is true for the number of primes from 1 to x, dividing each side of the relation by x gives us the density of prime numbers near a specific y, which is approximately $\frac{1}{\ln y}$.

We can see that primes are quite dense: for a number of the order of $2^{1000}$, there are enough prime numbers near it (more precisely, about one in every $\ln(2^{1000}) \approx 693$ numbers will be prime). This means that in order to find a prime of magnitude ~n, we need to perform a primality testing procedure only for about ln(n) randomly chosen numbers. But performing a ‘brute-force’ test for primality on large numbers is very expensive; for a number of the order of $2^{1000}$, we have to check all the numbers that are smaller than its square root, which is about $2^{500}$ numbers.
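
To get a feel for the density of primes, the exact count Π(x) can be compared with the estimate x / ln x for small x. This is a rough sketch; the function name count_primes is an assumption, and the sieve of Eratosthenes is used only because it is convenient for small inputs (math.isqrt needs Python 3.8 or later).

```python
import math

def count_primes(x):
    """Number of primes in [1..x], using the sieve of Eratosthenes."""
    if x < 2:
        return 0
    is_prime = [True] * (x + 1)
    is_prime[0] = is_prime[1] = False
    for i in range(2, math.isqrt(x) + 1):
        if is_prime[i]:
            for multiple in range(i * i, x + 1, i):
                is_prime[multiple] = False
    return sum(is_prime)

# Exact count next to the x / ln x estimate.
for x in (10, 1000, 100000):
    print(x, count_primes(x), round(x / math.log(x)))
```

The estimate is not exact for small x, but it has the right order of magnitude.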

We will now describe two witnesses of compositeness (= non-primality) of numbers. If a number lacks a property that all prime numbers have, we can be certain that this number is not prime.

Witness 1: According to Fermat’s Little Theorem, if p is prime, then for every a in Zp* we have: $a^{p-1} = 1 \pmod p$.

If we find an element a such that $a^{n-1} \ne 1 \pmod n$, we can be certain that n is not prime. To be more certain that n is prime, we randomly pick several elements a1, a2, … and perform this test for each of them. If none of them satisfies the inequality, then with high probability n is prime. It is unlikely that a non-prime number will pass this test many times.

A disturbing problem of this nice algorithm is the existence of a few composite numbers, which are called ‘Carmichael numbers’. Those numbers pass this primality test for every a that is relatively prime to n. Those numbers become scarcer among large numbers, but they still exist.
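
The test based on witness 1 can be sketched as follows. The function name fermat_test and the default number of rounds are assumptions made for this illustration. The second part checks the number 561 = 3 * 11 * 17, a well-known Carmichael number, against every base that is relatively prime to it.

```python
import math
import random

def fermat_test(n, rounds=20):
    """False if some random a shows a^(n-1) != 1 (mod n); True means 'probably prime'."""
    for _ in range(rounds):
        a = random.randrange(1, n)
        if pow(a, n - 1, n) != 1:
            return False        # a is a witness: n is certainly composite
    return True

n = 561  # composite: 3 * 11 * 17
fooled = all(pow(a, n - 1, n) == 1 for a in range(1, n) if math.gcd(a, n) == 1)
print(fooled)  # True: no base relatively prime to 561 exposes it
```

Only bases that share a factor with 561 reveal its compositeness in this test, which is the weakness the second witness addresses.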

Witness 2: If p is prime, then the equation $x^2 = 1 \pmod p$ has only two solutions: $x = 1$ and $x = -1 \pmod p$ (note: the two solutions are distinct for odd p, or in other words, p ≠ 2).

The proof of this claim is as follows:
If $x^2 = 1 \pmod p$ then $x^2 - 1 = (x+1)(x-1) = 0 \pmod p$. Hence, since p is prime, p divides (x+1) or (x-1). From this we deduce that either $x = 1 \pmod p$ or $x = -1 \pmod p$. (In general, p could divide (x+1) or (x-1) or both. For odd p, dividing both is impossible, since p would then also divide their difference, which is 2.)

If we find an element d such that $d \ne 1 \pmod n$ and $d \ne -1 \pmod n$ but $d^2 = 1 \pmod n$, then we can be certain that n is not prime. As with the former witness, we will repeat the test to be more certain.
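
For small moduli, the square roots of 1 can simply be listed. In this illustrative sketch the function name square_roots_of_one and the moduli 13 and 15 are assumptions chosen for the example.

```python
def square_roots_of_one(n):
    """All x in {1, ..., n-1} with x*x % n == 1 (brute force, small n only)."""
    return [x for x in range(1, n) if (x * x) % n == 1]

print(square_roots_of_one(13))  # [1, 12]: only 1 and -1 for a prime modulus
print(square_roots_of_one(15))  # [1, 4, 11, 14]: 4 and 11 are non-trivial roots
```

The roots 4 and 11 modulo 15 are witnesses of the second kind: finding either one proves that 15 is composite.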

Primality Testing (Miller-Rabin)

The following algorithm is a probabilistic algorithm for testing the primality of a number. It is important to clarify that every prime number always passes the algorithm. However, the algorithm might declare a composite number to be prime, but this happens only with extremely small probability.

The algorithm

Pick a large odd n.
Repeat the next three steps s times:

Pick a random element a from the set {1, …, n-1}.
Test if $a^{n-1} \ne 1 \pmod n$.
As seen above (witness 1), if this relation is satisfied, n is not prime.
If this relation is not satisfied, then we continue.
While computing $a^{n-1}$, test whether we ever come across an intermediate value r, where $r \ne \pm 1 \pmod n$, such that $r^2 = 1 \pmod n$ (meaning r is not one of the two trivial square roots of 1 modulo n).
As seen above (witness 2), if we find such an r, then n is not prime.
As long as no such r is found, we continue.

If in all s rounds we do not come to the conclusion that n is not prime, we conclude that n is prime (with high probability).
If n proves to be composite, try n+2, etc.

Pseudo-code of Miller-Rabin method

For every randomly chosen number a, we perform the procedure witness(n, a):

witness (n, a):
    Let (b_k, b_{k-1}, …, b_0) be the binary representation of the number n-1.
    d = 1
    for (i = k downto 0):
        root = d
        d = d^2 (mod n)
        if (d = 1) and (root ≠ 1) and (root ≠ n-1) return true
        if (b_i = 1) d = d*a (mod n)
    if (d ≠ 1) return true
    else return false
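
The pseudo-code above translates almost line by line into Python. The sketch below is one possible rendering, not the lecturers' own implementation; the wrapper name miller_rabin and its default of s = 20 rounds are assumptions.

```python
import random

def witness(n, a):
    """True if a proves that n is composite (follows the pseudo-code above)."""
    d = 1
    for bit in bin(n - 1)[2:]:              # bits b_k ... b_0 of n-1, most significant first
        root = d
        d = (d * d) % n
        if d == 1 and root != 1 and root != n - 1:
            return True                     # non-trivial square root of 1 (witness 2)
        if bit == "1":
            d = (d * a) % n
    return d != 1                           # here d = a^(n-1) mod n (witness 1)

def miller_rabin(n, s=20):
    """False if n is certainly composite, True if n is probably prime (n odd, n > 2)."""
    for _ in range(s):
        a = random.randrange(1, n)
        if witness(n, a):
            return False
    return True

print(witness(561, 2))    # True: 561 passes the plain Fermat test for a = 2 but fails here
print(miller_rabin(97))   # True
```

For a prime n, witness never returns true, so miller_rabin always accepts primes; only composite numbers can be misclassified.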

Error rate of Miller-Rabin primality test

According to a known theorem, if n is an odd composite number, then the number of witnesses to the compositeness of n is at least (n-1)/2. This means that for any such n, the Miller-Rabin algorithm has a chance of at most ½ of not discovering its compositeness for a randomly chosen a < n. The probability of not discovering its compositeness in s rounds is at most $(1/2)^s$. Therefore, when the algorithm ends with the conclusion that n is prime, the certainty that it is indeed prime is $1 - (1/2)^s$. The larger s is, the more we can be sure that n is prime if it did not fail the tests.

The Discrete Log Problem

Let us now look at the following problem: Given a prime number p, the group Zp*, a generator g of Zp*, and $y \in Z_p^*$, find x such that $g^x = y \pmod p$. In other words, solve the equation $x = \log_g y$. Since g is a generator of Zp*, we know that such an x exists.

This problem of finding the Discrete Log is considered to be very difficult. No polynomial-time algorithm for it is known; straightforward algorithms, such as an exhaustive search over x, are exponential in the length of p, meaning that they take $O(2^{f(|p|)})$ steps, where f(|p|) is a linear function in the size of p.
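
The most naive approach, exhaustive search, is sketched below to make the cost concrete. The function name discrete_log is an assumption. The loop tries up to p-1 exponents, which is exponential in the number of bits of p, so this is only usable for toy sizes.

```python
def discrete_log(g, y, p):
    """Smallest x in {1, ..., p-1} with g**x % p == y, or None if there is none."""
    power = 1
    for x in range(1, p):
        power = (power * g) % p     # power now holds g^x mod p
        if power == y % p:
            return x
    return None

print(discrete_log(3, 6, 7))  # 3, since 3^3 = 27 = 6 (mod 7)
```

When g is not a generator, some targets have no solution; for instance the powers of 2 modulo 7 are only 2, 4 and 1, so discrete_log(2, 3, 7) returns None.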

Number Factoring Problem

Let us now look at the following problem: Given a number n that is the product of two large primes p and q (n = pq), find its two prime factors p and q. This problem is considered to be very difficult.
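
For comparison, the simplest factoring method is trial division, sketched below. The function name factor_semiprime is an assumption. It tries up to about the square root of n candidates, which is hopeless for the sizes used in cryptography but works for small examples.

```python
import math

def factor_semiprime(n):
    """Smallest non-trivial divisor p of n and the cofactor n // p, or None if n has none."""
    for p in range(2, math.isqrt(n) + 1):
        if n % p == 0:
            return p, n // p
    return None

print(factor_semiprime(10403))  # (101, 103)
```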

Euler’s Totient Function

Euler’s Totient function gives the number of positive integers less than n that are relatively prime to n (that is, they do not share any common factor other than 1 with n). This function, called the Totient Function (supposedly from Total and Quotient), turned out to be so useful that it was given its own notation, $\varphi(n)$.

If p is prime, $\varphi(p) = p-1$.
Explanation: When p is prime, all the integers {1,2, …, p-1} are relatively prime to p.

If n is a product of two distinct primes, p and q (n = pq), then $\varphi(n) = (p-1)(q-1)$.
Explanation: In total, there are n = pq numbers in the set {0, 1, 2, …, n-1}. In order to get the value of $\varphi(n)$, we need to remove the numbers that are not relatively prime to n. Because p and q are prime, the only numbers that are not relatively prime to n are the numbers that are either a multiple of p or a multiple of q. In the set {0, 1, 2, …, n-1}, there are p multiples of q: {0, q, 2q, …, (p-1)q}, and q multiples of p: {0, p, 2p, …, (q-1)p}. So, there are p+q-1 numbers that are not relatively prime to n (0 is a multiple of both q and p, so it appears in both lists and must be counted only once). Thus we obtain: $\varphi(pq) = pq - (p+q-1) = (p-1)(q-1)$.
As an example, $\varphi(10) = |\{1, 3, 7, 9\}| = 4$, and, because 10 = 2*5: $\varphi(2*5) = (2-1)*(5-1) = 4$.
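
The formula can be checked against a direct count. In this sketch the function name phi is an assumption, and the counting approach is only practical for small n.

```python
from math import gcd

def phi(n):
    """Euler's totient by direct counting of k in {1, ..., n-1} with gcd(k, n) == 1."""
    return sum(1 for k in range(1, n) if gcd(k, n) == 1)

print(phi(10), (2 - 1) * (5 - 1))       # 4 4
print(phi(7 * 11), (7 - 1) * (11 - 1))  # 60 60
print(phi(13))                          # 12, that is p - 1 for a prime
```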

Euler’s theorem

For every a and n that are relatively prime, $a^{\varphi(n)} = 1 \pmod n$.

Fermat’s little theorem is a special case of Euler’s theorem, obtained when n is a prime p, since then $\varphi(p) = p-1$.
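
As a small worked instance (the numbers are chosen here for illustration), take n = 10 and a = 3, where the totient of 10 is 4, and then the prime case n = 7 with the same a, where the exponent becomes 7 - 1 = 6:

```latex
3^{\varphi(10)} = 3^{4} = 81 = 8 \cdot 10 + 1 \equiv 1 \pmod{10}
\qquad
3^{\varphi(7)} = 3^{6} = 729 = 104 \cdot 7 + 1 \equiv 1 \pmod{7}
```

The second line is exactly the statement of Fermat's little theorem for p = 7.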

Euclid’s Algorithm

The Greatest Common Divisor (gcd) of two integers is the largest integer that evenly divides both of them. Two integers are relatively prime if and only if their gcd is 1. For example, gcd(12,8) = 4, gcd(12,25) = 1, and gcd(12,24) = 12. So – 12 and 25 are relatively prime but the other pairs are not. Note that the above definitions imply that for any positive integer x, we have that 1 is relatively prime to x, and we have gcd(0,x) = x.

Euclid’s algorithm is a method of finding the gcd of two numbers x and y. The idea is to repeatedly replace the original numbers with smaller numbers that have the same gcd, until one of the numbers becomes zero. The remaining number is the gcd. It is easy to show that <x,y> and <x-y,y> have the same common divisors (and so the same greatest common divisor). So, we can subtract y from x and still have the same gcd. But since we would like to make our new numbers as small as possible, we may as well subtract as many y’s as we can from x, so we replace x with its remainder when divided by y. Since this gets us no further once x is smaller than y, we then switch our new x and y and repeat the process. Each step of the algorithm now looks like:

<x,y> is replaced by <y, remainder(x/y)>

Since at each step one of the numbers gets smaller, eventually one of the numbers will be zero, and so the other will be the gcd.
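
The step described above can be written as a short loop. This is a minimal sketch; the function name euclid_gcd is an assumption, and the inputs are taken to be non-negative integers.

```python
def euclid_gcd(x, y):
    """gcd of two non-negative integers by repeatedly replacing <x, y> with <y, x mod y>."""
    while y != 0:
        x, y = y, x % y
    return x

print(euclid_gcd(12, 8), euclid_gcd(12, 25), euclid_gcd(12, 24), euclid_gcd(0, 5))  # 4 1 12 5
```

The printed values match the examples given earlier: gcd(12,8) = 4, gcd(12,25) = 1, gcd(12,24) = 12, and gcd(0,x) = x.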

This algorithm can also be used to efficiently find multiplicative inverses modulo n. In RSA (see next lecture), the numbers d and e are inverses; one is chosen, and the other is calculated using Euclid’s algorithm.

Lemma: The gcd of two numbers x and y can be expressed as the sum of some multiple of each: gcd(x,y) = ux + vy, where u and v are integers.
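
One common way to obtain the integers u and v of the lemma is the extended form of Euclid's algorithm, sketched here recursively. The function names extended_gcd and mod_inverse are assumptions chosen for illustration. When gcd(e, n) = 1, the coefficient u of e is the multiplicative inverse of e modulo n.

```python
def extended_gcd(x, y):
    """Returns (g, u, v) with g = gcd(x, y) and g == u*x + v*y."""
    if y == 0:
        return x, 1, 0
    g, u, v = extended_gcd(y, x % y)
    # Here g == u*y + v*(x % y), and x % y == x - (x // y)*y,
    # so g == v*x + (u - (x // y)*v)*y.
    return g, v, u - (x // y) * v

def mod_inverse(e, n):
    """Inverse of e modulo n; raises ValueError if e and n are not relatively prime."""
    g, u, _ = extended_gcd(e, n)
    if g != 1:
        raise ValueError("e and n are not relatively prime")
    return u % n

g, u, v = extended_gcd(12, 8)
print(g, u * 12 + v * 8)     # 4 4
print(mod_inverse(7, 40))    # 23, since 7 * 23 = 161 = 4 * 40 + 1
```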
