Introduction to the Advanced Encryption Standard:Advanced Encryption Standard (AES) http://www.progressive-coding.com/tutorial.php?id=0&print=1

3 of 25 5/14/2007 10:27 PM

The Advanced Encryption Standard, in the following referenced as AES, is the winner of

the contest, held in 1997 by the US Government, after the Data Encryption Standard

was found too weak because of its small key size and the technological advancements in

processor power. Fifteen candidates were accepted in 1998 and based on public comments

the pool was reduced to five finalists in 1999. In October 2000, one of these five

algorithms was selected as the forthcoming standard: a slightly modified version of the

Rijndael.

The Rijndael, whose name is based on the names of its two Belgian inventors, Joan

Daemen and Vincent Rijmen, is a Block cipher, which means that it works on

fixed-length group of bits, which are called blocks. It takes an input block of a certain size,

usually 128, and produces a corresponding output block of the same size. The

transformation requires a second input, which is the secret key. It is important to know

that the secret key can be of any size (depending on the cipher used) and that AES uses

three different key sizes: 128, 192 and 256 bits.

To encrypt messages longer than the block size, a mode of operation is chosen, which I

will explain at the very end of this tutorial, after the implementation of AES.

While AES supports only block sizes of 128 bits and key sizes of 128, 192 and 256 bits, the

original Rijndael supports key and block sizes in any multiple of 32, with a minimum of

128 and a maximum of 256 bits.

Further readings:

Unlike DES, which is based on an Feistel network, AES is a

substitution-permutation network, which is a series of mathematical operations that

use substitutions (also called S-Box) and permutations (P-Boxes) and their careful

definition implies that each output bit depends on every input bit.

Description of the Advanced Encryption Standard algorithm

AES is an iterated block cipher with a fixed block size of 128 and a variable key length. The

different transformations operate on the intermediate results, called state. The state is a

rectangular array of bytes and since the block size is 128 bits, which is 16 bytes, the

rectangular array is of dimensions 4x4. (In the Rijndael version with variable block size,

the row size is fixed to four and the number of columns vary. The number of columns isAdvanced Encryption Standard (AES) http://www.progressive-coding.com/tutorial.php?id=0&print=1

4 of 25 5/14/2007 10:27 PM

the block size divided by 32 and denoted Nb). The cipher key is similarly pictured as a

rectangular array with four rows. The number of columns of the cipher key, denoted Nk, is

equal to the key length divided by 32.

A state:

------

| a0,0 | a0,1 | a0,2 | a0,3 |

| a1,0 | a1,1 | a1,2 | a1,3 |

| a2,0 | a2,1 | a2,2 | a2,3 |

| a3,0 | a3,1 | a3,2 | a3,3 |

------

A key:

------

| k0,0 | k0,1 | k0,2 | k0,3 |

| k1,0 | k1,1 | k1,2 | k1,3 |

| k2,0 | k2,1 | k2,2 | k2,3 |

| k3,0 | k3,1 | k3,2 | k3,3 |

------

It is very important to know that the cipher input bytes are mapped onto the the state

bytes in the order a0,0, a1,0, a2,0, a3,0, a0,1, a1,1, a2,1, a3,1 ... and the bytes of the cipher

key are mapped onto the array in the order k0,0, k1,0, k2,0, k3,0, k0,1, k1,1, k2,1, k3,1 ... At

the end of the cipher operation, the cipher output is extracted from the state by taking the

state bytes in the same order. AES uses a variable number of rounds, which are fixed: A

key of size 128 has 10 rounds. A key of size 192 has 12 rounds. A key of size 256 has 14

rounds. During each round, the following operations are applied on the state:

1. SubBytes: every byte in the state is replaced by another one, using the Rijndael S-Box

2. ShiftRow: every row in the 4x4 array is shifted a certain amount to the left

3. MixColumn: a linear transformation on the columns of the state

AddRoundKey: each byte of the state is combined with a round key, which is a

different key for each round and derived from the Rijndael key schedule

4.

In the final round, the MixColumn operation is omitted. The algorithm looks like the

following (pseudo-C):

AES(state, CipherKey)

}

KeyExpansion(CipherKey, ExpandedKey);

AddRoundKey(state, ExpandedKey);

for (i = 1; i < Nr; i++)

}

Round(state, ExpandedKey + Nb*i);

{

FinalRound(state, ExpandedKey + Nb * Nr);

{

Observations:

The cipher key is expanded into a larger key, which is later used for the actualAdvanced Encryption Standard (AES) http://www.progressive-coding.com/tutorial.php?id=0&print=1

5 of 25 5/14/2007 10:27 PM

operations

The roundKey is added to the state before starting the with loop

The FinalRound() is the same as Round(), apart from missing the MixColumns()

operation.

During each round, another part of the ExpandedKey is used for the operations

The ExpandedKey shall ALWAYS be derived from the Cipher Key and never be

specified directly.

AES operations: SubBytes, ShiftRow, MixColumn and

AddRoundKey

The AddRoundKey operation:

In this operation, a Round Key is applied to the state by a simple bitwise XOR. The Round

Key is derived from the Cipher Key by the means of the key schedule. The Round Key

length is equal to the block key length (=16 bytes).

------

| a0,0 | a0,1 | a0,2 | a0,3 | | k0,0 | k0,1 | k0,2 | k0,3 | | b0,0 | b0,1 | b0,2 | b0,3 |

| a1,0 | a1,1 | a1,2 | a1,3 | XOR | k2,0 | k2,1 | k2,2 | k2,3 | = | b2,0 | b2,1 | b2,2 | b2,3 |

| a2,0 | a2,1 | a2,2 | a2,3 | | k1,0 | k1,1 | k1,2 | k1,3 | | b1,0 | b1,1 | b1,2 | b1,3 |

| a3,0 | a3,1 | a3,2 | a3,3 | | k3,0 | k3,1 | k3,2 | k3,3 | | b3,0 | b3,1 | b3,2 | b3,3 |

------

where: b(i,j) = a(i,j) XOR k(i,j)

A graphical representation of this operation can be seen below:

The ShiftRow operation:Advanced Encryption Standard (AES) http://www.progressive-coding.com/tutorial.php?id=0&print=1

6 of 25 5/14/2007 10:27 PM

In this operation, each row of the state is cyclically shifted to the left, depending on the

row index.

The 1st row is shifted 0 positions to the left.

The 2nd row is shifted 1 positions to the left.

The 3rd row is shifted 2 positions to the left.

The 4th row is shifted 3 positions to the left.

------

| a0,0 | a0,1 | a0,2 | a0,3 | | a0,0 | a0,1 | a0,2 | a0,3 |

| a1,0 | a1,1 | a1,2 | a1,3 | -> | a1,1 | a0,2 | a1,3 | a1,0 |

| a2,0 | a2,1 | a2,2 | a2,3 | | a2,2 | a2,3 | a2,0 | a2,1 |

| a3,0 | a3,1 | a3,2 | a3,3 | | a3,3 | a3,0 | a3,1 | a3,2 |

------

A graphical representation of this operation can be found below:

Please note that the inverse of ShiftRow is the same cyclically shift but this time to the

right. It will be needed later for decoding.

The SubBytes operation:

The SubBytes operation is a non-linear byte substitution, operating on each byte of the

state independently. The substitution table (S-Box) is invertible and is constructed by

the composition of two transformations:

1. Take the multiplicative inverse in Rijndael's finite field

2. Apply an affine transformation which is documented in the Rijndael documentation.

Since the S-Box is independent of any input, pre-calculated forms are used, if enough

memory (256 bytes for one S-Box) is available. Each byte of the state is then substituted by

the value in the S-Box whose index corresponds to the value in the state:

a(i,j) = SBox[a(i,j)]

Please note that the inverse of SubBytes is the same operation, using the inversed S-Box,

which is also precalculated.Advanced Encryption Standard (AES) http://www.progressive-coding.com/tutorial.php?id=0&print=1

7 of 25 5/14/2007 10:27 PM

The MixColumn operation:

I will keep this section very short since it involves a lot of very advance mathematical

calculations in the Rijndael's finite field. All you have to know is that it corresponds to

the matrix multiplication with:

2 3 1 1

1 2 3 1

1 1 2 3

3 1 1 2

and that the addition and multiplication operations are a little different from the normal

ones.

You can skip this part if you are not interested in the math involved:

Addition and Substraction:

Addition and subtraction are performed by the exclusive or operation. The two operations are the same;

there is no difference between addition and subtraction.

Multiplication in Rijndael's galois field is a little more complicated. The procedure is as follows:

Take two eight-bit numbers, a and b, and an eight-bit product p

Set the product to zero.

Make a copy of a and b, which we will simply call a and b in the rest of this algorithm

Run the following loop eight times:

1. If the low bit of b is set, exclusive or the product p by the value of a

2. Keep track of whether the high (eighth from left) bit of a is set to one

Rotate a one bit to the left, discarding the high bit, and making the low bit have a value

of zero

3.

If a's hi bit had a value of one prior to this rotation, exclusive or a with the hexadecimal

number 0x1b

4.

Rotate b one bit to the right, discarding the low bit, and making the high (eighth from

left) bit have a value of zero.

5.

The product p now has the product of a and b

Thanks to Sam Trenholme for writing this explanation.

The Rijndael Key Schedule:

The Key Schedule is responsible for expanding a short key into a larger key, whose parts

are used during the different iterations. Each key size is expanded to a different size:

An 128 bit key is expanded to an 176 byte key.

An 192 bit key is expanded to an 208 byte key.

An 256 bit key is expanded to an 240 byte key.

There is a relation between the cipher key size, the number of rounds and the

ExpandedKey size. For an 128-bit key, there is one initial AddRoundKey operation plus

there are 10 rounds and each round needs a new 16 byte key, therefor we require 10+1

RoundKeys of 16 byte, which equals 176 byte. The same logic can be applied to the twoAdvanced Encryption Standard (AES) http://www.progressive-coding.com/tutorial.php?id=0&print=1

8 of 25 5/14/2007 10:27 PM

other cipher key sizes. The general formula is that:

ExpandedKeySize = (nbrRounds+1) * BlockSize

The Key Schedule is made up of iterations of the Key schedule core, which works on 4-byte

words. The core uses a certain number of operations, which are explained here:

Rotate:

The 4-byte word is cyclically shifted 1 byte to the left:

------

| 1d | 2c | 3a | 4f | -> | 2c | 3a | 4f | 1d |

------

Rcon:

This section is again extremely mathematical and I recommend everyone who is interested

to read this description. Just note that the Rcon values can be pre-calculated, which

results in a simple substitution (a table lookup) in a fixed Rcon table (again, Rcon can also

be calculated on-the-fly if memory is a design constraint.)

S-Box:

The Key Schedule uses the same S-Box substitution as the main algorithm body.

The Key Schedule Core:

Now that we know what the operations are, let me show you the key schedule core (in

pseudo-C):

keyScheduleCore(word)

}

Rotate(word);

SBoxSubstitution(word);

word[0] = word[0] XOR RCON[i];

{

In the above code, word has a size of 4 bytes and i is the iteration counter from the Key

Schedule

The Key Expansion:

First, let me show you the keyExpansion function as you can find it in the Rijndael

documentation (there are 2 version, one for key size 128, 192 and one for key size 256):

KeyExpansion(byte Key[4*Nk] word W[Nb*(Nr+1)])

}

for(i = 0; i < Nk; i++)Advanced Encryption Standard (AES) http://www.progressive-coding.com/tutorial.php?id=0&print=1

9 of 25 5/14/2007 10:27 PM

W[i] = (Key[4*i],Key[4*i+1],Key[4*i+2],Key[4*i+3]);

for(i = Nk; i < Nb * (Nr + 1); i++)

}

temp = W[i - 1];

if (i % Nk == 0)

temp = SubByte(RotByte(temp)) ^ Rcon[i / Nk];

W[i] = W[i - Nk] ^ temp;

{

{

Nk is the number of columns in the cipher key (128-bit -> 4, 192-bit -> 5, 256-bit ->

6)

W is of type "word", which is 4-bytes

Let me try to explain this in an easier understandable way:

The first n bytes of the expanded key are simply the cipher key (n = the size of the

encryption key)

The rcon value i is set to 1

Until we have enough bytes of expanded key, we do the following to generate n more

bytes of expanded key (please note once again that "n" is used here, this varies

depending on the key size)

we do the following to generate four bytes

we use a temporary 4-byte word called t

we assign the previous 4 bytes to t

we perform the key schedule core on t, with i as rcon value

we increment i

we XOR t with the 4-byte word n bytes before in the expandedKey (where

n is once either either 16,24 or 32 bytes)

1.

we do the following x times to generate the next x*4 bytes of the expandedKey

(x = 3 for n=16,32 and x = 5 for n=24)

we assign the previous 4-byte word to t

we XOR t with the 4-byte word n bytes before in the expandedKey (where