ECE 4480/5480 Computer Architecture and Design

spring

Exam II

Name: ______Solution______(print)


1.(25%) Caches are important to providing a high performance memory hierarchy to processors. Below is a list of 32-bit memory address references, given as word addresses.

6, 214, 175, 214, 6, 84, 65, 174, 105, 85, 215.

(a) Given a direct-mapped cache with two-word blocks and a total size of eight blocks’

Identify the tag, and the index of each reference and also list if each reference is a hit or a miss and the final cache data assuming the cache is initially empty.

The following table is for your convenient adding more entries as needed

Ans: total cache size = 8 blocks and using 2-word blocks,

1 bit for word offset, 3 bits for block selection, i.e., 3 bits for index, the rest is Tag bits

Address (word) / Cache Tag / Index / Cache Data / H/M
6
0x00000000000110 / 0x0000000000 / 011 / M(6) M(7) / M
214
0x00000011010110 / 0x0000001101 / 011 / M(6) M(7) replaced by M(214) M(215) / M
175
0x00000010101111 / 0x0000001010 / 111 / M(174) M(175) / M
214
0x00000011010110 / 0x0000001101 / 011 / M(214) M(215) / H
6
0x00000000000110 / 0x0000000000 / 011 / M(214) M(215) replaced by M(6) M(7) / M
84
0x00000001010100 / 0x0000000101 / 010 / M(84) M(85) / M
65
0x00000001000001 / 0x0000000100 / 000 / M(64) M(65) / M
174
0x00000010101110 / 0x0000001010 / 111 / M(174) M(175) / H
105
0x00000001101001 / 0x0000000110 / 100 / M(104) M(105) / M
85
0x00000001010101 / 0x0000000101 / 010 / M(84) M(85) / H
215
0x00000011010111 / 0x0000001101 / 011 / M(6) M(7) replaced by
M(214) M(215) / M


2. (25%) A computer system has 96 bytes of total capacity in the cache data, not including cache tag, 1 word = 4 bytes, LRU replacement policy is used and the cache starts empty. For a three-way set associative cache with two-word blocks, assuming the following sequence of byte addresses that are issued to access the cache.

0, 12, 4, 8, 15, 1, 4, 9, 32, 132, 116, 112, 40, 64, 113, 84, 20, 64,

(a) find the hit rate. Show all your work

96 bytes à 24 words, two-word blocks, 12 blocks, 3-way set associative, 4 sets

2 bits byte offset; 1 bit for word offset, 2 bits for set index, the rest of bits are tags

M(0): word 0 content, i.e. byte0 byte1 byte2 byte3, b1b0=00 01 10 11

i.e., M(0) has content of byte address 0 – 3; M(4) has content of byte address 4-7; etc.

Address in bytes / Set index / Way0 / Way1 / Way2 / H/M
0x00000000
0 / 00 / TAG: 0x000 data:M(0)M(4) / M
0x00001100
12 / 01 / TAG:0x000
data:M(8)M(12) / M
0x00000100 / 00 / H
0x00001000 / 01 / H
0x00001111 / 01 / H
0x00000001 / 00 / H
0x00000100 / 00 / H
0x00001001
9 / 01 / TAG:0x000
data:M(8)M(12) / H
0x00100000
32 / 00 / TAG: 0x000 data:M(0)M(4) / TAG:0x001
Data:M(32)M(36) / M
0x10000100
132 / 00 / TAG: 0x000 data:M(0)M(4) / TAG:0x001
Data:M(32)M(36) / TAG:0x100
Data:
M(128)M(132) / M
0x01110100
116 / 10 / TAG:0x011
Data:
M(112)M(116) / M
0x01110000
112 / 10 / H
0x00101000
40 / 01 / TAG:0x000
data:M(8)M(12) / TAG:0x001
Data:M(40)M(44) / M
0x01000000
64 / 00 / Tag:0x010
Data:
M(64)M(68) / TAG:0x001
Data:M(32)M(36) / TAG:0x100
Data:
M(128)M(132) / M
0x01110001
113 / 10 / H
0x01010100
84 / 10 / TAG:0x011
Data:
M(112)M(116) / TAG:0x010
Data:M(80)M(84) / M
0x00010100
20 / 10 / TAG:0x011
Data:
M(112)M(116) / TAG:0x010
Data:M(80)M(84) / TAG:0x000
Data:M(16)M(20) / M
0x01000000
64 / 00 / H

(b) show the final content of cache tag and cache data. see the above table. hit rate=50%

3. (15%) A fully-associative cache using an LRU replacement policy always has a better hit rate than a direct-mapped cache with the same cache capacity.

This statement is true or false. Please give an example to justify your answer.

False

For example a direct-mapped cache has N one-word blocks

Block 0 has word 0

Block 1 has word 1

…..

Block N-1 has word N-1

A program repeatedly accesses locations from word 0 to word N, the direct-mapped cache will have a miss rate 1/N for repetition after the first repetition, i.e., after the cold start misses. The miss occurs when accessing word N which is not in the cache, replacing word 0 by word N. The next access is word 1, in the cache, hit, etc.

A fully-associative cache with the same cache size, i.e., N one-word blocks using LRU

A program repeatedly accesses location from word 0 to word N,

First N memory accesses will have misses (cold start)

Memory access sequence: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, …. N-1

Block 0 has word 0

Block 1 has word 1

….

Block N-1 has word N-1

Now Block 0 is the LRU block, and Block 0 will be replaced by the next location causing miss.

Memory access sequence: N, 0, 1, 2 ,3, 4

Accessing word N will cause replacing word 0 by word N

Accessing word 0 will cause replacing word 1 by word 0

…….

-à Hit Rate = 0.

4. (a) (20%) Consider the following program

integer A[1000]

for ( i=0; i < 1000; i +=1)

{ A[i] = A[i] + i}

Run the above program on a processor with a 1K byte direct-mapped write-back cache with 4-word cache blocks. What is the approximate data cache miss rate?

assuming integers are 1 word long and a word has 4 bytes.

Each block has 4 words and each word has 4 bytes. 1K bytes à 1024/16 = 64 blocks

Block 0 has word 0 – word 3, i.e., A[0] – A[3]

Block 1 has word 4 – word 7, i.e., A[4] – A[7]

……

Block 63 has word 996 – word 999, i.e., A[996] – A[999]

Each miss brings the whole block to the cache, the next 3 memory accesses will be hit.

Altogether there are 250 misses (caused by accessing A[0], A[4], A[8], … , A[996])

The program reads 1000 times and writes 1000 times. Miss rate = 250/2000 = 12.5%

4. (b) (5%) For the same problem in 4. (a), if change the cache write policy to write through cache, will the cache miss rate increase or decrease? Explain

Cache miss rate: no change. Write through policy will not change the cache miss rate because write through policy does not alter cache data in any way to change the cache miss rate.

5. (10%) Given the following information: 32-bit virtual memory address, 8-KB pages, 64-MB physical memory, 128 entry direct-mapped TLB with one physical page number per entry, each TLB has a valid bit and a dirty bit. Determine the TLB entry size in bits if you only use the minimum necessary physical address size.

(Show all your work)

64MB physical memory ---à needs 26 bits for physical memory address.

Each page has 8KB ---à 13 bits, i.e., page offset for 8KB page is 13 bits.

26 – 13 = 13 bits ---à physical page number is 13 bits wide.

Virtual memory address has 32 bits, 32 – 13 = 19 bits (All 19 bits will be used as virtual page number if TLB is fully associative)

Now TLB is 128 entry direct-mapped TLB. It needs 7 bits for TLB indexing.

19 – 7 = 12 Tag bits, or say TLB Tag.

Each TLB entry will have 12 bits TLB tag + 13 bits physical page number = 25 bit.

Ans: 25 bits excluding valid bit, dirty bit.