Final exam
Computer Systems Architecture
27 January 2012
6 questions, 100 points
Please show the details of your solution. If a question is unclear, state your assumptions.
Question 1: Given the following assembly code:
int func1(int x,int y){
pushl %ebp
movl %esp,%ebp
movl 8(%ebp), %eax
cmpl 12(%ebp), %eax
jl .L2
movl 8(%ebp), %eax
jmp .L3
.L2:
movl 12(%ebp), %eax
.L3:
popl %ebp
ret
}
Parameters x and y are stored at offsets 8 and 12, respectively, from the address in register %ebp. The function leaves its return value in register %eax. Write the original C function. (20 points)
Question 2: For the following code, what is the final value of register %eax? (20 points)
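As a rough illustration of how such a reconstruction can be approached, the sketch below maps each branch of the assembly to a C statement. It assumes the path that does not take the jl branch continues at .L3 rather than running through .L2; under that reading, %eax holds y when x < y and x otherwise.

```c
/* Sketch of one possible reconstruction (assumption: the non-branching
 * path reaches .L3 directly). Returns the larger of x and y. */
int func1(int x, int y)
{
    if (x < y)        /* cmpl 12(%ebp), %eax ; jl .L2   */
        return y;     /* .L2: movl 12(%ebp), %eax       */
    return x;         /* movl 8(%ebp), %eax             */
}
```

Note that in AT&T syntax `cmpl 12(%ebp), %eax` followed by `jl` branches when %eax (x) is less than the memory operand (y).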
pushl %ebp
movl %esp, %ebp
movl 12(%ebp), %eax
movl 8(%ebp), %edx
movl %eax, %ecx
sall %cl, %edx
movl 16(%ebp), %eax
movl (%eax), %eax
leal (%edx,%eax), %eax
popl %ebp
ret
Assume that the registers hold the following values before this code starts:
%ebp 0xbffff398
%esp 0xbffff36c
and that memory holds the following values:
0xbffff370: 0x00000064
0xbffff374: 0x00000005
0xbffff378: 0xbffff384
0xbffff37c: 0xbffff380
0xbffff380: 0x00000009
0xbffff384: 0x0000000a
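One way to check a hand trace of the code above is to replay the instructions one at a time in C. The sketch below is only an illustration: `load32` and `trace_question2` are hypothetical helper names, the memory table contains just the six words listed above, and any address outside that table is assumed to read as zero.

```c
#include <stdint.h>

/* Memory contents given in the question, as (address, value) pairs. */
static const uint32_t mem_addr[] = {0xbffff370u, 0xbffff374u, 0xbffff378u,
                                    0xbffff37cu, 0xbffff380u, 0xbffff384u};
static const uint32_t mem_val[]  = {0x00000064u, 0x00000005u, 0xbffff384u,
                                    0xbffff380u, 0x00000009u, 0x0000000au};

/* Hypothetical helper: read one 32-bit word from the table above. */
static uint32_t load32(uint32_t addr)
{
    for (int i = 0; i < 6; i++)
        if (mem_addr[i] == addr)
            return mem_val[i];
    return 0; /* assumption: unlisted addresses read as zero */
}

/* Replay the instructions in order and return the final %eax. */
uint32_t trace_question2(void)
{
    uint32_t esp = 0xbffff36cu, ebp, eax, edx, ecx;

    esp -= 4;                 /* pushl %ebp (saved value is not read again) */
    ebp = esp;                /* movl %esp, %ebp                            */
    eax = load32(ebp + 12);   /* movl 12(%ebp), %eax                        */
    edx = load32(ebp + 8);    /* movl 8(%ebp), %edx                         */
    ecx = eax;                /* movl %eax, %ecx                            */
    edx <<= (ecx & 31);       /* sall %cl, %edx (count uses the low 5 bits) */
    eax = load32(ebp + 16);   /* movl 16(%ebp), %eax                        */
    eax = load32(eax);        /* movl (%eax), %eax                          */
    eax = edx + eax;          /* leal (%edx,%eax), %eax                     */
    return eax;
}
```

The key step is that `pushl %ebp` lowers %esp by 4 before it is copied into %ebp, so all offsets are relative to 0xbffff368 rather than to the initial %ebp value.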
Question 3: Assume that for each function call the compiler grows the stack by only the size of the callee's local variables plus the return address. (10 points)
For the following functions, what does the stack look like when the current frame is in function c, given that register %esp holds 0x8000 just before the call to function a?
double c(int x, int *y, char w){
double z;
z = x + *y + w; /* you are in this frame */
return z;
}
double b(int x, int *y, char w, char z){
return c(x, y, w) + z;
}
double a(int x, int *y){
char w, z;
z = 3;
w = 4;
return b(x, y, w, z);
}
int main(){
int x, y;
double z;
x = 1; y = 2;
z = a(x, &y);
return 0;
}
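The stack growth described in the question can be sketched as simple pointer arithmetic. The block below is an illustration under several assumptions not stated in the question: a 4-byte return address, 1-byte char, 8-byte double, no alignment padding, a downward-growing stack, and the return address pushed before the locals are reserved. The names `enter` and `esp_inside_c` are hypothetical.

```c
#include <stdint.h>

#define RET_ADDR_SIZE 4u  /* assumed 32-bit return address */

/* Stack pointer after entering a function: the call pushes the return
 * address, then the callee reserves space for its local variables. */
static uint32_t enter(uint32_t esp, uint32_t locals_size)
{
    return esp - RET_ADDR_SIZE - locals_size;
}

/* Returns %esp while executing inside c and reports the intermediate
 * values reached inside a and b through the two output pointers. */
uint32_t esp_inside_c(uint32_t esp_before_call_a,
                      uint32_t *esp_in_a, uint32_t *esp_in_b)
{
    *esp_in_a = enter(esp_before_call_a, 2u); /* a: char w, z (2 bytes) */
    *esp_in_b = enter(*esp_in_a, 0u);         /* b: no local variables  */
    return enter(*esp_in_b, 8u);              /* c: double z (8 bytes)  */
}
```

A real compiler would normally also save %ebp and pad for alignment; those are deliberately left out here because the question excludes them.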
Question 4: Assume a 5-stage microprocessor (Fetch, Decode, Execute, Memory, and Writeback). Calculate the total number of cycles needed to execute the instructions below on:
1) A system without pipelining
2) A pipelined system
This is an in-order machine with a single ALU that performs all arithmetic operations. A multiply instruction takes 3 cycles (the multiplier is pipelined), a divide instruction takes 5 cycles (the divider is not pipelined), and every other instruction takes 1 cycle. (20 points)
- MUL R1, R2, R3
- DIV R4, R1, R6
- LD R9, [R4]
- MUL R2, R1, R3
- LD R3, [R2]
- MUL R4, R3, R6
- LD R5, [R4]
- MUL R7, R8, R9
- DIV R4, R10, R4
- ADD R8, R9, R11
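For the non-pipelined case only, the count can be sketched as a simple sum. The block below assumes that the stated latencies apply to the Execute stage, that each of the other four stages takes one cycle, and that no instruction starts before the previous one has finished; a different reading of "takes 3 cycles" would give a different total. The pipelined case depends on forwarding and stall assumptions and is not modelled here. All names are hypothetical.

```c
/* Instruction classes that appear in the question. */
enum op { OP_MUL, OP_DIV, OP_LD, OP_ADD };

/* The ten instructions of Question 4, in program order. */
static const enum op question4_program[] = {
    OP_MUL, OP_DIV, OP_LD, OP_MUL, OP_LD,
    OP_MUL, OP_LD, OP_MUL, OP_DIV, OP_ADD
};

/* Execute-stage latency per instruction class (assumption: the cycle
 * counts in the question refer to the Execute stage only). */
static int execute_cycles(enum op o)
{
    switch (o) {
    case OP_MUL: return 3;
    case OP_DIV: return 5;
    default:     return 1;
    }
}

/* Without pipelining, each instruction runs Fetch, Decode, Memory and
 * Writeback for one cycle each plus its Execute latency, back to back. */
int unpipelined_cycles(const enum op *prog, int n)
{
    int total = 0;
    for (int i = 0; i < n; i++)
        total += 4 + execute_cycles(prog[i]);
    return total;
}
```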
Question 5: Given the following load pattern, calculate the total number of cache misses for a 2-set, 2-way set-associative cache with a block size of 16 bytes (memory is byte-addressable). The cache uses true LRU replacement. (20 points)
0xAA00
0xAA04
0xAA10
0xAA20
0xAA30
0xAA14
0xAA05
0xAA16
0xAA40
0xAA50
0xAA06
0xAA17
0xAA67
0xAA78
0xAA09
0xAA80
0xAA7A
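A hand count for this load pattern can be cross-checked with a small simulator. The sketch below assumes the cache starts empty, that the set index is the lowest bit of the block number (address divided by 16), and that LRU order is tracked with a per-way timestamp; `count_cache_misses` and `question5_addrs` are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

#define NUM_SETS   2
#define NUM_WAYS   2
#define BLOCK_BITS 4   /* 16-byte blocks */

/* The load pattern from the question, in order. */
static const uint32_t question5_addrs[] = {
    0xAA00, 0xAA04, 0xAA10, 0xAA20, 0xAA30, 0xAA14, 0xAA05, 0xAA16, 0xAA40,
    0xAA50, 0xAA06, 0xAA17, 0xAA67, 0xAA78, 0xAA09, 0xAA80, 0xAA7A
};

/* Count misses for a set-associative cache with true LRU replacement. */
int count_cache_misses(const uint32_t *addrs, size_t n)
{
    uint32_t block[NUM_SETS][NUM_WAYS] = {{0}};
    size_t last_used[NUM_SETS][NUM_WAYS] = {{0}}; /* 0 marks an empty way */
    int misses = 0;

    for (size_t t = 0; t < n; t++) {
        uint32_t blk = addrs[t] >> BLOCK_BITS;   /* drop the block offset */
        size_t set = blk % NUM_SETS;
        int way = -1;

        for (int w = 0; w < NUM_WAYS; w++)
            if (last_used[set][w] != 0 && block[set][w] == blk)
                way = w;                         /* hit */

        if (way < 0) {
            misses++;
            way = 0;                             /* pick empty or LRU way */
            for (int w = 1; w < NUM_WAYS; w++)
                if (last_used[set][w] < last_used[set][way])
                    way = w;
            block[set][way] = blk;
        }
        last_used[set][way] = t + 1;             /* mark most recently used */
    }
    return misses;
}
```

Storing the full block number in each way, rather than a separate tag, keeps the comparison simple; the two are equivalent because the set index is derived from the same block number.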
Question 6: Consider the following page-reference addresses:
0xAAA00
0xAAB00
0xA0A00
0xA1B00
0xA2A00
0xA3A00
0xAAC00
0xA4A00
0xA5A00
0xA1FFF
0xA2AFF
0xA3BFF
0xA4A00
0xA6000
0xA7000
How many page faults would occur under a true-LRU replacement algorithm, assuming six frames, a page size of 4 KB, and a cache size of 32 bytes? (10 points)
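The page-fault count can be cross-checked the same way as a cache trace. The sketch below assumes all six frames start empty and that the page number is the address shifted right by 12 bits (4 KB pages); the 32-byte cache size plays no role in this model. `count_page_faults` and `question6_addrs` are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

#define NUM_FRAMES 6
#define PAGE_BITS  12  /* 4 KB pages */

/* The reference addresses from the question, in order. */
static const uint32_t question6_addrs[] = {
    0xAAA00, 0xAAB00, 0xA0A00, 0xA1B00, 0xA2A00, 0xA3A00, 0xAAC00, 0xA4A00,
    0xA5A00, 0xA1FFF, 0xA2AFF, 0xA3BFF, 0xA4A00, 0xA6000, 0xA7000
};

/* Count page faults with true LRU replacement over NUM_FRAMES frames. */
int count_page_faults(const uint32_t *addrs, size_t n)
{
    uint32_t page[NUM_FRAMES] = {0};
    size_t last_used[NUM_FRAMES] = {0};  /* 0 marks an empty frame */
    int faults = 0;

    for (size_t t = 0; t < n; t++) {
        uint32_t p = addrs[t] >> PAGE_BITS;  /* drop the page offset */
        int slot = -1;

        for (int f = 0; f < NUM_FRAMES; f++)
            if (last_used[f] != 0 && page[f] == p)
                slot = f;                    /* page already resident */

        if (slot < 0) {
            faults++;
            slot = 0;                        /* pick empty or LRU frame */
            for (int f = 1; f < NUM_FRAMES; f++)
                if (last_used[f] < last_used[slot])
                    slot = f;
            page[slot] = p;
        }
        last_used[slot] = t + 1;             /* mark most recently used */
    }
    return faults;
}
```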