CPE 626 - Advanced VLSI Systems :: Midterm Exam
Assigned: February 28, 2002
Due: March 7, 2002 // 19:00
Name:______
1. (60) Design a high-speed integer multiplier to support the following ARM multiply instructions:
MUL – multiply (32-bit result),
MLA – multiply accumulate (32-bit result),
UMULL – unsigned multiply long (64-bit result),
UMLAL – unsigned multiply-accumulate long (64-bit result),
SMULL – signed multiply long (64-bit result),
SMLAL – signed multiply-accumulate long (64-bit result).
The multiplier should perform multiplication over multiple clock cycles, using a carry-save array with four layers of adders (see lecture slides) and modified Booth encoding.
A. (20) Draw a block schematic of the proposed multiplier datapath and control unit.
B. (40) Write a VHDL structural model of the multiplier and corresponding testbench for testing.
Notes:
(0) Turn in the block schematic for part (A), and the source code for the multiplier and testbench together with the test inputs and outputs for part (B).
(1) Bonus marks (max 10) will be awarded if you support an early termination mechanism.
(2) Other verifiable improvements and/or contributions will be considered for a bonus award (e.g., post-layout simulation).
(3) Other types of multipliers are acceptable (e.g., simple array multiplier, shift-and-add multiplier, ...). However, the maximum number of marks in such cases will be 40.
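A software reference model can be useful for producing expected outputs for the testbench in part (B). The following is a minimal sketch in C++ (not VHDL) of radix-4 modified Booth recoding for a signed 32-bit multiplier, cross-checked against the built-in multiply. The names booth_recode and booth_multiply are hypothetical and do not come from the lecture slides. The sketch covers only the signed case; unsigned operands would need an additional recoding digit, which is omitted here.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Recode a signed 32-bit multiplier into 16 radix-4 Booth digits,
// each in {-2, -1, 0, +1, +2}. Digit i is formed from bits
// (2i+1, 2i, 2i-1), with bit -1 taken as 0.
std::array<int, 16> booth_recode(int32_t multiplier) {
    std::array<int, 16> digits{};
    const uint32_t m = static_cast<uint32_t>(multiplier);
    int prev = 0;  // bit 2i-1 of the multiplier
    for (int i = 0; i < 16; ++i) {
        const int b0 = static_cast<int>((m >> (2 * i)) & 1u);
        const int b1 = static_cast<int>((m >> (2 * i + 1)) & 1u);
        digits[i] = -2 * b1 + b0 + prev;
        prev = b1;
    }
    return digits;
}

// Signed 32x32 -> 64-bit product formed by summing the Booth partial
// products digit[i] * a * 4^i. A hardware datapath would sum these in
// carry-save form; here an ordinary 64-bit accumulator is used.
int64_t booth_multiply(int32_t a, int32_t b) {
    const std::array<int, 16> digits = booth_recode(b);
    int64_t acc = 0;
    for (int i = 0; i < 16; ++i) {
        const int64_t weight = int64_t{1} << (2 * i);
        acc += static_cast<int64_t>(digits[i]) * a * weight;
    }
    // Cross-check against the built-in 64-bit multiply.
    assert(acc == static_cast<int64_t>(a) * b);
    return acc;
}
```

Values returned by booth_multiply could be written to a file and compared against the simulated SMULL results; the accumulate and 32-bit variants would need their own wrappers.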
2. (20) ARM Instruction Set Architecture & Organization.
A. (10) Explain the rationale behind the instructions for multiple register data transfers. Discuss the implementation of these instructions in ARM processors. Give examples.
B. (10) Explain conditional execution. Describe how it is supported in the ARM instruction set. Discuss the advantages and disadvantages of conditional execution.
3. (20) Cache Memories.
A. (5 marks) Draw a block structure of the following data cache memory: word is 32 bits, addressable unit is a byte, 2-way set-associative, 8-word blocks, and the cache size is 8KB. Replacement policy is LRU, the write policy is write-back, and on write miss the block is loaded into the cache.
B. (5 marks) What are two basic options on a write-miss? Discuss advantages and disadvantages for each of them (implementation complexity, effectiveness, ...).
C. (10 marks) Estimate the total number of data cache misses, as well as the number of data cache misses of each type (cold, conflict, capacity), during the execution of the following loop:
...
for (cnt = 0; cnt < 100; cnt++) {
    mult_vectors(1024, a, b, c);
}
void mult_vectors(int size, int*& a, int*& b, int*& c)
{
    for (int i = 0; i < size; i++)
        c[i] = a[i] * b[i];
}
Assume the cache parameters from part A. Each vector has 1024 elements; the starting address of vector a is 0000 0000h, of vector b is 0001 0000h, and of vector c is 0002 0000h.
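When reasoning about which accesses compete for the same set, it can help to split an address into its tag, set index, and byte offset. The sketch below does this arithmetic for a set-associative cache of a given size, associativity, and block size. The CacheGeometry and AddressFields types and the split_address function are hypothetical names introduced only for illustration, and the sketch assumes 32-bit byte addresses.

```cpp
#include <cstdint>

// Basic parameters of a set-associative cache (all sizes in bytes).
struct CacheGeometry {
    uint32_t cache_bytes;  // total data capacity
    uint32_t ways;         // associativity
    uint32_t block_bytes;  // block (line) size

    // Number of sets = capacity / (ways * block size).
    uint32_t sets() const { return cache_bytes / (ways * block_bytes); }
};

// The three fields of a byte address as seen by the cache.
struct AddressFields {
    uint32_t tag;
    uint32_t set_index;
    uint32_t byte_offset;
};

// Split a 32-bit byte address into tag, set index, and byte offset.
AddressFields split_address(const CacheGeometry& g, uint32_t addr) {
    AddressFields f;
    f.byte_offset = addr % g.block_bytes;
    f.set_index = (addr / g.block_bytes) % g.sets();
    f.tag = addr / (g.block_bytes * g.sets());
    return f;
}
```

For the cache in part A the geometry would be CacheGeometry{8192, 2, 32}; calling split_address on the three base addresses shows the set in which each vector starts.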