COSC 2021: Computer Organization
Winter 2004
Solution to Assignment 3
Question 1The execution time for the five instructions is as follows:
Instruction class / Instruction memory / Adder for PC+4 / Register read / ALU operation / Adder for branch / Data memory / Register write / TotalR-format / 3 / X / 1.5 / 2 / 0 / 0 / 1.5
Load word / 3 / X / 1.5 / 2 / 0 / 3 / 1.5
Store word / 3 / X / 1.5 / 2 / 0 / 3
Branch / 3 / X / 1.5 / 2 / Y
Jump / 3 / X
Recall that addition of (PC + 4) is executed in parallel with step 1 (instruction memory access), so the maximum of the duration required to compute (PC + 4) and instruction memory access is included in an instruction. Similarly, computation of branch target address is performed in parallel with steps 2 and 3 (register access and ALU operation). The maximum between the time required for register access and ALU operation versus the time required to calculate the branch target address is therefore included.
A.X = 3 and Y = 3
Instruction class / Instruction memory / Adder for PC+4 / Register read / ALU operation / Adder for branch / Data memory / Register write / TotalR-format / 3 / 3 / 1.5 / 2 / 0 / 0 / 1.5 / 8
Load word / 3 / 3 / 1.5 / 2 / 0 / 3 / 1.5 / 11
Store word / 3 / 3 / 1.5 / 2 / 0 / 3 / 9.5
Branch / 3 / 3 / 1.5 / 2 / 3 / 6.5
Jump / 3 / 3 / 3
B.X = 5 and Y = 5
Instruction class / Instruction memory / Adder for PC+4 / Register read / ALU operation / Adder for branch / Data memory / Register write / TotalR-format / 3 / 5 / 1.5 / 2 / 0 / 0 / 1.5 / 10
Load word / 3 / 5 / 1.5 / 2 / 0 / 3 / 1.5 / 13
Store word / 3 / 5 / 1.5 / 2 / 0 / 3 / 11.5
Branch / 3 / 5 / 1.5 / 2 / 5 / 10
Jump / 3 / 5 / 5
C.X = 1 and Y = 8
Instruction class / Instruction memory / Adder for PC+4 / Register read / ALU operation / Adder for branch / Data memory / Register write / TotalR-format / 3 / 1 / 1.5 / 2 / 0 / 0 / 1.5 / 8
Load word / 3 / 1 / 1.5 / 2 / 0 / 3 / 1.5 / 11
Store word / 3 / 1 / 1.5 / 2 / 0 / 3 / 9.5
Branch / 3 / 1 / 1.5 / 2 / 8 / 11
Jump / 3 / 1 / 3
In the single cycle implementation, the clock cycle must have the same length for every instruction. Since the clock cycle is determined by the longest path in the machine, it is inefficient as many instructions can be completed in a shorter duration.
Question 2
A. Thedatapath for addi instruction is highlighted in the following figure.The buses shown in a lighter shade are not used. The values of the control signals are tabulated in part (C).
B.Fig. 5.14 from the text is modified to incorporate theaddi instruction
Opcode / ALUOp / Instruction Operation / Funct field / Desired
ALU action / ALU control
Input
LW / 00 / load word / xxxxxx / add / 010
SW / 00 / store word / xxxxxx / add / 010
Branch equal / 01 / branch equal / xxxxxx / subtract / 110
R-type / 10 / add / 100000 / add / 010
R-type / 10 / subtract / 100010 / subtract / 110
R-type / 10 / AND / 100100 / and / 000
R-type / 10 / OR / 100101 / or / 001
R-type / 10 / set on less than / 101010 / set on less than / 111
addi / 00 / add immediate / xxxxxx / add / 010
C.Fig. 5.27 from the text is modified to incorporate theaddi instruction
Input or Output / Signal name / R-format / lw / sw / beq / addiInputs / Op5 / 0 / 1 / 1 / 0 / 0
Op4 / 0 / 0 / 0 / 0 / 0
Op3 / 0 / 0 / 1 / 0 / 1
Op2 / 0 / 0 / 0 / 1 / 0
Op1 / 0 / 1 / 1 / 0 / 0
Op0 / 0 / 1 / 1 / 0 / 0
Outputs / RegDst / 1 / 0 / X / X / 0
ALUSrc / 0 / 1 / 1 / 0 / 1
MemtoReg / 0 / 1 / X / X / 0
RegWrite / 1 / 1 / 0 / 0 / 1
MemRead / 0 / 1 / 0 / 0 / 0
MemWrite / 0 / 0 / 1 / 0 / 0
Branch / 0 / 0 / 0 / 1 / 0
ALUOp1 / 1 / 0 / 0 / 0 / 0
ALUOp0 / 0 / 0 / 0 / 1 / 0
Question 3
A.
We have to modify the existing datapath so that PC+4 can be written to register 31 (corresponding to $ra) while PC is written with the new jump address. To accomplish this, we expand
- The multiplexor for RegDst to include 31 as an additional input.With three inputs to the multiplexor, control RegDst should be expanded to2 bits. To select 31 as the input, we assume that RegDst = 10.
- The multiplexor for MemtoReg to include PC + 4 as an additional input. With three inputs to the multiplexor, control MemtoReg is expanded to 2 bits. To select PC + 4, we set MemtoReg = 10.
The execution steps for the jal instruction are:
Step 1:Instruction Fetch: remains unchanged
Step 2Instruction Decode and Register Fetch: remains unchanged
Step 3Execution, Memory Address Computation, Branch, Jump or Jump and Link Completion:
jal:Reg[31] = PC + 4;
PC = PC[31-28] || ( IR[25-0] < 2 );
The datapath for the jal instruction is shown using the bold lines in figure 1.
B.
A new state 10 is added from state 1, such that state 1 -> state 10 -> state 0
The control values for this new state are PCWrite, PCSource = 10, RegDst = 10, MemtoReg = 10, and RegWrite. Since RegDst and MemtoReg are now 2 bits, their corresponding values in the other states have to change as well to reflect this enhancement. We show the changes in fig. 2using shaded blocks.
C.
Based on our result in part (b), the jal instruction will take 3 cycles (state 0 → state 1 → state 10 → state 0).