Section 1

2. (See Mano, pg 208, problem 6-2)

The following is a list of instructions in hexadecimal code with their corresponding memory locations. PC is 100 at the start of the program.

When this program is run, what are the locations and symbolic instructions that are executed, and what is the value in AC when the program halts?

100: 2105
101: 7010
102: 7200
103: 3106
104: 7001
105: CAB4
106: 0

Location / Instruction
(Symbolic Op codes) / comment / AC
100 / LDA 105 / load AC with the data in memory 105
AC <-- M[105] / M[105]
101 / SPA / Skip if AC is positive
If AC(15) = 0 then PC <-- PC+1 / M[105]
102 / CMA / Complement AC /
103 / STA 106 / Store AC data to memory at 106.
M[106] <-- AC /
104 / HLT / Halt the computer /
105 / BUN AND STA BUN
106 / 0000 / Store sum here

The value in AC when the program halts:

1. Addressing Modes. Create an example similar to the one Mano presents on pg 265, Figure 8-7, and the associated Table 8-4 on pg 266, using your own numbers.

Addressing Mode / Effective address / Value of the operand loaded into the AC
a) / Direct Address / PC+1=185+1=[186] / 833
b) / Immediate Address / PC+1=185+1=186 / 300
c) / Indirect Address / PC+1=185+1=[[186]] / 756
d) / Relative Address / 185+2+[185+1]=187+186=30D / 256
e) / Indexed Address / 90+[185+1]=90+300=390 / 726
f) / Register / R1=216 / 216
g) / Register Indirect / [R1]=[216] / 465
h) / Autoincrement / [R1]=[216] / 465
i) / Autodecrement / [R1-1]=[215] / 370
Registers / Memory
PC = 185
R1 = 216
XR = 90
AC
/ Address / Memory
185 / Load to AC | Mode
186 / Address = 300
212 / Next Instruction
215 / 370
216 / 465
300 / 756
30D / 256
390 / 726

You may find the following table helpful:

2. RPN practice. Fill in the following table. As before, do enough that you feel you have the hang of it.

Infix notation / RPN notation / Tree notation / Value
15 - 10 / 15 10 - / / 5
(55 - 22) / 11 / 55 22 - 11 / / / 3
55 - 22 / 11 / 55 22 11 / - / / 53
(14 + 13*2)/(3+5) / 14 13 2 * + 3 5 + / / / 5
(42 – (13 + (36 / (15 - 16 + 17)))) / 42 13 36 15 16 17 + - / + - / / 31
((42+13) - (36 + (15 /(16 – 17)))) / 42 13 + 36 15 16 17 - / + - / / 34

You may find the following applet helpful:

3. Instruction formats. Fill in some of the cells in the following table.

Statement / 3 address / 2 address / 1 address / 0 address
x = a - b / SUB x,a,b / SUB a,b
STA x, a / PUSH A
SUB b
POP x / PUSH A
PUSH B
SUB
POP X
x = (a - b) / c / SUB x, a, b
DIV x, x, c / SUB a, b
DIV a, c
STA x, a / PUSH a
SUB b
DIV c
POP x / PUSH a
PUSH b
SUB
POP a
PUSH c
DIV
POP X
x = (a-b*c)/e / MULT b, b, c
SUB a, a, b
DIV x, a, e / MULT b, c
SUB a, b
DIV a, e
STA x, a / PUSH b
MULT c
POP b
PUSH a
SUB b
DIV e
POP x / PUSH b
PUSH c
MULT
POP b
PUSH a
PUSH b
SUB
POP a
PUSH a
PUSH e
DIV
POP x
x = (a*b-c*d)/(e-f) / MULT a, a, b
MULT c, c, d
SUB a, a, c
SUB e, e, f
DIV x, a, e / MULT a, b
MULT c, d
SUB x, c
SUB e, f
DIV x, e
STA x, a / PUSH a
MULT b
POP a
PUSH c
MULT d
POP c
PUSH a
SUB c
POP a
PUSH e
SUB f
POP e
PUSH a
SUB e
POP x / PUSH a
PUSH b
MULT
POP a
PUSH c
PUSH d
MULT
POP c
PUSH a
PUSH c
SUB
POP a
PUSH e
PUSH f
SUB
POP e
PUSH a
PUSH e
DIV
POP x

You may find the following page helpful:

1. (See Mano, pg 293, problem 8-12)

Write a program to evaluate the following arithmetic statement:

Assume that the computer has the following arithmetic instructions:

  • ADD
  • SUB
  • MULT
  • DIV

Using 3, 2, 1 and 0 address instructions -

  1. Using a general register computer with 3-address instructions.
  2. Using a general register computer with 2-address instructions.
  3. Using an accumulator type computer with 1-address instructions.
  4. Using a stack organized computer with 0-address instructions.

Hint: Converting the expression to RPN makes some of these programs easier.

3-address / 2-address / 1-address / 0-address
MULT D, D, E
SUB D, D, F
MULT C, C, D
SUB A, A, B
ADD A, A, C
MULT H, H, K
ADD G, G, H
DIV X, A, G / MULT D, E
SUB D, F
MULT C, D
SUB A, B
ADD A, C
MULT H, K
ADD G, H
DIV A, G
STA X, A / PUSH H
MULT K
ADD G
POP G
PUSH D
MULT E
SUB F
MULT C
ADD B
POP B
PUSH A
SUB B
POP A
DIV G
POP X / PUSH D
PUSH E
MULT
PUSH F
SUB
PUSH C
MULT
PUSH B
ADD
POP B
PUSH A
PUSH B
SUB
POP A
PUSH H
PUSH K
MULT
PUSH G
ADD
POP G
PUSH A
PUSH G
DIV
POP X

Conventions:

  1. 3-address:
    SUB R1,R2,R3 // R1 <-- R2 - R3
  2. 2-address
    SUB R1,R2 // R1 <-- R1 - R2
  3. 1-address
    SUB A // ACC <-- ACC - M[A]
  4. 0-address
    PUSH A // TOS <-- M[A]
    PUSH B // TOS <-- M[B]
    SUB // TOS <-- M[A] - M[B]

2. (See Mano, pp 265-266)

A two-word instruction is stored at location 350 with its address field at location 351 (all numbers in hexadecimal). The first word of the instruction specifies the operation code and mode; the second word specifies the address part.

The values of the program counter (PC register), a general register (R1), the index register (XR), the base register (BR), and certain addresses in memory are as shown below.

Evaluate the effective address and the value that is loaded into the AC if the addressing mode of the instruction is:

Addressing Mode / Effective address / Value of the operand loaded into the AC
a) / Direct Address / PC+1=350+1=[351] / 833
b) / Immediate Address / PC+1=350+1=351 / 400
c) / Indirect Address / PC+1=350+1=[[351]] / 900
d) / Relative Address / 350+2+[350+1]=352+400=752 / 189
e) / Indexed Address / 125+[350+1]=125+400=525 / 735
f) / Register / R1=834 / 834
g) / Register Indirect / [R1]=[834] / 950
h) / Autoincrement / [R1]=[834] / 950
i) / Autodecrement / [R1-1]=[833] / 900
Registers / Memory
PC = 350
R1 = 834
XR = 125
AC
/ Address / Memory
350 / Load to AC | Mode
351 / Address = 400
352 / Next Instruction
833 / 900
834 / 950
400 / 833
900 / 456
752 / 189
525 / 735

Section 2

  • Find a reference to a processor that implements an instruction pipeline and describe some of its key features. You should say how many segments are used and whether the processor is CISC or RISC, among other issues.

The ARM processor is a RISC processor which features a 5 stage, 4 segmentpipeline.

Pipeline description

Advanced processing of instruction fetch and branch prediction - unblocks branch resolution from

potential memory latency-induced instruction stalls.

Up to four instruction cache line prefetch-pending - further reduces the impact of memory latency

so as to maintain instruction delivery.

Between two and four instructions per cycle forwarded continuously into instruction decode -

ensures efficient superscalar pipeline utilization.

Fast-loop mode - provides low power operation while executing small loops.

Superscalar decoder - capable of decoding two full instructions per cycle.

Speculative execution of instructions - enabled by dynamic renaming of physical registers into an

available pool of virtual registers.

Increased pipeline utilization - removing data dependencies between adjacent instructions and

reducing interrupt latency.

Virtual renaming of registers - accelerating code through an effective hardware based unrolling of

loops without the additional costs in code size and power consumption.

Any of the four subsequent pipelines can select instructions from the issue queue - providing out of order dispatch further increasing pipeline utilization independent of developer or compiler (page 6)

Reference: The ARM Cortex-A9 Processors, Whitepaper

  • Be sure to include citations.
  • Determine the number of clock cycles that it takes to process 500 tasks in a 8-segment pipeline. See Mano, pg 329, 9-3.

k = 8 segments

n = 500 tasks

cycles = (k+n-1) = 8+500-1 = 507 cycles

  • A non-pipeline system takes 50 ns to process a task. The same task can be processed in a 8-segment pipeline with a clock cycle of 10 ns. Determine the speedup ratio of the pipeline for 100 tasks. What is the maximum speedup that can be achieved? See Mano pg 329, 9-4

Section 3

  • Pick a number between 16 and 128 - call this A \

A = 112

  • Pick another number between 5 and 10 - call this T

T = 8

  • The numbers you select should be different from those selected by other students.
  • Explain how a direct mapping cache would work with a computer using A bits for memory addressing and T bits for a direct mapping cache.

The A bit memory address will be divided into two fields, T bits for the index field and A-T bits for the tag field. Given that A = 112 and T = 8, the index field is 8 bits and the tag field is 104 bits. The direct mapping cache organization uses the full 112 bit address to access the main memory and the 8 bit index to access the cache.

  • How many words will the cache have?

28 = 256 words

  • Ram sizing:
  • How many 256x8 RAM chips are needed to provide a memory capacity of 16,384 bytes?

256x8 RAM chips have 256 bytes per chip so 16384/256 = 64 RAM chips

  • How many lines of the address bus must be used to access 16,384 bytes of memory? How many of these lines will be common to all the chips?

14 bits are necessary to specify each byte of memory (214 = 16384), each chip has 256 1 byte addresses, so 8 lines will be common to all chips (28 = 256).

  • How many lines must be decoded for chip select? Specify the size of the decoders.

6 bits are necessary to specify the specific chips (26 = 64)

  • A computer uses RAM chips of 1024x1 capacity
  • How many chips are needed, and how should their address lines be connected to provide a memory capacity of 4096 bytes?

1024x1 * 8 (to make each a byte) * 4 = 4096 bytes, so 8*4 = 32 chips

  • How many chips are needed to provide a memory capacity of 64K bytes? Explain in words how the chips are to be connected to the address bus.

4096 * 4 =16kB

32 chips (for 4096 bytes) * 4 = 128 chips to get 64kB

Each chip must be connected to the address bus to take the lines necessary to determine which memory location is being addressed. The chips are connected in groups of eight, so that each of these groups form one word. This means that the design is repeated eight fold for the chips, such that for a given memory location each bit of the word is stored (as the chips are of x1 construction).

  • How many tag bits are needed in a direct mapped cache of 1024 words in a computer with 1M words?

It requires 20 bits to address 1M words, so the tag size would be 1024-20 = 1004.