A look at Intel Processors from the 4004 to the Pentium Pro

CS-350-2: Computer Organization and Architecture

Spring 2004

David Lenhardt

Marcus O'Malley

Christopher Payne

Jonathan Taylor

Table Of Contents

Introduction

Prelude to the 4004 Microprocessor

The 4004 Microprocessor

The 8008 Microprocessor

The 8080 and 8085 Microprocessors

The 8086 and 8088 Microprocessors

80286 Microprocessor

80386 Microprocessor

80486 Microprocessor

Pentium Microprocessor

Pentium Pro Microprocessor

Summary

Bibliography

Introduction

Intel's microprocessors have evolved greatly since the development of the 4004. From the 4004 all the way through the Pentium Pro, Intel manufactured new, faster, more efficient processors at an incredible rate. We begin with the justification for the development of the 4004, and then discuss the architecture of the 4004 and the changes in the 8008. We briefly touch on the 8080 and 8085 before explaining the significant features and changes in both the 8086 and 8088. After these came the 80286 in 1982, the 80386 in 1985, and the 486 processors in 1989. The 486 processor was innovative because, for the first time, motherboards could be reused. Before its creation, the motherboard was deemed obsolete whenever a new processor was developed. The new technology could support several versions of the 486 processor, each significantly improving on the last. Finally, the Pentium and Pentium Pro processors were created, putting Intel back on top of the world of high-speed processors.

Prelude to the 4004 Microprocessor

In 1969, Intel agreed to produce a set of 12 calculator chips for Busicom. However, Intel employee Marcian Hoff Jr., concerned with both the design complexity and the package requirements for each of the calculator chips, believed it would be easier to build a single general-purpose computer that could perform all the functions. Masatoshi Shima, who worked for Busicom, and Stanley Mazor, who joined Intel late in 1969, contributed their own ideas, which furthered Hoff's idea. After Busicom accepted Shima and Hoff's idea for a single general-purpose microprocessor, Intel hired Federico Faggin as the designer for the chip set. Faggin decided to develop a chip set called the 4000 series that consisted of the 4001 ROM chip, the 4002 RAM chip, the 4003 I/O expansion shift register chip, and the 4004 chip used as the central processing unit (CPU). Faggin focused on the circuit design and layout of the chip set while Shima worked on the logic for the 4004 chip. Together, they produced the first 4004 chip in January of 1971. The 4004 chip became fully functional in March of 1971 after Faggin fixed two minor bugs. Faggin then made final amendments to the designs of the 4002 and 4004 chips before production of the 4000 chip set began in August of 1971.

Faggin felt that the 4000 chip set had far more potential than just being used for calculators. Unfortunately, Busicom owned the rights to the 4000 chip set architecture. Faggin requested that Intel obtain the rights to the chip set, but was unsuccessful because Intel management agreed that the chip set was only good for calculator-like products. Faggin then proved the opposite to Intel's management when he demonstrated the use of a 4004 chip in a totally different application. He and Hoff then used this proof, along with Busicom's financial troubles, to convince Intel co-founder Robert Noyce and the rest of management to lower the price for development of the chips in exchange for nonexclusive rights to sell the chips (Faggin et al., 1996).

The 4004 Microprocessor

The 4000 chip set varies in the number of 4001 ROM chips and 4002 RAM chips it contains. At its limit, the chip set can support 16 256-byte 4001 ROM chips as well as 16 80-nibble 4002 RAM chips, where every 20 nibbles represent one register. The 4001 and 4002 chips drive the 4003 chip, and each is equipped with a 4-bit output port. The 4004 chip controls both the 4001 and 4002 chips through five control lines. It includes 16 4-bit registers. Five of the registers consist of a 4-bit accumulator and four 12-bit push-down address stack levels, where one level serves as the instruction pointer and the other three hold return addresses for subroutines. Another register is an arithmetic unit that is used for binary and decimal arithmetic. The other ten registers are used for various other purposes such as logic control and bus timing.

The 4004 chip contains 2,300 transistors and has a total of 45 different instructions, with 16 instruction types that are divided into three groups. The main group consists of 16 general instructions, the I/O and RAM group has 15 instructions, and the accumulator group contains 14 binary arithmetic instructions. The 4000 architecture uses these instructions to address RAM data in several steps. First, a DCL instruction, which is the memory control instruction, selects four RAM chips out of 16 possible RAM chips. The send register control (SRC) instruction then selects a single 4-bit nibble out of the 256 nibbles available from the four selected RAM chips. Finally, a single I/O and RAM instruction is executed on the 4-bit nibble (Faggin et al., 1996).
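The two-step RAM selection described above can be sketched in a few lines of Python. This is only an illustrative model: the function name `decode_src` is hypothetical, the split of the 8-bit SRC address into a 2-bit chip field, a 2-bit register field, and a 4-bit nibble field is an assumption about the field layout, and the four status nibbles of each 20-nibble register are left out so that one bank holds the 256 nibbles quoted in the text.

```python
# Illustrative model of 4004 RAM selection; field layout is an assumption.
CHIPS_PER_BANK = 4          # a DCL instruction selects one bank of four RAM chips
REGISTERS_PER_CHIP = 4      # each 4002 holds four registers
NIBBLES_PER_REGISTER = 16   # main-memory nibbles only; status nibbles are ignored


def decode_src(address: int) -> tuple[int, int, int]:
    """Split an 8-bit SRC address into (chip, register, nibble) indices."""
    if not 0 <= address <= 0xFF:
        raise ValueError("SRC address must fit in 8 bits")
    chip = (address >> 6) & 0b11       # top two bits: chip within the bank
    register = (address >> 4) & 0b11   # next two bits: register within the chip
    nibble = address & 0b1111          # low four bits: nibble within the register
    return chip, register, nibble


nibbles_per_bank = CHIPS_PER_BANK * REGISTERS_PER_CHIP * NIBBLES_PER_REGISTER
print(nibbles_per_bank)          # 256
print(decode_src(0b10_01_0011))  # (2, 1, 3)
```

The product of the three constants reproduces the 256 nibbles that one SRC address can reach after a bank has been chosen.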

The 4004 uses 12 bits for addresses, which allows up to four kilobytes of addressable memory. The 4004 uses a 4-bit multiplexed bus that transfers four bits of data at a time in succession. The 12 bits of the address are split up into two parts, where eight bits are used for instructions and four bits are used for data (Wikipedia, 2004). A single eight-bit instruction takes a total of eight cycles because the multiplexed bus can transfer only four bits at a time. Of those eight cycles, the first three are used to send the address, and the next two are used to gather the instruction, which completes the fetch stage. The final three cycles are used to perform the instruction, which includes the decode, execute and cycle stages, respectively. If the instruction is a 16-bit instruction, it is processed in 16 cycles. The 4004 is able to compute 60,000 instructions per second on average at a frequency of 740.74 kilohertz, although the exact number depends on the mix of eight-bit and 16-bit instructions executed within the second (Bunyan, 1998).
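The relationship between clock frequency, cycles per instruction, and throughput can be checked with a short calculation. The sketch below assumes a clock rounded to 740 kilohertz and uses a hypothetical helper, `instructions_per_second`, that takes the fraction of long (16-cycle) instructions in the mix. It is a back-of-the-envelope model, not a measurement.

```python
# Rough throughput model for the 4004; the rounded clock value is an assumption.
CLOCK_HZ = 740_000   # approximately the frequency quoted in the text
CYCLES_SHORT = 8     # cycles for an instruction that fits in one fetch
CYCLES_LONG = 16     # cycles for an instruction that needs a second fetch


def instructions_per_second(clock_hz: float, long_fraction: float) -> float:
    """Average instruction rate for a mix containing `long_fraction` long instructions."""
    if not 0.0 <= long_fraction <= 1.0:
        raise ValueError("long_fraction must be between 0 and 1")
    average_cycles = CYCLES_SHORT * (1 - long_fraction) + CYCLES_LONG * long_fraction
    return clock_hz / average_cycles


print(round(instructions_per_second(CLOCK_HZ, 0.0)))  # 92500
print(round(instructions_per_second(CLOCK_HZ, 1.0)))  # 46250
```

The quoted average of 60,000 instructions per second falls between these two bounds, which is consistent with a program that mixes short and long instructions.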

The 8008 Microprocessor

Intel started the development of the 8008 microprocessor before the release of the 4004. While Intel searched for a designer for the 4004 chip set, the company decided to accept a request from Computer Terminal Corporation (CTC) for the design of a single eight-bit microprocessor. Intel then assigned its newly hired employee, Hal Feeney, to work on the specifications of the chip. Feeney worked with Mazor on the chip until CTC had financial troubles, which caused the project to halt temporarily. In January of 1971, Intel began work on the chip once again, and assigned Feeney to the design of the chip under the supervision of Faggin. Drawing on the proven design of the 4004, Faggin and Feeney finished the detailed design of the 8008 by March of 1972 (Faggin et al., 1996).

The 8008 microprocessor contains several upgrades from the 4004. First, the microprocessor's frequency is increased to a range of 400 to 800 kilohertz (Carpinelli, 2001). Second, the 8008 contains 3,500 transistors (Intel, 2004). Third, the microprocessor processes eight bits at a time, a change coupled with an upgrade to the multiplexed bus, which can transfer eight bits at a time. Another difference is that the number of bits in a memory address is increased to 14, which makes up to 16 kilobytes of memory addressable. Accordingly, Intel widened both the instruction pointer and the subroutine return addresses to 14 bits. Furthermore, the 8008 has a total of seven separate return addresses for subroutines instead of the three that the 4004 has. Intel also increased the size of the accumulator and binary arithmetic registers to eight bits. In addition, Intel included six eight-bit data registers as well as two eight-bit temporary registers.

Another upgrade incorporated into the 8008 microprocessor is an instruction set enlarged to 48 different instructions. The instructions are divided into five main groups: index register instructions, accumulator group instructions, program counter and stack instructions, I/O instructions, and machine instructions. The instruction groups are then further divided into sub-groups that correspond to the instruction type and to whether the instruction operates on memory or on registers.

The 8008 includes new features as well. The first significant feature is support for interrupts that arrive while an instruction is being executed. When a device raises the interrupt line connected to the CPU, then at the fetch stage of the next instruction to be executed, the contents of the 14-bit instruction pointer are copied onto the address stack. The address of the first instruction of the subroutine that services the interrupt is then copied into the instruction pointer at the next fetch stage. The instructions of the interrupt subroutine are executed, and finally the saved address is copied back from the stack into the instruction pointer so that the original program resumes where it left off.
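The save-and-restore sequence described above can be modeled with a small stack. The class below is a toy sketch rather than a description of the real circuitry: the name `AddressStack`, its methods, and the choice to raise an error when all seven return slots are full are assumptions made for illustration.

```python
# Toy model of the 8008 instruction pointer and its seven return-address slots.
class AddressStack:
    DEPTH = 7            # return addresses available for subroutines
    ADDRESS_MASK = 0x3FFF  # 14-bit addresses

    def __init__(self, instruction_pointer: int = 0) -> None:
        self.instruction_pointer = instruction_pointer & self.ADDRESS_MASK
        self.saved: list[int] = []

    def interrupt(self, handler_address: int) -> None:
        """Save the current instruction pointer and jump to the handler."""
        if len(self.saved) == self.DEPTH:
            # Assumption: this model reports overflow instead of overwriting.
            raise OverflowError("no free return-address slot")
        self.saved.append(self.instruction_pointer)
        self.instruction_pointer = handler_address & self.ADDRESS_MASK

    def resume(self) -> None:
        """Restore the most recently saved address into the instruction pointer."""
        self.instruction_pointer = self.saved.pop()


cpu = AddressStack(instruction_pointer=0x0123)
cpu.interrupt(0x0038)                 # device raises the interrupt line
print(hex(cpu.instruction_pointer))   # 0x38
cpu.resume()                          # handler finishes
print(hex(cpu.instruction_pointer))   # 0x123
```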

A second significant feature of the 8008 is the ready line, which allows the 8008 to work with a combination of memory types that run at different speeds. If the 8008 requests a datum from memory and the datum is not ready, the 8008 must wait until the memory that contains the datum signals on the ready line. The ready line thus synchronizes the 8008 with memory, because the 8008 can access the datum in a memory cycle only once the ready line is raised (Intel, 1972).
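The waiting behavior can be illustrated with a minimal sketch. The function `wait_cycles` is hypothetical, and it assumes the processor samples the ready line once per cycle. It simply counts how many cycles pass before the line is first seen high.

```python
# Minimal illustration of waiting on a ready line; sampling once per cycle is an assumption.
from typing import Iterable


def wait_cycles(ready_samples: Iterable[bool]) -> int:
    """Count cycles spent waiting until the ready line is first sampled high."""
    waited = 0
    for ready in ready_samples:
        if ready:
            return waited
        waited += 1
    raise RuntimeError("memory never signalled ready")


print(wait_cycles([True]))                # 0, fast memory answers at once
print(wait_cycles([False, False, True]))  # 2, slower memory adds two wait cycles
```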

The 8080 and 8085 Processors

The 8080 and 8085 microprocessors include several changes from the 8008. One of the most significant is that both microprocessors remain backward compatible with the 8008 at the assembly-language level, although the binary instruction encodings differ. That means programs written for the 8008 can be reassembled and reused on both the 8080 and the 8085 (Carpinelli, 2001).

The 8080 was released in 1974. It includes a 16-bit address bus that is able to access up to 64 kilobytes of memory. The 8080 also includes a separate eight-bit data bus, in contrast to earlier processors that use a single multiplexed bus (Gwennap, 2003). While the address bus carries memory and I/O addresses, the data bus carries data. In addition, the 8080 uses a 16-bit instruction pointer, along with six 8-bit general-purpose registers that can be combined into 16-bit pairs, where each register has a specified partner with which it is paired. The pair of temporary registers carried over from the 8008 can also hold 16 bits together, but is used solely for internal execution of instructions. Intel also uses a 16-bit stack pointer in the 8080 in place of the seven levels of address stack that the 8008 uses. The stack pointer contains the address of the next available location in memory that can be used for the stack and is able to point to any addressable location in memory (Intel, 1974).
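Pairing two 8-bit registers into one 16-bit value amounts to a shift and an OR. The helpers below, `pair` and `split`, are hypothetical names used only for illustration, and treating the first register of a pair as the high byte is the convention assumed here.

```python
# Combining and splitting 8-bit registers as a 16-bit pair (illustrative helpers).
def pair(high: int, low: int) -> int:
    """Combine two 8-bit register values into one 16-bit value."""
    return ((high & 0xFF) << 8) | (low & 0xFF)


def split(value: int) -> tuple[int, int]:
    """Split a 16-bit value back into its (high, low) 8-bit halves."""
    return (value >> 8) & 0xFF, value & 0xFF


address = pair(0x12, 0x34)
print(hex(address))      # 0x1234
print(split(address))    # (18, 52), i.e. 0x12 and 0x34
```

A 16-bit pair is wide enough to hold any address on the 16-bit address bus, which is why paired registers are useful as memory pointers.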

By 1976, Intel had released the 8085 microprocessor, which differs from the 8080 in several ways. One difference is that while the 8080 runs at two megahertz, the 8085 runs at 6.25 megahertz. Another change is that the 8085 uses only +5 volts for power instead of both the +12 volts and +5 volts that the 8080 needs (Hamzah, 2001). A third difference is an increase to 74 instructions from the 72 that the 8080 can perform, although both processors still have the same four addressing modes: direct, register, register indirect, and immediate (Intel, 1974). Finally, the 8085 contains 6,500 transistors, an increase from the 4,500 transistors in the 8080 (Intel, 2004).

The 8086 and 8088 Processors

The next two processors released by Intel were the 16-bit 8086 and the 8-bit 8088, which were released in 1978 and 1979, respectively. Intel designed each microprocessor to contain 29,000 transistors. The 8086 was designed to run at frequencies of five, eight and 10 megahertz, whereas the 8088 ran at 4.77 megahertz and eight megahertz (Intel, 2004). Intel increased the 8086's addressable memory to one megabyte by giving it a 20-bit address bus. The 8086 also has an external 16-bit data bus, which Intel reduced to eight bits in the 8088. Intel multiplexed the address bus and the data bus, so that both addresses and data are sent over the same lines. Apart from the width of the data bus, Intel kept the 8086 and 8088 architectures the same (Intel, 1990).
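The memory sizes quoted for each processor follow directly from the number of address bits, since n address bits can distinguish 2^n locations. The short sketch below tabulates the widths mentioned in this paper. The helper name `addressable_bytes` is hypothetical, and one byte per address is assumed.

```python
# Addressable memory as a function of address width (one byte per address assumed).
def addressable_bytes(address_bits: int) -> int:
    """Number of distinct addresses reachable with the given address width."""
    return 1 << address_bits


ADDRESS_WIDTHS = {"4004": 12, "8008": 14, "8080": 16, "8086": 20}

for name, bits in ADDRESS_WIDTHS.items():
    kilobytes = addressable_bytes(bits) // 1024
    print(f"{name}: {bits} bits -> {kilobytes} KB")
# 4004: 12 bits -> 4 KB
# 8008: 14 bits -> 16 KB
# 8080: 16 bits -> 64 KB
# 8086: 20 bits -> 1024 KB
```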

The 8086's instruction set contains a total of 133 different instructions. The instructions are divided into several groups where each group represents an instruction type. Those instruction types are data transfer, arithmetic, logic, string manipulation, unconditional jump, control transfer, return from call, interrupts, and microprocessor control (Intel, 1990). Furthermore, there are at least seven address modes used to reference the datum required for an instruction, which include immediate, direct, register, register indirect, register relative, base plus index, and base relative plus index (Nalty, 1997).

The 8086 uses a coprocessor-style architecture that consists of two separate units: the Bus Interface Unit and the Execution Unit. The Bus Interface Unit controls all external bus functions, which include the instruction fetch, the operand fetch, the storage of the results of operations in main memory, and control of the prefetch instruction queue. The Bus Interface Unit uses the prefetch instruction queue to store up to six bytes of instructions at a time. The Execution Unit, which contains an arithmetic logic unit, eight general-purpose registers, two temporary registers, and queue control logic, decodes and executes the instructions it receives from the Bus Interface Unit and then passes the results back to the Bus Interface Unit. The two units work separately yet concurrently in a pipelined fashion: the Execution Unit executes the instructions given to it by the Bus Interface Unit while the Bus Interface Unit fills the prefetch queue with new instructions whenever there is room in the queue (Rucinski, 2003).
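The interplay between the two units can be sketched as a producer and a consumer sharing a six-byte queue. The class `PrefetchQueue` and its method names are hypothetical, and fetching one byte per step is a simplification assumed for clarity rather than a description of the actual bus cycles.

```python
# Toy producer/consumer model of the 8086 prefetch queue; one byte per step is an assumption.
from collections import deque
from typing import Optional

QUEUE_CAPACITY = 6  # bytes, as stated in the text


class PrefetchQueue:
    def __init__(self, memory: bytes) -> None:
        self.memory = memory
        self.next_fetch = 0           # index of the next byte to fetch
        self.queue: deque = deque()

    def biu_step(self) -> bool:
        """Bus Interface Unit: fetch one byte if there is room; report whether it did."""
        if len(self.queue) < QUEUE_CAPACITY and self.next_fetch < len(self.memory):
            self.queue.append(self.memory[self.next_fetch])
            self.next_fetch += 1
            return True
        return False

    def eu_step(self) -> Optional[int]:
        """Execution Unit: take the next instruction byte, or None if the queue is empty."""
        return self.queue.popleft() if self.queue else None


q = PrefetchQueue(bytes(range(10)))
fetched = sum(q.biu_step() for _ in range(8))  # the queue is full after six fetches
print(fetched, len(q.queue))  # 6 6
print(q.eu_step())            # 0, the Execution Unit consumes the oldest byte
print(q.biu_step())           # True, room is available again so fetching resumes
```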

The 8086 architecture introduces another significant feature that became known as segmented memory. The 8086 contains four 16-bit segment registers that together can access up to 256 kilobytes at a time. The first segment is the code segment, which holds instructions. The second is the data segment, which holds data references. The third is the stack segment, which stores subroutine return addresses. The fourth is an extra segment that string operations use to hold memory addresses. Each segment register is a pointer that holds the base address of its segment (Intel, 1990). However, there is a problem: the address bus is 20 bits wide, whereas the segment registers are only 16 bits wide. Intel remedied this problem with a two-step solution. First, the value in the segment register is multiplied by 16, which shifts it left by four bits and produces a 20-bit base address that fits the address bus. Then, a 16-bit value from an offset register is added to this base address, and the sum is the true, physical address in memory. The 16-bit registers that hold the offset are the instruction pointer for the code segment, the stack pointer for the stack segment, and the source and destination index registers, which can be used with any of the segments. This combination of segment registers and offset registers makes the entire megabyte of memory addressable. However, several memory addresses are dedicated or reserved for specific purposes, such as the Interrupt Vector table, and are not supposed to be used as part of a segment (McQuire, 2002).
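The two-step address calculation can be written out directly. In the sketch below the function name `physical_address` is hypothetical, and masking the sum to 20 bits is an assumption that models an address wrapping around at the one-megabyte boundary.

```python
# Segment:offset to physical address on the 8086 (illustrative sketch).
ADDRESS_MASK = 0xFFFFF  # 20-bit address bus; wrap-around at 1 MB is assumed


def physical_address(segment: int, offset: int) -> int:
    """Shift the 16-bit segment left four bits, add the 16-bit offset, keep 20 bits."""
    if not (0 <= segment <= 0xFFFF and 0 <= offset <= 0xFFFF):
        raise ValueError("segment and offset must each fit in 16 bits")
    return ((segment << 4) + offset) & ADDRESS_MASK


print(hex(physical_address(0x1234, 0x0010)))  # 0x12350
print(hex(physical_address(0x1235, 0x0000)))  # 0x12350, same location via another pair
print(hex(physical_address(0xFFFF, 0x000F)))  # 0xfffff, the top of the megabyte
```

The second call shows a consequence of the scheme: because segments start every 16 bytes and overlap, many different segment and offset pairs refer to the same physical location.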

The 80286 Processor

The Intel 80286 processor, officially named the iAPX 286, was introduced in February of 1982 (Wikipedia, 2004b). The 80286 processor contains 134,000 transistors and the same basic set of registers, address modes, and instructions as its predecessors, the 8086 and 8088. It is also upward compatible with the 8086 and 8088 (Intel, 1990).

Like its predecessor, the 286 is a 16-bit architecture. It is available in clock speeds of 8, 10, and 12 megahertz. It operates in two modes: 8086 real address mode and protected virtual address mode. In 8086 real address mode, programs use real addresses with up to one megabyte of address space, and the processor is essentially just a faster 8086. In protected virtual address mode, the processor provides more addressable memory, multi-user protection, and virtual memory.