Table of Contents:

1 List of Figures

2 List of Tables

3 List of Symbols

4 List of Definitions

5 Introductory Materials

5.1 Executive Summary

5.2 Acknowledgement(s)

5.3 Problem Statement

5.4 Operating Environment

5.5 Intended user(s) and intended use(s)

5.6 Assumptions

5.7 Limitations

5.8 End product and other deliverables

6 Project Approach and Results

6.1 Functional Requirements for the End Product

6.2 Design Constraints

6.3 Technical Approach Considerations and Results

6.4 Testing Approach Considerations

6.5 Detailed Designs

6.6 Sutherland Design

6.7 Dautremont Design

6.8 Mousetrap Design

6.9 Design Selection Criteria

6.10 Implementation Process

6.11 Testing Process and Results

6.12 Project End Results

7 Resource and Schedules

7.1 Personnel Effort Requirements

7.2 Resource Requirements

7.3 Estimated Financial Requirements

7.4 Schedule

8 Closure Materials

8.1 Project Evaluation

8.2 Commercialization

8.3 Recommendations for Additional Work

8.4 Lessons Learned

8.5 Risk and Risk Management

8.6 Team Information

8.7 Summary

8.8 References

8.9 Appendix A - Statement of Work

8.10 Appendix B - Milestone Evaluation

1  List of Figures

Figure 1 Top View of Data Pipeline
Figure 2 Latches
Figure 3 Pre-Driver
Figure 4 Pads/Pad Drivers
Figure 5 Clock
Figure 6 Sutherland FIFO Diagram
Figure 7 C-Muller Transistor Level Design
Figure 8 Switch Transistor Level Design
Figure 9 Inverter Transistor Level Design
Figure 10 FIFO Component Level Design
Figure 11 Simulation of Sutherland FIFO
Figure 12 Data must pass through all storage elements
Figure 13 Parallel architecture reduces number of elements data must pass through
Figure 14 Tokens are used to allow storage elements to read and write
Figure 15 Gates used in circuit
Figure 16 Control Logic Circuit
Figure 17 Control logic and data storage
Figure 18 Dissimilar Token and Token Sense lines allow operations to occur. The inverters shown are necessary for proper operation
Figure 19 Block Diagram of Mousetrap Pipeline
Figure 20 Schematic of the Data Latch
Figure 21 Simulation Results for Test of Single Data Latch
Figure 22 Schematic of Dual Rail Monotonic XOR/XNOR Gate
Figure 23 Simulation Results for Test of Single XOR/XNOR Logic Gate
Figure 24 Schematic of Single Bit of Mousetrap Pipeline
Figure 25 Schedule
Figure 26 Deliverable Schedule

2  List of Tables

Table 1 Gate Truth Tables
Table 2 Original Personnel Effort Requirements
Table 3 Revised Personnel Effort Requirements
Table 4 Actual Personnel Effort Requirements
Table 5 Original Resource Requirements
Table 6 Revised Resource Requirements
Table 7 Actual Resource Requirements
Table 8 Estimated Financial Requirements
Table 9 Revised Financial Requirements
Table 10 Actual Financial Requirements
Table 11 Proposed Milestones
Table 12 Evaluation Definition
Table 13 Project Milestones and Success of Project
Table 14 Team Information
Table 15 Proposed Milestones
Table 16 Evaluation Definition

3  List of Symbols

Symbol / Meaning
/ AND gate. The standard logic gate, as described in the Dautremont design.
/ Muller-C element. This symbol has one inverted input, as described in the Sutherland design.
/ NMOS transistor. Basic transistor symbol, referenced in the Dautremont and Sutherland designs.
/ NOT gate. The standard inverter, as described in the Sutherland and Dautremont designs.
/ PMOS transistor. Basic transistor symbol, referenced in Dautremont and Sutherland designs.
/ Switch. The appropriate input is chosen based on the control wire, and the input is inverted. This component is described in the Sutherland design.
/ Tristate inverter. The conventional tristate inverter, as described in the Dautremont design.
/ XOR gate. The standard XOR logic gate, as described in the Dautremont design.

4  List of Definitions

Asynchronous – Lacking a common signal to coordinate behavior, especially the electrical signals between two circuit components.

C-Muller – The event driven logic analog to an AND gate that generates an event when there is an event on both of its inputs.

Cadence – Integrated circuit design tool that allows for front to back circuit design.

Circuit family – The specific architecture used to implement a component. It often determines overall speed and size.

Corner – Timing variation due to systematic and random process errors.

DLL – Delay-locked loop; a timing device that synchronizes two different signals.

DRC – Design rule check; verifies the geometry of a circuit layout conforms to certain specifications so as to guarantee a high yield of good die.

Fast corner – The process corner at which the circuit runs fastest, characterized by high voltage and low temperature.

FIFO – First in first out; an electronic component that buffers data and presents the data in the order it was received.

Forward latency – Delay between data valid signal and the first QED clock edge.

FO4 delay – Fan-out-of-four delay; a process-independent measure of delay, defined as the delay of one inverter driving four identical inverters.

HDL – Hardware description language

Integrated circuit – A tiny complex of electronic components and their connections that is produced in or on a small slice of material (as silicon).

IC – Integrated circuit

Lambda based design rules – Design rules based on a process-specific length lambda. All design rules are then specified as integral multiples of lambda. In this way, a given design becomes scalable and process independent.

Layout – A geometric layout of the physical layers of material in an integrated circuit.

LVS – Layout versus schematic; a process used in the Cadence tool suite to compare the schematic with an extracted form of the layout.

Maximum data throughput – The maximum amount of data that can be produced by the pipeline in a specified amount of time.

MOSIS – A fabrication service that allows students to fabricate circuits.

Process – A system of operations used to take a silicon wafer to an integrated circuit, typically designated by the length of the smallest feature that the system can produce.

QED – Signal from the graphics processor requesting data from the pipeline.

Reverse latency – How quickly new data can be accepted when the pipeline is full.

Schematic – A drawing or diagram of the circuit that makes it easier for the user to visualize and understand in terms of mathematical equations.

Slew rate – The maximum rate of change of an output signal.

Slow corner – The process corner at which the circuit runs slowest, characterized by low voltage and high temperature.

Specifications – A detailed, precise presentation of something or a plan or proposal for something.

Tapeout – The final layout draft. After the iterative process of fixing errors in the layout and running simulations, a layout is created that meets all the rules of integrated circuit design and the specifications of the process.

Throughput – A measure of pipeline efficiency; specifically, the number of memory requests served per unit of time.

Topology – A branch of mathematics concerned with those properties of geometric configurations (as point sets) that are unaltered by elastic deformations (as a stretching or a twisting).

Verilog – A hardware description language used to describe the behavior of a circuit and verify that a design is correct.
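
As a rough behavioral illustration of the C-Muller definition above, the following Python sketch models a two-input element whose output changes only once both inputs have moved to the same level, and otherwise holds its previous value. The class name `MullerC` and its `update` method are hypothetical names chosen for this sketch. The inverted input shown in the List of Symbols is not modeled, and this is not the transistor-level circuit used in the Sutherland design.

```python
class MullerC:
    """Behavioral sketch of a two-input Muller C-element (hypothetical model)."""

    def __init__(self, initial=0):
        # The element is a state-holding gate, so it needs a stored output.
        self.out = initial

    def update(self, a, b):
        """Apply input levels a and b (0 or 1) and return the output level."""
        if a == b:
            # Both inputs agree: the output follows them.
            self.out = a
        # Inputs disagree: the output keeps its previous value.
        return self.out


c = MullerC()
trace = [c.update(1, 0), c.update(1, 1), c.update(0, 1), c.update(0, 0)]
print(trace)
```

With transition (event) signaling, the output toggles only after an event has been seen on both inputs, which is the AND-like behavior the definition describes.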


5 Introductory Materials

This section of the design report will provide an overview of the read data path pipeline project. First, it will provide the motivation and problem statement that define the project. Next, it will explain the environment the circuit is being designed for, and who will use the circuit. Finally, it will outline assumptions and limitations inherent in the work, and describe generally what the final product should be.

5.1 Executive Summary

The executive summary includes the need for the project, the actual project activities performed, and the recommendation for follow-up work.

Project Need:

Micron Technology is one of the world’s leaders in advanced semiconductor solutions. One of the items produced by Micron is the graphics card used in a computer. The graphics card constantly refreshes the display by reading data from graphics memory. The time between data being ready in the graphics memory and the graphics processor being ready to handle the data is not constant because of process and operating variations. Micron desired a pipelined processor that passes data from the memory to the processor at the right time while also meeting throughput requirements. The challenge of meeting the timing specification and throughput needs in a way that is independent of process variations and operating conditions was given to senior design team May04-19. Upon successful completion, the graphics card will function correctly because data will be read according to the asynchronous specifications.

Project Activities:

The activities involved in the creation of the 32-bit read data pipeline processor included understanding, research, design, and testing. The goals of the project were to create a read data pipeline processor that meets the specification given by Micron Technology, to fabricate the circuit design, and to check the actual results of the design after fabrication. For the circuit to be fabricated and tested before the end of the project in May 2004, the layout and design needed to be submitted for fabrication on January 12, 2004. Meeting that deadline required a large share of the project work to be done during the first semester.

The first step in the project was to understand the need for the project and the problem the pipeline would solve. Data is read from the memory at a different rate than it is entered, so the read data pipeline provides a buffer that resolves this asynchronous mismatch. After the purpose of the project was understood, the team researched the techniques currently used to implement asynchronous pipelines, including the methods used by Sutherland and in the Mousetrap design.
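
The buffering role described above can be sketched behaviorally in Python. This is only an illustration under stated assumptions: the class `ReadDataFifo`, the function `simulate`, the depth of 4, and the event times are all hypothetical and are not taken from the Micron specification or from any of the three designs.

```python
from collections import deque


class ReadDataFifo:
    """Hypothetical fixed-depth FIFO that decouples a writer from a reader."""

    def __init__(self, depth):
        self.depth = depth
        self.slots = deque()

    def write(self, word):
        # The memory side may only write while a slot is free.
        if len(self.slots) >= self.depth:
            raise OverflowError("FIFO full")
        self.slots.append(word)

    def read(self):
        # The processor side may only read while data is buffered.
        if not self.slots:
            raise LookupError("FIFO empty")
        return self.slots.popleft()


def simulate(write_times, read_times, depth):
    """Replay timed writes and reads; return the words in the order read."""
    fifo = ReadDataFifo(depth)
    # Kind 0 (write) sorts before kind 1 (read) when times are equal.
    events = [(t, 0, i) for i, t in enumerate(write_times)]
    events += [(t, 1, None) for t in read_times]
    out = []
    for _, kind, word in sorted(events, key=lambda e: (e[0], e[1])):
        if kind == 0:
            fifo.write(word)
        else:
            out.append(fifo.read())
    return out


# Writes arrive in a burst and then late; reads follow their own timing.
result = simulate(write_times=[0, 1, 2, 7], read_times=[3, 4, 8, 9], depth=4)
print(result)
```

The words leave in the order they arrived even though the write and read timings are unrelated, which is the property the pipeline has to guarantee.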

After learning about the different techniques that could be used to create a pipelined processor, the team designed the Dautremont method as an extension of an existing method, but with a new twist. For each design, a schematic, layout, and post-layout simulations were completed in conjunction with hand-calculated timing analysis. Tests for forward latency, reverse latency, and throughput were also completed.

After the selection criteria for the pipelined processor were set, the designs were compared to determine which one would be completed in full and submitted for fabrication. The Dautremont design was selected because it met all of the requirements set forth by Micron and the team, took the least area (234 µm² per bit), had a fast forward latency of 6.05 ns, and was a new idea.

The final layout of the circuit was completed by the deadline and submitted to MOSIS. While the circuit was being fabricated, post-layout testing was completed and a test plan for the fabricated circuit was discussed. Once the fabricated circuit is received, it will be tested in the Carver Lab.

Final Results:

The final result of this project is a design report sent to the client, Micron Technology. This report shows the details of the schematics, layouts, and test results of the designs used by the design team.

At the beginning of this project, the team was given a basic functional specification of the read data path pipeline, including inputs and outputs, a block diagram of its placement on the surrounding chip, and timing diagrams indicating behavior. Using this information, the team developed a conceptual understanding of how the part needed to work.

Next, the team began researching various circuit design and pipelining strategies in order to create a concrete design that was both fast and space efficient. Significant investigation into event driven pipelining and control led to an important and fundamental understanding of the issues involved.

Research and advice from Dr. Geiger and Mr. Brian Johnson led to three different pipeline designs. Each design was simulated using Verilog and investigated with hand calculations involving delay through the circuit and layout area. After some deliberation, the Dautremont design was chosen for final layout and simulation.

The modular components were laid out in Cadence individually and then integrated into the final design. Simulations were carried out throughout the process to detect problems early. The design was tested against the timing specifications of Micron, and final reports were created.

In addition, the team was forced to consider some fabrication issues in the design. This was accomplished when the group submitted a portion of the design to the MOSIS program for fabrication.

Follow-up Recommendations:

No additional work is required to complete the project, since the objective was to design a pipeline processor to meet the specifications given by Micron Technology.

5.2 Acknowledgement(s)

Thanks to Brent Keeth and Brian Johnson from Micron Technology for creating a project that students can handle in a Senior Design Project, and for being contact points within Micron.

Thanks to Dr. Randy Geiger for being the faculty advisor for the project, and for providing his knowledge and experience in the area of integrated circuit design.

Thanks to MOSIS for providing students the opportunity to fabricate a circuit. Without this help, the cost of fabrication would be out of reach for a senior design project.

5.3 Problem Statement

General problem statement: Micron Technology needed an integrated circuit for a graphics card that can quickly move data from graphics memory to a graphics processor. The data from the memory arrives at the processor asynchronously, so a data path pipeline is needed to mediate between the two timing configurations.