San Jose State University

College of Engineering

“OPB Flyby-DMA Controller”

CMPE202 Project Report

Fall 2002

Professor:

Prof. Donald Hung

Date: 12/18/2002


Table of Contents

1 Introduction

2 System Design

2.1 System Block Diagram

2.2 System Address Map

3 DMAC Main-Core Design

3.1 Slave Side Address Decoder

3.2 Slave Side Transfer Handshaking

3.3 DMAC Register Module

3.4 DMAC Configuration and Status Registers (CSR)

3.4.1 Start Address Register @ 0x00000400

3.4.2 Word Count Register @ 0x00000404

3.4.3 Control Register @ 0x00000408

3.4.4 Status Register @ 0x0000040C

3.5 Timing & Control

3.6 DMAC-FSM

4 OPB Arbiter

5 OPB Bus Logic

6 Test Plan

6.1 Core Modules

6.2 Stimulus Block Diagram

6.3 Test Modules required for testing Core Modules

6.3.1 CPU Master Device

6.3.2 Memory Slave Device

6.3.3 DMA Peripheral Slave Device

7 Verilog Source Code

7.1 DMAC Main-Core – dma_controller.v

7.2 Arbiter Main-Core – obp_arbiter.v

7.3 OPB Bus Logic Main-Core – opb_logic.v

7.4 Memory Test-Core – memory_sl3.v

7.5 DMA Peripheral Test-Core – dma_peripheral_sl2.v

8 Verilog Testbench Code

8.1 DMAC Testbench – dma_controller_tb.v

8.2 Stimulus(CPU) Testbench Code - opb_dmac_system_tb.v

9 Output Waveforms

10 Appendix A – References


OPB Flyby-DMA Controller

1  Introduction

In System On a Chip (SOC) systems, just like in traditional board-based computer systems, it is important to make the most efficient use of the system resources. For example, the system (peripheral) bus is one such resource that is used by the CPU to access the peripherals. However, it is typical that the peripheral data needs to be stored in memory before it is ready to be used. For this reason, it is beneficial for a system to have a device to facilitate peripheral-to/from-memory transfers, during the times in which the CPU has no other reason to access the system bus. The name given to such a device is ‘Direct Memory Access Controller’, or DMA Controller.

The focus of this project is to design and verify a DMA Controller targeted for IBM’s On-chip Peripheral Bus (OPB). The OPB is a flexible system bus which takes advantage of the interconnection fabric of ASICs and FPGAs, by allowing devices to be connected through AND/OR logic, rather than through tri-state buses. The DMA Controller described here is designed to adhere to the OPB Specification, version 2.1.

Decisions had to be made as to what kind of functionality this DMA Controller should have. It needed to be an OPB master, in order to take control of the bus. But it also needed to be a slave, so the CPU could program it. So it was determined that the DMAC would have two distinct “sides”, which would share a register file.

It was also decided to make this a “flyby” controller (one-stage), rather than a two-stage controller. What this means is that the DMAC does not buffer the data between the memory and the peripheral. Instead, it helps the peripheral behave as if it were the CPU, by driving the address and control lines appropriately, requiring only that the peripheral properly handle the data bus.

There still needs to be some handshaking between the peripheral and the DMAC. The OPB specifies the use of two signals for this purpose. The peripheral gets the attention of the DMAC using a ‘dma request’ signal, and the DMAC can then respond with a ‘dma grant’ signal. The details of this interaction are spelled out later in this report.


2  System Design

2.1  System Block Diagram

For the purposes of illustration, and to facilitate testing, a reference design was created. Figure 1 shows the System Block Diagram for this reference design. The CPU in this system must include an OPB-to-Processor Bus bridge, which can either be integrated or exist as a separate core. Notice that there are two master devices (CPU and DMAC); three slaves, including a DMA enabled peripheral; and the OPB logic and bus arbiter.

Figure 1 - System Block Diagram

2.2  System Address Map

We have chosen an arbitrary address range for various devices in our reference design. The following table gives the system wide address map we have used.

Device            Start        End
Memory            0000:8000    0000:FFFF
DMAC              0000:0400    0000:040F
DMA Peripheral    0000:0420    0000:042F

3  DMAC Main-Core Design

The DMAC implementation includes a slave side, which enables the CPU to access the registers; and a master side, which allows the DMAC to facilitate bus transfers. Both sides share the DMAC register file, but the CPU is prevented from writing to the registers while the DMAC is busy.

The slave side consists of an Address Decoder, and Handshaking logic to acknowledge transfers. The valid addresses are defined below, when the four DMAC registers are discussed.

The master side is essentially a finite state machine, referred to below as the Timing and Control Section. It handles all of the address and control signaling in order to effect DMA flyby burst transfers.


3.1  Slave Side Address Decoder

The Slave Side Address Decoder generates several ‘valid’ signals, which are used to enable updates to the DMAC registers during write operations; or to specify which register should drive the data bus, in the case of read operations.

When OPB_select indicates the address bus is active, the address decoder compares the most significant bits of this address to the DMAC base address. If there is a match, then ‘valid_xfer’ will be asserted, along with an individual valid for the selected register.
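The decode described above can be sketched in a few lines of Verilog. This is a minimal illustration, not the code in dma_controller.v: only OPB_select and valid_xfer are names taken from this report, while the module name, the per-register valid names, and the use of bit 31 as the most significant address bit are assumptions made for the example.

```verilog
// Sketch only: every name except OPB_select and valid_xfer is an assumption.
module dmac_addr_decode_sketch (
    input  wire        OPB_select,  // high while a master drives a valid address
    input  wire [31:0] OPB_ABus,    // OPB address bus (bit 31 assumed to be the MSB)
    output wire        valid_xfer,  // access falls inside the DMAC register block
    output wire        valid_sar,   // Start Address Register, offset 0x0
    output wire        valid_wcr,   // Word Count Register,    offset 0x4
    output wire        valid_ctl,   // Control Register,       offset 0x8
    output wire        valid_stat   // Status Register,        offset 0xC
);
    // Upper 28 bits of the DMAC base address 0x00000400 from the address map.
    localparam [27:0] DMAC_BASE_HI = 28'h0000040;

    // Only the upper bits are compared; the low 4 bits index the 16-byte block.
    assign valid_xfer = OPB_select & (OPB_ABus[31:4] == DMAC_BASE_HI);

    // Address bits [3:2] choose one of the four word-aligned registers.
    assign valid_sar  = valid_xfer & (OPB_ABus[3:2] == 2'b00);
    assign valid_wcr  = valid_xfer & (OPB_ABus[3:2] == 2'b01);
    assign valid_ctl  = valid_xfer & (OPB_ABus[3:2] == 2'b10);
    assign valid_stat = valid_xfer & (OPB_ABus[3:2] == 2'b11);
endmodule
```

With this arrangement an access to 0x00000408 raises valid_xfer and valid_ctl only, and any address outside 0x00000400 to 0x0000040F leaves all five outputs low.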

3.2  Slave Side Transfer Handshaking

The Slave Side Transfer Handshaking is designed to allow zero wait-state accesses by the CPU. The logic asserts the two acknowledge signals once it recognizes that a valid transfer is in progress, and it releases the acknowledgement at the end of the current clock cycle.

In addition, the Handshaking section has the responsibility for generating the data bus enable, if a valid read operation is taking place.
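One way to obtain zero wait-state behaviour is to generate the acknowledges combinationally from the decoder output, as in the sketch below. It assumes that the two acknowledge signals are a transfer acknowledge and a full-word acknowledge, modeled on the memory_sl3 port list later in this report; the output names and the OPB_RNW input name are hypothetical.

```verilog
// Sketch only: output names are modeled on the memory_sl3 ports, not taken
// from dma_controller.v.
module dmac_slave_handshake_sketch (
    input  wire valid_xfer,   // from the address decoder: DMAC is being accessed
    input  wire OPB_RNW,      // 1 = read, 0 = write (assumed name)
    output wire sl_XferAck,   // transfer acknowledge
    output wire sl_fwAck,     // full-word acknowledge
    output wire sl_dbusEn     // enables the DMAC's drive onto the data bus
);
    // Acknowledge in the same cycle the access is recognised (zero wait states).
    // Because these are purely combinational, they drop as soon as the master
    // removes its select, which is assumed to happen at the next clock edge.
    assign sl_XferAck = valid_xfer;
    assign sl_fwAck   = valid_xfer;

    // The data bus is driven only for valid read operations.
    assign sl_dbusEn  = valid_xfer & OPB_RNW;
endmodule
```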

3.3  DMAC Register Module

The DMAC Register Module contains four registers:

·  Start Address Register – location in memory of first word in transfer

·  Word Count Register – maximum number of words to transfer

·  Status Register – used to verify operations have occurred properly

·  Control Register – used by the CPU to control DMAC operation

The DMAC Register Module utilizes various valid signals to control how registers are updated, and it is the only part of the DMAC that drives the OPB data bus. It also contains logic to prevent the CPU from writing to the registers if the DMAC is busy with a transfer. Further details about the DMAC registers can be found in the next section.
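The write-protection and read-back behaviour can be illustrated with the following sketch. It is a simplified model under stated assumptions: all port names, the synchronous active-high reset, and the externally assembled status word are hypothetical, and the updates that the Timing and Control Section makes to the registers during a transfer are left out.

```verilog
// Sketch only: port names and reset behaviour are assumptions. Updates made
// by the DMAC itself during a transfer (e.g. word count decrement) are omitted.
module dmac_regs_sketch (
    input  wire        clk,
    input  wire        reset,       // synchronous, active high (assumed)
    input  wire        OPB_RNW,     // 1 = read, 0 = write
    input  wire [31:0] OPB_DBus,    // write data from the CPU
    input  wire        valid_sar,   // per-register valids from the address decoder
    input  wire        valid_wcr,
    input  wire        valid_ctl,
    input  wire        valid_stat,
    input  wire        busy,        // a DMA transfer is in progress
    input  wire [31:0] status,      // status word assembled elsewhere
    output reg  [31:0] sar,         // Start Address Register
    output reg  [31:0] wcr,         // Word Count Register
    output reg  [31:0] ctl,         // Control Register
    output wire [31:0] sl_dbus      // read data, all zeros when not selected
);
    // CPU writes are ignored while the DMAC is busy.
    wire cpu_wr = ~OPB_RNW & ~busy;

    always @(posedge clk) begin
        if (reset) begin
            sar <= 32'b0;
            wcr <= 32'b0;
            ctl <= 32'b0;
        end else if (cpu_wr) begin
            if (valid_sar) sar <= OPB_DBus;
            if (valid_wcr) wcr <= OPB_DBus;
            if (valid_ctl) ctl <= OPB_DBus;
        end
    end

    // Read mux: drive zeros unless one of the registers is being read, so the
    // value can be OR-ed onto the shared OPB data bus.
    assign sl_dbus = !OPB_RNW   ? 32'b0  :
                     valid_sar  ? sar    :
                     valid_wcr  ? wcr    :
                     valid_ctl  ? ctl    :
                     valid_stat ? status : 32'b0;
endmodule
```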


3.4  DMAC Configuration and Status Registers (CSR)

3.4.1  Start Address Register @ 0x00000400

·  32-bits

·  Read/Write

·  Aligned on word boundary (multiple of 4)

·  CPU writes the memory address of the first word of a transfer

3.4.2  Word Count Register @ 0x00000404

·  32-bits

·  Read/Write

·  CPU writes this register with number of 32-bit words to be transferred

·  Gets decremented by DMAC

·  Transfer completes when WCR == 0 or dmaReq == 0, causing ‘done’ to be asserted

3.4.3  Control Register @ 0x00000408

·  Bit 0: ‘rnw’
- direction (1=read, 0=write)

·  Bit 1: ‘go’
- causes transfer to be enabled
- transfer will commence when slave asserts dma request

·  Bit 2: ‘ena_int’
- enable interrupt (1=enable, 0=disable)

3.4.4  Status Register @ 0x0000040C

·  Bit 0: ‘done’
- true when transfer completes; generates an interrupt if interrupts are enabled

·  Bit 1: ‘busy’
- indicates transfer in progress
- CPU may not write to any registers

·  Bit 2: ‘int_ena’
- true when interrupt is enabled

·  Bit 3: ‘dma_err’
- true indicates a DMA error has occurred
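The bit assignments of the Control and Status registers listed above can be captured as follows. The sketch assumes that "Bit 0" denotes the least significant bit of a [31:0] vector, that unused bits read as zero, and that the interrupt is simply 'done' gated by the enable; the module, the port names and the dmac_int output are hypothetical.

```verilog
// Sketch only: LSB-0 bit numbering, zero-filled unused bits and the
// dmac_int output are assumptions, not taken from dma_controller.v.
module dmac_csr_bits_sketch (
    input  wire [31:0] ctl,       // Control Register contents
    input  wire        done,      // transfer complete
    input  wire        busy,      // transfer in progress
    input  wire        dma_err,   // DMA error detected
    output wire        rnw,       // direction: 1 = read, 0 = write
    output wire        go,        // transfer enabled
    output wire        ena_int,   // interrupt enable
    output wire [31:0] status,    // Status Register read value
    output wire        dmac_int   // hypothetical interrupt request to the CPU
);
    // Control Register bit positions.
    localparam CTL_RNW     = 0,
               CTL_GO      = 1,
               CTL_ENA_INT = 2;

    assign rnw     = ctl[CTL_RNW];
    assign go      = ctl[CTL_GO];
    assign ena_int = ctl[CTL_ENA_INT];

    // Status Register: bit 3 dma_err, bit 2 int_ena, bit 1 busy, bit 0 done.
    assign status = {28'b0, dma_err, ena_int, busy, done};

    // An interrupt is raised on completion only when it has been enabled.
    assign dmac_int = done & ena_int;
endmodule
```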

3.5  Timing & Control

The Timing and Control Section is the heart of the DMAC. Utilizing the DMAC registers, it handles all the details of accepting peripheral requests for memory access; of requesting the bus from the OPB arbiter; of performing proper sequencing of the bus to allow burst transfers to occur between peripheral and memory slaves; and of releasing the bus and indicating to the CPU that a transfer has occurred.

The Finite State Machine diagram in the next section describes the operation of the Timing and Control Section.

3.6  DMAC-FSM

Here is the Bubble diagram depicting the state changes, inputs and outputs of the DMA Controller.


4  OPB Arbiter

DMA is a case in which the CPU turns control of the bus over to another device (a DMAC). A device that has control of a bus is known as a bus master; memory components and non-DMA I/O devices are known as bus slaves.

On early system buses, only the processor and a DMAC could own the bus (i.e., become the bus master). Modern system buses (e.g. IPB) allow any I/O device to become bus master; such a device can perform DMA itself instead of relying on a DMAC.

If every I/O card can become bus master, a separate DMA controller is not needed, because each card acts as its own DMAC. With more than one potential master, a bus arbitration scheme is necessary to balance two factors:

- Bus priority: the highest priority device should be serviced first

- Fairness: Even the lowest priority device should never be completely locked out from the bus

A piece of control logic known as the ‘arbiter’ decides which device gets ownership of the bus; this control logic resides in the system chipset and is a part of the overall system bus. The arbitration scheme in our design can be defined as follows:

Bus arbitration scheme:

1. A bus master wanting to use the bus asserts the bus request

2. A bus master cannot use the bus until its request is granted

3. A bus master asserts a bus lock to the arbiter

4. A bus master must signal to the arbiter the end of the bus utilization

The block diagram of the OPB_Arbiter is as follows:

The signals that are used by the arbiter are:

Bus Request – used by a device to ask for control of the bus (arbiter input)

Here, the CPU and the DMAC generate the Bus Request. Hence, the signals are M0_request and M1_request.

Bus Grant – used by the arbiter to grant the bus to a device (arbiter output)

Here, the arbiter generates the Bus Grant to the CPU and the DMAC. Hence, the signals are OPB_M0Grant and OPB_M1Grant.

Timeout – used by the arbiter to disable a request after a certain number of clock cycles.

A peripheral may hold the bus for a maximum number of clock cycles (typically 32) before it must release the bus to the arbiter.
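The request, grant, lock and timeout behaviour described in this section can be sketched as a small two-master arbiter. This is an illustration under assumptions, not the code in obp_arbiter.v: M0_request, M1_request, OPB_M0Grant and OPB_M1Grant come from the text, while the bus-lock input names, the hold counter, the alternating tie-break and the choice to keep the grant asserted while the bus is locked are all hypothetical.

```verilog
// Sketch only: lock inputs, hold counter and tie-break policy are assumptions.
module opb_arbiter_sketch (
    input  wire clk,
    input  wire reset,         // synchronous, active high (assumed)
    input  wire M0_request,    // CPU bus request
    input  wire M1_request,    // DMAC bus request
    input  wire M0_busLock,    // hypothetical bus-lock input from the CPU
    input  wire M1_busLock,    // hypothetical bus-lock input from the DMAC
    output reg  OPB_M0Grant,
    output reg  OPB_M1Grant
);
    localparam [5:0] MAX_HOLD = 6'd32;  // maximum cycles a master may hold the bus

    reg [5:0] hold_cnt;    // cycles the current owner has kept the bus locked
    reg       last_owner;  // 0 = M0 was granted last, 1 = M1 was granted last

    // The current owner keeps the bus for as long as it asserts its lock.
    wire locked       = (OPB_M0Grant & M0_busLock) | (OPB_M1Grant & M1_busLock);
    wire hold_timeout = (hold_cnt == MAX_HOLD);

    // When both masters request, the one that did not own the bus last wins,
    // so neither master can be locked out indefinitely.
    wire grant_m0 = M0_request & (~M1_request |  last_owner);
    wire grant_m1 = M1_request & (~M0_request | ~last_owner);

    always @(posedge clk) begin
        if (reset) begin
            OPB_M0Grant <= 1'b0;
            OPB_M1Grant <= 1'b0;
            last_owner  <= 1'b1;   // lets M0 win the first tie
            hold_cnt    <= 6'd0;
        end else if (locked & ~hold_timeout) begin
            hold_cnt <= hold_cnt + 6'd1;   // grants stay as they are
        end else begin
            // Bus released or hold limit reached: arbitrate again.
            hold_cnt    <= 6'd0;
            OPB_M0Grant <= grant_m0;
            OPB_M1Grant <= grant_m1;
            if (grant_m0) last_owner <= 1'b0;
            if (grant_m1) last_owner <= 1'b1;
        end
    end
endmodule
```

The alternating tie-break is one simple way to address the fairness requirement; a fixed-priority scheme combined with the timeout would be an equally valid reading of the description.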

5  OPB Bus Logic

The figure below shows a physical implementation of the OPB. Since the OPB supports multiple master devices, the address bus and data bus are implemented as a distributed multiplexer. This design enables future peripherals to be added to the chip without changing the I/O on either the OPB arbiter or the other existing peripherals. Because each peripheral exposes its bus qualifiers as I/O (select for the ABus, DBusEn for the DBus), the bus can be implemented using centralized AND/OR logic.

Control signals from OPB masters and slaves to and from the OPB arbiter and the peripherals are similarly OR’ed together, and then sent to each device. Bus arbitration signals such as Mn_request and OPB_MnGrant are directly connected between the OPB arbiter and each OPB master device.
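The AND/OR structure amounts to gating each source with its qualifier and OR-ing the results, as the following sketch shows for two masters and two data sources. It is not the code in opb_logic.v: sl3_dbus, sl3_dbusEn and sl3_XferAck follow the memory_sl3 port list in this report, and every other name is an assumption made for the example.

```verilog
// Sketch only: apart from the sl3_* names, all port names are assumptions.
module opb_or_bus_sketch (
    input  wire        M0_select,    // CPU is driving the address bus
    input  wire [31:0] M0_ABus,
    input  wire        M1_select,    // DMAC is driving the address bus
    input  wire [31:0] M1_ABus,
    input  wire        sl2_dbusEn,   // DMA peripheral is driving data
    input  wire [31:0] sl2_dbus,
    input  wire        sl2_XferAck,
    input  wire        sl3_dbusEn,   // memory is driving data
    input  wire [31:0] sl3_dbus,
    input  wire        sl3_XferAck,
    output wire        OPB_select,
    output wire [31:0] OPB_ABus,
    output wire [31:0] OPB_DBus,
    output wire        OPB_XferAck
);
    // Address bus: each master's address is AND-ed with its select, then OR-ed.
    assign OPB_ABus = ({32{M0_select}} & M0_ABus) |
                      ({32{M1_select}} & M1_ABus);

    // Data bus: the same structure, qualified by each source's DBusEn.
    assign OPB_DBus = ({32{sl2_dbusEn}} & sl2_dbus) |
                      ({32{sl3_dbusEn}} & sl3_dbus);

    // Single-bit control signals are simply OR-ed together.
    assign OPB_select  = M0_select   | M1_select;
    assign OPB_XferAck = sl2_XferAck | sl3_XferAck;
endmodule
```

Adding another peripheral only adds one more AND term to each OR, which is why existing devices need no new I/O.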

6  Test Plan

This section identifies the Core Modules, describes the Test Modules needed to fully exercise them, and explains the overall Stimulus Block.

6.1  Core Modules

These are the modules under test. Testing them requires test modules that simulate a CPU Master, a Memory Slave and a DMA Peripheral Slave device, as explained in Section 6.3.

·  OPB DMA Controller

·  OPB Bus Logic

·  Arbiter

Refer to Sections 3, 4 and 5 for more details on the Core Modules.

6.2  Stimulus Block Diagram

6.3  Test Modules required for testing Core Modules

These modules simulate a CPU Master, a Memory Slave and a DMA Peripheral Slave device.

6.3.1  CPU Master Device

The CPU Master device is embedded in the main test bench (opb_dmac_system_tb.v). Apart from simulating the CPU signals, this test bench instantiates the following modules:

·  dma_controller

·  OPB_Arbiter

·  opb_logic

·  dma_peripheral_sl2

·  memory_sl3

The CPU portion of the test bench configures DMAC configuration registers with a hard-coded memory address, a small word count, direction and interrupt enable.

6.3.2  Memory Slave Device

The Memory Slave test module simulates an OPB memory device. It does the following:

·  Has a pre-assigned address range and decodes the assigned address range

·  Generates Sln_XferAck

·  Monitors and displays OPB ABus and OPB DBus signals

·  Drives OPB DBus for Memory read (RNW == 1)

The Verilog file memory_sl3.v contains the code for the Memory Slave Test Module.

Memory Module Prototype:

module memory_sl3(clk, reset, opb_select, opb_rnw, opb_abus, opb_dbus, opb_fwxfer,
                  sl3_XferAck, sl3_fwAck, sl3_dbus, sl3_dbusEn);

Here is the block diagram of Memory Slave Test Module.