ECE 734 Project Proposal (10/2000)
Huaibin Yang, ID 9017919045
Email:
A Survey of DDFS Architecture and Implementation
Background
Fast frequency switching is crucial in modern wireless communication systems such as TDMA/CDMA digital cellular systems and spread-spectrum wireless LANs. For example, a TDMA system may require the carrier frequency to be switched during a signal slot, that is, the change must be completed within 100 µs. Linear phase shifting is also crucial in any system that uses phase-shift keying modulation. Such systems include IS-95, IS-94, GSM, DCS-1800, CDPD and several others.
A Direct Digital Frequency Synthesizer (DDFS) can achieve fast frequency switching in small frequency steps over a wide band. It also provides linear phase and frequency shifting with good spectral purity, so it is well suited to the communication systems above. A further requirement for a DDFS is low power consumption, especially in portable wireless terminals.
Motivation
In the in-class presentation I discussed the fundamental frequency synthesizer techniques, the Phase-Locked Loop (PLL) and the DDFS, and focused on basic DDFS architectures such as the RAM-based and phase-accumulator methods, together with some implementation examples of DDFS building blocks. A fully pipelined FA architecture for the phase accumulator was presented. A simplified ROM look-up table model, aimed at reducing the table size to meet the power budget of wireless transceivers, was introduced, leading to the overall conclusion that ROM size is one of the key considerations in DDFS implementation. I hope to explore this topic further in this project.
Project Objective
This course project is a survey. It focuses on low-power DDFS architectures and implementations.
In the first part of the project, basic DDFS concepts, design and performance analysis will be presented, based on frequency synthesizer books ([1], [2]). I will then review some “classic” papers ([3], [4]) from a historical point of view to understand how DDFS technology and its ideas developed (almost all papers about DDFS cite [3] first). With this background, the main part of the project explores the most recent papers ([5], [6], [7], [8], [9]; more will be added to this list as the project progresses), which introduce new architectures and implementations that reduce the power consumption of the DDFS in wireless applications. The book [10] addresses almost every DDFS topic and collects many of the papers published before 1996. It is a good reference for extending the discussion of DDFS beyond low-power implementation. Conclusions and some discussion will be given in the last part of the project report.
PROJECT REPORT
ABSTRACT
A Direct Digital Frequency Synthesizer (DDFS) can achieve fast frequency switching in small frequency steps over a wide band. It also provides linear phase and frequency shifting with good spectral purity. J. Tierney proposed the DDFS idea in 1971 [3]. With the development of VLSI technology and the requirements of modern communication systems, DDFSs have been widely used since the 1980s in wireless transceivers and in many kinds of frequency synthesizer systems, especially those that place high demands on the synthesizer’s agility.
A standard DDFS architecture consists of an accumulator, a ROM lookup table, a DAC and a reconstruction filter. [1] analyzes the spurious effects due to phase truncation of the accumulator and the finite word length of the ROM lookup table. Among all DDFS building blocks, the ROM is both the power and the performance bottleneck [11]. Reducing ROM size and power dissipation is the main concern of this survey.
Early techniques for reducing ROM size include Sunderland’s architecture [12], which leads to a 50% reduction in ROM size. Nicholas’s architecture [13] is a further optimization and enhancement of Sunderland’s. In the past decade, much effort has gone into modifying these architectures to obtain a smaller ROM, or even to eliminate the ROM. Bellaouar ([5], [7]) proposed a new architecture with a lookup table of only 16 points for use in wireless communication. Yamagishi [8] observed that reducing the number of ROM output bits is more effective in decreasing power dissipation than reducing the ROM storage size, so his implementation aims at a smaller number of ROM output bits. Hegazi [9] observed that the traditional DDFS architecture produces a different number of sine wave samples per cycle at different output frequencies. He introduced a method that generates a fixed number of sine wave samples for every output frequency, and on this basis his implementation is ROM-less. All of the architectures before Hegazi’s are ROM-based. Mortezapour [6] made the most dramatic change to the traditional DDFS architecture: the ROM is replaced with a nonlinear DAC. The phase accumulator outputs directly drive the nonlinear DAC to generate the sine wave, so the design of the nonlinear DAC becomes the key consideration in his paper.
The remainder of this survey is organized into three sections. Section I introduces basic DDFS concepts; Section II surveys ROM compression techniques; Section III illustrates two ROM-less designs.
SECTION I: DDFS Introduction
Basic concept
A basic diagram of the standard DDFS is shown in Figure 1. Almost all DDFSs are composed of these same fundamental building blocks, although with some enhancements or modifications. J. Tierney presented the idea in [3] in 1971.
It uses an M-bit accumulator and a sine-function ROM lookup table. The block is clocked at frequency Fclk. In each clock period, the M-bit input word is added to the accumulator. The output of the accumulator addresses the ROM lookup table to generate a K-bit digitized sine value, which is then converted to an analog voltage by the D/A converter. Since the DDFS is essentially a sampled system, the D/A output should normally be passed through a reconstruction filter, which removes unwanted alias frequencies. Finally, the D/A output is passed through an ideal hard limiter to remove any residual AM that may be present.
The minimum frequency resolution of the DDFS is
$\Delta F = \dfrac{F_{clk}}{2^{M}}$.
The output frequency is given as
$F_{out} = \dfrac{FIW \cdot F_{clk}}{2^{M}}$,
where FIW is the frequency input word.
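The accumulator and lookup steps described above can be sketched in a few lines of Python. This is only an illustrative model, not the implementation from [3]: the ROM is replaced by an ideal `math.sin` call, and the names `ddfs_samples` and `peak_bin` as well as the values of `F_CLK`, `M_BITS` and `FIW` are assumptions chosen to keep the example small.

```python
import cmath
import math


def ddfs_samples(fiw, m_bits, n_samples):
    """Return n_samples of an ideal DDFS output for frequency input word fiw."""
    modulus = 1 << m_bits                                   # accumulator wraps at 2^M
    acc = 0
    out = []
    for _ in range(n_samples):
        out.append(math.sin(2 * math.pi * acc / modulus))   # ideal lookup (no ROM model)
        acc = (acc + fiw) % modulus                          # phase accumulation
    return out


def peak_bin(samples):
    """Index of the largest DFT bin in the first half of the spectrum."""
    n = len(samples)
    mags = []
    for k in range(n // 2):
        s = sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(samples))
        mags.append(abs(s))
    return mags.index(max(mags))


F_CLK = 40e6   # assumed clock frequency in Hz
M_BITS = 8     # assumed (small) accumulator width so the direct DFT stays cheap
FIW = 5        # assumed frequency input word

resolution = F_CLK / 2 ** M_BITS          # Fclk / 2^M
f_out = FIW * resolution                  # FIW * Fclk / 2^M
samples = ddfs_samples(FIW, M_BITS, 2 ** M_BITS)
print(resolution, f_out, peak_bin(samples))
```

Over one full accumulator period of 2^M clocks the spectral peak falls in bin FIW, which is the sampled-data counterpart of the output-frequency relation above.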
Design considerations
Representation of the Ideal DDFS Output Spectrum
,
where h(t) represents the output sample-and-hold, $h(t) = 1$ for $0 \le t < T_c$ and 0 otherwise, with $T_c = 1/f_{clk}$.
1. Phase Truncation Related Spurious Effects
If the width M of the FIW is large, it is impractical to use all M accumulator bits to address the sine lookup table. Usually, only the W < M most significant bits are used in the basic architecture to address the table. In this phase truncation case, the remaining $B = M - W$ bits are discarded.
[1] gives the DDFS output:
,
where e(n) represents the phase error at each clock instant due to truncation. [1] also provides the equation for the magnitude of the largest spur in the spectrum:
where
B = number of accumulator bits truncated
Fr = integer representation of the FIW
$\gcd(F_r, 2^B)$ = the greatest common divisor of Fr and $2^B$
This equation represents an important finding: the magnitude depends on Fr only through the greatest-common-divisor factor $\gcd(F_r, 2^B)$. [1] also gives an algorithm for determining the discrete output spectrum of the DDFS in the presence of phase truncation. The conclusion is that for all values of Fr with the same $\gcd(F_r, 2^B)$, the number and amplitudes of the spurs are unchanged. Only the position of each spur in the output spectrum is altered.
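The role of the greatest common divisor can be seen in the time domain with a small experiment. The sketch below is not the spectrum algorithm of [1]; it only models the discarded bits as e(n) = (Fr·n) mod 2^B, and the function names and the choice B = 4 are assumptions made for illustration.

```python
import math


def error_sequence(fr, b_bits, length):
    """Discarded phase bits e(n) = (Fr * n) mod 2^B for n = 0..length-1."""
    modulus = 1 << b_bits
    return [(fr * n) % modulus for n in range(length)]


def error_period(fr, b_bits):
    """Smallest p > 0 with e(n + p) == e(n), found by brute force."""
    modulus = 1 << b_bits
    p = 1
    while (fr * p) % modulus != 0:
        p += 1
    return p


B = 4  # assumed number of truncated bits
for fr in (3, 5, 6, 10):
    g = math.gcd(fr, 1 << B)
    period = error_period(fr, B)
    values = sorted(set(error_sequence(fr, B, period)))
    print(fr, g, period, values)
```

Words with the same gcd (3 and 5, or 6 and 10) produce error sequences with the same period 2^B / gcd and the same set of values, only visited in a different order. This mirrors the statement that the spur amplitudes stay the same while their positions move.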
2. Lookup Table Finite Word Length Effects on Spectral Purity
Finite quantization of the sample values in the sine lookup table also impairs the DDFS output spectrum. [1] gives the resulting carrier-to-noise ratio (CNR) at the DDFS output:
For example, with D = 10 bits and Fclk = 40 MHz, CNR = 138 dBc/Hz. [1] also analyzes the spurious level in the worst case. In this example, it is about –60 dBc.
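The numbers in this example can be checked with a short calculation. The formula used below is an assumption: it is the usual quantization-noise expression (6.02·D + 1.76 dB) referred to a 1 Hz bandwidth by adding 10·log10(Fclk), which happens to reproduce the quoted 138 dBc/Hz. The worst-case spur estimate of about 6.02 dB per bit is likewise a rule-of-thumb assumption and is not taken from [1].

```python
import math


def cnr_dbc_per_hz(d_bits, f_clk_hz):
    """Assumed CNR model: quantization SNR spread over a bandwidth equal to Fclk."""
    return 6.02 * d_bits + 1.76 + 10 * math.log10(f_clk_hz)


def worst_case_spur_dbc(d_bits):
    """Assumed rule of thumb: roughly 6.02 dB of spur suppression per table bit."""
    return -6.02 * d_bits


cnr = cnr_dbc_per_hz(10, 40e6)
spur = worst_case_spur_dbc(10)
print(round(cnr, 2), round(spur, 1))
```

With D = 10 and Fclk = 40 MHz this gives a CNR of roughly 138 dBc/Hz and a worst-case spur of roughly –60 dBc, consistent with the figures quoted above.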
SECTION II: ROM Size Compression
For a DDFS, the transformation from the accumulator phase value θ to sin(θ) is particularly crucial if table storage is to be kept reasonable, because the ROM is both the power and the performance bottleneck [11]. Each additional address bit used in the table lookup potentially doubles the required table storage. ROM compression methods are used to minimize the ROM size.
- Sine waveform symmetry
Because the sine function is symmetric about π/2 and π, with proper manipulation of the phase and amplitude only the lookup table samples for phase values from 0 to π/2 need to be stored. That is, the accumulator value θ is reflected into the first quadrant by computing θ' = θ mod π/2.
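A minimal Python sketch of the quarter-wave technique is given here. The phase width `W`, the table name and the half-LSB sampling offset are assumptions for illustration; the offset is a common trick that makes the mirror operation a plain bit complement of the address.

```python
import math

W = 10                    # assumed phase word width in bits
Q = 1 << (W - 2)          # number of entries in the quarter-wave table
MASK = Q - 1

# Samples are taken at half-LSB offsets so that mirroring is a bit complement.
QUARTER_TABLE = [math.sin((math.pi / 2) * (k + 0.5) / Q) for k in range(Q)]


def sine_from_quarter_table(phase):
    """Rebuild a full-wave sine sample from the first-quadrant table."""
    quadrant = (phase >> (W - 2)) & 0b11     # two MSBs select the quadrant
    index = phase & MASK                     # remaining bits address the table
    if quadrant & 1:                         # 2nd and 4th quadrants: mirror the address
        index = ~index & MASK
    value = QUARTER_TABLE[index]
    return -value if quadrant & 2 else value  # 3rd and 4th quadrants: negate


max_err = max(
    abs(sine_from_quarter_table(p) - math.sin(2 * math.pi * (p + 0.5) / (1 << W)))
    for p in range(1 << W)
)
print(len(QUARTER_TABLE), max_err)
```

The table holds 2^(W-2) words instead of 2^W, and the only extra logic is an address complement and a sign change.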
- Sunderland architecture (Using trigonometric identity)[12]
A further technique for compressing the ROM is presented in [12]. The phase θ is decomposed into a sum of three angles,
$\theta = \alpha + \beta + \gamma$.
Using the basic trigonometric identity:
$\sin(\alpha + \beta + \gamma) \approx \sin(\alpha + \beta) + \cos(\alpha)\sin(\gamma)$.
So one large ROM can be split into two smaller ROMs. The upper ROM, called the coarse ROM, stores sin(α + β). The lower ROM stores cos(α)sin(γ) and is called the fine ROM; it can be considerably smaller than the upper one because sin(γ) is much smaller than 1. This is conventionally called the Sunderland architecture.
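The coarse/fine split can be illustrated with the following Python sketch. It uses the identity exactly as written above; the 4/4/4 bit split of a 12-bit first-quadrant phase and all names are assumptions, and refinements found in practical designs are not modeled.

```python
import math

A_BITS, B_BITS, C_BITS = 4, 4, 4             # assumed split of a 12-bit quadrant phase
TOTAL_BITS = A_BITS + B_BITS + C_BITS
STEP = (math.pi / 2) / (1 << TOTAL_BITS)     # one phase LSB in radians


def alpha(a):
    return a * (1 << (B_BITS + C_BITS)) * STEP


def beta(b):
    return b * (1 << C_BITS) * STEP


def gamma(c):
    return c * STEP


# Coarse ROM: sin(alpha + beta), addressed by (a, b).
COARSE = [[math.sin(alpha(a) + beta(b)) for b in range(1 << B_BITS)]
          for a in range(1 << A_BITS)]
# Fine ROM: cos(alpha) * sin(gamma), addressed by (a, c).
FINE = [[math.cos(alpha(a)) * math.sin(gamma(c)) for c in range(1 << C_BITS)]
        for a in range(1 << A_BITS)]


def sunderland_sine(phase):
    """Approximate sin(phase * STEP) as coarse + fine correction."""
    a = phase >> (B_BITS + C_BITS)
    b = (phase >> C_BITS) & ((1 << B_BITS) - 1)
    c = phase & ((1 << C_BITS) - 1)
    return COARSE[a][b] + FINE[a][c]


max_err = max(abs(sunderland_sine(p) - math.sin(p * STEP))
              for p in range(1 << TOTAL_BITS))
rom_words = (1 << (A_BITS + B_BITS)) + (1 << (A_BITS + C_BITS))
print(rom_words, 1 << TOTAL_BITS, max_err)
```

Under these assumptions the two ROMs hold 512 words in total instead of 4096, while the approximation error stays below 10^-3 of full scale.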
- Nicholas architecture (sine-phase difference, using a Taylor series expansion of sin(θ)) [13]
According to the first-order Taylor series expansion, define
$f(x) = \sin(\pi x / 2) - \pi x / 2$.
We store f(x), which has a smaller dynamic range than the sine itself, resulting in a further reduction in table size.
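The saving in dynamic range can be estimated numerically. The sketch below evaluates the difference function over 0 ≤ x ≤ 1 on an assumed grid of 1024 points. For comparison it also evaluates sin(πx/2) − x, a form in which the sine-phase difference is often quoted; which form [13] uses is not verified here, so both are shown as assumptions.

```python
import math

N_POINTS = 1024   # assumed evaluation grid over the first quadrant
xs = [k / N_POINTS for k in range(N_POINTS + 1)]

# Peak magnitude of the plain quarter-wave sine.
sine_peak = max(abs(math.sin(math.pi * x / 2)) for x in xs)

# Peak magnitude when the first-order Taylor term (pi*x/2) is subtracted.
taylor_diff_peak = max(abs(math.sin(math.pi * x / 2) - math.pi * x / 2) for x in xs)

# Peak magnitude when the normalized phase x itself is subtracted (assumed variant).
phase_diff_peak = max(abs(math.sin(math.pi * x / 2) - x) for x in xs)

print(sine_peak, round(taylor_diff_peak, 4), round(phase_diff_peak, 4))
```

The stored function peaks at about 0.57 in the first case and about 0.21 in the second, compared with 1.0 for the sine, so one to two fewer amplitude bits are needed per table word for the same absolute resolution.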
- Paper [5] proposed a new architecture for small lookup-table size.
In this architecture only 16 points are stored. The final sine and cosine values are computed by linear interpolation between the sample points. The paper reports experimental results for a DDFS with 60-dBc spectral purity, 29-Hz frequency resolution, and 9-bit output data for sine function generation. The DDFS is implemented in 0.8-µm CMOS, and the measured average power dissipation of the DDFS logic is only 9.5 mW (at 33 MHz, 3.3 V).
The paper’s basic idea and implementation are as follows:
Similar to the architectures in [8] and [10] above, this one also uses a Taylor-series-based mapping technique, but linear interpolation is used between consecutive points stored in the ROM. In every interval $[\theta_i, \theta_{i+1}]$,
$\sin\theta \approx \sin\theta_i + c_i\,(\theta - \theta_i), \quad \theta \in [\theta_i, \theta_{i+1}]$.
Here $\theta_i$ and $\theta_{i+1}$ are two successive stored phases, and $c_i$ denotes the interpolation coefficient for the sine on that interval. The $(\theta - \theta_i)$ term represents the LSBs of the phase θ, while $\theta_i$ represents the MSB address of the ROM. The interpolation coefficient for the sine is also stored in the ROM. Since it is multiplied by the small number $(\theta - \theta_i)$, only a few MSBs of the coefficients need to be stored.
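The interpolation idea can be tried out with a short Python sketch. It is not the circuit of [5]: it assumes 16 equal segments over the first quadrant, stores the segment start values and chord slopes at full floating-point precision, and ignores the sine/cosine switching and coefficient truncation used in the paper. All names are illustrative.

```python
import math

SEGMENTS = 16                         # assumed: 16 stored points over 0..pi/2
H = (math.pi / 2) / SEGMENTS          # segment width in radians

# Stored sample at the start of each segment.
SAMPLES = [math.sin(i * H) for i in range(SEGMENTS)]
# Stored interpolation coefficient (chord slope) of each segment.
SLOPES = [(math.sin((i + 1) * H) - math.sin(i * H)) / H for i in range(SEGMENTS)]


def interp_sine(theta):
    """Linear interpolation of sin(theta) for 0 <= theta <= pi/2."""
    i = min(int(theta / H), SEGMENTS - 1)          # MSBs select the segment
    return SAMPLES[i] + SLOPES[i] * (theta - i * H)  # LSBs scale the slope


GRID = 4096
max_err = max(
    abs(interp_sine(k * (math.pi / 2) / GRID) - math.sin(k * (math.pi / 2) / GRID))
    for k in range(GRID + 1)
)
print(max_err)
```

Under these assumptions the worst-case error is about 1.2e-3 of full scale, close to the bound H²/8 for chord interpolation of a function whose second derivative is at most 1 in magnitude.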
Its architecture is shown in the following figure:
An N-bit accumulator is used for phase control. The choice of N depends on the required frequency resolution. The m+n+3 MSBs of the accumulator output are used by the subsequent blocks as the phase θ. The three most significant bits of the phase are used to control the generation of the full sine wave, as explained below. The next m bits (the MSBs of the remaining m+n bits of the phase) address the ROM for the sine and cosine samples and the interpolation coefficients. The n least significant bits (the LSBs of the m+n bits of the phase) represent (θ − θi) in the interpolation formula.
The third MSB is used to reduce the ROM address range and to switch between the sine and cosine waves within the first quarter of the sine wave. It is XORed with the m+n LSBs of the phase word. The second MSB is XORed with the third MSB to select (through MUX 1 and 2) between the samples stored in the ROM. The first MSB is used to generate the second half of the wave and serves as the sign bit.
In summary, the relative ROM sizes obtained with the previous compression techniques are compared below (quoted from paper [5]):
Compression Technique / ROM size in normalized bits
Sine/Cosine Symmetry / 39.38
Sunderland’s Technique / 10.46
Nicholas’s Technique / 2.46
Bellaouar’s Technique / 1
- Reducing output bit number of ROM [8].
Yamagishi [8] proposed a method to further reduce the ROM size. His idea is based on the observation that reducing the number of ROM output bits is more effective in decreasing the power level than reducing the number of ROM input words. He gives the power dissipation characteristics of the ROM in the following graph.
For example, when the ROM storage is reduced from 4096 words to 1024 words, that is, the number of input bits is reduced from 12 to 10, the supply current drops from 3 mA to 2.8 mA, only about 7%. However, when the number of output bits is reduced from 8 to 6, the power reduction is about 30%. This is because most of the ROM current is dissipated in the sense amplifiers.
To achieve a small number of output bits, they devised a double-trigonometric-approximation architecture for the sine-wave lookup table. The graph below demonstrates the general idea of this architecture.
This architecture generates the familiar quarter-cycle sine waveform from two triangle waveforms A and B and an approximation error C. Neither a ROM nor complex processing is needed to generate A and B, because the data for waveform A equals a part of the input phase data and the data for waveform B equals a part of the inverted input phase data. These data can be derived from the input data with a few XOR gates, depending on the bit width. Only the approximation error data C must be stored in the ROM, and the required ROM capacity is small, since A and B have been removed from the sine waveform.
This architecture is similar to Nicholas’s architecture, but it aims to reduce both the input and the output bit numbers, as can be seen in the implementation architecture given in the paper.
With this architecture, the ROM output data is reduced by 3 bits. That is, the ROM output bit width is compressed from 9 bits (for the conventional ROM compression technique using the quarter-wave symmetry of a sine wave) to 6 bits, which results in a reduction of more than 30% in ROM power dissipation.
SECTION III: ROM-less Architecture and Implementation
- A ROM-less DDFS architecture with constant Over Sampling Rate
In the conventional DDFS architecture, all ROM addresses are visited only at the minimum output frequency, whereas only two addresses per cycle are used to generate the maximum frequency. Based on this observation, paper [9] proposed an architecture in which a fixed number of sine wave samples is generated during each output cycle, with a time spacing that depends on the frequency being synthesized. The following is the block diagram of the new architecture.
Let’s define over-sampling ratio (OSR):
where W is the input frequency word and N is the number of bits in the accumulator. In this architecture another accumulator (counter) is used to synthesize the sampling interval. Only its overflow signal is used, as an indicator of the sampling interval.
The sampling frequency is given as
$F_s = \dfrac{W \cdot F_{clk}}{2^{N}}$.
The number of samples per cycle is chosen to be $2^s$. The desired output frequency is therefore given as
$F_{out} = \dfrac{F_s}{2^{s}} = \dfrac{W \cdot F_{clk}}{2^{N+s}}$.
Note that in this architecture the Nyquist rate is not a concern, since the accumulator’s overflow signal is not used as an output signal. It is used to increment the counter whose output is the index of the sine wave sample to be generated. A very small ROM block can be used to map the sine amplitude; for example, with an OSR of 4, the ROM has only 8 addresses.
In the above example, since only 8 samples are used, there is no need for a ROM to map the sine wave sample amplitudes. A tailored DAC can generate the sine wave directly. It is equivalent to a three-bit DAC that switches among a pool of eight sample values.
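The behavior described above can be modeled with a short Python sketch. This is a behavioral illustration under stated assumptions, not the circuit of [9]: the accumulator overflow advances a sample counter with 2^s states, and a list of eight ideal sine levels stands in for the tailored DAC. The function name and the values of `N_BITS`, `S_BITS` and `W_WORD` are hypothetical.

```python
import math


def constant_osr_ddfs(w, n_bits, s_bits, n_clocks):
    """Overflow-driven sample counter; returns (output samples, overflow count)."""
    modulus = 1 << n_bits
    n_levels = 1 << s_bits
    # Fixed pool of output levels standing in for the tailored DAC.
    levels = [math.sin(2 * math.pi * k / n_levels) for k in range(n_levels)]
    acc = 0
    index = 0
    overflows = 0
    out = []
    for _ in range(n_clocks):
        acc += w
        if acc >= modulus:                    # accumulator overflow marks a sampling instant
            acc -= modulus
            overflows += 1
            index = (index + 1) % n_levels    # advance to the next sine sample
        out.append(levels[index])
    return out, overflows


N_BITS, S_BITS, W_WORD = 12, 3, 100           # assumed example values
out, overflows = constant_osr_ddfs(W_WORD, N_BITS, S_BITS, 1 << N_BITS)
cycles = overflows / (1 << S_BITS)
print(overflows, cycles)
```

Over 2^N clock periods the accumulator overflows exactly W times, so the output completes W / 2^s cycles, and every cycle is built from the same 2^s samples regardless of W.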
This architecture mainly improves the frequency switching speed and reduces die area, because it is ROM-less and uses a simplified DAC. With the additional accumulator, however, the power dissipation is about the same as for the conventional architecture, because the accumulator accounts for about 50% of the total DDFS power.
- ROM-less DDFS Architecture with non-linear DAC
Paper [6] proposes a design technique that uses a nonlinear digital-to-analog converter (DAC) to replace the ROM look-up table. This ROM-less implementation yields a significant saving in power dissipation. The experimental results for a 3.3 V supply are 4 mW and 94 mW at clock rates of 25 MHz and 230 MHz, respectively.
The new idea is illustrated here. The new architecture with the nonlinear DAC is shown as follows:
The function of the nonlinear DAC is to convert the digital phase information from the phase accumulator directly into an analog sine output voltage. If this proposed architecture requires the same number of bits of amplitude resolution as the conventional ROM based DDFS does, the performance of this architecture will be theoretically identical to the conventional one. The main advantage of paper [6]’s architecture is that it does not require a ROM look-up table. Therefore, the power dissipation will be less than that of a conventional DDFS if the non-linear DAC consumes about the same power as a linear DAC.
First, let’s focus on its implementation of non-linear DAC
where the DAC output vo is a function of the complementor output st(n) and the MSB of the phase accumulator output. Assume that the peak value of the output sine wave is equal to $2^i - 1$, where i defines the number of bits of amplitude resolution of the sine wave.
The integer j represents the number of MSBs output from the phase accumulator that are used as input bits to the nonlinear DAC.