Assignment 9: LPC Analysis, Re-Synthesis and Formant Extraction

Linguistics 582

Basics of Digital Signal Processing

Fall 2007

Assignment 9: LPC Analysis, re-synthesis and formant extraction

Reading: K. Johnson, Sections 2.3.1, 2.3.4, 4.5

Wakita, H. (1976). Instrumentation for the study of speech acoustics. In Lass, N. , Contemporary Issues in Experimental Phonetics, NY: Academic Press., pp. 3-37.
READ:pp. 3-20, leaving out section on Cepstrum Method (pp. 7-10)

pp. 29-33

Exercises:

(1)Record yourself producing a short phrase or sentence (with mostly voiced consonants), using whatever audio software is available on the machine you are using to do this assignment. Save the audio data in WAV format. Read this audio file into MATLAB using wavread:

[myphrase, srate] = wavread(‘myfile’);

Make sure you can play it back using:

soundsc(myphrase, srate);

(2)Write a MATLAB function to extract the LPC filter coefficients and Gain for an input signal.

function [A, G] = get_lpc(signal, srate, M, window_ms, slide_ms)

Output arguments:

Afilter coefficients (M+1 rows x nframes columns)

GGain coefficients (vector length = nframes)

Input arguments:

signalinput signal

sratesampling rate in Hz

MLPC filter order

window_msduration of analysis window (milliseconds)

slide_msnumber of milliseconds between successive analyses (frames)

Create a loop to go through the signal starting from the beginning. On each pass through the loop, analyze a sequence of signal values corresponding to window_ms. A typical value for speech is 20 milliseconds. Analyze that sequence of signal values using the MATLAB function lpc:

[A,G] = LPC(X,M)

A vector of filter coefficients (length M+1; A(1)=1, see Eqs.3,4)

G gain

X input signal

M filter order

Take the real part of the vector of A coefficients, transpose it, and append it as a new column to a matrix that you are building up. Take the real part of G and append to a vector of Gains that you are building up.

On the next pass through the loop, advance the beginning of the analysis window by slide_ms. A typical value for this would be 10ms. Each pass through the loop can be thought of as determining the lpc coefficients for one frame of data. If window_ms is longer than slide_ms (which is usually the case), then the successive frames are overlapping.

One wrinkle. Usually, the analysis window is weighted so it is influenced more by points near the middle of the window, rather than at the ends (to avoid discontinuities at the ends). So before performing the lpc analysis on a window of data, multiply the window of data by a Hamming weighting function, using the MATLAB hamming function. If data is the sequence of data points you wish to analyze, and its length (in samples) is window_pts, then transform the data as follows:

Hamming_data = hamming(window_pts) .* data

Note: the hamming function returns a column vector, so if data is a row vector (which it usually is), you will need to tranpose the hamming window:

Hamming_data = hamming(window_pts)’ .* data

(3)Write a MATLAB function to synthesize a signal using the filter coefficients derived from the an LPC analysis.

function [signal, t] = syn_lpc (srate,f0,frame_dur,A,G)

Output arguments:

signalgenerated signal

ttime value of each sample

Input arguments:

sratesampling rate (Hz)

f0fundamental frequency (Hz)

frame_durduration of a synthesis frame (milliseconds)

Afilter coefficient matrix (M+1 rows x nframes cols)

Ggain vector (length = nframes)

The structure of this function should be almost identical to syn.m that you wrote for Assignment 8. The major difference from syn.m is that the filter coefficients are supplied by the A and G matrices, instead of being computed from formant frequencies. (The A matrix contains the a coefficients of the filter, the denominator of the transfer function, while the b coefficient, the numerator of the transfer function is set to the Gain parameter, G). As in syn.m, use make_buzz to synthesize a source function of the appropriate duration (based on the number of frames and the frame_dur). On each pass through the loop, take a sequence of source samples corresponding to one frame, filter it through the lpc-determined filter and concatenate it with the output of previous frames. Note: the laryngeal filter and the radiation filter are all built into the lpc filter; you do not need to do those steps.

Test your function on the phrase you recorded (with voiced consonants) by analyzing it with get_lpc, and re-synthesizing it with syn_lpc. You should set the value of M for get_lpc based on the length of your vocal tract. For a relatively long vocal tract, there will be approximately one resonance every 1000 Hz, so dividing the Nyquist frequency by 1000 gives the number of formants. Since each formant has two coefficients (poles), this number should be multiplied by 2. Thus, the basic number of coefficients is srate/1000 or (2*Nyquist/1000). Add two more coefficient pairs to that—one for shaping the laryngeal pulse, another for cases (like nasalization) where there are essentially “extra” poles. For a shorter vocal tract, reduce the number coefficients by 2.

(4)Write a MATLAB function that calculates formant frequencies and bandwidths from a vector of filter coefficients, acoef.

function [F, BW] = formants (acoef, srate)

Output arguments:

Fformant frequencies in Hz

BWformant bandwidths in Hz

Input arguments:

acoefvector of a filter coefficients

sratesampling rate

Calculate formant frequencies as the angle of the roots of the filter polynomial. Convert from radians per sample to Hz. Sort into ascending order, and remove all the negative angles. To sort into ascending order, use the MATLAB sort function.

[Y,I] = SORT(X) sorts into ascending order and also returns an index matrix I.

If X is a vector, then Y = X(I).

Calculate the formant bandwidths from the magnitude of the roots. Use the sort index I returned from sorting the formant frequencies to reorder the bandwidths.

(5) Write a MATLAB function to plot the formants from an lpc analysis.

function fplot (signal,sr,A,G)

Input arguments:

signalsignal

srsampling rate in Hz

Afilter coefficient matrix (M+1 rows x nframes cols)

Ggain vector (length = nframes)

Use the formants function you wrote in (4) to calculate the formant frequencies for each frame. Plot the waveform and the first four formants in separate subplots of a figure window. Try to get them time-aligned.

% To plot in the upper of plot of a figure with two rows of plots

% and one colum:

subplot (2,1,1)

plot (…) % your plot command% To plot in the lower one:

subplot (2,1,2)

plot(…) % your plot commannd

It should look something like this: