Stochastic Simulation Notes

by

Kevin D. Donohue

Department of Electrical Engineering

University of Kentucky

Lexington, KY 40506

These notes were developed to provide basic understanding, rules, and practical tricks for obtaining reliable answers from computer simulations, particularly for evaluating the performance of communication and signal processing systems. Performance measures examined include integral performance measures such as mean square error, bias, and variance/efficiency, and probability measures such as probability of detection and false alarm.

Stochastic Simulation:

Purpose:

Stochastic simulations can establish relationships between system inputs, outputs, and parameters in a statistically reliable manner. (Simulations are especially helpful when a closed-form relationship cannot be derived.)

For signal processing and communication systems, simulations can be useful for:

comparing performances of systems under statistically similar conditions

optimizing processing and system parameters

developing an algorithm through identifying its limitations in a given application

.

Components of Simulation Design:

Problem Definition:
Be as quantitative as possible. Identify quantities to be determined. Determine system conditions and inputs affecting these quantities.

System Model:
Develop a system model suitable for computer implementation that includes the interactions of all critical phenomena whose inputs, outputs, and model parameters are consistent with ones in the defined problem.

Experimental Design:
Statistically design/analyze the simulation (i.e. ensemble of runs) to determine the reliability of the estimated quantities from a finite number of runs.

Implementation:
Design programs to implement simulation model. Consider required computing resources (processing time and memory) in developing code so program will finish in a reasonable amount of time.

Example of Monte Carlo Simulation:

The classic example of a Monte Carlo simulation is evaluating a multiple integral. Obviously, the best place to use it is where the dimensions are so high that a direct iterative technique will take too many years to compute and a closed form solution cannot be found. However, for the sake of illustration and comparison, the following example can be solved quite easily by all 3 methods (closed-form solution, iteration, and Monte Carlo).

Problem Definition:

Find the probability that 2 identical and independent Rayleigh distributed signal values, are bounded such that

System Model:

Express probability in terms of Rayleigh distribution and limits indicated in problem statement.

Solution 1 (Exact Solution): Evaluate integral in terms of b and r.

Solution 2 (Numerical Solution): Illustration of surface representing the integrand for b=1.

Develop iterative program to break the double integral down into smaller volumes, using increments along the axes, respectively:


The evaluation of the integrand on a finer the resolution grid results in a more accurate the solution (provided there are no singularities over the region of integration); however, the number of iterations increases as a function of grid resolution according to:

Number of iterations

Solution 3 (Stochastic or Monte Carlo Simulation):

The integration problem can be reformulated as follows. The volume under the integrand's surface equals its mean value times the area of the integration region. (This is true! Think about it for a while, if it is not obvious. You may want to think in terms of some simple one-dimensional examples first, or recall the mean-value theorem from Calculus).

So now the problem can be reformulated in terms of estimating the mean based on a finite number of samples taken from an infinite (continuous surface) population.

In some sense, this is done with numerical approach where the sampling strategy is a uniform grid over the region of integration. For the Monte Carlo approach however, points are chosen at random in the integration region (random sampling). Thus, a critical simulation issue relates to the design of the random experiment (i.e. how reliable is the answer for the given number of samples taken?). This will be addressed later, for now let's compare the 3 solutions for accuracy and efficiency.

Solve Problem for r = 1 and b=1.

Exact solution: Evaluate closed from expression to obtain 0.09020401043105

Numerical solution: Matlab Code:

% The following code numerically evaluates the double integral of:

% x1*x2*exp(-((x1)^2+(x2)^2)/2) over the range 0 < x1^2 + x2^2 < r^2

% It used different increments on the evaluation grid to compare accuracy

% Loop through exponentially decreasing increments for evaluation grid

for k=1:3

d1 = .1^k; % Increment along x1

d2 = .1^k; % Increment along x2

r = 1; % Bound on integration region

true_value = (2*b-2*b*exp(-r^2/2)-r^2*exp(-r^2/(2*b)))/(2*b); % Closed-form solution

probability_value(k) = 0; % Clear value to accumulate

% summation of volumes

% Step through x2 axis until limit reached

n2=0; % Initialize increment on x2 axis

while (n2 <= floor(r/d2))

% Step through x1 axis and evaluate function and accumulate

% function values until limit reached

n1=0; % Initialize increment on x1 axis

while (n1 <= floor((sqrt(r^2-(n2*d2)^2))/d1))

function_value = (n1*d1)*(n2*d2)*exp(-((n1*d1)^2+(n2*d2)^2)/2);

probability_value(k) = probability_value(k) + function_value;

n1=n1+1;

end

n2=n2+1;

end

% Scale by incremental area

probability_value(k) = probability_value(k)*d1*d2

end

% Compute error normalized by the true value

nme = (probability_value-true_value)/true_value;

Results:

Increment / 10-1 / 10-2 / 10-3
Number of Iterations / 70.7 / 7.07103 / 7.07105
Result / 0.08723514188359 / 0.09021074482542 / 0.09020434991276
Percent Error / 3.29 / 7.510-3 / 3.710-4

Stochastic Solution: Matlab Code

% The following code stochastically evaluates the double integral of

% x1*x2*exp(-((x1)^2+(x2)^2)/2) over the range 0 < x1^2 + x2^2 < r^2.

% It generates 10^1, 10^2, … 10^5 random points over the function domain

% to compute the integral for "ntrials" times. The mean and variance of

% the estimate errors are computed to examine the performance/convergence

% properties of this estimator.

r=1; % Radius limit

b = 1; % distribution parameter

ntrials=100; % Number of times for estimate the area for a fixed number of points

true_value = (2*b-2*b*exp(-r^2/2)-r^2*exp(-r^2/(2*b)))/(2*b); % Closed-form solution

% Loop to increase the number of random points used to determine sample mean

for kn = 1:5

% Loop to try stochastic integration many times to statistically access error

for k=1:ntrials;

n=round((4/pi)*10^kn); % Number of random points over r by r region

x2=r*rand(1,n); % Generate random numbers over rectangular region

x1 = r*rand(1,n); % Generate random numbers over rectangular region

% Find those values that fall in the region of integration (sector)

at = find((x1.^2 + x2.^2)<= r^2);

% Evaluate function at those points

psamp=x1(at).*x2(at).*exp(-(x1(at).^2+x2(at).^2)/2);

% Find mean value and scale to estimate volume

probability_value(kn,k)=mean(psamp)*(pi/4)*r^2;

end

end

% Compute Root Mean Square Error and normalize by true value.

rmspe = sqrt(mean((probability_value - true_value),2).^2)/true_value; % mean error

% Compute mean value of all the area estimates

me = mean(probability_value,2);

% Computer the plus/minus 95% confidence limit for the mean values given above

se = 2*std(probability_value,0,2)/sqrt(ntrials);

Results:

Number of random samples / 102 / 103 / 105
Mean Estimate from 100 Trials / 0.09036053355664 / 0.09022315422829 / 0.09019620490268
95% Confidence Interval /  0.0013512814137 /  0.0004370462649 /  0.0000387358233
Root Mean Square Percent Error over 100 trails / 1.735 10-3 / 2.12210-4 / 8.65310-5

Monte Carlo simulation the description of the error is more involved. Probabilities and confidence intervals describe the nature of the error. The error given in the table indicates that we are 95% confident of the true value being within the specified interval around the mean value (the assumption here is the that the error is Gaussian).

Monte-Carlo Integration – Statistical Analysis:

Many statistical simulations compute an expected value, given by:

g(x) may be unknown or known, but intractable or costly to compute

fX(x) is the probability density function of x

The expected value for many common distributions can be estimated from:

where is referred to as the sample mean. The estimate can be shown to be unbiased by taking the expected valve of the estimator:

Estimator efficiency and consistency can be observed through its variance:

Therefore, the variance of the estimate is a function of N and the variance of function g(x) is reduced by a factor of N:

The standard error (or standard deviation) decreases proportionally to

To determine N beforehand for a specified level of precision, var(g(x)) must be known. However, this information is rarely available, so initial estimates of this value have to be obtained. Once a reasonable estimate of var(g(x)) is obtained, the precision expressed in confidence intervals is given by:

True value can be anywhere in interval about with probability .

where if is assumed Gaussian, is computed from:

Note that if is a uniform distribution over the region of integration, the expected value integral for g(x) becomes:

where


Multiply above integral by size of region D(A) to obtain:

Example:

Find the mean-square error (MSE) between the limit spectrum and the estimated spectrum magnitude of transfer function H(f) from the system output driven by white noise (assume H(f) represents a linear time invariant system).

Plot error as a function of frequency resolution for a fixed data length.

Simulation design questions:

What level of precision is required in the result?

What model(s) should be used for H(f)?

What length should the data segments be for performing the spectrum estimate?

What estimators should be used?

What should vary over each run?

How many runs should be made for a given frequency point?

Precision is typically addressed through confidence intervals.

Confidence Interval Estimation

A confidence interval is a random interval whose end points  and  are functions of the observed random variables such that the probability of the inequality


is satisfied to some predetermined probability ():


A useful distribution for mean estimates from a finite number of samples is the Student's t Distribution.

Student’s t Distribution

Let be
with sample mean and sample variance . Then with degrees of freedom has the student’s t distribution

Let and denote values of the distribution with probability density function areas to the left of these values being and , respectively.

By symmetry

Note for =.05, area between t values is 0.95.

Therefore, the probability that the true mean is in a neighborhood of the sample mean is given by:

Comparison of .95 = 1- points for t and Gaussian distributions.

% Generate Degree of Freedom axis

new = [3:500];

% Plot number of standard deviations requared to achieve 95% critical points for t statistics and normal statistics

plot(new,tinv(.975,new),'b',new,norminv(.975,0,1)*ones(size(new)),'r--')

title('Standard deviations for 95% confidence interval (Blue/Solid- t distribution, Red/Dashed- Normal)')

xlabel('Degrees of freedom')

ylabel('Number of standard deviations')

axis([0 500 1 3])

Determination of Sample Size and Stopping Rules

The level of confidence (precision) desired for the output measure ultimately determines the number of runs and sample size.

To determine output variable mean to within units at confidence level 1-, use the following formula:

where , therefore


can be obtained from the student’s t-distribution table.

S is the variance of the integrand and can be approximated from preliminary sample runs while observing the convergence for increasing N as shown in the next example.

Example:

Determine the performance of a PSD-based peak detector that estimates the center frequency of band-pass channel from measurement taken at the output of a channel driven by white noise. The center frequencies for this channel has a lower bound of 100 Hz and an upper bounded of 1000 Hz. The bandwidth of each channel is 20%. The error is defined as the actual center frequency minus the estimate (i.e. )

Simulation design questions:

What level of precision is required in the result?

The precision level is application dependent. Since problem does not say, we can make it a reasonable number. So let there be good precision in the first 2 significant digits. Therefore, set the 95% confidence limits to 0.5% of the smallest center frequency that may occur. In this case it is (100Hz*0.005) = 0.5

What model(s) should be used for the channel?

Since not specified, we will use an IIR filter (Butterworth filter) driven with white noise (one of my favorites).

How long should the data segments be for estimating the spectrum?

The length of the data segment may be related to physical constraints (i.e. lack of stationarity, hardware, throughput demands, resolution requirements, …).

For this problem assume for smallest bandwidth, 20 Hz (at 100Hz center frequency), we want at least 5 independent points (4 Hz resolution), so segment length must be at least (1/4Hz) 0.25 seconds. Note a stochasitc precision/resolution better than 4 Hz will not provide any benefit (Why?).

Sufficient averaging for the PSD estimation, again, depends on the application and SNR. Let's set an arbitrary limit of 10 independent segments. So 2.5 seconds of data must be collected per estimate. With an effective maximum frequency of about 1000 + 3*.2*1000 = 1600Hz, let's sample at 4000 Hz.

What estimators should be used?

Since nothing is specified, let's use a standard time-hopping window approach (Welch's method), which Matlab uses in its PSD function.

What should vary over each run?

For fixed channel characteristic, white noise sequences should be independently generated for each run (measurement).

How many runs should be made for a given frequency point?

Consider the performance measure to be computed:

MSE =

(Bias)

(estimator variance - efficiency)

The precision was originally described for f , so a variance estimate for this value is needed before the relationship between precision and number of independent runs can be determined. So, preliminary runs can be generated to get an idea of the variance magnitude. The worst case will be the broadest bandwidth at 1000 Hz.

Note the other error measures are also dependent on this number. If we want similar precision on all the above quantities, the variance of all the integrand values should be examined. The one that would likely have the highest variance is either f2 or f

Code example:

% This creates an IIR filter to model a band-pass channel, excite it with white

% noise, estimate the PSD from the output, and estimate the center frequency of

% the channel from the maximum peak position on the psd. This will happen "runs"

% number of times and the variance of the estimate will be plot as a function of

% the cumulative number of runs.

runs = 200; % Number of simulation runs (independent estimates)

fs = 4000; % Sampling frequency in Hz

fc = 1000; % Center frequency in Hz

bw = 0.2*fc; % Bandwidth is 20% of center frequency

fu = -(-bw-sqrt(bw^2+4*fc^2))/2; % Find upper frequency limit

fl = fu-bw; % Find lower frequency limit

[b,a] = butter(2, 2*[fl fu]/fs); % Create Butterworth filter

siglength = 2.5; % Length of signal from which to do the estimation

psdlen = round(.25*fs); % Segment length for time-hopping psd window

t = [0:round(2.5*fs)]/fs; % Time axis

% Round up length to next power of 2.

nfft = 2;

while nfft < psdlen;

nfft=nfft*2;

end

% Loop to increase number of trails incrementally and observe variability

% in estimate for increasing trail

ncount = 0; % Initialize counter for each increment in trial size

for n=5:5:runs

n

% Loop to excite channel, estimate and store results for n trials

for k = 1:n

sig = filter(b,a,randn(1,length(t))); % Create Channel output

[p,f] = psd(sig,nfft,fs,hamming(psdlen),floor(psdlen)/2); % Compute PSD

np = find(max(abs(p)) == abs(p)); % Find maximum peak

fest(k) = f(np); % Assign its corresponding frequency to the estimate

end

% Estimate the standard deviation of the output for each pass in loop

% to examine the likelihood of convergence and a reasonable variance

ncount = ncount+1; % Increment index to store standard deviation results

festvar(ncount) = std(fest(1:n)); % Standard deviation of actual estimate

fvarvar(ncount) = std((fest(1:n)-mean(fest(1:n))).^2);%Standard deviation of variance integrand

end

% Plot results, if variability significantly reduces as the trials increase, a reasonable

% standard deviation for the integrand can be obtained

figure(1)

plot(runs*[1:length(festvar)]/length(festvar),festvar)

xlabel('Number of runs')

ylabel('standard deviation')

title('Convergence of standard deviation of the estimate')

figure(2)

plot(runs*[1:length(fvarvar)]/length(fvarvar),fvarvar)

xlabel('Number of runs')

ylabel('standard deviation')

title('Convergence of standard deviation of the estimate`s variance')

Round the standard deviation of the center frequency estimate up to 40 (actually looks like 34 on the graph) and predict the required number of runs:

or

For =.05, plot t(N) and (N/6400)0.5 as a function of N:

The about graph sugests that about 25000 runs will be within the confidence limit requirements.

The same analysis can be done for the variance estimate. If variance magnitude is on the order of 1000, then a confidence interval of 50 would provide precision in the first 2 digits. If variance is rounded up to 1400 (actually around 1150), the predicted number of runs required is:

or

By comparing equations, it can be infered that satisfying the estimate of the center frequency will exceed the requirements for the variance in this case.

So now we can run the simulation to determine bias, variance (for efficiency studies and computing confidence limits), and MSE for an overall error value.

Code Example:

% This will create an IIR filter to model a band-pass channel, excite it with

% white noise, estimate the PSD from the output, and estimate the center frequency

% of the channel from the maximum peak position on the psd. This will happen "runs"

% number of times. The mean value of the estimate for each center frequency is

% computed along with the bias, variance, and mean square error. These values are

% saved to a mat file when finished

runs = 25000; % Number of simulation runs (independent estimates)

fs = 4000; % Sampling frequency in Hz

fca = [100:100:1000]; % Center frequencies in Hz

siglength = 2.5; % Length of signal from which to do the estimation

psdlen = round(.25*fs); % Segment length for time-hopping psd window

% raise to next power of 2.

nfft = 2;

while nfft < psdlen;

nfft=nfft*2;

end

t = [0:round(2.5*fs)]/fs; % Time axis

% This loop updates the center frequency value does estimation

for kf= 1:length(fca)

fc = fca(kf) % Assign center frequency

bw = 0.2*fc; % Band limit is 20% of center frequency

fu = -(-bw-sqrt(bw^2+4*fc^2))/2; % Find upper frequency limit

fl = fu-bw; % Find lower frequency limit

[b,a] = butter(2, 2*[fl fu]/fs); % Create butterworth filter

% Loop to excite channel, do and store estimation

for k = 1:runs

sig = filter(b,a,randn(1,length(t))); % Create Channel output

[p,f] = psd(sig,nfft,fs,hamming(psdlen),floor(psdlen)/2); % Compute PSD

np = find(max(abs(p)) == abs(p)); % Find maximum peak

fest(k) = f(np); % Assign its corresponding frequency to the estimate

end

fce(kf) = mean(fest); % Actual frequency estimate

fcsde(kf) = std(fest); % Standard devation of the estimate

bias(kf) = fce(kf)-fc; % Bias of the estimate

msetot(kf) = mean((fest-fc).^2); % Mean square error of the estimate