ASSIGNMENTS IE306 – SPRING 2007

(All problems must be prepared at home. Those marked with (*) must be handed in and will be checked. Attendance will be checked during the PS!!!!)

PS 1: Random variables, important distributions (due: 27. 2. at 12.00)

1*) The waiting time, in hours, between successive speeders spotted by a radar unit is assumed to follow a continuous random variable with cumulative distribution

F(x) = 0 for x ≤ 0

F(x) = x/(x+2) for x > 0

a) Find the probability of waiting less than 12 minutes between successive speeders.

b)Find the density of the above distribution.

2*) A continuous random variable has density

f(x) = 1 - x/2 for 0 < x ≤ 2

f(x) = 0 else

a) Find the CDF F(x). b) Compute the probability P(0.2 < X ≤ 1.2). c) Find the expectation of X.

3*) Suppose that the probabilities that X = 0, 1, 2, or 3 power failures will strike a certain subdivision in any given year are 0.3, 0.3, 0.2, and 0.2, respectively.

a) Find the probability P(X < 2), where X is the random variable representing the number of power failures striking this subdivision.

b) Find the mean of X. c) Find the variance of X.

4*) If X and Y are independent random variables with expectations μx = 1 and μy = 3 and variances σx² = 4 and σy² = 3,

a) find the expectation and variance of the random variable Z= 2X + 4Y + 3.

b) Find the expectation and variance of Z if X and Y are not independent but have correlation 0.6.

5*) Two random variates X, Y have covariance 2 and variances σx² = 9 and σy² = 4. Calculate their correlation.

6) Among the discrete distributions with nonzero probabilities only for 0 and 1, find the distribution with

a) the minimal expectation; b) expectation equal to 0.8;

c) the minimal variance (2 solutions); d) the maximal variance.

7*) The number of hurricanes hitting the coast of Florida annually has a Poisson distribution with mean 0.9.

a) What is the probability that 2 or more hurricanes will hit the coast in a year?

b) What is the variance of the number of storms hitting the coast?

8*) Assume that a battery has an exponential time to failure distribution with a mean of 50 months.

a) What is the probability that it breaks down in the first year?

b) What is the variance of the time to failure?

9*) It is assumed that IQ scores are normally distributed with a mean of 100 and a standard deviation of 15.

a) What proportion of society has an IQ below 110?

b) A person with a higher IQ than 99.5% of the population is called a genius. Which IQ is necessary to be called a genius?

10) An electronic subassembly has a time to failure modeled by a Weibull distribution with β = 0.5 and α = 2000 hours (ν=0).

a) What is the mean time to failure?

b) What fraction of these subassemblies will fail before 3000 hours?

11) A random variate X having a beta distribution with alpha = 3 and beta = 5 should be transformed to a random variate Y such that the domain of Y is the interval (1; 11).

a) Find the necessary linear transform. b) Find the expectation and variance of Y. c) Find the density of Y.

12*) a) The interarrival times of customers are assumed to be independent and to follow an exponential distribution with mean value 30 seconds. What is the probability that the sum of two interarrival times is smaller than 60 seconds?

b) The weight of a certain kind of apples is known to have a normal distribution with expectation 160 g and standard deviation 25 g. What is the probability that ten randomly (and independently) selected apples have a weight of less than 1400 g ?

c) What is the probability that the sample mean of 1000 randomly selected apples is below 155g?

PS 2: Quantiles, estimation, Poisson process, chi-square goodness of fit test (due: 6. 3. at 12.00)

13*) Find the 90% quantile, the 10% quantile, and the median of the normal distribution with μ = 25 and σ = 3.

14) Find the 90% quantile, the 25% quantile, and the median of the Weibull distribution with β = 0.5, α = 2000 and ν = 0.

15*) Assume that the arrivals to a supermarket follow a Poisson process with rate λ = 120 per hour.

a) Find the expectation and the variance of the interarrival times.

b) Find the probability that in the next minute no customer is going to enter the supermarket.

c) Find the expectation and the variance of the number of customers arriving per minute.

16*) Find the moment estimate of the Binomial distribution for a sample (only containing integers) with

sample mean 8 and sample variance 6.

17) Find the moment estimate of the Beta distribution for a sample of size 200 with sample mean 0.4 and variance 0.01.

18*) For a sample of size 100 we observe the following frequencies: “< 1”: 23; “< 2”: 29; “< 3”: 26; “< 4”: 22.

Is it possible that this sample comes from a uniform distribution on (0,4)?

19*) To see the influence of the sample size on the chi-square test, repeat the test of question 18) with the following changes.

a) Assume a sample size of 1000 and all observed values multiplied by 10.

b) Assume a sample size of 10000 and all observed values multiplied by 100.

20*) A shopkeeper counts the number of customers per day who ask for a certain brand of dog-food.

In the last two months he observed

# of customers asking:   0   1   2   3   4   5   6   7   8   9  10  11
# of days observed:     12  10   7   4   2   2   0   1   1   0   0   1

Perform a chi-square test to check whether it is possible that the number of customers asking per day follows a Poisson distribution.

HINT: To estimate the parameter of the Poisson distribution you can simply calculate the average number per day.

Do not forget to select the classes such that all Ei are >= 5.

PS 3: Covariance and random number generation, discrete random variates (due: 13. 3.)

21*) a) For a sample of size 500 we observe an empirical one-step autocorrelation of 0.2321.

Formulate H0 and H1 for the simple test for independence we have learned in the course.

Calculate the P-value.

b) Is H0 rejected? Can we assume that this sample is iid if we use α = 0.05?

Can we be sure that the result is correct? How large is the error probability?

c) For a sample of size 80 with a one-step autocorrelation of -0.2, answer all questions of a) and b).

22*) Implement the LCG with m=2^31-1 and a = 16807 in EXCEL using the mod function. Make a plot of 1000 pairs generated with that generator for a seed selected by yourself. (Add the printout to your PS answers and state the seed you used.)
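
For comparison with the Excel version, the same recursion x_{i+1} = (a * x_i) mod m can be sketched in C. The function names lcg_next and lcg_to_unit are hypothetical, and the sketch assumes a purely multiplicative generator (no additive constant), since the question gives only m and a.

```c
#include <assert.h>
#include <stdint.h>

#define LCG_M 2147483647ULL   /* m = 2^31 - 1 */
#define LCG_A 16807ULL        /* multiplier a */

/* One step of the multiplicative LCG x_{i+1} = (a * x_i) mod m.
   The product is formed in 64 bits so that it cannot overflow. */
uint32_t lcg_next(uint32_t x)
{
    assert(x > 0 && x < LCG_M);   /* valid states are 1 .. m-1 */
    return (uint32_t)((LCG_A * (uint64_t)x) % LCG_M);
}

/* Map a state to a number in the open interval (0,1). */
double lcg_to_unit(uint32_t x)
{
    return (double)x / (double)LCG_M;
}
```

Calling lcg_next repeatedly, starting from the chosen seed, and converting each state with lcg_to_unit gives the sequence from which the pairs for the plot can be formed.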

23*) Experiment with “baby generator” LCGs with m= 512, c = 1, and different values for multiplier “a”. Implement them in Excel and plot all pairs that could be generated by this generator.

a) Try to verify for which multipliers the maximal period is obtained. Give 3 examples for multipliers with maximal and 3 examples for multipliers with shorter period. State the length of the period.

b) Give two examples of multipliers “a” with maximal period, one with good and one with bad 2-dimensional lattice properties. (Add the printouts of these two lattices to your PS answers.)

c) Why do the plots of questions 22 and 23 look so different?

24*) To generate Binomial random variates in C use:

a) The sequential search method.

b) The guide table method.

Code both methods in C and add the printout of these functions to your PS answers.

c) For a sample of size 1 000 000, for parameters n=100 and p=0.7, calculate the sample mean and sample variance. Compare these values with the theoretical values of that distribution. (State your results!!)

d) What is your experience when comparing the speed of the algorithms of a) and of b)?

PS 4: Random variate generation and CI (due: 19. 3. at 15.00)

25*) A random variate has the density:

f(x) = 1/2 for 0 ≤ x < 1

f(x) = 1 / (2 x^2) for 1 ≤ x

f(x) = 0 else

Use the inversion method to generate variates from that distribution. Write all details of a C function for the above algorithm. (Assume that a function uniform() is available.)

26*) Use the rejection method and the composition method to design a random variate generation procedure for the distribution with density f(x) = x exp(-x^2/2) for x > 0.

a) Write all details of a C function for that algorithm.

Hint: The density has its mode at 1. Use a constant hat for the interval (0,1.7) and an exponential hat with parameter μ = 0.5 for the interval (1.7,∞) .

b) Calculate the expected number of trials necessary in the algorithm of a). (Hint: f(x) is a real density with integral 1).

27*) Construct a random variate generation algorithm similar to the one in 26) for the Gamma(3/2) distribution which has the density: x^(3/2) exp(-x) 4/(3 √π).

Hint: Use Excel to plot the density. Then decide about the two regions. Try to select the parameter μ of the exponential hat for the tail-region such that the hat fits well.

28*) One main output of an inventory simulation are the yearly inventory costs. With n = 1000 independent replications a sample mean of 3250 and a sample standard deviation of 520 is observed.

a) Compute a 95% confidence interval for the mean yearly inventory costs.

b) When presenting this interval to your boss he asks: “ Does your result imply that a year with inventory costs of 4000 is nearly impossible?” What is your answer?

29*) Making 20000 independent runs of a simulation you find out that in 225 cases the output is larger than 100. Construct a 99% confidence interval for the probability that the output of the simulation is larger than 100.

30*) In a simulation you try to estimate the exact probability that the output is larger than 1000. What sample size n is necessary to be sure that the 95% confidence interval for that probability is not longer than 1 % (i.e. +/- 0.5 %)?

31) In a simulation study you are interested in the average number of visitors to a cafeteria that cannot be served till 13.00.

What steps are necessary to determine the number of replications required so that, with 95% probability, the error of the estimate is smaller than 5?

PS 5: Output analysis: Prediction interval, comparing two systems, Bonferroni (due: 19. 3. at 16.00)

32*) One main output of an inventory simulation are the yearly inventory costs. With n = 4000 independent replications a sample mean of 3250 and a sample standard deviation of 320 is observed.

a*) Try to calculate an interval that contains the yearly inventory costs with 90% probability.

b*) What assumptions are necessary so that you can solve a) using just the information given above?

c) How could we calculate the interval of a) from simulation output without the assumptions of b)?

33*) In the simulation of a single system with 1000 repetitions an output was obtained and ordered:

The smallest 35 observations obtained were:

36; 41; 65; 66; 68; 82; 83; 83; 85; 86; 87; 87; 89; 90; 91; 92; 94; 94; 95; 97; 98; 99; 101; 101; 102; 103; 107; 107; 108; 109; 110; 110; 112; 113; 114

Find a 95% confidence interval for the 2% quantile q(0.02) of the distribution of the output.

34*) To compare the performance of two different systems you make a pilot study with two dependent samples of size 100 using common random numbers.

For system A you observe x-bar = 13.5 .

For system B you observe x-bar = 14.1 .

For the standard deviation of the differences of the pairs we find sD = 1.9 .

a) Compute a confidence interval for the difference of the mean values using the results of the pilot study.

b) Using the result of a) what is your answer to the question: Is the mean output of system A larger than the mean output of system B?

35*) Compute the necessary sample size (equal for both systems) to obtain a 95% confidence interval for the difference of the mean values which has a half length not longer than +/- 0.05.

36*) To compare the performance of two different systems you make a pilot study with two independent samples of size 100.

For system A you observe x-bar = 13.5 and s = 2 .

For system B you observe x-bar = 14.1 and s = 7 .

a) Compute a confidence interval for the difference of the mean values using the results of the pilot study.

b) Using the result of a) what is your answer to the question: Is the mean output of system A larger than the mean output of system B?

37) Try to find the sample sizes for system A and system B such that the confidence interval is not longer than 0.2 and the sum of the two sample sizes is minimal.

38) Do you think that the experimental design of 34) or of 36) is better? Why? When can we expect that using common random numbers will lead to a better design?

39) In a simulation study we compare the performance of 4 different systems with the performance of the current system.

When computing 95% confidence intervals for the differences between the current system and all new systems find an upper bound for the probability that at least one of the 4 confidence intervals is wrong.

Comment: We can call this probability “overall error probability” or overall α.

40*) a) Do the same as above for 20 different 95% confidence intervals.

b) Find the exact probability that at least one of the 20 confidence intervals is wrong if you assume that the confidence intervals are calculated using independent samples.

c) What can we do to correct the problem of a) . What kind of CIs should we calculate? What is the probability that at least one of the CIs is wrong then?

PS 6: Implementing Simulations (due: Tue 18.4. at 24.00) Qu 41) has double weight, 42) quadruple weight!!

Write your own code!! Do not use my code given in previous years!!

Remember question 26): Use the rejection method and the composition method to design a random variate generation procedure for the distribution with density f(x) = x exp(-x^2/2) for x > 0.

a) Write all details of a C function for that algorithm.

Hint: The density has its mode at 1. Use a constant hat for the interval (0,1.7) and an exponential hat with parameter μ = 0.5 for the interval (1.7,∞) .

b) Calculate the expected number of trials necessary in the algorithm of a). (Hint: f(x) is a real density with integral 1).

41*) Implement the generator for the distribution of question 26).

a) Code the random variate generator function rand_f().

b) Test it with the chi-square test with 1000 classes and sample size 1 000 000.

Hint: The CDF of the density is 1-exp(-t^2/2) .

Hint: For α=0.05 you can accept H0 if the Chi square value of your test is below 1072.53.

42*) Implement an inventory simulation in C or C++ using the event scheduling approach.

Assume that every week has 5 working days of 10 hours each. Demand always occurs in single units, and the demand times follow a Poisson process, i.e. the inter-event times are exponentially distributed with mean value 1.5 hours. The lead times follow a uniform distribution with 10 to 40 hours.

No back-orders are possible (thus no negative inventory is possible.)

At the start we have an inventory of 50. The simulation runs for 100 days. The main output is the total inventory cost.

The maximal inventory level is 100. The inventory costs consist of the fixed ordering costs of 50 per order and of the cost of lost sales equal to 20 for each lost sale.

We are using an (r,S) policy.

Every day in the morning (thus at clock times 0, 10, 20, 30, …) you check the inventory level. If it is below r, place an order “up to” S. The real aim of this simulation would be to find good values for r and S that lead to low costs. But here you can just assume that r is 40 and S is 100.

Note that for deciding if you should order you must not just consider the current inventory level but you must also consider the “incoming orders”. Thus you have to keep track of these incoming orders.
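
The ordering rule described above might be sketched as follows. This is one possible reading, not a reference solution: it assumes that both the comparison with r and the “up to” S quantity are based on the inventory on hand plus the incoming orders, and the function name order_quantity is hypothetical.

```c
#include <assert.h>

/* Order quantity under an (r,S) policy, based on the inventory
   position (units on hand plus units ordered but not yet delivered).
   Returns 0 when no order should be placed. */
int order_quantity(int on_hand, int on_order, int r, int S)
{
    assert(on_hand >= 0 && on_order >= 0 && r <= S);
    int position = on_hand + on_order;
    return (position < r) ? S - position : 0;
}
```

With r = 40 and S = 100, an inventory of 30 and no incoming orders would give an order of 70, while an inventory of 10 with 60 units already on order would give no new order.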

Define a structure event containing the event time and type. Code a very simple FEL (e.g. an array is OK, as the FEL is always very short).
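
A minimal array-based FEL could look like the sketch below. The capacity FEL_MAX, the integer type codes and the exact function signatures are assumptions made for illustration; only the names event, put_on_FEL and get_from_FEL come from this question.

```c
#include <assert.h>

#define FEL_MAX 32            /* assumed capacity; the FEL stays short */

struct event {
    double time;              /* clock time at which the event occurs */
    int    type;              /* event type code (hypothetical numbering) */
};

static struct event fel[FEL_MAX];
static int fel_size = 0;

/* Append an event at the end; the time order is resolved on retrieval. */
void put_on_FEL(double time, int type)
{
    assert(fel_size < FEL_MAX);
    fel[fel_size].time = time;
    fel[fel_size].type = type;
    fel_size++;
}

/* Remove and return the event with the smallest time (linear search). */
struct event get_from_FEL(void)
{
    assert(fel_size > 0);
    int best = 0;
    for (int i = 1; i < fel_size; i++)
        if (fel[i].time < fel[best].time)
            best = i;
    struct event e = fel[best];
    fel[best] = fel[fel_size - 1];   /* fill the gap with the last entry */
    fel_size--;
    return e;
}
```

An unsorted array with a linear search is chosen here because the list holds only a handful of events at any time; a sorted list or heap would pay off only for a much longer FEL.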

a*) Be sure to code the functions: put_on_FEL, get_from_FEL, check_if_order, demand_occurs, order_arrival.

b*) For verification make a trace! Check that the simulation really works correctly!! Include a file of your simulation with and without trace!

c*) The output of the simulation should contain at least: r and S, total number of orders placed, number of orders that arrived, total number of demands that occurred, number of lost sales, total costs.

d) Try to find values of r and S that lead to lower costs.

e) What is the distribution and the mean value of the daily demand?

43*) Use C or C++ to code the simulation of the very simple queuing network below:

Patients enter into an emergency hospital 24 hours a day. The input process is a Poisson process with a constant rate of 3 patients per hour. They are treated in an emergency room by a doctor (service time follows a Gamma 2 distribution with an expectation of 10 minutes). After a “rest time” of 30 minutes they enter into the queue for the doctor again with a probability of 20 percent.

Start the simulation with an empty system and simulate 20 days. As output find the maximal number in queue.

Include an output file that includes the output of your simulation for one run of 20 days and a detailed trace for the first 30 patients.

HINT: A Gamma 2 distribution with expectation 10 has a value of λ=0.2 . So to generate random variates from that distribution you can generate variates from the Gamma(2) distribution with λ=1 (using for example the sum of two independent exponential variates) and multiply the result by 5.

You can use the functions put_on_FEL and get_from_FEL from the question above. Be sure to code at least the functions: arrival, end_of_service.

Please mail a zip-file (as name use your surname) containing all files to