Math 419 (or 592) Sample Homework for Chapter 2

Andrew Ross

(I received assistance from Siméon-Denis Poisson)

Problem 2.16

(note: I am doing a somewhat different problem than the book is doing, because this is a sample homework and not a solution key)

FlyByNite Airlines wants to fly their planes with as many full seats as possible, so they engage in overbooking. However, they also want to limit the number of people who get bumped because too many ticket-holders arrive for the flight. We are considering a plane with a seating capacity of 100 people, and we suppose that each person has a 90% probability of showing up, and that each person acts independently of all the others. We want to know: how many tickets should we sell so that Pr(no bumped passengers) >= 60%, but as close to 60% as possible?

Define “n” as the number of tickets we sell, and define X as the number of people who show up for the flight. X has a Binomial distribution with parameters n and p = 0.90. We want to choose n so that Pr(X <= 100) >= 60%. A clear lower bound on our final number of tickets is the plane capacity of 100: if we sold only 100 tickets, we would have Pr(no bumped passengers) = 100%. We can increase n from 100 and watch the probability fall toward 60%.

We can approximate n as follows: approximate the Binomial with a Normal (denoted Y), whose mean is n*p and whose variance is n*p*(1-p). We want Pr(Y<=100) >= 60%. The Standard Normal CDF hits 60% at a z-score of 0.2533. Thus, we set

(100-n*p)/sqrt(n*p*(1-p)) = 0.2533, and solve for n to get approximately 110.23.
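If you substitute s = sqrt(n), this equation becomes an ordinary quadratic, p*s^2 + 0.2533*sqrt(p*(1-p))*s - 100 = 0, which the quadratic formula handles directly. Here is a quick sketch of that computation in SciLab (the tool I use for the newsboy problem below); the variable names are my own:

p = 0.9; cap = 100; z = 0.2533;
// (cap - n*p)/sqrt(n*p*(1-p)) = z  becomes  p*s^2 + z*sqrt(p*(1-p))*s - cap = 0
a = p;
b = z * sqrt(p*(1-p));
c = -cap;
s = (-b + sqrt(b^2 - 4*a*c)) / (2*a);   // take the positive root
n_approx = s^2                          // about 110.2, matching the 110.23 above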

To compute the answer more exactly, we use the Binomial CDF function “binomdist” in Excel, and try various values of n:

pval   cutoff
0.9    100

n value   E[X]    Std(X)   Pr(X<=cutoff)
105        94.5   3.1      0.983
106        95.4   3.1      0.960
107        96.3   3.1      0.919
108        97.2   3.1      0.857
109        98.1   3.1      0.773
110        99.0   3.1      0.671
111        99.9   3.2      0.559
112       100.8   3.2      0.446

(The E[X] and Std(X) columns are not needed to compute Pr(X<=cutoff), but they are nice to see.) Thus, we see that the largest value of n that keeps the bumping probability low enough is 110. Here are the formulas behind the spreadsheet:

n value   E[X]        Std(X)                      Pr(X<=cutoff)
105       =D16*pval   =SQRT(D16*pval*(1-pval))    =BINOMDIST(D16,cutoff,pval,FALSE)

I obtained this display by pressing CONTROL-` (that’s the same key as CONTROL-~). Note: to keep some of the excitement in this problem, the BINOMDIST formula shown just above has some bugs deliberately introduced into it, but the spreadsheet shown with the numbers is correct.

Also note: the number of decimal places in the E[X], Std(X), and Pr(X<=cutoff) columns has been adjusted to be reasonable, neither too many nor too few. This is part of presenting your work professionally.
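If you would like to double-check these numbers outside of Excel, here is a small SciLab sketch that builds the Binomial CDF one term at a time; the function name binom_cdf is just my own choice, not a built-in:

function P = binom_cdf(cutoff, n, p)
    // Pr(X <= cutoff) for X ~ Binomial(n, p),
    // accumulating Pr(X = k) term by term
    term = (1 - p)^n;                                      // Pr(X = 0)
    P = term;
    for k = 1:cutoff
        term = term * ((n - k + 1) / k) * (p / (1 - p));   // Pr(X = k) from Pr(X = k-1)
        P = P + term;
    end
endfunction

binom_cdf(100, 110, 0.9)   // should be about 0.671, matching the table
binom_cdf(100, 111, 0.9)   // should be about 0.559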

Problem 2.30

Just like above, I expect some sentences describing why you’re doing what you’re doing, not just a sequence of formulas.

The Pareto Problem

Instead of doing it for Pareto RVs, I will do it for Exponentials.

An Exponential RV has the following CDF: F(x) = 1-exp(-lambda*x) for x>0

Part 1:

The PDF is found by … (fill in this part yourself), and we get f(x) = lambda*exp(-lambda*x). Here are graphs of the PDF for various values of lambda:

The moments are as follows:

lambda   Mean   StdDev
0.5      2      2
1        1      1
2        0.5    0.5
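(Where do these numbers come from? A quick worked version, in case it helps: E[X] = integral from 0 to infinity of x*lambda*exp(-lambda*x) dx = 1/lambda, and E[X^2] = 2/lambda^2, so Var(X) = 2/lambda^2 - (1/lambda)^2 = 1/lambda^2 and StdDev(X) = 1/lambda. That is why the Mean and StdDev columns match in every row.)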

Unlike the Pareto problem, there is no trick question here.

Part 2:

We start with U = F(X) = 1 - exp(-lambda*X).

Move the exp() term to the left and the U to the right to get

exp(-lambda*X) = 1 - U

Take natural logs on both sides to get

-lambda*X = ln(1-U)

Divide by -lambda to get

X = -ln(1-U)/lambda

Since 1-U has the same Uniform(0,1) distribution as U itself, we may shortcut 1-U with U, so we get

X = -ln(U)/lambda
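As an aside, the same inversion is only a few lines in SciLab (which we use for the newsboy problem below); this is a sketch with my own variable names:

lambda = 0.5;
U = rand(1000, 1);        // 1000 Uniform(0,1) draws
X = -log(U) / lambda;     // invert the Exponential CDF; log() is the natural log
[mean(X), stdev(X)]       // theory says both should be near 1/lambda = 2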

Part 3:

I simulated 1000 variables in Excel. Here is the top of the spreadsheet:

lambda              Mean:     predicted   2
0.5                           observed    1.910218
                    StdDev:   predicted   2
                              observed    2.012971

Sample#   Unif(0,1)   Expon.
1         0.975257    0.050109
2         0.302133    2.393773

The formula in cell C10 is =-LN(B10)/lambda. The predicted mean and standard deviation are indeed achieved (within a reasonable tolerance), so we have some confidence that our implementation of the CDF inversion is correct. Our histogram also looks reasonably close to the Exponential PDFs graphed above.

Part 4:

I simulated W=X1+…+X10 using Excel, as follows. The formula in cell B10 is =-LN(RAND())/lambda, and the other cells are similar. Note that I have skipped the step where I put the U(0,1) variable in its own cell; I just include it in the inversion formula directly.

lambda
0.5

Sample#   Exp1       etc   Exp10      sum
1         1.084901   etc   1.468569   9.508652
2         0.084185   etc   3.022312   15.96042

I did 1000 samples. The predicted mean is 10*(1/0.5)=20, and the observed mean for one set of data is 19.86, which is a fairly good match. The predicted standard deviation is sqrt(10*(1/0.5)^2)=6.32, and the observed standard deviation was 6.30 for the same set of data as above. These match the theory fairly well. Here is the corresponding histogram:

This does appear mostly bell-curve-shaped, though it is a little skewed to the right (the histogram is higher on the left, with a longer tail on the right).
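If you want a second check on the Excel results, the same experiment is nearly a one-liner in SciLab; again, a sketch with my own variable names:

lambda = 0.5;
W = sum(-log(rand(1000, 10)) / lambda, 'c');   // each row sums ten Exponential draws
[mean(W), stdev(W)]                            // theory: 20 and sqrt(40) = 6.32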

The Newsboy Problem

Instead of showing you how to do this in Excel, I will show how to do it in SciLab.

I start with

xvec = 10:20;
probvec = [.05 .05 .1 .1 .1 .2 .1 .1 .1 .05 .05];
retail = 0.50;
wholesale = 0.10;
salvage = 0.05;

Part (a):

Using

mymean = sum(xvec .* probvec);

we get 15 as the expected demand. This is not surprising, as the demand distribution is symmetric around 15.

Part (b):

Using

rawmoment2 = sum(xvec .* xvec .* probvec);
myvar = rawmoment2 - mymean^2;
mystd = sqrt(myvar);   // the standard deviation is the square root of the variance

we get approximately 2.63 as the standard deviation.

Part (c):

Using

bought = 14
demand = 12
// note: the next three lines have some DELIBERATE bugs, to make you figure out what they should do.
sold = max(bought, demand)
leftover = min(bought - demand, 0)
profit = -bought*wholesale + sold*wholesale + leftover*salvage

we get $4.70 as the resulting profit (that is the real answer, not the one that the buggy code gives).

Part (d):

Using the same code but with demand=17, we get $5.60 as the resulting profit.

Part (e):

Now we set the “demand” variable to be a vector:

demand = xvec;

and then run the same code, to get a vector for profit:

Demand     10     11     12     13     14     15     16     17     18     19     20
Profit $   3.80   4.25   4.70   5.15   5.60   5.60   5.60   5.60   5.60   5.60   5.60

Then we take the expected value in the same way we did above,

Eprofit = sum(profit .* probvec)

giving a resulting expected profit of $5.3075. It is not necessary to round this to pennies.
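(As a hand check: Eprofit = 3.80*0.05 + 4.25*0.05 + 4.70*0.10 + 5.15*0.10 + 5.60*(0.10+0.20+0.10+0.10+0.10+0.05+0.05) = 0.19 + 0.2125 + 0.47 + 0.515 + 3.92 = 5.3075, which matches.)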

Part (f):

Now we do a “for” loop over the various amounts that he could buy:

buyvec = xvec;
for ni = 1:length(buyvec)
    bought = buyvec(ni);
    demand = xvec;
    sold = // etc, as above; you should retype it here once you fix the bugs.
    Eprofit(ni) = sum(profit .* probvec);
end

disp('Column 1: Candidate Amount to Buy')
disp('Column 2: Expected profit')
[buyvec(:), Eprofit(:)]

I will not show the results here, to leave some mystery for you.

Part (g):

Now we don’t just ask for the “max” of the Eprofit vector, we also ask for the index of the maximum element:

[m, k] = max(Eprofit);
buyvec(k)
disp('With a resulting expected profit of')
m

And again, I will leave it to you to determine the results.

Problem 2.46 (for grad students)

As you do this problem, try doing a small example: suppose that it turns out that X=3. Then, what are the values of the various indicator variables I_n?

Things To Note (commentary on the sample homework)

This is just a sample homework to show you some suggested formats. Things to take note of:

Please write Math 419 or 592, whichever class you’re enrolled in.

Make the filename include your last name, e.g. ross-ch02hw.doc

Give your name at the top of the homework as well. Yes, some people forget to do this.

Include a quick note of acknowledgement to people who helped you with your work, as above. Try to avoid a lot of inter-team help, though.

A title page is not necessary, and is pretty much a waste of paper.

The problems are done in the sequence they were assigned.

Much of the homework is narrative, rather than just a collection of formulas and graphs without any explanation.

And now for some common mistakes that people make when they’re doing graphs, especially in Excel. Consider the following graph:

And its much-improved version:

What are the differences?

The y-axis doesn’t have so many superfluous zeros.

The labels are much more informative.

Utilizations above 1.0 are meaningless in this situation, so our x-axis doesn’t go above 1.0 (though in other situations, utilizations above 1.0 might be acceptable).

The grey background (a waste of laser printer toner) has been eliminated.

In the first graph, there’s a very large data point. The y scale of the graph has to stretch so much that you can’t see what’s going on with the other data points. In the second graph, that last point has been removed. This is a case where less data can convey more information.

Another problem I’ve seen occur is when someone uses the “Line graph” chart type in Excel, when they should use the X-Y chart type. Here’s an example:

Notice how the line is not convex: it increases, levels off for a while, then increases again. In many stochastics applications, this would be very surprising behavior, but here it is only an artifact of the poorly chosen chart type. Here, Excel is treating “.8” as a category and “.75” as another category, as if we were graphing sales of apples, oranges, donuts, etc., where the categories have no quantitative relationship to each other. If we had chosen the X-Y chart type, it would have given us a properly spaced scale, and the graph would look convex as it should.

Another mistake that sometimes occurs is when someone uses a smoothed-curve interpolation for a graph, but the data happen to fall in such a way that the curve is not increasing everywhere even though the true graph should be increasing:

If you do a graph and see that behavior, you should switch to using the non-smoothed, ordinary linear interpolation.

Part of this course is learning to present your work in a professional way. Therefore, I might take off points for any of the above graphing mistakes.