Exercises are numbered.
An urn contains three chips: one black, one green and one red. We draw one chip at random. Give a sample space \(\W\) and a collection of events for this experiment.
Prove \(A \setminus (B \cup C) = (A \setminus C) \setminus B\).
Give a sample space for rolling two fair dice. Find the probability of rolling a sum between (inclusive) 3 and 5.
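A quick sanity check (my own sketch, not from the text) that enumerates the 36 equally likely outcomes:

```python
from fractions import Fraction
from itertools import product

# Sample space: ordered pairs (d1, d2) from two fair dice, 36 equally likely.
omega = list(product(range(1, 7), repeat=2))
favorable = [w for w in omega if 3 <= sum(w) <= 5]
p = Fraction(len(favorable), len(omega))
print(p)  # 2 + 3 + 4 = 9 favorable outcomes, so 9/36 = 1/4
```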
A 5-digit number is chosen uniformly at random between 00000 and 99999. Find the probability that exactly one digit is smaller than 3.
Ordered selection with replacement: our first counting principle. You have \(n\) distinct things. This counts the number of ways you can pick \(k\) things with replacement:
\[
n^k
\]
Ordered selection without replacement: permutations. The number of ways to pick \(k\) things out of \(n\) distinct things is
\[
\vphantom{P}^nP_k = \frac{n!}{(n-k)!} = n(n-1)\cdots (n - k + 1).
\]
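Both counting rules can be spot-checked with Python's `math` module (the values \(n = 5\), \(k = 3\) are my arbitrary choice):

```python
from math import factorial, perm

n, k = 5, 3
print(n ** k)      # ordered with replacement: 5^3 = 125
print(perm(n, k))  # ordered without replacement: 5!/(5-3)! = 60
# perm(n, k) agrees with the factorial formula n!/(n-k)!
assert perm(n, k) == factorial(n) // factorial(n - k)
```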
Unordered selection without replacement: combinations. \(n\) distinct things, \(k\) choices, but order doesn't matter; this shows up when organizing people into groups, for example. The count is
\[
\binom{n}{k} = \frac{n!}{k!(n-k)!}.
\]
Combinatorial identities
\[
\binom{n}{k} = \binom{n}{n-k}, \quad \binom{n}{k} = \binom{n-1}{k-1} + \binom{n-1}{k}
\]
The second identity is Pascal's rule and can be remembered using Pascal's triangle; the first follows by symmetry, since choosing which \(k\) to include is the same as choosing which \(n-k\) to exclude.
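Both identities are easy to verify numerically; a small sketch, with the grid of \(n, k\) values chosen arbitrarily:

```python
from math import comb

# Check symmetry and Pascal's rule over a small grid of (n, k).
for n in range(1, 12):
    for k in range(n + 1):
        assert comb(n, k) == comb(n, n - k)                       # symmetry
        if 1 <= k <= n - 1:
            assert comb(n, k) == comb(n - 1, k - 1) + comb(n - 1, k)  # Pascal
print("identities hold")
```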
Binomial theorem
\[
(x+y)^n = \sum_{j=0}^n \binom{n}{j} x^j y^{n-j}
\]
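A numerical spot-check of the binomial theorem (the values of \(x\), \(y\), \(n\) are arbitrary):

```python
from math import comb

x, y, n = 3, 5, 7
lhs = (x + y) ** n
rhs = sum(comb(n, j) * x ** j * y ** (n - j) for j in range(n + 1))
assert lhs == rhs
print(lhs)  # 8^7 = 2097152
```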
Roll a fair die then toss a coin the number of times shown on the die. What is the probability that all coin tosses result in heads?
You have the numbers between \(\{0,\ldots,9\}\). How many 8 digit numbers are there? Is this ordered selection or unordered selection? Is this with or without replacement?
Find the probability that a five-card poker hand will be four of a kind, that is, four cards of the same value and a fifth card of a different value.
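For reference, the exact count can be sketched in a few lines, using the standard factorization \(13 \cdot \binom{4}{4} \cdot 48\) (choose the value of the quadruple, take all four suits, then any remaining card):

```python
from fractions import Fraction
from math import comb

# 13 values for the quadruple, comb(4, 4) = 1 way to take all four suits,
# then any of the remaining 48 cards as the fifth card.
hands = 13 * comb(4, 4) * 48
total = comb(52, 5)
p = Fraction(hands, total)
print(p)  # 624/2598960 = 1/4165
```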
Law of total probability
\[
\Prob(A) = \Prob(A \cap B) + \Prob(A \cap B^c)
\]
Definition of conditional probability
\[
\Prob(A | B) = \frac{\Prob(A \cap B)}{\Prob(B)}
\]
Bayes' rule: it lets you "invert" conditional probabilities
\[
\Prob(B | A) = \frac{\Prob(A | B) \Prob(B)}{\Prob(A|B) \Prob(B) + \Prob(A|B^c) \Prob(B^c)}
\]
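A minimal sketch of Bayes' rule as a function; the function name and the numeric values are my own (hypothetical) choices, used here to preview the two-coin problem below, whose answer is \(r/(p+r)\):

```python
def bayes(p_a_given_b, p_b, p_a_given_bc):
    """P(B | A) from P(A|B), P(B), P(A|B^c), via the law of total probability."""
    p_a = p_a_given_b * p_b + p_a_given_bc * (1 - p_b)
    return p_a_given_b * p_b / p_a

# Hypothetical two-coin setup: each coin chosen with prob 1/2,
# heads probabilities p = 0.6 and r = 0.3; P(coin 2 | heads) = r / (p + r).
p, r = 0.6, 0.3
print(bayes(r, 0.5, p))  # ~ 1/3
```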
There are two coins on the table. The first tosses heads with probability \(p\) and the second with probability \(r\). You select one at random, flip it, and get heads. What is the probability that the second coin was chosen?
A fair die is rolled. If the outcome is odd, a fair coin is tossed repeatedly. If the outcome is even, then a biased coin with \(p\) probability of heads is tossed repeatedly. If the first \(n\) throws result in heads, what is the probability that the fair coin is being used?
Of the voters of a town, 46% consider themselves independents, whereas 30% consider themselves democrats and 24% republicans. In a recent election, 35% of the independents, 62% of the democrats and 58% of the republicans voted.
What proportion of the population actually voted?
A voter is picked at random. Given that they voted, what is the probability that they are independent? What is the probability that they are a democrat?
Do all problems from chapter 8. I think conditional probability is really important to understand well!
If \(A\) is independent of \(B\) then
\[
\Prob(A \cap B) = \Prob(A) \Prob(B)
\]
If \(A\) is independent of itself, then it must have probability \(0\) or \(1\).
If \(n\) events \(A_1,\ldots,A_n\) are independent, then the product condition has to be checked for every subcollection \(A_{i_1},\ldots,A_{i_k}\) with \(2 \leq k \leq n\):
\[
\Prob(A_{i_1} \cap \cdots \cap A_{i_k}) = \prod_{j=1}^k \Prob(A_{i_j})
\]
Gambler's ruin formula. In a fair game, if you start with \(n\) dollars, then the probability that you end up with \(n + N\) dollars before losing your fortune is
\[
\Prob( \text{Make $n + N$ dollars before ruin} ) = \frac{n}{n + N}
\]
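A Monte Carlo sketch of the formula, assuming the fair ±1-dollar game where it holds (the trial count and seed are my choices):

```python
import random

def ruin_prob(n, N, trials=20000, seed=1):
    """Estimate P(reach n + N before 0) for a fair +/-1-dollar game started at n.
    Theory for the fair game: n / (n + N)."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(trials):
        x = n
        while 0 < x < n + N:
            x += 1 if rng.random() < 0.5 else -1
        wins += (x == n + N)
    return wins / trials

print(ruin_prob(3, 7))  # should be near 3/10
```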
CDF. This is the general object that works for both discrete and continuous random variables.
The CDF completely describes the behavior of a random variable. It always exists, but the pmf or pdf might not.
Recall standard normal density and distribution.
Let \(\W = \{1, \ldots, n\}\) and let \(X(k) = k^2\) for all \(k \in \W\), where each \(k\) is equally likely. Find \(F_X\) and draw a graph.
The probability density function of an exponential random variable \(X \sim \Exponential(\lambda)\) is \(\lambda e^{-\lambda x}\) for \(x \geq 0\). Find the tail \(T(x) = 1 - F(x)\) of \(X\).
Do exercise 13.1 from the textbook. Is it discrete or continuous? Important.
Let \(f(x) = \frac{1}{2} e^{-|x|}\). Compute the probability of the following event
\[
\{ e^{\sin(\pi X)} \geq 1 \}
\]
Is there a value of \(c\) that makes \(f(x) = \frac{c}{1 + x^2}\) a probability density function?
Support of a continuous random variable: the set on which \(f_X \neq 0\). For example, the support of \(\Uniform([a,b])\) is the interval \([a,b]\).
Review theorems 18.3 and 19.2: If \(Y = g(X)\) and \(X\) is discrete, then the pmf of \(Y\) is
\[
\Prob(Y = y) = \sum_{x : g(x) = y} \Prob(X = x)
\]
Let \(X\) be continuous with support \(D\). If \(g\) is continuously differentiable and one-to-one with inverse \(g^{-1}\), then \[
f_Y(y) = \begin{cases}
f_X(g^{-1}(y)) \left| \tfrac{dg^{-1}(y)}{dy} \right| & y \in g(D) \\
0 & \text{otherwise}
\end{cases}
\]
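A simulation sketch of the change-of-variables theorem; the example \(g(x) = e^x\) with \(X \sim \Exponential(1)\) is my choice, not from the text:

```python
import random
from math import exp

# My example: X ~ Exponential(1), g(x) = e^x (increasing, one-to-one).
# The theorem gives f_Y(y) = f_X(log y) * (1/y) = 1/y^2 for y >= 1,
# hence F_Y(y) = 1 - 1/y, which we check by simulation.
rng = random.Random(0)
ys = [exp(rng.expovariate(1.0)) for _ in range(50000)]
for y in (1.5, 2.0, 4.0):
    empirical = sum(v <= y for v in ys) / len(ys)
    assert abs(empirical - (1 - 1 / y)) < 0.015
print("change-of-variables check passed")
```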
For complicated cases, you must compute the cdf and differentiate it to find the density. If \(g(x) = x^2\), for example, \(g\) is not one-to-one on \(\R\) and the above theorem does not apply.
If \(X\) is \(N(0,1)\) a standard normal, find the pdf (or density function) of \(Y = \mu + \sigma X\).
If \(X\) is Cauchy, with density \(f_X(x) = \frac{1}{\pi(1+x^2)}\) for all \(x \in \R\), then find the density of \(Y = X^2\). Important.
We throw a ball from the origin with velocity \(v_0\) and an angle \(\theta\) with respect to the \(x\)-axis. We assume \(v_0\) is fixed and \(\theta\) is uniformly distributed on \([0,\pi/2]\). We denote \(R\) the distance at which the object lands. Find the probability density function of \(R\), where
\[
R = \frac{v_0^2 \sin(2 \theta)}{g}
\]
We roll two fair dice. Let \(X_1\) be the smallest and \(X_2\) be the largest of the two outcomes. Find \(f_{X_1,X_2}\), find the marginal \(f_{X_1}\). Are \(X_1\) and \(X_2\) independent?
Do example 23.4 from your textbook that we did in class.
Suppose \((X,Y)\) is distributed uniformly in the circle of radius \(1\) about \((0,0)\).
Do exercise 24.3 and 24.4
Review example 27.3 with the St. Petersburg paradox.
Do exercise 27.1. In Vegas, a roulette wheel has 38 pockets, namely 18 black pockets, 18 red pockets, and a 0 pocket and a 00 pocket. If you bet 1 on black, you get 2 if the ball stops in a black pocket and 0 otherwise. Let \(X\) be your profit. Compute \(\E[X]\). Note that if you repeatedly bet 1 each time you play roulette, you are playing Gambler's ruin.
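With the 38-pocket wheel, the expectation is a one-liner; a sketch using exact fractions:

```python
from fractions import Fraction

p_black = Fraction(18, 38)                # 18 black pockets out of 38
EX = 1 * p_black + (-1) * (1 - p_black)   # profit is +1 or -1 on a 1-dollar bet
print(EX)  # -1/19
```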
Compute expectation of a normal random variable \(N(\mu,\sigma^2)\).
Compute expectation of a Cauchy random variable.
Compute \(\E[X(X-1)]\) of a Poisson random variable by differentiation.
Compute the expectation of a Gamma random variable. Recall \(\Gamma(1,\alpha)\) has density function
\[
f_X(x) = \frac{1}{\Gamma(\alpha)} x^{\alpha - 1} e^{-x}, \qquad x > 0
\]
Suppose 100 balls are tossed independently and at random into 50 boxes. Let \(X\) be the number of empty boxes, find \(\E[X]\).
Let \(X\) be \(\Exponential(1)\). Compute \(\E[X]\) and \(\Var(X) = \sigma^2\). Compute the chance that \(X\) is at least \(6\) standard deviations away from its mean. Compare it with the estimate from Chebyshev's inequality.
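The comparison can be sketched numerically (for \(\Exponential(1)\), mean and standard deviation are both 1, so the exact tail is \(e^{-7}\) versus the Chebyshev bound \(1/36\)):

```python
from math import exp

# X ~ Exponential(1): mean 1, variance 1, so sigma = 1.
# Since X >= 0, |X - 1| >= 6 forces X >= 7; hence the exact chance is e^{-7}.
exact = exp(-7)
chebyshev = 1 / 6 ** 2   # Chebyshev: P(|X - mu| >= 6 sigma) <= 1/36
print(exact, chebyshev)  # ~0.00091 vs ~0.0278: Chebyshev is very crude here
```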
Start with the Monty Hall problem. The way to think about this is as follows. What is the probability that you will win if you switch?
\[
\Prob(\text{win if you switch}) = \Prob(\text{you chose the wrong door initially}) = \frac23
\]
Isn't that crazy? The difference here is that the person opening the door gave you extra information about the whole process, and, ridiculously, it makes more sense to switch.
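A simulation sketch of the switching strategy (the door numbering, trial count, and seed are my choices):

```python
import random

def monty_hall(switch, trials=20000, seed=7):
    rng = random.Random(seed)
    wins = 0
    for _ in range(trials):
        car, pick = rng.randrange(3), rng.randrange(3)
        # Host opens a door that hides a goat and isn't your pick.
        opened = next(d for d in range(3) if d != pick and d != car)
        if switch:
            pick = next(d for d in range(3) if d != pick and d != opened)
        wins += (pick == car)
    return wins / trials

print(monty_hall(switch=True))   # near 2/3
print(monty_hall(switch=False))  # near 1/3
```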
Sample spaces: set notation \(\{\,\}\); includes events. Talk about the sigma-algebra, or collection of events. Perhaps I will talk about this at some point. The reason we define this collection of events is that there are different types of infinities.
Rules of probability: 3 rules.
Remark The rules of probability should be introduced after introducing the union and intersection notation.
Algebra of events: introduce union, intersection. Then do rules of probability. Then more notation. subset, proper subset. Set subtraction. Complement notation.
Complements lemma for unions and intersections: this is in the exercise 1.2. Use Venn diagrams.
Algebra of events: Intersection of \(n\) events, and disjointness. Distinction between pairwise disjointness and the above. Do example 2.5, which says: let's write down the solutions to \[
|x - 5| + |x - 3| \geq |x|
\]
Use example 2.7 to show the structure of proofs. This states that
\[
(a,b) = \bigcup_{n=1}^{\infty} \left(a, b - \tfrac1n\right)
\]
Draw a picture for this. Remark This can be fleshed out to illustrate the structure of proofs: show "both sides of the inequality", i.e., that each set contains the other. Also reinforce use of Venn diagrams. Also do 2.8 to show how \[
(a,b] = \bigcap_{n=1}^{\infty} \left(a, b + \tfrac1n\right)
\]
Distributive lemma for set notation. It's good to write the lemma as \(A \cup ( B \cap C) = \cdots\) and \(A \cap (B \cup C) = \cdots\) and fill in the right-hand sides. Draw a Venn diagram. This needs exercise 1.2 to be cited for the complements lemma.
Again this is a good proof to go over, since it illustrates the "show one is a subset of the other" proof.
Lecture 2 It seems like I have to slow down. So the recap is as follows from chapter 1.
Sample space, events, and outcomes.
Unions, intersections, and complements.
Ex 1.1 Let's do these events again.
Rules of probability.
Example: Take two events from a roll of a die.
De Morgan's law
\[
(A \cup B)^c = A^c \cap B^c
\]
Return back to Lemma 2.6 and do the distributive law.
Then do Ex 2.2a.
I wasn't able to do chapter 3.
Finally move on to chapter 3 about the algebra of events. Smallest sigma-algebra.
Talk about making abstract the notion of length.
Introduce first proof by induction, Lemma 3.5.
Lecture 1
It's on a Wednesday, so there is a quiz.
Topics to cover: Algebra of sets; closure under unions and complements. Prove that it's also closed under intersections (via De Morgan). Say why it's important for a complete probability model.
From the usual rules of probability, additivity for every finite \(n\) follows by an induction proof. But what if \(n = \infty\)? That case does not follow and must be postulated: this is called the countable additivity rule.
Properties of probability. Write down the probability \(\Prob(B \setminus A)\) when \(A \subset B\).
How to assign probabilities? Let's look at the finite case. Say \(\W = \{\w_1,\ldots,\w_n\}\). How do we assign probabilities to an arbitrary set \(A\)? Introduce cardinality notation. Does the assignment satisfy all the rules of probability?
Do example 3.8 and ask them how many coin tosses it is modeling. Outcomes can be equally likely; here, assign probabilities unevenly and ask them to find the probability of heads. What does it tell you about coin 1? Do this word of caution from page 21 here. It says that you can instead write \(\W = \{ HH, HT, TH \}\) and assign probabilities to it. Treating these outcomes as equally likely is a common mistake when trying to model the outcomes of a fair coin!
Remark The word of caution on page 19 should be moved to page 21.
Lecture 4 Do addition rule for events that are not necessarily disjoint. Do the proof for this.
State countable subadditivity as a corollary. Do inclusion-exclusion rule:
\[
\Prob(A_1 \cup \cdots \cup A_n) = \sum_{i=1}^{n} (-1)^{i-1} \sum_{1 \leq j_1 < \cdots < j_i \leq n} \Prob(A_{j_1} \cap \cdots \cap A_{j_i})
\]
Think of this as an exercise in parsing a formula. Do the example where we break it down for three events.
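As a parsing exercise, the formula can also be verified mechanically on three concrete events; the sample space and the sets below are arbitrary examples of mine:

```python
from fractions import Fraction
from itertools import combinations

# Three events as subsets of the equally likely sample space {0,...,11}.
omega = set(range(12))
events = [{0, 1, 2, 3}, {2, 3, 4, 5}, {0, 5, 6}]

def P(E):
    return Fraction(len(E), len(omega))

lhs = P(events[0] | events[1] | events[2])
# Inclusion-exclusion: alternate signs over intersections of size i.
rhs = sum(
    (-1) ** (i - 1) * sum(P(set.intersection(*c)) for c in combinations(events, i))
    for i in range(1, 4)
)
assert lhs == rhs
print(lhs)  # 7/12
```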
The chapter has more on elementary combinatorics.
Draw a picture of two sets, and show when you multiply and when you add.
First principle of counting: if you have two sets of things, say \(m\) knives and \(n\) spoons, ask: how many utensils are there?
Second principle of counting: if you have \(m\) distinct knives and \(n\) distinct spoons, ask: how many different ways can I pair these knives and spoons? The important word is distinct.
What if you ask how many different ways you can pair two utensils, when now all the knives are the same and all the spoons are equivalent?
It turns out that these rules of counting have important applications in physics: the way you count elementary particles has deep physical meaning, and leads to different statistical behavior of the system that can actually be measured in an experiment. One way of counting is called Fermi-Dirac and the other is called Bose-Einstein.
It starts with the roll of a die. Write down the \(6 \times 6\) matrix,
We got up to chapter five in the first lecture. In the second lecture, start with the urn problem.
You have an urn, with orange and purple balls in it. You stick your arm into the urn, and pick up two balls at random. What are the chances that they have different colors?
Answer: label the balls, and use the counting principles.
So we used the second principle of counting to do the urn problem. Now do the urn problem with replacement.
Remark The organization of this chapter is not clear to me.
Note to self, figure out the urn problem.
First lecture: plan to finish chapter 6 and bits of 7. Start with a recap of permutations and combinations, and then do the proof of the combinations theorem.
Do a couple of examples. Then do properties of combinations.
Make project announcement.
Get names of everyone who can program.
On the second lecture, will do more conditional probability.
When talking about disjointness and independence, give the example where two events are disjoint, but one happening influences the happening of the other: they cannot happen simultaneously, and hence cannot be independent (unless one of them has probability zero).
\(n\) independent events defined inductively. Give them an example of \(3\) events that are pairwise independent but not mutually independent.
Do example 9.1 where you talk about events from a deck of cards.
The 1s and 2s in exercise 9.2 should not have apostrophes.
Lecture 10. Start with Gambler's ruin.
I stopped with some measurability spiel.
Start with Bernoulli distribution again. Then do binomial distribution.
Do example 10. Explain how to do it clearly. Say this will be on the quiz.
Do exercise 11.3 We roll two fair dice. Let \(X\) be the product of two outcomes. What is the probability mass function of \(X\)?
Geometric distribution. Time until first success. Describe the tail of a distribution. The couple that has a religious problem with contraception.
Geometric distribution tails: \(\Prob(X \geq n) = (1-p)^{n-1}\). Let \(X \sim \text{Geom}(p)\). Fix an integer \(k \geq 1\). Compute the conditional probability that \(X = k + x\) given \(X \geq k+1\). It's memoryless, again.
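A numerical sketch of the memoryless property (the values of \(p\) and \(k\) are hypothetical; the conditional pmf of \(X - k\) given \(X \geq k+1\) matches the original pmf):

```python
p, k = 0.3, 4          # hypothetical parameter and cutoff

def pmf(m):
    """P(X = m) for X ~ Geom(p) supported on {1, 2, ...}."""
    return (1 - p) ** (m - 1) * p

tail = (1 - p) ** k    # P(X >= k + 1)
for x in range(1, 8):
    cond = pmf(k + x) / tail           # P(X = k + x | X >= k + 1)
    assert abs(cond - pmf(x)) < 1e-12  # memorylessness
print("memoryless check passed")
```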
Suppose we are tossing a \(p\)-coin, where \(p \in [0,1]\) is fixed, until we obtain \(r\) heads.
Remark On page 67, it should read \(\cap_{n=1}^{\infty} (-\infty,x+1/n] = (-\infty,x]\).
Finished it at the cumulative distribution function. Remind them about the structure of the proof, which required at least two lemmas.
Things to test them on: next week's quiz on this material.
Week 9-10 Note You need to get them to do the negative binomial. But this will be better to do when doing moment generating functions, as stated in the notes.
Will have to give them more details on the project. Have them send me a list of projects in order of priority, give me your top three choices. You will get one of your top three choices.
If you choose a hard project and do a good job on it, then you'll get a good grade.
Be creative! Ask questions, and try to answer them yourself.
Now we're doing normal distribution, the CLT and normal approximation. It might be a good idea to show them the graph of the Gore-Bush stuff from your 1070 class.
Show the recurrence formula; it requires integration by parts. So again, I'll have to use something like
\[
\int uv' = u v - \int u' v
\]
Or in the more direct form \[
\int u v = u \int v - \int \left( u' \int v \right)
\] where \(\int v\) represents any antiderivative of \(v\).
Lesson: The gamma function is a generalization of the factorial and somehow it shows up everywhere.
Functions of a discrete random variable.
There is a simplification when \(g\) is invertible.
\[
\Prob(Y = y) = \Prob(X = g^{-1}(y))
\]
Remark It's also probably a good idea to prove a proposition that says the following.
Proposition: If you have a positive function \(f > 0\) whose integral over the real line is finite, say \(\int_{-\infty}^{\infty} f = c\), then \(\frac{1}{c} f\) is a probability density function.
Remark I wonder if it's a good idea to do Jensen's inequality and Hölder's inequality.
The following is from David S. Moore, "The Basic Practice of Statistics".
Yogi Berra said it: “You can observe a lot by just watching.” That’s a motto
for learning from data. A few carefully chosen graphs are often more instructive than
great piles of numbers. Consider the outcome of the 2000 presidential election in
Florida.
Elections don’t come much closer: after much recounting, state officials declared
that George Bush had carried Florida by 537 votes out of almost 6 million votes
cast. Florida’s vote decided the election and made George Bush, rather than Al
Gore, president. Let’s look at some data. Figure 1 (see page xxvi) displays a graph
that plots votes for the third-party candidate Pat Buchanan against votes for the
Democratic candidate Al Gore in Florida’s 67 counties.
What happened in Palm Beach County? The question leaps out from the graph.
In this large and heavily Democratic county, a conservative third-party candidate
did far better relative to the Democratic candidate than in any other county. The
points for the other 66 counties show votes for both candidates increasing together
in a roughly straight-line pattern. Both counts go up as county population goes up.
Based on this pattern, we would expect Buchanan to receive around 800 votes in
Palm Beach County. He actually received more than 3400 votes. That difference
determined the election result in Florida and in the nation.
The reason appears to be that Palm Beach County used a confusing butterfly ballot that led some voters to vote for Buchanan instead of Gore.
Topics left - Functions of a discrete random variable - Functions of continuous random variable
Recall the \(\log(X)\) example. Do theorem 19.2.
Do Def 23.1. Say \((X,Y)\) is jointly distributed with joint density function \(f\) if \(f\) is piecewise continuous and, for all nice two-dimensional sets \(A\),
\[
\Prob((X,Y) \in A) = \iint_A f(x,y)\, dx\, dy
\]
Suppose \((X,Y)\) is uniformly distributed on \([-1,1]^2\). Example 23.3. Basically find two different densities.
Week 13
Remark Perhaps give them the negative binomial on the final exam.
Remark Theorem 29.1 seems incorrectly stated.
Correlation measures linear dependence. If \(|\rho(X,Y)| = 1\), then there are constants \(a\) and \(b\) (with \(a > 0\) when \(\rho = 1\)) such that \[
\Prob(Y = a X + b) = 1
\]
Two major theorems: the law of large numbers and the central limit theorem. Suppose \(X_1,X_2,\ldots,X_n\) are independent with the same mean \(\mu\) and finite variance \(\sigma^2\); then \[
\lim_{n \to \infty} \Prob\left( \left| \frac{1}{n} \sum_{i=1}^n X_i - \mu \right| \geq \e \right) = 0
\]
Proof of the law of large numbers.
Central limit theorem. Let \(X_1,X_2,\ldots\) be iid random variables. Assume that the variance \(\sigma^2\) is finite. Then if \(\mu = \E[X_1]\), \[
Z_n = \frac{1}{\sigma \sqrt{n}} \sum_{i=1}^n (X_i - \mu)
\] converges in distribution to a standard normal random variable.
Do example 38.4. The waiting time at a certain toll station is exponential with an average waiting time of 30 secs. You want to find the probability that the total wait for 100 cars is between 45 minutes and one hour. If \(X_i\) is the waiting time of car number \(i\), then we want to compute \(\Prob(45 < X_1 + \cdots + X_{100} < 60)\), with time measured in minutes.
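A sketch of the normal-approximation computation for this example, working in seconds and building \(\Phi\) from `math.erf` (for an exponential, the standard deviation equals the 30 s mean):

```python
from math import erf, sqrt

def Phi(z):
    """Standard normal CDF."""
    return 0.5 * (1 + erf(z / sqrt(2)))

mu, sigma, n = 30.0, 30.0, 100    # Exponential: sd equals the 30 s mean
m, s = n * mu, sigma * sqrt(n)    # sum: mean 3000 s, sd 300 s
lo, hi = 45 * 60, 60 * 60         # 45 min and 60 min, in seconds
approx = Phi((hi - m) / s) - Phi((lo - m) / s)
print(approx)  # Phi(2) - Phi(-1), roughly 0.82
```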