Due Dec 2
Do problems 3.1 3.4 3.6 3.8 3.11 and 3.12 from Lawler’s book.
a. From the Poisson distribution.
b. The number of calls in the second hour is independent of the first.
c. We know the times $T_i$ are iid $\operatorname{Exp}(\lambda)$.
d. This is Bayes rule.
e. Similar to the above. Why do you think this is Binomially distributed? Can you explain it?
It essentially follows from the hint. The last bit that one has to prove is the following. Since $P$ is a transition matrix of an irreducible, aperiodic Markov chain, we can write $P = Q D Q^{-1}$, where $D$ is a diagonal matrix with eigenvalues $\{\lambda_1,\ldots,\lambda_n\}$ on the diagonal. We know that $\lambda_1 = 1$ and $|\lambda_i| < 1$ for $i \geq 2$. Then $A = P - I = Q (D - I) Q^{-1}$. Now it’s clear that $D - I$ is diagonal with entries $\{\lambda_1-1,\ldots,\lambda_n-1\}$ on the diagonal. From this it follows that $A$ must have one $0$ eigenvalue and the rest must have negative real part.
The solution is given in the accompanying python file. Simply find the Jordan decomposition of $A$ as $A = Q D Q^{-1}$. Then $e^{tA} = Q e^{tD} Q^{-1}$. The eigenvalues are $(0,-3,-7)$, from which the answer follows.
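If you don’t have the accompanying file, here is a hedged sketch of the computation in numpy, using a made-up $3 \times 3$ rate matrix (not the $A$ from the problem):

```python
import numpy as np

# A made-up generator (rate) matrix: rows sum to zero and the
# off-diagonal entries are nonnegative.  NOT the A from the problem.
A = np.array([[-1.0,  1.0,  0.0],
              [ 1.0, -2.0,  1.0],
              [ 0.0,  1.0, -1.0]])

w, _ = np.linalg.eig(A)
eigvals = np.sort(w.real)[::-1]   # largest (= 0) first
print(eigvals)                     # one zero eigenvalue, the rest negative
```

Any generator has this structure: exactly one zero eigenvalue (whose eigenvector, once normalized, is the stationary distribution) and the rest with negative real part.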
a. Again, computing with the python code, you get the eigenvector corresponding to $0$. This is not normalized, so we normalize it to get the stationary distribution.
b. You leave state $1$ with rate $1 + 1 + 1$, and therefore the time until you leave state $1$ is $T \sim \operatorname{Exp}(3)$. We know that $\E[ T ] = 1/3$.
c. Make state $4$ absorbing. Let $T_4$ be the time to absorption into the $4$\textsuperscript{th} state, with $b(x) = \E[ T_4 \mid X_0 = x ]$. From page 74 in the textbook, we get $b(1) = 1$. The $[1 \quad 1 \quad 1]^T$ notation means that you’re taking a row vector and taking its transpose to get a column vector.
Since it seems like $\lambda_n > \mu_n$, we compute the series that decides transience. This is not summable in $n$ (it’s the harmonic series); therefore the chain is not transient, it’s recurrent. Let’s determine if it’s positive recurrent. The corresponding sum for the stationary measure clearly blows up, and hence the chain is null recurrent.
For the second part, we have $\lambda_n = (n+1)/(n+2)$. Since the birth rate is lower than the death rate, we expect this to be recurrent. Indeed, the series deciding transience is not summable, therefore it’s recurrent. You might suspect that this case might be positive recurrent, but the sum for the stationary measure isn’t summable either, and this case is also null recurrent.
We did this in class. For transience we have So this is transient iff $\mu < \lambda$. For recurrence, notice that This means it’s always positive recurrent for $\lambda \leq \mu$. In fact, $\pi(0) = 1$.
Hints
3.4 This solution is due to Dylan Finlayson (thanks!). Choose $a$ larger in absolute value than all the entries of the matrix $A$. We have to show that $P = I + \frac{1}{a} A$ is the transition matrix of an irreducible, aperiodic Markov chain.
$P$ is aperiodic. The only part that needs explanation is this one. To show that it’s aperiodic, simply notice that the diagonal entries $P_{ii}$ are bigger than $0$. This is because $P_{ii} = 1 + A_{ii}/a$, and $a$ was chosen larger in absolute value than all the entries of $A$.
Recall the definition of the period: it’s the greatest common divisor of the set $\{ n \geq 1 : p_n(i,i) > 0 \}$.
But we’ve seen that $P_{ii} = p_1(i,i) > 0$. One way of staying in state $i$ after $n$ steps is to keep taking the loop from $i$ to $i$, $n$ times. Therefore, $p_n(i,i) \geq p_1(i,i)^n > 0$ for all $n \geq 1$. This shows that the period of the state $i$ is $1$. This is true for all the states, and hence the chain is aperiodic.
Due Nov 11.
Consider the generating function $\phi(s) = \E[s^X]$, where $X$ is a non-negative random variable. How does this compare with the moment generating function $M(t) = \E[e^{tX}]$?
Solution
Show that $\phi(s)$ is convex by differentiating the series for $\phi(s)$. What are the conditions on $X$ and its probabilities so that $\phi''(s) = 0$ for all $s \in (0,1)$?
Solution: Clearly $\phi''(s) = \sum_{i \geq 2} i(i-1)\, p_i\, s^{i-2}$. Since all the terms in the sum are nonnegative, this is zero iff $p_i = 0$ for all $i \geq 2$. This means that $\phi(s) = p_0 + p_1 s$, which makes it a linear function.
Using Jensen’s inequality, give me an easy lower bound for $\phi(s)$. Recall that Jensen’s inequality says that if $f$ is a convex function, then $\E[f(X)] \geq f(\E[X])$.
Solution: We want to get a lower bound for $\phi(s) = \E[s^X]$. So we have to set $f(x) = s^x$ for any fixed $s$. To apply Jensen’s inequality you have to verify that $f$ is convex. This follows from $f''(x) = (\log s)^2\, s^x \geq 0$.
Then $\phi(s) = \E[s^X] \geq s^{\E[X]}$.
Consider the simple random walk (SRW) with state space $S = \{ 0, \pm 1, \pm 2, \ldots \}$. The walk goes forward or backward with equal probability. This is the one-dimensional simple random walk ($d=1$).
Show the one-dimensional random walk is recurrent by showing that
and then using one of the criteria for recurrence/transience that we developed in class.
Solution: This is straight from the textbook. Use Stirling’s formula to show that $p_{2n}(0,0) \sim 1/\sqrt{\pi n}$, so that $\sum_n p_{2n}(0,0) = \infty$. This proves recurrence.
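To see the Stirling asymptotics concretely, here is a quick numerical check (not required for the proof) that $p_{2n}(0,0) = \binom{2n}{n} 4^{-n} \approx 1/\sqrt{\pi n}$, which is why the sum over $n$ diverges like $\sum_n 1/\sqrt{n}$:

```python
import math

def p_return(n):
    # probability that the 1-d SRW is back at 0 after 2n steps
    return math.comb(2 * n, n) / 4 ** n

# Stirling says p_return(n) * sqrt(pi * n) should approach 1
for n in (10, 100, 1000):
    print(n, p_return(n) * math.sqrt(math.pi * n))
```

The ratio creeps up toward $1$ (the leading correction is of order $1/n$), so the return probabilities decay like $n^{-1/2}$ and are not summable.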
In $d \geq 2$, the simple random walk can take a step (with equal probability) in each one of the $2d$ directions.
What is the probability that the random walk goes from $x = (x_1,\ldots,x_d)$ to $y = (x_1 + 1,\ldots,x_d)$ in one step? Solution: The notation is a little ambiguous, but I meant that only the first coordinate changes. This has probability $1/(2d)$, since there are $2d$ directions you can go in.
It turns out that for $d \geq 2$, $p_{2n}(0,0) \sim c_d\, n^{-d/2}$. Is the simple random walk transient or recurrent for $d \geq 2$? Again, use one of the criteria for recurrence/transience that we developed in class.
Solution: When $d = 2$, the expected number of visits to the origin is infinite, and the walk is recurrent. When $d \geq 3$, the expected number of visits is finite, and the walk is transient.
The simple random walk is recurrent in one dimension. Is it null recurrent or positive recurrent? Hint: Try to find a stationary distribution $\pi(x)$.
Based on your findings in part 3, simply give me a non-mathematical argument for $d=2$. You can afford to be a little vague here. Give me an intuitive explanation for whether the walk is expected to be transient, positive recurrent or null recurrent. Solution: Loosely speaking, in dimension $2$ there is a lot more room than in $d=1$, so one expects it to be null recurrent again.
Alternately, you can argue that the $d=2$ walk consists of two independent one dimensional walks. For the $d=2$ walk to return to $0$, both $d=1$ walks must return to $0$, and this has smaller probability. So it ought to take longer for the $d=2$ walk to return than the $d=1$ walk; but we've shown that the expected time of return of the $d=1$ is infinite. Therefore it's null recurrent.
The hint ought to have helped you solve this one.
a. $p(x,x+1) = p$ and $p(x,0) = 1-p$. That is, if you’ve seen $x$ consecutive $1$s so far, and you see a $0$, the current run of $1$s is reset to $0$.
b. The stationary equation is $\pi(x+1) = p\,\pi(x)$. Then $\pi(x) = p^x \pi(0)$ and $\sum_{x \geq 0} \pi(x) = \frac{\pi(0)}{1-p} = 1$. This gives $\pi(0) = 1 - p$. Therefore $\pi$ is geometric.
For 2.6(c), we have to find a recursion equation. Here’s how you’d get started. Let’s consider $\E[ T_k - T_{k-1}]$, the time to go from $k-1$ successive $1$s to $k$ successive ones. Given that we’ve seen $k-1$ successive ones, if the next one is a $1$, $T_k - T_{k-1} = 1$. If the next one is a $0$, we’ve gone back to square one and we must again wait to see $k$ ones in succession. Then, letting $\E[T_k] = a(k)$, we get $a(k) = a(k-1) + p \cdot 1 + (1-p)\bigl(1 + a(k)\bigr)$, which rearranges to $a(k) = \frac{a(k-1) + 1}{p}$.
With a little bit of work, one sees that $a(k) = \sum_{j=1}^{k} p^{-j} = \frac{p^{-k} - 1}{1-p}$. Verify that this formula solves the recursion.
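A quick numerical sanity check of the recursion $a(k) = (a(k-1)+1)/p$ against the closed form $a(k) = (p^{-k}-1)/(1-p)$ — the value of $p$ below is an arbitrary choice of mine:

```python
def a_recursive(k, p):
    # iterate a(j) = (a(j-1) + 1) / p starting from a(0) = 0
    a = 0.0
    for _ in range(k):
        a = (a + 1) / p
    return a

def a_closed(k, p):
    # conjectured closed form: sum_{j=1}^k p^{-j}
    return (p ** (-k) - 1) / (1 - p)

for k in range(1, 8):
    assert abs(a_recursive(k, 0.3) - a_closed(k, 0.3)) < 1e-9
print("recursion matches closed form")
```

For $k=1$ both give $1/p$, the familiar mean of a geometric waiting time, which is a good first check.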
If there is a chance the branching process runs off to infinity without returning to zero, this means that the chain is transient ($\mu > 1$)! If the branching process returns to zero with probability $1$ it must be recurrent ($\mu \leq 1$).
You do not have to submit the code electronically. Just use a programming language to run it, print out the code and the results, and then attach it to your HW. Note that $a(n) = \phi_n(0)$. This is the probability that the chain has gone extinct by the $n$th step. To find this, recall the recursion $\phi_n(s) = \phi_{n-1}(\phi(s))$. So a sample python code to create the function $\phi_n$ would be:
import sys
sys.setrecursionlimit(10000)

# define the numbers p_0, p_1, p_2 and p_3 first,
# or however many numbers you need to define phi
def phi(s):
    return p_0 + p_1 * s + p_2 * s**2 + p_3 * s**3

def phin(n, s):
    if n == 1:
        return phi(s)
    else:
        return phin(n - 1, phi(s))

print(phin(10, 0))
Due Oct 17. The code for this should be submitted electronically through canvas.
Start by reading example 3 on page 30, or please refer to your notes. Suppose you’re a gambler playing a simple game where you lose $1$ with probability $1-p$ and gain $1$ with probability $p$. We want to figure out how long it takes for the game to end. Let $T$ be the time, i.e., the number of steps, to absorption into state $0$ or $N$ (you or the bank is ruined). Then, let the conditional expectation be $G(j) = \E[ T \mid X_0 = j ]$.
In the case when $p=1/2$, first figure out how the textbook finds the equation $G(j) = 1 + \tfrac12 G(j+1) + \tfrac12 G(j-1)$.
What is the corresponding equation for general $p \neq \frac12$?
Solution: $G(j) = 1 + p\, G(j+1) + (1-p)\, G(j-1)$.
In the $p=1/2$ case, how does one figure out the solution to the recursion equation above? The textbook suggests that we guess one of the solutions. Here is one way to make an educated guess. Manipulating the equation above, we get $G(j+1) - G(j) = G(j) - G(j-1) - 2$.
If we make the substitution $H(j) = G(j+1) - G(j)$, then we get $H(j) = H(j-1) - 2$.
$H$ is a discrete time-derivative of $G$, so let us pretend that these are smooth functions $G \colon \R \to \R$, and view the difference equation above as the differential equation $G''(x) = -2$, whose general solution is $G(x) = -x^2 + c_1 x + c_2$.
Compare your solution with the one in the textbook. Can you verify your solution by substituting it into the recursion? Solution: We verify this by plugging it into the recursion equation for $G$. Did you get it correct? What did the textbook get wrong? Solution: The textbook had $G(j) = j^2 + c_1 j + c_2$, which has an incorrect sign on the quadratic term.
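As a sanity check (not part of the assigned work), one can solve the $p = 1/2$ recursion $G(j) = 1 + \frac12 G(j+1) + \frac12 G(j-1)$ with $G(0) = G(N) = 0$ as a linear system and compare against the closed form $G(j) = j(N-j)$:

```python
import numpy as np

N = 10
# Rewrite G(j) - G(j+1)/2 - G(j-1)/2 = 1 for 0 < j < N, with
# G(0) = G(N) = 0, as a tridiagonal linear system M G = b.
M = np.zeros((N - 1, N - 1))
b = np.ones(N - 1)
for i in range(N - 1):
    M[i, i] = 1.0
    if i > 0:
        M[i, i - 1] = -0.5
    if i < N - 2:
        M[i, i + 1] = -0.5
G = np.linalg.solve(M, b)

for j in range(1, N):
    assert abs(G[j - 1] - j * (N - j)) < 1e-8
print("G(j) = j(N - j) verified for N =", N)
```

Note the $-j^2$ shape: the expected duration is largest in the middle and drops to $0$ at the absorbing boundaries.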
A small, interesting project is to try and guess the solution for the general case $p \neq \frac12$ using the above method. Does it work?
Suppose you run the Markov chain for the urn model starting with 100 red balls and 200 blue balls. What is the probability that you’re going to have fewer than 60 red balls in the urn after a really long period of time?
Solution: We know that the number of blue balls has stationary distribution $X \sim \Binomial(300)$. Then
Use the central limit theorem to approximate the probability that you’ve previously written down. Hint: Recall the normal approximation for the binomial distribution from MATH 5010. Solution: The binomial sum above is quite a pain to compute, but by the CLT, $\Prob(X \leq x) \approx \Phi\left(\frac{x - \mu}{\sigma}\right)$, where $\mu$ and $\sigma^2$ are the mean and variance of the binomial. Then, using the python code
import scipy.stats as st
import numpy as np
st.norm.cdf(-90/np.sqrt(150/4))
we get a value of $3.37 \times 10^{-49}$ for this probability. So it’s quite unlikely.
We suggested in class that one way to make a random variable that would fall in an area $A$ was to do the following:
I want you to prove that $(X_1,X_2)$ is uniformly distributed on the area $A$. To do so, assume that $B \subset A$, and compute
Interpret the answer you get in terms of areas.
Solution: You’re essentially following the textbook. If $(X_1,X_2)$ is uniform in an area $A$, then for any set $B \subset A$, the probability that $(X_1,X_2) \in B$ should be $\frac{\operatorname{area}(B)}{\operatorname{area}(A)}$.
Our procedure uses $U_1,U_2$ to generate $(X_1,X_2)$. If $(U_1,U_2)$ is accepted then we set $(X_1,X_2) = (U_1,U_2)$. So we should compute the probability $\Prob\left( (U_1,U_2) \in B \mid (U_1,U_2) \in A \right) = \frac{\operatorname{area}(B)}{\operatorname{area}(A)}$. This proves that the $(X_1,X_2)$ that we’ve generated using $U_1,U_2$ is uniformly distributed on $A$.
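Here is a concrete instance of the procedure, taking $A$ to be the unit disk inside the square $[-1,1]^2$ (this specific choice of $A$ and $B$ is mine, just for illustration):

```python
import random

def sample_in_disk(rng):
    # draw (U1, U2) uniform on the square until it lands in A
    while True:
        u1, u2 = rng.uniform(-1, 1), rng.uniform(-1, 1)
        if u1 * u1 + u2 * u2 <= 1:
            return u1, u2

rng = random.Random(42)
pts = [sample_in_disk(rng) for _ in range(20000)]

# P((X1, X2) in B) should be area(B)/area(A);
# take B = the upper half of the disk, so the ratio is 1/2
frac_upper = sum(1 for (x, y) in pts if y > 0) / len(pts)
print(frac_upper)   # close to 1/2
```

The empirical fraction landing in $B$ matches $\operatorname{area}(B)/\operatorname{area}(A)$, which is exactly the conditional-probability computation above.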
I want to see your code for this problem. It must be submitted with the HW on Canvas. An ideal format would be the jupyter notebooks in python, or R-markdown. Of course, you’re free to use anything, but the code must run. Please include graphs and pictures and make them pretty!
Consider the reflecting random walk on $5$ states. The transition matrix for this can be found in my code example on Canvas.
Find $\pi$, the stationary probabilities by simulating running the chain for a long time, and computing the average amount of time spent in each state.
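The actual transition matrix is in the Canvas example; as a stand-in, the sketch below assumes a lazy reflecting walk on $\{0,\ldots,4\}$ that attempts a $\pm 1$ step with probability $1/4$ each and otherwise stays put (attempts off the edge are blocked). For that particular chain the stationary distribution is uniform, which makes the answer easy to recognize:

```python
import random

def occupation_estimate(n_steps=200_000, n=5, seed=1):
    # run the (assumed) lazy reflecting walk and record time spent per state
    rng = random.Random(seed)
    counts = [0] * n
    x = 0
    for _ in range(n_steps):
        u = rng.random()
        if u < 0.25:
            x = max(x - 1, 0)      # attempted step left (blocked at 0)
        elif u < 0.5:
            x = min(x + 1, n - 1)  # attempted step right (blocked at 4)
        counts[x] += 1
    return [c / n_steps for c in counts]

print(occupation_estimate())   # each entry near 0.2 for this chain
```

Swap in the Canvas transition matrix and the same occupation-time average estimates its $\pi$.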
Sample from the stationary probabilities directly by ‘coupling from the past’. Simulate this random variable a few times and estimate the stationary probabilities by seeing how often it lands in each state. The algorithm may be found in Chapter 11 of Ross’ book (see syllabus), but we’ll also go over it carefully in class in case you don’t have the book.
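Here is a sketch of coupling from the past, again assuming (as a stand-in for the Canvas matrix) a lazy reflecting walk on $\{0,\ldots,4\}$ that attempts a $\pm 1$ step with probability $1/4$ each and otherwise stays put; its stationary distribution is uniform. The update rule is monotone in the state, so it suffices to run sandwich chains started from $0$ and $4$ with common randomness, reusing the same random numbers as $T$ doubles:

```python
import random

def step(x, u, n=5):
    # assumed lazy reflecting walk; attempts off the edge are blocked
    if u < 0.25:
        return max(x - 1, 0)
    if u < 0.5:
        return min(x + 1, n - 1)
    return x

def cftp(rng, n=5):
    us = []                  # us[k] drives the update at time -(k+1)
    T = 1
    while True:
        while len(us) < T:
            us.append(rng.random())
        lo, hi = 0, n - 1    # monotone coupling: sandwich the chain
        for k in range(T - 1, -1, -1):   # run from time -T up to time 0
            lo, hi = step(lo, us[k], n), step(hi, us[k], n)
        if lo == hi:
            return lo        # the two bounds coalesced: an exact draw from pi
        T *= 2               # go further into the past, reusing us

rng = random.Random(7)
samples = [cftp(rng) for _ in range(5000)]
freqs = [samples.count(i) / len(samples) for i in range(5)]
print(freqs)   # each near 0.2 for the assumed chain
```

The key point (easy to get wrong): the randomness at each past time is fixed once and reused every time $T$ doubles; only then is the coalesced value an exact sample from $\pi$.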
Start by reading wikipedia’s page on the PageRank algorithm that Google uses to rank its search results. Get thoroughly confused. It sorta works like this:
After looking at the wikipedia page, see Firas’ presentation. There is an example of a Markov chain on $4$ or $5$ states there.
This HW was due on Sep 26, and you simply had to correct the errors on your exam. People who didn’t make any errors on the exam were exempt.
Due on Sep 19, 2016. REMEMBER there is a quiz on Sep 16.
You’re going to be tested on classification of states, and mean return times in this one. You’re also going to be asked to fill in some details of proofs.
In class we stated that there were two communication classes.
Are there any other types of communication classes? For example, is there a class $T$ that you can stay in forever with probability $p$ and leave eventually with probability $1-p$ (for $p > 0$)? We will show that such classes cannot exist.
Let $T = \cup_{i=1}^k T_i$ be a disjoint union of all the transient communication classes, and let $R$ contain all the recurrent states.
Fix a state $i \in T_1$. We show first that there must be a way for the chain to transition from $i$ to a state $j \in R$; i.e., show that this has non-zero probability. One way to do this is by contradiction. Assume that you can never get from $i$ to the set $R$ and arrive at a contradiction.
Solution: Let $T_1,\ldots,T_k$ be all the transient communication classes, and let $R$ be the union of all the recurrent classes. This is a key argument.
Suppose you start in transient class $T_1$. There must be a way to leave class $T_1$; for if not, $T_1$ is a communicating class that you can never leave and this means that it is recurrent. Let us suppose we’ve ordered $T_1,\ldots,T_k$ so that one (of the many possible) classes that you can reach is $T_2$.
We claim that you can never go back from $T_2$ to $T_1$. If it were possible to go back, then this would mean that $T_1$ and $T_2$ communicate; this violates our assumption that they are distinct communicating classes. Arguing this way, we find that there must be a path from $i$ to a final communication class $T_k$. Now, we cannot have an edge from $T_k$ back to any previous class $T_i$ for $i=1,\ldots,k-1$; if we did, this would mean that $T_i$ and $T_k$ communicate, and this is not allowed. This means that $T_k$ must contain a state from which there is a positive probability of reaching $R$ in one step.
Next we must show that if $i \in T$, then $\Prob( \exists~n > 0 \,s.t., X_n \in R \mid X_0 = i ) = 1$. To arrive at a contradiction, suppose again that we never exit the transient states. Since $T$ is a finite set, there must be at least one state that is repeated infinitely many times. Let us show this semi-rigorously. Let
Then,
represents the probability that you never leave the transient class. Let
Let $R_j^{(k)}$ be the $k^\text{th}$ return to $j$. Then,
Then, this is non-zero iff $\rho(j,j) = 1$ for at least one $j \in T$.
However, $\rho(j,j) < 1$ for every $j \in T$, since there is certainly a positive-probability way of reaching $R$ from $j$ (and the chain never returns from $R$). This is a contradiction.
Consider a two-state Markov chain with $S = \{0,1\}$. Let $p(0,1) = 1/2$ and $p(1,0) = 2/3$. Let $T$ be the first return time to $0$ given that you started in state $0$.
Compute $\E[T]$ using the stationary distribution of the chain.
Solution: The stationary distribution is obtained by solving $\pi P = \pi$. We get $\pi = (4/7, 3/7)$. Then, using equation $1.11$, we get $\E[T] = 1/\pi(0) = 7/4$.
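A quick numerical check of this part, using the transition probabilities given above ($p(0,1) = 1/2$, $p(1,0) = 2/3$):

```python
import numpy as np

P = np.array([[0.5, 0.5],
              [2/3, 1/3]])

# stationary distribution = left eigenvector of P for eigenvalue 1
w, V = np.linalg.eig(P.T)
pi = V[:, np.argmax(w.real)].real
pi = pi / pi.sum()          # normalize (and fix the sign)

print(pi, 1 / pi[0])        # (4/7, 3/7) and E[T] = 7/4
```

The mean return time $1/\pi(0)$ is exactly the relation in equation $1.11$.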
Find $\Prob(T \geq n)$ by direct computation. Solution: This is on page $26$. Simply read the textbook and follow the computation. Notice how we’re computing the full distribution of the random variable $T$.
Compute $\E[T]$ using the formula $\E[T] = \sum_{n=1}^{\infty} \Prob(T \geq n)$. Solution: It’s the line following $1.13$.
Suppose $a_n$ is a sequence of numbers such that Show that Solution: We use the $\e$-$\delta$ definition of a limit. Fix $\e$ and pick $N$ so large that Since $\e$ was arbitrary, this completes the proof. We’ve used the triangle inequality in the third step.
Suppose $X$ is a Markov chain that’s irreducible and has period $d > 1$. Let $Y(n,i)$ be the amount of time spent in state $i$ in $n$ steps of the chain. We’ve seen that Show that $A \to \pi(i)$, where $\pi$ is the unique stationary probability distribution.
Solution: We know that $p_{nd}(i,i) \to d\,\pi(i)$; this is in the textbook. So set $\sum_{r=1}^d p_{0+r} = a_0$, $\sum_{r=1}^d p_{d+r} = a_1$, $\sum_{r=1}^d p_{2d+r} = a_2$, and so on. That is, we’re grouping every $d$ terms in the sum $A$. The first of these terms goes to $\pi(i)$ since, from what we’ve shown previously, $a_k \to d \pi(i)$ as $k \to \infty$. The second term goes to $0$ since $p_k(j,k) \leq 1$ and
Recall the umbrella example from HW 1.
What is the average number of umbrellas you’d have on hand in the long term?
Solution: Simply use the stationary probabilities, and find $\E[X] = \sum_{i=0}^4 i\,\pi_i$. This ranges from $2.2$ to $2.5$. I could live with this.
Suppose you start with $0$ umbrellas. What is the average amount of time until you have $0$ umbrellas again?
Solution: $\E[T] = \frac{1}{\pi(0)} = (5-p)/(1-p)$. As $p \to 1$ (it rains all the time), the time between having no umbrellas goes to $\infty$! Can you see why? As $p \to 0$ (it’s sunny all the time), this goes to $5$; can you see why?
Due Sep 7, 2016.
There are two coins on the table. The first tosses heads with probability $p$ and the second with probability $r$. You select one at random, flip it, and get heads. What is the probability that the second coin was chosen?
Answer: Let $H$ be the event that you get heads. Let $C_2$ be the event that the second coin is chosen. We want $\Prob(C_2 \mid H) = \frac{\Prob(H \mid C_2)\,\Prob(C_2)}{\Prob(H)} = \frac{\frac12 r}{\frac12 p + \frac12 r} = \frac{r}{p+r}$.
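Bayes’ rule gives $\Prob(C_2 \mid H) = \frac{\frac12 r}{\frac12 p + \frac12 r} = \frac{r}{p+r}$. A quick simulation check, with made-up values for $p$ and $r$:

```python
import random

def estimate(p, r, n=200_000, seed=3):
    # pick a coin uniformly, flip it, and condition on seeing heads
    rng = random.Random(seed)
    heads = 0
    second_and_heads = 0
    for _ in range(n):
        second = rng.random() < 0.5          # which coin was chosen?
        prob_heads = r if second else p
        if rng.random() < prob_heads:        # flip it
            heads += 1
            if second:
                second_and_heads += 1
    return second_and_heads / heads

p, r = 0.6, 0.3   # made-up values of p and r
print(estimate(p, r), r / (p + r))   # the two should be close
```

Conditioning in the simulation is literally "throw away the tails outcomes", which is what the denominator $\Prob(H)$ does in the formula.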
Let $A_0, A_1, \ldots, A_n$ be events of non-zero probability.
Show that $\Prob(A_0 \cap A_1 \cap \cdots \cap A_n) = \Prob(A_0)\,\Prob(A_1 \mid A_0)\,\Prob(A_2 \mid A_0 \cap A_1) \cdots \Prob(A_n \mid A_0 \cap \cdots \cap A_{n-1})$.
Answer: (2pts) We need the non-zero probability assumption because the conditional probabilities are meaningless otherwise. In other words, to define
$\Prob(A \mid B) = \frac{\Prob(A \cap B)}{\Prob(B)}$,
we need $\Prob(B) > 0$ to ensure that we’re not dividing by zero. So for the above statement to be true, we need to assume that $\Prob(A_0 \cap \cdots \cap A_{n-1}) > 0$.
The proof is by induction. For the induction hypothesis, assume that the statement is true for any $n$ events of non-zero probability. We will show that it’s true for $n+1$ events $\{A_i\}_{i=0}^{n}$ of non-zero probability. Then the result follows, using the definition of conditional probability in the second step and the induction hypothesis in the third step.
We’ve used the notation (that we’ll use throughout the course) that
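The multiplication rule can also be checked by brute-force enumeration on a small sample space; the events below are made up just for illustration:

```python
from itertools import product
from fractions import Fraction

# uniform sample space: two rolls of a fair die
omega = list(product(range(1, 7), repeat=2))

def prob(event):
    # exact probability of an event under the uniform measure
    return Fraction(sum(1 for w in omega if event(w)), len(omega))

# made-up events A0, A1, A2 of non-zero probability
def A0(w): return w[0] >= 2
def A1(w): return w[0] + w[1] >= 5
def A2(w): return w[1] != 6

def I01(w): return A0(w) and A1(w)
def I012(w): return A0(w) and A1(w) and A2(w)

lhs = prob(I012)
rhs = prob(A0) * (prob(I01) / prob(A0)) * (prob(I012) / prob(I01))
assert lhs == rhs   # the product telescopes, exactly as in the proof
print(lhs)
```

The exact `Fraction` arithmetic makes the telescoping visible: each denominator cancels the numerator before it, which is the whole content of the induction step.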
Suppose we have some stochastic process with state space $S = \{ a, b, c \}$. What does the equation in the previous part say about the event
Answer (2pts)
Suppose the process is Markov. Simplify the equation you obtained above.
Answer (2pts) We simply apply the Markov property.
Consider a stochastic matrix of the form
Find the determinant of $A$.
Answer (2pts): $\det(A) = 1/3 - 1/6 = 1/6$.
Find the eigenvalues of $A$ by solving the characteristic equation
where $I$ is the identity matrix.
Answer: (2pts) It’s easy to see that $\det(A - xI) = x^2 - \tfrac{7}{6} x + \tfrac16 = (x - 1)\left(x - \tfrac16\right) = 0$, which gives the eigenvalues $x=1,x=1/6$.
Find normalized eigenvectors ($|v| = 1$) of $A$ by finding solutions to $Av = \lambda v$. Answer: (2 pts) For $\lambda = 1$, solve the linear system to get the left eigenvector $(0.6172, 0.762)$. For the eigenvalue $\lambda = 1/6$, we get the eigenvector $(-0.7211,0.7211)$. For this last step, I got python to solve the linear system of equations above. These are normalized left eigenvectors; i.e., they satisfy $\sqrt{v_1^2 + v_2^2} = 1$. The right eigenvectors may be found similarly. I agree that it looks rather weird because of the normalization. Nevertheless, this is what python chugs out if you ask for eigenvectors. If you didn’t use the computer, but solved by hand, you would get something like this.
Find a matrix $Q$ such that $Q^{-1} A Q$ is diagonal. In other words, find the Jordan decomposition of $A$. Hint: once you’ve found the eigenvectors $v_1$ and $v_2$, let $Q = [v_1 v_2]$.
Answer: (2 pts) This is the same as writing $A = Q \Lambda Q^{-1}$, where $\Lambda$ is the diagonal matrix of eigenvalues (or more generally something in Jordan form).
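For those doing this on the computer, here is a sketch with numpy, using a made-up $2 \times 2$ stochastic matrix (not necessarily the $A$ from the problem):

```python
import numpy as np

# a made-up 2x2 stochastic matrix: rows are probability vectors
A = np.array([[2/3, 1/3],
              [1/2, 1/2]])

w, V = np.linalg.eig(A)        # columns of V are right eigenvectors
Q = V                          # so Q^{-1} A Q should be diagonal
D = np.linalg.inv(Q) @ A @ Q
print(np.round(D, 10))         # diagonal, with the eigenvalues of A
```

One eigenvalue is always $1$ for a stochastic matrix (the rows sum to $1$, so $A \mathbf{1} = \mathbf{1}$); for this made-up matrix the other is $1/6$.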
A thinker who owns 4 umbrellas travels back and forth between home and office, taking along an umbrella (if there is one at hand) in rain (probability $p$) but not in shine (probability $q=1-p$). Let the state be the number of umbrellas at hand, irrespective of whether the thinker is at home or at work. Suppose he has two umbrellas at home and two at work, and he’s at home; the state is $X_0 = 2$. If it rains that day, he takes one umbrella from home to work, and so he has three umbrellas at work; the state is now $3$. Set up the transition matrix and find the stationary probabilities.
Answer: First, draw a picture of the possible transitions.
Then form the matrix of transition probabilities $P$.
Solve $\pi P = \pi$ and use the constraint $\sum_{i=0}^4 \pi_i = 1$ (it’s a probability vector). This gives us the equations
So finally we get $\pi_1 = \pi_2 = \pi_3 = \pi_4$ and therefore, $\pi_0 = \frac{1-p}{5-p}$ and $\pi_i = \frac{1}{5-p}$ for $i = 1,\ldots,4$.
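One way to double-check the answer $\pi_0 = \frac{1-p}{5-p}$, $\pi_1 = \cdots = \pi_4 = \frac{1}{5-p}$: build the transition matrix from the rules of the model — from $0$ umbrellas you travel without one and arrive where all $4$ are; from $i \geq 1$ you arrive at $4-i$ umbrellas in shine and $5-i$ in rain — and verify $\pi P = \pi$ numerically:

```python
import numpy as np

def umbrella_P(p):
    # transition matrix for the 4-umbrella chain on states 0..4
    q = 1 - p
    P = np.zeros((5, 5))
    P[0, 4] = 1.0            # no umbrella at hand: arrive where all 4 are
    for i in range(1, 5):
        P[i, 4 - i] += q     # shine: travel without an umbrella
        P[i, 5 - i] += p     # rain: carry one along
    return P

for p in (0.1, 0.5, 0.9):
    pi = np.array([1 - p, 1, 1, 1, 1]) / (5 - p)
    assert np.allclose(pi @ umbrella_P(p), pi)   # stationarity
print("pi P = pi verified")
```

This also confirms the earlier mean return time $\E[T] = 1/\pi_0 = (5-p)/(1-p)$ for state $0$.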