
Assume we have a random walk starting at 1 with probability of moving left one space $q$, moving right one space $p$, and staying in the same place $r=1-p-q$. Let $T$ be the number of steps to reach 0. Find $h(z)$, the ordinary generating function.

My idea was to break $T$ into two variables $Y_1,Y_2$ where $Y_1$ describes the number of times you stay in place and $Y_2$ the number of times we move forward or backward one. Then try to find a formula for $P(T=n)=P(Y_1+Y_2=n)=r_n$, but I'm getting really confused since there are multiple probabilities possible for each $T=n$ for $n\geq 3$. Once I have $r_n$ I can then use $h_T(z)=\sum_{n=1}^\infty r_n z^n$, but I'm not sure where to go from here.

  • You may be interested in the answers to [this question](http://math.stackexchange.com/questions/23808/coin-toss-question/23845), though they don't directly apply to this one. (2011-12-15)

4 Answers

0

This isn't another answer, but something that seems worth adding and is too long for a comment.

What makes the probabilities for hitting times nice to work with is that their generating functions combine by multiplication, due to a sort of transitivity: If $A\gt B\gt C$, then to get from $A$ to $C$ for the first time, you have to get from $A$ to $B$ for the first time and then get from $B$ to $C$ for the first time. Summing over the hitting time for $B$ yields a convolution, and thus multiplication of the generating functions.

This leads to the simple equation $h=z(q+rh+ph^2)$ (see the other answers and my comments under Brian's). The generating functions for the probabilities of ending up somewhere at some time, but not necessarily for the first time, don't have this nice property, but we can obtain them from the ones for the hitting times.
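This equation is easy to check numerically. A minimal sketch with exact rational arithmetic (the values of $p$ and $q$ below are illustrative choices, not from the question): compute the first-passage probabilities by dynamic programming and compare both sides as truncated power series.

```python
from fractions import Fraction

# Sanity check of the functional equation h = z*(q + r*h + p*h^2)
# for the first-passage generating function, as a truncated power
# series. p and q are illustrative values.
p, q = Fraction(1, 3), Fraction(1, 2)
r = 1 - p - q
N = 12  # truncation order

# h[n] = P(first hit 0 at step n | start at 1), by dynamic programming:
# dist maps position -> probability of being there with 0 not yet hit
h = [Fraction(0)] * (N + 1)
dist = {1: Fraction(1)}
for n in range(1, N + 1):
    new = {}
    for x, pr in dist.items():
        for step, w in ((-1, q), (0, r), (1, p)):
            new[x + step] = new.get(x + step, Fraction(0)) + pr * w
    h[n] = new.pop(0, Fraction(0))  # mass newly absorbed at 0
    dist = new

# coefficient of z^n in z*(q + r*h + p*h^2)
h2 = [sum(h[k] * h[n - k] for k in range(n + 1)) for n in range(N + 1)]
rhs = [Fraction(0)] * (N + 1)
for n in range(1, N + 1):
    rhs[n] = (q if n == 1 else 0) + r * h[n - 1] + p * h2[n - 1]
assert h == rhs
```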

The generating function $f$ for the probabilities for the return time (the first time to return to the same place) satisfies

$f=z(qg+r+ph)\;,$

where $g$ is the generating function for the hitting time for first hitting the origin starting at $-1$. This equation arises just like the one for $h$: We go left with probability $q$ and then have to hit the origin starting at $-1$, or we stay at the origin with probability $r$ and thus "return" in one step, or we go right with probability $p$ and then have to hit the origin starting at $1$.

The numerator of $h$ is invariant with respect to interchange of $p$ and $q$, so $g$ is $h$ with $p$ in the denominator replaced by $q$. Thus $f$ takes the simple form

$f=1-\sqrt{(1-rz)^2-4pqz^2}\;.$
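As a sanity check, the coefficients of this closed form can be compared against first-return probabilities computed directly; the square root is expanded as a formal power series in exact arithmetic (again with illustrative $p$ and $q$):

```python
from fractions import Fraction

# Exact check of f = 1 - sqrt((1-rz)^2 - 4pq z^2): its power-series
# coefficients should be the first-return probabilities at 0.
p, q = Fraction(1, 3), Fraction(1, 2)
r = 1 - p - q
N = 10

# series under the root: (1 - r z)^2 - 4 p q z^2
a = [Fraction(1), -2 * r, r * r - 4 * p * q] + [Fraction(0)] * (N - 2)

# formal square root s of a, with s[0] = 1:
# a[n] = sum_k s[k] s[n-k]  =>  s[n] = (a[n] - sum_{0<k<n} s[k] s[n-k]) / 2
s = [Fraction(1)] + [Fraction(0)] * N
for n in range(1, N + 1):
    s[n] = (a[n] - sum(s[k] * s[n - k] for k in range(1, n))) / 2

f = [1 - s[0]] + [-c for c in s[1:]]  # f = 1 - s

# first-return probabilities at 0, by dynamic programming
ret = [Fraction(0)] * (N + 1)
dist = {0: Fraction(1)}
for n in range(1, N + 1):
    new = {}
    for x, pr in dist.items():
        for st, w in ((-1, q), (0, r), (1, p)):
            new[x + st] = new.get(x + st, Fraction(0)) + pr * w
    ret[n] = new.pop(0, Fraction(0))
    dist = new

assert f == ret
```

For instance $f_1=r$ (stay put once) and $f_2=2pq$ (step right then left, or left then right), matching the series.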

We can get the generating function $F$ for the probabilities of ending up where we started (not necessarily for the first time) from $f$ via

$F=\frac1{1-f}\;.$

This can be derived either as in this answer, or by noting that the probability for returning to the same place is the sum over all $n$ of the probabilities for returning there for the $n$-th time, and the generating function for returning for the $n$-th time is obtained by an $n$-fold convolution as $f^n$, so

$F=1+f+f^2+\dotso=\frac1{1-f}\;.$

Thus we have

$F=\frac1{\sqrt{(1-rz)^2-4pqz^2}}\;.$

Finally, we can get the generating function $H$ for ending up at $0$ starting from $1$, not necessarily for the first time, as the product

$H=hF=\frac1{2pz}\left(\frac{1-rz}{\sqrt{(1-rz)^2-4pqz^2}}-1\right)\;.$
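The convolution identity behind this product can also be checked directly in exact arithmetic (illustrative $p,q$): the probability of being at $0$ at time $n$, starting from $1$, should be the convolution of the first-passage probabilities with the return probabilities.

```python
from fractions import Fraction

# Exact check of H = h*F as formal power series.
p, q = Fraction(1, 3), Fraction(1, 2)
r = 1 - p - q
N = 10

def step(dist):
    """One step of the free (unabsorbed) walk."""
    new = {}
    for s, w in ((-1, q), (0, r), (1, p)):
        for x, pr in dist.items():
            new[x + s] = new.get(x + s, Fraction(0)) + pr * w
    return new

# h[n]: first passage to 0 from 1 (remove mass once it hits 0)
h = [Fraction(0)] * (N + 1)
dist = {1: Fraction(1)}
for n in range(1, N + 1):
    dist = step(dist)
    h[n] = dist.pop(0, Fraction(0))

# F[n]: at 0 at time n, starting from 0 (free walk)
F = [Fraction(1)] + [Fraction(0)] * N
dist = {0: Fraction(1)}
for n in range(1, N + 1):
    dist = step(dist)
    F[n] = dist.get(0, Fraction(0))

# H[n]: at 0 at time n, starting from 1 (free walk)
H = [Fraction(0)] * (N + 1)
dist = {1: Fraction(1)}
for n in range(1, N + 1):
    dist = step(dist)
    H[n] = dist.get(0, Fraction(0))

assert all(H[n] == sum(h[k] * F[n - k] for k in range(n + 1))
           for n in range(N + 1))
```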

4

A walk that first reaches $0$ after $2n+1$ non-stationary steps must start with a walk of $2n$ steps from $1$ to $1$ that never hits $0$ and end with a single step to the left. The number of such walks is $C_n$, the $n$-th Catalan number: the first $2n$ steps are a Dyck path, since the number of left steps at any point can never exceed the number of right steps to that point. Each such path includes $n$ right steps and $n+1$ left steps, so each has probability $p^nq^{n+1}$, and the probability of reaching $0$ in $2n+1$ non-stationary steps is therefore $C_np^nq^{n+1}$. Thus,

$P(Y_2=n)= \begin{cases} 0,&\text{if }n\text{ is even}\\ C_mp^mq^{m+1},&\text{if }n=2m+1\;. \end{cases}$

Note that the generating function for the Catalan numbers is

$\sum_{n\ge 0}C_nz^n=\frac{1-\sqrt{1-4z}}{2z}\;,$

and that $\sum_{n\ge 0}\binom{m+n}nz^n = \frac1{(1-z)^{m+1}}\;.$

If $Y_2=2n+1$, stationary steps can be inserted anywhere before the last non-stationary step, so there are $\binom{m+2n}m$ ways to insert $m$ stationary steps, and

$\begin{align*} h(z)&=\sum_{n\ge 0}C_np^nq^{n+1}z^{2n+1}\sum_{m\ge 0}\binom{m+2n}mr^mz^m\\ &=qz\sum_{n\ge 0}C_np^nq^nz^{2n}\frac1{(1-rz)^{2n+1}}\\ &=\frac{qz}{1-rz}\sum_{n\ge 0}C_n\left(\frac{pqz^2}{(1-rz)^2}\right)^n\\ &=\frac{qz}{1-rz}\cdot\frac{1-\sqrt{1-\frac{4pqz^2}{(1-rz)^2}}}{\frac{2pqz^2}{(1-rz)^2}}\\ &=\frac{1-\sqrt{1-\frac{4pqz^2}{(1-rz)^2}}}{\frac{2pz}{1-rz}}\\ &=\frac{1-rz-\sqrt{(1-rz)^2-4pqz^2}}{2pz}\;, \end{align*}$

if I didn’t louse up any of the algebra.

The ‘snake oil’ route, added after reading joriki’s comment: Let $h_n$ be the probability of terminating in $n$ steps. In general a walk that terminates in $n$ steps takes one of two forms: it’s a stationary step followed by a walk that terminates in $n-1$ steps, or it’s a step to the right followed by a walk of $k$ steps that first returns to $1$ (a first passage from $2$ to $1$, which has the same law as $T$), followed by a walk that terminates in $n-1-k$ steps. This yields the recurrence

$h_n=rh_{n-1}+p\sum_kh_kh_{n-1-k}\;.$

With the usual convention that $h_k=0$ when $k\le 0$, this works fine except at $n=1$: $h_1=q$, not $0$. The correct recurrence is therefore

$h_n=rh_{n-1}+p\sum_kh_kh_{n-1-k}+q[n=1]\;.$

Multiplying through by $z^n$ and summing over $n$ yields the equation

$h(z)=rzh(z)+pz\big(h(z)\big)^2+qz\;.$

Rewrite it as $pz\big(h(z)\big)^2+(rz-1)h(z)+qz=0$ and apply the quadratic formula to get

$h(z)=\frac{1-rz\pm\sqrt{(rz-1)^2-4pqz^2}}{2pz}\;,$

and it only remains to determine the correct sign in the numerator.

Since $h(0)=h_0=0$, we must have $\lim\limits_{z\to 0}h(z)=0$, so the numerator of $h(z)$ must approach $0$ as $z\to 0$, and we must have the negative sign:

$h(z)=\frac{1-rz-\sqrt{(rz-1)^2-4pqz^2}}{2pz}\;.$
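A quick numerical spot-check of this closed form, evaluating both the formula and the series $\sum_n P(T=n)z^n$ at a few points of $(0,1)$ (the values of $p$ and $q$ are illustrative):

```python
import math

# Compare the closed form for h(z) with the truncated series
# sum_n P(T = n) z^n, with P(T = n) from dynamic programming.
p, q = 0.3, 0.45
r = 1.0 - p - q
N = 200  # enough terms for the series to converge for z < 1

hn = [0.0] * (N + 1)
dist = {1: 1.0}
for n in range(1, N + 1):
    new = {}
    for x, pr in dist.items():
        for s, w in ((-1, q), (0, r), (1, p)):
            new[x + s] = new.get(x + s, 0.0) + pr * w
    hn[n] = new.pop(0, 0.0)  # first passage to 0 at time n
    dist = new

for z in (0.2, 0.5, 0.8):
    closed = (1 - r * z - math.sqrt((1 - r * z) ** 2 - 4 * p * q * z * z)) / (2 * p * z)
    series = sum(hn[n] * z ** n for n in range(N + 1))
    assert abs(closed - series) < 1e-9
```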

  • @joriki: Yes, that is nice. One of these years I may yet learn to think of generating functions as more than clotheslines for sequences. (2011-12-15)
3

A classical way to determine $h(z)$ is to compute $h_n(z)=\mathrm E_n(z^T)$ for every $n\geqslant0$, where $\mathrm E_n$ denotes the expectation starting from $n$, hence $h(z)=h_1(z)$.

Then $h_0(z)=1$ and, considering the first step of the random walk, one gets, for every $n\geqslant1$, $ h_n(z)=rzh_n(z)+pzh_{n+1}(z)+qzh_{n-1}(z), $ with $r=1-p-q$. Fix $z$ in $(0,1)$. Then the sequence $(x_n)_n=(h_n(z))_n$ satisfies the relations $x_0=1$, $x_n\to0$ when $n\to\infty$, and $ax_{n+1}-bx_n+cx_{n-1}=0$ for every $n\geqslant1$, for some positive $(a,b,c)$.

Hence $x_n=\alpha s_z^n+(1-\alpha)t_z^n$, where $s_z$ and $t_z$ are the roots of the polynomial $Q_z(t)=at^2-bt+c$. Since $a=pz$, $b=1-rz$ and $c=qz$, one sees that $Q_z(0)=qz\gt0$, $Q_z'(0)=-(1-rz)\lt0$ and $Q_z(1)=-(1-z)\lt0$. Thus, $0\lt s_z\lt1\lt t_z$. If $\alpha\ne1$, $|x_n|\to\infty$ when $n\to\infty$. But $(x_n)_n$ should stay bounded, hence $\alpha=1$. Finally, $x_n=s_z^n$ for every $n\geqslant0$, where $Q_z(s_z)=0$ and $0\lt s_z\lt 1$.

In particular, for every $z$ in $(0,1)$, $h(z)=s_z$, that is, $ h(z)=\frac{1-rz-\sqrt{(1-rz)^2-4pqz^2}}{2pz}. $ Note that the limit of $h(z)$ when $z\to1^-$ is the probability $\mathrm P_1(T\lt\infty)$, which is $1$ if $p\leqslant q$ and $q/p\lt1$ if $p\gt q$.
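This limiting behavior is easy to confirm numerically from the closed form (the parameter values are illustrative):

```python
import math

# Check the z -> 1- limit: h(1) should equal 1 when p <= q
# and q/p when p > q. h_at implements the closed form above.
def h_at(z, p, q):
    r = 1 - p - q
    return (1 - r * z - math.sqrt((1 - r * z) ** 2 - 4 * p * q * z * z)) / (2 * p * z)

# drift to the right (p > q): the walk escapes with positive probability
assert abs(h_at(1.0, 0.5, 0.2) - 0.2 / 0.5) < 1e-12
# drift to the left or balanced (p <= q): 0 is hit almost surely
assert abs(h_at(1.0, 0.2, 0.5) - 1.0) < 1e-12
```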

The technique above is flexible enough to be valid for any random walk. If the steps are $i$ with probability $p_i$, the recursion becomes $ h_n(z)=z\sum\limits_ip_ih_{n+i}(z). $ The case at hand is $p_{-1}=q$, $p_0=r$ and $p_1=p$. When $p_{-1}$ is the only nonzero $p_i$ with $i$ negative, a shortcut is to note that one can only go from $n\geqslant1$ to $0$ by first reaching $n-1$, then reaching $n-2$ from $n-1$, and so on until one reaches $0$ starting from $1$. These $n$ hitting times are i.i.d. hence $h_n(z)=h(z)^n$ for every $n\geqslant1$, and one is left with $ h(z)=zp_{-1}+z\sum\limits_{i\geqslant0}p_ih(z)^{i+1}. $ In the present case, this reads $ h(z)=qz+rzh(z)+pzh(z)^2, $ hence the expression of $h(z)=s_z$ as solving a quadratic equation should not be surprising.

2

I do not have the references to find the generating function of $Y_2$ with me, and it is rather tedious to re-do those computations. I'll skip this part, and only find the generating function of $T$ provided the generating function of $Y_2$ is known. Let $F (z) = E (z^{Y_2})$ be the generating function of $Y_2$.

First, let us write the generating function of $T$ and change the indices of the summation:

$h (z) = \sum_{N=1}^{+ \infty} P (T = N) z^N = \sum_{N=1}^{+ \infty} \sum_{k=0}^{N-1} P (Y_2 = N-k, Y_1 = k) z^N = \sum_{n=1}^{+ \infty} \sum_{k=0}^{+ \infty} P (Y_2 = n, Y_1 = k) z^{n+k}.$

Note that, if $Y_2$ is known, the law of $Y_1$ is that of a sum of $Y_2$ i.i.d. random variables with geometric law: after each step, you spend a geometric time of parameter $p+q$ waiting before taking the next step. Since the law of $Y_1$ given $Y_2$ is well known, let us condition on the value of $Y_2$.

$h (z) = \sum_{n=1}^{+ \infty} P (Y_2 = n) z^n \sum_{k=0}^{+ \infty} P (Y_1 = k | Y_2 = n) z^k.$

The generating function of a single random variable with geometric law of parameter $p+q$ (here $P(Y=k)=(p+q)(1-p-q)^k$, the probability of waiting $k$ stationary steps before the next move) is:

$g (z) = \sum_{k=0}^{+ \infty} (p+q) (1-p-q)^k z^k = \frac{p+q}{1-(1-p-q)z}.$

The generating function of the sum of $n$ independent random variables with the same law is $g^n$ (generating functions are really, really nice), so that:

$\sum_{k=0}^{+ \infty} P (Y_1 = k | Y_2 = n) z^k = g(z)^n = \left( \frac{p+q}{1-(1-p-q)z} \right)^n.$

If we inject this expression into the formula for $h$, we get:

$h (z) = \sum_{n=1}^{+ \infty} P (Y_2 = n) z^n g(z)^n = F (z g(z)) = F \left( \frac{(p+q)z}{1-(1-p-q)z} \right).$

The last step is to find $F$. But I hope this clears up your confusion.
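The composition can be sanity-checked numerically. Here I assume (borrowing from the Catalan-number answer above) that $F$ is the hitting-time generating function of the embedded walk with move probabilities $p/(p+q)$ right and $q/(p+q)$ left, and that the number of stationary steps before each move is geometric with success probability $p+q$, so $g(z)=(p+q)/(1-(1-p-q)z)$; $p$ and $q$ are illustrative values.

```python
import math

# Check h(z) = F(z g(z)) numerically, with F the hitting-time gf of
# the embedded walk (assumed Catalan form) and g the waiting-time gf.
p, q = 0.3, 0.45
r = 1 - p - q

def F(w, M=400):
    # F(w) = sum_m C_m p'^m q'^(m+1) w^(2m+1), C_m the Catalan numbers,
    # p' = p/(p+q), q' = q/(p+q): the embedded walk without stays
    pp, qq = p / (p + q), q / (p + q)
    total, cat = 0.0, 1.0  # C_0 = 1
    for m in range(M):
        total += cat * pp ** m * qq ** (m + 1) * w ** (2 * m + 1)
        cat = cat * 2 * (2 * m + 1) / (m + 2)  # C_{m+1} = C_m * 2(2m+1)/(m+2)
    return total

def h(z):
    # closed form from the other answers
    return (1 - r * z - math.sqrt((1 - r * z) ** 2 - 4 * p * q * z * z)) / (2 * p * z)

for z in (0.2, 0.5, 0.8):
    g = (p + q) / (1 - r * z)  # waiting-time gf, geometric(p+q)
    assert abs(h(z) - F(z * g)) < 1e-9
```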

Edit: fixed the confusion between geometric and exponential laws. I also want to stress that finding $F$ is feasible and the method is well known, but that is another matter (besides, I don't think that your problem lies there). I prefer to keep this answer rather short; if necessary I will add something about $F$ later.

  • Oh, excuse me. I mixed up exponential and geometric random variables. However, I definitely worked with discrete times. (2011-12-15)