
Consider a time-homogeneous Markov chain $\{X_n\}_{n=0}^\infty$ with state space $S=\{0,1,2\}$ and the transition probability matrix $$\begin{pmatrix} 1 & 0 & 0 \\ \alpha & \beta & \gamma \\ 0 & 0 & 1 \end{pmatrix},$$ where $\alpha,\beta,\gamma>0$ and $\alpha+\beta+\gamma=1$. Note that states 0 and 2 are absorbing.

Let $ T=\min\{n\geq 0\mid X_n=0\textrm{ or }X_n=2\} $ be the time of absorption of the process. It is intuitively true that $$ P(X_T=0\mid X_1=1)=P(X_T=0\mid X_0=1)\tag{*} $$ which is the key point of the so-called "first step analysis"; see, for instance, Chapter 3 of Karlin and Pinsky's Introduction to Stochastic Modeling. But the book does not bother to give a proof of it.

Here is my question:

How can one prove (*) using the definition of conditional probability and the Markov property?
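For a concrete sanity check, here is a short Monte Carlo sketch (with illustrative parameters $\alpha=0.2$, $\beta=0.5$, $\gamma=0.3$, chosen by me, not taken from the book) comparing the two conditional probabilities against the first-step-analysis value $\alpha/(\alpha+\gamma)$:

```python
import random

# Illustrative parameters (hypothetical): alpha + beta + gamma = 1.
ALPHA, BETA, GAMMA = 0.2, 0.5, 0.3

def step(rng):
    """One transition out of state 1, per the middle row of the matrix."""
    u = rng.random()
    return 0 if u < ALPHA else (1 if u < ALPHA + BETA else 2)

def run(rng):
    """Simulate from X_0 = 1 until absorption; return (X_1, X_T)."""
    x1 = step(rng)
    x = x1
    while x == 1:
        x = step(rng)
    return x1, x

rng = random.Random(0)
runs = [run(rng) for _ in range(200_000)]

# Estimate P(X_T = 0 | X_0 = 1): every run starts at 1.
p_given_x0 = sum(xt == 0 for _, xt in runs) / len(runs)

# Estimate P(X_T = 0 | X_1 = 1): only state 1 leads to state 1, so
# {X_1 = 1} forces X_0 = 1, and restricting to these runs samples that conditional law.
hits = [xt for x1, xt in runs if x1 == 1]
p_given_x1 = sum(xt == 0 for xt in hits) / len(hits)

exact = ALPHA / (ALPHA + GAMMA)  # solves u = alpha + beta*u (first step analysis)
print(p_given_x0, p_given_x1, exact)
```

Both estimates should agree with each other and with $\alpha/(\alpha+\gamma)$ up to sampling error, which is what ($*$) predicts.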

  • Is anything known about the initial distribution of $X_0$? (2017-02-08)
  • @Blaza: No; I believe it is irrelevant. (2017-02-11)

3 Answers


Here is a formal proof. For any sequence ${\bf x}=(x_0,x_1,x_2,\dots)$ in $\{0,1,2\}^{\mathbb{N}}$ define $$H({\bf x})=\sum_{n=0}^\infty \,\prod_{j=0}^{n-1} {\bf 1}[x_j=1]\,{\bf 1}[x_n=0],$$ so that $H({\bf x})=1$ if the sequence $\bf x$ hits "0" before hitting "2", $H({\bf x})=0$ otherwise. Note that an initial value of "1" can be dropped, i.e., $$H(1,x_1,x_2,\dots )=H(x_1,x_2,\dots).\tag1$$

Since $\mathbb{P}(X_0=1\mid X_1=1)=1$, under the measure $\mathbb{P}(\,\cdot \mid X_1=1)$ equation (1) gives
$$H(X_0(\omega), X_1(\omega),\dots)=H(X_1(\omega), X_2(\omega),\dots),$$ almost surely. Therefore, $$E(H(X_0,X_1,\dots)\mid X_1=1)= E(H(X_1,X_2,\dots)\mid X_1=1)= E(H(X_0,X_1,\dots)\mid X_0=1),$$ where the last equality holds because $(X_n)$ is a time-homogeneous Markov process.

That is, $$P(X_T=0, T<\infty\mid X_1=1)=P(X_T=0, T<\infty\mid X_0=1).$$
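As a quick mechanical check of the shift identity (1), one can implement $H$ on finite prefixes (a sketch; the truncation returns 0 for prefixes that are never absorbed, which does not affect the identity):

```python
from itertools import product

def H(xs):
    """1 iff the sequence stays in state 1 until it hits state 0;
    0 if it hits state 2 first, or is not absorbed within this finite prefix."""
    for x in xs:
        if x == 0:
            return 1
        if x != 1:  # hit state 2 before hitting 0: the product in H vanishes
            return 0
    return 0

# Equation (1): a leading "1" can be dropped without changing H.
assert all(H((1,) + xs) == H(xs) for xs in product((0, 1, 2), repeat=6))
print("shift identity (1) verified on all prefixes of length 6")
```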

  • Thank you for your answer. Only the time-homogeneity property is used in this proof, isn't it? It seems that if one only wants to prove (*), the Markov property is not needed. (2017-02-11)
  • @Jack You are right. But the really strong condition I used is $\mathbb{P}(X_0=1\mid X_1=1)=1$; this would not hold for most other Markov chains. (2017-02-11)

Here is a proof I learned from a note on "first step analysis"; it is essentially the same as Byron's answer.

Consider the function $f:\{0,1,2\}\to\{0,1\}$ with
$$ f(i)=\begin{cases}1, &i=0\\ 0,&i\neq 0. \end{cases} $$

Note that $f(X_0)=\cdots=f(X_{T-1})=0$ because $X_0,\cdots,X_{T-1}$ are not absorbing states. On the other hand, by the time-homogeneity property (well, yes, this step is a little too slick), $$ E(f(X_1)+f(X_2)+\cdots+f(X_T)\mid X_1=1)=E(f(X_0)+f(X_1)+\cdots+f(X_T)\mid X_0=1). $$ But the LHS is $$E(f(X_T)\mid X_1=1)=P(X_T=0\mid X_1=1)$$ and the RHS is $$E(f(X_T)\mid X_0=1)=P(X_T=0\mid X_0=1).$$


Note that the event $\{X_T=0\}$ can be decomposed as a union of disjoint events $$ A_0:=\{X_0=0\},\qquad A_k:=\{X_0\neq 0,\cdots,X_{k-1}\neq 0,X_k=0\},\quad k\geq 1. $$ Now we claim the following, which implies ($*$): summing (1) and (2) over $k$ gives $P(X_T=0\mid X_1=1)=\sum_{k\geq 0}P(A_k\mid X_1=1)=\sum_{k\geq 0}P(A_k\mid X_0=1)=P(X_T=0\mid X_0=1)$.

\begin{align} \tag{1} P(A_0\mid X_1=1)&=0 \\ \tag{2} P(A_k\mid X_1=1)&=P(A_{k-1}\mid X_0=1), \quad k\geq 1 \end{align}

(1) is trivial. Now we show (2). For $k=1$, $$ P(X_0\neq 0,X_1=0\mid X_1=1)=0=P(X_0=0\mid X_0=1). $$ For $k>1$, \begin{align} &P(X_0\neq 0,X_1\neq 0,\cdots,X_{k-1}\neq 0,X_k=0\mid X_1=1)\\ =&\begin{cases} P(X_2=0\mid X_1=1)&k=2\\ P(X_2\neq 0,\cdots,X_k=0\mid X_1=1)&k>2 \end{cases}\tag{3}\\ =&\begin{cases} P(X_1=0\mid X_0=1)&k=2\\ P(X_1\neq 0,\cdots,X_{k-1}=0\mid X_0=1)&k>2 \end{cases}\tag{4}\\ =&P(A_{k-1}\mid X_0=1) \end{align} where in (3) we use the fact that $\{X_1=1\}\subset\{X_0\neq 0,X_1\neq 0\}$ and in (4) we apply the Markov property together with time homogeneity.
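As a consistency check, (2) together with the explicit chain gives $P(A_k\mid X_0=1)=\beta^{k-1}\alpha$ for $k\geq 1$ (stay in state 1 for $k-1$ steps, then jump to 0), and summing the geometric series recovers the familiar first-step-analysis value:

$$ P(X_T=0\mid X_0=1)=\sum_{k=1}^{\infty}\beta^{k-1}\alpha=\frac{\alpha}{1-\beta}=\frac{\alpha}{\alpha+\gamma}. $$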