7

I have been looking all over the net to find a way to work out the probability distribution of the maximum of partial sums of independent random variables, but to no avail. So I have decided to try to work it out for myself. Here are the results of this endeavour, and I would appreciate it if people with a better understanding of probability than me would give it a look over to see if I've made a mess of it somewhere. Many thanks. Here it goes.

Given a set $\{X_i:i=0,1,\ldots,n \}$ of independent random variables with $X_0 = 0$ and with given p.d.f.'s $f_{X_i}(x)$ and corresponding c.d.f.'s $F_{X_i}(x)$, define $S_k=\sum_{i=0}^k X_i$ and $M_k=\max\{S_0,S_1,\ldots,S_k \}$; we want to find the distribution of $M_n$. We have

$ \begin{eqnarray*} P(M_n \le m) &=& P(M_{n-1} \le m,\ S_n \le m)\\ &=& P(S_n \le m \mid M_{n-1} \le m)\,P(M_{n-1} \le m), \end{eqnarray*} $

where

$ \begin{eqnarray*} P(S_n \le m \mid M_{n-1} \le m) &=& P(S_n \le m \mid S_{n-1} \le m), \end{eqnarray*} $

using that $(S_n)_{n\ge0}$ is a Markov chain,

where

$ \begin{eqnarray*} P(S_n \le m \mid S_{n-1} \le m) &=& \frac{P(S_{n-1} \le m,\ S_n \le m)}{P(S_{n-1} \le m)} = \frac{1}{F_{S_{n-1}}(m)}\int\limits_{-\infty}^m f_{S_{n-1}}(s)\,F_{X_n}(m-s)\,\mathrm ds, \end{eqnarray*} $

where

$ \begin{eqnarray*} f_{S_k}(s)&=&\int\limits_{-\infty}^{\infty} f_{X_k}(x)\,f_{S_{k-1}}(s-x)\mathrm dx, \end{eqnarray*} $

with

$ \begin{eqnarray*} f_{S_0}(s)=\delta (s), \end{eqnarray*} $

so that putting it all together gives a recursion formula

$ \begin{eqnarray*} F_{M_n}(m) = \frac{F_{M_{n-1}}(m)}{F_{S_{n-1}}(m)} \int\limits_{-\infty}^m f_{S_{n-1}}(s)F_{X_n}(m-s)\mathrm ds \end{eqnarray*} $

with

$ \begin{eqnarray*} F_{M_0}(m) = H(m) \end{eqnarray*} $

the Heaviside step function.
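Since recursions like the one above are easy to get subtly wrong, a direct Monte Carlo estimate of $F_{M_n}(m)$ is a handy reference to compare against. The sketch below is only an illustration and assumes i.i.d. standard normal increments; the function name, sample size and seed are arbitrary choices.

```python
# Minimal Monte Carlo sketch (not the recursion itself): estimate
# F_{M_n}(m) = P(max(S_0, ..., S_n) <= m) directly by simulation,
# so any closed-form or recursive answer can be checked against it.
# Assumption (for illustration only): X_1, ..., X_n are i.i.d. N(0,1).
import numpy as np

def mc_cdf_of_max(n, m, n_paths=200_000, seed=0):
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n_paths, n))     # increments X_1, ..., X_n
    S = np.cumsum(X, axis=1)                  # partial sums S_1, ..., S_n
    M = np.maximum(S.max(axis=1), 0.0)        # include S_0 = 0 in the maximum
    return np.mean(M <= m)                    # empirical F_{M_n}(m)

print(mc_cdf_of_max(n=3, m=1.0))              # compare with the recursion at m = 1
```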

Added 1: Using another approach, I have obtained the following result $ f_{M_n}(m) = f_{M_{n-1}}(m)F_{X_n}(0) + \int\limits_0^m f_{X_n}(m-x)f_{M_{n-1}}(x)\mathrm dx $

which for two normal r.v.'s seems to give a result that agrees with the one put forward in one of the answers by Sasha.

Added 2: Finally I got some free time to look at this problem again and here are my thoughts on it. Once again, I would appreciate any comments.

We begin by considering the joint density $f_{S_1,S_2}(s_1,s_2)$, where $S_1 = X_1$, $S_2 = S_1 + X_2$, $X_1 \sim X_2 \sim X$, and $X_1$ and $X_2$ are independent. We have

$ f_{S_1,S_2}(s_1,s_2) = f_{S_2 \mid S_1}(s_2 \mid s_1)f_{S_1}(s_1) = f_{X_2}(s_2 - s_1)f_{X_1}(s_1) = f_{X}(s_2 - s_1)f_{X}(s_1) $

Next, we have

$ \begin{eqnarray*} F_{M_n}(m)&=&P(M_n \le m) = P(S_1 \le m,\ S_2 \le m,\ \ldots,\ S_n \le m)\\ &=& \int\limits_{-\infty}^{m}\cdots\int\limits_{-\infty}^{m} \prod_{i=1}^n f_X(s_i - s_{i-1})\,\mathrm ds_1\cdots\mathrm ds_n, \end{eqnarray*} $

where $s_0 \equiv 0$. I hope the above notation is clear enough. Now

\begin{eqnarray*} F_{M_n}(m) = \mathbb E[\mathbb I_{ M_n \leq m}] = \prod_{i=1}^n \int\limits_{-\infty}^{\infty} f_X(s_i - s_{i-1})\, \mathbb I_{s_i \leq m}\, \mathrm ds_i. \end{eqnarray*}

The characteristic function $\varphi_{M_n}(t) = \mathbb E[e^{i t M_n}]$ is then

\begin{eqnarray*} \varphi_{M_n}(t) = \prod_{i=1}^n \int\limits_{-\infty}^{\infty} f_X(s_i - s_{i-1})\, e^{i t s_i}\, \mathrm ds_i. \end{eqnarray*}
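To make the iterated-integral representation concrete, here is a small numerical sketch for the first nontrivial case $n=2$, assuming (purely for illustration) standard normal increments; it evaluates the truncated double integral by quadrature.

```python
# Sketch: evaluate F_{M_2}(m) = int_{-inf}^{m} int_{-inf}^{m} f_X(s1) f_X(s2 - s1) ds2 ds1
# (with s_0 = 0), assuming X ~ N(0,1) for illustration.
import numpy as np
from scipy import integrate
from scipy.stats import norm

def cdf_max_n2(m):
    val, _err = integrate.dblquad(
        lambda s2, s1: norm.pdf(s1) * norm.pdf(s2 - s1),  # integrand f(s1) f(s2 - s1)
        -np.inf, m,               # outer variable s1
        lambda s1: -np.inf,       # inner variable s2, lower limit
        lambda s1: m,             # inner variable s2, upper limit
    )
    return val

print(cdf_max_n2(1.0))            # should agree with a Monte Carlo estimate of P(M_2 <= 1)
```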

  • The formula *added after comments* holds in one case and one case only: when $n=1$. (2011-09-30)

4 Answers

5

The step where you assert that $ \mathrm P(S_n\le m\mid M_{n-1}\le m)=\mathrm P(S_n\le m\mid S_{n-1}\le m) $ because $(S_n)_{n\ge0}$ is a Markov chain, is a beautiful example of the subtle errors one can make when using the Markov property.

To see that this equality is false in general, consider an inhomogeneous random walk $(S_n)$, exactly like in your post. Fix a time $n\ge2$, and assume that $X_{n-1}=-1$ or $-2$ but that $X_k=\pm1$ or $0$ for every $0\le k\le n$ such that $k\ne n-1$.

Then, $X_{n-1}\le0$ almost surely hence $[M_{n-1}<1]=[M_{n-2}<1]$. And $X_{n-1}\le-1$ almost surely hence $[M_{n-2}<1]\subseteq[S_{n-2}<1]\subseteq[S_{n-1}<0]$. And $X_n\le1$ almost surely hence, conditionally on $[S_{n-1}<0]$, $S_n<1$ almost surely. This shows that $ \mathrm P[S_n<1\mid M_{n-1}<1]=1. $ On the other hand, $[S_{n-1}=0]$ has positive probability because $n\ge2$, and $[S_{n}=1]$ contains $[S_{n-1}=0]\cap[X_{n}=1]$, hence $ \mathrm P[S_n<1\mid S_{n-1}<1]<1. $ *Added in note*: The key properties of the random walk in this counterexample are that $X_{n-1}\le-x$ and $X_n\le x$ almost surely, for a given positive $x$, and that both events $[S_{n-1}=0]$ and $[X_n\ge1]$ have positive probability.
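A quick simulation sketch of this counterexample (with the concrete, arbitrary choices $n=4$ and uniform probabilities on the allowed values) makes the gap between the two conditionings visible:

```python
# Monte Carlo check: X_{n-1} takes values in {-1, -2}, all other increments in {-1, 0, 1}.
# Then P[S_n < 1 | M_{n-1} < 1] = 1 while P[S_n < 1 | S_{n-1} < 1] < 1.
import numpy as np

rng = np.random.default_rng(0)
n, n_paths = 4, 500_000

X = rng.choice([-1, 0, 1], size=(n_paths, n))        # X_1, ..., X_n
X[:, n - 2] = rng.choice([-1, -2], size=n_paths)     # X_{n-1} only takes -1 or -2

S = np.cumsum(X, axis=1)                             # S_1, ..., S_n
M_prev = np.maximum(S[:, :n - 1].max(axis=1), 0)     # M_{n-1} = max(S_0, ..., S_{n-1})

below_M = M_prev < 1
below_S = S[:, n - 2] < 1                            # the event [S_{n-1} < 1]
target = S[:, n - 1] < 1                             # the event [S_n < 1]

print(target[below_M].mean())                        # equals 1: conditioning on M_{n-1}
print(target[below_S].mean())                        # strictly below 1: conditioning on S_{n-1}
```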

  • Regarding your *what constrain[t]s etc.* comment, this is another problem. If you have another problem you want to put on MSE, write another post. (2011-09-30)
2

If you're interested in the situation for large $n$, then note that if your random variables $X_i$ have finite variance, then as $n \to \infty$ the scaling limit of $S_k$ is a Brownian motion $B_t$. The distribution of $M_t = \max_{0 \le s \le t} B_s$ is known to be the same as that of $|B_t|$, i.e. the absolute value of a normal random variable with mean 0 and variance $t$, and the pdf is $P(M_t \in dx) = \sqrt{\frac{2}{\pi t}} e^{-x^2/2t} dx.$ See, e.g., I. Karatzas and S. Shreve, Brownian Motion and Stochastic Calculus, 2ed, Proposition 2.8.1 (the result quoted above is Problem 2.8.2 but you just have to take the result in Proposition 2.8.1 and integrate out $a$).
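A small simulation sketch of this limit (my own illustration, with $N(0,1)$ increments): rescale the running maximum of a length-$n$ walk by $\sqrt n$ and compare its empirical cdf with $2\Phi(x)-1$.

```python
# For large n, max_k S_k / sqrt(n) should be close in distribution to
# M_1 = max_{0<=s<=1} B_s, whose cdf is 2*Phi(x) - 1 (the |B_1| law quoted above).
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
n, n_paths = 1_000, 20_000

S = np.cumsum(rng.standard_normal((n_paths, n)), axis=1)
M = np.maximum(S.max(axis=1), 0.0) / np.sqrt(n)      # rescaled running maximum

for x in (0.5, 1.0, 2.0):
    # empirical cdf of the rescaled maximum vs. the limiting cdf 2*Phi(x) - 1
    print(x, (M <= x).mean(), 2 * norm.cdf(x) - 1)
```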

1

So, the wrong move in your answer was already pointed out and the solution in terms of joint probabilities was discussed. Let us take a look at this problem from the Markov perspective. As I told you, the process $S_k$ is an inhomogeneous (time-dependent) Markov process: indeed, if you know the current state $S_k = s$ and the current time instant $k$, you know the distribution of $S_{k+1}$ even without knowledge of the whole history.

For the moment let us forget that the process is time-dependent, so we consider just a sum of iid random variables. Let us denote $u_n(m) = \mathsf P_0\{M_n < m\}$. Here $\mathsf P_0$ is the probability conditioned on the fact that you start from $S_0=0$. Due to the fact that $S$ is just a sum of independent increments, if we start not from $S_0=0$ but from $S_0 = s$ we will get $ \mathsf P_s\{M_n < m\} = \mathsf P_0\{M_n < m - s\}. \quad (*) $ That means that knowledge of $\mathsf P_s\{M_n < m\}$ for any $s$ and a fixed $m$ is equivalent to the knowledge of $\mathsf P_0\{M_n < m\}$ for any $m$, and so to knowledge of the distribution of $M_n$. As a shortcut we denote $v_n(s):=u_n(-s)$.

It's more convenient to deal with $\mathsf P_s$ due to the following reason. Suppose you know $v_n(s)$, that is, the probability that starting from $S_0 = s$ the maximum will not exceed the zero level, i.e. just put $m=0$ in $(*)$. How can you now calculate $v_{n+1}(s)$? You start from $s$ and then jump to some point $t$ from which you already know the probability of staying below the zero level, so you just need to apply the law of total probability: $ v_{n+1}(s) = \int\limits_\mathbb R v_n(t)f_X(t-s)\,dt, $ with $v_0(s) = 1-H(s)$. So you get from $(*)$ that $v_n(s) = \mathsf P_0\{M_n<-s\}$ for any $s\in \mathbb R$ and you know the distribution of the maximum.
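Here is a grid-based sketch of this recursion, assuming standard normal jumps (the grid, step count and level $m$ are arbitrary illustrative choices). One practical point: since $v_n(s)=\mathsf P_s\{M_n<0\}$ is zero for $s\ge0$ by its definition, the grid values are reset to zero there after every convolution step before being reused.

```python
# Grid sketch of v_{k+1}(s) = ∫ v_k(t) f_X(t - s) dt, assuming X ~ N(0,1).
import numpy as np
from scipy.stats import norm

s = np.linspace(-15.0, 15.0, 3001)          # grid of starting points
ds = s[1] - s[0]
K = norm.pdf(s[None, :] - s[:, None]) * ds  # kernel f_X(t - s) * dt  (rows: s, cols: t)

v = np.where(s < 0, 1.0, 0.0)               # v_0(s) = 1 - H(s)
n = 2
for _ in range(n):
    v = K @ v                               # v_{k+1}(s) = ∫ v_k(t) f_X(t - s) dt
    v[s >= 0] = 0.0                         # v_k(s) = P_s{M_k < 0} vanishes for s >= 0

m = 1.0
print(np.interp(-m, s, v))                  # v_n(-m) = P_0{M_n < m}, here at m = 1
```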

What will change if you allow $X_i$ to have different distributions, as in your original question? An additional variable will appear, but the idea of $(*)$ remains valid: you still have a sum of independent increments. If you shift the starting point, you can shift the upper bound for the maximum by the same value and the probability will not change. However, what is tricky here is that the starting time matters. Suppose you know $ v_n(0,s) = \mathsf P\{M_n<0\mid S_0=s\} $ where the $0$ in $v_n$ tells us that we are conditioning on the value of $S_k$ at the moment $k=0$. But then to calculate $v_{n+1}(0,s)$ we need to know $v_n(1,s)$, since it will be different due to the difference in the distributions of the jumps. So, $ v_{n+1}(0,s) = \int\limits_\mathbb R v_n(1,t)f_{X_1}(t-s)\,dt \quad (**) $ with the initial condition $ v_0(k,s) = 1-H(s) $ for all $k = 0,1,2,3,\ldots$

For example, to calculate $\mathsf P\{M_n<0\mid S_0=s\}$ your algorithm is (a grid-based numerical sketch follows the list):

  1. you know $v_0(n,s)=1-H(s)$ for any $s$.

  2. from $(**)$ you find $v_1(n-1,s)$;

  3. from the previous step you find $v_2(n-2,s)$ by applying $(**)$.

    <...>

  4. from the previous step you find $v_{n-1}(1,s)$, and applying $(**)$ once more you are done.
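Here is a grid-based sketch of the algorithm above for the inhomogeneous case, following $(**)$. The increment densities $f_{X_k}$ are taken to be $N(0,(1+0.1k)^2)$ purely as a hypothetical example; any densities can be plugged in. As in the iid case, $v(k,s)$ vanishes for $s\ge0$ by definition, so those grid values are zeroed after each step.

```python
# Backward grid recursion (**): v_1(n-1, .), v_2(n-2, .), ..., v_n(0, .).
import numpy as np
from scipy.stats import norm

def f_X(k, x):
    # hypothetical step-dependent increment density: X_k ~ N(0, (1 + 0.1 k)^2)
    return norm.pdf(x, scale=1.0 + 0.1 * k)

n = 4
s = np.linspace(-25.0, 25.0, 2001)                    # grid of states
ds = s[1] - s[0]

v = np.where(s < 0, 1.0, 0.0)                         # step 1: v_0(n, s) = 1 - H(s)
for j in range(n - 1, -1, -1):                        # steps 2, 3, ...
    K = f_X(j + 1, s[None, :] - s[:, None]) * ds      # kernel f_{X_{j+1}}(t - s) * dt
    v = K @ v                                         # apply (**)
    v[s >= 0] = 0.0                                   # v(k, s) = 0 for s >= 0 by definition

# v now approximates v_n(0, s) = P{M_n < 0 | S_0 = s}; shifting the level as in (*)
# gives P{M_n < m | S_0 = 0} = v_n(0, -m).
print(np.interp(-1.0, s, v))                          # approx. P{M_n < 1 | S_0 = 0}
```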

0

Since $\max(S_0, \ldots, S_n) \le m$ holds if and only if $S_0 \le m \land S_1 \le m \land \ldots \land S_n \le m$, we have $ \mathbb{P}(M_n \le m) = \mathbb{P}(S_0 \le m \land S_1 \le m \land \ldots \land S_n \le m) $

That is, the cdf of the maximum is the joint cdf evaluated at a vector with equal components: $ F_{M_n}(m) = F_{S_0,S_1,\ldots,S_n}\left(m,m,\ldots,m\right) $ Your success in dealing with this will depend on the tractability of $F_{S_0,\ldots, S_n}$. Assume $m \ge 0$, so that $0 = S_0 \le m$ is trivially true.

Considering $n=2$ and the case of standard normal $X_i$, the vector $(S_1, S_2)$ follows a bivariate normal distribution with $\mu_1 = \mu_2 = 0$, $\sigma_1 = 1$, $\sigma_2 = \sqrt{2}$, and correlation $\rho = \frac{1}{\sqrt{2}}$.

Using the fact that $\frac{d}{d x} F_{\mathcal{BN}}(x,y) = f\left(\frac{x-\mu_1}{\sigma_1}\right) \Phi\left( \frac{\sigma _1 \left(y-\mu _2\right) - \rho \, \sigma _2 \left(x-\mu _1\right)}{ \sigma _1 \sigma _2 \sqrt{1-\rho ^2}} \right) $, and similarly for the derivative with respect to $y$, we have $ \frac{\mathrm{d}}{\mathrm{d} m} \Phi_{S_1,S_2}\left(m,m\right) = \frac{1}{2} \left( f(m) + \sqrt{2}\, f\left(\frac{m}{\sqrt{2}}\right) \Phi\left(\frac{m}{\sqrt{2}} \right)\right) $ where $f$ is the pdf of the standard normal, and $\Phi$ is its cumulative distribution function. This shows that $\max(S_1, S_2)$ follows an equal-proportion mixture of the standard normal distribution and Azzalini's skew-normal distribution (with scale $\sqrt{2}$ and shape $1$).
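As a quick numerical sanity check (my addition, not part of the derivation above), the cdf implied by this mixture can be compared with a direct simulation of $\max(S_1,S_2)$; the skew-normal component has scale $\sqrt{2}$ and shape $1$ in SciPy's parametrization.

```python
# Compare the cdf of the mixture  (1/2) N(0,1) + (1/2) SkewNormal(scale=sqrt(2), shape=1)
# with a Monte Carlo estimate of P(max(S_1, S_2) <= m) for standard normal increments.
import numpy as np
from scipy.stats import norm, skewnorm

rng = np.random.default_rng(0)
X = rng.standard_normal((500_000, 2))
S1, S2 = X[:, 0], X[:, 0] + X[:, 1]
max_s = np.maximum(S1, S2)

for m in (0.0, 0.5, 1.5):
    mixture_cdf = 0.5 * norm.cdf(m) + 0.5 * skewnorm.cdf(m, a=1, scale=np.sqrt(2))
    print(m, (max_s <= m).mean(), mixture_cdf)   # empirical vs mixture cdf
```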