
Let $X_1, X_2, \ldots$ be independent and identically distributed continuous random variables. Let $N$ be the smallest value of $n$ for which $X_n > X_1$. Show that $P(N > k) = 1/k$ (for $k = 1, 2, \ldots$) and hence that $P(N = k) = 1/[k(k-1)]$. What is $E(N)$?

I have no idea where to start, any hint/sketch of the solution would be much appreciated. Thank you!

Edit: does $E(N)$ even exist? Using the definition gives the harmonic series, which diverges.

  • Paul: For $k$ iid random variables, the probability that the first one is greater than the others is $1/k$, by symmetry. (2017-02-08)
  • But how do I prove this? (2017-02-08)

2 Answers


As Paul commented, you can use symmetry.

If $N>k$, then $X_1$ is the largest of the first $k$ random variables (since the variables are continuous, ties occur with probability zero). So $$ P(N>k) = P(X_1>X_2, X_1>X_3, \ldots, X_1>X_k). $$ By symmetry, each ordering of the random variables is equally probable. There are $k!$ orderings in total and $(k-1)!$ orderings in which $X_1$ is largest, so $$P(N>k) = \frac{(k-1)!}{k!} = \frac{1}{k}.$$ Alternatively, each of the $k$ variables is equally likely to be the largest, and $\sum_{i=1}^k P(\mbox{$X_i$ is largest})=1$, so $P(\mbox{$X_1$ is largest}) = 1/k.$
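Not part of the original answer, but the symmetry claim is easy to check by simulation. A minimal Python sketch (uniform draws stand in for an arbitrary continuous distribution; the function name `tail_prob` is chosen here for illustration):

```python
import random

def tail_prob(k, trials=100_000, seed=1):
    """Monte Carlo estimate of P(N > k), i.e. the event that X_1
    exceeds all of X_2, ..., X_k."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        x1 = rng.random()
        # N > k exactly when none of X_2, ..., X_k exceeds X_1.
        if all(rng.random() <= x1 for _ in range(k - 1)):
            hits += 1
    return hits / trials

for k in (2, 5, 10):
    print(k, tail_prob(k))  # each estimate should be close to 1/k
```

With 100,000 trials the estimates agree with $1/k$ to roughly two decimal places.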

  • Thank you! How do I find $E(N)$? (2017-02-08)
  • @Romeo123 Well, you know $E(N) = \sum_k k P(N=k).$ Can you figure out $P(N=k)$ from knowing $P(N>i)$ for all $i$? (2017-02-08)
  • @Romeo123 Alternatively, there is an expression for $E(N)$ in terms of $P(N>k)$, which you can derive from the identity $N = 1 + \sum_{k=1}^\infty I(N > k)$, where $I$ is the indicator function (after convincing yourself the identity is true). (2017-02-08)

The symmetry argument may already be intuitive. A rigorous argument is as follows:

Let $F$ be the common distribution function of $X_1, X_2, \ldots$. By the definition of the random variable $N$ and the independence assumption (conditioning on $X_1 = x$ in the second line), \begin{align} P[N > k] & = P[X_2 \leq X_1, X_3 \leq X_1, \ldots, X_k \leq X_1] \\ & = \int_{-\infty}^\infty P[X_2 \leq x, \ldots, X_k \leq x] \, dF(x) \\ & = \int_{-\infty}^\infty P[X_2 \leq x]\cdots P[X_k \leq x] \, dF(x) \\ & = \int_{-\infty}^\infty F(x)^{k - 1} \, dF(x) = \int_0^1 u^{k - 1} \, du \\ & = \frac{1}{k}. \end{align}

In the last step, we need the condition that $F$ is continuous so that the change of variables $u = F(x)$ works.
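Not part of the original answer, but the integral identity $\int_{-\infty}^\infty F(x)^{k-1}\,dF(x) = 1/k$ can be checked numerically for a concrete continuous $F$; here a standard exponential, $F(x) = 1 - e^{-x}$, is assumed, and `tail_integral` is a name chosen for this sketch:

```python
import math

def tail_integral(k, n=100_000, upper=40.0):
    """Midpoint-rule approximation of int_0^inf F(x)^(k-1) dF(x)
    for F(x) = 1 - exp(-x); the change of variables says it equals 1/k."""
    h = upper / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * h
        F = 1.0 - math.exp(-x)
        f = math.exp(-x)  # density, so dF(x) = f(x) dx
        total += F ** (k - 1) * f * h
    return total

for k in (2, 3, 7):
    print(k, tail_integral(k))  # each should be close to 1/k
```

Truncating at `upper=40.0` is harmless here since $e^{-40}$ is negligible; the estimates match $1/k$ to several decimal places.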

  • One should probably add that all random variables **have** a density function (since this is not true for general random vectors et cetera). (2017-02-08)
  • @AdamHughes I think a minimal condition would be that $F$ is continuous, in which case we can show (for a proof, see [this](http://math.stackexchange.com/questions/1564584/prove-uniform-distribution/1564909#1564909) post) that $F(X)$ has uniform distribution on $[0, 1]$. The existence of a density might be too strong. (2017-02-08)
  • How does one define $dF$ if not as a density? Also, I'm not sure I see continuity as a weaker condition, as a continuous CDF implies absolute continuity w.r.t. Lebesgue measure, hence existence of the Radon–Nikodym derivative, so the implication is the other direction. (2017-02-08)
  • The notation $dF(x)$ is a shorthand for $F(dx)$, or $\mu(dx)$, where $\mu$ is the measure on $\mathbb{R}^1$ induced by $F$. To use this notation, we do not require that $\mu$ has a density $f$ with respect to the Lebesgue measure. (2017-02-08)
  • I absolutely agree. In fact, as I think about this, I'm pretty sure that this is equivalent to the density's existence. The real issue, I think, is that the reason you can integrate from 0 to 1 without skipping anything is continuity; the change of integrand is density existence. (2017-02-08)