
I'm having some difficulties with the following proof:

Only under special circumstances can it be the case that $\rho(X,Y)=\pm1$, and these circumstances are explored by considering the proof of the Cauchy-Schwarz inequality more carefully. Let $a=\operatorname{var}(X)$, $b=2\operatorname{cov}(X,Y)$, $c=\operatorname{var}(Y)$ and suppose that $\rho(X,Y)=\pm1$. Then $\operatorname{var}(X)\operatorname{var}(Y)\neq0$ and $$b^2-4ac=4\operatorname{var}(X)\operatorname{var}(Y)\big[\rho(X,Y)^2-1\big]=0,$$ and so the quadratic equation $$as^2+bs+c=0$$ has two equal real roots, at $s=\alpha$, say. Therefore, ...
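For reference, the bracketed factor comes from simply substituting the definitions of $a$, $b$, $c$ and of $\rho(X,Y)=\operatorname{cov}(X,Y)/\sqrt{\operatorname{var}(X)\operatorname{var}(Y)}$ (a sketch of the algebra, not part of the book's text):

```latex
b^2 - 4ac
  = 4\operatorname{cov}(X,Y)^2 - 4\operatorname{var}(X)\operatorname{var}(Y)
  = 4\operatorname{var}(X)\operatorname{var}(Y)
    \left[\frac{\operatorname{cov}(X,Y)^2}{\operatorname{var}(X)\operatorname{var}(Y)} - 1\right]
  = 4\operatorname{var}(X)\operatorname{var}(Y)\big[\rho(X,Y)^2 - 1\big]
```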

I find the proof overall a bit weird, because I don't know how they came up with it. But I can follow the algebraic steps and such, so in essence I'm ok with the proof.

It continues as follows:

Therefore, $W=\alpha[X-\mathbb E(X)]+[Y-\mathbb E(Y)]$ satisfies $$ \mathbb E(W^2)=a\alpha^2+b\alpha+c=0, $$ giving that $\mathbb P(W=0)=1$, and showing that (essentially) $Y=-\alpha X+\beta$, where $\beta=\alpha\mathbb E(X)+\mathbb E(Y)$. A slightly more careful treatment discriminates between the values $+1$ and $-1$ for $\rho(X,Y)$: $$ \begin{aligned} \rho(X,Y)&=1\quad\text{if and only if }\mathbb P(Y=\alpha X+\beta)=1\text{ for some real }\alpha\text{ and }\beta\text{ with }\alpha>0\\ \rho(X,Y)&=-1\quad\text{if and only if }\mathbb P(Y=\alpha X+\beta)=1\text{ for some real }\alpha\text{ and }\beta\text{ with }\alpha<0.\\ \end{aligned} $$
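The middle equality can be checked by expanding the square and using linearity of expectation (with $a$, $b$, $c$ as defined above, so $a=\operatorname{var}(X)$, $b=2\operatorname{cov}(X,Y)$, $c=\operatorname{var}(Y)$):

```latex
\mathbb E(W^2)
  = \alpha^2\,\mathbb E\big([X-\mathbb E(X)]^2\big)
  + 2\alpha\,\mathbb E\big([X-\mathbb E(X)][Y-\mathbb E(Y)]\big)
  + \mathbb E\big([Y-\mathbb E(Y)]^2\big)
  = a\alpha^2 + b\alpha + c
```

and this is zero because $\alpha$ is the repeated root of $as^2+bs+c=0$.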


So, I can follow everything, except for the last bit where they are giving the conditions for $\rho(X,Y)=1$ and $\rho(X,Y)=-1$.

Say we look at $\rho(X,Y)=1$. It follows that $\operatorname{cov}(X,Y)=\sqrt{\operatorname{var}(X)\operatorname{var}(Y)}$. What else can I use to show that these are indeed the conditions?
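A quick numerical sanity check of the stated conditions (my own sketch using NumPy, not part of the book's proof): if $Y=aX+b$ exactly, then plugging into the definition of $\rho$ gives $\rho(X,Y)=\operatorname{sign}(a)$.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=100_000)

def rho(x, y):
    # sample correlation: cov(X, Y) / sqrt(var(X) var(Y))
    return np.cov(x, y, ddof=1)[0, 1] / np.sqrt(np.var(x, ddof=1) * np.var(y, ddof=1))

# If Y = aX + b exactly, the correlation is sign(a):
print(rho(x, 3.0 * x + 1.0))   # 1.0 up to floating-point error
print(rho(x, -0.5 * x + 2.0))  # -1.0 up to floating-point error
```

The intercept $b$ drops out entirely (centering removes it), and $|a|$ cancels between numerator and denominator, leaving only the sign of $a$.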

  • so then how is $\rho$ defined? (2017-01-30)
  • $$\rho(X,Y)=\frac{\operatorname{cov}(X,Y)}{\sqrt{\operatorname{var}(X)\operatorname{var}(Y)}}\\ \operatorname{cov}(X,Y)=\mathbb E([X-\mathbb E(X)][Y-\mathbb E(Y)])\\ \operatorname{var}(X)=\mathbb E([X-\mathbb E(X)]^2)$$ (2017-01-30)
  • why does the first equality have a little $1$ on top of it? (2017-01-30)
  • Oh, I've removed it now. I first referred to that equality, but when I deleted the reference I forgot to remove the $(1)$ on the $=$ sign. (2017-01-30)
  • @Clarinetist haha. (2017-01-30)
  • What is the part that you are having trouble with? (2017-01-30)

1 Answer


This proof is a paraphrased version of what is in Casella and Berger's Statistical Inference, p. 172.

I use $\mu_{T}$ and $\sigma^2_{T}$ to denote the mean and variance respectively of a random variable $T$, and given another random variable $U$, $\sigma_{TU} = \text{Cov}(T, U)$.

p. 172 approaches this proof using a more direct method that does not use Cauchy-Schwarz: consider $$h(t) = \mathbb{E}\left\{[(X - \mu_X)t + (Y - \mu_Y)]^2\right\}\text{.}$$ After some algebra, it can be shown that $$h(t) = t^2\sigma^2_{X} + 2t\sigma_{XY}+\sigma^2_{Y} = at^2+bt+c\text{.}$$ This is a quadratic in $t$. Notice that $h(t)$ is the expected value of a nonnegative random variable, so $h(t) \geq 0$ for all $t$. Because of this, $h$ has at most one real root, and hence its discriminant is nonpositive; i.e., $$b^2 - 4ac = (2\sigma_{XY})^2-4\sigma^2_{X}\sigma^{2}_{Y} \leq 0\text{.}$$ Dividing by $\sigma_{X}\sigma_{Y} = \sqrt{\sigma^2_{X}\sigma^{2}_{Y}}$ gives $$\dfrac{(2\sigma_{XY})^2}{\sigma_{X}\sigma_{Y}}-4\dfrac{\sigma^2_{X}\sigma^{2}_{Y}}{\sigma_{X}\sigma_{Y}} = 4\sigma_{XY}\cdot\rho(X, Y)-4\sigma_{X}\sigma_{Y}=4\sigma_X\sigma_Y\cdot [\rho(X, Y)^2-1]\leq 0\text{.}$$ Now multiply both sides by $\sigma_X\sigma_Y$ to get $$4\sigma^2_X\sigma^2_Y\cdot[\rho(X, Y)^2-1] \leq 0\text{.}$$ Since $\sigma^2_X\sigma^2_Y > 0$, this forces $$\rho(X, Y)^2 \leq 1\text{.}$$ Equality, $\rho(X, Y)^2 = 1$, i.e. $\rho(X, Y) = \pm 1$, occurs precisely when the discriminant is zero, which is exactly when $h$ has one (repeated) real root: that root is the $\alpha$ in your question.
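A numerical sanity check of this argument (my own sketch, not from Casella and Berger): for any sample of imperfectly correlated data, $h(t)\geq 0$ for every $t$, so the discriminant is nonpositive and $\rho^2 \leq 1$.

```python
import numpy as np

rng = np.random.default_rng(1)
# Two correlated (but not perfectly correlated) variables
x = rng.normal(size=200_000)
y = 0.7 * x + rng.normal(size=200_000)

a = np.var(x)                       # sigma_X^2
b = 2 * np.cov(x, y, ddof=0)[0, 1]  # 2 * sigma_XY (ddof matched to np.var)
c = np.var(y)                       # sigma_Y^2

# h(t) = a t^2 + b t + c should be nonnegative at every t ...
ts = np.linspace(-10, 10, 1001)
print(np.all(a * ts**2 + b * ts + c >= 0))  # True

# ... so the discriminant is nonpositive, i.e. rho^2 <= 1
print(b**2 - 4 * a * c <= 0)                # True
rho = (b / 2) / np.sqrt(a * c)
print(rho**2 <= 1)                          # True
```

With $Y = aX + b$ exactly, the discriminant would instead be zero and $h$ would touch the axis at its single root.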

  • Thank you. It is indeed a different approach from the one in my book, but it helped me make the proof in my book more readable. I have one last question left on the proof in my book; would you mind having a look at it? I'll put it in an edit, which might make it a bit messy, but I see no better alternative. (2017-01-30)
  • @ShaVuklia I'll let you know when I edit my post to answer the rest of your question. My schedule's packed today. (2017-01-31)
  • That is not necessary anymore! I've looked at it from a fresh point of view today, and I understand it now. We've shown that $\rho(X,Y)=\pm 1$ means that $Y$ is a linear function of $X$. If you plug $X$ and $Y=aX+b$ into the definition of $\rho(X,Y)$, you see that $\rho(X,Y)=1$ for $a>0$ and $\rho(X,Y)=-1$ for $a<0$. (2017-02-02)
  • @ShaVuklia Good to hear. If this answer was helpful, feel free to mark it. (2017-02-07)