How can you show that the magnitude of the sample correlation coefficient is at most $1$?

The formula is huge, and I'm not even sure how to approach this. Can anyone point me in the right direction?

Note that this is the sample correlation coefficient:

$r_{xy} = \dfrac{\displaystyle \sum_{i=1}^n (x_i - \bar{x})(y_i - \bar{y})}{(n-1)s_xs_y} = \dfrac{\displaystyle \sum_{i=1}^n (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\displaystyle \sum_{i=1}^{n} (x_i - \bar{x})^2 \displaystyle \sum_{i=1}^{n} (y_i - \bar{y})^2}}$

2 Answers

5

This follows from the Cauchy–Schwarz inequality, which states that for any two vectors $a$ and $b$ in an inner product space,

$$\lvert \langle a, b \rangle \rvert^2 \leq \lvert \langle a, a \rangle \rvert \lvert \langle b, b \rangle \rvert$$

In your case, take the vector $a$ with components $a_i = (x_i-\bar{x})$, the vector $b$ with components $b_i = (y_i-\bar{y})$, and the inner product $\displaystyle \langle a, b \rangle = \sum_{i=1}^n a_i b_i$. Then

$$\langle a, b \rangle = \sum_{i=1}^n a_i b_i = \sum_{i=1}^n (x_i - \bar{x})(y_i - \bar{y})$$

$$\langle a, a \rangle = \sum_{i=1}^n a_i a_i = \sum_{i=1}^n (x_i - \bar{x})^2$$

$$\langle b, b \rangle = \sum_{i=1}^n b_i b_i = \sum_{i=1}^n (y_i - \bar{y})^2$$

Hence, by the Cauchy–Schwarz inequality,

$$\left(\sum_{i=1}^n (x_i - \bar{x})(y_i - \bar{y})\right)^2 \leq \left( \sum_{i=1}^n (x_i - \bar{x})^2 \right) \left( \sum_{i=1}^n (y_i - \bar{y})^2\right)$$

Taking the square root of both sides gives

$$\left|\sum_{i=1}^n (x_i - \bar{x})(y_i - \bar{y})\right| \leq \sqrt{\left( \sum_{i=1}^n (x_i - \bar{x})^2 \right) \left( \sum_{i=1}^n (y_i - \bar{y})^2\right)}$$

Dividing by the right-hand side (which is nonzero whenever $r_{xy}$ is defined), we conclude that

$$|r_{xy}| = \dfrac{\left|\displaystyle \sum_{i=1}^n (x_i - \bar{x})(y_i - \bar{y})\right|}{\displaystyle \sqrt{\left( \sum_{i=1}^n (x_i - \bar{x})^2 \right) \left( \sum_{i=1}^n (y_i - \bar{y})^2\right)}} \leq 1$$
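If it helps to see the bound numerically, here is a minimal sketch that computes $r_{xy}$ from the centered sums and checks $|r_{xy}| \leq 1$ on random data. The function name `sample_correlation` and the random test setup are my own choices for illustration, and the small tolerance only allows for floating-point rounding.

```python
import math
import random

def sample_correlation(xs, ys):
    """Sample correlation r_xy computed from the centered sums (illustrative helper)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    a = [x - x_bar for x in xs]  # a_i = x_i - x_bar
    b = [y - y_bar for y in ys]  # b_i = y_i - y_bar
    s_ab = sum(ai * bi for ai, bi in zip(a, b))  # <a, b>
    s_aa = sum(ai * ai for ai in a)              # <a, a>
    s_bb = sum(bi * bi for bi in b)              # <b, b>
    return s_ab / math.sqrt(s_aa * s_bb)

random.seed(0)
rs = []
for _ in range(1000):
    n = random.randint(2, 20)
    xs = [random.gauss(0, 1) for _ in range(n)]
    ys = [random.gauss(0, 1) for _ in range(n)]
    rs.append(sample_correlation(xs, ys))

# Largest |r| seen over all trials; should not exceed 1 beyond rounding error.
max_abs_r = max(abs(r) for r in rs)
print(max_abs_r)
```

A numerical check like this is no substitute for the proof, but it is a quick sanity test of the formula.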

EDIT

Proof of the Cauchy–Schwarz inequality:

First note that if the vector $b$ is zero, then the inequality is trivially satisfied since both sides are zero. Hence, we can assume that $b \neq 0$. Now look at the component of $a$ orthogonal to $b$, i.e.

$$c = a - \dfrac{\langle a, b \rangle}{\langle b, b \rangle} b, \quad \text{so that} \quad a = c + \dfrac{\langle a, b \rangle}{\langle b, b \rangle} b$$

You can check that $c$ is orthogonal to $b$ by computing

$$\langle c, b \rangle = \langle a,b \rangle - \dfrac{\langle a, b \rangle}{\langle b, b \rangle} \langle b, b \rangle = \langle a,b \rangle - \langle a,b \rangle = 0$$

You can also check that $\langle c, \alpha b \rangle = 0 = \langle \beta c, b \rangle$ for any scalars $\alpha$ and $\beta$.

We now have that

\begin{align}
\langle a,a \rangle & = \left \langle c + \dfrac{\langle a, b \rangle}{\langle b, b \rangle} b, c + \dfrac{\langle a, b \rangle}{\langle b, b \rangle} b \right \rangle\\
& = \langle c,c \rangle + \left \langle c,\dfrac{\langle a, b \rangle}{\langle b, b \rangle} b \right \rangle + \left \langle \dfrac{\langle a, b \rangle}{\langle b, b \rangle} b, c \right \rangle + \left \langle \dfrac{\langle a, b \rangle}{\langle b, b \rangle} b, \dfrac{\langle a, b \rangle}{\langle b, b \rangle} b \right \rangle\\
& = \langle c,c \rangle + \left \lvert \dfrac{\langle a, b \rangle}{\langle b, b \rangle} \right \rvert^2 \langle b, b \rangle = \langle c,c \rangle + \dfrac{\left \lvert \langle a, b \rangle \right \rvert^2}{\langle b, b \rangle}
\end{align}

where the two cross terms vanish because $c$ is orthogonal to every scalar multiple of $b$. Now $\langle c,c \rangle \geq 0$, which gives

$$\langle a,a \rangle \geq \dfrac{\left \lvert \langle a, b \rangle \right \rvert^2}{\langle b, b \rangle}$$

Multiplying both sides by $\langle b, b \rangle > 0$, we get what we want, namely

$$\lvert \langle a, b \rangle \rvert^2 \leq \lvert \langle a, a \rangle \rvert \lvert \langle b, b \rangle \rvert$$
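The decomposition used in this proof can also be checked numerically. The sketch below is only an illustration for real vectors with the dot product: the helper names `inner` and `orthogonal_part` are made up for this example, and the random vectors are arbitrary.

```python
import random

def inner(u, v):
    """Dot product of two real vectors given as lists."""
    return sum(ui * vi for ui, vi in zip(u, v))

def orthogonal_part(a, b):
    """Return c = a - (<a,b>/<b,b>) b, assuming b is nonzero."""
    t = inner(a, b) / inner(b, b)
    return [ai - t * bi for ai, bi in zip(a, b)]

random.seed(1)
a = [random.uniform(-1, 1) for _ in range(5)]
b = [random.uniform(-1, 1) for _ in range(5)]
c = orthogonal_part(a, b)

# <a,a> should equal <c,c> + <a,b>^2 / <b,b>, up to rounding error.
lhs = inner(a, a)
rhs = inner(c, c) + inner(a, b) ** 2 / inner(b, b)

print(abs(inner(c, b)))  # close to 0: c is orthogonal to b
print(abs(lhs - rhs))    # close to 0: the identity from the proof
```

Since $\langle c, c \rangle$ is a sum of squares, dropping it from `rhs` can only make the right-hand side smaller, which is exactly the inequality.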

1

Let $x=(x_1,\ldots,x_n), y=(y_1,\ldots,y_n), x' = (\bar x,\ldots,\bar x)$ and $y' = (\bar y,\ldots,\bar y)$. Using this, we can simplify notation by writing

$$|r_{xy}|=\frac{\left|\sum\limits_{i=1}^n(x_i-\bar x)(y_i-\bar y)\right|}{\sqrt{\sum\limits_{i=1}^n(x_i-\bar x)^2\sum\limits_{i=1}^n(y_i-\bar y)^2}}\leq\frac{\sum\limits_{i=1}^n|x_i-\bar x||y_i-\bar y|}{\sqrt{\sum\limits_{i=1}^n(x_i-\bar x)^2\sum\limits_{i=1}^n(y_i-\bar y)^2}}=\frac{\|(x-x')(y-y')\|_1}{\|x-x'\|_2\|y-y'\|_2}$$

where $(x-x')(y-y')$ is the componentwise product, $\|\cdot \|_1$ denotes the $L^1$ norm and $\|\cdot \|_2$ denotes the $L^2$ norm. The first inequality is the triangle inequality. We then only need that

$$\|(x-x')(y-y')\|_1\leq\|x-x'\|_2\|y-y'\|_2$$

which follows from the Cauchy-Schwarz inequality.
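To spell out that last step, here is one way to see it, writing $u = x - x'$ and $v = y - y'$ (shorthand introduced only for this remark) and applying Cauchy-Schwarz to the vectors of absolute values $(|u_1|,\ldots,|u_n|)$ and $(|v_1|,\ldots,|v_n|)$:

$$\|uv\|_1 = \sum_{i=1}^n |u_i|\,|v_i| \leq \sqrt{\sum_{i=1}^n |u_i|^2}\,\sqrt{\sum_{i=1}^n |v_i|^2} = \|u\|_2\,\|v\|_2$$

Taking absolute values does not change the $L^2$ norms, which is why the right-hand side is still $\|x-x'\|_2\|y-y'\|_2$.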

  • @ErickWong Yes, of course. Sometimes my thoughts run together. (2012-06-15)