
I just discovered today that $\mathbf{1}=(1,\ldots,1)$ is an eigenvector with eigenvalue $1$ of $X(X^TX)^{-1}X^T$ for certain $X \in \mathbb{R}^{m\times n}$ where $n \leq m$ and where the first column of $X$ is $\mathbf{1}$. Why is this?

The motivation comes from the statement that the residuals of a linear least squares fit sum to $0$. In other words, given a standard linear model $y = X\beta + \epsilon$, where $\epsilon$ is random error, the least squares estimate of $\beta$ is $\hat \beta = (X^TX)^{-1}X^Ty$ and the fitted values are $\hat y = X\hat \beta = X(X^TX)^{-1}X^Ty$. Note also that $X$ is a design matrix and thus has an intercept column whose elements are all $1$. Then $\mathbf{1}^T(y - \hat y) = 0$.
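
For instance, here is a quick numerical check (a minimal NumPy sketch; the design matrix and response are made up) that the residuals of such a fit do sum to zero:

```python
import numpy as np

rng = np.random.default_rng(0)

m, n = 50, 3  # m observations, n coefficients (including the intercept)
X = np.column_stack([np.ones(m), rng.normal(size=(m, n - 1))])  # intercept column of 1s
beta = rng.normal(size=n)
y = X @ beta + rng.normal(size=m)             # y = X beta + noise

beta_hat = np.linalg.solve(X.T @ X, X.T @ y)  # (X^T X)^{-1} X^T y
y_hat = X @ beta_hat                          # fitted values
print((y - y_hat).sum())                      # ~0, up to floating-point error
```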

Through MATLAB, I inferred that it was because $X(X^TX)^{-1}X^T$ has eigenvector $\mathbf{1}$. I couldn't prove why this is the case.
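
The experiment is easy to reproduce. Here is a NumPy version of the same check (the particular $X$ is arbitrary, apart from its intercept column):

```python
import numpy as np

rng = np.random.default_rng(1)

m, n = 6, 3
X = np.column_stack([np.ones(m), rng.normal(size=(m, n - 1))])  # first column is all 1s

H = X @ np.linalg.inv(X.T @ X) @ X.T  # the matrix X (X^T X)^{-1} X^T
ones = np.ones(m)
print(np.allclose(H @ ones, ones))    # True: H 1 = 1, i.e. 1 is an eigenvector with eigenvalue 1
```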

  • @chaohuang Oops. Fixed it. (2012-08-29)

3 Answers


The first column of $X$ is $Xe$ where $e = (1,0,\ldots,0)^T$; in other words, $\mathbf{1} = Xe$. If $v = Xw$ for some vector $w$, then $X(X^TX)^{-1}X^Tv = X(X^TX)^{-1}(X^TX)w = Xw = v$. Taking $w = e$ shows that $\mathbf{1}$ is an eigenvector with eigenvalue $1$.


$X(X^TX)^{-1}X^Ty$ is the projection of $y$ onto the column space of $X$. If $y = [1\ldots1]^T$, then $y$ lies in the column space of $X$ because the first column of $X$ is $[1\ldots1]^T$. Hence, the projection will be $y$ again. Therefore, $X(X^TX)^{-1}X^Ty = y$, i.e. $y$ is an eigenvector of this projection matrix with eigenvalue $1$.
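
This projection picture can be checked numerically as well. A small NumPy sketch (with a made-up $X$): vectors already in the column space come back unchanged, and projecting twice is the same as projecting once:

```python
import numpy as np

rng = np.random.default_rng(2)

m, n = 8, 3
X = np.column_stack([np.ones(m), rng.normal(size=(m, n - 1))])
H = X @ np.linalg.inv(X.T @ X) @ X.T    # projection onto the column space of X

v = X @ rng.normal(size=n)              # arbitrary vector in the column space of X
print(np.allclose(H @ v, v))            # True: vectors in the column space are fixed

y = rng.normal(size=m)                  # arbitrary vector, generally not in the column space
print(np.allclose(H @ (H @ y), H @ y))  # True: H is idempotent (projecting twice = once)
```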


In typical problems of this kind, $X$ has many more rows than columns. You originally wrote $X \in \mathbb{R}^{m\times n}$ with $m\le n$; that doesn't make sense (the question has since been fixed).

One has $X\in\mathbb{R}^{n\times m}$ and $m\le n$ (you see, I wrote $n\times m$ rather than $m\times n$). Then $X^T X$ is an $m\times m$ matrix, and $m\le n$, so it's a small matrix, maybe as small as $2\times 2$ in some typical applications. If the columns of $X$ are linearly independent then $X^T X$ is invertible, i.e. $(X^T X)^{-1}$ exists. Now let $H=X(X^T X)^{-1}X^T$. The rank of the big $n\times n$ matrix $H$ is the small number $m$.

Now:

  • If $v$ is in the column space of $X$ then $Hv=v$.
  • If $v$ is orthogonal to the column space of $X$ then $Hv=0$.

So there you have two eigenspaces with eigenvalues $0$ and $1$, and since they're orthogonal complements of each other, they span the whole space.

Here's a proof of the two bulleted points. The second is the easier one: suppose $v$ is orthogonal to the column space of $X$. Then clearly $X^T v=0$, and therefore $Hv = X(X^T X)^{-1}X^T v = X(X^T X)^{-1}0 = 0$. Now suppose $v$ is in the column space of $X$. Then $v=Xu$ for some column vector $u$ of (small) length $m$, and $ Hv = HXu = \big(X(X^T X)^{-1} X^T\big) Xu = X(X^T X)^{-1} (X^T X) u = Xu = v. $
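
Both bullet points, and the resulting spectrum of $H$ ($m$ eigenvalues equal to $1$ and $n-m$ equal to $0$), can be verified numerically. A NumPy sketch, using this answer's $n\times m$ convention for $X$:

```python
import numpy as np

rng = np.random.default_rng(3)

n, m = 10, 3                            # n rows, m columns, as in this answer
X = np.column_stack([np.ones(n), rng.normal(size=(n, m - 1))])
H = X @ np.linalg.inv(X.T @ X) @ X.T    # n x n matrix of rank m

# v in the column space of X: Hv = v
v = X @ rng.normal(size=m)
print(np.allclose(H @ v, v))            # True

# v orthogonal to the column space of X: Hv = 0
w = rng.normal(size=n)
r = w - H @ w                           # residual component, orthogonal to col(X)
print(np.allclose(H @ r, 0))            # True

# H is symmetric, so its eigenvalues are real: n - m zeros and m ones
eigvals = np.sort(np.linalg.eigvalsh(H))
print(np.round(eigvals, 10))            # seven 0s followed by three 1s
```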

So if the column $\mathbf{1}$ of $1$s is in the column space of $X$, then $H\mathbf{1}=\mathbf{1}$.