
Let $X$ be an $n \times p$ matrix, and let $\mathbf 1$ be a vector of $1$'s of length $n$.

Why does the following hold (assuming $X'X$ is not singular):

$$\left( X'X \right)^{-1}X' \mathbf 1 = \begin{pmatrix}1\\0\\\vdots\\0\end{pmatrix}.$$

(I know this is basic, but I am not sure how to tackle this.)

Thanks.

  • It doesn't hold... Suppose $X$ is $n\times n$ and invertible; then $(X'X)^{-1}X'\mathbf 1 = X^{-1}\mathbf 1$, assuming $X'$ means $X$ transpose. (2011-12-03)
  • This can't be right. If $n=p$, the left side simplifies to $X^{-1}\mathbf 1$, which is not, in general, equal to the right-hand side. (2011-12-03)
  • Typically $X$ would have many more rows than columns when this notation is used, so supposing that $X$ is $n\times n$ seems silly. (2011-12-03)
  • Are you doing something like this in Matlab? ones(4,3)\ones(4,1) (2011-12-03)
  • Assuming $n>p$ and the rank of $X$ is $p$, so that the inverse actually exists, the vector $\left( X'X \right)^{-1}X'a$ is the $p\times 1$ vector of coefficients in a linear combination of the columns of $X$, and that linear combination is the projection of the $n\times 1$ vector $a$ onto the column space of $X$. Any reasonable answer should bear that in mind. (2011-12-03)

1 Answer


I just realized that Tal Galili must have neglected to tell us that the first column of $X$ is a column whose every entry is $1$. That typically happens in certain kinds of applications, and typically $n\gg p$.

As I said in a comment:

Assuming $n>p$ and the rank of $X$ is $p$, so that the inverse actually exists, the vector $\left( X'X \right)^{-1}X'a$ is the $p\times 1$ vector of coefficients in a linear combination of the columns of $X$, and that linear combination is the projection of the $n\times 1$ vector $a$ onto the column space of $X$. Any reasonable answer should bear that in mind.

So take the $1$ to mean a whole column of scalar $1$s. Then we should get the coefficients of the linear combination of columns of $X$ that gives us a column of $1$s. Since that's the first column of $X$, we get $1,0,0,0,\ldots,0$, i.e. $1$ times the first column plus $0$ times each of the other columns.

That's the only way to do it since the columns of $X$ are linearly independent, as is implicitly stated in the question in the form of presupposing that $X'X$ is invertible.
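
To spell out the algebra (a one-line check, under the same assumption that the first column of $X$ consists entirely of $1$s): writing $e_1 = (1,0,\ldots,0)'$ for the first standard basis vector in $\mathbb R^p$, we have $\mathbf 1 = Xe_1$, so

$$\left( X'X \right)^{-1}X'\mathbf 1 = \left( X'X \right)^{-1}X'Xe_1 = e_1 = \begin{pmatrix}1\\0\\\vdots\\0\end{pmatrix}.$$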

Later note: a more leisurely explanation. The $p\times p$ matrix $X'X$ is invertible only if $X$ has $p$ linearly independent columns, so that must be the case, and it implies $n\ge p$. In typical applications $p$ is fairly small and $n$ is much bigger; in typical applications in statistics, $n$ is the sample size.

Given any $n\times 1$ column vector $Y$, the vector $(X'X)^{-1}X'Y$ is a $p\times 1$ column vector. It is the vector of coefficients of the linear combination of the columns of $X$ that approximates $Y$ as closely as possible; in other words, it is the vector of regression coefficients when $Y$ is regressed on $X$.

If $Y$ happens to be one of the columns of $X$---say the $k$th column---then $Y$ is expressed exactly as a linear combination of the columns of $X$. There is only one way to do that, since the columns are independent: multiply the $k$th column by $1$, the others by $0$, and add them up. If one of the columns is a column of $1$s, that is what happens with that column. If the first column is a column of $1$s, one gets the column vector whose first entry is $1$ and whose other entries are $0$. That's what's happening here.
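
One of the comments above suggests checking this in Matlab; here is an analogous check as a minimal NumPy sketch (the dimensions, seed, and variable names are just for illustration, not anything from the question):

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 8, 3

# A tall design matrix whose first column is the all-ones column.
X = np.column_stack([np.ones(n), rng.standard_normal((n, p - 1))])
ones = np.ones(n)

# Solve (X'X) b = X'1 rather than forming the inverse explicitly.
b = np.linalg.solve(X.T @ X, X.T @ ones)

e1 = np.zeros(p)
e1[0] = 1.0
print(np.allclose(b, e1))  # True: b is the first standard basis vector (1, 0, ..., 0)
```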