Let $A$ be a real symmetric positive definite $n \times n$ matrix. Then there exist an orthogonal matrix $O$ and a diagonal matrix $D$ such that $O^T D O = A$, where the rows of $O$ are eigenvectors of $A$ and the diagonal entries of $D$ are the corresponding real eigenvalues of $A$.

Let $x_0$ be some point in $\mathbb{R}^n$ and, for $x \in \mathbb{R}^n$, consider the vector $y = x_0 + O(x-x_0)$.

Then if I have any real twice-differentiable function $u: \mathbb{R}^n \to \mathbb{R}$, why can I write $$u_{x_i} = \sum_{k=1}^n u_{y_k} o_{ki}$$ and $$u_{x_i x_j} = \sum_{k,l=1}^n u_{y_k y_l} o_{ki} o_{lj}$$ where $o_{ki}$ is the entry in the $k$-th row and $i$-th column of the matrix $O$?

Also, what should be the geometric picture I have when I think about this vector $y$, and what part, if any, does the specific choice of $x_0$ play in these calculations?

1 Answer

This has nothing to do with $O$ being orthogonal. If $O$ is any matrix at all, the first equation is true. By the chain rule, we have

$$\frac{\partial u}{\partial x_i} = \sum_k \frac{\partial u}{\partial y_k}\frac{\partial y_k}{\partial x_i}.$$

What is $\frac{\partial y_k}{\partial x_i}$? Since $x_0$ and $O$ are both constant, $Ox_0$ is constant as well, and constant terms vanish under differentiation (so $x_0$ plays no role at all). Expanding $y = x_0 + O(x - x_0)$ componentwise gives $y_k = (Ox)_k + (x_0)_k - (Ox_0)_k = \sum_j o_{kj}x_j + (x_0)_k - (Ox_0)_k$. Taking the partial with respect to $x_i$ gives $\frac{\partial y_k}{\partial x_i} = o_{ki}$.

Plugging this into the above formula for $\frac{\partial u}{\partial x_i}$ gives the first result.
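As a sanity check, the first formula can be compared against a finite-difference derivative of $x \mapsto u(y(x))$. The sketch below is only an illustration: the matrix `O` (deliberately not orthogonal, to match the remark that orthogonality is not needed), the point `x0`, the test function `u`, and all helper names are made-up choices, not part of the question.

```python
import math

# Hypothetical test data (assumptions for illustration only):
# a 2x2 matrix O that is NOT orthogonal, a base point x0,
# and a smooth function u of the y-variables.
O = [[2.0, 1.0],
     [0.5, 3.0]]
x0 = [1.0, -2.0]

def u(y):
    return math.sin(y[0]) * y[1] ** 2

def grad_u(y):
    """Exact partial derivatives u_{y_1}, u_{y_2}."""
    return [math.cos(y[0]) * y[1] ** 2, 2.0 * math.sin(y[0]) * y[1]]

def y_of_x(x):
    """The change of variables y = x0 + O (x - x0)."""
    n = len(x)
    return [x0[k] + sum(O[k][j] * (x[j] - x0[j]) for j in range(n))
            for k in range(n)]

def chain_rule_partial(x, i):
    """Right-hand side of the formula: sum_k u_{y_k} o_{ki} at y(x)."""
    g = grad_u(y_of_x(x))
    return sum(g[k] * O[k][i] for k in range(len(x)))

def finite_difference_partial(x, i, h=1e-6):
    """Central difference of x -> u(y(x)) in the x_i direction."""
    xp, xm = list(x), list(x)
    xp[i] += h
    xm[i] -= h
    return (u(y_of_x(xp)) - u(y_of_x(xm))) / (2.0 * h)

x = [0.3, 0.7]
errors = [abs(chain_rule_partial(x, i) - finite_difference_partial(x, i))
          for i in range(2)]
print(errors)  # both entries should be tiny (finite-difference error only)
```

Changing `x0` changes the point $y(x)$ at which the partials $u_{y_k}$ are evaluated, but not the coefficients $o_{ki}$, which is the sense in which $x_0$ drops out of the derivative of $y$.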

The second result follows from the first result after noting that $o_{ki}$ is constant and replacing $u$ in the previous derivation with $u_{x_i}$.
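Written out, the step is a short computation. This is a sketch that uses only the first formula and the fact that the $o_{ki}$ are constants; the first formula is applied a second time, to the function $u_{y_k}$ in place of $u$:

$$u_{x_i x_j} = \frac{\partial}{\partial x_j}\left(\sum_{k=1}^n u_{y_k} o_{ki}\right) = \sum_{k=1}^n o_{ki} \frac{\partial u_{y_k}}{\partial x_j} = \sum_{k=1}^n o_{ki} \sum_{l=1}^n u_{y_k y_l} o_{lj} = \sum_{k,l=1}^n u_{y_k y_l} o_{ki} o_{lj}.$$

In matrix form, this says that the Hessian in the $x$ variables equals $O^T H O$, where $H$ is the Hessian in the $y$ variables, just as the first formula says that the gradient in $x$ equals $O^T$ times the gradient in $y$.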

  • Thanks for the quick response! I have one question about what you wrote. Why is the expression on the left $\frac{\partial u}{\partial x_i}$ rather than $\frac{\partial (u \circ y)}{\partial x_i}$? (2011-12-21)
  • @user1736: If you were to write out the left side, it would be $\frac{\partial u(y(x))}{\partial x_i}$, but most people write it as I did. (2011-12-21)
  • Oh, I see. I just saw the notation on Wikipedia too. Those two expressions aren't generally the same function though, right? (2011-12-21)
  • @user1736: Well, interpreted literally, you're right - the domain of what I wrote is the "domain of $y$'s" while the domain of the proper form is the "domain of $x$'s", and these domains need not be the same. But $\frac{\partial u}{\partial x_i}$ only makes sense (due to the domain issues) if "$u$", in this context, means "$u \circ y$". In short, it may technically be shoddy notation, but in practice it rarely causes confusion (at least, once you get used to it) and it's used quite commonly. (2011-12-21)
  • Yeah, sorry for the trouble, but I guess I am still not used to it. So the only difference between the two partial derivatives is the extra mapping that changes the domains? As an example, if I look at the set $\{y: \frac{\partial u(y)}{\partial x_i} > 0\}$, then is $\{x: \frac{\partial u(y(x))}{\partial x_i} > 0\}$ just the set of $x$ that $y$ maps into the first set? (2011-12-21)