1
$\begingroup$

I am new to vector derivatives and trying to figure out a lot for my Machine Learning course. I have given the following:

$x \in \mathbb{R}^n$, $y \in \mathbb{R}^d$, $A \in \mathbb{R}^{d \times n}$,

Let $B$ be symmetric (and pos.def.). What is the minimum of

$(Ax -y)^T (Ax -y) + x^T Bx$

with respect to $x$?

How can I approach this problem? I have no clue yet, however, I figured out, that $(Ax -y)^T (Ax -y) + x^T Bx \in \mathbb{R}^1$, thus a real number, isn't it? Hence $B \in \mathbb{R}^{n \times n}$

I looked up some derivation rules and I've got the following:

$\frac{\partial}{\partial x} (Ax -y)^T (Ax -y) + x^T Bx = 2A^T (Ax - y) + (B + B^T) \cdot x$

Setting $2A^T (Ax - y) + (B + B^T) \cdot x$ to $0$:

\begin{align*} 0 & = 2A^T (Ax - y) + (B + B^T) \cdot x\\ 0 & = 2A^T A x - A^T y + (B + B^T) \cdot x\\ A^T y & = 2A^T A x + (B + B^T) \cdot x\\ (B + B^T)^{-1} \cdot A^T y & = 2A^T A x + x\\ \end{align*}

and here I am some what stuck, however, I don't know if this is the right approach

  • 0
    Sounds$ $like$ $a good idea. Unless some similar problems solved during the lectures could indicate what is going on? (Anyway, if I had to guess, I would follow @joriki's suggestion.)2012-04-15

1 Answers 1

1

You are good until the previous-to-last equation, which reads $A^Ty=2A^TAx+(B+B^T)x$, that is $A^Ty=Cx$ where $C$ is the matrix $C=2A^TA+B+B^T$. Hence, if $C$ is invertible, $ x=C^{-1}A^Ty. $ To show that $C$ is invertible, note that $C$ is symmetric hence it is enough to show that the associated quadratic form is definite positive, that is, that $z^TCz\gt0$ for every $z\ne0$. Now, $z^TCz=2z^TA^TAz+z^TBz+z^TB^Tz=2\|Az\|^2+2z^TBz\geqslant2z^TBz, $ and, by hypothesis, $B$ is definite positive hence $z^TBz\gt0$ for every $z\ne0$, which ends the proof.