I am new to vector derivatives and trying to work through some problems for my Machine Learning course. I am given the following:
$x \in \mathbb{R}^n$, $y \in \mathbb{R}^d$, $A \in \mathbb{R}^{d \times n}$,
Let $B$ be symmetric (and pos.def.). What is the minimum of
$(Ax -y)^T (Ax -y) + x^T Bx$
with respect to $x$?
How can I approach this problem? I have no clue yet; however, I did figure out that $(Ax -y)^T (Ax -y) + x^T Bx \in \mathbb{R}$, i.e. a real number. Hence $B \in \mathbb{R}^{n \times n}$, since $x \in \mathbb{R}^n$ and $x^T B x$ must be defined.
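To convince myself of the dimensions, I checked the shapes numerically in NumPy (the sizes $n$, $d$ and the random matrices below are arbitrary, just for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 3, 5  # arbitrary sizes for illustration
x = rng.standard_normal(n)
y = rng.standard_normal(d)
A = rng.standard_normal((d, n))
B0 = rng.standard_normal((n, n))
B = B0 @ B0.T + n * np.eye(n)  # symmetric positive definite by construction

r = A @ x - y                  # Ax - y lives in R^d
value = r @ r + x @ B @ x      # (Ax-y)^T(Ax-y) + x^T B x
print(value.shape)             # expected: () — the objective is a scalar
```

Both terms are sums of squares-like quantities, so for $x \neq 0$ the value is strictly positive here.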
I looked up some differentiation rules and got the following:
$\frac{\partial}{\partial x} \left[ (Ax -y)^T (Ax -y) + x^T Bx \right] = 2A^T (Ax - y) + (B + B^T) x$, and since $B$ is symmetric, $(B + B^T)x = 2Bx$.
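This gradient formula can be verified against central finite differences (random data below is arbitrary; only the symmetry of $B$ matters for this check):

```python
import numpy as np

rng = np.random.default_rng(1)
n, d = 4, 6
A = rng.standard_normal((d, n))
y = rng.standard_normal(d)
B0 = rng.standard_normal((n, n))
B = B0 + B0.T  # symmetric
x = rng.standard_normal(n)

def f(x):
    r = A @ x - y
    return r @ r + x @ B @ x

# Analytic gradient: 2A^T(Ax - y) + (B + B^T)x
grad = 2 * A.T @ (A @ x - y) + (B + B.T) @ x

# Central finite differences, one coordinate direction at a time
eps = 1e-6
num_grad = np.array([(f(x + eps * e) - f(x - eps * e)) / (2 * eps)
                     for e in np.eye(n)])
print(np.allclose(grad, num_grad, atol=1e-5))  # expected: True
```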
Setting $2A^T (Ax - y) + (B + B^T) \cdot x$ to $0$:
\begin{align*} 0 & = 2A^T (Ax - y) + (B + B^T) \cdot x\\ 0 & = 2A^T A x - 2A^T y + (B + B^T) \cdot x\\ 2A^T y & = 2A^T A x + (B + B^T) \cdot x\\ 2A^T y & = \left(2A^T A + B + B^T\right) x\\ \end{align*}
and here I am somewhat stuck; I don't know if this is the right approach.
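One way to sanity-check the stationarity condition numerically: since $B$ is symmetric, $B + B^T = 2B$, so the condition reduces to $(A^T A + B)x = A^T y$, and because $B$ is positive definite and $A^T A$ is positive semidefinite, $A^T A + B$ is invertible. A sketch with random data (all sizes and matrices below are arbitrary, just for illustration):

```python
import numpy as np

rng = np.random.default_rng(2)
n, d = 4, 6
A = rng.standard_normal((d, n))
y = rng.standard_normal(d)
B0 = rng.standard_normal((n, n))
B = B0 @ B0.T + np.eye(n)  # symmetric positive definite by construction

# Stationarity with B symmetric: (A^T A + B) x = A^T y
x_star = np.linalg.solve(A.T @ A + B, A.T @ y)

def f(x):
    r = A @ x - y
    return r @ r + x @ B @ x

# The objective is strictly convex (Hessian 2(A^T A + B) is positive
# definite), so x_star should beat any perturbed point
perturbed = [f(x_star + 0.1 * rng.standard_normal(n)) for _ in range(100)]
print(f(x_star) <= min(perturbed))  # expected: True
```

This suggests the candidate minimizer is $x^* = (A^T A + B)^{-1} A^T y$, though I would still want to justify the convexity argument properly.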