1

Some (hopefully) relevant facts (according to the matrix cookbook)

  • $\frac{\partial}{\partial X} \text{Tr}( \mathbb{F}(X)) = f(X)^T$

    • where $\mathbb{F}(\cdot)$ is a differentiable function of each of the elements of $X$ and $f(\cdot)$ the scalar derivative of $\mathbb{F}(\cdot)$.
  • $\partial (X \circ Y) = (\partial X) \circ Y + X \circ (\partial Y)$

  • $\frac{\partial}{\partial X} \| X \|_F^2 = \frac{\partial}{\partial X} \text{Tr}(XX^T) = 2X$

I have a fixed masking matrix $M$: it has $1$ in certain entries and $0$ elsewhere. I am trying to combine the facts above to get $$ \frac{\partial}{\partial F} \|F \circ M \|_F^2$$ in the hope of setting it equal to the zero matrix and obtaining a closed-form relationship. How can I find a formula for the quantity above, and is a closed-form expression possible?

So far, I've combined the above to get:

$$\frac{\partial}{\partial F}\text{Tr}((F \circ M)(F \circ M)^T) = 2(F\circ M)\cdot((\partial F) \circ M + F \circ (\partial M))$$

But I'm not sure whether this is correct or where to go from here.

If this is too complicated, can't I use the bound in the following question: $$\| A \circ B\|_F \leq \text{Tr}(AB^T) \leq \| A\|_F \| B\|_F$$ and instead of my minimization with my original term, minimize my problem using the bounds provided here?

2 Answers

3

Replace the trace with the double-dot (aka Frobenius) product $$\operatorname{tr}(A^TB)=A:B$$ in your masked function.

Then finding the differential and gradient is simple $$\eqalign{ L &= (M\circ F):(M\circ F) \cr\cr dL &= 2\,(M\circ F):(M\circ dF) \cr &= 2\,(M\circ M\circ F):dF \cr &= 2\,(M\circ F):dF \cr\cr \frac{\partial L}{\partial F} &= 2\,M\circ F \cr\cr }$$ In the 3rd line, I made use of the fact that Frobenius and Hadamard products are mutually commutative, i.e. $$A\circ B:X = A:B\circ X$$
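
As a brief supporting step (using only the elementwise definitions of the two products, with generic matrices $A$, $B$, $X$ of equal size), the identity can be checked by writing both sides as the same sum: $$(A\circ B):X \;=\; \sum_{i,j} A_{ij}\,B_{ij}\,X_{ij} \;=\; A:(B\circ X)$$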

Setting the gradient to zero yields little useful information: the elements of $F$ corresponding to the zero elements of $M$ are unconstrained, while the elements corresponding to the unit elements of $M$ must be zero.
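
The gradient $2\,M\circ F$ can be sanity-checked numerically. The following is only a minimal sketch, assuming NumPy and randomly generated example data; the names `loss`, `analytic_grad` and `numeric_grad` are hypothetical helpers introduced for illustration. It compares the formula against central finite differences and confirms that a matrix vanishing on the support of $M$, such as $F = 1 - M$ (with $1$ the all-ones matrix), is a stationary point.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical example data: a random 0/1 mask M and a random matrix F.
M = (rng.random((4, 5)) < 0.5).astype(float)
F = rng.standard_normal((4, 5))

def loss(F, M):
    """Squared Frobenius norm of the masked matrix, ||F o M||_F^2."""
    return np.sum((F * M) ** 2)

def analytic_grad(F, M):
    """Gradient 2 (M o F); relies on M having only 0/1 entries."""
    return 2.0 * M * F

def numeric_grad(F, M, eps=1e-6):
    """Central finite-difference approximation of d loss / d F."""
    G = np.zeros_like(F)
    for idx in np.ndindex(*F.shape):
        E = np.zeros_like(F)
        E[idx] = eps  # perturb a single entry of F
        G[idx] = (loss(F + E, M) - loss(F - E, M)) / (2 * eps)
    return G

# The formula should agree with finite differences up to rounding.
grad_matches = np.allclose(analytic_grad(F, M), numeric_grad(F, M), atol=1e-6)

# Any F that is zero wherever M is one has zero gradient, e.g. F = 1 - M.
F_star = 1.0 - M
stationary = np.allclose(analytic_grad(F_star, M), 0.0)

print(grad_matches, stationary)
```

Because the loss is quadratic in $F$, the central difference is exact up to floating-point rounding, so a tight tolerance is reasonable here.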

  • 0
    Thanks for your answer! This answer differs from Rodrigo's answer by a factor of A/M. 2017-02-01
  • 1
    I think I understand how you got from $2(M\circ M \circ F)$ to $2(M \circ F)$: $M$ is a *masking* matrix of $1$s and $0$s, and so $M\circ M=M$. Is this correct? 2017-02-01
  • 1
    The mask is composed of the elements $\{0,1\}$ which are unchanged when squared. 2017-02-01
  • 0
    Thanks @greg - In terms of the optimization component of the question, how can I now "solve" for the $F$ which makes that true? How can I "factor out" the $F$ from the $2(M \circ F)$? 2017-02-01
  • 0
    @user79950 Re-read the last sentence of my answer. In more concrete terms, if $M_{ij}=0$ then $F_{ij}$ is unconstrained, and if $M_{ij}=1$ then $F_{ij}=0$. So $F=0$ would be the simplest solution to the problem. Another would be $F=1-M$, where $1$ denotes the all-ones matrix. 2017-02-02
  • 0
    This isn't the only term in my optimization problem; it's one of a couple. 2017-02-02
1

Given $\mathrm A \in \mathbb R^{m \times n}$, we define the cost function

$$f (\mathrm X) := \| \mathrm A \circ \mathrm X \|_{\text{F}}^2 = \sum_{i=1}^m \sum_{j=1}^n a_{ij}^2 x_{ij}^2$$

Differentiating with respect to $x_{kl}$, we obtain $2 a_{kl}^2 x_{kl}$. Hence,

$$\nabla_{\mathrm X} \, f (\mathrm X) = 2 \, \mathrm A \circ \mathrm A \circ \mathrm X$$
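
A short numerical sketch of this formula, assuming NumPy and randomly generated example data (the names `cost` and `grad` are hypothetical helpers for illustration only). It checks the gradient for a general real $\mathrm A$ via a directional derivative, and then checks that for a $0/1$ matrix the extra factor is harmless, since $\mathrm A \circ \mathrm A = \mathrm A$ in that case.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical example data: a general real weight matrix A and a point X.
A = rng.standard_normal((3, 4))
X = rng.standard_normal((3, 4))

def cost(X, A):
    """f(X) = ||A o X||_F^2 = sum_ij a_ij^2 x_ij^2."""
    return np.sum((A * X) ** 2)

def grad(X, A):
    """Gradient 2 (A o A o X), valid for any real A."""
    return 2.0 * A * A * X

# Directional-derivative check along a random direction D:
# (f(X + eps D) - f(X - eps D)) / (2 eps) should equal <grad f(X), D>.
D = rng.standard_normal(X.shape)
eps = 1e-6
numeric = (cost(X + eps * D, A) - cost(X - eps * D, A)) / (2 * eps)
analytic = np.sum(grad(X, A) * D)
general_ok = np.isclose(numeric, analytic, rtol=1e-6, atol=1e-6)

# For a 0/1 matrix B, B o B = B, so the gradient reduces to 2 (B o X).
B = (rng.random((3, 4)) < 0.5).astype(float)
binary_ok = np.array_equal(grad(X, B), 2.0 * B * X)

print(general_ok, binary_ok)
```

The general form $2\,\mathrm A \circ \mathrm A \circ \mathrm X$ is the one to keep if the weights are not restricted to $\{0,1\}$.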