
I am trying to take the derivative of the following expression, which I need for an optimization problem, and I want to make sure I am on the right track. The expression is:

$ -3 \mathbb E[(w^Tz)^2]^2 $

So my question is, what is:

$ \frac{\partial (-3 \mathbb E[(w^Tz)^2]^2)}{\partial w} = ? $

Please assume that $w$ and $z$ are both 2-dimensional column vectors, that $z$ is a zero-mean, unit-variance (joint) random vector, and that $w$ is deterministic.

I would like a breakdown of the steps for evaluating the derivative here. I half suspect the chain rule is involved, but I am getting thrown off by the presence of the expectation operator.

Thanks!

1 Answer


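Notice that $(w^T z)^2 = w^T z z^T w = w^T (z z^T) w$. Since $w$ is deterministic, it can be pulled out of the expectation: $f(w) = E_z (w^T z)^2 = w^T E_z (z z^T) w$. The derivative, written as a row vector, is $\frac{\partial f(w)}{\partial w} = 2 w^T E_z (z z^T)$. Just as a reminder, in the same row-vector convention $\frac{\partial (w^TAw)}{\partial w} = w^T(A+A^T)$. This becomes $2w^TA$ when $A$ is symmetric, as $E_z(zz^T)$ is. (Written as a column vector, the gradient is the transpose, $(A+A^T)w$.)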

You want the derivative of $\phi(w) = -3 f(w)^2$. By the chain rule, $\frac{\partial \phi(w)}{\partial w} = -6 f(w) \frac{\partial f(w)}{\partial w} = -12 \, (w^T E_z (z z^T) w) \, w^T E_z (z z^T)$.

Now $E_z(zz^T) = I$ because $z$ is zero mean with covariance $I$ (this reads "unit variance" as also meaning that the components of $z$ are uncorrelated). Hence, the final answer is: $-12 \, (w^T w) \, w^T = -12||w||_2^2 w^T$, where $||w||_2$ is the $L_2$ norm of $w$.
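
If you want to sanity-check the closed form numerically, here is a minimal sketch in plain Python. It is only an illustration, not part of the derivation: the function names, the test point `w`, and the step size `h` are my own choices, and it assumes $E_z(zz^T) = I$ as above. It compares $-12||w||_2^2 w$ against a central finite-difference gradient of $\phi(w) = -3\,(w^T E_z(zz^T)\, w)^2$.

```python
def phi(w, sigma):
    """phi(w) = -3 * (w^T Sigma w)^2, where Sigma stands in for E[z z^T]."""
    n = len(w)
    quad = sum(w[i] * sigma[i][j] * w[j] for i in range(n) for j in range(n))
    return -3.0 * quad ** 2


def analytic_grad(w):
    """Closed form for Sigma = I: -12 * ||w||^2 * w."""
    norm_sq = sum(x * x for x in w)
    return [-12.0 * norm_sq * x for x in w]


def numeric_grad(f, w, h=1e-6):
    """Central finite differences, one coordinate at a time."""
    grad = []
    for i in range(len(w)):
        up = list(w)
        down = list(w)
        up[i] += h
        down[i] -= h
        grad.append((f(up) - f(down)) / (2.0 * h))
    return grad


# Assumption: covariance of z is the 2x2 identity, as in the answer.
identity = [[1.0, 0.0], [0.0, 1.0]]
w = [0.7, -1.3]  # arbitrary test point

g_exact = analytic_grad(w)
g_fd = numeric_grad(lambda v: phi(v, identity), w)
max_err = max(abs(a - b) for a, b in zip(g_exact, g_fd))

print("analytic:", g_exact)
print("finite difference:", g_fd)
print("max abs difference:", max_err)
```

The two gradients should agree up to finite-difference error. If your $z$ has a covariance other than $I$, the closed form in `analytic_grad` no longer applies and you would need the general expression $-12 \, (w^T E_z (z z^T) w) \, w^T E_z (z z^T)$ instead.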

  • Not really; it was just simpler to eliminate the integral completely, which reduces the problem to a simpler one. If I let $\phi(w,z) = (w^T z)^2$, then the link above shows that $\frac{\partial E_z(\phi(w,z))}{\partial w} = E_z(\frac{\partial \phi(w,z)}{\partial w})$. The inside 'squaring' remains; the outside 'squaring' is dealt with using the usual product rule. Both squarings result in non-linearities. (2012-08-12)
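
The interchange of derivative and expectation in that comment can also be checked by simulation. The sketch below is a rough illustration under an extra assumption that is not in the question: $z$ is drawn as two independent standard normal components (the question only fixes the mean and variance). It averages $\frac{\partial \phi(w,z)}{\partial w} = 2 (w^T z) z$ over samples and compares the result with $2w$, the gradient of $E_z(\phi(w,z)) = w^T w$. The seed, sample count, and test point are arbitrary.

```python
import random

random.seed(0)  # fixed seed so the run is repeatable
w = [0.7, -1.3]  # arbitrary test point
n_samples = 100_000

# Accumulate the per-sample gradient 2 * (w^T z) * z.
grad_sum = [0.0, 0.0]
for _ in range(n_samples):
    # Assumption: independent standard normal components for z.
    z = [random.gauss(0.0, 1.0), random.gauss(0.0, 1.0)]
    proj = w[0] * z[0] + w[1] * z[1]
    grad_sum[0] += 2.0 * proj * z[0]
    grad_sum[1] += 2.0 * proj * z[1]

mc_grad = [g / n_samples for g in grad_sum]  # estimate of E_z[d phi / d w]
exact_grad = [2.0 * x for x in w]            # d/dw of w^T w

print("Monte Carlo:", mc_grad)
print("exact:", exact_grad)
```

The Monte Carlo estimate only matches $2w$ up to sampling noise, which shrinks roughly like $1/\sqrt{n}$ in the number of samples.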