
I'm trying to rewrite $v (u x)$, where $v$ and $u$ are row vectors and $x$ is a column vector, as some expression $M x$ (or $\bar{v}x$, etc.). The motivation is that I'm trying to compute the gradient (and $\nabla^{'}_x [ v (u x) ] \cdot x$ should yield a row vector).

Intuitively, since the expression $v (ux)$ can be re-written as

$[v_1 (ux), v_2(ux), \cdots, v_n(ux)]$

(where $v_i$ is a scalar), this quantity can be computed. However, doing this and substituting into a larger expression is pretty obnoxious and I'd like to do it using standard matrix operations (non element-wise) if possible.

I know this has something to do with tensors, but I don't understand how it all fits together. Any help is appreciated.

1 Answer

If by $Mx$ you mean that $M$ is a matrix, then you cannot write $v(ux)$ as $Mx$. Why? A matrix represents a linear operation that sends column vectors to column vectors, but your operation sends the column vector $x$ to a row vector. What you can do is write

$ v(ux) = \left((v^T\otimes u)x\right)^T $

where $v^T\otimes u$ is the tensor product of the column vector $v^T$ and the row vector $u$. It can be represented by the matrix $M$ with entries $M_{ij} = v_iu_j$, which is the outer product $M = v^Tu$. So you multiply $x$ by this matrix and then take the transpose.
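As a quick sanity check of this identity, here is a minimal sketch in plain Python with no libraries. The numeric values of `v`, `u`, and `x` are made up for illustration, and the vectors are stored as flat lists, so the final transpose only changes how the result is laid out, not its entries.

```python
# Hypothetical example values; any u and x of matching length work.
v = [1.0, 2.0, 3.0]   # row vector v
u = [4.0, 5.0, 6.0]   # row vector u
x = [7.0, 8.0, 9.0]   # column vector x, stored as a flat list

# Left-hand side v(ux): the scalar ux scales every entry of v.
ux = sum(u_j * x_j for u_j, x_j in zip(u, x))
lhs = [v_i * ux for v_i in v]

# Right-hand side: build M with M[i][j] = v_i * u_j, then apply it to x.
M = [[v_i * u_j for u_j in u] for v_i in v]
rhs = [sum(m_ij * x_j for m_ij, x_j in zip(row, x)) for row in M]

# Both sides have the same entries, up to floating-point tolerance.
assert all(abs(a - b) < 1e-9 for a, b in zip(lhs, rhs))
```

The list `rhs` holds the entries of $Mx$; reading it as a row instead of a column is exactly the transpose in the formula.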

Then taking the gradient is as simple as taking the gradient of $(Mx)^T$.
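
To spell that step out, here is a sketch in index notation. It assumes the convention that entry $(i,j)$ of the derivative of a vector-valued $f$ is $\partial f_i / \partial x_j$; this convention is an assumption, since the question does not fix one. The $i$-th entry of $v(ux)$ is

$ [v(ux)]_i = v_i \sum_j u_j x_j = \sum_j M_{ij} x_j $

and differentiating with respect to $x_j$ gives

$ \frac{\partial}{\partial x_j} [v(ux)]_i = v_i u_j = M_{ij} $

So under that convention the derivative of $v(ux)$ with respect to $x$ is the constant matrix $M$. The transpose only changes whether the value $Mx$ is laid out as a row or as a column; it does not change the partial derivatives.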