0

I am trying to grok the following equation:

$$ \frac{\partial{\mathbf{h}_t}}{\partial{\mathbf{h}_k}} = \prod_{i = k + 1}^{t}\mathbf{\Theta}^T \text{diag}[\mathbf{\phi'}(\mathbf{h}_{i-1})] $$

Where $\frac{\partial{\mathbf{h}_t}}{\partial{\mathbf{h}_k}}$ is a $2 \times 2$ Jacobian matrix, $\mathbf{\Theta}^T$ is a $2 \times 2$ matrix, and $\text{diag}[\mathbf{\phi'}(\mathbf{h}_{i-1})]$ is also a $2 \times 2$ matrix.

My question is, in what order should this multiplication be executed? I know that matrix multiplication is not commutative in general, and so I am stumped as to what "order" I am supposed to perform the above matrix products in. Does the equation dictate the order? If not, then how does one know?

3 Answers

1

You should consider this to be taken in the order that the index of the product says. So first $i=k+1$, then $i=k+2$, then $i=k+3$ and so on until $i=t$. This expansion occurs left to right, with $i=k+1$ on the far left and $i=t$ on the far right.
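For concreteness, here is a small numpy sketch of that expansion (the $2 \times 2$ weights $\mathbf{\Theta}$, the $\tanh$ activation, and the hidden states are all made-up placeholders, not anything from your actual network):

```python
import numpy as np

# Made-up 2x2 recurrent weights and hidden states, tanh activation.
rng = np.random.default_rng(0)
Theta = rng.standard_normal((2, 2))
phi_prime = lambda h: 1.0 - np.tanh(h) ** 2  # derivative of tanh

k, t = 0, 3
h = [rng.standard_normal(2) for _ in range(t)]  # h_0 .. h_{t-1}

# Expand the product left to right: start from the identity, and
# append each new factor (for i = k+1, k+2, ..., t) on the RIGHT.
J = np.eye(2)
for i in range(k + 1, t + 1):
    J = J @ (Theta.T @ np.diag(phi_prime(h[i - 1])))
```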

  • Yes, however by the "second" iteration, am I placing the matrix product to the left or right of what I already have? :) — 2017-01-17
  • Oh, on the right. Edited to make that clear. — 2017-01-17
  • Thanks - and sorry for the dumb question, but why is it on the right and not the left? — 2017-01-17
  • @Spacey Not dumb at all. It's because that's how the notation is defined. If you want to know why the notation works that way, I would guess it's because it was codified by people who spoke languages that are written left to right. I'm suddenly curious how it works in Hebrew or Arabic. — 2017-01-17
  • Ok - just to make sure I understand: if I say here is matrix A and matrix B, and I want you to multiply them, the "standard" notation for "A times B" is AB and not BA, right? — 2017-01-17
  • Yup. "A multiplied by B" means "AB", not "BA". — 2017-01-17
  • Thanks, I will accept your answer. (Interesting note about the languages.) One last question: the context here is recurrent backpropagation. In that sense, because of the diagonal matrix present, would it not be true that the order doesn't actually matter? (But only for this case.) Thanks again! — 2017-01-17
  • @Spacey Yeah, since each term of this product is diagonal, here they commute. — 2017-01-17
  • Hmm, upon further inspection, the diagonal argument does not seem to apply here after all, since each term of the product is not itself diagonal; therefore the order still matters for this nonetheless. — 2017-01-17
1

Matrix multiplication is associative, so you could take this in "pairs" then go in the proper ordering, right to left. However, since you have some diagonal matrices, multiplying a chain of those is special in that the order doesn't matter.

Here's an extended explanation. We know the following for matrices $A, B,$ and $C$ of suitable sizes (the multiplication makes sense):

$$ (AB)C = A(BC) $$

yet in general, $AB \neq BA$. So, when you have a chain, you can work in pairs - here's an example:

$$ (AB)(CD) = A(BC)D $$

(again, assuming everything is of suitable size). Now let's examine matrix-vector multiplication in a chain.

\begin{align*} ABCD\mathbf{x} &= ABC\mathbf{x'} \\ &= AB\mathbf{x''}\\ &= A\mathbf{x'''} \end{align*}

This is if we use the usual "right-to-left" scheme. Of course, we may also multiply any two adjacent matrices and still arrive at the same product.
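As a quick sanity check, here is the scheme above in numpy (the matrices and the vector are arbitrary placeholders):

```python
import numpy as np

# Arbitrary 2x2 matrices and a vector, just for illustration.
rng = np.random.default_rng(1)
A, B, C, D = (rng.standard_normal((2, 2)) for _ in range(4))
x = rng.standard_normal(2)

full = (A @ B @ C @ D) @ x      # form the whole chain, then apply it
step = A @ (B @ (C @ (D @ x)))  # right to left, one matrix-vector product at a time
assert np.allclose(full, step)  # same result either way
```

Note the right-to-left version only ever forms vectors, never intermediate matrices, which is also why it is the cheaper way to evaluate such a chain.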

The commutativity of diagonal matrix multiplication is very easy to see - take any two diagonal matrices of the same size and multiply both ways. It's neat to see!
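If you want to try it yourself, here is a tiny numpy check (the matrices are made up):

```python
import numpy as np

# Two diagonal matrices commute...
D1 = np.diag([2.0, 3.0])
D2 = np.diag([5.0, 7.0])
assert np.allclose(D1 @ D2, D2 @ D1)  # order is irrelevant

# ...but two generic matrices usually do not.
A = np.array([[1.0, 2.0], [3.0, 4.0]])
B = np.array([[0.0, 1.0], [1.0, 0.0]])
assert not np.allclose(A @ B, B @ A)  # order matters
```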

  • Thanks Sean - interesting - could you please expand on i) why the "proper" order is right to left, and ii) why having the diagonal means that this order doesn't matter? Thanks! — 2017-01-17
  • See my edited answer. — 2017-01-17
  • Sean, thanks - I think I understood that the *order* of multiplication does not matter (thanks to associativity); however, I think what initially confused me was the *placement* of the matrices of the product, expanded. The placement seems to be "standard", in that we always put matrices "to the right" as we expand the product. I guess this is a standard. Once we have *placed* all of them in the proper ordering, then the order of computation doesn't matter due to associativity. Would you agree with my assessment? Thanks! — 2017-01-17
  • No, the ordering still matters, but the grouping doesn't quite matter as much. You must still multiply by order of appearance, but you are allowed to take pairs for associativity purposes. See any linear algebra text. — 2017-01-17
  • Basically, i) always go right to left in the multiplication, and ii) if that is satisfied, then we are at liberty to do arbitrary groupings. — 2017-01-17
0

Matrix multiplication is not commutative, but it is associative. This means that $A \cdot B \cdot C$ may differ from $B \cdot A \cdot C$, but $(A \cdot B) \cdot C$ is always the same as $A \cdot (B \cdot C)$.
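A quick numerical illustration with arbitrary $2 \times 2$ matrices:

```python
import numpy as np

A = np.array([[1.0, 2.0], [3.0, 4.0]])
B = np.array([[5.0, 6.0], [7.0, 8.0]])
C = np.array([[9.0, 1.0], [2.0, 3.0]])

assert np.allclose((A @ B) @ C, A @ (B @ C))  # associative: grouping is free
assert not np.allclose(A @ B @ C, B @ A @ C)  # not commutative: order is not
```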