
I'm currently watching a lecture on Machine Learning at Stanford University. The lecturer defines $f$ to be a mapping from $\mathbb{R}^{m \times n}$ to $\mathbb{R}$. I understand this to be a function that maps a matrix to a scalar value, e.g. the trace or the determinant.

However, he then goes on to say that the derivative of such a function is defined by the derivative of $f$ with respect to each element, which to me would be $\frac{\partial f}{\partial A_{mn}} : \mathbb{R} \to \mathbb{R}$. That doesn't make sense to me: its derivative would therefore have a different mapping than $f$ itself, which can't be true.

My explanation isn't fantastic, so refer to this link: http://www.youtube.com/watch?v=5u4G23_OohI#t=3363s (it will take you straight to the relevant time).

  • 0
Quite grateful that someone added the maths font for me :] Where can I find that? Is it just LaTeX?2011-10-26
  • 0
What do you mean by a different mapping? When you think about the derivative, what it really means is $m \cdot n$ partial derivatives. Because it is useful for the computations we make, we usually list them in an $m \cdot n \times 1$ matrix. But nothing stops you from listing them in a different way; it's just that the matrix he gets is not the same as the one we get...2011-10-26
  • 0
    If you click "edit" you can see the code. This is true of everyone's posts. You can click "edit" to see their code even if you don't have permission to make changes.2011-10-26
  • 0
    @Matthew: Alternatively, if you see a piece of LaTeX you want to know the code for on the site, you can right click on it and choose "Show Source".2011-10-26
  • 0
    @MatthewHaworth: Not that I have any problems with it but why do you ask your question here? As far as I remember there is a Q&A forum for such questions on ml-class.org.2011-11-25

1 Answer


The derivative of a multivariate function is called the Jacobian.

This is the matrix whose entries are the partial derivatives you mentioned: $\frac{\partial f}{\partial A_{mn}}$
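To make this concrete, here is a small numerical check (my own example, not from the lecture): take $f(A) = \operatorname{tr}(AA^T) = \sum_{i,j} A_{ij}^2$, a map from $\mathbb{R}^{m \times n}$ to $\mathbb{R}$. Its matrix of partials works out analytically to $2A$, which is again an $m \times n$ matrix.

```python
import numpy as np

def f(A):
    # f maps an m x n matrix to a scalar: tr(A A^T) = sum of squared entries
    return np.trace(A @ A.T)

def numerical_gradient(f, A, eps=1e-6):
    """Central finite differences: one partial derivative per entry of A."""
    grad = np.zeros_like(A)
    for i in range(A.shape[0]):
        for j in range(A.shape[1]):
            E = np.zeros_like(A)
            E[i, j] = eps
            grad[i, j] = (f(A + E) - f(A - E)) / (2 * eps)
    return grad

A = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])   # a 2 x 3 matrix

G = numerical_gradient(f, A)

print(G.shape)                    # same shape as A: the derivative lives in R^{m x n}
print(np.allclose(G, 2 * A, atol=1e-4))
```

The shape of `G` is exactly the point of the question: $f$ maps $\mathbb{R}^{m \times n}$ to $\mathbb{R}$, but its derivative with respect to $A$ is again an element of $\mathbb{R}^{m \times n}$.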

  • 1
Keep in mind that the Jacobian is an $m \cdot n \times 1$ matrix. The speaker treats these entries as an $n \times m$ matrix, so that is definitely NOT the Jacobian. And this is exactly what confused him; he thought that it is the Jacobian.2011-10-26
  • 0
When the output is $\mathbb{R}$, it's traditional to ignore the last "$\times 1$". So instead of thinking of the derivative of $f$ as an $m \times n \times 1$ "matrix" (actually a 3-tensor), you can just think of it as an $m \times n$ matrix. In the same way, for $f(x)=x^2$ we think of $f'(x)=2x$ as a function, not a $1 \times 1$ matrix.2011-10-26
  • 0
Well, it is exactly this "ignore" part which confused the person asking the question... Also, technically the Jacobian should be an $m \cdot n \times 1$ matrix; he probably thought of it that way.2011-10-26
  • 0
    By "ignore" I meant we usually identify things like $m \times 1$ and $1 \times m$ matrices with $m$ component vectors. In the same way we usually identify the $1 \times 1$ matrix $[4]$ with the number $4$.2011-10-26
  • 0
    Until there is a particular application or use in mind, it's just notation.2011-10-26
  • 1
I've realised my misunderstanding: of course $f$ does map $\mathbb{R}^{m \times n}$ to $\mathbb{R}$, but taking the derivative with respect to $A$ maps it back out to $\mathbb{R}^{m \times n}$ again. It was as simple as that :P2011-10-26