1

Suppose $J(\mathbf{A})$ is defined as follows: $$J=\text{tr}(\log \mathbf{P}), \qquad \mathbf{P}=\frac{e^\mathbf{A}}{\mathbf{1} \mathbf{1}' e^\mathbf{A}}$$

where division, exp and log are taken pointwise, $\mathbf{1}$ is a column vector of ones and $\mathbf{A}$ is square. What's the easiest way of showing that $\mathbf{I}-\mathbf{P}$ is the gradient of $J(\mathbf{A})$?

  • 0
    Gradient with respect to A, added clarification. 2012-11-30

1 Answer

3

$\def\tr{\operatorname{tr}}$Just computing, I'd say. Writing $P = (p_{ij})$, $A = (a_{ij})$ and $d$ for the dimension, we have \begin{align*} p_{ij} &= \frac{\exp a_{ij}}{\sum_{k=1}^d \exp a_{kj}}\\ \log p_{ij} &= a_{ij} - \log \sum_{k=1}^d \exp a_{kj}\\ J(A) &= \tr\log P = \sum_{l=1}^d \log p_{ll}\\ &= \tr A - \sum_{l=1}^d \log \sum_{k=1}^d \exp a_{kl} \end{align*}

The entry $a_{ij}$ appears in $\tr A$ only when $i = j$, and in the double sum only in the term $l = j$. So we have \begin{align*} \partial_{a_{ii}}J(A) &= 1 - \frac 1{\sum_{k=1}^d \exp a_{ki}}\cdot \exp a_{ii}\\ &= 1 - p_{ii}\\ \text{and for $i\ne j$:}\\ \partial_{a_{ij}}J(A) &= -\frac 1{\sum_{k=1}^d \exp a_{kj}}\cdot \exp a_{ij}\\ &= -p_{ij} \end{align*}

So we have $\nabla J(A) = \mathrm{id} - P(A)$.
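The result can also be sanity-checked numerically. The following is a minimal sketch, not part of the derivation: it assumes NumPy, and the helper names `softmax_columns`, `J` and `numerical_gradient`, the random $4\times 4$ test matrix and the step size `h` are all illustrative choices. It compares $\mathrm{id} - P(A)$ with a central-difference approximation of the gradient.

```python
import numpy as np

def softmax_columns(A):
    # P[i, j] = exp(A[i, j]) / sum_k exp(A[k, j]), i.e. a column-wise softmax.
    # Subtracting each column's maximum leaves P unchanged and avoids overflow.
    E = np.exp(A - A.max(axis=0, keepdims=True))
    return E / E.sum(axis=0, keepdims=True)

def J(A):
    # J(A) = tr(log P), with the logarithm taken entrywise.
    return np.trace(np.log(softmax_columns(A)))

def numerical_gradient(f, A, h=1e-6):
    # Central differences, perturbing one entry of A at a time.
    G = np.zeros_like(A)
    for i in range(A.shape[0]):
        for j in range(A.shape[1]):
            step = np.zeros_like(A)
            step[i, j] = h
            G[i, j] = (f(A + step) - f(A - step)) / (2 * h)
    return G

# Arbitrary test matrix (an assumption for illustration only).
rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))

analytic = np.eye(4) - softmax_columns(A)   # the claimed gradient id - P(A)
numeric = numerical_gradient(J, A)
max_abs_error = np.max(np.abs(analytic - numeric))
print(max_abs_error)
```

If the formula is right, the printed discrepancy should be small, limited only by the finite-difference and floating-point error.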