I am trying to understand "Computing PageRank using Power Extrapolation" by Taher Haveliwala, Sepandar Kamvar, Dan Klein, Chris Manning, and Gene Golub from Stanford University.

The paper I'm referring to can be found here.

I don't understand their Algorithm 1 (Computing $\mathbf y =\mathbf A\mathbf x$):

$\begin{align*} \mathbf y&=c\mathbf P^\top\mathbf x\\ w&=\|\mathbf x\|_1-\|\mathbf y\|_1\\ \mathbf y&=\mathbf y+w\mathbf v \end{align*}$

How does this compute $\mathbf y =(\mathbf P')^\top\mathbf x$?

1 Answer

I'm not surprised you don't understand it; it's rather badly written. I believe the definition $P_{ij}=1/\deg(i)$ in the paper is meant to apply only if there is a connection from $i$ to $j$, and otherwise $P_{ij}=0$. With this interpretation, the rest makes sense.
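
To illustrate that reading of the definition, here is a small hypothetical 3-node example (the adjacency matrix is made up for illustration): each non-dangling row $i$ of $\mathbf P$ puts weight $1/\deg(i)$ on exactly the nodes that $i$ links to, and a dangling node gets an all-zero row.

```python
import numpy as np

# Hypothetical 3-node example of the reading above:
# P[i, j] = 1/deg(i) if there is a link i -> j, and 0 otherwise.
adj = np.array([[0, 1, 1],   # node 0 links to nodes 1 and 2
                [1, 0, 0],   # node 1 links to node 0
                [0, 0, 0]])  # node 2 is dangling (no outgoing links)
deg = adj.sum(axis=1)
nz = deg > 0                 # nodes with at least one out-link
P = np.zeros(adj.shape)
P[nz] = adj[nz] / deg[nz, None]
print(P)
# [[0.  0.5 0.5]
#  [1.  0.  0. ]
#  [0.  0.  0. ]]
```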

There are two different reasons for $\mathbf y$ to become denormalized. One is that the probability in the nodes without outgoing connections is "lost", since all the transition probabilities for these nodes are $0$ (whereas the transition probabilities for all other nodes add up to $1$). The other is that $\mathbf P$ has been multiplied by $c\lt1$. In both cases, the "missing" probability is supposed to be distributed according to $\mathbf v$.

Now the idea behind the algorithm is that it's not necessary to keep track of where this probability is lost or for which of the two reasons: we know that in the end $\mathbf y$ should be normalized, and we know that whatever we add to normalize it should be proportional to $\mathbf v$, so we just need to find the coefficient $w$ such that $\mathbf y+w\mathbf v$ is normalized. Since $\mathbf v$ is normalized and all entries involved are non-negative, $\|\mathbf y+w\mathbf v\|_1=\|\mathbf y\|_1+w$, and requiring this to equal $\|\mathbf x\|_1$ gives exactly the $w$ computed in the second line; the third line then adds the missing probability mass $w\mathbf v$.
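
To make this concrete, here is a minimal NumPy sketch (my own, not from the paper) that checks the claim numerically. Assuming the paper's $\mathbf A=(c\mathbf P'+(1-c)\mathbf e\mathbf v^\top)^\top$, where $\mathbf P'$ is $\mathbf P$ with each dangling row replaced by $\mathbf v^\top$, Algorithm 1 should reproduce the dense product $\mathbf A\mathbf x$ while only ever touching the sparse $\mathbf P$; all variable names here are mine.

```python
import numpy as np

rng = np.random.default_rng(0)
n, c = 5, 0.85

adj = rng.integers(0, 2, size=(n, n))  # random 0/1 link structure
adj[0, :] = 0                          # make node 0 dangling (no out-links)

deg = adj.sum(axis=1)
nz = deg > 0                           # nodes with outgoing links
P = np.zeros((n, n))
P[nz] = adj[nz] / deg[nz, None]        # P[i, j] = 1/deg(i) if link i -> j

v = np.full(n, 1.0 / n)                # personalization vector, ||v||_1 = 1
x = rng.random(n)
x /= x.sum()                           # normalized current iterate

# Dense reference: patch dangling rows with v, then form A explicitly.
P_prime = P.copy()
P_prime[~nz] = v
A = c * P_prime.T + (1 - c) * np.outer(v, np.ones(n))
y_ref = A @ x

# Algorithm 1: A is never formed; only the sparse P is used.
y = c * (P.T @ x)
w = np.abs(x).sum() - np.abs(y).sum()  # mass lost to dangling nodes and to c < 1
y = y + w * v

print(np.allclose(y, y_ref))           # True
```

The single coefficient $w$ silently accounts for both sources of lost mass at once, which is exactly why the algorithm never needs to distinguish between them.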