I'm not surprised you don't understand it; it's rather badly written. I believe the definition $P_{ij}=1/\deg(i)$ in the paper is meant to apply only if there is a connection from $i$ to $j$, and otherwise $P_{ij}=0$. With this interpretation, the rest makes sense.
There are two different reasons for $\mathbf y$ to become denormalized. One is that the probability in the nodes without outgoing connections is "lost", since all the transition probabilities for these nodes are $0$ (whereas the transition probabilities for all other nodes add up to $1$). The other is that $\mathbf P$ has been multiplied by $c\lt1$. In both cases, the "missing" probability is supposed to be distributed according to $\mathbf v$.
Now the idea behind the algorithm is that it's not necessary to keep track of where this probability is lost and for which of the two reasons: We know that in the end $\mathbf y$ should be normalized, and we know that what we want to add to make it normalized should be proportional to $\mathbf v$, so we just need to find the coefficient $w$ such that $\mathbf y+w\mathbf v$ is normalized. This is what the second line does.