Let $I_i$ be the indicator random variable denoting whether the ith point in $S$ is pareto optimal. Then the number of pareto-optimal points in $S$ is $M = \sum_{i=1}^{n} I_i$ and $E[M] = \sum_{i=1}^n E[I_i]$ by the linearity of expectation. Since $I_i$ is a binary valued random variable, we also have $E[I_i] = P(I_i = 1)$. Putting this together, we get $E[M] = \sum_{i=1}^n P(I_i = 1)$.
To evaluate $P(I_i=1)$, we need to find out the probability that there is no point above and to the right of the ith point. This can be done by first conditioning on the ith point's location within the unit square and then demanding that all the remaining $(n-1)$ points not lie in the smaller rectangle above and to its right. If the ith point is $(x_i,y_i)$, we want none of the remaining $(n-1)$ points in the rectangle of sides $1-x_i$ and $1-y_i$. The probability that a random point lies in this rectangle is simply its area and the probability that none of the $(n-1)$ points lie in this rectangle is therefore $(1-(1-x_i)(1-y_i))^{n-1}$.
\begin{eqnarray}
P(I_i=1 \mid \text{ith pt} = (x_i,y_i)) &=& (1-(1-x_i)(1-y_i))^{n-1} \\
P(I_i=1) &=& \int_0^1 \int_0^1 (1-(1-x_i)(1-y_i))^{n-1} dx_i dy_i \\
&=& \int_0^1 \int_0^1 (1-uv)^{n-1} du dv
\end{eqnarray}
This integral is pretty easy to compute. By holding $v$ constant, we can first evaluate the inner integral as 
$\int_0^1 (1-uv)^{n-1} du = \frac{1}{nv} (1-(1-v)^{n})$
Making the substitution $v \rightarrow 1-v$, we get 
\begin{eqnarray}
P(I_i = 1) &=& \frac{1}{n} \int_0^1 \frac{1-v^n}{1-v} dv \\
&=& \frac{1}{n} \int_0^1 (1+v+\dots+v^{n-1}) dv \\
&=& \frac{1}{n} \left(1+\frac{1}{2}+\dots+\frac{1}{n}\right) = \frac{H_n}{n}
\end{eqnarray}
Finally, we compute $E[M] = \sum_{i=1}^n P(I_i=1) = H_n$, where $H_n$ is the nth Harmonic number.