
I want to know how to prove that the Good-Turing estimator has a total probability of $1$. I have seen this proof (page 2), but the first step is unclear to me:

$\sum_j \theta[j] = \sum_r \theta[r]N_r = \frac{1}{N}\sum_r \left[(r+1) \frac{N_{r+1}}{N_r}\right]N_r$

$\theta[j]$ is the probability of having an $n$-gram, $\theta[r]$ is the probability of an $n$-gram occurring $r$ times, and $N_r$ is the number of $n$-grams that occur $r$ times.

Since $\sum_r(r+1)N_{r+1} = \sum_r rN_r = N$, it's more or less straightforward that the total sums to $1$. However, as I said, I don't understand the first step: $\sum_j \theta[j] = \sum_r \theta[r]N_r$. What is going on there? (To double-check the rest numerically, see the small sketch below.)
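Here is a minimal Python sketch of that check, with a made-up count-of-counts table `Nr` (the values are only for illustration); it splits the sum into the $r=0$ term (the unseen mass $N_1/N$) and the $r\ge 1$ terms and confirms the total is $1$:

```python
# Toy count-of-counts table: N_r = number of n-gram types seen exactly r times
# (values are made up for illustration).
Nr = {1: 10, 2: 6, 3: 3, 4: 1}

N = sum(r * n for r, n in Nr.items())   # total sample size: sum_r r * N_r = 35

# r >= 1 terms of (1/N) * sum_r (r+1) * N_{r+1}  (the N_r factors cancel):
seen_mass = sum((r + 1) * Nr.get(r + 1, 0) for r in Nr) / N

# r = 0 term: the mass reserved for unseen n-grams, (0+1) * N_1 / N
unseen_mass = Nr[1] / N

# Index shift: sum_{r>=0} (r+1) * N_{r+1} = sum_{r>=1} r * N_r = N, so the total is 1.
print(seen_mass + unseen_mass)          # 1.0
```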

Thanks in advance.

1 Answer


$\theta[j]$ is the probability that a future sample will be $x_j$, and $\theta(r)$ is the probability of a word occurring given that it appeared $r$ times in the sample $W$. Let $J(r)=\{j:x_j\text{ appeared }r\text{ times in }W\}$. It’s assumed (near the bottom of page 1) that if $j,j\,'\in J(r)$, then $\theta[j]=\theta[j\,']$, and this common value is $\theta(r)$. Thus,

$\begin{align*} \sum_j\theta[j]&=\sum_r\sum_{j\in J(r)}\theta[j]\\ &=\sum_r\theta(r)|J(r)|\\ &=\sum_r\theta(r)N_r\;. \end{align*}$
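If it helps to see the regrouping concretely, here is a small Python sketch with a made-up sample $W$ and a placeholder $\theta(r)$; the only point is that summing over types gives the same result as summing over count classes:

```python
from collections import Counter

# Made-up sample W (tokens stand in for n-grams), just to illustrate the regrouping.
W = ["a", "a", "a", "b", "b", "c", "c", "d", "e", "e"]

counts = Counter(W)               # count r of each type x_j
Nr = Counter(counts.values())     # N_r = |J(r)|, number of types with count r

def theta(r):
    # Placeholder per-count value; the regrouping holds for any function of r alone.
    return r / len(W)

per_type  = sum(theta(r) for r in counts.values())      # sum_j theta[j], with theta[j] = theta(count of x_j)
per_class = sum(theta(r) * nr for r, nr in Nr.items())  # sum_r theta(r) * N_r

print(per_type, per_class)        # identical: each class J(r) contributes theta(r) * |J(r)|
```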
