Why is the negative binomial distribution defined as $P(X=x|r,p)= \binom{x-1}{r-1}p^{r}(1-p)^{x-r}$?
Basically, this is the probability that exactly $x$ Bernoulli trials are needed to get $r$ successes. So we need $r-1$ successes in the first $x-1$ trials, and then the $r^{th}$ success must occur on the $x^{th}$ trial, which happens with probability $p$. Why can't we write it as the following instead:
$P(X = x|r,p) = \binom{x}{r}p^{r} (1-p)^{x-r}$
As I read it, this is the probability that you have $r$ successes in the first $x$ trials.
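One way to compare the two candidate formulas is to check numerically whether each one sums to 1 over $x = r, r+1, \dots$ for fixed $r$ and $p$. The following is only a rough sketch: the function names, the choice $r=2$, $p=0.5$, and the truncation of the sum at $x=199$ are my own assumptions for illustration.

```python
from math import comb

def pmf_last_trial_success(x, r, p):
    # C(x-1, r-1) p^r (1-p)^(x-r): r-1 successes in the first x-1 trials,
    # then a success on trial x.
    return comb(x - 1, r - 1) * p**r * (1 - p)**(x - r)

def pmf_any_arrangement(x, r, p):
    # C(x, r) p^r (1-p)^(x-r): r successes anywhere among the first x trials.
    return comb(x, r) * p**r * (1 - p)**(x - r)

# Illustrative values (assumed, not from the question).
r, p = 2, 0.5

# Truncate the infinite sum; the tail beyond x = 199 is negligible here.
xs = range(r, 200)

total_first = sum(pmf_last_trial_success(x, r, p) for x in xs)
total_second = sum(pmf_any_arrangement(x, r, p) for x in xs)

print("sum of first formula: ", total_first)
print("sum of second formula:", total_second)
```

For these values the first sum should come out at about $1$ and the second at about $1/p = 2$, so the second formula does not seem to define a probability distribution over $x$. My guess is that this is because "$r$ successes in the first $x$ trials" and "$r$ successes in the first $x+1$ trials" can both happen in the same sequence of trials, so the events for different $x$ are not mutually exclusive.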