Are you constrained to use the estimator that you have indicated? If not, another choice would be to form the joint pdf $f(a_i,\dots,a_{i+k-1} \mid n)$ of the observed values and use the value of $n$ that maximizes this joint pdf (the maximum-likelihood estimate) as your population estimator.
The joint pdf of the $k$ consecutive values is
$\begin{eqnarray} f(a_i,\dots,a_{i+k-1} \mid n) = k!{n \choose k} {{n-k} \choose {i-1}}\, a_i^{i-1} (1-a_{i+k-1})^{n-i-k+1} \end{eqnarray}$
This can be seen as follows. Say we have $n$ unordered random variables. We choose $k$ of them with order (in $k! {n \choose k}$ ways). Of the remaining $(n-k)$ variables, we choose $(i-1)$ (in ${{n-k} \choose {i-1}}$ ways) and designate them to be the first $(i-1)$ ordered variables, in some order. They are all less than $a_i$, which happens with probability $a_i^{i-1}$. The remaining variables we designate to be larger than the $(i+k-1)$th variable, which happens with probability $(1-a_{i+k-1})^{n-i-k+1}$. It only remains to account for the probability that the $j$th variable falls in an infinitesimal interval around $a_j$ for $i \leq j \leq i+k-1$, which contributes $f(a_j)\,da_j = da_j$ per variable, since the variables are uniformly distributed on $[0,1)$ and so have density 1.
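As a sanity check on this counting argument, the density should integrate to 1 over the ordered region. A quick numerical check for $k=2$ (the values $n=10$, $i=3$ below are illustrative choices of mine, not from the derivation):

```python
from math import comb, factorial

# Check that the joint pdf of two consecutive order statistics (k = 2)
# integrates to 1 over the region 0 < u < v < 1.
# n and i are illustrative values chosen for the check.
n, i, k = 10, 3, 2

# Normalizing constant: k! * C(n, k) * C(n - k, i - 1)
const = factorial(k) * comb(n, k) * comb(n - k, i - 1)

def density(u, v):
    """Joint pdf of (a_i, a_{i+1}) at (u, v); zero off the ordered region."""
    if not (0 < u < v < 1):
        return 0.0
    return const * u ** (i - 1) * (1 - v) ** (n - i + 1 - k)

# Midpoint Riemann sum on an N x N grid.
N = 1000
h = 1.0 / N
total = 0.0
for a in range(N):
    u = (a + 0.5) * h
    for b in range(N):
        v = (b + 0.5) * h
        total += density(u, v) * h * h

print(total)  # should be close to 1
```

The same check works for other small $(n, i, k)$ by adjusting the constant and exponents accordingly.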
Maximizing this as a function of $n$ is potentially messy. To get an idea of how good this approach is, I used Stirling's approximation for the factorials and further assumed that $n$ is much larger than $i$ and $k$. Then the maximizing value is approximately
$\begin{eqnarray} \tilde{n} = \frac{i+k-1}{\log \frac{1}{1-a_{i+k-1}}} \end{eqnarray}$
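To see how close this approximation gets, one can compare it against a direct integer search over $n$ of the exact log joint pdf (only the $n$-dependent factor $\frac{n!}{(n-i-k+1)!}(1-a_{i+k-1})^{n-i-k+1}$ matters). The values $i=2$, $k=3$, $a_{i+k-1}=0.06$ below are illustrative assumptions:

```python
from math import lgamma, log

# Illustrative observed value and indices (assumptions for the comparison).
i, k = 2, 3
a_top = 0.06  # observed a_{i+k-1}

def log_lik(n):
    """log of the n-dependent part: n!/(n-i-k+1)! * (1-a_top)^(n-i-k+1)."""
    return (lgamma(n + 1) - lgamma(n - i - k + 2)
            + (n - i - k + 1) * log(1 - a_top))

# Exact maximizer: scan integers n >= i + k - 1 (log_lik is unimodal in n).
n_exact = max(range(i + k - 1, 2000), key=log_lik)

# Stirling-based approximation from the formula above.
n_approx = (i + k - 1) / log(1 / (1 - a_top))

print(n_exact, round(n_approx, 2))
```

For these values the integer maximizer and the closed-form approximation land within a couple of units of each other, consistent with the $n \gg i,k$ assumption.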
Simulations seem to suggest that this estimator is better in some parameter ranges (for example, when $n$ is much larger than $i$ and $k$), while in other regimes the estimator in the question is better (when $k$ and $i$ are not negligible compared with $n$). Even in the latter regime, it might be possible to find the $n$ that maximizes the joint pdf above more carefully (without asymptotic approximations such as Stirling's) to get a better estimator. Edit: I checked in simulation that numerically maximizing the joint pdf, without resorting to Stirling's approximation, does give a uniformly better estimator.
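A rough sketch of such a simulation, using the integer search from above (the parameter values $n_t=200$, $i=5$, $k=5$ are assumptions for illustration):

```python
import random
from math import lgamma, log

# Draw n_t uniforms, keep the k consecutive order statistics starting at
# index i, and estimate n by integer search over the exact log joint pdf.
# Parameter values are illustrative assumptions.
random.seed(0)
n_t, i, k = 200, 5, 5
trials = 300

def mle(a_top, i, k, n_max=2000):
    """Integer n maximizing n!/(n-i-k+1)! * (1 - a_top)^(n-i-k+1)."""
    def ll(n):
        return (lgamma(n + 1) - lgamma(n - i - k + 2)
                + (n - i - k + 1) * log(1 - a_top))
    return max(range(i + k - 1, n_max), key=ll)

estimates = []
for _ in range(trials):
    sample = sorted(random.random() for _ in range(n_t))
    a_top = sample[i + k - 2]  # a_{i+k-1} in 1-based indexing
    estimates.append(mle(a_top, i, k))

estimates.sort()
median = estimates[trials // 2]
print(median)  # concentrates in the broad vicinity of n_t
```

Only $a_{i+k-1}$ enters the likelihood through $n$, so the search needs just that one order statistic per trial.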
Another thing I noted from the simulation is that for $k=2$, the estimator $\frac{1}{a_{i+1}-a_i}$ performs poorly. To see why, note that the distribution of the spacing $D = a_{i+1}-a_i$ is independent of $i$, with density $f_D(d) = n_t(1-d)^{n_t-1}$, where $n_t$ denotes the true value of $N$. The population estimator $\tilde{N} = \frac{1}{D}$ then has density
$\begin{equation} f_{\tilde{N}}(\tilde{n}) = \frac{n_t}{\tilde{n}^2} \left(1-\frac{1}{\tilde{n}}\right)^{n_t-1} \end{equation}$
This estimator has an unbounded expectation and thus an unbounded bias. But this is a moot point if for some reason you are constrained to use the estimator you had indicated in the question.
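The divergence of $E[\tilde{N}] = E[1/D] = \int_0^1 \frac{n_t(1-d)^{n_t-1}}{d}\,dd$ comes from the $1/d$ singularity at $d=0$, so the truncated integral grows like $n_t \log(1/\epsilon)$. A numerical illustration (with an assumed $n_t = 5$):

```python
from math import exp, log

# Truncated expectation I(eps) = integral_eps^1 of n_t (1-d)^(n_t-1) / d dd.
# It grows roughly like n_t * log(1/eps) as eps shrinks, so E[1/D] diverges.
n_t = 5  # illustrative true population size

def truncated_mean(eps, steps=20000):
    """Trapezoid rule in u = log d; the substitution cancels the 1/d factor."""
    lo, hi = log(eps), 0.0
    h = (hi - lo) / steps
    total = 0.0
    for j in range(steps + 1):
        d = exp(lo + j * h)
        # integrand * Jacobian: n_t (1-d)^(n_t-1) / d * d = n_t (1-d)^(n_t-1)
        val = n_t * (1 - d) ** (n_t - 1)
        total += (0.5 if j in (0, steps) else 1.0) * val
    return total * h

vals = [truncated_mean(10.0 ** -p) for p in (2, 3, 4)]
print(vals)  # each decade of truncation adds roughly n_t * log(10)
```

Each factor-of-10 reduction in the truncation point adds about $n_t \ln 10$ to the truncated mean, confirming the logarithmic divergence.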