Sorry for the long intro. I think the explanation motivates the question and puts it in context. But if you want to skip the story, just move on to the grey boxes; they should contain enough information to make sense of my question.
This question is inspired by Sadeq Dousti's question on producing a factored integer from a small interval: Factoring some integer in the given interval. For convenience, I will assume that the problem is to produce a factored integer from the interval $[x-G; x]$, and our goal is to make $G = G(x)$ as small as possible. (Beware that Sadeq's version uses $N$ in place of my $x$, and fixes $G(x)$ concretely to be $O(\log x)$.)
One can consider several easily stated candidate algorithms for this problem, all of which work in the following general way. Let $S$ be a set of integers such that
- Given a number $N$, testing if $N$ is in $S$ can be done efficiently. Moreover, any $N \in S$ can also be factored in time $\mathrm{poly}(\log N)$.
- The gaps in $S$ are "small": for all sufficiently large $x > 0$, there exists $N \in S$, such that $x - G(x) \leq N \leq x$. (Hence my notation $G$.)
Then the given problem can be solved for intervals of length $G(x)$: just search for an $S$-element in the interval and output its factorization. The whole problem then becomes: how small can we make $G(x)$? Here are some candidate sets:
- Powers of $2$.
- Primes.
- The set of numbers $N$ such that all prime divisors of $N$ are at most $\log^A(N)$ for some absolute constant $A < \infty$. I'll call these polylog-smooth numbers. (See the fineprint at the end of the question.)
- The set of numbers $N$ such that at most one of its prime divisors exceeds $\log^A(N)$ where $A$ is as before. I'll call these almost polylog-smooth numbers.
Ok, enough with the names :-). All these examples satisfy the first requirement (testing+factoring) by design. (For instance, in the last case, divide out all small prime divisors of $N$ by brute force, and we will be left with either $1$ or a large prime number, and so we'll be in good shape.) The big question then is to analyze the gaps in these sets.
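To make the recipe concrete before turning to the gaps, here is a rough Python sketch of the search-plus-test step for the last candidate set. The interval scan, the choice $A = 3$, and the use of Miller-Rabin as a stand-in for a rigorous primality test are all just illustrative choices of mine; like the parenthetical above, it assumes that the single large prime factor (if any) appears to the first power.

```python
import math
import random

def is_probable_prime(n, rounds=20):
    """Miller-Rabin probable-prime test (a stand-in for a rigorous primality test)."""
    if n < 2:
        return False
    for p in (2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37):
        if n % p == 0:
            return n == p
    d, s = n - 1, 0
    while d % 2 == 0:
        d //= 2
        s += 1
    for _ in range(rounds):
        a = random.randrange(2, n - 1)
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(s - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False
    return True

def almost_smooth_factorization(N, A):
    """Return the factorization of N (a list of primes, with multiplicity) if
    trial division by everything up to B = floor((log N)^A) leaves a cofactor
    that is 1 or a (probable) prime; return None otherwise."""
    B = int(math.log(N) ** A)
    factors, m = [], N
    for p in range(2, B + 1):      # brute-force trial division; composite p
        while m % p == 0:          # never divides m once its prime factors
            factors.append(p)      # have already been removed
            m //= p
    if m == 1:
        return factors             # N is polylog-smooth
    if is_probable_prime(m):
        return factors + [m]       # exactly one large prime factor left
    return None                    # leftover cofactor is composite

def factored_integer_in_interval(x, G, A=3):
    """Scan [x - G, x] from the top; return (N, factors) for the first
    almost polylog-smooth N found, or None if the interval has none."""
    for N in range(x, max(x - G, 2) - 1, -1):
        f = almost_smooth_factorization(N, A)
        if f is not None:
            return N, f
    return None
```

The work per candidate is dominated by the $B \approx (\log N)^A$ trial divisions, so each test costs $\mathrm{poly}(\log N)$ for constant $A$; the whole question is how large $G$ must be before such a scan is guaranteed to find something.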
- The powers of two are very sparse. They have gaps $G(x) = O(x)$, and this is tight (the gap just below $x$ is about $x/2$).
- The primes are suspected to have better gaps (see the consequences of the Riemann hypothesis and Cramér's conjecture). In short, the gap might be as small as $O(\log^2 x)$. On the other hand, the best unconditional bound is $O(x^{\theta})$ for some $\theta < 0.53$ (currently $\theta = 0.525$).
- I do not have enough background to grok the question of gaps for the polylog-smooth numbers. Nevertheless, Andrew Granville's survey on smooth numbers (Smooth numbers: Computational Number Theory and beyond) says that even the average gap is around $x^{1/A}$. (I am quoting Theorem 1.14 from page 4/page 270 of the pdf I have linked to.) This might be quite bad compared to the conjectured gap for the primes, but it is a great improvement over the powers of $2$.
Now, what about the set of almost polylog-smooth numbers? The primes are a subset of this set (a prime has exactly one prime divisor, namely itself), so its gaps are at most those of the primes: $O(\log^2 x)$ conditionally and $O(x^{\theta})$ unconditionally.
But taking a cue from the improvement we got in going from the powers of $2$ to the smooth numbers, I believe that the actual gaps for the almost polylog-smooth numbers should be even smaller than those of the primes. That is the motivation behind my question:
What are the current best bounds on the gaps in the almost polylog-smooth numbers (conditionally and unconditionally)? Specifically, is it possible that $G = O(\log x)$ in this case?
Any help?
Fineprint that morally shouldn't matter. A number is $B$-smooth if all of its prime divisors are at most $B$. It seems to me that when people analyze properties of the smooth numbers $\leq x$, they usually set the smoothness parameter $B$ to be a function of $x$, like $B = \log^A x$. I defined a smooth number more "intrinsically" by setting $B = \log^A N$. I presume that this is just a technical issue, and that none of the bounds will really change much. If I am wrong, please let me know. In any case, I see no reason why one definition is better for this application than the other.