1
$\begingroup$

Let $\;\;\displaystyle x=\left \lfloor\frac{\sqrt{4N+(2a+1)^2}-1}{4}\right\rfloor\bmod10^8 \;\;\text{ where}\; N,a \text{ are integers.}$

When $N$ and $a$ are sufficiently large, this expression overflows in languages like C++. However, seeing as I only need the result modulo 10^8, I am wondering if there is some manipulation I can apply here to avoid overflow.

Mainly, $(2a+1)^2$ overflows when $a$ is large. However I can't just apply the modulus to each part and then go about my day because the overall answer is wrong.

  • 0
    Is it even possible?2012-11-27
  • 1
    I'm not sure that it is possible, I'm just sure that there's a spare parenthesis in your expression ;)2012-11-27
  • 0
    @tohecz Not anymore! :D I am wondering if there is a way to simplify sqrt(4N+(2a+1)^2) in some way (completing the square?) so that I don't have to ultimately figure out the square of a (which overflows).2012-11-27
  • 0
    do $a$ and $N$ themselves fit into the C++ limitations on integers?2012-11-27
  • 0
    @DanShved Long longs I believe. At their maximum value, $a$ and $N$ are both equal to exactly 10^122012-11-27
  • 0
    if $N$ is much smaller than $a$, your square root is approximately equal to just $2a+1$.2012-11-27
  • 0
    @akkkk Unfortunately N is always 10^12 for my purposes. But $a$ can range from 1 clear up to 10^122012-11-27
  • 0
    @akkkk, you don't need $N$ to be much smaller than $2a+1$; you just need it to be much smaller than $(2a+1)^2$. KaliMa, is this always the case? Is there, say, a fixed small integer $M$ such that $N$ is always less than $Ma$? If so, then let me know, and I will post a solution.2012-11-27
  • 0
    Scrub that -- Dan Shved already posted the solution :-)2012-11-27
  • 0
    You could consider (temporarily) using arbitrary precision arithmetic (e.g. in C++ `mpz_class` in [GMP](http://gmplib.org/) is very easy to use), or [128-bit integers](http://gcc.gnu.org/onlinedocs/gcc/_005f_005fint128.html) (although, you'd need to be careful about rounding error here).2012-11-27
  • 0
    @DouglasS.Stones I have GMP and it resolves the issue, but I couldn't get the ceil() function working (for a different part of the routine) at all so I gave up.2012-11-27

1 Answers 1

5

OK, if both $a$ and $N$ don't exceed $10^{12}$, then $x$ also doesn't exceed $10^{12}$. Therefore, there's a high chance that you'll be able to get the exact value just by doing calculations in double (which produces relative errors of about $10^{-17}$ since it has 51 binary digits) and then casting to long long in the end.

If you don't want to take that chance, there is a way to find the result more reliably, but for some cost in execution time, because this way involves binary search.

Basically, you need to find $y_0 = \left\lfloor \sqrt{4N + (2a+1)^2}\right\rfloor$, which is the maximal $y \in \mathbb{Z}$ such that $$ y^2 \leqslant 4N + (2a+1)^2. $$ Clearly, $y_0 \geqslant (2a+1)$. Our main idea is that in fact when $(2a+1)$ is as large as $N$, $(2a+1)^2$ is much larger than $4N$, therefore $y_0$ doesn't differ that much from $(2a+1)$. To make this more precise, let's intruduce the new variable $t_0 = y_0 - (2a+1)$. $t_0$ is the maximal integer $t$ such that $$ (2a+1+t)^2 \leqslant 4N + (2a+1)^2. $$ We can rewrite it like this: $$ t(4a+2+t) \leqslant 4N. $$ The function on the left is monotone, therefore we can use binary search to find $t_0$. All we need to do now is pick some reasonable boundaries $t_1$, $t_2$ such that $t_0$ is guaranteed to be between them. Let's pick $t_1=0$ and $t_2 = \left\lceil \sqrt{4N}\right\rceil$. It is quite easy to see that $t_1 \leqslant t_0 \leqslant t_2$. It is also easy to see that for every $t$ such that $t_1 \leqslant t \leqslant t_2$ the value $t (4a+2+t)$ is small enough to fit into long long. Therefore, you can perform a binary search to find the correct value for $t$, and in this binary search you'll be able to do all the calculations in long long.

UPDATE: By request, I'm adding an example for $a=N=10^{12}$. We need to find $t_0$, which is the largest integer $t$ such that $$ t(4 \cdot 10^{12} + 2 + t) \leqslant 4 \cdot 10^{12}. $$ Let's try substituting some values for $t$ and see if this inequality (which I will just call the inequality) holds. If we set $t=0$, we get $0 \leqslant 4 \cdot 10^{12}$, which is true. Therefore $0 \leqslant t_0$.

OK, now let's try setting $t=\left\lceil \sqrt{4N}\right\rceil = 2 \cdot 10^6$. We get $2 \cdot 10^6 (4 \cdot 10^{12} + 2 + 2 \cdot 10^6) \leqslant 4 \cdot 10^{12}$. Wow, this is clearly wrong, i.e. our $t$ is way too big. So we know that $0 \leqslant t_0 < 2\cdot10^6$.

Well, let's keep on guessing. Since we're doing binary search, we'll always split our "window of possibilities" into two equal parts. So our next candidate is $t = 10^6$. Again, it's easy to see that the inequality doesn't hold, so $0 \leqslant t_0 < 10^6$.

So let's try $5 \cdot 10^5$. Actually, this could go on for a while, so I'll skip a few steps. In this particular case, we'll pick smaller and smaller values of $t$, and they will all be too large. In the end, we'll come to something like this: $0 \leqslant t_0 < 2$. So, again, we will split our interval in two and check the value $t=1$. For $t=1$ the inequality says: $4 \cdot 10^{12} + 3 \leqslant 4 \cdot 10^{12}$. Hm. Even $t=1$ is too large! So $0 \leqslant t_0 < 1$, which means that $t_0=0$, which in turn means that $y_0 = \left\lfloor \sqrt{4N + (2a+1)^2}\right\rfloor = 2a+1+t_0 = 2 \cdot 10^{12} + 1$. This is it.

Actually, this example with $N=a=10^{12}$ is not very good, because this way $N$ is way smaller than $(2a+1)^2$, so adding $4N$ cannot change the value of our square root even by $1$. That's why we started with boundaries $0 \leqslant t_0 < 2 \cdot 10^{12}$ and then the right boundary kept on decreasing until we got $0 \leqslant t_0 < 1$. If we picked different numbers, for example $N=10^{12}$ and $a=10^6$, then the situation would be more interesting: your interval of possible $t_0$ values would shrink from both sides and finally zero in on the correct value of $t_0$.

  • 0
    I am having a tough time following this. Can you give a concrete example? Say using a=10^12 and N=10^122012-11-27
  • 0
    OK, I'll add an example to my answer. But I don't know if you are well acquainted with binary search. If not - then you should probably read something about it, e.g. [this wikipedia article](http://en.wikipedia.org/wiki/Binary_search_algorithm).2012-11-27
  • 0
    @KaliMa I've updated my answer.2012-11-27