I am trying to implement the Schönhage-Strassen algorithm (SSA) for multiplying large integers, but it only gives the right result if all $\delta_j$ are zero. I'll explain what I mean by this:
SSA (as described in the original 1971 paper Schnelle Multiplikation großer Zahlen) basically works like this:
- Let $a$ and $b$ be two integers to multiply.
- Determine $n$. Roughly speaking, $n$ will be about $\frac12 \log_2 \log_2 \max(a,b)$
- Compute integers $z_j$ such that $a\cdot b=\sum\limits_j z_j\cdot2^{j\cdot2^{n-1}}$, as follows:
- Compute $\xi_j\equiv z_j \pmod {2^{2^n}+1}$; this step performs an FFT+iFFT
- Compute $\eta_j\equiv z_j \pmod {2^{n+2}}$
- Compute $z_j$ from $\xi_j$ and $\eta_j$:
- $\delta_j\equiv\eta_j-\xi_j \pmod {2^{n+2}}$ <--- THIS IS THE $\delta_j$ MENTIONED ABOVE
- $z_j=\xi_j+\delta_j(2^{2^n}+1)$
- Output $\sum\limits_j z_j\cdot2^{j\cdot2^{n-1}}$
I tested the other parts of the code, and I even did a a complete example with pencil and paper. Plus, everything works when $\delta=0$, so I'm pretty certain the problem is in step 3.3 or 4.
Wikipedia doesn't go into enough detail to even mention $\xi_j$ and $\eta_j$.
Does anyone know of a working example, sample code, or other information that might help me track down the problem?
Edit: I will put the code online if I get it to work.
Edit 2: I uploaded some Java code here. It runs as a regular Java program; the output shows one working example and one that doesn't work. Computation of $\delta,\xi,\eta,z_r$ happens in lines 148-160.
Also, there is a modified version here which uses much smaller input values, but I head to tweak the code so it handles pieces less than one byte.