4
$\begingroup$

What is the best program to factor large arbitrary-form integers on a single computer, or on a few disjointed computers? "Best" is obviously subjective, but what do you recommend?

I'm working on a project to factor general-form numbers that I know are composite, with a scale of 100 - 1000 digits. I have a few computers I can use to process in parallel, but they're nowhere close to a cluster - and they certainly don't have the horsepower / memory needed for something like GNFS.

I've looked around a little, but what I've found can be classified as:

  1. Prove primality of general-form numbers (mine = known composites)
  2. Factor special-form numbers (mine = general-form numbers)
  3. Factor general-form numbers on a huge cluster (no cluster)

I'm missing the last one, "Factor general-form numbers on one or two sneakernet systems".

Final caveats: I'd prefer something free, but I'd be willing to spring for Maple / Mathematica / whatnot if it's my best option. Also I'd prefer something that's already built (binary package), but I can compile from source if that's a better option.

  • 1
    The "lucky" in Chris's comment cannot be emphasized enough, unfortunately. At the risk of sounding patronizing, one should always do trial division with "small" primes (how many of the first few primes should you divide with is up to you) first before using more advanced methods.2010-12-07

1 Answers 1

7

You're asking about a very wide range of numbers!

For all numbers, you should do ECM first. I recommend GMP-ECM (Windows, source).

For numbers of about 100 digits, your best bet (after doing an appropriate amount of trial factorization and ECM) is to use a SIQS. msieve would be a good choice.

For numbers of between 100 and 200 digits, you should do a large amount of ECM, at the very least a CPU-day's worth (and possibly 100+ CPU-days). Once those complete, if no factors were found, you should move to GNFS. GGNFS (Windows, source) is good and does not require a cluster, though for the upper end of the range you'll need at least a year if you're running on a single (multicore, one hopes) machine. Here are some guidelines on using NFS.

You have no hope of factoring hard numbers of 200 to 300 digits without many machines. Thus, for large general-form numbers, you should focus entirely on ECM in hopes of finding a small factor (up to about 60 digits, with patience, or about 70 digits, with patience and good fortune). If you find a factor, the cofactor may well be susceptible to GNFS (or it might be prime).

No one has a reasonable chance to factor hard numbers of 300 to 1000 digits, unless possibly the NSA. You should use ECM on such numbers in hopes of finding a factor, but you're unlikely to finish the factorization even if you do.

Now if you have access to a CUDA-enabled GPU (or several), you can speed the ECM with EECM . I don't know of any tools that do NFS on GPUs.

In short: ECM a lot, then finish off with SIQS or NFS if the number is small enough (below 180-200 digits depending on your resources and patience). If you want to throw money at the problem, buy more hardware, possibly GPUs -- don't buy commercial software, that has no bang for your buck.

  • 0
    I hope there are other answers -- maybe I could learn something! But yes, EECM on as many cores as you have sounds like the place to start. As for the curves: I would start with the next level, 43M, and do 4000 to 5000 on each core of a quad (adapt as appropriate). EECM probably will use a different B2 that will require slightly fewer total curves at each level, FWIW.2010-12-13