4
$\begingroup$
  • Suppose $s_{1}$ are the numbers with 10-digits that have $1$ prime factor.
  • Suppose $s_{2}$ are the numbers with 10-digits that have $2$ prime factors.
  • Suppose $s_{n}$ are the numbers with 10-digits that have $n$ prime factors.

Leading $0$s are allowed. Which of $s$ would have the smallest expected value of tries to guess the right $10$ digit number?

  • 1
    N.B. deleted many comments from previous incarnation on security.se which are now off topic after the migration and after the changes to the question.2011-02-21
  • 0
    By ten digit numbers do you include the ones where you pre-pad with 0's, e.g. 0000123456?2011-02-21
  • 0
    My answers did not count numbers with leading zeros, so please see the edit2011-02-21
  • 1
    If you read http://mathworld.wolfram.com/Semiprime.html the technique will allow you to calculate $s_n$ with more loops. Basically take all collections of n-1 small primes and count the primes you can multiply by bigger than the biggest and keep the product under 10^10.2011-02-22
  • 10
    Meta thread http://meta.math.stackexchange.com/questions/1688/how-many-ten-digit-primes-are-there (Please vote this comment up so that it will be visible.)2011-02-22

3 Answers 3

10

I don't know how to do this mathematically, but you can do it programmatically: use the Sieve of Eratosthenes to generate factor counts from 1 to $10^{10}-1$. Note that $10^{10}$ is a bit over $2^{33}$ so you won't be able to store this in memory on a conventional computer (one byte per entry means you'd need 9.3 GB of RAM just for the array). There are some time-memory trade-offs here that could keep you off the disk, and there are ways to parallelize some of this, but it's still not something you can easily figure out on a commodity computer (unless there is a better method).

I ran a C program for 10-digit numbers and obtained the following counts:

01: 455052511
02: 1493776443
03: 2227121996
04: 2139236881
05: 1570678136
06: 977694273
07: 550454756
08: 291646797
09: 148930536
10: 74342563
11: 36585097
12: 17836903
13: 8641282
14: 4167745
15: 2002277
16: 959377
17: 458176
18: 218163
19: 103657
20: 49031
21: 23133
22: 10837
23: 5091
24: 2349
25: 1089
26: 499
27: 224
28: 102
29: 44
30: 19
31: 7
32: 3
33: 1

This table gives the counts for each $s_i$ for $1 \leq i \leq 33$ where integers are between 2 and $10^{10} - 1$. Prime multiplicities are counted, as you specified in a comment.

This ran in about 12 minutes on an x86 using 10 GB of RAM. So the answer to your question is that $s_3$ is largest. Code posted below.


#include 
#include 

#define MAX 10000000000       // sieve size
#define PPOWER 255            // marker for prime powers

// allocate MAX bytes of memory
unsigned char sieve[MAX];

// use # of factors as index, # of ints is value
unsigned long tally[100];


// sieve[] is initialized to all 0's; then for each i, sieve[i] will contain
// the number of prime factors (including multiplicity) for each i, except
// sieve[p^j] for prime p, j >1 will contain the special marker PPOWER.
// We do this to properly account for prime multiplicities.

main() {
  unsigned long i, j;
  int c;

  for (i=2; i < MAX; i++) {

    if (sieve[i] != PPOWER) {
      // if composite, tally and continue
      if (sieve[i] > 0) {
        tally[sieve[i]]++;
        continue;
      }

      // ok, i is prime; tally and mark all prime powers as PPOWER
      // (takes some thought to see why we do this)
      tally[1]++;
      j = i*i;
      c = 2;
      while (j < MAX) {
        sieve[j] = PPOWER;
        tally[c]++;
        j = j * i;
        c++;
      }
    }

    // now sieve as usual
    for (j=i*2; j < MAX; j += i) {
      if (sieve[j] != PPOWER)
          sieve[j]++;
    }
  }

  for (c=1; c < 100; c++)
    if (tally[c]) printf("%.2d: %ld\n", c, tally[c]);
}

  • 0
    Clarified the question, thank you for your notice. Does it make now sense?2011-02-21
  • 0
    It does make sense now, but given the history I'm afraid the discussion is a bit of a mess. Also, my answer above requires that your 10-digit number does not have leading 0's. You can modify the python program to use `xrange(1, 9999999999, 2)` instead if you want to include them. This means the program will take about twice as long to run.2011-02-21
  • 1
    The new phrasing of the question could be turned into a (awkward) security question like this: "Suppose I have to choose two numbers: P and Q where Q is the number of prime factors of P. If P is my bank PIN code, what Q should I choose in order to make the set of P as large as possible?"2011-02-21
  • 0
    @Fixee: it could be broken into many *security* questions such as "suppose that banks' customers prime-factorize their PINs and distribute the factors to their wallet, car and home. Of how many factors should the PINs consist of?" But such questions should go to security.SO, not here, but good idea.2011-02-21
  • 1
    @hhh: With your latest rewording of the question, I'd change the title to something like "What set of 10-digit numbers with $k$-prime factors has largest cardinality?" And include the details that leading 0's are allowed in the numbers and that you include multiplicities when counting primes. Cheers.2011-02-22
  • 0
    @Fixee: thank you. You are right, it is so more descriptive. Can you share the C program?2011-02-22
  • 0
    @Fixee: the first two disagree with what I got from Alpha and OEIS. I don't have an independent way to check. I do find your values sum to 10^10-22011-02-22
  • 0
    Ross: The above list is for **nine** digit numbers; my computer only has 4GB of RAM so I can't do ten digits. My numbers above do agree with Alpha and OEIS for 9-digits I believe.2011-02-22
  • 0
    @Fixee: I see you have updated to 10 digits and the numbers match.2011-02-23
  • 0
    @Ross: Yup, all done. I just needed to get an acct on a machine with enough memory. We have a 48-core machine with 82GB here, so that was more than enough. I'm pretty sure this problem could be more efficiently solved with a little more math and a lot less computation.2011-02-23
  • 0
    @hhh: My answer now includes the code used to produce the list; there is probably a slicker way, but this was enough to get it done.2011-02-24
  • 0
    @Fixee: my comp cannot calc it so decreased the value of MAX to 100 but `$ gcc test.c $ . ./a.out ./a.out:13: parse error near \`^@^@^@^@\Çë^K^F^@^@^@^@^P^@^@^B^@^@^@...'`. `$ gcc --version gcc (GCC) 3.3.5 (propolice) Copyright (C) 2003`2011-02-26
  • 0
    @hhh: This is very weird: you're saying it compiles successfully but gets a parse error when you run the executable?! Make sure your cut/paste didn't pick up HTML oddities like `<` for the less-than symbol (although this would cause a compilation error).2011-02-26
  • 0
    @Fixee: first one with obsd(4GB) and now with ubuntu(64GB): `$ gcc test.c $ . ./a.out ./a.out:7: unmatched \` $ gcc --version gcc (Ubuntu 4.4.3-4ubuntu5) 4.4.3 Copyright (C) 2009 Free Software Foundation, Inc.`. It is the same code http://pastebin.com/XVcwEfYT. When I ran it with the smaller size for the MAX, it returned err on line 6 but with the whole code on line 7.2011-02-26
  • 0
    @hhh: Here's the problem: you are using an extra period (`$ . ./a.out`) when you run the program. The first period means to "source" the file (http://ss64.com/bash/period.html) which causes Unix to try and interpret the a.out file as shell commands (which is why you're seeing those errors). Instead you want to run `$ ./a.out` and it should work fine.2011-02-27
  • 0
    @Fixee: after 33 factors, why does the code return huge numbers?2011-02-27
  • 0
    @hhh: I don't know what you're talking about?! I've run this code on a variety of machines and it's worked just fine for any power of 10 up to $10^{10}$.2011-03-02
7

There are about

$$\frac{10^{10}(\log\log 10^{10})^{k-1}}{(k - 1)!\log 10^{10}}$$ 10-digit numbers with k prime factors, allowing leading zeros, or $$\frac{1}{(k - 1)!}\left( \frac{10^{10}(\log\log 10^{10})^{k-1}}{\log 10^{10}}-\frac{10^9(\log\log 10^9)^{k-1}}{\log 10^9} \right)$$ without leading zeros.

Based on this you'd expect those with 3 or 4 prime factors to be most likely. The ones with two and five are pretty far behind, with the others not even close.

Counts (with leading 0s):

1 455052511 (predicted: 434294482)
2 1493776443 (predicted: 1362215689)
3 2227121996 (predicted: 2136374810)
4 2139236881 (predicted: 2233663566)
5 1570678136 (predicted: 1751537079)
6 977694273 (predicted: 1098780384)

So those with three prime factors win. Note: the exact counts assume that you want to count repeated prime factors more than once (the predictions are basically the same either way). If you look at only the number of distinct prime factors the totals vary but 3 remains in first place.

  • 0
    Nice work solving this. But where do your exact counts come from? I also did some predicting similar to what you have above, but 3 and 4 factors were too close to call (indeed, your predictions show 4 winning, but in fact 3 wins when counting exactly).2011-02-23
  • 2
    I have some decent code for counting semiprimes in C using the PARI library, and that can be used to count 3- and 4-almost-primes as well. But in this case I just found them in Sloane's OEIS.2011-02-23
6

Wolfram Alpha gives the number of up to 10 digit primes as 455,052,511. When you ask for 2 prime factors, do repeated factors count? So how many prime factors does $2^{31}*3=6442450944$ have?

Added: OEIS gives the number of up to 10 digit semiprimes as 1493776443 I don't find a result for other numbers of factors easily.

Edited to allow numbers with leading zeros in the 10 digits

  • 0
    Milikan: yes also repeated factors. It has 32 factors.2011-02-21
  • 0
    Milikan: how did you get the first sentence with WA?2011-02-21
  • 1
    I put in PrimePi(10^10)-PrimePi(10^9). I tried to paste the link in, but some character prevented it. PrimePi(n) is the number of primes less than n.2011-02-21
  • 0
    Edited to allow leading zeros in the numbers, as requested by hhh.2011-02-21