
I'm working on building some software that does machine learning. One of the problems I've come up against is that, I have an array of numbers:

$[a, b, c, d]$

And I want to compute the following efficiently:

$ab + ac + ad + bc + bd + cd$

Or:

$abc + abd + acd + bcd$

Where the number of variables in each group is specified arbitrarily. I have a method where I use:

$f(x) = a^x + b^x + c^x + d^x$

And then compute:

$f(1) = a + b + c + d$

$(f(1)^2-f(2))/2 = ab + ac + ad + bc + bd + cd$

$(f(1)^3 - 3f(2)f(1) + 2f(3))/6 = abc + abd + acd + bcd$

$(f(1)^4 - 6f(2)f(1)^2 + 3f(2)^2 + 8f(3)f(1) - 6f(4))/24 = abcd$

But I worked these out manually and I'm struggling to generalize it. The array will typically be much longer and I'll want to compute much higher orders.
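The hand-derived formulas above are all instances of a single recurrence (Newton's identities). Writing $p_i = f(i)$ for the power sums and $e_k$ for the sum of all $k$-fold products, one can check term by term against the cases above that

$$e_k = \frac{1}{k}\sum_{i=1}^{k} (-1)^{i-1}\, e_{k-i}\, p_i, \qquad e_0 = 1.$$

For example, $e_2 = \tfrac{1}{2}(e_1 p_1 - e_0 p_2) = (f(1)^2 - f(2))/2$, matching the second formula.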

  • See Newton's identities: http://en.wikipedia.org/wiki/Newton's_identities (2011-09-05)

2 Answers


See Newton's identities en.wikipedia.org/wiki/Newton's_identities
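To make the pointer concrete, here is a minimal sketch of the recurrence behind Newton's identities in Python (the thread names no language, and the helper name `elementary_symmetric` and the use of exact `fractions` rationals are my own choices, not from the answer):

```python
from fractions import Fraction

def elementary_symmetric(nums, k):
    """Return [e_1, ..., e_k] for the given numbers, where e_m is the
    sum of all m-fold products (e.g. e_2 = ab + ac + ... for [a, b, ...]).

    Uses the Newton's-identities recurrence
        m * e_m = sum_{i=1..m} (-1)**(i-1) * e_{m-i} * p_i
    with power sums p_i = sum(x**i), so the cost is O(n*k + k^2)
    multiplications instead of enumerating all C(n, m) combinations.
    """
    # p[i] holds the power sum f(i) = sum of x**i; p[0] is a placeholder.
    p = [None] + [sum(x ** i for x in nums) for i in range(1, k + 1)]
    e = [Fraction(1)]  # e_0 = 1
    for m in range(1, k + 1):
        s = sum((-1) ** (i - 1) * e[m - i] * p[i] for i in range(1, m + 1))
        e.append(Fraction(s) / m)
    return e[1:]
```

For `[a, b, c, d] = [1, 2, 3, 4]` this returns `[10, 35, 50, 24]`, agreeing with the asker's four hand-derived formulas ($f(1)$, the pairwise sum, the triple sum, and $abcd$).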


Besides using Newton's identities as mentioned in @Soarer's comment, you could also consider an algorithm that generates all combinations. E.g., to compute $$abc+abd+acd+bcd$$ you would generate the 3-combinations of 4 elements. Staying with this example, you already have an array of numbers $a = [a_0, a_1, a_2, a_3]$, so you would generate all triples of indices $(i, j, k)$ with $i < j < k$ and then sum the products $a_i \cdot a_j \cdot a_k$. This method could be numerically more robust than Newton's identities because only additions and multiplications are used, no subtractions or divisions. The efficiency could also be favourable, but this would require more analysis. There may be still more efficient algorithms.
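The generation approach this answer describes can be sketched directly in Python (my own illustration, not the answerer's code; `itertools.combinations` yields exactly the index-ordered tuples described):

```python
from itertools import combinations
from math import prod

def esym_direct(nums, k):
    """Sum of products over all k-element combinations of nums.
    Only additions and multiplications are used, so there is no
    cancellation or division to hurt numerical robustness."""
    return sum(prod(combo) for combo in combinations(nums, k))
```

`esym_direct([a, b, c, d], 3)` enumerates the four triples $(a, b, c)$, $(a, b, d)$, $(a, c, d)$, $(b, c, d)$ and adds their products.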

  • I'm not too fussed about numerical stability. If need be, I can just increase the precision, but I do care about speed. I don't think that's more efficient: generating combinations takes exponential time, whereas with Newton's identities I can do it in quadratic time. (2011-09-05)
  • The generation method generates, one at a time, only those combinations that are needed. In the case [a,b,c,d], the number of multiplications to compute all the symmetric polynomials is 6 + 2*4 + 3 = 17, without storing any intermediate results. Newton's method needs about 12 multiplications for f(1) through f(4), 3 for f(1)^k, 1 for f(2)^2, and 12 for building the sums, i.e. about 28 multiplications in total. Here I already assumed storing some intermediate results, which of course complicates the algorithm. Please correct me if I am wrong. (2011-09-05)
  • Maybe I'm misunderstanding what you are saying. Say I have an array with 100 elements. If I understand you correctly, you want me to enumerate every possible combination that has `n` elements in it and sum the products. If `n` is 4, that's 3921225 combinations. With the other method I can do it by creating 3 new arrays and summing each one, then applying the equations above, which is O(nm). (2011-09-05)
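The disagreement in this exchange comes down to operation counts, which can be checked directly (a small Python illustration; the rough multiplication count for the power-sum route is my estimate of the commenter's O(nm) claim, not a measured figure):

```python
from math import comb

n, k = 100, 4

# Direct enumeration touches every k-combination once.
direct_terms = comb(n, k)

# The power-sum route needs roughly n*k multiplications for
# f(1)..f(k) plus O(k^2) work to combine them via the identities.
newton_ops = n * k + k * k

print(direct_terms)  # 3921225
print(newton_ops)    # 416
```

So for long arrays and small `k`, the gap the commenter points out is real, even though for the tiny 4-element example the direct method uses fewer multiplications.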