I'm not the best at math (but eager to learn), so please excuse me if I'm not explaining this problem correctly; I will try to add as much info as I can to make it clear. I basically receive two pieces of data: a list of integers and a target_sum, and I want to figure out all the ways I can use the list to equal the target sum (that is, every choice of non-negative integer coefficients for the list's values whose terms add up to the target). So for a list of [1,2,4] and a target_sum of 10, I would get:
    2 * 4 + 1 * 2 + 0 * 1
    2 * 4 + 0 * 2 + 2 * 1
    1 * 4 + 3 * 2 + 0 * 1
    1 * 4 + 2 * 2 + 2 * 1
    1 * 4 + 1 * 2 + 4 * 1
    1 * 4 + 0 * 2 + 6 * 1
    0 * 4 + 5 * 2 + 0 * 1
    0 * 4 + 4 * 2 + 2 * 1
    0 * 4 + 3 * 2 + 4 * 1
    0 * 4 + 2 * 2 + 6 * 1
    0 * 4 + 1 * 2 + 8 * 1
    0 * 4 + 0 * 2 + 10 * 1
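To make the example concrete, here is a tiny brute-force check (hardcoded to the [1,2,4] example above) that enumerates those same twelve combinations:

```python
# Brute-force check of the example: count every way to write 10 as
# a*4 + b*2 + c*1 with non-negative integer coefficients.
target_sum = 10

solutions = []
for a in range(target_sum // 4 + 1):          # coefficient of 4
    for b in range((target_sum - a * 4) // 2 + 1):  # coefficient of 2
        c = target_sum - a * 4 - b * 2        # coefficient of 1 (whatever is left)
        solutions.append((a, b, c))

print(len(solutions))  # prints 12
```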
The algorithm I'm currently using has two parts: one builds a lookup table of which sums are possible, and the other constructs the actual solutions. Table building:
    for i = 1 to k:
        for z = 0 to sum:
            for c = 1 to z / x_i:
                if T[z - c * x_i][i - 1] is true:
                    set T[z][i] to true
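As a sanity check, the table-building pseudocode above can be written as runnable Python (a minimal sketch assuming non-negative values and a non-negative target; note that c starts at 0 here so an entry already reachable with fewer numbers carries over into column i):

```python
# Sketch of the table-building phase: T[z][i] is True if the value z
# can be made as a non-negative combination of data[:i].
def build_table(data, target):
    k = len(data)
    T = [[False] * (k + 1) for _ in range(target + 1)]
    T[0][0] = True  # base case: sum 0 is reachable using no numbers
    for i in range(1, k + 1):
        x = data[i - 1]
        for z in range(target + 1):
            # c = 0 carries column i-1 forward; c > 0 uses x itself
            for c in range(z // x + 1):
                if T[z - c * x][i - 1]:
                    T[z][i] = True
                    break
    return T
```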
Possibility construction:
    function RecursivelyListAllThatWork(k, sum):
        // Using last k variables, make sum
        /* Base case: If we've assigned all the variables correctly,
         * list this solution. */
        if k == 0:
            print what we have so far
            return
        /* Recursive step: Try all coefficients, but only if they work. */
        for c = 0 to sum / x_k:
            if T[sum - c * x_k][k - 1] is true:
                mark the coefficient of x_k to be c
                call RecursivelyListAllThatWork(k - 1, sum - c * x_k)
                unmark the coefficient of x_k
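Putting both phases together, here is a minimal runnable sketch (again assuming non-negative values and target; it rebuilds the table so the snippet is self-contained, and collects coefficient tuples instead of printing):

```python
# Two-phase enumeration: build the reachability table, then walk it
# to list every coefficient vector that hits the target.
def solve(data, target):
    k = len(data)
    # Phase 1: T[z][i] is True if z is reachable using data[:i]
    T = [[False] * (k + 1) for _ in range(target + 1)]
    T[0][0] = True
    for i in range(1, k + 1):
        x = data[i - 1]
        for z in range(target + 1):
            T[z][i] = any(T[z - c * x][i - 1] for c in range(z // x + 1))
    # Phase 2: recurse from the last number, trying only coefficients
    # the table says can still be completed.
    coeff = [0] * k
    out = []
    def rec(i, s):
        if i == 0:
            out.append(tuple(coeff))  # a complete solution
            return
        x = data[i - 1]
        for c in range(s // x + 1):
            if T[s - c * x][i - 1]:
                coeff[i - 1] = c
                rec(i - 1, s - c * x)
                coeff[i - 1] = 0
    rec(k, target)
    return out
```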
This is the basic idea; my actual code is slightly different because I am using bounds to remove the possibility of infinite values (I say a single value cannot exceed the value of the sum).
The problem is that the table-building part does not scale. It is flawed in at least two ways: first, each step depends on the previous number being completed (so I cannot break the work up and run it independently for each number), and second, it requires reading the table before writing to it (I am learning how to get around this technically, but currently it makes the program very slow).
Is there a more efficient way to do this that scales?
Here's an approach I tried to take but that failed (so far): create a large table full of all possible values z up to target_sum, create another large table of T[z - c * x_i][i - 1], and compare whether the values exist. If they do exist, add T[z][i] to a third table that contains the correct master results.
I don't need code, just the logic (if this is possible). But if it helps you (as it often helps me understand), here is some Python code with my approach/examples:
    #data = [-2,10,5,50,20,25,40]
    #target_sum = 100
    data = [1,2,3,4,5,6,7,8,9,10]
    target_sum = 10

    # T[x, i] is True if 'x' can be solved
    # by a linear combination of data[:i+1]
    T = []  # all values are False by default
    T.append([0, 0])  # base case

    R = 200  # Maximum size of any partial sum
    max_percent = 0.3  # Maximum weight of any term

    for i, x in enumerate(data):  # i is index, x is data[i]
        for s in range(-R, R+1):  # set the range one higher than sum to include sum itself
            max_value = int(abs((target_sum * max_percent)/x))
            for c in range(max_value + 1):
                if [s - c * x, i] in T:
                    T.append([s, i+1])

    coeff = [0]*len(data)

    def RecursivelyListAllThatWork(k, sum):  # Using last k variables, make sum
        # /* Base case: If we've assigned all the variables correctly, list this
        #  * solution.
        #  */
        if k == 0:
            # print what we have so far
            print(' + '.join("%2s*%s" % t for t in zip(coeff, data)))
            return
        x_k = data[k-1]
        # /* Recursive step: Try all coefficients, but only if they work. */
        max_value = int(abs((target_sum * max_percent)/x_k))
        for c in range(max_value + 1):
            if [sum - c * x_k, k - 1] in T:
                # mark the coefficient of x_k to be c
                coeff[k-1] = c
                RecursivelyListAllThatWork(k - 1, sum - c * x_k)
                # unmark the coefficient of x_k
                coeff[k-1] = 0

    RecursivelyListAllThatWork(len(data), target_sum)
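One detail worth noting about my code above: `[s - c * x, i] in T` scans the whole list T on every check, which is linear in the table size. Storing the reachable (value, index) pairs in a Python set makes each membership test constant-time on average. A minimal sketch of the same table-building logic with a set (build_reachable_set is a hypothetical name, and it assumes positive data values and a non-negative target):

```python
# Same reachability bookkeeping as the list-based version, but kept in
# a set of (value, index) tuples so lookups are O(1) on average.
def build_reachable_set(data, target):
    reachable = {(0, 0)}  # base case: sum 0 using no numbers
    for i, x in enumerate(data):
        for s in range(target + 1):
            for c in range(s // x + 1):
                if (s - c * x, i) in reachable:
                    reachable.add((s, i + 1))
                    break
    return reachable
```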
Any help or suggestions would be appreciated. I have worked on this for a long time and all my experiments have failed. I'm hoping to get the correct answer but even ideas of different approaches would be great so I can experiment with them.
Thank you.
p.s. I asked a question on Stack Overflow 2 days ago about improving my existing algorithm, but I got answers from posters who admitted they didn't fully understand what I was asking for, and because they answered the question, I am unable to delete it to post here. I have flagged it for deletion.
Update: Regarding some of the comments, I'm not looking for a fast way of doing this (although that would be nice); I'm looking for a scalable way. My method works, but each loop depends on the previous loop, which binds it to a single process. The math is such that it builds upon previous results. If I can somehow break the process up into independent parts, then I can use more CPUs/computers to handle the work. I know it'll take a long time, but if it takes 600 hours on one CPU, then two should cut it down a bit, and so on. Right now I can't use other computers, so I'm forced to wait 600 hours (while everything else on the system is idle). Please help!
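To illustrate the parallelism that is available despite the column-to-column dependency: within a single column i, each entry reads only column i-1, so the loop over z can be split across workers even though the columns themselves must be built in order. A minimal sketch (next_column is a hypothetical helper; it assumes positive values, and threads are used here only for brevity, since real CPU scaling would use processes or separate machines):

```python
from concurrent.futures import ThreadPoolExecutor

def next_column(prev_col, x):
    """Build column i from column i-1 for the number x.

    prev_col[z] says whether z was reachable using the earlier numbers;
    each entry of the new column is computed independently, so the z
    values can be farmed out to any number of workers.
    """
    def entry(z):
        # c = 0 carries the previous column forward; c > 0 uses x itself
        return any(prev_col[z - c * x] for c in range(z // x + 1))
    with ThreadPoolExecutor(max_workers=4) as pool:
        return list(pool.map(entry, range(len(prev_col))))
```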
Also, the results are large but not infinite, as I have bounds set so that no number can exceed a certain percentage of target_sum.