Say I have 20 sets, each containing a variable number of elements. How would I go about finding the 10 elements that together cover the largest number of sets?
Imagine I could search for three terms at once on Wikipedia, and that I know all the data in Wikipedia. Each page is a set of words. I want to find the three terms that together return as many pages as possible.
At first, searching for the most frequently occurring words seems like the best solution. That is probably fine for just three terms, but imagine ten thousand terms over very small sets. The most frequently occurring words tend to occur together in the same sets (e.g., "Happy" and "Birthday"), so adding the second word covers few sets that the first did not already cover. What we're really trying to find is something like the best orthogonal representation of all these sets, with N dimensions.
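To make the concern concrete, here is a minimal Python sketch on made-up data. It compares picking the most frequent terms with a simple greedy heuristic that, at each step, picks the term covering the most sets not yet covered. The function names (`coverage`, `top_frequency`, `greedy_cover`) and the toy `pages` data are my own assumptions for illustration, and the greedy pick is only a heuristic, not a guaranteed optimum.

```python
from collections import Counter


def coverage(sets, terms):
    """Count the sets that contain at least one of the chosen terms."""
    chosen = set(terms)
    return sum(1 for s in sets if s & chosen)


def top_frequency(sets, k):
    """Pick the k terms that appear in the most sets (ties broken alphabetically)."""
    counts = Counter(t for s in sets for t in s)
    ranked = sorted(counts, key=lambda t: (-counts[t], t))
    return ranked[:k]


def greedy_cover(sets, k):
    """Pick k terms one at a time, each covering the most still-uncovered sets."""
    uncovered = list(range(len(sets)))
    picked = []
    for _ in range(k):
        # Count terms only over the sets no picked term has covered yet.
        counts = Counter(t for i in uncovered for t in sets[i])
        if not counts:
            break  # every set is already covered
        best = min(counts, key=lambda t: (-counts[t], t))
        picked.append(best)
        uncovered = [i for i in uncovered if best not in sets[i]]
    return picked


# Hypothetical toy data: each "page" is a set of words.
pages = [
    {"happy", "birthday"},
    {"happy", "birthday", "cake"},
    {"happy", "birthday", "party"},
    {"tax", "form"},
    {"tax", "law"},
]

print(coverage(pages, top_frequency(pages, 2)))  # 3
print(coverage(pages, greedy_cover(pages, 2)))   # 5
```

Here the two most frequent words ("birthday" and "happy") always appear together, so they cover only three pages, while the greedy pick ("birthday", then "tax") covers all five.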
I haven't studied set theory yet, and although I am reading through Suppes' Axiomatic Set Theory nowadays, I'm sure I haven't formulated this problem in precise mathematical language, and I apologize for this. I am willing to clarify the problem more.
I'm looking for an algorithm, or a mathematical model that I could implement as an algorithm. Perhaps this question would be better posed on StackOverflow!
Note that the result should be good, but it does not need to be the exact optimum.