2
$\begingroup$

Say I have nodes A,B,C and D. If choose to start with A then B I can make a total of 2^(4-2) subsets:

For A then B

{A, B} {A, B, C} {A, B, D} {A, B, C, D} 

I can do the same by choosing first A then C:

For A then C

{A, C} {A, B, C} {A, C, D} {A, B, C, D} 

But as you can see there is an overlap, {A,B,C} and {A,B,C,D} occur in both lists.

How do I calculate the number of distinct subsets, keeping in mind that I would want to compare any number of these groups of subsets of length 2^(N-2)?

I can do this by remembering all the subsets generated for A then B and then only add distinct ones for A then C but I was hoping there is a shortcut since I will be working with a large number of nodes and I'm only interested in the number not in the actual subsets.

A little more information might help: the nodes are actually in a partially ordered graph, and the above mentioned Y will be equal to the number of edges in this graph.

  • 0
    No I'm looking for the number of sets which have 2 specific elements, for example A and B, and then I want to know the number of sets which have another 2 specific elements, for example B and C. This is obviously the same number, but then I want to know how many distinct subsets these two sets of subsets have.2012-01-20

3 Answers 3

1

EDIT: I've rephrased the answer to keep a more compact answer. I keep the mention to the binomial coefficient for consistency with the comments.

So the question boils down to this: if I have N nodes and Y subsets of size $2^{(N-2)}$ how many of these subsets are actually distinct?

At first, I thought it was possible to use the binomial coefficient $\left(\begin{array}{c}n\\ k\end{array}\right)$, but this is clearly not the case.

As mentioned by the other answers and comments, the inclusion/exclusion principle should be used, and the general formula is: $ \mathbf{card}(\bigcup^n_{i=1} A_i) = \sum^n_{k=1} (-1)^{k+1} \left(\sum_{1 \leq i_1 < \cdots < i_k \leq n} \mathbf{card}(A_{i_1} \cap \cdots \cap A_{i_k}) \right) $

So we just need to define the $A_i$. Given a set of nodes $N$, we are interested in the subsets generated by two letters $X, Y \in N$, and we write:

$ Z_{XY} = \{ Z \subseteq N \mid X \in Z \land Y \in Z\} $

In this case, the question becomes: given a set of generators $G = \{\{X_1, Y_1\}, \cdots \{X_k, Y_k\}\}$ (note that the number $k$ stands for $Y~$ in the original question), what is the cardinality of $Z_{X_1Y_1} \cup \cdots \cup Z_{X_kY_k}$?

If we write $A_i$ for $Z_{X_iY_i}$, we have $ \mathbf{card}(A_{i_1} \cap \cdots \cap A_{i_n}) = 2 ^{N - \mathbf{card}(\{X_{i_1}, Y_{i_1}, \cdots, X_{i_n}, Y_{i_n}\})} $ because the cardinality of the intersection corresponds to the number of subsets possible without the generators. Using this formula and the general formula for inclusion/exclusion, you can calculate the cardinality of the set.

For instance, we have: $ \begin{align*} \mathbf{card}(Z_{X_1Y_1} \cup Z_{X_2Y_2}) & = \mathbf{card}(Z_{X_1Y_1}) + \mathbf{card}(Z_{X_2Y_2}) - \mathbf{card}(Z_{X_1Y_1} \cap Z_{X_2Y_2}) \\ & = 2^{N-2} + 2^{N-2} - 2^{N - \mathbf{card}(\{X_1, Y_1, X_2, Y_2\})} \end{align*}$

Similarly: $ \begin{align*} \mathbf{card}(Z_{X_1Y_1} \cup Z_{X_2Y_2} \cup Z_{X_3Y_3}) & = \mathbf{card}(Z_{X_1Y_1}) + \mathbf{card}(Z_{X_2Y_2}) + \mathbf{card}(Z_{X_3Y_3}) \\ & - ~ \mathbf{card}(Z_{X_1Y_1} \cap Z_{X_2Y_2}) - ~ \mathbf{card}(Z_{X_1Y_1} \cap Z_{X_3Y_3}) - ~ \mathbf{card}(Z_{X_2Y_2} \cap Z_{X_3Y_3}) \\ & + ~ \mathbf{card}(Z_{X_1Y_1} \cap Z_{X_2Y_2} \cap Z_{X_3Y_3}) \\ & = 2^{N-2} + 2^{N-2} + 2^{N-2} \\ & - ~2^{N - \mathbf{card}(\{X_1, Y_1, X_2, Y_2\})} - 2^{N - \mathbf{card}(\{X_1, Y_1, X_3, Y_3\})} - 2^{N - \mathbf{card}(\{X_2, Y_2, X_3, Y_3\})} \\ & + ~2^{N - \mathbf{card}(\{X_1, Y_1, X_2, Y_2, X_3, Y_3\})} \end{align*}$

With the simple example given in the original question:

$N = \{A, B, C, D\}$

$Y = Z_{AB} \cup Z_{AC} \cup Z_{BC}$

For AB, we have the subsets $\{\{A,B\},\{A,B,C\},\{A,B,D\},\{A,B,C,D\}\}$

For AC, we have the subsets (I do not include those already in AB): $\{\{A, C\},\{A,C,D\}\}$

For BC, we have the subsets (I do not include those already in AB): $\{\{B, C\},\{B,C,D\}\}$.

Using the formula, we have indeed $\mathbf{card}(Y) = 2^2 + 2^2 + 2^2 - 2 - 2 -2 + 2 =8$.

If we consider BD instead of BC, then we just add the subsets $\{\{B, C\}\}$, which gives $7$, as expected.

  • 0
    It's reformulated now, I think this general formula should work. I hope it rhymes with you data :)2012-01-22
3

If there are $N$ nodes then there are $2^N$ possible sets, including the empty set and the universal set. Taken together, these sets make up the power set.

You are interested in subsets of the power set which have a give two nodes as elements of each of their elements. There are ${N \choose 2}=\frac{N(N=1)}{2}$ such subsets, each of which has $2^{N-2}$ elements.

If you choose two of these subsets of the power set and look at the the numer of distinct elements of their union, this is the sum of their sizes less the size of their overlap or intersection (to avoid double counting). This depends on whether the two nodes which define the subsets of the power set overlap or not.

  • If they do overlap (your example with $A,B$ and $A,C$) then the value is $2\times 2^{N-2}-2^{N-3} = 3\times 2^{N-3}$. With $N=4$ this gives $6$.

  • If they do not overlap (e.g. $A,B$ and $C,D$) then the value is $2\times 2^{N-2}-2^{N-4} = 7\times 2^{N-4}$. With $N=4$ this gives $7$.

Added:

  • It gets more complicated with more cases using inclusion-exclusion. For example with $A,B$ and $A,C$ and $D,E$ the value is $3\times 2^{N-2} -2^{N-3} -2 \times 2^{N-4} +2^{N-5}= 25 \times 2^{N-5}$
  • 1
    @Roy: the$n$ it get more complicated. In any case you get inclusion-exclusion if you have more than two. I will add something more to my answer2012-01-20
2

Use Inclusion-Exclusion. Number of subsets containing $A$ and ($B$ or $C$) equals number containing $A$ and $B$, plus number containing $A$ and $C$, minus number containing $A$ and $B$ and $C$.

  • 0
    Thanks Gerry, I'm only finishing up my Bsc. Computer Science so I've only had one discrete mathematics course which didn't include much combinatorics but I'm sure to look into it :)2012-01-21