4
$\begingroup$

Given a set of $n$ binary strings of length $m$, how many binary strings at least should we add to the set to make sure that the binary "AND" and "OR" operators are closed on the set? The output of the problem is the number of binary strings needed to add.

For example, given the set {001, 010, 110}, we should at least add {000, 011, 111} to the set to make both "AND" and "OR" operators closed. So the output is 3.

Is there any good algorithm or idea related to the problem? Thanks a lot.

  • 0
    But how to optimize it?2011-05-20

2 Answers 2

2

I believe you are trying to compute the "sublattice of the boolean lattice of length m" generated by your n binary strings. I don't know a fast way to compute sublattices. In general, it is exponential in n.

Here is an example: For any positive integer n, and any positive integer m ≥ n, start with the binary strings of length m which are zero in position i for all i > n, and which are nonzero in all but one other place. There are exactly n such strings. How many strings do you have to add? 2n−n, since every string with exactly one nonzero entry (in the first n positions) is an AND of n−1 of the initial strings. This output is exponential length in n.


If you included "not" or otherwise included "xor", you would suddenly upgrade to finding "ideals in the boolean algebra" and this becomes linear algebra, and things like gaussian elimination should make the problem quadratic in n.

Of course the output itself could still be exponential in n, so you have to settle for a different, but very useful, kind of output: a basis.


As pointed out by Ross Millikan, the critical problem for a practical solution is a better data-type for the output. Just listing all the words is going to be exponential in general, but sometimes one could produce ridiculously fast membership test functions (and product them ridiculously quickly) — in my example, the membership function simply checks that the positions from n+1 to m are all 0.

However, no data-type can work magic. Depending on how many possible answers there are, the data-type will have to be at least a certain size. Queyranne–Tardella (2008) attempt to answer the question of how many sublattices are there and in fact study exactly the original problem. They provide a data-type for the output and prove it is memory optimal up to a constant factor:

Queyranne, Maurice; Tardella, Fabio. "Sublattices of product spaces: hulls, representations and counting." Discrete Math. 308 (2008), no. 9, 1508–1523. MR2392592 DOI:10.1016/j.disc.2007.03.080

  • 0
    @Timothy: the way the problem works is that, when you're concerned with closure of the set, there is no 'least'. Yes, you could add _all_ strings to satisfy that you have all possible ANDs and ORs (plus maybe a lot more. But to ask for closure, you're asking for exactly that already, _the_ subset of the universe that satisfies the criteria (adding more would be adding strings that don't fit the criteria of being 'reachable' by ANDs and ORs).2011-05-20
1

You can look for cases where one bit implies another. In your example bit 2 (the MSB) implies bit 1-there are no examples in your given set with bit 2 on and bit 1 off. So you can never get rid of this and 100 and 101 will not be required. This is $n^2m$ in complexity

Added in response to comment: You are looking for cases where bit $i$ is set in all your input words whenever bit $j$ is set. So you could make an array $m \times m$ and initialize it to all $1$. Go through all $n$ input words. For each word, check if bit i=0 and bit j=1. If so, set the $(i,j)$ element to $0$. At the end, if any entries in the array are still $1$, having bit $j$ on means for sure bit $i$ is on in your input words. As AND and OR cannot destroy this, none of your output words will have $i$ off and $j$ on. As you check $m(m-1)/2 \approx m^2$ bits in $n$ input words, this is of order $m^2n$

  • 0
    How did you get $n^2 m$? Can you say a little more about the idea of your algorithm? I can't fully understand it. Thanks.2011-05-20