How hard is it to translate an arbitrary well-formed formula into CNF formula? It seems it can get exponential in some occasions like $(a\wedge b)\vee (c\wedge d)$ is transformed into $(a\vee c)\wedge(a\vee d)\wedge (b\vee d)\wedge (b\vee c)$, in which the size of the formula seems to expand exponentially. Therefore, I feel the problem should be in NPC. However, I do not have a clue of how to reduce this problem.
Converting each formula into Conjunctive Normal Form?
2 Answers
It is true that a CNF expansion can be exponentially large.
However, the concept of NP-completenes is usually only applied to decision problems, where the expected answer is always in {yes,no}. One reason for this is exactly to avoid problems that are "hard" simply because the answer is large and there takes a lot of time to write.
So you cannot even speak about whether the problem is in NPC without first reformulating it such that its output is a single bit. (There are some well-known NP-complete problems that naturally seem to ask for more information -- such as the Traveling Salesman. But what one really proves is that the yes/no problem "is there an itinerary of length at most $N$?" is NP-complete).
Also, you seem to be committing a fallacy when you jump from "this is hard" to "it is probably NP-complete". Remember that there are two points to NP-completenes:
The problem must be hard enough, usually called "NP-hard", that is, there are polynomial reductions from every problem in NP -- usually shown by presenting a single reduction from another problem already known to be NP-hard.
At the same time, the problem must be easy enough, namely, it must be in NP. That is, there must be a nondeterminsitic machine that solves it in polynomial time. Or equivalently, there must be a set of "certificates" for problem instances whose answer is "yes" such that a purported certificate can be checked for correctness in polynomial time.
Your intuitive leap to "should be in NPC" appears to ignore the second test.
-
0Also, note that "hardness" is not a total order between problems. In particular, it is **not** the case that just because a problem is not in P, it has to be NP-hard. Assuming that P$\ne$NP, one can explicitly construct (by diagonalization) a subset of $\mathbb N$ that is neither in P nor NP-hard. – 2012-10-07
There are various ways to convert formulas to CNF avoiding the exponential growth of your example.
Wikipedia's Conversion_into_CNF shows how to do that. A good reference is the Handbook of Satisfiability, Chapter 2, CNF Encodings.
-
2It should be noted that the way they avoid exponential growth is by finding the CNF for a non-equivalent, but equisatisfaiable, formula. – 2013-11-25