On the Hardness of Counting Problems - Part 1
1 What is a Counting Problem?
Every undergraduate student in Computer Science knows the $P$ and $NP$ families and the legendary question "does $P$ equal $NP$?". However, there is another interesting family that is not always covered in undergraduate courses: the $\#P$ ("sharp P") family. In this post we will define and discuss this family, along with two concrete similar-but-different problems from it that demonstrate interesting properties of its hardness.
Formally, $P$ is the family of decision problems that can be solved in polynomial time. A well-known example is the Perfect Matching problem: given a graph, determine whether it has a perfect matching or not. A perfect matching is a set of edges that touches every vertex in the graph, such that any pair of edges in the set has disjoint endpoints.
The family $NP$ contains all decision problems that have a polynomial-time algorithm in the non-deterministic computational model (we won't get into that). We know that $P \subseteq NP$, but we don't know whether the other direction also holds. A famous $NP$-complete example is the Hamiltonian Cycle problem: given a graph, determine whether it contains a cycle that visits every vertex exactly once. We say "complete" to note that the problem is at least as hard as any other problem in that family; that is, if we could solve it efficiently (i.e., in polynomial time) then we could solve any other problem in the family efficiently. When a problem $A$ is at least as hard as another problem $B$, we denote it $B \le A$.
Any problem in either $P$ or $NP$ can be rephrased slightly differently. We can ask "how many perfect matchings does this graph contain?" rather than "is there a perfect matching in this graph?". When asking how many instead of is there, the problem becomes a Counting problem. Following that, the formal definition of $\#P$ is: the family of the Counting versions of the problems in $NP$.
Notice that for any problem $A \in NP$ we have $A \le \#A$, because having a count of the desired property can, in particular, distinguish between the cases where this count is $0$ or higher, that is, decide whether that property exists or not. Hence, we can safely state the general hierarchy $P \subseteq NP \le \#P$.
Let $PM$ and $HC$ be the Perfect Matching and Hamiltonian Cycle problems respectively, and let $\#PM$ and $\#HC$ be their Counting versions. It is interesting to see whether the hierarchy between problems from $P$ and $NP$ is preserved for their Counting versions as well. For instance, for the problems above, we know that $PM \le HC$ since $PM \in P$ and $HC$ is $NP$-complete; does that imply $\#PM \le \#HC$ as well?

The short answer is no. In the next section we will show that in a detailed discussion.
2 Counting Makes Problems Much Harder
We start with two basic definitions from the field of boolean logic.
A Conjunctive Normal Form ($CNF$) formula $\varphi$ over a set of $n$ boolean variables is defined as $\varphi = C_1 \wedge C_2 \wedge \dots \wedge C_m$, where each of its $m$ clauses is of the form $C_i = \ell_1 \vee \ell_2 \vee \dots \vee \ell_k$; that is, a clause is a disjunction of some number of literals. A literal $\ell$ is a variable or its negation. Note that since there are $n$ variables, a clause contains at most $k \le 2n$ distinct literals.
A Disjunctive Normal Form ($DNF$) formula $\psi$ over a set of $n$ boolean variables is defined analogously to the $CNF$ version, as $\psi = C_1 \vee C_2 \vee \dots \vee C_m$, where each of its $m$ clauses is of the form $C_i = \ell_1 \wedge \ell_2 \wedge \dots \wedge \ell_k$.
The CNF-SAT problem gets as input a $CNF$ boolean formula and decides whether that formula has a satisfying assignment (s.a.) or not, that is, an assignment of $True$/$False$ to each of its variables such that the formula evaluates to $True$. DNF-SAT is defined the same way, but for $DNF$ formulas rather than $CNF$ ones.
We will need the following facts in the coming discussion:
- The $\neg$ operator (negation) transforms, by De Morgan's laws, any $CNF$ formula to a $DNF$ formula and vice versa.
- For any boolean formula $\varphi$ over $n$ variables (regardless of its form), there are exactly $2^n$ possible assignments, each of which satisfies exactly one of $\varphi$ and $\neg\varphi$.
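Both facts are easy to check by brute force on a small example. The sketch below (Python; the list-of-clauses encoding with signed integers for literals, in the style of the DIMACS format, and the helper names are my own, not part of the post) verifies the second fact for a tiny pair $\varphi$, $\neg\varphi$:

```python
from itertools import product

# A formula is a list of clauses; a literal is a signed 1-based variable
# index (DIMACS style): 2 means x2, -2 means NOT x2.

def eval_literal(lit, assignment):
    value = assignment[abs(lit) - 1]
    return value if lit > 0 else not value

def eval_cnf(clauses, assignment):
    # CNF: an AND of clauses, each clause an OR of literals.
    return all(any(eval_literal(l, assignment) for l in c) for c in clauses)

def eval_dnf(clauses, assignment):
    # DNF: an OR of clauses, each clause an AND of literals.
    return any(all(eval_literal(l, assignment) for l in c) for c in clauses)

# phi = (x1 OR NOT x2) AND (x2 OR x3); its negation, by De Morgan (fact 1),
# is the DNF formula (NOT x1 AND x2) OR (NOT x2 AND NOT x3).
phi = [[1, -2], [2, 3]]
not_phi = [[-1, 2], [-2, -3]]

for assignment in product([False, True], repeat=3):
    # Fact 2: each assignment satisfies exactly one of phi and NOT phi.
    assert eval_cnf(phi, assignment) != eval_dnf(not_phi, assignment)
```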
It is very easy to solve DNF-SAT efficiently, meaning in polynomial time. Given a $DNF$ formula $\psi$, iterate over its clauses until you find a satisfiable clause, and then declare that $\psi$ is satisfiable; if you fail to find such a clause, declare that $\psi$ is not satisfiable. Observe that a clause of the form $\ell_1 \wedge \dots \wedge \ell_k$ is satisfiable if and only if it does not contain both a variable and its negation, which we can check with a single pass over the literals of the clause. As a result, $\text{DNF-SAT} \in P$.
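The algorithm above is only a few lines of code. A minimal sketch, assuming a signed-integer clause encoding (DIMACS-style, where `-2` stands for the negation of variable 2; the function name is my own):

```python
def dnf_sat(clauses):
    """Decide DNF-SAT: clauses is a list of conjunctive clauses,
    each a list of signed variable indices (-2 means NOT x2)."""
    for clause in clauses:
        # A conjunctive clause is satisfiable iff it never contains
        # both a variable and its negation.
        literals = set(clause)
        if not any(-lit in literals for lit in literals):
            return True  # this clause alone can be satisfied
    return False

# (x1 AND NOT x1) is unsatisfiable on its own, but (x2 AND x3) is
# satisfiable, so the whole DNF formula is satisfiable.
assert dnf_sat([[1, -1], [2, 3]])
assert not dnf_sat([[1, -1]])
```

The running time is linear in the total number of literals, well within polynomial time.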
In contrast to DNF-SAT, CNF-SAT is known to be $NP$-complete, so we believe that it cannot be solved in polynomial time. Similar to the claim from the previous section, we conclude that CNF-SAT is at least as hard as DNF-SAT, that is, $\text{DNF-SAT} \le \text{CNF-SAT}$. However, with regard to the Counting versions of those problems, we can show the opposite inequality: $\#\text{CNF-SAT} \le \#\text{DNF-SAT}$.
To prove that, we need to show that if we could solve $\#\text{DNF-SAT}$ efficiently, then we could solve $\#\text{CNF-SAT}$ efficiently. So suppose we have a magical algorithm that solves $\#\text{DNF-SAT}$ in polynomial time. Given a $CNF$ formula $\varphi$ over $n$ boolean variables, construct the $DNF$ formula $\neg\varphi$ and feed it to the magical algorithm. Let its output be the number $k$; then return $2^n - k$.
The correctness of that reduction follows from the two facts we stated earlier: every one of the $2^n$ assignments satisfies exactly one of $\varphi$ and $\neg\varphi$, so $\varphi$ has exactly $2^n - k$ satisfying assignments. Note also that all the procedures can be executed in polynomial time, so we solved $\#\text{CNF-SAT}$ in polynomial time using an efficient solver for $\#\text{DNF-SAT}$; therefore, $\#\text{CNF-SAT} \le \#\text{DNF-SAT}$.
Actually, if we look closely at that reduction, we see that it works just as well in the other direction, that is, solving $\#\text{DNF-SAT}$ by an efficient algorithm for $\#\text{CNF-SAT}$, so we may conclude that both problems are equally hard. This is evidence that the hierarchy between decision problems is not preserved for their Counting versions. Putting it differently, we can say that Counting makes problems much harder, such that even an "easy" problem from $P$ becomes as hard as the entire $\#P$ when you consider its Counting version.
3 $\#\text{DNF-SAT}$ is Also $\#P$-Complete
Since $\#\text{CNF-SAT}$ and $\#\text{DNF-SAT}$ are equally hard, it is enough to show that $\#\text{CNF-SAT}$ is $\#P$-complete; for that we need a reduction from any problem in $\#P$ to $\#\text{CNF-SAT}$. Recall the Cook-Levin reduction, which proves that CNF-SAT is $NP$-complete. That reduction can be applied to any $A \in NP$ and produces a polynomial procedure that converts any input $x$ for the original problem $A$ to a $CNF$ formula $\varphi_x$ for the problem CNF-SAT, such that $\varphi_x$ is satisfiable if and only if the decision problem $A$ says Yes for the original input $x$. That way, any efficient solver for CNF-SAT solves $A$ efficiently, so $A \le \text{CNF-SAT}$.
An important property of the Cook-Levin reduction is that it keeps the number of s.a. of $\varphi_x$ exactly equal to the number of witnesses for the original input $x$ of $A$ (such reductions are called parsimonious). That means we can use the same reduction to convert any Counting problem $\#A \in \#P$ to $\#\text{CNF-SAT}$. That is, the reduction yields a procedure that converts any input $x$ for $\#A$ to a formula $\varphi_x$ for $\#\text{CNF-SAT}$ such that the number of s.a. of $\varphi_x$ is the solution for $\#A$ on $x$. As a result, $\#\text{CNF-SAT}$ is $\#P$-complete, and then so is $\#\text{DNF-SAT}$.
Following the conclusion from the previous section, we can say further that even a simple problem from $P$ can have a Counting version that is not just "harder" but as hard as any other problem in $\#P$.
4 Is There a Structure inside $\#P$?
Going back to the $\#\text{CNF-SAT}$ and $\#\text{DNF-SAT}$ problems, we saw that they are related via the relation $\#\text{CNF-SAT}(\varphi) = 2^n - \#\text{DNF-SAT}(\neg\varphi)$ for any $CNF$ formula $\varphi$ over $n$ boolean variables. We exploited that relation to show that the problems are equally "hard" when talking about exact solutions. However, a hierarchy emerges once we move to consider approximate solutions for those problems. Therefore, there is some kind of structure inside the family $\#P$ under the appropriate notion of "solving" the problems.
In the next post we will discuss an approximation algorithm for $\#\text{DNF-SAT}$ and the hardness of approximating $\#\text{CNF-SAT}$, showing that the additive relation above is not enough for constructing an approximation algorithm for $\#\text{CNF-SAT}$ based on an approximation for $\#\text{DNF-SAT}$.