1 Recap

The previous post discussed the $\mathbf{NP}$ and $\#\mathbf{P}$ families. We said that the hierarchy between decision problems in $\mathbf{NP}$ is not preserved when moving to their counting versions. We defined $\mathrm{CNF}$ and $\mathrm{DNF}$ formulas (see section 2 in the previous post), and the problems CNF-SAT, which is $\mathbf{NP}$-complete, and DNF-SAT, which is in $\mathbf{P}$. We used the relation between their counting versions to show that $\#$CNF-SAT and $\#$DNF-SAT are equally hard to solve exactly. The relation is as follows: if $\varphi$ is a CNF formula over $n$ variables, then $$\#\text{DNF-SAT}(\neg\varphi) = 2^n - \#\text{CNF-SAT}(\varphi).$$ Furthermore, we showed that both problems are $\#\mathbf{P}$-complete, demonstrating that even an “easy” decision problem like DNF-SAT can be very hard when considering its counting counterpart.
In this post we will show that there is some hierarchy inside $\#\mathbf{P}$ when considering approximate solutions instead of exact solutions.

2 Approximation Algorithm for #DNF-SAT

2.1 Deterministic Approximation

Let $\varphi$ be a DNF formula over $n$ variables with clauses $C_1, \dots, C_m$, denote by $\#\varphi$ its number of satisfying assignments, and let $\rho \ge 1$. We want to find a polynomial-time algorithm that yields some $V$ which is an approximation for $\#\varphi$. Formally, for an approximation factor of $\rho$, $\#\varphi / \rho \le V \le \rho \, \#\varphi$. We will talk more about the approximation parameter later.
First note that if a clause contains a variable and its negation, then immediately we can conclude that there are no satisfying assignments (s.a.) for that clause, as it contains a contradiction, so we will omit it from the formula (this can be done in linear time). From now on, assume that $\varphi$ does not contain any contradictory clause. For the rest of this discussion, we say that a boolean variable appears in a clause $C$ if either it or its negation is contained in $C$.
Let’s focus on an arbitrary clause $C_i$. We can easily count the number of s.a. it has: it is exactly $2^{k_i}$, where $k_i$ is the number of variables that do not appear in $C_i$. That is true because the value of any variable appearing in $C_i$ is determined by its appearance (whether or not it is negated), so only the variables which are not in $C_i$ can be either true or false, giving two options each.
Moreover, we can efficiently sample a uniformly random s.a. for that clause. Simply iterate over the variables: if a variable appears in $C_i$ then it has only one allowed value, so assign it that value; if it does not appear in $C_i$ then toss a fair coin and assign true or false accordingly.
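As a concrete sketch of these two operations, assume (hypothetically — the post fixes no encoding) that a clause is a list of signed integers, where $-3$ stands for $\neg x_3$:

```python
import random

def clause_count(clause, n):
    # Number of s.a. of a single (contradiction-free) DNF clause over
    # n variables: 2**k, where k counts the variables NOT appearing in it.
    return 2 ** (n - len({abs(lit) for lit in clause}))

def sample_clause_assignment(clause, n):
    # Uniformly random s.a. of the clause: variables appearing in the
    # clause take the value their literal dictates, the rest are coin tosses.
    forced = {abs(lit): lit > 0 for lit in clause}
    return {v: forced.get(v, random.random() < 0.5) for v in range(1, n + 1)}
```

For example, the clause $x_1 \land \neg x_2$ over $n=3$ variables has $2^{3-2}=2$ satisfying assignments.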
A trivial approximation algorithm calculates the exact count $c_i$ for each clause $C_i$ in $\varphi$ and returns their sum $V = \sum_{i=1}^{m} c_i$. However, we may have s.a. that were counted more than once (if they satisfy more than a single clause). Thus the approximation factor of that algorithm is $m$, because in the worst case all the clauses have the same set of satisfying assignments, so we counted each s.a. $m$ times, yielding $V = m \, \#\varphi$.
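A sketch of this trivial estimator, under the same hypothetical clause encoding (signed integers, $-3$ for $\neg x_3$):

```python
def naive_estimate(clauses, n):
    # Sum of the exact per-clause counts; an assignment satisfying
    # several clauses is counted once per clause, hence the factor-m gap.
    return sum(2 ** (n - len({abs(lit) for lit in c})) for c in clauses)
```

For $\varphi = x_1 \lor (x_1 \land x_2)$ over $n=2$ variables it returns $2 + 1 = 3$, while $\#\varphi = 2$.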
We can improve that algorithm by using the inclusion–exclusion principle: instead of accumulating the number of s.a. for each clause, accumulate the number of s.a. that weren’t counted so far.
Unfortunately, finding which assignments were already counted forces us to enumerate all possible assignments, which obviously takes exponential time. Hence we will give a probabilistic estimation instead. When working with estimations, we need to state the success probability of our estimation, and it is also preferable that we have a way to increase that probability (mostly at the cost of more computation and drawing more random samples).
In the next section we will describe how to get an estimation $V$ of $\#\varphi$ such that $(1-\varepsilon)\#\varphi \le V \le (1+\varepsilon)\#\varphi$ for any $\varepsilon > 0$, even a polynomially small one (i.e. $\varepsilon = 1/\mathrm{poly}(n)$), with a very high probability.

2.2 Probabilistic Approximation

The overall procedure is still simple: iterate over each clause $C_i$ and estimate the fraction of s.a. for $C_i$ that were not counted before (as a number between zero and one); denote that estimation by $\hat{Y}_i$, and let $c_i$ be the total number of s.a. for $C_i$, calculated as mentioned in the deterministic approximation section. Then output $V = \sum_{i=1}^{m} c_i \hat{Y}_i$.
It is left to show how to compute $\hat{Y}_i$, so we do the following. Let $t = \lceil \lambda m^2 / \varepsilon^2 \rceil$, where $\lambda > 0$ is a parameter that will be determined later. Sample, $t$ independent times, a uniformly random s.a. for $C_i$. For each sampled assignment define a corresponding binary variable $X_j$ that is $1$ if it does not satisfy any previous clause, and $0$ otherwise. Note that evaluating all the $X_j$ takes $O(tnm)$ time. Then return $\hat{Y}_i = \frac{1}{t}\sum_{j=1}^{t} X_j$. That completes our probabilistic approximation algorithm for $\#\varphi$.
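Putting the pieces together, here is a sketch of the whole estimator; the signed-integer clause encoding and the function names are illustrative choices, not from the post:

```python
import random

def satisfies(clause, a):
    # a maps variable -> bool; a DNF clause is a conjunction of literals
    return all(a[abs(lit)] == (lit > 0) for lit in clause)

def estimate_dnf_count(clauses, n, t):
    # clauses: DNF as a list of clauses, each a list of signed ints
    # t: number of samples drawn per clause
    total = 0.0
    for i, clause in enumerate(clauses):
        c_i = 2 ** (n - len({abs(lit) for lit in clause}))  # exact count
        hits = 0
        for _ in range(t):
            # uniform s.a. of clause i: coin-toss every variable,
            # then force the variables appearing in the clause
            a = {v: random.random() < 0.5 for v in range(1, n + 1)}
            for lit in clause:
                a[abs(lit)] = lit > 0
            # X_j = 1 iff no earlier clause already covers this assignment
            if not any(satisfies(prev, a) for prev in clauses[:i]):
                hits += 1
        total += c_i * hits / t  # contributes c_i * Y_i
    return total
```

For $\varphi = x_1 \lor x_2$ over two variables ($\#\varphi = 3$), the estimate concentrates around $3$: the first clause contributes exactly $2$, and the second contributes $2 \cdot \hat{Y}_2$ with $\hat{Y}_2 \approx 1/2$.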

Analysis

We need to analyze the quality of approximation of each $\hat{Y}_i$, then show that with high probability $V$ is within the claimed approximation bounds. For the first part, we focus on a single estimation $\hat{Y}_i$ for the clause $C_i$.
Let $c_i$ be the number of all s.a. for $C_i$ and let $d_i$ be the number of s.a. for $C_i$ that do not satisfy any clause before $C_i$. Our goal is to show that $\hat{Y}_i$ is, with high probability, close to $d_i / c_i$. Observe that all the variables $X_j$ used for estimating $\hat{Y}_i$ were actually sampled i.i.d. from a Bernoulli distribution with parameter $p_i = d_i / c_i$ (which is also its expectation). As a result $\mathbb{E}[\hat{Y}_i] = p_i$, so we apply Hoeffding’s inequality to get $$\Pr\left[\left|\hat{Y}_i - p_i\right| \ge \frac{\varepsilon}{m}\right] \le 2e^{-2t(\varepsilon/m)^2} \le 2e^{-2\lambda},$$ which means that with high probability the estimator is “successful”, that is, $|\hat{Y}_i - d_i/c_i| < \varepsilon/m$, and equivalently, $$\frac{d_i}{c_i} - \frac{\varepsilon}{m} < \hat{Y}_i < \frac{d_i}{c_i} + \frac{\varepsilon}{m}.$$ If all the estimators are successful, then $V = \sum_{i} c_i \hat{Y}_i \le (1+\varepsilon)\#\varphi$, and also similarly $V \ge (1-\varepsilon)\#\varphi$, which are exactly the approximation bounds that we want to achieve. To conclude the analysis, note that by applying the union bound, all the estimators are successful simultaneously with probability at least $1 - 2me^{-2\lambda}$.
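The step from per-estimator success to the global bound can be spelled out. Every s.a. of $\varphi$ is counted toward $d_i$ by exactly one clause, namely the first clause it satisfies, so $\sum_i d_i = \#\varphi$; also $c_i \le \#\varphi$ for every $i$. Assuming every estimator satisfies $|\hat{Y}_i - d_i/c_i| \le \varepsilon/m$,

```latex
\left|V - \#\varphi\right|
  = \Bigl|\sum_{i=1}^{m} c_i \hat{Y}_i - \sum_{i=1}^{m} d_i\Bigr|
  = \Bigl|\sum_{i=1}^{m} c_i \Bigl(\hat{Y}_i - \frac{d_i}{c_i}\Bigr)\Bigr|
  \le \frac{\varepsilon}{m} \sum_{i=1}^{m} c_i
  \le \frac{\varepsilon}{m} \cdot m \, \#\varphi
  = \varepsilon \, \#\varphi .
```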

Trade-off between the parameters

It is interesting to examine the relations between the different parameters, and how each affects the running time, the success probability, and the quality of the approximation.
Consider the parameter $t$, which is the number of samples the algorithm has to draw to estimate each $\hat{Y}_i$. It depends linearly on the parameter $\lambda$, which controls the “success probability” of all the estimators (simultaneously). On the one hand, a large $\lambda$ gives a success probability that is extremely close to $1$. But on the other hand, it increases the number of samples per clause, and with it the total running time of the algorithm, which is $O(tnm^2) = O(\lambda n m^4 / \varepsilon^2)$.
Similarly, the better an approximation we want for $\#\varphi$, the smaller we should set $\varepsilon$ (even a polynomially small value, i.e. $\varepsilon = 1/\mathrm{poly}(n)$); the number of samples and the running time then increase inversely proportionally to $\varepsilon^2$.
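A back-of-the-envelope sketch of this trade-off, using the sample count $t = \lceil \lambda m^2 / \varepsilon^2 \rceil$ from above (the function name is mine):

```python
from math import ceil

def samples_per_clause(m, eps, lam):
    # t grows linearly with lam (success probability ~ 1 - 2m*e^(-2*lam))
    # and inversely with eps**2 (a tighter multiplicative approximation).
    return ceil(lam * m * m / (eps * eps))
```

For example, $m = 100$ clauses with $\varepsilon = 0.1$ and $\lambda = 10$ already require about $10^7$ samples per clause.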

3 What About Approximating #CNF-SAT?

The previous section dealt with approximating, either in a deterministic or a probabilistic manner, the $\#$DNF-SAT problem. Unfortunately, as opposed to what we did with exact solutions, approximating $\#$DNF-SAT does not help us approximate $\#$CNF-SAT. We will prove that by proving a stronger claim.
Let $A$ be any $\mathbf{NP}$-complete problem, and denote by $\#A$ its counting version. There is no polynomial-time approximation algorithm for $\#A$ to any factor, unless $\mathbf{P} = \mathbf{NP}$.
Assume that $\mathbf{P} \ne \mathbf{NP}$. Suppose toward contradiction that we have some polynomial-time approximation algorithm for $\#A$, and denote it by $M$. Given an instance $x$ for the decision problem $A$, simulate $M(x)$ and get an approximation for $\#A(x)$. Then declare $x$ a No instance of the decision problem if that approximation is zero, and a Yes instance otherwise. Note that this reduction runs in polynomial time. Moreover, the counting version has zero solutions if and only if $x$ is a No instance. Since any multiplicative approximation of zero is still zero, we can use the approximation algorithm for the counting problem to efficiently solve the $\mathbf{NP}$-complete decision problem, contradicting $\mathbf{P} \ne \mathbf{NP}$.
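The reduction is short enough to state as code; `approx_count` stands in for the hypothetical approximation algorithm $M$:

```python
def decide_via_approx(x, approx_count):
    # #A(x) = 0 iff x is a No instance, and any multiplicative
    # approximation of 0 is still 0 -- so a zero estimate is conclusive.
    return approx_count(x) > 0  # True = Yes instance, False = No instance
```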

4 Formulas are Not Always Easy!

During this and the previous post, we mentioned DNF formulas as something which is easy to solve. We showed that DNF-SAT $\in \mathbf{P}$, and always compared it to its “difficult” variation, the CNF version. However, a DNF formula can also be “difficult”; it just depends on what we are asking. For example, deciding if a DNF formula is a tautology (i.e. evaluates to true for any assignment) is a $\mathbf{coNP}$-complete problem. One can show that by noting that if $\varphi$ is a CNF formula then $\neg\varphi$ is a DNF formula, and $\varphi$ is a No instance for CNF-SAT if and only if $\neg\varphi$ is a Yes instance for the DNF-TAUTOLOGY problem. Thus, the problems are equally hard.
Similarly, we can define an “easy” tautology problem for CNF formulas: deciding whether a CNF formula $\varphi$ is a tautology can be done in polynomial time, by checking whether its negation has a s.a. ($\varphi$ is a tautology if and only if it does not), which is polynomial since $\neg\varphi$ is a DNF formula and DNF-SAT $\in \mathbf{P}$.
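A sketch of that polynomial-time check, with the negation step folded in: negating a CNF clause yields a DNF clause, which is unsatisfiable exactly when the original CNF clause contains some variable together with its negation (clause encoding as before, signed integers):

```python
def is_cnf_tautology(cnf):
    # phi (CNF) is a tautology iff its negation (a DNF) has no s.a.,
    # i.e. iff every clause of the negated formula is contradictory --
    # equivalently, every CNF clause contains a literal and its negation.
    return all(any(-lit in clause for lit in clause) for clause in cnf)
```

For instance, $(x_1 \lor \neg x_1) \land (x_2 \lor \neg x_2 \lor x_3)$ is a tautology, while $(x_1 \lor x_2)$ is not.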
In conclusion, the hardness of those decision problems depends not only on the form of the formula but also on what we are asking to decide.