Derandomization for Pairwise Independent Seed
1 Randomized Algorithm Settings
A randomized algorithm is an algorithm that, in addition to its inputs, also uses internal coin-tosses (i.e., uniformly random bits) in order to decide on its output, so different invocations of the algorithm with the same inputs may yield different outputs that depend on the results of these coins. One can think of the internal coin-tosses as a binary string $r \in \{0,1\}^t$ of length $t$ that is fixed before the algorithm starts and consumed sequentially during the algorithm’s execution. Such a string is often called a seed, and the randomness within the algorithm now comes from choosing a different string at the beginning of each execution. Note that tossing $t$ fair coins is exactly like drawing a string $r$ uniformly at random from the set $\{0,1\}^t$ (i.e., each string has probability $2^{-t}$).
Suppose we have a randomized algorithm $A$ that solves a problem of interest. We have a guarantee that for any input $x$ of length $n$, algorithm $A$ runs in time $\mathrm{poly}(n)$, and outputs either a valid solution if it succeeds or $\bot$ (“failed”) if it does not. The success probability of the algorithm is some $p > 0$ (which may be even infinitesimally small).
The first question to ask is, can we convert the randomized algorithm into a deterministic one?
2 Brute Force Derandomization
The answer is yes, and there is a simple way to do that: upon receiving an input $x$, iterate over all possible strings $r \in \{0,1\}^t$ and run $A(x; r)$. That is, simulate $A$ on $x$ with $r$ as the random seed. Return the first solution that $A$ outputs (i.e., anything that is not $\bot$).
The idea behind this method is that if there is a non-zero probability for $A$ to output a valid solution for the input $x$, then there must be at least one random seed that leads to that output when executing algorithm $A$ on $x$. So all we need to do is iterate over all possible seeds of length $t$, and that yields a deterministic algorithm.
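As a concrete illustration, here is a minimal Python sketch of this brute-force derandomization. It assumes the randomized algorithm is given as a function `A(x, seed)` that returns `None` on failure; the toy algorithm below, which merely guesses an index of a 1-bit in `x` from its seed, is hypothetical and only serves as an example.

```python
from itertools import product

def derandomize(A, x, t):
    """Brute-force derandomization: try every seed in {0,1}^t.

    A(x, seed) is assumed to return a valid solution, or None on failure.
    """
    for seed in product([0, 1], repeat=t):  # all 2^t possible seeds
        result = A(x, seed)
        if result is not None:  # first seed that makes A succeed
            return result
    return None  # A fails on every seed: its success probability was 0

# Toy randomized algorithm: guess an index of a 1-bit in x by
# interpreting the seed as a number (hypothetical example).
def A(x, seed):
    i = int("".join(map(str, seed)), 2) % len(x)
    return i if x[i] == 1 else None

print(derandomize(A, [0, 0, 1, 0], 2))  # prints 2
```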
You probably already see the problem here. If $t = \Theta(n)$, then the running time of the deterministic procedure is $2^{\Theta(n)} \cdot \mathrm{poly}(n)$.
That is, of course, a terrible running time.
On the other hand, if $t$ is a constant or even up to $O(\log n)$, then the resulting running time stays polynomial, and everyone is happy. In the next section we will see that, with an additional assumption, we can “decrease” the seed length from $t$ to $O(\log t)$ bits.
3 Pairwise Independent Seed
We start off with a definition.
k-wise Independent Family. Let $X = \{X_1, \ldots, X_m\}$ be a set of random variables. We say that $X$ is a $k$-wise independent family if for any subset of $k$ variables $X_{i_1}, \ldots, X_{i_k}$ and values $v_1, \ldots, v_k$,
$$\Pr[X_{i_1} = v_1, \ldots, X_{i_k} = v_k] = \prod_{j=1}^{k} \Pr[X_{i_j} = v_j],$$
that is, any subset of $k$ random variables from $X$ is fully independent. The family $X$ is called a pairwise independent family when $k = 2$.
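To see that pairwise independence is genuinely weaker than full independence, consider the classic example $X_1, X_2, X_1 \oplus X_2$ with $X_1, X_2$ independent fair bits. The following Python snippet (an illustration, not part of the notes’ formal development) checks exhaustively that every pair is independent while the triple is not.

```python
from itertools import product

# All outcomes of the triple (X1, X2, X1 xor X2) as the seed (X1, X2)
# ranges uniformly over {0,1}^2.
outcomes = [(x1, x2, x1 ^ x2) for x1, x2 in product([0, 1], repeat=2)]

def pr(event):
    """Probability of an event under the uniform seed distribution."""
    return sum(1 for o in outcomes if event(o)) / len(outcomes)

# Every pair is independent: Pr[Xi=a, Xj=b] = Pr[Xi=a] * Pr[Xj=b].
for i, j in [(0, 1), (0, 2), (1, 2)]:
    for a, b in product([0, 1], repeat=2):
        joint = pr(lambda o: o[i] == a and o[j] == b)
        assert joint == pr(lambda o: o[i] == a) * pr(lambda o: o[j] == b)

# But the family is not 3-wise independent: the third bit is fully
# determined by the first two, so (1,1,1) never occurs (prob 0, not 1/8).
assert pr(lambda o: o == (1, 1, 1)) == 0
```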
What does this definition have to do with our setting? Observe that up until now we implicitly assumed that the algorithm needs its coins to be fully independent. That is, if we think of the seed $r = (r_1, \ldots, r_t)$ as a series of Bernoulli RVs, then we actually require $\{r_1, \ldots, r_t\}$ to be a $t$-wise independent family, which is a much stronger property compared to the pairwise independence property. If we relax that requirement and assume that the algorithm only needs its coins to be pairwise independent, then there is a way (described in the next section) to construct $t$ pairwise independent bits using only $O(\log t)$ fully independent bits. That construction can be “embedded” into our algorithm such that we ultimately end up with a procedure that only requires an $O(\log t)$-bit fully independent seed. Applying the derandomization method from the previous section then results in a deterministic algorithm that still runs in $2^{O(\log t)} \cdot \mathrm{poly}(n) = \mathrm{poly}(n)$ time.
Of course, the aforementioned derandomization method is valid only if the assumption of a pairwise independent seed doesn’t break the correctness analysis of the randomized algorithm.
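The whole scheme can be sketched in Python as follows, assuming a hypothetical algorithm `A(x, coins)` that needs only $t$ pairwise independent coins and returns `None` on failure. The `stretch` helper is the subset-XOR construction described in the next section.

```python
from itertools import product

def stretch(seed):
    """Expand k fully independent bits into 2**k - 1 pairwise independent
    bits: one output bit per non-empty subset S of seed positions,
    equal to the XOR of seed[i] over i in S."""
    k = len(seed)
    out = []
    for mask in range(1, 2 ** k):  # each mask encodes a non-empty subset
        bit = 0
        for i in range(k):
            if mask & (1 << i):
                bit ^= seed[i]
        out.append(bit)
    return out

def derandomize_pairwise(A, x, t):
    """Brute force over the *short* seed only: k = ceil(log2(t + 1))
    fully independent bits suffice for t pairwise independent coins,
    so the loop runs 2**k = O(t) times instead of 2**t."""
    k = t.bit_length()  # smallest k with 2**k - 1 >= t
    for seed in product([0, 1], repeat=k):
        result = A(x, stretch(list(seed))[:t])
        if result is not None:
            return result
    return None
```

The point of the wrapper is visible in the loop bound: the outer iteration is over $2^k = O(t)$ short seeds rather than $2^t$ long ones, so if $t \leq \mathrm{poly}(n)$ the whole procedure stays polynomial.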
4 Stretching the Seed
Suppose we have $k$ fair, fully independent coins; we want a method to generate from them $2^k - 1$ fair pairwise independent bits. The following claim shows such a construction.
Suppose we have $k$ fully independent Bernoulli random variables, each with probability $\frac{1}{2}$. Fix them in some arbitrary order to get a (random) vector $X = (X_1, \ldots, X_k)$ of length $k$. Let
$$F = \Big\{ \sum_{i \in S} X_i \bmod 2 \;:\; \emptyset \neq S \subseteq \{1, \ldots, k\} \Big\}$$
be the set of all possible sums modulo 2 of those RVs, and note that $|F| = 2^k - 1$; then $F$ is a family of pairwise independent bits.
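For a small $k$ the claim can be checked exhaustively. The following Python snippet (an illustration, not a substitute for the proof below) enumerates all $2^k$ values of the seed and verifies that every pair of distinct bits in the family takes each of the four value combinations with probability exactly $\frac{1}{4}$.

```python
from itertools import product

k = 3  # number of fully independent seed bits

def family(seed):
    """One bit per non-empty subset S of {0,...,k-1}: the XOR (sum mod 2)
    of the seed bits indexed by S."""
    return [
        sum(seed[i] for i in range(k) if mask & (1 << i)) % 2
        for mask in range(1, 2 ** k)
    ]

m = 2 ** k - 1  # family size
samples = [family(seed) for seed in product([0, 1], repeat=k)]

# Exhaustive check: every pair of distinct family members is a pair of
# independent fair bits, i.e. each of the 4 outcomes has probability 1/4.
for i, j in product(range(m), repeat=2):
    if i == j:
        continue
    for a, b in product([0, 1], repeat=2):
        count = sum(1 for f in samples if f[i] == a and f[j] == b)
        assert count / len(samples) == 1 / 4
print("all", m, "bits are pairwise independent")
```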
To prove this we start with a very basic lemma.
Let $X, Y \sim \mathrm{Ber}(\frac{1}{2})$ be two independent random variables, and let $Z = X + Y \bmod 2$; then $Z \sim \mathrm{Ber}(\frac{1}{2})$. Moreover, $Z$ is independent of $X$ and of $Y$.
For any $z \in \{0,1\}$,
$$\Pr[Z = z] = \Pr[X = 0]\Pr[Y = z] + \Pr[X = 1]\Pr[Y = 1 - z] = \frac{1}{4} + \frac{1}{4} = \frac{1}{2},$$
thus $Z \sim \mathrm{Ber}(\frac{1}{2})$. Next, w.l.o.g., we prove that $Z$ is independent of $X$. Take any $x, z \in \{0,1\}$. In case $x = 0$,
$$\Pr[Z = z, X = 0] = \Pr[Y = z, X = 0] = \frac{1}{4} = \Pr[Z = z]\Pr[X = 0],$$
so $Z$ and $X$ are independent. And in case $x = 1$, we similarly get $\Pr[Z = z, X = 1] = \Pr[Y = 1 - z, X = 1] = \frac{1}{4} = \Pr[Z = z]\Pr[X = 1]$, which concludes the proof.
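The lemma is also easy to confirm by enumerating the four equally likely values of $(X, Y)$; this Python snippet is just such a sanity check.

```python
from itertools import product

# Exhaustive check of the lemma: for independent fair bits X, Y and
# Z = X + Y mod 2, Z is a fair bit that is independent of X.
outcomes = list(product([0, 1], repeat=2))  # (x, y), each with prob 1/4

def pr(event):
    """Probability, under the uniform distribution on (x, y), of an
    event phrased in terms of x, y and z = x + y mod 2."""
    return sum(1 for x, y in outcomes if event(x, y, (x + y) % 2)) / 4

assert pr(lambda x, y, z: z == 0) == 1 / 2  # Z ~ Ber(1/2)
for xv, zv in product([0, 1], repeat=2):
    joint = pr(lambda x, y, z: x == xv and z == zv)
    assert joint == pr(lambda x, y, z: x == xv) * pr(lambda x, y, z: z == zv)
```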
Now we prove the claim from above.
Take two different variables $Z_1, Z_2 \in F$, say $Z_1 = \sum_{i \in S_1} X_i$ and $Z_2 = \sum_{i \in S_2} X_i$; we want to show that $Z_1$ and $Z_2$ are independent. Obviously, we only care about the sets of indices that participate in each sum; these are $S_1, S_2 \subseteq \{1, \ldots, k\}$ respectively, with $S_1 \neq S_2$. Denote $A = \sum_{i \in S_1 \setminus S_2} X_i$, $B = \sum_{i \in S_2 \setminus S_1} X_i$ and $C = \sum_{i \in S_1 \cap S_2} X_i$, where summation over the empty set is defined as the constant $0$ (all the operations in the proof are modulo 2). Note that any pair of variables from $\{A, B, C\}$ are independent because they are defined on mutually disjoint sets of indices. Moreover, by the lemma above, each of them, if it is not the constant zero, is distributed as $\mathrm{Ber}(\frac{1}{2})$. Plugging in the new notation we get
$$Z_1 = A + C, \qquad Z_2 = B + C.$$
Because $S_1 \neq S_2$, at most one of $A, B$ may be the constant zero, so at least one of them is distributed as $\mathrm{Ber}(\frac{1}{2})$; w.l.o.g. let it be $A$. Then $Z_1$ has the component $A$, which does not appear in $Z_2$ and flips the result of the sum with probability $\frac{1}{2}$ regardless of what happens anywhere else; therefore $Z_1$ is independent of $Z_2$, and that completes the proof.
One last remark. In our setting the randomized algorithm either returns a solution or says “failed”. If we have a polynomial-time procedure to validate a solution, we can generalize the setting to the case where the algorithm always outputs a candidate solution, and then we verify it to determine whether it is a “success” case or a “failure” case.
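A minimal sketch of this verifier-based variant, assuming hypothetical callables `A` (always outputs some candidate) and `verify` (a polynomial-time validity check):

```python
from itertools import product

def derandomize_with_verifier(A, verify, x, t):
    """Brute-force derandomization when A always outputs a candidate:
    try every seed and keep the first candidate the verifier accepts."""
    for seed in product([0, 1], repeat=t):
        candidate = A(x, seed)
        if verify(x, candidate):
            return candidate
    return None  # no seed leads to a valid solution
```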