1 The Definitions and Motivation

Hoeffding's and Chernoff's bounds (a.k.a. "inequalities") are very common concentration inequalities that are used in many fields of computer science. A concentration inequality bounds the probability of the event in which a sum of random variables is "far" from the sum of their means.
For example: suppose we have $n$ fair coins to toss and we want to know the probability that the total number of heads (i.e., the sum of the indicators) will be "far" from half of the tosses (which is the sum of their means), say that it deviates from $n/2$ by more than some threshold $t$. Chernoff's and Hoeffding's bounds can be used to bound the probability of that event.
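As a sketch of that example (the number of tosses $n = 100$, the threshold $t = n/10$, and the trial count are illustrative choices made here, not values from the post), a quick Monte Carlo estimate compared against Hoeffding's bound for variables in $[0, 1]$:

```python
import math
import random

# Monte Carlo sketch: estimate the probability that the number of heads
# in n fair coin tosses deviates from n/2 by more than t, and compare it
# to Hoeffding's bound 2 * exp(-2 * t^2 / n) for variables in [0, 1].
# n, t and the trial count are illustrative choices.
random.seed(0)
n, trials = 100, 20_000
t = n / 10

far = 0
for _ in range(trials):
    heads = sum(random.random() < 0.5 for _ in range(n))
    if abs(heads - n / 2) > t:
        far += 1

empirical = far / trials
bound = 2 * math.exp(-2 * t ** 2 / n)
print(empirical, bound)  # the empirical frequency stays below the bound
```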
Both bounds are similar in their settings and their results, and sometimes it is not clear which bound is better for a given setting; maybe the two bounds are even equivalent. In this post we will try to get an intuition for which bound is stronger (and when).
Note that each bound has several common variations, and we will discuss only the ones that are stated below. Let's start with the first bound, a multiplicative version of Chernoff's bound.
Chernoff bound. Let $X_1, \ldots, X_n$ be a collection of independent random variables ranging in $[0, 1]$, denote $S = \sum_{i=1}^{n} X_i$ and let $\mu = \mathbb{E}[S]$. Then for any $0 < \delta \le 1$:
$$\Pr\left[ |S - \mu| \ge \delta \mu \right] \le 2e^{-\frac{\delta^2 \mu}{3}}.$$
Hoeffding bound. Let $X_1, \ldots, X_n$ be independent random variables ranging in $[a, b]$ where $a < b$, denote $S = \sum_{i=1}^{n} X_i$ and let $\mu = \mathbb{E}[S]$. Then for any $t > 0$:
$$\Pr\left[ |S - \mu| \ge t \right] \le 2e^{-\frac{2t^2}{n(b-a)^2}}.$$
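To make the two statements concrete, here is a small sketch (the values of $n$, $\mu$ and $t$ below are illustrative) that evaluates both bounds for sums of variables in $[0, 1]$:

```python
import math

# The two bounds, for independent variables in [0, 1] (so Hoeffding's
# interval is [a, b] = [0, 1] and the denominator simplifies to n).
def chernoff(mu: float, delta: float) -> float:
    """Bound on Pr[|S - mu| >= delta * mu] for 0 < delta <= 1."""
    assert 0 < delta <= 1
    return 2 * math.exp(-delta ** 2 * mu / 3)

def hoeffding(n: int, t: float, a: float = 0.0, b: float = 1.0) -> float:
    """Bound on Pr[|S - mu| >= t] for variables ranging in [a, b]."""
    return 2 * math.exp(-2 * t ** 2 / (n * (b - a) ** 2))

# Example: n = 1000 fair coins, so mu = 500; deviation t = 50, i.e. delta = 0.1.
print(chernoff(500, 0.1), hoeffding(1000, 50))
```

Here $\mu = n/2$, and Hoeffding gives the smaller (stronger) bound; the comparison carried out in the next sections explains why.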

2 Writing Hoeffding’s bound as a Chernoff’s bound

At first glance it seems that we cannot compare the definitions, mostly because of this small technical issue: Hoeffding's bound allows the random variables to range over any closed interval $[a, b]$ rather than $[0, 1]$. The solution for that is to scale and shift the variables to make them take values in $[0, 1]$. We will start with exactly that, and then transform Hoeffding's inequality into "Chernoff's".
So for any $i$, denote $Y_i = \frac{X_i - a}{b - a}$ and let $S' = \sum_{i=1}^{n} Y_i$, $\mu' = \mathbb{E}[S']$. Observe that $\mu' = \frac{\mu - na}{b - a}$ and that $|S' - \mu'| = \frac{|S - \mu|}{b - a}$. Clearly, now the range of the variables is $[0, 1]$, so we can write Hoeffding's bound in terms of the $Y_i$'s:
$$\Pr\left[ |S - \mu| \ge t \right] = \Pr\left[ |S' - \mu'| \ge \frac{t}{b - a} \right] = \Pr\left[ |S' - \mu'| \ge \delta \mu' \right] \le 2e^{-\frac{\delta^2 \mu'}{3}} = 2\exp\left( -\frac{t^2}{3(b - a)(\mu - na)} \right)$$
The inequality in the middle of the process is where we apply Chernoff's bound with $\delta = \frac{t}{(b - a)\mu'}$; note that there is an implicit assumption here that $\delta \le 1$, i.e. that $t \le (b - a)\mu'$.
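A quick numerical sanity check of the scale-and-shift step (the values of $a$, $b$ and $n$ below are arbitrary illustrative choices): the deviation of the rescaled sum is exactly the original deviation divided by $b - a$, so the two tail events coincide.

```python
import random

# With Y_i = (X_i - a) / (b - a), S' = sum(Y_i) and mu' = (mu - n*a) / (b - a),
# we have |S' - mu'| = |S - mu| / (b - a), hence
# |S - mu| >= t  iff  |S' - mu'| >= t / (b - a).
random.seed(1)
a, b, n = -2.0, 5.0, 50
xs = [random.uniform(a, b) for _ in range(n)]
mu = n * (a + b) / 2                        # E[X_i] = (a + b) / 2 here
S = sum(xs)
S_prime = sum((x - a) / (b - a) for x in xs)
mu_prime = (mu - n * a) / (b - a)
print(abs(S - mu), (b - a) * abs(S_prime - mu_prime))  # equal up to rounding
```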
So we managed to formulate Hoeffding's bound in Chernoff's setting, ending with a very messy formula. In order to explore the behavior of the bounds further, we will simplify the analysis by adding another assumption: that $a = 0$ and $b = 1$.

3 Which bound to use?

Now we assume that $a = 0$ and $b = 1$. The restriction from the previous section, that $t \le (b - a)\mu'$, now holds whenever $t \le \mu$, which makes sense because Chernoff's bound, as we defined it, is only concerned with deviations of up to $\mu$.
Next, the term we got for Hoeffding is simplified further into $$2e^{-\frac{t^2}{3\mu}}.$$ Now it is easy to compare the bounds. With Hoeffding we can achieve a bound of $2e^{-\frac{2t^2}{n}}$ and with Chernoff a bound of $2e^{-\frac{t^2}{3\mu}}$, so it is left to compare $\frac{2}{n}$ with $\frac{1}{3\mu}$. If $\mu < \frac{n}{6}$ then Chernoff is stronger, and vice versa. We conclude that neither of the bounds is always preferable to the other (which is not so surprising), and the answer to the question "which one to use" depends on the parameters of the distribution at hand.
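The crossover at $\mu = n/6$ is easy to check numerically. The sketch below (with illustrative $n$ and $t$) compares the two exponents $\frac{2t^2}{n}$ (Hoeffding) and $\frac{t^2}{3\mu}$ (Chernoff) for variables in $[0, 1]$; the larger exponent means the stronger bound.

```python
# Exponents of the two bounds after the a = 0, b = 1 simplification:
# Hoeffding: 2 * exp(-2 t^2 / n),  Chernoff: 2 * exp(-t^2 / (3 mu)).
def hoeffding_exponent(n: int, t: float) -> float:
    return 2 * t ** 2 / n

def chernoff_exponent(mu: float, t: float) -> float:
    return t ** 2 / (3 * mu)

n, t = 600, 30  # illustrative values; the crossover is at mu = n / 6 = 100
for mu in (50, 90, 150):
    winner = "Chernoff" if chernoff_exponent(mu, t) > hoeffding_exponent(n, t) else "Hoeffding"
    print(mu, winner)  # Chernoff wins below mu = 100, Hoeffding above
```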

4 Chernoff bound is tight!

One surprising fact about Chernoff's bound is that in some cases it is tight (i.e., it also gives a matching lower bound, up to constants in the exponent). The theorem below is taken from these notes (see the link for the full proof).
Let $X_1, \ldots, X_n$ be i.i.d. random variables ranging in $\{0, 1\}$ with $\Pr[X_i = 1] = p$, denote $S = \sum_{i=1}^{n} X_i$ and let $\mu = \mathbb{E}[S] = np$.
If $p \le \frac{1}{2}$, then for any $\delta \in (0, \frac{1}{2}]$ with $\delta^2 \mu \ge 3$: $$\Pr\left[ S \ge (1 + \delta)\mu \right] \ge e^{-9\delta^2 \mu}.$$ If $p \ge \frac{1}{2}$, then for any $\delta \in (0, \frac{1}{2}]$ with $\delta^2 \mu \ge 3$: $$\Pr\left[ S \le (1 - \delta)\mu \right] \ge e^{-9\delta^2 \mu}.$$
Note the differences between that regime and the one from the definition at the beginning of the post: in this theorem the variables are i.i.d. (rather than merely independent) and they are discrete.
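As a rough numerical illustration of this tightness phenomenon (all parameter values below, and the constant $9$ used in the lower bound's exponent, are assumptions of this sketch rather than the theorem's exact constants): for $S \sim \mathrm{Binomial}(n, p)$ we can compute the exact upper tail and check that it sits between an exponential lower bound and Chernoff's upper bound, both with exponents of order $\delta^2 \mu$.

```python
from math import comb, exp

# Exact upper tail of S ~ Binomial(n, p) versus the two exponential bounds.
# Illustrative parameters, chosen so that delta <= 1/2 and delta^2 * mu >= 3.
n, p, delta = 400, 0.5, 0.2
mu = n * p                                   # mu = 200, delta^2 * mu = 8
k0 = round((1 + delta) * mu)                 # smallest value counted in the tail
tail = sum(comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(k0, n + 1))

upper = 2 * exp(-delta ** 2 * mu / 3)        # Chernoff's upper bound
lower = exp(-9 * delta ** 2 * mu)            # lower bound of the assumed form
print(lower, tail, upper)                    # lower < tail < upper
```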