Linearity of expectation

Expectation is linear. This is one of the most surprising facts in probability theory. Expectation is the “average value” of a random variable: roughly speaking, if the random variable X represents the outcome of an experiment, then the expected value is the average outcome we would see if we ran the experiment many times.

When we say that expectation is linear, we mean that if X and Y are random variables and a is a constant, then E[aX + Y] = aE[X] + E[Y]. This follows directly from the definition of expected value as an integral: since integration is a linear operator, expectation is linear too.
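Spelled out, the standard argument runs as follows (a sketch, assuming X and Y are defined on a probability space (\Omega, P) and have finite expectation, so the integrals below exist):

E[aX + Y] = \int_\Omega \bigl(aX + Y\bigr)\,dP = a\int_\Omega X\,dP + \int_\Omega Y\,dP = aE[X] + E[Y].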

It is surprising because it holds regardless of any dependence between the two random variables. Most facts in probability theory come with conditions on the dependence structure of the variables involved, but not linearity of expectation: no matter how interdependent two random variables X and Y may be, the expected value of their sum is the sum of their expected values.
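As a quick numerical illustration (not part of any proof), here is a sketch in Python with NumPy; the exponential distribution and the choice Y = X^2 are arbitrary, picked only to make Y depend on X as strongly as possible:

```python
import numpy as np

rng = np.random.default_rng(0)

# X ~ Exponential(scale 2); Y is a deterministic function of X, so the two
# variables are as far from independent as possible.
x = rng.exponential(scale=2.0, size=1_000_000)
y = x ** 2

# Linearity of expectation: the two quantities agree up to sampling error.
print(np.mean(x + y))           # sample estimate of E[X + Y]
print(np.mean(x) + np.mean(y))  # E[X] + E[Y], estimated separately
```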

Linearity of expectation is very useful when working with the probabilistic method. A number of very elegant proofs of somewhat deep results use nothing more complicated than linearity of expectation. Here is an example from chapter 2 of The Probabilistic Method by Alon and Spencer.

Theorem. There is a 2-colouring of K_n with at most {n \choose a}2^{1 - {a \choose 2}} monochromatic copies of K_a.

Proof. Take a random 2-colouring of K_n, colouring each edge independently with each colour equally likely. Let X be the number of monochromatic copies of K_a. Then X = \sum_c X_c, where the sum runs over all a-subsets c of the vertices and X_c is the indicator variable of the event that c spans a monochromatic K_a. A given a-subset spans {a \choose 2} edges, so it is monochromatic in a fixed colour with probability 2^{-{a \choose 2}}, and hence E[X_c] = 2 \times 2^{-{a \choose 2}} = 2^{1-{a \choose 2}}. The X_c are far from independent (the subsets overlap), but linearity of expectation does not care: there are {n \choose a} a-subsets, so E[X] = \sum_c E[X_c] = {n \choose a}2^{1-{a \choose 2}}. Since X cannot exceed its expectation in every colouring, some colouring has at most {n \choose a}2^{1-{a \choose 2}} monochromatic copies of K_a.
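For a concrete sanity check, here is a small brute-force computation in Python (standard library only; the parameter choice n = 5, a = 3 is mine, picked so that exhaustive enumeration is instant). It averages the number of monochromatic triangles over every 2-colouring of K_5 and compares the result with {5 \choose 3}2^{1-{3 \choose 2}} = 2.5:

```python
from itertools import combinations, product

n, a = 5, 3                                      # small enough to enumerate exhaustively
edges = list(combinations(range(n), 2))          # the edges of K_n
subsets = list(combinations(range(n), a))        # all a-subsets of the vertices

total = 0
for colouring in product((0, 1), repeat=len(edges)):   # every 2-colouring of K_n
    colour = dict(zip(edges, colouring))
    # count the monochromatic copies of K_a in this colouring
    total += sum(
        1 for s in subsets
        if len({colour[e] for e in combinations(s, 2)}) == 1
    )

average = total / 2 ** len(edges)                       # E[X] by exhaustive enumeration
formula = len(subsets) * 2 ** (1 - a * (a - 1) // 2)    # {n choose a} * 2^(1 - {a choose 2})
print(average, formula)                                 # both print 2.5
```

The two numbers agree exactly, as they must: averaging over all colourings is the same thing as taking the expectation under a uniformly random colouring.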