Minimalist Mathematician

A blog about life as a math grad student

Category: Combinatorics

Linearity of expectation

Expectation is linear. This is one of the most surprising facts in probability theory. The expectation, or expected value, of a random variable is its “average value”: if the random variable X represents the outcome of an experiment, then the expected value is the average outcome we would expect to see if we ran the experiment many times.

When we say that expectation is linear, we mean that if X and Y are random variables and a is a constant, then E[aX + Y] = aE[X] + E[Y]. This follows directly from the definition of expected value as an integral: since integration is a linear operator, so is expectation.

It is surprising because it is true regardless of the dependence between the two random variables. Most facts in probability theory have dependence conditions on them, but not linearity of expectation. So no matter how interdependent two random variables X and Y may be, the expected value of their sum is the sum of their expected values.
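To see this concretely, here is a quick sanity check in Python (my own toy example, nothing deep): Y is completely determined by X, about as far from independence as you can get, yet the expectation of the sum is still the sum of the expectations.

```python
from fractions import Fraction

# Exact expectations over a fair six-sided die.
# Y = 7 - X is completely determined by X, yet
# E[X + Y] = E[X] + E[Y] still holds.
outcomes = range(1, 7)
p = Fraction(1, 6)  # uniform probability of each face

E_X = sum(p * x for x in outcomes)
E_Y = sum(p * (7 - x) for x in outcomes)
E_sum = sum(p * (x + (7 - x)) for x in outcomes)

print(E_X, E_Y, E_sum)  # 7/2 7/2 7
assert E_sum == E_X + E_Y
```

Replace 7 - x by any other function of the outcome and the identity still holds, which is the whole point.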

Linearity of expectation is very useful when working with the probabilistic method. There are a number of very elegant proofs of somewhat deep results that use nothing more complicated than linearity of expectation. Here is an example from chapter 2 of The Probabilistic Method.

Theorem. There is a 2-colouring of K_n with at most {n \choose a}2^{1 - {a \choose 2}} monochromatic copies of K_a.

Proof. Take a random 2-colouring of K_n, colouring each edge red or blue independently with probability 1/2. Let X be the number of monochromatic copies of K_a. Then X = \sum_c X_c, where X_c is the indicator variable of the event that the a-subset c spans a monochromatic K_a. A given K_a has {a \choose 2} edges, so the probability that they all receive the same colour, and hence the expected value of each X_c, is 2 \times 2^{-{a \choose 2}} = 2^{1-{a \choose 2}}. There are {n \choose a} possible a-subsets, so by linearity of expectation, E[X] = {n \choose a}2^{1-{a \choose 2}}. Since a random colouring achieves this value on average, some colouring has at most {n \choose a}2^{1-{a \choose 2}} monochromatic copies of K_a.
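If you want to see the theorem in action, here is a little Python sketch (my own, with n = 5 and a = 3) that enumerates all 2^{10} colourings of K_5 and checks that the average number of monochromatic triangles is exactly {5 \choose 3}2^{1-{3 \choose 2}} = 10/4 = 5/2:

```python
from itertools import combinations, product
from fractions import Fraction

n, a = 5, 3
vertices = range(n)
edges = list(combinations(vertices, 2))  # the 10 edges of K_5

total = 0
num_colourings = 2 ** len(edges)
for bits in product([0, 1], repeat=len(edges)):
    colour = dict(zip(edges, bits))
    # count the monochromatic triangles in this colouring
    for triple in combinations(vertices, a):
        if len({colour[e] for e in combinations(triple, 2)}) == 1:
            total += 1

average = Fraction(total, num_colourings)
# the formula: C(5,3) * 2^(1 - C(3,2)) = 10 / 4 = 5/2
print(average)  # 5/2
assert average == Fraction(5, 2)
```

In particular some colouring must have at most 2 monochromatic triangles, since 2 is the largest integer below the average.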

A theorem by Erdős

This is a quick application of the probabilistic method to additive number theory that I personally think is really cool. A set A is called sum-free if it contains no a_1, a_2, a_3 such that a_1+a_2 = a_3.

Theorem (Erdős, 1965). Every set B of n nonzero integers contains a sum-free subset A of cardinality at least n/3.

Proof. Write B = \{b_1, \dots, b_n\}. Let p = 3k+2 be a prime such that p > 2\max_i |b_i|, and let C = \{k+1, \dots, 2k+1 \}. Clearly, C is a sum-free subset of the group \mathbb Z_p, and \frac{|C|}{p-1} = \frac{k+1}{3k+1} > \frac 13.

Let us choose an integer x, 1 \leq x < p, uniformly at random from \{ 1, 2, \dots, p-1 \} and define d_1, \dots, d_n by d_i \equiv xb_i \pmod p, with 0 \leq d_i < p. For each fixed i, as x ranges over \{1, \dots, p-1\}, d_i ranges over all the nonzero residues mod p (since b_i \not\equiv 0 \pmod p), so \Pr[d_i \in C] = \frac{|C|}{p-1} > \frac 13. Therefore, by linearity of expectation, the expected number of elements b_i such that d_i \in C is more than n/3.

Consequently, there is an x and a subsequence A of B of cardinality at least n/3 such that xa \pmod p \in C for all a \in A. Since C is sum-free in \mathbb Z_p, so is A: any relation a_1 + a_2 = a_3 in A would give xa_1 + xa_2 \equiv xa_3 \pmod p, a sum inside C. Q.E.D.
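Here is a small Python sketch of the proof (my own illustration; the set B and the brute-force search over x are just for demonstration): it finds a prime p = 3k+2, forms the middle third C, searches for a good multiplier x, and checks that the resulting subset A is sum-free with more than n/3 elements.

```python
def is_prime(m):
    if m < 2:
        return False
    d = 2
    while d * d <= m:
        if m % d == 0:
            return False
        d += 1
    return True

B = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]  # any set of nonzero integers works
n = len(B)

# find a prime p = 3k + 2 with p > 2 * max |b_i|
p = 2 * max(abs(b) for b in B) + 1
while not (is_prime(p) and p % 3 == 2):
    p += 1
k = (p - 2) // 3
C = set(range(k + 1, 2 * k + 2))  # the sum-free "middle third" of Z_p

# the proof guarantees some multiplier x works, so search for the best one
A = []
for x in range(1, p):
    candidate = [b for b in B if (x * b) % p in C]
    if len(candidate) > len(A):
        A = candidate

print(p, A)
assert 3 * len(A) > n             # |A| > n/3
assert all(a1 + a2 != a3          # A really is sum-free
           for a1 in A for a2 in A for a3 in A)
```

The expectation argument only promises that a good x exists; the loop simply finds one.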

Ribbon graphs of link diagrams

This has been a weird week. Grad school seems to go in waves: first I feel really stupid and like I understand nothing, and then suddenly I’m on top of everything. Up until now, this rollercoaster has been out of sync in all of my classes, but this week I’ve been on top of everything. I wonder how long it will last. I even understand what we’re doing in analysis (intro to functional analysis), though that will probably change when we start with abstract measure theory next week…

Update: It did not change when we did abstract measure theory. Shocking, I know. I found the abstract stuff so much easier to do than the concrete Lebesgue measure. When we did abstract measure theory, I didn’t get confused by all the things I do know about the real numbers and I could focus on just understanding measures on sets, rather than mixing in all the weird things that happen with real numbers. Much more fun.

I have a vague recollection of promising a post on the topic of my conference presentation, so here it goes. First of all, you need to know what a knot is. A normal knot, the kind you tie your shoes with, is not mathematically interesting: it can be tied, so it can (mathematically at least) be untied. Instead, to define a mathematical knot we tie a knot in a piece of string and connect the ends. That’s the intuitive definition. There are some tweaks to make it behave nicely, but you get the general idea. A link is like a knot but with several components; a knot is a link with one component.

The title of this post mentions link diagrams. We can find a link diagram of a link by simply projecting the link “nicely” onto the plane. I’m fudging a lot of the details about knots, because they are not the main topic here. I might do a post later on knots where I explain all the details properly. Here “nicely” should be taken to mean that one can recover the original knot from the projection.

Now that we have a vague idea what a link diagram is, we define a ribbon graph. A ribbon graph G is a collection V(G) of discs called vertices and a collection E(G) of discs called edges. Each edge disc intersects exactly two vertex discs, meeting each along an arc of its boundary, and no two edge discs intersect. An easier way to think about a ribbon graph is to start with a graph embedded in a surface and cut out a thin neighbourhood of the graph; the result is a ribbon graph.

Next, I’m going to tell you how ribbon graphs are connected to link diagrams. We take a link diagram and eliminate each crossing, just as when constructing a knot polynomial, choosing either a positive or a negative splicing at each crossing. We can choose whichever of the two splicings we like at each crossing. When all the crossings have been eliminated, we are left with a number of disjoint circles. These will be the vertices of our ribbon graph. Connect the vertices where there were crossings with edge discs, and you end up with a ribbon graph. Since we could choose either splicing for each crossing, there are up to 2^{\text{\# crossings}} possible ribbon graphs associated with each link diagram.

The natural questions to ask are then: how are these ribbon graphs related? What can we say about the ribbon graphs of a given link diagram? And if two link diagrams have the same set of ribbon graphs associated with them, how are the diagrams related?

My undergraduate advisor already answered these questions for links in \mathbb R^3. It turns out that the ribbon graphs arising from the same link diagram are partial duals of each other. For the last question, it turns out that the link diagrams are related by a very simple “flip” move. In my undergraduate thesis, I proved that the same results hold for checkerboard-colourable link diagrams of links in \mathbb RP^3. This is roughly what I presented at the conference a few weeks ago, to a room full of grad students. Here is a link to the paper on arXiv.

It was a very nice conference, by grad students, for grad students. The organizers did a really good job putting it together, and there were a lot of interesting people there. It was all very friendly, in part, I think, because we are all so early in our careers. There was none of the “let me interrupt to talk about why my research is superior, disguised as a question, for 10 minutes”, which was nice. I also saw quite a few interesting talks, in particular one on the Erdős–Ko–Rado theorem that I might try to turn into a blog post at a later date. No promises though, since apparently things keep popping up that keep me from blogging for weeks at a time. It should be easier for me to write regularly now that term is over, but no promises on the topics.

Graph embeddings – The Map Colour Theorem

Happy 2015 everyone! Last year was a good one. I graduated with a good undergraduate degree and passed my first semester in grad school. I’ve spent some time thinking about what area I want to specialise in during the next four and a half years, and I’m leaning towards graph theory, in particular graph embeddings. This will be the first post of many on graph embeddings, why they are interesting, and various useful techniques. I’m sure most of you have heard of the map colour theorem before: it states that the vertices of any simple planar graph can be coloured using at most four colours such that no two adjacent vertices have the same colour.

This result was first conjectured in the 19th century by Francis Guthrie. There were many attempted proofs, but no correct one until over 100 years later. The final proof was done using a computer. It is based on the idea that if the conjecture is false, then there exists a smallest graph which requires five colours. The proof showed that no such minimal counterexample can exist: most planar graphs can be reduced to smaller ones with the property that if the smaller map can be 4-coloured, so can the original. Using this idea, the infinitely many simple planar graphs were reduced to 1936 cases that were checked by computer. The complete proof was announced in 1976 by Kenneth Appel and Wolfgang Haken at the University of Illinois. It was considered quite controversial at the time, and still is to some, since it was the first proof of a major theorem done in large part by computer. Several people have claimed to have found major errors in the proof, but none of the claims have stood up. The proof has also been made more efficient by several computer scientists, reducing the chance that there might be a mistake in it. In 2005, Georges Gonthier and Benjamin Werner formalised the proof in the Coq proof assistant (possibly a future post). This means that we no longer have to trust that Appel and Haken’s algorithms are correct; we just have to trust Coq.

One of the first results proven in connection with the four colour theorem was the five colour theorem. It was proven by Percy John Heawood in 1890, and is based on Kempe’s flawed 1879 proof of the four colour theorem. The proof is by contradiction, as follows:

We assume that G is the smallest graph which cannot be coloured using only five colours. Using the Euler formula, we can show that either every vertex is of degree at most 4, in which case the graph is trivially five-colourable by a greedy colouring, or G has a vertex, say v, of degree 5. Now consider the graph G', obtained by removing the vertex v. Since G' is smaller than G, it can be five-coloured, so the five former neighbours of v, say v_1, \dots, v_5 in cyclic order around v, are coloured with at most five different colours. If they use fewer than five colours, we can give v one of the unused colours and obtain a five-colouring of G. Hence they all have different colours: 1, 2, \dots, 5. Consider the subgraph G_{13} containing only the vertices coloured 1 or 3. Either v_1 and v_3 are in different components of G_{13}, or they are in the same component. If they are in different components, then the colours 1 and 3 can be swapped in, say, the component containing v_1, freeing up colour 1 for the vertex v. Hence they are in the same component, so there is a path joining v_1 and v_3 using only colours 1 and 3. Now consider the subgraph G_{24}. By the same argument, v_2 and v_4 must be connected by a path using only colours 2 and 4. However, since the vertices were labelled cyclically, planarity forces this path to cross the one between v_1 and v_3, and the crossing would have to happen at a shared vertex; but the two paths use disjoint sets of colours, which is a contradiction. Hence G can be five-coloured.

A few last words: the four-colour theorem can be generalised to graphs on any surface, and this is where graph embeddings come in. Heawood conjectured that the maximum number of colours needed for any graph embedded in a surface of Euler characteristic \chi is given by the formula p = \lfloor \frac{7+\sqrt{49-24\chi}}{2} \rfloor. This was proven true by Ringel and Youngs in 1968, with one exception (the Klein bottle). I’ll talk more about this in my next math post.
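As a footnote, the Heawood bound is easy to compute. A minimal Python sketch (my own; heawood is just a made-up helper name):

```python
from math import isqrt

def heawood(chi):
    # floor((7 + sqrt(49 - 24*chi)) / 2); isqrt gives the same floor
    # here because 49 - 24*chi is a non-negative integer
    return (7 + isqrt(49 - 24 * chi)) // 2

print(heawood(2))  # sphere (chi = 2): 4 -- the four colour theorem
print(heawood(1))  # projective plane (chi = 1): 6
print(heawood(0))  # torus (chi = 0): 7
# the Klein bottle also has chi = 0, and it is the one exception:
# the formula gives 7, but six colours actually suffice
```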

Interesting property of matrices

A few weeks ago, Gilbert Strang from MIT gave my department’s colloquium on the topic of banded matrices. A diagonal matrix is a 1-banded matrix; a matrix in which the semi-diagonals directly above and below the main diagonal may be non-zero as well is a 3-banded (or tridiagonal) matrix, and so on. An example of a 3-banded matrix is \left( \begin{array}{ccc} 1 & 2 & 0 \\ 3 & 4 & 5 \\ 0 & 6 & 7 \end{array} \right).

Now, the interesting property of matrices is as follows: take a 2 \times 2 matrix, say A = \left( \begin{array}{cc} 1 & 2 \\ 3 & 4 \end{array} \right). Notice that the ratio of the entries just above and below the diagonal is \frac{2}{3}. Now square the matrix: A^2 = \left( \begin{array}{cc} 7 & 10 \\ 15 & 22 \end{array} \right). The ratio is still \frac{10}{15} = \frac{2}{3}. If we cube A we get A^3 = \left( \begin{array}{cc} 37 & 54 \\ 81 & 118 \end{array} \right), and the ratio is again \frac{54}{81} = \frac{2}{3}. It actually turns out that this ratio across the diagonal is preserved for all powers of a 2 \times 2 matrix. There are several ways to prove this, but the most obvious is by induction on the powers of A. The proof is fairly mechanical, so I will leave it as an exercise for the reader 🙂 (I’ve always wanted to write that).

The first question we ask as mathematicians is “how general is that property?” It turns out that it does not hold for general n \times n matrices, but it does generalise to 3-banded matrices: all the ratios of entries across the diagonal of a 3-banded matrix are preserved under powers of the matrix. This is once again easy to prove by induction on the power of A.
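Since the proof is left as an exercise, here is at least a numerical check in Python (my own sketch, using exact fractions) that the off-diagonal ratios of the 3-banded example above survive taking powers:

```python
from fractions import Fraction

def matmul(A, B):
    # plain triple-loop matrix product, enough for a small example
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

A = [[1, 2, 0],
     [3, 4, 5],
     [0, 6, 7]]

# the ratios of the entries just above and below the diagonal of A
ratios = [Fraction(A[i][i + 1], A[i + 1][i]) for i in range(len(A) - 1)]
print(ratios)  # [Fraction(2, 3), Fraction(5, 6)]

P = A
for _ in range(5):  # check A^2 through A^6
    P = matmul(P, A)
    for i in range(len(A) - 1):
        assert Fraction(P[i][i + 1], P[i + 1][i]) == ratios[i]
```

With positive entries the denominators never vanish, so the ratios are always defined.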

I think this is a really neat little fact, and I’ve never seen it before. I don’t know if it shows up in any applications, so if you know of any please leave a comment! I’ll be back after Christmas with some graph theory 🙂 Merry Christmas!

Ramsey Theory

I just today received the forms I need from U to apply for my student visa. Now all that remains is to explain repeatedly, on forms and in an interview, that no, I am not a terrorist. I’m also still only halfway through my exam term. My senior thesis is submitted and four out of my eight exams are done. And then there is the oral defense of my senior thesis. But half of it is done. Today I’m going to explain something I came across while revising for my combinatorics exam, namely Ramsey theory.

The basic idea is that any set which is sufficiently large will contain some ordered subset. One example is the Bolzano-Weierstrass theorem in analysis. In a simple form it states that every bounded sequence in \mathbb{R} has a convergent subsequence. So there is a subset with some type of structure.

For Ramsey theory we instead look at graphs. We take the complete graph K_n and colour each edge either red or blue. The Ramsey number R(s,t) is the smallest positive integer such that any two-colouring of the edges of K_{R(s,t)} contains a red K_s or a blue K_t. To see how this works, I suggest you try to colour the edges of a K_6 such that there is no monochromatic triangle (K_3).

We are going to prove that R(3,3) = 6. First, see the picture below for a colouring of K_5 with no monochromatic triangle. From this we deduce that R(3,3) > 5.

[Figure: a 2-colouring of K_5 with no monochromatic triangle. Source: Wikipedia]

Consider a K_6. If we take the vertex labelled 1 and focus on the five edges incident to it, then by the pigeonhole principle at least three of them must be of the same colour, say red. Call these edges \{1, x_1 \}, \{1, x_2 \}, \{1, x_3 \}. Now look at the edges connecting the vertices x_1, x_2, x_3. There are three of them, and either all of them are blue, in which case we have a monochromatic blue triangle, or at least one of them is red. Assume the edge \{x_1, x_2 \} is red; then the triangle 1, x_1, x_2 is monochromatic red. Hence any colouring of K_6 contains a monochromatic triangle, and so R(3, 3) = 6.
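The whole argument can also be verified by brute force, since K_6 only has 2^{15} = 32768 edge colourings. A Python sketch (my own; the function name is just for illustration):

```python
from itertools import combinations, product

def has_mono_triangle(n, colour):
    # colour maps each edge (i, j) of K_n to 0 (red) or 1 (blue)
    return any(len({colour[e] for e in combinations(t, 2)}) == 1
               for t in combinations(range(n), 3))

# some colouring of K_5 avoids a monochromatic triangle, so R(3,3) > 5
edges5 = list(combinations(range(5), 2))
assert any(not has_mono_triangle(5, dict(zip(edges5, bits)))
           for bits in product([0, 1], repeat=len(edges5)))

# every one of the 2^15 colourings of K_6 contains one, so R(3,3) <= 6
edges6 = list(combinations(range(6), 2))
assert all(has_mono_triangle(6, dict(zip(edges6, bits)))
           for bits in product([0, 1], repeat=len(edges6)))
print("R(3,3) = 6")
```

This kind of exhaustive check stops being feasible very quickly, which is exactly why so few Ramsey numbers are known.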

Ramsey’s theorem states that each of these Ramsey numbers exists. Proving it is not difficult, though it can be a bit fiddly. I would recommend looking up a book on Ramsey theory, because it is a very interesting subject. In practice, these numbers are very difficult to find. Only some of the smaller ones have been found exactly; others have only been bounded. Finding a lower bound is as simple as presenting a colouring of some K_m with no red K_s or blue K_t, which shows that R(s,t) > m. Finding upper bounds is much more difficult, since the argument used above often does not work. So one either has to find another general method or check each colouring of the candidate K_m. Checking this is a computationally difficult problem and hence takes a lot of time.

Ramsey numbers can be extended to arbitrarily many colours and can be shown to exist for any finite number of colours. In this case, finding the precise numbers becomes even more difficult, as they must be larger. We will prove that R(3, 3, 3) \leq 18. To do this, we will use the known value R(3, 6) = 18.

We call the third colour yellow. First, we merge the colours blue and yellow into a single colour, blue-yellow. Any 3-colouring of K_{18} then becomes a 2-colouring in red and blue-yellow, and since R(3, 6) = 18, it must contain either a red K_3, in which case we are done, or a blue-yellow K_6. In the second case, since R(3, 3) = 6, this blue-yellow K_6 must contain a monochromatic blue or yellow triangle. Hence we have shown that R(3, 3, 3) \leq 18.