There are many open problems in maths, most of which have the decency to sound appropriately daunting. For example, is it the case that “every Hodge class on a projective complex manifold X is a linear combination with rational coefficients of the cohomology classes of complex subvarieties of X”? I have no idea what most of those words mean, so I’m certainly willing to believe that this is a Very Hard Problem. However, many open problems sound like they really should have been solved by now. In this post I’ll discuss a few “embarrassing” open problems in maths that are far harder than they might appear.

The first of these problems involving counting graphs. A graph is a collection of vertices with a bunch of edges connecting some or all of these vertices to each other.

A tree is an acyclic graph, i.e a graph containing no cycles.

Open question: How many trees are there with n vertices?

If we consider labelled trees then this question has any easy solution: there are labelled trees on n vertices. Counting the total number of *graphs* on n vertices is even easier. However, no closed form solution is known for unlabelled trees and this problem has been open for well over a century.

Another problem involving graphs is that of computing the diagonal Ramsey numbers. Before stating this problem, consider the following puzzle:

How many people must be present at a party to ensure that there are guaranteed to be three people present who either all know each other or are mutual strangers?

For the sake of this puzzle we’ll assume that any pair of people either know each other or they don’t (if I know you then you know me and I can’t kinda-know-you-a-bit). To make this puzzle more maths-y, we’ll represent each party guest as the vertex of a complete graph (i.e. a graph in which every vertex is connected to every other vertex). We colour each edge of the graph red or blue according as the relevant two people do or don’t know each other. Denoting the complete graph on n vertices by , the puzzle now becomes:

What is the least value of n such that however we colour the edges of in red and blue there must exist either a red triangle or a blue triangle?

It’s easy to see that n=5 is too small (just draw and you’ll easily find a suitable colouring). To see that the solution is n=6 we need to prove that whenever you colour the edges of in red and blue there must exist either a red or blue triangle. So let’s suppose that you’re given any colouring of . Pick a vertex v and consider the edges coming out of v. There are five such edges and so either at least three of them are coloured blue or at least three are coloured red. Let’s assume that at least three are coloured blue (otherwise just swap the words “red” and “blue” in what follows!).

Pick three vertices that are connected to v by blue edges and call them a, b and c. If any of the edges a-b, b-c or a-c are coloured blue then this edge, together with the edges from v, form a blue triangle. Otherwise, all three of the edges a-b, b-c and a-c are red. But in this case, these edges form a red triangle.

So we’ve solved the puzzle and the answer is “six people”. Now suppose that instead of asking for a red or blue triangle we asked for four vertices such that all of the edges between them were the same colour. i.e., instead of asking for a monochromatic copy of , we’re after a monochromatic copy of .

This question is a little harder but it’s still not terribly hard to prove that the solution is n=18. i.e., the following two facts are true:

1. Whenever you colour the edges of in red and blue there must exist four vertices such that the edges between them form a red or blue copy of .

2. It’s possible to colour the edges of such that, for any set of four vertices in the graph, there exists at least one blue edge and at least one red edge between them.

We’ll denote the answer to the first puzzle by R(3) and the answer to the second puzzle by R(4), i.e. R(3)=6 and R(4)=18. What if we ask for an n large enough that any colouring of the edges of in two colours must contain a monochromatic copy of or ? i.e., what are the values of R(5) and R(6)? No-one knows! Like the problem of counting trees, this problem has been open for decades.

Erdős asks us to imagine an alien force, vastly more powerful than us, landing on Earth and demanding the value of R(5) or they will destroy our planet. In that case, he claims, we should marshal all our computers and all our mathematicians and attempt to find the value. But suppose, instead, that they ask for R(6). In that case, he believes, we should attempt to destroy the aliens.

– Joel Spencer

Not only are these values unknown, mathematicians aren’t even close to being able to compute them.

Another open problem concerns colouring in the natural numbers (i.e. the numbers 0,1,2,3,…). It’s known that, for any colouring of the naturals in finitely many colours (e.g. colour 1 in blue, 2 in red, 3 in blue, 4 in green,…), and any n, there exists a sequence of numbers of the form a, a+b, a+2b, … , a+nb such that every number in this sequence has the same colour. In fact, it’s known that for any colouring of the naturals in finitely many colours there must exist an *infinite* sequence , , , … such that every finite sum of these numbers gets the same colour. Even stronger results are known, but I’ll not state them here as they’re quite technical. The upshot of these results is that whenever you colour the natural numbers using finitely many colours there must exist large monochromatic substructures with each of many different properties. However, these structures are all defined only in terms of addition. What if we also consider multiplication? Well, in that case we know almost nothing! The following problem is open:

Is it the case that whenever the natural numbers are coloured in finitely many colours there must exist numbers y and z such that y+z and have the same colour?

Well, actually, the answer to that question is “obviously yes, just set y=z=2”! However, if we rule out that case then the answer is unknown.

The inability of mathematicians to answer the previous question is part of a wider trend: mathematicians are generally pretty useless at solving problems that involve relating the additive and multiplicative structures of the naturals. A fundamental property that’s defined in terms of multiplication is that of being prime: a number is prime if it has no divisors other than itself and one. The prime numbers have been studied for centuries and a vast quantity of deep results have been proved about them. However, the following two problems have been unresolved for centuries:

Is every even number greater than two the sum of two primes?

Are there infinitely many numbers n such that n and n+2 are both prime?

For the next open problem, consider the following algorithm:

Given input n, set . For all , set if is even and otherwise.

Call the resulting sequence the *Collatz sequence with seed n*. What do these sequences look like? I’ve listed the Collatz sequences for the first few values of n below.

{1,4,2,1,4,2,1,…}

{2,1,4,2,1,4,2,1,…}

{3,10,5,16,8,4,2,1,4,2,1,…}

{4,2,1,4,2,1,…}

{5,16,8,4,2,1,1,…..}

It looks like all sequences eventually just alternate between the values 4, 2 and 1, and computers have verified this fact for all n less than . However, given the theme of this post, you’ll not be surprised to hear that no-one has been able to prove this fact. In fact, Paul Erdos famously said that “mathematics is not yet ready for such problems.”

The final example of an “embarrassing” open problem is one of the (possibly *the*) most important questions in maths. In fact, it’s one of the Clay Institute’s seven “millennium problems” and there’s a million dollar prize for answering it.

Suppose that you work for a sales company and wish to visit every city in some country to promote your wares. Given your limited budget, you want to know if there exists a route that visits every city (travelling by road, say) with total distance less than some fixed amount. How would you work out if such a route exists? Well, you could try every single sequence of roads that passes through every city, but it’s easy to see that this quickly becomes untenable (if there are n cities in a country, the number of possible routes grows exponentially in n). In fact, there are no known algorithms for efficiently checking whether such a route exists. However, suppose someone claimed to have found a short enough route. In this case you could trivially check their claim: just add up the lengths of the roads in their proposed route.

Returning to colouring graphs, suppose that you’re challenged to colour the *vertices* of a graph using only the colours red, blue and green in such a way that no vertices with an edge between them have the same colour. This again seems like a pretty tricky task: there are possible colourings of the vertices of a graph with n vertices and for large n this number is far too large for you to merely try all colourings. As in the previous example though, you can immediately test the correctness of any proposed solution: just colour the vertices in the proposed colours and check that no adjacent vertices get the same colour.

There are thousands of other problems from many areas in maths and industry that share a key feature with these two examples, namely that finding a solution to them appears to be very tricky but checking a solution is very easy. The P vs NP questions asks whether there are actually algorithms that quickly find solutions to *all* of these problems. In other words, is it generally true that finding solutions to problems is as easy as verifying them? The answer to this question is surely “no, obviously not”, but after decades of research by thousands of mathematicians no-one is even close to being able to prove this.

Hi, the definition of Collatz sequence you picked seems to be inconsistent with your examples: given your definition, the examples should end with the recurring sequence 4,2,1,4,2,1,…., not an infinite stream of 1s.

I’ve corrected this now. Thanks.

Um, so you explained what a graph is, and then what a tree is.

What’s a labelled tree?

Ah, sorry. A labelled tree is a tree on n vertices, together with a label from 1 to n for each vertex. Each label is used exactly once. For example, consider a tree with four vertices (like a Mercedes logo; one vertex in the centre and three “spokes” extending from it). Without labels there is only one such tree. However, if we label the vertices then there are lots of different non-equivalent trees with this shape (for example, the tree with the central vertex labelled 1 is distinct from the one with the central node labelled 2).

Well, I think the author had in mind the tree in which every node is labelled and therefore distinguishable from other nodes. So for labelled graphs these two graphs are different:

A — B — C

A — C — B

(in here letters stand for nodes)

Eh, sorry, for some reason I didn’t notice the AngryFaic’s comment :[

Pingback: Bookmarks for October 17th from 11:55 to 12:19 « Mark's life

In the Collatz sequences you describe, a(m+1) = 3*a(m) + 1, when a(m) is not even. Then, if a(m) =1, should a(m+1) not equal 4 (i.e. 3*1 + 1 = 4). This would make every sequence converge to a repeating pattern of 4, 2, 1 not an endless string of 1s – or have I misunderstood the rule?

You’re right; I’ve changed it now.

Collatz sequence cannot be ending in 1, 1, 1. It has to be the cycle 4, 2, 1

Small nit on the Collatz conjecture (3x+1 map) stuff above. The sequences do not get stuck at 1, they get stuck repeating 4, 2, 1, 4, 2, 1, 4, 2, 1…. or we decide to stop at 1.

The Collatz sequence should end with {1,4,2,1,4,2,1, ….} and not all 1’s.

The Collatz sequences should enter a loop of 1,4,2,1,4,2,1 etc.

The Collatz sequence is confusingly described. As described 1 -> 4 -> 2 -> 1, and the conjecture is that every number will eventually get to 1, not that you get an infinite sequences of 1s.

> For the sake of this puzzle we’ll assume that any pair of people either each other or they don’t

I think you left out the word “know”.

> The answer to this question is surely “no, obviously not”, but after decades of research by thousands of mathematicians no-one is even close to being able to prove this.

I don’t want to criticize you, but I don’t think we can say for certain that P!=NP, the proof is necessary, IMO.

Thank you for the post.

Thanks, corrected the typo.

And yes, I’m not claiming that P != NP has been settled; I only meant that intuitively it seems ludicrous that they might be equal (to me, at least).

I have just found and entry on OEIS for the general formula for the first problem. http://oeis.org/A000055

G.f.: A(x) = 1 + T(x)-T^2(x)/2+T(x^2)/2, where T(x) = x + x^2 + 2*x^3 + … is g.f. for A000081

I havent checked it, I presume its wrong tho?

I was deliberately vague about what a “closed form” solution is. For more details, see http://en.wikipedia.org/wiki/Closed-form_expression. The important point here is that you’re restricted to using a finite number of elementary functions in the formula.

Another intractable problem is “what is the minimum number of respondents who will not bother to read previous posts when they’re eager to post their own ‘gotchas’ before the redundant posts finally cease?”

“no closed form solution is known for unlabelled trees and this problem has been open for well over a century.”

Pet peeve of mine here. What you really mean is “no *USEFUL* closed form”. It is trivial to find closed form formulas for anything you can compute. In fact, I recently created a program which *finds* formulas for anything you can compute: http://www.xamuel.com/formula.php The formulas it produces are utterly useless in practice (and they’re so complicated they’re something of a crime against humanity), but that’s not the point.