a course in arithmetic ramsey theory - epfl · 5/6/2017 · theorem 1.2 (solymosi). for all k, any...

A course in arithmetic Ramsey theoryFrank de ZeeuwMay 6, 2017

1 Coloring Theorems 21.1 Van der Waerden’s Theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21.2 Bounds for Van der Waerden’s Theorem . . . . . . . . . . . . . . . . . . . . . . 31.3 The Hales–Jewett Theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41.4 Schur’s Theorem and Rado’s Theorem . . . . . . . . . . . . . . . . . . . . . . . 72 Roth’s Theorem 92.1 Hilbert cubes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92.2 Combinatorial proof of Roth’s Theorem . . . . . . . . . . . . . . . . . . . . . . 112.3 Bounds for Roth’s Theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 132.4 An alternative proof using the regularity lemma . . . . . . . . . . . . . . . . . 143 The Density Hales–Jewett Theorem 153.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 153.2 An outline of the proof . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 163.3 Preparation for the proof . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 193.4 The proof of the density Hales–Jewett theorem . . . . . . . . . . . . . . . . . 223.5 Consequences . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 284 Arithmetic progressions over finite fields 294.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 294.2 The slice rank method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 304.3 Sunflowers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 324.4 Arithmetic progressions in FD3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33

This is a partial draft of the lecture notes for the first half of the course Additive Combinatorics at EPFL.1

Chapter 1

Coloring Theorems

1.1 Van der Waerden’s TheoremThe following theorem of Van der Waerden [37] is acentral result in arithmetic Ramseytheory. Historically, it was the third theorem of this kind, after the first steps of Hilbert(Lemma 2.1) and Schur (Theorem 1.8). See Soifer [30] for an exhaustive history of Van derWaerden’s theorem. Our presentation of the proof is based on the exposition in lecturenotes of Leader [22].Theorem 1.1 (Van der Waerden). For all k, ` there is a number W (k, `) such that anyk-coloring of [W (k, `)] has a monochromatic arithmetic progression of length ` .

Proof. We define an (`,m)-fan to be a set of m monochromatic `-term arithmetic progres-sions {ai, ai + di, . . . , ai + (` − 1)di} of different colors that all have the same focus f , i.e.f = ai + `di for all 1 ≤ i ≤ m. We prove the following claim for all k , ` , and m ≤ k .

There is a number V (k, `,m) such that if [V (k, `,m)] is k-colored, then there is amonochromatic (` + 1)-term arithmetic progression, or there is an (`,m)-fan.If this claim holds for m = k , then we directly get an (` + 1)-term arithmetic progression,or there is an (`, k)-fan. In the second case, one of the progressions in the fan must havethe same color as the focus, so again there is a monochromatic (` + 1)-term arithmeticprogression. In other words, if V (k, `, k) exists, then W (k, ` + 1) also exists.We prove the claim by double induction, with an outside induction on ` and an insideinduction on m. For ` = 1, we set V (k, 1, m) = k + 1, so that the pigeonhole principlegives a monochromatic 2-term arithmetic progression. Assuming that V (k, ` − 1, m) existsfor all k,m, we prove by induction on m that V (k, `,m) exists for all k,m. For m = 1 theexistence of V (k, `, 1) follows from the existence of V (k, `−1, k), since an (`, 1)-fan is just amonochromatic `-term arithmetic progression. Assuming that V (k, `,m− 1) exists, we nowprove that V (k, `,m) exists.We set V = V (k, `,m− 1) and

M = 2 ·W (kV , `) · V ,and we let [M ] be k-colored. We split the first half of [M ] into W (kV , `) blocks of length V .Define a coloring of this sequence of blocks with kV colors, by letting the color of a blockbe its coloring.By the choice of the number of blocks, there is an `-term monochromatic arithmeticprogression of blocks. In other words, the first numbers of these blocks form an arithmetic

2

progression with common difference D, and each block is colored in the same way. By thechoice of the length of the blocks, each of those blocks contains in its first half a copy ofthe same (`,m − 1)-fan, translated by jD for some 0 ≤ j ≤ ` − 1. This fan consists ofm− 1 monochromatic `-term progressions ai + jdi with common focus f = ai + `di, whichis contained in the same block as the fan. We can assume that f has a different colorthan the progressions of the fan, since otherwise we would have an (` + 1)-term arithmeticprogression. For each i = 1, . . . , m− 1 there is a color such that the numbers

ai + j1di + j2D,for j1 = 0, . . . , ` − 1 and j2 = 0, . . . , ` − 1 all have that color. Then for each i the numbers

ai + jdi + jD

with j = 0, . . . , ` − 1 form a monochromatic arithmetic progression with focus f + `D.Moreover, the numbers f + jD form another monochromatic arithmetic progression, of adifferent color and with the same focus, so we have an (`,m)-fan. This proves the claim.1.2 Bounds for Van der Waerden’s TheoremThe values of W (k, `) that we can obtain from our proof of Theorem 1.1 are ridiculouslylarge. Even for k = 3, we would get (roughly speaking) an exponential tower of height ` .Let us discuss some of the more sensible bounds that are known.We write W(k, `) for the smallest number W for which the statement of Theorem 1.1holds, i.e., the smallest W such that whenever [W ] is k-colored, there is a monochromatic`-term arithmetic progression. The best general bounds1 in terms of k and ` are

c`−1k ` ≤ W(k, `) ≤ 22k22`+9,

where c is some constant. The upper bound is due to Gowers [15]. He obtained it froma quantitative improvement of Szemerédi’s Theorem (to be mentioned later in this course),which he proved using heavy analytic methods. The lower bound is an application of theprobabilistic method and the Lovász Local Lemma; see for instance [17, Section 4.3].There are many improvements in specific cases, like k = 2 or ` = 3. For 3-termarithmetic progressions, the best bounds arekc log k ≤ W(k, 3) ≤ kck .

The lower bound is due to Gunderson and Rödl [19], and is based on Behrend’s construction[5], which we will see later in the course. The upper bound follows from a quantitive versionof Roth’s Theorem (Theorem 2.4) proved by Sanders [28].To see at least one proof of a more sensible bound, we give an argument of Solymosifor an upper bound on W(k, 3). It is weaker than the bound mentioned above, but the proofis far simpler. Solymosi wrote down a version of this argument in his lecture notes [31,Theorem 7]. It is also mentioned in the book of Tao and Vu [35, Remark 6.18], and similarreasoning is used to prove a related theorem by Graham and Solymosi [18].1To the best of my knowledge...

3

Theorem 1.2 (Solymosi). For all k , any k-coloring of [k2k+2 ] has a monochromatic 3-termarithmetic progression.

Proof. Assume [N ] is k-colored without a monochromatic 3-term arithmetic progression. Letc1 be the most popular color, and let C1 be the set of numbers with that color. Then we have|C1| ≥ N/k . Let d1 be half the most popular difference in C1, so we can write 2d1 = b− afor at least 13(N/k2 )12N ≥ N6k2pairs a < b ∈ C1. Here we used the fact that at least one third of the differences determinedby any set of integers are even. Set

S1 = {a : a− d1, a+ d1 ∈ C1},so we have |S1| ≥ N/(6k2). Since there is no monochromatic 3-term arithmetic progression,no number in S1 has the color c1.We repeat the procedure starting with S1. Let C2 ⊂ S1 be the largest color class, withcolor c2, so we have |C2| ≥ |S1|/k ≥ N/(6k3). Let d2 be half the most popular difference inC2, so 2d2 = b− a for at least

13(|C2|2 )12N ≥ 23(N/(4k3)2 )N ≥ N44k6

pairs a < b ∈ C2. SetS2 = {s : s− d2, s+ d2 ∈ C2},so |S2| ≥ N/(44k6). As above, no number in S2 has the color c2. But the color c1 also doesnot occur in S2, because s ∈ S2 forms an arithmetic progression with s−d2−d1, s+d2 +d1,both of which have color c1 (since s− d2, s+ d2 ∈ C2 ⊂ S1).We keep repeating this until we have a set Sk , which has none of the colors, so mustbe empty. But we have (with some calculation and rough overestimation)|Sk | ≥

N42k+1k2k+1 > Nk2k+2 ,

which gives a contradiction if N ≥ k2k+2 .1.3 The Hales–Jewett TheoremWe now prove a generalization of Van der Waerden’s Theorem, due to Hales and Jewett[20]. The statement may appear a little odd at first, but it turns out to be just the rightformulation to obtain various other generalizations of Van der Waerden, which would behard to obtain otherwise (see for instance Corollary 1.5). We will see that the proof isalmost the same as the one we gave for Theorem 1.1.We say that L ⊂ [` ]d is a Hales–Jewett line if there is a nonempty set I ⊂ [d] andnumbers ai ∈ [` ] for all i 6∈ I , such that

L = {(xi) ∈ [` ]d : xi = xj for all i, j ∈ I, xi = ai for i 6∈ I}.For instance, {(2, 1, 1, 3, 1), (2, 2, 2, 3, 1), (2, 3, 3, 3, 1)} is a line in [3]5. A line can be repre-sented as a word w ∈ ([` ] ∪ {∗})d that contains at least one star. A line is then obtained

4

from a word by replacing all the stars in the word by the same number j , for each j ∈ [` ].The example above can be written as 2 ∗ ∗31.We call the “last” point of the line, i.e. the one where ∗ is replaced by ` , the focus ofthe line. Given two lines L1 ⊂ [` ]d1, L2 ⊂ [` ]d2 , we write L⊗ L′ for the line in [` ]d1+d2 that isobtained by concatenating the words of L and L′. We use the same notation when one ofthe lines is a point (a word without stars).One can think of [` ]d as a grid in Rd, and then a Hales–Jewett line is the intersectionof that grid with a line; but it is not true that every geometric line gives a Hales–Jewettline. For example, {(1, 3, 2), (2, 2, 2), (3, 1, 2)} lies on a geometric line but is not a Hales-Hewett line. Nevertheless, when it is clear from the context, we use the word “line” for aHales–Jewett line.Theorem 1.3 (Hales–Jewett). For all k, ` there is a number HJ(k, `) such that if [` ]HJ(k,`) isk-colored, then there is a monochromatic Hales–Jewett line.

Proof. We define an m-fan in [` ]d to be a set of m lines in [` ]d which are monochromatic in[` − 1]d with different colors, and which all have the same focus f ∈ [` ]d. We will prove thefollowing claim for all k , ` , and m ≤ k .There is a number M = M(k, `,m) such that if [` ]M is k-colored, then there is a

monochromatic Hales–Jewett line in [` ]M or there is an m-fan in [` ]M .If this claim holds for m = k , then we directly get a line in [` ]M , or there is a k-fan. In thesecond case, one of the lines in the fan must have the same color in [` − 1]M as the focus ofthe fan, so again there is a monochromatic line in [` ]M . In other words, if M(k, `, k) exists,then HJ(k, `) also exists.We prove the claim by double induction, with an outside induction on ` and an insideinduction on m. Assuming that M(k, ` − 1, m) exists for all k,m, we prove by induction onm that M(k, `,m) exists for all k,m. For m = 1 the existence of M(k, `, 1) follows from theexistence of HJ(k, ` − 1), since a 1-fan in [` ]M is basically the same as a monochromaticline in [` − 1]M . Assuming that M(k, `,m− 1) exists, we now prove that M(k, `,m) exists.We set

M = α + β with α = HJ(k `M(k,`,m−1), ` − 1), β = M(k, `,m− 1).We factor [` ]M = A× B with A = [` ]α , B = [` ]β.Let [` ]M be k-colored. We color A with k `β colors by letting the color of a ∈ A be the k-coloring of B induced by the k-coloring of a×B. Since the dimension of A is HJ(k `β , `−1),there is a line L in A whose restriction to [`−1]α is monochromatic for this coloring. In otherwords, for each a ∈ L, except for the focus of L, we get the same k-coloring on B, whichwe denote by cL. Because the dimension of B is M(k, `,m− 1), there is a monochromaticline L′ in B, or there is an (m− 1)-fan F in B for the coloring cL. In the first case, the linef1 ⊗ L′ is a monochromatic line in [` ]M .We obtain an m-fan as follows. Let f1 ∈ A be the focus of L, and let f2 ∈ B be the focusof F . If cL gives f2 the same color as one of the lines in F , then we have a monochromaticline L′ in B for cL, and L ⊗ L′ is a monochromatic line in A × B = [` ]M . So we can assumethat the cL-color of f2 is different from the lines in F . For each line Li of the m− 1 lines inF , L⊗ Li is a line in [` ]M with focus f1 ⊗ f2, and these lines are monochromatic in [` − 1]Mwith different colors. Moreover, L⊗ f2 is another monochromatic line with focus f1⊗ f2, andits color is different from the other m− 1 lines.

5

Corollary 1.4 (Van der Waerden). For all k, ` there is a number W (k, `) such that anyk-coloring of [W (k, `)] has a monochromatic `-term arithmetic progression.

Proof. Set W = ` ·HJ(k, `). We define a map φ : [` ]HJ(k,`) → [W ] byφ(x1, . . . , xHJ(k,`)) = ∑

i∈[HJ(k,`)] xi.Given a k-coloring c of [W ], we get a k-coloring c∗ of [` ]HJ(k,`) by setting c∗(x) = c(φ(x)).Then there is a monochromatic Hales–Jewett line for c∗, which corresponds to a monochro-matic `-term arithmetic progression for c in [W ]. Indeed, if the word for this line has d stars,and the first point is (a1, . . . , aHJ(k,`)), then the progression is (∑ai) + jd for j ∈ [` ].Corollary 1.5 (Gallai). For all d, k and every finite set S ⊂ Nd, there is a number G =G(d, k, S) such that if [G]d is k-colored, then there is a monochromatic set of the form v+λSfor some v ∈ Nd and λ ∈ N.

Proof. Set G = |S| · HJ(k, |S|). Write S = {s1, . . . , s|S|} (the order does not matter). Wedefine a map φ : [|S|]HJ(k,|S|) → [G] byφ(x1, . . . , xHJ(k,|S|)) = ∑

i∈HJ(k,|S|) sxi.

Given a k-coloring c of [G]d, c∗(x) = c(φ(x)) gives a k-coloring c∗ of [|S|]HJ(k,|S|). Thenthere is a monochromatic line for c∗, which gives a monochromatic homothetic copy of S.Corollary 1.6 (Squares). For all k there is a number Q = Q(k) such that if [Q]2 is k-colored,then there is a monochromatic square.

Proof. Take S = {(0, 0), (0, 1), (1, 0), (1, 1)} in Corollary 1.5.We define a Hales–Jewett space in [` ]d of dimension δ to be a set represented by aword w ∈ ([` ]∪{∗1, . . . , ∗δ}), with each star ∗i occurring at least once. For instance, in [3]4the word ∗12 ∗2 ∗2 represents the two-dimensional Hales–Jewett space{1211, 1222, 1233, 2211, 2222, 2233, 3211, 3222, 3233}.

Corollary 1.7 (Multidimensional Hales–Jewett). For all k, `, δ there is a number M =MHJ(k, `, δ) such that if [` ]M is k-colored, then there is a monochromatic δ-dimensionalHales–Jewett space.

Proof. The key is that a δ-dimensional Hales–Jewett space in [` ]δd can be represented asa Hales–Jewett line in [`δ ]d. Set H = HJ(k, `δ) and M = δ · H . We take any bijectionψ : [` ]δ → [`δ ], and then define a map φ : [` ]δH → [`δ ]H by

φ(x1, . . . , xδH) = (ψ(x1, . . . , xδ), . . . , ψ(xδ(H−1), . . . , xδH)).Given a coloring c of [` ]M = [` ]δH , we get a coloring c∗ of [`δ ]H by setting c∗(x) = c(φ−1(x)).By Theorem 1.3, there is a monochromatic line L for c∗ in [`δ ]H . Then φ−1(L) is a δ-dimensional Hales–Jewett space in [` ]δH .

6

1.4 Schur’s Theorem and Rado’s TheoremThe following theorem was proved by Schur [29] in 1916, before Van der Waerden’s Theorem.A 3-term arithmetic progression (x, z, y) is a solution of the equation x+y = 2z, and Schur’sTheorem is a Ramsey result for a similar linear equation. But note that in an arithmeticprogression x, y, z must be distinct, which we do not require in Schur’s Theorem.Theorem 1.8 (Schur). For all k there is a number S(k) such that any k-coloring of [S(k)]has a monochromatic solution of x + y = z.

Proof. Given a k-coloring of [N ], we construct a complete graph with vertex set [N+ 1], andwe color the edge between i and j with the color of |i− j|. Since(i− j) + (j − k) = i− k,a monochromatic triangle in this graph corresponds to a monochromatic solution of x+y = zwith x, y, z ∈ [N ]. So it suffices to prove the claim that there is a number R (k) such that ifa complete graph on R (k) vertices is k-colored, then it has a monochromatic triangle.We use induction on k . For k = 1 the claim is trivial. Given that R (k−1) exists, we showthat the claim holds for R (k) = k ·R (k−1)+1. Pick any vertex v . It has R (k) = k ·R (k−1)incident edges, so at least R (k − 1) of these have the same color; let E be the set of thoseedges. If two edges of E have endpoints that are connected by another edge of the samecolor, then we have a monochromatic triangle. Otherwise, the R (k − 1) endpoints of E forma complete subgraph that is colored with k − 1 colors, so we are done by induction.Rado’s Theorem [26] is a generalization of Schur’s Theorem to other linear equations.We will prove Rado’s Theorem only for a single homogeneous equation, but Rado [26] proveda similar statement for systems of nonhomogeneous equations. Van der Waerden’s Theoremcan be seen as a special case of that general version, since arithmetic progressions of anylength can be considered as a solution of a system of linear equations (again with therequirement that the components of the solution are distinct).To prove Rado’s Theorem, it is convenient to first deduce the following generalizationof Schur’s Theorem 1.8 from Van der Waerden’s Theorem 1.1. This exposition is based onthat of Leader [22].Lemma 1.9. Let r ∈ Q. For all k there is a number N such that any k-coloring of [N ] hasa monochromatic solution of x + ry = z.

Proof. We can assume that r > 0, by rearranging the equation if necessary. Write r = s/twith s, t positive integers. We use induction on k ; for k = 1 the statement is easy.We show that if N is big enough for k−1 colors, then W (Ns+1, k) suffices for k colors.Given a k-coloring of [W (k,Ns + 1)], there is a monochromatic arithmetic progressiona, a+d, . . . , a+Nsd of color c. If an element itd of the progression td, 2td, . . . , Ntd hascolor c, then we are done, because (a, itd, a + isd) would be a monochromatic solution ofx + (s/t)y = z. Otherwise, the progression td, 2td, . . . , Ntd is colored with k − 1 colors,so by induction it contains a monochromatic solution.Theorem 1.10 (Rado). An equation ∑

i∈[m]aixi = 0with all ai ∈ Z is regular if and only if there is a non-empty I ⊂ [m] such that

∑i∈I ai = 0.

7

Proof. First we prove that if ∑i∈I ai = 0 for some I ⊂ [m], then the equation ∑aixi = 0 isregular. Pick any i0 ∈ I . Consider the equationai0x +∑

i6∈I

ai

y+ ∑i∈I\{i0}

ai

z = 0. (1.1)Since ∑i∈I ai = 0, this equation is the same as

ai0x +∑i6∈I

ai

y− ai0z = 0,which we can rearrange to

x + ∑i 6∈I aiai0 y = z.

By Lemma 1.9, for any k there is an N such that this equation has a monochromatic solution(x, y, z), which is then also a monochromatic solution to (1.1). Then we can construct amonochromatic solution (xi)i∈[m] to ∑aixi = 0 by setting xi0 = x , xi = y for i ∈ [m]\I , andxi = z for i ∈ I\{i0}.Next we prove that if ∑aixi = 0 is regular, then there is some subsum ∑

i∈I ai = 0.To do this we define a special coloring using a prime number p >∑|ai|. For x ∈ N,let c(x) be the last non-zero digit in the base p expansion of x . More precisely, we write

x = crpr + · · · + c1p + c0 with 0 ≤ cj < p, we set L(x) = min{j : cj 6= 0}, and we setc(x) = cL(x). Then c defines a (p − 1)-coloring of N. Since ∑aixi = 0 is regular, there isa monochromatic solution (x1, . . . , xm), with c(xi) = cj0 for all i. Set L = min{L(xi) : i ∈ [m]}and I = {i : L(xi) = L}. Considering ∑i∈[m] aixi in base p, we see that ∑i∈I aicj0 ≡ 0mod p, which implies ∑i∈I ai ≡ 0 mod p, since 0 < cj0 < p. Because by assumption wehave ∑ |ai| < p, this implies that ∑i∈I ai = 0, as desired.

8

Chapter 2

Roth’s Theorem

After having seen several coloring theorems, we now move on to density theorems. Our goalin this chapter is to prove Roth’s Theorem, which states that any sufficiently dense set ofintegers contains a 3-term arithmetic progression. In the proof of Roth’s Theorem, we willuse an easier density theorem for different objects called Hilbert cubes, which we provefirst, together with its coloring version.2.1 Hilbert cubesA Hilbert cube of dimension d in N is a set of the form

{x0 +∑i∈I

xi : I ⊂ [d]}with generators x0, . . . , xd ∈ N (not necessarily distinct). Note that an arithmetic progres-sion a, a+b, . . . , a+db is an example of a Hilbert cube (set x0 = a, x1 = · · · = xd = b), soit follows from Van der Waerden’s Theorem that we can find a monochromatic Hilbert cubeof any dimension for any coloring of a sufficiently large [N ]. Nevertheless, we give a directproof, which shows that cubes are an ideal object for inductive proofs.Moreover, this lemma has historical value, since it is the very first Ramsey-type theorem.Hilbert [21] proved it in 1892, as a lemma in his proof of what is now known as Hilbert’sIrreducibility Theorem (it states that for every irreducible integer polynomial there areinfinitely many ways to substitute integers in a subset of the variables and get anotherirreducible polynomial; see [36] for an exposition).Lemma 2.1 (Hilbert). For all k, d there is a number H(k, d) such that if [H(k, d)] is k-colored,then there is a monochromatic Hilbert cube of dimension d.

Proof. We use induction on d. Assume that H = H(k, d − 1) exists, and set K = kH + H .Given a coloring of [K ], consider each of the intervals [i, i+H−1] for i ∈ [kH +1]. There arekH ways to color such an interval, so by the pigeonhole principle there are two intervalsthat are colored in the same way, say the ones starting with i1, i2. By the choice of H ,there is a monochromatic (d − 1)-dimensional cube in [i1, i1 + H − 1], say with generatorsx0, . . . , xd−1, and its translation by i2 − i1 must then also be monochromatic in the samecolor. Adding the generator xd = i2 − i1 to the list of generators gives a d-dimensionalcube that is monochromatic.Szemerédi [33] proved the following density version of Hilbert’s coloring theorem.

9

Lemma 2.2 (Szemerédi). Every subset S ⊂ [N ] of size |S| ≥ 4N1−1/2d contains a Hilbertcube of dimension d.

Proof. For any set T ⊂ [N ], defineTi = {t ∈ T : t + i ∈ T}.

We have ∑i∈[N ]Ti = (|T |2

),

so there is an i such that |Ti| ≥ (|T |2 )/N ≥ |T |2/(4N) (as long as |T | ≥ 2). We will use thisfact repeatedly.Applying this fact to S, we get an i1 such that (using 4N1−1/2d ≥ (4N)1−1/2d)|Si1| ≥ |S|

24N ≥((4N)1−1/2d)2

4N = (4N)1−1/2d−1.We write Si1,i2 = (Si1)i2 , etc. Applying the fact to Si1 , we get an i2 such that |Si1,i2| ≥(4N)1−1/2d−2 . Repeating this, we get i3, . . . , id such that

|Si1,...,id | ≥ (4N)1−1/2d−d = 1.Let i0 be any element of Si1,...,id . Then i0, i1, . . . , id generate a cube of dimension dcontained in S. Indeed, i0 ∈ Si1,...,id means that i0, i0 + id ∈ Si1,...,id−1 , which then impliesthat i0, i0 + id−1, i0 + id, i0 + id + id−1 ∈ Si1,...,id−2 , etc. Altogether we see that i0 +∑j∈I ij isin S for any I ⊂ [d].We will actually use the lemma in the following form. Note that log = log2.

Corollary 2.3. If N ≥ 2log2(4/δ), then any S ⊂ [N ] with |S| ≥ δN contains a Hilbert cube ofdimension at least 14 log logN .

Proof. We choose d to be any integer between 14 log logN and 12 log logN , so in particular2d ≤ log1/2N . We can rearrange the assumption N ≥ 2log2(4/δ) to2log1/2 N ≥ 4/δ,

which impliesN = 2logN ≥ (4/δ)log1/2 N ≥ (4/δ)2d .This gives δN ≥ 4N1−1/2d , so we can apply Lemma 2.2 to get a Hilbert cube of dimension

d ≥ 14 log logN .

10

2.2 Combinatorial proof of Roth’s TheoremRoth’s Theorem is the density version of Van der Waerden’s Theorem for arithmetic pro-gressions of length 3.Theorem 2.4 (Roth). For every δ there exists N0 such that, for all N > N0, any S ⊂ [N ]with |S| ≥ δN contains a arithmetic progression of length three.We give a combinatorial proof of Roth’s theorem, which is due to Szemerédi [33]. Hereis an outline of the proof.• Density increment: The main strategy is to show that if S has no 3-term arithmeticprogression, then there is a long arithmetic progression P ⊂ [N ] such that the densityof S on P is larger than δ by a constant depending on δ . If N is large enough, we cankeep repeating this argument on subprogressions. Since the density cannot becomelarger than 1, we must encounter a 3-term arithmetic progression in S.• Partitioning into progressions: To find P , we partition [N ] = A ∪ B so that S ∩ B =∅, and we partition A into long arithmetic progressions P1, . . . , Pk . Then S haslarger density on A, which implies it has larger density on one of the Pi. To beable to partition A into long arithmetic progressions, we need to choose B and dso that d and (B + d)\B are small. Indeed, we can partition A by starting withd arithmetic progressions covering [N ], and then removing B, which gives a newprogression starting at every b+ d 6∈ B for which b ∈ B.• Finding a Hilbert cube: To find B and d, we take a large Hilbert cube C ⊂ S withgenerators c0, . . . , ck . Observe that if 2c−s ∈ 2C−S, then s, c, 2c−s is an arithmeticprogression with difference c − s. Thus, if S has no 3-term arithmetic progression,then 2C − S is a large set disjoint from S. It may not have the partitioning propertythat we want, but we can find a subset that does.• Decomposing the Hilbert cube: Decompose C into the cubes Ci with generatorsc0, . . . , ci, so that Ci+1 = Ci ∪ (Ci + ci+1). Set Di = 2Ci − S, which is disjoint from S.We have a sequence D0 ⊂ · · · ⊂ Dk ⊂ [N ], so one of the sets Di+1\Di = (Di+2ci+1)\Dimust be small; we set B = Di and d = 2ci+1. Then we have B disjoint from S with(B + d)\B small, so we can partition A = [N ]\B into long arithmetic progressions,and on one of these we find the density increment.

In the actual proof, we have to work out the quantitative details, we have to deal with thecomplication that 2C − S might not lie inside [N ], and we have to ensure that d is not toolarge.Proof of Theorem 2.4. We assume that S contains no 3-term arithmetic progression. Todeal with the complications mentioned after the outline, we use a convenient partition of[N ]. This may lead to small losses in density, for which we roughly compensate with factors1/2; we omit the elementary details of checking that these adjustments suffice. We split [N ]into four parts [iM+1, (i+1)M ] for i = 0, . . . , 3 (with the last part perhaps slightly smaller).We can assume that S has density at least δ/2 on the second part, since otherwise there isa density increment on one of the other three parts, and we can jump to the last paragraphof the proof. We split the second interval [M + 1, 2M ] into about (log logN)/2 intervals oflength between N/(2 log logN) and N/(log logN), and we pick one such interval, say I , onwhich S has density at least δ/2.

11

We assume that N ≥ 22 log2(4/δ), which implies |I| ≥ N/(2 log logN) ≥ 2log2(4/δ), so thatwe can use Corollary 2.3 to find a Hilbert cubeC ⊂ S ∩ I

of dimension k ≥ (log log |I|)/4 ≥ (log logN)/8, and with generators c0, . . . , ck , wherec1, . . . , ck ≤ |I| ≤ N/ log logN . We let Ci be the cube generated by c0, . . . , ci, and weset

Di = 2Ci − (S ∩ [1,M ]),so Di ⊂ [M + 1, 4M ]. As observed in the outline, each Di is disjoint from S. We have asequenceD0 ⊂ D1 ⊂ · · · ⊂ Dk ⊂ [N ],which implies that there is an i such that |Di+1\Di| ≤ N/k . Since Di+1 = Di ∪ (Di + 2ci+1),this implies |(Di + 2ci+1)\Di| ≤ N/k . Setting B = Di and d = 2ci+1 we have|(B + d)\B| ≤ N

k ≤8Nlog logN .We also have d ≤ 2N/ log logN and

|B| ≥ |D0| ≥ |S ∩ [1,M ]| ≥ (δ/2)M = (δ/8)N.We can partition A = [N ]\B into

d+ |(B + d)\B| ≤ 10N/ log logNprogressions with difference d, as described in the outline. Thus we can write

A = P1 ∪ · · · ∪ P` ,with ` ≤ 10N/ log logN , each Pi an arithmetic progression, and all Pi disjoint.We have |B| ≥ (δ/8)N and thus |A| ≤ (1− δ/8)N . Let J ⊂ [` ] be the set of indices i forwhich |Pi| ≥ (δ2/160) log logN . Then we have∑i6∈J

|Pi| ≤10Nlog logN · δ2 log logN160 ≤ δ216N.

Thus we have∣∣S ∩ (⋃i∈J Pi)∣∣

|A| ≥|S| −

∣∣∣(⋃i 6∈J Pi)∣∣∣

|A| ≥δN − δ216N(1− δ8 )N ≥

(δ − δ216

)(1 + δ8)≥ δ + δ232 .

This means that S has density δ + δ2/32 on the union of the progressions Pi of length atleast (δ2 log logN)/160, so there is such a Pi on which S has density at least δ + δ2/32.We now repeat the argument for the set |S ∩Pi| as a subset of Pi. After repeating lessthan 32/δ2 times, we would obtain a density greater than 1, so we must have encountereda 3-term arithmetic progression. Here N should be sufficiently large, which means largeenough so that after iterating the function f (x) = (δ2 log log x)/160 up to 32/δ2 times,starting with N , we still have a value larger than 22 log2(4/δ), so that Corollary 2.3 can beapplied. This means that N0 is an exponential tower whose height depends polynomiallyon 1/δ .12

2.3 Bounds for Roth’s TheoremThere are several other proofs of Roth’s Theorem. Let r3(N) be the maximum size of a setS ⊂ [N ] that contains no 3-term arithmetic progression. We proved that r3(N) = o(N). Theoriginal proof of Roth [27] used Fourier analysis, and it showed that

r3(N)� N · 1log logN .This bound is significantly stronger than what we get from our proof of Theorem 2.4, andFourier analysis seems to be the only method that gives strong quantitative bounds forRoth’s Theorem. Roth’s bound has been improved many times over the years, and thecurrent best bound, due to Bloom [6], isr3(N)� N · (log log)4logN .

On the other hand, Behrend [5] proved a lower bound of the form r3(N)� N · 2−c√logN ,which is still the best known, except for a small improvement of Elkin [11].Theorem 2.5 (Behrend). For all N ∈ N there is a set S ⊂ [N ] of size |S| ≥ N · 2−4√logNthat does not contain an arithmetic progression of length three.

Proof. For K, L to be chosen later, we intersect [K ]L with a sphere of radius r to getSr = {(xi) ∈ [K ]L : x21 + · · ·+ x2

L = r2}.The spheres Sr with r ∈ [K 2L] cover [K ]L, so, by the pigeonhole principle, there is a radiusr0 such that

|Sr0| ≥ K L

K 2L.A line intersects a sphere in at most two points, so there are no three distinct pointsx, y, z ∈ Sr0 such that x + y = 2z, i.e., Sr0 contains no 3-term arithmetic progression ofvectors.We define an injective map φ : [K ]L → [(2K )L] by

φ(x1, . . . , xL) = ∑i∈[L] xi · (2K )i.

Note that for x, y, z ∈ Sr0 we have x + y = 2z if and only if φ(x) + φ(y) = 2φ(z), soφ(Sr0) ⊂ [(2K )L] does not contain a 3-term arithmetic progression. We set S = φ(Sr0).The rest is calculation. Given N , we set L = b√logNc and K = 122L, so that (2K )L ≤ Nand

(2K )L ≥ 2L2 ≥ 2(√logN−1)2≥ N · 2−2√logN , K 2L2L ≤ L · 23L ≤ 2−4√logN .

Then|S| ≥ K L

K 2L = (2K )L · (K 2L2L)−1 ≥ N · 2−6√logN ,which finishes the proof.13

2.4 An alternative proof using the regularity lemmaWe now give an alternative proof of Roth’s Theorem, which also shows a two-dimensionalgeneralization called the Corners Theorem. The proof relies on the regularity lemma, whichwe will not prove or even state here. We use the following corollary of the regularity lemma,which is much easier to formulate, and we sketch how it would be proved from the regularitylemma.Lemma 2.6 (Triangle removal). For every α > 0 there is a β > 0 such that, if a graph Gon n vertices has at least αn2 edge-disjoint triangles, then it has at least βn3 triangles.Proof sketch. By the regularity lemma, for any ε > 0, we can remove at most εn2 edgesfrom G, so that we can partition the vertices of G into not too many subsets X1, . . . , Xk ,with each pair Xi, Xj ε-regular. This means that, for any A ⊂ Xi with |A| > ε|Xi| andB ⊂ Xj with |B| > ε|Xj |, the density of edges between them is close to the density of edgesbetween Xi, Xj , i.e., ∣∣∣∣ |E(A,B)|

|A||B| −|E(Xi, Xj )||Xi||Xj |

∣∣∣∣ < ε.

Moreover, we can assume that each pair Xi, Xj has either density zero, or density abovesome constant, say |E(Xi, Xj )|/|Xi||Xj | > ε/2.If we take ε < α , then the number of removed edges is less than the number of edge-disjoint triangles, so some triangle must be left intact. If the vertices of that triangle are inXi, Xj , Xk , then each of these pairs must have density more than ε/2. A calculation with theε-regular property then gives that Xi, Xj , Xk determine at least f (ε)n3 triangles for somepolynomial function f . We set β = f (ε).Theorem 2.7 (Corners Theorem). For every δ there exists N0 such that, for all N > N0,any S ⊂ [N ]2 with |S| ≥ δN2 contains a corner {(s, t), (s+ d, t), (s, t + d)} with d 6= 0.Proof. We define a tripartite graph G on three vertex sets X, Y , Z consisting of lines:

X = {x = i : i ∈ [N ]}, Y = {y = j : j ∈ [N ]}, Z = {x + y = k : k ∈ [2N ]}.We connect two lines from different sets if their intersection point is in S. Then a trianglein G corresponds to lines x = i, y = j, x + y = k such that (i, j), (k − j, j), (i, k − i) ∈ S.This is a corner with (s, t) = (i, j) and d = k − i− j , except that if k = i+ j , then the threelines forming the triangle are concurrent; in that case we call the triangle degenerate. Wewish to show that there is a non-degenerate triangle in G.For every point of S, the three lines from X, Y , Z intersecting at that point give adegenerate triangle in G, so G contains at least δN2 degenerate triangles, which areedge-disjoint. Since G has 4N vertices, we can apply Lemma 2.6 with α = δ/42 to geta β > 0 such that G has at least β(4N)3 triangles. There are at most N2 degeneratetriangles in G, hence as long as N is sufficiently large so that β(4N)3 > N2, there must bea non-degenerate triangle, which means S contains a corner.Corollary 2.8 (Roth). For every δ there exists N0 such that, for all N > N0, any S ⊂ [N ]with |S| ≥ δN contains a 3-term arithmetic progression.Proof. Given S ⊂ [N ], define T = {(x, y) ∈ [2N ]2 : x − y ∈ S}. Each line x − y = s fors ∈ S contains at least N points of [2N ]2, so |S| ≥ δN implies |T | ≥ δN2 = (δ/4)(2N)2.Thus, by Theorem 2.7, T contains a corner {(s, t + d), (s, t), (s + d, t} with d 6= 0. Thatmeans that s− t − d, s− t, s− t + d ∈ S, and this is a 3-term arithmetic progression withcommon difference d.

14

Chapter 3

The Density Hales–Jewett Theorem

3.1 IntroductionIn this chapter we prove the following theorem.Theorem 3.1 (Density Hales–Jewett). For all ` > 1, δ > 0 there is a numberD0 = DHJ(`, δ)such that, for all D > D0, any S ⊂ [` ]D with |S| ≥ δ`D contains a Hales–Jewett line.

The theorem was first proved by Furstenberg and Katznelson [14] using ergodic methods.A combinatorial proof was found by Polymath [25], which was later simplified by Dodos,Kanellopoulos, and Tyros [9]. The proof we give here is that of Dodos, Kanellopoulos, andTyros, with some small changes in the exposition.For a fixed ` we denote the statement of the theorem by D HJ (`). We prove by inductionon ` that D HJ (`) holds for all ` ≥ 2. For the base case ` = 2, observe that we can thinkof a vector in [2]D as a set in 2[D], and we can think of a set S ⊂ [2]D as a set systemS∗ ⊂ 2[D]. A Hales–Jewett line in [2]D then corresponds to sets A,B ∈ 2[D] such thatA ⊂ B. Thus D HJ (2) follows from Sperner’s Theorem [32], which says that if S∗ ⊂ 2[D]with |S∗| > ( D

bD/2c), then there are A,B ∈ S∗ with A ⊂ B.To show that D HJ (`) implies D HJ (`+1), we will use a density increment strategy likein our proof of Theorem 2.4. To motivate this strategy, let us first discuss a set-theoreticinterpretation of D HJ (3).A point x ∈ [3]D can be thought of as a partition of [D] into three sets, i.e.,(X, Y , Z ) with disjoint X, Y , Z ∈ 2[D] such that X ∪ Y ∪ Z = [D].

For instance, we can let X be the set of indices of the coordinates where x has a 1, Y theones where x has a 2, and Z the ones where x has a 3. This makes Z redundant, so wecould just as well think of disjoint pairs, i.e.(X, Y ) with disjoint X, Y ∈ 2[D].

In other words, we identify [3]D with the disjoint product

2[D] � 2[D] = {(X, Y ) ∈ 2[D] × 2[D] : X ∩ Y = ∅}.What is a Hales–Jewett line of [3]D in this interpretation? It is a triple of pairs of theform (X, Y ), (X ∪W,Y ), (X, Y ∪W ) ∈ 2[D] � 2[D],

15

with X, Y ,W disjoint. This should remind us of Theorem 2.7, where we saw corners(x, y), (x + d, y), (x, y + d) — a line in [3]D is a “corner” in 2[D] � 2[D]! This suggeststhat we might be able to use ideas from proofs of Theorem 2.7. The proof of Theorem 2.7that we gave used the regularity lemma, and it seems possible to use the same idea to proveD HJ (3), but for larger ` we would then need the hypergraph regularity lemma. Instead,we will dig up the original proof of Theorem 2.7, due to Ajtai and Szemerédi [1]. We willsketch the Ajtai–Szemerédi proof in the next section, and then we will see how to translateit to a proof of D HJ (3). It will turn out that this argument extends to general ` , which issurprising if one compares it to the situation for arithmetic progressions, because the knownproofs of Roth’s Theorem do not extend easily to the general case it all.3.2 An outline of the proof

3.2.1 An argument of Ajtai and SzemerédiWe sketch the original proof of the Corners Theorem (Theorem 2.7) of Ajtai and Szemerédi [1],which uses a density increment argument similar to the one we used for Theorem 2.4. Thisproof has the awkward feature that it uses Szemerédi’s Theorem, which seems to be a muchdeeper statement than the Corners Theorem. Nevertheless, the Ajtai–Szemerédi argumentis the inspiration for the proof of Theorem 3.1 of Polymath [25] and Dodos, Kanellopoulos,and Tyros [9].Given a dense set S ⊂ [N ]2, we wish to show that S contains a corner{(x, y), (x + d, y), (x, y+ d)}

with d 6= 0. We take a diagonal D defined by an equation of the form x+y = t on which Sis dense. Let U be the set of x-coordinates of the points of S on D, and let V be the set ofy-coordinates of points of S on D. If S does not contain a corner, then (U×V )∩S = D∩S,so S has a relatively low density on U × V .The sets

U × V , U × V c, Uc × [N ]partition [N ]2 (here the superscript c denotes the complement within [N ]), so the fact thatS has low density on U ×V implies that S has a density increment on U ×V c or Uc× [N ].We denote the product on which S has a density increment by X × Y . We wish topartition X × Y into large sets that are similar to [M ]2 for some M < N , since then we canrepeat the argument. Let us define a grid in [N ]2 to be a Cartesian product in which bothfactors are arithmetic progressions with the same length and the same difference; in otherwords, a scaled and translated copy of some [M ]2.First we show how a set of the form X × [N ] with a dense X ⊂ [N ] can be partitionedinto large grids. By Szemerédi’s Theorem, there is a long arithmetic progression P1 in X ;we remove that progression, and then find another progression P2; we repeat this to findprogressions Pi and remove them until the remaining set is too small to apply Szemerédi’sTheorem. For each i, we then partition most of Pi× [N ] into grids of the form Pi× P̃i, whereeach P̃i is a translate of Pi. As a result, we have partitioned most of X × [N ] into grids ofthe form Pi × P̃i.To partition X × Y into grids, we first partition X × [N ] into grids Pi × P̃i as above.Then we consider (X × Y ) ∩ (Pi × P̃i) = Pi × (Y ∩ P̃i),

16

which is a product of an arithmetic progression with a set that is (typically) dense on atranslate of that arithmetic progression. Thus we can use the same argument as above topartition Pi × (Y ∩ P̃i) into grids.As a result, after discarding a relatively small number of points, we have partitionedX ×Y into grids that are fairly large. The density increment of S on X ×Y must also occuron one of these subgrids, so we can iterate the argument with a larger δ . Since the densitycannot increase to more than 1, we must encounter a corner contained in S.3.2.2 Applying the Ajtai–Szemerédi argument to [3]DWe now discuss how we can apply the argument that we sketched in Section 3.2.1 to adense set S ⊂ [3]D , which as in Section 3.1 we think of as a dense set S ⊂ 2[D]�2[D]. Alongthe way, we will point out how to generalize the concepts to [` ]D .Finding a good diagonal. The goal is to find a density increment of S on a set that isequivalent to 2[E ] � 2[E ] for some E < D. The analogue of a diagonal x + y = t is the setof (X, Y ) satisfying

X ∪ Y = Tfor some fixed T . However, not every two pairs (X1, Y1), (X2, Y2) on such a diagonal are theendpoints of a corner (X, Y ), (X∪W,Y ), (X, Y ∪W ); this is only the case if we have X1 ⊂ X2or vice versa. Therefore, it is not enough to find a diagonal with many points of S, we alsohave to ensure that this diagonal contains many pairs (X1, Y1), (X2, Y2) with X1 ⊂ X2.Note that within a diagonal, we may as well ignore the second set Y , so we can thinkof the points on the diagonal as sets X ∈ 2T . Our goal is then to find a T such that manysubsets X1, X2 ∈ 2T with X1 ⊂ X2 come from points in S. This is one place where we willuse induction, since D HJ (2) (i.e., Sperner’s Theorem) gives us a way to find a pair X1 ⊂ X2in a dense set system. With some more work, we can find a subset T for which there aremany such pairs.Let us translate this part of the argument back into the Hales–Jewett language. Thesubproduct 2T � 2T ⊂ 2[D] � 2[D] corresponds to a Hales–Jewett subspace of dimension|T | in [3]D , which we can think of as [3]|T |. The diagonal X ∪ Y = T then corresponds to[2]|T | ⊂ [3]|T |. A pair of points on the diagonal that can be completed to a corner is a pairX1, X2 ∈ T with X1 ⊂ X2, which is the same as a line in [2]|T |. Therefore, our goal is to useD HJ (2) to find a large subspace [3]E ⊂ [3]D such that S contains many lines of [2]E .In general, given a dense set S ⊂ [` ]D we will use D HJ (` − 1) to find a large subspace[` ]E such that S contains many lines of [`−1]E . Of course, a direct application of D HJ (`−1)only gives a single line, but with some work we can use it to obtain many lines.Obtaining a density increment. Suppose we have a diagonal X ∪ Y = T as above, con-taining many pairs (X1, Y1), (X2, Y2) ∈ S with X1 ⊂ X2. We can rewrite such a pair as(X, Y ∪W ), (X ∪W,Y ) for a nonempty set W . If S contains no corner, then it follows thatthe point (X, Y ) is not in S. We define

U = {X ∈ 2[D] : (X, T\X ) ∈ S} and V = {Y ∈ 2[D] : (T\Y , Y ) ∈ S},so that many points of U � V are not in S. We partition 2[D] � 2[D] into the sets

U � V , U � V c, Uc � 2[D],17

where the superscript c denotes the complement within 2[D]. Then S must have a densityincrement on U � V c or Uc � 2[D]. The latter is of the form W � 2[D], and the former is anintersection of two sets of the form W � 2[D] and 2[D] �W ′, so it suffices to know how topartition sets of the form W � 2[D].Again we translate this into Hales–Jewett language. The set W � 2[D] consists of pairs(X, Y ) with X ∈ W and Y arbitrary (but disjoint from X ), so it has the property that if weadd or remove elements of Y (while maintaining disjointness from X ), we obtain anotherelement of W � 2[D]. The corresponding set C ⊂ [3]D has the property that if in any vectorof C we change any 2s to 3s or vice versa, we get another vector in C . Let us define thiskey property in general.Definition 3.2. A set C ⊂ [` ]D is ij-insensitive (for i < j ∈ [` ]) if, whenever in a vector of Cwe change some is to js and/or we change some js to is, then the resulting vector is in C .

Thus a set like W � 2[D] corresponds to a 23-insensitive set in [3]D , while 2[D] � Wcorresponds to a 13-insensitive set in [3]D . In the partition of 2[D] � 2[D] into the partsU � V ,U � V c, Uc � 2[D], each part is a disjoint product, which corresponds in [3]D tothe intersection of a 23-insensitive set and a 13-insensitive set. This generalizes to thefollowing definition.Definition 3.3. A set C ⊂ [` ]D is Cartesian if C = C1∩ · · ·∩C`−1, where for each i ∈ [`−1]the set Ci is i`-insensitive.

In the proof for general ` , we will partition [` ]E into parts, each of which is a Cartesianset. The properties of the “diagonal” that we found will imply that S has low density onone of the Cartesian sets in the partition, which then implies that it has a density incrementon one of the other Cartesian sets.Partitioning into subspaces. In the last step, we need to partition a set of the form W�2[D]into subsets that are of the form 2[E ]�2[E ] for some E < D. As in the Ajtai–Szemerédi proof,this can then be used to partition U � V c or Uc � 2[D].Here we again use induction, but now we need a multidimensional version of D HJ (2),which will follow from D HJ (2) itself. In set language, it tells us that a dense subset of 2[D]contains a “Hilbert cube” of sets, i.e., a subset of the form

{S0 ∪⋃i∈I

Si : I ⊂ [E ]}for disjoint sets S0, . . . , SE ∈ 2[D]. Such a set system is equivalent to 2[E ]. In Hales–Jewettlanguage, it is a subspace of [2]D .For general ` , we find a density increment for S on a Cartesian set, and we partition thatCartesian set into subspaces of sufficiently large dimension. Then S must have a densityincrement on one of those subspaces, which is equivalent to some [` ]F for a sufficientlylarge F . Then we can repeat the argument in that [` ]F , so, since the density cannot keepincreasing, we must encounter a Hales–Jewett line that is contained in S.

18

3.3 Preparation for the proofIn this section we prove some results that play an important role in the proof of Theorem3.1, but that are somewhat independent from the main argument.3.3.1 Coloring Hales–Jewett linesThe proof will use a variant of the Hales–Jewett theorem, in which we we color the linesinstead of the points. It was first proved by Graham and Rothschild [16] in a much moregeneral form, but here we prove only the specific case that we need.Theorem 3.4. For all `,m there is a number GR (`,m) such that for all D ≥ GR (`,m)the following holds. If the lines in [` ]D are 2-colored, then there is an m-dimensionalHales–Jewett space V ⊂ [` ]D such that all lines in V have the same color.

The proof requires two lemmas. We define a map X from the set of lines L in [` ]D to 2[D]by letting X(L) be the set of active coordinates of L; more formally, if L is represented bythe word (Li)i∈[D] ∈ ([` ] ∪ {∗})D), thenX(L) = {i ∈ [D] : Li = ∗}.

Lemma 3.5. For any `,m there exists a number N = N(`,m) such that the following holds.For any 2-coloring c of [` ]N there is an m-dimensional subspace W ⊂ [` ]N such that thecolor of a line in W depends only on the positions of its active coordinates, i.e., for anylines L, L′ ⊂ W we have that X(L) = X(L′) implies c(L) = c(L′).Proof. We set λ1 = (`+1)m−1 and a1 = HJ(2λ1, `), and then for i ≤ m we recursively define

λi = (` + 1)m−i+∑j∈[i−1] aj , ai = HJ(2λi, `).We write N = ∑i∈[m] ai and [` ]N = A1 × · · · × Am,where Ai = [` ]ai .Given a 2-coloring of [` ]N , we find a line in each Ai by starting with Am and workingbackwards until A1. We color Am by giving x ∈ Am a color that encodes all colors of alllines in A1 × · · · × Am−1 × x . There are λm such lines, so this requires 2λm colors. Sinceam = HJ(2λm, `), there is a monochromatic line Lm ⊂ Am. This means that for each x ∈ Lm,the lines in A1 × · · · × Am−1 × x are colored in the same way.Recursively we do the following. We color Ai by giving x ∈ Ai a color that encodes allcolors of all lines in

A1 × · · · × Ai−1 × x × Li+1 × · · · × Lm.There are λi such lines, so this requires 2λi colors. Since ai = HJ(2λi, `), there is amonochromatic line Li ⊂ Ai.When we are done, we set W = L1 × · · · × Lm, which is an m-dimensional subspaceof [` ]N . Consider distinct lines L, L′ ⊂ W with X(L) = X(L′). Then L, L′ must have fixedcoordinates in the coordinates corresponding to some Li ⊂ Ai, where L has a value xi ∈ Liand L′ has a value x ′i ∈ Li. By the choice of Li, all lines in L1×· · ·×Li−1×xi×Li+1×· · ·×Lmand in L1× · · ·× Li−1× x ′i × Li+1× · · ·× Lm are colored in the same way, which implies thatL and L′ have the same color.

19

Lemma 3.6 (Finite Unions Theorem). For any m there exists a number F = FU(m) suchthat for any 2-coloring of 2[F ] there are disjoint subsets X1, . . . , Xm ∈ 2[F ] such that everyunion

⋃i∈I Xi for a nonempty I ⊂ [m] has the same color.

Proof. We prove the stronger statement that for any p, q there is an FU(p, q), such that if2[FU(p,q)] is colored red and blue, then there are X1, . . . , Xp with all unions red, or there areY1, . . . , Yq with all unions blue. For p = q = m this is the statement of the theorem. Weprove it by induction on p+ q, with the statement being trivial if p = 1 or q = 1.Set F = FU(p, q) = MHJ(2, 2, FU(p − 1, q)), where the function MHJ comes fromthe Multidimensional Hales–Jewett Theorem (Corollary 1.7). Given a 2-coloring of 2[F ], weget a 2-coloring of [2]F , so Corollary 1.7 gives a monochromatic FU(p− 1, q)-dimensionalsubspace of [2]F . This subspace corresponds in 2[F ] to disjoint sets S0, S1, . . . , SFU(p−1,q)such that

S0 ∪⋃i∈I

Si

has the same color c1 for every I ⊂ [FU(p− 1, q)].Suppose c1 is red. We can consider the set of all unions of S1, . . . , SFU(p−1,q) as aninstance of 2[FU(p−1,q)], so by induction, either there are q sets Si1, . . . , Siq such that allunions are blue, and we are done, or there are p− 1 sets Si1, . . . , Sip−1 such that all unionsare red. In the second case, the p sets S0, Si1, . . . , Sip−1 have all unions red, so we are alsodone. If c1 is blue instead of red, we observe that FU(p − 1, q) = FU(p, q − 1) and weargue symmetrically.Proof of Theorem 3.4. Set GR (`,m) = N(`, FU(m)) and take D ≥ GR (`,m). Given a 2-coloring c of the lines in [` ]D , Lemma 3.5 gives an FU(m)-dimensional subspace W ⊂ [` ]D ,which we can think of as [` ]FU(m), in which X(L) = X(L′) implies c(L) = c(L′). Thus we candefine a coloring c∗ on 2[FU(m)] by setting c∗(X ) = c(L), where X ∈ 2[FU(m)] and L is any linein [` ]FU(m) such that X(L) = X . Lemma 3.6 then gives disjoint sets X1, . . . , Xm ∈ 2[FU(m)] withall unions the same color. These correspond to disjoint sets Y1, . . . , Ym of coordinates in [` ]D(each coordinate of [` ]FU(m) corresponds to some ∗i in the word for W , which correspondsto one or more coordinates in [` ]D), which define an m-dimensional subspace V ⊂ W (theword for W has ∗i in the coordinates of Yi), and the unions of the Yi correspond to all linesin W . By construction of c∗, all these lines have the same color.3.3.2 The multidimensional density Hales–Jewett theoremWe now prove that D HJ (k) implies a multidimensional version of itself. This fact will playan important role in the proof of D HJ (k + 1).Lemma 3.7. Assume that D HJ (k) holds. For all n, η there is a number MDHJ(k, n, η) suchthat for E ≥ MDHJ(k, n, η) the following statement holds. If X ⊂ [k ]E with |X | ≥ ηkE , thenthere is a subspace V ⊂ [k ]E of dimension n such that V ⊂ X .

Proof. We use induction on m. For n = 1, the statement is D HJ (k). Take β = DHJ(k, η/2)and α ≥ E(n− 1, η/(2(k + 1)β)), and set E = α + β. Consider X ⊂ [k ]E = [k ]α × [k ]β with|X | ≥ ηkα+β . For x ∈ [k ]α , we write Xx = {y ∈ [k ]β : x × y ∈ X} ⊂ [k ]β . We have∑

x∈[k ]α |Xx | = |X | ≥ ηkα+β.

20

Let A be the set of x ∈ [k ]α such that |Xx | ≥ (η/2)kβ . Then we have∑x 6∈A |Xx | < (η/2)kβ ·kα ,so ∑x∈A |Xx | ≥ (η/2)kα+β . Since for each x we have |Xx | ≤ kβ , it follows that |A| ≥ (η/2)kα .By D HJ (k) and the choice of β, for every x ∈ A there is a line Lx contained in Xx .Since the number of lines in [k ]β is less than (k + 1)β , there is a line L in [k ]β and a setC ⊂ A with |C | ≥ (η/(2(k + 1)β))kα such that L ⊂ Sx for all x ∈ C . By induction and thechoice of α , there is an (n− 1)-dimensional subspace V contained in C . Then V × L is ann-dimensional space contained in X .The following technical lemma will be used in the next lemma and in Section 3.4.1.Lemma 3.8. For all k,M and η1 > η2 there is a number E(k,m, η1, η2) such that thefollowing holds for all E > E(k,M, η1, η2). If X ⊂ [k ]E with |X | ≥ η1kE , then there is anα < E and an M-dimensional subspace V ⊂ [k ]α such that

|{y ∈ [k ]E−α : x × y ∈ X}| ≥ (η1 − η2)kE−αfor all x ∈ V .Proof. We use a density increment argument. Whatever α is, we write

Xx = {y ∈ [k ]E−α : x × y ∈ X}.If α = M and V = [k ]M satisfy the requirements, then we are done; otherwise there isan x ∈ [k ]M such that |Xx | < (η1 − η2)kE−α , which implies that there is an x1 ∈ [k ]M with|Xx | ≥ (η1 + η2/kM)kE−α . Now consider α = 2M and V = x1 × [k ]M ; again we are doneunless there is an x2 ∈ x1×[k ]M such that |Xx | ≥ (η1+2η2/kM)kE−α . Since E > MkM/η2, wecan repeat this kM/η2 times, but the density cannot increase above 1, so we must encountera V with the claimed property.Given an n-dimensional subspace W ⊂ [k + 1]E we write ρ(W ) for its restriction, whichis the subset of W where the active coordinates take only the values 1, . . . , k (and not k+1).Then ρ(W ) can be viewed as a copy of [k ]n inside W . Note that the fixed coordinates inρ(W ) may still take the value k + 1. The following lemma is a variant of Lemma 3.7, wherethe given set X is dense in [k + 1]E , and we want to find a subspace W that itself neednot be contained in X , but so that we do have ρ(W ) ⊂ X . It does not follow directly fromLemma 3.7, since X need not be dense in [k + 1]E .Lemma 3.9. Assume that D HJ (k) holds. For all n, η there is a number MDHJ∗(k, n, η)such that for E ≥ MDHJ∗(k, n, η) the following holds. If X ⊂ [k + 1]E with |X | ≥ η(k + 1)E ,then there is a subspace W ⊂ [k + 1]E of dimension n such that ρ(W ) ⊂ X .Proof. Set M = MDHJ(k, n, η/2) and E = E(k,M, η, η/2). By Lemma 3.8, there exist anα < E and an M-dimensional subspace V1 ⊂ [k ]α such that, with β = E − α , we have|Xx | ≥ (η/2)kβ for every x ∈ V1. Then we get∣∣X ∩ (ρ(V1)× [k ]β)∣∣ = ∑

x∈ρ(V1) |Xx | ≥ (η/2)kβ · |ρ(V1)| = (η/2) ∣∣ρ(V1)× [k ]β∣∣ .In other words, X is (η/2)-dense on ρ(V1) × [k ]β . The sets ρ(V1) × y for y ∈ [k ]β partitionρ(V1)× [k ]β , so there is a y0 ∈ [k ]β such that X is (η/2)-dense on ρ(V1)× y0, i.e.,

|X ∩ (ρ(V1)× y0)| ≥ (η/2)|ρ(V1)× y0|.By Lemma 3.7, since M = MDHJ(k, n, η/2) and ρ(V1) × y0 is an M-dimensional space,there is an n-dimensional subspace V2 ⊂ ρ(V1) × y0 such that V2 ⊂ X . Let W be then-dimensional space in [k + 1]E that contains V2, so that ρ(W ) = V2 ⊂ X .

21

3.4 The proof of the density Hales–Jewett theoremIn this section we prove Theorem 3.1, in the three steps which were outlined in Section3.2.2 for the case ` = 3. As mentioned in Section 3.1, we use induction on ` , and thebase case D HJ (2) follows from Sperner’s Theorem. We assume throughout this sectionthat D HJ (` − 1) holds, and we will deduce D HJ (`). We assume that S ⊂ [` ]D satisfies|S| ≥ δ`D , with D > D0 for some D0 specified below, and that S contains no Hales–Jewettline. The goal is to find a density increment for S, specifically a Hales–Jewett subspaceV ⊂ [` ]D such that |S ∩V | ≥ (δ+ f (δ))`D , where V has at least a certain dimension d, andf (δ) is an increasing function of δ .In the three subsections that follow, we will work through several claims, and there willbe several parameters to keep track of. Let us give the definitions of all these parametershere, so that they are easy to find, although the reader is of course advised to ignore theseformulas until they turn up in the proof (the function GR comes from Theorem 3.4, thefunction E comes from Lemma 3.8, and the function F will be specified at the end of Section3.4.3).

δ1 = δ/(8`DHJ(`−1,δ/4)), δ2 = δ · δ1/10, δ3 = δ2/`,m = max{DHJ(` − 1, δ/4), log(`−1)/` (δ2), F (`, δ3, d)} ,

D0 = E(` − 1, GR (m), δ, δ22 ).Some terminology. We use the term (` − 1)-space to refer to a Hales–Jewett space in[` ]D whose active coordinates run only from 1 to ` − 1 (but whose fixed coordinates maystill take the value `). Sometimes we may use the term `-space in [` ]D to refer to a space,just to distinguish it from an (` − 1)-space. An `-space V ⊂ [` ]D contains one maximal(` − 1)-space, which we denote by ρ(V ). Similarly, we talk about (` − 1)-lines and the(` − 1)-line ρ(L) corresponding to an `-line L.3.4.1 Finding a subspace where S has many (` − 1)-linesOur goal in this subsection is to find a subspace V ⊂ [` ]D such that S contains a densesubset of all (`−1)-lines in ρ(V ). Our first claim is a direct application of Lemma 3.8, usingthe assumption that D > D0 = E(` − 1, GR (m), δ, δ22 ).Claim. There is an α < D and a GR (m)-dimensional subspace V1 ⊂ [` ]α such that for allx ∈ V1 we have

|{y ∈ [` ]D−α : x × y ∈ S}| ≥ (δ − δ22 )`D−αFrom now on we write β = D − α and [` ]D = [` ]α × [` ]β . For x ∈ [` ]α we write

Sx = {y ∈ [` ]β : x × y ∈ S},so by the claim we have |Sx | ≥ (δ − δ22 )`β . Similarly, for an (` − 1)-line L in [` ]α , we define

SL = {y ∈ [` ]β : L× y ⊂ S}.

22

Claim. There is an m-dimensional subspace V2 ⊂ V1 such that for every (` − 1)-line L inρ(V2) we have |SL| ≥ 2δ1`β .

Proof. We use Theorem 3.4. We color an (` − 1)-line L in ρ(V1) red if|SL| ≥ 2δ1`β,and blue otherwise. By Theorem 3.4 applied to ρ(V1), which has dimension GR (m), V1 hasan m-dimensional subspace V2 such that the (` − 1)-lines in ρ(V2) are either all red or allblue. We show that ρ(V2) has at least one red line, which then implies that all lines in

ρ(V2) are red.Since m ≥ DHJ(` − 1, δ/4), we can pick an (` − 1)-subspace W ⊂ ρ(V2) that hasdimension exactly n = DHJ(`−1, δ/4). Since W ⊂ V1, we have |Sx | ≥ (δ−δ22 )`β ≥ (δ/2)`βfor all x ∈ W . Thus∑y∈[` ]β |S ∩ (W × y)| = |S ∩ (W × [` ]β)| = ∑

x∈W

|Sx | ≥ (δ/2)`β · `n.Let Y be the set of y ∈ [` ]β such that |S ∩ (W × y)| ≥ (δ/4)`n. Then we have∑

y∈Y

|S ∩ (W × y)| ≥ ∑y∈[` ]β |S ∩ (W × y)| −∑

y6∈Y

|S ∩ (W × y)|≥ (δ/2)`β+n − (δ/4)`β+n = (δ/4)`β+n.

Since for every y we have |S ∩ (W × y)| ≤ |W | = `n, we get |Y | ≥ (δ/4)`β .For any y ∈ Y , since W × y is an (` − 1)-space of dimension n = DHJ(` − 1, δ/4) onwhich S is (δ/4)-dense, there is an (` − 1)-line Ly ⊂ W such that Ly×y is contained in S.The number of (` − 1)-lines in W is less than `n, so there is an (` − 1)-line L0 in W thatoccurs as Ly for at least (using δ1 = δ/(8`n))|Y |/`n ≥ 2δ1`βdifferent y ∈ Y . This means that |SL0| ≥ 2δ1`β , so L0 is a red line contained in ρ(V2). By thechoice of V2, this implies that all lines in ρ(V2) are red, i.e., they satisfy |SL| ≥ 2δ1`β .

Claim. There is a set Y1 ⊂ [` ]β with |Y1| > δ1`β such that for all y ∈ Y1 we have L×y ⊂ Sfor a δ1-dense subset of all (` − 1)-lines in ρ(V2).Proof. Let us write L for the set of (` − 1)-lines in ρ(V2). We have |SL| ≥ 2δ1`β for every(` − 1)-line L in V2, so we have∑

y∈[` ]β |{L ∈ L : y ∈ SL}| = ∑L∈L

|SL| ≥ 2δ1`β · |L|.Note that y ∈ SL if and only if L × y ⊂ S. Let Y1 be the set of y ∈ [` ]β such that|{L ∈ L : L× y ⊂ S}| ≥ δ1|L|. Then we have∑

y∈Y1|{L ∈ L : L× y ⊂ S}| ≥ ∑

y∈[` ]β |{L ∈ L : L× y ⊂ S}| −∑y6∈Y1|{L ∈ L : L× y ⊂ S}|

> 2δ1`β|L| − δ1|L| · `β = δ1`β|L|.Since each term in this sum has size at most |L|, it follows that |Y1| > δ1`β .23

Claim. There is a set Y2 ⊂ [` ]β with |Y2| > (1 − 3δ2)`β such that for all y ∈ Y2 we have|S ∩ (V2 × y)| ≥ (δ − δ2)`m.

Proof. If for any y ∈ [` ]β we have |S ∩ (V2 × y)| ≥ (δ + δ22 )`m, then we have a densityincrement on an m-dimensional subspace and we are done since m ≥ d, so we can assumethat|S ∩ (V2 × y)| < (δ + δ22 )`mfor all y ∈ [` ]β . Since V2 ⊂ V1, we also have |Sx | ≥ (δ− δ22 )`β for all x ∈ V2, which implies∑

y∈[` ]β |S ∩ (V2 × y)| = ∑x∈V2|Sx | ≥ (δ − δ22 )`m+β

SetY2 = {y ∈ [` ]β : |S ∩ (V2 × y)| ≥ (δ − δ2)`m}.We have

|Y2| · (δ + δ22 )`m >∑y∈Y2|S ∩ (V2 × y)| ≥ ∑

y∈[` ]β |S ∩ (V2 × y)| −∑y6∈Y2|S ∩ (V2 × y)|

≥ (δ − δ22 )`m+β − (δ − δ2)`m(`β − |Y2|) = (δ2 − δ22 )`m+β + (δ − δ2)|Y2|`m.This implies|Y2| > δ2 − δ22

δ2 + δ22 `β > (1− 3δ2)`β.

Hence Y2 is as claimed.Claim. There is an m-dimensional subspace V ⊂ [` ]D such that S contains a δ1-densesubset of all (` − 1)-lines in ρ(V ), and S has density at least δ − δ2 on V .

Proof. From the previous two claims we have |Y2| > (1 − 3δ2)`β and |Y2| > δ1`β . By ourchoice of δ2 we have δ1 = 10δ2/δ > 3δ2, so Y1 ∩ Y2 is nonempty. We pick any y ∈ Y1 ∩ Y2and set V = V2 × y, so that by the previous two claims V has the desired properties.3.4.2 Obtaining a density incrementBy the last claim of Section 3.4.1, we have an m-dimensional subspace V ⊂ [` ]D on whichS has density at least δ−δ2, and such that S contains a δ1-dense subset of all (`−1)-linesin ρ(V ). We identify V with [` ]m, and we let T ⊂ [` ]m be the subset corresponding to S∩V ,so T contains no `-lines, we have |T | ≥ (δ − δ2)`m, and T contains a δ1-dense subset ofthe (` − 1)-lines in [` − 1]m.Let us make the definition of insensitivity slightly more formal. For x ∈ [` ]m andi ∈ [` − 1], let σi(x) be the point obtained by replacing every ` in x by i. Then X ⊂ [` ]m isi`-insensitive if we have

x ∈ X if and only if σi(x) ∈ X .To see that this corresponds to the definition in Section 3.2.2, observe that x, y ∈ [` ]m canbe obtained from each other by swapping some is and `s if and only if σi(x) = σi(y). Notethat with this definition it is clear that if X is i`-insensitive, then so is X c = [` ]m\X .Finally, recall that C ⊂ [` ]m is Cartesian if there are i`-insensitive sets Ci for i ∈ [`−1]such that C = C1 ∩ · · · ∩ C`−1. Note that some of the Ci may be equal to [` ]m, or in otherwords, they could be omitted from the intersection.24

Claim. There is a Cartesian set C ⊂ [` ]m with |C | ≥ δ3`m and |T ∩ C | ≥ (δ + δ2)|C |.Proof. Let E be the set of all endpoints in [` ]m of the (` − 1)-lines in T ∩ [` − 1]m, and set

P = (T ∩ [` − 1]m) ∪ E.Note that E is in bijection with the set of (` − 1)-lines in T ∩ [` − 1]m, so we have|E | ≥ δ1(`m − (` − 1)m) ≥ (δ1/2)`m.Also note that, since T contains no `-line, T is disjoint from E .Define

Pi = {x ∈ [` ]m : σi(x) ∈ T}.Then Pi is i`-insensitive by the definition above, since by definition of Pi we have x ∈ Piif and only if σi(x) ∈ T , which because of σi(σi(x)) = σi(x) is equivalent to σi(σi(x)) ∈ T ,which by definition of Pi means σi(x) ∈ Pi. We also haveP = P1 ∩ · · · ∩ P`−1.Indeed, if x ∈ T ∩ [` − 1]m, then σi(x) = x ∈ T for every i, so x ∈ Pi for every i; if x ∈ E ,then x is an endpoint of a line in T ∩ [` − 1]m, and each σi(x) lies on that line, so x ∈ Pifor every i. Conversely, if x ∈ P1 ∩ · · · ∩ P`−1, then either x ∈ [` − 1]m, in which case

x = σi(x) ∈ T , or x 6∈ [` − 1]m, in which case σi(x) ∈ T for every i, implying that x is theendpoint of a line in [` − 1]m. To summarize, P is a Cartesian set.We have |P| ≥ |E | ≥ (δ1/2)`m, so (writing Pc = [` ]m\P)|Pc| ≤ (1− δ1/2)`m.On the other hand, we have T ∩E = ∅ and thus T ∩P ⊂ [` − 1]m, so using the assumption

m ≥ log(`−1)/` (δ2) we get|T ∩ P| ≤ ((` − 1)/`)m · `m ≤ δ2`m.Since |T | ≥ (δ − δ2)`m, we get

|T ∩ Pc| ≥ (δ − 2δ2)`m.DefineQi = P1 ∩ · · · ∩ Pi−1 ∩ Pc

ifor i ∈ [` ]. The sets Q1, . . . , Q` partition Pc. Since taking complements preserves insensi-tivity, Pci is i`-insensitive, so Qi is a Cartesian set for each i.Let I ⊂ [` ] be the set of indices i for which |Qi| ≥ δ3`m. Then, since we chose δ3 = δ2/` ,∑

i6∈I

|Qi| < ` · δ3`m = δ2`m.Hence, using δ2 = δ · δ1/10, we have∣∣(T ∩ Pc) ∩⋃i∈I Qi

∣∣|Pc| ≥

|T ∩ Pc| −∣∣∣⋃i6∈I Qi

∣∣∣|Pc|

≥ (δ − 2δ2)− δ2(1− δ1/2) ≥ (δ − 3δ2) (1 + δ1/2) ≥ δ + δ2.Hence there exists i0 ∈ I such that |T ∩Qi0| ≥ (δ + δ2)|Qi0|.We set C = Qi0 , so C is Cartesian, since we observed that every Qi is Cartesian. Bythe choice of i0 we have |C | ≥ δ3`m and |T ∩ C | ≥ (δ + δ2)|C |.

25

3.4.3 Partitioning into subspacesIt remains to show how to partition a Cartesian set into subspaces of a certain dimension.First we do this for insensitive sets.Claim. For all `, n, η there is a number N(`, n, η) such that for N > N(`, n, η) the followingholds. Given an i`-insensitive set X ⊂ [` ]N with |X | ≥ η`N , there exists a family V ofpairwise disjoint n-dimensional subspaces contained in X such that |X\(∪V∈VV )| < η`N .

Proof. Set β = MDHJ∗(`, n, η/2) (where the function MDHJ∗ comes from Lemma 3.9),N = β · 2(` + n)β`β−n/η, and α = N − β. We have∑

x∈[` ]α |Xx | = |X | ≥ η`α+β.Set

A1 = {x ∈ [` ]α : |Xx | ≥ (η/2)`β},so we have |A1| ≥ (η/2)`α .For x ∈ A1, by Lemma 3.9 and the choice of β, there is an n-dimensional subspaceVx ⊂ [` ]β such that ρ(Vx) ⊂ Xx . Then x × ρ(Vx) ⊂ X , so the fact that X is i`-insensitiveimplies that x×Vx ⊂ X . Since there are less than (`+n)β distinct n-dimensional subspacesin [` ]β , it follows that some V1 occurs as Vx at least η/(2(` +n)β)`α times, or in other wordsthe set

A2 = {x ∈ [` ]α : x × V1 ⊂ X}satisfies |A2| ≥ (η/(2(` + n)β))`α . Note that A2 is also i`-insensitive, because the fact thatX is i`-insensitive implies that x × V1 ⊂ X if and only if σi(x)× V1 ⊂ X .Set

V1 = {x × V1 : x ∈ A2}.This is a family of pairwise disjoint n-dimensional subspaces contained in X , with|∪V∈V1V | = |A1| · `n ≥ η2(` + n)β `α · `n = η2(` + n)β`β−n `N .

If |X\(∪V∈V1V )| < η`N , we are done, so we can assume thatX1 = X\(∪V∈V1V ),

satisfies |X1| ≥ η`N . The set X1 need not be i`-insensitive, but it has the property that forevery y ∈ [` ]β the setXy1 = {x ∈ [` ]α : x × y ∈ X1}is i`-insensitive. We can apply the argument above to Xy1 ⊂ [` ]N−β for each y ∈ [` ]β , toobtain a family V2 of pairwise disjoint n-dimensional subspaces contained in X1 with

|∪V∈V1∪V2V | ≥ 2 · η2(` + n)β`β−n `N .Note that all subspaces in V1 ∪ V2 are pairwise disjoint.By the choice of N , we could repeat this 2(`+n)β`β−n/η times to get families V3,V4, . . .with the analogous properties. Since the density of the union of the subspaces cannotincrease above 1, we must obtain |X\(∪V∈∪ViV )| < η`N . Then we set V = ∪V∈∪ViV .26

We can now define the function F that we used to define m. Set θ = δ3/(` − 1). DefineNr for r ∈ [` − 1] by N`−1 = d and Nr = N(`,Nr+1, θ), where the function N comes fromthe claim above. Then we set F (`, δ3, d) = N1.Claim. Given the Cartesian set C ⊂ [` ]m with |C | ≥ δ3`m, there exists a family V of pairwisedisjoint d-dimensional subspaces contained in C such that |C\(∪V∈VV )| < δ3`m.

Proof. Let C1, . . . , C`−1 such that C = C1 ∩ · · · ∩ C`−1, with Ci an i`-insensitive set. Weprove by induction on r that for all r ∈ [` − 1] there exists a family Vr of pairwise disjointNr-dimensional subspaces contained in C1 ∩ · · · ∩ Cr such that

|(C1 ∩ · · · ∩ Cr)\(∪V∈VV )| < rθ`m. (3.1)For r = 1, this follows from the previous claim, with η = θ. For r = ` − 1 we obtain thecurrent claim, since (` − 1)θ = δ3.We now prove the statement for r + 1. By induction, the statement for C1 ∩ · · · ∩ Crgives a family Vr of pairwise disjoint Nr-dimensional subspaces contained in C1 ∩ · · · ∩ Crsatisfying (3.1). Note that we have |Vr| ≤ `m−Nr .Let

U = {V ∈ Vr : |V ∩ Cr+1| ≥ θ|V |}.The set V ∩ Cr+1 is i(r + 1)-insensitive as a subset of V . For every V ∈ U we apply theprevious claim with η = θ to V∩Cr+1 as a subset of V , using the fact thatNr = N(`,Nr+1, θ).Since V ∩Cr+1 = V ∩ (C1 ∩ · · · ∩Cr+1), we get a family WV of Nr+1-dimensional subspacessuch that|(V ∩ (C1 ∩ · · · ∩ Cr+1))\(∪W∈WVW )| < θ`Nr . (3.2)Then we set

Vr+1 = {W : W ∈ WV , V ∈ U}.We prove that Vr+1 is as required. By (3.1), the number of elements of C1∩· · ·∩Cr+1 thatare missed by all V ∈ Vr is less than rθ`m. On the other hand, for every V ∈ Vr , either wehave V 6∈ U and then |V ∩ (C1∩· · ·∩Cr+1)| < θ`Nr , or V ∈ U and then by (3.2) all but θ`Nrelements of V ∩ (C1 ∩ · · · ∩ Cr+1) are covered by Vr+1. Since |Vr| ≤ `m−Nr , it follows thatthe number of elements of C1 ∩ · · · ∩Cr+1 missed in this way is at most `m−Nr · θ`Nr = θ`m.Altogether, the subspaces in Vr+1 miss at most rθ`m + θ`m = (r + 1)θ`m elements ofC1 ∩ · · · ∩ Cr+1, as required.3.4.4 Completing the proofWe can now wrap up the proof. From Sections 3.4.1 and 3.4.2, we obtained a Cartesian setC ⊂ [` ]m with |C | ≥ δ3`m and |T ∩ C | ≥ (δ + δ2)|C |, unless we have a density incrementδ22 on a d-dimensional subspace (in the proof of the fourth claim in Section 3.4.1). Bythe second claim in Section 3.4.3, C can partitioned into disjoint d-dimensional subspaces,with a remainder of size at most δ3`m. Since δ3 < δ2/2, T has density at least δ + δ2/2on the union of these subspaces, so there must be a d-dimensional subspace W ′ for which|T ∩W ′| ≥ (δ + δ2/2)|W ′|. In the original setting, V corresponds to a subspace W ⊂ [` ]Dsuch that |S ∩W | ≥ (δ + δ2/2)|W |. Thus we have a density increment δ2/2 (or δ22 ) on ad-dimensional subspace, and if d is sufficiently large we can repeat this argument. Sincethe density cannot keep increasing, we must encounter a Hales–Jewett lines contained inS. This finishes the proof.

27

3.5 ConsequencesCorollary 3.10 (Szemerédi’s Theorem). For all δ, ` there is a number N0 such that for allN > N0 the following holds. If S ⊂ [N ] and |S| ≥ δN , then S contains an arithmeticprogression of length ` .

Proof. We set D = DHJ(`, δ/2) and N0 = `D . Given N > `D and S ⊂ [N ] with |S| ≥ δN ,we can find an interval of length `D on which has S has density at least δ/2, for instanceas follows. Either the first half or the second half of [N ] contains (δ/2)N elements of S,and that half can be covered by disjoint intervals of length `D . Then S has density at leastδ/2 on the union of these disjoint intervals, which implies that S has density at least δ/2on one of the intervals. After translating and renaming, we can assume that S ⊂ [`D ] andS has density at least δ/2.We define a map φ : [` ]D → [`D ] by

φ(x1, . . . , xD) = 1 + ∑i∈[D](xi − 1)` i−1,

which is a bijection. Therefore φ−1(S) is a subset of [` ]D with density at least δ/2, so byTheorem 3.1 and the choice of D, φ−1(S) contains a Hales–Jewett line L of length ` . Thenφ(L) is an arithmetic progression of length ` contained in S.

28

Chapter 4

Arithmetic progressions over finite fields

4.1 IntroductionIt is natural to look for variants of Roth’s Theorem in other groups. For a group G, writer3(G) for the maximum size of a subset of G that does not contain a three-term arithmeticprogression (a solution of x − 2y + z = 0.) This is slightly different from Roth’s Theorem,since there we consider subsets of a specific subset [N ] of the group Z, but in fact Roth’sTheorem is closely related to the group Z/NZ. Roth’s proof [27] actually showed thatr3(Z/NZ) = o(N), and this implies Roth’s Theorem in the way that we stated it (given adense subset of [N ], we can embed it into Z/2NZ, so that the modulo arithmetic plays norole).Another finite group for which this question is interesting is the vector space FDp (wherep is a prime, so that Fp = Z/pZ is a field). In particular, bounding r3(FD3 ) has becomeknown as the capset problem. Note that in FD3 , three points form a three-term arithmeticprogression if and only if they form a line. Thus, a set without a three-term arithmeticprogression is a set that does not contain a line (such a set is sometimes called a “cap”or a “capset”, hence the name for the problem). Also note that we can compare FD3 to [3]D;then every Hales–Jewett line corresponds to a line over F3, but there are far more F3-linesthan Hales–Jewett lines. This suggests that a density theorem for F3 lines, i.e. for r3(FD3 ),could be quantitatively stronger than the Density Hales-Jewett Theorem.Brown and Buhler [7] proved that r3(FD3 ) = o(3D), Meshulam [23] improved that tor3(FD3 ) = O(3D/D) (which is O(N/ logN) if we write N = 3D), and Bateman and Katz [3]obtained r3(FD3 ) = O(3D/D1+ε). Meshulam’s proof uses Fourier analysis in a similar way toRoth’s original proof of Roth’s Theorem [27]. From below, Edel [10] proved r3(FD3 ) ≥ (2.2)D .Recently, Ellenberg and Gijswijt [12], based on work of Croot, Lev, and Pach [8], gave asimpler proof of a much better bound. They proved that

r3(FD3 ) = O((3− α)D)for some α > 0, specifically α ≥ 0.24. Their proof uses basic properties of polynomialsand some linear algebra, and can be extended to other finite groups. Tao [34] elegantlyreformulated the proof of [12] in terms of the slice rank of a tensor.In this chapter we present Tao’s proof. First we will introduce the slice rank and applyit to a problem from extremal set theory, where some of the details are easier, and then wewill use it to prove the result of Ellenberg and Gijswijt.

29

4.2 The slice rank methodAs a warmup for the slice rank method, we present a classic example of a proof usingthe linear algebra method. This proof uses the rank of a matrix, and we present it in a(somewhat unusual) way that will make the generalization to tensors more natural. Thetheorem is due to Berlekamp [4].Theorem 4.1 (Oddtown theorem). Let X be a finite set and let S ⊂ 2X be a set system. If|S| is odd for every S ∈ S, and |S ∩ T | is even for all distinct S, T ∈ S, then |S| ≤ |X |.Proof. Let F : S × S → F2 be the function defined for S, T ∈ S by

F (S, T ) = |S ∩ T | mod 2.By assumption, for S, T ∈ S we have F (S, T ) = 1 if S = T and F (S, T ) = 0 if S 6= T .We can think of F as an |S| × |S| matrix in which an entry is nonzero if and only if itis on the diagonal. Thus the rank of this matrix is equal to |S|. On the other hand, we canwriteF (S, T ) = ∑

x∈X

fx(S)fx(T ),where fx : S → F2 is the function that is 1 if x ∈ S and 0 if x 6∈ S. We can think of fx asa vector of length |S|, so we have written the matrix of F as a sum of |X | matrices, eachof which is a product of the form vvT for a vector v . Since a matrix of the form vvT hasrank one, and since matrix rank is subadditive, this implies that the rank of the matrix of F ,which we found to be |S|, is at most |X |.In general, a k-tensor is a function X k → F. Traditionally, the tensor rank of a tensor Tis the minimum R for which we can write T = ∑R

i=1 fi with each fi of the form g1(x1) · · ·gk (xk ).In other words, tensors of the form g1(x1) · · ·gk (xk ) have rank 1, and the rank of a tensoris the minimum number of rank one tensors that we can decompose it in. Then a 2-tensoris a matrix, and its tensor rank equals its matrix rank (since the rank of a matrix equalsthe minimum number of rank one matrices, which have the form uvT , that the matrix can bedecomposed into).Here we will use a different notion of rank for tensors, called the slice rank. A tensorhas slice rank one if it has the form gh, where g and h depend on disjoint nonempty subsetsof the variables. The slice rank of a general tensor is the minimum number of slice rankone tensors that we can decompose it in. We will only use the slice rank for k = 3, sofor concreteness we define it formally only in that case, but the generalization would bestraightforward.Definition 4.2. Let F be a field and let X be any finite set. A function F : X × X × X → Fhas slice rank 1 if we can write

F (x, y, z) = g(x)h(y, z), or F (x, y, z) = g(y)h(x, z), or F (x, y, z) = g(z)h(x, y).For a function F : X × X × X → F, the slice rank SR(F ) is the minimum number R suchthat we can write F = ∑Ri=1 Fi where each Fi has slice rank 1.In the proofs that follow, we will use the slice rank in a similar way to how we used thematrix rank in the proof of Theorem 4.1. Given some set S whose size we want to bound,we construct a diagonal tensor that has |S| nonzero entries on its diagonal. Then the slicerank of the tensor is |S|, and an upper bound on the slice rank gives an upper bound on |S|.However, first we have to carefully prove the natural fact that the slice rank of a diagonaltensor equals the number of nonzero entries on its diagonal.

30

Lemma 4.3. If F : X × X × X → F has the property that F (x, y, z) 6= 0 if and only ifx = y = z, then SR(F ) = |X |.Proof. Define δ(a, b) to be 1 if a = b and 0 if a 6= b. Then we can write

F (x, y, z) = ∑x0∈X

δ(x, x0)F (x0, y, z),which shows that SR(F ) ≤ |X |.Suppose that SR(F ) = ` < |X |. Then we can write

F (x, y, z) = j∑i=1 fi(x)gi(y, z) + k∑

i=j+1 fi(y)gi(x, z) + ∑̀i=k+1 fi(z)gi(x, y).

Let V be the vector space of all functions v : X → F such that for all 1 ≤ i ≤ j we have∑x∈X

fi(x)v (x) = 0.Since V is the solution set of j homogeneous equations within the |X |-dimensional spaceof all functions on X , it follows that V has dimension at least |X | − j .Let v ∈ V be a function with maximal support Sv = {x ∈ X : v (x) 6= 0}, so that

|Sv | ≥ |X | − j.

Indeed, if we had |Sv | < |X | − j = dim(V ), then there would be a nonzero w ∈ V thatvanishes on Sv (since the number of equations v (x) = 0 for x ∈ Sv is less than the dimensionof V ), so that v + w ∈ V would have larger support than v .We have∑x∈X

v (x)F (x, y, z) = j∑i=1 gi(y, z)

(∑x∈X

fi(x)v (x))

+ k∑i=j+1 fi(y)(∑

x∈X

v (x)gi(x, z))+ ∑̀i=k+1 fi(z)

(∑x∈X

v (x)gi(x, y))

= k∑i=j+1 fi(y)gi(z) + ∑̀

i=k+1 fi(z)hi(y)for certain functions hi. This shows that the 2-tensor

G(y, z) = ∑x∈X

v (x)F (x, y, z),which corresponds to a matrix, has matrix rank at most ` − j < |X |− j . Since F (x, y, z) 6= 0if and only if x = y = z, we have G(y, z) = 0 whenever y 6= z. Moreover, we haveG(y, y) = v (y)F (y, y, y), which is nonzero whenever y ∈ Sv . Thus G corresponds to an|X | × |X | diagonal matrix with |Sv | nonzero entries on its diagonal, so it has matrix rank|Sv |. Since |Sv | ≥ |X | − j , this is a contradiction.

31

4.3 SunflowersA k-sunflower is a family of k sets such that any two of the sets have the same intersection.In other words, any element is either contained in all the sets of the sunflower, or in at mostone of the sets. Erdős and Szemerédi [13] conjectured that if a set system S ⊂ 2[N ] containsno 3-sunflower, then |S| < (2 − β)N for some β > 0. Prior to the work of Ellenberg andGijswijt [12], Alon, Shpilka, and Umans [2] observed that a bound for the capset problemlike that in [12] would imply the conjecture of Erdős and Szemerédi. The result of Ellenbergand Gijswijt [12] thus proved this conjecture, and the resulting bound comes to β ≥ 0.06.Naslund and Sawin [24] then observed that the slice rank method of Tao [34] can be applieddirectly to this problem, and gives a better bound. We give the proof of Naslund and Sawin,which is also a natural introduction to Tao’s proof of the Ellenberg–Gijswijt bound.Theorem 4.4 (Naslund-Sawin). If S ⊂ 2[N ] is a set system without a 3-sunflower, then|S| ≤ 3(N + 1)CN , with C = 3/22/3 ≤ 1.89.Proof. We can identify 2[N ] with {0, 1}N , so that we can view S as either a set system or aset of binary vectors. Then S has the property that for any three distinct vectors x, y, z ∈ S,the sum x + y + z has a coordinate equal to 2, since otherwise the sets corresponding tox, y, z would form a 3-sunflower. Let Sm be the subset of sets x ∈ S that have exactly melements. Each Sm has the property that for any x, y, z ∈ Sm that are not all equal, thesum x + y+ z has a coordinate equal to 2.Define a function F : S3

m → Q byF (x, y, z) = N∏

i=1 (2− (xi + yi + zi)).Because of the property of Sm just mentioned, for x, y, z ∈ Sm we have F (x, y, z) = 0 if andonly if x = y = z. By Lemma 4.3, this implies SR(F ) = |Sm|.On the other hand, expanding F gives (using the notation x I = x i11 · · · x iNN and |I| = ∑i∈I i)F (x, y, z) = ∑

I,J,K∈{0,1}NI+J+K≤(1,...,1)

cIJKx IyJzK = ∑I∈{0,1}N|I|≤N/3

x IfI(y, z)+ ∑J∈{0,1}N|J|≤N/3

yJgJ(x, z)+ ∑K∈{0,1}N|K |≤N/3

zKhK (x, y)for some functions fI , gJ , hK . Here we used the pigeonhole principle to observe that ifI + J + K ≤ (1, . . . , 1), then we have either |I| ≤ N/3, or |J| ≤ N/3, or |K | ≤ N/3. Sinceeach term in this sum is a slice, it follows that the slice rank of F satisfies

|Sm| = SR(F ) ≤ 3 ∑k≤N/3

(Nk

).

It remains to compute a numerical upper bound. By the binomial theorem we have(1 + x)N = ∑k≤N(Nk

)xk , so for any 0 < x < 1 we have

x−N/3(1 + x)N = ∑k≤N

(Nk

)xk−N/3 > ∑

k≤N/3(Nk

)xk−N/3 > ∑

k≤N/3(Nk

).

We can choose x ∈ (0, 1) to minimize f (x) = x−1/3(1 + x) using basic calculus. We havef ′(x) = 13x−4/3(2x−1), so f has a minimum at x = 1/2, with the value f (1/2) = 3/22/3. Hence

|S| = N∑m=0 |Sm| ≤ 3(N + 1) ∑

k≤N/3(Nk

)≤ 3(N + 1)(3/22/3)N ,

which completes the proof.32

4.4 Arithmetic progressions in FD3We now give Tao’s proof [34] of the capset bound of Ellenberg and Gijswijt [12].Theorem 4.5 (Ellenberg-Gijswijt). If S ⊂ FD3 contains no arithmetic progression of lengththree, then |S| ≤ 3CD for some C ≤ 2.76.

Proof. An arithmetic progression in FD3 corresponds to distinct x, y, z such that x−2y+z = 0,which over F3 is equivalent to x + y + z = 0, so S contains no such solution with x, y, zdistinct. Moreover, if x, y, z are not distinct but not all equal, say x = y 6= z, thenx + y + z = 0 is not possible (since then 2x + z = 0, so z = 3x + z = x). Thus, forx, y, z ∈ S, we have x + y+ z = 0 if and only if x = y = z.We define F : S × S × S → F3 by

F (x, y, z) = D∏i=1 (1− (xi + yi + zi)2).

Recall that in F3 we have 1 − x2 6= 0 if and only if x = 0. Thus F (x, y, z) 6= 0 if and onlyif x + y + z = 0, which for x, y, z ∈ S is equivalent to x = y = z. Hence F is a diagonaltensor, so by Lemma 4.3, we have SR(F ) = |S|.On the other hand, F (x, y, z) expands to∑I,J,K∈{0,1,2}D|I|+|J|+|K |≤2D

cIJKx IyJzK = ∑I∈{0,1,2}D|I|≤2D/3

x IfI(y, z) + ∑J∈{0,1,2}D|J|≤2D/3

yJgJ(x, z) + ∑K∈{0,1,2}D|K |≤2D/3

zKhK (x, y)for some functions fI , gJ , hK . This implies that

SR(F ) ≤ 3 ∑a+b+c=Db+2c≤2D/3

D!a!b!c! .

By the multinomial theorem we have(1 + x + x2)D = ∑

a+b+c=DD!

a!b!c!xb+2c.Therefore, for 0 < x < 1 we have

x−2D/3(1 + x + x2)D = ∑a+b+c=D

D!a!b!c!xb+2c−2D/3

>∑

a+b+c=Db+2c≤2D/3

D!a!b!c!xb+2c−2D/3 > ∑

a+b+c=Db+2c≤2D/3

D!a!b!c! .

To minimize f (x) = x−2/3(1 + x + x2), we calculate that f ′(x) = 13x−5/3(4x2 + x − 2), so f hasa minimum at x0 = (√33− 1)/8 with the value f (x0) ≤ 2.76. This proves the theorem.

33

Bibliography

[1] Miklós Ajtai and Endre Szemerédi, Sets of lattice points that form no squares, StudiaScientiarum Mathematicarum Hungarica 9, 9–11, 1974.[2] Noga Alon, Amir Shpilka, and Christopher Umans, On sunflowers and matrix multipli-

cation, Computational Complexity 22, 219–243, 2013.[3] Michael Bateman and Nets Hawk Katz, New bounds on cap sets, Journal of the Amer-ican Mathematical Society 25, 585–613, 2012.[4] Elwyn Berlekamp, On subsets with intersections of even cardinality. Canadian Math-ematical Bulletin 12, 471–477, 1969.[5] Felix Behrend, On sets of integers which contain no three terms in arithmetic progres-

sion, Proceedings of the National Academy of the Sciences 32, 331–332, 1946.[6] Thomas Bloom, A quantitative improvement for Roth’s theorem on arithmetic progres-

sions, Journal of the London Mathematical Society 93, 643–663, 2016.[7] Tom Brown and Joe Buhler, A density version of a geometric Ramsey theorem, Journalof Combinatorial Theory, Series A 32, 20–34, 1982.[8] Ernie Croot, Vsevolod Lev, and Péter Pach, Progression-free sets in Zn4 are exponen-

tially small, Annals of Mathematics 185, 331–337, 2017.[9] Pandelis Dodos, Vassilis Kanellopoulos, and Konstantinos Tyros, A simple proof of

the density Hales–Jewett theorem, International Mathematics Research Notices 12,3340–3352, 2014.[10] Yves Edel, Extensions of generalized product caps, Designs, Codes and Cryptography

31, 5–14, 2004.[11] Michael Elkin, An improved construction of progression-free sets, Israel Journal ofMathematics 184, 93–128, 2011.[12] Jordan Ellenberg and Dion Gijswijt, On large subsets of Fnq with no three-term arith-

metic progression, Annals of Mathematics 185, 339–343, 2017.[13] Paul Erdős and Endre Szemerédi, Combinatorial properties of systems of sets, Journalof Combinatorial Theory, Series A 24, 308–313, 1978.[14] Hillel Furstenberg and Yitzhak Katznelson, A density version of the Hales–Jewett

theorem, Journal d’Analyse Mathématique 57, 64–119, 1991.34

[15] Timothy Gowers, A new proof of Szemerédi’s theorem, Geometric and Functional Anal-ysis 11, 465–588, 2001.[16] Ron Graham and Bruce Rothschild, Ramsey’s theorem for n-parameter sets, Transac-tions of the American Mathematical Society 159, 257–292, 1971.[17] Ron Graham, Bruce Rothschild, and Joel Spencer, Ramsey theory, Wiley-Interscience,second edition, 1990.[18] Ron Graham and József Solymosi, Monochromatic equilateral right triangles on the

integer grid, Topics in Discrete Mathematics, Springer, 2006.[19] David Gunderson and Vojtěch Rödl, Extremal problems for affine cubes in the integers,Combinatorics, Probability and Computing 7, 65–79, 1998.[20] Alfred Hales and Robert Jewett, Regularity and positional games, Transactions of theAmerican Mathematical Society 106, 222–229, 1963.[21] David Hilbert, Ueber die Irreducibilität ganzer rationaler Functionen mit ganzzahligen

Coefficienten, Journal für die reine und angewandte Mathematik 110, 104–129, 1892.[22] Imre Leader, Ramsey Theory, notes from a course at Cambridge University, 2000.Available online at https://www.dpmms.cam.ac.uk/ par31/notes/

[23] Roy Meshulam, On subsets of finite abelian groups with no 3-term arithmetic progres-sions, Journal of Combinatorial Theory, Series A 71, 168–172, 1995.

[24] Eric Naslund and William Sawin, Upper bounds for sunflower-free sets,arXiv:1606.09575, 2016.

[25] D.H.J. Polymath, A new proof of the density Hales–Jewett theorem, Annals of Mathe-matics 175, 1283–1327, 2012.[26] Richard Rado, Studien zur Kombinatorik, Mathematische Zeitschrift 36, 424–470, 1933.[27] Klaus Roth, On certain sets of integers, Journal of the London Mathematical Society

28, 104–109, 1953.[28] Tom Sanders, On Roth’s theorem on progressions, Annals of Mathematics 174, 619–636, 2011.[29] Issai Schur, Über die Kongruenz xm + ym ≡ zm(mod p), Jahresbericht der DeutschenMathematiker-Vereinigung 25, 114–117, 1916.[30] Alexander Soifer, The Mathematical Coloring Book, Springer, 2009.[31] József Solymosi, Elementary additive combinatorics, in: Additive combinatorics, CRMProceedings & Lecture Notes 43, AMS, 2007.[32] Emanuel Sperner, Ein Satz über Untermengen einer endlichen Menge, MathematischeZeitschrift 27, 544–548, 1928.[33] Endre Szemerédi, On sets of integers containing no four elements in arithmetic pro-

gression, Acta Mathematica Academiae Scientiarum Hungaricae 20, 89–104, 1969.35

[34] Terence Tao, A symmetric formulation of the Croot-Lev-Pach-Ellenberg-Gijswijt capsetbound, blog post, 2016.

[35] Terence Tao and Van Vu, Additive Combinatorics, Cambridge University Press, 2006.[36] Mark Villarino, Bill Gasarch, and Kenneth Regan, Hilbert’s proof of his irreducibility

theorem, arXiv:1611.06303, 2016.[37] Bartel Leendert van der Waerden, Beweis einer Baudetschen Vermutung, Nieuw Archiefvoor Wiskunde 15, 212–216, 1927.

36

a course in arithmetic ramsey theory - epfl · 5/6/2017 · theorem 1.2 (solymosi). for all k, any...

Documents