
Research Collection

Doctoral Thesis

Some Extremal Problems about Saturation and Random Graphs

Author(s): Korándi, Dániel

Publication Date: 2016

Permanent Link: https://doi.org/10.3929/ethz-a-010655957

Rights / License: In Copyright - Non-Commercial Use Permitted

This page was generated automatically upon download from the ETH Zurich Research Collection. For more information please consult the Terms of use.

ETH Library


Diss. ETH No. 23504

Some Extremal Problems about

Saturation and Random Graphs

A thesis submitted to attain the degree of

Doctor of Sciences of ETH Zurich

(Dr. sc. ETH Zurich)

presented by

Dániel Korándi

MASt in Mathematics, University of Cambridge

born on February 23, 1990

citizen of Hungary.

accepted on the recommendation of

Prof. Dr. Benny Sudakov, Examiner
Prof. Dr. Jacques Verstraëte, Co-Examiner

2016


Abstract

Extremal combinatorics is a major branch of discrete mathematics that studies problems looking for the maximum or minimum size of combinatorial structures satisfying certain properties. Saturation is one of the oldest themes in the area and considers questions of the following form: What is the minimum size of a structure that cannot be extended without creating a forbidden substructure?

Another important field is probabilistic combinatorics, which has grown out of extremal combinatorics and studies random combinatorial objects. The Erdős-Rényi random graph G(n, p) is of particular interest, and a recent trend in the area is to look at classic extremal questions restricted to random graphs.

This thesis contributes to the study of saturation problems and of extremal questions in random graphs, including the combination of the two: saturation-type problems in random graphs. In Chapter 2 we look at a saturation problem in bipartite graphs and settle a conjecture of Moshkovitz and Shapira in the asymptotic sense. In Chapter 3 we consider the classic saturation problem in random graphs and obtain some tight results. In Chapter 4 we use the differential equation method to analyze a random process that is closely related to weak saturation. As an application, we improve previous results about the fundamental group of random simplicial complexes. In Chapter 5 we look at a classic edge decomposition problem in the random graph setting, and prove some asymptotically correct estimates.


Zusammenfassung

Die extremale Kombinatorik ist ein bedeutender Zweig der diskreten Mathematik. Sie befasst sich mit Fragestellungen über die minimale oder maximale Größe einer kombinatorischen Struktur mit bestimmten geforderten Eigenschaften. Sättigung ist eines der ältesten Themen in diesem Gebiet und studiert Probleme der folgenden Art: Was ist die minimale Größe einer Struktur, die nicht erweitert werden kann, ohne eine verbotene Teilstruktur zu erzeugen?

Ein weiteres wichtiges Feld ist die probabilistische Kombinatorik, die sich aus der extremalen Kombinatorik heraus entwickelt hat und zufällige kombinatorische Objekte studiert. Hierbei ist der Erdős-Rényi-Zufallsgraph G(n, p) von besonderem Interesse. Dies spiegelt sich unter anderem im aktuellen Trend wider, klassische Fragen der extremalen Kombinatorik auf Zufallsgraphen zu beschränken.

Diese Dissertation trägt zur Untersuchung von Sättigungsproblemen und von extremalen Problemen in Zufallsgraphen bei und verbindet diese durch die Untersuchung von Sättigung in Zufallsgraphen. In Kapitel 2 betrachten wir ein Sättigungsproblem in bipartiten Graphen und beweisen eine Vermutung von Moshkovitz und Shapira im asymptotischen Sinn. In Kapitel 3 untersuchen wir das klassische Sättigungsproblem in Zufallsgraphen und erzielen einige scharfe Resultate. In Kapitel 4 benutzen wir die Differentialgleichungsmethode, um einen Zufallsprozess zu analysieren, der eng mit der schwachen Sättigung in Graphen zusammenhängt. Diese Analyse verwenden wir, um frühere Resultate über die Fundamentalgruppe von zufällig ausgewählten Simplizialkomplexen zu verbessern. In Kapitel 5 betrachten wir ein klassisches Problem der Kantenzerlegung im Kontext von Zufallsgraphen und beweisen einige asymptotisch korrekte Abschätzungen.


Acknowledgements

First and foremost, I want to thank my advisor, Benny Sudakov, for all his advice and guidance that helped me take my first steps as a mathematician. Doing a PhD with Benny was probably one of the best decisions of my life.

I am just as indebted to my parents and brothers. Without their support and encouragement none of this would have been possible. Special thanks go to all the great math teachers that I could learn from and to Judit Csath for teaching me how English is done.

I have had the pleasure of collaborating with many people, and for that I am grateful to my co-authors: Victor Falgas-Ravry, Wenying Gan, Ping Kittipassorn, Michael Krivelevich, Shoham Letzter, Bhargav Narayanan and Yuval Peled.

Doing a PhD would not have been so fun without all the people surrounding me. Many thanks to Igor Balla, Shagnik Das, Felix Draxler, Roman Glebov, Nina Kamcev, Matt Kwan, Humberto Naves, Alexey Pokrovskiy, Pedro Vieira, Jan Volec, Frank Mousset, Rajko Nenadov, Nemanja Skoric, Corentin Perret, Laci Lovasz, Katie Arterburn, Zach Norwood and all my other friends from ETH, UCLA and elsewhere just for being there and talking to me.

And Ilcsi. Thank you for your love and eternal patience.

Chapter 2 is a version of the paper "Ks,t-saturated bipartite graphs" that I worked on with Wenying Gan and Benny Sudakov. It was published in the European Journal of Combinatorics, Volume 45 (2015). Chapter 3 is joint work with Benny Sudakov and is essentially the same as the paper "Saturation in random graphs" that is now accepted for publication in Random Structures & Algorithms. In Chapter 4, I included collaborative work with Yuval Peled and Benny Sudakov that was published as "A random triadic process" in the SIAM Journal on Discrete Mathematics, Volume 30 (2016). My contribution to this result was the combinatorial analysis of the discrete random process in question. Finally, Chapter 5 is an adaptation of the paper "Decomposing random graphs into few cycles and edges", published in Combinatorics, Probability and Computing, Volume 24 (2015).


Contents

1 Introduction

2 Saturation in bipartite graphs
   2.1 Introduction
   2.2 Lower bounds on the saturation number
   2.3 Extremal graphs
   2.4 The K2,3 case
   2.5 Further remarks

3 Saturation in random graphs
   3.1 Introduction
   3.2 Strong saturation
       3.2.1 Lower bound
       3.2.2 Upper bound
   3.3 Weak saturation
   3.4 Further remarks

4 A random triadic process
   4.1 Introduction
       4.1.1 Proof outline
   4.2 The differential equation method
   4.3 Calculations
       4.3.1 Tools
       4.3.2 Degrees
       4.3.3 Open triples
       4.3.4 3-walks
       4.3.5 4-walks
       4.3.6 Codegrees
   4.4 The second phase
       4.4.1 The lower bound
       4.4.2 The upper bound
   4.5 Further remarks

5 Decomposing random graphs into few cycles and edges
   5.1 Introduction
       5.1.1 Proof outline
   5.2 Covering the odd-degree vertices
   5.3 Cycle decompositions in sparse random graphs
   5.4 The main ingredients for the dense case
   5.5 Cycle-edge decompositions in dense random graphs
   5.6 Further remarks

References


Chapter 1

Introduction

Extremal combinatorics is a major branch of discrete mathematics that has been actively developed for over 80 years. It studies problems looking for the maximum or minimum size of combinatorial structures satisfying certain properties. Such questions are often motivated by other fields of mathematics and science, when the optimization of certain combinatorial quantities is needed.

A nice example of an extremal problem in non-mathematical terms is the well-known puzzle that asks how many knights can be placed on the chess board in such a way that no two of them attack each other. This puzzle fits a more general framework of extremal questions, called Turán theory, where one seeks to maximize the size of a structure avoiding certain forbidden substructures. The cornerstone result in this area is Turán's theorem [52], which determines the maximum number of edges in graphs not containing any large cliques.

One way to approach such a problem is to think about the corresponding process where we try to extend our structure step by step (adding knights or edges) in such a way that we never create a forbidden substructure (a pair of attacking knights or a large clique). Then we want to see how far we can get with this process. This leads us to another interesting question: How soon can we get stuck? What is the minimum size of a structure that cannot be extended without creating a forbidden substructure? Questions of this sort are called saturation problems.

Saturation was first considered by Zykov [55] and by Erdős, Hajnal and Moon [24], and has proved to be a very fruitful subject to study. Its development has revealed close ties to percolation theory, and saturation problems have been instrumental in motivating the development of various tools in combinatorics, most notably the linear algebraic method.

Probabilistic combinatorics is an area that has grown out of extremal combinatorics. It studies random combinatorial objects such as G(n, p), the Erdős-Rényi random graph on n vertices, where each pair of vertices is connected by an edge with probability p, independently of the others. Such random objects first appeared as probabilistic constructions for extremal questions. For example, the implicit use of G(n, 1/2) was key to a fundamental result of Erdős [21] on Ramsey theory from 1947, where he proved the existence of graphs containing no cliques or independent sets of super-logarithmic size. The idea of using randomness in combinatorial constructions revolutionized extremal combinatorics.
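As an aside, the model G(n, p) is straightforward to simulate. The following Python sketch (our illustration, not part of the thesis; the function name gnp is our own) samples one instance by flipping an independent p-biased coin for each of the \binom{n}{2} vertex pairs:

```python
import random
from itertools import combinations

def gnp(n, p, seed=None):
    """Sample the Erdos-Renyi random graph G(n, p): each of the
    n-choose-2 vertex pairs becomes an edge independently with probability p."""
    rng = random.Random(seed)
    return {frozenset(pair) for pair in combinations(range(n), 2)
            if rng.random() < p}

# In expectation there are binom(100, 2) * 0.5 = 2475 edges.
print(len(gnp(100, 0.5, seed=1)))
```

For p = 1/2 this mirrors the construction used implicitly in the Ramsey bound mentioned above.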

The study of random graph models for their own sake was initiated by Erdős and Rényi [25] in 1959. They investigated the structure and size of the connected components of G(n, p) as the edge probability p increases from 0 to 1. In particular, they found the probability threshold where the random graph becomes connected. This paper led to intensive research that determined the threshold for many other structural properties as well. We refer the interested reader to the monographs of Bollobás [12] and of Janson, Łuczak and Ruciński [33].

A more recent trend is to look at random analogs of classic extremal problems. Probably the first result of this kind appears in a 1986 paper of Frankl and Rödl [30], where (again, motivated by a question from extremal combinatorics) they find the maximum number of edges in a triangle-free subgraph of G(n, p). The systematic study of similar problems began in the mid-1990s. For more examples, see the survey of Rödl and Schacht [48].

This dissertation contributes to the study of saturation problems and of extremal questions in random graphs, including the combination of the two themes: saturation-type problems in random graphs. Below, we give a short description of the topics covered. More detail can be found in the introduction of the respective chapter.

For two graphs H and F, H is said to be F-saturated if it contains no copy of F as a subgraph, but the addition of any edge missing from H creates a copy of F in H; in other words, H is a maximal F-free graph. An F-free graph is weakly F-saturated if the missing edges can be added back in some order such that every added edge creates a new copy of F (possibly using the previously added edges). Clearly, any F-saturated graph is also weakly F-saturated. The saturation numbers sat(n, F) and w-sat(n, F) are defined to be the minimum number of edges in an F-saturated and a weakly F-saturated graph on n vertices, respectively.
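These definitions can be checked by brute force on small graphs. The following Python sketch (our illustration, not from the thesis) tests the Ks-saturated property; for example, the star K1,4 is K3-saturated, since every missing edge closes a triangle through the center:

```python
from itertools import combinations

def has_ks(edges, n, s):
    """Brute-force check: does the graph on {0,...,n-1} contain K_s?"""
    return any(all(frozenset(p) in edges for p in combinations(clique, 2))
               for clique in combinations(range(n), s))

def is_ks_saturated(edges, n, s):
    """K_s-free, and adding any missing edge creates a copy of K_s."""
    if has_ks(edges, n, s):
        return False
    missing = (frozenset(p) for p in combinations(range(n), 2)
               if frozenset(p) not in edges)
    return all(has_ks(edges | {e}, n, s) for e in missing)

star = {frozenset((0, v)) for v in range(1, 5)}  # K_{1,4} on 5 vertices
print(is_ks_saturated(star, 5, 3))  # → True
```

The weak version can be tested similarly by searching for an order in which the missing edges can be restored.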

Probably the most natural question one might ask here is the saturation number of cliques. Zykov [55] and independently Erdős, Hajnal and Moon [24] showed that sat(n, Ks) = \binom{n}{2} − \binom{n−s+2}{2}. Somewhat surprisingly, the weak saturation number w-sat(n, Ks) turns out to be the same as the saturation number [35, 36]. In their paper, Erdős, Hajnal and Moon also suggested the following bipartite variant of the problem. We say that an n-by-n bipartite graph is F-saturated if the addition of any missing edge between its two parts creates a new copy of F. What is the minimum number of edges in a Ks,s-saturated bipartite graph?

This question was answered independently by Wessel [53] and Bollobás [10] in a more general, but ordered, setting: they showed that the minimum number of edges in a K(s,t)-saturated bipartite graph is n² − (n − s + 1)(n − t + 1), where K(s,t) is the "ordered" complete bipartite graph with s vertices in the first color class and t vertices in the second. However, the very natural question of determining the minimum number of edges in the usual, unordered, Ks,t-saturated case remained unsolved.

This problem was considered recently by Moshkovitz and Shapira, who also conjectured what its answer should be. In Chapter 2 we give an asymptotically tight bound on the minimum number of edges in a Ks,t-saturated bipartite graph, which is only smaller by an additive constant than the conjectured value of Moshkovitz and Shapira. We also prove their conjecture for K2,3-saturation, which was the first open case.

Saturation can also be defined more generally in an arbitrary host graph G, in which case we say that a graph is F-saturated in G if it is a maximal F-free subgraph of G. Using this terminology, Chapter 2 treats saturation in Kn,n, but many other host graphs (e.g., complete multipartite graphs and hypercubes) have also been considered in the past decades. The work presented in Chapter 3 initiates the study of saturation in random graphs. For constant probability p, we give asymptotically tight estimates on the saturation number of cliques in G(n, p), and determine the exact value of the weak saturation number in the same setting.

This research fits nicely with previous results. Indeed, the opposite of the saturation problem, the Turán problem in G(n, p), has attracted significant attention in recent years, and its complete solution is now known (see [48]). On the other hand, weak saturation can also be thought of as a kind of bootstrap percolation. Then our question of finding the smallest percolating subgraph of G(n, p) is closely related to a problem that Balogh, Bollobás and Morris [4] looked at: they were interested in the smallest p for which G(n, p) percolates in Kn.

A random 2-dimensional simplicial complex Y2(n, p), as introduced by Linial and Meshulam [41] in 2006, is a higher-dimensional analog of the Erdős-Rényi random graph. It is defined to have n vertices, all the \binom{n}{2} edges, and each triangular face independently with probability p. Combinatorial and topological properties of random complexes have been the subject of active research in the past decade, with tools coming from various fields of mathematics. One of the questions raised in the original paper of Linial and Meshulam was how the fundamental group of Y2(n, p) behaves; in particular, what is the threshold where the complex becomes simply connected.

This question was studied by Babson, Hoffman and Kahle [3], who provided lower and upper bounds for the threshold. In Chapter 4, we improve their upper bound by a √(log n) factor. We have some reason to believe that our estimate is asymptotically correct, although the lower bound in [3] is far from it. We prove our result by exhibiting a contractible subcomplex of Y2(n, p), for p above the threshold, that contains the complete 1-skeleton. This is enough to show that the complex is simply connected. We do so by analyzing the following graph process, which is yet another instance of weak triangle-saturation with random restrictions.

Let H = H(n, p) be an underlying 3-uniform random hypergraph on n vertices, where each triple appears independently with probability p. Our process starts with the star G0 on the same vertex set, containing all the edges incident to some fixed vertex v0. Then we repeatedly add an edge xy if there is a vertex z such that xz and zy are already in the graph and xzy ∈ H. We say that the process propagates if it reaches the complete graph before it terminates. In Chapter 4 we prove that the threshold probability for propagation is p = 1/(2√n).
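For small n, this process is easy to simulate directly. The following brute-force sketch (our own, not from the thesis; it rescans all pairs each round, so it is meant only for experimentation) samples H(n, p) and runs the process from the star at vertex 0:

```python
import random
from itertools import combinations

def triadic_propagates(n, p, seed=None):
    """Sample H(n, p) and run the triadic process: repeatedly add the
    edge xy if xz and zy are present and {x, y, z} is a triple of H.
    Returns True if the process reaches the complete graph."""
    rng = random.Random(seed)
    hyper = {t for t in combinations(range(n), 3) if rng.random() < p}
    edges = {frozenset((0, v)) for v in range(1, n)}  # the star G_0 at v_0 = 0
    changed = True
    while changed:
        changed = False
        for x, y in combinations(range(n), 2):
            if frozenset((x, y)) in edges:
                continue
            if any(frozenset((x, z)) in edges and frozenset((z, y)) in edges
                   and tuple(sorted((x, y, z))) in hyper
                   for z in range(n) if z not in (x, y)):
                edges.add(frozenset((x, y)))
                changed = True
    return len(edges) == n * (n - 1) // 2

print(triadic_propagates(8, 1.0))  # → True: every pair closes through v_0
```

With p = 0 the process never leaves the star; repeated runs near p = 1/(2√n) exhibit the transition.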

Problems about packing and covering the edge set of a graph using certain subgraphs form a wide subfield of extremal combinatorics, one that has been intensively studied since the 1960s. One of the oldest conjectures in this area, made by Erdős and Gallai [23], claims that the edges of every graph on n vertices can be decomposed into O(n) cycles and edges. Although an O(n log n) upper bound is not hard to show, the first significant step towards the solution was only made recently by Conlon, Fox and Sudakov [18]. They also showed that the conjecture holds for the random graph G(n, p).

In Chapter 5 we look at the minimum for random graphs more closely. Note that any odd-degree vertex needs to be the endpoint of at least one edge. As typically about half of the vertices in G(n, p) have odd degree, this shows that we need at least n/4 edges in our decomposition. On the other hand, G(n, p) contains about \binom{n}{2}p edges and any cycle covers at most n of them, so we also need at least np/2 cycles. Our result shows that this lower bound is asymptotically correct, as G(n, p) can indeed be decomposed into a union of n/4 + np/2 + o(n) cycles and edges.
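The two quantities this lower bound rests on are easy to check empirically. The following sketch (our own sanity check, not from the thesis) samples G(n, p) and reports the number of odd-degree vertices and the number of edges:

```python
import random
from itertools import combinations

def degree_stats(n, p, seed=None):
    """Sample G(n, p); return (number of odd-degree vertices, number of edges).
    Typically about n/2 vertices have odd degree, and there are
    about binom(n, 2) * p edges."""
    rng = random.Random(seed)
    deg = [0] * n
    edges = 0
    for u, v in combinations(range(n), 2):
        if rng.random() < p:
            deg[u] += 1
            deg[v] += 1
            edges += 1
    return sum(d % 2 for d in deg), edges

odd, m = degree_stats(1000, 0.5, seed=0)
print(odd, m)  # odd is near n/2 and m is near binom(n, 2) * p on average
```

Each odd-degree vertex forces one single edge in the decomposition, and pairing them up gives the n/4 term; the m cycles-versus-edges count gives the np/2 term.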


Chapter 2

Saturation in bipartite graphs

2.1 Introduction

For two graphs G and F, G is said to be F-saturated if it contains no copy of F as a subgraph, but the addition of any edge missing from G creates a copy of F in G. The saturation number sat(n, F) is defined as the minimum number of edges in an F-saturated graph on n vertices. Note that the maximum number of edges in an F-saturated graph is exactly the extremal number ex(n, F), so the saturation problem of finding sat(n, F) is in some sense the opposite of the Turán problem.

Probably the most natural setup of this problem is when we choose F to be a fixed complete graph Ks. This was first studied by Zykov [55] in the 1940s, and later by Erdős, Hajnal and Moon [24] in 1964. They proved that sat(n, Ks) = (s − 2)n − \binom{s−1}{2}. Here the upper bound comes from the Ks-saturated graph that has s − 2 vertices connected to all other vertices. Later, the closely related notion of weak saturation was introduced by Bollobás [11]. A graph G is weakly F-saturated if it is possible to add back the missing edges of G one by one in some order, so that each addition creates a new copy of F. Trivially, if G is F-saturated, then any order satisfies this property, hence G is also weakly F-saturated. Let w-sat(n, Ks) be the minimum number of edges in an n-vertex graph that is weakly Ks-saturated. We then have w-sat(n, Ks) ≤ sat(n, Ks). Somewhat surprisingly, one can prove using algebraic techniques (see e.g. [42]) that these two functions are actually equal. On the other hand, the extremal graphs for these problems are not the same, and already for s = 3 there are weakly K3-saturated graphs (i.e., trees) which are not K3-saturated.

The paper by Erdős, Hajnal and Moon also introduced the bipartite saturation problem, where we are looking for the minimum number of edges sat(Kn,n, F) in an F-free n-by-n bipartite graph, such that adding any missing edge between the two color classes creates a new copy of F. (Of course, this definition is only meaningful if F is also bipartite.) They conjectured that sat(Kn,n, Ks,s) = n² − (n − s + 1)². Once again, this is seen to be tight by selecting s − 1 vertices on each side of the bipartite graph and connecting them to every vertex on the opposite side.
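For small parameters, this construction and its saturation property can be verified mechanically. The following Python sketch (our illustration, not from the thesis) encodes left-right edges as pairs (a, b):

```python
from itertools import combinations

def construction(n, s):
    """s-1 vertices on each side joined to everything on the opposite side:
    n^2 - (n - s + 1)^2 edges in total."""
    return {(a, b) for a in range(n) for b in range(n)
            if a < s - 1 or b < s - 1}

def has_kss(edges, n, s):
    """Brute-force check for a K_{s,s} between the two sides."""
    return any(all((a, b) in edges for a in A for b in B)
               for A in combinations(range(n), s)
               for B in combinations(range(n), s))

def is_kss_saturated(edges, n, s):
    """K_{s,s}-free, and adding any missing left-right edge creates a K_{s,s}."""
    if has_kss(edges, n, s):
        return False
    missing = ((a, b) for a in range(n) for b in range(n) if (a, b) not in edges)
    return all(has_kss(edges | {e}, n, s) for e in missing)

G = construction(4, 2)
print(len(G) == 4**2 - (4 - 2 + 1)**2, is_kss_saturated(G, 4, 2))  # → True True
```

The same checker, run over all small bipartite graphs, can be used to probe the conjectured minimum for tiny n.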

In the bipartite setting, one can impose an additional restriction on the problem by ordering the two vertex classes of F and requesting that each missing edge create an F respecting the order: the first class of F lies in the first class of G. For example, let K(s,t) be the complete "ordered" s-by-t bipartite graph with s vertices in the first class and t vertices in the second; then a bipartite graph G is K(s,t)-saturated if each missing edge creates a Ks,t with the s-vertex class lying in the first class of G. Indeed, the conjecture of Erdős, Hajnal and Moon was independently confirmed by Wessel [53] and Bollobás [10] a few years later as the special case of the following result: sat(K(n,n), K(s,t)) = n² − (n − s + 1)(n − t + 1). This was further generalized in the 80s by Alon [1] to complete k-uniform hypergraphs in a k-partite setting using algebraic tools. Alon showed that the saturation and weak saturation bounds are the same in this case as well. For a more detailed discussion of F-saturation in general, we refer the reader to the survey [27] by Faudree, Faudree and Schmitt.

In this chapter we study the unordered case of bipartite saturation. Although this is arguably the most natural setting for the bipartite problem, it did not receive any attention until very recently in [5, 43]. Moshkovitz and Shapira [43] studied the unordered weak saturation number of Ks,t, s ≤ t, and showed that w-sat(Kn,n, Ks,t) = (2s − 2 + o(1))n. Note that, surprisingly, it is much smaller than the corresponding ordered saturation number and only depends on the size of the smaller part. One might think that a similar gap exists for saturation numbers as well. Moshkovitz and Shapira conjectured that this is not the case, and that ordered and unordered bipartite saturation numbers differ only by an additive constant. More precisely, they made the following conjecture and constructed an example showing that, if true, this bound is tight.


Conjecture 2.1.1. Let 1 ≤ s ≤ t be integers. Then there is an n0 such that if n ≥ n0 and G is a Ks,t-saturated n-by-n bipartite graph, then G contains at least

(s + t − 2)n − ⌊((s + t − 2)/2)²⌋

edges.

Our main result confirms the above conjecture up to a small additive constant.

Theorem 2.1.2. Let 1 ≤ s ≤ t be fixed and n ≥ t. Then

sat(Kn,n, Ks,t) ≥ (s + t − 2)n − (s + t − 2)².

The proof is presented in Section 2.2. In Section 2.3, we show that if the conjecture is true, it has many extremal examples. Finally, in Section 2.4, we prove Conjecture 2.1.1 in the first open case of K2,3-saturation.

2.2 Lower bounds on the saturation number

Let G be a bipartite graph with vertex classes U and U′, each of size n. Assume 1 ≤ s ≤ t ≤ n and suppose G is Ks,t-saturated, i.e., each missing edge between U and U′ creates a new K(s,t) or a new K(t,s) when added to G. Here K(a,b) refers to a complete bipartite graph with a vertices in U and b vertices in U′.

Let us start with the following, easy special case of Theorem 2.1.2.

Proposition 2.2.1. Suppose a Ks,t-saturated graph has minimum degree δ < t − 1. Then it contains at least n(t + s − 2) − (s + t − 2)² edges.

Proof. The Ks,t-saturated property ensures that each vertex has at least s − 1 neighbors, so we actually have s − 1 ≤ δ < t − 1. Let u0 be a vertex of degree δ; we may assume that u0 ∈ U. Then adding any missing edge u0u′ to G (where u′ ∈ U′ − N(u0)) should create a new K(t,s), because it cannot create a K(s,t). For such a u′, let Su′ ⊆ U be the t − 1 vertices other than u0 in the t-class of this K(t,s), and define V ⊆ U to be the union of these Su′. Then all vertices in V have at least s − 1 neighbors in N(u0), and all vertices in U′ − N(u0) have at least t − 1 neighbors in V. Now we can count


the number of edges in G as follows:

e(U, U′) = e(V, N(u0)) + e(V, U′ − N(u0)) + e(U − V, U′)
         ≥ (s − 1)|V| + (t − 1)(n − |N(u0)|) + δ(n − |V|)
         ≥ (s − 1)|V| + (t − 1)(n − t + 2) + (s − 1)(n − |V|)
         ≥ n(t + s − 2) − (t − 1)(t − 2)
         ≥ n(t + s − 2) − (s + t − 2)².

The case when δ ≥ t − 1 is considerably more complicated. We introduce the following structure to count the edges of G (see Figure 2.1). The core of this structure is a set 𝒜0 = A0 ∪ A′0, with A0 ⊆ U and A′0 ⊆ U′, satisfying the following technical property:

• there are vertices u0 ∈ A0 and u′0 ∈ A′0 such that their neighborhoods are also contained in the core.

Next, we build the shell around the core: starting with 𝒜 = 𝒜0, we iteratively add to 𝒜 any vertex v that has at least t − 1 neighbors in it. In other words, 𝒜 = A ∪ A′ is the smallest set containing 𝒜0 such that any vertex v ∈ G − 𝒜 has fewer than t − 1 neighbors in 𝒜. Here A0 ⊆ A ⊆ U and A′0 ⊆ A′ ⊆ U′. We use the variables x0 = |A0|, x′0 = |A′0|, x = |A| and x′ = |A′| to denote the sizes of the corresponding sets. Obviously x0 ≤ x and x′0 ≤ x′.

The following, rather scary, lemma is the key to our lower bounds on the saturation numbers. It shows that we can find about n(s + t − 2) edges in a Ks,t-saturated graph, provided we have a small enough core.

Lemma 2.2.2. Assuming δ ≥ t − 1, suppose the core spans e = e(A0, A′0) edges. Then G has at least

n(s + t − 2) − (x0 + x′0)(t − 1) − ⌊(s − 1)²/4⌋ + e + min{(t − s)x, (t − s)x′}

edges.

Proof. By the construction of A, we know that it spans at least e + (t − 1)(x + x′ − x0 − x′0) edges. Indeed, each vertex we added to the shell brings


Figure 2.1: the structure for counting the edges


at least t − 1 new edges. The idea is to count t − 1 edges from the remaining vertices on one side of the graph, say U′ − A′, and then to find s − 1 new (yet uncounted) edges from the other side, U − A. Of course, if a vertex in U − A has at least s − 1 neighbors in A′, then these edges are guaranteed to be new.

So let us continue with our definition of the structure. We know that any vertex in U − A has fewer than t − 1 neighbors in A′. We break this set into two parts by defining B to be the set of vertices in U − A having at least s − 1 neighbors in A′, and C to be those having fewer than s − 1 neighbors in A′. Similarly, we break U′ − A′ into two sets B′ and C′ based on the size of the neighborhood in A. We need to break B further into two parts B1 and B2, by defining B1 to be the set of vertices having at least t − s neighbors in B′. Similarly, let B′1 be the set of vertices in B′ having at least t − s neighbors in B (see Figure 2.1).

Note that any vertex in B′1 already has t − 1 neighbors in A ∪ B (at least s − 1 in A and at least t − s in B), but this is not necessarily true for B′2. This, together with our strategy to find s − 1 new edges from the vertices in C, motivates our last partitioning: we now break C into two parts C1 and C2, where C1 is the set of those vertices in C which have at least s − 1 neighbors outside B′2, and C2 = C − C1. We similarly define C′1 = {v ∈ C′ : |N(v) − B2| ≥ s − 1}, where N(v) is the neighborhood of v, and C′2 = C′ − C′1.

An observation here, which will prove to be crucial when counting the edges, is that C2 and C′2 span a complete bipartite graph. Indeed, suppose there is a missing edge vv′ in G, with v ∈ C2 and v′ ∈ C′2. Adding this edge creates a K(s,t) or a K(t,s); suppose it is a K(s,t). Then v′ is connected to all the s − 1 vertices other than v in the s-vertex class of this K(s,t). But v′ is in C′2, so it has at most s − 2 neighbors outside B2, consequently there is a vertex w ∈ B2 in the s-class. Similarly, using that v is in C2, we find at least t − s vertices of the t-class in B′2. But then w ∈ B2 has at least t − s neighbors in B′2 ⊆ B′, which contradicts the definition of B2. The same argument leads to a contradiction if the edge creates a K(t,s), hence we can conclude that there is no missing edge between C2 and C′2.

On another note, observe that adding the edge u0v′, where u0 is the vertex in A0 defined in the property of the core and v′ is any vertex in C′, cannot create a K(s,t). Indeed, if it created a K(s,t), then all the vertices of the t-class except v′ are neighbors of u0, so they are sitting in the core A′0. This means that each vertex in the s-class is connected to at least t − 1 vertices in the core, hence the whole s-class is in A. But then v′ has at least s − 1 neighbors in A, contradicting v′ ∈ C′. So we see that adding u0v′ creates a K(t,s). Then all the vertices of the s-class of this copy of K(t,s) except v′ are in A′0, therefore the vertices of the t-class have at least s − 1 neighbors in A′. Hence all of them are in A ∪ B, implying that every v′ ∈ C′ has at least t − 1 neighbors in A ∪ B. The same argument shows that each v ∈ C has at least t − 1 neighbors in A′ ∪ B′.

Lemma 2.2.2 will now follow from the following claim, possibly applied to the graph with the two vertex classes switched.

Claim 2.2.3. Assuming δ ≥ t − 1, suppose |C2| ≤ |C′2|. Then

e(U, U′) ≥ n(s + t − 2) − (x0 + x′0)(t − 1) − ⌊(s − 1)²/4⌋ + e + (t − s)x.

Proof. Let y = |C2| and y′ = |C′2|, and let us count the edges in G. We noted above that each vertex in B′1 has at least t − 1 neighbors in A ∪ B, so e(A ∪ B, B′1) ≥ (t − 1)|B′1|. By assumption, each vertex in B′2 has degree at least t − 1, hence e(A ∪ B ∪ C, B′2) ≥ (t − 1)|B′2|. We have also shown that each vertex in C′ has at least t − 1 neighbors in A ∪ B, so e(A ∪ B, C′) ≥ (t − 1)|C′|. This so far means that

e(A ∪ B, B′1) + e(A ∪ B ∪ C, B′2) + e(A ∪ B, C′) ≥ (t − 1)(n − x′).   (2.1)

Now look at what we have left from the other side: by definition, any vertex in B has at least s − 1 neighbors in A′, so e(B, A′) ≥ (s − 1)|B|. We also defined C1 so that its vertices have at least s − 1 neighbors outside B′2; this gives e(C1, A′ ∪ B′1 ∪ C′) ≥ (s − 1)|C1|. As we noted above, the vertices of C2 are all connected to the vertices of C′2, so e(C2, C′2) = yy′. Using the fact that y(s − 1 − y) ≤ ⌊(s − 1)²/4⌋ (y is an integer), we get that

e(B, A′) + e(C1, A′ ∪ B′1 ∪ C′) + e(C2, C′2) ≥ (s − 1)(n − x − y) + yy′   (2.2)
≥ (s − 1)(n − x) − (s − 1)y + y²
≥ (s − 1)(n − x) − ⌊(s − 1)²/4⌋.

We have also seen that e(A, A′) is at least e + (t − 1)(x + x′ − x0 − x′0).


It is easy to check that we never counted an edge more than once above, hence

e(U, U′) ≥ (t − 1)(n − x′) + (s − 1)(n − x) + (t − 1)(x + x′ − x0 − x′0) + e − ⌊(s − 1)²/4⌋
= n(t + s − 2) + (t − s)x − (x0 + x′0)(t − 1) + e − ⌊(s − 1)²/4⌋,

which is what we wanted to show.

We state the following immediate corollary of this claim, which we need in Section 2.4.

Corollary 2.2.4. If we have equality in Claim 2.2.3, then the following statements hold:

• any vertex in B′1 ∪ C′ has exactly t − 1 neighbors in A ∪ B,

• any vertex in B has exactly s − 1 neighbors in A′,

• the vertices in C1 have exactly s − 1 neighbors outside B′2, and

• y(s − 1) − yy′ = ⌊(s − 1)²/4⌋.

Now we are ready to prove our general theorem, which is tight up to an additive constant. Let us emphasize, however, that since our methods do not give the exact result, we will not make any effort to optimize the constant error term.

Theorem 2.2.5. If G = (U, U′, E) is a Ks,t-saturated bipartite graph with n vertices on each side, then it contains at least (s + t − 2)n − (s + t − 2)² edges.

Proof. Following Lemma 2.2.2, our plan is to find an appropriate core. By Proposition 2.2.1, we may assume that the minimum degree of our graph is at least t − 1. Suppose for contradiction that G contains fewer than (s + t − 2)n − (s + t − 2)² edges. Then there is a vertex u0 ∈ U of degree at most s + t − 3. Moreover, there is a non-adjacent vertex u′0 ∈ U′ − N(u0) of degree at most s + t − 3 as well, since otherwise the number of edges in G would be at least (n − (s + t − 3))(s + t − 2) > (s + t − 2)n − (s + t − 2)², contradicting our assumption. Set A0 = {u0} ∪ N(u′0) and A′0 = {u′0} ∪ N(u0), and define 𝒜0 = A0 ∪ A′0 to be the core.

Using the above notation, we see that x0 = |A0| = 1 + |N(u′0)| ≤ s + t − 2 and x′0 = |A′0| = 1 + |N(u0)| ≤ s + t − 2. Since u0 and u′0 are not adjacent, we can add the edge u0u′0 to create a new Ks,t. Notice that all the vertices of this Ks,t are adjacent to either u0 or u′0, hence they all lie in the core. Consequently, the core spans e = e(A0, A′0) ≥ st − 1 edges. Now applying Lemma 2.2.2 we get

e(U, U′) ≥ n(s + t − 2) − (x0 + x′0)(t − 1) + min{(t − s)x, (t − s)x′} − ⌊(s − 1)²/4⌋ + e
≥ n(s + t − 2) − (x0 + x′0)(t − 1) + min{(t − s)x0, (t − s)x′0} − ⌊(s − 1)²/4⌋ + st − 1
≥ n(s + t − 2) − (s + t − 2)² + st − 1 − ⌊(s − 1)²/4⌋
≥ n(s + t − 2) − (s + t − 2)².

This contradicts the assumption, thus proving the theorem.
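The last step above relies on the elementary inequality st − 1 ≥ ⌊(s − 1)²/4⌋, valid for all integers t ≥ s ≥ 1. A quick sanity check (an illustrative script, not part of the thesis):

```python
# Check st - 1 >= floor((s - 1)^2 / 4) for all integers 1 <= s <= t in a range.
# Since t >= s, we have st >= s^2 > (s - 1)^2 / 4 + 1, so the assertion never fires.
for s in range(1, 101):
    for t in range(s, 201):
        assert s * t - 1 >= ((s - 1) ** 2) // 4, (s, t)
```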

2.3 Extremal graphs

As we mentioned in the introduction, Moshkovitz and Shapira [43] constructed a Ks,t-saturated n-by-n bipartite graph showing that the bound of Conjecture 2.1.1, if true, is tight. It appears that this example is not unique. In this section we describe a general family of such graphs which contains the example of Moshkovitz and Shapira as a special case (when l = 1).

Example. As usual, we denote the two sides of the bipartite graph by U and U′, where |U| = |U′| = n. Let us break each class into two major parts: U = V ∪ W and U′ = V′ ∪ W′, where |V| = |V′| = ⌊(t + s − 2)/2⌋ (assume n is large enough). Suppose W and W′ are further broken into some parts W1, . . . , Wl and W′1, . . . , W′l, where |Wi| = |W′i| ≥ t − s for all i. The construction of an extremal graph G goes as follows.


First include in G all the edges between V and V′, making it a complete bipartite graph. Also, for every i, choose the edges between Wi and W′i to span an arbitrary (t − s)-regular graph. It remains to describe the edges going between different types of classes.

We do not include any edge between Wi and W′j for any i ≠ j. Instead, choose arbitrary sets S′ ⊆ V′ and S1, . . . , Sl ⊆ V of size s − 1, and take all edges going between Wi and S′ as well as the edges between Si and W′i, for all i. A straightforward computation shows that the number of edges in this G is exactly the number in the conjecture. We claim that G is Ks,t-saturated.

Let us see what happens when we add a missing edge uu′ to G. If u′ ∈ W′, i.e. u′ ∈ W′i for some i, then let N be the set of its t − s neighbors in Wi. Since u ∈ U − N − Si, the set Si ∪ {u} ∪ N ∪ S′ ∪ {u′} then forms a K(t,s). On the other hand, if u′ ∈ V′, then u ∈ Wi for some i. Let N′ be the set of the t − s neighbors of u in W′i; then u′ ∈ U′ − N′ − S′, and hence the set Si ∪ {u} ∪ S′ ∪ N′ ∪ {u′} forms a K(s,t). This proves the saturation property.
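For concreteness, the smallest interesting case of the construction can be checked by brute force. The sketch below is illustrative and not part of the thesis; it takes s = 2, t = 3 and l = 1 (the Moshkovitz–Shapira case), where |V| = |V′| = 1, S1 = V, S′ = V′, and the W–W′ graph is a perfect matching. It verifies the edge count 3n − 2, K2,3-freeness, and the saturation property.

```python
from itertools import combinations

def build(n):
    # Vertex 0 on each side plays the role of V resp. V'; 1..n-1 form W resp. W'.
    E = {(0, 0)}                             # complete bipartite graph between V and V'
    E |= {(i, i) for i in range(1, n)}       # (t - s) = 1-regular matching between W and W'
    E |= {(i, 0) for i in range(1, n)}       # all edges between W and S' = V'
    E |= {(0, j) for j in range(1, n)}       # all edges between S1 = V and W'
    return E

def has_Kab(E, n, a, b):
    # Does the bipartite graph E contain a complete bipartite subgraph
    # with a vertices on the left side and b on the right side?
    return any(all((u, w) in E for u in L for w in R)
               for L in combinations(range(n), a)
               for R in combinations(range(n), b))

n = 6
E = build(n)
assert len(E) == 3 * n - 2                                    # conjectured count for s=2, t=3
assert not has_Kab(E, n, 2, 3) and not has_Kab(E, n, 3, 2)    # K_{2,3}-free
for u in range(n):
    for w in range(n):
        if (u, w) not in E:                                   # every missing edge completes a copy
            assert has_Kab(E | {(u, w)}, n, 2, 3) or has_Kab(E | {(u, w)}, n, 3, 2)
```

Adding a missing edge wiw′j here connects wi to all three of v′, w′i, w′j, so {v, wi} × {v′, w′i, w′j} is the new K(2,3), matching the argument above.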

The asymmetric structure of the above example comes from the relaxation of the l = 1 case, which corresponds to the construction of Moshkovitz and Shapira. When all the vertices in W′ are connected to the same subset of V of size s − 1, adding an edge between W and W′ creates both a K(s,t) and a K(t,s). Our example exploits the freedom we had in choosing the edges between W′ and V. In our case, when l > 1, adding an edge between Wi and W′j with Si ≠ Sj creates only a K(t,s). The existence of such asymmetric examples provides further difficulties in proving an exact result.

2.4 The K2,3 case

For t = s, Conjecture 2.1.1 trivially follows from the ordered result of Bollobás [10]. The other extreme is also easy to handle. When s = 1, the K1,t-saturated property merely means that the vertices of degree less than t − 1 span a complete bipartite graph. Then it is a simple exercise to show (see [43]) that the conjecture holds in this case as well.

Thus the first open case is s = 2 and t = 3, where the conjecture asserts that any K2,3-saturated graph contains at least 3n − 2 edges. We note that there are many saturated graphs with 3n − 2 edges. In fact, there are many such examples which are even K(2,3)-saturated: just take a vertex v′ ∈ U′ that is connected to everything in U, and make sure that every other vertex in U′ has degree 2.

In this section we prove the matching lower bound. A brief summary of the coming theorem can be phrased as follows. By finding an appropriate core, our techniques from Section 2.2 easily give a 3n − 3 lower bound. The rest of the proof is then a series of small structural observations, ultimately ruling out the possibility that a K2,3-saturated graph with 3n − 3 edges exists.

Theorem 2.4.1. If G = (U, U′, E) is a K2,3-saturated bipartite graph with n ≥ 4 vertices in each part, then it has at least 3n − 2 edges.

Proof. As a first step, we show, in the spirit of Proposition 2.2.1, that it is enough to consider graphs of minimum degree 2.

Lemma 2.4.2. If G contains fewer than 3n − 2 edges, then it has minimum degree 2. Moreover, it contains two non-adjacent vertices u0 ∈ U and u′0 ∈ U′ of degree 2.

Proof. The saturation property ensures that each vertex has at least one neighbor. Suppose there is a vertex u of degree 1 – wlog u ∈ U – and let u′ ∈ U′ be its neighbor. Take any vertex v′ ∈ U′ other than u′; then adding the edge uv′ cannot create a K(2,3), so it must create a K(3,2), with the 2-vertex class being {u′, v′}. For any such v′, let Uv′ ⊆ U be the 3-class of this K(3,2), so Uv′ consists of u and two neighbors of v′. We count the two edges between v′ and Uv′ for each v′ ∈ U′, v′ ≠ u′, to get a total of 2n − 2 different edges.

Now let X = ∪Uv′; then every vertex in X is connected to u′ because each of the above K(3,2)'s contains u′. This gives |X| new edges. On the other hand, we still have not encountered any edges touching U − X. But since we know that each vertex has at least one neighbor, we surely have at least n − |X| new edges. This is already a total of 3n − 2 edges in G, contradicting our assumption.

Therefore the minimum degree is at least 2, but in fact it is exactly 2, as otherwise we would have at least 3n edges in the graph. Let u0 have degree 2 – we may assume u0 ∈ U. If every non-adjacent vertex in U′ has at least 3 neighbors, then we have 2·2 + 3(n − 2) = 3n − 2 edges incident to U′, again a contradiction. Hence there is a u′0 ∈ U′ of degree 2 that is not adjacent to u0, and we are done.

Suppose G is a counterexample to our theorem, and apply Lemma 2.4.2 to get two non-adjacent vertices u0 and u′0 of degree 2. Denote the neighbors of u0 by u′1, u′2 ∈ U′, the neighbors of u′0 by u1, u2 ∈ U, and let A0 = {u0, u1, u2} and A′0 = {u′0, u′1, u′2} be the core of the structure we described at the beginning of Section 2.2. Using this core we will also construct the sets A, B = B1 ∪ B2, C = C1 ∪ C2 and A′, B′ = B′1 ∪ B′2, C′ = C′1 ∪ C′2 as defined by the structure.

Assume that |C2| ≤ |C′2| and apply Claim 2.2.3 with s = 2 and t = 3 to the structure with core 𝒜0 = A0 ∪ A′0. These choices of s and t significantly simplify the bound we get from this claim:

e(U, U′) ≥ 3n − 6·2 − 0 + e + x = 3n − 12 + e + x.

Using that the addition of the edge u0u′0 creates a K2,3 inside the core (as all neighbors of u0 and u′0 are in 𝒜0), it is easy to check that e = e(A0, A′0) ≥ 6. We also know that x ≥ x0 = 3, so e(U, U′) ≥ 3n − 3. Then these inequalities together with Corollary 2.2.4 imply that if e(U, U′) = 3n − 3 then G satisfies the following five properties:

1) e = e(A0, A′0) = 6 and x = |A| = 3,

2) any vertex in B′1 ∪ C ′ has exactly 2 neighbors in A ∪B,

3) any vertex in B has exactly 1 neighbor in A′,

4) the vertices in C1 have exactly 1 neighbor outside B′2, and

5) for y = |C2| and y′ = |C′2| (with 0 ≤ y ≤ y′) we have y(s − 1) − yy′ = y(1 − y′) = 0, so either y = 0 or y = y′ = 1.

The following lemma supplements the fifth property and shows that C2 must be empty and C′2 must be non-empty, by taking care of the cases y = y′ = 0 and y = y′ = 1.


Lemma 2.4.3. If |C2| = |C′2|, then G spans at least 3n − 2 edges.

Proof. As G is a counterexample, by Lemma 2.4.2 it has minimum degree 2. Since y = y′, we may apply Claim 2.2.3 and Corollary 2.2.4 to G's "mirror", with U and U′ switched, and observe that the five properties hold for this mirror graph as well. Then the first property gives x = 3, x′ = 3 and e = 6. So A = {u0, u1, u2} and A′ = {u′0, u′1, u′2} (i.e. any vertex not in the core has at most one neighbor in it), and the core spans 6 edges. By symmetry we can assume that adding the edge u0u′0 creates a K(2,3) on the set {u0, u1, u′0, u′1, u′2}, so the missing edges of the core are u0u′0, u2u′1 and u2u′2. We also assumed that there is no vertex of degree 1, so u2 must have some neighbor v′ in B′. Note that v′ has exactly one neighbor in A, in particular it is not connected to u1.

Now let us see what happens when we add the edge u0v′. We cannot create a K(2,3), because that would use both u′1 and u′2, but their only common neighbor other than u0 is u1 (recall that no vertex outside A′ can have 2 neighbors in A), which is not connected to v′. So it must be a K(3,2), and it is not using u2, as u2 has no common neighbor with u0. But then the K(3,2) contains two neighbors of v′ that are not in A, but are connected to a vertex in A′. Then, by definition, these neighbors are in B. So v′ ∈ B′1 has at least two neighbors in B and one in A, and this contradicts the second property.

From now on we assume that C2 is empty and C′2 is non-empty. Then the fourth property also implies that the vertices in C = C1 have exactly one neighbor outside B′2. Moreover, the third property tells us that each vertex in B has exactly one neighbor in A′.

Lemma 2.4.4. All vertices in B are connected to the same vertex in A′.

Proof. Break B into parts based on the neighbor in A′ by putting the vertices in B connected to w′ ∈ A′ into the set Bw′. We claim that vertices in different parts do not share common neighbors, or in other words, any vertex v′ ∈ B′ ∪ C′ has all its neighbors in B contained in the same part Bw′.

Indeed, any vertex in B′ has at most one neighbor in B: this is true by definition for the vertices in B′2, and follows from the second property for B′1 (every vertex in B′1 has a neighbor in A). Now look at the vertices in C′. An easy observation in Lemma 2.2.2 shows that adding the edge u0v′ for v′ ∈ C′ cannot create a K(2,3). So it creates a K(3,2), and this K(3,2) must contain the two neighbors of v′ in B and a neighbor w′0 of u0 in A′. Hence both neighbors of v′ are in Bw′0, establishing the claim.

As we noted above, C′ is not empty, so take a vertex v′1 ∈ C′ and assume that the neighbors of v′1 in B are in Bw′1. We will show that B = Bw′1. Suppose not, i.e. there is a v1 ∈ Bw′2 with w′1 ≠ w′2. Then the edge v1v′1 is missing; let us see what happens when we add that edge. We create a K(2,3) or a K(3,2), so in any case there are vertices v2 ∈ U and v′2 ∈ U′ such that v1v′2, v2v′2 and v2v′1 are all edges of G. Here v2 cannot be in A, as it is connected to v′1 ∈ C′. It is not in B either, since then one of v′1 and v′2 would have neighbors in both Bw′1 and Bw′2. So v2 ∈ C = C1 (since C2 is empty). Now the fourth property says that v2 has exactly one neighbor outside B′2. Since v′1 ∈ C′, v′2 must be in B′2. But the vertices in B′2 have no neighbors in B, so v1v′2 cannot be an edge, giving a contradiction.

Note that this lemma implies that one of the two neighbors of u0 – say u′1 – is not connected to any vertex in B, and therefore it is adjacent to at least two vertices in A = A0. We also recall that the core only spans six edges. It is time to analyze what happens in the core when we add the edge u0u′0. It might create a K(3,2) or a K(2,3), but the obtained graph is inside the core in both cases.

Case 1: u0u′0 creates a K(3,2).

If this K(3,2) used u′2, then the core of G would contain more than 6 edges: 5 from the K(3,2) and 2 other edges incident to u′1, which is impossible. So u′1 is connected to both u1 and u2, while u′2 is connected to neither of them. Note, however, that u′2 is connected to all vertices in B.

Let v be any vertex in U − A. When we add the edge vu′0 to G, we create a K(2,3) or a K(3,2), so there is a vertex v′ connected to both v and u1 or u2. Then v′ is not in A′, since v is only connected to u′2 in A′, but both u1u′2 and u2u′2 are missing. Thus v′ ∈ B′ (it has a neighbor in A, so it is not in C′). When we add the edge u0v′, we cannot create a K(2,3), because that would use both u′1 and u′2, which only share u0 as their common neighbor. So it creates a K(3,2) using one of u′1 and u′2. It cannot be u′1, because then the 3-class of the K(3,2) would be exactly A, making v′ have 2 neighbors in A; thus, by definition, v′ ∈ A′, which contradicts v′ ∈ B′. But it cannot be u′2 either, because then v′ would have two neighbors in B, which, together with the neighbor in A that v′ must have, contradicts the second property. So this case is impossible.


Case 2: u0u′0 creates a K(2,3).

Then one of u1 and u2 – say u1 – is connected to both u′1 and u′2, and the other is connected to neither. But then u′1 has exactly two neighbors, u0 and u1, and the set {u0, u1, u′1, u′2} spans four edges. This means that we can apply Lemma 2.2.2 taking this set as the core, with u0 and u′1 being its "distinguished" vertices whose neighborhoods also sit in the core. One can check that all conditions are satisfied, and with the new values x ≥ x0 = 2, x′ ≥ x′0 = 2 and e = 4, we get that G has at least 3n − 8 − 0 + 4 + 2 = 3n − 2 edges. This contradiction finishes the proof of the theorem.

2.5 Further remarks

Although we could slightly improve the error term in Theorem 2.2.5, it seems that more ideas are needed to prove the full conjecture. We also note that our methods can be used to provide an asymptotically tight estimate on the minimum number of edges in a Ks,t-saturated unbalanced bipartite graph (i.e., with parts of size m and n). Determining the precise value in this unbalanced case might be even more challenging, although we believe that a straightforward modification of the extremal construction from the balanced case is tight here as well.

Bipartite saturation results were generalized to the hypergraph setting in [1, 43], where G and F are assumed to be k-partite k-uniform hypergraphs, and G is F-saturated if any new hyperedge meeting one vertex from each color class creates a new copy of F. It would be interesting to extend our results to get an asymptotically tight bound for the unordered k-partite hypergraph saturation problem.


Chapter 3

Saturation in random graphs

3.1 Introduction

For some fixed graph F, a graph H is said to be F-saturated if it is a maximal F-free graph, i.e., H does not contain any copy of F as a subgraph, but adding any missing edge to H creates one. The saturation number sat(n, F) is defined to be the minimum number of edges in an F-saturated graph on n vertices.

As we pointed out in the previous chapter, the saturation problem is in some sense the opposite of the Turán problem, and the first results on the topic were published by Zykov [55] in 1949 and independently by Erdős, Hajnal and Moon [24] in 1964. They considered the problem for cliques and showed that sat(n, Ks) = (s − 2)n − (s−1 choose 2), where the upper bound comes from the graph consisting of s − 2 vertices connected to all other vertices. Since the 1960s, the saturation number sat(n, F) has been extensively studied for various choices of F. For results in this direction, we refer the interested reader to the survey [27].
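The Erdős–Hajnal–Moon upper-bound construction is easy to verify by brute force for small parameters. The sketch below is illustrative and not part of the thesis; it uses the arbitrary values n = 8 and s = 4.

```python
from itertools import combinations
from math import comb

def saturated_witness(n, s):
    # s - 2 vertices joined to everything (including each other); no other edges.
    return {frozenset((u, v)) for u in range(s - 2) for v in range(n) if u != v}

def has_clique(E, n, k):
    # Does the graph with edge set E contain a clique on k vertices?
    return any(all(frozenset(p) in E for p in combinations(S, 2))
               for S in combinations(range(n), k))

n, s = 8, 4
E = saturated_witness(n, s)
assert len(E) == (s - 2) * n - comb(s - 1, 2)     # (s-2)n - C(s-1,2) edges
assert not has_clique(E, n, s)                    # the graph is K_s-free
for u, v in combinations(range(n), 2):            # adding any missing edge creates a K_s
    if frozenset((u, v)) not in E:
        assert has_clique(E | {frozenset((u, v))}, n, s)
```

Any two non-dominating vertices together with the s − 2 dominating ones form the new Ks, which is exactly why the construction is saturated.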

Here we are interested in a different direction of extending the original problem, where saturation is restricted to some host graph other than Kn. For fixed graphs F and G, we say that a subgraph H ⊆ G is F-saturated in G if H is a maximal F-free subgraph of G. The minimum number of edges in an F-saturated graph in G is denoted by sat(G, F). Note that with this new notation, sat(n, F) = sat(Kn, F).

Such a question with a complete bipartite host graph already appeared in the above-mentioned paper of Erdős, Hajnal and Moon, and Chapter 2 discussed some further developments on that problem. But over the years, several other host graphs have also been considered, including complete multipartite graphs [28, 47] and hypercubes [15, 34, 44].

In recent decades, classic extremal questions of all kinds have been extended to random settings. Hence it is only natural to ask what happens with the saturation problem in random graphs. As usual, we let G(n, p) denote the Erdős–Rényi random graph on vertex set [n] = {1, . . . , n}, where two vertices i, j ∈ [n] are connected by an edge with probability p, independently of the other pairs. In this chapter we study Ks-saturation in G(n, p).

The corresponding Turán problem of determining ex(G(n, p), Ks), the maximum number of edges in a Ks-free subgraph of G(n, p), has attracted a considerable amount of attention in recent years. The first general results in this direction were given by Kohayakawa, Rödl and Schacht [39], and independently Szabó and Vu [51], who proved a random analog of Turán's theorem for large enough p. This problem was resolved by Conlon and Gowers [19], and independently by Schacht [50], who determined the correct range of edge probabilities where the Turán-type theorem holds. The powerful method of hypergraph containers, developed by Balogh, Morris and Samotij [6] and by Saxton and Thomason [49], provides an alternative proof. Roughly speaking, these results establish that for most values of p, the random graph G(n, p) behaves much like the complete graph as a host graph, in the sense that Ks-free subgraphs of maximum size are essentially (s − 1)-partite.

Now let us turn our attention to the saturation problem. When s = 3, the minimum saturated graph in Kn is the star. Of course we cannot exactly adapt this structure to the random graph, because the degrees in G(n, p) are all close to np with high probability, but we can do something very similar. Pick a vertex v1 and include all its incident edges in H. This way, adding any edge of G(n, p) induced by the neighborhood of v1 creates a triangle, so we have immediately taken care of a p² fraction of the edges (which is the best we can hope to achieve with one vertex). Then the edges incident to some other vertex are expected to take care of a p² fraction of the remaining edges, and so on: we expect every new vertex to reduce the number of remaining edges by a factor of 1 − p².

Repeating this about log_{1/(1−p²)} (n choose 2) times, we obtain a K3-saturated bipartite subgraph containing approximately pn log_{1/(1−p²)} (n choose 2) edges. It feels natural to think that this construction is more or less optimal. Surprisingly, this intuition turns out to be incorrect. Indeed, we will present an asymptotically tight result which is significantly better. Moreover, rather unexpectedly, the asymptotics of the saturation numbers do not depend on s, the size of the clique.
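To see the size of the gap numerically, one can compare the cost of the greedy star construction with the bound of Theorem 3.1.1 below for an arbitrary illustrative choice of parameters, say p = 1/2 and n = 10⁶:

```python
from math import log

p, n = 0.5, 10**6
# Greedy star construction: roughly p*n * log_{1/(1-p^2)} C(n,2) edges.
greedy = p * n * log(n * (n - 1) / 2) / log(1 / (1 - p**2))
# True asymptotics (Theorem 3.1.1): n * log_{1/(1-p)} n edges.
optimal = n * log(n) / log(1 / (1 - p))
# For these parameters the greedy bound overshoots by more than a factor of 2.
assert greedy > 2 * optimal
```

So even though each star is individually optimal, the union of stars is off by a constant factor from the truth.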

Theorem 3.1.1. Let 0 < p < 1 be some constant probability and s ≥ 3 be an integer. Then

sat(G(n, p), Ks) = (1 + o(1)) n log_{1/(1−p)} n

with high probability.

Our next result is about the closely related notion of weak saturation (also known as graph bootstrap percolation), introduced by Bollobás [11] in 1968. Generalizing our definition from the previous chapter, we say that a graph H ⊆ G is weakly F-saturated in G if H does not contain any copies of F, but the missing edges of H in G can be added back one by one in some order, such that every edge creates a new copy of F. The smallest number of edges in a weakly F-saturated graph in G is denoted by w-sat(G, F).

Clearly w-sat(G, F) ≤ sat(G, F), but Bollobás conjectured that when both G and F are complete, then in fact equality holds. This somewhat surprising fact was proved by Lovász [42], Frankl [29] and Kalai [35, 36] using linear algebra:

Theorem 3.1.2 ([35, 36]). w-sat(Kn, Ks) = sat(Kn, Ks) = (s − 2)n − (s−1 choose 2).

Again, many variants of this problem with different host graphs have been studied [1, 43, 44], and it turns out to be quite interesting for random graphs as well. In this case we are able to determine the weak saturation number exactly. It is worth pointing out that this number is linear in n, as opposed to the saturation number, which is of the order n log n.

Theorem 3.1.3. Let 0 < p < 1 be some constant probability and s ≥ 3 be an integer. Then

w-sat(G(n, p), Ks) = (s − 2)n − (s−1 choose 2)

with high probability.


We will prove Theorem 3.1.1 in Section 3.2 and Theorem 3.1.3 in Section 3.3. We conclude the chapter with a discussion of open problems in Section 3.4.

Notation. All our results are about n tending to infinity, so we often tacitly assume that n is large enough. We say that some property holds with high probability, or whp, if the probability tends to 1 as n tends to infinity. In this chapter, log stands for the natural logarithm unless specified otherwise in the subscript. For clarity of presentation, we omit floor and ceiling signs whenever they are not essential. We use the standard notations G[S] for the subgraph of G induced by the vertex set S, and G[S, T] for the (bipartite) subgraph of G[S ∪ T] containing the S–T edges of G. For sets A, B and an element x, we will sometimes write A + x for A ∪ {x} and A − B for A \ B.

3.2 Strong saturation

In this section we prove Theorem 3.1.1 about Ks-saturation in random graphs.

Let us say that a graph H completes a vertex pair {u, v} if adding the edge uv to H creates a new copy of Ks. Using this terminology, a Ks-free subgraph H ⊆ G is Ks-saturated in G if and only if H completes all edges of G missing from H.

We will make use of the following bounds on the tail of the binomial distribution.

Claim 3.2.1. Let 0 < p < 1 be a constant and X ∼ Bin(n, p) be a binomial random variable. Then

1) P[X ≥ np + a] ≤ e^{−a²/(2(np + a/3))},

2) P[X ≤ np − a] ≤ e^{−a²/(2np)}, and

3) P[X ≤ n/log² n] ≤ (1 − p)^{n − n/log n}.

Proof. The first two statements are standard Chernoff-type bounds (see e.g. [33]). The third can be proved using a straightforward union-bound argument as follows.


Let us think about X as the cardinality of a random subset A ⊆ [n], where every element of [n] is included in A with probability p, independently of the others. X ≤ n/log² n means that A is a subset of some set I ⊆ [n] of size n/log² n. For a fixed I, the probability that A ⊆ I is (1 − p)^{n − |I|}. We can choose an I of size n/log² n in

(n choose n/log² n) ≤ (en/(n/log² n))^{n/log² n} ≤ (e log² n)^{n/log² n} ≤ e^{3n log log n/log² n}

different ways, so

P[X ≤ n/log² n] ≤ e^{3n log log n/log² n} · (1 − p)^{n − n/log² n} ≤ (1 − p)^{n − n/log n}.
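As a quick numerical illustration of the first bound (not from the thesis; the parameters n = 1000, p = 1/2, a = 100 are chosen arbitrarily), one can compare the exact binomial tail with the Chernoff-type estimate:

```python
from math import comb, exp

n, a = 1000, 100
# Exact P[X >= np + a] for X ~ Bin(n, 1/2), computed with big-integer arithmetic.
tail = sum(comb(n, k) for k in range(n // 2 + a, n + 1)) / 2**n
# Chernoff-type upper bound from Claim 3.2.1, part 1).
bound = exp(-a**2 / (2 * (n / 2 + a / 3)))
assert tail <= bound
```

As expected for a deviation of this size, the exact tail is far smaller than the bound, which is loose but sufficient for the union-bound arguments below.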

3.2.1 Lower bound

First, we prove that any Ks-saturated graph in G(n, p) contains at least (1 + o(1)) n log_{1/(1−p)} n edges. In fact, our proof does not use the property that Ks-saturated graphs are Ks-free, only that adding any missing edge from the host graph creates a new copy of Ks. Now if such an edge creates a new Ks then it will of course create a new K3 as well, so it is enough to show that our lower bound holds for triangle-saturation.

Theorem 3.2.2. Let 0 < p < 1 be a constant. Then with high probability, G = G(n, p) satisfies the following. If H is a subgraph of G such that for any edge e ∈ G missing from H, adding e to H creates a new triangle, then H contains at least n log_{1/(1−p)} n − 6n log_{1/(1−p)} log_{1/(1−p)} n edges.

Proof. Let H be such a subgraph and set α = 1/(1 − p). Let A be the set of vertices that are incident to at least log_α² n edges in H, and let B = [n] − A be the rest. If |A| ≥ 2n/log_α n, then H contains at least (1/2)|A| log_α² n ≥ n log_α n edges and we are done. So we may assume |A| ≤ 2n/log_α n and hence |B| ≥ n(1 − 2/log_α n). Our aim is to show that whp every vertex in B is adjacent to at least log_α n − 5 log_α log_α n vertices of A in H. This would imply that H contains at least |B|(log_α n − 5 log_α log_α n) ≥ n(log_α n − 6 log_α log_α n) edges, as needed.


So pick a vertex v ∈ B and let N be its neighborhood in A in the graph H. Let uv be an edge of the random graph G missing from H. Since H is K3-saturated, we know that u and v must have a common neighbor w in H. Notice that for all but at most log_α^4 n choices of u, this w must lie in A. Indeed, v is in B, so there are at most log_α^2 n choices for w to be a neighbor of v, and if this w is also in B, then there are only log_α^2 n options for u, as well. So the neighbors of N in H ⊆ G must contain all but log_α^4 n of the vertices (outside N) that are adjacent to v in G. The following claim shows that this is only possible if |N| ≥ log_α n − 5 log_α log_α n, thus finishing our proof.

Claim 3.2.3. Let 0 < p < 1 be a constant and G = G(n, p). Then whp, for any vertex x and set Q of size at most log_α n − 5 log_α log_α n, there are at least 2 log_α^4 n vertices in G adjacent to x but not adjacent to any of the vertices in Q.

Proof. Fix x and Q. Then the probability that some other vertex y is adjacent to x but not to any of Q is p′ = p(1 − p)^{|Q|} ≥ p log_α^5 n/n. These events are independent for the different y's, so the number of vertices satisfying this property is distributed as Bin(n − 1 − |Q|, p′). Its expectation, p′(n − 1 − |Q|), is at least (p/2) log_α^5 n, so the probability that there are fewer than 2 log_α^4 n such vertices is, by Claim 3.2.1, at most e^{−Ω(log^5 n)}. But there are only n ways to choose x and Σ_{i=1}^{log_α n} C(n, i) ≤ n · n^{log_α n} ways to choose Q, so the probability that for some x and Q the claim fails is

n^2 · n^{log_α n} · e^{−Ω(log^5 n)} ≤ exp(O(log^2 n) − Ω(log^5 n)) = o(1).

3.2.2 Upper bound

Next, we construct a saturated subgraph of the random graph that contains (1 + o(1))n log_{1/(1−p)} n edges. The following observation says that it is enough to find a graph that is saturated at almost all the edges.

Observation. It is enough to find a Ks-free graph G0 that completes all but at most o(n log n) missing edges. A maximal Ks-free supergraph G ⊇ G0 will then be Ks-saturated with an asymptotically equal number of edges.


Before we give a detailed proof of the upper bound, let us sketch the main ideas for the case s = 3. For simplicity, we will also assume p = 1/2. As we mentioned in the introduction, if we fix a set A1 of log_{4/3} C(n, 2) ≈ 2 log_{4/3} n vertices with B1 = [n] − A1, then G[A1, B1] is a K3-saturated subgraph with about n log_{4/3} n edges. But we can do better than that (see Figure 3.1).

So instead, we fix A1 to be a set of 2 log_2 n vertices and add all edges in G[A1, B1] to our construction. This way we complete most of the edges in B1 using about n log_2 n edges. Of course, we still have plenty of edges in G[B1] left incomplete; however, as we shall see, almost all of them are induced by a small set B2 ⊆ B1 of size o(n). But then we can complete all the edges in B2 using only o(n log n) extra edges: just take an additional set A2 of log_{4/3} C(n, 2) vertices, and add all edges in G[A2, B2] to our construction.

This way, however, we still need to take care of the Θ(n log n) incomplete edges between A2 and B3 = B1 − B2. As a side remark, let us point out that dropping the K3-freeness condition from K3-saturation would make our life easier here. Indeed, then we could have just chosen A2 to be a subset of B2.

The trick is to take yet another set A3 of o(log n) vertices, and add all the o(n log n) edges of G between A3 and A2 ∪ B3 to our construction. Now the o(log n) vertices in A3 are not enough to complete all the edges between A2 and B3, but if |A3| = ω(1), then they will complete most of them. This gives us a triangle-free construction with (1 + o(1))n log_2 n edges completing all but o(n log n) edges, and by the Observation above this is enough.

Figure 3.1: strong K3-saturation (the sets A1, A2, A3 and B2, B3 ⊆ B1).


Let us now collect the ingredients of the proof. The next lemma says that the graph comprising the edges incident to a fixed set of a vertices will complete all but approximately a (1 − p^2)^a fraction of the edges in any prescribed set E.

Lemma 3.2.4. Let s ≥ 3 be a fixed integer, and suppose a = a(n) grows to infinity as n → ∞. Let A ⊆ [n] be a set of size a and E be a collection of vertex pairs from B = [n] − A. Now consider the random graph GA defined on [n] as follows:

1) GA[B] is empty,

2) GA[A] is a fixed graph such that any induced subgraph on a/log^2 a vertices contains a copy of Ks−2,

3) the edges between A and B are in GA independently with probability p.

Then the expected number of pairs in E that GA does not complete is at most (1 − p^2)^{a − a/log a} · |E|.

Proof. Suppose a pair {u, v} ∈ E of vertices is incomplete, i.e., adding the edge uv to GA does not create a Ks. Because of the second condition, the probability of this event can be bounded from above by the probability that u and v have fewer than a/log^2 a common neighbors in A. As the size of the common neighborhood of u and v is distributed as Bin(a, p^2), Claim 3.2.1 implies that this probability is at most (1 − p^2)^{a − a/log a}. This bound holds for any pair in E, so the expected number of bad pairs is indeed no greater than (1 − p^2)^{a − a/log a} · |E|.

A Ks-saturated graph cannot contain any cliques of size s. So if we want to construct such a graph using Lemma 3.2.4, we need to make sure that GA itself is Ks-free. The easiest way to do this is by requiring that GA[A] contain no Ks−1. In our application, GA will be a subgraph of G(n, p), so in particular, we need GA[A] to be a subgraph of an Erdős–Rényi random graph that is Ks−1-free but induces no large Ks−2-free subgraph. Krivelevich [40] showed that whp we can find such a subgraph.


Theorem 3.2.5 (Krivelevich). Let s ≥ 3 be an integer, p ≥ cn^{−2/s} (for some small constant c depending on s) and G = G(n, p). Then whp G contains a Ks−1-free subgraph H that has no Ks−2-free induced subgraph on n^{(s−2)/s} polylog n ≤ n/log^3 n vertices.

We will also use the fact that random graphs typically have relatively small chromatic numbers (see e.g. [33]):

Claim 3.2.6. Let 0 < p < 1 be a constant and G = G(n, p). Then χ(G) = (1 + o(1)) n/(2 log_{1/(1−p)} n) whp.

We are now ready to prove our main theorem.

Theorem 3.2.7. Let 0 < p < 1 be a constant, s ≥ 3 be a fixed integer and G = G(n, p). Then whp G contains a Ks-saturated subgraph on (1 + o(1))n log_{1/(1−p)} n edges.

Proof. Define α = 1/(1−p) and β = 1/(1−p^2), and set a1 = (1/p)(1 + 3/log log_α n) log_α n, a2 = (1 + 2/log log_β n) log_β C(n, 2), and a3 = a2/√(log a2) = o(log n). Let us also define A1, A2, A3 to be some disjoint subsets of [n] such that |Ai| = ai, and let B1 = [n] − (A1 ∪ A2 ∪ A3).

Expose the edges of G[A1]. By Theorem 3.2.5, whp it contains a Ks−1-free subgraph G1 with no Ks−2-free subset of size a1/log^3 a1. Let H1 be the graph on vertex set A1 ∪ B1 such that H1[A1] = G1, H1[B1] is empty, and H1 is identical to G between A1 and B1. Now define B2 to be the set of vertices in B1 that are adjacent to fewer than (1 + 2/log log_α n) log_α n vertices of A1 in the graph H1. We claim that H1 completes all but O(n log log n) of the vertex pairs in B1 not induced by B2.

To see this, let F be the set of incomplete pairs in B1 not induced by B2. Now fix a vertex v ∈ B1 − B2 and denote its neighborhood in A1 by Nv. By definition, |Nv| ≥ (1 + 2/log log_α n) log_α n. Let dF(v) be the degree of v in F, i.e., the number of pairs {u, v} ∈ F that H1 does not complete. Conditioned on Nv, the size of the common neighborhood of v and some u ∈ B1 is distributed as Bin(|Nv|, p), so by Claim 3.2.1 the probability that this is smaller than a1/log^3 a1 ≤ |Nv|/log^2 |Nv| is at most (1 − p)^{|Nv| − |Nv|/log |Nv|} ≤ (1 − p)^{log_α n} = 1/n. On the other hand, if the common neighborhood contains at least a1/log^3 a1 vertices in A1, then it also induces a Ks−2, thus the pair is complete. Therefore E[dF(v)] ≤ 1 and E[Σ_{v∈B1−B2} dF(v)] ≤ n.


Here the quantity Σ_{v∈B1−B2} dF(v) bounds |F|, the number of incomplete edges in B1 that are not induced by B2, so by Markov's inequality this number is indeed O(n log log n) whp. By the Observation above, we can temporarily ignore these edges, and concentrate instead on completing those induced by B2. To complete them, we will use the (yet unexposed) edges touching A2.

The good thing about B2 is that it is quite small: for a fixed v ∈ B1, the size of its neighborhood in A1 is distributed as Bin(a1, p), so the probability that it is smaller than (1 + 2/log log_α n) log_α n = pa1 − log_α n/log log_α n is, by Claim 3.2.1, at most e^{−log_α n/4(log log_α n)^2} ≪ 1/log n. So by Markov, |B2|, the number of such vertices, is smaller than n/log n whp.

Now expose the edges of G[A2]. Once again, whp we can apply Theorem 3.2.5 to find a Ks−1-free subgraph G2 that has no large Ks−2-free induced subgraph. Let H2 be the graph on A2 ∪ B2 such that H2[A2] = G2, H2[B2] is empty, and H2 = G on the edges between A2 and B2. Let us apply Lemma 3.2.4 to GA2 = H2 with E containing all vertex pairs in B2. The lemma says that the expected number of incomplete pairs in E is at most

(1 − p^2)^{a2 − a2/log a2} · C(|B2|, 2) ≤ (1 − p^2)^{log_β C(n,2)} · C(n/log n, 2) = o(1),

so by Markov, H2 completes all of E whp; in particular it completes all the edges in G[B2]. Note that |B2| ≤ n/log n also implies that H2 only contains O(n) edges, so adding them to H1 does not affect the asymptotic number of edges in our construction.

However, the edges connecting A2 to B3 = B1 − B2 are still incomplete, and there are Θ(n log n) of them, too many to ignore using the Observation. The idea is to complete most of these using the edges between A3 and A2 ∪ B3, but we need to be a little bit careful not to create copies of Ks with the edges in H2[A2] = G2. We achieve this by splitting up A2 into independent sets.

By Claim 3.2.6, we know that k = χ(G[A2]) = O(a2/log a2), so G2 ⊆ G[A2] can also be k-colored whp. Let A2 = ∪_{i=1}^k A2,i be a partition into color classes, so here G2[A2,i] is an empty graph for every i. Let us also split A3 into 2k parts of size a4 = a3/2k and expose the edges in G[A3]. Each of the 2k parts contains a Ks−1-free subgraph with no Ks−2-free subset


of size a4/log^3 a4 with the same probability p0, and by Theorem 3.2.5, p0 = 1 − o(1) (note that a4 = Ω(√log a2) grows to infinity). This means that the expected number of parts not having such a subgraph is 2k(1 − p0) = o(k), so by Markov's inequality, whp k of the parts, A3,1, . . . , A3,k, do contain such subgraphs G3,1, . . . , G3,k.

Now define H3,i to be the graph with vertex set A3,i ∪ A2,i ∪ B3 such that H3,i[A3,i] = G3,i, H3,i[A2,i ∪ B3] is empty, and the edges between A3,i and A2,i ∪ B3 are the same as in G. Then we can apply Lemma 3.2.4 to show that H3,i is expected to complete all but a (1 − p^2)^{a4 − a4/log a4} fraction of the A2,i–B3 pairs. This means that the expected number of edges between A2 and B3 that H3 = ∪_{i=1}^k H3,i does not complete is bounded by

(1 − p^2)^{Ω(√log a2)} · |A2||B3| ≤ (1 − p^2)^{Ω(√log log n)} · O(n log n).

So if F′ is the set of incomplete A2–B3 edges, then another application of Markov gives |F′| = o(n log n) whp.

It is easy to check that H = H1 ∪ H2 ∪ H3 is Ks-free, since it can be obtained by repeatedly “gluing” together Ks-free graphs along independent sets, and such a process can never create an s-clique. To estimate the size of H, note that Claim 3.2.1 implies that whp all the vertices in A1 have (1 + o(1))p|B1| neighbors in B1, so H1 contains (1 + o(1))n log_α n edges. We have noted above that H2 contains O(n) edges, and H3 clearly contains at most |A2 ∪ B3||A3| = O(n log n/√log log n) edges. So in total, H contains (1 + o(1))n log_α n edges.

Finally, the only edges H is not saturated at are either in F, in F′, touching A3, or induced by A1 ∪ A2. There are o(n log n) edges of each of these four kinds, so any maximal Ks-free supergraph H′ of H has (1 + o(1))n log_α n edges. This H′ is Ks-saturated, hence the proof is complete.

3.3 Weak saturation

In this section we prove Theorem 3.1.3 about the weak saturation number of random graphs of constant density. In fact, we prove the statement for a slightly more general class of graphs, satisfying certain pseudorandomness conditions. We will need some definitions to formulate this result.


Given a graph G and a vertex set X, a clique extension of X is a vertex set Y disjoint from X such that G[Y] is a clique and G[X, Y] is a complete bipartite graph. We define the size of this extension to be |Y|.
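To make the definition concrete, here is a small sketch (illustrative only, not from the thesis; the function name is ours) that checks the clique-extension property for a graph given as adjacency sets:

```python
def is_clique_extension(adj, X, Y):
    """Return True if Y is a clique extension of X: Y is disjoint from X,
    G[Y] is a clique, and every X-Y pair is an edge (G[X, Y] complete)."""
    X, Y = set(X), set(Y)
    if X & Y:
        return False
    clique = all(u in adj[v] for u in Y for v in Y if u != v)
    bipartite_complete = all(u in adj[v] for u in X for v in Y)
    return clique and bipartite_complete

# toy example: K5 on {0,...,4}; {3, 4} is a clique extension of {0, 1} of size 2
adj = {v: {u for u in range(5) if u != v} for v in range(5)}
assert is_clique_extension(adj, {0, 1}, {3, 4})
assert not is_clique_extension(adj, {0, 1}, {1, 2})   # not disjoint
```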

The following definition of goodness captures the important properties needed in our proof.

Definition 3.3.1. A graph G is t-good with respect to γ if G satisfies the following properties:

P1. For any vertex set X of size x and any integer y such that x, y ≤ t, G contains at least γn disjoint clique extensions of X of size y.

P2. For any two disjoint sets S and T of size at least γn/2, there is a vertex v ∈ T and X ⊆ S of size t − 1 such that X ∪ {v} induces a clique in G.

It is not hard to see that Erdős–Rényi random graphs satisfy these properties:

Claim 3.3.2. Let 0 < p < 1 be a constant and let t be a fixed integer. Then there is a constant γ = γ(p, t) > 0 such that whp G = G(n, p) is t-good with respect to γ.

Proof. To prove property P1, fix x, y ≤ t and a set X of size x, and split V − X into groups of y elements (with some leftover): V1, . . . , Vm, where m = ⌊(n − x)/y⌋. The probability that for a given i ∈ [m], all the pairs induced by Vi or connecting X and Vi are edges in G is p′ = p^{C(y+x,2) − C(x,2)}, which is a constant. Let B_{X,y} be the event that fewer than mp′/2 of the Vi satisfy this. By Claim 3.2.1, P(B_{X,y}) ≤ e^{−mp′/8}.

Now set γ = p^{C(2t,2)}/4t; then γn ≤ mp′/2, so if B_{X,y} does not hold then there are at least γn different Vi's that we can choose to be the sets Y we are looking for. On the other hand, there are only Σ_{x=0}^t C(n, x) ≤ t·C(n, t) ≤ t·e^{t log n} choices for X and t choices for y, so

P(∪_{X,y} B_{X,y}) ≤ t^2 · e^{t log n} · e^{−γn/4} = o(1),

hence P1 is satisfied whp.


To prove property P2, notice that Theorem 3.2.5 implies that whp any induced subgraph of G on at least n/log^3 n vertices contains a clique of size t.¹ Let us assume this is the case and fix S and T. Then if a vertex v ∈ T has at least n/log^3 n neighbors in S, the neighborhood contains a (t−1)-clique in G that we can choose to be X. So if property P2 fails, then no such v can have n/log^3 n neighbors in S.

The neighborhood of a vertex v ∈ T in S is distributed as Bin(|S|, p), so the probability that v has fewer than n/log^3 n ≤ |S|/log^2 |S| neighbors in S is at most (1 − p)^{|S| − |S|/log |S|} ≤ e^{−p|S|/2} ≤ e^{−pγn/4} by Claim 3.2.1. These events are independent for the different vertices in T, so the probability that P2 fails for this particular choice of S and T is e^{−Ω(n^2)}. But we can only fix S and T in at most 2^{2n} different ways, so whp P2 holds for G.

Now we are ready to prove the following result, which immediately implies Theorem 3.1.3.

Theorem 3.3.3. Let G be 2s-good with respect to some constant γ. Then

w-sat(G, Ks) = (s − 2)n − C(s−1, 2).

Proof. To prove the lower bound on w-sat(G, Ks), it is enough to show that G is weakly saturated in Kn. Indeed, if H is weakly saturated in G and G is weakly saturated in Kn, then H is weakly saturated in Kn: we can just add edges one by one to H, obtaining G, and then keep adding edges until we reach Kn in such a way that every added edge creates a new copy of Ks. But then Theorem 3.1.2 implies that H contains at least (s − 2)n − C(s−1, 2) edges, which is what we want.

Actually, G is not only weakly, but strongly saturated in Kn: property P1 implies that for any vertex pair X = {u, v}, G contains a clique extension of X of size s − 2. But then adding the edge uv to this subgraph creates the copy of Ks that we were looking for.

Let us now look at the upper bound on w-sat(G, Ks). Fix a core C of s − 2 vertices that induces a complete graph (such a C exists by property P1, as it is merely a clique extension of ∅ of size s − 2). Our saturated graph H will consist of G[C], plus s − 2 edges for each vertex v ∈ V − C, giving

¹We should point out that this statement is much weaker than Theorem 3.2.5 and can be easily proved directly using a simple union bound argument.


a total of C(s−2, 2) + (s − 2)(n − s + 2) = (s − 2)n − C(s−1, 2) edges. We build H in steps.

Let V′ be the set of vertices v ∈ V − C adjacent to all of C. Note that |V′| ≥ γn by property P1. We start our construction with the graph H0 ⊆ G on vertex set V0 = C ∪ V′ that contains all the edges touching C.

Once we have defined Hi−1 on Vi−1, we pick an arbitrary vertex vi ∈ V − Vi−1, and choose a set Ci ⊆ Vi−1 of s − 2 vertices that induces a clique in G together with vi. Again, we can find such a Ci as a clique extension of {vi} ∪ C. In fact, by the definition of V′, this Ci will lie in V′. Then we set Vi = Vi−1 ∪ {vi} and define Hi to be the graph on Vi that is the union of Hi−1 and the s − 2 edges connecting vi to Ci. Repeating this, we eventually end up with some graph H = Hl on V = Vl that has C(n, 2) − C(n−s+2, 2) edges. We claim it is saturated.
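The two ways of counting the edges of H that appear above agree; a quick check (illustrative only, not from the thesis) of the identity (s − 2)n − C(s−1, 2) = C(n, 2) − C(n−s+2, 2):

```python
from math import comb

def wsat_bound(n, s):
    # (s - 2)n - C(s-1, 2), the weak saturation number from Theorem 3.3.3
    return (s - 2) * n - comb(s - 1, 2)

for n in range(10, 30):
    for s in range(3, 8):
        # same count, written as "all edges touching a fixed (s-2)-set"
        assert wsat_bound(n, s) == comb(n, 2) - comb(n - s + 2, 2)
```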

Figure 3.2: Weak K4-saturation. Black edges are in H, gray edges are in G but not in H.

We really prove a bit more: we show, by induction, that Hi is weakly Ks-saturated in G[Vi] for every i. This is clearly true for i = 0: any edge of G[V0] not contained in H0 is induced by V′, and forms an s-clique with C.

Now assume the statement holds for i − 1. We want to show that we can add all the remaining edges in Gi = G[Vi] to Hi one by one, each time creating a Ks. By induction, we can add all the edges in Gi−1, so we may assume they are already there. Then the only missing edges are the ones touching vi.

If {v} with v ∈ Vi is a clique extension of Ci ∪ {vi}, then adding vvi creates a Ks, so we can add this edge to the graph. Property P1 applied to C ∪ Ci ∪ {vi} to find


clique extensions of size 1 shows that there are at least γn such vertices. Let N be the new neighborhood of vi after these additions; then |N| ≥ γn.

Now an edge vvi can also be added if {v} ∪ C′ induces a complete subgraph in G, where C′ is any (s−2)-clique in Gi[N]. Let us repeatedly add the available edges (updating N with every addition). We claim that all the missing edges will be added eventually.

Suppose not, i.e., some of the edges in Gi touching vi cannot be added using this procedure. There cannot be more than γn/2 of them, as that would contradict property P2 with S = N and T being the remaining neighbors of vi in Vi. But then take one such edge, vvi, and apply property P1 to C ∪ {v, vi}. It shows that there are γn disjoint clique extensions of size s − 2, that is, γn disjoint (s−2)-sets in V′ that form s-cliques with v and vi.

But at most γn/2 of these cliques touch a missing edge other than vvi, so there is a C′ among them that would have been a good choice for v. This contradiction establishes our claim and finishes the proof of the theorem.

3.4 Further remarks

Many of the saturation results also generalize to hypergraphs. For example, Bollobás [9] and Alon [1] proved that sat(K^r_n, K^r_s) = w-sat(K^r_n, K^r_s) = C(n, r) − C(n−s+r, r), where K^r_t denotes the complete r-uniform hypergraph on t vertices. It would therefore be very interesting to see how saturation behaves in G^r(n, p), the random r-uniform hypergraph.

With some extra ideas, our proofs can be adapted to give tight results about K^r_{r+1}-saturated hypergraphs, but the general K^r_s-saturated case appears to be more difficult. It is possible, however, that our methods can be pushed further to solve these questions, and we plan to return to the problem at a later occasion.

Another interesting direction is to study the saturation problem for non-constant probability ranges. For example, Theorem 3.1.1 about strong saturation can be extended to the range 1/log^{ε(s)} n ≤ p ≤ 1 − 1/o(n) in a fairly straightforward manner. As for K3-saturation, a bit of extra work yields the correct asymptotics in the range n^{−1/2+ε} ≤ p ≤ 1, as well. Here we can show that sat(G(n, p), K3) = (1 + o(1)) (n/p) log(np^2). Note also that for p ≪ n^{−1/2} there are many fewer triangles in G(n, p) than edges, so any saturated graph


must contain almost all the edges, i.e., sat(G(n, p), K3) = (1 + o(1))C(n, 2)p.

These results give us a fairly good understanding of K3-saturation in random graphs. However, in general, Ks-saturation for sparse random graphs appears to be more difficult, and it seems that for p much smaller than 1/log n the saturation numbers will depend on s.

It is also not difficult to extend our weak saturation result, Theorem 3.1.3, to the range n^{−ε(s)} ≤ p ≤ 1. A next step could be to obtain results for smaller values of p; in particular, it would be nice to determine the exact probability range where the weak saturation number is (s − 2)n − C(s−1, 2).

Note that our lower bound holds as long as G(n, p) is weakly Ks-saturated in Kn. This problem was studied by Balogh, Bollobás and Morris [4], who showed that the probability threshold for G(n, p) to be weakly Ks-saturated is around p ≈ n^{−1/λ(s)} polylog n, where λ(s) = (C(s, 2) − 2)/(s − 2). It is possible that w-sat(G(n, p), Ks) changes its behavior at the same threshold.

F-saturation in the complete graph has been studied for various different graphs F (see [27] for references). For example, Kászonyi and Tuza [37] showed that for any fixed (connected) graph F on at least 3 vertices, sat(n, F) is linear in n. As we have seen, this is not true in G(n, p). However, analogous results in random host graphs could be of some interest.


Chapter 4

A random triadic process

4.1 Introduction

The principle of triadic closure is an important concept in social network theory (see e.g. [20]). Roughly speaking, it says that a new friendship in a social network is more likely to form between two people sharing a common friend, thus “closing” a triangle, than elsewhere. We will consider a simplistic model of the evolution of a social network, where friendships can only be formed through a common friend, and triadic closure eventually occurs at any triangle with probability p, independently of other triangles. We refer to this process as the triadic process.

Formally, let H = H(n, p) be a random 3-uniform hypergraph on [n] where each triple independently appears with probability p. The triadic process is the following graph process. We start with the star G0 on the same vertex set [n], containing all the edges incident to some vertex v0, and repeatedly add any edge xy if there is a vertex z such that xz and zy are already in the graph and xzy ∈ H. We say that the process propagates if all the edges are added to the graph eventually. It is easy to see that this event does not depend on the order the edges are added in. In this chapter we prove that the threshold probability for propagation is 1/(2√n).
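The process just defined is easy to simulate directly. The following sketch (illustrative only, not from the thesis; the function name and parameters are ours) samples H(n, p) and iterates the closure rule to a fixed point:

```python
import random
from itertools import combinations

def triadic_process(n, p, v0=0, seed=0):
    """Run the triadic process on [n] from the star at v0; return the edge set."""
    rng = random.Random(seed)
    H = {t for t in combinations(range(n), 3) if rng.random() < p}
    edges = {frozenset((v0, u)) for u in range(n) if u != v0}
    changed = True
    while changed:                      # iterate the closure rule to a fixed point
        changed = False
        for x, y in combinations(range(n), 2):
            if frozenset((x, y)) in edges:
                continue
            for z in range(n):
                if (z not in (x, y)
                        and frozenset((x, z)) in edges
                        and frozenset((z, y)) in edges
                        and tuple(sorted((x, y, z))) in H):
                    edges.add(frozenset((x, y)))
                    changed = True
                    break
    return edges

# p = 1: every triple is in H, so the process propagates to all of K_n
assert len(triadic_process(12, 1.0)) == 12 * 11 // 2
# p = 0: no triple ever closes, so we stay at the initial star
assert len(triadic_process(12, 0.0)) == 11
```

Since the final edge set is a closure, it indeed does not depend on the order in which the edges are added, matching the remark above.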

Theorem 4.1.1. Suppose p = c/√n for some constant c > 0. Then:

1) If c > 1/2, then the triadic process propagates whp.

2) If c < 1/2, then the triadic process stops at O(n√n) edges whp.


As usual, we say that some property holds with high probability, or whp, if it holds with probability tending to 1 as n tends to infinity.

Randomized graph processes have been intensively studied in the past decades. One notable example is the triangle-free process, originally motivated by the study of the Ramsey number R(3, n) (see e.g. [26]). In this process the edges are added one by one at random as long as they do not create a triangle in the graph. The triadic process is a slight variant of this, with a very similar nature. Indeed, our analysis makes good use of the tools developed by Bohman [7] when he applied the differential equation method to track the triangle-free process. Several other related processes were also analyzed using differential equations, e.g. [8]. For more information about this method we refer the interested reader to the excellent survey of Wormald [54].

Coja-Oghlan, Onsjo and Watanabe [17] investigated a similar kind of closure while analyzing connectivity properties of random hypergraphs. They say that a 3-hypergraph is propagation connected if its vertices can be ordered in some way v1, . . . , vn so that each vi (i ≥ 3) forms a hyperedge with two preceding vertices. They obtain the threshold probability for the propagation connectivity of H(n, p) up to a small multiplicative constant. Using this directed notion of connectivity, our problem asks when the random 3-hypergraph on the line graph of Kn is propagation connected from the star.

Our main motivation for considering the triadic process comes from the theory of random 2-dimensional simplicial complexes. A simplicial 2-complex on the vertex set V is a family Y of subsets of V of size at most 3 that is closed under taking subsets. The dimension of a simplex σ ∈ Y is defined to be |σ| − 1. We use the terms vertices, edges and faces for 0-, 1- and 2-dimensional simplices, respectively. The 1-skeleton of a 2-complex is the subcomplex containing its vertices and edges.

The Linial–Meshulam model of random simplicial complexes, introduced in [41], is a generalization of the Erdős–Rényi random graph model and has been studied extensively in recent years. The random 2-complex Y2(n, p) is defined to have the complete 1-skeleton, i.e., all vertices and edges, and each of the faces independently with probability p. The study of random complexes involves both topological invariants and combinatorial properties, including homology groups, homotopy groups, collapsibility, embeddability and spectral properties.


One of the oldest questions of this kind, asked by Linial and Meshulam [41], is how the fundamental group π1(Y2(n, p)) of the random 2-complex behaves. Babson, Hoffman and Kahle [3] showed that if p < n^{−α} for some arbitrary α > 1/2 then the fundamental group is nontrivial whp. On the other hand, they proved that π1(Y2(n, p)) is trivial for p > √(4 log n/n), which means that the threshold probability for being simply connected should be close to n^{−1/2}. As a corollary of the first part of Theorem 4.1.1, we improve the upper bound on the threshold probability.

Corollary 4.1.2. Let p = c/√n for some constant c > 1/2. Then Y2(n, p) is simply connected whp.

Proof. Suppose we have a 2-complex C such that one of its edges, e, is contained in a unique face f. Then we can collapse f onto the other two edges without changing the fundamental group of C. In fact, C − f − e is homotopy equivalent to C, the former complex being a deformation retract of the latter. We say that a 2-complex with complete 1-skeleton is a collapsible hypertree if we can apply a sequence of collapses to it and end up with a tree. Clearly, a collapsible hypertree has trivial fundamental group.

Now observe that Theorem 4.1.1 implies that Y2(n, p) contains a collapsible hypertree whp. Indeed, if the process propagates, then take C to be the subcomplex of the faces that correspond to the triples we used to add edges to the graph. Then by definition, the reverse of the triadic process on C is exactly a sequence of collapses resulting in a star.

Basic results about the topology of complexes tell us that the addition of faces to a simply connected complex does not change the fundamental group, hence π1(Y2(n, p)) is trivial whp.

After the publication of this result, we learned that Gundert and Wagner [32] independently improved the bound in [3], and showed that Y2(n, p) is whp simply connected for p = c/√n, where c is a sufficiently large constant.

4.1.1 Proof outline

Instead of exposing all the triples at once, we will be sampling them on the fly, trying to extend the edge set of the graph. The proofs of both the upper and the lower bound consist of two phases. In the first phase we make one step at a time: we choose, uniformly at random, one (yet unsampled) triple spanning exactly two edges and expose it. With


probability p the triple is selected, hence we can add the third edge to our edge set. The second phase proceeds in rounds: we simultaneously expose all the unsampled triples spanning two edges, and extend the edge set according to the outcome.

The essence of the proof is to track the behavior of certain variables throughout the process. As we will see, this is not a very hard task to do in the second phase, using standard measure concentration inequalities. However, during the initial phase of the process, the codegrees (one of the variables we track) are not concentrated, which forces us to do a more careful analysis of the beginning of the process. For this we will use the differential equation method.

We organize the rest of the chapter as follows. In Section 4.2 we give an overview of how we apply the differential equation method. A detailed analysis of the actual implementation follows in Section 4.3. We move on to the second phase of the process in Section 4.4, thereby completing the proof of Theorem 4.1.1. We finish with some remarks in Section 4.5.

Notation: Throughout the chapter, we will omit floor and ceiling signs whenever they are not necessary. The sign ± will be used to represent both a two-element set of values and a whole interval, but it should be clear from the context which one is the case.

4.2 The differential equation method

At any point in the process, we say that a vertex triple u, v, w is open if itspans exactly two edges but has not yet been sampled. We will also use thenotation uvw for an open triple with edges uv and vw. By an open tripleat u, we mean a triple uvw, i.e., one that has its missing edge adjacent tothe vertex u.

In each step, our process picks an open triple uniformly at random and samples it. If the answer is positive then we close the triple by adding the missing edge to the graph. To analyze this process we apply the differential equation method, using some ideas from [7].

For simplicity, let us denote the graph we obtain after i samples by $G_i$. We consider the following random variables: $D_v(i)$ is the degree of the vertex v in $G_i$. $F_v(i)$ is the number of open triples at v, so it is the number


of ways for v to gain a new incident edge in $G_{i+1}$. $X_{u,v}(i)$ is the codegree of u and v, i.e., the number of common neighbors of u and v in $G_i$.

To provide some insight, we first heuristically describe the process. Let us assume for now that the $D_v(i)$ are concentrated around some value $D(i)$, and similarly the $F_v(i)$ are approximately equal to some value $F(i)$. We further assume that the variables are very close to their expectations.

In step $i+1$ we choose an open triple uniformly at random, so each triple is chosen with probability $\frac{2}{\sum_v F_v(i)} \approx \frac{2}{nF(i)}$, and then sample it. With probability p the sample is successful, hence we can close the triple. As the number of open triples at a vertex v is about $F(i)$, the change in the degree of a vertex v we expect to see is
\[ D(i+1) - D(i) \approx \frac{2p}{n}. \]

Now let us see how $F_v(i)$ is affected by a step. We gain open triples at v either if we successfully sample one of them (adding the edge $vw$), in which case new open triples are formed with the neighbors of w, or if we successfully sample a triple at some neighbor of v. On the other hand, we lose the sampled triple regardless of the outcome. The probability of sampling an open triple at some specific vertex w is $\frac{2F_w(i)}{\sum_v F_v(i)} \approx \frac{2}{n}$, so assuming all the codegrees are negligible compared to $D(i)$, the expected change is
\[ F(i+1) - F(i) \approx \frac{2}{n}\bigl(2pD(i) - 1\bigr). \]

To smooth out this discrete process, we introduce a continuous variable t and say that step i corresponds to time $t = t_i = \frac{i}{n^2}$. Let us also rescale D and F by considering the smooth functions d and f in t, where we want $d(t)$ to be approximately $D(i)/\sqrt{n}$ and $f(t)$ to be approximately $F(i)/n$. Note that, since $p = c/\sqrt{n}$, our assumptions so far suggest the following behavior:
\[ d'(t) \approx \frac{d(t + 1/n^2) - d(t)}{1/n^2} \approx n^{3/2}\bigl(D(i+1) - D(i)\bigr) \approx 2c \]
and
\[ f'(t) \approx \frac{f(t + 1/n^2) - f(t)}{1/n^2} \approx n\bigl(F(i+1) - F(i)\bigr) \approx 4cd(t) - 2. \]


Let us emphasize that this little musing that we are presenting here is not a proof at all — a detailed analysis and the proof of concentration will follow in Section 4.3. However, it at least indicates why it is plausible to believe that the actual values of $D_v(i)$ and $F_v(i)$ follow the trajectories of d and f given by the system of differential equations $d'(t) = 2c$ and $f'(t) = 4cd(t) - 2$.
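To make the step-by-step sampling concrete, here is a toy simulation of the process just described (our own illustration, not part of the thesis; the function name `run_process` and all parameter choices are hypothetical). Starting from a star, as in the initial condition discussed below, it repeatedly picks a uniformly random open triple and closes it with probability p.

```python
import random

def run_process(n, p, steps, seed=0):
    """Toy version of the first phase: start from a star centered at
    vertex 0, repeatedly pick a uniformly random open triple (two edges
    present, third edge missing, not yet sampled) and, with probability
    p, close it by adding the missing edge."""
    rng = random.Random(seed)
    adj = {v: set() for v in range(n)}
    for v in range(1, n):                      # the initial star
        adj[0].add(v)
        adj[v].add(0)
    sampled = set()                            # triples already exposed
    for _ in range(steps):
        open_triples = []
        for v in range(n):                     # v = middle vertex uvw
            nb = sorted(adj[v])
            for i in range(len(nb)):
                for j in range(i + 1, len(nb)):
                    u, w = nb[i], nb[j]
                    if w not in adj[u] and (u, v, w) not in sampled:
                        open_triples.append((u, v, w))
        if not open_triples:                   # the process has died
            break
        u, v, w = rng.choice(open_triples)
        sampled.add((u, v, w))
        if rng.random() < p:                   # successful sample
            adj[u].add(w)
            adj[w].add(u)
    return adj
```

With $p = c/\sqrt{n}$ and large n, the average degree in such runs should track $2ct\sqrt{n}$, but no such claim is verified by this sketch; it only makes the sampling rule explicit.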

In the previous paragraphs we made the assumption that the codegrees are negligible compared to the degrees, but since they are not concentrated, proving this still needs some thought. To this end, we introduce two more random variables. $Y_{u,v}(i)$ denotes the number of open 3-walks $uww'v$ from u to v, i.e., 3-walks where we require that $uww'$ be open (but allowing $w = v$), and $Z_{u,v}(i)$ is the number of open 4-walks $uww'w''v$ (again, allowing vertex repetitions), where both $uww'$ and $w'w''v$ are open. Note that $Y_{u,v}$ is not symmetric in u and v.

The point is that $Y_{u,v}$ and $Z_{u,v}$ are concentrated (as we will see in Section 4.3), and — amazingly enough — their one-step behavior can be described with fairly simple formulas. So let us continue with our thought experiment and assume that all $Y_{u,v}(i) \approx Y(i)$, all $Z_{u,v}(i) \approx Z(i)$, and all variables are close to their expectations.

First of all, the increase in the codegrees comes from a successful sample in a 3-walk, so we expect
\[ X_{u,v}(i+1) - X_{u,v}(i) \approx \frac{2p\bigl(Y_{u,v}(i) + Y_{v,u}(i)\bigr)}{nF(i)} \approx \frac{4cy(t)}{n^2 f(t)}. \]

This will be enough to prove a uniform $O(\log n)$ upper bound over all the codegrees, so we can keep ignoring the effect of X in the next few paragraphs.

Let us look at the change in $Y_{u,v}(i)$. There are three different ways a new open 3-walk $uww'v$ can appear after step $i+1$, depending on which one of $uw$, $ww'$ and $w'v$ is the new edge. When $uw$ is the new edge, there is a 4-walk $utww'v$ in $G_i$ where $utw$ is open. We can count such configurations by first choosing $w'$ as a neighbor of v and then choosing an open 3-walk $utww'$. Note that for any such choice, $uww'$ will be an open triple in $G_{i+1}$, except if $w'$ is the same as u, or a common neighbor of u and w. The latter cases are negligible, so there are about $Y(i)D(i)$ possibilities in this case.

Similarly, when $ww'$ is the new edge, new 3-walks come from 4-walks $uwtw'v$, and we can count the number of options by first choosing w as a


neighbor of u and then an open 3-walk from w to v. Again, the triple $uww'$ will be open in $G_{i+1}$ if $w'$ is neither u, nor a common neighbor of u and v, so we find $Y(i)D(i)$ possibilities of this type. Finally, $w'v$ can only be the new edge if $w'tv$ was successfully sampled in some open 4-walk $uww'tv$, so there are about $Z(i)$ such options.

On the other hand, we lose an open 3-walk if we sample its open triple, whether or not the sample is successful. As any particular triple is chosen with probability about $\frac{2}{nF(i)}$, this means that we expect to see
\[ Y(i+1) - Y(i) \approx \frac{2}{nF(i)}\Bigl(p\bigl(2Y(i)D(i) + Z(i)\bigr) - Y(i)\Bigr). \]

The change in $Z(i)$ is a bit easier to analyze: once again, we obtain a new 4-walk $uww'w''v$ if one of its edges is added in step $i+1$. We will assume it is the first edge, $uw$, but by symmetry our counting argument works for all other edges as well. Then the 4-walk comes from a 5-walk $utww'w''v$ in $G_i$. We can count the number of options by first taking an open triple $vw''w'$ at v and then choosing an open 3-walk from u to $w'$. Again, the created 4-walk will automatically be open unless $w'$ is u or a neighbor of u, so there are about $Y(i)F(i)$ candidates of this type and $4Y(i)F(i)$ in total. And then of course, we lose an open 4-walk if we sample one of its two open triples, regardless of the outcome. This suggests
\[ Z(i+1) - Z(i) \approx \frac{2}{nF(i)}\bigl(4pY(i)F(i) - 2Z(i)\bigr). \]

Once again, we are looking for smooth functions y and z such that $y(t)$ is approximately $Y(i)/\sqrt{n}$ and $z(t)$ is about $Z(i)/n$. Then the same computation as before gives the differential equations
\[ y'(t) = \frac{2}{f(t)}\Bigl(\bigl(2cd(t) - 1\bigr)y(t) + cz(t)\Bigr) \quad\text{and}\quad z'(t) = \frac{4}{f(t)}\bigl(2cy(t)f(t) - z(t)\bigr). \]

We have yet to talk about the initial conditions of the above system of differential equations. Our process starts with a star centered at some vertex $v_0$, i.e., an n-vertex graph with $n-1$ edges, all of them touching $v_0$.


Then $D_v(0) = 1$, $F_v(0) = n-2$, $Y_{u,v}(0) = 0$ and $Z_{u,v}(0) = n-3$ for any two vertices u and v other than $v_0$. For convenience, we will drop the center of the star from consideration in the sense that we do not define the variables with $v_0$ among the indices. This is a technicality that allows us to prove concentration, and since our recurrence relations never use those variables, it causes no problem.

Hence we obtain the initial conditions $d(0) = 0$, $f(0) = 1$, $y(0) = 0$ and $z(0) = 1$, and an easy calculation shows that the corresponding solution of our system of differential equations is
\[ d(t) = 2ct, \qquad f(t) = 1 - 2t + 4c^2t^2, \qquad y(t) = d(t)f(t), \qquad z(t) = f^2(t). \]
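As a sanity check on this "easy calculation", the following sketch (our own verification aid, not from the thesis) compares finite-difference derivatives of these closed forms with the right-hand sides of the differential equations above, and confirms the initial conditions.

```python
def d(t, c): return 2 * c * t
def f(t, c): return 1 - 2 * t + 4 * c**2 * t**2
def y(t, c): return d(t, c) * f(t, c)
def z(t, c): return f(t, c) ** 2

def ode_residuals(t, c, h=1e-6):
    """Central finite differences minus the right-hand sides of the
    system d' = 2c, f' = 4cd - 2, y' = (2/f)((2cd - 1)y + cz),
    z' = (4/f)(2cyf - z); all four residuals should be numerically zero
    wherever f(t) > 0."""
    fd = lambda g: (g(t + h, c) - g(t - h, c)) / (2 * h)
    ft = f(t, c)
    return (
        fd(d) - 2 * c,
        fd(f) - (4 * c * d(t, c) - 2),
        fd(y) - (2 / ft) * ((2 * c * d(t, c) - 1) * y(t, c) + c * z(t, c)),
        fd(z) - (4 / ft) * (2 * c * y(t, c) * ft - z(t, c)),
    )
```

For $c = 1/2$ the root of f at $t = 1$ is also visible numerically: $f(1, 0.5) = 0$ exactly.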

In the next section we prove that the variables indeed closely follow the paths defined by these functions.

4.3 Calculations

In this section we show that our variables follow the prescribed trajectories up to some time T. Of course, we cannot hope to do so if $f(t)$ vanishes somewhere on $[0, T]$, as that would mean that the process is expected to die before time T. Now if $c > 1/2$ then f has no positive root, so this is not an issue: we can take $T = \sqrt{\log n}$. However, if $c \le 1/2$ then f does reach 0, first at time $T_0 = \frac{1 - \sqrt{1 - 4c^2}}{4c^2}$. In this case T will be chosen to be a constant arbitrarily close to $T_0$.

The allowed deviation of each variable will be defined by one of the error functions

\[ g_1(t) = e^{Kt} n^{-1/6} \quad\text{and}\quad g_2(t) = \bigl(1 + d(t)\bigr) e^{Kt} n^{-1/6}, \]
where
\[ K = 100 \cdot \max_{0 \le t \le T} \left( 1 + \frac{d(t)}{f(t)} + \frac{1}{f(t)} \right). \]

It is clearly enough to prove the first part of Theorem 4.1.1 for $c \le 1$, so from now on we will assume this is the case.

Let us define $\mathcal{G}_i$ to be the event that all of the bounds below in Proposition 4.3.1(a)–(e) hold for every pair of vertices u and v and for all indices $j = 0, \ldots, i$. This section is devoted to the proof of the following result, which is the key to proving that the variables follow the desired trajectories.


Proposition 4.3.1. Fix some vertices u and v. Then, conditioned on $\mathcal{G}_{j-1}$, each of the following bounds fails with probability at most $n^{-10}$.

(a) $D_v(j) \in \bigl(d(t_j) \pm g_1(t_j)\bigr)\sqrt{n}$

(b) $F_v(j) \in \bigl(f(t_j) \pm g_1(t_j)\bigr)n$

(c) $X_{u,v}(j) \le 50\log n$

(d) $Y_{u,v}(j) \in \bigl(y(t_j) \pm g_2(t_j)\bigr)\sqrt{n}$

(e) $Z_{u,v}(j) \in \bigl(z(t_j) \pm g_2(t_j)\bigr)n$

As a corollary, we obtain our main result.

Theorem 4.3.2. Suppose $c \le 1$, and $T \le \sqrt{\log n}$ and K are defined as above. Then the bounds in Proposition 4.3.1(a)–(e) hold with high probability for all vertices u and v and for every $j = 0, \ldots, T \cdot n^2$.

Proof. It is easy to check that $\mathcal{G}_0$ always holds. If $B_j$ is the event that, conditioned on $\mathcal{G}_{j-1}$, at least one of these bounds fails for j, then the failure probability is exactly $\mathbb{P}[\cup_{j=1}^{Tn^2} B_j]$. A trivial union bound over all pairs of vertices and all equations in Proposition 4.3.1 shows that $\mathbb{P}[B_j] \le 5n^{-8}$, hence another union bound over the indices gives $\mathbb{P}[\cup_{j=1}^{Tn^2} B_j] \le n^{-5} = o(1)$.

To prove Proposition 4.3.1, we follow the strategy in [7] and analyze each random variable separately. Our plan is to use some martingale concentration inequalities to bound the probability of large deviation. However, since we cannot track the exact values of the expectations, only estimate them by some intervals, we will use two separate sequences to bound each variable: a submartingale to bound from below, and a supermartingale to bound from above.

Recall that a stochastic process $X_0, X_1, \ldots$ is called a submartingale if $\mathbb{E}[X_{i+1} \mid X_1, \ldots, X_i] \ge X_i$ for all i, and a supermartingale if $\mathbb{E}[X_{i+1} \mid X_1, \ldots, X_i] \le X_i$ for all i. We say that a sequence $X_0, X_1, \ldots$ of variables is $(\eta, N)$-bounded if $X_i - \eta \le X_{i+1} \le X_i + N$ for all i. We call a sequence of pairs $X_0^\pm, X_1^\pm, \ldots$ an $(\eta, N)$-bounded martingale pair if $X_0^+, X_1^+, \ldots$ is an $(\eta, N)$-bounded submartingale and $X_0^-, X_1^-, \ldots$ is an $(\eta, N)$-bounded supermartingale. The following concentration results of Bohman [7] are essential for proving that the variables follow the desired trajectories:


Lemma 4.3.3 (Bohman). Suppose $\eta \le N/10$ and $a < \eta m$. If $0 \equiv X_0^\pm, X_1^\pm, \ldots$ is an $(\eta, N)$-bounded martingale pair then
\[ \mathbb{P}[X_m^+ \le -a] \le e^{-\frac{a^2}{3\eta m N}} \quad\text{and}\quad \mathbb{P}[X_m^- \ge a] \le e^{-\frac{a^2}{3\eta m N}}. \]

The general idea for analyzing a random variable $R(i)$, representing any of the above five variables, is the following. In step i, an open triple is sampled, and thus with probability p a new edge is added to our graph. We split the one-step change in $R(i)$ into two non-negative variables: $A_i$ is the gain and $C_i$ is the loss in step i, so $R(j) = R(0) + \sum_{i=1}^j (A_i - C_i)$. The gain comes from the contribution of the added edge after a successful sample. Loss can only occur when some open triple stops being open, either because it was sampled or because its missing edge was added through some other open triple (although the effect of the latter event is negligible compared to the former if the codegrees are small).

Next we estimate the expectation of $A_i$ (using the recurrence relations we hinted at in Section 4.2), so that we can define $A_i^+$ and $A_i^-$, shifted copies of $A_i$ with non-negative and non-positive expectations, respectively. This way $B_j^\pm = \sum_{i=1}^j A_i^\pm$ is an $(\eta, N)$-bounded martingale pair, where $\eta$ is approximately the expectation and N is some trivial upper bound on $A_i$. We do the same with the $C_i$ to define $C_i^\pm$ and the martingale pair $D_j^\pm = \sum_{i=1}^j C_i^\pm$.

Finally we establish a connection between the concentration of $R(j) = R(0) + \sum_{i=1}^j (A_i - C_i)$ and the concentration of our shifted variables $B_j^\pm$ and $D_j^\pm$ in Lemma 4.3.7, and then use the concentration of martingale pairs, Lemma 4.3.3, to bound the error probabilities in Corollary 4.3.8.

The rest of this section is devoted to the actual calculations. The reader might want to skip the details at a first reading. The first subsection establishes the tools we use to prove concentration, while the remaining five subsections prove one by one the five parts of Proposition 4.3.1.

4.3.1 Tools

The following claim will help us clean up the calculations of the expectations. Recall that $K = 100 \cdot \max_{0 \le t \le T} \left( 1 + \frac{d(t)}{f(t)} + \frac{1}{f(t)} \right)$.


Claim 4.3.4. Let $0 \le t \le T$ so that $f(t) > 0$ is bounded away from 0 (t might depend on n). If $r(t)$ is one of the functions 1, $d(t)$ or $f(t)$ then the following inequalities hold:
\[ \frac{\bigl(r(t) \pm g_1(t)\bigr)\bigl(f(t) \pm g_1(t)\bigr)\bigl(1 + O(\tfrac{\log n}{\sqrt{n}})\bigr)}{f(t) \pm g_1(t)} \subseteq r(t) \pm \frac{K}{20} g_1(t), \]
\[ \frac{\bigl(r(t) \pm g_1(t)\bigr)\bigl(y(t) \pm g_2(t)\bigr)\bigl(1 + O(\tfrac{\log^2 n}{\sqrt{n}})\bigr)}{f(t) \pm g_1(t)} \subseteq r(t)d(t) \pm \frac{K}{20} g_2(t), \]
\[ \frac{\bigl(z(t) \pm g_2(t)\bigr)\bigl(1 + O(\tfrac{\log n}{\sqrt{n}})\bigr)}{f(t) \pm g_1(t)} \subseteq f(t) \pm \frac{K}{20} g_2(t). \]

Proof. Straightforward calculus shows that
\[ \frac{1}{f(t) \pm g_1(t)} \subseteq \left( \frac{1}{f(t)} \pm \frac{g_1(t)}{f^2(t)} + O\!\left(\frac{g_1^2(t)}{f^3(t)}\right) \right). \]
Using this, we will multiply out the formulas on the left-hand side of the inequalities. Note that $g_1(t)$ and $g_2(t)$ are both $O(n^{-1/7})$, so in the expanded formulas, any term containing two factors of the type $g_\alpha(t)$ or a factor of $O(\frac{\operatorname{polylog} n}{\sqrt{n}})$ is consumed by an $O(n^{-2/7})$ error term. Hence the left-hand side of the first inequality is contained in
\[ \bigl(r(t) \pm g_1(t)\bigr)\bigl(f(t) \pm g_1(t)\bigr)\left(1 + O\!\left(\frac{\log n}{\sqrt{n}}\right)\right)\left(\frac{1}{f(t)} \pm \frac{g_1(t)}{f^2(t)} + O(n^{-2/7})\right) \subseteq r(t) \pm \left(\frac{2r(t)}{f(t)} + 1\right)g_1(t) + O(n^{-2/7}) \subseteq r(t) \pm \frac{K}{20} g_1(t). \]
Similarly, the left-hand side of the second inequality is contained in
\[ \bigl(r(t) \pm g_1(t)\bigr)\bigl(y(t) \pm g_2(t)\bigr)\left(1 + O\!\left(\frac{\log^2 n}{\sqrt{n}}\right)\right)\left(\frac{1}{f(t)} \pm \frac{g_1(t)}{f^2(t)} + O(n^{-2/7})\right) \subseteq r(t)d(t) \pm \left(\frac{r(t)}{f(t)} + \frac{y(t)}{f(t)(1 + d(t))} + \frac{r(t)y(t)}{f^2(t)(1 + d(t))}\right)g_2(t) + O(n^{-2/7}) \subseteq r(t)d(t) \pm \frac{K}{20} g_2(t) \]


using $y(t) = f(t)d(t)$ and $g_2(t) = (1 + d(t))g_1(t)$. Finally, the left-hand side of the last inequality is contained in
\[ \bigl(z(t) \pm g_2(t)\bigr)\left(1 + O\!\left(\frac{\log n}{\sqrt{n}}\right)\right)\left(\frac{1}{f(t)} \pm \frac{g_1(t)}{f^2(t)} + O(n^{-2/7})\right) \subseteq f(t) \pm \left(\frac{1}{f(t)} + \frac{z(t)}{f^2(t)(1 + d(t))}\right)g_2(t) + O(n^{-2/7}) \subseteq f(t) \pm \frac{K}{20} g_1(t) \]
using $z(t) = f^2(t)$.

The remaining lemmas connect the concentration of the original variables and those shifted by the expectations. We will use the following observations in the calculations.

Claim 4.3.5. Let $s(t)$ be a differentiable function on $[0, T]$ such that $\sup_{t \in [0,T]} |s'(t)| = O(\operatorname{polylog} n)$ and $t_i = \frac{i}{n^2}$. Then
\[ \frac{1}{n^2} \sum_{i=0}^{j-1} s(t_i) = \int_0^{t_j} s(\tau)\,d\tau + O(n^{-1}). \]

Proof. It is a well-known fact in numerical analysis that for reals $a \le q \le b$,
\[ \left| \int_a^b s(\tau)\,d\tau - (b - a)s(q) \right| \le (b - a)^2 \sup_{t \in [a,b]} |s'(t)|. \]
Taking $a = q = t_i$ with $b = t_{i+1}$ and using $t_{i+1} - t_i = \frac{1}{n^2}$, this gives
\[ \left| \int_{t_i}^{t_{i+1}} s(\tau)\,d\tau - \frac{s(t_i)}{n^2} \right| \le \frac{t_{i+1} - t_i}{n^2} \sup_{t \in [t_i, t_{i+1}]} |s'(t)|, \]
and summing these up for $i = 0, \ldots, j-1$, we get
\[ \left| \int_0^{t_j} s(\tau)\,d\tau - \frac{1}{n^2} \sum_{i=0}^{j-1} s(t_i) \right| \le \frac{t_j \cdot \sup_{t \in [0,t_j]} |s'(t)|}{n^2} = O\!\left(\frac{\operatorname{polylog} n}{n^2}\right) \le O(n^{-1}). \]
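The one-panel quadrature inequality used above is easy to test numerically. The sketch below is our own illustration (not from the thesis); the test function $s(\tau) = e^{2\tau}$ is an arbitrary choice, and the bound is the sum of the per-panel estimates for a left-endpoint Riemann sum.

```python
import math

def riemann_gap(a, b, m):
    """For s(tau) = e^{2 tau}: the error of the left Riemann sum on m
    equal panels of width h, and the bound m * h^2 * sup|s'| obtained by
    summing the one-panel inequality (here sup|s'| = 2 e^{2b} on [a, b])."""
    h = (b - a) / m
    riemann = h * sum(math.exp(2 * (a + i * h)) for i in range(m))
    exact = (math.exp(2 * b) - math.exp(2 * a)) / 2   # exact integral
    bound = m * h * h * 2 * math.exp(2 * b)
    return abs(exact - riemann), bound
```

With $m = n^2$ panels on an interval of length $t_j$, the summed bound is exactly the $O(\operatorname{polylog} n / n^2)$ term in the proof.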


This claim will be applied when s is one of the functions $d', f', x', y', z', g_1'$ and $g_2'$, in which case $s'$ is indeed bounded by $O(\operatorname{polylog} n)$ in the interval $[0, \sqrt{\log n}]$.

Claim 4.3.6. For $\alpha \in \{1, 2\}$ we have
\[ \int_0^t g_\alpha(\tau)\,d\tau \le \frac{1}{K}\bigl(g_\alpha(t) - n^{-1/6}\bigr). \]

Proof. Note that $g_\alpha(t) = \varphi(t)e^{Kt}n^{-1/6}$, where $\varphi(t)$ is either constant 1 or $1 + d(t)$. In both cases, $\varphi(0) = 1$ and $\varphi'(t) \ge 0$ for $t \ge 0$, so
\[ \frac{g_\alpha'(t)}{K} = \left( \frac{1}{K}\varphi(t)e^{Kt}n^{-1/6} \right)' = \bigl(\varphi'(t)e^{Kt}/K + \varphi(t)e^{Kt}\bigr)n^{-1/6} \ge g_\alpha(t). \]
Hence
\[ \int_0^t g_\alpha(\tau)\,d\tau \le \int_0^t \frac{g_\alpha'(\tau)}{K}\,d\tau = \frac{1}{K}\bigl(g_\alpha(t) - n^{-1/6}\bigr), \]
as required.
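Claim 4.3.6 can also be checked numerically. The sketch below is our own aid (not from the thesis); the sample values of c, K and n are arbitrary, and the integral of $g_2$ is approximated by the trapezoidal rule before being compared against $(g_2(t) - n^{-1/6})/K$.

```python
import math

def g2(t, c, K, n):
    """g2(t) = (1 + d(t)) e^{Kt} n^{-1/6} with d(t) = 2ct."""
    return (1 + 2 * c * t) * math.exp(K * t) * n ** (-1 / 6)

def g2_integral(t, c, K, n, m=20000):
    """Trapezoidal-rule approximation of the integral of g2 over [0, t]."""
    h = t / m
    vals = [g2(i * h, c, K, n) for i in range(m + 1)]
    return h * (sum(vals) - (vals[0] + vals[-1]) / 2)
```

For $g_2$ the inequality is strict (since $\varphi'(t) = 2c > 0$), so the comparison is robust to the small quadrature error.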

It is time to formally define the shifted variables. Recall that if $R(i)$ represents one of our random variables, then we use the non-negative variables $A_i$ and $C_i$ for the one-step increase and decrease in R, respectively, so that $R(i) - R(i-1) = A_i - C_i$. Our aim is to show that $R(i)$ is approximately $n^\gamma r(t_i)$ for some real $\gamma$, where the error (the allowed fluctuation of R) is bounded by $n^\gamma g_\alpha(t_i)$ for some $\alpha \in \{1, 2\}$. Here our choice of $\gamma$ and $\alpha$ depends on which variable R represents: $\gamma$ will be 1 for F and Z, 1/2 for D and Y, and 0 for X, while $\alpha$ will be 1 for D and F, and 2 for X, Y and Z.

To show the concentration of R, we approximate $A_i$ and $C_i$ by their expectations, which, as we shall prove, lie in the intervals $n^{\gamma-2}\bigl(r_A(t_{i-1}) \pm \frac{K}{2}g_\alpha(t_{i-1})\bigr)$ and $n^{\gamma-2}\bigl(r_C(t_{i-1}) \pm \frac{K}{2}g_\alpha(t_{i-1})\bigr)$, respectively, for some appropriately chosen functions $r_A(t)$ and $r_C(t)$. Thus we can define the shifted variables $A_i^+$ and $C_i^+$ having non-negative expectation, as well as $A_i^-$ and $C_i^-$ having non-positive expectation, as follows:
\[ A_i^\pm = A_i - n^{\gamma-2}\Bigl(r_A(t_{i-1}) \mp \frac{K}{2}g_\alpha(t_{i-1})\Bigr) \quad\text{with}\quad B_j^\pm = \sum_{i=1}^j A_i^\pm \quad\text{and} \]
\[ C_i^\pm = C_i - n^{\gamma-2}\Bigl(r_C(t_{i-1}) \mp \frac{K}{2}g_\alpha(t_{i-1})\Bigr) \quad\text{with}\quad D_j^\pm = \sum_{i=1}^j C_i^\pm. \]


Lemma 4.3.7. Suppose the variable R satisfies $R(j) = n^\gamma r(0) + \sum_{i=1}^j (A_i - C_i)$, where r is a polynomial in t such that $r'(t) = r_A(t) - r_C(t)$. Then
\[ R(j) \le n^\gamma\bigl(r(t_j) + g_\alpha(t_j)\bigr) - n^{\gamma-1/6}/2 + B_j^- - D_j^+ \quad\text{and} \]
\[ R(j) \ge n^\gamma\bigl(r(t_j) - g_\alpha(t_j)\bigr) + n^{\gamma-1/6}/2 + B_j^+ - D_j^-. \]

Proof. Let us first consider the upper bound:
\begin{align*}
R(j) &= n^\gamma r(0) + \sum_{i=1}^j (A_i - C_i) \\
&= n^\gamma r(0) + \sum_{i=1}^j \Bigl(A_i^- + n^{\gamma-2}\bigl(r_A(t_{i-1}) + \tfrac{K}{2}g_\alpha(t_{i-1})\bigr)\Bigr) - \sum_{i=1}^j \Bigl(C_i^+ + n^{\gamma-2}\bigl(r_C(t_{i-1}) - \tfrac{K}{2}g_\alpha(t_{i-1})\bigr)\Bigr) \\
&= B_j^- - D_j^+ + n^\gamma r(0) + n^{\gamma-2}\sum_{i=1}^j \bigl(r_A(t_{i-1}) - r_C(t_{i-1})\bigr) + n^{\gamma-2}\sum_{i=1}^j K g_\alpha(t_{i-1}).
\end{align*}
Now we apply Claim 4.3.5 with the functions $r_A(t) - r_C(t)$ (a polynomial) and $Kg_\alpha(t)$ (a product of a polynomial and an exponential function). As $T \le \sqrt{\log n}$, their derivatives are clearly bounded by $O(\operatorname{polylog} n)$ on $[0, T]$:
\begin{align*}
R(j) &\le B_j^- - D_j^+ + n^\gamma r(0) + n^\gamma \int_0^{t_j} \bigl(r_A(\tau) - r_C(\tau)\bigr)\,d\tau + n^\gamma \int_0^{t_j} K g_\alpha(\tau)\,d\tau + O(n^{\gamma-1}) \\
&\le n^\gamma\bigl(r(t_j) + g_\alpha(t_j)\bigr) - n^{\gamma-1/6}/2 + B_j^- - D_j^+
\end{align*}
using Claim 4.3.6 and $n^{\gamma-1/6} + O(n^{\gamma-1}) \ge n^{\gamma-1/6}/2$ in the last step.

The lower bound comes from an analogous argument by changing the appropriate signs.
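The first display in the proof is an exact rearrangement, so it can be verified mechanically. The sketch below is our own aid (not from the thesis); the function name `decompose` and the small input arrays are hypothetical, chosen only to exercise the identity.

```python
def decompose(A, C, rA, rC, g, K, n, gamma, R0):
    """Exact regrouping from the proof of Lemma 4.3.7: with
    A-_i = A_i - n^(gamma-2) * (rA_i + (K/2) g_i)   and
    C+_i = C_i - n^(gamma-2) * (rC_i - (K/2) g_i),
    the sum R0 + sum(A_i - C_i) equals
    B-_j - D+_j + R0 + n^(gamma-2) sum(rA_i - rC_i) + n^(gamma-2) sum(K g_i).
    Returns both sides so they can be compared."""
    s = n ** (gamma - 2)
    Am = [a - s * (ra + K / 2 * gi) for a, ra, gi in zip(A, rA, g)]
    Cp = [c - s * (rc - K / 2 * gi) for c, rc, gi in zip(C, rC, g)]
    lhs = R0 + sum(A) - sum(C)
    rhs = (sum(Am) - sum(Cp) + R0
           + s * sum(ra - rc for ra, rc in zip(rA, rC))
           + s * sum(K * gi for gi in g))
    return lhs, rhs
```

Both sides agree for any inputs, which is all the first display asserts; the analytic content of the lemma lies in the subsequent integral estimates.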

Using this, we can estimate the probability that $R(j)$ deviates from its expectation:


Corollary 4.3.8. Suppose the numbers $\gamma, \alpha$ and the functions $R, r, r_A, r_C$ satisfy the conditions of Lemma 4.3.7. Suppose furthermore that $B_j^\pm$ and $D_j^\pm$ are $(\eta_1, N_1)$-bounded and $(\eta_2, N_2)$-bounded martingale pairs, respectively, where $\eta_\beta N_\beta \le \varepsilon$ and $\eta_\beta < N_\beta/10$ for $\beta = 1, 2$. Then the probability that $R(j) \notin n^\gamma\bigl(r(t_j) \pm g_\alpha(t_j)\bigr)$ is at most $4e^{-\frac{n^{2\gamma-1/3}}{50\varepsilon j}}$.

Proof. Lemma 4.3.7 shows that $R(j) > n^\gamma(r(t_j) + g_\alpha(t_j))$ implies $n^{\gamma-1/6}/2 < B_j^- - D_j^+$, hence this event is contained in the union of the events $n^{\gamma-1/6}/4 < B_j^-$ and $-n^{\gamma-1/6}/4 > D_j^+$. A straightforward application of Lemma 4.3.3 with $a = n^{\gamma-1/6}/4$ and $m = j$ (so that $a^2/(3\eta_\beta j N_\beta) \ge n^{2\gamma-1/3}/(48\varepsilon j)$) then gives a bound of $e^{-\frac{n^{2\gamma-1/3}}{50\varepsilon j}}$ on the probability of each event, thus $R(j) > n^\gamma(r(t_j) + g_\alpha(t_j))$ occurs with probability at most $2e^{-\frac{n^{2\gamma-1/3}}{50\varepsilon j}}$. A similar argument using the other inequality of Lemma 4.3.7 gives the same bound on the probability of the event $R(j) < n^\gamma(r(t_j) - g_\alpha(t_j))$, finishing the proof.

4.3.2 Degrees

Recall that in this section, and also in the next four sections, we assume $\mathcal{G}_{j-1}$ holds, i.e., the values of $D_v, F_v, X_{u,v}, Y_{u,v}$ and $Z_{u,v}$ are all in the prescribed intervals during the first $j-1$ steps.

Proof of Proposition 4.3.1(a). Let $A_i$ be the indicator random variable of the event that an open triple at v was successfully sampled in step i. Then $D_v(j) = \sum_{i=1}^j A_i$. The probability that $A_{i+1} = 1$ is
\[ \frac{2pF_v(i)}{\sum_w F_w(i)} \in \frac{2c\bigl(f(t_i) \pm g_1(t_i)\bigr)}{n^{3/2}\bigl(f(t_i) \pm g_1(t_i)\bigr)} \subseteq \frac{1}{n^{3/2}}\Bigl(2c \pm \frac{K}{2}e^{Kt_i}n^{-1/6}\Bigr) \]
using Claim 4.3.4. Set
\[ A_i^\pm = A_i - \frac{1}{n^{3/2}}\Bigl(2c \mp \frac{K}{2}e^{Kt_{i-1}}n^{-1/6}\Bigr) \quad\text{and}\quad B_j^\pm = \sum_{i=1}^j A_i^\pm, \]
then $B_j^\pm$ is a $(\frac{3c}{n^{3/2}}, 1)$-bounded martingale pair. So if we define $C_i$ and $r_C(t)$ to be 0 for all i, then all the conditions of Corollary 4.3.8 are satisfied with the choice of $r_A(t) = 2c$, $r(t) = d(t)$, $\gamma = 1/2$, $\alpha = 1$ and $\varepsilon = 3n^{-3/2}$. Hence the probability that $R(j) = D_v(j)$ is not in $\sqrt{n}\bigl(d(t_j) \pm g_1(t_j)\bigr)$ is less than $4e^{-n^{1/6}/\log n} \le n^{-10}$, using $150j \le 150n^2\sqrt{\log n} \le n^2\log n$.


4.3.3 Open triples

Proof of Proposition 4.3.1(b). Here we break the one-step change in $F_v(i)$ into two parts: $A_i$ will be the gain in the open triples at v caused by the i'th sample and $C_i$ will be the loss, so that we can write $F_v(j) = n - 1 + \sum_{i=1}^j (A_i - C_i)$.

We may lose a particular open triple $uwv$ in two different ways: either if we sample it, or if we successfully sample another open triple with the same missing edge $vu$. There are at most $X_{u,v} \le 50\log n$ candidates for this other triple and a successful sample has probability $p = O(1/\sqrt{n})$, so the linearity of expectation gives
\[ \mathbb{E}[C_{i+1}] = \frac{2F_v(i)\bigl(1 + O(\tfrac{\log n}{\sqrt{n}})\bigr)}{\sum_w F_w(i)} \in \frac{2\bigl(f(t_i) \pm g_1(t_i)\bigr)\bigl(1 + O(\tfrac{\log n}{\sqrt{n}})\bigr)}{n\bigl(f(t_i) \pm g_1(t_i)\bigr)} \subseteq \frac{1}{n}\Bigl(2 \pm \frac{K}{2}e^{Kt_i}n^{-1/6}\Bigr) \]
using Claim 4.3.4. Set
\[ C_i^\pm = C_i - \frac{1}{n}\Bigl(2 \mp \frac{K}{2}e^{Kt_{i-1}}n^{-1/6}\Bigr) \quad\text{and}\quad D_j^\pm = \sum_{i=1}^j C_i^\pm, \]
then $D_0^\pm, D_1^\pm, \ldots$ is a $(\frac{3}{n}, 50\log n)$-bounded martingale pair, because one sample can only “break” the open triples with the same missing edge, and there are at most codegree-many of them.

On the other hand, as we have already mentioned in Section 4.2, there are two ways to obtain new open triples at v. The contribution of a new edge $vu$ touching v in step $i+1$ is $D_u(i) - X_{u,v}(i)$, because it creates an open triple at v with any edge of u except if the third edge is already there. Alternatively, a new edge incident to a neighbor u of v creates a new open triple unless it connects to another neighbor of v. There are at most $\sum_{u,u' \in D_v(i)} X_{u,u'}(i)$ open triples that could create an edge between two


neighbors of v, so
\[ \mathbb{E}[A_{i+1}] = \frac{2p}{\sum F_w(i)} \sum_{wu \in F_v(i)} \bigl(D_u(i) - X_{u,v}(i)\bigr) + \frac{2p}{\sum F_w(i)} \Bigl( \sum_{u \in D_v(i)} F_u(i) - \sum_{u,u' \in D_v(i)} O\bigl(X_{u,u'}(i)\bigr) \Bigr). \]
Note how we abuse our notation to also think of the quantities $D_v(i)$ and $F_v(i)$ as the sets they count. So $u \in D_v(i)$ should be understood as a neighbor of v, and $wu \in F_v(i)$ refers to an open triple $vwu$. Using Claim 4.3.4 we get
\[ \mathbb{E}[A_{i+1}] \subseteq \frac{4c\bigl(d(t_i) \pm g_1(t_i)\bigr)\bigl(f(t_i) \pm g_1(t_i)\bigr)\bigl(1 - O(\tfrac{\log n}{\sqrt{n}})\bigr)}{n\bigl(f(t_i) \pm g_1(t_i)\bigr)} \subseteq \frac{1}{n}\Bigl(4cd(t_i) \pm \frac{K}{2}e^{Kt_i}n^{-1/6}\Bigr). \]
This means that for
\[ A_i^\pm = A_i - \frac{1}{n}\Bigl(4cd(t_{i-1}) \mp \frac{K}{2}e^{Kt_{i-1}}n^{-1/6}\Bigr) \quad\text{and}\quad B_j^\pm = \sum_{i=1}^j A_i^\pm, \]
$B_0^\pm, B_1^\pm, \ldots$ is a martingale pair.

Next, we show that it is a $(\frac{\log n}{n}, \sqrt{n}\log n)$-bounded martingale pair. Indeed, adding an edge $vw$ in step $i+1$ can increase the number of open triples at v by at most $A_i \le D_w(i)$, whereas an edge $ww'$ not touching v can only increase it by one. The upper bound then comes from $D_w(i) = O(\sqrt{n\log n}) \le \sqrt{n}\log n$ and $A_i \ge A_i^\pm$. On the other hand, $A_i^\pm$ is smallest when $A_i = 0$. Observing that $4cd(t) \le 8c^2\sqrt{\log n}$, we see that the change is bounded from below by $-\log n/n$.

Therefore we can apply Corollary 4.3.8 with $r_A(t) = 4cd(t)$, $r_C(t) = 2$, $r(t) = f(t)$, $\gamma = 1$, $\alpha = 1$, and $\varepsilon = \log^2 n/\sqrt{n}$ to show that the probability that $R(j) = F_v(j) + 1$ (or $F_v(j)$) is not in the interval $n\bigl(f(t_j) \pm g_1(t_j)\bigr)$ is at most $4e^{-n^{1/6}/\log^3 n} \le n^{-10}$.

4.3.4 3-walks

Proof of Proposition 4.3.1(d). Once again, we break the one-step change in $Y_{u,v}(i)$ into two parts: $A_i$ will be the gain in the open 3-walks from u


to v caused by the i'th sample and $C_i$ will be the loss, so we can write $Y_{u,v}(j) = \sum_{i=1}^j (A_i - C_i)$.

We lose a particular 3-walk $uww'v$ either if we sample its open triple $uww'$, or if we add the missing edge $uw'$ by successfully sampling some other triple (as before, the latter event is unlikely since the codegrees are small). Then the linearity of expectation and Claim 4.3.4 give
\[ \mathbb{E}[C_{i+1}] = \frac{2Y_{u,v}(i)\bigl(1 + O(\tfrac{\log n}{\sqrt{n}})\bigr)}{\sum_w F_w(i)} \in \frac{2\bigl(y(t) \pm g_2(t)\bigr)\bigl(1 + O(\tfrac{\log n}{\sqrt{n}})\bigr)}{n^{3/2}\bigl(f(t) \pm g_1(t)\bigr)} \subseteq \frac{1}{n^{3/2}}\Bigl(2d(t) \pm \frac{K}{2}g_2(t)\Bigr). \]
So defining
\[ C_i^\pm = C_i - \frac{1}{n^{3/2}}\Bigl(2d(t_{i-1}) \mp \frac{K}{2}g_2(t_{i-1})\Bigr) \quad\text{and}\quad D_j^\pm = \sum_{i=1}^j C_i^\pm, \]
we get that $D_0^\pm, D_1^\pm, \ldots$ is a $(\frac{\log n}{n^{3/2}}, 50\log n)$-bounded martingale pair.

Now let us look at $A_{i+1}$, the number of ways a new open 3-walk $uww'v$ can be created in step $i+1$. We follow the analysis described in Section 4.2. If $uw$ is the new edge, then we need to count the 4-walks $utww'v$ in $G_i$ where $w'$ is not u or a neighbor of u, and $utw$ is open. Let N be the set of such candidates for $w'$; then $|N| = D_v(i) - O(\log n)$, and the expected contribution to $A_{i+1}$ of this type is
\[ \frac{2p\sum_{w' \in N} Y_{u,w'}(i)}{\sum_r F_r(i)} \in \frac{2c\bigl(d(t) \pm g_1(t) + O(\tfrac{\log n}{\sqrt{n}})\bigr)\bigl(y(t) \pm g_2(t)\bigr)}{n^{3/2}\bigl(f(t) \pm g_1(t)\bigr)} \subseteq \frac{1}{n^{3/2}}\Bigl(\frac{2cd(t)y(t)}{f(t)} \pm \frac{K}{6}g_2(t)\Bigr). \]
Strictly speaking, we are using the linearity of expectation over the indicator variables for each fixed 3-walk $uww'v$. The probability that this walk is created is the number of t's such that $utww'$ is an open 3-walk in $G_i$, divided by the number of open triples.

We similarly get that the expected contribution where $ww'$ is the new edge is
\[ \frac{2p\sum_{w' \in N} Y_{w',u}(i)}{\sum_r F_r(i)} \in \frac{1}{n^{3/2}}\Bigl(\frac{2cd(t)y(t)}{f(t)} \pm \frac{K}{6}g_2(t)\Bigr), \]


whereas new open 3-walks where $w'v$ is the new edge come from open 4-walks $uww'tv$ in $G_i$, so the expected contribution of this type is
\[ \frac{2pZ_{u,v}(i)}{\sum_r F_r(i)} \in \frac{2c\bigl(z(t) \pm g_2(t)\bigr)}{n^{3/2}\bigl(f(t) \pm g_1(t)\bigr)} \subseteq \frac{1}{n^{3/2}}\Bigl(\frac{2cz(t)}{f(t)} \pm \frac{K}{6}g_2(t)\Bigr). \]
Putting all of these together, we see that for
\[ A_i^\pm = A_i - \frac{1}{n^{3/2}}\Bigl(\frac{2c\bigl(2d(t_{i-1})y(t_{i-1}) + z(t_{i-1})\bigr)}{f(t_{i-1})} \mp \frac{K}{2}g_2(t_{i-1})\Bigr), \]
$B_j^\pm = \sum_{i=1}^j A_i^\pm$ is a martingale pair. In fact it is $(\frac{\log^2 n}{n^{3/2}}, 50\log n)$-bounded, since a new edge can contribute at most codegree-many new 3-walks.

Now we can apply Corollary 4.3.8 with $r_A(t) = 2c\bigl(2d(t)y(t) + z(t)\bigr)/f(t)$, $r_C(t) = 2y(t)/f(t)$, $r(t) = y(t)$ (recall the differential equation that y satisfies to see that $r' = r_A - r_C$), $\gamma = 1/2$, $\alpha = 2$ and $\varepsilon = \log^4 n/n^{3/2}$ to show that the probability that $R(j) = Y_{u,v}(j)$ is not in the interval $\sqrt{n}\bigl(y(t_j) \pm g_2(t_j)\bigr)$ is at most $4e^{-n^{1/6}/\log^5 n} \le n^{-10}$.

4.3.5 4-walks

Proof of Proposition 4.3.1(e). This time we define $A_i$ to be the number of new open 4-walks created in step i and $C_i$ to be the number of open 4-walks we lose in step i, so that $Z_{u,v}(j) = n - 2 + \sum_{i=1}^j (A_i - C_i)$.

Once again, we lose an open 4-walk $uww'w''v$ if one of its open triples $uww'$ or $w'w''v$ is sampled, or if one of their missing edges $uw'$ or $w'v$ is added through a successful sample of a different open triple. Hence we get, using Claim 4.3.4,
\[ \mathbb{E}[C_{i+1}] = \frac{4Z_{u,v}(i)\bigl(1 + O(\tfrac{\log n}{\sqrt{n}})\bigr)}{\sum_w F_w(i)} \in \frac{4\bigl(z(t) \pm g_2(t)\bigr)\bigl(1 + O(\tfrac{\log n}{\sqrt{n}})\bigr)}{n\bigl(f(t) \pm g_1(t)\bigr)} \subseteq \frac{1}{n}\Bigl(\frac{4z(t)}{f(t)} \pm \frac{K}{2}g_2(t)\Bigr). \]
So defining
\[ C_i^\pm = C_i - \frac{1}{n}\Bigl(\frac{4z(t_{i-1})}{f(t_{i-1})} \mp \frac{K}{2}g_2(t_{i-1})\Bigr) \quad\text{and}\quad D_j^\pm = \sum_{i=1}^j C_i^\pm, \]


we get that $D_0^\pm, D_1^\pm, \ldots$ is a $(\frac{\log^2 n}{n}, 2500\log^2 n)$-bounded martingale pair, because an added edge of the form $uw'$ or $w'v$ can ruin at most $X_{u,w'} \cdot X_{w',v} \le (50\log n)^2$ open 4-walks.

On the other hand, the analysis in Section 4.2 shows that a new open 4-walk $uww'w''v$ can be created in four different ways, based on which one of the four edges was added in step $i+1$. In the case when $uw$ is the new edge, we need to count the 5-walks $utww'w''v$ where $utw$ and $vw''w'$ are open, and $w'$ is not u or a neighbor of u. Let M be the set of such edges $w'w''$ for fixed u and v. Then
\[ |M| = F_v(i) - Y_{v,u}(i) - O\bigl(X_{v,u}(i)\bigr) = F_v(i) - O\bigl(\sqrt{n}\log^{3/2} n\bigr), \]
hence the expected contribution in this case is
\[ \frac{2p\sum_{w'w'' \in M} Y_{u,w'}(i)}{\sum_r F_r(i)} \in \frac{2c\bigl(f(t) \pm g_1(t) + O(\tfrac{\log^2 n}{\sqrt{n}})\bigr)\bigl(y(t) \pm g_2(t)\bigr)}{n\bigl(f(t) \pm g_1(t)\bigr)} \subseteq \frac{1}{n}\Bigl(\frac{2cf(t)y(t)}{f(t)} \pm \frac{K}{8}g_2(t)\Bigr). \]
But the remaining three cases are essentially the same; we only need to switch u and v or the two indices of the variables Y. This means that
\[ \mathbb{E}[A_{i+1}] \in \frac{1}{n}\Bigl(\frac{8cf(t)y(t)}{f(t)} \pm \frac{K}{2}g_2(t)\Bigr), \]
so we can define
\[ A_i^\pm = A_i - \frac{1}{n}\Bigl(\frac{8cf(t_{i-1})y(t_{i-1})}{f(t_{i-1})} \mp \frac{K}{2}g_2(t_{i-1})\Bigr) \quad\text{and}\quad B_j^\pm = \sum_{i=1}^j A_i^\pm, \]
where $B_0^\pm, B_1^\pm, \ldots$ is a $(\frac{\log^3 n}{n}, 3\sqrt{n}\log^2 n)$-bounded martingale pair. This is because a new edge of the form $uw$ can add at most $D_v(i)X_{w,w''}(i) = O(\sqrt{n}\log^{3/2} n) \le \sqrt{n}\log^2 n$ new 4-walks and the same bound works for an edge touching v, whereas a new edge not touching u and v creates at most $100\log n$ open 4-walks: at most codegree-many in each of the positions $ww'$ and $w'w''$.

Now we apply Corollary 4.3.8 with $r_A(t) = 8cf(t)y(t)/f(t)$, $r_C(t) = 4z(t)/f(t)$, $r(t) = z(t)$ (the differential equation for z implies $r' = r_A - r_C$), $\gamma = 1$, $\alpha = 2$ and $\varepsilon = \log^6 n/\sqrt{n}$ to show that the probability that $R(j) = 2 + Z_{u,v}(j)$ is not in the interval $n\bigl(z(t_j) \pm g_2(t_j)\bigr)$ is at most $4e^{-n^{1/6}/\log^7 n} \le n^{-10}$.


4.3.6 Codegrees

Proof of Proposition 4.3.1(c). Let $A_i = X_{u,v}(i) - X_{u,v}(i-1)$ be the increase in the codegree of u and v in a step, so that $X_{u,v}(j) = 1 + \sum_{i=1}^j A_i$. It is easy to see that $A_{i+1}$ is the indicator random variable of the event that the open triple of an open 3-walk from u to v or from v to u is successfully sampled in step $i+1$. The probability of this event is
\[ \frac{2p\bigl(Y_{u,v}(i) + Y_{v,u}(i)\bigr)}{\sum_w F_w(i)} \in \frac{4c\bigl(y(t) \pm g_2(t)\bigr)}{n^2\bigl(f(t) \pm g_1(t)\bigr)} \subseteq \frac{1}{n^2}\Bigl(\frac{4cy(t)}{f(t)} \pm \frac{K}{2}g_2(t)\Bigr). \]
So if we set $A_i^- = A_i - \frac{1}{n^2}\bigl(\frac{4cy(t_{i-1})}{f(t_{i-1})} + \frac{K}{2}g_2(t_{i-1})\bigr)$ then $B_j^- = \sum_{i=1}^j A_i^-$ is a supermartingale and it is $(\frac{10\sqrt{\log n}}{n^2}, 1)$-bounded. Now we can apply Lemma 4.3.7 with $\gamma = 0$, $\alpha = 2$, $r_A(t) = \frac{4cy(t)}{f(t)} = 4cd(t) = 8c^2t$, $r_C(t) = 0$ and $r(t) = 4c^2t^2 + 1$ to $R(j) = X_{u,v}(j)$. Then the first inequality gives
\[ X_{u,v}(j) \le 1 + 4c^2t_j^2 + g_2(t_j) + B_j^-. \]
Therefore (keeping in mind that $t_j \le \sqrt{\log n}$ and $c \le 1$) we see that if $X_{u,v}(j) > 50\log n$ then $B_j^- > 25\log n$. But by Lemma 4.3.3 this has probability at most
\[ e^{-\frac{25^2\log^2 n}{30\log n}} \le e^{-10\log n} = n^{-10} \]
for any $j \le n^2\sqrt{\log n}$, finishing our claim.

4.4 The second phase

In this section we analyze the second phase of the process and prove our main result, the lower and upper bounds on the threshold probability. Unlike in the first phase, where we made one step at a time, here we expose triples in rounds. In a round we simultaneously sample all the currently open triples, and then add the edges accordingly.

Let us adapt our notation to the second phase as follows. From now on $D_v(i)$, $i = 0, 1, \dots$ will denote the degree of the vertex $v$ after $i$ rounds in the second phase. For example, $D_v(0)$ is the degree of $v$ at the end of the first phase, i.e., $D_v(Tn^2)$ with the old notation. We similarly re-define the other variables $F_v$, $X_{u,v}$, $Y_{u,v}$ and $Z_{u,v}$, and let $G_i$ denote the graph after the $i$'th round.

As in the previous chapter, we will make use of the following Chernoff-type inequalities (see, e.g., [33]).

Claim 4.4.1. Let $X \sim \mathrm{Bin}(n, p)$ be a binomial random variable. Then

1) $\mathbb{P}[X > np + a] \le e^{-\frac{a^2}{2(np + a/3)}}$ and

2) $\mathbb{P}[X < np - a] \le e^{-\frac{a^2}{2np}}$.
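The two bounds in Claim 4.4.1 can be sanity-checked numerically. The sketch below (function names are ours, not from the text) compares the first bound against a Monte Carlo estimate of the binomial upper tail.

```python
import math
import random

def chernoff_upper(n, p, a):
    # Claim 4.4.1(1): P[X > np + a] <= exp(-a^2 / (2(np + a/3))).
    return math.exp(-a * a / (2 * (n * p + a / 3)))

def empirical_upper_tail(n, p, a, trials=5000, seed=1):
    # Monte Carlo estimate of P[Bin(n, p) > np + a].
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        x = sum(1 for _ in range(n) if rng.random() < p)
        if x > n * p + a:
            hits += 1
    return hits / trials

n, p, a = 400, 0.1, 25
# The observed tail should sit below the Chernoff bound (up to sampling noise).
assert empirical_upper_tail(n, p, a) <= chernoff_upper(n, p, a) + 0.01
```

This is only an illustration of the inequality's shape, not a proof; the deviation $a = 25$ is roughly four standard deviations here, so both the bound and the empirical tail are tiny.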

4.4.1 The lower bound

Suppose $c < \frac{1}{2}$ is some fixed constant. Before we start the second phase, we need to decide how many steps the first phase should take. Recall that $f(t)$ has a root at $T_0 = \frac{1-\sqrt{1-4c^2}}{4c^2}$ and that it is monotone decreasing in the interval $[0, T_0]$. It is easy to check that $d(T_0) < 1$, so fix a positive constant $\delta < 1 - d(T_0)$ and choose $\varepsilon > 0$ so that $\frac{c\varepsilon}{1-2c} < \delta$. We define the stopping time $T$ to be in the interval $[0, T_0]$ so that $f(T) = \varepsilon/2$. Hence if we apply Theorem 4.3.2 with this $T$, we get that after $Tn^2$ steps

• $D_v(0) \le (d(T) + g_1(T))\sqrt{n} \le (1-\delta)\sqrt{n}$ and

• $F_v(0) \le (\varepsilon/2 + g_1(T))n \le \varepsilon n$

for every vertex $v$. At this point, we move on to the second phase of the process.
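The claim that $d(T_0) < 1$ for every $c < 1/2$ can be verified numerically from the closed forms above, using $T_0 = (1-\sqrt{1-4c^2})/(4c^2)$ and $d(t) = 2ct$ (the latter as implied by $r_A(t) = 4c\,d(t) = 8c^2t$ in the codegree proof). This is a hedged check with function names of our choosing, not part of the argument:

```python
import math

def T0(c):
    # Root of f from the text: T0 = (1 - sqrt(1 - 4c^2)) / (4c^2).
    return (1 - math.sqrt(1 - 4 * c * c)) / (4 * c * c)

def d(t, c):
    # Degree scaling d(t) = 2ct, as implied by r_A(t) = 4c d(t) = 8c^2 t.
    return 2 * c * t

# For c = 0.4: T0 = 0.4/0.64 = 0.625 and d(T0) = 0.5 exactly.
assert abs(T0(0.4) - 0.625) < 1e-12
assert abs(d(T0(0.4), 0.4) - 0.5) < 1e-12

# d(T0) < 1 for every c < 1/2, so a positive delta < 1 - d(T0) exists.
for c in (0.05, 0.2, 0.4, 0.49):
    assert d(T0(c), c) < 1
```

Indeed, $d(T_0) = \frac{1-\sqrt{1-4c^2}}{2c} < 1$ is equivalent to $(1-2c)^2 < 1-4c^2$, i.e., to $c < 1/2$.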

The plan is to show that the second phase ends in $O(\log n)$ rounds, while all the degrees stay below $\sqrt{n}$. This would imply that the final graph has at most $n\sqrt{n}$ edges; in particular, it is not complete. The following statement bounds the degrees of the vertices in the first $O(\log n)$ rounds. Showing that in the meantime the second phase gets stuck will be an easy corollary.

Claim 4.4.2. Let $m = 4\log_{1/2c} n$. Then, with high probability, $D_v(i) < \sqrt{n}$ for every vertex $v$ and $0 \le i \le m$.

Proof. We will prove by induction that with high probability

• $D_v(i) \le \left(1 - \delta + (1 + 2c + \dots + (2c)^{i-1})c\varepsilon + i \cdot n^{-1/6}\right)\sqrt{n}$ and

• $F_v(i) \le \left((2c)^i\varepsilon + 2n^{-1/6}\right)n$

hold for every vertex $v$ and $1 \le i \le m$. Note that, by our choice of $\varepsilon$, the bound on the degrees is less than $\left(1 - \delta + \frac{c\varepsilon}{1-2c} + i \cdot n^{-1/6}\right)\sqrt{n} < \sqrt{n}$.

To proceed with the induction, we condition on the event that the bounds hold for $i$ and then estimate the probability that they fail for $i+1$ for some vertex $v$.

First we show that the degree of each vertex increases by at most $\left((2c)^i c\varepsilon + n^{-1/6}\right)\sqrt{n}$ in round $i+1$. Indeed, the number of new edges that touch the vertex $v$ is stochastically dominated by the binomial distribution $\mathrm{Bin}\left(F_v(i), \frac{c}{\sqrt{n}}\right)$. Hence, by the first Chernoff bound in Claim 4.4.1,
\[ \mathbb{P}\left[D_v(i+1) - D_v(i) > F_v(i)\frac{c}{\sqrt{n}} + (1-2c)n^{1/3}\right] < e^{-\Omega(n^{1/6})}, \]
so a union bound over all the vertices shows that the first bound fails in round $i+1$ with probability at most $e^{-\Omega(n^{1/7})}$.

The second inequality follows from the first one by an easy counting argument. Since we sample all the current open triples every round, the ones counted in $F_v(i+1)$ are all new triples, i.e., they contain at least one new edge added in round $i+1$. Now an open triple either has a new edge incident to $v$ or not. If it does, we can choose it in at most $\left((2c)^i c\varepsilon + n^{-1/6}\right)\sqrt{n}$ ways, and then extend each choice in at most $\sqrt{n}$ ways to get a triple (as all degrees are below $\sqrt{n}$). If not, then we first choose a neighbor of $v$ and then a new incident edge. Consequently, the total number of open triples at $v$ is
\[ F_v(i+1) \le 2 \cdot \left((2c)^i c\varepsilon + n^{-1/6}\right)\sqrt{n} \cdot \sqrt{n} = \left((2c)^{i+1}\varepsilon + 2n^{-1/6}\right)n. \]
Taking a union bound over all the $m$ rounds then completes the proof.

Corollary 4.4.3. Let $Q(i)$ be the total number of open triples after $i$ rounds. Then $Q(m) = 0$ with high probability.

Proof. If a triple is open after the $i$'th round, then it contains at least one new edge. Of course, the number of open triples containing some fixed new edge $uv$ is at most $D_v(i) + D_u(i) \le 2\sqrt{n}$, whp. On the other hand, the number of new edges cannot exceed the number of positive samples in the $i$'th round, distributed as $\mathrm{Bin}\left(Q(i-1), \frac{c}{\sqrt{n}}\right)$. Putting these together, this means that $Q(0), \dots, Q(m)$ is a sequence of random variables where $Q(i)$ is stochastically dominated by $2\sqrt{n} \cdot \mathrm{Bin}\left(Q(i-1), \frac{c}{\sqrt{n}}\right)$. In particular,
\[ \mathbb{E}[Q(i)] = \mathbb{E}[\mathbb{E}[Q(i)\,|\,Q(i-1)]] \le \mathbb{E}[2cQ(i-1)] = 2c\,\mathbb{E}[Q(i-1)]. \]
Using $Q(0) \le n^3$, a simple application of Markov's inequality gives
\[ \mathbb{P}[Q(m) > 0] \le \mathbb{E}[Q(m)] \le (2c)^m Q(0) \le n^{-4} \cdot n^3 = o(1). \]
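With $m = 4\log_{1/2c} n$, the key numerical fact is $(2c)^m = n^{-4}$, so the Markov bound collapses to $1/n$. A small confirmation (function names are ours):

```python
import math

def rounds(n, c):
    # m = 4 * log base 1/(2c) of n, as in Claim 4.4.2.
    return 4 * math.log(n) / math.log(1 / (2 * c))

def open_triple_bound(n, c):
    # Markov bound E[Q(m)] <= (2c)^m * Q(0) with Q(0) <= n^3.
    return (2 * c) ** rounds(n, c) * n ** 3

# (2c)^m = n^{-4}, so the bound equals 1/n up to floating-point rounding.
for n in (10**3, 10**5):
    for c in (0.1, 0.3, 0.49):
        assert abs(open_triple_bound(n, c) * n - 1) < 1e-6
```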

Proof of Theorem 4.1.1, part 2. Corollary 4.4.3 shows that whp the process runs out of open triples after at most $m$ rounds in the second phase. According to Claim 4.4.2, at this final stage all vertices have degree at most $\sqrt{n}$, i.e., the graph has at most $\frac{n^{3/2}}{2}$ edges whp.

4.4.2 The upper bound

Suppose $\frac{1}{2} < c \le 1$ is fixed. Then we can run the first phase all the way, for $n^2\sqrt{\log n}$ steps. Indeed, as the function $f(t)$ has a global minimum of $f\left(\frac{1}{4c^2}\right) = 1 - \frac{1}{4c^2} > 0$, we can apply Theorem 4.3.2 with stopping time $T = \sqrt{\log n}$.

Our plan is to give rapidly increasing lower bounds on the degrees and codegrees as the graph evolves, thus showing that we reach the complete graph in $O(\log\log n)$ rounds. Let us analyze the first round separately.

The initial parameters of the second phase are, as implied by Theorem 4.3.2,

• $X_{u,v}(0) \le 50\log n$,

• $Z_{u,v}(0) = 16c^4n\log^2 n + O(n\log^{3/2} n) \ge 2c^4n\log^2 n$

for any vertices $u$ and $v$.

Lemma 4.4.4. There is some constant $\gamma > 0$ such that, with high probability, the codegree $X_{u,v}(1) \ge \gamma\log^2 n$ for every pair of vertices $u$, $v$.

Proof. Fix $u$ and $v$. We expect most of their new common neighbors to be vertices $w$ with open triples to both $u$ and $v$. So if $X_{p,r}$ denotes the number of open triples $pqr$, then we want many vertices $w$ such that both $X_{u,w}$ and $X_{w,v}$ are relatively large.

Claim 4.4.5. For every pair of vertices $u$, $v$, there are at least $a \cdot n$ vertices $w$ such that $X_{u,w}(0), X_{v,w}(0) \ge b\log n$, where $a = \frac{c^4}{2500}$ and $b = \frac{c^4}{50}$ are positive constants.

Proof. Note that an open 4-walk is just a sequence of two open triples, hence
\[ 2c^4n\log^2 n \le Z_{u,v}(0) = \sum_{w \in V \setminus \{u,v\}} X_{u,w}(0) \cdot X_{v,w}(0). \]
Here each summand is bounded by $(50\log n)^2$, so if fewer than $an$ vertices $w$ satisfy $c^4\log^2 n \le X_{u,w}(0) \cdot X_{v,w}(0)$, then the right-hand sum above is less than $c^4\log^2 n \cdot n + (50\log n)^2 \cdot an = 2c^4n\log^2 n$, a contradiction. At the same time, the bound on the codegrees implies that each $w$ with $c^4\log^2 n \le X_{u,w}(0) \cdot X_{v,w}(0)$ satisfies our requirements.

Now if some $w$ shares at least $b\log n$ open triples with both $u$ and $v$, then it becomes a new common neighbor of them after the first round with probability at least $\left(1 - \left(1 - \frac{c}{\sqrt{n}}\right)^{b\log n}\right)^2 \ge \left(\frac{cb\log n}{2\sqrt{n}}\right)^2$ (here and later in this section we use that $(1-\alpha)^\beta \le 1 - \alpha\beta/2$ for all $\alpha\beta \le 1$). These events are independent for different $w$'s, hence $X_{u,v}(1)$ is bounded from below by the binomial random variable $\mathrm{Bin}\left(an, \frac{c^2b^2\log^2 n}{4n}\right)$. Then by Claim 4.4.1, the probability that $X_{u,v}(1)$ is smaller than $\gamma\log^2 n$ is $e^{-\Omega(\log^2 n)}$ for a sufficiently small $\gamma$. A union bound over all pairs of vertices finishes the proof.

To make our life easier, we consider a slightly different second phase from this point on. Instead of sampling open triples with success probability $p$, we will consider a sprinkling process, and sample all triples with success probability $\frac{4}{\sqrt{n\log n}}$ in each round (starting from round 2). This means that some triples will have a higher than $p$ chance to exist, but as long as the number of rounds $m$ is $O(\log\log n)$, the effect is negligible: each triple is still sampled with probability at most $\frac{c + o(1)}{\sqrt{n}}$. Formally we can say that we are proving the result for any constant $c' > c$.

To give a lower bound on the codegrees in $G_{i+1}$, we define the following sequence:
\[ x_i = \gamma^{2^{i-1}}\log^{2^{i-1}+1} n, \qquad i = 1, \dots, m, \]
with $x_{m+1} = \frac{n}{10}$, where we choose $m = O(\log\log n)$ to be smallest possible such that $x_m \ge \frac{1}{4}\sqrt{n\log n}$. Let us also set $p_i = 1 - \left(1 - \frac{4}{\sqrt{n\log n}}\right)^{x_{i-1}}$.
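The sequence satisfies $x_i = x_{i-1}^2/\log n$, so it squares itself in every round. A hedged iteration (names ours) illustrates why $m = O(\log\log n)$ rounds suffice; note the recursion only grows when $x_1 > \log n$, i.e., $\gamma \log n > 1$, which holds for any fixed $\gamma$ once $n$ is large.

```python
import math

def rounds_to_threshold(n, gamma=0.5):
    # Iterate x_{i+1} = x_i^2 / log n from x_1 = gamma * log^2 n
    # until x_m >= sqrt(n log n) / 4; returns the number of rounds m.
    L = math.log(n)
    assert gamma * L > 1, "need x_1 > log n, true for any fixed gamma once n is large"
    x = gamma * L * L
    m = 1
    while x < math.sqrt(n * L) / 4:
        x = x * x / L
        m += 1
    return m

# Doubly exponential growth: squaring the exponent of n adds only O(1) rounds.
assert rounds_to_threshold(10**6) == 3
assert rounds_to_threshold(10**12) <= rounds_to_threshold(10**6) + 1
```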


Lemma 4.4.6. With high probability
\[ X_{u,v}(i) \ge x_i \]
for all $1 \le i \le m+1$ and all pairs of vertices $u$ and $v$.

Proof. Lemma 4.4.4 shows that the lower bound on the codegrees holds for $i = 1$, so assume $2 \le i$. We condition on the event that the statement holds for $i-1$ and bound the probability that it fails for $i$.

We claim that under these conditions $G(n, p_i)$ is a subgraph of $G_i$. To see this, observe that if an edge $uv$ is missing from $G_{i-1}$, then it has $X_{u,v}(i-1) \ge x_{i-1}$ independent chances of probability $4/\sqrt{n\log n}$ of being added in the $i$'th round. Moreover, these events are independent for the different non-edges, as the triples are sampled independently, and each triple has at most one missing edge. This means that missing edges are added independently with probability at least $p_i$ while existing edges are kept in the graph, thus indeed $G(n, p_i) \subseteq G_i$.

We intend to use the Chernoff bound to show that all the codegrees in $G(n, p_i)$, and thus also in $G_i$, exceed $x_i$. For this, observe that the codegree of any fixed pair of vertices in $G(n, p_i)$ is a binomial random variable $R_i \sim \mathrm{Bin}(n-2, p_i^2)$. A straightforward calculation gives $\mathbb{E}[R_i] > 2x_i$. Indeed, for $2 \le i \le m$ we have
\[ (n-2)\left(1 - \left(1 - \frac{4}{\sqrt{n\log n}}\right)^{x_{i-1}}\right)^2 \ge (n-2) \cdot \frac{4x_{i-1}^2}{n\log n} > \frac{2x_{i-1}^2}{\log n} = 2x_i, \]
whereas for $i = m+1$ (using $x_m \ge \frac{1}{4}\sqrt{n\log n}$),
\[ (n-2)\left(1 - \left(1 - \frac{4}{\sqrt{n\log n}}\right)^{x_m}\right)^2 \ge (n-2)(1 - 1/e)^2 > 2x_{m+1}. \]
Thus, as $x_i \ge \gamma\log^2 n$, Claim 4.4.1 shows that $\mathbb{P}[R_i < x_i] = e^{-\Omega(\log^2 n)}$. Now taking the union bound over all vertex pairs and over all $i$ finishes the proof.

Proof of Theorem 4.1.1, part 1. We claim that $G_{m+2}$ is the complete graph. Indeed, Lemma 4.4.6 shows that whp all the codegrees in $G_{m+1}$ are linear, so the probability that a fixed edge is missing from $G_{m+2}$ is at most $\left(1 - 4/\sqrt{n\log n}\right)^{\Omega(n)} = e^{-\Omega(\sqrt{n/\log n})}$. A union bound over all pairs of vertices then completes the proof.


4.5 Further remarks

Probably the most natural question that one can ask is the following. What happens if the process starts with some other tree, and not the star? Intuitively it seems that we are in a worse situation, as there are fewer open triples to start with. We would therefore expect that if $p \le \frac{1-\varepsilon}{2\sqrt{n}}$, then starting with any fixed tree, the triadic process fails to propagate whp. In fact, we believe that whp this holds for all trees simultaneously.

Using the topology language, this is equivalent to saying that $p = \frac{1}{2\sqrt{n}}$ is the threshold probability for a random 2-complex to contain a collapsible hypertree (the upper bound comes from Corollary 4.1.2). We must note that a complex can have trivial fundamental group without actually containing a collapsible hypertree. A yet stronger question would be to ask for a lower bound matching the threshold estimate in Corollary 4.1.2 for being simply connected.

Going in a different direction, it would also be interesting to study similar processes that are perhaps more meaningful from the social networks point of view. For example, a triadic process where vertices are discouraged to reach high degrees could be a more realistic model.


Chapter 5

Decomposing random graphs into few cycles and edges

5.1 Introduction

Problems about packing and covering the edge set of a graph using cycles and paths have been intensively studied since the 1960s. One of the oldest questions in this area was asked by Erdős and Gallai [22, 23]. They conjectured that the edge set of any graph $G$ on $n$ vertices can be covered by $n-1$ cycles and edges, and can be partitioned into a union of $O(n)$ cycles and edges. The covering part was proved by Pyber [46] in the 1980s, but the partitioning part is still open. As noted in [23], it is not hard to show that $O(n\log n)$ cycles and edges suffice. This bound was recently improved to $O(n\log\log n)$ by Conlon, Fox and Sudakov in [18], where they also proved that the conjecture holds for random graphs and graphs of linear minimum degree. This chapter treats the problem in the case of random graphs in more detail.

Let $0 \le p(n) \le 1$ and define $G(n, p)$ to be the random graph on $n$ vertices, where the edges are included independently with probability $p$. We hope to find a close to optimal partition of the edges of a random graph into cycles and edges. Observe that in any such partition each odd-degree vertex needs to be incident to at least one edge, so if $G(n, p)$ has $s$ odd-degree vertices, then we certainly need at least $s/2$ edges. Also, a typical random graph has about $\binom{n}{2}p$ edges, whereas a cycle may contain no more than $n$ edges, so we need at least about $\frac{np}{2}$ cycles (or edges) to cover the remaining edges. This simple argument gives a lower bound of $\frac{np}{2} + \frac{s}{2}$ on the optimum.

In this chapter we show that this lower bound is asymptotically tight. Let $\mathrm{odd}(G)$ denote the number of odd-degree vertices in the graph $G$. We say that $G(n, p)$ (with $p = p(n)$) satisfies some property $\mathcal{P}$ with high probability or whp if the probability that $\mathcal{P}$ holds tends to 1 as $n$ approaches infinity. Our main result is the following theorem.

Theorem 5.1.1. Let the edge probability $p(n)$ satisfy $p = \omega\left(\frac{\log\log n}{n}\right)$. Then whp $G(n, p)$ can be split into $\frac{\mathrm{odd}(G(n,p))}{2} + \frac{np}{2} + o(n)$ cycles and edges.

In fact, as we show in Claim 5.2.2, for most of the $p$'s in the range, $\mathrm{odd}(G(n, p)) \sim \frac{n}{2}$. This immediately implies the following, perhaps more tangible, corollary.

Corollary 5.1.2. Let $p = p(n)$ be in the range $\left[\omega\left(\frac{\log\log n}{n}\right),\, 1 - \omega\left(\frac{1}{n}\right)\right]$. Then whp $G(n, p)$ can be split into $\frac{n}{4} + \frac{np}{2} + o(n)$ cycles and edges.

Here we use the standard notation $\omega(f)$ for any function that is asymptotically greater than the function $f(n)$, i.e., $g(n) = \omega(f(n))$ if $\lim_{n\to\infty} \frac{g(n)}{f(n)} = \infty$. In this chapter $\log$ stands for the natural logarithm, and for the sake of clarity we omit the floor and ceiling signs whenever they are not essential. We call $G$ an Euler graph if all the vertices of $G$ have even degree (not requiring that $G$ be connected).

5.1.1 Proof outline

We will break the probability range into three parts (the sparse, the intermediate and the dense ranges), and prove Theorem 5.1.1 separately for each part. The proofs of the denser cases build on the sparser cases, but all the proofs have the following pattern: we start with deleting $\frac{\mathrm{odd}(G(n,p))}{2} + o(n)$ edges so that the remaining graph is Euler, and then we extract relatively few long cycles to reduce the problem to a sparser case.

In the sparse case we will check the Tutte condition to show that there is a large matching on the odd-degree vertices, and then use expansion properties to iteratively find cycles that are much longer than the average degree. In the end, we are left with a sparse Euler subgraph, which breaks into $o(n)$ cycles.

The denser cases are somewhat more complicated. We will need to break $G(n, p)$ into several random subgraphs. These graphs will not be independent, but we can remove edges from one of them without affecting the random structure of the others.

In the intermediate case we break $G(n, p)$ into three random subgraphs, $G(n, p) = G_1 \cup G_2 \cup G_3$. First we find an edge set $E_0$ in $G_2$ such that $G(n, p) - E_0$ is Euler. Then we break $(G_2 \cup G_3) - E_0$ into matchings and $G_1$ into even sparser random graphs. Using a result by Broder, Frieze, Suen and Upfal [14] about paths connecting a prescribed set of vertex pairs in random graphs, we connect the matchings into cycles using the parts of $G_1$. The remaining edges are all from $G_1$, and the tools from the sparse case take care of them.

In the dense case we break into four subgraphs, $G(n, p) = G_1 \cup G_2 \cup G_3 \cup G_4$, where $G_4$ contains the majority of the edges. Again we start by finding the edge set $E_0$ in $G_3$. Next, we apply a recent packing result by Knox, Kühn and Osthus [38] to find many edge-disjoint Hamilton cycles in $G_4$. Then we break the remaining edges from $G_3 \cup G_4$ into matchings and use $G_2$ to connect them into cycles. At this point we have a still intact random graph $G_1$ and some edges from $G_2$ left, and these fit into the intermediate setting, hence the previous results complete the proof.

The chapter is organized as follows: in Section 5.2 we prove all the results we need about odd-degree vertices, including the typical existence of $E_0$ and the fact that normally about half the vertices are odd. Section 5.3 shows why we can iteratively remove relatively long cycles from sparse random graphs, and proves Theorem 5.1.1 in the sparse range. In Section 5.4 we show the details of breaking into matchings and then connecting them into few cycles, and prove the main lemma for the intermediate range. Finally, we complete the proofs of the denser cases in Section 5.5.

5.2 Covering the odd-degree vertices

The first step in the proof is to reduce the problem to an Euler subgraph by removing relatively few edges. This section is devoted to the discussion of the results we need about odd-degree vertices in random graphs.

As in the previous chapters, we will use the Chernoff-type inequalities to bound large deviations. We have collected the variants needed for this chapter in the following claim. For proofs, see [2].


Claim 5.2.1. Suppose we have independent indicator random variables $X_1, \dots, X_n$, where $X_i = 1$ with probability $p_i$ and $X_i = 0$ otherwise, for all $i = 1, \dots, n$. Let $X = X_1 + \dots + X_n$ be the sum of the variables and $p = (p_1 + \dots + p_n)/n$ be the average probability, so that $\mathbb{E}(X) = np$. Then:

(a) $\mathbb{P}(X > np + a) \le e^{-2a^2/n}$ for all $a > 0$,

(b) $\mathbb{P}(X > 2np) \le e^{-np/20}$,

(c) $\mathbb{P}(X < np - a) \le e^{-a^2/np}$, and hence

(d) $\mathbb{P}(X < np/2) \le e^{-np/20}$.

In particular, all the above estimates hold when $X \sim \mathrm{Bin}(n, p)$ is a binomial random variable.

In the coming results, we will also make use of the following easy observation: for $X \sim \mathrm{Bin}(n, p)$, the probability that $X$ is odd is exactly
\[ \sum_{i=0}^{\lfloor\frac{n-1}{2}\rfloor} \binom{n}{2i+1} p^{2i+1}(1-p)^{n-(2i+1)} = \frac{(1-p+p)^n - (1-p-p)^n}{2} = \frac{1 - (1-2p)^n}{2}. \]
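The identity can be checked exactly for small $n$ by summing the odd terms of the binomial distribution. A throwaway verification (function names ours):

```python
import math

def p_odd_exact(n, p):
    # Sum of P[Bin(n, p) = k] over odd k.
    return sum(math.comb(n, k) * p**k * (1 - p)**(n - k)
               for k in range(1, n + 1, 2))

def p_odd_closed(n, p):
    # Closed form (1 - (1 - 2p)^n) / 2 from the displayed identity.
    return (1 - (1 - 2 * p) ** n) / 2

for n in (1, 5, 20, 57):
    for p in (0.0, 0.1, 0.5, 0.9):
        assert abs(p_odd_exact(n, p) - p_odd_closed(n, p)) < 1e-12
```

Note the $p = 1/2$ case: the closed form gives exactly $1/2$ for every $n$, which is the heuristic behind Claim 5.2.2 below that about half the vertices have odd degree.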

First we prove the following estimate on the number of vertices of odd degree in $G(n, p)$.

Claim 5.2.2. Suppose the probability $p$ satisfies $\omega\left(\frac{1}{n}\right) < p < 1 - \omega\left(\frac{1}{n}\right)$. Then whp $G \sim G(n, p)$ contains $\frac{n}{2}(1 + o(1))$ vertices of odd degree.

Proof. Let us fix a bipartition of the vertices into two nearly equal parts $V = V_1 \cup V_2$, where $n_1 = |V_1| = \lfloor n/2 \rfloor$ and $n_2 = |V_2| = \lceil n/2 \rceil$. Our plan is to show that whp roughly half the vertices of $V_1$ and roughly half the vertices of $V_2$ have odd degree in $G$.

So let $Y_1$ be the number of odd-degree vertices in $V_1$. The probability that a specific vertex $v$ has odd degree is $\frac{1-(1-2p)^{n-1}}{2}$, so by the linearity of expectation, the expected number of odd-degree vertices in $V_1$ is
\[ \mathbb{E}(Y_1) = \frac{n_1}{2} - \frac{n_1(1-2p)^{n-1}}{2} \sim \frac{n_1}{2} \]

for $\omega(1/n) \le p \le 1 - \omega(1/n)$. We still need to prove that $Y_1$ is tightly concentrated around its mean.

Let us now expose the random edges spanned by $V_1$. Then the degrees of the vertices in $V_1$ are determined by the crossing edges between $V_1$ and $V_2$. At this point, each vertex in $V_1$ has $n_2$ incident edges unexposed, and these edges are all different. So let $v$ be any vertex in $V_1$. The probability that $v$ is connected to an odd number of vertices in $V_2$ is $p_1 = \frac{1-(1-2p)^{n_2}}{2} \sim \frac{1}{2}$, so the probability that $v$ will end up having an odd degree is either $p_1$ or $(1 - p_1)$ (depending on the parity of its degree in $G[V_1]$). This means that, conditioning on any collection of edges spanned by $V_1$, $Y_1$ is the sum of $n_1$ indicator variables with probabilities $p_1$ or $1 - p_1$, so we can apply Claim 5.2.1 (a) and (c) with average probability $\min\{p_1, 1-p_1\} \le p_1 \le \max\{p_1, 1-p_1\}$ to get
\[ \mathbb{P}(|Y_1 - \mathbb{E}(Y_1)| > a) \le e^{-2a^2/n_1} + e^{-a^2/n_1p_1}. \]
Then taking $a = n^{2/3}$ and using that $p_1 \sim 1/2$ we get
\[ \mathbb{P}(|Y_1 - \mathbb{E}(Y_1)| > n^{2/3}) \le 3e^{-2n^{1/3}} = o(1). \]
Repeating the same argument for $V_2$, we see that if $Y_2$ is the number of odd-degree vertices in $V_2$, then
\[ \mathbb{E}(Y_2) = \frac{n_2 - n_2(1-2p)^{n-1}}{2} \sim n_2/2 \]
and
\[ \mathbb{P}(|Y_2 - \mathbb{E}(Y_2)| > n^{2/3}) \le 3e^{-2n^{1/3}}. \]
So with probability at least $1 - 6e^{-2n^{1/3}} = 1 - o(1)$ the number of odd vertices is $\frac{n}{2}(1 + o(1))$.
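A quick simulation in the spirit of Claim 5.2.2 (a sketch, not part of the proof; names ours): sample $G(n, p)$ and count odd-degree vertices. The count lands near $n/2$, and by the handshake lemma it is always even.

```python
import random

def count_odd_degrees(n, p, seed=0):
    # Sample G(n, p) and return the number of odd-degree vertices.
    rng = random.Random(seed)
    deg = [0] * n
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < p:
                deg[u] += 1
                deg[v] += 1
    return sum(d % 2 for d in deg)

n = 600
odd = count_odd_degrees(n, 0.05)
assert odd % 2 == 0              # the number of odd-degree vertices is always even
assert abs(odd - n / 2) < n / 6  # and it concentrates around n/2
```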

In order to show that typically there is a small set of edges covering the odd vertices, we will need some properties of sparse random graphs. The following somewhat technical lemma collects the tools we use in the proof.

Lemma 5.2.3. Let $p = p(n)$ satisfy $\omega\left(\frac{\log\log n}{n}\right) \le p \le \frac{\log^{10} n}{n}$. Then the following properties hold whp for $G \sim G(n, p)$:

1) the giant component of $G$ covers all but at most $o(n)$ vertices, and all other components are trees,

2) the independence number $\alpha(G)$ is at most $\frac{2\log(np)}{p}$,

3) the diameter of the giant component of $G$ is at most $\frac{2\log n}{\log(np)}$,

4) any set $T$ of at most $2n^{1/10}$ vertices spans at most $2|T|$ edges,

5) any two disjoint vertex sets $T, T'$ of the same size $n^{1/10} \le t \le n/100$ are connected by less than $tnp/6$ edges, and

6) all but at most $\frac{n}{2\log^2 n}$ vertices in $G$ have at least $np/5$ odd-degree neighbors.

Proof. For the first two properties we refer to Bollobás [12], while the third property was proved by Chung and Lu [16]. To prove the fourth one, note that the probability that a fixed set $T$ of size $t$ spans at least $2t$ edges is at most $\binom{\binom{t}{2}}{2t} \cdot p^{2t}$. So the probability that some $T$ of size $t \le 2n^{1/10}$ spans at least $2t$ edges is at most
\[ \sum_{t=5}^{2n^{1/10}} \binom{n}{t} \cdot \binom{\binom{t}{2}}{2t} \cdot p^{2t} \le \sum_{t=5}^{2n^{1/10}} (nt^4p^2)^t \le \sum_{t=5}^{2n^{1/10}} n^{-t/2} = o(1). \]

We use a similar argument to prove the fifth property. For fixed $T_1$ and $T_2$ of size $t$, the probability that at least $tnp/6$ edges connect them in $G$ is at most $\binom{t^2}{tnp/6} p^{tnp/6}$. So the probability that the property does not hold can be bounded from above by
\[ \sum_{t=n^{1/10}}^{n/100} \binom{n}{t}^2 \cdot \binom{t^2}{tnp/6} p^{tnp/6} \le \sum_{t=n^{1/10}}^{n/100} \left(\frac{en}{t}\right)^{2t} \cdot \left(\frac{et^2p}{tnp/6}\right)^{tnp/6} \le \sum_{t=n^{1/10}}^{n/100} \left(\frac{e^2n^2}{t^2} \cdot \left(\frac{6et}{n}\right)^{\log\log n/6}\right)^t. \]
Here $\frac{6et}{n} < \frac{1}{2}$, so for large enough $n$ this probability is smaller than $\sum_{t=n^{1/10}}^{n/100} (1/2)^t < 2 \cdot 2^{-n^{1/10}} = o(1)$.

The last property has a similar flavor to Claim 5.2.2. We will prove that the probability that a particular vertex has fewer than $np/5$ odd neighbors is at most $1/\log^3 n$. Then the expected number of such vertices is at most $n/\log^3 n$, hence the probability that there are more than $\frac{n}{2\log^2 n}$ of them is, by Markov's inequality, at most $2/\log n = o(1)$.

So let us pick a vertex $v$ in $G$. Using Claim 5.2.1, we see that the probability that it has fewer than $np/2$ or more than $2np$ neighbors is at most $2e^{-np/20}$. Assume this is not the case, and expose the edges spanned by the neighborhood $N$ of $v$. Now the number of odd neighbors of $v$ is determined by the edges between $N$ and $\overline{N} = V - (N \cup \{v\})$. In fact, the probability that a vertex $u \in N$ is connected to an odd number of vertices in $\overline{N}$ is $p_1 = \frac{1-(1-2p)^{n-d-1}}{2} \sim \frac{1}{2}$, where $d = |N| < n/2$ is the degree of $v$, so $u$ has an odd degree in $G$ with probability $p_1$ or $(1 - p_1)$. Thus the number of odd neighbors of $v$ is a sum of $d$ indicator random variables with probabilities $p_1$ or $(1 - p_1)$. Another application of Claim 5.2.1 then shows that the probability that $v$ has fewer than $np/5 < dp_1/2$ odd neighbors (conditioned on $np/2 \le d \le 2np$) is at most $e^{-dp_1/20} < e^{-np/50}$.

Summarizing the previous paragraph, the probability that $v$ has fewer than $np/5$ odd neighbors is at most
\[ 2e^{-np/20} + e^{-np/50} \le 3e^{-np/50} \le e^{-3\log\log n} = \frac{1}{\log^3 n} \]
for large enough $n$, establishing the sixth property.

Now we are ready to prove the following statement, which will serve as a tool to get rid of odd degrees. We should point out that this lemma only works for $p \gg \frac{\log n}{n}$. However, its flexibility (the fact that we can apply it to any set $S$) will prove useful when tackling the dense case.

Lemma 5.2.4. Let $p = \omega\left(\frac{\log n}{n}\right)$ but $p \le \frac{\log^{10} n}{n}$. Then whp in $G \sim G(n, p)$, for any vertex set $S$ of even cardinality, there is a collection $E_0$ of $\frac{|S|}{2} + o(n)$ edges in $G$ such that $S$ is exactly the set of vertices incident to an odd number of edges in $E_0$.

Proof. Take any set $S$. First, we find a matching in $S_0 = S$ greedily, by selecting one edge $e_i$ at a time spanned by $S_i$, and then removing the vertices of $e_i$ from $S_i$ to get $S_{i+1}$. At the end of the process, the remaining set of vertices $S' \subseteq S$ is independent in $G$. The second property from Lemma 5.2.3 implies that $|S'| \le \frac{2\log(np)}{p}$. Now let us pair up the vertices in $S'$ arbitrarily, and for each of these pairs $v_{j,1}, v_{j,2}$ take a shortest path $P_j$ in $G$ connecting them. (Recall that for $p = \omega\left(\frac{\log n}{n}\right)$ the random graph $G(n, p)$ is whp connected.) The third property ensures that each of the $P_j$ contains at most $\frac{2\log n}{\log(np)}$ edges. Note that we do not assume these paths to be edge-disjoint, and they may contain the $e_i$'s as well.

Let us define $E_0$ to be the "mod 2 union" of the $P_j$ and the $e_i$, i.e., we include an edge $e$ in $E_0$ if it appears an odd number of times among them. Then the set of odd-degree vertices in $E_0$ is indeed $S$, and the number of edges is at most
\[ \frac{|S|}{2} + \frac{2\log(np)}{p} \cdot \frac{2\log n}{\log(np)} = \frac{|S|}{2} + o(n) \]
since $np \gg \log n$.
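The construction in the proof translates directly into code. Below is a hedged sketch (helper names are ours; it assumes the graph is connected and $|S|$ is even): a greedy matching inside $S$, pairing the leftovers via BFS shortest paths, and the mod 2 union. Whatever order the greedy steps take, each toggle flips the parity of exactly its two endpoints, so the odd-degree set of $E_0$ comes out as $S$ even when paths overlap each other or the matching.

```python
from collections import deque

def build_E0(adj, S):
    # Sketch of the Lemma 5.2.4 construction. adj: dict vertex -> set of
    # neighbors (connected graph); S: even-size target set of odd vertices.
    # Returns E0 as a set of frozenset edges (the "mod 2 union").
    S, E0 = set(S), set()

    def toggle(u, v):  # add the edge uv mod 2
        E0.symmetric_difference_update({frozenset((u, v))})

    # Step 1: greedy matching spanned by S.
    left = set(S)
    for u in sorted(S):
        if u not in left:
            continue
        for v in adj[u]:
            if v in left and v != u:
                toggle(u, v)
                left -= {u, v}
                break

    # Step 2: pair up leftover vertices; join each pair by a BFS shortest
    # path (paths may reuse matching edges, parity still works out).
    left = sorted(left)
    for u, v in zip(left[::2], left[1::2]):
        prev, queue = {u: None}, deque([u])
        while v not in prev:
            x = queue.popleft()
            for y in adj[x]:
                if y not in prev:
                    prev[y] = x
                    queue.append(y)
        while prev[v] is not None:  # toggle the path edges back to u
            toggle(v, prev[v])
            v = prev[v]
    return E0

def odd_vertices(edges):
    deg = {}
    for e in edges:
        for x in e:
            deg[x] = deg.get(x, 0) + 1
    return {x for x, d in deg.items() if d % 2}

# A 6-cycle with S = {0, 3}: no matching edge inside S, so E0 is a path.
adj = {i: {(i - 1) % 6, (i + 1) % 6} for i in range(6)}
assert odd_vertices(build_E0(adj, {0, 3})) == {0, 3}
```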

We can actually push the probability $p$ down a bit by giving up on the above mentioned flexibility. The following lemma takes care of the odd-degree vertices and the small components in the sparse case.

Lemma 5.2.5. Let $p = \omega\left(\frac{\log\log n}{n}\right)$ but $p \le \frac{\log^{10} n}{n}$, and let $S$ be the set of odd-degree vertices in $G \sim G(n, p)$. Then whp there is a collection $E_0$ of $\frac{|S|}{2} + o(n)$ edges in $G$ such that $G - E_0$ is an Euler graph.

Proof. Let us assume that $G$ satisfies all the properties from Lemma 5.2.3. Our plan is to show that there is a matching in the induced subgraph $G_S = G[S]$ covering all but at most $n/\log^2 n$ vertices in $S$, using the defect version of Tutte's theorem on $G_S$. For this we need that the deletion of any set $T$ of $t$ vertices from $G_S$ creates no more than $t + n/\log^2 n$ odd-vertex components. In fact, we will prove that the deletion of $t$ vertices breaks $G_S$ into at most $t + n/\log^2 n$ components.

If $t \ge \frac{n}{100}$ then this easily follows from the second property: if at least $t$ components were created, then we could find an independent set in $G$ of size $t$ simply by picking a vertex from each component. But $\alpha(G) \le \frac{2\log(np)}{p} < \frac{n}{100}$, so this is impossible.

Now suppose there is a set $T$ of $t < \frac{n}{100}$ vertices such that $G_S - T$ has at least $t + \frac{n}{\log^2 n}$ components. Here the number of components containing at least $2\log^2 n$ vertices is clearly at most $\frac{n}{2\log^2 n}$. On the other hand, according to the sixth property of Lemma 5.2.3, there are at most $\frac{n}{2\log^2 n}$ components containing a vertex of degree less than $np/5$ in $G_S$. Thus there are $t$ components of size at most $2\log^2 n$, and hence of average degree at most 4 by the fourth property, such that all the vertices contained in them have at least $np/5$ neighbors in $G_S$. Each such component then has a vertex with at least $np/5 - 4$ neighbors in $T$. Pick a vertex like that from $t$ of these components to form the set $T'$.

We see that there are at least $t(np/5 - 4) > tnp/6$ edges going between $T$ and $T'$, but this contradicts the fourth property if $t \le n^{1/10}$ and the fifth property if $n^{1/10} < t < \frac{n}{100}$. So by Tutte's theorem, we can find some edge set $M$ that forms a matching in $S$, covering all but $\frac{n}{\log^2 n}$ of its vertices.

Now let $F$ be the set of edges appearing in the small components of $G$ (recall that in this probability range, $G(n, p)$ whp has a giant component and possibly some smaller ones). According to the first property, these edges span trees covering $o(n)$ vertices in total, so $|F| = o(n)$. We see that the set $M \cup F$ takes care of all odd vertices outside the giant component and all but at most $\frac{n}{\log^2 n}$ of them inside. The rest of the proof follows the idea from the previous lemma: we pair up the remaining odd vertices arbitrarily and take shortest paths $P_j$ in $G$ connecting them. Once again, the third property implies that each path has length at most $\frac{2\log n}{\log(np)}$. Then $E_0$, the "mod 2 union" of the $P_j$ and $M \cup F$, satisfies all requirements and uses at most
\[ \frac{|S|}{2} + o(n) + \frac{n}{\log^2 n} \cdot \frac{2\log n}{\log(np)} = \frac{|S|}{2} + o(n) \]
edges.

5.3 Cycle decompositions in sparse random graphs

In this section we show that Euler subgraphs of sparse random graphs can be decomposed into $o(n)$ edge-disjoint cycles. The following statement from [18], which we use to find long cycles, is an immediate consequence of applying Pósa's rotation-extension technique [45] as described in [13].

Lemma 5.3.1. If a graph $G$ does not contain any cycle of length at least $3t$, then there is a set $T$ of at most $t$ vertices such that $|N(T)| \le 2|T|$.

We say that a graph G is sufficiently sparse if any set of vertices S spans less than r|S| edges, where r = max{|S|/(12 log^2 n), 7}. Note that any subgraph of a sufficiently sparse graph is also sufficiently sparse.

In what follows, we proceed by showing that on the one hand, for p small enough the graph G(n, p) is typically sufficiently sparse, while on the other hand, any sufficiently sparse graph contains few edge-disjoint cycles covering most of the edges.
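For concreteness, the definition can be checked by brute force on very small graphs. This illustrative Python checker is my own (not from the thesis), and note that on tiny graphs the condition is nearly vacuous, since r ≥ 7 forces a violating set to span at least 7|S| edges:

```python
from itertools import combinations
from math import log

def is_sufficiently_sparse(n, edges):
    """Brute-force check of the 'sufficiently sparse' condition (tiny graphs only)."""
    for size in range(2, n + 1):
        for S in combinations(range(n), size):
            s = set(S)
            spanned = sum(1 for u, v in edges if u in s and v in s)
            r = max(size / (12 * log(n) ** 2), 7)
            if spanned >= r * size:  # S spans too many edges
                return False
    return True
```

For example, a cycle passes the check, while a complete graph on 16 vertices already fails it (a 15-vertex subset spans 105 = 7 · 15 edges).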


Lemma 5.3.2. Let p = p(n) < n^{−1/6} and let G ∼ G(n, p). Then whp G is sufficiently sparse.

Proof. For fixed s, the probability that there is an S of size s containing at least rs edges is at most
$$\binom{n}{s} \cdot \binom{\binom{s}{2}}{rs} \cdot p^{rs} \le n^s \cdot \left(\frac{esp}{2r}\right)^{rs} = \left(n \cdot \left(\frac{esp}{2r}\right)^r\right)^s.$$
Put x = n · (esp/(2r))^r. If we further assume that r ≥ s/(12 log^2 n) and r ≥ 7, then we get that for large n and p < n^{−1/6}
$$x \le n \cdot (6ep \log^2 n)^r \le n \cdot (6e)^7 \left(\frac{\log^2 n}{n^{1/6}}\right)^7 = o(1),$$
where we used the fact that 6ep log^2 n < 1 for large n. Hence the probability that for some s there is a set S of size s which spans more than max{s/(12 log^2 n), 7} · s edges, i.e., that G is not sufficiently sparse, is at most
$$\sum_{s=2}^{n} x^s < \frac{x^2}{1-x} < x = o(1).$$

Corollary 5.3.3. For p < n^{−1/6}, whp all subgraphs of G(n, p) are sufficiently sparse.

Using these lemmas we can derive one of our main tools: the fact that we can make a subgraph of a random graph relatively sparse by iteratively removing long cycles.

Proposition 5.3.4. Let H be a sufficiently sparse n-vertex graph of average degree d > 84. Then it contains a cycle of length at least d log^2 n.

Proof. Assume to the contrary that there is no such cycle. Define H′ ⊆ H to be the d/2-core of H, i.e., the non-empty subgraph obtained by repeatedly removing vertices of degree less than d/2. Since there is no cycle of length at least d log^2 n in H′, we can apply Lemma 5.3.1 to find a set T of t ≤ d log^2 n/3 vertices such that |N(T)| ≤ 2|T| = 2t. Let S = T ∪ N(T) and s = |S|; then


the minimum degree condition implies that there are at least dt/4 edges incident to T, hence the set S spans at least sd/12 edges.

Note that r = d/12 > 7. Also, since s ≤ 3t ≤ d log^2 n, we have r ≥ s/(12 log^2 n). So the set S spans at least max{s/(12 log^2 n), 7} · |S| edges, contradicting the assumption that H is sufficiently sparse.
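The d/2-core used in the proof is computed by the standard peeling procedure: keep deleting vertices of too-small degree until none remain. A minimal illustrative Python sketch (my own helper, not part of the thesis):

```python
def core(adj, k):
    """Return the vertex set of the k-core: repeatedly delete vertices of degree < k."""
    adj = {v: set(ns) for v, ns in adj.items()}  # work on a copy
    changed = True
    while changed:
        changed = False
        for v in [v for v in adj if len(adj[v]) < k]:
            for w in adj[v]:
                adj[w].discard(v)  # remove v from its neighbors first
            del adj[v]
            changed = True
    return set(adj)
```

The resulting subgraph is non-empty whenever the average degree is at least 2(k − 1), since peeling a vertex of degree below k cannot exhaust a graph with more than (k − 1)n edges.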

Corollary 5.3.5. Let p < n^{−1/6}. Then whp G ∼ G(n, p) has the following property. If H is a subgraph of G on n_0 vertices of average degree d, then H can be decomposed into a union of at most 2n_0/log n_0 cycles and a graph H′ of average degree at most 84.

Proof. By Corollary 5.3.3 all subgraphs of G(n, p) are sufficiently sparse whp. So we can repeatedly apply Proposition 5.3.4 to remove cycles from H as follows. Let H_0 = H. As long as the graph H_i has average degree d_i > 84, we find a cycle C_{i+1} in it of length at least d_i log^2 n_0 and define H_{i+1} = H_i − E(C_{i+1}). After some finite number (say l) of steps we end up with a graph H′ = H_l of average degree at most 84. We need to bound l.

Going from H_i of average degree d_i to H_j, the first graph of average degree below d_i/2, we remove at most d_i n_0/2 edges using cycles of length at least (d_i/2) log^2 n_0, so the number of cycles removed in such a phase is at most n_0/log^2 n_0. There are at most log_2 d such halving phases, hence if H had average degree d then l ≤ n_0 log_2 d/log^2 n_0 ≤ 2n_0/log n_0, as needed.

To conclude this section, we prove Theorem 5.1.1 in the range ω(log log n/n) ≤ p ≤ log^{10} n/n by observing that whp G(n, p) contains no more than o(n) short cycles.

Lemma 5.3.6. Let p < log^{10} n/n. Then whp G(n, p) contains no more than √n cycles of length at most log log n.

Proof. Let X_k be the number of cycles of length k in G(n, p). The number of cycles of length k in K_n is at most n^k, and each such cycle has probability p^k of being included in G(n, p), hence E(X_k) ≤ (np)^k ≤ (log^{10} n)^k. So if X = ∑_{k=3}^{log log n} X_k is the number of cycles of length at most log log n, then we clearly have
$$E(X) \le \sum_{k=3}^{\log\log n} E(X_k) \le \sum_{k=3}^{\log\log n} (\log^{10} n)^k \le (\log^{10} n)^{2\log\log n},$$
using log^{10} n > 2.


Hence we can apply Markov's inequality to bound the probability that there are more than √n short cycles:
$$P(X \ge \sqrt{n}) \le \frac{(\log^{10} n)^{2\log\log n}}{\sqrt{n}} = \exp\left(20(\log\log n)^2 - (\log n)/2\right) = o(1).$$

Corollary 5.3.7. Let p < log^{10} n/n. Then whp any Euler subgraph H of G ∼ G(n, p) can be decomposed into o(n) cycles.

Proof. Use Corollary 5.3.5 to remove o(n) edge-disjoint cycles and end up with a graph H_1 of average degree at most 84. Note that H_1 is still an Euler graph, so we can break the edges of H_1 into cycles arbitrarily. We claim that the number of cycles we get is o(n). Indeed, by Lemma 5.3.6, whp there are at most √n = o(n) short cycles, i.e., cycles of length at most log log n, while the number of long cycles can be bounded by the number of edges divided by log log n. Since H_1 contains at most O(n) edges, the number of long cycles is O(n/log log n) = o(n) and we are done.
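Breaking an Euler graph into cycles "arbitrarily" can be done by walking along unused edges until a vertex repeats and splitting off the cycle found. An illustrative Python sketch (my own representation; it assumes every vertex has even degree, which guarantees the walk never gets stuck):

```python
def find_cycle(adj):
    """Return one cycle (as a vertex list) in a graph with all degrees even and > 0 somewhere."""
    start = next(v for v in adj if adj[v])
    walk, pos, prev = [start], {start: 0}, None
    while True:
        u = walk[-1]
        w = next(x for x in adj[u] if x != prev)  # don't reuse the edge we came in on
        if w in pos:
            return walk[pos[w]:] + [w]
        pos[w] = len(walk)
        walk.append(w)
        prev = u

def cycle_decomposition(edges):
    """Split a graph with all degrees even into edge-disjoint cycles."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    cycles = []
    while any(adj.values()):
        cyc = find_cycle(adj)
        cycles.append(cyc)
        for a, b in zip(cyc, cyc[1:]):  # delete the cycle's edges; degrees stay even
            adj[a].discard(b)
            adj[b].discard(a)
    return cycles
```

Since removing a cycle preserves the parity of every degree, the loop terminates with all edges used, exactly as in the proof above.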

Our theorem for small p is then an immediate consequence of Lemma 5.2.5 and Corollary 5.3.7.

Theorem 5.3.8. Let ω(log log n/n) < p < log^{10} n/n. Then whp G ∼ G(n, p) can be decomposed into odd(G)/2 + o(n) cycles and edges.

5.4 The main ingredients for the dense case

For larger p we use a strong theorem by Broder, Frieze, Suen and Upfal [14] about the existence of edge-disjoint paths in random graphs connecting a prescribed set of vertex pairs. We need the following definition to state it: Suppose S is a set of vertices in a graph G. We define the maximum neighborhood-ratio function r_G(S) to be max_{v ∈ V(G)} |N_G(v) ∩ S|/|N_G(v)|.

Theorem 5.4.1 (Broder-Frieze-Suen-Upfal). Let p = ω(log n/n). Then there are two constants α, β > 0 such that with probability at least 1 − 1/n the following holds in G ∼ G(n, p). For any set F = {(a_i, b_i) | a_i, b_i ∈ V, i = 1, . . . , k} of at most αn log(np)/log n disjoint pairs in G satisfying the property below, there are vertex-disjoint paths connecting a_i to b_i:


• There is no vertex v which has more than a β-fraction of its neighborhood covered by the vertices in F. In other words, r_G(S) ≤ β, where S is the set of vertices appearing in F.

We shall use the following statement to establish this property, so that we can apply Theorem 5.4.1 in our coming proofs.

Lemma 5.4.2. Let M be a matching covering some vertices of V, and let G ∼ G(n, log^3 n/n) be a random graph on the same vertex set, where G and M are not necessarily independent. Then with probability 1 − n^{−ω(1)} we can break G into log n random graphs G_i ∼ G(n, log^2 n/n) and M into log n submatchings M_i of at most n/log n edges each, such that r_{G_i}(S_i) ≤ 4/√log n for all i = 1, . . . , log n, where S_i is the set of endvertices of M_i.

Proof. First, we partition the edge set of G into log n graphs G_1, . . . , G_{log n} by choosing an index i_e for each edge e uniformly and independently at random, and placing e in G_{i_e}. Then each G_i has distribution G(n, log^2 n/n). Then we can apply Claim 5.2.1(b) and (d) to the degree of each vertex of each of the G_i's, and use the union bound to see that all the G_i have minimum degree at least log^2 n/2 and maximum degree at most 2 log^2 n with probability 1 − 2n log n · e^{−log^2 n/40} = 1 − n^{−ω(1)}.

Now break M into log n random matchings M_1, . . . , M_{log n} similarly, by placing each edge f ∈ M independently in M_{i_f}, where i_f is a random index chosen uniformly. Since there are at most n/2 edges in M, another application of Claim 5.2.1(b) gives that with probability 1 − log n · e^{−n/(20 log n)} = 1 − n^{−ω(1)} each M_i contains at most n/log n edges.
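The splitting step is nothing more than an independent uniform assignment of edges to parts; a minimal illustrative Python sketch (function and parameter names are my own):

```python
import random

def split_edges(edges, k, seed=0):
    """Partition an edge list into k parts by independent uniform labels."""
    rng = random.Random(seed)
    parts = [[] for _ in range(k)]
    for e in edges:
        parts[rng.randrange(k)].append(e)
    return parts
```

If the input is distributed as G(n, q), each potential edge lands in part i with probability q/k independently of the others, which is exactly why each part is again a random graph G(n, q/k).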

If G_i has maximum degree at most 2 log^2 n, then the neighborhood of an arbitrary vertex v in G_i may meet at most 2 log^2 n edges from M. The probability that at least (log n)^{3/2} of them are selected in M_i is at most
$$\binom{2\log^2 n}{\log^{3/2} n} \cdot \frac{1}{(\log n)^{\log^{3/2} n}} \le \left(\frac{2e\log^2 n}{\log^{3/2} n \cdot \log n}\right)^{\log^{3/2} n} \le \left(\frac{2e}{\log^{1/2} n}\right)^{\log^{3/2} n} \le e^{-\log n \cdot \log\log n}.$$

So taking the union bound over all vertices v and indices i gives that with probability 1 − n^{−ω(1)}, the neighborhood of any v in any G_i meets at most log^{3/2} n of the edges in M_i, and thus it contains at most 2 log^{3/2} n vertices from S_i. Since all neighborhoods have size at least log^2 n/2, we get that r_{G_i}(S_i) ≤ 4/√log n for all i.

The next theorem is the main tool on our way to the proof of Theorem 5.1.1.

Theorem 5.4.3. Let 2 log^5 n/n ≤ p ≤ n^{−1/6} and G ∼ G(n, p). Suppose G is randomly split into G′ and G′′ by placing its edges independently into G′ with probability p′ = 2 log^5 n/(np) and into G′′ otherwise. Then whp for any subgraph H of G′′ such that G′ ∪ H is Euler, G′ ∪ H can be decomposed into o(n) cycles.

Proof. Note that, although G′ and G′′ are far from being independent, G′ on its own has distribution G(n, p′p) = G(n, 2 log^5 n/n) and G′′ has distribution G(n, (1 − p′)p), where (1 − p′)p < n^{−1/6}. It is easy to see from Claim 5.2.1 that whp G has maximum degree at most 2n^{5/6}.

By Corollary 5.3.5, whp any H ⊆ G′′ can be decomposed into o(n) cycles and a graph H_0 of average degree at most 84. Therefore it is enough for us to show that whp G′ satisfies the following: for any H_0 containing at most 42n edges such that G′ ∪ H_0 is Euler, G′ ∪ H_0 is an edge-disjoint union of o(n) cycles. Our plan is to break H_0 into few matchings and then to use Theorem 5.4.1 on random subgraphs of G′ to connect them into cycles.

So define V_0 ⊆ V to be the set of vertices of degree at least log^2 n/2 in H_0, and let V_1 = V − V_0 be the rest. Note that |V_0| = O(n/log^2 n). We break into matchings in two rounds: first we take care of the edges spanned by V_1, then the ones crossing between V_0 and V_1. Let us split G′ into two random graphs G_1, G_2 ∼ G(n, log^5 n/n) by placing each edge of G′ independently in one of them with probability 1/2.

Now let us consider the subgraph of H_0 spanned by V_1: the maximum degree is at most log^2 n/2 − 1, so we can break the edge set into log^2 n matchings M_1, . . . , M_{log^2 n}. To find the cycles, we also split G_1 into log^2 n parts G_{1,1}, . . . , G_{1,log^2 n} so that each G_{1,i} has distribution G(n, log^3 n/n). We can use Lemma 5.4.2 to further break each M_i and G_{1,i} into log n parts M_{i,j} and G_{1,i,j}, respectively, so that whp each M_{i,j} contains O(n/log n) edges, and the endvertices of M_{i,j} only cover an o(1)-fraction of the neighborhood of any vertex in G_{1,i,j}.


We create the cycles as follows: if M_{i,j} consists of the edges v_1v′_1, . . . , v_kv′_k, then we choose the corresponding set of pairs to be F_{1,i,j} = {(v′_1, v_2), (v′_2, v_3), . . . , (v′_k, v_1)}. Here the above properties of M_{i,j} and G_{1,i,j} ensure that we can apply Theorem 5.4.1, and with probability at least 1 − 1/n the matching can be closed into a cycle. Hence with probability 1 − log^3 n/n = 1 − o(1) all the M_{i,j}'s can be covered by log^3 n cycles altogether.

Let us turn to the edges of H_0 between V_0 and V_1, and define the following auxiliary multigraph G_a on V_1: for each v ∈ V_0, pair up its neighbors in V_1 (except maybe one vertex, if the neighborhood has odd size) and let E_v be the set of edges (a matching) corresponding to this pairing. Define the edge set of G_a to be the disjoint union of the E_v for v ∈ V_0. The idea is that an edge ww′ ∈ E_v corresponds to the path wvw′, so we want to find cycles covering the edges in G_a and then lead them through the original V_0−V_1 edges.
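The closing trick used for the pairs F_{1,i,j} above (and reused below) is that alternating the matching edges with the connecting paths yields one closed walk. An illustrative Python sketch (names are my own; the `connect` callback stands in for the vertex-disjoint paths supplied by Theorem 5.4.1):

```python
def close_matching_into_cycle(matching, connect):
    """Concatenate matching edges with connecting paths into one closed walk.

    matching: list of (v_i, w_i) pairs; connect(a, b): a path [a, ..., b].
    After each matching edge (v_i, w_i) we splice in a path from w_i to v_{i+1}.
    """
    k = len(matching)
    walk = []
    for i, (v, w) in enumerate(matching):
        walk.append(v)
        walk.append(w)
        nxt = matching[(i + 1) % k][0]  # first endpoint of the next matching edge
        walk.extend(connect(w, nxt)[1:-1])  # interior vertices of the connecting path
    walk.append(matching[0][0])  # close the walk
    return walk
```

If the connecting paths are vertex-disjoint and avoid the matching endpoints except where required, the walk is a simple cycle containing every matching edge.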

By the definition of V_1, the maximum degree in G_a is at most log^2 n/2 − 1, so the edge set ∪_{v∈V_0} E_v can be split into log^2 n matchings N_1, . . . , N_{log^2 n}. Now it is time to break G_2 into log^2 n random subgraphs G_{2,i} of distribution G(n, log^3 n/n) each. Once again, we use Lemma 5.4.2 to prepare for the cycle cover by splitting each of the N_i and G_{2,i} into log n parts N_{i,j} and G_{2,i,j}. When we define the set of pairs F_{2,i,j}, we need to be a little bit careful: we must make sure that no cycle contains more than one edge from any given E_v. This way the cycles do not become self-intersecting after switching the edges from E_v back to the corresponding paths through v. Since the maximum degree of G, and hence the cardinality of E_v, is at most 2n^{5/6}, we may achieve this by using at most 2n^{5/6} cycles per matching. Indeed, split N_{i,j} into at most 2n^{5/6} subsets N_{i,j,k} so that none of the N_{i,j,k} contains more than one edge from the same E_v (this can be done greedily). Then define the sets of pairs F_{2,i,j,k} for k = 1, . . . , 2n^{5/6} to close N_{i,j,k} into a cycle the same way as before, and take F_{2,i,j} = ∪_{k=1}^{2n^{5/6}} F_{2,i,j,k}.

As above, all conditions of Theorem 5.4.1 are satisfied when we use G_{2,i,j}

to find the paths corresponding to F_{2,i,j} that close N_{i,j} into cycles, and since the error probabilities were all O(1/n), whp we can simultaneously do so for all i and j. We have log^3 n matchings, so in total we get 2n^{5/6} log^3 n = o(n) edge-disjoint cycles that cover all but o(n) edges of H_0 between V_0 and V_1 (missing at most one incident edge for each v ∈ V_0).

Finally, we apply Corollary 5.3.5 on the subgraph of H_0 induced by V_0 to see that the edges spanned by V_0 can be partitioned into O(n/log^3 n)

to see that the edges spanned by V0 can be partitioned into O(n/ log3 n)




cycles and O(n/log^2 n) = o(n) edges (recall that |V_0| = O(n/log^2 n)).

So far we have found o(n) edge-disjoint cycles in G′ ∪ H. Once we remove them, we get an Euler graph containing only o(n) edges from H. So we can find o(n) edge-disjoint cycles covering all of them and remove these cycles as well, to get an Euler subgraph of G′ ∼ G(n, log^5 n/n). Now Corollary 5.3.7 shows that we can partition the remaining graph into o(n) cycles, concluding our proof.

5.5 Cycle-edge decompositions in dense random graphs

At last, we are ready to prove Theorem 5.1.1 in the denser settings. The case p ≤ n^{−1/6} is fairly straightforward from our previous results; we just need to be a little bit careful.

Theorem 5.5.1. Let log^6 n/n ≤ p ≤ n^{−1/6}. Then whp G ∼ G(n, p) can be decomposed into odd(G)/2 + o(n) cycles and edges.

Proof. We split G into the union of three disjoint random graphs G_1, G_2 and G_3 by putting each edge e ∈ E(G) independently into one of the graphs. With probability p_1 = 2 log^5 n/(np) we place e into G_1, with probability p_2 = log^2 n/(np) we place it into G_2, and with probability 1 − p_1 − p_2 we place it into G_3. This way G_1 ∼ G(n, 2 log^5 n/n) and G_2 ∼ G(n, log^2 n/n).

Now let S be the set of odd-degree vertices in G. Applying Lemma 5.2.4 to S in G_2 gives a set E_0 of |S|/2 + o(n) edges in G_2 such that G − E_0 is Euler. Taking H to be the subgraph G_2 ∪ G_3 − E_0 of G′′ = G_2 ∪ G_3 and setting G′ = G_1, we can apply Theorem 5.4.3 to split the edge set of G − E_0 into o(n) cycles. The theorem follows.

To get a tight result for larger p, we must remove cycles containing nearly all vertices. A recent result by Knox, Kühn and Osthus [38] helps us to find many edge-disjoint Hamilton cycles in random graphs.

Theorem 5.5.2 (Knox-Kühn-Osthus). Let log^{50} n/n ≤ p ≤ 1 − n^{−1/5} and G ∼ G(n, p). Then whp G contains ⌊δ(G)/2⌋ edge-disjoint Hamilton cycles.

Let us point out that we do not actually need such a strong result: δ(G)/2 − n^ε disjoint Hamilton cycles would also suffice for our purposes.


Theorem 5.5.3. Let p ≥ n^{−1/6}. Then whp G ∼ G(n, p) can be decomposed into odd(G)/2 + np/2 + o(n) cycles and edges.

Proof. Similarly to Theorem 5.5.1, we partition G into the union of four disjoint random graphs G_1, G_2, G_3 and G_4 by assigning each edge of G independently to G_1 with probability p_1 = 2 log^5 n/(np), to G_2 with probability p_2 = n^{4/5} log^3 n/(np), to G_3 with probability p_3 = log^2 n/(np), and to G_4 otherwise (with probability p_4 = 1 − p_1 − p_2 − p_3 = 1 − o(1)). It is easy to see from the Chernoff bound (Claim 5.2.1(a)) that whp the maximum degree of G_3 is at most npp_3 + n^{3/5} ≤ 2n^{3/5}, and the maximum degree of G_4 is at most npp_4 + n^{3/5}. Let us assume this is the case.

Let S be the set of odd-degree vertices in G. Now as before, we use Lemma 5.2.4 to find a set E_0 of |S|/2 + o(n) edges in G_3 such that G − E_0 is Euler. Notice that G_4 ∼ G(n, pp_4), where pp_4 = p(1 − o(1)), so whp G_4 has minimum degree at least npp_4 − n^{3/5} by Claim 5.2.1(c). Also, pp_4 < 1 − n^{−1/5} (because pp_2 > n^{−1/5}), hence we can apply Theorem 5.5.2 and find (npp_4 − n^{3/5})/2 edge-disjoint Hamilton cycles in it. Let H_0 be the graph obtained from G_3 ∪ G_4 − E_0 by removing these cycles. Then the maximum degree of H_0 is at most 4n^{3/5}, hence we can break the edge set of H_0 into n^{4/5} matchings M_i.
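Splitting a graph of maximum degree Δ into matchings can be done greedily: put each edge into the first matching where both endpoints are still unused, which needs at most 2Δ − 1 matchings (ample slack here, since 2 · 4n^{3/5} is far below n^{4/5} for large n). An illustrative Python sketch (my own helper, not the thesis's construction):

```python
def greedy_matchings(edges):
    """Greedily split an edge list into matchings (at most 2*maxdeg - 1 of them)."""
    matchings = []  # each entry: (edge list, set of endpoints already used)
    for u, v in edges:
        for edge_list, used in matchings:
            if u not in used and v not in used:
                edge_list.append((u, v))
                used.update((u, v))
                break
        else:  # no existing matching can take this edge: open a new one
            matchings.append(([(u, v)], {u, v}))
    return [m for m, _ in matchings]
```

The 2Δ − 1 bound holds because an edge is rejected by a matching only if that matching already touches one of its two endpoints, and each endpoint can block at most Δ − 1 matchings.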

We want to use Theorem 5.4.1 to close the M_i's into cycles, so let us split G_2 into n^{4/5} random graphs G′_i ∼ G(n, log^3 n/n) uniformly. By Lemma 5.4.2 we can further partition each M_i and G′_i, with probability 1 − n^{−ω(1)}, into log n matchings and graphs M_{i,j} and G′_{i,j}, in such a way that we can apply Theorem 5.4.1 on G′_{i,j} for any pairing F_{i,j} on the vertices of M_{i,j}. Choose F_{i,j}, as before, so that the resulting paths together with M_{i,j} form a cycle. Then with probability at least 1 − 1/n the theorem produces the required cycle, so whp all n^{4/5} log n cycles exist simultaneously.

This way we find n^{4/5} log n = o(n) edge-disjoint cycles covering H_0 and some edges in G_2. Let H be the graph containing the unused edges of G_2. Then G_1 ∪ H is Euler, and we can apply Theorem 5.4.3 with the host graph G_1 ∪ G_2 from distribution G(n, p(p_1 + p_2)), and the partition G′ = G_1, G′′ = G_2. This gives us a decomposition of G′ ∪ H into o(n) cycles whp, completing our proof.


5.6 Further remarks

The above proof settles the question for p = ω(log log n/n), but it would be nice to have a result for the whole probability range. The bottleneck in our proof is Lemma 5.2.5, where we obtain a small edge set E_0 such that G(n, p) − E_0 is Euler. We believe that similar ideas can be applied to prove this lemma for even smaller values of p if one puts more effort into finding short paths between vertices not covered by the matching. In any case, it seems that the asymptotics of the optimum is determined by the smallest such E_0 for any p ≤ log n/n, so a complete solution to the problem would first need to describe this minimum in the whole range.

Another direction might be to further explore the error term of our theorem. One clear obstacle to an improvement using our methods is Corollary 5.3.7. While we could slightly improve it to give an O(n/log n) bound, showing that the error term is significantly smaller would need more ideas.

Since the publication of our results, Glock, Kühn and Osthus [31] have applied tools about robust expanders to attack the problem and, partially improving our result, proved sharp bounds when p is a constant.


References

[1] N. Alon, An extremal problem for sets with applications to graph theory, J. Combin. Theory Ser. A 40 (1985), 82-89.

[2] N. Alon and J. H. Spencer, The probabilistic method, 3rd ed., Wiley (2008).

[3] E. Babson, C. Hoffman and M. Kahle, The fundamental group of random 2-complexes, J. Amer. Math. Soc. 24 (2011), 1-28.

[4] J. Balogh, B. Bollobás and R. Morris, Graph bootstrap percolation, Random Structures Algorithms 41 (2011), 413-440.

[5] J. Balogh, B. Bollobás, R. Morris and O. Riordan, Linear algebra and bootstrap percolation, J. Combin. Theory Ser. A 119 (2012), 1328-1335.

[6] J. Balogh, R. Morris and W. Samotij, Independent sets in hypergraphs, J. Amer. Math. Soc. 28 (2015), 669-709.

[7] T. Bohman, The triangle-free process, Adv. Math. 221 (2009), 1653-1677.

[8] T. Bohman, A. Frieze and E. Lubetzky, Random triangle removal, preprint, arXiv:1203.4223 (2012).

[9] B. Bollobás, On generalized graphs, Acta Math. Acad. Sci. Hungar. 16 (1965), 447-452.

[10] B. Bollobás, On a conjecture of Erdős, Hajnal and Moon, The American Mathematical Monthly 74 (1967), 178-179.


[11] B. Bollobás, Weakly k-saturated graphs, in: Beiträge zur Graphentheorie (Kolloquium, Manebach, 1967), Teubner, Leipzig (1968), 25-31.

[12] B. Bollobás, Random graphs, 2nd ed., Cambridge Stud. Adv. Math. 73, Cambridge University Press (2001).

[13] S. Brandt, H. Broersma, R. Diestel and M. Kriesell, Global connectivity and expansion: long cycles and factors in f-connected graphs, Combinatorica 26 (2006), 17-36.

[14] A. Z. Broder, A. M. Frieze, S. Suen and E. Upfal, An efficient algorithm for the vertex-disjoint paths problem in random graphs, Proceedings of SODA '96, pp. 261-268.

[15] S. Choi and P. Guan, Minimum critical squarefree subgraph of a hypercube, in: Proceedings of the Thirty-Ninth Southeastern International Conference on Combinatorics, Graph Theory and Computing 189 (2008), 57-64.

[16] F. Chung and L. Lu, The diameter of sparse random graphs, Adv. in Appl. Math. 26 (2001), 257-279.

[17] A. Coja-Oghlan, M. Onsjö and O. Watanabe, Propagation connectivity of random hypergraphs, Electron. J. Combin. 19 P17 (2012), 25pp.

[18] D. Conlon, J. Fox and B. Sudakov, Cycle packing, Random Structures Algorithms 45 (2014), 608-626.

[19] D. Conlon and W. T. Gowers, Combinatorial theorems in sparse random sets, preprint, arXiv:1011.4310 (2010).

[20] D. Easley and J. Kleinberg, Networks, crowds, and markets: reasoning about a highly connected world, Cambridge University Press (2010).

[21] P. Erdős, Some remarks on the theory of graphs, Bull. Amer. Math. Soc. 53 (1947), 292-294.

[22] P. Erdős, On some of my conjectures in number theory and combinatorics, Proceedings of the fourteenth Southeastern conference on combinatorics, graph theory and computing (Boca Raton, Fla., 1983), Congr. Numer. 39 (1983), 3-19.


[23] P. Erdős, A. W. Goodman and L. Pósa, The representation of a graph by set intersections, Canad. J. Math. 18 (1966), 106-112.

[24] P. Erdős, A. Hajnal and J. W. Moon, A problem in graph theory, The American Mathematical Monthly 71 (1964), 1107-1110.

[25] P. Erdős and A. Rényi, On random graphs I, Publ. Math. Debrecen 6 (1959), 290-297.

[26] P. Erdős, S. Suen and P. Winkler, On the size of a random maximal graph, Random Structures Algorithms 6 (1995), 309-318.

[27] J. Faudree, R. Faudree and J. R. Schmitt, A survey of minimum saturated graphs and hypergraphs, Electron. J. Combin., Dynamic Survey 19 (2011).

[28] M. Ferrara, M. S. Jacobson, F. Pfender and P. S. Wenger, Graph saturation in multipartite graphs, J. Comb. 19 (2016), 1-19.

[29] P. Frankl, An extremal problem for two families of sets, European J. Combin. 2 (1982), 125-127.

[30] P. Frankl and V. Rödl, Large triangle-free subgraphs in graphs without K4, Graphs Combin. 2 (1986), 135-144.

[31] S. Glock, D. Kühn and D. Osthus, Optimal path and cycle decompositions of dense quasirandom graphs, J. Combin. Theory Ser. B 118 (2016), 88-108.

[32] A. Gundert and U. Wagner, On topological minors in random simplicial complexes, Proc. Amer. Math. Soc. 144 (2016), 1815-1828.

[33] S. Janson, T. Łuczak and A. Ruciński, Random Graphs, Wiley (2000).

[34] R. Johnson and T. Pinto, Saturated subgraphs of the hypercube, Combin. Probab. Comput., to appear.

[35] G. Kalai, Weakly saturated graphs are rigid, Ann. Discrete Math. 20 (1984), 189-190.

[36] G. Kalai, Hyperconnectivity of graphs, Graphs Combin. 1 (1985), 65-79.


[37] L. Kászonyi and Z. Tuza, Saturated graphs with minimal number of edges, J. Graph Theory 10 (1986), 203-210.

[38] F. Knox, D. Kühn and D. Osthus, Edge-disjoint Hamilton cycles in random graphs, Random Structures Algorithms 46 (2015), 397-445.

[39] Y. Kohayakawa, V. Rödl and M. Schacht, The Turán theorem for random graphs, Combin. Probab. Comput. 13 (2004), 61-91.

[40] M. Krivelevich, Bounding Ramsey numbers through large deviation inequalities, Random Structures Algorithms 7 (1995), 145-155.

[41] N. Linial and R. Meshulam, Homological connectivity of random 2-dimensional complexes, Combinatorica 26 (2006), 475-487.

[42] L. Lovász, Flats in matroids and geometric graphs, in: Combinatorial Surveys (Proc. 6th British Comb. Conf.), Academic Press (1977), 45-86.

[43] G. Moshkovitz and A. Shapira, Exact bounds for some hypergraph saturation problems, J. Combin. Theory Ser. B 111 (2015), 242-248.

[44] N. Morrison, J. Noel and A. Scott, Saturation in the hypercube and bootstrap percolation, Combin. Probab. Comput., to appear.

[45] L. Pósa, Hamiltonian circuits in random graphs, Discrete Math. 14 (1976), 359-364.

[46] L. Pyber, An Erdős-Gallai conjecture, Combinatorica 5 (1985), 67-79.

[47] B. Roberts, Partite saturation problems, preprint, arXiv:1506.02445 (2015).

[48] V. Rödl and M. Schacht, Extremal results in random graphs, in: Erdős Centennial Volume, Bolyai Soc. Math. Stud. 25 (L. Lovász, I. Ruzsa and V. Sós, eds.), Springer (2013), 11-33.

[49] D. Saxton and A. Thomason, Hypergraph containers, Invent. Math. 201 (2015), 925-992.

[50] M. Schacht, Extremal results for random discrete structures, preprint.


[51] T. Szabó and V. H. Vu, Turán's theorem in sparse random graphs, Random Structures Algorithms 23 (2003), 225-234.

[52] P. Turán, On an extremal problem in graph theory, Matematikai és Fizikai Lapok 48 (1941), 436-452.

[53] W. Wessel, Über eine Klasse paarer Graphen, I: Beweis einer Vermutung von Erdős, Hajnal und Moon, Wiss. Z. Hochsch. Ilmenau 12 (1966), 253-256.

[54] N. Wormald, The differential equation method for random graph processes and greedy algorithms, Lectures on Approximation and Randomized Algorithms, PWN, Warsaw (1999), 73-155.

[55] A. Zykov, On some properties of linear complexes (in Russian), Mat. Sbornik N. S. 24 (1949), 163-188.


Curriculum Vitae

Education

• 2008-2011: BSc in Mathematics (with honors),
Eötvös Loránd University, Budapest, Hungary;
thesis supervisor: András Frank

• 2011-2012: MASt (Part III) in Mathematics (with distinction),
University of Cambridge, UK

• 2012-2013: PhD in Mathematics (no degree),
University of California, Los Angeles, CA, USA;
advisor: Benny Sudakov; transferred to:

• 2013-2016: PhD in Mathematics,
ETH Zurich, Switzerland;
advisor: Benny Sudakov

Employment

• 2012-2013: Teaching assistant,
University of California, Los Angeles, CA, USA

• 2013-2016: Teaching assistant,
ETH Zurich, Switzerland


Publications

• Domination in 3-tournaments (with B. Sudakov), submitted.

• Saturation in random graphs (with B. Sudakov), Random Structures & Algorithms, to appear.

• A random triadic process (with Y. Peled and B. Sudakov), SIAM Journal on Discrete Mathematics 30 (2016), 1-19.

– An extended abstract for Eurocomb 2015 appeared in: Electronic Notes in Discrete Mathematics 49 (2015), 189-196.

• Decomposing random graphs into few cycles and edges (with M. Krivelevich and B. Sudakov), Combinatorics, Probability and Computing 24 (2015), 857-872.

• Ks,t-saturated bipartite graphs (with W. Gan and B. Sudakov), European Journal of Combinatorics 45 (2015), 12-20.

• Separating path systems (with V. Falgas-Ravry, T. Kittipassorn, S. Letzter and B. P. Narayanan), Journal of Combinatorics 5 (2014), 335-354.
