the harvard college mathematics · pdf filethe harvard college mathematics review is produced...

104
The Harvard College Mathematics Review ^ Vol. 2, No. Spring 2008 ^•is^SS^: In this issue: ^HRENIKSHAH A Taste of Elliptic Curve Cryptography THOMAS LAM Tiling with Commutative Rings DANA ROWLAND Twisting with Fibonacci MR A Student Publication of Harvard College

Upload: phamdung

Post on 07-Feb-2018

220 views

Category:

Documents


2 download

TRANSCRIPT

Page 1: The Harvard College Mathematics · PDF fileThe Harvard College Mathematics Review is produced and ... and in summer programs ... This new step in The HCMR's maturation is truly impressive

The Harvard College ■ ■Mathematics Review ^

Vol. 2, No. Spring 2008

^ • i s ^ S S ^ :

In this issue:

^HRENIKSHAHA Taste of Elliptic Curve Cryptography

THOMAS LAMTiling with Commutative Rings

DANA ROWLANDTwisting with Fibonacci MR

A Student Publication of Harvard College

Page 2: The Harvard College Mathematics · PDF fileThe Harvard College Mathematics Review is produced and ... and in summer programs ... This new step in The HCMR's maturation is truly impressive

Website. Further information about The HCMR can befound online at the journal's website.

http : //www . thehentr . org/ ( i )

Instructions for Authors. All submissions should include the name(s) of the author(s), institutional affiliations (ifany), and both postal and e-mail addresses at which the corresponding author may be reached. General questions shouldbe addressed to Editor-in-Chief Scott D. Kominers at hemr!hcs. harvard .edu.

Articles. The Harvard College Mathematics Review invitesthe submission of quality expository articles from undergraduate students. Articles may highlight any topic in undergraduate mathematics or in related fields, including computer science, physics, applied mathematics, statistics, and mathematical economics.

Authors may submit articles electronically, in .pdf, .ps,or .dvi format, to [email protected], or in hardcopy to

The Harvard College Mathematics ReviewStudent Organization Center at HillesBox # 36059 Shepard StreetCambridge, MA 02138.

Submissions should include an abstract and reference list. Figures, if used, must be of publication quality. If a paper isaccepted, high-resolution scans of hand drawn figures and/orscalable digital images (in a format such as .cps) will be required.

Problems. The HCMR welcomes submissions of originalproblems in all mathematical fields, as well as solutions topreviously proposed problems.

Proposers should send problem submissions to ProblemsEditor Zachary Abel at hcmr-problems@hcs . harvard .edu or to the address above. A complete solution or a detailedsketch of the solution should be included, if known.

Solutions should be sent to hemr-solutionsShcs .harvard.edu or to the address above. Solutions shouldinclude the problem reference number. All correct solutionswill be acknowledged in future issues, and the most outstanding solutions received will be published.

Advertising. Print, online, and classified advertisementsare available; detailed information regarding rates can befound on The HCMR's website, (1). Advertising inquiriesshould be directed to hemr- advertise@hcs .harvard,edu, addressed to Business Manager Charles Nathanson.

Subscriptions. One-year (two issue) subscriptions areavailable, at rates of $10.00 for students, $15.00 for other individuals, and $30.00 for institutions. Subscribers should mailchecks for the appropriate amount to The HCMR's postal address; confirmation e-mails should be directed to DistributionManager Nike Sun, at hemr-subscribes hcs. harvard,edu.

Sponsorship. Sponsoring The HCMR supports the undergraduate mathematics community and provides valuablehigh-level education to undergraduates in the field. Sponsorswill be listed in the print edition of The HCMR and on a special page on the The HCMR's website, (1). Sponsorship isavailable at the following levels:

Sponsor $0 - $99Fellow $100-$249Friend $250 - $499

Contributor $500-$1,999Donor $2,000 - $4,999Patron $5,000 - $9,999

Benefactor $10,000 +

Fellows ■ Dr. Barbara Currier • Ellen and William Kominers ■Teach for America • Contributors ■ The Harvard Undergraduate Council ■ QVT Financial LP • Patrons • The HarvardUniversity Mathematics DepartmentCover Image. The image on the cover shows a surfacewhose level sets are real elliptic curves. An application ofelliptic curves is the subject of this issue's "A Taste of Elliptic Curve Cryptography," by Shrenik Shah (p. 13). The image was created in Mathcmatica™ by Graphic Artist ZacharyAbel.

©2007-2008 The Harvard College Mathematics ReviewHarvard College

Cambridge, MA 02138

The Harvard College Mathematics Review is produced andedited by a student organization of Harvard College.

Page 3: The Harvard College Mathematics · PDF fileThe Harvard College Mathematics Review is produced and ... and in summer programs ... This new step in The HCMR's maturation is truly impressive

— -2 —Contents

From the EditorScott D. Kominers '09

Student Articles

1 Latin Squares and their Partial TransversalsN i k o l a o s R a p a n o s , K a s t r i t s i H i g h S c h o o l ' 0 8 4

2 A Taste of Elliptic Curve CryptographyS h r e n i k S h a h ' 0 9 1 3

3 Bridging the Group Definition GapM a t t h e w D a w s o n , U n i o n U n i v e r s i t y ' 0 8 3 4

4 Soliton Solutions of Integrable Systems and Hirota's MethodJ u s t i n C u r r y, M a s s a c h u s e t t s I n s t i t u t e o f Te c h n o l o g y ' 0 8 4 3

Faculty Feature Article

5 Tiling with Commutative RingsP r o f . T h o m a s L a m , H a r v a r d U n i v e r s i t y 6 0

6 Twisting with FibonacciP r o f . D a n a R o w l a n d , M e r r i m a c k C o l l e g e 6 6

Features

7 Mathematical Minutiae • i Has This Funny PropertyZ a c h a r y A b e l ' 1 0 a n d S c o t t D . K o m i n e r s ' 0 9 7 5

8 Statistics Corner • How Statisticians Discovered the Options Backdating ScandalR o b e r t W . S i n n o t t ' 0 9 7 8

9 Applied Mathematics Corner • Secret Sharing and ApplicationsP a b l o A z a r ' 0 9 8 3

10 My Favorite Problem • Linear Independence of RadicalsI u r i e B o r e i c o ' 1 1 8 7

1 1 P r o b l e m s 9 31 2 S o l u t i o n s 9 5

13 Endpaper • Math Has This Funny PropertyZ a c h a r y A b e l ' 1 0 a n d S c o t t D . K o m i n e r s ' 0 9 1 0 1

Page 4: The Harvard College Mathematics · PDF fileThe Harvard College Mathematics Review is produced and ... and in summer programs ... This new step in The HCMR's maturation is truly impressive

Editor-in-ChiefScott D. Kominers '09Business Manager

Charles Nathanson '09

Articles EditorShrenik Shah '09Features EditorFrancois Greer ' 11Problems EditorZachary Abel'10

Distribution ManagerNike Sun '09

Graphic ArtistZachary Abel'10

Cover and Logo DesignHannah Chung '09

Board of ReviewersZachary Abel'10Monica Allen '09Pablo Azar '09Eleanor Birrell '09Jeremy Booher' 10Justin Chen, Caltech '09Grant Dasher '09Francois Greer ' 11Andrea Hawksley, MIT '07Scott D. Kominers '09Menyoung Lee '10John Lesieutre '09Sam Lichtenstein '09Daniel Litt' 10Charles Nathanson '09Shrenik Shah '09Arnav Tripathy ' 11

Issue Production DirectorsErnest E. Fontes '10Zachary Abel'10Menyoung Lee' 10

WebmasterBrett Harrison'10

Board of Copy EditorsZachary Abel'10Pablo Azar '09

Eleanor Birrell '09Jeremy Booher'10Hannah Chung '09Grant Dasher '09

Ernest E. Fontes '10Brett Harrison '10

Scott D. Kominers '09John Lesieutre '09

Daniel Litt'10Shrenik Shah '09

Arnav Tripathy ' 11Joe Zimmerman' 10

Business BoardZachary Abel'10

Brett Harrison'10Scott D. Kominers '09

Daniel Litt'10Charles Nathanson '09

Nike Sun '09

Faculty AdvisersProfessor Benedict H. Gross '71, Harvard University

Professor Peter Kronheimer, Harvard UniversityDr. Alon Amit, Google

Page 5: The Harvard College Mathematics · PDF fileThe Harvard College Mathematics Review is produced and ... and in summer programs ... This new step in The HCMR's maturation is truly impressive

oFrom the Editor

Scott D. KominersHarvard University '09Cambridge, MA 02138

[email protected]

This spring, Professor John Duncan, the instructor teaching my algebraic geometry class, assigned supplementary readings from The Harvard College Mathematics Review (HCMR):

"Alexander Ellis's article gives an excellent taste of the Poincare-Hopf Index Theorem,"

Professor Duncan explained, strongly recommending that all the students read Ellis's article (Dunking donuts: Culinary calculations of the Euler characteristic, HCMR 1 #1 (2007), 3-14) to complement the day's lecture.

In a single year, The HCMR has already reached the status of a teaching tool. Indeed, articlesfrom The HCMR have appeared not only in Harvard classrooms but also in classes at other schools(e.g. Professor Ivars Peterson's "Communicating Mathematics" course) and in summer programs(e.g. Professor Keith Conrad's 2007 PROMYS lectures).

This new step in The HCMR's maturation is truly impressive. The HCMR could never havereached this level without its scholarly authors, nor could have it even existed without its dedicatededitors, tireless business staff, and sleepless production directors.

Furthermore, the success of The HCMR has been facilitated by the journal's sponsors and advisers, both at Harvard and elsewhere. In particular, we at The HCMR are especially indebted toProfessor Benedict H. Gross '71, Professor Peter Kronheimer, and Dr. Alon Amit, who haveadvised the journal and to Professor Clifford H. Taubes for his continued support and encouragement. We also appreciate the administrative assistance provided by Dean Paul J. McLoughlin IIand Mr. David R. Friedrich as well as the continued, generous support of The Harvard Mathematics Department.

This issue is my last as Editor-in-Chief. I am glad to see that The HCMR's future looks sobright at this juncture.

The incoming Co-Editors-In-Chief, Zachary Abel '10 and Ernest E. Fontes '10, are bothdedicated members of The HCMR's staff. Both are talented mathematicians and energetic leaders,and both have been deeply involved in the production of this and past issues. I look forward toseeing The HCMR continue to expand under their joint administration.

On a personal note, I would like to thank those teachers most responsible for my involvementin the founding of The HCMR, {Mrs. Susan Schwartz Wildstrom, Professor Noam D. Elkies,Professor Andreea C. Nicoara}, and to wish good luck to the new Co-Editors-In-Chief.

Scott D. Kominers '09Editor-in-Chief, The HCMR

Page 6: The Harvard College Mathematics · PDF fileThe Harvard College Mathematics Review is produced and ... and in summer programs ... This new step in The HCMR's maturation is truly impressive

STUDENT ARTICLE1

Latin Squares and their PartialTransversals

Nikolaos RapanostKastritsi High School '08

Patras, [email protected]

AbstractWe introduce the theory of Latin squares and their partial transversals. Furthermore, we presentthe history and some applications of Latin squares. We also prove using elementary methods thata 6 x 6 Latin square has a partial transversal of length 5. After developing some of the theory ofso-called partial Latin squares we provide a simple proof of a result originally due to Woolbright,which gives a lower bound for the length of the longest partial transversal in an n x n Latin square.Followingly we improve slightly the best known lower bound for the length of the longest partialtransversal in an n x n Latin square.

1.1 IntroductionA Latin square of order n is an n x n square matrix whose cells consist of n distinct symbolssuch that each symbol appears exactly once in each row and each column. The theory of theseobjects has a long history. The earliest written reference of Latin squares is the solution to thecard problem published in 1723 (see [DK] for further information). Euler (1779) initiated thesystematic development of the theory, when he posed 'The Problem of the 36 Officers" in his paper"Recherches sur une nouvelle espece de quarre magique".1 Later, Cay ley (1877-1890) showed thatthe multiplication table of a group is a certain type of a Latin square. Latin squares played animportant role in the foundation of finite geometries, a subject which was also in development atthe turn of the nineteenth century. In the 1930s, a major application of Latin squares was found byFisher who used them and other combinatorial structures in the design of statistical experiments.Latin squares also find applications in computer science and in the construction of error-correctingtelegraph codes.

More on the history and applications of Latin squares can be found in Section 1.2 below. InSection 1.3 we define the concept of a partial transversal of length j of an n x n Latin square. Weintroduce an operation # which will enable us to prove some existence results for partial transversals. We then survey some lower bounds on the length of the longest partial transversal in annxnLatin square. In Section 1.4 we define partial Latin squares and use some of their properties toprove one of the aforementioned lower bounds. We conclude by sketching how to improve on theproven bound.

1.2 History and Applications of Latin SquaresLatin squares have a long history, stretching back at least as far as medieval Islam, when they wereused on amulets. Abu 1'Abbas al Buni wrote about them and constructed, for example, 4x4 Latin

t Nikolaos Rapanos has a high school senior at Kastritsi High School in Patras, Greece. He is member of hisNational Math team during the last 6 years and he has been distinguished in various math competitions, suchas the IMO-BMO. His mathematical interests include Analysis, Geometry, and Inequalities. Beyond math, hisinterests include physics, music, tennis and waterskiing. He plans to study Engineering, although a doublemajor in mathematics is increasingly likely.

English: Investigations on a new species of magic square.

Page 7: The Harvard College Mathematics · PDF fileThe Harvard College Mathematics Review is produced and ... and in summer programs ... This new step in The HCMR's maturation is truly impressive

Nikolaos Rapanos—Latin Squares and their Partial Transversals 5

squares using letters from a name of God. In his famous etching Melancholia, the fifteenth centuryartist Albrecht DUrer portrays a 4 x 4 magic square, a relative of Latin squares, in the background.Other early references to them concern the problem of placing the 16 face cards of an ordinaryplaying deck in the form of a square so that no row, column, or diagonal should contain more thanone card of each suit and each rank. This is known as "The Card Problem."

The systematic treatment of Latin squares started by Leonhard Euler in 1779, when he posed"The Problem of the 36 Officers." This problem was not solved until the beginning of the twentiethcentury. The problem was to arrange 36 officers, each having one of six different ranks and belonging to one of six different regiments, in a square formation 6 x 6, so that each row and each filecontained just one officer of each rank and just one from each regiment. Working on this problem,Euler defined Graeco-Latin—or orthogonal—Latin squares as a pair of n x n Latin squares sothat when one is superimposed on the other, each of the n2 combinations of the symbols (takingthe order of the superimposition into account) occurs exactly once in the n2 cells of the array. "TheProblem of the 36 Officers" can be solved by finding a pair of orthogonal Latin squares of order 6.Euler knew (c. 1780) that there was not a Graeco-Latin square of order 2 and knew constructionsof Graeco-Latin squares for n odd and for n divisible by 4, but he was unable either to find a pairof orthogonal Latin squares of order 6 or prove that they did not exist. Based on much experimentation, he conjectured that Graeco-Latin squares did not exist for n = 2 (mod 4) (oddly evenintegers according to his notation).

In 1901, Gaston Tarry proved (by exhaustive enumeration of the possible cases) that thereexists no Graeco-Latin square of order 6, adding evidence to Euler's conjecture. However, in 1959,Parker, Bose and Shrikhande were able to construct an order 10 Graeco-Latin square, and provideda construction for the remaining even values of n that are not divisible by 4 (of course, exceptingn = 2 and n = 6).

Many more applications of Latin squares were developed in the 20th century. Arthur Cayleycontinued work on Latin squares and in the 1930s the concept arose again in the guise of multiplication tables when the theory of quasi-groups and loops began to be developed as a generalizationof the group concept.

Latin squares played an important role in the foundation of finite geometries, a subject whichwas also in development at this time. A large application area for Latin squares was opened byFisher who used them and other combinatorial structures in the design of statistical experiments.Sets of Latin squares that are orthogonal to each other have found an application as error-correctingcodes in situations where communication is disturbed by noise besides simple white noise, such aswhen attempting to transmit broadband internet over powerlines. Latin squares are also useful forscheduling round-robin tournaments. As a matching procedure, Latin squares relate to problemsin graph theory, job assignment (known as the Marriage Problem), and processor scheduling formassively parallel computer systems. More about applications of Latin squares can be found in[DK].

An interesting open problem seeks a formula L(n) for the number of nxn Latin squares. Thenumber of n x n Latin squares is known to increase very fast—in fact L(ri) > YY}=1 j\—but theexact formula is not yet known.

1.3 Partial Transversals1.3.1 DefinitionsDefine a partial transversal of an n x n Latin square to be a set of n cells, one from each row andcolumn. A partial transversal has length j if it contains j distinct symbols. For j = n the partialtransversal is called a transversal.

We continue with the proof of an elementary fact as an example.

Proposition 1. Every 6x6 Latin square has a partial transversal of length at least 4.

Proof. Assume that there exists a 6 x 6 Latin square whose maximum partial transversal has lengthless than 4. Then one can find at least 2 same symbols at every partial transversal of the Latinsquare. Thus, we may assume that our Latin square looks like the one in Figure 1.3.1.

Page 8: The Harvard College Mathematics · PDF fileThe Harvard College Mathematics Review is produced and ... and in summer programs ... This new step in The HCMR's maturation is truly impressive

The Harvard College Mathematics Review 2.1

a

b X y

ci lB ' m

C

Je

fFigure 1.1: Every 6x6 Latin square has a partial transversal of length at least 4.

It is clear that symbols x and y are both different than c and at least one of them is different thana. Without loss of generality, we may examine only the case in which x is different than a, thenone of the gray-colored cells, say the (io, jo), must contain a symbol that is neither a nor x. Notethat all the gray-colored cells contain symbols that are different than c. Therefore it is clear that thepartial transversal consisted of the cells {(1,1), (2,3), (5,5), (io, jo)} and two more (acceptable)cells and contains at least 4 distinct symbols—or in other words, its length is at least 4. We works imi lar ly i f y is d i f ferent than a and f in ish the proof as above. □1.3.2 The operation #Given annxn Latin square and a partial transversal T of length n — k with k > 2, one can findanother partial transversal of equal or greater length in the following manner:

1. Choose two cells (ii, ji) and (i2, j2) of T such that T - {(ii, ji), (12,32)} contains n-kdistinct symbols. Since k > 2, there exist two such cells. These two cells can either containtwo distinct duplicated symbols, or two occurrences of the same symbol, if this symbolappears in the transversal at least three times.

2. Replace these two cells with the cells (h, J2) and (2*2, ji).

Since we chose cells containing duplicated symbols, the new partial transversal has length at leastn - fc, as each of the symbols in the original transversal is represented in one of the unchangedcells. An example of this operation is given in Figure 1.2. We call this operation #, a notationchosen for its shape.Proposition 2. Every 6x6 Latin square has a partial transversal of length at least 5.Proof This proposition follows directly from Drake's result in [Dr], but we also give another proofas a motivating example for the use of operation #. Let us assume that there is no partial transversalof length 5. There exists a partial transversal of length 4 according to Proposition 1, so there willbe a partial transversal containing a multiset2 of symbols either of the form {a, a, b, b, c, d} or theform {a, a, a, b, c, d}.Case 3. There is a partial transversal containing symbols of the form {a, a, b, b, c, d}. An illustration for this case is presented in Figure 1.2.

We may assume without loss of generality that this partial transversal is on the diagonal, andcall it T0. We can apply # to the cells (1,1) and (3,3) in T0 to get a new partial transversal Tx.

2 A set whose elements may be repeated is called a multiset.

Page 9: The Harvard College Mathematics · PDF fileThe Harvard College Mathematics Review is produced and ... and in summer programs ... This new step in The HCMR's maturation is truly impressive

Nikolaos Rapanos—Latin Squares and their Partial Transversals 7

ak

ai b

b

c

d

Figure 1.2: An example of the operation #. In this case, we replace the cells (1,1) and (3,3)in the partial transversal on the diagonal with the cells (1,3) and (3,1) to obtain another partialtransversal, also of length 4.

By our hypotheses, the new cells (1,3) and (3,1) in T\ must contain a symbol chosen from the setc, d. Because of symmetry, we only need to analyze two cases: either both symbols are the sameor there is one c and one d.Subcase 3.1. Both symbols obtained from the above application of # are c's.

Since both are c s, we can apply # to the cells (1,3) and (5,5) in T\ to obtain a new partialtransversal T2 (as shown in Figure 1.3(a)), and we discover that the symbols in (1,5) and (5,3)must be chosen from the set {a, b, d}. On the other hand, applying # to the cells (1,1) and (4,4)in To gives us a partial transversal T3, and we discover that the symbols in (1,4) and (4,1) mustboth contain d (see Figure 1.3(b)). We can now apply # to the cells (1,4) and (6,6) in T3 toobtain T4, and discover that the symbols in (1,6) and (6,4) must be chosen from the set {a, b, c}.We know that our Latin square looks like Figure 1.3(b), where the #' s are symbols from the set{a, b, c, d}. The first row contains five distinct symbols from the set {a, b, c, d}, a contradiction bythe pigeonhole principle.Subcase 3.2. The symbols of the cells (1,3) and (3,1) are different and chosen from the set {c, d}.

Without loss of generality, we can assume that the cell (3,1) contains a c and the cell (1,3)contains a d. The partial transversal T/ is the one obtained by # to the cells (1,1) and (3,3) inTo . Applying # to the cells (1,1) and (4,4) in To, we obtain a new partial transversal T5. Clearly,the cell (4,1) must be filled with a d and the cell (1,4) with a c. Note, that cells obtained after anapplication of # cannot contain another symbol, say e, because then we would derive directly ourproposition. This is the reason, for which we assume that each cell obtained by # must contain asymbol from the set {a, b, c, d}. The partial transversal Te contains all three c' s of our Latin squareand the cells (2,2), (4,3) and (6,6). Applying # to the cells (1,3) and (5,5) in Te, we obtain a newpartial transversal T7, and we discover that the cell (1,5) must contain a b. Applying # to the cells(6,6) and (1,4), we obtain a new partial transversal T& containing the cell (6,1). As we noted above,this cell obtained by # must contain a symbol from the set {a, b, c, d}. This yields a contradiction,since the first row already contains these symbols.Case 4. We are left with the case, in which the partial transversal (we may assume it is the diagonal)contains symbols of the form {a, a, a, b, c, d}.

We may assume, without loss of generality, that the diagonal contains the symbols {a, a, a, b, c,d} in this order. Call this partial transversal To. Applying # to the cells (1,1) and (2,2) of To, wediscover that the cells x\ = (1,2) and X2 = (2,1) must contain symbols of the set {b,c,d).Applying # to the cells (1,1) and (3,3) of To, we discover that the cells X3 = (1,3) and x± = (3,1)must also contain symbols from the set {b, c, d}. Since x\ ^ X3, we understand that one of these

Page 10: The Harvard College Mathematics · PDF fileThe Harvard College Mathematics Review is produced and ... and in summer programs ... This new step in The HCMR's maturation is truly impressive

The Harvard College Mathematics Review 2.1

a ok a c d X Xv

a a

c b c b

b d b

u 6 y c

d X d(a) (b)

Figure 1.3: (a) Another example of the operation #. Here, we replace the cells (1,3) and (5,5)in the partial transversal indicated in bold with the cells containing symbols x and y. If this Latinsquare has no partial transversal of length greater than 4, we must have x G {6, d} and y e {a, d}.(b) After two more applications of #, we know that if the Latin square had no partial transversal oflength greater than 4, then it would appear as pictured, where the symbols x are chosen from theset {a, b, c, d}.

two cells contains a different symbol of that in cell x±. Without loss of generality, we assume that1:3 ^ X4. If we consider the partial transversal containing {#3, #4, (2,2), (4,4), (5,5), (6,6)},t h e n w e h a v e r e d u c e d t h i s c a s e t o c a s e 3 . □1.3.3 Open Problems on Latin SquaresKoksma [Ko] showed that an n x n Latin square has a partial transversal of length at least ^ti.This was improved by Drake [Dr] to ^p and then simultaneously by Brouwer et al. [BdVW]and Woolbright [Wo] to n — y/n. This was in turn improved by Hatami and Shor [HS] to n —11.0525(logn)2. We obtain a different lower bound which, while still of order n — O(logn)2,slightly improves upon Hatami and Shor's constant found in [HS]. Both results fall short of twoconjectures, Ryser's conjecture of n for odd n, and Brualdi's Conjecture of n — 1. The difficultyof these problems is based on the fact that for large Latin squares, brute force is intractable. Inparticular, the two conjectures are stated as follows:Conjecture 5 (Ryser). Any Latin square of odd order has a transversal.Conjecture 6 (Brualdi). Any Latin square of order n has a partial transversal of length at leastn - l .

1.4 Partial Latin Squares and a Proof of Woolbright's TheoremWe define a partial Latin square as an n x n square matrix of cells, some of which contain asymbol such that no symbol appears twice in any row or column. A partial transversal of ann x n partial Latin square is a set of n filled cells, one from each row and column. We say thata partial transversal of a partial Latin square is of length j if it contains j distinct symbols. Anmxm subsquare S' of an n x n partial Latin square S is the set of ra2 cells in some subset of rarows and some subset of ra columns of the partial Latin square, where some filled cells of S maypossibly be replaced by empty cells in S'.

Consider a Latin square with a partial transversal of maximum length n — k, where k > 2.By applying # to this partial transversal, we obtain other partial transversals, whose lengths must

Page 11: The Harvard College Mathematics · PDF fileThe Harvard College Mathematics Review is produced and ... and in summer programs ... This new step in The HCMR's maturation is truly impressive

Nikolaos Rapanos—Latin Squares and their Partial Transversals 9

also ben — k and whose set of symbols is the same as the first. Applying # repeatedly to thesepartial transversals, we eventually obtain a maximal set of partial transversals closed under #. Allof these partial transversals contain the same set of n - k distinct symbols, so by ignoring all cellsexcept those in this set of partial transversals, we obtain a partial Latin square S containing n — kdistinct symbols and a set T of partial transversals of S closed under #. We will call this pair (S, T)a partial Latin square satisfying Ak. More formally we have the following definition.Definition 7. Ann x n partial Latin square satisfying Ak is an n x n array of cells, some ofwhich contain a symbol, together with a nonempty set T of partial transversals of length n — k, thatsatisfies the following properties:

1. Each filled cell must appear in at least one of the partial transversals in T.

2. The set T of partial transversals must form a connected graph under #.3

3. The set T of partial transversals must be closed under the operation #.

We consider a partial Latin subsquare (S', T') satisfying Ay of a partial Latin square (S, T)satisfying Ak. In this case, we require that T' be a subset of T restricted to S\ i.e. that

T ' C { r n S ' : T€T } .Note that the case analysis in the proof of Proposition 2 on Latin squares of size 6 sketched earliershows that any partial Latin square satisfying A2 must have size at least 7. Namely, we have alreadyproved that every 6x6 Latin square has a partial transversal of length at least 5.Lemma 8. If(S, T) is a partial Latin square satisfying Ak such that no subsquare satisfies AkJhenno cell is contained in all partial transversals in T. In other words, there is no fixed cell. That is,given a filled cell (i,j) and a partial transversal containing (i,j), by a sequence of operations #,one can obtain a partial transversal in T not containing (i, j).Proof. Suppose there is a cell (i,j) contained in all partial transversals. We will call this a fixedcell. Let a be the symbol in this cell. If a appears anywhere else in the partial transversal, then thereis a partial transversal containing both a's (the second a appears in some partial transversal sinceevery filled cell does, and this partial transversal must contain the first a since all partial transversalsdo). We can then apply # to this partial transversal to obtain a partial transversal without the fixedcell, which is a contradiction. We are left with the case in which a does not appear anywhere elsein the partial Latin square. By deleting the row and column containing the a, one finds a subsquares a t i s f y i n g A k , a c o n t r a d i c t i o n o f t h e h y p o t h e s i s . □Proposition 9. Given a minimal partial Latin square (S, T) satisfying Ak and any filled cell, thereexists a partial transversal in T containing both that cell and another one with the same symbol.Proof. We may assume that there is only one cell containing the symbol a. We already know (fromDefinition 7) that there must be a partial transversal in T passing through a. Choose a cell on suchpartial transversal and call it T\. By Lemma 8 no cell is fixed, so there is a partial transversal T\in T that does not pass through a. On the other hand, the set T of partial transversals must form aconnected graph under the operat ion #, which is a contradict ion. □Corollary 10. Given a minimal partial Latin square (S, T) satisfying Ak, there exists a subsquareof S satisfying Ak-i.Proof. We can choose any filled cell, say (1,1) containing a, and a partial transversal To passingthrough it that duplicates a. Now consider the set of partial transversals containing at least twoa's, including the one in cell (1,1), which are generated by a sequence of operations # starting withTo. This is exactly the set of partial transversals generated by # starting from the partial transversalTo' = To — (1,1) in the subsquare formed by deleting the first row and column. Taking this set of

3 When we say that the set T of partial transversals must form a connected graph under #, we mean thatany transversal in T can be obtained from any other transversal in T by repeated applications of #.

Page 12: The Harvard College Mathematics · PDF fileThe Harvard College Mathematics Review is produced and ... and in summer programs ... This new step in The HCMR's maturation is truly impressive

1 0 T h e H a r v a r d C o l l e g e M a t h e m a t i c s R e v i e w 2 . 1

partial transversals gives an (n-l) x (n- 1) partial Latin square satisfying Ak-i. Note that thissubsquare may have some empty cells which were filled in the original nxn square. □

We now state a lemma, theorem, and corollary proven by Hatami and Shor [HS]:Lemma 11. In an (n — 1) x (n — 1) partial Latin square satisfying Ak-i induced as describedabove from an n x n partial Latin square satisfying Ak, the partial transversals generated by #must have a fixed cell,that is some cell that appears in all of these partial transversals.Theorem 12. In a minimal partial Latin square S satisfying Ak, there are at least nk-i + k filledcells in each row and column, where nk-i is the size of the smallest subsquare of S satisfyingAk- i .Corollary 13. If we let nk = nbe the size of the original partial Latin square satisfying Ak, then

n k > n k - i + 2 k . ( 1 . 1 )

Proof The larger square has nk - k distinct symbols, of which at least nk-i + k appear in eachr o w a n d c o l u m n . □1.4.1 An elegant proof of Woolbright's TheoremBased on our previous analysis, we can give a very short and elegant proof for Woolbright's result[Wo]. Note that the original proof in [Wo] is very complicated.

Let Sk be a partial Latin square satisfying Ak such that no subsquare satisfies Ak. We call sucha partial Latin square minimal. It was shown in Section 1.4 that there must be a subsquare of Sksatisfying Ak-i. Choose Sk-i to be the smallest subsquare of Sk satisfying Ak-i and, recursively,Sm to be the smallest subsquare of Sm+i satisfying Amt until the sequence ends at S\. Denote bynj the size of Sj. In this terminology, Proposition 1 can be expressed as

n 2 > 7 . ( 1 . 2 )

Theorem 14 (Woolbright, 1978). Every nxn Latin square has a partial transversal of length atleast n — y/n.

Proof Suppose that we are given anxn Latin square which does not have a partial transversalof length more than n — k. It follows from the definition of nk, that n — k > nk — k. We havealready shown that nk — nk-i > 2k in Corollary 13. Adding up, we get:

nk — nk-i > 2-kn k - i - n k - 2 > 2 - ( f c - l )

n 3 - n 2 > 2 - 3n 2 — n \ > 2 - 2

▶ nk = 2(1 + 2 + • • • + k) - 2 = k2 + k - 2.

Thus nk > k2, or in other words k < y/n~k~. Consequently, there must be a partial transversalo f l e n g t h a t l e a s t n — y / n . □

The reader may enjoy trying to derive the following inequality for the sequence {nk}'-Theorem 15. In Sk as defined above, for all j < k,

(nk -nj)(nk-i +nj - n* + fc) < nj(rij - ra^-i - 2j) + (n* -nj)(nk -k-rtj + j). (1.3)Note that inequality (1.3) simplifies to

(nk - rij)(2nj -f nk-i - 2nk + 2k - j) < rij(rij - rij-i - 2j). (1.4)

Suppose we have a Latin square, which has no partial transversal of length more than n — I.By the previous sections, there is a sequence ri2 < n3 < • ■ • < nj satisfying the inequalities (1.2)

Page 13: The Harvard College Mathematics · PDF fileThe Harvard College Mathematics Review is produced and ... and in summer programs ... This new step in The HCMR's maturation is truly impressive

Nikolaos Rapanos—Latin Squares and their Partial Transversals 11

and (1.3) defined earlier. These inequalities hold for 1 < j < k < I. We want to place a lowerbound on the expression n-l. Since n - / > nk - k, it is sufficient to place a lower bound on theexpression nk — k.

Using the above inequalities we can prove that in an n x n Latin square, there is a partialtransversal of length at least n - 11.0368(log n)2. Although for large n we have

n - 11.0368(logn)2 >n-\/n,

for small values of n the inequality flips. Finally we can state that every nxn Latin square has apartial transversal of length at least

max {n — y/n, n — 11.0368(log n)2 } .

However we omit the proof of this research since it lies outside of the scope of this article.Now we give an example of a sequence satisfying inequalities (1.2) and (1.3).

Proposition 16. The sequence u2 = 7 and uk = a3Lv^J (a - 6fe"Lv^J J for any k > 2, witha > 2 3 and b < \, satisfies the inequalities (1.2) and (1.3).

This proposition shows that inequality (1.3) cannot imply anything better than

n - f e )(logn)2

since the sequence {uk} satisfies all the conditions.In our research, we only needed to examine the case in which {uk} is an upper bound for

the sequence {nk}. This is because if there exists an N such that nj > Uj for all j > N, thenftj — j > Uj —j which gives a much better lower bound for the length of a partial transversal than inthe previous case. Furthermore, the slowest growing sequence, that satisfies the inequalities wouldbe a lower bound for the sequence {nk}. We were not able to find the slowest growing sequence,but we can show that this sequence is bounded below by 2™ and above by

exp I \lklog - log

for some constant c.

1.5 AcknowledgmentsI would like to thank my mentor Dr. Timothy Nguyen for his guidance and his support during thementorship, Professor David Jerison for overseeing my project, and my tutor Ms. Allison Gilmorefor all her assistance. I would also like to thank the Center for Excellence in Education, the Research Science Institute and the Massachusetts Institute of Technology for providing me with thesupport and opportunity to perform this research. I am thankful to the staff of The HCMR for theirassistance and creative cooperation.

References[BdVW] Andries E. Brouwer, A. J. de Vries, and R. M. A. Wieringa: A lower bound for the length

of partial transversals in a Latin square, Nieuw Arch. Wisk. 26 #3 (1978), 330-332.

[Br] Victor Bryant: Aspects of Combinatorics. Cambridge: Cambridge Univ. Press 1993.

[Ca] Peter J. Cameron: Combinatorics. Cambridge: Cambridge Univ. Press 1994.

Page 14: The Harvard College Mathematics · PDF fileThe Harvard College Mathematics Review is produced and ... and in summer programs ... This new step in The HCMR's maturation is truly impressive

1 2 T h e H a r v a r d C o l l e g e M a t h e m a t i c s R e v i e w 2 . 1

[DK] Jozsef D6nes and A. Donald Keedwell: Latin squares and their applications. New York:Academic Press, 1974.

[Dr] David A. Drake: Maximal sets of Latin squares and partial transversals. J. Stat. PI. I. 1(1977), 143-149.

[HS] Pooya Hatami and Peter W. Shor: A Lower Bound for the length of a Partial Transversalin a Latin Square, to appear in J. Combinatorial Theory Ser. A (2008).

[Ko] K. K. Koksma: A lower bound for the order of a partial transversal in a Latin Square, J.Combinatorial Theory Ser. A 7 (1969), 94-95.

[Sh] Peter W Shor: A lower bound on the length of a partial transversal in a Latin square, J.Combinatorial Theory Ser. A 33 #1 (1982), 1-8.

[Wo] David E. Woolbright: An n x n Latin square has a transversal with at least n - y/ndistinct elements, J. Combinatorial Theory Ser. A, 24 #2 (1978), 235-237.

Page 15: The Harvard College Mathematics · PDF fileThe Harvard College Mathematics Review is produced and ... and in summer programs ... This new step in The HCMR's maturation is truly impressive

STUDENT ARTICLE

A Taste of Elliptic Curve CryptographyShrenik Shah*

Harvard University '09Cambridge, MA 02138

[email protected]

AbstractThis paper develops several classical algorithms and cryptosystems in cryptography, and developsthe theory of elliptic curves to reveal the improvements provided by elliptic curve cryptography.The prerequisites to this paper are an understanding of groups, fields, and some elementary numbertheory.

2.1 IntroductionElliptic curve cryptography is not only a surprising application of a deep and powerful area ofnumber theory to computer science, but as we shall see, is also a very practical technique that isused today around the world. In this article we present a small taste of cryptography, presentingsome classical cryptographic protocols and factorization methods. After this we describe some ofthe mathematical theory of elliptic curves. These are combined in a section that shows how touse elliptic curves in the earlier protocols, and explains why these represent an improvement overtraditional methods, while citing some limitations to this viewpoint.

It is impossible, of course, to give anything approximating a complete account of these subjectsin a short article. Thus I strive to give an overview of the structure of these fields by including afew examples from cryptography and developing the theory needed to present the correspondingelliptic curve cryptosystems. Moreover, the paper is intended for one who is familiar with thebasic properties of groups and fields, as well as elementary number theory, but with no exposure tocryptography or elliptic curves. Section 2.4 together with Section 2.5.2.2 and Section 2.5.4 sketchthe proofs of some key results that follow from the existence of the Weil pairing, an algebraicstructure defined on an elliptic curve, and require some additional mathematical maturity. SinceSection 2.3 makes no use of Section 2.2, some readers may want to begin with Section 2.3, and useSection 2.2 as a reference when reading Sections 2.4 and 2.5.

2.2 CryptographyAt the heart of cryptography is security—cryptography allows one to very carefully control the information and powers available in a system to various parties in a way that cannot be manipulatedor broken by dishonest participants. Thus cryptosystems are often tested for resistance to various"attacks" and required to have a very small probability of being broken. Most of classical cryptography proves results that are conditional on central assumptions. These assumptions are made tobe general enough so that they do not rely on the difficulty of a specific problem, such as factoringintegers. However, the central assumptions are far from being proven, so cryptography in somesense rests on fragile ground. This section will describe these assumptions as well as some types ofcryptosystems that can be constructed under these assumptions. It will also present classical cryptosystems as concrete examples. These cryptosystems will then be modified to use elliptic curveslater in this paper. The last subsection will explain some of the mathematics behind the problem of

tShrenik Shah, Harvard '09, is a mathematics concentrator and English minor. He is also enrolled in aconcurrent masters program in computer science. He is a founding member of The HCMR and currently servesas Articles Editor.

13

Page 16: The Harvard College Mathematics · PDF fileThe Harvard College Mathematics Review is produced and ... and in summer programs ... This new step in The HCMR's maturation is truly impressive

1 4 T h e H a r v a r d C o l l e g e M a t h e m a t i c s R e v i e w 2 . 1

factoring integers and describe classical algorithms that aim to perform this task more efficientlythan the usual brute force methods.

2.2.1 Theory of ComputationThe most common formalization for the notion of a "computer" is the Turing machine. For ourpurposes, we will instead refer nonrigorously to the notion of an algorithm and assume that allbasic operations, such as addition and multiplication, all take a single time step. These assumptionsserve our purposes because we will not analyze the precise complexity of the algorithms we study.The time an algorithm takes is measured as a function of the input size, where the input is a stringof binary digits. For the interested reader, [Si] is a good introduction to the formal theory ofcomputation.

An algorithm can either take as an input a string in binary and output "accept" or "reject," ortake in a string in binary and output another string in binary. An algorithm of the latter sort issaid to be computing a function. Any subset of {0,1}*, the set of all binary strings, is termed alanguage. The subset of {0,1}* accepted by an algorithm A is a language said to be decided byA. The time t(n) taken by the algorithm is measured as the maximum number of time steps for aninput of length n. The class P consists of all languages that can be decided by an algorithm thatruns in polynomial time, meaning that t(n) < p(n) for some polynomial p.

Some languages have the property that there exists some string (specific to a particular stringx G L) with which membership in the language can be verified quickly. For example, those familiarwith basic graph theory will recall the definition of a Hamiltonian path, which is a path that visitsevery vertex exactly once. It is conjectured to be a difficult computational problem to determine,for a given graph G, whether such a path exists. However, given such a path, an algorithm can veryeasily verify that it is indeed a Hamiltonian path. The description of the path is called a witnesstestifying to the existence of a Hamiltonian path for G.

The class NP consists of all languages L with the following property: There exists a polynomial-time algorithm A such that if x G L, then there exists a witness w such that A(x, w) = 1.However, ifxfiL, then A(x, w) = 0 for all choices of w. It is a very well-known open problemto determine whether P and NP are equal (or not).

In fact, the problem of determining whether a Hamiltonian path exists is called NP—completebecause it can be proven that for every language L in the class NP, there is a polynomial timealgorithm that maps any x G L to graphs that have a Hamiltonian path and x £ L to graphsthat have none. Thus, if a polynomial time algorithm were ever written to determine whether aHamiltonian path exists, an algorithm could be then developed to decide any language in NP. Bycontrapositive, if any NP program were ever proven to have no polynomial time algorithm, thentesting for existence of a Hamiltonian path could not be done in polynomial time either.

It seems, then, that proving that P ■=/ NP would be rather powerful, in that it shows that everyNP-complete problem is difficult to solve. Unfortunately, this difficulty is worst-case hardness,which means that for every algorithm, all that is known is that there exists an input on which it iswrong, not that no algorithm can solve the problem for most inputs. Thus, although the idea ofbasing cryptography on the assumption that P NP is an attractive notion, especially since theseclasses seem "surely" different, a much stronger assumption is needed, as will be seen in the nextsection.

The last notions needed are merely some technical definitions: A function v : N —▶ R isnegligible if for any exponent a > 0 there exists a constant ca such that n~a > v(n) for n > ca.Roughly, if v eventually shrinks faster than any inverse polynomial, then v is negligible. We definethe big-O notation f(n) = 0(g(n)) to mean that there exists N G N and C > 0 such that forn > N, \f(n)\ < C\g(n)\, and f(n) = S(g(n)) to mean that there exists c, C > 0 such thatc\g(n)\ < |/(n)| < C\g(n)\, again for n > N.

Finally, a probabilistic polynomial time (PPT) algorithm is an algorithm with the additionalproperty that it may generate uniform random bits whenever it wishes. In order words, a PPTalgorithm may, in addition to following deterministic instructions, flip a fair two-sided coin.

Page 17: The Harvard College Mathematics · PDF fileThe Harvard College Mathematics Review is produced and ... and in summer programs ... This new step in The HCMR's maturation is truly impressive

Shren ik Shah—A Tas te o f E l l i p t i c Curve Cryp tography 15

2.2.2 Central AssumptionsThis section details the main assumptions of cryptography. These definitions are taken from [GB].The main assumption of cryptography is that one-way functions exist. A one-way function / :{0,1}* —▶ {0,1}* has two properties:

• There is a polynomial-time algorithm that computes the one-way function.

• For any PPT A, on a randomly chosen input x G {0, l}n with uniform distribution, thereexists a negligible function va (allowed to depend on A) such that the probability thatA(ln,f(x)) = z where f(z) = f(x) is less than or equal to VA(n) (for n sufficientlylarge). (Note that ln = 1... 1 with 1 repeated n times.)

The second property above is a mouthful, so we'll satisfy ourselves with an imprecise definition:given the output on a randomly chosen input of a one-way function, a polynomially bounded algorithm has a low probability of finding an input that would produce the same output. Note that theln in the input to A is to ensure that its running time is based on the size of the space of possiblex, rather than on the output f(x), which could be much smaller.

The existence of one-way functions implies that P ^ NP, but the converse is not true. Thefield of average-case complexity strives to prove such an equivalence. Unfortunately, many of theresults related to this question are pessimistic, and imply that proving an equivalence would requirefairly sophisticated, non-intuitive methods.

Some candidate one-way functions:• Computing the product n = pq of two prime numbers p, q. We will discuss later in this

paper some algorithms faster than the naive 0(y/n) algorithm for factoring such numbers.• The discrete logarithm problem:. Given the multiplicative group (Z/pZ)x,gx, and g,

where g is a generator of this group, determine x.

• Computing modular square roots: Given a quadratic residue a mod n, compute a value xsuch that x2 = a mod n.

As a notational remark, we will use, as above, a = b mod n for n | (a — b). We will frequentlyuse = for =, particularly when working in the ring Z/nZ. We will also use (a, b) for the greatestcommon divisor of a and b.

Another problem that is conjectured to be hard is the Diffie-Hellman problem. In (Z/pZ)x,given the generator g together with gx and gv, the problem is to compute gxy. This problem isimportant in several protocols that we will discuss later. The hardness of this problem is frequentlyassumed by cryptographers. This is known as the Diffie-Hellman assumption. If one has a solutionto the discrete logarithm problem for either g, gx or g, gy, one can certainly compute gxy, so theDiffie-Hellman assumption is stronger than the assumption that computing discrete logarithms ishard.

As it turns out, cryptographers do not yet know how to create certain encryption schemes (discussed later) without an even stronger assumption. This is that a trapdoor function / : {0,1}* —▶{0,1}* exists, with the following properties:

• The function / is one-way.

• There exists a PPT algorithm T so that for every input length n, there exists a tn G {0,1}*of length bounded by a polynomial p, and for all x G {0, l}n, T(f(x),tn) — z with theproperty that f(z) = f(x).

• It is also important that a trapdoor function can be generated together with its trapdoors tnefficiently.

Trapdoor functions essentially have a built in method to invert the function, which is important ifone party computes the function at a point, while another needs to invert this computation in orderto access information. This occurs in public key encryption, described in the next section.

Page 18: The Harvard College Mathematics · PDF fileThe Harvard College Mathematics Review is produced and ... and in summer programs ... This new step in The HCMR's maturation is truly impressive

1 6 T h e H a r v a r d C o l l e g e M a t h e m a t i c s R e v i e w 2 . 1

2.2.3 Types of Cryptographic SchemesCryptographic schemes are simple tasks that represent common tasks requiring security againsta very specific breach. This is in contrast to cryptographic protocols, which are often rathercomplex, and composed of many different schemes. Election systems and auction systems areexamples of such protocols. In this section, we describe a few of the most common schemes in anonrigorous fashion, followed by examples. Texts such as [Gol] and [GB] are excellent referencesfor those interested in a formal treatment.2.2.3.1 Public Key EncryptionIn a public key encryption scheme, Alice wants to send a message M to Bob, while an eavesdropperEve is listening over the channel. Bob generates a secret decryption key d and publishes a publickey e. Alice computes the encryption C = Ence(M) and sends it to Bob. Bob receives C andcomputes M = DeCd(C). We require various minimal properties of this encryption system:

• There should be some efficient algorithm to generate pairs (d, e) of a private key and associated public key.

• The algorithms Ence(-) and Dec,i(-) should be efficient.

• Knowing e and C should not reveal information about M.

In fact, we can define more properties that guard against more subtle attacks. For example, seeingmany messages sent across this channel should not reveal additional information about any of themessages. The one-time pad discussed in the next section will illustrate the importance of thisrequirement; it is aptly named, as it should only ever be used once. Another example is non-malleability, and requires that an eavesdropper cannot meaningfully modify the message duringt r a n s m i s s i o n . #One-time Pad. This does not fit any of the above cryptographic schemes, but is of great historicalsignificance, predating modern cryptography by over half a century. In these earlier times, actualbooks of bits would be distributed physically, and the sender would indicate which bits were to beused for the decryption. The sheet used would be destroyed after use. The idea is that Alice wantsto send a message M G {0, l}n to Bob, and they have earlier agreed upon a secret one-time pads G {0, l}n. Alice sends C — M 0 s to Bob, where 0 denotes the exclusive-or operation, whichis defined as bitwise addition modulo 2 of two given binary strings. For example, for any stringt G {0, l}n,t®t = 0n. Bob computes M = C0s = M0s0s, and both Alice and Bob destroyall traces of s. If Eve knows no information about s, she knows absolutely nothing about M, evenif she has C.

The "one-time" property, however, is critical. If Alice sends another message M' with the samekey s, then Eve now might have both C = M 0 s and C' = M' 0 s. By computing C 0 C =M 0 A/', Eve gains some information about the messages M and Af'. Using some algorithm thatperhaps takes advantage of the sparseness of the English language, Eve could possibly decipher Mor M' from this information.RSA Public Key Encryption. In the RSA (Rivest, Shamir, Adleman) cryptosystem, Alice choosestwo large primes p, q and releases the number n = pq and an exponent e as her public key, so that(e,<p(n)) = 1. (We define the Euler totient function tp by ip(n) = \{a\(n,a) = 1,0 < a < n}\= \(Z/nZ)x |.) Then she computes d = e~l mod tp(n) as her private key. Bob wants to send themessage M to Alice, and encrypts it using the function C = Encn>e(M) = Me mod n. Finally,Alice simply computes M = Decd(C) = Cd = Med = Mk*{n)+l = M mod n by Euler'stheorem.

Unfortunately, RSA is vulnerable to some attacks. For example, if the same message is sentusing e different choices of n, the message can be decrypted via simple Chinese remaindering. Thisis particularly dangerous when a small e (like 3) is chosen for efficiency purposes. A solution is topad all messages with random bits at the end, chosen differently with every encryption. Anothervulnerability is that if the same message is sent using two relatively prime choices of e for the samevalue of n, the message can be decrypted using the Euclidean algorithm. A weakness that renders

Page 19: The Harvard College Mathematics · PDF fileThe Harvard College Mathematics Review is produced and ... and in summer programs ... This new step in The HCMR's maturation is truly impressive

Shren ik Shah—A Tas te o f E l l i p t i c Curve Cryp tog raphy 17

RSA relatively useless for the purposes of identity-based cryptography is that knowing a pair e, dof encryption and decryption keys allows one to factor n, which we prove in Corollary 5. See [Bo]for an excellent survey of the attacks on this cryptosystem.Elgamal Public Key Encryption. In Elgamal Public Key encryption, Alice chooses a multiplicative group Wp , a generator g of this group, and a secret integer x. She computes gA = gx, andpublishes as her public key (¥p ,g,gA). To encrypt 0 < M < p, Bob sends his message bypicking a uniformly random secret integer 0 < y < p — 1 and computing and sending C\ = gy,C2 = MgA = Mgxy to Alice, who decrypts by computing C2C^X = Mgxyg~xy = M.

The key observation is that all of the operations above could be replicated with any cyclicgroup in place of F* above. We will later replace this with a group structure associated to anelliptic curve.

Cryptographically, this system has a key weakness. The pair (C\, C2) relate to M by the simpleformula M = C2C^X. Thus, if Eve has the ability to modify the message as it is sent to Bob, shecan change M to the known function of M, fk(M) = kM, for some k, without Bob knowing thatany change had been made. Some cryptosystems make it difficult for Eve to control exactly howher modification might affect the sent message—this property is called non-malleability. RSA isalso malleable, since an encryption of M2 can be computed from an encryption of M.Theorem 1. If the Diffie-Hellman problem is intractible, then Elgamal Public Key Encryption isunbreakable.Proof. Suppose for contradiction that given public key gx, and message (gy, D) (even for only onesuch D), one could compute M = g~xyD. Then one could compute M~1D = gxyD~lD = gxy,v i o l a t i n g t h e i n t r a c t i b i l i t y o f t h e D i f f i e - H e l l m a n p r o b l e m . □2.2.3.2 Identity-based EncryptionA weakness of public key encryption is that an adversary can pretend to be Bob, the intended recipient, and request the message to be encrypted in his public key instead of Bob's—he could thendecrypt the message received. To solve this problem, one might try to design a trusted database thatholds everyone's public keys, but even then, the adversary can intercept communications with thedatabase. The solution provided by identity-based encryption, proposed by Shamir in [Sham],is to use some identity-based value that is a function of some unique information about Bob thatanyone sending Bob a message would know. As an example, one might use the output of a publicly known hash function (defined in Section 2.2.3.5) on Bob's name, cell phone number, andaddress as an identifier. A trusted server provides a decryption key to Bob on some occasion in anauthenticated manner. Bob can then decrypt messages forever afterwards.2.2.3.3 Digital SignaturesIn a digital signature scheme, Alice wants to sign a document M in a way that is verifiable andunforgeable. She publishes a public key e that anyone can see, and keeps a secret key s. She signsthe message with the function C = Sigs(AT), and sends (Af, C) to the desired recipients. Anyonecan run the verification algorithm Vere(M, C) to test whether the message was indeed signed byAlice.

There are several levels of information we can assume an adversary might have. In the weakestcase, the adversary Eve might know only the public key e, with no examples. More realistically,she might know pairs (M, C) that were produced by Alice earlier. The worst case is that Eve mayhave forced Alice to sign certain documents of her choice in her efforts to forge a signature on adocument that Alice has not yet signed.

We also can place different requirements on possible forgeries, but for the purposes of this paper, we will use the strongest possible requirement, that of existential unforgeability: Eve shouldnot be able to sign any message for which she has not yet seen a signature. There are weaker levelsof security than this, detailed in [GB].

We finally require, naturally, that Sigs() and Vere(-, •) should be efficiently computable.In practice, digital signatures are very frequently attached to emails and other forms of elec

tronic correspondence.

Page 20: The Harvard College Mathematics · PDF fileThe Harvard College Mathematics Review is produced and ... and in summer programs ... This new step in The HCMR's maturation is truly impressive

1 8 T h e H a r v a r d C o l l e g e M a t h e m a t i c s R e v i e w 2 . 1

Digital Signature Algorithm. Recall that in a digital signature scheme, Alice wants to equip adocument M with an unforgeable signature. The setup for this algorithm is more complex. Aliceneeds to pick a group ¥% and a large prime p such that p \ q — 1, and ^- = e is as small aspossible. The message will be in the range 0 < M < p. She also picks an p G Fj such that g hasorder exactly p. By random guesses, exponentiated by a factor of ^-, this can usually be donequickly. Note that if q = 2p + 1, meaning that p, q are a pair of Sophie-Germaine primes, thenthen one can use any nontrivial quadratic residue for g, since the group of residues modulo q isthen of order p. Alice also picks a publicly known hash function H : {0,1}* —» {0,1}£, randomlychooses a secret integer 0 < a < p, computes gA = ga, and publishes (F* ,p, g, gA,H).

To sign M G {1,..., q - 1}, Alice computes H(M), picks a random integer 0 < k < p, andcomputes x = gk mod p (in F* first, then reduced modulo p) and y = k~l(H(M) + ax) mod p,producing a signed document (M,x,y). The verifier Bob computes c\ = y~1H(M) mod p,C2 = 2/-1# mod p, and accepts if x = gclgcA2 mod p. A correctly signed document is alwaysaccepted, since x = gk = gv~1(H(M)+ax) = gCl+ac2 mod p

This algorithm is frequently used in practice, though it is in fact an open problem to prove thatthe hardness of breaking this cryptosystem follows from the intractibility of the discrete logarithmproblem. On the other hand, the next section describes a provably existentially unforgeable protocolto compute digital signatures.

Elgamal Digital Signatures. Let q = 2p + 1 be a pair p, q of Sophie-Germaine primes, and G thegroup of squares in Z%. Fix a generator g of G. Alice fixes a private key x randomly selected from{0,... ,p - 1}, computes gA = gx, and publishes the public key (p, g, qa).

To sign M G {1,..., q - 1}, Alice picks y randomly selected from {0,... ,p - 1}, andcomputes h = gy. Then Alice computes H(M\\y), where || denotes concatenation, and c =-xH(M\\h) - y mod p. Finally Alice publishes SIGA(M) = (M, h, c).

Verification simply requires checking that gIAIi"M^h)gch = 1.Theorem 2. If computing the discrete logarithm x ofgx is hard, and the hash function is a randomoracle, then Elgamal Digital Signatures are existentially unforgeable.

Proof. If an adversary were to compute (M,h,c) meeting the requirements for a fixed message M,then it would be necessary for the adversary to have obtained H(M\\h) from the oracle, otherwisethe value of gAM"h)gch would be a random value. If the adversary knows H(M\\h), this impliesthat M and h are fixed, since it is difficult (in the random oracle model) to find another messagesuch that H(M\\h) is the same. To forge a single message M, one must be able to determine a valuec and a value c' for at least two possible values a, a' for H(M\\y), since H(M\\y) is a randomvalue. Thus, the adversary has gAgch = gA 9° h, or gx(a~a > = gc~c . The number a — a' has aninverse modulo p since it is nonzero, so gx = g^c~c )(a~a ^ yields x = (c—c')(a—a')~l mod p,thus solving the discrete logarithm problem that was assumed hard. □2.2.3.4 Key ExchangeDiffie and Hellman developed the notion of a key exchange, wherein Alice and Bob have noprior shared secret, but they want to be able to send messages via private key cryptography (acryptographic scheme we will not discuss in this paper). The desired property is that Alice andBob both compute the same value, and that an eavesdropper Eve is unable to determine anythingabout that value with high probability. There is a natural generalization to n-partite key exchanges,where n trusted players want to agree on a single key.

This seems fairly straightforward, though there are some interesting issues that arise. With justthese properties, Eve could impersonate Alice and share a key with Bob. Bob, thinking that Eve isAlice, will use the shared key to encrypt messages, which Eve can then read. To counter an attacklike this, one needs to use authentication, a topic discussed in detail in [GB] and [MvOV].

From the Diffie-Hellman problem described above, it is not difficult to guess their protocolfor the key exchange. A group F* with generator g is fixed ahead of time. Alice picks oGFpx

Page 21: The Harvard College Mathematics · PDF fileThe Harvard College Mathematics Review is produced and ... and in summer programs ... This new step in The HCMR's maturation is truly impressive

Shren ik Shah—A Tas te o f E l l i p t i c Curve Cryp tography 19

at random, and Bob similarly picks b. Alice sends ga to Bob, who sends gb to Alice. They bothexponentiate to compute gab, which is the shared key.

There is an n-partite version of this protocol, again with a publicly known field ¥p with generator g. Players pi,..., pn pick secrets s\,.. .,sn- In round k, k = 0,..., n — 2, p* sendsgSiSjl ...sjk tQ ajj piavers^ where ji,.-- ,jk are distinct elements of {1,..., i — 1, i + 1,..., n}.The players then use gsi ",8n as their shared secret.2.2.3.5 Cryptographic Hash FunctionsHash functions are a useful tool, often used within the earlier-mentioned cryptosystems. A hashfunction H : {0,1}* —▶ {0,1}* sends any string of arbitrary length to its hash, of length polynomial in the length of the input (though often of constant length, in practice). In the usual definitionof a collision-free hash function, it should be intractible to find a collision in time polynomial inthe input length k, where a collision is a pair of strings s\,S2 such that H(si) — H(s2). H isusually required to be a one-way function.

The random oracle model of a hash function works as follows: given a message M G {0,1}*,the random oracle always outputs a random string in {0,1}\ except when a previous query is repeated, in which case it produces the original output. When using a hash function in a cryptosystem,one sometimes proves results about the security of the cryptosystem by assuming that the hash function is a random oracle. The assumption that a hash function H is indistinguishable from a randomoracle is stronger than assuming that H is a collision-free hash function.

2.2.4 Miller-Rabin and Pollard's AlgorithmIn a sense that the following theorem makes precise, the group (Z/nZ)x contains all the information about the factorization of n, and this information can be extracted efficiently knowing verylittle about this group. In fact, simply knowing a reasonably small multiple of its order suffices tofactor n in polynomial time.Theorem 3. Suppose that we know n and ktp(n) for some positive integer k. Assume also thatlog k = log°^ n. Then n can be factored efficiently.Proof It suffices to find a single factor, because given a factorization n = n\ - n2, one can runthe algorithm again on n\ and n2, as <p(ni) \ <p(n) (and one can argue that the run time of thisrecursive algorithm is still polynomial). One can efficiently check whether n is a prime power, soassume this is not the case.

In this case, we can design an algorithm as follows. Given a composite input n, we first factork(p(n) = 2em where £ G Z is chosen maximally such that m eZ. Next, we pick a random integera, where 1 < a < n, and compute (a, n). If (a, n) ^ 1 then we are done, since in this case wehave found a nontrivial factor of n. Otherwise, (a, n) = 1 and

Thus, if we consider

a*(n) = a fc^(n) = a2*m = j_ ^ ^

m 2 m 2 m 2 m _ j _a , a , a , . . . , a m o d n

we have a sequence that eventually becomes 1. For fixed a, consider the largest value of j suchthat a?3rn -^ 1 mod n. In Lemma 4, we show that for at least \ of the possible choices of a, wehave that a2'm £ -1 mod n. Then, a2J+1™ - 1 = (a2'm + l)(a2'm - 1) = 0 mod n. Thus,(a2Jm 4- l)(a2Jm — 1) = sn for some integer s. Also, neither of the factors on the left are 0 ormultiples of n, since we required a2Jrn ^ -1 mod n. So we can compute (a2Jrn + l,n) and(a23m - 1, n) to obtain a factorization of n into m n2, as desired. By choosing random values fora until we find one that yields a factorization of n, this algorithm terminates in expected polynomialt i m e . □

Page 22: The Harvard College Mathematics · PDF fileThe Harvard College Mathematics Review is produced and ... and in summer programs ... This new step in The HCMR's maturation is truly impressive

2 0 T h e H a r v a r d C o l l e g e M a t h e m a t i c s R e v i e w 2 . 1

Lemma 4. In the proof of Theorem 3 above, at least \ of our choices of a have the property thatthere exists j such that a23fn ^ 1,-1 mod n while a23 m = 1 mod n. Moreover, we will useonly the fact that a2*™ = 1 mod nfor all a G (Z/nZ)x.Proof We adapt the proof of correctness of the Miller-Rabin primality test from [CLRS]. Define apair (a, j) of integers to be bad if a G (Z/nZ)x, j G {0,1,..., £}, and a2'm = -1 mod n. Thedesired result translates into proving that the number of bad pairs (a,j) is at most \\(Z/nZ)x\,since for any a of the form described in the statement of the lemma, there is a unique value of jsuch that (a,j) is bad. Note that since m is odd, (n - 1,0) is bad. Thus there exists at least onebad pair, so we may pick the largest possible j such that there is a bad pair (a,j), and fix a valueof a so that (a, j) is bad. Note that j < £ since, by assumption, a2 m = 1 mod n for all a. Let

S={xe (Z/nZ)x\x23m = ±1 mod nj.

This set is closed under multiplication modulo n, so it is a subgroup of (Z/nZ)x. Every bad pair(a,j) has a a member of S, because we picked a to be maximal and we allow x23m = ±1 mod nin the definition of S. If S also contained numbers a such that (a, j) is not bad, this would be fine,as we are only proving a bound on the number of bad pairs.

We now prove that S / (Z/nZ)x. Note that n by assumption is not a prime power, son = n' - pa for some prime p | n, where a is chosen maximally with pa \ n, and n' ^ 1.Since a23m = —1 mod n, we have a23m = -1 mod n' by the Chinese Remainder Theorem,as (n',pa) = 1. Moreover, again by this theorem, there exists b such that b = a mod n',b =1 modpa. Thus, by our preceding calculation, b23m = -1 mod n',b2Jm = 1 modpa. Bythe Chinese Remainder Theorem, b23fn ^ 1 mod n' implies b23m ^ 1 mod n, and b23m ^-1 mod pa implies b23m ^ -1 mod n. Thus b2'm ^ ±1 mod n, so 6 £ 5.

It suffices, then, to show that b G (Z/nZ)x. Note that since a G (Z/nZ)x, (a,n) = 1, so(a, n') = 1. Since b = a mod n', (6, n') = 1. Also by definition, 6=1 mod pa, so (b,pa) = 1.Thus, b is relatively prime to both n' and pa, so b is relatively prime to their product, n. Thus6 G (Z/nZ)x, as desired. So 5 is strictly contained in (Z/nZ)x, and by Lagrange's theorem, ith a s o r d e r a t m o s t \ \ ( Z / n Z ) x \ . □Corollary 5. If we have a pair e, d of corresponding RSA encryption and decryption keys, then wecan factor n.

Proof. Since ed = 1 mod <p(n), ed — 1 is a multiple of </>(n), whereby we can factor n usingT h e o r e m 3 . □

A Carmichael number is a number n such that an_1 = 1 mod n for all integers a such that(a,n) = 1.Corollary 6. If we have a bound of \og°^ n on the size of the primes dividing <p(n), or ifn is aCarmichael number, we can factor n efficiently.Proof The first statement can be shown by using a weak estimate on the density of primes to findan upper bound on the product of primes smaller than the bound log°(1) n in order to find a smallmultiple of (f(n). The second follows from the theorem that for a prime p dividing a Carmichaeln u m b e r n , p — 1 | n — 1 . □

We can similarly define the Rabin-Miller randomized polynomial-time primality test: Given apositive integer n, we first check that it is not a perfect power. We then factor n — 1 = 2em, where£ is maximal, pick a random 1 < a < n, and compute (a, n). If (a, n) ^ 1 then we're done, sincen is then composite. Otherwise, (a, n) = 1. We then check that a2 m = an_1 = 1 mod n, whichcan fail only if n is composite. We then consider

m 2 m 2 2 m 2 £ m m i r x * Ma , a , a , . . . , a m o d n ,

Page 23: The Harvard College Mathematics · PDF fileThe Harvard College Mathematics Review is produced and ... and in summer programs ... This new step in The HCMR's maturation is truly impressive

Shren ik Shah—A Taste o f E l l ip t i c Curve Cryp tography 21

a sequence that eventually becomes 1. If a23rn / ±1 while a23 m = 1, then x2 — 1 has morethan two roots in Z/nZ, so this ring cannot be a field and n cannot be prime.

Corollary 7. If this test outputs composite, it is correct. Ifn is prime, the test outputs prime withprobability > |.Proof. The first sentence is clear from the algorithm. So suppose n is composite. Note that thea G (Z/nZ)x with an_1 = 1 mod n form a subgroup T of (Z/nZ)x. We divide into cases,depending on whether or not T = (Z/nZ)x.

If there exists any x G (Z/nZ)x \ T, then by Lagrange's theorem, at least \ of the choicesof x will have this property, since the smallest possible index for a proper subgroup is 2. If ouralgorithm as described above picks any a G (Z/nZ)x \ T, which it does with probability > \, itcorrectly classifies n.

If, instead, T = (Z/nZ)x, n - 1 has all of the necessary properties that k<p(n) had in theproof of Lemma 4 (as remarked in its statement). This lemma then shows that at least \ of thechoices of a will satisfy a?3m ^ — 1 mod n for the largest value of j such that o?3rn ^ 1 mod n,a n d w i l l t h u s l e a d t h e a l g o r i t h m t o c l a s s i f y n a s c o m p o s i t e . □

We can exploit the fact above by essentially "guessing" the structure of (Z/nZ)x, trying topick a multiple of ip(n) by choosing numbers with many small prime divisors. The homomorphismZ/nZ —> Z/piZ given by sending a to its residue modulo p implies that raising a number a G(Z/nZ)x such that p \ a to a power that is a multiple of pi — 1 will yield a multiple of pit

This leads to a Pollard's factorization algorithm: We pick an integer 0 < a < n, pick aninteger k that is a multiple of many small primes (say lcm(l,..., ra)), and compute (ak — 1, n).If (ak — l,n) = n, we use the algorithm above to factor n with k in place of k(p(n), and if(ak — 1, n) = 1, we pick a larger value of k (or a larger choice of ra). If p* — 1 | k for some but notall i, (ak — 1, n) is likely to be a proper nontrivial factor of n, as ak — 1 is unlikely to be a multipleof pj — 1, where pj — l\k (it is not difficult to see that this occurs with probability B(^-j-)). If(ak — 1, n) = 1, then increasing k is natural, because this could only occur if k were not a multipleof pi — 1 for any prime p*.

Unfortunately, this algorithm sometimes will be very inefficient, particularly when the pt — 1are not products of small primes. This observation illustrates the failure of (Z/nZ)x to reveal thefactorization of n easily, even though its order alone would be sufficient to factor n. Although inmost cases, the order ip(n) is a product of mostly small primes, in the worst case, when <p(n) is amultiple of some very large primes, this approach gives us little hope. Thus Pollard's algorithm isan efficient way to factor most n, but not all.

We thus see that the general problem of determining the properties of (Z/nZ)x is one that maybe easy for most randomly chosen inputs, but very hard on certain inputs. It would be nice, then,to have a group that contains all of the information about the factorization of n, yet whose order is"random." If this were the case, there would be a good chance of being able to factor n by tryingmany times on randomly chosen orders, using the same methodology as Pollard's algorithm.

This is where elliptic curves come in handy. Each elliptic curve E over the ring Z/nZ associates to n a group G(E,¥Pi) that, in a sense made precise by Theorem 8, contains the sameinformation about n as the group (Z/nZ)x. Moreover, we will see that the role of pi — 1 in thisproblem is replaced by a number in the interval pi + 1 — 2y/pi < \G(E, ¥Pi)\ < pi -f1 + 2y/p[. Itis rather likely that some number in this range will have small prime factors, and as a consequence,the algorithm will much more efficiently be able to factor numbers.

2.3 Elliptic CurvesElliptic curves may be viewed from the perspectives of many fields of mathematics, includingnumber theory, analysis, and algebraic geometry. Although we'll give a taste of this in Section 2.4,the focus of this section will be on the special case of elliptic curves over a finite field. Our goal is todefine the group law associated to an elliptic curve over a field and remark upon the generalizationto a ring such as Z/nZ.

Page 24: The Harvard College Mathematics · PDF fileThe Harvard College Mathematics Review is produced and ... and in summer programs ... This new step in The HCMR's maturation is truly impressive

2 2 . T h e H a r v a r d C o l l e g e M a t h e m a t i c s R e v i e w 2 . 1

2.3.1 PreliminariesWe first define affine and projective space over a field K. The affine space A# is defined to be theset Kn, with no vector space structure. The projective space P# is defined, again as a set, to be{(ao, •.., otn) G Kn+l \ 0} modulo the equivalence relation (ao,..., an) ~ (Aao,..., Aan)for A G Kx. We will write the equivalence class of (ao,..., an) G P# as [ao,..., an]. Intuitively, the projective space contains points "at infinity" corresponding to intersections of parallelhyperplanes in affine space. On the other hand, it is clear from the definition that there are no distinguished points in projective space, and so there is no preferred affine subset of projective space. Acover of projective space by subsets "isomorphic" to affine space can be found by considering then+l subsets where on = 0 for each i. These are called A^, though the K is often omitted. Thetechnicalities of defining the structure on these spaces, which are actually varieties, can be foundin [Shaf] or [Ha]. None of these details will be important to the overview in this paper, though it isoccasionally useful to keep in mind that the endomorphisms on elliptic curves we consider in 2.4are cases of a more general notion.

An elliptic curve is defined over a field or ring. Let the field K have characteristic neither equalto 2 or 3, an assumption we will hold throughout this paper. Then an elliptic curve E is the set ofpoints [x, y, z] G P2^ defined by the homogeneous polynomial y2z = x3 4- axz2 4- bz3 over theprojective plane P2K for a,b G K, or the curve y2 = x3 4- ax 4- b defined over the affine plane A2Ktogether with a point at infinity, corresponding to [0,1,0] in the projective form of the curve. Oneshould also think of the defining polynomial itself as being part of the information present in E.Under the conditions on the characteristic, any other object one might call an "elliptic curve" canbe transformed by a simple change of variables into the form given. Over a ring R, the definitionis similar, though in this case, we will require 2,3 G Rx. The projective space P2R will also needto be redefined, which we do in Section 2.3.3. Note that we say "elliptic curve over K" (or R) tomean an elliptic curve whose coefficients a, b are in K (or R).

2.3.2 The Group LawSuppose that we have points Pi = (x\, y\), P2 = (X2,2/2) on an elliptic curve y2 = x3 4- ax 4- bwith a, b G Q (or, more generally, any field K). Then an equation for the line through Pi andP2 is obtained by taking ra = vxlZVx t0 De tne sl°Pe °f trus une anc* using the formula y =m(x — x\) 4- 2/1. This line intersects the curve in a third point, which we shall solve for bysubstituting this expression for y: (m(x - xi) 4- yi)2 = x3 4- ax 4- b, or

x — ra x 4- (—2myi 4- 2ra x\ 4- a)x — ra xx 4- 2mx\y\ — y\ -f b = 0.

Since the coefficient ra2 is the sum of the roots of the polynomial, while x\, X2 are already roots,we can find the abscissa of the third point as X3 — ra2 — x\ — X2, and the ordinate from the equationof the line, yz = m(x3 — x\) 4- y\. We label P3 = (#3,2/3), and we define 0 = Pi 4- P2 + P3 inthe additive group of points on the elliptic curve, so that Pi 4- P2 = — P3. We define the negationof a point to be its reflection over the #-axis, meaning that — (x, y) = (x, —y). The third pointon this line connecting (x, y) and (—x, y) is the point at infinity, and the identity of this additivegroup.

Commutativity and identity properties follow immediately from the above definition, but associativity is tedious to verify, as it breaks up into cases. We refer a reader interested in the proof to[ST], though we will provide a somewhat opaque proof of this result for subfields of C when wemotivate the Weil pairing in Section 2.4.

Associated to an elliptic curve E over a field K we have defined an abelian group structure G(E, K) on its points. Moreover, for a field extension L/K, there is a natural injectionG(E, K) <—* G(E, L). Many amazing results in number theory describe various properties ofG(E, Q). Although we will not need them, they are worth mentioning. Mordell's theorem showsthat this group is always finitely generated. Mazur's theorem shows that the torsion subgroup iseither Cn for 1 < n < 10 or n = 12, or C2 x C271 for 1 < n < 4. Determining G(E,¥P) isgenerally easier than computing the group law over Q, and in fact, the most important informationabout G(E, ¥p) for our purposes will be its order (which is finite).

Page 25: The Harvard College Mathematics · PDF fileThe Harvard College Mathematics Review is produced and ... and in summer programs ... This new step in The HCMR's maturation is truly impressive

Shren ik Shah—A Tas te o f E l l i p t i c Curve Cryp tography 23

2.3.3 Elliptic Curves over;One can define elliptic curves over rings as well, though to do so requires a more complicated grouplaw, since division is not permissible and other issues arise. The story is detailed in Lenstra's paper[Le], which also describes the factoring algorithm we present in Section 2.5.1. There are variousconstraints that make the process simpler. We'll avoid delving into the general theory and studythe specific case of Z/nZ, where n is prime to 2 and 3. This will suffice for our purposes, becauseour factoring algorithm will specifically check for divisibility by small primes. Over a ring R, theprojective space PR is redefined as

{(a0,...,an) G Pn+1|(a0,...,an) = (1)} /(a0,... ,an) ~ (Aa0,..., Aan), A G Rx,

and it becomes generally more important to consider the elliptic curve within projective space. Westill define the curve as the solutions to the homogeneous equation

2 3 . 2 . i 3y z = x 4- axz + bz

in P2R, though we require that the discriminant -4a3 - 2762 be a unit (which just means nonzeroover a field, so this is consistent).

The ring Z/nZ holds information about its factors within its subgroups, and has the propertythat if n = mn2, where (m,n2) = (l),thenZ/nZ = (Z/n\Z)x(Z/n2Z). We can show that thistranslates to the elliptic curve as well. Note that for an elliptic curve E over R and an injection (p :R^> S, G(E, R) C G(E, S). But we can replace the injection <p with any homomorphism, andapply this map both to the coefficients of E and to R to obtain a homomorphism (p : G(E,R) —>G(E, S). In particular, for an elliptic curve E over Z/nZ and pi reduction modulo n*, we obtainmaps (pi : G(E, Z/nZ) -> G(E, Z/mZ). We can now prove:Theorem 8 ([Wa, pp. 65-66]). The map

(fi x q>2 ' G(E,Z/nZ) -> G(E,Z/mZ) x G(E,Z/n2Z)

is an isomorphism.Proof. Note first of all that the discriminant of E when reduced modulo ni or n2 is still a unit,since it is relatively prime to n. Thus the groups on the right above are well-defined. Reductionmodulo ni yields, via the Chinese Remainder Theorem,

ipi x ip2 : Z/nZ =■ Z/niZ x Z/n2Z,

and thus a bijection between triples of these elements. This map induces maps on Pl/nZ, whichwe'll denote ^i,<?2' smce ^ (xiViz) = (ux ,uy',uz) are equivalent triples, their images areequivalent via the image of the unit u. When applied to the bijection on triples, this yields abijection

if1 X if2 : Pz/nZ —> Pz/niZ X Pz/n2Z-

Finally, y2z = x3 4- axz2 + bz3 mod n implies y2z = x3 4- axz2 4- bz3 mod n* for i = 1, 2.The converse is true, again by the Chinese Remainder Theorem. Thus the map (p\ x (p2 is abijection. That it is a homomorphism can be verified by a tedious but simple computation that wew i l l o m i t . □

Our last comment is that although we have not specified the exact group law over a ring Z/nZ,one can use the usual group law over a field K when dealing only with units. This fails to yieldthe right answer exactly when a nonzero nonunit appears in one of the coordinates of one of thepoints. For the purpose of factoring, however, we actually need not explain how to carry out thegroup operation in this situation, as a nonzero nonunit appears exactly when we have successfullymanaged to factor n, so the algorithm can terminate at this stage.

Page 26: The Harvard College Mathematics · PDF fileThe Harvard College Mathematics Review is produced and ... and in summer programs ... This new step in The HCMR's maturation is truly impressive

2 4 T h e H a r v a r d C o l l e g e M a t h e m a t i c s R e v i e w 2 . 1

2.4 The Weil PairingThe Weil pairing is a construction on the n-torsion points of an elliptic curve that is important forproving theoretical results, such as the analogue of the Riemann Hypothesis for elliptic curves,as well as for computational applications, discussed in Section 2.5.2.2 and Section 2.5.4. Theprogression used in the subsections following the motivation come largely from Chapters 2 to 4 and11 to 12 of [Wa], though we provide few proofs. Our goal is to sketch the ideas behind the resultson elliptic curves over finite fields most relevant to cryptography. Readers without background incomplex analysis can safely skip Section 2.4.1.

The texts [Sil], [Si2] provide an abstract viewpoint on pairings that is more useful from thepurposes of number theory, while [CFA] provides a comprehensive treatment of the cryptographicapplications of the Weil pairing and the related Tate-Lichtenbaum pairing.2.4.1 MotivationFor those familiar with either the theory of algebraic curves over C or of compact Riemann surfaces,the following will serve as motivation for some of the theorems that follow (and provide proofs ofspecial cases of results we will not prove).

If we view an elliptic curve in P£ as a compact Riemann surface X of genus 1, there existsa lattice A C C such that X maps biholomorphically to the quotient E = C/A. Indeed, forthe inverse, the Weierstrass p-function defines the map z \-> (p(z), p'(z)) sending C/A to thesolutions of y2 = 4x3 - 60Gf4(A)^ - 140G6(A), where Ga,Gq are the Eisenstein invariants ofthe lattice A.

The importance of this perspective comes from the obvious addition law on points in C/A,which is carried via the isomorphism to yield an abelian group structure on E. This formula isgiven by

, v 1 ( p ' ( z 2 ) - p ' ( z i ) \ 2 .P {Z l+Z2)=4{P(Z2) -P(Z l ) ) - * *> - * * ) .

One can derive the associativity of the group law over subfields of C from this isomorphism. Wealso can easily determine the n-torsion points of C/A: if A = Zlji 0 Zu>2, then these are preciselyh^x + £^vf0r o < £u£2 < n. Thus the n-torsion group, which we denote by Gn(E,C), isisomorphic to Z/nZx_ Z/nZ, for any n. More generally, we let Gn(E, K) denote the subgroupof G(E, K) (where K denotes the algebraic closure) of points P such that nP = 0. With someadditional technical steps, one can use the case of C to conclude that Gn(E, K) = Z/nZ x Z/nZfor general fields of characteristic 0.

We denote by Cn = (C) the group of nth roots of unity in C, where C is a primitive root. In[Ga], a Weil pairing wn : Gn(E, C) x Gn(E, C) -» Cn is defined by the formula:

^i^i , ^2^2 mnJi m2uj2\ {2ni(£im2 — mi£2)\4 - , 4 - = e x p * ' -n n n n ) \ n JThe key properties of this pairing follow directly from this definition. It is linear in each component,and satisfies wn(0,g) = wn(g, 0) = wn(g,g) = 1 for all g. On the other hand, for g ^ 0, thereexist h, h' such that wn(g, h) ^ 0, wn(h, g) ^ 0—this means that wn is nondegenerate. FinallyWn(g, h) = wn(h, g)~l. In the terminology of linear algebra, one will easily recognize that theseproperties are analogous to those of a nondegenerate skew-symmetric bilinear form.2.4.2 EndomorphismsWe next study certain maps on elliptic curves. These maps are a general case of regular maps.Let us consider a curve in affine space, A2K. Then the coordinates x, y themselves define functionsassigning to each point of the curve a value in K. One can generate many functions on the curveby considering polynomials in x, y. Such maps are examples of regular functions. One can alsodefine a map r : E —> A2K by defining r(x,y) = (p\(x,y),p2(x,y)) for a pair of regularfunctions pi, P2. If the image of an elliptic curve E under a map of this form happens to lie on E,and the map acts as a group homomorphism on G(E, K), then the map is called a endomorphism.

Page 27: The Harvard College Mathematics · PDF fileThe Harvard College Mathematics Review is produced and ... and in summer programs ... This new step in The HCMR's maturation is truly impressive

Shren ik Shah—A Tas te o f E l l i p t i c Curve Cryp tography 25

We allow, in fact, pi and P2 to be rational functions. Under a map of this form, it is possible forpoints of the curve to be sent to the point at infinity.

We can determine a convenient form in which all such maps can be written. Since all pointson the curve E satisfy y2 = x3 4- ax 4- b, we can assume that the numerator and denominator ofpi, p2 have no powers of y larger than 1. In fact, multiplying the denominator by its conjugate withrespect to y (meaning we replace y with -y), we find that pi(x, y) = 9i(x^**)(x)y for functionsqi,Si,U G K(x, y) and i G {1,2}. Since r(x, -y) = r(-(x, y)) = -r(x, y), in fact, we havesi(x) = 0 and q2(x) = 0, so r(x,y) = (n(x),r2(x)y), where ri,r2 G K(x) are rationalfunctions of x.

If the map r is nonzero, we define the degree of this map to be the larger among the degreeof the numerator and denominator of r\, and if r is zero, we define the degree to be 0. Thesum of endomorphisms defines an endomorphism by summing the images using the group law,and endomorphisms admit multiplication by an integer in the same way. We denote the identityendomorphism r(x,y) = (x,y)by 1 and, over ¥q, the Frobenius endomorphism by <pq(x, y) =(xq,yq), easily checked to be an endomorphism.

Following [Wa], we will use the Weil pairing in the next section to derive the following result:Theorem 9. Let a, r be endomorphisms and s,t G Z. Then

deg(scr 4- tr) = s2 deg a + t2 deg r 4- st(deg(a + r) — deg a - deg r).

Another way to create new endomorphisms is via composition, for example <pq = (pq o • • -o ,n times. The endomorphism r is defined to be separable if r[ (x) is not identically 0. This rathertechnical definition is explained by Lemma 10, which expresses the notion of separability in termsof the degree and kernel of the map, two seemingly more intrinsic qualities. This lemma can beproven by carefully keeping track of the roots of the numerator and denominator of n, r2.Lemma 10. ([Wa], p. 49-50) For r a nonzero endomorphism on an elliptic curve over an algebraically closed field, deg r > | ker t\, with equality if and only ifr is separable.

Thus, if we wanted to compute the number of points on the curve E over ¥pk, we mightconsider yk, which fixes ¥pk and no other elements of Fp. Then the endomorphism (ppk — 1 haskernel exactly corresponding to the points of E over Fpfc. However, we need to know whetherthis endomorphism is separable or not. This question is answered by another lemma, a convenientcondition for separability of certain endomorphisms over ¥pk.Lemma 11. ([Wa], p. 54) Over ¥pk, where r, s G Z, nppk 4- s is separable exactly when p\ s.

The main idea of the proof is to first reduce to the question of whether s is separable, and tothen answer this by first showing that if n = (n(x), r2(x)y), £^y = n.

These results will be used later to conclude the Hasse bound. The final step simply involvescomputing the degree of (fpk — 1 using Theorem 9.

Another result that we will invoke later is:Theorem 12 ([Wa, p. 95-96]). If \G(E,¥pk)\ = pk + 1 - I then y2pk - £<ppk +■ pk = 0 asendomorphisms. Moreover, for ra £, <p2k — nuppk 4- pk 0.

For the first statement, the key idea is to show that <ppk - £<ppk + pk is identically zero onGn(E,¥p) for infinitely many choices of n. Then Lemma 10 implies that this endomorphism isidentically zero. The second is easy: If (ppk - vrnppk + pk = 0, then by subtracting ippk - £<ppk 4-pk = o we find (£ — m)<ppk = 0 as endomorphisms, which can happen only if £ = ra.2.4.3 The Weil Pairing and its ConsequencesWe shall state two key theorems, whose proofs are in [Wa].Theorem 13 ([Wa, p. 75]). //char K \ n or n = 0, then Gn(E, K) ^ Z/nZ x Z/nZ.

Page 28: The Harvard College Mathematics · PDF fileThe Harvard College Mathematics Review is produced and ... and in summer programs ... This new step in The HCMR's maturation is truly impressive

2 6 T h e H a r v a r d C o l l e g e M a t h e m a t i c s R e v i e w 2 . 1

Theorem 14 ([Wa, pp. 83, 334-335]). LetCn CK be the group ofnth roots of unity and E be theelliptic curve y2 = x3 4- ax 4- b. Then there exists a map wn : Gn(E, K) x Gn(E, K) —> Cncalled the Weil pairing, with the properties that

• The map wn is bilinear and nondegenerate in each variable (as defined in Section 2.4.1).

• Wehavewn(g,g) = landwn(g,h) = wn(h,g)~l forg,h G Gn(E,K).

• Ifip : K —▶ K is an automorphism with a, b as fixed points, wn((f(g),(p(h)) = (p(wn(g,h)).

• If a is an endomorphism ofE, wn(a(g),a(h)) — wn(g, /i)degCT.If <j is an endomorphism of E, it is easy to see that it restricts to an endomorphism of Gn(E,K).

Using Theorem 13 we can choose a basis ui,u2 for Gn(E,K), so that a is defined by (vi,uj2) »->(hoJi 4- ^2^2,raia;i 4- m2w2). Then one can show via simple computations (see [Wa]) thatwn(ui,u2) is a primitive n* root of unity and that det (^ ^22) = deg a. As a consequence, onecan prove that Theorem 9 holds over fields of characteristic p by computing the determinants ofboth sides as endomorphisms over Gn(E, K) for each n with p\n. One can extend this result toall n by applying Lemma 10.2.4.4 The Hasse BoundOver finite fields, it is possible to very accurately determine the order of the group of points onan elliptic curve. We will use some of the facts about endomorphisms we stated without proof inSection 2.4.2.

Let E have coefficients in a finite field Fpfc. By the remarks above, which used Lemma 10 andl lA^(ypk- l ) = \G(E,¥pk) \ .Theorem 15 ([Wa, pp. 91-94]). For an elliptic curve E over¥pk, 2y/p^ > pk-\-l-\G(E,¥pk)\ >~2y ffi .Proof Define q = pk for simplicity. By the preceding observation, q + 1 - \G(E, ¥q)\ = q 4-1 -deg((^q — 1), a quantity we will call £. By Theorem 9, for s, t G Z with p \ t,

deg(s<pq -t) = s2 deg(ipq) 4-12 deg(-l) + st(deg(<pq - 1) - deg((pq) - deg(-l))= qs2 4-12 - 8t£

Since deg(s<pq -t) = qs2 + t2-st£>0,q(z)2-£(z) + l>0. Since { f : p \ i) is dense inR, this inequality holds for all real r G R in place of |, so qr2 - £r 4-1 > 0. This implies that thediscriminant is nonpositive, so £2 - 4q < 0, or \£\ < 2y/q, as desired. □

This powerful result has many important cryptographic consequences. Several algorithms usethe narrow range of possible orders for G(E, ¥q) given by the Hasse bound in an essential way tocompute the order of the group. We also can use the range for the order to bound the running timeof algorithms, such as those for computing factorizations and discrete logarithms.

2.4.5 The Riemann Hypothesis for Elliptic CurvesNote that the following results are entirely optional, as the they are not used in the remainder of thepaper. The purpose of this section is to prove that the Hasse bound implies an analogue of the Riemann hypothesis for curves. This connection is suggestive of a far more general analogy betweenpoints on varieties and primes in number fields. The usual Riemann hypothesis is equivalent to astatement about the density of primes. In the same sense, the Riemann hypothesis on an ellipticcurve is equivalent to a statement about the number of points on the curve.

The usual Riemann zeta function ((s), given by Yl^Li n~S f°r &e(s) > 1 (and which can beextended to C \ 1), satisfies the identity

»r(i)«.)-.-*r(l^) «.-.).

Page 29: The Harvard College Mathematics · PDF fileThe Harvard College Mathematics Review is produced and ... and in summer programs ... This new step in The HCMR's maturation is truly impressive

Shren ik Shah—A Tas te o f E l l i p t i c Curve Cryp tography 27

written with pleasing symmetry. A famous conjecture asserts that for s G C \ {-2, -4,... } suchthat ((s) = 0, Re(s) = \. For an elliptic curve E over Fpfc, where we will write q = pk and£ = q + l - \G(E,¥q)\, we define

^.^giaadi,-.)Theorem 16 ([Wa, pp. 97, 355]). We have:

ql-2s -£q~s + lCe(s)

( l - q s ) ( l - q i - s )

which has the evident symmetry (e(s) = (e(1 — s).Proof. First consider the polynomial x2-£x-\-q = (x-pi)(x-p2). This divides (xn-pi)(xn-p%) = x2n - (pi 4- p2)xn 4- qn (since p\p2 = q), with quotient we will call f(x).

We have ip2 - £<pq 4- q = 0 by Theorem 12, so 0 = f(<pq)(y>2q - £yq + q) = vf ~ (pi +P2 )Vq +qn = 0, so by the uniqueness of £ from Theorem 12, p? +■ p% = <T + 1 ~ \G(E,¥q^)\.(It is easy to see inductively that pi 4- p2 is an integer for all n.) In particular,

& ( 8 ) = e x p p J - i £ ^ £ 2 1 , - U e x p £ g ^ ^ ^ ~ ^ gn = l

|G(^,F,n)| _ns\ _on(<Sr «?" + 1 - P? - P2 - n s

Using the Taylor expansion log(l - q s) = Y1 ~3~T~> we 0Dtain

Ce(s) = exp (- log(l - q1-*) - log(l - <T*) + log(l - piT) 4- log(l - p2T))ql~2s -£q~s + 1

l - s ^( l - 9 - ) ( l - 9 :

Moreover, we can verify the Riemann hypothesis in this case.Theorem 17 ([Wa, p. 357]). //Ce(«) = 0, Re(s) = \.

Proof. Indeed, for the numerator q~2s(q2s - £qs + q) to vanish, we find by the quadratic formula(treating the numerator as a quadratic in qs) that <f = i±yfl2 ~4q. By the Hasse bound, £2 < 4q,so the two solutions for qs satisfy \qs\ = \Jt + ^T^ = V^- This implies \qs\ = qRe(s) = y/q,o r R e ( « ) = | . □2.5 Elliptic Curve CryptographyIn this section we present the factoring algorithm using elliptic curves. After this we study thediscrete logarithm problem on elliptic curves and study its security. Assuming the hardness of thisproblem, we provide elliptic curve variants of several of the cryptosystems mentioned in Section2.2.3. Finally, we detail the practical implications of using elliptic curves in place of computationsover Fx.

2.5.1 Factoring via G(E, Z/nZ)It is now very easy to describe the elliptic curve factoring algorithm, since it is a minor modificationof Pollard's algorithm, and uses in a simple way the development of the preceding two sections.The key difference is that instead of working over Z/nZ, we work over G(E, Z/nZ) for a suitablychosen elliptic curve E.

We first randomly choose a curve y2 = x3 4- ax 4- b with a point P on it. As Lenstra observesin [Le], this can be done efficiently by choosing a G Z/nZ randomly, choosing a random pair

Page 30: The Harvard College Mathematics · PDF fileThe Harvard College Mathematics Review is produced and ... and in summer programs ... This new step in The HCMR's maturation is truly impressive

2 8 T h e H a r v a r d C o l l e g e M a t h e m a t i c s R e v i e w 2 . 1

P = (u, v) G (Z/nZ)2, and setting b = v2 -u2 -au mod n, so that P lies on E. Then for somelarge integer k with many small prime factors, we compute kP by repeated squaring, using the sameformula as if Z/nZ were a field. (We could, as in Pollard's algorithm, take k = lcm(l, ...,N)for some large integer N.) Note that inverses modulo n are efficiently computable via Euclid'salgorithm, so this process is computationally efficient. If at any point during this computation,we fail because an inverse cannot be computed, we have found a nonzero nonunit of Z/nZ, thusfactoring n.

Write Z/nZ = Z/p^Z x • • • x Z/p£fcZ, and consider the induced decomposition of theelliptic curve given in Theorem 8. Then the failure to compute an inverse by the usual methodoccurs whenever P G G(E, Z/p^Z) x • • • x G(E, Z/p£feZ) has a coordinate that is the pointat infinity in G(E,Z/p^Z) for some i. This is essentially the same behavior as in Pollard'salgorithm, except that the component elliptic curves have varying orders. The requirement that khave many shared factors with <p(n) now changes to having shared factors with the orders of thesecomponent curves. By picking many random curves, it is likely that one of the curves will havesmall factors in ip(n).

2.5.2 Elliptic Curve Discrete Logarithm ProblemThe elliptic curve discrete logarithm problem is as follows: Given G(E, K), a point P G G(E, K),and Q = nP, compute n. There are a tremendous number of algorithms that aim to solve thisproblem, many for very specific classes of elliptic curves. A short introduction to this can be foundin [Wa] and a thorough treatment can be found in [CFA]. Here we cover general methods developedby Pollard, as well as a solution using the results of Section 2.4 specific to a certain class of curves.2.5.2.1 Pollard p and A AlgorithmsThe Pollard p algorithm is very simple and general; it can be used to compute the discrete logarithm in Fx as well. We will write the algorithm here using the notation of arithmetic on an ellipticcurve, with uppercase letters representing points on the curve and lowercase letters representingintegers. Pick a function / : G(E, K) —> G(E, K) of sets that, as [Wa] describes, "behaves ratherrandomly." While / need not be an endomorphism, we will require it to be reasonably explicit, ina sense made precise below. The intuition is that if / were chosen so that its value at every pointwas another truly random value of the curve, one would expect iterating P, f(P), f(f(P)), etc. to

repeat a value in roughly J^12iEJ91 steps.Pick random i0, j0 and define So = PoP 4- <?oQ, where nP = Q is the discrete logarithm

instance we are trying to solve. Also define Si = f(Si-i), but keep track of values pi,qi suchthat Si =piP 4- qiQ- This imposes a requirement that / be sufficiently explicit for the image of apoint of the form Si-i = pi-iP 4- qi-iQ to have an efficiently computable representation in theform piP 4- qiQ- In other words, we need to be able to write the difference f(Si) - Si in the formpP + qQ- Now suppose that for some integers a ^ (3, Sa = Sp. This will eventually be the case,since G(E, K) is finite. Then paP-\-qaQ = ppP+qpQ, or P(pa-pp) = Qfap-qa). From here,there are only k — gcd(qp — qa, \G(E, K)\) different possible choices for the logarithm, so we cantestthemall. This is because modulo ra = lG(^K)l t the value of n is simply (q0-qa)~1(Pa-pp),and there are only k possible choices modulo \G(E, K)\ that leave this residue modulo ra.

The Pollard A algorithm is a variant on this idea. Instead of having a single starting point So,this process is simultaneously carried out on an array of values ,So,o, So,i,..., So.fc. Iftnere is amatch between two different paths, we can get a relationship similar to that in the p algorithm, andagain find few possibilities for the logarithm.

Interestingly, as [Wa] explains, the p and A are chosen to match the nature of these algorithms.In the p algorithm, one searches for a path that loops back to itself, which might look like the Greekletter p. In the A algorithm, two paths need to converge together, which looks like a A.2.5.2.2 Reduction via the Weil PairingStarting with a result of Menezes, Okamoto, and Vanstone in [MOV], cryptographers have managed to use Weil and other pairings on elliptic curves to reduce instances of the elliptic curve

Page 31: The Harvard College Mathematics · PDF fileThe Harvard College Mathematics Review is produced and ... and in summer programs ... This new step in The HCMR's maturation is truly impressive

Shren ik Shah—A Tas te o f E l l i p t i c Curve Cryp tography 29

discrete logarithm problem to instances of the discrete logarithm problem over finite fields, whichare usually much easier to solve.

It is possible to efficiently compute the order of a point as a consequence of Theorem 15, whichcan be used to restrict the range of possible orders of the point to just finitely many, each of whichcan then be checked. There are faster algorithms, a subject covered in Section 19.4 of [CFA].

Suppose that we need to compute s = logP Q, where P has order n. We have Gn(E, ¥p) CG(E,¥pk) for some choice of k. One chooses R G G(E,¥pk), and computesits order ra. Next one computes R' = j^jR € Gn(E,¥p) (since (ra,n) | n), and computes£ = \ogWn(piRt) wn(Q, R'), a discrete logarithm problem in ¥pk. Then wn(P, R'Y = ™n(£P, R')= wn(Q,R'), implying that in the subgroup wn(Gn(E,¥p),Rf) = C(m,n) C Cn, we have£P = Q, which in turn implies that s = £ mod (ra, n). For sufficiently many choices of R, thevalues £ mod (ra, n) should allow one to reconstruct s mod n.

Unfortunately, the discrete logarithm problem over ¥pk is difficult if k is large. We define acurve E over ¥p to be supersingular if \G(E,¥P)\ = p 4- 1 - a where a = 0 mod p. One canshow that k is rather small in the case of such a curve. Following [Wa], we will do this in thespecial case of a = 0.Theorem 18 ([Wa, p. 146]). If\G(E,¥p)\ = p 4- 1, and there exists P G G(E,¥P) of order n,then we can take k = 2 above.

Proof. Since P has order n, n | p + 1 by Lagrange's theorem, and thus 1 = -p mod n. LetQ G Gn(E, ¥p). We have <p2p = -p by 12, so y2p(Q) = -pQ = 1 • Q = Q since nQ = 0. SinceV ? 2 f i x e s e x a c t l y G ( E , ¥ p 2 ) , Q G G ( E , ¥ p 2 ) . D

2.5.3 Cryptography

Many of the cryptographic schemes presented in Section 2.2.3 depended upon the hardness ofthe discrete logarithm problem in Fx. Resting on the assumption that the elliptic curve discretelogarithm problem is hard for a class of instances of G(E, R) that can be efficiently generated, wecan develop secure cryptographic protocols. The following are given in [Wa], though the details oftheir security are discussed more fully in [CFA]. As in the last subsection, we use lowercase lettersto indicate integers and uppercase letters to denote points on the elliptic curve. In some cases, suchas the Elgamal Public Key Encryption scheme, the message should be encoded as a point on thecurve. In others, such as the Digital Signature Algorithm, it is encoded directly as an integer.

2.5.3.1 Diffie-Hellman Key ExchangeIt is an open problem to prove the difficulty of determining the point xyP from P, xP, and yP,even under the assumption that the discrete logarithm problem is hard. This problem is similarto that of the Diffie-Hellman problem described in Section 2.2.2, and is called the elliptic curveDiffie-Hellman problem.

For the protocol, Alice and Bob agree on a choice of E and p so that the elliptic curve Diffie-Hellman problem is hard for G(E, ¥p), and agree on a point P G G(E, ¥p). Alice secretly choosesx while Bob secretly chooses y, both integers. Alice sends xP to Bob, who sends yP back. Theyboth compute xyP, and extract a key from it.

2.5.3.2 Elgamal Public Key EncryptionSuppose that Bob wants to receive a message. He publishes an elliptic curve E over Fp, p, a pointP, and xP, where x is his secret key. Alice encrypts her message M G G(E, ¥p) by picking asecret integer y and computing and sending C\ = yP, C2 = M 4- y(xP) to Bob, who decryptsby computing C2 — xC\ = M. It is unclear how an eavesdropper would be able to compute Mwithout solving the discrete logarithm problem, though in the manner of Theorem 1 one can relatethe hardness of breaking this cryptosystem to the elliptic curve Diffie-Hellman problem. Again, themodification to produce an elliptic curve algorithm from that given above was simply a matter ofwriting the variables additively and using the group of points on a curve E.

Page 32: The Harvard College Mathematics · PDF fileThe Harvard College Mathematics Review is produced and ... and in summer programs ... This new step in The HCMR's maturation is truly impressive

3 0 T h e H a r v a r d C o l l e g e M a t h e m a t i c s R e v i e w 2 . 1

2.5.3.3 Digital Signature AlgorithmRecall that Alice has a document ra G Z that she wishes to sign. Alice picks E over ¥q, where\G(E,¥q)\ = ep for p a prime and e very small, say 1,2, or 4. She also picks a point P GG(E, ¥q) of order p, just as in the classical variant. Finally she picks a secret integer s and publishes (E,¥q,p, e, P, sP). Alice picks a random integer 1 < £ < p, computes R = £P = (x, y)and a = k~ (ra 4- s#) mod p, and produces a signed document (ra, R, a). The verifier Bob computes ci = a~1m mod p, C2 = a~lx mod p, and accepts if R = c\P 4- c2(sP). A correctlysigned document is always accepted, since R = /cP = a-1(ra 4- sx)P = cxP 4- C2(sP). Asbefore, if the discrete logarithm problem is hard, it seems difficult to use the public information,even with many signed documents, to forge a signature, except perhaps in some special cases. It isunknown, however, whether the hardness of breaking this algorithm follows from the hardness ofthe discrete logarithm.

2.5.4 Cryptosystems Using the Weil PairingIn [BF], Boneh proposed an identity-based encryption scheme using a type of nondegenerate bilinear pairing that can be constructed, for example, from the Weil pairing. In this section, we followthis paper's construction, though for simplicity we provide only the weakest cryptosystem developed (one that is vulnerable to certain attacks). Moreover, we will define the new pairing only forthe curve E given by y2 = x3 4- 1.

Let p = 2 mod 3 be prime. We choose E as above because the pairing needed has a particularly natural description in terms of the automorphism o : G(E,¥p2) —▶ G(E,¥p2) givenby (x,y) i—▶ (x,£y), where C is a primitive third root of unity. One can show, moreover that\G(E, ¥p)\ = p 4-1, so the weakness proved in Theorem 18 applies here.

We fix a prime q | p 4- 1 and a point P G G(E,¥P) of order q. Then the modified Weilpairing is the function wq(Pi, P2) = wq(P\, a(P2)), though we will be interested in its restrictionto S x &(S), where S = (P). The modified pairing, still bilinear, satisfies a different kind ofnondegeneracy property: for P' a generator of Gq(E,¥p), wq(P', P') is a primitive q* root ofunity. We denote by Cq C C the group of qth roots of unity.Setup. A secret integer s is chosen by the Private Key Generator (PKG), while sP = Q is madepublic, together with p, q, and P. Two cryptographic hash functions Hi : {0,1}* —▶ S \ {0} andH2 : Cn —▶ {0, l}m for some fixed ra are also made public.

Each party comes to collect from the PKG their private key ds, which the PKG generates fromtheir identifier Ib by the formula ds — sH\(Ib).Encryption.To send a message ra to Bob, who has identifier Ib, Alice generates r randomlyfrom Cq, computes the encryption key es = wq(Hi(lB), Q), and computes the cyphertext C =(rP, ra ®H2(erB)). She sends C to Bob.Decryption. Given the cyphertext C = (c\,c2), Bob computes

c2 0 H2(wq(dB,ci)) = ra 0 H2(erB) 0 H2(wq(sHi(IB),rP))= m®H2(wq(Hi(lB),Q)r)eH2(wq(Hi(lB),Q)r) = m

where the second equality is by linearity.Nondegeneracy is critical in proving the security of this algorithm, which is proved in [BF].As a final note, a simpler example of the utility of the Weil pairing is in the one-round tripartite

matching protocol proposed by Antoine Joux in [Jo]. In this protocol, parties pi,p2,P3 have apublic non-degenerate (in the sense defined in this section) bilinear map b : G x G —▶ Cn such asthe modified Weil pairing above and generator g G G. They pick random secrets si, s2, S3, and pisends Sig to every other player. Finally pi computes b(s2g, S3g)S3 = b(g, g)sis*S3, as do each ofthe other players. The value of b(g, g)sis*ss is the established common key.

The theory of pairings is one of the most active and interesting in elliptic curve cryptography.Indeed, there is an entire conference each year, called Pairing, dedicated solely to their study. Thereare additional pairings other than the Weil and modified Weil pairings discussed here, including theTate and Tate-Lichtenbaum pairings, and many more applications. One can even define pairings onmore general varieties than elliptic curves.

Page 33: The Harvard College Mathematics · PDF fileThe Harvard College Mathematics Review is produced and ... and in summer programs ... This new step in The HCMR's maturation is truly impressive

Shren ik Shah—A Tas te o f E l l i p t i c Curve Cryp tography 31

2.5.5 Practical ConsiderationsThe value of elliptic curve cryptography, already illustrated by the discussion of pairings in theprevious section, is increased by several more pragmatic considerations.Security. The primary benefit of elliptic curve cryptography is that seemingly greater security maybe achieved than for cryptosystems implementing the usual discrete logarithm problem. As explained in [HMV], the most efficient algorithms to solve the discrete logarithm problem over(Z/pZ)x are in subexponential time, far more efficient than the best algorithms to solve the discretelogarithm problem on general elliptic curves (Pollard's algorithm, discussed in Section 2.5.2.1, isessentially the best known method).Efficiency. The Elgamal cryptosystem based on the usual discrete logarithm problem was discussedin the section on cryptography. This algorithm requires computations that in practice are far moretime consuming than the equivalent computations on elliptic curves, where doubling and additiontend to be more efficient. Moreover, the security benefit mentioned earlier creates a new source ofefficiency. The integers needed to do elliptic cryptography with comparable security to classicalcryptosystems are usually a tenth of the size, creating a significant speedup in arithmetic on thecurve.Versatility. Elliptic curves, as illustrated by the factoring algorithm, are numerous for a givenchoice of ring Z/nZ. This creates an extra dimension of versatility and functionality missingin classical algorithms, and explains the huge improvement over Pollard's factoring methods.Structure. The Weil pairing and other structures defined on elliptic curve provide a rich range oftools. As Sections 2.5.2.2 and 2.5.4 reveal, the interaction of these structures on the curve can behelpful for designing cryptosystems or performing computational tasks.It is not wise to immediately replace all one's cryptosystems by those employing elliptic curvemethods, however. Elliptic curves are very complex objects, and in most algorithms above, thecurve E is published. The discrete log problem has been shown to be much easier on many subclasses of elliptic curves by results such as Theorem 18, so if one selects a curve with particularproperties, many of which might not be readily apparent, an adversary could crack the cryptosystem using "special" techniques specific to that curve. That said, much work has indeed been doneon the selection process to produce a curve with optimal security, and [CFA] and [HMV] providesubstantial coverage of this.

The study of more sophisticated number theory such as modular forms and complex multiplication can have cryptographic implications, as discussed in [CFA]. A generalization of ellipticcurve cryptography is hyperelliptic curve cryptography, which performs arithmetic on the Jacobianof hyperelliptic curves. Elliptic curve cryptography is a subject where deep number theory hasdirect impact on practical matters, and the results described in Sections 2.4 and 2.5 only scratch thesurface.

2.6 ConclusionsFor the reader interested in investigating this material further, the following books provide additional information on the topics discussed in this paper:Cryptography. The approach taken by this paper most closely parallels the philosophy of [MvOV],which focuses on providing descriptions of real cryptosystems, rather than establishing the foundations of the subject. An amazing introduction to foundational cryptography can be found in [Gol]and [Go2].Elliptic Curves. Three introductory texts on elliptic curves are [M2], [ST], and [Wa], which together cover all the material here and much more. For a more advanced introduction, look to [Sil]and [Si2] or [Hu]. For those interested in the complex-analytic aspects of the subject, [La] and [Fo]are excellent texts. From an algebro-geometric viewpoint, one might look into [Ha] or [Ml].Elliptic Curve Cryptography. For all-purpose text that is encyclopedic in both its coverage of thetheory and practice of elliptic curve cryptography, look to [CFA]. The text [HMV] focuses on thepractical aspects of the subject and is very readable.

Page 34: The Harvard College Mathematics · PDF fileThe Harvard College Mathematics Review is produced and ... and in summer programs ... This new step in The HCMR's maturation is truly impressive

3 2 T h e H a r v a r d C o l l e g e M a t h e m a t i c s R e v i e w 2 . 1

The group of points on an elliptic curve can be used as a replacement for multiplicative groupsin many situations. As we saw in the factoring algorithm, the key improvement was that factoring nvia Pollard's algorithm could be reduced to "guessing" a number that is either a multiple of <^(n),or has many factors in common with it, while over an elliptic curve, <p(n) is replaced with thegroup order, which can take on many values for fixed n as E varies. This added flexibility gives anew dimension to the development of cryptographic systems, since, as seen in the examples above,one can vary the curve E in setting parameters while keeping the field fixed. Although we did notdelve into the more complex algorithms that make more important use of this phenomenon, thefactoring algorithm shows that elliptic curves can provide important computational savings. Wealso saw that structures such as the Weil pairing can give rise to cryptosystems for which no simpleimplementation using classical methods are known. For their security, efficiency, and versatility, elliptic curve methods are used for real cryptographic applications every day, revealing their amazingpervasiveness in both theory and applications within mathematics and computer science.

References[BF] Dan Boneh and Matt Franklin: Identity-Based Encryption from the Weil Pairing, Ad

vances in Cryptology-Crypto 2001: 21st Annual International Cryptology Conference,Santa Barbara, California, USA, August 19-23, 2001, Proceedings (2001).

[Bo] Dan Boneh: Twenty years of attacks on the RSA cryptosystem, Notices of the Amer.Math. Soc. 46 #2 (1999), 203-213.

[CFA] Henri Cohen, Gerhard Frey, and Roberto Avanzi: Handbook of Elliptic and HyperellipticCurve Cryptography. Boca Raton: CRC Press 2006.

[CLRS] Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, and Clifford Stein: Introduction to Algorithms. Cambridge, MA: MIT Press, McGraw-Hill 2001.

[Fo] Otto Forster: Lectures on Riemann Surfaces. New York: Springer-Verlag 1981.

[Ga] Steven D. Galbraith: The Weil Pairing on Elliptic Curves over C, preprint (2005).

[GB] Shaft* Goldwasser and Mihir Bellare: Lecture notes on cryptography, Summer Course,"Cryptography and Computer Security," at MIT 1996.

[Gol] Oded Goldreich: Foundations of Cryptography: Volume 1, Basic Tools. New York:Cambridge Univ. Press 2001.

[Go2] Oded Goldreich: Foundations of Cryptography: Volume 2, Basic Applications. NewYork: Cambridge Univ. Press 2004.

[Ha] Robin Hartshorne: Algebraic Geometry. New York: Springer-Verlag 1977.

[HMV] Darrel R. Hankerson, Alfred J. Menezes, and Scott A. Vanstone: Guide to Elliptic CurveCryptography. New York: Springer-Verlag 2004.

[Hu] Dale Husemoller Elliptic curves. New York: Springer-Verlag 2004.

[Jo] Antoine Joux: A One Round Protocol for Tripartite Diffie-Hellman, J. of Cryptology 17#4 (2004), 263-276.

[Ko] Neal I. Koblitz: A Course in Number Theory and Cryptography. New York: Springer-Verlag 1994.

[La] Serge Lang: Elliptic Functions. New York: Springer-Verlag 1987.

[Le] Hendrik W Lenstra: Elliptic curves and number-theoretic algorithms, (1986).

Page 35: The Harvard College Mathematics · PDF fileThe Harvard College Mathematics Review is produced and ... and in summer programs ... This new step in The HCMR's maturation is truly impressive

Shren ik Shah—A Tas te o f E l l i p t i c Curve Cryp tography 33

[MOV] Alfred J. Menezes, Tatsuaki Okamoto, and Scott A. Vanstone: Reducing elliptic curvelogarithms to logarithms in a finite field, Information Theory, IEEE Transactions on, 39#5 (1993), 1639-1646.

[MvOV] Alfred J. Menezes, Paul C. van Oorschot, and Scott A. Vanstone: Handbook of AppliedCryptography. Boca Raton: CRC Press 1997.

[Ml] James S. Milne: Abelian varieties, course notes.

[M2] James S. Milne: Elliptic curves, course notes.

[Shaf] Igor R. Shafarevich: Basic Algebraic Geometry, Volume 1. New York: Springer-Verlag1994.

[Sham] Adi Shamir: Identity-based cryptosystems and signature schemes, Proceedings ofCRYPTO 84 (1985), 47-53.

[Sil] Joseph H. Silverman: The Arithmetic of Elliptic Curves. New York: Springer-Verlag1986.

[Si2] Joseph H. Silverman: Advanced Topics in the Arithmetic of Elliptic Curves. New York:Springer-Verlag 1994.

[ST] Joseph H. Silverman and John Tate: Rational Points on Elliptic Curves. New York:Springer-Verlag 1992.

[Si] Michael Sipser: Introduction to the Theory of Computation. Course Technology 1996.

[Wa] Lawrence C. Washington: Elliptic Curves: Number Theory and Cryptography. BocaRaton: Chapman & Hall/CRC Press 2003.

Page 36: The Harvard College Mathematics · PDF fileThe Harvard College Mathematics Review is produced and ... and in summer programs ... This new step in The HCMR's maturation is truly impressive

STUDENT ARTICLE

Bridging the Group Definition GapMatthew G. Dawson*Union University '08

Jackson, [email protected]

[email protected]

AbstractIn the early 1830s, a young French mathematician named Evariste Galois laid the foundations ofgroup theory, although he never precisely defined groups. Galois studied groups in the context ofsets of arrangements and his ideas were reformulated into a more abstract setting in the twentiethcentury. This paper provides precise definitions for constructs closely related to Galois's originalnotion of group theory and explores important group properties in that context, demonstrating thatthe modern concepts of the group, subgroup, normal subgroup, and solvable group can be expressedin terms of arrangement sets.

In the early nineteenth century, the theory of polynomials in a single variable was significantlyadvanced. Paolo Ruffini (1765-1822) discovered that the general quintic polynomial is not solvable by radicals and produced a nearly complete proof of this result. Niels Abel (1802-1829) wasable to solve one of the greatest open questions of his day when he provided a correct and completeproof of Ruffini's discovery. A precocious French mathematician by the name of Evariste Galois(1811-1832) then made a surprisingly complete advancement, discovering a criterion that determines when a polynomial is solvable by radicals. Along the way, he stumbled upon the branchof mathematics now known as group theory. Galois associated groups with polynomial equations,showing that a polynomial equation is solvable by radicals when the associated group has a certainproperty now known as solvability.

Interestingly, although Galois was the first to study groups in the abstract setting, his conceptof group bears no superficial resemblance to the more familiar definition that is found in moderntextbooks. Today, a group is defined as a set, together with an associative binary operation, havingboth the identity and inverse properties.1 Galois, however, thought of groups in the context ofarrangements. A rigorous study of Galois's arrangement sets and their relation to modern grouptheory is not well known and is difficult to find, although such an exposition may be found in [Ti].This paper reconciles the two definitions of group and studies the relationships between the twoperspectives. Ultimately, we shall determine precisely how the modern definition of solvable grouptranslates into Galois's terminology.

To begin our journey, we must provide precise definitions for the terminology that Galois used.First, we define the concept of arrangement key to Galois's formulation of group theory.Definition 1. Given a nonempty finite set S of n elements, an arrangement of S is an n-tuple(ai,ei2,... ,an) G Sn such that for every element s G S there exists exactly one i such thatai = s, 1 < i < n.

In addition, the set of all arrangements of a set S is denoted by Arr(S) and the set of allpermutations on 5 is denoted by Sym(S).

* Matthew Dawson, Union University '08, is a mathematics major who lives in Jackson, TN. Thanks to homeeducation from his parents, he will be graduating four years early. In addition to mathematics, interests includephysics, computer science, history, and music.

!That is, there is an identity element of the group and every element of the group has an inverse.

34

Page 37: The Harvard College Mathematics · PDF fileThe Harvard College Mathematics Review is produced and ... and in summer programs ... This new step in The HCMR's maturation is truly impressive

Matthew G. Dawson—Bridging the Group Def in i t ion Gap 35

To illustrate this definition, let us consider a simple example. Suppose S = {a, b, c}. We listthe elements of Arr(5), denoting abc rather than (a, b, c) for brevity:

Arr ({a, b, c}) = {abc, acb, bac, bca, cab, cba}.

A permutation on S is simply a one-to-one correspondence mapping S into itself. Using thecyclic notation for permutations, we have that Sym(S') = {(ab), (be), (ac), (abc), (acb), id}.

The power and versatility of arrangements is demonstrated by our first observation, whichallows us to associate permutations on Arr(S) with permutations on S.

Proposition 2. Let S be a finite set with n elements.

1. Let f G Sym(S), and consider the mapping Pj on Arr(S) such that for each arrangementOi = (ai,a2,a3,--.,an) G Arr(S),

Pf(a) = (f(ai),f(a2), f(a3),..., f(an)).

Then P/ is a permutation on Arr(S).

2. For all a,/3 G Arr(S), there exists a unique permutation f G Sym(S) such that P/(a) =

3. For all f,g G Sym(S), Pf o Pg = Pfog.

The third part of Proposition 2 establishes that the map / i-+ P/ is a homomorphism fromSym(S) to {Pf | / G Sym(S)} C Sym(Arr(S')); the second part tells us that it is an isomorphism; that is, Pf = Pg if and only if / = g.

The fact that such an isomorphism exists should not be surprising, when Cayley's Theorem isconsidered: clearly, if a set S has n elements, then | Sym(5)| = n! and | Arr(S)| = n!. Cayley'stheorem tells us that because Sym(S) has n! elements, it will be isomorphic to a subgroup ofSn\. Since Sym(Arr(5)) is isomorphic to Sn\, we see that there must be an isomorphism betweenSym(S') and some subgroup of Sym(Arr(5)).

Proposition 2 assures us that the permutations in Sym(S') can be applied to arrangements inArr(5) in a well-behaved fashion. Indeed, Proposition 2 sets up a group action of Sym(S') onArr(S). It does so because the map / h-> P/ is a homomorphism from Sym(S) to Sym(Arr(S)).Furthermore, the second part of Proposition 2 implies that the homomorphism is one-to-one, sothat the group action is faithful.

Corollary 3. Let S be a finite set. Then the mapping P : Sym(S') —▶ Sym(Arr(S)) given byP(f) = pf is a faithful action ofSym(S) on Arr(S).

Henceforth, we will drop the notation P/ and instead use the same notation to denote a,permutation on S and the corresponding permutation on Arr(5). In addition, we will denote functioncomposition by juxtaposition.

Now that a group action has been set up, the concept of orbit may be discussed. The readermay recall that, given a group G, a set M, and an action of G on M, the orbit of ra G M is definedto be G(m) = {g(m) \ g G G}.

Now, if a set S and an arrangement a G Arr(S) are considered, then the orbit of a is Arr(5)(that is, (Sym(5))(a) = Arr(S)). We know this by part two of Proposition 2, which tells us thatgiven any arrangement in Arr(5), we can find a permutation in Sym(S) that will map a to thatarrangement.

A more general concept similar to that of orbit will be quite useful in this paper; we need toconsider the application of subsets of Sym(-S) on a single arrangement.Definition 4. Let 5 be a nonempty finite set and let H C Sym(S). Then for all arrangementsa G Arr(S), we define if (a) - {/(a) | / G H}.

Page 38: The Harvard College Mathematics · PDF fileThe Harvard College Mathematics Review is produced and ... and in summer programs ... This new step in The HCMR's maturation is truly impressive

3 6 T h e H a r v a r d C o l l e g e M a t h e m a t i c s R e v i e w 2 . 1

To illustrate this definition, let H = {id, (abc), (acb), (ac)}. In this case, we have

H(abc) = {abc,bca,cab,cba}.

Now suppose we are given a finite set S, an arrangement a of S, and a set M of arrangementsof S. We define:Definition 5. Let 5 be a nonempty finite set, let C C Arr(5), and let a eC. Then the permutation set of a in C, denoted ixa (C), is the set

Ma(C) = { /€Sym(5) | / (a)€C}.

Let us go back to our previous example, where S = {a, b, c}, C = {abc, bca, cab, cba}, anda = abc. Then Ma (C) will be the set of all permutations that map abc to some arrangement in C.The reader can check that txa(C) = {id, (abc), (acb), (ac)}.

As suggested above, the txa(C) construction is the inverse of the H(a) construction. We statethis formally in the following lemma.Lemma 6. Let S be a nonempty finite set, let H C Sym(S) and M C Arr(S), and let a G M.Then H(a) = M if and only ifH = MQ(M).Proof. First suppose that H(a) = M. We wish to show that H = Ma(M). Let / G H. Thenf(a) G H(a) = M and hence / G cxa(M). Thus, H C Ma(M).

Next let h G MQ(M). Thus h(a) e M = H(a), so that h(a) = f(a) for some f e H.Therefore, by Proposition 2, we know that h = f, so that /i G H. Hence, Ma(M) C #. Thus,ff = Ma(M).

To prove the other half of the biconditional, suppose that H = txia(M). We must show thatH(a) = M. Let 0 G H(a), so that 0 = /(a) for some / G H = x\a(M). Now / G Ma(M)implies that /? = /(a) G M. Thus H(a) C M.

Finally let 0 e M. Then by Proposition 2, 0 = g(a) for some p G Sym(S). Clearlyp G tx ]Q(M) = H. Thus, p G H imp l ies tha t /? = g(a) G i / (a ) . □

So far, we have associated a permutation set with each pair (a, M) where a is an arrangementand M is an arrangement set. We can also define a permutation set directly associated to a givenarrangement set.Definition 7. Let S be a nonempty finite set, and let C C Arr(5). Then the total permutationset associated with C (or total permutation set of C), m(C), is the set

txi(C) = {/ G Sym(5) | 3a G C such that /(a) G C}.

By checking the relevant definitions, we see that

M(<7) = |J Ma(C).ot£C

Hence, we have that tx\a (C) C txj(C) for all a G C.As before, an example will help to illustrate the definition. We consider again C = {abc, bca,

cab, cba}. Then,

IXl(C) = Mabc(C) U Mbca(C') U Xcab(C) U EXcba(C)= {id, (abc), (acb), (ac), (ab), (be)} = Sym(S).

Now that we can associate permutation sets with arrangement sets, we are ready to study theimplications of those associated permutation sets having special properties, the most important ofwhich is described in our next definition.Definition 8. A set C of arrangements of a nonempty finite set S is a Galois Set of Arrangements(or (GSA)) if for all / G cxi(C), a G C implies f(a) G C.

Page 39: The Harvard College Mathematics · PDF fileThe Harvard College Mathematics Review is produced and ... and in summer programs ... This new step in The HCMR's maturation is truly impressive

Matthew G. Dawson—Bridging the Group Def in i t ion Gap 37

The above definition lays out the single most important concept in this paper, which is a closeapproximation to Galois's original concept of the group. Recall that the more familiar definitionstates that a group is a set of objects with an associative binary operation such that the set hasan identity element and contains inverses for every element in the set. Galois sets of arrangementsbear no immediate resemblance to this algebraic structure. However, these two concepts are closelyrelated. We shall soon see that the connection between GSAs and algebraic groups arises throughthe permutation sets associated with GSAs.

Before moving any further, let us determine whether C = {abc, bca, cab, cba} is a GSA. Firstrecall that m(C) = {id, (abc), (acb), (ac), (ab), (be)} = Sym(S). In order for C to be a GSA,it must be the case that each permutation g G ex(C) maps every arrangement in C to anotherarrangement in C. Consider / = (ab), which is an element of \x\(C). Now f(abc) = bac, andbac £ C. Therefore, because / = (ab) G txj(C) yet abc G C and f(abc) <fc C, we see that Ccannot be a GSA.

Let us consider another example: suppose that M = {abc, acb}. We shall first list out all ofthe elements of dxj(M).

abc -^ abc abc -4- acb acb -4> abc acb -L> acb

Hence tx(M) = {id, (be)}. Because all the permutations in txi(M) map both abc and acb to eitherabc or acb (this fact can be checked by examining the above diagram), we have that M is a GSA.

The reader may have noticed that txabc(M) = txiacb(M) = cxi(M). The reader may also havenoticed that Mabc(M) forms a group of permutations in the modern sense. These observations leadus to an interesting, general result.Lemma 9.IfC is a set of arrangements of a finite set and a G C is such that txia(C) forms agroup under composition, then t>4a(C) = txi(C).Proof. Suppose that a G C such that txja (C) forms a group under composition. First we show thattxi(C) C txia(C). Let / G m(C). Then by definition of the associated permutation set, there exist0,7 G C such that f(0) = 7. By Proposition 2, there exists exactly one permutation g G Sym(S)such that g(a) = 0, and there exists exactly one permutation h G Sym(S) such that h(a) = 7.Note that by the definition of the permutation set of a in C, g G Ma(C) and h G MQ(C). Considerthe permutation hg~l:

(hg-'m = h(g~l(0)) = h(g-\g(a))) = h(a) = 7.By Proposition 2, there exists exactly one permutation / such that f(0) = 7. Thus we have/ = hg~l. But because Ma(C) forms a group under composition, we know that hg~l G cxia(C).Hence / G Ma(C). Thus, txi(C) C Ma(C).

By the definition of m(C), it is clear that ooa(C) C m(C). Therefore, we have that t<a(C) =m ( C ) . □

With this last result, we have developed all of the necessary tools to establish the connectionbetween groups and GSAs.Theorem 10. Let S be a nonempty finite set, let C C Arr(S), and let a G C. Then C is a GaloisSet of Arrangements if and only if>^a (C) forms a group under composition.Proof Suppose that C is a Galois Set of Arrangements. Then, for all / G x(C), f(0) G C forall 0 G C. We wish to show that tx\a(C) forms a group with respect to function composition.

Let f,g G ixia(C). Thus g(a) G C. Also, C is a GSA, so that f(0) G C for all 0 G C. Itfollows that (fg)(a) = f(g(a)) G C. Therefore, by the definition of the permutation set of a inC, fg G \x*a(C). Hence Ma(C) is closed under composition.

Now consider the identity permutation id : S —▶ S. Then id(a) = a, so that id G ixia(C)-Recall that, since id is the identity permutation, id 0/ = / o id = / for all permutations / GSym(S). Therefore, the set Ma(C) contains an identity element.

Next let / G txa(C). Then by the definition of the permutation set of a in C, f(a) = 7for some 7 G C. Now f~l(j) = a (recall that / is a permutation, so that /_1 exists), so that

Page 40: The Harvard College Mathematics · PDF fileThe Harvard College Mathematics Review is produced and ... and in summer programs ... This new step in The HCMR's maturation is truly impressive

3 8 T h e H a r v a r d C o l l e g e M a t h e m a t i c s R e v i e w 2 . 1

f~l G m(C). Thus, since C is a GSA, f~\0) G C for all 0 G C. Hence, f~x(a) e C so that/"* G txia (C). Thus, Ma (C) contains an inverse for each element, whence Ma (C) forms a groupwith respect to function composition.

By Lemma 9 we know that Ma(C) = m(C). We wish to show that C is a GSA. Let / Gm(C). In order to show that C is a GSA, we must to show that f(0) G C for all 0 e C. Now/ G Ma(C), since MQ(C) = m(C). Next let /3 G C. By Proposition 2, there exists exactly onepermutation h : S —▶ S such that /i(a) = 0. Clearly, h G Ma(C). But Ma(C) forms a groupunder composition, so that fh G MQ(C). In other words, (fh)(a) = f(h(a)) — f(0) G CT h e r e f o r e , C i s a G S A . D

Let us look at Theorem 10 in light of the examples we have used so far. For the set C ={abc, bca, cab, cba}, we recall that Mabc(C') = {id, (abc), (acb), (ac)}. Now, Mabc(C) is not agroup. Therefore, Theorem 10 tells us that C is not a GSA, confirming our earlier observation.Also, for M = {abc, acb}, we saw that txtabc(M) is a group. Theorem 10 then tells us that M is aGSA, as we determined above.

It should be noted that Theorem 10 finishes the task of reconciling the two group definitions. Aset C of arrangements is a Galois set of arrangements if and only if at least one of the permutationssets of an arrangement in C is a group. Theorem 10 also implies that if C forms a GSA, thenMa (C) forms a group for each a G C.

Now, we suppose that Ma(C) forms a group with respect to function composition. Lemma 9then guarantees that Ma(C) = m(C), so that m(C) is a group. Thus, if C is a GSA, then thetotal associated permutation set of C is a group. The converse, however, is not true—the totalpermutation set of C may be a group even if C is not a GSA. For instance, we showed that C ={abc, bca, cab, cba} is not a GSA and also determined that m(C) = Sym(S). From group theory,we know that Sym(S') is a group that is isomorphic to S3.

Our next priority is to determine how solvability translates to the language of permutation sets.Before doing that, we state a few more results and give one more definition.Lemma 11. Let S be a nonempty finite set, let H C Sym(S), and let M be a GSA ofS. Then forall a G M, H(a) = M if and only ifH = m(M) .Lemma 12. Let T and V be sets of permutations of a finite set S, and let a be an arrangement ofS. Then T(a) = V(a) if and only ifT = V.

We have already studied how to apply a permutation set to an arrangement. Next, we shalldefine the application of a single permutation to an arrangement set.Definition 13. Let S be a nonempty finite set, and let M C Arr(S). Then for all permutationsg G Sym(S), we define g(M) = {0(7) | 7 G M}.

To illustrate the above definition, let M = {abc, acb} and g = (ab) and consider the followingdiagram:

abc ^ bac

acb —▶ bca

Then, we see that g(M) — {bac, bca}.In what follows, we respectively denote gH and Hg for the left and right cosets of H c G in

G associated tog eG.Theorem 14. Let N be a GSA of a finite set S, and let H = tx\(N). Then for all permutationsg G Sym(S),

1. The set g(N) is a GSA ofS and tx(g(N)) = gHg~l

2. For all a G N, g(N) = g(H(a)) = (gH)(a).

Page 41: The Harvard College Mathematics · PDF fileThe Harvard College Mathematics Review is produced and ... and in summer programs ... This new step in The HCMR's maturation is truly impressive

Matthew G. Dawson—Bridging the Group Def in i t ion Gap 39

Proof First we prove the first statement. Let a G N and let g G Sym(S). Note that becauseN is a GSA of S, Theorem 11 assures us that N = H(a). Let 0 = g(a). We first show that^ ( g ( N ) ) = g H g - \

Let k G M/3(#(A0). Thus, k(0) = 7 for some 7 G g(N). Now, 7 = g(S) for someS G N. But AT = #(a), so that (5 = h(a) for some /i G H. Thus &(/?) = g(h(a)). Buta = g~x(0), whence k(0) = g(h(g~l(0))). Therefore, by Proposition 2, we have k = ghg'1.Hence, ix\0(g(N)) C p##_1.

Next let/i G i/ and consider ghg~l G gHg~l. Then (ghg l)(0) = (ghg 1)(g(a)) =g(h(a)) G g(N). Hence, ^/i^T1 G_M/5(p(AT)). Thus, ptfcT1 C M^(p(iV)).

Therefore, txp(g(N)) = gHg~l. As H is a group under composition, so is gHg l. Thus, byTheorem 10, g(N) is a GSA. Now, we suppose that MQ(C) forms a group with respect to functioncomposition. By Lemma 9 we know that m(#(AT)) = gHg'1.

Next, we prove the second component of the theorem. Let g G Sym(S') and a G N. ByTheorem 11, N = H(a), so that g(N) = g(H(a)). It remains to show that g(N) = (gH)(a).Thus, suppose h G H and consider gh G #//. Note that (gh)(a) = g(h(a)) and that h(a) G AT.Thus, g(h(a)) G p(AT), whence (gH)(a) C p(AT). Next, let 0 G $(#). Then, /? = 0(7) forsome 7 G AT and so 7 = h(a) for some h e H. Thus, /? = g(h(a)) = (gh)(a), whenceg ( N ) C ( < ? # ) ( a ) , w h e r e b y p ( 7 V ) = ( < ? # ) ( a ) . □

A few simple observations follow directly from Theorem 14. First, applying a permutation to aGSA yields another GSA. Next, the associated permutation set of the resulting GSA is a conjugategroup of the associated permutation set of the original GSA. Finally, in the second part of Theorem14, we see that the GSA obtained by applying a single permutation to a GSA can be obtained byapplying a left coset of the original GSA's associated permutation set to an arrangement in theoriginal GSA.

Thus, the application of a permutation to a GSA results in a GSA that captures the notions ofconjugate groups and left cosets. Rigatelli [Ri, p. 124] noticed that Galois referred to cosets as"groups," suggesting that the use of this word might confuse people attempting to understand hiswork. Perhaps Theorem 14 sheds some light on this issue: the application to an arrangement of theleft coset of a permutation group results in a GSA, which is what Galois would call a "group."

We also state a result similar to Theorem 14, which relates right cosets to the application of apermutation group to an arrangement.Theorem 15. Let N be a GSA of a finite set S, let H = \x\(N), and let a G N. Then for allgeSym(S),H(g(a)) = (Hg)(a).

Before we can study solvability, we must first study normal subgroups. To that end, we shallrefine our focus and work more specifically towards establishing a criterion which determines whena given GSA has an associated permutation set that is a normal subgroup of the permutation set ofanother GSA.

A question immediately arises: if one GSA is a subset of another GSA, is the permutation setof the former GSA a subgroup of the permutation set of the latter GSA? That is, suppose that Mand N are GSAs of a finite set S, N C M, and let G = m(M) and H = m(JV). One wouldexpect that, since N C M,\t must be true that H < G as groups. To establish the truth of thisstatement, suppose that / G H. Then, there exists a G N such that f(a) G N. Since N C M,we have that f(a) G M. Thus, by definition of total associated permutation set, / G G. Thus,H C G. Because M and N are GSAs, we know by Lemma 9 and Theorem 10 that G and H formgroups. Therefore, H < G.

In order to establish a criterion that establishes when GSAs have associated permutation setswhere one is a normal subgroup of the other, we need three lemmas and an intermediate theorem.The proofs of the lemmas are straightforward.Lemma 16. Let M and N be GSAs of a finite set S, N C M, and let G = m(M) and H = m( N).Then for all g e G and a e N, (Hg)(a) C M and (gH)(a) C M.Lemma 17. Let M and N be GSAs of a finite set S such that N C M, let G = m(M) andH = \x\(N), and let a G N. Then, the following equalities hold:

Page 42: The Harvard College Mathematics · PDF fileThe Harvard College Mathematics Review is produced and ... and in summer programs ... This new step in The HCMR's maturation is truly impressive

40 The Harvard College Mathematics Review 2.1

1. {H(0) \0eM} = {(Hg)(a) \ g G G}

2. {g(N)\geG} = {(gH)(a)\geG}.

Lemma 18. Let S be a finite set, let A C Arr(S) and B C Arr (S), and let a G Arr(S). Then(A fl B)(a) = A(a) n B(a) and (A U B)(a) = A(a) U B(a).Theorem 19. Let M and N be GSAs of a finite set S, N C M, and let G = m(M) and H =m(AT). Then, the following sets are partitions of M:

PR(M,N) = {H(0) \0eM},PL(M,N) = {g(N)\geG}.

These sets are known, respectively, as the right and left partitions of M by N.Proof. We must show that the sets Pr(M, N) and Pl(M, N) are partitions of M. By Lemma 17,we know that for all a G N, PR(M,N) = {(Hg)(a) \ g G G} and PL(M,N) == {(gH)(a) \g e G } .

We know from group theory that {gH \ g G G} forms a partition of G and that {Hg | g G G}forms a partition of G. Hence, for all f,g G G, fH n gH = 0 and Hf C\ Hg = 0. Also,C = UgecgH and G = UgeGHg.

Let f,g eG. Then, we have by Lemma 18 that (fH)(a) n (gH)(a) = (fH n p#)(a) =(0)(a) = 0. Similarly, (#/)(a) n (#p)(a) = (#/ n ify)(a) = (0)(a) = 0. Hence,Pl(M, AT) and Pr(M, N) are pairwise disjoint.

Finally, Lemma 18 implies that UgeG((gH)(a)) = (UgeGgH)(a) = G(a) = M. Similarly,UgeG((Hg)(a)) = (UgeGHg)(a) = G(a) = M. Since we know by Lemma 16 that gH C Mand Hg C M for all g G G, we have that PL(M, N) and PR(M, N) are partitions of M. n

As an example, let M = {abc, acb, bac, bca, cab, cba} and let N = {abc, acb}. Note that Mand N are both GSAs. Now, let H = \x\(N) = {id, (6c)}. Then, H(abc) = {abc, acb} = N andH(bac) = {bac, cab} and H(cba) = {cba, bca}. Note that

{H(abc), H(bac), H(cba)} = {{abc, acb}, {bac, cab}, {cba, bca}}

forms a partition of M. This partition is the right partition of M by N, denoted Pr(M, N).Also, note that [id](AT) = N = {abc,acb}, [(ac)](N) = {cba,cab}, and [(ab)](N) =

{bac, bca}. The reader can check that {[id](AT), [(ac)](N), [(ab)](N)} forms a partition of M.This partition is the left partition of M by N, denoted Pl(M, N). With this notation, we nowcontinue.Theorem 20. Let M and N be GSAs of a finite set S, N C M, and let G = m(M) and H =m(A0. Then H<G if and only ifPL(M, N) = PR(M, N).

IfH<G, we shall say that N is a normal subset of M.Proof. Let M and N be GSAs of a finite set S, N C M, and let G = m(M) and H = m(AT).First, we show that H < G implies Pl(M, N) = PR(M, N). Thus, suppose that H < G. ThenHg = gH for all g G G, so that by Lemma 12, (Hg)(a) = (gH)(a) for all a G N, g G C.Suppose that (Hg)(a) G Pr(M,N), for some g e G, a e N\ then, (Hg)(a) = (gH)(a) GPL(M,N). Similarly, suppose that (gH)(a) G PL(M,N), for some g e G, a e N; thus(pff)(a) = (#<?)(<*) G Pr(M, AT). Hence, P«(M, AT) = PL(M, AT).

Next, suppose that PL(M,N) = PR(M,N). Then for all # G <3,a G AT, we know that(gH)(a) G PL(M,N) and hence (gH)(a) G PR(M,N). Therefore, there exists /?EM suchthat (gH)(a) = H(0). Now Theorem 14 tells us that (gH)(a) is a GSA and that w((gH)(a)) =£//£-1. Thus, by Theorem 11, we know that (gH)(a) = (gHg'1)^) for all 7 G (gH)(a) =H(0). Therefore, since 0 G H(0) (recall that /? = id(/3), where id is the identity permutation inif), we have that H(0) = (^)(q) = (gHg-1)^) and since ff(/?) = gHg-{(0), Lemma 12assures us that # = gHg'1. Hence, since # = gHg~l for all g €G,we have that H <G. □

Page 43: The Harvard College Mathematics · PDF fileThe Harvard College Mathematics Review is produced and ... and in summer programs ... This new step in The HCMR's maturation is truly impressive

Matthew G. Dawson—Bridging the Group Def ini t ion Gap 41

We have now translated the concept of normal subgroups into the language of arrangement sets.Now that normality is understood in terms of arrangement sets, it would seem natural to considerquotient groups.

As cyclic quotient groups are deeply important to the concept of solvability, our final resultestablishes a criterion which determines when the quotient group of the associated permutationsets of two arrangement sets is cyclic.Theorem 21. Let M and N be GSAs of a finite set S, N C M, let G = m(M) and H = m(AT),such that H <G, and let a G N. Then the quotient jjOfGbyH is cyclic if and only if there existsa permutation f G G such that there exists n G Nfor each T G Pl(M, N) = Pr(M, N) withT=(fnH)(a) = fn(N).Proof. Let M and N be GSAs of a finite set S, N C M, and let G = m(M) and H = tx(N),such that H < G. Also, let a e N. Because H<G,we know that PL(M, N) = PR(M, N).

Next, suppose that § is cyclic; then, there exists / G G such that § is ^ = (fH), the cyclicgroup generated by fH. Thus, we know that there exists ra £ N such that bH = (fH)n foreach bH £ §. Also, by the definition of quotient group, bH = (fH)n = (fn)H. Consider anarbitrary arrangement set T £ Pl(M, N) = Pr(M, N). We know by Lemma 17 and Theorem15 that there exists k e G such that T = (kH)(a). But we already know that there exists n G Nsuch that kH = fnH, whence T = (fnH)(a).

To show the converse, suppose that there exists f £ G such that there exists for each arrangement set T G PL(M,N) = PR(M,N) an n G N such that T = (fnH)(a). Then, for eachb e G, there exists n G N such that (bH)(a) = (fnH)(a), whence we have by Lemma 12 thatb H = f n H = ( f H ) n . T h e r e f o r e , § = ( f H ) , w h e r e b y § i s c y c l i c . □

Theorem 21 gives us all that we need to completely describe solvability in terms of GSAs.According to the modern definition of solvability, a group H0 is solvable if there exists a normalchain of groups

Hn = {id} < Hn-x < • • ■ < Hi < G = H0,

wnere j£l- is cyclic for 0 < i < n. All of the pieces of this definition now have analogs in thelanguage of GSAs.

However, more work can be done on the relationship between GSAs and groups. A completedescription of quotient groups in terms of GSAs has not been provided; it is possible that Galois'slanguage of arrangement sets is not abstract enough to completely describe quotient groups.

Galois never provided a satisfactory definition of group in his memoir. However, it is clear that,to Galois, groups were always sets of permutations (Galois actually used the word "substitution"instead of "permutation"), and that Galois's permutation groups were always assumed to act uponarrangements. Thus, he would denote a particular group of permutations by writing the list ofarrangements created when that permutation group was applied to a single arrangement.

The GSA definition provided by this paper provides a precise definition for the group conceptexpressed by Galois. Notice that a set of arrangements must meet only one property to be a GSA,compared to the three or four properties required by the modern group definition. The property metby GSAs is loosely related to closure, which Galois recognized was essential to his group concept.The modern group properties are immediately met by the permutation set associated with a GSA,as Theorem 10 shows.

At the end of his memoir, Galois listed out all of the arrangements of a set with four elements.He then proceeded to show that the permutation group associated with that set of arrangements issolvable; he repeatedly partitioned his list of arrangements until he had a list of arrangements thatcontained only one arrangement. Similarly, he showed that the permutation group associated withthe set of all arrangements of five elements is not solvable, thereby showing that the general quinticpolynomial is not solvable. While this fact was known prior to Galois's work, Galois was able touse his more general machinery to very quickly arrive at this result.

Galois greatly advanced in the knowledge of the theory of equations, at the same time revolutionizing modern algebra. It took others, however, to provide a satisfactory definition of theconcepts Galois originated. Abstract groups today are studied in a more general context than were

Page 44: The Harvard College Mathematics · PDF fileThe Harvard College Mathematics Review is produced and ... and in summer programs ... This new step in The HCMR's maturation is truly impressive

4 2 T h e H a r v a r d C o l l e g e M a t h e m a t i c s R e v i e w 2 . 1

Galois's permutation groups. It shall forever be true, however, that arrangement sets are key to thedevelopment of modern algebra.

Acknowledgements and RemarksMany thanks must be given to Dr. Matt Lunsford, who had the original idea for the project andgave me a few hints along the way when they were needed. Also, the research in this paper wassupported by an Undergraduate Research Grant from Union University, for which the author is verythankful.

Finally, although the precise statements of all of the definitions and theorems, as well as theproofs, are entirely original to the author, many of the results are not. Also, some notation wasborrowed from [Ti], including the Sym(S'), H(a), and g(M) notations. The author takes responsibility for introducing the groovy bow tie notation.

Those who wish to learn more about the history of Galois theory should consult [Ed], whichcontains a translation of Galois's memoir. Finally, the referees of this paper must be thanked verymuch for many helpful comments that greatly improved the work.

References[Ed] Harold M. Edwards: Galois Theory. New York: Springer-Verlag 1984.

[Ri] Laura Toti Rigatelli: Evariste Galois. Basel, Switzerland: Birkhauser 1996.

[Ti] Jean-Pierre Tignol: Galois Theory of Algebraic Equations. New Jersey: World Scientific2001.

Page 45: The Harvard College Mathematics · PDF fileThe Harvard College Mathematics Review is produced and ... and in summer programs ... This new step in The HCMR's maturation is truly impressive

STUDENT ARTICLE

Soliton Solutions of Integrable Systemsand Hirota's Method

Justin M. Curry*Massachusetts Institute of Technology '08

Cambridge, MA [email protected]

AbstractIn this paper we investigate a general class of solutions to various partial differential equationsknown as solitons or stable solitary wave solutions. We introduce necessary background by considering general solutions of the classical wave equation and some of its variants, focusing on featuresof linearity, non-linearity, dissipation and dispersion. The Korteweg-de Vries (KdV) equation ispresented as an iconic non-linear dispersive wave equation that admits soliton solutions. How soli-ton solutions are approximated motivates an introduction to the Pad6 approximation, which seeksconvergence by expressing a solution as a quotient G/F of polynomials of exponentially decayingfunctions. The Pad6 approximation motivates a substitution that decouples the KdV equation into apair of equations on the polynomials G and F. The decoupled version of the KdV equation is thengreatly simplified by introducing a bilinear differentiation operator known as Hirota's D-operator.Another substitution allows Hirota's D-operator to express the KdV equation in a single bilinearform. This final form illustrates how the perturbation method can be used to produce exact solitonand multi-soliton solutions. The generation of multi-solition solutions in an almost additive fashion with this method is summarized as a non-linear superposition principle. Connections betweenHirota's method, Kac-Moody algebras and quantum field theory are briefly mentioned.

4.1 IntroductionThe study of the dynamical behavior of physical systems has been, and continues to be, a majorsource of mathematical inspiration. The twentieth century in particular has initiated a deep inquiryinto a variety of non-linear systems and their unifying themes. In the spectrum of dynamics, twoopposites have attracted considerable attention: chaos and solitons. Chaos theory has demonstratedthat both partial and ordinary differential equations can exhibit incredibly rich behavior, allowingsome deterministic systems to be exponentially unpredictable for increasing time. On the otherextreme, soliton theory provides several important examples of non-linear systems behaving in astable, quasi-linear fashion.

In this paper, we explore this second extreme and build up a concrete introduction to solitonsvia an inspection of the Korteweg-de Vries (KdV) equation—a non-linear dispersive equation,which is effective for describing surface waves in a shallow water domain. The existence of stable"solitary waves"—the precursors for the term "soliton"—was first discovered experimentally in1834 by J. Scott Russell, who chased on horseback a one foot high and 30 feet long wave generatedby a stopping canal boat, traveling at eight to nine miles an hour for nearly two miles in unalteredform. This solitary wave solution was re-discovered as a solution to the KdV equation in 1895[DJ]. Since then, stable solitary wave solutions have featured prominently in many other non-linearpartial differential equations (PDEs) and the methods for generating soliton solutions have led tomany deep ideas in mathematics and physics.

+ Justin Curry is a current senior in mathematics at the Massachusetts Institute of Technology. His researchinterests involve geometry, integrable systems, and mathematical physics. He plans on entering a PhD programin mathematics in Fall 2008.

43

Page 46: The Harvard College Mathematics · PDF fileThe Harvard College Mathematics Review is produced and ... and in summer programs ... This new step in The HCMR's maturation is truly impressive

4 4 T h e H a r v a r d C o l l e g e M a t h e m a t i c s R e v i e w 2 . 1

The goal of this paper is to provide an intuition for some of these results. We begin witha simple description and definition of classical linear wave equations in Section 4.2. The one-dimensional wave equation is solved using d'Alembert's method in Section 4.2.1. Explicit planewave solutions to the wave equation are described in Section 4.2.2 and the relevant terminologyof dispersion relations, phase and group velocities are defined in Section 4.2.3. Once this relevantbackground is covered, we consider in Section 4.2.4 a lesser-known class of solutions to the waveequation that cannot be approximated by plane waves, called solitary waves or solitons.

Pursuant to our objective to understand which PDEs admit soliton solutions, we consider inSection 4.3 generalized wave equations that more accurately model various physical phenomena.In particular, we stress the effects of linearity, non-linearity, dispersion and dissipation on solutionsto the corresponding PDEs in Sections 4.3.1,4.3.2,4.3.3 and 4.3.4.

Finally, we focus on the aforementioned KdV equation in Section 4.4. We first outline someof the nice properties of the KdV equation and the conservation laws it obeys in Section 4.4.1. Itis stated, but not proved, that the KdV equation satisfies infinitely many conservation laws and thisrelates to its integrability. Our focus then returns to solitary wave solutions to the KdV equation.The fact that these solutions cannot be approximated by plane wave solutions leads us to introducethe Pade approximation in Section 4.4.2, which will motivate us to decouple the KdV equationinto two equations whose solutions are polynomials of exponentially decaying functions. Padeapproximation will lead us to consider in Section 4.4.3 how the perturbation method can approximate solutions to the KdV equation. Our attempt to unite Pade approximation and the perturbationmethod via a decoupled pair of equations will be our way of motivating and introducing Hirota'smethod in Section 4.4.4. A change of variables suggested by Hirota's method will allow us toput the KdV equation into a very elegant bilinear form. Before further exploring this new form,we graphically demonstrate the notion of a two-soliton solution in Section 4.4.5, and qualitativelymotivate the desire to produce multi-soliton solutions to the KdV equation. The substitution suggested by the Pade approximation is then abandoned in Section 4.4.6 in favor of another change ofvariables for the KdV equation. This alternative bilinear form will then allow us to apply the perturbation method of Section 4.4.3 to produce exact (not approximate) multi-soliton solutions to theKdV equation in Section 4.4.7. The relationship of this powerful method to deep ideas involvingKac-Moody algebras and quantum field theory are then mentioned briefly in Section 4.5.

Special thanks must be extended to Professor Aliaa Barakat at the Massachusetts Institute ofTechnology for guiding the author through the rich mathematical vistas involved in integrable systems. Without her direction and support, the current paper would be non-existent.

4.2 The Wave Equation(s)Definition 1. An equation (or system of equations) is linear if, whenever it has solutions m andu2, it also has am + bu2 as a solution, where a and b are scalar coefficients.Definition 2. The classical wave equation that describes a wave propagating with constant speedc is given by the following linear partial differential equation

d 2 U 9 „ 9^ = c 2 W ( 4 . 1 )

In one dimension, equation (4.1) models the height of a plucked string as a function of spaceand time. More specifically, the one-dimensional wave equation

— = c 2 ^ ( 4 2 )d t 2 d x 2 ( '

is an idealized model derived from using force balance and Newton's laws.

Page 47: The Harvard College Mathematics · PDF fileThe Harvard College Mathematics Review is produced and ... and in summer programs ... This new step in The HCMR's maturation is truly impressive

Justin M. Curry—Soliton Solutions of Integrable Systems 45

4.2.1 d'Alembert's Solution to the 1-D Wave EquationPutting n = x - ct and £ = x + ct we have that

d 2 u _ 2 d 2 u 2 d 2 u 2 d 2 ud t2 ~C dn2 d£dn de 'd ^ u _ ^ u 8 2 u d 2 udx2 ~ 8n2 + dt&n + de '

Substituting into equation (4.2), we find

d2u = 0,d^dn

and integrating twice, we have that solutions take the form

where / and g are arbitrary functions. This corresponds to solutions propagating in the left andright directions. If we take equation (4.2) and factor accordingly

( £ - • £ ) ( £ + * £ ) « - ( £ - ' £ ) ■ - « *

then d'Alembert's solution tells us that if we consider solutions to the simplified wave equation

u t + c u x = 0 , ( 4 . 4 )

we are left with right-traveling solutions only, i.e.

u(x,t) = f(n) = f(x-ct).

4.2.2 Plane Wave SolutionsDefinition 3. A plane wave is a solution to equation (4.1) that takes the form

u ( S , t ) = A e * U ' * - u t ) ( 4 . 5 )

where i is the imaginary unit, k is the wave vector, u is the angular frequency, and A is the (possiblycomplex) amplitude.

For the remainder of this paper, we will restrict our attention to one-dimensional wave equations, in which case k and x are treated as scalar-valued quantities. If x is thought of as havingunits of length (say meters ra), then k must have units that are the corresponding inverse (ra-1). Itis common to call k the wave number.

In the theory of differential equations it is common to "guess" the solution to a given equationby substituting in a function that has certain required properties (most notably it solves the provideddifferential equation given certain constraints). We call such an assumed form for a solution anansatz for the differential equation. For example, plane waves can be taken as a good ansatz for asolution to many dierent wave equations. If we consider equation (4.4) and assume it has a solutionof the form (4.5), u(x, t) = e^*-^ then we find that the angular frequency and wave numbermust satisfy the relation

u = c k . ( 4 . 6 )

This is an example of a dispersion relation.

Page 48: The Harvard College Mathematics · PDF fileThe Harvard College Mathematics Review is produced and ... and in summer programs ... This new step in The HCMR's maturation is truly impressive

4 6 T h e H a r v a r d C o l l e g e M a t h e m a t i c s R e v i e w 2 . 1

4.2.3 Dispersion Relations, Phase and Group VelocitiesDefinition 4. A dispersion relation is a relation between the energy of a system and its momentum.

Since energy in waves is proportional to frequency u and the wave number k is proportionalto momentum, equation (4.6) is an example of a dispersion relation. These sorts of relations willbe very valuable when considering how fast certain Fourier components in the initial profile of awave travel and how fast energy dissipates for the given system. One distinguishes between thesenotions by defining two types of velocities.

To motivate the first definition of velocity, imagine that we are watching an animation of ofa one-dimensional wave equation u(x,t) that can be written as function f(kx — wt) for k andu constant. Now imagine we are following a crest of the wave and we notice that at a specificpoint in time to and point in space xo, the wave has a height H = u(xo,to) = /(Co) whereCo = kxo — u;to. If we allow a short amount of time At to elapse, we see that the point H on thecurve has traveled a small distance Ax, so that u(xq + Ax,to + At) — H. For A* and Ax smallenough, we see that this can only be the case if

kx§ — wto = Co = k(xo + Ax) — u)(to -f At),

e.g. the point that gets mapped to H is the same point after the wave has traveled a small distance.This is only true if

• a a . A x ukAx = uAt =>• —— = —.A t kThis is known as the phase velocity of the wave.Definition 5. For a wave of the form u(x, t) = f(kx - ut) where k and uj are constants, the phasevelocity cph is defined as the constant

c p h : = | . ( 4 . 7 )

Although the phase velocity can be defined more generally, where cph determines the speedat which any one frequency component travels, we will restrict ourselves to the definition givenabove.Definition 6. The propagation of energy in a system is given by the velocity of wave packets,known as the group velocity, which is given by

c 3 r : = w ( 4 . 8 )

In the case of equation right-traveling waves f(x — ct) (4.4), we determined the linear dispersion relation w = ck (4.6). Applying the definitions for the group (4.8) and phase (4.7) velocities,we see that in this instance

c k d uCph~ k " k ~~~ dk ~~gr'

Definition 7. A non-dispersive wave is a wave which is governed by a linear dispersion relation.For linear equations, differentiation of u gives the coefficient of k, which is similarly achieved

by division. Thus a linear relation implies equality of cph and cgr. Conversely, imagine that theequation cph = cgr holds. Treating a; as a function only of k, we can separate variables and solvethe differential equation as follows:

dw~dk ~ kduo dk

k)gU = log k + c.

Page 49: The Harvard College Mathematics · PDF fileThe Harvard College Mathematics Review is produced and ... and in summer programs ... This new step in The HCMR's maturation is truly impressive

Justin M. Curry—Soliton Solutions of Integrable Systems 47

Figure 4.1: Plot of a Solitary Wave Solution

Exponentiating both sides of the equation we obtain the linear dispersion relation u — Ck whereC = ec. Dispersive waves have unequal phase and group velocities, while non-dispersive waveshave equal phase and group velocities.

4.2.4 Solitary Wave SolutionsPlane wave solutions are not the only solutions to the classical linear wave equations presented inequations (4.1), (4.2), (4.4). For example, Figure 4.1 shows a completely different solution that isnot of this plane wave form. Letting

u ( x , t ) = s e c h 2 ( x - c t ) , ( 4 . 9 )and calculating

m = — 2csech2(x — ct) tanh(x — ct),ux — 2 sech2(x — ct) tanh(x — ct),

we have thus verified that a wave of the form (4.9) satisfies our simplified right-traveling linearwave equation (4.4).

Although not obvious at this point, a solution of the form (4.9) is an example of a solitary wavesolution or soliton.Definition 8. ([DJ], [ZK]) A solitary wave solution or soliton is a solution to any wave equationthat satisfies the following three properties:

1. retains its shape (initial profile) for all time,

2. is localized (asymptotically constant at ±oo or obeys periodicity conditions imposed on theoriginal equation),

3. can pass through other solitons and retain size and shape.As we will see in the next few sections, other types of wave equations either permit or dismiss

the possibility of solitary wave solutions. Of particular interest will be the case when the waveequation under consideration is non-linear. Certain non-linear equations will allow solitary wavesolutions. In these instances, the third condition in Definition 8 will become especially importantas many localized solutions tend to scatter off of one another irreversibly. This is in sharp contrastto linear equations, where two waves can pass through each other without change. Since solitonsexhibit at most a phase change after interaction, we will be able to "add" (in a well-defined wayto be described later) two soliton solutions to obtain a third one, achieving in effect a non-linearsuperposition principle!

Page 50: The Harvard College Mathematics · PDF fileThe Harvard College Mathematics Review is produced and ... and in summer programs ... This new step in The HCMR's maturation is truly impressive

4 8 T h e H a r v a r d C o l l e g e M a t h e m a t i c s R e v i e w 2 . 1

4.3 Generalized Wave EquationsIn the previous section, we considered an example of a linear, non-dispersive wave equation andthe types of solutions it allows. However, these sorts of equations are often inadequate for describing the rich dynamical behavior of the universe. Physical models for vibrations, gravity waves,internal waves, surface waves and a broad range of related phenomena require a mix of dissipative,dispersive and nonlinear behavior. In this section we will consider various combinations of thesefeatures and discuss whether or not they support stable solitary wave solutions.

4.3.1 Linear Dispersive WavesLet us consider a partial differential equation with odd spatial derivatives, such as

u t + c q u x + S u x x x = 0 , ( 4 . 1 0 )

where Co and S are constant. Here we are using subscript notation where ux := |^, uxx := §^fand so on. Taking as our ansatz the plane wave solution u(x, t) = e*(fca;-wt\ we get the followingnon-linear dispersion relation between the frequency u and the wave number k

u = cok — 5k3.

Applying the definitions for phase and group velocities we find that

£ 7 2 V / d u o n 2Cq — ok = 7- = Cph 7* cgr = -~r — co - odk .

If 6 > 0 we then have thatCgr 5: C"ph'

If we assume that the Fourier transform of u(x, 0)—call it A(k)—has a continuum of wave numbers in its initial profile, then the evolution of the profile is given by

/ o o A(k)ei{-oo

i(kx—u(k)t) dk

and the Fourier components literally "disperse" or separate out according to their wave numbers.This example demonstrates that a linear dispersive wave does not exhibit stable solitary wave solutions since an initial profile consisting of many superposed wave numbers breaks apart into individual components instead of retaining its shape.

4.3.2 Linear Dissipative WavesWhat is interesting to note is that we get real dispersion relations for u; whenever we have odd orderderivatives for our spatial variable x. If we consider the alternative equation

the dispersion relation then is

We thus have the solution

u t + c q u x - 8 u x x = 0 , ( 4 . 1 1 )

u = cok — iSk .

u(x,t) = e~k *+*fc<*-*)jwhich decays exponentially with time.Definition 9. A dissipative wave is a wave whose energy decreases as time increases.

Page 51: The Harvard College Mathematics · PDF fileThe Harvard College Mathematics Review is produced and ... and in summer programs ... This new step in The HCMR's maturation is truly impressive

Just in M. Curry—Sol i ton Solut ions of Integrable Systems 49

In Section 4.2.3 we noted that energy in waves is proportional to frequency. This statement istrue for a given amplitude, but energy is also proportional to a wave's amplitude. Although energyis a difficult concept to define generally, for a wave periodic in x with period T we may define theenergy E of a wave u(x, t) as

2 Jo Hx,t)\2For the above example, we find that

which clearly goes to zero as tEcxe~2k'

4.3.3 Non-Linear Non-Dispersive WavesA common feature of equations (4.1), (4.10), and (4.11) is their linearity. With such systems, oncetwo solutions are produced their sum is guaranteed to be a solution, and we can find a basis for thespace of all solutions—the tools of linear algebra are at our disposal. Non-linear systems are muchharder to study precisely for this reason. Let us consider a simple example where the wave's speedc depends on its amplitude. The equation

u t + c ( u ) u x - 0 , ( 4 . 1 2 )

where c(u) = Co + bun for b a constant is one such example. It is clearly non-linear, because if weconsider two solutions u and v and we substitute their sum into equation (4.12), we obtain

ut + vt + co 4- b(u + v)n(ux + vx) = 0

if and only if (u + v)n = 0, which is not true in general.Surprisingly, the solution to equation (4.12) follows (almost) precisely d'Alembert's solution,

except that c in the solution u(x, t) = f(x - ct) for equation (4.4) is replaced by the function c(u).This has important implications, because if c(u) is increasing, then the wave travels faster as itsamplitude increases, until finally the wave steepens and breaks (where "breaking" in mathematicalterms is multi-valuedness).

4.3.4 Non-Linear Dispersive WavesThe upshot of the previous few subsections has been that neither linear dispersive nor non-linearnon-dispersive wave equations admit solitary wave solutions. In those cases, a wave profile eitherhas the tendency to break up according to wave number (dispersive) or steepen to multi-valuedness(non-linearity). One might imagine that a mix of these two behaviors would yield even more wildbehavior, but it is surprising that these two effects can actually neutralize each other to producesoliton solutions.

We wish to provide an intuitive argument that a wave equation of the form

u t + c q u x + b u n u x + S u x x x = 0 ( 4 . 1 3 )

has a solitary wave solution that retains its shape. If we suppose that equation (4.13) has a solitarywave solution as depicted in Figure 4.2, then a necessary assumption for the wave to retain its shapeis the condition

^top = V bottom = V,

where vtop and ^bottom denote the speed at the top and bottom respectively. If we move to acoordinate system that travels with the wave we can hopefully simplify the analysis. Putting \ —£x - Qt where v = Q/£, so that at x ~ x t = 0 and forx~ x — vt for all other times, we are leftwith a problem in terms of x-

If Umax = A, the amplitude of the wave, then we can approximate our solution in a neighborhood of the top as

u ~ A(l — const x x2)-

Page 52: The Harvard College Mathematics · PDF fileThe Harvard College Mathematics Review is produced and ... and in summer programs ... This new step in The HCMR's maturation is truly impressive

50 The Harvard College Mathematics Review 2.1

A

x — vt = 0

Figure 4.2: Solitary Wave Solution: Intuitive Argument

We then have that uxxx ~ 0 in this neighborhood, so equation (4.13) reduces to exactly the equationwe had in Section 4.3.3: •

Accordingly,

ut 4- (co + bun)ux ~ 0.

vtop = co + bAn,and if b > 0, then vtop > co.

At the bottom of the wave un ~ 0, and thus we can neglect the non-linear term, reducingequation (4.13) to

u t + c 0 u x + 5 u x x x ~ 0 . ( 4 . 1 4 )Since this is exactly equation (4.10), we know that

cph = cq — Sk = Vbo (4.15)

and if 5 > 0, then bottom < c0. This would lead us to conclude that the velocity at the top of thewave is larger than the velocity at the bottom, and thus our wave will steepen and break. But thiscontradicts the stability of a solitary wave solution! Where did we go wrong in our analysis? Theanswer is that the expression (4.15) assumes that at the bottom u has a plane wave form! If insteadwe use e±x as our ansatz for (4.10), we derive the following non-linear dispersion relation

which in turn implies

n = Co£ + S£3

f t 2^bottom = -T = Co + S£ .

We now have the important result that

^top = ^bottom <=> S£2 = bAn.

The above argument demonstrates several important lessons:

• A non-linear dispersive wave equation of the form (4.13) has a solitary wave solution whichmoves at constant speed v while preserving its shape.

Page 53: The Harvard College Mathematics · PDF fileThe Harvard College Mathematics Review is produced and ... and in summer programs ... This new step in The HCMR's maturation is truly impressive

Justin M. Curry—Soliton Solutions of Integrable Systems 51

• A solitary wave solution to a non-linear wave equation cannot be approximated by a planewave solution u ~ e*(fc*-<*>*) ut ramer requires an exponentially decaying solution of theform u ~ e±(ex-Qt) where v = Q/£.

Later in this paper we will expand on the details for the second point when we introduce a perturbation method for generating soliton solutions. To further motivate and focus our discussionwe will restrict our attention to a historically important example of a non-linear dispersive waveequation, which admits soliton solutions, known as the KdV equation. After introducing the KdVequation and illustrating some of its properties, we will dive deeper into just how it produces solitonsolutions.

4.4 The Korteweg-de Vries (KdV) EquationOne of the first PDEs for which soliton solutions were discovered is the Korteweg-de Vries (KdV)equation,

ut + 6uux + uxxx = 0.As described in the introduction, this equation is useful for describing surface waves in a shallowwater domain. It is straightforward to verify that

u(x,t) = l— sech2 (|) , x = tx ~ ta, ft = t\ (4-16)

is a traveling wave solution to the KdV equation.4.4.1 Conservation LawsOne of the nice features of the KdV equation (and deeply connected to its integrability) is that itadmits infinitely many conservation laws. The two bottom rungs on this conservation ladder areconservation of mass and energy. If we require the KdV equation to obey the periodicity condition

u(x + l,t) = u(x,t),then we can prove that the "mass"

and the "energy"

M := / u(x,t)dxJo

E:= j ^(u(x,t))2dx==0are independent of time. Simply differentiating with respect to time we have

-— = / ut= -6uux - uxxx = [-3u2]J - [uxx\o = 0

if u and uxx are assumed to be periodic in x. Using the same conditions on w, we see that

d E f 1 f \ 2 f 1= / uut = — / bu ux — uuxxxJ o J o J odt

- [ l= " I! Tx (2 - / {uUxx) + / UxUxx = [lul = 0.0

Both M and E are also independent of time if we do not enforce periodicity, but rather requireu, ux,uxx —*• 0 as x —> ±oo and where M and E are integrated on the whole real line. However,this is the case with square-integrable functions, so the demand is not too strict.

Page 54: The Harvard College Mathematics · PDF fileThe Harvard College Mathematics Review is produced and ... and in summer programs ... This new step in The HCMR's maturation is truly impressive

5 2 T h e H a r v a r d C o l l e g e M a t h e m a t i c s R e v i e w 2 . 1

4.4.2 Pade ApproximationAlthough the infinity of conservation laws already points to some of the deeper aspects of theKdV equation, we are primarily concerned with how the KdV equation generates soliton solutions.As detailed in an earlier section, solitary wave solutions cannot be approximated by plane wavesolutions and instead require exponentially decaying solutions of the form e±x, where x = to—Qt.In particular, we need to expand u in terms of e exp(x) where e is small. Unfortunately, the right-hand side in the expression

u ( x , t ) ~ e a i e x p ( x ) 4 - e 2 a 2 e x p ( 2 x ) H ( 4 . 1 7 )

may diverge for large x, contrary to the behavior required by a solitary wave solution. Indeed, assuggested by Figure 4.2, we expect that as x —▶ +°o,

u(x,t) ~exp(-x).One way to achieve convergence is to find the Pade approximation u = G/F of (4.17), where Gand F are polynomials in exp(x).Definition 10. Given that a function f(x) is ra + n times differentiable, the Pade approximantof order (ra, n) is the rational function

R, x = Po 4- pix 4- P2X2 H h PmXm _ G(x)qo 4- qix 4- q2X2 H h qnxn F(x)

which agrees with f(x) to the highest possible order, i.e.

/(0) = R(0)/'(<>) = R'(0)

f(m+n\0) = R{m+n)(0)

Example 11. Supposef(x) = x - x3 4- x5 - x7 H

which converges for |x| < 1. Factoring, we obtain an alternating geometric series in terms of x2

x ( l - ( x 2 ) 1 + ( x 2 ) 2 - ( x 2 ) 3 + - . . ) = — ^ F ( 4 . 1 8 )1 4- xz

and thus f(x) ~ x~x as x —> oo. Substituting exp(x) for x in (4.18) gives

ext)( ~v)/(exp(x)) = exp(x) - exp(3X) 4- exp(5x) = x + e^ % ) ~ ex^~^

as x —^ co.4.4.3 Perturbation MethodConsidering again the KdV equation

u t + 6 u u x + u x x x = 0 , ( 4 . 1 9 )and applying the perturbation method one expands u as a power series in a small parameter e toobtain an infinite sequence of linear equations on the components of the expansion as follows. Wesubstitute the following expression for u in (4.19)

u = e m 4 - 2 U 2 4 - e 3 u 3 4 - • • • , ( 4 . 2 0 )

Page 55: The Harvard College Mathematics · PDF fileThe Harvard College Mathematics Review is produced and ... and in summer programs ... This new step in The HCMR's maturation is truly impressive

Just in M. Curry—Sol i ton Solut ions of Integrable Systems 53

and by collecting like powers of e we obtain the following series of equations:

! + 6)wl = 0' (421)m + d x ^ ) U 2 = - 6 u i ^ ' ( 4 2 2 )(

(di + w)U3=-6{U2^x-+Ul-dx-)

As explained in Section 4.13, we need to choose an exponentially decaying solution for u. Let

ui=aiexp(x), X — fa ~ ^> Q = £,

where ai and £ are arbitrary. Substituting this solution into (4.22), we determine that

u2 = a2exp(2x), a2 = -^-.

Proceeding successively, we can thus find all the Wj's in (4.20), obtaining a power series in e exp(x)as in (4.17). However, as already mentioned, this expression will diverge for large x, which we cantry to circumvent via a Pade approximation. The trouble with this approach is that there is nosimple way to determine the required functions G and F. One potential trick is to reverse engineera known solitary wave solution so that it takes the form G/F. In the case of (4.19) the one-solitonsolution (4.16) is

£ 2 2 ( X \ £ 2 2 ^u(x, t)-- seen ^-) - x + ^^ - 2 + exp(x) + exp(_x)'

soG _ 2 £ 2 e x p ( x ) 2 £ 2 e x p ( x )F ~ 1 + 2 exp(x) 4- exp(2x) (1 4- exp(x))2 ' (4.23)

The problem with reverse engineering is that (4.23) is an artificial byproduct of a solution wealready know. What would be better is to develop a method which determines the functions G andF without a priori having the solution u. This approach is embodied by Hirota's method.

4.4.4 Hirota's MethodOur practice with the perturbation method and Pade approximations suggests that making the substitution u(x, t) = G[exp(£x - Qt)]/F[exp(£x - Qt)] in (4.19) to obtain equations for G and Fmay be a fruitful undertaking. We first calculate:

U = F

ut =

GFGtF - GFt

F2_ GXF — GFX

_ G^x _ SGXXFX + 3GXFXX + GFXXX GXF2 + GFXXFX _ GF3U X X X — p 2 4 " O ^ 3 ^ 4 •

Page 56: The Harvard College Mathematics · PDF fileThe Harvard College Mathematics Review is produced and ... and in summer programs ... This new step in The HCMR's maturation is truly impressive

5 4 T h e H a r v a r d C o l l e g e M a t h e m a t i c s R e v i e w 2 . 1

Substituting into the KdV equation (4.19), we obtain the following complicated equation

G t F - G F t , n G G x F - G F xut 4- 6uux 4- uxxx = — (- 6

+x x x p 2 i ^ p p 2

GXXXF — 3GXXFX — 3GXFXX — GFXX

+ 6 F G X F 2 + F G F F X - G F 3 X = 0 ^rAt first glance equation (4.24) has only made things worse. We could try to decouple this equationinto a simpler set of equations. Re-expressing the 6uux term with F3 in the denominator as onewith F4 in the denominator, we could require that individually the term with F2 in the denominatorand F4 in the denominator are both zero:

GtF - GFt 4- GXXF - 3GXXFX - 3GXFXX - GFXXX = 0 (4.25)GGXF2 - G2FFX + GXFF2 4- GFFXFXX - F3G = 0. (4.26)

Unfortunately, the G and F we derived in (4.23) do not satisfy (4.25) and (4.26), but only bya missing factor of 6GXFXX. Changing the minus sign in front of 3GXFXX to a plus sign andtransferring the remainder to the numerator of the F4 term, the KdV equation (4.19) becomes

GtF — GFt 4- GXXF — 3GXXFX 4- 3GXFXX — GFXXX ip 2 +

6(GXF - GFX)GF ~ (F£*x ~ Fx) = 0.

Setting the terms with denominators F2 and F4 equal to zero, we obtain the decoupled equations:

GtF - GFt + GXXF - 3GXXFX + 3GXFXX - GFXXX = 0 (4.27)GF - (FFXX - F2) = 0 . (4 .28)

We have done a great deal of work, but it doesn't appear to have paid off. However, careful analysis of the pattern of derivatives suggests that (4.27) and (4.28) can be written even more simply.Introducing a new bilinear differentiation operator, Hirota's D-operator, will greatly simplify theseexpressions once and for all.Definition 12. The Hirota D-operator for two n-times differentiable functions / and g is definedby:

Dnx i ' 9 = (dX l - dX2)n f (x i )g (x2 ) \X l=X2=x (4 .29)

Example 13. We determine the following quantities

Dtf - g = ftg - fgt,Dxf g = fxg- fgx,Dlf ' 9 = fxxg - 2fxgx + gxxf,D2xf-f = 2fxxf-2f2,&xf • 9 = fxxxg - 3fxxgx 4- 3fxgxx — fgxxx.

With the Hirota D-operator in hand, we then immediately recognize that equations (4.27) and(4.28) for G and F reduce to the following quadratic, also called bilinear, form:

( D t + D 3 x ) G F = 0 , ( 4 . 3 0 )2 G F - D 2 X F F = 0 . ( 4 . 3 1 )

Page 57: The Harvard College Mathematics · PDF fileThe Harvard College Mathematics Review is produced and ... and in summer programs ... This new step in The HCMR's maturation is truly impressive

Justin M. Curry—Soliton Solutions of Integrable Systems 55

Figure 4.3: Two-Soliton Solution for Various Times

Equations (4.30) and (4.31) are the culmination of this section. Not only is the form aesthetically pleasing, but we will soon see how such a form enables one to produce soliton solutions inan almost trivial manner. Before this method for producing such solutions is presented, we wouldlike to first understand what it means for the KdV equation to admit a multi-soliton, or AT-soliton,solution.

4.4.5 AT-Soliton SolutionsIn Section 4.2.4 we outlined the defining characteristics of a soliton solution. In particular, wenoted how the third condition in Definition 8 gives rise to a non-linear superposition principle. Sofar, we did nothing to illustrate this principle graphically, nor did we explain how an equation mightadmit multiple soliton solutions. In Figure 4.3 we have graphed the following two-soliton solutionto the KdV equation (4.19) for six different times (p.75 [DJ]):

3 + 4 cosh(2a? - St) 4- cosh(4a? - 64t)u(x, t) - 12 j3cosh(x _ 2gt) + cosh(3a. _ 36t)]2 • (4.32)

One of the striking features of Figure 4.3 is that the tall wave actually catches up to, and passesright through, the smaller wave in an almost linear fashion. Careful inspection and explorationby the reader will reveal that after the interaction, the short wave has actually been pushed backand the tall wave has advanced forward relative to where they would have been if the waves hadevolved individually, without interaction. This phase-shift is the trademark of the non-linearityof the KdV equation. Aside from this small difference, the ability for individual solitary wavesto interact strongly and retain their shape is the defining characteristic of solitons. How multiplesoliton solutions such as (4.32) are produced for the KdV equation in a more direct, algebraic

Page 58: The Harvard College Mathematics · PDF fileThe Harvard College Mathematics Review is produced and ... and in summer programs ... This new step in The HCMR's maturation is truly impressive

56 The Harvard Col lege Mathematics Review 2.1

fashion is the objective of the final few sections. In particular, we will find that reformulating theKdV equation into another bilinear form will allow us to simplify our analysis considerably.4.4.6 Logarithmic Substitution for the KdV EquationWe motivated the decoupling of the KdV equation into two equations involving G and F by introducing the Pade approximant and asking for a better method to produce these polynomials inexp(x). We could conceivably try to express G and F in terms of a power series like (4.20), substituting into the two bilinear equations (4.30) and (4.31) and obtaining pairs of equations analogousto equations (4.21), (4.22), and so on. This, however, turns out to be a rather complicated approachas it stands, so in this section we will introduce an alternative substitution that reduces the KdVequation to another single equation in bilinear form. This will allow us to apply the perturbationmethod to produce exact multi-soliton solutions.

Instead of taking u = G/F, let us make the substitution

« = 2^Iog/.

Then (4.19) can be written as

2(log/)xxt 4- 3dx(u) 4- uxxx = 0,which upon integrating once by x becomes

2 ( l o g f ) x t 4 - 3 u 2 + u x x = 0 . ( 4 . 3 3 )

We now calculate the relevant quantities:

w2=4(^y+4(^y-8&.,„ . _ i n / J x \ , 0 / f J x J x x a ( J x x \ q J x J x x x n J x x x xu°>--12[j) +24-T~6{t) "8_p~ + 2"7"'

2 ( l o g / ) x l = - 2 M + 2 ^ .

After substituting into (4.33), simplifying and multiplying by /2/2, we obtain the equation

f f x t - f x f t + 3 f x x - 4 f x f x x x 4 - f f x x x x = 0 . ( 4 . 3 4 )

We are now ready to put the KdV equation into an alternate bilinear form involving the HirotaD-operator defined in equation (4.29). In example 13 we calculated several quantities using the-D-operator. We now add the following two calculations:

D x D t f . f = 2 ( f t x f - f t f x ) ( 4 . 3 5 )Dxf - f = 2( f fxxxx - Afx fxxx 4- 3 f2x) . (4 .36)

If we compare equations (4.35) and (4.36) with the transformed KdV equation (4.34) we can immediately deduce the alternate bilinear form

D x ( D t + D 3 x ) f . f = 0 . ( 4 . 3 7 )

Compared against (4.30) and (4.31), equation (4.37) is a simpler equation. In particular, we willfind that the perturbation method produces multi-soliton solutions in a direct manner from thisequation.

Page 59: The Harvard College Mathematics · PDF fileThe Harvard College Mathematics Review is produced and ... and in summer programs ... This new step in The HCMR's maturation is truly impressive

Just in M. Curry—Sol i ton Solut ions of In tegrable Systems 57

4.4.7 Producing Af-Solitons via the D-OperatorWe are now in a position to reapply the perturbation method of Section 4.4.3. It is important tonote that when we try to solve PDEs such as the KdV equation via the perturbation method, weusually have an expansion of infinite order, whose coefficients we must determine successively.Truncating the solution leaves us with only approximate solutions to the original PDE. In contrast,we will find that applying the perturbation method to equations in bilinear form and choosing ourearly components wisely will force our infinite expansion to truncate at finite order. This will allowus to produce exact, rather than approximate, solutions via an expansion (4.20) of finite order in e.

First, we take the one-soliton solution (4.16) and write for £ = 2, Q — £3, the solution in termsof the transformed equation for /:

o / 2 x - S t \ o 2

u(x, t) = 2sech2(.r - 4*) = 4- (1 + e2g_8tj = ^ Ml + ^~M).

If for notational convenience we write

B[ f ,g] :=Dx(Dt + Dl) f .g,

we determine that for / = 1 4- e2x~8t

B[f, f] = B[l, 1] + B{1, e2x-8t] 4- B[e2x~8t, 1} + B[e2x~8t, e2x~8t} = 0,

which checks that this is indeed a solution to (4.37). We wish to generalize this solution to accountfor j/V-soliton solutions.

We assume that, like in Section 4.4.3, / can be expanded in positive powers of e from which wecan obtain an infinite sequence of equations on the components of the expansion. More precisely,we write

oo

/ = l + ^ £ n / n ( x , ( ) ,n = l

and substitute this expression into (4.37), which upon collecting powers of e becomes

B[l, 1] + e(B[l,fi] + B[j\, 1]) 4- e2(B[l, f2] + B[fi, fi] + £[/2, !]) + ••.

+*r ( £ Blf™,fr-m)) + • • • = 0. (4.38)\ m = 0 /

Expression (4.38) then reduces to a series of equations, where each term with common power of eis required to be zero. We see, using (4.35) and (4.36), that the equation for fi reduces to

which we will rewrite using the following notation:

K d t d x 3 J ' d x 'The first few equations from (4.38) then become

D / i - 0 ( 4 . 3 9 )2 D f 2 = - B [ fi , fi ] ( 4 . 4 0 )2 D f 3 = - B [ fi , f 2 ] - B [ f 2 , fi } . ( 4 . 4 1 )

Page 60: The Harvard College Mathematics · PDF fileThe Harvard College Mathematics Review is produced and ... and in summer programs ... This new step in The HCMR's maturation is truly impressive

5 8 T h e H a r v a r d C o l l e g e M a t h e m a t i c s R e v i e w 2 . 1

We can then easily check that, for /i = exp(xi) where x% = ^x - £3t 4- ai for £i and ai arbitraryconstants,

D/i=0, B[/ i , / i ] = 0, and Df2 = 0.Accordingly, we may choose fn = Oforn = 2,3, ...in expression (4.38) and we regain the solitarywave solution.

It is at this point that we make the very important observation that equation (4.39) is linear!This linearity, as we will now explain, is the key to generating multi-soliton solutions to the KdVequation. Let us assume that

/ i = e x p ( x i ) 4 - e x p ( x 2 ) ( 4 . 4 2 )where \i is defined above. Since (4.39) is linear we know Dfi = 0. From (4.40) we have that

2 D f 2 = - B [ f u fi ]= - £[exp(xi),exp(xi)] - 5[exp(xi),exp(x2)]

- #[exp(x2),exp(xi)] - £[exp(x2),exp(x2)].

Noting the fact that only terms involving both xi and X2 are non-zero, we find that

2D}2 = -2{(h - £2)(£32 - *?) + (*i - ^2)4}exp(xi + X2). (4.43)

Equation (4.43) has a solution of the form

/2 = ,42exp(xi+X2),and upon substituting into (4.43), we find that

Proceeding to equation (4.41), we substitute our expressions for f2 and /1 and determine that

2Df3 = - ,42£[exp(xi),exp(xi 4- X2)] - A2B[exp(xi 4- X2),exp(xi)]- A2B[exp(x2), exp(xi + X2)] - A2B[exp(xi + X2), exp(x2)]

= - 2A2{(-£2)£32 4- (-^2)4}exp(2xi + X2)- 2A2{(-£i)£\ 4- (-^i)4} exp(2Xi 4- X2)

= 0.

Notice that in contrast to our one-soliton solution, assuming /1 has the form (4.42) allows us totruncate (4.38) by putting fn = 0 for n > 3. Putting e = 1, we now have an exact two-solitonsolution to the KdV equation:

(£. — £2\2J-^Tf) exP(*i +X2).The method developed above generalizes to any exact Af-soliton solution simply by putting

N

/ i = ] T e x p ( x O ( 4 . 4 4 )

and the expansion (4.38) is guaranteed to terminate after the /W term. Although this termination canbe proven, we will not do so here. It is important to note that expression (4.44) and its correspondingexact solution give us a non-linear superposition principle. The ability to take one-soliton solutionsand combine them to form multi-soliton turns out to be an important feature of integrable systems.We could, in fact, take this as a definition of integrability.

A2

Page 61: The Harvard College Mathematics · PDF fileThe Harvard College Mathematics Review is produced and ... and in summer programs ... This new step in The HCMR's maturation is truly impressive

Just in M. Curry—Sol i ton Solut ions of Integrable Systems 59

Definition 14. ([Hi] p. 101) A set of equations written in Hirota bilinear form is Hirota integrable,if one can combine any number N of one-soliton solutions into an iV-soliton solution.

The fact that (4.44) generates an ./V-soliton solution to the KdV equation (4.37) is testament toits Hirota integrability. In all cases known so far, Hirota integrability has turned out to be equivalentto more conventional definitions of integrability [Hi].

Although other methods for finding exact multi-soliton solutions exist, the Hirota jD-operatoris considered to be the most direct and algebraic method for doing so. The geometric and deeperconnections of Hirota's method with some other areas of mathematics and physics will be discussedbriefly in the next section.

4.5 Directions ForwardThe field of integrable systems has seen many spectacular developments in the past several decades.This growth can largely be attributed to the fruitful exchange that occurs at the nexus of mathematics and physics—a nexus occupied by the study of Riemann surfaces, Kac-Moody algebras, twistortheory and quantum field theory. It is our hope that in this paper we have illustrated, starting withsimple physical phenomenon, some of the beautiful mathematical structure that lies behind ourmodels of the universe.

Historically speaking, Hirota's method was discovered by some of the very same brute-forcecalculations and by-hand manipulations carried out here [Ba]. It was only later, through the workof the Japanese mathematicians Date, Jimbo, Miwa, Kashiwara, Sato and Sato, that the deep connections between Hirota's bilinear form for non-linear PDEs and Kac-Moody algebras were discovered.

The seemingly arbitrary substitutions made to reduce the KdV equation to its bilinear form,either the rational substitution u = G/F or the logarithmic substitution u = 2(\ogf)xx, areactually part of a broader class of functions known as r-functions. The discovery that the partitionfunctions of several important quantum field theories are r-functions of non-linear PDEs is a majortheme of current research in theoretical physics [Ba].

References[Ba] Aliaa Barakat: Personal Discussions with the Author, 01/2008.

[DJ] P. G. Drazin and R. S. Johnson: Solitons: an introduction. Cambridge: Cambridge Univ.Press 1989.

[Hi] Jarmo Hietarinta: Introduction to the Hirota Bilinear Method, volume 638 of Lect. NotesPhys. New York: Springer-Verlag 2004.

[ZK] Norman J. Zabusky and Martin D. Kruskal: Interaction of "Solitons" in a CollisionlessPlasma and the Recurrence of Initial States, Phys. Rev. Lett. 15 #6 (1965), 240-243.

Page 62: The Harvard College Mathematics · PDF fileThe Harvard College Mathematics Review is produced and ... and in summer programs ... This new step in The HCMR's maturation is truly impressive

FACULTY FEATURE ARTICLE5

Tiling with Commutative RingsThomas Lamt

Harvard UniversityCambridge, MA 02138

tfy [email protected]

AbstractWe explain an approach, due originally to Barnes, to tiling problems using some commutativealgebra. We investigate in particular the occurence of coloring arguments in tiling problems. Theonly prerequisites are linear algebra and familiarity with rings and ideals.

5.1 A Recreational ProblemConsider the collection R of squares obtained from the chessboard by removing two oppositecorners:

Can this configuration be covered with the vertical and horizontal dominoes

B CDso that every square is covered by exactly one domino? In other words, can R be tiled by verticaland horizontal dominoes?

The coloring gives away the answer to this well-known problem. The region R has 32 blacksquares and 30 white squares. Since each domino covers exactly one black and one white square,no tiling is possible. The aim of this article is to explain a way to tackle tiling problems using a littlecommutative algebra. More precisely, we will explain how to obtain coloring arguments, similarto the above chessboard coloring, in a systematic way. I will assume that the reader is familiar withlinear algebra and has seen rings and ideals before.

5.2 Tiles, Regions, and Tiling ProblemsLet N = {0,1,2,...} denote the natural numbers. A tile or region is a finite subset of N2 considered as a collection of boxes in the first quadrant.1 The tiling problems that we shall consider are ofthe following form: given a (possibly infinite) set T of tiles and a region R, can R be tiled (that is,covered with tiles so that each square in R is covered once)? Each tile r e T can be translated anywhere within N2 and used as many times as desired but we shall insist that rotations and reflections

t Thomas Lam was born in Hong Kong and grew up in Australia. He earned his BSc in 2001 from Universityof New South Wales (Australia) and received his PhD in 2005 from MIT, studying with Richard Stanley. Hehas been Benjamin Peirce Assistant Professor at Harvard University since then.

he interested reader will have no trouble generalizing our statements to higher dimensions.

60

Page 63: The Harvard College Mathematics · PDF fileThe Harvard College Mathematics Review is produced and ... and in summer programs ... This new step in The HCMR's maturation is truly impressive

Thomas Lam—Tiling with Commutative Rings 61

are not allowed. If we want to allow rotations of a tile then they must be added to T separately.Because we may translate tiles as much as we like, we will also assume that each tile r G T hasbeen translated as far southwest as possible, so that it touches the x- and ?/-axes. Thus, in the abovechessboard problem, T consists of two elements: the vertical domino V = {(0,0), (0,1)} andhorizontal domino H = {(0,0), (1,0)}.

C such that5.3 Coloring ArgumentsLet T be a set of tiles. A coloring argument for T is a function / : N2

(o,6)€k

for any kCN2 which is a translate of a tile in T. It is not difficult to check that the set of coloringarguments for T forms a vector space over C, which we denote O(T) and shall call the coloringspace.

If R c N2 is some region, then we say that a coloring argument / e O(T) forbids Rif f(R) 0. If a coloring argument / forbids R then one immediately deduces that R is nottileable by T. If we replace black and white by 4-1 and -1, then the chessboard coloring gives thefollowing coloring argument

- 1 4-1 - 1 4-1

+1 - 1 -hi - 1

- 1 4-1 - 1 +1

+1 - 1 4-1 - 1

(which has formula f(a, b) = (-l)a+6) for the tile set T = {V", H} consisting of the two dominoes.

5.4 Tile PolynomialsLet us consider the polynomial ring C[x,y] in two variables, where C denotes the complex num-

«.,*>.bers. To each box (a, b) G N2 in the first quadrant we associate the monomial xay[

y3 xy3 x y x3y3

y2 xy2 *v x3y2

y xy x2y x3y

l X x2 x3

To each region R (or tile r) we associate the region (or tile) polynomial

PR(x,y)= Yl x<xyb eClx>y]-(a,b)eR

Page 64: The Harvard College Mathematics · PDF fileThe Harvard College Mathematics Review is produced and ... and in summer programs ... This new step in The HCMR's maturation is truly impressive

6 2 T h e H a r v a r d C o l l e g e M a t h e m a t i c s R e v i e w 2 . 1

Thus, pv (x, y) = 1 + y and ph(x,v) = 1 4- x.We note that translating a tile r in the direction (a, b) corresponds to multiplying the tile poly

nomial by xayb. Our assumption that the tiles r € T are southwest-justified means that eachpT (x, y) is not divisible by a monomial.2

When is a region R tileable by T? This happens exactly when

p R ( x , y ) = ] T x a y b p r ( x , y ) , ( 5 . 1 )(a,6),r

where the summation is over some collection of translated tiles.

5.5 Tile IdealLet us define the tile ideal JT C C[x, y] to be the ideal generated by the tile polynomials pT as rvaries over the tiles in T. A typical element of p(x, y) G It is thus a finite linear combination

p(x,y) = qi(x,y)pT1(x,y) + • • • 4- qk(x,y)pTk(x,y), (5.2)

where n G T are tiles and qi(x, y) G C[x, y\. In particular, if a region R is tileable by T thenlooking at (5.1) we see that pr G It- However, the converse is not true. The polynomials qi(x, y)in (5.2) may involve negative signs which would allow one to "remove" tiles. Let us say that aregion R is tileable by T over C if pR G It- Tileability over C is a much easier problem, as weshall soon see.

For example, letting R = {(0,0), (0,1), (0,2), (1,1), (2,1), (3,0), (3,1), (3,2)}, we obtain:

_nz_It is easy to see that R is not tileable by the dominoes T = {V, H}. However we have

pR(x, y) = 1 4- y + y2 4- xy 4- x2y 4- x3 4- x3y 4- x3y2= (1 + 2/ + 2/2 4-z2 + x2y + x2y2 - x - xy2)pH(x,y) G It,

so R is tileable by dominoes over C.

5.6 Reduction to Finite Sets of TilesA basic theorem in commutative algebra is the Hilbert Basis Theorem. In our setting, it states thatTheorem 1 (Hilbert Basis Theorem). Every ideal I in a polynomial ring C[xi,x2, ...,xn] isfinitely generated. Furthermore, if S C I is any possibly infinite set of generators, then a finitesubset S' C S will generate I.Corollary 2. Any possibly infinite set T of tiles can be replaced by a finite subset T'cT of tiles,so that tileability by T over C is the same as tileability by T' over C.P r o o f A p p l y T h e o r e m 1 t o t h e t i l e i d e a l J T C C [ x , y ] . □

5.7 Tiling Over C and Coloring ArgumentsProposition 3. We have an isomorphism ofC-vector spaces

0(T)-Homc(C[a?,y]//T,C).2We can avoid having to make these assumptions by using the ring C[x,y,x~l,y~1} instead, but that

makes other things somewhat more complicated.

Page 65: The Harvard College Mathematics · PDF fileThe Harvard College Mathematics Review is produced and ... and in summer programs ... This new step in The HCMR's maturation is truly impressive

T h o m a s L a m — T i l i n g w i t h C o m m u t a t i v e R i n g s 6 3

Proof Let / G O(T). We define a C-linear map (j): C[x, y] -> C by the formula

cj>(xayh) = f(a,b)

and extending by linearity. Since / is a coloring argument, the map 4> descends to a well-definedmap 4>: C[x, y]/IT -> C. This defines a C-linear map O(T) -> Homc(Cta, y]/IT, C).

In the other direction, let 0 G Homc(C[:r, y]/Ir, C). We define / : N^ -> C by the formula

/(a, 6) = 4>(xayb mod /T).

This / lies in O(T) and the resulting map Homc(C[:r, y]/lT, C) -> O(T) is inverse to the one int h e p r e v i o u s p a r a g r a p h . □

It is now time for one of the main results in this article.Theorem 4. A region R C N2 is tileable by T over C if and only if no coloring argument f GQ(T) forbids R.Proof. The "only if" statement is obvious. To prove the "if" direction, we suppose that R is nottileable by T over C so that pR(x, y) £ It- But this means the image pR(x, y) G C[x, y]/IT is anon-zero vector in the C-vector space C[x, v}/It- There is thus a map 0 G Homc(C[a:, 2/]//t, C)such that (I>(Pr) ^ 0. Using the isomorphism of Proposition 3 this gives a coloring argument/ G O ( T ) s u c h t h a t f ( R ) ^ 0 . □

5.8 Nullstellensatz and VarietiesLet J C C[x, y] be an ideal. We define the variety V(I) of I to be

V(I) = {(a,0) G C2 | p(a,0) = 0 for every p(x,y) G /}.

If X C C2 is a set of points in the plane we define the ideal I(X) c C[x, y] of X by

I(X) = {p(x, y) G C[x, y] \ p(a, 0) = 0 for every (a, 0) G X}.

(One can obviously make these definitions in dimensions more than two.)An ideal 7 in a commutative ring B is called radical if for any b G B such that bn G I we have

bei. For example, the ideal (l + x,l + y) C C[x, y] that we have previously seen is radical. Afundamental result in commutative algebra and algebraic geometry is Hilbert's Nullstellensatz.Theorem 5 (Nullstellensatz). Let I C C[x i, x2,..., xn] be an ideal not equal to the whole polynomial ring. Then V(I) is non-empty. Furthermore, if I is radical then we have I(V(I)) = I.

5.9 Tile VarietyTheorem 4 is satisfying theoretically, but to solve our favorite tiling problems it would be nice toexhibit an explicit basis for O(T). By Proposition 3, the dimension of O(T) is equal to that ofHomc(C[x, y]/IT,C). If C[x, v]/It is infinite-dimensional over C (it will always be of countabledimension), then Romc(C[x,y]/lT,C) will be of uncountable dimension. As an example, takeT = {V} to consist of only the vertical domino. Then C[x, y]/IT - C[x] is infinite-dimensionalover C. For simplicity we will assume that C[x,y]/IT and thus O(T) is a finite-dimensionalC-vector space.3

Define the tile variety VT = V(It) C C2 to be the variety associated to the ideal 7T. Forexample, if T = {V, H} then Vt is given by the set of common zeroes of 14- x and 1 + y. ThusVT = {(-1, -1)}. It will follow from Theorem 6 below that if C[x, y]/IT is finite-dimensionalover C then Vt is a finite set of points.

3The description we now give will not lead to a basis for O(T) in the infinite-dimensional case, but othertechniques such as Grobner bases can still tackle the general case.

Page 66: The Harvard College Mathematics · PDF fileThe Harvard College Mathematics Review is produced and ... and in summer programs ... This new step in The HCMR's maturation is truly impressive

6 4 T h e H a r v a r d C o l l e g e M a t h e m a t i c s R e v i e w 2 . 1

For a point (a, 0) G Vt define a map 4>a,p G Homc(C[x, v]/It, C) by evaluating polynomials at (a, 0):

<P«,p(p(x>y)) = p(<*,0).Note that this equations is well-defined exactly because (a,0) G Vt. These elements ofHomc(C[x,2/]//T,C) are very special: they are not just linear maps, but also C-algebra homo-morphisms of C[x, v]/It to C Under the isomorphism of Proposition 3, (j>a^ corresponds to thecoloring argument / : N2 —▶ C given by fa,p(a, b) = aa 0b.

Perhaps you now see where we are heading. If we take T = {V, H} to consist of the twodominoes, and (a,0) = (-1,-1) then /_if_i(a, 6) = (-l)a+6 is just the black-white chessboard coloring!

5.10 A Basis for the Coloring SpaceTheorem 6. Suppose C[x,v]/It has dimension n over C and It is a radical ideal. Then Vt ={(ai,/3i),..., (an,0n)} consists of n points and the set {/Q.,^ G O(T)} forms a basis of thecoloring space O(T).Proof. We claim that an element p(x,y) G C[x,y]/IT is completely determined by its valuesp(ai,0i) on Vt- This follows from Theorem 5: if p, q G C[x, y] take the same values everywhereon Vt then the difference p - q lies in I(VT) and thus in IT by the Nullstellensatz. In particular,we have

dimc(C[a:,y]//T)<|VT|.But if {(ai,0i),...,(am,0m)} C Vrandj G [l,ra] is fixed let us pick for each i ^ j in [l,ra]a polynomial

( j ) , v x - a i ( j \ y — 0 i^ {x'y)= ^ or 4 (x'y)=fr%'insisting that we choose an expression such that the denominator is non-zero (most of the timeeither one will do). Then the product

q(J)(x,y) = l[[q^(x,y)eC[x,y}

takes the value 1 at (aj,0j) and the value 0 at every other (a*, 0i). These ra polynomials give ralinearly independent elements of C[x, v]/It- Thus,

dimc(C[x,2/]//T)>|VT|and we conclude that n = dimc(C[x, y]/IT) = |VT|. In particular, we have shown that Vt ={(ai,/?i),..., (an,0n)} is finite. One checks that the maps {0^,^} C Homc(C[x,y]/7T,C)form a dual-basis to {q(j)(x, y)} c C[x, j/]//t, to complete the proof. □

For T = {V, H}, we have remarked that It is radical so Theorem 6 says that the chessboardcoloring is essentially the only coloring argument. There is also a version of Theorem 6 whichapplies even when It is not radical.

5.11 Summary of StrategyLet us summarize our approach to a tiling problem. We are given a set T of tiles and a regionR. First, we convert each tile r G T into a polynomial pT(x,y). We (try to) solve all thesepolynomials simultaneously, to find the tile variety Vt C C2. If Vt = 0 then every region R istileable by T over C.

We suppose Vt consists of a finite set of points. Next we evaluate pR(x, y) at each point (a,0)of Vt- If for some point we have pR (a,0) ^ 0 then we have found a coloring argument fa^ whichforbids R. If not, but in addition we know that It is radical, then we can conclude from Theorem 6that no coloring argument can show that R is not tileable. Of course, to completely resolve whetherR is tileable by T is a much harder problem.

Furthermore, all the results so far work in any number of dimensions.

Page 67: The Harvard College Mathematics · PDF fileThe Harvard College Mathematics Review is produced and ... and in summer programs ... This new step in The HCMR's maturation is truly impressive

T h o m a s L a m — T i l i n g w i t h C o m m u t a t i v e R i n g s 6 5

5.12 Final CommentsEssentially all of what we have presented so far is a simplification of work of Barnes [Bl, B2].However, much more can be said if we are willing to restrict our class of tiling problems. Let usnow assume that all the tiles and regions that we consider are bricks. In two-dimensions, bricksare just rectangles. In d-dimensions, they are regions of the form [ai, bi] x • • x [a^, bd\-

A fundamental result is an analogue of the Hilbert Basis Theorem over N, due to de Bruijn andKlarner.Theorem 7 ([dBK]). When considering tiling problems of bricks by bricks, any collection of bricktiles can be replaced by a finite subcollection.

For brick tiling problems, tiling over C and usual tilings are not too different. Barnes proved:Theorem 8 ([B2]). Let T be a finite set of brick tiles. Then there is some constant K such thatevery brick region R with all dimensions greater than K can be tiled by T if and only if it can betiled by T over C

Together with Ezra Miller and Igor Pak, I have been studying some computational issues fortilings. I now describe some of our results. Let us say that a set S of bricks has a finite descriptionif it is a finite union S = U*Si of brick classes Si such that each class is of one of the followingforms:

1. {(/i,...,/d) \li = a]

2 . { ( h , . . - , l d ) \ l i > a }

3. {(h,..., Id) | h > a and U = b mod c

for integers a, b and c.Proposition 9 ([LMP]). Let T be a set of bricks. Then the set S of bricks which can be tiled by Tadmits a finite description.Theorem 10 ([LMP]). Suppose we are in d = 2 dimensions and T is a finite set of bricks. Then itis possible to compute a finite description for the set S of bricks tileable by T.

Surprisingly, we conjecture that Theorem 10 fails in higher dimensions. That is, when d > 3,a finite description for the set S of bricks tileable by T is not computable.

References[Bl] Frank W. Barnes: Algebraic Theory of Brick Packing I, Discrete Math. 42 (1982), 7-26.

[B2] Frank W. Barnes: Algebraic Theory of Brick Packing II, Discrete Math. 42 (1982), 129-144.

[dBK] Nicolaas G. de Bruijn and David A. Klarner: A finite basis theorem for packing boxes withbricks, Phillip Res. Repts 30 (1975) 337*-343*.

[LMP] Thomas Lam, Ezra Miller, and Igor Pak: Brick tilings, in preparation.

Page 68: The Harvard College Mathematics · PDF fileThe Harvard College Mathematics Review is produced and ... and in summer programs ... This new step in The HCMR's maturation is truly impressive

FACULTY FEATURE ARTICLE6

Twisting with FibonacciDana Rowland*

Merrimack CollegeNorth Andover, MA 01845

[email protected]

AbstractDetermining when two links are equivalent is one of the central goals of knot theory. This paperdescribes the Conway polynomial, a link invariant that offers one approach to this problem. Whencalculating the Conway polynomial of the (n, 2) torus knots, we encounter the familiar patterns ofPascal's triangle and the Fibonacci sequence.

6.1 IntroductionPick up a piece of string. Tangle it up, twist it around, knot it up, and then attach the ends. The resultis a mathematical knot. Suppose a friend does the same thing. Try to twist, stretch, or otherwisedeform the two tangled loops, without cutting the strings, so that they look exactly the same. Ifthis is possible, then the knots are said to be equivalent. Determining whether or not two knots areequivalent is one of the central questions in knot theory.

Mathematically, a knot is a continuous closed curve in space that does not intersect itself. Alink is the disjoint union of finitely many knots, where the number of components of the link isdetermined by the number of knots. If each component is assigned a direction, then the link isoriented. Two oriented links are equivalent if one can be deformed into the other in such a waythat the orientation is preserved. A two-dimensional picture of a link is called a projection.

Figure 6.1: Do these two oriented link projections represent equivalent links?

Knot theorists use link invariants to distinguish between different links. If two links are equivalent, then calculating the link invariant for each link produces the same result, despite the fact thatthe link projections may appear to be drastically different.

One example of a link invariant is the Conway polynomial. Given a projection of an orientedlink L, we can assign a polynomial V(L), described using the variable z. The polynomial is defined

*Dana Rowland has been teaching in the mathematics department at Merrimack College since 2001, whereshe is now an associate professor. She earned her doctorate in mathematics in 2001 at Stanford University underRalph Cohen, in the area of algebraic topology. As an undergraduate, she majored in both mathematics andEnglish at the University of Notre Dame. Her current research interests include knot theory and graph theory.When she is not doing mathematics, Dr. Rowland enjoys playing bassoon and valve trombone in the MerrimackCollege jazz ensemble and being constantly surprised by her amazing children Ben (4 years) and Michela (21months).

66

Page 69: The Harvard College Mathematics · PDF fileThe Harvard College Mathematics Review is produced and ... and in summer programs ... This new step in The HCMR's maturation is truly impressive

D a n a R o w l a n d — T w i s t i n g w i t h F i b o n a c c i 6 7

so that no matter how the link is twisted around in space, the polynomial will not change—any twoprojections of the same link will have the same Conway polynomial.

The Conway polynomial of a particular projection is calculated recursively by applying thefollowing two definitions.Definition 1. If L is equivalent to a single unknotted circle ("the unknot"), then its Conway polynomial is equal to 1; that is,

O)- (6.1)

) (

Figure 6.2: These three link projections are identical outside of the region shown.

Definition 2. Suppose L+, L_, and L0 are three oriented link projections that are identical exceptnear one crossing of L+, where they appear as in Figure 6.2. Then their Conway polynomialssatisfy the relation

2 V ( L o ) = V ( L + ) - V ( L _ ) . ( 6 - 2 )

These two defintions, together with the requirement that two projections of the same link musthave the same Conway polynomial, suffice for calculating the Conway polynomial of any knot orlink. We can always unknot a link by changing finitely many crossings. By applying Definition2 at one of these crossings and appropriately assigning the roles of L+, L_, and L0, we can findthe Conway polynomial of a given link in terms of the Conway polynomials of links with fewercrossings. We can eliminate all crossings by repeating this process, resulting in only the trivial knotor trivial links.

By Definition 1, we know V I I J ] = 1. The Conway polynomial of a trivial link is 0. To

see this, label the trivial link as L0 and form L+ and L_ by joining two of the components using apositive and negative crossing respectively. Applying (6.2) gives

-(00)-(00)-(00)-(OMo:= 0.

The following example illustrates how these two definitions are used to calculate the Conwaypolynomial of the trefoil knot.Example 3. The Conway polynomial of the trefoil knot is 1 +■ z2.Proof. Select one crossing in the trefoil knot. Since the crossing is a positive crossing, label thetrefoil knot as L+ and replace the region near the chosen crossing as shown in Figure 6.2 to obtain

Page 70: The Harvard College Mathematics · PDF fileThe Harvard College Mathematics Review is produced and ... and in summer programs ... This new step in The HCMR's maturation is truly impressive

68 The Harvard College Mathematics Review 2.1

Figure 6.3: The trefoil knot.

L- and L0. Then use (6.2).

* (<§)W(§)Wc3>-(O + 2V

The Conway polynomial of the unknot is 1. To find the Conway polynomial of the link, selectanother crossing and apply (6.2) again.

+ zV

+ lz

Since the Conway polynomial of the trivial link is 0, we see that

-(00MO)

= 0+12 = 2

and the Conway polynomial of the trefoil is 1 + z(z) = 1 + z2, as claimed. □Before continuing, we urge the reader to try a problem or two.

Exercise 4. A different projection of the trefoil knot is shown in Figure 6.4. Make this knot out ofstring and manipulate the string to show that this knot is equivalent to the one used in Example 3.Verify that calculating the Conway polynomial starting with this projection also results in 1 + z2.

Figure 6.4: Another projection of the trefoil knot

Exercise 5. Show that the Conway polynomial of the figure-eight knot, shown in Figure 6.5, is

Page 71: The Harvard College Mathematics · PDF fileThe Harvard College Mathematics Review is produced and ... and in summer programs ... This new step in The HCMR's maturation is truly impressive

Dana Rowland—Twisting with Fibonacci 69

Figure 6.5: The figure-eight knot.

6.2 Torus LinksSuppose you have two strings, side by side, and the tops of the strings are fixed. Twist the rightstring over the left string n times, and orient both strings in the same direction. Without introducingany additional crossings, attach the bottoms of the strings to the tops, as shown in Figure 6.6. Theresulting knot or link is known as an (n, 2) torus link, because it lies flat on a torus, wrappingaround twice longitudinally and n times through the center.1

Figure 6.6: An (n, 2) torus link.

Now we explore what happens when we calculate the Conway polynomials of these links. LetCn denote the (n, 2) torus link. Notice that £i is the unknot, C2 is the link from Example 3, and£3 is the trefoil. When n is odd, Cn is a knot. When n is even, £n is a two-component link.Furthermore, for n > 3, changing a single crossing in the (n, 2) torus link results in a projectionof the (n - 2,2) torus link. See Figure 6.7.

Figure 6.7: Applying (6.2) to torus links with L+ = £n, L- = £n-2, and Lo = Cn-i-

lSee [Ad, Section 5.1] for a general description of (n, ra) torus knots and links.

Page 72: The Harvard College Mathematics · PDF fileThe Harvard College Mathematics Review is produced and ... and in summer programs ... This new step in The HCMR's maturation is truly impressive

70 The Harvard College Mathematics Review 2.1

Using the definitions for calculating the Conway polynomial, we observe that for all n > 3,

V ( £ n ) = V ( £ n - 2 ) + z V ( C n - i ) . ( 6 . 3 )This quickly leads to a table of Conway polynomials:

V(£i) = 1V(£2) = zV(£3) = l + 22V(£4) = 2z + z3V(£5) = l + 3z2-+z4V(£6) = 3z 4 4z3 4 /V(£7) = 1 4- 6z2 4- 5z4 4- *6V(£8) = 4z + IO23 4 6*5 4- /V(£9) = 1 4- I0z2 4 15*4 4 7z6 4- *8

6.3 Pattern Recognition and FormulasIf we look at the table of Conway polynomials, we can immediately make a few observations.First, the degree of the polynomial for the (n, 2) torus link is n - 1. The polynomials are allmonic, meaning that the highest order term has 1 as its coefficient. When n is odd, the constantterm of the polynomial is 1 and the polynomial contains only even powers of z. When n is even,the polynomial contains only odd powers of z, and the coefficient of the z term is n/2.

These observations begin to reveal the behavior of the Conway polynomial of the (n, 2) toruslink, but we can do better by taking a closer look at the pattern formed by all the coefficients:

V ( £ iV(£2V(£3V(£4V(£5V(£6V(£7V(£8V(£9

= 1= \ z= 14- l*2= 2 z 4 - l z 3= 1 4- 3z2 4 lz4

3z + 4z3 4 lz5= 14 6z2 + 5z4 + lz6= 42 + 10z3 4 6z5 + lz7= 1 4 IO22 4 15z4 4 7*6 4- lz8

Notice the appearance of Pascal's triangle along the diagonals of the Conway polynomial coefficients! Alternatively, we can find these coefficients of the Conway polynomials within Pascal'striangle, as seen in Figure 6.8.

Recall that the entries in Pascal's triangle are the binomial coefficients. This suggests thefollowing formulae.Theorem 6. The Conway polynomial of the (2n 4 1,2) torus knots is given by the equation

V(£2n+i) =

j = 0

+<TK+n2W' + 2n\ 2n2n }e

(6.4)

Page 73: The Harvard College Mathematics · PDF fileThe Harvard College Mathematics Review is produced and ... and in summer programs ... This new step in The HCMR's maturation is truly impressive

Dana Rowland—Twisting with Fibonacci

1

71

Figure 6.8: The coefficients of the Conway polynomials of the (n, 2) torus knots and links can befound within Pascal's triangle.

and the Conway polynomial of the (2n, 2) torus links is given by

v(«-i?).+n'K+ nrw ■ ♦ £ : : » -

E3=0

n + j \ 2j+i2j + l z

(6.5)

Proof. We prove (6.4) and (6.5) using the principle of mathematical induction. Since £i is theunknot, we know that V(£i) = 1 = Q. In Example 3, we showed that V(£2) = z= Qz, sothe results are valid when n = 0.

Assume we know the formulae hold for all positive integers less than ra. We prove the formulaholds when ra = 2n is even. (The case for ra odd is similar, and is left to the reader.) We have

V(£2n) = V(£2n-2) 4- zV(C2n-i)

j = on - 2= £3=0n - l= £j = 0

n- '14- j2j + lE(",~irK'*' + *En - 1 + j

2j + l

n + j \ 2 j + l2j + l]Z

4

3=0

n - l + j2 j

n - 1 4 j \ 2j2 3 U

2j + l . 2n-lz - 4- Z

The sudden appearance of Pascal's triangle allowed us to conjecture and prove a result aboutthe Conway polynomials of mathematical knots. As frequently occurs in mathematics, results froma seemingly unrelated field can be utilized where least expected.

Page 74: The Harvard College Mathematics · PDF fileThe Harvard College Mathematics Review is produced and ... and in summer programs ... This new step in The HCMR's maturation is truly impressive

7 2 T h e H a r v a r d C o l l e g e M a t h e m a t i c s R e v i e w 2 . 1

6.4 Torus Links and the Fibonacci SequenceIn [Ka], Kauffman observed another relationship: If we evaluate the Conway polynomials of thesetorus links at z = 1, then we obtain the Fibonacci sequence.

Recall that the Fibonacci sequence {/n} is defined by the following recursive relation.

/i = lf2 = lfn — fn-2 4" J n-l

When z = 1, we see that V(£i)\z=i = 1, V(£2)|2=i = 1, and the recursive relation from(6.3) becomes

V(£n)|2=1 = V(£n-2)|2=1 4 1 • V(Cn-i)\z = 1 ,which establishes the identity

V ( £ n ) U = / n .Combining this with (6.4) and (6.5), we obtain the identities

^ - t h A ( 6 - 6 )3=0n - l n + j

2j + l/'3 = 0

The fact that the Fibonacci numbers occur as sums of the diagonals of Pascal's triangle shownin Figure 6.8 was discovered by Edouard Lucas in 1876. See [Ko] for a direct proof.

6.5 ConclusionSince the polynomials given in (6.4) and (6.5) are distinct, we know that the (n, 2) torus knots andlinks are distinct for different values of n. The Conway polynomial can be a useful tool for tellingmany knots and links apart, including those in Figure 6.1.Exercise 7. Use the Conway polynomial to prove that the oriented link projections in Figure 6.1do not represent equivalent links.

The Conway polynomial is equivalent to a normalized version of the very first polynomialfor knots and links, which was invented by J. Alexander in 1928. Alexander defined his originalpolynomial using the Seifert matrix constructed from a surface spanning an oriented link, andin 1969 John Conway discovered the polynomial could be derived more simply using the abovedefinitions. The Alexander-Conway polynomial was one of the major tools used for distinguishingknots for the next 60 years.

However, the Conway polynomial cannot distinguish between all knots or links. Two knotsor links can have the same Conway polynomials even if they are not equivalent. For example,splittable links are links that can be deformed so that the components lie on different sides of aplane in R3.Exercise 8. Show that the Conway polynomial of any splittable link is 0. (Hint: Label the splittablelink as Lo.)

Another example which illustrates the limitations of the Conway polynomial is the 11-crossingConway knot, shown in Figure 6.9. This knot has Conway polynomial 1, even though it cannot beunknotted.

In fact, given any knot, there are infinitely many other knots with the same Conway polynomial(c/[Cr,p. 164]).

Page 75: The Harvard College Mathematics · PDF fileThe Harvard College Mathematics Review is produced and ... and in summer programs ... This new step in The HCMR's maturation is truly impressive

D a n a R o w l a n d — T w i s t i n g w i t h F i b o n a c c i 7 3

Figure 6.9: This 11 crossing knot has the same Conway polynomial as the unknot.

In 1984, Vaughan Jones discovered a connection between von Neumann algebras and braidgroups, which led to a new polynomial for knots and links. The Jones polynomial has the advantage that there are no known examples of knots that have the same Jones polynomial as the trivialknot. Jones' discovery encouraged other mathematicians to search for more knot polynomials.This quickly led to the discovery of the Homfly polynomial, a two-variable generalization of bothdie Alexander-Conway and Jones polynomials which was developed independently by Jim Hoste,Adrian Ocneanu, Raymond Lickorish and Ken Millett, Peter Freyd and David Yetter, and JozefPrzytycki and Pawel Traczyk. Still other polynomial invariants grew out of a surprising connectionbetween knot theory and theoretical physics.

The connections between knot polynomials and previously unrelated fields of mathematics ledto renewed interest in the mathematical theory of knots and links, and rapid advances in the subject.Still, none of these polynomials provides a complete invariant—infinitely many examples exist ofpairs of non-equivalent knots that still have the same knot polynomials. Knot theorists continue tosearch for more ways to distinguish knots and links.

We refer the interested reader to [Ad, Cr, Sc] for more about knot theory and to [BQ, En] foradditional properties of Pascal's triangle and the Fibonacci sequence. We conclude with a finalproblem.

Take a rubber band, and twist it n times while holding onto two ends of the rubber band. Linkthe ends together using a clasp with two crossings so that the resulting knot is alternating. Theresult is a twist knot, pictured in Figure 6.10. Note that the figure-eight knot in Figure 6.5 is anexample of a twist knot.

Figure 6.10: A twist knot

Exercise 9. Find a formula for the Conway polynomials of the twist knots. Use this to conclude thatthe twist knots are distinct for different values of n, and that none of the twist knots are equivalentto (n, 2) torus knots.

6.6 AcknowledgmentsThe author thanks sarah-marie belcastro and Grant Dasher for suggested revisions and helpfulcomments.

Page 76: The Harvard College Mathematics · PDF fileThe Harvard College Mathematics Review is produced and ... and in summer programs ... This new step in The HCMR's maturation is truly impressive

7 4 T h e H a r v a r d C o l l e g e M a t h e m a t i c s R e v i e w 2 . 1

References[Ad] Colin C. Adams: The knot book. Providence, RI: American Mathematical Society, 2004.

[BQ] Arthur T. Benjamin and Jennifer J. Quinn: Proofs that really count. Washington, D.C.:Mathematical Association of America, 2003. (The Dolciani Math. Expositions 27)

[Cr] Peter R. Cromwell: Knots and links. Cambridge: Cambridge Univ. Press, 2004.

[En] Hans Magnus Enzensberger: The Number Devil. New York:, Henry Holt and Company,1998.

[Ka] Louis H. Kauffman: The Conway polynomial, Topology, 20 #1 (1981), 101-108.

[Ko] Thomas Koshy: Fibonacci and Lucas numbers with applications. New York: Wiley-Interscience, 2001. (Pure and Applied Math.)

[Sc] RobScharein: Knotplot: Hypnogogic software, http://knotplot .com, 1998-2007.(An OpenGL program for visualizing and manipulating mathematical knots.)

Page 77: The Harvard College Mathematics · PDF fileThe Harvard College Mathematics Review is produced and ... and in summer programs ... This new step in The HCMR's maturation is truly impressive

FEATURE

7MATHEMATICAL MINUTIAE

i Has This Funny PropertyZachary Abel*

Harvard University ' 10Cambridge, MA 02138

[email protected]

Scott D. Kominers *Harvard University '09Cambridge, MA 02138

[email protected]

Recall that the derivative of a real-valued function g : R —> R is given by

g'(z) = lim g(z + h)-g(z)^* v ; h - > o h

While this definition implies continuity, woe be unto the real analyst1 who tries to differentiate thefunction

h l i x ) = \ x2 x>0

twice, even though he may do so once. Indeed, many a differentiable, real-valued function is nottwice differentiable.

We define the complex derivative of a function / : C —> C as

f{z) = lim f±±]±zmJ v ' h - + o h

where h ranges over complex numbers (if this limit exists). Although this definition looks similarto that of the real derivative, the real and complex derivatives are wildly different beasts. Forexample, we have the following, which shows that—unlike real-differentiable functions—complex-differentiable functions are always multiply differentiable.Theorem 1 (Cauchy's Integral Formula [SS, Cor. 4.2] [La, Ch. Ill Thm. 7.7]). Any complex function f : C -▶ C that is differentiable near z0 € C is infinitely differentiable there. Furthermore,

* Zachary Abel, Harvard '10, is a computer science and mathematics concentrator. He is an avid problemsolver and researcher, with interests in such varied fields as computational geometry, number theory, partitiontheory, category theory, and applied origami. He is a founding member of The HCMR and currently serves asProblems Editor/Graphic Artist, and Issue Production Director.

t Scott D. Kominers, Harvard '09, is a mathematics concentrator, ethnomusicology minor, and economicsenthusiast. He is an enthusiastic researcher, working in a range of fields including number theory, computational geometry, category theory, mathematical economics, urban economics, law and economics, and historicalmusicology. He is a founding member of The HCMR and has served as Editor-in-Chief since The HCMR'sinception.

!real analyst, ('riial aenalist), noun. (1) A mathematician who studies the analytic properties of the realnumbers. (2) An analyst who is not fake.

75

Page 78: The Harvard College Mathematics · PDF fileThe Harvard College Mathematics Review is produced and ... and in summer programs ... This new step in The HCMR's maturation is truly impressive

7 6 T h e H a r v a r d C o l l e g e M a t h e m a t i c s R e v i e w 2 . 1

we may explicitly calculate these values:

f (zo) = TT^ f 7— L+i dzi2m J (z - z0)n+1

where 7 is any sufficiently small loop around zo.Not only is a complex differentiable function twice differentiable, it is actually smooth! Furthermore, it has a convenient power series representation:Theorem 2 ([SS, Thm. 4.4] [La, Ch. IV Thm. 7.3]). /// is complex-differentiable near z0, then fis analytic near zo, i.e. it has a power series expansion

0 0

fc=0

in an open neighborhood ofzo-In light of Theorem 1, we can take Theorem 2 even further, explicitly calculating the power

series coefficients: ck = ^f(n)(z0) = ^ /7 {J}%f*+i • This is a far cry from the real-analyticcase, where even an infinitely differentiable function may not have a power series expansion. Thefunction

h2(x) = < _iye * x>0.

is a classic example of a smooth function h2 : R -▶ R with no power series expansion at x = 0.(See if you can prove this!)

For any differentiable function g : R -+ R, the image set g(R) is simply an interval.2 With ourtrusty i, however, we can see far more about the shape of the image:Theorem 3 (Liouville's Theorem [SS, Cor. 4.5] [La, Ch. HI, Thm. 7.5]). If f : C -+ C is ananalytic function such that the image /(C) is bounded, then f is constant.

Whereas in the real case we could only describe the image of a function as an interval, in thecomplex case we know instead that the image is bounded if and only if it is a single point. Wefurthermore have control over the images of complex analytic functions with unbounded images:Theorem 4 (Picard's Little Theorem [SS, Exer. 6.11] [La, Ch. XII, Thm. 2.8]). If an analyticfunction f : C —▶ C is nonconstant, then its image omits at most one value. That is, /(C) is eitherCorC\ {p}for some peC.

So much power in one little i!

AcknowledgementsWhile there's no i in team, many people contributed to this feature. In particular, i was taughtby Andreea Nicoara in the Harvard Mathematics 213a course, Fall 2007, and the authors greatlyappreciate Nicoara's instruction.3 The presentation here also benefited from the comments andadvice of Brett Harrison, Daniel Litt, Menyoung Lee, Noam D. Elkies, Shrenik Shah, Ernie Fontes,and Andrea Hawksley. The authors are indebted to the constant advice and support of their parents,{Terry, Bruce} Abel and {Ellen, William} Kominers.

2Incidentally, it may be any interval—closed, open, or half-open. In fact, this works even if g is only knownto be continuous. (.Prove it!)

3In addition to [La] and [SS], the authors recommend [AI] and [Re], both of which were used as course textsin Nicoara's Mathematics 213a.

Page 79: The Harvard College Mathematics · PDF fileThe Harvard College Mathematics Review is produced and ... and in summer programs ... This new step in The HCMR's maturation is truly impressive

Zachary Abel and Scott D. Kominers—i Has This Funny Property 77

References[Al] Lars V. Ahlfors: Complex Analysis, 3rd ed. New York: McGraw-Hill, Inc., 1979.

[La] Serge Lang: Complex Analysis, 4th ed. New York: Springer, 1999.

[Re] Reinhold Remmert: Classical Topics in Complex Function Theory. New York: Springer,1998.

[SS] Elias M. Stein and Rami Shakarchi: Complex Analysis, Princeton, New Jersey: PrincetonUniversity Press, 2003. (Princeton Lectures in Analysis 2.)

Page 80: The Harvard College Mathematics · PDF fileThe Harvard College Mathematics Review is produced and ... and in summer programs ... This new step in The HCMR's maturation is truly impressive

FEATURE

8STATISTICS CORNER

How Statisticians Discoveredthe Options Backdating Scandal

Robert W SinnottfHarvard University '09Cambridge, MA 02138

rs inno t t@fas .ha rva rd .edu

8.1 IntroductionA stock option grant is a contract made between a corporation and another entity in which, ona specified exercise date, the corporation agrees to buy or sell a specified number of shares ofits stock for a fixed price called the exercise price. Such a contract can be quite valuable if thedifference between the exercise price and the stock's trading price on the exercise date is large. Forexample, suppose that a company executive received the option of purchasing 100,000 shares of acompany at an exercise price of $35 on January 31. If the stock trades at $55 a share on January31, then this contract is worth 100,000 x ($55 - $35) = $2,000,000. Thus, it is to the executive'sadvantage for his options to have a low exercise price, coupled with a high share market price onthe exercise date. The latter point is the purpose of the stock option grants—such grants incentivizeexecutives to increase the fundamental value of their companies. The former point is the root of theoptions backdating debacle.

Options backdating is the practice of marking the grant date of an option with a date priorto date on which the decision to grant the option was made. This is not in general a problem—companies have the right to enter into any agreement and award any compensation according totheir internal compensation policies so long as they properly report such awards to their shareholders and the IRS. So that options grants are not counted against company earnings, they must beissued with exercise prices at or above the stock price on the grant date. In the vernacular, theseare called out-of-the-money (or at-the-money) option grants. By contrast, options have a positivevalue on their grant date are counted against earnings and are called in-the-money grants.

Executives desire higher compensation. However, by receiving stock options with grant dateson which stock prices were at a minimum, executives are able to obtain maximal compensationwithout having to report reduced earnings. Of course, it is difficult to predict stock price minima inthe short-term before they occur. Unsurprisingly, however, it is easy to determine such minima expost.

Prior to the 2002 Sarbanes-Oxley reforms, stock option grants could be reported months afterthe actual grant date, leaving a situation ripe for abuse. The empirical evidence of Lie [L2] demonstrates that, statistically speaking, the probability that such abuse did not occur is impossibly small.Although initial studies such as Yermack [Ye] suggested that executives used insider information toinform the scheduling of option grant dates, Lie's analysis shows conclusively that this explanationis not sufficient to explain the data and that the only reasonable possibility proposed so far is thatmany option grants were backdated in order to maximize executive compensation. This analysiswill be the subject of the remainder of this article.

t Robert W. Sinnott, Harvard '09, is a statistics concentrator.

78

Page 81: The Harvard College Mathematics · PDF fileThe Harvard College Mathematics Review is produced and ... and in summer programs ... This new step in The HCMR's maturation is truly impressive

Robert W. Sinnott—The Options Backdating Scandal 79

Stock prices for fiscal year ending 1/31/1999

$90

$80

$70

$60

$50

$40

$30

$20

v \ t iH& i W 1• * * * = J * 1

fi rA ; j f v

ex1 i i 1 1 1 i i i i i i i lo o o o o o o o o o o o o o o c o o o o o o s o s o s

CMcn

C ^ C J r s ]- ^ - « C M - l - h

CMcn

Figure 8.1: Used with permission from Lie [LI].

8.2 Evidence of Option BackdatingSeveral studies including Chauvin and Shenoy [CS], Yermack [Ye], and Aboody and Kasznik [AK]examined the periods around stock option grants, yielding conflicting results regarding abnormalstock returns before and after stock option grants. However, all of these studies focused on scheduled stock option grants, making their data less reactive to the opportunistic behavior of companyexecutives. Lie [L2] innovatively focused on unscheduled grants during the period from 1992-2002, i.e., those grants which were not dated within a week of the grant dates from the previousyear.

Lie [L2] sought to answer two questions:

1. Were stock prices on days surrounding stock option grants abnormal?

2. If so, could this abnormality be explained by company executives having inside informationabout the future returns of their stock?

A negative answer to the second of these questions implies the practice of option backdating.The first question is reasonably straight forward to answer. Lie used the three factor Fama-

French model (see [FF]) to create a baseline for each stock's expected market returns and thencompared each company's actual returns to those predicted by the model. This allowed Lie totest whether the stock price was significantly lower at the option grant date than the market wouldotherwise predict. Abnormal returns in this test would prove that abnormal returns occurred aroundthe grant date but not whether insider information was being used. After all, abnormal returnsare being compared to the market, so relative returns are specific to individual companies. The

Page 82: The Harvard College Mathematics · PDF fileThe Harvard College Mathematics Review is produced and ... and in summer programs ... This new step in The HCMR's maturation is truly impressive

8 0 T h e H a r v a r d C o l l e g e M a t h e m a t i c s R e v i e w 2 . 1

literature has not shown whether or not company executives can accurately predict short term stockprice patterns.

The Fama-French three factor model (8.1) was proposed in 1993 by Eugene Fama and KenFrench [FF] as a generalization to the widely known Capital Asset Pricing Model (CAPM):

Rs-Rf=0ix (Km - Rf) I fcx (SMB) + 03 x (HML) + a (8.1)

It predicts a stock's returns Rs by a regression of previous returns against overall market returnsRm, the difference between high book-value to price and low book-value to price stock returnsHML and the difference in returns between small and large cap stocks SMB (after standardizingusing the riskless interest rate Rf). The intercept term for the stock returns is denoted a. As SMBand HML are differenced terms, the Rf standardization cancels out in both cases. This model (8.1)is widely accepted; it is effective as an approximate prediction of returns for individual stocks givenknowledge of the returns of other stocks and of general market behavior.

Lie calculated the regression coefficients using the stock price information from the year priorto fifty days before the option grant date. He then used these coefficients to calculate predictedstock prices during the interval surrounding the stock option grant, using standard linear regressiontechniques. These estimates theoretically allow for a reasonable comparison between the actualstock value and the overall behavior of the market, allowing Lie to isolate firm-specific trends fromoverall market behavior.

The graph in Figure 8.2 shows the dramatic statistical deviation of average stock prices fromtheir predicted levels around the option grant date for nonscheduled option grants before and afterthe 2002 Sarbanes-Oxley reforms regulating the use of stock options for executive compensation.The interpretation of the observed patterns is unmistakable. Grants are frequently awarded on dateswhich correspond to local minima of stock prices and are correlated with the reporting requirementsof their corresponding firms. We thus have an answer to the first question: stock prices on dayssurrounding stock option grants are statistically abnormal.

Having answered the first question, Lie then turned to the second. Could Yermack's [Ye] theoryof executive insider information and option grant timing explain the apparent predictive ability ofthe executives receiving the stock grants? In order to test this, Lie used a logistic regression todetermine the factors defining the choice to grant stock options on a particular day, regressingagainst not only the individual stock's returns but also the returns of the market as a whole. Thelogic was simple: If executives are working off of inside information, they should be able to predictchanges in returns that on the individual firm level but not those on the market level. If, however,the grant appears to be decided not only by the individual stock returns but also by the returns ofthe market as a whole, the executive options must have been granted ex post.

In general, a logistic regression is used when a binary outcome (0 or 1) is being determined.In this case, the binary decision is whether to grant options on a particular date. For the dataset,Lie used the actual option grant dates, and then randomly selected five dates in the six-month rangesurrounding the option grant for each company (resulting in a total of 10,003 observations) and setthe dependent variable equal to 1 for the actual grant dates and to 0 for the random dates.

For regression coefficients, Lie used both the abnormal (stock-specific) stock market returnsand the predicted (market level) stock returns for the intervals of 30-10 days before, 10-5 daysbefore, 5-2 days before, 2-0 days before, 0-2 days after, 2-5 days after, 5-10 days after, and10-30 days after the grant. He controlled for seasonality by adding dummy variables for eachmonth of the year (eight actual return variables, eight predicted return variables, and eleven monthvariables in total). In the equation (8.2) below, (MONTHS) is the column vector of seasonaldummy controls, (ABNORMAL) is the column vector of stock-specific returns for each of thelisted intervals, and (PREDICT) is the column vector of predicted (market level) stock returns,and X = (xij) is a data matrix with observations as rows, with the appropriate values for eachobservation component.

Page 83: The Harvard College Mathematics · PDF fileThe Harvard College Mathematics Review is produced and ... and in summer programs ... This new step in The HCMR's maturation is truly impressive

Robert W. Sinnott—The Options Backdating Scandal 81

Abnormal Stock Returns around ESO Grants

Day relative to option grant

•Pre-SOX-Post-SOX:- Post-SOX:• Post-SOX:

filing lag is 1 day or lessfiling lag is 2 daysfiling lag is 3 days or more

Pre-SOX is beforeAugust 29, 2002.Post-SOX is on or afterAugust 29, 2002.

Figure 8.2: Used with permission from Lie [LI].

logit(pi In PiI - P i

: 0o + (MONTHS)- (xid,...,xihj)

+ (ABNORMAL) • (xi2j, - - - ,xi<u)+ (PREDICT) • (.220,;, • • • ,*27,j)

(8.2)

Using the estimates derived from this regression, Lie found that the abnormal returns for theregressed intervals to be significant in predicting the "decision" to grant options. He also determined that the overall market predicted returns were very significant in the four days surroundingthe option grants. This result conclusively showed that unless executives can effectively predictmarket level fluctuations in stock price, they must be backdating the options to minimize the grantdate price.

8.3 ConclusionBy using standard OLS regression in the Fama-French model and then using logistic regressionon the stock returns surrounding the dates of option grants to model the option grant decision, Liewas able to uncover a multi-billion dollar fraud that was occurring in a recently estimated 18.9% of

Page 84: The Harvard College Mathematics · PDF fileThe Harvard College Mathematics Review is produced and ... and in summer programs ... This new step in The HCMR's maturation is truly impressive

8 2 T h e H a r v a r d C o l l e g e M a t h e m a t i c s R e v i e w 2 . 1

all ESO grants. These simple applications of statistical models have sent Shockwaves through thecorporate landscape, reinforcing the need for the Sarbanes-Oxley reforms of 2002.

AcknowledgementsThe author thanks Erik Lie for his advice and for the use of his graphs.

References[AK] David Aboody and Ron Kasznick: CEO stock option awards and the timing of corporate

voluntary disclosures, Journal of Accounting Economics 29 (2001), 73-100.

[CS] Keith W. Chauvin and Catherine Shenoy: Stock price decreases prior to executive stockoption grants, Journal of Corporate Finance 7 (2001), 53-76.

[FF] Eugene F. Fama and Kenneth R. French: Common risk factors in the returns on stocks andbonds, Journal of Financial Economics 33 (1993), 3-56.

[LI] Erik Lie: Backdating of executive stock option (ESO) grants, http://www.biz.u i owa .edu / f acu l t y / e l i e / backda t i ng .h tm .

[L2] Erik Lie: On the timing of CEO stock option awards, Management Science 51 #5 (2005),802-812.

[Ye] David L. Yermack: Good timing: CEO stock option awards and company news announcements, Journal of Accounting Economics 34 (1997), 449-476.

Page 85: The Harvard College Mathematics · PDF fileThe Harvard College Mathematics Review is produced and ... and in summer programs ... This new step in The HCMR's maturation is truly impressive

FEATURE9

APPLIED MATHEMATICS CORNER

Secret Sharing and ApplicationsPablo Azar*

Harvard University '09Cambridge, MA 02138

[email protected]

9.1 Shamir's Secret Sharing SchemeIn this column, I will discuss Shamir's secret sharing protocol [Sn]. The motivation for this protocol is the desire for individual privacy while computing an aggregate piece of data. For example,suppose that, to prevent corruption, no employee of a bank is allowed to access the security vaultalone. Instead, each employee is given a piece of the password to the vault. When k employeesget together, they can reconstruct the password and access the vault. However, k - 1 of them willhave insufficient information about the password, so that no k - 1 corrupt employees can steal thebank's money.

You can encode a piece of information as a binary £-bit number a^_i2^_1 -f- h 2ai + a0where a* G {0,1}. If the message you want to share is too long, you can split it up into smallermessages so that each takes less than £ bits to write. Without loss of generality, we may assumethat the message can be written as an -bit number.

Then we can interpret the message as an element of the finite field F2* -1 If you have a messagex G F2£, you can compute any polynomial c0 + c\x +■ h cnxn G F2* and use Lagrange'sinterpolation theorem: If F is a field and (xi, yi),..., (xn, yn) are pairs of points in F2, then thereexists a unique polynomial p(x) = cn-ixn-1 + • • • + co of degree n-l such that p(xi) = yi fori = 1,... ,n.

What does this have to do with sharing secrets? Well, suppose that you encode a secret sas an -bit number in W2p and suppose that you want to distribute s among n people numbered1,..., n. You want them to be able to reconstruct s if k people get together and cooperate, but getno information if fewer than k people pool their knowledge. Now, generate k - 1 random numbersci,..., Ck-i uniformly from F2£ and consider the polynomial f(x) = s + cix -\ h Ck-ix ~ .To each person i, give the share f(i).

Players h,...,ik can get together and pool their shares to obtain the set {(ii,f(ii)), - - -,(u, f(ik))} of distinct points. With these k points, they can use Lagrange's interpolation theoremto reconstruct the polynomial /. Reconstructing / is equivalent to reconstructing its coefficients.In particular, all k players get knowledge of the secret s, which is the constant coefficient of /.

Furthermore, if k - 1 players get together they do not learn anything about s. To see this,consider the following argument: given a set of coefficients (ci,..., ck-i) G F^-1 and a fixedsecret s, we can generate the polynomial f(x) = s +■ cix -f • • • + Ck-ixk~x and the vector of

tPablo Azar, Harvard 09, is an applied mathematics concentrator from Buenos Aires, Argentina. He is alsoenrolled in a concurrent masters program in computer science. He is a founding member of The HCMR andcurrently serves on The HCMRs staff.1 You can construct this field if you know an irreducible polynomial f(x) of degree t with coefficients in F2.The field is given by the quotient F2* := ¥2[X]/f(X). The elements of this field are polynomials of degreeless than t with coefficients in F2, with all operations conducted modulo f(X). Such polynomials require £bits to encode.

83

Page 86: The Harvard College Mathematics · PDF fileThe Harvard College Mathematics Review is produced and ... and in summer programs ... This new step in The HCMR's maturation is truly impressive

8 4 T h e H a r v a r d C o l l e g e M a t h e m a t i c s R e v i e w 2 . 1

shares (f(ii),..., f(ik-i)) G F^"1. This gives us a map

M:Ffc_1 ->Ffc_1M(c i , . . . ,ck- i ) = ( f ( i i ) , . . . , f ( ik - i ) ) -

This map is bijective by Lagrange's interpolation theorem: given a vector of shares (a^,...,aik_1), there exists a unique polynomial of degree k - 1 with constant coefficient s that interpolates the points {(0,s), (ii,^),..., (ik-i,aik_1)}. This polynomial is characterized by itsnon-constant coefficients ci,..., Ck-i.

Therefore, there is a bijection between coefficients (ci,... ,Ck-i) and shares (ai,.. .,ak-i).But remember that the dealer chose the coefficients ci,..., Ck-1 to be uniformly and independentlydistributed. This implies that the shares (ai,... ,ak-i) are also uniformly and independentlydistributed. Given the secret s, the players ii,...,ik-i can get any possible combination of k - 1shares with equal probability. This shows that k - 1 shares do not reveal anything valuable aboutthe secret.

9.2 Multi-Party Protocols, Corrupt Players and Corrupt DealersThe scheme proposed above is very elegant, but the assumptions on the dealer and the honesty ofthe players may be too strong for applications. The first problem that arises is that there may beno dealer. In this case, each of the players may have a secret si,..., sn, and all of them wantto compute a function f(si,... ,sn) without revealing any information about their correspondingsecrets besides what is known from f(si, ...,sn) [Ya]. Furthermore, some of the players maybe malicious or faulty and give fake or incorrect shares to the other participants. To detect whichplayers are being dishonest, the concept of information checking was introduced by Rabin andBen-Or [RB]. Their work expands the secret sharing protocol so that, when more than half theplayers are honest and there are appropriate communication channels, any multiparty computationcan be performed by the honest parties.

Another problem may be that of a corrupt dealer. That is, the dealer may be distributing sharessi,...,sn to the players so that when players ii,..., ik put their shares together, they get thesecret s, but when players ji,... ,jk put their shares together, they get the secret s' ^ s. A dealeris honest if and only if the secret reconstructed by any combination of k players is the same. In thiscase, we say that the players' shares are consistent.

To address this second problem, the concept of Verifiable Secret Sharing was introduced byChor, Goldwasser, Micali and Awerbuch [CGMA]. In a Verifiable Secret Sharing scheme, thedealer can broadcast some information, revealing as little information as possible about the sharesso that the players can verify that their shares are consistent/ A particularly elegant scheme fordoing this was introduced by Feldman [Fe]. In this scheme, the dealer takes a cyclic group G withpublicly known generator g, such that obtaining the value of x if one knows gx is computationallyintractable. If \G\ is a prime p, all shares in this scheme are in Fp. If the dealer uses the polynomialf(x) = s + cix H (- Ck-ixk~x mod p, she can post g,gs,gCl,..., gCk~l, gfih\- - -, gf(in)on a bulletin board. This way, every player with share f(ij) can check that gf(i^ as computed by them is equal to the posted gf^ and all players check that the posted #/(ij) equals

9.3 Two-Party Protocols and ApplicationsThese secret sharing and multi-party computation protocols lead to important applications. A toyexample is the salary problem, in which n people learn their average salary without revealinganything about their own salaries except what is learned from knowing the sum of all the salaries.

2 In practice, such groups are constructed by taking primes p, q such that q = 2p + 1 and taking G =Sq(Zq ), the group of all non-zero squares in the finite field of order q. It is conjectured that there are aninfinite number of primes of the form q = 2p + 1 where p is prime. Such primes are called Sophie Germainprimes.

Page 87: The Harvard College Mathematics · PDF fileThe Harvard College Mathematics Review is produced and ... and in summer programs ... This new step in The HCMR's maturation is truly impressive

P a b l o A z a r — S e c r e t S h a r i n g a n d A p p l i c a t i o n s 8 5

Many applications consider computations with only two parties. Since the multi-party protocolrelying on secret sharing needs more than half the players to be honest, it does not apply to two-party computation. Some of these applications (described below), rely on a two-party primitivecalled Oblivious Transfer [Ra, NP], which was introduced by Michael Rabin in 1981.3 In theOblivious Transfer protocol, there is one sender and one receiver. The sender has N messages,and the receiver chooses one of them. The protocol is designed so that the sender does not knowwhich message was chosen, and the receiver does not learn anything about any of the other N-lmessages.

One important example is private querying of databases. Say Alice has a large database,which Bob pays to use on a per-query basis. However, Bob does not want Alice to know whathe is querying. Furthermore, since Alice derives her profit from Bob's queries, she does not wantanything revealed to Bob except the results of his query.

Another application is privacy preserving data mining. Suppose that two rival companieshave datasets Di and D2, on which they want to perform data mining. However, they want toreveal as little as possible about their proprietary data to their rival. Lindell and Pinkas [LP] suggest such a protocol, showing that one can get aggregate data about Di U D2 revealing as little^formation as possible about Di or D2 individually. An important lesson from their work is thatiheoretical protocols may not be the most efficient and that they may need to be modified to accomodate resource constraints. When the databases in question are large, one may want to minimizecommunication between the parties so as to limit the amount of bandwidth used. The reader interested in the practical applications of multi-party computations is encouraged to look at the reportby Du and Atallah [DA], where many interesting problems—including these—are presented.

Shamir's original secret sharing scheme is both simple and applicable. While the generalizations and applications depend on some difficult concepts, the basic secret sharing scheme reliessolely on linear algebra in finite fields. It is yet another example demonstrating that mathematicscan be modern, elegant, and useful.

AcknowledgementsThe vast majority of the information presented here was demonstrated to me in some form byProfessor Michael O. Rabin of Harvard University. I would also like to thank Shrenik Shah, for hisextensive edits. The exposition of Shamir's secret sharing scheme owes much to his guidance.

References[CGMA] Benny Chor, Shafi Goldwasser, Silvio Micali, Baruch Awerbuch: Verifiable Secret Shar

ing in the Presence of Faults, Proc. 26th IEEE Symp. on Foundations of Computer Science (1985).

[DA] Wenliang Du, Mikhail J. Atallah: Secure Multi-Problem Computation Problems andTheir Applications: A Review and Open Problems, CERIAS Tech Report 2001-51(2001).

[Fe] Paul Feldman: A practical scheme for non-interactive verifiable secret sharing. Proceedings of the 28th IEEE Symposium on the Foundations of Computer Science (1987).

[LP] Yehuda Lindell, Benny Pinkas: Privacy Preserving Data Mining. J. Cryptology 15(2002), 177-206.

[NP] Moni Naor, Benny Pinkas: Oblivious transfer and polynomial evaluation, Proceedingsof the thirty-first annual ACM symposium on Theory of computing (1999).

[Ra] Michael O. Rabin: How to exchange Secrets with Oblivious Transfer, Technical ReportTR-81 Harvard University: Aiken Computation Lab (1981).

3The version I am presenting is given by Naor and Pinkas [NP]

Page 88: The Harvard College Mathematics · PDF fileThe Harvard College Mathematics Review is produced and ... and in summer programs ... This new step in The HCMR's maturation is truly impressive

8 6 T h e H a r v a r d C o l l e g e M a t h e m a t i c s R e v i e w 2 . 1

[RB] Tal Rabin, Michael Ben-Or: Verifiable secret sharing and multiparty protocols with honest majority, Proceedings of the twenty-first annual ACM symposium on Theory of computing (1989).

[Sh] Adi Shamir: How to share a secret, Communications of the ACM (1979).

[Ya] Andrew C. Yao: Protocols for secure computations, Proceedings of the 23rd AnnualIEEE Symposium on Foundations of Computer Science (1982), 160-164.

You've spent countlesshours solving linear anddifferential equations.

TEACHFOR solvingour nation's greatestinjustice.

Visitwww.teachforamerica.org/math__science_engineering to learn more.

T E A C H A M E R I C AAlt academic majors.Full salary and benefits.

Page 89: The Harvard College Mathematics · PDF fileThe Harvard College Mathematics Review is produced and ... and in summer programs ... This new step in The HCMR's maturation is truly impressive

FEATURE10

MY FAVORITE PROBLEM

Linear Independence of RadicalsIurie Boreico*

Harvard University ' 11Cambridge, MA 02138

bore [email protected] .edu

The problem I intend to discuss here was mentioned in the prior HCMR—in particular, authorZachary Abel [Ab, p. 79] stated that "the set {v^ | n G N is squarefree} is linearly independentover rationals." More formally:Problem. Letni,n2,... ,nk be distinct squarefree integers. Show that ifai,a2,...,ak £ Z arenot all zero, then the sum S = a\ y/h~[ 4- a2 yjn^ 4- ... 4- ak y^fc is non-zero.

Note that this problem is equivalent to Abel's statement, since we may clear denominators toobtain coefficients in Z.

10.1 Preliminary AnalysisBy setting Ai = a2m, the problem can be restated as follows: if

fc^ ± ^ = 0 ,

then at least one of the expressions Ai/Aj must be a perfect square. Indeed, in our case none ofthe expressions Ai/Aj = (ai/a3)2ni/n3- is a perfect square, so the sum

fcaiyjnl, + a2sjn2~4- - - - + aky/nk = ^2±y/A~i

i = l

must not be zero. The converse follows similarly.In this form, the problem can be tackled for small values of k by simply squaring. For example,

if k = 2, we have \fAl - \fM = 0, so y/M = ^/M- Squaring gives Ai = A2 and thusAi/A2 = 1, which is a perfect square. If k = 3 we have WLOG that \fM = ±\fM 4- \fM,and so again squaring gives A\ = A2 + A3 ± 2\/A2A3. Hence \fMM = ±(A2 + A3 - Ai)/2,which implies A2A3 = (A2 +■ A3 - Ai)2/4 is a perfect square, and so

A 2 A 2 A 3A 3 / 1 3

i2A3 = (A2 + A3-AiyyA 2 V 2 ^ 3 )

For k = 4 we may rewrite the problem as ±\fM±y/M = ± \fM ± \fM- Then by squaring wehave A\ 4- A2 - A3 - A4 ± 2y/AxA2 ± 2sJ'A3A4 = 0 and we handle this using the previouslyestablished case k — 3.

t Iurie Boreico, Harvard ' 11, is a prospective mathematics concentrator residing in Weld. His mathematicalknowledge is yet too vague to define his interests, but they tend towards number theory. When not doing math,he usually misses his home country, Moldova, or wastes his time in some other way.

87

Page 90: The Harvard College Mathematics · PDF fileThe Harvard College Mathematics Review is produced and ... and in summer programs ... This new step in The HCMR's maturation is truly impressive

8 8 T h e H a r v a r d C o l l e g e M a t h e m a t i c s R e v i e w 2 . 1

Unfortunately, this approach does not extend to k > 4, for however we rearrange the givenexpressions, squaring only increases the number of radicals. In fact, as an olympiad-style problem,this problem is very hard, and probably very few would be successful in solving it. With enoughpatience and creativity, however, several solutions are possible.

10.2 SolutionsSolutionlfrom[Kv]. Letpi,p2,...,pN be all the primes dividing mn2--nk- We will provethe following statement by induction on AT:

Recall that S = aiy/n~[ 4- a2y/n2~ + ... 4- aky/h~k. Then there exists an expression S' =h y/m\ + b2y/rfi2 + ... + bi y/rrfi where rai, m2,..., mi are squarefree integers with prime factorsamong the pi,p2,... ,pN and 6* are integers, such that SS' is a non-zero integer (in particular,S 0, as desired).

For N = 0 this is obvious, as in this case k = 1, m = 1 and we get S = ax ^ 0 so we can setS' = 1. For N = 1 we either have S = aiy/pl, in which case we may let S' = y/pl, or we haveS = aiy/]*[ + a2. In the last case we may take S' = -aiy/pl + a2, so SS' = a2 - a\pi. This isnon-zero as a\ is divisible by an even power of pi, whereas a\pi is divisible by an odd power ofpi, so the two cannot be equal.

Now we perform the induction step. Assume that the theorem is true for N < ra; we prove itfor N = ra 4- 1. Let pN = pn+1 = p. We may write S = Si + S2y/p where the primes appearingin the radicals in 5i, S2 are among px,..., pn, and further, neither S\ nor S2 is identically 0 (elsewe would already be done, as p would be irrelevant). So there exists a sum S2 of the form given inthe claim such that S2S2 is a non-zero integer k.

The intermediate product SS2 can then be written as S4 + ky/p where the primes appearing inthe radicals in S4 are also among pi,...,pn.We may thus multiply it by S4 - ky/p to get S2 - k2p.Finally, it is easy to see that all prime factors of radicals of S2 - k2p are among pi,... ,pn, so,assuming this number is not itself zero, the induction hypothesis implies that there exists a nonzero weighted sum of radicals S5 whose prime factors appear among pi,p2,...,pn such that(Si - k2p)S5 is a non-zero integer. So we obtain the desired representation SS2(S4 - ky/p)S5 eZ \ {0} where Sf = S2(S4 - ky/p)S5 is a sum of radicals of the desired type.

Thus, we are done if we manage to prove we do not run into trouble when multiplying £4 -ky/p, as the product could become zero at that step (e.g. if S4 - ky/p = 0). It is sufficient toprove that S2 - k2p ^ 0. If S4 is an integer this is clear, and if S4 is of form u^/q this also truebecause u2q ^ k2p, as p does not divide q. Otherwise, S4 contains at least two distinct radicalsin its canonical expression (if we consider y/l as a radical),1 and we can assume without loss ofgenerality that pn appears in one of these two radicals but not in the other. So S4 = S6 + S7y/p^where S6, S7 are sums of radicals with prime factors among pi,p2,... ,pn-i, and S6,S7 ^ 0.Therefore S24 - k2p = S% 4- 2S6S7y^+ S^pn - k2p. As S6S7 0, by expanding the expressionS4 - k2p we will get at least one radical containing pn. But then by the induction hypothesis, thee x p r e s s i o n i s n o n - z e r o a s c l a i m e d . □

This solution, even if it might seem somewhat unnatural and tedious, is completely logical inits construction. By starting from the well-known idea of multiplication by a conjugate (the caseN = 1 above), the idea is to actually produce a sort of "conjugate" expression for more complicatedsums involving radicals, i.e. something involving the same radicals which when multiplied by theoriginal produces an integer. The (somewhat unappealing) induction step is just a set of technicalmanipulations that help realize this idea.

!By the canonical representation of an expression involving radicals we mean its simplest possible form—that is, the form obtained by extracting the squares out of the radicals and grouping together the terms whichhave the same square-free numbers under the radicals. For example, (\/2 4 \/l0)2 = 24 2v 2 • \7l0 4 10would be brought to 2 4 4\/5 410 = 1244\/5. The statement of the problem is just the fact that the canonicalrepresentation is indeed "canonical," that is, the same number can not be written as such a sum in two differentways (otherwise subtracting the two expressions would produce a counterexample).

Page 91: The Harvard College Mathematics · PDF fileThe Harvard College Mathematics Review is produced and ... and in summer programs ... This new step in The HCMR's maturation is truly impressive

I u r i e B o r e i c o — L i n e a r I n d e p e n d e n c e o f R a d i c a l s 8 9

If one takes as a starting point the mentioned idea of conjugate expressions, one might considerthe following question:Question. What is the expression conjugate to y/ai 4- y/ai + ... 4- y/a~n~?

We know that for ra = 2 the conjugate is y/a~[ - y/ai. Of course we could have chosen someother combination of signs, like -y/a[ - y/ai or -y/a[ 4- y/ai, but we do not get anything newfrom them, as these two expression are just the original ones with the opposite sign. Given thisexample, we might thnk that the expression y/al 4- y/ai 4- • • • + y/aZ has many conjugates, andthat they represent all expressions of form ±y/a{± y/ai±... ± y/aZ for all combinations of plusesand minuses. Again, we need to ensure that the same expression does not occur twice, the secondtime with opposite sign, which can be realized by requiring that the sign of y/a~[ is always positive.We get a family of 2n~ * alike sums: y/al ± y/ai ± y/ai ±...± y/aZ. This might suggest that theproduct of this entire family could in fact be the required non-zero integer we sought in Solution 1,but unfortunately while it is indeed possible that this product is an integer, there is no obvious wayto handle this huge expression directly and prove that it is non-zero.

These considerations inspire the next solution:Solution 2. Consider the linear expression L(xi, x2,..., xn) = aixi 4- a2x2 4-... 4- anxn. Wewill also consider its conjugate expressions of form L'(x\, x2,..., xn) = a\x\± a2x2 ± a3x3 ±...±anxn- There are 2n_1 such expressions. Now take a variable T and consider the polynomial

FL,x1,x2,...,xn (T) = J3 (T - L'(xi,x2,..., xn)) = JJ (T - aixi ± a2x2 ± ... ± anxn),v

where the product is taken over all conjugate expressions L' (including L).Note that FL,Xl,X2,...,Xn (T) is written as a polynomial in T, but can be considered as a poly

nomial in xi, x2,..., xn - Also note that changing the signs of any of x2, x3,..., xn will not affectF because doing so only permutes the set {L'}. Therefore

i?L,xi,x2,...,xn(71) = Fl,x1,±x2,±x3,---,±Xti(T)-

In particular, if we expand the product representation of F into a sum of monomials, eachmonomial term will contain only even powers of xk (k = 2,..., ra), because otherwise changingthe sign of xk would change the sign of the monomial. Note that this is not true for xi, as doing sosends the set {Z/} to the set {-£'}. But by expanding F into a sum of monomials and groupingthe monomials with odd and even powers of xi we can write

FL,x1,x2,...,xn(T) = XiP(x2i,X2,X3, . . . ,Xn,T) + Q(x2i,X2,X3, . . . ,Xn,T).

As we have seen above, P and Q do involve only monomials with even powers of x2, x3,..., xn,and so they depend only on x\, x\,..., x2n. So we can actually write

Fl,x1,x2>...,*„(T)=^

It is also clear that if ai are integers then all the coefficients of P and Q will be integers.Now let us return to the problem. We actually prove a different version of it: that is, that no

non-zero integer M can be represented as a nontrivial canonical sum of radicals. To see that thisimplies the original problem, assume that ££=1 aiv/ra7 = 0. Then, by multiplying by y/h~k~, weget YllZl aiy/nmk~ = -aknk, which is a contradiction if we prove that no non-zero integer canbe represented as a canonical sum of radicals. So let us prove this version, by induction on k. Thebase case, k = 1, is clear.

If we assume that an expression of form a iy/m 4- a2y/ni+.. .4- aky/rtk equals M £ Z\{0},then the polynomial FL,vArr,v^,...,v^r(^) would vanish at T = M. But we saw the polynomialcan be written simply as

y/n{P2(rii,n2,... ,nk,T) + Q2(ni,n2,... ,nk,T),

Page 92: The Harvard College Mathematics · PDF fileThe Harvard College Mathematics Review is produced and ... and in summer programs ... This new step in The HCMR's maturation is truly impressive

9 0 T h e H a r v a r d C o l l e g e M a t h e m a t i c s R e v i e w 2 . 1

so wenkso we have

we would have y/mP2(m,n2,... ,nfc, M) 4- Q2(m,n2, - - - ,nk, M) = 0. But P2(ni,...,, T) and Q2 (m,..., nk, T) are integers. By the base case, A+By/n~{ = 0 implies A = B = 0,we have

P(m,n2,...,nk,M) = Q(m,n2,... ,nk,M) = 0.Hence,

-y/mP(ni,n2,...,rafc, M) 4- Q(ni,n2,...,rafc, M) = 0,/.*. FM,-ynT,v^2-,...,v^(M) = 0. Thus,

J}(M 4- ai VraT ± a2 v^ ± a3y/h~i ±...±ak y/rtk) = 0,

and so M = —aiy/h~i± a2y/h~i ± ... ± aky/n~k for some combination of signs. However, wealready have M = aiy/n~[ 4- a2y/h~i + ... 4- aky/n~k, and summing these two equalities gives

2M = (o2 ± a2)y/ni 4- (o3 ± a3) y^ 4-.. • 4- (ak ± ak)y/nZ.

This cannot happen by the induction hypothesis, and we have reached our contradiction. □

10.3 Further IdeasNow I will explain why I like this problem. The essential reason is that the solutions hint at manyimportant concepts in algebra and number theory, let us talk about some of them:The primitive element theorem. The primitive element theorem states that any finite separablefield extension L/K contains a primitive element, i.e. an element that generates the whole extension. This problem allows us to explicitly find a primitive element (in fact many of them) for theextension Q[y/pi, y/pi, ---, y/Pk}/Q, where pi,p2,... ,Pk are distinct primes.

The field Q[y/pl, y/pi,..., y/p£\ consists of all combinations £ a< y/^i where ca G Q andni,n2,...,n2k are all possible products that can be formed with pi,p2,... ,Pk- As we haveproven that y/nl, y/ni,..., y/hZ are independent over Q, it follows that the degree of the extensionover Q is 2k. Thus to find a primitive element means to find an element 6 in Q[y/pi, ■.., y/Pk]which is a root of an irreducible polynomial of degree 2k. We claim our friend y/pi + y/pi+... 4-y/Pk (or any of its conjugates) can be taken as 6.

Assume P G Q[X] and P(6) = 0, with P irreducible over Q. We can expand each powerof 0 in P(6) and write it as a sum of radicals, and then combine these radicals to obtain P(6) =ai y/n~[ 4- a2y/h~i 4-... 4- a2k y/n~k- The results proven above tell us that P(0) = 0 if and only ifall ai are 0. Let us take now a conjugate of 0, say 0' = eiyfpi 4- t2y/pi + - - - 4- eky/pk wherec» G {-1,1}. We claim P(0') = 0. Indeed, if we expand <9'\ the coefficient of y/nl will eitherbe the same as the coefficient of y/n~i in 0k, or it will be the additive inverse of that coefficient,depending on how many of pj with e3 = -1 divide y/nl. We thus get P(0') = £\ anny/hlwhere pi = UPj\ni eJ- As a11 ai are zero» we get P(9') = 0. Hence P has as roots all theconjugates of ±6, of which there are 2k,soP has degree at least 2k. In fact it must have degree 2kbecause 6 lies in an extension of degree 2fc, so 6 is indeed a primitive element.

It is also clear now that P(X) = neie{-i,i}(X ~ ^yfvl-^y/pi- ---- eky/pk~). The factthat this polynomial has rational coefficients follows from the fact that

P(X) = FL.^Vpa."..^ W * FL-Vin,Vp5,-,VPk-(X)

(we keep the notations of Solution 2) and this has integer coefficients from Solution 2.The degree of extensions of radicals. We noted above that [Q(y/pT, y/pi,..., y/pk) : Q] = 2kwhen pi,p2,...,pk are distinct primes. It would be interesting to show that [Qfy^T, y/pi,...,y/Pk) : Q] = 2h with a constraint weaker than that pi be primes. The degree of this extension istrivially at most 2fc, but it may be less than that. For example we might have y/pi G Q(y/pi, y/pi)if y/pi = my/prpi, where m G Q, in which case adjoining p3 would not alter the extension. We

Page 93: The Harvard College Mathematics · PDF fileThe Harvard College Mathematics Review is produced and ... and in summer programs ... This new step in The HCMR's maturation is truly impressive

I u r i e B o r e i c o — L i n e a r I n d e p e n d e n c e o f R a d i c a l s 9 1

will prove that these are exactly the "uncomfortable" cases. Namely, let us take from pi, p2,..., Pka maximal sequence pi,...,pi which is multiplicatively independent, by which we mean that theproduct of any nonempty subset of elements in the sequence is not a perfect square. More explicitly,pi,p2,...,pi is multiplicatively independent, but for any j > I, there exist 1 < h,.. .,ir < Isuch that PjPixPi2 - • • Pir = a2 is a perfect square. This means

y /P j = y /PhP i2 - 'P i r € Q(y / l> l , y /p i , • • • , y /p i ) ,P i i P i 2 - - P i r

and so Q(y/pl, y/pi,---, y/pi, y/pj) = Q(y/pl, y/pi, • • •, y/pi). It is easy to see that [Q(y/pl,y/pi, • • •, y/pi) : Q] = 2', by arguments similar to those in previous paragraph, as the numbersPi, P2 •. • Pi have distinct squarefree parts and so are linearly independent over Q. (If they did not,they could be multiplied to obtain perfect squares). Also, as above, y/pl + y/^i 4- ... 4- y/pi isa primitive element of the extension, so the primitive element theorem is verified explicitly in thiscase.

Note that y/pl, y/pi,..., y/pk with the operation of multiplication (and division) generate anabelian group A. Let Qx be the multiplicative group (Q, •)• Tne grouP G = AQX satisfies[G : Qx] = 2l, with y/nl, yfh~i,..., y/h~i forming a complete set of representatives for G/Qx,where the m are all the possible 2l products IIi€Jc{i,2,...,i} P<- (The Quotient G/®X is generatedby the images of y/nl, yfh~i,..., y/hl and is isomorphic to (Z/2Z)'). Therefore the result obtainedin this section can be rewritten as [Q(y/pl, y/pi, ---, y/pk) : Q] = [G : Qx]. In this abstract form,the result is easier to generalize.Galois groups. The freedom with which one interchanged signs in front of radicals may suggestin fact that there is no visible difference between y/p and -y/p, and they can be interchanged inexpressions when one is concerned with rationals. This idea leads to the Galois groups. Indeed,ifpi,p2,---,pi are multiplicatively independent then in any expression F(y/pl, y/pi,..., y/pk)with F G Q[Xi ,X2,---,Xk] one may change the signs to get F(±y/pl, ±y/^i, ---, ±y/Pk), andthe new expression will be conjugate to the original. In particular, it will equal 0 if and only if theoriginal equals 0. We thus have 2l isomorphisms of Q(y/pl, y/pi, y/pi, • • •, y/pi) for any choicesof signs €1, e2, - - -, ei G {-1,1}, characterized by sending y/pi to eiy/pl. As the extension isnormal (since Q(y/pl, y/pi, ---, y/pi) is the splitting field of (X2 - pi)(X2 - p2)... (X2 - pi))and has degree 2l, we have found the Galois group of the extension: (Z/2Z)'.Higher powers. The natural question is whether the statement of the problem can be extendedto radicals of any degree. Specifically, we prove that if ai,a2,... ,an,bi,b2,... ,bn G Q+ andky/bl are not all rational, then £"=1 ai ky/bl is not rational. The solution is a generalization of

Solution 2. Firstly, we may assume all the ki equal, as otherwise we can replace them by theirleast common multiple and adjust the bi accordingly. So we need to prove that a sum of formYH=\ ai $* = M eZ cannot occur if at least one of the bi is not a perfect k-lh power. Again,we use induction on ra.

Consider £ a primitive k-ih root of unity. Take the polynomial

P(X, b) = Yl(x-b- C2a2 Vol-...- Clan VK)

with ai rational, where the product is taken over all choices of i2,i3,... ,in G {0,1,..., k — 1}.As replacing y/bl by £ y/bi in the above expression preserves P, we conclude that P can be writtenas a polynomial in X and b with coefficients in Q[b2,b3,...,bn) = Q (we did not argue thisrigorously since it is completely similar to the argument used in the original problem). Now ifAf G Q can be written as £"=1 a{ y/bl then P(M, ai y/bl) = 0. Let d \ k, d > 1 be the smallestinteger such that y/bf G Q; then P(M, x) can be written as

qo(xd) + xqi(xd) + ... 4- xd~Xqd-i(xd)

Page 94: The Harvard College Mathematics · PDF fileThe Harvard College Mathematics Review is produced and ... and in summer programs ... This new step in The HCMR's maturation is truly impressive

9 2 T h e H a r v a r d C o l l e g e M a t h e m a t i c s R e v i e w 2 . 1

where q0, qi,..., qd-i G Q[X]. So if (a\ y/bl)d = ueQwe have

qo(u) + qi(u)y/u + ... 4- qd-i(u) \Zud~l = 0.

Now note that 1, ^u,..., y/ud~l are independent over Q, because fyu is the root of the irreducible polynomial Xd - u. (To see that this polynomial is irreducible, note that the roots ofXd - u have absolute value $/\u\, so if f(x) \ Xd - u has degree ra, then |/(0)| = y/\u\m.But for 0 < ra < d this is not rational, for if it were then (a\ y/bi)m = ± y/\u\m would berational, contradicting the minimality of d. So Xd - u does not have proper factors in Q[X] and isirreducible.) Therefore we conclude that q0(u) = qi(u) = q2(u) = ... = qd-i(u) = 0. If e is aprimitive d-th root of unity we may conclude that

P(M, eu) = qo(u) + qi(u)e^+ ... 4- ^-lW^'1 V/^ri = 0,

soM = 6^u + Y/l=2 £lia* y/bl for some {/*}. But then

n n€ ^ 5 + 5 ^ ^ 0 * ^ = v ^ 4 - ^ a i ^ .

i = 2 i = 2

This is impossible as each of the terms in the left-hand side has real part less than or equal to thecorresponding term of the right-hand side (which a positive real of the same absolute value), andthe inequality is strict for the first term.

References[Ab] Zachary Abel: My Favorite Problem: Bert and Ernie, The Harvard College Mathematics

Review 1 #2 (2007), 78-83.

[Kv] Irrationality of a Sum of Radicals (in Russian), Kvant 2 (1972).

Page 95: The Harvard College Mathematics · PDF fileThe Harvard College Mathematics Review is produced and ... and in summer programs ... This new step in The HCMR's maturation is truly impressive

FEATURE11

ProblemsThe HCMR welcomes submissions of original problems in any fields of mathematics, as well as solutions to previously proposed problems. Proposers shoulddirect problems to [email protected] or to the address onthe inside front cover. A complete solution or a detailed sketch of the solutionshould be included, if known. Solutions to previous problems should be directed [email protected] or to the address on the inside front cover.Solutions should include the problem reference number, as well as the solver's name,contact information, and affiliated institution. Additional information, such as generalizations or relevant bibliographical references, is also welcome. Correct solutions willbe acknowledged in future issues, and the most outstanding solutions received will bepublished. To be considered for publication, solutions to the problems below should bepostmarked no later than September 15, 2008. An asterisk beside a problem or part of aproblem indicates that no solution is currently available.

S08 - 1. It is known that there are 6670903752021072936960 square matrices M of order 9 withentries in {1,..., 9} that show valid sudoku grids.1 How many of them have the property that thesymmetric matrix M 4- MMs positive definite?

Proposed by Noam D. Elkies (Harvard University).

S08 - 2. Professor Perplex is at it again! This time, he has gathered his ra > 0 combinatorialelectrical engineering students and proposed:

"I have prepared a collection of r > 0 identical rooms, each of which is empty exceptfor s > 0 switches. You will be let into the rooms at random, in such a fashion thatno two students occupy the same room at the same time and every student will visiteach room arbitrarily many times. Once one of you is inside a room, he or she maytoggle any of the s switches before leaving. This process will continue until somestudent chooses to assert that all the students have visited all the rooms at least v > 0times each. If this student is right, then there will be no final exam this semester.Otherwise, I will assign a week-long final exam on the history of the light switch."

What is the minimal value of s (as a function of ra, r, and v) for which the students canguarantee that they will not have to take an exam?

Proposed by Scott D. Kominers '09, Paul Kominers (Walt Whitman HS '08), and Justin Chen(Caltech '09).

S08 - 3. Let k > 1 be a natural number. Find all integer solutions to the diophantine equation

a.2*+l+a.** + . 2 , . 1 2 f c + l■ 4-z 4-a; 4- 1 = 2/

Proposed by Ovidiu Furdui (University of Toledo).

!The proposer points out that this calculation is detailed in Bertram Felgenhauer and Frazer Jarvis:Enumerating possible Sudoku grids (2005), http: / /www. af jarvis . staff. shef. ac. uk/sudoku/sudoku.pdf, athough it was independently computed by user "QSCGZ" on the rec.puzzle Google group,thread "combinatorial question on 9x9," 21 Sep. 2003.

93

Page 96: The Harvard College Mathematics · PDF fileThe Harvard College Mathematics Review is produced and ... and in summer programs ... This new step in The HCMR's maturation is truly impressive

9 4 T h e H a r v a r d C o l l e g e M a t h e m a t i c s R e v i e w 2 . 1

S08 - 4. Consider a, b, c three arbitrary positive real numbers. Prove that

b + cEV^ 'E 1 + (g + fr)(o + c)(c + g)- Sabcb + c ) Y 4 ^ c a ( o 4 6 J ( a 4 c )

Proposed by Cosmin Pohoata (Bucharest, Romania).

S08 - 5. Let ABC be a non-isosceles triangle with ZA = 60°. Let H be its orthocenter and /its incenter. Let Bi and d the points such that the equilateral triangles ABd and ABiC intersectthe interior of ABC. Define Be and Ce similarly, so that ABCe and ABeC are equilateral anddisjoint from the interior of ABC.

Show that the lines through HI, Bid and BeCe do not concur, and that the triangle they formis isosceles.

Proposed by Daniel Campos Salas (Costa Rica).

The following problem from the Fall 2007 issue received no submissions. Since thisproblem defied solution, we are rereleasing it for one more issue.

F07 - 5. For i = 1,..., ra, let /» : (Z/raZ U {*})n -▶ (Z/raZ U {*})n be given by

fi((xi,...,xn)) = <

( * , x 2 - \ - l , x 3 , . . . , x n ) i = l a n d # i = 1 ,(xi,...,Xi-i 4- 1,*,Xi+i 4-1,.. -,xn) 1 < i < nmdxi = 1,( x i , . . . , x n - 2 , x n - i 4 - 1 , * ) i = r a a n d ^ n = 1 ,( x i , . . . , x n ) o t h e r w i s e ,

where • 4-1 = •. Find necessary and sufficient conditions on (xi,..., xn) G (Z/raZ)n such thatthere exists a sequence {ik}k=i for which

/*«(•' • CMtei, • • .,xn)))) = (•,... ,•).

Proposed by Paul Kominers (Walt Whitman HS '08), Scott D. Kominers '09, andZachary Abel' 10.

Page 97: The Harvard College Mathematics · PDF fileThe Harvard College Mathematics Review is produced and ... and in summer programs ... This new step in The HCMR's maturation is truly impressive

FEATURE12

Solutions

Projective Paranoia

S07 - 3. The incircle ft abc of a triangle ABC is tangent to BC, CA, AB at P, Q, R respectively.Rays PQ and BA intersect at M, rays PR and CA intersect at N, and the incircle ftMNP oftriangle MNP is tangent to MN and NP at X and Y respectively. Given that X, Y and B arecollinear, prove:

(a) Circles ft abc and CImnp are congruent, and

(b) these circles intersect each other in 60° arcs.

Proposed by Zachary Abel '10.

Solution by the proposer. Let cr(Ai, A2; A3,A4) denote the cross ratio ^ffi^* of four collinear points A\, A2, A3, A4. Let p and r denote the polar maps through circles ft abc and Qmnprespectively, and let J and J be the respective centers of these two circles.

Let Qmnp touch MP at Z, and define MN n BC = S. We first show that X, Z, and C arecollinear. As p(S) = AF, it follows that cr(P, Q; S, RQnAB) = -1, and hence by perspectivitythrough A, cr(B, C; S, P) = -1. An identical argument proves that cr(N, M, MN n YZ, X) =-1. Letting C = XZ n BC, we may calculate

cr(B, C'; S, P) = cr(Y, Z; MAT n YZ, XP n YZ) = cr(N, M; MAT n YZ, X) = -1

where the notation = indicates that equality follows by a perspectivity about point J. Sincecr(P, C; 5, P) = cr(P, C; S, P), it follows that C = C, as claimed.

By Pascal's theorem on hexagon BXCNPM, we find that Y, A, and Z are collinear. Then bythe converse of Brianchon's theorem on hexagon RYNMZQ, there must be some conic tangentto line RN at Y, tangent to QM at Z, and tangent to lines MN and QR. Such a conic is uniquelydetermined by the first three of these four facts and must therefore coincide with circle Qmnp,hence QR is tangent to this circle. Label this point of tangency T. By well-known propertiesof circumscribed quadrilaterals, lines XY, MR, and ZT are concurrent, i.e. T lies on line ZB.Likewise, T is collinear with Y and C.

We may apply similar arguments to circle ft abc- Letting U = CTnAB and V = BTnAC,the converse of Brianchon's Theorem proves that UV is tangent to circle ft abc at some point W.Also as above, RW must pass through Z and QW passes through Y.

Note thatp<uc) = p(U) n p(C) = rwhpq = z

and likewise p(VB) = Y, so p(T) = p(UC n V£) = YZ. In particular, JT _L YZ, and hence/T || PJ. Similarly, r(A) is the join of r(NQ) = XYC\TZ = B and r(MR) - XZnYT = C,hence J A _L BC. Let line JA meet BC and the top of circle ftMNP at D and £ respectively.

By Pascal's theorem on hexagon PPQQRR in circle ftpQR, point S lies on line QP. AsZ.JTS = ZJDS = 90°, quadrilateral JTDS is cyclic. Using a, /3, and 7 for the angles ofA ABC, we may calculate

ZJET = \ZDJT = \ZDST = \ (180° - ZRQP - ZQPS) = 45° - f - \

and, with YZ n PC = P,ZPPZ = ZCPZ - ZPZP = (90° - I) - (45° + f) = ZJPT.

95

Page 98: The Harvard College Mathematics · PDF fileThe Harvard College Mathematics Review is produced and ... and in summer programs ... This new step in The HCMR's maturation is truly impressive

96 The Harvard College Mathematics Review 2.1

This is enough to conclude that ET _L YZ, i.e. that ETI are collinear and furthermore lie on aline parallel to PJ. Thus, JEIP is a parallelogram, and as long as it is not degenerate, we mayconclude that \JE\ = \IP\, i.e. that Qabc = ftpQR.

If it is degenerate, then triangle ABC must be isosceles with A at its vertex. In this case, lineMN is parallel to BC, so from XNY ~ BPY we find that \BP\ = \PY\. And since BI _L PYand BP 1 PJ, it follows that right triangles BPI and PY J are congruent. Thus \PI\ = \YJ\,and again the two circles are congruent.

Let r be the common radius length. To prove part (b), it suffices to show that d = \IJ\ = ry/3.Consider the inversion i through Qmnp- The points i(P), t(Q), and t(R) correspond to themidpoints of ZY,YT, and TZ respectively, so i(ftABc) is the nine-point-circle of triangle YTZ.In particular, the radius of t(ft abc) is half the radius of Qmnp, ie., \.

Let IJ intersect ft abc at H and K, so that \JH\ = d - r and \JK\ = d 4- r. Then\l(HK)\ = \l(JH)\ - \t(JK)\ = £ - £-r. But l(HK) is a diameter of t(flABc), andtherefore has length r. This indeed gives d — ry/3, as needed for part (b). □

Euler and Napoleon

F07 - 1. Consider A.4PC an arbitrary triangle and P a point in its plane. Let D, E, and Fbe three points on the lines through P perpendicular to the lines BC, CA, and AB, respectively.Prove that if ADEF is equilateral and if P lies on the Euler line of A,4PC, then the center ofADEF also lies on the Euler line of A ABC.

Proposed by Cosmin Pohoata (Bucharest, Romania) and Darij Grinberg (Germany).Solution by Yasuhide Minoda (Tetsu Ryoku-Kai, Japan). Let O be the circumcenter of A.4PC.By parallel translation, it suffices to consider the case P = 0 (see Figure 12.1(a)).

Let A', B' , C' be centers of equilateral triangles outside of A,4PC, drawn on sides PC,CA, AB, respectively. It is well known that AA/B'C is equilateral (the outer Napoleon triangle).[Editor's note: if AABC and ADEF have opposite orientation, we should instead take A', B',and C as the centers of equilateral triangles drawn inward on the three sides of A,4PC, so thatAAlB'C' is the inner Napoleon triangle, which is also equilateral.]Lemma 1. AA'B'C' and ADEF are similar with respect to O. In particular, the center ofAA'B'C', the center of ADEF and O are collinear.

(a) Without loss, we may assume P = O.

Figure 12.1: Figures for Problem F07 - 1.

(b) The centroids of AABC andAA'B'C coincide.

Page 99: The Harvard College Mathematics · PDF fileThe Harvard College Mathematics Review is produced and ... and in summer programs ... This new step in The HCMR's maturation is truly impressive

Z a c h a r y A b e l , e d . — S o l u t i o n s 9 7

Proof. Without loss of generality, we can assume ZBAC ^ 27r/3. Scale up or down ADEFwith respect to O to form AD'E'F' so that D' coincides with A'. We want to show E' — B' andF' = C.

Rotate line OB' by n/3 [or -tt/3 depending on orientation] around A' and denote it by £. B'and E' are moved on I by this rotation. On the other hand, B' and E' are moved to C and E',respectively, and these points are both on line OC. Thus, F' and C must be the intersection ofline OC and £ (note that these lines intersect at one point because ZBAC ^ 27r/3).

Therefore, C = F' and B' = E'. It follows that AA'B'C and ADEF are similar withr e s p e c t t o O . OLemma 2. The center of AA'B'C coincides with the centroid of AABC.Proof. Let K be a point symmetric to A' with respect to line BC (see Figure 12.1(b)). TrianglesAB'KC and A,4PC are similar with a ratio 1 : ^/3. In addition, AC : AB = 1 : y/3. Thus,AC = B'K. Similarly, AB' = CK. So AC KB' is a parallelogram.

Let the center of AC KB' be M, and the midpoint of BC be N. The centroid C of AA'B'Clies on MA' and MC : G'A' = 1:2. Thus,

A ' N K A M C _ I f 2 \ 1 _ 1aO? 'am ' c7^7 " i v ly 2

So, by Menelaus' theorem, C lies on the median ~AN of AAPC. Similarly, C also lies on otherm e d i a n s o f A A B C . T h e l e m m a i s p r o v e d . □

By Lemma 1 and Lemma 2, the centroid of ADEF is on line OG, namely the Euler line ofA A B C . □Also solved by the proposers.

Dastardly Haberdashery

F07 - 2. Professor Perplex has rounded up his ra > 0 hat-game seminar students and made thefollowing ominous announcement:

"I have assigned each of you a hat according to a uniform probability distribution,which I will put on your head after allowing you time to discuss a strategy. Hatscome in h > 0 different colors, but some colors might be reused and others mightnot be used at all. Each student will be given a list of the h colors. Nobody willbe able to see his or her own hat, but everyone will have the opportunity to observeall the other hats. Then, you will all be instructed to simultaneously write downone of the colors. If any student correctly identifies the color of his or her own hat,then there will be no final exam this semester. Otherwise, I will assign a week-longhaberdashery final."

What is the probability that the students have to take a final, assuming best play?Proposed by John Hawksley (Massachusetts Institute of Technology '08) and Scott D. Kominers'09.

Solution by Charlie Pasternak (Takoma Park Middle School). A student cannot deduce any information about his own hat color from what he sees alone. This means that the probability of astudent making a correct guess, and the expected number of correct guesses over time, remainsconstant, so the students' best strategy is to spread out the correct guesses as thinly as possible to"waste" as few as possible.

This best strategy is: if ra > h, take h students and assign them a numbers {0,1,..., h - 1},and if ra < h, assign the students the numbers {0,..., ra - 1}. Then, assign each color a numberfrom 0 to h - 1. When the hats are put on, each student sums up the colors of the hats seen, thenwrites for his own hat color the color corresponding to the difference between his assigned numberand the sum of the hats he sees, modulo h.

Page 100: The Harvard College Mathematics · PDF fileThe Harvard College Mathematics Review is produced and ... and in summer programs ... This new step in The HCMR's maturation is truly impressive

9 8 T h e H a r v a r d C o l l e g e M a t h e m a t i c s R e v i e w 2 . 1

If a student's assigned number is the sum modulo h of the colors, his guess will be correct. Ifra > h, at least one person's assigned number is the sum modulo h of the colors, so at least oneguess will be right. If ra < h, then the chance that one of the students' assigned numbers is the summodulo h of the colors is , as is the chance of a correct guess.

Therefore, the chance of the students taking a final is max(0,1 - £). □Also solved by Sherry Gong '11, Arnav Tripathy' 11, Ray C. He (Massachusetts Institute of Technology '07), and the proposers.

Restricted Roots' Radii

F07 - 3. Find all integer monic polynomials f(x) such that

(i) fix) = /(l - x) and

(ii) all complex zeros of / lie in the disk \z\ < \pl.

Proposed by Vesselin Dimitrov '09.

Solution by Noam D. Elkies (Harvard University). The polynomials f0(x) = x2 - x and fi(x)= x2 - x 4-1 satisfy both conditions (the latter has roots of absolute value 1 at the primitive sixthroots of unity). Therefore so does fo(x)a° fi(x)ai for any nonnegative integers a0 and a\. Weclaim that these are the only such polynomials, indeed the only polynomials satisfying (i) whoseroots lie in \z\ < r := 1.3 (note that 21/5 < 1.15 < r).

Let / be any polynomial satisfying both conditions, and let a be a complex zero of /. Then1 - a is also a complex zero of /, so a lies in the intersection of the open discs of radius r about 0and 1. We claim:Lemma 3. If z is a complex number such that \z\ < r and |1 - z\ < r, then J/o(^)2/i (^)3| < 1.

Assuming this lemma, it follows that the algebraic integer y := fo(a)2fi (af has the propertythat y and all its conjugates (which are also values of fofi at roots of /) have absolute value lessthan 1. But then the norm of y, which is the product of those conjugates, is a rational integer ofabsolute value less than 1. Therefore y is an algebraic number of norm zero, whence y = 0. Thismeans that a is a root of either f0 or /i. Moreover, for each j = 0,1 the two roots of f3 areswitched by the involution x <-> 1 - x of C, and thus have the same multiplicity as complex zerosof /. Letting a3 be this common multiplicity, we find that / = fo(x)a°fi (x)ai, as claimed.

We prove the lemma via the following explicit calculation. Let R be the intersection of theclosed discs \z\ < r and |1 - z\ < r. Let y(z) := fo(z)2fi(z)3. We claim \y(z)\ < 1 for allz G R. The function y is analytic; hence by the maximum principle it suffices to prove \y(z)\ < 1on the boundary of R. By symmetry about the real axis and the vertical line Re(z) = 1/2, wemay assume z = x + iy/r2 - x2 for some x G [1/2, r}. For lack of a better idea, we expand|2/(2)|2, obtaining a polynomial P8(x) G <Q[x] of degree 8. A numerical plot suggests that thispolynomial does not exceed 0.9 on [1/2, r}. Since Ps(l/2) = r8(l - r2)6 < 0.9, we can provethat Pg(x) < 0.9 for all x G [1/2, r] by checking that this interval contains no roots of P8 - 0.9,and this is confirmed using Sturm's method (implemented in gp as polsturm). This completesthe proof of the lemma and the solution of F07-3.Remark. We could likewise use /0(a)60/1(a)61 for any positive integers bo,61. The simple choicebo = bi = 1 suffices to solve problem F07-3 as stated, but would only let us improve r from about1.15 to about 1.27. The choice (bo, h) = (2,3) is not quite optimal either, but can be used for anyr less than the positive root r0 = 1.304+ of xl6-3x8 + 3x6 -x4 -1 (for which fo(z)2 fi(zf 4-1has roots with \z\ — |1 — z\ — ro), and numerical experimentation suggests that this ro is verynearly as far as this technique can be pushed. At any rate we cannot take ro > 1.3503 because thetenth-degree polynomial /o /1 4-1 satisfies condition (i) and has a pair of complex roots of absolutevalue 1.350255424-.Remark. While we would get a worse bound had we used (bo,bi) = (1,1) instead of the exponents(2,3) in our Lemma, the proof would be a routine albeit tedious calculus exercise, because insteadof the octic Ps we would have only a cubic polynomial to maximize. This would still be enough

Page 101: The Harvard College Mathematics · PDF fileThe Harvard College Mathematics Review is produced and ... and in summer programs ... This new step in The HCMR's maturation is truly impressive

Z a c h a r y A b e l , e d . — S o l u t i o n s 9 9

to solve problem F07-3 as stated, or even with the bound 21/5 raised to 21/3 = 1.25994-. (Thecutoff value r0 would then be the positive root 1.2724- of xA - x2 - 1.) We do not know whetherthe p roposer ' s cho ice o f 21 /5 a l lows fo r a more e legan t p roo f . □Editor's note. Indeed, the proposer's solution uses a similar method with bo = &i = 1. His choiceof \/2 allows for a clean proof of the requisite lemma: From the identity b(x2 - x) (x2 - x 4-1) =x5 4- (1 - x)5 - 1, the inequality

\fo(a)fi(a)\ = i |a5 + (1 - a)5 - l| < \ (|a|5 + |1 - a|5 + l) = 1

follows immediately.Also solved by the proposer.

A Surprisingly Constant Limit

F07 - 4. Let a, b > 0 be two nonnegative numbers. Find the limitn 1

l i m V ^ - = = = = = = .n^°° t~L n + k + b+ y/n2 4- /era 4- a

Proposed by Ovidiu Furdui (University of Toledo).Solution by Paolo Perfetti (Universita degli studi di Tor Vergata Roma, Math. Dept). The key isto observe that the limit in independent of a and b. In view of this fact we compute the limit takinga = b = 0, getting

l i m y ^ , — l i m — V ^ . = / 1n-00 ^ n + k 4- \Jn2 4- nk »-°° n ~[ \ 4. k 4. \ 4. is. Jo 1 4- x + VI 4- xdx

r 2 1 c ^ 2 1= / — L - d t = 2 — - d y = 2 \ nJ 1 t + y / i J 1 1 + 2 /

1 W22

Now we prove that the limit is independent of a and b. First of all we write

1 1 1

and

r + b+y/nTTa~ r 4- b4- s/nTy/TT^ r 4- b + y/hT(l + 0(£))

-^r-T-(' + °(-))r 4 - 6 4 - y / n r \ \ n r j )

2n

r = n - f - l r = n + l N ' x

where the limit of the above expression as n approaches oo is zero. Thus,2 n - . 2 n . .

l i m y ^ - . = l i m V ^ — — 7 = .n^oo *-^ ^ r 4- b 4- vrar 4- g n^°° ,, r + 0 4- v?irr = n - fl v r = n + l

Similarly,

v - 1 y - 1 = y - ( l - f O ^ 1 ) )

Page 102: The Harvard College Mathematics · PDF fileThe Harvard College Mathematics Review is produced and ... and in summer programs ... This new step in The HCMR's maturation is truly impressive

100 The Harvard College Mathematics Review 2.1

and2 n j ~ , _ i \ 2 n

lim V )r -L = lim Y 0(r~2) - lim ra • 0(n~2) = 0,r = n + l v r = n + l

SO2n

l i m V ^ — = = l i m y ^ ■= = .n-+oo *—* r4o4 y/nr n-+oo ^-^ r 4- Jnr

r = n + l v r = n + l vAs this is independent of a and b, the proof is complete.Also solved by The Northwestern University Math Problem Solving Group and the proposer.

Page 103: The Harvard College Mathematics · PDF fileThe Harvard College Mathematics Review is produced and ... and in summer programs ... This new step in The HCMR's maturation is truly impressive

FEATURE13

ENDPAPER

Math Has This Funny PropertyZachary Abelf

Harvard University ' 10Cambridge, MA 02138

[email protected]

Scott D. Kominers*Harvard University '09Cambridge, MA 02138

[email protected]

Our Mathematical Minutiae article, "i Has This Funny Property" (this HCMR, p. 75), detailsthe analytic power of the complex numbers C. Once we adjoin a single square root of -1 to thereal numbers R, we obtain startling results: Cauchy's Theorem, Liouville's Theorem, The CauchyIntegral Formula....

, some ofL do not

Integral Formula....But if we add just a few more square roots of -1 to obtain the quaternions EI D C, so

the underlying structure breaks down. In the standard presentation, the roots {i, j, k] of -1 <even commute:

ij = k = -ji, jk = i = -kj, ik = -j = -ki.

Since H is not commutative, it is not necessarily true that [q(z)]n is "H-analytic" when q(z)is. Thus, we figured, everything should break down once we pass from C to H.

The endpaper would be called "j and k Have This Funny Property." We would give all the standardcounterexamples from quaternionic analysis, proving in a tour-de-force of mathematical irony howtwo new roots could devolve the entire analytic system of C below its foundations.

We raced to the references. We read. We re-read. We—We were wrong.No counterexamples. In the late 1930s, Fueter obtained quaternionic analogues of Cauchy's

Theorem, Liouville's Theorem, and even of power series developments. Our ironic tour-de-forcewas ruined before we were even born.

We despaired, until we thought to add four more square roots of -1 to obtain the octonions O DH D C. We thought, this extension ofC is not even associative—there is no way octonionic analysiscould be well-behaved!

We raced to the references. We read. We—Math is funny like that.

*See biographical information in this HCMR, p. 75.

101

Page 104: The Harvard College Mathematics · PDF fileThe Harvard College Mathematics Review is produced and ... and in summer programs ... This new step in The HCMR's maturation is truly impressive

FtBREAK

AWAY.Geographically, we're in the center of the financial world.

Philosophically, we couldn't be further away.

The exceptional individuals at QVT come from a wide variety

of academic and professional backgrounds not commonly

associated with investing, from hard sciences to literature.

Every day we confront some of the world's most complexinvestment situations, and we find that success comes not

from textbook training in finance, but from intelligence,

curiosity, and an ability to see things differently from the pack.

QVT is a hedge fund company with over $13 billion under

management. We're going places, and we're looking for moregreat people to help us get there.

QVT Financial LPNEW YORK 1 LONDON