jason papadopoulos workshop herman te riele 1 december...

High-Performance Optimization of GNFS Polynomials

Jason Papadopoulos

Workshop Herman te Riele

1 December 2011

Factoring RSA-768

Find many (a,b) relations such that

Alg(a,b) = 265482057982680 a^6
    + 1276509360768321888 a^5 b
    - 5006815697800138351796828 a^4 b^2
    - 46477854471727854271772677450 a^3 b^3
    + 6525437261935989397109667371894785 a^2 b^4
    - 18185779352088594356726018862434803054 a b^5
    - 277565266791543881995216199713801103343120 b^6

and

Rat(a,b) = 34661003550492501851445829 a -1291187456580021223163547791574810881 b

have only small factors (1500 CPU-years)

Finding this polynomial took 40 CPU-years in 2005-2007; can we do better in 2011?
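As a rough illustration of what a relation is, the sketch below evaluates both homogeneous polynomials at a pair (a, b) and tests the values for smoothness by trial division. The helper names (`eval_homogeneous`, `is_smooth`, `is_relation`) and the smoothness bound are assumptions made for illustration only; a real siever does not trial-divide every pair.

```python
# Coefficients of the RSA-768 polynomial pair quoted above, highest power of a first.
ALG = [
    265482057982680,
    1276509360768321888,
    -5006815697800138351796828,
    -46477854471727854271772677450,
    6525437261935989397109667371894785,
    -18185779352088594356726018862434803054,
    -277565266791543881995216199713801103343120,
]
RAT = [34661003550492501851445829, -1291187456580021223163547791574810881]


def eval_homogeneous(coeffs, a, b):
    """Evaluate sum(coeffs[j] * a^(d-j) * b^j), where d = len(coeffs) - 1."""
    d = len(coeffs) - 1
    return sum(c * a ** (d - j) * b ** j for j, c in enumerate(coeffs))


def is_smooth(n, bound):
    """True if every prime factor of |n| is below `bound` (plain trial division)."""
    n = abs(n)
    if n == 0:
        return False
    p = 2
    while p < bound and n > 1:
        while n % p == 0:
            n //= p
        p += 1
    return n == 1


def is_relation(a, b, bound):
    """A pair (a, b) is a relation when both polynomial values are smooth."""
    return (is_smooth(eval_homogeneous(ALG, a, b), bound)
            and is_smooth(eval_homogeneous(RAT, a, b), bound))
```

The trial-division loop is only practical for tiny bounds; it is here to make the definition concrete, not to suggest how the 1500 CPU-years were spent.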

GNFS Polynomial Selection

This polynomial was chosen for high relation yield. Yield is approximated by two properties:

• Size property: Alg(a,b) must be small for typical (a,b)

• Root property: Alg(a,b) must have many more small factors than random numbers of equal size

GNFS Polynomial search:

• Stage 1: find half of a good Alg(a,b)

• Stage 2: optimize the rest

A 2011-era Polynomial Search

• Internet-coordinated effort for RSA-768 using Msieve factorization library

• 30 million stage 1 hits (10? CPU/GPU-years)

An example from stage 1:

Alg(x) = 42928872 x^6
    + 2867096 x^5
    + 536807905594948376146722 x^4
    + 69471204764889984031666983834329 x^3
    + 563188584894806114431813241917449907 x^2
    + 8317933737184085405738883967023222929 x
    + 306504315284890658637701358484088152

Rat(x) = 3547998808606749548425 x - 17493240118865140990805560504400722747

Size Optimization

Model the size of Alg(x) as a multivariate polynomial $F(S, T, r_0, r_1, r_2)$, where

• $S$ is the skew of the sieving region

• $T$ is a translation of the origin for $x$

• $r_0, r_1, r_2$ are coefficients of an auxiliary polynomial

To minimize $F$:

    Choose $\mathbf{x}_0$, the initial value of $\mathbf{x} = (S, T, r_0, r_1, r_2)$
    do {
        Choose search direction $\mathbf{d}_k$
        Line search: choose $\alpha_k$ to minimize $F(\mathbf{x}_k + \alpha_k \mathbf{d}_k)$
        $\mathbf{x}_{k+1} = \mathbf{x}_k + \alpha_k \mathbf{d}_k$
    } until (convergence)

We expect the five components of $\mathbf{x}_k$ to differ from one another by many orders of magnitude.
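The loop above can be sketched in a few lines. This is a minimal sketch under stated assumptions: the search direction is plain steepest descent, the line search is simple backtracking (step halving) rather than an exact minimization over the step length, and `toy_F` is a stand-in quadratic, not the GNFS size function. All function names are hypothetical.

```python
def line_search(F, x, d, alpha0=1.0, shrink=0.5, max_halvings=60):
    """Backtracking search: return a step alpha with F(x + alpha*d) < F(x), or 0.0."""
    fx = F(x)
    alpha = alpha0
    for _ in range(max_halvings):
        trial = [xi + alpha * di for xi, di in zip(x, d)]
        if F(trial) < fx:
            return alpha
        alpha *= shrink
    return 0.0


def minimize(F, grad, x0, max_iter=1000):
    """Repeat: pick a direction, search along it, update x; stop when no step helps."""
    x = list(x0)
    for _ in range(max_iter):
        d = [-g for g in grad(x)]      # steepest descent, one possible direction
        alpha = line_search(F, x, d)
        if alpha == 0.0:               # no improving step found: treat as converged
            break
        x = [xi + alpha * di for xi, di in zip(x, d)]
    return x


# Toy objective standing in for F, with its minimum at (3, -1).
def toy_F(x):
    return (x[0] - 3.0) ** 2 + (x[1] + 1.0) ** 2


def toy_grad(x):
    return [2.0 * (x[0] - 3.0), 2.0 * (x[1] + 1.0)]


x_min = minimize(toy_F, toy_grad, [0.0, 0.0])
```

On a well-scaled toy like this the loop converges almost immediately; the difficulty discussed on the following slides comes from variables whose magnitudes differ enormously.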

Choosing an Initial Point

small $|\mathbf{x}_0|$ fails → $F(S, 0, 0, 0, 0) \sim 5a_3^2$ (i.e. constant), so $\alpha = 0$

large $|\mathbf{x}_0|$ fails → first few values of $F$ are huge, may take too long to reduce

Another idea: choose $\mathbf{x}_0$ so that the terms of $F$ cluster around $5a_3^2$, i.e. minimize

$$\sum_i \left( 5a_3^2 - |\text{term } i \text{ of } F| \right)^2$$

(Take logarithms so this is an easy least-squares regression)
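To make the logarithm trick concrete: if each term of F is treated as a monomial, coefficient times a product of powers of the variables, then log|term| is linear in the logarithms of the variables, and matching every term to a common target is an ordinary linear least-squares problem. The sketch below assumes that monomial representation, works with magnitudes only (signs are ignored, so it returns positive values), and uses made-up toy terms; `initial_point` and `solve_linear` are hypothetical helper names.

```python
import math


def solve_linear(A, b):
    """Solve A u = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    u = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * u[j] for j in range(i + 1, n))
        u[i] = (M[i][n] - tail) / M[i][i]
    return u


def initial_point(terms, target):
    """terms: list of (coefficient, exponents); a term is coefficient * prod(v_i ** e_i).
    Returns positive v_i that best equalise |term| with `target` in the log domain."""
    n = len(terms[0][1])
    rows = [[float(e) for e in exps] for _, exps in terms]
    rhs = [math.log(target) - math.log(abs(coef)) for coef, _ in terms]
    # Normal equations (E^T E) u = E^T rhs, where u_i = log v_i.
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(n)] for i in range(n)]
    atb = [sum(r[i] * y for r, y in zip(rows, rhs)) for i in range(n)]
    return [math.exp(u) for u in solve_linear(ata, atb)]


# Toy terms: 100*v0, 10*v1 and v0*v1, all of which should land near 1000.
toy_terms = [(100.0, (1, 0)), (10.0, (0, 1)), (1.0, (1, 1))]
v0 = initial_point(toy_terms, 1000.0)
```

For these toy terms the fit is exact (v0 = 10, v1 = 100 puts every term at 1000); with hundreds of terms the regression would only balance them approximately.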

stage 1 example: $\mathbf{x}_0 \approx$ [3e7 5e8 9e18 6e12 3e5]

Choosing a Search Direction

Many possibilities

• CADO-NFS, pol51opt: $\mathbf{d}_k = \mathbf{e}_{k \bmod 5}$ (one component of $\mathbf{x}_k$)

• Powell's Algorithm: $\mathbf{d}_k = f(\text{last few } \mathbf{d}_k)$

• Gradient-based: $\mathbf{d}_k = f(\text{last few } \nabla F(\mathbf{x}_k))$

All of these fail frequently due to bad scaling!

stage 1 example: $\nabla F \approx$ [-3e72 9e70 3e58 -2e66 1e74] (i.e. 2 variables out of 5 will never change)

Another idea: solve $\mathrm{Hessian}(F)\big|_{\mathbf{x}_k} \cdot \mathbf{d}_k = -\nabla F(\mathbf{x}_k)$

stage 1 example: $\mathbf{d}_0 \approx$ [5e5 -1e4 -5e22 -2e16 -1e8]
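A two-variable toy shows why solving with the Hessian helps when the variables are badly scaled. The objective below is an assumed stand-in (not the GNFS size function) whose minimum sits at (1, 1e12); the 2x2 solve uses Cramer's rule, whereas the real five-variable problem would need a general linear solver. `newton_direction` is a hypothetical helper name.

```python
def newton_direction(hessian, gradient):
    """Solve H d = -g for a 2x2 Hessian using Cramer's rule."""
    (h00, h01), (h10, h11) = hessian
    det = h00 * h11 - h01 * h10
    g0, g1 = gradient
    return [(-g0 * h11 + h01 * g1) / det, (-h00 * g1 + h10 * g0) / det]


# Toy objective: F(u, v) = (u - 1)^2 + 1e-24 * (v - 1e12)^2
def toy_gradient(x):
    return [2.0 * (x[0] - 1.0), 2e-24 * (x[1] - 1e12)]


def toy_hessian(x):
    return [[2.0, 0.0], [0.0, 2e-24]]


x0 = [0.0, 0.0]
g = toy_gradient(x0)                         # second component is about 1e12 times smaller
d_steepest = [-g[0], -g[1]]                  # barely moves the second variable
d_newton = newton_direction(toy_hessian(x0), g)
```

The steepest-descent direction has a second component near 2e-12, so the second variable would effectively never change, while the Newton direction is approximately (1, 1e12) and reaches the minimum of this quadratic in one step.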

Open Issues

Can we find all solutions of $\nabla F = 0$? Yes!

• This is a system of polynomial equations

• Algebraic Geometry provides tools (Homotopy Continuation methods) to enumerate all solutions

• BUT: many roots possible, and HC software is very complex, often closed source, and not designed to handle such badly behaved F

Root Optimization

Alg(x) = 42928872 x^6
    - 6591734045201752 x^5
    + 958541924550301751831682 x^4
    + 129405812408925020397453755897 x^3
    - 8576407607138952307141528191137751178 x^2
    - 453565868257208049993839912191312137956681 x
    + 9957798604786812405002673274822214304057346835313

Rat(x) = 3547998808606749548425 x - 17493240209664423073040018248583762572

Sacrifice (a little) size: find $R(x)$ so that $\mathrm{Alg}(x) + R(x) \cdot \mathrm{Rat}(x)$ is highly divisible by small primes

The Root Sieve

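Let $R(x)$ be a constant $r_0$. Then a residue class $\rho$ ($0 \le \rho < p^k$) is a root of $\mathrm{Alg}(x) + R(x) \cdot \mathrm{Rat}(x)$ if

$$\mathrm{Alg}(\rho) + r_0 \cdot \mathrm{Rat}(\rho) \equiv 0 \pmod{p^k}$$

which will be true for $\mathrm{Rat}(\rho) \not\equiv 0$ and $r_0$ on an arithmetic progression with start point $s$:

$$s + i \cdot p^k \quad \forall i \in \mathbb{Z}$$

$R(x) = r_2 x^2 + r_1 x + r_0$ → shift start point left by $r_2 \rho^2 + r_1 \rho$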

Fix $r_2$ and $r_1$, sieve on $r_0$, choose $(r_0, r_1, r_2)$ that maximizes a weighted sum of contributions to the root score (penalize larger size too). Sieve optimizations include

• wheel sieving (4-5X faster)

• unrolled SIMD (4-8X faster)
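The arithmetic progression for one root and one prime power can be computed directly. The sketch below is an illustration under assumptions: the polynomials are tiny made-up stand-ins, `progression_start` is a hypothetical helper, and the case where Rat(ρ) is not invertible mod p^k is simply skipped (it returns `None`) rather than handled. It relies on Python 3.8+ for `pow(x, -1, m)`.

```python
def poly_eval(coeffs, x):
    """Horner evaluation; coeffs are listed from the constant term upward."""
    acc = 0
    for c in reversed(coeffs):
        acc = acc * x + c
    return acc


def progression_start(alg, rat, rho, pk, r1=0, r2=0):
    """Smallest r0 >= 0 that makes rho a root of
    Alg(x) + (r2*x^2 + r1*x + r0) * Rat(x) modulo pk.
    Returns None when Rat(rho) has no inverse modulo pk."""
    rat_val = poly_eval(rat, rho) % pk
    try:
        inv = pow(rat_val, -1, pk)
    except ValueError:
        return None
    s = (-poly_eval(alg, rho) * inv) % pk
    # Non-constant rotations shift the start point left by r2*rho^2 + r1*rho.
    return (s - (r2 * rho * rho + r1 * rho)) % pk


ALG = [1, 0, 1]     # x^2 + 1, a made-up stand-in for Alg(x)
RAT = [-3, 1]       # x - 3, a made-up stand-in for Rat(x)
PK = 25             # p^k with p = 5, k = 2

start = progression_start(ALG, RAT, rho=1, pk=PK)
hits = [start + i * PK for i in range(4)]   # r0 values for which rho = 1 is a root
```

A sieve would add the root-score contribution of this (p^k, ρ) pair at every element of `hits`, then repeat for the other roots and prime powers.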

The Curse of Dimensionality

BUT: For F to be worse by just 1%,

$r_2 \in [-10, 10]$

$r_1 \in [-5229760, 5605120]$

$r_0 \in [-7179558928960, 27618737865280]$

i.e. $7.5 \times 10^{21}$ possible $R(x)$!

For skew $S$, number of $R(x) \sim S^{0.5 + 1.5 + 2.5}$ → Too many

If Alg(x) were degree 7 → $(\text{smaller } S)^{8}$
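The count of candidate rotations follows from multiplying the widths of the three coefficient ranges quoted above. A minimal check (the dictionary layout is just an illustrative choice):

```python
# Ranges for the rotation coefficients quoted on the slide.
ranges = {
    "r2": (-10, 10),
    "r1": (-5229760, 5605120),
    "r0": (-7179558928960, 27618737865280),
}

count = 1
for low, high in ranges.values():
    count *= high - low          # width of each range

print(f"{count:.2e} candidate rotation polynomials")
```

The product is on the order of 7.5e21, far too many to sieve one at a time.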

Root Sieve Over Lattices

1. Fix $r_2$ and $r_1$

2. Choose lattice size $L = \prod$ small $p^k$

3. Find $\{\rho_i\}$, set of residue classes mod $L$ with good root score

4. forall $\rho_i$ {
       initialize arithmetic progressions where $p^k \nmid L$
       sieve only over $R(x)$ with $r_0 = \rho_i + n \cdot L$
   }

Hopefully finds top $R(x)$, only $L$ times faster! BUT:

• $L$ will be large, so finding $\{\rho_i\}$ is expensive

• Fixed $r_2$ and $r_1$ → best $\{\rho_i\}$ mostly have mediocre scores

• Still a huge number of $r_2$ and $r_1$

→ Repeat recursively on $r_2$ and $r_1$ (see Appendix)
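Steps 2 to 4 can be illustrated on a toy problem. The sketch below makes several simplifying assumptions that are not from the slides: it uses primes rather than prime powers, takes $r_2 = r_1 = 0$, scores a class by the number of roots weighted by log(p)/(p-1), and only enumerates the surviving candidates instead of sieving the remaining primes over them. The polynomials and all function names are made up for illustration.

```python
import math


def count_roots(coeffs, p):
    """Number of x in [0, p) with f(x) = 0 mod p; coeffs run from the constant term up."""
    total = 0
    for x in range(p):
        acc = 0
        for c in reversed(coeffs):
            acc = (acc * x + c) % p
        total += acc == 0
    return total


def rotated(alg, rat, r0):
    """Coefficients of Alg(x) + r0 * Rat(x); assumes deg Rat <= deg Alg."""
    out = list(alg)
    for i, c in enumerate(rat):
        out[i] += r0 * c
    return out


def root_score(alg, rat, r0, primes):
    """Assumed scoring rule: roots mod p weighted by log(p) / (p - 1)."""
    return sum(count_roots(rotated(alg, rat, r0), p) * math.log(p) / (p - 1)
               for p in primes)


def best_classes(alg, rat, lattice_primes, keep):
    """Score every residue class of r0 mod L and keep the best few."""
    lattice = math.prod(lattice_primes)
    ranked = sorted(range(lattice),
                    key=lambda c: root_score(alg, rat, c, lattice_primes),
                    reverse=True)
    return lattice, ranked[:keep]


def lattice_candidates(classes, lattice, low, high):
    """All r0 in [low, high] that lie in one of the chosen residue classes."""
    for c in classes:
        first = low + (c - low) % lattice
        yield from range(first, high + 1, lattice)


ALG = [1, 1, 0, 1]        # x^3 + x + 1, a made-up stand-in
RAT = [-3, 1]             # x - 3, a made-up stand-in
LATTICE_PRIMES = [2, 3, 5]

L, classes = best_classes(ALG, RAT, LATTICE_PRIMES, keep=4)
candidates = list(lattice_candidates(classes, L, -100, 100))
```

Because the roots modulo a prime dividing L depend only on $r_0 \bmod L$, every candidate inherits the score of its residue class, which is what lets the sieve skip all other values of $r_0$.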

Results (RSA-768)

Msieve’s root sieve checks ~400 residue classes in ~45 seconds

Top 200 $R(x)$ all have α < -9.7

Alg(x) = 42928872 x^6
    - 6192655231005592 x^5
    + 909022231908376415214882 x^4
    + 5914942274686414039093517351337 x^3
    - 274301886941084816962176966461353355575 x^2
    - 515564330420194329667591722299231510204137041 x
    - 196153824160596623683046786114520250839860067525075

Rat(x) = 3547998808606749548425 x - 17493240204167224678960892633245036072

RSA768 poly: α = -7.1, skew 40K, 2.37 relations/sec

this poly: α = -11.2, skew 21M, 2.37 relations/sec

Results (RSA-200)

After 3 weeks on one machine (GPU for stage 1):

Alg(x) = 30484740 x^5
    - 26161268598532753 x^4
    + 590634003963857441904735 x^3
    + 125125924118065591587855467345214 x^2
    - 706571050099340171553551137806154058701 x
    + 17645019293571653563585325729206250805219566160

Rat(x) = 426869834565612874003 x - 246949634455459504002026808170971587479

RSA200 poly: 0.8 relations/sec

this poly: α = -6.3, skew 71M, 0.9 relations/sec

(courtesy of Jayson King)

Open Issues

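• (Bai 2011) Use tree search and Hensel lifting to find good $\{\rho_i\}$ instead of sieving

• Lattice-based formulation: $R(x)$ with a root $\rho$ mod $p^k$ all satisfy

$$\begin{pmatrix} r_0 \\ r_1 \\ r_2 \end{pmatrix} = \begin{pmatrix} p^k & -\rho & -\rho^2 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} i \\ j \\ m \end{pmatrix} + \begin{pmatrix} s \\ 0 \\ 0 \end{pmatrix}$$

Combine many of these, skew by $S$, size-reduce via LLL. Then sieve remaining progressions over a three-dimensional box of lattice points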

References

S. Bai (2011) "Polynomial Selection for the Number Field Sieve"

J. Gower (2003) "Rotations and Translations of Number Field Sieve Polynomials"

P. Jedlicka (2010) "Integral Minimisation Improvement for Murphy's Polynomial Selection Algorithm"

T. Kleinjung (2006) "On Polynomial Selection for the General Number Field Sieve"

T. Kleinjung (2008) "Polynomial Selection" (CADO Workshop on Integer Factorization)

T. Li (1996) "Numerical Solution of Multivariate Polynomial Systems by Homotopy Continuation Methods"

B. Murphy (2000) "Polynomial Selection for the Number Field Sieve Integer Factorisation Algorithm"

Appendix: Size Optimization

Transform the original polynomial

$$\mathrm{Alg}(x) = \sum_i a_i x^i$$

via translating the origin by T and adding a rotation polynomial R(x):

$$\mathrm{Alg}'(x) = R(x - T) \cdot \mathrm{Rat}(x - T) + \sum_i a_i (x - T)^i = \sum_i a_i' x^i$$
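The transformation can be carried out on coefficient lists. In this sketch the polynomials are small made-up stand-ins, coefficient lists run from the constant term upward, and Rat(x) is translated by the same T (an assumption, though it matches how the Rat(x) constant term changes between the stage 1 and size-optimized examples). The helper names are hypothetical.

```python
def poly_add(p, q):
    n = max(len(p), len(q))
    return [(p[i] if i < len(p) else 0) + (q[i] if i < len(q) else 0)
            for i in range(n)]


def poly_mul(p, q):
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def poly_eval(p, x):
    acc = 0
    for c in reversed(p):
        acc = acc * x + c
    return acc


def poly_shift(p, t):
    """Coefficients of p(x - t), built with Horner's rule on coefficient lists."""
    out = [p[-1]]
    for c in reversed(p[:-1]):
        out = poly_add(poly_mul(out, [-t, 1]), [c])
    return out


def transform(alg, rat, rot, t):
    """Return (Alg', Rat') where Alg'(x) = (Alg + R*Rat)(x - t) and Rat'(x) = Rat(x - t)."""
    rotated = poly_add(alg, poly_mul(rot, rat))
    return poly_shift(rotated, t), poly_shift(rat, t)


ALG = [5, 0, 1, 0, 2]    # 2x^4 + x^2 + 5, made-up
RAT = [-7, 1]            # x - 7, made-up
ROT = [2, 3]             # R(x) = 3x + 2, made-up
new_alg, new_rat = transform(ALG, RAT, ROT, t=4)
```

Since R(x)·Rat(x) has lower degree than Alg(x) here, the leading coefficient is unchanged, just as 42928872 survives in the RSA-768 examples.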

Appendix: Size Optimization

Assume:

• $R(x) = r_2 x^2 + r_1 x + r_0$

• Coefficients are weighted by skew S, i.e. $c_i = a_i' \cdot S^i$

• size ~ $\iint \mathrm{Alg}'^{\,2}$ over sieving region

Then we can minimize

$$F(S, T, r_0, r_1, r_2) = S^{-6} \big[\, 231(c_6^2 + c_0^2) + 42(c_6 c_4 + c_2 c_0) + 21(c_5^2 + c_1^2) + 7(c_4^2 + c_2^2) + 14(c_6 c_2 + c_5 c_3 + c_4 c_0 + c_3 c_1) + 10(c_6 c_0 + c_5 c_1 + c_4 c_2) + 5 c_3^2 \,\big]$$

(a polynomial with 273 terms!)
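For a fixed set of already-transformed coefficients, the size measure is cheap to evaluate. The sketch below assumes integer coefficients and an integer skew so that the result stays exact; `size_F` is a hypothetical name, and the translation and rotation that produce the $a_i'$ are not repeated here.

```python
from fractions import Fraction


def size_F(a, skew):
    """Size measure for a degree-6 polynomial; a[i] is the integer coefficient of x^i."""
    c = [a[i] * skew ** i for i in range(7)]      # skew-weighted coefficients c_i
    bracket = (
        231 * (c[6] ** 2 + c[0] ** 2)
        + 42 * (c[6] * c[4] + c[2] * c[0])
        + 21 * (c[5] ** 2 + c[1] ** 2)
        + 7 * (c[4] ** 2 + c[2] ** 2)
        + 14 * (c[6] * c[2] + c[5] * c[3] + c[4] * c[0] + c[3] * c[1])
        + 10 * (c[6] * c[0] + c[5] * c[1] + c[4] * c[2])
        + 5 * c[3] ** 2
    )
    return Fraction(bracket, skew ** 6)


# With only the x^3 coefficient present, the skew cancels and F = 5 * a_3^2.
only_a3 = size_F([0, 0, 0, 2, 0, 0, 0], skew=10)
```

The $5 c_3^2$ term is the only one that does not depend on the skew, which is why $5a_3^2$ is a natural reference value for the other terms.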

Appendix: Many Local Minima

Vary $\mathbf{x}_0$ for stage 1 example by negating some of $T, r_0, r_1, r_2$:

Final $F$:

1.010569 × 10^66
3.408434 × 10^65
3.899769 × 10^66
8.536757 × 10^63
4.014100 × 10^64
8.196526 × 10^66
4.151540 × 10^6? (worst)
1.878025 × 10^6?
1.200364 × 10^66
4.010228 × 10^66
2.905903 × 10^6? (best)
1.289390 × 10^66
3.831444 × 10^64
5.141286 × 10^66
5.191914 × 10^66
4.765400 × 10^65

Appendix: Root Optimization

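Expected number of factors of p in a random number ~

$$\nu_{\mathrm{rand}}(p) = \sum_{k=1}^{\infty} k \left( \frac{1}{p^k} - \frac{1}{p^{k+1}} \right) = \frac{1}{p - 1}$$

Murphy (2000): expected number of factors of p in Alg(a,b) ~

$$\nu_{\mathrm{alg}}(p) = \frac{p}{p + 1} \sum_{k=1}^{\infty} \frac{\#\,\text{roots of } \mathrm{Alg}(x) \bmod p^k}{p^k}$$

Thus: log(gain from using Alg(x)) is approximately

$$\alpha = \sum_{p \text{ small}} \left[ \nu_{\mathrm{rand}}(p) - \nu_{\mathrm{alg}}(p) \right] \log p$$

Good α → $\mathrm{Alg}(x) + R(x) \cdot \mathrm{Rat}(x)$ has many roots mod $p^k$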
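The three formulas above translate directly into code. This is a sketch under assumptions: roots are counted by brute force (only sensible for very small primes), the infinite sum over k is truncated at `max_power`, and the example polynomials are made up. The names `count_roots` and `alpha` are hypothetical.

```python
import math


def count_roots(coeffs, modulus):
    """Number of x in [0, modulus) with f(x) = 0 mod modulus (brute force).
    coeffs run from the constant term upward."""
    total = 0
    for x in range(modulus):
        acc = 0
        for c in reversed(coeffs):
            acc = (acc * x + c) % modulus
        total += acc == 0
    return total


def alpha(coeffs, primes, max_power=4):
    """Sum over small primes of (nu_rand - nu_alg) * log p, truncating the sum over k."""
    total = 0.0
    for p in primes:
        nu_rand = 1.0 / (p - 1)
        nu_alg = p / (p + 1) * sum(
            count_roots(coeffs, p ** k) / p ** k for k in range(1, max_power + 1)
        )
        total += (nu_rand - nu_alg) * math.log(p)
    return total


no_roots = alpha([1, 0, 1], [3, 7])          # x^2 + 1 has no roots mod 3 or 7
many_roots = alpha([0, 2, -3, 1], [3, 7])    # x(x - 1)(x - 2) has roots mod every prime
```

A polynomial without roots modulo the small primes gets a positive α (worse than random), while one with many roots gets a negative α, which is the direction the root sieve pushes in.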

Appendix: A Recursive Lattice Sieve

$W_2$ = width of range of $r_2$
$L_2 = \prod p^k$ such that $L_2 < W_2 / 1000$
$W_1$ = width of range of $r_1$ given $r_2 = 0$
$L_1 = \prod (\text{next } p^k)$ such that $L_1 < W_1 / 1000$

Initialize progressions of $p^k \mid L_2$
Sieve $L_2 \times L_2 \times L_2$ grid
Save best $(r_{0,i}, r_{1,i}, r_{2,i}) \bmod L_2$

For ith triplet {
    $r_{2,i}$ = smallest $r_2$ for triplet i in the $W_2$ range

    For j < 1000 {   /* find good $r_1$-$r_0$ 'planes' */
        fix $r_2 = r_{2,i}$, initialize progressions of $p^k \mid L_1$
        Sieve $L_1 \times L_1$ grid
        Save best $((r_{0,j}, r_{1,j}) \bmod L_1 \cdot L_2,\ r_{2,i})$
        $r_{2,i}$ += $L_2$
    }
}

Appendix: A Recursive Lattice Sieve

For jth $((r_{0,j}, r_{1,j}) \bmod L_1 \cdot L_2,\ r_{2,i})$ {   /* for each plane */
    Fix $r_2 = r_{2,i}$
    $W_0$ = width of range of $r_0$ given $r_1 = 0$
    $L_0 = \prod (\text{next } p^k)$ such that $L_0 < W_0 / 1000000$
    $r_{1,j}$ = smallest $r_1$ for triplet j in the $W_1$ range

    For m < 1000 {   /* find good $r_0$ 'lines' */
        fix $r_1 = r_{1,j}$, initialize progressions of $p^k \mid L_0$
        Sieve $L_0$ locations
        Save best $(r_{0,m} \bmod L_0 \cdot L_1 \cdot L_2,\ r_{1,j},\ r_{2,i})$
        $r_{1,j}$ += $L_1$
    }

    For best few $r_{0,m}$ {   /* for each line */
        $W_0$ = width of range of $r_0$ given $r_1 = r_{1,j}$
        Fix $r_1 = r_{1,j}$, $r_2 = r_{2,i}$
        Initialize progressions of remaining $p^k$
        Sieve ~1000000 locations in $W_0$ range that lie on $r_{0,m} + n \cdot L_0 \cdot L_1 \cdot L_2$
    }
}
