
Parton Distribution Functions

Goals: describe in detail how global PDF analyses are carried out, the strengths and weaknesses of LO/LO* (modified LO)/NLO and NNLO fits, how PDF uncertainties are calculated and how PDF re-weighting can be easily carried out through LHAPDF and other PDF tools, and the use of correlations to constrain PDFs and cross sections.

1. Introduction

As mentioned in Chapter ??, the calculation of the production cross sections at hadron colliders for both interesting physics processes and their backgrounds relies upon a knowledge of the distribution of the momentum fraction x of the partons (quarks and gluons) in a proton in the relevant kinematic range. These parton distribution functions (pdfs) cannot be calculated perturbatively; ultimately, it may be possible to calculate them non-perturbatively using lattice gauge theory (reference for LGT). For the foreseeable future, though, pdfs will be determined by global fits to data from deep inelastic scattering (DIS), Drell-Yan (DY), and jet production at current energy ranges.

There are a number of global pdf fitting groups that are currently active [], and which provide semi-regular updates to the parton distributions when new data and/or theoretical developments become available. The resulting pdfs are available at leading order (LO), next-to-leading order (NLO) and next-to-next-to-leading order (NNLO) in the strong coupling constant (αs), depending on the order(s) at which the global pdf fits have been carried out. Some pdfs have also been produced at what has been termed modified leading order, in an attempt to reduce some of the problems that result from the use of LO pdfs in parton shower Monte Carlo programs. PDFs of each of these orders will be discussed in this chapter.

There are two different classes of technique for pdf determination, those based on the Hessian approach, and those using a Monte Carlo approach. Both classes will be discussed in this chapter. In addition, the PDF4LHC working group [] has carried out benchmark comparisons [] of the NLO predictions at the LHC (7 TeV) for six PDF groups. These comparisons will be discussed in Chapter X.

2. Processes involved in global analysis fits

Measurements of deep-inelastic scattering (DIS) structure functions (F2, F3), or of the related cross sections, in lepton-hadron scattering and of lepton pair production cross sections in hadron-hadron collisions provide the main source of information on the quark distributions fq/p(x, Q2) inside hadrons. At leading order, the gluon distribution function fg/p(x, Q2) enters directly in hadron-hadron scattering processes with jet final states. Modern global parton distribution fits are carried out to NLO and NNLO, which allows αS(Q2), fq/p(x, Q2) and fg/p(x, Q2) to all mix and contribute in the theoretical formulae for all processes. Nevertheless, the broad picture described above still holds to some degree in global pdf analyses.

An NLO (NNLO) global pdf fit requires thousands of iterations and thus thousands of evaluations of NLO (NNLO) matrix elements. The NLO (NNLO) matrix elements require too much time for evaluation to be used directly in global fits. Either a K-factor (NLO/LO or NNLO/LO) can be calculated for each data point used in the global fit, and the LO matrix element (which can be calculated very quickly) multiplied by that K-factor in the fit, or a routine such as fastNLO [] or APPLgrid [] can be used for fast evaluation of the NLO matrix element with the new iterated pdf. Practically speaking, both provide the same order of accuracy. (An argument has been made that the K-factor approach does not work, for example for inclusive jet production, since different subprocesses contribute to that production and the NLO corrections may be different for each subprocess. However, the K-factors change very slowly in the course of a global fit, and occasional updating is sufficient to preserve the needed accuracy.) Even when fastNLO or APPLgrid is used at NLO, a K-factor approach is still needed at NNLO.
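A minimal sketch of the K-factor bookkeeping described above is given below (plain Python; the function names and the fast LO / slow NLO evaluators are hypothetical placeholders, not any group's actual fitting code):

# Frozen K-factors in a pdf fit loop (illustrative only).
# sigma_lo(point, pdf) and sigma_nlo(point, pdf) stand in for a fast LO
# evaluation and a slow NLO evaluation of a single data point.

def compute_k_factors(data_points, pdf_ref, sigma_lo, sigma_nlo):
    """Evaluate K = NLO/LO once, with a reference pdf."""
    return [sigma_nlo(p, pdf_ref) / sigma_lo(p, pdf_ref) for p in data_points]

def theory_prediction(point, k_factor, pdf_current, sigma_lo):
    """Fast approximation to the NLO cross section during fit iterations."""
    return k_factor * sigma_lo(point, pdf_current)

# In the fit, the K-factors would be recomputed only occasionally, since they
# change very slowly as the pdfs are varied from iteration to iteration.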

Page 2: Parton Distribution Functions · = (M/14 TeV) exp( y) Q = M LHC parton kinematics M = 10 GeV M = 100 GeV M = 1 TeV M = 10 TeV y = 6 4 2 0 2 4 6 Q 2 (GeV 2) x Fig. 1: A plot showing

The data from DIS, DY and jet processes utilized in pdf fits cover a wide range in x and Q2. HERA data (H1 [?]+ZEUS [?]) are predominantly at low x, while the fixed target DIS [?, ?, ?, ?, ?] and DY [?, ?] data are at higher x. Collider jet data at both the Tevatron and LHC [?, ?, ?, ?] cover a broad range in x and Q2 by themselves and are particularly important in the determination of the high x gluon distribution. To date, no jet data from the LHC have been used in global pdf fits, although that will change as high statistics data, and their detailed systematic error information, are published. In addition, jet production data from HERA have been used in the HERAPDF global pdf fits [].

show plot of all points in x and Q2 in CT10 fit

There is a tradeoff between the size and the consistency of a data set, in that a wider data set contains more information, but information coming from different experiments may be partially inconsistent. Most of the fixed target data have been taken on nuclear targets and suffer from uncertainties in the nuclear corrections that must be made []. This is unfortunate, as it is the neutrino fixed target data that provide most of the quark flavor differentiation, for example between up, down and strange quarks. As LHC collider data become more copious, it may be possible to reduce the reliance on fixed target nuclear data. For example, the rapidity distributions for W+, W− and Z production at the LHC (as well as the Tevatron) are proving to be very useful in constraining u and d valence and sea quarks, as described in Chapter X.

There is considerable overlap, however, in the kinematic coverage among the datasets, with the degree of overlap increasing with time as the full statistics of the HERA experiments are published. Parton distributions determined at a given x and Q2 'feed down', or evolve, to lower x values at higher Q2 values (reference to discussion in Chapter 2?). DGLAP-based NLO and NNLO pQCD should provide an accurate description of the data (and of the evolution of the parton distributions) over the entire kinematic range present in current global fits. At very low x and Q2, DGLAP evolution is believed to be no longer applicable and a BFKL [?, ?, ?, ?] description must be used. No clear evidence of BFKL physics is seen in the current range of data (reference for this?); thus all global analyses use conventional DGLAP evolution of pdfs.

There is a remarkable consistency between the data in the pdf fits and the perturbative QCD theory fit to them. Both the CTEQ and MSTW groups use over 3000 data points (check exact numbers) in their global pdf analyses, and the χ2/DOF for the fit of theory to data is of the order of unity, for both the NLO and NNLO analyses. (The NNLO χ2 values tend to be slightly larger than those for NLO.) For most of the data points, the statistical errors are smaller than the systematic errors, so a proper treatment of the systematic errors and their bin-to-bin correlations is important. All modern day experiments provide the needed correlated systematic error information. The H1 and ZEUS experiments have combined the data from the two experiments in such a way as to reduce both the systematic and statistical errors, providing errors of both types of the order of a percent or less over much of the HERA kinematic range. In the combination, 1402 data points are combined to form 742 cross-section measurements (including both neutral current and charged current cross sections). The combined data set, with its small statistical and systematic errors, forms a very strong constraint for all modern global pdf fits. The manner of using the systematic errors in a global fit will be discussed later in this chapter.

The accuracy of the extrapolation to higher Q2 depends on the accuracy of the original measurement, any uncertainty on αS(Q2) and the accuracy of the evolution code. Most global pdf analyses are carried out at NLO and NNLO. The NLO and NNLO evolution codes have now been benchmarked against each other and found to be consistent (need ref to les houches). Many processes have been calculated to NLO and there is the possibility of including data from these processes in global fits. Fewer processes have been calculated at NNLO [?]. These processes include DIS and DY, but not, for example, inclusive jet production. (Progress towards the calculation of inclusive jet production at NNLO is described in Chapter ref:sec.) Typically, jet production is included in global pdf fits using NLO matrix elements, perhaps supplemented by threshold corrections to make an approximate NNLO prediction (do we talk about threshold corrections?). Thus, any current NNLO global pdf analyses are still approximate for this reason, but in practice the approximation should work reasonably well. Full NNLO precision awaits the completion of the NNLO inclusive jet cross section, though.

Fig. 1: A plot showing the x and Q2 values needed for the colliding partons to produce a final state with mass M and rapidity y at the LHC (14 TeV). The parton momentum fractions are x1,2 = (M/14 TeV) exp(±y), with Q = M; contours for M = 10 GeV, 100 GeV, 1 TeV and 10 TeV, and the fixed target and HERA kinematic coverage, are indicated.

Current evolution programs in use should be able to carry out the evolution using NLO DGLAP to an accuracy of a few percent over the hadron collider kinematic range, except perhaps at very large x and very small x.

The kinematics appropriate for the production of a state of mass M and rapidity y at the LHC was shown in Figure 1 in Section ??.

For example, to produce a state of mass 100 GeV and rapidity 2 requires partons of x values 0.05 and 0.001 at a Q2 value of 1 × 10^4 GeV2. Compare this figure to the scatterplot of the x and Q2 range included in the recent CT10 fit and it is clear that an extrapolation to higher Q2 (M2) is required for predictions for many of the LHC processes of interest.
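The quoted numbers follow from the leading-order 2 → 1 kinematics x1,2 = (M/√s) exp(±y) with Q = M; a short check in plain Python (illustrative values only):

import math

def parton_x(M_GeV, y, sqrt_s_GeV=14000.0):
    """Leading-order 2->1 kinematics: x1,2 = (M/sqrt(s)) * exp(+-y), Q = M."""
    tau = M_GeV / sqrt_s_GeV
    return tau * math.exp(y), tau * math.exp(-y)

x1, x2 = parton_x(100.0, 2.0)
print(f"x1 = {x1:.4f}, x2 = {x2:.5f}, Q^2 = {100.0**2:.0f} GeV^2")
# gives x1 ~ 0.053 and x2 ~ 0.001, consistent with the values quoted in the text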

3. Parameterizations and schemes

A global pdf analysis carried out at NLO or NNLO needs to be performed in a specific renormalization and factorization scheme. The evolution kernels are calculated in a specific scheme and, to maintain consistency, any hard scattering cross section calculations used for the input processes or utilizing the resulting pdfs need to have been implemented in that same renormalization scheme. As we saw earlier in Chapter ??, one needs to specify a scheme or convention in subtracting the divergent terms from the pdfs; basically the scheme specifies how much of the finite corrections to subtract along with the divergent pieces. Almost universally, the MS-bar scheme is used; using dimensional regularization, in this scheme the pole terms and accompanying log 4π and Euler constant terms are subtracted.

Page 4: Parton Distribution Functions · = (M/14 TeV) exp( y) Q = M LHC parton kinematics M = 10 GeV M = 100 GeV M = 1 TeV M = 10 TeV y = 6 4 2 0 2 4 6 Q 2 (GeV 2) x Fig. 1: A plot showing

Fig. 2: CTEQ6.5 up and down quark distributions normalized to those of CTEQ6.1, showing the impact of the heavy quark mass corrections.

should we explicitly show the subtraction terms? PDFs are also available in the DIS scheme (where the full order-αs corrections for F2 are absorbed into the quark pdfs), in a fixed flavour scheme (see, for example, GRV [?]) and in several schemes that differ in their specific treatment of the charm quark mass.

Basically all modern pdfs now incorporate a treatment of heavy quark effects in their fits, either via the ACOT general-mass (GM) variable flavor number scheme [] (supplemented by a unified treatment of both kinematical and dynamical effects using the S-ACOT [] and ACOT-χ [] concepts), used by CTEQ, via the Thorne-Roberts scheme [], used by both MSTW and HERAPDF, or via the FONLL scheme, used by NNPDF [].

What about GJR and ABKM?

Incorporation of the full heavy-quark mass effects in the general-mass formalism suppresses the heavy flavor contributions to the DIS structure functions, especially at low x and Q2. In order for the theoretical calculations in the global fits to agree with the data in these kinematic regions, the contributions of the light quark and anti-quark pdfs must increase accordingly. This has a noticeable impact, especially on predictions for W and Z cross sections at the LHC.

Figure 2 shows the impact of the heavy quark mass corrections on the up and down quark distributions for CTEQ6.5, at a Q value of 2 GeV. The CTEQ6.5 up and down quark distributions are normalized to the corresponding ones from CTEQ6.1 (which does not have the heavy quark mass corrections). The shaded areas indicate the CTEQ6.1 pdf uncertainty. The dashed curves represent slightly different parameterizations for the CTEQ6.5 pdfs. The heavy quark mass corrections have a strong effect (larger than the pdf uncertainty for CTEQ6.1) at low x, in a region sensitive to W and Z production at the LHC.

The impact of general-mass variable flavour number schemes (GM-VFNS) lies mostly in the low x and Q2 regions. Aside from modifications to the fits to the HERA data, and the commensurate change in the fitted pdfs, there is basically no modification for predictions at high Q2 at the LHC. Thus, it is fully consistent to use GM-VFNS pdfs with matrix elements calculated in the MS-bar scheme.

should this be discussed in more detail/more elegantly


It is also possible to use only leading-order matrix element calculations in the global fits, which results in leading-order parton distribution functions; these have been made available by both the CTEQ and MRST groups. For many of the hard matrix elements for processes used in the global analysis, there exist K-factors significantly different from unity. Thus, one expects there to be noticeable differences between the LO and NLO parton distributions (and indeed this is often the case).

All global analyses use a generic form for the parameterization of both the quark and gluon distributions at some reference value Q0:

F(x, Q_0) = A_0\, x^{A_1} (1 - x)^{A_2}\, P(x; A_3, A_4, \ldots)    (1)

The reference value Q0 is usually chosen in the range of 1-2 GeV. The parameter A1 is associated with small-x Regge behaviour, while A2 is associated with large-x valence counting rules. We expect A1 to be approximately −1 for gluons and anti-quarks, and of the order of 1/2 for valence quarks, from the Regge arguments mentioned in Chapter X. Counting rule arguments tell us that the A2 parameter should be related to 2ns − 1, where ns is the minimum number of spectator quarks. So, for valence quarks in a proton, there are two spectator quarks, and we expect A2 = 3. For a gluon, there are three spectator quarks, and A2 = 5; for anti-quarks in a proton, there are four spectator quarks, and thus A2 = 7. Such arguments are useful, for example in telling us that the gluon distribution should fall more rapidly with x than the quark distributions, but it is not clear exactly at what value of Q the arguments made above are valid.

The first two factors, in general, are not sufficient to completely describe either quark or gluon distributions. The term P(x; A3, ...) is a suitably chosen smooth function, depending on one or more parameters, that adds more flexibility to the pdf parameterization. P(x; A3, ...) is chosen so as to tend towards a constant for x approaching either 0 or 1.
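As an illustration only, a toy version of the parameterization of Eq. (1) with one common choice of P(x) is sketched below; the functional form of P and all parameter values are assumptions made for the example, not taken from any published fit.

import numpy as np

def pdf_param(x, A0, A1, A2, A3=0.0, A4=0.0):
    """Generic input parameterization F(x,Q0) = A0 x^A1 (1-x)^A2 P(x),
    with the illustrative choice P(x) = 1 + A3*sqrt(x) + A4*x."""
    x = np.asarray(x, dtype=float)
    return A0 * x**A1 * (1.0 - x)**A2 * (1.0 + A3 * np.sqrt(x) + A4 * x)

# Toy valence-like and gluon-like shapes, with Regge/counting-rule exponents:
x = np.logspace(-4, -0.01, 50)
valence_like = pdf_param(x, A0=1.0, A1=0.5, A2=3.0)
gluon_like = pdf_param(x, A0=1.0, A1=-1.0, A2=5.0)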

In general, both the number of free parameters and the functional form can have an influence on the global fit. A too-limited parameterization can lead not only to a worse description of the data, but also to pdfs in different kinematic regions being tied together not by the physics, but by the limitations of the parameterization. Note that the parameterization forms shown here imply that the pdfs are positive-definite. As they are not physical objects by themselves, it is possible for them to be negative, especially at low Q2. Some pdf groups (such as CTEQ) use a positive-definite form for the parameterization; others do not. For example, the MSTW2008 gluon distribution is negative for x <?, Q2 =? GeV2. Evolution quickly brings the gluon into positive territory.

The NNPDF approach attempts to minimize the parameterization bias by exploring global fits using a large number of free parameters in a Monte Carlo approach. The general form for NNPDF can be written as fi(x, Q0) = ci(x) NNi(x), where NNi(x) is a neural network and ci(x) is a "pre-processing function". The preprocessing function is not fitted, but rather chosen randomly in a space of functions of the general form of Equation X (more detail here?). The CT10 NLO fit uses 26 free parameters (many of the pdf parameters are either fixed at reasonable values, or are constrained by sum rules), while MSTW08 uses 20 free parameters, and NNPDF effectively has 259 free parameters.

The pdfs made available to the world by the global analysis groups can either be in a form where the x and Q2 dependence is parameterized, or the pdfs for a given x and Q2 range can be interpolated from a grid that is provided, or the grid can be generated given the starting parameters for the pdfs (see the discussion on LHAPDF in Section 8.). All techniques should provide an accuracy on the output pdf distributions of the order of a few percent.

The parton distributions from the CT10 NLO pdf release are plotted in Figure 3 at a Q value of 10 GeV. The gluon distribution is dominant at x values of less than 0.01, with the valence quark distributions dominant at higher x. One of the major influences of the HERA data has been to steepen the gluon distribution at low x. The CT10 up quark, up-bar quark, b quark and gluon distributions are shown as a function of Q2 for x values of 0.001, 0.01, 0.1 and 0.3 in Figures ??. At low x, the pdfs increase with Q2, while at higher x the pdfs decrease with Q2; both effects are due to DGLAP evolution.


Fig. 3: The CT10 parton distribution functions evaluated at a Q of 10 GeV. (have to scale gluon by 0.1)


4. Uncertainties on pdfs

In addition to having the best estimates for the values of the pdfs in a given kinematic range, it is also important to understand the allowed range of variation of the pdfs, i.e. their uncertainties. A conventional method of estimating parton distribution uncertainties has been to compare different published parton distributions. This is unreliable, since most published sets of parton distributions adopt similar assumptions and the differences between the sets do not fully explore the uncertainties that actually exist.

The sum of the quark distributions (Σ fq/p(x, Q2) + fg/p(x, Q2)) is, in general, well-determined over a wide range of x and Q2. As stated above, the quark distributions are predominantly determined by the DIS and DY data sets, which have large statistics and systematic errors in the few percent range (±3% for 10^−4 < x < 0.75). Thus the sum of the quark distributions is basically known to a similar accuracy. The individual quark flavours, though, may have a greater uncertainty than the sum. This can be important, for example, in predicting distributions that depend on specific quark flavours, like the W asymmetry distribution [?] and the W and Z rapidity distributions.

The largest uncertainty of any parton distribution, however, is that on the gluon distribution. The gluon distribution can be determined indirectly at low x by measuring the scaling violations in the quark distributions, but a direct measurement is necessary at moderate to high x.


Fig. 4: The CT10 up quark, up-bar quark, b quark and gluon parton distribution functions evaluated as a function of Q2 at x values of 0.001 (left) and 0.01 (right).

Fig. 5: The CT10 up quark, up-bar quark, b quark and gluon parton distribution functions evaluated as a function of Q2 at x values of 0.1 (left) and 0.3 (right).


About 43% of the momentum of the proton is carried by gluons, and most of that momentum is at relatively small x (16% of the momentum of the proton, for example, is carried by gluons in the x range from 0.01 to 0.1). The best direct information on the gluon distribution at moderate to high x comes from jet production at the Tevatron (and the LHC).
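The momentum fractions quoted above can be checked numerically by integrating the gluon momentum density x g(x, Q) over x; the sketch below assumes the LHAPDF 6 Python bindings, an illustrative choice of pdf set, and an illustrative scale, none of which are specified in the text.

import numpy as np
import lhapdf  # assumes the LHAPDF 6 Python bindings are installed

pdf = lhapdf.mkPDF("CT10", 0)   # illustrative set; any modern set would do
Q = 2.0                         # GeV; the momentum fractions are Q-dependent

def gluon_momentum_fraction(xmin, xmax, n=2000):
    """Numerically integrate x*g(x,Q) dx over [xmin, xmax]."""
    # integrate in t = log(x) to resolve the small-x region; dx = x dt
    t = np.linspace(np.log(xmin), np.log(xmax), n)
    x = np.exp(t)
    integrand = np.array([pdf.xfxQ(21, xi, Q) for xi in x]) * x  # xfxQ = x*g(x)
    return np.trapz(integrand, t)

print("total gluon momentum fraction:", gluon_momentum_fraction(1e-6, 1.0 - 1e-6))
print("fraction for 0.01 < x < 0.1:", gluon_momentum_fraction(0.01, 0.1))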

There has been a great deal of activity on the subject of pdf uncertainties. Two techniques in particular, the Lagrange Multiplier and Hessian techniques, have been used by CTEQ and MSTW to estimate pdf uncertainties [?, ?, ?]. The Lagrange Multiplier technique is useful for probing the pdf uncertainty of a given process, such as the W cross section, while the Hessian technique provides a more general framework for estimating the pdf uncertainty for any cross section. In addition, the Hessian technique results in tools more accessible to the general user.

4.1 The Hessian method

The Hessian method for pdf determination involves minimizing a suitable log-likelihood function. The χ2 function may contain the full set of correlated errors, or only a partial set. The correlated systematic errors may be accounted for using a covariance matrix, or as a shift to the data, adopting a χ2 penalty proportional to the size of the shift divided by the systematic error.

have to rewrite; taken from CT10 paper

As an example, consider the CT10 fits to the combined HERA Run 1 neutral current (e+p) cross sections shown in Figure ??. When comparing each experimental value Dk with the respective theory value Tk({a}) (dependent on PDF parameters {a}), we account for possible systematic shifts in the data, estimated by the correlation matrix βkα. There are Nλ = 114 independent sources of experimental systematic uncertainty, quantified by parameters λα that should obey the standard normal distribution. The contribution of the combined HERA-1 set to the log-likelihood function χ2 is given by

\chi^2(\{a\},\{\lambda\}) = \sum_{k=1}^{N} \frac{1}{s_k^2}\left( D_k - T_k(\{a\}) - \sum_{\alpha=1}^{N_\lambda} \lambda_\alpha \beta_{k\alpha} \right)^2 + \sum_{\alpha=1}^{N_\lambda} \lambda_\alpha^2 ,    (2)

where N is the total number of points, and s_k = \sqrt{s_{k,\mathrm{stat}}^2 + s_{k,\mathrm{uncor\ sys}}^2} is the total uncorrelated error on the measurement Dk, obtained by summing the statistical and uncorrelated systematic errors on Dk in quadrature. Minimization of χ2 with respect to the systematic parameters λα is realized algebraically, by the procedure explained in Refs. [?, ?].
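The algebraic minimization over the λα mentioned above amounts to solving a small linear system; a schematic numpy version (in the notation of Eq. (2), and not the actual CT10 fitting code) is:

import numpy as np

def chi2_with_shifts(D, T, beta, s):
    """chi^2 of Eq. (2), with the systematic shift parameters lambda_alpha
    profiled out analytically.
    D, T : arrays of length N (data and theory)
    beta : N x Nlam matrix of correlated systematic errors
    s    : array of length N of total uncorrelated errors"""
    r = D - T                         # residuals
    w = 1.0 / s**2                    # uncorrelated weights
    # Minimizing Eq. (2) over lambda gives the linear system
    # (1 + beta^T W beta) lambda = beta^T W r, with W = diag(w).
    A = np.eye(beta.shape[1]) + beta.T @ (w[:, None] * beta)
    b = beta.T @ (w * r)
    lam = np.linalg.solve(A, b)
    shifted = r - beta @ lam          # residuals after the optimal shifts
    return np.sum(w * shifted**2) + np.sum(lam**2), lam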

The plot on the left shows a comparison of the unshifted data and the shifted data. The plot on the right shows the CT10 NLO predictions compared to the shifted data. The CT10 predictions show good agreement with the combined H1/ZEUS set of reduced DIS cross sections. A χ2 ≈ 680 is obtained for the N = 579 data points of the combined HERA-1 sample that pass the standard CTEQ kinematical cuts for the included DIS data, Q > 2 GeV and W > 3.5 GeV. Apart from some excessive scatter of the NC e±p data around the theory predictions, which results in a slightly higher-than-ideal value of χ2/N = 1.18, NLO theory describes the overall data well, without systematic discrepancies.

The data points shown in the figure include systematic shifts bringing theoretical and experimental values closer to one another, by allowing the systematic parameters λα to vary within the bounds allowed by the experimental correlation matrix βkα. As expected, the best-fit values of λα are distributed consistently with the standard normal distribution. Their contribution Σα λα^2 ≈ 65 to χ2 in Eq. 2 is smaller than the value of 114 expected on statistical grounds.

The histogram of λα values obtained in the CT10 fit is shown in Fig. 7. The histogram is clearly compatible with its stated Gaussian behavior. In each fit, one observes 1-2 values at (±)2-3σ, but those tend to have a large PDF uncertainty (up to 3σ) and are not persistent in all fits.

Fig. 6: A comparison of the unshifted and shifted HERA-1 combined neutral current data (left) and the comparison of the NLO CT10 predictions to the shifted data (right). The reduced cross sections σr^NC(x,Q2) · 2^i for e+p → e+X are plotted versus Q2 [GeV2] for x bins ranging from x = 6.18 × 10^−5 down to x = 0.65.

The overall agreement with the combined HERA-1 data is slightly worse than with the separate data sets, as a consequence of some increase in χ2/N for the NC data at x < 0.001 and x > 0.1.

In this case, the absolute size of the systematic error shifts needed is small. This need not be the case, for example, for cross sections with larger systematic errors, such as inclusive jet production.

The Hessian method results in the production of a central (best fit) pdf and a set of error pdfs. In this method a large matrix (26×26 for CTEQ, 20×20 for MSTW), with dimension equal to the number of free parameters in the fit, has to be diagonalized. The result is 26 (20) orthonormal eigenvector directions for CTEQ (MSTW) which provide the basis for the determination of the pdf error for any cross section. Thus, there are 52 error pdfs for the CT10 error set and 40 for the MSTW08 error set. This process is shown schematically in Figure 8. The eigenvectors are now admixtures of the 26 pdf parameters left free in the global fit. There is a broad range for the eigenvalues, covering more than a factor of one million. The eigenvalues are distributed roughly linearly in log εi, where εi is the eigenvalue for the i-th direction. The larger eigenvalues correspond to directions which are well-determined; for example, eigenvectors 1 and 2 are sensitive primarily to the valence quark distributions at moderate x, a region where they are well-constrained. The theoretical uncertainty on the determination of the W mass at both the Tevatron and the LHC depends primarily on these two eigenvector directions, as W production at the Tevatron proceeds primarily through collisions of valence quarks. The most significant eigenvector directions for the determination of the W mass at the LHC correspond to larger eigenvector numbers, which are primarily determined by sea quark distributions. In most cases, an eigenvector cannot be directly tied to the behaviour of a particular pdf in a specific kinematic region. There are exceptions, such as eigenvector 15 in the CTEQ6.1 fit, discussed below.

Fig. 7: Distribution of the systematic parameters λα of the combined HERA-1 data set [?] in the CT10 best fit (CT10.00).

Fig. 8: A schematic representation of the transformation from the pdf parameter basis to the orthonormal eigenvector basis.

Perhaps the most controversial aspect of pdf uncertainties is the determination of the Δχ2 excursion from the central fit that is representative of a reasonable error. The excursion is written as Δχ2 = T^2, where T is the tolerance, and nominally T = 1 would correspond to a 1σ (68% CL) error. PDF fits performed with a limited number of experiments may be able to maintain that criterion. For example, HERAPDF uses a χ2 excursion of 1 for a 1σ error. For general global fits, such as those from CTEQ and MSTW, however, a χ2 excursion of 1 (for a 1σ error) is too low a value. These global fits use data sets arising from a number of different processes and different experiments; there is a non-negligible tension between some of the different data sets. A larger variation in Δχ2 is required for a 68% CL. For example, CT10 uses a tolerance T = 10 for a 90% CL error, corresponding to T = 6.1 for a 68% CL error, while MSTW uses a dynamical tolerance (varying from 1 to 6.5) for each eigenvector.

The uncertainties for all predictions should be linearly dependent on the tolerance parameter used; thus, it should be reasonable to scale the uncertainty for an observable from the 90% CL limit provided by the CTEQ/MSTW error pdfs to a one-sigma error by dividing by a factor of 1.6 (MSTW also provides separate 68% CL error pdfs). Such a scaling will be a better approximation for observables more dependent on the low-number eigenvectors, where the χ2 function is closer to a quadratic form.

Even though the data sets and definitions of tolerance differ among the pdf groups, we will see in Chapter X that the pdf uncertainties at the LHC are fairly similar. Note that relying on the errors determined from a single pdf group may lead to an underestimate of the true pdf uncertainty, as the central results among the pdf groups often differ by an amount similar to this one-sigma error. (See the discussion in Chapter ?? regarding the PDF4LHC comparisons of predictions and uncertainties for the LHC.)



Each error pdf results from an excursion along the "+" and "−" directions for each eigenvector. Consider a variable X; its value using the central pdf for an error set (say CT10) is given by X0. X_i^+ is the value of that variable using the pdf corresponding to the "+" direction for eigenvector i, and X_i^− the value for the variable using the pdf corresponding to the "−" direction. The excursions are symmetric for the larger eigenvalues, but may be asymmetric for the more poorly determined directions. In order to calculate the pdf error for an observable, a Master Equation should be used:

\Delta X^+_{\max} = \sqrt{\sum_{i=1}^{N} \left[\max\left(X_i^+ - X_0,\; X_i^- - X_0,\; 0\right)\right]^2}

\Delta X^-_{\max} = \sqrt{\sum_{i=1}^{N} \left[\max\left(X_0 - X_i^+,\; X_0 - X_i^-,\; 0\right)\right]^2}    (3)

ΔX^+ adds in quadrature the pdf error contributions that lead to an increase in the observable X, and ΔX^− the pdf error contributions that lead to a decrease. The addition in quadrature is justified by the eigenvectors forming an orthonormal basis. The sum is over all N eigenvector directions, or 20 in the case of CTEQ6.1. Ordinarily, X_i^+ − X0 will be positive and X_i^− − X0 will be negative, and thus it is trivial as to which term is to be included in each quadratic sum. For the higher-number eigenvectors, however, the "+" and "−" contributions may be in the same direction (see for example eigenvector 17 in Figure 9). In this case, only the most positive term will be included in the calculation of ΔX^+ and the most negative in the calculation of ΔX^−. Thus, there may be fewer than N terms for either the "+" or "−" direction. There are other versions of the Master Equation in current use, but the version listed above is the "official" recommendation of the authors.
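A compact numerical form of Eq. (3), assuming the observable has already been evaluated for the central pdf and for each of the 2N error pdfs, might read:

import numpy as np

def hessian_asymmetric_errors(X0, X_plus, X_minus):
    """Master Equation, Eq. (3): asymmetric pdf uncertainties on an observable.
    X_plus[i], X_minus[i] are the values of the observable for the '+' and '-'
    excursions along eigenvector i; X0 is the central value."""
    X_plus, X_minus = np.asarray(X_plus), np.asarray(X_minus)
    up = np.maximum.reduce([X_plus - X0, X_minus - X0, np.zeros_like(X_plus)])
    down = np.maximum.reduce([X0 - X_plus, X0 - X_minus, np.zeros_like(X_plus)])
    return np.sqrt(np.sum(up**2)), np.sqrt(np.sum(down**2))

# For a 90% CL error set such as CT10, the result can be rescaled to a
# one-sigma error by dividing by ~1.6, as discussed in the text.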

There are two things that can happen when new pdfs (eigenvector directions) are added. First, a new direction in parameter space can be opened up to which some cross sections will be sensitive (an example of this is eigenvector 15 in the CTEQ6.1 error pdf set, which is sensitive to the high x gluon behaviour and thus influences the high pT jet cross section at the Tevatron and LHC). This particular eigenvector direction happens to be dominated by a parameter which affects mostly the large x behavior of the gluon distribution.

In this case, a smaller parameter space leads to an underestimate of the true pdf error, since it does not sample a direction important for some physics. In the second case, adding new eigenvectors does not appreciably open new parameter space and the new parameters should not contribute much pdf error to most physics processes (although the error may be redistributed somewhat among the new and old eigenvectors). The high x gluon uncertainty did not decrease significantly in the CTEQ pdfs produced after the CTEQ6.1 set (CTEQ6.6, CT09, CT10), but in these later fits there is no single eigenvector similar to eigenvector 15 in the CTEQ6.1 pdf set that encompasses most of the high x gluon uncertainty. Instead this uncertainty is spread among several different eigenvectors.

In Figure 9, the pdf errors are shown in the "+" and "−" directions for the 20 CTEQ eigenvector directions for predictions for inclusive jet production at the Tevatron from the CTEQ6.1 pdfs (I'll try to get the Mathematica plots for CT10). The excursions are symmetric for the first 10 eigenvectors but can be asymmetric for the last 10, as they correspond to poorly determined directions.

Either X0 and X_i^± can be calculated separately in a matrix element/Monte Carlo program (requiring the program to be run 2N + 1 times), or X0 can be calculated with the program and at the same time the ratio of the pdf luminosities (the product of the two pdfs at the x values used in the generation of the event) for eigenvector i (±) to that of the central fit can be calculated and stored. This results in an effective sample with 2N + 1 weights, but identical kinematics, requiring a substantially reduced amount of time to generate.


Fig. 9: The pdf errors for the CDF inclusive jet cross section in Run 1 for the 20 different eigenvector directions contained in the CTEQ6.1 pdf error set. The vertical axes show the fractional deviation from the central prediction and the horizontal axes the jet transverse momentum in GeV.


As an example of pdf uncertainties using the Hessian method, the CT10 and MSTW2008 uncertainties for the up quark and gluon distributions are shown in Figures ?? and ??. While the CT10 and MSTW2008 pdf distributions and uncertainties are reasonably close to each other, some differences are evident, especially at low and high x.

discuss re-diagonalization here?

4.2 The NNPDF approach

4.3 Pdf uncertainties and Sudakov form factors

As discussed in the section above, it is often useful to use the error pdf sets with parton shower Monte Carlos. The caveat still remains that a true test of the acceptances would use an NLO MC. Similar to their use with matrix element calculations, events can be generated once using the central pdf and the pdf weights stored for the error pdfs. These pdf weights can then be used to construct the pdf uncertainty for any observable. Some sample code for PYTHIA is given on the benchmark website. One additional complication with respect to their use in matrix element programs is that the parton distributions are used to construct the initial state parton showers through the backward evolution process. The space-like evolution of the initial state partons is guided by the ratio of parton distribution functions at different x and Q2 values, c.f. ??. Thus the Sudakov form factors in parton shower Monte Carlos will be constructed using only the central pdf and not with any of the individual error pdfs, and this may lead to some errors in the calculation of the pdf uncertainties of some observables. However, it was demonstrated in Reference [?] that the pdf uncertainty for Sudakov form factors in the kinematic region relevant for the LHC is minimal, and the weighting technique can be used just as well with parton shower Monte Carlos as with matrix element programs.


Fig. 10: A comparison of the CT10 and MSTW2008 up quark (left) and gluon (right) pdf uncertainty bands at a Q2 value of 100 GeV2.

5. Choice of αs(mZ) and related uncertainties

Global pdf fits are sensitive to the value of the strong coupling constant αs, explicitly through the QCD cross sections used in the fits, and implicitly through the scaling violations observed in DIS. In fact, a global fit can be used to determine the value of αs(mZ), albeit less accurately than the world average. Some pdf groups use the world average value of αs(mZ) as a fixed constant in the global fits, while other groups allow αs(mZ) to be a free parameter in the fit. It is also possible to explore the effects of the variation of αs(mZ) by producing pdfs at different αs(mZ) values. The values of αs(mZ) at NLO and their uncertainties are shown in Figure X for the six pdf groups.

to be developed further show gluon distribution for different alphas values and discuss compensation

It is expected that the LO value of αs(mZ) is considerably larger than the NLO value. There is some tendency for the NNLO value of αs to be slightly smaller than the value at NLO, due to the (mostly) positive NNLO corrections for most processes.

talk about re-diagonalization and orthogonality of pdf and alphas uncertainties

6. NLO and LO pdfs

Global pdf fitting groups have traditionally produced sets of pdfs in which leading order rather than next-to-leading order matrix elements, along with the 1-loop αS rather than the 2-loop αS, have been used to fit the input datasets. The resultant leading order pdfs have most often been used in conjunction with leading order matrix element programs or parton shower Monte Carlos. However, the leading order pdfs of a given set will tend to differ from the central pdfs in the NLO fit, and in fact will most often lie outside the pdf error band. Such is the case for the up quark distribution shown in Figure ?? and the gluon distribution shown in Figure ??, where the LO pdfs are plotted along with the NLO pdf error bands.


Fig. 11: The CTEQ6L1 up quark and gluon pdfs compared to the CT10 NLO pdf error bands for the same.

The LO up quark distribution is considerably larger than its NLO counterpart at both small x and large x. This is due to (1) the larger gluon distribution at small x for the LO pdf and (2) the influence of missing log(1 − x) terms in the LO DIS matrix element. The gluon distribution is outside of the NLO error band for essentially all x. It is higher than the NLO gluon distribution at small x due to missing log(1/x) terms in the LO DIS matrix element. It is smaller than the NLO gluon distribution at large x, basically due to the momentum sum rule and the lack of constraints at high x.

The global pdf fits are dominated by the high statistics, low systematic error deep inelastic scattering data, and the differences between the LO and NLO pdfs are determined most often by the differences between the LO and NLO matrix elements for deep inelastic scattering. This is especially true at low x and at high x, due to missing terms that first arise in the hard matrix elements for DIS at NLO. As the NLO corrections for most processes of interest at the LHC are reasonably small, the use of NLO pdfs in conjunction with LO matrix elements will most often give a closer approximation of the full NLO result (although the result remains formally LO). In many cases in which a relatively large K-factor results from a calculation of collider processes, the primary cause is the difference between LO and NLO pdfs, rather than the difference between LO and NLO matrix elements.

In most cases, LO pdfs will be used not in fixed order calculations, but in programs where the LO matrix elements have been embedded in a parton shower framework. In the initial state radiation algorithms in these frameworks, shower partons are emitted at non-zero angles with finite transverse momentum, and not with the zero kT implicit in the collinear approximation. It might be argued that the resulting kinematic suppression due to parton showering should be taken into account when deriving pdfs for explicit use in Monte Carlo programs. Indeed, there is substantial kinematic suppression for production of a low-mass (10 GeV) object at forward rapidities due to this effect, but the suppression becomes minimal once the mass rises to the order of 100 GeV [].

Fig. 12: A comparison of NLO predictions for SM boson rapidity distributions to LO predictions for the same, using CTEQ6.6 and CTEQ6L1 pdfs, respectively. The panels show the W+, W−, Z and Higgs rapidity distributions (dσ/dy in nb, or pb for the Higgs).

6.01 Modified LO PDFs

Due to the inherent differences between LO and NLO pdfs, and the relatively small differences between LO and NLO matrix elements for processes of interest at the LHC, LO calculations at the LHC using LO pdfs often lead to erroneous predictions. This is true not only of the normalization of the cross sections, but also of the kinematic shapes. This can be seen, for example, in the predictions for the W+/−/Z and Higgs rapidity distributions shown in Figure 12, where the wrong shapes for the vector boson rapidity distributions result from the deficiencies of the LO DIS matrix elements used in the fit. This can have an impact, for example, if the LO predictions are used to calculate final-state acceptances.

In an attempt to reduce the size of the errors obtained using LO pdfs with LO predictions, modified LO pdfs have been produced. The techniques used to produce these modified pdfs include (1) relaxing the momentum sum rule in the global fit and (2) using NLO pseudo-data in order to try to steer the fit towards the desired NLO behavior. Both the CTEQ [] and MRST [] modified LO pdfs use the first technique, while the CTEQ pdfs use the second technique as well.

A comparison of the full NLO and modified LO pdf predictions for W/Z and Higgs production at the LHC is shown in Figure 13 for three different LHC center-of-mass energies. It can be seen that the modified LO pdfs lead to substantially better agreement with the NLO predictions than observed in the figure above, which used LO pdfs.

Of course, the desired behavior can also be obtained (in most cases) by the use of NLO pdfs in the LO calculation. Here, care must be taken that only positive-definite NLO pdfs be used.

Increasingly, most processes of interest have been included in NLO parton shower Monte Carlos. Here, the issue of LO pdfs becomes moot, as NLO pdfs must be used in such programs for consistency with the matrix elements.

Fig. 13: A comparison of LO, NLO and modified LO predictions for the W+ and Higgs (MH = 120 GeV) rapidity distributions at the LHC, at center-of-mass energies of 7, 10 and 14 TeV, using the NLO CTEQ6.6, LO CT09MC2 and LO MRST2007lomod pdfs.

6.1 NLO and NNLO pdfs

The transition from NLO to NNLO results in much smaller changes to the pdfs, as can be observed in Figure ??.

have to update these figures to CT10 NNLO. also, discuss new features at NNLO?

7. PDF correlations

The uncertainty analysis may be extended to define a correlation between the uncertainties of two variables, say X(~a) and Y(~a). As for the case of PDFs themselves, the physical concept of PDF correlations can be determined both from PDF determinations based on the Hessian approach and from those based on the Monte Carlo approach.

7.1 PDF correlations in the Hessian approach

Consider the projection of the tolerance hypersphere onto a circle of radius 1 in the plane of the gradients ∇X and ∇Y in the parton parameter space [?, ?]. The circle maps onto an ellipse in the XY plane. This "tolerance ellipse" is described by Lissajous-style parametric equations,

X = X_0 + \Delta X \cos\theta,    (4)
Y = Y_0 + \Delta Y \cos(\theta + \varphi),    (5)

where the parameter θ varies between 0 and 2π, X0 ≡ X(~a0), and Y0 ≡ Y(~a0). ΔX and ΔY are the maximal variations δX ≡ X − X0 and δY ≡ Y − Y0 evaluated according to the Master Equation, and φ is the angle between ∇X and ∇Y in the {ai} space, with

\cos\varphi = \frac{\vec\nabla X \cdot \vec\nabla Y}{\Delta X\, \Delta Y} = \frac{1}{4\,\Delta X\, \Delta Y} \sum_{i=1}^{N} \left( X_i^{(+)} - X_i^{(-)} \right) \left( Y_i^{(+)} - Y_i^{(-)} \right).    (6)

The quantity cos φ characterizes whether the PDF degrees of freedom of X and Y are correlated (cos φ ≈ 1), anti-correlated (cos φ ≈ −1), or uncorrelated (cos φ ≈ 0). If units for X and Y are rescaled so that ΔX = ΔY (e.g., ΔX = ΔY = 1), the semimajor axis of the tolerance ellipse is directed at an angle π/4 (or 3π/4) with respect to the ΔX axis for cos φ > 0 (or cos φ < 0).


Fig. 14: A comparison of LO, NLO and NNLO up quark and gluon pdfs.

In these units, the ellipse reduces to a line for cos φ = ±1 and becomes a circle for cos φ = 0, as illustrated by Fig. 15. These properties can be found by diagonalizing the equation for the correlation ellipse, given in Eq. (8) below. Its semiminor and semimajor axes (normalized to ΔX = ΔY) are

\{a_{\rm minor},\, a_{\rm major}\} = \frac{\sin\varphi}{\sqrt{1 \pm \cos\varphi}}.    (7)

The eccentricity ε ≡ \sqrt{1 - (a_{\rm minor}/a_{\rm major})^2} is therefore approximately equal to \sqrt{|\cos\varphi|} as |cos φ| → 1.

The correlation ellipse itself is given by

\left(\frac{\delta X}{\Delta X}\right)^2 + \left(\frac{\delta Y}{\Delta Y}\right)^2 - 2 \left(\frac{\delta X}{\Delta X}\right) \left(\frac{\delta Y}{\Delta Y}\right) \cos\varphi = \sin^2\varphi.    (8)

A magnitude of |cos φ| close to unity suggests that a precise measurement of X (constraining δX to be along the dashed line in Fig. 15) is likely to constrain tangibly the uncertainty δY in Y, as the value of Y shall lie within the needle-shaped error ellipse. Conversely, cos φ ≈ 0 implies that the measurement of X is not likely to constrain δY strongly.^1

The values of ΔX, ΔY, and cos φ are also sufficient to estimate the PDF uncertainty of any function f(X, Y) of X and Y, by relating the gradient of f(X, Y) to ∂_X f ≡ ∂f/∂X and ∂_Y f ≡ ∂f/∂Y via the chain rule:

\Delta f = \left|\vec\nabla f\right| = \sqrt{(\Delta X\, \partial_X f)^2 + 2\,\Delta X\, \Delta Y \cos\varphi\; \partial_X f\, \partial_Y f + (\Delta Y\, \partial_Y f)^2}.    (9)

^1 The allowed range of δY/ΔY for a given δ ≡ δX/ΔX is r_Y^{(-)} ≤ δY/ΔY ≤ r_Y^{(+)}, where r_Y^{(\pm)} ≡ \delta\cos\varphi \pm \sqrt{1-\delta^2}\,\sin\varphi.

Fig. 15: Correlation ellipses for a strong correlation (left), no correlation (center) and a strong anti-correlation (right) [?].

Of particular interest is the case of a rational function f(X, Y) = X^m/Y^n, pertinent to computations of various cross section ratios, cross section asymmetries, and the statistical significance for finding signal events over background processes [?]. For rational functions Eq. (9) takes the form

\frac{\Delta f}{f_0} = \sqrt{\left(\frac{m\,\Delta X}{X_0}\right)^2 - 2\,m\,n\, \frac{\Delta X}{X_0}\, \frac{\Delta Y}{Y_0} \cos\varphi + \left(\frac{n\,\Delta Y}{Y_0}\right)^2}.    (10)

For example, consider a simple ratio, f = X/Y. Then Δf/f0 is suppressed (Δf/f0 ≈ |ΔX/X0 − ΔY/Y0|) if X and Y are strongly correlated, and it is enhanced (Δf/f0 ≈ ΔX/X0 + ΔY/Y0) if X and Y are strongly anticorrelated.

As would be true for any estimate provided by the Hessian method, the correlation angle is inherently approximate. Eq. (6) is derived under a number of simplifying assumptions, notably the quadratic approximation for the χ2 function within the tolerance hypersphere, and the use of a symmetric finite-difference formula for {∂_i X} that may fail if X is not monotonic. With these limitations in mind, we find the correlation angle to be a convenient measure of interdependence between quantities of diverse nature, such as physical cross sections and the parton distributions themselves.

We can calculate the correlations between two pdfs, fa1(x1, μ1) and fa2(x2, μ2), at a scale μ1 = μ2 = 85 GeV. In Fig. 16 we show self-correlations for the up quark (left) and the gluon (right). Light (dark) shades of gray correspond to cos φ close to 1 (−1). Each self-correlation includes a trivial correlation (cos φ = 1) when x1 and x2 are approximately the same (along the x1 = x2 diagonals). For the up quark, this trivial correlation is the only pattern present. The gluon distribution, however, also shows a strong anti-correlation when one of the x values is large and the other small. This arises as a consequence of the momentum sum rule. A fairly complete set of correlation patterns, connected with the momentum sum rule, perturbative evolution, and constraints from experimental data, can be found in Ref. [].

In Chapter ??, the correlations for certain benchmark cross sections are given with respect to that for Z production. As expected, the W+ and W− cross sections are highly correlated with that for the Z, while the Higgs cross sections are uncorrelated (mHiggs = 120 GeV) or anti-correlated (mHiggs = 240 GeV). Thus, the PDF uncertainty for the ratio of the cross section for a 240 GeV Higgs boson to that for Z boson production is larger than the PDF uncertainty for Higgs boson production by itself. Correlations among various physics processes, especially those important for Higgs production, are discussed in Chapter ??.

A simple C code (corr.C) is available from the PDF4LHC website that calculates the correlation cosine between any two observables, given two text files that present the cross sections for each observable as a function of the error PDFs.
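For orientation, the same calculation can be sketched in a few lines of numpy (this is an illustration, not the corr.C code itself; the assumed input layout is the central value followed by the paired '+'/'−' error-set values, and the symmetric form of the Master Equation is used for ΔX and ΔY):

import numpy as np

def hessian_correlation(X, Y):
    """Correlation cosine of Eq. (6) between two observables.
    X, Y: arrays of length 2N+1 -- the central value followed by the
    '+' and '-' values for each of the N eigenvector directions."""
    Xp, Xm = X[1::2], X[2::2]          # assumed ordering: +1, -1, +2, -2, ...
    Yp, Ym = Y[1::2], Y[2::2]
    dX = 0.5 * np.sqrt(np.sum((Xp - Xm) ** 2))   # symmetric Hessian error
    dY = 0.5 * np.sqrt(np.sum((Yp - Ym) ** 2))
    return np.sum((Xp - Xm) * (Yp - Ym)) / (4.0 * dX * dY)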

Fig. 16: Contour plots of the correlation cosine between two CTEQ6.6 pdfs at Q = 85 GeV, for the up quark (left) and the gluon (right).

7.11 Correlations within the Monte Carlo approach

General correlations between PDFs and physical observables can be computed within the Monte Carlo approach used by NNPDF using standard textbook methods. To illustrate this point, let us compute the correlation coefficient ρ[A, B] for two observables A and B which depend on PDFs (or are PDFs themselves). This correlation coefficient in the Monte Carlo approach is given by

\[
\rho[A,B] \;=\; \frac{N_{\rm rep}}{N_{\rm rep}-1}\,
\frac{\langle AB\rangle_{\rm rep} - \langle A\rangle_{\rm rep}\langle B\rangle_{\rm rep}}{\sigma_A\,\sigma_B}\,, \qquad (11)
\]

where the averages are taken over the ensemble of the N_rep values of the observables computed with the different replicas in the NNPDF2.0 set, and σ_A, σ_B are the standard deviations of the ensembles. The quantity ρ characterizes whether two observables (or PDFs) are correlated (ρ ≈ 1), anti-correlated (ρ ≈ −1) or uncorrelated (ρ ≈ 0).
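In practice Eq. (11) is just a sample correlation over the replica ensemble. A minimal Python sketch, assuming A and B are arrays holding the observable evaluated on each of the N_rep replicas:

import numpy as np

def mc_correlation(A, B):
    """Correlation coefficient of Eq. (11) over an ensemble of MC replicas.

    A, B: arrays of length Nrep with the observable (or pdf value)
    evaluated on each replica.
    """
    A = np.asarray(A, dtype=float)
    B = np.asarray(B, dtype=float)
    nrep = len(A)
    sigA = A.std(ddof=1)    # standard deviations of the replica ensembles
    sigB = B.std(ddof=1)
    # Nrep/(Nrep-1) * (<AB> - <A><B>) / (sigma_A sigma_B)
    rho = (nrep / (nrep - 1.0)) * (np.mean(A * B) - A.mean() * B.mean()) / (sigA * sigB)
    return rho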

This correlation can be generalized to other cases, for example to compute the correlation between PDFs and the value of the strong coupling α_s(m_Z), as studied in Refs. [?, ?], for any given values of x and Q². For example, the correlation between the strong coupling and the gluon at a given x and Q² (or in general any other PDF) is defined as the usual correlation between two probability distributions, namely (equation to be supplied later)

where averages over replicas include PDF sets with varying α_s in the sense of Eq. (??). Note that the computation of this correlation takes into account not only the central gluons of the fits with different α_s but also the corresponding uncertainties in each case.

8. LHAPDF and other tools

8.1 LHAPDF and the Durham PDF plotter

Libraries such as PDFLIB [?] have been established that maintain a large collection of available pdfs. However, PDFLIB is no longer supported, making access to the most up-to-date pdfs more difficult. In addition, the determination of the pdf uncertainty on any cross section typically involves the use of a large number of pdfs (of the order of 30-100), and PDFLIB is not set up for easy access to such a large number of pdfs.


Fig. 17: Screen capture of the Durham pdf plotter website.

At Les Houches in 2001, representatives from a number of pdf groups defined an interface (Les Houches Accord 2, or LHAPDF) [?] that allows the compact storage of the information needed to define a pdf. Each pdf can be determined either from a grid in x and Q² or from a few lines of information (basically the starting values of the parameters at Q = Q_0), with the interface carrying out the evolution to any x and Q value, at either LO or NLO as appropriate for each pdf.

The interface is as easy to use as PDFLIB and consists essentially of three subroutine calls (a sketch of the equivalent usage with the modern Python bindings follows the list):

• call Initpdfset(name): called once at the beginning of the code; name is the file name of the external pdf file that defines the pdf set (for example, CTEQ, GKK [?] or MRST)

• call Initpdf(mem): mem specifies the individual member of the pdf set

• call evolvepdf(x,Q,f): returns the pdf momentum densities for flavour f at a momentum fraction x and scale Q
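For readers working in Python rather than Fortran, the LHAPDF 6 bindings follow the same three-step pattern. A minimal sketch is shown below; the set name is only an example and must correspond to a set installed locally.

import lhapdf   # LHAPDF 6 Python bindings

# 1. select the pdf set and member (member 0 = central fit)
pdf = lhapdf.mkPDF("CT10nlo", 0)

# 2. evaluate: xfxQ returns the momentum density x*f(x, Q) for a PDG flavour code
x, Q = 0.01, 100.0
xg = pdf.xfxQ(21, x, Q)   # gluon (PDG id 21)
xu = pdf.xfxQ(2, x, Q)    # up quark (PDG id 2)

print(f"x*g(x={x}, Q={Q} GeV) = {xg:.4f}, x*u = {xu:.4f}")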

Responsibility for LHAPDF has been taken over by the Durham HEPDATA project [?], and regular updates/improvements have been produced. Interfaces with LHAPDF are now included in most matrix element programs. Recent modifications make it possible to hold all error pdfs in memory at the same time, which reduces the amount of time needed for pdf error calculations on any observable. The matrix element result can be calculated once using the central pdf, and the relative (pdf)×(pdf) parton-parton luminosity can then be calculated for each of the error pdfs (or the values of x_1, x_2, the flavours of partons 1 and 2 and the value of Q² can be stored). Such pdf re-weighting has been shown to work both for exact matrix element calculations and for matrix element + parton shower calculations.

A new routine, LHAGLUE [?], provides an interface from PDFLIB to LHAPDF, making it possible to use the PDFLIB subroutine calls that may be present in older programs.

Also extremely useful is the Durham pdf plotter (http://hepdata.cedar.ac.uk/pdf/pdf3.html), which allows fast plotting and comparison of pdfs, including their error bands. All of the pdf plots in this book were made with the Durham plotter.

8.2 PDF re-weighting, Applgrid and fastNLO

NLO and NNLO programs are notoriously slow. Thus, it can be very time-consuming to generate a higher order cross section with one pdf and then have to re-run the program for each of the 2n error pdfs (where n is the number of pdf eigenvectors). Such a step is in fact unnecessary: most programs have the ability to use pdf re-weighting to substitute a new pdf for the pdf used in the original generation, using the re-weighting function shown in the equation below.

[pdf re-weighting equation to be inserted]
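Whatever its precise form in a given generator, the weight applied to each stored event is essentially the ratio of the new to the old (pdf)×(pdf) parton-parton luminosity evaluated at the event kinematics. A minimal Python sketch of that ratio, using the LHAPDF 6 bindings; the event fields, set names and member indices are illustrative only, not the convention of any particular program.

import lhapdf

def pdf_weight(event, pdf_old, pdf_new):
    """Standard pdf re-weighting factor for a single stored event.

    The event must carry the kinematics used at generation time: the PDG
    flavour codes id1, id2, the momentum fractions x1, x2 and the
    factorisation scale Q.  Since xfxQ returns x*f(x,Q), the x factors
    cancel in the ratio.
    """
    num = pdf_new.xfxQ(event["id1"], event["x1"], event["Q"]) * \
          pdf_new.xfxQ(event["id2"], event["x2"], event["Q"])
    den = pdf_old.xfxQ(event["id1"], event["x1"], event["Q"]) * \
          pdf_old.xfxQ(event["id2"], event["x2"], event["Q"])
    return num / den

# Example: re-weight a sample generated with the central pdf to error member k
# (set name and member index are illustrative):
# central  = lhapdf.mkPDF("CT10nlo", 0)
# member_k = lhapdf.mkPDF("CT10nlo", 5)
# weights_k = [ev["weight"] * pdf_weight(ev, central, member_k) for ev in events]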


The pdf error weights can either be stored at the time of generation, as discussed above in Section ?? (Sudakov form factors), or can be generated on the fly by the program.

Since the pdf-dependent information in a QCD calculation can be factorized from the rest of the hard scattering terms, it is possible to calculate the non-pdf terms once and then to store the pdf information on a grid (in terms of the pdf x values and their μ_r and μ_f dependence). This allows for the fast calculation of any hard scattering cross section and the a posteriori inclusion of the pdfs and the strong coupling constant α_s in higher order QCD calculations. The technique also allows the a posteriori variation of the renormalization and factorization scales. This is the working principle of the two programs fastNLO [] and Applgrid []. These programs are increasingly used for the calculation of the NLO matrix elements used in pdf fits.
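The deliberately simplified Python toy below illustrates the working principle only; it is not the actual fastNLO or Applgrid file format or interpolation scheme (the real tools also bin in Q² and use higher-order interpolation in x).

import numpy as np

xnodes = np.logspace(-4, 0, 50)     # illustrative grid in x

def fill_grid(events):
    """One-time pass over the hard-scattering events.

    Each event carries (x1, x2, w), where w is the matrix-element weight
    with the pdfs divided out.  Zeroth-order (nearest-node) interpolation
    is used here for brevity.
    """
    grid = np.zeros((len(xnodes), len(xnodes)))
    for x1, x2, w in events:
        i1 = np.abs(xnodes - x1).argmin()
        i2 = np.abs(xnodes - x2).argmin()
        grid[i1, i2] += w
    return grid

def convolve(grid, f1, f2):
    """A posteriori convolution of the stored grid with any pdf pair."""
    F1 = np.array([f1(x) for x in xnodes])
    F2 = np.array([f2(x) for x in xnodes])
    return np.einsum("ij,i,j->", grid, F1, F2)

# The expensive matrix elements are evaluated once (fill_grid); switching pdfs
# only requires repeating the cheap convolve step for each error pdf.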

more elaboration? relate to what is done with the B + S tuples?

9. PDF luminosities

needs to be updated using CT10 pdfs, and lhc energies of 7, 8 TeV, as well as 13.5 TeV.

It is useful to introduce the idea of differential parton-parton luminosities. Such luminosities, when multiplied by the dimensionless cross section ŝ σ̂ for a given process, provide a useful estimate of the size of an event cross section at the LHC. Below we define the differential parton-parton luminosity dL_ij/dŝ dy and its integral dL_ij/dŝ:

\[
\frac{dL_{ij}}{d\hat s\,dy} \;=\; \frac{1}{s}\,\frac{1}{1+\delta_{ij}}\,
\bigl[\,f_i(x_1,\mu)\,f_j(x_2,\mu) + (1 \leftrightarrow 2)\,\bigr]\,. \qquad (12)
\]

The prefactor with the Kronecker delta avoids double-counting in case the partons are identical. The generic parton-model formula

\[
\sigma \;=\; \sum_{i,j}\int_0^1 dx_1\,dx_2\; f_i(x_1,\mu)\,f_j(x_2,\mu)\;\hat\sigma_{ij} \qquad (13)
\]

can then be written as

\[
\sigma \;=\; \sum_{i,j}\int \left(\frac{d\hat s}{\hat s}\,dy\right)
\left(\frac{dL_{ij}}{d\hat s\,dy}\right)\bigl(\hat s\,\hat\sigma_{ij}\bigr)\,. \qquad (14)
\]

(Note that this result is easily derived by defining τ = x_1 x_2 = ŝ/s and observing that the Jacobian ∂(τ, y)/∂(x_1, x_2) = 1.)
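As an illustration of Eq. (12), the sketch below evaluates the gg luminosity integrated over y at fixed ŝ by direct numerical integration, using the LHAPDF 6 Python bindings. The pdf set name is only an example (the figures in this section were produced with CTEQ6.1), and the coarse trapezoidal integration is for illustration rather than precision.

import numpy as np
import lhapdf

SQRT_S = 14000.0                  # LHC energy in GeV
GEV2_TO_PB = 0.3894e9             # conversion: 1 GeV^-2 = 0.3894e9 pb
pdf = lhapdf.mkPDF("CT10nlo", 0)  # example set; member 0 = central fit

def dlumi_gg_dshat(shat, ny=200):
    """gg luminosity dL/d(shat) of Eq. (12), integrated over y, in pb."""
    s = SQRT_S**2
    tau = shat / s
    mu = np.sqrt(shat)                 # mu = sqrt(shat), as assumed in the text
    ymax = -0.5 * np.log(tau)          # boundary imposed by x1, x2 <= 1
    y = np.linspace(-ymax, ymax, ny)
    x1 = np.sqrt(tau) * np.exp(y)
    x2 = np.sqrt(tau) * np.exp(-y)
    # for identical partons (gg) the bracket in Eq. (12) collapses to f_g(x1) f_g(x2)
    integrand = np.array([pdf.xfxQ(21, a, mu) / a * pdf.xfxQ(21, b, mu) / b
                          for a, b in zip(x1, x2)]) / s
    return np.trapz(integrand, y) * GEV2_TO_PB

# e.g. the gg luminosity at shat = (1 TeV)^2:
# print(dlumi_gg_dshat(1000.0**2))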

Figure 18 shows a plot of the luminosity function integrated over rapidity, dL_ij/dŝ = ∫ (dL_ij/dŝ dy) dy, at the LHC (√s = 14 TeV) for various parton flavour combinations, calculated using the CTEQ6.1 parton distribution functions [?]. The widths of the curves indicate an estimate of the pdf uncertainties. We assume μ = √ŝ for the scale. As expected, the gg luminosity is large at low √ŝ but falls rapidly with respect to the other parton luminosities. The gq luminosity is large over the entire kinematic region plotted.

One can further specify the parton-parton luminosity for a specific rapidity y and ŝ, dL_ij/dŝ dy. If one is interested in a specific partonic initial state, then the resulting differential luminosity can be displayed in families of curves, as shown in Figure 19, where the differential parton-parton luminosity at the LHC is shown as a function of the subprocess centre-of-mass energy √ŝ at various values of rapidity of the produced system, for several different combinations of initial state partons. One can read from the curves the parton-parton luminosity for a specific value of mass fraction and rapidity. (It is also easy to use the Durham pdf plotter to generate the pdf curve for any desired flavour and kinematic configuration².)

² http://durpdg.dur.ac.uk/hepdata/pdf3.html


Fig. 18: The parton-parton luminosity dL_ij/dŝ in picobarns, integrated over y. Green = gg, Blue = Σ_i (gq_i + gq̄_i + q_i g + q̄_i g), Red = Σ_i (q_i q̄_i + q̄_i q_i), where the sum runs over the five quark flavours d, u, s, c, b.

It is also of great interest to understand the uncertainty in the parton-parton luminosity for specific kinematic configurations. Some representative parton-parton luminosity uncertainties, integrated over rapidity, are shown in Figures 20, 21 and 22. The pdf uncertainties were generated from the CTEQ6.1 Hessian error analysis using the standard Δχ² = 100 criterion. Except for kinematic regions where one or both partons is a gluon at high x, the pdf uncertainties are of the order of 5-10%. Even tighter constraints will be possible once the LHC Standard Model data are included in the global pdf fits. Again, the uncertainties for individual pdfs can also be calculated online using the Durham pdf plotter. Often it is not the pdf uncertainty on a cross section that is required, but rather the pdf uncertainty on the acceptance for a given final state. The acceptance for a particular process may depend on the input pdfs through the rapidity cuts placed on the jets, leptons, photons, etc., and through the varying longitudinal boosts of the final state caused by the different pdf pairs. An approximate rule of thumb is that the pdf uncertainty on the acceptance is a factor of 5-10 times smaller than the uncertainty on the cross section itself.

In Figure 23, the pdf luminosity curves shown in Figure 18 are overlaid with the equivalent luminosity curves for the Tevatron. In Figure 24, the ratios of the pdf luminosities at the LHC to those at the Tevatron are plotted. The most dramatic increase in pdf luminosity at the LHC comes from gg initial states, followed by gq initial states and then qq̄ initial states. The latter ratio is smallest because of the availability of valence antiquarks at the Tevatron at moderate to large x. As an example, consider chargino pair production with √ŝ = 0.4 TeV. This process proceeds through qq̄ annihilation; thus, there is only a factor of 10 enhancement at the LHC compared to the Tevatron.

Backgrounds to interesting physics at the LHC proceed mostly through gg and gq initial states. Thus, there will be a commensurate increase in the rate for background processes at the LHC.


Fig. 19: d(Luminosity)/dy at rapidities (right to left) y = 0, 2, 4, 6. Green = gg, Blue = Σ_i (gq_i + gq̄_i + q_i g + q̄_i g), Red = Σ_i (q_i q̄_i + q̄_i q_i), where the sum runs over the five quark flavours d, u, s, c, b.

Fig. 20: Fractional uncertainty of the gg luminosity integrated over y.


Fig. 21: Fractional uncertainty for the parton-parton luminosity integrated over y for Σ_i (q_i q̄_i + q̄_i q_i), where the sum runs over the five quark flavours d, u, s, c, b.

Fig. 22: Fractional uncertainty for the luminosity integrated over y for Σ_i (q_i q̄_i + q̄_i q_i), where the sum runs over the five quark flavours d, u, s, c, b.


Fig. 23: The parton-parton luminosity dL_ij/dŝ in pb, integrated over y. Green = gg, Blue = Σ_i (gq_i + gq̄_i + q_i g + q̄_i g), Red = Σ_i (q_i q̄_i + q̄_i q_i), where the sum runs over the five quark flavours d, u, s, c, b. The top family of curves is for the LHC and the bottom for the Tevatron.

Fig. 24: The ratio of the parton-parton luminosities dL_ij/dŝ, integrated over y, at the LHC and at the Tevatron. Green = gg (top), Blue = Σ_i (gq_i + gq̄_i + q_i g + q̄_i g) (middle), Red = Σ_i (q_i q̄_i + q̄_i q_i) (bottom), where the sum runs over the five quark flavours d, u, s, c, b.