Data Process
Kihyeon Cho
Kyungpook National University, Spring Semester 2004
Experimental Method and Data Process
Kyungpook National University, Spring 2003, CENTER FOR HIGH ENERGY PHYSICS

Contents
- High Energy Physics
- Data Processing
- Fitting
- Conclusions

High Energy Physics

High Energy Physics
- Goal and direction: the ultimate structure of matter and the understanding of the origin of the universe

What is the World Made of?
- Atom (100 pm): electron and nucleus
- Nucleus: proton, neutron (1 fm)
- Quarks (1 am)

How to know any of this? (Testing Theory)

How to detect?

How do we experiment with tiny particles? (Accelerators)
Accelerators solve two problems:
- High energy gives a small wavelength, which is needed to detect small particles.
- The high energy creates the massive particles that physicists want to study.

World-wide High Energy Physics Experiment
Europe
- In 2007, the LHC will be completed at CERN.
- Two big experiments (ATLAS, CMS) in collaboration with HEP institutes and physicists all over the world
- CERN, IN2P3 (France), and INFN (Italy) are preparing the HEP Grid for it.
USA
- The BaBar experiment at SLAC
- Run II of the Tevatron at Fermilab (CDF and D0)
- CLEO at Cornell
- The LHC experiments at CERN (ATLAS, CMS)
- The RHIC experiments at BNL
- Super-K in Japan
- The HEP Grid in the ESNET program
Japan
- Belle at KEK
- Super-K, Kamioka
- LHC at CERN (ATLAS)
- RHIC at BNL (USA)
- They are now working on it.
Korea
- We have most of these world-wide experimental programs.
- Global collaboration

Laboratories
- Europe: CERN
- Germany: DESY
- US: FNAL
- US: BNL
- Space Station (ISS)
- Japan: KEK
- China: IHEP
- Korea: CHEP

Where is Fermilab?
- 20 miles west of Chicago, U.S.A.

Overview of Fermilab
(Site diagram labels: Main Injector and Recycler, p̄ source, Booster, CDF, D0, Fixed Target Experiment)

Fermi National Accelerator Laboratory
- Highest energy accelerator in the world
- Energy frontier: CDF, D0. Search for new physics (Higgs, SUSY, quark composites, ...)
- Precision frontier: charm, kaon, and neutrino physics (FOCUS, KTeV, NuMI/MINOS, BooNE, etc.)
- Connection to cosmology: Sloan Digital Sky Survey, Pierre Auger, ...
- Largest HEP laboratory in the USA
  - 2200 employees
  - 2300 users (researchers from universities)
  - Budget is > $300 million

Data Processing

Why do we do experiments?
- Parameter determination: to set the numerical values of some physical quantities
  Ex) To measure the velocity of light
- Hypothesis testing: to test whether a particular theory is consistent with our data
  Ex) To check whether the velocity of light has suddenly increased by several percent since the beginning of this year

Type of Data
- Real data (on-site)
  - Raw data: detector information
  - Reconstructed data: physics information
  - Stream (skim) data: selected physics of interest
- Simulated data (on-site or off-site)
  - Physics generation: pythia, QQ, bgenerator, CompHEP, ...
  - Detector simulation: Fastsim, GEANT, ...

Research Method
(Flow-chart labels: HEP knowledge; reaction simulation = event generation; detector simulation; simulated data; real data; reduction; on-sites (experimental sites); remote sites (CHEP + participating institutions); data analysis)

Error
- Error: the difference between the measurement and the true value
- True value: we don't know it.
- Statistical error: error due to statistical fluctuation
- Systematic error: more in the nature of mistakes, due to equipment and experimentalists
- Experimental value: measurement ± stat. error ± sys. error
  Example) m(top) = … ± 4.8 ± 5.3 GeV/c^2 (CDF, 1998)

Why estimate errors?
- To know the accuracy of the measurement
- Example: the conventional speed of light is c = 2.998 × 10^8 m/sec, and a new measurement gives c = 3.09 × 10^8 m/sec.
  - Case 1. If the error is ±0.15, then it is consistent: 3.09 ± 0.15 is consistent with 2.998 × 10^8 m/sec. Conventional physics is in good shape.
  - Case 2. If the error is ±0.01, then it is not consistent: 3.09 ± 0.01 is a world-shattering discovery.
  - Case 3. If the error is ±2, then it is consistent. However, the accuracy of 3.09 ± 2 is too low.
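The three cases in the speed-of-light example can be checked numerically by expressing the difference between the new and the conventional value in units of the quoted error. The following is a minimal sketch, not part of the lecture: the function names are hypothetical, and the 3-sigma cut used to decide "consistent" is an assumption chosen for illustration.

```python
# Values from the slide, in units of 10^8 m/sec.
C_CONVENTIONAL = 2.998
C_MEASURED = 3.09


def deviation_in_sigma(measured, reference, sigma):
    """Return |measured - reference| divided by the quoted error."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return abs(measured - reference) / sigma


def relative_error(measured, sigma):
    """Return the quoted error as a fraction of the measured value."""
    return sigma / abs(measured)


def is_consistent(measured, reference, sigma, max_sigma=3.0):
    """Assumed criterion: consistent if the deviation is below max_sigma."""
    return deviation_in_sigma(measured, reference, sigma) < max_sigma


if __name__ == "__main__":
    for label, sigma in (("Case 1", 0.15), ("Case 2", 0.01), ("Case 3", 2.0)):
        n_sigma = deviation_in_sigma(C_MEASURED, C_CONVENTIONAL, sigma)
        rel = relative_error(C_MEASURED, sigma)
        verdict = is_consistent(C_MEASURED, C_CONVENTIONAL, sigma)
        print(f"{label}: {n_sigma:.2f} sigma, relative error {rel:.1%}, "
              f"consistent={verdict}")
```

Case 3 passes the consistency check, but its relative error of about 65% shows why consistency alone says little about the value of a measurement.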
A measurement with so little accuracy is useless. Whenever you determine a parameter, estimate the error, or your experiment is useless.

Examples) Bad and Good
(Figure panels: Bad, Good)

How to reduce errors?
- Statistical error
  - Repeated measurement
  - N: the expected number of observations
  - σ = sqrt(N): the spread
- Systematic error
  - No exact formulae
  - Ideal case: all such effects should be absent.
  - Real world: an attempt should be made to reduce them.

How to solve systematic errors?
- Use constraint conditions
  Ex) Triangle
- Calibrations
- Energy and momentum conservation
  E(after) - E(before) = 0
  |P(after)| - |P(before)| = 0
- How small should the systematic error be? Systematic errors should be about the size of the statistical errors.

The meaning of error
- Distributions: x -> n(x)
  - Discrete: ex) the number of times n(x) you met a girl at age x
  - Continuous: ex) hours of sleep each night (x), and the number of people sleeping for that time
  - For an even larger number of observations and with a small bin size, the histogram approaches a continuous distribution.
- Mean and variance
- Gaussian distribution
  - Applies in the case of a large number of observations
  - It is important for error calculations.

Tracking Performance
- Hit resolution ~200 μm (goal: 180 μm)
(Figure labels: Ks, COT tracks, residual distance (cm))

Mean and Variance
                      True value    Measurement
  Mean                μ             x̄
  Variance            σ^2           s^2
  Standard deviation  σ             s
In fact, we don't know the true value in the real world.
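Because the true value is unknown, the mean and variance have to be estimated from the measurements themselves. The sketch below is a minimal illustration in plain Python; the function names are hypothetical, and it assumes the standard estimators (the sample mean, the variance with an N - 1 denominator, and the error on the mean s / sqrt(N)), which the transcript names but does not spell out.

```python
import math


def sample_mean(values):
    """Estimate of the mean: the average of the N measurements."""
    if not values:
        raise ValueError("need at least one measurement")
    return sum(values) / len(values)


def sample_variance(values):
    """Estimate of the variance s^2 when the true value is unknown.

    The N - 1 denominator compensates for using the sample mean
    in place of the unknown true mean.
    """
    n = len(values)
    if n < 2:
        raise ValueError("need at least two measurements")
    m = sample_mean(values)
    return sum((x - m) ** 2 for x in values) / (n - 1)


def error_on_mean(values):
    """Accuracy of the mean: s / sqrt(N)."""
    return math.sqrt(sample_variance(values) / len(values))


if __name__ == "__main__":
    data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]  # made-up measurements
    print(f"mean = {sample_mean(data):.3f}")
    print(f"s^2  = {sample_variance(data):.3f}")
    print(f"error on the mean = {error_on_mean(data):.3f}")
```

The error on the mean shrinks like 1 / sqrt(N), which is why repeated measurement reduces the statistical error.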

Mean and Variance
- Mean: N events have the values (x_1, x_2, x_3, ..., x_N)
  $\bar{x} = \frac{1}{N}\sum_{i=1}^{N} x_i$
- Variance: $\sigma^2 = \frac{1}{N}\sum_{i=1}^{N} (x_i - \mu)^2$
- When we do not know the true value: $s^2 = \frac{1}{N-1}\sum_{i=1}^{N} (x_i - \bar{x})^2$

Accuracy
- In order to know the accuracy of the measurement, use the error on the mean, $s/\sqrt{N}$.

Gaussian Distribution
- Applies in the case of a large data sample
- The Gaussian distribution is fundamental in error treatment.

Gaussian Distribution (cont'd)
- The normalized function: $y = \frac{1}{\sqrt{2\pi}\,\sigma}\exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)$
- Properties
  - Mean (μ)
  - Width (σ): the smaller the width, the narrower the distribution.

Gaussian Distribution (cont'd)
- The mean (μ) is the same, zero. However, the width (σ) is different.

Examples (Gaussian + BG)
- D.J. Kong

CDF Secondary Vertex Trigger
- NEW for Run 2: level 2 impact parameter trigger
- Provides access to hadronic B decays
- Data from the commissioning run
- COT defines the track at level 1
- SVX measures the impact parameter (no alignment or calibrations)
- σ ~ 87 μm
(Figure axis: d (cm))

Gaussian fitting
- Using Mn_fit

Significant Figure
- The measured value has meaning through its significant figures.
- Significant figures
  - They include the first figure of uncertainty.
  - All the figures between the LSD (least significant digit) and the MSD (most significant digit)
- LSD
  - If there is no decimal point: the rightmost non-zero figure, ex) 23000
  - If there is a decimal point: the rightmost figure, ex) …
- MSD: the leftmost non-zero figure

Significant Figure (Example)
- 4 digits: 1234, …, 123.4, …
- 4 digits: 10.10, …, 100.0, 1.010 × 10^…
- 3 digits: 1010
  cf) … (four digits of significant figures)

The calculation
- Add and subtract: the last digit of the result is decided by the term with the fewest decimal places.
  Example) … (5 digits of SF), … (1 digit of SF), … (5 digits of SF)

Calculations (cont'd)
- Multiply and divide: the result has the same number of significant figures as the factor with the fewest.
  Example) 16.3 × 4.5 = 73.35 => 73

Propagation of Errors
- Suppose that (x_1, x_2, ...) are the variables; then the variance of the function F(x_1, x_2, ...) is as follows.
- In the case that there is no correlation between the variables:
  $\sigma_F^2 = \sum_i \left(\frac{\partial F}{\partial x_i}\right)^2 \sigma_i^2$

Propagation of Errors (continued)
- In the case that there is correlation between the variables:
  $\sigma_F^2 = \sum_i \left(\frac{\partial F}{\partial x_i}\right)^2 \sigma_i^2 + 2\sum_{i<j} \frac{\partial F}{\partial x_i}\frac{\partial F}{\partial x_j}\,\mathrm{cov}(x_i, x_j)$
- Let us consider only the uncorrelated case.

Combining Errors
- Add or subtract (F = x_1 + x_2 or F = x_1 - x_2): $\sigma_F^2 = \sigma_1^2 + \sigma_2^2$
  Example) x_1 = 100 ± 10, x_2 = 400 ± 20, F = 500 ± 22
  Example) The error of the measurement

Combining Errors (cont'd)
- F = ax (a is a constant): $\sigma_F = |a|\,\sigma_x$
  Example) x = 100 ± 10, a = 5, F = 500 ± 50

Combining Errors (cont'd)
- Multiplication (F = x_1 x_2): $\left(\frac{\sigma_F}{F}\right)^2 = \left(\frac{\sigma_1}{x_1}\right)^2 + \left(\frac{\sigma_2}{x_2}\right)^2$
  Example) x_1 = 100 ± 10, x_2 = 400 ± 20, F = (400 ± 45) × 10^2

Combining Errors (cont'd)
- Division (F = x_1 / x_2): $\left(\frac{\sigma_F}{F}\right)^2 = \left(\frac{\sigma_1}{x_1}\right)^2 + \left(\frac{\sigma_2}{x_2}\right)^2$
  Example) x_1 = 100 ± 10, x_2 = 400 ± 20,
  F = 0.250 ± 0.028

Combining results
- Using a weighting factor
- Cases
  - With different detection efficiencies
  - With different parts of the apparatus
  - With different experiments

Combining results using a weighting factor (cont'd)
- Average: there are N data whose values are (x_1, x_2, ..., x_k, ..., x_N). Suppose that the error of x_k is σ_k. Then
  $\bar{x} = \frac{\sum_k w_k x_k}{\sum_k w_k}$, where the weighting factor is $w_k = 1/\sigma_k^2$
- Error: $\sigma = 1/\sqrt{\sum_k w_k}$

Ex) World Average of sin(2β)

Ex) B^0 lifetime summary

Ex) CDF B_d Mixing

Upper Limit
- Measurement: B = B_m ± σ
- Observation (B_m > 5σ): the signal is greater than 5 sigma of the error.
- Evidence (3σ < B_m < 5σ): the signal is greater than 3 sigma of the error, but less than 5 sigma.
- Upper limit (B_m < 3σ): the signal is less than 3 sigma.

Upper Limit B_l (cont'd)
Method 1. General case
- Measurement: B = B_m ± σ
- B_l < B_m + 1.28σ (90% CL), B_m + 1.64σ (95% CL), B_m + 2.33σ (99% CL)
- Ex) Measurement B = (3 ± 5) × 10^… gives B_l < (3 + 1.28 × 5) × 10^… at 90% CL

Upper Limit B_l (cont'd)
Method 2. Negative B_m (background subtracted)
- Example) B_m = (-1 ± 1) × 10^…, B_m = (0 ± 1) × 10^…
- Upper limit at the 90% confidence level, where g is Gaussian (mean is B_m, width is σ)

Compare Upper Limit (90% CL)
(Table columns: B_m, Method 1, Method 2; assume σ = 1)

Ex) CP Asymmetry in Charm (D+ -> K- K+ π+)
- Cabibbo suppressed mode: D+ -> K- K+ π+, D- -> K+ K- π-
- Cabibbo favored (C.F.) mode: D+ -> K- π+ π+, D- -> K+ π- π-
- A = 0.006 …, A < … at 95% CL

Fitting

Fitting Methods
- Moment: simple, but inefficient
- Maximum likelihood method: can be used only if the theoretical distribution is known.
- Least square method: the more general case; used in the case of statistical errors
- Example) For given N data points (x_i, y_i), let us fit using the linear equation y = ax + b.

Moment
- The method is to calculate the average.
- Simplicity
- Example: a linear equation; the parameter a is …

Maximum likelihood Method
- The likelihood: $L(\theta) = \prod_{i=1}^{N} y_i$, where θ is the parameter to find and y_i is the value of the function at the given variable x_i
- To find θ, maximize L, or equivalently maximize l = log L.
- Normalization is essential.
- Ex) A linear equation

Maximum likelihood Method (cont'd)
- Can be used only if the theoretical distribution is known.
- The most powerful method for finding the values of unknown parameters
- No histogram needed (event by event)
- Efficient method
- Works in most cases
- We can transform one variable to another. Ex) …

Least Square Method
Least square method for the simple case
- The first-order polynomial (linear equation y = ax + b)
- For given N data points (x_i, y_i), let us fit using the linear equation y = ax + b.
- Find the a and b that minimize the sum of the squared distances between the data and the equation, i.e.
when we put Q as follows,
  $Q = \sum_{i=1}^{N} (y_i - a x_i - b)^2$,
let us find the a and b that satisfy the following equations:
  $\frac{\partial Q}{\partial a} = 0, \quad \frac{\partial Q}{\partial b} = 0$

Least Square Method (continued)
Least square method for linear polynomials
- m unknown parameters (a_1, a_2, a_3, ..., a_m)
- F(x) = a_1 f_1(x) + a_2 f_2(x) + ... + a_m f_m(x)
- It is the same as the linear least square method.
- There will be m equations and m solutions.
Least square method for non-linear equations
- Let us expand the function as a linear polynomial using a Taylor series.

Least Square Method (Example)
- Mn_fit used the least square method.
- The signal is Gaussian.
- The background is a Chebyshev polynomial.

Maximum Likelihood vs. Least Square Method
                               Maximum likelihood                             Least square
  How easy                     Normalization and maximization can be messy    Needs minimization
  Efficiency                   Usually most efficient                         Sometimes equivalent to max. like.
  Input data                   Individual events                              Histograms
  Estimate of goodness of fit  Very difficult                                 Easy
  Zero events                  Covers well                                    Troublesome

Conditions in the X-Y plane:
- Errors in the y-direction are Gaussian.
- X-values are precisely determined.
Under these conditions, the maximum likelihood and the least square methods are equivalent.
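For the straight line y = ax + b, minimizing Q = sum of (y_i - a x_i - b)^2 has a closed-form solution, obtained by setting the derivatives of Q with respect to a and b to zero. The following is a minimal sketch of that unweighted case in plain Python. The function names are hypothetical, equal errors on all points are assumed, and this is not how Mn_fit or Minuit perform the minimization.

```python
def fit_line(xs, ys):
    """Least-squares estimates (a, b) for y = a*x + b, unweighted.

    Solves dQ/da = 0 and dQ/db = 0 for Q = sum((y_i - a*x_i - b)**2).
    """
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")
    sx = sum(xs)
    sy = sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    denom = n * sxx - sx * sx
    if denom == 0:
        raise ValueError("all x values are identical; slope is undefined")
    a = (n * sxy - sx * sy) / denom
    b = (sy - a * sx) / n
    return a, b


def q_value(xs, ys, a, b):
    """Sum of squared vertical distances between the data and the line."""
    return sum((y - a * x - b) ** 2 for x, y in zip(xs, ys))


if __name__ == "__main__":
    xs = [0.0, 1.0, 2.0, 3.0]  # made-up data lying exactly on y = 2x + 1
    ys = [1.0, 3.0, 5.0, 7.0]
    a, b = fit_line(xs, ys)
    print(f"a = {a}, b = {b}, Q = {q_value(xs, ys, a, b)}")
```

With Gaussian errors of known, unequal size, each term of Q would be divided by its variance; that weighted form is the one that coincides with the maximum likelihood result.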
Example) Mass distributions (Maximum Likelihood = Least Square Method)

Fitting Package
- PAW
- Mn_fit
- ROOT

PAW
- Physics Analysis Workstation
- Inside the CERN library
- Ntuple: n-dimensional variables
- Good for making histograms
- Includes some fitting

Mn_fit
- Uses the fitting program Minuit in the CERN library
- Powerful for fitting
- Makes it easy to check whether the fitting results are good or not

mn_fit (example)
- The signal is Gaussian.
- Maximum likelihood is the same as the least square method.

ROOT
- To handle large data
- An object-oriented HEP analysis framework
- ROOT was created by Rene Brun and Fons Rademakers at CERN.
- The ROOT system website is at …

Differences from PAW
- Regular grammar (C++) on the command line
- Single language (compiled and interpreted)
- Object oriented (use your class in the interpreter)
- Advanced interactive user interface
- Well-documented code; HTML class descriptions for every class
- Object I/O, including schema evolution
- 3-D interfaces with OpenGL and X3D

ROOT example

Conclusions
- Data process is important for physicists.
- Ref. Louis Lyons, Statistics for Nuclear and Particle Physicists (Cambridge University Press)