Top Thinkshop-2 Nov. 10-12, 2000 Pushpa Bhat
Advanced Analysis Algorithms for Top Analysis
Pushpa Bhat
Fermilab
Top Thinkshop 2
Fermilab, IL
November 2000
A reasonable man adapts himself to the world.
An unreasonable man persists in trying to adapt the world to himself.
So, all progress depends on the unreasonable one.
- Bernard Shaw
What do we gain?
b-tag efficiency in Run I: DØ ~20%, CDF ~53%
But DØ was able to measure the top quark mass with a precision approaching that of CDF by using multivariate techniques to separate signal and background while minimizing the correlation of the selection with the top quark mass.
Optimal Analysis Methods
The new generation of experiments will be a lot more demanding than the previous one in data handling at all stages.
The time-honored procedure of choosing and applying cuts on one event variable at a time is rarely optimal!
The measurements being multivariate, the optimal methods of analysis are necessarily multivariate.
Discriminant Analysis: Partition multidimensional variable space, identify boundaries between classes of objects
Cluster Analysis: Assign objects to groups based on similarity
Regression Analysis: Functional approximation/fitting
Data Analysis Tasks
Particle Identification: e-ID, -ID, b-ID, , q/g
Signal/Background Event Classification: Signals of new physics are rare and small (finding a “jewel” in a haystack)
Parameter Estimation: t mass, H mass, track parameters, for example
Function Approximation: Correction functions, tag rates, fake rates
Data Exploration: Data-driven extraction of information, latent structure analysis
Why Multivariate Methods?
Because they are optimal!
[Figure: two panels with axes x1 and x2]
D(x1, x2) = 2.014 x1 + 1.592 x2
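A minimal sketch of why a cut on a linear combination D(x1, x2) can outperform cuts on x1 or x2 alone. It uses Fisher's linear discriminant on toy Gaussian samples; the means, the covariance and the sample sizes are assumptions chosen for illustration, not the data behind the slide, so the fitted coefficients will differ from 2.014 and 1.592.

```python
import numpy as np

def fisher_direction(sig, bkg):
    """Fisher linear discriminant direction w = S_w^{-1} (mean_s - mean_b)."""
    s_w = np.cov(sig, rowvar=False) + np.cov(bkg, rowvar=False)
    return np.linalg.solve(s_w, sig.mean(axis=0) - bkg.mean(axis=0))

def separation(s, b):
    """Distance between class means in units of the pooled standard deviation."""
    return abs(s.mean() - b.mean()) / np.sqrt(0.5 * (s.var() + b.var()))

rng = np.random.default_rng(seed=1)
cov = [[1.0, 0.8], [0.8, 1.0]]  # hypothetical: strongly correlated variables
sig = rng.multivariate_normal([1.0, 0.0], cov, size=5000)
bkg = rng.multivariate_normal([0.0, 1.0], cov, size=5000)

w = fisher_direction(sig, bkg)
sep_x1 = separation(sig[:, 0], bkg[:, 0])
sep_x2 = separation(sig[:, 1], bkg[:, 1])
sep_d = separation(sig @ w, bkg @ w)
print(f"x1 alone: {sep_x1:.2f}, x2 alone: {sep_x2:.2f}, D(x1,x2): {sep_d:.2f}")
```

In this toy setup each variable on its own separates the classes by about one standard deviation, while the combination exploits the correlation between x1 and x2 and separates them considerably better.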
Optimal Event Selection
$$ r(x) = \frac{p(s|x)}{p(b|x)} = \frac{p(x|s)\,p(s)}{p(x|b)\,p(b)} $$
defines decision boundaries that minimize the probability of misclassification.
So, the problem mathematically reduces to that of calculating r(x), the Bayes Discriminant Function, or the probability densities.
Posterior probability:
$$ p(s|x) = \frac{p(x|s)\,p(s)}{p(x|s)\,p(s) + p(x|b)\,p(b)} = \frac{r}{1+r} $$
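As an illustration of the Bayes discriminant r(x) and the posterior p(s|x) = r/(1+r), the sketch below evaluates both for two one-dimensional Gaussian class densities. The densities, their means and widths, and the default priors are assumptions made up for this example.

```python
import math

def gaussian_pdf(x, mean, sigma):
    """One-dimensional normal density."""
    z = (x - mean) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def bayes_discriminant(x, prior_s=0.5, prior_b=0.5):
    """r(x) = p(x|s) p(s) / (p(x|b) p(b)) for two hypothetical Gaussian classes."""
    p_x_given_s = gaussian_pdf(x, mean=1.0, sigma=1.0)   # assumed signal density
    p_x_given_b = gaussian_pdf(x, mean=-1.0, sigma=1.0)  # assumed background density
    return (p_x_given_s * prior_s) / (p_x_given_b * prior_b)

def posterior_signal(x, prior_s=0.5, prior_b=0.5):
    """p(s|x) = r / (1 + r)."""
    r = bayes_discriminant(x, prior_s, prior_b)
    return r / (1.0 + r)

for x in (-1.0, 0.0, 1.0):
    print(f"x = {x:+.1f}  r = {bayes_discriminant(x):.3f}  p(s|x) = {posterior_signal(x):.3f}")
```

The decision boundary r(x) = 1 corresponds to p(s|x) = 0.5; changing the priors moves the boundary without changing the class densities.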
Probability Density Estimators
Histogramming:
The basic problem of non-parametric density estimation is very simple! Histogram data in M bins in each of the d feature variables: $M^d$ bins, the Curse of Dimensionality.
In high dimensions, we would either require a huge number of data points or most of the bins would be empty, leading to an estimated density of zero.
But the variables are generally correlated and hence tend to be restricted to a sub-space: Intrinsic Dimensionality.
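The growth of the bin count with dimension can be checked numerically. The sketch below, under the assumption of uniformly distributed toy data, fixes the sample at 10,000 points and M = 10 bins per variable and counts how many of the M^d cells stay empty as d grows. The function name and all numbers are illustrative.

```python
import numpy as np

def empty_bin_fraction(n_points, n_bins, n_dims, seed=0):
    """Fraction of the n_bins**n_dims histogram cells left empty by uniform data."""
    rng = np.random.default_rng(seed)
    data = rng.random((n_points, n_dims))
    # Bin index along each axis, then one flat cell index per point.
    idx = np.minimum((data * n_bins).astype(int), n_bins - 1)
    flat = np.ravel_multi_index(tuple(idx.T), (n_bins,) * n_dims)
    occupied = np.unique(flat).size
    return 1.0 - occupied / n_bins**n_dims

fractions = {d: empty_bin_fraction(n_points=10_000, n_bins=10, n_dims=d) for d in (1, 2, 4, 6)}
for d, frac in fractions.items():
    print(f"d = {d}: {10**d:>9,} bins, {frac:.1%} empty")
```

With six variables there are a million cells and at most 10,000 of them can be occupied, so at least 99% of the histogram is empty regardless of how the data are distributed.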
Kernel-Based Methods
Akin to histogramming, but adopts importance sampling.
Place in d-dimensional space a hypercube of side h centered on each data point x:
$$ \tilde{p}(x) = \frac{1}{N} \sum_{n=1}^{N} \frac{1}{h^d}\, H\!\left(\frac{x - x_n}{h}\right) $$
N = number of data points; H(u) = 1 if x_n is in the hypercube, = 0 otherwise; h = smoothing parameter.
The estimate will have discontinuities.
These can be smoothed out using different forms for the kernel function H(u). A common choice is a multivariate Gaussian kernel:
$$ \tilde{p}(x) = \frac{1}{N} \sum_{n=1}^{N} \frac{1}{(2\pi h^2)^{d/2}} \exp\!\left(-\frac{|x - x_n|^2}{2h^2}\right) $$
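The Gaussian-kernel estimate can be written in a few lines of NumPy. This is a sketch of the formula only, with a single global smoothing parameter h; the function name `gaussian_kde`, the toy sample and the value h = 0.3 are assumptions for illustration.

```python
import numpy as np

def gaussian_kde(x, data, h):
    """Kernel estimate (1/N) sum_n (2 pi h^2)^(-d/2) exp(-|x - x_n|^2 / (2 h^2)).

    x has shape (n_eval, d) or (d,); data must have shape (N, d).
    """
    x = np.atleast_2d(x)
    data = np.atleast_2d(data)
    d = data.shape[1]
    diff = x[:, None, :] - data[None, :, :]
    sq_dist = np.sum(diff**2, axis=-1)            # |x - x_n|^2, shape (n_eval, N)
    norm = (2.0 * np.pi * h**2) ** (d / 2.0)
    return np.exp(-sq_dist / (2.0 * h**2)).sum(axis=1) / (data.shape[0] * norm)

rng = np.random.default_rng(seed=2)
sample = rng.normal(size=(5000, 2))               # toy data: standard 2-D normal
estimate = gaussian_kde([0.0, 0.0], sample, h=0.3)[0]
print(f"estimated density at the origin: {estimate:.3f}")
print(f"true density at the origin:      {1.0 / (2.0 * np.pi):.3f}")
```

A larger h gives a smoother but more biased estimate and a smaller h a noisier one, which is why h is called the smoothing parameter.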
K nearest-neighbor Method
Place a hyper-sphere centered at each data point x and allow the radius to grow to a volume V until it contains K data points. Then the density at x is
$$ p(x) = \frac{K}{NV} $$
where N = number of data points.
If our data set contains N_k points in class C_k and N points in total, then
$$ p(x|C_k) = \frac{K_k}{N_k V} $$
where K_k = number of points in volume V for class C_k. With priors p(C_k) = N_k/N, Bayes' theorem gives
$$ p(C_k|x) = \frac{p(x|C_k)\,p(C_k)}{p(x)} = \frac{K_k}{K} $$
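The final ratio K_k/K is simple to compute directly: find the K nearest training points and count how many belong to each class. The sketch below assumes Euclidean distance and class priors estimated from the sample as N_k/N; the function name and the toy Gaussian samples are illustrative.

```python
import numpy as np

def knn_posterior(x, data, labels, n_neighbors):
    """Estimate p(C_k|x) as K_k / K among the K nearest training points."""
    dist = np.linalg.norm(data - x, axis=1)
    nearest = np.argsort(dist)[:n_neighbors]
    return {int(c): float(np.mean(labels[nearest] == c)) for c in np.unique(labels)}

rng = np.random.default_rng(seed=5)
sig = rng.normal(loc=2.0, size=(500, 2))          # toy signal sample, label 1
bkg = rng.normal(loc=0.0, size=(500, 2))          # toy background sample, label 0
data = np.vstack([sig, bkg])
labels = np.concatenate([np.ones(500, dtype=int), np.zeros(500, dtype=int)])

post = knn_posterior(np.array([3.0, 3.0]), data, labels, n_neighbors=25)
print(post)
```

Unlike the fixed-width kernel, the volume here adapts to the local density of points, at the cost of a density estimate that is not guaranteed to integrate to one.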
Discriminant Approximation with Neural Networks
The output of a feed-forward neural network can approximate the Bayesian posterior probability p(s|x,y) directly, without estimating class-conditional probabilities.
[Figure: network with inputs x and y and output n(x, y, w), where w denotes the network parameters]
$$ n(x, y, w) = p(s|x,y) = \frac{r}{1+r} $$
Calculating the Discriminant
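Consider the sum
$$ E(x, y, w) = \sum_i \left[ n(x_i, y_i, w) - d_i \right]^2 $$
where d_i = 1 for signal, d_i = 0 for background, and w = vector of parameters.
Then
$$ \frac{\partial E(x, y, w)}{\partial w} = 0 \;\Rightarrow\; n(x, y, w) = p(s|x,y) = \frac{r}{1+r} $$
in the limit of large data samples and provided that the function n(x, y, w) is flexible enough.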
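The step from minimizing E to the posterior probability can be sketched as follows. The derivation assumes that the sum over events may be replaced by an integral over the class densities (large-sample limit) and that the network output n can be varied freely at each point (sufficient flexibility). Here p(s) and p(b) stand for the signal and background fractions of the training sample.

```latex
\begin{align}
\frac{E}{N} &\;\to\; \int \Big\{ [\,n(x,y) - 1\,]^2\, p(x,y|s)\,p(s)
              + [\,n(x,y) - 0\,]^2\, p(x,y|b)\,p(b) \Big\}\, dx\, dy \\
\frac{\delta E}{\delta n} = 0 &\;\Rightarrow\;
  (n - 1)\, p(x,y|s)\,p(s) + n\, p(x,y|b)\,p(b) = 0 \\
n(x,y) &= \frac{p(x,y|s)\,p(s)}{p(x,y|s)\,p(s) + p(x,y|b)\,p(b)}
        = p(s|x,y) = \frac{r}{1+r}
\end{align}
```

The result depends on the signal-to-background mix used in training: a network trained on equal numbers of signal and background events approximates the posterior for equal priors.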
Neural Networks
An NN estimates a mapping function without requiring a mathematical description of how the output formally depends on the input.
The “hidden” transformation functions, g, adapt themselves to the data as part of the training process. The number of such functions needs to grow only as the complexity of the problem grows.
[Figure: feed-forward network with inputs x1, x2, x3, x4 and output D_NN]
$$ D_{NN} = g\Big(\sum_j w_{jk}\, g\Big(\sum_i w_{ij}\, x_i + \theta_j\Big) + \theta_k\Big), \qquad g(a) = \frac{1}{1 + e^{-a}} $$
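A forward pass through such a network takes only a few lines. The sketch below assumes a single hidden layer of sigmoid units with g(a) = 1/(1 + e^(-a)) and a sigmoid output; the layer size, the variable names and the random, untrained weights are illustrative and not taken from the slide.

```python
import numpy as np

def sigmoid(a):
    """g(a) = 1 / (1 + exp(-a))."""
    return 1.0 / (1.0 + np.exp(-a))

def network_output(x, w_hidden, b_hidden, w_out, b_out):
    """One hidden layer of sigmoid units followed by a sigmoid output in (0, 1)."""
    hidden = sigmoid(x @ w_hidden + b_hidden)    # shape (n_events, n_hidden)
    return sigmoid(hidden @ w_out + b_out)       # shape (n_events,)

rng = np.random.default_rng(seed=3)
n_inputs, n_hidden = 4, 5                        # four inputs x1..x4; 5 hidden nodes is arbitrary
w_hidden = rng.normal(size=(n_inputs, n_hidden)) # random weights, for illustration only
b_hidden = rng.normal(size=n_hidden)
w_out = rng.normal(size=n_hidden)
b_out = rng.normal()

events = rng.normal(size=(3, n_inputs))          # three toy events
d_nn = network_output(events, w_hidden, b_hidden, w_out, b_out)
print(d_nn)
```

Training would adjust the weights and thresholds to minimize a loss such as the squared error between the output and the signal/background targets; only the evaluation step is shown here.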
Why are NN models powerful?
Neural networks are universal approximators
With a sufficiently large NN, you can approximate a function to arbitrary accuracy
Convergence of approximation is rapid
High dimensionality is not a curse any more!
Model complexity can be controlled by regularization
Extrapolate gracefully
Also, they need to have optimal flexibility/complexity
$h(x) = 0.5 + 0.4 \sin(2\pi x)$, Mth-order polynomial fit
[Figure: three panels, axes x1 and x2] M = 1: simple; M = 3: flexible; M = 10: highly flexible
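The trade-off between too little and too much flexibility can be reproduced with polynomial fits of order M = 1, 3 and 10 to noisy samples of h(x) = 0.5 + 0.4 sin(2πx). The number of points, the noise level and the random seed below are assumptions for illustration.

```python
import numpy as np
from numpy.polynomial import Polynomial

def h(x):
    """Target function h(x) = 0.5 + 0.4 sin(2 pi x)."""
    return 0.5 + 0.4 * np.sin(2.0 * np.pi * x)

rng = np.random.default_rng(seed=7)
x_train = np.linspace(0.0, 1.0, 15)
y_train = h(x_train) + rng.normal(scale=0.05, size=x_train.size)  # assumed noise level
x_test = np.linspace(0.0, 1.0, 200)

train_rms, test_rms = {}, {}
for order in (1, 3, 10):
    fit = Polynomial.fit(x_train, y_train, deg=order)
    train_rms[order] = np.sqrt(np.mean((fit(x_train) - y_train) ** 2))
    test_rms[order] = np.sqrt(np.mean((fit(x_test) - h(x_test)) ** 2))
    print(f"M = {order:2d}: train RMS = {train_rms[order]:.4f}, test RMS = {test_rms[order]:.4f}")
```

The training error can only decrease as M grows, but the error with respect to the true function does not have to: the straight line is too rigid to follow the sine, while a tenth-order polynomial has enough freedom to start fitting the noise.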
The Golden Rule
Keep it simple
As simple as possible
Not any simpler
- Einstein
Measuring the Top Quark Mass
The Discriminants
[Figure: discriminant variables, shaded = top, DØ]
Measuring the Top Quark Mass
DØ Lepton+jets
[Figure panels: Background-rich, Signal-rich]
mt = 173.3 ± 5.6 (stat.) ± 6.2 (syst.) GeV/c²
Strategy for Discovering the Higgs Boson at the Tevatron
P.C. Bhat, R. Gilmartin, H. Prosper, PRD 62 (2000), hep-ph/0001152
WH Results from NN Analysis
MH = 100 GeV/c²
WH vs Wbb
WH (110 GeV/c²) NN Distributions
Results, Standard vs. NN
A good chance of discovery up to MH = 130 GeV/c² with 20-30 fb⁻¹
Improving the Higgs Mass Resolution
[Figure: mass resolution panels labelled 13.8% and 12.2%; 13.1% and 11.3%; 13% and 11%]
Use $m_{jj}$ and $H_T$ ($= \sum E_T^{\text{jets}}$) to train NNs to predict the Higgs boson mass
Newer Approaches
Ensembles of Networks
Committees of Networks: Performance can be better than the best single network
Stacks of Networks: Control both bias and variance
Mixture of Experts: Decompose complex problems
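The committee idea can be illustrated without training anything. In the sketch below each "network" is simulated as the true function plus its own independent error, which is an assumption made to keep the example short; real committee members have partly correlated errors, so the gain is usually smaller. The committee output is the plain average of the member outputs.

```python
import numpy as np

rng = np.random.default_rng(seed=11)
x = np.linspace(0.0, 1.0, 200)
truth = 0.5 + 0.4 * np.sin(2.0 * np.pi * x)      # hypothetical target function

# Stand-ins for trained networks: the truth plus independent errors per member.
n_members = 10
members = truth + rng.normal(scale=0.1, size=(n_members, x.size))

committee = members.mean(axis=0)                 # simple average of member outputs
member_mse = np.mean((members - truth) ** 2, axis=1)
committee_mse = np.mean((committee - truth) ** 2)

print(f"mean member MSE: {member_mse.mean():.4f}")
print(f"best member MSE: {member_mse.min():.4f}")
print(f"committee MSE:   {committee_mse:.4f}")
```

The committee error can never exceed the average member error, because the squared error is a convex function of the output. Whether it also beats the best single member depends on how correlated the member errors are.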
Bayesian Reasoning
The Bayesian approach provides a well-founded mathematical procedure to make straightforward and meaningful model comparisons. It also allows treatment of all uncertainties in a consistent manner.
Examples of useful applications:
Fitting binned data to multi-source models, PLB 407 (1997) 73
Extraction of solar neutrino survival probability, PRL 81 (1998) 5056
Mathematically linked to adaptive algorithms such as Neural Networks (NN)
Hybrid methods involving NN for probability density estimation and Bayesian treatment can be very powerful
Summary
Multivariate methods have already made an impact on discoveries and precision measurements and will be the methods of choice in future analyses.
We have only scratched the surface in our use of advanced analysis algorithms.
Hybrid methods combining “intelligent” algorithms and a probabilistic approach will be the wave of the future!