1 lecture twelve. 2 outline projects failure time analysis linear probability model poisson...
Post on 21-Dec-2015
Team One
• Hongtao Xu: Project choice
• Sasha Hochstadt: Data Retrieval
• Logan McCleod: Statistical Analysis
• Heather Samoville: PowerPoint Presentation
• Christian Helland: Executive Summary
• Meng Yu: Technical Appendix
Assignments
• 1. Project choice
• 2. Data Retrieval
• 3. Statistical Analysis
• 4. PowerPoint Presentation
• 5. Executive Summary
• 6. Technical Appendix
PowerPoint Presentations: Member 4
• 1. Introduction: Members 1, 2, 3
– What
– Why
– How
• 2. Executive Summary: Member 5
• 3. Exploratory Data Analysis: Member 3
• 4. Descriptive Statistics: Member 3
• 5. Statistical Analysis: Member 3
• 6. Conclusions: Members 3 & 5
• 7. Technical Appendix: Table of Contents, Member 6
I. Your report should have an executive summary of one to one and a half pages that summarizes your findings in words for a non-technical reader. It should explain the problem being examined from an economic perspective, i.e. it should motivate the reader's interest in the issue. It should explain, in simple language, how you are investigating the issue and why you are approaching the problem in this particular fashion. It should also explain the economic importance of your findings.
You can attach the technical details of your findings as an appendix.
Grades
Component                    A    B    C
Introduction
Exec. Summary
Exploratory Data Analysis
Descriptive Statistics
Statistical Analysis
Conclusions
Tech. Appendix
Overall Project
Data Sources
• FRED: Federal Reserve Bank of St. Louis, http://research.stlouisfed.org/fred/
– Business/Fiscal
• Index of Consumer Sentiment, Monthly (1952:11)
• Light Weight Vehicle Sales, Auto and Light Truck, Monthly (1976:01)
• Economagic, http://www.economagic.com/
• U S Dept. of Commerce, http://www.commerce.gov/
– Population
– Economic Analysis, http://www.bea.gov/
Data Sources (Cont.)
• Bureau of Labor Statistics, http://stats.bls.gov/
• California Dept of Finance, http://www.dof.ca.gov/
Part II: Failure Time Analysis
• Exponential
– survival function
– hazard rate
• Weibull
• Exploratory Data Analysis, Lab Seven
Trough        Peak           Duration (months)
Oct. 1945     Nov. 1948       37
Oct. 1949     July 1953       45
May 1954      August 1957     39
April 1958    April 1960      24
Feb. 1961     Dec. 1969      106
Nov. 1970     Nov. 1973       36
March 1975    January 1980    58
July 1980     July 1981       12
Nov. 1982     July 1990       92
March 1991    March 2000     120
Duration   # Ending   # At Risk   F(t)   Survivor
0          0          10          0      1
12         1          10          0.1    0.9
24         1          9           0.2    0.8
36         1          8           0.3    0.7
37         1          7           0.4    0.6
39         1          6           0.5    0.5
45         1          5           0.6    0.4
58         1          4           0.7    0.3
92         1          3           0.8    0.2
106        1          2           0.9    0.1
120        1          1           1      0
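The F(t) and Survivor columns can be reproduced with a short calculation. What follows is a minimal Python sketch, assuming no censoring and no tied durations, which holds for the ten expansions listed. The function name `empirical_survivor` is illustrative and not part of the lecture.

```python
# Durations (months) of the ten post-war expansions, in the order listed.
durations = [37, 45, 39, 24, 106, 36, 58, 12, 92, 120]

def empirical_survivor(durations):
    """Return rows (duration, ending, at_risk, F, S) for uncensored, untied data."""
    n = len(durations)
    rows = [(0, 0, n, 0.0, 1.0)]          # starting row: nothing has ended yet
    for k, t in enumerate(sorted(durations), start=1):
        at_risk = n - (k - 1)             # expansions still running just before t
        F = k / n                         # share of expansions ended by time t
        rows.append((t, 1, at_risk, F, 1.0 - F))
    return rows

for t, ending, at_risk, F, S in empirical_survivor(durations):
    print(f"{t:>4} {ending:>2} {at_risk:>3} {F:.1f} {S:.1f}")
```

Each row pairs a duration with the fraction of expansions that lasted longer, which is the quantity plotted as the estimated survivor function.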
Figure 2: Estimated Survivor Function for Post-War Expansions
[Plot of Survivor Function against Duration in Months]
Figure 3: Exponential Trendline Fitted to Estimated Survivor Function
$y = 1.1972\,e^{-0.0217x}$, $R^2 = 0.9533$
[Plot of Survivor Function against Duration in Months]
Figure 4: Constrained Exponential Trendline Fitted to Estimated Survivor Function
$y = e^{-0.019x}$, $R^2 = 0.9313$
[Plot of Survivor Function against Duration in Months]
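The two trendlines in Figures 3 and 4 can be approximated with ordinary least squares on the logarithm of the survivor function. The sketch below assumes that this is how the trendlines were fitted and that the final point (120 months, S = 0) was dropped because ln(0) is undefined. Under those assumptions the coefficients come out close to the ones shown in the figures. The helper names are illustrative.

```python
import math

# Estimated survivor function for the post-war expansions.
# The last point (t = 120, S = 0) is omitted because ln(0) is undefined.
t = [0, 12, 24, 36, 37, 39, 45, 58, 92, 106]
S = [1.0, 0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1]
log_S = [math.log(s) for s in S]

def fit_line(x, y):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    b = sxy / sxx
    return my - b * mx, b

def fit_through_origin(x, y):
    """Least-squares slope for y = b*x (intercept constrained to zero)."""
    return sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)

a, b = fit_line(t, log_S)
b0 = fit_through_origin(t, log_S)
print(f"unconstrained: S(t) = {math.exp(a):.4f} * exp({b:.4f} t)")
print(f"constrained:   S(t) = exp({b0:.4f} t)")
```

Constraining the intercept to zero forces the fitted curve through S(0) = 1, which a survivor function must satisfy.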
Exponential Distribution
• Hazard rate: ratio of density function to the survivor function:
• h(t) = f(t)/S(t)
• measure of probability of failure at time t given that you have survived that long
• for the exponential it is a constant:
• $h(t) = \lambda \exp(-\lambda t) / \exp(-\lambda t) = \lambda$, where $\lambda$ is the rate parameter of the exponential
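For reference, the constant hazard follows directly from the exponential survivor function. A short derivation, writing $\lambda$ for the rate parameter (the symbol is an assumption, since the slide text does not preserve it):

```latex
\begin{aligned}
S(t) &= \exp(-\lambda t), \\
f(t) &= -\frac{dS(t)}{dt} = \lambda \exp(-\lambda t), \\
h(t) &= \frac{f(t)}{S(t)} = \lambda, \\
H(t) &= \int_0^t h(u)\,du = \lambda t = -\ln S(t).
\end{aligned}
```

Under an exponential model, both $\ln S(t)$ and the cumulative hazard $H(t)$ are therefore straight lines through the origin, with slopes $-\lambda$ and $\lambda$ respectively.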
Duration   # Ending   # At Risk   Inter. Haz.
0          0          10          0
12         1          10          0.1000
24         1          9           0.1111
36         1          8           0.1250
37         1          7           0.1429
39         1          6           0.1667
45         1          5           0.2000
58         1          4           0.2500
92         1          3           0.3333
106        1          2           0.5000
120        1          1           1.0000
Duration   # Ending   # At Risk   Inter. Haz.   Cum. Hazard
0          0          10          0             0
12         1          10          0.1000        0.1000
24         1          9           0.1111        0.2111
36         1          8           0.1250        0.3361
37         1          7           0.1429        0.4790
39         1          6           0.1667        0.6456
45         1          5           0.2000        0.8456
58         1          4           0.2500        1.0956
92         1          3           0.3333        1.4290
106        1          2           0.5000        1.929
120        1          1           1.0000        2.929
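The interval hazard in each row is the number ending divided by the number at risk, and the cumulative hazard is the running sum of the interval hazards. A minimal Python sketch of that arithmetic follows, assuming one expansion ends at each duration and none is censored. The function name `hazard_table` is illustrative.

```python
# Durations (months) of the ten post-war expansions, sorted.
durations = [12, 24, 36, 37, 39, 45, 58, 92, 106, 120]

def hazard_table(durations):
    """Return rows (duration, at_risk, interval_hazard, cumulative_hazard)."""
    n = len(durations)
    rows, cumulative = [], 0.0
    for k, t in enumerate(sorted(durations)):
        at_risk = n - k            # expansions that have lasted at least t months
        hazard = 1 / at_risk       # one expansion ends at each listed duration
        cumulative += hazard
        rows.append((t, at_risk, hazard, cumulative))
    return rows

for t, at_risk, hazard, cumulative in hazard_table(durations):
    print(f"{t:>4} {at_risk:>3} {hazard:.4f} {cumulative:.4f}")
```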
Cumulative Hazard Function: Postwar Expansions
$y = 0.0223x - 0.2422$, $R^2 = 0.9288$
[Plot of Cumulative Hazard against Duration in Months]
Cumulative Hazard Function, Postwar Expansions
$y = 0.0192x$, $R^2 = 0.9015$
[Plot of Cumulative Hazard against Duration in Months]
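The two straight lines fitted to the cumulative hazard can be approximated by least squares, with and without an intercept. The sketch below assumes the fits used all eleven points of the cumulative hazard table, including the origin. The helper names are illustrative.

```python
# Durations (months) and the cumulative hazard built from 1 / (number at risk).
durations = [12, 24, 36, 37, 39, 45, 58, 92, 106, 120]
t, H, running = [0], [0.0], 0.0
for k, d in enumerate(durations):
    running += 1 / (len(durations) - k)
    t.append(d)
    H.append(running)

def fit_line(x, y):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    b = sxy / sxx
    return my - b * mx, b

def fit_through_origin(x, y):
    """Least-squares slope for y = b*x (intercept constrained to zero)."""
    return sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)

intercept, slope = fit_line(t, H)
slope0 = fit_through_origin(t, H)
print(f"with intercept:  H(t) = {slope:.4f} t + {intercept:.4f}")
print(f"through origin:  H(t) = {slope0:.4f} t")
```

For an exponential distribution the cumulative hazard is a line through the origin whose slope is the hazard rate, so the constrained slope is an estimate of that rate. It is close to the 0.019 obtained from the constrained exponential fit to the survivor function.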
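Weibull Distribution
• $F(t) = 1 - \exp[-(t/\alpha)^{\beta}]$
• $S(t) = \exp[-(t/\alpha)^{\beta}]$
• $\ln S(t) = -(t/\alpha)^{\beta}$
• $h(t) = f(t)/S(t)$
• $f(t) = dF(t)/dt = (\beta/\alpha)(t/\alpha)^{\beta-1} \exp[-(t/\alpha)^{\beta}]$
• $h(t) = (\beta/\alpha)(t/\alpha)^{\beta-1}$
• if $\beta = 1$, h(t) = constant
• if $\beta > 1$, h(t) is an increasing function
• if $\beta < 1$, h(t) is a decreasing function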
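The three cases for the Weibull hazard can be checked numerically. The sketch below writes the scale parameter as `alpha` and the shape parameter as `beta`. Those names, and the illustrative values alpha = 50 and t = 10, 20, 40, are assumptions rather than values from the lecture.

```python
import math

def weibull_survivor(t, alpha, beta):
    """S(t) = exp[-(t/alpha)^beta]."""
    return math.exp(-((t / alpha) ** beta))

def weibull_density(t, alpha, beta):
    """f(t) = (beta/alpha) (t/alpha)^(beta-1) exp[-(t/alpha)^beta]."""
    return (beta / alpha) * (t / alpha) ** (beta - 1) * weibull_survivor(t, alpha, beta)

def weibull_hazard(t, alpha, beta):
    """h(t) = f(t)/S(t) = (beta/alpha) (t/alpha)^(beta-1)."""
    return (beta / alpha) * (t / alpha) ** (beta - 1)

alpha = 50.0  # illustrative scale, not an estimate from the data
for beta in (0.5, 1.0, 2.0):
    hazards = [weibull_hazard(t, alpha, beta) for t in (10.0, 20.0, 40.0)]
    print(beta, [round(h, 4) for h in hazards])
```

With a shape of 1 the Weibull reduces to the exponential, and the hazard equals 1/alpha at every duration.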
Source: Wayne Nelson, Applied Life Data Analysis (1982), John Wiley
Diesel Generators, hours to fan failure (+ indicates running time, i.e. still running when last observed)
Hours   # Ending   # At Risk   Interval   Interval Hazard Rate   Cumulative Hazard Rate
450
460+
1150
1150
1560+
1600
1660+
1850+
1850+
1850+
1850+
1850+
2030+
2030+
2030+
2070
2070
2080
2200+
Lab Seven
Source: Wayne Nelson, Applied Life Data Analysis (1982), John Wiley
Diesel Generators, hours to fan failure (+ indicates running time, i.e. still running when last observed)
Hours   # Ending   # At Risk   Interval   Interval Hazard Rate   Cumulative Hazard Rate
450     1          70          450
460+               68
1150    2          68          700
1150
1560+              65
1600    1          65          450
1660+              63
1850+              62
1850+              61
1850+              60
1850+              59
1850+              58
2030+              57
2030+              56
2030+              55
2070    2          55          470
2070
2080    1          53          10
Source: Wayne Nelson, Applied Life Data Analysis (1982), John Wiley
Diesel Generators, hours to fan failure (+ indicates running time, i.e. still running when last observed)
Hours   # Ending   # At Risk   Interval   Interval Hazard Rate   Cumulative Hazard Rate
450     1          70          450        0.0143                 0.0143
460+               68
1150    2          68          700        0.0294                 0.0437
1150
1560+              65
1600    1          65          450        0.0154                 0.0591
1660+              63
1850+              62
1850+              61
1850+              60
1850+              59
1850+              58
2030+              57
2030+              56
2030+              55
2070    2          55          470        0.0364                 0.0955
2070
2080    1          53          10         0.0189                 0.1143
2200+              51
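With censored observations (the running times marked +), a unit leaves the risk set when it is last observed but adds to the hazard only if it actually failed. Here is a minimal Python sketch of the interval and cumulative hazard for the observations listed, assuming 70 generators in total (the number at risk in the first row) and that all unlisted units ran longer than 2200 hours. The function name `cumulative_hazard` is illustrative.

```python
from itertools import groupby

N_UNITS = 70  # total generators, taken from the first "# At Risk" entry

# (hours, failed): failed=False marks a running time (shown with "+").
observations = (
    [(450, True), (460, False), (1150, True), (1150, True), (1560, False),
     (1600, True), (1660, False)]
    + [(1850, False)] * 5
    + [(2030, False)] * 3
    + [(2070, True), (2070, True), (2080, True), (2200, False)]
)

def cumulative_hazard(observations, n_units):
    """Return rows (hours, failures, at_risk, interval_hazard, cumulative_hazard)."""
    rows, cumulative, seen = [], 0.0, 0
    for hours, group in groupby(sorted(observations), key=lambda obs: obs[0]):
        group = list(group)
        at_risk = n_units - seen           # units not yet failed or censored
        failures = sum(1 for _, failed in group if failed)
        if failures:                       # censored-only times add no hazard
            hazard = failures / at_risk
            cumulative += hazard
            rows.append((hours, failures, at_risk, hazard, cumulative))
        seen += len(group)
    return rows

for hours, failures, at_risk, hazard, cumulative in cumulative_hazard(observations, N_UNITS):
    print(f"{hours:>5} {failures} {at_risk} {hazard:.4f} {cumulative:.4f}")
```

Only failure times produce rows here, so the sketch does not reproduce the "# At Risk" entries the table shows next to censored observations.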
Cumulative Hazard Rate for Fan Failure
$y = 4 \times 10^{-5}\,x + 0.0089$, $R^2 = 0.9816$
[Plot of Cumulative Hazard against Duration in Hours]
LOTTERY    AGE        CHILDREN   EDUCATION  INCOME
5.000000   50.00000   2.000000   15.00000   41.00000
7.000000   26.00000   0.000000   10.00000   22.00000
0.000000   40.00000   3.000000   13.00000   24.00000
10.00000   46.00000   2.000000   9.000000   20.00000
5.000000   40.00000   3.000000   14.00000   32.00000
5.000000   39.00000   2.000000   15.00000   42.00000
3.000000   36.00000   3.000000   8.000000   18.00000
0.000000   44.00000   1.000000   16.00000   47.00000
0.000000   47.00000   4.000000   20.00000   85.00000
6.000000   52.00000   1.000000   10.00000   23.00000
0.000000   51.00000   2.000000   18.00000   61.00000
0.000000   41.00000   2.000000   17.00000   70.00000
12.00000   42.00000   2.000000   9.000000   22.00000
7.000000   53.00000   1.000000   12.00000   27.00000
11.00000   72.00000   1.000000   9.000000   25.00000
Bernoulli Variable: Bern
• Bern = 0*(lottery=0) + 1*(lottery>0)
• Linear Probability Model: dummy dependent variable
• Bern(i) = c + a*income(i) + b*age(i) + d*children(i) + f*education(i) + e(i)
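The linear probability model is an ordinary least squares regression with the 0/1 variable Bern on the left-hand side. The sketch below builds Bern and estimates the coefficients on the 15 observations shown in the lottery table only, so the estimates need not match anything computed from the full data set. The `solve` helper is a small hand-written linear solver, included to keep the example self-contained.

```python
# The 15 observations shown: (lottery, age, children, education, income).
data = [
    (5, 50, 2, 15, 41), (7, 26, 0, 10, 22), (0, 40, 3, 13, 24),
    (10, 46, 2, 9, 20), (5, 40, 3, 14, 32), (5, 39, 2, 15, 42),
    (3, 36, 3, 8, 18), (0, 44, 1, 16, 47), (0, 47, 4, 20, 85),
    (6, 52, 1, 10, 23), (0, 51, 2, 18, 61), (0, 41, 2, 17, 70),
    (12, 42, 2, 9, 22), (7, 53, 1, 12, 27), (11, 72, 1, 9, 25),
]

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

# Bern = 1 for players (lottery > 0) and 0 for non-players.
y = [1.0 if row[0] > 0 else 0.0 for row in data]
# Regressors in the order of the slide's equation: constant, income, age, children, education.
X = [[1.0, inc, age, kids, educ] for _, age, kids, educ, inc in data]

# Normal equations (X'X) coef = X'y.
k = len(X[0])
XtX = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
Xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
coef = solve(XtX, Xty)

fitted = [sum(c * x for c, x in zip(coef, r)) for r in X]
residuals = [yi - fi for yi, fi in zip(y, fitted)]
for name, c in zip(["c", "a (income)", "b (age)", "d (children)", "f (education)"], coef):
    print(f"{name:>14}: {c: .4f}")
```

The fitted values are read as estimated probabilities of playing. Nothing in the model restricts them to the interval from 0 to 1, which is a known limitation of the linear probability model.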
Averages for Players and Non-Players
Bern   age     children   education   income
0      40.49   1.78       15.57       47.565
1      44.19   1.78       11.94       28.545
Part IV. Poisson Approximation to Binomial
• Conditions: $p \to 0$, $(1 - p) \approx 1$, $n$ large (about 50 or more)
• $f(x) = \exp(-\mu)\,\mu^{x} / x!$, where $\mu$ is the mean of the Poisson
• Assumptions:
– the numbers of events occurring in non-overlapping intervals are independent
– the probability of a single event occurring in a small interval is approximately proportional to the interval
– the probability of more than one event in a small interval is negligible
Example
• Ten percent of tools produced in a manufacturing process are defective. What is the probability of finding exactly two defectives in a random sample of 10?
• Binomial: $p(k=2) = \frac{10!}{8!\,2!}(0.1)^{2}(0.9)^{8} = 0.194$
• Poisson, where $\mu$, the mean of the Poisson, equals n*p = 10*0.1 = 1: $p(k=2) = \exp(-1)\,1^{2} / 2! = 0.184$
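The two probabilities in this example can be checked with a few lines of Python. This is a minimal sketch using only the standard library. The function names are illustrative.

```python
import math

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

def poisson_pmf(k, mu):
    """Probability of exactly k events when the mean number of events is mu."""
    return math.exp(-mu) * mu**k / math.factorial(k)

n, p, k = 10, 0.1, 2
print(round(binomial_pmf(k, n, p), 3))   # 0.194
print(round(poisson_pmf(k, n * p), 3))   # 0.184
```

With n = 10 the sample is well below the size at which the approximation is usually recommended, so the gap of about 0.01 between the two answers is not surprising.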