The Surprising Consequences of Randomness
LS 829 Mathematics in Science and Civilization
Feb 6, 2010
04/21/23 1LS 829 - 2010
Sources and Resources
• Peck, R., et al. (2006) Statistics: A Guide to the Unknown, 4th ed. Duxbury.
• Taleb, N. N. (2008) Fooled by Randomness: The Hidden Role of Chance in the Markets and Life, 2nd Edition. Random House.
• Mlodinow, L. (2008) The Drunkard’s Walk. Vintage Books. New York.
• Rosenthal, J. S. (2005) Struck by Lightning. Harper Perennial. Toronto.
• www.stat.sfu.ca/~weldon
Introduction
• Randomness concerns Uncertainty - e.g. Coin
• Does Mathematics concern Certainty? - P(H) = 1/2
• Probability can help to Describe Randomness & “Unexplained Variability”
• Randomness & Probability are key concepts for exploring implications of “unexplained variability”
Abstract                 Real World
Mathematics              Applications of Mathematics
Probability              Applied Statistics
Surprising Findings      Useful Principles
Nine Findings and Associated Principles
Example 1 - When is Success just Good Luck?
An example from the world of Professional Sport
Sports League - Football
Success = Quality or Luck?
2007 AFL LADDER
TEAM               Played  Win  Draw  Loss  Points For  Points Against  Ratio  Points
Geelong                22   18     -     4        2542            1664    153      72
Port Adelaide          22   15     -     7        2314            2038    114      60
West Coast Eagles      22   15     -     7        2162            1935    112      60
Kangaroos              22   14     -     8        2183            1998    109      56
Hawthorn               22   13     -     9        2097            1855    113      52
Collingwood            22   13     -     9        2011            1992    101      52
Sydney Swans           22   12     1     9        2031            1698    120      50
Adelaide               22   12     -    10        1881            1712    110      48
St Kilda               22   11     1    10        1874            1941     97      46
Brisbane Lions         22    9     2    11        1986            1885    105      40
Fremantle              22   10     -    12        2254            2198    103      40
Essendon               22   10     -    12        2184            2394     91      40
Western Bulldogs       22    9     1    12        2111            2469     86      38
Melbourne              22    5     -    17        1890            2418     78      20
Carlton                22    4     -    18        2167            2911     74      16
Richmond               22    3     1    18        1958            2537     77      14
Recent News Report
“A crowd of 97,302 has witnessed Geelong break its 44-year premiership drought by crushing a hapless Port Adelaide by a record 119 points in Saturday's grand final at the MCG.” (2007 Season)
Are there better teams?
• How much variation in the total points table would you expect IF every team had the same chance of winning every game? i.e. every game is 50-50.
• Try the experiment with 5 teams. H=Win T=Loss (ignore Ties for now)
5 Team Coin Toss Experiment
• Win=4, Tie=2, Loss=0 but we ignore ties. P(W)=1/2
• 5 teams (1,2,3,4,5) so 10 games as follows
• 1-2, 1-3, 1-4, 1-5, 2-3, 2-4, 2-5, 3-4, 3-5, 4-5
My experiment …
• T T H T T H H H H T
Experiment Result ----->
Team Points
3 16
2 12
5 8
1 4
4 0
But all teams Equal Quality (Equal Chance to win)
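The coin-toss experiment can be replayed in a few lines. This is only a sketch: it assumes that H means the first-listed team in each pairing wins and T means the second-listed team wins (an assumption, though it does reproduce the points table on the slide). The function name `score_round_robin` is made up for illustration.

```python
from itertools import combinations

def score_round_robin(teams, tosses, win_points=4):
    """Score a round robin in which each coin toss decides one game.

    Assumption: 'H' means the first-listed team wins, 'T' the second.
    """
    games = list(combinations(teams, 2))  # 1-2, 1-3, 1-4, 1-5, 2-3, ...
    if len(tosses) != len(games):
        raise ValueError("need exactly one toss per game")
    points = {team: 0 for team in teams}
    for (first, second), toss in zip(games, tosses):
        winner = first if toss == "H" else second
        points[winner] += win_points
    return points

# The toss sequence recorded on the slide.
tosses = "T T H T T H H H H T".split()
points = score_round_robin([1, 2, 3, 4, 5], tosses)

# Print the ladder, highest points first.
for team, pts in sorted(points.items(), key=lambda kv: -kv[1]):
    print(team, pts)
```

Replacing `tosses` with a fresh sequence of real coin tosses gives a different ladder each time, even though no team is better than any other.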
Implications?
• Points spread due to chance?
• Top team may be no better than the bottom team (in chance to win).
Simulation: 16 teams, equal chance to win, 22 games
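A rough version of this simulation is sketched here. The fixture is an assumption: each of the 22 rounds pairs the 16 teams at random, which is simpler than a real fixture list, and draws are ignored. The function name `simulate_season` and the seed are illustrative only.

```python
import random

def simulate_season(n_teams=16, n_rounds=22, win_points=4, rng=None):
    """Simulate a season in which every game is a fair coin toss.

    Assumption: each round pairs the teams at random, so every team
    plays exactly once per round. A real fixture list is not random.
    """
    rng = rng or random.Random()
    points = [0] * n_teams
    teams = list(range(n_teams))
    for _ in range(n_rounds):
        rng.shuffle(teams)
        for i in range(0, n_teams, 2):
            # Equal quality: either side wins with probability 1/2.
            winner = teams[i] if rng.random() < 0.5 else teams[i + 1]
            points[winner] += win_points
    return sorted(points, reverse=True)

ladder = simulate_season(rng=random.Random(2007))
print("simulated points, top to bottom:", ladder)
print("spread (top minus bottom):", ladder[0] - ladder[-1])
```

Running it with several seeds shows how wide a points spread pure chance can produce among teams of identical quality.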
Sports League - Football
Success = Quality or Luck?
Compare the simulated points spread with the 2007 AFL ladder shown earlier.
Does it Matter?
• Avoiding foolish predictions
• Managing competitors (of any kind)
• Understanding the business of sport
• Appreciating the impact of uncontrolled variation in everyday life
Point of this Example?
Need to discount “chance” in making inferences from everyday observations.
Example 2 - Order from Apparent Chaos
An example from some personal data collection
Gasoline Consumption
Each Fill - record kms and litres of fuel used
Smooth ---> Seasonal Pattern … Why?
Pattern Explainable?
Air temperature?
Rain on roads?
Seasonal Traffic Pattern?
Tire Pressure?
Info Extraction Useful for Exploration of Cause
Smoothing was key technology in info extraction
Jan 12, 2010 STAT 100 22
Intro to smoothing with context …
Optimal Smoothing Parameter?
• Depends on Purpose of Display
• Choice Ultimately Subjective
• Subjectivity is a necessary part of good data analysis
Summary of this Example
• Surprising? Order from Chaos …
• Principle - Smoothing and Averaging reveal patterns encouraging investigation of cause
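The smoothing idea can be tried on made-up data. The sketch below does not use the author's fill-up record: the seasonal cycle, the noise level and the number of fills per year are all assumptions, and a plain centered moving average stands in for whatever smoother was used on the slide.

```python
import math
import random

def moving_average(values, window):
    """Centered moving average; the window shrinks near the ends."""
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        chunk = values[max(0, i - half): i + half + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed

# Synthetic stand-in for the fill-up record (NOT the author's data):
# a small seasonal cycle buried in much larger fill-to-fill noise.
rng = random.Random(1)
n_fills = 150                      # assumed: about 50 fills per year
seasonal = [9.0 + 0.5 * math.cos(2 * math.pi * i / 50) for i in range(n_fills)]
raw = [s + rng.gauss(0, 1.5) for s in seasonal]

def rms_error(series):
    """Root-mean-square distance from the hidden seasonal pattern."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(series, seasonal)) / n_fills)

# Window 1 is the raw data; very wide windows flatten the cycle itself.
for window in (1, 5, 15, 51):
    print(window, round(rms_error(moving_average(raw, window)), 2))
```

The window width plays the role of the smoothing parameter: too narrow leaves the noise in, too wide averages the seasonal pattern away, and the choice in between depends on the purpose of the display.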
3. Weather Forecasting
Chaotic Weather
• 1900 – equations too complicated to solve
• 2000 – solvable but still poor predictors
• 1963 – The “Butterfly Effect”: small changes in initial conditions -> large short term effects
• today – ensemble forecasting see p 173
• Rupert Miller p 178 – stats for short term …
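Sensitivity to initial conditions is easy to see in a toy system. The sketch below uses the logistic map, which is a standard textbook example of chaos and is not a weather model; it is swapped in here purely as an illustration, and the starting value and perturbation size are arbitrary choices.

```python
def logistic_orbit(x0, steps, r=4.0):
    """Iterate x -> r * x * (1 - x), a standard toy chaotic system."""
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1] * (1.0 - xs[-1]))
    return xs

a = logistic_orbit(0.2, 50)
b = logistic_orbit(0.2 + 1e-10, 50)   # a "butterfly-sized" change

# Gap between the two runs at selected steps.
for step in (0, 10, 20, 30, 40, 50):
    print(step, abs(a[step] - b[step]))
```

The two runs start almost identically, yet within a few dozen steps they bear no resemblance to each other, which is the difficulty that extra computing power alone cannot remove.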
Conclusion from Weather Example?
• It may not be true that weather forecasting will improve dramatically in the future
• Some systems have inherent instability, and increased computing power may not be enough to break through this barrier
Example 4 - Obtaining Confidential Information
• How can you ask an individual for data on
– Incomes
– Illegal Drug use
– Sex modes
– … etc.
in a way that will get an honest response?
There is a need to protect confidentiality of answers.
Example: Marijuana Usage
• Randomized Response Technique
Pose two Yes-No questions and have coin toss determine which is answered
Head: 1. Do you use Marijuana regularly?
Tail: 2. Is your coin toss outcome a tail?
Randomized Response Technique
• Suppose 60 of 100 answer Yes. Then about 50 are saying they have a tail. So 10 of the other 50 are users. 20%.
• It is a way of using randomization to protect Privacy. Public Data banks have used this.
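The arithmetic behind the estimate is P(yes) = 1/2 × (proportion of users) + 1/2 × 1, since everyone who tosses a tail truthfully answers yes. A small sketch follows; the function names, the simulated sample size and the 20% prevalence fed to the simulation are illustrative assumptions.

```python
import random

def estimate_prevalence(yes_count, n):
    """Invert P(yes) = 1/2 * prevalence + 1/2 for the prevalence."""
    return (2 * yes_count - n) / n

def simulate_survey(n, true_prevalence, rng):
    """Count 'yes' answers; a single 'yes' does not reveal which question was answered."""
    yes = 0
    for _ in range(n):
        if rng.random() < 0.5:
            # Head: answer the sensitive question truthfully.
            if rng.random() < true_prevalence:
                yes += 1
        else:
            # Tail: "Is your coin toss outcome a tail?" is always yes.
            yes += 1
    return yes

# The slide's numbers: 60 of 100 say yes.
print(estimate_prevalence(60, 100))

# A larger simulated survey with an assumed true prevalence of 20%.
rng = random.Random(4)
n = 100_000
simulated_estimate = estimate_prevalence(simulate_survey(n, 0.20, rng), n)
print(simulated_estimate)
```

The interviewer never learns which question any one person answered, yet the aggregate count still pins down the proportion of users, more tightly as the sample grows.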
Summary of Example 4
• Surprising that people can be induced to provide sensitive information in public
• The key technique is to make use of the predictability of certain empirical probabilities.
5. Randomness in the Markets
• 5A. Trends That Deceive
• 5B. The Power of Diversification
• 5C. Back-the-winner fallacy
5A. Trends That Deceive
People often fail to appreciate the effects of randomness
The Random Walk
Trends that do not persist
Longer Random Walk
Recent Intel Stock Price
Things to Note
• The random walk has no patterns useful for prediction of direction in future
• Stock price charts are well modeled by random walks
• Advice about future direction of stock prices – take with a grain of salt!
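One way to check the first point is to simulate a symmetric random walk and ask how often a recent "trend" continues. The sketch below is illustrative only: the 20-step window, the walk length and the helper name `trend_persistence` are assumptions, not taken from the slides.

```python
import random

def random_walk(n_steps, rng, p_up=0.5):
    """Cumulative sum of +1/-1 steps, starting at 0."""
    position, path = 0, [0]
    for _ in range(n_steps):
        position += 1 if rng.random() < p_up else -1
        path.append(position)
    return path

def trend_persistence(path, window=20):
    """Fraction of times the next `window` steps move in the same
    direction as the previous `window` steps (flat windows skipped)."""
    same = total = 0
    for t in range(window, len(path) - window, window):
        past = path[t] - path[t - window]
        future = path[t + window] - path[t]
        if past != 0 and future != 0:
            total += 1
            if (past > 0) == (future > 0):
                same += 1
    return same / total

rng = random.Random(5)
path = random_walk(200_000, rng)
persistence = trend_persistence(path)
print("share of trends that continued:", round(persistence, 3))
```

A value near one half means that knowing the walk has just risen says nothing useful about whether it will rise next, however convincing the trend looks on a chart.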
5B. The Power of Diversification
People often fail to appreciate the effects of randomness
Preliminary Proposal
I offer you the following “investment opportunity”
You give me $100. At the end of one year, I will return an amount determined by tossing a fair coin twice, as follows:
$0 ……… 25% of the time (TT)
$50 ……. 25% of the time (TH)
$100 …… 25% of the time (HT)
$400 …… 25% of the time (HH)
Would you be interested?
Stock Market Investment
• Risky Company - example in a known context
• Return in 1 year for 1 share costing $1
0.00 25% of the time
0.50 25% of the time
1.00 25% of the time
4.00 25% of the time
i.e. Lose Money 50% of the time
Only Profit 25% of the time
“Risky” because high chance of loss
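The payoff table can be summarised exactly. This short sketch uses only the four payoffs and their 25% probabilities from the slide; the variable names are illustrative.

```python
from fractions import Fraction

# Payoffs per $1 invested, each with probability 1/4 (from the slide).
payoffs = [Fraction(0), Fraction(1, 2), Fraction(1), Fraction(4)]
prob = Fraction(1, 4)

mean = sum(prob * x for x in payoffs)
variance = sum(prob * (x - mean) ** 2 for x in payoffs)
p_loss = sum(prob for x in payoffs if x < 1)
p_profit = sum(prob for x in payoffs if x > 1)

print("expected payoff per $1:", float(mean))
print("standard deviation:", round(float(variance) ** 0.5, 3))
print("P(lose money):", p_loss, " P(profit):", p_profit)
```

The expected payoff is above $1 even though a loss is twice as likely as a profit, which is why a single such share looks risky while a large collection of them may not be.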
Independent Outcomes
• What if you have the chance to put $1 into each of 100 such companies, where the companies are all in very different markets?
• What sort of outcomes then? Use coin-tossing (by computer) to explore
Diversification Unrelated Companies
• Choose 100 unrelated companies, each one risky like this. Outcome is still uncertain but look at typical outcomes ….
One-Year Returns to a $100 investment
Looking at Profit only
Avg Profit approx 38%
Gamblers like Averages and Sums!
• The sum of 100 independent investments in risky companies is very predictable!
• Sums (and averages) are more stable than the things summed (or averaged).
• Square root law for variability of averages
Variation -----> Variation/√n
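The square root law can be checked by computer coin-tossing. The sketch below reuses the four equally likely payoffs of the risky company (0, 0.50, 1.00 and 4.00 per $1) and assumes the companies are fully independent; the number of trials and the seed are arbitrary.

```python
import random
import statistics

PAYOFFS = (0.0, 0.5, 1.0, 4.0)   # per $1, equally likely (from the slides)

def average_payoff(n_companies, rng):
    """Average payoff per $1 across n independent risky companies."""
    return sum(rng.choice(PAYOFFS) for _ in range(n_companies)) / n_companies

rng = random.Random(5)
trials = 2000
spread = {}
for n in (1, 4, 25, 100):
    averages = [average_payoff(n, rng) for _ in range(trials)]
    spread[n] = statistics.stdev(averages)
    # n, mean payoff per $1, standard deviation of the average
    print(n, round(statistics.mean(averages), 3), round(spread[n], 3))

# How often does a 100-company portfolio return less than was invested?
losing_share = sum(average_payoff(100, rng) < 1.0 for _ in range(trials)) / trials
print("share of 100-company portfolios that lose money:", losing_share)
```

The mean stays near 1.375 for every n, while the spread shrinks roughly in proportion to 1/√n, so the 100-company portfolio should come out about ten times less variable than a single company.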
Summary - Diversification
• Variability is not Risk
• Stocks with volatile prices can be good investments
• Criteria for Portfolio of Volatile Stocks
– profitable on average
– independence (or not severe dependence)
5C - Back-the-winner fallacy
• Mutual Funds - a way of diversifying a small investment
• Which mutual fund?
• Look at past performance?
• Experience from symmetric random walk …
Implication from Random Walk …?
• Stock market trends may not persist
• Past might not be a good guide to future
• Some fund managers better than others?
• A small difference can result in a big difference over a long time …
A simulation experiment to determine the value of past performance data
• Simulate good and bad managers
• Pick the best ones based on 5 years data
• Simulate a future 5-yrs for these select managers
How to describe good and bad fund managers?
• Use TSX Index over past 50 years as a guide ---> annualized return is 10%
• Use a random walk with a slight upward trend to model each manager.
• Daily change positive with probability p
Good manager ROR = 13% pa p=.56
Medium manager ROR = 10% pa p=.55
Poor manager ROR = 8% pa p=.54
Simulation to test “Back the Winner”
• 100 managers assigned various p parameters in .54 to .56 range
• Simulate for 5 years
• Pick the top-performing managers (top 15%)
• Use the same 100 p-parameters to simulate a new 5 year experience
• Compare new outcome for “top” and “bottom” managers
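The experiment just described might be sketched as follows. The slides do not give the size of a daily move or the number of trading days, so both are assumptions here: 250 trading days a year and a daily log-change of 0.0038, chosen so that p=.55 gives roughly 10% a year. Taking "bottom" to mean the bottom 15% is also an assumption.

```python
import math
import random

DAYS = 250 * 5          # assumed: 250 trading days a year, 5 years
STEP = 0.0038           # assumed daily log-change; about 10% pa at p=.55

def five_year_value(p, rng, start=100.0):
    """Fund value after DAYS daily moves of +/-STEP in log value."""
    ups = sum(rng.random() < p for _ in range(DAYS))
    return start * math.exp(STEP * (2 * ups - DAYS))

def mean(xs):
    return sum(xs) / len(xs)

rng = random.Random(52)
skill = [rng.uniform(0.54, 0.56) for _ in range(100)]   # hidden p per manager
first = [five_year_value(p, rng) for p in skill]
second = [five_year_value(p, rng) for p in skill]       # same p, new luck

# Rank managers on the first five years only.
ranked = sorted(range(100), key=lambda i: first[i], reverse=True)
top, bottom = ranked[:15], ranked[-15:]

print("top 15, first 5 years:   ", round(mean([first[i] for i in top]), 1))
print("top 15, next 5 years:    ", round(mean([second[i] for i in top]), 1))
print("bottom 15, next 5 years: ", round(mean([second[i] for i in bottom]), 1))
```

Under these assumptions the luck in five years of daily moves is larger than the skill differences between managers, so the first-period ranking carries little information about the second period.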
START=100
Mutual Fund Advice?
Don’t expect past relative performance to be a good indicator of future relative performance.
Again - need to give due allowance for randomness (i.e. LUCK)
Summary of Example 5C
• Surprising that Past Performance is such a poor indicator of Future Performance
• Simulation is the key to exploring this issue
6. Statistics in the Courtroom
• Kristen Gilbert Case
• Data p 6 of article – 10 years data needed!
• Table p 9 of article – rare outcome if only randomness involved. P-value logic.
• Discount randomness but not quite proof
• Prosecutor’s Fallacy P[E|I] ≠ P[I|E]
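A worked example shows how far apart P[E|I] and P[I|E] can be. Reading E as "the evidence is observed" and I as "the accused is innocent" is an assumption about the slide's notation, and every number below is invented for illustration; none of them comes from the Gilbert case.

```python
from fractions import Fraction

# Invented numbers, for illustration only (NOT from the Gilbert case).
p_innocent = Fraction(999, 1000)            # prior P[I]
p_evidence_if_innocent = Fraction(1, 100)   # P[E|I]: evidence is rare by chance
p_evidence_if_guilty = Fraction(9, 10)      # P[E|not I]

# Bayes' rule: P[I|E] = P[E|I] P[I] / P[E]
p_evidence = (p_evidence_if_innocent * p_innocent
              + p_evidence_if_guilty * (1 - p_innocent))
p_innocent_given_evidence = p_evidence_if_innocent * p_innocent / p_evidence

print("P[E|I] =", float(p_evidence_if_innocent))
print("P[I|E] =", round(float(p_innocent_given_evidence), 3))
```

With these made-up inputs the evidence has only a 1% chance of arising if the accused is innocent, yet most people who produce that evidence are still innocent, because innocence was so much more likely to begin with.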
Lesson from Gilbert Case
• Statistical logic is subtle
• Easy to misunderstand
• Subjectivity necessary in some decision-making
Example 7 - Lotteries: Expectation and Hope
• Cash flow
– Ticket proceeds in (100%)
– Prize money out (50%)
– Good causes (35%)
– Administration and Sales (15%)
• $1.00 ticket worth 50 cents, on average
• Typical lottery P(jackpot) = .00000007 (about 1 in 14 million)
How small is .00000007?
• Buy 10 tickets every week for 60 years
• Cost is $31,200.
• Chance of winning jackpot is = ….
1/5 of 1 percent!
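The arithmetic can be checked directly. The sketch assumes jackpot odds of about 1 in 14 million (the figure named in the closing list of concepts) and a ticket price of $1.00, which is what makes the $31,200 cost come out.

```python
import math

p_jackpot = 1 / 14_000_000        # assumed: about 1 in 14 million per ticket
tickets = 10 * 52 * 60            # 10 tickets a week for 60 years
cost = tickets * 1.00             # assumed $1.00 per ticket

# P(at least one jackpot) = 1 - (1 - p)^tickets, computed stably.
p_at_least_one = -math.expm1(tickets * math.log1p(-p_jackpot))

# On average 50% of ticket proceeds come back as prizes.
expected_prizes = 0.50 * cost

print("tickets bought:", tickets)
print("total cost: $", cost)
print("chance of at least one jackpot: {:.2%}".format(p_at_least_one))
print("average prize money returned: $", expected_prizes)
```

Even a lifetime of steady buying leaves the chance of a jackpot at roughly a fifth of one percent, while on average half of the money spent is never returned.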
Summary of Example 7
• Surprising that lottery tickets provide so little hope!
• Key technology is simple use of probabilities
Nine Surprising Findings
1. Sports Leagues - Lack of Quality Differentials
2. Gasoline Mileage - Seasonal Patterns
3. Weather - May be too unstable to predict
4. Marijuana – Can get Confidential info
5A. Random Walk – Trends that are not there
5B. Risky Stocks – Possibly a Reliable Investment
5C. Mutual Funds – Past Performance not much help
6. Gilbert Case – Finding Signal amongst Noise
7. Lotteries - Lightning Seldom Strikes
Nine Useful Concepts & Techniques?
1. Sports Leagues - Unexplained variation can cause illusions - simulation can inform
2. Gasoline Mileage - Averaging (and smoothing) amplifies signals
3. Weather – Beware the Butterfly Effect!
4. Marijuana – Randomized Response Surveys
5A. Random Walks – Simulation can inform
5B. Risky Stocks - Simulation can inform
5C. Mutual Funds - Simulation can inform
6. Gilbert Case – Extracts Signal from Noise
7. Lotteries – 14 million is a big number!
Role of Math?
• Key background for
– Graphs
– Probabilities
– Simulation models
– Smoothing Methods
• Important for constructing theory of inference
Limitation of Math
• Subjectivity Necessary in Decision-Making
• Extracting Information from Data is still partly an “art”
• Context is suppressed in a mathematical approach to problem solving
• Context is built in to a statistical approach to problem solving