


The Noise Bottleneck or How Noise Explodes Faster than Data

(very Brief Note for the Signal Noise Section in Antifragile)

Nassim N. Taleb, August 25, 2013

The paradox is that increase in sample size magnifies the role of noise (or luck).

Keywords: Big Data, Fooled by Randomness, Noise/Signal

PRELIMINARY DRAFT

Introduction

It has always been absolutely silly to be exposed to the news. Things are worse today thanks to the web.

We are getting more information, but with constant "consciousness", "desk space", or "visibility". Google News, Bloomberg News, etc. have space for, say, <100 items at any point in time. But there are millions of events every day. As the world becomes more connected, with the global dominating over the local, the number of sources of news is multiplying. But your consciousness remains limited. So we are experiencing a winner-take-all effect in information: like a large movie theatre with a small door.

Likewise we are getting more data. The size of the door remains constant; the theater is getting larger.

The winner-take-all effect in information space corresponds to more noise, less signal. In other words, the spurious dominates.

Similarity with the Fooled by Randomness Bottleneck. This is similar to my idea that spurious returns come to dominate finance as the number of players gets large, and swamp the more solid ones. Start with the idea (see Taleb 2001) that as a population of operators in a profession marked by a high degree of randomness increases, the number of stellar results, stellar for completely random reasons, gets larger. The "spurious tail" is therefore the number of persons who rise to the top for no reason other than mere luck, with subsequent rationalizations, analyses, explanations, and attributions. The performance in the "spurious tail" is only a matter of the number of participants, the base population of those who tried. Assuming a symmetric market, if one has for base population 1 million persons with zero skills and ability to predict starting Year 1, there should be 500K spurious winners in Year 2, 250K in Year 3, 125K in Year 4, etc. One can easily see that the size of the winning population in, say, Year 10 depends on the size of the base population in Year 1; doubling the initial population would double the straight winners. Injecting skills in the form of better-than-random abilities to predict does not change the story by much. (Note that this idea has been severely plagiarized by someone, about which a bit more soon.) Because of scalability, the top, say, 300 managers get the bulk of the allocations, with the lion's share going to the top 30. So it is obvious that the winner-take-all effect causes distortions: say there are m initial participants and the "top" k managers are selected; the result is a ratio k/m of managers in play. As the base population gets larger, that is, as m increases linearly, we push into the tail probabilities. Here read skills for information, noise for spurious performance, and translate the problem into information and news.

The paradox: this is quite paradoxical, as we are accustomed to the opposite effect, namely that a large increase in sample size reduces the effect of sampling error; here, the narrowness of M puts sampling error on steroids.


Derivations

Let $Z \equiv \left(z_i^j\right)_{1 \le j \le m,\; 1 \le i \le n}$ be an $(n \times m)$-sized population of variations: $m$ population series and $n$ data points per distribution, with $i, j \in \mathbb{N}$; assume "noise" or scale of the distribution $\sigma \in \mathbb{R}^+$, signal $\mu \ge 0$. Clearly $\sigma$ can accommodate distributions with infinite variance, but we need the expectation to be finite. Assume i.i.d. for a start.

Cross Sectional (n = 1)

Special case n = 1: we are just considering news/data without historical attributes.

Let $F^{\leftarrow}$ be the generalized inverse distribution, or the quantile, $F^{\leftarrow}(w) \equiv \inf\{t \in \mathbb{R} : F(t) \ge w\}$, for all nondecreasing distribution functions $F(x) \equiv P(X < x)$. For distributions without compact support, $w \in (0,1)$; otherwise $w \in [0,1]$. In the case of continuous and increasing distributions, we can write $F^{-1}$ instead. The signal is in the expectation, so $E(z)$ is the signal, and $\sigma$, the scale of the distribution, determines the noise (which for a Gaussian corresponds to the standard deviation). Assume for now that all noises are drawn from the same distribution.

Assume a constant probability for the "threshold", $z = \frac{k}{m}$, where $k$ is the size of the window of the arrival. Since we assume that $k$ is constant, it matters greatly that the quantile covered shrinks with $m$.
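The shrinking quantile is easy to see with concrete numbers (hypothetical values: a fixed attention window of $k = 100$ items, standard Gaussian noise):

```python
# A fixed window of k = 100 items over a growing universe of m candidate
# items: the covered quantile z = k/m shrinks, and the implied Gaussian
# cutoff for "making the window" climbs. (Numbers are illustrative.)
from statistics import NormalDist

k = 100
std_normal = NormalDist(mu=0.0, sigma=1.0)

for m in (1_000, 100_000, 10_000_000):
    z = k / m                              # quantile covered by the window
    threshold = std_normal.inv_cdf(1 - z)  # inverse survival function
    print(f"m={m:>10,}  z={z:.1e}  threshold={threshold:.2f} sigma")
```

As m grows from a thousand to ten million, the cutoff for entering the window climbs from about 1.28σ to about 4.26σ: with k constant, more data pushes selection ever deeper into the tail.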

Gaussian Noise

We set $z$ as the reachable noise. The quantile becomes:

\[
F^{-1}(w) = \sqrt{2}\,\sigma\,\operatorname{erfc}^{-1}(2w) + \mu
\]

where $\operatorname{erfc}^{-1}$ is the inverse complementary error function.

Of more concern is the survival function, $\bar{F} \equiv \bar{F}(x) \equiv P(X > x)$, and its inverse $\bar{F}^{-1}$:

\[
\bar{F}^{-1}_{\sigma,\mu}(z) = -\sqrt{2}\,\sigma\,\operatorname{erfc}^{-1}\!\left(\frac{2k}{m}\right) + \mu \tag{1}
\]

Note that $\sigma$ (noise) is multiplicative, while $\mu$ (signal) is additive.

As information increases, $z$ becomes smaller, and $\bar{F}^{-1}$ moves away in standard deviations. But nothing yet by comparison with fat tails.
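Eq. (1) can be sanity-checked numerically (a sketch, not the paper's code; σ, μ, k, m are illustrative values, and `erfcinv` is a stdlib-only bisection stand-in for the inverse complementary error function):

```python
# Check that -sqrt(2)*sigma*erfcinv(2*k/m) + mu matches the Gaussian
# quantile at z = k/m; by symmetry, its distance from mu is the number of
# standard deviations the threshold sits at. (Values are illustrative.)
from statistics import NormalDist
import math

def erfcinv(y: float) -> float:
    """Inverse complementary error function by bisection (erfc is decreasing)."""
    lo, hi = -10.0, 10.0
    for _ in range(200):
        mid = (lo + hi) / 2
        if math.erfc(mid) > y:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

sigma, mu, k, m = 2.0, 0.0, 100, 1_000_000
z = k / m
eq1 = -math.sqrt(2) * sigma * erfcinv(2 * z) + mu   # Eq. (1) as printed
reference = NormalDist(mu, sigma).inv_cdf(z)        # Gaussian quantile at z
print(eq1, reference)   # both approx -7.44, i.e. about 3.72 sigma from mu
```

The two numbers agree; multiplying σ multiplies the whole threshold, while μ only shifts it, which is the multiplicative/additive point above.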


Figure 1: Gaussian, $\sigma = \{1, 2, 3, 4\}$ (threshold against $z$).

Fat Tailed Noise

Now we take a Student T distribution as a substitute for the Gaussian.

\[
f(x) \equiv \frac{\left(\dfrac{\alpha}{\alpha + \frac{(x-\mu)^2}{\sigma^2}}\right)^{\frac{\alpha+1}{2}}}{\sqrt{\alpha}\,\sigma\,B\!\left(\frac{\alpha}{2}, \frac{1}{2}\right)} \tag{2}
\]

From this we can get the inverse survival function:

\[
g^{-1}_{\sigma,\mu}(z) = \mu + \sqrt{\alpha}\,\sigma\,\operatorname{sgn}(1 - 2z)\,\sqrt{\frac{1}{I^{-1}_{\left(1,\,(2z-1)\operatorname{sgn}(1-2z)\right)}\!\left(\frac{\alpha}{2}, \frac{1}{2}\right)} - 1} \tag{3}
\]

where $I$ is the generalized regularized incomplete Beta function $I_{(z_0,z_1)}(a,b) = \frac{B_{(z_0,z_1)}(a,b)}{B(a,b)}$, and $B_z(a,b)$ the incomplete Beta function $B_z(a,b) \equiv \int_0^z t^{a-1}(1-t)^{b-1}\,dt$. $B(a,b)$ is the Euler Beta function $B(a,b) \equiv \Gamma(a)\,\Gamma(b)/\Gamma(a+b) = \int_0^1 t^{a-1}(1-t)^{b-1}\,dt$.
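Eq. (3) can be sanity-checked numerically (a sketch assuming SciPy is available; α, σ, μ, z are illustrative values). For $z < 1/2$, $\operatorname{sgn}(1-2z) = +1$ and the generalized inverse reduces to the ordinary inverse regularized incomplete Beta evaluated at $2z$:

```python
# Verify Eq. (3), specialized to z < 1/2, against SciPy's inverse
# survival function for a scaled Student T. (Values are illustrative.)
from scipy.special import betaincinv
from scipy.stats import t
import math

alpha, sigma, mu = 3.0, 2.0, 0.0   # tail exponent, scale, location
z = 1e-4                           # tail probability, z < 1/2

w = betaincinv(alpha / 2, 0.5, 2 * z)   # inverse regularized incomplete Beta
threshold = mu + math.sqrt(alpha) * sigma * math.sqrt(1 / w - 1)

reference = t.isf(z, df=alpha, loc=mu, scale=sigma)   # SciPy's own inverse
print(threshold, reference)   # the two agree
```

The closed form and SciPy's `isf` coincide, confirming that (3) is the Student T counterpart of the Gaussian threshold in (1).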


Figure 2: Power Law, $\sigma = \{1, 2, 3, 4\}$ ($g^{-1}$ against $z$).

As we can see in Figure 2, the explosion is in the tails of the noise, and of the noise only.
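The contrast behind Figure 2 can be reproduced in a few lines (a sketch assuming SciPy; $\alpha = 2$ and the $z$ values are illustrative): the Gaussian threshold grows roughly like $\sqrt{\ln(1/z)}$, while the Student T threshold grows like a power of $1/z$.

```python
# Compare Gaussian and fat-tailed (Student T, alpha = 2) inverse survival
# functions as the covered quantile z shrinks. (Values are illustrative.)
from statistics import NormalDist
from scipy.stats import t

gauss = NormalDist()                      # mu = 0, sigma = 1
for z in (1e-2, 1e-4, 1e-6):
    g = gauss.inv_cdf(1 - z)              # Gaussian inverse survival
    p = t.isf(z, df=2)                    # fat-tailed inverse survival
    print(f"z={z:.0e}  gaussian={g:6.2f}  student_t={p:10.1f}")
```

At $z = 10^{-6}$ the Gaussian cutoff is still under 5σ while the $\alpha = 2$ cutoff is in the hundreds: under fat tails, shrinking $z$ makes the noise, and only the noise, explode.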

Part 2 of the discussion to come soon.
