

arXiv:physics/0507022v1 [physics.soc-ph] 4 Jul 2005

APPLICATIONS OF PHYSICS TO FINANCE AND ECONOMICS:

RETURNS, TRADING ACTIVITY AND INCOME∗

A. Christian Silva†

Department of Physics, University of Maryland, College Park, MD, 20742

Abstract: This dissertation reports work in which physics methods are applied to financial and economic problems. Some material in this thesis is based on 3 published papers [1, 2, 3], which divide this study into two parts. The first part studies stock market data (chapters 1 to 5). The second part is devoted to personal income in the USA (chapter 6).

We first study the probability distribution of stock returns at mesoscopic time lags (return horizons) ranging from about an hour to about a month. While at shorter microscopic time lags the distribution has power-law tails, for mesoscopic times the bulk of the distribution (more than 99% of the probability) follows an exponential law. The slope of the exponential function is determined by the variance of returns, which increases proportionally to the time lag. At longer times, the exponential law continuously evolves into a Gaussian distribution. The exponential-to-Gaussian crossover is well described by the analytical solution of the Heston model with stochastic volatility.

After characterizing the stock returns at mesoscopic time lags, we study the subordination hypothesis with one year of intraday data. We verify that the integrated volatility $V_t$ constructed from the number-of-trades process can be used as a subordinator for a driftless Brownian motion. This subordination is able to describe ≈ 85% of the stock returns for intraday time lags that start at ≈ 1 hour but are shorter than one day (the upper time limit is restricted by the short data span of one year). We also show that the Heston model can be constructed by subordinating a Brownian motion with the CIR process. Finally, we show that the CIR process describes the empirical $V_t$ process well enough that the corresponding Heston model is able to describe the log-returns $x_t$ process with approximately the maximum quality that the subordination allows (80%–85%).

In the second part, we study the time evolution of the personal income distribution. We find that the personal income distribution in the USA has a well-defined two-income-class structure. The majority of the population (97–99%) belongs to the lower income class, characterized by the exponential Boltzmann-Gibbs (“thermal”) distribution, whereas the higher income class (1–3% of the population) has a Pareto power-law (“superthermal”) distribution. By analyzing income data for 1983–2001, we show that the “thermal” part is stationary in time, save for a gradual increase of the effective temperature, whereas the “superthermal” tail swells and shrinks following the stock market. We discuss the concept of equilibrium inequality in a society, based on the principle of maximal entropy, and quantitatively show that it applies to the majority of the population.

∗This document is a reformatted version of my PhD thesis. Dissertation committee: Professor Theodore L. Einstein, Professor Steve L. Heston, Professor Dilip B. Madan, Professor Rajarshi Roy, Professor Victor M. Yakovenko (Chair/Advisor).
†Email: [email protected]

Contents

Acknowledgements

I. Introduction
   A. Stock returns
   B. Outline of the dissertation

II. Heston model for asset returns
   A. Heston model-SDE and symmetrization
      1. Short and long time limits of the Heston model
   B. Heston model and subordination

III. General characteristics of the data and methods

IV. Mesoscopic returns
   A. Data analysis and discussion
   B. Conclusions

V. Number of trades and subordination
   A. Discrete nature of stock returns
   B. Verifying subordination with intraday data
   C. Models for the subordinator
   D. Conclusion

VI. Income distribution
   A. Data analysis and discussion

References

Acknowledgements

I want to thank Professor Victor M. Yakovenko for all his help through the 3 years I have spent with him working on different projects. His assistance was vital in finishing my PhD. I also thank him for the financial support he provided. I thank Professor Richard Prange for long discussions in which I learned a lot about the buy side of finance. His critical questioning was essential in developing my work and in teaching me practical option pricing concepts.

I thank Professors Theodore L. Einstein, Steve L. Heston, Dilip B. Madan and Rajarshi Roy for accepting my invitation to serve on my dissertation committee. Their questions and comments were insightful and essential.

Through the nearly 6 years I spent at UMD, I have met incredible people whose generosity and knowledge were fundamental in developing my technical and personal skills. One such person is Professor Dilip Madan. Now that I come to think about it, Professor Madan was one of the first professors that I met at UMD. No wonder he kept asking me when I was going to graduate! Professor Madan is an incredible teacher with an incredibly deep understanding of finance and math finance. Professor Madan took me into his math finance group without reservations, and for such generosity I am forever thankful. Many thanks also to the members of the ever-increasing math finance group, whose weekly meetings under Professor Madan and Professor Fu were a lot of fun. In particular I have to thank Samvit Prakash, who has one of the most positive personalities around me. Well, let's just say that Samvit believed in me when I myself did not. I thank for a variety of discussions and a fruitful interaction: George Panayotov, Huaqiang Ma, Qing Xia, Ju-Yi J Yen and Sunhee Kim. I thank Bing Zhang for long discussions on programming and the Q-P trade.

Before I worked with Professor Yakovenko, I had the incredible privilege to work in the highly active nonlinear optics laboratory under the guidance of Professor Rajarshi Roy. In my 2 years of work in the nonlinear optics lab I learned experimental optics and, as strange as it might sound, I actually learned to pick up the phone and call people! It turns out that this is one of the most important skills one can have. I have to thank Raj for the opportunity of working with him. I thank him for trying to teach me his insightful and positive approach to life and to research. As in the math finance group, I made a lot of friends in the nonlinear optics lab. I would like to thank them for this friendship and for teaching me different things in optics, from stripping an optical fiber to how to better do a computation. In particular I thank David DeShazer, Wing-Shun Lam, Ryan McAllister, Min-Young Kim and Elizabeth Rogers. I also thank Bhaskar Khubchandani and Dr. Parvez Guzdar for the close interaction that resulted in a nice paper.

I also thank some of the best teachers I had. Their dedication and skill have been inspiring, especially because I kept bothering them and they had the patience to answer my confusing questions! I thank Professors Steve Heston, Jack Semura, Pavel Smejtek and P. T. Leung.

Finally, I thank my family for their patience and support. Here I also need to thank Samir Garzon and Norio Nakagaito. Samir helped us a lot and Norio, well, Norio is just incredible.

I. INTRODUCTION

The interest of physicists in interdisciplinary research has been constantly growing, and the area of what is today named socio-economical physics is 10 years old [4]. This new area in physics started as an exercise in statistical mechanics, where complex behavior arises from relatively simple rules due to the interaction of a large number of components.

The pioneering work in the modern stream of economical physics was initiated by Mantegna [5] and Li [6] in the early nineties, followed most notably by Mantegna and Stanley [7] and thereafter by a stream of papers [8] that attempt to identify and characterize universal and non-universal features in economical data in general. This statistical mechanical frame of mind arises in direct analogy with the statistical mechanics of phase transitions, where materials that are different in nature (such as a ferromagnet and a liquid) can belong to the same universality class due to their behavior near the critical point (the point at which the phase changes abruptly, say from liquid to solid in water, for instance). These universality classes are identified by critical exponents for quantities that diverge at the critical point, for instance the specific heat $C \approx \epsilon^{-\alpha}$, where $\epsilon$ is the reduced temperature and $\alpha$ the critical exponent [9]. Therefore, the area of economical physics has grown from, and is still in great part concerned with, “power-law tails” with universal exponents. This constitutes the empirical stream of socio-economical physics, where modelling and characterizing the empirical data with methods and tools borrowed from traditional physical problems is attempted [15, 16, 17, 18].

Soon after Mantegna and Li initiated the modern empirical stream of economical physics, simulations appeared. Once again, as in the case of empirical work, these were based on fundamental statistical mechanical models such as the Ising model. This literature attempted to construct, from simple rules, complex behavior that could then mimic the market and explain the price formation mechanism [10, 11, 12, 13, 14].

This dissertation belongs to the empirical stream of socio-economical physics. We study here two distinct problems. First, we use daily and intraday stock data to describe the essential nature of the stochastic process of price returns at different time ranges. Second, we use yearly income data to study the time evolution of the distribution of income in the USA.

A. Stock returns

The study of stock returns has a long history dating back to Bachelier in 1900, who was the first to model stock dynamics with a Brownian motion [19]. He proposed that the absolute price change $\Delta S_t = S_T - S_{T-t}$, where $t$ is the return horizon, should follow a Gaussian random walk. The clear drawback of such a hypothesis is that the prices of stocks could become negative. It was apparently Renery [19, 20, 21] who introduced the geometric Brownian motion for the stock price by assuming that log-returns ($x_t = \ln(S_T) - \ln(S_{T-t}) \approx \Delta S_t/S_t$), and not absolute returns, should follow a Brownian motion. The geometric Brownian motion became popular and accepted as a mainstream idea with the work of Osborne [22] (see also [19] for historical notes) and Samuelson (cited in [19]).

It was not until the 1960s that the hypothesis of Gaussian random walks was challenged by Mandelbrot [23] and Fama [24, 25] with studies on daily cotton prices. Since then, Brownian motion has been consistently questioned for a variety of assets. Today, asset log-returns that follow Brownian motion for all return horizons $t$ are considered an exception.

In his pioneering work, Mandelbrot introduced the stable Lévy distribution as an alternative model for stock returns. This distribution has the drawback that it can have infinite variance. Besides the unwanted mathematical properties of such a process, it was not grounded in economic reasoning. In 1973 Clark [26] proposed, as an alternative to Mandelbrot’s model, to use subordination [27] to construct the distribution of asset returns. Subordination has a direct financial interpretation: it can be linked with the arrival of financial information. Clark suggested that prices react to financial information and that, if this information is taken into account, the Gaussian random walk is recovered. He showed that the information arrival can be captured by the volume of trades and that returns taken conditional on the volume should be Gaussian.

Note that, in fact, Mandelbrot and Clark do not contradict each other, as Clark first implied. Mandelbrot’s Lévy stable distribution can also be constructed by subordination, if one chooses the right subordinator for the Brownian motion. Therefore, if one accepts the subordination hypothesis, the problem is reduced to finding the right subordinator.

In physics, the concept of subordination can be found in the construction of non-Shannon entropies, in the limit of the continuous-time random walk, in interface growth models and in other statistical mechanical problems [28, 29, 30, 31, 32]. The mathematical-“physical” idea of subordination is that, if the stochastic process is analyzed in the correct reference frame, it will always look like a simple Gaussian diffusion. But since we are dealing with stochastic processes, the reference frame is moving randomly as well, in just such a way that the observed process is described by Brownian motion. For further mathematical development of subordination, see section V C.

After Clark, the concept of subordination has been extensively used to construct asset return models [33, 34, 35, 36]. Most recently, a series of studies have used high-frequency data to verify Clark’s subordination hypothesis by assuming that either the volume [38, 39] or the trading activity (number of trades) [40, 41] is responsible for price changes. Strong evidence is found for both; nonetheless, the number of trades appears better suited, since it has been extensively tested for a large number of companies [41].

Contemporary to Clark, a series of empirical studies indicated that the variance (variance = volatility$^2$) of stock returns is not constant (see [43] and references therein). This resulted in models for stock returns, such as Engle’s ARCH and Bollerslev’s GARCH, that attempted to account for the changing variance in asset returns by modelling both in a discrete framework [44]. At the same time, models with stochastic volatility were introduced. These models generally assume a mean-reverting continuous stochastic differential equation for the volatility [45, 46, 48, 67]. Notice that stochastic volatility models, GARCH and subordination are not entirely orthogonal to each other. Stochastic volatility models can also be constructed by subordination [37] (see also section V C) or as limits of discrete GARCH-type models [47].

In 1993 Heston [48] introduced an exactly solvable stochastic volatility model that is also a limit process for the GARCH(1,1) model [47]. The Heston model became widely used for option pricing and in the study of asset returns. We use a modified version of the Heston model, as developed in Ref. [49], to describe the general shape of the probability density function (PDF) for the log-returns and the time evolution of this PDF.

B. Outline of the dissertation

The outline of this thesis is as follows. In chapter II, we introduce the Heston model for stock returns as developed in Refs. [2, 49]. We summarize the procedure for finding the closed-form solution of the probability distribution for the log-returns, starting from the correlated stochastic differential equations as given in Ref. [49]. We also introduce subordination and show how to construct the Heston model using a Cox-Ingersoll-Ross (CIR) subordinator [71].

In chapter III, we present the data we use in this thesis. We show the typical features of the stock data and how we constructed such data.

In chapter IV, we study the time evolution of the empirical distribution function (EDF) for the stock returns at mesoscopic time lags $t$ (1 hour $< t <$ 20 days). We show that in the short-time limit $t \ll 1/\gamma$ the EDF progressively tends to the double exponential distribution, and in the long-time limit $t \gg 1/\gamma$ the EDF progressively tends towards a Gaussian, where $1/\gamma$ is the characteristic time for such limits. Furthermore, we show that the Heston model introduced in chapter II presents these fundamental features.

In chapter V, we study the hypothesis of subordination. We start by pointing out the effect of the discrete nature of absolute price changes on the log-returns. Thereafter, we verify the subordination hypothesis using both tick-by-tick data (this data records all trades in a given day, see chapter III) and 5-minute log-returns and number-of-trades (ticks) data. We find that if we use the integrated variance ($V_t$), which is proportional to the number of trades ($N_t$), as our subordinator, we are able to explain approximately the central 85% of the probability distribution for the log-returns $x_t$ between 1 hour and 1 day. Finally, we show the quality of modelling the subordinator $V_t$ with the CIR process introduced in section V C and discuss the implication of such a model for the log-returns $x_t$.

The last chapter of this thesis presents work on the time evolution of the distribution of income. We show the evolution of the distribution of personal income in the United States from 1983 to 2001. We show that the bulk of the distribution (excluding very small and very large incomes) is described by the exponential distribution, with the average income changing from year to year at approximately the same rate as inflation. We conclude that the inflation-discounted income of the majority of the population is approximately the same throughout time and is therefore well approximated by a system in thermal equilibrium. We also show that the top 3% of earners have income that changes over time even when inflation is accounted for. This chapter is self-contained and does not require any other part of the thesis to be read.

II. HESTON MODEL FOR ASSET RETURNS

The Heston model was introduced by Heston [48] and belongs to the class of stochastic volatility models, which have received a great deal of attention in the financial literature, especially in connection with option pricing [45].

Empirical verification of the Heston model was done for both stocks [1, 2, 49, 63, 64] and options [46, 65, 66, 67], and good agreement with the data has been found in these studies. The version of the Heston model for stock returns used in [1, 2, 49], as well as in this thesis, was modified from the original solution by Heston and has evolved into a different formula with 3 parameters: one for the variance ($\theta$), one representing the characteristic relaxation time to the Gaussian distribution ($1/\gamma$) and one that gives the general shape of the curve ($\alpha$).

The outline of this chapter is as follows. First, we present the modified Heston model used in this work by showing how it follows from solving the related stochastic differential equations (SDE). Thereafter, we introduce subordination and show how the modified Heston model can be developed through subordination.

A. Heston model-SDE and symmetrization

The formal way of presenting the Heston model is given by two stochastic differential equations (SDE), one for the stock price $S_t$ and another for the variance $v_t$:

$$dS_t = \mu S_t\,dt + \sigma_t S_t\,dW_t^{(1)}, \tag{1}$$

$$dv_t = -\gamma(v_t - \theta)\,dt + \kappa\sqrt{v_t}\,dW_t^{(2)}, \tag{2}$$

where the subscript $t$ indicates time dependence, $\mu$ is the drift parameter, $W_t^{(1)}$ and $W_t^{(2)}$ are standard random Wiener processes, $\sigma_t$ is the time-dependent volatility and $v_t = \sigma_t^2$ is the variance. In general, the Wiener process in (2) may be correlated with the Wiener process in (1):

$$dW_t^{(2)} = \rho\,dW_t^{(1)} + \sqrt{1-\rho^2}\,dZ_t, \tag{3}$$

where $Z_t$ is a Wiener process independent of $W_t^{(1)}$, and $\rho \in [-1, 1]$ is the correlation coefficient. Note that (1) and (2) are well known in finance. They represent, respectively, the log-normal geometric Brownian motion stock process introduced by Renery, Osborne and Samuelson [19] (used by Black-Merton-Scholes (BMS) [68, 70] for option pricing; see Ref. [69] for a practical application of BMS to physics) and the Cox-Ingersoll-Ross (CIR) mean-reverting SDE first introduced for interest rate models [71, 72].
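As a rough illustration of how (1)-(3) generate a price path, the following is a minimal Euler-type sketch in Python. The function name `simulate_heston`, its argument list and the truncation of negative variance values are assumptions made for this sketch; they are not taken from the thesis, which works with the analytical solution instead.

```python
import math
import random


def simulate_heston(s0, mu, gamma, theta, kappa, rho, v0, t_end, n_steps, rng):
    """Simulate one discretized path of Eqs. (1)-(3); return (S_T, v_T).

    Hypothetical helper: a plain Euler scheme, not the method of the thesis.
    """
    dt = t_end / n_steps
    s, v = s0, v0
    for _ in range(n_steps):
        # Independent Gaussian increments for W^(1) and Z.
        dw1 = math.sqrt(dt) * rng.gauss(0.0, 1.0)
        dz = math.sqrt(dt) * rng.gauss(0.0, 1.0)
        # Eq. (3): correlated increment for W^(2).
        dw2 = rho * dw1 + math.sqrt(1.0 - rho * rho) * dz
        # Assumption: negative v is truncated to zero inside the coefficients,
        # so that sqrt(v) stays real in the discrete scheme.
        v_pos = max(v, 0.0)
        s += mu * s * dt + math.sqrt(v_pos) * s * dw1                          # Eq. (1)
        v += -gamma * (v_pos - theta) * dt + kappa * math.sqrt(v_pos) * dw2   # Eq. (2)
    return s, v
```

Called repeatedly with `v0 = theta` and a seeded `random.Random`, the sample mean of the terminal variance should stay close to $\theta$ and the sample mean of the terminal price close to $S_0 e^{\mu T}$, which gives a quick sanity check of the discretization.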

In order to solve (1) and (2) together with (3), we first change variables from the stock price $S_t$ to the mean-removed (demeaned) log-return $x_t = \ln(S_t/S_0) - \mu t$, which obeys SDE (4). All further results and solutions are constructed for the demeaned log-return $x_t$, which we will simply refer to as log-return or return:

$$dx_t = -\frac{v_t}{2}\,dt + \sqrt{v_t}\,dW_t^{(1)}. \tag{4}$$
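The step from (1) to (4) is an application of Itô's lemma to $\ln S_t$. A short worked version, using only the definitions above and the rule $(dW_t^{(1)})^2 = dt$, is:

```latex
\begin{aligned}
d\ln S_t &= \frac{dS_t}{S_t} - \frac{1}{2}\,\frac{(dS_t)^2}{S_t^2}
          = \mu\,dt + \sigma_t\,dW_t^{(1)} - \frac{\sigma_t^2}{2}\,dt, \\
dx_t &= d\ln S_t - \mu\,dt
      = -\frac{v_t}{2}\,dt + \sqrt{v_t}\,dW_t^{(1)}, \qquad v_t = \sigma_t^2 .
\end{aligned}
```

The drift $-v_t/2$ is therefore not an extra assumption; it is the Itô correction that appears when passing from the price to its logarithm.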

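After performing the change of variables from price to return, we solve the Fokker-Planck equation (5) [62] implied by SDEs (2) and (4) for the transition probability $P_t(x, v\,|\,v_i)$ to find the return $x$ and the variance $v$ at time $t$, given the initial demeaned log-return $x = 0$ and variance $v_i$ at $t = 0$. For simplicity, we drop the explicit time dependence notation for the returns $x_t$ and call them $x$.

$$\frac{\partial}{\partial t}P = \gamma\frac{\partial}{\partial v}\left[(v-\theta)P\right] + \frac{1}{2}\frac{\partial}{\partial x}(vP) + \rho\kappa\frac{\partial^2}{\partial x\,\partial v}(vP) + \frac{1}{2}\frac{\partial^2}{\partial x^2}(vP) + \frac{\kappa^2}{2}\frac{\partial^2}{\partial v^2}(vP). \tag{5}$$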

The general analytical solution of (5) for $P_t(x, v\,|\,v_i)$ with initial condition $P_{t=0}(x, v\,|\,v_i) = \delta(x)\,\delta(v - v_i)$ can be found by taking a Fourier transform $x \to p_x$ and a Laplace transform $v \to p_v$ (see [49] for details). The marginal distribution of $x$ is then

$$P_t(x\,|\,v_i) = \int_0^{+\infty} dv\,P_t(x, v\,|\,v_i) = \int \frac{dp_x}{2\pi}\,e^{ip_x x}\,P_{t,p_x}(0\,|\,v_i), \tag{6}$$

where the hidden variable $v$ is integrated out, so $p_v = 0$. Therefore we have

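$$P_t(x\,|\,v_i) = \int_{-\infty}^{+\infty} \frac{dp_x}{2\pi}\,\exp\!\left[ip_x x - v_i\,\frac{p_x^2 - ip_x}{\Gamma + \Omega\coth(\Omega t/2)}\right] \times \exp\!\left[-\frac{2\gamma\theta}{\kappa^2}\ln\!\left(\cosh\frac{\Omega t}{2} + \frac{\Gamma}{\Omega}\sinh\frac{\Omega t}{2}\right) + \frac{\gamma\Gamma\theta t}{\kappa^2}\right], \tag{7}$$

where

$$\Gamma = \gamma + i\rho\kappa p_x \tag{8}$$

and

$$\Omega = \sqrt{\Gamma^2 + \kappa^2(p_x^2 - ip_x)}. \tag{9}$$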

The marginal probability density $P_t(x\,|\,v_i)$ could then be compared to empirical stock returns directly. Nevertheless, $v_i$ would have to be treated as an extra parameter. In order to avoid this, we assume that $v_i$ has the stationary distribution of the CIR stochastic differential equation (2), $\Pi_*(v)$,

$$\Pi_*(v) = \frac{\alpha^\alpha}{\Gamma(\alpha)}\,\frac{v^{\alpha-1}}{\theta^\alpha}\,e^{-\alpha v/\theta}, \qquad \alpha = \frac{2\gamma\theta}{\kappa^2}. \tag{10}$$

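Using equation (10) we arrive at the probability distribution of the demeaned log-returns $P_t(x)$,

$$P_t(x) = \int_0^\infty dv_i\,\Pi_*(v_i)\,P_t(x\,|\,v_i), \tag{11}$$

where the final solution is

$$P_t(x) = \frac{1}{2\pi}\int_{-\infty}^{+\infty} dp_x\,e^{ip_x x + F_t(p_x)} \tag{12}$$

with

$$F_t(p_x) = \frac{\gamma\theta}{\kappa^2}\,\Gamma t - \frac{2\gamma\theta}{\kappa^2}\ln\!\left[\cosh\frac{\Omega t}{2} + \frac{\Omega^2 - \Gamma^2 + 2\gamma\Gamma}{2\gamma\Omega}\sinh\frac{\Omega t}{2}\right], \tag{13}$$

where, as before,

$$\Gamma = \gamma + i\rho\kappa p_x \tag{14}$$

and

$$\Omega = \sqrt{\Gamma^2 + \kappa^2(p_x^2 - ip_x)}. \tag{15}$$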

The operation of removing the initial volatility dependence of the marginal probability density $P_t(x\,|\,v_i)$ using equation (11) was first introduced in Ref. [49]. This removes an additional degree of freedom and therefore simplifies the final marginal probability density.

In order to further simplify the original Heston model, we assume that equations (1) and (2) are uncorrelated. This amounts to taking $\rho = 0$ in expression (13). This approximation was shown to be acceptable for some companies and indexes in the US market [1, 2, 49] but might not be good for different markets [64] or for option pricing [45, 48].

In order to arrive at the probability density function used in this work, we need to further simplify the equation for $P_t(x, \rho = 0)$ (12) into a zero-skew symmetric function.

We replace in (12) $p_x \to p_x + i/2$ and $\rho = 0$ to find

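$$P_t(x) = e^{-x/2}\int_{-\infty}^{+\infty}\frac{dp_x}{2\pi}\,e^{ip_x x + F_t(p_x)}, \tag{16}$$

where $\alpha = 2\gamma\theta/\kappa^2$,

$$F_t(p_x) = \frac{\alpha\gamma t}{2} - \alpha\ln\!\left[\cosh\frac{\Omega t}{2} + \frac{\Omega^2 + \gamma^2}{2\gamma\Omega}\sinh\frac{\Omega t}{2}\right], \tag{17}$$

and

$$\Omega = \sqrt{\gamma^2 + \kappa^2(p_x^2 + 1/4)} \approx \gamma\sqrt{1 + p_x^2(\kappa^2/\gamma^2)}. \tag{18}$$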

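Finally, we drop the $e^{-x/2}$ term in (16). Notice that both approximations, $e^{-x/2} \approx 1$ and $p_x^2 + 1/4 \approx p_x^2$, are needed to produce a new characteristic function $e^{F_t(p_x)}$ that correctly goes to unity when $p_x = 0$. The final functional form for $P_t(x)$ is

$$P_t(x) = \int_{-\infty}^{+\infty}\frac{dp_x}{2\pi}\,e^{ip_x x + F_{\tilde t}(p_x)}, \tag{19}$$

$$F_{\tilde t}(p_x) = \frac{\alpha\tilde t}{2} - \alpha\ln\!\left[\cosh\frac{\Omega\tilde t}{2} + \frac{\Omega^2 + 1}{2\Omega}\sinh\frac{\Omega\tilde t}{2}\right], \tag{20}$$

$$\tilde t = \gamma t, \qquad \alpha = 2\gamma\theta/\kappa^2, \qquad \Omega = \sqrt{1 + (p_x\kappa/\gamma)^2}, \qquad \sigma_t^2 \equiv \langle x^2\rangle = \theta t. \tag{21}$$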

We have expressed the original Heston model for the probability density of log-returns $x$ in a highly symmetrical form with three parameters, $\theta$, $\alpha$ and $\gamma$. The parameter $\theta$ can be found by calculating the variance of the demeaned log-returns, $\sigma_t^2 \equiv \langle x_t^2\rangle = \theta t$ (21), of $P_t(x)$ (19). The remaining two parameters, $\alpha$ and $\gamma$, are responsible for the general shape of the curve and the relaxation rate of $P_t(x)$ to a Gaussian distribution [2, 49]. The parameter $\alpha$ also determines the analyticity at zero return. If $\alpha = 1$, the value used in this thesis, the short-time limit is a double exponential distribution (see next subsection). This distribution is not analytic at zero but becomes analytic as time progresses. For $\alpha > 1$ the distribution is always analytic, with a center that is Gaussian, and when $\alpha < 1$ the distribution starts non-analytic at zero (going to zero as a power law with exponent $2\alpha - 1$ [49]) and then evolves into an analytic distribution with a Gaussian center.
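To make the role of the three parameters concrete, here is a minimal numerical sketch of (19)-(21) in Python. The function names `heston_log_cf` and `heston_pdf`, the trapezoidal rule and the cutoff `p_max` are choices made for this illustration only, not the procedure of the thesis; $\kappa$ is recovered from $\alpha = 2\gamma\theta/\kappa^2$.

```python
import math


def heston_log_cf(p, t, gamma, theta, alpha):
    """F of Eq. (20) at wave number p and time lag t (rho = 0, symmetrized)."""
    kappa = math.sqrt(2.0 * gamma * theta / alpha)   # from alpha = 2*gamma*theta/kappa^2
    tt = gamma * t                                   # dimensionless time of Eq. (21)
    omega = math.sqrt(1.0 + (p * kappa / gamma) ** 2)
    a = 0.5 * omega * tt
    c = (omega * omega + 1.0) / (2.0 * omega)
    e = math.exp(-2.0 * a)
    # ln[cosh a + c sinh a] rewritten as a + ln(...) to avoid overflow at large a.
    log_term = a + math.log(0.5 * (1.0 + e) + 0.5 * c * (1.0 - e))
    return 0.5 * alpha * tt - alpha * log_term


def heston_pdf(x, t, gamma, theta, alpha, p_max, n=4000):
    """Eq. (19) by the trapezoidal rule on [0, p_max].

    F is even in p, so the Fourier integral reduces to a cosine integral.
    p_max must be large enough for exp(F) to be negligible beyond it.
    """
    h = p_max / n
    total = 0.0
    for k in range(n + 1):
        p = k * h
        weight = 0.5 if k in (0, n) else 1.0
        total += weight * math.cos(p * x) * math.exp(heston_log_cf(p, t, gamma, theta, alpha))
    return total * h / math.pi
```

For instance, `heston_pdf(0.0, 50.0, 1.0, 1.0, 1.0, p_max=3.0)` can be compared with the Gaussian value `1 / math.sqrt(2 * math.pi * theta * t)`, and the curvature of `heston_log_cf` at small `p` can be compared with $-\theta t\,p^2/2$ to check the variance relation in (21).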

Notice that the average of the log-returns $x$ from equation (19) is $\langle x\rangle = 0$. This average is not consistent with SDE (4), but with the simplified $dx_t = \sqrt{v_t}\,dW_t^{(1)}$, where the drift term $v_t/2$ is set to zero. Therefore, $x$ in equation (19) only approximately represents the demeaned log-returns $x = \ln(S_t/S_0) - \mu t$. This difference arises because we took $e^{-x/2} \approx 1$ and $p_x^2 + 1/4 \approx p_x^2$ in equation (17) in order to derive equation (20).

The log-returns $x$ in equation (19) can be exactly given by $x = \ln(S_t/S_0) - \mu t - \omega(t)$, where the extra term, $\omega(t)$, removes the non-zero average of $x = \ln(S_t/S_0) - \mu t$.

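The extra term $\omega(t)$ arises because the average of the stock price at time $t$ needs to be given by $\mu$ only. Hence

$$\langle S_t\rangle = S_0\,e^{\mu t}\langle e^{Y_t}\rangle, \qquad \langle e^{Y_t}\rangle \equiv 1, \tag{22}$$

where $Y_t$ is the stochastic process defined by

$$\begin{aligned}
S_t &= S_0\,\frac{e^{\mu t + X_t}}{\langle e^{X_t}\rangle} = S_0\,e^{\mu t - \ln\langle e^{X_t}\rangle + X_t} \;\Rightarrow\; \omega(t) = -\ln\langle e^{X_t}\rangle, \\
x_t &= \ln(S_t) - \ln(S_0) - \mu t = X_t + \omega(t) \;\Rightarrow\; Y_t = X_t + \omega(t).
\end{aligned} \tag{23}$$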

Empirically, the correction represented by $\omega(t)$, or by working with equation (16) instead of equation (19), is small, and it can be safely neglected. We choose to work with $x = \ln(S_t/S_0) - \mu t - \omega(t)$, and we call $x$ in (19) the log-return.


1. Short and long time limits of the Heston model

The short time lag limit of the modified Heston model (19) can be found by assuming $\Omega t \ll 2$ in expression (7). We also take $\rho = 0$ and $ip_x \to 0$, since we are interested in the short-time limit of the symmetric modified Heston model of equation (19). When taking the limit $\Omega t \ll 2$ in (7), the resulting PDF is the inverse Fourier transform of the characteristic function of a Gaussian with random variance $v_i$ and zero drift. Since $v_i$ is a Gamma random variable with distribution (10), the final characteristic function for the short-time-limit distribution of the modified Heston model is

$$P_t(p_x) = \int_0^\infty dv_i\,e^{-v_i p_x^2 t/2}\,\Pi_*(v_i) = \left(1 + \frac{\theta t\,p_x^2}{2\alpha}\right)^{-\alpha}. \tag{24}$$
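The Gamma average in (24) can be checked numerically. The sketch below, with hypothetical helper names `gamma_pdf`, `averaged_gaussian_cf` and `closed_form_cf`, integrates the Gaussian characteristic function $e^{-v p_x^2 t/2}$ against the stationary density (10) with a simple midpoint rule and compares it with the closed form; the cutoff `v_max` and the step count are assumptions of this illustration.

```python
import math


def gamma_pdf(v, theta, alpha):
    """Stationary variance density of Eq. (10): Gamma law with shape alpha and mean theta."""
    norm = alpha**alpha / (math.gamma(alpha) * theta**alpha)
    return norm * v ** (alpha - 1.0) * math.exp(-alpha * v / theta)


def averaged_gaussian_cf(p, t, theta, alpha, v_max, n=20000):
    """Midpoint-rule estimate of the integral over v in Eq. (24)."""
    h = v_max / n
    total = 0.0
    for k in range(n):
        v = (k + 0.5) * h            # midpoints avoid evaluating at v = 0
        total += math.exp(-0.5 * v * p * p * t) * gamma_pdf(v, theta, alpha)
    return total * h


def closed_form_cf(p, t, theta, alpha):
    """Right-hand side of Eq. (24)."""
    return (1.0 + theta * t * p * p / (2.0 * alpha)) ** (-alpha)
```

The midpoint rule is adequate here for $\alpha \ge 1$, where the integrand is bounded at $v = 0$; for $\alpha < 1$ the integrable singularity at the origin would call for a more careful quadrature.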

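The probability distribution can be found analytically [49] as

$$P_t(x) = \frac{2^{1-\alpha}}{\Gamma(\alpha)}\sqrt{\frac{\alpha}{\pi\theta t}}\;y^{\alpha-1/2}\,K_{\alpha-1/2}(y), \tag{25}$$

where $K$ is the modified Bessel function and

$$y = \sqrt{\frac{2\alpha x^2}{\theta t}}. \tag{26}$$

For $\alpha = 1$, we recover the Laplace (symmetrical double exponential) distribution

$$P_t(x) = \frac{e^{-y}}{\sqrt{2\theta t}}, \qquad y = \sqrt{\frac{2\alpha x^2}{\theta t}}. \tag{27}$$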

Notice that the short time limit is not a Gaussian with variance $v_i$ only because of the assumed randomization of $v_i$ (24). Therefore, this randomization has a substantial effect on the limiting distributions, which can be checked empirically [2] (empirical results will be presented in chapter IV).

The long time lag $t$ limit for the modified Heston model can be found by taking the limit $\Omega\tilde t \gg 2$ (with $\tilde t = \gamma t$) in the characteristic function (20). The resulting characteristic function is

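$$P_t(p_x) = \langle e^{ip_x x}\rangle = \exp\!\left[\frac{\alpha\gamma t}{2}\left(1 - \sqrt{1 + x_0^2\,p_x^2}\right)\right], \qquad x_0^2 = \kappa^2/\gamma^2. \tag{28}$$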
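The step from (20) to this long-time form can be made explicit. A short worked version, assuming only $\Omega\tilde t \gg 2$ with $\tilde t = \gamma t$ and the definitions in (20) and (21):

```latex
\cosh\frac{\Omega\tilde t}{2} \simeq \sinh\frac{\Omega\tilde t}{2} \simeq \tfrac{1}{2}\,e^{\Omega\tilde t/2}
\quad\Longrightarrow\quad
F_{\tilde t}(p_x) \simeq \frac{\alpha\tilde t}{2}\,(1 - \Omega) - \alpha\ln\frac{(1+\Omega)^2}{4\Omega},
\qquad \Omega = \sqrt{1 + (p_x\kappa/\gamma)^2}.
```

The logarithmic term does not grow with $t$ and vanishes at $p_x = 0$, so at long times it is small compared with the first term, which is the square-root form appearing in (28).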

The characteristic function in equation (28) is the characteristic function for the zero skew Normal Inverse Gaussian (NIG) model. NIG was first introduced by Barndorff-Nielsen to describe the distribution of sand particle sizes [73] and was subsequently used in other physical problems such as turbulence [74]. In 1995, Barndorff-Nielsen also introduced NIG for stock returns [35]. NIG can also be obtained as a limit of the Generalized Hyperbolic distribution [33, 75], as well as by subordinating a Brownian motion to the inverse Gaussian distribution [33] (the next section will introduce the idea of subordination).

NIG is part of the wide class of Levy pure jump models [33], and the fact that it is recovered as a limit of the simplified Heston stochastic volatility model (19) is another consequence of the randomization of v_i. Notice that if we take the long time limit before the randomization of v_i in the full Heston model given in Eq. (7), we will not find NIG as the long time limit.

The central limit theorem can be invoked for NIG and therefore for Heston [15, 27, 35, 49]. That is, as time progresses, the distribution P_t(x) of returns x will become increasingly Gaussian. The characteristic time scale for the central limit theorem to act is t_0 = 2/(αγ). For t ≫ t_0 the probability distribution is essentially Normal with mean zero and variance θt.
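The Gaussian limit and the time scale t_0 can be made explicit by expanding the exponent of Eq. (28) for small p_x. The worked step below is a sketch that assumes x_0^2 = κ^2/γ^2 (the combination that multiplies the Laplace variable inside Ω in Eq. (39)) and α = 2γθ/κ^2:

```latex
\frac{\alpha\gamma t}{2}\left(1-\sqrt{1+x_0^2 p_x^2}\right)
\approx -\frac{\alpha\gamma t}{4}\,x_0^2 p_x^2
= -\frac{1}{4}\,\frac{2\gamma\theta}{\kappa^2}\,\gamma t\,\frac{\kappa^2}{\gamma^2}\,p_x^2
= -\frac{\theta t\,p_x^2}{2},
\qquad x_0^2 p_x^2 \ll 1 .
```

This is the characteristic exponent of a Gaussian with zero mean and variance θt. The relevant values of p_x satisfy p_x^2 ~ 1/(θt), so the condition x_0^2 p_x^2 ≪ 1 becomes t ≫ x_0^2/θ = 2/(αγ) = t_0.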

Notice that for long time lags t, there are two characteristic time limits. Heston tends to NIG for times t ≫ 1/γ and then NIG tends to a Normal distribution for times t ≫ 1/(αγ). If α ≥ 1, the NIG and Heston regimes cannot be effectively distinguished. It is only in the case α < 1 that there will be a distinct NIG regime.

In summary, the most important limits for P_t(x) that we use in this study are: Exponential (if α = 1) at short time lags and Gaussian at long time lags,

$$P_t(x) \propto \begin{cases} \exp\left(-|x|\sqrt{2/\theta t}\right), & \tilde t = \gamma t \ll 1, \\ \exp\left(-x^2/2\theta t\right), & \tilde t = \gamma t \gg 1. \end{cases} \qquad (29)$$

B. Heston model and subordination

Subordination is a form of randomization in which one constructs a new probability distribution by assuming one or more parameters of the original probability distribution to be random [27],

$$P_{\rm New}(y, z) = \int_{-\infty}^{\infty} d\theta\, P(y, \theta)\, Q(\theta, z). \qquad (30)$$

In the case of subordination, a Markov process Y(N) is randomized by introducing a non-negative process N(t), called a randomized operational time. The resulting process Y(N(t)) does not need to be Markovian in general [27]. We restrict ourselves to subordination of a Brownian motion with drift θ and standard deviation σ (31). We also assume in what follows that t is the time lag in usual units of time, unless otherwise indicated. The probability density P_t(y) for the time-changed Brownian motion Y(N) can be written

$$P_t(y) = \int_0^\infty dN\, \frac{1}{\sqrt{2\pi\sigma^2 N}}\, \exp\left[-\frac{(y-\theta N)^2}{2\sigma^2 N}\right] P_t(N). \qquad (31)$$

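The moments of a Brownian subordinated process are related to the moments of the subordinator. If we use P_t(y) in (31), the first 4 moments can be calculated as

$$\langle y\rangle = \theta\langle N\rangle_N \qquad (32)$$

$$\langle (y-\langle y\rangle)^2\rangle = \sigma^2\langle N\rangle_N + \theta^2\langle (N-\langle N\rangle_N)^2\rangle_N \qquad (33)$$

$$\langle (y-\langle y\rangle)^3\rangle = 3\sigma^2\theta\langle (N-\langle N\rangle_N)^2\rangle_N + \theta^3\langle (N-\langle N\rangle_N)^3\rangle_N \qquad (34)$$

$$\langle (y-\langle y\rangle)^4\rangle = 3\sigma^4\left(\langle (N-\langle N\rangle_N)^2\rangle_N + \langle N\rangle_N^2\right) + 6\theta^2\sigma^2\left(\langle (N-\langle N\rangle_N)^3\rangle_N + \langle N\rangle_N\langle (N-\langle N\rangle_N)^2\rangle_N\right) + \theta^4\langle (N-\langle N\rangle_N)^4\rangle_N, \qquad (35)$$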

where ⟨⟩ refers to taking the expected value and ⟨⟩_N refers to taking the expected value with respect to N. The time t dependence of the moments of Y is given by the moments of the randomized operational time N. Furthermore, even though the subordinator has odd moments, odd moments in the resulting process Y are different from zero only if the Gaussian in equation (31) has a drift θ ≠ 0. For the present work, we assume that the odd moments are all zero, since the empirical probability distributions of log-returns are quite well described by zero-skew probability distributions and because we work with mean-zero returns [2]. Assuming a probability distribution with zero odd moments simplifies the even moments: the second and fourth moments of Y then depend only on the first and second moments of the subordinator N (33, 35).
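The moment relations (32) and (33) can be checked by direct simulation. The sketch below is illustrative only: it picks a Gamma-distributed operational time N (an arbitrary choice made here, not the subordinator used in this work), builds y = θN + σ√N Z with Z a standard normal variable, and compares sample moments with the formulas. All names and parameter values are assumptions of this example.

```python
import math
import random


def subordinated_mean_variance(mean_n, var_n, theta, sigma):
    """Mean and variance of y according to Eqs. (32) and (33)."""
    mean_y = theta * mean_n
    var_y = sigma ** 2 * mean_n + theta ** 2 * var_n
    return mean_y, var_y


def simulate_subordinated(n_samples, shape, scale, theta, sigma, seed=0):
    """Sample y = theta*N + sigma*sqrt(N)*Z with N ~ Gamma(shape, scale)."""
    rng = random.Random(seed)
    ys = []
    for _ in range(n_samples):
        n_op = rng.gammavariate(shape, scale)  # operational time, N >= 0
        ys.append(theta * n_op + sigma * math.sqrt(n_op) * rng.gauss(0.0, 1.0))
    return ys


if __name__ == "__main__":
    shape, scale, theta, sigma = 4.0, 0.5, 0.3, 1.0
    ys = simulate_subordinated(200_000, shape, scale, theta, sigma)
    mean = sum(ys) / len(ys)
    var = sum((y - mean) ** 2 for y in ys) / len(ys)
    # A Gamma variable has mean shape*scale and variance shape*scale**2.
    print(mean, var, subordinated_mean_variance(shape * scale, shape * scale ** 2, theta, sigma))
```

Setting theta = 0 in this sketch reproduces the driftless case used in the rest of the chapter, where the variance of y reduces to σ²⟨N⟩.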

In the case of the modified Heston model (19), the subordination takes the following form. We assume that the log-returns x follow a Brownian motion with zero drift and variance V_t. The variance V_t is our “random operational time”, since it changes randomly. We will show in chapter V that the variance V_t can be estimated (at least partially) using the number of trades N_t that occur in the time interval t. The variance V_t is then a constant times N_t, V_t = σ²N_t.

The variance V_t is given by V_t = ∫_0^t ds v_s, where the instantaneous variance v_t appearing in the SDE (2) is integrated over the interval 0 → t. For this reason, V_t is also known as the integrated variance. The Laplace transform for the conditional probability density P_t(V_t | v_i) is analytically known [33, 71]. Therefore, subordination becomes a useful tool to construct asset models with stochastic variance having the CIR process as a subordinator [37].

The Laplace transform of the subordinator of the modified Heston model (20) can be read off immediately,

$$P(p_x) = \langle e^{i p_x x}\rangle \;\Rightarrow\; P(p_x) = \int_0^\infty dV_t\, e^{-p_x^2 V_t/2}\, P(V_t), \qquad (36)$$

where the integral with respect to V_t defines a Laplace transform of the probability density P(V_t), for which the Laplace conjugated variable is evaluated at p_x²/2. Therefore we arrive at

$$P_t(V_t) = \int_0^{+\infty} dp_{V_t}\, e^{p_{V_t} x + F_t(p_{V_t})}, \qquad (37)$$

$$F_t(p_{V_t}) = \frac{\alpha\tilde t}{2} - \alpha \ln\left[\cosh\frac{\Omega\tilde t}{2} + \frac{\Omega^2 + 1}{2\Omega}\sinh\frac{\Omega\tilde t}{2}\right], \qquad (38)$$

$$\tilde t = \gamma t, \qquad \alpha = 2\gamma\theta/\kappa^2, \qquad \Omega = \sqrt{1 + 2(\kappa/\gamma)^2 p_{V_t}}. \qquad (39)$$

The only difference between the characteristic exponent (38) and the characteristic exponent for the Heston model (20) is in Ω, where p_{V_t} replaces p_x²/2 as the Laplace variable for V_t.

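The first and second moments for the integrated CIR process (38) are

$$\langle V_t\rangle = \theta t \qquad (40)$$

$$\langle (V_t - \langle V_t\rangle)^2\rangle = \frac{2\theta^2}{\alpha\gamma^2}\left(e^{-\gamma t} - 1 + \gamma t\right). \qquad (41)$$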

The time dependence of the variance (41) shows that the CIR process is not independent and identically distributed (IID). That is expected, since we have a mean-reverting SDE (2) for the instantaneous variance v_t with exponential relaxation to the mean [62, 71, 72].
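The two regimes hidden in Eq. (41) are easy to see numerically. The short sketch below (function names are illustrative) evaluates Eqs. (40) and (41) and compares the variance with its limiting forms: θ²t²/α for γt ≪ 1, which is the variance of a frozen Gamma-distributed v_i multiplied by t², and 2θ²t/(αγ) for γt ≫ 1, which grows linearly in t as for IID increments.

```python
import math


def integrated_variance_mean(theta, t):
    """<V_t> from Eq. (40)."""
    return theta * t


def integrated_variance_var(theta, alpha, gamma, t):
    """<(V_t - <V_t>)^2> from Eq. (41)."""
    g = gamma * t
    # expm1(-g) + g equals exp(-g) - 1 + g, written to limit round-off at small g.
    return 2.0 * theta ** 2 / (alpha * gamma ** 2) * (math.expm1(-g) + g)


if __name__ == "__main__":
    theta, alpha, gamma = 0.04, 1.0, 1.0  # illustrative values only
    for t in (1e-3, 1.0, 1e3):
        exact = integrated_variance_var(theta, alpha, gamma, t)
        short = theta ** 2 * t ** 2 / alpha
        long_ = 2.0 * theta ** 2 * t / (alpha * gamma)
        print(t, exact, short, long_)
```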

We have shown that subordinating a zero-drift Gaussian to the integrated V_t, given by equation (37), is equivalent to solving for the transition probability densities of the uncorrelated (ρ = 0 in equation (3)) system of SDEs (2),

$$dx_t = \sqrt{v_t}\, dW_t^{(1)}, \qquad dv_t = -\gamma(v_t - \theta)\,dt + \kappa\sqrt{v_t}\, dW_t^{(2)}.$$

However, it is not clear how to use subordination in order to produce a stochastic process that is equivalent to the correlated (ρ ≠ 0) system of SDEs [37].
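The equivalence for ρ = 0 can be illustrated with a small simulation: integrate the variance equation, accumulate V_t, and then draw x_t as a zero-mean Gaussian with variance V_t. The sketch below is not the procedure used in this work; the Euler discretization, the truncation of negative variances, the stationary Gamma initial condition and all parameter values are assumptions made for the example.

```python
import math
import random


def simulate_heston_uncorrelated(n_paths, t, gamma, theta, alpha, n_steps=100, seed=0):
    """Return (V_t samples, x_t samples) for the rho = 0 system of SDEs.

    v_t is advanced with an Euler scheme in which negative values are
    truncated to zero inside the drift and diffusion terms (a numerical
    choice of this sketch).
    """
    rng = random.Random(seed)
    kappa = math.sqrt(2.0 * gamma * theta / alpha)  # from alpha = 2*gamma*theta/kappa^2
    dt = t / n_steps
    integrated, returns = [], []
    for _path in range(n_paths):
        v = rng.gammavariate(alpha, theta / alpha)  # assumed stationary initial variance
        big_v = 0.0
        for _step in range(n_steps):
            v_pos = max(v, 0.0)
            big_v += v_pos * dt
            v += -gamma * (v_pos - theta) * dt + kappa * math.sqrt(v_pos * dt) * rng.gauss(0.0, 1.0)
        integrated.append(big_v)
        # Subordination step: given V_t, x_t is Gaussian with zero mean and variance V_t.
        returns.append(math.sqrt(big_v) * rng.gauss(0.0, 1.0))
    return integrated, returns


if __name__ == "__main__":
    big_vs, xs = simulate_heston_uncorrelated(5000, t=1.0, gamma=1.0, theta=0.04, alpha=2.0)
    print(sum(big_vs) / len(big_vs))        # should be close to theta*t, Eq. (40)
    print(sum(x * x for x in xs) / len(xs))  # variance of x_t, also close to theta*t
```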

III. GENERAL CHARACTERISTICS OF THE DATA AND METHODS

We use 2 databases for this study. Daily closing prices are downloaded from Yahoo [50] and intraday data is constructed using the TAQ database from the NYSE [51]. The TAQ database records every transaction that occurred in the market (tick-by-tick data), where the average number of transactions in a day for a highly traded stock, such as Intel, is 20000 (from 1993 to 2001). One day of tick data is therefore equivalent, in terms of data quantity, to approximately 77 years of daily data.

Our data has the time that the transaction occurred, the price at which the transaction was realized and the volume of the transaction (number of shares that exchanged hands). The TAQ database does not account for splits or dividends, whereas Yahoo gives the prices corrected for splits and dividends. However, we do not need to correct for splits and dividends, because the TAQ database is used only when constructing intraday returns. The splits and dividends are realized overnight and therefore will not show up if we calculate intraday returns.

After downloading the TAQ data, we remove any trade that is recorded as an error and also restrict the data to trades that took place inside the conventional 6.5 hour trading day from 9:30 AM to 4:00 PM. Any trade that happens before 9:30 AM or after 4:00 PM is ignored. We choose to restrict to business hours because we want our data set to agree with Yahoo daily data in the limit of one day, which is defined from the open bell (9:30 AM) to the close bell (4:00 PM).

FIG. 1: Intraday stock price and number of trades constructed from the TAQ database at each 5 minute interval from Thursday, 2nd of January 1997 to Thursday, 9th of January 1997 for Intel (upper panel). Volume of trades during each day is shown in the lower panel. Days are separated by an effective overnight time interval that is constructed from the data, such that the open-to-close variance and the close-to-close variance of the log-returns follow the same ∝ t line (see Fig. 5).

We define the daily open price as the price of the first trade that happened at or after 9:30 AM. We also define the daily close price as the price of the last trade that happened before or at 4:00 PM. A typical time series for intraday prices, number of trades and volumes for 1 particular week is shown in Fig. 1.
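The session filter and the open and close definitions above can be sketched in a few lines. The tick layout used here, a list of (timestamp, price, volume) tuples, is a hypothetical in-memory format chosen for the example; it is not the TAQ file format, and error-flagged trades are assumed to have been removed already.

```python
from datetime import datetime, time

OPEN_BELL = time(9, 30)
CLOSE_BELL = time(16, 0)


def regular_session(ticks):
    """Keep trades time-stamped from 9:30 AM to 4:00 PM inclusive."""
    return [tk for tk in ticks if OPEN_BELL <= tk[0].time() <= CLOSE_BELL]


def daily_open_close(ticks):
    """Map each date to (open, close): first and last in-session trade prices."""
    by_day = {}
    for stamp, price, _volume in sorted(regular_session(ticks), key=lambda tk: tk[0]):
        day = stamp.date()
        if day not in by_day:
            by_day[day] = [price, price]  # first trade sets both open and close
        else:
            by_day[day][1] = price        # later trades update the close
    return {day: tuple(oc) for day, oc in by_day.items()}


if __name__ == "__main__":
    ticks = [
        (datetime(1997, 1, 2, 9, 15), 130.0, 100),   # before the open bell: ignored
        (datetime(1997, 1, 2, 9, 30), 131.75, 200),
        (datetime(1997, 1, 2, 12, 0), 132.0, 300),
        (datetime(1997, 1, 2, 16, 0), 133.0, 100),
        (datetime(1997, 1, 2, 16, 5), 140.0, 100),   # after the close bell: ignored
    ]
    print(daily_open_close(ticks))
```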

Notice that the intraday volume and trading activity (number of trades) can be well described by a parabola (Fig. 3). This typical intraday pattern [52, 53] has also been found for high-frequency volatility proxies, such as the root mean square return for all ticks that happen in a certain interval of time [54, 55, 56, 57, 58, 59]. The statistics for such a pattern for the number of trades of Intel in the year 1997 are shown in Fig. 3. Notice that the probability densities for different parts of the day will clearly have different widths and averages. Therefore, mixing all parts of the day will result in a wider probability density for the number of trades and other intraday quantities [53]. We do not study the consequences of such a mixture; we are only careful to work with intraday time lags that divide the trading day evenly [2]. In this way, all parts of the daily trend are equally represented. Since we are working with prices quoted every 5 minutes (five minute close prices) and the day from open to close has only 78 such intervals, we work with returns that are t = 5, 10, 15, 30, 65, 130, 195, 390 minutes long.
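The list of admissible time lags follows from the divisors of 78, the number of 5 minute intervals in the trading day. A minimal sketch (the function name is illustrative):

```python
def equal_division_lags(intervals_per_day=78, bar_minutes=5):
    """Time lags (in minutes) that split the trading day into equal pieces."""
    return [
        bar_minutes * d
        for d in range(1, intervals_per_day + 1)
        if intervals_per_day % d == 0
    ]


if __name__ == "__main__":
    print(equal_division_lags())  # [5, 10, 15, 30, 65, 130, 195, 390]
```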

Another important characteristic of daily and intraday data is shown in Fig. 2. The cumulative number of trades from 1993 to 2001 ($\sum_{i=01/01/1993}^{12/31/2001} N_i$) increases almost exponentially. The behavior of the cumulative number of trades shows that the average number of trades changes from year to year. The same type of behavior is found for the square of the demeaned log-returns (the variance of the returns). Therefore, the probability density for the returns, volume and number of trades is only approximately stationary throughout the years. When studying returns (chapter IV), we treat the data as stationary, and we take data from 1993 to 1999. When studying subordination using the number of trades (chapter V), we reduce the non-stationary effect of the data by working with one year of data.

FIG. 2: Cumulative number of trades and return from 1993 to 2001 for Intel. The increase of the cumulative number of trades indicates that the parameters describing the stock are changing.

FIG. 3: Average number of trades (ticks) in a given period of the day (INTC, 1997; time T in minutes from 9:30 AM to 4:00 PM). The error bars represent the volatility. The red solid line gives the best fit parabola to the average number of trades, f(T) = 0.0069326 T² − 3.0351 T + 440.5032. The same type of pattern is found for absolute returns [54] and volume.

In order to study intraday returns, we construct 5 minute close prices from the tick-by-tick data. The 5 minute close price is defined in analogy with the day close price. The 5 minute volume (or number of trades (ticks)) is the sum of all traded volume (or number of trades (ticks)) in a 5 minute interval.
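A minimal sketch of this aggregation for a single trading day is shown below. It assumes a hypothetical tick format of (minutes since 9:30 AM, price, volume), assigns a trade at the exact end of an interval to that interval (in analogy with the day close), and carries the previous close forward when an interval contains no trades; the last two conventions are choices of this example, not statements about the original procedure.

```python
import math


def five_minute_bars(ticks, bar_minutes=5, session_minutes=390):
    """Return (close prices, volumes, trade counts) per 5 minute interval."""
    n_bars = session_minutes // bar_minutes
    closes = [None] * n_bars
    volumes = [0] * n_bars
    counts = [0] * n_bars
    for minutes, price, volume in sorted(ticks, key=lambda tk: tk[0]):
        if not 0 <= minutes <= session_minutes:
            continue  # outside the 9:30 AM to 4:00 PM session
        # Interval k covers (k*bar, (k+1)*bar]; a trade at the open goes to interval 0.
        k = max(0, math.ceil(minutes / bar_minutes) - 1)
        closes[k] = price      # ticks are time ordered, so the last one is the close
        volumes[k] += volume
        counts[k] += 1
    last = None
    for k in range(n_bars):
        if closes[k] is None:
            closes[k] = last   # no trades: carry the previous close forward (assumption)
        last = closes[k]
    return closes, volumes, counts


if __name__ == "__main__":
    ticks = [(0.0, 10.0, 100), (4.9, 10.5, 200), (5.0, 10.25, 50), (12.0, 11.0, 10)]
    closes, volumes, counts = five_minute_bars(ticks)
    print(closes[:3], volumes[:3], counts[:3])
```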

When constructing intraday return time series, we do not include nights or weekends. Effectively, our largest intraday return is from open to close (time lag of 390 min = 6.5 hours). A common procedure, not adopted here, is to take the open of the next day as the close of the present day [60, 61]. This will include returns that are effectively overnight, where no trades are present. The result of such practice is illustrated in Fig. 4. Clearly, the tails of the distribution of returns including overnight time lags are considerably enhanced compared with the distribution of intraday returns that do not include overnight time lags.

FIG. 4: Cumulative density function for the positive and negative log-returns of Intel (INTC, 1997; time lags of 5 min, 15 min and 1:05 hours). Log-returns constructed including overnight time lags (solid lines) show a higher probability of large returns than log-returns that do not include overnight time lags (dashed lines). We choose not to include overnight time lags in our intraday return time series.

When working with high-frequency (intraday) data, recording errors are inevitable. In order to remove errors in the tick-by-tick data as well as in our 5 minute close time series, created from the tick-by-tick data, we use the Yahoo database as our benchmark. We assume that the daily Yahoo database does not have errors. Our filtering technique consists of two parts. First, we calculate the log-return between the maximum and minimum price of a given day for the Yahoo data (r_HL). We then calculate the log-return (r_5min = ln(S_T) − ln(S_{T−5min})) for the 5 minute price data in the same day and compare it to r_HL. We replace any log-return |r_t| > r_HL with the return immediately preceding it. We also replace the number of trades and volume of the “corrupted” 5 minute interval by the immediately preceding ones. The second filtering procedure consists of requiring that the largest and smallest 5 minute log-returns (r_5min) in a given day lie between the maximum and the minimum of the whole time series formed by the Yahoo open-to-close return data (min(r_OC) < r_5min < max(r_OC)). Once again, if the condition min(r_OC) < r_5min < max(r_OC) is not satisfied, we replace the “corrupted” log-return, volume and number of trades by the immediately preceding one.

The typical effect of such a simple error removal algorithm is to change less than 1% (on the order of 0.1%) of the data.

The same filtering procedure is used for tick-by-tick data, except that instead of replacing the “corrupted” log-return and volume, we just ignore it. In fact, ignoring or replacing by the nearest value is found to be equivalent (for tick-by-tick or 5 minute data) for the purpose of this work: the probability density and moments are the same.
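The two filters for the 5 minute returns of one day can be sketched as follows. The function names are illustrative, the two conditions are applied in a single pass for brevity, and the handling of a corrupted first return of the day (replaced by zero because no preceding return exists) is an assumption of this example.

```python
import math


def high_low_log_return(day_high, day_low):
    """r_HL: log-return between the daily maximum and minimum prices."""
    return math.log(day_high) - math.log(day_low)


def filter_day_returns(returns, r_hl, r_oc_min, r_oc_max):
    """Replace corrupted 5 minute log-returns by the preceding return.

    A return is treated as corrupted if |r| > r_HL (first filter) or if it
    lies outside (min(r_OC), max(r_OC)) (second filter).
    """
    cleaned = []
    n_replaced = 0
    for r in returns:
        corrupted = abs(r) > r_hl or not (r_oc_min < r < r_oc_max)
        if corrupted:
            r = cleaned[-1] if cleaned else 0.0  # no predecessor: use 0.0 (assumption)
            n_replaced += 1
        cleaned.append(r)
    return cleaned, n_replaced


if __name__ == "__main__":
    r_hl = high_low_log_return(135.0, 130.0)
    print(filter_day_returns([0.001, 0.2, -0.002], r_hl, -0.1, 0.1))
```

The same pass can carry the volume and number of trades along with each return if they are also to be replaced, as described above.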

IV. MESOSCOPIC RETURNS

The actual observed empirical probability distribution functions (EDFs) for different assets have been extensively studied in recent years [1, 15, 49, 60, 61, 64, 77, 78, 79, 80, 81]. We focus here on the EDFs of the returns of individual large American companies from 1993 to 1999, a period without major market disturbances. By ‘return’ we always mean ‘log-return’, the difference of the logarithms of prices at two times separated by a time lag t.

The time lag t is an important parameter: the EDFs evolve with this parameter. At micro lags (typically shorter than one hour), effects such as the discreteness of prices and transaction times, correlations between successive transactions, and fluctuations in trading rates become important (for discreteness effects see chapter V) [15, 16]. Power-law tails of EDFs in this regime have been much discussed in the literature before [60, 61]. At ‘meso’ time lags (typically from an hour to a month), continuum approximations can be made, and some sort of diffusion process is plausible, eventually leading to a normal Gaussian distribution. On the other hand, at ‘macro’ time lags, the changes in the mean market drifts and macroeconomic ‘convection’ effects can become important, so simple results are less likely to be obtained. The boundaries between these domains depend to an extent on the stock, the market where it is traded, and the epoch. The micro-meso boundary can be defined as the time lag above which power-law tails constitute a very small part of the EDF. The meso-macro boundary is more tentative, since statistical data at long time lags become sparse.

The first result is that we extend to meso time lags a stylized fact [124] known since the 19th century [82] (quoted in [19]): with a careful definition of time lag t, the variance of returns is proportional to t.

The second result is that log-linear plots of the EDFs show prominent straight-line (tent-shape) character, i.e. the bulk (about 99%) of the probability distribution of log-returns follows an exponential law. The exponential law applies to the central part of EDFs, i.e. not too big log-returns. For the far tails of EDFs, usually associated with power laws at micro time lags, we do not have enough statistically reliable data points at meso lags to make a definite conclusion. Exponential distributions have been reported for some world markets [1, 49, 64, 77, 78, 79, 80, 81] and briefly mentioned in the book [15] (see Fig. 2.12). However, the exponential law has not yet achieved the status of a stylized fact. Perhaps this is because influential work [60, 61] has been interpreted as finding that the individual returns of all the major US stocks for micro to macro time lags have the same power law EDFs, if they are rescaled by the volatility.

The Heston model is a plausible diffusion model with stochastic volatility, which reproduces the time lag-variance proportionality and the crossover from exponential distribution to Gaussian. This model was first introduced by Heston, who studied option prices [48]. Later Dragulescu and Yakovenko (DY) derived a convenient closed-form expression for the probability distribution of returns in this model and applied it to stock indexes from 1 day to 1 year [49]. The third result is that the DY formula with three lag-independent parameters reasonably fits the time evolution of EDFs at meso lags.

FIG. 5: Top panel: Variance ⟨x_t²⟩ vs. time lag t (data 1993–1999). Solid lines: Linear fits ⟨x_t²⟩ = θt. Inset: Variances for MRK before adjustment for the effective overnight time T_n. Bottom panel: Log-linear plots of CDFs vs. x/√(θt). Straight dashed lines −|x|√(2/θt) are predicted by the DY formula (29) in the short-time limit. The curves are offset by a factor of 10.

A. Data analysis and discussion

We analyzed the data from Jan/1993 to Jan/2000 for 27 Dow companies, but show results only for four large cap companies: Intel (INTC) and Microsoft (MSFT) traded at NASDAQ, and IBM and Merck (MRK) traded at NYSE (please see the appendix for more companies). We use two databases, TAQ to construct the intraday returns and the Yahoo database for the interday returns (see Chapter III). The intraday time lags were chosen at multiples of 5 minutes, which divide exactly the 6.5 hours (390 minutes) of the trading day. The interday returns are as described in [1, 49] for time lags from 1 day to 1 month = 20 trading days.

In order to connect the interday and intraday data, we have to introduce an effective overnight time lag T_n. Without this correction, the open-to-close and close-to-close variances would have a discontinuous jump at 1 day, as shown in the inset of the top panel of Fig. 5. By taking the open-to-close time to be 6.5 hours, and the close-to-close time to be 6.5 hours + T_n, we find that the variance ⟨x_t²⟩ is proportional to time t, as shown in the top panel of Fig. 5. The slope gives us the Heston parameter θ in Eq. (21). T_n is about 2 hours (see Table I).
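One plausible way to obtain T_n from the data, sketched below, is to fit θ from the intraday variances with a straight line through the origin and then ask which overnight time places the close-to-close variance on the same line. This is an illustration under that assumption and not necessarily the fitting procedure used for Table I; the function names and the numbers in the usage example are hypothetical.

```python
def fit_theta(lags_hours, variances):
    """Least-squares slope of variance vs. time lag for a line through the origin."""
    numerator = sum(t * v for t, v in zip(lags_hours, variances))
    denominator = sum(t * t for t in lags_hours)
    return numerator / denominator


def effective_overnight_hours(theta, close_to_close_variance, open_to_close_hours=6.5):
    """T_n such that close_to_close_variance = theta * (6.5 hours + T_n)."""
    return close_to_close_variance / theta - open_to_close_hours


if __name__ == "__main__":
    lags = [0.5, 1.0, 3.25, 6.5]               # intraday lags in hours
    variances = [1e-4, 2e-4, 6.5e-4, 1.3e-3]   # synthetic values on a line, theta = 2e-4/hour
    theta = fit_theta(lags, variances)
    print(theta, effective_overnight_hours(theta, 1.77e-3))
```

With these synthetic inputs the effective day is 8.85 hours long, so T_n is 2.35 hours, comparable in size to the values quoted in Table I.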

FIG. 6: Top panel: Theoretical CDFs for the Heston model with α = 1 plotted vs. x/√(θt), for γt ranging from 0.17 (10 min) to 274.0 (30 days), with 1/γ = 58 min for INTC. The curves interpolate between the short-time exponential and long-time Gaussian scalings. Bottom panel: Comparison between empirical (points) and the DY theoretical (curves) characteristic functions P_t(k) for INTC data, 1993–1999, at time lags from 30 min to 20 days.

In the bottom panel of Fig. 5, we show the log-linear plots of the cumulative distribution functions (CDFs) vs. normalized return x/√(θt). The CDF_t(x) is defined as ∫_{−∞}^{x} P_t(x′) dx′, and we show CDF_t(x) for x < 0 and 1 − CDF_t(x) for x > 0. We observe that CDFs for different time lags t collapse on a single straight line without any further fitting (the parameter θ is taken from the fit in the top panel). More than 99% of the probability in the central part of the tent-shape distribution function is well described by the exponential function. Moreover, the collapsed CDF curves agree with the DY formula (29), P_t(x) ∝ exp(−|x|√(2/θt)), in the short-time limit for α = 1 [49], which is shown by the dashed lines.

TABLE I: Fitting parameters of the Heston model with α = 1 for the 1993–1999 data.

        γ (1/hour)   1/γ (hour, h:mm)   θ (1/year)   µ (1/year)   T_n (hour, h:mm)
INTC    1.029        0:58               13.04%       39.8%        2:21
IBM     0.096        10:25              9.63%        35.3%        2:16
MRK     0.554        1:48               6.57%        29.4%        1:51
MSFT    1.284        0:47               9.06%        48.3%        1:25

FIG. 7: Comparison between the 1993–1999 Intel data (points) and the DY formula (20) (curves) for PDF (top panel) and CDF (bottom panel), for time lags from 5 min to 20 days (1 day = 8:51 hours).

Because the parameter γ drops out of the asymptotic Eq. (29), it can be determined only from the crossover regime between short and long times, which is illustrated in the top panel of Fig. 6. We determine γ by fitting the characteristic function P_t(k), a Fourier transform of P_t(x) with respect to x. The theoretical characteristic function of the Heston model is P_t(k) = e^{F_t(k)} (20). The empirical characteristic functions (ECFs) can be constructed from the data series by taking the sum P_t(k) = Re Σ_{x_t} exp(−ikx_t) over all returns x_t for a given t [83]. Fits of ECFs to the DY formula (20) are shown in the bottom panel of Fig. 6. The parameters determined from the fits are given in Table I.

In the top panel of Fig. 7 we compare the empirical PDF P_t(x) with the DY formula (20). The agreement is quite good, except for the very short time lag of 5 minutes, where the tails are visibly fatter than exponential. In order to make a more detailed comparison, we show the empirical CDFs (points) with the theoretical DY formula (lines) in the bottom panel of Fig. 7. We see that, for micro time lags of the order of 5 minutes, the power-law tails are significant. However, for meso time lags, the CDFs fall onto straight lines in the log-linear plot, indicating an exponential law. For even longer time lags, they evolve into the Gaussian distribution, in agreement with the DY formula (20) for the Heston model. To illustrate the point further, we compare empirical and theoretical data for several other companies in Fig. 8.

In the empirical CDF plots, we actually show the ranking plots of log-returns x_t for a given t. So, each point in the plot represents a single instance of price change. Thus, the last one or two dozen points at the far tail of each plot constitute a statistically small group and show a large amount of noise. Statistically reliable conclusions can be made only about the central part of the distribution, where the points are dense, but not about the far tails.
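A ranking plot of this kind can be built by sorting the returns, so that each data point contributes exactly one point to the curve. The sketch below is illustrative: it estimates 1 − CDF_t(x) for x > 0 as the fraction of all returns greater than or equal to x, and CDF_t(−|x|) for x < 0 as the fraction less than or equal to x; the treatment of ties and of zero returns is a choice of this example.

```python
def ranking_tails(returns):
    """Return (positive_tail, negative_tail) as lists of (|x|, probability).

    Each list is ordered from the far tail inwards, so the first entry of
    each has probability 1/len(returns).
    """
    n = len(returns)
    positive = sorted((x for x in returns if x > 0), reverse=True)
    negative = sorted((-x for x in returns if x < 0), reverse=True)
    positive_tail = [(x, (rank + 1) / n) for rank, x in enumerate(positive)]
    negative_tail = [(x, (rank + 1) / n) for rank, x in enumerate(negative)]
    return positive_tail, negative_tail


if __name__ == "__main__":
    pos, neg = ranking_tails([-0.02, -0.01, 0.0, 0.01, 0.03])
    print(pos)  # [(0.03, 0.2), (0.01, 0.4)]
    print(neg)  # [(0.02, 0.2), (0.01, 0.4)]
```

The first few entries of each list are the noisy far-tail points mentioned above, since each of them rests on a single observation.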

B. Conclusions

We have shown that in the mesoscopic range of time lags, the probability distribution of financial returns interpolates between exponential and Gaussian law. The time range where the distribution is exponential depends on a particular company, but it is typically between an hour and a few days. Similar exponential distributions have been reported for the Indian [77], Japanese [78], German [79], and Brazilian markets [64, 80], as well as for the US market [1, 49, 81] (see also Fig. 2.12 in [15]). The DY formula [49] for the Heston model [48] captures the main features of the probability distribution of returns from an hour to a month with a single set of parameters.

V. NUMBER OF TRADES AND SUBORDINATION

The concept of subordination has important fundamental and practical implications. From a fundamental point of view, it gives a relation between the microstructure of the market and price formation that can be exploited in simulations and modelling [42, 55, 84, 85]. From a practical point of view, the subordinator can be identified with the integrated variance V_t [56, 86]. This would imply a direct measure of the mean square return, which could impact pricing and hedging both of options on a particular stock as well as variance swaps and options on the variance.

In this chapter we verify and model the subordination hypothesis as given by Eq. (36). We will restrict our study to intraday Intel data in the year 1997. We restrict to a year of data because of the nonlinear drift of the number of trades: we would like to minimize this effect (see Fig. ??). We chose Intel because it has been studied by us in Ref. [2] (chapter IV) and it can be modelled well with the Heston model introduced in chapter IV. Intel is a highly traded stock, which is an advantage: there are a lot of trades in a day and therefore the statistics are better. Smaller stocks should therefore also be checked in the future. The year 1997 is representative of what one finds for other years, except perhaps 2000 and 2001, which we did not verify because of technical problems (the data set is too large and requires special computing techniques that should be implemented in the future).

FIG. 8: Comparison between empirical data (symbols) and the DY formula (20) (lines) for CDF (left panels) and characteristic function (right panels), for IBM, MRK and MSFT data, 1993–1999.

We begin by showing the influence of the discrete nature of the absolute price change in the intraday log-return data. This is rarely pointed out, even though there is a vast literature on intraday log-returns [15, 60, 61, 68, 87]. This discreteness has to be accounted for when considering subordination, or even when studying intraday returns. It implies that a continuous probability density is only a convenient approximation for some return horizons.

In section V B, we verify when and for what range of data subordination applies. We assume that the integrated volatility V_t is the random subordinator of a driftless Brownian motion and that V_t is proportional to the number of trades N_t in an interval of time t. We also use tick-by-tick data to check for subordination by constructing the probability density of the log-returns x_N after N trades (36).

In section V C, we model the integrated variance V_t with the CIR process introduced in Eq. (38). We present the level of agreement between the data and the theoretical CIR model and we link these results to the distribution of log-returns x_t.

In the last section, we present a summary of our findings.

A. Discrete nature of stock returns

On a tick-by-tick level, price changes are discrete. There is a minimal price change for bids and offers that is set by the internal rules of the stock exchange. In the case of Intel in the year 1997, the minimal price change was $1/8 for the first part of the year and after June 24th it became $1/16 [88, 89]. Nevertheless, empirically we find that the smallest price change on realized transactions is h = $1/64 (Fig. 9). This difference is a direct consequence of the mechanism of trading, and we will not study it here (see Ref. [90, 91]) [125]. We note that the minimal price change set by law is clear in Fig. 9, since the most probable price changes are indeed 0, ±4h = $1/16 and ±8h = $1/8, according to the rules of the NASDAQ exchange in 1997.

Our goal in this section is to identify the discrete nature of absolute price changes [126] after N trades (m_N h = S_n − S_{n−N}) in the log-returns after N trades (x_N = ln(S_n) − ln(S_{n−N})) and in log-returns after a time lag t (x_t = ln(S_T) − ln(S_{T−t})), since these log-returns are the quantities that we ultimately want to model. We want to point out that the discrete nature of the log-returns for intraday work is generally overlooked, but it can influence the analysis of short returns.

FIG. 9: Dimensionless absolute returns m_N = (S_n − S_{n−N})/h, h = $1/64, for N trades in log-linear and linear scale (center and bottom panels respectively), for INTC, 1997, with N = 1 (⟨T_N⟩ = 1.49 s) and N = 4000 (⟨T_N⟩ = 98.5 min). In the top panel we show the difference of the PDFs for m_N and m_N − 1 to illustrate the oscillatory nature of the discrete PDF for absolute returns: it evolves from a “pulse” like shape for N = 1 to a “constant wave” for N = 4000.

We will refer to the minimal price change h = $1/64 as the “quantum of price” or simply “quantum”, in analogy with quantum mechanics.

The discrete nature of the price change can be used to model the price dynamics starting from a microscopic approach, as recently suggested in [55, 57, 92, 93]. We are interested in the limit where the quantum effect is not noticeable and therefore quantities such as number of trades and returns can be treated as continuous random variables.

Fig. 9 shows the probability density for the dimensionless absolute price return m_N = (S_n − S_{n−N})/h after N trades in steps of one quantum h. The nature of the tick-by-tick distribution (N = 1) is considerably different from that for N = 4000. More than 50% of the returns are zero for N = 1, and most of the other returns have a probability of less than 1%, except ±4h and ±8h. The probability has a clearly oscillatory nature where multiples of 4h are maxima (Fig. 9, top panel). After 4000 trades the probability distribution for m_N has changed into a two-level system (Fig. 9). The values of m_N that are most probable for N = 1 now have approximately the same probability. Therefore, the zero return has (after 4000 trades) a probability comparable to that of the other probability maxima.

The quantum nature of the price changes is removed by working with log-returns, except for the zero return. Notice that intraday log-returns can be approximated by the ratio [94]

$$x_N = \ln S_n - \ln S_{n-N} \approx \frac{S_n - S_{n-N}}{S_{n-N}} = \frac{m_N}{S_{n-N}/h}. \qquad (42)$$
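The quality of the approximation in Eq. (42) is easy to check for price changes of a few quanta. In the sketch below the price level and the function names are illustrative; only h = $1/64 is taken from the text.

```python
import math

H = 1.0 / 64.0  # quantum of price, h = $1/64


def exact_log_return(previous_price, m):
    """ln(S_n) - ln(S_{n-N}) for a price change of m quanta."""
    return math.log(previous_price + m * H) - math.log(previous_price)


def approximate_log_return(previous_price, m):
    """Right-hand side of Eq. (42): m divided by the price in units of h."""
    return m / (previous_price / H)


if __name__ == "__main__":
    price = 131.75  # illustrative price level in dollars
    for m in (1, 4, 8, 48):
        print(m, exact_log_return(price, m), approximate_log_return(price, m))
```

Because the denominator changes from trade to trade, the same m maps to slightly different log-returns at different price levels, which is the origin of the spreading discussed below.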

The log-returns can also be written

$$m_{0,N}h = 0,$$

$$m_{i,N}h = S_{iN} - S_{(i-1)N}, \qquad i = 1, 2, 3, \ldots$$

$$x_{i,N} = \frac{m_{i,N}}{\sum_{j=0}^{i-1} m_{j,N} + C}, \qquad C = S_0/h, \qquad i = 1, 2, 3, \ldots, \qquad (43)$$

where S_0 is the first open of the year (in the case of Intel 1997, S_0 = $131.75).

FIG. 10: Effect of taking log-returns instead of taking absolute returns. Upper panel: PDF of the dimensionless absolute returns m_N = (S_n − S_{n−N})/h, h = $1/64. Lower panel: probability density of the dimensionless log-returns x_N/h conditioned on m_N, P(x_N/h | m_N). The values concentrated about a multiple of h (upper panel) spread about their respective h value. The vertical color-coded lines (lower panel) indicate the h value from which each, equally color-coded, P(x_N/h | m_N) originated. The discreteness of m_N is removed by taking log-returns, since the spread of P(x_N/h | m_N) is larger than h.

The effect of taking log-returns is illustrated in Fig. 10. For each absolute return $m_N$, there is a potentially different denominator $S_{n-N}/h$ in Eq. (42), which is a random walk with integer-valued steps about the level $C$ of Eq. (43). Clearly, the values of the ratio $x_N$ will not be integers. Therefore, the ratio in Eq. (43) spreads the absolute returns, which are concentrated at discrete multiples of $h$, around each multiple.

The lower panel of Fig. 10 shows the probability density of $x_N/h$ conditioned on $m_N$. The conditional probability density $P(x_N/h \mid m_N)$ shows a spread for each $m_N$ that is larger than $h$. This spread is enough to mix the discrete values, with the exception of $m_N = 0$.
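The smearing of the discrete values can be illustrated with a short simulation. The sketch below is not the analysis performed on the Intel data: it assumes a hypothetical price that moves by -1, 0 or +1 tick per trade with equal probability, and the names `TICK`, `simulate_returns` and the parameter values are illustrative choices.

```python
import numpy as np

TICK = 1.0 / 64.0          # price quantum h in dollars, as in Fig. 10
S0 = 131.75                # first open of the year for Intel 1997 (from the text)

def simulate_returns(n_trades=200_000, lag=100, seed=0):
    """Return (m_N, x_N, S_{n-N}) for a toy integer-tick random walk.

    Assumption: each trade moves the price by -1, 0 or +1 tick with equal
    probability. This is only a stand-in for real tick data.
    """
    rng = np.random.default_rng(seed)
    steps = rng.integers(-1, 2, size=n_trades)           # values in {-1, 0, 1}
    ticks = int(round(S0 / TICK)) + np.concatenate(([0], np.cumsum(steps)))
    price = ticks * TICK                                 # S_n in dollars
    sampled = price[::lag]                               # non-overlapping N-trade samples
    m = np.rint(np.diff(sampled) / TICK).astype(int)     # m_N = (S_n - S_{n-N}) / h
    x = np.diff(np.log(sampled))                         # x_N = ln S_n - ln S_{n-N}
    return m, x, sampled[:-1]

m, x, s_prev = simulate_returns()
n_m = np.unique(m).size
n_x = np.unique(np.round(x, 12)).size
print("distinct m_N values:", n_m)
print("distinct x_N values:", n_x)
# Largest error of the approximation x_N ~ m_N h / S_{n-N} from Eq. (42)
print("max |x_N - m_N h / S|:", np.max(np.abs(x - m * TICK / s_prev)))
```

In this toy setting the absolute returns take only a few integer values, while the log-returns take many more distinct values because the denominator $S_{n-N}$ changes from sample to sample.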

The quality of such a mixture can be seen in Fig. 11 and Fig. 12. Even though the cumulative density function for $x_N$ is practically continuous (even for $N = 1$), with the exception of $x_N = 0$, the stepwise nature of $m_N$ can be easily recognized up to $N = 1000$ (Fig. 12). The oscillations in the cumulative density function for $x_N$ are centered about the discrete steps of the cumulative density function of $m_N$.

The discrete quantum effect at $m_N = 0$ is quite persistent, but it can be neglected for returns $x_N$ with a large number of trades $N$ (for instance, $N = 4000$). Empirically, the criterion for neglecting the $m_N = 0$ effect appears to be that the probability of having $m_N = 0$ is of the same order of magnitude as the probability of having any other $m_N$ (Fig. 9). For Intel 1997, this transition starts at approximately $N = 1000$.

FIG. 11: Cumulative probability density for both the dimensionless log-returns $x_N/h$ (black line) and the dimensionless absolute returns $m_N$ (blue symbols). Even though the discreteness of $m_N$ is removed, with the exception of $x_N = 0$, the signature of this discreteness is still visible. Notice the stepwise nature of the black line.

[Figure 12: two panels for INTC, 1997, $h = \$1/64$: $N = 1000$ with $\langle T_N \rangle = 24$ min, and $N = 4000$ with $\langle T_N \rangle = 98.5$ min. Each panel shows CDF($m_N$) vs. $m_N = (S_n - S_{n-N})/h$ and CDF($x_N/h$) vs. $x_N/h = m_N/S_{n-N}$ (axis in units of $10^{-1}$).]

FIG. 12: Cumulative probability density for both the dimensionless log-returns $x_N/h$ and the dimensionless absolute returns $m_N$. When $N$ increases, the CDF becomes progressively less oscillatory and the discrete nature of the underlying absolute returns becomes less clear.

[Figure 13: INTC, 1997. Cumulative probability density CDF($x_{t=5\,\mathrm{min}}/h$) vs. log-returns $x_{t=5\,\mathrm{min}}/h$ (axis in units of $10^{-1}$).]

FIG. 13: Cumulative probability density for $x_t/h$ with $t = 5$ minutes. The discreteness at zero persists from $x_N/h$, as does the oscillation (stepwise nature) of the CDF.

The effect of data discreteness is also present in the log-return $x_t$ of time lag $t$. From the log-return $x_t$, we can construct $x_N$ by conditioning on the number of trades $N$ present in $t$ ($N_t$). The opposite is also true: by conditioning on $t$, we can construct $x_t$ from $x_N$. Therefore, some of the discrete effects that are present in $x_N$ will be present in $x_t$. As an example, consider 5-minute log-returns. The average number of trades is $\langle N_{t=5\,\mathrm{min}} \rangle = 200 \pm 184$. Because of the reciprocity in constructing the PDF for $x_t$ from $x_N$ (and vice versa) by conditioning, the composition of $x_{t=5\,\mathrm{min}}$ contains a wide range of $x_N$ for which the discrete features cannot be ignored (clear oscillations and a large probability for $x_t = 0$). If we approximate the PDF of $N_{t=5\,\mathrm{min}}$ by a Gaussian distribution, the most probable value in $x_{t=5\,\mathrm{min}}$ is $N_{t=5\,\mathrm{min}} = 200$. Therefore, some fraction of $x_{N=200}$ will be sampled when we construct the probability of $x_{t=5\,\mathrm{min}}$ by conditioning. These returns clearly have many discrete features (Fig. 12), and these features pass to $x_{t=5\,\mathrm{min}}$.

Fig. 13 shows the oscillatory, stepwise cumulative probability density of $x_{t=5\,\mathrm{min}}$ and the special nature of $x_{t=5\,\mathrm{min}} = 0$. Compare this figure with Fig. 11 and Fig. 12. These features originate from $x_N$ and represent small flat portions in the probability density function.

Finally, from the sequence of Figs. 11 and 12 and the correspondence between $x_N$ and $x_t$, we can conclude that the discrete effects become negligible for time lags $t > 1$ hour.

B. Verifying subordination with intraday data

The hypothesis of subordination introduced by Clark [26] has had strong economic implications, and following his work there is a vast body of theoretical and empirical work that addresses the issue [38, 39, 40, 41, 42]. Similar to the work of Refs. [40, 41], we test for subordination by considering the integrated variance $V_t$, constructed from the number of trades $N_t$, to be the subordinator of a Brownian motion.

[Figure 14: INTC, 1997. $\langle x_N^2 \rangle$ vs. tick-lag $N$ (lower axis, 0 to 10000) and average time between $N$ ticks in minutes (upper axis, 23.6 to 236.1). Linear fit $\langle x_N^2 \rangle = \sigma_N^2 N$ with $\sigma_N^2 = 2.3896 \times 10^{-8}$.]

FIG. 14: Variance of the log-return $x_N$ for $N = 1$ to $N = 10000$.

[Figure 15: Intraday INTC, 1997. $\langle x_t^2 \rangle$ vs. time lag $t$ (min). Linear fit $\langle x_t^2 \rangle = \theta t$ with $\theta = 9.5306 \times 10^{-7}$ (1/min).]

FIG. 15: Variance of the demeaned log-return $x_t$ for intraday time lags $t$.

[Figure 16: Intraday INTC, 1997. $\langle N_t \rangle$ vs. time lag $t$ (min). Linear fit $\langle N_t \rangle = \eta t$ with $\eta = 40.0271$ (1/min).]

FIG. 16: Average number of trades in an intraday interval $t$.

Due to the discrete nature of the distribution of intraday returns presented in section (V A), we can only talk about subordination as formulated in equation (36) after the discrete effects become small. In what follows, we will consider all time lags, even those where the discrete effects are large. Nevertheless, we will see that the best subordination takes place for time lags for which discrete effects can be ignored.

The first implication of subordination can be verified with the use of moments given by equations (33) and (35). Figs. 15 and 16 show the linear time dependence of both the variance of $x_t$ and the mean of $N_t$, as expected from equation (33). Furthermore, since we are assuming a Brownian motion with stochastic variance given by the number of trades, the log-returns $x_N$ after $N$ trades should be Gaussian distributed with variance $\langle x_N^2 \rangle = \sigma_N^2 N$. Fig. 14 shows the linear relation of $\langle x_N^2 \rangle$ vs. $N$. The consistency between the slope values in Figs. 14, 15 and 16 required by subordination is

$$\langle x_t^2 \rangle = \theta t = \sigma_N^2 \langle N_t \rangle = \sigma_N^2 \eta t \;\Rightarrow\; \theta = \sigma_N^2 \eta. \qquad (44)$$

Using expression (44), the difference between the measured $\theta$ (Fig. 15) and $\theta = \eta \sigma_N^2$ from Fig. 14 and Fig. 16 is less than 1%.
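As a quick numerical check of relation (44), the slopes quoted in the legends of Figs. 14, 15 and 16 can be combined directly. The snippet below is a minimal sketch; the variable names are illustrative and only the three numbers come from the figures.

```python
# Slopes quoted in the legends of Figs. 14-16 (Intel, 1997)
sigma2_N = 2.3896e-8        # <x_N^2> = sigma_N^2 * N           (Fig. 14)
theta_measured = 9.5306e-7  # <x_t^2> = theta * t, in 1/min     (Fig. 15)
eta = 40.0271               # <N_t> = eta * t, in 1/min         (Fig. 16)

# Relation (44): theta = sigma_N^2 * eta
theta_implied = sigma2_N * eta
rel_diff = abs(theta_implied - theta_measured) / theta_measured

print(f"theta implied by Eq. (44): {theta_implied:.4e} per minute")
print(f"relative difference:       {100 * rel_diff:.2f}%")
```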

[Figure 17: INTC, 1997. Upper panel: $\mathrm{CDF}_N(x_N)$ on a linear scale. Lower panel: $\mathrm{CDF}_N(x_N)$ and $1 - \mathrm{CDF}_N(x_N)$ on a logarithmic scale. Horizontal axis: return in units of $\sigma_N$, $(x_N - \mu_N)/\sigma_N$. Legend: $N = 1200$, 2600, 3500, 4500, 5500, 7100; mean time after $N$ ticks = 29.8, 64.9, 87, 109.7, 130.6, 166.4 min.]

FIG. 17: Cumulative probability density for the demeaned and standard-deviation (STD) normalized log-returns $x_N$ (color-coded solid curves), compared to the Gaussian distribution with mean zero and STD one (dashed curve). From small $N$ to large $N$, there is progressive agreement with the Gaussian, with the best agreement between $N = 3500$ and $N = 4500$. While smaller values of $N$ have CDFs above the Gaussian, larger values are below the Gaussian.

[Figure 18: skewness of $x_N$ and kurtosis of $x_N$ vs. tick-lag $N$ (0 to 10000) and average time between $N$ ticks in minutes (23.6 to 236.1), with confidence intervals; an additional panel shows the kurtosis of $x_N$ for $N$ from 0 to 500.]

FIG. 18: Skewness and excess kurtosis (labelled as "kurtosis" in the figure) as a function of $N$ for the normalized log-returns $x_N$ in Fig. 17. For a Gaussian distribution, both the skewness and the excess kurtosis are zero. As the number of trades (ticks) $N$ increases, the skewness and excess kurtosis approach zero. The probability density for $x_N$ can be well approximated by a Gaussian for $N > 2500$, since both skewness and excess kurtosis are small.

In order to find a time and a return range where subordination takes place, we look at the data in 3 different ways. First, using tick-by-tick data, we construct the distribution of the log-return $x_N$ after $N$ trades. $x_N$ should be Normal distributed with mean zero and standard deviation $\sigma_N \sqrt{N}$. We also present the $N$ dependence of the skewness, $\langle x_N^3 \rangle / \langle x_N^2 \rangle^{3/2}$, and the excess kurtosis, $\langle x_N^4 \rangle / \langle x_N^2 \rangle^{2} - 3$, of $x_N$ in Fig. 18.
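The two moment ratios defined above are straightforward to estimate from a sample. The following is a minimal sketch, assuming the sample mean is subtracted first (the text works with demeaned returns); the function names and the synthetic test samples are illustrative and are not part of the original analysis.

```python
import numpy as np

def skewness(x):
    """Skewness <x^3> / <x^2>^(3/2) of the demeaned sample."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    return np.mean(x**3) / np.mean(x**2) ** 1.5

def excess_kurtosis(x):
    """Excess kurtosis <x^4> / <x^2>^2 - 3 of the demeaned sample."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    return np.mean(x**4) / np.mean(x**2) ** 2 - 3.0

rng = np.random.default_rng(1)
gauss = rng.normal(size=200_000)       # reference: both ratios should be near zero
laplace = rng.laplace(size=200_000)    # heavier tails: positive excess kurtosis

print("Gaussian sample:", skewness(gauss), excess_kurtosis(gauss))
print("Laplace sample: ", skewness(laplace), excess_kurtosis(laplace))
```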

FIG. 19: Cumulative density function (CDF) for $\epsilon_t$ as defined in equation (45) for three different $t$, compared to the Gaussian (solid line). The parameter $\sigma_N$ in (45) is chosen for the best agreement between the Gaussian and the data.

FIG. 20: Cumulative density function (CDF) for $\epsilon_t$ as defined in equation (45) for three different $t$, compared to the Gaussian (solid line). Contrary to Fig. 19, the parameter $\sigma_N$ in equation (45) is found using Fig. 14. Notice that the Gaussian lies above the data in the tails.

Second, using $t$-minute returns $x_t$ and the number of trades $N_t$ in the same interval $t$, we construct the time series

$$\epsilon_t = \frac{x_t}{\sqrt{V_t}}, \qquad V_t = \sigma_N^2 N_t, \qquad (45)$$

where $V_t$ is the integrated variance in an interval $t$ and $\sigma_N^2$ is the proportionality constant that converts the number of trades $N_t$ into variance. If subordination indeed holds, $\epsilon_t$ is Normal distributed with mean zero and standard deviation one, due to the central limit theorem [27, 41].
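The rescaling in equation (45) can be illustrated on synthetic data. The sketch below assumes a hypothetical gamma distribution for the number of trades (it is not fitted to the Intel data) and generates returns as a Gaussian with variance $V_t = \sigma_N^2 N_t$; all names and parameter values are illustrative.

```python
import numpy as np

rng = np.random.default_rng(2)
sigma2_N = 2.0e-8          # variance per trade, order of magnitude quoted in the text
n_obs = 100_000

# Assumption: gamma-distributed trade counts with mean 2600; not fitted to data.
N_t = rng.gamma(shape=2.0, scale=1300.0, size=n_obs)
V_t = sigma2_N * N_t                          # integrated variance, Eq. (45)
x_t = np.sqrt(V_t) * rng.normal(size=n_obs)   # Brownian motion evaluated at "time" V_t
eps_t = x_t / np.sqrt(V_t)                    # rescaled returns, Eq. (45)

def excess_kurtosis(a):
    a = a - a.mean()
    return np.mean(a**4) / np.mean(a**2) ** 2 - 3.0

print("excess kurtosis of x_t:  ", excess_kurtosis(x_t))
print("excess kurtosis of eps_t:", excess_kurtosis(eps_t))
print("std of eps_t:            ", eps_t.std())
```

In this idealized setting the raw returns have fat tails because the variance fluctuates, while the rescaled series is Gaussian by construction. With real data the same rescaling is a test, since nothing guarantees that $\epsilon_t$ comes out Normal.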

Finally, we check subordination by numerically calculating the probability mixture equation (36). We construct the probability density function of the number of trades $N_t$ inside a time interval $t$ by binning the time series of $N_t$. The bin width is chosen according to Ref. [95]. However, the result appears to be independent of the bin width, as long as the bin width is not too large. The cumulative probability density functions for the measured $x_t$ and the non-parametrically reconstructed $x'_t$ are shown in Fig. 21.

The distributions in Fig. 17, Fig. 19 and Fig. 21 (solid line) show an agreement of approximately 85% of the data with the subordination hypothesis for time lags $t > 1$ hour or $N > 2500$ (Fig. 18). However, the subordination is clearly poor for times close to one day ($t = 6.5$ hours), where we do not have enough data (253 points) to draw meaningful conclusions.
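The non-parametric reconstruction can be sketched as follows. Equation (36) is not reproduced in this section, so the code assumes the form suggested by the caption of Fig. 21: a zero-mean Gaussian whose variance $V_t = \sigma_N^2 N_t$ is averaged over the binned distribution of $N_t$. The function name `mixture_cdf`, the number of bins, and the synthetic trade counts are illustrative assumptions, not the procedure of Ref. [95].

```python
import math
import numpy as np

def mixture_cdf(x, n_trades, sigma2_N, bins=50):
    """CDF of a zero-mean Gaussian whose variance sigma2_N * N_t is randomized.

    The distribution of N_t enters non-parametrically through a histogram.
    """
    counts, edges = np.histogram(n_trades, bins=bins)
    weights = counts / counts.sum()                  # P(N_t) per bin
    centers = 0.5 * (edges[:-1] + edges[1:])         # representative N_t per bin
    std = np.sqrt(sigma2_N * centers)
    # Gaussian CDF Phi(x / std) for each bin, averaged with the bin weights
    phi = np.array([0.5 * (1.0 + math.erf(x / (s * math.sqrt(2.0)))) for s in std])
    return float(np.dot(weights, phi))

# Synthetic check (assumed trade-count distribution, not the Intel data)
rng = np.random.default_rng(3)
sigma2_N = 2.0e-8
N_t = rng.gamma(shape=2.0, scale=1300.0, size=50_000) + 100.0
x_t = np.sqrt(sigma2_N * N_t) * rng.normal(size=N_t.size)

x0 = 0.01
empirical = np.mean(x_t <= x0)
reconstructed = mixture_cdf(x0, N_t, sigma2_N)
print(f"empirical CDF({x0}) = {empirical:.4f}, mixture CDF = {reconstructed:.4f}")
```

Evaluating `mixture_cdf` on a grid of returns and comparing it with the empirical CDF gives the kind of comparison shown in Fig. 21.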

Notice the clear disagreement above 2 standard deviations (STD) as well as at zero in Fig. 17 and Fig. 19. The deviations at zero are due to the discrete nature of the data (section V A), while the deviations above 2 STD show that the subordination hypothesis cannot explain the large changes in returns [42].

[Figure 21: INTC, 1997. Cumulative probability density, CDF($-x_t$) and $1 - \mathrm{CDF}(x_t)$, on a logarithmic scale vs. log-return $x_t$, for time lags of 5 min, 30 min, 1:05 hours, 2:10 hours, and 6:30 hours, with positive and negative $x$ shown separately. Legend: $\sigma_N^2 = 2.38 \times 10^{-8}$ and $\sigma_N^2 = 2.0 \times 10^{-8}$.]

FIG. 21: Cumulative distribution of the stock returns $x_t$ compared to the cumulative distribution function (black lines) reconstructed by randomizing the variance $V_t$ of a Gaussian distribution. The probability of $V_t$ is constructed by binning the number of trades, and this probability is used non-parametrically in the integral (36). The solid lines have the parameter $\sigma_N$ chosen to minimize the least-squares error between the empirical $x_t$ distribution and the reconstructed variance-changed Brownian motion (36). The dashed line has $\sigma_N$ found from Fig. 14.

For Fig. 19 and Fig. 21 (solid line), $\sigma_N^2 = 2 \times 10^{-8}$ is found to give the best agreement between the measured data and the reconstructed data. For Fig. 17, Fig. 20 and Fig. 21 (dashed line), $\sigma_N^2 = 2.39 \times 10^{-8}$ is found from Fig. 14. Notice that the higher $\sigma_N$ in Fig. 20 and Fig. 21 (dashed lines) seems to indicate an overestimation of $\sigma_N$, since the curves constructed by subordination are generally above the data.

The lower value of $\sigma_N^2$ for Fig. 19 and Fig. 21 (solid line) leads to a violation of relation (44). The difference between the measured $\theta$ in Fig. 15 and the one calculated from $\eta \sigma_N^2$ is now approximately 16%. In order to verify the origin of this difference, we remove 8% of the largest log-return $x_t$ data on both tails (we ignore 8% of the largest $x_t$ on the positive and on the negative tail for all time lags $t$ used), a total of 16% of the data. We now find $\theta \approx 8.01 \times 10^{-7}$. This new $\theta$ does not violate relation (44) with $\sigma_N^2 = 2 \times 10^{-8}$ and reconfirms that subordination with $V_t = \sigma_N^2 N_t$ is unable to explain large changes in the log-returns $x_t$ (those outside the central $\approx 85\%$ of the data). This reconfirmation arises because we had to ignore 16% of the data in the tails to reduce $\theta$. Dropping 16% of the data in the tails is equivalent to looking only at the central $\approx 85\%$ of the data and saying that subordination is valid only for it.
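The percentages quoted here follow from relation (44) and the slopes in Figs. 15 and 16. As a worked check, using $\eta = 40.0271$ per minute and the measured $\theta = 9.53 \times 10^{-7}$ per minute:

$$\eta\,\sigma_N^2 = 40.0271 \times 2 \times 10^{-8} \approx 8.01 \times 10^{-7}, \qquad \frac{9.53 \times 10^{-7} - 8.01 \times 10^{-7}}{9.53 \times 10^{-7}} \approx 0.16 .$$

The first number coincides with the $\theta$ obtained after trimming the tails, which is why the trimmed data satisfy relation (44) with the lower $\sigma_N^2$.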

C. Models for the subordinator

Having verified that a Brownian motion subordinated to the number of trades $N_t$ via $V_t$ can describe approximately 85% of the return data for time lags larger than 1 hour (or, if one ignores discreteness effects such as the zero-return effect, larger than 30 minutes), we can model $V_t$ instead of modelling $x_t$.

In this section, we verify the quality of modelling $V_t$ with a CIR process as given in section (II B). We present the quality of the CIR fit for Intel in the year 1997. We also show that the quality of the Heston fit to $x_t$ with parameters from the CIR fit of $V_t$ is consistent with the quality of the subordination: we are able to model most of the central 85% of the $x_t$ distribution.

Following previous studies with intraday log-returns [2] (see also chapter IV), we assume $\alpha = 1$ for the simplified CIR model in equation (38). The parameter $\theta$ is found from the relation $\theta = \eta \sigma_N^2$ (44). The remaining parameter $\gamma$ is found by fitting the empirical $\mathrm{PDF}(V_t)$ for time lags $t = 1{:}05$ hours and $t = 2{:}10$ hours simultaneously. The moderate quality of this fit is shown in Figs. 22 and 23. The theoretical CIR lines are above the data (Fig. 23). Furthermore, the time dependence of the theoretical PDF and CDF follows the data only approximately. For times below 1 hour, the probability maximum of the empirical distribution is to the left of that of the theoretical distribution, and for times above 1 hour it is to the right.
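To make the role of the parameters concrete, the integrated variance of a CIR process can be simulated. Equation (38) is not reproduced in this section, so the sketch below assumes the standard CIR form $dv = -\gamma (v - \theta)\,dt + \kappa \sqrt{v}\,dW$ with $\alpha = 2\gamma\theta/\kappa^2$, so that $\alpha = 1$ fixes $\kappa^2 = 2\gamma\theta$. The Euler scheme with truncation at zero, the function name `simulate_integrated_variance`, the step size, and the number of paths are illustrative choices; $\gamma$ and $\theta$ are the values quoted in the legend of Fig. 22.

```python
import numpy as np

def simulate_integrated_variance(t_min, gamma=0.0577, theta=8.01e-7,
                                 alpha=1.0, dt=0.5, n_paths=20_000, seed=4):
    """Monte Carlo samples of V_t, the integral of a CIR variance v over [0, t_min].

    Assumed dynamics: dv = -gamma (v - theta) dt + kappa sqrt(v) dW,
    with kappa^2 = 2 gamma theta / alpha. Euler steps with truncation at zero.
    """
    rng = np.random.default_rng(seed)
    kappa = np.sqrt(2.0 * gamma * theta / alpha)
    # Start from the stationary gamma distribution (shape alpha, mean theta)
    v = rng.gamma(shape=alpha, scale=theta / alpha, size=n_paths)
    V = np.zeros(n_paths)
    for _ in range(int(round(t_min / dt))):
        v_pos = np.maximum(v, 0.0)
        V += v_pos * dt                                   # accumulate integrated variance
        dW = rng.normal(scale=np.sqrt(dt), size=n_paths)
        v = v - gamma * (v_pos - theta) * dt + kappa * np.sqrt(v_pos) * dW
    return V

V_65 = simulate_integrated_variance(65.0)
print("mean V_t / (theta t):", V_65.mean() / (8.01e-7 * 65.0))
```

A histogram of `V_65` could then be compared with the empirical $\mathrm{PDF}(V_t)$ at $t = 1{:}05$ hours, which is the kind of comparison made in Fig. 22.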

The results shown in Figs. 22 and 23 indicate that the CIR process is only approximately valid. The quality can be further assessed by constructing the variance of $V_t$ as a function of the time lag $t$. Fig. 24 shows that the theoretical variance given in equation (41) is only approximately correct. Nevertheless, from equation (33), we know that the variance of $V_t$ corresponds to the kurtosis of $x_t$. This indicates that even though $V_t$ cannot be modelled well (not even its second moment), this shortcoming matters only for the fourth and higher moments of the log-returns $x_t$.

To verify the quality of the parameters found by fitting the subordinator $V_t$ in explaining the log-returns $x_t$, we present Figs. 25 and 26. The empirical PDF (Fig. 25) and CDF (Fig. 26) for $x_t$ show that the corresponding Heston model (dashed black lines), constructed with parameters found by fitting the CIR process to the probability density of $V_t$, is able to fit only the center of the empirical distributions of $x_t$ ($\approx 80\%$ to $85\%$) at $t = 65$ and 130 minutes (Fig. 26).

[Figure 22: INTC, 1997. Probability density function PDF($V_t$) on a logarithmic scale vs. integrated variance $V_t$ (lower axis, in units of $10^{-4}$) and number of trades $N_t$ (upper axis, in units of $10^{3}$), for time lags of 15 min, 30 min, 1:05 h, 2:10 h, and 1/2 day. Parameters: $\gamma = 0.0577$ (1/min), $\theta = 8.01 \times 10^{-7}$ (1/min).]

FIG. 22: Empirical probability density function for the number of trades (ticks) $N_t$ or integrated variance $V_t = \sigma_N^2 N_t$, compared to the least-squares fit with the CIR formula (38). Curves are offset by factors of 10.

To recheck the consistency of the subordination approach, we fit the empirical PDF of $x_t$ directly with the Heston model (20). We proceed in a similar fashion to the fitting procedure in chapter IV. We assume $\alpha = 1$ and take $\theta = 8.01 \times 10^{-7}$. The parameter $\theta$ was found from the relation $\theta = \sigma_N^2 \eta$ (44), where $\eta$ is found from Fig. 16 and $\sigma_N^2$ is chosen such that the subordination in Figs. 19 and 21 is the best possible. Finally, we fit the empirical PDFs (Fig. 25) for the parameter $\gamma$. Therefore, we are effectively fitting only $\gamma$, since all the other parameters are the same as those used in the $V_t$ fit (Fig. 22). We find that the $\gamma$ found by fitting the empirical PDF of $x_t$ directly is of the same order of magnitude as the one found by fitting the empirical PDF of $V_t$ (0.05 from $x_t$ and 0.06 from $V_t$). This shows that the subordination indeed captures most of the information for the center of the distribution, since fitting $V_t$ or $x_t$ for $\gamma$ is equivalent.

Notice that the agreement with the data of the theoretical Heston model curves, constructed with parameters from the $V_t$ fit, is practically identical to the agreement found in Fig. 21 (solid lines) between the CDF of $x_t$ and the CDF constructed by subordination using the non-parametric binned probability density of $V_t$ as the variance of a Gaussian random walk (36). The information content in the number of trades, and therefore in the integrated variance distribution, is almost all captured by the CIR process, even with a moderate fit quality (Fig. 23). This last point implies that even if we had a better fit to the distribution of $V_t$, the increase in the fitting quality of the log-returns would not be substantial.

FIG. 23: Cumulative distribution function (CDF) for the number of trades $N_t$ and integrated variance $V_t$ compared to the CIR fit (solid lines). $\mathrm{CDF}(V_t)$ goes from 0 to 0.5, and $1 - \mathrm{CDF}(V_t)$ goes from 0.5 to 0. For each time-lag curve, the lower tail of the CDF ($0 \to 0.5$) is to the left and the upper tail ($0.5 \to 0$) is to the right of 0.5.

[Figure 24: INTC, 19970101 to 19971231. $\langle V_t^2 - \langle V_t \rangle^2 \rangle$ vs. time $t$ in minutes, on linear and log-log scales. Legend: Data, Heston, $t^{1.774}$.]

FIG. 24: Variance of the integrated variance, $\langle V_t^2 - \langle V_t \rangle^2 \rangle$, for different time lags $t$ for the data (circles), compared to the theoretical CIR variance given in equation (41) (solid black line). For comparison, the best power-law fit $\langle V_t^2 - \langle V_t \rangle^2 \rangle \propto t^{1.77}$ is shown (solid red line).

[Figure 25: INTC, 1997. Probability density function PDF($x_t$) on a logarithmic scale vs. log-return $x_t$, for time lags of 5 min, 30 min, 1:05 h, 2:10 h, and 1/2 day = 3:15 h.]

FIG. 25: Probability distribution function for the log-returns $x_t$ compared to the Heston model (dashed and solid lines). The two lines represent different sets of parameters. The solid line has the parameter $\theta$ from Fig. 15, and $\gamma$ is found directly by fitting $x_t$. The dashed line has $\theta = \sigma_N^2 \eta$ with $\sigma_N^2$ from Fig. 21 (solid lines) and Fig. 19, and $\eta$ from Fig. 16. The parameter $\gamma$ is then found by fitting the probability density of $V_t$. Curves are offset by factors of 10.

A substantial increase in the fitting quality of the empirical PDF and CDF of the log-returns in Figs. 25 and 26 is attained if one fits the empirical PDF of $x_t$ directly with $\theta = 9.53 \times 10^{-7}$ given in Fig. 15. This amounts to taking $\sigma_N^2$ as given by Fig. 14 and $\eta$ by Fig. 16, such that relation (44) is still valid. The parameter $\gamma = 0.02$ for the black solid lines in Fig. 26 is also considerably different from $\gamma = 0.06$, found by fitting the empirical PDF of $V_t$ and using $\theta = 8.01 \times 10^{-7}$, for which $\sigma_N^2$ is the best-fit value for the subordination in Figs. 21 (solid line) and 19. The substantial increase in the fitting quality for $x_t$ reemphasizes that the number of trades is only able to describe the center of the distribution of log-returns (section V B).

D. Conclusion

We have studied the discrete nature of the probability distribution of absolute returns that arises from the minimal discrete price change for bids and offers allowed by the stock exchange. We have shown that this discreteness implies that the probability distributions of log-returns for intraday time lags are only approximately continuous. The continuous approximation becomes good for returns with time lags longer than 1 hour.

[Figure 26: INTC, 1997. Cumulative probability density, CDF($-x_t$) and $1 - \mathrm{CDF}(x_t)$, on a logarithmic scale vs. log-return $x_t$, for time lags of 5 min, 30 min, 1:05 hours, 2:10 hours, and 6:30 hours, with positive and negative $x$ shown separately.]

FIG. 26: Cumulative probability density of $x_t$ compared to the Heston model. The theoretical lines (dashed and solid) are constructed by integrating the theoretical probability density functions shown in Fig. 25. The two theoretical lines represent different sets of parameters. The solid line has the parameter $\theta$ from Fig. 15, and $\gamma$ is found directly by fitting $x_t$. The dashed line has $\theta = \sigma_N^2 \eta$ with $\sigma_N^2$ from Fig. 21 (solid lines) and Fig. 19, and $\eta$ from Fig. 16. The parameter $\gamma$ is then found by fitting the probability density of $V_t$. Notice that the solid black line clearly gives a better fit to the data.

We have shown that, using the integrated volatility $V_t = \sigma_N^2 N_t$ derived from the number of trades $N_t$ as the subordinator of a driftless Brownian motion (36), we are able to describe the center ($\approx 85\%$) of the distribution of log-returns $x_t$ for time lags longer than 1 hour and shorter than 1 day. The upper limit is restricted by the number of data points we have, since we are working with only one year of data.

We have also shown that the CIR process is only able to approximately describe the distribution function for $V_t$. However, this approximate description is already enough for the corresponding Heston model to fit the log-returns $x_t$ with approximately the maximum quality that the subordination allows ($\approx 80\%$ to $85\%$).

Finally, a direct fit to the log-returns $x_t$ with the Heston model results in a considerable increase in the fitting quality. This reemphasizes that the process of subordination, as implied by the empirical probability density of $V_t$, is only able to explain the center of the distribution of returns.

VI. INCOME DISTRIBUTION

Attempts to apply the methods of exact sciences, such as physics, to describe a society have a long history [96]. At the end of the 19th century, the Italian physicist, engineer, economist, and sociologist Vilfredo Pareto suggested that income distribution in a society is described by a power law [97]. Modern data indeed confirm that the upper tail of the income distribution follows the Pareto law [98, 99, 100, 101, 102]. However, the majority of the population does not belong there, so the characterization and understanding of their income distribution remains an open problem. Dragulescu and Yakovenko [103] proposed that the equilibrium distribution should follow an exponential law analogous to the Boltzmann-Gibbs distribution of energy in statistical physics. The first factual evidence for the exponential distribution of income was found in Ref. [104]. Coexistence of the exponential and power-law parts of the distribution was recognized in Ref. [105]. However, these papers, as well as Ref. [106], studied the data only for a particular year. Here we analyze the temporal evolution of the personal income distribution in the USA during 1983–2001. We show that US society has a well-defined two-income-class structure. The majority of the population (97–99%) belongs to the lower income class and has an exponential (“thermal”) distribution of income that is very stable in time. The upper income class (1–3% of the population) has a power-law (“superthermal”) distribution, whose parameters change significantly in time with the rise and fall of the stock market. Using the principle of maximal entropy, we discuss the concept of equilibrium inequality in a society and quantitatively show that it applies to the bulk of the population.

A. Data analysis and discussion

Most of the academic and government literature on income distribution and inequality [107, 108, 109, 110] does not attempt to fit the data by a simple formula. When fits are performed, the log-normal distribution [111] is usually used for the lower part of the distribution [100, 101, 102]. Only recently has the exponential distribution started to be recognized in income studies [112, 113], and models showing the formation of two classes have started to appear [114, 115].

Let us introduce the probability density $P(r)$, which gives the probability $P(r)\,dr$ to have income in the interval $(r, r + dr)$. The cumulative probability $C(r) = \int_r^\infty dr'\,P(r')$ is the probability to have income above $r$, with $C(0) = 1$. By analogy with the Boltzmann-Gibbs distribution in statistical physics [103, 104], we consider an exponential function $P(r) \propto \exp(-r/T)$, where $T$ is a parameter analogous to temperature. It is equal to the average income, $T = \langle r \rangle = \int_0^\infty dr'\,r' P(r')$, and we call it the “income temperature.” When $P(r)$ is exponential, $C(r) \propto \exp(-r/T)$ is also exponential. Similarly, for the Pareto power law $P(r) \propto 1/r^{\alpha+1}$, $C(r) \propto 1/r^{\alpha}$ is also a power law.
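For reference, the normalization implied by these definitions can be written out explicitly. With $C(0) = 1$, the exponential law becomes

$$P(r) = \frac{1}{T}\,e^{-r/T}, \qquad C(r) = \int_r^\infty dr'\,P(r') = e^{-r/T}, \qquad \langle r \rangle = \int_0^\infty dr'\,\frac{r'}{T}\,e^{-r'/T} = T .$$

For a pure Pareto law above some lower cutoff $r_0$ (a symbol introduced here only for illustration), the same definitions give

$$P(r) = \frac{\alpha\,r_0^{\alpha}}{r^{\alpha+1}}, \qquad C(r) = \left(\frac{r_0}{r}\right)^{\alpha}, \qquad r \ge r_0 .$$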

[Figure 27: cumulative percent of returns (1983–1989 and 1990–2001) and probability distribution of returns (1983–1989 and 1990–2001), on logarithmic scales, vs. rescaled adjusted gross income $r/T$ (lower axis, 0 to 6) and adjusted gross income in 2001 dollars (upper axis, 0 to 200.85 k\$). Income temperatures $T$ by year: 1983, 19.35 k\$; 1984, 20.27 k\$; 1985, 21.15 k\$; 1986, 22.28 k\$; 1987, 24.13 k\$; 1988, 25.35 k\$; 1989, 26.38 k\$; 1990, 27.06 k\$; 1991, 27.70 k\$; 1992, 28.63 k\$; 1993, 29.31 k\$; 1994, 30.23 k\$; 1995, 31.71 k\$; 1996, 32.99 k\$; 1997, 34.63 k\$; 1998, 36.33 k\$; 1999, 38.00 k\$; 2000, 39.76 k\$; 2001, 40.17 k\$.]

FIG. 27: Cumulative probability $C(r)$ and probability density $P(r)$ plotted in the log-linear scale vs. $r/T$, the annual personal income $r$ normalized by the average income $T$ in the exponential part of the distribution. The IRS data points are for 1983–2001, and the columns of numbers give the values of $T$ for the corresponding years.

[Figure 28: cumulative percent of returns on a log-log scale vs. rescaled adjusted gross income $r/T$ (lower axis, 0.1 to 100) and adjusted gross income in 2001 dollars (upper axis, 4.017 to 4017 k\$), with the Boltzmann-Gibbs and Pareto curves marked and the 1980s and 1990s data shown separately. The legend lists the same values of $T$ by year as in Fig. 27.]

FIG. 28: Log-log plots of the cumulative probability $C(r)$ vs. $r/T$ for a wider range of income $r$.

We analyze the data [116] on personal income distribution compiled by the Internal Revenue Service (IRS) from the tax returns in the USA for the period 1983–2001 (presently the latest available year). The publicly available data are already preprocessed by the IRS into bins and effectively give the cumulative distribution function $C(r)$ for certain values of $r$. First, we plot $\log C(r)$ vs. $r$ (the log-linear plots) for each year. We find that the plots are straight lines for the lower 97–98% of the population, thus confirming the exponential law. From the slopes of these straight lines, we determine the income temperatures $T$ for each year. In Fig. 27, we plot $C(r)$ and $P(r)$ vs. $r/T$ (income normalized to temperature) in the log-linear scale. In these coordinates, the data sets for different years collapse onto a single straight line. (In Fig. 27, the data lines for the 1980s and 1990s are shown separately and offset vertically.) The columns of numbers in Fig. 27 list the values of the annual income temperature $T$ for the corresponding years, which changes from 19 k\$ in 1983 to 40 k\$ in 2001. The upper horizontal axis in Fig. 27 shows income $r$ in k\$ for 2001.
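The step of reading $T$ off the slope of $\log C(r)$ vs. $r$ can be sketched in a few lines. This is an illustration on synthetic, noise-free points with an assumed temperature of 40 k$, not a reproduction of the fit to the IRS bins; the function name `income_temperature` and the bin values are hypothetical.

```python
import numpy as np

def income_temperature(r, C):
    """Estimate T from a straight-line fit of ln C(r) vs. r, where C(r) ~ exp(-r/T)."""
    r = np.asarray(r, dtype=float)
    log_C = np.log(np.asarray(C, dtype=float))
    slope, intercept = np.polyfit(r, log_C, 1)
    return -1.0 / slope

# Synthetic example with an assumed T of 40 k$ (not IRS data)
r_bins = np.array([5.0, 10.0, 20.0, 30.0, 50.0, 75.0, 100.0])   # income in k$
C_bins = np.exp(-r_bins / 40.0)                                  # fraction with income above r
T_fit = income_temperature(r_bins, C_bins)
print(f"fitted income temperature: {T_fit:.2f} k$")
```

With real binned data, the fit would be restricted to the exponential part of the distribution, since the Pareto tail bends the log-linear plot away from a straight line.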

In Fig. 28, we show the same data in the log-log scale for a wider range of income $r$, up to about $300T$. Again we observe that the sets of points for different years collapse onto a single exponential curve for the lower part of the distribution when plotted vs. $r/T$. However, above a certain income $r^* \approx 4T$, the distribution function changes to a power law, as illustrated by the straight lines in the log-log scale of Fig. 28. Thus we observe that the income distribution in the USA has a well-defined two-class structure. The lower class (the great majority of the population) is characterized by the exponential, Boltzmann-Gibbs distribution, whereas the upper class (the top few percent of the population) has the power-law, Pareto distribution. The intersection point of the exponential and power-law curves determines the income $r^*$ separating the two classes. The collapse of data points for different years in the lower, exponential part of the distribution in Figs. 27 and 28 shows that this part is very stable in time and, essentially, has not changed at all over the last 20 years, save for a gradual increase of the temperature $T$ in nominal dollars. We conclude that the majority of the population is in statistical equilibrium, analogous to thermal equilibrium in physics. On the other hand, the points in the upper, power-law part of the distribution in Fig. 28 do not collapse onto a single line. This part changes significantly from year to year, so it is out of statistical equilibrium. A similar two-part structure in the energy distribution is often observed in physics, where the lower part of the distribution is called “thermal” and the upper part “superthermal” [117].

Temporal evolution of the parameters $T$ and $r^*$ is shown in Fig. 29. We observe that the average income $T$ (in nominal dollars) increased gradually, almost linearly in time, and doubled over the last twenty years. In Fig. 29, we also show the inflation coefficient (the consumer price index CPI from Ref. [118]) compounded on the average income of 1983. Over the twenty years, the inflation factor is about 1.7; thus most, if not all, of the nominal increase in $T$ is inflation. Also shown in Fig. 29 is the nominal gross domestic product (GDP) per capita [118], which increases in time similarly to $T$ and CPI. The ratio $r^*/T$ varies between 4.8 and 3.2 in Fig. 29.
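The size of the nominal increase can be read off the end-point temperatures listed in Fig. 27. The short calculation below is only an arithmetic illustration; the inflation factor of 1.7 is the approximate value quoted in the text, and the variable names are illustrative.

```python
# Income temperatures T (k$) for the end points, as listed in Fig. 27
T_1983 = 19.35
T_2001 = 40.17
inflation_factor = 1.7   # approximate CPI factor over the period, as quoted in the text

nominal_growth = T_2001 / T_1983
real_growth = nominal_growth / inflation_factor
print(f"nominal growth of T: {nominal_growth:.2f}x")
print(f"growth after dividing by inflation: {real_growth:.2f}x")
```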

In Fig. 30, we show how the parameters of the Pareto tail $C(r) \propto 1/r^{\alpha}$ change in time. Curve (a) shows that the power-law index $\alpha$ varies between 1.8 and 1.4, so the power law is not universal. Because a power law decays with $r$ more slowly than an exponential function, the upper tail contains more income than we would expect for a thermal distribution; hence we call the tail “superthermal” [117]. The total excessive income in the upper tail can be determined in two ways: as the integral $\int_{r^*}^\infty dr'\,r' P(r')$ of the power-law distribution, or as the difference between the total income in the system and the income in the exponential part. Curves (c) and (b) in Fig. 30 show the excessive income in the upper tail, as a fraction $f$ of the total income in the system, determined by these two methods, which agree with each other reasonably well. We observe that $f$ increased by a factor of 5 between 1983 and 2000, from 4% to 20%, but decreased in 2001 after the crash of the US stock market. For comparison, curve (e) in Fig. 30 shows the stock market index S&P 500 divided by inflation. It also increased, by a factor of 5.5, between 1983 and 1999, and then dropped after the stock market crash. We conclude that the swelling and shrinking of the upper income tail is correlated with the rise and fall of the stock market. Similar results were found for the upper income tail in Japan in Ref. [99]. Curve (d) in Fig. 30 shows the fraction of the population in the upper tail. It increased from 1% in 1983 to 3% in 1999, but then decreased after the stock market crash. Notice, however, that the stock market dynamics had a much weaker effect on the average income $T$ of the lower, “thermal” part of the income distribution shown in Fig. 29.
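The first of the two estimates can be made explicit under the assumption, made here only for illustration, that the Pareto law holds exactly for all $r > r^*$ with $\alpha > 1$. Writing $C(r) = C(r^*)\,(r^*/r)^{\alpha}$, so that $P(r) = -dC/dr = \alpha\,C(r^*)\,(r^*)^{\alpha}/r^{\alpha+1}$, the integral evaluates to

$$\int_{r^*}^{\infty} dr'\,r' P(r') = \alpha\,C(r^*)\,(r^*)^{\alpha} \int_{r^*}^{\infty} \frac{dr'}{r'^{\alpha}} = \frac{\alpha}{\alpha - 1}\,C(r^*)\,r^* .$$

The integral converges only for $\alpha > 1$, which is satisfied by the measured range of 1.4 to 1.8. As $\alpha$ decreases toward 1, the factor $\alpha/(\alpha - 1)$ grows rapidly, so a modest change in the Pareto index corresponds to a large change in the income held in the tail.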

For discussion of income inequality, the standard practice is to construct the so-called Lorenz curve [107]. It is defined parametrically in terms of the two coordinates x(r) and y(r) depending on the parameter r, which changes from 0 to ∞. The horizontal coordinate $x(r) = \int_0^r dr'\, P(r')$ is the fraction of population with income below r. The vertical coordinate $y(r) = \int_0^r dr'\, r' P(r') \big/ \int_0^\infty dr'\, r' P(r')$ is the total income of this population, as a fraction of the total income in the system. Fig. 31 shows the data points for the Lorenz curves in 1983 and 2000, as computed by the IRS [110]. For a purely exponential distribution of income P(r) ∝ exp(−r/T), the formula y = x + (1 − x) ln(1 − x) for the Lorenz curve was derived in Ref. [104]. This formula describes income distribution reasonably well in the first approximation [104], but

FIG. 29: Temporal evolution of various parameters characterizing income distribution. [Plot not reproduced. Horizontal axis: Year; vertical axes: k$ and r∗/T. Curves: (a) income separating exponential and power-law (r∗); (b) r∗/T; (c) average income (T); (d) inflation (CPI); (e) GDP per capita.]

FIG. 30: (a) The Pareto index α of the power-law tail C(r) ∝ 1/r^α. (b) The excessive income in the Pareto tail, as a fraction f of the total income in the system, obtained as the difference between the total income and the income in the exponential part of the distribution. (c) The tail income fraction f, obtained by integrating the Pareto power law of the tail. (d) The fraction of population belonging to the Pareto tail. (e) The stock-market index S&P 500 divided by the inflation coefficient and normalized to 1 in 1983. [Plot not reproduced. Horizontal axis: Year; axis labels: Per cent of total, Pareto index, S&P 500 / Inflation.]

visible deviations exist. These deviations can be corrected by taking into account that the total income in the system is higher than the income in the exponential part, because of the extra income in the Pareto tail. Correcting for this difference in the normalization of y, we find a modified expression [106] for the Lorenz curve

$$y = (1 - f)\,[x + (1 - x)\ln(1 - x)] + f\,\Theta(x - 1), \qquad (46)$$

where f is the fraction of the total income contained in the Pareto tail, and Θ(x − 1) is the step function equal to 0 for x < 1 and 1 for x ≥ 1. The Lorenz curve (46) experiences a vertical jump of the height f at x = 1, which reflects the fact that, although the fraction of population in the Pareto tail is very small, their fraction f of the total income is significant. It does not matter for Eq. (46) whether the extra income in the upper tail is described by a power law or another slowly decreasing function P(r). The Lorenz curves, calculated using Eq. (46) with the values of f from Fig. 30, fit the IRS data

FIG. 31: Main panel: Lorenz plots for income distribution in 1983 and 2000. The data points are from the IRS [110], and the theoretical curves represent Eq. (46) with f from Fig. 30. Inset: The closed circles are the IRS data [110] for the Gini coefficient G, and the open circles show the theoretical formula G = (1 + f)/2. [Plot not reproduced. Main panel: cumulative percent of tax returns (horizontal) versus cumulative percent of income (vertical); panel title: US, IRS data for 1983 and 2000; annotations: 1983, 2000, 4%, 19%. Inset horizontal axis: Year.]

points very well in Fig. 31.

The deviation of the Lorenz curve from the diagonal in Fig. 31 is a certain measure of income inequality. Indeed, if everybody had the same income, the Lorenz curve would be the diagonal, because the fraction of income would be proportional to the fraction of population. The standard measure of income inequality is the so-called Gini coefficient 0 ≤ G ≤ 1, which is defined as the area between the Lorenz curve and the diagonal, divided by the area of the triangle beneath the diagonal [107]. It was calculated in Ref. [104] that G = 1/2 for a purely exponential distribution. Temporal evolution of the Gini coefficient, as determined by the IRS [110], is shown in the inset of Fig. 31. In the first approximation, G is quite close to the theoretically calculated value 1/2. The agreement can be improved by taking into account the Pareto tail, which gives G = (1 + f)/2 for Eq. (46). The inset in Fig. 31 shows that this formula fits the IRS data for the 1990s very well with the values of f taken from Fig. 30. We observe that income inequality was increasing for the last 20 years, because of swelling of the Pareto tail, but started to decrease in 2001 after the stock market crash. The deviation of G below 1/2 in the 1980s cannot be captured by our formula. The data points for the Lorenz curve in 1983 lie slightly above the theoretical curve in Fig. 31, which accounts for G < 1/2.
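The relation between the Lorenz curve (46) and the Gini values G = 1/2 and G = (1 + f)/2 can be checked numerically. The following is a minimal sketch, not the procedure used in the cited references: the function names are illustrative, the area under the curve is approximated with a midpoint rule, and the value f = 0.19 is taken only as an example.

```python
import math


def lorenz_exponential(x):
    """Lorenz curve y = x + (1 - x) ln(1 - x) for a purely exponential P(r)."""
    if x >= 1.0:
        return 1.0
    return x + (1.0 - x) * math.log(1.0 - x)


def lorenz_two_class(x, f):
    """Eq. (46): exponential bulk plus a tail holding the fraction f of income.

    The step function contributes the jump of height f at x = 1.
    """
    if x >= 1.0:
        return 1.0
    return (1.0 - f) * lorenz_exponential(x)


def gini(lorenz, n=100_000):
    """Gini coefficient G = 1 - 2 * (area under the Lorenz curve).

    The area is approximated by a midpoint rule on [0, 1); the midpoints
    never touch x = 1, so the jump has zero width and adds no area.
    """
    h = 1.0 / n
    area = sum(lorenz((i + 0.5) * h) for i in range(n)) * h
    return 1.0 - 2.0 * area


if __name__ == "__main__":
    f = 0.19  # example tail income fraction (illustrative value)
    print("exponential:", gini(lorenz_exponential))
    print("two-class:  ", gini(lambda x: lorenz_two_class(x, f)), "vs", (1 + f) / 2)
```

Because the area under x + (1 − x) ln(1 − x) on [0, 1] equals 1/4, the prefactor (1 − f) in Eq. (46) reduces the area to (1 − f)/4, which is where G = (1 + f)/2 comes from.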

Thus far we discussed the distribution of individual income. An interesting related question is the distribution of family income P_2(r). If both spouses are earners, and their incomes are distributed exponentially as P_1(r) ∝ exp(−r/T) [127], then

$$P_2(r) = \int_0^r dr'\, P_1(r')\, P_1(r - r') \propto r \exp(-r/T). \qquad (47)$$

Eq. (47) is in good agreement with the family income distribution data from the US Census Bureau [104]. In Eq. (47), we assumed that incomes of spouses are uncorrelated. This assumption was verified by comparison with the data in Ref. [106]. The Gini coefficient for family income distribution (47) was found to be G = 3/8 = 37.5% [104], in agreement with the data. Moreover, the calculated value 37.5% is close to the average G for the developed capitalist countries of North America and Western Europe, as determined by the World Bank [106].
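The value G = 3/8 for the family income distribution (47) can be reproduced with a short numerical check. This is a sketch under stated assumptions rather than the calculation of Ref. [104]: it sets T = 1 (the Gini coefficient does not depend on T), uses the closed-form cumulative integrals of r exp(−r), and integrates the parametric Lorenz curve with a trapezoid rule. The function names and the cutoff r_max are illustrative.

```python
import math


def family_lorenz_point(r):
    """Point (x, y) on the Lorenz curve for P_2(r) = r * exp(-r), i.e. T = 1.

    x(r) = integral of P_2 from 0 to r          = 1 - (1 + r) e^{-r}
    y(r) = integral of r' P_2 from 0 to r, / 2  = 1 - (1 + r + r^2/2) e^{-r}
    (the mean income of P_2 with T = 1 is 2).
    """
    e = math.exp(-r)
    x = 1.0 - (1.0 + r) * e
    y = 1.0 - (1.0 + r + 0.5 * r * r) * e
    return x, y


def family_gini(r_max=60.0, n=200_000):
    """Gini = 1 - 2 * integral of y dx along the parametric curve (trapezoid rule)."""
    h = r_max / n
    area = 0.0
    x_prev, y_prev = 0.0, 0.0
    for i in range(1, n + 1):
        x, y = family_lorenz_point(i * h)
        area += 0.5 * (y + y_prev) * (x - x_prev)
        x_prev, y_prev = x, y
    # Close the curve at (1, 1); the remainder beyond r_max is negligible.
    area += 0.5 * (1.0 + y_prev) * (1.0 - x_prev)
    return 1.0 - 2.0 * area


if __name__ == "__main__":
    print("family Gini:", family_gini())
```

The same routine applied to P_1(r) = exp(−r) would return 1/2, so the reduction to 3/8 reflects the averaging of two uncorrelated incomes.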

On the basis of the analysis presented above, we propose a concept of the equilibrium inequality in a society, characterized by G = 1/2 for individual income and G = 3/8 for family income. It is a consequence of the exponential Boltzmann-Gibbs distribution in thermal equilibrium, which maximizes the entropy $S = -\int dr\, P(r) \ln P(r)$ of a distribution P(r) under the constraint of the conservation law $\langle r \rangle = \int_0^\infty dr\, P(r)\, r = \mathrm{const}$. Thus, any deviation of income distribution from the exponential one, to either less inequality or more inequality, reduces entropy and is not favored by the second law of thermodynamics. Such deviations may be possible only due to non-equilibrium effects. The presented data show that the great majority of the US population is in thermal equilibrium.
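The maximum-entropy statement can be illustrated with the standard Lagrange-multiplier argument. The sketch below uses the usual sign convention S = −∫ dr P ln P; the multipliers λ_0 and λ_1 are notation introduced here, and normalization of P is assumed as a second constraint.

```latex
% Maximize S[P] = -\int_0^\infty dr\, P \ln P subject to
% \int_0^\infty dr\, P = 1 and \int_0^\infty dr\, r P = \langle r \rangle.
\frac{\delta}{\delta P(r)}\left[ -\int_0^\infty dr\, P \ln P
  - \lambda_0 \int_0^\infty dr\, P - \lambda_1 \int_0^\infty dr\, r P \right]
  = -\ln P(r) - 1 - \lambda_0 - \lambda_1 r = 0
\;\Longrightarrow\;
P(r) = e^{-1-\lambda_0}\, e^{-\lambda_1 r}.
% Imposing the two constraints fixes the multipliers:
P(r) = \frac{1}{T}\, e^{-r/T}, \qquad T = \frac{1}{\lambda_1} = \langle r \rangle .
```

In this form the temperature T is simply the average income, consistent with the way T is used throughout this section.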

Finally, we briefly discuss how the two-class structure of income distribution can be rationalized on the basis of a kinetic approach, which deals with temporal evolution of the probability distribution P(r, t). Let us consider a diffusion model, where income r changes by ∆r over a period of time ∆t. Then, temporal evolution of P(r, t) is described by the Fokker-Planck equation [119]

$$\frac{\partial P}{\partial t} = \frac{\partial}{\partial r}\left( A P + \frac{\partial}{\partial r}(B P) \right), \qquad A = -\frac{\langle \Delta r \rangle}{\Delta t}, \qquad B = \frac{\langle (\Delta r)^2 \rangle}{2\Delta t}. \qquad (48)$$

For the lower part of the distribution, it is reasonable to assume that ∆r is independent of r. In this case, the coefficients A and B are constants. Then, the stationary solution ∂_t P = 0 of Eq. (48) gives the exponential distribution [103] P(r) ∝ exp(−r/T) with T = B/A. Notice that a meaningful solution requires that A > 0, i.e. ⟨∆r⟩ < 0 in Eq. (48). On the other hand, for the upper tail of income distribution, it is reasonable to expect that ∆r ∝ r (the Gibrat law [111]), so A = ar and B = br^2. Then, the stationary solution ∂_t P = 0 of Eq. (48) gives the power-law distribution P(r) ∝ 1/r^(α+1) with α = 1 + a/b. The former process is additive diffusion, where income changes by certain amounts, whereas the latter process is multiplicative diffusion, where income changes by certain percentages. The lower class income comes from wages and salaries, so the additive process is appropriate, whereas the upper class income comes from investments, capital gains, etc., where the multiplicative process is applicable. Ref. [99] quantitatively studied income kinetics using tax data for the upper class in Japan and found that it is indeed governed by a multiplicative process. The data on income mobility in the USA are not readily available publicly, but are accessible to the Statistics of Income Research Division of the IRS. Such data would allow one to verify the conjectures about income kinetics.
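The two stationary solutions quoted above follow from Eq. (48) in a few lines. The sketch below assumes that the probability flux vanishes in the stationary state, a boundary condition that the text does not state explicitly.

```latex
% Stationary state of Eq. (48) with zero probability flux (assumed):
A P + \frac{d}{dr}(B P) = 0 .
% Additive case, A and B constant:
\frac{dP}{dr} = -\frac{A}{B}\, P
\;\Longrightarrow\; P(r) \propto e^{-r/T}, \qquad T = \frac{B}{A} .
% Multiplicative case, A = a r and B = b r^2:
a r P + 2 b r P + b r^2 \frac{dP}{dr} = 0
\;\Longrightarrow\; \frac{dP}{dr} = -\frac{a + 2b}{b}\,\frac{P}{r}
\;\Longrightarrow\; P(r) \propto r^{-(2 + a/b)} = \frac{1}{r^{\alpha+1}},
\qquad \alpha = 1 + \frac{a}{b} .
```

The additive solution is normalizable only for A > 0, which is the condition ⟨∆r⟩ < 0 noted in the text.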

The exponential probability distribution P(r) ∝ exp(−r/T) is a monotonic function of r with the most probable income r = 0. The probability densities shown in Fig. 27 agree reasonably well with this simple exponential law. However, a number of other studies found a non-monotonic P(r) with a maximum at r ≠ 0 and P(0) = 0. These data were fitted by the log-normal [100, 101, 102] or the gamma distribution [113, 114, 120]. The origin of the discrepancy in the low-income data between our work and other papers is not completely clear at this moment. The following factors may possibly play a role. First, one should be careful to distinguish between personal income and group income, such as family and household income. As Eq. (47) shows, the latter is given by the gamma distribution even when the personal income distribution is exponential. Very often statistical data are given for households and mix individual and group income distributions (see more discussion in Ref. [104]). Second, the data from tax agencies and census bureaus may differ. The former data are obtained from tax declarations of all the taxable population, whereas the latter data come from questionnaire surveys of a limited sample of the population. These two methodologies may produce different results, particularly for low incomes. Third, it is necessary to distinguish between distributions of money [103, 120, 121], wealth [114, 122], and income. They are, presumably, closely related, but may be different in some respects. Fourth, the low-income probability density may be different in the USA and in other countries because of different Social Security or more general policies. All these questions require careful investigation in future work. We can only say that the data sets analyzed in this paper and our previous papers are well described by a simple exponential function for the whole lower class. This does not exclude the possibility that other functions can also fit the data [123]. However, the exponential law has only one fitting parameter T, whereas log-normal, gamma, and other distributions have two or more fitting parameters, so they are less parsimonious.

[1] A. C. Silva and V. M. Yakovenko, Comparison between the probability distribution of returns in the Heston model and empirical data for stock indexes, Physica A 324, 303 (2003).

[2] A. C. Silva, R. E. Prange, and V. M. Yakovenko, Exponential distribution of financial returns at mesoscopic time lags: a new stylized fact, Physica A 344, 227 (2004).

[3] A. C. Silva and V. M. Yakovenko, Temporal evolution of the “thermal” and “superthermal” income classes in the USA during 1983-2001, Europhys. Lett. 69 (2), 304 (2005).

[4] J. Doyne Farmer, Physicists attempt to scale the ivory towers of finance, Computing in Science & Engineering, 26, November/December 1999. Reprinted in Int. J. Theoretical and Applied Finance 3, 311 (2000).

[5] R. N. Mantegna, Levy walks and enhanced diffusion in Milan stock exchange, Physica A 179, 232 (1991).

[6] W. Li, Absence of 1/f spectra in Dow Jones average, Intl. J. Bifurcations and Chaos 1, 583 (1991).

[7] R. N. Mantegna and H. E. Stanley, Scaling behaviour in the dynamics of an economic index, Nature 376, 46 (1995).

[8] Y. Fan, M. Li, J. Chen, L. Gao, Z. Di, and J. Wu, Network of econophysicists: a weighted network to investigate the development of Econophysics, International Journal of Modern Physics B 18, 2505 (2004).

[9] H. E. Stanley, Scaling, universality, and renormalization: Three pillars of modern critical phenomena, Reviews of Modern Physics 77, S358 (1999).

[10] D. Challet and Y.-C. Zhang, Emergence of cooperation and organization in an evolutionary game, Physica A 246, 407 (1997).

[11] A. Lane and M. Douali, A microstructure model of equity markets, Quantitative Report from The Royal Bank of Scotland (2003).

[12] D. Challet, A. Chessa, M. Marsili, and Y.-C. Zhang, From minority games to real markets, Quantitative Finance 1, 168 (2001).

[13] D. Challet and R. Stinchcombe, Non-constant rates and overdiffusive prices in simple models of limit order markets, Quantitative Finance 3, 165 (2003).

[14] D. Challet, M. Marsili, and Y.-C. Zhang, Minority Games: interacting agents in financial markets (Oxford University Press, Oxford, 2005).

[15] J.-P. Bouchaud and M. Potters, Theory of Financial Risks (Cambridge University Press, Cambridge, 2003).

[16] R. Mantegna and H. E. Stanley, An Introduction to Econophysics (Cambridge University Press, Cambridge, 1999).

[17] B. Roehner, Patterns of Speculation: A Study in Observational Econophysics (Cambridge University Press, Cambridge, 2002).

[18] J. Voit, The statistical mechanics of financial markets (Springer Verlag, Frankfurt, 2001).

[19] M. Taqqu, Paper #134, http://math.bu.edu/people/murad/articles.html.

[20] A. G. Laurent, Comments on “Brownian motion in the stock market”, Operations Research 7, 806 (1959).

[21] M. F. M. Osborne, Reply to Comments on “Brownian motion in the stock market”, Operations Research 7, 807 (1959).

[22] M. F. M. Osborne, Brownian motion in the stock market, Operations Research 7, 145 (1959).

[23] B. Mandelbrot, The variation of certain speculative prices, The Journal of Business 36, 394 (1963).

[24] E. F. Fama, Mandelbrot and the stable paretian hypothesis, The Journal of Business 36, 420 (1963).

[25] E. F. Fama, The behavior of stock-market prices, The Journal of Business 38, 34 (1965).

[26] P. K. Clark, A subordinated stochastic process model with finite variance for speculative prices, Econometrica 41, 135 (1973).

[27] W. Feller, An Introduction to Probability Theory and Its Applications (Wiley, New York, 1971), Vol. II.

[28] C. Beck and E. G. D. Cohen, Superstatistics, Physica A 322, 267 (2003).

[29] R. Failla, P. Grigolini, M. Ignaccolo, and A. Schwettmann, Random growth of interfaces as a subordinated process, Phys. Rev. E 70, 010101 (2004).

[30] I. M. Sokolov, Levy flights from a continuous-time process, Phys. Rev. E 63, 011104 (2001).

[31] I. M. Sokolov, J. Klafter, and A. Blumen, Do strange kinetics imply unusual thermodynamics?, Phys. Rev. E 64, 021107 (2001).

[32] I. M. Sokolov, Solutions of a class of non-Markovian Fokker-Planck equations, Phys. Rev. E 66, 041101 (2002).

[33] W. Schoutens, Levy Processes in Finance (Wiley, New York, 2003).

[34] D. B. Madan and E. Seneta, The variance gamma (VG) model for share market returns, The Journal of Business 63, 511 (1990).

[35] O. E. Barndorff-Nielsen, Normal inverse Gaussian distributions and the modeling of stock returns, Research Report no. 300, Department of Theoretical Statistics, Aarhus University.

[36] P. Carr, H. Geman, D. B. Madan, and M. Yor, The fine structure of asset returns: an empirical investigation, Journal of Business 75, 305 (2002).

[37] P. Carr, H. Geman, D. B. Madan, and M. Yor, Stochastic volatility for Levy processes, Mathematical Finance 13, 345 (2003).

[38] M. Richardson and T. Smith, A direct test of the mixture of distributions hypothesis: Measuring the daily flow of information, Journal of Financial and Quantitative Analysis 29, 101 (1994).

[39] S. Manganelli, Duration, volume and volatility impact of trades, working paper 125, European Central Bank (2000).

[40] T. Ane and H. Geman, Order flow, transaction clock, and normality of asset returns, The Journal of Finance 55, 2259 (2000).

[41] V. Plerou, P. Gopikrishnan, L. A. Nunes Amaral, X. Gabaix, and H. E. Stanley, Economic fluctuations and anomalous diffusion, Physical Review E 62, R3023 (2000).

[42] J. Doyne Farmer, Laszlo Gillemot, Fabrizio Lillo, Szabolcs Mike, and Anindya Sen, What really causes large price changes?, Quantitative Finance 4, 383 (2004).

[43] H. Johnson and D. Shanno, Option pricing when variance is changing, The Journal of Financial and Quantitative Analysis 22, 143 (1987).

[44] R. F. Engle, Risk and volatility: econometric models and financial practice, Nobel Lecture, December 8, 2003, at http://nobelprize.org/economics/laureates/2003/engle-lecture.pdf.

[45] J. P. Fouque, G. Papanicolaou, and K. R. Sircar, Derivatives in Financial Markets with Stochastic Volatility (Cambridge University Press, Cambridge, 2000).

[46] J. Hull and A. White, The pricing of options on assets with stochastic volatilities, The Journal of Finance 42, 281 (1987).

[47] S. L. Heston and S. Nandi, A closed-form GARCH option valuation model, The Review of Financial Studies 13, 345 (2003).

[48] S. L. Heston, A closed-form solution for options with stochastic volatility with applications to bond and currency options, Review of Financial Studies 6, 327 (1993).

[49] A. Dragulescu and V. M. Yakovenko, Probability distribution of returns in the Heston model with stochastic volatility, Quantitative Finance 2, 443 (2002).

[50] Yahoo Finance, http://finance.yahoo.com/.

[51] http://www.nysedata.com/home.asp.

[52] M. F. M. Osborne, Periodic structure in the Brownian motion of stock prices, Operations Research 10, 345 (1962).

[53] R. A. Wood, Thomas H. McInish, and J. K. Ord, An investigation of transactions data for NYSE stocks, The Journal of Finance 40, 723 (1985).

[54] Y. Liu, P. Gopikrishnan, P. Cizeau, M. Meyer, C.-K. Peng, and H. Eugene Stanley, Statistical properties of the volatility of price fluctuations, Physical Review E 60, 1390 (1999).

[55] R. F. Engle, The econometrics of ultra-high frequency data, Econometrica 68, 1 (2000).

[56] T. Bollerslev and H. O. Mikkelsen, Modeling and pricing long memory in stock market volatility, Journal of Econometrics 73, 151 (1996).

[57] T. G. Andersen, T. Bollerslev, F. X. Diebold, and H. Ebens, The distribution of realized stock return volatility, Journal of Financial Economics 61, 44 (2001).

[58] Z. Ding and C. W. J. Granger, Modeling volatility persistence of speculative returns: a new approach, Journal of Econometrics 73, 85 (1996).

[59] C. W. J. Granger and Z. Ding, Varieties of long memory models, Journal of Econometrics 73, 61 (1996).

[60] V. Plerou, P. Gopikrishnan, L. N. Amaral, M. Meyer, and H. E. Stanley, Scaling of the distribution of price fluctuations of individual companies, Phys. Rev. E 60, 6519 (1999).

[61] P. Gopikrishnan, V. Plerou, L. A. N. Amaral, M. Meyer, and H. E. Stanley, Scaling of the distribution of fluctuations of financial market indices, Phys. Rev. E 60, 5305 (1999).

[62] C. W. Gardiner, Handbook of Stochastic Methods for Physics, Chemistry, and the Natural Sciences (Springer, Berlin, 1993).

[63] J. Pan, The jump-risk premia implicit in options: evidence from an integrated time-series study, Journal of Financial Economics 63, 3 (2002).

[64] R. Vicente, C. M. de Toledo, V. B. P. Leite, and N. Caticha, Common underlying dynamics in an emerging market: from minutes to months, preprint http://lanl.arXiv.org/abs/cond-mat/0402185.

[65] G. Bakshi, C. Cao, and Z. Chen, Empirical performance of alternative option pricing models, The Journal of Finance 52, 2003 (1997).

[66] D. Duffie, J. Pan, and K. Singleton, Transform analysis and asset pricing for affine jump-diffusions, Econometrica 68, 1343 (2000).

[67] J. Hull, Options, Futures, and Other Derivatives (Prentice Hall, New York, 2004).

[68] F. Black and M. Scholes, The pricing of options and corporate liabilities, Journal of Political Economy 81, 637 (1973).

[69] A. E. Cohen, Control of nanoparticles with arbitrary two-dimensional force fields, Physical Review Letters 94, 118102 (2005).

[70] R. C. Merton, Theory of rational option pricing, Bell Journal of Economics and Management Science 4, 141 (1973).

[71] J. Cox, J. Ingersoll, and S. Ross, A theory of the term structure of interest rates, Econometrica 53, 385 (1985).

[72] P. Chalasani and S. Jha, Steven Shreve: Stochastic Calculus and Finance (http://www.stat.berkeley.edu/users/evans/shreve.pdf).

[73] O. E. Barndorff-Nielsen, Exponentially decreasing distribution for the logarithm of particle size, Proc. Roy. Soc. Lond. A 353, 401 (1977).

[74] O. E. Barndorff-Nielsen, Models for non-Gaussian variation; with application to turbulence, Proc. Roy. Soc. Lond. A 368, 501 (1979).

[75] E. Eberlein and E. A. von Hammerstein, Generalized hyperbolic and inverse Gaussian distributions: limiting cases and approximation of processes, Seminar on Stochastic Analysis, Random Fields and Applications IV, Progress in Probability 58, R. C. Dalang, M. Dozzi, F. Russo (Eds.), Birkhäuser Verlag, 221-264 (2004).

[76] W. Feller, Two singular diffusion problems, Annals of Mathematics 54, 173 (1951).

[77] K. Matia, M. Pal, H. Salunkay, and H. E. Stanley, Scale-dependent price fluctuations for the Indian stock market, Europhys. Letters 66, 909 (2004).

[78] T. Kaizoji and M. Kaizoji, Exponential laws of a stock price index and a stochastic model, Advances in Complex Systems 6, 303 (2003).

[79] R. Remer and R. Mahnke, Application of Heston model and its solution to German DAX data (talk presented at the APFA-4 conference, 2003).

[80] L. C. Miranda and R. Riera, Truncated Levy walks and an emerging market economic index, Physica A 297, 509 (2001).

[81] J. L. McCauley and G. H. Gunaratne, An empirical model of volatility of returns and option pricing, Physica A 329, 178 (2003).

[82] J. Regnault, Calcul des Chances et Philosophie de la Bourse (Mallet-Bachelier et Castel, Paris, 1863).

[83] N. G. Ushakov, Selected Topics in Characteristic Functions (VSP, Utrecht, 1999).

[84] M. O’Hara, Market Microstructure Theory (Blackwell Publishers, Oxford, 1995).

[85] R. Liesenfeld and Winfried Pohlmeier, A dynamic integer count data model for financial transaction prices, preprint University of Konstanz, http://www.ub.uni-konstanz.de/v13/volltexte/2003/1006/pdf/03-03.pdf.

[86] T. Ane and H. Geman, Stochastic volatility and transaction time: an activity-based volatility estimator, The Journal of Risk 2 (1) (1999).

[87] Michel M. Dacorogna, Ramazan Gençay, Ulrich A. Mueller, Richard B. Olsen, and Olivier V. Pictet, An Introduction to High-Frequency Finance (Academic Press, London, 2001).

[88] M. A. Goldstein and K. A. Kavajecz, Eighths, sixteenths and market depth: changes in tick size and liquidity provision on the NYSE, Journal of Financial Economics 56, 125 (2000).

[89] F. B. Van Ness, R. A. Van Ness, and S. W. Pruitt, The impact of the reduction in tick increments in major U.S. markets on spreads, depths and volatility, Review of Quantitative Finance and Accounting 15, 153 (2000).

[90] Marcus G. Daniels, J. Doyne Farmer, Laszlo Gillemot, Giulia Iori, and Eric Smith, Quantitative model of price diffusion and market friction based on trading as a mechanistic random process, Physical Review Letters 90, 108102-1 (2003).

[91] J. Doyne Farmer, Paolo Patelli, and Ilija I. Zovko, The predictive power of zero intelligence in financial markets, PNAS 102, 2252 (2005).

[92] R. F. Engle and J. R. Russell, Autoregressive conditional duration: A new model for irregularly spaced transaction data, Econometrica 66, 1127 (1998).

[93] Nikolaus Hautsch and Winfried Pohlmeier, Econometric analysis of financial transaction data: pitfalls and opportunities, preprint University of Konstanz.

[94] J. Masoliver, M. Montero, and J. Perelló, Return or stock price differences, Physica A 316, 539 (2002).

[95] D. W. Scott, On optimal and data-based histograms, Biometrika 66, 605 (1979).

[96] P. Ball, Critical Mass (Farrar, Straus, and Giroux, New York, 2004).

[97] V. Pareto, Le Cours d’Economie Politique (Macmillan, London, 1897).

[98] D. G. Champernowne, The Distribution of Income between Persons (Cambridge University Press, 1973).

[99] Y. Fujiwara, W. Souma, H. Aoyama, T. Kaizoji, and M. Aoki, Growth and fluctuations of personal income, Physica A 321, 598 (2003); H. Aoyama, W. Souma, and Y. Fujiwara, Growth and fluctuations of personal and company’s income, Physica A 324, 352 (2003).

[100] W. Souma, Physics of personal income, cond-mat/0202388.

[101] F. Clementi and M. Gallegati, Power law tails in the Italian personal income distribution, cond-mat/0408067.

[102] T. Di Matteo, T. Aste, and S. T. Hyde, Exchanges in complex networks: income and wealth distributions, The Physics of Complex Systems (New Advances and Perspectives), Eds. F. Mallamace and H. E. Stanley (IOS Press, Amsterdam, 2004), p. 435, see also cond-mat/0310544.

[103] A. A. Dragulescu and V. M. Yakovenko, Statistical mechanics of money, Eur. Phys. J. B 17, 723 (2000).

[104] A. A. Dragulescu and V. M. Yakovenko, Evidence for the exponential distribution of income in the USA, Eur. Phys. J. B 20, 585 (2001).

[105] A. A. Dragulescu and V. M. Yakovenko, Exponential and power-law probability distributions of wealth and income in the United Kingdom and the United States, Physica A 299, 213 (2001).

[106] A. A. Dragulescu and V. M. Yakovenko, Statistical mechanics of money, income, and wealth: a short survey, Modeling of Complex Systems: Seventh Granada Lectures, Eds. P. L. Garrido and J. Marro (AIP Conference Proceedings 661, New York, 2003), p. 180.

[107] N. Kakwani, Income Inequality and Poverty (Oxford University Press, Oxford, 1980).

[108] D. G. Champernowne and F. A. Cowell, Economic Inequality and Income Distribution (Cambridge University Press, Cambridge, 1998).

[109] Handbook of Income Distribution, edited by A. B. Atkinson and F. Bourguignon (Elsevier, Amsterdam, 2000).

[110] M. Strudler and T. Petska, An Analysis of the Distribution of Individual Income and Taxes, 1979–2001 (IRS, Washington DC, 2003), http://www.irs.gov/pub/irs-soi/03strudl.pdf.

[111] R. Gibrat, Les Inegalites Economiques (Sirey, Paris, 1931).

[112] M. Nirei and W. Souma, Income distribution dynamics: a classical perspective, working paper (2004), http://www.santafe.edu/~makoto/papers/income.pdf.

[113] J. Mimkes, Th. Fruend, and G. Willis, Lagrange statistics in systems (markets) with price constraints: analysis of property, car sales, marriage and job markets by the Boltzmann function and the Pareto distribution, cond-mat/0204234; G. Willis and J. Mimkes, Evidence for the independence of waged and unwaged income, evidence for Boltzmann distributions in waged income, and the outlines of a coherent theory of income distribution, cond-mat/0406694.

[114] N. Scafetta, S. Picozzi, and B. J. West, An out-of-equilibrium model of the distributions of wealth, Quantitative Finance 4, 353 (2004).

[115] I. Wright, The social architecture of capitalism, cond-mat/0401053.

[116] Individual Income Tax Returns, Pub. 1304 (IRS, Washington DC, 1983–2001).

[117] A. Hasegawa, K. Mima, and M. Duong-van, Plasma distribution function in a superthermal radiation field, Phys. Rev. Lett. 54, 2608 (1985); M. I. Desai et al., Evidence for a suprathermal seed population of heavy ions accelerated by interplanetary shocks near 1 AU, Astrophysical Journal 588, 1149 (2003); M. R. Collier, Outer planet magnetospheres: a tutorial, Advances in Space Research 33, 2108 (2004).

[118] How Much is That? http://eh.net/hmit/.

[119] E. M. Lifshitz and L. P. Pitaevskii, Physical Kinetics (Pergamon Press, Oxford, 1981).

[120] J. C. Ferrero, The statistical distribution of money and the rate of money transference, Physica A 341, 575 (2004).

[121] A. Chatterjee, B. K. Chakrabarti, and S. S. Manna, Money in gas-like markets: Gibbs and Pareto laws, Physica Scripta T106, 36 (2003).

[122] J.-P. Bouchaud and M. Mezard, Wealth condensation in a simple model of economy, Physica A 282, 536 (2000); S. Solomon and P. Richmond, Power laws of wealth, market order volumes and market returns, Physica A 299, 188 (2001); A. Y. Abul-Magd, Wealth distribution in an ancient Egyptian society, Phys. Rev. E 66, 057104 (2002).

[123] A. A. Dragulescu, Ph. D. Thesis (2002), Sec. II H, cond-mat/0307341.

[124] “Stylized facts” is a term that comes from the economics literature. It refers to empirical facts that cannot be strictly proved. For instance, the variance of returns is proportional to t for a good number of stocks, but there might be stocks for which this does not hold.

[125] One of the possible reasons for the difference between the empirical h and the quoted price h is the bid-ask spread, that is, the difference in price between the buy and sell quotes. Since we work with transaction prices, these prices will tend to jump between the bid and the ask, and this gap is not quantized by law. Another point to remember is that the quantum set by law only makes sense for limit orders (where the buyer or seller quotes a preferred price) and not for market orders (where the buyer or seller trades at the first available price). TAQ does not distinguish between order types.

[126] Absolute price change is used here as the opposite of relative price change. We do not refer to the absolute value. What we refer to as absolute price changes are also known as the P&L of the trade.

[127] Even though the income of women is generally lower than that of men, this does not seem to make a difference in temperature significant enough to be noticed.

[Figure residue, plots not reproduced. Recoverable labels: INTC, 1997; total number of trades N after T days versus T days in one year, with mean drift ⟨N_t⟩ = ηt, η = 40; volatility versus days.]