
SIGFIRM Working Paper No. 20

Modeling and Prediction of Financial Trading Networks: An Application to the NYMEX Natural Gas Futures Market

Brenda Betancourt, University of California, Santa Cruz

May 2013

SIGFIRM-UCSC Engineering-2, Room 403E

1156 High Street Santa Cruz, CA 95064

831-459-2523 Sigfirm.ucsc.edu

The Sury Initiative for Global Finance and International Risk Management (SIGFIRM) at the University of California, Santa Cruz, addresses new challenges in global financial markets. Guiding practitioners and policymakers in a world of increased uncertainty, globalization and use of information technology, SIGFIRM supports research that offers new insights into the tradeoffs between risk and rewards through innovative techniques and analytical methods. Specific areas of interest include risk management models, models of human behavior in financial markets (the underpinnings of “behavioral finance”), integrated analyses of global systemic risks, and analysis of financial markets using Internet data mining.

Modelling and prediction of financial trading networks: An application to the NYMEX natural gas futures market

Brenda Betancourt and Abel Rodríguez, University of California, Santa Cruz, U.S.A.

Naomi Boyd, West Virginia University

Summary. Over the last few years there has been a growing interest in using financial trading networks to understand the microstructure of financial markets. Most of the methodologies developed so far for this purpose have been based on the study of descriptive summaries of the networks such as the average node degree and the clustering coefficient. In contrast, this paper develops novel statistical methods for modeling sequences of financial trading networks. Our approach uses a stochastic blockmodel to describe the structure of the network during each period, and then links multiple time periods using a hidden Markov model. This structure allows us to identify events that affect the structure of the market and make accurate short-term predictions of future transactions. The methodology is illustrated using data from the NYMEX natural gas futures market from January 2005 to December 2008.

Keywords: Financial Trading Network; Stochastic Blockmodel; Hidden Markov Model; Systemic Risk; Array-Valued Time Series

1. Introduction

Financial trading networks are directed graphs in which nodes correspond to traders participating in a financial market, and edges represent pairwise buy-sell transactions among them that occur within a period of time. Financial trading networks contain important information about patterns of order execution in order-driven markets; hence, they provide insights into aspects of market microstructure such as market frictions, trading strategies, and systemic risks.

Consider first the role of financial trading networks in understanding the effect of market frictions on market microstructure. In the absence of market frictions, we could expect orders from different traders to be matched randomly. However, real trading networks often exhibit features such as elevated transitivity or preferential attachment among certain groups of actors (Adamic et al., 2010; Boyd et al., 2013), which are inconsistent with random matching. In the case of open-outcry markets, these features can be partially explained by sociological factors (for example, see Zaloom, 2004). Alternative explanations include the effect of different market roles (e.g., liquidity providers/takers) or trading strategies (e.g., long vs. short strategies); see Ozsoylev et al. (2010) or Hatfield et al. (2012).

Financial trading networks also provide information that is key in the assessment of systemic risks. Analysis of the evolution of financial trading networks can aid in tests of financial market stability (or fragility, as it may be) by financial regulators to ensure that events such as large trader failures do not destabilize financial markets. In the event of a large trader failure, an understanding of the trader's network will help guide regulators through the process of unwinding the trader's positions and may dictate whether those positions are unwound in the open market or through a transfer to a suitable counterparty (Boyd et al., 2011). Financial trading networks can also be used to identify important traders that play a critical role in the market (for example, by acting as de facto market makers or liquidity providers). In addition, they can help us identify frequent counterparties of specific traders, which may aid in regulatory oversight by federal agencies and market exchanges alike; price distortion and manipulation may be more likely between frequent counterparties than by one agent acting in isolation (Harris et al., 1994).

The literature on the mathematical modeling of financial trading networks is limited. Theoretical approaches that explain the structure of a financial network as the outcome of a game have recently been developed (e.g., see Ozsoylev et al., 2010 and Hatfield et al., 2012), but they are of limited practical applicability. Most of the empirical work on trading networks has focused on the use of summary statistics such as degree distributions, average betweenness and clustering coefficients (Newman, 2003; Adamic et al., 2010; Boyd et al., 2013). These types of approaches provide some interesting insights into market microstructure, but suffer from two main drawbacks. First, the summary statistics to be monitored need to be carefully chosen to ensure that relevant features of the market are captured. Although some of the game-theoretic work mentioned before might provide some insights into which network summaries should be monitored, the choice is typically difficult and the selection is often incomplete. Second, and more importantly, approaches of this type are not helpful in predicting future interactions among traders.

In this paper we move beyond descriptive network summaries to focus on stochastic models for array-valued data that place a probability distribution on the full network. The statistical literature on stochastic models for individual networks is well developed. The simplest such model is the Erdős-Rényi model (Erdős and Rényi, 1959), which assumes that interactions between any two traders occur independently and with a constant probability that does not depend on the identity of the traders. This class of models, although well studied from a theoretical perspective, is too simplistic to accommodate most realistic networks. As an alternative, Frank and Strauss (1986) proposed the class of exponential random graphs, also called p∗ models. These models formalize the use of summary measures by including them as sufficient statistics in exponential-family models. On the other hand, Holland and Leinhardt (1981) proposed the class of p1 models, which extend generalized linear models to array-valued data. A related approach was introduced in Hoff et al. (2002) using the concept of latent social space models, in which the probability of a link between nodes increases as they occupy closer positions in a latent social space.

The model discussed in this paper extends the class of stochastic blockmodels first introduced in Wang and Wong (1987) to account for time dependence. Stochastic blockmodels rely on the concept of structural equivalence to identify groups of traders (which we shall refer to as trading communities in the context of this application) with similar interaction patterns. Indeed, the partitions induced by stochastic blockmodels are driven not only by the internal relations within a particular group of traders, but also by the interactions among these groups (Wasserman and Faust, 1994). Model-based stochastic blockmodels have been developed as array-valued extensions of traditional mixture models. For example, Nowicki and Snijders (2001) proposed a simple Bayesian model that uses a finite mixture model and a Dirichlet prior for the probabilities of the latent classes. Extensions of this model that rely on infinite mixture models based on the Dirichlet process have been proposed by Kemp et al. (2006) and Xu et al. (2006). More recently, Airoldi et al. (2008) introduced the idea of mixed membership stochastic blockmodels for binary networks, wherein actors can belong to more than one latent class, to explore subjects with multiple roles in the network.

In this paper we propose modeling the dynamics of financial trading networks using an extension of the Bayesian infinite-dimensional model of Kemp et al. (2006). The model we propose accounts for dependence of the network structure over time and incorporates more general hierarchical priors on the interaction probabilities as well as the partition structure. To account for changes in market microstructure over time, the blockmodels associated with different time periods are linked through a hidden Markov model. In finance, regime switching models have been used in many contexts, including stock returns (Guidolin and Timmermann, 2005; Kim et al., 2001; Perez-Quiroz and Timmermann, 2000), asset allocation (Ang and Bekaert, 2002a), business cycles (Filardo, 1994), and interest rates (Ang and Bekaert, 2002b). As we show in our illustration, by developing a dynamic, fully probabilistic model for array-valued data we are able to monitor structural changes in market microstructure while at the same time making more accurate short-term predictions of future trading patterns. The resulting model is similar in spirit to the one described in Rodriguez (2011), but allows for additional flexibility and enhanced interpretability.

To motivate the structure of our model, consider a time series of weekly binary trading networks generated from proprietary transactions among 71 traders in the NYMEX natural gas futures market between January 2005 and December 2008 (for more details about this dataset, see Section 4). Figure 1 presents time series plots of the corresponding clustering coefficients (which measure the tendency of traders to establish transitive relationships) and assortativity coefficients (which measure the tendency of traders to interact with other traders that are similar to themselves). From these plots it is clear that there is a change in market microstructure in September 2006 (which corresponds to the date of introduction of an electronic trading system in this market). In addition, there is evidence of two additional structural changes around January 2007 and June 2008. This suggests that a Markov switching model in which the state of the system is assumed to be constant over short periods of time could be a reasonable model for this type of data.

Figure 2 presents a matrix representation for the trading network associated with the week of February 22, 2005 (traders have been reordered to make the graph easier to read). The graph suggests the existence of groups of traders that are structurally equivalent, including a large group of inactive traders that do not participate in transactions during this particular week, as well as a small group of traders with a high number of intra-group and a relatively low number of inter-group transactions. This suggests that a stochastic blockmodel might be a reasonable model for individual trading networks.

The rest of the paper is organized as follows: Section 2 provides a detailed description of our modeling approach and discusses its main properties. Section 3 describes the associated computational algorithm. Section 4 presents the analysis of the NYMEX natural gas futures market data. Finally, Section 5 discusses some limitations of the model and possibilities for further extensions and applications.

2. Modeling Approach

2.1. Stochastic blockmodels for financial trading networks

We encode a financial trading network among $n$ traders using an $n \times n$ binary matrix $Y = [y_{i,j}]$, where $y_{i,j} = 1$ if trader $i$ sold at least one contract to trader $j$, and $y_{i,j} = 0$ otherwise. Since we focus on proprietary trading (i.e., transactions carried out by the traders with their own money, rather than their clients'), we adopt the convention $y_{i,i} \equiv 0$, as traders do not buy from themselves. Note that treating the network as binary ignores information about the transactions such as the number, maturities, and prices of the contracts. We proceed in this way for two reasons. First, in some markets (e.g., dark pools) the prices and number of contracts might not be disclosed, making it impossible to apply more general models. Second, even if available, this extra information provides limited additional information about the identity of counterparties subject to contagion risks. Nonetheless, the framework we describe here can be easily extended to more general types of weighted networks.
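The binary encoding described above can be sketched in a few lines. This is only an illustration: the function name `build_trading_network`, the 0-based trader indices, and the toy trade list are assumptions made here, not part of the NYMEX data set.

```python
def build_trading_network(trades, n):
    """Return the n x n binary matrix Y with Y[i][j] = 1 if trader i
    sold at least one contract to trader j (hypothetical helper)."""
    Y = [[0] * n for _ in range(n)]
    for seller, buyer in trades:
        if seller != buyer:          # convention y_ii = 0
            Y[seller][buyer] = 1     # repeated trades collapse to a single edge
    return Y


# Toy list of (seller, buyer) pairs among 4 traders; purely illustrative.
trades = [(0, 1), (0, 1), (2, 0), (3, 3)]
Y = build_trading_network(trades, 4)
```

Note that the two trades from trader 0 to trader 1 produce a single edge, which is exactly the information loss about trade counts discussed in the paragraph above.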

Fig. 1. Clustering coefficients and assortativity coefficients (by degree) for weekly trading networks in the NYMEX natural gas futures market between January 2005 and December 2008. The vertical dashed line indicates the date on which electronic trading was introduced in this market.

Fig. 2. Incidence matrix (trader by trader) for the trading network associated with the week of February 22, 2005. The solid lines suggest one possible partition of the traders into groups of structurally equivalent nodes.

A stochastic blockmodel for $Y$ assumes that its entries are conditionally independent given two sets of parameters: a vector of discrete indicators $\xi = (\xi_1, \ldots, \xi_n)$, where $\xi_i = k$ if and only if trader $i$ belongs to community $k = 1, \ldots, K$, and a $K \times K$ matrix $\Theta = [\theta_{k,l}]$ such that $\theta_{k,l}$ represents the probability that a member of community $k$ sells a contract to a member of community $l$. Therefore,

$$y_{i,j} \mid \xi, \Theta \sim \mathrm{Ber}(\theta_{\xi_i, \xi_j}).$$
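To make the sampling model concrete, the following minimal sketch draws one network from a stochastic blockmodel given fixed community indicators and interaction probabilities. The function name `sample_sbm`, the 0-based community labels, and the numerical values are illustrative assumptions.

```python
import random


def sample_sbm(xi, theta, rng):
    """Draw Y with y_ij ~ Ber(theta[xi[i]][xi[j]]) for i != j and y_ii = 0."""
    n = len(xi)
    Y = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                Y[i][j] = 1 if rng.random() < theta[xi[i]][xi[j]] else 0
    return Y


rng = random.Random(0)
xi = [0, 0, 0, 1, 1]                 # three traders in community 0, two in community 1
theta = [[0.9, 0.1],                 # theta[k][l]: Pr(member of k sells to member of l)
         [0.2, 0.0]]                 # community 1 never trades internally
Y = sample_sbm(xi, theta, rng)
```

Because `theta[1][1]` is zero in this toy example, traders 3 and 4 behave like the "inactive among themselves" group visible in incidence matrices such as the one in Figure 2.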

Note that $K$ represents the maximum potential number of trading communities allowed a priori. A posteriori, the effective number of trading communities $K^*$ present in the sample could potentially be smaller than $K$.

A Bayesian formulation for this model is completed by eliciting prior distributions for $K$, $\xi$, and $\Theta$. In the sequel we set $K = \infty$ and let the indicators be independent a priori, where

$$\Pr(\xi_i = k \mid w) = w_k, \qquad k = 1, 2, \ldots,$$

and the vector of weights $w = (w_1, w_2, \ldots)$ is constructed so that

$$w_k = v_k \prod_{s<k} \{1 - v_s\}, \qquad v_k \sim \mathrm{Beta}(1 - \alpha, \beta + \alpha k), \tag{1}$$

for $0 \le \alpha < 1$ and $\beta > -\alpha$. Note that, by setting $K = \infty$, the model allows the effective number of components $K^*$ to be as large as the number of traders $n$, for any $n$.
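The stick-breaking construction in (1) can be sketched as follows. Since an infinite sequence cannot be stored, the sketch truncates at a finite level `K`; the truncation, the function name, and the hyperparameter values are assumptions made only for illustration.

```python
import random


def stick_breaking_weights(alpha, beta, K, rng):
    """Return the first K weights w_1..w_K from (1) and the leftover stick mass."""
    weights, remaining = [], 1.0
    for k in range(1, K + 1):
        v = rng.betavariate(1.0 - alpha, beta + alpha * k)  # v_k ~ Beta(1 - alpha, beta + alpha k)
        weights.append(v * remaining)                        # w_k = v_k * prod_{s<k} (1 - v_s)
        remaining *= 1.0 - v
    return weights, remaining


w, leftover = stick_breaking_weights(alpha=0.25, beta=1.0, K=50, rng=random.Random(1))
```

The weights and the leftover mass always add up to one, so `leftover` gives a direct measure of how much prior probability the truncation ignores.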

The formulation in (1) is equivalent to the constructive definition of the Poisson-Dirichlet process (Pitman, 1995; Pitman and Yor, 1997), with $\alpha = 0$ leading to the Dirichlet process. Hence, the implied prior on the effective number of trading communities $K^*$ and the sizes of those communities, $m_1, \ldots, m_{K^*}$, is given by

$$\frac{\Gamma(\beta + 1)}{(\beta + \alpha K^*)\,\Gamma(\beta + n)} \prod_{k=1}^{K^*} \frac{(\beta + \alpha k)\,\Gamma(m_k - \alpha)}{\Gamma(1 - \alpha)}.$$

Note that larger values of $\alpha$ or $\beta$ favor, a priori, a larger effective number of communities $K^*$. Setting $\alpha = 0$ makes the prior expected number of communities grow logarithmically with $n$, while for $\alpha > 0$ the expected number of components grows as a power of the number of traders.
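One way to get a feel for how $\alpha$ and $\beta$ control the number of communities is to simulate partitions from the prior. The sketch below uses the standard sequential (urn) representation of the Poisson-Dirichlet process, in which the next trader opens a new community with probability $(\beta + \alpha K)/(\beta + i)$ when $i$ traders are already allocated to $K$ communities. This representation is not spelled out in the text and the function name is hypothetical.

```python
import random


def sample_partition_sizes(n, alpha, beta, rng):
    """Simulate community sizes m_1..m_K* for n traders from the partition prior."""
    sizes = []
    for i in range(n):                      # i traders already allocated
        k_now = len(sizes)
        p_new = 1.0 if i == 0 else (beta + alpha * k_now) / (beta + i)
        if rng.random() < p_new:
            sizes.append(1)                 # open a new community
            continue
        # Join an existing community with probability proportional to m_k - alpha.
        u = rng.random() * (i - alpha * k_now)
        acc = 0.0
        for k, m in enumerate(sizes):
            acc += m - alpha
            if u < acc:
                sizes[k] += 1
                break
        else:
            sizes[-1] += 1                  # guard against floating-point round-off
    return sizes


sizes_dp = sample_partition_sizes(71, alpha=0.0, beta=1.0, rng=random.Random(2))
sizes_py = sample_partition_sizes(71, alpha=0.5, beta=1.0, rng=random.Random(2))
```

Repeating such draws many times and averaging `len(sizes)` would give a Monte Carlo check of the logarithmic versus power-law growth described above.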

Consider now building a prior on the matrix of interaction probabilities $\Theta$. In this case we let

$$\theta_{k,l} \mid a_O, b_O, a_D, b_D \sim \begin{cases} \mathrm{Beta}(a_O, b_O) & k \ne l, \\ \mathrm{Beta}(a_D, b_D) & k = l. \end{cases}$$

This prior is more general than those typically used in stochastic blockmodels, as it allows the distributions of the diagonal and off-diagonal elements of $\Theta$ to have different hyperparameters. This provides additional flexibility in terms of the implied degree distribution of the network, while still ensuring that both $p(Y)$ and $p(\Theta)$ are jointly exchangeable, i.e., that the distributions are invariant to the order in which traders or communities are labeled (Aldous, 1981). In addition, it allows us to define an assortative index for the network as

$$\Upsilon = \log\{E(\theta_{k,k} \mid a_D, b_D)\} - \log\{E(\theta_{k,l} \mid a_O, b_O)\} = \log\left\{\frac{a_D}{a_D + b_D}\right\} - \log\left\{\frac{a_O}{a_O + b_O}\right\},$$

and a cycle-type transitivity index

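$$\chi = \Pr(y_{i,j} = 1 \mid y_{j,k} = 1, y_{k,i} = 1, a_O, b_O, a_D, b_D, \alpha, \beta) = \frac{\chi_N}{\chi_D},$$

where

$$\begin{aligned} \chi_N ={}& \frac{(1-\alpha)(2-\alpha)}{(\beta+1)(\beta+2)} \, \frac{(a_D+2)(a_D+1)\,a_D}{(a_D+b_D+2)(a_D+b_D+1)(a_D+b_D)} \\ &+ 3\,\frac{(1-\alpha)(\beta+\alpha)}{(\beta+1)(\beta+2)} \, \frac{a_D}{a_D+b_D} \left(\frac{a_O}{a_O+b_O}\right)^{2} + \frac{(\beta+\alpha)(\beta+2\alpha)}{(\beta+1)(\beta+2)} \left(\frac{a_O}{a_O+b_O}\right)^{3}, \end{aligned}$$

and

$$\begin{aligned} \chi_D ={}& \frac{(1-\alpha)(2-\alpha)}{(\beta+1)(\beta+2)} \, \frac{(a_D+1)\,a_D}{(a_D+b_D+1)(a_D+b_D)} \\ &+ 2\,\frac{(1-\alpha)(\beta+\alpha)}{(\beta+1)(\beta+2)} \, \frac{a_D}{a_D+b_D} \, \frac{a_O}{a_O+b_O} + \frac{(\beta+\alpha)(\beta+\alpha+1)}{(\beta+1)(\beta+2)} \left(\frac{a_O}{a_O+b_O}\right)^{2}. \end{aligned}$$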

These two indexes are model-based alternatives to the assortativity by degree and the clustering coefficient shown in Figure 1 (Rodriguez and Reyes, 2013).
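Both indexes are closed-form functions of the hyperparameters, so they are cheap to evaluate for any draw of $(a_O, b_O, a_D, b_D, \alpha, \beta)$. The following sketch transcribes the formulas for $\Upsilon$, $\chi_N$ and $\chi_D$ given above; the function names and the example values are illustrative assumptions.

```python
import math


def assortativity_index(a_D, b_D, a_O, b_O):
    """Upsilon: log prior mean of diagonal minus log prior mean of off-diagonal theta."""
    return math.log(a_D / (a_D + b_D)) - math.log(a_O / (a_O + b_O))


def transitivity_index(a_D, b_D, a_O, b_O, alpha, beta):
    """chi = chi_N / chi_D, the cycle-type transitivity index."""
    norm = (beta + 1.0) * (beta + 2.0)
    w_same = (1.0 - alpha) * (2.0 - alpha) / norm       # weight of the first term
    w_pair = (1.0 - alpha) * (beta + alpha) / norm      # weight shared by the middle terms
    m_O = a_O / (a_O + b_O)                             # E(theta) off the diagonal
    m_D1 = a_D / (a_D + b_D)                            # E(theta) on the diagonal
    m_D2 = m_D1 * (a_D + 1.0) / (a_D + b_D + 1.0)       # E(theta^2) on the diagonal
    m_D3 = m_D2 * (a_D + 2.0) / (a_D + b_D + 2.0)       # E(theta^3) on the diagonal
    chi_N = (w_same * m_D3 + 3.0 * w_pair * m_D1 * m_O ** 2
             + (beta + alpha) * (beta + 2.0 * alpha) / norm * m_O ** 3)
    chi_D = (w_same * m_D2 + 2.0 * w_pair * m_D1 * m_O
             + (beta + alpha) * (beta + alpha + 1.0) / norm * m_O ** 2)
    return chi_N / chi_D


upsilon = assortativity_index(1.0, 1.0, 1.0, 1.0)
chi = transitivity_index(1.0, 1.0, 1.0, 1.0, alpha=0.0, beta=1.0)
```

With identical uniform priors on and off the diagonal, `upsilon` is zero, which matches the interpretation of $\Upsilon = 0$ as favoring neither assortative nor disassortative communities.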

2.2. Hidden Markov models for time series of financial trading networks

We are interested in extending the hierarchical blockmodel described in Section 2.1 to model a time series of financial trading networks $Y_1, \ldots, Y_T$. The extension is built with two goals in mind. First, we are interested in identifying events associated with structural changes in the network and, therefore, in the microstructure of the market. Second, we aim to make short-term predictions about the structure of the network in future periods. For these reasons, we focus our attention on the use of hidden Markov models for network data. Hidden Markov models are widely used in financial (e.g., see Ryden et al., 1998 and references therein) and biological (e.g., Yau et al., 2011 and references therein) applications where there is interest in identifying structural changes in the system under study. Hence, they represent a natural alternative in this context.

More specifically, consider now a sequence $Y_1, \ldots, Y_T$ of binary trading networks observed over $T$ consecutive time intervals, where all networks are associated with a common set of $n$ traders. In addition, let $\zeta_1, \ldots, \zeta_T$ be a sequence of unobserved state variables such that $\zeta_t = s$ indicates that the market is in state $s \in \{1, 2, \ldots, S\}$ during period $t \in \{1, 2, \ldots, T\}$. Each state has associated with it a vector of community indicators $\xi_s = (\xi_{1,s}, \ldots, \xi_{n,s})$ with $\xi_{i,s} \in \{1, 2, \ldots, K\}$ and a matrix of interaction probabilities $\Theta_s = [\theta_{k,l,s}]$ representing, respectively, the grouping of traders into trading communities and the probabilities of trades occurring between communities when the system is in state $s$. Analogously to our previous discussion, $S$ and $K$ represent the maximum number of states and the maximum number of trading communities allowed by the model a priori. A posteriori, the effective number of states $S^*$ and the effective numbers of communities in each state, $K^*_1, \ldots, K^*_S$, are potentially smaller than $S$ and $K$, respectively.

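Conditionally on the state parameters, observations are assumed to be independent, i.e.,

$$y_{i,j,t} \mid \zeta_t, \{\xi_s\}, \{\Theta_s\} \sim \mathrm{Ber}\left(y_{i,j,t} \mid \theta_{\xi_{i,\zeta_t}, \xi_{j,\zeta_t}, \zeta_t}\right).$$

Hence, the joint likelihood for the data can be written as

$$\begin{aligned} p(\{Y_t\} \mid \{\zeta_t\}, \{\xi_s\}, \{\Theta_s\}) &= \prod_{t=1}^{T} \prod_{i=1}^{n} \prod_{\substack{j=1 \\ j \ne i}}^{n} \theta_{\xi_{i,\zeta_t}, \xi_{j,\zeta_t}, \zeta_t}^{\,y_{i,j,t}} \left(1 - \theta_{\xi_{i,\zeta_t}, \xi_{j,\zeta_t}, \zeta_t}\right)^{1 - y_{i,j,t}} \\ &= \prod_{s=1}^{S} \prod_{k=1}^{K} \prod_{l=1}^{K} \prod_{(i,j,t) \in A_{k,l,s}} \theta_{k,l,s}^{\,y_{i,j,t}} \left(1 - \theta_{k,l,s}\right)^{1 - y_{i,j,t}}, \end{aligned}$$

where $A_{k,l,s} = \{(i, j, t) : i \ne j, \zeta_t = s, \xi_{i,\zeta_t} = k, \xi_{j,\zeta_t} = l\}$ is the set of observations associated with the interactions between communities $k$ and $l$ in state $s$.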
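The second form of the likelihood says that the data enter only through the number of ones and zeros in each set $A_{k,l,s}$. A minimal sketch of that bookkeeping is shown below; the function names, the 0-based labels, and the nested-list data layout (`Y_seq[t][i][j]`, `xi[s][i]`, `theta[s][k][l]`) are assumptions for illustration, and every $\theta_{k,l,s}$ used is assumed to lie strictly between 0 and 1.

```python
import math


def block_counts(Y_seq, zeta, xi):
    """Map (k, l, s) -> [number of ones, number of zeros] over the set A_{k,l,s}."""
    counts = {}
    for Y, s in zip(Y_seq, zeta):
        n = len(Y)
        for i in range(n):
            for j in range(n):
                if i == j:
                    continue
                cell = counts.setdefault((xi[s][i], xi[s][j], s), [0, 0])
                cell[0 if Y[i][j] == 1 else 1] += 1
    return counts


def log_likelihood(counts, theta):
    """Log of the joint likelihood, evaluated block by block."""
    total = 0.0
    for (k, l, s), (ones, zeros) in counts.items():
        p = theta[s][k][l]
        total += ones * math.log(p) + zeros * math.log(1.0 - p)
    return total


# Toy example: one week, two traders in different communities, one state.
Y_seq = [[[0, 1], [0, 0]]]
zeta = [0]
xi = [[0, 1]]
theta = [[[0.5, 0.25], [0.5, 0.5]]]
counts = block_counts(Y_seq, zeta, xi)
ll = log_likelihood(counts, theta)
```

The same counts are what a conjugate Beta update for each $\theta_{k,l,s}$ would need, which is why grouping observations by block is convenient computationally.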

To account for the persistence in network structure illustrated in Figure 1, we assume that the evolution of the system indicators follows a first-order Markov process with transition probabilities

$$p(\zeta_t = s \mid \zeta_{t-1} = r, \{\pi_r\}) = \pi_{r,s},$$

where $\pi_r = (\pi_{r,1}, \ldots, \pi_{r,S})$, the $r$-th row of the transition matrix $\Pi = [\pi_{r,s}]$, must satisfy $\sum_{s=1}^{S} \pi_{r,s} = 1$. A natural prior for $\pi_r$ is a symmetric Dirichlet distribution,

$$\pi_r \mid \gamma \sim \mathrm{Dir}\left(\frac{\gamma}{S}, \frac{\gamma}{S}, \ldots, \frac{\gamma}{S}\right).$$

Note that, as $S \to \infty$, the induced distribution of transitions over states is equivalent to that generated by a Dirichlet process prior with concentration parameter $\gamma$ (for example, see Green and Richardson, 2001). Therefore the model is similar in spirit to the infinite hidden Markov model discussed in Rodriguez (2011) (see also Teh et al., 2006). However, our construction does not couple the values of $\pi_1, \pi_2, \ldots$ through a common centering probability. This is in contrast to the infinite hidden Markov model, where all transition probabilities into the same state are assumed to be similar. Indeed, the structure of the infinite hidden Markov model implies that if state $s$ is highly persistent (i.e., $\pi_{s,s}$ is close to one), then the probability of transitioning from any other state into state $s$ will also tend to be large, a property that is unappealing when modeling financial trading networks. Since $\gamma$ plays an important role in controlling the number of effective states $S^*$, its value is estimated from the data by assigning an exponential prior to it and carrying out a sensitivity analysis to evaluate the impact of our prior choice on model performance.
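As a rough illustration of the prior on the state dynamics, the sketch below draws each row of $\Pi$ from the symmetric Dirichlet distribution (by normalizing independent gamma variates) and then simulates a state path. The function names, the small value of `S`, and the choice of starting state are assumptions; the text does not specify an initial distribution for $\zeta_1$.

```python
import random


def sample_transition_matrix(S, gamma, rng):
    """Draw each row pi_r ~ Dir(gamma/S, ..., gamma/S) via normalized gamma variates."""
    Pi = []
    for _ in range(S):
        g = [rng.gammavariate(gamma / S, 1.0) for _ in range(S)]
        total = sum(g)
        Pi.append([x / total for x in g])
    return Pi


def simulate_states(Pi, T, rng, start=0):
    """Simulate zeta_1..zeta_T from the first-order Markov chain with matrix Pi."""
    states = [start]                              # assumed starting state
    for _ in range(T - 1):
        states.append(rng.choices(range(len(Pi)), weights=Pi[states[-1]])[0])
    return states


Pi = sample_transition_matrix(S=4, gamma=1.0, rng=random.Random(3))
path = simulate_states(Pi, T=20, rng=random.Random(4))
```

Because the rows are drawn independently, a highly persistent state in one row does not make transitions into that state more likely from other rows, which is the decoupling emphasized above.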

The specification of the model is completed by eliciting hierarchical priors on the state-specific parameters $\xi_1, \ldots, \xi_S$ and $\Theta_1, \ldots, \Theta_S$. Following the discussion in Section 2.1, we let

$$\Pr(\xi_{i,s} = k \mid w_s) = w_{k,s}, \qquad k = 1, 2, \ldots,$$

where $w_{k,s} = v_{k,s} \prod_{h<k} \{1 - v_{h,s}\}$ are weights constructed from a sequence $v_{1,s}, v_{2,s}, \ldots$ where $v_{k,s} \sim \mathrm{Beta}(1 - \alpha_s, \beta_s + k\alpha_s)$. Again, since the hyperparameters $\alpha_s$ and $\beta_s$ play a critical role in controlling the number of expected trading communities, they are assigned independent hyperpriors $\alpha_s \sim p(\alpha_s)$ and $\beta_s \sim p(\beta_s)$. A natural choice is to assign $\alpha_s$ a uniform prior on the unit interval and $\beta_s$ an exponential prior, while carrying out a sensitivity analysis that involves priors that favor small values of $\alpha_s$ as well as priors that favor both lower and higher values for $\beta_s$.

Similarly, the interaction probabilities are assigned priors

$$\theta_{k,l,s} \mid a_{s,O}, b_{s,O}, a_{s,D}, b_{s,D} \sim \begin{cases} \mathrm{Beta}(a_{s,O}, b_{s,O}) & k \ne l, \\ \mathrm{Beta}(a_{s,D}, b_{s,D}) & k = l, \end{cases}$$

where $\{a_{s,O}\}$, $\{b_{s,O}\}$, $\{a_{s,D}\}$, and $\{b_{s,D}\}$ are independent and gamma distributed with shape parameter $c$ and unknown rates $d_O$, $e_O$, $d_D$ and $e_D$, which are in turn assigned exponential priors with means $\lambda_d$ and $\lambda_e$.

3. Computation

The joint posterior distribution for the model is proportional to

$$\begin{aligned} & p(\{Y_t\} \mid \{\zeta_t\}, \{\xi_s\}, \{\Theta_s\}) \; p(\{\Theta_s\} \mid \{a_{s,O}\}, \{b_{s,O}\}, \{a_{s,D}\}, \{b_{s,D}\}) \\ & \quad \times p(\{a_{s,O}\} \mid d_O) \; p(\{b_{s,O}\} \mid e_O) \; p(\{a_{s,D}\} \mid d_D) \; p(\{b_{s,D}\} \mid e_D) \\ & \quad \times p(d_O) \; p(e_O) \; p(d_D) \; p(e_D) \; p(\{\zeta_t\} \mid \Pi) \; p(\Pi \mid \gamma) \; p(\gamma) \\ & \quad \times p(\{\xi_s\} \mid \{\alpha_s\}, \{\beta_s\}) \; p(\{\alpha_s\}) \; p(\{\beta_s\}). \end{aligned} \tag{2}$$

This posterior distribution is not analytically tractable. Therefore, we implemented a Markov chain Monte Carlo algorithm (Robert and Casella, 2005) that simulates a dependent sequence of random draws from the target distribution. Given initial values for the parameters, these are successively updated from their full conditional distributions. Standard Markov chain theory ensures that, after an appropriate burn-in, the values of the parameters generated by the algorithm are approximately distributed according to (2).

To derive the algorithm, we rely on the fact that the joint posterior distribution in (2) can be factorized as

$$\begin{aligned} & p(\{\Theta_s\} \mid \{\xi_s\}, \{\zeta_t\}, \{a_{s,O}\}, \{b_{s,O}\}, \{a_{s,D}\}, \{b_{s,D}\}, \{Y_t\}) \\ & \quad \times p(\{\xi_s\}, \{\zeta_t\}, \{a_{s,O}\}, \{b_{s,O}\}, \{a_{s,D}\}, \{b_{s,D}\}, d_O, e_O, d_D, e_D, \{\alpha_s\}, \{\beta_s\}, \gamma \mid \{Y_t\}). \end{aligned} \tag{3}$$

Since the values of $\theta_{k,l,s}$ are conditionally independent a posteriori given the observations, the indicators $\{\zeta_t\}$ and $\{\xi_{i,s}\}$, and the prior parameters $\{a_{s,O}\}$, $\{b_{s,O}\}$, $\{a_{s,D}\}$ and $\{b_{s,D}\}$, the first term in (3) is easy to sample from. Furthermore, conditionally on the other parameters in the model, the state indicators $\zeta_1, \ldots, \zeta_T$ are sampled jointly using a forward-backward algorithm (Rabiner, 1986), while the full conditional distribution for each collection of indicators $\xi_{1,s}, \ldots, \xi_{n,s}$ is sampled using a collapsed (marginal) Gibbs sampler (Neal, 2000). Details of the algorithm are discussed in Appendix A.
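The joint update of the state indicators can be illustrated with a generic forward-filtering, backward-sampling routine. This is a textbook sketch rather than the algorithm of Appendix A: the function name `ffbs`, the input `loglik[t][s]` (the log-likelihood of network $Y_t$ under state $s$, assumed to be computed elsewhere), and the initial distribution `init` are all assumptions made here.

```python
import math
import random


def ffbs(loglik, Pi, init, rng):
    """Sample zeta_1..zeta_T jointly given per-period state log-likelihoods."""
    T, S = len(loglik), len(Pi)
    filt = []
    for t in range(T):
        top = max(loglik[t])
        lik = [math.exp(v - top) for v in loglik[t]]      # rescale to avoid underflow
        if t == 0:
            pred = init
        else:
            pred = [sum(filt[t - 1][r] * Pi[r][s] for r in range(S)) for s in range(S)]
        unnorm = [pred[s] * lik[s] for s in range(S)]
        z = sum(unnorm)
        filt.append([u / z for u in unnorm])              # filtered Pr(zeta_t = s | Y_1..Y_t)
    zeta = [0] * T
    zeta[T - 1] = rng.choices(range(S), weights=filt[T - 1])[0]
    for t in range(T - 2, -1, -1):                        # backward sampling pass
        w = [filt[t][r] * Pi[r][zeta[t + 1]] for r in range(S)]
        zeta[t] = rng.choices(range(S), weights=w)[0]
    return zeta


# Toy input: the first two weeks strongly favor state 0, the last two state 1.
loglik = [[0.0, -50.0], [0.0, -50.0], [-50.0, 0.0], [-50.0, 0.0]]
Pi = [[0.9, 0.1], [0.1, 0.9]]
zeta = ffbs(loglik, Pi, init=[0.5, 0.5], rng=random.Random(5))
```

Sampling the whole path in one pass, instead of updating each $\zeta_t$ separately, tends to mix better when states are highly persistent, which is the situation suggested by Figure 1.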

3.1. Estimation and prediction

Given a sample from the previous Markov chain Monte Carlo algorithm,

$$\left(\{\Theta_s^{(b)}\}, \{\xi_s^{(b)}\}, \{\zeta_t^{(b)}\}, \{a_{s,O}^{(b)}\}, \{b_{s,O}^{(b)}\}, \{a_{s,D}^{(b)}\}, \{b_{s,D}^{(b)}\}, d_O^{(b)}, e_O^{(b)}, d_D^{(b)}, e_D^{(b)}, \{\alpha_s^{(b)}\}, \{\beta_s^{(b)}\}, \gamma^{(b)}\right), \qquad b = 1, \ldots, B,$$

obtained after an appropriate burn-in period, point and interval estimates for model parameters can be easily obtained by computing the empirical mean and/or the empirical quantiles of the posterior sample. For example, posterior co-clustering probabilities, $\omega_{t,t'} = \Pr(\zeta_t = \zeta_{t'} \mid \{Y_t\})$, can be estimated as

$$\omega_{t,t'} = \Pr(\zeta_t = \zeta_{t'} \mid \{Y_t\}) \approx \frac{1}{B} \sum_{b=1}^{B} I\left(\zeta_t^{(b)} = \zeta_{t'}^{(b)}\right),$$

where $I(\cdot)$ denotes the indicator function. The estimates can be arranged into a co-clustering matrix $[\omega_{t,t'}]$, which can in turn be used to identify the state of the system at each time period through a decision-theoretic approach (e.g., see Lau and Green, 2007). A similar procedure can be used to identify communities in each period.
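The Monte Carlo estimate of the co-clustering matrix is a simple average of indicator matrices. A minimal sketch follows; the function name and the two toy draws are hypothetical.

```python
def co_clustering_matrix(zeta_samples):
    """Estimate omega[t][u] = Pr(zeta_t = zeta_u | data) from B posterior draws."""
    B, T = len(zeta_samples), len(zeta_samples[0])
    counts = [[0] * T for _ in range(T)]
    for zeta in zeta_samples:
        for t in range(T):
            for u in range(T):
                if zeta[t] == zeta[u]:
                    counts[t][u] += 1
    return [[c / B for c in row] for row in counts]


# Two hypothetical posterior draws of the state sequence for T = 3 weeks.
omega = co_clustering_matrix([[0, 0, 1], [0, 1, 1]])
```

Because the estimate depends only on whether two weeks share a label, it is unaffected by relabeling of the states across draws.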

The samples from the posterior distribution can also be used as the basis for prediction. For this purpose, note that the probability that trader $i$ sells at least one security to trader $j$ in the unobserved period $T + 1$ can be estimated by

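$$E(y_{i,j,T+1} \mid \{Y_t\}) \approx \frac{1}{B} \sum_{b=1}^{B} \sum_{s=1}^{S} \pi^{(b)}_{\zeta^{(b)}_T, s} \, \theta^{(b)}_{\xi^{(b)}_{i,s}, \xi^{(b)}_{j,s}, s}.$$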

Using a simple 0/1 utility function, a future sell trade from trader $i$ to trader $j$ is predicted as $\hat{y}_{i,j,T+1} = I(E\{y_{i,j,T+1} \mid \{Y_t\}\} > f)$, for some threshold $f$ that reflects the relative cost associated with false positive and false negative links.
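The prediction rule can be sketched as an average over posterior draws, where each draw weights the state-specific interaction probabilities by the transition probabilities out of the last observed state. The sketch assumes that the average runs over all possible next-period states $s$; the dictionary layout of a draw (`Pi`, `zeta_T`, `xi`, `theta`), the function names, and the toy numbers are hypothetical.

```python
def predictive_probabilities(draws, n):
    """Estimate Pr(y_{i,j,T+1} = 1 | data) by averaging over posterior draws."""
    B = len(draws)
    prob = [[0.0] * n for _ in range(n)]
    for d in draws:
        next_state_probs = d["Pi"][d["zeta_T"]]          # row of Pi for the last observed state
        for s, p_s in enumerate(next_state_probs):
            xi_s, theta_s = d["xi"][s], d["theta"][s]
            for i in range(n):
                for j in range(n):
                    if i != j:
                        prob[i][j] += p_s * theta_s[xi_s[i]][xi_s[j]] / B
    return prob


def predict_links(prob, f):
    """Apply the 0/1 rule: predict a link whenever the probability exceeds f."""
    return [[1 if p > f else 0 for p in row] for row in prob]


# One hypothetical posterior draw with S = 2 states and n = 2 traders.
draw = {
    "Pi": [[0.8, 0.2], [0.5, 0.5]],
    "zeta_T": 0,
    "xi": [[0, 1], [0, 0]],
    "theta": [[[0.1, 0.9], [0.2, 0.3]], [[0.5, 0.0], [0.0, 0.0]]],
}
prob = predictive_probabilities([draw], n=2)
pred = predict_links(prob, f=0.5)
```

Lowering `f` trades more false positive links for fewer false negatives, so in practice the threshold would be chosen to reflect the relative costs mentioned above.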

4. An application to the NYMEX natural gas futures market

In this section we analyze the sequence of $T = 201$ weekly financial trading networks constructed from proprietary trades in the natural gas futures market on the New York Mercantile Exchange (NYMEX) between January 2005 and December 2008 that were introduced in Section 1. As discussed in Section 2, directed binary networks were constructed by setting $y_{i,j,t} = 1$ if there was at least one transaction in which trader $i$ sold a contract to trader $j$ during week $t$. One particularity of this market is that futures were traded on NYMEX only through traditional open-outcry trades until September 5, 2006, and as a hybrid market that included electronic trading conducted via the CME Globex platform after that date.

A total of 970 unique traders participated in proprietary transactions at least once over the four years to December 2008. However, this list includes traders that either abandoned proprietary trading or went bankrupt during the period under study, as well as traders that entered the market after January 2005. Indeed, only between 240 and 340 traders participated in trades each week (see Figure 3). Since we have limited information about the times at which different traders entered or left the market, our analysis focuses on 71 traders we identified as being present in the market (although not necessarily active) during the whole period. Traders were anonymized and are identified in the plots using numbers.

The results presented in this section are based on 100,000 iterations collected after a burn-in period of 10,000 iterations. Convergence of the algorithm was diagnosed using the single-chain approach discussed in Geweke (1992) and by a visual evaluation of trace plots. We monitored the log-likelihood function, as well as the number of active states $S^*$ and the mean and variance over time of the assortativity and transitivity indexes $\{\Upsilon_t\}$ and $\{\chi_t\}$. In terms of hyperparameters, the maximum number of states is set to $S = 30$, $\gamma$ and $\{\beta_s\}$ are assigned exponential priors with unit mean, and the priors for $d_O$, $e_O$, $d_D$ and $e_D$ are exponential distributions with mean 2. This specification implies that, a priori, $E(\Upsilon_t) = 0$ for all $t = 1, \ldots, T$, so that we favor neither assortative nor disassortative trading communities a priori.

Fig. 3. Number of active traders each week in the NYMEX natural gas futures market between January 2005 and December 2008.

Fig. 4. In the left panel, point estimate of the states for the 201 weeks observed for the trading network, with the vertical line indicating the introduction of the electronic platform in week 85. On the right, mean posterior pairwise incidence matrix for the weekly networks, illustrating the uncertainty associated with this point estimate.

4.1. Identifying changes in market microstructure

Figure 4 presents the posterior estimate of the co-clustering matrix for the latent states $\zeta_1, \ldots, \zeta_T$, along with a point estimator for the grouping of networks into states (recall Section 3.1). This point estimator suggests that the structure of the trading networks alternates between four highly persistent states. The first state runs between early January 2005 and early September 2006, when the electronic market is introduced. The presence of a change point at this date is not surprising in light of the descriptive analysis presented in Section 1. The second state runs between early September 2006 and early May 2007, when the system transitions to a new state for a short period of 3 months. After that, the system seems to transition to a fourth state in early August 2007 (interestingly, the beginning of the recent financial crisis), where it stays for 37 weeks before returning to the third state in early June 2008 (which coincides with some of the largest drops in the S&P 500 energy sector index over the last 13 years). Also, it is clear from the heatmap that, although there is some uncertainty associated with this point estimate of the system states (mostly around the times of the transitions between states three and four), this uncertainty is relatively low.

Figure 5 shows estimates of the community structure associated with two different weeks, that of October 11, 2005 ($t = 40$) and that of November 14, 2007 ($t = 145$). We selected these dates because they are representative of states 1 and 4. Note that, although there are some similarities, the overall structure of the communities is quite different. State 1 is characterized by a large group of 25 mostly inactive traders, while all other traders tend to fall, for the most part, into singleton clusters. State 4 also exhibits a number of singleton clusters, but in addition it shows a number of small communities comprising between 5 and 10 traders each.

Figure 6 shows time series plots for the estimates of the assortativity and transitivity indexes $\Upsilon_1, \ldots, \Upsilon_T$ and $\chi_1, \ldots, \chi_T$. Recall that these quantities are model-based alternatives to the assortativity by degree and the clustering coefficient presented in Figure 1. Both sets of plots share some common features, revealing mild assortativity and higher transitivity before September 2006 and highly disassortative networks with lower transitivity afterwards. This makes sense because we would expect that the introduction of an electronic market would limit the effect of social connections among traders (which tend to be assortative and transitive) and favor connections based on differential trading strategies (which tend to be disassortative).

Finally, Table 1 shows point estimates and credible intervals associated with some hyperparameters in the model, both a priori and a posteriori. In all cases, the posterior estimates appear to be more concentrated and to be centered around different values than the prior.


[Figure 5: two heatmap panels titled "Week 40" and "Week 145"; both axes index the traders (1:71); the color scale runs from 0 to 1.]

Fig. 5. Mean posterior pairwise incidence matrices of traders for t = 40 from state 1 and t = 145 from state 4.

Table 1. Prior and posterior point estimates and credible intervals for some model hyperparameters.

Parameter         Posterior mean   Posterior 95% credible interval   Prior mean   Prior 95% credible interval
$\alpha_{40}$     0.748            (0.198, 0.942)                    0.500        (0.025, 0.975)
$\alpha_{100}$    0.524            (0.240, 0.851)                    0.500        (0.025, 0.975)
$\alpha_{145}$    0.587            (0.264, 0.785)                    0.500        (0.025, 0.975)
$\beta_{40}$      1.225            (0.034, 4.526)                    1.000        (0.025, 3.689)
$\beta_{100}$     2.518            (0.107, 7.563)                    1.000        (0.025, 3.689)
$\beta_{145}$     1.223            (0.034, 4.510)                    1.000        (0.025, 3.689)
$\gamma$          0.344            (0.090, 0.814)                    1.000        (0.025, 3.689)


[Figure 6: two time series panels with vertical axes "Assortativity" (ticks from −0.6 to 0.2) and "Transitivity index" (ticks from 0.45 to 0.65); the horizontal axes show dates from 01/03/2005 to 12/24/2008.]

Fig. 6. Time series plot for assortativity and transitivity indexes. The vertical line represents the transitions across states identified from Figure 4.


[Figure 7: the left panel, titled "ROC curves", plots "True positive rate" against "False positive rate", both from 0.0 to 1.0; the right panel plots "AUC" (ticks from 0.5 to 0.9) against "Week" (192 to 201) for the HMM, Blockmodel and Erdős-Rényi models.]

Fig. 7. The left panel shows ten receiver operating characteristic curves associated with one-step-ahead out-of-sample predictions from our hidden Markov model. The right panel shows a time series plot of the area under the receiver operating characteristic curves for three different models.

4.2. Network prediction

As we discussed in the introduction, besides identifying points of change in the microstructure of the market, one of our goals is to predict future trading partnerships. To assess the predictive capabilities of the model we ran an out-of-sample cross-validation exercise where we held out the last ten weeks in the dataset and made one-step-ahead predictions for the structure of the held-out networks. More specifically, for each $t = 191, 192, \ldots, 200$ we use the information contained in $\mathbf{Y}_1, \ldots, \mathbf{Y}_t$ to estimate the model parameters and obtain predictions for $\mathbf{Y}_{t+1}$ for different values of the threshold $f$. Each of these predictions is compared against the observed network $\mathbf{Y}_{t+1}$, the numbers of false and true positives are computed, and a receiver operating characteristic (ROC) curve is constructed. For comparison purposes, the same exercise was repeated first by fitting a single blockmodel (which corresponds to taking $S = 1$) and then by fitting a single Erdős-Rényi model (which corresponds to $S = 1$ and $K_1 = 1$) to the whole collection of networks. Figure 7 shows the ten receiver operating characteristic curves associated with one-step-ahead out-of-sample predictions from our hidden Markov model, along with estimates of the area under the receiver operating characteristic curves (AUC) for all three models. The proposed hidden Markov model shows superior predictive capabilities, with AUCs between 0.85 and 0.9.
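The construction of a ROC curve from thresholded link predictions can be sketched as follows. This is an illustrative sketch only: it assumes the one-step-ahead predictions are available as an n × n matrix of predicted link probabilities (how those probabilities are obtained from the fitted model is not shown), and the function names `roc_points` and `auc` are hypothetical.

```python
import numpy as np

def roc_points(prob, observed):
    """ROC points obtained by sweeping the threshold f over predicted probabilities.

    prob:     n x n array of predicted link probabilities (assumed input).
    observed: n x n binary adjacency matrix of the held-out network.
    """
    n = observed.shape[0]
    off_diag = ~np.eye(n, dtype=bool)          # self-pairs are excluded
    p = prob[off_diag]
    y = observed[off_diag].astype(bool)
    # Thresholds from "predict nothing" (inf) down to the smallest probability.
    thresholds = np.concatenate(([np.inf], np.unique(p)[::-1]))
    fpr = np.array([np.sum((p >= f) & ~y) for f in thresholds]) / max(np.sum(~y), 1)
    tpr = np.array([np.sum((p >= f) & y) for f in thresholds]) / max(np.sum(y), 1)
    return fpr, tpr

def auc(fpr, tpr):
    """Area under the ROC curve by the trapezoidal rule."""
    order = np.lexsort((tpr, fpr))             # sort by fpr, ties broken by tpr
    x, y = fpr[order], tpr[order]
    return float(np.sum(np.diff(x) * (y[1:] + y[:-1]) / 2.0))

# Toy held-out network with three traders and a predictor that ranks links perfectly.
observed = np.array([[0, 1, 0],
                     [0, 0, 1],
                     [1, 0, 0]])
perfect = 0.05 + 0.9 * observed
fpr, tpr = roc_points(perfect, observed)
area = auc(fpr, tpr)
```

A predictor that assigns the same probability to every pair yields only the points (0, 0) and (1, 1), and therefore an area of 0.5, which is the natural baseline against which the AUCs reported above can be read.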


4.3. Sensitivity analysis

To assess the effect of our prior choice on posterior inference we conducted a sensitivity analysis where the model was fitted with somewhat different priors. In particular, we used independent Beta priors with mean 1/10 and variance 9/1100 for each $\alpha_s$, as well as exponential priors with means 1/3 and 3 for each $\beta_s$. Exponential priors with mean 2 were also used for $d_O$, $e_O$, $d_D$ and $e_D$. Although inferences on the community structure were somewhat affected by prior choices, inferences on the state parameters, as well as the assortativity and transitivity indexes and the predictive performance, were essentially unchanged.
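Since the alternative Beta prior is specified through its mean and variance, it may help to recover the implied shape parameters. For a Beta(a, b) distribution the mean is a/(a + b) and the variance is mean × (1 − mean)/(a + b + 1); the short sketch below inverts these two standard formulas using exact fractions. The function name `beta_shapes_from_moments` is a hypothetical helper, not something defined in the paper.

```python
from fractions import Fraction

def beta_shapes_from_moments(mean, var):
    """Return (a, b) for the Beta distribution with the given mean and variance."""
    # mean * (1 - mean) / var equals a + b + 1 for a Beta(a, b) distribution.
    total = mean * (1 - mean) / var - 1
    if total <= 0:
        raise ValueError("variance too large for a Beta distribution with this mean")
    return mean * total, (1 - mean) * total

a, b = beta_shapes_from_moments(Fraction(1, 10), Fraction(9, 1100))
print(a, b)  # prints: 1 9
```

Under these formulas, a mean of 1/10 and a variance of 9/1100 correspond to a Beta(1, 9) distribution.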

5. Discussion

We have presented a class of hidden Markov models for financial trading networks that have clear potential for market regulatory oversight. Key applications of these models include identifying specific events (such as large trader failures or specific changes in market rules) that affect market stability, as well as identifying frequent trading counterparties that might be likely collusion partners or particularly at risk in case of bankruptcies.

In this paper we have focused on models for binary networks where only the presence/absence of transactions over a week is recorded. However, when information about volumes is available, the model can be easily extended to incorporate this information.

Although the use of a hidden Markov model allows us to account for time dependence and is useful for identifying structural changes in the system, a structure that assumes abrupt changes in the network might be too restrictive for predictive purposes. In the future we plan to evaluate models based on fragmentations and coagulations (e.g., see Bertoin, 2006) that allow for smooth evolution in the community structure, as well as extensions of autologistic models that might allow for improved predictions.


A. Appendix A

Here, we provide the details of the MCMC algorithm discussed in Section 3. The algorithm proceeds by updating the model parameters from the following full conditional distributions:

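(a) For each $i = 1, \ldots, n$ and each occupied state $s$, $\xi_{i,s} = k$ with probability

$$
\Pr(\xi_{i,s} = k \mid \cdots, \mathbf{Y}) \propto
\begin{cases}
\left(m^{-i}_{k} - \alpha_s\right) \displaystyle\prod_{l=1}^{K^*_{s,-i}}
\frac{p\left(\{y_{i,j,t} : (i,j,t) \in A^{i}_{k,l,s}\}\right)}{p\left(\{y_{i,j,t} : (i,j,t) \in A^{-i}_{k,l,s}\}\right)} \,
\frac{p\left(\{y_{j,i,t} : (i,j,t) \in A^{i}_{k,l,s}\}\right)}{p\left(\{y_{j,i,t} : (i,j,t) \in A^{-i}_{k,l,s}\}\right)},
& k \le K^*_{s,-i}, \\[2ex]
\left(\beta_s + \alpha_s K^*_{s,-i}\right) \displaystyle\prod_{l=1}^{K^*_{s,-i}}
p\left(\{y_{i,j,t} : (j,t) \in A^{-i}_{l,s}\}\right) \,
p\left(\{y_{j,i,t} : (j,t) \in A^{-i}_{l,s}\}\right),
& k = K^*_{s,-i} + 1,
\end{cases}
$$

where $K^*_{s,-i} = \max_{j \neq i}\{\xi_{j,s}\}$, $m^{-i}_{k} = \sum_{j \neq i} I(\xi_{j,s} = k)$,

$$A^{-i}_{k,l,s} = \{(i', j', t) : i' \neq j' \neq i,\ \zeta_t = s,\ \xi_{i',\zeta_t} = k,\ \xi_{j',\zeta_t} = l\},$$

$$A^{i}_{k,l,s} = \{(i', j', t) : i' = i,\ \zeta_t = s,\ \xi_{j',\zeta_t} = l\} \cup A^{-i}_{k,l,s},$$

$$A^{-i}_{l,s} = \{(j, t) : j \neq i,\ \zeta_t = s,\ \xi_{j,\zeta_t} = l\},$$

and the marginal predictive distribution $p(\{y_{i,j,t} : (i,j,t) \in A\})$ is given by

$$
\frac{\Gamma\left(\sum_{A} y_{i,j,t} + a_s\right) \Gamma\left(|A| + b_s - \sum_{A} y_{i,j,t}\right)}{\Gamma\left(a_s + b_s + |A|\right)} \,
\frac{\Gamma(a_s + b_s)}{\Gamma(a_s)\,\Gamma(b_s)},
$$

where $|A|$ is the number of elements in $A$.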

(b) Since the prior for $\theta_{k,l,s}$ is conditionally conjugate, we update these parameters for $k, l \in \{1, \ldots, K^*_s\}$ by sampling from

$$
\theta_{k,l,s} \mid \cdots, \mathbf{Y} \sim \mathrm{Beta}\left(\sum_{A_{k,l,s}} y_{i,j,t} + a_s,\ m_{k,l,s} + b_s - \sum_{A_{k,l,s}} y_{i,j,t}\right)
$$

for $A_{k,l,s} = \{(i, j, t) : i \neq j,\ \zeta_t = s,\ \xi_{i,\zeta_t} = k,\ \xi_{j,\zeta_t} = l\}$ and $m_{k,l,s} = |A_{k,l,s}|$.

(c) Since the prior for the transition probabilities is conditionally conjugate, the posterior full conditional for $\pi_r$, $r = 1, \ldots, S$, is the Dirichlet distribution

$$
p(\pi_r \mid \cdots, \mathbf{Y}) \propto \prod_{s=1}^{S} \pi_{r,s}^{\gamma/S + n_{rs} - 1}
$$

for $n_{rs} = |\{t : \zeta_{t-1} = r,\ \zeta_t = s\}|$.


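(d) The posterior full conditional of $\gamma$ is

$$
p(\gamma \mid \cdots, \mathbf{Y}) \propto p(\gamma) \prod_{s=1}^{S} \frac{\Gamma(\gamma)}{\Gamma(\gamma + n_s)} \gamma^{L_s}
$$

where $n_s = |\{t : \zeta_t = s\}|$ and $L_s = \sum_{r} I(n_{s,r} > 0)$ for $n_{s,r} = |\{t : \zeta_{t-1} = s,\ \zeta_t = r\}|$. Since this distribution has no standard form, we update $\gamma$ using a random walk Metropolis-Hastings algorithm with symmetric log-normal proposal,

$$
\log\{\gamma^{(p)}\} \mid \gamma^{(c)} \sim \mathrm{Normal}\left(\log\{\gamma^{(c)}\},\ \kappa^2_{\gamma}\right),
$$

where $\kappa^2_{\gamma}$ is a tuning parameter chosen to get an average acceptance rate between 30% and 40%.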

(e) The posterior full conditional of the pairs $(a_{s,O}, b_{s,O})$ and $(a_{s,D}, b_{s,D})$ has the following general form:

$$
p(a_s, b_s \mid \cdots, \mathbf{Y}) \propto p(a_s \mid d)\, p(b_s \mid e) \prod_{k=1}^{S} \prod_{l=1}^{S} p(y_{i,j,t} \mid A_{k,l,s}, m_{k,l,s})
$$

for the marginal predictive $p(y_{i,j,t} \mid A_{k,l,s}, m_{k,l,s})$ as defined in step (a), $A_{k,l,s} = \{(i, j, t) : i \neq j,\ \zeta_t = s,\ \xi_{i,\zeta_t} = k,\ \xi_{j,\zeta_t} = l\}$ and $m_{k,l,s} = |A_{k,l,s}|$. Since no direct sampler is available for this distribution, we update each pair using a random walk Metropolis-Hastings algorithm with bivariate log-normal proposals,

$$
\left(\log\{a_s^{(p)}\}, \log\{b_s^{(p)}\}\right)^{t} \mid \left(a_s^{(c)}, b_s^{(c)}\right)^{t} \sim \mathrm{Normal}\left[\left(\log\{a_s^{(c)}\}, \log\{b_s^{(c)}\}\right)^{t},\ \Sigma_{ab}\right],
$$

where $\Sigma_{ab}$ is a tuning parameter matrix chosen independently for diagonal and off-diagonal pairs of parameters.

(f) The parameters of the Poisson-Dirichlet process $(\alpha_s, \beta_s)$ can be jointly updated using the algorithm described in Escobar and West (1995).

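(g) The posterior full conditional distributions for the hyperparameters $d_O$, $e_O$, $d_D$, and $e_D$ correspond to gamma distributions with shape parameter $(cS^* + 1)$ and rate parameters $\left(\sum_{s=1}^{S^*} a_{s,O} + \lambda_d\right)$, $\left(\sum_{s=1}^{S^*} b_{s,O} + \lambda_e\right)$, $\left(\sum_{s=1}^{S^*} a_{s,D} + \lambda_d\right)$, $\left(\sum_{s=1}^{S^*} b_{s,D} + \lambda_e\right)$, respectively.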
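Some of the updates above can be sketched in a few lines of Python. The sketch covers only the Beta-Bernoulli marginal predictive from step (a), the conjugate draw in step (b), the Dirichlet draw in step (c) and the Metropolis-Hastings update in step (d); it is not the authors' implementation. Several simplifications are assumptions made for illustration: states and communities are labeled from 0 rather than 1, a single pair `(a, b)` is used for all blocks instead of separate diagonal and off-diagonal pairs, and an exponential prior with mean 1 is assumed for $\gamma$. All function names are hypothetical.

```python
from math import lgamma, log, exp
import numpy as np

def log_marginal(ones, m, a, b):
    """Log Beta-Bernoulli marginal predictive: `ones` ones among m binary entries."""
    return (lgamma(ones + a) + lgamma(m - ones + b) - lgamma(a + b + m)
            + lgamma(a + b) - lgamma(a) - lgamma(b))

def block_counts(Y, zeta, xi_s, s):
    """Ones and sizes of the blocks A_{k,l,s}; Y has shape (T, n, n)."""
    weeks = np.flatnonzero(zeta == s)
    K = int(xi_s.max()) + 1                        # labels assumed to be 0..K-1
    off_diag = ~np.eye(Y.shape[1], dtype=bool)     # pairs with i != j only
    ones, m = np.zeros((K, K)), np.zeros((K, K))
    for k in range(K):
        for l in range(K):
            mask = np.outer(xi_s == k, xi_s == l) & off_diag
            ones[k, l] = Y[weeks][:, mask].sum()
            m[k, l] = mask.sum() * len(weeks)
    return ones, m

def sample_theta(ones, m, a, b, rng):
    """Step (b): conjugate Beta draw for every block probability."""
    return rng.beta(ones + a, m - ones + b)

def transition_counts(zeta, S):
    """n_{rs}: number of transitions from state r to state s."""
    counts = np.zeros((S, S))
    for prev, curr in zip(zeta[:-1], zeta[1:]):
        counts[prev, curr] += 1
    return counts

def sample_transitions(zeta, S, gamma, rng):
    """Step (c): Dirichlet draw for each row of the transition matrix."""
    counts = transition_counts(zeta, S)
    return np.vstack([rng.dirichlet(gamma / S + counts[r]) for r in range(S)])

def mh_step_gamma(gamma, zeta, S, kappa, rng, log_prior=lambda g: -g):
    """Step (d): one random-walk update of gamma on the log scale.

    The default log_prior is an exponential prior with mean 1 (an assumption).
    """
    n = np.bincount(zeta, minlength=S)                  # n_s
    L = (transition_counts(zeta, S) > 0).sum(axis=1)    # L_s

    def log_target(g):
        return log_prior(g) + sum(lgamma(g) - lgamma(g + n_s) + L_s * log(g)
                                  for n_s, L_s in zip(n, L))

    proposal = gamma * exp(kappa * rng.standard_normal())
    # The log(proposal) - log(gamma) term is the Jacobian of the log transform.
    log_ratio = (log_target(proposal) - log_target(gamma)
                 + log(proposal) - log(gamma))
    return proposal if rng.uniform() < exp(min(0.0, log_ratio)) else gamma
```

In practice the value of `kappa` would be tuned over pilot runs until the acceptance rate falls in the range mentioned in step (d). Steps (e) through (g) are not covered by this sketch.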


References

Adamic, L., C. Brunetti, J. H. Harris, and A. A. Kirilenko (2010). Trading networks. Technical report, MIT Sloan School of Management.

Airoldi, E., D. M. Blei, S. E. Fienberg, and E. P. Xing (2008). Mixed membership stochastic blockmodels. Journal of Machine Learning Research 9, 1981–2014.

Aldous, D. J. (1981). Representations for partially exchangeable arrays of random variables. Journal of Multivariate Analysis 11, 581–598.

Ang, A. and G. Bekaert (2002a). International asset allocation with regime shifts. Review of Financial Studies 15, 1137–1187.

Ang, A. and G. Bekaert (2002b). Regime switches in interest rates. Journal of Business & Economic Statistics 20, 163–182.

Bertoin, J. (2006). Random fragmentation and coagulation processes. Cambridge Studies in Advanced Mathematics. Cambridge University Press.

Boyd, N., B. Betancourt, and A. Rodríguez (2013). On the effect of electronic trading on financial market microstructure. Technical report, University of California, Santa Cruz.

Boyd, N., J. Harris, and A. Nowak (2011). The role of speculators during periods of financial distress. Journal of Alternative Investments 14, 10–25.

Erdős, P. and A. Rényi (1959). On random graphs. Publicationes Mathematicae 6, 290–297.

Escobar, M. and M. West (1995). Bayesian density estimation and inference using mixtures. Journal of the American Statistical Association 90, 577–588.

Filardo, A. (1994). Business cycle phases and their transitional dynamics. Journal of Business & Economic Statistics 12, 299–308.

Frank, O. and D. Strauss (1986). Markov graphs. Journal of the American Statistical Association 81, 832–842.

Geweke, J. (1992). Evaluating the accuracy of sampling-based approaches to the calculation of posterior moments. In J. M. Bernardo, J. Berger, A. P. Dawid, and A. F. M. Smith (Eds.), Bayesian Statistics 4, Oxford, pp. 169–193. Oxford University Press.

Green, P. and S. Richardson (2001). Modelling heterogeneity with and without the Dirichlet process. Scandinavian Journal of Statistics 28, 355–375.

Guidolin, M. and A. Timmermann (2005). Economic implications of bull and bear regimes in UK stock and bond returns. The Economic Journal 115, 111–143.


Harris, J., W. Christie, and P. Schultz (1994). Why did NASDAQ market makers stop avoiding odd eighth quotes? Journal of Finance 49, 1841–1860.

Hatfield, J. W., S. D. Kominers, A. Nichifor, M. Ostrovsky, and A. Westkamp (2012). Stability and competitive equilibrium in trading networks. Technical report, Stanford University.

Hoff, P., A. Raftery, and M. Handcock (2002). Latent space approaches to social network analysis. Journal of the American Statistical Association 97, 1090–1098.

Holland, P. and K. Leinhardt (1981). An exponential family of probability distributions for directed graphs. Journal of the American Statistical Association 76, 33–65.

Kemp, C., J. B. Tenenbaum, T. Griffiths, T. Yamada, and N. Ueda (2006). Learning systems of concepts with an infinite relational model. In Proceedings of the 22nd Annual Conference on Artificial Intelligence (UAI 2006), Cambridge, MA, USA.

Kim, C., J. Morley, and C. Nelson (2001). Does an international tradeoff between risk and return explain mean reversion in stock prices? Journal of Empirical Finance 8, 403–426.

Lau, J. W. and P. Green (2007). Bayesian model based clustering procedures. Journal of Computational and Graphical Statistics 16, 526–558.

Neal, R. M. (2000). Markov chain sampling methods for Dirichlet process mixture models. Journal of Computational and Graphical Statistics 9, 249–265.

Newman, M. E. J. (2003). The structure and function of complex networks. SIAM Review 45, 167–256.

Nowicki, K. and T. A. B. Snijders (2001). Estimation and prediction for stochastic blockstructures. Journal of the American Statistical Association 96, 1077–1087.

Ozsoylev, H. N., J. Walden, M. D. Yavuz, and R. Bildik (2010). Investor networks in the stock market. Technical report, University of California, Berkeley.

Perez-Quiros, G. and A. Timmermann (2000). Firm size and cyclical variations in stock returns. Journal of Finance 55, 1229–1262.

Pitman, J. (1995). Exchangeable and partially exchangeable random partitions. Probability Theory and Related Fields 102, 145–158.

Pitman, J. and M. Yor (1997). The two-parameter Poisson-Dirichlet distribution derived from a stable subordinator. Annals of Probability 25(2), 855–900.


Rabiner, L. (1986). An introduction to hidden Markov models. IEEE ASSP Magazine, 4–15.

Robert, C. and G. Casella (2005). Monte Carlo Statistical Methods (2nd ed.). New York: Springer.

Rodríguez, A. (2011). Modeling the dynamics of social networks using Bayesian hierarchical blockmodels. Statistical Analysis and Data Mining, in press.

Rodríguez, A. and P. E. Reyes (2013). On the statistical properties of random blockmodels for network data. Technical report, University of California, Santa Cruz.

Rydén, T., T. Teräsvirta, and S. Åsbrink (1998). Stylized facts of daily return series and the hidden Markov model. Journal of Applied Econometrics 13, 217–244.

Teh, Y. W., M. I. Jordan, M. J. Beal, and D. M. Blei (2006). Sharing clusters among related groups: Hierarchical Dirichlet processes. Journal of the American Statistical Association 101, 1566–1581.

Wang, Y. and G. Wong (1987). Stochastic blockmodels for directed graphs. Journal of the American Statistical Association 82, 8–19.

Wasserman, S. and K. Faust (1994). Social Network Analysis: Methods and Applications. New York: Cambridge University Press.

Xu, Z., V. Tresp, K. Yu, and H. Kriegel (2006). Infinite hidden relational models. In Proceedings of the 22nd Annual Conference on Artificial Intelligence (UAI 2006), Cambridge, MA, USA.

Yau, C., O. Papaspiliopoulos, G. O. Roberts, and C. C. Holmes (2011). Bayesian non-parametric hidden Markov models with applications in genomics. Journal of the Royal Statistical Society, Series B 73, 37–57.

Zaloom, C. (2004). Time, space and technology in financial networks. In M. Castells (Ed.), The Network Society: A Cross-cultural Perspective, pp. 198–216. Edward Elgar Publishing Limited.
