
Post on 28-Mar-2015


A new family of regular semivalues and applications

Roberto Lucchetti, Politecnico di Milano, Italy

R.Lucchetti Politecnico di Milano 2

Main goal:

To rank genes from DNA data provided by Microarray Analysis.

Tools: Cooperative Game Theory, in particular power indices.

Power indices rank players according to their “strength” in the game.

In the EU Council the strongest states (GE, FR, IT, UK) have about 10 times the power of the weakest state (MT).

In the UN Security Council the veto players have about 100 (10) times the power of the non-permanent members, according to the Shapley (Banzhaf) index.


A (TU) game is a pair $(N, v)$ with $v: 2^N \to \mathbb{R}$ and $v(\emptyset) = 0$, where

N={1,…,n} is the set of players,

v is the characteristic function of the game.

A subset $A \subseteq N$ is called a coalition.

v(A) is the utility (or cost) for the coalition A.

$G_N$ represents the set of all games having N as set of players.

Remark:

$G_N \cong \mathbb{R}^{2^n - 1}$


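A basis for $G_N$:

Unanimity games $u_S$, $\emptyset \neq S \subseteq N$: $u_S(A) = 1$ if $S \subseteq A$, and $u_S(A) = 0$ otherwise.

Subclass of games:

Simple games. Among them the weighted majority games $[q; w_1, \dots, w_n]$: $v(A) = 1$ if $\sum_{i \in A} w_i \geq q$, and $v(A) = 0$ otherwise.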
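As a minimal sketch of these definitions, a game can be represented in Python as a function from coalitions (sets of players) to numbers. The helper names `unanimity_game`, `weighted_majority_game` and `coalitions` are illustrative and do not come from the slides; the weighted majority rule assumed here is the usual one, where a coalition wins when its total weight reaches the quota q.

```python
from itertools import combinations


def unanimity_game(S):
    """Unanimity game u_S: a coalition A is worth 1 iff it contains all of S."""
    S = frozenset(S)
    return lambda A: 1 if S <= frozenset(A) else 0


def weighted_majority_game(q, weights):
    """Weighted majority game [q; w_1, ..., w_n] on players 1..n (assumed rule)."""
    def v(A):
        return 1 if sum(weights[i - 1] for i in A) >= q else 0
    return v


def coalitions(players):
    """All subsets of `players`, the empty coalition included."""
    players = sorted(players)
    for size in range(len(players) + 1):
        for combo in combinations(players, size):
            yield frozenset(combo)


N = {1, 2, 3}
u = unanimity_game({1, 2})
v = weighted_majority_game(2, [1, 1, 1])

# Winning coalitions of [2; 1, 1, 1]: every coalition with at least two players.
winning = [sorted(A) for A in coalitions(N) if v(A) == 1]
```

Since every game vanishes on the empty coalition, a game on n players is fixed by its values on the remaining 2^n - 1 coalitions, which is the dimension count in the remark above.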

Introduction: how an array works

A chip can contain millions of DNA probes

Introduction: how a microarray works

Hybridization

When a single DNA strand meets a single mRNA strand, if they are complementary they will stick to each other.

Hybridization helps researchers identify which RNA sequences are present in a sample, which in turn tells them which genes the organism is expressing and how strongly.


GeneChip microarrays use the natural chemical attraction between the RNA target (from the sample preparation) and the DNA on the array to determine the expression level of a given gene.

Adenine (A)

Guanine (G)

Thymine (T)/Uracil (U)

Cytosine (C)



The RNA extracted from a sample is copied into cRNA (through a process known as PCR). Copying the RNA allows it to be more easily detected on the array. While the RNA is being copied, a fluorescent chemical molecule called biotin is attached to the strand. This molecule will show where the sample RNA has stuck to the DNA probe on the array.


If the gene is highly expressed, many RNA molecules will stick to the probe and the probe location will shine brightly when the laser hits it.

If the sample RNA doesn’t match, it will be rejected by the probe on the array, and when the laser hits the probe nothing glows.


The whole point of microarray gene expression analysis is to compare expression levels among different samples. Let’s simplify the situation with an example in which we have four genes and two samples.

Gene1: 2RUDE
Gene2: 2LOUD
Gene3: GETOUT
Gene4: FATMET

Gene4 is not glowing.

        array 1  array 2  array 3  array 4  …
gene 1  0,67     0,45     1,32     1,34     …
gene 2  1,01     1,13     1,54     2,13     …
gene 3  1,38     1,21     1,23     0,12     …
gene 4  0,65     0,98     0,54     …        …
gene 5  0,17     1,32     2,43     …        …
…       …        …        …        …        …

Example: 0,98 is the expression level of gene 4 in array 2.


The Microarray Game:

An mxn Boolean matrix M such that

Given the column , supp


       Sample 1  Sample 2  Sample 3
gene1  0.5       0.2       1
gene2  0.4       1         0.3
gene3  0.8       0.4       0.2

       Sample 1  Sample 2  Sample 3  Sample 4
gene1  0.7       0.3       1.8       0.8
gene2  0.1       0.2       0.5       0.9
gene3  1         0.6       1.7       0.1

       Sample 1  Sample 2  Sample 3  Sample 4
gene1  0         0         1         0
gene2  1         1         0         0
gene3  1         0         1         1
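The three tables are consistent with the following reading, which is an assumption because the transcript does not label them: the first table holds reference (normal) samples, the second holds the samples under study, and a cell of the Boolean matrix is 1 when the studied value falls outside the [min, max] range of that gene over the reference samples. The sketch below reproduces the Boolean matrix under that assumption and then evaluates a microarray game in the sense of the Moretti, Patrone and Bonassi reference listed at the end of the talk, where v(T) is taken to be the fraction of samples whose support (the genes marked 1) is non-empty and contained in T. The function names are hypothetical.

```python
normal = {"gene1": [0.5, 0.2, 1.0], "gene2": [0.4, 1.0, 0.3], "gene3": [0.8, 0.4, 0.2]}
tumor = {"gene1": [0.7, 0.3, 1.8, 0.8], "gene2": [0.1, 0.2, 0.5, 0.9], "gene3": [1.0, 0.6, 1.7, 0.1]}


def binarize(normal, tumor):
    """Mark a cell 1 when the value lies outside the gene's normal [min, max] range."""
    boolean = {}
    for gene, values in tumor.items():
        low, high = min(normal[gene]), max(normal[gene])
        boolean[gene] = [0 if low <= x <= high else 1 for x in values]
    return boolean


def supports(boolean):
    """For each sample (column), the set of genes marked 1."""
    n_samples = len(next(iter(boolean.values())))
    return [frozenset(g for g, row in boolean.items() if row[j] == 1)
            for j in range(n_samples)]


def microarray_game(boolean):
    """v(T) = fraction of samples whose non-empty support is contained in T."""
    supps = supports(boolean)

    def v(T):
        T = frozenset(T)
        return sum(1 for s in supps if s and s <= T) / len(supps)
    return v


boolean = binarize(normal, tumor)
v = microarray_game(boolean)
```

With these data the supports are {gene2, gene3}, {gene2}, {gene1, gene3} and {gene3}, so the game is the average of four unanimity games.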


A power index for the game (N,v) is (x1,…,xn) such that:

xi represents the power of player i in game v.

weighted voting does not work…

The most famous:

Shapley ($\varphi$) and Banzhaf ($\beta$).


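$m_i(S) = v(S \cup \{i\}) - v(S)$ is the marginal contribution of i to $S \cup \{i\}$.

Shapley ($\varphi$) and Banzhaf ($\beta$):

$\varphi_i(v) = \sum_{S \subseteq N \setminus \{i\}} \frac{s!\,(n-s-1)!}{n!}\, m_i(S)$, with $s = |S|$

$\beta_i(v) = \frac{1}{2^{n-1}} \sum_{S \subseteq N \setminus \{i\}} m_i(S)$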


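$\pi$ is a probabilistic value if, for each player i, there is a probability $p_i$ on $2^{N \setminus \{i\}}$ such that

$\pi_i(v) = \sum_{S \subseteq N \setminus \{i\}} p_i(S)\, [v(S \cup \{i\}) - v(S)]$

Shapley: $p_i(S) = \frac{s!\,(n-s-1)!}{n!}$, with $s = |S|$

Banzhaf: $p_i(S) = \frac{1}{2^{n-1}}$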
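A brute-force sketch can make the probabilistic-value formula concrete. It sums weighted marginal contributions over all coalitions not containing the player, using the standard Shapley weights s!(n-s-1)!/n! and Banzhaf weights 1/2^(n-1). The voting rule coded here for the UN Security Council (5 veto players, 10 non-permanent members, 9 votes needed including all veto players) is supplied as an assumption; the slides quote only the resulting power ratios.

```python
from itertools import combinations
from math import factorial

N_PLAYERS = 15
PLAYERS = list(range(N_PLAYERS))
PERMANENT = frozenset(range(5))   # assumption: players 0..4 hold a veto


def v(coalition):
    """1 if every veto player and at least 9 members in total agree."""
    return 1 if PERMANENT <= coalition and len(coalition) >= 9 else 0


def probabilistic_value(i, weight):
    """Sum of weight(|S|) * [v(S + i) - v(S)] over all S not containing i."""
    others = [p for p in PLAYERS if p != i]
    total = 0.0
    for size in range(len(others) + 1):
        w = weight(size)
        for combo in combinations(others, size):
            S = frozenset(combo)
            total += w * (v(S | {i}) - v(S))
    return total


def shapley_weight(s):
    return factorial(s) * factorial(N_PLAYERS - 1 - s) / factorial(N_PLAYERS)


def banzhaf_weight(s):
    return 1 / 2 ** (N_PLAYERS - 1)


# Player 0 is a veto player, player 5 a non-permanent member.
shapley_ratio = probabilistic_value(0, shapley_weight) / probabilistic_value(5, shapley_weight)
banzhaf_ratio = probabilistic_value(0, banzhaf_weight) / probabilistic_value(5, banzhaf_weight)
```

Under these assumptions the ratios come out near 105 for Shapley and near 10 for Banzhaf, in line with the figures quoted at the start of the talk. Enumeration is exponential in the number of players, so this only suits small games.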


If $p_i(S) = p(|S|) > 0$, the probabilistic value is called a regular semivalue.

Examples:

Banzhaf, Shapley, p-binomial.

Regular semivalues are points in the simplex:

$\{(p_0, \dots, p_{n-1}) : p_s > 0,\ \sum_{s=0}^{n-1} \binom{n-1}{s} p_s = 1\}$


Properties for power indices

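Let $\psi: G_N \to \mathbb{R}^n$ be a solution.

The solution has the dummy player (DP) property if, for each player i such that $v(A \cup \{i\}) = v(A) + v(\{i\})$ for all coalitions A not containing i, $\psi_i(v) = v(\{i\})$.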


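Let $\theta: N \to N$ be a permutation.

Given the game v, denote by $\theta v$ the game $(\theta v)(A) = v(\theta(A))$

and by $\theta^*(x_1, \dots, x_n) = (x_{\theta(1)}, \dots, x_{\theta(n)})$.

The solution, written $\psi$ here, has the symmetry (S) property if, for each permutation $\theta$ as above, $\psi(\theta v) = \theta^*(\psi(v))$.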


The new family of power indices

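Let $a > 0$.

Define $\sigma^a$ on the unanimity game $u_S$ as $\sigma^a_i(u_S) = 1/s^a$ if $i \in S$ and $\sigma^a_i(u_S) = 0$ otherwise, where $s = |S|$,

and extend it by linearity on a generic $v \in G_N$.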
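The definition by linearity can be sketched in code: decompose a game in the unanimity basis, then apply the rule on each unanimity game. The defining formula is missing from this transcript, so the sketch assumes that the value with parameter a gives 1/s^a to each member of S in the unanimity game on S (s being the size of S) and 0 to everyone else; this agrees with a_1 = 1 in Theorem 1 and reduces to the Shapley value for a = 1, but it remains an assumption. The names `dividends` and `sigma` are hypothetical.

```python
from itertools import combinations


def nonempty_subsets(players):
    """Non-empty subsets of `players`, in order of increasing size."""
    players = sorted(players)
    for size in range(1, len(players) + 1):
        for combo in combinations(players, size):
            yield frozenset(combo)


def dividends(players, v):
    """Coefficients c_S such that v = sum over S of c_S * u_S."""
    c = {}
    for S in nonempty_subsets(players):
        # All proper subsets of S were handled earlier (smaller size first).
        c[S] = v(S) - sum(c[T] for T in nonempty_subsets(S) if T != S)
    return c


def sigma(players, v, a):
    """Assumed rule: each member of S receives c_S / |S|**a, summed over all S."""
    c = dividends(players, v)
    return {i: sum(c_S / len(S) ** a for S, c_S in c.items() if i in S)
            for i in players}


# Toy microarray game: average of the unanimity games on four sample supports.
SUPPORTS = [frozenset({2, 3}), frozenset({2}), frozenset({1, 3}), frozenset({3})]


def v(T):
    return sum(1 for s in SUPPORTS if s <= frozenset(T)) / len(SUPPORTS)


shapley = sigma({1, 2, 3}, v, a=1)   # {1: 0.125, 2: 0.375, 3: 0.5}
sigma2 = sigma({1, 2, 3}, v, a=2)    # {1: 0.0625, 2: 0.3125, 3: 0.375}
```

Both parameters rank gene 3 first in this toy game, but a = 2 penalises genes that only matter inside larger supports more heavily, and the resulting vector no longer sums to v(N).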


Theorem 1

There exists one and only one value fulfilling the symmetry, linearity and dummy player properties, and assigning $a_s$ to all non-null players in the unanimity game $u_S$ (with $s = |S|$), where $a_1 = 1$ and $a_s > 0$ for $s = 2, \dots, n$.

fulfills the formula:


Theorem 2

$\sigma^a$ is a regular semivalue for all $a > 0$. $\sigma^2$ fulfills the formula:

Corollary

The family of the weighting coefficients of the values $\sigma^a$, $a > 0$, is an open curve in the simplex of the regular semivalues, containing the Shapley value. The addition of the Banzhaf value to the curve provides a one-point compactification of the curve.


Theorem 3 study of the term:

Key tool

Let , let

Then

Moreover, for all natural l, and positive real a,x:

Finally, for each natural m, the following formula holds:


Calculating the indices in weighted majority games

Consider the number of ways in which the sum of the weights of j players different from i can give k. Then the following proposition holds.

Consider the value defined in the theorem above. Let q be a positive integer, and let w1,…,wn be nonnegative integers.

Let v=[q;w1,…,wn] be the associated weighted majority game. Then the following formula holds:

An efficient algorithm based on generating functions and formal series allows for a fast calculation of the coefficients.
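The counting quantity described above can be tabulated by a small dynamic programme that mimics multiplying generating-function factors (1 + x·y^w), one per player other than i. This is a generic sketch, not necessarily the algorithm the authors refer to, and the slide's own formula is missing from this transcript. The sketch relies on the standard swing identity: player i is decisive for a coalition of the others exactly when their total weight k satisfies q - w_i <= k <= q - 1. Players are indexed from 0, and `count_table` and `semivalue` are hypothetical names.

```python
from math import factorial


def count_table(weights, i):
    """c[j][k] = number of sets of j players other than i whose weights sum to k."""
    others = [w for p, w in enumerate(weights) if p != i]
    total = sum(others)
    c = [[0] * (total + 1) for _ in range(len(others) + 1)]
    c[0][0] = 1
    for w in others:
        # Go through sizes downwards so each player is used at most once.
        for j in range(len(others) - 1, -1, -1):
            for k in range(total - w + 1):
                if c[j][k]:
                    c[j + 1][k + w] += c[j][k]
    return c


def semivalue(q, weights, i, p):
    """Index of player i in [q; weights] for size-dependent weights p(j)."""
    c = count_table(weights, i)
    lo = max(0, q - weights[i])
    hi = min(q - 1, len(c[0]) - 1)
    return sum(p(j) * sum(row[lo:hi + 1]) for j, row in enumerate(c))


weights, q = [50, 49, 1], 51
n = len(weights)


def shapley_p(j):
    return factorial(j) * factorial(n - 1 - j) / factorial(n)


def banzhaf_p(j):
    return 1 / 2 ** (n - 1)


shapley = [semivalue(q, weights, i, shapley_p) for i in range(n)]
banzhaf = [semivalue(q, weights, i, banzhaf_p) for i in range(n)]
```

In this example the Shapley values are 2/3, 1/6 and 1/6, so the player with weight 1 is as strong as the player with weight 49. The table has size (n-1) × (sum of weights), which stays manageable where enumerating all 2^(n-1) coalitions does not.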


Applications

The EU


SY = Shapley, S2 = sigma2, BF = Banzhaf; the last three columns give each index relative to MT.

State  SY        S2        BF        SY(I)/MT    S2(I)/MT     BF(I)/MT
GE     0,086738  0,02797   0,032688  10,6066383  9,815722703  8,260803639
FR     0,086738  0,02797   0,032688  10,6066383  9,815722703  8,260803639
IT     0,086738  0,02797   0,032688  10,6066383  9,815722703  8,260803639
UK     0,086738  0,02797   0,032688  10,6066383  9,815722703  8,260803639
SP     0,079975  0,025999  0,031164  9,77960769  9,123884457  7,875663381
PL     0,079975  0,025999  0,031164  9,77960769  9,123884457  7,875663381
RO     0,039937  0,013476  0,017889  4,88360405  4,729163962  4,520849128
NL     0,036825  0,012476  0,016691  4,5031054   4,378366807  4,218094516
BE     0,034068  0,011555  0,015475  4,16600531  4,055048061  3,910791003
CZ     0,034068  0,011555  0,015475  4,16600531  4,055048061  3,910791003
GR     0,034068  0,011555  0,015475  4,16600531  4,055048061  3,910791003
HU     0,034068  0,011555  0,015475  4,16600531  4,055048061  3,910791003
PT     0,034068  0,011555  0,015475  4,16600531  4,055048061  3,910791003
SE     0,028193  0,00961   0,012989  3,44756282  3,372390341  3,282537276
AU     0,028193  0,00961   0,012989  3,44756282  3,372390341  3,282537276
BG     0,028193  0,00961   0,012989  3,44756282  3,372390341  3,282537276
FI     0,019606  0,006721  0,00916   2,39749856  2,358602005  2,314885014
DK     0,019606  0,006721  0,00916   2,39749856  2,358602005  2,314885014
SK     0,019606  0,006721  0,00916   2,39749856  2,358602005  2,314885014
IR     0,019606  0,006721  0,00916   2,39749856  2,358602005  2,314885014
LT     0,019606  0,006721  0,00916   2,39749856  2,358602005  2,314885014
LV     0,011042  0,003813  0,005251  1,35024683  1,338033557  1,327015416
SLO    0,011042  0,003813  0,005251  1,35024683  1,338033557  1,327015416
CY     0,011042  0,003813  0,005251  1,35024683  1,338033557  1,327015416
ES     0,011042  0,003813  0,005251  1,35024683  1,338033557  1,327015416
LU     0,011042  0,003813  0,005251  1,35024683  1,338033557  1,327015416
MT     0,008178  0,00285   0,003957  1           1            1


The power indices on the 56 genes that appear among the first 100 in the ranking of every index. Data from 40 tumor samples vs 22 normal samples, 2000 genes.

[Figure: x-axis "Genes" (0 to 60), y-axis "Normalized values" (0 to 0,7); series: sigma2 (10^-4), sigma3 (10^-6), Banzhaf (10^-13), Shapley (10^-2)]


Data from colorectal cancer: 10 healthy and 12 tumoral tissues.

An extended microarray game also considers how much the genes are abnormally expressed w.r.t. a normality interval. Given the normality interval $[m_i, M_i]$ of gene i and its standard deviation $s_i$, let $N_i^k = [m_i - k s_i, M_i + k s_i]$; assign k to cell ij of the matrix if the value of gene i in patient j falls in $N_i^k \setminus N_i^{k-1}$.
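A short sketch of this graded discretization follows. It assumes the k-th interval widens the normality interval on both sides, from m - k·s to M + k·s, so that level 0 is the normality interval itself; the transcript prints the upper end with m rather than M, which is read here as a typo. The function name and the sample numbers are hypothetical.

```python
import math


def abnormality_level(x, m, M, s):
    """Smallest k >= 0 such that m - k*s <= x <= M + k*s (requires s > 0)."""
    if m <= x <= M:
        return 0
    # Distance from x to the nearest end of the normality interval [m, M].
    distance = m - x if x < m else x - M
    return math.ceil(distance / s)


# Hypothetical gene with normality interval [10, 20] and standard deviation 2.
levels = [abnormality_level(x, m=10, M=20, s=2) for x in (15, 22, 23, 7, 20)]
```

A value exactly on the edge of an interval gets the smaller level, since the intervals are closed. Setting every positive level to 1 gives back a Boolean matrix.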

A weighted Shapley value is used to rank genes. This differentiates the genes better. Taking the first 100 genes in the ranking, the game is formed as an average of weighted majority games. Then we calculate the Shapley, Banzhaf and $\sigma^2$ indices.


Gene expression analysis was performed by using Human Genome U133A-Plus 2.0 GeneChip arrays (Affymetrix, Inc., Calif).

The following 7 genes are quoted in the medical literature as having great importance in the onset of the disease: CYR61, UCHL1, FOS, FOSB, EGR1, VIP, KRT24.

One of them was ranked around the 100th position by the weighted Shapley value. All the others are among the first 50 and played the subsequent game.

Gene   S   B   sigma2
FOSB   2   1   1
CYR61  1   2   2
FOS    3   3   3
VIP    5   5   6
EGR1   10  9   9
KRT24  45  35  35


References

R. Lucchetti, P. Radrizzani, E. Munarini, A new family of regular semivalues and applications, Int. J. of Game Theory, DOI 10.1007/s00182-010-0263-5.

R. Lucchetti, S. Moretti, F. Patrone, P. Radrizzani, The Shapley and Banzhaf indices in microarray games, Computers and Operations Research 37 (2010), p. 1406-1412.

R. Lucchetti, P. Radrizzani, Microarray Data Analysis Via Weighted Indices and Weighted Majority Games, Computational Intelligence Methods for Bioinformatics and Biostatistics II, Masulli, Peterson, Tagliaferri (Eds), Lecture Notes in Computer Science, Springer (2010), p. 179-190.

S. Moretti, F. Patrone, S. Bonassi, The class of microarray games and the relevance index for genes, TOP 15 (2007), p. 256-280.

D. Albino, P. Scaruffi, S. Moretti, S. Coco, C. Di Cristofano, A. Cavazzana, M. Truini, S. Stigliani, S. Bonassi, G.P. Tonini, Stroma poor and stroma rich gene signatures show a low intratumoral gene expression heterogeneity in Neuroblastic tumors, Cancer 113 (2008), p. 1412-1422.
