A low variance consistent test of relative dependency

Wacha Bounliphone, Arthur Gretton, Arthur Tenenhaus, Matthew Blaschko

32nd International Conference on Machine Learning 2015

CVN – L2S Gatsby Unit Galen Team

Outline: Introduction, Test of relative dependency, Experiments

Motivation questions

Tests of dependence: Spearman's ρ, Kendall's τ, kernel measures of covariance and correlation, distance covariance, ...

However, there may be multiple dependencies: is the dependency between English and Dutch stronger than the dependency between English and Spanish?

$H_0$: Dep(English, Dutch) ≤ Dep(English, Spanish); p-value $< 10^{-4}$

Detecting statistical dependence

- How do you detect dependence in structured data?

X1: Conscious of its spiritual and moral heritage, the Union is founded on the indivisible, universal values of human dignity, freedom, equality and solidarity; it is based on the principles of democracy and the rule of law. It places the individual at the heart of its activities, by establishing the citizenship of the Union and by creating an area of freedom, security and justice.

Y1: In dem Bewusstsein ihres geistig-religiösen und sittlichen Erbes gründet sich die Union auf die unteilbaren und universellen Werte der Würde des Menschen, der Freiheit, der Gleichheit und der Solidarität. Sie beruht auf den Grundsätzen der Demokratie und der Rechtsstaatlichkeit. Sie stellt den Menschen in den Mittelpunkt ihres Handelns, indem sie die Unionsbürgerschaft und einen Raum der Freiheit, der Sicherheit und des Rechts begründet.

Z1: Consciente de su patrimonio espiritual y moral, la Unión está fundada sobre los valores indivisibles y universales de la dignidad humana, la libertad, la igualdad y la solidaridad, y se basa en los principios de la democracia y el Estado de Derecho. Al instituir la ciudadanía de la Unión y crear un espacio de libertad, seguridad y justicia, sitúa a la persona en el centro de su actuación.

X2: The Union contributes to the preservation and to the development of these common values while respecting the diversity of the cultures and traditions of the peoples of Europe as well as the national identities of the Member States and the organization of their public authorities at national, regional and local levels; it seeks to promote balanced and sustainable development and ensures free movement of persons, services, goods and capital, and the freedom of establishment.

Y2: Die Union trägt zur Erhaltung und zur Entwicklung dieser gemeinsamen Werte unter Achtung der Vielfalt der Kulturen und Traditionen der Völker Europas sowie der nationalen Identität der Mitgliedstaaten und der Organisation ihrer staatlichen Gewalt auf nationaler, regionaler und lokaler Ebene bei. Sie ist bestrebt, eine ausgewogene und nachhaltige Entwicklung zu fördern und stellt den freien Personen-, Dienstleistungs-, Waren- und Kapitalverkehr sowie die Niederlassungsfreiheit sicher.

Z2: La Unión contribuye a defender y fomentar estos valores comunes dentro del respeto de la diversidad de culturas y tradiciones de los pueblos de Europa, así como de la identidad nacional de los Estados miembros y de la organización de sus poderes públicos a escala nacional, regional y local; trata de fomentar un desarrollo equilibrado y sostenible y garantiza la libre circulación de personas, servicios, mercancías y capitales, así como la libertad de establecimiento.

X (English documents) → K = [Figure: kernel matrix]

Y (German documents) → L = [Figure: kernel matrix]

Detecting statistical dependence

X1: Conscious of its spiritual and moral heritage, the Union is founded on the indivisible, universal values of human dignity, freedom, equality and solidarity; it is based on the principles of democracy and the rule of law. It places the individual at the heart of its activities, by establishing the citizenship of the Union and by creating an area of freedom, security and justice.

X2: The Union contributes to the preservation and to the development of these common values while respecting the diversity of the cultures and traditions of the peoples of Europe as well as the national identities of the Member States and the organization of their public authorities at national, regional and local levels; it seeks to promote balanced and sustainable development and ensures free movement of persons, services, goods and capital, and the freedom of establishment.

X (English documents) → K = [Figure: kernel matrix]

Y1: De Unie, die zich bewust is van haar geestelijke en morele erfgoed, heeft haar grondslag in de ondeelbare en universele waarden van menselijke waardigheid en van vrijheid, gelijkheid en solidariteit. Zij berust op het beginsel van democratie en het beginsel van de rechtsstaat. De Unie stelt de mens centraal in haar optreden, door het burgerschap van de Unie in te stellen en een ruimte van vrijheid, veiligheid en recht tot stand te brengen.

Y2: De Unie draagt bij tot de instandhouding en de ontwikkeling van deze gemeenschappelijke waarden, met inachtneming van de verscheidenheid van cultuur en traditie van de volkeren van Europa, alsmede van de nationale identiteit van de lidstaten en van hun staatsinrichting op nationaal, regionaal en lokaal niveau. Zij streeft ernaar een evenwichtige en duurzame ontwikkeling te bevorderen en bewerkstelligt het vrije verkeer van personen, diensten, goederen en kapitaal, alsook de vrijheid van vestiging.

Y (Dutch documents) → L = [Figure: kernel matrix]

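Idea: measure similarity between the centered kernel matrices, $\langle \tilde{K}, \tilde{L} \rangle = \mathrm{Tr}(\tilde{K}\tilde{L})$, where $\tilde{K} = HKH$, $\tilde{L} = HLH$, and $H = I - \frac{1}{m}\mathbf{1}\mathbf{1}^\top$ is the centering matrix.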
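The idea above can be sketched in a few lines of NumPy. This is only an illustration: the toy Gram matrices are built from random features with a linear kernel, which is an assumption made here and not the kernel used on the slides, and the function names `center` and `centered_alignment` are hypothetical.

```python
import numpy as np

def center(K):
    """Return H K H, with H = I - (1/m) 1 1^T the centering matrix."""
    m = K.shape[0]
    H = np.eye(m) - np.ones((m, m)) / m
    return H @ K @ H

def centered_alignment(K, L):
    """Tr(K~ L~): inner product of the two centered kernel matrices."""
    Kc, Lc = center(K), center(L)
    # For symmetric matrices, Tr(A B) equals the sum of elementwise products,
    # which avoids forming the full matrix product.
    return float(np.sum(Kc * Lc))

# Toy data (assumption): linear-kernel Gram matrices on random features.
rng = np.random.default_rng(0)
A = rng.normal(size=(5, 3))
B = rng.normal(size=(5, 2))
K = A @ A.T
L = B @ B.T
value = centered_alignment(K, L)
```

Using the elementwise sum instead of `np.trace(Kc @ Lc)` keeps the final step at quadratic cost in the number of samples.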

Probability in feature space

Probability in feature space → Maximum Mean Discrepancy → Kernel Dependence Measure

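Feature map

- Consider $x \mapsto k(\cdot, x) \in \mathcal{F}$ instead of $x \mapsto (\varphi_1(x), \ldots, \varphi_s(x)) \in \mathbb{R}^s$.

- The inner product is easy to compute: $\langle k(\cdot, x), k(\cdot, y) \rangle_{\mathcal{F}} = k(x, y)$.

Embedding of probability measures into a reproducing kernel Hilbert space

- In particular, we can look at the set of distributions and treat each distribution $P$ as a point that we embed through the mean embedding $\mu_P$:

$$P \mapsto \mu_P = E_{X \sim P}\, k(\cdot, X) = \int_{\Omega} \phi(x)\, dP(x) \in \mathcal{F}$$

- Each distribution can thus be uniquely represented in $\mathcal{F}$ (this requires a characteristic kernel).

- The inner product is again easy to compute: $\langle \mu_P, \mu_Q \rangle_{\mathcal{F}} = E_{X \sim P,\, Y \sim Q}\, k(X, Y)$.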
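As a rough illustration of the last identity, the inner product of two mean embeddings can be estimated by averaging kernel evaluations over all sample pairs. The sketch below assumes a Gaussian kernel with bandwidth `sigma`; the kernel choice, the bandwidth, and the function names are assumptions made for the example, not something the slides specify.

```python
import numpy as np

def gaussian_kernel(X, Y, sigma=1.0):
    """Gram matrix with entries k(x_i, y_j) = exp(-||x_i - y_j||^2 / (2 sigma^2))."""
    sq = (np.sum(X**2, axis=1)[:, None]
          + np.sum(Y**2, axis=1)[None, :]
          - 2.0 * X @ Y.T)
    # Clip tiny negative values caused by floating-point cancellation.
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma**2))

def mean_embedding_inner(X, Y, sigma=1.0):
    """Plug-in estimate of <mu_P, mu_Q>: the average of k(x_i, y_j) over all pairs."""
    return float(gaussian_kernel(X, Y, sigma).mean())

# Toy samples (assumption): two Gaussians with different means.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
Y = rng.normal(loc=3.0, size=(200, 2))
same = mean_embedding_inner(X, X)   # estimate of <mu_P, mu_P>
cross = mean_embedding_inner(X, Y)  # estimate of <mu_P, mu_Q>
```

No explicit feature vector is ever built: only kernel evaluations between samples are needed.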

Maximum Mean Discrepancy

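Maximum Mean Discrepancy (MMD) [Gretton et al., 2007]:

$$\mathrm{MMD}^2(P, Q) = \|\mu_P - \mu_Q\|_{\mathcal{F}}^2 = \langle \mu_P, \mu_P \rangle_{\mathcal{F}} + \langle \mu_Q, \mu_Q \rangle_{\mathcal{F}} - 2 \langle \mu_P, \mu_Q \rangle_{\mathcal{F}}$$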
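The expansion above translates directly into a plug-in estimate: replace each inner product by the mean of the corresponding Gram matrix. The following sketch assumes a Gaussian kernel and uses the simple biased estimator; both are choices made for illustration, and the names `gaussian_kernel` and `mmd2_biased` are hypothetical.

```python
import numpy as np

def gaussian_kernel(X, Y, sigma=1.0):
    """Gram matrix with entries exp(-||x_i - y_j||^2 / (2 sigma^2))."""
    sq = np.sum((X[:, None, :] - Y[None, :, :]) ** 2, axis=-1)
    return np.exp(-sq / (2.0 * sigma**2))

def mmd2_biased(X, Y, sigma=1.0):
    """Biased plug-in estimate of MMD^2: <mu_P,mu_P> + <mu_Q,mu_Q> - 2<mu_P,mu_Q>."""
    k_xx = gaussian_kernel(X, X, sigma).mean()
    k_yy = gaussian_kernel(Y, Y, sigma).mean()
    k_xy = gaussian_kernel(X, Y, sigma).mean()
    return float(k_xx + k_yy - 2.0 * k_xy)

# Toy samples (assumption): same distribution versus shifted distribution.
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 2))
X2 = rng.normal(size=(300, 2))
Y = rng.normal(loc=1.0, size=(300, 2))
near = mmd2_biased(X, X2)  # two samples from the same distribution
far = mmd2_biased(X, Y)    # samples from different distributions
```

The estimate for two samples from the same distribution stays close to zero, while a mean shift produces a clearly larger value.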

Kernel dependence measure

Dependence measure using the Hilbert-Schmidt Independence Criterion (HSIC) [Gretton et al., 2005, 2008]:

$$\mathrm{HSIC}^2(P_x, P_y) = \|\mu_{P_{xy}} - \mu_{P_x P_y}\|_{\mathcal{F}}^2$$

$\mathrm{HSIC}^2(P_x, P_y) = 0 \iff P_{xy} = P_x P_y$ when the kernels K and L are characteristic on their respective marginal domains.

Empirical $\mathrm{HSIC}^2(P_x, P_y)$: $\mathrm{HSIC}^{XY}_m = \frac{1}{m^2} \mathrm{Tr}(\tilde{K}\tilde{L})$, with $O(m^2)$ computation time.

$\mathrm{HSIC}^{XY}_m$ can be rewritten in terms of a U-statistic, which produces minimum-variance unbiased estimators.
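Both estimators mentioned above can be sketched in quadratic time. The biased version follows the trace formula with centered matrices. The unbiased version uses a standard closed form of the U-statistic on kernel matrices with zeroed diagonals; the slides do not spell this form out, so treat it as an assumption about which expression is meant. The toy data and Gaussian kernel are likewise illustrative.

```python
import numpy as np

def center(K):
    """H K H computed from row, column and grand means, in O(m^2)."""
    return (K - K.mean(axis=0, keepdims=True)
              - K.mean(axis=1, keepdims=True) + K.mean())

def hsic_biased(K, L):
    """(1/m^2) Tr(K~ L~) for symmetric kernel matrices K and L."""
    m = K.shape[0]
    return float(np.sum(center(K) * center(L))) / m**2

def hsic_unbiased(K, L):
    """U-statistic estimate of HSIC (requires m >= 4)."""
    m = K.shape[0]
    if m < 4:
        raise ValueError("the U-statistic needs at least 4 samples")
    Kz = K - np.diag(np.diag(K))  # zero the diagonals
    Lz = L - np.diag(np.diag(L))
    trace_term = np.sum(Kz * Lz)                       # Tr(Kz Lz)
    sum_term = Kz.sum() * Lz.sum() / ((m - 1) * (m - 2))
    cross_term = 2.0 * np.dot(Kz.sum(axis=1), Lz.sum(axis=1)) / (m - 2)
    return float((trace_term + sum_term - cross_term) / (m * (m - 3)))

# Toy data (assumption): y is a noisy copy of x, Gaussian kernels with unit bandwidth.
rng = np.random.default_rng(0)
x = rng.normal(size=(8, 1))
y = x + 0.1 * rng.normal(size=(8, 1))
K = np.exp(-0.5 * (x - x.T) ** 2)
L = np.exp(-0.5 * (y - y.T) ** 2)
biased, unbiased = hsic_biased(K, L), hsic_unbiased(K, L)
```

Neither function forms a matrix product, so the cost stays at $O(m^2)$ as stated on the slide.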

The Problem of relative dependency

Is the dependency between English and Dutch stronger than the dependency between English and Spanish?

$H_0$: $\mathrm{HSIC}(P_x, P_y) \le \mathrm{HSIC}(P_x, P_z)$ (null hypothesis)

$H_1$: $\mathrm{HSIC}(P_x, P_y) > \mathrm{HSIC}(P_x, P_z)$ (alternative hypothesis)

Test statistic: $\mathrm{HSIC}^{XY}_m - \mathrm{HSIC}^{XZ}_m$

Observed samples: $\{x_i\}_{i=1}^m \sim P_x$, $\{y_i\}_{i=1}^m \sim P_y$, $\{z_i\}_{i=1}^m \sim P_z$

Two strategies:

- Naively: compute the two independent statistics $\mathrm{HSIC}^{X'Y'}_{m/2}$ and $\mathrm{HSIC}^{X''Z''}_{m/2}$ on disjoint sample subsets;

- Efficiently: compute the two dependent statistics $\mathrm{HSIC}^{XY}_m$ and $\mathrm{HSIC}^{XZ}_m$ on the full sample.

Decision rule, based on the empirical difference $\mathrm{HSIC}^{XY}_m - \mathrm{HSIC}^{XZ}_m$:

- significantly greater than 0: reject $H_0$

- otherwise: do not reject $H_0$

A simple consistent test via independent HSICs

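Construction of two independent statistics $\mathrm{HSIC}^{X'Y'}_{m/2}$ and $\mathrm{HSIC}^{X''Z''}_{m/2}$ by subsampling.

[Figure: sub-sampled blocks of the kernel matrices K, L and M]

Joint asymptotic distribution of independent HSIC [Serfling, 2009]:

$$\sqrt{m}\left( \begin{pmatrix} \mathrm{HSIC}^{X'Y'}_{m/2} \\ \mathrm{HSIC}^{X''Z''}_{m/2} \end{pmatrix} - \begin{pmatrix} \mathrm{HSIC}(P_x, P_y) \\ \mathrm{HSIC}(P_x, P_z) \end{pmatrix} \right) \xrightarrow{d} \mathcal{N}\left( \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \begin{pmatrix} \sigma^2_{X'Y'} & 0 \\ 0 & \sigma^2_{X''Z''} \end{pmatrix} \right)$$

Relative dependency test with independent HSIC statistics (p-value): projecting the joint limit onto the direction $(1, -1)/\sqrt{2}$ gives

$$\frac{\sqrt{2}}{2}\sqrt{m}\left[ \left(\mathrm{HSIC}^{X'Y'}_{m/2} - \mathrm{HSIC}^{X''Z''}_{m/2}\right) - \left(\mathrm{HSIC}(P_x, P_y) - \mathrm{HSIC}(P_x, P_z)\right) \right] \xrightarrow{d} \mathcal{N}\left( 0, \frac{1}{2}\left(\sigma^2_{X'Y'} + \sigma^2_{X''Z''}\right) \right)$$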
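The subsampling construction can be sketched as follows: split the sample indices into two disjoint halves and compute one HSIC value on each half, so that the two statistics share no observations. The Gaussian kernel, the biased HSIC estimator, and all function names are assumptions made for this example; the slides do not state which estimator is applied to the subsets.

```python
import numpy as np

def gaussian_gram(X, sigma=1.0):
    """Gaussian-kernel Gram matrix of one sample (rows of X are observations)."""
    sq = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    return np.exp(-sq / (2.0 * sigma**2))

def hsic_biased(K, L):
    """(1/m^2) Tr(K~ L~) with K~ = HKH and L~ = HLH."""
    m = K.shape[0]
    Kc = K - K.mean(axis=0, keepdims=True) - K.mean(axis=1, keepdims=True) + K.mean()
    Lc = L - L.mean(axis=0, keepdims=True) - L.mean(axis=1, keepdims=True) + L.mean()
    return float(np.sum(Kc * Lc)) / m**2

def independent_hsic_pair(X, Y, Z, sigma=1.0, seed=0):
    """HSIC(X', Y') on one half of the sample and HSIC(X'', Z'') on the other half."""
    m = X.shape[0]
    perm = np.random.default_rng(seed).permutation(m)
    first, second = perm[: m // 2], perm[m // 2 : 2 * (m // 2)]
    h_xy = hsic_biased(gaussian_gram(X[first], sigma), gaussian_gram(Y[first], sigma))
    h_xz = hsic_biased(gaussian_gram(X[second], sigma), gaussian_gram(Z[second], sigma))
    return h_xy, h_xz, first, second

# Toy data (assumption): Y depends on X more strongly than Z does.
rng = np.random.default_rng(1)
X = rng.normal(size=(100, 1))
Y = X + 0.1 * rng.normal(size=(100, 1))
Z = X + 1.0 * rng.normal(size=(100, 1))
h_xy, h_xz, first, second = independent_hsic_pair(X, Y, Z)
```

Each statistic only sees half of the data, which is the source of the larger variance of this approach.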

Joint asymptotic distribution of two dependent HSIC

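Joint asymptotic distribution of HSIC and test statistic:

$$\sqrt{m}\left( \begin{pmatrix} \mathrm{HSIC}^{XY}_m \\ \mathrm{HSIC}^{XZ}_m \end{pmatrix} - \begin{pmatrix} \mathrm{HSIC}(P_x, P_y) \\ \mathrm{HSIC}(P_x, P_z) \end{pmatrix} \right) \xrightarrow{d} \mathcal{N}\left( \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \begin{pmatrix} \sigma^2_{XY} & \sigma_{XYXZ} \\ \sigma_{XYXZ} & \sigma^2_{XZ} \end{pmatrix} \right)$$

Empirical covariance term, with $(m-1)_3 = \frac{(m-1)!}{(m-4)!}$:

$$\sigma_{XYXZ} = \frac{16}{m}\left( \frac{1}{m}\sum_{i=1}^{m} (m-1)_3^{-2} \sum_{(j,q,r) \in \mathbf{i}_3^m \setminus \{i\}} h_{ijqr}\, g_{ijqr} - \mathrm{HSIC}^{XY}_m\, \mathrm{HSIC}^{XZ}_m \right)$$

In matrix form:

$$\sigma_{XYXZ} = \frac{16}{m}\left( (4m)^{-1} (m-1)_3^{-2}\, h_{XY}^\top h_{XZ} - \mathrm{HSIC}^{XY}_m\, \mathrm{HSIC}^{XZ}_m \right)$$

$$h_{XY} = (m-2)^2 (K \odot L)\mathbf{1} - m (K\mathbf{1}) \odot (L\mathbf{1}) + (m-2)\left( \mathrm{Tr}(KL)\,\mathbf{1} - K(L\mathbf{1}) - L(K\mathbf{1}) \right) + (\mathbf{1}^\top L \mathbf{1})\, K\mathbf{1} + (\mathbf{1}^\top K \mathbf{1})\, L\mathbf{1} - \left( (\mathbf{1}^\top K)(L\mathbf{1}) \right) \mathbf{1}$$

Here $\odot$ denotes the elementwise product and $\mathbf{1}$ the all-ones vector.

All terms can be computed in $O(m^2)$ time.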

Properties of the test of relative dependency

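Relative dependency statistical test (p-value): projecting the joint limit onto the direction $(1, -1)/\sqrt{2}$ gives

$$\frac{\sqrt{2}}{2}\sqrt{m}\left[ \left(\mathrm{HSIC}^{XY}_m - \mathrm{HSIC}^{XZ}_m\right) - \left(\mathrm{HSIC}(P_x, P_y) - \mathrm{HSIC}(P_x, P_z)\right) \right] \xrightarrow{d} \mathcal{N}\left( 0, \frac{1}{2}\left(\sigma^2_{XY} + \sigma^2_{XZ} - 2\sigma_{XYXZ}\right) \right)$$

The dependent test is more powerful than the independent test.

Theorem

The asymptotic relative efficiency of the independent approach relative to the dependent approach is always greater than 1:

$$\frac{1}{2}\left(\sigma^2_{XY} + \sigma^2_{XZ} - 2\sigma_{XYXZ}\right) < \frac{1}{2}\left(2\sigma^2_{XY} + 2\sigma^2_{XZ}\right) \quad (1)$$

The right-hand side is the variance of the independent test: each of its statistics uses only $m/2$ samples, so $\sigma^2_{X'Y'} = 2\sigma^2_{XY}$ and $\sigma^2_{X''Z''} = 2\sigma^2_{XZ}$.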
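A minimal sketch of how a p-value could be obtained from the Gaussian limit above, assuming the variances and covariance passed in are those of the $\sqrt{m}$-scaled limiting distribution and that the boundary of $H_0$ (equal population HSIC values) is used as the reference. The function names and the numbers in the usage lines are hypothetical and only illustrate the effect of inequality (1).

```python
import math

def normal_sf(z):
    """Upper tail probability 1 - Phi(z) of the standard normal distribution."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))

def relative_dependency_pvalue(hsic_xy, hsic_xz, var_xy, var_xz, cov_xyxz, m):
    """One-sided p-value for H0: HSIC(Px, Py) <= HSIC(Px, Pz).

    Assumes var_xy, var_xz and cov_xyxz describe the limit of the
    sqrt(m)-scaled statistics, so the difference has asymptotic variance
    var_xy + var_xz - 2 * cov_xyxz.
    """
    var_diff = var_xy + var_xz - 2.0 * cov_xyxz
    if var_diff <= 0.0:
        raise ValueError("variance of the difference must be positive")
    z = math.sqrt(m) * (hsic_xy - hsic_xz) / math.sqrt(var_diff)
    return normal_sf(z)

# Illustrative numbers only (assumptions, not results from the slides).
p_dependent = relative_dependency_pvalue(0.030, 0.020, 0.04, 0.05, 0.03, m=500)
# Independent approach: half the data per statistic doubles each variance,
# and the covariance term vanishes.
p_independent = relative_dependency_pvalue(0.030, 0.020, 0.08, 0.10, 0.0, m=500)
```

With the same observed difference, the smaller variance of the dependent statistic yields the smaller p-value, which is the practical meaning of the theorem.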

Experiments on Synthetic Data

We control the relative degree of functional dependency between variates.

Dependency(X, Y) > Dependency(X, Z)?

[Figure: scatter plots of the three variates. (X) $\sin(t) + \gamma_1 \mathcal{N}(0, 1)$ against $t + \gamma_1 \mathcal{N}(0, 1)$, with $\gamma_1 = 0.3$; (Y) $t\sin(t) + \gamma_2 \mathcal{N}(0, 1)$ against $t\cos(t) + \gamma_2 \mathcal{N}(0, 1)$, with $\gamma_2 = 0.3$; (Z) $t\sin(t) + \gamma_3 \mathcal{N}(0, 1)$ against $t\cos(t) + \gamma_3 \mathcal{N}(0, 1)$, with $\gamma_3 = 0.6$.]

[Figure: power of the tests as a function of $\gamma_3$, for dependent tests and independent tests.]

[Figure: $\mathrm{HSIC}^{XZ}_m$ vs $\mathrm{HSIC}^{X''Z''}_{m/2}$ plotted against $\mathrm{HSIC}^{XY}_m$ vs $\mathrm{HSIC}^{X'Y'}_{m/2}$, for independent tests and dependent tests.]
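The three synthetic variates can be generated along the lines of the axis labels in the figure. The sketch below assumes that all three share the same latent parameter t, drawn uniformly from [0, 2π], and that the noise terms are independent; the sample size, the range of t, and the function name `make_synthetic` are assumptions, since the slides only give the functional forms and the noise levels.

```python
import numpy as np

def make_synthetic(m=500, gamma1=0.3, gamma2=0.3, gamma3=0.6, seed=0):
    """Noisy sine curve X and two noisy spirals Y, Z driven by a shared t."""
    rng = np.random.default_rng(seed)
    t = rng.uniform(0.0, 2.0 * np.pi, size=m)  # assumed range of t

    def noise(gamma):
        # Independent Gaussian noise scaled by gamma.
        return gamma * rng.normal(size=m)

    X = np.column_stack([t + noise(gamma1), np.sin(t) + noise(gamma1)])
    Y = np.column_stack([t * np.cos(t) + noise(gamma2), t * np.sin(t) + noise(gamma2)])
    Z = np.column_stack([t * np.cos(t) + noise(gamma3), t * np.sin(t) + noise(gamma3)])
    return X, Y, Z, t

X, Y, Z, t = make_synthetic()
```

Raising `gamma3` makes Z a noisier function of t than Y, which is how the relative strength of the two dependencies is controlled in this kind of experiment.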

Experiments on Multilingual Data

Uralic: Finnish (fi). Romance: Italian (it), French (fr), Spanish (es), Portuguese (pt). Germanic: English (en), Dutch (nl), German (de), Danish (da), Swedish (sv).

$H_0$: Dep(Source, Target 1) ≤ Dep(Source, Target 2)

Source   Target 1   Target 2   p-value
fr       es         it         0.0157
fr       pt         it         0.1882
es       fr         it         0.2147
es       pt         it         $< 10^{-4}$
es       pt         fr         $< 10^{-4}$
pt       fr         it         0.7649
pt       es         it         0.0011
pt       es         fr         $< 10^{-8}$

Relative dependency tests between Romance languages.

Pediatric high-grade gliomas (pHGG)

Brain tumor localization: pHGG have different genetic origins depending on the location of the tumor in the brain. The goal is to identify the mechanisms responsible for the tumor.

$H_0$: Dependency(Loc, Gene Exp.) < Dependency(Loc, Chrom. Imbalance)

p-value $< 10^{-5}$

Conclusion

A novel non-parametric statistical test that determines whether a source variable is more strongly dependent on one target variable or on another.

The test is low variance, consistent and unbiased.

The computational cost is quadratic in the sample size.

Bibliography:

- Gretton, A., Fukumizu, K., Teo, C. H., Song, L., Schölkopf, B., Smola, A. J. (2008). A kernel statistical test of independence. In Advances in Neural Information Processing Systems.

- Gretton, A., Herbrich, R., Smola, A., Bousquet, O., Schölkopf, B. (2005). Kernel methods for measuring independence. Journal of Machine Learning Research, 6, 2075-2129.

- Hoeffding, W. (1948). A class of statistics with asymptotically normal distribution. The Annals of Mathematical Statistics, 293-325.

- Serfling, R. J. (2009). Approximation Theorems of Mathematical Statistics, 162. John Wiley & Sons.

Thanks for your attention!

Code: https://github.com/wbounliphone/reldep

Contact: [email protected]