Community Structures of Multilayer Political Cosponsorship Networks

Sang Hoon Lee Department of Energy Science, Sungkyunkwan University http://sites.google.com/site/lshlj82 Community Structures of Multilayer Political Cosponsorship Networks The 2nd Daegu Gyeongbuk International Social Network Conference (DISC 2014)

Upload: sang-hoon-lee

Post on 12-Jul-2015

TRANSCRIPT

  • Sang Hoon Lee Department of Energy Science, Sungkyunkwan University

    http://sites.google.com/site/lshlj82

    Community Structures of Multilayer Political Cosponsorship Networks

    The 2nd Daegu Gyeongbuk International Social Network Conference (DISC 2014)

  • Community structures in networks

    modularity (the objective function to be maximized)

    M. A. Porter, J.-P. Onnela, and P. J. Mucha, Not. Am. Math. Soc. 56, 1082 (2009); S. Fortunato, Phys. Rep. 486, 75 (2010).

    $$Q = \frac{1}{2m} \sum_{ij} \left[ A_{ij} - \gamma \frac{k_i k_j}{2m} \right] \delta(g_i, g_j)$$

    where the adjacency matrix $A_{ij} \neq 0$ if nodes $i$ and $j$ are connected and $A_{ij} = 0$ otherwise, $k_i$ is the degree (number of neighboring nodes of $i$) or strength (sum of weights around $i$), $g_i$ is the community to which $i$ belongs, and $m$ is the total number of edges or sum of weights in the network

    resolution parameter $\gamma$: controlling the characteristic size of communities
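As a rough illustration of the objective function above, the following sketch evaluates $Q$ for a given partition. It assumes the standard resolution-parameter form $Q = \frac{1}{2m}\sum_{ij}[A_{ij} - \gamma k_i k_j/(2m)]\,\delta(g_i, g_j)$; the function name `modularity`, the symbol `gamma`, and the toy network are illustrative choices, not taken from the slides.

```python
def modularity(adj, communities, gamma=1.0):
    """Modularity Q of a partition of an undirected (possibly weighted) network.

    adj         -- symmetric adjacency matrix as a list of lists
    communities -- communities[i] is the community label g_i of node i
    gamma       -- resolution parameter (assumed name; gamma = 1 is the usual default)
    """
    n = len(adj)
    strength = [sum(row) for row in adj]   # k_i: degree or strength of node i
    two_m = sum(strength)                  # 2m: twice the total edge weight
    q = 0.0
    for i in range(n):
        for j in range(n):
            if communities[i] == communities[j]:   # delta(g_i, g_j)
                q += adj[i][j] - gamma * strength[i] * strength[j] / two_m
    return q / two_m


# Toy network (hypothetical): two triangles joined by a single edge (2-3).
adj = [
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 1, 0, 0],
    [0, 0, 1, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
]
two_triangles = [0, 0, 0, 1, 1, 1]
print(modularity(adj, two_triangles))             # gamma = 1
print(modularity(adj, two_triangles, gamma=2.0))  # larger gamma penalizes large groups more
```

Raising `gamma` increases the null-model penalty, so partitions into smaller communities become comparatively more favorable.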

    importing network data

    identifying community structure

    visualizing

    smaller communities

    TAXONOMIES OF NETWORKS FROM COMMUNITY STRUCTURE, PHYSICAL REVIEW E 86, 036104 (2012)

    that all edges are antiferromagnetic at resolution $\lambda = \Lambda_{\max}$ and thereby forces each node into its own community.

    III. MESOSCOPIC RESPONSE FUNCTIONS (MRFS)

    To describe how a network disintegrates into communities as the value of $\lambda$ is increased from $\Lambda_{\min}$ to $\Lambda_{\max}$ [see Fig. 1(a) for a schematic], one needs to select summary statistics. There are many possible ways to summarize such a disintegration process, and we focus on three diagnostics that characterize fundamental properties of network communities.

    First, we use the value of the Hamiltonian $H(\lambda)$ (1), which is a scalar quantity closely related to network modularity and quantifies the energy of the system [13,14]. Second, we calculate a partition entropy $S(\lambda)$ to characterize the community size distribution. To do this, let $n_k$ denote the number of nodes in community $k$ and define $p_k = n_k/N$ to be the probability to choose a node from community $k$ uniformly at random. This yields a (Shannon) partition entropy of $S(\lambda) = -\sum_{k=1}^{\eta(\lambda)} p_k \log p_k$, which quantifies the disorder in the associated community size distribution. Third, we use the number of communities $\eta(\lambda)$.
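The partition entropy described above can be sketched in a few lines. This is only an illustration under the stated definition $p_k = n_k/N$ and $S = -\sum_k p_k \log p_k$; the function name `partition_entropy` and the example labels are hypothetical.

```python
import math
from collections import Counter


def partition_entropy(labels):
    """Shannon entropy of the community size distribution.

    labels -- labels[i] is the community of node i; p_k = n_k / N.
    """
    n_total = len(labels)
    sizes = Counter(labels).values()   # n_k for each community k
    return -sum((n_k / n_total) * math.log(n_k / n_total) for n_k in sizes)


# One community gives zero entropy; N singleton communities give log N.
print(partition_entropy([0, 0, 0, 0]))
print(partition_entropy([0, 0, 1, 1]))
print(partition_entropy([0, 1, 2, 3]))
```

The two extremes (a single community and all singletons) are the values later used as $S_{\min}$ and $S_{\max}$ when the entropy is rescaled.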

    [Figure 1 panels. (b) Zachary Karate Club network at $\xi = 0$ ($\eta = 1$), $\xi = 0.2$ ($\eta = 8$), $\xi = 0.4$ ($\eta = 12$), $\xi = 0.6$ ($\eta = 17$), $\xi = 0.8$ ($\eta = 24$), and $\xi = 1$ ($\eta = 34$); edge legend: ferromagnetic links, nonlinks, antiferromagnetic links. (c) $H_{\mathrm{eff}}$, $S_{\mathrm{eff}}$, and $\eta_{\mathrm{eff}}$ plotted against $\xi$ from 0 to 1, with the interaction matrix shown at $\xi = 0, 0.2, 0.4, 0.6, 0.8, 1$.]

    FIG. 1. (Color online) (a) Schematic of some of the ways that a network can break up into communities as the value of $\lambda$ (or $\xi$) is increased. (b) Zachary Karate Club network [23] for different values of the effective fraction of antiferromagnetic edges $\xi$. All interactions are either ferromagnetic or antiferromagnetic; i.e., for the values of $\xi$ that we used, there are no neutral interactions. We color edges in blue if the corresponding interactions are ferromagnetic, and we color them in red if the interactions are antiferromagnetic. We color the nodes based on community affiliation. (c) The $H_{\mathrm{eff}}$, $S_{\mathrm{eff}}$, and $\eta_{\mathrm{eff}}$ MRFs, and the interaction matrix $J$ for different values of $\xi$. We color elements of the interaction matrix by depicting the absence of an edge in white, ferromagnetic edges in blue (dark gray), and antiferromagnetic edges in red (light gray).

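    Because we need to normalize $H$, $S$, and $\eta$ to compare them across networks, we define an effective energy

    $$H_{\mathrm{eff}}(\lambda) = \frac{H(\lambda) - H_{\min}}{H_{\max} - H_{\min}} = 1 - \frac{H(\lambda)}{H_{\min}}, \qquad (4)$$

    where $H_{\min} = H(\Lambda_{\min})$ and $H_{\max} = H(\Lambda_{\max})$; an effective entropy

    $$S_{\mathrm{eff}}(\lambda) = \frac{S(\lambda) - S_{\min}}{S_{\max} - S_{\min}} = \frac{S(\lambda)}{\log N}, \qquad (5)$$

    where $S_{\min} = S(\Lambda_{\min})$ and $S_{\max} = S(\Lambda_{\max})$; and an effective number of communities

    $$\eta_{\mathrm{eff}}(\lambda) = \frac{\eta(\lambda) - \eta_{\min}}{\eta_{\max} - \eta_{\min}} = \frac{\eta(\lambda) - 1}{N - 1}, \qquad (6)$$

    where $\eta_{\min} = \eta(\Lambda_{\min}) = 1$ and $\eta_{\max} = \eta(\Lambda_{\max}) = N$.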
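The three rescalings in Eqs. (4)-(6) reduce to simple closed forms, sketched below. The function names are hypothetical, and the sketch assumes the endpoint values implied by those equations: $H_{\max} = 0$, $S_{\min} = 0$, $S_{\max} = \log N$, $\eta_{\min} = 1$, and $\eta_{\max} = N$.

```python
import math


def effective_energy(h, h_min):
    """Eq. (4): H_eff = 1 - H / H_min, where H_min < 0 is the single-community energy."""
    return 1.0 - h / h_min


def effective_entropy(s, n_nodes):
    """Eq. (5): S_eff = S / log N."""
    return s / math.log(n_nodes)


def effective_num_communities(eta, n_nodes):
    """Eq. (6): eta_eff = (eta - 1) / (N - 1)."""
    return (eta - 1) / (n_nodes - 1)


# Hypothetical values for a 34-node network split into 8 communities.
print(effective_energy(-4.0, -10.0))
print(effective_entropy(1.5, 34))
print(effective_num_communities(8, 34))
```

Each function maps its diagnostic onto the unit interval, which is what makes the resulting curves comparable across networks of different sizes.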

    Some networks contain a small number of entries $\Lambda_{ij}$ that are orders of magnitude larger than most other entries. For example, in the network of Facebook friendships at Caltech [21,22], 98% of the $\Lambda_{ij}$ entries are less than 100, but 0.02% of them are larger than 8000. These large $\Lambda_{ij}$ values arise when two low-strength nodes become connected. Using the null model $P_{ij} = k_i k_j/(2m)$, the interaction between two nodes $i$ and $j$ becomes antiferromagnetic when $\lambda > A_{ij}/P_{ij} = 2mA_{ij}/(k_i k_j)$. If a network has a large total edge weight but both $i$ and $j$ have small strengths compared to other nodes in the network, then $\lambda$ needs to be large to make the interaction antiferromagnetic.

    In prior studies, network community structure has been investigated at different mesoscopic scales by considering plots of various diagnostics as a function of the resolution parameter [13,14,17]. In the present example, such plots would be dominated by interactions that require large resolution-parameter values to become antiferromagnetic. To overcome this issue, we define the effective fraction of antiferromagnetic edges

    $$\xi = \xi(\lambda) = \frac{A(\lambda) - A(\Lambda_{\min})}{A(\Lambda_{\max}) - A(\Lambda_{\min})} \in [0, 1], \qquad (7)$$

    where $A(\lambda)$ is the total number of antiferromagnetic interactions for the given value of $\lambda$. In other words, it is the number of $\Lambda_{ij}$ elements that are smaller than $\lambda$. Thus, $A(\Lambda_{\min})$ is the largest number of antiferromagnetic interactions for which a network still forms a single community, and the effective number of antiferromagnetic interactions $\xi(\lambda)$ is the number of antiferromagnetic interactions (normalized to the unit interval) in excess of $A(\Lambda_{\min})$. The function $\xi(\lambda)$ increases monotonically in $\lambda$.

    Sweeping $\lambda$ from $\Lambda_{\min}$ to $\Lambda_{\max}$ corresponds to sweeping the value of $\xi$ from 0 to 1. (One can think of $\lambda$ as a continuous variable and $\xi$ as a discrete variable that changes with events.)
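The definition of $\xi(\lambda)$ in Eq. (7) can be sketched as follows, using $\Lambda_{ij} = A_{ij}/P_{ij} = 2mA_{ij}/(k_i k_j)$ from the text. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, $\Lambda_{\min}$ is passed in as a given value (finding it requires a community-detection run, which is not shown), and every node is assumed to have nonzero strength.

```python
def lambda_values(adj):
    """Lambda_ij = 2m * A_ij / (k_i * k_j) for every node pair i < j.

    Assumes an undirected network in which every node has nonzero strength.
    """
    n = len(adj)
    k = [sum(row) for row in adj]
    two_m = sum(k)
    return [two_m * adj[i][j] / (k[i] * k[j])
            for i in range(n) for j in range(i + 1, n)]


def effective_fraction(lams, lam, lam_min):
    """Eq. (7): xi(lambda), with A(x) = number of Lambda_ij smaller than x."""
    lam_max = max(lams)

    def count_antiferro(x):
        return sum(1 for value in lams if value < x)

    a_min = count_antiferro(lam_min)
    a_max = count_antiferro(lam_max)
    return (count_antiferro(lam) - a_min) / (a_max - a_min)


# Toy network (hypothetical): a triangle 0-1-2 with a pendant node 3 attached to node 2.
adj = [
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
]
lams = lambda_values(adj)
lam_min = 1.0   # assumed value of Lambda_min, for illustration only
for lam in (lam_min, 1.5, 2.5, max(lams)):
    print(lam, effective_fraction(lams, lam, lam_min))
```

Pairs without an edge have $\Lambda_{ij} = 0$ and are counted in every $A(\cdot)$ term, so they cancel in both the numerator and the denominator.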

    As we perform such sweeping for a given network, the number of communities increases from $\eta(\xi = 0) = 1$ to $\eta(\xi = 1) = N$ and yields a vector $[H_{\mathrm{eff}}(\xi), S_{\mathrm{eff}}(\xi), \eta_{\mathrm{eff}}(\xi)]$ whose components we call the mesoscopic response functions (MRFs) of that network. (We also sometimes refer to the vector itself as an MRF.) Because $H_{\mathrm{eff}} \in [0, 1]$, $S_{\mathrm{eff}} \in [0, 1]$, $\eta_{\mathrm{eff}} \in [0, 1]$, and $\xi \in [0, 1]$ for every network, we can compare the MRFs across networks and use them to identify groups of networks with similar mesoscopic structures.

    In Fig. 1(b), we show the Zachary Karate Club network [23] for different values of

    J.-P. Onnela et al., Phys. Rev. E 86, 036104 (2012).

  • Community structures in networks

    www.nature.com/nature

    doi: 10.1038/nature09182 SUPPLEMENTARY INFORMATION

    Figure 3: Link communities for the coappearance network of characters in the novel Les Miserables [9]. (Top) the network with link colors indicating the clustering, with grey indicating single-link clusters. Each node is depicted as a pie-chart representing its membership distribution. The main characters have more diverse community membership. (Bottom) the full link dendrogram (left) and partition density (right). Note the internal blue community in the large blue and red clique containing Valjean. Link clustering is able to unveil hierarchical structure even inside of cliques.

    2.3.1 Clique percolation

    Clique percolation [11, 15] provides an elegant and highly useful method to uncover overlapping community structure [16]. It is currently the most popular and most successful tool available for this task. A particularly interesting feature of this method is that it presents the experimenter with a knob k, the clique size, which can be used to tune the result between high coverage, low community quality (sparse communities) and low coverage, high community quality (dense communities). For some networks, such as the mobile phone network, a precedent exists for the choice of k, which we follow. Whenever that is not the case, we have computed the composite performance for a range of ks and chosen the k which results in the optimum overall performance (see footnote 2). This weighs coverage and quality equally, however, and it remains at the discretion of the researcher to decide if this is optimal for his or her application. See Appendix A.2.

    Footnote 2: For some of the very large or very dense networks, we were not able to run clique percolation for large values of k with the fastest existing software (even on a machine with 32 Gb of RAM), using the fast algorithm developed by Kumpula et al. [17].
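To make the role of the knob k concrete, here is a brute-force sketch of clique percolation: it enumerates all k-cliques and merges those that share k - 1 nodes. It is only suitable for tiny networks and is not the fast algorithm mentioned in the footnote; the function name `k_clique_communities` and the toy edge list are hypothetical.

```python
from itertools import combinations


def k_clique_communities(edges, k):
    """Overlapping communities as unions of k-cliques that share k - 1 nodes."""
    neighbors = {}
    for u, v in edges:
        neighbors.setdefault(u, set()).add(v)
        neighbors.setdefault(v, set()).add(u)

    # Brute-force enumeration of all k-cliques (exponential; tiny graphs only).
    cliques = [frozenset(c) for c in combinations(sorted(neighbors), k)
               if all(v in neighbors[u] for u, v in combinations(c, 2))]

    # Union-find over cliques: adjacent cliques overlap in exactly k - 1 nodes.
    parent = list(range(len(cliques)))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(range(len(cliques)), 2):
        if len(cliques[a] & cliques[b]) == k - 1:
            parent[find(a)] = find(b)

    groups = {}
    for index, clique in enumerate(cliques):
        groups.setdefault(find(index), set()).update(clique)
    return sorted(sorted(group) for group in groups.values())


# Toy network: two triangles sharing an edge, bridged (via edge 3-4) to a third triangle.
edges = [(0, 1), (0, 2), (1, 2), (1, 3), (2, 3), (3, 4), (4, 5), (4, 6), (5, 6)]
print(k_clique_communities(edges, 3))
print(k_clique_communities(edges, 4))
```

With a larger k, fewer nodes belong to any k-clique, which is the coverage-versus-density trade-off the passage describes.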

    Y.-Y. Ahn, J. P. Bagrow, and S. Lehmann, Nature 466, 761 (2010).

    $$Q_{\mathrm{multilayer}} = \frac{1}{2\mu} \sum_{ijsr} \left[ \left( A_{ijs} - \gamma_s \frac{k_{is} k_{js}}{2 m_s} \right) \delta_{sr} + \delta_{ij} C_{jsr} \right] \delta(g_{is}, g_{jr})$$

    note: $i$ and $j$ are node indices, and $s$ and $r$ are layer indices. The adjacency tensor $A_{ijs} \neq 0$ if nodes $i$ and $j$ are connected in layer $s$, and $A_{ijs} = 0$ otherwise. $k_{is}$ is the degree (or strength) of node $i$ in layer $s$, $m_s$ is the number of edges (or sum of weights) in layer $s$, and $\gamma_s = \gamma$ is the resolution parameter in layer $s$. $C_{jsr} = \omega \neq 0$ if layers $s$ and $r$ are connected via node $j$, and $C_{jsr} = 0$ otherwise. The normalization factor $2\mu = \sum_{ijs} A_{ijs} + \sum_{jsr} C_{jsr}$ for $Q_{\mathrm{multilayer}} \in [-1, 1]$.
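The multilayer quality function can be evaluated directly from its definition, as in the sketch below. It assumes the form $Q_{\mathrm{multilayer}} = \frac{1}{2\mu}\sum_{ijsr}[(A_{ijs} - \gamma_s k_{is}k_{js}/(2m_s))\delta_{sr} + \delta_{ij}C_{jsr}]\,\delta(g_{is}, g_{jr})$ with a single $\gamma$ for all layers and all-to-all (categorical) coupling $C_{jsr} = \omega$ for $s \neq r$. The function name, argument names, and toy data are hypothetical, and every layer is assumed to contain at least one edge.

```python
def multilayer_modularity(layers, labels, gamma=1.0, omega=1.0):
    """Multilayer modularity with all-to-all interlayer coupling of strength omega.

    layers -- layers[s][i][j] is the adjacency tensor A_ijs (each layer symmetric)
    labels -- labels[s][i] is the community g_is of node i in layer s
    """
    n_layers = len(layers)
    n_nodes = len(layers[0])
    total = 0.0
    two_mu = 0.0   # normalization: sum_ijs A_ijs + sum_jsr C_jsr

    # Intralayer part: (A_ijs - gamma * k_is * k_js / 2m_s) * delta(g_is, g_js)
    for s, adj in enumerate(layers):
        k = [sum(row) for row in adj]
        two_m_s = sum(k)
        two_mu += two_m_s
        for i in range(n_nodes):
            for j in range(n_nodes):
                if labels[s][i] == labels[s][j]:
                    total += adj[i][j] - gamma * k[i] * k[j] / two_m_s

    # Interlayer part: C_jsr * delta(g_js, g_jr) for the same node j in layers s != r
    for j in range(n_nodes):
        for s in range(n_layers):
            for r in range(n_layers):
                if s != r:
                    two_mu += omega
                    if labels[s][j] == labels[r][j]:
                        total += omega

    return total / two_mu


# Toy example: two layers, each containing the single edge 0-1.
layer = [[0, 1], [1, 0]]
print(multilayer_modularity([layer, layer], [[0, 0], [0, 0]]))
print(multilayer_modularity([layer, layer], [[0, 1], [2, 3]]))
```

Setting `omega=0` in a call decouples the layers, so each layer is scored on its own, while a large `omega` rewards a node for keeping the same community label across layers.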

    Community Structure in Time-Dependent, Multiscale, and Multiplex Networks

    Peter J. Mucha,1,2* Thomas Richardson,1,3 Kevin Macon,1 Mason A. Porter,4,5 Jukka-Pekka Onnela6,7

    Network science is an interdisciplinary endeavor, with methods and applications drawn from across the natural, social, and information sciences. A prominent problem in network science is the algorithmic detection of tightly connected groups of nodes known as communities. We developed a generalized framework of network quality functions that allowed us to study the community structure of arbitrary multislice networks, which are combinations of individual networks coupled through links that connect each node in one network slice to itself in other slices. This framework allows studies of community structure in a general setting encompassing networks that evolve over time, have multiple types of links (multiplexity), and have multiple scales.

    The study of graphs, or networks, has a long tradition in fields such as sociology and mathematics, and it is now ubiquitous in academic and everyday settings. An important tool in network analysis is the detection of mesoscopic structures known as communities (or cohesive groups), which are defined intuitively as groups of nodes that are more tightly connected to each other than they are to the rest of the network (1-3). One way to quantify communities is by a quality function that compares the number of intracommunity edges to what one would expect at random. Given the network adjacency matrix $A$, where the element $A_{ij}$ details a direct connection between nodes $i$ and $j$, one can construct a quality function $Q$ (4, 5) for the partitioning of nodes into communities as $Q = \sum_{ij} (A_{ij} - P_{ij}) \delta(g_i, g_j)$, where $\delta(g_i, g_j) = 1$ if the community assignments $g_i$ and $g_j$ of nodes $i$ and $j$ are the same and 0 otherwise, and $P_{ij}$ is the expected weight of the edge between $i$ and $j$ under a specified null model.

    The choice of null model is a crucial consideration in studying network community structure (2). After selecting a null model appropriate to the network and application at hand, one can use a variety of computational heuristics to assign nodes to communities to optimize the quality $Q$ (2, 3). However, such null models have not been available for time-dependent networks; analyses have instead depended on ad hoc methods to piece together the structures obtained at different times (6-9) or have abandoned quality functions in favor of such alternatives as the Minimum Description Length principle (10). Although tensor decompositions (11) have been used to cluster network data with different types of connections, no quality-function method has been developed for such multiplex networks.

    We developed a methodology to remove these limits, generalizing the determination of community structure via quality functions to multislice networks that are defined by coupling multiple adjacency matrices (Fig. 1). The connections encoded by the network slices are flexible; they can represent variations across time, variations across different types of connections, or even community detection of the same network at different scales. However, the usual procedure for establishing a quality function as a direct count of the intracommunity edge weight minus that expected at random fails to provide any contribution from these interslice couplings. Because they are specified by common identifications of nodes across slices, interslice couplings are either present or absent by definition, so when they do fall inside communities, their contribution in the count of intracommunity edges exactly cancels that expected at random. In contrast, by formulating a null model in terms of stability of communities under Laplacian dynamics, we have derived a principled generalization of community detection to multislice networks,

    REPORTS

    1Carolina Center for Interdisciplinary Applied Mathematics, Department of Mathematics, University of North Carolina, Chapel Hill, NC 27599, USA. 2Institute for Advanced Materials, Nanoscience and Technology, University of North Carolina, Chapel Hill, NC 27599, USA. 3Operations Research, North Carolina State University, Raleigh, NC 27695, USA. 4Oxford Centre for Industrial and Applied Mathematics, Mathematical Institute, University of Oxford, Oxford OX1 3LB, UK. 5CABDyN Complexity Centre, University of Oxford, Oxford OX1 1HP, UK. 6Department of Health Care Policy, Harvard Medical School, Boston, MA 02115, USA. 7Harvard Kennedy School, Harvard University, Cambridge, MA 02138, USA.

    *To whom correspondence should be addressed. E-mail: [email protected]

    Fig. 1. Schematic of a multislice network. Four slices $s = \{1, 2, 3, 4\}$ represented by adjacencies $A_{ijs}$ encode intraslice connections (solid lines). Interslice connections (dashed lines) are encoded by $C_{jrs}$, specifying the coupling of node $j$ to itself between slices $r$ and $s$. For clarity, interslice couplings are shown for only two nodes and depict two different types of couplings: (i) coupling between neighboring slices, appropriate for ordered slices; and (ii) all-to-all interslice coupling, appropriate for categorical slices.
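The two coupling schemes named in the caption can be sketched as a coupling tensor. This is an illustrative construction, not code from the paper: the function name `interslice_coupling`, the index order `C[j, s, r]`, and the simplification that every node is present in every slice with one uniform coupling strength `omega` are all assumptions.

```python
import numpy as np

def interslice_coupling(n_nodes, n_slices, omega, kind="ordered"):
    """Return C[j, s, r], the coupling of node j to itself between slices s and r."""
    if kind == "ordered":
        # (i) Couple each slice only to its neighbors s - 1 and s + 1.
        pattern = np.eye(n_slices, k=1) + np.eye(n_slices, k=-1)
    elif kind == "categorical":
        # (ii) Couple every slice to every other slice (no self-coupling).
        pattern = np.ones((n_slices, n_slices)) - np.eye(n_slices)
    else:
        raise ValueError("kind must be 'ordered' or 'categorical'")
    # Same slice-to-slice pattern for every node (assumption: all nodes in all slices).
    return omega * np.broadcast_to(pattern, (n_nodes, n_slices, n_slices)).copy()

C_ordered = interslice_coupling(2, 4, 0.5, kind="ordered")
C_categorical = interslice_coupling(2, 4, 0.5, kind="categorical")
```

Summing `C` over its last axis gives each node's interslice strength per slice; in the ordered case the first and last slices have half the interslice strength of the interior ones.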

    Fig. 2. Multislice community detection of the Zachary Karate Club network (22) across multiple resolutions. Colors depict community assignments of the 34 nodes (renumbered vertically to group similarly assigned nodes) in each of the 16 slices (with resolution parameters $\gamma_s = \{0.25, 0.5, \ldots, 4\}$), for $\omega = 0$ (top), $\omega = 0.1$ (middle), and $\omega = 1$ (bottom). Dashed lines bound the communities obtained using the default resolution ($\gamma = 1$). [Three panels, titled coupling = 0, coupling = 0.1, and coupling = 1; vertical axis: nodes; horizontal axis: resolution parameters.]

    14 MAY 2010 VOL 328 SCIENCE www.sciencemag.org 876

    CORRECTED 16 JULY 2010; SEE LAST PAGE


    Multilayer community detection

    P. J. Mucha, T. Richardson, K. Macon, M. A. Porter, and J.-P. Onnela, Science 328, 876 (2010).

    different slices: time series or categories

    nodes in individual slices

    (weighted) edges

    with a single parameter controlling the interslice correspondence of communities.

    Important to our method is the equivalence between the modularity quality function (12) [with a resolution parameter (5)] and stability of communities under Laplacian dynamics (13), which we have generalized to recover the null models for bipartite, directed, and signed networks (14). First, we obtained the resolution-parameter generalization of Barber's null model for bipartite networks (15) by requiring the independent joint probability contribution to stability in (13) to be conditional on the type of connection necessary to step between two nodes. Second, we recovered the standard null model for directed networks (16, 17) (again with a resolution parameter) by generalizing the Laplacian dynamics to include motion along different kinds of connections (in this case, both with and against the direction of a link). By this generalization, we similarly recovered a null model for signed networks (18). Third, we interpreted the stability under Laplacian dynamics flexibly to permit different spreading weights on the different types of links, giving multiple resolution parameters to recover a general null model for signed networks (19).

    We applied these generalizations to derive null models for multislice networks that extend the existing quality-function methodology, including an additional parameter $\omega$ to control the coupling between slices. Representing each network slice $s$ by adjacencies $A_{ijs}$ between nodes $i$ and $j$, with interslice couplings $C_{jrs}$ that connect node $j$ in slice $r$ to itself in slice $s$ (Fig. 1), we have restricted our attention to unipartite, undirected network slices ($A_{ijs} = A_{jis}$) and couplings ($C_{jrs} = C_{jsr}$), but we can incorporate additional structure in the slices and couplings in the same manner as demonstrated for single-slice null models. Notating the strengths of each node individually in each slice by $k_{js} = \sum_i A_{ijs}$ and across slices by $c_{js} = \sum_r C_{jsr}$, we define the multislice strength by $\kappa_{js} = k_{js} + c_{js}$. The continuous-time Laplacian dynamics given by

    $$\dot{p}_{is} = \sum_{jr} \frac{(A_{ijs}\,\delta_{sr} + \delta_{ij}\,C_{jsr})\,p_{jr}}{\kappa_{jr}} - p_{is} \qquad (1)$$

    respects the intraslice nature of $A_{ijs}$ and the interslice couplings of $C_{jsr}$. Using the steady-state probability distribution $p^*_{jr} = \kappa_{jr}/(2\mu)$, where $2\mu = \sum_{jr} \kappa_{jr}$, we obtained the multislice null model in terms of the probability $\rho_{is|jr}$ of sampling node $i$ in slice $s$ conditional on whether the multislice structure allows one to step from $(j, r)$ to $(i, s)$, accounting for intra- and interslice steps separately as

    $$\rho_{is|jr}\,p^*_{jr} = \left(\frac{k_{is}}{2m_s}\,\frac{k_{jr}}{\kappa_{jr}}\,\delta_{sr} + \frac{C_{jsr}}{c_{jr}}\,\frac{c_{jr}}{\kappa_{jr}}\,\delta_{ij}\right)\frac{\kappa_{jr}}{2\mu} \qquad (2)$$

    where $m_s = \sum_j k_{js}$. The second term in parentheses, which describes the conditional probability of motion between two slices, leverages the definition of the $C_{jsr}$ coupling. That is, the conditional probability of stepping from $(j, r)$ to $(i, s)$ along an interslice coupling is nonzero if and only if $i = j$, and it is proportional to the probability $C_{jsr}/\kappa_{jr}$ of selecting the precise interslice link that connects to slice $s$. Subtracting this conditional joint probability from the linear (in time) approximation of the exponential describing the Laplacian dynamics, we obtained a multislice generalization of modularity (14):

    $$Q_{\mathrm{multislice}} = \frac{1}{2\mu} \sum_{ijsr} \left\{ \left( A_{ijs} - \gamma_s \frac{k_{is} k_{js}}{2m_s} \right) \delta_{sr} + \delta_{ij}\,C_{jsr} \right\} \delta(g_{is}, g_{jr}) \qquad (3)$$

    where we have used reweighting of the conditional probabilities, which allows a different resolution $\gamma_s$ in each slice. We have absorbed the resolution parameter for the interslice couplings into the magnitude of the elements of $C_{jsr}$, which, for simplicity, we presume to take binary values $\{0, \omega\}$ indicating the absence (0) or presence ($\omega$) of interslice links.
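A minimal sketch of how Eq. 3 could be evaluated for a given assignment of node-slice pairs to communities. The function name, array layouts, and toy network are illustrative choices, not the authors' code. The sketch takes $2m_s$ to be the total strength of slice $s$ (so that $m_s$ counts edges, the usual modularity convention); that reading is an assumption. It only scores a given partition and does not optimize it.

```python
import numpy as np

def multislice_modularity(A, C, g, gamma=1.0):
    """Evaluate Eq. 3 for a fixed partition.

    A: (S, N, N) symmetric intraslice adjacencies, A[s, i, j]
    C: (N, S, S) interslice couplings, C[j, s, r]
    g: (N, S) integer community labels, g[i, s]
    gamma: scalar or length-S resolution parameters
    """
    A = np.asarray(A, dtype=float)
    C = np.asarray(C, dtype=float)
    g = np.asarray(g)
    n_slices = A.shape[0]
    gamma = np.broadcast_to(np.asarray(gamma, dtype=float), (n_slices,))
    two_mu = A.sum() + C.sum()  # 2*mu = sum_jr kappa_jr
    total = 0.0
    for s in range(n_slices):
        k = A[s].sum(axis=1)    # k_is
        two_ms = k.sum()        # assumed convention: 2*m_s = sum_j k_js (must be > 0)
        B = A[s] - gamma[s] * np.outer(k, k) / two_ms
        same = g[:, s][:, None] == g[:, s][None, :]
        total += (B * same).sum()
    # Interslice term: delta_ij * C_jsr * delta(g_js, g_jr)
    same_across = g[:, :, None] == g[:, None, :]
    total += (C * same_across).sum()
    return total / two_mu

# Toy example: two identical slices, each two triangles joined by one edge.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
A1 = np.zeros((6, 6))
for i, j in edges:
    A1[i, j] = A1[j, i] = 1.0
A = np.stack([A1, A1])
C = np.zeros((6, 2, 2))
C[:, 0, 1] = C[:, 1, 0] = 1.0                    # omega = 1 between the two slices
g_same = np.array([[0, 0]] * 3 + [[1, 1]] * 3)   # labels persist across slices
g_diff = np.array([[0, 2]] * 3 + [[1, 3]] * 3)   # labels change between slices

q_same = multislice_modularity(A, C, g_same)
q_diff = multislice_modularity(A, C, g_diff)
```

With the coupling switched on, keeping each node in the same community across slices scores higher than relabeling it, which is the effect the parameter $\omega$ is meant to control.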

    Fig. 3. Multislice community detection of U.S. Senate roll call vote similarities (23) with $\omega = 0.5$ coupling of 110 slices (i.e., the number of 2-year Congresses from 1789 to 2008) across time. (A) Colors indicate assignments to nine communities of the 1884 unique senators (sorted vertically and connected across Congresses by dashed lines) in each Congress in which they appear. The dark blue and red communities correspond closely to the modern Democratic and Republican parties, respectively. Horizontal bars indicate the historical period of each community, with accompanying text enumerating nominal party affiliations of the single-slice nodes (each representing a senator in a Congress): PA, pro-administration; AA, anti-administration; F, Federalist; DR, Democratic-Republican; W, Whig; AJ, anti-Jackson; A, Adams; J, Jackson; D, Democratic; R, Republican. Vertical gray bars indicate Congresses in which three communities appeared simultaneously. (B) The same assignments according to state affiliations.

    [Figure labels. Panel A axes: Year (1800 to 2000) and Senator. Panel B axis: Congress # (10 to 110), with senators grouped by state. Community compositions printed on the horizontal bars: 40PA, 24F, 8AA; 151DR, 30AA, 14PA, 5F; 141F, 43DR; 44D, 2R; 1784R, 276D, 149DR, 162J, 53W, 84 other; 176W, 97AJ, 61DR, 49A, 24D, 19F, 13J, 37 other; 3168D, 252R, 73 other; 222D, 6W, 11 other; 1490R, 247D, 19 other.]


    JUKKA-PEKKA ONNELA et al. PHYSICAL REVIEW E 86, 036104 (2012)

    FIG. 7. (Color online) MRFs for all of the network categories containing at least eight networks (see Table I). At each value of $\xi$, the upper curve shows the maximum value of $H_{\mathrm{eff}}$ (magenta, left panel in each category), $S_{\mathrm{eff}}$ (blue, center panels), and $\eta_{\mathrm{eff}}$ (black, right panels) for all networks in the category and the lower curve shows the minimum value. The dashed curves show the corresponding mean MRFs. [Categories shown: Social; Facebook; Political: voting; Political: cosponsorship; Political: committee; Protein interaction; Metabolic; Brain; Fungal; Financial; Language; Collaboration.]

    A. Voting in the United States Senate

    Our first example deals with roll-call voting in the US Senate [31–34,48]. Establishing a taxonomy of networks detailing the voting similarities of individual legislators complements previous studies of these data, and it facilitates the comparison of voting similarity networks across time. We consider Congresses 1–110, which cover the period 1789–2008. As in Ref. [34], we construct networks from the roll-call data [31,32] for each two-year Congress such that the adjacency matrix element $A_{ij} \in [0,1]$ represents the number of times Senators $i$ and $j$ voted the same way on a bill (either both in favor of it or both against it) divided by the total number of bills on which both of them voted. Following the approach of Ref. [32], we consider only nonunanimous roll-call votes, which are defined as votes in which at least 3% of the Senators were in the minority.
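The network construction described above can be sketched as follows. The vote encoding (+1 yea, -1 nay, 0 not voting), the function name `voting_similarity`, measuring the 3% minority threshold against the Senators who actually cast a vote, and zeroing the diagonal are all assumptions made for illustration; the toy vote matrix is made up.

```python
import numpy as np

def voting_similarity(votes, min_minority=0.03):
    """Agreement-fraction adjacency from a (n_senators, n_bills) vote matrix.

    votes[i, b] is +1 (yea), -1 (nay), or 0 (did not vote) -- assumed encoding.
    """
    votes = np.asarray(votes)
    yea = (votes == 1).sum(axis=0)
    nay = (votes == -1).sum(axis=0)
    cast = yea + nay
    # Keep only nonunanimous votes: minority share of cast votes >= threshold.
    keep = (cast > 0) & (np.minimum(yea, nay) >= min_minority * np.maximum(cast, 1))
    v = votes[:, keep].astype(float)
    voted = (v != 0).astype(float)
    both = voted @ voted.T  # number of bills on which both senators voted
    # v_i * v_j is +1 on agreement, -1 on disagreement, 0 if either abstained,
    # so the number of agreements is (both + v v^T) / 2.
    agree = (both + v @ v.T) / 2.0
    A = np.divide(agree, both, out=np.zeros_like(agree), where=both > 0)
    np.fill_diagonal(A, 0.0)  # assumption: no self-loops
    return A

# Made-up example: 3 senators, 4 bills (the first bill is unanimous and is dropped).
votes = np.array([[1,  1,  1, -1],
                  [1,  1, -1, -1],
                  [1, -1,  0,  1]])
A = voting_similarity(votes)
```

Here senators 0 and 1 agree on two of the three retained bills, while senator 2 disagrees with both of them on every bill they share.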

    Much research on the US Congress has been devoted to the ebb and flow of partisan polarization over time and the influence of parties on roll-call voting [33,34]. In highly polarized legislatures, representatives tend to vote along party lines, so there are strong similarities in the voting patterns of members of the same party and strong differences between members of different parties. In contrast, during periods of low polarization, the party lines become blurred. The notion of partisan polarization can be used to help understand the taxonomy of Senates in Fig. 8, in which we consider two measures of polarization. The first measure uses DW-Nominate scores (a multidimensional scaling technique commonly used in political science [32,33]), where the extent of polarization is given by the absolute value of the difference between the mean first-dimension DW-Nominate scores for members of one party and the same mean for members of the other party [31–33]. In particular, we use the simplest such measure of polarization, called MPR polarization, which assumes a competitive two-party system and hence cannot be calculated prior to the 46th Senate. The second measure that we consider is the maximum modularity $Q$ over partitions of

    FIG. 8. (Color) (a) Dendrogram for Senate roll-call voting networks for the 1st–110th Congresses. Each leaf in the dendrogram represents a single Senate. The two horizontal color bars below the dendrograms indicate polarization measured in terms of optimized modularity (upper bar) and DW-Nominate scores (lower bar). We color the branches in the dendrogram corresponding to periods of similar polarization. (b) Polarization of the US Senate as a function of time, which we label using the Congress number. The height of each stem indicates the level of polarization measured using optimized modularity, and the color of each stem gives the cluster membership of each Senate in (a). The black curve shows the DW-Nominate polarization. Note that we have normalized both measures to lie in the interval [0,1]. [Figure labels: color bar from 0 to 1; panel (b) horizontal axis: Congress number (10 to 110); vertical axis from 0 to 1, labeled Modularity (Q) and DW-Nominate polarization.]

    a network. It was shown recently that $Q$ is a good measure of polarization even for Congresses without clear party divisions [34]. Modularity is given in terms of the energy $H$ in Eq. (1) by $Q = -H(\lambda = 1)/(2m)$.
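The first polarization measure described above, the absolute difference between the two parties' mean first-dimension DW-Nominate scores, reduces to a few lines. This is a sketch only: the function name `party_mean_polarization` is hypothetical, and the scores and party labels are made-up numbers, not values from the roll-call data.

```python
import numpy as np

def party_mean_polarization(scores, parties):
    """|mean score of one party - mean score of the other| for a two-party chamber."""
    scores = np.asarray(scores, dtype=float)
    parties = np.asarray(parties)
    labels = np.unique(parties)
    if labels.size != 2:
        # The measure assumes a competitive two-party system.
        raise ValueError("expected exactly two parties")
    mean_a = scores[parties == labels[0]].mean()
    mean_b = scores[parties == labels[1]].mean()
    return abs(mean_a - mean_b)

# Made-up first-dimension scores for four senators.
scores = [-0.4, -0.2, 0.3, 0.5]
parties = ["D", "D", "R", "R"]
polarization = party_mean_polarization(scores, parties)
```

The explicit two-party check mirrors the restriction stated in the text, which is why the measure is unavailable for early Senates.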

    In Fig. 8(a), we include bars under the dendrograms to represent the two polarization measures, both of which have been normalized to lie in the interval [0,1]. The bars demonstrate that Senates with similar levels of polarization (measured in terms of both DW-Nominate scores and optimized modularity values) are usually assigned to the same group, suggesting that our MRF clustering technique groups Senates based on the polarization of roll-call votes. We have also colored dendrogram groups according to their mean levels of polarization using optimized modularity, where the brown group in the dendrogram corresponds to the most polarized Senates and the blue group corresponds to the least polarized Senates. We chose the specific number of groups by inspection of the dendrogram. Although one ought to expect similarity in the results from the modularity-based measure of polarization and the MRF clustering, it is important to stress that the MRF clustering method is based on different principles; modularity attempts to quantify the extent to which a given network is modular, whereas the MRF clustering explicitly


    data: US senators

  • note: $i$ and $j$ are node indices, and $s$ and $r$ are layer indices. The adjacency tensor $A_{ijs} \neq 0$ if nodes $i$ and $j$ are connected in layer $s$, and $A_{ijs} = 0$ otherwise. $k_{is}$ is the degree (or strength) of node $i$ in layer $s$, $m_s$ is the number of edges (or sum of weights) in layer $s$, and $\gamma_s = \gamma$ is the resolution parameter in layer $s$. $C_{jsr} = \omega \neq 0$ if layers $s$ and $r$ are connected via node $j$, and $C_{jsr} = 0$ otherwise. The normalization factor $2\mu = \sum_{ijs} A_{ijs} + \sum_{jsr} C_{jsr}$ for $Q_{\mathrm{multilayer}} \in [-1, 1]$.

    $$Q_{\mathrm{multilayer}} = \frac{1}{2\mu} \sum_{ijsr} \left[ \left( A_{ijs} - \gamma_s \frac{k_{is} k_{js}}{2m_s} \right) \delta_{sr} + \delta_{ij}\,C_{jsr} \right] \delta(g_{is}, g_{jr})$$
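The normalization factor in the note can be checked against the paper's definition of $2\mu$ as the total multislice strength. Writing $k_{js} = \sum_i A_{ijs}$ for the intralayer strength, $c_{js} = \sum_r C_{jsr}$ for the interlayer strength, and $\kappa_{js} = k_{js} + c_{js}$, the two expressions agree. The worked numbers below are a hypothetical case (every node present in both of two layers, seven unweighted edges per layer), not data from the talk.

```latex
\begin{align}
2\mu &= \sum_{js} \kappa_{js} = \sum_{js} \left( k_{js} + c_{js} \right) \\
     &= \sum_{js} \sum_{i} A_{ijs} + \sum_{js} \sum_{r} C_{jsr}
      = \sum_{ijs} A_{ijs} + \sum_{jsr} C_{jsr}.
\end{align}
% Hypothetical example: N = 6 nodes, S = 2 layers, 7 unweighted edges per layer,
% and coupling \omega = 1 between the two layers for every node.
\begin{align}
\sum_{ijs} A_{ijs} &= 2 \times (2 \times 7) = 28, &
\sum_{jsr} C_{jsr} &= 2\,\omega\,N\,(S - 1) = 12, &
2\mu &= 28 + 12 = 40.
\end{align}
```

Each undirected edge and each interlayer link is counted twice in these sums, once from each endpoint, which is why the factor is written as $2\mu$ rather than $\mu$.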

    Community Structure in Time-Dependent, Multiscale, and Multiplex Networks

    Peter J. Mucha,1,2* Thomas Richardson,1,3 Kevin Macon,1 Mason A. Porter,4,5 Jukka-Pekka Onnela6,7

    Network science is an interdisciplinary endeavor, with methods and applications drawn from across the natural, social, and information sciences. A prominent problem in network science is the algorithmic detection of tightly connected groups of nodes known as communities. We developed a generalized framework of network quality functions that allowed us to study the community structure of arbitrary multislice networks, which are combinations of individual networks coupled through links that connect each node in one network slice to itself in other slices. This framework allows studies of community structure in a general setting encompassing networks that evolve over time, have multiple types of links (multiplexity), and have multiple scales.


    multilayer community index $g_{is}$: for node $i$ on layer $s$


    Community Structure inTime-Dependent, Multiscale,and Multiplex NetworksPeter J. Mucha,1,2* Thomas Richardson,1,3 Kevin Macon,1 Mason A. Porter,4,5 Jukka-Pekka Onnela6,7

    Network science is an interdisciplinary endeavor, with methods and applications drawn from across the natural, social, and information sciences. A prominent problem in network science is the algorithmic detection of tightly connected groups of nodes known as communities. We developed a generalized framework of network quality functions that allowed us to study the community structure of arbitrary multislice networks, which are combinations of individual networks coupled through links that connect each node in one network slice to itself in other slices. This framework allows studies of community structure in a general setting encompassing networks that evolve over time, have multiple types of links (multiplexity), and have multiple scales.

    The study of graphs, or networks, has a long tradition in fields such as sociology and mathematics, and it is now ubiquitous in academic and everyday settings. An important tool in network analysis is the detection of mesoscopic structures known as communities (or cohesive groups), which are defined intuitively as groups of nodes that are more tightly connected to each other than they are to the rest of the network (1–3). One way to quantify communities is by a quality function that compares the number of intracommunity edges to what one would expect at random. Given the network adjacency matrix A, where the element $A_{ij}$ details a direct connection between nodes i and j, one can construct a quality function Q (4, 5) for the partitioning of nodes into communities as $Q = \sum_{ij} (A_{ij} - P_{ij})\,\delta(g_i, g_j)$, where $\delta(g_i, g_j) = 1$ if the community assignments $g_i$ and $g_j$ of nodes i and j are the same and 0 otherwise, and $P_{ij}$ is the expected weight of the edge between i and j under a specified null model.
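    The quality function above can be sketched numerically as follows. This is a minimal illustration, not code from the paper: the function names `quality` and `newman_girvan_null` are hypothetical, and the null model $P_{ij} = \gamma k_i k_j / 2m$ is the standard modularity choice, swapped in here as an assumption because the paragraph leaves $P_{ij}$ general.

```python
import numpy as np

def quality(A, P, g):
    """Return sum_ij (A_ij - P_ij) * delta(g_i, g_j) for community labels g."""
    A = np.asarray(A, dtype=float)
    P = np.asarray(P, dtype=float)
    g = np.asarray(g)
    same = g[:, None] == g[None, :]          # delta(g_i, g_j) as a boolean matrix
    return float(((A - P) * same).sum())

def newman_girvan_null(A, gamma=1.0):
    """Assumed null model: P_ij = gamma * k_i * k_j / (2m), with 2m = sum_i k_i."""
    A = np.asarray(A, dtype=float)
    k = A.sum(axis=1)                        # degree (or strength) of each node
    return gamma * np.outer(k, k) / k.sum()

# Toy network (hypothetical): two triangles joined by one edge.
A = np.zeros((6, 6))
for i, j in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]:
    A[i, j] = A[j, i] = 1.0

P = newman_girvan_null(A)
two_m = A.sum()
q_split = quality(A, P, [0, 0, 0, 1, 1, 1]) / two_m   # one community per triangle
q_single = quality(A, P, [0] * 6) / two_m             # everything in one community
```

    Dividing by 2m, as in the modularity formula on the earlier slide, makes the two partitions comparable; the split into triangles scores higher than the single-community partition.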

    The choice of null model is a crucial consideration in studying network community structure (2). After selecting a null model appropriate to the network and application at hand, one can use a variety of computational heuristics to assign nodes to communities to optimize the quality Q (2, 3). However, such null models have not been available for time-dependent networks; analyses have instead depended on ad hoc methods to piece together the structures obtained at different times (6–9) or have abandoned quality functions in favor of such alternatives as the Minimum Description Length principle (10). Although tensor decompositions (11) have been used to cluster network data with different types of connections, no quality-function method has been developed for such multiplex networks.

    We developed a methodology to remove these limits, generalizing the determination of community structure via quality functions to multislice networks that are defined by coupling multiple adjacency matrices (Fig. 1). The connections encoded by the network slices are flexible; they can represent variations across time, variations across different types of connections, or even community detection of the same network at different scales. However, the usual procedure for establishing a quality function as a direct count of the intracommunity edge weight minus that expected at random fails to provide any contribution from these interslice couplings. Because they are specified by common identifications of nodes across slices, interslice couplings are either present or absent by definition, so when they do fall inside communities, their contribution in the count of intracommunity edges exactly cancels that expected at random. In contrast, by formulating a null model in terms of stability of communities under Laplacian dynamics, we have derived a principled generalization of community detection to multislice networks,


    1 Carolina Center for Interdisciplinary Applied Mathematics, Department of Mathematics, University of North Carolina, Chapel Hill, NC 27599, USA. 2 Institute for Advanced Materials, Nanoscience and Technology, University of North Carolina, Chapel Hill, NC 27599, USA. 3 Operations Research, North Carolina State University, Raleigh, NC 27695, USA. 4 Oxford Centre for Industrial and Applied Mathematics, Mathematical Institute, University of Oxford, Oxford OX1 3LB, UK. 5 CABDyN Complexity Centre, University of Oxford, Oxford OX1 1HP, UK. 6 Department of Health Care Policy, Harvard Medical School, Boston, MA 02115, USA. 7 Harvard Kennedy School, Harvard University, Cambridge, MA 02138, USA.

    *To whom correspondence should be addressed. E-mail: [email protected]

    Fig. 1. Schematic of a multislice network. Four slices s = {1, 2, 3, 4} represented by adjacencies $A_{ijs}$ encode intraslice connections (solid lines). Interslice connections (dashed lines) are encoded by $C_{jrs}$, specifying the coupling of node j to itself between slices r and s. For clarity, interslice couplings are shown for only two nodes and depict two different types of couplings: (i) coupling between neighboring slices, appropriate for ordered slices; and (ii) all-to-all interslice coupling, appropriate for categorical slices.
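    The two coupling schemes named in the caption can be sketched as a small tensor builder. This is an illustrative sketch only: the function name `interslice_coupling`, the index order `C[j, s, r]`, and the choice of a single uniform coupling value `omega` for every node are assumptions made for this example.

```python
import numpy as np

def interslice_coupling(n_nodes, n_slices, omega, kind="ordered"):
    """Build C[j, s, r], the coupling of node j to itself between slices s and r.

    kind="ordered":     only neighboring slices (|s - r| == 1) are coupled.
    kind="categorical": every pair of distinct slices is coupled.
    """
    if kind not in ("ordered", "categorical"):
        raise ValueError("kind must be 'ordered' or 'categorical'")
    C = np.zeros((n_nodes, n_slices, n_slices))
    for s in range(n_slices):
        for r in range(n_slices):
            if s == r:
                continue                      # no self-coupling within a slice
            if kind == "categorical" or abs(s - r) == 1:
                C[:, s, r] = omega
    return C

# Four slices, as in the schematic; three nodes and omega = 0.5 are arbitrary.
C_ordered = interslice_coupling(3, 4, 0.5, kind="ordered")
C_categorical = interslice_coupling(3, 4, 0.5, kind="categorical")
```

    With four slices, each node carries 6 nonzero ordered couplings (3 neighboring pairs, counted in both directions) and 12 categorical ones.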

    [Fig. 2 panels: coupling = 0, coupling = 0.1, coupling = 1; axes: nodes versus resolution parameters]

    Fig. 2. Multislice community detection of the Zachary Karate Club network (22) across multiple resolutions. Colors depict community assignments of the 34 nodes (renumbered vertically to group similarly assigned nodes) in each of the 16 slices (with resolution parameters $\gamma_s = \{0.25, 0.5, \ldots, 4\}$), for $\omega = 0$ (top), $\omega = 0.1$ (middle), and $\omega = 1$ (bottom). Dashed lines bound the communities obtained using the default resolution ($\gamma = 1$).

    14 MAY 2010 VOL 328 SCIENCE www.sciencemag.org 876

    CORRECTED 16 JULY 2010; SEE LAST PAGE


    Multilayer community detection

    P. J. Mucha, T. Richardson, K. Macon, M. A. Porter, and J.-P. Onnela, Science 328, 876 (2010).

    all the layers connected to each other: categorical multilayer communities

    only the adjacent layers connected to each other: ordered multilayer communities

    different slices: time series or categories

    nodes in individual slices

    (weighted) edges

    with a single parameter controlling the interslice correspondence of communities.

    Important to our method is the equivalence between the modularity quality function (12) [with a resolution parameter (5)] and stability of communities under Laplacian dynamics (13), which we have generalized to recover the null models for bipartite, directed, and signed networks (14). First, we obtained the resolution-parameter generalization of Barber's null model for bipartite networks (15) by requiring the independent joint probability contribution to stability in (13) to be conditional on the type of connection necessary to step between two nodes. Second, we recovered the standard null model for directed networks (16, 17) (again with a resolution parameter) by generalizing the Laplacian dynamics to include motion along different kinds of connections (in this case, both with and against the direction of a link). By this generalization, we similarly recovered a null model for signed networks (18). Third, we interpreted the stability under Laplacian dynamics flexibly to permit different spreading weights on the different types of links, giving multiple resolution parameters to recover a general null model for signed networks (19).

    We applied these generalizations to derive null models for multislice networks that extend the existing quality-function methodology, including an additional parameter $\omega$ to control the coupling between slices. Representing each network slice s by adjacencies $A_{ijs}$ between nodes i and j, with interslice couplings $C_{jrs}$ that connect node j in slice r to itself in slice s (Fig. 1), we have restricted our attention to unipartite, undirected network slices ($A_{ijs} = A_{jis}$) and couplings ($C_{jrs} = C_{jsr}$), but we can incorporate additional structure in the slices and couplings in the same manner as demonstrated for single-slice null models. Notating the strengths of each node individually in each slice by $k_{js} = \sum_i A_{ijs}$ and across slices by $c_{js} = \sum_r C_{jsr}$, we define the multislice strength by $\kappa_{js} = k_{js} + c_{js}$. The continuous-time Laplacian dynamics given by

    $$\dot{p}_{is} = \sum_{jr} \left( A_{ijs} \delta_{sr} + \delta_{ij} C_{jsr} \right) \frac{p_{jr}}{\kappa_{jr}} - p_{is} \qquad (1)$$

    respects the intraslice nature of $A_{ijs}$ and the interslice couplings of $C_{jsr}$. Using the steady-state probability distribution $p^{*}_{jr} = \kappa_{jr}/2\mu$, where $2\mu = \sum_{jr} \kappa_{jr}$, we obtained the multislice null model in terms of the probability $\rho_{is|jr}$ of sampling node i in slice s conditional on whether the multislice structure allows one to step from (j, r) to (i, s), accounting for intra- and interslice steps separately as

    $$\rho_{is|jr}\, p^{*}_{jr} = \left[ \frac{k_{is}}{2 m_s} \frac{k_{jr}}{\kappa_{jr}} \delta_{sr} + \frac{C_{jsr}}{c_{jr}} \frac{c_{jr}}{\kappa_{jr}} \delta_{ij} \right] \frac{\kappa_{jr}}{2\mu} \qquad (2)$$

    where $m_s = \sum_j k_{js}$. The second term in parentheses, which describes the conditional probability of motion between two slices, leverages the definition of the $C_{jsr}$ coupling. That is, the conditional probability of stepping from (j, r) to (i, s) along an interslice coupling is nonzero if and only if i = j, and it is proportional to the probability $C_{jsr}/\kappa_{jr}$ of selecting the precise interslice link that connects to slice s. Subtracting this conditional joint probability from the linear (in time) approximation of the exponential describing the Laplacian dynamics, we obtained a multislice generalization of modularity (14):

    $$Q_{\mathrm{multislice}} = \frac{1}{2\mu} \sum_{ijsr} \left[ \left( A_{ijs} - \gamma_s \frac{k_{is} k_{js}}{2 m_s} \right) \delta_{sr} + \delta_{ij} C_{jsr} \right] \delta(g_{is}, g_{jr}) \qquad (3)$$

    where we have used reweighting of the conditional probabilities, which allows a different resolution $\gamma_s$ in each slice. We have absorbed the resolution parameter for the interslice couplings into the magnitude of the elements of $C_{jsr}$, which, for simplicity, we presume to take binary values $\{0, \omega\}$ indicating the absence (0) or presence ($\omega$) of interslice links.
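    Equation (3) can be evaluated directly for a given assignment of node-slice pairs to communities. The sketch below is only an illustration under stated assumptions: the function name `multislice_modularity` and the array layouts `A[s, i, j]`, `C[j, s, r]`, and `g[s, i]` are hypothetical, and it takes $2 m_s$ to be the total strength $\sum_j k_{js}$ of slice s (that is, $m_s$ counts edges, as in the note on the slide). It evaluates Q for a fixed partition and does not search for the optimal one.

```python
import numpy as np

def multislice_modularity(A, C, g, gamma=1.0):
    """Evaluate Eq. (3) for adjacencies A[s, i, j], couplings C[j, s, r], labels g[s, i].

    Assumption: 2*m_s is the total strength of slice s (sum_j k_js).
    gamma may be a scalar or one value per slice.
    """
    A = np.asarray(A, dtype=float)
    C = np.asarray(C, dtype=float)
    g = np.asarray(g)
    n_slices = A.shape[0]
    gamma = np.broadcast_to(np.asarray(gamma, dtype=float), (n_slices,))
    k = A.sum(axis=1)                         # k[s, j] = sum_i A_ijs
    two_mu = A.sum() + C.sum()                # normalization 2*mu
    total = 0.0
    for s in range(n_slices):
        two_m_s = k[s].sum()
        if two_m_s > 0:
            # intraslice term: (A_ijs - gamma_s k_is k_js / 2m_s) delta(g_is, g_js)
            B = A[s] - gamma[s] * np.outer(k[s], k[s]) / two_m_s
            same = g[s][:, None] == g[s][None, :]
            total += (B * same).sum()
        for r in range(n_slices):
            # interslice term: C_jsr delta(g_js, g_jr), nonzero only for i == j
            total += (C[:, s, r] * (g[s] == g[r])).sum()
    return total / two_mu

# Toy case (hypothetical): two slices, each a single edge between nodes 0 and 1,
# with each node coupled to itself across the two slices with omega = 1.
A = np.array([[[0, 1], [1, 0]], [[0, 1], [1, 0]]], dtype=float)
C = np.zeros((2, 2, 2))
C[:, 0, 1] = C[:, 1, 0] = 1.0

q_all_together = multislice_modularity(A, C, [[0, 0], [0, 0]])
q_per_node = multislice_modularity(A, C, [[0, 1], [0, 1]])
q_per_slice = multislice_modularity(A, C, [[0, 0], [1, 1]])
```

    In this toy case the interslice term rewards partitions that keep each node in the same community across slices, which is the role of the coupling parameter described above.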

    [Fig. 3 panels: (A) senator versus year, 1800–2000; (B) state versus Congress #, 10–110]

    Fig. 3. Multislice community detection of U.S. Senate roll call vote similarities (23) with $\omega = 0.5$ coupling of 110 slices (i.e., the number of 2-year Congresses from 1789 to 2008) across time. (A) Colors indicate assignments to nine communities of the 1884 unique senators (sorted vertically and connected across Congresses by dashed lines) in each Congress in which they appear. The dark blue and red communities correspond closely to the modern Democratic and Republican parties, respectively. Horizontal bars indicate the historical period of each community, with accompanying text enumerating nominal party affiliations of the single-slice nodes (each representing a senator in a Congress): PA, pro-administration; AA, anti-administration; F, Federalist; DR, Democratic-Republican; W, Whig; AJ, anti-Jackson; A, Adams; J, Jackson; D, Democratic; R, Republican. Vertical gray bars indicate Congresses in which three communities appeared simultaneously. (B) The same assignments according to state affiliations.


    JUKKA-PEKKA ONNELA et al. PHYSICAL REVIEW E 86, 036104 (2012)

    [Fig. 7 panels, one set per network category: Social; Facebook; Political: voting; Political: cosponsorship; Political: committee; Protein interaction; Metabolic; Brain; Fungal; Financial; Language; Collaboration]

    FIG. 7. (Color online) MRFs for all of the network categories containing at least eight networks (see Table I). At each value of $\xi$, the upper curve shows the maximum value of $H_{\mathrm{eff}}$ (magenta, left panel in each category), $S_{\mathrm{eff}}$ (blue, center panels), and $\eta_{\mathrm{eff}}$ (black, right panels) for all networks in the category and the lower curve shows the minimum value. The dashed curves show the corresponding mean MRFs.

    A. Voting in the United States Senate

    Our first example deals with roll-call voting in the US Senate [31–34,48]. Establishing a taxonomy of networks detailing the voting similarities of individual legislators complements previous studies of these data, and it facilitates the comparison of voting similarity networks across time. We consider Congresses 1–110, which cover the period 1789–2008. As in Ref. [34], we construct networks from the roll-call data [31,32] for each two-year Congress such that the adjacency matrix element $A_{ij} \in [0,1]$ represents the number of times Senators i and j voted the same way on a bill (either both in favor of it or both against it) divided by the total number of bills on which both of them voted. Following the approach of Ref. [32], we consider only nonunanimous roll-call votes, which are defined as votes in which at least 3% of the Senators were in the minority.
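    The construction of the voting-similarity matrix described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `voting_similarity` and the vote encoding (+1 yea, -1 nay, 0 not voting) are assumptions, as are two details the text leaves open, namely that the 3% minority threshold is measured against the Senators who cast a vote on that bill and that the diagonal is set to zero.

```python
import numpy as np

def voting_similarity(votes, min_minority=0.03):
    """votes[i, b] is +1 (yea), -1 (nay), or 0 (did not vote) for senator i on bill b.

    Returns A with A_ij = (# kept bills where i and j voted the same way)
                        / (# kept bills on which both i and j voted).
    """
    votes = np.asarray(votes)
    yea = (votes == 1).sum(axis=0)
    nay = (votes == -1).sum(axis=0)
    cast = yea + nay
    minority = np.minimum(yea, nay)
    # Keep only nonunanimous votes: minority share of cast votes >= threshold (assumed).
    keep = (cast > 0) & (minority >= min_minority * cast)
    v = votes[:, keep]
    is_yea = (v == 1).astype(float)
    is_nay = (v == -1).astype(float)
    voted = is_yea + is_nay
    both = voted @ voted.T                    # bills on which both senators voted
    agree = is_yea @ is_yea.T + is_nay @ is_nay.T
    A = np.divide(agree, both, out=np.zeros_like(agree), where=both > 0)
    np.fill_diagonal(A, 0.0)                  # no self-similarity edges (assumed)
    return A

# Hypothetical roll calls: 3 senators, 3 bills; the last bill is unanimous and dropped.
votes = [[1, 1, 1],
         [1, -1, 1],
         [-1, -1, 1]]
A_votes = voting_similarity(votes)
```

    Pairs of Senators who never voted on the same retained bill get weight 0 here, which is one possible convention rather than something the text specifies.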

    Much research on the US Congress has been devoted to the ebb and flow of partisan polarization over time and the influence of parties on roll-call voting [33,34]. In highly polarized legislatures, representatives tend to vote along party lines, so there are strong similarities in the voting patterns of members of the same party and strong differences between members of different parties. In contrast, during periods of low polarization, the party lines become blurred. The notion of partisan polarization can be used to help understand the taxonomy of Senates in Fig. 8, in which we consider two measures of polarization. The first measure uses DW-Nominate scores (a multidimensional scaling technique commonly used in political science [32,33]), where the extent of polarization is given by the absolute value of the difference between the mean first-dimension DW-Nominate scores for members of one party and the same mean for members of the other party [31–33]. In particular, we use the simplest such measure of polarization, called MPR polarization, which assumes a competitive two-party system and hence cannot be calculated prior to the 46th Senate. The second measure that we consider is the maximum modularity Q over partitions of

    [Fig. 8 plots: (a) dendrogram with color bars for modularity (Q) and DW-Nominate polarization; (b) modularity (Q) and DW-Nominate polarization, both on a 0–1 scale, versus Congress number (10–110)]

    FIG. 8. (Color) (a) Dendrogram for Senate roll-call voting networks for the 1st–110th Congresses. Each leaf in the dendrogram represents a single Senate. The two horizontal color bars below the dendrograms indicate polarization measured in terms of optimized modularity (upper bar) and DW-Nominate scores (lower bar). We color the branches in the dendrogram corresponding to periods of similar polarization. (b) Polarization of the US Senate as a function of time, which we label using the Congress number. The height of each stem indicates the level of polarization measured using optimized modularity, and the color of each stem gives the cluster membership of each Senate in (a). The black curve shows the DW-Nominate polarization. Note that we have normalized both measures to lie in the interval [0,1].

    a network. It was shown recently that Q is a good measure of polarization even for Congresses without clear party divisions [34]. Modularity is given in terms of the energy H in Eq. (1) by $Q = -H(\gamma = 1)/(2m)$.

    In Fig. 8(a), we include bars under the dendrograms to represent the two polarization measures, both of which have been normalized to lie in the interval [0,1]. The bars demonstrate that Senates with similar levels of polarization (measured in terms of both DW-Nominate scores and optimized modularity values) are usually assigned to the same group, suggesting that our MRF clustering technique groups Senates based on the polarization of roll-call votes. We have also colored dendrogram groups according to their mean levels of polarization using optimized modularity, where the brown group in the dendrogram corresponds to the most polarized Senates and the blue group corresponds to the least polarized Senates. We chose the specific number of groups by inspection of the dendrogram. Although one ought to expect similarity in the results from the modularity-based measure of polarization and the MRF clustering, it is important to stress that the MRF clustering method is based on different principles; modularity attempts to quantify the extent to which a given network is modular, whereas the MRF clustering explicitly


    data: US senators

    multilayer community index: for node i on layer s

    parameter space = [$\gamma$ (intralayer resolution), $\omega$ (interlayer coupling strength)]

  • congressional cosponsorship networks

    [FIG. 1 panels: (a) bipartite network of congresspersons a–e and bills A–C; (b) congressperson-mode projection; (c) bill-mode projection; projected edges carry weights 1 or 2]

    FIG. 1: The construction procedure for the weighted projected networks from the bipartite network.

    Figure 1 illustrates the method used to project the bipartite cosponsorship network onto the weighted bill and congressperson networks, similar to the one used for the relationship between protein complexes and component proteins [1]; Tables I and II show the detailed statistics divided by eight periods. There are 15 different statuses assigned to the bills, as listed in Table III, but we do not distinguish those statuses in the following analysis for now. For the bipartite network, Fig. 2 shows the degree distributions of bills and congresspersons, indicating that the number of congresspersons cosponsoring a bill is much more heterogeneously distributed than the number of bills in which individual congresspersons participate.
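    The projection step can be sketched with a small incidence matrix. This is an illustration under assumptions, not the procedure used for the data: the function name `project` and the example matrix are hypothetical, and the projected weight is taken to be the number of shared bills (for congresspersons) or shared cosponsors (for bills), which is one common weighting that yields integer weights like those in Fig. 1.

```python
import numpy as np

def project(B):
    """Project a bipartite incidence matrix B[person, bill] onto both modes.

    Returns (W_person, W_bill): W_person[a, b] counts bills cosponsored by both
    a and b; W_bill[x, y] counts congresspersons cosponsoring both x and y.
    """
    B = np.asarray(B, dtype=float)
    W_person = B @ B.T
    W_bill = B.T @ B
    np.fill_diagonal(W_person, 0.0)           # drop self-loops
    np.fill_diagonal(W_bill, 0.0)
    return W_person, W_bill

# Hypothetical example: rows are congresspersons a-e, columns are bills A-C.
B = [[1, 1, 0],   # a cosponsors A and B
     [1, 0, 0],   # b cosponsors A
     [1, 1, 0],   # c cosponsors A and B
     [0, 1, 1],   # d cosponsors B and C
     [0, 0, 1]]   # e cosponsors C
W_person, W_bill = project(B)
```

    Here a and c share two bills and get weight 2, while bills A and C share no cosponsor and stay unconnected.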

    The bills are clearly partitioned into ten periods (2006 I, 2006 II, 2007 I, 2007 II, 2008 I, 2008 II, 2009 I, 2009 II, 2010 I, and 2010 II), but the weighted congressperson-mode projection networks for different years (composed of 130 members in total) share most congresspersons (except for the congresspersons invisible for a specific year due to their absence from the cosponsoring activities), which allows temporally changing or multiplex relations over the eight periods. In


    Congress of the Republic of Peru (2006–2011)

    Senate of the United States (1973-2009)


    temporally ordered multilayer

  • Senate of the United States (1973-2009)

    paint drip plots

    [Plots: five paint drip panels with axes senator index (50–300) and congress index (93–109), titled "19 time points, a=1.0, t=20.0" through "t=100.0" in steps of 20, labeled on the slide as follows]

    19 time points: $\gamma = 1$, $\omega = 20$

    19 time points: $\gamma = 1$, $\omega = 40$

    19 time points: $\gamma = 1$, $\omega = 60$

    19 time points: $\gamma = 1$, $\omega = 80$

    19 time points: $\gamma = 1$, $\omega = 100$

    [Plots: two heat maps over the (t, a) plane, with t from 0 to 100 and a from 0 to 2, each titled "US Senate (93rd-110th): 18 time slices"]

    log(modularity Q)

    log[flexibility (average number of community switches normalized by the number of time points)]

    congressional cosponsorship networks
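    The flexibility measure named on the slide (average number of community switches normalized by the number of time points) can be sketched as follows. This is one possible reading, not the author's implementation: the function name `flexibility` and the layout `labels[t, i]` are hypothetical, and the sketch assumes every node is present at every time point and that the average is taken over nodes.

```python
import numpy as np

def flexibility(labels):
    """labels[t, i] is the community of node i at time point t.

    Counts, for each node, how often its community changes between consecutive
    time points, averages over nodes, and divides by the number of time points.
    """
    labels = np.asarray(labels)
    n_time = labels.shape[0]
    switches = (labels[1:] != labels[:-1]).sum(axis=0)   # switches per node
    return float(switches.mean() / n_time)

# Hypothetical assignment: 3 time points, 3 nodes; only node 1 changes community.
labels = [[0, 0, 1],
          [0, 1, 1],
          [0, 0, 1]]
f = flexibility(labels)
```

    A strongly coupled multilayer partition would be expected to give a small value, since nodes then rarely change community between layers.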

  • (a) (b)


    100 1