svm2.docx

Upload: lycaphe8x

Post on 14-Apr-2018


TRANSCRIPT

  • 7/27/2019 SVM2.docx

    1/10

    Support Vector Machine

    1. Support Vector Machine: The Support Vector Machine (SVM) is a classification method based on statistical learning theory, proposed by Vapnik (1995).

    For simplicity we first consider the binary classification problem, and later extend it to multiclass classification.

    Consider the classification example shown in the figure: we must find a straight line such that every point on its left is red and every point on its right is blue. A problem in which a straight line is used to separate the classes in this way is called linear classification.

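    The linear function that discriminates between the two classes is:

    $$ y(x) = w^T \phi(x) + b \qquad (1) $$

    where:

    $w$ is the weight vector, i.e. the normal vector of the separating hyperplane, and $T$ denotes the transpose.

    $b$ is the bias.

    $\phi(x)$ is the feature vector; $\phi$ is the function that maps the input space into the feature space.

    The input data set consists of N input vectors {x1, x2,...,xN} with corresponding labels {t1,...,tN}, where $t_n \in \{-1, +1\}$.

    A note on terminology: "data point" and "sample" both refer to an input vector xi. In a 2-dimensional space the separating boundary is a straight line, while in a higher-dimensional space it is called a hyperplane.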

    Assume that the data set is perfectly linearly separable in the feature space (every sample can be assigned to its correct class). Then there exist parameter values w and b in (1) such that $y(x_n) > 0$ for points with label $t_n = +1$ and $y(x_n) < 0$ for points with $t_n = -1$, so that $t_n y(x_n) > 0$ for every training data point.
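    The separability condition can be checked numerically. The sketch below is illustrative only: it assumes the identity feature map phi(x) = x, and the toy data and the values of w and b are made up for the example.

```python
def phi(x):
    """Feature map; the identity here (an assumption made for illustration)."""
    return list(x)

def y(x, w, b):
    """Linear discriminant y(x) = w^T phi(x) + b from equation (1)."""
    return sum(wi * fi for wi, fi in zip(w, phi(x))) + b

# Hypothetical toy data: two points per class, labels t_n in {-1, +1}.
X = [(2.0, 2.0), (3.0, 1.0), (0.0, 0.0), (-1.0, 1.0)]
t = [+1, +1, -1, -1]
w, b = (0.5, 0.5), -1.0

# (w, b) separates the data exactly when t_n * y(x_n) > 0 for every n.
separated = all(tn * y(xn, w, b) > 0 for xn, tn in zip(X, t))
print(separated)
```

    A new point is assigned to the class given by the sign of y(x).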

    SVM approaches this problem through the concept of the margin. The margin is defined as the smallest distance from the separating boundary to any data point, that is, the distance from the boundary to the points nearest to it.

    In SVM, the best separating boundary is the one with the largest margin. Many separating boundaries exist, oriented in different directions, and we choose the one whose margin is largest.

    The distance from a data point to the separating surface is given by:

    $$ \frac{|y(x)|}{\|w\|} $$


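    Since we are considering the case in which all data points are correctly classified, $t_n y(x_n) > 0$ for every n. The distance from the point $x_n$ to the separating surface can therefore be rewritten as:

    $$ \frac{t_n y(x_n)}{\|w\|} = \frac{t_n (w^T \phi(x_n) + b)}{\|w\|} \qquad (2) $$

    The margin is the perpendicular distance to the closest data point $x_n$ in the data set, and we want to find the optimal values of w and b by maximizing this distance. The problem to solve is therefore:

    $$ \arg\max_{w,b} \left\{ \frac{1}{\|w\|} \min_n \left[ t_n (w^T \phi(x_n) + b) \right] \right\} \qquad (3) $$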
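    To make the quantity being maximized concrete, the following sketch evaluates the distance (2) for each point and takes the minimum over n. It assumes phi(x) = x; the data and both candidate boundaries are hypothetical.

```python
import math

def signed_distance(x, t, w, b):
    """t_n * (w^T x_n + b) / ||w||, i.e. equation (2) with phi(x) = x."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    return t * (sum(wi * xi for wi, xi in zip(w, x)) + b) / norm

def margin(X, T, w, b):
    """Smallest distance from the boundary to any data point."""
    return min(signed_distance(x, t, w, b) for x, t in zip(X, T))

# Hypothetical toy data and two candidate separating lines.
X = [(2.0, 2.0), (4.0, 4.0), (0.0, 0.0)]
T = [+1, +1, -1]
m_good = margin(X, T, (0.5, 0.5), -1.0)   # boundary x1 + x2 = 2
m_poor = margin(X, T, (1.0, 0.0), -0.5)   # boundary x1 = 0.5
print(m_good, m_poor)
```

    Both lines separate the data, but the first one leaves a larger margin, so it is the one preferred under (3).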

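    We can take the factor $1/\|w\|$ outside the minimization because w does not depend on n. Solving this problem directly would be very complex, so we convert it into an equivalent problem that is easier to solve. Rescaling $w \to \kappa w$ and $b \to \kappa b$ leaves the distance $t_n y(x_n)/\|w\|$ of every data point unchanged, so we are free to choose the scale for which the point closest to the boundary satisfies (4). This transformation does not change the nature of the problem.

    $$ t_n (w^T \phi(x_n) + b) = 1 \qquad (4) $$

    From now on, all data points satisfy the constraints:

    $$ t_n (w^T \phi(x_n) + b) \ge 1, \quad n = 1, \dots, N \qquad (5) $$

    The optimization problem, which requires maximizing $\|w\|^{-1}$, is converted into minimizing $\|w\|^2$, and we rewrite it as:

    $$ \arg\min_{w,b} \frac{1}{2} \|w\|^2 \qquad (6) $$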


    Including the factor $\frac{1}{2}$ makes it more convenient to take derivatives later.
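    The rescaling step can be sketched as follows. This is an illustration under the assumptions that phi(x) = x and that the starting (w, b) already separates the data; the helper names and the data are hypothetical.

```python
import math

def functional_margins(X, T, w, b):
    """The values t_n * (w^T x_n + b) for every data point."""
    return [t * (sum(wi * xi for wi, xi in zip(w, x)) + b) for x, t in zip(X, T)]

def to_canonical(X, T, w, b):
    """Rescale (w, b) by kappa so that min_n t_n (w^T x_n + b) = 1, as in (4)."""
    kappa = 1.0 / min(functional_margins(X, T, w, b))
    return tuple(kappa * wi for wi in w), kappa * b

X = [(2.0, 2.0), (4.0, 4.0), (0.0, 0.0)]
T = [+1, +1, -1]
w, b = (2.0, 2.0), -4.0               # separates the data, but not in canonical form
w_c, b_c = to_canonical(X, T, w, b)

# In canonical form every point satisfies constraint (5) and the margin is 1 / ||w||.
geometric_margin = 1.0 / math.sqrt(sum(wi * wi for wi in w_c))
print(w_c, b_c, geometric_margin)
```

    Once the closest point has value 1, maximizing the margin $1/\|w\|$ is the same as minimizing $\|w\|^2$.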

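    Lagrange multiplier theory:

    The problem of maximizing a function f(x) subject to the condition $g(x) \ge 0$ is rewritten as the optimization of the Lagrangian function:

    $$ L(x, \lambda) = f(x) + \lambda g(x) $$

    where x and $\lambda$ must satisfy the Karush-Kuhn-Tucker (KKT) conditions:

    $$ g(x) \ge 0 $$

    $$ \lambda \ge 0 $$

    $$ \lambda g(x) = 0 $$

    If f(x) is to be minimized instead, the Lagrangian function is:

    $$ L(x, \lambda) = f(x) - \lambda g(x) $$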
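    As a small worked example of these conditions (the function and constraint below are chosen purely for illustration and do not come from the SVM problem itself), consider maximizing $f(x) = 1 - x_1^2 - x_2^2$ subject to $g(x) = x_1 + x_2 - 1 \ge 0$:

```latex
\begin{align*}
L(x, \lambda) &= 1 - x_1^2 - x_2^2 + \lambda (x_1 + x_2 - 1) \\
\frac{\partial L}{\partial x_1} &= -2 x_1 + \lambda = 0, \qquad
\frac{\partial L}{\partial x_2} = -2 x_2 + \lambda = 0
\;\Rightarrow\; x_1 = x_2 = \lambda / 2 \\
\lambda = 0 &\;\Rightarrow\; x = (0, 0),\; g(x) = -1 < 0
\quad \text{(infeasible, so the constraint must be active)} \\
g(x) = 0 &\;\Rightarrow\; \lambda - 1 = 0 \;\Rightarrow\; \lambda = 1 \ge 0,
\quad x^\star = (1/2,\, 1/2), \quad f(x^\star) = 1/2
\end{align*}
```

    All three KKT conditions hold at the solution: $g(x^\star) = 0$, $\lambda = 1 \ge 0$ and $\lambda g(x^\star) = 0$.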

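    To solve the problem above, we rewrite it using the Lagrangian function:

    $$ L(w, b, a) = \frac{1}{2} \|w\|^2 - \sum_{n=1}^{N} a_n \left\{ t_n (w^T \phi(x_n) + b) - 1 \right\} \qquad (7) $$

    where $a = (a_1, \dots, a_N)^T$ are the Lagrange multipliers.

    Note the minus sign in the Lagrangian: we minimize with respect to w and b, and maximize with respect to a.

    Setting the derivatives of L(w,b,a) with respect to w and b to zero gives:

    $$ w = \sum_{n=1}^{N} a_n t_n \phi(x_n) \qquad (8) $$

    $$ 0 = \sum_{n=1}^{N} a_n t_n \qquad (9) $$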

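    We eliminate w and b from L(w,b,a) by substituting (8) and (9). This leads to the optimization problem of maximizing:

    $$ \tilde{L}(a) = \sum_{n=1}^{N} a_n - \frac{1}{2} \sum_{n=1}^{N} \sum_{m=1}^{N} a_n a_m t_n t_m k(x_n, x_m) \qquad (10) $$

    subject to the constraints:

    $$ a_n \ge 0, \quad n = 1, \dots, N \qquad (11) $$

    $$ \sum_{n=1}^{N} a_n t_n = 0 \qquad (12) $$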


    Here the kernel function is defined as $k(x, x') = \phi(x)^T \phi(x')$.

    We set this problem aside for now; the technique for solving (10) subject to (11) and (12) is discussed later.
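    Before turning to how the dual problem is solved, it can help to evaluate it by hand on a tiny case. The sketch below assumes a linear kernel (phi(x) = x) and a hypothetical two-point data set for which the maximizer can be found analytically: along the feasible line a_1 = a_2 = alpha the objective is 2*alpha - 4*alpha^2, which peaks at alpha = 0.25.

```python
def k(u, v):
    """Linear kernel: k(x, x') = phi(x)^T phi(x') with phi(x) = x."""
    return sum(ui * vi for ui, vi in zip(u, v))

def dual_objective(a, X, T):
    """Dual objective (10): sum_n a_n - 1/2 sum_n sum_m a_n a_m t_n t_m k(x_n, x_m)."""
    n = len(X)
    quad = sum(a[i] * a[j] * T[i] * T[j] * k(X[i], X[j])
               for i in range(n) for j in range(n))
    return sum(a) - 0.5 * quad

def is_feasible(a, T, tol=1e-12):
    """Constraints (11) and (12): a_n >= 0 and sum_n a_n t_n = 0."""
    return all(an >= 0 for an in a) and abs(sum(an * tn for an, tn in zip(a, T))) <= tol

# Hypothetical two-point problem.
X = [(2.0, 2.0), (0.0, 0.0)]
T = [+1, -1]
a_opt = [0.25, 0.25]
print(is_feasible(a_opt, T), dual_objective(a_opt, X, T))
```

    The dual value at a_opt is 0.25, which matches the primal objective $\frac{1}{2}\|w\|^2$ for the corresponding w = (0.5, 0.5) obtained from (8).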

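    To classify a new data point with the trained model, we compute the sign of y(x) from (1), substituting w from (8):

    $$ y(x) = \sum_{n=1}^{N} a_n t_n k(x, x_n) + b \qquad (13) $$

    The solution satisfies the following KKT conditions:

    $$ a_n \ge 0 \qquad (14) $$

    $$ t_n y(x_n) - 1 \ge 0 \qquad (15) $$

    $$ a_n \left\{ t_n y(x_n) - 1 \right\} = 0 \qquad (16) $$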

    Thus for every data point, either $a_n = 0$ or $t_n y(x_n) = 1$. Data points with $a_n = 0$ do not appear in (13) and therefore play no role in predicting new data points.

    The remaining data points ($a_n > 0$) are called support vectors. They satisfy $t_n y(x_n) = 1$, so they lie on the margin of the hyperplane in the feature space.

    The support vectors are what we care about in SVM training: the classification of a new data point depends only on the support vectors.
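    The role of the support vectors in prediction can be illustrated with a short sketch. It assumes a linear kernel, so the kernel expansion (13) can be compared against the primal form (1) with w recovered from (8); the data, the multipliers a and the bias b are hypothetical values for a toy problem.

```python
def k(u, v):
    """Linear kernel (assumes phi(x) = x)."""
    return sum(ui * vi for ui, vi in zip(u, v))

def predict_dual(x, a, X, T, b):
    """y(x) = sum_n a_n t_n k(x, x_n) + b, equation (13)."""
    return sum(an * tn * k(x, xn) for an, tn, xn in zip(a, T, X)) + b

def weights_from_dual(a, X, T):
    """w = sum_n a_n t_n x_n, equation (8) with phi(x) = x."""
    dim = len(X[0])
    return [sum(an * tn * xn[d] for an, tn, xn in zip(a, T, X)) for d in range(dim)]

X = [(2.0, 2.0), (0.0, 0.0), (5.0, 5.0)]
T = [+1, -1, +1]
a = [0.25, 0.25, 0.0]      # the third point has a_n = 0: it is not a support vector
b = -1.0

x_new = (3.0, 0.0)
y_dual = predict_dual(x_new, a, X, T, b)
w = weights_from_dual(a, X, T)
y_primal = sum(wi * xi for wi, xi in zip(w, x_new)) + b
label = +1 if y_dual > 0 else -1
print(y_dual, y_primal, label)
```

    Dropping the third point from X, T and a leaves the prediction unchanged, since its term in the sum is zero.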

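    Suppose we have solved problem (10) and found the values of the multipliers a. We now need to determine the parameter b using the support vectors $x_n$, which satisfy $t_n y(x_n) = 1$. Substituting (13) gives:

    $$ t_n \left( \sum_{m \in S} a_m t_m k(x_n, x_m) + b \right) = 1 \qquad (17) $$

    where S is the set of support vectors. Although substituting a single support vector $x_n$ is enough to find b, for numerical stability we compute b by averaging over all support vectors.

    First we multiply (17) by $t_n$ (noting that $t_n^2 = 1$); the value of b is then:

    $$ b = \frac{1}{N_S} \sum_{n \in S} \left( t_n - \sum_{m \in S} a_m t_m k(x_n, x_m) \right) \qquad (18) $$

    where $N_S$ is the total number of support vectors.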
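    Equation (18) translates almost directly into code. The sketch below assumes a linear kernel and reuses hypothetical multipliers for a toy problem; the function name and the tolerance used to decide which a_n count as nonzero are illustrative choices.

```python
def k(u, v):
    """Linear kernel (assumes phi(x) = x)."""
    return sum(ui * vi for ui, vi in zip(u, v))

def bias_from_support_vectors(a, X, T, tol=1e-12):
    """Average of t_n - sum_{m in S} a_m t_m k(x_n, x_m) over the support vectors, equation (18)."""
    S = [n for n, an in enumerate(a) if an > tol]   # indices of the support vectors
    total = 0.0
    for n in S:
        total += T[n] - sum(a[m] * T[m] * k(X[n], X[m]) for m in S)
    return total / len(S)

X = [(2.0, 2.0), (0.0, 0.0), (5.0, 5.0)]
T = [+1, -1, +1]
a = [0.25, 0.25, 0.0]
b = bias_from_support_vectors(a, X, T)
print(b)
```

    Each support vector alone would give the same b here; the averaging matters in practice, when a has been found numerically and each term carries a small error.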

    So far, to keep the presentation simple, we have assumed that the data points are perfectly separable in the feature space $\phi(x)$. Insisting on perfect separation can lead to poor generalization, however: in practice some samples may be mislabeled during data collection, and forcing a perfect separation would make the model overfit.


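    To counter overfitting, we accept that a few points may be misclassified.

    To do this, we introduce slack variables $\xi_n \ge 0$, one for each data point.

    $\xi_n = 0$ for points that lie on the margin or on the correct side of it, and $\xi_n = |t_n - y(x_n)|$ for the remaining points. Hence points lying on the decision boundary $y(x_n) = 0$ have $\xi_n = 1$, and misclassified points have $\xi_n > 1$.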


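    Constraint (5) is rewritten as:

    $$ t_n y(x_n) \ge 1 - \xi_n, \quad n = 1, \dots, N \qquad (20) $$

    Our goal is now to maximize the margin while softly penalizing the points that violate it. The quantity to minimize becomes:

    $$ C \sum_{n=1}^{N} \xi_n + \frac{1}{2} \|w\|^2 \qquad (21) $$

    where C > 0 controls the trade-off between the importance given to the slack variables $\xi_n$ and to the margin.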
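    The slack variables and the objective (21) can be computed as in the sketch below. It uses the fact that, for labels in {-1, +1}, the definition of $\xi_n$ is equivalent to max(0, 1 - t_n y(x_n)). The feature map is assumed to be the identity, and the data and (w, b) are hypothetical.

```python
def slack(t, y_value):
    """xi_n: 0 if t_n y(x_n) >= 1, otherwise |t_n - y(x_n)| = 1 - t_n y(x_n)."""
    return max(0.0, 1.0 - t * y_value)

def soft_margin_objective(w, b, X, T, C):
    """Return (C * sum_n xi_n + 1/2 ||w||^2, list of xi_n), following (21)."""
    ys = [sum(wi * xi for wi, xi in zip(w, x)) + b for x in X]
    xis = [slack(t, yv) for t, yv in zip(T, ys)]
    return C * sum(xis) + 0.5 * sum(wi * wi for wi in w), xis

# Hypothetical data: the last two points violate the margin.
X = [(2.0, 2.0), (0.0, 0.0), (1.0, 1.0), (3.0, 0.0)]
T = [+1, -1, +1, -1]
w, b = (0.5, 0.5), -1.0

obj, xis = soft_margin_objective(w, b, X, T, C=1.0)
obj_large_c, _ = soft_margin_objective(w, b, X, T, C=10.0)
print(xis, obj, obj_large_c)
```

    The third point lies on the decision boundary (slack 1) and the fourth is misclassified (slack 1.5). Increasing C makes the same violations more expensive, which pushes the optimizer toward fitting the training data more strictly.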

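    We now need to minimize (21) subject to the constraints (20) and $\xi_n \ge 0$. The corresponding Lagrangian is:

    $$ L(w, b, \xi, a, \mu) = \frac{1}{2} \|w\|^2 + C \sum_{n=1}^{N} \xi_n - \sum_{n=1}^{N} a_n \left\{ t_n y(x_n) - 1 + \xi_n \right\} - \sum_{n=1}^{N} \mu_n \xi_n \qquad (22) $$

    where $\{a_n \ge 0\}$ and $\{\mu_n \ge 0\}$ are the Lagrange multipliers.

    The KKT conditions to be satisfied are:

    $$ a_n \ge 0 \qquad (23) $$

    $$ t_n y(x_n) - 1 + \xi_n \ge 0 \qquad (24) $$

    $$ a_n \left( t_n y(x_n) - 1 + \xi_n \right) = 0 \qquad (25) $$

    $$ \mu_n \ge 0 \qquad (26) $$

    $$ \xi_n \ge 0 \qquad (27) $$

    $$ \mu_n \xi_n = 0 \qquad (28) $$

    for n = 1,...,N.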

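    Setting the derivatives of (22) with respect to w, b and $\{\xi_n\}$ to zero gives:

    $$ \frac{\partial L}{\partial w} = 0 \;\Rightarrow\; w = \sum_{n=1}^{N} a_n t_n \phi(x_n) \qquad (29) $$

    $$ \frac{\partial L}{\partial b} = 0 \;\Rightarrow\; \sum_{n=1}^{N} a_n t_n = 0 \qquad (30) $$

    $$ \frac{\partial L}{\partial \xi_n} = 0 \;\Rightarrow\; a_n = C - \mu_n \qquad (31) $$

    Substituting (29), (30) and (31) into (22) gives:

    $$ \tilde{L}(a) = \sum_{n=1}^{N} a_n - \frac{1}{2} \sum_{n=1}^{N} \sum_{m=1}^{N} a_n a_m t_n t_m k(x_n, x_m) \qquad (32) $$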

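    From (23), (26) and (31) we have $0 \le a_n \le C$, because $a_n \ge 0$ and $a_n = C - \mu_n$ with $\mu_n \ge 0$.

    The optimization problem is identical to the perfectly separable case; only the constraints differ:

    $$ 0 \le a_n \le C \qquad (33) $$

    $$ \sum_{n=1}^{N} a_n t_n = 0 \qquad (34) $$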

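    Substituting (29) into (1), we see that the prediction for a new data point has the same form as (13).

    As before, the points with $a_n = 0$ contribute nothing to the prediction of new data points.

    The remaining points form the support vectors. These points have $a_n > 0$ and, by (25), satisfy:

    $$ t_n y(x_n) = 1 - \xi_n \qquad (35) $$

    If $a_n < C$, then (31) gives $\mu_n > 0$, and (28) implies $\xi_n = 0$: these are the points that lie on the margin.

    Points with $a_n = C$ may be correctly classified points lying between the margin and the decision boundary if $\xi_n \le 1$, or misclassified points if $\xi_n > 1$.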
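    The case analysis above can be summarized as a small helper. This is only an illustrative restatement of the rules in the text; the function name, the returned descriptions and the tolerance are hypothetical.

```python
def describe_point(a_n, xi_n, C, tol=1e-9):
    """Classify a training point by its multiplier a_n and slack xi_n."""
    if a_n <= tol:
        return "not a support vector"
    if a_n < C - tol:
        # mu_n = C - a_n > 0, so (28) forces xi_n = 0.
        return "support vector on the margin"
    if xi_n <= 1.0:
        return "support vector with a_n = C, correctly classified"
    return "support vector with a_n = C, misclassified"

C = 1.0
print(describe_point(0.0, 0.0, C))
print(describe_point(0.4, 0.0, C))
print(describe_point(1.0, 0.3, C))
print(describe_point(1.0, 1.5, C))
```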

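    To determine the parameter b in (1), we use the support vectors with $0 < a_n < C$, which have $\xi_n = 0$ and therefore $t_n y(x_n) = 1$:

    $$ t_n \left( \sum_{m \in S} a_m t_m k(x_n, x_m) + b \right) = 1 \qquad (36) $$

    Once again, for numerical stability we compute b as an average:

    $$ b = \frac{1}{N_M} \sum_{n \in M} \left( t_n - \sum_{m \in S} a_m t_m k(x_n, x_m) \right) \qquad (37) $$

    where M is the set of points with $0 < a_n < C$ and $N_M$ is the number of such points.

    To solve (10) and (32) we use the Sequential Minimal Optimization (SMO) algorithm, introduced by Platt in 1999.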
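    To give a feel for how SMO proceeds, here is a minimal sketch of the commonly taught simplified variant, which picks the second multiplier at random instead of using Platt's selection heuristics. It is not Platt's full algorithm. It maximizes (32) subject to (33) and (34) by repeatedly updating a pair of multipliers; the function name, defaults and toy data are assumptions made for the example.

```python
import random

def linear_kernel(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def simplified_smo(X, T, C=1.0, tol=1e-3, max_passes=10, kernel=linear_kernel, seed=0):
    """Simplified SMO: optimize pairs (a_i, a_j) until no multiplier changes
    for max_passes consecutive sweeps. Returns (a, b)."""
    rng = random.Random(seed)
    n = len(X)
    a, b = [0.0] * n, 0.0
    K = [[kernel(X[i], X[j]) for j in range(n)] for i in range(n)]

    def y(i):
        # Current model output for training point i, as in (13).
        return sum(a[m] * T[m] * K[i][m] for m in range(n)) + b

    passes = 0
    while passes < max_passes:
        changed = 0
        for i in range(n):
            E_i = y(i) - T[i]
            # Only update a_i if it violates the KKT conditions by more than tol.
            if (T[i] * E_i < -tol and a[i] < C) or (T[i] * E_i > tol and a[i] > 0):
                j = rng.choice([m for m in range(n) if m != i])
                E_j = y(j) - T[j]
                a_i_old, a_j_old = a[i], a[j]
                # Bounds that keep 0 <= a_j <= C while preserving sum_n a_n t_n = 0.
                if T[i] != T[j]:
                    lo, hi = max(0.0, a[j] - a[i]), min(C, C + a[j] - a[i])
                else:
                    lo, hi = max(0.0, a[i] + a[j] - C), min(C, a[i] + a[j])
                if lo == hi:
                    continue
                eta = 2 * K[i][j] - K[i][i] - K[j][j]
                if eta >= 0:
                    continue
                a[j] = min(hi, max(lo, a_j_old - T[j] * (E_i - E_j) / eta))
                if abs(a[j] - a_j_old) < 1e-5:
                    a[j] = a_j_old
                    continue
                a[i] = a_i_old + T[i] * T[j] * (a_j_old - a[j])
                b1 = b - E_i - T[i] * (a[i] - a_i_old) * K[i][i] - T[j] * (a[j] - a_j_old) * K[i][j]
                b2 = b - E_j - T[i] * (a[i] - a_i_old) * K[i][j] - T[j] * (a[j] - a_j_old) * K[j][j]
                if 0 < a[i] < C:
                    b = b1
                elif 0 < a[j] < C:
                    b = b2
                else:
                    b = (b1 + b2) / 2
                changed += 1
        passes = passes + 1 if changed == 0 else 0
    return a, b

# Hypothetical two-point problem whose exact solution is a = [0.25, 0.25], b = -1.
X = [(2.0, 2.0), (0.0, 0.0)]
T = [+1, -1]
a, b = simplified_smo(X, T, C=10.0)
print(a, b)
```

    On larger data sets the random choice of the second index makes this variant slow, and its result is only approximate; library implementations use more careful working-set selection.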


    2. MultiClass SVMs:

    We now consider multiclass classification with K > 2 classes. A K-class classifier can be built by combining several two-class discriminant functions. However, this leads to some difficulties (Duda and Hart, 1973).

    In the one-versus-the-rest approach, K-1 binary classifiers are used to build the K-class classifier.

    In the one-versus-one approach, K(K-1)/2 binary classifiers are used to build the K-class classifier.

    Both approaches lead to ambiguous regions in the classification (as shown in the figure).

    We can avoid this problem by building the K-class classifier from K linear functions of the form:

    $$ y_k(x) = w_k^T x + w_{k0} $$

    A point x is assigned to class $C_k$ if $y_k(x) > y_j(x)$ for all $j \ne k$.
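    This decision rule amounts to taking the argmax over the K linear functions. In the sketch below, the weight vectors, the biases and the function name are hypothetical.

```python
def classify(x, W, w0):
    """Return (index k of the largest y_k(x) = w_k^T x + w_k0, list of all scores)."""
    scores = [sum(wi * xi for wi, xi in zip(w_k, x)) + w_k0
              for w_k, w_k0 in zip(W, w0)]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores

# Hypothetical parameters for K = 3 classes in a 2-dimensional input space.
W = [(1.0, 0.0), (0.0, 1.0), (-1.0, -1.0)]
w0 = [0.0, 0.0, 0.0]

k_hat, scores = classify((2.0, 0.5), W, w0)
print(k_hat, scores)
```

    Because every point receives exactly one largest score (ties aside, where max returns the first index), the input space is partitioned without ambiguous regions.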

    Another approach, due to Wu (2004), proposes a method of estimating probabilities for multiclass classification.

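    3. Application to text classification:

    Implementation guide:

    Feature vector of a document: a vector whose dimension equals the number of features in the whole data set, where the features are pairwise distinct. A component has value 1 if the document contains the corresponding feature, and 0 otherwise.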
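    A binary feature vector of this kind can be built as in the sketch below. It assumes that the features are lowercase whitespace-separated tokens, which is only one possible choice; the function names and the two example documents are hypothetical.

```python
def build_vocabulary(documents):
    """Map each distinct token in the data set to a vector index, in first-seen order."""
    vocab = {}
    for doc in documents:
        for token in doc.lower().split():
            vocab.setdefault(token, len(vocab))
    return vocab

def to_binary_vector(doc, vocab):
    """Component is 1 if the document contains the feature, 0 otherwise."""
    present = set(doc.lower().split())
    return [1 if token in present else 0 for token in vocab]

docs = ["the match ended in a draw", "the market ended lower"]
vocab = build_vocabulary(docs)
vectors = [to_binary_vector(d, vocab) for d in docs]
print(list(vocab))
print(vectors)
```

    Tokens of a new document that are not in the vocabulary are simply ignored, so every vector has the same dimension.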


    Implementing an SVM is fairly complex, so it is better to use an existing library such as LibSVM or SVMLight.

    The algorithm has two stages, training and classification:

    1. Training:

    Input:

    The feature vectors of the documents in the training set (an M x N matrix, where M is the number of feature vectors in the training set and N is the number of features per vector).

    The label/class of each feature vector in the training set.

    The parameters of the SVM model: C and the parameter of the kernel function (a Gaussian kernel is commonly used).

    Output: the SVM model (the support vectors, the Lagrange multipliers a, and the parameter b).

    2. Classification:

    Input:

    The feature vector of the document to classify.

    The SVM model.

    Output: the label/class of the document.
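    The classification stage can be sketched as follows. This is an illustration, not the interface of LibSVM or SVMLight: the model is represented as a plain dictionary, the Gaussian kernel is written as exp(-gamma * ||x - x'||^2) with gamma as an assumed parameter name, and the model values are invented for the example rather than produced by a real training run. The labels t_n of the support vectors are stored with the model because (13) needs them.

```python
import math

def gaussian_kernel(u, v, gamma):
    """k(x, x') = exp(-gamma * ||x - x'||^2)."""
    return math.exp(-gamma * sum((ui - vi) ** 2 for ui, vi in zip(u, v)))

def classify(x, model):
    """Return the sign of y(x) = sum_n a_n t_n k(x, x_n) + b over the support vectors."""
    score = sum(a_n * t_n * gaussian_kernel(x, sv, model["gamma"])
                for sv, a_n, t_n in zip(model["support_vectors"], model["a"], model["t"]))
    score += model["b"]
    return +1 if score > 0 else -1

# Hypothetical model over 4 binary features (values made up for illustration).
model = {
    "support_vectors": [(1, 1, 0, 0), (0, 0, 1, 1)],
    "a": [1.0, 1.0],
    "t": [+1, -1],
    "b": 0.0,
    "gamma": 0.5,
}

print(classify((1, 1, 0, 1), model))
print(classify((0, 0, 1, 0), model))
```

    A document is labeled by whichever class of support vectors it resembles more closely under the kernel, shifted by b.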

    4. References:

    [1] Christopher M. Bishop, Pattern Recognition and Machine Learning, Springer (2007).