svm2.docx
7/27/2019 SVM2.docx
Support Vector Machine
1. Support Vector Machine:
The Support Vector Machine (SVM) is a classification method based on statistical learning theory, proposed by Vapnik (1995).
For simplicity we first consider the binary classification problem, and then extend it to the multiclass problem.
Consider an example classification problem such as the one in the figure: we must find a straight line such that every point on its left is red and every point on its right is blue. A problem in which the classes are separated by a straight line in this way is called linear classification.
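The linear function that discriminates between the two classes is:
$$ y(\mathbf{x}) = \mathbf{w}^T \phi(\mathbf{x}) + b \qquad (1) $$
where:
$\mathbf{w}$ is the weight vector, i.e. the normal vector of the separating hyperplane, and $T$ denotes the transpose.
$b$ is the bias, $\phi(\mathbf{x})$ is the feature vector, and $\phi$ is the function that maps the input space to the feature space.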
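The input data set consists of N input vectors $\{\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_N\}$ with corresponding label values $\{t_1, \ldots, t_N\}$, where $t_n \in \{-1, +1\}$.
A note on terminology: "data point" and "sample" both refer to an input vector $\mathbf{x}_i$. In a 2-dimensional space the separator is a straight line, while in a higher-dimensional space it is called a hyperplane.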
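As a small illustration of equation (1), the sketch below evaluates the decision function and turns its sign into a label. It assumes the identity feature map $\phi(\mathbf{x}) = \mathbf{x}$; the names `decision_function` and `classify` and the numeric values of `w` and `b` are hypothetical and chosen only for the example.

```python
def decision_function(w, b, x, phi=lambda v: v):
    """Evaluate y(x) = w^T phi(x) + b from equation (1)."""
    features = phi(x)
    return sum(wi * fi for wi, fi in zip(w, features)) + b


def classify(w, b, x):
    """Return the predicted label: +1 if y(x) >= 0, otherwise -1."""
    return 1 if decision_function(w, b, x) >= 0 else -1


# Hypothetical 2-D separator: the vertical line x1 = 2.
w = [1.0, 0.0]
b = -2.0
print(classify(w, b, [3.0, 1.0]))   # point to the right of the line
print(classify(w, b, [0.5, -1.0]))  # point to the left of the line
```

A different feature map can be passed through the `phi` argument of `decision_function` without changing the rest of the code.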
Assume that our data set is perfectly linearly separable (every sample is assigned to its correct class) in the feature space. Then there exist parameter values $\mathbf{w}$ and $b$ in (1) that satisfy $y(\mathbf{x}_n) > 0$ for points with label $t_n = +1$ and $y(\mathbf{x}_n) < 0$ for points with $t_n = -1$, so that $t_n y(\mathbf{x}_n) > 0$ for every training data point.
SVM approaches this problem through the concept of the margin. The margin is defined as the smallest distance from the separating boundary to any data point, i.e. the distance from the separating boundary to the closest points.
In SVM, the best separating boundary is the one with the largest margin: many separating boundaries exist, rotated in different directions, and we choose the one whose margin is largest.
The distance from a data point to the separating surface is given by:
$$ \frac{|y(\mathbf{x})|}{\|\mathbf{w}\|} $$
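Since we are considering the case in which all data points are correctly classified, $t_n y(\mathbf{x}_n) > 0$ for all n. The distance from a point $\mathbf{x}_n$ to the separating surface can therefore be rewritten as:
$$ \frac{t_n y(\mathbf{x}_n)}{\|\mathbf{w}\|} = \frac{t_n (\mathbf{w}^T \phi(\mathbf{x}_n) + b)}{\|\mathbf{w}\|} \qquad (2) $$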
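The margin is the perpendicular distance to the closest data point $\mathbf{x}_n$ in the data set, and we want to find the optimal values of $\mathbf{w}$ and $b$ by maximizing this distance. The problem to solve is therefore written as:
$$ \arg\max_{\mathbf{w}, b} \left\{ \frac{1}{\|\mathbf{w}\|} \min_n \left[ t_n (\mathbf{w}^T \phi(\mathbf{x}_n) + b) \right] \right\} \qquad (3) $$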
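We can take the factor $1/\|\mathbf{w}\|$ outside the minimization over n because $\mathbf{w}$ does not depend on n. Solving this problem directly would be very complex, so we convert it into an equivalent problem that is easier to solve. Rescaling $\mathbf{w} \to \kappa\mathbf{w}$ and $b \to \kappa b$ leaves the distance from every data point to the separating surface unchanged, so we can use this freedom to set the quantity below to 1 for the point closest to the surface. This transformation does not change the nature of the problem.
$$ t_n (\mathbf{w}^T \phi(\mathbf{x}_n) + b) = 1 \qquad (4) $$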
From now on, all data points satisfy the constraint:
$$ t_n (\mathbf{w}^T \phi(\mathbf{x}_n) + b) \geq 1, \quad n = 1, \ldots, N \qquad (5) $$
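The optimization problem, which requires maximizing $\|\mathbf{w}\|^{-1}$, is converted into minimizing $\|\mathbf{w}\|^2$, and we rewrite it as:
$$ \arg\min_{\mathbf{w}, b} \frac{1}{2}\|\mathbf{w}\|^2 \qquad (6) $$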
The factor $\frac{1}{2}$ is included to simplify differentiation later.
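The rescaling argument behind (4) to (6) can be checked numerically. The sketch below assumes the identity feature map and a hypothetical, already separating pair `(w, b)`; it rescales the pair into canonical form and shows that the geometric margin is unchanged and equals $1/\|\mathbf{w}\|$ afterwards. All function names and data values are illustrative assumptions.

```python
import math


def functional_margins(w, b, points, labels):
    """Return t_n * (w^T x_n + b) for every training point."""
    return [t * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            for x, t in zip(points, labels)]


def geometric_margin(w, b, points, labels):
    """Smallest distance t_n * y(x_n) / ||w|| over the data set, as in (2) and (3)."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    return min(functional_margins(w, b, points, labels)) / norm


def to_canonical(w, b, points, labels):
    """Rescale (w, b) so the closest point satisfies t_n * y(x_n) = 1, as in (4).

    Assumes (w, b) already separates the data, so the minimum is positive.
    """
    kappa = 1.0 / min(functional_margins(w, b, points, labels))
    return [kappa * wi for wi in w], kappa * b


points = [[3.0, 0.0], [4.0, 1.0], [1.0, 0.0], [0.0, 2.0]]
labels = [1, 1, -1, -1]
w, b = [4.0, 0.0], -8.0  # separates the data, but is not in canonical form

w_c, b_c = to_canonical(w, b, points, labels)
margin_before = geometric_margin(w, b, points, labels)
margin_after = geometric_margin(w_c, b_c, points, labels)
norm_c = math.sqrt(sum(wi * wi for wi in w_c))
print(margin_before, margin_after, 1.0 / norm_c)
```

Because the margin equals $1/\|\mathbf{w}\|$ in canonical form, maximizing it is the same as minimizing $\|\mathbf{w}\|^2$ under constraint (5).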
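Lagrange multiplier theory:
The problem of maximizing a function $f(\mathbf{x})$ subject to the condition $g(\mathbf{x}) \geq 0$ is rewritten as the optimization of the Lagrangian function:
$$ L(\mathbf{x}, \lambda) = f(\mathbf{x}) + \lambda g(\mathbf{x}) $$
where $\mathbf{x}$ and $\lambda$ must satisfy the Karush-Kuhn-Tucker (KKT) conditions:
$$ g(\mathbf{x}) \geq 0, \qquad \lambda \geq 0 $$
$$ \lambda g(\mathbf{x}) = 0 $$
If the goal is instead to minimize $f(\mathbf{x})$, the Lagrangian function is
$$ L(\mathbf{x}, \lambda) = f(\mathbf{x}) - \lambda g(\mathbf{x}) $$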
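A one-dimensional worked example may make the KKT conditions concrete. The problem below (minimize $x^2$ subject to $x - 1 \geq 0$) is a hypothetical illustration that is not part of the original notes; it uses the minimization form of the Lagrangian.

```latex
\begin{align*}
&\text{minimize } f(x) = x^2 \quad \text{subject to } g(x) = x - 1 \geq 0 \\
&L(x, \lambda) = x^2 - \lambda (x - 1) \\
&\frac{\partial L}{\partial x} = 2x - \lambda = 0 \;\Rightarrow\; \lambda = 2x \\
&\text{KKT: } x - 1 \geq 0, \quad \lambda \geq 0, \quad \lambda (x - 1) = 0 \\
&\lambda = 0 \;\Rightarrow\; x = 0, \text{ which violates } x - 1 \geq 0 \\
&\lambda > 0 \;\Rightarrow\; x = 1, \; \lambda = 2, \; f(x) = 1
\end{align*}
```

The constraint is active at the solution ($g(x) = 0$ with $\lambda > 0$), which is the same situation the support vectors are in later on.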
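To solve the minimization problem (6) under the constraints (5), we rewrite it with the Lagrangian function:
$$ L(\mathbf{w}, b, \mathbf{a}) = \frac{1}{2}\|\mathbf{w}\|^2 - \sum_{n=1}^{N} a_n \left\{ t_n (\mathbf{w}^T \phi(\mathbf{x}_n) + b) - 1 \right\} \qquad (7) $$
where $\mathbf{a} = (a_1, \ldots, a_N)^T$ are the Lagrange multipliers.
Note the minus sign in the Lagrangian: we minimize with respect to $\mathbf{w}$ and $b$, and maximize with respect to $\mathbf{a}$.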
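Setting the derivatives of $L(\mathbf{w}, b, \mathbf{a})$ with respect to $\mathbf{w}$ and $b$ to zero gives:
$$ \mathbf{w} = \sum_{n=1}^{N} a_n t_n \phi(\mathbf{x}_n) \qquad (8) $$
$$ 0 = \sum_{n=1}^{N} a_n t_n \qquad (9) $$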
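We eliminate $\mathbf{w}$ and $b$ from $L(\mathbf{w}, b, \mathbf{a})$ by substituting (8) and (9). This leads to the optimization problem of maximizing:
$$ \tilde{L}(\mathbf{a}) = \sum_{n=1}^{N} a_n - \frac{1}{2} \sum_{n=1}^{N} \sum_{m=1}^{N} a_n a_m t_n t_m k(\mathbf{x}_n, \mathbf{x}_m) \qquad (10) $$
subject to the constraints:
$$ a_n \geq 0, \quad n = 1, \ldots, N \qquad (11) $$
$$ \sum_{n=1}^{N} a_n t_n = 0 \qquad (12) $$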
Here the kernel function is defined as $k(\mathbf{x}, \mathbf{x}') = \phi(\mathbf{x})^T \phi(\mathbf{x}')$.
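The elimination of $\mathbf{w}$ and $b$ can be written out term by term. The following is a sketch of the algebra, using only (7), (8), (9) and the kernel definition above.

```latex
\begin{align*}
\tfrac{1}{2}\|\mathbf{w}\|^2
  &= \tfrac{1}{2}\sum_{n=1}^{N}\sum_{m=1}^{N} a_n a_m t_n t_m\,
     \phi(\mathbf{x}_n)^T \phi(\mathbf{x}_m)
  && \text{by (8)} \\
\sum_{n=1}^{N} a_n t_n\, \mathbf{w}^T \phi(\mathbf{x}_n)
  &= \sum_{n=1}^{N}\sum_{m=1}^{N} a_n a_m t_n t_m\,
     \phi(\mathbf{x}_n)^T \phi(\mathbf{x}_m)
  && \text{by (8)} \\
b \sum_{n=1}^{N} a_n t_n &= 0
  && \text{by (9)} \\
L(\mathbf{w}, b, \mathbf{a})
  &= \tfrac{1}{2}\|\mathbf{w}\|^2
     - \sum_{n=1}^{N} a_n t_n\, \mathbf{w}^T \phi(\mathbf{x}_n)
     - b \sum_{n=1}^{N} a_n t_n
     + \sum_{n=1}^{N} a_n \\
  &= \sum_{n=1}^{N} a_n
     - \tfrac{1}{2}\sum_{n=1}^{N}\sum_{m=1}^{N} a_n a_m t_n t_m\,
       k(\mathbf{x}_n, \mathbf{x}_m)
\end{align*}
```

The last line is exactly (10): the feature map appears only through inner products, which is why it can be replaced by a kernel.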
We set this problem aside for now; the technique for solving (10) subject to (11) and (12) is discussed later.
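To classify a new data point with the trained model, we compute the sign of $y(\mathbf{x})$ from (1), substituting $\mathbf{w}$ from (8):
$$ y(\mathbf{x}) = \sum_{n=1}^{N} a_n t_n k(\mathbf{x}, \mathbf{x}_n) + b \qquad (13) $$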
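The following KKT conditions hold:
$$ a_n \geq 0 \qquad (14) $$
$$ t_n y(\mathbf{x}_n) - 1 \geq 0 \qquad (15) $$
$$ a_n \left\{ t_n y(\mathbf{x}_n) - 1 \right\} = 0 \qquad (16) $$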
Thus for every data point, either $a_n = 0$ or $t_n y(\mathbf{x}_n) = 1$. Data points with $a_n = 0$ do not appear in (13) and therefore play no role in predicting new data points.
The remaining data points ($a_n > 0$) are called support vectors. They satisfy $t_n y(\mathbf{x}_n) = 1$, so they are the points lying on the margin of the hyperplane in the feature space.
The support vectors are what matters in SVM training: the classification of a new data point depends only on the support vectors.
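The prediction rule (13) can be sketched in a few lines. The example below assumes a linear kernel and a hypothetical, hand-solved model for two training points, (3, 0) with label +1 and (1, 0) with label -1, whose multipliers are both 0.5 and whose bias is -2. The function names are illustrative, not taken from any library.

```python
def linear_kernel(x, z):
    """k(x, z) = x^T z, i.e. the identity feature map phi(x) = x."""
    return sum(xi * zi for xi, zi in zip(x, z))


def decision_value(x, support_vectors, labels, multipliers, b,
                   kernel=linear_kernel):
    """Evaluate (13): y(x) = sum_n a_n t_n k(x, x_n) + b over the support vectors."""
    return sum(a * t * kernel(x, sv)
               for sv, t, a in zip(support_vectors, labels, multipliers)) + b


# Hypothetical trained model (assumed values, solved by hand for this toy set).
support_vectors = [[3.0, 0.0], [1.0, 0.0]]
labels = [1, -1]
multipliers = [0.5, 0.5]
b = -2.0

for x in ([5.0, 2.0], [0.0, -1.0]):
    y = decision_value(x, support_vectors, labels, multipliers, b)
    print(x, y, 1 if y >= 0 else -1)
```

Only the support vectors are stored in the model; training points with $a_n = 0$ would contribute nothing to the sum.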
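Suppose we have solved problem (10) and found the values of the multipliers $\mathbf{a}$. We now need to determine the parameter $b$ from the support vectors $\mathbf{x}_n$, which satisfy $t_n y(\mathbf{x}_n) = 1$. Substituting (13) gives:
$$ t_n \left( \sum_{m \in S} a_m t_m k(\mathbf{x}_n, \mathbf{x}_m) + b \right) = 1 \qquad (17) $$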
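where $S$ is the set of support vectors. Although substituting a single support vector $\mathbf{x}_n$ is enough to find $b$, for numerical stability we compute $b$ as an average over all support vectors.
First we multiply (17) by $t_n$ (noting that $t_n^2 = 1$), and the value of $b$ is then:
$$ b = \frac{1}{N_S} \sum_{n \in S} \left( t_n - \sum_{m \in S} a_m t_m k(\mathbf{x}_n, \mathbf{x}_m) \right) \qquad (18) $$
where $N_S$ is the total number of support vectors.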
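Equation (18) translates directly into code. The sketch below assumes a linear kernel and reuses a hypothetical hand-solved toy model (two support vectors with multipliers 0.5); the function name `bias_from_support_vectors` is an illustrative choice.

```python
def linear_kernel(x, z):
    """k(x, z) = x^T z."""
    return sum(xi * zi for xi, zi in zip(x, z))


def bias_from_support_vectors(support_vectors, labels, multipliers,
                              kernel=linear_kernel):
    """Average t_n - sum_m a_m t_m k(x_n, x_m) over all support vectors, as in (18)."""
    total = 0.0
    for x_n, t_n in zip(support_vectors, labels):
        weighted_sum = sum(a_m * t_m * kernel(x_n, x_m)
                           for x_m, t_m, a_m
                           in zip(support_vectors, labels, multipliers))
        total += t_n - weighted_sum
    return total / len(support_vectors)


# Hypothetical solution of (10) for the points (3, 0) -> +1 and (1, 0) -> -1.
support_vectors = [[3.0, 0.0], [1.0, 0.0]]
labels = [1, -1]
multipliers = [0.5, 0.5]

b = bias_from_support_vectors(support_vectors, labels, multipliers)
print(b)
```

In this toy case both support vectors give the same value, so the average changes nothing; with real, numerically solved multipliers the individual values differ slightly and averaging smooths them out.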
To simplify the presentation of the algorithm, we initially assumed that the data points are perfectly separable in the feature space $\phi(\mathbf{x})$. Perfect separation, however, can lead to poor generalization: in practice some samples may have been mislabeled during data collection, and forcing perfect separation would make the model overfit.
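To counter overfitting, we allow a few points to be misclassified.
To do this, we introduce a slack variable $\xi_n \geq 0$ for each data point.
$\xi_n = 0$ for points that lie on the margin or on the correct side of it, and $\xi_n = |t_n - y(\mathbf{x}_n)|$ for the remaining points. Points that lie on the separating boundary $y(\mathbf{x}_n) = 0$ therefore have $\xi_n = 1$, and misclassified points have $\xi_n > 1$.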
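The slack definition can be checked on a few values. In the sketch below, `slack` is a hypothetical helper that takes a label `t` and a decision value `y`; since $t_n \in \{-1, +1\}$, the definition above coincides with $\max(0, 1 - t_n y(\mathbf{x}_n))$, which the last line verifies.

```python
def slack(t, y):
    """xi = 0 when t * y >= 1, otherwise |t - y|."""
    return 0.0 if t * y >= 1 else abs(t - y)


# (label, decision value) pairs: outside the margin, on the margin,
# inside the margin, on the separating boundary, and misclassified.
examples = [(1, 2.0), (1, 1.0), (1, 0.4), (1, 0.0), (-1, 0.5)]

for t, y in examples:
    print(t, y, slack(t, y))

# The same values written as max(0, 1 - t * y).
same = all(abs(slack(t, y) - max(0.0, 1 - t * y)) < 1e-12 for t, y in examples)
print(same)
```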
Constraint (5) is rewritten as:
$$ t_n y(\mathbf{x}_n) \geq 1 - \xi_n, \quad n = 1, \ldots, N \qquad (20) $$
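Our goal is now to maximize the margin while remaining flexible toward points that are misclassified. The quantity to minimize becomes:
$$ C \sum_{n=1}^{N} \xi_n + \frac{1}{2}\|\mathbf{w}\|^2 \qquad (21) $$
where $C > 0$ controls the trade-off between the slack-variable penalty and the margin.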
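We now need to minimize (21) subject to the constraints (20) and $\xi_n \geq 0$. The corresponding Lagrangian is:
$$ L(\mathbf{w}, b, \boldsymbol{\xi}, \mathbf{a}, \boldsymbol{\mu}) = \frac{1}{2}\|\mathbf{w}\|^2 + C \sum_{n=1}^{N} \xi_n - \sum_{n=1}^{N} a_n \left\{ t_n y(\mathbf{x}_n) - 1 + \xi_n \right\} - \sum_{n=1}^{N} \mu_n \xi_n \qquad (22) $$
where $\{a_n \geq 0\}$ and $\{\mu_n \geq 0\}$ are the Lagrange multipliers.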
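The KKT conditions that must hold are:
$$ a_n \geq 0 \qquad (23) $$
$$ t_n y(\mathbf{x}_n) - 1 + \xi_n \geq 0 \qquad (24) $$
$$ a_n \left( t_n y(\mathbf{x}_n) - 1 + \xi_n \right) = 0 \qquad (25) $$
$$ \mu_n \geq 0 \qquad (26) $$
$$ \xi_n \geq 0 \qquad (27) $$
$$ \mu_n \xi_n = 0 \qquad (28) $$
for $n = 1, \ldots, N$.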
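Setting the derivatives of (22) with respect to $\mathbf{w}$, $b$ and $\{\xi_n\}$ to zero gives:
$$ \frac{\partial L}{\partial \mathbf{w}} = 0 \;\Rightarrow\; \mathbf{w} = \sum_{n=1}^{N} a_n t_n \phi(\mathbf{x}_n) \qquad (29) $$
$$ \frac{\partial L}{\partial b} = 0 \;\Rightarrow\; \sum_{n=1}^{N} a_n t_n = 0 \qquad (30) $$
$$ \frac{\partial L}{\partial \xi_n} = 0 \;\Rightarrow\; a_n = C - \mu_n \qquad (31) $$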
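Substituting (29), (30) and (31) into (22) gives:
$$ \tilde{L}(\mathbf{a}) = \sum_{n=1}^{N} a_n - \frac{1}{2} \sum_{n=1}^{N} \sum_{m=1}^{N} a_n a_m t_n t_m k(\mathbf{x}_n, \mathbf{x}_m) \qquad (32) $$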
From (23), (26) and (31), i.e. $a_n \geq 0$, $\mu_n \geq 0$ and $a_n = C - \mu_n$, we have $a_n \leq C$.
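The function to optimize is identical to the one in the perfectly separable case; only the constraints differ:
$$ 0 \leq a_n \leq C \qquad (33) $$
$$ \sum_{n=1}^{N} a_n t_n = 0 \qquad (34) $$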
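Substituting (29) into (1), we see that the prediction for a new data point has the same form as (13).
As before, the points with $a_n = 0$ contribute nothing to the prediction of new data points.
The remaining points form the support vectors. They have $a_n > 0$ and, by (25), satisfy:
$$ t_n y(\mathbf{x}_n) = 1 - \xi_n \qquad (35) $$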
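If $a_n < C$, then (31) gives $\mu_n > 0$, and (28) implies $\xi_n = 0$, so these are the points lying on the margin.
Points with $a_n = C$ may be correctly classified points lying between the margin and the separating boundary if $\xi_n \leq 1$, or misclassified points if $\xi_n > 1$.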
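The case analysis on $a_n$ can be summarized as a small lookup. The sketch below is a hypothetical helper, not part of any SVM library: it takes a multiplier `a_n`, the slack `xi_n` and the constant `C`, and uses a tolerance `tol` (an assumption, needed because numerically solved multipliers are rarely exactly 0 or C).

```python
def describe_point(a_n, xi_n, C, tol=1e-9):
    """Classify a training point by its multiplier, following (31), (28) and (35)."""
    if a_n <= tol:
        return "not a support vector"
    if a_n < C - tol:
        # mu_n = C - a_n > 0 forces xi_n = 0: the point sits on the margin.
        return "support vector on the margin"
    if xi_n <= 1:
        return "support vector with a_n = C, correctly classified"
    return "support vector with a_n = C, misclassified"


C = 1.0
for a_n, xi_n in [(0.0, 0.0), (0.3, 0.0), (1.0, 0.4), (1.0, 1.7)]:
    print(a_n, xi_n, describe_point(a_n, xi_n, C))
```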
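To determine the parameter $b$ in (1), we use the support vectors with $0 < a_n < C$, for which $\xi_n = 0$ and hence $t_n y(\mathbf{x}_n) = 1$:
$$ t_n \left( \sum_{m \in S} a_m t_m k(\mathbf{x}_n, \mathbf{x}_m) + b \right) = 1 \qquad (36) $$
Again, for numerical stability we compute $b$ as an average:
$$ b = \frac{1}{N_M} \sum_{n \in M} \left( t_n - \sum_{m \in S} a_m t_m k(\mathbf{x}_n, \mathbf{x}_m) \right) \qquad (37) $$
where $M$ is the set of points with $0 < a_n < C$ and $N_M$ is the number of such points.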
To solve (10) and (32) we use the Sequential Minimal Optimization (SMO) algorithm, introduced by Platt in 1999.
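The core of SMO is an analytic update of two multipliers at a time, which keeps $\sum_n a_n t_n = 0$ and $0 \leq a_n \leq C$ satisfied. The sketch below shows a single such pair update only, under stated assumptions: a linear kernel, a fixed bias `b`, and a caller that picks the pair `(i, j)`. A full SMO implementation also needs pair-selection heuristics, updates of `b`, and repeated passes until the KKT conditions hold; none of that is shown here, and the function name is hypothetical.

```python
def linear_kernel(x, z):
    """k(x, z) = x^T z."""
    return sum(xi * zi for xi, zi in zip(x, z))


def smo_pair_update(i, j, a, b, points, labels, C, kernel=linear_kernel):
    """Jointly optimise a[i] and a[j]; return the updated list of multipliers."""

    def output(x):
        # Current decision value, as in (13).
        return sum(a_n * t_n * kernel(x, x_n)
                   for x_n, t_n, a_n in zip(points, labels, a)) + b

    t_i, t_j = labels[i], labels[j]
    e_i = output(points[i]) - t_i  # prediction error on point i
    e_j = output(points[j]) - t_j  # prediction error on point j

    # Feasible interval for the new a[j] implied by (33) and (34).
    if t_i != t_j:
        low, high = max(0.0, a[j] - a[i]), min(C, C + a[j] - a[i])
    else:
        low, high = max(0.0, a[i] + a[j] - C), min(C, a[i] + a[j])

    eta = (kernel(points[i], points[i]) + kernel(points[j], points[j])
           - 2.0 * kernel(points[i], points[j]))
    if eta <= 0 or low >= high:
        return list(a)  # degenerate pair: skipped in this sketch

    a_j = min(high, max(low, a[j] + t_j * (e_i - e_j) / eta))
    a_i = a[i] + t_i * t_j * (a[j] - a_j)

    updated = list(a)
    updated[i], updated[j] = a_i, a_j
    return updated


points = [[3.0, 0.0], [1.0, 0.0]]
labels = [1, -1]
a = smo_pair_update(0, 1, [0.0, 0.0], 0.0, points, labels, C=10.0)
print(a)
```

For this two-point toy set a single update already reaches the optimum of (32); on realistic data many such updates over different pairs are needed.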
2. MultiClass SVMs:
We now consider the multiclass case, K > 2. A K-class classifier can be built by combining several two-class separators. However, this leads to some difficulties (Duda and Hart, 1973).
In the one-versus-the-rest approach, K-1 binary classifiers are used to build the K-class classifier.
In the one-versus-one approach, K(K-1)/2 binary classifiers are used to build the K-class classifier.
Both approaches lead to ambiguous regions in the classification (as in the figure).
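We can avoid this problem by building the K-class classifier from K linear functions of the form:
$$ y_k(\mathbf{x}) = \mathbf{w}_k^T \mathbf{x} + w_{k0} $$
A point $\mathbf{x}$ is then assigned to class $C_k$ if $y_k(\mathbf{x}) > y_j(\mathbf{x})$ for all $j \neq k$.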
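The assignment rule above amounts to taking the class with the largest score. The sketch below uses three hypothetical linear functions in two dimensions; the weights, biases and the function name `assign_class` are assumptions made for the example.

```python
def assign_class(x, weights, biases):
    """Return the index k that maximises y_k(x) = w_k^T x + w_k0."""
    scores = [sum(wi * xi for wi, xi in zip(w_k, x)) + w_k0
              for w_k, w_k0 in zip(weights, biases)]
    return max(range(len(scores)), key=scores.__getitem__)


# Hypothetical parameters for K = 3 classes.
weights = [[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0]]
biases = [0.0, 0.0, -0.5]

for x in ([2.0, 0.5], [-1.0, 0.0], [0.2, 3.0]):
    print(x, assign_class(x, weights, biases))
```

Because every point receives exactly one maximal score (ties aside), this rule leaves no ambiguous region.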
Another approach, proposed by Wu (2004), is a method for estimating probabilities in multiclass classification.
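3. Application to text classification:
Implementation guide:
Feature vector of a document: a vector whose dimension is the number of features in the whole data set, where the features are pairwise distinct. A component has value 1 if the document contains the corresponding feature, and 0 otherwise.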
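A minimal sketch of this binary encoding follows. It assumes that features are lowercase whitespace-separated tokens, which the notes do not specify; the two example documents and both function names are hypothetical.

```python
def build_vocabulary(documents):
    """Collect the distinct tokens of the whole data set, in sorted order."""
    return sorted({token for doc in documents for token in doc.lower().split()})


def to_feature_vector(document, vocabulary):
    """Return a 0/1 vector: 1 where the document contains the feature."""
    tokens = set(document.lower().split())
    return [1 if term in tokens else 0 for term in vocabulary]


documents = ["the match ended in a draw", "the market ended lower"]
vocabulary = build_vocabulary(documents)
vectors = [to_feature_vector(doc, vocabulary) for doc in documents]

print(vocabulary)
for vector in vectors:
    print(vector)
```

Each row of `vectors` is one document, so stacking them gives the feature matrix used as training input.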
Implementing an SVM is fairly complex, so it is advisable to use existing libraries available online, such as LibSVM or SVMLight.
The algorithm consists of two phases, training and classification:
1. Training:
Input:
The feature vectors of the documents in the training set (an M x N matrix, where M is the number of feature vectors in the training set and N is the number of features per vector).
The label/class of each feature vector in the training set.
The parameters of the SVM model: C and the kernel parameter (the Gaussian kernel is commonly used).
Output: the SVM model (the support vectors, the Lagrange multipliers a, and the parameter b).
2. Classification:
Input:
The feature vector of the document to classify.
The SVM model.
Output: the label/class of the document.
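The classification phase can be sketched with a Gaussian kernel as follows. This is an illustration under assumptions, not the interface of LibSVM or SVMLight: the kernel is taken in the form $k(\mathbf{x}, \mathbf{z}) = \exp(-\gamma \|\mathbf{x} - \mathbf{z}\|^2)$ with a parameter named `gamma`, and the `model` dictionary holds made-up values rather than the result of an actual training run.

```python
import math


def gaussian_kernel(x, z, gamma):
    """k(x, z) = exp(-gamma * ||x - z||^2)."""
    squared_distance = sum((xi - zi) ** 2 for xi, zi in zip(x, z))
    return math.exp(-gamma * squared_distance)


def classify(x, model):
    """Classification phase: sign of sum_n a_n t_n k(x, x_n) + b."""
    y = sum(a * t * gaussian_kernel(x, sv, model["gamma"])
            for sv, t, a in zip(model["support_vectors"],
                                model["labels"],
                                model["multipliers"])) + model["b"]
    return 1 if y >= 0 else -1


# Hypothetical model, standing in for the output of the training phase.
model = {
    "support_vectors": [[0.0, 0.0], [2.0, 0.0]],
    "labels": [-1, 1],
    "multipliers": [1.0, 1.0],
    "b": 0.0,
    "gamma": 0.5,
}

print(classify([2.0, 0.1], model))
print(classify([0.1, 0.0], model))
```

The model contains exactly the three outputs listed for the training phase (support vectors, multipliers and b) plus the kernel parameter, which is all the classification phase needs.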
4. References:
[1] Christopher M. Bishop, Pattern Recognition and Machine Learning, Springer (2007).