Categorical Data Analysis - STK654 (Final Exam Material)
Dr. Kusman Sadik, M.Si
Master's Program in Applied Statistics
Department of Statistics, IPB, Odd Semester 2019/2020
IPB University, Bogor, Indonesia: Inspiring Innovation with Integrity
Multinomial Logit Regression Model (Multicategory Nominal Response Variable)
We discussed logistic regression models for a binary
outcome; that is, an outcome variable that consists of
two categories.
We extend our discussion of logistic regression to
multicategory outcomes, or outcome variables with
several categories.
The multicategory logistic model can still accommodate
several predictor (or explanatory) variables, and these
can be continuous, categorical, or both.
Response Variable Y (Multicategory)
Nominal scale (multinomial logistic regression)
Ordinal scale (ordinal logistic regression, cumulative logistic regression)
Suppose Y has J categories, j = 1, 2, ..., J.
$P(Y = j) = \pi_j$
$\pi_1 + \pi_2 + \dots + \pi_J = 1$
With the last category (J) as the reference, the baseline-category logit is $\ln(\pi_j / \pi_J)$, j = 1, ..., J - 1.
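Written out for a single explanatory variable x, the baseline-category logit model and the response probabilities it implies take the following standard form (as in Agresti, 2002). The symbols $\alpha_j$ and $\beta_j$ are notation assumed here for the category-specific intercepts and slopes:

```latex
\begin{align*}
\ln\frac{\pi_j(x)}{\pi_J(x)} &= \alpha_j + \beta_j x, \qquad j = 1, \dots, J-1, \\
\pi_j(x) &= \frac{\exp(\alpha_j + \beta_j x)}{1 + \sum_{h=1}^{J-1} \exp(\alpha_h + \beta_h x)}, \qquad j = 1, \dots, J-1, \\
\pi_J(x) &= \frac{1}{1 + \sum_{h=1}^{J-1} \exp(\alpha_h + \beta_h x)}.
\end{align*}
```

In the R examples the reference is the first factor level, chosen with relevel(), rather than the last category; the model is the same up to relabelling of the categories.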
# Multinomial logistic model, GSS data (Azen, Section 10.1)
# Requires package "nnet" for multinom()
library("nnet")
dataku <- read.csv(file = "GSS-2006-DEGREE-AGEWED.csv", header = TRUE)
# Convert to a factor explicitly: since R 4.0.0, read.csv() no longer
# turns strings into factors by default, and relevel() needs a factor
degree <- factor(dataku$degree)
agewed <- dataku$agewed
# Reference category
degree <- relevel(degree, ref = "LT HIGH SCHOOL")
data.frame(degree, agewed)
table(degree)
model <- multinom(degree ~ agewed)
summary(model)
degree
LT HIGH SCHOOL       BACHELOR       GRADUATE    HIGH SCHOOL JUNIOR COLLEGE
           195            185            104            590             86
Call: multinom(formula = degree ~ agewed)
Coefficients:
               (Intercept)     agewed
BACHELOR        -1.4390298 0.05931128
GRADUATE        -2.2159390 0.06740282
HIGH SCHOOL      0.7576068 0.01542479
JUNIOR COLLEGE  -1.7591391 0.04082775
Std. Errors:
               (Intercept)     agewed
BACHELOR         0.4335079 0.01817825
GRADUATE         0.4778828 0.01961857
HIGH SCHOOL      0.3821854 0.01654322
JUNIOR COLLEGE   0.5416678 0.02267685
Residual Deviance: 3098.768
AIC: 3114.768
SAS output: compare with the R output
Parameter Interpretation and Testing
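The summary() output of multinom() reports estimates and standard errors but no test statistics. A minimal sketch of one common approach, the Wald test, applied to the agewed slopes from the GSS output above. The numbers are copied from that output; the object names (est, se, z, p_value, odds_ratio) are illustrative, and the lecture may rely on a different test:

```r
# Estimated agewed slopes and standard errors, copied from the GSS output
# (reference category: LT HIGH SCHOOL).
est <- c(BACHELOR = 0.05931128, GRADUATE = 0.06740282,
         "HIGH SCHOOL" = 0.01542479, "JUNIOR COLLEGE" = 0.04082775)
se  <- c(BACHELOR = 0.01817825, GRADUATE = 0.01961857,
         "HIGH SCHOOL" = 0.01654322, "JUNIOR COLLEGE" = 0.02267685)

# Wald statistic and two-sided p-value under H0: slope = 0
z <- est / se
p_value <- 2 * pnorm(-abs(z))

# Odds ratio per one-unit increase in agewed, relative to LT HIGH SCHOOL
odds_ratio <- exp(est)

print(round(cbind(estimate = est, z = z, p_value = p_value,
                  odds_ratio = odds_ratio), 4))
```

Each odds ratio is read against the reference category: exp(0.0593) is about 1.06, so a one-unit increase in agewed multiplies the estimated odds of BACHELOR rather than LT HIGH SCHOOL by roughly 1.06.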
# Multinomial logistic model, Agresti Section 7.1.2
# Requires package "nnet" for multinom()
library("nnet")
dataku <- read.csv(file = "Data-Agresti-7.1.2.csv")
lake <- factor(dataku[, 1])
gend <- factor(dataku[, 2])
size <- factor(dataku[, 3])
food <- factor(dataku[, 4])
# Reference categories
lake <- relevel(lake, ref = "4Geo")
size <- relevel(size, ref = "2")
food <- relevel(food, ref = "1Fish")
data.frame(lake, gend, size, food)
model <- multinom(food ~ lake + size)
summary(model)
    lake gend size  food
1   1Han    M    1 1Fish
2   1Han    M    1 1Fish
3   1Han    M    1 1Fish
4   1Han    M    1 1Fish
5   1Han    M    1 1Fish
6   1Han    M    1 1Fish
7   1Han    M    1 1Fish
8   1Han    M    2 1Fish
9   1Han    M    2 1Fish
10  1Han    M    2 1Fish
11  1Han    M    2 1Fish
...
218 4Geo    F    1 5Othe
219 4Geo    F    2 5Othe
Call:
multinom(formula = food ~ lake + size)
Coefficients:
      (Intercept)   lake1Han     lake2Okl lake3Tra      size1
2Inve   -1.549021 -1.6581178  0.937237973 1.122002  1.4581457
3Rept   -3.314512  1.2428408  2.458913302 2.935262 -0.3512702
4Bird   -2.093358  0.6954256 -0.652622721 1.088098 -0.6306329
5Othe   -1.904343  0.8263115  0.005792737 1.516461  0.3315514
Std. Errors:
      (Intercept)  lake1Han  lake2Okl  lake3Tra     size1
2Inve   0.4249185 0.6128466 0.4719035 0.4905122 0.3959418
3Rept   1.0530577 1.1854031 1.1181000 1.1163844 0.5800207
4Bird   0.6622972 0.7813123 1.2020025 0.8417085 0.6424863
5Othe   0.5258313 0.5575446 0.7765655 0.6214371 0.4482504
Residual Deviance: 540.0803
AIC: 580.0803
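The fitted logits can be converted into estimated response probabilities by exponentiating each linear predictor and normalising, with the reference category (1Fish) contributing exp(0) = 1. A minimal sketch using the coefficients printed above; the helper predict_food() and its arguments are hypothetical names introduced for illustration:

```r
# Coefficients copied from the multinom output above
# (rows: food type versus 1Fish; reference lake = 4Geo, reference size = 2).
coefs <- matrix(
  c(-1.549021, -1.6581178,  0.937237973, 1.122002,  1.4581457,
    -3.314512,  1.2428408,  2.458913302, 2.935262, -0.3512702,
    -2.093358,  0.6954256, -0.652622721, 1.088098, -0.6306329,
    -1.904343,  0.8263115,  0.005792737, 1.516461,  0.3315514),
  nrow = 4, byrow = TRUE,
  dimnames = list(c("2Inve", "3Rept", "4Bird", "5Othe"),
                  c("(Intercept)", "lake1Han", "lake2Okl", "lake3Tra", "size1"))
)

# Hypothetical helper: estimated food-choice probabilities for one lake/size.
predict_food <- function(lake = c("4Geo", "1Han", "2Okl", "3Tra"),
                         size = c("2", "1")) {
  lake <- match.arg(lake)
  size <- match.arg(size)
  # Design vector: intercept plus dummy indicators for non-reference levels
  x <- c("(Intercept)" = 1,
         lake1Han = as.numeric(lake == "1Han"),
         lake2Okl = as.numeric(lake == "2Okl"),
         lake3Tra = as.numeric(lake == "3Tra"),
         size1    = as.numeric(size == "1"))
  eta <- drop(coefs %*% x[colnames(coefs)])  # logits versus 1Fish
  expeta <- c("1Fish" = 1, exp(eta))         # exp(0) = 1 for the reference
  expeta / sum(expeta)
}

print(round(predict_food(lake = "1Han", size = "1"), 3))
```

With a fitted model object available, predict(model, type = "probs") should give the same probabilities without copying coefficients by hand.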
1. Use R for the Alligator Food Choice data (Agresti, Section 7.1.2).
a. Fit a multinomial logistic regression model to these data, with the
alligator's primary food type as the response variable and Lake (L) and
Size (S) as the explanatory variables. Compare the results with
Agresti's book and interpret each parameter estimate of the model.
b. Fit the model as in (a), but with Lake (L), Size (S), and Gender (G)
as the explanatory variables. Which variables (L, S, G) have a
significant effect? Use the deviance test with α = 0.05.
c. Determine the best model with Lake (L), Size (S), and Gender (G) and
all of their interactions (L*S, L*G, S*G, and L*S*G) as explanatory
variables. Use the deviance test with α = 0.05.
2. Agresti (Problems 7.1, p. 302).
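The deviance (likelihood-ratio) test requested in 1(b) and 1(c) compares two nested models: the drop in residual deviance is referred to a chi-squared distribution whose degrees of freedom equal the number of extra parameters. A minimal sketch on simulated data (not the alligator data), assuming the nnet package is installed; the variables x1, x2, y and the helper deviance_test() are hypothetical:

```r
library(nnet)

# Simulated data for illustration only: a 3-category response,
# one informative predictor (x1) and one pure-noise predictor (x2).
set.seed(1)
n  <- 300
x1 <- factor(sample(c("a", "b"), n, replace = TRUE))
x2 <- factor(sample(c("u", "v"), n, replace = TRUE))
p_a <- c(0.6, 0.3, 0.1)   # response probabilities when x1 == "a"
p_b <- c(0.2, 0.3, 0.5)   # response probabilities when x1 == "b"
y <- factor(ifelse(x1 == "a",
                   sample(1:3, n, replace = TRUE, prob = p_a),
                   sample(1:3, n, replace = TRUE, prob = p_b)))

# Hypothetical helper: deviance test of a reduced model against a full model.
deviance_test <- function(reduced, full) {
  stat <- reduced$deviance - full$deviance            # drop in deviance
  df   <- length(coef(full)) - length(coef(reduced))  # extra parameters
  c(statistic = stat, df = df,
    p_value = pchisq(stat, df = df, lower.tail = FALSE))
}

fit_null <- multinom(y ~ 1,       trace = FALSE)
fit_x1   <- multinom(y ~ x1,      trace = FALSE)
fit_full <- multinom(y ~ x1 + x2, trace = FALSE)

print(deviance_test(fit_null, fit_x1))   # does x1 matter?
print(deviance_test(fit_x1, fit_full))   # does x2 add anything beyond x1?
```

For the alligator data the same comparison would be made between, for example, multinom(food ~ lake + size) and multinom(food ~ lake + size + gend); a term is judged significant at α = 0.05 when the p-value falls below 0.05. Calling anova() on the two fitted multinom objects reports the same comparison.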
References
1. Azen, R. and Walker, C.M. (2011). Categorical Data Analysis for the
Behavioral and Social Sciences. New York: Routledge, Taylor and Francis
Group.
2. Agresti, A. (2002). Categorical Data Analysis, 2nd ed. New York: Wiley.
3. Other relevant references.
Available for download at
kusmansadik.wordpress.com
Thank You