Logistic Regression Model
Code(1)
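# Point rJava (required by the xlsx package) at the local JDK
Sys.setenv(JAVA_HOME='C:\\Program Files\\Java\\Jdk1.7.0_79')
library(xlsx)
# Read the first sheet of the workbook
mower.data <- read.xlsx("c:/ian_R/mower.xlsx", 1)
head(mower.data)
# Fit a logistic regression of owner on all remaining columns
mower.logit <- glm(owner ~ ., family = binomial, data = mower.data)
summary(mower.logit)
# Goodness-of-fit p-value: residual deviance 15.323 on 21 degrees of freedom
1 - pchisq(15.323, 21)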
Result(1)
Call:
glm(formula = owner ~ ., family = binomial, data = mower.data)
Deviance Residuals:
     Min        1Q    Median        3Q       Max
-1.74044  -0.29685   0.00439   0.44750   1.86821
Coefficients:
            Estimate Std. Error z value Pr(>|z|)
(Intercept) -25.9382    11.4871  -2.258   0.0239 *
income        0.3326     0.1629   2.042   0.0412 *
size          1.9276     0.9256   2.083   0.0373 *
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
(Dispersion parameter for binomial family taken to be 1)
Null deviance: 33.271 on 23 degrees of freedom
(Note: a statistic measuring the discrepancy between the fitted and observed values when the model contains only the intercept.)
Residual deviance: 15.323 on 21 degrees of freedom
(Note: the same discrepancy statistic once the explanatory variables are included in the model.)
AIC: 21.323
Number of Fisher Scoring iterations: 6
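The coefficients in the summary are on the log-odds scale, so exponentiating them gives odds ratios, which are often easier to read. A minimal sketch using only the estimates printed above (the names `coefs` and `odds.ratios` are illustrative and do not appear in the original code):

```r
# Estimates copied from the summary output above
coefs <- c("(Intercept)" = -25.9382, income = 0.3326, size = 1.9276)

# Odds ratios: multiplicative change in the odds of ownership
# for a one-unit increase in each explanatory variable
odds.ratios <- exp(coefs[c("income", "size")])
round(odds.ratios, 3)
```

With the fitted model in memory, exp(coef(mower.logit)) should return the same quantities without retyping the estimates.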
1-pchisq(15.323, 21)
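The residual deviance is 15.323 on 21 degrees of freedom, which gives a p-value of 0.806.
Because this p-value is far above conventional significance levels, the null hypothesis that the fitted model describes the data adequately is not rejected.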
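The same output also supports a test of whether income and size jointly improve on the intercept-only model: the drop from the null deviance to the residual deviance is compared with a chi-square distribution whose degrees of freedom equal the number of added parameters. A sketch using the figures printed in Result(1) (all variable names here are illustrative):

```r
# Deviances and degrees of freedom copied from the summary output
null.dev  <- 33.271; null.df  <- 23
resid.dev <- 15.323; resid.df <- 21

# Likelihood-ratio statistic and its degrees of freedom
lr.stat <- null.dev - resid.dev   # 17.948
lr.df   <- null.df - resid.df     # 2

# Upper-tail chi-square probability
p.value <- pchisq(lr.stat, lr.df, lower.tail = FALSE)
p.value
```

The resulting p-value is well below 0.001, which suggests that the two explanatory variables together carry real information about ownership.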
Code(2): Classifying new data
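# Predicted ownership probabilities (newdata here is the training data itself)
mower.predict <- predict(mower.logit, newdata = mower.data, type = "response")
# Classify as "yes" when the predicted probability is at least 0.5
pred <- ifelse(mower.predict < 0.5, "no", "yes")
pred <- factor(pred)
# Cross-tabulate actual against predicted classes
confusion.matrix <- table(mower.data$owner, pred)
# Misclassification rate = 1 - accuracy
error <- 1 - (sum(diag(confusion.matrix)) / sum(confusion.matrix))
error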
Result(2)
0.1666667
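Code(2) scores the same 24 observations that were used to fit the model. To classify a genuinely new observation, the fitted coefficients can be applied directly. A minimal sketch: the income and size values below are hypothetical and chosen only for illustration, and the names `new.obs`, `eta`, `prob` and `new.pred` are not from the original code.

```r
# Estimates copied from the summary output in Result(1)
b0 <- -25.9382; b.income <- 0.3326; b.size <- 1.9276

# Hypothetical new household (values are illustrative, not from the data)
new.obs <- data.frame(income = 25, size = 10)

# Linear predictor (log-odds) and predicted probability of ownership
eta  <- b0 + b.income * new.obs$income + b.size * new.obs$size
prob <- plogis(eta)   # inverse logit: 1 / (1 + exp(-eta))

# Same 0.5 cutoff as in Code(2)
new.pred <- ifelse(prob < 0.5, "no", "yes")
new.pred
```

With the model in memory, predict(mower.logit, newdata = new.obs, type = "response") should give the same probability, up to rounding of the printed coefficients.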