When computing predictions for a two-class case there seems to be a mistake.
Here is a reproducible example:
library(polyreg)
library(MLmetrics)
data(kyphosis, package = "rpart")
kyphosis$y <- ifelse(kyphosis$Kyphosis == "absent", 1, 0)
kyphosis$Kyphosis <- NULL
mod <- glm(y ~ ., data = kyphosis, family = binomial())
mod
# Coefficients:
# (Intercept) Age Number Start
# 2.03693 -0.01093 -0.41060 0.20651
#
# Degrees of Freedom: 80 Total (i.e. Null); 77 Residual
# Null Deviance: 83.23
# Residual Deviance: 61.38 AIC: 69.38
table(ifelse(predict(mod, type = "response") > 0.5, 1, 0), kyphosis$y)
# 0 1
# 0 7 3
# 1 10 61
Accuracy(ifelse(predict(mod, type = "response") > 0.5, 1, 0), kyphosis$y)
# 0.8395062
table(ifelse(predict(mod) > 0.5, 1, 0), kyphosis$y)
# 0 1
# 0 10 8
# 1 7 56
Accuracy(ifelse(predict(mod) > 0.5, 1, 0), kyphosis$y)
# 0.8148148
data(kyphosis, package = "rpart")
kyphosis <- kyphosis[,c(2:4,1)]
kyphosis$Kyphosis <- as.character(kyphosis$Kyphosis)
pf <- polyFit(kyphosis, deg = 1, use = "glm")
pf$fit
# Coefficients:
# (Intercept) V1 V2 V3
# 2.03693 -0.01093 -0.41060 0.20651
#
# Degrees of Freedom: 80 Total (i.e. Null); 77 Residual
# Null Deviance: 83.23
# Residual Deviance: 61.38 AIC: 69.38
OK, the same model is fitted, but computing predictions:
table(predict(pf, kyphosis), kyphosis$Kyphosis)
# absent present
# absent 56 7
# present 8 10
Accuracy(predict(pf, kyphosis), kyphosis$Kyphosis)
# 0.8148148
seems to be wrong. Looking at the code, you can see:
# glm case
if (is.null(object$glmMethod)) { # only two classes
  pre <- predict(object$fit, plm.newdata)
  pred <- ifelse(pre > 0.5, object$classes[1], object$classes[2])
}
IMHO the prediction returned is on the link scale (see help(predict.glm)), but it should be on the probability scale, i.e. type = "response"; alternatively, if it stays on the link scale, the threshold should be pre > 0. However, I prefer to reason in terms of the probability scale.
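To make the point concrete, here is a minimal sketch on simulated data (not the kyphosis data, and not polyreg's actual code). It checks that thresholding the probability at 0.5 is the same rule as thresholding the link at 0, and counts how many observations the mixed-up rule (link scale with a 0.5 cutoff) classifies differently. The helper predict_two_class() and its arguments are hypothetical names used only for illustration; it assumes, as in the snippet above, that classes[1] is the class modelled as the "success".

```r
# Simulated two-class data (illustrative only; not the kyphosis data above).
set.seed(1)
n <- 200
x <- rnorm(n)
y <- rbinom(n, 1, plogis(0.5 + x))
fit <- glm(y ~ x, family = binomial())

link <- predict(fit)                     # default type = "link": log-odds
prob <- predict(fit, type = "response")  # probabilities in (0, 1)

# The two scales are related by the inverse logit.
stopifnot(isTRUE(all.equal(unname(prob), unname(plogis(link)))))

# Equivalent decision rules: prob > 0.5 <=> link > 0.
same_rule <- identical(unname(prob > 0.5), unname(link > 0))

# The mixed-up rule (link scale, 0.5 cutoff) disagrees for every
# observation whose link value lies in (0, 0.5].
n_disagree <- sum((link > 0.5) != (prob > 0.5))

# Hypothetical helper (not part of polyreg): two-class prediction
# on the probability scale.
predict_two_class <- function(fit, newdata, classes) {
  p <- predict(fit, newdata, type = "response")
  ifelse(p > 0.5, classes[1], classes[2])
}

pred <- predict_two_class(fit, data.frame(x = x), c("absent", "present"))
table(pred, y)
```

Either fix (type = "response" with a 0.5 cutoff, or the link scale with a 0 cutoff) gives the same labels; the probability scale just makes the cutoff easier to read.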