The Kullback-Leibler (KL) information or distance between full reality \(f\), which is considered fixed, and an approximating model \(g\) is
\[I(f,g)=\int f(x) \log \left( \frac{f(x)}{g(x | \theta)} \right) \, dx\]
which can be written as
\[I(f,g)=E_f \left[ \log f(x) \right] - E_f \left[ \log g(x | \theta) \right]\]
The first expected value is a constant \(C\) that we will never know, as we don’t know the full reality model, but we can calculate relative distances
\[relative \, distance = I(f,g) - C = - E_f \left[ \log g(x | \theta) \right]\]
For two approximating models:
\[I(f,g_1) - I(f,g_2) = - E_f \left[ \log g_1(x | \theta) \right] + E_f \left[ \log g_2(x | \theta) \right]\]
Notice that while \(I(f,g)\) has a true zero, the \(relative \, distance\) does not have a true zero.
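As a minimal numerical sketch (using made-up normal distributions that are not part of the psychophysical example below), suppose full reality \(f\) is a standard normal and \(g_1\) and \(g_2\) are two shifted normals; the constant \(C\) cancels, so the difference in relative distances equals the difference in KL distances:
f  <- function(x) dnorm(x, mean = 0,   sd = 1)  # "full reality" (known here only for illustration)
g1 <- function(x) dnorm(x, mean = 0.5, sd = 1)  # approximating model 1
g2 <- function(x) dnorm(x, mean = 1,   sd = 1)  # approximating model 2
# numerical integration over a range that contains essentially all the mass
kl  <- function(f, g) integrate(function(x) f(x) * log(f(x) / g(x)), -10, 10)$value
rel <- function(f, g) -integrate(function(x) f(x) * log(g(x)), -10, 10)$value
kl(f, g1) - kl(f, g2)    # difference of KL distances
rel(f, g1) - rel(f, g2)  # same difference, computed without knowing C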
In practice, we can only estimate these relative distances, as we don’t know the parameters of the model and need to estimate them (\(\widehat{\theta}\)). The Akaike Information Criterion (AIC) provides such an estimate (multiplied by 2):
\[ AIC = -2\log{L(\widehat{\theta})} + 2 k\]
where \(L(\widehat{\theta})\) is the likelihood evaluated at the parameter values that maximize it and \(k\) is the number of parameters of the model.
n <- 100                   # number of trials at each stimulus level
x <- c(.2, .4, .6, .8, 1)  # stimulus levels
k <- c(10, 26, 73, 94, 97) # number of positive responses out of n at each level
r <- n - k                 # number of remaining (negative) responses
y <- k / n                 # proportion of positive responses
dat <- data.frame(x, y, k, r, n)
library(quickpsy)
fit <- quickpsy(dat, x, k, n) # fit the psychometric function
fit$aic
## aic
## 1 30.89906
model <- glm(cbind(k, r) ~ x, data = dat, family = binomial(probit))
model
##
## Call: glm(formula = cbind(k, r) ~ x, family = binomial(probit), data = dat)
##
## Coefficients:
## (Intercept) x
## -2.279 4.572
##
## Degrees of Freedom: 4 Total (i.e. Null); 3 Residual
## Null Deviance: 304.4
## Residual Deviance: 6.662 AIC: 30.9
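To relate the \(AIC\) reported above to the formula, we can compute it by hand from the maximized log-likelihood and the number of estimated parameters:
logLik(model)                               # maximized log-likelihood
k_par <- length(coef(model))                # number of estimated parameters (2 here)
-2 * as.numeric(logLik(model)) + 2 * k_par  # reproduces AIC(model)
AIC(model)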
Sometimes, other software provides different values for the \(AIC\) because it uses the log-likelihood dropping the binomial coefficients (which do not depend on the parameters). In general, this is not a problem, as we are interested in differences in \(AIC\), but it could be a problem when comparing models with different distributions of errors.
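To see why the dropped term is harmless within a family, note that for binomial models the binomial coefficients add a constant to the log-likelihood that does not depend on the fitted parameters, so it cancels when taking \(AIC\) differences. A quick check with the model above:
p_hat <- fitted(model)  # fitted probabilities at each stimulus level
ll_full    <- sum(dbinom(k, size = n, prob = p_hat, log = TRUE))
ll_dropped <- sum(k * log(p_hat) + (n - k) * log(1 - p_hat))
ll_full - ll_dropped  # equals sum(lchoose(n, k)), the same for any binomial model of these data
sum(lchoose(n, k))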