Akaike information small sample AICc

  • Thread starter mertcan
In summary, the correction term in AICc compensates for the bias of AIC in small samples, where the plain AIC penalty is too weak and tends to favor models with too many parameters.
  • #1
mertcan
Hi, I am aware that the AICc value is $$ -2\,(\text{log-likelihood}) + 2K + \frac{2K(K+1)}{n-K-1} $$ where n is the sample size and K is the number of model parameters, but I do not know where the last term on the right-hand side comes from. The AIC value is $$ -2\,(\text{log-likelihood}) + 2K $$, so AICc adds a correction to AIC. In short, my question is: what is the derivation of the correction term $$ \frac{2K(K+1)}{n-K-1} $$ in AICc?
 
  • #2
Unfortunately, in searching the web, we find that the usual approach is just to define AIC by a formula and to define AICc by a different formula. However, the terminology "correction" suggests that both formulae are trying to compute a common quantity, whose definition is unstated. If we only consider history as the authority on definitions, we would have to read the original papers that defined the AIC and the AICc to see if the people who proposed the AIC and AICc defined a common quantity that these formulae are supposed to approximate.

If we go beyond history to seek a respectable definition for the AIC, the section "Model Selection Criterion" on page 7 of the presentation http://www4.ncsu.edu/~shu3/Presentation/AIC.pdf defines a quantity that is to be maximized. The particular formulae used to estimate that quantity could be different for different types of models and situations (e.g. linear models with large samples vs. linear models with small samples). If we define the AIC abstractly as a quantity proportional to:

##E_y E_x [\log(g(x| \hat{\theta}(y)))]##

then, in different situations, the AIC can be given by different formulae.

I don't know what level of abstraction you are comfortable with. One can probably understand the formulae for the AIC and AICc by considering specific situations, but I won't try to figure this out myself unless someone else is really interested in participating!
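As one specific situation, here is a minimal simulation sketch. The setting is an assumption chosen for illustration, not something from the presentation: a Gaussian model with unknown mean and variance (K = 2), data actually drawn from that model, and the abstract quantity above multiplied by -2 so that it is on the same scale as AIC. The function name and default values are hypothetical. It estimates how far -2(log-likelihood) falls below the out-of-sample target and compares that gap with the AIC and AICc penalties.

```python
import math
import random


def aic_bias_simulation(n=10, n_sims=100_000, mu0=0.0, sigma0=1.0, seed=0):
    """Monte Carlo estimate of the gap between the out-of-sample target
    E_y E_x[-2 log g(x | theta_hat(y))] and the in-sample -2 log-likelihood,
    for a Gaussian model with unknown mean and variance (K = 2)."""
    rng = random.Random(seed)
    total_gap = 0.0
    for _ in range(n_sims):
        # Sample y of size n from the true model (an assumption of this sketch).
        y = [rng.gauss(mu0, sigma0) for _ in range(n)]
        mean_hat = sum(y) / n
        var_hat = sum((v - mean_hat) ** 2 for v in y) / n  # MLE divides by n

        # In-sample: -2 * log-likelihood of y at the fitted parameters.
        in_sample = n * math.log(2 * math.pi * var_hat) + n

        # Out-of-sample: expectation over a fresh sample x of size n,
        # available in closed form for the Gaussian model.
        expected_sq_err = n * (sigma0 ** 2 + (mean_hat - mu0) ** 2)
        out_of_sample = (n * math.log(2 * math.pi * var_hat)
                         + expected_sq_err / var_hat)

        total_gap += out_of_sample - in_sample
    return total_gap / n_sims


n, K = 10, 2
gap = aic_bias_simulation(n=n)
print(f"simulated gap : {gap:.2f}")
print(f"AIC penalty   : {2 * K:.2f}")
print(f"AICc penalty  : {2 * K + 2 * K * (K + 1) / (n - K - 1):.2f}")
```

With n = 10 and K = 2 the AIC penalty is 4 and the AICc penalty is 40/7, about 5.71. In this setting the simulated gap should land near the AICc value rather than the AIC value, which is the sense in which the extra term is a small-sample "correction".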
 
  • #3
Stephen Tashi said:
Unfortunately, in searching the web, we find that the usual approach is just to define AIC by a formula and to define AICc by a different formula. [...]
First of all, thanks for your reply. I know how to derive the AIC value without the correction. However, when the sample size is small relative to the number of parameters (n/k < 40, where k is the number of parameters and n is the sample size), it is said that we should use the corrected AIC, that is, AICc. I wonder where the 40 comes from: which assumptions in the definition of AIC lead to the threshold n/k < 40 in the small-sample case?
 
  • #4
I myself don't know where the number 40 comes from.

The articles I've found that bother to footnote the recommendation n/k < 40 cite Burnham KP, Anderson DR. Model Selection and Multimodel Inference: A Practical Information-Theoretic Approach. 2nd ed. New York: Springer-Verlag; 2002. I don't have a copy of that book.

We could try to follow the derivation given in http://myweb.uiowa.edu/cavaaugh/doc/pub/aicaicc.pdf, starting on page 3. However, I don't see the number 40 mentioned in that document.
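
As a rough sketch of how such a derivation goes in the simplest setting (the assumptions here are mine and are not necessarily the steps of the linked document): take a normal linear regression with ##p## regression coefficients and an unknown variance, so ##K = p + 1##, and assume the true model, with parameters ##\beta_0, \sigma_0^2##, is among the candidates. At the maximum likelihood estimates,

$$ -2\log L(\hat{\theta}) = n\log(2\pi\hat{\sigma}^2) + n, \qquad \hat{\sigma}^2 = \mathrm{RSS}/n . $$

For an independent sample ##x## of the same size, the expected value of the same expression is

$$ E_x[-2\log g(x|\hat{\theta}(y))] = n\log(2\pi\hat{\sigma}^2) + \frac{n\sigma_0^2 + \|X\hat{\beta} - X\beta_0\|^2}{\hat{\sigma}^2} . $$

Under these assumptions ##n\hat{\sigma}^2/\sigma_0^2 \sim \chi^2_{n-p}## and ##\|X\hat{\beta} - X\beta_0\|^2/\sigma_0^2 \sim \chi^2_p## are independent, and ##E[1/\chi^2_\nu] = 1/(\nu - 2)##, so

$$ E_y\left[\frac{n\sigma_0^2 + \|X\hat{\beta} - X\beta_0\|^2}{\hat{\sigma}^2}\right] = \frac{n(n+p)}{n-p-2} . $$

The amount by which ##-2\log L(\hat{\theta})## underestimates the out-of-sample quantity is therefore

$$ \frac{n(n+p)}{n-p-2} - n = \frac{2n(p+1)}{n-p-2} = \frac{2nK}{n-K-1} = 2K + \frac{2K(K+1)}{n-K-1} . $$

The first term is the AIC penalty and the second is the AICc correction. The result is exact only under the stated assumptions and requires ##n > K + 1##; for other model families the same formula is used as an approximation. Nothing in this sketch produces the number 40.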
 

Related to Akaike information small sample AICc

1. What is the Akaike information criterion (AIC)?

The Akaike information criterion (AIC) is a statistical measure used for model selection. It is based on the idea that a good model should fit the data well while being as simple as possible. The AIC takes into account both the goodness of fit and the complexity of the model to determine the best model among a set of competing models.

2. What is the difference between AIC and AICc?

AICc, or corrected AIC, is a variant of AIC intended for small sample sizes. In AIC, the penalty for model complexity, 2k, depends only on the number of parameters, while in AICc the penalty also depends on the sample size n. This correction matters because, when the sample is small, AIC tends to favor models with too many parameters (overfitting), which biases model selection.

3. How is AICc calculated?

AICc is calculated by adding a correction term to the AIC formula: AICc = AIC + 2k(k+1)/(n-k-1), where k is the number of parameters in the model and n is the sample size, with n > k + 1. The correction term shrinks toward zero as n grows with k fixed, so AICc approaches AIC for large samples.
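
The formula can be written as a short sketch. The function names and the example numbers are hypothetical and chosen only to illustrate the arithmetic; the log-likelihood is assumed to be the maximized log-likelihood of the fitted model.

```python
def aic(log_likelihood: float, k: int) -> float:
    """AIC = -2 * log-likelihood + 2k."""
    return -2.0 * log_likelihood + 2.0 * k


def aicc(log_likelihood: float, k: int, n: int) -> float:
    """AICc = AIC + 2k(k+1)/(n-k-1); only defined for n > k + 1."""
    if n - k - 1 <= 0:
        raise ValueError("AICc requires n > k + 1")
    return aic(log_likelihood, k) + 2.0 * k * (k + 1) / (n - k - 1)


# Hypothetical fit: log-likelihood -50.0, k = 3 parameters, n = 20 observations.
print(aic(-50.0, 3))       # 106.0
print(aicc(-50.0, 3, 20))  # 107.5, since the correction is 24/16 = 1.5
```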

4. When should AICc be used instead of AIC?

AICc should be used when the sample size is small relative to the number of parameters; a commonly quoted rule of thumb is n/k < 40. This is because AIC becomes increasingly biased as n/k decreases, while AICc gives a less biased estimate of the quantity that AIC is meant to approximate.
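
To get a feel for the rule of thumb, the sketch below computes the AICc correction term as a fraction of the AIC penalty 2k for several values of n/k. The helper name and the choice k = 5 are hypothetical. This only shows how quickly the correction becomes small; it does not explain why the threshold is set at 40 rather than some other value.

```python
def relative_correction(n: int, k: int) -> float:
    """AICc correction 2k(k+1)/(n-k-1) divided by the AIC penalty 2k."""
    return (k + 1) / (n - k - 1)


k = 5  # hypothetical number of parameters
for ratio in (5, 10, 40, 100):
    n = ratio * k
    share = relative_correction(n, k)
    print(f"n/k = {ratio:>3}: correction is {share:.1%} of the AIC penalty")
```

For k = 5 and n/k = 40 (n = 200), the correction is 6/194, roughly 3% of the AIC penalty, so the two criteria will rarely rank models differently beyond that point.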

5. How can AICc be used in model selection?

AICc can be used to compare the relative quality of different models fitted to the same data by choosing the model with the lowest AICc value. As a rule of thumb, models whose AICc is within about 2 of the lowest value are considered to have comparable support, and larger differences indicate progressively less support for the higher-scoring model; this is a guideline, not a formal significance test. AICc should not be the only factor considered in model selection, and other criteria such as theoretical plausibility and interpretability should also be taken into account.
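
A minimal sketch of such a comparison follows. The candidate names, log-likelihoods, and parameter counts are made-up numbers for illustration, and all models are assumed to be fitted to the same n = 30 observations.

```python
def aicc(log_likelihood: float, k: int, n: int) -> float:
    """AICc = -2 * log-likelihood + 2k + 2k(k+1)/(n-k-1)."""
    return -2.0 * log_likelihood + 2.0 * k + 2.0 * k * (k + 1) / (n - k - 1)


n = 30  # hypothetical sample size shared by all candidates
# Hypothetical candidates: name -> (maximized log-likelihood, parameter count k)
candidates = {
    "linear": (-45.2, 3),
    "quadratic": (-43.9, 4),
    "cubic": (-43.7, 5),
}

scores = {name: aicc(ll, k, n) for name, (ll, k) in candidates.items()}
best_score = min(scores.values())

# Rank candidates by AICc and report each one's difference from the best.
for name, score in sorted(scores.items(), key=lambda item: item[1]):
    print(f"{name:<10} AICc = {score:7.2f}  delta = {score - best_score:5.2f}")
```

With these made-up numbers the linear model has the lowest AICc, whereas plain AIC (96.4, 95.8, and 97.4 respectively) would pick the quadratic one, which illustrates how the correction can change the ranking when n/k is small.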
