Diff for "FAQ/AICreg" - CBU statistics Wiki
location: Diff for "FAQ/AICreg"
How do I compute Akaike's and Bayesian information criteria (AIC, BIC) to compare regression models and how do I interpret them?

Akaike's information criterion (AIC) is used to compare the efficiency of multivariate models fitted to the same data, combining the degree of fit with the number of terms in the model. Better fitting, simpler models have smaller AICs and are preferred. AIC can be used as an alternative to the F ratio in stepwise regressions to investigate the effectiveness of adding or subtracting one or more predictors from a model (see an example in the Regression Grad talk). Information criteria can also be used to compare logistic regression models with overdispersion (Agresti, 1996).

AIC = n ln(RSS/n) + 2 df(model)

where RSS is the Residual Sum of Squares, which is routinely output by the regression analysis, n is the total sample size and df(model) is the degrees of freedom of the regression model, i.e. the number of parameters, which equals the number of predictors + 1 (for the intercept). The above formula for AIC is also given on page 63 of Burnham and Anderson (2002).
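As a worked illustration of this formula, here is a minimal Python sketch; the function name, sample size and RSS values are invented for illustration, and in practice the RSS values would come from your own regression output.

{{{#!python
import math

def aic(rss, n, n_predictors):
    """AIC = n*ln(RSS/n) + 2*df(model), with df(model) = number of predictors + 1 (intercept)."""
    df_model = n_predictors + 1
    return n * math.log(rss / n) + 2 * df_model

# Hypothetical example: two nested models fitted to the same n = 100 cases
n = 100
aic_small = aic(rss=520.0, n=n, n_predictors=2)   # model with 2 predictors
aic_large = aic(rss=495.0, n=n, n_predictors=5)   # model with 5 predictors

# The model with the smaller AIC is preferred
print(round(aic_small, 2), round(aic_large, 2))
}}}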

There is also a Bayesian Information Criterion (BIC) or Schwarz's criterion

BIC = n ln(RSS/n) + (k+1) ln(n)

where n is the total sample size and k is the number of predictors, so that the model has k+1 parameters (including the intercept).
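A corresponding sketch for BIC, using the same illustrative models as above; note that for n above about 7 the ln(n) penalty per parameter exceeds AIC's penalty of 2, so BIC favours simpler models more strongly.

{{{#!python
import math

def bic(rss, n, n_predictors):
    """BIC = n*ln(RSS/n) + (k+1)*ln(n), where k is the number of predictors."""
    return n * math.log(rss / n) + (n_predictors + 1) * math.log(n)

# Same hypothetical models as in the AIC sketch
n = 100
bic_small = bic(rss=520.0, n=n, n_predictors=2)
bic_large = bic(rss=495.0, n=n, n_predictors=5)

# The model with the smaller BIC is preferred
print(round(bic_small, 2), round(bic_large, 2))
}}}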

Nagin (1999) suggests using bij = exp(BIC(i) - BIC(j)) as a means of deciding whether one BIC is meaningfully different from another (page 147, and Table 2 on page 148 gives some rules of thumb, reproduced below with a small worked sketch after the table). This is also mentioned in Chapter Four of Nagin (2005).

bij                  Interpretation
bij < 1/10           Strong evidence for model j
1/10 < bij < 1/3     Moderate evidence for model j
1/3 < bij < 1        Weak evidence for model j
1 < bij < 3          Weak evidence for model i
3 < bij < 10         Moderate evidence for model i
bij > 10             Strong evidence for model i
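A minimal sketch of this rule of thumb, with the thresholds transcribed from the table above; the BIC values and the handling of exact boundary values are illustrative assumptions.

{{{#!python
import math

def nagin_bij(bic_i, bic_j):
    """Approximate Bayes factor bij = exp(BIC(i) - BIC(j)) as in Nagin (1999)."""
    return math.exp(bic_i - bic_j)

def interpret_bij(b):
    """Rules of thumb from the table above (Nagin 1999, Table 2)."""
    if b < 1 / 10:
        return "Strong evidence for model j"
    if b < 1 / 3:
        return "Moderate evidence for model j"
    if b < 1:
        return "Weak evidence for model j"
    if b < 3:
        return "Weak evidence for model i"
    if b < 10:
        return "Moderate evidence for model i"
    return "Strong evidence for model i"

# Illustrative (made-up) BIC values for two competing models
b = nagin_bij(bic_i=-602.1, bic_j=-604.3)   # exp(2.2), roughly 9
print(round(b, 2), interpret_bij(b))        # Moderate evidence for model i
}}}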

Jones, Nagin and Roeder (2001) alternatively suggest using twice the raw difference in BICs to compare models.

2(Diff in BICs)      Interpretation
0 to 2               Not worth mentioning
2 to 6               Positive
6 to 10              Strong
> 10                 Very Strong
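Since bij = exp(BIC(i) - BIC(j)), twice the raw BIC difference is simply 2 ln(bij). A minimal sketch against the table above; the example BIC values, and reading the magnitude of the difference with its sign giving the direction, are assumptions for illustration.

{{{#!python
def interpret_two_bic_diff(bic_i, bic_j):
    """Twice the raw BIC difference, read against the table above
    (Jones, Nagin & Roeder, 2001). A negative difference would favour
    model j, so the magnitude is what is interpreted here."""
    d = 2 * (bic_i - bic_j)
    strength = abs(d)
    if strength <= 2:
        label = "Not worth mentioning"
    elif strength <= 6:
        label = "Positive"
    elif strength <= 10:
        label = "Strong"
    else:
        label = "Very Strong"
    return d, label

# Same illustrative BIC values as in the previous sketch: 2 * 2.2 = 4.4 -> "Positive"
print(interpret_two_bic_diff(-602.1, -604.3))
}}}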

On a related note, Shafer and Jeffreys (1961) each give rules of thumb for the size of Bayes Factors (which compare an alternative model to a null model), both suggesting that Bayes Factors under 3 provide only weak (Shafer) or anecdotal (Jeffreys) evidence.

Some rules of thumb for using Bayes factors (Jeffreys 1961)

Bayes factor               Interpretation
1 < Bayes factor <= 3      weak evidence for M1
3 < Bayes factor <= 10     substantial evidence for M1
10 < Bayes factor <= 100   strong evidence for M1
Bayes factor > 100         decisive evidence for M1
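For completeness, a small sketch mapping a Bayes factor onto Jeffreys' (1961) categories in the table above; the treatment of values of 1 or below, which the table does not cover, is an assumption.

{{{#!python
def jeffreys_label(bayes_factor):
    """Map a Bayes factor in favour of M1 onto Jeffreys' (1961) categories
    from the table above. Values of 1 or below fall outside the quoted table."""
    if bayes_factor <= 1:
        return "not covered by the table (evidence does not favour M1)"
    if bayes_factor <= 3:
        return "weak evidence for M1"
    if bayes_factor <= 10:
        return "substantial evidence for M1"
    if bayes_factor <= 100:
        return "strong evidence for M1"
    return "decisive evidence for M1"

print(jeffreys_label(7.2))   # substantial evidence for M1
print(jeffreys_label(150))   # decisive evidence for M1
}}}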

 * Free Bayesian analysis software (JASP) is available from https://jasp-stats.org/. There is also a pdf guide to computing and interpreting Bayes Factors in factorial ANOVAs (attachment: bayesANOVA.pdf). In particular, pages 28-31 of this guide show how to compare pairs of Bayes factors nested within the same model, e.g. with and without a main effect, to assess the importance of the extra terms (e.g. the main effect) in the fuller model using the Bayes Factors.

References

Agresti A (1996) An introduction to categorical data analysis. Wiley:New York.

Burnham KP and Anderson DR (2002) Model selection and multimodel inference: a practical information-theoretic approach, second edition. Springer-Verlag: New York.

(A pdf copy of the above book may also be downloaded for free from here.)

Jeffreys H (1961) Theory of Probability, 3rd ed. Oxford Classic Texts in the Physical Sciences. Oxford University Press: Oxford.

Jones B, Nagin D and Roeder K (2001) A SAS Procedure Based on Mixture Models for Estimating Developmental Trajectories. Sociological Methods & Research 29 374-393.

Nagin DS (1999) Analyzing Developmental Trajectories: A Semiparametric, Group-Based Approach. Psychological Methods 4(2) 139-157.

Nagin DS (2005) Group-based Modeling of Development. Harvard University Press: Massachusetts.
