How do I summarise a fit for a logistic regression model?
Menard (2000) compares various R-squared measures for binary logistic regression and concludes that the one based on the log-likelihood ratio is the most appropriate:
$$ \mbox{R-squared (Likelihood ratio)} = 1 - \frac{\ln(L[m])}{\ln(L[0])} = 1 - \frac{-2 \ln(L[m])}{-2 \ln(L[0])} = \frac{\ln(L[0]) - \ln(L[m])}{\ln(L[0])} $$
where L[m] and L[0] are the likelihoods for the model with predictors and the model containing only the intercept respectively, so that ln(L[m]) and ln(L[0]) are the corresponding log likelihoods. The second form of the expression uses -2 times the log likelihood, which is what SPSS (and other software) outputs in place of the log likelihood itself. This form of R-squared is also known as McFadden's R-squared.
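As a minimal sketch of the formula above, the R-squared can be computed either from the two log likelihoods or from the two -2 log likelihood values. The function names and the numbers used here are hypothetical and chosen purely for illustration; they do not come from any real output.

```python
def likelihood_ratio_r2(loglik_model, loglik_null):
    """Likelihood ratio (McFadden) R-squared from two log likelihoods."""
    return 1.0 - loglik_model / loglik_null


def likelihood_ratio_r2_from_neg2ll(neg2ll_model, neg2ll_null):
    """Same quantity from -2 log likelihood values; the -2 factors cancel."""
    return 1.0 - neg2ll_model / neg2ll_null


# Hypothetical -2 log likelihoods (illustrative only, not real output).
neg2ll_null = 100.0    # intercept-only model
neg2ll_model = 96.042  # model with predictors

r2 = likelihood_ratio_r2_from_neg2ll(neg2ll_model, neg2ll_null)

# Converting back to log likelihoods gives the same answer.
r2_check = likelihood_ratio_r2(-0.5 * neg2ll_model, -0.5 * neg2ll_null)

print(round(r2, 4), round(r2_check, 4))
```

Both routes give the same value because the factor of -2 appears in the numerator and the denominator of the ratio.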
The statistical significance of the predictors may be assessed jointly using twice the difference between the two log likelihoods in the above expression. This equals 2 (ln (L[m]) - ln (L[0])), which is approximately distributed as chi-square(p) if the p predictors jointly have no influence on group membership. This chi-square is computed and output by most software that performs binary logistic regression. In SPSS, for example, it is the chi-square statistic produced immediately after the predictors are added to the model, under the heading 'Block 1 Method=Enter'. For example, running a logistic regression in SPSS to assess the joint importance of two predictors p1 and p2 with the syntax below
LOGISTIC REGRESSION y /METHOD = ENTER p1 p2 /CRITERIA = PIN(.05) POUT(.10) ITERATE(20) CUT(.5) .
we obtain the likelihood ratio chi-square in the output which is of form:
BLOCK 1: METHOD-ENTER

Omnibus Tests of Model Coefficients

                  Chi-square   df   Sig.
Step 1   Step          3.958    2   .138
         Block         3.958    2   .138
         Model         3.958    2   .138
This may be expressed as chi-square(2) = 3.96, p = 0.14, indicating that together the two predictors, p1 and p2, do not have a statistically significant association with group membership, y.
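The p-value reported above can be checked by hand. The sketch below is not how SPSS computes it; it relies on the closed form of the chi-square upper tail probability that holds when the degrees of freedom are even, which covers the 2 df case here. The function name is hypothetical. For general degrees of freedom a statistics library would normally be used instead, for example `scipy.stats.chi2.sf`.

```python
import math


def chi2_upper_tail_even_df(x, df):
    """Upper tail probability P(X > x) for a chi-square with even df.

    Uses the closed form exp(-x/2) * sum_{k=0}^{df/2 - 1} (x/2)^k / k!,
    which is valid only for even, positive degrees of freedom.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form needs even, positive df")
    half = x / 2.0
    term = 1.0   # the k = 0 term of the sum
    total = 1.0
    for k in range(1, df // 2):
        term *= half / k  # builds (x/2)^k / k! incrementally
        total += term
    return math.exp(-half) * total


# Likelihood ratio chi-square and df taken from the SPSS output above.
p_value = chi2_upper_tail_even_df(3.958, 2)
print(round(p_value, 3))
```

With 2 degrees of freedom the sum has a single term, so the p-value reduces to exp(-3.958 / 2), which agrees with the Sig. column in the output.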
The likelihood ratio (McFadden's) R-squared is also advocated by Train (2003).
References
Menard, S. (2000) Coefficients of determination for multiple logistic regression analysis. American Statistician, 54, 17-24.
Train, K. (2003) Discrete choice methods with simulation. Cambridge University Press: Cambridge.