# Using linear regression as an alternative to logistic regression

Hellevik (2009) suggests that linear regression can be used with a binary outcome in place of logistic regression, particularly for large samples. The p-values from the two types of regression were found to be very close, even though ordinary least squares (OLS) does not account for heterogeneity of variance (the variance of a proportion depends on the proportion itself), which logistic regression does take into account. Mood (2010) shows almost identical results from linear and logistic regression in Table 5 of her paper and also suggests that linear regression may be used on a binary outcome, e.g. to make the results more interpretable. Battey, Cox and Jackson (2019) likewise suggest fitting linear regression to binary data coded as 1 and -1, obtain results similar to those from logistic regression, and find the linear regression results easy to interpret.
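The closeness of the two approaches can be illustrated with a small simulation. The sketch below is not taken from any of the cited papers: the sample size, the true coefficients and the helper names `fit_ols` and `fit_logit` are all assumptions chosen for illustration, and the p-values use a large-sample normal approximation. It fits both models to the same simulated binary outcome with plain NumPy and compares the test statistics for the slope, along with the OLS slope and the average marginal effect implied by the logistic fit.

```python
import math
import numpy as np

def fit_ols(X, y):
    """OLS coefficients and t-statistics using classical standard errors."""
    n, k = X.shape
    xtx_inv = np.linalg.inv(X.T @ X)
    beta = xtx_inv @ X.T @ y
    resid = y - X @ beta
    sigma2 = resid @ resid / (n - k)
    return beta, beta / np.sqrt(sigma2 * np.diag(xtx_inv))

def fit_logit(X, y, n_iter=25):
    """Logistic regression by Newton-Raphson; returns coefficients and Wald z."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        hessian = X.T @ (X * (p * (1.0 - p))[:, None])
        beta = beta + np.linalg.solve(hessian, X.T @ (y - p))
    p = 1.0 / (1.0 + np.exp(-X @ beta))
    cov = np.linalg.inv(X.T @ (X * (p * (1.0 - p))[:, None]))
    return beta, beta / np.sqrt(np.diag(cov))

def normal_p_value(stat):
    """Two-sided p-value from a standard normal reference distribution."""
    return math.erfc(abs(stat) / math.sqrt(2.0))

# Hypothetical data-generating process (illustrative values only).
rng = np.random.default_rng(0)
n = 2000
x = rng.normal(size=n)
p_true = 1.0 / (1.0 + np.exp(-(0.2 + 0.3 * x)))
y = (rng.uniform(size=n) < p_true).astype(float)
X = np.column_stack([np.ones(n), x])

b_ols, t_ols = fit_ols(X, y)
b_logit, z_logit = fit_logit(X, y)

# Average marginal effect of x implied by the logistic fit,
# which is on the same probability scale as the OLS slope.
p_hat = 1.0 / (1.0 + np.exp(-X @ b_logit))
ame = b_logit[1] * np.mean(p_hat * (1.0 - p_hat))

print(f"OLS slope        : {b_ols[1]:.4f}  (stat {t_ols[1]:.2f}, p {normal_p_value(t_ols[1]):.2e})")
print(f"Logit slope      : {b_logit[1]:.4f}  (stat {z_logit[1]:.2f}, p {normal_p_value(z_logit[1]):.2e})")
print(f"Logit avg. effect: {ame:.4f}")
```

The OLS slope reads directly as a change in probability per unit of `x`, which is the interpretability argument made above. The logistic slope is on the log-odds scale and has to be converted before the two can be compared.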

On a related theme, a research report by Jaeger suggests using mixed (random effects) models instead of the arcsine transformation, which has traditionally been used to make the variance of proportions independent of the proportion. Ahrens, Cox and Budhwar (1990) suggest the arcsine transform can be used on proportions provided they are greater than 0.2 and less than 0.8.
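The variance-stabilising property of the arcsine transformation can be checked numerically. The following is a minimal sketch under assumed settings (100 binomial trials per observed proportion, 20,000 simulated proportions per value of p, and p values chosen within the 0.2 to 0.8 range). It compares the variance of the raw proportions with the variance of their arcsine square root, which should be roughly 1/(4n) regardless of p.

```python
import numpy as np

rng = np.random.default_rng(1)
n_trials = 100     # assumed number of binomial trials behind each proportion
n_reps = 20000     # assumed number of simulated proportions per value of p

var_raw = {}
var_arcsine = {}
for p in (0.2, 0.35, 0.5, 0.65, 0.8):
    p_hat = rng.binomial(n_trials, p, size=n_reps) / n_trials
    var_raw[p] = p_hat.var()
    # Arcsine square-root transformation of the observed proportion.
    var_arcsine[p] = np.arcsin(np.sqrt(p_hat)).var()
    print(f"p={p:.2f}  raw variance={var_raw[p]:.5f}  "
          f"arcsine variance={var_arcsine[p]:.5f}")

print(f"theoretical arcsine variance 1/(4n) = {1 / (4 * n_trials):.5f}")
```

The raw variance follows p(1 - p)/n and so peaks at p = 0.5, while the transformed variance stays close to the same constant across the range.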

Beyond the choice between OLS and logistic regression, Mood (2010) also raises caveats about comparing logistic regression odds ratios between different groups. These caveats rest on the assumption that the outcome variable was produced by dichotomising an underlying continuum. Kuha and Mills (2018) argue, however, that these concerns are usually misplaced.
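The latent-variable argument behind these caveats can be sketched as follows. The notation is assumed here for illustration and is not copied from either paper: $y^*$ is the underlying continuum, $\sigma$ scales its unexplained variation, and $\Lambda$ is the logistic function.

$$
y^* = \alpha + \beta x + \sigma\varepsilon, \qquad y = \mathbf{1}[y^* > 0], \qquad \varepsilon \sim \text{standard logistic}
$$

$$
\Pr(y = 1 \mid x) = \Pr\left(\varepsilon > -\frac{\alpha + \beta x}{\sigma}\right) = \Lambda\left(\frac{\alpha + \beta x}{\sigma}\right), \qquad \Lambda(u) = \frac{1}{1 + e^{-u}}
$$

Under this formulation a logistic regression of $y$ on $x$ estimates $\beta/\sigma$ rather than $\beta$. Two groups with the same $\beta$ but different $\sigma$ would therefore show different odds ratios. Whether this matters depends on whether the latent $y^*$ is really the quantity of interest.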

## References

Ahrens, W.H., Cox, D.J. and Budhwar, G. (1990) Use of the arcsine and square root transformations for subjectively determined percentage data. *Weed Science* **38** 452-458.

Battey, H.S., Cox, D.R. and Jackson, M. (2019) On the linear in probability model for binary data. *Royal Society Open Science* **6** 190067.

Hellevik, O. (2009) Linear versus logistic regression when the dependent variable is a dichotomy. *Quality & Quantity* **43** 59-74.

Kuha, J. and Mills, C. (2018) On group comparisons with logistic regression models. *Sociological Methods & Research* 1-28.

Mood, C. (2010) Logistic regression: why we cannot do what we think we can do, and what we can do about it. *European Sociological Review* **26(1)** 67-82.