## Polychoric correlations

Some researchers (e.g. Holgado–Tello et al., 2010; Garrido et al., 2013) suggest using an alternative to the Pearson correlation when correlating two ordinal categorical variables. Polychoric correlations (tetrachoric, in the case of two binary variables) assume that each categorical variable is a coarse measurement of an underlying continuous variable, and that these underlying variables are normally distributed.

If this assumption holds, it has been shown (for example by Homer and O’Brien, 1988) that polychoric correlations estimate the correlation between pairs of categorical variables more accurately than Pearson correlations. Polychoric correlations are not estimated in SPSS itself, but they may be estimated using R syntax and incorporated into SPSS versions 16 and above. In addition, there is an SPSS macro, r_tetra.sps, that computes a single tetrachoric correlation.
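As a rough illustration of what a tetrachoric routine computes, the following Python sketch estimates the correlation for a single 2 x 2 table using only the standard library. It is not the r_tetra.sps macro or any other tool mentioned in this section; the function names `bvn_cdf` and `tetrachoric` and the cell labels `a`, `b`, `c`, `d` are made up for this example. It assumes the underlying bivariate normal model, takes the two thresholds from the marginal proportions, and then searches by bisection for the correlation whose implied probability for the first cell matches the observed proportion.

```python
from math import exp, pi, sqrt
from statistics import NormalDist

_STD = NormalDist()  # standard normal distribution


def bvn_cdf(h, k, rho, steps=200):
    """P(X <= h, Y <= k) for a standard bivariate normal with correlation rho.

    Uses Phi2(h, k; rho) = Phi(h) * Phi(k) + integral from 0 to rho of the
    bivariate normal density at (h, k), evaluated with Simpson's rule.
    `steps` must be even.
    """
    def density(r):
        return exp(-(h * h - 2 * h * k * r + k * k) / (2 * (1 - r * r))) / (
            2 * pi * sqrt(1 - r * r)
        )

    width = rho / steps  # signed, so negative rho integrates backwards from 0
    total = density(0.0) + density(rho)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * density(i * width)
    return _STD.cdf(h) * _STD.cdf(k) + total * width / 3


def tetrachoric(a, b, c, d, tol=1e-8):
    """Tetrachoric correlation for the 2 x 2 table [[a, b], [c, d]].

    Rows are the two categories of the first variable, columns those of the
    second. Assumption for this sketch: every row and column total is
    non-zero. A zero cell drives the estimate to the search boundary.
    """
    n = a + b + c + d
    h = _STD.inv_cdf((a + b) / n)  # threshold of the first variable
    k = _STD.inv_cdf((a + c) / n)  # threshold of the second variable
    target = a / n                 # observed proportion in the first cell
    lo, hi = -0.999, 0.999
    while hi - lo > tol:           # bvn_cdf is increasing in rho
        mid = (lo + hi) / 2
        if bvn_cdf(h, k, mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2
```

For example, `tetrachoric(40, 10, 10, 40)` gives approximately 0.809, whereas the Pearson (phi) coefficient for the same table is 0.6.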

SAS users can use a SAS macro to estimate a matrix of polychoric correlations. This matrix can be used in the SAS factor analysis procedure, or saved as a file of correlations and exported to SPSS, where it may be entered in lieu of raw data (Kinnear and Gray, 1999). There is also a PC programme freely downloadable from J.S. Uebersax’s website; a help guide is included showing an example of its use. Yiu and Poon (2008) present a downloadable spreadsheet which computes polychoric correlations for pairwise 3 x 3 tables of frequencies. Polychoric correlations may also be computed using a spreadsheet, and the results can then be copied and pasted into an SPSS spreadsheet. Tetrachoric correlations may likewise be computed using a spreadsheet.

The polychoric correlations can then be typed into a correlation matrix, which is read into SPSS using syntax in place of the raw data (Kinnear and Gray, 1999). An example correlation input file for four variables, V1, V2, V3 and V4, is given below. Column one identifies each row as containing either sample sizes or correlations, column two contains the variable names, and the remaining columns give the sample sizes and correlations for the four variables.

    N       20.0000000 20.0000000 20.0000000 20.0000000
    CORR V1  1.0000000   .3360000   .1617943  -.1944917
    CORR V2   .3360000  1.0000000  -.0763708   .1959513
    CORR V3   .1617943  -.0763708  1.0000000  -.0093707
    CORR V4  -.1944917   .1959513  -.0093707  1.0000000
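When the correlations come from another program, the rows of such an input file can be generated rather than typed by hand. The following Python sketch builds them in the layout described above: one `N` row, then one `CORR` row per variable. The function name `correlation_input_rows` is hypothetical, and it is an assumption of this sketch that a single common sample size applies to every variable. Values are printed with a leading zero (`0.3360000` rather than `.3360000`).

```python
def correlation_input_rows(names, corr, n):
    """Return the lines of a correlation input file.

    names: variable names, e.g. ["V1", "V2"]
    corr:  square correlation matrix as a list of lists
    n:     common sample size (an assumption of this sketch)
    """
    size = len(names)
    if len(corr) != size or any(len(row) != size for row in corr):
        raise ValueError("corr must be a square matrix matching names")

    # First row: the sample size repeated once per variable.
    lines = ["N " + " ".join(f"{float(n):.7f}" for _ in names)]

    # One CORR row per variable: row type, variable name, correlations.
    for name, row in zip(names, corr):
        lines.append(f"CORR {name} " + " ".join(f"{r:.7f}" for r in row))
    return lines


if __name__ == "__main__":
    example = [[1.0, 0.336], [0.336, 1.0]]
    print("\n".join(correlation_input_rows(["V1", "V2"], example, 20)))
```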

This data file can then be used directly as input to a factor analysis by running the following syntax in a syntax window.

    FACTOR MATRIX IN(COR=*)
      /MISSING=LISTWISE
      /ANALYSIS=V1 V2 V3 V4
      /PRINT=CORRELATION
      /PLOT=EIGEN
      /EXTRACTION=ML
      /ROTATION=OBLIMIN .

### References

Garrido, L.E., Abad, F.J. and Ponsoda, V. (2013) A new look at Horn's parallel analysis with ordinal variables. *Psychological Methods*, **18(4)**, 454-474. Parallel analysis is a factor extraction method which retains only those factors whose eigenvalues are greater than the average of the corresponding eigenvalues obtained from simulations using independent variables (the 'by chance' scenario). Matlab code is given in the appendix.

Homer, P. and O’Brien, R.M. (1988) Using LISREL models with crude rank category measures. *Quality and Quantity*, **22**, 191-201.

Holgado–Tello, F.P., Chacón–Moscoso, S., Barbero–García, I. and Vila–Abad, E. (2010) Polychoric versus Pearson correlations in exploratory and confirmatory factor analysis of ordinal variables. *Quality and Quantity*, **44(1)**, 153-166.

Kinnear, P. and Gray, C.D. (1999) SPSS for Windows Made Simple. Third Edition. Psychology Press: Hove, East Sussex, England. Chapter 15 in the 1999 edition details how to input a correlation matrix into SPSS. This may also be in the 2009 edition.

Olsson, U. (1979) Maximum likelihood estimation of the polychoric correlation coefficient. *Psychometrika*, **44(4)**, 443-460.

Yiu, C.F. and Poon, W.Y. (2008) Estimating the polychoric correlation from misclassified data. *British Journal of Mathematical and Statistical Psychology*, **61**, 49-74.