FAQ/polychoric - CBU statistics Wiki

Polychoric correlations

Some researchers (e.g. Foldnes and Gronneberg (2021), Holgado–Tello et al. (2010), Garrido et al. (2013)) suggest using an alternative to the Pearson correlation when correlating two ordinal categorical variables. Polychoric correlations (tetrachoric in the case of two binary variables) assume that each categorical variable is a coarsened version of an underlying continuous measure which is normally distributed. If this assumption holds then it has been shown (for example in Homer and O’Brien (1988)) that polychoric correlations estimate the correlation between the underlying measures more accurately than Pearson correlations computed on the category codes.

Polychoric correlations may be estimated in SPSS using a macro (Lorenzo-Seva and Ferrando, 2015) or by using this syntax, which uses R and can be incorporated into SPSS versions 16 and above. Lorenzo-Seva and Ferrando (2015) give SPSS syntax for computing a matrix of polychoric correlations, with examples of its use, which is downloadable via the 'supplementary materials' link from here. The program may take a few minutes to compute the correlations. In addition there is an SPSS macro, r_tetra.sps, located here, that computes a single tetrachoric correlation.
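The normality assumption behind polychoric correlations can be written out as a latent-variable model. The following is a sketch along the lines of the maximum likelihood approach of Olsson (1979); the symbols (latent variables \xi and \eta, thresholds a_i and b_j, cell counts n_{ij} and cell probabilities \pi_{ij}) are introduced here for illustration and are not used elsewhere on this page.

```latex
% Observed ordinal variables X (categories 1..r) and Y (categories 1..s)
% are treated as coarsened versions of latent variables \xi and \eta.
(\xi, \eta) \sim N\!\left( \begin{pmatrix} 0 \\ 0 \end{pmatrix},
                          \begin{pmatrix} 1 & \rho \\ \rho & 1 \end{pmatrix} \right)

% Thresholds cut the latent scales into the observed categories.
X = i \iff a_{i-1} < \xi \le a_i, \qquad
Y = j \iff b_{j-1} < \eta \le b_j, \qquad
a_0 = b_0 = -\infty, \quad a_r = b_s = +\infty

% Probability of cell (i, j); \Phi_2 is the standard bivariate normal CDF.
\pi_{ij} = \Phi_2(a_i, b_j; \rho) - \Phi_2(a_{i-1}, b_j; \rho)
         - \Phi_2(a_i, b_{j-1}; \rho) + \Phi_2(a_{i-1}, b_{j-1}; \rho)

% Log-likelihood (up to an additive constant) for the observed cell counts.
\ell(\rho, a, b) = \sum_{i=1}^{r} \sum_{j=1}^{s} n_{ij} \log \pi_{ij}
```

The polychoric correlation is the value of \rho that maximises this log-likelihood; the tetrachoric correlation is the special case r = s = 2.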

SAS users can use this SAS macro to estimate a matrix of polychoric correlations. The matrix can be used in the SAS factor analysis procedure, or saved as a file of correlations and exported to SPSS, where it may be inputted in lieu of raw data (Kinnear and Gray, 1999). There is also a PC programme freely downloadable from JS Uebersax’s website here. A help guide is included showing an example of its use.

Yiu and Poon (2008) present a downloadable spreadsheet which computes polychoric correlations for pairwise 3 x 3 tables of frequencies. Polychoric correlations may also be computed using this spreadsheet and can then be copied and pasted into an SPSS spreadsheet. Tetrachoric correlations may be computed using a spreadsheet. Urbano Lorenzo-Seva has written an SPSS syntax programme called POLYMAT-C which enables polychoric correlations to be entered into an exploratory factor analysis in SPSS (Lorenzo-Seva and Ferrando, 2015). The SPSS code is available from urbano.lorenzo@urv.cat.

The polychoric correlations can then be typed into a correlation matrix, which is entered into SPSS using syntax in place of the raw data (Kinnear and Gray, 1999). An example correlation input file for four variables, V1, V2, V3 and V4, is given below. Column one identifies each row as containing either sample sizes or correlations, column two contains the variable names, and the remaining columns give the sample sizes and correlations for the four variables.

N         20.0000000      20.0000000      20.0000000      20.0000000
CORR V1    1.0000000        .3360000        .1617943       -.1944917
CORR V2     .3360000       1.0000000       -.0763708        .1959513
CORR V3     .1617943       -.0763708       1.0000000       -.0093707
CORR V4    -.1944917        .1959513       -.0093707       1.0000000
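As a sketch of how such a matrix file could be created from syntax rather than typed into the Data Editor, the MATRIX DATA command can read the example values inline. This assumes the default behaviour in which SPSS generates the variable-name column (VARNAME_) itself, so the V1 to V4 labels in column two are not typed in; check the MATRIX DATA entry in the Command Syntax Reference for your SPSS version before relying on it.

```spss
* Sketch: read the example correlation matrix as a matrix-format dataset.
* ROWTYPE_ marks each row as sample sizes (N) or correlations (CORR).
* FORMAT=FULL is used because every row of the square matrix is listed.
MATRIX DATA VARIABLES=ROWTYPE_ V1 V2 V3 V4
  /FORMAT=FULL.
BEGIN DATA
N     20         20         20         20
CORR  1.0000000   .3360000   .1617943  -.1944917
CORR   .3360000  1.0000000  -.0763708   .1959513
CORR   .1617943  -.0763708  1.0000000  -.0093707
CORR  -.1944917   .1959513  -.0093707  1.0000000
END DATA.
```

Running this leaves the matrix as the active dataset, which is what the asterisk in IN(COR=*) refers to.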

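With this data file open as the active dataset, the correlations can be inputted directly into a factor analysis by running the following syntax in a syntax window. The asterisk in COR=* tells SPSS to read the correlation matrix from the active dataset.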

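* Factor analysis using the correlation matrix held in the active dataset.
* Maximum likelihood extraction, direct oblimin rotation and a scree plot.
FACTOR MATRIX IN(COR=*) /MISSING=LISTWISE
/ANALYSIS=V1 V2 V3 V4
/PRINT=CORRELATION
/PLOT=EIGEN
/EXTRACTION=ML
/ROTATION=OBLIMIN .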

References

Foldnes, N. and Gronneberg, S. (2021) The sensitivity of structural equation modeling with ordinal data to underlying non-normality and observed distributional forms. In Press Psychological Methods.

Garrido, L.E., Abad, F.J. and Ponsoda, V. (2013) A new look at Horn's parallel analysis with ordinal variables. Psychological Methods 18(4), 454-474. Parallel analysis is a factor extraction method which extracts only those factors whose eigenvalues are greater than the average of the corresponding eigenvalues obtained from simulations using independent variables (the 'by chance' scenario). Matlab code is given in the appendix.

Homer, P. and O’Brien, R.M. (1988) Using LISREL models with crude rank category measures. Quality and Quantity, 22, 191-201.

Holgado–Tello, F.P., Chacón–Moscoso, S., Barbero–García, I. and Vila–Abad, E. (2010) Polychoric versus Pearson correlations in exploratory and confirmatory factor analysis of ordinal variables. Quality and Quantity 44(1), 153-166.

Kinnear, P. and Gray, C.D. (1999) SPSS for Windows made simple. Third Edition. Psychology Press: Hove, East Sussex, England. Chapter 15 in the 1999 edition details how to input a correlation matrix into SPSS. This may also be in the 2009 edition.

Lorenzo-Seva, U. and Ferrando, P.J. (2015) POLYMAT-C: a comprehensive SPSS program for computing the polychoric correlation matrix. Behavior Research Methods 47(3), 884-889. This features SPSS syntax to use with Exploratory Factor Analysis. Email: urbano.lorenzo@urv.cat

Olsson, U. (1979) Maximum likelihood estimation of the polychoric correlation coefficient. Psychometrika, 44(4), 443-460.

Yiu, C.F. and Poon, W.Y. (2008) Estimating the polychoric correlation from misclassified data. British Journal of Mathematical and Statistical Psychology, 61, 49-74.

None: FAQ/polychoric (last edited 2021-04-06 14:59:24 by PeterWatson)