Diff for "FAQ/rest" - CBU statistics Wiki
Differences between revisions 10 and 11
Revision 10 as of 2007-12-06 12:14:30
Size: 2194
Editor: PeterWatson
Comment:
Revision 11 as of 2007-12-06 12:17:54
Size: 2446
Editor: PeterWatson
Comment:
A note on correcting for restriction of range, which leads to underestimated Pearson correlations

A Pearson correlation computed on data in which at least one of the two variables takes only a subset of its possible values will tend to be smaller than one computed over a larger range of values. For example, in an extreme case, if we only used people with IQ scores of 100 and correlated IQ with memory score, IQ would have no variance, so its covariance with memory score would be zero and no relationship could be detected. To obtain a zero-order correlation you need two variables that both vary. Variables with absolute skewness greater than 2 have been [http://www.pbarrett.net/index1.html recommended] as likely to suffer from this. See also Kendall and Stuart (1958).
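The shrinkage described above can be illustrated with a small simulation. This is only a sketch under assumed values: the population correlation of 0.6, the IQ mean of 100 and standard deviation of 15, the 95 to 105 selection window, and the helper name `pearson` are all illustrative choices, not taken from any of the cited sources.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


random.seed(1)
true_r = 0.6   # assumed population correlation (illustrative)
n = 50_000

# Simulated IQ scores and a standardised memory score correlated with them.
iq = [random.gauss(100, 15) for _ in range(n)]
memory = [
    true_r * (v - 100) / 15 + math.sqrt(1 - true_r ** 2) * random.gauss(0, 1)
    for v in iq
]

# Correlation over the full range of IQ.
r_full = pearson(iq, memory)

# Correlation using only people with IQ between 95 and 105.
subset = [(v, m) for v, m in zip(iq, memory) if 95 <= v <= 105]
r_restricted = pearson([v for v, _ in subset], [m for _, m in subset])

print(f"full range r = {r_full:.2f}, restricted range r = {r_restricted:.2f}")
```

The correlation in the restricted subset should come out well below the full-range value, even though the underlying relationship between the two variables is identical in both.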

Chan & Chan (2004), amongst others, present formulae which adjust (upwards) a correlation based on a subset of values in one of the variables to estimate the correlation you would have obtained using a larger set (e.g. taking the correlation using IQ scores between 95 and 105 and adjusting it to estimate the correlation you would have obtained using IQ scores between 70 and 140). You need, though, to know the variance of one of the two variables for this larger range of values.

In particular, the (Pearson) correlation for the larger range, $$r_{\mbox{corrected}}$$, is obtained from r, the correlation observed in the restricted range, using VR, the variance for one of the variables in the restricted range, and V, its variance when it takes the larger range of values:

$$r_{\mbox{corrected}} = \frac{r}{\sqrt{r^{2} + \frac{(1-r^{2})\,\mbox{VR}}{\mbox{V}}}}$$

You must also assume that the relationship between the two variables is linear and that it is equally accurate (i.e. has the same scatter about the line) in the smaller and larger ranges.
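As a rough sketch, the correction formula above can be written as a short function. The function name `correct_for_range_restriction` and the example numbers (a restricted-range correlation of 0.3, a restricted variance of 25 and a full-range variance of 225) are hypothetical and chosen purely for illustration.

```python
import math


def correct_for_range_restriction(r, var_restricted, var_full):
    """Adjust a restricted-range Pearson correlation to a larger range.

    r              -- correlation observed in the restricted range
    var_restricted -- VR, variance of the variable in the restricted range
    var_full       -- V, variance of the same variable over the larger range
    """
    if not -1 <= r <= 1:
        raise ValueError("r must lie between -1 and 1")
    if var_restricted <= 0 or var_full <= 0:
        raise ValueError("variances must be positive")
    return r / math.sqrt(r ** 2 + (1 - r ** 2) * var_restricted / var_full)


# Illustrative values only: r = 0.3 observed where the variable has SD 5
# (variance 25), adjusted to a range where it has SD 15 (variance 225).
r_corrected = correct_for_range_restriction(0.3, 25.0, 225.0)
print(round(r_corrected, 3))  # about 0.686
```

When VR equals V the function returns r unchanged, and the adjustment grows as VR becomes small relative to V. The result is only as trustworthy as the linearity and equal-scatter assumptions stated above.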

This underestimation of a relationship due to restricted variation in a variable is also sometimes called attenuation.

Also, even relatively small amounts of measurement error can distort correlations if the error is large relative to the variance in either of the two variables being correlated (Bland, 2005). This is more likely if one of the variables has a restricted range.
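The interaction between measurement error and restricted range can be sketched with a simulation. All numbers here are assumptions made for illustration (a population correlation of 0.6, a measurement error standard deviation of 5 IQ points, selection of people whose true IQ lies between 95 and 105), and the helper `pearson` is a hypothetical name rather than anything from Bland (2005).

```python
import math
import random


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


random.seed(2)
true_r = 0.6    # assumed population correlation (illustrative)
error_sd = 5.0  # assumed measurement error SD in IQ points (illustrative)
n = 50_000

# Each row holds (true IQ, IQ measured with error, memory score).
rows = []
for _ in range(n):
    iq_true = random.gauss(100, 15)
    memory = (true_r * (iq_true - 100) / 15
              + math.sqrt(1 - true_r ** 2) * random.gauss(0, 1))
    iq_observed = iq_true + random.gauss(0, error_sd)
    rows.append((iq_true, iq_observed, memory))


def ratio_with_error(sample):
    """Correlation using error-prone IQ divided by that using true IQ."""
    r_clean = pearson([t for t, _, _ in sample], [m for _, _, m in sample])
    r_noisy = pearson([o for _, o, _ in sample], [m for _, _, m in sample])
    return r_noisy / r_clean


ratio_full = ratio_with_error(rows)
ratio_restricted = ratio_with_error([row for row in rows if 95 <= row[0] <= 105])

print(f"full range: {ratio_full:.2f}, restricted range: {ratio_restricted:.2f}")
```

A ratio near 1 means the measurement error has barely changed the correlation. Under these assumed values the same amount of error should leave the full-range correlation almost intact while removing a much larger share of the restricted-range correlation, because the error is then large relative to the variance of IQ in the subset.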

References

Bland M (2005) [http://www-users.york.ac.uk/%7Emb55/talks/oxtalk.htm Measuring agreement between measurements]. Talk presented at Centre for Statistics in Medicine, Oxford.

Chan W, Chan DW-L (2004) Bootstrap standard error and confidence intervals for the correlation corrected for range restriction: a simulation study. Psychological Methods 9(3) 369-385.

Kendall MG, Stuart A (1958) The Advanced Theory of Statistics. New York: Hafner.

None: FAQ/rest (last edited 2021-11-02 09:06:32 by PeterWatson)