Diff for "FAQ/Winsor" - CBU statistics Wiki
location: Diff for "FAQ/Winsor"
Differences between revisions 1 and 56 (spanning 55 versions)
Revision 1 as of 2008-09-03 11:28:50
Size: 1544
Editor: PeterWatson
Comment:
Revision 56 as of 2017-07-13 09:30:50
Size: 6474
Editor: PeterWatson
Comment:
Deletions are marked like this. Additions are marked like this.
Line 3: Line 3:
The purpose of this note is to mention three approaches to reducing the effects of outliers in a particular variable in a data set which does not involve removing outlying data points. The purpose of this note is to mention three approaches to reducing the effects of outliers in a particular variable in a data set which does not involve removing outlying data points. 
Line 5: Line 5:
A ''winsorised'' mean is the mean of a variable where a fixed percentage of its highest and lowest values are removed and replaced by a corresponding percentile ''Winsorising'' a variable removes a fixed percentage of its highest and lowest values and replaces them by a corresponding percentile
Line 7: Line 7:
e.g. A 25% winsorised mean of 1 4 7 9 10 25 replaces 1 and 25 by 2.5 and 17.5 which are the 25th and 75th percentiles respectively and averages 2.5 4 7 9 10 17.5 e.g. Winsorising a variable with values of 1 4 7 9 10 25 replaces 1 and 25 by 2.5 and 17.5 which are the 25th and 75th percentiles respectively. Although this so-called Winsorization
procedure is a robust method to estimate the mean, applying statistical analysis like a t test on this adjusted data set will not result in robust results because the estimation of the
standard error is incorrect (Wilcox, 2012). Hence, this practice is suboptimal.
Line 9: Line 11:
A ''trimmed'' mean is the mean of a variable which has had a fixed percentage of its lowest and highest values removed. Consequently a 20% trimmed mean is the mean of a variable that has had its top and its lowest 20% of values removed. Patall, Cooper and Robinson (2008) use a statistical test [[http://www.graphpad.com/quickcalcs/grubbs1/ | (Grubb's test)]] rather than percentiles to determine the outliers that are then replaced by the highest or lowest non-significant values. In particular on page 13 of their paper Patall et al. state
Line 11: Line 13:
Some authors recommend computing estimates of variability such as standard errors for trimmed means by taking bootstrap samples (Wilcox, 1998). "Grubbs's (1950) test, also called "the maximum normed residual test," was applied (see also Barnett & Lewis, 1994). This test identifies outliers in univariate distributions and does so one observation at a time. If outliers were identified, (using p < .05, two-tailed, as the significance level) these values would be set at the value of their next nearest neighbour."
A more robust version of Grubb's test (the extreme studentized deviate test (ESD) test) which tests for upto a specified number of outliers at once is described [[http://www.itl.nist.gov/div898/handbook/eda/section3/eda35h3.htm | here]] which also gives this [[FAQ/ESDtest | R code]] to implement it. Cook's Distance is also used to assess influence of single observations in regressions. A Cook's Distance over 1 being suggestive of an outlier (see Hair et al, 1998, 2005) and [[FAQ/cookdmore | here.]]
Line 13: Line 16:
There are also weighted means known as M-estimators which give different emphasis (weights) to each observation with more outlying observations giving smaller weights. Nearest neighbour approaches looking at distances between data points and their nearest neighbour can also be used with the Kolmogorov-Smirnov test comparing the distribution of distances between k nearest neighbours to a particular point with the distribution of distances between the k nearest neighbours and the outlying point.[[attachment:kneighbour.pdf | (See here for an example).]]

''Trimming'' a variable removes a fixed percentage of its lowest and highest values. Consequently a 20% trimmed mean is the mean of a variable that has had its top 20% and its lowest 20% of values removed.

Some authors recommend computing estimates of variability using trimming by taking bootstrap samples (e.g. Wilcox et al, 2000 use trimming to downweight the effect of outliers in repeated measures anova).

There are also weighting techniques collectively known as ''M-estimation'' which give different emphasis (weights) to each observation with more outlying observations being giving smaller weights.

Transforming data using power transforms such as log, square root can also help downweight outliers. The Box-Cox transformation can be used to determine an optimal power transform. [[http://www.stat.tamu.edu/ftp/pub/mspeed/stat653/spss/|SPSS code is available.]]

Sometimes outliers could be removed e.g. Reaction times less than 100 which are highly improbably fast. These may be [[FAQ/selectif | filtered out]] of the data file in SPSS.

Zimmerman (2014) found that when samples are selected to make within group variances more equal (e.g. by deleting outliers) the resulting t-tests and one-way between subject ANOVAs had inflated type I errors. Bakker and Wicherts (2014) advocate not removing outliers but using nonparametric tests for group comparisons to downweight them because these tests have been found to have nominal Type I error rates with a minimal loss of power when no outliers are present in the data and to have nominal Type I error rates and good power when outliers are present.

Langkjaer-Bain (2017) advocates not removing outliers unless there is a measurement error.
Line 17: Line 34:
A summary of the above robust estimates for dealing with outliers is in:
Andrews (1972) D.F. Andrews, P.J. Bickel, F.R. Hampel, P.J. Huber, W.H. Rogers and J.W. Tukey, Robust estimates of location survey and advances, Princeton University Press, Princeton (1972).
Line 18: Line 37:
Bakker, M., & Wicherts, J. M. (2014) Outlier Removal, Sum Scores, and the Inflation of the Type I Error Rate in Independent Samples t Tests: The Power of Alternatives and Recommendations. ''Psychological Methods'', '''19(3)''', 409–427. (A good overview of methods for handling outliers).
Line 19: Line 39:
A summary of the above robust estimates to outliers is in:
Andrews (1972) D.F. Andrews, P.J. Bickel, F.R. Hampel, P.J. Huber, W.H. Rogers and J.W. Tukey, Robust estimates of location survey and advances, Princeton University Press, Princeton (1972).
Barnett, V., & Lewis, T. (1994) Outliers in statistical data analysis (3rd ed.). New York: John Wiley & Sons.
Line 24: Line 43:
Bootstrapping SPSS macros for regressions are [[http://www.stat.tamu.edu/ftp/pub/mspeed/stat653/spss/|here.]]

Hair Jr., JF, Tatham, RL, Anderson, RE and Black, W (1998, 2005) Multivariate Data Analysis (5th edition). Prentice-Hall:Englewood Cliffs, NJ.

Keselman, HJ, Algina, J, Lix, LM, Wilcox, RR, Deering, KN (2008) [[attachment:kes.pdf|A generally robust approach for testing hypotheses and setting confidence intervals for effect sizes Psychological Methods 13(2) 110-129.]]

Keselman, HJ, Wilcox, RR, Lix, LM, Algina, J, Fradette K (2010) Adaptive robust estimation and testing. ''British Journal of Mathematical and Statistical Psychology'' '''60(2)''' 267-293.

Langkjaer-Bain R. (2017) The murky tale of Flint's deceptive water data. ''Significance'' '''14(2)''' 17-21.

Patall, EA, Cooper, H and Robinson JC (2008) Parent involvement in homework: a research synthesis. ''Review of Educational Research'' '''78(4)''' 1039-1101. (available using JSTOR login).

Wilcox, R. (2012) Modern statistics for the social and behavioral sciences:A practical introduction. Boca Raton, FL: CRC Press.

Wilcox RR, Keselman HJ, Muska J and Cribbie R (2000)
Repeated measures ANOVA: some results on comparing trimmed means and means. ''Journal of Mathematical and Statistical Psychology'' '''53''' 69-82.

Zimmerman, DW (2014) Consequences of choosing samples in hypothesis testing to ensure homogeneity of variance. ''British Journal of Mathematical and Statistical Psychology'' '''67''', 1–29

Methods of handling outliers

The purpose of this note is to mention several approaches to reducing the effects of outliers in a particular variable in a data set which do not involve removing the outlying data points.

Winsorising a variable removes a fixed percentage of its highest and lowest values and replaces them with a corresponding percentile.

e.g. Winsorising a variable with values 1 4 7 9 10 25 replaces 1 and 25 by 2.5 and 17.5, which are the 25th and 75th percentiles respectively. Although this so-called Winsorization procedure is a robust way to estimate the mean, applying a statistical analysis such as a t test to the adjusted data set will not give robust results, because the standard error is incorrectly estimated (Wilcox, 2012). Hence this practice is suboptimal.
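
As an illustrative aside (not part of the original note), winsorising can be done in Python with SciPy. Note that scipy.stats.mstats.winsorize replaces the trimmed extremes with the nearest retained values (here 4 and 10) rather than with interpolated percentiles, so its output differs slightly from the worked example above.

{{{
import numpy as np
from scipy.stats.mstats import winsorize

x = np.array([1, 4, 7, 9, 10, 25])

# Replace the lowest 25% and highest 25% of values.
xw = winsorize(x, limits=(0.25, 0.25))

print(xw)         # [4 4 7 9 10 10]
print(xw.mean())  # the winsorised mean
}}}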

Patall, Cooper and Robinson (2008) use a statistical test (Grubbs's test, http://www.graphpad.com/quickcalcs/grubbs1/) rather than percentiles to determine the outliers, which are then replaced by the highest or lowest non-significant values. In particular, on page 13 of their paper Patall et al. state:

"Grubbs's (1950) test, also called "the maximum normed residual test," was applied (see also Barnett & Lewis, 1994). This test identifies outliers in univariate distributions and does so one observation at a time. If outliers were identified, (using p < .05, two-tailed, as the significance level) these values would be set at the value of their next nearest neighbour." A more robust version of Grubb's test (the extreme studentized deviate test (ESD) test) which tests for upto a specified number of outliers at once is described here which also gives this R code to implement it. Cook's Distance is also used to assess influence of single observations in regressions. A Cook's Distance over 1 being suggestive of an outlier (see Hair et al, 1998, 2005) and here.

Nearest neighbour approaches, looking at distances between data points and their nearest neighbours, can also be used: a Kolmogorov-Smirnov test compares the distribution of distances between the k nearest neighbours of a particular point with the distribution of distances between the k nearest neighbours and the outlying point (see the attachment kneighbour.pdf for an example).
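
A rough sketch of this idea, under simplifying assumptions of our own (Euclidean distances, k = 5, simulated two-dimensional data):

{{{
import numpy as np
from scipy.stats import ks_2samp

def knn_distances(data, i, k):
    """Distances from point i to its k nearest neighbours."""
    d = np.linalg.norm(data - data[i], axis=1)
    return np.sort(d)[1:k + 1]            # drop the zero self-distance

rng = np.random.default_rng(1)
data = rng.normal(size=(50, 2))
data = np.vstack([data, [8.0, 8.0]])      # append a clearly outlying point

k = 5
suspect = len(data) - 1
others = np.concatenate([knn_distances(data, i, k)
                         for i in range(len(data)) if i != suspect])
stat, p = ks_2samp(knn_distances(data, suspect, k), others)
print(p)  # a small p suggests the suspect's neighbour distances differ
}}}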

Trimming a variable removes a fixed percentage of its lowest and highest values. Consequently, a 20% trimmed mean is the mean of a variable that has had its top 20% and its bottom 20% of values removed.
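
For example, SciPy computes this directly (proportiontocut is the fraction removed from each end of the sorted data):

{{{
import numpy as np
from scipy.stats import trim_mean

x = np.array([1, 4, 7, 9, 10, 25])
print(trim_mean(x, 0.2))  # drops 1 and 25, giving (4+7+9+10)/4 = 7.5
}}}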

Some authors recommend computing estimates of variability for trimmed estimators by taking bootstrap samples (e.g. Wilcox et al, 2000, use trimming to downweight the effect of outliers in repeated measures ANOVA).
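
A minimal sketch of this bootstrap idea in Python, estimating the standard error of a 20% trimmed mean (the 2000 resamples are an arbitrary choice):

{{{
import numpy as np
from scipy.stats import trim_mean

rng = np.random.default_rng(0)
x = np.array([1, 4, 7, 9, 10, 25])

# Trimmed mean of each bootstrap resample (sampling with replacement).
boot = [trim_mean(rng.choice(x, size=len(x), replace=True), 0.2)
        for _ in range(2000)]
print(np.std(boot, ddof=1))  # bootstrap standard error of the trimmed mean
}}}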

There are also weighting techniques, collectively known as M-estimation, which give different emphasis (weights) to each observation, with more outlying observations being given smaller weights.
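
As an illustration, a minimal sketch of one common M-estimator of location (Huber's), computed by iteratively reweighting the observations; the tuning constant c = 1.345 is a conventional choice and the MAD supplies a robust scale estimate.

{{{
import numpy as np

def huber_mean(x, c=1.345, n_iter=50):
    """Huber M-estimator of location via iterative reweighting."""
    x = np.asarray(x, dtype=float)
    s = 1.4826 * np.median(np.abs(x - np.median(x)))  # MAD scale estimate
    mu = np.median(x)                                 # robust starting value
    for _ in range(n_iter):
        u = np.abs(x - mu) / s
        w = c / np.maximum(u, c)   # weight 1 inside +/- c, shrinking outside
        mu = np.sum(w * x) / np.sum(w)
    return mu

print(huber_mean([1, 4, 7, 9, 10, 25]))  # 25 is downweighted, not removed
}}}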

Transforming data using power transforms such as the log or square root can also help downweight outliers. The Box-Cox transformation can be used to determine an optimal power transform; SPSS code is available at http://www.stat.tamu.edu/ftp/pub/mspeed/stat653/spss/.
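
For illustration, SciPy's boxcox picks the maximum-likelihood power (the data must be strictly positive):

{{{
import numpy as np
from scipy.stats import boxcox

x = np.array([1, 4, 7, 9, 10, 25], dtype=float)
xt, lmbda = boxcox(x)   # lmbda is the estimated power
print(lmbda)            # e.g. a value near 0.5 suggests a square root
print(xt)               # the transformed values
}}}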

Sometimes outliers can simply be removed, e.g. reaction times of less than 100 ms, which are improbably fast. These may be filtered out of the data file in SPSS (see FAQ/selectif).
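
In Python such a filter is a simple boolean mask (the reaction times below are made up; the 100 ms cutoff follows the text):

{{{
import numpy as np

rt = np.array([52, 230, 310, 95, 480, 260])  # reaction times in ms
rt_clean = rt[rt >= 100]                     # drop improbably fast RTs
print(rt_clean)
}}}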

Zimmerman (2014) found that when samples are selected to make within-group variances more equal (e.g. by deleting outliers), the resulting t tests and one-way between-subjects ANOVAs had inflated Type I error rates. Bakker and Wicherts (2014) advocate not removing outliers but instead using nonparametric tests for group comparisons to downweight them: these tests have been found to have nominal Type I error rates with minimal loss of power when no outliers are present in the data, and nominal Type I error rates with good power when outliers are present.
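
For illustration, one such nonparametric group comparison is the Mann-Whitney U test (the two groups below are made up):

{{{
import numpy as np
from scipy.stats import mannwhitneyu

g1 = np.array([1, 4, 7, 9, 10, 25])  # group containing an outlying value
g2 = np.array([3, 5, 6, 8, 11, 12])
stat, p = mannwhitneyu(g1, g2, alternative='two-sided')
print(stat, p)  # ranks make the test insensitive to how extreme 25 is
}}}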

Langkjaer-Bain (2017) advocates not removing outliers unless there is a measurement error.

References

A summary of the above robust estimates for dealing with outliers is in: Andrews, D.F., Bickel, P.J., Hampel, F.R., Huber, P.J., Rogers, W.H. and Tukey, J.W. (1972) Robust estimates of location: survey and advances. Princeton University Press, Princeton.

Bakker, M., & Wicherts, J. M. (2014) Outlier Removal, Sum Scores, and the Inflation of the Type I Error Rate in Independent Samples t Tests: The Power of Alternatives and Recommendations. Psychological Methods, 19(3), 409–427. (A good overview of methods for handling outliers).

Barnett, V., & Lewis, T. (1994) Outliers in statistical data analysis (3rd ed.). New York: John Wiley & Sons.

Details of bootstrapping are in: Efron, B. and Tibshirani, R.J. (1993) An introduction to the bootstrap. Chapman & Hall, London. Bootstrapping SPSS macros for regressions are at http://www.stat.tamu.edu/ftp/pub/mspeed/stat653/spss/.

Hair Jr., JF, Tatham, RL, Anderson, RE and Black, W (1998, 2005) Multivariate Data Analysis (5th edition). Prentice-Hall: Englewood Cliffs, NJ.

Keselman, HJ, Algina, J, Lix, LM, Wilcox, RR, Deering, KN (2008) A generally robust approach for testing hypotheses and setting confidence intervals for effect sizes. Psychological Methods 13(2) 110-129. (Available as the attachment kes.pdf.)

Keselman, HJ, Wilcox, RR, Lix, LM, Algina, J, Fradette K (2010) Adaptive robust estimation and testing. British Journal of Mathematical and Statistical Psychology 60(2) 267-293.

Langkjaer-Bain R. (2017) The murky tale of Flint's deceptive water data. Significance 14(2) 17-21.

Patall, EA, Cooper, H and Robinson JC (2008) Parent involvement in homework: a research synthesis. Review of Educational Research 78(4) 1039-1101. (available using JSTOR login).

Wilcox, R. (2012) Modern statistics for the social and behavioral sciences: A practical introduction. Boca Raton, FL: CRC Press.

Wilcox RR, Keselman HJ, Muska J and Cribbie R (2000) Repeated measures ANOVA: some results on comparing trimmed means and means. British Journal of Mathematical and Statistical Psychology 53 69-82.

Zimmerman, DW (2014) Consequences of choosing samples in hypothesis testing to ensure homogeneity of variance. British Journal of Mathematical and Statistical Psychology 67, 1-29.
