Diff for "FAQ/Winsor" - CBU statistics Wiki
Differences between revisions 2 and 3
Revision 2 as of 2008-09-03 11:37:49
Size: 1741
Editor: PeterWatson
Comment:
Revision 3 as of 2008-09-03 11:38:20
Size: 1739
Editor: PeterWatson
Comment:
Line 25:
- Wilcox RR, Keselman, HJ, Muska, J and Cribbie R (2000)
+ Wilcox RR, Keselman HJ, Muska J and Cribbie R (2000)

Methods of handling outliers

The purpose of this note is to mention three approaches to reducing the effects of outliers in a particular variable in a data set (Winsorising, trimming and M-estimation), none of which involves picking out and removing individual outlying data points.

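Winsorising a variable takes a fixed percentage of its lowest and highest values and replaces them by the corresponding percentile: values below the lower percentile are set to the lower percentile and values above the upper percentile are set to the upper percentile.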

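e.g. Winsorising a variable with values of 1 4 7 9 10 25 replaces 1 and 25 by 2.5 and 17.5, which are taken here as the 25th and 75th percentiles respectively (2.5 is midway between 1 and 4, and 17.5 is midway between 10 and 25; other percentile definitions give different cut-offs).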
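The example above can be sketched in Python as follows. This is only an illustration: the helper name winsorise is hypothetical, and NumPy's default percentile rule (linear interpolation) is an assumption that gives different cut-offs from the 2.5 and 17.5 quoted in this note, because percentile definitions vary between packages.

```python
import numpy as np

def winsorise(values, lower_cut, upper_cut):
    """Set values below lower_cut to lower_cut and values above upper_cut to upper_cut."""
    x = np.asarray(values, dtype=float)
    return np.clip(x, lower_cut, upper_cut)

data = [1, 4, 7, 9, 10, 25]

# Cut-offs quoted in the note for this data set.
note_result = winsorise(data, 2.5, 17.5)
print(note_result)

# Cut-offs from NumPy's default percentile rule (an assumption, not the
# note's rule); more values are pulled in because the cut-offs are tighter.
q25, q75 = np.percentile(data, [25, 75])
numpy_result = winsorise(data, q25, q75)
print(q25, q75, numpy_result)
```

Whichever percentile rule is used, the number of observations is unchanged; only the extreme values are pulled in to the cut-offs.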

Trimming a variable removes a fixed percentage of its lowest and highest values. Consequently a 20% trimmed mean is the mean of a variable that has had its highest 20% and its lowest 20% of values removed.
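A minimal sketch of a 20% trimmed mean, assuming the number of values dropped from each end is rounded down to a whole number. The function name trimmed_mean and the data are made up for illustration.

```python
import numpy as np

def trimmed_mean(values, proportion=0.2):
    """Mean after dropping the lowest and highest `proportion` of the sorted values."""
    x = np.sort(np.asarray(values, dtype=float))
    k = int(proportion * x.size)  # number of values dropped from each end (rounded down)
    return x[k:x.size - k].mean()

# Hypothetical data with one large outlier.
data = [2, 3, 4, 5, 6, 7, 8, 9, 10, 100]

ordinary = np.mean(data)       # pulled upwards by the value 100
trimmed = trimmed_mean(data)   # drops 2, 3 and 10, 100, then averages 4..9
print(ordinary, trimmed)
```

With these ten values the ordinary mean is 15.4 while the 20% trimmed mean is 6.5, the mean of the six central values.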

Some authors recommend computing estimates of variability for trimmed estimates by taking bootstrap samples (e.g. Wilcox et al, 2000 use trimming to downweight the effect of outliers in repeated measures ANOVA).
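As an illustration of the general idea only (not the specific repeated measures procedure of Wilcox et al), the sketch below resamples a single variable with replacement and recomputes the 20% trimmed mean each time. The function names, the number of bootstrap samples, the seed and the data are all assumptions made for the example.

```python
import numpy as np

def trimmed_mean(values, proportion=0.2):
    """Mean after dropping the lowest and highest `proportion` of the sorted values."""
    x = np.sort(np.asarray(values, dtype=float))
    k = int(proportion * x.size)
    return x[k:x.size - k].mean()

def bootstrap_trimmed_means(values, proportion=0.2, n_boot=2000, seed=0):
    """Trimmed means of n_boot samples drawn with replacement from the data."""
    rng = np.random.default_rng(seed)
    x = np.asarray(values, dtype=float)
    estimates = np.empty(n_boot)
    for b in range(n_boot):
        resample = rng.choice(x, size=x.size, replace=True)
        estimates[b] = trimmed_mean(resample, proportion)
    return estimates

# Hypothetical data with one large outlier.
data = [2, 3, 4, 5, 6, 7, 8, 9, 10, 100]

boot = bootstrap_trimmed_means(data)
se = boot.std(ddof=1)                                # bootstrap standard error
ci_low, ci_high = np.percentile(boot, [2.5, 97.5])   # simple percentile interval
print(se, ci_low, ci_high)
```

The spread of the bootstrap estimates serves as the estimate of variability of the trimmed mean; see Efron and Tibshirani (1993) for the choice of interval and number of resamples.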

There are also weighting techniques known as M-estimation which give different emphasis (weights) to each observation, with more outlying observations being given smaller weights.
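One common form of M-estimation can be sketched as an iteratively reweighted mean using Huber weights. The choices below are assumptions for illustration and are not taken from this note: the Huber weight function, the tuning constant c = 1.345, and a fixed scale based on the median absolute deviation. The function name and the data are hypothetical.

```python
import numpy as np

def huber_location(values, c=1.345, tol=1e-8, max_iter=100):
    """Huber M-estimate of location by iteratively reweighted averaging.

    Returns the estimate and the final weight given to each observation.
    """
    x = np.asarray(values, dtype=float)
    mu = np.median(x)
    # Robust scale: median absolute deviation, rescaled for normal data.
    scale = np.median(np.abs(x - mu)) / 0.6745
    if scale == 0:
        return mu, np.ones_like(x)

    def huber_weights(centre):
        r = np.abs(x - centre) / scale
        # Weight 1 within c scale units of the centre, c/r beyond it.
        return np.minimum(1.0, c / np.maximum(r, 1e-12))

    for _ in range(max_iter):
        w = huber_weights(mu)
        new_mu = np.sum(w * x) / np.sum(w)
        converged = abs(new_mu - mu) < tol
        mu = new_mu
        if converged:
            break
    return mu, huber_weights(mu)

# Hypothetical data with one large outlier.
data = [2, 3, 4, 5, 6, 7, 8, 9, 10, 100]

estimate, weights = huber_location(data)
print(estimate)
print(np.round(weights, 3))
```

Here the outlying value 100 stays in the data but receives a small weight, so the estimate remains close to the bulk of the observations instead of being pulled towards the ordinary mean.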

References

A summary of the above outlier-robust estimates is in: Andrews DF, Bickel PJ, Hampel FR, Huber PJ, Rogers WH and Tukey JW (1972) Robust estimates of location: survey and advances. Princeton University Press, Princeton.

Details of bootstrapping are in: Efron B and Tibshirani RJ (1993) An introduction to the bootstrap. Chapman & Hall, London.

Wilcox RR, Keselman HJ, Muska J and Cribbie R (2000) Repeated measures ANOVA: some results on comparing trimmed means and means. British Journal of Mathematical and Statistical Psychology 53 69-82.

None: FAQ/Winsor (last edited 2017-07-13 09:30:50 by PeterWatson)