FAQ/studentres - CBU statistics Wiki

Revision 1 as of 2011-03-16 15:26:06

= How do I check for outliers in a simple regression with one predictor variable? =

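A simple way to check for outliers is to compute either standardized or studentized residuals and see whether there are observations with large absolute values, e.g. greater than 2. The key reason for studentizing is that the variance of a residual depends on the predictor value: residuals at predictor values far from the mean have smaller variance, so dividing every residual by the same standard deviation does not put them all on a common scale.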

This can be done as follows:

  1. Standardize both the response variable and the predictor variable by subtracting their means and dividing by their standard deviations; call these $$y_\text{s}$$ and $$x_\text{s}$$.
  2. Evaluate a Pearson or Spearman correlation, R, between $$y_\text{s}$$ and $$x_\text{s}$$.
  3. Obtain the i-th raw residual as $$y_{\text{s}i} - R x_{\text{s}i}$$.
  4. To obtain the standardized residual, divide the raw residual by the standard deviation of the residuals. The mean raw residual should be zero.
  5. The studentized residual may also be used to identify potential outliers. This divides the raw residual by its standard error, SERES.
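
The first four steps above can be sketched in Python as follows. This is a minimal illustration rather than a definitive implementation: the function names, the example data, the choice of the Pearson correlation for R, and the use of N - 1 in the standard deviations are all assumptions made for the example.

```python
import math

def standardize(values):
    """Step 1: subtract the mean and divide by the standard deviation (N - 1 form)."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return [(v - mean) / sd for v in values]

def standardized_residuals(x, y):
    """Steps 1-4: return R, the raw residuals and the standardized residuals."""
    n = len(x)
    xs, ys = standardize(x), standardize(y)
    # Step 2: Pearson correlation (assumed here); for standardized variables
    # it is the sum of cross-products divided by N - 1.
    r = sum(a * b for a, b in zip(xs, ys)) / (n - 1)
    # Step 3: raw residuals y_si - R * x_si.
    raw = [b - r * a for a, b in zip(xs, ys)]
    # Step 4: the raw residuals have mean zero, so no centering is needed.
    sd_raw = math.sqrt(sum(e ** 2 for e in raw) / (n - 1))
    return r, raw, [e / sd_raw for e in raw]

# Hypothetical data: the last response is deliberately far below the trend.
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
y = [1.1, 2.0, 2.9, 4.2, 5.1, 5.8, 7.1, 8.0, 9.2, 3.0]
r, raw, z = standardized_residuals(x, y)
flagged = [i for i, e in enumerate(z) if abs(e) > 2]  # 0-based positions
print(flagged)
```

With this made-up data only the last observation has a standardized residual beyond +/- 2.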

SERES equals $$s \sqrt{1 - h_{ii}}$$, where s is the residual standard deviation, $$s = \sqrt{\frac{\sum_{i}(y_{\text{s}i} - R x_{\text{s}i})^{2}}{N-2}}$$ for N observations, and the leverage $$h_{ii}$$ equals $$\frac{1}{N} + \frac{x^{2}_{\text{s}i}}{\sum_{j} x^{2}_{\text{s}j}}$$.
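
As a rough sketch, SERES and the studentized residuals could be computed in Python as below. It assumes that R is the Pearson correlation and that s is the root mean square of the raw residuals with N - 2 degrees of freedom; the function name and the data are hypothetical.

```python
import math

def studentized_residuals(x, y):
    """Return (leverages, studentized residuals) for a one-predictor regression."""
    n = len(x)

    def standardize(values):
        mean = sum(values) / n
        sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
        return [(v - mean) / sd for v in values]

    xs, ys = standardize(x), standardize(y)
    # Pearson correlation (assumed), which is the slope for standardized variables.
    r = sum(a * b for a, b in zip(xs, ys)) / (n - 1)
    raw = [b - r * a for a, b in zip(xs, ys)]
    # Residual standard deviation s, with N - 2 degrees of freedom (assumed).
    s = math.sqrt(sum(e ** 2 for e in raw) / (n - 2))
    # Leverage h_ii = 1/N + x_si^2 / sum_j x_sj^2.
    sum_x2 = sum(a ** 2 for a in xs)
    leverage = [1 / n + a ** 2 / sum_x2 for a in xs]
    # Studentized residual = raw residual / SERES, with SERES = s * sqrt(1 - h_ii).
    student = [e / (s * math.sqrt(1 - h)) for e, h in zip(raw, leverage)]
    return leverage, student

# Hypothetical data: the last response is deliberately far below the trend.
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
y = [1.1, 2.0, 2.9, 4.2, 5.1, 5.8, 7.1, 8.0, 9.2, 3.0]
leverage, student = studentized_residuals(x, y)
outliers = [i for i, t in enumerate(student) if abs(t) > 2]  # 0-based positions
print(outliers)
```

The leverages are largest at the two ends of the predictor range, so the raw residuals there are divided by a smaller standard error than those near the mean. With this made-up data only the last observation is flagged.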

Studentized residuals may be evaluated using this spreadsheet (to be added).