= How do I check for outliers in a simple regression with one predictor variable? =
A simple way to check for outliers is to compute either standardized or studentized residuals and see which observations have large absolute values, e.g. greater than 2. The key reason for studentizing is that the residuals at different predictor values have different variances: residuals at predictor values far from the mean have smaller variance than those near the mean.
This can be done as follows:
- Standardize both the response variable and the predictor variable by subtracting their means and dividing by their standard deviations; call these $$y_\text{s}$$ and $$x_\text{s}$$.
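- Evaluate the Pearson or Spearman correlation, R, between the two standardized variables. With standardized variables, the Pearson correlation equals the least-squares slope and the intercept is zero.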
- Obtain the i-th raw residual as $$y_{\text{s}i} - R x_{\text{s}i}$$.
- To obtain the standardized residual, divide the raw residual by the standard deviation of the residuals. The mean raw residual should be zero.
- The studentized residual may also be used to identify potential outliers. This divides the raw residual by its standard error, SERES.
SERES equals $$s \sqrt{1 - h_{ii}}$$, where $$s = \sqrt{\frac{\sum_{i}(y_{\text{s}i} - R x_{\text{s}i})^2}{N-2}}$$ for N observations and $$h_{ii} = \frac{1}{N} + \frac{x_{\text{s}i}^2}{\sum_{j} x_{\text{s}j}^2}$$.
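The steps above can be sketched in Python with NumPy. This is a minimal illustration, not a definitive implementation: the function name `residual_diagnostics` and the example data are made up for this sketch, the Pearson correlation is used for R, and the sample standard deviation (`ddof=1`) is assumed wherever a standard deviation is needed.

```python
import numpy as np

def residual_diagnostics(x, y):
    """Return (standardized, studentized) residuals for one predictor.

    Hypothetical helper written for this illustration.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = x.size

    # Standardize predictor and response (assumption: sample SD, ddof=1).
    xs = (x - x.mean()) / x.std(ddof=1)
    ys = (y - y.mean()) / y.std(ddof=1)

    # Pearson correlation R, which is the slope for standardized variables.
    r = np.corrcoef(xs, ys)[0, 1]

    # Raw residuals; their mean should be (numerically) zero.
    raw = ys - r * xs

    # Standardized residuals: divide by the SD of the residuals
    # (assumption: ddof=1).
    standardized = raw / raw.std(ddof=1)

    # Studentized residuals: divide each residual by s * sqrt(1 - h_ii).
    s = np.sqrt(np.sum(raw ** 2) / (n - 2))
    h = 1.0 / n + xs ** 2 / np.sum(xs ** 2)
    studentized = raw / (s * np.sqrt(1.0 - h))

    return standardized, studentized

# Made-up data: roughly y = 2x, with the eighth observation shifted upwards.
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.0, 13.9, 25.0, 18.1, 19.9]

standardized, studentized = residual_diagnostics(x, y)

# Indices (0-based) of observations whose studentized residual exceeds 2
# in absolute value.
flagged = np.flatnonzero(np.abs(studentized) > 2)
print(flagged)
```

In this made-up data set only the eighth observation (index 7) exceeds the threshold of 2. Because studentizing divides by a quantity that shrinks as the predictor value moves away from its mean, observations at the extremes of the predictor range get larger studentized residuals than a single common standard deviation would give them.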
Studentised residuals may be evaluated using this spreadsheet (to be added).