FAQ/missing - CBU statistics Wiki

Revision 9 as of 2010-05-18 14:30:22

How do I handle missing data in SPSS?

Missing values are problematic in multivariate analyses because any case with incomplete information is automatically dropped, which reduces the number of cases available. One simple approach to this problem is to 'fill in' the missing values with the variable means. The syntax below illustrates how to use macros to do this in SPSS. It assumes that values are missing completely at random, so that the missing values are not likely to differ systematically from those that were recorded.

There are, however, more sophisticated approaches to handling missing values, namely the EM algorithm and mixed random effects models, which are detailed [:FAQ/emalgm: here]. These approaches have gained popularity and are now available in most statistical packages.

Below are two macros for working with missing values in SPSS. Suppose we have 50 variables, named aq1 to aq50, in consecutive columns. The first macro flags the complete cases and filters the data so that only these cases are used.

compute ind=1.
exe.

define !inmiss ( !pos !tokens(1)
                          / !pos !tokens(1)) .
!do !i=!1 !to !2.
if missing(!concat(aq,!i)) ind=ind*0.
!doend.
!enddefine.

!inmiss 1 50.
exe.
USE ALL.
COMPUTE filter_$=(ind=1).
VARIABLE LABEL filter_$ 'ind=1 (FILTER)'.
VALUE LABELS filter_$  0 'Not Selected' 1 'Selected'.
FORMAT filter_$ (f1.0).
FILTER BY filter_$.
EXECUTE .
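
The same complete-case flag can probably be obtained without a macro. The sketch below is an alternative, not part of the original syntax. It uses the NMISS function, assumes aq1 to aq50 are adjacent in the data file so that the TO keyword picks up all 50 items, and stores the flag in a variable called complete, a name chosen here purely for illustration.

* Alternative sketch: flag complete cases without a macro.
* Assumes aq1 to aq50 are adjacent in the data file.
* The variable name complete is illustrative only.
* Make sure no earlier filter is active.
FILTER OFF.
COMPUTE complete = (NMISS(aq1 TO aq50) = 0).
EXECUTE.
* Count complete and incomplete cases before filtering.
FREQUENCIES VARIABLES=complete.
FILTER BY complete.

Cases with complete equal to 1 have no missing values on any of the 50 items. FILTER BY keeps only those cases for subsequent procedures.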

The macro below replaces the missing values with the variable mean, storing the results in new variables aq1a to aq50a.

define !inmiss ( !pos !tokens(1)
                          / !pos !tokens(1)) .
!do !i=!1 !to !2.
rmv /!concat(aq,!i,a)=smean(!concat(aq,!i)).
!doend.
!enddefine.

!inmiss 1 50.
exe.
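
One way to check the result, sketched here as a suggestion rather than as part of the original syntax, is to compare an original item with its mean-filled copy. For aq1 and aq1a the means should agree, while the filled copy should have a larger N and, if any values were missing, a smaller standard deviation. The FILTER OFF line assumes that the complete-case filter from the first macro may still be active.

* Check sketch: compare an original item with its mean-filled copy.
* FILTER OFF assumes an earlier complete-case filter may still be on.
FILTER OFF.
DESCRIPTIVES VARIABLES=aq1 aq1a
  /STATISTICS=MEAN STDDEV MIN MAX.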

Because the items are dichotomous, and so can take only two values, we could consider rounding the imputed means to the nearest integer so that they take values that can actually occur. For the 50 variables aq1a to aq50a, the syntax below rounds the imputed values and places the results in variables y1 to y50.

do repeat r=aq1a to aq50a /y = y1 to y50.
compute y=rnd(r).
end repeat.
exe.