FAQ/looe - CBU statistics Wiki

The leave-one-out error rate in SPSS for logistic regression

The syntax below creates dummy data consisting of one binary outcome variable (y) and one continuous predictor (x). The macro that follows is adapted from code by Andy W, which cross-validates a linear regression model using the leave-one-out approach.

INPUT PROGRAM.
LOOP Id = 1 TO 100.
END CASE.
END LOOP.
END FILE.
END INPUT PROGRAM.
DATASET NAME Sim.
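* X is the continuous predictor and YS is a latent score cut at 4.5 to give the binary outcome Y.
COMPUTE X = RV.NORMAL(10,5).
COMPUTE YS = 3 + 0.2*(X) + RV.NORMAL(0,0.2).
RECODE YS (Lowest thru 4.5=0) (4.5 thru Highest=1) INTO Y.
* YCOPY keeps the true outcome because the macro sets Y to missing for the left-out case.
COMPUTE YCOPY=Y.
FORMATS Id (F3.0) X Y YCOPY (F4.2).
EXECUTE.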

We can save this dataset as dummy.sav in the U:/My Documents folder. The macro below then fits the logistic regression once per case, each time omitting that case, and produces the classification table of predicted group by true group for the omitted data points. Its inputs are, in order: the number of rows in the data file (the sample size), the name of the outcome variable, the name of the data file enclosed between % (percent) signs, and the names of the predictors in the logistic regression.
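
Before defining the macro, the save step can be run from syntax. This is a minimal sketch, assuming the simulated data are still the active dataset and that the U:/My Documents folder exists and is writable. The SORT CASES line is an added precaution, not part of the original page, because MATCH FILES expects the table file to be in ascending order of Id.

```spss
* Sort by the key variable that MATCH FILES uses inside the macro.
SORT CASES BY Id.
* Write the active dataset to the file that the macro reads.
SAVE OUTFILE='U:/My Documents/dummy.sav'.
```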

DEFINE !LOOE (!POS !TOKENS(1)
                        /!POS !TOKENS(1)
                        /!POS !ENCLOSE('%','%') 
                        /!POS !CMDEND).

INPUT PROGRAM.
COMPUTE #Cases = !1.
LOOP #Id = 1 TO #Cases.
  LOOP #Iter = 1 TO #Cases.
    COMPUTE L1O = #Iter.
    COMPUTE Id = #Id.
    END CASE.
  END LOOP.
END LOOP.
END FILE.
END INPUT PROGRAM.
DATASET NAME LeaveOneOut.

*Merging in original data (the data file must be sorted by Id).
MATCH FILES FILE = *
  /TABLE = !QUOTE(!3)
  /BY Id.

*To merge from an open dataset instead, use /TABLE = 'Sim' in the MATCH FILES command above.

*Keep a copy of the true outcome before it is set to missing.
COMPUTE YCOPY=!2.

*Set the outcome to missing for the left-out case in each split.
IF (L1O = Id) !2 = $SYSMIS.
SORT CASES BY L1O.
SPLIT FILE BY L1O.
*You can replace the logistic regression with whatever procedure you are interested in.
LOGISTIC REGRESSION VARIABLES !2
  /METHOD=ENTER !4
  /SAVE=PRED (pred) PGROUP (pgp) 
  /CRITERIA=PIN(.05) POUT(.10) ITERATE(20) CUT(.5).
SPLIT FILE OFF.

*The original and new leave-one-out statistics agree up to floating point differences (tested for linear regression).
*The commented-out check below comes from that linear regression version.
*COMPUTE Test = (CVFit2 - (PredAll-CVFit)).

TEMPORARY.
SELECT IF (L1O = Id).
*FREQ Y pgp.
CROSSTABS
  /TABLES=YCOPY BY pgp
  /FORMAT=AVALUE TABLES
  /CELLS=COUNT
  /COUNT ROUND CELL.
EXECUTE.
!ENDDEFINE.

We can now run the macro on the dummy data:

* ARGUMENTS ARE THE NUMBER OF ROWS OF DATA (SAMPLE SIZE), THE NAME OF THE
  OUTCOME VARIABLE, THE DATA FILE BETWEEN % SIGNS AND A LIST OF PREDICTORS.

!LOOE 100 Y %U:/My Documents/dummy.sav% X.
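
The leave-one-out error rate is the proportion of held-out cases whose predicted group differs from their true group, that is, the off-diagonal count of the classification table divided by the number of cases. A possible follow-up is sketched below. It assumes the macro has just run, so that the active dataset still contains Id, L1O, YCOPY and pgp, and that pgp uses the same codes as the outcome. The variable name Misclassified is introduced here purely for illustration.

```spss
* Hypothetical follow-up, run after !LOOE has finished.
TEMPORARY.
* Keep only the row where each case was the one left out.
SELECT IF (L1O = Id).
* Flag held-out cases whose predicted group differs from the true group.
COMPUTE Misclassified = (YCOPY NE pgp).
* The mean of the 0/1 flag is the leave-one-out error rate.
DESCRIPTIVES VARIABLES=Misclassified
  /STATISTICS=MEAN.
```

Because the selection and the flag follow TEMPORARY, they apply only to the DESCRIPTIVES procedure and the full dataset is left unchanged.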

None: FAQ/looe (last edited 2014-11-18 16:21:40 by PeterWatson)