Kappa statistic evaluation in SPSS

Cut and paste the syntax below and adjust the data input as required.

{{{
* Example data input template.
*                        Rater 2
*                 Mild  Moderate  Severe
*          Mild     5      5        0
* Rater 1  Moderate 3      6        0
*          Severe   1      1        0

set format f10.5.
data list free / r1 r2 freq.
begin data
1 1 5
2 1 3
3 1 1
1 2 5
2 2 6
3 2 1
end data.

* Syntax used for rectangular tables to compute kappa.
* (David Nichols, ASSESS Newsletter 1996)
* (recommended on P.104 of the SPSS Reference Manual, 1990).
* The program uses Cohen's kappa for agreement between a pair of raters
* for a two-way rectangular table of ratings
* (i.e. at least 2 ratings given by both raters).
* Gives kappa and the asymptotic standard error of Everitt (1996),
* P.292, Making Sense of Statistics in Psychology.

preserve.
set printback=off mprint=off.
save outfile='kap0.sav'.
define kapparec (vars=!tokens(2) /num=!tokens(1) ).
count ms__=!vars !num (missing).
select if ms__=0.
matrix.
get x /var=!vars.
get ff /var=!num.
compute c=mmax(x).
compute y=make(c,2,0).
compute w=make(c,1,0).
compute sume=make(c,1,0).
compute ans=make(1,3,0).
loop i=1 to nrow(x).
loop k=1 to c.
do if x(i,1)=k.
compute y(k,1)=y(k,1)+ff(i,1).
end if.
do if x(i,2)=k.
compute y(k,2)=y(k,2)+ff(i,1).
end if.
do if (x(i,1) eq k and x(i,2) eq k).
compute w(k,1)=w(k,1)+ff(i,1).
end if.
end loop.
end loop.
loop k=1 to c.
compute sume(k,1)= y(k,1) * y(k,2) / csum(ff).
end loop.
compute kstat= ( csum(w) - csum(sume) ) / (csum(ff) - csum(sume)).
loop k=1 to c.
compute ans(1,1)=(csum(ff)-csum(sume)) / csum(ff).
compute ans(1,1)=ans(1,1)-(y(k,1)+y(k,2))*(csum(ff)-csum(w)) / (csum(ff))**2.
compute ans(1,1)=(w(k,1) / csum(ff))*ans(1,1)*ans(1,1).
compute ans(1,2)=ans(1,2)+ans(1,1).
end loop.
loop k=1 to c.
loop j=1 to c.
loop i=1 to nrow(x).
do if (x(i,1) eq k and x(i,2) eq j and x(i,1) ne x(i,2)).
compute ans(1,3)=ans(1,3)+ff(i,1)/csum(ff)*((y(k,2)/csum(ff))+(y(j,1)/csum(ff)))**(2).
end if.
end loop.
end loop.
end loop.
compute ans(1,3)=ans(1,3)*(1-(csum(w)/csum(ff)))**2.
compute ase=(csum(w)*csum(sume))/(csum(ff)*csum(ff)).
compute ase=ase-2*(csum(sume)/csum(ff))+(csum(w)/csum(ff)).
compute ase=ase**2.
compute ase=ans(1,3)-ase.
compute ase=ans(1,2)+ase.
compute ase=sqrt(ase*(1/(csum(ff)*(1-(csum(sume)/csum(ff)))**4))).
compute z=kstat/ase.
compute sig=1-chicdf(z**2,1).
save {kstat,ase,z,sig} /outfile='ka__tmp3.sav' /variables=kstat,ase,z,sig.
end matrix.
get file='ka__tmp3.sav'.
formats all (f11.8).
variable labels kstat 'Kappa' /ase 'ASE' /z 'Z-Value' /sig 'P-Value'.
report format=list automatic align(center)
 /variables=kstat ase z sig
 /title "Estimated Kappa, Asymptotic Standard Error,"
 "and Test of Null Hypothesis of 0 Population Value".
get file='kap0.sav'.
!enddefine.
restore.

kapparec vars=r1 r2 num=freq.
}}}

Further SPSS syntax is available:
- [:FAQ/kappa/kappans:Non-square tables where one rater does not give all possible ratings]
- [:FAQ/kappa/multiple:More than 2 raters]
- [:FAQ/ad:An inter-rater measure based on Euclidean distances]
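For a quick cross-check outside SPSS, Cohen's kappa can be computed directly from a rater-by-rater frequency table. The sketch below is illustrative only (plain Python, with made-up 3×3 counts), not a replacement for the syntax linked above:

```python
# Cross-check of Cohen's kappa from a rater1-by-rater2 frequency table.
# table[i][j] = number of subjects rated category i by rater 1
# and category j by rater 2. Counts here are illustrative.

def cohens_kappa(table):
    n = sum(sum(row) for row in table)                     # total ratings
    c = len(table)                                         # number of categories
    row_tot = [sum(row) for row in table]                  # rater 1 marginals
    col_tot = [sum(table[i][j] for i in range(c)) for j in range(c)]
    p_obs = sum(table[i][i] for i in range(c)) / n         # observed agreement
    p_exp = sum(row_tot[k] * col_tot[k] for k in range(c)) / n ** 2  # chance agreement
    return (p_obs - p_exp) / (1 - p_exp)

# Mild / Moderate / Severe ratings by two raters (hypothetical counts)
table = [[5, 5, 0],
         [3, 6, 0],
         [1, 1, 0]]
print(round(cohens_kappa(table), 4))  # -> 0.1358
```

Observed agreement is the diagonal sum divided by the total; chance agreement is the sum over categories of the products of the two raters' marginal proportions.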
Note: Reliability as defined by correlation coefficients (such as Kappa) requires variation in the scores to achieve a determinate result. If you have a program which produces a determinate result when the scores of one of the coders is constant, the bug is in that program, not in SPSS. Each rater must give at least two ratings.
- [:FAQ/kappa/magnitude:Benchmarks for suggesting what makes a high kappa]
There is also a weighted kappa, which allows different weights to be attached to different misclassifications. Warrens (2011) shows that weighted kappa is an example of a more general test of randomness. This [attachment:kappa.pdf paper] by von Eye and von Eye (2005) gives a comprehensive insight into kappa and its variants. These include a variant by Brennan and Prediger (1981) (computed using either this [http://justusrandolph.net/kappa/ on-line calculator], which also computes Cohen's kappa, or this [attachment:bpkappa.xls spreadsheet]), which uses a uniform distribution as the chance baseline and so enables kappa to attain its maximum value of 1 when the number of category ratings is not fixed. Von Eye and von Eye's paper suggests, however, that this measure can give a misleadingly high value if the raters give different numbers of category ratings.
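The Brennan and Prediger variant replaces Cohen's marginal-based chance agreement with a uniform baseline of 1/c for c categories. A minimal Python sketch (the coefficient is from Brennan and Prediger, 1981; the table counts are hypothetical):

```python
# Brennan-Prediger kappa: chance agreement is taken as 1/c (uniform
# over the c categories) rather than the product of the marginals.

def brennan_prediger(table):
    n = sum(sum(row) for row in table)              # total ratings
    c = len(table)                                  # number of categories
    p_obs = sum(table[i][i] for i in range(c)) / n  # observed agreement
    p_exp = 1.0 / c                                 # uniform chance baseline
    return (p_obs - p_exp) / (1 - p_exp)

# Same hypothetical 3x3 table of counts for two raters
table = [[5, 5, 0],
         [3, 6, 0],
         [1, 1, 0]]
print(round(brennan_prediger(table), 4))  # -> 0.2857
```

With these counts the Brennan-Prediger value (about 0.29) exceeds Cohen's kappa (about 0.14) because the uniform baseline is lower than the marginal-based chance agreement here.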
References
Brennan RL & Prediger DJ (1981). Coefficient kappa: some uses, misuses, and alternatives. Educational and Psychological Measurement, 41, 687–699.
von Eye A & von Eye M (2005). Can one use Cohen's kappa to examine disagreement? Methodology, 1(4), 129–142.
Warrens MJ (2011). Chance-corrected measures for 2 × 2 tables that coincide with weighted kappa. British Journal of Mathematical and Statistical Psychology, 64(2), 355–365.