FAQ/rxx-y_in_R - CBU statistics Wiki

Upload page content

You can upload content for the page named below. If you change the page name, you can also upload content for another page. If the page name is empty, we derive the page name from the file name.

File to load page content from
Page name
Comment
Type the odd letters out: scieNce GATHeRS knowledge fAster tHAN SOCIeTY GATHErS wisdom

location: FAQ / rxx-y_in_R

R code for Jin's test of a correlation between baseline, X, and change from baseline, Y-X

(written by Leon Reteig of the Department of Psychology, University of Amsterdam, January 2019)

corr_change_baseline <- function(X,Y) {
  # Performs a (two-tailed) test for equal variances in the two input vectors. 
  # This is one way to do an unbiased assesment of the correlation between baseline and change from baseline (retest - baseline).
  # In this case, X would be the baseline, and Y the retest.
  #
  # Returns a list of 4 values, following Maloney & Rastogi (1970)
  #   -r: correlation of X-Y vs. X+Y. Assessing significance of this correlation is equivalent to a variance test
  #   -t: t-value constructed from the r-value
  #   -df: degrees of freedom for "r" and "t"
  #   -p: p-value for the variance test. If significant, this means the two vectors do not have equal variances,
  #       suggesting presence of a correlation between baseline and change from baseline.
  #
  # This implementation was inspired by and checked against an Excel spreadsheet, available at:
  # http://imaging.mrc-cbu.cam.ac.uk/statswiki/FAQ/rxxy_correction
  # This page is a great resource for further background on the issue of variance tests / correlations
  #
  # The relevant methods are contained in the following papers:
  #
  #   Jin, P. (1992). Toward a reconceptualization of the law of initial value. Psychological Bulletin 111(1) 176-184.
  #   Maloney, C. J. and Rastogi, S. C. (1970). Significance test for Grubb's estimators. Biometrics 26 671-676.
  #   Myrtek, M. and Foerster, F. (1986). The law of initial value: a rare exception. Biological Psychology 22 227-237
  #   Tu, Y-K. and Gilthorpe, M. S. (2007). Revisiting the relation between change and initial value: A review and evaluation. 
  #     Statistics in Medicine 26 443-457
  
  # This approach of testing the correlation correlation of X-Y and X+Y 
  # is equivalent to testing the variance of X vs. the variance of Y (Tu & Gilthorpe, 2007) 
  r_cor <- cor(X - Y, X + Y)
  n <- length(X) # number of observations 
  df <- n - 2 # degrees of freedom for r and t
  
  t_MR_1970 <- r_cor*sqrt(df) / sqrt(1 - r_cor^2) # t-value proposed by Maloney & Rastogi (1970); equation taken from Tu & Gilthorpe (2007)
  p_MR_1970 <- 2*pt(-abs(t_MR_1970), df) # p-value from t-value (for a two-sided test)
  return(list(r = r_cor, df = df, t = t_MR_1970, p = p_MR_1970))
  
  # Myrtek & Foerster (1986) proposed a different test. The approach is equivalent to Maloney & Rastogi (1970) and yields the same p-value.
  # It is only included here for the sake of completeness.
  
  B_e <- ( var(Y) - var(X) + sqrt((var(Y) - var(X))^2 + 4*(cov(X,Y))^2) ) / (2*cov(X,Y)) # structural relation, from Jin (1992)
  # N.B. this is a simplifaction of the equation in the paper, as "cor(X,Y) * sd(X) * sd(Y)"
  # simply reduces to "cov(X,Y)". The formula as literally in Jin (1992):
  # B_e_paper <- ( var(Y) - var(X) + sqrt((var(Y) - var(X))^2 + 4*(cor(X,Y) * sd(X) * sd(Y))^2) ) / (2*cor(X,Y) * sd(X) * sd(Y))
  # This simplification is also used in the formula below.
  
  # t-value proposed by Myrtek & Foerster, 1986; equation taken from from Jin (1992)
  t_MF_1986 <- sqrt( df * sin(2*(atan(B_e) - atan(1)))^2 *
                       ( (1/4) * (var(X) - var(Y))^2 + cov(X,Y)^2 ) / 
                       ( var(X) * var(Y) - cov(X,Y)^2) )
  p_MF_1986 <- 2*pt(-abs(t_MF_1986), df) #p-value from t-value (for a two-sided test)
  # return(list(df = df, t = t_MF_1986, p = p_MF_1986)) # the t- and p-values are identical to t_MR_1970 and p_MR_1970, so only return those
}

Example using the above R function (given in spreadsheet on preceding page)

library(foreign)

> dat2 <- data.frame(dat)
> X <- dat2[,1]
> Y <- dat2[,2]
> X
 [1]  1.05842  2.43411 -0.51272  0.96959  4.00000  5.00000  2.00000  3.00000  4.00000  5.00000  6.00000  7.00000  3.74174
[14] -1.12423  1.67558  2.57155  2.81140  4.79701 -0.31238  3.12158  2.45195  2.07938  5.03972  1.72247  1.06231 -0.99752
[27]  2.66932 -4.12602  2.81685  2.97308  3.13575  0.46938  0.46938
> Y
 [1]  3.41853  0.87283  2.76419  5.36829  2.15316  3.47944 -0.66727 -0.07276 -0.06596  1.54776  2.52074 -0.52881  5.48375
[14]  0.59687 -0.56258  2.86513  2.47741 -0.01483  4.12124  1.63788  0.88078  4.90384  3.36825  2.04285  3.68688  5.04538
[27]  1.52683 -1.26871  0.83673  3.58103  1.03052 -1.08757 -1.08757

This yields the same answer as the spreadsheet

> corr_change_baseline(X,Y)
$r
[1] 0.1296376

$df
[1] 31

$t
[1] 0.7279344

$p
[1] 0.4721166