FAQ/rxx-y_in_R - CBU statistics Wiki
location: FAQ / rxx-y_in_R

# R code for Jin's test of a correlation between baseline, X, and change from baseline, Y-X

(written by Leon Reteig of the Department of Psychology, University of Amsterdam, January 2019)

```corr_change_baseline <- function(X,Y) {
# Performs a (two-tailed) test for equal variances in the two input vectors.
# This is one way to do an unbiased assesment of the correlation between baseline and change from baseline (retest - baseline).
# In this case, X would be the baseline, and Y the retest.
#
# Returns a list of 4 values, following Maloney & Rastogi (1970)
#   -r: correlation of X-Y vs. X+Y. Assessing significance of this correlation is equivalent to a variance test
#   -t: t-value constructed from the r-value
#   -df: degrees of freedom for "r" and "t"
#   -p: p-value for the variance test. If significant, this means the two vectors do not have equal variances,
#       suggesting presence of a correlation between baseline and change from baseline.
#
# This implementation was inspired by and checked against an Excel spreadsheet, available at:
# http://imaging.mrc-cbu.cam.ac.uk/statswiki/FAQ/rxxy_correction
# This page is a great resource for further background on the issue of variance tests / correlations
#
# The relevant methods are contained in the following papers:
#
#   Jin, P. (1992). Toward a reconceptualization of the law of initial value. Psychological Bulletin 111(1) 176-184.
#   Maloney, C. J. and Rastogi, S. C. (1970). Significance test for Grubb's estimators. Biometrics 26 671-676.
#   Myrtek, M. and Foerster, F. (1986). The law of initial value: a rare exception. Biological Psychology 22 227-237
#   Tu, Y-K. and Gilthorpe, M. S. (2007). Revisiting the relation between change and initial value: A review and evaluation.
#     Statistics in Medicine 26 443-457

# This approach of testing the correlation correlation of X-Y and X+Y
# is equivalent to testing the variance of X vs. the variance of Y (Tu & Gilthorpe, 2007)
r_cor <- cor(X - Y, X + Y)
n <- length(X) # number of observations
df <- n - 2 # degrees of freedom for r and t

t_MR_1970 <- r_cor*sqrt(df) / sqrt(1 - r_cor^2) # t-value proposed by Maloney & Rastogi (1970); equation taken from Tu & Gilthorpe (2007)
p_MR_1970 <- 2*pt(-abs(t_MR_1970), df) # p-value from t-value (for a two-sided test)
return(list(r = r_cor, df = df, t = t_MR_1970, p = p_MR_1970))

# Myrtek & Foerster (1986) proposed a different test. The approach is equivalent to Maloney & Rastogi (1970) and yields the same p-value.
# It is only included here for the sake of completeness.

B_e <- ( var(Y) - var(X) + sqrt((var(Y) - var(X))^2 + 4*(cov(X,Y))^2) ) / (2*cov(X,Y)) # structural relation, from Jin (1992)
# N.B. this is a simplifaction of the equation in the paper, as "cor(X,Y) * sd(X) * sd(Y)"
# simply reduces to "cov(X,Y)". The formula as literally in Jin (1992):
# B_e_paper <- ( var(Y) - var(X) + sqrt((var(Y) - var(X))^2 + 4*(cor(X,Y) * sd(X) * sd(Y))^2) ) / (2*cor(X,Y) * sd(X) * sd(Y))
# This simplification is also used in the formula below.

# t-value proposed by Myrtek & Foerster, 1986; equation taken from from Jin (1992)
t_MF_1986 <- sqrt( df * sin(2*(atan(B_e) - atan(1)))^2 *
( (1/4) * (var(X) - var(Y))^2 + cov(X,Y)^2 ) /
( var(X) * var(Y) - cov(X,Y)^2) )
p_MF_1986 <- 2*pt(-abs(t_MF_1986), df) #p-value from t-value (for a two-sided test)
# return(list(df = df, t = t_MF_1986, p = p_MF_1986)) # the t- and p-values are identical to t_MR_1970 and p_MR_1970, so only return those
}```

Example using the above R function (given in spreadsheet on preceding page)

```library(foreign)

> dat2 <- data.frame(dat)
> X <- dat2[,1]
> Y <- dat2[,2]
> X
[1]  1.05842  2.43411 -0.51272  0.96959  4.00000  5.00000  2.00000  3.00000  4.00000  5.00000  6.00000  7.00000  3.74174
[14] -1.12423  1.67558  2.57155  2.81140  4.79701 -0.31238  3.12158  2.45195  2.07938  5.03972  1.72247  1.06231 -0.99752
[27]  2.66932 -4.12602  2.81685  2.97308  3.13575  0.46938  0.46938
> Y
[1]  3.41853  0.87283  2.76419  5.36829  2.15316  3.47944 -0.66727 -0.07276 -0.06596  1.54776  2.52074 -0.52881  5.48375
[14]  0.59687 -0.56258  2.86513  2.47741 -0.01483  4.12124  1.63788  0.88078  4.90384  3.36825  2.04285  3.68688  5.04538
[27]  1.52683 -1.26871  0.83673  3.58103  1.03052 -1.08757 -1.08757```

```> corr_change_baseline(X,Y)
\$r
[1] 0.1296376

\$df
[1] 31

\$t
[1] 0.7279344

\$p
[1] 0.4721166```

None: FAQ/rxx-y_in_R (last edited 2019-01-07 15:40:49 by PeterWatson)