FAQ/semiprinR (last edited 2014-10-15 13:47:11 by PeterWatson)

Computing the semi-partial coefficient for a predictor in a multiple regression

The semi-partial correlation for a predictor in a regression is the signed square root of the difference in R2 between a model with and without that term, provided the term of interest is not included in any higher-order interactions. For example, consider a model using prestige score, gender and income to predict suicide rates. We fit all main effects and the prestige by gender and prestige by income interactions. The semi-partial r for prestige by gender is the square root of the R2 for the full model minus the R2 for the model without the prestige by gender interaction (and similarly for the prestige by income interaction). The semi-partial r for income is the square root of the R2 for the model containing all three main effects but no interactions minus the R2 for the model containing prestige and gender only (and similarly for the other main effects). The R program below computes the signed semi-partial correlation for the terms in a multiple regression containing any number of continuous predictors and any number of two-way interactions involving the factor and each continuous predictor. Thanks to Adam Wagner for the program. Semi-partial (also called part) correlations may also be obtained in SPSS by running a linear regression and ticking 'Part and partial correlations' after clicking on the 'Statistics' button.

library(stringr)  # for str_detect()

# Note: the opening lines of the original listing were lost, so the function
# name and header below are a reconstruction, not the author's original
semiPartCorr <- function(mod, check=TRUE){
  if(any(sapply(mod$model, nlevels) > 2)){
    stop("Some factor in the model has more than 2 levels; this function won't work properly")
  }
  # Get the R^2 of a model
  r2 <- function(modR2) summary(modR2)$r.squared
  # Extract the names of the terms in the model
  termLabels <- attr(terms(mod), "term.labels")
  # Record the sign of each (non-intercept) coefficient
  termSigns <- ifelse(coef(mod)[2:length(coef(mod))] > 0, 1, -1)
  # Refit the model length(termLabels) times, where each fit excludes
  # one term (and any interactions associated with that term)
  modDroppedTerms <- lapply(termLabels,
    function(term){
      # Note: str_detect() treats term as a regular expression, so a term
      # whose name is a substring of another (e.g. "x" and "x2") will also
      # drop the longer term
      termsToDrop <- termLabels[str_detect(termLabels, term)]
      update(mod, paste(".~.", paste("-", termsToDrop, sep="", collapse="")))})
  # Find comparison models for looking at the change in proportion of variance
  # explained: where there are interactions involving the term being tested,
  # drop them from the model
  compModels <- lapply(termLabels,
    function(term){
      intsToDrop <- c(termLabels[str_detect(termLabels, paste(term, ":", sep=""))],
                      termLabels[str_detect(termLabels, paste(":", term, sep=""))])
      if(length(intsToDrop)==0) return(mod)
      else return(update(mod,
                         paste(".~.", paste("-", intsToDrop, sep="", collapse=""))))
    })
  # Find the R^2 associated with each term
  modDroppedR2 <- sapply(modDroppedTerms, r2)
  compModelR2 <- sapply(compModels, r2)
  # Calculate the (unsigned) semi-partial correlation for each term
  termsR <- sqrt(compModelR2 - modDroppedR2)
  names(termsR) <- termLabels
  # Determine if r is +ve or -ve
  # Note: I'm not sure the terms will always line up, so worth checking
  if(check){
    cat("Check the names line up\n")
    print(data.frame(termLabels, names(termSigns)))
  }
  termsR <- termsR * termSigns
  termsR
}
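As a quick sanity check of the definition above, the semi-partial r for a single main effect can be computed by hand as the signed square root of the change in R2. This is a minimal sketch using R's built-in mtcars data as a stand-in (the suicide-rate data from the example are not included here):

```r
# Full model with two continuous predictors, and a reduced model dropping wt
full    <- lm(mpg ~ wt + hp, data = mtcars)
reduced <- lm(mpg ~ hp, data = mtcars)

# Signed semi-partial r for wt: the sign of its coefficient times the
# square root of the drop in R^2 when wt is removed
sr_wt <- sign(coef(full)["wt"]) *
  sqrt(summary(full)$r.squared - summary(reduced)$r.squared)
sr_wt
```

The same number should appear in the output of the program above for the wt term of this model.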