# Computing the semi-partial coefficient for a predictor in a multiple regression

The semi-partial correlation for a predictor in a regression is the signed square root of the difference in R^{2 }s between a model with and without that term provided the term of interest is not included in any higher order interactions.

For example consider a model using prestige score, gender and income to predict suicide rates. We fit all main effects and the prestige by gender and prestige by income interactions.

The semi-partial r for prestige by gender is the square root of R^{2 } for the full model minus the R^{2 } for the model minus the prestige by gender interaction. (Similarly for the income by gender interaction). The semi-partial r for income is the square root of R^{2 } for the model containing all three main effects but no interactions minus the R^{2 } for the model containing prestige and group only. (Similarly for the other main effects).

The R program below will compute the signed semi-partial correlation for terms in a multiple regressions containing any number of continuous predictors and any number of two-way interactions involving the factor and each continuous predictor. Thanks to Adam Wagner for the program. Semi-part (also called part) correlations may also be obtained in SPSS by running a linear regression and choosing 'Part and partial correlations' after clicking on the 'Statistics' button.

############################################################################################ # Function calculates r from partial r squared # Assumes: # - stringr package loaded # - There are no factors with more than two levels... and ALL two-way interactions involving one factor are fitted # Not sure what is coded works in the presence of interactions of greater order than 2 library(stringr) calcR <- function(mod, check=T # Check that labels line up correctly){ # Stops the function if any factor has more than two levels if(any(sapply(mod$xlevels, length)>2)){ stop("Some factor in the model has more than 2 levels; this function won't work properly") } # Get the r.squared of a model r2 <- function(modR2)summary(modR2)$r.squared # Extract the names of the terms in the model termLabels <- attr(terms(mod), "term.labels") termSigns <- ifelse(coef(mod)[2:length(coef(mod))]>0, 1, -1) # Refit the model length(termLabels) times, where each fit excludes # one term (and any interactions associated with that term) modDroppedTerms <- lapply(termLabels, function(term){ termsToDrop <- termLabels[str_detect(termLabels, term)] update(mod, paste(".~.", paste("-", termsToDrop, sep="", collapse="")))}) # Find comparison models for looking at change in proportion variance explained # Essentially, Where there are interactions involving the term being tested, drop them from the model compModels <- lapply(termLabels, function(term){ intsToDrop <- c(termLabels[str_detect(termLabels, paste(term, ":", sep=""))], termLabels[str_detect(termLabels, paste(":", term, sep=""))]) if(length(intsToDrop)==0) return(mod) else return(update(mod, paste(".~.", paste("-", intsToDrop, sep="", collapse="")))) }) # Find the R^2 associated with each term modDroppedR2 <- sapply(modDroppedTerms, r2) compModelR2 <- sapply(compModels, r2) # Calculate the R^2 associated with each term termsR <- sqrt(compModelR2-modDroppedR2) names(termsR) <- termLabels # Determine if r is +ve or -ve # Note: I'm not sure the terms will always line up, so worth checking if(check){ cat("Check the names line up\n") print(data.frame(termLabels, names(termSigns))) } termsR <- termsR*termSigns termsR