FAQ - CBU statistics Wiki

Frequently Asked Questions

  1. Questions about distributions of random variables

  2. Questions involving analysis of variance and t-tests

  3. Questions about graphics and file handling

  4. Questions about correlations

  5. Questions about regression

  6. Questions about categorical data analysis

  7. Questions about p-values, significance, or power of tests

  8. Questions about nonparametric tests

  9. Questions about multivariate analyses

  10. Questions about time series

The FAQs below will be divided into categories, including those listed above. Some of the FAQs contain mathematical formulae, which can be viewed using MathPlayer; this may be downloaded for free from here. Computer code features in some of the FAQs and is highlighted in a grey text box. This code can be copied directly into a file on your own PC (e.g. into a SPSS syntax file). When you copy the code, however, you also need to copy a small piece of the text immediately outside the text box so that the line breaks are carried over; otherwise the code will be pasted as a single line.

If you wish to search for a particular word and are using Internet Explorer, press Control-F and a 'find' box will appear above the page in your browser. Please also note that macros in the EXCEL spreadsheets will not work unless they are enabled, as they are disabled by default.

  1. How many observations do I need to make z-scoring (assuming Normality) feasible?

  2. Are there any on-line statistical tables for commonly used distributions?

  3. Why are the degrees of freedom not whole numbers?

  4. What is the relationship between the z, t, chi-square and F distributions?

  5. A Normalising transformation of ratios of form (L-R)/(L+R)

  6. What is the probability of a repetition in a sequence of size k taken from K stimuli?

  7. How do I evaluate Multinomial probabilities of pooled trial frequencies of locations of peaks which can occur in one of K positions in each trial?

  8. How do I fit a Weibull distribution to scores over time using MATLAB and notes on using non-linear least squares?

  9. How do I recode levels of a group variable in SPSS?

  10. Why does SPSS give a row of asterisks instead of a mean in the outputted pivot table?

  11. What does a number containing an 'E' signify and how do I remove it?

  12. What is an intention to treat analysis?

  13. How do I evaluate a number to a general power in SPSS?

  14. What is the relationship between the Pearson zero-order correlation and a simple regression estimate?

  15. What is the relationship between predictor variable correlations and the presence of a suppressor variable?

  16. How do I check for outliers in a simple regression with one predictor variable?

  17. How do I explain adding 0.5 to the cells for the odds ratio, d' (dprime) or logit transform?

  18. Can I use an ordinary linear regression instead of logistic regression to test inference about proportions?

  19. What is heterogeneity of variance in SPSS Probit and Logit (and Poisson) Regressions?

  20. What is overdispersion in handling proportions or count data and how do I handle it?

  21. How do I perform tests of Marginal Homogeneity between two raters measuring the same items?

  22. When should I use a logit analysis as opposed to an arcsine transformed ANOVA?

  23. Why don't I see fiducial confidence intervals doing a probit analysis in version 16 of SPSS?

  24. Meta-analysis issues: 1) How do I measure publication bias? 2) How do I obtain a confidence interval for pooled estimates from sets of log odds, Cohen's d and Pearson correlations using a) meta-analysis and b) for Cohen's d, as suggested for combining two studies to assess replication of a result?

  25. How do I do a matched pairs comparison on dichotomous data using covariates in SPSS?

  26. How do I handle messy data in SPSS to produce duration times from dates and frequencies from strings?

  27. How do I compare a mean to a constant in EXCEL (one-sample t)?

  28. How do I calculate a 95% confidence interval for the group means from one-way ANOVAs (using either a within or between subjects factor)?

  29. How do I compute a t-test and F ratio using only summary measures in 2x2 between and within subjects ANOVAs?

  30. A two-way analysis of variance to assess treatment and pre-test effects (Solomon's four group design)

  31. How do I work out degrees of freedom for terms in an ANOVA?

  32. In ANOVA and Regression, what do the various different types of Sums of Squares mean, and does the choice matter?

  33. Can I use subjects as a random or fixed factor in an ANOVA?

  34. Sums of squares used by R in lm, lmer and aov

  35. A note about unequal group sizes in ANOVA

  36. A note about different sums of squares in unbalanced factorial ANOVAs

  37. A note on comparing confidence intervals and statistical significance

  38. How do I obtain a pooled standard error of the mean?

  39. How do I obtain the standard deviation from the standard error of the mean (s.e.m.) and how does this and the mean vary with sample size?

  40. What is the variance of the mean of my transformed data and the variance of combinations of means?

  41. What is the value of the error variance for raw data using that of its scalar transformed data?

  42. How do I do a t-test between two weighted means?

  43. What is the relationship between the mean and the median from a sample and how do I compute 95% CIs for the median?

  44. What is an optimal cut-off for the discretization/splitting of a continuous variable and how do we use this in a regression?

  45. A quick approximation for Normally distributed expected order statistics

  46. How do I convert a z-score into a percentile in SPSS?

  47. How do I derive an expected total for a subset of items based on the expected overall total?

  48. How do I compute a difference in errors adjusted for overall number of errors when the errors have different signs?

  49. How do I format data for input into a repeated measures analysis in SPSS?

  50. What does the TRANSPOSE ALL DATA option in the RESTRUCTURE menu do in SPSS?

  51. Using the VARSTOCASES command in SPSS to convert repeated measures formatted data to a random effects (including multilevel) model format (and the CASESTOVARS command for the reverse operation).

  52. How do I perform a repeated measures analysis of variance in SPSS?

  53. How do I perform a repeated measures analysis of variance in SPSS involving repeated measures factors with more than 99 levels?

  54. How do I perform a repeated measures analysis of variance in R including correcting for sphericity?

  55. How do I perform a repeated measures analysis of variance in GENSTAT (and MATLAB)?

  56. How do I perform a repeated measures analysis of variance in MINITAB (and use an analogous procedure in SPSS to produce post-hocs on a repeated measures factor)?

  57. How do I perform a non-standard comparison of means in a repeated measures anova in SPSS?

  58. How do I work out the meaning of a significant interaction?

  59. How do I interpret a four-way interaction?

  60. What is the effect of dropping a between by within subjects interaction on other terms in a mixed anova?

  61. How do I test for an interaction involving a continuous variable (moderation analysis)?

  62. What is the difference between a hierarchical and a stepwise regression? Also: a reference for an alternative, decision trees

  63. How do I perform a stepwise regression in SPSS involving interactions?

  64. Aspects of ANCOVA (Analysis of Covariance)

  65. Why don't I need to use a covariate which differs between randomised groups in an ANCOVA?

  66. Comparing ANCOVA to Repeated Measures ANOVA

  67. Can I do an analysis of covariance using a regression (including computation of covariate adjusted means) and use this to adjust for regression to the mean?

  68. ANCOVA versus Analysis of Residuals

  69. How do I handle errors in variables to estimate slopes and intercepts in a linear regression?

  70. Inappropriate use of a constant covariate in repeated measures ANCOVA

  71. How do I adjust for varying covariates in a repeated measures ANOVA in SPSS?

  72. How do I obtain an interaction in SPSS to describe how a fixed covariate influences a repeated measures interaction?

  73. What summary measures can I use to describe repeated measures?

  74. How do I test for a trend, or contrast, between group means in a one-way ANOVA representing different subjects and also check location of asymptotes on a curve?

  75. Interpreting the intercept term in a regression when covarying out predictors of within-subject difference scores

  76. What is the importance of including an intercept term in the SPSS (between subjects ANOVA) univariate procedure?

  77. How do I obtain covariance terms involving the intercept term in a linear regression using SPSS?

  78. Are there any primer publications explaining rationale and application of single case studies?

  79. How do I compute linear trend coefficients for single cases?

  80. How do I compare a within subjects group difference to that of a single case?

  81. How do I interpret subsets in Tukey's HSD output?

  82. How do I compare all pairwise comparisons in a between subjects anova (and in a repeated measures anova)?

  83. Why won't SPSS let me do post hoc tests involving a within-subjects factor?

  84. How do I do a simple effects analysis in SPSS?

  85. How do I test whether one independent variable has an influence on a dependent variable other than via the mediation of a second independent variable? (Sobel Test)

  86. Can I fit two-way links between variables in a path model?

  87. How do I get SPSS to do a Chi-squared analysis of a two-way frequency table?

  88. How do I compare a list of observed frequencies with a list of my own expected frequencies?

  89. Using a chi-square to see if two or more proportions are equal

  90. How do I know which elements contribute to a relationship in a two-way frequency table?

  91. When do I use Fisher's exact test instead of chi-square?

  92. How do I obtain Fisher's exact test and chi-square for a two-way table in EXCEL or a 2x2 table on the web?

  93. When do I use the correction for continuity when performing a chi-square analysis on a 2x2 table?

  94. How do I do a linear trend of proportions in a Chi-squared analysis in SPSS?

  95. How do I test for a strictly increasing or decreasing series on a set of individuals (not necessarily linear)?

  96. How do I test for the presence of an unknown ordering across subjects using 2 Dimensional data?

  97. Repeated measures, Mixed models and Split-plot designs: A Rant

  98. What is the formula for Mauchly's W used for testing sphericity in univariate repeated measures anova?

  99. Why do I get a #NAME? error appearing in cells when I try to run an EXCEL spreadsheet program?

  100. How do I split comma delimited data occurring in a single cell into separate columns in EXCEL?

  101. How do I make bibliographic citations of a SPSS or R procedure?

  102. How do I copy a SPSS window into a document?

  103. Why does SPSS give me a file definition error message when I run syntax?

  104. How do I get my SPSS output file to open in a new version of SPSS?

  105. How do I e-mail a SPSS output file?

  106. How do I e-mail a SPSS data file created on a PC so that it is readable on a MAC (and vice-versa)?

  107. How do I read R syntax code into a R session and obtain primers on other issues to get started with R?

  108. How do I download R libraries?

  109. A primer on producing plots (and pdf copies) in R.

  110. How to avoid "$ operator is invalid for atomic vectors" in R

  111. How do I read a SPSS data file into SAS?

  112. How do I convert a Microsoft file application (e.g. powerpoint) into a pdf file?

  113. How do I recode ACE-R scores (prior to summing) in EXCEL?

  114. How do I put error bars on EXCEL bar and line graphs?

  115. How do I produce interaction plots in SPSS?

  116. How do I plot boxplots in EXCEL?

  117. How do I scatterplot observations which have the same set of co-ordinates?

  118. How can I distinguish groups on a scatterplot in Excel?

  119. How do I do a multiple line plot of cluster profiles for different factors in SPSS?

  120. How do I produce a bar chart including one with empty categories and other features in SPSS?

  121. Improved Confidence Limits for a binomial proportion and differences in binomial proportions

  122. How do I compare observed numbers correct to those expected by chance in a multi-choice task?

  123. How do I produce an interactive bar chart of percentages in SPSS?

  124. How do I put one or more regression lines on a scatterplot in SPSS Version 12.0 and above and in R?

  125. How do I plot user defined regression lines (such as those from ANCOVA) in R?

  126. Why does the R-squared change when I fix the intercept in my regression line from its least squares value?

  127. How do I produce a clustered boxplot and other more advanced graphics in SPSS?

  128. How do I convert a pdf file into a MSWord or MSPowerpoint file?

  129. How do I convert a powerpoint show file so it can print handouts?

  130. How do I activate a hypertext link in powerpoint?

  131. How do I add in the analysis toolkit (and other add-ins) into EXCEL?

  132. How should I deal with outliers when doing correlations?

  133. Rank-based and other correlations (percentage bend) which are robust to non-Normal data

  134. How do I correlate change in score with score at baseline?

  135. How do I measure agreement to see if two measures are measuring the same thing (Bland and Altman plots)?

  136. Can I do a correlation between two variables where one variable is less than or equal to the other?

  137. How do I produce nonparametric Spearman partial correlations using SPSS?

  138. What is Collinearity in multiple regression, and what do I do about it?

  139. How do I find the condition number of a matrix using R?

  140. How do I convert a factor variable into a numeric in R?

  141. How do I perform a regression with a categorical outcome?

  142. How do I look for the best fitting model with unknown changepoints on ordered data?

  143. Regression diagnostics for categorical variables

  144. How can I design an experiment so that conditions are counterbalanced for order?

  145. What is the Wald statistic?

  146. How do I compute the standard error of X1 in a regression also featuring X2?

  147. What is the relationship between regressions involving variables A and B to those involving B-A and A+B in predicting an outcome?

  148. How do I test to see if a mean adjusted for a covariate equals zero in a single group?

  149. How do I compute the standard error of beta in a linear regression in SPSS?

  150. Residuals and (Non-)Normality

  151. How do I compute Akaike's and Bayesian information criteria (AIC, BIC) to compare regression models and how do I interpret them?

  152. How do I check Normality assumptions in repeated measures analyses in SPSS?

  153. What is the difference between within subjects effects and within subjects contrasts in SPSS?

  154. When should I use a Multivariate Analysis of Variance (MANOVA)?

  155. Testing normality including rules of thumb for skew, kurtosis in SPSS

  156. Methods of handling outliers.

  157. Are there any references suggesting advantages to using parametric tests as opposed to nonparametric ones?

  158. How do I compare a specific pair of groups post-hoc in SPSS using the Kruskal-Wallis test?

  159. How do I compare the distributions and magnitudes of a set of positive and negative values (including accessing results from nonparametric tests in SPSS 19 and later)?

  160. Post-hoc nonparametric pairwise comparisons of a one-way within subjects factor

  161. How do I know whether to use an exact or asymptotic p-value with a Mann-Whitney or Kruskal-Wallis test?

  162. Is there such a thing as a nonparametric regression?

  163. A note about using ranked outcomes in t-tests and ANOVAs including nonparametric interactions and Quade's test

  164. Performing randomisation tests using nonparametric methods

  165. What is the expected total discrepancy score in a R choice task?

  166. How do I adjust p-values for number of comparisons using SPSS and R?

  167. What are adjusted p-values in SPSS?

  168. Combining p-values by Fisher's and Stouffer's methods

  169. Summing z values and summing t values

  170. Why does the value of one in the F distribution have a p-value which is less than 1?

  171. Improved Confidence Limits for a binomial proportion

  172. Is there an optimal ratio of cases to predictor variables I should have before doing a multivariate analysis or any guide as to total sample size?

  173. What is the relationship between significance tests of regression coefficients and of correlation coefficients?

  174. How do I compute the semi-partial correlation coefficient in R?

  175. How do I compare two squared (semi-partial) correlation coefficients (R-squareds) from different samples?

  176. How do I compare two squared (semi-partial) correlation coefficients from the same sample?

  177. How do I adjust R-squareds for the number of predictors in a model?

  178. What are random effect and multilevel models, when do I use them and are there effect sizes?

  179. Where can I find out about using random effects (including multilevel) models in R (including obtaining proportion of variance explained by a variable) & in SPSS?

  180. What does an error message concerning the Hessian matrix suggest when running a mixed (random effects) model?

  181. What does 's' denote in describing a General Linear Model (GLM) and a note on Generalized Linear Models in SPSS?

  182. What is a Generalized Additive (Mixed) Model (GA(M)M) and when do I use it?

  183. How do I perform all possible subsets regression in SPSS?

  184. How well does a regression line fit?

  185. How do I construct dummy variables for use in SPSS linear regression?

  186. Checking for outliers in regression

  187. What is the role of a Part or Semi-Partial Correlation in a regression?

  188. How do I obtain an average correlation?

  189. How do I test if a correlation is zero and compute its confidence interval?

  190. How many degrees of freedom are associated with a test of whether a zero-order or multiple correlation equals zero?

  191. How do I estimate a pooled correlation using multiple scores from a set of subjects?

  192. How do I adjust p-values to test if more than one correlation is zero?

  193. How do I obtain a 95% confidence interval for a correlation (or slope in a simple regression) in SPSS?

  194. How do I adjust a correlation for group differences (using partial/semi-partial correlations)?

  195. How do I compare a pair of correlations?

  196. How do I work out reliability for three or more items? (Cronbach's alpha, composite reliability and Raykov's rho)

  197. What thresholds should I use for convergent validity?

  198. Can I combine pilot data with main study data?

  199. When and how do I evaluate a one-sided p-value and quote a one-sided 95% confidence interval?

  200. What sample sizes do I need for doing tests with a given power?

  201. General rules of thumb for sample sizes in pilot studies

  202. What are polychoric correlations and how do I compute them and use in SPSS?

  203. Which matrix of loadings do I use doing a principal components extraction or non-PC analysis with a direct oblimin rotation when doing a factor analysis?

  204. A guide to the pros and cons of choosing a method for producing factor scores from a factor analysis

  205. What thresholds should I use for factor loading cut-offs?

  206. How do I compare a pair of factor loadings?

  207. How do I assess the importance of variables in a Normal Discriminant analysis?

  208. What is the difference between principal components analysis, principal axis analysis and other factor extraction methods?

  209. How do I interpret variables which load on more than one factor?

  210. How many factors/components should I retain in a factor/principal components analysis?

  211. A note on Cronbach's Alpha.

  212. How do I interpret variables which load on more than one factor?

  213. What is canonical correlation and where can I use it?

  214. What is multidimensional scaling and how do I do it?

  215. How do I handle missing data in multivariate analyses in SPSS?

  216. Using SPSS syntax to impute last observation carried forward (LOCF) for missing values in SPSS

  217. How do I find complete cases in SPSS and R?

  218. How can I detect identical cases (duplicates) in SPSS without having an ID number?

  219. How do I produce truncated exponential random variables using MATLAB?

  220. How do I obtain parameter estimates for finite Normal mixture distributions using SPSS and BMDP?

  221. How do I generate a random sample in EXCEL?

  222. Matching groups and randomized allocation to groups

  223. Some thoughts on testing for randomness

  224. How do I find out how many people have a score above a certain value?

  225. How do I compute percentile thresholds for exponential data and use these in outlier detection?

  226. Additional kappa statistic evaluation in SPSS, benchmarks on size and a measure of inter-rater agreement based on Euclidean distances

  227. What is an intraclass correlation and how do I use it?

  228. How do I compute consistency across subjects using an intraclass correlation?

  229. A note on correcting for restriction of ranges which underestimate Pearson correlations

  230. How do I sum rescored scales in SPSS?

  231. How do I obtain sums and means of partially complete cases in SPSS?

  232. How do I obtain the mean of several variables, each minus the same constant?

  233. How do I compute z-scores in SPSS and what is their relationship to comparing group means?

  234. Why does my t-statistic have a negative sign?

  235. How do I use cumulative distribution functions to compute p-values in SPSS, EXCEL and R?

  236. How do I adjust for age in comparing survival times of two different groups?

  237. What is the relationship between intercept and slope for scores at two time points?

  238. Arctan functions in EXCEL.

  239. How do I obtain statistical distributions in MATLAB?

  240. How do I produce random variables which follow a negative skew distribution?

  241. Simulations sampling from data with replacement (bootstrapping)

  242. Bootstrapping without replacement (Secret Santa) in R

  243. Generating multivariate data with a required correlation matrix

  244. How do I obtain the formulae behind statistics outputted by SPSS algorithms?

  245. How do I summarise a fit for a logistic regression model?

  246. How do I compute a leaving-one-out error rate for a logistic regression in SPSS?

  247. Why are the standard errors so large in logistic regression?

  248. How do I choose between different logistic regression models?

  249. How do I interpret output from a Multinomial logistic regression?

  250. Using an Odds Ratio to summarise a 2x2 frequency table

  251. How do I test an odds ratio?

  252. How do I combine proportions?

  253. How do I perform a cluster analysis in R?

  254. Which discriminant analysis should I use to obtain thresholds to indicate levels of abnormality using a single variable?

  255. How do I plot and interpret a ROC curve in assessing strength in two group prediction?

  256. Can you recommend any two group discriminant diagnostics?

  257. Which output criteria should I use when using the casewise results option with the Normal discriminant method in SPSS?

  258. How do I compute signal detection diagnostics?

  259. Structural equation modelling in R

  260. What is Euclidean distance and how do I compute it?

  261. How do I detect multivariate outliers?

  262. Can you recommend any statistical textbooks?

  263. How do I check for multivariate normality?

  264. What do I do if I have unequal group covariance matrices when doing a MANOVA?

  265. How do I find p-values using critical values as input in SPSS?

  266. Adjusted p-values

  267. How do I do False Discovery Rate (FDR) corrections for multiple tests?

  268. How do I calculate and interpret conditional probabilities?

  269. What is a p-value?

  270. How do I produce equations using MS Word?

  271. A guide to magnitudes of effect size

  272. Problems using p-rep, the Probability of Replication

  273. A quick guide to choice of sample sizes for Cohen's effect sizes

  274. A note on comparing z and t statistics and their p-values

  275. How do I convert a t-statistic (and an Odds Ratio) into an effect size?

  276. How do I compute Cohen's d in SPSS and EXCEL, and confidence intervals for it and for eta-squared in SPSS, R or EXCEL?

  277. How do I compute Hedges' g from Cohen's d?

  278. How do I compute Mean Square Error (MSE) in EXCEL or SPSS?

  279. A guide to obtaining confidence intervals for effect sizes

  280. How do I compute effect sizes (including variance adjusted ones)?

  281. How do I do power calculations in SPSS, EXCEL, R and using web freeware?

  282. How do I do power (sample size) calculations on Poisson counts?

  283. How do I do power calculations using formulae for one sample t and sign tests?

  284. How do I work out sample size for a priori specificities and sensitivities?

  285. How do I compute statistical tests of equivalence?

  286. Computing Power for ANOVAs using the SPSS GLM procedure

  287. Computing Power for ANOVAs using the SPSS MANOVA procedure

  288. Jonckheere's and Page's L Trend Tests

  289. Fitting linear orthogonal polynomials

  290. How do I compare group means in a non-standard post-hoc contrast?

  291. How do I compare a set of group means with a control group mean?

  292. How do I manually compute t-statistics to compare means in a repeated measures ANOVA having 3 or more groups?

  293. Formulae for interaction sums of squares in balanced designs

  294. Information about Granger Causality


Return to Statistics main page

Return to CBU main page

These pages are maintained by Ian Nimmo-Smith and Peter Watson

[Last updated on 1 July 2006]

FAQ (last edited 2020-03-03 12:04:52 by PeterWatson)