# What is the relationship between the median and the mean?

Hozo et al. (2005) show that for a variable with minimum min, median med and maximum max, the mean is approximately equal to (min + 2 med + max)/4. They give formulae that use the median and range to estimate group means and group standard deviations for a variety of group sample sizes. For example, for sample sizes less than 15 the variance can be estimated as [(min - 2 med + max)^2/4 + (max - min)^2]/12, with the standard deviation being its square root (see the pdf here). An on-line calculator which uses these formulae to evaluate means and sds is available (although, for small group sizes under 15, the variance it outputs appears to be smaller than the value given by the above formula, which it claims to use).
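As an illustration, the two small-sample expressions quoted above can be sketched in Python. The function name `hozo_small_sample` and the example numbers are hypothetical and are not taken from the paper or from the on-line calculator. Hozo et al. give the second expression as an estimate of the variance, so the sketch takes its square root to obtain a standard deviation.

```python
import math


def hozo_small_sample(minimum, median, maximum):
    """Estimate mean, variance and SD from min, median and max.

    Uses only the small-sample expressions quoted in the text:
      mean     ~ (min + 2*median + max) / 4
      variance ~ [ (min - 2*median + max)^2 / 4 + (max - min)^2 ] / 12
    The function name and return layout are illustrative assumptions.
    """
    mean = (minimum + 2 * median + maximum) / 4
    variance = ((minimum - 2 * median + maximum) ** 2 / 4
                + (maximum - minimum) ** 2) / 12
    return mean, variance, math.sqrt(variance)


# Hypothetical group with min = 2, median = 5, max = 12
mean, variance, sd = hozo_small_sample(2, 5, 12)
print(mean, round(variance, 3), round(sd, 3))
```

For this made-up group the estimated mean is 6 and the estimated variance is 104/12, or about 8.67. The sketch makes no attempt to reproduce the formulae Hozo et al. suggest for larger sample sizes.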

This calculator computes Cohen's d comparing two independent groups just using means, ranges and relative group sizes.

## A confidence interval for the median

95% confidence intervals for the median may also be obtained, for example when assessing the influence of outliers, since medians are more robust to outliers than means. In particular, the 95% confidence interval for a median based upon a sample of size n is bounded by the ordered observations with ranks [n/2 - 1.96 sqrt(n/4), n/2 + 1.96 sqrt(n/4) + 1]. So if we have 64 observations, the lower bound is approximately the 24th smallest value (64/2 - 1.96 sqrt(64/4) = 24.16) and the upper bound is approximately the 41st smallest value (64/2 + 1.96 sqrt(64/4) + 1 = 40.84), rounding each rank to the nearest integer. Some R code for working out 95% confidence intervals for medians is given here.

The general formula treats the rank of a percentile as a binomial count with proportion p, centred on np with standard error sqrt[np(1-p)]. For the median, p = 1/2, so the centre is n/2 and the standard error equals sqrt[n(1/2)(1/2)] = sqrt[n/4].
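The rank calculation can be sketched in Python as follows. This is a minimal illustration rather than the R code mentioned above: the function names `quantile_ci_ranks` and `median_ci` are hypothetical, and rounding each rank to the nearest integer is an assumption (other conventions, such as truncation, can shift a bound by one position).

```python
import math


def quantile_ci_ranks(n, p=0.5, z=1.96):
    """Return the (unrounded) lower and upper ranks of an approximate
    confidence interval for the p-th quantile of n observations,
    using the binomial standard error sqrt(n * p * (1 - p))."""
    se = math.sqrt(n * p * (1 - p))
    lower = n * p - z * se
    upper = n * p + z * se + 1
    return lower, upper


def median_ci(values, z=1.96):
    """Approximate confidence interval for the median of `values`.

    Ranks are counted from the smallest observation (rank 1) and are
    rounded to the nearest integer, then clipped to 1..n (assumption).
    """
    ordered = sorted(values)
    n = len(ordered)
    lower, upper = quantile_ci_ranks(n, 0.5, z)
    lower_rank = max(1, round(lower))
    upper_rank = min(n, round(upper))
    return ordered[lower_rank - 1], ordered[upper_rank - 1]


# With 64 observations the unrounded ranks are about 24.16 and 40.84
print(quantile_ci_ranks(64))

# Hypothetical data: the integers 1 to 64, so each value equals its rank
print(median_ci(range(1, 65)))
```

Because the normal approximation to the binomial is used, the interval is only approximate, and for very small samples the computed ranks may need clipping to the range 1 to n, as the sketch does.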

## Reference

Hozo, S-P, Djulbegovic, B and Hozo, I (2005). Estimating the mean and variance from the median, range, and the size of a sample. *BMC Medical Research Methodology* **5** 13. doi:10.1186/1471-2288-5-13. See also the link here.