What is the relationship between the median and the mean?
Hozo et al. (2005) show that for a variable with minimum min, median med and maximum max, the mean is approximately equal to (min + 2 med + max)/4. They give formulae that use the median and range to estimate group means and group standard deviations for a variety of group sample sizes. For example, for sample sizes less than 15 the variance can be estimated as [(min - 2 med + max)^2/4 + (max - min)^2]/12, and the standard deviation is the square root of this quantity (see the pdf here). An on-line calculator is available at http://vassarstats.net/median_range.html.
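As a rough illustration, the two small-sample formulae can be written as a short Python sketch. The function names are made up for this example, and the sketch assumes the bracketed expression divided by 12 is a variance estimate, so it takes the square root to obtain a standard deviation.

```python
import math


def hozo_mean(minimum, median, maximum):
    """Approximate the mean from the minimum, median and maximum."""
    return (minimum + 2 * median + maximum) / 4


def hozo_sd_small_sample(minimum, median, maximum):
    """Approximate the standard deviation for a small sample.

    Assumes the bracketed expression over 12 estimates the variance,
    so the standard deviation is its square root.
    """
    variance = ((minimum - 2 * median + maximum) ** 2 / 4
                + (maximum - minimum) ** 2) / 12
    return math.sqrt(variance)


# Illustrative values only: minimum 2, median 5, maximum 12.
estimated_mean = hozo_mean(2, 5, 12)
estimated_sd = hozo_sd_small_sample(2, 5, 12)
```

With these illustrative values the estimated mean is (2 + 10 + 12)/4 = 6 and the estimated variance is (16/4 + 100)/12 = 104/12, about 8.67.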
A confidence interval for the median
95% confidence intervals for the median may also be obtained, for example to assess the influence of outliers, since medians are more robust to outliers than means. In particular, the 95% confidence interval for a median based upon a sample of size n is bounded by the observations, sorted in ascending order, with ranks [n/2 - 1.96 sqrt(n/4), n/2 + 1.96 sqrt(n/4) + 1]. So if we have 64 observations, the lower bound is approximately the 24th value (64/2 - 1.96 sqrt(64/4) = 24.16) and the upper bound is approximately the 41st value (64/2 + 1.96 sqrt(64/4) + 1 = 40.84). Some R code for working out 95% confidence intervals for medians is given here.
The general formula treats the number of observations falling below a given percentile as a binomial count with proportion p, which has standard deviation sqrt[np(1-p)]; for the median, p = 1/2, so this equals sqrt[n(1/2)(1/2)] = sqrt[n/4].
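The rank formula for the median can be sketched in Python as follows. The function names are hypothetical, and rounding the ranks to the nearest integer and clamping them to the range 1 to n are assumptions of this sketch rather than part of the formula.

```python
import math


def median_ci_ranks(n, z=1.96):
    """Return the unrounded lower and upper ranks for the median's CI."""
    half_width = z * math.sqrt(n / 4)  # z times sqrt(n * 1/2 * 1/2)
    lower = n / 2 - half_width
    upper = n / 2 + half_width + 1
    return lower, upper


def median_ci(values, z=1.96):
    """Return the observations at the (rounded) lower and upper ranks."""
    ordered = sorted(values)
    n = len(ordered)
    lower, upper = median_ci_ranks(n, z)
    # Assumption: round to the nearest rank and keep it inside 1..n.
    lower_rank = max(1, round(lower))
    upper_rank = min(n, round(upper))
    # Ranks are 1-based, list indices are 0-based.
    return ordered[lower_rank - 1], ordered[upper_rank - 1]


# Illustrative data only: the integers 1 to 64, so each value equals its rank.
ranks_64 = median_ci_ranks(64)
interval_64 = median_ci(range(1, 65))
```

For n = 64 the unrounded ranks are 32 - 7.84 = 24.16 and 32 + 7.84 + 1 = 40.84.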
Reference
Hozo, S. P., Djulbegovic, B. and Hozo, I. (2005). Estimating the mean and variance from the median, range, and the size of a sample. BMC Medical Research Methodology 5:13. doi:10.1186/1471-2288-5-13. See also the link here.