
# How do I obtain the standard deviation from the standard error of the mean (s.e.m.), and how do the s.e.m. and the mean vary with sample size?

Standard deviations and standard errors are both used for computing intervals about the mean. Standard deviations may be regarded as more exploratory, giving basic distributional information, whereas standard errors are used directly in statistical tests, such as t-tests, that test hypotheses about specific means.

$$\frac{\mbox{standard deviation}}{\sqrt{\mbox{sample size}}} = \mbox{standard error of the mean}$$

i.e.

$$\mbox{standard deviation} = \sqrt{\mbox{sample size}} \times \mbox{standard error of the mean}$$
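The conversion in both directions can be sketched in a few lines of Python. The function names and the numbers used (an s.e.m. of 0.5 from 16 observations) are hypothetical and chosen only for illustration.

```python
import math


def sd_from_sem(sem: float, n: int) -> float:
    """Standard deviation = sqrt(sample size) x standard error of the mean."""
    if n < 1:
        raise ValueError("sample size must be at least 1")
    return math.sqrt(n) * sem


def sem_from_sd(sd: float, n: int) -> float:
    """Standard error of the mean = standard deviation / sqrt(sample size)."""
    if n < 1:
        raise ValueError("sample size must be at least 1")
    return sd / math.sqrt(n)


# Hypothetical example: s.e.m. = 0.5 reported for a sample of 16 observations.
print(sd_from_sem(0.5, 16))  # 2.0
```

Note that the sample size used must be the one on which the reported s.e.m. was based.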

It follows from this (e.g. p.218 of Babbie) that the standard error of the mean decreases with sample size, N.

This follows since the variance of the mean = the variance of (1/N x the sum of N independently and randomly sampled responses from a parent population) = $$\frac{1}{N^2} \times (N\sigma^2) = \frac{\sigma^2}{N}$$ where $$\sigma^2$$ is the (unobserved true) variance of the (parent population of the) response. Since this involves a term 1/N, the variance of the mean decreases with sample size, and so does its square root, the standard error. The sample variance is used as an estimate for $$\sigma^2$$.
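The 1/N behaviour of the variance of the mean can also be checked by simulation. The following is a minimal sketch, assuming a Normal parent population with mean 0 and variance 1. The function name, the number of replications and the seed are arbitrary choices made for illustration.

```python
import random
import statistics


def variance_of_sample_means(n, sigma=1.0, replications=5000, seed=0):
    """Draw many samples of size n and return the variance of their means."""
    rng = random.Random(seed)  # fixed seed so repeated runs give the same result
    means = []
    for _ in range(replications):
        sample = [rng.gauss(0.0, sigma) for _ in range(n)]
        means.append(sum(sample) / n)
    return statistics.pvariance(means)


# Compare the simulated variance of the mean with sigma^2 / N.
for n in (4, 16, 64):
    print(n, round(variance_of_sample_means(n), 4), 1.0 / n)
```

The simulated values should lie close to, but not exactly on, 1/N, because they are themselves estimates from a finite number of replications.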

The mean (which equals the midpoint or median in a Normal distribution), on the other hand, is not proportional to sample size, so its expected value is unaffected by N; the sample mean converges to the true mean of the (assumed) underlying Normal distribution as N increases. One can see this easily by considering an example: suppose we have a sample of size 3 of a response with values 1 2 3; then the mean is 2. Suppose I take a sample of size 7 of the same response and get values of 1 1 2 2 2 3 3; then the mean is again 2, since the sample is symmetric about the (hypothesised true) mean of 2.
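The two samples in this example can be checked directly. The sketch below uses Python's standard `statistics` module; the helper name `mean_and_sem` is made up for illustration.

```python
import math
import statistics


def mean_and_sem(values):
    """Return the sample mean and the standard error of the mean."""
    n = len(values)
    mean = statistics.mean(values)
    sem = statistics.stdev(values) / math.sqrt(n)  # sample s.d. / sqrt(N)
    return mean, sem


small = [1, 2, 3]
large = [1, 1, 2, 2, 2, 3, 3]

print(mean_and_sem(small))
print(mean_and_sem(large))
```

Both samples have a mean of 2, while the standard error of the mean is smaller for the sample of size 7 than for the sample of size 3.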

Putting these together, the ratio mean / s.e.(mean) is akin to a one-sample t-test of the null hypothesis that the true mean equals zero. Provided the true mean is not zero, this ratio will increase in magnitude with increasing sample size because the mean will stay approximately the same (equal to the true mean from the underlying Normal distribution) as its s.e. decreases. This also follows from the fact that, as the sample size increases, a t random variable converges to a value having a standard Normal distribution (z-value). If any z-value is squared it becomes a value having a chi-square distribution (with one degree of freedom), and chi-square statistics increase with sample size (see e.g. Howell p.157), so the square of the mean divided by its standard error goes up with N. The main practical problem resulting from the dependency of the standard error of the mean on N is that, for large N, the t and chi-square statistics will be large and lead to rejection of the null hypothesis even for trivially small effect sizes, so a statistically significant result need not be of practical importance.
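A small numerical sketch shows how the ratio grows when the mean and standard deviation are held fixed. The values used here (an observed mean of 0.1 and a standard deviation of 1) are hypothetical and stand for a small effect; the function name is likewise made up.

```python
import math


def t_ratio(mean, sd, n):
    """Ratio of the mean to its standard error, i.e. mean / (sd / sqrt(N))."""
    return mean / (sd / math.sqrt(n))


# Same small mean and same s.d., increasing sample size.
for n in (10, 100, 1000, 10000):
    print(n, round(t_ratio(0.1, 1.0, n), 2))
```

With these assumed values the ratio rises from well below 1 at N = 10 to about 10 at N = 10000, although the size of the effect itself has not changed.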

The main point is that you already have a fixed difference (mean - 0), and so the t statistic just tells you how "big that difference is" measured in (estimated) units of variability of the observed mean. In more general terms, when the units become smaller in absolute numbers, the same absolute difference $$d_{emp}$$ (= the difference between the observed sample mean and the mean assumed under the null hypothesis) will "be worth more units" and will thus count as "more surprisingly high" (= less likely to occur) if the true mean is equal to the mean assumed under the null hypothesis (see http://stats.stackexchange.com/questions/13676/why-does-t-statistic-increase-with-the-sample-size for further discussion).
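In symbols, the same point can be sketched as follows, writing $$d_{emp}$$ for the difference between the observed sample mean and the mean assumed under the null hypothesis, s for the sample standard deviation and N for the sample size (s and N are notation assumed here for illustration):

$$t = \frac{d_{emp}}{s/\sqrt{N}} = \frac{d_{emp}\sqrt{N}}{s}$$

For a fixed $$d_{emp}$$ and s, the unit of measurement $$s/\sqrt{N}$$ shrinks as N grows, so t increases in proportion to $$\sqrt{N}$$.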

References

Babbie, E. (2008). The Basics of Social Research. Fourth Edition. Thomson Wadsworth: Belmont, CA.

Howell, D.C. (1979). Statistical Methods for Psychologists. Fourth Edition. Wadsworth: Belmont, CA.

None: FAQ/semsd (last edited 2013-08-30 09:34:03 by PeterWatson)