Mad Scientist (Statistics): Two-sample t-procedures: Comparing Means

Two-Sample Inference
We can use statistical inference to draw conclusions about a population by looking at a sample of that population. In the same way, we can use statistical inference to draw conclusions about the difference between two populations by looking at the difference between samples of those populations.

Comparing Means
If the means of our two populations are µ1 and µ2, then the difference between them is µ1 - µ2. We can construct confidence intervals and perform significance tests for µ1 - µ2 using the same t-procedures as for the mean of a single sample; however, the standard deviation and degrees of freedom used in two-sample procedures differ from those in one-sample procedures.

Conditions
Two-sample t-procedures require that our two samples are SRSs (simple random samples) drawn independently from two distinct populations. This rules out, for example, before-and-after or matched-pairs experiments, where the two sets of measurements depend on each other.

Preparation
Just as the means of the two populations are labeled µ1 and µ2, the other symbols used should also be numbered to distinguish them.

The variables in the samples are called x1 and x2.
The means of the samples are called x̄1 and x̄2.
The standard deviations of the populations are called σ1 and σ2.
The standard deviations of the samples are called s1 and s2.
And the numbers of observations in the samples are called n1 and n2.

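The standard deviation of x̄1 - x̄2 is given by the formula:

$$\sigma_{\bar{x}_1 - \bar{x}_2} = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$$

Since σ1 and σ2 are usually unknown, we estimate them with s1 and s2. Thus the standard error (SE) is:

$$SE = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$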
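As a rough illustration, the standard error sqrt(s1²/n1 + s2²/n2) can be computed from two samples with Python's standard library. This is only a sketch: the function name `standard_error` and the data values are made up for the example.

```python
import math
import statistics


def standard_error(sample1, sample2):
    """Standard error of the difference between two sample means."""
    n1, n2 = len(sample1), len(sample2)
    # statistics.variance divides by n - 1, so it returns s squared.
    var1 = statistics.variance(sample1)
    var2 = statistics.variance(sample2)
    return math.sqrt(var1 / n1 + var2 / n2)


# Hypothetical data, invented purely for illustration.
sample1 = [2, 4, 6, 8]
sample2 = [1, 2, 3, 4, 5]
se = standard_error(sample1, sample2)
print(round(se, 4))
```

For these made-up samples s1² = 20/3 and s2² = 2.5, so the SE is sqrt(13/6), roughly 1.47.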

Degrees Of Freedom
The degrees of freedom for two-sample t-procedures are different than for one-sample t-procedures. Software will calculate the degrees of freedom automatically, but if you don't have access to such software, simply use the smaller of n1 - 1 and n2 - 1. This is a conservative estimate of the true degrees of freedom: it is never larger than the value software would report, so while it is inaccurate, it errs on the side of caution by giving wider confidence intervals and larger P-values.
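The two approaches can be compared in a short Python sketch. The post does not say which formula software uses; the Welch-Satterthwaite approximation shown here is the usual choice, but treating it as "the" software method is an assumption. The function names and summary statistics are hypothetical.

```python
def conservative_df(n1, n2):
    """Hand-calculation rule: the smaller of n1 - 1 and n2 - 1."""
    return min(n1 - 1, n2 - 1)


def welch_df(s1, n1, s2, n2):
    """Welch-Satterthwaite approximation (assumed to be what software reports)."""
    a = s1 ** 2 / n1
    b = s2 ** 2 / n2
    return (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))


# Hypothetical summary statistics, invented for illustration.
s1, n1 = 2.582, 4
s2, n2 = 1.581, 5

df_conservative = conservative_df(n1, n2)
df_welch = welch_df(s1, n1, s2, n2)
print(df_conservative, round(df_welch, 2))
```

The approximation always lies between the smaller of n1 - 1 and n2 - 1 and n1 + n2 - 2, which is why the hand rule is the cautious choice.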

Confidence Interval
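The confidence interval for µ1 - µ2 is given by:

$$(\bar{x}_1 - \bar{x}_2) \pm t^* \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$

where t* is the critical value of the t distribution for the chosen confidence level, using the degrees of freedom described above.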
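A minimal Python sketch of the interval (x̄1 - x̄2) ± t* × SE follows. It assumes the critical value t* is looked up in a t table rather than computed; the function name `two_sample_ci` and the data are hypothetical.

```python
import math
import statistics


def two_sample_ci(sample1, sample2, t_star):
    """Confidence interval for mu1 - mu2, given a critical value t_star."""
    diff = statistics.mean(sample1) - statistics.mean(sample2)
    se = math.sqrt(
        statistics.variance(sample1) / len(sample1)
        + statistics.variance(sample2) / len(sample2)
    )
    margin = t_star * se
    return diff - margin, diff + margin


# Hypothetical data, invented purely for illustration.
sample1 = [2, 4, 6, 8]
sample2 = [1, 2, 3, 4, 5]

# Conservative degrees of freedom: min(4 - 1, 5 - 1) = 3.
# For 95% confidence and 3 degrees of freedom, a t table gives t* = 3.182.
low, high = two_sample_ci(sample1, sample2, t_star=3.182)
print(round(low, 2), round(high, 2))
```

With such small made-up samples the interval is wide and includes 0, so these data would not rule out equal population means.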

Significance Test
The null hypothesis for comparing two means is usually that there is no difference between them, that is:
H0: µ1 = µ2
The alternative hypothesis can be one-sided or two-sided.
The overall procedure is the same as for a single sample; however, now we're using x̄1 - x̄2 instead of x̄ and µ1 - µ2 instead of µ.

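Standardising x̄1 - x̄2 gives the two-sample t-statistic:

$$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$$

And if H0: µ1 = µ2 then µ1 - µ2 = 0 so the statistic is simply:

$$t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$$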

From this point on we simply follow the same routine as with the one-sample test (but don't forget the different degrees of freedom).
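The test under H0: µ1 = µ2 can be sketched in Python as the difference in sample means divided by the standard error. This is an illustration only: the function name `two_sample_t` and the data are made up, and the decision is made by comparing |t| with a critical value from a t table rather than by computing a P-value.

```python
import math
import statistics


def two_sample_t(sample1, sample2):
    """Two-sample t-statistic under H0: mu1 = mu2."""
    diff = statistics.mean(sample1) - statistics.mean(sample2)
    se = math.sqrt(
        statistics.variance(sample1) / len(sample1)
        + statistics.variance(sample2) / len(sample2)
    )
    return diff / se


# Hypothetical data, invented purely for illustration.
sample1 = [2, 4, 6, 8]
sample2 = [1, 2, 3, 4, 5]
t_stat = two_sample_t(sample1, sample2)

# Two-sided test at the 5% level with the conservative df = min(3, 4) = 3:
# a t table gives the critical value 3.182.
t_star = 3.182
reject_h0 = abs(t_stat) > t_star
print(round(t_stat, 3), reject_h0)
```

Here t is about 1.36, well below 3.182, so H0 is not rejected for these made-up samples.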

Robustness
Two-sample t-procedures are more robust than one-sample t-procedures. Their robustness improves as the sample sizes grow and as the two sample sizes become more similar.
As a rough guide for when it's safe to use these procedures, use the guidelines for one-sample t-procedures, but replace each "n" with "n1 + n2".

Tuesday, November 27, 2007
