Mad Scientist (Statistics)

F-test

The F-test is a type of significance test which compares the standard deviations of two samples in order to determine if the standard deviations of their parent populations are equal.

We give the standard deviations of the populations the symbols σ1 and σ2.

We give the standard deviations of the samples the symbols s1 and s2.

The hypotheses in an F-test are:
H0: σ1 = σ2
Ha: σ1 ≠ σ2

The test statistic (F) is given by:

F = (larger s)² / (smaller s)²

The F-statistic has its own distribution, with two separate parameters for its degrees of freedom. This is because each of the sample standard deviations has its own degrees of freedom value.

The first parameter is the degrees of freedom in the numerator (df1), which = (the number of observations in the sample with the larger s) – 1.
The second parameter is the degrees of freedom in the denominator (df2), which = (the number of observations in the sample with the smaller s) – 1.

To find the level at which your data is significant you can look at a table of F distribution critical values. Find the smallest significance level (α) for which your F-statistic is larger than the F critical value, then double that α (because Ha is two-sided).
This is super-inconvenient though, so if possible use software to find the P-value exactly.
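
As a rough sketch of the calculation described above, the following Python computes the F-statistic and its two degrees of freedom using only the standard library. The function name `f_statistic` and the sample data are made up for illustration; they are not from the original post.

```python
from statistics import stdev

def f_statistic(sample_a, sample_b):
    """Return (F, df_numerator, df_denominator) for two samples."""
    s_a, s_b = stdev(sample_a), stdev(sample_b)
    # Put the larger sample standard deviation on top so that F >= 1.
    if s_a >= s_b:
        larger, smaller = (s_a, len(sample_a)), (s_b, len(sample_b))
    else:
        larger, smaller = (s_b, len(sample_b)), (s_a, len(sample_a))
    f = (larger[0] / smaller[0]) ** 2
    return f, larger[1] - 1, smaller[1] - 1

# Made-up illustrative data.
sample_1 = [2.0, 4.0, 6.0, 8.0]
sample_2 = [4.0, 5.0, 6.0, 7.0, 8.0]

f, df_num, df_den = f_statistic(sample_1, sample_2)
print(f, df_num, df_den)
```

If SciPy happens to be installed, `scipy.stats.f.sf(f, df_num, df_den)` should give the upper-tail area for this F, which is then doubled for the two-sided Ha.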

Wednesday, November 28, 2007

Two-sample t-procedures: Comparing Means

Two-Sample Inference
We can use statistical inference to draw conclusions about a population by looking at a sample of that population. In that same way, we can use statistical inference to draw conclusions about the difference between populations by looking at the difference between samples of those populations.

Comparing Means
If the means of our two populations are µ1 and µ2, then the difference between them is (obviously) µ1 - µ2. We can make confidence intervals and perform significance tests for µ1 - µ2 using the same t-procedures as with the mean of a single sample. However, the standard deviation and degrees of freedom used in two-sample procedures are somewhat different from those in one-sample procedures.

Conditions
Two-sample t-procedures require that our two samples are SRSs from two distinct populations, and that the samples are independent of each other. This means, for example, that the samples can't come from before-and-after or matched pairs experiments.

Preparation
Just as the means of the two populations are labeled µ1 and µ2, the other symbols used should also be numbered to distinguish them.

The variables in the samples are called x1 and x2.
The means of the samples are called x̄1 and x̄2.
The standard deviations of the populations are called σ1 and σ2.
The standard deviations of the samples are called s1 and s2.
And the numbers of observations in the samples are called n1 and n2.

The standard deviation of x̄1 - x̄2 is given by the formula:

σ(x̄1 - x̄2) = √(σ1²/n1 + σ2²/n2)

Thus the standard error (SE), found by replacing σ1 and σ2 with s1 and s2, is:

SE = √(s1²/n1 + s2²/n2)

Degrees Of Freedom
The degrees of freedom for two-sample t-procedures are different from those for one-sample t-procedures. Software will calculate the degrees of freedom automatically, but if you don't have access to such software simply use the smaller of n1 - 1 and n2 - 1. This is a conservative estimate of the true degrees of freedom, so while it is inaccurate, it errs on the side of caution.
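
The sketch below computes the standard error and the conservative degrees of freedom from summary statistics. It also includes the Welch-Satterthwaite approximation, which is what statistical software commonly uses for the "automatic" value; the post itself does not say which formula its software uses, so treat that part as an assumption. All function names and numbers are made up for illustration.

```python
from math import sqrt

def standard_error(s1, n1, s2, n2):
    """Standard error of the difference between two sample means."""
    return sqrt(s1**2 / n1 + s2**2 / n2)

def conservative_df(n1, n2):
    """The smaller of n1 - 1 and n2 - 1."""
    return min(n1 - 1, n2 - 1)

def welch_df(s1, n1, s2, n2):
    """Welch-Satterthwaite approximation (assumed to be what software reports)."""
    v1, v2 = s1**2 / n1, s2**2 / n2
    return (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))

# Made-up summary statistics.
s1, n1 = 4.0, 16
s2, n2 = 3.0, 9

se = standard_error(s1, n1, s2, n2)
df_safe = conservative_df(n1, n2)
df_welch = welch_df(s1, n1, s2, n2)
print(se, df_safe, df_welch)
```

The approximate value always lies between the conservative choice and n1 + n2 - 2, which is why the conservative choice errs on the side of caution.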

Confidence Interval
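The confidence interval for µ1 - µ2 is given by:

(x̄1 - x̄2) ± t* × √(s1²/n1 + s2²/n2)

where t* is the t critical value for the chosen confidence level and degrees of freedom.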
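
A minimal sketch of the interval calculation, assuming you have already looked up the critical value t* in a table (the standard library has no t distribution). The function name `two_sample_ci` and the summary statistics are hypothetical; t* = 2.306 is the table value for 95% confidence with 8 degrees of freedom.

```python
from math import sqrt

def two_sample_ci(xbar1, s1, n1, xbar2, s2, n2, t_star):
    """Confidence interval for mu1 - mu2 given a critical value t_star."""
    diff = xbar1 - xbar2
    margin = t_star * sqrt(s1**2 / n1 + s2**2 / n2)
    return diff - margin, diff + margin

# Made-up summary statistics; conservative df = min(16, 9) - 1 = 8.
low, high = two_sample_ci(52.0, 4.0, 16, 48.0, 3.0, 9, t_star=2.306)
print(low, high)
```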

Significance Test
The null hypothesis for comparing two means is usually that there is no difference between them, that is:
H0: µ1 = µ2
The alternative hypothesis can be one-sided or two-sided.
The overall procedure is the same as for a single sample, but now we're using x̄1 - x̄2 instead of x̄ and µ1 - µ2 instead of µ.

Standardising x̄1 - x̄2 gives the two-sample t-statistic:

t = ((x̄1 - x̄2) - (µ1 - µ2)) / √(s1²/n1 + s2²/n2)

And if H0: µ1 = µ2 then µ1 - µ2 = 0 so the statistic is simply:

t = (x̄1 - x̄2) / √(s1²/n1 + s2²/n2)
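
To make the statistic concrete, here is a small sketch that computes it from raw data. The function name `two_sample_t`, the `null_difference` parameter and the data are invented for this example.

```python
from math import sqrt
from statistics import mean, stdev

def two_sample_t(sample_1, sample_2, null_difference=0.0):
    """Two-sample t-statistic; null_difference is mu1 - mu2 under H0."""
    n1, n2 = len(sample_1), len(sample_2)
    se = sqrt(stdev(sample_1) ** 2 / n1 + stdev(sample_2) ** 2 / n2)
    return (mean(sample_1) - mean(sample_2) - null_difference) / se

# Made-up illustrative data.
sample_1 = [12.0, 15.0, 11.0, 14.0, 13.0]
sample_2 = [10.0, 9.0, 12.0, 11.0]

t = two_sample_t(sample_1, sample_2)
# Conservative degrees of freedom: smaller of n1 - 1 and n2 - 1.
df = min(len(sample_1), len(sample_2)) - 1
print(t, df)
```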

From this point on we simply follow the same routine as with the one-sample test (but don't forget the different degrees of freedom).

Robustness
Two-sample t-procedures are more robust than one-sample t-procedures. They become more robust as the sample sizes grow and as the two sample sizes become more similar; equal sample sizes are best.
As a rough guide for when it's safe to use these procedures, use the guidelines for one-sample t-procedures, but replace each "n" with "n1 + n2".

Tuesday, November 27, 2007

t-procedures

t distributions are similar to normal distributions, but they have a greater spread, with more area in the tails and less in the centre. This difference in shape arises because substituting s for σ in the formula for the standard deviation of x̄ introduces extra variability into the t statistic.
A t(k) distribution becomes more and more similar to a normal distribution as k increases. For this reason, z can be used for very large samples even when σ is unknown.
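
One way to see this, sketched below, is to simulate one-sample t-statistics from normal data and count how often they land beyond ±1.96, the cutoff that leaves 5% in the tails of a standard normal curve. The function name `tail_fraction`, the seed and the sample sizes are arbitrary choices for illustration.

```python
import random
from math import sqrt

def tail_fraction(n, trials=20000, cutoff=1.96, seed=1):
    """Fraction of simulated t-statistics (df = n - 1) with |t| > cutoff."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        xbar = sum(sample) / n
        s = sqrt(sum((x - xbar) ** 2 for x in sample) / (n - 1))
        t = xbar / (s / sqrt(n))  # true mean is 0
        if abs(t) > cutoff:
            hits += 1
    return hits / trials

small = tail_fraction(5)   # t(4): noticeably heavier tails than normal
large = tail_fraction(50)  # t(49): much closer to the normal 5%
print(small, large)
```

The fraction for the small sample should come out well above 0.05, and the fraction for the larger sample should be much closer to it.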

Robustness of t-procedures
t-procedures are more robust for samples with more observations.
They are not robust against outliers, as outliers have the potential to greatly influence x̄.
They are somewhat robust against skewness.

The general guidelines for one-sample t-procedure robustness are:
If n < 15, use the t-procedures only if the sample's distribution is very close to normal.
If n ≥ 15, use the t-procedures if there are no outliers and the sample's distribution is not strongly skewed.
If n ≥ 40, the t-procedures can be used even for clearly skewed data, as long as there are no extremely influential outliers.

Friday, November 23, 2007

Matched Pairs Inference

One-sample t-procedures can be used for inference on matched pairs experiments.
The one-sample t-procedures are used on the differences between the subjects of the pairs. This effectively turns the matched pairs data into a single sample.
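
A short sketch of that idea: take the difference within each pair, then run the ordinary one-sample t calculation on the differences. The function name `matched_pairs_t` and the before/after data are made up for illustration.

```python
from math import sqrt
from statistics import mean, stdev

def matched_pairs_t(before, after, null_mean=0.0):
    """One-sample t-statistic and df computed on the within-pair differences."""
    if len(before) != len(after):
        raise ValueError("matched pairs need equal-length samples")
    differences = [a - b for b, a in zip(before, after)]
    n = len(differences)
    t = (mean(differences) - null_mean) / (stdev(differences) / sqrt(n))
    return t, n - 1

# Made-up illustrative data: each position is one subject (or one pair).
before = [10.0, 12.0, 9.0, 11.0]
after = [12.0, 13.0, 12.0, 13.0]

t, df = matched_pairs_t(before, after)
print(t, df)
```

Note that the degrees of freedom come from the number of pairs, not the total number of observations.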

Two-sample t-procedures cannot be used for inference on matched pairs experiments because the samples are not independent.

Errors in Significance Tests

Type 1 Error
A type 1 error is when we wrongly reject a true H0. The probability of this occurring is equal to α.

Type 2 Error
A type 2 error is when we fail to reject (wrongly accept) H0 when a specific Ha is true. The probability of this occurring is equal to 1 minus the power of the test against that Ha.

Power of a Test

The power of a significance test against a particular Ha is the probability of it rejecting H0 when that Ha is true. To find this probability we do the following: (for a one-sample t)

1.
Using the normal steps of a significance test find out which values of t will lead us to reject H0.
e.g. reject H0 when t > 3.

2.
Rearrange the one-sample test statistic to work out what values x̄ needs to take for us to reject H0.
e.g. if t > 3 then:

(x̄ - µ0) / (s/√n) > 3, so x̄ > µ0 + 3s/√n

where µ0 is the value of µ under H0.

3.
Find the probability of x̄ taking such values if Ha is true. To do this we standardise using µa.
e.g. continuing from above:

P(x̄ > µ0 + 3s/√n when µ = µa) = P(Z > (µ0 + 3s/√n - µa) / (s/√n))

Then we are simply looking at areas under a normal curve, which we can calculate easily.
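
The three steps above can be sketched in Python for an upper-tailed test. This is only an approximation under the stated assumptions: it treats s as if it were the known population standard deviation and uses a normal curve for x̄. The function name `power_upper_tail` and all the numbers are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def power_upper_tail(mu_0, mu_a, s, n, t_crit):
    """Approximate power of a test that rejects H0 when t > t_crit."""
    se = s / sqrt(n)
    # Step 2: the smallest x-bar that leads to rejecting H0.
    xbar_cutoff = mu_0 + t_crit * se
    # Step 3: standardise that cutoff using mu_a instead of mu_0.
    z = (xbar_cutoff - mu_a) / se
    # Area under the standard normal curve to the right of z.
    return 1.0 - NormalDist().cdf(z)

# Made-up example: H0 says mu = 0, the particular Ha says mu = 2.
power = power_upper_tail(mu_0=0.0, mu_a=2.0, s=2.0, n=16, t_crit=3.0)
type_2_error = 1.0 - power
print(power, type_2_error)  # power is about 0.841
```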

Cautions For Confidence Intervals and Significance Tests

-When using data from a sample to perform a significance test or create a confidence interval, the data must be from an SRS, or from a sample chosen close enough to randomly that we can regard it as an SRS.

-x̄ is strongly influenced by outliers. The sample should be examined beforehand to determine whether any outliers are present and, if so, whether they can be corrected or removed.

-Small samples must have roughly normal distributions for the sampling distribution of x̄ to be normal, so confidence intervals and significance tests won't be accurate for small non-normal samples.

-Multiple analyses greatly increase the chance of an error. For example, if we create twenty 90% confidence intervals for a parameter from 20 independent samples, the probability is quite high (about 88%) that at least one of our confidence intervals will fail to capture that parameter.

-The margin of error in confidence intervals and the level of significance of significance tests do not take into account any errors in the data itself. (e.g. bias, undercoverage, nonresponse etc.)
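
For the caution about multiple analyses, the arithmetic is short enough to check directly. This sketch assumes the samples are independent, which the post does not state explicitly; the function name `prob_at_least_one_miss` is made up.

```python
def prob_at_least_one_miss(confidence_level, intervals):
    """Chance that at least one of several independent intervals misses."""
    # Every interval captures the parameter with probability confidence_level,
    # so all of them do with probability confidence_level ** intervals.
    return 1.0 - confidence_level ** intervals

p = prob_at_least_one_miss(0.90, 20)
print(p)  # about 0.878
```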