Contributed by Gary Hale, UoP, 2005
Chapter 12: Analysis of Variance (ANOVA)
The probability distribution used in this chapter is the F distribution.
It serves as the distribution of the test statistic in several
situations. It is used to test whether two samples come from
populations having equal variances, and it is also applied when several
population means must be compared simultaneously. The simultaneous
comparison of several population means is called analysis of variance
(ANOVA). In both of these situations, the populations must follow a
normal distribution, and the data must be at least interval-scale
(Lind, D.M., Marchal, W.G. & Wathen, S.A., 2004).
There are five characteristics of the F distribution:
-
There is a “family” of F distributions. A particular member of the
family is determined by two parameters: the degrees of freedom in the
numerator and the degrees of freedom in the denominator. The shape of
the distribution is illustrated by the graph below. There is one F
distribution for the combination of 29 degrees of freedom in the
numerator and 28 degrees of freedom in the denominator. There is
another F distribution for 19 degrees of freedom in the numerator and 6
degrees of freedom in the denominator. The shape of the curve changes
as the degrees of freedom change (Lind et al., 2004).

-
The F distribution
is continuous, which means that it can assume an infinite number of
values between zero and positive infinity (Lind et al., 2004).
-
The F distribution cannot be negative. The smallest value F can assume
is 0 (Lind et al., 2004).
-
It is positively skewed. The long tail of the distribution is on the
right-hand side. As the number of degrees of freedom increases in both
the numerator and the denominator, the distribution approaches a normal
distribution (Lind et al., 2004).
-
It is asymptotic. As the value of X increases, the F curve approaches
the X axis but never touches it. This is similar to the behavior of the
normal distribution (Lind et al., 2004).
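The characteristics listed above can be checked numerically. The
following is a minimal sketch, not taken from Lind et al.: it evaluates
the standard F density formula using only Python's math module. The
function name f_pdf and the evaluation points are illustrative
assumptions; the two pairs of degrees of freedom are the ones mentioned
in the first characteristic.

```python
import math

def f_pdf(x, df_num, df_den):
    """Density of the F distribution at x (hypothetical helper name)."""
    if x <= 0:
        # F cannot be negative. Returning 0 at exactly x = 0 is a
        # simplification made for this sketch.
        return 0.0
    a, b = df_num / 2.0, df_den / 2.0
    # log of the beta function B(a, b), computed via log-gamma for stability
    log_beta = math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)
    log_pdf = (a * math.log(df_num * x) + b * math.log(df_den)
               - (a + b) * math.log(df_num * x + df_den)
               - math.log(x) - log_beta)
    return math.exp(log_pdf)

# Two members of the family: each pair of degrees of freedom gives a
# different curve, and both curves fall toward zero in the right tail.
for df_num, df_den in [(29, 28), (19, 6)]:
    heights = [round(f_pdf(x, df_num, df_den), 4) for x in (0.5, 1.0, 2.0, 4.0)]
    print(df_num, df_den, heights)
```

Printing the curve heights at a few values of X shows a different
shape for each pair of degrees of freedom, with the height shrinking
toward zero, but staying positive, as X grows.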
The F distribution is used to test the hypothesis that the variance of
one normal population equals the variance of another normal population.
It is also used to validate assumptions for some statistical tests:
some tests assume that the variances of two normal populations are the
same, and the F distribution provides a means for conducting a test
regarding the variances of two normal populations (Lind et al., 2004).
Regardless of whether
we want to determine whether one population has more variation than
another population or validate an assumption for a statistical test, we
first state the null hypothesis. The null hypothesis is that the
variance of one normal population equals the variance of the other
normal population. The alternate hypothesis could be that the variances
differ. The null hypothesis and the alternate hypothesis for a two-sided
test are:

$$ H_0: \sigma_1^2 = \sigma_2^2 \qquad H_1: \sigma_1^2 \neq \sigma_2^2 $$

To conduct the test, we select a random sample of $n_1$ observations
from one population, and a sample of $n_2$ observations from the second
population. The test statistic is defined as shown:

$$ F = \frac{s_1^2}{s_2^2} $$

The terms $s_1^2$ and $s_2^2$ are the respective sample variances. If
the null hypothesis is true, the test statistic follows the F
distribution with $n_1 - 1$ and $n_2 - 1$ degrees of freedom. In order
to reduce the size of the table of critical values, the larger sample
variance is placed in the numerator; therefore, the tabled F ratio is
always at least 1.00. The right-tail critical value is the only one
required. The critical value of F for a two-tailed test is found by
dividing the significance level in half ($\alpha/2$) and then referring
to the appropriate degrees of freedom (Lind et al., 2004).
The usual practice is to determine the F ratio by putting the larger of
the two sample variances in the numerator. This forces the F ratio to
be at least 1.00, which allows us to always use the right tail of the
F distribution and avoids the need for more extensive F tables (Lind et
al., 2004).
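The computation described above can be sketched in a few lines of
Python. This is an illustration under stated assumptions, not code from
Lind et al.: the function name f_ratio and both samples are
hypothetical, and the sample variances come from the standard library's
statistics.variance, which divides by n - 1.

```python
from statistics import variance

def f_ratio(sample_1, sample_2):
    """Return (F, numerator df, denominator df) for two samples.

    The larger sample variance goes in the numerator, so F is at
    least 1.00 and only the right tail of the F table is needed.
    Assumes both sample variances are nonzero.
    """
    var_1, var_2 = variance(sample_1), variance(sample_2)
    if var_1 >= var_2:
        return var_1 / var_2, len(sample_1) - 1, len(sample_2) - 1
    # Swap the roles so the larger variance is on top; the degrees of
    # freedom follow their own sample.
    return var_2 / var_1, len(sample_2) - 1, len(sample_1) - 1

# Hypothetical data, made up purely for illustration.
sample_1 = [1.0, 2.0, 3.0, 4.0, 5.0]
sample_2 = [2.0, 4.0, 6.0, 8.0, 10.0, 12.0]

f_value, df_num, df_den = f_ratio(sample_1, sample_2)
print(f_value, df_num, df_den)
```

For a two-tailed test, the resulting F value would then be compared
with the tabled critical value at half the significance level for the
returned numerator and denominator degrees of freedom.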