There are many applications where it is of interest to compare two independent groups with respect to their mean scores on a continuous outcome. Here we compare means between groups, but rather than generating an estimate of the difference, we will test whether the observed difference (increase, decrease or difference) is statistically significant or not. Remember that hypothesis testing gives an assessment of statistical significance, whereas estimation gives an estimate of effect, and both are important.
Here we use the proportion specified in the null hypothesis as the true proportion of successes rather than the sample proportion. If we fail to satisfy the condition, then alternative procedures, called exact methods, must be used to test the hypothesis about the population proportion.
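As a minimal sketch of this idea, the Python function below computes the Z statistic for a single proportion, building the standard error from the null value p0 rather than from the sample proportion. The function name and the numbers in the usage lines are illustrative assumptions, not values from the text.

```python
from math import sqrt

def proportion_z(successes, n, p0):
    """Z statistic for a one-sample test of a proportion.

    The standard error is built from p0, the proportion stated in the
    null hypothesis, rather than from the sample proportion.
    """
    p_hat = successes / n                # sample proportion
    se_null = sqrt(p0 * (1 - p0) / n)    # standard error under the null
    return (p_hat - p0) / se_null

# Hypothetical data (assumption): 60 successes in 100 trials, p0 = 0.5
z = proportion_z(60, 100, 0.5)
print(round(z, 2))
```

Using p0 in the standard error reflects the logic of the test: every calculation is carried out as if the null hypothesis were true.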
We decide: "The data (and its sample mean) are significantly different from the value of the mean hypothesized under the null hypothesis, at the .001 level of significance." If the null hypothesis is in fact true, a decision rule like this rejects it wrongly (Type I error) about 1 time out of 1000.
We decide: "The data (and its sample mean) are significantly different from the value of the mean hypothesized under the null hypothesis, at the .01 level of significance." If the null hypothesis is in fact true, a decision rule like this rejects it wrongly (Type I error) about 1 time out of 100.
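The link between the level of significance and the Type I error rate can be checked with a small simulation. The sketch below is only an illustration: the population mean, standard deviation, sample size, number of trials and random seed are all assumed values, and the population standard deviation is treated as known. It repeatedly samples from a population where the null hypothesis is true and counts how often a two-sided Z test rejects it.

```python
import random
from math import sqrt
from statistics import NormalDist

def type_i_error_rate(alpha, trials=10000, n=20, mu0=50.0, sigma=10.0, seed=1):
    """Estimate how often a two-sided Z test rejects a TRUE null hypothesis.

    All default values are illustrative assumptions.
    """
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # two-sided critical value
    rejections = 0
    for _ in range(trials):
        # Draw a sample from a population where the null hypothesis holds
        sample = [rng.gauss(mu0, sigma) for _ in range(n)]
        sample_mean = sum(sample) / n
        z = (sample_mean - mu0) / (sigma / sqrt(n))
        if abs(z) > z_crit:
            rejections += 1                        # a Type I error
    return rejections / trials

print(type_i_error_rate(0.01))   # should land near 0.01
```

The estimated rate varies a little from run to run (and with the seed), but it should stay close to the chosen level of significance.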
When we get the data we will calculate Z and then look it up in the Z table to see how unusual the obtained sample's mean is, if the null hypothesis Ho is true.
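The table lookup can also be done in software. The following sketch computes Z for a sample mean and then the upper-tail probability from the standard normal distribution, which is what the Z table provides. The function names and the numbers in the usage lines are hypothetical, chosen only to make the arithmetic easy to follow.

```python
from math import sqrt
from statistics import NormalDist

def z_score(sample_mean, mu0, sd, n):
    """Z for a sample mean, given the mean hypothesized under Ho."""
    standard_error = sd / sqrt(n)
    return (sample_mean - mu0) / standard_error

def upper_tail_probability(z):
    """Probability of a Z this large or larger if Ho is true."""
    return 1 - NormalDist().cdf(z)

# Hypothetical numbers (assumptions): mean 105, Ho mean 100, sd 15, n 36
z = z_score(sample_mean=105, mu0=100, sd=15, n=36)
p = upper_tail_probability(z)
print(round(z, 2), round(p, 4))
```

A small upper-tail probability means the obtained sample mean would be unusual if Ho were true.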
This module will focus on hypothesis testing for means and proportions. The next two modules in this series will address analysis of variance and chi-squared tests.
Definition: Assuming that the null hypothesis is true, the p value is the probability of obtaining a sample mean as extreme or more extreme than the sample mean actually obtained.
NOTATION: The notation Ho represents a null hypothesis and Ha represents an alternative hypothesis. The symbol po is read as p-naught or p-zero and represents the null hypothesized value. Shortly, we will substitute μo for po when discussing a test of means.
The alternative hypothesis can be directional or non-directional. “Eating oatmeal lowers cholesterol” is a directional hypothesis; “Amount of sleep affects test scores” is non-directional.
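The choice between a directional and a non-directional alternative changes how the p value is computed from the same Z. The sketch below illustrates this; the function name, the labels "less", "greater" and "two-sided", and the Z value in the usage lines are assumptions made for this example.

```python
from statistics import NormalDist

def p_value(z, alternative="two-sided"):
    """p value for a Z statistic under different alternative hypotheses."""
    cdf = NormalDist().cdf
    if alternative == "less":          # directional: the parameter is lower
        return cdf(z)
    if alternative == "greater":       # directional: the parameter is higher
        return 1 - cdf(z)
    if alternative == "two-sided":     # non-directional: it differs either way
        return 2 * (1 - cdf(abs(z)))
    raise ValueError("alternative must be 'less', 'greater' or 'two-sided'")

# Hypothetical Z value (assumption) for a study like the oatmeal example
z = -1.96
print(round(p_value(z, "less"), 3))        # directional
print(round(p_value(z, "two-sided"), 3))   # non-directional
```

For the same Z, the non-directional p value is twice the directional one in the observed direction, because extreme results in both tails count as evidence.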
In order to test the hypotheses, we select a random sample of American males in 2006 and measure their weights. Suppose we have resources available to recruit n=100 men into our sample. We weigh each participant and compute summary statistics on the sample data. Suppose in the sample we determine the following:
Do the sample data support the null or research hypothesis? The sample mean of 197.1 is numerically higher than 191. However, is this difference more than would be expected by chance? In hypothesis testing, we assume that the null hypothesis holds until proven otherwise. We therefore need to determine the likelihood of observing a sample mean of 197.1 or higher when the true population mean is 191 (i.e., if the null hypothesis is true or under the null hypothesis). We can compute this probability using the Central Limit Theorem. Specifically,
(Notice that we use the sample standard deviation in computing the Z score. This is generally an appropriate substitution as long as the sample size is large, n > 30.) Thus, there is less than a 1% probability of observing a sample mean as large as 197.1 when the true population mean is 191. Do you think that the null hypothesis is likely true? Based on how unlikely it is to observe a sample mean of 197.1 under the null hypothesis (i.e.,
A small p-value favors the alternative hypothesis. A small p-value means the observed data would not be very likely to occur if we believe the null hypothesis is true. So we believe in our data and disbelieve the null hypothesis. An easy (hopefully!) way to grasp this is to consider the situation where a professor states that you are just a 70% student. You doubt this statement and want to show that you are better than a 70% student. If you took a random sample of 10 of your previous exams and calculated the mean percentage of these 10 tests, which mean would be less likely to occur if in fact you were a 70% student (the null hypothesis): a sample mean of 72% or one of 90%? Obviously the 90% would be less likely and therefore would have a small probability (i.e., p-value).
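To put rough numbers on the 70% student example, the sketch below computes how likely a sample mean of at least 72% or at least 90% would be if the null hypothesis were true. The text gives no standard deviation for the exam scores, so the value of 10 percentage points used here is purely an assumption, as is the use of a normal model for the mean of only 10 exams.

```python
from math import sqrt
from statistics import NormalDist

def p_mean_at_least(observed_mean, mu0=70.0, sd=10.0, n=10):
    """Upper-tail p value for a sample mean under the null mean mu0.

    sd=10.0 is an assumed exam-score standard deviation (not from the text).
    """
    standard_error = sd / sqrt(n)
    z = (observed_mean - mu0) / standard_error
    return 1 - NormalDist().cdf(z)

p_72 = p_mean_at_least(72)   # a mean only slightly above 70
p_90 = p_mean_at_least(90)   # a mean far above 70
print(p_72, p_90)
```

Under these assumptions a mean of 72% is quite plausible for a 70% student, while a mean of 90% has a vanishingly small p-value, which is why it counts as strong evidence against the professor's claim.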
For our example, we formally state: The alternative hypothesis (H1) is that prenatal exposure to alcohol has an effect on the birth weight for the population of lab rats.