LAB 5: Hypothesis Testing

Where are we going?

So far in this course, we have covered:

Descriptive statistics, a set of tools for summarizing data,
Probability theory, which is the foundation for inferential statistics,
Sampling, which provides the information base for statistical analysis,
Estimation, a set of tools for developing estimates of population parameters.

This lab introduces hypothesis testing. Descriptive statistics, estimation and hypothesis testing are all tools within the Analysis Toolbox. In the remaining labs, we will explore several inferential tests which use hypothesis testing as their backbone. Depending on the research question and analysis requirements, a researcher may choose specific tools or use several tools in combination.

In this lab, we will examine:

Establishing the null and alternate hypotheses
Establishing the error probabilities
Setting the significance level
Drawing inferences using a decision rule
Using p-values to confirm your decision
Selecting the appropriate test

The Steps for Conducting a Hypothesis Test are summarized at the end of the lab.

It is essential that you understand the theory behind hypothesis testing so that you can correctly interpret the results of inferential statistical tests. If you need assistance with the concepts presented in this lab, please see your TA or instructor.

Hypothesis Testing

Hypothesis testing is an area of statistical inference in which we evaluate a statement or claim (which we call a hypothesis) about some characteristic of a population. Hypotheses arise from the theory that drives the research. For example, based on our understanding of environmental factors that influence health, we might pose the research hypothesis that there is a difference in the incidence of arthritis between people over the age of 65 who live in Sooke and those over the age of 65 who live in Tucson, Arizona. Before the analyst can begin to assess this claim, however, the research hypothesis must be restated in the form of statistical hypotheses.

Statistical Hypotheses

The null hypothesis (which is written as H0) is referred to as the hypothesis of "no difference". In the example given above, it states that there is no difference in the incidence of arthritis between people over the age of 65 who live in Sooke and those who live in Tucson.

The alternate hypothesis (which is written as HA) is a hypothesis that contradicts the null. In the example given above, it states that there is a difference in the incidence of arthritis between people over the age of 65 who live in Sooke and those who live in Tucson.

A test of hypotheses is a method for deciding which of the two contradictory claims is the correct one. As in a judicial proceeding (where one is presumed innocent until proven guilty), we shall initially assume that the null hypothesis is correct. In carrying out a statistical test, data are collected and analyzed to assess the strength of evidence against the null hypothesis. If the evidence casts sufficient doubt on the truth of the null hypothesis, it can be rejected in favour of the alternate hypothesis. We will consider the issue of "sufficient doubt" in the sections that discuss significance levels and p-values.
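The two statistical hypotheses for the arthritis example can also be written in symbols. This is a sketch under assumed notation: p_S and p_T stand for the proportion of people over the age of 65 with arthritis in Sooke and in Tucson respectively. These symbols are introduced here for illustration and are not defined elsewhere in the lab.

```latex
H_0 : p_S = p_T
\qquad \text{versus} \qquad
H_A : p_S \neq p_T
```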

Directional and Nondirectional Hypotheses

The null and alternate hypotheses presented above are nondirectional, in the sense that the researcher is only interested in whether there is a difference in the incidence of arthritis between the people living in the two cities. It may be the case, however, that the researcher wishes to investigate whether a particular type of difference exists. This requires that the statistical hypotheses be presented in a directional form. For example, the analyst could state:

H0: People over the age of 65 who live in Sooke do not have a higher incidence of arthritis than those over the age of 65 who live in Tucson.

HA: People over the age of 65 who live in Sooke have a higher incidence of arthritis than those over the age of 65 who live in Tucson.

Directional hypotheses are sometimes referred to as one-sided or one-tail hypotheses for reasons that will soon become clear.
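In symbols, the directional pair might be written as follows. As before, p_S and p_T are assumed names for the proportion of people over the age of 65 with arthritis in Sooke and in Tucson. Only a sample result in which Sooke's incidence is sufficiently higher than Tucson's counts as evidence against the null, which is why the rejection region lies in a single tail of the distribution.

```latex
H_0 : p_S \le p_T
\qquad \text{versus} \qquad
H_A : p_S > p_T
```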

Type I and Type II Errors

Because the samples that can be drawn from a given population are not identical, the evidence provided by a particular sample can be misleading. Even with a truly random sample, a test of hypothesis could result in an incorrect decision. To return to the courtroom analogy, the null hypothesis states that the defendant is not guilty. Even though we hope that the jury makes the right choice, there is still the possibility that an innocent person will be convicted or a guilty person set free. Similarly, two types of errors are possible when testing hypotheses.

A Type I error is rejecting a null hypothesis that is true. The probability of committing a Type I error is denoted as α (alpha).

A Type II error is failing to reject a null hypothesis that is false. The probability of committing a Type II error is denoted as β (beta).

We can summarize our decision and types of error in the following table:

                 DECISION
REALITY          Do not reject H0      Reject H0
H0 is true       Correct decision      Type I error
H0 is false      Type II error         Correct decision
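A small simulation can make the Type I error probability concrete. The sketch below is illustrative only and rests on assumptions not stated in the lab: samples are drawn from a normal population whose mean really is 0 (so the null hypothesis is true by construction), the standard deviation is known to be 1, and a two-tailed z-test is used. The function name and its parameters are hypothetical.

```python
import random
from statistics import NormalDist, mean


def type_one_error_rate(alpha=0.05, n=30, trials=2000, seed=1):
    """Estimate how often a true null hypothesis (mean = 0) is rejected."""
    rng = random.Random(seed)
    # Two-tailed critical value of the standard normal distribution.
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(trials):
        # The null is true here: population mean 0, known sd 1.
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        z = mean(sample) / (1.0 / n ** 0.5)
        if abs(z) > z_crit:
            rejections += 1  # a Type I error
    return rejections / trials


rate = type_one_error_rate()
print(f"Estimated Type I error rate: {rate:.3f}")
```

Because every rejection in this simulation is a mistake, the proportion of rejections should settle close to the chosen value of alpha as the number of trials grows.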

P-Value

The p-value is the probability, assuming that the null hypothesis is true, of observing a value of the test statistic at least as extreme as that which has been calculated from the data. The smaller the p-value, the stronger is the evidence against the null hypothesis. If the p-value is less than or equal to the significance level, the null hypothesis is rejected. Calculating p-values by hand is rather difficult. SPSS and other statistical analysis packages, however, provide them for you.
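For the special case where the test statistic follows a standard normal distribution when the null hypothesis is true (an assumption made here for illustration; other tests use other distributions), a p-value can be sketched with Python's standard library. The function name and the `tail` argument are hypothetical.

```python
from statistics import NormalDist


def p_value_from_z(z, tail="two"):
    """P-value for a z statistic under the standard normal distribution."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    if tail == "two":
        # Probability of a value at least this far from 0 in either direction.
        return 2 * (1 - std_normal.cdf(abs(z)))
    if tail == "upper":
        return 1 - std_normal.cdf(z)
    if tail == "lower":
        return std_normal.cdf(z)
    raise ValueError("tail must be 'two', 'upper' or 'lower'")


print(round(p_value_from_z(1.96), 3))           # nondirectional test
print(round(p_value_from_z(1.96, "upper"), 3))  # directional test
```

The same test statistic gives a one-tailed p-value half the size of the two-tailed one, so the choice between directional and nondirectional hypotheses must be made before looking at the data.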

Decision rule

A decision rule is a specific statement about the test statistic and critical value that establishes when to reject the null hypothesis.

Generally, the decision rule is:

Reject H0 if the calculated value of the test statistic is greater than the critical value
Do not reject H0 if the calculated value of the test statistic is less than or equal to the critical value
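The decision rule above can be sketched in code. This is a minimal illustration that assumes a test statistic with a standard normal distribution under the null hypothesis; the function names are hypothetical, and the one-tailed case is taken to be an upper-tail test. For a two-tailed test, the absolute value of the test statistic is compared with the critical value.

```python
from statistics import NormalDist


def critical_value(alpha, two_tailed=True):
    """Critical z value that leaves a total area of alpha in the tail(s)."""
    tail_area = alpha / 2 if two_tailed else alpha
    return NormalDist().inv_cdf(1 - tail_area)


def decide(test_statistic, alpha=0.05, two_tailed=True):
    """Apply the decision rule and return the decision as text."""
    crit = critical_value(alpha, two_tailed)
    observed = abs(test_statistic) if two_tailed else test_statistic
    return "reject H0" if observed > crit else "do not reject H0"


print(decide(2.10))                    # two-tailed test
print(decide(1.80))                    # two-tailed test
print(decide(1.80, two_tailed=False))  # upper-tailed test
```

With a significance level of 0.05, a test statistic of 1.80 falls short of the two-tailed critical value (about 1.96) but exceeds the upper-tailed one (about 1.645), so the two rules reach different decisions.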

A note about Probability Distributions

Different inferential tests use different probability distributions. A researcher must identify the appropriate probability distribution before looking up the critical values. We will introduce the various probability distributions when we examine the statistical tests in the remaining labs.

A note about Rejecting Hypotheses

In the protocol of hypothesis testing, we do not "accept" a null or alternate hypothesis, because we can never be absolutely certain that a hypothesis is correct. If the null hypothesis can be rejected, we infer that the alternate is likely to be true. If the null cannot be rejected, we conclude only that the sample does not provide sufficient evidence against it.

Selecting the Appropriate Test

There are many different inferential statistical tests - each is designed to perform a specific task. In order to select the most appropriate test for your analysis, you need to have a clear understanding of your research question, know the level of measurement for your data as well as the assumptions of each test.

Each statistical test has particular requirements. For example:

Some tests require that the population(s) under investigation be normally distributed. Other tests do not specify any type of distribution.
Some tests are used to compare a population characteristic to a known value; others compare one characteristic between several populations; others are suitable for investigating a possible relationship between characteristics within a population.

Steps For Conducting a Hypothesis Test

In the remaining labs, we will use the following 9 steps for each hypothesis test:

Briefly state the research hypothesis (theory)
Check the assumptions of the test
Formulate the null (H0) and alternate (HA) hypotheses
Specify the significance level (α)
Select the appropriate test
Identify the critical value of the test statistic
Calculate the value of the test statistic for the sample
Apply a decision rule to decide whether to reject or not reject H0
State your conclusion with respect to the research hypothesis (with a statement of confidence)
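The steps above can be sketched end to end for the arthritis example. Everything specific in this sketch is an assumption made for illustration: the survey counts are invented, the large-sample two-proportion z-test is one possible choice of test (the lab does not prescribe it here), and the "at least 10 per cell" check is a common rule of thumb rather than a requirement stated in this lab.

```python
from math import sqrt
from statistics import NormalDist

# Hypothetical survey counts, invented purely for illustration.
sooke_cases, sooke_n = 60, 200
tucson_cases, tucson_n = 45, 200

# Steps 1 and 3: the research hypothesis is a difference in incidence, so
# H0: p_sooke = p_tucson and HA: p_sooke != p_tucson (nondirectional).

# Step 2: rule-of-thumb check for a large-sample z-test (assumed threshold).
counts = [sooke_cases, sooke_n - sooke_cases,
          tucson_cases, tucson_n - tucson_cases]
assumptions_ok = all(c >= 10 for c in counts)

# Step 4: significance level.
alpha = 0.05

# Steps 5 and 6: two-proportion z-test with a two-tailed critical value.
z_crit = NormalDist().inv_cdf(1 - alpha / 2)

# Step 7: test statistic, using the pooled proportion implied by H0.
p_sooke = sooke_cases / sooke_n
p_tucson = tucson_cases / tucson_n
p_pooled = (sooke_cases + tucson_cases) / (sooke_n + tucson_n)
std_error = sqrt(p_pooled * (1 - p_pooled) * (1 / sooke_n + 1 / tucson_n))
z = (p_sooke - p_tucson) / std_error

# Step 8: decision rule, with the p-value as a cross-check.
decision = "reject H0" if abs(z) > z_crit else "do not reject H0"
p_value = 2 * (1 - NormalDist().cdf(abs(z)))

# Step 9: report the result.
print(f"Assumptions satisfied: {assumptions_ok}")
print(f"z = {z:.2f}, critical value = {z_crit:.2f}, p-value = {p_value:.3f}")
print(f"Decision at alpha = {alpha}: {decision}")
```

With these made-up counts, the test statistic is about 1.70, which is below the critical value of about 1.96, so the null hypothesis is not rejected at the 0.05 level. The sample does not provide sufficient evidence of a difference in incidence between the two cities.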