Basic definitions
The null hypothesis and the alternative hypothesis are types of conjectures used in statistical tests, which are formal methods of reaching conclusions or making decisions on the basis of data. The hypotheses are conjectures about a statistical model of the population, which are based on a sample of the population. The tests are core elements of statistical inference, heavily used in the interpretation of scientific experimental data, to separate scientific claims from statistical noise.
"The statement being tested in a test of statistical significance is called the null hypothesis. The test of significance is designed to assess the strength of the evidence against the null hypothesis. Usually, the null hypothesis is a statement of 'no effect' or 'no difference'."[2] It is often symbolized as H0.
The statement that is being tested against the null hypothesis is the alternative hypothesis.[2] Symbols include H1 and Ha.
Statistical significance test: "Very roughly, the procedure for deciding goes like this: Take a random sample from the population. If the sample data are consistent with the null hypothesis, then do not reject the null hypothesis; if the sample data are inconsistent with the null hypothesis, then reject the null hypothesis and conclude that the alternative hypothesis is true."[3]
The following adds context and nuance to the basic definitions.
Given the test scores of two random samples, one of men and one of women, does one group differ from the other? A possible null hypothesis is that the mean male score is the same as the mean female score:
- H0: μ1 = μ2
where
- H0 = the null hypothesis,
- μ1 = the mean of population 1, and
- μ2 = the mean of population 2.
A stronger null hypothesis is that the two samples are drawn from the same population, such that the variances and shapes of the distributions are also equal.
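The stronger null hypothesis (both samples drawn from the same population) can be illustrated with a permutation test. The following is a minimal sketch, not a prescribed method: the function name `permutation_test`, the number of permutations, the seed, and the test scores are all assumptions made up for illustration. Under this null, the group labels are interchangeable, so the observed difference in means is compared with the differences obtained after randomly reshuffling the pooled scores.

```python
import random
from statistics import mean


def permutation_test(sample_1, sample_2, n_permutations=10_000, seed=0):
    """Two-sided permutation test of H0: both samples come from one population.

    Returns an estimated p-value for the absolute difference in means.
    """
    rng = random.Random(seed)
    observed = abs(mean(sample_1) - mean(sample_2))
    pooled = list(sample_1) + list(sample_2)
    n1 = len(sample_1)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # reassign group labels at random
        diff = abs(mean(pooled[:n1]) - mean(pooled[n1:]))
        if diff >= observed:
            extreme += 1
    # Add-one correction keeps the estimate strictly positive.
    return (extreme + 1) / (n_permutations + 1)


# Hypothetical test scores, invented for illustration only.
scores_group_1 = [72, 85, 78, 90, 66, 81, 77, 88]
scores_group_2 = [75, 80, 79, 92, 70, 84, 73, 86]

p_value = permutation_test(scores_group_1, scores_group_2)
print(f"estimated p-value = {p_value:.3f}")
```

A large p-value here would mean only that these data give insufficient evidence against the null hypothesis, not that the two populations are identical.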
Terminology
- Simple hypothesis
- Any hypothesis which specifies the population distribution completely. For such a hypothesis the sampling distribution of any statistic is a function of the sample size alone.
- Composite hypothesis
- Any hypothesis which does not specify the population distribution completely.[4] Example: A hypothesis specifying a normal distribution with a specified mean and an unspecified variance.
The simple/composite distinction was made by Neyman and Pearson.[5]
- Exact hypothesis
- Any hypothesis that specifies an exact parameter value.[6] Example: μ = 100. Synonym: point hypothesis.
- Inexact hypothesis
- Those specifying a parameter range or interval. Examples: μ ≤ 100; 95 ≤ μ ≤ 105.
Fisher required an exact null hypothesis for testing (see the quotations below).
A one-tailed hypothesis (tested using a one-sided test)[2] is an inexact hypothesis in which the value of a parameter is specified as being either:
- above or equal to a certain value, or
- below or equal to a certain value.
A one-tailed hypothesis is said to have directionality.
Fisher's original (lady tasting tea) example was a one-tailed test. The null hypothesis was asymmetric. The probability of guessing all cups correctly was the same as guessing all cups incorrectly, but Fisher noted that only guessing correctly was compatible with the lady's claim.
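The symmetry of the two extreme outcomes can be checked numerically. The sketch below assumes the classic design of the experiment, which the text above does not spell out: eight cups, four with milk poured first, and a lady who knows she must pick exactly four. The names `CUPS`, `MILK_FIRST`, and `prob_correct` are illustrative.

```python
from math import comb

CUPS = 8        # assumed total number of cups
MILK_FIRST = 4  # assumed number of cups with milk poured first


def prob_correct(k):
    """P(exactly k milk-first cups are identified) under pure guessing.

    This is the hypergeometric probability of k hits when MILK_FIRST cups
    are chosen at random out of CUPS.
    """
    ways = comb(MILK_FIRST, k) * comb(CUPS - MILK_FIRST, MILK_FIRST - k)
    return ways / comb(CUPS, MILK_FIRST)


p_all_correct = prob_correct(MILK_FIRST)
p_all_wrong = prob_correct(0)

# Only "all correct" is compatible with the lady's claim, so the
# one-tailed p-value counts that outcome alone.
one_tailed_p = p_all_correct

print(f"P(all correct) = {p_all_correct:.4f}")
print(f"P(all wrong)   = {p_all_wrong:.4f}")
```

Under these assumptions both extreme outcomes have probability 1/70, but only one of them counts as evidence for the claim, which is what makes the test one-tailed.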
Technical description
The null hypothesis is a default hypothesis that a quantity to be measured is zero (null). Typically, the quantity to be measured is the difference between two situations. For instance, the aim may be to determine whether there is positive proof that an effect has occurred or that samples derive from different batches.[7][8]
The null hypothesis states that a quantity (of interest) is both greater than or equal to zero and less than or equal to zero. If either requirement can be positively overturned, the null hypothesis is "excluded from the realm of possibilities".
The null hypothesis is generally assumed to remain possibly true. Multiple analyses can be performed to show how the hypothesis should be rejected or excluded, e.g. at a high confidence level, thus demonstrating a statistically significant difference. This is demonstrated by showing that zero lies outside the specified confidence interval of the measurement on either side, typically within the real numbers.[8] Failure to exclude the null hypothesis (with any confidence) does not logically confirm or support the (unprovable) null hypothesis. (Failing to prove that something is, e.g., greater than x does not necessarily make it plausible that it is less than or equal to x; the measurement may instead be of poor quality with low accuracy. Confirming a two-sided null hypothesis would amount to positively proving that the quantity is greater than or equal to 0 and positively proving that it is less than or equal to 0; this would require infinite accuracy as well as an effect of exactly zero, neither of which is normally realistic. Also, measurements will never indicate a non-zero probability of exactly zero difference.) So failure to exclude a null hypothesis amounts to a "don't know" at the specified confidence level; it does not immediately imply the null, as the data may already show a (weaker) indication of a non-null effect. The confidence level used does not correspond to the probability that the null is true when it fails to be excluded; in fact, in this case a higher confidence level widens the range of values that remain plausible.
A non-null hypothesis can have the following meanings, depending on the author: (a) a value other than zero is used, (b) some margin other than zero is used, and (c) the "alternative" hypothesis.[9][10]
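The role of the confidence interval can be sketched in a few lines. This is an illustration under stated assumptions, not a recommended analysis: the measured differences are invented, the helper `mean_confidence_interval` is hypothetical, and the interval uses a normal approximation (for a sample this small, a t-based interval would normally be preferred).

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def mean_confidence_interval(values, confidence=0.95):
    """Normal-approximation confidence interval for the mean of `values`."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # two-sided critical value
    centre = mean(values)
    half_width = z * stdev(values) / sqrt(len(values))
    return centre - half_width, centre + half_width


# Hypothetical measured differences between two situations (illustrative only).
differences = [0.8, -0.3, 1.1, 0.4, 0.9, -0.1, 0.6, 1.3, 0.2, 0.7]

for level in (0.90, 0.95, 0.99):
    low, high = mean_confidence_interval(differences, level)
    zero_excluded = not (low <= 0.0 <= high)
    print(f"{level:.0%} interval: ({low:.3f}, {high:.3f}); zero excluded: {zero_excluded}")
```

Raising the confidence level widens the interval, so a stricter level leaves more values, possibly including zero, in the still-plausible range.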
Testing (excluding or failing to exclude) the null hypothesis provides evidence that there are (or are not) statistically sufficient grounds to believe there is a relationship between two phenomena (e.g., that a potential treatment has a non-zero effect, either way). Testing the null hypothesis is a central task in statistical hypothesis testing in the modern practice of science. There are precise criteria for excluding or not excluding a null hypothesis at a certain confidence level. The confidence level should indicate the likelihood that much more and better data would still be able to exclude the null hypothesis on the same side.[8]
The concept of a null hypothesis is used differently in two approaches to statistical inference. In the significance testing approach of Ronald Fisher, a null hypothesis is rejected if the observed data would be significantly unlikely to occur if the null hypothesis were true. In this case, the null hypothesis is rejected and an alternative hypothesis is accepted in its place. If the data are consistent with the null hypothesis being true, then the null hypothesis is not rejected. In neither case is the null hypothesis or its alternative proven; with better or more data, the null may still be rejected. This is analogous to the legal principle of presumption of innocence, in which a suspect or defendant is assumed to be innocent (null is not rejected) until proven guilty (null is rejected) beyond a reasonable doubt (to a statistically significant degree).[8]
In the hypothesis testing approach of Jerzy Neyman and Egon Pearson, a null hypothesis is contrasted with an alternative hypothesis, and the two hypotheses are distinguished on the basis of data, with certain error rates. It is used in formulating answers in research.
Statistical inference can be done without a null hypothesis, by specifying a statistical model corresponding to each candidate hypothesis and using model selection techniques to choose the most appropriate model.[11] (The most common selection techniques are based on either the Akaike information criterion or the Bayes factor.)
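As a rough illustration of model selection, the sketch below compares two candidate Gaussian models with the Akaike information criterion, AIC = 2k − 2 ln L, where k is the number of fitted parameters and L the maximized likelihood. The function `gaussian_aic`, the choice of models (mean fixed at zero versus mean fitted freely), and the data are assumptions made for this example only.

```python
from math import log, pi


def gaussian_aic(values, fixed_mean=None):
    """AIC of a normal model fitted to `values` by maximum likelihood.

    If `fixed_mean` is given, only the variance is fitted (k = 1);
    otherwise both mean and variance are fitted (k = 2).
    """
    n = len(values)
    if fixed_mean is None:
        mu, k = sum(values) / n, 2
    else:
        mu, k = fixed_mean, 1
    variance = sum((v - mu) ** 2 for v in values) / n  # maximum-likelihood estimate
    log_likelihood = -0.5 * n * (log(2 * pi * variance) + 1)
    return 2 * k - 2 * log_likelihood


# Hypothetical measurements, invented for illustration only.
measurements = [0.8, -0.3, 1.1, 0.4, 0.9, -0.1, 0.6, 1.3, 0.2, 0.7]

aic_zero_mean = gaussian_aic(measurements, fixed_mean=0.0)
aic_free_mean = gaussian_aic(measurements)
preferred = "zero-mean model" if aic_zero_mean < aic_free_mean else "free-mean model"
print(f"AIC (mean fixed at 0): {aic_zero_mean:.2f}")
print(f"AIC (mean fitted):     {aic_free_mean:.2f}")
print(f"lower AIC: {preferred}")
```

Neither model is treated as a default to be rejected; the criterion simply ranks the candidates, penalizing the extra parameter of the more flexible model.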
Principle
Hypothesis testing requires constructing a statistical model of what the data would look like if chance or random processes alone were responsible for the results. The hypothesis that chance alone is responsible for the results is called the null hypothesis. The model of the result of the random process is called the distribution under the null hypothesis. The obtained results are compared with the distribution under the null hypothesis, and the likelihood of finding the obtained results is thereby determined.[12]
Hypothesis testing works by collecting data and measuring how likely the particular set of data is (assuming the null hypothesis is true), when the study is on a randomly selected representative sample. The null hypothesis assumes no relationship between variables in the population from which the sample is selected.[13]
If the data set of a randomly selected representative sample is very unlikely relative to the null hypothesis (that is, it belongs to a class of data sets that would only rarely be observed if the null hypothesis were true), the experimenter rejects the null hypothesis, concluding it (probably) is false. This class of data sets is usually specified via a test statistic, which is designed to measure the extent of apparent departure from the null hypothesis. The procedure works by assessing whether the observed departure, measured by the test statistic, is larger than a threshold chosen so that the probability of a more extreme value is small under the null hypothesis (usually, such a value would occur in fewer than 5% or 1% of similar data sets in which the null hypothesis does hold).
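The procedure can be made concrete with a coin-tossing sketch. The example is an assumption introduced here, not taken from the text: the null hypothesis is that a coin is fair, the test statistic is the absolute departure of the number of heads from half the number of flips, and the p-value is the exact binomial probability of a departure at least as large. The names `fair_coin_p_value` and `ALPHA` are illustrative.

```python
from math import comb

ALPHA = 0.05  # assumed significance threshold (5%)


def fair_coin_p_value(heads, flips):
    """Exact two-sided p-value for H0: the coin is fair.

    Test statistic: |heads - flips / 2|. The p-value is the probability,
    under H0, of a departure at least as large as the one observed.
    """
    observed_departure = abs(heads - flips / 2)
    extreme_outcomes = sum(
        comb(flips, k)
        for k in range(flips + 1)
        if abs(k - flips / 2) >= observed_departure
    )
    return extreme_outcomes / 2**flips


# Hypothetical observation: 16 heads in 20 flips.
p_value = fair_coin_p_value(16, 20)
reject_null = p_value < ALPHA
print(f"p-value = {p_value:.4f}; reject H0 at {ALPHA:.0%}: {reject_null}")
```

The same observation can lead to different decisions at the 5% and 1% thresholds, which is why the threshold is fixed before the data are examined.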
If the data do not contradict the null hypothesis, then only a weak conclusion can be made: namely, that the observed data set provides insufficient evidence against the null hypothesis. In this case, because the null hypothesis could be true or false, in some contexts this is interpreted as meaning that the data give insufficient evidence to make any conclusion, while in other contexts, it is interpreted as meaning that there is not sufficient evidence to support changing from a currently useful regime to a different one. Nevertheless, if at this point the effect appears likely and/or large enough, there may be an incentive to further investigate, such as running a bigger sample.
For instance, a certain drug may reduce the risk of having a heart attack. Possible null hypotheses are "this drug does not reduce the risk of having a heart attack" or "this drug has no effect on the risk of having a heart attack". The test of the hypothesis consists of administering the drug to half of the people in a study group as a controlled experiment. If the data show a statistically significant change in the people receiving the drug, the null hypothesis is rejected.
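The two candidate null hypotheses lead to different tests, which a short sketch can make visible. Everything numeric here is hypothetical: the counts of heart attacks and group sizes are invented, and the pooled two-proportion z-test is one common large-sample choice, not necessarily the analysis a real trial would use. "No effect" corresponds to a two-sided test, while "does not reduce the risk" corresponds to a one-sided test.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z(events_a, n_a, events_b, n_b):
    """Pooled z statistic for the difference in proportions (a minus b)."""
    p_a = events_a / n_a
    p_b = events_b / n_b
    pooled = (events_a + events_b) / (n_a + n_b)
    standard_error = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    return (p_a - p_b) / standard_error


# Hypothetical trial: heart attacks among treated and control participants.
z = two_proportion_z(events_a=30, n_a=1000, events_b=50, n_b=1000)

std_normal = NormalDist()
# H0: "this drug has no effect on the risk" (two-sided alternative).
p_no_effect = 2 * std_normal.cdf(-abs(z))
# H0: "this drug does not reduce the risk" (one-sided alternative: risk is lower).
p_no_reduction = std_normal.cdf(z)

print(f"z = {z:.3f}")
print(f"two-sided p-value = {p_no_effect:.4f}")
print(f"one-sided p-value = {p_no_reduction:.4f}")
```

With a negative z (fewer events in the treated group), the one-sided p-value is half the two-sided one, so the choice of null hypothesis affects how strong the evidence appears.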