Enago Academy

What Is the Null Hypothesis? What Is Its Importance in Research?


Scientists begin their research with a hypothesis that a relationship of some kind exists between variables. The null hypothesis is the opposite: it states that no such relationship exists. The null hypothesis may seem unexciting, but it is a very important aspect of research. In this article, we discuss what the null hypothesis is, how to make use of it, and why you should use it to improve your statistical analyses.

What is the Null Hypothesis?

The null hypothesis can be tested using statistical analysis and is often written as H0 (read as “H-naught”). Researchers use a significance test to determine how likely the observed sample relationship would be if H0 were true, and therefore whether the results can reasonably be attributed to chance.

The null hypothesis is not the same as an alternative hypothesis. An alternative hypothesis states that there is a relationship between two variables, while H0 posits the opposite. Let us consider the following example.

A researcher wants to discover the relationship between exercise frequency and appetite. She asks:

Q: Does increased exercise frequency lead to increased appetite?
Alternative hypothesis: Increased exercise frequency leads to increased appetite.
H0 assumes that there is no relationship between the two variables: Increased exercise frequency does not lead to increased appetite.

Let us look at another example of how to state the null hypothesis:

Q: Does insufficient sleep lead to an increased risk of heart attack among men over age 50?
H0: The amount of sleep men over age 50 get does not increase their risk of heart attack.
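One way to see how such an H0 could actually be checked is a simple permutation test. The sketch below uses invented counts (all numbers are hypothetical, purely for illustration): under H0, the sleep-group labels carry no information, so shuffling them shows how often a difference as large as the observed one arises by chance.

```python
import random

random.seed(0)

# Invented counts for illustration only (not real epidemiological data):
short_sleep = [1] * 30 + [0] * 170   # 1 = had a heart attack
good_sleep = [1] * 18 + [0] * 182

obs_diff = sum(short_sleep) / len(short_sleep) - sum(good_sleep) / len(good_sleep)

# Under H0 the sleep labels carry no information, so we may shuffle them:
pooled = short_sleep + good_sleep
n = len(short_sleep)
trials, extreme = 10_000, 0
for _ in range(trials):
    random.shuffle(pooled)
    diff = sum(pooled[:n]) / n - sum(pooled[n:]) / (len(pooled) - n)
    if abs(diff) >= abs(obs_diff):
        extreme += 1

# Fraction of shuffles at least as extreme as the observed difference:
p_value = extreme / trials
```

If `p_value` were very small, the observed difference in heart-attack rates would be hard to explain by chance alone, and H0 would be rejected.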

Why is Null Hypothesis Important?

Many scientists neglect the null hypothesis in their testing. As shown in the above examples, H0 is often assumed to be simply the opposite of the hypothesis being tested. However, it is good practice to state H0 explicitly and ensure it is carefully worded. To understand why, let us return to our previous example. In this case,

Alternative hypothesis: Getting too little sleep leads to an increased risk of heart attack among men over age 50.

H 0 : The amount of sleep men over age 50 get has no effect on their risk of heart attack.

Note that this H0 is different from the one in our first example. What if we were to conduct this experiment and find that neither H0 nor the alternative hypothesis was supported? The experiment would be considered invalid. Take our original H0 in this case: “the amount of sleep men over age 50 get does not increase their risk of heart attack”. If this H0 is found to be untrue, and so is the alternative, we can still consider a third hypothesis: perhaps getting insufficient sleep actually decreases the risk of a heart attack among men over age 50. Because we have tested H0, we have information that we would not have if we had neglected it.

Do I Really Need to Test It?

The biggest problem with the null hypothesis is that many scientists see accepting it as a failure of the experiment. They consider that they have not proven anything of value. However, as we have learned from the replication crisis , negative results are just as important as positive ones. While they may seem less appealing to publishers, they can tell the scientific community important information about correlations that do or do not exist. In this way, they can drive science forward and prevent the wastage of resources.

Do you test for the null hypothesis? Why or why not? Let us know your thoughts in the comments below.




13.1 Understanding Null Hypothesis Testing

Learning Objectives

  • Explain the purpose of null hypothesis testing, including the role of sampling error.
  • Describe the basic logic of null hypothesis testing.
  • Describe the role of relationship strength and sample size in determining statistical significance and make reasonable judgments about statistical significance based on these two factors.

The Purpose of Null Hypothesis Testing

As we have seen, psychological research typically involves measuring one or more variables in a sample and computing descriptive statistics for that sample. In general, however, the researcher’s goal is not to draw conclusions about that sample but to draw conclusions about the population that the sample was selected from. Thus researchers must use sample statistics to draw conclusions about the corresponding values in the population. These corresponding values in the population are called  parameters . Imagine, for example, that a researcher measures the number of depressive symptoms exhibited by each of 50 adults with clinical depression and computes the mean number of symptoms. The researcher probably wants to use this sample statistic (the mean number of symptoms for the sample) to draw conclusions about the corresponding population parameter (the mean number of symptoms for adults with clinical depression).

Unfortunately, sample statistics are not perfect estimates of their corresponding population parameters. This is because there is a certain amount of random variability in any statistic from sample to sample. The mean number of depressive symptoms might be 8.73 in one sample of adults with clinical depression, 6.45 in a second sample, and 9.44 in a third—even though these samples are selected randomly from the same population. Similarly, the correlation (Pearson’s  r ) between two variables might be +.24 in one sample, −.04 in a second sample, and +.15 in a third—again, even though these samples are selected randomly from the same population. This random variability in a statistic from sample to sample is called  sampling error . (Note that the term error  here refers to random variability and does not imply that anyone has made a mistake. No one “commits a sampling error.”)
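Sampling error is easy to see in a short simulation. The sketch below draws three random samples of 50 from the same hypothetical population (invented parameters: symptom counts averaging about 8) and shows that the sample means differ even though nothing went wrong:

```python
import random
import statistics

random.seed(42)

# Hypothetical population: symptom counts that average about 8 per person.
def draw_sample(n=50):
    return [max(0, round(random.gauss(8, 3))) for _ in range(n)]

# Three random samples from the SAME population give three different means:
sample_means = [statistics.mean(draw_sample()) for _ in range(3)]

# This spread is sampling error: random variability, not anyone's mistake.
spread = max(sample_means) - min(sample_means)
```

Each run with a different seed produces a different trio of means, which is exactly the point: a sample statistic is only an estimate of the population parameter.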

One implication of this is that when there is a statistical relationship in a sample, it is not always clear that there is a statistical relationship in the population. A small difference between two group means in a sample might indicate that there is a small difference between the two group means in the population. But it could also be that there is no difference between the means in the population and that the difference in the sample is just a matter of sampling error. Similarly, a Pearson’s  r  value of −.29 in a sample might mean that there is a negative relationship in the population. But it could also be that there is no relationship in the population and that the relationship in the sample is just a matter of sampling error.

In fact, any statistical relationship in a sample can be interpreted in two ways:

  • There is a relationship in the population, and the relationship in the sample reflects this.
  • There is no relationship in the population, and the relationship in the sample reflects only sampling error.

The purpose of null hypothesis testing is simply to help researchers decide between these two interpretations.

The Logic of Null Hypothesis Testing

Null hypothesis testing  is a formal approach to deciding between two interpretations of a statistical relationship in a sample. One interpretation is called the  null hypothesis  (often symbolized  H 0  and read as “H-naught”). This is the idea that there is no relationship in the population and that the relationship in the sample reflects only sampling error. Informally, the null hypothesis is that the sample relationship “occurred by chance.” The other interpretation is called the  alternative hypothesis  (often symbolized as  H 1 ). This is the idea that there is a relationship in the population and that the relationship in the sample reflects this relationship in the population.

Again, every statistical relationship in a sample can be interpreted in either of these two ways: It might have occurred by chance, or it might reflect a relationship in the population. So researchers need a way to decide between them. Although there are many specific null hypothesis testing techniques, they are all based on the same general logic. The steps are as follows:

  • Assume for the moment that the null hypothesis is true. There is no relationship between the variables in the population.
  • Determine how likely the sample relationship would be if the null hypothesis were true.
  • If the sample relationship would be extremely unlikely, then reject the null hypothesis  in favor of the alternative hypothesis. If it would not be extremely unlikely, then  retain the null hypothesis .
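The three steps above can be sketched as a generic simulation-based test. Everything numeric here is invented for illustration; a real analysis would plug in an appropriate test statistic and null model:

```python
import random

def null_hypothesis_test(observed_stat, simulate_under_h0, trials=10_000, alpha=0.05):
    """Assume H0, see how often a simulated result is at least as extreme
    as the observed one, then reject or retain H0 accordingly."""
    extreme = sum(abs(simulate_under_h0()) >= abs(observed_stat)
                  for _ in range(trials))
    p_value = extreme / trials
    return ("reject H0" if p_value <= alpha else "retain H0"), p_value

random.seed(1)
# Invented example: we observed a group difference of 1.2, and under H0
# the difference would behave like random noise with standard deviation 0.5.
decision, p = null_hypothesis_test(1.2, lambda: random.gauss(0, 0.5))
```

Here a difference of 1.2 would be very unusual under the assumed null model, so the function returns "reject H0".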

Following this logic, we can begin to understand why Mehl and his colleagues concluded that there is no difference in talkativeness between women and men in the population. In essence, they asked the following question: “If there were no difference in the population, how likely is it that we would find a small difference of  d  = 0.06 in our sample?” Their answer to this question was that this sample relationship would be fairly likely if the null hypothesis were true. Therefore, they retained the null hypothesis—concluding that there is no evidence of a sex difference in the population. We can also see why Kanner and his colleagues concluded that there is a correlation between hassles and symptoms in the population. They asked, “If the null hypothesis were true, how likely is it that we would find a strong correlation of +.60 in our sample?” Their answer to this question was that this sample relationship would be fairly unlikely if the null hypothesis were true. Therefore, they rejected the null hypothesis in favor of the alternative hypothesis—concluding that there is a positive correlation between these variables in the population.

A crucial step in null hypothesis testing is finding the likelihood of the sample result if the null hypothesis were true. This probability is called the  p value . A low  p  value means that the sample result would be unlikely if the null hypothesis were true and leads to the rejection of the null hypothesis. A p  value that is not low means that the sample result would be likely if the null hypothesis were true and leads to the retention of the null hypothesis. But how low must the  p  value be before the sample result is considered unlikely enough to reject the null hypothesis? In null hypothesis testing, this criterion is called  α (alpha)  and is almost always set to .05. If there is a 5% chance or less of a result as extreme as the sample result if the null hypothesis were true, then the null hypothesis is rejected. When this happens, the result is said to be  statistically significant . If there is greater than a 5% chance of a result as extreme as the sample result when the null hypothesis is true, then the null hypothesis is retained. This does not necessarily mean that the researcher accepts the null hypothesis as true—only that there is not currently enough evidence to reject it. Researchers often use the expression “fail to reject the null hypothesis” rather than “retain the null hypothesis,” but they never use the expression “accept the null hypothesis.”
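As a toy illustration of this logic, the sketch below estimates a p value for an observed correlation by permutation: shuffling one variable enforces the null hypothesis, so the fraction of shuffles producing a correlation at least as extreme as the observed one approximates the p value. The scores are invented, not Kanner's data:

```python
import random

random.seed(7)

def pearson_r(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

# Hypothetical hassles/symptoms scores, invented for illustration:
hassles = [3, 7, 2, 9, 5, 8, 1, 6, 4, 10]
symptoms = [2, 6, 3, 9, 4, 7, 2, 5, 3, 9]

r_obs = pearson_r(hassles, symptoms)

# Shuffling symptoms breaks any real association, enforcing H0:
trials, extreme = 5_000, 0
ys = symptoms[:]
for _ in range(trials):
    random.shuffle(ys)
    if abs(pearson_r(hassles, ys)) >= abs(r_obs):
        extreme += 1

p_value = extreme / trials
significant = p_value <= 0.05   # the conventional alpha level
```

With such a strong sample correlation, hardly any shuffles are as extreme, the p value is far below .05, and H0 would be rejected.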

The Misunderstood  p  Value

The  p  value is one of the most misunderstood quantities in psychological research (Cohen, 1994) [1] . Even professional researchers misinterpret it, and it is not unusual for such misinterpretations to appear in statistics textbooks!

The most common misinterpretation is that the  p  value is the probability that the null hypothesis is true—that the sample result occurred by chance. For example, a misguided researcher might say that because the  p  value is .02, there is only a 2% chance that the result is due to chance and a 98% chance that it reflects a real relationship in the population. But this is incorrect . The  p  value is really the probability of a result at least as extreme as the sample result  if  the null hypothesis  were  true. So a  p  value of .02 means that if the null hypothesis were true, a sample result this extreme would occur only 2% of the time.

You can avoid this misunderstanding by remembering that the  p  value is not the probability that any particular  hypothesis  is true or false. Instead, it is the probability of obtaining the  sample result  if the null hypothesis were true.
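A short simulation makes the point concrete: when the null hypothesis is true by construction, small p values still occur, and p ≤ .05 happens about 5% of the time. (The normal approximation below is a rough sketch, not a substitute for a proper t-test.)

```python
import math
import random
import statistics

random.seed(3)

def phi(x):
    # Standard normal cumulative distribution function
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

def one_p_value(n=30):
    # Two groups drawn from the SAME population, so H0 is true by construction.
    a = [random.gauss(0, 1) for _ in range(n)]
    b = [random.gauss(0, 1) for _ in range(n)]
    diff = statistics.mean(a) - statistics.mean(b)
    se = (statistics.variance(a) / n + statistics.variance(b) / n) ** 0.5
    # Rough two-sided p value via the normal approximation:
    return 2 * (1 - phi(abs(diff / se)))

p_values = [one_p_value() for _ in range(2000)]
false_alarms = sum(p <= 0.05 for p in p_values) / len(p_values)
```

`false_alarms` lands near .05: a p value of .02 does not mean a 98% chance of a real effect, because "significant" results arise at a fixed rate even with no effect at all.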

[Image: “Null Hypothesis”, retrieved from http://imgs.xkcd.com/comics/null_hypothesis.png (CC-BY-NC 2.5)]

Role of Sample Size and Relationship Strength

Recall that null hypothesis testing involves answering the question, “If the null hypothesis were true, what is the probability of a sample result as extreme as this one?” In other words, “What is the  p  value?” It can be helpful to see that the answer to this question depends on just two considerations: the strength of the relationship and the size of the sample. Specifically, the stronger the sample relationship and the larger the sample, the less likely the result would be if the null hypothesis were true. That is, the lower the  p  value. This should make sense. Imagine a study in which a sample of 500 women is compared with a sample of 500 men in terms of some psychological characteristic, and Cohen’s  d  is a strong 0.50. If there were really no sex difference in the population, then a result this strong based on such a large sample should seem highly unlikely. Now imagine a similar study in which a sample of three women is compared with a sample of three men, and Cohen’s  d  is a weak 0.10. If there were no sex difference in the population, then a relationship this weak based on such a small sample should seem likely. And this is precisely why the null hypothesis would be rejected in the first example and retained in the second.
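The trade-off between relationship strength and sample size can be sketched with a back-of-envelope calculation. The helper below approximates a two-sided p value for a given Cohen's d using the normal approximation (a rough sketch only; a real analysis would use a t-test):

```python
import math

def approx_p_from_d(d, n_per_group):
    """Back-of-envelope two-sided p value for Cohen's d between two groups
    of equal size, via the normal approximation."""
    z = abs(d) / math.sqrt(2 / n_per_group)
    return 2 * (1 - 0.5 * (1 + math.erf(z / math.sqrt(2))))

p_strong_large = approx_p_from_d(0.50, 500)  # strong effect, large samples
p_weak_small = approx_p_from_d(0.10, 3)      # weak effect, tiny samples
```

The first p value is far below .05 (reject H0) and the second is nowhere near it (retain H0), matching the two imagined studies in the text.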

Of course, sometimes the result can be weak and the sample large, or the result can be strong and the sample small. In these cases, the two considerations trade off against each other so that a weak result can be statistically significant if the sample is large enough and a strong relationship can be statistically significant even if the sample is small. Table 13.1 shows roughly how relationship strength and sample size combine to determine whether a sample result is statistically significant. The columns of the table represent the three levels of relationship strength: weak, medium, and strong. The rows represent four sample sizes that can be considered small, medium, large, and extra large in the context of psychological research. Thus each cell in the table represents a combination of relationship strength and sample size. If a cell contains the word Yes, then this combination would be statistically significant for both Cohen’s d and Pearson’s r. If it contains the word No, then it would not be statistically significant for either. There is one cell where the decision for d and r would be different and another where it might be different depending on some additional considerations, which are discussed in Section 13.2 “Some Basic Null Hypothesis Tests”.

Although Table 13.1 provides only a rough guideline, it shows very clearly that weak relationships based on medium or small samples are never statistically significant and that strong relationships based on medium or larger samples are always statistically significant. If you keep this lesson in mind, you will often know whether a result is statistically significant based on the descriptive statistics alone. It is extremely useful to be able to develop this kind of intuitive judgment. One reason is that it allows you to develop expectations about how your formal null hypothesis tests are going to come out, which in turn allows you to detect problems in your analyses. For example, if your sample relationship is strong and your sample is medium, then you would expect to reject the null hypothesis. If for some reason your formal null hypothesis test indicates otherwise, then you need to double-check your computations and interpretations. A second reason is that the ability to make this kind of intuitive judgment is an indication that you understand the basic logic of this approach in addition to being able to do the computations.

Statistical Significance Versus Practical Significance

Table 13.1 illustrates another extremely important point. A statistically significant result is not necessarily a strong one. Even a very weak result can be statistically significant if it is based on a large enough sample. This is closely related to Janet Shibley Hyde’s argument about sex differences (Hyde, 2007) [2] . The differences between women and men in mathematical problem solving and leadership ability are statistically significant. But the word  significant  can cause people to interpret these differences as strong and important—perhaps even important enough to influence the college courses they take or even who they vote for. As we have seen, however, these statistically significant differences are actually quite weak—perhaps even “trivial.”

This is why it is important to distinguish between the  statistical  significance of a result and the  practical  significance of that result.  Practical significance refers to the importance or usefulness of the result in some real-world context. Many sex differences are statistically significant—and may even be interesting for purely scientific reasons—but they are not practically significant. In clinical practice, this same concept is often referred to as “clinical significance.” For example, a study on a new treatment for social phobia might show that it produces a statistically significant positive effect. Yet this effect still might not be strong enough to justify the time, effort, and other costs of putting it into practice—especially if easier and cheaper treatments that work almost as well already exist. Although statistically significant, this result would be said to lack practical or clinical significance.

[Image: “Conditional Risk”, retrieved from http://imgs.xkcd.com/comics/conditional_risk.png (CC-BY-NC 2.5)]

Key Takeaways

  • Null hypothesis testing is a formal approach to deciding whether a statistical relationship in a sample reflects a real relationship in the population or is just due to chance.
  • The logic of null hypothesis testing involves assuming that the null hypothesis is true, finding how likely the sample result would be if this assumption were correct, and then making a decision. If the sample result would be unlikely if the null hypothesis were true, then it is rejected in favor of the alternative hypothesis. If it would not be unlikely, then the null hypothesis is retained.
  • The probability of obtaining the sample result if the null hypothesis were true (the  p  value) is based on two considerations: relationship strength and sample size. Reasonable judgments about whether a sample relationship is statistically significant can often be made by quickly considering these two factors.
  • Statistical significance is not the same as relationship strength or importance. Even weak relationships can be statistically significant if the sample size is large enough. It is important to consider relationship strength and the practical significance of a result in addition to its statistical significance.
Exercises

  • Discussion: Imagine a study showing that people who eat more broccoli tend to be happier. Explain for someone who knows nothing about statistics why the researchers would conduct a null hypothesis test.
  • Practice: Using the relationship-strength and sample-size guidelines above, decide whether each of the following sample results is statistically significant.
      • The correlation between two variables is r = −.78 based on a sample size of 137.
      • The mean score on a psychological characteristic for women is 25 (SD = 5) and the mean score for men is 24 (SD = 5). There were 12 women and 10 men in this study.
      • In a memory experiment, the mean number of items recalled by the 40 participants in Condition A was 0.50 standard deviations greater than the mean number recalled by the 40 participants in Condition B.
      • In another memory experiment, the mean scores for participants in Condition A and Condition B came out exactly the same!
      • A student finds a correlation of r = .04 between the number of units the students in his research methods class are taking and the students’ level of stress.
  • Cohen, J. (1994). The earth is round (p < .05). American Psychologist, 49, 997–1003.
  • Hyde, J. S. (2007). New directions in the study of gender similarities and differences. Current Directions in Psychological Science, 16, 259–263.



Null and Alternative Hypotheses | Definitions & Examples

Published on 5 October 2022 by Shaun Turney . Revised on 6 December 2022.

The null and alternative hypotheses are two competing claims that researchers weigh evidence for and against using a statistical test :

  • Null hypothesis (H 0 ): There’s no effect in the population .
  • Alternative hypothesis (H A ): There’s an effect in the population.

The effect is usually the effect of the independent variable on the dependent variable .

Table of contents

  • Answering your research question with hypotheses
  • What is a null hypothesis?
  • What is an alternative hypothesis?
  • Differences between null and alternative hypotheses
  • How to write null and alternative hypotheses
  • Frequently asked questions about null and alternative hypotheses

The null and alternative hypotheses offer competing answers to your research question . When the research question asks “Does the independent variable affect the dependent variable?”, the null hypothesis (H 0 ) answers “No, there’s no effect in the population.” On the other hand, the alternative hypothesis (H A ) answers “Yes, there is an effect in the population.”

The null and alternative are always claims about the population. That’s because the goal of hypothesis testing is to make inferences about a population based on a sample . Often, we infer whether there’s an effect in the population by looking at differences between groups or relationships between variables in the sample.

You can use a statistical test to decide whether the evidence favors the null or alternative hypothesis. Each type of statistical test comes with a specific way of phrasing the null and alternative hypothesis. However, the hypotheses can also be phrased in a general way that applies to any test.

The null hypothesis is the claim that there’s no effect in the population.

If the sample provides enough evidence against the claim that there’s no effect in the population ( p ≤ α), then we can reject the null hypothesis . Otherwise, we fail to reject the null hypothesis.
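In code form, this decision rule is a one-liner; the helper name below is ours, not standard terminology, and it also encodes the careful wording discussed next:

```python
ALPHA = 0.05   # the conventional significance threshold

def decide(p_value, alpha=ALPHA):
    # Note the wording: we "fail to reject" H0; we never "accept" or "prove" it.
    return "reject H0" if p_value <= alpha else "fail to reject H0"

verdict_small_p = decide(0.003)   # strong evidence against H0
verdict_large_p = decide(0.20)    # not enough evidence against H0
```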

Although “fail to reject” may sound awkward, it’s the only wording that statisticians accept. Be careful not to say you “prove” or “accept” the null hypothesis.

Null hypotheses often include phrases such as “no effect”, “no difference”, or “no relationship”. When written in mathematical terms, they always include an equality (usually =, but sometimes ≥ or ≤).

Examples of null hypotheses

The table below gives examples of research questions and null hypotheses. There’s always more than one way to answer a research question, but these null hypotheses can help you get started.

*Note that some researchers prefer to always write the null hypothesis in terms of “no effect” and “=”. It would be fine to say that daily meditation has no effect on the incidence of depression and p1 = p2.

The alternative hypothesis (H A ) is the other answer to your research question . It claims that there’s an effect in the population.

Often, your alternative hypothesis is the same as your research hypothesis. In other words, it’s the claim that you expect or hope will be true.

The alternative hypothesis is the complement to the null hypothesis. Null and alternative hypotheses are exhaustive, meaning that together they cover every possible outcome. They are also mutually exclusive, meaning that only one can be true at a time.

Alternative hypotheses often include phrases such as “an effect”, “a difference”, or “a relationship”. When alternative hypotheses are written in mathematical terms, they always include an inequality (usually ≠, but sometimes > or <). As with null hypotheses, there are many acceptable ways to phrase an alternative hypothesis.

Examples of alternative hypotheses

The table below gives examples of research questions and alternative hypotheses to help you get started with formulating your own.

Null and alternative hypotheses are similar in some ways:

  • They’re both answers to the research question
  • They both make claims about the population
  • They’re both evaluated by statistical tests.

However, there are important differences between the two types of hypotheses, summarized in the following table.

To help you write your hypotheses, you can use the template sentences below. If you know which statistical test you’re going to use, you can use the test-specific template sentences. Otherwise, you can use the general template sentences.

The only thing you need to know to use these general template sentences are your dependent and independent variables. To write your research question, null hypothesis, and alternative hypothesis, fill in the following sentences with your variables:

Does independent variable affect dependent variable ?

  • Null hypothesis (H 0 ): Independent variable does not affect dependent variable .
  • Alternative hypothesis (H A ): Independent variable affects dependent variable .

Test-specific

Once you know the statistical test you’ll be using, you can write your hypotheses in a more precise and mathematical way specific to the test you chose. The table below provides template sentences for common statistical tests.

Note: The template sentences above assume that you’re performing two-tailed tests, since the alternative hypotheses are phrased with “≠”. Two-tailed tests are appropriate for most studies.

The null hypothesis is often abbreviated as H 0 . When the null hypothesis is written using mathematical symbols, it always includes an equality symbol (usually =, but sometimes ≥ or ≤).

The alternative hypothesis is often abbreviated as H a or H 1 . When the alternative hypothesis is written using mathematical symbols, it always includes an inequality symbol (usually ≠, but sometimes < or >).

A research hypothesis is your proposed answer to your research question. The research hypothesis usually includes an explanation (‘ x affects y because …’).

A statistical hypothesis, on the other hand, is a mathematical statement about a population parameter. Statistical hypotheses always come in pairs: the null and alternative hypotheses. In a well-designed study , the statistical hypotheses correspond logically to the research hypothesis.

Cite this Scribbr article


Turney, S. (2022, December 06). Null and Alternative Hypotheses | Definitions & Examples. Scribbr. Retrieved 14 May 2024, from https://www.scribbr.co.uk/stats/null-and-alternative-hypothesis/


Ask an Academic


Why Does Research Require a Null Hypothesis?

Every researcher is required to establish hypotheses in order to predict, tentatively, the outcome of the research.

What is a null hypothesis and why does research need one?

Every researcher is required to establish hypotheses in order to predict, tentatively, the outcome of the research (Leedy & Ormrod, 2016). A null hypothesis states that any observed result is “the result of chance alone”: there are no patterns, differences, or relationships between variables (Leedy & Ormrod, 2016). Whether the outcome is positive or negative, stating a null hypothesis in addition to your alternative hypothesis means that your research (and you as the researcher) is not one-sided (Bland & Altman, 1994). In other words, you and the research remain open to the possibility that a difference between the variables may or may not exist, and that the outcome of the research is due either to a real effect (alternative hypothesis) or to chance (null hypothesis) (Leedy & Ormrod, 2016; Pierce, 2008; Bland & Altman, 1994).

After collecting data, the hypotheses must be tested in order to reach a conclusion (Daniel & Cross, 2013). The null hypothesis is tested by comparing the collected data with the results that would be expected from chance alone; if the data reasonably suggest that something in the studied environment or population (a factor, a reason, or another variable) produces a difference, relationship, or pattern, the null hypothesis is rejected (Leedy & Ormrod, 2016; Pierce, 2008). When the result is attributable to “something other than chance”, the null hypothesis is rejected and the alternative hypothesis comes into play, because the data, indirectly, lead us to support it (Leedy & Ormrod, 2016). The alternative hypothesis might be the one the researcher wants to be accepted; however, it “can only be accepted” after the collected data show that the null hypothesis “has been rejected” (Pierce, 2008).

Bland, J. M., & Altman, D. G. (1994). Statistics notes: One and two sided tests of significance. British Medical Journal (BMJ), 309, 248. doi:10.1136/bmj.309.6949.248

Daniel, W. W., & Cross, C. L. (2013). Chapter 7: Hypothesis testing. In Biostatistics: A foundation for analysis in the health sciences (10th ed., pp. 214–303). Hoboken, NJ: Wiley. Retrieved February 13, 2018, from https://msph1blog.files.wordpress.com/2016/10/biostatistics-_daniel-10th1.pdf

Leedy, P. D., & Ormrod, J. E. (2016). Practical research: Planning and design (11th ed.). NJ: Pearson Education. Retrieved February 13, 2018, from https://digitalbookshelf.argosy.edu/#/books/9781323328798/cfi/6/6!/4/2/2/48@0:0

Pierce, T. (2008, September). Independent samples t-test. Retrieved February 13, 2018, from http://www.radford.edu/~tpierce/610%20files/Data%20Analysis%20for%20Professional%20Psychologists/Independent%20samples%20t-test%2010-02-09.pdf



Hypothesis testing: the null and alternative hypothesis

In order to undertake hypothesis testing you need to express your research hypothesis as a null and alternative hypothesis. The null hypothesis and alternative hypothesis are statements regarding the differences or effects that occur in the population. You will use your sample to test which statement (i.e., the null hypothesis or alternative hypothesis) is most likely (although technically, you test the evidence against the null hypothesis). So, with respect to our teaching example, the null and alternative hypothesis will reflect statements about all statistics students on graduate management courses.

The null hypothesis is essentially the "devil's advocate" position. That is, it assumes that whatever you are trying to prove did not happen (hint: it usually states that something equals zero). For example, the two different teaching methods did not result in different exam performances (i.e., zero difference). Another example might be that there is no relationship between anxiety and athletic performance (i.e., the slope is zero). The alternative hypothesis states the opposite and is usually the hypothesis you are trying to prove (e.g., the two different teaching methods did result in different exam performances). Initially, you can state these hypotheses in more general terms (e.g., using terms like "effect", "relationship", etc.), as for the teaching methods example:

Null hypothesis (H0): Teaching method has no effect on students' exam performance.
Alternative hypothesis (HA): Teaching method has an effect on students' exam performance.

How you want to "summarize" the exam performances determines how you write a more specific null and alternative hypothesis. For example, you could compare the mean exam performance of each group (i.e., the "seminar" group and the "lectures-only" group). This is what we will demonstrate here, but other options include comparing the distributions, medians, amongst other things. As such, we can state:

Null hypothesis (H0): The mean exam mark of the "seminar" group is equal to the mean exam mark of the "lectures-only" group.
Alternative hypothesis (HA): The mean exam mark of the "seminar" group is not equal to the mean exam mark of the "lectures-only" group.

Now that you have identified the null and alternative hypotheses, you need to find evidence and develop a strategy for declaring your "support" for either the null or alternative hypothesis. We can do this using some statistical theory and some arbitrary cut-off points. Both these issues are dealt with next.

Significance levels

The level of statistical significance is often expressed as the so-called p -value . Depending on the statistical test you have chosen, you will calculate a probability (i.e., the p -value) of observing your sample results (or more extreme) given that the null hypothesis is true . Another way of phrasing this is to consider the probability that a difference in a mean score (or other statistic) could have arisen based on the assumption that there really is no difference. Let us consider this statement with respect to our example where we are interested in the difference in mean exam performance between two different teaching methods. If there really is no difference between the two teaching methods in the population (i.e., given that the null hypothesis is true), how likely would it be to see a difference in the mean exam performance between the two teaching methods as large as (or larger than) that which has been observed in your sample?

So, you might get a p-value such as 0.03 (i.e., p = .03). This means that there is a 3% chance of finding a difference as large as (or larger than) the one in your study given that the null hypothesis is true. However, you want to know whether this is "statistically significant". Typically, if there is a 5% or less chance (5 times in 100 or less) of observing a difference as large as yours when the null hypothesis is true, you reject the null hypothesis and accept the alternative hypothesis. Alternately, if the chance is greater than 5% (more than 5 times in 100), you fail to reject the null hypothesis and do not accept the alternative hypothesis. As such, in this example where p = .03, we reject the null hypothesis and accept the alternative hypothesis: a difference this large would arise by chance alone only 3 times in 100, which is too rare for us to attribute the result to anything other than the effect of the two teaching methods on exam performance.
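This logic can be made concrete with a permutation test: if the null hypothesis is true, the group labels are arbitrary, so we can shuffle them repeatedly and count how often a shuffled difference in means is as large as the one we observed. A minimal standard-library sketch, with made-up exam marks for the two groups (the data are hypothetical, not taken from the example above):

```python
import random
import statistics

def permutation_p_value(group_a, group_b, n_permutations=10_000, seed=0):
    """Estimate the two-sided p-value for a difference in means: the
    probability of a mean difference as large as (or larger than) the
    observed one, assuming the null hypothesis (no difference) is true."""
    rng = random.Random(seed)
    observed = abs(statistics.mean(group_a) - statistics.mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    count = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # under H0, group labels are exchangeable
        diff = abs(statistics.mean(pooled[:n_a]) - statistics.mean(pooled[n_a:]))
        if diff >= observed:
            count += 1
    return count / n_permutations

# Hypothetical exam marks for the "seminar" and "lectures-only" groups
seminar = [72, 75, 78, 80, 74, 77, 79, 81]
lectures_only = [65, 68, 70, 66, 72, 69, 67, 71]

p = permutation_p_value(seminar, lectures_only)
print(f"p = {p:.4f}")
if p < 0.05:
    print("Reject H0: the difference is unlikely under chance alone")
else:
    print("Fail to reject H0")
```

With clearly separated groups like these, very few shuffles reproduce a difference this large, so the estimated p-value falls below the usual 0.05 cut-off.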

Whilst there is relatively little justification why a significance level of 0.05 is used rather than 0.01 or 0.10, for example, it is widely used in academic research. However, if you want to be particularly confident in your results, you can set a more stringent level of 0.01 (a 1% chance or less; 1 in 100 chance or less).


One- and two-tailed predictions

When considering whether we reject the null hypothesis and accept the alternative hypothesis, we need to consider the direction of the alternative hypothesis statement. For example, the alternative hypothesis that was stated earlier is:

Alternative hypothesis (HA): Undertaking seminar classes has a positive effect on students' exam performance.

The alternative hypothesis tells us two things. First, what predictions did we make about the effect of the independent variable(s) on the dependent variable(s)? Second, what was the predicted direction of this effect? Let's use our example to highlight these two points.

Sarah predicted that her teaching method (independent variable: teaching method), whereby she not only required her students to attend lectures but also seminars, would have a positive effect on (that is, increase) students' performance (dependent variable: exam marks). If an alternative hypothesis has a direction (and this is how you want to test it), the hypothesis is one-tailed. That is, it predicts the direction of the effect. If the alternative hypothesis had stated that the effect was expected to be negative, this would also be a one-tailed hypothesis.

Alternatively, a two-tailed prediction means that we do not make a choice over the direction that the effect of the experiment takes. Rather, it simply implies that the effect could be negative or positive. If Sarah had made a two-tailed prediction, the alternative hypothesis might have been:

Alternative hypothesis (HA): Undertaking seminar classes has an effect on students' exam performance.

In other words, we simply take out the word "positive", which implies the direction of our effect. In our example, making a two-tailed prediction may seem strange. After all, it would be logical to expect that "extra" tuition (going to seminar classes as well as lectures) would either have a positive effect on students' performance or no effect at all, but certainly not a negative effect. However, this is just our opinion (and hope) and certainly does not mean that we will get the effect we expect. Generally speaking, making a one-tailed prediction (i.e., and testing for it this way) is frowned upon, as it usually reflects the hope of a researcher rather than any certainty that it will happen. Notable exceptions to this rule are when there is only one possible way in which a change could occur. This can happen, for example, when biological activity/presence is measured. That is, a protein might be "dormant" and the stimulus you are using can only possibly "wake it up" (i.e., it cannot possibly reduce the activity of a "dormant" protein). In addition, for some statistical tests, one-tailed tests are not possible.
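The difference between the two kinds of test is simply which tail(s) of the test statistic's sampling distribution are counted. Assuming the statistic is approximately standard normal under the null hypothesis (the z value below is an arbitrary illustration, not computed from real data), a sketch:

```python
import math

def p_one_tailed(z):
    """Upper-tail p-value: P(Z >= z) for a standard normal Z."""
    return 0.5 * math.erfc(z / math.sqrt(2))

def p_two_tailed(z):
    """Two-tailed p-value: P(|Z| >= |z|), no direction assumed."""
    return math.erfc(abs(z) / math.sqrt(2))

z = 1.8  # hypothetical test statistic
print(p_one_tailed(z))  # about 0.036: below 0.05 one-tailed
print(p_two_tailed(z))  # about 0.072: above 0.05 two-tailed
```

The two-tailed p-value is exactly double the one-tailed value, which is why an unjustified one-tailed test looks like wishful thinking: the same data cross the 0.05 line one-tailed but not two-tailed.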

Rejecting or failing to reject the null hypothesis

Let's return finally to the question of whether we reject or fail to reject the null hypothesis.

If our statistical analysis shows that the significance level is below the cut-off value we have set (e.g., either 0.05 or 0.01), we reject the null hypothesis and accept the alternative hypothesis. Alternatively, if the significance level is above the cut-off value, we fail to reject the null hypothesis and cannot accept the alternative hypothesis. You should note that you cannot accept the null hypothesis, but only find evidence against it.
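The decision rule itself is mechanical; what matters is the asymmetric wording. A tiny sketch, assuming the conventional 0.05 cut-off as in the running example:

```python
def decide(p_value, alpha=0.05):
    """Apply the cut-off: reject H0 if p is below alpha; otherwise we
    fail to reject H0. We never 'accept' the null hypothesis."""
    if p_value < alpha:
        return "reject H0, accept the alternative hypothesis"
    return "fail to reject H0"

print(decide(0.03))
print(decide(0.12))
```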


Statistics LibreTexts

7.3: The Research Hypothesis and the Null Hypothesis


  • Michelle Oja
  • Taft College


Hypotheses are predictions of expected findings.

The Research Hypothesis

A research hypothesis is a mathematical way of stating a research question.  A research hypothesis names the groups (we'll start with a sample and a population), what was measured, and which we think will have a higher mean.  The last one gives the research hypothesis a direction.  In other words, a research hypothesis should include:

  • The name of the groups being compared.  This is sometimes considered the IV.
  • What was measured.  This is the DV.
  • Which group are we predicting will have the higher mean.  

There are two types of research hypotheses related to sample means and population means:  Directional Research Hypotheses and Non-Directional Research Hypotheses

Directional Research Hypothesis

If we expect our obtained sample mean to be above or below the other group's mean (the population mean, for example), we have a directional hypothesis. There are two options:

  • Symbol: \( \displaystyle \bar{X} > \mu \)
  • (The mean of the sample is greater than the mean of the population.)
  • Symbol: \( \displaystyle \bar{X} < \mu \)
  • (The mean of the sample is less than the mean of the population.)

Example \(\PageIndex{1}\)

A study by Blackwell, Trzesniewski, and Dweck (2007) measured growth mindset and how long the junior high student participants spent on their math homework.  What’s a directional hypothesis for how scoring higher on growth mindset (compared to the population of junior high students) would be related to how long students spent on their homework?  Write this out in words and symbols.

Answer in Words:            Students who scored high on growth mindset would spend more time on their homework than the population of junior high students.

Answer in Symbols:         \( \displaystyle \bar{X} > \mu \) 

Non-Directional Research Hypothesis

A non-directional hypothesis states that the means will be different, but does not specify which will be higher.  In reality, there is rarely a situation in which we actually don't want one group to be higher than the other, so we will focus on directional research hypotheses.  There is only one option for a non-directional research hypothesis: "The sample mean differs from the population mean."  This type of research hypothesis doesn't give a direction; it doesn't say which mean will be higher or lower.

A non-directional research hypothesis in symbols should look like this:    \( \displaystyle \bar{X} \neq \mu \) (The mean of the sample is not equal to the mean of the population).

Exercise \(\PageIndex{1}\)

What’s a non-directional hypothesis for how scoring higher on growth mindset (compared to the population of junior high students) would be related to how long students spent on their homework (Blackwell, Trzesniewski, & Dweck, 2007)?  Write this out in words and symbols.

Answer in Words:            Students who scored high on growth mindset would spend a different amount of time on their homework than the population of junior high students.

Answer in Symbols:        \( \displaystyle \bar{X} \neq \mu \) 

See how a non-directional research hypothesis doesn't really make sense?  The big issue is not whether the two groups differ, but whether one group seems to improve what was measured (whether having a growth mindset leads to more time spent on math homework).  This textbook will only use directional research hypotheses because researchers almost always have a predicted direction (meaning that we almost always know which group we think will score higher).

The Null Hypothesis

The hypothesis that an apparent effect is due to chance is called the null hypothesis, written \(H_0\) (“H-naught”). We usually test this through comparing an experimental group to a comparison (control) group.  This null hypothesis can be written as:

\[\mathrm{H}_{0}: \bar{X} = \mu \nonumber \]

For most of this textbook, the null hypothesis is that the means of the two groups are similar.  Much later, the null hypothesis will be that there is no relationship between the two groups.  Either way, remember that a null hypothesis is always saying that nothing is different.  

This is where descriptive statistics diverge from inferential statistics.  We know what the value of \(\overline{\mathrm{X}}\) is – it’s not a mystery or a question, it is what we observed from the sample.  What we are using inferential statistics to do is infer whether this sample's descriptive statistics probably represents the population's descriptive statistics.  This is the null hypothesis, that the two groups are similar.  

Keep in mind that the null hypothesis is typically the opposite of the research hypothesis. A research hypothesis for the ESP example is that those in my sample who say that they have ESP would get more correct answers than the population would get correct, while the null hypothesis is that the average number correct for the two groups will be similar. 

In general, the null hypothesis is the idea that nothing is going on: there is no effect of our treatment, no relation between our variables, and no difference in our sample mean from what we expected about the population mean. This is always our baseline starting assumption, and it is what we seek to reject. If we are trying to treat depression, we want to find a difference in average symptoms between our treatment and control groups. If we are trying to predict job performance, we want to find a relation between conscientiousness and evaluation scores. However, until we have evidence against it, we must use the null hypothesis as our starting point.

In sum, the null hypothesis is always: There is no difference between the groups’ means, OR there is no relationship between the variables.

In the next chapter, the null hypothesis is that there’s no difference between the sample mean   and population mean.  In other words:

  • There is no mean difference between the sample and population.
  • The mean of the sample is the same as the mean of a specific population.
  • \(\mathrm{H}_{0}: \bar{X} = \mu \)
  • We expect our sample’s mean to be same as the population mean.
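That null hypothesis can be tested directly once data are in hand. Below is a sketch of a one-sample test, assuming the population standard deviation is known (so a z-test applies); the homework-minutes data and the population values mu = 30 and sigma = 10 are invented for illustration:

```python
import math
import statistics

def one_sample_z_test(sample, mu, sigma):
    """Two-tailed z-test of H0: the sample mean equals the population
    mean mu, with known population standard deviation sigma."""
    n = len(sample)
    x_bar = statistics.mean(sample)
    z = (x_bar - mu) / (sigma / math.sqrt(n))
    p = math.erfc(abs(z) / math.sqrt(2))  # P(|Z| >= |z|) under H0
    return z, p

# Hypothetical minutes spent on math homework by 25 high-growth-mindset
# students, against an assumed population mean of 30 (sigma = 10)
minutes = [38, 29, 41, 35, 27, 44, 31, 36, 40, 33,
           28, 39, 45, 30, 37, 42, 26, 34, 43, 32,
           38, 29, 41, 35, 36]
z, p = one_sample_z_test(minutes, mu=30, sigma=10)
print(f"z = {z:.2f}, p = {p:.4f}")
```

Here the sample mean sits well above mu, so the p-value is small and we would reject the null hypothesis that the sample comes from a population with that mean.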

Exercise \(\PageIndex{2}\)

A study by Blackwell, Trzesniewski, and Dweck (2007) measured growth mindset and how long the junior high student participants spent on their math homework.  What’s the null hypothesis for scoring higher on growth mindset (compared to the population of junior high students) and how long students spent on their homework?  Write this out in words and symbols.

Answer in Words:            Students who scored high on growth mindset would spend a similar amount of time on their homework as the population of junior high students.

Answer in Symbols:    \( \bar{X} = \mu \)

Contributors and Attributions

Foster et al.  (University of Missouri-St. Louis, Rice University, & University of Houston, Downtown Campus)

Dr. MO ( Taft College )

Null hypothesis significance tests. A mix-up of two different theories: the basis for widespread confusion and numerous misinterpretations

  • Published: 25 February 2014
  • Volume 102, pages 411–432 (2015)



  • Jesper W. Schneider 1  


The statistician cannot excuse himself from the duty of getting his head clear on the principles of scientific inference, but equally no other thinking man can avoid a like obligation (Fisher 1951 , p. 2)

Null hypothesis statistical significance tests (NHST) are widely used in quantitative research in the empirical sciences including scientometrics. Nevertheless, since their introduction nearly a century ago, significance tests have been controversial. Many researchers are not aware of the numerous criticisms raised against NHST. As practiced, NHST has been characterized as a ‘null ritual’ that is overused and too often misapplied and misinterpreted. NHST is in fact a patchwork of two fundamentally different classical statistical testing models, often blended with some wishful quasi-Bayesian interpretations. This is undoubtedly a major reason why NHST is very often misunderstood. But NHST also has intrinsic logical problems and the epistemic range of the information provided by such tests is much more limited than most researchers recognize. In this article we introduce to the scientometric community the theoretical origins of NHST, which are mostly absent from standard statistical textbooks, and we discuss some of the most prevalent problems relating to the practice of NHST and trace these problems back to the mix-up of the two different theoretical origins. Finally, we illustrate some of the misunderstandings with examples from the scientometric literature and bring forward some modest recommendations for a more sound practice in quantitative data analysis.



Notice that other hypotheses to be nullified, such as directional, non-zero, or interval estimates, are possible but seldom used; hence the ‘null ritual’.

Statistical power is the probability of rejecting H 0 when it is false (Cohen 1988 ). Statistical power is affected by α and β levels, the size of the effect and the size of the sample used to detect it. These elements make it possible to define the probability density function for the alternative hypothesis.
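The footnote's point, that power depends on the α level, the effect size, and the sample size, can be checked by simulation. A sketch (not from the article), assuming a two-tailed one-sample z-test with known sigma = 1, so effect sizes are in standard-deviation units as in Cohen (1988):

```python
import math
import random
import statistics

def simulated_power(effect_size, n, alpha=0.05, n_sims=2000, seed=1):
    """Monte Carlo estimate of statistical power: the fraction of
    simulated studies (true mean = effect_size, sigma = 1) in which a
    two-tailed one-sample z-test against mu0 = 0 rejects H0 at alpha."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_sims):
        sample = [rng.gauss(effect_size, 1.0) for _ in range(n)]
        z = statistics.mean(sample) * math.sqrt(n)  # (x_bar - 0) / (1/sqrt(n))
        p = math.erfc(abs(z) / math.sqrt(2))
        if p < alpha:
            rejections += 1
    return rejections / n_sims

# Power grows with sample size for a fixed medium effect
print(simulated_power(effect_size=0.5, n=30))
print(simulated_power(effect_size=0.5, n=100))
```

With a zero effect size the rejection rate falls back to roughly alpha, i.e., the Type I error rate, which is a handy sanity check on the simulation.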

E.g., a sampling design where one either chooses to toss a coin until it produces a pre-specified pattern, or instead doing a pre-specified number of tosses. The results can be identical, but the p values will be different.

For example, the instructions to authors in the journal Epidemiology reads “We strongly discourage the use of p values and language referring to statistical significance” ( http://edmgr.ovid.com/epid/accounts/ifauth.htm ).

Abelson, R. P. (1997). On the surprising longevity of flogged horses: Why there is a case for the significance test. Psychological Science, 8 (1), 12–15.


American Psychological Association. (2010). Publication Manual of the APA (6th ed.). Washington, DC: APA.

Anderson, D. R. (2008). Model based inference in the life sciences: A primer on evidence . New York: Springer.

Anderson, D. R., Burnham, K. P., & Thompson, W. L. (2000). Null hypothesis testing: Problems, prevalence, and an alternative. Journal of Wildlife Management, 64 , 912–923.

Armstrong, J. S. (2007). Significance tests harm progress in forecasting. International Journal of Forecasting, 23 (2), 321–327.

Armstrong, J. S. (2012). Illusions in regression analysis. International Journal of Forecasting, 28 (3), 689–694.

Beninger, P. G., Boldina, I., & Katsanevakis, S. (2012). Strengthening statistical usage in marine ecology. Journal of Experimental Marine Biology and Ecology, 426 , 97–108.

Berger, J. O., & Berry, D. A. (1988). Statistical analysis and the illusion of objectivity. American Scientist, 76 (2), 159–165.

Berger, J. O., & Sellke, T. (1987). Testing a point null hypothesis—The irreconcilability of p -values and evidence. Journal of the American Statistical Association, 82 (397), 112–122.


Berk, R. A., & Freedman, D. A. (2003). Statistical assumptions as empirical commitments. In T. G. Blomberg & S. Cohen (Eds.), Law, punishment, and social control: Essays in honor of Sheldon Messinger (pp. 235–254). New York: Aldine.

Berk, R. A., Western, B., & Weiss, R. E. (1995). Statistical inference for apparent populations. Sociological Methodology, 25 , 421–458.

Berkson, J. (1938). Some difficulties of interpretation encountered in the application of the Chi square test. Journal of the American Statistical Association, 33 (203), 526–536.


Berkson, J. (1942). Tests of significance considered as evidence. Journal of the American Statistical Association, 37 (219), 325–335.

Boring, E. G. (1919). Mathematical versus scientific significance. Psychological Bulletin, 16 , 335–338.

Bornmann, L., & Leydesdorff, L. (2013). Statistical tests and research assessments: A comment on Schneider (2012). Journal of the American Society for Information Science and Technology, 64 (6), 1306–1308.

Carver, R. P. (1978). The case against statistical significance testing. Harvard Educational Review, 48 (3), 378–399.


Chow, S. L. (1998). Précis of Statistical significance: Rationale, validity, and utility. Behavioral and Brain Sciences, 2 , 169–239.

Clark, C. A. (1963). Hypothesis testing in relation to statistical methodology. Review of Educational Research, 33 , 455–473.

Cohen, J. (1988). Statistical power analysis for the behavioral sciences (2nd ed.). Hillsdale, NJ: Lawrence Erlbaum.

Cohen, J. (1990). Things I have learned (so far). American Psychologist, 45 (12), 1304–1312.

Cohen, J. (1994). The earth is round ( p  < 0.05). American Psychologist, 49 (12), 1003–1007.

Cortina, J. M., & Dunlap, W. P. (1997). On the logic and purpose of significance testing. Psychological Methods, 2 (2), 161–172.

Cumming, G. (2012). Understanding the new statistics. Effect sizes, confidence intervals, and meta-analysis . New York: Routledge.

Dixon, P., & O’Reilly, T. (1999). Scientific versus statistical inference. Canadian Journal of Experimental Psychology-Revue Canadienne De Psychologie Experimentale, 53 (2), 133–149.

Ellis, P. D. (2010). The essential guide to effect sizes: Statistical power, meta-analysis, and the interpretation of research results . Cambridge: Cambridge University Press.

Falk, R., & Greenbaum, C. W. (1995). Significance tests die hard. Theory and Psychology, 5 , 396–400.

Fisher, R. A. (1925). Statistical methods for research workers (1st ed.). London: Oliver & Boyd.

Fisher, R. A. (1935a). The design of experiments (1st ed.). Edinburgh: Oliver & Boyd.

Fisher, R. A. (1935b). Statistical tests. Nature, 136 , 474.

Fisher, R. A. (1935c). The logic of inductive inference. Journal of the Royal Statistical Society, 98 , 71–76.

Fisher, R. A. (1951). The design of experiments (6th ed.). Edinburgh: Oliver & Boyd.

Fisher, R. A. (1955). Statistical methods and scientific induction. Journal of the Royal Statistical Society B, 17 , 69–78.

Fisher, R. A. (1956). Statistical methods and scientific inference . London: Oliver & Boyd.

Frick, R. W. (1996). The appropriate use of null hypothesis testing. Psychological Methods, 1 (4), 379–390.

Gelman, A., Carlin, J. B., Stern, H. S., & Rubin, D. B. (2004). Bayesian data analysis . Boca Raton: Chapman & Hall/CRC.

Gelman, A., & Stern, H. (2006). The difference between “significant” and “not significant” is not itself statistically significant. The American Statistician, 60 (4), 328–331.

Gigerenzer, G. (1993). The superego, the ego, and the id in statistical reasoning. In G. Keren & C. Lewis (Eds.), A handbook for data analysis in the behavioral sciences: methodological issues (pp. 311–339). Hillsdale: Erlbaum.

Gigerenzer, G. (2004). Mindless statistics. The Journal of Socio-Economics, 33 (5), 587–606.

Gigerenzer, G., Swijtink, Z., Porter, T., Daston, L., Beatty, J., & Kruger, L. (1989). The empire of chance: How probability changed science and everyday life . New York: Cambridge University Press.

Gill, J. (2007). Bayesian methods: A social and behavioral sciences approach (2nd ed.). Boca Raton: Chapman & Hall/CRC.

Glass, G. (2006). Meta-analysis: The quantitative synthesis of research findings. In J. L. Green, G. Camilli, & P. B. Elmore (Eds.), Handbook of Complementary Methods in Education Research . Mahwah, NJ: Lawrence Erlbaum.

Good, I. J. (1950). Probability and the weighing of evidence . London: Griffin.

Goodman, S. N. (1993). P values, hypothesis tests, and likelihood: Implications for epidemiology of a neglected historical debate. American Journal of Epidemiology, 137 (5), 485–496.

Goodman, S. N. (1999a). Toward evidence-based medical statistics. 1: The P value fallacy. Annals of Internal Medicine, 130 (12), 995–1004.

Goodman, S. N. (1999b). Toward evidence-based medical statistics. 2: The Bayes factor. Annals of Internal Medicine, 130 (12), 1005–1013.

Goodman, S. N. (2003). Commentary: The P -value, devalued. International Journal of Epidemiology, 32 (5), 699–702.

Goodman, S. N. (2008). A dirty dozen: Twelve P -value misconceptions. Seminars in Hematology, 45 (3), 135–140.

Goodman, S. N., & Greenland, S. (2007). Why most published research findings are false: Problems in the analysis. PLoS Medicine, 4 (4), e168.

Greenland, S. (1990). Randomization, statistics, and causal Inference. Epidemiology, 1 (6), 421–429.

Greenland, S., & Poole, C. (2013). Living with statistics in observational research. Epidemiology, 24 (1), 73–78.

Hacking, I. (1965). Logic of statistical inference . Cambridge: Cambridge University Press.

Haller, H., & Krauss, S. (2002). Misinterpretations of significance: A problem students share with their teachers. Methods of Psychological Research, 7 (1), 1–20.

Harlow, L. L., Muliak, S. A., & Steiger, J. H. (Eds.). (1997). What if there were no significance tests? . Mahwah: Lawrence Erlbaum.

Hubbard, R. (2004). Alphabet soup: Blurring the distinctions between p’s and a’s in psychological research. Theory and Psychology, 14 (3), 295–327.

Hubbard, R., & Armstrong, J. S. (2006). Why we don’t really know what statistical significance means: Implications for educators. Journal of Marketing Education, 28 (2), 114–120.

Hubbard, R., & Bayarri, M. J. (2003). Confusion over measures of evidence (p’s) versus errors (α’s) in classical statistical testing. American Statistician, 57 (3), 171–178.

Hubbard, R., & Lindsay, R. M. (2008). Why P values are not a useful measure of evidence in statistical significance testing. Theory and Psychology, 18 (1), 69–88.

Hubbard, R., & Ryan, P. A. (2000). The historical growth of statistical significance testing in psychology and its future prospects. Educational and Psychological Measurement, 60 , 661–681.

Hunter, J. E. (1997). Needed: A ban on the significance test. Psychological Science, 8 , 3–7.

Hurlbert, S. H., & Lombardi, C. M. (2009). Final collapse of the Neyman–Pearson decision theoretic framework and rise of the neoFisherian. Annales Zoologici Fennici, 46 (5), 311–349.

Ioannidis, J. P. A. (2005). Why most published research findings are false. PLoS Medicine, 2 (8), 696–701.

Jeffreys, H. (1939). The theory of probability (1st ed.). Oxford: Oxford University Press.

Jeffreys, H. (1961). The theory of probability (3rd ed.). Oxford: Oxford University Press.

Kirk, R. E. (1996). Practical significance: A concept whose time has come. Educational and Psychological Measurement, 56 (5), 746–759.

Kline, R. B. (2004). Beyond significance testing: reforming data analysis methods in behavioral research . Washington, DC: American Psychological Association.

Kline, R. B. (2013). Beyond significance testing: reforming data analysis methods in behavioral research (2nd ed.). Washington, DC: American Psychological Association.

Krämer, W., & Gigerenzer, G. (2005). How to confuse with statistics or: The use and misuse of conditional probabilities. Statistical Science, 20 (3), 223–230.

Kruschke, J. K. (2010). What to believe: Bayesian methods for data analysis. Trends in Cognitive Sciences, 14 (7), 293–300.

Lehmann, E. L. (1993). The Fisher, Neyman–Pearson theories of testing hypotheses: One theory or two? Journal of the American Statistical Association, 88 (424), 1242–1249.

Leydesdorff, L. (2013). Does the specification of uncertainty hurt the progress of scientometrics? Journal of Informetrics, 7 (2), 292–293.

Lindley, D. (1957). A statistical paradox. Biometrika, 44 , 187–192.

Ludwig, D. A. (2005). Use and misuse of p -values in designed and observational studies: Guide for researchers and reviewers. Aviation, Space and Environmental Medicine, 76 (7), 675–680.

Lykken, D. T. (1968). Statistical significance in psychological research. Psychological Bulletin, 70 (3, Part 1), 151–159.

Mayo, D. (1996). Error and the growth of experimental knowledge . Chicago University Press: Chicago, IL.

Mayo, D. (2006). Philosophy of Statistics. In S. Sarkar & J. Pfeifer (Eds.), The philosophy of science: An encyclopedia (pp. 802–815). London: Routledge.

Meehl, P. E. (1978). Theoretical risks and tabular asterisks: Sir Karl, Sir Ronald, and the slow progress of soft psychology. Journal of Consulting and Clinical Psychology, 46 , 806–834.

Meehl, P. E. (1990). Appraising and amending theories: the strategy of Lakatosian defense and two principles that warrant it. Psychological Inquiry, 1 , 108–141.

Morrison, D. E., & Henkel, R. E. (Eds.). (1970). The significance test controversy . Chicago: Aldine.

Neyman, J. (1937). Outline of a theory of statistical estimation based on the classical theory of probability. Philosophical Transactions of the Royal Society A, 236 , 333–380.

Neyman, J., & Pearson, E. S. (1928). On the use and interpretation of certain test criteria of statistical inference, part I. Biometrika, 20A , 175–240.

Neyman, J., & Pearson, E. S. (1933a). On the problem of the most efficient test of statistical hypotheses. Philosophical Transactions of the Royal Society of London A, 231 , 289–337.

Neyman, J., & Pearson, E. S. (1933b). The testing of statistical hypotheses in relation to probabilituies a priori. Proceedings of the Cambridge Philosophical Society, 29 , 492–510.

Nickerson, R. S. (2000). Null hypothesis significance testing: a review of an old and continuing controversy. Psychological Methods, 5 (2), 241–301.

Oakes, M. (1986). Statistical inference: A commentary for the social and behavioral sciences . New York: Wiley.

Pollard, P., & Richardson, J. T. E. (1987). On the probability of making Type I errors. Psychological Bulletin, 102 , 159–163.

Rosnow, R. L., & Rosenthal, R. (1989). Statistical procedures and the justification of knowledge in psychological science. American Psychologist, 44 (10), 1276–1284.

Royall, R. M. (1997). Statistical evidence: A likelihood paradigm . London: Chapman & Hall.

Rozeboom, W. W. (1960). The fallacy of the null-hypothesis significance test. Psychological Bulletin, 57 (5), 416–428.

Scarr, S. (1997). Rules of evidence: A larger context for the statistical debate. Psychological Science, 8 , 16–17.

Schneider, J. W. (2012). Testing university rankings statistically: Why this perhaps is not such a good idea after all. Some reflections on statistical power, effect size, random sampling and imaginary populations. In É. Archambault, Y. Gingras, & V. Larivière (Eds.), Proceedings of the 17th international conference on science and technology indicators, Montreal . Retrieved, from http://2012.sticonference.org/Proceedings/vol2/Schneider_Testing_719.pdf .

Schneider, J. W. (2013). Caveats for using statistical significance tests in research assessments. Journal of Informetrics, 7 (1), 50–62.

Schneider, A. L., & Darcy, R. E. (1984). Policy implications of using significance tests in evaluation research. Evaluation Review, 8 (4), 573–582.

Schrodt, P. A. (2006). Beyond the linear frequentist orthodoxy. Political Analysis, 14 (3), 335–339.

Schwab, A., Abrahamson, E., Starbuck, W. H., & Fidler, F. (2011). Researchers should make thoughtful assessments instead of null-hypothesis significance tests. Organization Science, 22 (4), 1105–1120.

Sellke, T., Bayarri, M. J., & Berger, J. O. (2001). Calibration of rho values for testing precise null hypotheses. The American Statistician, 55 , 62–71.

Simmons, J. P., Nelson, L. D., & Simonsohn, U. (2011). False-positive psychology: undisclosed flexibility in data collection and analysis allows presenting anything as significant. Psychological Science, 22 (11), 1359–1366.

Spielman, S. (1974). The logic of tests of significance. Philosophy of Science, 41 , 211–226.

Starbuck, W. H. (2006). The production of knowledge: The challenge of social science research . Oxford: Oxford University Press.

Taagepera, R. (2008). Making social sciences more scientific: The need for predictive models . Oxford: Oxford University Press.

Tukey, J. W. (1977). Exploratory data analysis . Reading: Addison-Wesley.

Tukey, J. W. (1991). The philosophy of multiple comparisons. Statistical Science, 6 (1), 100–116.

Wagenmakers, E. J. (2007). A practical solution to the pervasive problem of p values. Psychonomic Bulletin & Review, 14 (5), 779–804.

Webster, E. J., & Starbuck, W. H. (1988). Theory building in industrial and organizational psychology. In C. L. Cooper & I. Robertson (Eds.), International review of industrial and organizational psychology (pp. 93–138). London: Wiley.

Wetzels, R., Matzke, D., Lee, M. D., Rouder, J. N., Iverson, G. J., & Wagenmakers, E.-J. (2011). Statistical evidence in experimental psychology: An empirical comparison using 855 t tests. Perspectives on Psychological Science, 6 (3), 291–298.

Wilkinson, L., & Task Force on Statistical Inference, APA Board on Scientific Affairs (1999). Statistical methods in psychology journals - Guidelines and explanations. American Psychologist, 54 (8), 594–604.

Ziliak, S. T., & McCloskey, D. N. (2008). The cult of statistical significance: How the standard error costs us jobs, justice, and lives . Ann Arbor: The University of Michigan Press.

Download references

Author information

Authors and Affiliations

Department of Political Science & Government, Danish Centre for Studies in Research and Research Policy, Aarhus University, Bartholins Allé 7, 8000, Aarhus C, Denmark

Jesper W. Schneider


Corresponding author

Correspondence to Jesper W. Schneider .

About this article

Schneider, J.W. Null hypothesis significance tests. A mix-up of two different theories: the basis for widespread confusion and numerous misinterpretations. Scientometrics 102 , 411–432 (2015). https://doi.org/10.1007/s11192-014-1251-5

Received : 03 February 2014

Published : 25 February 2014

Issue Date : January 2015

Share this article

Anyone you share the following link with will be able to read this content:

Sorry, a shareable link is not currently available for this article.

Provided by the Springer Nature SharedIt content-sharing initiative

Keywords

  • Null hypothesis significance test
  • Fisher’s significance test
  • Neyman–Pearson’s hypothesis test
  • Statistical inference
  • Scientometrics



  • Teesside University Student & Library Services
  • Learning Hub Group

Quantitative data collection and analysis


Testing Hypotheses

  • What is a hypothesis?
  • Significance testing
  • One-tailed or two-tailed?
  • Degrees of freedom

A hypothesis is a statement that we are trying to prove or disprove. It is used to express the relationship between variables and whether this relationship is significant. It is specific and offers a prediction about the results of your research question.

Your research question will lead you to develop a hypothesis; this is why your research question needs to be specific and clear.

The hypothesis will then guide you to the most appropriate techniques to use to answer the question. Hypotheses reflect the literature and theories on which you are basing them. They need to be testable (i.e. measurable and practical).

Null hypothesis (H 0 ) is the proposition that there will not be a relationship between the variables you are looking at (i.e. any differences are due to chance). It always refers to the population. (Usually we don't believe this to be true.)

e.g. There is no difference in instances of illegal drug use between teenagers who are members of a gang and those who are not.

Alternative hypothesis (H A or H 1 ): this is sometimes called the research hypothesis or experimental hypothesis. It is the proposition that there will be a relationship. It is a statement of inequality between the variables you are interested in. It always refers to the sample. It is usually a declaration rather than a question and is clear, to the point and specific.

e.g. The instances of illegal drug use by teenagers who are members of a gang are different from the instances of illegal drug use by teenagers who are not gang members.

A non-directional research hypothesis - reflects an expected difference between groups but does not specify the direction of this difference (see two-tailed test).

A directional research hypothesis - reflects an expected difference between groups but does specify the direction of this difference. (see one-tailed test)

e.g. The instances of illegal drug use by teenagers who are members of a gang will be higher than the instances of illegal drug use by teenagers who are not gang members.

The process of testing is then used to ascertain which hypothesis to believe.

It is usually easier to prove something is untrue than to prove it is true, so looking at the null hypothesis is the usual starting point.

The process of examining the null hypothesis in light of evidence from the sample is called significance testing. It is a way of establishing a range of values within which we can decide whether to reject the null hypothesis.

The debate over hypothesis testing

There has been discussion over whether the scientific method employed in traditional hypothesis testing is appropriate.  

See below for some articles that discuss this:

  • Gill, J. (1999) 'The insignificance of null hypothesis significance testing', Political Research Quarterly, 52(3), pp. 647-674.
  • Wainer, H. and Robinson, D.H. (2003) 'Shaping up the practice of null hypothesis significance testing', Educational Researcher, 32(7), pp. 22-30.
  • Ferguson, C.J. and Heene, M. (2012) 'A vast graveyard of undead theories: publication bias and psychological science's aversion to the null', Perspectives on Psychological Science, 7(6), pp. 555-561.

Taken from: Salkind, N.J. (2017)  Statistics for people who (think they) hate statistics. 6th edn. London: SAGE pp. 144-145.

  • Null hypothesis - a simple introduction (SPSS)

A significance level defines the threshold at which your sample evidence is considered to contradict your null hypothesis, so that you can then reject it. It is the probability of rejecting the null hypothesis when it is really true.

e.g. a significance level of 0.05 indicates that there is a 5% (or 1 in 20) risk of deciding that there is an effect when in fact there is none.

The lower the significance level you set, the stronger the evidence from the sample has to be before you can reject the null hypothesis.

N.B.  - it is important that you set the significance level before you carry out your study and analysis.

Using Confidence Intervals

It is possible to test the significance of your null hypothesis using confidence intervals (see under the Samples and population tab).

If the value predicted by the null hypothesis lies outside the confidence interval, we can reject the null hypothesis and accept the alternative hypothesis.
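As a minimal sketch of this check (all numbers are made up for illustration):

```python
def ci_contains_null(sample_mean, standard_error, null_value, z_crit=1.96):
    """Build an approximate 95% confidence interval around the sample
    mean; if the null-hypothesis value falls outside it, reject H0."""
    lower = sample_mean - z_crit * standard_error
    upper = sample_mean + z_crit * standard_error
    return lower <= null_value <= upper, (lower, upper)

# Hypothetical numbers: sample mean 5.2, standard error 0.5, and a
# null hypothesis that the population mean is 4.0.
inside, (lower, upper) = ci_contains_null(5.2, 0.5, 4.0)
print(inside)  # False: 4.0 lies outside (4.22, 6.18), so reject H0
```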

The test statistic

Another commonly used approach is to calculate a test statistic:

  • Write down your null and alternative hypotheses
  • Find the sample statistic (e.g. the mean of your sample)
  • Calculate the test statistic z score (see under Measures of spread or dispersion and Statistical tests - parametric). In this case the sample mean is compared to the population mean (assumed from the null hypothesis), and the standard error (see under Samples and population) is used rather than the standard deviation.
  • Compare the test statistic with the critical values (e.g. plus or minus 1.96 for 5% significance)
  • Draw a conclusion about the hypotheses - does the calculated z value lie in the critical region, i.e. above 1.96 or below -1.96? If it does, we can reject the null hypothesis. This indicates that the results are significant (an effect has been detected): if there were no difference in the population, then the result you have observed would be highly unlikely, so you can reject the null hypothesis.
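The steps above can be sketched in code. The numbers here are invented for illustration; only the standard library is needed:

```python
import math

def z_test(sample_mean, pop_mean, pop_sd, n, critical=1.96):
    """Steps 2-5 above: compute the z test statistic using the
    standard error, then compare it with the critical value."""
    standard_error = pop_sd / math.sqrt(n)
    z = (sample_mean - pop_mean) / standard_error
    return z, abs(z) > critical  # True means reject H0 at the 5% level

# Hypothetical data: a sample of 100 with mean 103, against a population
# mean of 100 and a population standard deviation of 10.
z, reject = z_test(sample_mean=103, pop_mean=100, pop_sd=10, n=100)
print(z, reject)  # 3.0 True -> z lies above 1.96, so reject H0
```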


Type I error - this is the chance of wrongly rejecting the null hypothesis even though it is actually true, e.g. by using a 5% p level you would expect the null hypothesis to be rejected about 5% of the time when it is true. You could set a more stringent p level such as 1% (or 1 in 100) to be more certain of avoiding a Type I error. This, however, makes the other type of error (Type II) more likely.

Type II error - this is where there is a real effect, but the p value you obtain is non-significant, so you fail to detect the effect.
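A quick simulation makes the Type I error rate concrete. The sketch below, using only the standard library, repeatedly compares two samples drawn from the same population (so the null hypothesis is really true) and counts how often a 5% test wrongly rejects it; the rate comes out close to 0.05:

```python
import math
import random

def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

random.seed(42)
n, trials, alpha = 100, 1000, 0.05
false_positives = 0
for _ in range(trials):
    # Both groups come from the same N(0, 1) population: H0 is true.
    a = [random.gauss(0, 1) for _ in range(n)]
    b = [random.gauss(0, 1) for _ in range(n)]
    se = math.sqrt(1 / n + 1 / n)            # known SD of 1
    z = (sum(a) / n - sum(b) / n) / se
    p = 2 * (1 - normal_cdf(abs(z)))
    if p < alpha:
        false_positives += 1                 # a Type I error

print(false_positives / trials)  # roughly 0.05
```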

  • Statistical significance - what does it really mean?
  • Statistical tables

One-tailed tests - where we know in which direction (e.g. larger or smaller) the difference between sample and population will be. It is a directional hypothesis.

Two-tailed tests - where we are looking at whether there is a difference between sample and population. This difference could be larger or smaller. This is a non-directional hypothesis.

If the difference is in the direction you have predicted (i.e. a one-tailed test), it is easier to get a significant result, though there are arguments against using a one-tailed test (Wright and London, 2009, pp. 98-99)*

*Wright, D. B. & London, K. (2009)  First (and second) steps in statistics . 2nd edn. London: SAGE.

N.B. - think of the ‘tails’ as the regions at the far ends of a normal distribution. For a two-tailed test with a significance level of 0.05, then 0.025 of the values would be at one end of the distribution and the other 0.025 would be at the other end. It is the values in these ‘critical’ extreme regions where we can think about rejecting the null hypothesis and claim that there has been an effect.

Degrees of freedom (df) is a rather difficult mathematical concept, but is needed to calculate the significance of certain statistical tests, such as the t-test, ANOVA and chi-squared test.

It is broadly defined as the number of "observations" (pieces of information) in the data that are free to vary when estimating statistical parameters. (Taken from Minitab Blog ).

The higher the degrees of freedom, the more powerful and precise your estimates of the parameter (population) will be.

Typically, for a 1-sample t-test it is considered as the number of values in your sample minus 1.

For chi-squared tests with a table of rows and columns the rule is:

(Number of rows minus 1) times (number of columns minus 1)
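The two rules just given translate directly into code:

```python
def one_sample_t_df(n):
    # number of values in the sample minus 1
    return n - 1

def chi_square_df(rows, cols):
    # (number of rows minus 1) times (number of columns minus 1)
    return (rows - 1) * (cols - 1)

print(one_sample_t_df(10))   # 9
print(chi_square_df(3, 4))   # 6
```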

An accessible example to illustrate the principle of degrees of freedom, using chocolates:

  • You have seven chocolates in a box, each being a different type, e.g. truffle, coffee cream, caramel cluster, fudge, strawberry dream, hazelnut whirl, toffee. 
  • You are being good and intend to eat only one chocolate each day of the week.
  • On the first day, you can choose to eat any one of the 7 chocolate types  - you have a choice from all 7.
  • On the second day, you can choose from the 6 remaining chocolates, on day 3 you can choose from 5 chocolates, and so on.
  • On the sixth day you have a choice of the remaining 2 chocolates you haven't eaten that week.
  • However on the seventh day - you haven't really got any choice of chocolate - it has got to be the one you have left in your box.
  • You had 7-1 = 6 days of “chocolate” freedom—in which the chocolate you ate could vary!
  • Last Updated: Jan 9, 2024 11:01 AM
  • URL: https://libguides.tees.ac.uk/quantitative

Neag School of Education

Educational Research Basics by Del Siegle

Null and Alternative Hypotheses

Converting research questions into hypotheses is a simple task. Take the question and make it a positive statement that says a relationship exists (correlational studies) or a difference exists between the groups (experimental studies), and you have the alternative hypothesis. Write the statement so that a relationship does not exist or a difference does not exist, and you have the null hypothesis. You can reverse the process if you have a hypothesis and wish to write a research question.

When you are comparing two groups, the groups are the independent variable. When you are testing whether something affects something else, the cause is the independent variable. The independent variable is the one you manipulate.

Teachers given higher pay will have more positive attitudes toward children than teachers given lower pay. The first step is to ask yourself “Are there two or more groups being compared?” The answer is “Yes.” What are the groups? Teachers who are given higher pay and teachers who are given lower pay. The independent variable is teacher pay. The dependent variable (the outcome) is attitude toward children.

You could also approach it another way. “Is something causing something else?” The answer is “Yes.” What is causing what? Teacher pay is causing attitude toward children. Therefore, teacher pay is the independent variable (cause) and attitude toward children is the dependent variable (outcome).

By tradition, we try to disprove (reject) the null hypothesis. We can never prove a null hypothesis, because it is impossible to prove something does not exist. We can disprove something does not exist by finding an example of it. Therefore, in research we try to disprove the null hypothesis. When we do find that a relationship (or difference) exists then we reject the null and accept the alternative. If we do not find that a relationship (or difference) exists, we fail to reject the null hypothesis (and go with it). We never say we accept the null hypothesis because it is never possible to prove something does not exist. That is why we say that we failed to reject the null hypothesis, rather than we accepted it.

Del Siegle, Ph.D. Neag School of Education – University of Connecticut [email protected] www.delsiegle.com

Academic Success Center

Statistics Resources


Once you have developed a clear and focused research question or set of research questions, you’ll be ready to conduct further research, a literature review, on the topic to help you make an educated guess about the answer to your question(s). This educated guess is called a hypothesis.

In research, there are two types of hypotheses: null and alternative. They work as a complementary pair, each stating that the other is wrong.

  • Null Hypothesis (H 0 ) – This can be thought of as the implied hypothesis. “Null” meaning “nothing.”  This hypothesis states that there is no difference between groups or no relationship between variables. The null hypothesis is a presumption of status quo or no change.
  • Alternative Hypothesis (H a ) – This is also known as the claim. This hypothesis should state what you expect the data to show, based on your research on the topic. This is your answer to your research question.

Null Hypothesis:   H 0 : There is no difference in the salary of factory workers based on gender. Alternative Hypothesis :  H a : Male factory workers have a higher salary than female factory workers.

Null Hypothesis :  H 0 : There is no relationship between height and shoe size. Alternative Hypothesis :  H a : There is a positive relationship between height and shoe size.

Null Hypothesis :  H 0 : Experience on the job has no impact on the quality of a brick mason’s work. Alternative Hypothesis :  H a : The quality of a brick mason’s work is influenced by on-the-job experience.

  • Last Updated: Apr 19, 2024 3:09 PM
  • URL: https://resources.nu.edu/statsresources


  • HCA Healthc J Med
  • v.1(2); 2020
  • PMC10324782

Introduction to Research Statistical Analysis: An Overview of the Basics

Christian Vandever

1 HCA Healthcare Graduate Medical Education

Description

This article covers many statistical ideas essential to research statistical analysis. Sample size is explained through the concepts of statistical significance level and power. Variable types and definitions are included to clarify necessities for how the analysis will be interpreted. Categorical and quantitative variable types are defined, as well as response and predictor variables. Statistical tests described include t-tests, ANOVA and chi-square tests. Multiple regression is also explored for both logistic and linear regression. Finally, the most common statistics produced by these methods are explored.

Introduction

Statistical analysis is necessary for any research project seeking to make quantitative conclusions. The following is a primer for research-based statistical analysis. It is intended to be a high-level overview of appropriate statistical testing, while not diving too deep into any specific methodology. Some of the information is more applicable to retrospective projects, where analysis is performed on data that has already been collected, but most of it will be suitable to any type of research. This primer will help the reader understand research results in coordination with a statistician, not to perform the actual analysis. Analysis is commonly performed using statistical programming software such as R, SAS or SPSS. These allow for analysis to be replicated while minimizing the risk for an error. Resources are listed later for those working on analysis without a statistician.

After coming up with a hypothesis for a study, including any variables to be used, one of the first steps is to think about the patient population to apply the question. Results are only relevant to the population that the underlying data represents. Since it is impractical to include everyone with a certain condition, a subset of the population of interest should be taken. This subset should be large enough to have power, which means there is enough data to deliver significant results and accurately reflect the study’s population.

The first statistics of interest are related to significance level and power, alpha and beta. Alpha (α) is the significance level and probability of a type I error, the rejection of the null hypothesis when it is true. The null hypothesis is generally that there is no difference between the groups compared. A type I error is also known as a false positive. An example would be an analysis that finds one medication statistically better than another, when in reality there is no difference in efficacy between the two. Beta (β) is the probability of a type II error, the failure to reject the null hypothesis when it is actually false. A type II error is also known as a false negative. This occurs when the analysis finds there is no difference in two medications when in reality one works better than the other. Power is defined as 1-β and should be calculated prior to running any sort of statistical testing. Ideally, alpha should be as small as possible while power should be as large as possible. Power generally increases with a larger sample size, but so does cost and the effect of any bias in the study design. Additionally, as the sample size gets bigger, the chance for a statistically significant result goes up even though these results can be small differences that do not matter practically. Power calculators include the magnitude of the effect in order to combat the potential for exaggeration and only give significant results that have an actual impact. The calculators take inputs like the mean, effect size and desired power, and output the required minimum sample size for analysis. Effect size is calculated using statistical information on the variables of interest. If that information is not available, most tests have commonly used values for small, medium or large effect sizes.
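The article mentions power calculators that turn alpha, power and effect size into a minimum sample size. As a rough illustration (not the tool described above), the normal-approximation formula for comparing two group means can be sketched with the standard library; the effect size here is Cohen's d:

```python
import math
from statistics import NormalDist

def sample_size_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate minimum n per group for a two-sample comparison of
    means, via the normal approximation: n = 2 * ((z_a + z_b) / d)^2."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided test
    z_beta = NormalDist().inv_cdf(power)
    return math.ceil(2 * ((z_alpha + z_beta) / effect_size) ** 2)

# A "medium" effect (d = 0.5) at alpha = 0.05 and 80% power:
print(sample_size_per_group(0.5))  # 63 per group
```

Exact calculators based on the t distribution give slightly larger answers (around 64 per group for this case), but the approximation shows how smaller effect sizes drive the required sample size up.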

When the desired patient population is decided, the next step is to define the variables previously chosen to be included. Variables come in different types that determine which statistical methods are appropriate and useful. One way variables can be split is into categorical and quantitative variables. ( Table 1 ) Categorical variables place patients into groups, such as gender, race and smoking status. Quantitative variables measure or count some quantity of interest. Common quantitative variables in research include age and weight. An important note is that there can often be a choice for whether to treat a variable as quantitative or categorical. For example, in a study looking at body mass index (BMI), BMI could be defined as a quantitative variable or as a categorical variable, with each patient’s BMI listed as a category (underweight, normal, overweight, and obese) rather than the discrete value. The decision whether a variable is quantitative or categorical will affect what conclusions can be made when interpreting results from statistical tests. Keep in mind that since quantitative variables are treated on a continuous scale it would be inappropriate to transform a variable like which medication was given into a quantitative variable with values 1, 2 and 3.

Categorical vs. Quantitative Variables

Both of these types of variables can also be split into response and predictor variables. ( Table 2 ) Predictor variables are explanatory, or independent, variables that help explain changes in a response variable. Conversely, response variables are outcome, or dependent, variables whose changes can be partially explained by the predictor variables.

Response vs. Predictor Variables

Choosing the correct statistical test depends on the types of variables defined and the question being answered. The appropriate test is determined by the variables being compared. Some common statistical tests include t-tests, ANOVA and chi-square tests.

T-tests compare whether there are differences in a quantitative variable between two values of a categorical variable. For example, a t-test could be useful to compare the length of stay for knee replacement surgery patients between those that took apixaban and those that took rivaroxaban. A t-test could examine whether there is a statistically significant difference in the length of stay between the two groups. The t-test will output a p-value, a number between zero and one, which represents the probability that the two groups could be as different as they are in the data, if they were actually the same. A value closer to zero suggests that the difference, in this case for length of stay, is more statistically significant than a number closer to one. Prior to collecting the data, set a significance level, the previously defined alpha. Alpha is typically set at 0.05, but is commonly reduced in order to limit the chance of a type I error, or false positive. Going back to the example above, if alpha is set at 0.05 and the analysis gives a p-value of 0.039, then a statistically significant difference in length of stay is observed between apixaban and rivaroxaban patients. If the analysis gives a p-value of 0.91, then there was no statistical evidence of a difference in length of stay between the two medications. Other statistical summaries or methods examine how big of a difference that might be. These other summaries are known as post-hoc analysis since they are performed after the original test to provide additional context to the results.
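To make the mechanics concrete, here is a small hand-rolled sketch of the pooled two-sample t statistic with invented length-of-stay data; in practice this would be done in software such as R, SAS or SPSS, which also report the exact p-value. With 5 + 5 - 2 = 8 degrees of freedom, the two-sided 5% critical value is 2.306:

```python
import math
from statistics import mean, variance

def two_sample_t(a, b):
    """Student's t statistic for two independent samples, assuming
    equal variances (pooled standard deviation)."""
    na, nb = len(a), len(b)
    sp2 = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    se = math.sqrt(sp2 * (1 / na + 1 / nb))
    return (mean(a) - mean(b)) / se

# Hypothetical length of stay in days for two medication groups
apixaban    = [3, 4, 5, 4, 3]
rivaroxaban = [5, 6, 7, 6, 5]

t = two_sample_t(apixaban, rivaroxaban)
print(round(t, 2), abs(t) > 2.306)  # -3.78 True -> reject H0 at 5%
```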

Analysis of variance, or ANOVA, tests can observe mean differences in a quantitative variable between values of a categorical variable, typically with three or more values to distinguish from a t-test. ANOVA could add patients given dabigatran to the previous population and evaluate whether the length of stay was significantly different across the three medications. If the p-value is lower than the designated significance level then the hypothesis that length of stay was the same across the three medications is rejected. Summaries and post-hoc tests also could be performed to look at the differences between length of stay and which individual medications may have observed statistically significant differences in length of stay from the other medications. A chi-square test examines the association between two categorical variables. An example would be to consider whether the rate of having a post-operative bleed is the same across patients provided with apixaban, rivaroxaban and dabigatran. A chi-square test can compute a p-value determining whether the bleeding rates were significantly different or not. Post-hoc tests could then give the bleeding rate for each medication, as well as a breakdown as to which specific medications may have a significantly different bleeding rate from each other.
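The chi-square mechanics can likewise be sketched by hand with invented counts. For a 2 x 3 table there are (2 - 1) * (3 - 1) = 2 degrees of freedom, for which the 5% critical value is 5.991:

```python
def chi_square(table):
    """Pearson chi-square statistic for a contingency table,
    given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand
            stat += (observed - expected) ** 2 / expected
    return stat

# Hypothetical post-operative bleed counts for three medications
#            apixaban  rivaroxaban  dabigatran
table = [[8,        12,          20],   # bleed
         [92,       88,          80]]   # no bleed
chi2 = chi_square(table)
print(round(chi2, 2), chi2 > 5.991)  # 6.46 True -> rates differ at 5%
```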

A slightly more advanced way of examining a question can come through multiple regression. Regression allows more predictor variables to be analyzed and can act as a control when looking at associations between variables. Common control variables are age, sex and any comorbidities likely to affect the outcome variable that are not closely related to the other explanatory variables. Control variables can be especially important in reducing the effect of bias in a retrospective population. Since retrospective data was not built with the research question in mind, it is important to eliminate threats to the validity of the analysis. Testing that controls for confounding variables, such as regression, is often more valuable with retrospective data because it can ease these concerns. The two main types of regression are linear and logistic. Linear regression is used to predict differences in a quantitative, continuous response variable, such as length of stay. Logistic regression predicts differences in a dichotomous, categorical response variable, such as 90-day readmission. So whether the outcome variable is categorical or quantitative, regression can be appropriate. An example for each of these types could be found in two similar cases. For both examples define the predictor variables as age, gender and anticoagulant usage. In the first, use the predictor variables in a linear regression to evaluate their individual effects on length of stay, a quantitative variable. For the second, use the same predictor variables in a logistic regression to evaluate their individual effects on whether the patient had a 90-day readmission, a dichotomous categorical variable. Analysis can compute a p-value for each included predictor variable to determine whether they are significantly associated. 
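As a minimal sketch of linear regression (logistic regression requires iterative fitting, so it is omitted here), the ordinary least-squares coefficients for invented length-of-stay data can be obtained with NumPy. All variable names and numbers below are hypothetical:

```python
import numpy as np

# Hypothetical data: predict length of stay (days) from age and a 0/1
# medication indicator, mirroring the linear-regression example above.
age = np.array([55, 60, 65, 70, 75, 80])
drug = np.array([0, 1, 0, 1, 0, 1])   # 0 = apixaban, 1 = rivaroxaban
los = np.array([3.0, 3.5, 3.4, 4.1, 3.9, 4.6])

# Design matrix: intercept column plus the two predictors.
X = np.column_stack([np.ones_like(age), age, drug])
coef, *_ = np.linalg.lstsq(X, los, rcond=None)
intercept, age_effect, drug_effect = coef

# Each coefficient is the estimated change in length of stay for a
# one-unit change in that predictor, holding the other constant.
print(round(float(age_effect), 3), round(float(drug_effect), 3))  # 0.05 0.383
```

A statistical package would additionally report a p-value per coefficient, which is how the significance of each predictor is judged.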
The statistical tests in this article generate a test statistic that determines the probability of obtaining results at least as extreme as those observed, given that there is no association between the compared variables. These results often come with coefficients, which give the direction and magnitude of the association, that is, the degree to which one variable changes with another. Most tests, including all those listed in this article, also provide confidence intervals, which give a range for the estimate at a specified level of confidence. Even when these tests do not give statistically significant results, the results are still important: failing to report non-significant findings creates a bias in research, because an idea can be tested enough times that statistically significant results are eventually reached by chance, even though no true effect exists. Conversely, with very large sample sizes p-values will almost always be significant. In that case the effect size is critical, since even the smallest, clinically meaningless differences can be found statistically significant.
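
The large-sample caveat is easy to demonstrate with simulated data: two groups that differ by a trivial 0.02 standard deviations will typically produce a "significant" t-test when each group has 200,000 observations, even though the standardized effect size (Cohen's d) is negligible. Everything below is simulated purely for illustration:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)

# Two groups whose true means differ by only 0.02 standard deviations.
a = rng.normal(0.00, 1.0, 200_000)
b = rng.normal(0.02, 1.0, 200_000)

t_stat, p = stats.ttest_ind(a, b)

# Cohen's d: mean difference divided by the pooled standard deviation.
pooled_sd = np.sqrt((a.var(ddof=1) + b.var(ddof=1)) / 2)
d = (b.mean() - a.mean()) / pooled_sd

print(f"p = {p:.2e}, Cohen's d = {d:.3f}")
```

The p-value alone would suggest a meaningful finding; the tiny effect size shows why both should be reported.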

These variables and tests are just some of the things to keep in mind before, during, and after the analysis process to make sure the statistical results actually address the questions being asked. The patient population, the types of variables, and the choice of statistical test are all important considerations in statistical analysis. Any results are only as useful as the process used to obtain them. This primer can serve as a reference to help ensure appropriate statistical analysis.

Funding Statement

This research was supported (in whole or in part) by HCA Healthcare and/or an HCA Healthcare affiliated entity.

Conflicts of Interest

The author declares he has no conflicts of interest.

Christian Vandever is an employee of HCA Healthcare Graduate Medical Education, an organization affiliated with the journal’s publisher.

The views expressed in this publication represent those of the author(s) and do not necessarily represent the official views of HCA Healthcare or any of its affiliated entities.

OPINION article

All animals are conscious in their own way: comparing the markers hypothesis with the universal consciousness hypothesis.

Angelica Kaufmann

  • Cognition in Action Lab, University of Milan, Milan, Italy

1 Introduction

Consciousness in non-human animals can be explored philosophically through two central questions: the distribution question, which enquires which animals are conscious, and the phenomenological question, which seeks to understand what the experiences of animals are like ( Allen and Trestman, 2024 ).

The distribution question is considered empirically tractable by those scientists who believe that markers, such as traits or behaviors, can be used to assess the presence of consciousness in animals. Bayne et al. (2024) have recently offered a version of this markers approach. They also aim to make the phenomenological question empirically tractable by targeting phenomenological experience through potential C-tests, with the aim of identifying conscious entities across a spectrum of beings, including humans, animals, and artificial systems. Andrews (2024) also advocates for the empirical tractability of the phenomenological question, and she indirectly criticizes the marker-based approach, highlighting its inadequacies in addressing the distribution question, namely, which animals are conscious. She argues for a paradigm shift that favors an inclusive presumption that all animals possess consciousness, challenging the premise that C-tests are needed to distinguish conscious from non-conscious entities.

Acknowledging the complexity of applying C-tests to non-human entities, Bayne et al. reference Dung and Newen (2023) , who propose a species-sensitive, two-tier account of animal consciousness, aiming to assess not just whether animals are conscious (the distribution question) but also how their conscious experiences differ (the phenomenological question). Both approaches highlight the diversity of conscious experiences in the animal kingdom and encourage ethical considerations regarding the treatment of other animal species.

Andrews does not engage with Dung and Newen directly. Her focus is on proposing a foundational shift in how we approach the study of animal consciousness, arguing for the assumption that all animals are conscious as a starting point for research. This contrasts with seeking specific markers or dimensions of consciousness, as the frameworks suggested by Bayne et al. and Dung and Newen do, or that of Birch et al. (2020) before them; instead, it questions the very methodologies we use to infer consciousness in non-human animals.

Bayne et al. champion the utilization of precise markers, or C-tests, to demarcate conscious entities. Their methodology, underscored by a commitment to scientific rigor, seeks to establish a clear boundary between conscious and non-conscious beings. This approach, whilst promising methodological clarity, may inadvertently overlook the intricate and varied nature of consciousness, potentially imposing anthropocentric limitations on the understanding of animal consciousness. However, Andrews's broad ethical presumption of consciousness across all animals may risk diluting the specificity required to discern the diverse manifestations of consciousness across species.

Each perspective presents its merits—Bayne et al.'s methodological clarity, and Andrews' ethical inclusivity. It is Dung and Newen's account that appears to provide a preferable methodological synthesis where the identification of markers is informed by an ethical commitment to presume consciousness broadly, all whilst acknowledging diversity across species.

2 The markers hypothesis

Bayne et al. (2024) introduce the concept of C-tests, emphasizing the urgent need for validated methods to determine consciousness across different systems, including humans at various developmental stages, non-human animals, AI, and more recent innovations such as neural organoids and xenobots. Bayne et al. highlight the general consensus on consciousness in healthy, awake adult humans but acknowledge the debate on the presence of consciousness in other entities or states, such as during human development, in sleep, under anesthesia, and in various brain-damaged conditions. They also point out the controversies over consciousness in non-human animals.

The authors propose a four-dimensional space for classifying potential C-tests. These dimensions are the target population (the entities to which the C-test applies, such as humans, specific animals, or artificial systems), specificity (the test's false-positive rate; a highly specific test rarely attributes consciousness when it is absent), sensitivity (the test's ability to correctly identify true positives, that is, genuinely conscious entities), and rational confidence (the degree of trust in the test's specificity and sensitivity assessments). To validate C-tests, Bayne et al. suggest three strategies:

The redeployment strategy: using variants of widely accepted tests for consciousness.

The theory-based strategy: grounding tests in consciousness theories.

The iterative natural kind strategy: an iterative process of refining and validating tests, treating consciousness as a natural kind.

The latter, indicated as the preferred strategy, posits that C-tests should be applied hierarchically, beginning with "consensus cases" (e.g., neurotypical adult humans) and extending first to "neighboring" and then to more "alien" populations.

The authors recognize the moral implications of consciousness assessment, especially since consciousness is often linked to moral status ( Shepherd, 2018 , 2023 ). They acknowledge the importance of aligning C-tests with ethical considerations, as consciousness may dictate how various entities should be treated.

Bayne et al. also address the challenge of applying these tests to non-human subjects, particularly when certain abilities required by the test may be specific to humans, such as language or certain patterns of neural activity.

The significance of Bayne et al.'s studies lies not only in the advancement of C-tests but also in the broader philosophical and ethical discourse on consciousness. By considering different population targets and validating the sensitivity and specificity of these tests, Bayne et al.'s studies directly contribute to the ongoing dialogue on animal consciousness and how to appropriately measure it.

Bayne et al.'s (2024) proposal exemplifies methodological rigor through its systematic and interdisciplinary approach. It sets forth a comprehensive framework for classifying tests as C-tests, considering diverse entities from human development to artificial systems. This framework is underpinned by a precise categorization based on the target population, specificity, sensitivity, and rational confidence, each dimension addressing distinct validation challenges. The authors strengthen the robustness of their approach by critically assessing three validation strategies: redeployment, theory-based, and iterative natural kind (NK), thus avoiding reliance on a single, potentially narrow methodological pathway. They advocate for the iterative NK strategy, which emphasizes flexibility and adaptability, allowing hypotheses and methods to be refined in light of new evidence. By transparently discussing the inherent limitations and crucial decision points of developing C-tests, the authors exhibit a conscientious understanding of the complexity of their research question. This self-reflective stance not only clarifies the methodological boundaries but also ensures that the research advances with clarity and precision.

Although it does not address her directly, their paper can be understood as a response to Andrews' (2024) view that "all animals are conscious", challenging it by proposing a structured, methodological framework for assessing consciousness across a broad spectrum of entities. This may seem to conflict with Andrews' position, which promotes an assumption of consciousness across all animals as a foundational starting point for research. Instead, Bayne et al.'s methodology could offer a systematic way to test Andrews' assertion and to investigate the dimensions of consciousness she suggests should be the focus of research.

3 Universal consciousness

Andrews (2024) advocates for a paradigmatic shift in consciousness studies: the scientific community should adopt the stance that all animals are conscious by default and then work to explore dimensions of consciousness rather than laboring to mark consciousness in different species.

The markers approach, she argues, is limited by its reliance on initial markers (pretheoretical indicators such as language, social responsiveness, and emotional expression) and by its development of derived markers (indicators that emerge from scientific investigation).

Andrews points out that as research progresses, the number of derived markers for consciousness increases, leading to a higher probability of ascribing consciousness to various species, potentially even those such as Caenorhabditis elegans and Hydra, which traditionally might not be considered conscious.

Andrews suggests that this approach creates an illusion of progress on the distribution question of consciousness because it can only increase the confidence in an animal's consciousness, not decrease it.

Initial markers are simply characteristics observed that set a baseline for the study of consciousness but are insufficient as proof. For instance, the fact that an entity displays pain behavior or engages in goal-directed activities does not conclusively demonstrate consciousness. This is particularly true in organisms whose physical forms or neural architectures differ significantly from humans or in the case of artificial intelligence. Conversely, derived markers arise through more theoretical means and often reveal aspects of consciousness not immediately evident through initial markers. These can encompass a range of behaviors that pass certain tests, or they can be mechanistic, rooted in the neurophysiology or biochemistry of the entity in question. These markers are less human-centric, recognizing behaviors and structures distinct from those typically found in humans, as long as they fulfill similar functional roles. The derived marker approach accommodates the multiple realizability of psychological properties, indicating a move toward a more inclusive and varied recognition of consciousness markers.

Andrews recommends that scientists default to the assumption that all animals are conscious and then investigate the various expressions and intensities of consciousness. This change in the scientific stance could catalyze more comprehensive and productive research, facilitating the development of a rich and inclusive theory of consciousness built on data spanning a vast array of life forms.

In essence, Andrews' argument is both pragmatic and methodological. She suggests that accepting the premise that all animals are conscious would eliminate biases that could hinder research and would leverage simpler organisms to gain insights into consciousness that might be obfuscated in more complex beings. Embracing this foundational shift would not only enhance the study of animal minds but could also have ethical implications for their treatment, emphasizing the importance of understanding the subjective experiences of non-human beings.

4 Between markers and dimensions

Dung and Newen (2023) propose a framework between markers and dimensions that simultaneously addresses the distribution question (which animals are conscious) and the phenomenological question (how conscious experiences differ between animals).

The framework establishes 10 dimensions of consciousness with species-sensitive operationalizations, which allows for a comprehensive comparison of consciousness profiles across different animal species. This approach differentiates between strong and weak indicators of consciousness, enabling researchers to assign a multi-faceted profile to animal species, reflecting their conscious experiences. Strong indicators are direct evidence of consciousness, whilst weak indicators require multiple instances or higher degrees of the behavior to suggest conscious experience. Dung and Newen build upon previous studies by Birch et al. (2020), whilst making four key advancements in their methodology: (1) a distinction between the distribution and the phenomenological question; (2) a structured taxonomy with strong and weak indicators; (3) the inclusion of dimensions for cognitive processing strategies beyond content features of conscious experience; and (4) a more extensive set of ten dimensions as opposed to the five suggested by Birch et al. (2020). Those five dimensions were: perceptual richness (how fine-grained perception is), evaluative richness (how fine-grained valence is), integration at a time (how temporally integrated an experience is), integration across time (how continuous or fragmented an experience is), and self-consciousness (how conscious the animal is of being a specific entity separate from the environment). Dung and Newen add three dimensions of cognitive processing strategies: complex forms of reasoning (such as transitive inference and causal reasoning), some forms of learning, and abstract categorization of specific sensory stimuli or events. They also include two further dimensions: the experience of body and mental agency and that of body ownership.
The experience of agency pertains to whether an animal perceives its actions, including mental actions, as self-generated and under its voluntary command, rather than as occurrences beyond its control (such as mind wandering). The experience of ownership concerns whether an animal recognizes its body parts as intrinsic to its being or merely as objects existing within the external environment.

They argue that these 10 dimensions are core for any general investigation of animal consciousness, but they are adaptable for more specific comparisons, such as between two species or different stages of ontogenetic development.

The operationalizations for these dimensions draw from a variety of behaviors and cognitive abilities. For example, perceptual categorization can be measured through tests such as discrimination learning and motivational trade-offs, whereas agency might be gauged through tasks testing delay of gratification or response inhibition.

Their studies contribute to the understanding of animal consciousness by offering a structured framework that can inform both empirical research and ethical considerations about the treatment of animals. Their approach specifically seeks to recognize indicators of consciousness that are potentially unique to non-human animals, which could differ significantly from human consciousness markers. In addition, the introduction of strong and weak indicators adds a layer of complexity to the evaluation of consciousness. This distinction acknowledges that not all indicators provide the same level of evidence for consciousness, and a set of weaker indicators can collectively signal the presence of consciousness in an animal. Process-oriented indicators for cognitive processes such as reasoning, learning, and abstraction reflect a deeper inquiry into how consciousness operates rather than just its outward manifestations. This shows an interest in the mechanisms of consciousness, providing a richer picture than what might be obtained through more static, trait-based markers. A defining feature of their framework is its adaptability and openness to revision based on empirical findings. This flexibility is an acknowledgment of the evolving nature of consciousness science. Their framework is not just theoretical but comes with concrete operationalizations for each dimension, providing tangible, testable manifestations of consciousness. This aspect is particularly valuable as it moves the field beyond theoretical speculation to empirical investigation. Furthermore, the authors recognize the limitations of current methodologies and introduce what they term pragmatic idealizations. This approach is intended to guide and refine research without making unwarranted assertions, which marks a departure from the sometimes binary perspective of traditional markers.

Dung and Newen's perspective can be seen as an intermediary between the marker-based approach of Bayne et al. and the universal consciousness claim argued by Andrews. Whilst they utilize a form of marker through their structured taxonomy, their approach is species-sensitive and acknowledges the diversity and richness of consciousness across species.

5 Conclusion

A balanced perspective on animal consciousness requires both empirical and ethical sensitivities. The C-tests proposed by Bayne et al. (2024) bring a necessary scientific precision to the field, whilst Andrews' (2024) ethical presumption of universal consciousness ensures the moral consideration of all animals. Dung and Newen's (2023) multi-dimensional framework integrates these aspects, offering a methodological approach that is both scientifically informed and ethically aware, incorporating the strengths of each perspective.

Author contributions

AK: Conceptualization, Methodology, Resources, Writing – original draft, Writing – review & editing.

The author(s) declare that no financial support was received for the research, authorship, and/or publication of this article.

Conflict of interest

The author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Publisher's note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

Allen, C., and Trestman, M. (2024). “Animal consciousness,” The Stanford Encyclopedia of Philosophy (Spring 2024 Edition) , eds. E. N. Zalta and U. Nodelman. Available online at: https://plato.stanford.edu/archives/spr2024/entries/consciousness-animal/

Andrews, K. (2024). "All animals are conscious": shifting the null hypothesis in consciousness science. Mind Lang. 1–19. doi: 10.1111/mila.12498

Bayne, T., Seth, A. K., Massimini, M., Shepherd, J., Cleeremans, A., Fleming, S. M., et al. (2024). Tests for consciousness in humans and beyond. Trends Cogn. Sci. doi: 10.1016/j.tics.2024.01.010

Birch, J., Schnell, A. K., and Clayton, N. S. (2020). Dimensions of animal consciousness. Trends Cogn. Sci. 24, 789–801. doi: 10.1016/j.tics.2020.07.007

Dung, L., and Newen, A. (2023). Profiles of animal consciousness: a species-sensitive, two-tier account to quality and distribution. Cognition 235:105409. doi: 10.1016/j.cognition.2023.105409

Shepherd, J. (2018). Consciousness and Moral Status . Oxford: Taylor and Francis, 122. doi: 10.4324/9781315396347

Shepherd, J. (2023). Non-human moral status: problems with phenomenal consciousness. AJOB Neurosci. 14, 148–157. doi: 10.1080/21507740.2022.2148770

Keywords: animal consciousness, markers hypothesis, dimensions, distribution, animal cognition

Citation: Kaufmann A (2024) All animals are conscious in their own way: comparing the markers hypothesis with the universal consciousness hypothesis. Front. Psychol. 15:1405394. doi: 10.3389/fpsyg.2024.1405394

Received: 22 March 2024; Accepted: 19 April 2024; Published: 13 May 2024.

Copyright © 2024 Kaufmann. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY) . The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Angelica Kaufmann, angelica.kaufmann@gmail.com
