Statistics By Jim
Making statistics intuitive
One-Tailed and Two-Tailed Hypothesis Tests Explained
By Jim Frost 60 Comments
Choosing whether to perform a one-tailed or a two-tailed hypothesis test is one of the methodology decisions you might need to make for your statistical analysis. This choice can have critical implications for the types of effects the test can detect, its statistical power, and potential errors.
In this post, you’ll learn about the differences between one-tailed and two-tailed hypothesis tests and their advantages and disadvantages. I include examples of both types of statistical tests. In my next post, I cover the decision between one and two-tailed tests in more detail.
What Are Tails in a Hypothesis Test?
First, we need to cover some background material to understand the tails in a test. Typically, hypothesis tests take all of the sample data and convert it to a single value, which is known as a test statistic. You’re probably already familiar with some test statistics. For example, t-tests calculate t-values . F-tests, such as ANOVA, generate F-values . The chi-square test of independence and some distribution tests produce chi-square values. All of these values are test statistics. For more information, read my post about Test Statistics .
These test statistics follow a sampling distribution. Probability distribution plots display the probabilities of obtaining test statistic values when the null hypothesis is correct. On a probability distribution plot, the shaded portion of the area under the curve represents the probability that a value will fall within that range.
The graph below displays a sampling distribution for t-values. The two shaded regions cover the two tails of the distribution.
![Plot that displays critical regions in the two tails of the distribution.](https://i0.wp.com/statisticsbyjim.com/wp-content/uploads/2018/11/t-test_two_tails_05.png?resize=576%2C384)
Keep in mind that this t-distribution assumes that the null hypothesis is correct for the population. Consequently, the peak (most likely value) of the distribution occurs at t=0, which represents the null hypothesis in a t-test. Typically, the null hypothesis states that there is no effect. As t-values move further away from zero, they represent larger effect sizes. When the null hypothesis is true for the population, obtaining samples that exhibit a large apparent effect becomes less likely, which is why the probabilities taper off for t-values further from zero.
Related posts : How t-Tests Work and Understanding Probability Distributions
Critical Regions in a Hypothesis Test
In hypothesis tests, critical regions are ranges of the distributions where the values represent statistically significant results. Analysts define the size and location of the critical regions by specifying both the significance level (alpha) and whether the test is one-tailed or two-tailed.
Consider the following two facts:
- The significance level is the probability of rejecting a null hypothesis that is correct.
- The sampling distribution for a test statistic assumes that the null hypothesis is correct.
Consequently, to represent the critical regions on the distribution for a test statistic, you merely shade the appropriate percentage of the distribution. For the common significance level of 0.05, you shade 5% of the distribution.
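To make the shading concrete, here is a minimal Python sketch that finds where the critical regions begin for a significance level of 0.05. It assumes a t-distribution with 20 degrees of freedom (a value chosen only for illustration). The helpers `t_pdf`, `t_cdf`, and `t_ppf` are hypothetical names written for this sketch using only the standard library; in practice, a statistics package would likely supply equivalents.

```python
import math

def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    log_c = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    c = math.exp(log_c) / math.sqrt(df * math.pi)
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def t_cdf(x, df, steps=2000):
    """P(T <= x), using Simpson's rule on [0, |x|] plus symmetry about zero."""
    a = abs(x)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x >= 0 else 0.5 - area

def t_ppf(p, df):
    """Value x with P(T <= x) = p, found by bisection."""
    lo, hi = -50.0, 50.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

alpha = 0.05
df = 20  # assumed degrees of freedom, for illustration only

# Two-tailed: alpha is split, so each tail holds alpha / 2.
two_tailed_cutoff = t_ppf(1 - alpha / 2, df)
# One-tailed (right tail): the whole alpha sits in a single tail.
one_tailed_cutoff = t_ppf(1 - alpha, df)

print(f"two-tailed: reject when |t| > {two_tailed_cutoff:.3f}")
print(f"one-tailed: reject when  t  > {one_tailed_cutoff:.3f}")
```

With these assumptions, the cutoffs should come out near 2.086 and 1.725. The total shaded area is 5% in both cases; only its placement changes.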
Related posts : Significance Levels and P-values and T-Distribution Table of Critical Values
Two-Tailed Hypothesis Tests
Two-tailed hypothesis tests are also known as nondirectional and two-sided tests because you can test for effects in both directions. When you perform a two-tailed test, you split the significance level percentage between both tails of the distribution. In the example below, I use an alpha of 5% and the distribution has two shaded regions of 2.5% (2 * 2.5% = 5%).
When a test statistic falls in either critical region, your sample data are sufficiently incompatible with the null hypothesis that you can reject it for the population.
In a two-tailed test, the generic null and alternative hypotheses are the following:
- Null : The effect equals zero.
- Alternative : The effect does not equal zero.
The specifics of the hypotheses depend on the type of test you perform because you might be assessing means, proportions, or rates.
Example of a two-tailed 1-sample t-test
Suppose we perform a two-sided 1-sample t-test where we compare the mean strength (4.1) of parts from a supplier to a target value (5). We use a two-tailed test because we care whether the mean is greater than or less than the target value.
To interpret the results, simply compare the p-value to your significance level. If the p-value is less than the significance level, you know that the test statistic fell into one of the critical regions, but which one? Just look at the estimated effect. In the output below, the t-value is negative, so we know that the test statistic fell in the critical region in the left tail of the distribution, indicating the mean is less than the target value. Now we know this difference is statistically significant.
![Statistical output from a two-tailed 1-sample t-test.](https://i0.wp.com/statisticsbyjim.com/wp-content/uploads/2018/11/T-test_SWO_two_tails.png?resize=399%2C122)
We can conclude that the population mean for part strength is less than the target value. However, the test had the capacity to detect a positive difference as well. You can also assess the confidence interval. With a two-tailed hypothesis test, you’ll obtain a two-sided confidence interval. The confidence interval tells us that the population mean is likely to fall between 3.372 and 4.828. This range excludes the target value (5), which is another indicator of significance.
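The calculation behind this kind of output can be sketched in Python. The sample mean (4.1) and target (5) come from the example, but the sample size and standard deviation below are hypothetical, so the numbers will not reproduce the output shown. The helpers `t_pdf`, `t_cdf`, and `t_ppf` are also hypothetical, standard-library stand-ins for what a statistics package would provide.

```python
import math

def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    log_c = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    c = math.exp(log_c) / math.sqrt(df * math.pi)
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def t_cdf(x, df, steps=2000):
    """P(T <= x), using Simpson's rule on [0, |x|] plus symmetry about zero."""
    a = abs(x)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x >= 0 else 0.5 - area

def t_ppf(p, df):
    """Value x with P(T <= x) = p, found by bisection."""
    lo, hi = -50.0, 50.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

sample_mean = 4.1  # from the example
target = 5.0       # from the example
sample_sd = 1.7    # hypothetical, not taken from the output
n = 25             # hypothetical, not taken from the output
alpha = 0.05

df = n - 1
se = sample_sd / math.sqrt(n)
t_value = (sample_mean - target) / se

# Two-tailed p-value: the area beyond |t| in BOTH tails.
p_two_tailed = 2 * (1 - t_cdf(abs(t_value), df))

# Two-sided confidence interval at the matching confidence level.
margin = t_ppf(1 - alpha / 2, df) * se
ci_low, ci_high = sample_mean - margin, sample_mean + margin

direction = "below" if t_value < 0 else "above"
print(f"t = {t_value:.3f}, two-tailed p = {p_two_tailed:.4f}")
print(f"{1 - alpha:.0%} CI: ({ci_low:.3f}, {ci_high:.3f})")
if p_two_tailed < alpha:
    print(f"Significant: the mean is {direction} the target.")
```

The sign of the t-value is what tells you which tail the test statistic landed in; the p-value alone does not.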
Advantages of two-tailed hypothesis tests
You can detect both positive and negative effects. Two-tailed tests are standard in scientific research where discovering any type of effect is usually of interest to researchers.
One-Tailed Hypothesis Tests
One-tailed hypothesis tests are also known as directional and one-sided tests because you can test for effects in only one direction. When you perform a one-tailed test, the entire significance level percentage goes into the extreme end of one tail of the distribution.
In the examples below, I use an alpha of 5%. Each distribution has one shaded region of 5%. When you perform a one-tailed test, you must determine whether the critical region is in the left tail or the right tail. The test can detect an effect only in the direction that has the critical region. It has absolutely no capacity to detect an effect in the other direction.
In a one-tailed test, you have two options for the null and alternative hypotheses, which correspond to where you place the critical region.
You can choose either of the following sets of generic hypotheses:
- Null : The effect is less than or equal to zero.
- Alternative : The effect is greater than zero.
![Plot that displays a single critical region in the right tail for a one-tailed test.](https://i0.wp.com/statisticsbyjim.com/wp-content/uploads/2018/11/t-test_right_tail_05.png?resize=576%2C384)
- Null : The effect is greater than or equal to zero.
- Alternative : The effect is less than zero.
![Plot that displays a single critical region in the left tail for a one-tailed test.](https://i0.wp.com/statisticsbyjim.com/wp-content/uploads/2018/11/t-test_left_tail_05.png?resize=576%2C384)
Again, the specifics of the hypotheses depend on the type of test you perform.
Notice that, for both possible null hypotheses, the tests can’t distinguish between zero and an effect in a particular direction. For example, in the left-tailed case directly above, the null combines “the effect is greater than or equal to zero” into a single category. That test can’t differentiate between zero and greater than zero.
Example of a one-tailed 1-sample t-test
Suppose we perform a one-tailed 1-sample t-test. We’ll use a similar scenario as before where we compare the mean strength of parts from a supplier (102) to a target value (100). Imagine that we are considering a new parts supplier. We will use them only if the mean strength of their parts is greater than our target value. There is no need for us to differentiate between whether their parts are equally strong or less strong than the target value—either way we’d just stick with our current supplier.
Consequently, we’ll choose the alternative hypothesis that states the mean difference is greater than zero (Population mean – Target value > 0). The null hypothesis states that the difference between the population mean and target value is less than or equal to zero.
![Statistical output for a one-tailed 1-sample t-test.](https://i0.wp.com/statisticsbyjim.com/wp-content/uploads/2018/11/T-test_SWO_one_tail.png?resize=418%2C125)
To interpret the results, compare the p-value to your significance level. If the p-value is less than the significance level, you know that the test statistic fell into the critical region. For this study, the statistically significant result supports the notion that the population mean is greater than the target value of 100.
Confidence intervals for a one-tailed test are similarly one-sided. You’ll obtain either an upper bound or a lower bound. In this case, we get a lower bound, which indicates that the population mean is likely to be greater than or equal to 100.631. There is no upper limit to this range.
A lower bound matches our goal of determining whether the new parts are stronger than our target value. The fact that the lower bound (100.631) is higher than the target value (100) indicates that these results are statistically significant.
This test is unable to detect a negative difference even when the sample mean represents a very negative effect.
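A short Python sketch can illustrate why a right-tailed test is blind to negative effects. The target (100) and the sample mean of 102 come from the example; the standard deviation, the sample size, and the second sample mean of 90 are hypothetical. The helpers `t_pdf` and `t_cdf` are hypothetical standard-library stand-ins for a statistics package.

```python
import math

def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    log_c = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    c = math.exp(log_c) / math.sqrt(df * math.pi)
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def t_cdf(x, df, steps=2000):
    """P(T <= x), using Simpson's rule on [0, |x|] plus symmetry about zero."""
    a = abs(x)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x >= 0 else 0.5 - area

def right_tailed_test(sample_mean, target, sample_sd, n):
    """Test H0: mean <= target against HA: mean > target."""
    t_value = (sample_mean - target) / (sample_sd / math.sqrt(n))
    # Only the area to the RIGHT of the signed t-value counts.
    p_value = 1 - t_cdf(t_value, n - 1)
    return t_value, p_value

target = 100.0   # from the example
sample_sd = 4.0  # hypothetical
n = 25           # hypothetical

t_pos, p_pos = right_tailed_test(102.0, target, sample_sd, n)  # mean from the example
t_neg, p_neg = right_tailed_test(90.0, target, sample_sd, n)   # hypothetical weak parts

print(f"mean 102: t = {t_pos:.2f}, p = {p_pos:.4f}")
print(f"mean  90: t = {t_neg:.2f}, p = {p_neg:.4f}")
```

Under these assumptions, the sample mean of 90 is far from the target, yet its p-value is close to 1 because the entire critical region sits in the right tail. Note that the test uses the signed t-value. Taking its absolute value before comparing it to the critical value would quietly turn this into a check on both tails.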
Advantages and disadvantages of one-tailed hypothesis tests
One-tailed tests have more statistical power to detect an effect in one direction than a two-tailed test with the same design and significance level. One-tailed tests occur most frequently for studies where one of the following is true:
- Effects can exist in only one direction.
- Effects can exist in both directions but the researchers only care about an effect in one direction. There is no drawback to failing to detect an effect in the other direction. (Not recommended.)
The disadvantage of one-tailed tests is that they have no statistical power to detect an effect in the other direction.
As part of your pre-study planning process, determine whether you’ll use the one- or two-tailed version of a hypothesis test. To learn more about this planning process, read 5 Steps for Conducting Scientific Studies with Statistical Analyses .
This post explains the differences between one-tailed and two-tailed statistical hypothesis tests. How these forms of hypothesis tests function is clear and based on mathematics. However, there is some debate about when you can use one-tailed tests. My next post explores this decision in much more depth and explains the different schools of thought and my opinion on the matter— When Can I Use One-Tailed Hypothesis Tests .
If you’re learning about hypothesis testing and like the approach I use in my blog, check out my Hypothesis Testing book! You can find it at Amazon and other retailers.
![Cover image of my Hypothesis Testing: An Intuitive Guide ebook.](https://i0.wp.com/statisticsbyjim.com/wp-content/uploads/2018/11/jfrost-hypothesis-book-promo-image.jpg?resize=200%2C300&ssl=1)
Reader Interactions
June 26, 2022 at 12:14 pm
Hi, Can help me with figuring out the null and alternative hypothesis of the following statement? Some claimed that the real average expenditure on beverage by general people is at least $10.
February 19, 2022 at 6:02 am
thank you for the thoroughly explanation, I’m still strugling to wrap my mind around the t-table and the relation between the alpha values for one or two tail probability and the confidence levels on the bottom (I’m understanding it so wrongly that for me it should be the oposite, like one tail 0,05 should correspond 95% CI and two tailed 0,025 should correspond to 95% because then you got the 2,5% on each side). In my mind if I picture the one tail diagram with an alpha of 0,05 I see the rest 95% inside the diagram, but for a one tail I only see 90% CI paired with a 5% alpha… where did the other 5% go? I tried to understand when you said we should just double the alpha for a one tail probability in order to find the CI but I still cant picture it. I have been trying to understand this. Like if you only have one tail and there is 0,05, shouldn’t the rest be on the other side? why is it then 90%… I know I’m missing a point and I can’t figure it out and it’s so frustrating…
February 23, 2022 at 10:01 pm
The alpha is the total shaded area. So, if the alpha = 0.05, you know that 5% of the distribution is shaded. The number of tails tells you how to divide the shaded areas. Is it all in one region (1-tailed) or do you split the shaded regions in two (2-tailed)?
So, for a one-tailed test with an alpha of 0.05, the 5% shading is all in one tail. If alpha = 0.10, then it’s 10% on one side. If it’s two-tailed, then you need to split that 10% into two–5% in both tails. Hence, the 5% in a one-tailed test is the same as a two-tailed test with an alpha of 0.10 because that test has the same 5% on one side (but there’s another 5% in the other tail).
It’s similar for CIs. However, for CIs, you shade the middle rather than the extremities. I write about that in one of my articles about hypothesis testing and confidence intervals.
I’m not sure if I’m answering your question or not.
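The bookkeeping in this reply can be checked with a few lines of Python. This is only a sketch: it uses the standard normal distribution as a stand-in for the t-table (t-table cutoffs depend on the degrees of freedom, but the way alpha is divided between tails is the same), and `tail_layout` is a hypothetical helper name.

```python
from statistics import NormalDist

std_normal = NormalDist()  # stand-in for the t-distribution in this sketch

def tail_layout(alpha, tails):
    """Return (area shaded in each tail, upper cutoff) for a given alpha."""
    per_tail = alpha / tails
    cutoff = std_normal.inv_cdf(1 - per_tail)
    return per_tail, cutoff

one_tailed_05 = tail_layout(0.05, tails=1)
two_tailed_10 = tail_layout(0.10, tails=2)
two_tailed_05 = tail_layout(0.05, tails=2)

for label, (area, cutoff) in [
    ("one-tailed, alpha = 0.05", one_tailed_05),
    ("two-tailed, alpha = 0.10", two_tailed_10),
    ("two-tailed, alpha = 0.05", two_tailed_05),
]:
    print(f"{label}: {area:.3f} per tail, cutoff = {cutoff:.3f}")
```

The first two rows share the same cutoff because both put 5% in the upper tail. That is why a t-table pairs a one-tailed alpha of 0.05 with a 90% two-sided confidence level: the other 5% is in the lower tail of the two-sided interval.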
February 17, 2022 at 1:46 pm
I ran a post hoc Dunnett’s test alpha=0.05 after a significant Anova test in Proc Mixed using SAS. I want to determine if the means for treatment (t1, t2, t3) is significantly less than the means for control (p=pathogen). The code for the dunnett’s test is – LSmeans trt / diff=controll (‘P’) adjust=dunnett CL plot=control; I think the lower bound one tailed test is the correct test to run but I’m not 100% sure. I’m finding conflicting information online. In the output table for the dunnett’s test the mean difference between the control and the treatments is t1=9.8, t2=64.2, and t3=56.5. The control mean estimate is 90.5. The adjusted p-value by treatment is t1(p=0.5734), t2 (p=.0154) and t3(p=.0245). The adjusted lower bound confidence limit in order from t1-t3 is -38.8, 13.4, and 7.9. The adjusted upper bound for all test is infinity. The graphical output for the dunnett’s test in SAS is difficult to understand for those of us who are beginner SAS users. All treatments appear as a vertical line below the the horizontal line for control at 90.5 with t2 and t3 in the shaded area. For treatment 1 the shaded area is above the line for control. Looking at just the output table I would say that t2 and t3 are significantly lower than the control. I guess I would like to know if my interpretation of the outputs is correct that treatments 2 and 3 are statistically significantly lower than the control? Should I have used an upper bound one tailed test instead?
November 10, 2021 at 1:00 am
Thanks Jim. Please help me understand how a two tailed testing can be used to minimize errors in research
July 1, 2021 at 9:19 am
Hi Jim, Thanks for posting such a thorough and well-written explanation. It was extremely useful to clear up some doubts.
May 7, 2021 at 4:27 pm
Hi Jim, I followed your instructions for the Excel add-in. Thank you. I am very new to statistics and sort of enjoy it as I enter week number two in my class. I am to select if three scenarios call for a one or two-tailed test is required and why. The problem is stated:
30% of mole biopsies are unnecessary. Last month at his clinic, 210 out of 634 had benign biopsy results. Is there enough evidence to reject the dermatologist’s claim?
Part two, the wording changes to “more than of 30% of biopsies,” and part three, the wording changes to “less than 30% of biopsies…”
I am not asking for the problem to be solved for me, but I cannot seem to find direction needed. I know the elements i am dealing with are =30%, greater than 30%, and less than 30%. 210 and 634. I just don’t know what to with the information. I can’t seem to find an example of a similar problem to work with.
May 9, 2021 at 9:22 pm
As I detail in this post, a two-tailed test tells you whether an effect exists in either direction. In other words, it tells you whether the value differs from the null value in either direction. For the first example, the wording suggests you’d need a two-tailed test to determine whether the population proportion is ≠ 30%. Whenever you just need to know ≠, it suggests a two-tailed test because you’re covering both directions.
For part two, because it’s in one direction (greater than), you need a one-tailed test. Same for part three but it’s less than. Look in this blog post to see how you’d construct the null and alternative hypotheses for these cases. Note that you’re working with a proportion rather than the mean, but the principles are the same! Just plug your scenario and the concept of proportion into the wording I use for the hypotheses.
I hope that helps!
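For readers who want to see how the three versions of this question differ numerically, here is a rough Python sketch. It assumes a one-sample z-test for a proportion based on the normal approximation, which is only one of several possible methods, and `proportion_z_test` is a hypothetical helper name. The counts (210 of 634) and the null value (30%) come from the question above.

```python
from math import sqrt
from statistics import NormalDist

def proportion_z_test(successes, n, p0):
    """Normal-approximation z-test of a sample proportion against p0."""
    p_hat = successes / n
    se = sqrt(p0 * (1 - p0) / n)  # standard error under the null hypothesis
    z = (p_hat - p0) / se
    p_less = NormalDist().cdf(z)  # area in the left tail
    p_greater = 1 - p_less        # area in the right tail
    p_two_sided = 2 * min(p_less, p_greater)
    return z, {"not equal": p_two_sided, "greater": p_greater, "less": p_less}

z, p_values = proportion_z_test(successes=210, n=634, p0=0.30)

print(f"z = {z:.3f}")
for alternative, p in p_values.items():
    print(f"alternative '{alternative}' than 30%: p = {p:.4f}")
```

The same data produce three different p-values because each alternative hypothesis places the critical region differently. The one-tailed p-value in the direction of the observed effect is half the two-tailed value.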
April 11, 2021 at 9:30 am
Hello Jim, great website! I am using a statistics program (SPSS) that does NOT compute one-tailed t-tests. I am trying to compare two independent groups and have justifiable reasons why I only care about one direction. Can I do the following? Use SPSS for two-tailed tests to calculate the t & p values. Then report the p-value as p/2 when it is in the predicted direction (e.g , SPSS says p = .04, so I report p = .02), and report the p-value as 1 – (p/2) when it is in the opposite direction (e.g., SPSS says p = .04, so I report p = .98)? If that is incorrect, what do you suggest (hopefully besides changing statistics programs)? Also, if I want to report confidence intervals, I realize that I would only have an upper or lower bound, but can I use the CI’s from SPSS to compute that? Thank you very much!
April 11, 2021 at 5:42 pm
Yes, for p-values, that’s absolutely correct for both cases.
For confidence intervals, if you take one endpoint of a two-sided CI, it becomes a one-sided bound with half the significance level. For example, either endpoint of a 90% two-sided CI is a 95% one-sided bound.
Consequently, to obtain a one-sided bound with your desired confidence level, you need to take your desired significance level (e.g., 0.05) and double it. Then subtract it from 1. So, if you’re using a significance level of 0.05, double that to 0.10 and then subtract from 1 (1 – 0.10 = 0.90). 90% is the confidence level you want to use for the two-sided CI. After obtaining the two-sided CI, use one of the endpoints depending on the direction of your hypothesis (i.e., upper or lower bound). That produces the one-sided bound with the confidence level that you want. For our example, we calculated a 95% one-sided bound.
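The two conversions described in this exchange are simple enough to sketch in Python. The function names are hypothetical, and the p-value conversion assumes a test statistic with a symmetric distribution, such as a t-value.

```python
def one_tailed_p(two_tailed_p, in_predicted_direction):
    """Convert a two-tailed p-value from a symmetric test to a one-tailed one."""
    half = two_tailed_p / 2
    # If the effect points the wrong way, the p-value is the rest of the area.
    return half if in_predicted_direction else 1 - half

def two_sided_level_for_bound(alpha):
    """Confidence level to request for a two-sided CI whose endpoint
    will serve as a one-sided bound at significance level alpha."""
    return 1 - 2 * alpha

print(one_tailed_p(0.04, in_predicted_direction=True))
print(one_tailed_p(0.04, in_predicted_direction=False))
print(two_sided_level_for_bound(0.05))
```

For the values in the question, a two-tailed p of 0.04 becomes 0.02 in the predicted direction and 0.98 in the opposite direction. A significance level of 0.05 calls for a 90% two-sided CI.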
March 3, 2021 at 8:27 am
Hi Jim. I used the one-tailed(right) statistical test to determine an anomaly in the below problem statement: On a daily basis, I calculate the (mapped_%) in a common field between two tables.
The way I used the t-test is: On any particular day, I calculate the sample_mean, S.D and sample_count (n=30) for the last 30 days including the current day. My null hypothesis, H0 (pop. mean)=95 and H1>95 (alternate hypothesis). So, I calculate the t-stat based on the sample_mean, pop.mean, sample S.D and n. I then choose the t-crit value for 0.05 from my t-ditribution table for dof(n-1). On the current day if my abs.(t-stat)>t-crit, then I reject the null hypothesis and I say the mapped_pct on that day has passed the t-test.
I get some weird results here, where if my mapped_pct is as low as 6%-8% in all the past 30 days, the t-test still gets a “pass” result. Could you help on this? If my hypothesis needs to be changed.
I would basically look for the mapped_pct >95, if it worked on a static trigger. How can I use the t-test effectively in this problem statement?
December 18, 2020 at 8:23 pm
Hello Dr. Jim, I am wondering if there is evidence in one of your books or other source you could provide, which supports that it is OK not to divide alpha level by 2 in one-tailed hypotheses. I need the source for supporting evidence in a Portfolio exercise and couldn’t find one.
I am grateful for your reply and for your statistics knowledge sharing!
November 27, 2020 at 10:31 pm
If I did a one directional F test ANOVA(one tail ) and wanted to calculate a confidence interval for each individual groups (3) mean . Would I use a one tailed or two tailed t , within my confidence interval .
November 29, 2020 at 2:36 am
Hi Bashiru,
F-tests for ANOVA will always be one-tailed for the reasons I discuss in this post. To learn more, read my post about F-tests in ANOVA.
For the differences between groups, I would not use t-tests because the family-wise error rate quickly grows out of hand. To learn more about how to compare group means while controlling the family-wise error rate, read my post about using post hoc tests with ANOVA. Typically, these are two-sided intervals, but you’d be able to use one-sided intervals.
November 26, 2020 at 10:51 am
Hi Jim, I had a question about the formulation of the hypotheses. When you want to test if a beta = 1 or a beta = 0. What will be the null hypotheses? I’m having trouble with finding out. Because in most cases beta = 0 is the null hypotheses but in this case you want to test if beta = 0. so i’m having my doubts can it in this case be the alternative hypotheses or is it still the null hypotheses?
Kind regards, Noa
November 27, 2020 at 1:21 am
Typically, the null hypothesis represents no effect or no relationship. As an analyst, you’re hoping that your data have enough evidence to reject the null and favor the alternative.
Assuming you’re referring to beta as in regression coefficients, zero represents no relationship. Consequently, beta = 0 is the null hypothesis.
You might hope that beta = 1, but you don’t usually include that in your alternative hypotheses. The alternative hypothesis usually states that it does not equal no effect. In other words, there is an effect but it doesn’t state what it is.
There are some exceptions to the above but I’m writing about the standard case.
November 22, 2020 at 8:46 am
Your articles are a help to intro to econometrics students. Keep up the good work! More power to you!
November 6, 2020 at 11:25 pm
Hello Jim. Can you help me with these please?
Write the null and alternative hypothesis using a 1-tailed and 2-tailed test for each problem. (In paragraph and symbols)
A teacher wants to know if there is a significant difference in the performance in MAT C313 between her morning and afternoon classes.
It is known that in our university canteen, the average waiting time for a customer to receive and pay for his/her order is 20 minutes. Additional personnel has been added and now the management wants to know if the average waiting time had been reduced.
November 8, 2020 at 12:29 am
I cover how to write the hypotheses for the different types of tests in this post. So, you just need to figure which type of test you need to use. In your case, you want to determine whether the mean waiting time is less than the target value of 20 minutes. That’s a 1-sample t-test because you’re comparing a mean to a target value (20 minutes). You specifically want to determine whether the mean is less than the target value. So, that’s a one-tailed test. And, you’re looking for a mean that is “less than” the target.
So, go to the one-tailed section in the post and look for the hypotheses for the effect being less than. That’s the one with the critical region on the left side of the curve.
Now, you need to include your own information. In your case, you’re comparing the sample estimate to a population mean of 20. The 20 minutes is your null hypothesis value. Use the symbol mu μ to represent the population mean.
You put all that together and you get the following:
- Null : μ ≥ 20
- Alternative : μ < 20
You can use H0 to denote the null hypothesis and H1 or HA to denote the alternative hypothesis if that’s what you’ve been using in class.
October 17, 2020 at 12:11 pm
I was just wondering if you could please help with clarifying what the hypothesises would be for say income for gamblers and, age of gamblers. I am struggling to find which means would be compared.
October 17, 2020 at 7:05 pm
Those are both continuous variables, so you’d use either correlation or regression for them. For both of those analyses, the hypotheses are the following:
- Null : The correlation or regression coefficient equals zero (i.e., there is no relationship between the variables).
- Alternative : The coefficient does not equal zero (i.e., there is a relationship between the variables).
When the p-value is less than your significance level, you reject the null and conclude that a relationship exists.
October 17, 2020 at 3:05 am
I was ask to choose and justify the reason between a one tailed and two tailed test for dummy variables, how do I do that and what does it mean?
October 17, 2020 at 7:11 pm
I don’t have enough information to answer your question. A dummy variable is also known as an indicator variable, which is a binary variable that indicates the presence or absence of a condition or characteristic. If you’re using this variable in a hypothesis test, I’d presume that you’re using a proportions test, which is based on the binomial distribution for binary data.
Choosing between a one-tailed or two-tailed test depends on subject area issues and, possibly, your research objectives. Typically, use a two-tailed test unless you have a very good reason to use a one-tailed test. To understand when you might use a one-tailed test, read my post about when to use a one-tailed hypothesis test .
October 16, 2020 at 2:07 pm
In your one-tailed example, Minitab describes the hypotheses as “Test of mu = 100 vs > 100”. Any idea why Minitab says the null is “=” rather than “= or less than”? No ASCII character for it?
October 16, 2020 at 4:20 pm
I’m not entirely sure even though I used to work there! I know we had some discussions about how to represent that hypothesis but I don’t recall the exact reasoning. I suspect that it has to do with the conclusions that you can draw. Let’s focus on failing to reject the null hypothesis. If the test statistic falls in that region (i.e., it is not significant), you fail to reject the null. In this case, all you know is that you have insufficient evidence to say it is different than 100. I’m pretty sure that’s why they use the equal sign because it might as well be one.
Mathematically, I think using ≤ is more accurate, which you can really see when you look at the distribution plots. That’s why I phrase the hypotheses using ≤ or ≥ as needed. However, in terms of the interpretation, the “less than” portion doesn’t really add anything of importance. You can conclude that it’s equal to 100 or greater than 100, but not less than 100.
October 15, 2020 at 5:46 am
Thank you so much for your timely feedback. It helps a lot
October 14, 2020 at 10:47 am
How can i use one tailed test at 5% alpha on this problem?
A manufacturer of cellular phone batteries claims that when fully charged, the mean life of his product lasts for 26 hours with a standard deviation of 5 hours. Mr X, a regular distributor, randomly picked and tested 35 of the batteries. His test showed that the average life of his sample is 25.5 hours. Is there a significant difference between the average life of all the manufacturer’s batteries and the average battery life of his sample?
October 14, 2020 at 8:22 pm
I don’t think you’d want to use a one-tailed test. The goal is to determine whether the sample is significantly different than the manufacturer’s population average. You’re not saying significantly greater than or less than, which would be a one-tailed test. As phrased, you want a two-tailed test because it can detect a difference in either direction.
It sounds like you need to use a 1-sample t-test to test the mean. During this test, enter 26 as the test mean. The procedure will tell you if the sample mean of 25.5 hours is significantly different from that test mean. Similarly, you’d need a one variance test to determine whether the sample standard deviation is significantly different from the test value of 5 hours.
For both of these tests, compare the p-value to your alpha of 0.05. If the p-value is less than this value, your results are statistically significant.
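As a rough illustration of the two-tailed comparison, here is a Python sketch using the numbers from the question. Note the substitution: the reply recommends a 1-sample t-test, which needs the sample standard deviation, and the question does not provide one. This sketch instead assumes the manufacturer’s claimed standard deviation of 5 hours is the known population value and runs a z-test, which is a different procedure.

```python
from math import sqrt
from statistics import NormalDist

claimed_mean = 26.0  # hours, from the question
claimed_sd = 5.0     # hours, treated here as a known population value (assumption)
n = 35               # batteries tested
sample_mean = 25.5   # hours
alpha = 0.05

se = claimed_sd / sqrt(n)
z = (sample_mean - claimed_mean) / se

# Two-tailed p-value: area beyond |z| in both tails of the standard normal.
p_two_tailed = 2 * NormalDist().cdf(-abs(z))

print(f"z = {z:.3f}, two-tailed p = {p_two_tailed:.3f}")
print("significant" if p_two_tailed < alpha else "not significant", "at alpha =", alpha)
```

Under that assumption, the p-value is well above 0.05, so the sample mean of 25.5 hours is not significantly different from the claimed 26 hours.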
September 22, 2020 at 4:16 am
Hi Jim, I didn’t get an idea that when to use two tail test and one tail test. Will you please explain?
September 22, 2020 at 10:05 pm
I have a complete article dedicated to that: When Can I Use One-Tailed Tests .
Basically, start with the assumption that you’ll use a two-tailed test but then consider scenarios where a one-tailed test can be appropriate. I talk about all of that in the article.
If you have questions after reading that, please don’t hesitate to ask!
July 31, 2020 at 12:33 pm
Thank you so so much for this webpage.
I have two scenarios that I need some clarification. I will really appreciate it if you can take a look:
So I have several of materials that I know when they are tested after production. My hypothesis is that the earlier they are tested after production, the higher the mean value I should expect. At the same time, the later they are tested after production, the lower the mean value. Since this is more like a “greater or lesser” situation, I should use one tail. Is that the correct approach?
On the other hand, I have several mix of materials that I don’t know when they are tested after production. I only know the mean values of the test. And I only want to know whether one mean value is truly higher or lower than the other, I guess I want to know if they are only significantly different. Should I use two tail for this? If they are not significantly different, I can judge based on the mean values of test alone. And if they are significantly different, then I will need to do other type of analysis. Also, when I get my P-value for two tail, should I compare it to 0.025 or 0.05 if my confidence level is 0.05?
Thank you so much again.
July 31, 2020 at 11:19 pm
For your first scenario, if you absolutely know that the mean must be lower the later the material is tested, that it cannot be higher, that would be a situation where you can use a one-tailed test. However, if that’s not a certainty, you’re just guessing, use a two-tailed test. If you’re measuring different items at the different times, use the independent 2-sample t-test. However, if you’re measuring the same items at two time points, use the paired t-test. If it’s appropriate, using the paired t-test will give you more statistical power because it accounts for the variability between items. For more information, see my post about when it’s ok to use a one-tailed test.
For the mix of materials, use a two-tailed test because the effect truly can go either direction.
Always compare the p-value to your full significance level regardless of whether it’s a one or two-tailed test. Don’t divide the significance level in half.
June 17, 2020 at 2:56 pm
Is it possible that we reach to opposite conclusions if we use a critical value method and p value method Secondly if we perform one tail test and use p vale method to conclude our Ho, then do we need to convert sig value of 2 tail into sig value of one tail. That can be done just by dividing it with 2
June 18, 2020 at 5:17 pm
The p-value method and critical value method will always agree as long as you’re not changing anything about the methodology.
If you’re using statistical software, you don’t need to make any adjustments. The software will do that for you.
However, if you’re calculating it by hand, you’ll need to take your significance level and then look up the one-tailed critical value in the table for your test statistic. For example, you’ll want to look up 5% for a one-tailed test rather than a two-tailed test. That’s not as simple as dividing by two. In this article, I show examples of one-tailed and two-tailed tests for the same degrees of freedom. The t critical value for the two-tailed test is +/- 2.086 while for the one-sided test it is 1.725. It is true that the probability in the tail beyond the critical value doubles for the one-tailed test (2.5% -> 5%), but the critical value itself is not half (2.086 -> 1.725). Study the first several graphs in this article to see why that is true.
For the p-value, you can take a two-tailed p-value and divide by 2 to determine the one-sided p-value. However, if you’re using statistical software, it does that for you.
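The halving rule above can be sketched in a few lines. This is an illustrative sketch, not the method any particular software uses: the function name and the "greater"/"less" labels are made up for this example, and a Z statistic stands in for the t statistic so that only the Python standard library is needed. One detail worth making explicit is that halving applies only when the observed statistic falls on the side named in the one-tailed alternative; when it falls on the opposite side, the one-tailed p-value is 1 minus half the two-tailed p-value.

```python
from statistics import NormalDist


def one_tailed_from_two_tailed(p_two: float, statistic: float, alternative: str) -> float:
    """Convert a two-tailed p-value into a one-tailed p-value.

    Assumes the test statistic has a symmetric null distribution centered at 0
    (true for z and t statistics). `alternative` is "greater" or "less"; these
    labels are hypothetical names chosen for this sketch.
    """
    if alternative not in ("greater", "less"):
        raise ValueError("alternative must be 'greater' or 'less'")
    if alternative == "greater":
        in_hypothesized_direction = statistic > 0
    else:
        in_hypothesized_direction = statistic < 0
    # Halve only when the effect points the way the alternative hypothesis says.
    return p_two / 2 if in_hypothesized_direction else 1 - p_two / 2


# Hypothetical observed statistic, used only to exercise the function.
z = 1.8
p_two = 2 * (1 - NormalDist().cdf(abs(z)))

print(f"two-tailed p           = {p_two:.4f}")
print(f"one-tailed p (greater) = {one_tailed_from_two_tailed(p_two, z, 'greater'):.4f}")
print(f"one-tailed p (less)    = {one_tailed_from_two_tailed(p_two, z, 'less'):.4f}")
```

Whichever p-value you end up with, it is compared to the full significance level, as described above.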
June 11, 2020 at 3:46 pm
Hello Jim, if you have the time I’d be grateful if you could shed some clarity on this scenario:
“A researcher believes that aromatherapy can relieve stress but wants to determine whether it can also enhance focus. To test this, the researcher selected a random sample of students to take an exam in which the average score in the general population is 77. Prior to the exam, these students studied individually in a small library room where a lavender scent was present. If students in this group scored significantly above the average score in general population [is this one-tailed or two-tailed hypothesis?], then this was taken as evidence that the lavender scent enhanced focus.”
Thank you for your time if you do decide to respond.
June 11, 2020 at 4:00 pm
It’s unclear from the information provided whether the researchers used a one-tailed or two-tailed test. It could be either. A two-tailed test can detect effects in both directions, so it could definitely detect an average group score above the population score. However, you could also detect that effect using a one-tailed test if it was set up correctly. So, there’s not enough information in what you provided to know for sure. It could be either.
However, that’s irrelevant to answering the question. The tricky part, as I see it, is that you’re not entirely sure about why the scores are higher. Are they higher because the lavender scent increased concentration or are they higher because the subjects have lower stress from the lavender? Or, maybe it’s not even related to the scent but some other characteristic of the room or testing conditions in which they took the test. You just know the scores are higher but not necessarily why they’re higher.
I’d say that, no, it’s not necessarily evidence that the lavender scent enhanced focus. There are competing explanations for why the scores are higher. Also, it would be best to do this as an experiment with a control and treatment group where subjects are randomly assigned to either group. That process helps establish causality rather than just correlation and helps rule out competing explanations for why the scores are higher.
By the way, I spend a lot of time on these issues in my Introduction to Statistics ebook .
June 9, 2020 at 1:47 pm
If a left tail test has an alpha value of 0.05 how will you find the value in the table
April 19, 2020 at 10:35 am
Hi Jim, My question is in regard to the results in the table in your example of the one-sample t (two-tailed) test above. What about the p-value? The p-value listed is .018. I am assuming that is compared to an alpha of 0.025, correct?
In regression analysis, when I get a test statistic for the predictor variable of -2.099 and a p-value of 0.039, am I comparing the p-value to an alpha of 0.025 or 0.05? Now if I run a Bootstrap for coefficients analysis, the results say the sig (2-tail) is 0.098. What are the critical values and alpha in this case? I’m trying to reconcile what I am seeing in both tables.
Thanks for your help.
April 20, 2020 at 3:24 am
Hi Marvalisa,
For one-tailed tests, you don’t need to divide alpha in half. If you can tell your software to perform a one-tailed test, it’ll do all the calculations necessary so you don’t need to adjust anything. So, if you’re using an alpha of 0.05 for a one-tailed test and your p-value is 0.04, it is significant. The procedures adjust the p-values automatically and it all works out. So, whether you’re using a one-tailed or two-tailed test, you always compare the p-value to the alpha with no need to adjust anything. The procedure does that for you!
The exception would be if for some reason your software doesn’t allow you to specify that you want to use a one-tailed test instead of a two-tailed test. Then, you divide the p-value from a two-tailed test in half to get the p-value for a one tailed test. You’d still compare it to your original alpha.
For regression, the same thing applies. If you want to use a one-tailed test for a coefficient, just divide the p-value in half if you can’t tell the software that you want a one-tailed test. The default is two-tailed. If your software has the option for one-tailed tests for any procedure, including regression, it’ll adjust the p-value for you. So, in the normal course of things, you won’t need to adjust anything.
March 26, 2020 at 12:00 pm
Hey Jim, for a one-tailed hypothesis test with a .05 confidence level, should I use a 95% confidence interval or a 90% confidence interval? Thanks
March 26, 2020 at 5:05 pm
You should use a one-sided 95% confidence interval. One-sided CIs have either an upper OR a lower bound but remain unbounded on the other side.
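As a rough illustration of a one-sided interval, the sketch below computes a 95% lower confidence bound for a mean. It makes several assumptions that are not in the discussion above: a normal (Z) approximation is used instead of the t-distribution, the function name is invented for this example, and the sample numbers are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def one_sided_lower_bound(mean: float, sd: float, n: int, confidence: float = 0.95) -> float:
    """Lower bound of a one-sided confidence interval for a mean.

    The interval runs from this bound up to +infinity. Uses a normal
    approximation, which is an assumption of this sketch.
    """
    # All of the error probability (1 - confidence) sits in one tail,
    # so the multiplier is the 95th percentile, not the 97.5th.
    z = NormalDist().inv_cdf(confidence)
    return mean - z * sd / sqrt(n)


# Hypothetical sample summary, for illustration only.
lower = one_sided_lower_bound(mean=5.2, sd=1.0, n=100)
print(f"one-sided 95% CI: ({lower:.4f}, +infinity)")
```

With these made-up numbers the lower bound lies above 5, which matches the result of an upper-tailed test of a target value of 5 at the 0.05 significance level on the same data: a one-sided 95% interval and a one-tailed test at 0.05 lead to the same decision.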
March 16, 2020 at 4:30 pm
This is not applicable to the subject but… When performing tests of equivalence, we look at the confidence interval of the difference between two groups, and we perform two one-sided t-tests for equivalence.
March 15, 2020 at 7:51 am
Thanks for this illustrative blogpost. I had a question on one of your points though.
By definition of H1 and H0, a two-sided alternate hypothesis is that there is a difference in means between the test and control. Not that anything is ‘better’ or ‘worse’.
Just because we observed a negative result in your example, does not mean we can conclude it’s necessarily worse, but instead just ‘different’.
Therefore while it enables us to spot the fact that there may be differences between test and control, we cannot make claims about directional effects. So I struggle to see why they actually need to be used instead of one-sided tests.
What’s your take on this?
March 16, 2020 at 3:02 am
Hi Dominic,
If you’ll notice, I carefully avoid stating better or worse because in a general sense you’re right. However, given the context of a specific experiment, you can conclude whether a negative value is better or worse. As always in statistics, you have to use your subject-area knowledge to help interpret the results. In some cases, a negative value is a bad result. In other cases, it’s not. Use your subject-area knowledge!
I’m not sure why you think that you can’t make claims about directional effects? Of course you can!
As for why you shouldn’t use one-tailed tests for most cases, read my post When Can I Use One-Tailed Tests . That should answer your questions.
May 10, 2019 at 12:36 pm
Your website is absolutely amazing Jim, you seem like the nicest guy for doing this and I like how there’s no ulterior motive, (I wasn’t automatically signed up for emails or anything when leaving this comment). I study economics and found econometrics really difficult at first, but your website explains it so clearly its been a big asset to my studies, keep up the good work!
May 10, 2019 at 2:12 pm
Thank you so much, Jack. Your kind words mean a lot!
April 26, 2019 at 5:05 am
Hy Jim I really need your help now pls
One-tailed and two- tailed hypothesis, is it the same or twice, half or unrelated pls
April 26, 2019 at 11:41 am
Hi Anthony,
I describe how the hypotheses are different in this post. You’ll find your answers.
February 8, 2019 at 8:00 am
Thank you for your blog Jim, I have a Statistics exam soon and your articles let me understand a lot!
February 8, 2019 at 10:52 am
You’re very welcome! I’m happy to hear that it’s been helpful. Best of luck on your exam!
January 12, 2019 at 7:06 am
Hi Jim, when you say the target value is 5, do you mean that the population mean is 5 and we are trying to validate it with the sample mean of 4.1 using hypothesis tests? If so, how can we measure a population parameter as 5 when it is almost impossible to measure a population parameter? Please clarify.
January 12, 2019 at 6:57 pm
When you set a target for a one-sample test, it’s based on a value that is important to you. It’s not a population parameter or anything like that. The example in this post uses a case where we need parts that are stronger on average than a value of 5. We derive the value of 5 by using our subject area knowledge about what is required for a situation. Given our product knowledge for the hypothetical example, we know it should be 5 or higher. So, we use that in the hypothesis test and determine whether the population mean is greater than that target value.
When you perform a one-sample test, a target value is optional. If you don’t supply a target value, you simply obtain a confidence interval for the range of values that the parameter is likely to fall within. But, sometimes there is meaningful number that you want to test for specifically.
I hope that clarifies the rationale behind the target value!
November 15, 2018 at 8:08 am
I understand that in Psychology a one tailed hypothesis is preferred. Is that so
November 15, 2018 at 11:30 am
No, there’s no overall preference for one-tailed hypothesis tests in statistics. That would be a study-by-study decision based on the types of possible effects. For more information about this decision, read my post: When Can I Use One-Tailed Tests?
November 6, 2018 at 1:14 am
I’m grateful to you for the explanations on One tail and Two tail hypothesis test. This opens my knowledge horizon beyond what an average statistics textbook can offer. Please include more examples in future posts. Thanks
November 5, 2018 at 10:20 am
Thank you. I will search it as well.
Stan Alekman
November 4, 2018 at 8:48 pm
Jim, what is the difference between the central and non-central t-distributions w/respect to hypothesis testing?
November 5, 2018 at 10:12 am
Hi Stan, this is something I will need to look into. I know central t-distribution is the common Student t-distribution, but I don’t have experience using non-central t-distributions. There might well be a blog post in that–after I learn more!
November 4, 2018 at 7:42 pm
this is awesome.
What Is a Two-Tailed Test? Definition and Example
Adam Hayes, Ph.D., CFA, is a financial writer with 15+ years Wall Street experience as a derivatives trader. Besides his extensive derivative trading expertise, Adam is an expert in economics and behavioral finance. Adam received his master's in economics from The New School for Social Research and his Ph.D. from the University of Wisconsin-Madison in sociology. He is a CFA charterholder as well as holding FINRA Series 7, 55 & 63 licenses. He currently researches and teaches economic sociology and the social studies of finance at the Hebrew University in Jerusalem.
Investopedia / Joules Garcia
A two-tailed test, in statistics, is a method in which the critical area of a distribution is two-sided and tests whether a sample is greater than or less than a certain range of values. It is used in null-hypothesis testing and testing for statistical significance. If the sample being tested falls into either of the critical areas, the alternative hypothesis is accepted instead of the null hypothesis.
Key Takeaways
- In statistics, a two-tailed test is a method in which the critical area of a distribution is two-sided and tests whether a sample is greater or less than a range of values.
- It is used in null-hypothesis testing and testing for statistical significance.
- If the sample being tested falls into either of the critical areas, the alternative hypothesis is accepted instead of the null hypothesis.
- By convention two-tailed tests are used to determine significance at the 5% level, meaning each side of the distribution is cut at 2.5%.
A basic concept of inferential statistics is hypothesis testing, which determines whether a claim is true or not given a population parameter. A hypothesis test that is designed to show whether the mean of a sample is significantly greater than or significantly less than the mean of a population is referred to as a two-tailed test. The two-tailed test gets its name from testing the area under both tails of a normal distribution, although the test can also be used with non-normal distributions.
A two-tailed test is designed to examine both sides of a specified data range as designated by the probability distribution involved. The probability distribution should represent the likelihood of a specified outcome based on predetermined standards. This requires the setting of a limit designating the highest (or upper) and lowest (or lower) accepted variable values included within the range. Any data point that exists above the upper limit or below the lower limit is considered out of the acceptance range and in an area referred to as the rejection range.
There is no inherent standard about the number of data points that must exist within the acceptance range. In instances where precision is required, such as in the creation of pharmaceutical drugs, a rejection rate of 0.001% or less may be instituted. In instances where precision is less critical, such as the number of food items in a product bag, a rejection rate of 5% may be appropriate.
A two-tailed test can also be used practically during certain production activities in a firm, such as with the production and packaging of candy at a particular facility. If the production facility designates 50 candies per bag as its goal, with an acceptable distribution of 45 to 55 candies, any bag found with an amount below 45 or above 55 is considered within the rejection range.
To confirm the packaging mechanisms are properly calibrated to meet the expected output, random sampling may be taken to confirm accuracy. A simple random sample takes a small, random portion of the entire population to represent the entire data set, where each member has an equal probability of being chosen.
For the packaging mechanisms to be considered accurate, an average of 50 candies per bag with an appropriate distribution is desired. Additionally, the number of bags that fall within the rejection range needs to fall within the probability distribution limit considered acceptable as an error rate. Here, the null hypothesis would be that the mean is 50 while the alternate hypothesis would be that it is not 50.
If, after conducting the two-tailed test, the z-score falls in the rejection region, meaning that the deviation is too far from the desired mean, then adjustments to the facility or associated equipment may be required to correct the error. Regular use of two-tailed testing methods can help ensure production stays within limits over the long term.
Be careful to note if a statistical test is one- or two-tailed as this will greatly influence a model's interpretation.
When a hypothesis test is set up to show that the sample mean would be only higher than the population mean, this is referred to as a one-tailed test. A formulation of this hypothesis would be, for example, that "the returns on an investment fund would be at least x%." One-tailed tests could also be set up to show that the sample mean could be only less than the population mean. The key difference from a two-tailed test is that in a two-tailed test, the sample mean could be different from the population mean by being either higher or lower than it.
If the sample being tested falls into the one-sided critical area, the alternative hypothesis will be accepted instead of the null hypothesis. A one-tailed test is also known as a directional hypothesis or directional test.
A two-tailed test, on the other hand, is designed to examine both sides of a specified data range to test whether a sample is greater than or less than the range of values.
Example of a Two-Tailed Test
As a hypothetical example, imagine that a new stockbroker, named XYZ, claims that their brokerage fees are lower than those of your current stockbroker, ABC. Data available from an independent research firm indicates that the mean and standard deviation of all ABC broker clients are $18 and $6, respectively.
A sample of 100 clients of ABC is taken, and brokerage charges are calculated with the new rates of XYZ broker. If the mean of the sample is $18.75 and the sample standard deviation is $6, can any inference be made about the difference in the average brokerage bill between ABC and XYZ broker?
- H 0 : Null Hypothesis: mean = 18
- H 1 : Alternative Hypothesis: mean ≠ 18 (This is what we want to prove.)
- Rejection region: Z <= -Z 2.5 or Z >= Z 2.5 (assuming a 5% significance level, split 2.5% on each side).
- Z = (sample mean – mean) / (std-dev / sqrt(no. of samples)) = (18.75 – 18) / (6 / sqrt(100)) = 1.25
This calculated Z value falls between the two limits defined by -Z 2.5 = -1.96 and Z 2.5 = 1.96.
We conclude that there is insufficient evidence to infer that there is any difference between the rates of your existing broker and the new broker. Therefore, the null hypothesis cannot be rejected. Alternatively, the p-value = P(Z < -1.25) + P(Z > 1.25) = 2 * 0.1056 = 0.2112 = 21.12%, which is greater than 0.05 (5%), leads to the same conclusion.
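The arithmetic in this example can be checked with a short script. This is only a sketch using Python's standard library; the variable names are chosen for illustration, and the input numbers are the ones given in the example above.

```python
from math import sqrt
from statistics import NormalDist

# Inputs taken from the brokerage example.
mu0 = 18.0          # mean under the null hypothesis
sample_mean = 18.75
std_dev = 6.0
n = 100
alpha = 0.05

std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1

# Test statistic: distance of the sample mean from mu0 in standard errors.
z = (sample_mean - mu0) / (std_dev / sqrt(n))

# Two-tailed test: alpha is split evenly, so the cutoff is the 97.5th percentile.
z_crit = std_normal.inv_cdf(1 - alpha / 2)

# Two-tailed p-value: probability in both tails beyond |z|.
p_value = 2 * (1 - std_normal.cdf(abs(z)))

reject_null = abs(z) >= z_crit

print(f"z = {z:.2f}, critical value = +/-{z_crit:.2f}")
print(f"p-value = {p_value:.4f}")
print("reject H0" if reject_null else "fail to reject H0")
```

The exact p-value from the normal distribution is about 0.2113; the 0.2112 quoted above comes from rounding the tail area to 0.1056 before doubling it. Both routes lead to the same decision.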
How Is a Two-Tailed Test Designed?
A two-tailed test is designed to determine whether a claim is true or not given a population parameter. It examines both sides of a specified data range as designated by the probability distribution involved. As such, the probability distribution should represent the likelihood of a specified outcome based on predetermined standards.
What Is the Difference Between a Two-Tailed and One-Tailed Test?
A two-tailed hypothesis test is designed to show whether the sample mean is significantly greater than or significantly less than the mean of a population. The two-tailed test gets its name from testing the area under both tails (sides) of a normal distribution. A one-tailed hypothesis test, on the other hand, is set up to test only one direction: that the sample mean would be higher than the population mean, or, in a separate test, that the sample mean would be lower than the population mean.
What Is a Z-score?
A Z-score numerically describes a value's relationship to the mean of a group of values and is measured in terms of the number of standard deviations from the mean. If a Z-score is 0, it indicates that the data point's score is identical to the mean score, whereas Z-scores of 1.0 and -1.0 would indicate values one standard deviation above or below the mean. In a normally distributed data set, about 99.7% of values have a Z-score between -3 and 3, meaning they lie within three standard deviations above and below the mean.
Two Tailed Test: Definition, Examples
What is a Two Tailed Test?
![two tailed hypothesis test example two tailed test](https://www.statisticshowto.com/wp-content/uploads/2009/09/two-tails-normal-dist-300x123.png)
A two tailed test places the rejection region (the place where you would reject the null hypothesis) in both tails of a distribution, leaving the area in the middle as the region where you would not reject it.
For example, let’s say you were running a z test with an alpha level of 5% (0.05). In a one tailed test, the entire 5% would be in a single tail. But with a two tailed test, that 5% is split between the two tails, giving you 2.5% (0.025) in each tail.
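To see what the split does to the cutoffs, the sketch below looks up the critical z values for both cases at an alpha level of 5%. It assumes a standard normal test statistic and uses only Python's standard library; the variable names are illustrative.

```python
from statistics import NormalDist

alpha = 0.05
std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1

# One tailed test: the whole 5% sits in a single tail.
one_tailed_crit = std_normal.inv_cdf(1 - alpha)

# Two tailed test: 2.5% sits in each tail, so the cutoff moves further out.
two_tailed_crit = std_normal.inv_cdf(1 - alpha / 2)

print(f"one tailed critical z: {one_tailed_crit:.3f}")
print(f"two tailed critical z: +/-{two_tailed_crit:.3f}")
```

The two tailed cutoff is further from zero than the one tailed cutoff, because each tail holds only half of alpha.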
Two Tailed T Test
![two tailed hypothesis test example Image: ETSU.edu](https://www.statisticshowto.com/wp-content/uploads/2014/02/t-distribution2-300x120.jpg)
You may want to compare a sample mean to a given value of x with a t test. Let’s say your null hypothesis is that the mean is equal to 10 (μ = 10). A two tailed t test will test:
- Is the mean greater than 10?
- Is the mean less than 10?
If you choose an alpha level of 5%, and the t statistic is in the top 2.5% or bottom 2.5% of the probability distribution, then there is a significant difference between the sample mean and 10. That situation will also result in a p-value of less than 0.05. A small p-value gives you a reason to reject the null hypothesis.
Two tailed F test
An f test tells you if two population variances are equal. A two tailed f test is the standard type of f test, which will tell you if the variances are equal or not equal. The two tailed version of the test will test if one variance is greater than, or less than, the other variance. This is in comparison to the one tailed f test, which is used when you only want to test if one variance is greater than the other or that one variance is less than the other (but not both).
Hypothesis Testing for Means & Proportions
Hypothesis Testing: Upper-, Lower, and Two Tailed Tests
The procedure for hypothesis testing is based on the ideas described above. Specifically, we set up competing hypotheses, select a random sample from the population of interest and compute summary statistics. We then determine whether the sample data supports the null or alternative hypotheses. The procedure can be broken down into the following five steps.
- Step 1. Set up hypotheses and select the level of significance α.
H 0 : Null hypothesis (no change, no difference);
H 1 : Research hypothesis (investigator's belief); α =0.05
Upper-tailed, Lower-tailed, Two-tailed Tests
The research or alternative hypothesis can take one of three forms. An investigator might believe that the parameter has increased, decreased or changed. For example, an investigator might hypothesize:
- H 1 : μ > μ 0 , where μ 0 is the comparator or null value (e.g., μ 0 =191 in our example about weight in men in 2006) and an increase is hypothesized - this type of test is called an upper-tailed test;
- H 1 : μ < μ 0 , where a decrease is hypothesized and this is called a lower-tailed test; or
- H 1 : μ ≠ μ 0 , where a difference is hypothesized and this is called a two-tailed test.
The exact form of the research hypothesis depends on the investigator's belief about the parameter of interest and whether it has possibly increased, decreased or is different from the null value. The research hypothesis is set up by the investigator before any data are collected.
- Step 2. Select the appropriate test statistic.
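The test statistic is a single number that summarizes the sample information. An example of a test statistic is the Z statistic computed as follows:
Z = (x̄ – μ 0 ) / (s / sqrt(n)), where x̄ is the sample mean, s is the sample standard deviation, n is the sample size and μ 0 is the null value.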
When the sample size is small, we will use t statistics (just as we did when constructing confidence intervals for small samples). As we present each scenario, alternative test statistics are provided along with conditions for their appropriate use.
- Step 3. Set up decision rule.
The decision rule is a statement that tells under what circumstances to reject the null hypothesis. The decision rule is based on specific values of the test statistic (e.g., reject H 0 if Z > 1.645). The decision rule for a specific test depends on 3 factors: the research or alternative hypothesis, the test statistic and the level of significance. Each is discussed below.
- The decision rule depends on whether an upper-tailed, lower-tailed, or two-tailed test is proposed. In an upper-tailed test the decision rule has investigators reject H 0 if the test statistic is larger than the critical value. In a lower-tailed test the decision rule has investigators reject H 0 if the test statistic is smaller than the critical value. In a two-tailed test the decision rule has investigators reject H 0 if the test statistic is extreme, either larger than an upper critical value or smaller than a lower critical value.
- The exact form of the test statistic is also important in determining the decision rule. If the test statistic follows the standard normal distribution (Z), then the decision rule will be based on the standard normal distribution. If the test statistic follows the t distribution, then the decision rule will be based on the t distribution. The appropriate critical value will be selected from the t distribution again depending on the specific alternative hypothesis and the level of significance.
- The third factor is the level of significance. The level of significance which is selected in Step 1 (e.g., α =0.05) dictates the critical value. For example, in an upper tailed Z test, if α =0.05 then the critical value is Z=1.645.
The following figures illustrate the rejection regions defined by the decision rule for upper-, lower- and two-tailed Z tests with α=0.05. Notice that the rejection regions are in the upper, lower and both tails of the curves, respectively. The decision rules are written below each figure.
Rejection Region for Upper-Tailed Z Test (H 1 : μ > μ 0 ) with α =0.05. The decision rule is: Reject H 0 if Z > 1.645.
Rejection Region for Lower-Tailed Z Test (H 1 : μ < μ 0 ) with α =0.05. The decision rule is: Reject H 0 if Z < -1.645.
Rejection Region for Two-Tailed Z Test (H 1 : μ ≠ μ 0 ) with α =0.05. The decision rule is: Reject H 0 if Z < -1.960 or if Z > 1.960.
The complete table of critical values of Z for upper, lower and two-tailed tests can be found in the table of Z values to the right in "Other Resources." Critical values of t for upper, lower and two-tailed tests can be found in the table of t values in "Other Resources."
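The three decision rules for Z tests can be written as one small function. This is a minimal sketch using Python's standard library; the function name and the "upper"/"lower"/"two" labels are invented for this illustration, and the critical values are computed from the standard normal distribution rather than read from a table.

```python
from statistics import NormalDist


def reject_h0(z: float, alpha: float = 0.05, tail: str = "two") -> bool:
    """Apply the decision rule for an upper-, lower- or two-tailed Z test.

    `tail` is "upper", "lower" or "two"; these labels are hypothetical names
    chosen for this sketch.
    """
    std_normal = NormalDist()
    if tail == "upper":
        # Reject when Z is larger than the upper critical value (1.645 at alpha=0.05).
        return z > std_normal.inv_cdf(1 - alpha)
    if tail == "lower":
        # Reject when Z is smaller than the lower critical value (-1.645 at alpha=0.05).
        return z < std_normal.inv_cdf(alpha)
    if tail == "two":
        # Reject when Z is extreme in either direction (beyond +/-1.960 at alpha=0.05).
        return abs(z) > std_normal.inv_cdf(1 - alpha / 2)
    raise ValueError("tail must be 'upper', 'lower' or 'two'")


# The same hypothetical statistic can lead to different decisions
# depending on which alternative hypothesis was set up in Step 1.
for tail in ("upper", "lower", "two"):
    print(tail, reject_h0(1.8, alpha=0.05, tail=tail))
```

Because the decision depends on the form of the alternative hypothesis, the choice of tail has to be made before looking at the data, as noted in Step 1.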
- Step 4. Compute the test statistic.
Here we compute the test statistic by substituting the observed sample data into the test statistic identified in Step 2.
- Step 5. Conclusion.
The final conclusion is made by comparing the test statistic (which is a summary of the information observed in the sample) to the decision rule. The final conclusion will be either to reject the null hypothesis (because the sample data are very unlikely if the null hypothesis is true) or not to reject the null hypothesis (because the sample data are not very unlikely).
If the null hypothesis is rejected, then an exact significance level is computed to describe the likelihood of observing the sample data assuming that the null hypothesis is true. The exact level of significance is called the p-value and it will be less than the chosen level of significance if we reject H 0 .
Statistical computing packages provide exact p-values as part of their standard output for hypothesis tests. In fact, when using a statistical computing package, the steps outlined above can be abbreviated. The hypotheses (step 1) should always be set up in advance of any analysis and the significance criterion should also be determined (e.g., α =0.05). Statistical computing packages will produce the test statistic (usually reporting the test statistic as t) and a p-value. The investigator can then determine statistical significance using the following: If p < α then reject H 0 .
H 0 : μ = 191
H 1 : μ > 191
α =0.05
The research hypothesis is that weights have increased, and therefore an upper tailed test is used.
Because the sample size is large (n > 30) the appropriate test statistic is Z = (x̄ – μ 0 ) / (s / sqrt(n)), with μ 0 = 191.
In this example, we are performing an upper tailed test (H 1 : μ > 191), with a Z test statistic and selected α =0.05. Reject H 0 if Z > 1.645.
We now substitute the sample data into the formula for the test statistic identified in Step 2.
We reject H 0 because 2.38 > 1.645. We have statistically significant evidence at α =0.05 to show that the mean weight in men in 2006 is more than 191 pounds.
Because we rejected the null hypothesis, we now approximate the p-value, which is the likelihood of observing the sample data if the null hypothesis is true. An alternative definition of the p-value is the smallest level of significance where we can still reject H 0 . In this example, we observed Z=2.38 and for α=0.05, the critical value was 1.645. Because 2.38 exceeded 1.645 we rejected H 0 . In our conclusion we reported a statistically significant increase in mean weight at a 5% level of significance.
Using the table of critical values for upper tailed tests, we can approximate the p-value. If we select α=0.025, the critical value is 1.96, and we still reject H 0 because 2.38 > 1.960. If we select α=0.010 the critical value is 2.326, and we still reject H 0 because 2.38 > 2.326. However, if we select α=0.005, the critical value is 2.576, and we cannot reject H 0 because 2.38 < 2.576. Therefore, the smallest tabulated α where we still reject H 0 is 0.010, and we use this as the approximate p-value. A statistical computing package would produce a more precise p-value which would be in between 0.005 and 0.010. Here we are approximating the p-value and would report p < 0.010.
Type I and Type II Errors
In all tests of hypothesis, there are two types of errors that can be committed. The first is called a Type I error and refers to the situation where we incorrectly reject H 0 when in fact it is true. This is also called a false positive result (as we incorrectly conclude that the research hypothesis is true when in fact it is not).
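For the upper tailed example with Z=2.38, the more precise p-value that a statistical computing package would report can be sketched with Python's standard library. The variable names are illustrative; the only inputs are the observed Z and the α levels quoted in the example.

```python
from statistics import NormalDist

z_observed = 2.38  # test statistic from the upper tailed example
std_normal = NormalDist()

# Upper tailed p-value: probability of a Z at least this large when H0 is true.
p_value = 1 - std_normal.cdf(z_observed)
print(f"p-value = {p_value:.4f}")

# Reproduce the table-based bracketing: reject at a given alpha
# exactly when the p-value is below that alpha.
for alpha in (0.05, 0.025, 0.010, 0.005):
    critical_value = std_normal.inv_cdf(1 - alpha)
    decision = "reject H0" if z_observed > critical_value else "do not reject H0"
    print(f"alpha = {alpha:.3f}: critical value = {critical_value:.3f} -> {decision}")
```

The computed p-value falls between 0.005 and 0.010, which agrees with the bracketing obtained from the table of critical values.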
When we run a test of hypothesis and decide to reject H 0 (e.g., because the test statistic exceeds the critical value in an upper tailed test) then either we make a correct decision because the research hypothesis is true or we commit a Type I error. The different conclusions are summarized in the table below. Note that we will never know whether the null hypothesis is really true or false (i.e., we will never know which row of the following table reflects reality). Table - Conclusions in Test of Hypothesis
In the first step of the hypothesis test, we select a level of significance, α, and α = P(Type I error). Because we purposely select a small value for α, we control the probability of committing a Type I error. For example, if we select α = 0.05, then whenever H0 is in fact true there is a 5% probability that our test tells us to reject it, that is, that we commit a Type I error. Most investigators are very comfortable with this and are confident when rejecting H0 that the research hypothesis is true (as it is the more likely scenario when we reject H0).

When we run a test of hypothesis and decide not to reject H0 (e.g., because the test statistic is below the critical value in an upper-tailed test), then either we make a correct decision because the null hypothesis is true or we commit a Type II error. Beta (β) represents the probability of a Type II error and is defined as follows: β = P(Type II error) = P(Do not reject H0 | H0 is false). Unfortunately, we cannot choose β to be small (e.g., 0.05) to control the probability of committing a Type II error, because β depends on several factors including the sample size, α, and the research hypothesis. When we do not reject H0, it may be very likely that we are committing a Type II error (i.e., failing to reject H0 when in fact it is false). Therefore, when tests are run and the null hypothesis is not rejected, we often make a weak concluding statement allowing for the possibility that we might be committing a Type II error. If we do not reject H0, we conclude that we do not have significant evidence to show that H1 is true. We do not conclude that H0 is true.

The most common reason for a Type II error is a small sample size.

Content ©2017. All Rights Reserved. Date last modified: November 6, 2017. Wayne W. LaMorte, MD, PhD, MPH
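To make the dependence of β on sample size and α concrete, here is a rough sketch for an upper-tailed Z test with a known population standard deviation. The function name `type_ii_error`, the true mean of 195, and the standard deviation of 25 are hypothetical values chosen for illustration; only μ0 = 191 comes from the example in the text.

```python
from math import sqrt
from statistics import NormalDist


def type_ii_error(mu0, mu_true, sigma, n, alpha=0.05):
    """Beta for an upper-tailed Z test of H0: mu = mu0 when the true mean is mu_true."""
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha)                   # reject H0 when Z > z_crit
    shift = (mu_true - mu0) / (sigma / sqrt(n))     # where Z is centred under the true mean
    # P(do not reject H0) = P(Z <= z_crit) when Z ~ Normal(shift, 1)
    return z.cdf(z_crit - shift)


# Hypothetical scenario: true mean 195, sigma 25 (assumed, not from the text)
for n in (25, 100, 400):
    beta = type_ii_error(mu0=191, mu_true=195, sigma=25, n=n)
    print(f"n={n:<4} beta={beta:.3f} power={1 - beta:.3f}")
```

Under these assumed values, β shrinks as n grows, and choosing a smaller α makes β larger for the same n.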
4.2 Two-tailed tests
Hypotheses that have an equal (=) or not equal (≠) supposition (sign) in the statement are called non-directional hypotheses. In non-directional hypotheses, the researcher is interested in whether there is a statistically significant difference or relationship between two or more variables, but does not have any specific expectation about which group or variable will be higher or lower. For example, a non-directional hypothesis might be: ‘There is a difference in the preference for brand X between male and female consumers.’ In this hypothesis, the researcher is interested in whether there is a statistically significant difference in the preference for brand X between male and female consumers, but does not have a specific prediction about which gender will have a higher preference. The researcher may conduct a survey or experiment to collect data on the brand preference of male and female consumers and then use statistical analysis to determine whether there is a significant difference between the two groups.

Non-directional hypotheses are also known as two-tailed hypotheses. The term ‘two-tailed’ comes from the fact that the statistical test used to evaluate the hypothesis is based on the assumption that the difference or relationship could occur in either direction, resulting in two ‘tails’ in the probability distribution.
Using the coffee foam example (from Activity 1), you have the following set of hypotheses:
H0: µ = 1 cm foam
Ha: µ ≠ 1 cm foam
In this case, the researcher can reject the null hypothesis for a mean value that is either ‘much higher’ or ‘much lower’ than 1 cm foam. This is called a two-tailed test because the rejection region includes outcomes from both the upper and lower tails of the sample distribution when determining a decision rule. To give an illustration, if you set the alpha level (α) equal to 0.05, that would give you a 95% confidence level. Then, you would reject the null hypothesis for obtained values of z < −1.96 or z > 1.96 (you will look at how to calculate z-scores later in the course). This can be plotted on a graph as shown in Figure 7.

Figure 7: A symmetrical graph reminiscent of a bell. The x-axis is labelled ‘z-score’ and the y-axis is labelled ‘probability density’. The x-axis increases in increments of 1 from −2 to 2. The top of the bell-shaped curve is labelled ‘Foam height = 1cm’. The graph circles the rejection regions of the null hypothesis on both sides of the bell curve. Within these circles are two areas shaded orange: beneath the curve from −2 downwards, labelled z < −1.96 and α = 0.025, and beneath the curve from 2 upwards, labelled z > 1.96 and α = 0.025.

In a two-tailed hypothesis test, the null hypothesis assumes that there is no significant difference or relationship between the two groups or variables, and the alternative hypothesis suggests that there is a significant difference or relationship, but does not specify the direction of the difference or relationship. When performing a two-tailed test, you need to determine the level of significance, which is denoted by alpha (α). The value of alpha, in this case, is 0.05. To perform a two-tailed test at a significance level of 0.05, you need to divide alpha by 2, giving a significance level of 0.025 for each distribution tail (0.05/2 = 0.025). This is done because the two-tailed test is looking for significance in either tail of the distribution.
If the calculated test statistic falls in the rejection region of either tail of the distribution, then the null hypothesis is rejected and the alternative hypothesis is accepted. In this case, the researcher can conclude that there is a significant difference or relationship between the two groups or variables.

Assuming that the population follows a normal distribution, the tail located below the critical value of z = –1.96 (in a later section, you will discuss how this value was determined) and the tail above the critical value of z = +1.96 each represent a proportion of 0.025. These tails are referred to as the lower and upper tails, respectively, and they correspond to the extreme values of the distribution that are far from the central part of the bell curve. These critical values are used in a two-tailed hypothesis test to determine whether to reject or fail to reject the null hypothesis. The null hypothesis represents the default assumption that there is no significant difference between the observed data and what would be expected under a specific condition.

If the calculated test statistic falls within the critical values, then the null hypothesis cannot be rejected at the 0.05 level of significance. However, if the calculated test statistic falls outside the critical values (orange-coloured areas in Figure 7), then the null hypothesis can be rejected in favour of the alternative hypothesis, suggesting that there is evidence of a significant difference between the observed data and what would be expected under the specified condition.
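The two-tailed decision rule described here (split α in half, find the critical values ±1.96, and reject when the statistic falls outside them) can be sketched as follows. This is an illustrative helper, not part of the course material; the function name `two_tailed_z_decision` is an assumption.

```python
from statistics import NormalDist


def two_tailed_z_decision(z_stat, alpha=0.05):
    """Return (lower critical value, upper critical value, reject H0?)."""
    # Each tail receives alpha / 2, so the upper cut-off is the 1 - alpha/2 quantile
    upper = NormalDist().inv_cdf(1 - alpha / 2)
    lower = -upper  # the standard normal distribution is symmetric about 0
    reject = z_stat < lower or z_stat > upper
    return lower, upper, reject


lower, upper, _ = two_tailed_z_decision(0.0)
print(f"critical values: {lower:.2f} and {upper:.2f}")

for z in (-2.1, 1.5, 2.1):
    print(z, "reject H0" if two_tailed_z_decision(z)[2] else "fail to reject H0")
```

With α = 0.05 the helper recovers the ±1.96 cut-offs quoted in the text; a z-score of 1.5 lies between them, while −2.1 and 2.1 fall in the rejection regions.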
Hypothesis Testing | A Step-by-Step Guide with Easy Examples
Published on November 8, 2019 by Rebecca Bevans. Revised on June 22, 2023.
Hypothesis testing is a formal procedure for investigating our ideas about the world using statistics. It is most often used by scientists to test specific predictions, called hypotheses, that arise from theories. There are 5 main steps in hypothesis testing:
1. State your null and alternate hypothesis.
2. Collect data.
3. Perform a statistical test.
4. Decide whether to reject or fail to reject your null hypothesis.
5. Present your findings.
Though the specific details might vary, the procedure you will use when testing a hypothesis will always follow some version of these steps.

After developing your initial research hypothesis (the prediction that you want to investigate), it is important to restate it as a null (H0) and alternate (Ha) hypothesis so that you can test it mathematically. The alternate hypothesis is usually your initial hypothesis that predicts a relationship between variables. The null hypothesis is a prediction of no relationship between the variables you are interested in.
For a statistical test to be valid, it is important to perform sampling and collect data in a way that is designed to test your hypothesis. If your data are not representative, then you cannot make statistical inferences about the population you are interested in.

There are a variety of statistical tests available, but they are all based on the comparison of within-group variance (how spread out the data is within a category) versus between-group variance (how different the categories are from one another). If the between-group variance is large enough that there is little or no overlap between groups, then your statistical test will reflect that by showing a low p-value. This means it is unlikely that the differences between these groups came about by chance. Alternatively, if there is high within-group variance and low between-group variance, then your statistical test will reflect that with a high p-value. This means it is likely that any difference you measure between groups is due to chance. Your choice of statistical test will be based on the type of variables and the level of measurement of your collected data.
Based on the outcome of your statistical test, you will have to decide whether to reject or fail to reject your null hypothesis. In most cases you will use the p -value generated by your statistical test to guide your decision. And in most cases, your predetermined level of significance for rejecting the null hypothesis will be 0.05 – that is, when there is a less than 5% chance that you would see these results if the null hypothesis were true. In some cases, researchers choose a more conservative level of significance, such as 0.01 (1%). This minimizes the risk of incorrectly rejecting the null hypothesis ( Type I error ). The results of hypothesis testing will be presented in the results and discussion sections of your research paper , dissertation or thesis . In the results section you should give a brief summary of the data and a summary of the results of your statistical test (for example, the estimated difference between group means and associated p -value). In the discussion , you can discuss whether your initial hypothesis was supported by your results or not. In the formal language of hypothesis testing, we talk about rejecting or failing to reject the null hypothesis. You will probably be asked to do this in your statistics assignments. However, when presenting research results in academic papers we rarely talk this way. Instead, we go back to our alternate hypothesis (in this case, the hypothesis that men are on average taller than women) and state whether the result of our test did or did not support the alternate hypothesis. If your null hypothesis was rejected, this result is interpreted as “supported the alternate hypothesis.” These are superficial differences; you can see that they mean the same thing. You might notice that we don’t say that we reject or fail to reject the alternate hypothesis . This is because hypothesis testing is not designed to prove or disprove anything. 
It is only designed to test whether a pattern we measure could have arisen spuriously, or by chance. If we reject the null hypothesis based on our research (i.e., we find that it is unlikely that the pattern arose by chance), then we can say our test lends support to our hypothesis . But if the pattern does not pass our decision rule, meaning that it could have arisen by chance, then we say the test is inconsistent with our hypothesis . If you want to know more about statistics , methodology , or research bias , make sure to check out some of our other articles with explanations and examples.
Hypothesis testing is a formal procedure for investigating our ideas about the world using statistics. It is used by scientists to test specific predictions, called hypotheses, by calculating how likely it is that a pattern or relationship between variables could have arisen by chance.

A hypothesis states your predictions about what your research will find. It is a tentative answer to your research question that has not yet been tested. For some research projects, you might have to write several hypotheses that address different aspects of your research question. A hypothesis is not just a guess: it should be based on existing theories and knowledge. It also has to be testable, which means you can support or refute it through scientific research methods (such as experiments, observations and statistical analysis of data).

Null and alternative hypotheses are used in statistical hypothesis testing. The null hypothesis of a test always predicts no effect or no relationship between variables, while the alternative hypothesis states your research prediction of an effect or relationship.

Cite this Scribbr article
Bevans, R. (2023, June 22). Hypothesis Testing | A Step-by-Step Guide with Easy Examples. Scribbr. Retrieved July 3, 2024, from https://www.scribbr.com/statistics/hypothesis-testing/
Explaining two-tailed tests
I am looking for various ways of explaining to my students (in an elementary statistics course) what a two-tailed test is, and how its P value is calculated. How do you explain to your students the two- vs one-tailed test?
2 Answers
This is a great question, and I'm looking forward to everyone's version of explaining the p-value and the two-tailed vs. one-tailed test. I've been teaching fellow orthopaedic surgeons statistics, and therefore I tried to keep it as basic as possible since most of them haven't done any advanced math for 10-30 years.

My way of explaining calculating p-values & the tails
I start by explaining that if we believe that we have a fair coin, we know it should end up tails in 50% of the flips on average ($=H_0$). Now if you wonder what the probability is of getting only 2 tails out of 10 flips with this fair coin, you can calculate that probability as I've done in the bar graph. From the graph you can see that the probability of getting exactly 2 tails out of 10 flips with a fair coin is about $4.4\%$.

Since we would also question the fairness of the coin if we got 1 or 0 tails, we have to include these possibilities, the tail of the test. By adding the values we get that the probability of getting 2 tails or fewer is now about $5.5\%$.

Now if we got only 2 heads, i.e. 8 tails (the other tail), we would probably be just as willing to question the fairness of the coin. This means that you end up with a probability of $5.4...\%+5.4...\% \approx 10.9\%$ for a two-tailed test.

Since we in medicine usually are interested in studying failures, we need to include the opposite side of the probability even if our intent is to do good and to introduce a beneficial treatment.

Reflections slightly out of topic
This simple example also shows how dependent we are on the null hypothesis to calculate the p-value. I also like to point out the resemblance between the binomial curve and the bell curve. When changing to 200 flips you get a natural way of explaining why the probability of getting exactly 100 tails starts to lack relevance.
Defining intervals of interest is a natural transition to probability density/mass functions and their cumulative counterparts. In my class I recommend the Khan Academy statistics videos, and I also use some of his explanations for certain concepts. They also get to flip coins, where we look into the randomness of the coin flipping. The thing that I try to show is that randomness is more random than we usually believe, inspired by this Radiolab episode. I usually have one graph/slide, the R-code that I used to create the graph:
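The answer's R code is not reproduced here. As a stand-in for the arithmetic only (not the plot), the following Python sketch recomputes the coin-flip probabilities quoted in the answer using the binomial formula; `binom_pmf` is a helper written for this illustration.

```python
from math import comb


def binom_pmf(k, n, p=0.5):
    """Probability of exactly k tails in n flips when P(tails) = p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


n_flips = 10

# Exactly 2 tails out of 10 flips of a fair coin
p_exactly_2 = binom_pmf(2, n_flips)

# Lower tail: 0, 1, or 2 tails
p_lower_tail = sum(binom_pmf(k, n_flips) for k in range(0, 3))

# Upper tail: 8, 9, or 10 tails (i.e. 2 or fewer heads)
p_upper_tail = sum(binom_pmf(k, n_flips) for k in range(8, 11))

# Two-tailed p-value: both extremes count as evidence against a fair coin
p_two_tailed = p_lower_tail + p_upper_tail

print(f"P(exactly 2 tails)  = {p_exactly_2:.4f}")
print(f"P(2 tails or fewer) = {p_lower_tail:.4f}")
print(f"two-tailed p-value  = {p_two_tailed:.4f}")
```

The three quantities are 45/1024, 56/1024 and 112/1024, which match the roughly 4.4%, 5.5% and 10.9% figures in the answer.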
Suppose that you want to test the hypothesis that the average height of men is "5 ft 7 inches". You select a random sample of men, measure their heights and calculate the sample mean. Your hypotheses then are:

$H_0: \mu = 5\ \text{ft} \ 7 \ \text{inches}$

$H_A: \mu \ne 5\ \text{ft} \ 7 \ \text{inches}$

In the above situation you do a two-tailed test, as you would reject your null if the sample average is either too low or too high. In this case, the p-value represents the probability of realizing a sample mean that is at least as extreme as the one we actually obtained, assuming that the null is in fact true. Thus, if we observe the sample mean to be "5 ft 8 inches", then the p-value will represent the probability that we will observe a sample mean greater than "5 ft 8 inches" or less than "5 ft 6 inches", provided the null is true.

If on the other hand your alternative was framed like so:

$H_A: \mu > 5\ \text{ft} \ 7 \ \text{inches}$

In the above situation you would do a one-tailed test on the right side. The reason is that you would prefer to reject the null in favor of the alternative only if the sample mean is extremely high. The interpretation of the p-value stays the same, with the slight nuance that we are now talking about the probability of realizing a sample mean that is at least as large as the one we actually obtained. Thus, if we observe the sample mean to be "5 ft 8 inches", then the p-value will represent the probability that we will observe a sample mean greater than "5 ft 8 inches", provided the null is true.
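To see how the one-tailed and two-tailed p-values relate in the height example, here is a small sketch using a Z test. The answer gives no sample size or standard deviation, so `sigma = 3.0` inches and `n = 36` are purely hypothetical values introduced for illustration; heights are expressed in inches (5 ft 7 in = 67, 5 ft 8 in = 68).

```python
from math import sqrt
from statistics import NormalDist

mu_0 = 67.0         # null value: 5 ft 7 inches
sample_mean = 68.0  # observed sample mean: 5 ft 8 inches
sigma = 3.0         # assumed population standard deviation (hypothetical)
n = 36              # assumed sample size (hypothetical)

std_normal = NormalDist()
z = (sample_mean - mu_0) / (sigma / sqrt(n))

# Right-tailed test: probability of a sample mean at least this high under H0
p_one_tailed = 1 - std_normal.cdf(z)

# Two-tailed test: probability of a sample mean at least this far from 67 in either direction
p_two_tailed = 2 * (1 - std_normal.cdf(abs(z)))

print(f"z = {z:.2f}")
print(f"one-tailed p = {p_one_tailed:.4f}")
print(f"two-tailed p = {p_two_tailed:.4f}")
```

Because the normal distribution is symmetric, the two-tailed p-value is exactly twice the one-tailed value whenever the observed mean lies on the side named by the one-sided alternative.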
Two Sample t-test: Definition, Formula, and Example
A two sample t-test is used to determine whether or not two population means are equal. This tutorial explains the following:
Two Sample t-test: Motivation
Suppose we want to know whether or not the mean weight between two different species of turtles is equal. Since there are thousands of turtles in each population, it would be too time-consuming and costly to go around and weigh each individual turtle. Instead, we might take a simple random sample of 15 turtles from each population and use the mean weight in each sample to determine if the mean weight is equal between the two populations.

However, it’s virtually guaranteed that the mean weight between the two samples will be at least a little different. The question is whether or not this difference is statistically significant. Fortunately, a two sample t-test allows us to answer this question.

Two Sample t-test: Formula
A two-sample t-test always uses the following null hypothesis:
H0: μ1 = μ2 (the two population means are equal)
The alternative hypothesis can be either two-tailed, left-tailed, or right-tailed:
H1 (two-tailed): μ1 ≠ μ2 (the two population means are not equal)
H1 (left-tailed): μ1 < μ2 (population 1 mean is less than population 2 mean)
H1 (right-tailed): μ1 > μ2 (population 1 mean is greater than population 2 mean)
We use the following formula to calculate the test statistic t:

$t = \dfrac{\bar{x}_1 - \bar{x}_2}{s_p \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}$

where $\bar{x}_1$ and $\bar{x}_2$ are the sample means, $n_1$ and $n_2$ are the sample sizes, and where $s_p$ is calculated as:

$s_p = \sqrt{\dfrac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}}$

where $s_1^2$ and $s_2^2$ are the sample variances.

If the p-value that corresponds to the test statistic t with $(n_1 + n_2 - 2)$ degrees of freedom is less than your chosen significance level (common choices are 0.10, 0.05, and 0.01), then you can reject the null hypothesis.

Two Sample t-test: Assumptions
For the results of a two sample t-test to be valid, the following assumptions should be met:
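Two Sample t-test: Example
Suppose we want to know whether or not the mean weight between two different species of turtles is equal. To test this, we will perform a two sample t-test at significance level α = 0.05 using the following steps:

Step 1: Gather the sample data. Suppose we collect a random sample of turtles from each population with the following information:
Sample 1: sample size n1 = 40, sample mean weight x1 = 300, sample standard deviation s1 = 18.5
Sample 2: sample size n2 = 38, sample mean weight x2 = 305, sample standard deviation s2 = 16.7

Step 2: Define the hypotheses. We will perform the two sample t-test with the following hypotheses:
H0: μ1 = μ2 (the two population means are equal)
H1: μ1 ≠ μ2 (the two population means are not equal)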
Step 3: Calculate the test statistic t. First, we will calculate the pooled standard deviation $s_p$:

$s_p = \sqrt{\dfrac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}} = \sqrt{\dfrac{(40-1)18.5^2 + (38-1)16.7^2}{40+38-2}} = 17.647$

Next, we will calculate the test statistic t:

$t = \dfrac{\bar{x}_1 - \bar{x}_2}{s_p \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}} = \dfrac{300 - 305}{17.647\sqrt{\dfrac{1}{40} + \dfrac{1}{38}}} = -1.2508$

Step 4: Calculate the p-value of the test statistic t. According to the T Score to P Value Calculator, the p-value associated with t = -1.2508 and degrees of freedom = n1 + n2 - 2 = 40 + 38 - 2 = 76 is 0.21484.

Step 5: Draw a conclusion. Since this p-value is not less than our significance level α = 0.05, we fail to reject the null hypothesis. We do not have sufficient evidence to say that the mean weight of turtles between these two populations is different.

Note: You can also perform this entire two sample t-test by simply using the Two Sample t-test Calculator.

Additional Resources
The following tutorials explain how to perform a two-sample t-test using different statistical programs:
How to Perform a Two Sample t-test in Excel
How to Perform a Two Sample t-test in SPSS
How to Perform a Two Sample t-test in Stata
How to Perform a Two Sample t-test in R
How to Perform a Two Sample t-test in Python
How to Perform a Two Sample t-test on a TI-84 Calculator

Hey there. My name is Zach Bobbitt. I have a Masters of Science degree in Applied Statistics and I’ve worked on machine learning algorithms for professional businesses in both healthcare and retail. I’m passionate about statistics, machine learning, and data visualization and I created Statology to be a resource for both students and teachers alike. My goal with this site is to help you learn statistics through using simple terms, plenty of real-world examples, and helpful illustrations.
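The pooled standard deviation and test statistic from Step 3 of the turtle example can be reproduced with a few lines of Python. This is a minimal sketch using only the standard library; the function name `pooled_t` is an assumption made for this illustration, and the inputs are the summary statistics used in the worked calculation.

```python
from math import sqrt


def pooled_t(mean1, s1, n1, mean2, s2, n2):
    """Pooled standard deviation, t statistic, and degrees of freedom
    for a two sample t-test that assumes equal population variances."""
    df = n1 + n2 - 2
    # Note the brackets: the whole weighted sum of variances is divided by df
    sp = sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / df)
    t = (mean1 - mean2) / (sp * sqrt(1 / n1 + 1 / n2))
    return sp, t, df


sp, t_stat, df = pooled_t(mean1=300, s1=18.5, n1=40, mean2=305, s2=16.7, n2=38)
print(f"s_p = {sp:.3f}, t = {t_stat:.4f}, df = {df}")
```

The standard library has no t distribution, so the p-value is not computed here. If SciPy is installed, `2 * scipy.stats.t.sf(abs(t_stat), df)` is one way to obtain the two-tailed p-value.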
Using R for Hypothesis Testing
This tutorial demonstrates step by step how to use R and Jupyter Notebook to conduct a two-tailed (two-sided) hypothesis test. It follows an eight-step approach that begins with the formulation of the null and alternative hypotheses and ends by stating what the results of the test mean in plain English.

What is hypothesis testing?
Hypothesis testing is one of the cornerstones of inferential statistics. It is generally used to test whether some phenomenon observed in a sample is likely the result of random chance or is "real", that is, statistically significant.

What are the steps of hypothesis testing?
Hypothesis testing in statistics is usually challenging at first, but a step-by-step approach can help keep everything straight. We always start with a hypothesis we would like to prove, for example, that human activity is contributing to global warming. This is what we need to prove, and it is called the alternative hypothesis. Its opposite, that humans are not contributing to global warming, is the null hypothesis. To prove the alternative hypothesis we would need to collect data, and if the collected sample data deviates enough from what we can attribute to normal variation, we will have shown that the alternative hypothesis is much more likely than not.

The global warming question is devilishly hard to test, and as you may have heard the jury is still out. But it does serve as a nice vignette. When presented with the sorts of questions found in introductory statistics, you can use the following steps:
The video tutorial presented here is an example of a so-called two-tailed hypothesis test, and walks through a problem using these steps. You can download the Jupyter Notebook used in the video here.

Statistics - Hypothesis Testing a Proportion (Two Tailed)
A population proportion is the share of a population that belongs to a particular category. Hypothesis tests are used to check a claim about the size of that population proportion.

Hypothesis Testing a Proportion
The following steps are used for a hypothesis test:
1. Check the conditions
2. Define the claims
3. Decide the significance level
4. Calculate the test statistic
5. Conclude
For example:
And we want to check the claim: "The share of Nobel Prize winners that are women is not 50%"

By taking a sample of 100 randomly selected Nobel Prize winners we could find that: 10 out of 100 Nobel Prize winners in the sample were women.

The sample proportion is then: \(\displaystyle \frac{10}{100} = 0.1\), or 10%. From this sample data we check the claim with the steps below.

1. Checking the Conditions
The conditions for calculating a confidence interval for a proportion are:
In our example, 10 of the randomly selected people were women. The rest were not women, so there are 90 in the other category. The conditions are fulfilled in this case.

Note: It is possible to do a hypothesis test without having 5 of each category. But special adjustments need to be made.

2. Defining the Claims
We need to define a null hypothesis (\(H_{0}\)) and an alternative hypothesis (\(H_{1}\)) based on the claim we are checking. The claim was: "The share of Nobel Prize winners that are women is not 50%"

In this case, the parameter is the proportion of Nobel Prize winners that are women (\(p\)). The null and alternative hypotheses are then:

Null hypothesis: 50% of Nobel Prize winners were women.
Alternative hypothesis: The share of Nobel Prize winners that are women is not 50%.

Which can be expressed with symbols as:

\(H_{0}\): \(p = 0.50 \)
\(H_{1}\): \(p \neq 0.50 \)

This is a 'two-tailed' test, because the alternative hypothesis claims that the proportion is different (larger or smaller) than in the null hypothesis. If the data supports the alternative hypothesis, we reject the null hypothesis and accept the alternative hypothesis.

3. Deciding the Significance Level
The significance level (\(\alpha\)) is the uncertainty we accept when rejecting the null hypothesis in a hypothesis test. The significance level is a percentage probability of accidentally making the wrong conclusion. Typical significance levels are:
\(\alpha = 0.1\) (10%)
\(\alpha = 0.05\) (5%)
\(\alpha = 0.01\) (1%)
A lower significance level means that the evidence in the data needs to be stronger to reject the null hypothesis. There is no "correct" significance level - it only states the uncertainty of the conclusion.

Note: A 5% significance level means that when we reject a null hypothesis, we expect to reject a true null hypothesis 5 out of 100 times.

4. Calculating the Test Statistic
The test statistic is used to decide the outcome of the hypothesis test. The test statistic is a standardized value calculated from the sample. The formula for the test statistic (TS) of a population proportion is:

\(\displaystyle \frac{\hat{p} - p}{\sqrt{p(1-p)}} \cdot \sqrt{n} \)

\(\hat{p}-p\) is the difference between the sample proportion (\(\hat{p}\)) and the claimed population proportion (\(p\)). \(n\) is the sample size.

In our example:
The claimed (\(H_{0}\)) population proportion (\(p\)) was \( 0.50 \)
The sample proportion (\(\hat{p}\)) was \( 0.1 \)
The sample size (\(n\)) was \(100\)

So the test statistic (TS) is then:

\(\displaystyle \frac{0.1-0.5}{\sqrt{0.5(1-0.5)}} \cdot \sqrt{100} = \frac{-0.4}{\sqrt{0.5(0.5)}} \cdot \sqrt{100} = \frac{-0.4}{\sqrt{0.25}} \cdot \sqrt{100} = \frac{-0.4}{0.5} \cdot 10 = \underline{-8}\)

You can also calculate the test statistic using programming language functions. With Python use the scipy and math libraries to calculate the test statistic for a proportion. With R use the built-in math functions to calculate the test statistic for a proportion.

5. Concluding
There are two main approaches for making the conclusion of a hypothesis test:
Note: The two approaches are only different in how they present the conclusion.

The Critical Value Approach
For the critical value approach we need to find the critical value (CV) of the significance level (\(\alpha\)). For a population proportion test, the critical value (CV) is a Z-value from a standard normal distribution. This critical Z-value (CV) defines the rejection region for the test. The rejection region is an area of probability in the tails of the standard normal distribution. Because the claim is that the population proportion is different from 50%, the rejection region is split into both the left and right tail.

Choosing a significance level (\(\alpha\)) of 0.01, or 1%, we can find the critical Z-value from a Z-table, or with a programming language function.

Note: Because this is a two-tailed test the tail area (\(\alpha\)) needs to be split in half (divided by 2).

With Python use the Scipy Stats library norm.ppf() function to find the Z-value for an \(\alpha\)/2 = 0.005 in the left tail. With R use the built-in qnorm() function to find the Z-value for an \(\alpha\)/2 = 0.005 in the left tail.

Using either method we can find that the critical Z-value in the left tail is \(\approx \underline{-2.5758}\). Since a normal distribution is symmetric, we know that the critical Z-value in the right tail will be the same number, only positive: \(\underline{2.5758}\)

For a two-tailed test we need to check if the test statistic (TS) is smaller than the negative critical value (-CV), or bigger than the positive critical value (CV). If the test statistic is smaller than the negative critical value, or bigger than the positive critical value, the test statistic is in the rejection region. When the test statistic is in the rejection region, we reject the null hypothesis (\(H_{0}\)).
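The tutorial points to SciPy's `norm.ppf()` for this step. As a dependency-free alternative, the sketch below uses `statistics.NormalDist` from the Python standard library to compute both the test statistic and the two-tailed critical values for the Nobel Prize example. The variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

n = 100          # sample size
successes = 10   # women in the sample
p0 = 0.50        # proportion claimed by the null hypothesis
alpha = 0.01     # significance level

# Test statistic: (p_hat - p0) / sqrt(p0 * (1 - p0)) * sqrt(n)
p_hat = successes / n
test_statistic = (p_hat - p0) / sqrt(p0 * (1 - p0)) * sqrt(n)

# Two-tailed test: alpha is split in half, so use the 1 - alpha/2 quantile
critical_value = NormalDist().inv_cdf(1 - alpha / 2)

# Reject H0 if the test statistic lies in either tail
reject_h0 = test_statistic < -critical_value or test_statistic > critical_value

print(f"test statistic  = {test_statistic:.4f}")
print(f"critical values = {-critical_value:.4f} and {critical_value:.4f}")
print(f"reject H0: {reject_h0}")
```

The test statistic lands far below the left critical value, so it sits in the rejection region at the 1% level.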
Here, the test statistic (TS) was \(\approx \underline{-8}\) and the critical value was \(\approx \underline{-2.5758}\)

Here is an illustration of this test in a graph:

Since the test statistic was smaller than the negative critical value we reject the null hypothesis. This means that the sample data supports the alternative hypothesis. And we can summarize the conclusion stating:

The sample data supports the claim that "The share of Nobel Prize winners that are women is not 50%" at a 1% significance level.

The P-Value Approach
For the P-value approach we need to find the P-value of the test statistic (TS). If the P-value is smaller than the significance level (\(\alpha\)), we reject the null hypothesis (\(H_{0}\)).

The test statistic was found to be \( \approx \underline{-8} \)

For a population proportion test, the test statistic is a Z-value from a standard normal distribution. Because this is a two-tailed test, we need to find the P-value of a Z-value smaller than -8 and multiply it by 2. We can find the P-value using a Z-table, or with a programming language function.

With Python use the Scipy Stats library norm.cdf() function to find the P-value of a Z-value smaller than -8 for a two-tailed test. With R use the built-in pnorm() function to find the P-value of a Z-value smaller than -8 for a two-tailed test.

Using either method we can find that the P-value is \(\approx \underline{1.25 \cdot 10^{-15}}\) or \(0.00000000000000125\)

This tells us that the significance level (\(\alpha\)) would need to be bigger than 0.000000000000125% to reject the null hypothesis. This P-value is smaller than any of the common significance levels (10%, 5%, 1%). So the null hypothesis is rejected at all of these significance levels.

The sample data supports the claim that "The share of Nobel Prize winners that are women is not 50%" at a 10%, 5%, and 1% significance level.
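The two-tailed P-value for a Z-value of -8 can also be sketched with the Python standard library. The choice of `math.erfc` here is an assumption of this illustration rather than the tutorial's method: for tail areas this small, computing the lower tail as `0.5 * erfc(-z / sqrt(2))` avoids the loss of precision that subtracting from 1 can cause.

```python
from math import erfc, sqrt

z = -8.0      # test statistic from the example
alpha = 0.01  # significance level

# Lower-tail area P(Z < z) for a standard normal, written with erfc for accuracy
p_lower_tail = 0.5 * erfc(-z / sqrt(2))

# Two-tailed test: double the single-tail area
p_value = 2 * p_lower_tail

print(f"p-value = {p_value:.3e}")
print("reject H0" if p_value < alpha else "fail to reject H0")
```

The result is on the order of 10^-15, far below every common significance level, which agrees with the conclusion in the text.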
Calculating a P-Value for a Hypothesis Test with Programming
Many programming languages can calculate the P-value to decide the outcome of a hypothesis test. Using software and programming to calculate statistics is more common for bigger sets of data, as calculating manually becomes difficult. The P-value calculated here will tell us the lowest possible significance level where the null hypothesis can be rejected.
With Python, use the scipy and math libraries to calculate the P-value for a two-tailed hypothesis test for a proportion. Here, the sample size is 100, the occurrences are 10, and the test is for a proportion different from 0.50.
With R, use the built-in prop.test() function to find the P-value for a two-tailed hypothesis test for a proportion. Here, the sample size is 100, the occurrences are 10, and the test is for a proportion different from 0.50.
Note: The conf.level in the R code is the reverse of the significance level. Here, the significance level is 0.01, or 1%, so the conf.level is 1-0.01 = 0.99, or 99%.
Left-Tailed and Two-Tailed Tests
This was an example of a two-tailed test, where the alternative hypothesis claimed that the parameter is different from the null hypothesis claim.
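The Python calculation described above (sample size 100, 10 occurrences, null proportion 0.50) might look like the following. This is a sketch under the assumption that the test is the normal-approximation Z-test used earlier in the walkthrough; the variable names are chosen for illustration.

```python
import math
from scipy.stats import norm

n = 100     # sample size
x = 10      # number of occurrences
p0 = 0.50   # proportion claimed in the null hypothesis

# Sample proportion and its standard error under the null hypothesis.
p_hat = x / n
standard_error = math.sqrt(p0 * (1 - p0) / n)

# Z test statistic for a population proportion.
z = (p_hat - p0) / standard_error

# Two-tailed P-value: area beyond |z| in both tails.
p_value = 2 * norm.cdf(-abs(z))

print(f"test statistic: {z:.4f}")
print(f"P-value: {p_value:.3e}")
```

R's prop.test() applies a continuity correction by default, so its reported P-value may differ slightly from this uncorrected Z-test.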
Two Tailed Hypothesis
In the vast realm of scientific inquiry, the two-tailed hypothesis holds a special place, serving as a compass for researchers exploring possibilities in two opposing directions. Instead of predicting a specific direction of the relationship between variables, it remains open to outcomes on both ends of the spectrum. Understanding how to craft such a hypothesis, enriched with insights and nuances, can elevate the robustness of one’s research. Delve into its world, discover thesis statement examples, learn the art of its formulation, and grasp tips to master its intricacies.
What is Two Tailed Hypothesis? – Definition
A two-tailed hypothesis, also known as a non-directional hypothesis, is a type of hypothesis used in statistical testing that predicts a relationship between variables without specifying the direction of the relationship. In other words, it tests for the possibility of the relationship in both directions. This approach is used when a researcher believes there might be a difference due to the experiment but doesn’t have enough preliminary evidence or basis to predict a specific direction of that difference.
What is an example of a Two Tailed hypothesis statement?
Let’s consider a study on the impact of a new teaching method on student performance:
Hypothesis Statement: The new teaching method will have an effect on student performance.
Notice that the hypothesis doesn’t specify whether the effect will be positive or negative (i.e., whether student performance will improve or decline). It’s open to both possibilities, making it a two-tailed hypothesis.
Two Tailed Hypothesis Statement Examples
The two-tailed hypothesis, an essential tool in research, doesn’t predict a specific directional outcome between variables. Instead, it posits that an effect exists, without specifying its nature. This approach offers flexibility, as it remains open to both positive and negative outcomes. Below are various examples from diverse fields to shed light on this versatile research method.
Two Tailed Hypothesis Statement Examples in Research
In academic research, a two-tailed hypothesis is versatile, not pointing to a specific direction of effect but remaining open to outcomes on both ends of the spectrum. Such hypotheses aim to determine if a particular variable affects another, without specifying how. Here are examples tailored to research scenarios.
Two Tailed Testing Hypothesis Statement Examples
In hypothesis testing, a two-tailed test examines the possibility of a relationship in both directions. Unlike one-tailed tests, it doesn’t anticipate a specific direction of the relationship. The following are examples that encapsulate this approach within varied testing scenarios.
How do you know if a hypothesis is two-tailed?
To determine if a hypothesis is two-tailed, you must look at the nature of the prediction. A two-tailed hypothesis is neutral concerning the direction of the predicted relationship or difference between groups. It simply predicts a difference or relationship without specifying whether it will be positive, negative, greater, or lesser. The hypothesis tests for effects in both directions.
What is one-tailed and two-tailed Hypothesis test with example?
In hypothesis testing, the choice between a one-tailed and a two-tailed test is determined by the nature of the research question.
One-tailed hypothesis: This tests for a specific direction of the effect. It predicts the direction of the relationship or difference between groups. For example, a one-tailed hypothesis might state: “The new drug will reduce symptoms more effectively than the standard treatment.”
Two-tailed hypothesis: This doesn’t specify the direction. It predicts that there will be a difference, but it doesn’t forecast whether the difference will be positive or negative. For example, a two-tailed hypothesis might state: “The new drug will have a different effect on symptoms compared to the standard treatment.”
What is a two-tailed hypothesis in psychology?
In psychology, a two-tailed hypothesis is frequently used when researchers are exploring new areas or relationships without a strong prior basis to predict the direction of findings. For instance, a psychologist might use a two-tailed hypothesis to explore whether a new therapeutic method has different outcomes than a traditional method, without predicting whether the outcomes will be better or worse.
What does a two-tailed alternative hypothesis look like?
A two-tailed alternative hypothesis is generally framed to show that a parameter is simply different from a certain value, without specifying the direction of the difference. Using mathematical notation, for a population mean (μ) and a proposed value (k), the two-tailed hypothesis would look like: H1: μ ≠ k.
How do you write a Two-Tailed hypothesis statement? – A Step by Step Guide
Tips for Writing Two Tailed Hypothesis
What are three examples of two-tailed hypothesis tests?
A two-tailed hypothesis test is a statistical method used to determine if there is a significant difference between two groups or variables. It involves testing the null hypothesis that there is no difference between the two groups against the alternative hypothesis that there is a difference. Three examples of two-tailed hypothesis tests include: comparing the mean test scores of students who received a new teaching method versus those who did not, analyzing the effectiveness of two different medications on treating a certain illness, and examining the relationship between income and job satisfaction among employees in two different industries. In each of these examples, the two-tailed hypothesis test would help determine if there is a significant difference between the two groups being compared.
Two-Tailed Hypothesis Tests: 3 Example Problems
In statistics, we use hypothesis tests to determine whether some claim about a population parameter is true or not.
Whenever we perform a hypothesis test, we always write a null hypothesis and an alternative hypothesis, which take the following forms:
H0 (Null Hypothesis): Population parameter =, ≤, ≥ some value
HA (Alternative Hypothesis): Population parameter <, >, ≠ some value
There are two types of hypothesis tests: one-tailed tests and two-tailed tests.
In a two-tailed test, the alternative hypothesis always contains the not equal (≠) sign. This indicates that we’re testing whether or not some effect exists, regardless of whether it’s a positive or negative effect. Check out the following example problems to gain a better understanding of two-tailed tests.
Example 1: Factory Widgets
Suppose it’s assumed that the average weight of a certain widget produced at a factory is 20 grams. However, one engineer believes that a new method produces widgets whose average weight is different from 20 grams. To test this, he can perform a two-tailed hypothesis test with the following null and alternative hypotheses:
H0 (Null Hypothesis): μ = 20 grams
HA (Alternative Hypothesis): μ ≠ 20 grams
This is an example of a two-tailed hypothesis test because the alternative hypothesis contains the not equal “≠” sign. The engineer believes that the new method will influence widget weight, but doesn’t specify whether it will cause average weight to increase or decrease. To test this, he uses the new method to produce 20 widgets and obtains the following information:
Plugging these values into a one-sample t-test calculator, we obtain the following results:
Since the p-value is not less than .05, the engineer fails to reject the null hypothesis. He does not have sufficient evidence to say that the true mean weight of widgets produced by the new method is different than 20 grams.
Example 2: Plant Growth
Suppose a standard fertilizer has been shown to cause a species of plants to grow by an average of 10 inches. However, one botanist believes a new fertilizer causes this species of plants to grow by an average amount different than 10 inches. To test this, she can perform a two-tailed hypothesis test with the following null and alternative hypotheses:
H0 (Null Hypothesis): μ = 10 inches
HA (Alternative Hypothesis): μ ≠ 10 inches
This is an example of a two-tailed hypothesis test because the alternative hypothesis contains the not equal “≠” sign. The botanist believes that the new fertilizer will influence plant growth, but doesn’t specify whether it will cause average growth to increase or decrease. To test this claim, she applies the new fertilizer to a simple random sample of 15 plants and obtains the following information:
Since the p-value is less than .05, the botanist rejects the null hypothesis. She has sufficient evidence to conclude that the new fertilizer causes an average growth that is different than 10 inches.
Example 3: Studying Method
A professor believes that a certain studying technique will influence the mean score that her students receive on a certain exam, but she’s unsure if it will increase or decrease the mean score, which is currently 82. To test this, she lets each student use the studying technique for one month leading up to the exam and then administers the same exam to each of the students. She then performs a hypothesis test using the following hypotheses:
H0 (Null Hypothesis): μ = 82
HA (Alternative Hypothesis): μ ≠ 82
This is an example of a two-tailed hypothesis test because the alternative hypothesis contains the not equal “≠” sign. The professor believes that the studying technique will influence the mean exam score, but doesn’t specify whether it will cause the mean score to increase or decrease. To test this claim, the professor has 25 students use the new studying method and then take the exam. She collects the following data on the exam scores for this sample of students:
Since the p-value is less than .05, the professor rejects the null hypothesis. She has sufficient evidence to conclude that the new studying method produces exam scores with an average score that is different than 82.
Two-tailed hypothesis tests are also known as nondirectional and two-sided tests because you can test for effects in both directions. When you perform a two-tailed test, you split the significance level percentage between both tails of the distribution. In the example below, I use an alpha of 5% and the distribution has two shaded regions of 2. ...
Two-Tailed Test: A two-tailed test is a statistical test in which the critical area of a distribution is two-sided and tests whether a sample is greater than or less than a certain range of values ...
A two tailed test tells you that you're finding the area in the middle of a distribution. In other words, your rejection region (the place where you would reject the null hypothesis) is in both tails. For example, let's say you were running a z test with an alpha level of 5% (0.05). In a one tailed test, the entire 5% would be in a single tail.
A one tailed test does not leave more room to conclude that the alternative hypothesis is true. The benefit (increased certainty) of a one tailed test doesn't come free, as the analyst must know "something more", which is the direction of the effect, compared to a two tailed test.
So let's perform the step -1 of hypothesis testing which is: Specify the Null (H0) and Alternate (H1) hypothesis. Null hypothesis (H0): The null hypothesis here is what currently stated to be true about the population. In our case it will be the average height of students in the batch is 100. H0 : μ = 100.
The level of significance which is selected in Step 1 (e.g., α =0.05) dictates the critical value. For example, in an upper tailed Z test, if α =0.05 then the critical value is Z=1.645. The following figures illustrate the rejection regions defined by the decision rule for upper-, lower- and two-tailed Z tests with α=0.05.
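The critical values behind these three decision rules can be checked with a short Python sketch. It assumes SciPy is installed; the helper `in_rejection_region` and its `tail` argument are hypothetical names introduced here for illustration.

```python
from scipy.stats import norm

alpha = 0.05  # significance level from the text

# Critical values for each type of Z test at this alpha.
upper_cv = norm.ppf(1 - alpha)           # upper-tailed: all of alpha in the right tail
lower_cv = norm.ppf(alpha)               # lower-tailed: all of alpha in the left tail
two_tailed_cv = norm.ppf(1 - alpha / 2)  # two-tailed: alpha/2 in each tail

def in_rejection_region(z, tail):
    """Return True if test statistic z falls in the rejection region.

    tail is one of "upper", "lower", or "two" (names assumed for this sketch).
    """
    if tail == "upper":
        return z > upper_cv
    if tail == "lower":
        return z < lower_cv
    if tail == "two":
        return abs(z) > two_tailed_cv
    raise ValueError("tail must be 'upper', 'lower', or 'two'")

print(f"upper-tailed CV: {upper_cv:.3f}")
print(f"lower-tailed CV: {lower_cv:.3f}")
print(f"two-tailed CVs: +/-{two_tailed_cv:.3f}")
```

A test statistic such as Z = 1.8 would be rejected by the upper-tailed rule but not by the two-tailed rule, which illustrates why the direction of the test has to be fixed before looking at the data.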
The one-tailed hypothesis is rejected only if the sample proportion is much greater than \(0.5\). The alternative hypothesis in the two-tailed test is \(\pi \neq 0.5\). In the one-tailed test it is \(\pi > 0.5\). You should always decide whether you are going to use a one-tailed or a two-tailed probability before looking at the data.
At this point, you might use a statistical test, like unpaired or 2-sample t-test, to see if there's a significant difference between the two groups' means. Typically, an unpaired t-test starts with two hypotheses. The first hypothesis is called the null hypothesis, and it basically says there's no difference in the means of the two groups.
The term 'two-tailed' comes from the fact that the statistical test used to evaluate the hypothesis is based on the assumption that the difference or relationship could occur in either direction, resulting in two 'tails' in the probability distribution. Using the coffee foam example (from Activity 1), you have the following set of ...
With R use built-in math and statistics functions find the P-value for a two tailed hypothesis test for a mean. Here, the sample size is 30, the sample mean is 62.1, the sample standard deviation is 13.46, and the test is for a mean different from 60. ... This was an example of a left tailed test, where the alternative hypothesis claimed that ...
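The same two-tailed test for a mean can be sketched in Python, using the figures quoted above (sample size 30, sample mean 62.1, sample standard deviation 13.46, null mean 60). This assumes SciPy is available and that a one-sample Student's t-test is the intended procedure; the variable names are illustrative.

```python
import math
from scipy import stats

n = 30               # sample size
sample_mean = 62.1   # sample mean
sample_sd = 13.46    # sample standard deviation
mu0 = 60             # mean claimed in the null hypothesis

# t test statistic: distance of the sample mean from mu0 in standard errors.
standard_error = sample_sd / math.sqrt(n)
t_statistic = (sample_mean - mu0) / standard_error

# Two-tailed P-value from the t-distribution with n - 1 degrees of freedom.
degrees_of_freedom = n - 1
p_value = 2 * stats.t.sf(abs(t_statistic), degrees_of_freedom)

print(f"t statistic: {t_statistic:.4f}")
print(f"P-value: {p_value:.4f}")
```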
Hypothesis testing example Based on the type of data you collected, you perform a one-tailed t-test to test whether men are in fact taller than women. This test gives you: an estimate of the difference in average height between the two groups.
A two-tailed hypothesis test example: A machine is used to fill bags with coffee, and each bag is 1 kg. A randomly selected sample of 30 bags has a mean weight of 1.01 kg with a standard deviation ...
This tutorial explains the basics of hypothesis testing. It also shows how to conduct a two-tailed hypothesis z-test for a population mean.Intro to hypothesi...
In coin flipping, the null hypothesis is a sequence of Bernoulli trials with probability 0.5, yielding a random variable X which is 1 for heads and 0 for tails, and a common test statistic is the sample mean (of the number of heads) \(\bar{X}\). If testing for whether the coin is biased towards heads, a one-tailed test would be used - only large numbers of heads would be significant.
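The difference between the one-tailed and two-tailed versions of the coin test can be illustrated with an exact binomial calculation. This is a sketch using only the Python standard library; it works with the count of heads rather than the sample mean, and the figures of 20 flips and 15 heads are hypothetical values chosen for illustration.

```python
from math import comb

def upper_tail_probability(n_flips, n_heads):
    """P(X >= n_heads) for a fair coin flipped n_flips times."""
    favorable = sum(comb(n_flips, k) for k in range(n_heads, n_flips + 1))
    return favorable / 2 ** n_flips

n_flips = 20   # hypothetical number of flips
n_heads = 15   # hypothetical number of heads observed

# One-tailed test (biased towards heads): only large head counts are extreme.
one_tailed_p = upper_tail_probability(n_flips, n_heads)

# Two-tailed test (biased either way): results at least as far from
# n_flips / 2 in either direction are extreme. A fair coin's distribution
# is symmetric, so the two tails have equal probability.
farther_count = max(n_heads, n_flips - n_heads)
two_tailed_p = min(1.0, 2 * upper_tail_probability(n_flips, farther_count))

print(f"one-tailed P-value: {one_tailed_p:.4f}")
print(f"two-tailed P-value: {two_tailed_p:.4f}")
```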
Two-tailed hypothesis test example Problem: A premium golf ball production line must produce all of its balls to 1.615 ounces in order to get the top rating (and therefore the top dollar). Samples are drawn hourly and checked. If the production line gets out of sync with a statistical significance of more than 1%, it must be shut down and repaired.
In the above situation you do a two-tailed test as you would reject your null if the sample average is either too low or too high. In this case, the p-value represents the probability of realizing a sample mean that is at least as extreme as the one we actually obtained assuming that the null is in fact true.
This statistics video tutorial explains when you should use a one tailed test vs a two tailed test when solving problems associated with hypothesis testing. ...
The alternative hypothesis can be either two-tailed, left-tailed, or right-tailed: H 1 ... 0.05, and 0.01) then you can reject the null hypothesis. Two Sample t-test: Assumptions. For the results of a two sample t-test to be valid, the following assumptions should be met:
In a two-tailed test of hypothesis, the two critical points divide the area under the sampling distribution of a: Group of answer choices a. sample statistic...
Choose a sample size to test your hypothesis. The textbook problem usually does this for you, but you will need to use the sample size to calculate your test statistic. ... The video tutorial presented here is an example of a so-called two-tailed hypothesis test, and walks through a problem using these steps. ...
Statistical Hypothesis Hypothesis Testing - another method of statistical inference along with estimation - a systematic procedure for deciding whether the results of a research study support a particular theory which applies to a population - hypothesis testing uses sample data to evaluate a hypothesis about a population - a statistical hypothesis is a conjecture (inference formed without ...
Hyp Mean - One Sample (data). One Sample Hypothesis Test for the Mean, Problem 1. Ho: Mean SAT = 1100; Ha: Mean SAT greater than 1100 (upper-tailed test). Level of significance: 0.05; sample size: 95; sample mean: 1293.16; standard deviation: 121.28; test statistic (computed): 15.52. With a p-value of .0000 < .05, reject the null hypothesis and conclude that the mean SAT is greater than 1100. Direction of Test ...
The employment of two-sample hypothesis testing in examining random graphs has been a prevalent approach in diverse fields such as social sciences, neuroscience, and genetics. We advance a spectral-based two-sample hypothesis testing methodology to test the latent position random graphs. We propose two distinct asymptotic normal statistics ...