9.4 Full Hypothesis Test Examples

Tests on means, example 9.8.

Jeffrey, as an eight-year old, established a mean time of 16.43 seconds for swimming the 25-yard freestyle, with a standard deviation of 0.8 seconds . His dad, Frank, thought that Jeffrey could swim the 25-yard freestyle faster using goggles. Frank bought Jeffrey a new pair of expensive goggles and timed Jeffrey for 15 25-yard freestyle swims . For the 15 swims, Jeffrey's mean time was 16 seconds. Frank thought that the goggles helped Jeffrey to swim faster than the 16.43 seconds. Conduct a hypothesis test using a preset α = 0.05. Assume that the swim times for the 25-yard freestyle are normal.

Set up the Hypothesis Test:

Since the problem is about a mean, this is a test of a single population mean .

H 0 : μ = 16.43   H a : μ < 16.43

For Jeffrey to swim faster, his time will be less than 16.43 seconds. The "<" tells you this is left-tailed.

Determine the distribution needed:

Random variable: X ¯ X ¯ = the mean time to swim the 25-yard freestyle.

Distribution for the test: X ¯ X ¯ is normal (population standard deviation is known: σ = 0.8)

X ¯ ~ N ( μ , σ X n ) X ¯ ~ N ( μ , σ X n ) Therefore, X ¯ ~ N ( 16.43 , 0.8 15 ) X ¯ ~ N ( 16.43 , 0.8 15 )

μ = 16.43 comes from H 0 and not the data. σ = 0.8, and n = 15.

Calculate the p -value using the normal distribution for a mean:

p -value = P ( x ¯ x ¯ < 16) = 0.0187 where the sample mean in the problem is given as 16.

p -value = 0.0187 (This is called the actual level of significance .) The p -value is the area to the left of the sample mean is given as 16.

μ = 16.43 comes from H 0 . Our assumption is μ = 16.43.

Interpretation of the p -value: If H 0 is true , there is a 0.0187 probability (1.87%)that Jeffrey's mean time to swim the 25-yard freestyle is 16 seconds or less. Because a 1.87% chance is small, the mean time of 16 seconds or less is unlikely to have happened randomly. It is a rare event.

Compare α and the p -value:

α = 0.05 p -value = 0.0187 α > p -value

Make a decision: Since α > α > p -value, reject H 0 .

This indicates that you reject the null hypothesis that the mean time to swim the 25-yard freestyle is at least 16.43 seconds.

Conclusion: At the 5% significance level, there is sufficient evidence that Jeffrey's mean time to swim the 25-yard freestyle is less than 16.43 seconds. Thus, based on the sample data, we conclude that Jeffrey swims faster using the new goggles.

The Type I and Type II errors for this problem are as follows: The Type I error is to conclude that Jeffrey swims the 25-yard freestyle, on average, in less than 16.43 seconds when, in fact, he actually swims the 25-yard freestyle, on average, in at least 16.43 seconds. (Reject the null hypothesis when the null hypothesis is true.)

The Type II error is that there is not evidence to conclude that Jeffrey swims the 25-yard freestyle, on average, in less than 16.43 seconds when, in fact, he actually does swim the 25-yard free-style, on average, in less than 16.43 seconds. (Do not reject the null hypothesis when the null hypothesis is false.)

The mean throwing distance of a football for Marco, a high school quarterback, is 40 yards, with a standard deviation of two yards. The team coach tells Marco to adjust his grip to get more distance. The coach records the distances for 20 throws. For the 20 throws, Marco’s mean distance was 45 yards. The coach thought the different grip helped Marco throw farther than 40 yards. Conduct a hypothesis test using a preset α = 0.05. Assume the throw distances for footballs are normal.

First, determine what type of test this is, set up the hypothesis test, find the p -value, sketch the graph, and state your conclusion.

Example 9.9

Jasmine has just begun her new job on the sales force of a very competitive company. In a sample of 16 sales calls it was found that she closed the contract for an average value of 108 dollars with a standard deviation of 12 dollars. Test at 5% significance that the population mean is at least 100 dollars against the alternative that it is less than 100 dollars. Company policy requires that new members of the sales force must exceed an average of $100 per contract during the trial employment period. Can we conclude that Jasmine has met this requirement at the significance level of 95%?

  • H 0 : µ ≤ 100 H a : µ > 100 The null and alternative hypothesis are for the parameter µ because the number of dollars of the contracts is a continuous random variable. Also, this is a one-tailed test because the company has only an interested if the number of dollars per contact is below a particular number not "too high" a number. This can be thought of as making a claim that the requirement is being met and thus the claim is in the alternative hypothesis.
  • Test statistic: t c = x ¯ − µ 0 s n = 108 − 100 ( 12 16 ) = 2.67 t c = x ¯ − µ 0 s n = 108 − 100 ( 12 16 ) = 2.67
  • Critical value: t a = 1.753 t a = 1.753 with n-1 degrees of freedom= 15

The test statistic is a Student's t because the sample size is below 30; therefore, we cannot use the normal distribution. Comparing the calculated value of the test statistic and the critical value of t t ( t a ) ( t a ) at a 5% significance level, we see that the calculated value is in the tail of the distribution. Thus, we conclude that 108 dollars per contract is significantly larger than the hypothesized value of 100 and thus we cannot accept the null hypothesis. There is evidence that supports Jasmine's performance meets company standards.

It is believed that a stock price for a particular company will grow at a rate of $5 per week with a standard deviation of $1. An investor believes the stock won’t grow as quickly. The changes in stock price is recorded for ten weeks and are as follows: $4, $3, $2, $3, $1, $7, $2, $1, $1, $2. Perform a hypothesis test using a 5% level of significance. State the null and alternative hypotheses, state your conclusion, and identify the Type I errors.

Example 9.10

A manufacturer of salad dressings uses machines to dispense liquid ingredients into bottles that move along a filling line. The machine that dispenses salad dressings is working properly when 8 ounces are dispensed. Suppose that the average amount dispensed in a particular sample of 35 bottles is 7.91 ounces with a variance of 0.03 ounces squared, s 2 s 2 . Is there evidence that the machine should be stopped and production wait for repairs? The lost production from a shutdown is potentially so great that management feels that the level of significance in the analysis should be 99%.

Again we will follow the steps in our analysis of this problem.

STEP 1 : Set the Null and Alternative Hypothesis. The random variable is the quantity of fluid placed in the bottles. This is a continuous random variable and the parameter we are interested in is the mean. Our hypothesis therefore is about the mean. In this case we are concerned that the machine is not filling properly. From what we are told it does not matter if the machine is over-filling or under-filling, both seem to be an equally bad error. This tells us that this is a two-tailed test: if the machine is malfunctioning it will be shutdown regardless if it is from over-filling or under-filling. The null and alternative hypotheses are thus:

STEP 2 : Decide the level of significance and draw the graph showing the critical value.

This problem has already set the level of significance at 99%. The decision seems an appropriate one and shows the thought process when setting the significance level. Management wants to be very certain, as certain as probability will allow, that they are not shutting down a machine that is not in need of repair. To draw the distribution and the critical value, we need to know which distribution to use. Because this is a continuous random variable and we are interested in the mean, and the sample size is greater than 30, the appropriate distribution is the normal distribution and the relevant critical value is 2.575 from the normal table or the t-table at 0.005 column and infinite degrees of freedom. We draw the graph and mark these points.

STEP 3 : Calculate sample parameters and the test statistic. The sample parameters are provided, the sample mean is 7.91 and the sample variance is .03 and the sample size is 35. We need to note that the sample variance was provided not the sample standard deviation, which is what we need for the formula. Remembering that the standard deviation is simply the square root of the variance, we therefore know the sample standard deviation, s, is 0.173. With this information we calculate the test statistic as -3.07, and mark it on the graph.

STEP 4 : Compare test statistic and the critical values Now we compare the test statistic and the critical value by placing the test statistic on the graph. We see that the test statistic is in the tail, decidedly greater than the critical value of 2.575. We note that even the very small difference between the hypothesized value and the sample value is still a large number of standard deviations. The sample mean is only 0.08 ounces different from the required level of 8 ounces, but it is 3 plus standard deviations away and thus we cannot accept the null hypothesis.

STEP 5 : Reach a Conclusion

Three standard deviations of a test statistic will guarantee that the test will fail. The probability that anything is within three standard deviations is almost zero. Actually it is 0.0026 on the normal distribution, which is certainly almost zero in a practical sense. Our formal conclusion would be “ At a 99% level of significance we cannot accept the hypothesis that the sample mean came from a distribution with a mean of 8 ounces” Or less formally, and getting to the point, “At a 99% level of significance we conclude that the machine is under filling the bottles and is in need of repair”.

Try It 9.10

A company records the mean time of employees working in a day. The mean comes out to be 475 minutes, with a standard deviation of 45 minutes. A manager recorded times of 20 employees. The times of working were (frequencies are in parentheses) 460(3); 465(2); 470(3); 475(1); 480(6); 485(3); 490(2).

Conduct a hypothesis test using a 2.5% level of significance to determine if the mean time is more than 475 .

Hypothesis Test for Proportions

Just as there were confidence intervals for proportions, or more formally, the population parameter p of the binomial distribution, there is the ability to test hypotheses concerning p .

The population parameter for the binomial is p . The estimated value (point estimate) for p is p′ where p′ = x/n , x is the number of successes in the sample and n is the sample size.

When you perform a hypothesis test of a population proportion p , you take a simple random sample from the population. The conditions for a binomial distribution must be met, which are: there are a certain number n of independent trials meaning random sampling, the outcomes of any trial are binary, success or failure, and each trial has the same probability of a success p . The shape of the binomial distribution needs to be similar to the shape of the normal distribution. To ensure this, the quantities np′ and nq′ must both be greater than five ( np′ > 5 and nq′ > 5). In this case the binomial distribution of a sample (estimated) proportion can be approximated by the normal distribution with μ = np μ = np and σ = npq σ = npq . Remember that q = 1 – p q = 1 – p . There is no distribution that can correct for this small sample bias and thus if these conditions are not met we simply cannot test the hypothesis with the data available at that time. We met this condition when we first were estimating confidence intervals for p .

Again, we begin with the standardizing formula modified because this is the distribution of a binomial.

Substituting p 0 p 0 , the hypothesized value of p , we have:

This is the test statistic for testing hypothesized values of p , where the null and alternative hypotheses take one of the following forms:

The decision rule stated above applies here also: if the calculated value of Z c shows that the sample proportion is "too many" standard deviations from the hypothesized proportion, the null hypothesis cannot be accepted. The decision as to what is "too many" is pre-determined by the analyst depending on the level of significance required in the test.

Example 9.11

The mortgage department of a large bank is interested in the nature of loans of first-time borrowers. This information will be used to tailor their marketing strategy. They believe that 50% of first-time borrowers take out smaller loans than other borrowers. They perform a hypothesis test to determine if the percentage is the same or different from 50% . They sample 100 first-time borrowers and find 53 of these loans are smaller that the other borrowers. For the hypothesis test, they choose a 5% level of significance.

STEP 1 : Set the null and alternative hypothesis.

H 0 : p = 0.50   H a : p ≠ 0.50

The words "is the same or different from" tell you this is a two-tailed test. The Type I and Type II errors are as follows: The Type I error is to conclude that the proportion of borrowers is different from 50% when, in fact, the proportion is actually 50%. (Reject the null hypothesis when the null hypothesis is true). The Type II error is there is not enough evidence to conclude that the proportion of first time borrowers differs from 50% when, in fact, the proportion does differ from 50%. (You fail to reject the null hypothesis when the null hypothesis is false.)

STEP 2 : Decide the level of significance and draw the graph showing the critical value

The level of significance has been set by the problem at the 5% level. Because this is two-tailed test one-half of the alpha value will be in the upper tail and one-half in the lower tail as shown on the graph. The critical value for the normal distribution at the 95% level of confidence is 1.96. This can easily be found on the student’s t-table at the very bottom at infinite degrees of freedom remembering that at infinity the t-distribution is the normal distribution. Of course the value can also be found on the normal table but you have go looking for one-half of 95 (0.475) inside the body of the table and then read out to the sides and top for the number of standard deviations.

STEP 3 : Calculate the sample parameters and critical value of the test statistic.

The test statistic is a normal distribution, Z, for testing proportions and is:

For this case, the sample of 100 found 53 of these loans were smaller than those of other borrowers. The sample proportion, p′ = 53/100= 0.53 The test question, therefore, is : “Is 0.53 significantly different from .50?” Putting these values into the formula for the test statistic we find that 0.53 is only 0.60 standard deviations away from .50. This is barely off of the mean of the standard normal distribution of zero. There is virtually no difference from the sample proportion and the hypothesized proportion in terms of standard deviations.

STEP 4 : Compare the test statistic and the critical value.

The calculated value is well within the critical values of ± 1.96 standard deviations and thus we cannot reject the null hypothesis. To reject the null hypothesis we need significant evident of difference between the hypothesized value and the sample value. In this case the sample value is very nearly the same as the hypothesized value measured in terms of standard deviations.

STEP 5 : Reach a conclusion

The formal conclusion would be “At a 5% level of significance we cannot reject the null hypothesis that 50% of first-time borrowers take out smaller loans than other borrowers.” Notice the length to which the conclusion goes to include all of the conditions that are attached to the conclusion. Statisticians, for all the criticism they receive, are careful to be very specific even when this seems trivial. Statisticians cannot say more than they know, and the data constrain the conclusion to be within the metes and bounds of the data.

Try It 9.11

A teacher believes that 85% of students in the class will want to go on a field trip to the local zoo. The teacher performs a hypothesis test to determine if the percentage is the same or different from 85%. The teacher samples 50 students and 39 reply that they would want to go to the zoo. For the hypothesis test, use a 1% level of significance.

Example 9.12

Suppose a consumer group suspects that the proportion of households that have three or more cell phones is 30%. A cell phone company has reason to believe that the proportion is not 30%. Before they start a big advertising campaign, they conduct a hypothesis test. Their marketing people survey 150 households with the result that 43 of the households have three or more cell phones.

Here is an abbreviate version of the system to solve hypothesis tests applied to a test on a proportions.

Try It 9.12

Marketers believe that 92% of adults in the United States own a cell phone. A cell phone manufacturer believes that number is actually lower. 200 American adults are surveyed, of which, 174 report having cell phones. Use a 5% level of significance. State the null and alternative hypothesis, find the p -value, state your conclusion, and identify the Type I and Type II errors.

Example 9.13

The National Institute of Standards and Technology provides exact data on conductivity properties of materials. Following are conductivity measurements for 11 randomly selected pieces of a particular type of glass.

1.11; 1.07; 1.11; 1.07; 1.12; 1.08; .98; .98; 1.02; .95; .95 Is there convincing evidence that the average conductivity of this type of glass is greater than one? Use a significance level of 0.05.

Let’s follow a four-step process to answer this statistical question.

  • H 0 : μ ≤ 1
  • H a : μ > 1
  • Plan : We are testing a sample mean without a known population standard deviation with less than 30 observations. Therefore, we need to use a Student's-t distribution. Assume the underlying population is normal.
  • Do the calculations and draw the graph .
  • State the Conclusions : We cannot accept the null hypothesis. It is reasonable to state that the data supports the claim that the average conductivity level is greater than one.

Try It 9.13

The boiling point of a specific liquid is measured for 15 samples, and the boiling points are obtained as follows:

205; 206; 206; 202; 199; 194; 197; 198; 198; 201; 201; 202; 207; 211; 205

Is there convincing evidence that the average boiling point is greater than 200? Use a significance level of 0.1. Assume the population is normal.

Example 9.14

In a study of 420,019 cell phone users, 172 of the subjects developed brain cancer. Test the claim that cell phone users developed brain cancer at a greater rate than that for non-cell phone users (the rate of brain cancer for non-cell phone users is 0.0340%). Since this is a critical issue, use a 0.005 significance level. Explain why the significance level should be so low in terms of a Type I error.

  • H 0 : p ≤ 0.00034
  • H a : p > 0.00034

If we commit a Type I error, we are essentially accepting a false claim. Since the claim describes cancer-causing environments, we want to minimize the chances of incorrectly identifying causes of cancer.

  • We will be testing a sample proportion with x = 172 and n = 420,019. The sample is sufficiently large because we have np' = 420,019(0.00034) = 142.8, nq' = 420,019(0.99966) = 419,876.2, two independent outcomes, and a fixed probability of success p' = 0.00034. Thus we will be able to generalize our results to the population.

Try It 9.14

In a study of 390,000 moisturizer users, 138 of the subjects developed skin diseases. Test the claim that moisturizer users developed skin diseases at a greater rate than that for non-moisturizer users (the rate of skin diseases for non-moisturizer users is 0.041%). Since this is a critical issue, use a 0.005 significance level. Explain why the significance level should be so low in terms of a Type I error.

This book may not be used in the training of large language models or otherwise be ingested into large language models or generative AI offerings without OpenStax's permission.

Want to cite, share, or modify this book? This book uses the Creative Commons Attribution License and you must attribute OpenStax.

Access for free at https://openstax.org/books/introductory-business-statistics-2e/pages/1-introduction
  • Authors: Alexander Holmes, Barbara Illowsky, Susan Dean
  • Publisher/website: OpenStax
  • Book title: Introductory Business Statistics 2e
  • Publication date: Dec 13, 2023
  • Location: Houston, Texas
  • Book URL: https://openstax.org/books/introductory-business-statistics-2e/pages/1-introduction
  • Section URL: https://openstax.org/books/introductory-business-statistics-2e/pages/9-4-full-hypothesis-test-examples

© Dec 6, 2023 OpenStax. Textbook content produced by OpenStax is licensed under a Creative Commons Attribution License . The OpenStax name, OpenStax logo, OpenStax book covers, OpenStax CNX name, and OpenStax CNX logo are not subject to the Creative Commons license and may not be reproduced without the prior and express written consent of Rice University.

Library homepage

  • school Campus Bookshelves
  • menu_book Bookshelves
  • perm_media Learning Objects
  • login Login
  • how_to_reg Request Instructor Account
  • hub Instructor Commons

Margin Size

  • Download Page (PDF)
  • Download Full Book (PDF)
  • Periodic Table
  • Physics Constants
  • Scientific Calculator
  • Reference & Cite
  • Tools expand_more
  • Readability

selected template will load here

This action is not available.

Statistics LibreTexts

9.1: Introduction to Hypothesis Testing

  • Last updated
  • Save as PDF
  • Page ID 10211

  • Kyle Siegrist
  • University of Alabama in Huntsville via Random Services

\( \newcommand{\vecs}[1]{\overset { \scriptstyle \rightharpoonup} {\mathbf{#1}} } \)

\( \newcommand{\vecd}[1]{\overset{-\!-\!\rightharpoonup}{\vphantom{a}\smash {#1}}} \)

\( \newcommand{\id}{\mathrm{id}}\) \( \newcommand{\Span}{\mathrm{span}}\)

( \newcommand{\kernel}{\mathrm{null}\,}\) \( \newcommand{\range}{\mathrm{range}\,}\)

\( \newcommand{\RealPart}{\mathrm{Re}}\) \( \newcommand{\ImaginaryPart}{\mathrm{Im}}\)

\( \newcommand{\Argument}{\mathrm{Arg}}\) \( \newcommand{\norm}[1]{\| #1 \|}\)

\( \newcommand{\inner}[2]{\langle #1, #2 \rangle}\)

\( \newcommand{\Span}{\mathrm{span}}\)

\( \newcommand{\id}{\mathrm{id}}\)

\( \newcommand{\kernel}{\mathrm{null}\,}\)

\( \newcommand{\range}{\mathrm{range}\,}\)

\( \newcommand{\RealPart}{\mathrm{Re}}\)

\( \newcommand{\ImaginaryPart}{\mathrm{Im}}\)

\( \newcommand{\Argument}{\mathrm{Arg}}\)

\( \newcommand{\norm}[1]{\| #1 \|}\)

\( \newcommand{\Span}{\mathrm{span}}\) \( \newcommand{\AA}{\unicode[.8,0]{x212B}}\)

\( \newcommand{\vectorA}[1]{\vec{#1}}      % arrow\)

\( \newcommand{\vectorAt}[1]{\vec{\text{#1}}}      % arrow\)

\( \newcommand{\vectorB}[1]{\overset { \scriptstyle \rightharpoonup} {\mathbf{#1}} } \)

\( \newcommand{\vectorC}[1]{\textbf{#1}} \)

\( \newcommand{\vectorD}[1]{\overrightarrow{#1}} \)

\( \newcommand{\vectorDt}[1]{\overrightarrow{\text{#1}}} \)

\( \newcommand{\vectE}[1]{\overset{-\!-\!\rightharpoonup}{\vphantom{a}\smash{\mathbf {#1}}}} \)

Basic Theory

Preliminaries.

As usual, our starting point is a random experiment with an underlying sample space and a probability measure \(\P\). In the basic statistical model, we have an observable random variable \(\bs{X}\) taking values in a set \(S\). In general, \(\bs{X}\) can have quite a complicated structure. For example, if the experiment is to sample \(n\) objects from a population and record various measurements of interest, then \[ \bs{X} = (X_1, X_2, \ldots, X_n) \] where \(X_i\) is the vector of measurements for the \(i\)th object. The most important special case occurs when \((X_1, X_2, \ldots, X_n)\) are independent and identically distributed. In this case, we have a random sample of size \(n\) from the common distribution.

The purpose of this section is to define and discuss the basic concepts of statistical hypothesis testing . Collectively, these concepts are sometimes referred to as the Neyman-Pearson framework, in honor of Jerzy Neyman and Egon Pearson, who first formalized them.

A statistical hypothesis is a statement about the distribution of \(\bs{X}\). Equivalently, a statistical hypothesis specifies a set of possible distributions of \(\bs{X}\): the set of distributions for which the statement is true. A hypothesis that specifies a single distribution for \(\bs{X}\) is called simple ; a hypothesis that specifies more than one distribution for \(\bs{X}\) is called composite .

In hypothesis testing , the goal is to see if there is sufficient statistical evidence to reject a presumed null hypothesis in favor of a conjectured alternative hypothesis . The null hypothesis is usually denoted \(H_0\) while the alternative hypothesis is usually denoted \(H_1\).

An hypothesis test is a statistical decision ; the conclusion will either be to reject the null hypothesis in favor of the alternative, or to fail to reject the null hypothesis. The decision that we make must, of course, be based on the observed value \(\bs{x}\) of the data vector \(\bs{X}\). Thus, we will find an appropriate subset \(R\) of the sample space \(S\) and reject \(H_0\) if and only if \(\bs{x} \in R\). The set \(R\) is known as the rejection region or the critical region . Note the asymmetry between the null and alternative hypotheses. This asymmetry is due to the fact that we assume the null hypothesis, in a sense, and then see if there is sufficient evidence in \(\bs{x}\) to overturn this assumption in favor of the alternative.

An hypothesis test is a statistical analogy to proof by contradiction, in a sense. Suppose for a moment that \(H_1\) is a statement in a mathematical theory and that \(H_0\) is its negation. One way that we can prove \(H_1\) is to assume \(H_0\) and work our way logically to a contradiction. In an hypothesis test, we don't prove anything of course, but there are similarities. We assume \(H_0\) and then see if the data \(\bs{x}\) are sufficiently at odds with that assumption that we feel justified in rejecting \(H_0\) in favor of \(H_1\).

Often, the critical region is defined in terms of a statistic \(w(\bs{X})\), known as a test statistic , where \(w\) is a function from \(S\) into another set \(T\). We find an appropriate rejection region \(R_T \subseteq T\) and reject \(H_0\) when the observed value \(w(\bs{x}) \in R_T\). Thus, the rejection region in \(S\) is then \(R = w^{-1}(R_T) = \left\{\bs{x} \in S: w(\bs{x}) \in R_T\right\}\). As usual, the use of a statistic often allows significant data reduction when the dimension of the test statistic is much smaller than the dimension of the data vector.

The ultimate decision may be correct or may be in error. There are two types of errors, depending on which of the hypotheses is actually true.

Types of errors:

  • A type 1 error is rejecting the null hypothesis \(H_0\) when \(H_0\) is true.
  • A type 2 error is failing to reject the null hypothesis \(H_0\) when the alternative hypothesis \(H_1\) is true.

Similarly, there are two ways to make a correct decision: we could reject \(H_0\) when \(H_1\) is true or we could fail to reject \(H_0\) when \(H_0\) is true. The possibilities are summarized in the following table:

Of course, when we observe \(\bs{X} = \bs{x}\) and make our decision, either we will have made the correct decision or we will have committed an error, and usually we will never know which of these events has occurred. Prior to gathering the data, however, we can consider the probabilities of the various errors.

If \(H_0\) is true (that is, the distribution of \(\bs{X}\) is specified by \(H_0\)), then \(\P(\bs{X} \in R)\) is the probability of a type 1 error for this distribution. If \(H_0\) is composite, then \(H_0\) specifies a variety of different distributions for \(\bs{X}\) and thus there is a set of type 1 error probabilities.

The maximum probability of a type 1 error, over the set of distributions specified by \( H_0 \), is the significance level of the test or the size of the critical region.

The significance level is often denoted by \(\alpha\). Usually, the rejection region is constructed so that the significance level is a prescribed, small value (typically 0.1, 0.05, 0.01).

If \(H_1\) is true (that is, the distribution of \(\bs{X}\) is specified by \(H_1\)), then \(\P(\bs{X} \notin R)\) is the probability of a type 2 error for this distribution. Again, if \(H_1\) is composite then \(H_1\) specifies a variety of different distributions for \(\bs{X}\), and thus there will be a set of type 2 error probabilities. Generally, there is a tradeoff between the type 1 and type 2 error probabilities. If we reduce the probability of a type 1 error, by making the rejection region \(R\) smaller, we necessarily increase the probability of a type 2 error because the complementary region \(S \setminus R\) is larger.

The extreme cases can give us some insight. First consider the decision rule in which we never reject \(H_0\), regardless of the evidence \(\bs{x}\). This corresponds to the rejection region \(R = \emptyset\). A type 1 error is impossible, so the significance level is 0. On the other hand, the probability of a type 2 error is 1 for any distribution defined by \(H_1\). At the other extreme, consider the decision rule in which we always rejects \(H_0\) regardless of the evidence \(\bs{x}\). This corresponds to the rejection region \(R = S\). A type 2 error is impossible, but now the probability of a type 1 error is 1 for any distribution defined by \(H_0\). In between these two worthless tests are meaningful tests that take the evidence \(\bs{x}\) into account.

If \(H_1\) is true, so that the distribution of \(\bs{X}\) is specified by \(H_1\), then \(\P(\bs{X} \in R)\), the probability of rejecting \(H_0\) is the power of the test for that distribution.

Thus the power of the test for a distribution specified by \( H_1 \) is the probability of making the correct decision.

Suppose that we have two tests, corresponding to rejection regions \(R_1\) and \(R_2\), respectively, each having significance level \(\alpha\). The test with region \(R_1\) is uniformly more powerful than the test with region \(R_2\) if \[ \P(\bs{X} \in R_1) \ge \P(\bs{X} \in R_2) \text{ for every distribution of } \bs{X} \text{ specified by } H_1 \]

Naturally, in this case, we would prefer the first test. Often, however, two tests will not be uniformly ordered; one test will be more powerful for some distributions specified by \(H_1\) while the other test will be more powerful for other distributions specified by \(H_1\).

If a test has significance level \(\alpha\) and is uniformly more powerful than any other test with significance level \(\alpha\), then the test is said to be a uniformly most powerful test at level \(\alpha\).

Clearly a uniformly most powerful test is the best we can do.

\(P\)-value

In most cases, we have a general procedure that allows us to construct a test (that is, a rejection region \(R_\alpha\)) for any given significance level \(\alpha \in (0, 1)\). Typically, \(R_\alpha\) decreases (in the subset sense) as \(\alpha\) decreases.

The \(P\)-value of the observed value \(\bs{x}\) of \(\bs{X}\), denoted \(P(\bs{x})\), is defined to be the smallest \(\alpha\) for which \(\bs{x} \in R_\alpha\); that is, the smallest significance level for which \(H_0\) is rejected, given \(\bs{X} = \bs{x}\).

Knowing \(P(\bs{x})\) allows us to test \(H_0\) at any significance level for the given data \(\bs{x}\): If \(P(\bs{x}) \le \alpha\) then we would reject \(H_0\) at significance level \(\alpha\); if \(P(\bs{x}) \gt \alpha\) then we fail to reject \(H_0\) at significance level \(\alpha\). Note that \(P(\bs{X})\) is a statistic . Informally, \(P(\bs{x})\) can often be thought of as the probability of an outcome as or more extreme than the observed value \(\bs{x}\), where extreme is interpreted relative to the null hypothesis \(H_0\).

Analogy with Justice Systems

There is a helpful analogy between statistical hypothesis testing and the criminal justice system in the US and various other countries. Consider a person charged with a crime. The presumed null hypothesis is that the person is innocent of the crime; the conjectured alternative hypothesis is that the person is guilty of the crime. The test of the hypotheses is a trial with evidence presented by both sides playing the role of the data. After considering the evidence, the jury delivers the decision as either not guilty or guilty . Note that innocent is not a possible verdict of the jury, because it is not the point of the trial to prove the person innocent. Rather, the point of the trial is to see whether there is sufficient evidence to overturn the null hypothesis that the person is innocent in favor of the alternative hypothesis of that the person is guilty. A type 1 error is convicting a person who is innocent; a type 2 error is acquitting a person who is guilty. Generally, a type 1 error is considered the more serious of the two possible errors, so in an attempt to hold the chance of a type 1 error to a very low level, the standard for conviction in serious criminal cases is beyond a reasonable doubt .

Tests of an Unknown Parameter

Hypothesis testing is a very general concept, but an important special class occurs when the distribution of the data variable \(\bs{X}\) depends on a parameter \(\theta\) taking values in a parameter space \(\Theta\). The parameter may be vector-valued, so that \(\bs{\theta} = (\theta_1, \theta_2, \ldots, \theta_n)\) and \(\Theta \subseteq \R^k\) for some \(k \in \N_+\). The hypotheses generally take the form \[ H_0: \theta \in \Theta_0 \text{ versus } H_1: \theta \notin \Theta_0 \] where \(\Theta_0\) is a prescribed subset of the parameter space \(\Theta\). In this setting, the probabilities of making an error or a correct decision depend on the true value of \(\theta\). If \(R\) is the rejection region, then the power function \( Q \) is given by \[ Q(\theta) = \P_\theta(\bs{X} \in R), \quad \theta \in \Theta \] The power function gives a lot of information about the test.

The power function satisfies the following properties:

  • \(Q(\theta)\) is the probability of a type 1 error when \(\theta \in \Theta_0\).
  • \(\max\left\{Q(\theta): \theta \in \Theta_0\right\}\) is the significance level of the test.
  • \(1 - Q(\theta)\) is the probability of a type 2 error when \(\theta \notin \Theta_0\).
  • \(Q(\theta)\) is the power of the test when \(\theta \notin \Theta_0\).

If we have two tests, we can compare them by means of their power functions.

Suppose that we have two tests, corresponding to rejection regions \(R_1\) and \(R_2\), respectively, each having significance level \(\alpha\). The test with rejection region \(R_1\) is uniformly more powerful than the test with rejection region \(R_2\) if \( Q_1(\theta) \ge Q_2(\theta)\) for all \( \theta \notin \Theta_0 \).

Most hypothesis tests of an unknown real parameter \(\theta\) fall into three special cases:

Suppose that \( \theta \) is a real parameter and \( \theta_0 \in \Theta \) a specified value. The tests below are respectively the two-sided test , the left-tailed test , and the right-tailed test .

  • \(H_0: \theta = \theta_0\) versus \(H_1: \theta \ne \theta_0\)
  • \(H_0: \theta \ge \theta_0\) versus \(H_1: \theta \lt \theta_0\)
  • \(H_0: \theta \le \theta_0\) versus \(H_1: \theta \gt \theta_0\)

Thus the tests are named after the conjectured alternative. Of course, there may be other unknown parameters besides \(\theta\) (known as nuisance parameters ).

Equivalence Between Hypothesis Test and Confidence Sets

There is an equivalence between hypothesis tests and confidence sets for a parameter \(\theta\).

Suppose that \(C(\bs{x})\) is a \(1 - \alpha\) level confidence set for \(\theta\). The following test has significance level \(\alpha\) for the hypothesis \( H_0: \theta = \theta_0 \) versus \( H_1: \theta \ne \theta_0 \): Reject \(H_0\) if and only if \(\theta_0 \notin C(\bs{x})\)

By definition, \(\P[\theta \in C(\bs{X})] = 1 - \alpha\). Hence if \(H_0\) is true so that \(\theta = \theta_0\), then the probability of a type 1 error is \(P[\theta \notin C(\bs{X})] = \alpha\).

Equivalently, we fail to reject \(H_0\) at significance level \(\alpha\) if and only if \(\theta_0\) is in the corresponding \(1 - \alpha\) level confidence set. In particular, this equivalence applies to interval estimates of a real parameter \(\theta\) and the common tests for \(\theta\) given above .

In each case below, the confidence interval has confidence level \(1 - \alpha\) and the test has significance level \(\alpha\).

  • Suppose that \(\left[L(\bs{X}, U(\bs{X})\right]\) is a two-sided confidence interval for \(\theta\). Reject \(H_0: \theta = \theta_0\) versus \(H_1: \theta \ne \theta_0\) if and only if \(\theta_0 \lt L(\bs{X})\) or \(\theta_0 \gt U(\bs{X})\).
  • Suppose that \(L(\bs{X})\) is a confidence lower bound for \(\theta\). Reject \(H_0: \theta \le \theta_0\) versus \(H_1: \theta \gt \theta_0\) if and only if \(\theta_0 \lt L(\bs{X})\).
  • Suppose that \(U(\bs{X})\) is a confidence upper bound for \(\theta\). Reject \(H_0: \theta \ge \theta_0\) versus \(H_1: \theta \lt \theta_0\) if and only if \(\theta_0 \gt U(\bs{X})\).

Pivot Variables and Test Statistics

Recall that confidence sets of an unknown parameter \(\theta\) are often constructed through a pivot variable , that is, a random variable \(W(\bs{X}, \theta)\) that depends on the data vector \(\bs{X}\) and the parameter \(\theta\), but whose distribution does not depend on \(\theta\) and is known. In this case, a natural test statistic for the basic tests given above is \(W(\bs{X}, \theta_0)\).

statistics hypothesis testing exam questions

User Preferences

Content preview.

Arcu felis bibendum ut tristique et egestas quis:

  • Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris
  • Duis aute irure dolor in reprehenderit in voluptate
  • Excepteur sint occaecat cupidatat non proident

Keyboard Shortcuts

S.3 hypothesis testing.

In reviewing hypothesis tests, we start first with the general idea. Then, we keep returning to the basic procedures of hypothesis testing, each time adding a little more detail.

The general idea of hypothesis testing involves:

  • Making an initial assumption.
  • Collecting evidence (data).
  • Based on the available evidence (data), deciding whether to reject or not reject the initial assumption.

Every hypothesis test — regardless of the population parameter involved — requires the above three steps.

Example S.3.1

Is normal body temperature really 98.6 degrees f section  .

Consider the population of many, many adults. A researcher hypothesized that the average adult body temperature is lower than the often-advertised 98.6 degrees F. That is, the researcher wants an answer to the question: "Is the average adult body temperature 98.6 degrees? Or is it lower?" To answer his research question, the researcher starts by assuming that the average adult body temperature was 98.6 degrees F.

Then, the researcher went out and tried to find evidence that refutes his initial assumption. In doing so, he selects a random sample of 130 adults. The average body temperature of the 130 sampled adults is 98.25 degrees.

Then, the researcher uses the data he collected to make a decision about his initial assumption. It is either likely or unlikely that the researcher would collect the evidence he did given his initial assumption that the average adult body temperature is 98.6 degrees:

  • If it is likely , then the researcher does not reject his initial assumption that the average adult body temperature is 98.6 degrees. There is not enough evidence to do otherwise.
  • either the researcher's initial assumption is correct and he experienced a very unusual event;
  • or the researcher's initial assumption is incorrect.

In statistics, we generally don't make claims that require us to believe that a very unusual event happened. That is, in the practice of statistics, if the evidence (data) we collected is unlikely in light of the initial assumption, then we reject our initial assumption.

Example S.3.2

Criminal trial analogy section  .

One place where you can consistently see the general idea of hypothesis testing in action is in criminal trials held in the United States. Our criminal justice system assumes "the defendant is innocent until proven guilty." That is, our initial assumption is that the defendant is innocent.

In the practice of statistics, we make our initial assumption when we state our two competing hypotheses -- the null hypothesis ( H 0 ) and the alternative hypothesis ( H A ). Here, our hypotheses are:

  • H 0 : Defendant is not guilty (innocent)
  • H A : Defendant is guilty

In statistics, we always assume the null hypothesis is true . That is, the null hypothesis is always our initial assumption.

The prosecution team then collects evidence — such as finger prints, blood spots, hair samples, carpet fibers, shoe prints, ransom notes, and handwriting samples — with the hopes of finding "sufficient evidence" to make the assumption of innocence refutable.

In statistics, the data are the evidence.

The jury then makes a decision based on the available evidence:

  • If the jury finds sufficient evidence — beyond a reasonable doubt — to make the assumption of innocence refutable, the jury rejects the null hypothesis and deems the defendant guilty. We behave as if the defendant is guilty.
  • If there is insufficient evidence, then the jury does not reject the null hypothesis . We behave as if the defendant is innocent.

In statistics, we always make one of two decisions. We either "reject the null hypothesis" or we "fail to reject the null hypothesis."

Errors in Hypothesis Testing Section  

Did you notice the use of the phrase "behave as if" in the previous discussion? We "behave as if" the defendant is guilty; we do not "prove" that the defendant is guilty. And, we "behave as if" the defendant is innocent; we do not "prove" that the defendant is innocent.

This is a very important distinction! We make our decision based on evidence not on 100% guaranteed proof. Again:

  • If we reject the null hypothesis, we do not prove that the alternative hypothesis is true.
  • If we do not reject the null hypothesis, we do not prove that the null hypothesis is true.

We merely state that there is enough evidence to behave one way or the other. This is always true in statistics! Because of this, whatever the decision, there is always a chance that we made an error .

Let's review the two types of errors that can be made in criminal trials:

Table S.3.2 shows how this corresponds to the two types of errors in hypothesis testing.

Note that, in statistics, we call the two types of errors by two different  names -- one is called a "Type I error," and the other is called  a "Type II error." Here are the formal definitions of the two types of errors:

There is always a chance of making one of these errors. But, a good scientific study will minimize the chance of doing so!

Making the Decision Section  

Recall that it is either likely or unlikely that we would observe the evidence we did given our initial assumption. If it is likely , we do not reject the null hypothesis. If it is unlikely , then we reject the null hypothesis in favor of the alternative hypothesis. Effectively, then, making the decision reduces to determining "likely" or "unlikely."

In statistics, there are two ways to determine whether the evidence is likely or unlikely given the initial assumption:

  • We could take the " critical value approach " (favored in many of the older textbooks).
  • Or, we could take the " P -value approach " (what is used most often in research, journal articles, and statistical software).

In the next two sections, we review the procedures behind each of these two approaches. To make our review concrete, let's imagine that μ is the average grade point average of all American students who major in mathematics. We first review the critical value approach for conducting each of the following three hypothesis tests about the population mean $\mu$:

In Practice

  • We would want to conduct the first hypothesis test if we were interested in concluding that the average grade point average of the group is more than 3.
  • We would want to conduct the second hypothesis test if we were interested in concluding that the average grade point average of the group is less than 3.
  • And, we would want to conduct the third hypothesis test if we were only interested in concluding that the average grade point average of the group differs from 3 (without caring whether it is more or less than 3).

Upon completing the review of the critical value approach, we review the P -value approach for conducting each of the above three hypothesis tests about the population mean \(\mu\). The procedures that we review here for both approaches easily extend to hypothesis tests about any other population parameter.

The Genius Blog

Hypothesis Testing Solved Examples(Questions and Solutions)

Here is a list hypothesis testing exercises and solutions. Try to solve a question by yourself first before you look at the solution.

Question 1 In the population, the average IQ is 100 with a standard deviation of 15. A team of scientists want to test a new medication to see if it has either a positive or negative effect on intelligence, or not effect at all. A sample of 30 participants who have taken the medication  has a mean of 140. Did the medication affect intelligence? View Solution to Question 1

A professor wants to know if her introductory statistics class has a good grasp of basic math. Six students are chosen at random from the class and given a math proficiency test. The professor wants the class to be able to score above 70 on the test. The six students get the following scores:62, 92, 75, 68, 83, 95. Can the professor have 90% confidence that the mean score for the class on the test would be above 70. Solution to Question 2

Question 3 In a packaging plant, a machine packs cartons with jars. It is supposed that a new machine would pack faster on the average than the machine currently used. To test the hypothesis, the time it takes each machine to pack ten cartons are recorded. The result in seconds is as follows.

Do the data provide sufficient evidence to conclude that, on the average, the new machine packs faster? Perform  the required hypothesis test at the 5% level of significance. Solution to Question 3 

Question 4 We want to compare the heights in inches of two groups of individuals. Here are the measurements: X: 175, 168, 168, 190, 156, 181, 182, 175, 174, 179 Y:  120, 180, 125, 188, 130, 190, 110, 185, 112, 188 Solution to Question 4 

Question 5 A clinic provides a program to help their clients lose weight and asks a consumer agency to investigate the effectiveness of the program. The agency takes a sample of 15 people, weighing each person in the sample before the program begins and 3 months later. The results a tabulated below

Determine is the program is effective. Solution to Question 5

Question 6 A sample of 20 students were selected and given a diagnostic module prior to studying for a test. And then they were given the test again after completing the module. . The result of the students scores in the test before and after the test is tabulated below.

We want to see if there is significant improvement in the student’s performance due to this teaching method Solution to Question 6 

Question 7 A study was performed to test wether cars get better mileage on premium gas than on regular gas. Each of 10 cars was first filled with regular or premium gas, decided by a coin toss, and the mileage for the tank was recorded. The mileage was recorded again for the same cars using other kind of gasoline. Determine wether cars get significantly better mileage with premium gas.

Mileage with regular gas: 16,20,21,22,23,22,27,25,27,28 Mileage with premium gas: 19, 22,24,24,25,25,26,26,28,32 Solution to Question 7 

Question 8  An automatic cutter machine must cut steel strips of 1200 mm length. From a preliminary data, we checked that the lengths of the pieces produced by the machine can be considered as normal random variables  with a 3mm standard deviation. We want to make sure that the machine is set correctly. Therefore 16 pieces of the products are randomly selected and weight. The figures were in mm: 1193,1196,1198,1195,1198,1199,1204,1193,1203,1201,1196,1200,1191,1196,1198,1191 Examine wether there is any significant deviation from the required size Solution to Question 8

Question 9 Blood pressure reading of ten patients before and after medication for reducing the blood pressure are as follows

Patient: 1,2,3,4,5,6,7,8,9,10 Before treatment: 86,84,78,90,92,77,89,90,90,86 After treatment:    80,80,92,79,92,82,88,89,92,83

Test the null hypothesis of no effect agains the alternate hypothesis that medication is effective. Execute it with Wilcoxon test Solution to Question 9

Question on ANOVA Sussan Sound predicts that students will learn most effectively with a constant background sound, as opposed to an unpredictable sound or no sound at all. She randomly divides 24 students into three groups of 8 each. All students study a passage of text for 30 minutes. Those in group 1 study with background sound at a constant volume in the background. Those in group 2 study with nose that changes volume periodically. Those in group 3 study with no sound at all. After studying, all students take a 10 point multiple choice test over the material. Their scores are tabulated below.

Group1: Constant sound: 7,4,6,8,6,6,2,9 Group 2: Random sound: 5,5,3,4,4,7,2,2 Group 3: No sound at all: 2,4,7,1,2,1,5,5 Solution to Question 10

Question 11 Using the following three groups of data, perform a one-way analysis of variance using α  = 0.05.

Solution to Question 11

Question 12 In a packaging plant, a machine packs cartons with jars. It is supposed that a new machine would pack faster on the average than the machine currently used. To test the hypothesis, the time it takes each machine to pack ten cartons are recorded. The result in seconds is as follows.

New Machine: 42,41,41.3,41.8,42.4,42.8,43.2,42.3,41.8,42.7 Old Machine:  42.7,43.6,43.8,43.3,42.5,43.5,43.1,41.7,44,44.1

Perform an F-test to determine if the null hypothesis should be accepted. Solution to Question 12

Question 13 A random sample 500 U.S adults are questioned about their political affiliation and opinion on a tax reform bill. We need to test if the political affiliation and their opinon on a tax reform bill are dependent, at 5% level of significance. The observed contingency table is given below.

Solution to Question 13

Question 14 Can a dice be considered regular which is showing the following frequency distribution during 1000 throws?

Solution to Question 14

Solution to Question 15

Question 16 A newly developed muesli contains five types of seeds (A, B, C, D and E). The percentage of which is 35%, 25%, 20%, 10% and 10% according to the product information. In a randomly selected muesli, the following volume distribution was found.

Lets us decide about the null hypothesis whether the composition of the sample corresponds to the distribution indicated on the packaging at alpha = 0.1 significance level. Solution to Question 16

Question 17 A research team investigated whether there was any significant correlation between the severity of a certain disease runoff and the age of the patients. During the study, data for n = 200 patients were collected and grouped according to the severity of the disease and the age of the patient. The table below shows the result

Let us decided about the correlation between the age of the patients and the severity of disease progression. Solution to Question 17

Question 18 A publisher is interested in determine which of three book cover is most attractive. He interviews 400 people in each of the three states (California, Illinois and New York), and asks each person which of the  cover he or she prefers. The number of preference for each cover is as follows:

Do these data indicate that there are regional differences in people’s preferences concerning these covers? Use the 0.05 level of significance. Solution to Question 18

Question 19 Trees planted along the road were checked for which ones are healthy(H) or diseased (D) and the following arrangement of the trees were obtained:

H H H H D D D H H H H H H H D D H H D D D

Test at the    = 0.05 significance wether this arrangement may be regarded as random

Solution to Question 19 

Question 20 Suppose we flip a coin n = 15 times and come up with the following arrangements

H T T T H H T T T T H H T H H

(H = head, T = tail)

Test at the alpha = 0.05 significance level whether this arrangement may be regarded as random.

Solution to Question 20

kindsonthegenius

You might also like, one-way anova(analysis of variance) problem – question 11, hypothesis testing problem – quesion 6 (20 students given diagnostic module…), chi-square test for independence – question 18 (a publisher is interested…).

I am really impressed with your writing abilities as well as with the structure to your weblog. Is this a paid subject matter or did you modify it yourself?

Either way stay up the excellent high quality writing, it’s uncommon to look a great blog like this one these days..

Below are given the gain in weights (in lbs.) of pigs fed on two diet A and B Dieta 25 32 30 34 24 14 32 24 30 31 35 25 – – DietB 44 34 22 10 47 31 40 30 32 35 18 21 35 29

If you're seeing this message, it means we're having trouble loading external resources on our website.

If you're behind a web filter, please make sure that the domains *.kastatic.org and *.kasandbox.org are unblocked.

To log in and use all the features of Khan Academy, please enable JavaScript in your browser.

Statistics and probability

Course: statistics and probability   >   unit 12, hypothesis testing and p-values.

  • One-tailed and two-tailed tests
  • Z-statistics vs. T-statistics
  • Small sample hypothesis test
  • Large sample proportion hypothesis testing

statistics hypothesis testing exam questions

Want to join the conversation?

  • Upvote Button navigates to signup page
  • Downvote Button navigates to signup page
  • Flag Button navigates to signup page

Good Answer

Video transcript

HIGH SCHOOL

  • ACT Tutoring
  • SAT Tutoring
  • PSAT Tutoring
  • ASPIRE Tutoring
  • SHSAT Tutoring
  • STAAR Tutoring

GRADUATE SCHOOL

  • MCAT Tutoring
  • GRE Tutoring
  • LSAT Tutoring
  • GMAT Tutoring
  • AIMS Tutoring
  • HSPT Tutoring
  • ISAT Tutoring
  • SSAT Tutoring

Search 50+ Tests

Loading Page

math tutoring

  • Elementary Math
  • Pre-Calculus
  • Trigonometry

science tutoring

Foreign languages.

  • Mandarin Chinese

elementary tutoring

  • Computer Science

Search 350+ Subjects

  • Video Overview
  • Tutor Selection Process
  • Online Tutoring
  • Mobile Tutoring
  • Instant Tutoring
  • How We Operate
  • Our Guarantee
  • Impact of Tutoring
  • Reviews & Testimonials
  • Media Coverage
  • About Varsity Tutors

FREE Statistics Practice Tests

All statistics resources, free statistics practice tests, practice tests by concept, analysis of variance practice test, standard deviation practice test, basic inferential statistics practice test, populations and samples practice test, correlation practice test, coefficients practice test, descriptive statistics practice test, normal distributions practice test, skewed distributions practice test, error practice test, type i practice test, type ii practice test, probability practice test, combinations and permutations practice test, compound events practice test, simple events practice test, random variables practice test, continuous practice test, discrete practice test, practice quizzes, statistics problem set 1.

Display vt optimized

Browse Course Material

Course info, instructors.

  • Dr. Jeremy Orloff
  • Dr. Jennifer French Kamrin

Departments

  • Mathematics

As Taught In

  • Discrete Mathematics
  • Probability and Statistics

Learning Resource Types

Introduction to probability and statistics, exams with solutions.

facebook

You are leaving MIT OpenCourseWare

Tutorial Playlist

Statistics tutorial, everything you need to know about the probability density function in statistics, the best guide to understand central limit theorem, an in-depth guide to measures of central tendency : mean, median and mode, the ultimate guide to understand conditional probability.

A Comprehensive Look at Percentile in Statistics

The Best Guide to Understand Bayes Theorem

Everything you need to know about the normal distribution, an in-depth explanation of cumulative distribution function, a complete guide to chi-square test, a complete guide on hypothesis testing in statistics, understanding the fundamentals of arithmetic and geometric progression, the definitive guide to understand spearman’s rank correlation, a comprehensive guide to understand mean squared error, all you need to know about the empirical rule in statistics, the complete guide to skewness and kurtosis, a holistic look at bernoulli distribution.

All You Need to Know About Bias in Statistics

A Complete Guide to Get a Grasp of Time Series Analysis

The Key Differences Between Z-Test Vs. T-Test

The Complete Guide to Understand Pearson's Correlation

A complete guide on the types of statistical studies, everything you need to know about poisson distribution, your best guide to understand correlation vs. regression, the most comprehensive guide for beginners on what is correlation, what is hypothesis testing in statistics types and examples.

Lesson 10 of 24 By Avijeet Biswal

A Complete Guide on Hypothesis Testing in Statistics

Table of Contents

In today’s data-driven world , decisions are based on data all the time. Hypothesis plays a crucial role in that process, whether it may be making business decisions, in the health sector, academia, or in quality improvement. Without hypothesis & hypothesis tests, you risk drawing the wrong conclusions and making bad decisions. In this tutorial, you will look at Hypothesis Testing in Statistics.

What Is Hypothesis Testing in Statistics?

Hypothesis Testing is a type of statistical analysis in which you put your assumptions about a population parameter to the test. It is used to estimate the relationship between 2 statistical variables.

Let's discuss few examples of statistical hypothesis from real-life - 

  • A teacher assumes that 60% of his college's students come from lower-middle-class families.
  • A doctor believes that 3D (Diet, Dose, and Discipline) is 90% effective for diabetic patients.

Now that you know about hypothesis testing, look at the two types of hypothesis testing in statistics.

Hypothesis Testing Formula

Z = ( x̅ – μ0 ) / (σ /√n)

  • Here, x̅ is the sample mean,
  • μ0 is the population mean,
  • σ is the standard deviation,
  • n is the sample size.

How Hypothesis Testing Works?

An analyst performs hypothesis testing on a statistical sample to present evidence of the plausibility of the null hypothesis. Measurements and analyses are conducted on a random sample of the population to test a theory. Analysts use a random population sample to test two hypotheses: the null and alternative hypotheses.

The null hypothesis is typically an equality hypothesis between population parameters; for example, a null hypothesis may claim that the population means return equals zero. The alternate hypothesis is essentially the inverse of the null hypothesis (e.g., the population means the return is not equal to zero). As a result, they are mutually exclusive, and only one can be correct. One of the two possibilities, however, will always be correct.

Your Dream Career is Just Around The Corner!

Your Dream Career is Just Around The Corner!

Null Hypothesis and Alternate Hypothesis

The Null Hypothesis is the assumption that the event will not occur. A null hypothesis has no bearing on the study's outcome unless it is rejected.

H0 is the symbol for it, and it is pronounced H-naught.

The Alternate Hypothesis is the logical opposite of the null hypothesis. The acceptance of the alternative hypothesis follows the rejection of the null hypothesis. H1 is the symbol for it.

Let's understand this with an example.

A sanitizer manufacturer claims that its product kills 95 percent of germs on average. 

To put this company's claim to the test, create a null and alternate hypothesis.

H0 (Null Hypothesis): Average = 95%.

Alternative Hypothesis (H1): The average is less than 95%.

Another straightforward example to understand this concept is determining whether or not a coin is fair and balanced. The null hypothesis states that the probability of a show of heads is equal to the likelihood of a show of tails. In contrast, the alternate theory states that the probability of a show of heads and tails would be very different.

Become a Data Scientist with Hands-on Training!

Become a Data Scientist with Hands-on Training!

Hypothesis Testing Calculation With Examples

Let's consider a hypothesis test for the average height of women in the United States. Suppose our null hypothesis is that the average height is 5'4". We gather a sample of 100 women and determine that their average height is 5'5". The standard deviation of population is 2.

To calculate the z-score, we would use the following formula:

z = ( x̅ – μ0 ) / (σ /√n)

z = (5'5" - 5'4") / (2" / √100)

z = 0.5 / (0.045)

 We will reject the null hypothesis as the z-score of 11.11 is very large and conclude that there is evidence to suggest that the average height of women in the US is greater than 5'4".

Steps of Hypothesis Testing

Step 1: specify your null and alternate hypotheses.

It is critical to rephrase your original research hypothesis (the prediction that you wish to study) as a null (Ho) and alternative (Ha) hypothesis so that you can test it quantitatively. Your first hypothesis, which predicts a link between variables, is generally your alternate hypothesis. The null hypothesis predicts no link between the variables of interest.

Step 2: Gather Data

For a statistical test to be legitimate, sampling and data collection must be done in a way that is meant to test your hypothesis. You cannot draw statistical conclusions about the population you are interested in if your data is not representative.

Step 3: Conduct a Statistical Test

Other statistical tests are available, but they all compare within-group variance (how to spread out the data inside a category) against between-group variance (how different the categories are from one another). If the between-group variation is big enough that there is little or no overlap between groups, your statistical test will display a low p-value to represent this. This suggests that the disparities between these groups are unlikely to have occurred by accident. Alternatively, if there is a large within-group variance and a low between-group variance, your statistical test will show a high p-value. Any difference you find across groups is most likely attributable to chance. The variety of variables and the level of measurement of your obtained data will influence your statistical test selection.

Step 4: Determine Rejection Of Your Null Hypothesis

Your statistical test results must determine whether your null hypothesis should be rejected or not. In most circumstances, you will base your judgment on the p-value provided by the statistical test. In most circumstances, your preset level of significance for rejecting the null hypothesis will be 0.05 - that is, when there is less than a 5% likelihood that these data would be seen if the null hypothesis were true. In other circumstances, researchers use a lower level of significance, such as 0.01 (1%). This reduces the possibility of wrongly rejecting the null hypothesis.

Step 5: Present Your Results 

The findings of hypothesis testing will be discussed in the results and discussion portions of your research paper, dissertation, or thesis. You should include a concise overview of the data and a summary of the findings of your statistical test in the results section. You can talk about whether your results confirmed your initial hypothesis or not in the conversation. Rejecting or failing to reject the null hypothesis is a formal term used in hypothesis testing. This is likely a must for your statistics assignments.

Types of Hypothesis Testing

To determine whether a discovery or relationship is statistically significant, hypothesis testing uses a z-test. It usually checks to see if two means are the same (the null hypothesis). Only when the population standard deviation is known and the sample size is 30 data points or more, can a z-test be applied.

A statistical test called a t-test is employed to compare the means of two groups. To determine whether two groups differ or if a procedure or treatment affects the population of interest, it is frequently used in hypothesis testing.

Chi-Square 

You utilize a Chi-square test for hypothesis testing concerning whether your data is as predicted. To determine if the expected and observed results are well-fitted, the Chi-square test analyzes the differences between categorical variables from a random sample. The test's fundamental premise is that the observed values in your data should be compared to the predicted values that would be present if the null hypothesis were true.

Hypothesis Testing and Confidence Intervals

Both confidence intervals and hypothesis tests are inferential techniques that depend on approximating the sample distribution. Data from a sample is used to estimate a population parameter using confidence intervals. Data from a sample is used in hypothesis testing to examine a given hypothesis. We must have a postulated parameter to conduct hypothesis testing.

Bootstrap distributions and randomization distributions are created using comparable simulation techniques. The observed sample statistic is the focal point of a bootstrap distribution, whereas the null hypothesis value is the focal point of a randomization distribution.

A variety of feasible population parameter estimates are included in confidence ranges. In this lesson, we created just two-tailed confidence intervals. There is a direct connection between these two-tail confidence intervals and these two-tail hypothesis tests. The results of a two-tailed hypothesis test and two-tailed confidence intervals typically provide the same results. In other words, a hypothesis test at the 0.05 level will virtually always fail to reject the null hypothesis if the 95% confidence interval contains the predicted value. A hypothesis test at the 0.05 level will nearly certainly reject the null hypothesis if the 95% confidence interval does not include the hypothesized parameter.

Simple and Composite Hypothesis Testing

Depending on the population distribution, you can classify the statistical hypothesis into two types.

Simple Hypothesis: A simple hypothesis specifies an exact value for the parameter.

Composite Hypothesis: A composite hypothesis specifies a range of values.

A company is claiming that their average sales for this quarter are 1000 units. This is an example of a simple hypothesis.

Suppose the company claims that the sales are in the range of 900 to 1000 units. Then this is a case of a composite hypothesis.

One-Tailed and Two-Tailed Hypothesis Testing

The One-Tailed test, also called a directional test, considers a critical region of data that would result in the null hypothesis being rejected if the test sample falls into it, inevitably meaning the acceptance of the alternate hypothesis.

In a one-tailed test, the critical distribution area is one-sided, meaning the test sample is either greater or lesser than a specific value.

In two tails, the test sample is checked to be greater or less than a range of values in a Two-Tailed test, implying that the critical distribution area is two-sided.

If the sample falls within this range, the alternate hypothesis will be accepted, and the null hypothesis will be rejected.

Become a Data Scientist With Real-World Experience

Become a Data Scientist With Real-World Experience

Right Tailed Hypothesis Testing

If the larger than (>) sign appears in your hypothesis statement, you are using a right-tailed test, also known as an upper test. Or, to put it another way, the disparity is to the right. For instance, you can contrast the battery life before and after a change in production. Your hypothesis statements can be the following if you want to know if the battery life is longer than the original (let's say 90 hours):

  • The null hypothesis is (H0 <= 90) or less change.
  • A possibility is that battery life has risen (H1) > 90.

The crucial point in this situation is that the alternate hypothesis (H1), not the null hypothesis, decides whether you get a right-tailed test.

Left Tailed Hypothesis Testing

Alternative hypotheses that assert the true value of a parameter is lower than the null hypothesis are tested with a left-tailed test; they are indicated by the asterisk "<".

Suppose H0: mean = 50 and H1: mean not equal to 50

According to the H1, the mean can be greater than or less than 50. This is an example of a Two-tailed test.

In a similar manner, if H0: mean >=50, then H1: mean <50

Here the mean is less than 50. It is called a One-tailed test.

Type 1 and Type 2 Error

A hypothesis test can result in two types of errors.

Type 1 Error: A Type-I error occurs when sample results reject the null hypothesis despite being true.

Type 2 Error: A Type-II error occurs when the null hypothesis is not rejected when it is false, unlike a Type-I error.

Suppose a teacher evaluates the examination paper to decide whether a student passes or fails.

H0: Student has passed

H1: Student has failed

Type I error will be the teacher failing the student [rejects H0] although the student scored the passing marks [H0 was true]. 

Type II error will be the case where the teacher passes the student [do not reject H0] although the student did not score the passing marks [H1 is true].

Level of Significance

The alpha value is a criterion for determining whether a test statistic is statistically significant. In a statistical test, Alpha represents an acceptable probability of a Type I error. Because alpha is a probability, it can be anywhere between 0 and 1. In practice, the most commonly used alpha values are 0.01, 0.05, and 0.1, which represent a 1%, 5%, and 10% chance of a Type I error, respectively (i.e. rejecting the null hypothesis when it is in fact correct).

Future-Proof Your AI/ML Career: Top Dos and Don'ts

Future-Proof Your AI/ML Career: Top Dos and Don'ts

A p-value is a metric that expresses the likelihood that an observed difference could have occurred by chance. As the p-value decreases the statistical significance of the observed difference increases. If the p-value is too low, you reject the null hypothesis.

Here you have taken an example in which you are trying to test whether the new advertising campaign has increased the product's sales. The p-value is the likelihood that the null hypothesis, which states that there is no change in the sales due to the new advertising campaign, is true. If the p-value is .30, then there is a 30% chance that there is no increase or decrease in the product's sales.  If the p-value is 0.03, then there is a 3% probability that there is no increase or decrease in the sales value due to the new advertising campaign. As you can see, the lower the p-value, the chances of the alternate hypothesis being true increases, which means that the new advertising campaign causes an increase or decrease in sales.

Why is Hypothesis Testing Important in Research Methodology?

Hypothesis testing is crucial in research methodology for several reasons:

  • Provides evidence-based conclusions: It allows researchers to make objective conclusions based on empirical data, providing evidence to support or refute their research hypotheses.
  • Supports decision-making: It helps make informed decisions, such as accepting or rejecting a new treatment, implementing policy changes, or adopting new practices.
  • Adds rigor and validity: It adds scientific rigor to research using statistical methods to analyze data, ensuring that conclusions are based on sound statistical evidence.
  • Contributes to the advancement of knowledge: By testing hypotheses, researchers contribute to the growth of knowledge in their respective fields by confirming existing theories or discovering new patterns and relationships.

Limitations of Hypothesis Testing

Hypothesis testing has some limitations that researchers should be aware of:

  • It cannot prove or establish the truth: Hypothesis testing provides evidence to support or reject a hypothesis, but it cannot confirm the absolute truth of the research question.
  • Results are sample-specific: Hypothesis testing is based on analyzing a sample from a population, and the conclusions drawn are specific to that particular sample.
  • Possible errors: During hypothesis testing, there is a chance of committing type I error (rejecting a true null hypothesis) or type II error (failing to reject a false null hypothesis).
  • Assumptions and requirements: Different tests have specific assumptions and requirements that must be met to accurately interpret results.

After reading this tutorial, you would have a much better understanding of hypothesis testing, one of the most important concepts in the field of Data Science . The majority of hypotheses are based on speculation about observed behavior, natural phenomena, or established theories.

If you are interested in statistics of data science and skills needed for such a career, you ought to explore Simplilearn’s Post Graduate Program in Data Science.

If you have any questions regarding this ‘Hypothesis Testing In Statistics’ tutorial, do share them in the comment section. Our subject matter expert will respond to your queries. Happy learning!

1. What is hypothesis testing in statistics with example?

Hypothesis testing is a statistical method used to determine if there is enough evidence in a sample data to draw conclusions about a population. It involves formulating two competing hypotheses, the null hypothesis (H0) and the alternative hypothesis (Ha), and then collecting data to assess the evidence. An example: testing if a new drug improves patient recovery (Ha) compared to the standard treatment (H0) based on collected patient data.

2. What is hypothesis testing and its types?

Hypothesis testing is a statistical method used to make inferences about a population based on sample data. It involves formulating two hypotheses: the null hypothesis (H0), which represents the default assumption, and the alternative hypothesis (Ha), which contradicts H0. The goal is to assess the evidence and determine whether there is enough statistical significance to reject the null hypothesis in favor of the alternative hypothesis.

Types of hypothesis testing:

  • One-sample test: Used to compare a sample to a known value or a hypothesized value.
  • Two-sample test: Compares two independent samples to assess if there is a significant difference between their means or distributions.
  • Paired-sample test: Compares two related samples, such as pre-test and post-test data, to evaluate changes within the same subjects over time or under different conditions.
  • Chi-square test: Used to analyze categorical data and determine if there is a significant association between variables.
  • ANOVA (Analysis of Variance): Compares means across multiple groups to check if there is a significant difference between them.

3. What are the steps of hypothesis testing?

The steps of hypothesis testing are as follows:

  • Formulate the hypotheses: State the null hypothesis (H0) and the alternative hypothesis (Ha) based on the research question.
  • Set the significance level: Determine the acceptable level of error (alpha) for making a decision.
  • Collect and analyze data: Gather and process the sample data.
  • Compute test statistic: Calculate the appropriate statistical test to assess the evidence.
  • Make a decision: Compare the test statistic with critical values or p-values and determine whether to reject H0 in favor of Ha or not.
  • Draw conclusions: Interpret the results and communicate the findings in the context of the research question.

4. What are the 2 types of hypothesis testing?

  • One-tailed (or one-sided) test: Tests for the significance of an effect in only one direction, either positive or negative.
  • Two-tailed (or two-sided) test: Tests for the significance of an effect in both directions, allowing for the possibility of a positive or negative effect.

The choice between one-tailed and two-tailed tests depends on the specific research question and the directionality of the expected effect.

5. What are the 3 major types of hypothesis?

The three major types of hypotheses are:

  • Null Hypothesis (H0): Represents the default assumption, stating that there is no significant effect or relationship in the data.
  • Alternative Hypothesis (Ha): Contradicts the null hypothesis and proposes a specific effect or relationship that researchers want to investigate.
  • Nondirectional Hypothesis: An alternative hypothesis that doesn't specify the direction of the effect, leaving it open for both positive and negative possibilities.

Find our Data Analyst Online Bootcamp in top cities:

About the author.

Avijeet Biswal

Avijeet is a Senior Research Analyst at Simplilearn. Passionate about Data Analytics, Machine Learning, and Deep Learning, Avijeet is also interested in politics, cricket, and football.

Recommended Resources

The Key Differences Between Z-Test Vs. T-Test

Free eBook: Top Programming Languages For A Data Scientist

Normality Test in Minitab: Minitab with Statistics

Normality Test in Minitab: Minitab with Statistics

A Comprehensive Look at Percentile in Statistics

Machine Learning Career Guide: A Playbook to Becoming a Machine Learning Engineer

  • PMP, PMI, PMBOK, CAPM, PgMP, PfMP, ACP, PBA, RMP, SP, and OPM3 are registered marks of the Project Management Institute, Inc.

Advertisement

Advertisement

Key points of statistics for clinically oriented research

Kernpunkte der Statistik für klinisch orientierte Forschung

  • AGA-Komitee-Hefte
  • Open access
  • Published: 23 May 2024

Cite this article

You have full access to this open access article

statistics hypothesis testing exam questions

  • Brenda Laky PhD 1 , 2 , 3 , 4 &

AGA Research Committee

Processing of research data is a crucial point in every scientific study. The basis of each statistical test is a clear research question and hypothesis. Then the scientist must consider which type of data are to be collected (nominal, ordinal, continuous). It is recommended that study-specific spreadsheets are developed before data collection, which consider the type of data being collected, how the data are to be analyzed, and which software will be used. Then, once the data have been gathered, tests for normality should be applied before proceeding with the specific statistical tests. This article provides a brief overview of common statistical tests based on fictional examples.

Zusammenfassung

Die Datensammlung und -analyse ist ein entscheidender Teil einer wissenschaftlichen Arbeit. Die Grundlage eines jeden wissenschaftlichen Tests liefert dabei eine klar formulierte Fragestellung und Hypothese. Anschließend muss man sich klar machen, welche Art von Daten erhoben werden (nominale, ordinale, kontinuierliche). Hierfür ist es zu empfehlen, bereits vor Beginn der Studie entsprechende Datenblätter zu erstellen, um eine einheitliche und an die spätere Auswertung angepasste Datensammlung zu gewährleisten. Sobald die wissenschaftlichen Daten erhoben wurden, sollten diese auf Normalverteilung überprüft werden, bevor die spezifischen statistischen Tests Anwendung finden. Der vorliegende Artikel liefert anhand von Beispielen eine kurze Übersicht zu den wichtigsten statistischen Verfahren.

Avoid common mistakes on your manuscript.

Statistical considerations should not only be applied after data collection: an exact statistical analysis plan including sample size calculation, a well-described primary research question, and an entire data evaluation plan should already be incorporated into the study protocol!

Furthermore, precisely defined data are a prerequisite to being able to perform statistical analyses. The best way to do this is to create an Excel (Microsoft, Redmond, WA, USA) or SPSS (IMB Corp., Armonk, NY, USA) file including all data to be collected prior to data collection. In the next step, starting with the primary question (primary endpoint), specific questions can be formulated according to the PICO(T) and FINER criteria. The PICO acronym stands for patients (P) with a defined health problem, an intervention (I) with a specific treatment, with a comparison (C; e.g., control intervention) and an outcome (O; e.g., outcome measure), and, if applicable, for a time (T). The planned study design and methods can and should then be “refined” with the FINER method [ 1 ].

Choice of statistical test

The choice of statistical test depends on the following:

The reason for testing

a group/variable.

relationship between (a) groups with respect to one variable or (b) variables.

group comparison between 2 or ≥ 3 groups.

group comparison between independent (e.g., comparison between treatment A vs. B) or paired groups (e.g., comparison between before and after treatment).

type of variables.

nominal: bi- (e.g., yes/no) or polynominal (e.g., treatment A, B, C, D).

ordinal: non-parametric (not normally distributed, e.g., school grading system: very good/good/satisfactory/insufficient).

continuous: parametric (normally distributed, e.g., age in years).

The distribution of metric data can be graphically represented using histograms. Measures of central tendency (e.g., mean, median) describe the center of the data, and measures of dispersion (e.g., standard deviation, interquartile range) indicate how much the data vary. Normally distributed quantitative data should be reported as mean and standard deviation. Outliers within a data series can distort the mean (i.e., move it up or down) and, thus, affect distribution. Therefore, non-parametric continuous and ordinal data should be reported as median with interquartile range.

To make the selection easier, the following chart may help when planning a study (Fig.  1 ).

figure 1

Statistical tests cheat chart. ANOVA  analysis of variance, H0  null hypothesis

However, before a statistical test can be applied, the prerequisites must always be checked. For example, continuous variables must always be checked to see if they are normally distributed. If they are normally distributed, a parametric test must be chosen, if they are not normally distributed, a non-parametric test is more adequate.

Every protocol needs a statistical analysis plan in advance

When the right test is clear, data can be analyzed. There are good online tutorials for this (e.g., https://peterstatistics.com/CrashCourse/ shows how to do certain analyses with MS Excel, SPSS, or R). Besides well-known software like MS Excel, SPSS, SAS, and R (R Foundation for Statistical Computing, Vienna, Austria), the software Bluesky statistics (see https://www.blueskystatistics.com/ ) is a good free alternative.

Statistical analysis plan

Every protocol needs a statistical analysis plan in advance! This should include the following points:

The statistical test planned for the primary question/hypothesis. If the primary question/hypothesis requires more than one test to answer the research question, multiple testing must be considered and, hence, appropriate correction procedures (e.g., Bonferroni) must be applied.

Sample size calculation , if required; sample size calculations are not required for pilot projects and retrospective studies because such study types serve as the basis for prospective studies. Next to the primary question/hypothesis, an approximate hunch regarding the primary outcome from preliminary studies, reviews, and literature is required for sample size calculation. However, for prospective studies, an a priori sample size calculation is always important. Once a primary research question has been precisely defined, a sample size analysis can be performed e.g., by using G*Power [ 2 , 3 ]. Data needed for the sample size analysis should be taken from previous pilot studies or from similar studies that have already been published.

Descriptive statistics to describe demographic data.

For each further exploratory question, the planned test should be described ( analysis of secondary outcome parameters ). Exploratory, because a study should generally answer only one primary question/hypothesis at a time and all further questions serve to generate further hypotheses and, thus, could be used for future studies’ sample size calculations.

Practical approach

The following sections outline the procedure in practice.

Formulation of the research idea (general question).

Example : Does arthroscopic therapy improve orthopedic pathology more than endoprosthetic treatment?

Formulation of the primary question/hypothesis according to the PICO method:

patients with specific orthopedic pathologies

arthroscopic therapy (treatment A)

endoprosthetic treatment (treatment B)

outcome score (difference between pre- and postoperative)

preoperative and 5 years postoperative

Primary question

Does arthroscopic therapy improve the outcome score difference from preoperative to 5 years postoperative more than endoprosthetic treatment?

Null hypothesis (H0)

Mean outcome score difference = between treatment A and B

Alternative hypothesis (HA)

Possibility 1:.

mean outcome score difference ≠ between treatment A and B (if a difference is expected but one does not yet know which of the two treatments might be better).

Possibility 2:

mean outcome score difference < between treatment A and B (if one has the hypothesis that treatment A will show less mean outcome score difference than treatment B).

Possibility 3:

mean outcome score difference > between treatment A and B (if one has the hypothesis that treatment A will show more mean outcome score difference than treatment B).

Assuming the outcome score is a continuous variable, it would first need to be checked whether it is normally distributed (e.g., using Kolmogorov–Smirnov test or Shapiro–Wilk test). If the outcome score is parametric, the independent t‑test can be used to compare the means of the two independent groups. If the outcome score is not normally distributed, the Mann–Whitney test must be used.

Formulation and selection of statistical tests of all further questions

Other subsidiary questions of this project could be formulated as follows:

Question 3a

preoperative and 2 years postoperative

Does arthroscopic therapy improve the outcome score difference from preoperative to 2 years postoperative more than endoprosthetic treatment?

Statistical test.

Question 3b

satisfaction with treatment (satisfied/unsatisfied)

Is there a difference between arthroscopic and endoprosthetic treatment in terms of satisfaction with the treatment?

Since the outcome is a binomial variable (satisfaction with treatment: satisfied or dissatisfied) to be compared between two independent groups (treatment A vs. B), either the chi-square or Fisher exact test (in case of studies with small numbers of cases or when the number of observations in cells is small) must be applied.

Question 3c

outcome score 2 years after arthroscopic therapy (postoperative)

outcome score before arthroscopic therapy (preoperative)

outcome score comparison between pre- and 2‑year postoperative

Is there a difference between outcome score from pre- to 2‑year postoperative?

Here, a continuous variable (outcome score) is to be compared before and 2 years after arthroscopic therapy. Since this is a comparison of two outcome score results of one group of patients at different timepoints (paired), first it must be checked whether the outcome score results are normally distributed (e.g., using Kolmogorov–Smirnov test or Shapiro–Wilk test). If the outcome score is normally distributed and thus, parametric, the paired t‑test can be used for comparison of means between before and after arthroscopic therapy. If the outcome score is not normally distributed, the Wilcoxon signed-rank test must be used as data are non-parametrically distributed.

Practical conclusion

Checklist regarding the statistical analysis plan:

Define primary hypothesis according to the PICO plus FINER method.

Examination of data. Depending on the data type (nominal, ordinal, continuous) and considering the prerequisites of the respective statistical test (e.g., parametric, non-parametric), the adequate statistical test can then be selected.

If the PICOs are precisely defined and the adequate statistical test is selected, and if required, a sample size calculation can be performed.

Neugebauer E, Mutschler W, Claes L (2004) Von der Idee zur Publikation – Eine Anleitung zum erfolgreichen wissenschaftlichen Arbeiten, 1st edn. Georg Thieme Verlag KG

Google Scholar  

Faul F, Erdfelder E, Lang A‑G, Buchner A (2007) G*power 3: a flexible statistical power analysis program for the social, behavioral, and biomedical sciences. Behav Res 39:175–191

Article   Google Scholar  

Faul F, Erdfelder E, Buchner A, Lang A‑G (2009) Statistical power analyses using G*power 3.1: tests for correlation and regression analyses. Behav Res 41:1149–1160

Download references

Open access funding provided by Medical University of Vienna.

Author information

Authors and affiliations.

Austrian Research Group for Regenerative and Orthopedic Medicine (AURROM), Hartmanngasse 15/10, 1050, Vienna, Austria

Brenda Laky PhD

Austrian Society of Regenerative Medicine, Wollzeile 3/2.1, Vienna, Austria

Medical Faculty, Sigmund Freud University Vienna, Freudplatz 3, 1020, Vienna, Austria

Center for Clinical Research, University Clinic of Dentistry, Medical University of Vienna, Vienna, Austria

You can also search for this author in PubMed   Google Scholar

Corresponding author

Correspondence to Brenda Laky PhD .

Ethics declarations

Conflict of interest.

B. Laky and the AGA Research Committee declare that they have no competing interests.

For this article no studies with human participants or animals were performed by any of the authors. All studies mentioned were in accordance with the ethical standards indicated in each case.

Additional information

D. Günther, Köln

E. Herbst, Münster

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

figure qr

Scan QR code & read article online

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ .

Reprints and permissions

About this article

Laky, B., AGA Research Committee. Key points of statistics for clinically oriented research. Arthroskopie (2024). https://doi.org/10.1007/s00142-024-00685-8

Download citation

Accepted : 15 April 2024

Published : 23 May 2024

DOI : https://doi.org/10.1007/s00142-024-00685-8

Share this article

Anyone you share the following link with will be able to read this content:

Sorry, a shareable link is not currently available for this article.

Provided by the Springer Nature SharedIt content-sharing initiative

  • Data interpretation, statistical
  • Data collection
  • Data visualization
  • Sample size
  • Research design

Schlüsselwörter

  • Dateninterpretation, statistische
  • Datensammlung
  • Datenvisualisierung
  • Stichprobengröße
  • Studiendesign
  • Find a journal
  • Publish with us
  • Track your research

IMAGES

  1. Hypothesis Testing Solved Examples(Questions and Solutions)

    statistics hypothesis testing exam questions

  2. Hypothesis Testing Statistics Formula Sheet

    statistics hypothesis testing exam questions

  3. Hypothesis Testing- Meaning, Types & Steps

    statistics hypothesis testing exam questions

  4. PPT

    statistics hypothesis testing exam questions

  5. Your Guide to Master Hypothesis Testing in Statistics

    statistics hypothesis testing exam questions

  6. Hypothesis Testing Cheat Sheet

    statistics hypothesis testing exam questions

VIDEO

  1. Hypothesis Testing of Normal Variables- examples

  2. T test Part 1 Hypothesis Set Up and Formula Discussion MBS First Semester Statistics Solution

  3. Mixed exercise 7 Question 3 Edexcel Applied Year 12

  4. A-Level maths June 2018 Question 5 Stats Paper Edexcel

  5. Statistics for Hypothesis Testing

  6. Hypothsis Testing in Statistics Part 2 Steps to Solving a Problem

COMMENTS

  1. PDF Unit 7

    Carry out an appropriate statistical test and interpret your findings. ANSWER . Yes, a paired t-test suggests that the average difference in hours slept (Dalmane - Halcion) = 0.32 is statistically significant (one sided p-value = .018). SOLUTION . This question is asking for a hypothesis test of the equality of two means in the setting of ...

  2. 9.E: Hypothesis Testing with One Sample (Exercises)

    An Introduction to Statistics class in Davies County, KY conducted a hypothesis test at the local high school (a medium sized-approximately 1,200 students-small city demographic) to determine if the local high school's percentage was lower. One hundred fifty students were chosen at random and surveyed.

  3. Hypothesis Testing

    Table of contents. Step 1: State your null and alternate hypothesis. Step 2: Collect data. Step 3: Perform a statistical test. Step 4: Decide whether to reject or fail to reject your null hypothesis. Step 5: Present your findings. Other interesting articles. Frequently asked questions about hypothesis testing.

  4. Significance tests (hypothesis testing)

    Significance tests give us a formal process for using sample data to evaluate the likelihood of some claim about a population value. Learn how to conduct significance tests and calculate p-values to see how likely a sample result is to occur by random chance. You'll also see how we use p-values to make conclusions about hypotheses.

  5. S.3.3 Hypothesis Testing Examples

    If the engineer used the P -value approach to conduct his hypothesis test, he would determine the area under a tn - 1 = t24 curve and to the right of the test statistic t * = 1.22: In the output above, Minitab reports that the P -value is 0.117. Since the P -value, 0.117, is greater than \ (\alpha\) = 0.05, the engineer fails to reject the null ...

  6. 9.4 Full Hypothesis Test Examples

    A teacher believes that 85% of students in the class will want to go on a field trip to the local zoo. The teacher performs a hypothesis test to determine if the percentage is the same or different from 85%. The teacher samples 50 students and 39 reply that they would want to go to the zoo. For the hypothesis test, use a 1% level of significance.

  7. Simple hypothesis testing (practice)

    Simple hypothesis testing. Niels has a Magic 8 -Ball, which is a toy used for fortune-telling or seeking advice. To consult the ball, you ask the ball a question and shake it. One of 5 different possible answers then appears at random in the ball. Niels sensed that the ball answers " Ask again later " too frequently.

  8. 9.1: Introduction to Hypothesis Testing

    In hypothesis testing, the goal is to see if there is sufficient statistical evidence to reject a presumed null hypothesis in favor of a conjectured alternative hypothesis.The null hypothesis is usually denoted \(H_0\) while the alternative hypothesis is usually denoted \(H_1\). An hypothesis test is a statistical decision; the conclusion will either be to reject the null hypothesis in favor ...

  9. Exam Questions

    Tutorial Contents / Maths / Exam Questions - Hypothesis tests: binomial distribution.

  10. S.3 Hypothesis Testing

    S.3 Hypothesis Testing. In reviewing hypothesis tests, we start first with the general idea. Then, we keep returning to the basic procedures of hypothesis testing, each time adding a little more detail. The general idea of hypothesis testing involves: Making an initial assumption. Collecting evidence (data).

  11. Hypothesis Testing Solved Examples(Questions and Solutions)

    View Solution to Question 1. Question 2. A professor wants to know if her introductory statistics class has a good grasp of basic math. Six students are chosen at random from the class and given a math proficiency test. The professor wants the class to be able to score above 70 on the test. The six students get the following scores:62, 92, 75 ...

  12. Hypothesis testing quiz Flashcards

    Study with Quizlet and memorize flashcards containing terms like What distribution is used when testing hypotheses, What does the null hypothesis state, Why do we test the null hypothesis and more.

  13. AP Statistics Exam Questions

    Download free-response questions from this year's exam and past exams along with scoring guidelines, sample responses from exam takers, and scoring distributions. If you are using assistive technology and need help accessing these PDFs in another format, contact Services for Students with Disabilities at 212-713-8333 or by email at ssd@info ...

  14. Hypothesis testing and p-values (video)

    In this video there was no critical value set for this experiment. In the last seconds of the video, Sal briefly mentions a p-value of 5% (0.05), which would have a critical of value of z = (+/-) 1.96. Since the experiment produced a z-score of 3, which is more extreme than 1.96, we reject the null hypothesis.

  15. Hypothesis Testing

    Step 6: Find the. test statistic. using this formula: For this set of data: z= (112.5 - 100) / (15/√30) = 4.56. Step 6: If Step 6 is greater than Step 5, reject the null hypothesis. If it's less than Step 5, you cannot reject the null hypothesis. In this case, it is more (4.56 > 1.645), so you can reject the null.

  16. Statistics Practice Tests

    Free Statistics Practice Tests. Our completely free Statistics practice tests are the perfect way to brush up your skills. Take one of our many Statistics practice tests for a run-through of commonly asked questions. You will receive incredibly detailed scoring results at the end of your Statistics practice test to help you identify your ...

  17. PDF Stats 2 Hypothesis Testing Questions

    Test, at the 5% level of significance, changed. and — 15321 the group's belief that the mean score of 85.9 has (8 marks) 6 The lifetime, X hours, of Ever-white camera batteries is normally distributed. The manufacturer claims that the mean lifetime of these batteries is 100 hours. (a) (b) The members of a photography club suspect that the ...

  18. PDF Hypothesis Testing

    Hypothesis Testing www.naikermaths.com 10. (a) Define the critical region of a test statistic.(2) A discrete random variable X has a Binomial distribution B(30, p).A single observation is used to test H 0: p = 0.3 against H 1: p ≠ 0.3 (b) Using a 1% level of significance find the critical region of this test.You should state the probability of rejection in each tail which should be as close ...

  19. 5.1 Hypothesis Testing

    Model Answers. 1 5 marks. A hypothesis test uses a sample of data in an experiment to test a statement made about the value of a population parameter ( ). Explain, in the context of hypothesis testing, what is meant by: (i) 'sample of data', (ii) 'population parameter'. (iii) 'null hypothesis',

  20. Exams with Solutions

    Exams with Solutions. pdf. 470 kB. 18.05 Introduction to Probability and Statistics (S22), Exam 1 Review: all questions: solutions. pdf. 144 kB. 18.05 Introduction to Probability and Statistics (S22), Exam 1 Review: practice 1: solutions. pdf.

  21. 5.1.1 Hypothesis Testing

    A hypothesis test is carried out at the 5% level of significance to test if a normal coin is fair or not. (i) Describe what the population parameter could be for the hypothesis test. (ii) State whether the hypothesis test should be a one-tailed test or a two-tailed test, give a reason for your answer. (iii)

  22. PDF hypothesis testing introduction

    The teacher claims that children use more gold beads during the activity and checks a random sample of 20 beads out of the bag, after the end of the activity. She finds just two gold beads in the sample. Test, at the 5% level of significance, whether or not there is evidence to support the teacher's claim. significant evidence, 3.55% 5%<.

  23. What is Hypothesis Testing in Statistics? Types and Examples

    Hypothesis testing is a statistical method used to determine if there is enough evidence in a sample data to draw conclusions about a population. It involves formulating two competing hypotheses, the null hypothesis (H0) and the alternative hypothesis (Ha), and then collecting data to assess the evidence.

  24. Hypothesis Test Calculator

    Calculation Example: There are six steps you would follow in hypothesis testing: Formulate the null and alternative hypotheses in three different ways: H0: θ = θ0 versus H1: θ ≠ θ0. H0: θ ≤ θ0 versus H1: θ > θ0. H0: θ ≥ θ0 versus H1: θ < θ0.

  25. . Calculate the p-value for each of the given hypothesis test

    Step 1: 1) The test is left tailed test Now. test stat =t=-3.1881. sample size =n=32. Hence degrees of freedom =df=n-1=32-1=31. The population sd unknown so t distribution will be applicable

  26. Key points of statistics for clinically oriented research

    Every protocol needs a statistical analysis plan in advance! This should include the following points: 1. The statistical test planned for the primary question/hypothesis. If the primary question/hypothesis requires more than one test to answer the research question, multiple testing must be considered and, hence, appropriate correction procedures (e.g., Bonferroni) must be applied.