Library homepage

  • school Campus Bookshelves
  • menu_book Bookshelves
  • perm_media Learning Objects
  • login Login
  • how_to_reg Request Instructor Account
  • hub Instructor Commons
  • Download Page (PDF)
  • Download Full Book (PDF)
  • Periodic Table
  • Physics Constants
  • Scientific Calculator
  • Reference & Cite
  • Tools expand_more
  • Readability

selected template will load here

This action is not available.

Statistics LibreTexts

9.4: Distribution Needed for Hypothesis Testing

  • Last updated
  • Save as PDF
  • Page ID 773

Earlier in the course, we discussed sampling distributions. Particular distributions are associated with hypothesis testing. Perform tests of a population mean using a normal distribution or a Student's \(t\)-distribution. (Remember, use a Student's \(t\)-distribution when the population standard deviation is unknown and the distribution of the sample mean is approximately normal.) We perform tests of a population proportion using a normal distribution (usually \(n\) is large or the sample size is large).

If you are testing a single population mean, the distribution for the test is for means :

\[\bar{X} - N\left(\mu_{x}, \frac{\sigma_{x}}{\sqrt{n}}\right)\]

The population parameter is \(\mu\). The estimated value (point estimate) for \(\mu\) is \(\bar{x}\), the sample mean.

If you are testing a single population proportion, the distribution for the test is for proportions or percentages:

\[P' - N\left(p, \sqrt{\frac{p-q}{n}}\right)\]

The population parameter is \(p\). The estimated value (point estimate) for \(p\) is \(p′\). \(p' = \frac{x}{n}\) where \(x\) is the number of successes and n is the sample size.

Assumptions

When you perform a hypothesis test of a single population mean \(\mu\) using a Student's \(t\)-distribution (often called a \(t\)-test), there are fundamental assumptions that need to be met in order for the test to work properly. Your data should be a simple random sample that comes from a population that is approximately normally distributed. You use the sample standard deviation to approximate the population standard deviation. (Note that if the sample size is sufficiently large, a \(t\)-test will work even if the population is not approximately normally distributed).

When you perform a hypothesis test of a single population mean \(\mu\) using a normal distribution (often called a \(z\)-test), you take a simple random sample from the population. The population you are testing is normally distributed or your sample size is sufficiently large. You know the value of the population standard deviation which, in reality, is rarely known.

When you perform a hypothesis test of a single population proportion \(p\), you take a simple random sample from the population. You must meet the conditions for a binomial distribution which are: there are a certain number \(n\) of independent trials, the outcomes of any trial are success or failure, and each trial has the same probability of a success \(p\). The shape of the binomial distribution needs to be similar to the shape of the normal distribution. To ensure this, the quantities \(np\) and \(nq\) must both be greater than five \((np > 5\) and \(nq > 5)\). Then the binomial distribution of a sample (estimated) proportion can be approximated by the normal distribution with \(\mu = p\) and \(\sigma = \sqrt{\frac{pq}{n}}\). Remember that \(q = 1 – p\).

In order for a hypothesis test’s results to be generalized to a population, certain requirements must be satisfied.

When testing for a single population mean:

  • A Student's \(t\)-test should be used if the data come from a simple, random sample and the population is approximately normally distributed, or the sample size is large, with an unknown standard deviation.
  • The normal test will work if the data come from a simple, random sample and the population is approximately normally distributed, or the sample size is large, with a known standard deviation.

When testing a single population proportion use a normal test for a single population proportion if the data comes from a simple, random sample, fill the requirements for a binomial distribution, and the mean number of successes and the mean number of failures satisfy the conditions: \(np > 5\) and \(nq > 5\) where \(n\) is the sample size, \(p\) is the probability of a success, and \(q\) is the probability of a failure.

Formula Review

If there is no given preconceived \(\alpha\), then use \(\alpha = 0.05\).

Types of Hypothesis Tests

  • Single population mean, known population variance (or standard deviation): Normal test .
  • Single population mean, unknown population variance (or standard deviation): Student's \(t\)-test .
  • Single population proportion: Normal test .
  • For a single population mean , we may use a normal distribution with the following mean and standard deviation. Means: \(\mu = \mu_{\bar{x}}\) and \(\\sigma_{\bar{x}} = \frac{\sigma_{x}}{\sqrt{n}}\)
  • A single population proportion , we may use a normal distribution with the following mean and standard deviation. Proportions: \(\mu = p\) and \(\sigma = \sqrt{\frac{pq}{n}}\).
  • It is continuous and assumes any real values.
  • The pdf is symmetrical about its mean of zero. However, it is more spread out and flatter at the apex than the normal distribution.
  • It approaches the standard normal distribution as \(n\) gets larger.
  • There is a "family" of \(t\)-distributions: every representative of the family is completely defined by the number of degrees of freedom which is one less than the number of data items.

Hypothesis tests about the mean

by Marco Taboga , PhD

This lecture explains how to conduct hypothesis tests about the mean of a normal distribution.

We tackle two different cases:

when we know the variance of the distribution, then we use a z-statistic to conduct the test;

when the variance is unknown, then we use the t-statistic.

In each case we derive the power and the size of the test.

We conclude with two solved exercises on size and power.

Table of contents

Known variance: the z-test

The null hypothesis, the test statistic, the critical region, the decision, the power function, the size of the test, how to choose the critical value, unknown variance: the t-test, how to choose the critical values, solved exercises.

The assumptions are the same we made in the lecture on confidence intervals for the mean .

A test of hypothesis based on it is called z-test .

Otherwise, it is not rejected.

[eq7]

We explain how to do this in the page on critical values .

This case is similar to the previous one. The only difference is that we now relax the assumption that the variance of the distribution is known.

The test of hypothesis based on it is called t-test .

Otherwise, we do not reject it.

[eq19]

The page on critical values explains how this equation is solved.

Below you can find some exercises with explained solutions.

Suppose that a statistician observes 100 independent realizations of a normal random variable.

The mean and the variance of the random variable, which the statistician does not know, are equal to 1 and 4 respectively.

Find the probability that the statistician will reject the null hypothesis that the mean is equal to zero if:

she runs a t-test based on the 100 observed realizations;

[eq32]

A statistician observes 100 independent realizations of a normal random variable.

She performs a t-test of the null hypothesis that the mean of the variable is equal to zero.

[eq38]

How to cite

Please cite as:

Taboga, Marco (2021). "Hypothesis tests about the mean", Lectures on probability theory and mathematical statistics. Kindle Direct Publishing. Online appendix. https://www.statlect.com/fundamentals-of-statistics/hypothesis-testing-mean.

Most of the learning materials found on this website are now available in a traditional textbook format.

  • Gamma function
  • Characteristic function
  • Uniform distribution
  • Mean square convergence
  • Convergence in probability
  • Likelihood ratio test
  • Statistical inference
  • Point estimation
  • Combinations
  • Mathematical tools
  • Fundamentals of probability
  • Probability distributions
  • Asymptotic theory
  • Fundamentals of statistics
  • About Statlect
  • Cookies, privacy and terms of use
  • Discrete random variable
  • Mean squared error
  • Continuous mapping theorem
  • Alternative hypothesis
  • Probability density function
  • IID sequence
  • To enhance your privacy,
  • we removed the social buttons,
  • but don't forget to share .
  • FOR INSTRUCTOR
  • FOR INSTRUCTORS

8.4.3 Hypothesis Testing for the Mean

$\quad$ $H_0$: $\mu=\mu_0$, $\quad$ $H_1$: $\mu \neq \mu_0$.

$\quad$ $H_0$: $\mu \leq \mu_0$, $\quad$ $H_1$: $\mu > \mu_0$.

$\quad$ $H_0$: $\mu \geq \mu_0$, $\quad$ $H_1$: $\mu \lt \mu_0$.

Two-sided Tests for the Mean:

Therefore, we can suggest the following test. Choose a threshold, and call it $c$. If $|W| \leq c$, accept $H_0$, and if $|W|>c$, accept $H_1$. How do we choose $c$? If $\alpha$ is the required significance level, we must have

  • As discussed above, we let \begin{align}%\label{} W(X_1,X_2, \cdots,X_n)=\frac{\overline{X}-\mu_0}{\sigma / \sqrt{n}}. \end{align} Note that, assuming $H_0$, $W \sim N(0,1)$. We will choose a threshold, $c$. If $|W| \leq c$, we accept $H_0$, and if $|W|>c$, accept $H_1$. To choose $c$, we let \begin{align} P(|W| > c \; | \; H_0) =\alpha. \end{align} Since the standard normal PDF is symmetric around $0$, we have \begin{align} P(|W| > c \; | \; H_0) = 2 P(W>c | \; H_0). \end{align} Thus, we conclude $P(W>c | \; H_0)=\frac{\alpha}{2}$. Therefore, \begin{align} c=z_{\frac{\alpha}{2}}. \end{align} Therefore, we accept $H_0$ if \begin{align} \left|\frac{\overline{X}-\mu_0}{\sigma / \sqrt{n}} \right| \leq z_{\frac{\alpha}{2}}, \end{align} and reject it otherwise.
  • We have \begin{align} \beta (\mu) &=P(\textrm{type II error}) = P(\textrm{accept }H_0 \; | \; \mu) \\ &= P\left(\left|\frac{\overline{X}-\mu_0}{\sigma / \sqrt{n}} \right| \lt z_{\frac{\alpha}{2}}\; | \; \mu \right). \end{align} If $X_i \sim N(\mu,\sigma^2)$, then $\overline{X} \sim N(\mu, \frac{\sigma^2}{n})$. Thus, \begin{align} \beta (\mu)&=P\left(\left|\frac{\overline{X}-\mu_0}{\sigma / \sqrt{n}} \right| \lt z_{\frac{\alpha}{2}}\; | \; \mu \right)\\ &=P\left(\mu_0- z_{\frac{\alpha}{2}} \frac{\sigma}{\sqrt{n}} \leq \overline{X} \leq \mu_0+ z_{\frac{\alpha}{2}} \frac{\sigma}{\sqrt{n}}\right)\\ &=\Phi\left(z_{\frac{\alpha}{2}}+\frac{\mu_0-\mu}{\sigma / \sqrt{n}}\right)-\Phi\left(-z_{\frac{\alpha}{2}}+\frac{\mu_0-\mu}{\sigma / \sqrt{n}}\right). \end{align}
  • Let $S^2$ be the sample variance for this random sample. Then, the random variable $W$ defined as \begin{equation} W(X_1,X_2, \cdots, X_n)=\frac{\overline{X}-\mu_0}{S / \sqrt{n}} \end{equation} has a $t$-distribution with $n-1$ degrees of freedom, i.e., $W \sim T(n-1)$. Thus, we can repeat the analysis of Example 8.24 here. The only difference is that we need to replace $\sigma$ by $S$ and $z_{\frac{\alpha}{2}}$ by $t_{\frac{\alpha}{2},n-1}$. Therefore, we accept $H_0$ if \begin{align} |W| \leq t_{\frac{\alpha}{2},n-1}, \end{align} and reject it otherwise. Let us look at a numerical example of this case.

$\quad$ $H_0$: $\mu=170$, $\quad$ $H_1$: $\mu \neq 170$.

  • Let's first compute the sample mean and the sample standard deviation. The sample mean is \begin{align}%\label{} \overline{X}&=\frac{X_1+X_2+X_3+X_4+X_5+X_6+X_7+X_8+X_9}{9}\\ &=165.8 \end{align} The sample variance is given by \begin{align}%\label{} {S}^2=\frac{1}{9-1} \sum_{k=1}^9 (X_k-\overline{X})^2&=68.01 \end{align} The sample standard deviation is given by \begin{align}%\label{} S&= \sqrt{S^2}=8.25 \end{align} The following MATLAB code can be used to obtain these values: x=[176.2,157.9,160.1,180.9,165.1,167.2,162.9,155.7,166.2]; m=mean(x); v=var(x); s=std(x); Now, our test statistic is \begin{align} W(X_1,X_2, \cdots, X_9)&=\frac{\overline{X}-\mu_0}{S / \sqrt{n}}\\ &=\frac{165.8-170}{8.25 / 3}=-1.52 \end{align} Thus, $|W|=1.52$. Also, we have \begin{align} t_{\frac{\alpha}{2},n-1} = t_{0.025,8} \approx 2.31 \end{align} The above value can be obtained in MATLAB using the command $\mathtt{tinv(0.975,8)}$. Thus, we conclude \begin{align} |W| \leq t_{\frac{\alpha}{2},n-1}. \end{align} Therefore, we accept $H_0$. In other words, we do not have enough evidence to conclude that the average height in the city is different from the average height in the country.

Let us summarize what we have obtained for the two-sided test for the mean.

One-sided Tests for the Mean:

  • As before, we define the test statistic as \begin{align}%\label{} W(X_1,X_2, \cdots,X_n)=\frac{\overline{X}-\mu_0}{\sigma / \sqrt{n}}. \end{align} If $H_0$ is true (i.e., $\mu \leq \mu_0$), we expect $\overline{X}$ (and thus $W$) to be relatively small, while if $H_1$ is true, we expect $\overline{X}$ (and thus $W$) to be larger. This suggests the following test: Choose a threshold, and call it $c$. If $W \leq c$, accept $H_0$, and if $W>c$, accept $H_1$. How do we choose $c$? If $\alpha$ is the required significance level, we must have \begin{align} P(\textrm{type I error}) &= P(\textrm{Reject }H_0 \; | \; H_0) \\ &= P(W > c \; | \; \mu \leq \mu_0) \leq \alpha. \end{align} Here, the probability of type I error depends on $\mu$. More specifically, for any $\mu \leq \mu_0$, we can write \begin{align} P(\textrm{type I error} \; | \; \mu) &= P(\textrm{Reject }H_0 \; | \; \mu) \\ &= P(W > c \; | \; \mu)\\ &=P \left(\frac{\overline{X}-\mu_0}{\sigma / \sqrt{n}}> c \; | \; \mu\right)\\ &=P \left(\frac{\overline{X}-\mu}{\sigma / \sqrt{n}}+\frac{\mu-\mu_0}{\sigma / \sqrt{n}}> c \; | \; \mu\right)\\ &=P \left(\frac{\overline{X}-\mu}{\sigma / \sqrt{n}}> c+\frac{\mu_0-\mu}{\sigma / \sqrt{n}} \; | \; \mu\right)\\ &\leq P \left(\frac{\overline{X}-\mu}{\sigma / \sqrt{n}}> c \; | \; \mu\right) \quad (\textrm{ since }\mu \leq \mu_0)\\ &=1-\Phi(c) \quad \big(\textrm{ since given }\mu, \frac{\overline{X}-\mu}{\sigma / \sqrt{n}} \sim N(0,1) \big). \end{align} Thus, we can choose $\alpha=1-\Phi(c)$, which results in \begin{align} c=z_{\alpha}. \end{align} Therefore, we accept $H_0$ if \begin{align} \frac{\overline{X}-\mu_0}{\sigma / \sqrt{n}} \leq z_{\alpha}, \end{align} and reject it otherwise.

$\quad$ $H_0$: $\mu \geq \mu_0$, $\quad$ $H_1$: $\mu \lt \mu_0$,

8.1 A Single Population Mean Using the Normal Distribution

A confidence interval for a population mean with a known standard deviation is based on the fact that the sample means follow an approximately normal distribution. Suppose that our sample has a mean of x ¯  = 10 x ¯  = 10 and we have constructed the 90 percent confidence interval (5, 15), where the margin of error = 5.

Calculating the Confidence Interval

To construct a confidence interval for a single unknown population mean, μ , where the population standard deviation is known , we need x ¯ x ¯ as an estimate for μ , and we need the margin of error. The margin of error for the population mean is called the error bound for a population mean ( EBM ). The sample mean, x ¯ , x ¯ , is the point estimate of the unknown population mean, μ .

The confidence interval ( CI ) estimate will have the form:

(point estimate – error bound, point estimate + error bound) or, in symbols, ( x ¯ – E B M , x ¯ + E B M x ¯ – E B M , x ¯ + E B M ).

The margin of error ( EBM ) depends on the confidence level ( CL ). The confidence level is often considered the probability that the calculated confidence interval estimate will contain the true population parameter. However, it is more accurate to state that the confidence level is the percentage of confidence intervals that contain the true population parameter when repeated samples are taken. Most often, the person constructing the confidence interval will choose a confidence level of 90 percent or higher, because that person wants to be reasonably certain of his or her conclusions.

Another probability, which is called alpha ( α ) ( α ) is related to the confidence level, CL . Alpha is the probability that the confidence interval does not contain the unknown population parameter. Mathematically, alpha can be computed as α = 1 − C L α = 1 − C L .

Example 8.1

  • Suppose we have collected data from a sample. We know the sample mean, but we do not know the mean for the entire population.
  • The sample mean is seven, and the error bound for the mean is 2.5.

The confidence interval is (7 – 2.5, 7 + 2.5), and calculating the values gives (4.5, 9.5).

If the confidence level is 95 percent, then we say, "We estimate with 95 percent confidence that the true value of the population mean is between 4.5 and 9.5."

Suppose we have data from a sample. The sample mean is 15, and the error bound for the mean is 3.2.

What is the confidence interval estimate for the population mean?

A confidence interval for a population mean with a known standard deviation is based on the fact that the sample means follow an approximately normal distribution. Suppose that our sample has a mean of x ¯ x ¯ = 10, and we have constructed the 90 percent confidence interval (5, 15) where EBM = 5.

To get a 90 percent confidence interval, we must include the central 90 percent of the probability of the normal distribution. If we include the central 90 percent, we leave out a total of α = 10 percent in both tails, or 5 percent in each tail, of the normal distribution.

The critical value 1.645 is the z -score in a standard normal probability distribution that puts an area of 0.90 in the center, an area of 0.05 in the far left tail, and an area of 0.05 in the far right tail. To capture the central 90 percent, we must go out 1.645 standard deviations on either side of the calculated sample mean. The critical value will change depending on the confidence level of the interval.

It is important that the standard deviation used be appropriate for the parameter we are estimating, so in this section, we need to use the standard deviation that applies to sample means, which is σ n σ n . The fraction σ n σ n is commonly called the standard error of the mean in order to distinguish clearly the standard deviation for a mean from the population standard deviation, σ .

  • X ¯ X ¯ is normally distributed, that is, X ¯ X ¯ ~ N ( μ X , σ n ) ( μ X , σ n ) .
  • When the population standard deviation σ is known, we use a normal distribution to calculate the error bound.

To construct a confidence interval estimate for an unknown population mean, we need data from a random sample. The steps to construct and interpret the confidence interval are as follows:

  • Calculate the sample mean, x ¯ , x ¯ , from the sample data. Remember, in this section, we already know the population standard deviation, σ .
  • Find the z -score that corresponds to the confidence level.
  • Calculate the error bound EBM .
  • Construct the confidence interval.
  • If we denote the critical z -score by z a 2 z a 2 , and the sample size by n , then the formula for the confidence interval with confidence level C l = 1 − α C l = 1 − α , is given by ( x ¯ − z a 2 × σ n , x ¯ + z a 2 × σ n ) . ( x ¯ − z a 2 × σ n , x ¯ + z a 2 × σ n ) .
  • Write a sentence that interprets the estimate in the context of the situation in the problem. (Explain what the confidence interval means, in the words of the problem.)

We will first examine each step in more detail and then illustrate the process with some examples.

Finding the z -Score for the Stated Confidence Level

When we know the population standard deviation, σ , we use a standard normal distribution to calculate the error bound EBM and construct the confidence interval. We need to find the value of z that puts an area equal to the confidence level (in decimal form) in the middle of the standard normal distribution Z ~ N (0, 1).

The confidence level, CL , is the area in the middle of the standard normal distribution. CL = 1 – α , so α is the area that is split equally between the two tails. Each of the tails contains an area equal to α 2 α 2 .

The z -score that has an area to the right of α 2 α 2 is denoted by z α 2 z α 2 .

For example, when CL = 0.95, α = 0.05, and α 2 α 2 = 0.025, we write z α 2 z α 2 = z z 0.025 .

The area to the right of z 0.025 is 0.025 and the area to the left of z 0.025 is 1 – 0.025 = 0.975.

z α 2  =  z 0. 025  = 1 .96 z α 2  =  z 0. 025  = 1 .96 , using a calculator, computer, or standard normal probability table.

Normal table (see appendices) shows that the probability for 0 to 1.96 is 0.47500, and so the probability to the right tail of the critical value 1.96 is 0.5 – 0.475 = 0.025

Using the TI-83, 83+, 84, 84+ Calculator

invNorm (0.975, 0, 1) = 1.96. In this command, the value 0.975 is the total area to the left of the critical value that we are looking to calculate. The parameters 0 and 1 are the mean value and the standard deviation of the standard normal distribution Z.

Remember to use the area to the LEFT of z α 2 z α 2 . In this chapter, the last two inputs in the invNorm command are 0, 1, because you are using a standard normal distribution Z with mean 0 and standard deviation 1.

Calculating the Margin of Error EBM

The error bound formula for an unknown population mean, μ , when the population standard deviation, σ , is known is

Margin of error = ( z α 2 ) ( σ n ) . ( z α 2 ) ( σ n ) .

Constructing the Confidence Interval

The confidence interval estimate has the format sample mean plus or minus the margin of error.

The graph gives a picture of the entire situation

CL + α 2 α 2 + α 2 α 2 = CL + α = 1.

Writing the Interpretation

The interpretation should clearly state the confidence level ( CL ), explain which population parameter is being estimated (here, a population mean ), and state the confidence interval (both endpoints): "We estimate with ___percent confidence that the true population mean (include the context of the problem) is between ___ and ___ (include appropriate units)."

Example 8.2

Suppose scores on exams in statistics are normally distributed with an unknown population mean and a population standard deviation of three points. A random sample of 36 scores is taken and gives a sample mean (sample mean score) of 68. Find a confidence interval estimate for the population mean exam score (the mean score on all exams).

Find a 90 percent confidence interval for the true (population) mean of statistics exam scores.

  • You can use technology to calculate the confidence interval directly.
  • The first solution is shown step-by-step (Solution A).
  • The second solution uses the TI-83, 83+, and 84+ calculators (Solution B).
  • x ¯  = 68 x ¯  = 68 8.2 E B M = ( z α 2 ) ( σ n ) E B M = ( z α 2 ) ( σ n ) σ = 3; n = 36 ; σ = 3; n = 36 ;
  • The confidence level is 90 percent ( CL = 0.90).

The area to the right of z 0.05 is 0.05 and the area to the left of z 0.05 is 1 – 0.05 = 0.95.

using invNorm(0.95, 0, 1) on the TI-83,83+, and 84+ calculators. This can also be found using appropriate commands on other calculators, using a computer, or using a probability table for the standard normal distribution.

EBM = (1.645) ( 3 36 ) ( 3 36 ) = 0.8225

x ¯ x ¯ – EBM = 68 – 0.8225 = 67.1775

x ¯ x ¯ + EBM = 68 + 0.8225 = 68.8225

The 90 percent confidence interval is (67.1775, 68.8225).

Press STAT and arrow over to TESTS . Arrow down to 7:ZInterval . Press ENTER . Arrow to Stats and press ENTER . Arrow down and enter 3 for σ , 68 for x ¯ x ¯ , 36 for n , and .90 for C-level . Arrow down to Calculate and press ENTER . The confidence interval is (to three decimal places)(67.178, 68.822).

Interpretation

Explanation of 90 percent confidence level.

Suppose average pizza delivery times are normally distributed with an unknown population mean and a population standard deviation of 6 minutes. A random sample of 28 pizza delivery restaurants is taken and has a sample mean delivery time of 36 min.

Find a 90 percent confidence interval estimate for the population mean delivery time.

Example 8.3

The specific absorption rate (SAR) for a cell phone measures the amount of radio frequency (RF) energy absorbed by the user’s body when using the handset. Every cell phone emits RF energy. Different phone models have different SAR measures. For certification from the Federal Communications Commission for sale in the United States, the SAR level for a cell phone must be no more than 1.6 watts per kilogram. Table 8.1 shows the highest SAR level for a random selection of cell phone models of a random cell phone company.

Find a 98 percent confidence interval for the true (population) mean of the SARs for cell phones. Assume that the population standard deviation is σ = 0.337.

This is calculated by adding the specific absorption rate for the 30 cell phones in the sample, and dividing the result by 30.

Next, find the EBM . Because you are creating a 98 percent confidence interval, CL = 0.98.

You need to find z 0.01 , having the property that the area under the normal density curve to the right of z 0.01 is 0.01 and the area to the left is 0.99. Use your calculator, a computer, or a probability table for the standard normal distribution to find z 0.01 = 2.326.

To find the 98 percent confidence interval, find   x ¯ ± E B M   x ¯ ± E B M .

We estimate with 98 percent confidence that the true SAR mean for the population of cell phones in the United States is between 0.8809 and 1.1671 watts per kilogram.

  • Press STAT and arrow over to TESTS.
  • Arrow down to 7:ZInterval.
  • Press ENTER.
  • Arrow to Stats and press ENTER.
  • x ¯ : 1.024 x ¯ : 1.024
  • C -level: 0.98
  • Arrow down to Calculate and press ENTER.
  • The confidence interval is (to three decimal places) (0.881, 1.167).

Table 8.2 shows a different random sampling of 20 cell phone models. Use these data to calculate a 93 percent confidence interval for the true mean SAR for cell phones certified for use in the United States. As previously, assume that the population standard deviation is σ = 0.337.

Notice the difference in the confidence intervals calculated in Example 8.3 and the following Try It exercise. These intervals are different for several reasons: they are calculated from different samples, the samples are different sizes, and the intervals are calculated for different levels of confidence. Even though the intervals are different, they do not yield conflicting information. The effects of these kinds of changes are the subject of the next section in this chapter.

Changing the Confidence Level or Sample Size

Example 8.4.

Suppose we change the original problem in Example 8.2 by using a 95 percent confidence level. Find a 95 percent confidence interval for the true (population) mean statistics exam score.

To find the confidence interval, you need the sample mean, x ¯ x ¯ , and the EBM .

  • x ¯  = 68 x ¯  = 68 E B M = ( z α 2 ) ( σ n ) E B M = ( z α 2 ) ( σ n ) σ = 3; n = 36 σ = 3; n = 36
  • The confidence level is 95 percent ( CL = 0.95).

The area to the right of z z 0.025 is 0.025, and the area to the left of z z 0.025 is 1 – 0.025 = 0.975.

when using invnorm(0.975,0,1) on the TI-83, 83+, or 84+ calculators. (This can also be found using appropriate commands on other calculators, using a computer, or using a probability table for the standard normal distribution.)

Notice that the EBM is larger for a 95 percent confidence level in the original problem.

Explanation of 95 percent Confidence Level

Comparing the results, summary: effect of changing the confidence level.

  • Increasing the confidence level increases the error bound, making the confidence interval wider.
  • Decreasing the confidence level decreases the error bound, making the confidence interval narrower.

Refer back to the pizza-delivery Try It exercise. The population standard deviation is six minutes and the sample mean deliver time is 36 minutes. Use a sample size of 20. Find a 95 percent confidence interval estimate for the true mean pizza-delivery time.

Example 8.5

Suppose we change the original problem in Example 8.2 to see what happens to the error bound if the sample size is changed.

Leave everything the same except the sample size. Use the original 90 percent confidence level. What happens to the error bound and the confidence interval if we increase the sample size and use n = 100 instead of n = 36? What happens if we decrease the sample size to n = 25 instead of n = 36?

  • x ¯ x ¯ = 68
  • EBM = ( z α 2 ) ( σ n ) ( z α 2 ) ( σ n )
  • σ = 3, the confidence level is 90 percent ( CL = 0.90), z α 2 z α 2 = z 0.05 = 1.645.

When n = 100, EBM = ( z α 2 ) ( σ n ) ( z α 2 ) ( σ n ) = (1.645) ( 3 100 ) ( 3 100 ) = 0.4935.

When n = 25, EBM = ( z α 2 ) ( σ n ) ( z α 2 ) ( σ n ) = (1.645) ( 3 25 ) ( 3 25 ) = 0.987.

Summary: Effect of Changing the Sample Size

  • Increasing the sample size causes the error bound to decrease, making the confidence interval narrower.
  • Decreasing the sample size causes the error bound to increase, making the confidence interval wider.

Refer back to the pizza-delivery Try It exercise. The mean delivery time is 36 minutes and the population standard deviation is six minutes. Assume the sample size is changed to 50 restaurants with the same sample mean. Find a 90 percent confidence interval estimate for the population mean delivery time.

Working Backward to Find the Error Bound or Sample Mean

When we calculate a confidence interval, we find the sample mean, calculate the error bound, and use them to calculate the confidence interval. However, sometimes when we read statistical studies, the study may state the confidence interval only. If we know the confidence interval, we can work backward to find both the error bound and the sample mean.

  • From the upper value for the interval, subtract the sample mean,
  • Or, from the upper value for the interval, subtract the lower value. Then divide the difference by 2.
  • Subtract the error bound from the upper value of the confidence interval,
  • Or, average the upper and lower endpoints of the confidence interval.

Notice that there are two methods to perform each calculation. You can choose the method that is easier to use with the information you know.

Example 8.6

Suppose we know that a confidence interval is (67.18, 68.82) and we want to find the error bound. We may know that the sample mean is 68, or perhaps our source only gives the confidence interval and does not tell us the value of the sample mean.

Calculate the error bound:

  • If we know that the sample mean is 68, EBM = 68.82 – 68 = 0.82.
  • If we do not know the sample mean, EBM = ( 68.82 − 67.18 ) 2 ( 68.82 − 67.18 ) 2 = 0.82. The margin of error is the quantity that we add and subtract from the sample mean to obtain the confidence interval. Therefore, the margin of error is half of the length of the interval.

Calculate the sample mean:

  • If we know the error bound, x ¯ x ¯ = 68.82 – 0.82 = 68.
  • If we do not know the error bound, x ¯ x ¯ = ( 67.18 + 68.82 ) 2 ( 67.18 + 68.82 ) 2 = 68.

Suppose we know that a confidence interval is (42.12, 47.88). Find the error bound and the sample mean.

Calculating the Sample Size n

If researchers desire a specific margin of error, then they can use the error bound formula to calculate the required sample size. In this situation, we are given the desired margin of error, EBM , and we need to compute the sample size n .

The formula for sample size is n = z 2 σ 2 E B M 2 z 2 σ 2 E B M 2 , found by solving the error bound formula for n . Always round up the value of n to the closest integer.

In this formula, z is the critical value z α 2 z α 2 , corresponding to the desired confidence level. A researcher planning a study who wants a specified confidence level and error bound can use this formula to calculate the size of the sample needed for the study.

Example 8.7

The population standard deviation for the age of Foothill College students is 15 years. If we want to be 95 percent confident that the sample mean age is within two years of the true population mean age of Foothill College students, how many randomly selected Foothill College students must be surveyed?

  • From the problem, we know that σ = 15 and EBM = 2.
  • z = z 0.025 = 1.96, because the confidence level is 95 percent.
  • n = z 2 σ 2 E B M 2 z 2 σ 2 E B M 2 = ( 1.96 ) 2 ( 15 ) 2 2 2 ( 1.96 ) 2 ( 15 ) 2 2 2 = 216.09 using the sample size equation.
  • Use n = 217. Always round the answer up to the next higher integer to ensure that the sample size is large enough.

Therefore, 217 Foothill College students should be surveyed in order to be 95 percent confident that we are within two years of the true population mean age of Foothill College students.

The population standard deviation for the height of high school basketball players is three inches. If we want to be 95 percent confident that the sample mean height is within one inch of the true population mean height, how many randomly selected students must be surveyed?

As an Amazon Associate we earn from qualifying purchases.

This book may not be used in the training of large language models or otherwise be ingested into large language models or generative AI offerings without OpenStax's permission.

Want to cite, share, or modify this book? This book uses the Creative Commons Attribution License and you must attribute Texas Education Agency (TEA). The original material is available at: https://www.texasgateway.org/book/tea-statistics . Changes were made to the original material, including updates to art, structure, and other content updates.

Access for free at https://openstax.org/books/statistics/pages/1-introduction
  • Authors: Barbara Illowsky, Susan Dean
  • Publisher/website: OpenStax
  • Book title: Statistics
  • Publication date: Mar 27, 2020
  • Location: Houston, Texas
  • Book URL: https://openstax.org/books/statistics/pages/1-introduction
  • Section URL: https://openstax.org/books/statistics/pages/8-1-a-single-population-mean-using-the-normal-distribution

© Jan 23, 2024 Texas Education Agency (TEA). The OpenStax name, OpenStax logo, OpenStax book covers, OpenStax CNX name, and OpenStax CNX logo are not subject to the Creative Commons license and may not be reproduced without the prior and express written consent of Rice University.

Teach yourself statistics

Hypothesis Test for a Mean

This lesson explains how to conduct a hypothesis test of a mean, when the following conditions are met:

  • The sampling method is simple random sampling .
  • The sampling distribution is normal or nearly normal.

Generally, the sampling distribution will be approximately normally distributed if any of the following conditions apply.

  • The population distribution is normal.
  • The population distribution is symmetric , unimodal , without outliers , and the sample size is 15 or less.
  • The population distribution is moderately skewed , unimodal, without outliers, and the sample size is between 16 and 40.
  • The sample size is greater than 40, without outliers.

This approach consists of four steps: (1) state the hypotheses, (2) formulate an analysis plan, (3) analyze sample data, and (4) interpret results.

State the Hypotheses

Every hypothesis test requires the analyst to state a null hypothesis and an alternative hypothesis . The hypotheses are stated in such a way that they are mutually exclusive. That is, if one is true, the other must be false; and vice versa.

The table below shows three sets of hypotheses. Each makes a statement about how the population mean μ is related to a specified value M . (In the table, the symbol ≠ means " not equal to ".)

The first set of hypotheses (Set 1) is an example of a two-tailed test , since an extreme value on either side of the sampling distribution would cause a researcher to reject the null hypothesis. The other two sets of hypotheses (Sets 2 and 3) are one-tailed tests , since an extreme value on only one side of the sampling distribution would cause a researcher to reject the null hypothesis.

Formulate an Analysis Plan

The analysis plan describes how to use sample data to accept or reject the null hypothesis. It should specify the following elements.

  • Significance level. Often, researchers choose significance levels equal to 0.01, 0.05, or 0.10; but any value between 0 and 1 can be used.
  • Test method. Use the one-sample t-test to determine whether the hypothesized mean differs significantly from the observed sample mean.

Analyze Sample Data

Using sample data, conduct a one-sample t-test. This involves finding the standard error, degrees of freedom, test statistic, and the P-value associated with the test statistic.

SE = s * sqrt{ ( 1/n ) * [ ( N - n ) / ( N - 1 ) ] }

SE = s / sqrt( n )

  • Degrees of freedom. The degrees of freedom (DF) is equal to the sample size (n) minus one. Thus, DF = n - 1.

t = ( x - μ) / SE

  • P-value. The P-value is the probability of observing a sample statistic as extreme as the test statistic. Since the test statistic is a t statistic, use the t Distribution Calculator to assess the probability associated with the t statistic, given the degrees of freedom computed above. (See sample problems at the end of this lesson for examples of how this is done.)

Sample Size Calculator

As you probably noticed, the process of hypothesis testing can be complex. When you need to test a hypothesis about a mean score, consider using the Sample Size Calculator. The calculator is fairly easy to use, and it is free. You can find the Sample Size Calculator in Stat Trek's main menu under the Stat Tools tab. Or you can tap the button below.

Interpret Results

If the sample findings are unlikely, given the null hypothesis, the researcher rejects the null hypothesis. Typically, this involves comparing the P-value to the significance level , and rejecting the null hypothesis when the P-value is less than the significance level.

Test Your Understanding

In this section, two sample problems illustrate how to conduct a hypothesis test of a mean score. The first problem involves a two-tailed test; the second problem, a one-tailed test.

Problem 1: Two-Tailed Test

An inventor has developed a new, energy-efficient lawn mower engine. He claims that the engine will run continuously for 5 hours (300 minutes) on a single gallon of regular gasoline. From his stock of 2000 engines, the inventor selects a simple random sample of 50 engines for testing. The engines run for an average of 295 minutes, with a standard deviation of 20 minutes. Test the null hypothesis that the mean run time is 300 minutes against the alternative hypothesis that the mean run time is not 300 minutes. Use a 0.05 level of significance. (Assume that run times for the population of engines are normally distributed.)

Solution: The solution to this problem takes four steps: (1) state the hypotheses, (2) formulate an analysis plan, (3) analyze sample data, and (4) interpret results. We work through those steps below:

Null hypothesis: μ = 300

Alternative hypothesis: μ ≠ 300

  • Formulate an analysis plan . For this analysis, the significance level is 0.05. The test method is a one-sample t-test .

SE = s / sqrt(n) = 20 / sqrt(50) = 20/7.07 = 2.83

DF = n - 1 = 50 - 1 = 49

t = ( x - μ) / SE = (295 - 300)/2.83 = -1.77

where s is the standard deviation of the sample, x is the sample mean, μ is the hypothesized population mean, and n is the sample size.

Since we have a two-tailed test , the P-value is the probability that the t statistic having 49 degrees of freedom is less than -1.77 or greater than 1.77. We use the t Distribution Calculator to find P(t < -1.77) is about 0.04.

  • If you enter 1.77 as the sample mean in the t Distribution Calculator, you will find the that the P(t < 1.77) is about 0.04. Therefore, P(t >  1.77) is 1 minus 0.96 or 0.04. Thus, the P-value = 0.04 + 0.04 = 0.08.
  • Interpret results . Since the P-value (0.08) is greater than the significance level (0.05), we cannot reject the null hypothesis.

Note: If you use this approach on an exam, you may also want to mention why this approach is appropriate. Specifically, the approach is appropriate because the sampling method was simple random sampling, the population was normally distributed, and the sample size was small relative to the population size (less than 5%).

Problem 2: One-Tailed Test

Bon Air Elementary School has 1000 students. The principal of the school thinks that the average IQ of students at Bon Air is at least 110. To prove her point, she administers an IQ test to 20 randomly selected students. Among the sampled students, the average IQ is 108 with a standard deviation of 10. Based on these results, should the principal accept or reject her original hypothesis? Assume a significance level of 0.01. (Assume that test scores in the population of engines are normally distributed.)

Null hypothesis: μ >= 110

Alternative hypothesis: μ < 110

  • Formulate an analysis plan . For this analysis, the significance level is 0.01. The test method is a one-sample t-test .

SE = s / sqrt(n) = 10 / sqrt(20) = 10/4.472 = 2.236

DF = n - 1 = 20 - 1 = 19

t = ( x - μ) / SE = (108 - 110)/2.236 = -0.894

Here is the logic of the analysis: Given the alternative hypothesis (μ < 110), we want to know whether the observed sample mean is small enough to cause us to reject the null hypothesis.

The observed sample mean produced a t statistic test statistic of -0.894. We use the t Distribution Calculator to find P(t < -0.894) is about 0.19.

  • This means we would expect to find a sample mean of 108 or smaller in 19 percent of our samples, if the true population IQ were 110. Thus the P-value in this analysis is 0.19.
  • Interpret results . Since the P-value (0.19) is greater than the significance level (0.01), we cannot reject the null hypothesis.

Find Study Materials for

  • Business Studies
  • Combined Science
  • Computer Science
  • Engineering
  • English Literature
  • Environmental Science
  • Human Geography
  • Macroeconomics
  • Microeconomics
  • Social Studies
  • Browse all subjects
  • Read our Magazine

Create Study Materials

Hypothesis tests for the normal distribution can be conducted in a very similar way to binomial distribution , e xcept this time we switch our test statistic.  These tests are useful as again they help us test claims of normally distributed items.

Mockup Schule

Explore our app and discover over 50 million learning materials for free.

  • Normal Distribution Hypothesis Test
  • Explanations
  • StudySmarter AI
  • Textbook Solutions
  • Applied Mathematics
  • Decision Maths
  • Discrete Mathematics
  • Logic and Functions
  • Mechanics Maths
  • Probability and Statistics
  • Bayesian Statistics
  • Bias in Experiments
  • Binomial Distribution
  • Binomial Hypothesis Test
  • Biostatistics
  • Bivariate Data
  • Categorical Data Analysis
  • Categorical Variables
  • Causal Inference
  • Central Limit Theorem
  • Chi Square Test for Goodness of Fit
  • Chi Square Test for Homogeneity
  • Chi Square Test for Independence
  • Chi-Square Distribution
  • Cluster Analysis
  • Combining Random Variables
  • Comparing Data
  • Comparing Two Means Hypothesis Testing
  • Conditional Probability
  • Conducting A Study
  • Conducting a Survey
  • Conducting an Experiment
  • Confidence Interval for Population Mean
  • Confidence Interval for Population Proportion
  • Confidence Interval for Slope of Regression Line
  • Confidence Interval for the Difference of Two Means
  • Confidence Intervals
  • Correlation Math
  • Cox Regression
  • Cumulative Distribution Function
  • Cumulative Frequency
  • Data Analysis
  • Data Interpretation
  • Decision Theory
  • Degrees of Freedom
  • Discrete Random Variable
  • Discriminant Analysis
  • Distributions
  • Empirical Bayes Methods
  • Empirical Rule
  • Errors In Hypothesis Testing
  • Estimation Theory
  • Estimator Bias
  • Events (Probability)
  • Experimental Design
  • Factor Analysis
  • Frequency Polygons
  • Generalization and Conclusions
  • Geometric Distribution
  • Geostatistics
  • Hierarchical Modeling
  • Hypothesis Test for Correlation
  • Hypothesis Test for Regression Slope
  • Hypothesis Test of Two Population Proportions
  • Hypothesis Testing
  • Inference For Distributions Of Categorical Data
  • Inferences in Statistics
  • Item Response Theory
  • Kaplan-Meier Estimate
  • Kernel Density Estimation
  • Large Data Set
  • Lasso Regression
  • Latent Variable Models
  • Least Squares Linear Regression
  • Linear Interpolation
  • Linear Regression
  • Logistic Regression
  • Machine Learning
  • Mann-Whitney Test
  • Markov Chains
  • Mean and Variance of Poisson Distributions
  • Measures of Central Tendency
  • Methods of Data Collection
  • Mixed Models
  • Multilevel Modeling
  • Multivariate Analysis
  • Neyman-Pearson Lemma
  • Non-parametric Methods
  • Normal Distribution
  • Normal Distribution Percentile
  • Ordinal Regression
  • Paired T-Test
  • Parametric Methods
  • Path Analysis
  • Point Estimation
  • Poisson Regression
  • Principle Components Analysis
  • Probability
  • Probability Calculations
  • Probability Density Function
  • Probability Distribution
  • Probability Generating Function
  • Product Moment Correlation Coefficient
  • Quantile Regression
  • Quantitative Variables
  • Random Effects Model
  • Random Variables
  • Randomized Block Design
  • Regression Analysis
  • Residual Sum of Squares
  • Robust Statistics
  • Sample Mean
  • Sample Proportion
  • Sampling Distribution
  • Sampling Theory
  • Scatter Graphs
  • Sequential Analysis
  • Single Variable Data
  • Spearman's Rank Correlation
  • Spearman's Rank Correlation Coefficient
  • Standard Deviation
  • Standard Error
  • Standard Normal Distribution
  • Statistical Graphs
  • Statistical Inference
  • Statistical Measures
  • Stem and Leaf Graph
  • Stochastic Processes
  • Structural Equation Modeling
  • Sum of Independent Random Variables
  • Survey Bias
  • Survival Analysis
  • Survivor Function
  • T-distribution
  • The Power Function
  • Time Series Analysis
  • Transforming Random Variables
  • Tree Diagram
  • Two Categorical Variables
  • Two Quantitative Variables
  • Type I Error
  • Type II Error
  • Types of Data in Statistics
  • Variance for Binomial Distribution
  • Venn Diagrams
  • Wilcoxon Test
  • Zero-Inflated Models
  • Theoretical and Mathematical Physics

Lerne mit deinen Freunden und bleibe auf dem richtigen Kurs mit deinen persönlichen Lernstatistiken

Nie wieder prokastinieren mit unseren Lernerinnerungen.

Hypothesis tests for the normal distribution can be conducted in a very similar way to binomial distribution , e xcept this time we switch our test statistic. These tests are useful as again they help us test claims of normally distributed items.

How do we carry out a hypothesis test for normal distribution?

When we hypothesis test for the mean of a normal distribution we think about looking at the mean of a sample from a population.

So for a random sample of size n of a population, taken from the random variable \(X \sim N(\mu, \sigma^2)\) , the sample mean \(\bar{X}\) can be normally distributed by \(\bar{X} \sim N(\mu, \frac{\sigma^2}{n})\) .

Let's look at an example.

The weight of crisps is each packet is normally distributed with a standard deviation of 2.5g.

The crisp company claims that the crisp packets have a mean weight of 28g. There were numerous complaints that each crisp packet weighs less than this. Therefore a trading inspector investigated this and found in a sample of 50 crisp packets, the mean weight was 27.2g.

Using a 5% significance level and stating the hypothesis, clearly test whether or not the evidence upholds the complaints.

This is an example of a one tailed test. Let's look at an example of a two tailed test.

A machine produces circular discs with a radius R, where R is normally distributed with a mean of 2cm and a standard deviation of 0.3cm.

The machine is serviced and after the service, a random sample of 40 discs is taken to see if the mean has changed from 2cm. The radius is still normally distributed with a standard deviation of 0.3 cm.

The mean is found to be 1.9cm.

Has the mean changed? Test this to a 5% significance level.

Step 5 may be confusing – do we carry out the calculation with \(P(\bar{X} \leq \bar{x})\) or \(P(\bar{X} \geq \bar{x})\)? As a general rule of thumb if the value is between 0 and the mean, then we use \(P(\bar{X} \leq \bar{x})\) . If it is greater than the mean then we use \(P(\bar{X} \geq \bar{x})\) .

How about finding critical values and critical regions?

This is the same idea as in binomial distribution . However, in normal distribution, a calculator can make our lives easier.

The distributions menu has an option called inverse normal.

Here, we enter the significance level (Area), the mean (\(\mu\) ) and the standard deviation (\(\sigma\) ).

The calculator will give us an answer. Let's have a look at an example below.

Wheels are made to measure for a bike. The diameter of the wheel is normally distributed with a mean of 40cm and a standard deviation of 5cm. Some people think that their wheels are too small. Find the critical value of this to a 5% significance level.

In our calculator, in the inverse normal function, we need to enter:

If we perform the inverse normal function we get 31.775732 .

So that is our critical value and our critical region is \(X \leq 31.775732\) .

Let's look at an example with two tails.

Normal distribution hypothesis test two tailed test studysmarter

Hypothesis Test for Normal Distribution - Key takeaways

  • When we hypothesis test for a normal distribution we are trying to see if the mean is different from the mean stated in the null hypothesis.
  • We use the sample mean which is \(\bar{X} \sim N(\mu, \frac{\sigma^2}{n})\) .
  • In two-tailed tests we divide the significance level by two and test on both tails.
  • When finding critical values we use the calculator inverse normal function entering the area as the significance level.
  • For two-tailed tests we need to find two critical values on either end of the distribution.

Frequently Asked Questions about Normal Distribution Hypothesis Test

--> how do you test a hypothesis for a normal distribution, --> is hypothesis testing only for a normal distribution.

No, pretty much any distribution can be used when testing a hypothesis. The two distributions that you learn at A-Level are Normal and Binomial.

--> What statistical hypothesis can be tested in the means of a normal distribution?

We test whether or not the data can support the value of a mean being too low or too high.

What is our test statistic with normal distribution?

What calculator tool do we use to work backwards with normal distribution?

The Inverse Normal

A coach thinks his athletes will achieve less than 12 seconds in their 100 metre race. His assistant thinks they won't be this fast. If this claim was tested is this a one or two-tailed test?

One-tailed test

How do we find the critical region of a normal distribution?

By using the calculator inverse normal setting.

Flashcards

Learn with 4 Normal Distribution Hypothesis Test flashcards in the free StudySmarter app

Already have an account? Log in

of the users don't pass the Normal Distribution Hypothesis Test quiz! Will you pass the quiz?

How would you like to learn this content?

Free math cheat sheet!

Everything you need to know on . A perfect summary so you can easily remember everything.

Join over 22 million students in learning with our StudySmarter App

The first learning app that truly has everything you need to ace your exams in one place

  • Flashcards & Quizzes
  • AI Study Assistant
  • Study Planner
  • Smart Note-Taking

Join over 22 million students in learning with our StudySmarter App

Sign up to highlight and take notes. It’s 100% free.

This is still free to read, it's not a paywall.

You need to register to keep reading, create a free account to save this explanation..

Save explanations to your personalised space and access them anytime, anywhere!

By signing up, you agree to the Terms and Conditions and the Privacy Policy of StudySmarter.

Entdecke Lernmaterial in der StudySmarter-App

Google Popup

Privacy Overview

hypothesis testing for a mean using the normal distribution

  • The Open University
  • Guest user / Sign out
  • Study with The Open University

My OpenLearn Profile

Personalise your OpenLearn profile, save your favourite content and get recognition for your learning

About this free course

Become an ou student, download this course, share this free course.

Data analysis: hypothesis testing

Start this free course now. Just create an account and sign in. Enrol and complete the course for a free statement of participation or digital badge if available.

4.1 The normal distribution

Here, you will look at the concept of normal distribution and the bell-shaped curve. The peak point (the top of the bell) represents the most probable occurrences, while other possible occurrences are distributed symmetrically around the peak point, creating a downward-sloping curve on either side of the peak point.

Cartoon showing a bell-shaped curve.

The cartoon shows a bell-shaped curve. The x-axis is titled ‘How high the hill is’ and the y-axis is titled ‘Number of hills’. The top of the bell-shaped curve is labelled ‘Average hill’, but on the lower right tail of the bell-shaped curve is labelled ‘Big hill’.

In order to test hypotheses, you need to calculate the test statistic and compare it with the value in the bell curve. This will be done by using the concept of ‘normal distribution’.

A normal distribution is a probability distribution that is symmetric about the mean, indicating that data near the mean are more likely to occur than data far from it. In graph form, a normal distribution appears as a bell curve. The values in the x-axis of the normal distribution graph represent the z-scores. The test statistic that you wish to use to test the set of hypotheses is the z-score . A z-score is used to measure how far the observation (sample mean) is from the 0 value of the bell curve (population mean). In statistics, this distance is measured by standard deviation. Therefore, when the z-score is equal to 2, the observation is 2 standard deviations away from the value 0 in the normal distribution curve.

A symmetrical graph reminiscent of a bell showing normal distribution.

A symmetrical graph reminiscent of a bell. The top of the bell-shaped curve appears where the x-axis is at 0. This is labelled as Normal distribution.

Previous

Module 9: Hypothesis Testing With One Sample

Distribution needed for hypothesis testing.

Earlier in the course, we discussed sampling distributions.  Particular distributions are associated with hypothesis testing. Perform tests of a population mean using a normal distribution or a Student’s t- distribution . (Remember, use a Student’s t -distribution when the population standard deviation is unknown and the distribution of the sample mean is approximately normal.) We perform tests of a population proportion using a normal distribution (usually n is large or the sample size is large).

If you are testing a  single population mean , the distribution for the test is for means :

[latex]\displaystyle\overline{{X}}~{N}{\left(\mu_{{X}}\frac{{\sigma_{{X}}}}{\sqrt{{n}}}\right)}{\quad\text{or}\quad}{t}_{{{d}{f}}}[/latex]

The population parameter is  μ . The estimated value (point estimate) for μ is [latex]\displaystyle\overline{{x}}[/latex], the sample mean.

If you are testing a  single population proportion , the distribution for the test is for proportions or percentages:

[latex]\displaystyle{P}\prime~{N}{\left({p}\sqrt{{\frac{{{p}{q}}}{{n}}}}\right)}[/latex]

The population parameter is  p . The estimated value (point estimate) for p is p′ . [latex]\displaystyle{p}\prime=\frac{{x}}{{n}}[/latex] where x is the number of successes and n is the sample size.

Assumptions

When you perform a  hypothesis test of a single population mean μ using a Student’s t -distribution (often called a t-test), there are fundamental assumptions that need to be met in order for the test to work properly. Your data should be a simple random sample that comes from a population that is approximately normally distributed . You use the sample standard deviation to approximate the population standard deviation. (Note that if the sample size is sufficiently large, a t-test will work even if the population is not approximately normally distributed).

When you perform a  hypothesis test of a single population mean μ using a normal distribution (often called a z -test), you take a simple random sample from the population. The population you are testing is normally distributed or your sample size is sufficiently large. You know the value of the population standard deviation which, in reality, is rarely known.

When you perform a  hypothesis test of a single population proportion p , you take a simple random sample from the population. You must meet the conditions for a binomial distribution which are as follows: there are a certain number n of independent trials, the outcomes of any trial are success or failure, and each trial has the same probability of a success p . The shape of the binomial distribution needs to be similar to the shape of the normal distribution. To ensure this, the quantities np  and nq must both be greater than five ( np > 5 and nq > 5). Then the binomial distribution of a sample (estimated) proportion can be approximated by the normal distribution with μ = p and [latex]\displaystyle\sigma=\sqrt{{\frac{{{p}{q}}}{{n}}}}[/latex] . Remember that q = 1 – p .

Concept Review

In order for a hypothesis test’s results to be generalized to a population, certain requirements must be satisfied.

When testing for a single population mean:

  • A Student’s t -test should be used if the data come from a simple, random sample and the population is approximately normally distributed, or the sample size is large, with an unknown standard deviation.
  • The normal test will work if the data come from a simple, random sample and the population is approximately normally distributed, or the sample size is large, with a known standard deviation.

When testing a single population proportion use a normal test for a single population proportion if the data comes from a simple, random sample, fill the requirements for a binomial distribution, and the mean number of success and the mean number of failures satisfy the conditions:  np > 5 and nq > n where n is the sample size, p is the probability of a success, and q is the probability of a failure.

Formula Review

If there is no given preconceived  α , then use α = 0.05.

Types of Hypothesis Tests

  • Single population mean, known population variance (or standard deviation): Normal test .
  • Single population mean, unknown population variance (or standard deviation): Student’s t -test .
  • Single population proportion: Normal test .
  • For a single population mean , we may use a normal distribution with the following mean and standard deviation. Means: [latex]\displaystyle\mu=\mu_{{\overline{{x}}}}{\quad\text{and}\quad}\sigma_{{\overline{{x}}}}=\frac{{\sigma_{{x}}}}{\sqrt{{n}}}[/latex]
  • A single population proportion , we may use a normal distribution with the following mean and standard deviation. Proportions: [latex]\displaystyle\mu={p}{\quad\text{and}\quad}\sigma=\sqrt{{\frac{{{p}{q}}}{{n}}}}[/latex].
  • Distribution Needed for Hypothesis Testing. Provided by : OpenStax. Located at : . License : CC BY: Attribution
  • Introductory Statistics . Authored by : Barbara Illowski, Susan Dean. Provided by : Open Stax. Located at : http://cnx.org/contents/[email protected] . License : CC BY: Attribution . License Terms : Download for free at http://cnx.org/contents/[email protected]

An Introduction to Bayesian Thinking

Chapter 5 hypothesis testing with normal populations.

In Section 3.5 , we described how the Bayes factors can be used for hypothesis testing. Now we will use the Bayes factors to compare normal means, i.e., test whether the mean of a population is zero or compare the means of two groups of normally-distributed populations. We divide this mission into three cases: known variance for a single population, unknown variance for a single population using paired data, and unknown variance using two independent groups.

Also note that some of the examples in this section use an updated version of the bayes_inference function. If your local output is different from what is seen in this chapter, or the provided code fails to run for you please make sure that you have the most recent version of the package.

5.1 Bayes Factors for Testing a Normal Mean: variance known

Now we show how to obtain Bayes factors for testing hypothesis about a normal mean, where the variance is known . To start, let’s consider a random sample of observations from a normal population with mean \(\mu\) and pre-specified variance \(\sigma^2\) . We consider testing whether the population mean \(\mu\) is equal to \(m_0\) or not.

Therefore, we can formulate the data and hypotheses as below:

Data \[Y_1, \cdots, Y_n \mathrel{\mathop{\sim}\limits^{\rm iid}}\textsf{Normal}(\mu, \sigma^2)\]

  • \(H_1: \mu = m_0\)
  • \(H_2: \mu \neq m_0\)

We also need to specify priors for \(\mu\) under both hypotheses. Under \(H_1\) , we assume that \(\mu\) is exactly \(m_0\) , so this occurs with probability 1 under \(H_1\) . Now under \(H_2\) , \(\mu\) is unspecified, so we describe our prior uncertainty with the conjugate normal distribution centered at \(m_0\) and with a variance \(\sigma^2/\mathbf{n_0}\) . This is centered at the hypothesized value \(m_0\) , and it seems that the mean is equally likely to be larger or smaller than \(m_0\) , so a dividing factor \(n_0\) is given to the variance. The hyper parameter \(n_0\) controls the precision of the prior as before.

In mathematical terms, the priors are:

  • \(H_1: \mu = m_0 \text{ with probability 1}\)
  • \(H_2: \mu \sim \textsf{Normal}(m_0, \sigma^2/\mathbf{n_0})\)

Bayes Factor

Now the Bayes factor for comparing \(H_1\) to \(H_2\) is the ratio of the distribution, the data under the assumption that \(\mu = m_0\) to the distribution of the data under \(H_2\) .

\[\begin{aligned} \textit{BF}[H_1 : H_2] &= \frac{p(\text{data}\mid \mu = m_0, \sigma^2 )} {\int p(\text{data}\mid \mu, \sigma^2) p(\mu \mid m_0, \mathbf{n_0}, \sigma^2)\, d \mu} \\ \textit{BF}[H_1 : H_2] &=\left(\frac{n + \mathbf{n_0}}{\mathbf{n_0}} \right)^{1/2} \exp\left\{-\frac 1 2 \frac{n }{n + \mathbf{n_0}} Z^2 \right\} \\ Z &= \frac{(\bar{Y} - m_0)}{\sigma/\sqrt{n}} \end{aligned}\]

The term in the denominator requires integration to account for the uncertainty in \(\mu\) under \(H_2\) . And it can be shown that the Bayes factor is a function of the observed sampled size, the prior sample size \(n_0\) and a \(Z\) score.

Let’s explore how the hyperparameters in \(n_0\) influences the Bayes factor in Equation (5.1) . For illustration we will use the sample size of 100. Recall that for estimation, we interpreted \(n_0\) as a prior sample size and considered the limiting case where \(n_0\) goes to zero as a non-informative or reference prior.

\[\begin{equation} \textsf{BF}[H_1 : H_2] = \left(\frac{n + \mathbf{n_0}}{\mathbf{n_0}}\right)^{1/2} \exp\left\{-\frac{1}{2} \frac{n }{n + \mathbf{n_0}} Z^2 \right\} \tag{5.1} \end{equation}\]

Figure 5.1 shows the Bayes factor for comparing \(H_1\) to \(H_2\) on the y-axis as \(n_0\) changes on the x-axis. The different lines correspond to different values of the \(Z\) score or how many standard errors \(\bar{y}\) is from the hypothesized mean. As expected, larger values of the \(Z\) score favor \(H_2\) .

Vague prior for mu: n=100

Figure 5.1: Vague prior for mu: n=100

But as \(n_0\) becomes smaller and approaches 0, the first term in the Bayes factor goes to infinity, while the exponential term involving the data goes to a constant and is ignored. In the limit as \(n_0 \rightarrow 0\) under this noninformative prior, the Bayes factor paradoxically ends up favoring \(H_1\) regardless of the value of \(\bar{y}\) .

The takeaway from this is that we cannot use improper priors with \(n_0 = 0\) , if we are going to test our hypothesis that \(\mu = n_0\) . Similarly, vague priors that use a small value of \(n_0\) are not recommended due to the sensitivity of the results to the choice of an arbitrarily small value of \(n_0\) .

This problem arises with vague priors – the Bayes factor favors the null model \(H_1\) even when the data are far away from the value under the null – are known as the Bartlett’s paradox or the Jeffrey’s-Lindleys paradox.

Now, one way to understand the effect of prior is through the standard effect size

\[\delta = \frac{\mu - m_0}{\sigma}.\] The prior of the standard effect size is

\[\delta \mid H_2 \sim \textsf{Normal}(0, \frac{1}{\mathbf{n_0}})\]

This allows us to think about a standardized effect independent of the units of the problem. One default choice is using the unit information prior, where the prior sample size \(n_0\) is 1, leading to a standard normal for the standardized effect size. This is depicted with the blue normal density in Figure 5.2 . This suggested that we expect that the mean will be within \(\pm 1.96\) standard deviations of the hypothesized mean with probability 0.95 . (Note that we can say this only under a Bayesian setting.)

In many fields we expect that the effect will be small relative to \(\sigma\) . If we do not expect to see large effects, then we may want to use a more informative prior on the effect size as the density in orange with \(n_0 = 4\) . So they expected the mean to be within \(\pm 1/\sqrt{n_0}\) or five standard deviations of the prior mean.

Prior on standard effect size

Figure 5.2: Prior on standard effect size

Example 1.1 To illustrate, we give an example from parapsychological research. The case involved the test of the subject’s claim to affect a series of randomly generated 0’s and 1’s by means of extra sensory perception (ESP). The random sequence of 0’s and 1’s are generated by a machine with probability of generating 1 being 0.5. The subject claims that his ESP would make the sample mean differ significantly from 0.5.

Therefore, we are testing \(H_1: \mu = 0.5\) versus \(H_2: \mu \neq 0.5\) . Let’s use a prior that suggests we do not expect a large effect which leads the following solution for \(n_0\) . Assume we want a standard effect of 0.03, there is a 95% chance that it is between \((-0.03/\sigma, 0.03/\sigma)\) , with \(n_0 = (1.96\sigma/0.03)^2 = 32.7^2\) .

Figure 5.3 shows our informative prior in blue, while the unit information prior is in orange. On this scale, the unit information prior needs to be almost uniform for the range that we are interested.

Prior effect in the extra sensory perception test

Figure 5.3: Prior effect in the extra sensory perception test

A very large data set with over 104 million trials was collected to test this hypothesis, so we use a normal distribution to approximate the distribution the sample mean.

  • Sample size: \(n = 1.0449 \times 10^8\)
  • Sample mean: \(\bar{y} = 0.500177\) , standard deviation \(\sigma = 0.5\)
  • \(Z\) -score: 3.61

Now using our prior in the data, the Bayes factor for \(H_1\) to \(H_2\) was 0.46, implying evidence against the hypothesis \(H_1\) that \(\mu = 0.5\) .

  • Informative \(\textit{BF}[H_1:H_2] = 0.46\)
  • \(\textit{BF}[H_2:H_1] = 1/\textit{BF}[H_1:H_2] = 2.19\)

Now, this can be inverted to provide the evidence in favor of \(H_2\) . The evidence suggests that the hypothesis that the machine operates with a probability that is not 0.5, is 2.19 times more likely than the hypothesis the probability is 0.5. Based on the interpretation of Bayes factors from Table 3.5 , this is in the range of “not worth the bare mention”.

To recap, we present expressions for calculating Bayes factors for a normal model with a specified variance. We show that the improper reference priors for \(\mu\) when \(n_0 = 0\) , or vague priors where \(n_0\) is arbitrarily small, lead to Bayes factors that favor the null hypothesis regardless of the data, and thus should not be used for hypothesis testing.

Bayes factors with normal priors can be sensitive to the choice of the \(n_0\) . While the default value of \(n_0 = 1\) is reasonable in many cases, this may be too non-informative if one expects more effects. Wherever possible, think about how large an effect you expect and use that information to help select the \(n_0\) .

All the ESP examples suggest weak evidence and favored the machine generating random 0’s and 1’s with a probability that is different from 0.5. Note that ESP is not the only explanation – a deviation from 0.5 can also occur if the random number generator is biased. Bias in the stream of random numbers in our pseudorandom numbers has huge implications for numerous fields that depend on simulation. If the context had been about detecting a small bias in random numbers what prior would you use and how would it change the outcome? You can experiment it in R or other software packages that generate random Bernoulli trials.

Next, we will look at Bayes factors in normal models with unknown variances using the Cauchy prior so that results are less sensitive to the choice of \(n_0\) .

5.2 Comparing Two Paired Means using Bayes Factors

We previously learned that we can use a paired t-test to compare means from two paired samples. In this section, we will show how Bayes factors can be expressed as a function of the t-statistic for comparing the means and provide posterior probabilities of the hypothesis that whether the means are equal or different.

Example 5.1 Trace metals in drinking water affect the flavor, and unusually high concentrations can pose a health hazard. Ten pairs of data were taken measuring the zinc concentration in bottom and surface water at ten randomly sampled locations, as listed in Table 5.1 .

Water samples collected at the the same location, on the surface and the bottom, cannot be assumed to be independent of each other. However, it may be reasonable to assume that the differences in the concentration at the bottom and the surface in randomly sampled locations are independent of each other.

To start modeling, we will treat the ten differences as a random sample from a normal population where the parameter of interest is the difference between the average zinc concentration at the bottom and the average zinc concentration at the surface, or the main difference, \(\mu\) .

In mathematical terms, we have

  • Random sample of \(n= 10\) differences \(Y_1, \ldots, Y_n\)
  • Normal population with mean \(\mu \equiv \mu_B - \mu_S\)

In this case, we have no information about the variability in the data, and we will treat the variance, \(\sigma^2\) , as unknown.

The hypothesis of the main concentration at the surface and bottom are the same is equivalent to saying \(\mu = 0\) . The second hypothesis is that the difference between the mean bottom and surface concentrations, or equivalently that the mean difference \(\mu \neq 0\) .

In other words, we are going to compare the following hypotheses:

  • \(H_1: \mu_B = \mu_S \Leftrightarrow \mu = 0\)
  • \(H_2: \mu_B \neq \mu_S \Leftrightarrow \mu \neq 0\)

The Bayes factor is the ratio between the distributions of the data under each hypothesis, which does not depend on any unknown parameters.

\[\textit{BF}[H_1 : H_2] = \frac{p(\text{data}\mid H_1)} {p(\text{data}\mid H_2)}\]

To obtain the Bayes factor, we need to use integration over the prior distributions under each hypothesis to obtain those distributions of the data.

\[\textit{BF}[H_1 : H_2] = \iint p(\text{data}\mid \mu, \sigma^2) p(\mu \mid \sigma^2) p(\sigma^2 \mid H_2)\, d \mu \, d\sigma^2\]

This requires specifying the following priors:

  • \(\mu \mid \sigma^2, H_2 \sim \textsf{Normal}(0, \sigma^2/n_0)\)
  • \(p(\sigma^2) \propto 1/\sigma^2\) for both \(H_1\) and \(H_2\)

\(\mu\) is exactly zero under the hypothesis \(H_1\) . For \(\mu\) in \(H_2\) , we start with the same conjugate normal prior as we used in Section 5.1 – testing the normal mean with known variance. Since we assume that \(\sigma^2\) is known, we model \(\mu \mid \sigma^2\) instead of \(\mu\) itself.

The \(\sigma^2\) appears in both the numerator and denominator of the Bayes factor. For default or reference case, we use the Jeffreys prior (a.k.a. reference prior) on \(\sigma^2\) . As long as we have more than two observations, this (improper) prior will lead to a proper posterior.

After integration and rearranging, one can derive a simple expression for the Bayes factor:

\[\textit{BF}[H_1 : H_2] = \left(\frac{n + n_0}{n_0} \right)^{1/2} \left( \frac{ t^2 \frac{n_0}{n + n_0} + \nu } { t^2 + \nu} \right)^{\frac{\nu + 1}{2}}\]

This is a function of the t-statistic

\[t = \frac{|\bar{Y}|}{s/\sqrt{n}},\]

where \(s\) is the sample standard deviation and the degrees of freedom \(\nu = n-1\) (sample size minus one).

As we saw in the case of Bayes factors with known variance, we cannot use the improper prior on \(\mu\) because when \(n_0 \to 0\) , then \(\textit{BF}[H1:H_2] \to \infty\) favoring \(H_1\) regardless of the magnitude of the t-statistic. Arbitrary, vague small choices for \(n_0\) also lead to arbitrary large Bayes factors in favor of \(H_1\) . Another example of the Barlett’s or Jeffreys-Lindley paradox.

Sir Herald Jeffrey discovered another paradox testing using the conjugant normal prior, known as the information paradox . His thought experiment assumed that our sample size \(n\) and the prior sample size \(n_0\) . He then considered what would happen to the Bayes factor as the sample mean moved further and further away from the hypothesized mean, measured in terms standard errors with the t-statistic, i.e., \(|t| \to \infty\) . As the t-statistic or information about the mean moved further and further from zero, the Bayes factor goes to a constant depending on \(n, n_0\) rather than providing overwhelming support for \(H_2\) .

The bounded Bayes factor is

\[\textit{BF}[H_1 : H_2] \to \left( \frac{n_0}{n_0 + n} \right)^{\frac{n - 1}{2}}\]

Jeffrey wanted a prior with \(\textit{BF}[H_1 : H_2] \to 0\) (or equivalently, \(\textit{BF}[H_2 : H_1] \to \infty\) ), as the information from the t-statistic grows, indicating the sample mean is as far as from the hypothesized mean and should favor \(H_2\) .

To resolve the paradox when the information the t-statistic favors \(H_2\) but the Bayes factor does not, Jeffreys showed that no normal prior could resolve the paradox .

But a Cauchy prior on \(\mu\) , would resolve it. In this way, \(\textit{BF}[H_2 : H_1]\) goes to infinity as the sample mean becomes further away from the hypothesized mean. Recall that the Cauchy prior is written as \(\textsf{C}(0, r^2 \sigma^2)\) . While Jeffreys used a default of \(r = 1\) , smaller values of \(r\) can be used if smaller effects are expected.

The combination of the Jeffrey’s prior on \(\sigma^2\) and this Cauchy prior on \(\mu\) under \(H_2\) is sometimes referred to as the Jeffrey-Zellener-Siow prior .

However, there is no closed form expressions for the Bayes factor under the Cauchy distribution. To obtain the Bayes factor, we must use the numerical integration or simulation methods.

We will use the function from the package to test whether the mean difference is zero in Example 5.1 (zinc), using the JZS (Jeffreys-Zellener-Siow) prior.

hypothesis testing for a mean using the normal distribution

With equal prior probabilities on the two hypothesis, the Bayes factor is the posterior odds. From the output, we see this indicates that the hypothesis \(H_2\) , the mean difference is different from 0, is almost 51 times more likely than the hypothesis \(H_1\) that the average concentration is the same at the surface and the bottom.

To sum up, we have used the Cauchy prior as a default prior testing hypothesis about a normal mean when variances are unknown. This does require numerical integration, but it is available in the function from the package. If you expect that the effect sizes will be small, smaller values of \(r\) are recommended.

It is often important to quantify the magnitude of the difference in addition to testing. The Cauchy Prior provides a default prior for both testing and inference; it avoids problems that arise with choosing a value of \(n_0\) (prior sample size) in both cases. In the next section, we will illustrate using the Cauchy prior for comparing two means from independent normal samples.

5.3 Comparing Independent Means: Hypothesis Testing

In the previous section, we described Bayes factors for testing whether the mean difference of paired samples was zero. In this section, we will consider a slightly different problem – we have two independent samples, and we would like to test the hypothesis that the means are different or equal.

Example 5.2 We illustrate the testing of independent groups with data from a 2004 survey of birth records from North Carolina, which are available in the package.

The variable of interest is – the weight gain of mothers during pregnancy. We have two groups defined by the categorical variable, , with levels, younger mom and older mom.

Question of interest : Do the data provide convincing evidence of a difference between the average weight gain of older moms and the average weight gain of younger moms?

We will view the data as a random sample from two populations, older and younger moms. The two groups are modeled as:

\[\begin{equation} \begin{aligned} Y_{O,i} & \mathrel{\mathop{\sim}\limits^{\rm iid}} \textsf{N}(\mu + \alpha/2, \sigma^2) \\ Y_{Y,i} & \mathrel{\mathop{\sim}\limits^{\rm iid}} \textsf{N}(\mu - \alpha/2, \sigma^2) \end{aligned} \tag{5.2} \end{equation}\]

The model for weight gain for older moms using the subscript \(O\) , and it assumes that the observations are independent and identically distributed, with a mean \(\mu+\alpha/2\) and variance \(\sigma^2\) .

For the younger women, the observations with the subscript \(Y\) are independent and identically distributed with a mean \(\mu-\alpha/2\) and variance \(\sigma^2\) .

Using this representation of the means in the two groups, the difference in means simplifies to \(\alpha\) – the parameter of interest.

\[(\mu + \alpha/2) - (\mu - \alpha/2) = \alpha\]

You may ask, “Why don’t we set the average weight gain of older women to \(\mu+\alpha\) , and the average weight gain of younger women to \(\mu\) ?” We need the parameter \(\alpha\) to be present in both \(Y_{O,i}\) (the group of older women) and \(Y_{Y,i}\) (the group of younger women).

We have the following competing hypotheses:

  • \(H_1: \alpha = 0 \Leftrightarrow\) The means are not different.
  • \(H_2: \alpha \neq 0 \Leftrightarrow\) The means are different.

In this representation, \(\mu\) represents the overall weight gain for all women. (Does the model in Equation (5.2) make more sense now?) To test the hypothesis, we need to specify prior distributions for \(\alpha\) under \(H_2\) (c.f. \(\alpha = 0\) under \(H_1\) ) and priors for \(\mu,\sigma^2\) under both hypotheses.

Recall that the Bayes factor is the ratio of the distribution of the data under the two hypotheses.

\[\begin{aligned} \textit{BF}[H_1 : H_2] &= \frac{p(\text{data}\mid H_1)} {p(\text{data}\mid H_2)} \\ &= \frac{\iint p(\text{data}\mid \alpha = 0,\mu, \sigma^2 )p(\mu, \sigma^2 \mid H_1) \, d\mu \,d\sigma^2} {\int \iint p(\text{data}\mid \alpha, \mu, \sigma^2) p(\alpha \mid \sigma^2) p(\mu, \sigma^2 \mid H_2) \, d \mu \, d\sigma^2 \, d \alpha} \end{aligned}\]

As before, we need to average over uncertainty and the parameters to obtain the unconditional distribution of the data. Now, as in the test about a single mean, we cannot use improper or non-informative priors for \(\alpha\) for testing.

Under \(H_2\) , we use the Cauchy prior for \(\alpha\) , or equivalently, the Cauchy prior on the standardized effect \(\delta\) with the scale of \(r\) :

\[\delta = \alpha/\sigma^2 \sim \textsf{C}(0, r^2)\]

Now, under both \(H_1\) and \(H_2\) , we use the Jeffrey’s reference prior on \(\mu\) and \(\sigma^2\) :

\[p(\mu, \sigma^2) \propto 1/\sigma^2\]

While this is an improper prior on \(\mu\) , this does not suffer from the Bartlett’s-Lindley’s-Jeffreys’ paradox as \(\mu\) is a common parameter in the model in \(H_1\) and \(H_2\) . This is another example of the Jeffreys-Zellner-Siow prior.

As in the single mean case, we will need numerical algorithms to obtain the Bayes factor. Now the following output illustrates testing of Bayes factors, using the Bayes inference function from the package.

hypothesis testing for a mean using the normal distribution

We see that the Bayes factor for \(H_1\) to \(H_2\) is about 5.7, with positive support for \(H_1\) that there is no difference in average weight gain between younger and older women. Using equal prior probabilities, the probability that there is a difference in average weight gain between the two groups is about 0.15 given the data. Based on the interpretation of Bayes factors from Table 3.5 , this is in the range of “positive” (between 3 and 20).

To recap, we have illustrated testing hypotheses about population means with two independent samples, using a Cauchy prior on the difference in the means. One assumption that we have made is that the variances are equal in both groups . The case where the variances are unequal is referred to as the Behren-Fisher problem, and this is beyond the scope for this course. In the next section, we will look at another example to put everything together with testing and discuss summarizing results.

5.4 Inference after Testing

In this section, we will work through another example for comparing two means using both hypothesis tests and interval estimates, with an informative prior. We will also illustrate how to adjust the credible interval after testing.

Example 5.3 We will use the North Carolina survey data to examine the relationship between infant birth weight and whether the mother smoked during pregnancy. The response variable, , is the birth weight of the baby in pounds. The categorical variable provides the status of the mother as a smoker or non-smoker.

We would like to answer two questions:

Is there a difference in average birth weight between the two groups?

If there is a difference, how large is the effect?

As before, we need to specify models for the data and priors. We treat the data as a random sample for the two populations, smokers and non-smokers.

The birth weights of babies born to non-smokers, designated by a subgroup \(N\) , are assumed to be independent and identically distributed from a normal distribution with mean \(\mu + \alpha/2\) , as in Section 5.3 .

\[Y_{N,i} \mathrel{\mathop{\sim}\limits^{\rm iid}}\textsf{Normal}(\mu + \alpha/2, \sigma^2)\]

While the birth weights of the babies born to smokers, designated by the subgroup \(S\) , are also assumed to have a normal distribution, but with mean \(\mu - \alpha/2\) .

\[Y_{S,i} \mathrel{\mathop{\sim}\limits^{\rm iid}}\textsf{Normal}(\mu - \alpha/2, \sigma^2)\]

The difference in the average birth weights is the parameter \(\alpha\) , because

\[(\mu + \alpha/2) - (\mu - \alpha/2) = \alpha\] .

The hypotheses that we will test are \(H_1: \alpha = 0\) versus \(H_2: \alpha \ne 0\) .

We will still use the Jeffreys-Zellner-Siow Cauchy prior. However, since we may expect the standardized effect size to not be as strong, we will use a scale of \(r = 0.5\) rather than 1.

Therefore, under \(H_2\) , we have \[\delta = \alpha/\sigma \sim \textsf{C}(0, r^2), \text{ with } r = 0.5.\]

Under both \(H_1\) and \(H_2\) , we will use the reference priors on \(\mu\) and \(\sigma^2\) :

\[\begin{aligned} p(\mu) &\propto 1 \\ p(\sigma^2) &\propto 1/\sigma^2 \end{aligned}\]

The input to the base inference function is similar, but now we will specify that \(r = 0.5\) .

hypothesis testing for a mean using the normal distribution

We see that the Bayes factor is 1.44, which weakly favors there being a difference in average birth weights for babies whose mothers are smokers versus mothers who did not smoke. Converting this to a probability, we find that there is about a 60% chance of the average birth weights are different.

While looking at evidence of there being a difference is useful, Bayes factors and posterior probabilities do not convey any information about the magnitude of the effect. Reporting a credible interval or the complete posterior distribution is more relevant for quantifying the magnitude of the effect.

Using the function, we can generate samples from the posterior distribution under \(H_2\) using the option.

The 2.5 and 97.5 percentiles for the difference in the means provide a 95% credible interval of 0.023 to 0.57 pounds for the difference in average birth weight. The MCMC output shows not only summaries about the difference in the mean \(\alpha\) , but the other parameters in the model.

In particular, the Cauchy prior arises by placing a gamma prior on \(n_0\) and the conjugate normal prior. This provides quantiles about \(n_0\) after updating with the current data.

The row labeled effect size is the standardized effect size \(\delta\) , indicating that the effects are indeed small relative to the noise in the data.

Estimates of effect under H2

Figure 5.4: Estimates of effect under H2

Figure 5.4 shows the posterior density for the difference in means, with the 95% credible interval indicated by the shaded area. Under \(H_2\) , there is a 95% chance that the average birth weight of babies born to non-smokers is 0.023 to 0.57 pounds higher than that of babies born to smokers.

The previous statement assumes that \(H_2\) is true and is a conditional probability statement. In mathematical terms, the statement is equivalent to

\[P(0.023 < \alpha < 0.57 \mid \text{data}, H_2) = 0.95\]

However, we still have quite a bit of uncertainty based on the current data, because given the data, the probability of \(H_2\) being true is 0.59.

\[P(H_2 \mid \text{data}) = 0.59\]

Using the law of total probability, we can compute the probability that \(\mu\) is between 0.023 and 0.57 as below:

\[\begin{aligned} & P(0.023 < \alpha < 0.57 \mid \text{data}) \\ = & P(0.023 < \alpha < 0.57 \mid \text{data}, H_1)P(H_1 \mid \text{data}) + P(0.023 < \alpha < 0.57 \mid \text{data}, H_2)P(H_2 \mid \text{data}) \\ = & I( 0 \text{ in CI }) P(H_1 \mid \text{data}) + 0.95 \times P(H_2 \mid \text{data}) \\ = & 0 \times 0.41 + 0.95 \times 0.59 = 0.5605 \end{aligned}\]

Finally, we get that the probability that \(\alpha\) is in the interval, given the data, averaging over both hypotheses, is roughly 0.56. The unconditional statement is the average birth weight of babies born to nonsmokers is 0.023 to 0.57 pounds higher than that of babies born to smokers with probability 0.56. This adjustment addresses the posterior uncertainty and how likely \(H_2\) is.

To recap, we have illustrated testing, followed by reporting credible intervals, and using a Cauchy prior distribution that assumed smaller standardized effects. After testing, it is common to report credible intervals conditional on \(H_2\) . We also have shown how to adjust the probability of the interval to reflect our posterior uncertainty about \(H_2\) . In the next chapter, we will turn to regression models to incorporate continuous explanatory variables.

COMMENTS

  1. 9.4: Distribution Needed for Hypothesis Testing

    If you are testing a single population mean, the distribution for the test is for means: X¯ − N(μx, σx n−−√) (9.4.1) or. tdf (9.4.2) The population parameter is μ. The estimated value (point estimate) for μ is x¯, the sample mean. If you are testing a single population proportion, the distribution for the test is for proportions ...

  2. 5.3.2 Normal Hypothesis Testing

    What steps should I follow when carrying out a hypothesis test for the mean of a normal distribution? Following these steps will help when carrying out a hypothesis test for the mean of a normal distribution: Step 1. Define the distribution of the population mean usually Step 2. Write the null and alternative hypotheses clearly using the form

  3. 9.3 Probability Distribution Needed for Hypothesis Testing

    Assumptions. When you perform a hypothesis test of a single population mean μ using a normal distribution (often called a z-test), you take a simple random sample from the population. The population you are testing is normally distributed, or your sample size is sufficiently large.You know the value of the population standard deviation, which, in reality, is rarely known.

  4. Hypothesis tests about the mean

    We test the null hypothesis that the mean is equal to a specific value : The test statistic. To construct a test statistic, we use the sample mean. The test statistic, called z-statistic, is. A test of hypothesis based on it is called z-test. We prove below that has a normal distribution with zero mean and unit variance.

  5. Normal Distribution Hypothesis Tests

    When to do a Normal Hypothesis Test. There are two types of hypothesis tests you need to know about: binomial distribution hypothesis tests and normal distribution hypothesis tests.In binomial hypothesis tests, you are testing the probability parameter p.In normal hypothesis tests, you are testing the mean parameter \mu.This gives us a key difference that we can use to determine what test to ...

  6. Distribution Needed for Hypothesis Testing

    We perform tests of a population proportion using a normal distribution (usually n is large or the sample size is large). If you are testing a single population mean, the distribution for the test is for means: ¯¯¯¯¯X~N (μX , σX √n) or tdf X ¯ ~ N ( μ X , σ X n) or t d f. The population parameter is μ μ. The estimated value (point ...

  7. 9.3 Distribution Needed for Hypothesis Testing

    Earlier in the course, we discussed sampling distributions. Particular distributions are associated with hypothesis testing. Perform tests of a population mean using a normal distribution or a Student's t-distribution. (Remember, use a Student's t-distribution when the population standard deviation is unknown and the distribution of the sample mean is approximately normal.)

  8. Hypothesis Testing for the Mean

    Two-sided Tests for the Mean: Here, we are given a random sample X1 X 1, X2 X 2 ,..., Xn X n from a distribution. Let μ = EXi μ = E X i. Our goal is to decide between. H0 H 0: μ = μ0 μ = μ 0, H1 H 1: μ ≠ μ0 μ ≠ μ 0 . Example 8.22, which we saw previously is an instance of this case.

  9. 8.1 A Single Population Mean Using the Normal Distribution

    invNorm (0.975, 0, 1) = 1.96. In this command, the value 0.975 is the total area to the left of the critical value that we are looking to calculate. The parameters 0 and 1 are the mean value and the standard deviation of the standard normal distribution Z. Note. Remember to use the area to the LEFT of zα 2 z α 2.

  10. Hypothesis Test for a Mean

    The first set of hypotheses (Set 1) is an example of a two-tailed test, since an extreme value on either side of the sampling distribution would cause a researcher to reject the null hypothesis. The other two sets of hypotheses (Sets 2 and 3) are one-tailed tests, since an extreme value on only one side of the sampling distribution would cause a researcher to reject the null hypothesis.

  11. Normal Distribution Hypothesis Test: Explanation & Example

    Hypothesis Test for Normal Distribution - Key takeaways. When we hypothesis test for a normal distribution we are trying to see if the mean is different from the mean stated in the null hypothesis. We use the sample mean which is \(\bar{X} \sim N(\mu, \frac{\sigma^2}{n})\).

  12. Data analysis: hypothesis testing: 4.1 The normal distribution

    In graph form, a normal distribution appears as a bell curve. The values in the x-axis of the normal distribution graph represent the z-scores. The test statistic that you wish to use to test the set of hypotheses is the z-score. A z-score is used to measure how far the observation (sample mean) is from the 0 value of the bell curve (population ...

  13. Normal Distributions versus T-Distributions

    We perform tests of a population proportion using a normal distribution (usually n is large or the sample size is large). If you are testing a single population mean, the distribution for the test is for means: ¯¯¯¯¯X~N (μX , σX √n) or tdf X ¯ ~ N ( μ X , σ X n) or t d f. The population parameter is μ μ. The estimated value (point ...

  14. Introduction to Hypothesis testing for Normal distribution ...

    Introduction to Hypothesis testing for Normal distributionIn this tutorial, we learn how to conduct a hypothesis test for normal distribution using p values ...

  15. Summary: Distribution Needed for Hypothesis Testing

    When doing a hypothesis test for a mean, a normal distribution is used when the sample size is large and the population standard deviation is known. This is called a z -test for a mean. When doing a hypothesis test for a proportion, a normal distribution is used. However, np n p and nq n q must be both greater than 5.

  16. Distribution Needed for Hypothesis Testing

    Earlier in the course, we discussed sampling distributions. Particular distributions are associated with hypothesis testing. Perform tests of a population mean using a normal distribution or a Student's t-distribution. (Remember, use a Student's t-distribution when the population standard deviation is unknown and the distribution of the sample mean is approximately normal.)

  17. Chapter 5 Hypothesis Testing with Normal Populations

    5.1 Bayes Factors for Testing a Normal Mean: variance known. Now we show how to obtain Bayes factors for testing hypothesis about a normal mean, where the variance is known.To start, let's consider a random sample of observations from a normal population with mean \(\mu\) and pre-specified variance \(\sigma^2\).We consider testing whether the population mean \(\mu\) is equal to \(m_0\) or not.