Two-Sample Unpaired t Tests in Medical Research

Affiliations.

  • 1 From the Department of Anesthesiology, Amsterdam UMC, Vrije Universiteit Amsterdam, Amsterdam, the Netherlands.
  • 2 Department of Surgery and Perioperative Care, Dell Medical School at the University of Texas at Austin, Austin, Texas.
  • PMID: 31584913
  • DOI: 10.1213/ANE.0000000000004373

Publication types

  • Anesthesia, General
  • Biomedical Research*
  • Obesity, Morbid*

The two-sample t test: pre-testing its assumptions does not pay off

  • Regular Article
  • Published: 28 April 2009
  • Volume 52, pages 219–231 (2011)


  • Dieter Rasch 1 ,
  • Klaus D. Kubinger 2 &
  • Karl Moder 1  


Traditionally, when applying the two-sample t test, some pre-testing occurs. That is, the theory-based assumptions of normal distributions as well as of homogeneity of the variances are often tested in applied sciences in advance of the tried-for t test. But this paper shows that such pre-testing leads to unknown final type-I- and type-II-risks if the respective statistical tests are performed using the same set of observations. In order to get an impression of the extent of the resulting misinterpreted risks, some theoretical deductions are given and, in particular, a systematic simulation study is done. As a result, we propose that it is preferable to apply no pre-tests for the t test and no t test at all, but instead to use the Welch-test as a standard test: its power comes close to that of the t test when the variances are homogeneous, and for unequal variances and skewness values |γ₁| < 3, it keeps the so-called 20% robustness whereas the t test as well as Wilcoxon’s U test cannot be recommended for most cases.



Author information

Authors and affiliations.

Department of Landscape, Spatial and Infrastructure Sciences, Institute of Applied Statistics and Computing, University of Natural Resources and Applied Life Sciences, Vienna, Austria

Dieter Rasch & Karl Moder

Division of Psychological Assessment and Applied Psychometrics, Faculty of Psychology, University of Vienna, Vienna, Austria

Klaus D. Kubinger


Corresponding author

Correspondence to Klaus D. Kubinger.


About this article

Rasch, D., Kubinger, K.D. & Moder, K. The two-sample t test: pre-testing its assumptions does not pay off. Stat Papers 52, 219–231 (2011). https://doi.org/10.1007/s00362-009-0224-x


Received: 13 October 2008

Revised: 30 March 2009

Published: 28 April 2009

Issue Date: February 2011

DOI: https://doi.org/10.1007/s00362-009-0224-x


  • Two-sample t test
  • Wilcoxon- U test

Statology

Statistics Made Easy

How to Report T-Test Results (With Examples)

We can use the following general format to report the results of a one sample t-test :

A one sample t-test was performed to compare [variable of interest] against the population mean.   The mean value of [variable of interest] (M = [Mean], SD = [standard deviation]) was significantly [higher, lower, or different] than the population mean; t(df) = [t-value], p = [p-value].

We can use the following format to report the results of an independent two samples t-test :

A two sample t-test was performed to compare [response variable of interest] in [group 1] and [group 2].   There [was or was not] a significant difference in [response variable of interest] between [group1] (M = [Mean], SD = [standard deviation]) and [group2] (M = [Mean], SD = [standard deviation]); t(df) = [t-value], p = [p-value].

We can use the following format to report the results of a paired samples t-test :

A paired samples t-test was performed to compare [response variable of interest] in [group 1] and [group 2].   There [was or was not] a significant difference in [response variable of interest] between [group1] (M = [Mean], SD = [standard deviation]) and [group2] (M = [Mean], SD = [standard deviation]); t(df) = [t-value], p = [p-value].

Note: The “M” in the results stands for sample mean, the “SD” stands for sample standard deviation, and “df” stands for degrees of freedom associated with the t-test statistic.
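As a rough illustration, the two-sample template above can be filled in programmatically. The sketch below is only one way to do it: the function names `format_p` and `report_two_sample` are hypothetical, the wording follows the worked example for the fuel-treatment data, and the 0.05 significance threshold is an assumption rather than part of the template.

```python
def format_p(p: float) -> str:
    """Format a p-value without the leading zero, e.g. 0.167 -> '= .167'."""
    if p < 0.001:
        return "< .001"
    return ("= " + f"{p:.3f}").replace("0.", ".", 1)


def report_two_sample(variable, name1, m1, sd1, name2, m2, sd2, t, df, p, alpha=0.05):
    """Fill in the two-sample reporting template (alpha = 0.05 is an assumption)."""
    verdict = "was" if p < alpha else "was not"
    return (
        f"A two sample t-test was performed to compare {variable} "
        f"between {name1} and {name2}. "
        f"There {verdict} a significant difference in {variable} "
        f"between {name1} (M = {m1:g}, SD = {sd1:g}) "
        f"and {name2} (M = {m2:g}, SD = {sd2:g}); "
        f"t({df:g}) = {t:.3f}, p {format_p(p)}."
    )


# Values taken from the fuel-treatment example reported in this article.
report = report_two_sample(
    "miles per gallon",
    "fuel treatment", 22.75, 3.25,
    "no fuel treatment", 21, 2.73,
    t=-1.428, df=22, p=0.167,
)
print(report)
```

The same pattern could be adapted to the one sample and paired templates by changing the sentence text.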

The following examples show how to report the results of each type of t-test in practice.

Example: Reporting Results of a One Sample T-Test

A botanist wants to know if the mean height of a certain species of plant is equal to 15 inches. She collects a random sample of 12 plants and performs a one sample t-test.

The following screenshot shows the results of the test:

[Screenshot: output of the one sample t-test]

Here’s how to report the results of the test:

A one sample t-test was performed to compare the mean height of a certain species of plant against the population mean.   The mean value of height (M = 14.33, SD = 1.37) was not significantly different than the population mean; t(11) = -1.685, p = .120.

Example: Reporting Results of an Independent Samples T-Test

Researchers want to know if a new fuel treatment leads to a change in the average miles per gallon of a certain car. To test this, they conduct an experiment in which 12 cars receive the new fuel treatment and 12 cars do not.

The following screenshot shows the results of the independent samples t-test:

[Screenshot: output of two sample t-test in SPSS]

Here’s how to report the results of the test:

A two sample t-test was performed to compare miles per gallon between fuel treatment and no fuel treatment.   There was not a significant difference in miles per gallon between fuel treatment (M = 22.75, SD = 3.25) and no fuel treatment (M = 21, SD = 2.73); t(22) = -1.428, p = .167.
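The reported figures can be checked approximately from the summary statistics alone. The sketch below assumes the equal-variance (pooled) form of the test and uses only the Python standard library; the helper names are hypothetical, the p-value is obtained by numerically integrating the t density with Simpson's rule rather than by a statistics package, and the group order (no treatment minus treatment) is an assumption chosen to match the sign of the reported t-value.

```python
import math


def t_pdf(x: float, df: float) -> float:
    """Density of Student's t-distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))


def two_sided_p(t: float, df: float, steps: int = 2000) -> float:
    """Two-sided p-value via Simpson's rule on [0, |t|] (steps must be even)."""
    b = abs(t)
    if b == 0:
        return 1.0
    h = b / steps
    total = t_pdf(0.0, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    half_central_area = total * h / 3  # P(0 < T < |t|)
    return max(0.0, 1.0 - 2.0 * half_central_area)


def student_t_from_summary(m1, sd1, n1, m2, sd2, n2):
    """Pooled-variance t statistic and df from means, SDs and group sizes."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (m1 - m2) / se, df


# Summary statistics from the example: no fuel treatment vs. fuel treatment.
t_stat, df = student_t_from_summary(21, 2.73, 12, 22.75, 3.25, 12)
p_value = two_sided_p(t_stat, df)
print(f"t({df}) = {t_stat:.3f}, p = {p_value:.3f}")
```

Because the means and standard deviations are rounded, the result should be close to, but not necessarily identical with, the software output.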

Example: Reporting Results of a Paired Samples T-Test

Researchers want to know if a new fuel treatment leads to a change in the average mpg of a certain car. To test this, they conduct an experiment in which they measure the mpg of 12 cars with and without the fuel treatment.

The following screenshot shows the results of the paired samples t-test:

[Screenshot: output of paired samples t-test in SPSS]

Here’s how to report the results of the test:

A paired samples t-test was performed to compare miles per gallon between fuel treatment and no fuel treatment.   There was a significant difference in miles per gallon between fuel treatment (M = 22.75, SD = 3.25) and no fuel treatment (M = 21, SD = 2.73); t(11) = -2.244, p = .046.
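A paired samples t-test works on the within-pair differences, which is why its degrees of freedom are n − 1 (11 for 12 cars) rather than n1 + n2 − 2. The raw measurements behind this example are not given, so the minimal sketch below uses made-up numbers for five hypothetical cars; the function name `paired_t` is also an assumption.

```python
import math
from statistics import mean, stdev


def paired_t(x, y):
    """Paired t statistic and df, computed from the pairwise differences x - y."""
    if len(x) != len(y):
        raise ValueError("paired samples must have the same length")
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / math.sqrt(n))
    return t, n - 1


# Hypothetical mpg values for the same five cars (not the article's data).
without_treatment = [20.0, 22.0, 19.0, 24.0, 21.0]
with_treatment = [21.0, 24.0, 20.0, 25.0, 23.0]

t_stat, df = paired_t(with_treatment, without_treatment)
print(f"t({df}) = {t_stat:.3f}")
```

The p-value would then be looked up from a t-distribution with `df` degrees of freedom.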

Additional Resources

Use the following calculators to automatically perform various t-tests:

  • One Sample t-test Calculator
  • Two Sample t-test Calculator
  • Paired Samples t-test Calculator


Hey there. My name is Zach Bobbitt. I have a Masters of Science degree in Applied Statistics and I’ve worked on machine learning algorithms for professional businesses in both healthcare and retail. I’m passionate about statistics, machine learning, and data visualization and I created Statology to be a resource for both students and teachers alike.  My goal with this site is to help you learn statistics through using simple terms, plenty of real-world examples, and helpful illustrations.



Statistical notes for clinical researchers: the independent samples t-test

Hae-Young Kim

Department of Health Policy and Management, College of Health Science, and Department of Public Health Science, Graduate School, Korea University, Seoul, Korea.

The t-test is frequently used to compare 2 group means. The compared groups may be independent of each other, such as men and women. Alternatively, the compared data may be correlated, as in a comparison of blood pressure levels from the same person before and after medication (Figure 1). In this section we focus on the independent t-test only. There are 2 kinds of independent t-test, depending on whether the 2 group variances can be assumed equal. The t-test is based on inference using the t-distribution.

[Figure 1]

T-DISTRIBUTION

The t-distribution was introduced in 1908 by William Sealy Gosset, who was working for the Guinness brewery in Dublin, Ireland. Because the Guinness brewery did not permit its employees to publish research results related to their work, Gosset published his findings under a pseudonym, “Student.” The distribution he proposed therefore came to be called Student's t-distribution. The t-distribution is similar to the standard normal distribution (z-distribution) but has a lower peak and heavier tails (Figure 2).

[Figure 2]

According to sampling theory, when samples are drawn from a normally distributed population, the distribution of sample means is also normal. When we know the population variance, σ², we can define the distribution of sample means as a normal distribution and adopt the z-distribution in statistical inference. In reality, however, we almost never know σ², so we use the sample variance, s², instead. Although s² is the best estimator for σ², its accuracy depends on the sample size. When the sample size is large enough (e.g., n = 300), we expect the sample variance to be very close to the population variance. When the sample size is small, such as n = 10, the sample variance may be considerably less accurate. The t-distribution reflects this difference in uncertainty according to sample size. Therefore, the shape of the t-distribution changes with the degrees of freedom (df), which equal the sample size minus one (n − 1) when one sample mean is tested.

The t-distribution is a family of distributions whose shape varies according to the df (Figure 2). When the df is smaller, the t-distribution has a lower peak and heavier tails than those with higher df. The shape of the t-distribution approaches that of the z-distribution as the df increases. When the df is large enough, e.g., n = 300, the t-distribution is almost identical to the z-distribution. For inference about means using small samples, it is necessary to apply the t-distribution, whereas similar inferences can be obtained from either the t-distribution or the z-distribution with a large sample. For inference about 2 means, we generally use the t-test based on the t-distribution regardless of sample size, because it is appropriate for tests with small df as well as for those with large df.
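The convergence of the t-distribution toward the z-distribution can be checked numerically. The following sketch, using only the Python standard library, compares the two-sided tail probability beyond 1.96 for a few df values with the corresponding normal tail. The df values (9, 30, 299) and the cutoff 1.96 are illustrative choices, the helper names are hypothetical, and the t tail area is approximated by Simpson's rule rather than taken from a statistics package.

```python
import math
from statistics import NormalDist


def t_pdf(x: float, df: float) -> float:
    """Density of Student's t-distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))


def t_two_sided_tail(x: float, df: float, steps: int = 2000) -> float:
    """P(|T| > x) for x > 0, via Simpson's rule on [0, x] (steps must be even)."""
    h = x / steps
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 1.0 - 2.0 * (total * h / 3)


cutoff = 1.96
z_tail = 2 * (1 - NormalDist().cdf(cutoff))
tails = {df: t_two_sided_tail(cutoff, df) for df in (9, 30, 299)}

print(f"normal  : {z_tail:.4f}")
for df, tail in tails.items():
    print(f"t, df={df:>3}: {tail:.4f}")
```

The tail area should shrink toward the normal value as df grows, which is the heavier-tails behavior described above.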

INDEPENDENT SAMPLES T-TEST

To adopt the z- or t-distribution for inference using small samples, a basic assumption is that the population distribution is not significantly different from a normal distribution. As seen in Appendix 1, the normality assumption needs to be tested in advance. If the normality assumption cannot be met and we have a small sample (n < 25), then the parametric t-test should not be used. Instead, a nonparametric analysis such as the Mann-Whitney U test should be selected.

For comparison of 2 independent group means, we can use a z-statistic to test the hypothesis of equal population means only if we know the population variances of the 2 groups, σ₁² and σ₂², as follows:

$$z = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}} \qquad \text{(Eq. 1)}$$

where X̄₁ and X̄₂, σ₁² and σ₂², and n₁ and n₂ are the sample means, population variances, and sample sizes of the 2 groups.
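For illustration only, the z-statistic for two independent means with known population variances can be computed as in the sketch below. The function name `two_sample_z` and all input numbers are hypothetical; in practice the population variances are rarely known, which is the reason the t-test is used instead.

```python
import math
from statistics import NormalDist


def two_sample_z(mean1, mean2, var1, var2, n1, n2):
    """z statistic and two-sided p-value, assuming KNOWN population variances."""
    z = (mean1 - mean2) / math.sqrt(var1 / n1 + var2 / n2)
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p


# Hypothetical inputs: sample means, known population variances, group sizes.
z, p = two_sample_z(mean1=10.0, mean2=12.0, var1=4.0, var2=16.0, n1=10, n2=20)
print(f"z = {z:.3f}, p = {p:.3f}")
```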

Again, because we never know the population variances, we need to use the sample variances as their estimates. There are 2 methods, depending on whether the 2 population variances can be assumed equal. Under the assumption of equal variances, the t-test devised by Gosset in 1908, Student's t-test, can be applied. The other version is Welch's t-test, introduced in 1947, for cases where the assumption of equal variances cannot be accepted because a large difference is observed between the 2 sample variances.

1. Student's t-test

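In Student's t-test, the population variances are assumed equal. Therefore, we need only one common variance estimate for the 2 groups. The common variance estimate is calculated as a pooled variance, a weighted average of the 2 sample variances, as follows:

$$s_p^2 = \frac{(n_1 - 1)\,s_1^2 + (n_2 - 1)\,s_2^2}{n_1 + n_2 - 2} \qquad \text{(Eq. 2)}$$

where s₁² and s₂² are the sample variances.

The resulting t-test statistic has the same form as the z-statistic, with both population variances, σ₁² and σ₂², replaced by the common variance estimate, s_p². The df of the t-test statistic is n₁ + n₂ − 2.

$$t = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\dfrac{s_p^2}{n_1} + \dfrac{s_p^2}{n_2}}} = \frac{\bar{X}_1 - \bar{X}_2}{s_p \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}} \qquad \text{(Eq. 3)}$$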
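A minimal sketch of Student's t-test on raw data follows, computing the pooled variance and the t statistic with n1 + n2 − 2 degrees of freedom. The function name `student_t` and the two small data sets are made up for illustration and are not the data analyzed in Appendix 1.

```python
import math
from statistics import mean, variance


def student_t(x, y):
    """Student's t statistic, df and pooled variance for two independent samples."""
    n1, n2 = len(x), len(y)
    df = n1 + n2 - 2
    # Pooled variance: sample variances weighted by their degrees of freedom.
    pooled_var = ((n1 - 1) * variance(x) + (n2 - 1) * variance(y)) / df
    t = (mean(x) - mean(y)) / math.sqrt(pooled_var / n1 + pooled_var / n2)
    return t, df, pooled_var


# Hypothetical measurements for two independent groups.
group1 = [4.0, 5.0, 6.0, 5.0]
group2 = [7.0, 8.0, 9.0, 8.0]

t_stat, df, pooled_var = student_t(group1, group2)
print(f"pooled variance = {pooled_var:.4f}, t({df}) = {t_stat:.3f}")
```

When the two groups have the same size, the pooled variance is simply the average of the two sample variances.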

In Appendix 1, ‘(E-1) Levene's test for equality of variances’ shows that the null hypothesis of equal variances was not rejected, given the high p value of 0.334 (under the heading Sig.). In ‘(E-2) t-test for equality of means t-values’, the upper line shows the result of Student's t-test. The t-value and df are shown as −3.357 and 18. We can obtain the same figures using Eq. 2 and Eq. 3 and the descriptive statistics in Table 1.

The calculated result differs slightly from the SPSS (IBM Corp., Armonk, NY, USA) output in Appendix 1, probably because of rounding error.

2. Welch's t-test

In practice, there are many cases where equal variances cannot be assumed. Even when the assumption of equal variances is implausible, we can still compare 2 independent group means by performing Welch's t-test. Welch's t-test is more reliable when the 2 samples have unequal variances and/or unequal sample sizes. The assumption of normality must still hold.

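Because the population variances are not equal, we have to estimate them separately using the 2 sample variances, s₁² and s₂². As a result, the t-test statistic takes the following form:

$$t = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}, \qquad \nu = \frac{\left(\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}\right)^2}{\dfrac{(s_1^2/n_1)^2}{n_1 - 1} + \dfrac{(s_2^2/n_2)^2}{n_2 - 1}}$$

where ν is the Satterthwaite degrees of freedom.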
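The Welch statistic and its Satterthwaite degrees of freedom can be sketched from summary statistics as below. The function name `welch_t` and the input values (unequal group sizes and standard deviations) are hypothetical, and the df expression is the standard Welch-Satterthwaite approximation, which is assumed here to match the formula intended in the text.

```python
import math


def welch_t(m1, sd1, n1, m2, sd2, n2):
    """Welch's t statistic and Satterthwaite df from means, SDs and group sizes."""
    v1 = sd1 ** 2 / n1  # estimated variance of the first sample mean
    v2 = sd2 ** 2 / n2  # estimated variance of the second sample mean
    t = (m1 - m2) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df


# Hypothetical summary statistics with clearly unequal variances and sizes.
t_stat, df = welch_t(m1=10.0, sd1=2.0, n1=10, m2=12.0, sd2=4.0, n2=20)
print(f"t({df:.3f}) = {t_stat:.3f}")

# With equal group sizes and equal SDs, the df reduces to n1 + n2 - 2.
_, df_equal = welch_t(10.0, 3.0, 12, 12.0, 3.0, 12)
print(f"df with equal n and equal SD = {df_equal:.1f}")
```

The df is generally not an integer, which is why software output such as 16.875 appears for Welch's test.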

In Appendix 1, ‘(E-1) Levene's test for equality of variances’ shows that equal variances can be assumed (p = 0.334). Therefore, Welch's t-test is inappropriate for these data. For the purpose of exercise only, we can interpret the results of Welch's t-test shown in the lower line of ‘(E-2) t-test for equality of means t-values’. The t-value and df are shown as −3.357 and 16.875.

We confirmed nearly the same results by calculation using the formula and by the SPSS software.

The t-test is one of the most frequently used methods for comparing 2 group means. However, we sometimes forget its underlying assumptions, such as normality, or overlook the meaning of the equal variance assumption. Especially when we have a small sample, we need to check the normality assumption first and decide between the parametric t-test and the nonparametric Mann-Whitney U test. We also need to assess the assumption of equal variances and select either Student's t-test or Welch's t-test.

Procedure of t-test analysis using IBM SPSS

The procedure of t-test analysis using IBM SPSS Statistics for Windows Version 23.0 (IBM Corp., Armonk, NY, USA) is as follows.

[Appendix 1: SPSS procedure and output]


COMMENTS

  1. THE USE OF TWO-SAMPLE t-TEST IN THE REAL DATA

    In this research, using the two-sample t-test, a comparison between the students in the three grades of the Department of Mathematics Education, Tishk International University, is made to see ...

  2. T Test

    Selecting appropriate statistical tests is a critical step in conducting research. Therefore, there are three forms of Student's t-test about which physicians, particularly physician-scientists, need to be aware: (1) one-sample t-test; (2) two-sample t-test; and (3) two-sample paired t-test. The one-sample t-test evaluates a single list of ...

  3. The Differences and Similarities Between Two-Sample T-Test and Paired T

    The sample mean difference is the same as that in Example 1. However, the example variance of the sample mean difference is 2.45. The paired t -test statistic equals 6.33. From the t -distribution with df = 9, we obtain the p-value of 0.00007, which shows strong evidence to reject the null hypothesis.

  4. Two Sample t-test: Definition, Formula, and Example

    A two sample t-test is used to determine whether or not two population means are equal. This tutorial explains the following: The motivation for performing a two sample t-test. The formula to perform a two sample t-test. The assumptions that should be met to perform a two sample t-test. An example of how to perform a two sample t-test.

  5. PDF Single-Sample and Two-Sample t Tests

    Single-Sample and Two-Sample t Tests. Introduction. The simple t test has a long and venerable history in psychological research. Whenever researchers want to answer simple questions about one or two means for normally distributed variables (e.g., neuroticism, daily caloric intake, height, rainfall), a t test will often provide the answer to such questions.

  6. The paired t test and beyond: Recommendations for testing the central

    1. Introduction. When two sets of non-count data are obtained in a design with two related, matched or dependent samples (the three terms are used interchangeably) many researchers use a t test for paired samples (Tp). However, quite often a non-parametric alternative is chosen, such as the well-known Wilcoxon Signed Ranks test (WSR), which is also known as Wilcoxon Matched Pairs, Signed Rank(s ...

  7. Two-Sample Unpaired t Tests in Medical Research

    Two-Sample Unpaired t Tests in Medical Research. ... Two-Sample Unpaired t Tests in Medical Research Anesth Analg. 2019 Oct;129(4):911. doi: 10.1213/ANE.0000000000004373. Authors Patrick Schober 1 , Thomas R Vetter 2 Affiliations 1 From the Department of Anesthesiology ...

  8. The New and Improved Two-Sample t Test

    It is well known that with Student's two-independent-sample t test, the actual level of significance can be well above or below the nominal level, confidence intervals can have inaccurate probability coverage, and power can be low relative to other methods. A solution to deal with heterogeneity is Welch's (1938) test.

  9. The two-sample t test: pre-testing its assumptions does not pay off

    Traditionally, when applying the two-sample t test, some pre-testing occurs. That is, the theory-based assumptions of normal distributions as well as of homogeneity of the variances are often tested in applied sciences in advance of the tried-for t test. But this paper shows that such pre-testing leads to unknown final type-I- and type-II-risks if the respective statistical tests are performed ...

  10. LWW

    A concise guide to the use and interpretation of two-sample unpaired t-tests in medical research, with examples and recommendations from the journal Anesthesia & Analgesia.

  11. T test as a parametric statistic

    In statistic tests, the probability distribution of the statistics is important. When samples are drawn from population N(µ, σ²) with a sample size of n, the distribution of the sample mean X̄ should be a normal distribution N(µ, σ²/n). Under the null hypothesis µ = µ₀, the distribution of statistics $z = \frac{\bar{X} - \mu_0}{\sigma / \sqrt{n}}$ should be standardized as a normal distribution.

  12. An Introduction to t Tests

    When to use a t test. A t test can only be used when comparing the means of two groups (a.k.a. pairwise comparison). If you want to compare more than two groups, or if you want to do multiple pairwise comparisons, use an ANOVA test or a post-hoc test. The t test is a parametric test of difference, meaning that it makes the same assumptions about your data as other parametric tests.

  13. The differences and similarities between two-sample t-test and paired t

    In clinical research, comparisons of the results from experimental and control groups are often encountered. The two-sample t-test (also called independent samples t-test) and the paired t-test are probably the most widely used tests in statistics for the comparison of mean values between two samples. However, confusion exists with regard to the use of the two test methods, resulting in their ...

  14. How to Report T-Test Results (With Examples)

    Here's how to report the results of the test: A two sample t-test was performed to compare miles per gallon between fuel treatment and no fuel treatment. There was not a significant difference in miles per gallon between fuel treatment (M = 22.75, SD = 3.25) and no fuel treatment (M = 21, SD = 2.73); t(22) = -1.428, p = .167.

  15. Independent Samples T Test: Definition, Using & Interpreting

    Independent Samples T Tests Hypotheses. Independent samples t tests have the following hypotheses: Null hypothesis: The means for the two populations are equal. Alternative hypothesis: The means for the two populations are not equal. If the p-value is less than your significance level (e.g., 0.05), you can reject the null hypothesis. The difference between the two means is statistically ...

  16. Statistical notes for clinical researchers: the independent samples t-test

    2) where s₁² and s₂² are sample variances. The resulting t-test statistic is a form that both the population variances, σ₁² and σ₂², are exchanged with a common variance estimate, s_p². The df is given as n₁ + n₂ − 2 for the t-test statistic. $t = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{s_p^2/n_1 + s_p^2/n_2}} = \frac{\bar{X}_1 - \bar{X}_2}{s_p \sqrt{1/n_1 + 1/n_2}}$ ...

  17. PDF Limitations Involved in A Two- Sample Independent T-test

    When they use a t-test, they are normally biased or do not know how to remove outliers. This causes having wrong data-set and also improper analysis and incomplete interpretation of the information. (Kim, 2015) Discussion-The two-sample independent t-test is normally used in social, psychological, and educational research to

  18. PDF The differences and similarities between two-sample t-test and paired t

    3.2 Paired t-test. The paired t-test is of the form $T_2 = \bar{X}_d \Big/ \sqrt{\frac{1}{n(n-1)} \sum_{j=1}^{n} (X_{dj} - \bar{X}_d)^2}$. It's obvious that the paired t-test is exactly the one-sample t-test based on the difference within each pair. Under the null hypothesis, T₂ always follows the t-distribution with df = n − 1. 3.3 Differences between the two-sample t-test and ...
