NCBI Bookshelf. A service of the National Library of Medicine, National Institutes of Health.

StatPearls [Internet]. Treasure Island (FL): StatPearls Publishing; 2024 Jan-.


McNemar and Mann-Whitney U Tests

Joshua Henrina Sundjaja; Rijen Shrestha; Kewal Krishan.

Last Update: July 17, 2023.

  • Definition/Introduction

All good research is based on a meticulous, well-designed question framed as a hypothesis. To test this hypothesis, one must conduct an experiment under strict guidelines to obtain robust results. The results are then analyzed statistically to assess their significance and to conclude whether a new treatment, diagnostic modality, or biomarker is a better alternative to prevalent practice. Thus, statistical tests are an important component of research, especially in medicine.

Historically, statistical testing was a grueling, labor-intensive process. With modern computers, statistical analysis can now be carried out in many commercially available programs, such as the Statistical Package for the Social Sciences (SPSS) or Stata (Software for Statistics and Data Science).

Conventionally, statistical tests divide into two major groups, parametric and non-parametric. A prerequisite for a parametric analysis is that the data follow a normal (Gaussian) distribution. If the data are not normally distributed, non-parametric tests are used. For continuous variables, many non-parametric tests have a parametric analog: the Mann-Whitney U test corresponds to the independent t-test, the Wilcoxon signed-rank test to the paired t-test, the Kruskal-Wallis test to analysis of variance (ANOVA), and the Spearman rank correlation coefficient to the Pearson product-moment correlation coefficient. [1]

McNemar test

For nominal variables, in the form of a 2 x 2 table, three types of statistical tests can be used. The first one is the Fisher's exact test. The preconditions for its use are binary data and unpaired samples. The second one is the McNemar test, which requires binary data as in Fisher's exact, albeit with paired samples. The third one is the Chi-squared test, requiring a sample size of more than 60 subjects, with more than five counts in each cell. The Chi-squared test can also be useful for a contingency table of more than 2 x 2, i.e., 3 x 3, 4 x 4, and so on. [2]

The McNemar test is a non-parametric test used to analyze paired nominal data. It is a test on a 2 x 2 contingency table and checks the marginal homogeneity of two dichotomous variables. The test requires one dichotomous nominal outcome variable and one independent variable with two related (paired) groups. The two outcome categories must be mutually exclusive, i.e., a subject cannot fall into more than one category. The McNemar test requires at least ten discordant pairs. The formula for calculating the Chi-squared value for the McNemar test appears in Image 1, where b is the false positive count, and c is the false negative count.
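As an illustration, the statistic can be sketched in a few lines of Python. This is a minimal sketch, assuming the standard uncorrected form χ² = (b − c)² / (b + c) with one degree of freedom (Image 1 itself is not reproduced in this text); the function name and the counts are hypothetical and are not taken from the article.

```python
import math


def mcnemar_chi2(b: int, c: int) -> tuple[float, float]:
    """Return the uncorrected McNemar chi-squared statistic and its p-value.

    b and c are the two discordant-pair counts of the paired 2 x 2 table.
    """
    if b + c == 0:
        raise ValueError("no discordant pairs; the statistic is undefined")
    stat = (b - c) ** 2 / (b + c)
    # Upper tail of a chi-squared distribution with 1 degree of freedom:
    # P(X > x) = erfc(sqrt(x / 2))
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Hypothetical discordant counts, for illustration only
stat, p_value = mcnemar_chi2(b=15, c=5)
print(f"chi-squared = {stat:.2f}, p = {p_value:.4f}")
```

Only the discordant pairs enter the calculation; the concordant cells of the table do not affect the statistic.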

If the Chi-squared value is significant, the null hypothesis of marginal homogeneity is rejected, meaning the marginal proportions of the two tests differ significantly. Whether the newer treatment, diagnostic modality, or biomarker is a better alternative to prevalent practice then depends on the direction of that difference.

It should be noted that if the sum of discordant pairs (b+c) is small (<25), the statistical power of the McNemar test is low even if the total sample size is large. Thus, when the sum of discordant pairs is less than 25, the exact binomial test can be used. Alternatively, Edwards's continuity correction is another option. [3] Nevertheless, an exact test in studies with few subjects produces unnecessarily large p values and has poor power.

Therefore, more precise approaches have been developed for this situation. The McNemar mid-p test is considerably more powerful than the exact test without violating the nominal significance level. Furthermore, if small but frequent violations of the nominal level are acceptable, the McNemar asymptotic test is the most powerful test for this purpose. [4]
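The difference between the exact and mid-p approaches can be sketched as follows. This is an illustrative sketch, assuming the usual conditional formulation in which, under the null hypothesis, each discordant pair falls into cell b or cell c with probability 1/2, and assuming the mid-p value is the two-sided exact p-value minus the point probability of the observed count. The function name and the counts are hypothetical.

```python
from math import comb


def mcnemar_exact_and_midp(b: int, c: int) -> tuple[float, float]:
    """Return (exact two-sided p, mid-p) for discordant counts b and c."""
    n = b + c
    k = min(b, c)
    # Binomial(n, 1/2) probability of observing exactly k
    point = comb(n, k) / 2 ** n
    # Binomial(n, 1/2) probability of observing k or fewer
    lower_tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    exact_p = min(1.0, 2 * lower_tail)
    mid_p = 2 * lower_tail - point
    return exact_p, mid_p


# Hypothetical discordant counts, for illustration only
exact_p, mid_p = mcnemar_exact_and_midp(b=9, c=2)
print(f"exact p = {exact_p:.4f}, mid-p = {mid_p:.4f}")
```

With these hypothetical counts the exact p-value is about 0.065 and the mid-p value about 0.039, which illustrates how the exact test can be more conservative when discordant pairs are few.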

Mann Whitney U test

The Mann-Whitney U test, also called the Wilcoxon rank-sum test, is the non-parametric analog of the Student's t-test for independent samples. It compares the distributions of two independent groups (often interpreted as a comparison of medians) without assuming that the data are normally distributed, and it applies to numerical/continuous variables. For example, if researchers want to compare the age or height (continuous variables) of two different groups in a study with non-normally distributed data, the Mann-Whitney U test can be used.

  • Issues of Concern

There are several issues of concern regarding the use of the McNemar test and the Mann-Whitney U test, explained as follows:

1. The McNemar test compares paired categorical data. However, it cannot be used to measure agreement because it compares only the overall (marginal) proportions. For example, if a researcher compares the test results of subjects examined by two different examiners, and the proportion of subjects who pass is the same for both examiners, this cannot be taken as evidence of agreement. [5]

2. The Mann-Whitney U test is commonly used to compare the medians of two non-normally distributed groups. However, researchers often forget its assumption that the data are independent random samples from two distinct populations with the same shape (distribution). Thus, when conducting this test, the spread and shape of the data should be described in addition to the p-value, as they may relate to clinically significant and relevant findings. [6] Also, skewness and heterogeneity of variance need investigation; if these are present, the Welch U test should be used. [7]
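The first point can be made concrete with a small numerical sketch. The counts below are hypothetical (two examiners rating the same 50 subjects as pass or fail) and were chosen so that the marginal pass rates are identical while the examiners disagree on most subjects; the uncorrected McNemar statistic (b − c)² / (b + c) is assumed.

```python
# Hypothetical paired ratings by two examiners (counts of subjects)
a = 5   # both examiners: pass
b = 20  # examiner 1: pass, examiner 2: fail
c = 20  # examiner 1: fail, examiner 2: pass
d = 5   # both examiners: fail
n = a + b + c + d

pass_rate_1 = (a + b) / n          # marginal pass proportion, examiner 1
pass_rate_2 = (a + c) / n          # marginal pass proportion, examiner 2
mcnemar_stat = (b - c) ** 2 / (b + c)
observed_agreement = (a + d) / n   # share of subjects rated identically

print(pass_rate_1, pass_rate_2, mcnemar_stat, observed_agreement)
```

Here both pass rates are 0.5 and the McNemar statistic is 0, yet the examiners give the same rating to only 20% of subjects, so a non-significant McNemar result says nothing about agreement.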

  • Clinical Significance

Basic statistical knowledge is imperative for every researcher working in the life sciences. Research findings require examination using fundamentally correct statistical analysis to maintain the external validity of studies, which is important if we want to extrapolate our results to the general population. Unfortunately, statistical errors are not uncommon. In a study assessing statistical errors and methodological pitfalls in dissertations submitted for a Medical Doctorate (MD) degree at the National Cancer Institute, Cairo, statistical tests were appropriate in only 13 out of a total of 62 studies (24.5%). [8] Therefore, the aid of a biostatistician or an expert in public health should be available to assist researchers/students.

  • Nursing, Allied Health, and Interprofessional Team Interventions

Research findings should have excellent external validity, i.e., where the results extrapolate to the general population. Equally important is the robustness of the methods used, especially statistical analysis. Biostatistics should be an integral part of the curriculum at all levels of the university. The assistance of a biostatistician or a public health expert can be instrumental in choosing the appropriate design for the study, the analytical tools to be used, and the usefulness of the results. The inclusion of a statistician in the research team also develops a multi-disciplinary approach and can lead to better outcomes for the research.


Image 1. Formula for calculating Chi-squared for the McNemar test. Contributed from the Public Domain.

Disclosure: Joshua Henrina Sundjaja declares no relevant financial relationships with ineligible companies.

Disclosure: Rijen Shrestha declares no relevant financial relationships with ineligible companies.

Disclosure: Kewal Krishan declares no relevant financial relationships with ineligible companies.

This book is distributed under the terms of the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND 4.0) ( http://creativecommons.org/licenses/by-nc-nd/4.0/ ), which permits others to distribute the work, provided that the article is not altered or used commercially. You are not required to obtain permission to distribute this article, provided that you credit the author and journal.

  • Cite this Page Sundjaja JH, Shrestha R, Krishan K. McNemar And Mann-Whitney U Tests. [Updated 2023 Jul 17]. In: StatPearls [Internet]. Treasure Island (FL): StatPearls Publishing; 2024 Jan-.



Nonparametric Tests


Mann Whitney U Test (Wilcoxon Rank Sum Test)

The modules on hypothesis testing presented techniques for testing the equality of means in two independent samples. An underlying assumption for appropriate use of the tests described was that the continuous outcome was approximately normally distributed or that the samples were sufficiently large (usually n₁ > 30 and n₂ > 30) to justify their use based on the Central Limit Theorem. When comparing two independent samples when the outcome is not normally distributed and the samples are small, a nonparametric test is appropriate.

A popular nonparametric test to compare outcomes between two independent groups is the Mann Whitney U test. The Mann Whitney U test, sometimes called the Mann Whitney Wilcoxon Test or the Wilcoxon Rank Sum Test, is used to test whether two samples are likely to derive from the same population (i.e., that the two populations have the same shape). Some investigators interpret this test as comparing the medians between the two populations. Recall that the parametric test compares the means (H₀: μ₁ = μ₂) between independent groups.

In contrast, the null and two-sided research hypotheses for the nonparametric test are stated as follows:

H₀: The two populations are equal versus

H₁: The two populations are not equal.

This test is often performed as a two-sided test and, thus, the research hypothesis indicates that the populations are not equal as opposed to specifying directionality. A one-sided research hypothesis is used if interest lies in detecting a positive or negative shift in one population as compared to the other. The procedure for the test involves pooling the observations from the two samples into one combined sample, keeping track of which sample each observation comes from, and then ranking lowest to highest from 1 to n₁+n₂, respectively.

Consider a Phase II clinical trial designed to investigate the effectiveness of a new drug to reduce symptoms of asthma in children. A total of n=10 participants are randomized to receive either the new drug or a placebo. Participants are asked to record the number of episodes of shortness of breath over a 1 week period following receipt of the assigned treatment. The data are shown below.

Is there a difference in the number of episodes of shortness of breath over a 1 week period in participants receiving the new drug as compared to those receiving the placebo? By inspection, it appears that participants receiving the placebo have more episodes of shortness of breath, but is this statistically significant?

In this example, the outcome is a count and in this sample the data do not follow a normal distribution.  

[Figure: Frequency Histogram of Number of Episodes of Shortness of Breath]

In addition, the sample size is small (n₁=n₂=5), so a nonparametric test is appropriate. We test H₀: the two populations are equal versus H₁: the two populations are not equal, and we run the test at the 5% level of significance (i.e., α=0.05).

Note that if the null hypothesis is true (i.e., the two populations are equal), we expect to see similar numbers of episodes of shortness of breath in each of the two treatment groups, and we would expect to see some participants reporting few episodes and some reporting more episodes in each group. This does not appear to be the case with the observed data. A test of hypothesis is needed to determine whether the observed data is evidence of a statistically significant difference in populations.

The first step is to assign ranks and to do so we order the data from smallest to largest. This is done on the combined or total sample (i.e., pooling the data from the two treatment groups (n=10)), and assigning ranks from 1 to 10, as follows. We also need to keep track of the group assignments in the total sample.

Note that the lower ranks (e.g., 1, 2 and 3) are assigned to responses in the new drug group while the higher ranks (e.g., 9, 10) are assigned to responses in the placebo group. Again, the goal of the test is to determine whether the observed data support a difference in the populations of responses. Recall that in parametric tests (discussed in the modules on hypothesis testing), when comparing means between two groups, we analyzed the difference in the sample means relative to their variability and summarized the sample information in a test statistic. A similar approach is employed here. Specifically, we produce a test statistic based on the ranks.

First, we sum the ranks in each group. In the placebo group, the sum of the ranks is 37; in the new drug group, the sum of the ranks is 18. Recall that the sum of the ranks will always equal n(n+1)/2. As a check on our assignment of ranks, we have n(n+1)/2 = 10(11)/2=55 which is equal to 37+18 = 55.
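The ranking step and the check on the rank sums can be sketched in Python. Two things here are assumptions rather than content from the text above: the episode counts are illustrative values chosen so that they reproduce the reported rank sums of 37 and 18 (the original data table is not reproduced here), and tied values are given the average of the ranks they occupy, which is the usual convention.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values equal to values[order[i]]
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # sorted positions i..j hold ranks i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


# Illustrative data (assumed, chosen to match the rank sums in the text)
placebo = [7, 5, 6, 4, 12]
new_drug = [3, 6, 4, 2, 1]

pooled = placebo + new_drug
ranks = average_ranks(pooled)
r_placebo = sum(ranks[:len(placebo)])
r_new_drug = sum(ranks[len(placebo):])

n = len(pooled)
# The rank sums must add up to n(n+1)/2
assert r_placebo + r_new_drug == n * (n + 1) / 2
print(r_placebo, r_new_drug)
```

Keeping the two groups as slices of one pooled list is one simple way to keep track of group membership while ranking the combined sample.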

For the test, we call the placebo group 1 and the new drug group 2 (assignment of groups 1 and 2 is arbitrary). We let R₁ denote the sum of the ranks in group 1 (i.e., R₁=37), and R₂ denote the sum of the ranks in group 2 (i.e., R₂=18). If the null hypothesis is true (i.e., if the two populations are equal), we expect R₁ and R₂ to be similar. In this example, the lower values (lower ranks) are clustered in the new drug group (group 2), while the higher values (higher ranks) are clustered in the placebo group (group 1). This is suggestive, but is the observed difference in the sums of the ranks simply due to chance? To answer this we will compute a test statistic to summarize the sample information and look up the corresponding value in a probability distribution.

Test Statistic for the Mann Whitney U Test

The test statistic for the Mann Whitney U Test is denoted U and is the smaller of U₁ and U₂, defined below.

U₁ = n₁n₂ + n₁(n₁+1)/2 − R₁

U₂ = n₁n₂ + n₂(n₂+1)/2 − R₂

where R₁ = sum of the ranks for group 1 and R₂ = sum of the ranks for group 2.

For this example,

U₁ = 5(5) + 5(6)/2 − 37 = 3

U₂ = 5(5) + 5(6)/2 − 18 = 22

In our example, U=3. Is this evidence in support of the null or research hypothesis? Before we address this question, we consider the range of the test statistic U in two different situations.
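The calculation of U from the rank sums can be sketched in Python. This is a minimal sketch, assuming the common textbook definitions U₁ = n₁n₂ + n₁(n₁+1)/2 − R₁ and U₂ = n₁n₂ + n₂(n₂+1)/2 − R₂, which reproduce U=3 for this example; the function name is hypothetical.

```python
def mann_whitney_u(r1: float, r2: float, n1: int, n2: int):
    """Return (U1, U2, U) from the rank sums and group sizes."""
    u1 = n1 * n2 + n1 * (n1 + 1) / 2 - r1
    u2 = n1 * n2 + n2 * (n2 + 1) / 2 - r2
    return u1, u2, min(u1, u2)


# Rank sums from the asthma example: placebo R1 = 37, new drug R2 = 18
u1, u2, u = mann_whitney_u(r1=37, r2=18, n1=5, n2=5)
print(u1, u2, u)

# U1 and U2 always add up to n1 * n2, which is a useful arithmetic check
assert u1 + u2 == 5 * 5
```

Some texts define the two statistics as R − n(n+1)/2 instead; that convention swaps the values of U₁ and U₂ but leaves the smaller of the two, and therefore the test, unchanged.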

Situation #1

Consider the situation where there is complete separation of the groups, supporting the research hypothesis that the two populations are not equal. If all of the higher numbers of episodes of shortness of breath (and thus all of the higher ranks) are in the placebo group, and all of the lower numbers of episodes (and ranks) are in the new drug group and that there are no ties, then:

Therefore, when there is clearly a difference in the populations, U=0.

Situation #2

Consider a second situation where low and high scores are approximately evenly distributed in the two groups, supporting the null hypothesis that the groups are equal. If ranks of 2, 4, 6, 8 and 10 are assigned to the numbers of episodes of shortness of breath reported in the placebo group and ranks of 1, 3, 5, 7 and 9 are assigned to the numbers of episodes of shortness of breath reported in the new drug group, then:

When there is clearly no difference between populations, then U=10.  

Thus, smaller values of U support the research hypothesis, and larger values of U support the null hypothesis.
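Both situations can be checked numerically. The sketch below assumes the common definitions U₁ = n₁n₂ + n₁(n₁+1)/2 − R₁ and U₂ = n₁n₂ + n₂(n₂+1)/2 − R₂; the function name is hypothetical, and the ranks are the ones described in the two situations.

```python
def smaller_u(ranks_1, ranks_2):
    """Return min(U1, U2) for two lists of ranks taken from the pooled sample."""
    n1, n2 = len(ranks_1), len(ranks_2)
    u1 = n1 * n2 + n1 * (n1 + 1) / 2 - sum(ranks_1)
    u2 = n1 * n2 + n2 * (n2 + 1) / 2 - sum(ranks_2)
    return min(u1, u2)


# Situation #1: complete separation (placebo holds the five highest ranks)
u_separated = smaller_u([6, 7, 8, 9, 10], [1, 2, 3, 4, 5])

# Situation #2: ranks alternate between the two groups
u_interleaved = smaller_u([2, 4, 6, 8, 10], [1, 3, 5, 7, 9])

print(u_separated, u_interleaved)
```

Because U₁ + U₂ = n₁n₂, the smaller of the two can never exceed n₁n₂/2 (12.5 here), so U ranges from 0 under complete separation to values near that ceiling when the groups overlap fully.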

In every test, we must determine whether the observed U supports the null or research hypothesis. This is done following the same approach used in parametric testing. Specifically, we determine a critical value of U such that if the observed value of U is less than or equal to the critical value, we reject H₀ in favor of H₁, and if the observed value of U exceeds the critical value we do not reject H₀.

The critical value of U can be found in the table below. To determine the appropriate critical value we need the sample sizes (in this example, n₁=n₂=5) and our two-sided level of significance (α=0.05). For this example the critical value is 2, and the decision rule is to reject H₀ if U ≤ 2. We do not reject H₀ because 3 > 2. We do not have statistically significant evidence at α=0.05 to show that the two populations of numbers of episodes of shortness of breath are not equal. However, in this example, the failure to reach statistical significance may be due to low power. The sample data suggest a difference, but the sample sizes are too small to conclude that there is a statistically significant difference.

Table of Critical Values for U

Is there statistical evidence of a difference in APGAR scores in women receiving the new and enhanced versus usual prenatal care? We run the test using the five-step approach.

  •   Step 1. Set up hypotheses and determine level of significance.

H₀: The two populations are equal versus

H₁: The two populations are not equal. α=0.05

  • Step 2.  Select the appropriate test statistic.  

Because APGAR scores are not normally distributed and the samples are small (n₁=8 and n₂=7), we use the Mann Whitney U test. The test statistic is U, the smaller of

U₁ = n₁n₂ + n₁(n₁+1)/2 − R₁ and U₂ = n₁n₂ + n₂(n₂+1)/2 − R₂

where R₁ and R₂ are the sums of the ranks in groups 1 and 2, respectively.

  • Step 3. Set up decision rule.

The appropriate critical value can be found in the table above. To determine the appropriate critical value we need sample sizes (n₁=8 and n₂=7) and our two-sided level of significance (α=0.05). The critical value for this test with n₁=8, n₂=7 and α=0.05 is 10 and the decision rule is as follows: Reject H₀ if U ≤ 10.

  • Step 4. Compute the test statistic.  

The first step is to assign ranks of 1 through 15 to the smallest through largest values in the total sample, as follows:

Next, we sum the ranks in each group. In the usual care group, the sum of the ranks is R₁=45.5 and in the new program group, the sum of the ranks is R₂=74.5. Recall that the sum of the ranks will always equal n(n+1)/2. As a check on our assignment of ranks, we have n(n+1)/2 = 15(16)/2=120 which is equal to 45.5+74.5 = 120.

We now compute U₁ and U₂, as follows:

Thus, the test statistic is U=9.5.  
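The arithmetic behind U=9.5 can be written out as a check. This assumes the common definitions of U₁ and U₂ in terms of the rank sums, applied to the values reported above (n₁=8, n₂=7, R₁=45.5, R₂=74.5):

```latex
\begin{aligned}
U_1 &= n_1 n_2 + \frac{n_1(n_1+1)}{2} - R_1 = 8\cdot 7 + \frac{8\cdot 9}{2} - 45.5 = 46.5,\\
U_2 &= n_1 n_2 + \frac{n_2(n_2+1)}{2} - R_2 = 8\cdot 7 + \frac{7\cdot 8}{2} - 74.5 = 9.5,\\
U &= \min(U_1, U_2) = 9.5 .
\end{aligned}
```

As a consistency check, U₁ + U₂ = 46.5 + 9.5 = 56 = n₁n₂.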

  • Step 5.  Conclusion:

We reject H₀ because 9.5 < 10. We have statistically significant evidence at α=0.05 to show that the populations of APGAR scores are not equal in women receiving usual prenatal care as compared to the new program of prenatal care.

Example:  

A clinical trial is run to assess the effectiveness of a new anti-retroviral therapy for patients with HIV. Patients are randomized to receive a standard anti-retroviral therapy (usual care) or the new anti-retroviral therapy and are monitored for 3 months. The primary outcome is viral load which represents the number of HIV copies per milliliter of blood. A total of 30 participants are randomized and the data are shown below.

Is there statistical evidence of a difference in viral load in patients receiving the standard versus the new anti-retroviral therapy?  

  • Step 1. Set up hypotheses and determine level of significance.

H₀: The two populations are equal versus

H₁: The two populations are not equal. α=0.05

  • Step 2. Select the appropriate test statistic.  

Because viral load measures are not normally distributed (with outliers as well as limits of detection (e.g., "undetectable")), we use the Mann-Whitney U test. The test statistic is U, the smaller of

U₁ = n₁n₂ + n₁(n₁+1)/2 − R₁ and U₂ = n₁n₂ + n₂(n₂+1)/2 − R₂

where R₁ and R₂ are the sums of the ranks in groups 1 and 2, respectively.

  • Step 3. Set up the decision rule.  

The critical value can be found in the table of critical values based on sample sizes (n₁=n₂=15) and a two-sided level of significance (α=0.05). The critical value is 64 and the decision rule is as follows: Reject H₀ if U ≤ 64.

  • Step 4. Compute the test statistic.

The first step is to assign ranks of 1 through 30 to the smallest through largest values in the total sample. Note in the table below, that the "undetectable" measurement is listed first in the ordered values (smallest) and assigned a rank of 1.  

Next, we sum the ranks in each group. In the standard anti-retroviral therapy group, the sum of the ranks is R₁=245; in the new anti-retroviral therapy group, the sum of the ranks is R₂=220. Recall that the sum of the ranks will always equal n(n+1)/2. As a check on our assignment of ranks, we have n(n+1)/2 = 30(31)/2=465 which is equal to 245+220 = 465. We now compute U₁ and U₂, as follows,

Thus, the test statistic is U=100.  
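For reference, the value U=100 follows from the reported rank sums (R₁=245, R₂=220, n₁=n₂=15), assuming the common definitions of U₁ and U₂ in terms of the rank sums:

```latex
\begin{aligned}
U_1 &= n_1 n_2 + \frac{n_1(n_1+1)}{2} - R_1 = 15\cdot 15 + \frac{15\cdot 16}{2} - 245 = 100,\\
U_2 &= n_1 n_2 + \frac{n_2(n_2+1)}{2} - R_2 = 15\cdot 15 + \frac{15\cdot 16}{2} - 220 = 125,\\
U &= \min(U_1, U_2) = 100 .
\end{aligned}
```

Here U₁ + U₂ = 225 = n₁n₂, as expected.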

  • Step 5.  Conclusion.  

We do not reject H₀ because 100 > 64. We do not have sufficient evidence to conclude that the treatment groups differ in viral load.

Content ©2017. All Rights Reserved. Date last modified: May 4, 2017. Wayne W. LaMorte, MD, PhD, MPH


Mann-Whitney U Test: Assumptions and Example

Discover what the Mann-Whitney U test is, what it tells us and when it should be used.

Elliot McClenaghan

What is the Mann-Whitney U Test?


The Mann-Whitney U Test, also known as the Wilcoxon Rank Sum Test, is a non-parametric statistical test used to compare two samples or groups.

The Mann-Whitney U Test assesses whether two sampled groups are likely to derive from the same population, and essentially asks: do these two populations have the same shape with regard to their data? In other words, we want evidence as to whether the groups are drawn from populations with different levels of a variable of interest. It follows that the hypotheses in a Mann-Whitney U Test are:

  • The null hypothesis (H0) is that the two populations are equal.
  • The alternative hypothesis (H1) is that the two populations are not equal.

Some researchers interpret this as comparing the medians between the two populations (in contrast, parametric tests compare the means between two independent groups). In certain situations, where the data are similarly shaped (see assumptions), this is valid – but it should be noted that the medians are not actually involved in calculation of the Mann-Whitney U test statistic. Two groups could have the same median and be significantly different according to the Mann-Whitney U test.

Non-parametric tests (sometimes referred to as ‘distribution-free tests’) are used when you assume the data in your populations of interest do not have a Normal distribution. You can think of the Mann-Whitney U test as analogous to the unpaired Student’s t-test, which you would use when assuming your two populations are normally distributed, as defined by their means and standard deviations (the parameters of the distributions).

Figure 1: Normal distribution versus skewed distribution. Credit: Technology Networks. 

  

The Mann-Whitney U Test is a common statistical test that is used in many fields including economics, biological sciences and epidemiology. It is particularly useful when you are assessing the difference between two independent groups with low numbers of individuals in each group (usually less than 30), which are not normally distributed, and where the data are continuous. If you are interested in comparing more than two groups which have skewed data, a Kruskal-Wallis one-way analysis of variance (ANOVA) should be used.

Mann-Whitney U Test Assumptions

Some key assumptions for Mann-Whitney U Test are detailed below:

  • The variable being compared between the two groups must be continuous (able to take any number in a range – for example age, weight, height or heart rate) or at least ordinal, because the test is based on ranking the observations in each group.
  • The data are assumed to take a non-Normal, or skewed, distribution. If your data are normally distributed, the unpaired Student’s t-test should be used to compare the two groups instead.
  • While the data in both groups are not assumed to be Normal, the data are assumed to be similar in shape across the two groups.
  • The data should be two randomly selected independent samples, meaning the groups have no relationship to each other. If samples are paired (for example, two measurements from the same group of participants), then a paired test, such as the Wilcoxon signed-rank test or the paired samples t-test, should be used instead.
  • Sufficient sample size is needed for a valid test, usually more than 5 observations in each group.

Consider a randomized controlled trial evaluating a new anti-retroviral therapy for HIV. A pilot trial randomly assigned participants to either the treated or untreated groups (N=14). We want to assess the viral load (quantity of virus per milliliter of blood) in the treated versus the untreated groups. In practice, a Mann-Whitney U Test would be easily and quickly calculated using statistical software such as SPSS or Stata, but the steps are laid out below.

The data are shown below:

These data are both skewed with a sample size of n=7 in each treatment arm, and so a non-parametric test is appropriate. Before we calculate the test, we choose a significance level (usually α=0.05). The first step is to assign ranks to the values from the full sample (both treatment groups pooled together) in order from smallest to largest. We can then generate a test statistic based on the ranks.

The table below shows the viral load values in the treated and untreated groups ranked smallest to largest, along with the summed ranks of each group:

After summing the ranks for each group, the Mann-Whitney U test statistic is the smaller of the following two calculated U values:

[Image: formulas for the Mann-Whitney U test statistic]

Normal approximation

There are situations where the sample size may be too large for the reference table to be used to calculate the exact probability distribution – in which case we can use a Normal approximation instead. Since U is found by adding together independent, similarly distributed random samples, the central limit theorem applies when the sample is large (usually >20 in each group). The standard deviation of the sum of the ranks can be used to generate a z-statistic and a significance value generated this way. If the null hypothesis is true, the distribution of U approximates to a Normal distribution.

Next we determine a ‘critical value’ of U with which to compare our calculated test statistic, which we can do using a reference table of critical values and using our sample sizes (n=7 in both groups) and two-sided level of significance (α=0.05).

In our current example, the critical value can be determined from the reference table as 8. Finally, we can use this to decide whether to reject the null hypothesis using the following decision rule: Reject H0 if U ≤ 8.

Given that our U statistic is equal to the critical value, we reject the null hypothesis that the two groups are equal and conclude that there is evidence of a difference in viral load between the groups treated with the new therapy versus untreated.



Statistics LibreTexts

13.5: Mann-Whitney U Test


  • Rachel Webb
  • Portland State University

The Mann-Whitney U Test is the non-parametric alternative to the independent t-test. Henry Mann and Donald Whitney developed the test as an extension of Frank Wilcoxon’s Rank Sum test.

The independent t-test assumes the populations are normally distributed. When this condition is not met, the Mann-Whitney Test is an alternative method.

If two groups come from the same distribution and were randomly assigned labels, then the two different groups should have values somewhat equally distributed between the two groups. The Mann-Whitney Test looks at all the possible rankings between the data points. For large sample sizes, a normal approximation of the distribution of ranks is used.

Small Sample Size Case \((n \leq 20)\)

Combine the data from both groups and sort from smallest to largest. Make sure to label the data values so you know which group they came from. Rank the data. Sum the ranks separately from each group. Let \(R_{1}\) = sum of ranks for group one and \(R_{2}\) = sum of ranks for group two.

Find the \(U\) statistic for both groups: \(U_{1} = R_{1} - \frac{n_{1} \left(n_{1}+1\right)}{2}, U_{2} = R_{2} - \frac{n_{2} \left(n_{2}+1\right)}{2}\).

The test statistic \(U = \text{Min} \left(U_{1}, U_{2}\right)\) is the smaller of \(U_{1}\) or \(U_{2}\). Critical values are given in the tables in Figures 13-6 \((\alpha = 0.05)\) and 13-7 \((\alpha = 0.01)\).

If \(U\) is less than or equal to the critical value, then reject \(H_{0}\). Dashes indicate that the sample is too small to reject \(H_{0}\).

If only one of the sample sizes is above 20, use the following online calculator to find the critical value: https://www.socscistatistics.com/tests/mannwhitney/default.aspx .
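The ranking procedure above can be sketched in a few lines of Python. This is a minimal illustration, not the textbook's own code: the function names `average_ranks` and `mann_whitney_u` and the two small data sets are made up for the example, and tied values are assumed to share the mean of the ranks they occupy.

```python
def average_ranks(values):
    """Rank values from smallest (rank 1) to largest; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the block while the next sorted value is tied with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = ((start + 1) + (end + 1)) / 2  # mean of the 1-based positions in the block
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def mann_whitney_u(group1, group2):
    """Return (R1, R2, U1, U2, U) using U_i = R_i - n_i (n_i + 1) / 2."""
    n1, n2 = len(group1), len(group2)
    ranks = average_ranks(list(group1) + list(group2))  # rank the combined data
    r1 = sum(ranks[:n1])
    r2 = sum(ranks[n1:])
    u1 = r1 - n1 * (n1 + 1) / 2
    u2 = r2 - n2 * (n2 + 1) / 2
    return r1, r2, u1, u2, min(u1, u2)


# Hypothetical data, made up for illustration only.
group_a = [12, 15, 15, 18, 20]
group_b = [15, 22, 25, 27]
r1, r2, u1, u2, u = mann_whitney_u(group_a, group_b)
print(r1, r2, u1, u2, u)  # 18.0 27.0 3.0 17.0 3.0
```

A quick sanity check is that \(U_{1} + U_{2} = n_{1} n_{2}\), here \(3 + 17 = 5 \cdot 4\).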

Student employees are a major part of most college campus employment. Two major departments that participate in student hiring are listed below with the number of hours worked by students for a month. At the 0.05 level of significance, is there sufficient evidence to conclude a difference in hours between the two departments?

Sum the ranks for each group:

\(R_{1} = 1 + 2 + 3 + 4.5 + 6.5 + 8.5 + 11.5 + 13.5 + 15.5 + 19 = 85\)

\(R_{2} = 4.5 + 6.5 +8.5 + 10 + 11.5 + 13.5 + 15.5 + 17 + 18 + 20 + 21 = 146\)

Compute the test statistic:

\(U_{1} = R_{1} - \frac{n_{1} \left(n_{1}+1\right)}{2} = 85 - \frac{10 \cdot 11}{2} = 30\)

\(U_{2} = R_{2} - \frac{n_{2} \left(n_{2}+1\right)}{2} = 146 - \frac{11 \cdot 12}{2} = 80\)

Find the critical value using Figure 13-8, where \(n_{1} = 10\) and \(n_{2} = 11\). The critical value = 26.

Do not reject \(H_{0}\), since \(U = 30 > \text{CV} = 26\).

There is not enough evidence to support the claim that there is a difference in the number of hours student employees work for the athletics department and the library.
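The arithmetic of this example can be re-checked with a short script. The ranks are copied from the sums above; the variable names are chosen for the sketch, and the critical value of 26 is taken from the text, not computed.

```python
# Ranks of each group, copied from the rank sums in the worked example.
ranks_1 = [1, 2, 3, 4.5, 6.5, 8.5, 11.5, 13.5, 15.5, 19]
ranks_2 = [4.5, 6.5, 8.5, 10, 11.5, 13.5, 15.5, 17, 18, 20, 21]

n1, n2 = len(ranks_1), len(ranks_2)
R1, R2 = sum(ranks_1), sum(ranks_2)

# All ranks 1..N together must sum to N(N+1)/2.
N = n1 + n2
assert R1 + R2 == N * (N + 1) / 2

U1 = R1 - n1 * (n1 + 1) / 2
U2 = R2 - n2 * (n2 + 1) / 2
U = min(U1, U2)

critical_value = 26  # quoted in the text for n1 = 10, n2 = 11, alpha = 0.05
reject_h0 = U <= critical_value

print(R1, R2, U1, U2, U, reject_h0)  # 85.0 146.0 30.0 80.0 30.0 False
```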

Large Sample Size Case (\(n_{1} > 20\) and \(n_{2} > 20\))

Find the \(U\) statistic for both groups: \(U_{1} = R_{1} - \frac{n_{1} \left(n_{1}+1\right)}{2}\), \(U_{2} = R_{2} - \frac{n_{2} \left(n_{2}+1\right)}{2}\).

Let \(U = \text{Min} \left(U_{1}, U_{2}\right)\), the smaller of \(U_{1}\) or \(U_{2}\). The formula for the test statistic is: \[z = \frac{\left(U - \left( \dfrac{n_{1} \cdot n_{2}}{2} \right)\right)}{\sqrt{\dfrac{n_{1} \cdot n_{2} \left(n_{1} + n_{2} + 1\right)}{12}}} \nonumber\]

A manager believes that the sales of coffee at their Portland store is more than the sales at their Cannon Beach store. They take a random sample of weekly sales from the two stores over the last year. Use the Mann-Whitney test to see if the manager’s claim could be true. Use the p-value method with \(\alpha = 0.05\).

The hypotheses are:

\(H_{0}\): There is no difference in the coffee sales between the Portland and Cannon Beach stores. \(H_{1}\): There is a difference in the coffee sales between the Portland and Cannon Beach stores.

Sum the ranks for each group.

The sum for the Portland store’s ranks: \(R_{1} = 459.5\). The sum for the Cannon Beach store’s ranks: \(R_{2} = 443.5\).

\(U_{1} = R_{1} - \frac{n_{1} \left(n_{1}+1\right)}{2} = 459.5 - \frac{20 \cdot 21}{2} = 249.5\)

\(U_{2} = R_{2} - \frac{n_{2} \left(n_{2}+1\right)}{2} = 443.5 - \frac{22 \cdot 23}{2} = 190.5\)

\(U = 190.5\)

\(z = \frac{190.5 - \left(\frac{20 \cdot 22}{2}\right)}{\sqrt{ \left(\frac{20 \cdot 22 (20 + 22 + 1)}{12}\right) }} = -0.7429\)

This test uses the standard normal distribution with the same technique for finding a p-value or critical value as the z-test performed in previous chapters. Compute the p-value for a standard normal distribution for \(z = -0.7429\) for a two-tailed test using \(2 * \text{normalcdf}(-1E99,-0.7429,0,1) = 0.4575\).


The p-value = \(0.4575 > \alpha = 0.05\); therefore, do not reject \(H_{0}\).

This is a two-tailed test with \(\alpha = 0.05\). Use a tail area of \(\alpha/2 = 0.025\) in each tail and you get critical values of \(z_{\alpha/2} = \pm 1.96\).

There is not enough evidence to support the claim that there is a difference in coffee sales between the Portland and Cannon Beach stores.
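For readers without a TI calculator, the same large-sample computation can be sketched with Python's standard library. The function name `mann_whitney_z` is made up for this sketch; `statistics.NormalDist` plays the role of `normalcdf`, and no tie correction is applied, matching the formula above.

```python
from math import sqrt
from statistics import NormalDist


def mann_whitney_z(u, n1, n2):
    """Normal approximation for U without a tie correction."""
    mean_u = n1 * n2 / 2
    sd_u = sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    return (u - mean_u) / sd_u


# Values from the coffee sales example.
n1, n2 = 20, 22
R1, R2 = 459.5, 443.5
U1 = R1 - n1 * (n1 + 1) / 2
U2 = R2 - n2 * (n2 + 1) / 2
U = min(U1, U2)

z = mann_whitney_z(U, n1, n2)
p_two_tailed = 2 * NormalDist().cdf(-abs(z))    # area in both tails
z_critical = NormalDist().inv_cdf(1 - 0.05 / 2)  # upper critical value for alpha = 0.05

print(round(z, 4), round(p_two_tailed, 4), round(z_critical, 2))
```

Since \(|z|\) is well below the critical value and the p-value exceeds \(\alpha\), both methods lead to the same decision.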

There are no shortcut keys on the TI calculators or in Excel for this nonparametric test. Note that if your data have tied ranks, there are several methods, not addressed in this text, for correcting the standard deviation. Hence, the z-score in some software packages may not match your results calculated by hand.

LEARN STATISTICS EASILY


Mastering the Mann-Whitney U Test: A Comprehensive Guide

The Mann-Whitney U Test is a non-parametric statistical test used to determine if there’s a significant difference between two independent, non-normally distributed groups of data. It ranks observations from both groups and then calculates the U statistic to compare them.

Introduction

The  Mann-Whitney U Test , or the Wilcoxon rank-sum test, is a powerful non-parametric test for comparing two independent samples. Unlike the traditional t-test, it does not require the assumption of normally distributed data. This test determines if the observations from one sample are typically bigger than those from the other.

It’s important to note that the  Mann-Whitney Test  is best suited for ordinal, count or continuous data that fails the normality test. This tool has become increasingly popular because it is highly resistant to outliers and skewed data, making it very useful for data scientists in various situations.

The  Mann-Whitney U Test  has vast practical applications. For instance, in pharmaceutical research, it could be used to compare the effectiveness of two different drugs. It might be employed in education to analyze whether teaching method A yields higher test scores than method B. The key is that it allows comparing two groups on a continuous or ordinal outcome.

  • Mann-Whitney U Test is non-parametric, comparing two independent groups.
  • Unlike the t-test, Mann-Whitney doesn’t require normally distributed data assumption.
  • The Mann-Whitney Test uses rank-biserial correlation to measure effect size.
  • Consider the U statistic, p-value, and effect size when interpreting results.


Assumptions of the Mann-Whitney U Test

The effectiveness of the  Mann-Whitney U Test  relies on certain assumptions:

Independence of Observations : This crucial assumption means that each observation is independent of others. There is no correlation or dependency between individual observations.

Random Sampling from Populations : The data should be sampled randomly from the populations. In other words, each individual observation should be independently drawn from the population.

Ordinal Data : The Mann-Whitney Test is particularly suited for ordinal (ranked), count or continuous data that does not follow a normal distribution. If the data is continuous and follows a normal distribution, a more appropriate test would be the parametric t-test, which has greater statistical power under these conditions.

Violations of these assumptions can lead to biased or incorrect results. Therefore, understanding and validating these assumptions is crucial before performing the  Mann-Whitney U Test .

Step-by-Step Process to Perform the Mann-Whitney U Test

Multiple steps must be followed to conduct the  Mann-Whitney U Test , a non-parametric test.

1. Sort the Data : Begin by combining the two datasets and sorting all the values in ascending order. Assign rank numbers to each observation, with the smallest data point getting a rank of 1. If two or more data points are identical (i.e., tied), they get an average rank.

2. Calculate Sum of Ranks : Separately sum up the ranks for each group. This gives you two totals — one for each of the two groups you’re comparing.

3. Calculate U Statistic : The U statistic for each group can be calculated using the formulas U1 = n1·n2 + n1(n1+1)/2 – R1 (group 1) and U2 = n1·n2 + n2(n2+1)/2 – R2 (group 2), where n1 and n2 are the sizes of the two samples and R1 and R2 are the rank sums of the first and second group. So you will get two U values, one for each group.

4. Find the Smaller U Value : The smaller U value between the two calculated U statistics is used for the test.

5. Determine Significance : Compare the calculated U statistic with the critical value from the Mann-Whitney U distribution tables (which varies with the sizes of the samples). If the calculated U value is less than or equal to the tabled value, then the difference is considered statistically significant.

6. Conduct a Hypothesis Test : Depending on the p-value from the U statistic (p < 0.05 is often used), reject or fail to reject the null hypothesis. The null hypothesis (H0) for the Mann-Whitney test is that the distributions of both groups are equal.

Remember, software packages and programming languages, such as R and Python, have built-in functions to perform these calculations for you. Using such tools can save time and reduce the likelihood of manual calculation errors.
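Step 3 can be written with the formula given above or with the equally common form U = R − n(n+1)/2. The sketch below, using hypothetical rank sums chosen only for illustration, suggests that the two conventions yield the same pair of U values with the group labels swapped, so the smaller U used in step 4 is unaffected.

```python
# Hypothetical rank sums for two groups of sizes 5 and 4 (illustration only).
n1, n2 = 5, 4
R1, R2 = 18.0, 27.0

# Convention used in step 3: U = n1*n2 + n(n+1)/2 - R
u1_a = n1 * n2 + n1 * (n1 + 1) / 2 - R1
u2_a = n1 * n2 + n2 * (n2 + 1) / 2 - R2

# Alternative convention: U = R - n(n+1)/2
u1_b = R1 - n1 * (n1 + 1) / 2
u2_b = R2 - n2 * (n2 + 1) / 2

print(u1_a, u2_a)  # 17.0 3.0
print(u1_b, u2_b)  # 3.0 17.0

# Either way the two values sum to n1*n2 and the smaller one is the same.
smaller_u = min(u1_a, u2_a)
```

In practice a library routine such as SciPy's `scipy.stats.mannwhitneyu` carries out the whole test; the hand calculation is mainly useful for checking which U a given tool reports.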

Reporting the Results of the Mann-Whitney U Test

When reporting the results of a Mann-Whitney U Test, it’s crucial to provide the necessary details that allow the reader to fully comprehend the test’s outcome and validate the results. To create a thorough report, make sure to include these crucial components:

Describe the Test : State that you conducted a Mann-Whitney Test. Specify why this test was appropriate, generally due to the data being ordinal or not normally distributed.

Report Sample Sizes : Give the sizes of the samples you compared. These provide the context for the magnitude of the U statistic.

Provide the Test Statistics : Report the exact U statistic, the p-value, and the rank-biserial correlation as the measure of effect size.

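Present Descriptive Statistics : Include the median of each group, because the Mann-Whitney U Test is commonly interpreted as a comparison of medians (an interpretation that strictly holds only when the two distributions have the same shape). Also, provide a measure of variability for each group.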

State the Result : Explain whether the result was significant and what this implies concerning your research question.

Discuss the Effect Size : Reflect on the practical implications of the rank-biserial correlation. A high absolute value represents a large effect size, indicating substantial practical significance.

Report Additional Relevant Information : Detail any other relevant analyses or tests that guided your decision to use the Mann-Whitney U Test. For instance, if a test of normality (like the Shapiro-Wilk test or Kolmogorov-Smirnov test) was conducted and the data were found to be non-normally distributed, this justifies the use of the Mann-Whitney Test instead of a t-test. Including this information provides a more transparent view of your statistical decision-making process.

Here’s an example of how you could report the results of a Mann-Whitney U Test:

“We performed a Mann-Whitney U Test to investigate the difference in satisfaction levels between customers of Brand A (n = 50, median = 85, IQR = 10) and Brand B (n = 60, median = 75, IQR = 15). Before this, a Shapiro-Wilk normality test was conducted, revealing that the data were non-normally distributed, justifying the Mann-Whitney Test. The test results were statistically significant (U = 1200, p = .03), suggesting a difference in satisfaction levels between the two customer groups. The rank-biserial correlation, as the measure of effect size, was found to be 0.4, indicating a moderate practical significance. Thus, we can conclude that customers of Brand A are significantly more satisfied than Brand B customers.”

Interpreting Results from the Mann-Whitney U Test

Interpreting the results of the  Mann-Whitney U Test  involves understanding the U statistic, the p-value, and also the effect size:

U Statistic : The U statistic is derived from the rank sums of the two groups. The smaller of the two calculated U values is the one used for the test. A small U means that one group holds mostly low ranks and the other mostly high ranks; whether that difference is statistically significant is judged by comparing U with the critical value or by the p-value.

P-Value : The p-value helps determine the statistical significance of the test result. A p-value less than the chosen significance level (usually 0.05) suggests that the difference between the two groups is statistically significant. Thus, we reject the null hypothesis (that there’s no difference between the two groups).

Effect Size : Along with the p-value, it’s essential to consider the effect size. This crucial measure quantifies the size of the difference between the two groups. In the Mann-Whitney U Test context, the effect size is often measured using rank-biserial correlation. Unlike the p-value, the effect size is independent of the sample size. Therefore, it provides a more intuitive understanding of the magnitude of the observed effect. Rank-biserial correlation offers a standardized measure of the effect, which can be beneficial for comparing results across different studies or datasets. The value can range from -1 to +1. A value close to |1| indicates a large effect where the ranks in one group are consistently higher than those in the other. A value close to zero suggests little to no effect. This interpretation of effect size allows for a better understanding of the existence, relevance, and practical significance of the difference between groups.
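One common way to obtain the rank-biserial correlation is the simple difference formula: the proportion of pairs in which group 1 has the larger value minus the proportion in which group 2 has the larger value. The sketch below assumes that definition; the function name and the data are hypothetical.

```python
def rank_biserial(group1, group2):
    """Signed rank-biserial correlation via pairwise comparisons (simple difference formula)."""
    wins = sum(1 for a in group1 for b in group2 if a > b)    # group 1 larger
    losses = sum(1 for a in group1 for b in group2 if a < b)  # group 2 larger
    return (wins - losses) / (len(group1) * len(group2))      # tied pairs count for neither


# Hypothetical data, made up for illustration only.
group_a = [12, 15, 15, 18, 20]
group_b = [15, 22, 25, 27]

r = rank_biserial(group_a, group_b)
print(r)  # -0.7
```

The negative sign indicates that values in `group_a` tend to be lower than those in `group_b`; swapping the arguments flips the sign but not the magnitude.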

The Mann-Whitney U Test vs. Other Non-parametric Tests

The  Mann-Whitney U Test  is often compared to non-parametric tests such as the Kruskal-Wallis H and Wilcoxon signed-rank tests. While these tests share similarities, they are used in different scenarios. For example, the Kruskal-Wallis H test extends the Mann-Whitney Test to more than two groups, while the Wilcoxon signed-rank test is used for paired data.


Frequently Asked Questions (FAQs)

It’s a non-parametric statistical test for comparing two independent, non-normally distributed data groups.

This test is ideal when dealing with ordinal or continuous data that are not normally distributed.

The Mann-Whitney U Test assumes Independence of Observations, meaning each observation is unrelated to the others. It also assumes Random Sampling from Populations, which means the data should be sampled randomly from the populations. Finally, this test is appropriate for ordinal or continuous data that do not follow a normal distribution.

First, you must combine and rank all data values in ascending order. Afterward, separately calculate the sum of ranks for both groups. The U statistic for each group can then be determined using a specific formula that considers the sizes of the samples (n1 and n2) and the sum of the ranks (R) in each group. The final U statistic in the test is the smaller of the two calculated U values.

A smaller U statistic suggests a significant difference between the two groups.

A p-value less than 0.05 suggests a statistically significant difference between the two groups.

It uses rank-biserial correlation to measure the size of the difference between the two groups.

Include details like sample sizes, test statistics, effect size, and a clear explanation of results.

The Kruskal-Wallis H Test extends the Mann-Whitney U Test to more than two groups.

No, the Wilcoxon signed-rank test would be more appropriate for paired data.


Assumptions of the Mann-Whitney U test

In order to run a Mann-Whitney U test, the following four assumptions must be met. The first three relate to your choice of study design, whilst the fourth reflects the nature of your data:

  • Assumption #1: You have one dependent variable that is measured at the continuous or ordinal level. Examples of continuous variables include revision time (measured in hours), intelligence (measured using IQ score), exam performance (measured from 0 to 100), weight (measured in kg), and so forth. Examples of ordinal variables include Likert items (e.g., a 7-point scale from "strongly agree" through to "strongly disagree"), amongst other ways of ranking categories (e.g., a 5-point scale explaining how much a customer liked a product, ranging from "Not very much" to "Yes, a lot").
  • Assumption #2: You have one independent variable that consists of two categorical , independent groups (i.e., a dichotomous variable ). Example independent variables that meet this criterion include gender (two groups: "males" or "females"), employment status (two groups: "employed" or "unemployed"), transport type (two groups: "bus" or "car"), smoker (two groups: "yes" or "no"), trial (two groups: "intervention" or "control"), and so forth.

Note: Practically speaking, your independent variable can actually have three or more groups (e.g., the independent variable, "transport type", could have four groups: "bus", "car", "train" and "plane"). However, when you run the Mann-Whitney U test procedure in SPSS, you will need to decide which two groups you want to compare (e.g., you could compare "bus" and "car", or "bus" and "plane", and so forth).

  • Assumption #3: You should have independence of observations , which means that there is no relationship between the observations in each group of the independent variable or between the groups themselves. For example, there must be different participants in each group with no participant being in more than one group. This is more of a study design issue than something you can test for, but it is an important assumption of the Mann-Whitney U test. If your study fails this assumption, you will need to use another statistical test instead of the Mann-Whitney U test (e.g., a Wilcoxon signed-rank test ).
  • Assumption #4: You must determine whether the distribution of scores for both groups of your independent variable (e.g., the distribution of scores for "males" and the distribution of scores for "females" for the independent variable, "gender") have the same shape or a different shape . This will determine how you interpret the results of the Mann-Whitney U test. Since this is a critical assumption of the Mann-Whitney U test, and will affect how to work your way through this guide, we discuss this further in the next section.


Assumption #4: Evaluating the distributions of the two groups of your independent variable

The Mann-Whitney U test was developed as a test of stochastic equality (Mann and Whitney, 1947). However, it is not often that the test is directly interpreted in this way. In practice, the Mann-Whitney U test is more broadly used to interpret whether there are differences in the "distributions" of two groups or differences in the "medians" of two groups . However, this is not so much a choice that you make, but is based on whether the distribution of scores for both groups of your independent variable (e.g., the distribution of scores for "males" and the distribution of scores for "females" for the independent variable, "gender") have the same shape or a different shape .

If the two distributions have a different shape , the Mann-Whitney U test is used to determine whether there are differences in the distributions of your two groups. However, if the two distributions are the same shape , the Mann-Whitney U test is used to determine whether there are differences in the medians of your two groups. We discuss these two different approaches to using the Mann-Whitney U test in turn:

A test of equal distributions

Let us consider the first possible objective for using the Mann-Whitney U test – testing for differences in distributions – by considering an example where engagement score was measured in males and females. Using this interpretation of the Mann-Whitney U test, we would wish to know whether male and female engagement scores are similar or whether one gender has higher or lower values than the other. An example of similar engagement scores and dissimilar engagement scores can be seen in the diagrams shown below:

Histograms showing similar and dissimilar engagement scores

For the above diagrams, you would most likely want to confirm that males and females had similar scores in the diagram on the left, but that females had higher engagement scores than males in the diagram on the right. The Mann-Whitney U test can do this – determine whether the values in one group are lower or higher than the values in the other group (e.g., females higher than males) – by comparing the mean ranks of each distribution of scores (e.g., males and females engagement scores).

The Mann-Whitney U test works by ranking each score of the dependent variable (e.g., engagement), irrespective of the group it is in (e.g., males or females), according to its size, with the smallest rank assigned to the smallest value. The ranks obtained for males are then averaged, as are the females' ranks. This results in a mean rank for males and a mean rank for females. If the distributions are identical, which is the null hypothesis of the Mann-Whitney U test, the mean rank will be the same for both males and females. However, if one group (e.g., females) tends to have higher values than the other group, that group's scores will have been assigned higher ranks and will have a higher mean rank (and vice-versa for the group with lower scores). It is this difference in mean rank that is tested by the Mann-Whitney U test for statistical significance. Using this approach, different distributions of scores can be accommodated by the Mann-Whitney U test when determining whether values (i.e., via mean ranks) are different between two groups, as shown below:

Histograms showing non-identical distributions

Both charts above show non-identical distributions, but with females having higher engagement scores than males in both cases. The chart on the left shows the distribution of male and female engagement scores having the same shape, but a different location (i.e., the female scores are 'shifted' to the right). However, the chart on the right shows dissimilarly shaped distributions of male and female engagement scores, but again with females tending to score higher than males for engagement. The mean rank of both of these distributions can be calculated and assessed by the Mann-Whitney U test to determine whether one group has higher or lower scores than the other group.
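To make the idea of mean ranks concrete, here is a minimal sketch with hypothetical engagement scores. The numbers are invented and deliberately contain no ties, so a simple position-based ranking is enough.

```python
# Hypothetical engagement scores (no tied values), for illustration only.
males = [3.1, 4.0, 4.4, 5.2]
females = [4.8, 5.5, 6.1, 6.7]

# Rank every score irrespective of group: smallest value gets rank 1.
pooled = sorted(males + females)
rank_of = {score: position + 1 for position, score in enumerate(pooled)}  # valid only without ties

mean_rank_males = sum(rank_of[s] for s in males) / len(males)
mean_rank_females = sum(rank_of[s] for s in females) / len(females)

# If the two groups were fully interleaved, each mean rank would sit near (N + 1) / 2.
overall_mean_rank = (len(pooled) + 1) / 2

print(mean_rank_males, mean_rank_females, overall_mean_rank)  # 2.75 6.25 4.5
```

Here the female mean rank lies above the overall mean rank and the male mean rank below it, which is the pattern the test evaluates for statistical significance.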

Sometimes you will be required to explicitly state the null and alternative hypotheses for a Mann-Whitney U test, and then state which was accepted and rejected at the end of the experiment. One such null hypothesis might be:

H0: the distributions of scores for the two groups are equal

And the alternative hypothesis might be:

HA: the distributions of scores for the two groups are not equal

However, another way to express the alternative hypothesis is as follows:

HA: the mean ranks of the two groups are not equal

The reason for describing the alternative hypothesis with respect to mean ranks is due to a problem that can occur if you have groups with different variances. Under these conditions, you can have very different distributions but still not reject the null hypothesis of equal distributions (see, for example, Hart (2001)) or get a good idea of whether values are higher or lower in one group compared to another. Indeed, any interpretation of differences between groups becomes difficult when variances are not equal.
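The problem with unequal variances can be illustrated with a small, hypothetical example: two groups centred on the same value but with very different spread. The helper `u_by_pairs` is made up for this sketch; it counts the pairs in which the first group's value is larger, which equals that group's U in the R − n(n+1)/2 convention.

```python
def u_by_pairs(x, y):
    """U for x: number of (x_i, y_j) pairs with x_i > y_j, counting ties as one half."""
    return sum(1.0 if xi > yj else 0.5 if xi == yj else 0.0 for xi in x for yj in y)


# Hypothetical scores: same centre (50), very different spread.
narrow = [48, 49, 51, 52]
wide = [10, 30, 70, 90]

u_narrow = u_by_pairs(narrow, wide)
u_wide = u_by_pairs(wide, narrow)
expected_u = len(narrow) * len(wide) / 2  # value of U expected under the null hypothesis

print(u_narrow, u_wide, expected_u)  # 8.0 8.0 8.0
```

Both U values equal the value expected under the null hypothesis, so the test gives no hint of a difference even though the two distributions are clearly not identical.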

A test of medians

You read in the previous section that – regardless of similar or dissimilar distributions – you can use the Mann-Whitney U test to determine whether engagement scores are higher or lower in males versus females based on the use of mean ranks to describe the group differences. However, rather than mean ranks , it would be nice if we were able to describe our data using the more familiar median value. This would be more in keeping with the Mann-Whitney U test being used as an alternative to the independent-samples t-test (i.e., both would then use a measure of central tendency: the 'mean' for the independent-samples t-test and the 'median' for the Mann-Whitney U test). Indeed, the Mann-Whitney U test can be used for this very purpose, but it requires an additional assumption about the shapes of the distributions: to compare medians the distribution of engagement scores for males and females must have the same shape (including dispersion) (see below):

Histograms showing the same shape on the left and differently shaped distributions on the right

First, consider the chart on the right where the distributions are differently shaped. In this situation, you are limited to describing the differences between male and female engagement scores to higher/lower statements as described in the previous section . However, the chart on the left shows an example where the distributions of engagement scores for males and females are the same shape . As such, only the location of the engagement scores is considered to be different between the two groups, with the median being the measure of location used. This is sometimes referred to as a shift in location (i.e., all scores are being shifted to the right). What all this means is that we can use the Mann-Whitney U test to determine if the group's medians are statistically significantly different rather than before where we could only make more general higher/lower statements based on mean ranks.

Expressing the difference in the medians as a null and alternative hypothesis, we have:

H0: the distributions of the two groups are equal

HA: the medians of the two groups are not equal

It is important to note that the null hypothesis is the same whether you use the Mann-Whitney U test to detect a difference in distributions or a difference in medians; namely, that the distributions of the two groups are equal. It is just that with the assumption of similarly shaped distributions, you can attribute any differences between groups highlighted by the Mann-Whitney U test to a difference in medians.

Mann-Whitney U-Test

The Mann-Whitney U-test can be used to test whether there is a difference between two samples (groups), and the data need not be normally distributed.

To determine if there is a difference between two samples, the rank sums of the two samples are used rather than the means as in the t-test for independent samples .

Mann-Whitney U-Test vs. t-test for independent samples

The Mann-Whitney U test is thus the non-parametric counterpart to the t-test for independent samples; it is subject to less stringent assumptions than the t-test. Therefore, the Mann-Whitney U test is typically used when the requirement of normal distribution for the t-test is not met.

Assumptions Mann-Whitney U-Tests

To compute a Mann-Whitney U test, only two independent samples with at least ordinal scaled characteristics need to be available. The variables do not have to satisfy any distribution curve.

Mann-Whitney U-Test Assumptions

If the data are available in pairs, the Wilcoxon test must be used instead of the Mann-Whitney U test.

Hypotheses Mann-Whitney U-Tests

The hypotheses of the Mann-Whitney U-test are very similar to the hypotheses of the independent t-test. The difference, however, is that in the case of the Mann-Whitney U test, the test is based on a difference in the central tendency, whereas in the case of the t-test, the test is based on a difference in the mean values. Thus, the Mann-Whitney U test results in:

  • Null hypothesis: There is no difference (in terms of central tendency) between the two groups in the population.
  • Alternative hypothesis: There is a difference (with respect to the central tendency) between the two groups in the population.

Calculate Mann-Whitney U-Test

To calculate the Mann-Whitney U test for two independent samples, the rankings of the individual values must first be determined (An example with tied ranks follows below).

Calculate Mann-Whitney U-Test

These rankings are then added up for the two groups. In the example above, the rank sum T1 of the women is 37 and the rank sum T2 of the men is 29. The mean rank is thus R̄1 = 6.17 for women and R̄2 = 5.80 for men. The difference between R̄1 and R̄2 now indicates whether there may be a difference between the reaction times. In the next step, the U-values are calculated from the rank sums T1 and T2.

Mann-Whitney U-Test equation

where n1 and n2 are the number of elements in the first and second group, respectively. If both groups are from the same population, i.e., the groups do not differ, then both U values are expected to be close to the expected value of U. After the mean and dispersion have been estimated, z can be calculated. For the Mann-Whitney U value, the smaller value of U1 and U2 is used.

Depending on how large the sample is, the p-value for the Mann-Whitney U-test is calculated in a different way. For up to 25 cases, the exact values are used, which can be read from a table. For larger samples, the normal distribution can be used as an approximation.

Note: In this example, we would actually use the exact value, but we will nevertheless use the normal distribution. To do this, simply insert the z-value into the z-value to p-value calculator of DATAtab.

Mann-Whitney U-Test p-Value

If the absolute value of the calculated z-value is larger than the critical z-value, the difference between the two groups is statistically significant.
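The calculation can be retraced from the rank sums quoted above. This is only a sketch under stated assumptions: the group sizes n1 = 6 and n2 = 5 are inferred from the reported mean ranks (37/6 ≈ 6.17 and 29/5 = 5.80), the U formula is the standard textbook form, no tie correction is applied, and the effect size is taken as r = |z| / √N, a common convention that the original page may or may not use.

```python
from math import sqrt
from statistics import NormalDist

# Rank sums from the text; group sizes inferred from the reported mean ranks.
T1, T2 = 37, 29
n1, n2 = 6, 5

# Standard form of the U statistic (assumed).
U1 = n1 * n2 + n1 * (n1 + 1) / 2 - T1
U2 = n1 * n2 + n2 * (n2 + 1) / 2 - T2
U = min(U1, U2)

mean_u = n1 * n2 / 2                        # expected value of U
sd_u = sqrt(n1 * n2 * (n1 + n2 + 1) / 12)   # standard deviation of U without ties
z = (U - mean_u) / sd_u

p_asymptotic = 2 * NormalDist().cdf(-abs(z))  # two-sided p-value from the normal approximation
r = abs(z) / sqrt(n1 + n2)                    # effect size (assumed convention)

print(U1, U2, U)                  # 14.0 16.0 14.0
print(round(z, 4), round(r, 2))   # -0.1826 0.06
print(round(p_asymptotic, 3))
```

The resulting U = 14 and r = 0.06 agree with the values in the report at the end of this page, which suggests the assumptions are reasonable for this example.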

Calculate Mann-Whitney U test with tied ranks

If several people share a rank, tied ranks are present. In this case, there is a change in the calculation of the rank sums and the standard deviation of the U-value. We will now go through both using an example.

In the example it can be seen that...

  • ...the reaction time 34 occurs twice and shares the ranks 2 and 3
  • ...the reaction time 39 occurs three times and shares the ranks 6, 7 and 8.

Mann-Whitney U test with tied ranks

To account for these tied ranks, the mean of the shared ranks is calculated in each case. In the first case, this results in a "new" rank of 2.5 and in the second case in a "new" rank of 7. Now the rank sums T can be calculated.

Mann-Whitney U-test calculation of rank ties

Since the rank ties are clearly visible in the upper table, a term is calculated here that is needed for the later calculation of the U-value in the presence of rank ties.

Now all values are available to calculate the z-value taking the tied ranks into account.

Mann-Whitney U-test for rank ties

Note again that you actually need about 20 cases to assume a normal distribution of the U-values.
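The figures with the tie calculation are not reproduced here. As a rough sketch, the tie term and the tie-corrected standard deviation of U can be computed with the standard textbook correction, σ = √(n₁n₂/12 · ((n + 1) − Σ(t³ − t)/(n(n − 1)))), where t is the size of each group of tied values. The reaction times and the group sizes below are hypothetical; they are chosen only so that 34 shares ranks 2 and 3 and 39 shares ranks 6, 7 and 8.

```python
import math
from collections import Counter

def tie_term(values):
    """Sum of (t**3 - t) over all groups of tied values, t = group size."""
    return sum(t**3 - t for t in Counter(values).values() if t > 1)

def sd_u_with_ties(n1, n2, ties):
    """Standard deviation of U with the standard correction for ties."""
    n = n1 + n2
    return math.sqrt(n1 * n2 / 12 * ((n + 1) - ties / (n * (n - 1))))

# Hypothetical reaction times of both groups combined (11 cases).
times = [30, 34, 34, 36, 37, 39, 39, 39, 41, 45, 47]
n1, n2 = 6, 5  # assumed group sizes

# Tied values receive the mean of the ranks they occupy.
rank_of_34 = (2 + 3) / 2
rank_of_39 = (6 + 7 + 8) / 3

ties = tie_term(times)
sd_corrected = sd_u_with_ties(n1, n2, ties)
sd_uncorrected = sd_u_with_ties(n1, n2, 0)
print(rank_of_34, rank_of_39, ties, sd_corrected, sd_uncorrected)
```

With ties present the corrected standard deviation is slightly smaller than the uncorrected one, so the absolute z-value becomes slightly larger.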

Example with DATAtab

A Mann-Whitney U-Test can be easily calculated with DATAtab. Simply copy the table below or your own data into the statistics calculator and click on Hypothesis tests. Then click on the two variables and select Non-Parametric Test.

DATAtab then gives you the following table for the Mann-Whitney U-Test:

Mann-Whitney U-Test Example

The Mann-Whitney U-Test works with ranks, so the result first shows the mean ranks and the rank sums. The reaction time of women has a slightly lower value than that of men.

Mann-Whitney U-Test Statistics

DATAtab gives you the asymptotic significance and the exact significance . The significance used depends on the sample size. As a rule:

  • n 1 + n 2 < 30 → exact significance
  • n 1 + n 2 > 30 → asymptotic significance

Therefore the exact significance is used for this example. The significance (2-tailed) is .931 and thus above the significance level of 0.05. Therefore, no difference between the reaction time of men and women can be determined with these data.

Interpret Mann-Whitney U-Test

The female group had the same median reaction time (Mdn = 39) as the male group (Mdn = 39). A Mann-Whitney U-Test showed that the difference between the two groups was not statistically significant, U = 14, p = .931, r = 0.06.

Mann-Whitney U-Test effect size

To make a statement about the effect size in the Mann-Whitney U-Test, you need the standardised test statistic z and the total number of cases n. The effect size is then calculated as r = |z| / √n.

In this case, the effect size r is 0.06. In general, one can say about the effect strength:

  • effect size r less than 0.3 → small effect
  • effect size r between 0.3 and 0.5 → medium effect
  • effect size r greater than 0.5 → large effect

In this case, the effect size of 0.06 is therefore a small effect.
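As a small sketch of this calculation, the following assumes r = |z| / √n and uses the z-value of the uncorrected normal approximation for U = 14 with group sizes 6 and 5 (the sizes are inferred from the earlier rank sums, so treat them as an assumption). The function names are illustrative.

```python
import math

def effect_size_r(z, n):
    """Effect size r from the standardised test statistic z and n cases."""
    return abs(z) / math.sqrt(n)

def effect_label(r):
    """Classify r using the thresholds listed above."""
    if r < 0.3:
        return "small"
    if r <= 0.5:
        return "medium"
    return "large"

n1, n2 = 6, 5                                   # assumed group sizes
u = 14                                          # U reported in the example
mean_u = n1 * n2 / 2
sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)  # no tie correction
z = (u - mean_u) / sd_u

r = effect_size_r(z, n1 + n2)
print(round(r, 2), effect_label(r))
```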

Cite DATAtab: DATAtab Team (2024). DATAtab: Online Statistics Calculator. DATAtab e.U. Graz, Austria. URL https://datatab.net

Mann-Whitney U

Ordinal data is displayed in the table below. Is there a difference between Treatment A and Treatment B using alpha = 0.05?

Let's test to see if there is a difference with a hypothesis test.

1. Define Null and Alternative Hypotheses

2. State Alpha

alpha = 0.05

3. State Decision Rule

When you have a sample size that is greater than approximately 30, the Mann-Whitney U statistic follows the z distribution. Here, our sample is not greater than 30. However, I will still be using the z distribution for the sake of brevity. Keep this requirement in mind!

We look up our critical value in the z-Table and find a critical value of plus/minus 1.96.

If z is less than -1.96, or greater than 1.96, reject the null hypothesis.

4. Calculate Test Statistic

First, we must rank all of our scores and indicate which group the scores came from:

If there is a tie, as shown below, we average the ranks:

Next, we give every "B" score one point for every "A" score that is above it. We also give every "A" score one point for every "B" score that is above it. We then add up the points for "A" and for "B" separately, and take the smaller of those two totals, which we call "U".

The "U" score is then used to calculate the z statistic:
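The formula image is missing from this text. Assuming the usual normal approximation without a correction for ties, and writing n₁ and n₂ for the number of scores in groups A and B (symbols the walkthrough itself does not introduce), the z statistic would be:

```latex
z = \frac{U - \dfrac{n_1 n_2}{2}}{\sqrt{\dfrac{n_1 n_2 \,(n_1 + n_2 + 1)}{12}}}
```

Because U is the smaller of the two point totals, it never exceeds n₁n₂/2, so z computed this way is zero or negative, which is consistent with the negative value reported in the conclusion.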

5. State Results

Reject the null hypothesis.

6. State Conclusion

There is a difference between the ranks of the two treatments, z = -2.88, p < .05.

Mann-Whitney U Test Calculator

  • What is the Mann-Whitney U test?
  • Mann-Whitney U test vs. t-test
  • Mann-Whitney U test interpretation
  • How do I use this Mann-Whitney U test calculator?
  • How to calculate the Mann-Whitney U test

This Mann-Whitney U test calculator is here to help whenever you have to perform the Wilcoxon-Mann-Whitney test . It displays not only the final decision but also the results of intermediate computations so that you can learn how to calculate the Mann-Whitney U test by hand.

If you are not yet familiar with the Wilcoxon-Mann-Whitney test, we've prepared a short article explaining what the Mann-Whitney U test is, what is the correct interpretation of the Mann-Whitney U test, and when to use the Mann-Whitney U test vs. t-test .

💡 The Mann-Whitney U test is sometimes called the Wilcoxon-Mann-Whitney test because this test was first proposed by Wilcoxon and then further developed by Mann and Whitney. This calculator uses the test statistic U . See Omni's Wilcoxon rank-sum test calculator for a version of this test using the sum of ranks as the test statistic.

The Mann-Whitney U test is a statistical procedure that we use when we have two independent samples and we want to decide if they come from the same distribution (thus also if the medians of the two populations are equal ) or rather from shifted distributions .

If you're not sure what median is, make sure to visit our median calculator to learn a bit about this concept before plunging into the U-test.

In the picture below, you can see an example of shifted distributions: the blue probability density function is shifted to the left with respect to the green one. As a consequence, the median of the blue distribution is smaller than the median of the green one.

Shifted distributions.

OK, now that we know what the Mann-Whitney U test is, let's move on and discuss when this test can help us.

Recall that the t-test is a statistical procedure that helps you decide if the population means for two independent samples are equal. However, the t-test has its assumptions ; in particular, it requires that at least one of the following conditions holds:

  • Each sample is normally distributed .
  • The samples are relatively large . Optimally, they would have more than 30 or 50 points each, depending on the source. This allows us to resort to the central limit theorem.

However, life is tough sometimes, and from time to time, we encounter datasets that do not want to obey the assumptions of the t-test. For instance, data may be skewed (like in the lognormal distribution), or there may be too few data points . And that's exactly when we use the Mann-Whitney U test.

Let us now discuss the interpretation of the Mann-Whitney U test .

Or maybe you want to learn more about the t-test? We have a dedicated t-test calculator !

As we've already mentioned, the null hypothesis of the Mann-Whitney U test says that the two populations (we'll refer to them as A and B) have the same distribution. Clearly, in such a case, the two populations have equal medians . Rejecting the null hypothesis means we have evidence that the population distributions are shifted with respect to each other, and so are their medians. As in the t-test, there are three possible alternatives:

The distribution of A is shifted to the right with respect to the distribution of B. That is, the median of population A is greater than the median of population B. We'll denote this alternative by A > B.

The distribution of A is shifted to the left with respect to the distribution of B. That is, the median of population A is smaller than the median of population B. We'll denote this alternative by A < B.

The distribution of A is shifted to the right or to the left with respect to the distribution of B. That is, the median of population A is different from the median of population B. We'll denote this alternative by A ≠ B.

The pictures below show the hypothesis A > B (upper figure) and the hypothesis A < B (bottom figure):

u test hypothesis

Most of the time, we perform the two-sided test, i.e., with the alternative A ≠ B. Use a one-sided test if you have some prior theory suggesting that the shift between the populations occurred in a specific direction.

  • Enter your data in the fields of the calculator. Additional fields will appear as you go. Up to 50 fields per sample are available.
  • Pick the significance level and the alternative hypothesis of your test. If not sure, leave the default values.
  • The results of the Mann-Whitney U test will appear at the bottom of the calculator.
  • If at least one of the samples has more than 20 elements, the calculator uses the normal approximation by default. Otherwise, it performs the exact Mann-Whitney U test, but you can use the normal distribution by adjusting the Use normal approximation option.
  • If the calculator uses the normal approximation of the test statistic distribution, then you can choose between the p-value approach and the critical region approach .
  • In the Advanced mode of the calculator, you can decide whether to use the corrections for ties and continuity . Visit our Wilcoxon rank-sum calculator to learn more about them.

The Mann-Whitney U test is quite popular on tests and exams, so you may need to learn to perform it by hand. That's why we'll now show you the Mann-Whitney U test formula and explain step by step how to calculate the Mann-Whitney U test!

To perform the Mann-Whitney U test, follow these steps:

Compute the test statistic: compare each observation from Sample A with each observation from Sample B. Each pair in which the observation from Sample A is the bigger one is worth 1 point, each tie is worth 0.5, and every other pair is worth zero points. Then add up the points; the sum is the test statistic U.

If your samples are small, you have to compare U with the critical value , which you can find in statistical tables or in statistical packages.

If your samples are relatively large, you can use the normal approximation of the test statistic distribution. In such a case, you can make the decision based on the p-value.

Let's discuss these instructions in more detail . In what follows, we denote by n₁ and n₂ the number of observations in Sample A and Sample B, respectively, and by n , the total number of observations, i.e., we have n = n₁ + n₂ .

How to compute the test statistic

The test statistic in the Mann-Whitney U test is given by the following formula:

U₁ = ∑ᵢ ∑ⱼ S(Aᵢ, Bⱼ),

where Aᵢ and Bⱼ are our observations (so i = 1, ..., n₁ and j = 1, ..., n₂) and:

  • S(Aᵢ, Bⱼ) = 1, if Aᵢ > Bⱼ;
  • S(Aᵢ, Bⱼ) = ½, if Aᵢ = Bⱼ; and
  • S(Aᵢ, Bⱼ) = 0, if Aᵢ < Bⱼ.

Clearly, U₁ has a discrete distribution and:

  • Its minimal possible value is 0 — when every observation from Sample B is bigger than every observation from Sample A.
  • Its maximal possible value is n₁n₂ — when every observation from Sample A is bigger than every observation from Sample B.
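The double sum above translates almost literally into code. The following is a minimal sketch with hypothetical samples; `u_by_pairs` is an illustrative name, not part of any library.

```python
def u_by_pairs(sample_a, sample_b):
    """U1 as the double sum of S(A_i, B_j): 1 if A > B, 0.5 if tied, else 0."""
    return sum(
        1.0 if a > b else 0.5 if a == b else 0.0
        for a in sample_a
        for b in sample_b
    )

# Hypothetical observations; the value 22 appears in both samples (one tie).
sample_a = [19, 22, 16, 29, 24]
sample_b = [20, 11, 17, 12, 22]

u1 = u_by_pairs(sample_a, sample_b)
u2 = u_by_pairs(sample_b, sample_a)   # same count from Sample B's point of view
print(u1, u2, len(sample_a) * len(sample_b))
```

Swapping the roles of the two samples gives the complementary statistic, so the two values always add up to n₁n₂.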

Alternatively, we can compute U₁ via the following formula:

U₁ = R₁ − n₁(n₁ + 1)/2

where R₁ is the sum of ranks in Sample A. This formula says that two test statistics, U₁ and R₁ , which appear in the context of the Mann-Whitney-Wilcoxon test, can be easily computed from one another.

💡 Visit Omni's Wilcoxon rank-sum calculator to learn more about ranks.

Critical values for the Mann-Whitney U test

As is always the case in hypothesis testing, the critical value (and also the direction of comparison) depends on the alternative hypothesis :

If A > B, then the observations from Sample A tend to be greater than those from Sample B. Hence, we have evidence in favor of this alternative if U₁ is unusually large , and so the critical region is right-sided , i.e., [c, ∞) , where c is the critical value. Considering the maximal possible value of U₁ , we actually obtain [c, n₁n₂] .

If A < B, then the observations from Sample B tend to be greater than those from Sample A. Hence, we have evidence in favor of this alternative if U₁ is unusually small, and so the critical region is left-sided, i.e., (-∞, c], where c is the critical value. Considering the minimal possible value of U₁, we obtain [0, c].

If A ≠ B, then U₁ is extreme, i.e., unusually small or unusually large . Hence, the critical region is two-sided , i.e., (-∞, c₁] ∪ [c₂, ∞) , where c₁ and c₂ are the critical values. Taking into account the minimal and maximal possible values of U₁ , we obtain [0, c₁] ∪ [c₂, n₁n₂] .

We can determine the actual values of the critical values c, c₁, c₂ from the distribution of U₁ , and they depend on n₁ , n₂ , and on the significance level. To find them, you have to use either a statistical package or the tables of the distribution of U statistics. Or Omni's Mann-Whitney U test calculator 😉!
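For small samples without ties, the distribution of U₁ under the null hypothesis can be enumerated directly, because every assignment of ranks to Sample A is equally likely. The sketch below does this and derives the lower two-sided critical value c₁, taken here as the largest value with 2·P(U₁ ≤ c₁) ≤ α; by symmetry the upper critical value is c₂ = n₁n₂ − c₁. The function names and the exact definition of the critical value are assumptions, since published tables differ in small conventions.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations

def exact_u_distribution(n1, n2):
    """Exact null distribution of U1 (no ties): maps each u to its probability."""
    n = n1 + n2
    counts = Counter()
    for ranks in combinations(range(1, n + 1), n1):
        # U1 = R1 - n1(n1 + 1)/2 for this assignment of ranks to Sample A.
        counts[sum(ranks) - n1 * (n1 + 1) // 2] += 1
    total = sum(counts.values())
    return {u: Fraction(c, total) for u, c in sorted(counts.items())}

def lower_critical_value(n1, n2, alpha=0.05):
    """Largest c with 2 * P(U1 <= c) <= alpha, or None if there is none."""
    cumulative = Fraction(0)
    critical = None
    for u, p in exact_u_distribution(n1, n2).items():
        cumulative += p
        if 2 * cumulative > alpha:
            break
        critical = u
    return critical

dist = exact_u_distribution(6, 6)
c1 = lower_critical_value(6, 6, alpha=0.05)
print(c1, 6 * 6 - c1)
```

Enumeration is only practical for small samples, because the number of rank assignments grows very quickly; this is one reason why tables and the normal approximation are used.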

Using the normal approximation

As we've seen above, it's important to know the distribution of the test statistic to find the critical values. Sometimes, however, it's much easier to use some approximations. In particular, the U statistic can be well approximated by the normal distribution if your samples have sufficiently many observations. Some sources say that even 5 observations per sample is fine, but, obviously, the more the better. The parameters of the normal distribution are the following:

mean: μ = n₁n₂ / 2 ; and

standard deviation: σ = √(n₁n₂(n₁ + n₂ + 1) / 12) .

Hence, the normalized test statistic:

z = (U₁ - μ) / σ

follows the standard normal distribution N(0,1) . Knowing the z-score , we can now use the p-value calculator and draw some conclusions.
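As an illustration, the normal approximation and the three alternatives discussed earlier can be combined in one small function. This is only a sketch: it applies no correction for ties or continuity, and the function name and the `alternative` labels are made up for this example.

```python
import math
from statistics import NormalDist

def normal_approx_p(u1, n1, n2, alternative="two-sided"):
    """Return (z, p) for U1 using the normal approximation."""
    mu = n1 * n2 / 2
    sigma = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u1 - mu) / sigma
    cdf = NormalDist().cdf(z)
    if alternative == "greater":      # A > B: unusually large U1 is evidence
        return z, 1 - cdf
    if alternative == "less":         # A < B: unusually small U1 is evidence
        return z, cdf
    if alternative == "two-sided":    # A != B: extreme in either direction
        return z, 2 * min(cdf, 1 - cdf)
    raise ValueError("alternative must be 'greater', 'less' or 'two-sided'")

z, p = normal_approx_p(34, 6, 6)
print(round(z, 3), p < 0.05)
```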

Hurray! You now know how to calculate the Mann-Whitney U test 🎉!

When to use the Mann-Whitney U test?

Use the Mann-Whitney U test to verify if two populations have equal medians whenever your samples are not normally distributed, or they have relatively few elements . Remember that, in such a case, you cannot use the t-test!

What is the difference between the Mann-Whitney U and Wilcoxon rank-sum tests?

The Mann-Whitney U and Wilcoxon rank-sum tests are, in fact, one and the same test . That's why people sometimes call it the Mann-Whitney-Wilcoxon test. Although you can encounter two different test statistics (sum of ranks and U-statistic), it turns out they are closely related. No matter which test statistic you use, the test's conclusions are going to be the same .

What are the assumptions of the Wilcoxon-Mann-Whitney test?

The Mann-Whitney U test has the following assumptions:

  • There is one independent variable that is a dichotomous variable, that is, it has two categories . For instance, sex (male/female) or trial (intervention/control). These categories are mutually exclusive.
  • The dependent variable should be measured on a continuous or ordinal scale . For instance, Likert items fulfill this assumption.
  • The last assumption of the Wilcoxon-Mann-Whitney test is that the observations have to be independent .
  • Importantly, your sample does not have to follow the normal distribution. This is the main advantage of the Mann-Whitney U test vs. t-test.

Statology

Statistics Made Easy

Introduction to Hypothesis Testing

A statistical hypothesis is an assumption about a population parameter .

For example, we may assume that the mean height of a male in the U.S. is 70 inches.

The assumption about the height is the statistical hypothesis and the true mean height of a male in the U.S. is the population parameter .

A hypothesis test is a formal statistical test we use to reject or fail to reject a statistical hypothesis.

The Two Types of Statistical Hypotheses

To test whether a statistical hypothesis about a population parameter is true, we obtain a random sample from the population and perform a hypothesis test on the sample data.

There are two types of statistical hypotheses:

The null hypothesis , denoted as H 0 , is the hypothesis that the sample data occurs purely from chance.

The alternative hypothesis , denoted as H 1 or H a , is the hypothesis that the sample data is influenced by some non-random cause.

Hypothesis Tests

A hypothesis test consists of five steps:

1. State the hypotheses. 

State the null and alternative hypotheses. These two hypotheses need to be mutually exclusive, so if one is true then the other must be false.

2. Determine a significance level to use for the hypothesis.

Decide on a significance level. Common choices are .01, .05, and .1. 

3. Find the test statistic.

Find the test statistic and the corresponding p-value. Often we are analyzing a population mean or proportion and the general formula to find the test statistic is: (sample statistic – population parameter) / (standard deviation of statistic)

4. Reject or fail to reject the null hypothesis.

Using the test statistic or the p-value, determine if you can reject or fail to reject the null hypothesis based on the significance level.

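The p-value is the probability of obtaining a test statistic at least as extreme as the one observed, assuming the null hypothesis is true. The smaller the p-value, the stronger the evidence against the null hypothesis. If the p-value is less than the significance level, we reject the null hypothesis.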

5. Interpret the results. 

Interpret the results of the hypothesis test in the context of the question being asked. 
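The five steps can be traced in a short sketch for the height example. All numbers below (sample mean, standard deviation, sample size) are hypothetical, and the population standard deviation is assumed to be known so that a simple z-test applies.

```python
import math
from statistics import NormalDist

def one_sample_z_test(sample_mean, mu0, sigma, n, alpha=0.05):
    """Two-tailed z-test of H0: mu = mu0 against Ha: mu != mu0."""
    standard_error = sigma / math.sqrt(n)            # sd of the sample mean
    z = (sample_mean - mu0) / standard_error         # step 3: test statistic
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))     # step 3: p-value
    reject = p_value < alpha                         # step 4: decision
    return z, p_value, reject

# Step 1: H0: mu = 70 inches, Ha: mu != 70 inches.
# Step 2: significance level alpha = 0.05.
z, p_value, reject = one_sample_z_test(
    sample_mean=70.8, mu0=70.0, sigma=3.0, n=100, alpha=0.05
)
# Step 5: interpret the decision in the context of the question.
print(round(z, 2), round(p_value, 4), reject)
```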

The Two Types of Decision Errors

There are two types of decision errors that one can make when doing a hypothesis test:

Type I error: You reject the null hypothesis when it is actually true. The probability of committing a Type I error is equal to the significance level, often called  alpha , and denoted as α.

Type II error: You fail to reject the null hypothesis when it is actually false. The probability of committing a Type II error is called beta, denoted as β. The power of the test is 1 − β, the probability of correctly rejecting a false null hypothesis.

One-Tailed and Two-Tailed Tests

A statistical hypothesis can be one-tailed or two-tailed.

A one-tailed hypothesis involves making a “greater than” or “less than” statement.

For example, suppose we assume the mean height of a male in the U.S. is greater than or equal to 70 inches. The null hypothesis would be H0: µ ≥ 70 inches and the alternative hypothesis would be Ha: µ < 70 inches.

A two-tailed hypothesis involves making an “equal to” or “not equal to” statement.

For example, suppose we assume the mean height of a male in the U.S. is equal to 70 inches. The null hypothesis would be H0: µ = 70 inches and the alternative hypothesis would be Ha: µ ≠ 70 inches.

Note: The “equal” sign is always included in the null hypothesis, whether it is =, ≥, or ≤.

Related:   What is a Directional Hypothesis?

Types of Hypothesis Tests

There are many different types of hypothesis tests you can perform depending on the type of data you’re working with and the goal of your analysis.

The following tutorials provide an explanation of the most common types of hypothesis tests:

  • Introduction to the One Sample t-test
  • Introduction to the Two Sample t-test
  • Introduction to the Paired Samples t-test
  • Introduction to the One Proportion Z-Test
  • Introduction to the Two Proportion Z-Test

Hey there. My name is Zach Bobbitt. I have a Master of Science degree in Applied Statistics and I’ve worked on machine learning algorithms for professional businesses in both healthcare and retail. I’m passionate about statistics, machine learning, and data visualization and I created Statology to be a resource for both students and teachers alike.  My goal with this site is to help you learn statistics through using simple terms, plenty of real-world examples, and helpful illustrations.

Mann-Whitney U Test Calculator

This is a simple Mann-Whitney U test calculator that provides a detailed breakdown of ranks, calculations, data and so on.

Further Information

The Mann-Whitney U test is a nonparametric test that allows two groups or conditions or treatments to be compared without making the assumption that values are normally distributed. So, for example, one might compare the speed at which two different groups of people can run 100 metres, where one group has trained for six weeks and the other has not.

Requirements

  • Two random, independent samples
  • The data is continuous - in other words, it must, in principle, be possible to distinguish between values at the nth decimal place
  • Scale of measurement should be ordinal, interval or ratio
  • For maximum accuracy, there should be no ties, though this test - like others - has a way to handle ties

Null Hypothesis

The null hypothesis asserts that the medians of the two populations from which the samples were drawn are identical.

Mann and Whitney U test

The Mann-Whitney U test, also called the Wilcoxon rank-sum test, is a non-parametric statistical hypothesis test used to analyze the difference between two independent samples of ordinal data. Given two randomly drawn samples, the test checks whether they come from the same population.

The assumption for Mann-Whitney U test:

  • All observations of both groups are independent of each other.
  • The values of the dependent variable should be at least ordinal (that is, they can be compared with each other and ranked from highest to lowest).
  • The independent variable should consist of two independent, categorical groups.
  • The recommended size of each sample is between 5 and 20.
  • The null hypothesis in the Mann-Whitney U test is always the same, i.e., there is no significant difference between the two samples.
  • The Mann-Whitney test can be applied to two distributions that need not be normal but should have the same shape. For example, if the distribution of one sample has a longer right tail, the distribution of the other sample should also have a longer right tail.

An advantage of the Mann-Whitney U test is that it is robust to outliers, because it is based on the ranks of the observations rather than on their raw values and mean.

Steps for Performing the Mann Whitney U test:

  • Collect two samples, sample 1 and sample 2.
  • Take the first observation from sample 1 and compare it with the observations in sample 2. Count the observations in sample 2 that are smaller than it, and count each observation equal to it as one half. For example, if 10 observations in sample 2 are smaller than the first observation in sample 1 and 2 are equal to it, the contribution of this observation is 10 + 2(1/2) = 11.
  • Repeat Step 2 for all observations in sample 1.
  • Add up all of the totals from Steps 2 and 3. This sum is a U statistic obtained by direct counting, not a rank sum.
  • Alternatively, we can calculate the U statistics from the rank sums using the following formulas:

U_1 = n_{1}n_{2} + \frac{n_{1}\left( n_{1}+1 \right)}{2} - R_{1}

U_2 = n_{1}n_{2} + \frac{n_{2}\left( n_{2}+1 \right)}{2} - R_{2}

  • n₁: number of observations in sample 1
  • n₂: number of observations in sample 2
  • R₁: rank sum of sample 1
  • R₂: rank sum of sample 2
  • Now, our test statistic U is the smaller of U₁ and U₂.
  • If U ≤ U₀, where U₀ is the critical value from the Mann-Whitney table, we reject the null hypothesis.
  • Otherwise, we do not reject the null hypothesis.

Examples: 

  • Suppose there is a test performed on the two batches of students and the results are below:
  • H 0 : There is no significant difference between batches.
  • H A : There is a significant difference between batches.
  • Here, our level of significance is 0.05
  • Now, we rank the observations of both batches together; if two observations have the same value, each receives the average of the ranks they occupy.
  • Now, we calculate the U statistics:

U_1 = 6 \cdot 6 + \frac{6 \cdot 7}{2} - 23 = 34

  • So, our test statistic is U = min(U₁, U₂) = min(34, 2) = 2.
  • Now, we look up the critical value in the U-statistic table below for n₁ = 6 and n₂ = 6 at the 0.05 level of significance. Here, our critical value is:

Mann-Whitney two tailed test

U_0 = 5

  • Here U = 2 ≤ U₀ = 5, so we reject the null hypothesis.
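The data table of the example is not reproduced here, so the value U₂ = 2 appears without derivation. It can be checked from R₁ = 23 alone, because the ranks 1 to 12 always sum to the same total:

```latex
\begin{aligned}
R_1 + R_2 &= \frac{(n_1 + n_2)(n_1 + n_2 + 1)}{2} = \frac{12 \cdot 13}{2} = 78
  \quad\Rightarrow\quad R_2 = 78 - 23 = 55, \\
U_2 &= n_1 n_2 + \frac{n_2 (n_2 + 1)}{2} - R_2 = 36 + 21 - 55 = 2, \\
U_1 + U_2 &= 34 + 2 = 36 = n_1 n_2 .
\end{aligned}
```

The last line is a useful sanity check: the two U statistics always add up to n₁n₂.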

Implementation:
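The original code listing is not included in this text. The following is a minimal pure-Python sketch of the steps above, not the article's own implementation. The scores are hypothetical and were chosen only so that the rank sum of batch 1 is 23, as in the example; the critical value 5 is the one quoted above.

```python
def average_ranks(values):
    """Ranks starting at 1; tied values receive the mean of the ranks they span."""
    positions = {}
    for position, value in enumerate(sorted(values), start=1):
        positions.setdefault(value, []).append(position)
    return [sum(positions[v]) / len(positions[v]) for v in values]

def mann_whitney_u(sample_1, sample_2):
    """Return (U1, U2, U) using the rank-sum formulas given above."""
    n1, n2 = len(sample_1), len(sample_2)
    ranks = average_ranks(list(sample_1) + list(sample_2))
    r1 = sum(ranks[:n1])   # rank sum of sample 1
    r2 = sum(ranks[n1:])   # rank sum of sample 2
    u1 = n1 * n2 + n1 * (n1 + 1) / 2 - r1
    u2 = n1 * n2 + n2 * (n2 + 1) / 2 - r2
    return u1, u2, min(u1, u2)

# Hypothetical test scores of the two batches (not the article's data).
batch_1 = [12, 15, 17, 19, 21, 26]
batch_2 = [23, 25, 28, 30, 31, 34]

u1, u2, u = mann_whitney_u(batch_1, batch_2)
critical_value = 5   # U0 for n1 = n2 = 6 at alpha = 0.05, two-tailed (quoted above)
reject_null = u <= critical_value
print(u1, u2, u, reject_null)
```

In practice a library routine such as SciPy's `scipy.stats.mannwhitneyu` would normally be used, since it also returns a p-value; note that libraries may report the U statistic under a different convention than the one used here.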

COMMENTS

  1. Mann Whitney U Test Explained

    The Mann Whitney U test is a nonparametric hypothesis test that compares two independent groups. Statisticians also refer to it as the Wilcoxon rank sum test. The Kruskal Wallis test extends this analysis so that it can compare more than two groups. If you're involved in data analysis or scientific research, you're likely familiar with the t-test.

  2. Mann-Whitney U Test

    4. Reject or fail to reject the null hypothesis. Using n 1 = 6 and n 2 = 6 with a significance level of .05, the Mann-Whitney U Table tells us that the critical value is 5:. Since our test statistic (13) is greater than our critical value (5), we fail to reject the null hypothesis.

  3. Mann-Whitney U test

    Mann-Whitney test (also called the Mann-Whitney-Wilcoxon (MWW/MWU), Wilcoxon rank-sum test, or Wilcoxon-Mann-Whitney test) is a nonparametric test of the null hypothesis that, for randomly selected values X and Y from two populations, the probability of X being greater than Y is equal to the probability of Y being greater than X.. Nonparametric tests used on two dependent samples are ...

  4. McNemar And Mann-Whitney U Tests

    All good research is based on a meticulous and well-designed question in the form of a hypothesis. To test this hypothesis, one must conduct an experiment with strict guidelines to obtain robust results. The results are then tested using statistics to examine its significance and conclude if a new treatment/ diagnostic modalities/biomarker is a better alternative to prevalent practice. Thus ...

  5. Mann Whitney U Test (Wilcoxon Rank Sum Test)

    The modules on hypothesis testing presented techniques for testing the equality of means in two independent samples. An underlying assumption for appropriate use of the tests described was that the continuous outcome was approximately normally distributed or that the samples were sufficiently large (usually n 1 > 30 and n 2 > 30) to justify their use based on the Central Limit Theorem.

  6. Mann-Whitney U Test: Assumptions and Example

    The Mann-Whitney U Test, also known as the Wilcoxon Rank Sum Test, is a non-parametric statistical test used to compare two samples or groups. In this article, we explore the basics of the Test and work through an example. ... It follows that the hypotheses in a Mann-Whitney U Test are: The null hypothesis (H0) is that the two populations are ...

  7. 13.5: Mann-Whitney U Test

    Let R1 = sum of ranks for group one and R2 = sum of ranks for group two. Find the U statistic for both groups: U1 = R1 − n1(n1 + 1)/2, U2 = R2 − n2(n2 + 1)/2. The test statistic U = Min(U1, U2) is the smaller of U1 or U2. Critical values are given in the tables in Figures 13-6 (α = 0.05) and 13-7 (α = 0.01).

  8. Mastering the Mann-Whitney U Test: A Comprehensive Guide

    The Mann-Whitney U Test is a non-parametric statistical test used to determine if there's a significant difference between two independent, non-normally distributed groups of data. ... Conduct a Hypothesis Test: Depending on the p-value from the U statistic (p < 0.05 is often used), reject or fail to reject the null hypothesis. The null ...

  9. Assumptions of the Mann-Whitney U test

    In order to run a Mann-Whitney U test, the following four assumptions must be met. The first three relate to your choice of study design, whilst the fourth reflects the nature of your data: Assumption #1: You have one dependent variable that is measured at the continuous or ordinal level. Examples of continuous variables include revision time ...

  10. Mann Whitney U Test: Definition, How to Run in SPSS

    The result of performing a Mann Whitney U Test is a U Statistic. For small samples, use the direct method (see below) to find the U statistic; for larger samples, a formula is necessary. Or, you can use technology like SPSS to run the test. Either of these two formulas is valid for the Mann Whitney U Test. R is the sum of ranks in the sample ...

  11. PDF Statistics: 2.3 The Mann-Whitney U Test

    Statistics: 2.3 The Mann-Whitney U Test Rosie Shier. 2004. 1 Introduction The Mann-Whitney U test is a non-parametric test that can be used in place of an unpaired t-test. It is used to test the null hypothesis that two samples come from the same population (i.e. have the same median) or, alternatively, whether observations in one

  12. Mann-Whitney U-Test • Simply explained

    Example with DATAtab. A Mann-Whitney U-Test can be easily calculated with DATAtab. Simply copy the table below or your own data into the statistics calculator and click on Hypothesis tests. Then click on the two variables and select Non-Parametric Test. Example data. Gender.

  13. Mann-Whitney U

    Mann-Whitney U-Test. The Mann-Whitney U-Test is a version of the independent samples t-Test that can be performed on ordinal (ranked) data. Ordinal data is displayed in the table below. Is there a difference between Treatment A and Treatment B using alpha = 0.05? Figure 1.

  14. Comprehensive tutorial

    Complete Hypothesis Testing Playlist - https://tinyurl.com/2bea2phz In this video we introduce the Mann-Whitney U test, a powerful non-parametric alternative ...

  15. Mann-Whitney U Test

    The Mann-Whitney U-test can be used when the aim is to show a difference between two groups in the value of an ordinal, interval or ratio variable. It is the non-parametric version of the t -test, which can be used for interval, ratio or continuous data unless there are large departures from the parametric assumptions.

  16. Hypothesis Testing

    Table of contents. Step 1: State your null and alternate hypothesis. Step 2: Collect data. Step 3: Perform a statistical test. Step 4: Decide whether to reject or fail to reject your null hypothesis. Step 5: Present your findings. Other interesting articles. Frequently asked questions about hypothesis testing.

  17. Mann-Whitney U Test Calculator

    As we've already mentioned, the null hypothesis of the Mann-Whitney U test says that the two populations (we'll refer to them as A and B) have the same distribution. Clearly, in such a case, the two populations have equal medians. Rejecting the null hypothesis means we have evidence that the population distributions are shifted with respect to each other, and so are their medians.

  18. Mann-Whitney U Test

    The null hypothesis is that the two groups have no significant difference in test scores. She noted the test scores of all the students and ranked the combined data. Then she calculated the U statistic and evaluated the critical value or p-value. ... Mann Whitney U test: It is a non-parametric test for comparing two independent groups, which is ...

  19. How to Report a Mann-Whitney U Test (With Example)

    Here is the exact wording we can use: A Mann-Whitney U test was performed to compare [response variable of interest] in [group 1] and [group 2]. There [was or was not] a significant difference in [response variable of interest] between [group1] and [group2]; z = [z-value], p = [p-value]. The following example shows how to report the results of ...

  20. Introduction to Hypothesis Testing

    A statistical hypothesis is an assumption about a population parameter.. For example, we may assume that the mean height of a male in the U.S. is 70 inches. The assumption about the height is the statistical hypothesis and the true mean height of a male in the U.S. is the population parameter.. A hypothesis test is a formal statistical test we use to reject or fail to reject a statistical ...

  21. Mann-Whitney U Test Calculator

    Further Information. The Mann-Whitney U test is a nonparametric test that allows two groups or conditions or treatments to be compared without making the assumption that values are normally distributed. So, for example, one might compare the speed at which two different groups of people can run 100 metres, where one group has trained for six weeks and the other has not.

  22. Mann-Whitney U test: Video, Anatomy & Definition

    The Mann-Whitney U test is a nonparametric statistic that is used to compare two independent samples.

  23. Mann and Whitney U test

    Mann and Whitney U test. Mann and Whitney's U-test, or Wilcoxon rank-sum test, is the non-parametric statistical hypothesis test used to analyze the difference between two independent samples of ordinal data. In this test, we are given two randomly drawn samples and must verify whether the two samples come from the same population.
