Sunday, 20 October 2013

Mann-Whitney U Test Calculator

The Mann-Whitney U test is a non-parametric test used to determine whether two independent groups of data differ, without assuming that the data are normally distributed. It is a robust test, and is widely used in many social sciences, including quantitative psychology. For more details, have a look at the following post, or refer to an appropriate textbook on the subject.

As is common with hypothesis testing in general, we start out with a Null Hypothesis, which can be thought of as our default assumption. The Null Hypothesis for the Mann-Whitney U test is that the two groups of data come from the same distribution, i.e. that they are not different. Based on the U statistic, which is calculated from the data, we determine whether or not to reject the Null Hypothesis. If we have a large enough number of samples, we can use the calculated p-value as the basis for this decision. The p-value is the probability of obtaining the observed difference between the two groups, or a more extreme one, if the Null Hypothesis were true. If the p-value is below a chosen threshold (commonly 0.05), we reject the Null Hypothesis and the result is considered significant. On the other hand, if the p-value is greater than 0.05, we fail to reject the Null Hypothesis. This does not prove that the two groups are the same, only that the data do not provide sufficient evidence of a difference. One should exercise caution when interpreting the result, but a very low p-value could merit further examination of the data, in terms of the possible causes of the significant difference.
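To make the procedure concrete, here is a minimal Python sketch of how a U statistic and a two-sided p-value could be computed. It is not the code behind the calculator on this page. The function names are made up for illustration, the p-value uses the large-sample normal approximation, and no tie or continuity correction is applied, so other tools may report slightly different values.

```python
import math


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Find the end of the run of values tied with position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j are 0-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def mann_whitney_u(group1, group2):
    """Return (U, p) using the normal approximation (an assumption of this sketch)."""
    n1, n2 = len(group1), len(group2)
    ranks = average_ranks(list(group1) + list(group2))
    rank_sum1 = sum(ranks[:n1])
    u1 = rank_sum1 - n1 * (n1 + 1) / 2
    u2 = n1 * n2 - u1
    u = min(u1, u2)
    # Mean and standard deviation of U under the Null Hypothesis (no tie correction).
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mean_u) / sd_u
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided p-value
    return u, p


u, p = mann_whitney_u([1, 2, 3, 4, 5], [6, 7, 8, 9, 10])
print(u, p, "significant" if p < 0.05 else "not significant")
```

With small groups such as these, comparing U against a table of critical values is generally preferred to the approximate p-value.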

Below is an online calculator for the Mann-Whitney U test. Please enter the Group 1 and Group 2 values as comma-separated numbers in the fields below. Group 1 and Group 2 can have different numbers of samples, but there must be at least 5 samples in each of the two groups for the test results to be valid.
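As an illustration of the input rules above, the following Python sketch parses a comma-separated field and enforces the minimum of 5 samples. The function name and the error message are hypothetical, and the calculator's own validation may behave differently.

```python
def parse_group(text, min_samples=5):
    """Parse comma-separated numbers and check the minimum group size."""
    # Empty tokens (e.g. from a trailing comma) are skipped; float() ignores
    # surrounding whitespace and raises ValueError for non-numeric tokens.
    values = [float(token) for token in text.split(",") if token.strip()]
    if len(values) < min_samples:
        raise ValueError(
            f"need at least {min_samples} samples, got {len(values)}"
        )
    return values


group1 = parse_group("12, 15, 11, 18, 14, 16")
group2 = parse_group("22,19,25,21,24")
print(len(group1), len(group2))
```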

Alternatively, you can load the data from a two-column CSV file: simply press the choose file button, which sits below the buttons for clearing Group 1 and Group 2. To reload the same file after clearing the text areas, you need to reload this webpage. The two columns need not be the same length, but the smaller dataset must be in the second column, so that the rows near the bottom of the CSV file have only one entry instead of two.
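The CSV layout described above could be read along the following lines. This is a sketch with a hypothetical function name. It assumes the file has no header row and that every non-blank row has a value in the first column.

```python
import csv
import io


def read_two_column_csv(text):
    """Split two-column CSV text into (group1, group2); column 2 may be shorter."""
    group1, group2 = [], []
    for row in csv.reader(io.StringIO(text)):
        if not any(cell.strip() for cell in row):
            continue  # skip blank lines
        group1.append(float(row[0]))
        # Rows near the bottom may have no second entry.
        if len(row) > 1 and row[1].strip():
            group2.append(float(row[1]))
    return group1, group2


sample = "1.5,10\n2.5,20\n3.5,\n4.5\n"
print(read_two_column_csv(sample))
```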

Below the calculator there is a graph that will show a scatter plot of the samples for both groups once the button to perform the Mann-Whitney U test is pressed. It is useful for visualising the spread of the samples in the two groups. For example, if the samples of the two groups are spaced apart with no overlap, then we would expect the p-value to be very low, or the calculated U statistic to be below the critical value, so that the result is significant.
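The link between overlap and the U statistic can be seen by computing U directly from pairwise comparisons, which is equivalent to the usual rank-based calculation. The short Python sketch below uses a hypothetical function name and counts ties as half a win. It shows that fully separated groups give U = 0, the smallest possible value.

```python
def u_by_pair_counting(group1, group2):
    """Compute U by comparing every Group 1 sample with every Group 2 sample."""
    wins1 = 0.0
    for x in group1:
        for y in group2:
            if x > y:
                wins1 += 1.0
            elif x == y:
                wins1 += 0.5  # a tie counts as half a win for each group
    wins2 = len(group1) * len(group2) - wins1
    return min(wins1, wins2)


# No overlap: every Group 2 sample exceeds every Group 1 sample.
print(u_by_pair_counting([1, 2, 3, 4, 5], [6, 7, 8, 9, 10]))  # 0.0
# Interleaved samples give a much larger U.
print(u_by_pair_counting([1, 3, 5, 7, 9], [2, 4, 6, 8, 10]))  # 10.0
```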

At the bottom there is a graph that will contain separate histograms for both the groups. It is even more informative to visualise the spread of samples of the two groups in this manner, as it may yield extra information that a researcher would find useful.


Group 1 values: (enter comma separated numbers)
Group 2 values: (enter comma separated numbers)






Results pending...



Scatter plot (values against group number)

Histograms (frequency against values)