Mann-Whitney U Test (Wilcoxon rank-sum test) in Python
-
1. Concept:
- A Mann-Whitney U test (sometimes called the Wilcoxon rank-sum test) is used to compare two independent samples when the sample distributions are not normally distributed and the sample sizes are small (n < 30). It is considered to be the nonparametric equivalent of the two-sample t-test.
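To make the idea of a rank-based comparison concrete, here is a minimal sketch of how the U statistic can be computed by hand from pooled ranks. The helper name `u_statistic` is made up for illustration, and the two lists are the mpg values that appear in the example later in this post.

```python
import numpy as np
from scipy.stats import rankdata, mannwhitneyu


def u_statistic(x, y):
    """Return (U_x, U_y) from pooled ranks; tied values share an average rank."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    ranks = rankdata(np.concatenate([x, y]))   # rank all observations together
    rank_sum_x = ranks[: len(x)].sum()         # rank sum of the first sample
    u_x = rank_sum_x - len(x) * (len(x) + 1) / 2
    u_y = len(x) * len(y) - u_x                # U_x + U_y always equals n_x * n_y
    return u_x, u_y


group1 = [20, 23, 21, 25, 18, 17, 18, 24, 20, 24, 23, 19]
group2 = [24, 25, 21, 22, 23, 18, 17, 28, 24, 27, 21, 23]

u1, u2 = u_statistic(group1, group2)
scipy_u = mannwhitneyu(group1, group2, alternative="two-sided").statistic
print("U for group1:", u1, "| U for group2:", u2, "| SciPy statistic:", scipy_u)
```

A U value far from half of n1 * n2 (here 72) means one group tends to rank higher than the other; the test then asks whether that gap is larger than chance would produce.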
-
2. Where to use:
Here are some examples of when you might use a Mann-Whitney U test:
- You want to compare the salaries of five graduates from university A vs. the salaries of five graduates from university B. The salaries are not normally distributed.
- You want to know if weight loss varies for two groups: 12 people using diet A and 10 people using diet B. The weight loss is not normally distributed.
- You want to know if the scores of 8 students in class A differ from those of 7 students in class B. The scores are not normally distributed.
-
In each example you have two groups that you want to compare, the sample distributions are not normal, and the sample sizes are small.
Thus, a Mann-Whitney U test is appropriate as long as the following assumptions are met.
-
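One way to back up the claim that a sample is "not normally distributed" is a Shapiro-Wilk test. The sketch below is only an illustration, not part of the original procedure; it uses the two mpg lists from the example later in this post and an assumed 0.05 threshold. Keep in mind that with samples this small the test has little power, so a large p-value does not prove normality.

```python
from scipy.stats import shapiro

# Illustrative data: the two mpg samples used later in this post.
samples = {
    "group1": [20, 23, 21, 25, 18, 17, 18, 24, 20, 24, 23, 19],
    "group2": [24, 25, 21, 22, 23, 18, 17, 28, 24, 27, 21, 23],
}

normality = {}
for name, values in samples.items():
    w_stat, p_value = shapiro(values)  # H0: the sample comes from a normal distribution
    normality[name] = (w_stat, p_value)
    verdict = "normality is doubtful" if p_value < 0.05 else "no evidence against normality"
    print(f"{name}: W = {w_stat:.3f}, p = {p_value:.3f} ({verdict})")
```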
3. Assumptions of the Mann-Whitney U Test
-
Before you conduct a Mann-Whitney U test, you need to make sure the following assumptions are met:
- Ordinal or Continuous: The variable you’re analyzing is ordinal or continuous. Examples of ordinal variables include Likert items (e.g., a 5-point scale from “strongly disagree” to “strongly agree”). Examples of continuous variables include height (measured in inches), weight (measured in pounds), or exam scores (measured from 0 to 100).
- Independence: All of the observations from both groups are independent of each other.
- Shape: The shapes of the distributions for the two groups are roughly the same.
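The shape assumption is usually judged by eye (for example with histograms), but a rough numeric comparison can help. The sketch below is an informal check under my own choice of summaries (median, interquartile range, skewness); the helper `shape_summary` is hypothetical, and the lists are the mpg values from the example later in this post.

```python
import numpy as np
from scipy.stats import skew


def shape_summary(values):
    """Summarise location, spread and asymmetry of one sample."""
    values = np.asarray(values, dtype=float)
    q1, q3 = np.percentile(values, [25, 75])
    return {
        "median": float(np.median(values)),
        "iqr": float(q3 - q1),          # spread of the middle 50% of the data
        "skew": float(skew(values)),    # sign shows the direction of the longer tail
    }


group1 = [20, 23, 21, 25, 18, 17, 18, 24, 20, 24, 23, 19]
group2 = [24, 25, 21, 22, 23, 18, 17, 28, 24, 27, 21, 23]

summary1 = shape_summary(group1)
summary2 = shape_summary(group2)
print("group1:", summary1)
print("group2:", summary2)
```

If the spread and skewness of the two groups are similar while the medians differ, the distributions plausibly have the same shape and differ mainly in location.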
-
4. Data:
group1 = [20, 23, 21, 25, 18, 17, 18, 24, 20, 24, 23, 19]
group2 = [24, 25, 21, 22, 23, 18, 17, 28, 24, 27, 21, 23]
-
5. Example: Mann-Whitney U Test in Python
Researchers want to know if a fuel treatment leads to a change in the average mpg of a car. To test this, they measure the mpg of 12 cars with the fuel treatment and 12 cars without it.
Since the sample sizes are small and the researchers suspect that the sample distributions are not normally distributed, they decided to perform a Mann-Whitney U test to determine if there is a statistically significant difference in mpg between the two groups.
Perform the following steps to conduct a Mann-Whitney U test in Python.
# A Mann-Whitney U test (sometimes called the Wilcoxon rank-sum test) is
# used to compare two independent samples when the sample distributions
# are not normally distributed and the sample sizes are small (n < 30).
# It is considered to be the nonparametric equivalent of the two-sample t-test.
# Requirements: pip install scipy pandas openpyxl
import pandas as pd
import scipy.stats as stats

# Step 1: Read the data from Excel (sheet '1', column C for group 1, column D for group 2).
excel_path = 'C:/Users/Mr.R/Desktop/excels/zm.xlsx'
# .iloc[:, 0] takes the single column that was read, so each group is a 1-D array
# rather than an (n, 1) column; dropna() removes empty cells.
group1 = pd.read_excel(excel_path, sheet_name='1', usecols='C').iloc[:, 0].dropna().to_numpy()
group2 = pd.read_excel(excel_path, sheet_name='1', usecols='D').iloc[:, 0].dropna().to_numpy()
print(group1, group2)
# Alternatively, create two lists to hold the mpg values for each group of cars:
# group1 = [20, 23, 21, 25, 18, 17, 18, 24, 20, 24, 23, 19]
# group2 = [24, 25, 21, 22, 23, 18, 17, 28, 24, 27, 21, 23]

# Step 2: Perform the Mann-Whitney U test.
Mann_Whitney_U_test = stats.mannwhitneyu(group1, group2, alternative='two-sided')
print("\n Mann_Whitney_U_test :\n", Mann_Whitney_U_test)

# Step 3: Interpret the results.
# In this example, the Mann-Whitney U Test uses the following null and alternative hypotheses:
# H0: The mpg is equal between the two groups
# HA: The mpg is not equal between the two groups
# If the p-value is less than 0.05, we reject the null hypothesis and conclude
# that the mpg differs between the two groups.
# If the p-value is not less than 0.05, we fail to reject the null hypothesis:
# we do not have sufficient evidence to say that the mpg differs between the two groups.
-
Test result:
-
Interpretation:
In this example, the Mann-Whitney U Test uses the following null and alternative hypotheses:
- H0: The mpg is equal between the two groups
- HA: The mpg is not equal between the two groups
-
Since the p-value (0.03772318) is less than 0.05, we reject the null hypothesis. We have sufficient evidence to say that the mpg differs between the two groups.
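For readers without the Excel file, the following is a minimal, self-contained sketch of the same test and decision rule using the two inline mpg lists from the Data section. It assumes a significance level of 0.05. The inline lists are not guaranteed to match the contents of the Excel file, so the printed p-value need not match the one reported above.

```python
from scipy.stats import mannwhitneyu

# mpg values from the Data section of this post
group1 = [20, 23, 21, 25, 18, 17, 18, 24, 20, 24, 23, 19]
group2 = [24, 25, 21, 22, 23, 18, 17, 28, 24, 27, 21, 23]

alpha = 0.05  # assumed significance level

# Two-sided test: H0 says the mpg is equal between the two groups.
result = mannwhitneyu(group1, group2, alternative="two-sided")

decision = "reject H0" if result.pvalue < alpha else "fail to reject H0"
print(f"U = {result.statistic}, p-value = {result.pvalue:.4f} -> {decision}")
```

Writing the decision as an explicit comparison against `alpha` keeps the interpretation tied to whatever data is actually passed in, rather than to a p-value copied from another run.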