Inconsistence in W test statistic #951

Open
maximelepetit opened this issue Jul 18, 2024 · 2 comments
@maximelepetit

I would like to thank you for this very interesting package.

I need help interpreting and clarifying certain values.

I calculated an apoptosis score for two cell samples, BASE cells and LPS cells, and I would like to see whether there is a statistically significant difference between the two groups.
Group 1 (LPS) contains 8126 cells and group 2 (BASE) contains 7942 cells.

Naively, I first ran a Wilcoxon test between these two groups.

# Extract data
apoptosis_data <- FetchData(neurons_v5_cb_subset_neurons_silvia, vars = c("ApoptosisScore1", "orig.ident"))
rownames(apoptosis_data) <- NULL
head(apoptosis_data)
  ApoptosisScore1 orig.ident
            <dbl>      <chr>
1      0.04673351       BASE
2      0.03632951       BASE
3      0.05176500       BASE
4      0.04276227       BASE
5      0.03331517       BASE
6      0.03697204       BASE
group1 <- apoptosis_data[apoptosis_data$orig.ident == "LPS", "ApoptosisScore1"]
length(group1)
8126

and

group2 <- apoptosis_data[apoptosis_data$orig.ident == "BASE", "ApoptosisScore1"]
length(group2)
7942

Then I performed the Wilcoxon rank sum test:

wilcox_test <- wilcox.test(group1, group2)
print(wilcox_test)

This gives:

	Wilcoxon rank sum test with continuity correction

data:  group1 and group2
W = 42939709, p-value < 2.2e-16
alternative hypothesis: true location shift is not equal to 0

Conclusion:
The p-value (< 2.2e-16) indicates a statistically significant difference in the apoptosis scores between the two groups, so the null hypothesis that the scores in the two groups follow the same distribution can be rejected.

Then I discovered the ggstatsplot package.

After reading the documentation, I decided to use the ggbetweenstats function to compare the two groups. According to the documentation, a non-parametric comparison of 2 groups uses the Mann-Whitney U test via [stats::wilcox.test()](https://rdrr.io/r/stats/wilcox.test.html).
I therefore set type = "nonparametric" in order to reproduce the p-value obtained previously.

Here is the code I used:

p <- ggbetweenstats(
  data  = apoptosis_data,
  x     = orig.ident,
  y     = ApoptosisScore1,
  type = "nonparametric",
  ylab = "Apoptosis score",
  xlab = "Condition",
  title = "Distribution of Apoptosis Score across condition"
) 

This gives:
[Image: comparaison_lps_base_withoutggsignif]

I am wondering why the test statistic (W) differs between the two approaches: wilcox.test reports W = 42939709, whereas the plot reports 2.16e+07.

I need help!

Thanks.

Maxime

@IndrajeetPatil
Owner

It's hard for me to look into this without a reproducible example.

@maximelepetit
Author

Here are the code and the data ;)
issue_951_ggstatplot.zip
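
One possible explanation for the two W values, not confirmed by the maintainer in this thread: in stats::wilcox.test, W is computed for the first sample, so swapping the two groups turns W into n1 * n2 - W while leaving the p-value unchanged. The manual call above passes LPS first. The assumption here is that the plot's test goes through the formula interface, where the first factor level (alphabetically, "BASE") comes first. A minimal sketch with simulated data (the variable names and values below are illustrative, not the scores from the attached zip):

```r
# Simulated data only; these are NOT the scores from the issue.
set.seed(42)
lps  <- rnorm(40, mean = 0.5)
ctrl <- rnorm(30, mean = 0.0)

# W counts the pairs in which the first argument's value exceeds the second's.
w_lps_first  <- unname(wilcox.test(lps, ctrl)$statistic)
w_ctrl_first <- unname(wilcox.test(ctrl, lps)$statistic)

# The two statistics always add up to n1 * n2.
stopifnot(w_lps_first + w_ctrl_first == length(lps) * length(ctrl))

# The formula interface takes the first factor level ("BASE") as x.
scores <- data.frame(
  score = c(lps, ctrl),
  group = factor(rep(c("LPS", "BASE"), times = c(length(lps), length(ctrl))))
)
w_formula <- unname(wilcox.test(score ~ group, data = scores)$statistic)
stopifnot(w_formula == w_ctrl_first)

# The p-value does not depend on the argument order.
stopifnot(isTRUE(all.equal(
  wilcox.test(lps, ctrl)$p.value,
  wilcox.test(ctrl, lps)$p.value
)))

# Same arithmetic with the numbers reported in this issue.
n_lps      <- 8126
n_base     <- 7942
w_reported <- 42939709
w_swapped  <- n_lps * n_base - w_reported
print(w_swapped)  # 21596983
```

With the sample sizes from this issue, 8126 * 7942 - 42939709 = 21596983, which rounds to the 2.16e+07 shown on the plot. If the assumption holds, both results describe the same test and differ only in which group is treated as the first sample.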
