![](https://img.haomeiwen.com/i22388565/a8211275252f66c4.png)
Richard Bettis from the University of North Carolina best summed up p-hacking as “the hunt for asterisks.”
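Simply put, when you run many significance tests, any one significant result may well have arisen by chance: multiple testing raises the probability of a purely random hit.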
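To put a number on this, here is a minimal sketch of the family-wise error rate. It assumes the tests are independent and that every null hypothesis is true; the function name `familywise_error_rate` and the alpha of 0.05 are illustrative choices, not something from the sources cited here.

```python
def familywise_error_rate(num_tests: int, alpha: float = 0.05) -> float:
    """Chance of at least one false positive among `num_tests` independent
    tests when every null hypothesis is true (an idealised assumption)."""
    if num_tests < 0:
        raise ValueError("num_tests must be non-negative")
    # Each test avoids a false positive with probability (1 - alpha),
    # so all of them avoid one with probability (1 - alpha) ** num_tests.
    return 1.0 - (1.0 - alpha) ** num_tests


for m in (1, 5, 10, 20):
    rate = familywise_error_rate(m)
    print(f"{m:>2} tests -> P(at least one false positive) = {rate:.3f}")
```

Under these assumptions, ten tests at alpha = 0.05 already give roughly a 40% chance of at least one "significant" result from noise alone.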
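When many tests are run in one go, everyone knows to correct the p-values. But what if the tests are run one at a time?
For example, let A through Z be variable names, and suppose I run significance tests on pairs of variables. On January 1 I test A against B and find no significant difference. On February 2 I test A against C and again find nothing. ... On October 10 I test K against P and the result is significant! Is this a single significant test? No. It is really 10 multiple tests. Even if a long time passed in between, or I changed my mind many times along the way, as long as the sample has not changed, this is still multiple testing.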
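The one-test-at-a-time scenario can be checked with a small simulation. This is only a sketch under stated assumptions: all 26 variables are pure standard-normal noise (so every null is true), the pairs to test are picked at random, and a two-sample z-test with known variance 1 stands in for whatever test would actually be used. The helper names `two_sample_z_pvalue` and `chance_of_any_hit` are made up for illustration.

```python
import itertools
import random
import string
from statistics import NormalDist, fmean


def two_sample_z_pvalue(x, y):
    """Two-sided z-test for equal means, assuming both samples have variance 1."""
    z = (fmean(x) - fmean(y)) / ((1 / len(x) + 1 / len(y)) ** 0.5)
    return 2 * (1 - NormalDist().cdf(abs(z)))


def chance_of_any_hit(num_tests=10, n=20, alpha=0.05, num_sims=2000, seed=0):
    """Fraction of simulated studies in which at least one of `num_tests`
    pairwise tests on the same sample comes out 'significant'."""
    rng = random.Random(seed)
    names = string.ascii_uppercase  # variables A..Z
    pairs = list(itertools.combinations(names, 2))
    hits = 0
    for _ in range(num_sims):
        # One fixed sample of pure noise: no real difference exists anywhere.
        data = {v: [rng.gauss(0, 1) for _ in range(n)] for v in names}
        # The tests may be months apart; what matters is the shared sample.
        tested = rng.sample(pairs, num_tests)
        if any(two_sample_z_pvalue(data[a], data[b]) < alpha for a, b in tested):
            hits += 1
    return hits / num_sims


rate = chance_of_any_hit()
print(f"P(at least one significant pair in 10 tests) ~ {rate:.2f}")
```

The simulated rate should land well above the nominal 5%, no matter how much calendar time separates the individual tests.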
P-hacking does a great deal of harm!
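Ways to correct for it:
1) Adjust the p-values or the alpha level (multiple-testing corrections such as Benjamini-Hochberg (BH) FDR control, ...)
2) Cross-validation
3) Honestly report in the paper every statistical test you ran
4) Use Bayesian methods, which turns the problem into "choosing a suitable prior"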
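As an illustration of option 1, the sketch below adjusts a list of p-values with the Bonferroni correction and with the Benjamini-Hochberg step-up procedure, using only the standard library. The ten p-values are invented for the example, and the function names are hypothetical; in practice a maintained statistics library would usually be the safer choice.

```python
def bonferroni(pvalues):
    """Bonferroni-adjusted p-values: multiply by the number of tests, cap at 1."""
    m = len(pvalues)
    return [min(1.0, p * m) for p in pvalues]


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values (controls the false discovery rate)."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices, smallest p first
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotone adjusted values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Made-up p-values from ten tests on the same sample (illustration only).
raw = [0.001, 0.008, 0.039, 0.041, 0.042, 0.060, 0.074, 0.205, 0.212, 0.216]
alpha = 0.05

for label, values in [("raw", raw),
                      ("Bonferroni", bonferroni(raw)),
                      ("BH", benjamini_hochberg(raw))]:
    count = sum(p < alpha for p in values)
    print(f"{label:>10}: {count} of {len(values)} below alpha={alpha}")
```

With these invented numbers, several raw p-values fall below 0.05, but most of them no longer do after either adjustment. Bonferroni is the stricter of the two.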
The following simple strategies have been suggested to handle multiple comparisons:
• Readers should evaluate the quality of the study and the actual effect size instead of focusing only on statistical significance
• Results from single studies should not be used to make treatment decisions; instead, one should look for scientific plausibility and supporting data from other studies which can validate the results of the original study
• Authors should try to limit comparisons between groups and identify a single primary endpoint; using a composite endpoint or global assessment tool is also an acceptable alternative to using multiple endpoints.
References:
https://xkcd.com/882/
http://www.howsci.com/p-hacking.html
Ranganathan, P., Pramesh, C. S., & Buyse, M. (2016). Common pitfalls in statistical analysis: The perils of multiple testing. Perspectives in clinical research, 7(2), 106–107. https://doi.org/10.4103/2229-3485.179436