Reproducing a high-impact (IF = 11.274) Mendelian randomization paper-da

Author: rapunzel0103 | Published: 2022-11-23 17:19

    Without further ado, here is the code.

    library(TwoSampleMR)
    library(data.table)
    
    #step 1. read exposure data (clumping is done separately in step 2)
    exposure_dat <- read_exposure_data(
      filename = "met-a-746_MR_format.txt", clump = FALSE, sep = "\t",
      phenotype_col = "Phenotype", snp_col = "SNP", beta_col = "beta", se_col = "se",
      effect_allele_col = "effect_allele", other_allele_col = "other_allele",
      pval_col = "pval", samplesize_col = "samplesize", eaf_col = "eaf"
    )
    
    #step 2. clump the exposure data; parameters follow the article (package defaults are clump_r2 = 0.001, clump_kb = 10000)
    exposure_dat_clump <- clump_data(exposure_dat, clump_kb = 500, clump_r2 = 0.1, pop = "EUR")
    
    #step 3. read outcome data as a plain data.frame and keep only the clumped exposure SNPs
    outcome_data <- fread("Alzheimer_GWAS_summary_example.txt", data.table = FALSE)
    outcome_dat <- format_data(
      dat = outcome_data, type = "outcome", snps = exposure_dat_clump$SNP, header = TRUE,
      phenotype_col = "Phenotype", snp_col = "SNP", beta_col = "beta", se_col = "se",
      effect_allele_col = "effect_allele", other_allele_col = "other_allele",
      pval_col = "pval", samplesize_col = "samplesize", eaf_col = "eaf"
    )
    
    #step 4. harmonise
    dat <- harmonise_data(exposure_dat_clump, outcome_dat)
    
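    #step 5. calculate the F-statistic for each SNP
    # minor allele frequency: the smaller of the effect-allele frequency and its complement
    dat$EAF2 <- (1 - dat$eaf.exposure)
    dat$MAF <- pmin(dat$eaf.exposure, dat$EAF2)
    # proportion of exposure variance explained (PVE) by a single SNP
    PVEfx <- function(BETA, MAF, SE, N){
      pve <- (2*(BETA^2)*MAF*(1 - MAF))/((2*(BETA^2)*MAF*(1 - MAF)) + ((SE^2)*2*N*MAF*(1 - MAF)))
      return(pve)
    }
    # the function is vectorised, so it can be applied to whole columns directly
    dat$PVE <- PVEfx(dat$beta.exposure, dat$MAF, dat$se.exposure, dat$samplesize.exposure)
    # F = ((N - k - 1)/k) * PVE/(1 - PVE), with k = 1 instrument per SNP
    dat$FSTAT <- ((dat$samplesize.exposure - 1 - 1)/1)*(dat$PVE/(1 - dat$PVE))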
    
    #step 6. heterogeneity test; the Inverse variance weighted Q p-value is 0.3858222 (> 0.05), so the Inverse variance weighted (fixed effects) method is chosen
    mr_results_het <- mr_heterogeneity(dat)
    
    #step 7. MR analysis using Inverse variance weighted (fixed effects) method
    res <- mr(dat, method_list = c("mr_ivw_fe"))
    
    #step 8. Add OR and CI information
    res <- generate_odds_ratios(res)
    
    
    

    Selected results

    step 5.

    Every SNP has an F-statistic above 10, so all of them are kept in the analysis.


    Step 5 results
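The step 5 calculation can be tried without any GWAS files. The sketch below applies the same PVE and F formulas to three made-up SNPs and keeps those with F > 10. The SNP names, effect sizes and sample size are invented for illustration and are not taken from the article or from met-a-746.

```r
# Toy per-SNP summary statistics (made-up values, for illustration only)
toy <- data.frame(
  SNP  = c("rs_a", "rs_b", "rs_c"),
  beta = c(0.10, -0.08, 0.02),
  se   = c(0.010, 0.012, 0.015),
  eaf  = c(0.30, 0.85, 0.50),
  n    = c(7800, 7800, 7800)
)

# Proportion of variance explained by one SNP (same formula as step 5)
pve_fx <- function(beta, maf, se, n) {
  num <- 2 * beta^2 * maf * (1 - maf)
  num / (num + se^2 * 2 * n * maf * (1 - maf))
}

toy$maf <- pmin(toy$eaf, 1 - toy$eaf)
toy$pve <- pve_fx(toy$beta, toy$maf, toy$se, toy$n)

# F = ((N - k - 1) / k) * PVE / (1 - PVE), with k = 1 instrument
toy$fstat <- (toy$n - 2) * toy$pve / (1 - toy$pve)

# Keep only instruments above the conventional F > 10 threshold
strong <- toy[toy$fstat > 10, ]
print(strong[, c("SNP", "pve", "fstat")])
```

Because the MAF terms cancel in this PVE formula, the per-SNP F reduces to (N - 2)/N times (beta/se)^2, so it is close to the squared z-score of the SNP-exposure association.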

    step 6. Heterogeneity test results

    p = 0.3858222 > 0.05, so the Inverse variance weighted (fixed effects) method is chosen; if p < 0.05, the Inverse variance weighted (multiplicative random effects) method would be chosen instead.


    Step 6 results
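To make the decision rule concrete, here is a minimal base-R sketch of Cochran's Q for an IVW estimate built from per-SNP Wald ratios with first-order weights, followed by the same p > 0.05 choice between `mr_ivw_fe` and `mr_ivw_mre`. The four SNP effects are invented, and the sketch is meant to illustrate the statistic rather than to reproduce the internals of `mr_heterogeneity()`.

```r
# Made-up harmonised effects for four SNPs (illustration only)
bx  <- c(0.10, 0.08, 0.12, 0.09)          # SNP-exposure effects
by  <- c(-0.040, -0.028, -0.050, -0.030)  # SNP-outcome effects
sey <- c(0.015, 0.014, 0.018, 0.016)      # SEs of the SNP-outcome effects

ratio <- by / bx                  # per-SNP Wald ratio estimates
w     <- (bx / sey)^2             # first-order inverse-variance weights
b_ivw <- sum(w * ratio) / sum(w)  # IVW point estimate

q_stat <- sum(w * (ratio - b_ivw)^2)  # Cochran's Q
q_df   <- length(ratio) - 1
q_pval <- pchisq(q_stat, df = q_df, lower.tail = FALSE)

# Decision rule described in step 6
method <- if (q_pval > 0.05) "mr_ivw_fe" else "mr_ivw_mre"
cat(sprintf("Q = %.3f, df = %d, p = %.3f -> %s\n", q_stat, q_df, q_pval, method))
```

In the real pipeline the p-value would come from the `Q_pval` column of `mr_results_het`, and the selected name would be passed to `mr()` through `method_list`.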

    step 7. Inverse variance weighted (fixed effects) results

    Step 7 and step 8 results
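Step 8 turns the log-odds estimate into an odds ratio with a 95% confidence interval. A rough base-R equivalent is sketched below. The estimate and standard error are hypothetical, not the article's values, and the variable names simply mirror the `or`, `or_lci95` and `or_uci95` columns that `generate_odds_ratios()` adds.

```r
# Hypothetical log-odds estimate and standard error (not the article's values)
b  <- -0.25
se <- 0.10

or       <- exp(b)              # odds ratio
or_lci95 <- exp(b - 1.96 * se)  # lower 95% confidence limit
or_uci95 <- exp(b + 1.96 * se)  # upper 95% confidence limit

# Two-sided Wald test p-value for the same estimate
pval <- 2 * pnorm(abs(b / se), lower.tail = FALSE)

print(round(c(OR = or, lower = or_lci95, upper = or_uci95, p = pval), 4))
```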

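    Comparison with the original article: Table S6 in the article's supplementary material contains the MR results for every metabolite against Alzheimer's disease, and its last row is the metabolite tested here. The reproduction and the article agree on every value: the Q estimate is 5.25, the P value for the Q estimate is 0.39, the fixed-effect model is selected, the OR is 0.69, the 95% CI is 0.57-0.84, and the P value is 1.98×10^-4.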

    Reproduction complete, which is great news. The next step is to loop the same analysis over all metabolites.
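One way to organise that loop is sketched below. `run_all()` is a hypothetical helper, not part of TwoSampleMR. It applies a per-file analysis function to every exposure file and records failures instead of stopping at the first error. `fake_analysis()` and the file names are stand-ins. A real version would wrap steps 1 to 8 and return the `res` data frame.

```r
# Add any missing columns as NA so that all rows share the same layout
pad_cols <- function(d, cols) {
  for (m in setdiff(cols, names(d))) d[[m]] <- NA
  d[, cols, drop = FALSE]
}

# Run one analysis function over many exposure files and stack the results
run_all <- function(files, analyse_one) {
  out <- lapply(files, function(f) {
    tryCatch(
      data.frame(file = f, analyse_one(f), error = NA_character_,
                 stringsAsFactors = FALSE),
      error = function(e) {
        data.frame(file = f, error = conditionMessage(e),
                   stringsAsFactors = FALSE)
      }
    )
  })
  cols <- unique(unlist(lapply(out, names)))
  do.call(rbind, lapply(out, pad_cols, cols = cols))
}

# Stand-in for steps 1-8 (hypothetical); fails on one file to show error capture
fake_analysis <- function(f) {
  if (grepl("bad", f)) stop("could not read ", f)
  data.frame(b = -0.1, se = 0.05, pval = 0.0455)
}

res_all <- run_all(c("met-a-1.txt", "met-bad.txt", "met-a-2.txt"), fake_analysis)
print(res_all)
```

Keeping the failure message in an `error` column makes it easy to see afterwards which metabolites need a second look, for example because no SNP survived clumping or harmonisation.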

    Table S6 of the original article

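    Please like this post: once it passes 100 likes, I will share the day-4 code, which calculates the F-statistic for multiple instrumental variables.
    Also, some reviewers care about statistical power; R code for power calculation is on the way.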
