美文网首页
2021-04-05 近交系数的摸索

2021-04-05 近交系数的摸索

作者: L6511 | 来源:发表于2021-04-06 21:26 被阅读0次

    计算我第二次过滤后的数据中的近交系数

    [lyc@200server ~]$ ./plink --bfile clean2 --allow-extra-chr --het
    
    Calculating allele frequencies... done.
    Total genotyping rate is 0.977479.
    3010393 variants and 141 people pass filters and QC.
    Note: No phenotypes present.
    --het: 3009972 variants scanned, report written to plink.het .
    
    [lyc@200server ~]$ head -n 5 plink.het
         FID      IID       O(HOM)       E(HOM)        N(NM)            F
      PB-362   PB-362      2513372    2.026e+06      2973035       0.5144
      PB-363   PB-363      2508234    2.012e+06      2952631       0.5274
      PB-364   PB-364      2493856    2.019e+06      2962142       0.5036
      PB-365   PB-365      2477780    2.005e+06      2942252       0.5044
    

    我有种不详的预感,0.8以下去掉后会没剩多少
    又到了官网的解释

    Inbreeding

    --het ['small-sample'] ['gz']
    --ibc
    --het computes observed and expected autosomal homozygous genotype counts for each sample, and reports method-of-moments F coefficient estimates (i.e. (<observed hom. count> - <expected count>) / (<total observations> - <expected count>)) to plink.het. (The 'gz' modifier has the usual effect.) Expected counts are based on loaded (via --read-freq) or imputed MAFs; if there are very few samples in your immediate fileset, --read-freq is practically mandatory since imputed MAFs are wildly inaccurate in that case.

    By default, the n/(n-1) multiplier in Nei's expected homozygosity formula is now omitted, since n may be unknown when using --read-freq. The 'small-sample' modifier causes the multiplier to be included, while forcing --het to use imputed MAFs (and known ns) from founders in the immediate dataset. (--maf-succ is not applied here.)

    --ibc (ported from GCTA) calculates three inbreeding coefficients for each sample, and writes a report to plink.ibc. Briefly, Fhat1 is the usual variance-standardized relationship minus 1, Fhat2 is approximately equal to the --het estimate, and Fhat3 is based on the correlation between uniting gametes.

    These calculations do not take LD into account. It is usually a good idea to perform some form of LD-based pruning before invoking them.

    讲道理还是没告诉我怎么删除那些小于阈值的数据
    看样子必须去写R的脚本了
    然后我就在linux中的R安装叫dplyr的包,试图设定一个阈值去删除那些行数,然后只是安装这个包就陷入了深渊

    > install.packages("dplyr")
    Warning in install.packages("dplyr") :
      'lib="/usr/local/lib64/R/library"'不可写
    Would you like to use a personal library instead? (yes/No/cancel) yes
    Would you like to create a personal library
    ‘~/R/x86_64-pc-linux-gnu-library/3.5’
    to install packages into? (yes/No/cancel) yes
    --- 在此連線階段时请选用CRAN的鏡子 ---
    Warning: failed to download mirrors file (无法打开URL'https://cran.r-project.org/CRAN_mirrors.csv'); using local file '/usr/local/lib64/R/doc/CRAN_mirrors.csv'
    Secure CRAN mirrors 
    
     1: 0-Cloud [https]                   2: Algeria [https]                
     3: Australia (Canberra) [https]      4: Australia (Melbourne 1) [https]
     5: Australia (Melbourne 2) [https]   6: Australia (Perth) [https]      
     7: Austria [https]                   8: Belgium (Ghent) [https]        
     9: Brazil (PR) [https]              10: Brazil (RJ) [https]            
    11: Brazil (SP 1) [https]            12: Brazil (SP 2) [https]          
    13: Bulgaria [https]                 14: Chile 1 [https]                
    15: Chile 2 [https]                  16: China (Guangzhou) [https]      
    17: China (Lanzhou) [https]          18: China (Shanghai) [https]       
    19: Colombia (Cali) [https]          20: Czech Republic [https]         
    21: Denmark [https]                  22: East Asia [https]              
    23: Ecuador (Cuenca) [https]         24: Ecuador (Quito) [https]        
    25: Estonia [https]                  26: France (Lyon 1) [https]        
    27: France (Lyon 2) [https]          28: France (Marseille) [https]     
    29: France (Montpellier) [https]     30: France (Paris 2) [https]       
    31: Germany (Erlangen) [https]       32: Germany (Göttingen) [https]    
    33: Germany (Münster) [https]       34: Greece [https]                 
    35: Iceland [https]                  36: Indonesia (Jakarta) [https]    
    37: Ireland [https]                  38: Italy (Padua) [https]          
    39: Japan (Tokyo) [https]            40: Japan (Yonezawa) [https]       
    41: Malaysia [https]                 42: Mexico (Mexico City) [https]   
    43: Norway [https]                   44: Philippines [https]            
    45: Serbia [https]                   46: Spain (A Coruña) [https]       
    47: Spain (Madrid) [https]           48: Sweden [https]                 
    49: Switzerland [https]              50: Turkey (Denizli) [https]       
    51: Turkey (Mersin) [https]          52: UK (Bristol) [https]           
    53: UK (Cambridge) [https]           54: UK (London 1) [https]          
    55: USA (CA 1) [https]               56: USA (IA) [https]               
    57: USA (KS) [https]                 58: USA (MI 1) [https]             
    59: USA (NY) [https]                 60: USA (OR) [https]               
    61: USA (TN) [https]                 62: USA (TX 1) [https]             
    63: Vietnam [https]                  64: (other mirrors)                
    
    
    Selection: 17
    Warning: 无法在貯藏處https://mirror.lzu.edu.cn/CRAN/src/contrib中读写索引:
      无法打开URL'https://mirror.lzu.edu.cn/CRAN/src/contrib/PACKAGES'
    Warning messages:
    1: In download.file(url, destfile = f, quiet = TRUE) :
      URL 'https://cran.r-project.org/CRAN_mirrors.csv': status was 'Couldn't connect to server'
    2: package ‘dplyr’ is not available (for R version 3.5.1) 
    

    我看到好几个回答都是选了兰州的镜像才这样的,看到有回答说换上海的可以

    > chooseCRANmirror()
    Warning: failed to download mirrors file (无法打开URL'https://cran.r-project.org/CRAN_mirrors.csv'); using local file '/usr/local/lib64/R/doc/CRAN_mirrors.csv'
    Secure CRAN mirrors 
    
     1: 0-Cloud [https]                   2: Algeria [https]                
     3: Australia (Canberra) [https]      4: Australia (Melbourne 1) [https]
     5: Australia (Melbourne 2) [https]   6: Australia (Perth) [https]      
     7: Austria [https]                   8: Belgium (Ghent) [https]        
     9: Brazil (PR) [https]              10: Brazil (RJ) [https]            
    11: Brazil (SP 1) [https]            12: Brazil (SP 2) [https]          
    13: Bulgaria [https]                 14: Chile 1 [https]                
    15: Chile 2 [https]                  16: China (Guangzhou) [https]      
    17: China (Lanzhou) [https]          18: China (Shanghai) [https]       
    19: Colombia (Cali) [https]          20: Czech Republic [https]         
    21: Denmark [https]                  22: East Asia [https]              
    23: Ecuador (Cuenca) [https]         24: Ecuador (Quito) [https]        
    25: Estonia [https]                  26: France (Lyon 1) [https]        
    27: France (Lyon 2) [https]          28: France (Marseille) [https]     
    29: France (Montpellier) [https]     30: France (Paris 2) [https]       
    31: Germany (Erlangen) [https]       32: Germany (Göttingen) [https]    
    33: Germany (Münster) [https]       34: Greece [https]                 
    35: Iceland [https]                  36: Indonesia (Jakarta) [https]    
    37: Ireland [https]                  38: Italy (Padua) [https]          
    39: Japan (Tokyo) [https]            40: Japan (Yonezawa) [https]       
    41: Malaysia [https]                 42: Mexico (Mexico City) [https]   
    43: Norway [https]                   44: Philippines [https]            
    45: Serbia [https]                   46: Spain (A Coruña) [https]       
    47: Spain (Madrid) [https]           48: Sweden [https]                 
    49: Switzerland [https]              50: Turkey (Denizli) [https]       
    51: Turkey (Mersin) [https]          52: UK (Bristol) [https]           
    53: UK (Cambridge) [https]           54: UK (London 1) [https]          
    55: USA (CA 1) [https]               56: USA (IA) [https]               
    57: USA (KS) [https]                 58: USA (MI 1) [https]             
    59: USA (NY) [https]                 60: USA (OR) [https]               
    61: USA (TN) [https]                 62: USA (TX 1) [https]             
    63: Vietnam [https]                  64: (other mirrors)                
    
    
    Selection: 18
    Warning message:
    In download.file(url, destfile = f, quiet = TRUE) :
      URL 'https://cran.r-project.org/CRAN_mirrors.csv': status was 'Couldn't connect to server'
    

    换了上海的也没用


    image.png

    换美国的也没用


    image.png
    > install.packages('dplyr',dependencies=TRUE,repos='http://cran.rstudio.com/')             
    错误: 句法分析器1行里不能有多字节字符
    
    image.png

    处处是坑,我决定换一个思路,不调包能不能直接用,或者直接用linux去做,忽然发现全部搞错了,那个不是真正的近交系数,还需要自己去算一算,利用maf和het,还要解决镜像的问题

    相关文章

      网友评论

          本文标题:2021-04-05 近交系数的摸索

          本文链接:https://www.haomeiwen.com/subject/igtjkltx.html