美文网首页
Eigensoft-smartpca分析PCA报错:warnin

Eigensoft-smartpca分析PCA报错:warnin

作者: 生物信息与育种 | 来源:发表于2021-07-21 14:43 被阅读0次

    问题

    一直以来用Eigensoft的smartpca来做群体遗传的PCA分析很顺畅,结果也比较靠谱。

    但今天报错如下:

    $ ~/miniconda3/bin/smartpca -p smartpca.par
    parameter file: smartpca.par
    ### THE INPUT PARAMETERS
    ##PARAMETER NAME: VALUE
    genotypename: plink.ped
    snpname: plink.pedsnp
    indivname: plink.pedind
    evecoutname: pca.vec
    evaloutname: pca.val
    numoutlieriter: 0
    numchrom: 1000000
    ## smartpca version: 16000
    norm used
    
    warning (mapfile): bad chrom: 100   100:1816    0   1816
    warning (mapfile): bad chrom: 101   101:1388    0   1388
    warning (mapfile): bad chrom: 101   101:1922    0   1922
    warning (mapfile): bad chrom: 102   102:1286    0   1286
    warning (mapfile): bad chrom: 103   103:867 0   867
    warning (mapfile): bad chrom: 104   104:149 0   149
    warning (mapfile): bad chrom: 105   105:1532    0   1532
    warning (mapfile): bad chrom: 106   106:1201    0   1201
    warning (mapfile): bad chrom: 107   107:1113    0   1113
    warning (mapfile): bad chrom: 108   108:255 0   255
    Segmentation fault
    

    这个原因有可能是染色体号为0导致。smartpca中 ,0意味着染色体编号信息缺失。

    检查我的map文件中第一列(染色体号),从1开始,并没有为0。以前用带chr或scaffold开头的染色体数据做过,也没有报错。

    解决

    在Google group上找到了原因。

    I have got Smartpca within EIGENSOFT (6.0.1) to work without converting with convertf - it will take map/ped directly.  I have madified the output map/ped that stacks outputs.
    
    EIGENSOFT and PLINK don't with thousands of chromosomes/contigs well - so I would suggest removing that info from the map file - replace the first column with all '1' for example.  I do have some chromosome info so I have chromosomes 1-37 for assigned loci and I used for '40' for unassigned loci.  I dont think smartpca likes a zero in the frist column of the map file.
    
    example map file: 
    
    [https://github.com/rwaples/chum_populations/blob/master/results/batch_4/EIGENSOFT/complete.codom.subsample.map](https://github.com/rwaples/chum_populations/blob/master/results/batch_4/EIGENSOFT/complete.codom.subsample.map)
    
    ped file - I have the phenotype (col 6) set to missing (-9) and smartpca complains about it - but it works.  
    
    example ped file:
    
    [https://github.com/rwaples/chum_populations/blob/master/results/batch_4/EIGENSOFT/complete.codom.subsample.ped](https://github.com/rwaples/chum_populations/blob/master/results/batch_4/EIGENSOFT/complete.codom.subsample.ped)
    
    example parfile:
    
    [https://github.com/rwaples/chum_populations/blob/master/results/batch_4/EIGENSOFT/complete.codom.subsample.parfile](https://github.com/rwaples/chum_populations/blob/master/results/batch_4/EIGENSOFT/complete.codom.subsample.parfile) 
    
    -Ryan
    

    https://groups.google.com/g/stacks-users/c/rkN9Q5G6hXg

    可以看到smartpca并不支持上千条的scaffold/contig(查看了下我的数据,有3000多contigs),而在做PCA分析时,染色体号并不影响最终结果。因此可将很碎的contig统一一个染色体号。

    sed 's/contig[0-9]*/20/g' map.vcf
    

    最终得到所有材料PCA结果。


    image.png

    https://www.jianshu.com/p/bdf1bc116127

    相关文章

      网友评论

          本文标题:Eigensoft-smartpca分析PCA报错:warnin

          本文链接:https://www.haomeiwen.com/subject/xqsjmltx.html