Snakemake Notes: Submitting Jobs for the Genome Assembly Tool Megahit


Author: 小明的数据分析笔记本 | Published: 2023-03-21 21:01

    The first version I wrote:

    SAMPLES, = glob_wildcards("01.clean.fq/{sample}_1.fq.gz")
    
    print("Total sample: ",len(SAMPLES))
    
    rule all:
        input:
            expand("02.megahit/{sample}/{sample}.contigs.fa",sample=SAMPLES)
    
    rule run_megahit:
        input:
            r1 = "01.clean.fq/{sample}_1.fq.gz",
            r2 = "01.clean.fq/{sample}_2.fq.gz"
        output:
            "02.megahit/{sample}/{sample}.contigs.fa"
        threads:
            12
        resources:
            mem = 24000
        params:
            output_folder = "02.megahit/{sample}",
            prefix = "{sample}",
            mem = "24000000000"  # passed to megahit -m, in bytes (about 24 GB)
        shell:
            """
            megahit -1 {input.r1} -2 {input.r2} -o {params.output_folder} --out-prefix {params.prefix} -t {threads} -m {params.mem}
            """
    

    Submitting the job kept failing with this error:

    Output directory /data/myan/raw_data/pome/pan.raw.fq/02.megahit/Gl_MOL already exists, please change the parameter -o to another value to avoid overwriting.
    

    I changed it to the following:

    SAMPLES, = glob_wildcards("../01.clean.fq/{sample}_1.fq.gz")
    
    print("Total sample: ",len(SAMPLES))
    
    rule all:
        input:
            expand("{sample}.log",sample=SAMPLES)
    
    rule run_megahit:
        input:
            r1 = "../01.clean.fq/{sample}_1.fq.gz",
            r2 = "../01.clean.fq/{sample}_2.fq.gz"
        output:
            "{sample}.log"
        threads:
            12
        resources:
            mem = 24000
        params:
            output_folder = "{sample}",
            prefix = "{sample}",
            mem = "24000000000"
        log:
            "{sample}.log"
        shell:
            """
            megahit -1 {input.r1} -2 {input.r2} -o {params.output_folder} --out-prefix {params.prefix} -t {threads} -m {params.mem} --min-contig-len 500 1>{log} 2>&1
            """
    

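    The cause is that when the output path includes a folder, Snakemake creates that folder before the job runs. The shell command then passes the same folder to Megahit through -o, Megahit detects that the folder already exists, and it exits with the error above. The only fix I have come up with so far is to leave the folder out of output. I don't know whether there are other solutions.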
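
    One other workaround, sketched here as an untested assumption rather than the author's method, is to keep the original folder layout and delete the pre-created folder at the start of the shell command, so that Megahit can create it itself. The log path logs/megahit/{sample}.log is a hypothetical choice, placed outside the Megahit folder so that rm -rf does not remove it.

```snakemake
SAMPLES, = glob_wildcards("01.clean.fq/{sample}_1.fq.gz")

rule all:
    input:
        expand("02.megahit/{sample}/{sample}.contigs.fa", sample=SAMPLES)

rule run_megahit:
    input:
        r1 = "01.clean.fq/{sample}_1.fq.gz",
        r2 = "01.clean.fq/{sample}_2.fq.gz"
    output:
        "02.megahit/{sample}/{sample}.contigs.fa"
    threads:
        12
    resources:
        mem = 24000
    params:
        output_folder = "02.megahit/{sample}",
        prefix = "{sample}",
        mem = "24000000000"
    log:
        # Hypothetical path, kept outside the folder that rm -rf deletes.
        "logs/megahit/{sample}.log"
    shell:
        """
        # Snakemake created the output folder before this job started.
        # Remove it so that Megahit does not refuse to run.
        rm -rf {params.output_folder}
        megahit -1 {input.r1} -2 {input.r2} -o {params.output_folder} --out-prefix {params.prefix} -t {threads} -m {params.mem} > {log} 2>&1
        """
```

    With this layout the contigs file stays the declared output, so Snakemake tracks the real assembly result instead of a log file. The trade-off is that rm -rf also discards any partial results left by an earlier failed run.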

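    I ran into the same problem earlier when writing a batch workflow that assembles chloroplast genomes with get_organelle_from_reads.py. That script has an option to overwrite the output folder if it already exists, and adding that option solved the problem.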
