美文网首页科研信息学
Nanopore PolyA Tail Length Analy

Nanopore PolyA Tail Length Analy

作者: 落寞的橙子 | 来源:发表于2020-05-24 04:51 被阅读0次

    使用pipeline-polya-ng
    在运行前下载软件,使用的是conda的flair_env环境,polya ng 使用的是snakemake的流程管理,理论上一套流程参考说明书去做就行,但是第二步的call difference一步有点问题,所以拆开来做,如果能够set up好的话可以直接参考官方说明的文件去做这个事情,以下是我的流程和例子。
    1、Get the PolyA Length
    看测序深度来评判计算机资源,需要比较多的内存

    #!/bin/bash
    #SBATCH --time=1:00:00
    #SBATCH --cpus-per-task=2
    #SBATCH --mem=2g
    
    dir=/data_dir/Nanopore/PolyA
    log_dir=${dir}/log
    job_dir=${dir}/job
    config=${dir}/config
    snake_dir=/Scripts/Nanopore/PolyA_diff/pipeline-polya-ng
    mkdir -p ${log_dir}
    mkdir -p ${job_dir}
    thread=21
    cd ${config}
    for i in $(ls $pwd *.yml);do
    job_file="${job_dir}/${i%%.*}.job"
    
        echo "#!/bin/bash
    #SBATCH --job-name=${i%%.*}.ployA_ng
    #SBATCH --output=$log_dir/${i%%.*}.ployA_ng.out
    #SBATCH --time=50:00:00
    #SBATCH --cpus-per-task=${thread}
    #SBATCH --mem=300g
    source /conda_dir/conda/etc/profile.d/conda.sh
    conda activate flair_env
    cd ${snake_dir}
    snakemake --use-conda -j 20 --configfile ${config}/${i} all
    " > $job_file
    sbatch $job_file
    done
    

    2、获得所有PolyA tail length的矩阵
    这个python2 写的软件,建议使用conda管理的jupyter notebook来运行

    #python2.7
    #conda activate genome 
    #jupyter notebook
    
    import os
    import pandas as pd
    from os import path
    control_tails=["/data_dir/Nanopore/PolyA_tail/human/Control1/pipeline-polya-ng/tails/filtered_tails_Control1.tsv",
    "/data_dir/Nanopore/PolyA_tail/human/Control2/pipeline-polya-ng/tails/filtered_tails_Control2.tsv",
    "/data_dir/Nanopore/PolyA_tail/human/Control3/pipeline-polya-ng/tails/filtered_tails_Control3.tsv"]
    
    treated_tails=["/data_dir/Nanopore/PolyA_tail/human/Treatment5/pipeline-polya-ng/tails/filtered_tails_Treatment5.tsv",
    "/data_dir/Nanopore/PolyA_tail/human/Treatment6/pipeline-polya-ng/tails/filtered_tails_Treatment6.tsv",
    "/data_dir/Nanopore/PolyA_tail/human/Treatment7/pipeline-polya-ng/tails/filtered_tails_Treatment7.tsv"]   
    output_mt="~/Desktop/all_tails.tsv"
    dfs = []
    for sample in control_tails:
        df = pd.read_csv(sample, sep="\t")
        df['sample'] = sample
        df['group'] = "control"
        dfs.append(df)
    for sample in treated_tails:
        df = pd.read_csv(sample, sep="\t")
        df['sample'] = sample
        df['group'] = "treatment"
        dfs.append(df)
    adf = pd.concat(dfs)
    adf.to_csv(output_mt, sep="\t", index=False)
    for tsv in control_tails:
        print(tsv)
    

    3、比较差异
    由于流程在这儿我运行的有点问题,所以我直接下载来polya_diff.py 去运行这个差异

    data_dir=/data_dir/Nanopore/PolyA_tail/mouse/PolyA_diff
    data_dir=~/Desktop/PolyA_diff
    input_tails=${data_dir}/all_tails_mouse.tsv
    out_dir=${data_dir}/report
    mkdir -p ${out_dir}
    global_tsv=${out_dir}/polya_diff_global.tsv
    tr_tsv=${out_dir}/polya_diff_per_transcript.tsv
    out_pdf=${out_dir}/polya_diff_report.pdf
    polya_diff_dir=/soft/python/Scripts/polya_diff.py 
    min_reads_per_transcript=10
    python ${polya_diff_dir} -i ${input_tails} -g ${global_tsv} -t ${tr_tsv} -r ${out_pdf} -c ${min_reads_per_transcript} -x
    
    

    后面的分析就差不多就跟表达分析差不多,查看原始文件应该很容易搞定

    相关文章

      网友评论

        本文标题:Nanopore PolyA Tail Length Analy

        本文链接:https://www.haomeiwen.com/subject/mqtdahtx.html