Bioinformatics Foundations Series: deepTools

Author: 可能性之兽 | Published: 2022-11-07 14:45

    The tools — deepTools 3.5.0 documentation

    Tools for BAM and bigWig file processing

    multiBamSummary
    multiBigwigSummary
    correctGCBias
    bamCoverage
    bamCompare
    bigwigCompare
    computeMatrix
    alignmentSieve

    Tools for QC

    plotCorrelation
    plotPCA
    plotFingerprint
    bamPEFragmentSize
    computeGCBias
    plotCoverage

    Heatmaps and summary plots

    plotHeatmap
    plotProfile
    plotEnrichment

    Miscellaneous

    computeMatrixOperations
    estimateReadFiltering
    | tool | type | input files | main output file(s) | application |
    |---|---|---|---|---|
    | multiBamSummary | data integration | 2 or more BAM | interval-based table of values | perform cross-sample analyses of read counts -> plotCorrelation, plotPCA |
    | multiBigwigSummary | data integration | 2 or more bigWig | interval-based table of values | perform cross-sample analyses of genome-wide scores -> plotCorrelation, plotPCA |
    | plotCorrelation | visualization | multiBamSummary/multiBigwigSummary output | clustered heatmap | visualize the Pearson/Spearman correlation |
    | plotPCA | visualization | multiBamSummary/multiBigwigSummary output | 2 PCA plots | visualize the principal component analysis |
    | plotFingerprint | QC | 2 BAM | 1 diagnostic plot | assess enrichment strength of a ChIP sample |
    | computeGCBias | QC | 1 BAM | 2 diagnostic plots | calculate the expected and observed GC distribution of reads |
    | correctGCBias | QC | 1 BAM, output from computeGCBias | 1 GC-corrected BAM | obtain a BAM file with reads distributed according to the genome's GC content |
    | bamCoverage | normalization | BAM | bedGraph or bigWig | obtain the normalized read coverage of a single BAM file |
    | bamCompare | normalization | 2 BAM | bedGraph or bigWig | normalize 2 files to each other (e.g. log2 ratio, difference) |
    | computeMatrix | data integration | 1 or more bigWig, 1 or more BED | zipped file for plotHeatmap or plotProfile | compute the values needed for heatmaps and summary plots |
    | estimateReadFiltering | information | 1 or more BAM files | table of values | estimate the number of reads filtered from a BAM file or files |
    | alignmentSieve | QC | 1 BAM file | 1 filtered BAM or BEDPE file | filter a BAM file based on one or more criteria |
    | plotHeatmap | visualization | computeMatrix output | heatmap of read coverages | visualize the read coverages for genomic regions |
    | plotProfile | visualization | computeMatrix output | summary plot ("meta-profile") | visualize the average read coverages over a group of genomic regions |
    | plotCoverage | visualization | 1 or more BAM | 2 diagnostic plots | visualize the average read coverages over sampled genomic positions |
    | bamPEFragmentSize | information | 1 BAM | text with paired-end fragment length | obtain the average fragment length from paired ends |
    | plotEnrichment | visualization | 1 or more BAM and 1 or more BED/GTF | a diagnostic plot | plot the fraction of alignments overlapping the given features |
    | computeMatrixOperations | miscellaneous | 1 or more computeMatrix output files | modified matrix file, or information about it | inspect, filter, subset, sort, or merge computeMatrix output |

    How computeMatrix works

    Downloading BED files
    https://mp.weixin.qq.com/s/POPN8kzMQT1jcil8ICvPxg
    Table Browser (ucsc.edu)

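    • reference-point mode computes the signal distribution relative to a single point, for example the start or end of each genomic region.
    • scale-regions mode computes the signal over a set of regions, with every region scaled to the same length.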
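
To make the reference-point idea concrete, the sketch below derives strand-aware windows around the TSS from BED6 records with awk. It is only an approximation of what `--referencePoint TSS -b … -a …` anchors on: `tss_windows` is a hypothetical helper, the two gene records are made up, and deepTools' own coordinate handling may differ in detail.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of deepTools): print a window around the TSS
# of each BED6 record. Arguments: upstream bp (like -b), downstream bp (like -a).
tss_windows() {
    local upstream="$1" downstream="$2"
    awk -v OFS='\t' -v b="$upstream" -v a="$downstream" '
        NF >= 6 {
            if ($6 == "-") {
                # minus strand: the TSS is the region end, upstream lies to the right
                win_start = $3 - a; win_end = $3 + b
            } else {
                # plus strand: the TSS is the region start, upstream lies to the left
                win_start = $2 - b; win_end = $2 + a
            }
            if (win_start < 0) win_start = 0   # do not run off the chromosome start
            print $1, win_start, win_end, $4, $5, $6
        }'
}

# Two made-up genes, one per strand, with 2000 bp upstream and 500 bp downstream
printf 'chr1\t10000\t20000\tgeneA\t0\t+\nchr1\t30000\t45000\tgeneB\t0\t-\n' |
    tss_windows 2000 500
```

For the plus-strand gene the window is 8000-10500; for the minus-strand gene it is 44500-47000, because upstream and downstream swap sides. In scale-regions mode the whole gene body would instead be stretched or compressed to a fixed length.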

    computeMatrix reference-point for a single bigWig file

    computeMatrix reference-point  --referencePoint TSS  -p 5  \
    -b 10000 -a 10000    \
    -R /home/data/vip13t16/project/epi/tss/ucsc.refseq.bed  \
    -S /home/data/vip13t16/project/epi/mergeBam/H2Aub1.bw  \
    --skipZeros  -o matrix1_test_TSS.gz  \
    --outFileSortedRegions regions1_test_genes.bed
    
    
    
    

    Batch computeMatrix reference-point, starting from bigWig files

    # Loop over every bigWig file in the current directory
    for id in *.bw; do
        sample=${id%%.*}   # sample name = file name up to the first dot
        echo "$id -> $sample"
        computeMatrix reference-point --referencePoint TSS -p 50 \
            -b 10000 -a 10000 -S "$id" -R ../BED/hg38.Refseq.bed --skipZeros \
            -o "matrix1_${sample}_TSS.gz" \
            --outFileSortedRegions "regions1_${sample}_genes.bed"
    done
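
The loop above derives the sample name with `${id%%.*}`, which strips everything from the first dot onward. The sketch below compares it with `${id%.*}`, which strips only the last extension. The helper names and the `rep1`/`rep2` file names are hypothetical and only illustrate the difference.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for illustration; both only use bash parameter expansion.

# Remove the longest suffix that starts with a dot (what the loop above does).
sample_name_greedy() {
    local f
    f="$(basename "$1")"
    printf '%s\n' "${f%%.*}"
}

# Remove only the shortest such suffix, i.e. the final extension.
sample_name_last_ext() {
    local f
    f="$(basename "$1")"
    printf '%s\n' "${f%.*}"
}

for f in H2Aub1.bw H2Aub1.rep1.bw H2Aub1.rep2.bw; do
    printf '%s -> %s | %s\n' "$f" "$(sample_name_greedy "$f")" "$(sample_name_last_ext "$f")"
done
```

With the greedy form, `H2Aub1.rep1.bw` and `H2Aub1.rep2.bw` both become `H2Aub1`, so the second sample would overwrite the first sample's matrix. If file names contain extra dots, the last-extension form is probably the safer choice.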
    

    Batch computeMatrix reference-point, starting from BAM files

    rm -rf Outbw
    mkdir Outbw
    for id in *.bam; do
        file=$(basename "$id")
        sample=${file%%.*}
        echo "$sample"

        # 2913022398 is the hg38 (GRCh38) size given on the official deepTools site
        bamCoverage -b "$id" -o "Outbw/$sample.bw" -p 50 --binSize 10 \
            --normalizeUsing RPGC --effectiveGenomeSize 2913022398

        computeMatrix reference-point --referencePoint TSS -b 2500 -a 2500 \
            -R hg38.Refseq.bed -S "Outbw/$sample.bw" --skipZeros \
            -o "Outbw/matrix1_${sample}_TSS.gz" \
            --outFileSortedRegions "Outbw/regions1_${sample}_genes.bed" -p 50

        # default heatmap, then a second one with a diverging color map and no profile plot
        plotHeatmap -m "Outbw/matrix1_${sample}_TSS.gz" -out "Outbw/${sample}.png"
        plotHeatmap -m "Outbw/matrix1_${sample}_TSS.gz" -out "Outbw/${sample}2.png" \
            --colorMap RdBu --whatToShow 'heatmap and colorbar'
    done
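
The `--normalizeUsing RPGC` option scales coverage to roughly 1x. As the deepTools documentation describes it, the normalization depends on (mapped reads × fragment length) / effective genome size. The sketch below works through that arithmetic under simplifying assumptions: `rpgc_scale_factor` is a hypothetical helper, the read count and length are made up, and the real tool also accounts for filtered reads and other details.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not a deepTools command): multiplier that would bring
# the average coverage to 1x.
# Arguments: mapped reads, fragment (or read) length in bp, effective genome size.
rpgc_scale_factor() {
    awk -v n="$1" -v len="$2" -v g="$3" 'BEGIN {
        coverage = n * len / g          # average depth before normalization
        printf "%.4f\n", 1 / coverage   # factor that rescales it to 1x
    }'
}

# Made-up example: 20 million mapped reads of 150 bp against the hg38 size used above
rpgc_scale_factor 20000000 150 2913022398
```

In this example the raw average depth is slightly above 1x, so the factor comes out just below 1 (0.9710). A shallower library would give a factor above 1.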
    

    scale-regions

    Here genes19.bed and genesX.bed are presumably gene annotations extracted from the genome (for chromosomes 19 and X).

    # run computeMatrix to collect the data needed for plotting
    computeMatrix scale-regions -S H3K27Me3-input.bigWig \
                                     H3K4Me1-Input.bigWig  \
                                     H3K4Me3-Input.bigWig \
                                  -R genes19.bed genesX.bed \
                                  --beforeRegionStartLength 3000 \
                                  --regionBodyLength 5000 \
                                  --afterRegionStartLength 3000 \
                                  --skipZeros -o matrix.mat.gz
    plotHeatmap -m matrix.mat.gz \
          -out ExampleHeatmap1.png
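
As a rough check on what the scale-regions command produces, the sketch below estimates how many value columns each region gets in the matrix. It assumes the computeMatrix default bin size of 10 bp, since the command does not set `--binSize`. `matrix_columns` is a hypothetical helper, not a deepTools command.

```shell
#!/usr/bin/env bash
# Hypothetical helper for illustration only.
# Arguments: upstream bp, body bp, downstream bp, bin size, number of bigWig files.
matrix_columns() {
    local up="$1" body="$2" down="$3" bin="$4" nfiles="$5"
    # every bigWig contributes (upstream + body + downstream) / bin columns
    echo $(( (up + body + down) / bin * nfiles ))
}

# Settings from the scale-regions example: 3000 + 5000 + 3000 bp, three bigWig files
matrix_columns 3000 5000 3000 10 3
```

Under these assumptions each region gets 1100 bins per bigWig file, or 3300 value columns in total. Because every gene body is rescaled to 5000 bp, genes of different lengths can share one heatmap.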
    

    Change the color scheme so that it runs from white to blue:

    plotHeatmap -m matrix1_chr19_TSS.gz --missingDataColor 1 --colorList 'white,#0066CC' --heatmapHeight 12 -o scaleRegion-heatmap.pdf
    

    A power tool: computeMatrix + plotting (qq.com)

    ChIP-seq basics for beginners - Jianshu (jianshu.com)
    ChIPseeker: an R package for ChIP peak Annotation, Comparison and Visualization (bioconductor.org)
