How to implement parallelism in Julia

Author: ZHANG先森_5850 | Posted 2022-10-23 14:46

    @小光amateur
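    Hello, teacher. I read the whole BAM file with a reader, and I would like to parallelize by chromosome: each core processes one chromosome and produces one result, and at the end the results for all chromosomes are merged into a single file.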

    using DataFrames, CSV, XAM, GenomicFeatures, BioSequences

    # Collect every mapped record overlapping chromosomename:start-final into a DataFrame.
    function generatechdf(reader, chromosomename::AbstractString, start::Int64, final::Int64)
        chdf = Vector{Tuple{String,Int64,Int64,Char,String,LongDNASeq}}()
        for record in eachoverlap(reader, chromosomename, start:final)
            if BAM.ismapped(record)
                # f is not defined in this post; judging by the tuple type it returns the strand as a Char
                a = BAM.refname(record), BAM.position(record), BAM.rightposition(record), f(record), BAM.cigar(record), BAM.sequence(record)
                push!(chdf, a)
            end
        end
        rename!(DataFrame(chdf), [:refname, :position, :rightposition, :strand, :cigar, :sequence])
    end
    
    # chr is a grouped dataframe containing all chromosomes, as shown below.
    Threads.@threads for number in 1:length(chr)
        chromosome = chr[number]
        start = chromosome[1, 3]
        final = chromosome[end, 3]
        chromosomename = chromosome[1, 1]  # like "chr1", "chr2", ...
        chdf = generatechdf(reader, chromosomename, start, final)  # every iteration uses the same reader
    end
    

    This is what chr looks like:

    julia> chr
    GroupedDataFrame with 25 groups based on key: Column1
    First Group (1482189 rows): Column1 = "chr1"
         Row │ Column1  Column2  Column3
             │ String7  String1  Int64
    ─────────┼─────────────────────────────
           1 │ chr1     C            10469
           2 │ chr1     C            10471
           3 │ chr1     C            10484
           4 │ chr1     C            10489
           5 │ chr1     C            10589
           6 │ chr1     G            10590
           7 │ chr1     G            10610
           8 │ chr1     C            10617
           9 │ chr1     G            10618
          10 │ chr1     C            10620
          11 │ chr1     C            10633
    

    When I run this with multiple threads (I am not sure whether multithreading is the right tool here), it throws an error:

    Stacktrace:
     [1] wait
       @ ./task.jl:334 [inlined]
     [2] threading_run(func::Function)
       @ Base.Threads ./threadingconstructs.jl:38
     [3] top-level scope
       @ ./threadingconstructs.jl:97
    
        nested task error: zlib failed to inflate a compressed block
        Stacktrace:
    

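    Running single-threaded does not have this problem. Judging from the error, it seems that the same reader cannot be used by several threads at once. What should I do? I would like to parallelize per chromosome and then combine all the results into one DataFrame. Thank you for your guidance.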
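
One common way around a reader that cannot be shared is to give every task its own reader and collect the per-chromosome results afterwards. The sketch below shows that pattern using only Base; `map_with_private_resource` is a hypothetical helper name, not part of XAM.jl or DataFrames.jl, and the demo uses in-memory `IOBuffer`s as stand-ins for BAM readers so that it runs without any data files.

```julia
using Base.Threads: @spawn

# Hypothetical helper: run `work(resource, key)` once per key. Each task opens
# its own resource with `open_resource()`, so no reader state is shared.
function map_with_private_resource(work, open_resource, chromosomes)
    tasks = map(chromosomes) do key
        @spawn begin
            resource = open_resource()   # private to this task
            try
                work(resource, key)
            finally
                close(resource)          # always release the handle
            end
        end
    end
    return map(fetch, tasks)             # results come back in input order
end

# Stand-in demo: every task reads its own buffer instead of a BAM file.
lengths = map_with_private_resource(() -> IOBuffer("ACGT"), ["chr1", "chr2"]) do io, name
    (name, length(read(io, String)))
end
# lengths == [("chr1", 4), ("chr2", 4)]
```

For the code in the question, the opener would presumably be something like `() -> open(BAM.Reader, bampath; index = bampath * ".bai")`, where `bampath` is an assumed variable holding the path of an indexed BAM file, and the worker would call `generatechdf(reader, name, start, final)` for each chromosome. Assuming each call returns a DataFrame with the same columns, `reduce(vcat, results)` should combine them into one table that can then be written with `CSV.write`. Julia has to be started with more than one thread (for example `julia -t 4`) for the tasks to actually run in parallel.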
