Custom functions (continuously updated)

Author: YPCHEN1992 | Published 2020-04-13 11:28

    Commonly used custom Python functions

    1. Parse a multi-FASTA file into a Python 3 dictionary

    def fasta2dict(genome_fa: str) -> dict:
        '''
        parse genome fasta file into python dictionary
        '''
        fa_dict = {}
        with open(genome_fa, "rt") as fa_fh:
            for line in fa_fh:
                line = line.strip("\n")
                if line.startswith(">"):
                    # header line: drop the leading ">" and start a new record
                    contig_id = line[1:]
                    fa_dict[contig_id] = []
                else:
                    # sequence line: collect it under the current header
                    fa_dict[contig_id].append(line)
    
        # join the collected lines of each record into one sequence string
        fa_dict = {key: ''.join(fa_dict[key]) for key in fa_dict}
        return fa_dict
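
    As a usage illustration, the sketch below writes a tiny made-up two-record FASTA file to a temporary path and parses it. It repeats a compact version of fasta2dict so the snippet runs on its own; the helper write_demo_fasta, the demo records, and the blank-line skipping are assumptions added for the example, not part of the original function.

```python
import os
import tempfile


def fasta2dict(genome_fa: str) -> dict:
    """Parse a (multi-)FASTA file into {header: sequence}."""
    fa_dict = {}
    contig_id = None
    with open(genome_fa, "rt") as fa_fh:
        for line in fa_fh:
            line = line.strip()
            if not line:
                continue  # assumption: blank lines carry no sequence data
            if line.startswith(">"):
                contig_id = line[1:]  # whole header line, description included
                fa_dict[contig_id] = []
            elif contig_id is not None:
                fa_dict[contig_id].append(line)
    return {key: "".join(chunks) for key, chunks in fa_dict.items()}


def write_demo_fasta() -> str:
    """Hypothetical helper: write a small FASTA file and return its path."""
    records = ">contig1\nATGC\nGGTA\n>contig2 demo description\nTTAA\n"
    with tempfile.NamedTemporaryFile("wt", suffix=".fa", delete=False) as fh:
        fh.write(records)
        return fh.name


demo_path = write_demo_fasta()
try:
    genome = fasta2dict(demo_path)
finally:
    os.remove(demo_path)  # clean up the temporary file

for contig_id, seq in genome.items():
    print(contig_id, len(seq), seq)
```

    Note that the dictionary key is the full header line, so any description after the ID becomes part of the key. If only the ID is wanted, splitting the header on whitespace and keeping the first token is one option.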
    

    2. Reverse, complement, and reverse complement of a sequence

    def reverse(seq_without_header: str) -> str:
        '''
        converts a DNA sequence into its reverse sequence.
        '''
        return seq_without_header[::-1]
    
    def complement(seq_without_header: str) -> str:
        '''
        converts a DNA sequence into its complement sequence.
        '''
        seq = seq_without_header.upper()
        upper = 'ATCG'
        lower = 'tagc'
        transtable = str.maketrans(upper, lower)
        seq = seq.translate(transtable)
        return seq.upper()
    
    def reverse_complement(seq_without_header: str) -> str:
        '''
        converts a DNA sequence into its reverse complement sequence.
        '''
        seq = seq_without_header.upper()
        upper = 'ATCG'
        lower = 'tagc'
        transtable = str.maketrans(upper, lower)
        seq = seq.translate(transtable)[::-1]
        return seq.upper()
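
    The three helpers can also share one translation table, as in the minimal sketch below. It assumes the same behaviour as the functions above for complement and reverse_complement: input is upper-cased, and any character other than A, T, C, G (for example N) passes through unchanged. The module-level name _COMPLEMENT and the demo string are illustrative choices only.

```python
# Translation table built once; maps each upper-case base to its complement.
_COMPLEMENT = str.maketrans("ATCG", "TAGC")


def reverse(seq: str) -> str:
    """Return the sequence read backwards (case is left untouched)."""
    return seq[::-1]


def complement(seq: str) -> str:
    """Return the upper-cased complement of a DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)


def reverse_complement(seq: str) -> str:
    """Return the upper-cased reverse complement of a DNA sequence."""
    return complement(seq)[::-1]


demo = "ATGCcn"
print(reverse(demo))             # ncCGTA
print(complement(demo))          # TACGGN
print(reverse_complement(demo))  # NGGCAT
```

    Building the table once avoids repeating the maketrans call in every function, and defining reverse_complement in terms of complement keeps the base-pairing rule in a single place.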
    

    Article link: https://www.haomeiwen.com/subject/urfgmhtx.html