Custom functions (continuously updated)

Author: YPCHEN1992 | Published: 2020-04-13 11:28

Commonly used custom Python functions

1. Parse a multi-FASTA file into a Python 3 dictionary

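def fasta2dict(genome_fa: str) -> dict:
    '''
    Parse a genome FASTA file into a Python dictionary that maps
    each header line (without the leading ">") to its full sequence.
    '''
    fa_dict = {}
    contig_id = None
    with open(genome_fa, "rt") as fa_fh:
        for line in fa_fh:
            line = line.strip()
            if line.startswith(">"):
                # New record: drop only the leading ">" from the header
                contig_id = line[1:]
                fa_dict[contig_id] = []
            elif line and contig_id is not None:
                # Sequence line belonging to the current record
                fa_dict[contig_id].append(line)

    # Join the per-line chunks of each record into one string
    fa_dict = {key: ''.join(fa_dict[key]) for key in fa_dict}
    return fa_dict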
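
A quick way to check the parser is to run it on a small temporary file. The sketch below carries its own compact variant of fasta2dict so that it runs on its own; the file contents, contig names, and the variables demo_fasta and tmp_path are made up for illustration.

```python
import os
import tempfile


def fasta2dict(genome_fa: str) -> dict:
    """Compact variant of the parser: header (without '>') -> sequence."""
    fa_dict = {}
    contig_id = None
    with open(genome_fa, "rt") as fa_fh:
        for line in fa_fh:
            line = line.strip()
            if line.startswith(">"):
                contig_id = line[1:]
                fa_dict[contig_id] = []
            elif line and contig_id is not None:
                fa_dict[contig_id].append(line)
    return {key: "".join(chunks) for key, chunks in fa_dict.items()}


# Hypothetical two-record FASTA; names and sequences are made up.
demo_fasta = ">contig1\nATCG\nGGCC\n>contig2 some description\nTTAA\n"

# Write the demo records to a temporary file, parse it, then clean up.
with tempfile.NamedTemporaryFile("wt", suffix=".fa", delete=False) as tmp:
    tmp.write(demo_fasta)
    tmp_path = tmp.name

try:
    parsed = fasta2dict(tmp_path)
finally:
    os.remove(tmp_path)

print(parsed)
# {'contig1': 'ATCGGGCC', 'contig2 some description': 'TTAA'}
```

Note that the whole header line becomes the key, including any description after the first space. If only the sequence ID is wanted, splitting the header on whitespace and keeping the first field is one option.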

2. Reverse, complement, and reverse complement of a sequence

def reverse(seq_without_header: str) -> str:
    '''
    Convert a DNA sequence into its reverse sequence.
    '''
    return seq_without_header[::-1]

def complement(seq_without_header: str) -> str:
    '''
    Convert a DNA sequence into its complement sequence (upper case).
    Characters other than A, T, C, G are left unchanged.
    '''
    # Upper-case first so that lower-case input is handled too
    seq = seq_without_header.upper()
    transtable = str.maketrans('ATCG', 'TAGC')
    return seq.translate(transtable)

def reverse_complement(seq_without_header: str) -> str:
    '''
    Convert a DNA sequence into its reverse complement sequence (upper case).
    Characters other than A, T, C, G are left unchanged.
    '''
    # Upper-case first so that lower-case input is handled too
    seq = seq_without_header.upper()
    transtable = str.maketrans('ATCG', 'TAGC')
    # Complement every base, then reverse the result
    return seq.translate(transtable)[::-1]
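
The three helpers can be sanity-checked on a short sequence. The sketch below uses compact one-line variants of the functions so that it runs on its own; the table name _COMPLEMENT and the sequence in demo are made up for illustration.

```python
# Translation table shared by the helpers below (name is illustrative).
_COMPLEMENT = str.maketrans("ATCG", "TAGC")


def reverse(seq: str) -> str:
    """Return the sequence reversed; case is left as-is."""
    return seq[::-1]


def complement(seq: str) -> str:
    """Return the upper-case complement of the sequence."""
    return seq.upper().translate(_COMPLEMENT)


def reverse_complement(seq: str) -> str:
    """Return the upper-case reverse complement of the sequence."""
    return complement(seq)[::-1]


demo = "ATGCCg"  # mixed case on purpose

print(reverse(demo))             # gCCGTA
print(complement(demo))          # TACGGC
print(reverse_complement(demo))  # CGGCAT

# Applying the reverse complement twice gives back the upper-cased input.
print(reverse_complement(reverse_complement(demo)) == demo.upper())  # True
```

Ambiguity codes such as N are not in the translation table, so they pass through unchanged (only upper-cased). Extending the table would be needed if IUPAC codes should be complemented as well.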

Article link: https://www.haomeiwen.com/subject/urfgmhtx.html