Notes on the TCRGP workflow

Author: Yayamia | Published: 2023-04-11 12:41

    GitHub home page

    TCRGP is a novel Gaussian process method that can predict if TCRs recognize certain epitopes. This method can utilize different CDR sequences from both TCRα and TCRβ chains from single-cell data and learn which CDRs are important in recognizing the different epitopes.

    • It is well known that the CDR3β of a TCR is important in recognizing peptides presented to the T cell.
    • We propose a method called TCRGP which builds on non-parametric modelling using Gaussian process (GP) classification. The probabilistic formulation of GPs allows robust model inference already from small data sets, which is a great benefit as currently there exists very limited amounts of reported TCR-epitope interactions in curated databases.

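    I. Download

    Install TensorFlow and GPflow.
    If your network is slow, install them from a mirror.
    Download the whole repository from GitHub as a zip file and unzip it.

    ATTENTION: the versions listed below must be followed exactly.

    To use TCRGP, you will need to have

    • TensorFlow (We have used version 1.8.0); this corresponds to Python 3.6 (best installed in a dedicated virtual environment)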
    • GPflow (We have used version 1.1.1)
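
Because the version pins above matter, it can help to check the active environment before going further. The following is a minimal sketch, not part of TCRGP: the helper names (installed_version, find_mismatches, check_environment) are hypothetical, and the only facts taken from the requirements above are the version numbers 1.8.0, 1.1.1 and Python 3.6.

```python
import importlib
import sys

# Versions quoted in the requirements above.
REQUIRED = {"tensorflow": "1.8.0", "gpflow": "1.1.1"}
REQUIRED_PYTHON = (3, 6)


def installed_version(module_name):
    """Return the module's __version__ string, or None if it cannot be imported."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    return getattr(module, "__version__", None)


def find_mismatches(found, required=REQUIRED):
    """Compare found versions with required ones; return {name: (found, required)}."""
    return {
        name: (found.get(name), wanted)
        for name, wanted in required.items()
        if found.get(name) != wanted
    }


def check_environment():
    """Collect human-readable problems with the interpreter and installed packages."""
    problems = []
    if sys.version_info[:2] != REQUIRED_PYTHON:
        problems.append("Python is %d.%d, expected 3.6" % sys.version_info[:2])
    found = {name: installed_version(name) for name in REQUIRED}
    for name, (have, want) in find_mismatches(found).items():
        problems.append("%s is %s, expected %s" % (name, have, want))
    return problems
```

Running check_environment() in the notebook kernel returns an empty list when the kernel matches the pinned versions, and a list of mismatch messages otherwise.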

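    For installing TensorFlow 1.8.0, you can refer to this guide.

    For using a conda virtual environment in Jupyter Notebook on Linux, I used method two of the linked guide.
    Once everything is installed, run jupyter notebook inside the corresponding conda environment, then open the URL it prints.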


    If you get the error ImportError: cannot import name 'secure_write', see the linked fix.

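    II. Import

    Set the Jupyter Notebook working directory to the location where TCRGP was unzipped:

    # show the current working directory
    %pwd
    # change to the unzipped TCRGP folder (substitute your own path)
    %cd <path to the unzipped TCRGP folder>
    import tcrgp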
    

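    Note: when installing packages for a virtual environment, point the install at that environment's site-packages directory, e.g. pip3 install matplotlib -i https://pypi.tuna.tsinghua.edu.cn/simple/ -t /home/user/test/miniconda3/envs/tfpy3/lib/python3.6/site-packages

    To check where a package is installed: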

    import tensorflow
    print(tensorflow.__path__)
    

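    III. Obtaining the training set

    VDJdb

    Export a tsv file, as shown in the figure.

    Further filtering:

    1. Confidence score of at least 1.
    2. Select the species you need, e.g. mouse.
    3. Select the MHC (HLA) alleles you need. For human you could look at HLA-A*02; for mouse, filter by the strain you need.
    For example, for the common C57 mouse the MHC-I molecules are H-2Kb and H-2Db and the MHC-II molecule is I-A, so I keep the peptides presented by H-2Kb and H-2Db.
    4. Keep only epitopes that are recognized by at least 50 TCRB sequences.

    Export the result as a tsv file.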
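
The filtering steps above can also be done in Python on the exported file. This is a minimal sketch using only the standard library, not part of TCRGP. The column names (Gene, CDR3, Species, MHC A, Epitope, Score) and the species label MusMusculus are assumptions about the VDJdb export, so check them against the header of your own file; read_tsv and filter_vdjdb are hypothetical helper names.

```python
import csv
from collections import Counter


def read_tsv(path):
    """Load a tab-separated file into a list of dicts keyed by column name."""
    with open(path, newline="") as handle:
        return list(csv.DictReader(handle, delimiter="\t"))


def filter_vdjdb(rows, species, mhc_alleles, min_score=1, min_tcrb=50):
    """Apply the four filters described above to VDJdb rows."""
    # Filters 1-3: confidence score, species, presenting MHC allele.
    passing = [
        row for row in rows
        if int(row["Score"]) >= min_score
        and row["Species"] == species
        and any(row["MHC A"].startswith(allele) for allele in mhc_alleles)
    ]
    # Filter 4: count unique TRB CDR3 sequences per epitope.
    unique_trb = {
        (row["Epitope"], row["CDR3"]) for row in passing if row["Gene"] == "TRB"
    }
    counts = Counter(epitope for epitope, _ in unique_trb)
    return [row for row in passing if counts[row["Epitope"]] >= min_tcrb]
```

For the C57 example, a call might look like filter_vdjdb(read_tsv("vdjdb_export.tsv"), "MusMusculus", ["H-2Kb", "H-2Db"]), where the file name is a placeholder.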

    IV. Model building and validation

    Follow the reference.
    Validation method after the model is built:

    Leave-one-subject-out cross-validations
    Leave-one-subject-out (loso) cross-validations can be used to evaluate the performance of the model.
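
To make the idea concrete, here is a generic sketch of how leave-one-subject-out folds are formed: every subject in turn is held out as the test set while the model is trained on all other subjects. This illustrates the splitting scheme only and is not TCRGP's own implementation; the function name and the subject labels are hypothetical.

```python
def leave_one_subject_out(subjects):
    """Yield (held_out_subject, train_indices, test_indices) for each subject.

    `subjects` holds one subject label per sample, in sample order.
    """
    for held_out in sorted(set(subjects)):
        train = [i for i, s in enumerate(subjects) if s != held_out]
        test = [i for i, s in enumerate(subjects) if s == held_out]
        yield held_out, train, test


# Example: five TCRs coming from three subjects.
for subject, train_idx, test_idx in leave_one_subject_out(["a", "a", "b", "c", "b"]):
    print(subject, train_idx, test_idx)
```

Because all TCRs from one subject stay together in the test fold, the score reflects how well the model generalizes to a subject it has never seen.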

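    Explanation

    Note the following when predicting:

    1. Adjust lmax3, the maximum number of amino acids in a CDR3.
    2. In entries such as TRBV12-2+TRBV13-2, change the + to ;
    3. Use a threshold of >0.85 or 0.9.
      I will look into the details further and update this post if there is anything to add.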
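
The three prediction notes above can be sketched as small helper functions. This is a minimal sketch under stated assumptions, not TCRGP's API: the helper names are hypothetical, the scores are assumed to be probabilities between 0 and 1, and the strict comparison follows the ">0.85" written above.

```python
def normalise_v_gene(v_gene):
    """Replace '+' with ';' in ambiguous V gene entries (note 2)."""
    return v_gene.replace("+", ";")


def max_cdr3_length(cdr3_sequences):
    """Longest CDR3 in the data, a candidate value for lmax3 (note 1)."""
    return max(len(sequence) for sequence in cdr3_sequences)


def call_binders(scores, threshold=0.85):
    """Flag predictions whose score is strictly above the threshold (note 3)."""
    return [score > threshold for score in scores]


print(normalise_v_gene("TRBV12-2+TRBV13-2"))
print(max_cdr3_length(["CASSL", "CASSLGQAYEQYF"]))
print(call_binders([0.50, 0.86, 0.95], threshold=0.9))
```

Computing the maximum CDR3 length over both the training and the prediction data avoids choosing an lmax3 that is too short for some sequences.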
