Bioinformatics course_F6

Author: Ternq8 | Published 2017-08-30 11:30

    F6 wiki

    F6_Bioinformatics.png
    • Intro:
      ls
      grep
      uniq

    BOOK: shell scripting


    cd ..       # move up to the parent directory
    options
    ls -l -t    # long listing, ordered by modification time
    ls -lt      # the same, with the options combined


    https://fantom6-collaboration.gsc.riken.jp/files/mongoDB/

    ls -lt | head    # show the first 10 lines
    bunzip2 RunAll_sample_summary.20170830.tsv.bz2    # decompress the .bz2 file

    less -S RunAll_sample_summary.20170830.tsv
    less RunAll_sample_summary.20170830.tsv
    bash-3.2$ wc RunAll_sample_summary.20170830.tsv
    7191 212813 1857590 RunAll_sample_summary.20170830.tsv
    # columns: lines, words, bytes


    Are the sample IDs unique or not?
    cut -f1 RunAll_sample_summary.20170830.tsv | head    # first column = sample ID
    > output:
    unique
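
The `head` call only shows the first ten IDs, so a direct duplicate check may be more convincing. The sketch below is a minimal example, assuming a tab-separated file with the sample ID in column 1; `toy_samples.tsv` and its contents are made up for illustration and stand in for the real summary table.

```shell
# Hypothetical toy table standing in for the real TSV (made-up IDs).
tmpdir=$(mktemp -d)
printf 'S1\tliver\nS2\tbrain\nS3\theart\n' > "$tmpdir/toy_samples.tsv"

# Take column 1, sort it so identical IDs become adjacent,
# then let uniq -d print only the IDs that occur more than once.
dup_count=$(cut -f1 "$tmpdir/toy_samples.tsv" | sort | uniq -d | wc -l | tr -d ' ')

if [ "$dup_count" -eq 0 ]; then
    result="unique"
else
    result="not unique"
fi
echo "$result"
```

Running the same pipeline on the real file (replacing the toy path) should give a count of 0 if every sample ID occurs exactly once.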

    About the options of cut: http://man.linuxde.net/cut


    Introduction to SAM/BAM
    mapping
    format for reads

    FASTQ: FASTA + quality

    BAM is the same as SAM, but compressed (binary)

    Each header line of a SAM file starts with @

    • QNAME: the name of the read
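    • FLAG: what happened with the read (bitwise flags)
    • RNAME: the name of the reference sequence
    • POS: 1-based leftmost mapping position
    • CIGAR: geometry of the alignment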
      drunken sailor?
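
The FLAG column packs several yes/no answers into one integer, one bit per answer. As a small sketch, shell arithmetic can test individual bits; the value 99 is only an illustrative flag chosen for this example, and the bit meanings are the ones defined in the SAM specification.

```shell
# Illustrative FLAG value (assumed for this example, not taken from the course data).
flag=99   # 99 = 1 + 2 + 32 + 64

# Test single bits with bitwise AND; the comparison yields 1 if the bit is set, else 0.
is_paired=$(( (flag & 1) != 0 ))        # 0x1:  read is paired
is_proper_pair=$(( (flag & 2) != 0 ))   # 0x2:  mapped in a proper pair
is_unmapped=$(( (flag & 4) != 0 ))      # 0x4:  read is unmapped
is_reverse=$(( (flag & 16) != 0 ))      # 0x10: read maps to the reverse strand
is_first=$(( (flag & 64) != 0 ))        # 0x40: first read of the pair

echo "paired=$is_paired proper_pair=$is_proper_pair unmapped=$is_unmapped reverse=$is_reverse first_in_pair=$is_first"
```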

    Bedtools

    BED format

    chr start end | optional (name score strand (+/-/.)) |

    start < end
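
The start < end rule is easy to check mechanically. The following is a minimal sketch using awk, assuming a tab-separated BED file; `toy.bed` and its records are hypothetical and exist only to show the check.

```shell
# Hypothetical toy BED file: two valid records and one with start >= end.
tmpdir=$(mktemp -d)
printf 'chr1\t100\t200\tfeatA\t0\t+\nchr1\t500\t400\tfeatB\t0\t-\nchr2\t10\t50\tfeatC\t0\t.\n' > "$tmpdir/toy.bed"

# Print every record that violates start < end (columns 2 and 3, compared as numbers).
bad_lines=$(awk -F'\t' '($2 + 0) >= ($3 + 0)' "$tmpdir/toy.bed")
bad_count=$(printf '%s\n' "$bad_lines" | grep -c .)

echo "$bad_count invalid record(s):"
echo "$bad_lines"
```

Here only the `featB` record is reported, because its start (500) is not smaller than its end (400).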

    mysql

    set theory for genomics

    bedtools intersect --help
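    bedtools intersect -u -a XX.bed -b XXX.bed | head  # -u: report each entry of A once if it overlaps anything in B
    # -wa: write the original entry of A for each overlap
    # -v: report only the entries of A that have no overlap in B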
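
To make the set-theory idea concrete without needing bedtools installed, the sketch below re-implements the behaviour of `-u` and `-v` in plain awk on two tiny files. It is only an illustration of the overlap test, not how bedtools works internally; `overlap_filter`, `a.bed`, `b.bed` and all coordinates are hypothetical names and data invented for this example.

```shell
# Hypothetical toy inputs (made-up coordinates).
tmpdir=$(mktemp -d)
printf 'chr1\t100\t200\nchr1\t300\t400\nchr2\t50\t80\n' > "$tmpdir/a.bed"
printf 'chr1\t150\t160\nchr1\t180\t250\nchr2\t90\t120\n' > "$tmpdir/b.bed"

# overlap_filter MODE A B
#   MODE=u: print records of A that overlap at least one record of B (once each)
#   MODE=v: print records of A that overlap no record of B
overlap_filter() {
    awk -F'\t' -v mode="$1" '
        # First file on the command line is B: remember all of its intervals.
        NR == FNR { chrom[NR] = $1; start[NR] = $2 + 0; stop[NR] = $3 + 0; n = NR; next }
        # Second file is A: test each record against every interval of B.
        {
            hit = 0
            for (i = 1; i <= n; i++) {
                # Two intervals overlap when each one starts before the other ends.
                if (chrom[i] == $1 && start[i] < ($3 + 0) && ($2 + 0) < stop[i]) { hit = 1; break }
            }
            if ((mode == "u" && hit) || (mode == "v" && !hit)) print
        }
    ' "$3" "$2"
}

with_overlap=$(overlap_filter u "$tmpdir/a.bed" "$tmpdir/b.bed")
without_overlap=$(overlap_filter v "$tmpdir/a.bed" "$tmpdir/b.bed")

echo "overlapping:"; echo "$with_overlap"
echo "non-overlapping:"; echo "$without_overlap"
```

Every record of A ends up in exactly one of the two outputs, which is why `-u` and `-v` behave like a set and its complement.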
    

    is this overlap significant?

    bedtools fisher -m -a XXX.bed -b XXX.bed -g hg38.genome
    # -m: merge overlapping intervals before looking at the overlap
    # the output includes a two-tail p-value
    

    bedtools shuffle

    bedtools shuffle -i hg38_gwas.bed
    # takes only one BED file as input; a genome file (-g) is also required

    bedtools shuffle -i hg38_gwas.bed -g hg38.genome | bedtools sort > hg38_gwas_shuffle.bed
    

    Format conversion
    Coverage Plots


    R Shiny
    R Markdown
    knitr
    R: a platform for releasing results
    R calls variables "objects"

    • vectors
      c()
      paste0()

    • Lists: a second cornerstone class of R
      Any variable can be put into a list.

    • dataframe

    df = data.frame(
      a = c(),
      b = c()
    )
    summary(df)
    # the vectors are combined as columns
    
    • subsetting elements of objects
      by coordinate
      or by name

    help("[")

    • “nothing”
      NA: missing values, not available
      NULL: nothing
      NaN: 0/0 # result is not a number
    sum(1,2,NA, na.rm=T)
    3
    sum(1, NULL)
    1
    
    • in the terminal:
    R
    barplot(c(1,2,3))
    q() # get out of R
    

    First steps in R with RStudio

    Online: http://try.jupyter.org/ (choose Welcome R-demo)

    How to generate random numbers in R:
    http://blog.csdn.net/lilanfeng1991/article/details/18505723

    hist(c(1,2,4))
    hist(runif(100))
    hist(rnorm(100))
    

    GitHub:
    https://www.r-bloggers.com/rstudio-and-github/
