Protocols for RNA-seq data analy


Author: 烦啦烦啦 | Published: 2020-08-20 15:11
1. This protocol uses rice as the example species.
2. All scripts and raw data can be found in '/data/dta/shared/rnaseqworkflow' (for lab members).

Before you start:

  1. Create a root directory to hold all data for the project.
  2. Create a subdirectory for the reference, then download the reference genome and its annotations into it.
  3. Build a genome index with the alignment software of your choice (HISAT2 is used here).
  4. Create further subdirectories for the different kinds of data, such as raw data, matrices, and scripts.
code:
# Project root and subdirectories
$ mkdir -p Drought_stress/Rice && cd Drought_stress/Rice
$ mkdir data matrix homology olddata reference src_rice
$ mkdir reference/IRGSP && cd reference/IRGSP
# Reference genome (FASTA) and annotations (GTF, GFF3), Ensembl Plants release 47
$ wget ftp://ftp.ensemblgenomes.org/pub/release-47/plants/fasta/oryza_sativa/dna/Oryza_sativa.IRGSP-1.0.dna.toplevel.fa.gz
$ wget ftp://ftp.ensemblgenomes.org/pub/release-47/plants/gtf/oryza_sativa/Oryza_sativa.IRGSP-1.0.47.gtf.gz
$ wget ftp://ftp.ensemblgenomes.org/pub/release-47/plants/gff3/oryza_sativa/Oryza_sativa.IRGSP-1.0.47.gff3.gz
$ gunzip *.gz
# Build the HISAT2 index with 8 threads; the index files are written as hsindex/IRGSP.*
$ module load Anaconda3 hisat2
$ mkdir hsindex
$ hisat2-build -p 8 Oryza_sativa.IRGSP-1.0.dna.toplevel.fa hsindex/IRGSP
$ module unload Anaconda3 hisat2
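
Once hisat2-build has finished, it can be worth confirming that the index is complete before any alignment is started. The sketch below assumes the usual HISAT2 layout of eight files named <basename>.1.ht2 to <basename>.8.ht2 (or .ht2l for large indexes). The function name check_hisat2_index is made up for illustration and is not part of HISAT2.

```shell
# Hypothetical helper: check that all eight HISAT2 index files exist
# for a given index basename (for example hsindex/IRGSP).
check_hisat2_index() {
    base=$1
    missing=0
    for i in 1 2 3 4 5 6 7 8; do
        # Regular indexes end in .ht2, large indexes in .ht2l
        if [ ! -e "${base}.${i}.ht2" ] && [ ! -e "${base}.${i}.ht2l" ]; then
            echo "missing: ${base}.${i}.ht2" >&2
            missing=1
        fi
    done
    # Return 0 when every file was found, 1 otherwise
    return "$missing"
}
```

From reference/IRGSP this could be called as: check_hisat2_index hsindex/IRGSP && echo "index looks complete"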



Workflow:

Steps 1-3 run on the server, steps 4-7 on a personal computer, and steps 8-9 on the server again.
  1. Find BioProjects that match the conditions of interest (drought, roots, and so on).
  2. Create samplelist.txt in the data subdirectory and save the SRA accession numbers to be downloaded in it.
  3. Run the workflow script in the background: nohup sh RNAseq_workflow.sh &
code:
$ cd ~/Drought_stress/Rice/data
$ vim samplelist.txt  # enter the SRA accession numbers we want to download
$ cd ../src_rice
$ nohup sh RNAseq_workflow.sh &  # this script can be found in the attachment
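
Because the workflow script runs unattended, a typo in samplelist.txt may only surface much later. The following is a minimal sketch of a pre-flight check. It assumes that the list holds one SRA run accession per line (SRR, ERR, or DRR followed by digits); the actual format expected by RNAseq_workflow.sh is defined in the attachment and may differ. The function name validate_samplelist is hypothetical.

```shell
# Hypothetical helper: report lines in a sample list that do not look
# like SRA run accessions. ASSUMPTION: one accession per line.
validate_samplelist() {
    list=$1
    # Drop blank lines, then keep anything that is not SRR/ERR/DRR + digits
    bad=$(grep -v '^[[:space:]]*$' "$list" | grep -Ev '^[SED]RR[0-9]+$' || true)
    if [ -n "$bad" ]; then
        echo "unexpected entries in $list:" >&2
        echo "$bad" >&2
        return 1
    fi
    return 0
}
```

A possible use is validate_samplelist ../data/samplelist.txt before starting the workflow. Redirecting the job's output, for example nohup sh RNAseq_workflow.sh > workflow.log 2>&1 &, also makes it easier to follow progress with tail -f workflow.log; the log file name here is only an example.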

  4. Transfer the count files to your local machine for downstream analysis (the R version on the server is too new to support the R package "biomaRt"). You can use the scp command or FileZilla to move files between the local machine and the server.
  5. Create an R project in RStudio locally and use DESeq2 and biomaRt for differential expression analysis and annotation.
  6. Run the following R scripts in sequence: downstream.R > Deseq2analysis.R > merge_desingn.R (the whole project can be found in the attachment named Rice4.zip).
  7. Send the differential gene table and the gene count table to the server and put them in the '~/Drought_stress/Rice/homology' directory.
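
The two transfers in this part of the workflow (count files down to the local machine, result tables back up to the server) can be sketched with scp as follows. The host user@server.example.org is a placeholder, and the assumption that the count files sit in the matrix subdirectory is a guess based on the directory layout created earlier; adjust both to your setup. The helper names run, fetch_counts, and push_tables are made up for illustration. Setting DRY_RUN=1 prints each command instead of running it.

```shell
# Placeholder host; replace with your own login.
REMOTE=user@server.example.org
# Project root on the server, relative to the remote home directory.
REMOTE_ROOT=Drought_stress/Rice

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Pull the count files into a local directory.
# ASSUMPTION: the counts are stored in the matrix subdirectory.
fetch_counts() {
    run scp -r "$REMOTE:$REMOTE_ROOT/matrix" "$1"
}

# Push one or more result tables to the homology directory on the server.
push_tables() {
    run scp "$@" "$REMOTE:$REMOTE_ROOT/homology/"
}
```

For example, fetch_counts ./counts copies the directory to the local machine, and push_tables followed by the names of the two result tables sends them back.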

  8. Go to the src_rice subdirectory.
  9. Run the related scripts.
code:
$ cd ~/Drought_stress/Rice/src_rice
$ nohup sh anno.sh &
$ nohup sh merge.sh &
# The scripts can be found in the attachment.
# Rice.anno.txt can also be found in the attachment.
# head.txt holds the column names for the final output table; it was edited and combined from the column names of the raw files we used.
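
The role of head.txt can be illustrated with a small sketch: a one-line header file is placed on top of a headerless table. This is not the content of merge.sh, which is in the attachment. The function name add_header is hypothetical, and the sketch assumes tab-separated files.

```shell
# Hypothetical helper: prepend a one-line header file to a headerless table.
# ASSUMPTION: both files are tab-separated.
add_header() {
    header=$1
    body=$2
    out=$3
    # Compare the number of columns in the header and in the first data row
    hcols=$(head -n 1 "$header" | awk -F '\t' '{print NF}')
    bcols=$(head -n 1 "$body" | awk -F '\t' '{print NF}')
    if [ "$hcols" != "$bcols" ]; then
        echo "column mismatch: header=$hcols body=$bcols" >&2
        return 1
    fi
    # Header first, then the data rows
    cat "$header" "$body" > "$out"
}
```

The column check is there because a header that is one field short or long shifts every column name in the final table without causing any error.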

Attention:

If you have any suggestions or comments, please contact the author at xuyp8121@mail.ustc.edu.cn.
We look forward to hearing from anyone who shares our interest in systems biology and comparative biology!

Article link: https://www.haomeiwen.com/subject/qlukjktx.html