Sequence Read Archive (SRA)

Author: dearygt | Published 2021-08-12 19:58

2021.8.13 HuPY

1. Overview

Prepare the following information for your submission and be ready to:

Provide a project name and description
Choose the type or ‘package’ of your samples
Provide sample metadata that is unique by sample
Provide sequence metadata
Upload your files

2. Project

For a new project, prepare the information that creates a BioProject.

Required information:

Title
Description

Optional information:

Participants
Grants: Required if your project was funded by a National Institutes of Health (NIH) grant

Important: The required BioProject and BioSample can be created during submission. You can also link an existing BioProject and BioSample with accession numbers.

3. Sample

For new samples, prepare the details that will serve as BioSamples' metadata for individual biological specimens (collection date, location, etc.).

Select the ‘package’ that best fits your biological samples. Each package has a distinct set of required attributes, which you can preview on the NCBI BioSample site.

Taking the ‘Metagenome’ package as an example:
Each sample must have a unique set of attributes. Provide all required fields and any optional fields that apply to your samples.
Add custom attributes to fully describe your samples and facilitate searching. You should submit at least one unique data file for each sample you create.
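
The uniqueness rule above can be checked locally before submission. The sketch below is only an illustration: it assumes a hypothetical tab-separated attribute table with one header line and the sample name in the first column. The file layout, the column names, and the function name `find_duplicate_samples` are assumptions for this example, not part of the NCBI templates. It reports any row whose remaining attributes repeat an earlier row.

```shell
# Report samples whose attribute columns (everything except column 1)
# are identical to those of an earlier row.
# Assumes a tab-separated file with exactly one header line.
find_duplicate_samples() {
    awk -F '\t' '
        NR == 1 { next }                  # skip the header line
        {
            key = ""
            for (i = 2; i <= NF; i++)     # build a key from the attribute columns only
                key = key FS $i
            if (key in seen) {
                printf "%s duplicates %s\n", $1, seen[key]
                dup = 1
            } else {
                seen[key] = $1
            }
        }
        END { exit (dup ? 1 : 0) }        # non-zero exit status if any duplicate was found
    ' "$1"
}

# Example with made-up sample data: S3 repeats the attributes of S1.
tmp=$(mktemp)
printf 'sample_name\thost\tcollection_date\n' > "$tmp"
printf 'S1\tmaize\t2021-06-01\n' >> "$tmp"
printf 'S2\tmaize\t2021-06-02\n' >> "$tmp"
printf 'S3\tmaize\t2021-06-01\n' >> "$tmp"
find_duplicate_samples "$tmp" || echo "fix the rows listed above before submitting"
rm -f "$tmp"
```

Adding a distinguishing attribute (for example a replicate number) to the reported rows is usually enough to make them unique.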

4. Library

Prepare the following 'Library' information:

Which BioSample should be linked to which file(s)
Your library construction protocol
Other metadata like unique library names, sequencing platform, and filetype
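
The library information listed above can be sanity-checked before upload. The following is a minimal sketch under stated assumptions: a hypothetical, simplified tab-separated table with the columns `sample_name`, `library_ID`, and `filename` (the real SRA metadata template has more columns), and a function name, `check_library_table`, invented for this example. It checks that library names are unique and that every listed file exists in the upload directory.

```shell
# Check a library table before upload. Assumed (hypothetical) layout:
# tab-separated, one header line, columns: sample_name, library_ID, filename.
check_library_table() {
    table=$1
    dir=$2
    tab=$(printf '\t')
    status=0

    # Library names must be unique: collect any library_ID used more than once.
    dup=$(tail -n +2 "$table" | cut -f 2 | sort | uniq -d)
    if [ -n "$dup" ]; then
        echo "duplicate library names: $dup"
        status=1
    fi

    # Every listed file should exist in the upload directory.
    missing=$(tail -n +2 "$table" | while IFS=$tab read -r sample library file; do
        [ -f "$dir/$file" ] || echo "$file (sample $sample)"
    done)
    if [ -n "$missing" ]; then
        echo "missing files: $missing"
        status=1
    fi

    return "$status"
}

# Example with made-up file names in a temporary directory.
updir=$(mktemp -d)
touch "$updir/S1_R1.fastq.gz" "$updir/S2_R1.fastq.gz"
printf 'sample_name\tlibrary_ID\tfilename\n' > "$updir/libs.tsv"
printf 'S1\tlib1\tS1_R1.fastq.gz\n' >> "$updir/libs.tsv"
printf 'S2\tlib2\tS2_R1.fastq.gz\n' >> "$updir/libs.tsv"
check_library_table "$updir/libs.tsv" "$updir" && echo "library table looks consistent"
rm -rf "$updir"
```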

Reference: About Sequence Read Archive (SRA) Submission (nih.gov)

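5. Summary

A BioSample is a descriptive record: it holds the metadata for one biological specimen.

A BioProject links to its BioSample and SRA records:

--- N BioSamples

--- BioSamples: SRSxxx (the SRA sample accession; the BioSample record itself is numbered SAMNxxx)

--- N SRA records

--- Experiments: SRXxxx

--- Runs: SRRxxx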
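
When working with mixed lists of identifiers, the accession prefix shows which level of this hierarchy a record belongs to. The helper below is a small illustrative sketch; the function name `accession_type` is made up for this example. It covers only NCBI-issued prefixes, including `PRJNA`, `SAMN`, and `SRP`, which are not mentioned above, and ignores the ENA and DDBJ equivalents. The accessions in the loop are placeholders, not real records.

```shell
# Map an NCBI accession to the record type implied by its prefix.
accession_type() {
    case $1 in
        PRJNA[0-9]*) echo "BioProject" ;;
        SAMN[0-9]*)  echo "BioSample" ;;
        SRP[0-9]*)   echo "SRA study" ;;
        SRS[0-9]*)   echo "SRA sample" ;;
        SRX[0-9]*)   echo "SRA experiment" ;;
        SRR[0-9]*)   echo "SRA run" ;;
        *)           echo "unknown" ;;
    esac
}

# Placeholder accessions, for illustration only.
for acc in PRJNA000000 SRS0000001 SRX0000001 SRR0000001; do
    printf '%s\t%s\n' "$acc" "$(accession_type "$acc")"
done
```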
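
Batch-download from SRA

1. Find the BioProject ID in the article's Data Availability section, then open the SRA Run Selector and enter the BioProject ID in the "Accession" search box to list every SRR run that belongs to the study.
2. In the "Found xx items" row, tick all runs, then under Select > Selected click Metadata (saved as SraRunTable) or Accession List (saved as SRR_Acc_List) to download the corresponding file. Transfer both files to the cluster with FileZilla (path: ~/paper_script/pnasHeterosis).
3. Then batch-download the data on the cluster: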
<pre class="md-fences md-end-block ty-contain-cm modeLoaded" spellcheck="false" lang="bash" cid="n306" mdtype="fences"># Load the environment (`modules` is kept as written; it is presumably a site-specific command)
modules

# Load sratoolkit
module load sratoolkit/2.9.6

# Check that sratoolkit works
prefetch --help

# Go to the working directory ~/paper_script/pnasHeterosis for the batch download
cd ~/paper_script/pnasHeterosis

# Alias: print a bsub command template to copy and adapt
alias BSUB="echo \"bsub -J blast -n 2 -R span[hosts=1] -o %J.out -e %J.err -q normal 'command'\""
BSUB

# Downloaded data is stored in ~/ncbi/public/sra by default
</pre>
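
Before any download job is submitted, it may be worth checking the accession list that was transferred to the cluster. The sketch below assumes the list holds one run accession per line; the function names `clean_acc_list` and `validate_acc_list` are invented for this example. It strips Windows line endings (which a transfer from a Windows machine can introduce), drops blank and repeated lines, and flags anything that does not look like a run accession.

```shell
# Print the accession list without carriage returns, blank lines, or repeats
# (the first occurrence of each accession is kept, in the original order).
clean_acc_list() {
    tr -d '\r' < "$1" | awk 'NF && !seen[$0]++'
}

# Fail if any remaining line is not a run accession (SRR, ERR, or DRR plus digits);
# otherwise print how many distinct runs the list contains.
validate_acc_list() {
    bad=$(clean_acc_list "$1" | grep -Ev '^[SED]RR[0-9]+$' || true)
    if [ -n "$bad" ]; then
        printf '%s\n' "$bad" | sed 's/^/not a run accession: /' >&2
        return 1
    fi
    echo "$(clean_acc_list "$1" | wc -l | tr -d ' ') run accessions"
}

# Example with placeholder accessions, one of them repeated and with CRLF endings.
list=$(mktemp)
printf 'SRR0000001\r\nSRR0000002\r\n\r\nSRR0000001\r\n' > "$list"
validate_acc_list "$list"
rm -f "$list"
```

The output of `clean_acc_list` can be redirected to a new file and passed to the download step in place of the original list.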

Command line

<pre mdtype="fences" cid="n319" lang="bash" spellcheck="false" class="md-fences md-end-block ty-contain-cm modeLoaded">#!/bin/sh
# Download every run in the list; -K makes bsub wait for the job, so the conversion is submitted only after prefetch ends
bsub -K -J prefetch -n 2 -R "span[hosts=1]" -o %J.prefetch.out -e %J.prefetch.err -q normal 'prefetch --option-file ~/paper_script/pnasHeterosis/SRR_Acc_List.txt'
# Convert the downloaded .sra files to FASTQ, one file per read (--split-files)
bsub -K -J fastqdump -n 2 -R "span[hosts=1]" -o %J.fastqdump.out -e %J.fastqdump.err -q normal 'fastq-dump -I --split-files ~/ncbi/public/sra/*.sra'</pre>
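
Large batch downloads sometimes end with a few runs missing, so a quick comparison of the accession list against the download directory can be useful before converting to FASTQ. This is a sketch only: it assumes the flat layout described above, with one `<accession>.sra` file per run directly in the download directory, and the function name `list_missing_sra` is invented for this example.

```shell
# Print every accession from the list ($1) that has no matching
# <accession>.sra file in the download directory ($2).
list_missing_sra() {
    while read -r acc || [ -n "$acc" ]; do   # also handle a last line without newline
        [ -n "$acc" ] || continue            # skip blank lines
        [ -f "$2/$acc.sra" ] || echo "$acc"
    done < "$1"
}

# Example with placeholder accessions in a temporary directory:
# only the first of the two runs has been "downloaded".
sra_dir=$(mktemp -d)
printf 'SRR0000001\nSRR0000002\n' > "$sra_dir/acc_list.txt"
touch "$sra_dir/SRR0000001.sra"
list_missing_sra "$sra_dir/acc_list.txt" "$sra_dir"
rm -rf "$sra_dir"
```

On the cluster this would be called as `list_missing_sra ~/paper_script/pnasHeterosis/SRR_Acc_List.txt ~/ncbi/public/sra`. An empty result suggests that every listed run was downloaded; otherwise the printed accessions can be saved to a file and downloaded again.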

Article link: https://www.haomeiwen.com/subject/llejbltx.html