美文网首页单细胞测序scATAC-seq学习资料单细胞测序技术
使用cellranger-atac软件处理10x单细胞ATAC-

使用cellranger-atac软件处理10x单细胞ATAC-

作者: Davey1220 | 来源:发表于2020-06-08 14:33 被阅读0次

    image.png

    Generating FASTQs with cellranger-atac mkfastq

    Overview

    cellranger-atac软件提供了cellranger-atac mkfastq子程序,可以将Illumina测序仪产生的原始base call files(BCLs)下机数据转换为FASTQ格式的测序文件。该子程序是bcl2fastq软件的封装,提供了以下功能:

    • Translates 10x sample index set names into the corresponding list of four sample index oligonucleotides. For example, well A1 can be specified in the samplesheet as SI-NA-A1, and cellranger-atac mkfastq will recognize the four oligos AAACGGCG, CCTACCAT, GGCGTTTC, and TTGTAAGA and merge the resulting FASTQ files.
    • Supports a simplified CSV samplesheet format to handle 10x use cases.
    • Generates sequencing and 10x-specific quality control metrics, including barcode quality, accuracy, and diversity.
    • Supports most bcl2fastq arguments, such as --use-bases-mask.

    Example Workflows

    在本示例中,我们构建了两个10x测序文库(each processed through a separate Chromium chip channel),然后将它们混合在一起在同一个flowcell上进行测序。使用cellranger-atac mkfastq子程序拆库分离后,对每个测序文库的数据进行单独处理。

    image.png

    在本示例中,我们构建了一个10x测序文库,然后将它们放到两个不同的flowcell上进行测序。使用cellranger-atac mkfastq子程序将数据转换为FASTQ格式后,把它们合并到一起进行处理。

    image.png

    Arguments and Options

    cellranger-atac mkfastq子程序是bcl2fastq软件的封装,主要提供了以下常用参数:

    image.png image.png

    Example Data

    cellranger-atac mkfastq子程序可以识别两种格式的文件进行处理:一种是常见的逗号分隔的3列格式的CSV文件;另一种是Illumina® Experiment Manager (IEM)格式的文件。
    To follow along, please do the following:

    • Download the tiny-bcl tar file.
    • Untar the tiny-bcl tar file in a convenient location. This will create a new tiny-bcl subdirectory.
    • Download the simple CSV layout file: cellranger-atac-tiny-bcl-simple-1.0.0.csv.
    • Download the Illumina® Experiment Manager sample sheet: cellranger-atac-tiny-bcl-samplesheet-1.0.0.csv.

    Running mkfastq with a simple CSV samplesheet

    我们推荐使用简单的CSV格式文件进行处理,该文件包含三列,分别是(Lane, Sample, Index)

    Lane,Sample,Index
    1,test_sample,SI-NA-C1
    

    Here are the options for each column:


    image.png

    To run cellranger-atac mkfastq with a simple layout CSV, use the --csv argument. Here's how to run mkfastq on the tiny-bcl sequencing run with the simple layout:

    cellranger-atac mkfastq --id=tiny-bcl \
                            --run=/path/to/tiny_bcl \
                            --csv=cellranger-atac-tiny-bcl-simple-1.0.0.csv
     
    cellranger-atac mkfastq
    Copyright (c) 2018 10x Genomics, Inc.  All rights reserved.
    -------------------------------------------------------------------------------
    Martian Runtime - 1.2.0-3.2.4
    Running preflight checks (please wait)...
    2018-08-09 16:33:54 [runtime] (ready)           ID.tiny-bcl.MAKE_FASTQS_CS.MAKE_FASTQS.PREPARE_SAMPLESHEET
    2018-08-09 16:33:57 [runtime] (split_complete)  ID.tiny-bcl.MAKE_FASTQS_CS.MAKE_FASTQS.PREPARE_SAMPLESHEET
    2018-08-09 16:33:57 [runtime] (run:local)       ID.tiny-bcl.MAKE_FASTQS_CS.MAKE_FASTQS.PREPARE_SAMPLESHEET.fork0.chnk0.main
    2018-08-09 16:34:00 [runtime] (chunks_complete) ID.tiny-bcl.MAKE_FASTQS_CS.MAKE_FASTQS.PREPARE_SAMPLESHEET
    ...
    

    Running mkfastq with an Illumina® Experiment Manager sample sheet

    cellranger-atac mkfastq子程序还可以使用Illumina® Experiment Manager (IEM)格式的样本表作为输入。
    Let's briefly look at cellranger-atac-tiny-bcl-samplesheet-1.0.0.csv before running the pipeline. You will see a number of fields specific to running on Illumina® platforms, and then a [Data] section.
    That section is where to put your sample, lane and index information. Here's an example:

    [Data]
    Lane,Sample_ID,index,Sample_Project
    1,Sample1,SI-NA-C1,tiny-bcl
    

    Next, run the cellranger-atac mkfastq pipeline, using the --samplesheet argument:

    cellranger-atac mkfastq --id=tiny-bcl \
                            --run=/path/to/tiny_bcl \
                            --samplesheet=cellranger-atac-tiny-bcl-samplesheet-1.0.0.csv
     
    cellranger-atac mkfastq
    Copyright (c) 2018 10x Genomics, Inc.  All rights reserved.
    -------------------------------------------------------------------------------
    Martian Runtime - 3.2.4
    Running preflight checks (please wait)...
    2018-08-09 16:25:49 [runtime] (ready)           ID.tiny-bcl.MAKE_FASTQS_CS.MAKE_FASTQS.PREPARE_SAMPLESHEET
    2018-08-09 16:25:52 [runtime] (split_complete)  ID.tiny-bcl.MAKE_FASTQS_CS.MAKE_FASTQS.PREPARE_SAMPLESHEET
    2018-08-09 16:25:52 [runtime] (run:local)       ID.tiny-bcl.MAKE_FASTQS_CS.MAKE_FASTQS.PREPARE_SAMPLESHEET.fork0.chnk0.main
    2018-08-09 16:25:58 [runtime] (chunks_complete) ID.tiny-bcl.MAKE_FASTQS_CS.MAKE_FASTQS.PREPARE_SAMPLESHEET
    ...
    

    Checking FASTQ output

    运行完cellranger-atac mkfastq程序后,会生成一个新的文件夹,存放着转换好的FASTQ文件,该文件夹名是--id参数指定的名称(if not specified, defaults to the name of the flowcell):

    $ ls -l
    drwxr-xr-x 4 jdoe  jdoe     4096 Sep 13 12:05 tiny-bcl
    

    The key output files can be found in outs/fastq_path, and is organized in the same manner as a conventional bcl2fastq run:

    $ ls -l tiny-bcl/outs/fastq_path/
    drwxr-xr-x 3 jdoe jdoe         3 Aug  9 12:26 Reports
    drwxr-xr-x 2 jdoe jdoe         8 Aug  9 12:26 Stats
    drwxr-xr-x 3 jdoe jdoe         3 Aug  9 12:26 tiny-bcl
    -rw-r--r-- 1 jdoe jdoe  20615106 Aug  9 12:26 Undetermined_S0_L001_I1_001.fastq.gz
    -rw-r--r-- 1 jdoe jdoe 151499694 Aug  9 12:26 Undetermined_S0_L001_R1_001.fastq.gz
    -rw-r--r-- 1 jdoe jdoe  52692701 Aug  9 12:26 Undetermined_S0_L001_R2_001.fastq.gz
    -rw-r--r-- 1 jdoe jdoe 151499694 Aug  9 12:26 Undetermined_S0_L001_R3_001.fastq.gz
     
    $ tree tiny-bcl/outs/fastq_path/tiny_bcl/
    tiny-bcl/outs/fastq_path/tiny_bcl/
      Sample1
        Sample1_S1_L001_I1_001.fastq.gz
        Sample1_S1_L001_R1_001.fastq.gz
        Sample1_S1_L001_R2_001.fastq.gz
        Sample1_S1_L001_R3_001.fastq.gz
    

    If you want to remove the Undetermined FASTQs from the output to save space, you can run mkfastq with the --delete-undetermined flag.

    Assessing Quality Control Metrics

    When the --qc flag is specified, the cellranger-atac mkfastq pipeline writes both sequencing and 10x-specific quality control metrics into a JSON file. The metrics are in the outs/qc_summary.json file.
    The use of --qc flag is not supported on NovaSeq flow cells.
    The qc_summary.json file contains a number of useful metrics. The sample_qc key is a good place to start exploring your data.

    "sample_qc": {
      "Sample1": {
        "5": {
          "barcode_exact_match_ratio": 0.9336158258904611,
          "barcode_q30_base_ratio": 0.9611993091728814,
          "bc_on_whitelist": 0.9447542078230667,
          "mean_barcode_qscore": 37.770630795934,
          "number_reads": 2748155,
          "read1_q30_base_ratio": 0.8947676653366835,
          "read2_q30_base_ratio": 0.7771883245304577
        },
        "all": {
          "barcode_exact_match_ratio": 0.9336158258904611,
          "barcode_q30_base_ratio": 0.9611993091728814,
          "bc_on_whitelist": 0.9447542078230667,
          "mean_barcode_qscore": 37.770630795934,
          "number_reads": 2748155,
          "read1_q30_base_ratio": 0.8947676653366835,
          "read2_q30_base_ratio": 0.7771883245304577
        }
      }
    }
    

    The sample_qc metric is a series of key value pairs for each sample in the sample sheet, and one metrics structure per lane per sample, plus an 'all' structure in case a sample spans multiple lanes.

    The metrics are as follows:


    image.png

    Single-Library Analysis with cellranger-atac count

    Cell Ranger ATAC's pipelines analyze sequencing data produced from Chromium Single Cell ATAC libraries. This involves the following steps:

    • Run cellranger-atac mkfastq on the Illumina® BCL output folder to generate FASTQ files.
    • Run cellranger-atac count on each library that was demultiplexed by cellranger-atac mkfastq.

    Run cellranger-atac count

    To generate single-cell accessibility counts for a single library, run cellranger-atac count with the following arguments. For a complete list of command-line arguments, run cellranger-atac count --help.

    image.png
    image.png

    After determining these input arguments, run cellranger-atac:

    $ cd /home/jdoe/runs
    $ cellranger-atac count --id=sample345 \
                       --reference=/opt/refdata-cellranger-atac-GRCh38-1.2.0 \
                       --fastqs=/home/jdoe/runs/HAWT7ADXX/outs/fastq_path \
                       --sample=mysample \
                       --localcores=8 \
                       --localmem=64 
    
    Martian Runtime - 3.2.4
     
    Running preflight checks (please wait)...
    2018-09-17 21:33:47 [runtime] (ready)           ID.sample345.SC_ATAC_COUNTER_CS.SC_ATAC_COUNTER._BASIC_SC_ATAC_COUNTER._ALIGNER.SETUP_CHUNKS
    2018-09-17 21:33:47 [runtime] (run:local)       ID.sample345.SC_ATAC_COUNTER_CS.SC_ATAC_COUNTER._BASIC_SC_ATAC_COUNTER._ALIGNER.SETUP_CHUNKS.fork0.chnk0.main
    2018-09-17 21:33:56 [runtime] (chunks_complete) ID.sample345.SC_ATAC_COUNTER_CS.SC_ATAC_COUNTER._BASIC_SC_ATAC_COUNTER._ALIGNER.SETUP_CHUNKS
    ...
    

    默认情况下,cellranger-atac count子程序会调用系统中可用的所有CPU进行计算,我们可以通过设置--localcores参数指定要用的CPU个数,如设置--localcores=16参数指定使用16个CPU进行计算。同样的,我们还可以通过设置--localmem参数指定要使用的运行内存。

    Output Files

    A successful cellranger-atac count run should conclude with a message similar to this:

    2018-09-17 22:26:56 [runtime] (join_complete)   ID.sample345.SC_ATAC_COUNTER_CS.SC_ATAC_COUNTER.CLOUPE_PREPROCESS
     
    Outputs:
    - Per-barcode fragment counts & metrics:        /opt/sample345/outs/singlecell.csv
    - Position sorted BAM file:                     /opt/sample345/outs/possorted_bam.bam
    - Position sorted BAM index:                    /opt/sample345/outs/possorted_bam.bam.bai
    - Summary of all data metrics:                  /opt/sample345/outs/summary.json
    - HTML file summarizing data & analysis:        /opt/sample345/outs/web_summary.html
    - Bed file of all called peak locations:        /opt/sample345/outs/peaks.bed
    - Raw peak barcode matrix in hdf5 format:       /opt/sample345/outs/raw_peak_bc_matrix.h5
    - Raw peak barcode matrix in mex format:        /opt/sample345/outs/raw_peak_bc_matrix
    - Directory of analysis files:                  /opt/sample345/outs/analysis
    - Filtered peak barcode matrix in hdf5 format:  /opt/sample345/outs/filtered_peak_bc_matrix.h5
    - Filtered peak barcode matrix:                 /opt/sample345/outs/filtered_peak_bc_matrix
    - Barcoded and aligned fragment file:           /opt/sample345/outs/fragments.tsv.gz
    - Fragment file index:                          /opt/sample345/outs/fragments.tsv.gz.tbi
    - Filtered tf barcode matrix in hdf5 format:    /opt/sample345/outs/filtered_tf_bc_matrix.h5
    - Filtered tf barcode matrix in mex format:     /opt/sample345/outs/filtered_tf_bc_matrix
    - Loupe Cell Browser input file:                /opt/sample345/outs/cloupe.cloupe
    - csv summarizing important metrics and values: /opt/sample345/outs/summary.csv
    
    Pipestance completed successfully!
    

    运行完cellranger-atac coun子程序后,会生成一个sample ID名的新文件夹,里面存放着分析好的结果文件。我们可以使用浏览器查看summary HTML文件,还可以使用Loupe Cell Browser查看生成的.cloupe文件进行可视化的探索。

    image.png

    Aggregating Multiple GEM Groups with cellranger-atac aggr

    当我们使用多个GEM wells构建不同的测序文库时,首先对每个GEM wells的测序数据单独进行cellranger-atac count处理,然后再使用cellranger-atac aggr子程序将不同测序文库的结果进行整合。

    cellranger-atac aggr子程序使用一个逗号分隔的CSV文件作为输入,里面指定着cellranger-atac count处理好的每个文库的结果文件(specifically the fragments.tsv.gz, and singlecell.csv from each run), 最后生成一个整合好的peak-barcode matrix矩阵文件。

    Requirements

    For example, suppose you ran three count pipelines as follows:

    $ cd /opt/runs
    
    $ cellranger-atac count --id=LV123 ...
    ... wait for pipeline to finish ...
    
    $ cellranger-atac count --id=LB456 ...
    ... wait for pipeline to finish ...
    
    $ cellranger-atac count --id=LP789 ...
    ... wait for pipeline to finish ...
    

    Now you can aggregate these three runs to get an aggregated matrix and analysis. In order to do so, you need to create an Aggregation CSV.

    Setting Up An Aggregation CSV

    Create a CSV file with a header line containing the following columns:

    • library_id: Unique identifier for this input GEM well. This will be used for labeling purposes only; it doesn't need to match any previous ID you've assigned to the GEM well.
    • fragments: Path to the fragments.tsv.gz file produced by cellranger-atac count. For example, if you processed your GEM well by calling cellranger-atac count --id=ID in some directory /DIR, the fragments would be /DIR/ID/outs/fragments.tsv.gz.
    • cells: Path to the singlecell.csv file produced by cellranger-atac count.
    • (Optional) peaks: Path to the peaks.bed file produced by cellranger-atac count.
    • (Optional) Additional custom columns containing library meta-data (e.g., lab or sample origin). These custom library annotations do not affect the analysis pipeline but can be visualized downstream in the Loupe Browser.

    The CSV format file would look like this:

    library_id,fragments,cells
    LV123,/opt/runs/LV123/outs/fragments.tsv.gz,/opt/runs/LV123/outs/singlecell.csv
    LB456,/opt/runs/LB456/outs/fragments.tsv.gz,/opt/runs/LB456/outs/singlecell.csv
    LP789,/opt/runs/LP789/outs/fragments.tsv.gz,/opt/runs/LP789/outs/singlecell.csv
    

    Command Line Interface

    These are the most common command line arguments (run cellranger-atac aggr --help for a full list):

    image.png

    After specifying these input arguments, run cellranger-atac aggr:

    $ cd /home/jdoe/runs
    $ cellranger-atac aggr --id=AGG123 \
                           --csv=AGG123_libraries.csv \
                           --normalize=depth \
                           --reference=/home/jdoe/refs/hg19
    

    The pipeline will begin to run, creating a new folder named with the aggregation ID you specified (e.g. /home/jdoe/runs/AGG123) for its output. If this folder already exists, cellranger-atac will assume it is an existing pipestance and attempt to resume running it.

    Pipeline Outputs

    The cellranger-atac aggr pipeline generates output files that contain all of the data from the individual input jobs, aggregated into single output files, for convenient multi-sample analysis. The GEM well suffix of each barcode is updated to prevent barcode collisions, as described below.

    A successful run should conclude with a message similar to this:

    2019-03-21 10:14:34 [runtime] (run:hydra)       ID.AGG123.SC_ATAC_AGGREGATOR_CS.CLOUPE_PREPROCESS.fork0.join
    2019-03-21 10:14:40 [runtime] (join_complete)   ID.AGG123.SC_ATAC_AGGREGATOR_CS.CLOUPE_PREPROCESS
    2019-03-21 10:14:40 [runtime] VDR killed 281 files, 42 MB.
     
    Outputs:
    - Barcoded and aligned fragment file:           /home/jdoe/runs/AGG123/outs/fragments.tsv.gz
    - Fragment file index:                          /home/jdoe/runs/AGG123/outs/fragments.tsv.gz.tbi
    - Per-barcode fragment counts & metrics:        /home/jdoe/runs/AGG123/outs/singlecell.csv
    - Bed file of all called peak locations:        /home/jdoe/runs/AGG123/outs/peaks.bed
    - Filtered peak barcode matrix in hdf5 format:  /home/jdoe/runs/AGG123/outs/filtered_peak_bc_matrix.h5
    - Filtered peak barcode matrix in mex format:   /home/jdoe/runs/AGG123/outs/filtered_peak_bc_matrix
    - Directory of analysis files:                  /home/jdoe/runs/AGG123/outs/analysis
    - HTML file summarizing aggregation analysis :  /home/jdoe/runs/AGG123/outs/web_summary.html
    - Filtered tf barcode matrix in hdf5 format:    /home/jdoe/runs/AGG123/outs/filtered_tf_bc_matrix.h5
    - Filtered tf barcode matrix in mex format:     /home/jdoe/runs/AGG123/outs/filtered_tf_bc_matrix
    - Loupe Cell Browser input file:                /home/jdoe/runs/AGG123/outs/cloupe.cloupe
    - csv summarizing important metrics and values: /home/jdoe/runs/AGG123/outs/summary.csv
    - Summary of all data metrics:                  /home/jdoe/runs/AGG123/outs/summary.json
    - Annotation of peaks with genes:               /home/jdoe/runs/AGG123/outs/peak_annotation.tsv
    - Csv of aggregation of libraries:              /home/jdoe/runs/AGG123/outs/aggregation_csv.csv
     
    Pipestance completed successfully!
    

    Once cellranger-atac aggr has successfully completed, you can browse the resulting summary HTML file in any supported web browser, open the .cloupe file in Loupe Browser.

    Depth Normalization: equalize sensitivity

    When combining data from multiple GEM groups, the cellranger-atac aggr pipeline automatically equalizes the sensitivity of the groups before merging, which is the recommended approach in order to avoid the batch effect introduced by sequencing depth.

    There are three normalization modes:

    • depth: (default) Subsample fragments from higher-depth GEM wells until they all have an equal number of unique fragments per cell.
    • none: Do not normalize at all.
    • signal: Subsample fragments from GEM wells such that each GEM well library has the same distribution of enriched cut sites along the genome.

    Customized Secondary Analysis using cellranger-atac reanalyze

    The cellranger-atac reanalyze command reruns secondary analysis performed on the peak-barcode matrix (dimensionality reduction, clustering and visualization) using different parameter settings.

    Command Line Interface

    These are the most common command line arguments (run cellranger-atac reanalyze --help for a full list):

    image.png

    After specifying these input arguments, run cellranger-atac reanalyze. In this example, we're reanalyzing the results of an aggregation named AGG123:

    $ cd /home/jdoe/runs
    $ ls -1 AGG123/outs/*.gz # verify the input file exists
    AGG123/outs/fragments.tsv.gz
    
    $ cellranger-atac reanalyze --id=AGG123_reanalysis \
                                --peaks=AGG123/outs/peaks.bed \
                                --params=AGG123_reanalysis.csv \
                                --reference=/home/jdoe/refs/hg19 \
                                --fragments=/home/jdoe/runs/AGG123/outs/fragments.tsv.gz
    

    The pipeline will begin to run, creating a new folder named with the reanalysis ID you specified (e.g. /home/jdoe/runs/AGG123_reanalysis) for its output. If this folder already exists, cellranger-atac will assume it is an existing pipestance and attempt to resume running it.

    Pipeline Outputs

    A successful run should conclude with a message similar to this:

    2019-03-22 12:45:22 [runtime] (run:hydra)       ID.AGG123_reanalysis.SC_ATAC_REANALYZER_CS.SC_ATAC_REANALYZER.CLOUPE_PREPROCESS.fork0.join
    2019-03-22 12:46:04 [runtime] (join_complete)   ID.AGG123_reanalysis.SC_ATAC_REANALYZER_CS.SC_ATAC_REANALYZER.CLOUPE_PREPROCESS
    2019-03-22 12:46:04 [runtime] VDR killed 270 files, 18 MB.
     
    Outputs:
    - Summary of all data metrics:                  /home/jdoe/runs/AGG123_reanalysis/outs/summary.json
    - csv summarizing important metrics and values: /home/jdoe/runs/AGG123_reanalysis/outs/summary.csv
    - Per-barcode fragment counts & metrics:        /home/jdoe/runs/AGG123_reanalysis/outs/singlecell.csv
    - Raw peak barcode matrix in hdf5 format:       /home/jdoe/runs/AGG123_reanalysis/outs/raw_peak_bc_matrix.h5
    - Raw peak barcode matrix in mex format:        /home/jdoe/runs/AGG123_reanalysis/outs/raw_peak_bc_matrix
    - Filtered peak barcode matrix in hdf5 format:  /home/jdoe/runs/AGG123_reanalysis/outs/filtered_peak_bc_matrix.h5
    - Filtered peak barcode matrix in mex format:   /home/jdoe/runs/AGG123_reanalysis/outs/filtered_peak_bc_matrix
    - Directory of analysis files:                  /home/jdoe/runs/AGG123_reanalysis/outs/analysis
    - HTML file summarizing aggregation analysis :  /home/jdoe/runs/AGG123_reanalysis/outs/web_summary.html
    - Filtered tf barcode matrix in hdf5 format:    /home/jdoe/runs/AGG123_reanalysis/outs/filtered_tf_bc_matrix.h5
    - Filtered tf barcode matrix in mex format:     /home/jdoe/runs/AGG123_reanalysis/outs/filtered_tf_bc_matrix
    - Loupe Cell Browser input file:                /home/jdoe/runs/AGG123_reanalysis/outs/cloupe.cloupe
    - Annotation of peaks with genes:               /home/jdoe/runs/AGG123_reanalysis/outs/peak_annotation.tsv
    - Barcoded and aligned fragment file:           /home/jdoe/runs/AGG123_reanalysis/outs/fragments.tsv.gz
    - Fragment file index:                          /home/jdoe/runs/AGG123_reanalysis/outs/fragments.tsv.gz.tbi
     
    Pipestance completed successfully!
    

    Parameters

    The CSV file passed to --params should have 0 or more rows, one for every parameter that you want to customize. There is no header row. If a parameter is not specified in your CSV, its default value will be used.

    Here is a detailed description of each parameter. For parameters that subset the data, a default value of null indicates that no subsetting happens by default.

    image.png
    image.png
    image.png
    image.png

    Common Use Cases

    These examples illustrate what you should put in your --params CSV file in some common situations.
    1)More Principal Components and Clusters
    For very large / diverse cell populations, the defaults may not capture the full variation between cells. In that case, try increasing the number of principal components and / or clusters.

    To run dimensionality reduction with 50 components and k-means with up to 30 clusters, put this in your CSV:

    num_comps,50
    max_clusters,30
    

    2)Less Memory Usage
    You can limit the memory usage of the analysis by computing the LSA projection on a subset of cells and features. This is especially useful for large datasets (100k+ cells).

    If you have 100k cells, it's completely reasonable to use only 50% of them for LSA - the memory usage will be cut in half, but you'll still be well equipped to detect rare subpopulations. Limiting the number of features will reduce memory even further.

    To compute the LSA projection using 50000 cells and 3000 peaks, put this in your CSV:

    num_dr_bcs,50000
    num_dr_features,3000
    

    Note: Subsetting of cells is done randomly, to avoid bias. Subsetting of features is done by binning features by their mean expression across cells, then measuring the dispersion (a variance-like parameter) of each gene's expression normalized to the other features in its bin.

    参考来源:https://support.10xgenomics.com/single-cell-atac/software/pipelines/latest/using/count

    相关文章

      网友评论

        本文标题:使用cellranger-atac软件处理10x单细胞ATAC-

        本文链接:https://www.haomeiwen.com/subject/oxsltktx.html