美文网首页
scenic运行记录1

scenic运行记录1

作者: Seurat_Satija | 来源:发表于2021-04-16 09:48 被阅读0次

    R version 4.0.2 (2020-06-22) -- "Taking Off Again"
    Copyright (C) 2020 The R Foundation for Statistical Computing
    Platform: x86_64-w64-mingw32/x64 (64-bit)

    R is free software and comes with ABSOLUTELY NO WARRANTY.
    You are welcome to redistribute it under certain conditions.
    Type 'license()' or 'licence()' for distribution details.

    R is a collaborative project with many contributors.
    Type 'contributors()' for more information and
    'citation()' on how to cite R or R packages in publications.

    Type 'demo()' for some demos, 'help()' for on-line help, or
    'help.start()' for an HTML browser interface to help.
    Type 'q()' to quit R.

    > # https://mp.weixin.qq.com/s/pN4qWdUszuGqr8nOJstn8w
    > 
    > if (!requireNamespace("BiocManager", quietly = TRUE)) install.packages("BiocManager")
    > BiocManager::version()
    [1] ‘3.12’
    > # If your bioconductor version is previous to 4.0, see the section bellow
    > 
    > ## Required
    > if (!requireNamespace(c("AUCell", "RcisTarget"), quietly = TRUE)) BiocManager::install(c("AUCell", "RcisTarget"),ask = F,update = F)
    > if (!requireNamespace('GENIE3', quietly = TRUE)) BiocManager::install(c("GENIE3"),ask = F,update = F)  # Optional. Can be replaced by GRNBoost
    > 
    > ## Optional (but highly recommended):
    > # To score the network on cells (i.e. run AUCell):
    > if (!requireNamespace(c("zoo", "mixtools", "rbokeh"), quietly = TRUE)) BiocManager::install(c("zoo", "mixtools", "rbokeh"),ask = F,update = F) 
    > # For various visualizations and perform t-SNEs:
    > if (!requireNamespace(c("DT", "NMF", "ComplexHeatmap", "R2HTML", "Rtsne"), quietly = TRUE)) BiocManager::install(c("DT", "NMF", "ComplexHeatmap", "R2HTML", "Rtsne"),ask = F,update = F)
    > 
    > # To support paralell execution (not available in Windows):
    > # To support paralell execution (not available in Windows):
    > #if (!requireNamespace(c("doMC", "doRNG"), quietly = TRUE))  BiocManager::install(c("doMC", "doRNG"),ask = F,update = F)
    > 
    > # To export/visualize in http://scope.aertslab.org
    > if (!requireNamespace("devtools", quietly = TRUE)) install.packages("devtools")
    > if (!requireNamespace("SCopeLoomR", quietly = TRUE))  devtools::install_github("aertslab/SCopeLoomR", build_vignettes = TRUE)
    > 
    > if (!requireNamespace("devtools", quietly = TRUE)) install.packages("devtools")
    > if (!requireNamespace("SCENIC", quietly = TRUE))  devtools::install_github("aertslab/SCENIC") 
    > packageVersion("SCENIC")
    [1] ‘1.2.4’
    > 
    > library(Rtsne)
    Warning message:
    程辑包‘Rtsne’是用R版本4.0.5 来建造的 
    > library(doSNOW)
    载入需要的程辑包:foreach
    载入需要的程辑包:iterators
    载入需要的程辑包:snow
    Warning messages:
    1: 程辑包‘doSNOW’是用R版本4.0.5 来建造的 
    2: 程辑包‘foreach’是用R版本4.0.3 来建造的 
    3: 程辑包‘iterators’是用R版本4.0.3 来建造的 
    4: 程辑包‘snow’是用R版本4.0.3 来建造的 
    > #install.packages("doParallel") 
    > library(doParallel)
    载入需要的程辑包:parallel
    
    载入程辑包:‘parallel’
    
    The following objects are masked from ‘package:snow’:
    
        clusterApply, clusterApplyLB, clusterCall, clusterEvalQ, clusterExport,
        clusterMap, clusterSplit, makeCluster, parApply, parCapply, parLapply,
        parRapply, parSapply, splitIndices, stopCluster
    
    Warning message:
    程辑包‘doParallel’是用R版本4.0.3 来建造的 
    > library(doMPI)
    载入需要的程辑包:Rmpi
    Error: package or namespace load failed for ‘Rmpi’:
     loadNamespace()里算'Rmpi'时.onLoad失败了,详细内容:
      调用: inDL(x, as.logical(local), as.logical(now), ...)
      错误: unable to load shared object 'C:/Users/Nano/Documents/R/win-library/4.0/Rmpi/libs/x64/Rmpi.dll':
      LoadLibrary failure:  找不到指定的模块。
    
    Error: 无法载入程辑包‘Rmpi’
    In addition: Warning messages:
    1: 程辑包‘doMPI’是用R版本4.0.5 来建造的 
    2: 程辑包‘Rmpi’是用R版本4.0.4 来建造的 
    > # install.packages("doSNOW")
    > # install.packages("doMPI")
    > rm(list = ls())
    > library(Seurat)
    Registered S3 method overwritten by 'spatstat':
      method     from
      print.boxx cli 
    Attaching SeuratObject
    Warning message:
    程辑包‘Seurat’是用R版本4.0.4 来建造的 
    > # devtools::install_github('satijalab/seurat-data')
    > library(SeuratData)
    -- Installed datasets ------------------------------------------------ SeuratData v0.2.1 --
    √ ifnb         3.1.0                         √ pbmcMultiome 0.1.0
    √ panc8        3.0.2                         √ stxBrain     0.1.1
    √ pbmc3k       3.1.4                         
    
    ------------------------------------------- Key -------------------------------------------
    √ Dataset loaded successfully
    > Dataset built with a newer version of Seurat than installed
    (?) Unknown version of Seurat installed
    
    There were 12 warnings (use warnings() to see them)
    > AvailableData()
                                       Dataset Version
    bmcite.SeuratData                   bmcite   0.3.0
    cbmc.SeuratData                       cbmc   3.1.4
    celegans.embryo.SeuratData celegans.embryo   0.1.0
    hcabm40k.SeuratData               hcabm40k   3.0.0
    ifnb.SeuratData                       ifnb   3.1.0
    panc8.SeuratData                     panc8   3.0.2
    pbmc3k.SeuratData                   pbmc3k   3.1.4
    pbmcMultiome.SeuratData       pbmcMultiome   0.1.0
    pbmcsca.SeuratData                 pbmcsca   3.0.0
    ssHippo.SeuratData                 ssHippo   3.1.4
    stxBrain.SeuratData               stxBrain   0.1.1
    stxKidney.SeuratData             stxKidney   0.1.0
    thp1.eccite.SeuratData         thp1.eccite   3.1.5
                                                                                      Summary
    bmcite.SeuratData                                                   30k Bone Marrow Cells
    cbmc.SeuratData                              scRNAseq and 13-antibody sequencing of CBMCs
    celegans.embryo.SeuratData         6k C. elegans embryos from Packer and Zhu et al (2019)
    hcabm40k.SeuratData        40,000 Cells From the Human Cell Atlas ICA Bone Marrow Dataset
    ifnb.SeuratData                                         IFNB-Stimulated and Control PBMCs
    panc8.SeuratData                         Eight Pancreas Datasets Across Five Technologies
    pbmc3k.SeuratData                                              3k PBMCs from 10X Genomics
    pbmcMultiome.SeuratData                                10X Genomics PBMC Multiome Dataset
    pbmcsca.SeuratData                   Broad Institute PBMC Systematic Comparative Analysis
    ssHippo.SeuratData                              Slide-seq v2 dataset of mouse hippocampus
    stxBrain.SeuratData                               10X Genomics Visium Mouse Brain Dataset
    stxKidney.SeuratData                             10X Genomics Visium Mouse Kidney Dataset
    thp1.eccite.SeuratData                                                   ECCITE-seq THP-1
                               seurat    species            system ncells
    bmcite.SeuratData           3.2.2      human       bone marrow  30672
    cbmc.SeuratData             3.1.4      human CBMC (cord blood)   8617
    celegans.embryo.SeuratData   <NA> C. elegans            embryo   6188
    hcabm40k.SeuratData          <NA>      human       bone marrow  40000
    ifnb.SeuratData              <NA>      human              PBMC  13999
    panc8.SeuratData             <NA>      human Pancreatic Islets  14892
    pbmc3k.SeuratData           3.1.4      human              PBMC   2700
    pbmcMultiome.SeuratData     4.0.0      human              pbmc  11909
    pbmcsca.SeuratData           <NA>      human              PBMC  31021
    ssHippo.SeuratData           <NA>      mouse       hippocampus     NA
    stxBrain.SeuratData          <NA>      mouse             brain  12167
    stxKidney.SeuratData         <NA>      mouse            kidney   1438
    thp1.eccite.SeuratData       <NA>      human              <NA>     NA
                                                                                          tech
    bmcite.SeuratData                                                                     <NA>
    cbmc.SeuratData                                                                   CITE-seq
    celegans.embryo.SeuratData                                                            <NA>
    hcabm40k.SeuratData                                                                 10x v2
    ifnb.SeuratData                                                                     10x v1
    panc8.SeuratData                          SMARTSeq2, Fluidigm C1, CelSeq, CelSeq2, inDrops
    pbmc3k.SeuratData                                                                   10x v1
    pbmcMultiome.SeuratData                                                               <NA>
    pbmcsca.SeuratData         10x v2, 10x v3, SMARTSeq2, Seq-Well, inDrops, Drop-seq, CelSeq2
    ssHippo.SeuratData                                                             slideseq v2
    stxBrain.SeuratData                                                                 visium
    stxKidney.SeuratData                                                                visium
    thp1.eccite.SeuratData                                                                <NA>
                               default.dataset disk.datasets
    bmcite.SeuratData                     <NA>          <NA>
    cbmc.SeuratData                        raw     processed
    celegans.embryo.SeuratData             raw          <NA>
    hcabm40k.SeuratData                    raw          <NA>
    ifnb.SeuratData                        raw          <NA>
    panc8.SeuratData                       raw          <NA>
    pbmc3k.SeuratData                      raw          <NA>
    pbmcMultiome.SeuratData                 NA          <NA>
    pbmcsca.SeuratData                     raw          <NA>
    ssHippo.SeuratData                     raw          <NA>
    stxBrain.SeuratData                     NA          <NA>
    stxKidney.SeuratData                   raw          <NA>
    thp1.eccite.SeuratData                <NA>          <NA>
                                                             other.datasets
    bmcite.SeuratData                                                  <NA>
    cbmc.SeuratData                                                    <NA>
    celegans.embryo.SeuratData                                         <NA>
    hcabm40k.SeuratData                                                <NA>
    ifnb.SeuratData                                               processed
    panc8.SeuratData                                                   <NA>
    pbmc3k.SeuratData                                          pbmc3k.final
    pbmcMultiome.SeuratData                             pbmc.rna, pbmc.atac
    pbmcsca.SeuratData                                                 <NA>
    ssHippo.SeuratData                                                 <NA>
    stxBrain.SeuratData        posterior1, posterior2, anterior1, anterior2
    stxKidney.SeuratData                                               <NA>
    thp1.eccite.SeuratData                                             <NA>
                                                                                                   notes
    bmcite.SeuratData                                                                               <NA>
    cbmc.SeuratData                                                                                 <NA>
    celegans.embryo.SeuratData                                                                      <NA>
    hcabm40k.SeuratData                                                                             <NA>
    ifnb.SeuratData                                                                                 <NA>
    panc8.SeuratData                                                                                <NA>
    pbmc3k.SeuratData                                                                               <NA>
    pbmcMultiome.SeuratData                                 One sample with two modalities, RNA and ATAC
    pbmcsca.SeuratData                                                                     HCA benchmark
    ssHippo.SeuratData                                                                              <NA>
    stxBrain.SeuratData        One sample split across four datasets as paired anterior/posterior slices
    stxKidney.SeuratData                                                                            <NA>
    thp1.eccite.SeuratData                                                                          <NA>
                               Installed InstalledVersion
    bmcite.SeuratData              FALSE             <NA>
    cbmc.SeuratData                FALSE             <NA>
    celegans.embryo.SeuratData     FALSE             <NA>
    hcabm40k.SeuratData            FALSE             <NA>
    ifnb.SeuratData                 TRUE            3.1.0
    panc8.SeuratData                TRUE            3.0.2
    pbmc3k.SeuratData               TRUE            3.1.4
    pbmcMultiome.SeuratData         TRUE            0.1.0
    pbmcsca.SeuratData             FALSE             <NA>
    ssHippo.SeuratData             FALSE             <NA>
    stxBrain.SeuratData             TRUE            0.1.1
    stxKidney.SeuratData           FALSE             <NA>
    thp1.eccite.SeuratData         FALSE             <NA>
    There were 12 warnings (use warnings() to see them)
    > # InstallData("pbmc3k") #  (89.4 MB) 
    > data("pbmc3k")
    > exprMat  <-  as.matrix(pbmc3k@assays$RNA@data)
    > dim(exprMat)
    [1] 13714  2700
    > exprMat[1:4,1:4]
                  AAACATACAACCAC AAACATTGAGCTAC AAACATTGATCAGC AAACCGTGCTTCCG
    AL627309.1                 0              0              0              0
    AP006222.2                 0              0              0              0
    RP11-206L10.2              0              0              0              0
    RP11-206L10.9              0              0              0              0
    > cellInfo <-  pbmc3k@meta.data[,c(4,2,3)]
    > colnames(cellInfo)=c('CellType', 'nGene' ,'nUMI')
    > head(cellInfo)
                       CellType nGene nUMI
    AAACATACAACCAC Memory CD4 T  2419  779
    AAACATTGAGCTAC            B  4903 1352
    AAACATTGATCAGC Memory CD4 T  3147 1129
    AAACCGTGCTTCCG   CD14+ Mono  2639  960
    AAACCGTGTATGCG           NK   980  521
    AAACGCACTGGTAC Memory CD4 T  2163  781
    > table(cellInfo$CellType)
    
     Naive CD4 T Memory CD4 T   CD14+ Mono            B        CD8 T FCGR3A+ Mono           NK 
             697          483          480          344          271          162          155 
              DC     Platelet 
              32           14 
    > ### Initialize settings
    > library(SCENIC)
    Warning message:
    程辑包‘SCENIC’是用R版本4.0.3 来建造的 
    > # 保证cisTarget_databases 文件夹下面有下载好2个1G的文件
    > scenicOptions <- initializeScenic(org="hgnc", 
    +                                   dbDir="cisTarget_databases", nCores=1)
    Motif databases selected: 
      hg19-500bp-upstream-7species.mc9nr.feather 
      hg19-tss-centered-10kb-7species.mc9nr.feather
    Registered S3 methods overwritten by 'arrow':
      method             from        
      as.character.Array DelayedArray
      as.integer.Array   DelayedArray
      as.vector.Array    DelayedArray
    Warning messages:
    1: In RcisTarget::importRankings(dbFilePath, columns = randomCol) :
      The following columns are missing from the database: 
    2: In RcisTarget::importRankings(dbFilePath, columns = randomCol) :
      The following columns are missing from the database: 
    3: package ‘RcisTarget’ was built under R version 4.0.3 
    4: In RcisTarget::importRankings(dbFile, columns = rnktype) :
      The following columns are missing from the database: features
    5: In RcisTarget::importRankings(dbFile, columns = rnktype) :
      The following columns are missing from the database: features
    > saveRDS(scenicOptions, file="int/scenicOptions.Rds")
    > ### Co-expression network
    > genesKept <- geneFiltering(exprMat, scenicOptions)
    Maximum value in the expression matrix: 419
    Ratio of detected vs non-detected: 0.066
    Number of counts (in the dataset units) per gene:
        Min.  1st Qu.   Median     Mean  3rd Qu.     Max. 
         3.0     16.0     57.0    465.7    176.8 161685.0 
    Number of cells in which each gene is detected:
       Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
        3.0    15.0    54.0   166.5   157.0  2700.0 
    
    Number of genes left after applying the following filters (sequential):
        5749    genes with counts per gene > 81
        5748    genes detected in more than 27 cells
        5318    genes available in RcisTarget database
    Gene list saved in int/1.1_genesKept.Rds
    > exprMat_filtered <- exprMat[genesKept, ]
    > exprMat_filtered[1:4,1:4]
             AAACATACAACCAC AAACATTGAGCTAC AAACATTGATCAGC AAACCGTGCTTCCG
    NOC2L                 0              0              0              0
    HES4                  0              0              0              0
    ISG15                 0              0              1              9
    TNFRSF18              0              2              0              0
    > dim(exprMat_filtered)
    [1] 5318 2700
    > runCorrelation(exprMat_filtered, scenicOptions)
    > exprMat_filtered_log <- log2(exprMat_filtered+1) 
    > runGenie3(exprMat_filtered_log, scenicOptions)
    Using 480 TFs as potential regulators...
    Running GENIE3 part 1
    Running GENIE3 part 10
    Running GENIE3 part 2
    Running GENIE3 part 3
    Running GENIE3 part 4
    Running GENIE3 part 5
    Running GENIE3 part 6
    Running GENIE3 part 7
    Running GENIE3 part 8
    Running GENIE3 part 9
    Finished running GENIE3.
    Warning message:
    In runGenie3(exprMat_filtered_log, scenicOptions) :
      Only 480 (26%) of the 1839 TFs in the database were found in the dataset. Do they use the same gene IDs?
    
    > 
    > ### Build and score the GRN
    > exprMat_log <- log2(exprMat+1)
    > scenicOptions@settings$dbs <- scenicOptions@settings$dbs["10kb"] # Toy run settings
    > scenicOptions <- runSCENIC_1_coexNetwork2modules(scenicOptions)
    00:50   Creating TF modules
    Number of links between TFs and targets (weight>=0.001): 1773162
                 [,1]
    nTFs          480
    nTargets     5318
    nGeneSets    3832
    nLinks    2515239
    > scenicOptions <- runSCENIC_2_createRegulons(scenicOptions,
    +                                             coexMethod=c("top5perTarget")) # Toy run settings
    00:50   Step 2. Identifying regulons
    tfModulesSummary:
                  [,1]
    top5perTarget   61
    00:50   RcisTarget: Calculating AUC
    Scoring database:  [Source file: hg19-tss-centered-10kb-7species.mc9nr.feather]
    00:53   RcisTarget: Adding motif annotation
    Using BiocParallel...
    Number of motifs in the initial enrichment: 19044
    Number of motifs annotated to the matching TF: 357
    00:53   RcisTarget: Pruning targets
    00:55   Number of motifs that support the regulons: 357
        Preview of motif enrichment saved as: output/Step2_MotifEnrichment_preview.html
    Warning messages:
    1: package ‘BiocParallel’ was built under R version 4.0.3 
    2: In MulticoreParam(nCores) :
      MulticoreParam() not supported on Windows, use SnowParam()
    3: package ‘AUCell’ was built under R version 4.0.3 
    4: In RcisTarget::importRankings(dbFilePath, columns = randomCol) :
      The following columns are missing from the database: 
    5: In RcisTarget::importRankings(dbFile, columns = rnktype) :
      The following columns are missing from the database: features
    6: In importRankings(rnkName, columns = allGenes) :
      The following columns are missing from the database: 
    7: In importRankings(dbNames[motifDbName], columns = allGenes) :
      The following columns are missing from the database: 
    8: In .addSignificantGenes(resultsTable = resultsTable, geneSets = geneSets,  :
      The rakings provided only include a subset of genes/regions included in the whole database.
    9: package ‘data.table’ was built under R version 4.0.3 
    > library(doParallel)
    > scenicOptions <- runSCENIC_3_scoreCells(scenicOptions, exprMat_log ) 
    00:56   Step 3. Analyzing the network activity in each individual cell
        Number of regulons to evaluate on cells: 21
    Biggest (non-extended) regulons: 
         SPI1 (539g)
         CEBPD (73g)
         CEBPB (47g)
         SPIB (38g)
         TBX21 (28g)
         IRF7 (15g)
         IRF8 (16g)
         STAT1 (14g)
    Quantiles for the number of genes detected by cell: 
    (Non-detected genes are shuffled at the end of the ranking. Keep it in mind when choosing the threshold for calculating the AUC).
        min      1%      5%     10%     50%    100% 
     212.00  325.00  434.95  539.90  816.00 3400.00 
    00:57   Finished running AUCell.
    00:57   Plotting heatmap...
    00:57   Plotting t-SNEs...
    > scenicOptions <- runSCENIC_4_aucell_binarize(scenicOptions)
    Binary regulon activity: 12 TF regulons x 2216 cells.
    (20 regulons including 'extended' versions)
    11 regulons are active in more than 1% (22.16) cells.
    > tsneAUC(scenicOptions, aucType="AUC") # choose settings
    [1] "int/tSNE_AUC_50pcs_30perpl.Rds"
    > 
    

    相关文章

      网友评论

          本文标题:scenic运行记录1

          本文链接:https://www.haomeiwen.com/subject/xsbslltx.html