MrBayes Questions 2020-06-12

Author: SnorkelingFan凡潜 | Published 2020-06-12 18:12

Preliminary note: assume the best-fit model has already been selected (according to AIC).

Model................................ : MtREV+I+G+F
  Number of parameters............... : 110 (21 + 89 branch length estimates)
    gamma shape (4 rate categories).. = 0.729
    proportion of invariable sites... = 0.156
    aminoacid frequencies............ = observed (see above)
 -lnL................................ = 150395.76
--
Best model according to AIC: MtREV+I+G+F
Confidence Interval: 100.0
  1. Charset
Charset

   This command defines a character set. The format for the charset command
   is

      charset <name> = <character numbers>

   For example, "charset first_pos = 1-720\3" defines a character set
   called "first_pos" that includes every third site from 1 to 720.
   The character set name cannot have any spaces in it. The slash (\)
   is a nifty way of telling the program to assign every third (or
   second, or fifth, or whatever) character to the character set.
   This option is best used not from the command line, but rather as a
   line in the mrbayes block of a file. Note that you can use "." to
   stand in for the last character (e.g., charset 1-.\3).
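
To make the range syntax above concrete, here is a minimal Python sketch that expands a specification such as 1-720\3 into explicit site numbers. The function name expand_charset and its nchar argument are hypothetical helpers for illustration only; they are not part of MrBayes, and the sketch handles a single range rather than everything the charset command may accept.

```python
def expand_charset(spec: str, nchar: int) -> list[int]:
    """Expand one charset range (start-end, optional stride) into 1-based sites."""
    # Split off the optional stride that follows the backslash.
    range_part, _, stride_part = spec.partition("\\")
    stride = int(stride_part) if stride_part else 1

    start_text, _, end_text = range_part.partition("-")
    start = int(start_text)
    end_text = end_text.strip()
    if not end_text:
        end = start          # a single site such as "5"
    elif end_text == ".":
        end = nchar          # "." stands in for the last character
    else:
        end = int(end_text)

    return list(range(start, end + 1, stride))


first_pos = expand_charset(r"1-720\3", nchar=720)
print(first_pos[:4], len(first_pos))  # [1, 4, 7, 10] 240
```

Every third site starting at 1 gives 240 sites, the last of which is 718.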
  1. Amino acid rate matrix
Aamodel         -- Aminoacid rate matrix
  1. Specifying the gamma shape and the proportion of invariable sites
   Shapepr       -- This parameter specifies the prior for the gamma/lnorm shape
                    parameter for among-site rate variation. The options are:

                       prset shapepr = uniform(<number>,<number>)
                       prset shapepr = exponential(<number>)
                       prset shapepr = fixed(<number>)

   Pinvarpr      -- This parameter specifies the prior for the proportion of
                    invariable sites. The options are:

                       prset pinvarpr = uniform(<number>,<number>)
                       prset pinvarpr = fixed(<number>)

                    Note that the valid range for the parameter is between 0
                    and 1. Hence, "prset pinvarpr=uniform(0,0.8)" is valid
                    while "prset pinvarpr=uniform(0,10)" is not. The def-
                    ault setting is "prset pinvarpr=uniform(0,1)".
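
As a rough illustration of the prset formats listed above, the following Python sketch builds the command strings and rejects a pinvarpr range outside 0 to 1. The helper names fixed_prior and pinvar_uniform are hypothetical; the values 0.729 and 0.156 are the estimates from the model-selection output at the top of this note. Whether to fix these parameters or leave them under a broader prior is a modelling choice that this note does not settle.

```python
def fixed_prior(parameter: str, value: float) -> str:
    """Format a 'prset <parameter> = fixed(<number>)' command."""
    return f"prset {parameter}=fixed({value})"


def pinvar_uniform(lower: float, upper: float) -> str:
    """Format a uniform pinvarpr prior, enforcing the valid range [0, 1]."""
    if not (0.0 <= lower < upper <= 1.0):
        raise ValueError("pinvarpr bounds must satisfy 0 <= lower < upper <= 1")
    return f"prset pinvarpr=uniform({lower},{upper})"


print(fixed_prior("shapepr", 0.729))   # prset shapepr=fixed(0.729)
print(fixed_prior("pinvarpr", 0.156))  # prset pinvarpr=fixed(0.156)
print(pinvar_uniform(0, 0.8))          # prset pinvarpr=uniform(0,0.8)
```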
  1. Specifying parameters for the Markov chain (Mcmc) block
Mcmc

   This command starts the Markov chain Monte Carlo (MCMC) analysis. The
   posterior probability of phylogenetic trees (and other parameters of the
   substitution model) cannot be determined analytically. Instead, MCMC is
   used to approximate the posterior probabilities of trees by drawing
   (dependent) samples from the posterior distribution. This program can
   implement a variant of MCMC called "Metropolis-coupled Markov chain Monte
   Carlo", or MCMCMC for short. Basically, "Nchains" are run, with
   Nchains - 1 of them heated. The chains are labelled 1, 2, ..., Nchains.
   The heat that is applied to the i-th chain is B = 1 / (1 + temp X i). B
   is the power to which the posterior probability is raised. When B = 0, all
   trees have equal probability and the chain freely visits trees. B = 1 is
   the "cold" chain (or the distribution of interest). MCMCMC can mix
   better than ordinary MCMC; after all of the chains have gone through
   one cycle, two chains are chosen at random and an attempt is made to
   swap the states (with the probability of a swap being determined by the
   Metropolis et al. equation). This allows the chain to potentially jump
   a valley in a single bound. The correct usage is

      mcmc <parameter> = <value> ... <parameter> = <value>

   For example,

      mcmc ngen=100000 nchains=4 temp=0.5

   performs a MCMCMC analysis with four chains with the temperature set to
   0.5. The chains would be run for 100,000 cycles.
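
The heating formula quoted above can be checked numerically. The sketch below evaluates B = 1 / (1 + temp X i) for each chain, assuming that i counts from 0 so that the first chain gets B = 1 and is the cold chain; that indexing is an assumption made here to match the statement that B = 1 is the cold chain, and chain_heats is a hypothetical helper name.

```python
def chain_heats(nchains: int, temp: float) -> list[float]:
    """Return the power B applied to each chain; index 0 is the cold chain."""
    return [1.0 / (1.0 + temp * i) for i in range(nchains)]


# Settings from the example command: nchains=4, temp=0.5
print([round(b, 3) for b in chain_heats(4, 0.5)])  # [1.0, 0.667, 0.5, 0.4]
```

A smaller temp, such as the 0.1 shown as the current setting, keeps the heated chains closer to the cold chain.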
   Parameter       Options               Current Setting
   -----------------------------------------------------
   Ngen            <number>              1000000
   Nruns           <number>              2
   Nchains         <number>              4
   Temp            <number>              0.100000
   Reweight        <number>,<number>     0.00 v 0.00 ^
   Swapfreq        <number>              1
   Nswaps          <number>              1
   Samplefreq      <number>              500
   Printfreq       <number>              1000
   Printall        Yes/No                Yes
   Printmax        <number>              8
   Mcmcdiagn       Yes/No                Yes
   Diagnfreq       <number>              5000
   Diagnstat       Avgstddev/Maxstddev   Avgstddev
   Minpartfreq     <number>              0.10
   Allchains       Yes/No                No
   Allcomps        Yes/No                No
   Relburnin       Yes/No                Yes
   Burnin          <number>              0
   Burninfrac      <number>              0.25
   Stoprule        Yes/No                No
   Stopval         <number>              0.05
   Savetrees       Yes/No                No
   Checkpoint      Yes/No                Yes
   Checkfreq       <number>              2000
   Filename        <name>                temp.<p/t>
   Startparams     Current/Reset         Current
   Starttree       Current/Random/       Current
                   Parsimony
   Nperts          <number>              0
   Data            Yes/No                Yes
   Ordertaxa       Yes/No                No
   Append          Yes/No                No
   Autotune        Yes/No                Yes
   Tunefreq        <number>              100
  1. By default, the Bayesian analysis performs two independent MCMC runs (Nruns=2)
   Parameter       Options               Current Setting
   -----------------------------------------------------
   Ngen            <number>              1000000
   Nruns           <number>              2
  1. By default, 25% of the samples are discarded as burn-in
   BurninFrac    -- Determines the fraction of samples that will be discarded
                    when summary statistics are calculated. The value of this
                    option is only relevant when Relburnin is set to 'Yes'.
                    Example: A value for this option of 0.25 means that 25% of
                    the samples will be discarded.
Burninfrac      <number>              0.25
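  1. sump summarizes the sampled parameter values. The burnin value is set to (ngen / samplefreq) * 0.25. The program prints a summary table; make sure the values in the PSRF column are close to 1.0, otherwise the analysis needs to be run for more generations.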
Sump             -- Summarizes parameters from MCMC analysis
sump burnin=250 (the value 250 depends on the run settings; for example, burninfrac=0.25, samplefreq=10 and Ngen=10000 give (10000 / 10) * 0.25 = 250)
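
The burn-in arithmetic used for the sump command here can be written as a short Python sketch. It follows this note's formula, (ngen / samplefreq) * burninfrac; the function name burnin_samples is hypothetical, and MrBayes may count the sample taken at the starting generation as well, so treat the result as approximate.

```python
def burnin_samples(ngen: int, samplefreq: int, burninfrac: float = 0.25) -> int:
    """Number of samples (not generations) to discard as burn-in."""
    total_samples = ngen // samplefreq
    return int(total_samples * burninfrac)


print(burnin_samples(10000, 10))     # 250
print(burnin_samples(1000000, 500))  # 500
```

The second call uses the Ngen and Samplefreq values shown as current settings in the Mcmc table.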
   ---------------------------------------------------------------------------
   Sump

   During an MCMC analysis, MrBayes prints the sampled parameter values to one or
   more tab-delimited text files, one for each independent run in your analysis.
   The command 'Sump' summarizes the information in this parameter file or these
   parameter files. By default, the root of the parameter file name(s) is assumed
   to be the name of the last matrix-containing nexus file. MrBayes also remem-
   bers the number of independent runs in the last analysis that you set up, re-
   gardless of whether you actually ran it. For instance, if there were two in-
   dependent runs, which is the initial setting when you read in a new matrix,
   MrBayes will assume that there are two parameter files with the endings
   '.run1.p' and '.run2.p'. You can change the root of the file names and the
   number of runs using the 'Filename' and 'Nruns' settings.

   When you invoke the 'Sump' command, three items are output: (1) a generation
   plot of the likelihood values; (2) estimates of the marginal likelihood of
   the model; and (3) a table with the mean, variance, and 95 percent credible
   interval for the sampled parameters. All three items are output to screen.
   The table of marginal likelihoods is also printed to a file with the ending
   '.lstat' and the parameter table to a file with the ending '.pstat'. For some
   model parameters, there may also be a '.mstat' file.

   When running 'Sump' you typically want to discard a specified number or
   fraction of samples from the beginning of the chain as the burn in. This is
   done using the same mechanism used by the 'mcmc' command. That is, if you
   run an mcmc analysis with a relative burn in of 25 % of samples for con-
   vergence diagnostics, then the same burn in will be used for a subsequent
   sump command, unless a different burn in is specified. That is, issuing

   sump

   immediately after 'mcmc', will result in using the same burn in settings as
   for the 'mcmc' command. All burnin settings are reset to default values every
   time a new matrix is read in, namely relative burnin ('relburnin=yes') with
   25 % of samples discarded ('burninfrac = 0.25').
Options:
   Burnin       -- Determines the number of samples (not generations) that will
                   be discarded when summary statistics are calculated. The
                   value of this option is only applicable when 'Relburnin' is
                   set to 'No'.
   Burninfrac   -- Determines the fraction of samples that will be discarded when
                   summary statistics are calculated. The setting only takes
                   effect if 'Relburnin' is set to 'Yes'.
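
Because this note relies on the PSRF column to judge convergence, a small Python sketch of a potential scale reduction factor may help with reading it. This is the textbook Gelman-Rubin form computed from post-burn-in samples of one parameter in two or more runs; it is an illustration under that assumption, not a reproduction of the exact calculation MrBayes performs, and the function name psrf is hypothetical.

```python
from statistics import mean, variance


def psrf(runs: list[list[float]]) -> float:
    """Gelman-Rubin style potential scale reduction factor for one parameter."""
    n = len(runs[0])
    if len(runs) < 2 or n < 2 or any(len(run) != n for run in runs):
        raise ValueError("need at least two runs of equal length (>= 2 samples)")

    # Average of the within-run sample variances.
    within = mean(variance(run) for run in runs)
    # Sample variance of the run means (between-run variance divided by n).
    between_over_n = variance([mean(run) for run in runs])
    # Pooled estimate of the posterior variance.
    pooled = (n - 1) / n * within + between_over_n
    return (pooled / within) ** 0.5


# Two runs stuck in different regions give a value far above 1.0.
print(psrf([[0, 1, 2, 3, 4], [10, 11, 12, 13, 14]]))
# Two runs covering the same values give a value that is not above 1.0.
print(psrf([[0, 1, 2, 3, 4], [4, 3, 2, 1, 0]]))
```

Values well above 1.0 indicate that the runs disagree about the parameter, which is the situation where more generations are needed.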
  1. Sumt parameters
Sumt             -- Summarizes trees from MCMC analysis
sumt burnin=250 (view the tree topology)
Sumt

   This command is used to produce summary statistics for trees sampled during
   a Bayesian MCMC analysis. You can either summarize trees from one individual
   analysis, or trees coming from several independent analyses. In either case,
   all the sampled trees are read in and the proportion of the time any single
   taxon bipartition (split) is found is counted. The proportion of the time that
   the bipartition is found is an approximation of the posterior probability of
   the bipartition. (Remember that a taxon bipartition is defined by removing a
   branch on the tree, dividing the tree into those taxa to the left and right
   of the removed branch. This set is called a taxon bipartition.) The branch
   length of the bipartition is also recorded, if branch lengths have been saved
   to file. The result is a list of the taxon bipartitions found, the frequency
   with which they were found, the posterior probability of the bipartition,
   the mean and variance of the branch lengths or node depths, and various
   other statistics.
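
The counting idea described above can be sketched in Python. The sketch assumes each sampled tree has already been decomposed into its bipartitions, each given as the set of taxa on one side; parsing the tree files is left out. The names canonical, split_frequencies and majority_rule are hypothetical helpers, not MrBayes functions.

```python
from collections import Counter


def canonical(side: set[str], taxa: frozenset[str]) -> frozenset[str]:
    """Represent a bipartition by the side that excludes the reference taxon."""
    side = frozenset(side)
    reference = min(taxa)
    return taxa - side if reference in side else side


def split_frequencies(trees: list[list[set[str]]], taxa: set[str]) -> dict:
    """Fraction of sampled trees in which each bipartition appears."""
    all_taxa = frozenset(taxa)
    counts = Counter()
    for splits in trees:
        # Count each bipartition at most once per tree.
        counts.update({canonical(s, all_taxa) for s in splits})
    return {split: count / len(trees) for split, count in counts.items()}


def majority_rule(freqs: dict, threshold: float = 0.5) -> set:
    """Bipartitions found in more than half of the sampled trees."""
    return {split for split, freq in freqs.items() if freq > threshold}


taxa = {"A", "B", "C", "D", "E"}
trees = [
    [{"A", "B"}, {"D", "E"}],
    [{"A", "B"}, {"A", "B", "C"}],  # {A,B,C} is the same split as {D,E}
    [{"A", "B"}, {"C", "D"}],
    [{"A", "C"}, {"D", "E"}],
]
freqs = split_frequencies(trees, taxa)
print(freqs[frozenset({"D", "E"})])  # 0.75
print(len(majority_rule(freqs)))     # 2
```

The frequency of each bipartition is the quantity that approximates its posterior probability.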

   The key to the partitions is output to a file with the suffix '.parts'. The
   summary statistics pertaining to bipartition probabilities are output to a
   file with the suffix '.tstat', and the statistics pertaining to branch or node
   parameters are output to a file with the suffix '.vstat'.

   A consensus tree is also printed to a file with the suffix '.con.tre' and
   printed to the screen as a cladogram, and as a phylogram if branch lengths
   have been saved. The consensus tree is either a 50 percent majority rule tree
   or a majority rule tree showing all compatible partitions. If branch lengths
   have been recorded during the run, the '.con.tre' file will contain a consen-
   sus tree with branch lengths and interior nodes labelled with support values.
   By default, the consensus tree will also contain other summary information in
   a format understood by the program 'FigTree'. To use a simpler format under-
   stood by other tree-drawing programs, such as 'TreeView', set 'Conformat' to
   'Simple'.

   MrBayes also produces a file with the ending ".trprobs" that contains a list
   of all the trees that were found during the MCMC analysis, sorted by their
   probabilities. This list of trees can be used to construct a credible set of
   trees. For example, if you want to construct a 95 percent credible set of
   trees, you include all of those trees whose cumulative probability is less
   than or equal to 0.95. You have the option of displaying the trees to the
   screen using the "Showtreeprobs" option. The default is to not display the
   trees to the screen; the number of different trees sampled by the chain can
   be quite large. If you are analyzing a large set of taxa, you may actually
   want to skip the calculation of tree probabilities entirely by setting
   'Calctreeprobs' to 'No'.
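
The credible-set rule quoted above can be sketched as follows. The sketch takes a mapping from tree labels to probabilities, which is assumed to have been read from the ".trprobs" file already, and applies the rule exactly as worded: trees are added in order of decreasing probability while the cumulative probability stays at or below the level. The function name credible_set and the example labels are hypothetical.

```python
def credible_set(tree_probs: dict[str, float], level: float = 0.95) -> list[str]:
    """Trees whose cumulative probability is less than or equal to `level`."""
    ranked = sorted(tree_probs.items(), key=lambda item: item[1], reverse=True)
    selected = []
    cumulative = 0.0
    for name, prob in ranked:
        cumulative += prob
        if cumulative > level:
            break
        selected.append(name)
    return selected


probs = {"t1": 0.5, "t2": 0.3, "t3": 0.1, "t4": 0.06, "t5": 0.04}
print(credible_set(probs))  # ['t1', 't2', 't3']
```

Adding the fourth tree would bring the cumulative probability to 0.96, so under this rule it is left out.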

   If you are summarizing the trees sampled in several independent analyses,
   such as those resulting from setting the 'Nruns' option of the 'Mcmc' command
   to a value larger than 1, MrBayes will also calculate convergence diagnostics
   for the sampled topologies and branch lengths. These values can help you
   determine whether it is likely that your chains have converged.

   The 'Sumt' command expands the 'Filename' according to the current values of
   the 'Nruns' and 'Ntrees' options. For instance, if both 'Nruns' and 'Ntrees'
   are set to 1, 'Sumt' will try to open a file named '<Filename>.t'. If 'Nruns'
   is set to 2 and 'Ntrees' to 1, then 'Sumt' will open two files, the first
   named '<Filename>.run1.t' and the second '<Filename>.run2.t', etc. By default,
   the 'Filename' option is set such that 'Sumt' automatically summarizes all the
   results from your immediately preceding 'Mcmc' command. You can also use the
   'Sumt' command to summarize tree samples in older analyses. If you want to do
   that, remember to first read in a matrix so that MrBayes knows what taxon
   names to expect in the trees. Then set the 'Nruns', 'Ntrees' and 'Filename'
   options appropriately if they differ from the MrBayes defaults.
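
The file-name expansion described above can be sketched in Python for the case where Ntrees is 1, which is the only case spelled out in the help text. The function name sumt_tree_files is a hypothetical helper, and the example root name is made up for illustration.

```python
def sumt_tree_files(filename: str, nruns: int) -> list[str]:
    """Tree files that Sumt would look for when Ntrees is 1."""
    if nruns < 1:
        raise ValueError("nruns must be at least 1")
    if nruns == 1:
        return [f"{filename}.t"]
    return [f"{filename}.run{run}.t" for run in range(1, nruns + 1)]


print(sumt_tree_files("example.nex", 1))  # ['example.nex.t']
print(sumt_tree_files("example.nex", 2))  # ['example.nex.run1.t', 'example.nex.run2.t']
```

Checking that these files exist before calling Sumt on an older analysis is one way to confirm that Nruns and Filename are set as intended.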

   Options:

   Relburnin     -- If this option is set to YES, then a proportion of the
                    samples will be discarded as burnin when calculating summary
                    statistics. The proportion to be discarded is set with
                    Burninfrac (see below). When the Relburnin option is set to
                    NO, then a specific number of samples is discarded instead.
                    This number is set by Burnin (see below). Note that the
                    burnin setting is shared across the 'sumt', 'sump', and
                    'mcmc' commands.
   Burnin        -- Determines the number of samples (not generations) that will
                    be discarded when summary statistics are calculated. The
                    value of this option is only relevant when Relburnin is set
                    to NO.

Reference:

  1. http://www.360doc.com/content/17/1002/18/45962007_691819677.shtml
