Preface: assume the best model has already been selected (according to AIC)
Model................................ : MtREV+I+G+F
Number of parameters............... : 110 (21 + 89 branch length estimates)
gamma shape (4 rate categories).. = 0.729
proportion of invariable sites... = 0.156
aminoacid frequencies............ = observed (see above)
-lnL................................ = 150395.76
--
Best model according to AIC: MtREV+I+G+F
Confidence Interval: 100.0
Charset
Charset
This command defines a character set. The format for the charset command
is
charset <name> = <character numbers>
For example, "charset first_pos = 1-720\3" defines a character set
called "first_pos" that includes every third site from 1 to 720.
The character set name cannot have any spaces in it. The slash (\)
is a nifty way of telling the program to assign every third (or
second, or fifth, or whatever) character to the character set.
This option is best used not from the command line, but rather as a
line in the mrbayes block of a file. Note that you can use "." to
stand in for the last character (e.g., charset 1-.\3).
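To make the range-and-stride notation concrete, here is a minimal Python sketch that expands a single specification such as `1-720\3` into site numbers. The helper name `expand_charset` and its `nchar` argument are hypothetical, introduced only for illustration. It handles one range at a time and is not how MrBayes itself parses the command.

```python
def expand_charset(spec: str, nchar: int) -> list[int]:
    """Expand a spec such as '1-720\\3' into 1-based site numbers.

    Hypothetical helper for illustration; it handles a single range only.
    """
    stride = 1
    if "\\" in spec:
        # The part after the backslash is the step ("every third site").
        spec, stride_text = spec.split("\\")
        stride = int(stride_text)
    start_text, _, end_text = spec.partition("-")
    start = int(start_text)
    if end_text == "":
        end = start            # a bare number means a single site
    elif end_text.strip() == ".":
        end = nchar            # "." stands in for the last character
    else:
        end = int(end_text)
    return list(range(start, end + 1, stride))


first_pos = expand_charset(r"1-720\3", nchar=720)
print(first_pos[:4], len(first_pos))
```

With a 720-site alignment, the same helper called with `r"2-.\3"` would give the second codon positions.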
- Amino acid rate matrix
Aamodel -- Aminoacid rate matrix
- Specifying gamma shape and proportion of invariable sites
Shapepr -- This parameter specifies the prior for the gamma/lnorm shape
parameter for among-site rate variation. The options are:
prset shapepr = uniform(<number>,<number>)
prset shapepr = exponential(<number>)
prset shapepr = fixed(<number>)
Pinvarpr -- This parameter specifies the prior for the proportion of
invariable sites. The options are:
prset pinvarpr = uniform(<number>,<number>)
prset pinvarpr = fixed(<number>)
Note that the valid range for the parameter is between 0
and 1. Hence, "prset pinvarpr=uniform(0,0.8)" is valid
while "prset pinvarpr=uniform(0,10)" is not. The
default setting is "prset pinvarpr=uniform(0,1)".
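As a sketch of how the ProtTest estimates from the start of these notes could be turned into `fixed(...)` priors, the Python snippet below formats the two `prset` lines using only the syntax quoted in the help text. The function name `fixed_prior_lines` is hypothetical. Whether to fix these values or let MrBayes estimate them is a separate decision that these notes do not settle.

```python
def fixed_prior_lines(gamma_shape: float, pinvar: float) -> list[str]:
    """Format 'prset' lines that fix the gamma shape and pinvar values.

    Hypothetical helper; it only checks the ranges mentioned in the help text.
    """
    if gamma_shape <= 0:
        raise ValueError("gamma shape must be positive")
    if not 0.0 <= pinvar <= 1.0:
        raise ValueError("proportion of invariable sites must lie in [0, 1]")
    return [
        f"prset shapepr = fixed({gamma_shape})",
        f"prset pinvarpr = fixed({pinvar})",
    ]


# Values taken from the ProtTest output at the top of these notes.
for line in fixed_prior_lines(0.729, 0.156):
    print(line)
```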
- Parameter settings for the Markov chain Mcmc block
Mcmc
This command starts the Markov chain Monte Carlo (MCMC) analysis. The
posterior probability of phylogenetic trees (and other parameters of the
substitution model) cannot be determined analytically. Instead, MCMC is
used to approximate the posterior probabilities of trees by drawing
(dependent) samples from the posterior distribution. This program can
implement a variant of MCMC called "Metropolis-coupled Markov chain Monte
Carlo", or MCMCMC for short. Basically, "Nchains" are run, with
Nchains - 1 of them heated. The chains are labelled 1, 2, ..., Nchains.
The heat that is applied to the i-th chain is B = 1 / (1 + temp X i). B
is the power to which the posterior probability is raised. When B = 0, all
trees have equal probability and the chain freely visits trees. B = 1 is
the "cold" chain (or the distribution of interest). MCMCMC can mix
better than ordinary MCMC; after all of the chains have gone through
one cycle, two chains are chosen at random and an attempt is made to
swap the states (with the probability of a swap being determined by the
Metropolis et al. equation). This allows the chain to potentially jump
a valley in a single bound. The correct usage is
mcmc <parameter> = <value> ... <parameter> = <value>
For example,
mcmc ngen=100000 nchains=4 temp=0.5
performs a MCMCMC analysis with four chains with the temperature set to
0.5. The chains would be run for 100,000 cycles.
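The heating formula can be tabulated with a few lines of Python. This is a sketch under one assumption that the help text does not spell out: the index i starts at 0 for the cold chain, so that B = 1 for that chain (the text labels chains 1 to Nchains, which would otherwise leave no chain with B = 1). The function name `chain_heats` is hypothetical.

```python
def chain_heats(nchains: int, temp: float) -> list[float]:
    """Return B = 1 / (1 + temp * i) for each chain.

    Assumption: i runs over 0 .. nchains-1, so index 0 is the cold chain.
    """
    return [1.0 / (1.0 + temp * i) for i in range(nchains)]


# Settings from the example command: nchains=4 temp=0.5
for i, b in enumerate(chain_heats(4, 0.5)):
    print(f"i={i}  B={b:.3f}")
```

Under that assumption, a larger `temp` pushes the heated chains further from B = 1, which flattens the distribution they explore.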
Parameter       Options                      Current Setting
-----------------------------------------------------
Ngen            <number>                     1000000
Nruns           <number>                     2
Nchains         <number>                     4
Temp            <number>                     0.100000
Reweight        <number>,<number>            0.00 v 0.00 ^
Swapfreq        <number>                     1
Nswaps          <number>                     1
Samplefreq      <number>                     500
Printfreq       <number>                     1000
Printall        Yes/No                       Yes
Printmax        <number>                     8
Mcmcdiagn       Yes/No                       Yes
Diagnfreq       <number>                     5000
Diagnstat       Avgstddev/Maxstddev          Avgstddev
Minpartfreq     <number>                     0.10
Allchains       Yes/No                       No
Allcomps        Yes/No                       No
Relburnin       Yes/No                       Yes
Burnin          <number>                     0
Burninfrac      <number>                     0.25
Stoprule        Yes/No                       No
Stopval         <number>                     0.05
Savetrees       Yes/No                       No
Checkpoint      Yes/No                       Yes
Checkfreq       <number>                     2000
Filename        <name>                       temp.<p/t>
Startparams     Current/Reset                Current
Starttree       Current/Random/Parsimony     Current
Nperts          <number>                     0
Data            Yes/No                       Yes
Ordertaxa       Yes/No                       No
Append          Yes/No                       No
Autotune        Yes/No                       Yes
Tunefreq        <number>                     100
- By default, a Bayesian analysis runs the Markov chain (mcmc) as two independent runs (Nruns=2)
Parameter       Options                      Current Setting
-----------------------------------------------------
Ngen            <number>                     1000000
Nruns           <number>                     2
- By default, 25% of the samples are discarded
BurninFrac -- Determines the fraction of samples that will be discarded
when summary statistics are calculated. The value of this
option is only relevant when Relburnin is set to 'Yes'.
Example: A value for this option of 0.25 means that 25% of
the samples will be discarded.
Burninfrac <number> 0.25
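- sump summarizes the sampled parameter values. The burnin value used here is
(ngen / samplefreq) * 0.25
The program prints a summary table; make sure the values in the PSRF column are close to 1.0, otherwise more generations need to be run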
Sump -- Summarizes parameters from MCMC analysis
sump burnin=250 (the value 250 depends on the settings, e.g. burninfrac=0.25, samplefreq=10, Ngen=10000)
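The burnin arithmetic can be checked with a short Python sketch. The function name `burnin_samples` is hypothetical. As a simplification, it counts ngen / samplefreq samples and ignores any extra sample written at generation 0.

```python
def burnin_samples(ngen: int, samplefreq: int, burninfrac: float) -> int:
    """Number of samples (not generations) to discard as burn-in.

    Simplifying assumption: the run stores ngen // samplefreq samples.
    """
    nsamples = ngen // samplefreq
    return int(nsamples * burninfrac)


# Settings from the example above: Ngen=10000, samplefreq=10, 25% discarded.
print(burnin_samples(10_000, 10, 0.25))
# MrBayes defaults from the Mcmc table: Ngen=1000000, Samplefreq=500.
print(burnin_samples(1_000_000, 500, 0.25))
```

According to the option descriptions in the Sump help text, an absolute `Burnin` value like this is only used when `Relburnin` is set to `No`. With the default relative burn-in, `Burninfrac` applies instead.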
---------------------------------------------------------------------------
---------------------------------------------------------------------------
Sump
During an MCMC analysis, MrBayes prints the sampled parameter values to one or
more tab-delimited text files, one for each independent run in your analysis.
The command 'Sump' summarizes the information in this parameter file or these
parameter files. By default, the root of the parameter file name(s) is assumed
to be the name of the last matrix-containing nexus file. MrBayes also
remembers the number of independent runs in the last analysis that you set up,
regardless of whether you actually ran it. For instance, if there were two
independent runs, which is the initial setting when you read in a new matrix,
MrBayes will assume that there are two parameter files with the endings
'.run1.p' and '.run2.p'. You can change the root of the file names and the
number of runs using the 'Filename' and 'Nruns' settings.
When you invoke the 'Sump' command, three items are output: (1) a generation
plot of the likelihood values; (2) estimates of the marginal likelihood of
the model; and (3) a table with the mean, variance, and 95 percent credible
interval for the sampled parameters. All three items are output to screen.
The table of marginal likelihoods is also printed to a file with the ending
'.lstat' and the parameter table to a file with the ending '.pstat'. For some
model parameters, there may also be a '.mstat' file.
When running 'Sump' you typically want to discard a specified number or
fraction of samples from the beginning of the chain as the burn in. This is
done using the same mechanism used by the 'mcmc' command. That is, if you
run an mcmc analysis with a relative burn in of 25 % of samples for
convergence diagnostics, then the same burn in will be used for a subsequent
sump command, unless a different burn in is specified. That is, issuing
sump
immediately after 'mcmc', will result in using the same burn in settings as
for the 'mcmc' command. All burnin settings are reset to default values every
time a new matrix is read in, namely relative burnin ('relburnin=yes') with
25 % of samples discarded ('burninfrac = 0.25').
Options:
Burnin -- Determines the number of samples (not generations) that will
be discarded when summary statistics are calculated. The
value of this option is only applicable when 'Relburnin' is
set to 'No'.
Burninfrac -- Determines the fraction of samples that will be discarded when
summary statistics are calculated. The setting only takes
effect if 'Relburnin' is set to 'Yes'.
- Sumt parameters
Sumt -- Summarizes trees from MCMC analysis
sumt burnin=250 (view the tree topology)
Sumt
This command is used to produce summary statistics for trees sampled during
a Bayesian MCMC analysis. You can either summarize trees from one individual
analysis, or trees coming from several independent analyses. In either case,
all the sampled trees are read in and the proportion of the time any single
taxon bipartition (split) is found is counted. The proportion of the time that
the bipartition is found is an approximation of the posterior probability of
the bipartition. (Remember that a taxon bipartition is defined by removing a
branch on the tree, dividing the tree into those taxa to the left and right
of the removed branch. This set is called a taxon bipartition.) The branch
length of the bipartition is also recorded, if branch lengths have been saved
to file. The result is a list of the taxon bipartitions found, the frequency
with which they were found, the posterior probability of the bipartition
and, the mean and variance of the branch lengths or node depths, and various
other statistics.
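The counting step described above can be sketched in Python. The representation is an assumption made for illustration: each sampled tree is given as a list of taxon sets, one per internal branch, and all function names are hypothetical. Real `.t` files hold Newick trees that would need parsing first, which this sketch skips.

```python
from collections import Counter


def normalize(split, taxa):
    """Represent a bipartition by the side that excludes the first taxon,
    so that {A,B} and its complement count as the same split."""
    side = frozenset(split)
    reference = sorted(taxa)[0]
    return frozenset(taxa) - side if reference in side else side


def split_frequencies(trees, taxa):
    """Fraction of sampled trees in which each bipartition appears."""
    counts = Counter()
    for tree in trees:
        counts.update({normalize(s, taxa) for s in tree})
    return {s: n / len(trees) for s, n in counts.items()}


# Toy sample of four trees on five taxa (made-up data, not MrBayes output).
taxa = {"A", "B", "C", "D", "E"}
trees = [
    [{"A", "B"}, {"D", "E"}],
    [{"A", "B"}, {"C", "D"}],
    [{"C", "D", "E"}, {"D", "E"}],  # {C,D,E} is the same split as {A,B}
    [{"A", "B"}, {"D", "E"}],
]
for split, freq in sorted(split_frequencies(trees, taxa).items(),
                          key=lambda kv: -kv[1]):
    print(sorted(split), freq)
```

The frequency of each split across the post-burn-in sample is the quantity the help text describes as an approximation of its posterior probability.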
The key to the partitions is output to a file with the suffix '.parts'. The
summary statistics pertaining to bipartition probabilities are output to a
file with the suffix '.tstat', and the statistics pertaining to branch or node
parameters are output to a file with the suffix '.vstat'.
A consensus tree is also printed to a file with the suffix '.con.tre' and
printed to the screen as a cladogram, and as a phylogram if branch lengths
have been saved. The consensus tree is either a 50 percent majority rule tree
or a majority rule tree showing all compatible partitions. If branch lengths
have been recorded during the run, the '.con.tre' file will contain a
consensus tree with branch lengths and interior nodes labelled with support
values. By default, the consensus tree will also contain other summary
information in a format understood by the program 'FigTree'. To use a simpler
format understood by other tree-drawing programs, such as 'TreeView', set
'Conformat' to 'Simple'.
MrBayes also produces a file with the ending ".trprobs" that contains a list
of all the trees that were found during the MCMC analysis, sorted by their
probabilities. This list of trees can be used to construct a credible set of
trees. For example, if you want to construct a 95 percent credible set of
trees, you include all of those trees whose cumulative probability is less
than or equal to 0.95. You have the option of displaying the trees to the
screen using the "Showtreeprobs" option. The default is to not display the
trees to the screen; the number of different trees sampled by the chain can
be quite large. If you are analyzing a large set of taxa, you may actually
want to skip the calculation of tree probabilities entirely by setting
'Calctreeprobs' to 'No'.
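The credible-set rule can be sketched in Python as follows. The tree names and probabilities are made-up illustration data, not the contents of a real ".trprobs" file, and the function name `credible_set` is hypothetical. The sketch follows the wording above literally: trees are taken in order of decreasing probability for as long as the cumulative probability stays at or below the chosen level.

```python
def credible_set(tree_probs: dict[str, float], level: float = 0.95):
    """Return (trees, cumulative probability) for the credible set."""
    ranked = sorted(tree_probs.items(), key=lambda kv: kv[1], reverse=True)
    chosen, cumulative = [], 0.0
    for name, prob in ranked:
        # Small tolerance guards against floating-point round-off.
        if cumulative + prob > level + 1e-12:
            break
        cumulative += prob
        chosen.append(name)
    return chosen, cumulative


# Made-up posterior probabilities for five sampled topologies.
probs = {"tree_1": 0.60, "tree_2": 0.25, "tree_3": 0.08,
         "tree_4": 0.05, "tree_5": 0.02}
members, total = credible_set(probs)
print(members, round(total, 2))
```

Read this way, the set can cover slightly less than 95 percent of the probability mass, because the tree that would cross the threshold is left out.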
If you are summarizing the trees sampled in several independent analyses,
such as those resulting from setting the 'Nruns' option of the 'Mcmc' command
to a value larger than 1, MrBayes will also calculate convergence diagnostics
for the sampled topologies and branch lengths. These values can help you
determine whether it is likely that your chains have converged.
The 'Sumt' command expands the 'Filename' according to the current values of
the 'Nruns' and 'Ntrees' options. For instance, if both 'Nruns' and 'Ntrees'
are set to 1, 'Sumt' will try to open a file named '<Filename>.t'. If 'Nruns'
is set to 2 and 'Ntrees' to 1, then 'Sumt' will open two files, the first
named '<Filename>.run1.t' and the second '<Filename>.run2.t', etc. By default,
the 'Filename' option is set such that 'Sumt' automatically summarizes all the
results from your immediately preceding 'Mcmc' command. You can also use the
'Sumt' command to summarize tree samples in older analyses. If you want to do
that, remember to first read in a matrix so that MrBayes knows what taxon
names to expect in the trees. Then set the 'Nruns', 'Ntrees' and 'Filename'
options appropriately if they differ from the MrBayes defaults.
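The file-name expansion can be illustrated with a small Python sketch. It covers only the 'Ntrees' = 1 cases spelled out above. The function name and the example root `primates.nex` are hypothetical.

```python
def sumt_tree_files(filename: str, nruns: int) -> list[str]:
    """Tree files 'Sumt' would look for when 'Ntrees' is 1, per the help text."""
    if nruns < 1:
        raise ValueError("nruns must be at least 1")
    if nruns == 1:
        return [f"{filename}.t"]
    return [f"{filename}.run{run}.t" for run in range(1, nruns + 1)]


print(sumt_tree_files("primates.nex", 1))
print(sumt_tree_files("primates.nex", 2))
```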
Options:
Relburnin -- If this option is set to YES, then a proportion of the
samples will be discarded as burnin when calculating summary
statistics. The proportion to be discarded is set with
Burninfrac (see below). When the Relburnin option is set to
NO, then a specific number of samples is discarded instead.
This number is set by Burnin (see below). Note that the
burnin setting is shared across the 'sumt', 'sump', and
'mcmc' commands.
Burnin -- Determines the number of samples (not generations) that will
be discarded when summary statistics are calculated. The
value of this option is only relevant when Relburnin is set
to NO.