Support Exporting Model and Partition Selections to MrBayes #29
Open: IntegerLimit wants to merge 12 commits into iqtree:master from IntegerLimit:mr-bayes-support-iqtree-3
* Support Exporting DNA/RNA Analysis to MrBayes Block Files
* Fix Formatting Issues
* Move Functions to Supplementary, Fix +R Remapping

…#264)
* Protein Model
* Morphological Models Support
* Cleanup: Move Model Specific Functions to each model class; Move other functions from phylotree and phylosupertree to phyloanalysis
* Cleanup Imports
* Binary Model Support
* Misc Cleanup
* Output Files Readability, Default Warning & Help Message
* Fix Edge Case: Importing Values < 0.01 into MrBayes
* Fix Edge Case: Extra Characters in Charset
* Fix +G+I or +R Inputs
* Fix Issues with Binary Model
* Fix Issues with Morphology Model

* Codon Model
* Fix Compiler Error due to Merge Conflicts
* Fix Codon Model `NucModel` Parameter
* Fix Empirical Warning + Indentation of Warnings
* Improve Start-Of-File Warnings
* Fix Indentation in alignment.cpp
@thomaskf what's the current status?
I'll fix up the conflicts tomorrow.
Introduction
This PR ports iqtree/iqtree2#267 over into IQTree 3, and implements all review requests previously made on that PR.
Similarly, iqtree/iqtree2#195 has been implemented for all sequence types supported by MrBayes (DNA/RNA, Protein, Binary, Morphological & Codon). This PR is complete, excluding any further changes required by suggestions and bug fixes after testing has been conducted.
General Implementation Details
* … `-mset mrbayes`, or adds the `-mrbayes` flag.
* … `mr_bayes.nex`)
* … `GTR+G+I` (DNA) will be used, with warnings printed to log and file.
* `+R` has been mapped to `+G+I` (with warnings printed to log + file)
DNA Fallbacks
For DNA, MrBayes supports three models: JC/F81 (`nst=1`), HKY (`nst=2`) and GTR (`nst=6`) (excluding their fixed frequency counterparts).
Therefore, when a model is used that is not supported in MrBayes, it will default to GTR, because the additional parameters have a lower impact when using Bayesian Inference.
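The fallback rule above can be sketched as follows. This is a minimal illustration in Python, not the PR's C++ code: the function name `dna_model_to_nst`, the lookup table, and the warning text are all assumptions made for the example.

```python
# Hypothetical sketch of the DNA fallback; names are illustrative only.
# Models that map directly onto a MrBayes nst value, as listed above.
SUPPORTED_NST = {"JC": 1, "F81": 1, "HKY": 2, "GTR": 6}


def dna_model_to_nst(model_name):
    """Return (nst, warning) for an IQTree model string such as 'TN+G+I'.

    The base model is the part before the first '+'. Any base model not
    in SUPPORTED_NST falls back to GTR (nst=6) and a warning is returned.
    """
    base = model_name.split("+")[0].upper()
    if base in SUPPORTED_NST:
        return SUPPORTED_NST[base], None
    warning = f"{base} is not supported by MrBayes; falling back to GTR (nst=6)"
    return 6, warning
```

For example, `dna_model_to_nst("HKY+G")` yields `nst=2` with no warning, while an unsupported base model such as `TN` yields `nst=6` together with a warning string that a caller could write to the log and the output file.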
Protein Fallbacks
For Protein, when a model is used that is not supported by MrBayes, a default of `GTR` will be used. The rate matrix and state frequencies of the model are then included. The rate matrix will be set to `fixed`, unless the model used by IQTree was `GTR20`, in which case `dirichlet` will be used. This appears to be a mandatory parameter for MrBayes GTR models.
Binary, Morphological and Codon Data Exclusions
Codon Implementation Details
Codon Models in MrBayes: Introduction
The basic structure for codon models in MrBayes is quite similar to mechanistic codon models in IQTree, following the same formulation of the model by Goldman & Yang 1994 and Muse & Gaut 1994. However, the settings for MrBayes are under different names, and most inputs cannot be ported directly, making it the most difficult model to port to MrBayes format.
Instead of using named models, such as `MG` or `GY`, MrBayes uses only one main parameter, which acts similarly to DNA model selection: the Nucleotide Substitution Model (from `JC`, `HKY` and `GTR`), set through `lset nst` (`nst = 1` for `JC`, `nst = 2` for `HKY`, `nst = 6` for `GTR`).
(Source: MrBayes Manual (Chapter 6.1.3 & Appendix A), MrBayes `help lset` and `help prset` commands, IQTree Documentation on Substitution Models (Section on Codon Models))
Mechanistic Model Output
Nucleotide Substitution Model
For retrieving the nucleotide substitution model that should be used as input into MrBayes, the implemented code does the following:
* If `fix_kappa` is true, then the model will be set to `nst = 1` (JC)
* If `fix_kappa` is false, then the model will be set to `nst = 2` (HKY)
This implementation means that GTR is not used, which is appropriate considering the inputs for Mechanistic Codon Models in IQTree (dN/dS ratio + ts/tv ratio).
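A minimal sketch of that selection, written in Python for brevity rather than the C++ used by the PR. The function names are hypothetical; only the `fix_kappa` flag and the MrBayes `lset nucmodel=codon nst=...` syntax come from the description and the MrBayes command set.

```python
# Hypothetical sketch; the PR's actual implementation lives in the C++ model classes.
def codon_nst(fix_kappa):
    """Pick the MrBayes nst value for a mechanistic codon model.

    fix_kappa=True means no ts/tv ratio is estimated, so JC (nst=1) is used.
    fix_kappa=False means a ts/tv ratio is estimated, so HKY (nst=2) is used.
    GTR (nst=6) is never returned for mechanistic models.
    """
    return 1 if fix_kappa else 2


def codon_lset_line(fix_kappa):
    """Format the corresponding lset command for a MrBayes block."""
    return f"lset nucmodel=codon nst={codon_nst(fix_kappa)};"
```

Keeping the decision in one small function makes it easy to check that only the two values `1` and `2` can ever be emitted for mechanistic models.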
Note that `fix_kappa` is only set to true under the Codon Models `MGK` and `GY0K` (which are the only models without a `ts/tv` input ratio). This can be shown through the `initCodon` function, which only calls `initMG94` or `initGY94` with `fix_kappa` as true for those two models. That input is then read into the `fix_kappa` field here for MG Models and here for GY Models.
Empirical Model Output
MrBayes does not support Empirical Codon Models, so when such a model is being used (or a mixture of Empirical + Mechanistic), a warning is printed to the log and file. However, a model is still output, with `nst = 6` (GTR-like model).
Codon Codes
Whilst IQTree uses Number IDs for its Codon Codes (CODON1, CODON2, etc.), MrBayes uses Text IDs (`vertmt`, `invermt`, etc.). There is no clear documentation or description for most of the codes, but the table below gives the final mapping from IQTree Codon Codes to MrBayes Codon Codes.
An `XXX` in the MrBayes column represents a code that MrBayes does not support.
If a code is used that MrBayes does not support, it defaults to the universal code, and prints a warning to the log and file.
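The lookup-with-fallback behaviour can be sketched as below. This Python snippet is illustrative only: the function name is hypothetical, and the map is deliberately partial, containing just three entries that follow the NCBI translation table numbering (1 = standard, 2 = vertebrate mitochondrial, 5 = invertebrate mitochondrial). It does not reproduce the PR's full table, so a code missing from this sketch is not necessarily unsupported in the PR.

```python
# Hypothetical, deliberately incomplete map from IQTree codon codes to
# MrBayes genetic code names. The PR's full table is not reproduced here.
PARTIAL_CODE_MAP = {
    "CODON1": "universal",
    "CODON2": "vertmt",
    "CODON5": "invermt",
}


def mrbayes_genetic_code(iqtree_code):
    """Return (mrbayes_code, warning) for an IQTree codon code.

    Codes without an entry fall back to the universal code, and a warning
    string is returned so the caller can print it to the log and the file.
    """
    name = PARTIAL_CODE_MAP.get(iqtree_code.upper())
    if name is None:
        warning = f"{iqtree_code} has no MrBayes equivalent here; using universal"
        return "universal", warning
    return name, None
```

The returned name would then be written as `lset code=<name>` in the MrBayes block.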
Example Output Files
Without Partitions
With Partitions