Commit
1 parent 1240f5c, commit 4d9bbb8
Showing 8 changed files with 272,909 additions and 150 deletions.
@@ -1,40 +1,29 @@
clade gene site alt

A nuc 667 T
A nuc 2044 C
A nuc 2996 T
A nuc 2551 G

# Note that B is a subclade of C in the full 11k tree
B nuc 841 C
B nuc 3072 A
B nuc 410 G

C nuc 346 T
C nuc 791 T
C nuc 1135 C
C nuc 2747 G
C nuc 117 A
C nuc 2981 A
C nuc 484 C

C_re nuc 287 A
C_re nuc 957 G
C_re nuc 45 A
C_re nuc 1043 A
C_re nuc 2136 C

D nuc 724 T
D nuc 1059 C
D nuc 113 C
D nuc 2689 G

E nuc 346 T
E nuc 505 T
E nuc 1060 C
E nuc 520 C

F nuc 572 T
F nuc 909 A
F nuc 1332 A
F nuc 1700 A
F nuc 2979 C
F nuc 1694 T

G nuc 1008 G
G nuc 2038 T
G nuc 1953 G
G nuc 2494 G
G nuc 2770 G

H nuc 529 C
H nuc 1467 T
H nuc 2619 G

I nuc 181 C
I nuc 852 G
I nuc 1000 T
I nuc 1052 G
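
The rows in the hunk above follow a four-column layout (`clade gene site alt`). As a rough illustration of how such rows could be grouped per clade, here is a minimal Python sketch. It assumes whitespace-separated fields and `#` comment lines; the function names `parse_clade_rows` and `shared_mutations` are hypothetical and are not part of the commit or of augur. The sample copies only a handful of rows from the hunk.

```python
from collections import defaultdict


def parse_clade_rows(text):
    """Group `clade gene site alt` rows by clade.

    Assumption: fields are whitespace-separated, and blank lines,
    `#` comments and the header row carry no data.
    """
    clades = defaultdict(list)
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        if fields == ["clade", "gene", "site", "alt"]:
            continue  # header row
        if len(fields) != 4:
            raise ValueError(f"expected 4 fields, got {len(fields)}: {line!r}")
        clade, gene, site, alt = fields
        clades[clade].append((gene, int(site), alt))
    return dict(clades)


def shared_mutations(clades):
    """Return the (gene, site, alt) tuples listed under more than one clade."""
    seen = defaultdict(set)
    for clade, mutations in clades.items():
        for mutation in mutations:
            seen[mutation].add(clade)
    return {m: sorted(names) for m, names in seen.items() if len(names) > 1}


# A few rows copied from the hunk above (not the complete table).
SAMPLE = """\
clade gene site alt

# Note that B is a subclade of C in the full 11k tree
B nuc 841 C
C nuc 346 T
C nuc 791 T
E nuc 346 T
E nuc 505 T
"""

clades = parse_clade_rows(SAMPLE)
print(clades["C"])
print(shared_mutations(clades))
```

A check like `shared_mutations` may help when reviewing a definition table, since `nuc 346 T` is listed under both C and E in the rows shown.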
@@ -1,10 +1,33 @@


# Used to re-circularise genomes. This should be the same as the reference in the nextclade dataset!
# Also force-included for a number of datasets
reference_accession: "NC_003977"


# Path to the nextclade dataset, which will be used to align all sequences
nextclade_dataset: "nextclade_datasets/references/NC_003977/versions/2023-08-22"

nextclade_binary: "bin/nextclade"
nextclade_binary: "bin/nextclade"

# technically CDSs, but we use these terms rather interchangeably.
# These should be the entire list of CDSs in the nextclade dataset's genemap
genes: ['envS', 'envM', 'envL', 'X', 'pre-capsid', 'capsid', 'pol']

# Augur cannot parse the (correct) nextclade dataset genemap, so we make a temporary one
# Luckily, while many CDSs wrap, `augur ancestral` does the right thing (perhaps by chance not design)
temporary_genemap_for_augur_ancestral: "config/temp_genemap.gff"

genotypes: ["A", "B", "C", "D"]

roots: {
  "all": "HQ603073", # NHP-HBV isolate to root the tree
  # genotype roots chosen by examining the entire tree and picking a suitably close isolate
  "A": "MK534669", # root is genotype I (I is A/C/G recombinant)
  "B": "MK534669", # root is genotype I (I is A/C/G recombinant)
  "C": "MK534669", # root is genotype I (I is A/C/G recombinant)
  "D": "KX186584", # root is genotype E
}

# candidate non-human sequences which fall in outgroups (the entire outgroup will be pruned)
outgroups: ["FM209514", "HQ603059", "FJ798097", "HQ603068"] # "HQ603059", "FJ798097", "AY330914", "AY781182", "AB823660", "AY330914", "FJ798098"]
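
The config above pairs a `genotypes` list with a `roots` mapping that also carries an `"all"` entry. The workflow code that consumes these keys is not shown in this view, so the following Python sketch is only an assumption about how they might be cross-checked: it treats `"all"` plus each genotype as a build name (a hypothetical convention) and uses a plain dict that mirrors values copied from the hunk. The helpers `missing_roots` and `root_for` are illustrative names, not functions from the repository.

```python
# Values copied from the config hunk above; only the keys needed here.
CONFIG = {
    "reference_accession": "NC_003977",
    "genotypes": ["A", "B", "C", "D"],
    "roots": {
        "all": "HQ603073",
        "A": "MK534669",
        "B": "MK534669",
        "C": "MK534669",
        "D": "KX186584",
    },
    "outgroups": ["FM209514", "HQ603059", "FJ798097", "HQ603068"],
}


def build_names(config):
    """Assumed convention: one build per genotype, plus a combined "all" build."""
    return ["all"] + list(config["genotypes"])


def missing_roots(config):
    """Return build names that have no entry in `roots`."""
    return [name for name in build_names(config) if name not in config["roots"]]


def root_for(config, build):
    """Look up the rooting accession for a build, failing loudly if absent."""
    try:
        return config["roots"][build]
    except KeyError:
        raise KeyError(f"no root configured for build {build!r}") from None


print(missing_roots(CONFIG))
print({name: root_for(CONFIG, name) for name in build_names(CONFIG)})
```

Running a check of this kind before a build could surface a genotype that was added to `genotypes` without a matching root.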
@@ -0,0 +1,12 @@
##gff-version 3
##Created by Nextstrain (james hadfield) for use with `augur ancestral` which can only parse a very specific format of GFF files
##sequence-region NC_003977.2 1 3182
##species https://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?id=10407
NC_003977.2 RefSeq region 1 3182 . + . ID=NC_003977.2:1..3182;Dbxref=taxon:10407;Is_circular=true;gbkey=Src;genome=genomic;mol_type=genomic DNA;strain=ayw
NC_003977.2 RefSeq gene 1376 1840 . + 0 Name=X;gene=X
NC_003977.2 RefSeq gene 1816 2454 . + 0 Name=pre-capsid;gene=pre-capsid
NC_003977.2 RefSeq gene 1903 2454 . + 0 Name=capsid;gene=capsid
NC_003977.2 RefSeq gene 2309 4807 . + 0 Name=pol;gene=pol
NC_003977.2 RefSeq gene 2850 4019 . + 0 Name=envL;gene=envL
NC_003977.2 RefSeq gene 3174 4019 . + 0 Name=envM;gene=envM
NC_003977.2 RefSeq gene 157 837 . + 0 Name=envS;gene=envS
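
Several gene rows above (pol, envL, envM) end beyond the 3182 nt region, which on a genome flagged `Is_circular=true` means the feature runs past the origin. The Python sketch below shows one way to split such coordinates into linear segments. It illustrates the coordinate arithmetic only and makes no claim about how `augur ancestral` or nextclade handle these rows internally; `circular_segments` is a hypothetical helper name.

```python
GENOME_LENGTH = 3182  # from `##sequence-region NC_003977.2 1 3182`

# 1-based inclusive (start, end) pairs copied from the gene rows above.
GENES = {
    "X": (1376, 1840),
    "pre-capsid": (1816, 2454),
    "capsid": (1903, 2454),
    "pol": (2309, 4807),
    "envL": (2850, 4019),
    "envM": (3174, 4019),
    "envS": (157, 837),
}


def circular_segments(start, end, genome_length=GENOME_LENGTH):
    """Split a 1-based inclusive feature into segments on a circular genome.

    An `end` larger than the genome length is read as wrapping past the
    origin, so the feature becomes (start..genome_length) + (1..end - length).
    """
    if not 1 <= start <= genome_length:
        raise ValueError(f"start {start} outside 1..{genome_length}")
    if end < start:
        raise ValueError(f"end {end} precedes start {start}")
    if end - start + 1 > genome_length:
        raise ValueError("feature is longer than the genome")
    if end <= genome_length:
        return [(start, end)]
    return [(start, genome_length), (1, end - genome_length)]


for name, (start, end) in GENES.items():
    print(name, circular_segments(start, end))
```

With these inputs, pol splits into 2309..3182 and 1..1625, and both envL and envM wrap to end at position 837, the same end coordinate as the envS row.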