VADR doesn't annotate second segment of segmented CoV genome #52

@taltman

The Serratus Project expanded the set of known CoV/nidovirus genomes, including segmented ones. An example of a segmented nidovirus similar to the ones that we found is the Pacific salmon nidovirus (MK611985.1). Please see Figure 3 of our preprint for more context:
https://www.biorxiv.org/content/10.1101/2020.08.07.241729v2

When I try to annotate the AmexNV genome, supplied as two segments in a single input FASTA file, VADR 1.3 annotates the first segment and then reports the following for the second one:

>Feature NODE_11_length_12596_cov_95.354468

Additional note(s) to submitter:
ERROR: NO_ANNOTATION: (*sequence*) no significant similarity detected [-]; seq-coords:-; mdl-coords:-; mdl:-;

Yet when I concatenate the two contigs, separated by a run of 16 Ns, I get additional annotations (listed under "Additional annotations" below). Is there a way for VADR to recognize multiple segments and annotate them individually? The input files used are attached at the end of this post.

Additional annotations:

22167   27672   gene
                        gene    S
22167   27672   CDS
                        product spike glycoprotein
                        protein_id      NODE_3_length_19124_cov_65.568632_3
27717   28212   gene
                        gene    orf4
27717   28212   CDS
                        product non-structural protein
                        protein_id      NODE_3_length_19124_cov_65.568632_4
28193   28627   gene
                        gene    E
28193   28627   CDS
                        product small membrane protein
                        protein_id      NODE_3_length_19124_cov_65.568632_5
28639   29602   gene
                        gene    M
28639   29602   CDS
                        product membrane glycoprotein
                        protein_id      NODE_3_length_19124_cov_65.568632_6
29646   31439   gene
                        gene    N
29646   31439   CDS
                        product nucleocapsid phosphoprotein
                        protein_id      NODE_3_length_19124_cov_65.568632_7
29665   30389   gene
                        gene    N2
29665   30389   CDS
                        product nucleocapsid phosphoprotein 2
                        protein_id      NODE_3_length_19124_cov_65.568632_8
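
The coordinates above are relative to the concatenated sequence, not to the individual segments. A minimal sketch for mapping them back is shown here. It assumes that the first contig is 19124 nt and the second is 12596 nt (lengths read off the contig names `NODE_3_length_19124...` and `NODE_11_length_12596...`, not verified against the files), that the contigs were joined in that order, and that the spacer is exactly 16 Ns. The function name `to_segment_coords` is hypothetical.

```python
def to_segment_coords(pos, segment_lengths, spacer_len=16):
    """Map a 1-based position in the concatenated sequence to
    (segment_index, 1-based position within that segment).

    Returns None when the position falls inside a spacer or
    outside the concatenated sequence.
    """
    offset = 0  # number of bases preceding the current segment
    for index, length in enumerate(segment_lengths):
        if offset < pos <= offset + length:
            return index, pos - offset
        offset += length + spacer_len
    return None


# Assumed lengths, taken from the contig names in this issue.
ASSUMED_LENGTHS = [19124, 12596]

if __name__ == "__main__":
    # Start and end of the S gene as reported on the concatenated sequence.
    for pos in (22167, 27672):
        print(pos, to_segment_coords(pos, ASSUMED_LENGTHS))
```

Under those assumptions, the S gene start at 22167 falls in the second segment at position 3027 (22167 minus 19124 minus 16), so all of the additional annotations listed above would lie on the second contig.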

Original FASTA file with two segments:
SRR6788790.epsy.fa.txt

Modified FASTA with the two segments concatenated:
AmexNV-one-contig-test.fa.txt
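
For anyone who wants to reproduce the concatenation step, here is a minimal sketch of joining FASTA records with a run of 16 Ns. It is not necessarily how the attached file was produced. The helper names (`read_fasta`, `concatenate_segments`, `format_fasta`) are hypothetical, and keeping the first record's header for the merged sequence is an assumption.

```python
def read_fasta(text):
    """Parse FASTA-formatted text into a list of (header, sequence) tuples."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def concatenate_segments(records, spacer_len=16):
    """Join all sequences with a spacer of Ns, keeping the first header."""
    spacer = "N" * spacer_len
    header = records[0][0]
    return header, spacer.join(seq for _, seq in records)


def format_fasta(header, seq, width=60):
    """Render one record as FASTA text with fixed-width sequence lines."""
    lines = [">" + header]
    lines += [seq[i:i + width] for i in range(0, len(seq), width)]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Toy input for illustration; real input would be the two-segment FASTA.
    toy = ">seg1\nACGT\nAC\n>seg2\nGGTT\n"
    merged_header, merged_seq = concatenate_segments(read_fasta(toy))
    print(format_fasta(merged_header, merged_seq), end="")
```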
