VADR predicted nested genes, prevents submission to ENA #54

taltman · 2021-12-23T22:39:10Z

This seemed to anger the validation guards at ENA:

19094   20750   gene
                        gene    N
19094   20750   CDS
                        product nucleocapsid phosphoprotein
                        protein_id      NODE_1_length_10623_cov_925.238_7
19115   19838   gene
                        gene    N2
19115   19838   CDS
                        product nucleocapsid phosphoprotein 2
                        protein_id      NODE_1_length_10623_cov_925.238_8

Is this desired behavior by VADR?


nawrockie · 2021-12-24T15:27:56Z

What was the issue exactly? The protein_id values? If so there's a --noprotid option that will get rid of them. If it's not that let me know what the problem is, there may be a way around it.

nawrockie · 2021-12-26T14:35:16Z

Ah, I see from the title of the issue the problem is that they are nested. Can you send me the .minfo file used with v-annotate.pl?

taltman · 2021-12-27T22:38:50Z

Hi @nawrockie, I'm using the pan-Coronavirus model, version 1.3:

Please let me know if I misunderstood what you were asking for. Thanks!

nawrockie · 2021-12-29T21:31:30Z

It looks like the best-matching model for your sequence must be the NC_006577 model, because that is the only model with an N2 gene. The NC_006577 RefSeq has N2 nested within N, as shown in the .minfo file, so that's why VADR is annotating it in your sequence:

FEATURE NC_006577 type:"gene" coords:"28320..29645:+" parent_idx_str:"GBNULL" gene:"N"
FEATURE NC_006577 type:"CDS" coords:"28320..29645:+" parent_idx_str:"GBNULL" gene:"N" product:"nucleocapsid phosphoprotein"
FEATURE NC_006577 type:"gene" coords:"28342..28959:+" parent_idx_str:"GBNULL" gene:"N2"
FEATURE NC_006577 type:"CDS" coords:"28342..28959:+" parent_idx_str:"GBNULL" gene:"N2" product:"nucleocapsid phosphoprotein 2"
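To check a model for other nested features before submitting, the FEATURE lines can be scanned for one span lying inside another. The following is a minimal sketch, not part of VADR: `parse_feature` and `find_nested` are hypothetical helper names, and the parser assumes the `key:"value"` layout and `start..end:strand` coordinates seen in the lines above.

```python
import re

# Assumed layout, taken from the FEATURE lines quoted above: key:"value" pairs,
# with coords given as one or more start..end:strand segments.
ATTR_RE = re.compile(r'(\w+):"([^"]*)"')
SEG_RE = re.compile(r'(\d+)\.\.(\d+):([+-])')


def parse_feature(line):
    """Return a dict for one FEATURE line, or None for any other line."""
    if not line.startswith("FEATURE"):
        return None
    attrs = dict(ATTR_RE.findall(line))
    segs = SEG_RE.findall(attrs.get("coords", ""))
    if not segs:
        return None
    # Use the outermost positions so multi-segment features get a single span.
    positions = [int(p) for seg in segs for p in seg[:2]]
    return {
        "type": attrs.get("type"),
        "gene": attrs.get("gene"),
        "start": min(positions),
        "end": max(positions),
        "strand": segs[0][2],
    }


def find_nested(lines, ftype="gene"):
    """List (inner, outer) gene-name pairs where inner lies within outer."""
    feats = [f for f in map(parse_feature, lines) if f and f["type"] == ftype]
    pairs = []
    for inner in feats:
        for outer in feats:
            if inner is outer or inner["strand"] != outer["strand"]:
                continue
            contained = (outer["start"] <= inner["start"]
                         and inner["end"] <= outer["end"])
            same_span = (inner["start"], inner["end"]) == (outer["start"], outer["end"])
            if contained and not same_span:
                pairs.append((inner["gene"], outer["gene"]))
    return pairs
```

Run on the four NC_006577 lines above, `find_nested(lines)` reports the single pair `("N2", "N")`. Whether a nested pair is actually rejected is up to the ENA validator, so this only flags candidates.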

If nested CDS and gene features are not allowed by ENA for submission purposes, you can just remove the N2 annotations manually from your .tbl file, or you can make a new .minfo file for vadr that has N2 removed and use that to redo the annotation, whichever is easier.
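For the first option, removing N2 from the .tbl file can be scripted rather than done by hand. This is a rough sketch under some assumptions: feature lines start in the first column with `start end type`, qualifier lines are indented, and a gene and its CDS share identical coordinates, as in the output quoted at the top of this issue. The function names `parse_blocks` and `drop_features_by_gene` are made up for illustration.

```python
def parse_blocks(lines):
    """Group feature-table lines into blocks, one per header or feature."""
    blocks = []
    for line in lines:
        starts_in_col1 = line[:1] != "" and not line[:1].isspace()
        is_feature = starts_in_col1 and len(line.split()) >= 3
        if line.startswith(">") or is_feature or not blocks:
            blocks.append([line])
        else:
            # Qualifier lines and extra coordinate lines stay with their feature.
            blocks[-1].append(line)
    return blocks


def feature_head(block):
    """Return [start, end, type] for a feature block, else None."""
    first = block[0]
    tokens = first.split()
    if first.startswith(">") or first[:1].isspace() or len(tokens) < 3:
        return None
    return tokens[:3]


def drop_features_by_gene(lines, drop_genes):
    """Drop gene features named in drop_genes plus features with the same coords."""
    blocks = parse_blocks(lines)
    drop_coords = set()
    for block in blocks:
        head = feature_head(block)
        if head is None or head[2] != "gene":
            continue
        for qualifier in block[1:]:
            parts = qualifier.split()
            if len(parts) >= 2 and parts[0] == "gene" and parts[1] in drop_genes:
                drop_coords.add((head[0], head[1]))
    kept = []
    for block in blocks:
        head = feature_head(block)
        if head is not None and (head[0], head[1]) in drop_coords:
            continue  # the N2 gene, or the CDS annotated at the same coordinates
        kept.extend(block)
    return kept
```

Typical use would be to read the .tbl with `readlines()`, call `drop_features_by_gene(lines, {"N2"})`, and write the result to a new file. Matching the CDS by coordinates is a design choice forced by the output above, where the CDS carries no gene qualifier; any other feature with exactly the same coordinates would be dropped too, so the result is worth checking by eye.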

Let me know if that addresses your question or not.

taltman changed the title ~~VADR predicted nested genes~~ VADR predicted nested genes, prevents submission to ENA Dec 23, 2021
