[ISSUE] Training stuck for a long time on a larger genome #24

@rohitlb-1999

Description

Hi, I am trying to run fun2 on Papaver_somniferum with the input files below, but it is taking a very long time to run. I am not sure whether it is stuck or simply needs this long.
Papaver_somniferum.ASM357369v1.dna.toplevel.fa : 2.6 GB
Papaver_somniferum.ASM357369v1.61.gff3 : 79 MB

Even the clean command could not finish; it got stuck at the step shown below:

Image

Command used:
funannotate2 train -f Papaver_somniferum.ASM357369v1.dna.toplevel.fa -s "papaver somniferum" -o train_results --cpus 16 -t Papaver_somniferum.ASM357369v1.61.gff3

OUTPUT:
[Jul 16 07:28 PM] Python v3.10.14; funannotate2 v25.7.1; gfftk v25.6.10; buscolite v25.4.24
[Jul 16 07:28 PM] Loading genome assembly and running QC checks
[Jul 16 07:29 PM] Genome stats:
{
"n_contigs": 34380,
"size": 2715377404,
"n50": 204470928,
"n90": 9903009,
"l50": 6,
"l90": 24,
"avg_length": 78981
}
[Jul 16 07:29 PM] Getting taxonomy information
{
"superkingdom": "Eukaryota",
"kingdom": "Viridiplantae",
"phylum": "Streptophyta",
"class": "Magnoliopsida",
"order": "Ranunculales",
"family": "Papaveraceae",
"genus": "Papaver",
"species": "Papaver somniferum"
}
[Jul 16 07:29 PM] Choosing best augustus species based on taxonomy: arabidopsis
[Jul 16 07:34 PM] Training set [/mnt/nsa4/projects/bioit/custom_bioit/projects/IC-A5334_Bindics_cDNA_clones_RNA-Seq/analysis/fun2_annotation/train_results/train_misc/training-set.temp.gff3] loaded with 41770 gene models
[Jul 16 07:35 PM] 26,587 of 41,770 models pass training parameters
[Jul 16 07:37 PM] 26587 gene models selected for training, now splitting into test [n=200] and train [n=26387]
[Jul 16 07:37 PM] Training augustus using training set
[Jul 16 07:39 PM] Initial training completed in 00:02:40s
{
"tool": "augustus",
"model": "67a71ccd-a776-4303-9b85-8591c348cab6",
"n_test_genes": 200,
"ref_genes_found": 174,
"ref_genes_missed": 26,
"extra_query_genes": 26,
"average_aed": 0.19032601432712637,
"nucleotide_sensitivity": 0.8285966236531693,
"nucleotide_precision": 0.8067907338916892,
"exon_sensitivity": 0.5292397660818714,
"exon_precision": 0.5797081182014101,
"gene_sensitivity": 0.59375,
"gene_precision": 0.59375
}
[Jul 16 07:39 PM] Training snap using training set
[Jul 16 07:43 PM] Initial training completed in 00:03:14s
{
"tool": "snap",
"model": "snap-trained.hmm",
"n_test_genes": 200,
"ref_genes_found": 193,
"ref_genes_missed": 7,
"extra_query_genes": 72,
"average_aed": 0.30908754793977833,
"nucleotide_sensitivity": 0.719315410532641,
"nucleotide_precision": 0.7337884105984789,
"exon_sensitivity": 0.3776595744680851,
"exon_precision": 0.33256602870097307,
"gene_sensitivity": 0.125,
"gene_precision": 0.0136986301369863
}
[Jul 16 07:43 PM] Training glimmerHMM using training set
STUCK HERE FOR MORE THAN 12 HOURS.
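
To put the hang in context, the timestamps in the log above can be turned into per-step durations. The following is only a minimal sketch: the helper names (`parse_line`, `step_gaps`) and the `ASSUMED_YEAR` constant are hypothetical and not part of funannotate2, and the log lines are copied from the output above. Because the log prints times to the minute and without a year, the gaps are approximate.

```python
from datetime import datetime

# Log lines copied from the output above (training steps only).
LOG_LINES = [
    "[Jul 16 07:37 PM] Training augustus using training set",
    "[Jul 16 07:39 PM] Initial training completed in 00:02:40s",
    "[Jul 16 07:39 PM] Training snap using training set",
    "[Jul 16 07:43 PM] Initial training completed in 00:03:14s",
    "[Jul 16 07:43 PM] Training glimmerHMM using training set",
]

# Assumption: the log prints no year, so an arbitrary one is supplied
# purely so the timestamps can be parsed and subtracted.
ASSUMED_YEAR = 2000


def parse_line(line):
    """Split '[Jul 16 07:37 PM] message' into (datetime, message)."""
    stamp, _, message = line.partition("] ")
    when = datetime.strptime(
        f"{ASSUMED_YEAR} {stamp.lstrip('[')}", "%Y %b %d %I:%M %p"
    )
    return when, message


def step_gaps(lines):
    """Return (message, minutes until the next log entry) for each line but the last."""
    parsed = [parse_line(line) for line in lines]
    return [
        (message, (next_when - when).total_seconds() / 60)
        for (when, message), (next_when, _) in zip(parsed, parsed[1:])
    ]


for message, minutes in step_gaps(LOG_LINES):
    print(f"{minutes:5.1f} min  {message}")

# The final entry has no successor in the log, so its duration is open-ended.
print(f"  open       {parse_line(LOG_LINES[-1])[1]}")
```

Read this way, the augustus and snap training steps each produced their next log entry within a few minutes, while the glimmerHMM step has no following entry at all.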
