2024-10-08 14:04:51.802616: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2024-10-08 14:04:52.493612: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
2024-10-08 14:04:57.902564: W tensorflow/tsl/framework/cpu_allocator_impl.cc:83] Allocation of 205286400 exceeds 10% of free system memory.
2024-10-08 14:04:58.111355: W tensorflow/tsl/framework/cpu_allocator_impl.cc:83] Allocation of 205286400 exceeds 10% of free system memory.
2024-10-08 14:05:00.801827: W tensorflow/tsl/framework/cpu_allocator_impl.cc:83] Allocation of 205286400 exceeds 10% of free system memory.
2024-10-08 14:05:03.411030: W tensorflow/tsl/framework/cpu_allocator_impl.cc:83] Allocation of 205286400 exceeds 10% of free system memory.
2024-10-08 14:05:06.031928: W tensorflow/tsl/framework/cpu_allocator_impl.cc:83] Allocation of 205286400 exceeds 10% of free system memory.
/usr/local/lib/python3.8/dist-packages/tensorflow_addons/utils/tfa_eol_msg.py:23: UserWarning: TensorFlow Addons (TFA) has ended development and introduction of new features.
TFA has entered a minimal maintenance and release mode until a planned end of life in May 2024.
Please modify downstream libraries to take dependencies from other repositories in our TensorFlow community (e.g. Keras, Keras-CV, and Keras-NLP). For more information see: https://github.com/tensorflow/addons/issues/2807
warnings.warn(
Ignoring the following unexpected models in /tmp/tmpbjlbttns/job_working_directory/000/2/home/.local/share/Helixer/models:[].
You can set --model-filepath in Helixer.py if you wish to use these.
Standard Output:
============ CUDA ============CUDA Version 11.8.0Container image Copyright (c) 2016-2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved.This container image and its contents are governed by the NVIDIA Deep Learning Container License.By pulling and using the container, you accept the terms and conditions of this license:https://developer.nvidia.com/ngc/nvidia-deep-learning-container-licenseA copy of this license is made available in this container at /NGC-DL-CONTAINER-LICENSE for your convenience.WARNING: The NVIDIA Driver was not detected. GPU functionality will not be available. Use the NVIDIA Container Toolkit to start this container with GPU support; see https://docs.nvidia.com/datacenter/cloud-native/ .retrived list of available models from https://raw.githubusercontent.com/weberlab-hhu/Helixer/main/resources/model_list.csvsaved model land_plant_v0.3_a_0080.h5 to /tmp/tmpbjlbttns/job_working_directory/000/2/home/.local/share/Helixer/modelssaved model vertebrate_v0.3_m_0080.h5 to /tmp/tmpbjlbttns/job_working_directory/000/2/home/.local/share/Helixer/modelssaved model fungi_v0.3_a_0100.h5 to /tmp/tmpbjlbttns/job_working_directory/000/2/home/.local/share/Helixer/modelssaved model invertebrate_v0.3_m_0100.h5 to /tmp/tmpbjlbttns/job_working_directory/000/2/home/.local/share/Helixer/modelsHelixerPost <genome.h5> <predictions.h5> <windowSize> <edgeThresh> <peakThresh> <minCodingLength> <gff>No config file foundretrived list of available models from https://raw.githubusercontent.com/weberlab-hhu/Helixer/main/resources/model_list.csvHelixer.py config: {'batch_size': 8, 'compression': 'gzip', 'config_path': 'config/helixer_config.yaml', 'debug': False, 'edge_threshold': 0.1, 'fasta_path': '/tmp/tmpbjlbttns/files/1/8/a/dataset_18a80251-044a-4fd1-a877-53eab06f9e9a.dat', 'gff_output_path': '/tmp/tmpbjlbttns/job_working_directory/000/2/outputs/dataset_8eba3340-4bd7-48bd-bc95-a30484ee3ef6.dat', 'lineage': 'land_plant', 'min_coding_length': 100, 'model_filepath': 
'/tmp/tmpbjlbttns/job_working_directory/000/2/home/.local/share/Helixer/models/land_plant/land_plant_v0.3_a_0080.h5', 'no_multiprocess': False, 'no_overlap': False, 'overlap_core_length': 80190, 'overlap_offset': 53460, 'peak_threshold': 0.8, 'species': '', 'subsequence_length': 106920, 'temporary_dir': './', 'window_size': 100}Testing whether helixer_post_bin is correctly installedHelixer.py config loaded. Starting FASTA to H5 conversion.storing temporary files under ./tmpufm2it1t1 expected num of chunks to write in 19994040 bases to hdf5Numerification of 0-720580 of the sequence of sample took 0.24 secs1 Numerified Fasta only Coordinate (seqid: sample, len: 720580) in 0.39 secs1 expected num of chunks to write in 19994040 bases to hdf5Numerification of 0-2030 of the sequence of sample2 took 0.00 secs2 Numerified Fasta only Coordinate (seqid: sample2, len: 2030) in 0.01 secs1 expected num of chunks to write in 19994040 bases to hdf5Numerification of 0-2100 of the sequence of sample3 took 0.00 secs3 Numerified Fasta only Coordinate (seqid: sample3, len: 2100) in 0.01 secs1 expected num of chunks to write in 19994040 bases to hdf5Numerification of 0-7560 of the sequence of sample4 took 0.00 secs4 Numerified Fasta only Coordinate (seqid: sample4, len: 7560) in 0.02 secslogged installed version in place of git commit for geenufflogged installed version in place of git commit for helixerFASTA to H5 conversion done. 
Starting neural network prediction with overlapping.HelixerModel config: {'batch_size': 8, 'calculate_uncertainty': False, 'check_every_nth_batch': 1000000, 'class_weights': 'None', 'clip_norm': 3.0, 'cnn_layers': 1, 'compression': 'gzip', 'core_length': 80190, 'coverage_norm': None, 'coverage_offset': 0.0, 'coverage_weights': False, 'cpus': 8, 'data_dir': None, 'debug': False, 'dropout1': 0.0, 'dropout2': 0.0, 'epochs': 10000, 'eval': False, 'filter_depth': 32, 'fine_tune': False, 'float_precision': 'float32', 'gpu_id': -1, 'input_coverage': False, 'kernel_size': 26, 'large_eval_folder': '', 'learning_rate': 0.0003, 'load_model_path': '/tmp/tmpbjlbttns/job_working_directory/000/2/home/.local/share/Helixer/models/land_plant/land_plant_v0.3_a_0080.h5', 'load_predictions': False, 'loss': '', 'lstm_layers': 1, 'nni': False, 'no_utrs': False, 'optimizer': 'adamw', 'overlap': True, 'overlap_offset': 53460, 'patience': 3, 'pool_size': 9, 'post_coverage_hidden_layer': False, 'predict_phase': False, 'prediction_output_path': './tmpufm2it1t/tmp_predictions_.h5', 'pretrained_model_path': None, 'resume_training': False, 'save_every_check': False, 'save_model_path': './best_model.h5', 'stretch_transition_weights': 0, 'test_data': './tmpufm2it1t/tmp_species_.h5', 'transition_weights': 'None', 'units': 32, 'val_test_batch_size': 8, 'verbose': True, 'weight_decay': 3.5e-05, 'workers': 1}No err_samples dataset found, correct samples will be set to 0No fully_intergenic_samples dataset found, fully intergenic samples will be set to 0Data config: [{'geenuff_commit': 'commit not found, version: 0.3.2', 'helixer_commit': 'commit not found, version: 0.3.3', 'input_path': '/tmp/tmpbjlbttns/files/1/8/a/dataset_18a80251-044a-4fd1-a877-53eab06f9e9a.dat', 'timestamp': '2024-10-08 14:04:55.157284'}]Test data shape: (20, 106920)Intergenic test seqs: 0.00%Fully correct test seqs: 0.00%Number of devices: 1Current Helixer version: 0.3.3Md5sum of the loaded model: f0e00efcbea83c66b69258d11119a691 
/tmp/tmpbjlbttns/job_working_directory/000/2/home/.local/share/Helixer/models/land_plant/land_plant_v0.3_a_0080.h5Model: "model"__________________________________________________________________________________________________ Layer (type) Output Shape Param # Connected to ================================================================================================== main_input (InputLayer) [(None, None, 4)] 0 [] conv1d (Conv1D) (None, None, 96) 4704 ['main_input[0][0]'] batch_normalization (Batch (None, None, 96) 384 ['conv1d[0][0]'] Normalization) conv1d_1 (Conv1D) (None, None, 96) 110688 ['batch_normalization[0][0]'] batch_normalization_1 (Bat (None, None, 96) 384 ['conv1d_1[0][0]'] chNormalization) conv1d_2 (Conv1D) (None, None, 96) 110688 ['batch_normalization_1[0][0]' ] batch_normalization_2 (Bat (None, None, 96) 384 ['conv1d_2[0][0]'] chNormalization) conv1d_3 (Conv1D) (None, None, 96) 110688 ['batch_normalization_2[0][0]' ] reshape (Reshape) (None, None, 864) 0 ['conv1d_3[0][0]'] bidirectional (Bidirection (None, None, 256) 1016832 ['reshape[0][0]'] al) bidirectional_1 (Bidirecti (None, None, 256) 394240 ['bidirectional[0][0]'] onal) bidirectional_2 (Bidirecti (None, None, 256) 394240 ['bidirectional_1[0][0]'] onal) dense (Dense) (None, None, 72) 18504 ['bidirectional_2[0][0]'] tf.split (TFOpLambda) [(None, None, 36), 0 ['dense[0][0]'] (None, None, 36)] reshape_1 (Reshape) (None, None, 9, 4) 0 ['tf.split[0][0]'] reshape_2 (Reshape) (None, None, 9, 4) 0 ['tf.split[0][1]'] genic (Activation) (None, None, 9, 4) 0 ['reshape_1[0][0]'] phase (Activation) (None, None, 9, 4) 0 ['reshape_2[0][0]'] ==================================================================================================Total params: 2161736 (8.25 MB)Trainable params: 2161160 (8.24 MB)Non-trainable params: 576 (2.25 KB)__________________________________________________________________________________________________HMM Config Splicing Flags: U:true US:true S:true SC:true C:true CS:true 
S:true SU:true U:true Splicing - Weights: Donor 1, Acceptor 1 Splicing - Fixed Penalties: U2-GT-AG 0, U2-GT-AC 0 U12-GT-AG 0 U12-AT-AC 0 Coding - Weights: Start 1000, Stop 1000 Phase Mode: Implementation 1, Dilute to Total, Retention: 0.2Sequences for Species - 0 BP_Extractor for Sequence sample - ID 0Forward for Sequence sample - ID 0 ML v HP Class Intergenic UTR Coding Intron ML v HP Phase Non Coding Phase 0 Phase 1 Phase 2 Intergenic 613530 131 479 418 Non Coding 689510 1069 1104 1091 UTR 238 5141 87 120 Phase 0 230 9046 2 6 Coding 1243 82 29647 264 Phase 1 226 8 9014 2 Intron 1459 120 162 67459 Phase 2 239 2 5 9026 Precision Recall F1 Precision Recall F1 Intergenic 0.99523 0.99833 0.99678 Non Coding 0.99899 0.99529 0.99714 UTR 0.93917 0.92034 0.92966 Phase 0 0.89343 0.97436 0.93214 Coding 0.97603 0.94913 0.96239 Phase 1 0.89027 0.97449 0.93048 Intron 0.98825 0.97484 0.98150 Phase 2 0.89146 0.97347 0.93066 Subgenic 0.98449 0.96684 0.97559 Coding 0.89172 0.97411 0.93109 Genic 0.98211 0.96439 0.97317 Reverse for Sequence sample - ID 0 ML v HP Class Intergenic UTR Coding Intron ML v HP Phase Non Coding Phase 0 Phase 1 Phase 2 Intergenic 555814 209 108 776 Non Coding 659468 985 971 987 UTR 454 9649 137 214 Phase 0 670 18679 9 31 Coding 2123 49 58497 691 Phase 1 689 33 18702 12 Intron 8256 265 391 82947 Phase 2 620 14 29 18681 Precision Recall F1 Precision Recall F1 Intergenic 0.98088 0.99804 0.98939 Non Coding 0.99701 0.99556 0.99628 UTR 0.94858 0.92300 0.93562 Phase 0 0.94764 0.96338 0.95545 Coding 0.98924 0.95334 0.97096 Phase 1 0.94881 0.96224 0.95548 Intron 0.98014 0.90298 0.93998 Phase 2 0.94774 0.96573 0.95665 Subgenic 0.98388 0.92315 0.95255 Coding 0.94807 0.96378 0.95586 Genic 0.98155 0.92314 0.95145 BP_Extractor for Sequence sample2 - ID 1Forward for Sequence sample2 - ID 1 ML v HP Class Intergenic UTR Coding Intron ML v HP Phase Non Coding Phase 0 Phase 1 Phase 2 Intergenic 2026 0 0 0 Non Coding 2029 0 0 0 UTR 0 0 0 0 Phase 0 1 0 0 0 Coding 4 0 0 0 Phase 1 
0 0 0 0 Intron 0 0 0 0 Phase 2 0 0 0 0 Precision Recall F1 Precision Recall F1 Intergenic 0.99803 1.00000 0.99901 Non Coding 0.99951 1.00000 0.99975 UTR NaN NaN NaN Phase 0 NaN 0.00000 0.00000 Coding NaN 0.00000 0.00000 Phase 1 NaN NaN NaN Intron NaN NaN NaN Phase 2 NaN NaN NaN Subgenic NaN 0.00000 0.00000 Coding NaN 0.00000 0.00000 Genic NaN 0.00000 0.00000 Reverse for Sequence sample2 - ID 1 ML v HP Class Intergenic UTR Coding Intron ML v HP Phase Non Coding Phase 0 Phase 1 Phase 2 Intergenic 2030 0 0 0 Non Coding 2030 0 0 0 UTR 0 0 0 0 Phase 0 0 0 0 0 Coding 0 0 0 0 Phase 1 0 0 0 0 Intron 0 0 0 0 Phase 2 0 0 0 0 Precision Recall F1 Precision Recall F1 Intergenic 1.00000 1.00000 1.00000 Non Coding 1.00000 1.00000 1.00000 UTR NaN NaN NaN Phase 0 NaN NaN NaN Coding NaN NaN NaN Phase 1 NaN NaN NaN Intron NaN NaN NaN Phase 2 NaN NaN NaN Subgenic NaN NaN NaN Coding NaN NaN NaN Genic NaN NaN NaN BP_Extractor for Sequence sample3 - ID 2Forward for Sequence sample3 - ID 2 ML v HP Class Intergenic UTR Coding Intron ML v HP Phase Non Coding Phase 0 Phase 1 Phase 2 Intergenic 2098 0 0 0 Non Coding 2099 0 0 0 UTR 0 0 0 0 Phase 0 1 0 0 0 Coding 2 0 0 0 Phase 1 0 0 0 0 Intron 0 0 0 0 Phase 2 0 0 0 0 Precision Recall F1 Precision Recall F1 Intergenic 0.99905 1.00000 0.99952 Non Coding 0.99952 1.00000 0.99976 UTR NaN NaN NaN Phase 0 NaN 0.00000 0.00000 Coding NaN 0.00000 0.00000 Phase 1 NaN NaN NaN Intron NaN NaN NaN Phase 2 NaN NaN NaN Subgenic NaN 0.00000 0.00000 Coding NaN 0.00000 0.00000 Genic NaN 0.00000 0.00000 Reverse for Sequence sample3 - ID 2 ML v HP Class Intergenic UTR Coding Intron ML v HP Phase Non Coding Phase 0 Phase 1 Phase 2 Intergenic 2100 0 0 0 Non Coding 2100 0 0 0 UTR 0 0 0 0 Phase 0 0 0 0 0 Coding 0 0 0 0 Phase 1 0 0 0 0 Intron 0 0 0 0 Phase 2 0 0 0 0 Precision Recall F1 Precision Recall F1 Intergenic 1.00000 1.00000 1.00000 Non Coding 1.00000 1.00000 1.00000 UTR NaN NaN NaN Phase 0 NaN NaN NaN Coding NaN NaN NaN Phase 1 NaN NaN NaN Intron NaN NaN NaN 
Phase 2 NaN NaN NaN Subgenic NaN NaN NaN Coding NaN NaN NaN Genic NaN NaN NaN BP_Extractor for Sequence sample4 - ID 3Forward for Sequence sample4 - ID 3 ML v HP Class Intergenic UTR Coding Intron ML v HP Phase Non Coding Phase 0 Phase 1 Phase 2 Intergenic 7535 0 0 0 Non Coding 7553 0 0 0 UTR 3 0 0 0 Phase 0 2 0 0 0 Coding 8 0 0 0 Phase 1 2 0 0 0 Intron 14 0 0 0 Phase 2 3 0 0 0 Precision Recall F1 Precision Recall F1 Intergenic 0.99669 1.00000 0.99834 Non Coding 0.99907 1.00000 0.99954 UTR NaN 0.00000 0.00000 Phase 0 NaN 0.00000 0.00000 Coding NaN 0.00000 0.00000 Phase 1 NaN 0.00000 0.00000 Intron NaN 0.00000 0.00000 Phase 2 NaN 0.00000 0.00000 Subgenic NaN 0.00000 0.00000 Coding NaN 0.00000 0.00000 Genic NaN 0.00000 0.00000 Reverse for Sequence sample4 - ID 3 ML v HP Class Intergenic UTR Coding Intron ML v HP Phase Non Coding Phase 0 Phase 1 Phase 2 Intergenic 4480 19 0 0 Non Coding 5923 2 6 4 UTR 40 453 5 0 Phase 0 17 524 0 0 Coding 321 0 1573 1 Phase 1 17 0 520 0 Intron 0 0 0 668 Phase 2 25 0 0 522 Precision Recall F1 Precision Recall F1 Intergenic 0.92543 0.99578 0.95931 Non Coding 0.99014 0.99798 0.99404 UTR 0.95975 0.90964 0.93402 Phase 0 0.99620 0.96858 0.98219 Coding 0.99683 0.83008 0.90585 Phase 1 0.98859 0.96834 0.97836 Intron 0.99851 1.00000 0.99925 Phase 2 0.99240 0.95430 0.97297 Subgenic 0.99733 0.87437 0.93181 Coding 0.99240 0.96369 0.97783 Genic 0.99081 0.88010 0.93218 Forward for Species - 0 ML v HP Class Intergenic UTR Coding Intron ML v HP Phase Non Coding Phase 0 Phase 1 Phase 2 Intergenic 625189 131 479 418 Non Coding 701191 1069 1104 1091 UTR 241 5141 87 120 Phase 0 234 9046 2 6 Coding 1257 82 29647 264 Phase 1 228 8 9014 2 Intron 1473 120 162 67459 Phase 2 242 2 5 9026 Precision Recall F1 Precision Recall F1 Intergenic 0.99527 0.99836 0.99681 Non Coding 0.99900 0.99537 0.99718 UTR 0.93917 0.91984 0.92940 Phase 0 0.89343 0.97394 0.93195 Coding 0.97603 0.94870 0.96217 Phase 1 0.89027 0.97428 0.93038 Intron 0.98825 0.97464 0.98140 Phase 2 0.89146 
0.97315 0.93052 Subgenic 0.98449 0.96658 0.97545 Coding 0.89172 0.97379 0.93095 Genic 0.98211 0.96411 0.97303 Reverse for Species - 0 ML v HP Class Intergenic UTR Coding Intron ML v HP Phase Non Coding Phase 0 Phase 1 Phase 2 Intergenic 564424 228 108 776 Non Coding 669521 987 977 991 UTR 494 10102 142 214 Phase 0 687 19203 9 31 Coding 2444 49 60070 692 Phase 1 706 33 19222 12 Intron 8256 265 391 83615 Phase 2 645 14 29 19203 Precision Recall F1 Precision Recall F1 Intergenic 0.98055 0.99803 0.98922 Non Coding 0.99697 0.99561 0.99629 UTR 0.94908 0.92239 0.93554 Phase 0 0.94891 0.96352 0.95616 Coding 0.98944 0.94965 0.96914 Phase 1 0.94984 0.96240 0.95608 Intron 0.98028 0.90368 0.94042 Phase 2 0.94891 0.96541 0.95709 Subgenic 0.98409 0.92235 0.95222 Coding 0.94922 0.96378 0.95644 Genic 0.98171 0.92235 0.95110 Total for Species - 0 ML v HP Class Intergenic UTR Coding Intron ML v HP Phase Non Coding Phase 0 Phase 1 Phase 2 Intergenic 1189613 359 587 1194 Non Coding 1370712 2056 2081 2082 UTR 735 15243 229 334 Phase 0 921 28249 11 37 Coding 3701 131 89717 956 Phase 1 934 41 28236 14 Intron 9729 385 553 151074 Phase 2 887 16 34 28229 Precision Recall F1 Precision Recall F1 Intergenic 0.98823 0.99820 0.99319 Non Coding 0.99800 0.99548 0.99674 UTR 0.94571 0.92153 0.93346 Phase 0 0.93041 0.96684 0.94827 Coding 0.98497 0.94934 0.96682 Phase 1 0.92998 0.96616 0.94772 Intron 0.98382 0.93405 0.95829 Phase 2 0.92975 0.96787 0.94843 Subgenic 0.98425 0.93969 0.96145 Coding 0.93004 0.96696 0.94814 Genic 0.98187 0.93859 0.95974 Total: 482642bp across 25 windowsNonestarting to load test data into memory..For h5 starting with species = b'':x shape: (20, 106920, 4)Data loading of 20 (total so far 20) samples of data/X into memory took 0.06 secsCompressed data size of data/X is at least 0.0008 GBsetting self.n_seqs to 20, bc that is len of data/X0 / 81 / 82 / 83 / 84 / 85 / 86 / 87 / 8Neural network prediction done. 
Starting post processing.
Helixer successfully finished the annotation of /tmp/tmpbjlbttns/files/1/8/a/dataset_18a80251-044a-4fd1-a877-53eab06f9e9a.dat in 0.07 hours. GFF file written to /tmp/tmpbjlbttns/job_working_directory/000/2/outputs/dataset_8eba3340-4bd7-48bd-bc95-a30484ee3ef6.dat.
2024-10-08 14:35:08.583343: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2024-10-08 14:35:09.305046: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
2024-10-08 14:35:14.833701: W tensorflow/tsl/framework/cpu_allocator_impl.cc:83] Allocation of 205286400 exceeds 10% of free system memory.
2024-10-08 14:35:15.087517: W tensorflow/tsl/framework/cpu_allocator_impl.cc:83] Allocation of 205286400 exceeds 10% of free system memory.
2024-10-08 14:35:17.511875: W tensorflow/tsl/framework/cpu_allocator_impl.cc:83] Allocation of 205286400 exceeds 10% of free system memory.
2024-10-08 14:35:19.978380: W tensorflow/tsl/framework/cpu_allocator_impl.cc:83] Allocation of 205286400 exceeds 10% of free system memory.
2024-10-08 14:35:22.485818: W tensorflow/tsl/framework/cpu_allocator_impl.cc:83] Allocation of 205286400 exceeds 10% of free system memory.
/usr/local/lib/python3.8/dist-packages/tensorflow_addons/utils/tfa_eol_msg.py:23: UserWarning: TensorFlow Addons (TFA) has ended development and introduction of new features.
TFA has entered a minimal maintenance and release mode until a planned end of life in May 2024.
Please modify downstream libraries to take dependencies from other repositories in our TensorFlow community (e.g. Keras, Keras-CV, and Keras-NLP). For more information see: https://github.com/tensorflow/addons/issues/2807
warnings.warn(
Ignoring the following unexpected models in /tmp/tmpjo038yv5/job_working_directory/000/2/home/.local/share/Helixer/models:[].
You can set --model-filepath in Helixer.py if you wish to use these.
Standard Output:
============ CUDA ============CUDA Version 11.8.0Container image Copyright (c) 2016-2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved.This container image and its contents are governed by the NVIDIA Deep Learning Container License.By pulling and using the container, you accept the terms and conditions of this license:https://developer.nvidia.com/ngc/nvidia-deep-learning-container-licenseA copy of this license is made available in this container at /NGC-DL-CONTAINER-LICENSE for your convenience.WARNING: The NVIDIA Driver was not detected. GPU functionality will not be available. Use the NVIDIA Container Toolkit to start this container with GPU support; see https://docs.nvidia.com/datacenter/cloud-native/ .retrived list of available models from https://raw.githubusercontent.com/weberlab-hhu/Helixer/main/resources/model_list.csvsaved model land_plant_v0.3_a_0080.h5 to /tmp/tmpjo038yv5/job_working_directory/000/2/home/.local/share/Helixer/modelssaved model vertebrate_v0.3_m_0080.h5 to /tmp/tmpjo038yv5/job_working_directory/000/2/home/.local/share/Helixer/modelssaved model fungi_v0.3_a_0100.h5 to /tmp/tmpjo038yv5/job_working_directory/000/2/home/.local/share/Helixer/modelssaved model invertebrate_v0.3_m_0100.h5 to /tmp/tmpjo038yv5/job_working_directory/000/2/home/.local/share/Helixer/modelsHelixerPost <genome.h5> <predictions.h5> <windowSize> <edgeThresh> <peakThresh> <minCodingLength> <gff>No config file foundretrived list of available models from https://raw.githubusercontent.com/weberlab-hhu/Helixer/main/resources/model_list.csvHelixer.py config: {'batch_size': 8, 'compression': 'gzip', 'config_path': 'config/helixer_config.yaml', 'debug': False, 'edge_threshold': 0.1, 'fasta_path': '/tmp/tmpjo038yv5/files/b/9/4/dataset_b947ea36-d15a-4beb-bba4-aba284330d6f.dat', 'gff_output_path': '/tmp/tmpjo038yv5/job_working_directory/000/2/outputs/dataset_ed96d8a1-c7fe-45f9-b5df-58e9f7431804.dat', 'lineage': 'land_plant', 'min_coding_length': 100, 'model_filepath': 
'/tmp/tmpjo038yv5/job_working_directory/000/2/home/.local/share/Helixer/models/land_plant/land_plant_v0.3_a_0080.h5', 'no_multiprocess': False, 'no_overlap': False, 'overlap_core_length': 80190, 'overlap_offset': 53460, 'peak_threshold': 0.8, 'species': '', 'subsequence_length': 106920, 'temporary_dir': './', 'window_size': 100}Testing whether helixer_post_bin is correctly installedHelixer.py config loaded. Starting FASTA to H5 conversion.storing temporary files under ./tmpb13qee9_1 expected num of chunks to write in 19994040 bases to hdf5Numerification of 0-720580 of the sequence of sample took 0.25 secs1 Numerified Fasta only Coordinate (seqid: sample, len: 720580) in 0.40 secs1 expected num of chunks to write in 19994040 bases to hdf5Numerification of 0-2030 of the sequence of sample2 took 0.00 secs2 Numerified Fasta only Coordinate (seqid: sample2, len: 2030) in 0.01 secs1 expected num of chunks to write in 19994040 bases to hdf5Numerification of 0-2100 of the sequence of sample3 took 0.00 secs3 Numerified Fasta only Coordinate (seqid: sample3, len: 2100) in 0.01 secs1 expected num of chunks to write in 19994040 bases to hdf5Numerification of 0-7560 of the sequence of sample4 took 0.00 secs4 Numerified Fasta only Coordinate (seqid: sample4, len: 7560) in 0.02 secslogged installed version in place of git commit for geenufflogged installed version in place of git commit for helixerFASTA to H5 conversion done. 
Starting neural network prediction with overlapping.HelixerModel config: {'batch_size': 8, 'calculate_uncertainty': False, 'check_every_nth_batch': 1000000, 'class_weights': 'None', 'clip_norm': 3.0, 'cnn_layers': 1, 'compression': 'gzip', 'core_length': 80190, 'coverage_norm': None, 'coverage_offset': 0.0, 'coverage_weights': False, 'cpus': 8, 'data_dir': None, 'debug': False, 'dropout1': 0.0, 'dropout2': 0.0, 'epochs': 10000, 'eval': False, 'filter_depth': 32, 'fine_tune': False, 'float_precision': 'float32', 'gpu_id': -1, 'input_coverage': False, 'kernel_size': 26, 'large_eval_folder': '', 'learning_rate': 0.0003, 'load_model_path': '/tmp/tmpjo038yv5/job_working_directory/000/2/home/.local/share/Helixer/models/land_plant/land_plant_v0.3_a_0080.h5', 'load_predictions': False, 'loss': '', 'lstm_layers': 1, 'nni': False, 'no_utrs': False, 'optimizer': 'adamw', 'overlap': True, 'overlap_offset': 53460, 'patience': 3, 'pool_size': 9, 'post_coverage_hidden_layer': False, 'predict_phase': False, 'prediction_output_path': './tmpb13qee9_/tmp_predictions_.h5', 'pretrained_model_path': None, 'resume_training': False, 'save_every_check': False, 'save_model_path': './best_model.h5', 'stretch_transition_weights': 0, 'test_data': './tmpb13qee9_/tmp_species_.h5', 'transition_weights': 'None', 'units': 32, 'val_test_batch_size': 8, 'verbose': True, 'weight_decay': 3.5e-05, 'workers': 1}No err_samples dataset found, correct samples will be set to 0No fully_intergenic_samples dataset found, fully intergenic samples will be set to 0Data config: [{'geenuff_commit': 'commit not found, version: 0.3.2', 'helixer_commit': 'commit not found, version: 0.3.3', 'input_path': '/tmp/tmpjo038yv5/files/b/9/4/dataset_b947ea36-d15a-4beb-bba4-aba284330d6f.dat', 'timestamp': '2024-10-08 14:35:12.001462'}]Test data shape: (20, 106920)Intergenic test seqs: 0.00%Fully correct test seqs: 0.00%Number of devices: 1Current Helixer version: 0.3.3Md5sum of the loaded model: f0e00efcbea83c66b69258d11119a691 
/tmp/tmpjo038yv5/job_working_directory/000/2/home/.local/share/Helixer/models/land_plant/land_plant_v0.3_a_0080.h5Model: "model"__________________________________________________________________________________________________ Layer (type) Output Shape Param # Connected to ================================================================================================== main_input (InputLayer) [(None, None, 4)] 0 [] conv1d (Conv1D) (None, None, 96) 4704 ['main_input[0][0]'] batch_normalization (Batch (None, None, 96) 384 ['conv1d[0][0]'] Normalization) conv1d_1 (Conv1D) (None, None, 96) 110688 ['batch_normalization[0][0]'] batch_normalization_1 (Bat (None, None, 96) 384 ['conv1d_1[0][0]'] chNormalization) conv1d_2 (Conv1D) (None, None, 96) 110688 ['batch_normalization_1[0][0]' ] batch_normalization_2 (Bat (None, None, 96) 384 ['conv1d_2[0][0]'] chNormalization) conv1d_3 (Conv1D) (None, None, 96) 110688 ['batch_normalization_2[0][0]' ] reshape (Reshape) (None, None, 864) 0 ['conv1d_3[0][0]'] bidirectional (Bidirection (None, None, 256) 1016832 ['reshape[0][0]'] al) bidirectional_1 (Bidirecti (None, None, 256) 394240 ['bidirectional[0][0]'] onal) bidirectional_2 (Bidirecti (None, None, 256) 394240 ['bidirectional_1[0][0]'] onal) dense (Dense) (None, None, 72) 18504 ['bidirectional_2[0][0]'] tf.split (TFOpLambda) [(None, None, 36), 0 ['dense[0][0]'] (None, None, 36)] reshape_1 (Reshape) (None, None, 9, 4) 0 ['tf.split[0][0]'] reshape_2 (Reshape) (None, None, 9, 4) 0 ['tf.split[0][1]'] genic (Activation) (None, None, 9, 4) 0 ['reshape_1[0][0]'] phase (Activation) (None, None, 9, 4) 0 ['reshape_2[0][0]'] ==================================================================================================Total params: 2161736 (8.25 MB)Trainable params: 2161160 (8.24 MB)Non-trainable params: 576 (2.25 KB)__________________________________________________________________________________________________HMM Config Splicing Flags: U:true US:true S:true SC:true C:true CS:true 
S:true SU:true U:true Splicing - Weights: Donor 1, Acceptor 1 Splicing - Fixed Penalties: U2-GT-AG 0, U2-GT-AC 0 U12-GT-AG 0 U12-AT-AC 0 Coding - Weights: Start 1000, Stop 1000 Phase Mode: Implementation 1, Dilute to Total, Retention: 0.2Sequences for Species - 0 BP_Extractor for Sequence sample - ID 0Forward for Sequence sample - ID 0 ML v HP Class Intergenic UTR Coding Intron ML v HP Phase Non Coding Phase 0 Phase 1 Phase 2 Intergenic 613530 131 479 418 Non Coding 689510 1069 1104 1091 UTR 238 5141 87 120 Phase 0 230 9046 2 6 Coding 1243 82 29647 264 Phase 1 226 8 9014 2 Intron 1459 120 162 67459 Phase 2 239 2 5 9026 Precision Recall F1 Precision Recall F1 Intergenic 0.99523 0.99833 0.99678 Non Coding 0.99899 0.99529 0.99714 UTR 0.93917 0.92034 0.92966 Phase 0 0.89343 0.97436 0.93214 Coding 0.97603 0.94913 0.96239 Phase 1 0.89027 0.97449 0.93048 Intron 0.98825 0.97484 0.98150 Phase 2 0.89146 0.97347 0.93066 Subgenic 0.98449 0.96684 0.97559 Coding 0.89172 0.97411 0.93109 Genic 0.98211 0.96439 0.97317 Reverse for Sequence sample - ID 0 ML v HP Class Intergenic UTR Coding Intron ML v HP Phase Non Coding Phase 0 Phase 1 Phase 2 Intergenic 555814 209 108 776 Non Coding 659468 985 971 987 UTR 454 9649 137 214 Phase 0 670 18679 9 31 Coding 2123 49 58497 691 Phase 1 689 33 18702 12 Intron 8256 265 391 82947 Phase 2 620 14 29 18681 Precision Recall F1 Precision Recall F1 Intergenic 0.98088 0.99804 0.98939 Non Coding 0.99701 0.99556 0.99628 UTR 0.94858 0.92300 0.93562 Phase 0 0.94764 0.96338 0.95545 Coding 0.98924 0.95334 0.97096 Phase 1 0.94881 0.96224 0.95548 Intron 0.98014 0.90298 0.93998 Phase 2 0.94774 0.96573 0.95665 Subgenic 0.98388 0.92315 0.95255 Coding 0.94807 0.96378 0.95586 Genic 0.98155 0.92314 0.95145 BP_Extractor for Sequence sample2 - ID 1Forward for Sequence sample2 - ID 1 ML v HP Class Intergenic UTR Coding Intron ML v HP Phase Non Coding Phase 0 Phase 1 Phase 2 Intergenic 2026 0 0 0 Non Coding 2029 0 0 0 UTR 0 0 0 0 Phase 0 1 0 0 0 Coding 4 0 0 0 Phase 1 
0 0 0 0 Intron 0 0 0 0 Phase 2 0 0 0 0 Precision Recall F1 Precision Recall F1 Intergenic 0.99803 1.00000 0.99901 Non Coding 0.99951 1.00000 0.99975 UTR NaN NaN NaN Phase 0 NaN 0.00000 0.00000 Coding NaN 0.00000 0.00000 Phase 1 NaN NaN NaN Intron NaN NaN NaN Phase 2 NaN NaN NaN Subgenic NaN 0.00000 0.00000 Coding NaN 0.00000 0.00000 Genic NaN 0.00000 0.00000 Reverse for Sequence sample2 - ID 1 ML v HP Class Intergenic UTR Coding Intron ML v HP Phase Non Coding Phase 0 Phase 1 Phase 2 Intergenic 2030 0 0 0 Non Coding 2030 0 0 0 UTR 0 0 0 0 Phase 0 0 0 0 0 Coding 0 0 0 0 Phase 1 0 0 0 0 Intron 0 0 0 0 Phase 2 0 0 0 0 Precision Recall F1 Precision Recall F1 Intergenic 1.00000 1.00000 1.00000 Non Coding 1.00000 1.00000 1.00000 UTR NaN NaN NaN Phase 0 NaN NaN NaN Coding NaN NaN NaN Phase 1 NaN NaN NaN Intron NaN NaN NaN Phase 2 NaN NaN NaN Subgenic NaN NaN NaN Coding NaN NaN NaN Genic NaN NaN NaN BP_Extractor for Sequence sample3 - ID 2Forward for Sequence sample3 - ID 2 ML v HP Class Intergenic UTR Coding Intron ML v HP Phase Non Coding Phase 0 Phase 1 Phase 2 Intergenic 2098 0 0 0 Non Coding 2099 0 0 0 UTR 0 0 0 0 Phase 0 1 0 0 0 Coding 2 0 0 0 Phase 1 0 0 0 0 Intron 0 0 0 0 Phase 2 0 0 0 0 Precision Recall F1 Precision Recall F1 Intergenic 0.99905 1.00000 0.99952 Non Coding 0.99952 1.00000 0.99976 UTR NaN NaN NaN Phase 0 NaN 0.00000 0.00000 Coding NaN 0.00000 0.00000 Phase 1 NaN NaN NaN Intron NaN NaN NaN Phase 2 NaN NaN NaN Subgenic NaN 0.00000 0.00000 Coding NaN 0.00000 0.00000 Genic NaN 0.00000 0.00000 Reverse for Sequence sample3 - ID 2 ML v HP Class Intergenic UTR Coding Intron ML v HP Phase Non Coding Phase 0 Phase 1 Phase 2 Intergenic 2100 0 0 0 Non Coding 2100 0 0 0 UTR 0 0 0 0 Phase 0 0 0 0 0 Coding 0 0 0 0 Phase 1 0 0 0 0 Intron 0 0 0 0 Phase 2 0 0 0 0 Precision Recall F1 Precision Recall F1 Intergenic 1.00000 1.00000 1.00000 Non Coding 1.00000 1.00000 1.00000 UTR NaN NaN NaN Phase 0 NaN NaN NaN Coding NaN NaN NaN Phase 1 NaN NaN NaN Intron NaN NaN NaN 
Phase 2 NaN NaN NaN Subgenic NaN NaN NaN Coding NaN NaN NaN Genic NaN NaN NaN BP_Extractor for Sequence sample4 - ID 3Forward for Sequence sample4 - ID 3 ML v HP Class Intergenic UTR Coding Intron ML v HP Phase Non Coding Phase 0 Phase 1 Phase 2 Intergenic 7535 0 0 0 Non Coding 7553 0 0 0 UTR 3 0 0 0 Phase 0 2 0 0 0 Coding 8 0 0 0 Phase 1 2 0 0 0 Intron 14 0 0 0 Phase 2 3 0 0 0 Precision Recall F1 Precision Recall F1 Intergenic 0.99669 1.00000 0.99834 Non Coding 0.99907 1.00000 0.99954 UTR NaN 0.00000 0.00000 Phase 0 NaN 0.00000 0.00000 Coding NaN 0.00000 0.00000 Phase 1 NaN 0.00000 0.00000 Intron NaN 0.00000 0.00000 Phase 2 NaN 0.00000 0.00000 Subgenic NaN 0.00000 0.00000 Coding NaN 0.00000 0.00000 Genic NaN 0.00000 0.00000 Reverse for Sequence sample4 - ID 3 ML v HP Class Intergenic UTR Coding Intron ML v HP Phase Non Coding Phase 0 Phase 1 Phase 2 Intergenic 4480 19 0 0 Non Coding 5923 2 6 4 UTR 40 453 5 0 Phase 0 17 524 0 0 Coding 321 0 1573 1 Phase 1 17 0 520 0 Intron 0 0 0 668 Phase 2 25 0 0 522 Precision Recall F1 Precision Recall F1 Intergenic 0.92543 0.99578 0.95931 Non Coding 0.99014 0.99798 0.99404 UTR 0.95975 0.90964 0.93402 Phase 0 0.99620 0.96858 0.98219 Coding 0.99683 0.83008 0.90585 Phase 1 0.98859 0.96834 0.97836 Intron 0.99851 1.00000 0.99925 Phase 2 0.99240 0.95430 0.97297 Subgenic 0.99733 0.87437 0.93181 Coding 0.99240 0.96369 0.97783 Genic 0.99081 0.88010 0.93218 Forward for Species - 0 ML v HP Class Intergenic UTR Coding Intron ML v HP Phase Non Coding Phase 0 Phase 1 Phase 2 Intergenic 625189 131 479 418 Non Coding 701191 1069 1104 1091 UTR 241 5141 87 120 Phase 0 234 9046 2 6 Coding 1257 82 29647 264 Phase 1 228 8 9014 2 Intron 1473 120 162 67459 Phase 2 242 2 5 9026 Precision Recall F1 Precision Recall F1 Intergenic 0.99527 0.99836 0.99681 Non Coding 0.99900 0.99537 0.99718 UTR 0.93917 0.91984 0.92940 Phase 0 0.89343 0.97394 0.93195 Coding 0.97603 0.94870 0.96217 Phase 1 0.89027 0.97428 0.93038 Intron 0.98825 0.97464 0.98140 Phase 2 0.89146 
0.97315 0.93052 Subgenic 0.98449 0.96658 0.97545 Coding 0.89172 0.97379 0.93095 Genic 0.98211 0.96411 0.97303 Reverse for Species - 0 ML v HP Class Intergenic UTR Coding Intron ML v HP Phase Non Coding Phase 0 Phase 1 Phase 2 Intergenic 564424 228 108 776 Non Coding 669521 987 977 991 UTR 494 10102 142 214 Phase 0 687 19203 9 31 Coding 2444 49 60070 692 Phase 1 706 33 19222 12 Intron 8256 265 391 83615 Phase 2 645 14 29 19203 Precision Recall F1 Precision Recall F1 Intergenic 0.98055 0.99803 0.98922 Non Coding 0.99697 0.99561 0.99629 UTR 0.94908 0.92239 0.93554 Phase 0 0.94891 0.96352 0.95616 Coding 0.98944 0.94965 0.96914 Phase 1 0.94984 0.96240 0.95608 Intron 0.98028 0.90368 0.94042 Phase 2 0.94891 0.96541 0.95709 Subgenic 0.98409 0.92235 0.95222 Coding 0.94922 0.96378 0.95644 Genic 0.98171 0.92235 0.95110 Total for Species - 0 ML v HP Class Intergenic UTR Coding Intron ML v HP Phase Non Coding Phase 0 Phase 1 Phase 2 Intergenic 1189613 359 587 1194 Non Coding 1370712 2056 2081 2082 UTR 735 15243 229 334 Phase 0 921 28249 11 37 Coding 3701 131 89717 956 Phase 1 934 41 28236 14 Intron 9729 385 553 151074 Phase 2 887 16 34 28229 Precision Recall F1 Precision Recall F1 Intergenic 0.98823 0.99820 0.99319 Non Coding 0.99800 0.99548 0.99674 UTR 0.94571 0.92153 0.93346 Phase 0 0.93041 0.96684 0.94827 Coding 0.98497 0.94934 0.96682 Phase 1 0.92998 0.96616 0.94772 Intron 0.98382 0.93405 0.95829 Phase 2 0.92975 0.96787 0.94843 Subgenic 0.98425 0.93969 0.96145 Coding 0.93004 0.96696 0.94814 Genic 0.98187 0.93859 0.95974 Total: 482642bp across 25 windowsNonestarting to load test data into memory..For h5 starting with species = b'':x shape: (20, 106920, 4)Data loading of 20 (total so far 20) samples of data/X into memory took 0.07 secsCompressed data size of data/X is at least 0.0008 GBsetting self.n_seqs to 20, bc that is len of data/X0 / 81 / 82 / 83 / 84 / 85 / 86 / 87 / 8Neural network prediction done. 
Starting post processing.
Helixer successfully finished the annotation of /tmp/tmpjo038yv5/files/b/9/4/dataset_b947ea36-d15a-4beb-bba4-aba284330d6f.dat in 0.07 hours. GFF file written to /tmp/tmpjo038yv5/job_working_directory/000/2/outputs/dataset_ed96d8a1-c7fe-45f9-b5df-58e9f7431804.dat.
2024-10-08 14:36:49 ERROR: Warning message:
package ‘ggplot2’ was built under R version 4.4.1
Warning message:
The `size` argument of `element_line()` is deprecated as of ggplot2 3.4.0.
ℹ Please use the `linewidth` argument instead.
Standard Output:
2024-10-08 14:32:47 INFO: ***** Start a BUSCO v5.7.1 analysis, current time: 10/08/2024 14:32:47 *****2024-10-08 14:32:47 INFO: Configuring BUSCO with local environment2024-10-08 14:32:47 INFO: Running genome mode2024-10-08 14:33:01 INFO: Input file is /tmp/tmpjo038yv5/files/b/9/4/dataset_b947ea36-d15a-4beb-bba4-aba284330d6f.dat2024-10-08 14:33:01 INFO: No lineage specified. Running lineage auto selector.2024-10-08 14:33:01 INFO: ***** Starting Auto Select Lineage ***** This process runs BUSCO on the generic lineage datasets for the domains archaea, bacteria and eukaryota. Once the optimal domain is selected, BUSCO automatically attempts to find the most appropriate BUSCO dataset to use based on phylogenetic placement. --auto-lineage-euk and --auto-lineage-prok are also available if you know your input assembly is, or is not, an eukaryote. See the user guide for more information. A reminder: Busco evaluations are valid when an appropriate dataset is used, i.e., the dataset belongs to the lineage of the species to test. Because of overlapping markers/spurious matches among domains, busco matches in another domain do not necessarily mean that your genome/proteome contains sequences from this domain. 
However, a high busco score in multiple domains might help you identify possible contaminations.2024-10-08 14:33:01 WARNING: Option evalue was provided but is not used in the selected run mode, prok_genome_prod2024-10-08 14:33:01 WARNING: Option limit was provided but is not used in the selected run mode, prok_genome_prod2024-10-08 14:33:01 INFO: Running BUSCO using lineage dataset archaea_odb10 (prokaryota, 2021-02-23)2024-10-08 14:33:01 INFO: Running 1 job(s) on bbtools, starting at 10/08/2024 14:33:012024-10-08 14:33:02 INFO: [bbtools] 1 of 1 task(s) completed2024-10-08 14:33:02 INFO: ***** Run Prodigal on input to predict and extract genes *****2024-10-08 14:33:02 INFO: Running Prodigal with genetic code 11 in single mode2024-10-08 14:33:02 INFO: Running 1 job(s) on prodigal, starting at 10/08/2024 14:33:022024-10-08 14:33:04 INFO: [prodigal] 1 of 1 task(s) completed2024-10-08 14:33:04 INFO: Genetic code 11 selected as optimal2024-10-08 14:33:04 INFO: ***** Run HMMER on gene sequences *****2024-10-08 14:33:04 INFO: Running 194 job(s) on hmmsearch, starting at 10/08/2024 14:33:042024-10-08 14:33:06 INFO: [hmmsearch] 20 of 194 task(s) completed2024-10-08 14:33:07 INFO: [hmmsearch] 39 of 194 task(s) completed2024-10-08 14:33:07 INFO: [hmmsearch] 59 of 194 task(s) completed2024-10-08 14:33:08 INFO: [hmmsearch] 78 of 194 task(s) completed2024-10-08 14:33:09 INFO: [hmmsearch] 97 of 194 task(s) completed2024-10-08 14:33:10 INFO: [hmmsearch] 117 of 194 task(s) completed2024-10-08 14:33:11 INFO: [hmmsearch] 136 of 194 task(s) completed2024-10-08 14:33:12 INFO: [hmmsearch] 156 of 194 task(s) completed2024-10-08 14:33:13 INFO: [hmmsearch] 175 of 194 task(s) completed2024-10-08 14:33:13 INFO: [hmmsearch] 194 of 194 task(s) completed2024-10-08 14:33:13 INFO: Results: C:0.5%[S:0.5%,D:0.0%],F:0.0%,M:99.5%,n:194 2024-10-08 14:33:13 WARNING: Option evalue was provided but is not used in the selected run mode, prok_genome_prod2024-10-08 14:33:13 WARNING: Option limit was 
provided but is not used in the selected run mode, prok_genome_prod2024-10-08 14:33:14 INFO: Running BUSCO using lineage dataset bacteria_odb10 (prokaryota, 2020-03-06)2024-10-08 14:33:14 INFO: Skipping BBTools as already run2024-10-08 14:33:14 INFO: ***** Run Prodigal on input to predict and extract genes *****2024-10-08 14:33:14 INFO: Running Prodigal with genetic code 4 in single mode2024-10-08 14:33:14 INFO: Running 1 job(s) on prodigal, starting at 10/08/2024 14:33:142024-10-08 14:33:16 INFO: [prodigal] 1 of 1 task(s) completed2024-10-08 14:33:16 INFO: Genetic code 4 selected as optimal2024-10-08 14:33:16 INFO: ***** Run HMMER on gene sequences *****2024-10-08 14:33:16 INFO: Running 124 job(s) on hmmsearch, starting at 10/08/2024 14:33:162024-10-08 14:33:18 INFO: [hmmsearch] 13 of 124 task(s) completed2024-10-08 14:33:19 INFO: [hmmsearch] 25 of 124 task(s) completed2024-10-08 14:33:20 INFO: [hmmsearch] 38 of 124 task(s) completed2024-10-08 14:33:20 INFO: [hmmsearch] 50 of 124 task(s) completed2024-10-08 14:33:21 INFO: [hmmsearch] 63 of 124 task(s) completed2024-10-08 14:33:21 INFO: [hmmsearch] 75 of 124 task(s) completed2024-10-08 14:33:22 INFO: [hmmsearch] 87 of 124 task(s) completed2024-10-08 14:33:22 INFO: [hmmsearch] 100 of 124 task(s) completed2024-10-08 14:33:23 INFO: [hmmsearch] 112 of 124 task(s) completed2024-10-08 14:33:24 INFO: [hmmsearch] 124 of 124 task(s) completed2024-10-08 14:33:24 WARNING: BUSCO did not find any match. 
Make sure to check the log files if this is unexpected.2024-10-08 14:33:24 INFO: Results: C:0.0%[S:0.0%,D:0.0%],F:0.0%,M:100.0%,n:124 2024-10-08 14:33:24 INFO: Running BUSCO using lineage dataset eukaryota_odb10 (eukaryota, 2020-09-10)2024-10-08 14:33:24 INFO: Skipping BBTools as already run2024-10-08 14:33:24 INFO: Running 1 job(s) on makeblastdb, starting at 10/08/2024 14:33:242024-10-08 14:33:25 INFO: Creating BLAST database with input file2024-10-08 14:33:25 INFO: [makeblastdb] 1 of 1 task(s) completed2024-10-08 14:33:25 INFO: Running a BLAST search for BUSCOs against created database2024-10-08 14:33:25 INFO: Running 1 job(s) on tblastn, starting at 10/08/2024 14:33:252024-10-08 14:33:26 INFO: [tblastn] 1 of 1 task(s) completed2024-10-08 14:33:26 INFO: Running Augustus gene predictor on BLAST search results.2024-10-08 14:33:26 INFO: Running Augustus prediction using fly as species:2024-10-08 14:33:26 INFO: Running 6 job(s) on augustus, starting at 10/08/2024 14:33:262024-10-08 14:33:29 INFO: [augustus] 1 of 6 task(s) completed2024-10-08 14:33:31 INFO: [augustus] 2 of 6 task(s) completed2024-10-08 14:33:34 INFO: [augustus] 3 of 6 task(s) completed2024-10-08 14:33:37 INFO: [augustus] 4 of 6 task(s) completed2024-10-08 14:33:38 INFO: [augustus] 5 of 6 task(s) completed2024-10-08 14:33:40 INFO: [augustus] 6 of 6 task(s) completed2024-10-08 14:33:40 INFO: Extracting predicted proteins...2024-10-08 14:33:40 INFO: ***** Run HMMER on gene sequences *****2024-10-08 14:33:40 INFO: Running 6 job(s) on hmmsearch, starting at 10/08/2024 14:33:402024-10-08 14:33:40 INFO: [hmmsearch] 1 of 6 task(s) completed2024-10-08 14:33:40 INFO: [hmmsearch] 2 of 6 task(s) completed2024-10-08 14:33:41 INFO: [hmmsearch] 3 of 6 task(s) completed2024-10-08 14:33:41 INFO: [hmmsearch] 4 of 6 task(s) completed2024-10-08 14:33:41 INFO: [hmmsearch] 5 of 6 task(s) completed2024-10-08 14:33:41 INFO: [hmmsearch] 6 of 6 task(s) completed2024-10-08 14:33:41 INFO: 37 exons in total2024-10-08 14:33:41 
INFO: Results: C:1.2%[S:1.2%,D:0.0%],F:0.0%,M:98.8%,n:255 2024-10-08 14:33:41 INFO: Starting second step of analysis. The gene predictor Augustus is retrained using the results from the initial run to yield more accurate results.2024-10-08 14:33:41 INFO: Extracting missing and fragmented buscos from the file ancestral_variants...2024-10-08 14:33:41 INFO: Running a BLAST search for BUSCOs against created database2024-10-08 14:33:41 INFO: Running 1 job(s) on tblastn, starting at 10/08/2024 14:33:412024-10-08 14:33:51 INFO: [tblastn] 1 of 1 task(s) completed2024-10-08 14:33:51 INFO: Converting predicted genes to short genbank files2024-10-08 14:33:51 INFO: Running 3 job(s) on gff2gbSmallDNA.pl, starting at 10/08/2024 14:33:512024-10-08 14:33:52 INFO: [gff2gbSmallDNA.pl] 1 of 3 task(s) completed2024-10-08 14:33:52 INFO: [gff2gbSmallDNA.pl] 2 of 3 task(s) completed2024-10-08 14:33:52 INFO: [gff2gbSmallDNA.pl] 3 of 3 task(s) completed2024-10-08 14:33:52 INFO: All files converted to short genbank files, now training Augustus using Single-Copy Complete BUSCOs2024-10-08 14:33:52 INFO: Running 1 job(s) on new_species.pl, starting at 10/08/2024 14:33:522024-10-08 14:33:52 INFO: [new_species.pl] 1 of 1 task(s) completed2024-10-08 14:33:52 INFO: Running 1 job(s) on etraining, starting at 10/08/2024 14:33:522024-10-08 14:33:53 INFO: [etraining] 1 of 1 task(s) completed2024-10-08 14:33:53 INFO: Re-running Augustus with the new metaparameters, number of target BUSCOs: 2522024-10-08 14:33:53 INFO: Running Augustus gene predictor on BLAST search results.2024-10-08 14:33:53 INFO: Running Augustus prediction using BUSCO_busco_galaxy as species:2024-10-08 14:33:53 INFO: Running 6 job(s) on augustus, starting at 10/08/2024 14:33:532024-10-08 14:33:57 INFO: [augustus] 1 of 6 task(s) completed2024-10-08 14:33:58 INFO: [augustus] 2 of 6 task(s) completed2024-10-08 14:34:00 INFO: [augustus] 3 of 6 task(s) completed2024-10-08 14:34:02 INFO: [augustus] 4 of 6 task(s) completed2024-10-08 
14:34:04 INFO: [augustus] 5 of 6 task(s) completed2024-10-08 14:34:05 INFO: [augustus] 6 of 6 task(s) completed2024-10-08 14:34:05 INFO: Extracting predicted proteins...2024-10-08 14:34:05 INFO: ***** Run HMMER on gene sequences *****2024-10-08 14:34:05 INFO: Running 6 job(s) on hmmsearch, starting at 10/08/2024 14:34:052024-10-08 14:34:05 INFO: [hmmsearch] 1 of 6 task(s) completed2024-10-08 14:34:05 INFO: [hmmsearch] 2 of 6 task(s) completed2024-10-08 14:34:05 INFO: [hmmsearch] 3 of 6 task(s) completed2024-10-08 14:34:05 INFO: [hmmsearch] 4 of 6 task(s) completed2024-10-08 14:34:05 INFO: [hmmsearch] 5 of 6 task(s) completed2024-10-08 14:34:05 INFO: [hmmsearch] 6 of 6 task(s) completed2024-10-08 14:34:05 INFO: 37 exons in total2024-10-08 14:34:05 INFO: Results: C:1.6%[S:1.6%,D:0.0%],F:0.0%,M:98.4%,n:255 2024-10-08 14:34:05 INFO: eukaryota_odb10 selected2024-10-08 14:34:05 INFO: ***** Searching tree for chosen lineage to find best taxonomic match *****2024-10-08 14:34:05 INFO: Extract markers...2024-10-08 14:34:06 INFO: Place the markers on the reference tree...2024-10-08 14:34:06 INFO: Running 1 job(s) on sepp, starting at 10/08/2024 14:34:062024-10-08 14:36:46 INFO: [sepp] 1 of 1 task(s) completed2024-10-08 14:36:46 INFO: Not enough markers were placed on the tree (1). Root lineage eukaryota is kept2024-10-08 14:36:46 INFO: --------------------------------------------------- |Results from dataset eukaryota_odb10 | --------------------------------------------------- |C:1.6%[S:1.6%,D:0.0%],F:0.0%,M:98.4%,n:255 | |4 Complete BUSCOs (C) | |4 Complete and single-copy BUSCOs (S) | |0 Complete and duplicated BUSCOs (D) | |0 Fragmented BUSCOs (F) | |251 Missing BUSCOs (M) | |255 Total BUSCO groups searched | ---------------------------------------------------2024-10-08 14:36:46 INFO: BUSCO analysis done with WARNING(s). 
Total running time: 225 seconds***** Summary of warnings: *****2024-10-08 14:33:01 WARNING:busco.BuscoConfig Option evalue was provided but is not used in the selected run mode, prok_genome_prod2024-10-08 14:33:01 WARNING:busco.BuscoConfig Option limit was provided but is not used in the selected run mode, prok_genome_prod2024-10-08 14:33:13 WARNING:busco.BuscoConfig Option evalue was provided but is not used in the selected run mode, prok_genome_prod2024-10-08 14:33:13 WARNING:busco.BuscoConfig Option limit was provided but is not used in the selected run mode, prok_genome_prod2024-10-08 14:33:24 WARNING:busco.busco_tools.hmmer BUSCO did not find any match. Make sure to check the log files if this is unexpected.2024-10-08 14:36:46 INFO: Results written in /tmp/tmpjo038yv5/job_working_directory/000/3/working/busco_galaxy2024-10-08 14:36:46 INFO: For assistance with interpreting the results, please consult the userguide: https://busco.ezlab.org/busco_userguide.html2024-10-08 14:36:46 INFO: Visit this page https://gitlab.com/ezlab/busco#how-to-cite-busco to see how to cite BUSCOtotal 40drwxr-xr-x 8 1001 127 4096 Oct 8 14:34 augustus_outputdrwxr-xr-x 3 1001 127 4096 Oct 8 14:33 blast_outputdrwxr-xr-x 5 1001 127 4096 Oct 8 14:33 busco_sequences-rw-r--r-- 1 1001 127 5831 Oct 8 14:34 full_table.tsvdrwxr-xr-x 4 1001 127 4096 Oct 8 14:33 hmmer_output-rw-r--r-- 1 1001 127 3548 Oct 8 14:34 missing_busco_list.tsvdrwxr-xr-x 2 1001 127 4096 Oct 8 14:36 placement_files-rw-r--r-- 1 1001 127 3180 Oct 8 14:34 short_summary.json-rw-r--r-- 1 1001 127 1078 Oct 8 14:34 short_summary.txt2024-10-08 14:36:47 INFO: ****************** Start plot generation at 10/08/2024 14:36:47 ******************2024-10-08 14:36:47 INFO: Load data ...2024-10-08 14:36:47 INFO: Loaded BUSCO_summaries/short_summary.specific.eukaryota_odb10.busco_galaxy.txt successfully2024-10-08 14:36:47 INFO: Generate the R code ...2024-10-08 14:36:47 INFO: Run the R code ...2024-10-08 14:36:49 INFO: [1] "Plotting the figure 
..."
[1] "Done"
2024-10-08 14:36:49 INFO: Plot generation done. Total running time: 2.2741615772247314 seconds
2024-10-08 14:36:49 INFO: Results written in BUSCO_summaries/
2024-10-08 14:36:32.061351: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2024-10-08 14:36:32.737547: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
2024-10-08 14:36:38.160538: W tensorflow/tsl/framework/cpu_allocator_impl.cc:83] Allocation of 205286400 exceeds 10% of free system memory.
2024-10-08 14:36:38.374570: W tensorflow/tsl/framework/cpu_allocator_impl.cc:83] Allocation of 205286400 exceeds 10% of free system memory.
2024-10-08 14:36:40.875694: W tensorflow/tsl/framework/cpu_allocator_impl.cc:83] Allocation of 205286400 exceeds 10% of free system memory.
2024-10-08 14:36:43.349369: W tensorflow/tsl/framework/cpu_allocator_impl.cc:83] Allocation of 205286400 exceeds 10% of free system memory.
2024-10-08 14:36:45.784360: W tensorflow/tsl/framework/cpu_allocator_impl.cc:83] Allocation of 205286400 exceeds 10% of free system memory.
/usr/local/lib/python3.8/dist-packages/tensorflow_addons/utils/tfa_eol_msg.py:23: UserWarning:
TensorFlow Addons (TFA) has ended development and introduction of new features.
TFA has entered a minimal maintenance and release mode until a planned end of life in May 2024.
Please modify downstream libraries to take dependencies from other repositories in our TensorFlow community (e.g. Keras, Keras-CV, and Keras-NLP).
For more information see: https://github.com/tensorflow/addons/issues/2807
  warnings.warn(
Ignoring the following unexpected models in /tmp/tmp88q28s6d/job_working_directory/000/2/home/.local/share/Helixer/models:[].
You can set --model-filepath in Helixer.py if you wish to use these.
Standard Output:
============ CUDA ============CUDA Version 11.8.0Container image Copyright (c) 2016-2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved.This container image and its contents are governed by the NVIDIA Deep Learning Container License.By pulling and using the container, you accept the terms and conditions of this license:https://developer.nvidia.com/ngc/nvidia-deep-learning-container-licenseA copy of this license is made available in this container at /NGC-DL-CONTAINER-LICENSE for your convenience.WARNING: The NVIDIA Driver was not detected. GPU functionality will not be available. Use the NVIDIA Container Toolkit to start this container with GPU support; see https://docs.nvidia.com/datacenter/cloud-native/ .retrived list of available models from https://raw.githubusercontent.com/weberlab-hhu/Helixer/main/resources/model_list.csvsaved model land_plant_v0.3_a_0080.h5 to /tmp/tmp88q28s6d/job_working_directory/000/2/home/.local/share/Helixer/modelssaved model vertebrate_v0.3_m_0080.h5 to /tmp/tmp88q28s6d/job_working_directory/000/2/home/.local/share/Helixer/modelssaved model fungi_v0.3_a_0100.h5 to /tmp/tmp88q28s6d/job_working_directory/000/2/home/.local/share/Helixer/modelssaved model invertebrate_v0.3_m_0100.h5 to /tmp/tmp88q28s6d/job_working_directory/000/2/home/.local/share/Helixer/modelsHelixerPost <genome.h5> <predictions.h5> <windowSize> <edgeThresh> <peakThresh> <minCodingLength> <gff>No config file foundretrived list of available models from https://raw.githubusercontent.com/weberlab-hhu/Helixer/main/resources/model_list.csvHelixer.py config: {'batch_size': 8, 'compression': 'gzip', 'config_path': 'config/helixer_config.yaml', 'debug': False, 'edge_threshold': 0.1, 'fasta_path': '/tmp/tmp88q28s6d/files/3/e/b/dataset_3eb29368-f28d-44dc-ae4c-29ed9705e5d9.dat', 'gff_output_path': '/tmp/tmp88q28s6d/job_working_directory/000/2/outputs/dataset_898a3f2e-09e0-452d-b752-0c05a730a5df.dat', 'lineage': 'land_plant', 'min_coding_length': 100, 'model_filepath': 
'/tmp/tmp88q28s6d/job_working_directory/000/2/home/.local/share/Helixer/models/land_plant/land_plant_v0.3_a_0080.h5', 'no_multiprocess': False, 'no_overlap': False, 'overlap_core_length': 80190, 'overlap_offset': 53460, 'peak_threshold': 0.8, 'species': '', 'subsequence_length': 106920, 'temporary_dir': './', 'window_size': 100}Testing whether helixer_post_bin is correctly installedHelixer.py config loaded. Starting FASTA to H5 conversion.storing temporary files under ./tmp6vfv9bei1 expected num of chunks to write in 19994040 bases to hdf5Numerification of 0-720580 of the sequence of sample took 0.34 secs1 Numerified Fasta only Coordinate (seqid: sample, len: 720580) in 0.49 secs1 expected num of chunks to write in 19994040 bases to hdf5Numerification of 0-2030 of the sequence of sample2 took 0.00 secs2 Numerified Fasta only Coordinate (seqid: sample2, len: 2030) in 0.01 secs1 expected num of chunks to write in 19994040 bases to hdf5Numerification of 0-2100 of the sequence of sample3 took 0.00 secs3 Numerified Fasta only Coordinate (seqid: sample3, len: 2100) in 0.01 secs1 expected num of chunks to write in 19994040 bases to hdf5Numerification of 0-7560 of the sequence of sample4 took 0.00 secs4 Numerified Fasta only Coordinate (seqid: sample4, len: 7560) in 0.02 secslogged installed version in place of git commit for geenufflogged installed version in place of git commit for helixerFASTA to H5 conversion done. 
Starting neural network prediction with overlapping.HelixerModel config: {'batch_size': 8, 'calculate_uncertainty': False, 'check_every_nth_batch': 1000000, 'class_weights': 'None', 'clip_norm': 3.0, 'cnn_layers': 1, 'compression': 'gzip', 'core_length': 80190, 'coverage_norm': None, 'coverage_offset': 0.0, 'coverage_weights': False, 'cpus': 8, 'data_dir': None, 'debug': False, 'dropout1': 0.0, 'dropout2': 0.0, 'epochs': 10000, 'eval': False, 'filter_depth': 32, 'fine_tune': False, 'float_precision': 'float32', 'gpu_id': -1, 'input_coverage': False, 'kernel_size': 26, 'large_eval_folder': '', 'learning_rate': 0.0003, 'load_model_path': '/tmp/tmp88q28s6d/job_working_directory/000/2/home/.local/share/Helixer/models/land_plant/land_plant_v0.3_a_0080.h5', 'load_predictions': False, 'loss': '', 'lstm_layers': 1, 'nni': False, 'no_utrs': False, 'optimizer': 'adamw', 'overlap': True, 'overlap_offset': 53460, 'patience': 3, 'pool_size': 9, 'post_coverage_hidden_layer': False, 'predict_phase': False, 'prediction_output_path': './tmp6vfv9bei/tmp_predictions_.h5', 'pretrained_model_path': None, 'resume_training': False, 'save_every_check': False, 'save_model_path': './best_model.h5', 'stretch_transition_weights': 0, 'test_data': './tmp6vfv9bei/tmp_species_.h5', 'transition_weights': 'None', 'units': 32, 'val_test_batch_size': 8, 'verbose': True, 'weight_decay': 3.5e-05, 'workers': 1}No err_samples dataset found, correct samples will be set to 0No fully_intergenic_samples dataset found, fully intergenic samples will be set to 0Data config: [{'geenuff_commit': 'commit not found, version: 0.3.2', 'helixer_commit': 'commit not found, version: 0.3.3', 'input_path': '/tmp/tmp88q28s6d/files/3/e/b/dataset_3eb29368-f28d-44dc-ae4c-29ed9705e5d9.dat', 'timestamp': '2024-10-08 14:36:35.465286'}]Test data shape: (20, 106920)Intergenic test seqs: 0.00%Fully correct test seqs: 0.00%Number of devices: 1Current Helixer version: 0.3.3Md5sum of the loaded model: f0e00efcbea83c66b69258d11119a691 
/tmp/tmp88q28s6d/job_working_directory/000/2/home/.local/share/Helixer/models/land_plant/land_plant_v0.3_a_0080.h5Model: "model"__________________________________________________________________________________________________ Layer (type) Output Shape Param # Connected to ================================================================================================== main_input (InputLayer) [(None, None, 4)] 0 [] conv1d (Conv1D) (None, None, 96) 4704 ['main_input[0][0]'] batch_normalization (Batch (None, None, 96) 384 ['conv1d[0][0]'] Normalization) conv1d_1 (Conv1D) (None, None, 96) 110688 ['batch_normalization[0][0]'] batch_normalization_1 (Bat (None, None, 96) 384 ['conv1d_1[0][0]'] chNormalization) conv1d_2 (Conv1D) (None, None, 96) 110688 ['batch_normalization_1[0][0]' ] batch_normalization_2 (Bat (None, None, 96) 384 ['conv1d_2[0][0]'] chNormalization) conv1d_3 (Conv1D) (None, None, 96) 110688 ['batch_normalization_2[0][0]' ] reshape (Reshape) (None, None, 864) 0 ['conv1d_3[0][0]'] bidirectional (Bidirection (None, None, 256) 1016832 ['reshape[0][0]'] al) bidirectional_1 (Bidirecti (None, None, 256) 394240 ['bidirectional[0][0]'] onal) bidirectional_2 (Bidirecti (None, None, 256) 394240 ['bidirectional_1[0][0]'] onal) dense (Dense) (None, None, 72) 18504 ['bidirectional_2[0][0]'] tf.split (TFOpLambda) [(None, None, 36), 0 ['dense[0][0]'] (None, None, 36)] reshape_1 (Reshape) (None, None, 9, 4) 0 ['tf.split[0][0]'] reshape_2 (Reshape) (None, None, 9, 4) 0 ['tf.split[0][1]'] genic (Activation) (None, None, 9, 4) 0 ['reshape_1[0][0]'] phase (Activation) (None, None, 9, 4) 0 ['reshape_2[0][0]'] ==================================================================================================Total params: 2161736 (8.25 MB)Trainable params: 2161160 (8.24 MB)Non-trainable params: 576 (2.25 KB)__________________________________________________________________________________________________HMM Config Splicing Flags: U:true US:true S:true SC:true C:true CS:true 
S:true SU:true U:true Splicing - Weights: Donor 1, Acceptor 1 Splicing - Fixed Penalties: U2-GT-AG 0, U2-GT-AC 0 U12-GT-AG 0 U12-AT-AC 0 Coding - Weights: Start 1000, Stop 1000 Phase Mode: Implementation 1, Dilute to Total, Retention: 0.2Sequences for Species - 0 BP_Extractor for Sequence sample - ID 0Forward for Sequence sample - ID 0 ML v HP Class Intergenic UTR Coding Intron ML v HP Phase Non Coding Phase 0 Phase 1 Phase 2 Intergenic 613530 131 479 418 Non Coding 689510 1069 1104 1091 UTR 238 5141 87 120 Phase 0 230 9046 2 6 Coding 1243 82 29647 264 Phase 1 226 8 9014 2 Intron 1459 120 162 67459 Phase 2 239 2 5 9026 Precision Recall F1 Precision Recall F1 Intergenic 0.99523 0.99833 0.99678 Non Coding 0.99899 0.99529 0.99714 UTR 0.93917 0.92034 0.92966 Phase 0 0.89343 0.97436 0.93214 Coding 0.97603 0.94913 0.96239 Phase 1 0.89027 0.97449 0.93048 Intron 0.98825 0.97484 0.98150 Phase 2 0.89146 0.97347 0.93066 Subgenic 0.98449 0.96684 0.97559 Coding 0.89172 0.97411 0.93109 Genic 0.98211 0.96439 0.97317 Reverse for Sequence sample - ID 0 ML v HP Class Intergenic UTR Coding Intron ML v HP Phase Non Coding Phase 0 Phase 1 Phase 2 Intergenic 555814 209 108 776 Non Coding 659468 985 971 987 UTR 454 9649 137 214 Phase 0 670 18679 9 31 Coding 2123 49 58497 691 Phase 1 689 33 18702 12 Intron 8256 265 391 82947 Phase 2 620 14 29 18681 Precision Recall F1 Precision Recall F1 Intergenic 0.98088 0.99804 0.98939 Non Coding 0.99701 0.99556 0.99628 UTR 0.94858 0.92300 0.93562 Phase 0 0.94764 0.96338 0.95545 Coding 0.98924 0.95334 0.97096 Phase 1 0.94881 0.96224 0.95548 Intron 0.98014 0.90298 0.93998 Phase 2 0.94774 0.96573 0.95665 Subgenic 0.98388 0.92315 0.95255 Coding 0.94807 0.96378 0.95586 Genic 0.98155 0.92314 0.95145 BP_Extractor for Sequence sample2 - ID 1Forward for Sequence sample2 - ID 1 ML v HP Class Intergenic UTR Coding Intron ML v HP Phase Non Coding Phase 0 Phase 1 Phase 2 Intergenic 2026 0 0 0 Non Coding 2029 0 0 0 UTR 0 0 0 0 Phase 0 1 0 0 0 Coding 4 0 0 0 Phase 1 
0 0 0 0 Intron 0 0 0 0 Phase 2 0 0 0 0 Precision Recall F1 Precision Recall F1 Intergenic 0.99803 1.00000 0.99901 Non Coding 0.99951 1.00000 0.99975 UTR NaN NaN NaN Phase 0 NaN 0.00000 0.00000 Coding NaN 0.00000 0.00000 Phase 1 NaN NaN NaN Intron NaN NaN NaN Phase 2 NaN NaN NaN Subgenic NaN 0.00000 0.00000 Coding NaN 0.00000 0.00000 Genic NaN 0.00000 0.00000 Reverse for Sequence sample2 - ID 1 ML v HP Class Intergenic UTR Coding Intron ML v HP Phase Non Coding Phase 0 Phase 1 Phase 2 Intergenic 2030 0 0 0 Non Coding 2030 0 0 0 UTR 0 0 0 0 Phase 0 0 0 0 0 Coding 0 0 0 0 Phase 1 0 0 0 0 Intron 0 0 0 0 Phase 2 0 0 0 0 Precision Recall F1 Precision Recall F1 Intergenic 1.00000 1.00000 1.00000 Non Coding 1.00000 1.00000 1.00000 UTR NaN NaN NaN Phase 0 NaN NaN NaN Coding NaN NaN NaN Phase 1 NaN NaN NaN Intron NaN NaN NaN Phase 2 NaN NaN NaN Subgenic NaN NaN NaN Coding NaN NaN NaN Genic NaN NaN NaN BP_Extractor for Sequence sample3 - ID 2Forward for Sequence sample3 - ID 2 ML v HP Class Intergenic UTR Coding Intron ML v HP Phase Non Coding Phase 0 Phase 1 Phase 2 Intergenic 2098 0 0 0 Non Coding 2099 0 0 0 UTR 0 0 0 0 Phase 0 1 0 0 0 Coding 2 0 0 0 Phase 1 0 0 0 0 Intron 0 0 0 0 Phase 2 0 0 0 0 Precision Recall F1 Precision Recall F1 Intergenic 0.99905 1.00000 0.99952 Non Coding 0.99952 1.00000 0.99976 UTR NaN NaN NaN Phase 0 NaN 0.00000 0.00000 Coding NaN 0.00000 0.00000 Phase 1 NaN NaN NaN Intron NaN NaN NaN Phase 2 NaN NaN NaN Subgenic NaN 0.00000 0.00000 Coding NaN 0.00000 0.00000 Genic NaN 0.00000 0.00000 Reverse for Sequence sample3 - ID 2 ML v HP Class Intergenic UTR Coding Intron ML v HP Phase Non Coding Phase 0 Phase 1 Phase 2 Intergenic 2100 0 0 0 Non Coding 2100 0 0 0 UTR 0 0 0 0 Phase 0 0 0 0 0 Coding 0 0 0 0 Phase 1 0 0 0 0 Intron 0 0 0 0 Phase 2 0 0 0 0 Precision Recall F1 Precision Recall F1 Intergenic 1.00000 1.00000 1.00000 Non Coding 1.00000 1.00000 1.00000 UTR NaN NaN NaN Phase 0 NaN NaN NaN Coding NaN NaN NaN Phase 1 NaN NaN NaN Intron NaN NaN NaN 
Phase 2 NaN NaN NaN Subgenic NaN NaN NaN Coding NaN NaN NaN Genic NaN NaN NaN BP_Extractor for Sequence sample4 - ID 3Forward for Sequence sample4 - ID 3 ML v HP Class Intergenic UTR Coding Intron ML v HP Phase Non Coding Phase 0 Phase 1 Phase 2 Intergenic 7535 0 0 0 Non Coding 7553 0 0 0 UTR 3 0 0 0 Phase 0 2 0 0 0 Coding 8 0 0 0 Phase 1 2 0 0 0 Intron 14 0 0 0 Phase 2 3 0 0 0 Precision Recall F1 Precision Recall F1 Intergenic 0.99669 1.00000 0.99834 Non Coding 0.99907 1.00000 0.99954 UTR NaN 0.00000 0.00000 Phase 0 NaN 0.00000 0.00000 Coding NaN 0.00000 0.00000 Phase 1 NaN 0.00000 0.00000 Intron NaN 0.00000 0.00000 Phase 2 NaN 0.00000 0.00000 Subgenic NaN 0.00000 0.00000 Coding NaN 0.00000 0.00000 Genic NaN 0.00000 0.00000 Reverse for Sequence sample4 - ID 3 ML v HP Class Intergenic UTR Coding Intron ML v HP Phase Non Coding Phase 0 Phase 1 Phase 2 Intergenic 4480 19 0 0 Non Coding 5923 2 6 4 UTR 40 453 5 0 Phase 0 17 524 0 0 Coding 321 0 1573 1 Phase 1 17 0 520 0 Intron 0 0 0 668 Phase 2 25 0 0 522 Precision Recall F1 Precision Recall F1 Intergenic 0.92543 0.99578 0.95931 Non Coding 0.99014 0.99798 0.99404 UTR 0.95975 0.90964 0.93402 Phase 0 0.99620 0.96858 0.98219 Coding 0.99683 0.83008 0.90585 Phase 1 0.98859 0.96834 0.97836 Intron 0.99851 1.00000 0.99925 Phase 2 0.99240 0.95430 0.97297 Subgenic 0.99733 0.87437 0.93181 Coding 0.99240 0.96369 0.97783 Genic 0.99081 0.88010 0.93218 Forward for Species - 0 ML v HP Class Intergenic UTR Coding Intron ML v HP Phase Non Coding Phase 0 Phase 1 Phase 2 Intergenic 625189 131 479 418 Non Coding 701191 1069 1104 1091 UTR 241 5141 87 120 Phase 0 234 9046 2 6 Coding 1257 82 29647 264 Phase 1 228 8 9014 2 Intron 1473 120 162 67459 Phase 2 242 2 5 9026 Precision Recall F1 Precision Recall F1 Intergenic 0.99527 0.99836 0.99681 Non Coding 0.99900 0.99537 0.99718 UTR 0.93917 0.91984 0.92940 Phase 0 0.89343 0.97394 0.93195 Coding 0.97603 0.94870 0.96217 Phase 1 0.89027 0.97428 0.93038 Intron 0.98825 0.97464 0.98140 Phase 2 0.89146 
0.97315 0.93052 Subgenic 0.98449 0.96658 0.97545 Coding 0.89172 0.97379 0.93095 Genic 0.98211 0.96411 0.97303 Reverse for Species - 0 ML v HP Class Intergenic UTR Coding Intron ML v HP Phase Non Coding Phase 0 Phase 1 Phase 2 Intergenic 564424 228 108 776 Non Coding 669521 987 977 991 UTR 494 10102 142 214 Phase 0 687 19203 9 31 Coding 2444 49 60070 692 Phase 1 706 33 19222 12 Intron 8256 265 391 83615 Phase 2 645 14 29 19203 Precision Recall F1 Precision Recall F1 Intergenic 0.98055 0.99803 0.98922 Non Coding 0.99697 0.99561 0.99629 UTR 0.94908 0.92239 0.93554 Phase 0 0.94891 0.96352 0.95616 Coding 0.98944 0.94965 0.96914 Phase 1 0.94984 0.96240 0.95608 Intron 0.98028 0.90368 0.94042 Phase 2 0.94891 0.96541 0.95709 Subgenic 0.98409 0.92235 0.95222 Coding 0.94922 0.96378 0.95644 Genic 0.98171 0.92235 0.95110 Total for Species - 0 ML v HP Class Intergenic UTR Coding Intron ML v HP Phase Non Coding Phase 0 Phase 1 Phase 2 Intergenic 1189613 359 587 1194 Non Coding 1370712 2056 2081 2082 UTR 735 15243 229 334 Phase 0 921 28249 11 37 Coding 3701 131 89717 956 Phase 1 934 41 28236 14 Intron 9729 385 553 151074 Phase 2 887 16 34 28229 Precision Recall F1 Precision Recall F1 Intergenic 0.98823 0.99820 0.99319 Non Coding 0.99800 0.99548 0.99674 UTR 0.94571 0.92153 0.93346 Phase 0 0.93041 0.96684 0.94827 Coding 0.98497 0.94934 0.96682 Phase 1 0.92998 0.96616 0.94772 Intron 0.98382 0.93405 0.95829 Phase 2 0.92975 0.96787 0.94843 Subgenic 0.98425 0.93969 0.96145 Coding 0.93004 0.96696 0.94814 Genic 0.98187 0.93859 0.95974 Total: 482642bp across 25 windowsNonestarting to load test data into memory..For h5 starting with species = b'':x shape: (20, 106920, 4)Data loading of 20 (total so far 20) samples of data/X into memory took 0.07 secsCompressed data size of data/X is at least 0.0008 GBsetting self.n_seqs to 20, bc that is len of data/X0 / 81 / 82 / 83 / 84 / 85 / 86 / 87 / 8Neural network prediction done. 
Starting post processing.
Helixer successfully finished the annotation of /tmp/tmp88q28s6d/files/3/e/b/dataset_3eb29368-f28d-44dc-ae4c-29ed9705e5d9.dat in 0.07 hours. GFF file written to /tmp/tmp88q28s6d/job_working_directory/000/2/outputs/dataset_898a3f2e-09e0-452d-b752-0c05a730a5df.dat.
2024-10-08 14:37:23 ERROR: Warning message:
package ‘ggplot2’ was built under R version 4.4.1
Warning message:
The `size` argument of `element_line()` is deprecated as of ggplot2 3.4.0.
ℹ Please use the `linewidth` argument instead.
Standard Output:
2024-10-08 14:32:50 INFO: ***** Start a BUSCO v5.7.1 analysis, current time: 10/08/2024 14:32:50 *****2024-10-08 14:32:50 INFO: Configuring BUSCO with local environment2024-10-08 14:32:50 INFO: Running genome mode2024-10-08 14:33:10 INFO: Input file is /tmp/tmp88q28s6d/files/3/e/b/dataset_3eb29368-f28d-44dc-ae4c-29ed9705e5d9.dat2024-10-08 14:33:10 INFO: No lineage specified. Running lineage auto selector.2024-10-08 14:33:10 INFO: ***** Starting Auto Select Lineage ***** This process runs BUSCO on the generic lineage datasets for the domains archaea, bacteria and eukaryota. Once the optimal domain is selected, BUSCO automatically attempts to find the most appropriate BUSCO dataset to use based on phylogenetic placement. --auto-lineage-euk and --auto-lineage-prok are also available if you know your input assembly is, or is not, an eukaryote. See the user guide for more information. A reminder: Busco evaluations are valid when an appropriate dataset is used, i.e., the dataset belongs to the lineage of the species to test. Because of overlapping markers/spurious matches among domains, busco matches in another domain do not necessarily mean that your genome/proteome contains sequences from this domain. 
However, a high busco score in multiple domains might help you identify possible contaminations.2024-10-08 14:33:10 WARNING: Option evalue was provided but is not used in the selected run mode, prok_genome_prod2024-10-08 14:33:10 WARNING: Option limit was provided but is not used in the selected run mode, prok_genome_prod2024-10-08 14:33:10 INFO: Running BUSCO using lineage dataset archaea_odb10 (prokaryota, 2021-02-23)2024-10-08 14:33:10 INFO: Running 1 job(s) on bbtools, starting at 10/08/2024 14:33:102024-10-08 14:33:12 INFO: [bbtools] 1 of 1 task(s) completed2024-10-08 14:33:12 INFO: ***** Run Prodigal on input to predict and extract genes *****2024-10-08 14:33:12 INFO: Running Prodigal with genetic code 11 in single mode2024-10-08 14:33:12 INFO: Running 1 job(s) on prodigal, starting at 10/08/2024 14:33:122024-10-08 14:33:14 INFO: [prodigal] 1 of 1 task(s) completed2024-10-08 14:33:14 INFO: Genetic code 11 selected as optimal2024-10-08 14:33:14 INFO: ***** Run HMMER on gene sequences *****2024-10-08 14:33:14 INFO: Running 194 job(s) on hmmsearch, starting at 10/08/2024 14:33:142024-10-08 14:33:16 INFO: [hmmsearch] 20 of 194 task(s) completed2024-10-08 14:33:18 INFO: [hmmsearch] 39 of 194 task(s) completed2024-10-08 14:33:19 INFO: [hmmsearch] 59 of 194 task(s) completed2024-10-08 14:33:21 INFO: [hmmsearch] 78 of 194 task(s) completed2024-10-08 14:33:22 INFO: [hmmsearch] 97 of 194 task(s) completed2024-10-08 14:33:24 INFO: [hmmsearch] 117 of 194 task(s) completed2024-10-08 14:33:25 INFO: [hmmsearch] 136 of 194 task(s) completed2024-10-08 14:33:27 INFO: [hmmsearch] 156 of 194 task(s) completed2024-10-08 14:33:29 INFO: [hmmsearch] 175 of 194 task(s) completed2024-10-08 14:33:30 INFO: [hmmsearch] 194 of 194 task(s) completed2024-10-08 14:33:30 INFO: Results: C:0.5%[S:0.5%,D:0.0%],F:0.0%,M:99.5%,n:194 2024-10-08 14:33:30 WARNING: Option evalue was provided but is not used in the selected run mode, prok_genome_prod2024-10-08 14:33:30 WARNING: Option limit was 
provided but is not used in the selected run mode, prok_genome_prod2024-10-08 14:33:30 INFO: Running BUSCO using lineage dataset bacteria_odb10 (prokaryota, 2020-03-06)2024-10-08 14:33:30 INFO: Skipping BBTools as already run2024-10-08 14:33:30 INFO: ***** Run Prodigal on input to predict and extract genes *****2024-10-08 14:33:30 INFO: Running Prodigal with genetic code 4 in single mode2024-10-08 14:33:30 INFO: Running 1 job(s) on prodigal, starting at 10/08/2024 14:33:302024-10-08 14:33:33 INFO: [prodigal] 1 of 1 task(s) completed2024-10-08 14:33:33 INFO: Genetic code 4 selected as optimal2024-10-08 14:33:33 INFO: ***** Run HMMER on gene sequences *****2024-10-08 14:33:33 INFO: Running 124 job(s) on hmmsearch, starting at 10/08/2024 14:33:332024-10-08 14:33:35 INFO: [hmmsearch] 13 of 124 task(s) completed2024-10-08 14:33:36 INFO: [hmmsearch] 25 of 124 task(s) completed2024-10-08 14:33:37 INFO: [hmmsearch] 38 of 124 task(s) completed2024-10-08 14:33:38 INFO: [hmmsearch] 50 of 124 task(s) completed2024-10-08 14:33:39 INFO: [hmmsearch] 63 of 124 task(s) completed2024-10-08 14:33:40 INFO: [hmmsearch] 75 of 124 task(s) completed2024-10-08 14:33:41 INFO: [hmmsearch] 87 of 124 task(s) completed2024-10-08 14:33:42 INFO: [hmmsearch] 100 of 124 task(s) completed2024-10-08 14:33:43 INFO: [hmmsearch] 112 of 124 task(s) completed2024-10-08 14:33:44 INFO: [hmmsearch] 124 of 124 task(s) completed2024-10-08 14:33:44 WARNING: BUSCO did not find any match. 
Make sure to check the log files if this is unexpected.2024-10-08 14:33:44 INFO: Results: C:0.0%[S:0.0%,D:0.0%],F:0.0%,M:100.0%,n:124 2024-10-08 14:33:44 INFO: Running BUSCO using lineage dataset eukaryota_odb10 (eukaryota, 2020-09-10)2024-10-08 14:33:44 INFO: Skipping BBTools as already run2024-10-08 14:33:44 INFO: Running 1 job(s) on makeblastdb, starting at 10/08/2024 14:33:442024-10-08 14:33:45 INFO: Creating BLAST database with input file2024-10-08 14:33:45 INFO: [makeblastdb] 1 of 1 task(s) completed2024-10-08 14:33:45 INFO: Running a BLAST search for BUSCOs against created database2024-10-08 14:33:45 INFO: Running 1 job(s) on tblastn, starting at 10/08/2024 14:33:452024-10-08 14:33:47 INFO: [tblastn] 1 of 1 task(s) completed2024-10-08 14:33:47 INFO: Running Augustus gene predictor on BLAST search results.2024-10-08 14:33:47 INFO: Running Augustus prediction using fly as species:2024-10-08 14:33:47 INFO: Running 6 job(s) on augustus, starting at 10/08/2024 14:33:472024-10-08 14:33:50 INFO: [augustus] 1 of 6 task(s) completed2024-10-08 14:33:52 INFO: [augustus] 2 of 6 task(s) completed2024-10-08 14:33:56 INFO: [augustus] 3 of 6 task(s) completed2024-10-08 14:33:58 INFO: [augustus] 4 of 6 task(s) completed2024-10-08 14:33:59 INFO: [augustus] 5 of 6 task(s) completed2024-10-08 14:34:01 INFO: [augustus] 6 of 6 task(s) completed2024-10-08 14:34:01 INFO: Extracting predicted proteins...2024-10-08 14:34:01 INFO: ***** Run HMMER on gene sequences *****2024-10-08 14:34:01 INFO: Running 6 job(s) on hmmsearch, starting at 10/08/2024 14:34:012024-10-08 14:34:02 INFO: [hmmsearch] 1 of 6 task(s) completed2024-10-08 14:34:02 INFO: [hmmsearch] 2 of 6 task(s) completed2024-10-08 14:34:02 INFO: [hmmsearch] 3 of 6 task(s) completed2024-10-08 14:34:02 INFO: [hmmsearch] 4 of 6 task(s) completed2024-10-08 14:34:02 INFO: [hmmsearch] 5 of 6 task(s) completed2024-10-08 14:34:02 INFO: [hmmsearch] 6 of 6 task(s) completed2024-10-08 14:34:02 INFO: 37 exons in total2024-10-08 14:34:02 
INFO: Results: C:1.2%[S:1.2%,D:0.0%],F:0.0%,M:98.8%,n:255 2024-10-08 14:34:02 INFO: Starting second step of analysis. The gene predictor Augustus is retrained using the results from the initial run to yield more accurate results.2024-10-08 14:34:02 INFO: Extracting missing and fragmented buscos from the file ancestral_variants...2024-10-08 14:34:03 INFO: Running a BLAST search for BUSCOs against created database2024-10-08 14:34:03 INFO: Running 1 job(s) on tblastn, starting at 10/08/2024 14:34:032024-10-08 14:34:13 INFO: [tblastn] 1 of 1 task(s) completed2024-10-08 14:34:13 INFO: Converting predicted genes to short genbank files2024-10-08 14:34:13 INFO: Running 3 job(s) on gff2gbSmallDNA.pl, starting at 10/08/2024 14:34:132024-10-08 14:34:14 INFO: [gff2gbSmallDNA.pl] 1 of 3 task(s) completed2024-10-08 14:34:14 INFO: [gff2gbSmallDNA.pl] 2 of 3 task(s) completed2024-10-08 14:34:14 INFO: [gff2gbSmallDNA.pl] 3 of 3 task(s) completed2024-10-08 14:34:14 INFO: All files converted to short genbank files, now training Augustus using Single-Copy Complete BUSCOs2024-10-08 14:34:14 INFO: Running 1 job(s) on new_species.pl, starting at 10/08/2024 14:34:142024-10-08 14:34:14 INFO: [new_species.pl] 1 of 1 task(s) completed2024-10-08 14:34:14 INFO: Running 1 job(s) on etraining, starting at 10/08/2024 14:34:142024-10-08 14:34:15 INFO: [etraining] 1 of 1 task(s) completed2024-10-08 14:34:15 INFO: Re-running Augustus with the new metaparameters, number of target BUSCOs: 2522024-10-08 14:34:15 INFO: Running Augustus gene predictor on BLAST search results.2024-10-08 14:34:15 INFO: Running Augustus prediction using BUSCO_busco_galaxy as species:2024-10-08 14:34:15 INFO: Running 6 job(s) on augustus, starting at 10/08/2024 14:34:152024-10-08 14:34:19 INFO: [augustus] 1 of 6 task(s) completed2024-10-08 14:34:20 INFO: [augustus] 2 of 6 task(s) completed2024-10-08 14:34:21 INFO: [augustus] 3 of 6 task(s) completed2024-10-08 14:34:24 INFO: [augustus] 4 of 6 task(s) completed2024-10-08 
14:34:25 INFO: [augustus] 5 of 6 task(s) completed2024-10-08 14:34:26 INFO: [augustus] 6 of 6 task(s) completed2024-10-08 14:34:26 INFO: Extracting predicted proteins...2024-10-08 14:34:26 INFO: ***** Run HMMER on gene sequences *****2024-10-08 14:34:26 INFO: Running 6 job(s) on hmmsearch, starting at 10/08/2024 14:34:262024-10-08 14:34:27 INFO: [hmmsearch] 1 of 6 task(s) completed2024-10-08 14:34:27 INFO: [hmmsearch] 2 of 6 task(s) completed2024-10-08 14:34:27 INFO: [hmmsearch] 3 of 6 task(s) completed2024-10-08 14:34:27 INFO: [hmmsearch] 4 of 6 task(s) completed2024-10-08 14:34:27 INFO: [hmmsearch] 5 of 6 task(s) completed2024-10-08 14:34:27 INFO: [hmmsearch] 6 of 6 task(s) completed2024-10-08 14:34:27 INFO: 37 exons in total2024-10-08 14:34:27 INFO: Results: C:1.6%[S:1.6%,D:0.0%],F:0.0%,M:98.4%,n:255 2024-10-08 14:34:27 INFO: eukaryota_odb10 selected2024-10-08 14:34:27 INFO: ***** Searching tree for chosen lineage to find best taxonomic match *****2024-10-08 14:34:28 INFO: Extract markers...2024-10-08 14:34:28 INFO: Place the markers on the reference tree...2024-10-08 14:34:28 INFO: Running 1 job(s) on sepp, starting at 10/08/2024 14:34:282024-10-08 14:37:19 INFO: [sepp] 1 of 1 task(s) completed2024-10-08 14:37:19 INFO: Not enough markers were placed on the tree (1). Root lineage eukaryota is kept2024-10-08 14:37:19 INFO: --------------------------------------------------- |Results from dataset eukaryota_odb10 | --------------------------------------------------- |C:1.6%[S:1.6%,D:0.0%],F:0.0%,M:98.4%,n:255 | |4 Complete BUSCOs (C) | |4 Complete and single-copy BUSCOs (S) | |0 Complete and duplicated BUSCOs (D) | |0 Fragmented BUSCOs (F) | |251 Missing BUSCOs (M) | |255 Total BUSCO groups searched | ---------------------------------------------------2024-10-08 14:37:19 INFO: BUSCO analysis done with WARNING(s). 
Total running time: 249 seconds***** Summary of warnings: *****2024-10-08 14:33:10 WARNING:busco.BuscoConfig Option evalue was provided but is not used in the selected run mode, prok_genome_prod2024-10-08 14:33:10 WARNING:busco.BuscoConfig Option limit was provided but is not used in the selected run mode, prok_genome_prod2024-10-08 14:33:30 WARNING:busco.BuscoConfig Option evalue was provided but is not used in the selected run mode, prok_genome_prod2024-10-08 14:33:30 WARNING:busco.BuscoConfig Option limit was provided but is not used in the selected run mode, prok_genome_prod2024-10-08 14:33:44 WARNING:busco.busco_tools.hmmer BUSCO did not find any match. Make sure to check the log files if this is unexpected.2024-10-08 14:37:19 INFO: Results written in /tmp/tmp88q28s6d/job_working_directory/000/3/working/busco_galaxy2024-10-08 14:37:19 INFO: For assistance with interpreting the results, please consult the userguide: https://busco.ezlab.org/busco_userguide.html2024-10-08 14:37:19 INFO: Visit this page https://gitlab.com/ezlab/busco#how-to-cite-busco to see how to cite BUSCOtotal 40drwxr-xr-x 8 1001 127 4096 Oct 8 14:34 augustus_outputdrwxr-xr-x 3 1001 127 4096 Oct 8 14:34 blast_outputdrwxr-xr-x 5 1001 127 4096 Oct 8 14:33 busco_sequences-rw-r--r-- 1 1001 127 5831 Oct 8 14:34 full_table.tsvdrwxr-xr-x 4 1001 127 4096 Oct 8 14:34 hmmer_output-rw-r--r-- 1 1001 127 3548 Oct 8 14:34 missing_busco_list.tsvdrwxr-xr-x 2 1001 127 4096 Oct 8 14:37 placement_files-rw-r--r-- 1 1001 127 3180 Oct 8 14:34 short_summary.json-rw-r--r-- 1 1001 127 1078 Oct 8 14:34 short_summary.txt2024-10-08 14:37:20 INFO: ****************** Start plot generation at 10/08/2024 14:37:20 ******************2024-10-08 14:37:20 INFO: Load data ...2024-10-08 14:37:20 INFO: Loaded BUSCO_summaries/short_summary.specific.eukaryota_odb10.busco_galaxy.txt successfully2024-10-08 14:37:20 INFO: Generate the R code ...2024-10-08 14:37:20 INFO: Run the R code ...2024-10-08 14:37:23 INFO: [1] "Plotting the figure 
..."
[1] "Done"
2024-10-08 14:37:23 INFO: Plot generation done. Total running time: 2.407008647918701 seconds
2024-10-08 14:37:23 INFO: Results written in BUSCO_summaries/
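The BUSCO log above reports its score as a compact string such as `C:1.6%[S:1.6%,D:0.0%],F:0.0%,M:98.4%,n:255`. The following is a minimal sketch of how that string could be parsed and cross-checked against the counts in the results box (4 complete, 251 missing, 255 searched). The names `SUMMARY_RE` and `parse_busco_summary` are hypothetical helpers written for this note; they are not part of BUSCO.

```python
import re

# Hypothetical helper, not a BUSCO API: matches the one-line score string
# exactly as it appears in the log above.
SUMMARY_RE = re.compile(
    r"C:(?P<C>[\d.]+)%\[S:(?P<S>[\d.]+)%,D:(?P<D>[\d.]+)%\],"
    r"F:(?P<F>[\d.]+)%,M:(?P<M>[\d.]+)%,n:(?P<n>\d+)"
)


def parse_busco_summary(line: str) -> dict:
    """Return the percentages (as floats) and the group count n (as int)."""
    match = SUMMARY_RE.search(line)
    if match is None:
        raise ValueError("no BUSCO summary string found in line")
    fields = {key: float(value) for key, value in match.groupdict().items() if key != "n"}
    fields["n"] = int(match.group("n"))
    return fields


# Score string copied from the eukaryota_odb10 result in the log.
parsed = parse_busco_summary("C:1.6%[S:1.6%,D:0.0%],F:0.0%,M:98.4%,n:255")

# Cross-check against the counts from the results box: 4 complete and
# 251 missing out of 255 groups.
complete_pct = round(100 * 4 / parsed["n"], 1)
missing_pct = round(100 * 251 / parsed["n"], 1)
print(parsed, complete_pct, missing_pct)
```

The recomputed percentages agree with the logged string, which shows that the percentages are the raw counts divided by `n` and rounded to one decimal place.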
2024-10-10 09:24:32.475025: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2024-10-10 09:24:33.180938: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
2024-10-10 09:24:38.730559: W tensorflow/tsl/framework/cpu_allocator_impl.cc:83] Allocation of 205286400 exceeds 10% of free system memory.
2024-10-10 09:24:38.976324: W tensorflow/tsl/framework/cpu_allocator_impl.cc:83] Allocation of 205286400 exceeds 10% of free system memory.
2024-10-10 09:24:41.578172: W tensorflow/tsl/framework/cpu_allocator_impl.cc:83] Allocation of 205286400 exceeds 10% of free system memory.
2024-10-10 09:24:44.176115: W tensorflow/tsl/framework/cpu_allocator_impl.cc:83] Allocation of 205286400 exceeds 10% of free system memory.
2024-10-10 09:24:46.786359: W tensorflow/tsl/framework/cpu_allocator_impl.cc:83] Allocation of 205286400 exceeds 10% of free system memory.
/usr/local/lib/python3.8/dist-packages/tensorflow_addons/utils/tfa_eol_msg.py:23: UserWarning: TensorFlow Addons (TFA) has ended development and introduction of new features.
TFA has entered a minimal maintenance and release mode until a planned end of life in May 2024.
Please modify downstream libraries to take dependencies from other repositories in our TensorFlow community (e.g. Keras, Keras-CV, and Keras-NLP).
For more information see: https://github.com/tensorflow/addons/issues/2807
 warnings.warn(
Ignoring the following unexpected models in /tmp/tmpcmf3ezyu/job_working_directory/000/2/home/.local/share/Helixer/models:[].
You can set --model-filepath in Helixer.py if you wish to use these.
Standard Output:
============ CUDA ============CUDA Version 11.8.0Container image Copyright (c) 2016-2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved.This container image and its contents are governed by the NVIDIA Deep Learning Container License.By pulling and using the container, you accept the terms and conditions of this license:https://developer.nvidia.com/ngc/nvidia-deep-learning-container-licenseA copy of this license is made available in this container at /NGC-DL-CONTAINER-LICENSE for your convenience.WARNING: The NVIDIA Driver was not detected. GPU functionality will not be available. Use the NVIDIA Container Toolkit to start this container with GPU support; see https://docs.nvidia.com/datacenter/cloud-native/ .retrived list of available models from https://raw.githubusercontent.com/weberlab-hhu/Helixer/main/resources/model_list.csvsaved model land_plant_v0.3_a_0080.h5 to /tmp/tmpcmf3ezyu/job_working_directory/000/2/home/.local/share/Helixer/modelssaved model vertebrate_v0.3_m_0080.h5 to /tmp/tmpcmf3ezyu/job_working_directory/000/2/home/.local/share/Helixer/modelssaved model fungi_v0.3_a_0100.h5 to /tmp/tmpcmf3ezyu/job_working_directory/000/2/home/.local/share/Helixer/modelssaved model invertebrate_v0.3_m_0100.h5 to /tmp/tmpcmf3ezyu/job_working_directory/000/2/home/.local/share/Helixer/modelsHelixerPost <genome.h5> <predictions.h5> <windowSize> <edgeThresh> <peakThresh> <minCodingLength> <gff>No config file foundretrived list of available models from https://raw.githubusercontent.com/weberlab-hhu/Helixer/main/resources/model_list.csvHelixer.py config: {'batch_size': 8, 'compression': 'gzip', 'config_path': 'config/helixer_config.yaml', 'debug': False, 'edge_threshold': 0.1, 'fasta_path': '/tmp/tmpcmf3ezyu/files/2/0/d/dataset_20de09af-76b1-40e1-b3d2-dc4b783030dc.dat', 'gff_output_path': '/tmp/tmpcmf3ezyu/job_working_directory/000/2/outputs/dataset_24d7c432-03d4-4d92-a83c-0fe657557917.dat', 'lineage': 'land_plant', 'min_coding_length': 100, 'model_filepath': 
'/tmp/tmpcmf3ezyu/job_working_directory/000/2/home/.local/share/Helixer/models/land_plant/land_plant_v0.3_a_0080.h5', 'no_multiprocess': False, 'no_overlap': False, 'overlap_core_length': 80190, 'overlap_offset': 53460, 'peak_threshold': 0.8, 'species': '', 'subsequence_length': 106920, 'temporary_dir': './', 'window_size': 100}Testing whether helixer_post_bin is correctly installedHelixer.py config loaded. Starting FASTA to H5 conversion.storing temporary files under ./tmpi3q8kbke1 expected num of chunks to write in 19994040 bases to hdf5Numerification of 0-720580 of the sequence of sample took 0.24 secs1 Numerified Fasta only Coordinate (seqid: sample, len: 720580) in 0.39 secs1 expected num of chunks to write in 19994040 bases to hdf5Numerification of 0-2030 of the sequence of sample2 took 0.00 secs2 Numerified Fasta only Coordinate (seqid: sample2, len: 2030) in 0.01 secs1 expected num of chunks to write in 19994040 bases to hdf5Numerification of 0-2100 of the sequence of sample3 took 0.00 secs3 Numerified Fasta only Coordinate (seqid: sample3, len: 2100) in 0.01 secs1 expected num of chunks to write in 19994040 bases to hdf5Numerification of 0-7560 of the sequence of sample4 took 0.00 secs4 Numerified Fasta only Coordinate (seqid: sample4, len: 7560) in 0.02 secslogged installed version in place of git commit for geenufflogged installed version in place of git commit for helixerFASTA to H5 conversion done. 
Starting neural network prediction with overlapping.HelixerModel config: {'batch_size': 8, 'calculate_uncertainty': False, 'check_every_nth_batch': 1000000, 'class_weights': 'None', 'clip_norm': 3.0, 'cnn_layers': 1, 'compression': 'gzip', 'core_length': 80190, 'coverage_norm': None, 'coverage_offset': 0.0, 'coverage_weights': False, 'cpus': 8, 'data_dir': None, 'debug': False, 'dropout1': 0.0, 'dropout2': 0.0, 'epochs': 10000, 'eval': False, 'filter_depth': 32, 'fine_tune': False, 'float_precision': 'float32', 'gpu_id': -1, 'input_coverage': False, 'kernel_size': 26, 'large_eval_folder': '', 'learning_rate': 0.0003, 'load_model_path': '/tmp/tmpcmf3ezyu/job_working_directory/000/2/home/.local/share/Helixer/models/land_plant/land_plant_v0.3_a_0080.h5', 'load_predictions': False, 'loss': '', 'lstm_layers': 1, 'nni': False, 'no_utrs': False, 'optimizer': 'adamw', 'overlap': True, 'overlap_offset': 53460, 'patience': 3, 'pool_size': 9, 'post_coverage_hidden_layer': False, 'predict_phase': False, 'prediction_output_path': './tmpi3q8kbke/tmp_predictions_.h5', 'pretrained_model_path': None, 'resume_training': False, 'save_every_check': False, 'save_model_path': './best_model.h5', 'stretch_transition_weights': 0, 'test_data': './tmpi3q8kbke/tmp_species_.h5', 'transition_weights': 'None', 'units': 32, 'val_test_batch_size': 8, 'verbose': True, 'weight_decay': 3.5e-05, 'workers': 1}No err_samples dataset found, correct samples will be set to 0No fully_intergenic_samples dataset found, fully intergenic samples will be set to 0Data config: [{'geenuff_commit': 'commit not found, version: 0.3.2', 'helixer_commit': 'commit not found, version: 0.3.3', 'input_path': '/tmp/tmpcmf3ezyu/files/2/0/d/dataset_20de09af-76b1-40e1-b3d2-dc4b783030dc.dat', 'timestamp': '2024-10-10 09:24:35.923143'}]Test data shape: (20, 106920)Intergenic test seqs: 0.00%Fully correct test seqs: 0.00%Number of devices: 1Current Helixer version: 0.3.3Md5sum of the loaded model: f0e00efcbea83c66b69258d11119a691 
/tmp/tmpcmf3ezyu/job_working_directory/000/2/home/.local/share/Helixer/models/land_plant/land_plant_v0.3_a_0080.h5Model: "model"__________________________________________________________________________________________________ Layer (type) Output Shape Param # Connected to ================================================================================================== main_input (InputLayer) [(None, None, 4)] 0 [] conv1d (Conv1D) (None, None, 96) 4704 ['main_input[0][0]'] batch_normalization (Batch (None, None, 96) 384 ['conv1d[0][0]'] Normalization) conv1d_1 (Conv1D) (None, None, 96) 110688 ['batch_normalization[0][0]'] batch_normalization_1 (Bat (None, None, 96) 384 ['conv1d_1[0][0]'] chNormalization) conv1d_2 (Conv1D) (None, None, 96) 110688 ['batch_normalization_1[0][0]' ] batch_normalization_2 (Bat (None, None, 96) 384 ['conv1d_2[0][0]'] chNormalization) conv1d_3 (Conv1D) (None, None, 96) 110688 ['batch_normalization_2[0][0]' ] reshape (Reshape) (None, None, 864) 0 ['conv1d_3[0][0]'] bidirectional (Bidirection (None, None, 256) 1016832 ['reshape[0][0]'] al) bidirectional_1 (Bidirecti (None, None, 256) 394240 ['bidirectional[0][0]'] onal) bidirectional_2 (Bidirecti (None, None, 256) 394240 ['bidirectional_1[0][0]'] onal) dense (Dense) (None, None, 72) 18504 ['bidirectional_2[0][0]'] tf.split (TFOpLambda) [(None, None, 36), 0 ['dense[0][0]'] (None, None, 36)] reshape_1 (Reshape) (None, None, 9, 4) 0 ['tf.split[0][0]'] reshape_2 (Reshape) (None, None, 9, 4) 0 ['tf.split[0][1]'] genic (Activation) (None, None, 9, 4) 0 ['reshape_1[0][0]'] phase (Activation) (None, None, 9, 4) 0 ['reshape_2[0][0]'] ==================================================================================================Total params: 2161736 (8.25 MB)Trainable params: 2161160 (8.24 MB)Non-trainable params: 576 (2.25 KB)__________________________________________________________________________________________________HMM Config Splicing Flags: U:true US:true S:true SC:true C:true CS:true 
S:true SU:true U:true Splicing - Weights: Donor 1, Acceptor 1 Splicing - Fixed Penalties: U2-GT-AG 0, U2-GT-AC 0 U12-GT-AG 0 U12-AT-AC 0 Coding - Weights: Start 1000, Stop 1000 Phase Mode: Implementation 1, Dilute to Total, Retention: 0.2Sequences for Species - 0 BP_Extractor for Sequence sample - ID 0Forward for Sequence sample - ID 0 ML v HP Class Intergenic UTR Coding Intron ML v HP Phase Non Coding Phase 0 Phase 1 Phase 2 Intergenic 613530 131 479 418 Non Coding 689510 1069 1104 1091 UTR 238 5141 87 120 Phase 0 230 9046 2 6 Coding 1243 82 29647 264 Phase 1 226 8 9014 2 Intron 1459 120 162 67459 Phase 2 239 2 5 9026 Precision Recall F1 Precision Recall F1 Intergenic 0.99523 0.99833 0.99678 Non Coding 0.99899 0.99529 0.99714 UTR 0.93917 0.92034 0.92966 Phase 0 0.89343 0.97436 0.93214 Coding 0.97603 0.94913 0.96239 Phase 1 0.89027 0.97449 0.93048 Intron 0.98825 0.97484 0.98150 Phase 2 0.89146 0.97347 0.93066 Subgenic 0.98449 0.96684 0.97559 Coding 0.89172 0.97411 0.93109 Genic 0.98211 0.96439 0.97317 Reverse for Sequence sample - ID 0 ML v HP Class Intergenic UTR Coding Intron ML v HP Phase Non Coding Phase 0 Phase 1 Phase 2 Intergenic 555814 209 108 776 Non Coding 659468 985 971 987 UTR 454 9649 137 214 Phase 0 670 18679 9 31 Coding 2123 49 58497 691 Phase 1 689 33 18702 12 Intron 8256 265 391 82947 Phase 2 620 14 29 18681 Precision Recall F1 Precision Recall F1 Intergenic 0.98088 0.99804 0.98939 Non Coding 0.99701 0.99556 0.99628 UTR 0.94858 0.92300 0.93562 Phase 0 0.94764 0.96338 0.95545 Coding 0.98924 0.95334 0.97096 Phase 1 0.94881 0.96224 0.95548 Intron 0.98014 0.90298 0.93998 Phase 2 0.94774 0.96573 0.95665 Subgenic 0.98388 0.92315 0.95255 Coding 0.94807 0.96378 0.95586 Genic 0.98155 0.92314 0.95145 BP_Extractor for Sequence sample2 - ID 1Forward for Sequence sample2 - ID 1 ML v HP Class Intergenic UTR Coding Intron ML v HP Phase Non Coding Phase 0 Phase 1 Phase 2 Intergenic 2026 0 0 0 Non Coding 2029 0 0 0 UTR 0 0 0 0 Phase 0 1 0 0 0 Coding 4 0 0 0 Phase 1 
0 0 0 0 Intron 0 0 0 0 Phase 2 0 0 0 0 Precision Recall F1 Precision Recall F1 Intergenic 0.99803 1.00000 0.99901 Non Coding 0.99951 1.00000 0.99975 UTR NaN NaN NaN Phase 0 NaN 0.00000 0.00000 Coding NaN 0.00000 0.00000 Phase 1 NaN NaN NaN Intron NaN NaN NaN Phase 2 NaN NaN NaN Subgenic NaN 0.00000 0.00000 Coding NaN 0.00000 0.00000 Genic NaN 0.00000 0.00000 Reverse for Sequence sample2 - ID 1 ML v HP Class Intergenic UTR Coding Intron ML v HP Phase Non Coding Phase 0 Phase 1 Phase 2 Intergenic 2030 0 0 0 Non Coding 2030 0 0 0 UTR 0 0 0 0 Phase 0 0 0 0 0 Coding 0 0 0 0 Phase 1 0 0 0 0 Intron 0 0 0 0 Phase 2 0 0 0 0 Precision Recall F1 Precision Recall F1 Intergenic 1.00000 1.00000 1.00000 Non Coding 1.00000 1.00000 1.00000 UTR NaN NaN NaN Phase 0 NaN NaN NaN Coding NaN NaN NaN Phase 1 NaN NaN NaN Intron NaN NaN NaN Phase 2 NaN NaN NaN Subgenic NaN NaN NaN Coding NaN NaN NaN Genic NaN NaN NaN BP_Extractor for Sequence sample3 - ID 2Forward for Sequence sample3 - ID 2 ML v HP Class Intergenic UTR Coding Intron ML v HP Phase Non Coding Phase 0 Phase 1 Phase 2 Intergenic 2098 0 0 0 Non Coding 2099 0 0 0 UTR 0 0 0 0 Phase 0 1 0 0 0 Coding 2 0 0 0 Phase 1 0 0 0 0 Intron 0 0 0 0 Phase 2 0 0 0 0 Precision Recall F1 Precision Recall F1 Intergenic 0.99905 1.00000 0.99952 Non Coding 0.99952 1.00000 0.99976 UTR NaN NaN NaN Phase 0 NaN 0.00000 0.00000 Coding NaN 0.00000 0.00000 Phase 1 NaN NaN NaN Intron NaN NaN NaN Phase 2 NaN NaN NaN Subgenic NaN 0.00000 0.00000 Coding NaN 0.00000 0.00000 Genic NaN 0.00000 0.00000 Reverse for Sequence sample3 - ID 2 ML v HP Class Intergenic UTR Coding Intron ML v HP Phase Non Coding Phase 0 Phase 1 Phase 2 Intergenic 2100 0 0 0 Non Coding 2100 0 0 0 UTR 0 0 0 0 Phase 0 0 0 0 0 Coding 0 0 0 0 Phase 1 0 0 0 0 Intron 0 0 0 0 Phase 2 0 0 0 0 Precision Recall F1 Precision Recall F1 Intergenic 1.00000 1.00000 1.00000 Non Coding 1.00000 1.00000 1.00000 UTR NaN NaN NaN Phase 0 NaN NaN NaN Coding NaN NaN NaN Phase 1 NaN NaN NaN Intron NaN NaN NaN 
Phase 2 NaN NaN NaN Subgenic NaN NaN NaN Coding NaN NaN NaN Genic NaN NaN NaN BP_Extractor for Sequence sample4 - ID 3Forward for Sequence sample4 - ID 3 ML v HP Class Intergenic UTR Coding Intron ML v HP Phase Non Coding Phase 0 Phase 1 Phase 2 Intergenic 7535 0 0 0 Non Coding 7553 0 0 0 UTR 3 0 0 0 Phase 0 2 0 0 0 Coding 8 0 0 0 Phase 1 2 0 0 0 Intron 14 0 0 0 Phase 2 3 0 0 0 Precision Recall F1 Precision Recall F1 Intergenic 0.99669 1.00000 0.99834 Non Coding 0.99907 1.00000 0.99954 UTR NaN 0.00000 0.00000 Phase 0 NaN 0.00000 0.00000 Coding NaN 0.00000 0.00000 Phase 1 NaN 0.00000 0.00000 Intron NaN 0.00000 0.00000 Phase 2 NaN 0.00000 0.00000 Subgenic NaN 0.00000 0.00000 Coding NaN 0.00000 0.00000 Genic NaN 0.00000 0.00000 Reverse for Sequence sample4 - ID 3 ML v HP Class Intergenic UTR Coding Intron ML v HP Phase Non Coding Phase 0 Phase 1 Phase 2 Intergenic 4480 19 0 0 Non Coding 5923 2 6 4 UTR 40 453 5 0 Phase 0 17 524 0 0 Coding 321 0 1573 1 Phase 1 17 0 520 0 Intron 0 0 0 668 Phase 2 25 0 0 522 Precision Recall F1 Precision Recall F1 Intergenic 0.92543 0.99578 0.95931 Non Coding 0.99014 0.99798 0.99404 UTR 0.95975 0.90964 0.93402 Phase 0 0.99620 0.96858 0.98219 Coding 0.99683 0.83008 0.90585 Phase 1 0.98859 0.96834 0.97836 Intron 0.99851 1.00000 0.99925 Phase 2 0.99240 0.95430 0.97297 Subgenic 0.99733 0.87437 0.93181 Coding 0.99240 0.96369 0.97783 Genic 0.99081 0.88010 0.93218 Forward for Species - 0 ML v HP Class Intergenic UTR Coding Intron ML v HP Phase Non Coding Phase 0 Phase 1 Phase 2 Intergenic 625189 131 479 418 Non Coding 701191 1069 1104 1091 UTR 241 5141 87 120 Phase 0 234 9046 2 6 Coding 1257 82 29647 264 Phase 1 228 8 9014 2 Intron 1473 120 162 67459 Phase 2 242 2 5 9026 Precision Recall F1 Precision Recall F1 Intergenic 0.99527 0.99836 0.99681 Non Coding 0.99900 0.99537 0.99718 UTR 0.93917 0.91984 0.92940 Phase 0 0.89343 0.97394 0.93195 Coding 0.97603 0.94870 0.96217 Phase 1 0.89027 0.97428 0.93038 Intron 0.98825 0.97464 0.98140 Phase 2 0.89146 
0.97315 0.93052 Subgenic 0.98449 0.96658 0.97545 Coding 0.89172 0.97379 0.93095 Genic 0.98211 0.96411 0.97303 Reverse for Species - 0 ML v HP Class Intergenic UTR Coding Intron ML v HP Phase Non Coding Phase 0 Phase 1 Phase 2 Intergenic 564424 228 108 776 Non Coding 669521 987 977 991 UTR 494 10102 142 214 Phase 0 687 19203 9 31 Coding 2444 49 60070 692 Phase 1 706 33 19222 12 Intron 8256 265 391 83615 Phase 2 645 14 29 19203 Precision Recall F1 Precision Recall F1 Intergenic 0.98055 0.99803 0.98922 Non Coding 0.99697 0.99561 0.99629 UTR 0.94908 0.92239 0.93554 Phase 0 0.94891 0.96352 0.95616 Coding 0.98944 0.94965 0.96914 Phase 1 0.94984 0.96240 0.95608 Intron 0.98028 0.90368 0.94042 Phase 2 0.94891 0.96541 0.95709 Subgenic 0.98409 0.92235 0.95222 Coding 0.94922 0.96378 0.95644 Genic 0.98171 0.92235 0.95110 Total for Species - 0 ML v HP Class Intergenic UTR Coding Intron ML v HP Phase Non Coding Phase 0 Phase 1 Phase 2 Intergenic 1189613 359 587 1194 Non Coding 1370712 2056 2081 2082 UTR 735 15243 229 334 Phase 0 921 28249 11 37 Coding 3701 131 89717 956 Phase 1 934 41 28236 14 Intron 9729 385 553 151074 Phase 2 887 16 34 28229 Precision Recall F1 Precision Recall F1 Intergenic 0.98823 0.99820 0.99319 Non Coding 0.99800 0.99548 0.99674 UTR 0.94571 0.92153 0.93346 Phase 0 0.93041 0.96684 0.94827 Coding 0.98497 0.94934 0.96682 Phase 1 0.92998 0.96616 0.94772 Intron 0.98382 0.93405 0.95829 Phase 2 0.92975 0.96787 0.94843 Subgenic 0.98425 0.93969 0.96145 Coding 0.93004 0.96696 0.94814 Genic 0.98187 0.93859 0.95974 Total: 482642bp across 25 windowsNonestarting to load test data into memory..For h5 starting with species = b'':x shape: (20, 106920, 4)Data loading of 20 (total so far 20) samples of data/X into memory took 0.07 secsCompressed data size of data/X is at least 0.0008 GBsetting self.n_seqs to 20, bc that is len of data/X0 / 81 / 82 / 83 / 84 / 85 / 86 / 87 / 8Neural network prediction done. 
Starting post processing.
Helixer successfully finished the annotation of /tmp/tmpcmf3ezyu/files/2/0/d/dataset_20de09af-76b1-40e1-b3d2-dc4b783030dc.dat in 0.07 hours. GFF file written to /tmp/tmpcmf3ezyu/job_working_directory/000/2/outputs/dataset_24d7c432-03d4-4d92-a83c-0fe657557917.dat.
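The precision, recall and F1 values in the Helixer post-processing output can be reproduced from the confusion-matrix counts printed next to them. The sketch below does this for the "Reverse for Sequence sample4 - ID 3" block. It assumes precision is the diagonal count divided by its column sum and recall is the diagonal count divided by its row sum; that orientation is inferred from the fact that it reproduces the logged numbers, not taken from HelixerPost documentation. The function name `per_class_metrics` is hypothetical.

```python
import math
from typing import Dict, List

CLASSES = ["Intergenic", "UTR", "Coding", "Intron"]

# Counts copied from the "Reverse for Sequence sample4 - ID 3" class matrix.
MATRIX = [
    [4480, 19, 0, 0],
    [40, 453, 5, 0],
    [321, 0, 1573, 1],
    [0, 0, 0, 668],
]


def per_class_metrics(matrix: List[List[int]], names: List[str]) -> Dict[str, Dict[str, float]]:
    """Compute precision, recall and F1 per class from a square count matrix.

    Assumption (inferred from the log): precision divides by the column sum,
    recall divides by the row sum. An empty denominator yields NaN.
    """
    metrics = {}
    for i, name in enumerate(names):
        true_pos = matrix[i][i]
        row_sum = sum(matrix[i])
        col_sum = sum(row[i] for row in matrix)
        precision = true_pos / col_sum if col_sum else math.nan
        recall = true_pos / row_sum if row_sum else math.nan
        # F1 written as 2*TP / (row_sum + col_sum), so that a class with
        # recall 0 and undefined precision still gets F1 = 0.
        denom = row_sum + col_sum
        f1 = 2 * true_pos / denom if denom else math.nan
        metrics[name] = {"precision": precision, "recall": recall, "f1": f1}
    return metrics


metrics = per_class_metrics(MATRIX, CLASSES)
for name in CLASSES:
    m = metrics[name]
    print(f"{name:<10} {m['precision']:.5f} {m['recall']:.5f} {m['f1']:.5f}")
```

Writing F1 in terms of the raw counts mirrors the pattern seen for the short sequences in the log, where a class with no predicted positions shows `NaN` precision but `0.00000` recall and F1, while a class absent from both rows and columns shows `NaN` throughout.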
2024-10-10 09:28:44 ERROR: Warning message:
package ‘ggplot2’ was built under R version 4.4.1
Warning message:
The `size` argument of `element_line()` is deprecated as of ggplot2 3.4.0.
ℹ Please use the `linewidth` argument instead.
Standard Output:
2024-10-10 09:23:52 INFO: ***** Start a BUSCO v5.7.1 analysis, current time: 10/10/2024 09:23:52 *****2024-10-10 09:23:52 INFO: Configuring BUSCO with local environment2024-10-10 09:23:52 INFO: Running genome mode2024-10-10 09:23:56 INFO: Input file is /tmp/tmpcmf3ezyu/files/2/0/d/dataset_20de09af-76b1-40e1-b3d2-dc4b783030dc.dat2024-10-10 09:23:56 INFO: No lineage specified. Running lineage auto selector.2024-10-10 09:23:56 INFO: ***** Starting Auto Select Lineage ***** This process runs BUSCO on the generic lineage datasets for the domains archaea, bacteria and eukaryota. Once the optimal domain is selected, BUSCO automatically attempts to find the most appropriate BUSCO dataset to use based on phylogenetic placement. --auto-lineage-euk and --auto-lineage-prok are also available if you know your input assembly is, or is not, an eukaryote. See the user guide for more information. A reminder: Busco evaluations are valid when an appropriate dataset is used, i.e., the dataset belongs to the lineage of the species to test. Because of overlapping markers/spurious matches among domains, busco matches in another domain do not necessarily mean that your genome/proteome contains sequences from this domain. 
However, a high busco score in multiple domains might help you identify possible contaminations.2024-10-10 09:23:56 WARNING: Option evalue was provided but is not used in the selected run mode, prok_genome_prod2024-10-10 09:23:56 WARNING: Option limit was provided but is not used in the selected run mode, prok_genome_prod2024-10-10 09:23:56 INFO: Running BUSCO using lineage dataset archaea_odb10 (prokaryota, 2021-02-23)2024-10-10 09:23:56 INFO: Running 1 job(s) on bbtools, starting at 10/10/2024 09:23:562024-10-10 09:23:58 INFO: [bbtools] 1 of 1 task(s) completed2024-10-10 09:23:58 INFO: ***** Run Prodigal on input to predict and extract genes *****2024-10-10 09:23:58 INFO: Running Prodigal with genetic code 11 in single mode2024-10-10 09:23:58 INFO: Running 1 job(s) on prodigal, starting at 10/10/2024 09:23:582024-10-10 09:24:01 INFO: [prodigal] 1 of 1 task(s) completed2024-10-10 09:24:01 INFO: Genetic code 11 selected as optimal2024-10-10 09:24:01 INFO: ***** Run HMMER on gene sequences *****2024-10-10 09:24:01 INFO: Running 194 job(s) on hmmsearch, starting at 10/10/2024 09:24:012024-10-10 09:24:05 INFO: [hmmsearch] 20 of 194 task(s) completed2024-10-10 09:24:07 INFO: [hmmsearch] 39 of 194 task(s) completed2024-10-10 09:24:10 INFO: [hmmsearch] 59 of 194 task(s) completed2024-10-10 09:24:13 INFO: [hmmsearch] 78 of 194 task(s) completed2024-10-10 09:24:17 INFO: [hmmsearch] 97 of 194 task(s) completed2024-10-10 09:24:21 INFO: [hmmsearch] 117 of 194 task(s) completed2024-10-10 09:24:24 INFO: [hmmsearch] 136 of 194 task(s) completed2024-10-10 09:24:27 INFO: [hmmsearch] 156 of 194 task(s) completed2024-10-10 09:24:30 INFO: [hmmsearch] 175 of 194 task(s) completed2024-10-10 09:24:33 INFO: [hmmsearch] 194 of 194 task(s) completed2024-10-10 09:24:33 INFO: Results: C:0.5%[S:0.5%,D:0.0%],F:0.0%,M:99.5%,n:194 2024-10-10 09:24:33 WARNING: Option evalue was provided but is not used in the selected run mode, prok_genome_prod2024-10-10 09:24:33 WARNING: Option limit was 
provided but is not used in the selected run mode, prok_genome_prod2024-10-10 09:24:33 INFO: Running BUSCO using lineage dataset bacteria_odb10 (prokaryota, 2020-03-06)2024-10-10 09:24:33 INFO: Skipping BBTools as already run2024-10-10 09:24:33 INFO: ***** Run Prodigal on input to predict and extract genes *****2024-10-10 09:24:33 INFO: Running Prodigal with genetic code 4 in single mode2024-10-10 09:24:33 INFO: Running 1 job(s) on prodigal, starting at 10/10/2024 09:24:332024-10-10 09:24:36 INFO: [prodigal] 1 of 1 task(s) completed2024-10-10 09:24:36 INFO: Genetic code 4 selected as optimal2024-10-10 09:24:36 INFO: ***** Run HMMER on gene sequences *****2024-10-10 09:24:37 INFO: Running 124 job(s) on hmmsearch, starting at 10/10/2024 09:24:372024-10-10 09:24:39 INFO: [hmmsearch] 13 of 124 task(s) completed2024-10-10 09:24:41 INFO: [hmmsearch] 25 of 124 task(s) completed2024-10-10 09:24:43 INFO: [hmmsearch] 38 of 124 task(s) completed2024-10-10 09:24:45 INFO: [hmmsearch] 50 of 124 task(s) completed2024-10-10 09:24:47 INFO: [hmmsearch] 63 of 124 task(s) completed2024-10-10 09:24:49 INFO: [hmmsearch] 75 of 124 task(s) completed2024-10-10 09:24:51 INFO: [hmmsearch] 87 of 124 task(s) completed2024-10-10 09:24:53 INFO: [hmmsearch] 100 of 124 task(s) completed2024-10-10 09:24:55 INFO: [hmmsearch] 112 of 124 task(s) completed2024-10-10 09:24:57 INFO: [hmmsearch] 124 of 124 task(s) completed2024-10-10 09:24:57 WARNING: BUSCO did not find any match. 
Make sure to check the log files if this is unexpected.2024-10-10 09:24:57 INFO: Results: C:0.0%[S:0.0%,D:0.0%],F:0.0%,M:100.0%,n:124 2024-10-10 09:24:57 INFO: Running BUSCO using lineage dataset eukaryota_odb10 (eukaryota, 2020-09-10)2024-10-10 09:24:57 INFO: Skipping BBTools as already run2024-10-10 09:24:57 INFO: Running 1 job(s) on makeblastdb, starting at 10/10/2024 09:24:572024-10-10 09:24:58 INFO: Creating BLAST database with input file2024-10-10 09:24:58 INFO: [makeblastdb] 1 of 1 task(s) completed2024-10-10 09:24:58 INFO: Running a BLAST search for BUSCOs against created database2024-10-10 09:24:58 INFO: Running 1 job(s) on tblastn, starting at 10/10/2024 09:24:582024-10-10 09:25:00 INFO: [tblastn] 1 of 1 task(s) completed2024-10-10 09:25:00 INFO: Running Augustus gene predictor on BLAST search results.2024-10-10 09:25:00 INFO: Running Augustus prediction using fly as species:2024-10-10 09:25:00 INFO: Running 6 job(s) on augustus, starting at 10/10/2024 09:25:002024-10-10 09:25:04 INFO: [augustus] 1 of 6 task(s) completed2024-10-10 09:25:06 INFO: [augustus] 2 of 6 task(s) completed2024-10-10 09:25:10 INFO: [augustus] 3 of 6 task(s) completed2024-10-10 09:25:13 INFO: [augustus] 4 of 6 task(s) completed2024-10-10 09:25:15 INFO: [augustus] 5 of 6 task(s) completed2024-10-10 09:25:17 INFO: [augustus] 6 of 6 task(s) completed2024-10-10 09:25:17 INFO: Extracting predicted proteins...2024-10-10 09:25:17 INFO: ***** Run HMMER on gene sequences *****2024-10-10 09:25:17 INFO: Running 6 job(s) on hmmsearch, starting at 10/10/2024 09:25:172024-10-10 09:25:18 INFO: [hmmsearch] 1 of 6 task(s) completed2024-10-10 09:25:18 INFO: [hmmsearch] 2 of 6 task(s) completed2024-10-10 09:25:18 INFO: [hmmsearch] 3 of 6 task(s) completed2024-10-10 09:25:19 INFO: [hmmsearch] 4 of 6 task(s) completed2024-10-10 09:25:19 INFO: [hmmsearch] 5 of 6 task(s) completed2024-10-10 09:25:19 INFO: [hmmsearch] 6 of 6 task(s) completed2024-10-10 09:25:19 INFO: 37 exons in total2024-10-10 09:25:19 
INFO: Results: C:1.2%[S:1.2%,D:0.0%],F:0.0%,M:98.8%,n:255 2024-10-10 09:25:19 INFO: Starting second step of analysis. The gene predictor Augustus is retrained using the results from the initial run to yield more accurate results.2024-10-10 09:25:19 INFO: Extracting missing and fragmented buscos from the file ancestral_variants...2024-10-10 09:25:20 INFO: Running a BLAST search for BUSCOs against created database2024-10-10 09:25:20 INFO: Running 1 job(s) on tblastn, starting at 10/10/2024 09:25:202024-10-10 09:25:32 INFO: [tblastn] 1 of 1 task(s) completed2024-10-10 09:25:32 INFO: Converting predicted genes to short genbank files2024-10-10 09:25:32 INFO: Running 3 job(s) on gff2gbSmallDNA.pl, starting at 10/10/2024 09:25:322024-10-10 09:25:32 INFO: [gff2gbSmallDNA.pl] 1 of 3 task(s) completed2024-10-10 09:25:32 INFO: [gff2gbSmallDNA.pl] 2 of 3 task(s) completed2024-10-10 09:25:32 INFO: [gff2gbSmallDNA.pl] 3 of 3 task(s) completed2024-10-10 09:25:32 INFO: All files converted to short genbank files, now training Augustus using Single-Copy Complete BUSCOs2024-10-10 09:25:32 INFO: Running 1 job(s) on new_species.pl, starting at 10/10/2024 09:25:322024-10-10 09:25:33 INFO: [new_species.pl] 1 of 1 task(s) completed2024-10-10 09:25:33 INFO: Running 1 job(s) on etraining, starting at 10/10/2024 09:25:332024-10-10 09:25:34 INFO: [etraining] 1 of 1 task(s) completed2024-10-10 09:25:34 INFO: Re-running Augustus with the new metaparameters, number of target BUSCOs: 2522024-10-10 09:25:34 INFO: Running Augustus gene predictor on BLAST search results.2024-10-10 09:25:34 INFO: Running Augustus prediction using BUSCO_busco_galaxy as species:2024-10-10 09:25:34 INFO: Running 6 job(s) on augustus, starting at 10/10/2024 09:25:342024-10-10 09:25:39 INFO: [augustus] 1 of 6 task(s) completed2024-10-10 09:25:40 INFO: [augustus] 2 of 6 task(s) completed2024-10-10 09:25:42 INFO: [augustus] 3 of 6 task(s) completed2024-10-10 09:25:45 INFO: [augustus] 4 of 6 task(s) completed2024-10-10 
09:25:47 INFO: [augustus] 5 of 6 task(s) completed2024-10-10 09:25:48 INFO: [augustus] 6 of 6 task(s) completed2024-10-10 09:25:48 INFO: Extracting predicted proteins...2024-10-10 09:25:48 INFO: ***** Run HMMER on gene sequences *****2024-10-10 09:25:48 INFO: Running 6 job(s) on hmmsearch, starting at 10/10/2024 09:25:482024-10-10 09:25:49 INFO: [hmmsearch] 1 of 6 task(s) completed2024-10-10 09:25:49 INFO: [hmmsearch] 2 of 6 task(s) completed2024-10-10 09:25:50 INFO: [hmmsearch] 3 of 6 task(s) completed2024-10-10 09:25:50 INFO: [hmmsearch] 4 of 6 task(s) completed2024-10-10 09:25:50 INFO: [hmmsearch] 5 of 6 task(s) completed2024-10-10 09:25:50 INFO: [hmmsearch] 6 of 6 task(s) completed2024-10-10 09:25:50 INFO: 37 exons in total2024-10-10 09:25:50 INFO: Results: C:1.6%[S:1.6%,D:0.0%],F:0.0%,M:98.4%,n:255 2024-10-10 09:25:50 INFO: eukaryota_odb10 selected2024-10-10 09:25:50 INFO: ***** Searching tree for chosen lineage to find best taxonomic match *****2024-10-10 09:25:50 INFO: Extract markers...2024-10-10 09:25:50 INFO: Place the markers on the reference tree...2024-10-10 09:25:50 INFO: Running 1 job(s) on sepp, starting at 10/10/2024 09:25:502024-10-10 09:28:39 INFO: [sepp] 1 of 1 task(s) completed2024-10-10 09:28:40 INFO: Not enough markers were placed on the tree (1). Root lineage eukaryota is kept2024-10-10 09:28:40 INFO: --------------------------------------------------- |Results from dataset eukaryota_odb10 | --------------------------------------------------- |C:1.6%[S:1.6%,D:0.0%],F:0.0%,M:98.4%,n:255 | |4 Complete BUSCOs (C) | |4 Complete and single-copy BUSCOs (S) | |0 Complete and duplicated BUSCOs (D) | |0 Fragmented BUSCOs (F) | |251 Missing BUSCOs (M) | |255 Total BUSCO groups searched | ---------------------------------------------------2024-10-10 09:28:40 INFO: BUSCO analysis done with WARNING(s). 
Total running time: 284 seconds***** Summary of warnings: *****2024-10-10 09:23:56 WARNING:busco.BuscoConfig Option evalue was provided but is not used in the selected run mode, prok_genome_prod2024-10-10 09:23:56 WARNING:busco.BuscoConfig Option limit was provided but is not used in the selected run mode, prok_genome_prod2024-10-10 09:24:33 WARNING:busco.BuscoConfig Option evalue was provided but is not used in the selected run mode, prok_genome_prod2024-10-10 09:24:33 WARNING:busco.BuscoConfig Option limit was provided but is not used in the selected run mode, prok_genome_prod2024-10-10 09:24:57 WARNING:busco.busco_tools.hmmer BUSCO did not find any match. Make sure to check the log files if this is unexpected.2024-10-10 09:28:40 INFO: Results written in /tmp/tmpcmf3ezyu/job_working_directory/000/3/working/busco_galaxy2024-10-10 09:28:40 INFO: For assistance with interpreting the results, please consult the userguide: https://busco.ezlab.org/busco_userguide.html2024-10-10 09:28:40 INFO: Visit this page https://gitlab.com/ezlab/busco#how-to-cite-busco to see how to cite BUSCOtotal 40drwxr-xr-x 8 1001 118 4096 Oct 10 09:25 augustus_outputdrwxr-xr-x 3 1001 118 4096 Oct 10 09:25 blast_outputdrwxr-xr-x 5 1001 118 4096 Oct 10 09:24 busco_sequences-rw-r--r-- 1 1001 118 5831 Oct 10 09:25 full_table.tsvdrwxr-xr-x 4 1001 118 4096 Oct 10 09:25 hmmer_output-rw-r--r-- 1 1001 118 3548 Oct 10 09:25 missing_busco_list.tsvdrwxr-xr-x 2 1001 118 4096 Oct 10 09:28 placement_files-rw-r--r-- 1 1001 118 3180 Oct 10 09:25 short_summary.json-rw-r--r-- 1 1001 118 1078 Oct 10 09:25 short_summary.txt2024-10-10 09:28:41 INFO: ****************** Start plot generation at 10/10/2024 09:28:41 ******************2024-10-10 09:28:41 INFO: Load data ...2024-10-10 09:28:41 INFO: Loaded BUSCO_summaries/short_summary.specific.eukaryota_odb10.busco_galaxy.txt successfully2024-10-10 09:28:41 INFO: Generate the R code ...2024-10-10 09:28:41 INFO: Run the R code ...2024-10-10 09:28:44 INFO: [1] "Plotting 
the figure ..."
[1] "Done"
2024-10-10 09:28:44 INFO: Plot generation done. Total running time: 3.167226552963257 seconds
2024-10-10 09:28:44 INFO: Results written in BUSCO_summaries/
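The one-line BUSCO score strings in the log above (for example `C:1.6%[S:1.6%,D:0.0%],F:0.0%,M:98.4%,n:255`) can be pulled out of a log line for comparison across runs. The sketch below uses only the Python standard library and is not part of BUSCO; `SUMMARY_RE` and `parse_busco_summary` are names invented for this example.

```python
import re
from typing import Dict, Union

# Matches the compact score notation printed after "Results:" in the log.
SUMMARY_RE = re.compile(
    r"C:(?P<C>[\d.]+)%\[S:(?P<S>[\d.]+)%,D:(?P<D>[\d.]+)%\],"
    r"F:(?P<F>[\d.]+)%,M:(?P<M>[\d.]+)%,n:(?P<n>\d+)"
)


def parse_busco_summary(line: str) -> Dict[str, Union[float, int]]:
    """Extract the C/S/D/F/M percentages and the marker count n from a log line."""
    match = SUMMARY_RE.search(line)
    if match is None:
        raise ValueError(f"no BUSCO summary found in: {line!r}")
    fields: Dict[str, Union[float, int]] = {
        key: float(value) for key, value in match.groupdict().items() if key != "n"
    }
    fields["n"] = int(match.group("n"))
    return fields


if __name__ == "__main__":
    line = (
        "2024-10-10 09:25:50 INFO: Results: "
        "C:1.6%[S:1.6%,D:0.0%],F:0.0%,M:98.4%,n:255"
    )
    summary = parse_busco_summary(line)
    print(summary)
    # Cross-check against the counts in the results box: 4 complete of 255 groups.
    print(round(100 * 4 / summary["n"], 1))
```

The percentages agree with the counts in the results box: 4 of 255 groups complete and 251 of 255 missing.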
2024-10-10 17:16:31.094407: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.2024-10-10 17:16:31.886043: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT2024-10-10 17:16:37.746447: W tensorflow/tsl/framework/cpu_allocator_impl.cc:83] Allocation of 205286400 exceeds 10% of free system memory.2024-10-10 17:16:38.045402: W tensorflow/tsl/framework/cpu_allocator_impl.cc:83] Allocation of 205286400 exceeds 10% of free system memory.2024-10-10 17:16:40.463150: W tensorflow/tsl/framework/cpu_allocator_impl.cc:83] Allocation of 205286400 exceeds 10% of free system memory.2024-10-10 17:16:42.768220: W tensorflow/tsl/framework/cpu_allocator_impl.cc:83] Allocation of 205286400 exceeds 10% of free system memory.2024-10-10 17:16:45.359623: W tensorflow/tsl/framework/cpu_allocator_impl.cc:83] Allocation of 205286400 exceeds 10% of free system memory./usr/local/lib/python3.8/dist-packages/tensorflow_addons/utils/tfa_eol_msg.py:23: UserWarning: TensorFlow Addons (TFA) has ended development and introduction of new features.TFA has entered a minimal maintenance and release mode until a planned end of life in May 2024.Please modify downstream libraries to take dependencies from other repositories in our TensorFlow community (e.g. Keras, Keras-CV, and Keras-NLP). For more information see: https://github.com/tensorflow/addons/issues/2807 warnings.warn(Ignoring the following unexpected models in /tmp/tmp9tr0sopf/job_working_directory/000/2/home/.local/share/Helixer/models:[].You can set --model-filepath in Helixer.py if you wish to use these.
Standard Output:
============ CUDA ============CUDA Version 11.8.0Container image Copyright (c) 2016-2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved.This container image and its contents are governed by the NVIDIA Deep Learning Container License.By pulling and using the container, you accept the terms and conditions of this license:https://developer.nvidia.com/ngc/nvidia-deep-learning-container-licenseA copy of this license is made available in this container at /NGC-DL-CONTAINER-LICENSE for your convenience.WARNING: The NVIDIA Driver was not detected. GPU functionality will not be available. Use the NVIDIA Container Toolkit to start this container with GPU support; see https://docs.nvidia.com/datacenter/cloud-native/ .retrived list of available models from https://raw.githubusercontent.com/weberlab-hhu/Helixer/main/resources/model_list.csvsaved model land_plant_v0.3_a_0080.h5 to /tmp/tmp9tr0sopf/job_working_directory/000/2/home/.local/share/Helixer/modelssaved model vertebrate_v0.3_m_0080.h5 to /tmp/tmp9tr0sopf/job_working_directory/000/2/home/.local/share/Helixer/modelssaved model fungi_v0.3_a_0100.h5 to /tmp/tmp9tr0sopf/job_working_directory/000/2/home/.local/share/Helixer/modelssaved model invertebrate_v0.3_m_0100.h5 to /tmp/tmp9tr0sopf/job_working_directory/000/2/home/.local/share/Helixer/modelsHelixerPost <genome.h5> <predictions.h5> <windowSize> <edgeThresh> <peakThresh> <minCodingLength> <gff>No config file foundretrived list of available models from https://raw.githubusercontent.com/weberlab-hhu/Helixer/main/resources/model_list.csvHelixer.py config: {'batch_size': 8, 'compression': 'gzip', 'config_path': 'config/helixer_config.yaml', 'debug': False, 'edge_threshold': 0.1, 'fasta_path': '/tmp/tmp9tr0sopf/files/6/8/6/dataset_68696577-4bbd-403e-9d45-66006d769a89.dat', 'gff_output_path': '/tmp/tmp9tr0sopf/job_working_directory/000/2/outputs/dataset_94351e8a-63e9-44ac-932f-86e84f76fa16.dat', 'lineage': 'land_plant', 'min_coding_length': 100, 'model_filepath': 
'/tmp/tmp9tr0sopf/job_working_directory/000/2/home/.local/share/Helixer/models/land_plant/land_plant_v0.3_a_0080.h5', 'no_multiprocess': False, 'no_overlap': False, 'overlap_core_length': 80190, 'overlap_offset': 53460, 'peak_threshold': 0.8, 'species': '', 'subsequence_length': 106920, 'temporary_dir': './', 'window_size': 100}Testing whether helixer_post_bin is correctly installedHelixer.py config loaded. Starting FASTA to H5 conversion.storing temporary files under ./tmpp0uh165t1 expected num of chunks to write in 19994040 bases to hdf5Numerification of 0-720580 of the sequence of sample took 0.25 secs1 Numerified Fasta only Coordinate (seqid: sample, len: 720580) in 0.41 secs1 expected num of chunks to write in 19994040 bases to hdf5Numerification of 0-2030 of the sequence of sample2 took 0.00 secs2 Numerified Fasta only Coordinate (seqid: sample2, len: 2030) in 0.02 secs1 expected num of chunks to write in 19994040 bases to hdf5Numerification of 0-2100 of the sequence of sample3 took 0.00 secs3 Numerified Fasta only Coordinate (seqid: sample3, len: 2100) in 0.01 secs1 expected num of chunks to write in 19994040 bases to hdf5Numerification of 0-7560 of the sequence of sample4 took 0.00 secs4 Numerified Fasta only Coordinate (seqid: sample4, len: 7560) in 0.02 secslogged installed version in place of git commit for geenufflogged installed version in place of git commit for helixerFASTA to H5 conversion done. 
Starting neural network prediction with overlapping.HelixerModel config: {'batch_size': 8, 'calculate_uncertainty': False, 'check_every_nth_batch': 1000000, 'class_weights': 'None', 'clip_norm': 3.0, 'cnn_layers': 1, 'compression': 'gzip', 'core_length': 80190, 'coverage_norm': None, 'coverage_offset': 0.0, 'coverage_weights': False, 'cpus': 8, 'data_dir': None, 'debug': False, 'dropout1': 0.0, 'dropout2': 0.0, 'epochs': 10000, 'eval': False, 'filter_depth': 32, 'fine_tune': False, 'float_precision': 'float32', 'gpu_id': -1, 'input_coverage': False, 'kernel_size': 26, 'large_eval_folder': '', 'learning_rate': 0.0003, 'load_model_path': '/tmp/tmp9tr0sopf/job_working_directory/000/2/home/.local/share/Helixer/models/land_plant/land_plant_v0.3_a_0080.h5', 'load_predictions': False, 'loss': '', 'lstm_layers': 1, 'nni': False, 'no_utrs': False, 'optimizer': 'adamw', 'overlap': True, 'overlap_offset': 53460, 'patience': 3, 'pool_size': 9, 'post_coverage_hidden_layer': False, 'predict_phase': False, 'prediction_output_path': './tmpp0uh165t/tmp_predictions_.h5', 'pretrained_model_path': None, 'resume_training': False, 'save_every_check': False, 'save_model_path': './best_model.h5', 'stretch_transition_weights': 0, 'test_data': './tmpp0uh165t/tmp_species_.h5', 'transition_weights': 'None', 'units': 32, 'val_test_batch_size': 8, 'verbose': True, 'weight_decay': 3.5e-05, 'workers': 1}No err_samples dataset found, correct samples will be set to 0No fully_intergenic_samples dataset found, fully intergenic samples will be set to 0Data config: [{'geenuff_commit': 'commit not found, version: 0.3.2', 'helixer_commit': 'commit not found, version: 0.3.3', 'input_path': '/tmp/tmp9tr0sopf/files/6/8/6/dataset_68696577-4bbd-403e-9d45-66006d769a89.dat', 'timestamp': '2024-10-10 17:16:34.780739'}]Test data shape: (20, 106920)Intergenic test seqs: 0.00%Fully correct test seqs: 0.00%Number of devices: 1Current Helixer version: 0.3.3Md5sum of the loaded model: f0e00efcbea83c66b69258d11119a691 
/tmp/tmp9tr0sopf/job_working_directory/000/2/home/.local/share/Helixer/models/land_plant/land_plant_v0.3_a_0080.h5Model: "model"__________________________________________________________________________________________________ Layer (type) Output Shape Param # Connected to ================================================================================================== main_input (InputLayer) [(None, None, 4)] 0 [] conv1d (Conv1D) (None, None, 96) 4704 ['main_input[0][0]'] batch_normalization (Batch (None, None, 96) 384 ['conv1d[0][0]'] Normalization) conv1d_1 (Conv1D) (None, None, 96) 110688 ['batch_normalization[0][0]'] batch_normalization_1 (Bat (None, None, 96) 384 ['conv1d_1[0][0]'] chNormalization) conv1d_2 (Conv1D) (None, None, 96) 110688 ['batch_normalization_1[0][0]' ] batch_normalization_2 (Bat (None, None, 96) 384 ['conv1d_2[0][0]'] chNormalization) conv1d_3 (Conv1D) (None, None, 96) 110688 ['batch_normalization_2[0][0]' ] reshape (Reshape) (None, None, 864) 0 ['conv1d_3[0][0]'] bidirectional (Bidirection (None, None, 256) 1016832 ['reshape[0][0]'] al) bidirectional_1 (Bidirecti (None, None, 256) 394240 ['bidirectional[0][0]'] onal) bidirectional_2 (Bidirecti (None, None, 256) 394240 ['bidirectional_1[0][0]'] onal) dense (Dense) (None, None, 72) 18504 ['bidirectional_2[0][0]'] tf.split (TFOpLambda) [(None, None, 36), 0 ['dense[0][0]'] (None, None, 36)] reshape_1 (Reshape) (None, None, 9, 4) 0 ['tf.split[0][0]'] reshape_2 (Reshape) (None, None, 9, 4) 0 ['tf.split[0][1]'] genic (Activation) (None, None, 9, 4) 0 ['reshape_1[0][0]'] phase (Activation) (None, None, 9, 4) 0 ['reshape_2[0][0]'] ==================================================================================================Total params: 2161736 (8.25 MB)Trainable params: 2161160 (8.24 MB)Non-trainable params: 576 (2.25 KB)__________________________________________________________________________________________________HMM Config Splicing Flags: U:true US:true S:true SC:true C:true CS:true 
S:true SU:true U:true Splicing - Weights: Donor 1, Acceptor 1 Splicing - Fixed Penalties: U2-GT-AG 0, U2-GT-AC 0 U12-GT-AG 0 U12-AT-AC 0 Coding - Weights: Start 1000, Stop 1000 Phase Mode: Implementation 1, Dilute to Total, Retention: 0.2Sequences for Species - 0 BP_Extractor for Sequence sample - ID 0Forward for Sequence sample - ID 0 ML v HP Class Intergenic UTR Coding Intron ML v HP Phase Non Coding Phase 0 Phase 1 Phase 2 Intergenic 613530 131 479 418 Non Coding 689510 1069 1104 1091 UTR 238 5141 87 120 Phase 0 230 9046 2 6 Coding 1243 82 29647 264 Phase 1 226 8 9014 2 Intron 1459 120 162 67459 Phase 2 239 2 5 9026 Precision Recall F1 Precision Recall F1 Intergenic 0.99523 0.99833 0.99678 Non Coding 0.99899 0.99529 0.99714 UTR 0.93917 0.92034 0.92966 Phase 0 0.89343 0.97436 0.93214 Coding 0.97603 0.94913 0.96239 Phase 1 0.89027 0.97449 0.93048 Intron 0.98825 0.97484 0.98150 Phase 2 0.89146 0.97347 0.93066 Subgenic 0.98449 0.96684 0.97559 Coding 0.89172 0.97411 0.93109 Genic 0.98211 0.96439 0.97317 Reverse for Sequence sample - ID 0 ML v HP Class Intergenic UTR Coding Intron ML v HP Phase Non Coding Phase 0 Phase 1 Phase 2 Intergenic 555814 209 108 776 Non Coding 659468 985 971 987 UTR 454 9649 137 214 Phase 0 670 18679 9 31 Coding 2123 49 58497 691 Phase 1 689 33 18702 12 Intron 8256 265 391 82947 Phase 2 620 14 29 18681 Precision Recall F1 Precision Recall F1 Intergenic 0.98088 0.99804 0.98939 Non Coding 0.99701 0.99556 0.99628 UTR 0.94858 0.92300 0.93562 Phase 0 0.94764 0.96338 0.95545 Coding 0.98924 0.95334 0.97096 Phase 1 0.94881 0.96224 0.95548 Intron 0.98014 0.90298 0.93998 Phase 2 0.94774 0.96573 0.95665 Subgenic 0.98388 0.92315 0.95255 Coding 0.94807 0.96378 0.95586 Genic 0.98155 0.92314 0.95145 BP_Extractor for Sequence sample2 - ID 1Forward for Sequence sample2 - ID 1 ML v HP Class Intergenic UTR Coding Intron ML v HP Phase Non Coding Phase 0 Phase 1 Phase 2 Intergenic 2026 0 0 0 Non Coding 2029 0 0 0 UTR 0 0 0 0 Phase 0 1 0 0 0 Coding 4 0 0 0 Phase 1 
0 0 0 0 Intron 0 0 0 0 Phase 2 0 0 0 0 Precision Recall F1 Precision Recall F1 Intergenic 0.99803 1.00000 0.99901 Non Coding 0.99951 1.00000 0.99975 UTR NaN NaN NaN Phase 0 NaN 0.00000 0.00000 Coding NaN 0.00000 0.00000 Phase 1 NaN NaN NaN Intron NaN NaN NaN Phase 2 NaN NaN NaN Subgenic NaN 0.00000 0.00000 Coding NaN 0.00000 0.00000 Genic NaN 0.00000 0.00000 Reverse for Sequence sample2 - ID 1 ML v HP Class Intergenic UTR Coding Intron ML v HP Phase Non Coding Phase 0 Phase 1 Phase 2 Intergenic 2030 0 0 0 Non Coding 2030 0 0 0 UTR 0 0 0 0 Phase 0 0 0 0 0 Coding 0 0 0 0 Phase 1 0 0 0 0 Intron 0 0 0 0 Phase 2 0 0 0 0 Precision Recall F1 Precision Recall F1 Intergenic 1.00000 1.00000 1.00000 Non Coding 1.00000 1.00000 1.00000 UTR NaN NaN NaN Phase 0 NaN NaN NaN Coding NaN NaN NaN Phase 1 NaN NaN NaN Intron NaN NaN NaN Phase 2 NaN NaN NaN Subgenic NaN NaN NaN Coding NaN NaN NaN Genic NaN NaN NaN BP_Extractor for Sequence sample3 - ID 2Forward for Sequence sample3 - ID 2 ML v HP Class Intergenic UTR Coding Intron ML v HP Phase Non Coding Phase 0 Phase 1 Phase 2 Intergenic 2098 0 0 0 Non Coding 2099 0 0 0 UTR 0 0 0 0 Phase 0 1 0 0 0 Coding 2 0 0 0 Phase 1 0 0 0 0 Intron 0 0 0 0 Phase 2 0 0 0 0 Precision Recall F1 Precision Recall F1 Intergenic 0.99905 1.00000 0.99952 Non Coding 0.99952 1.00000 0.99976 UTR NaN NaN NaN Phase 0 NaN 0.00000 0.00000 Coding NaN 0.00000 0.00000 Phase 1 NaN NaN NaN Intron NaN NaN NaN Phase 2 NaN NaN NaN Subgenic NaN 0.00000 0.00000 Coding NaN 0.00000 0.00000 Genic NaN 0.00000 0.00000 Reverse for Sequence sample3 - ID 2 ML v HP Class Intergenic UTR Coding Intron ML v HP Phase Non Coding Phase 0 Phase 1 Phase 2 Intergenic 2100 0 0 0 Non Coding 2100 0 0 0 UTR 0 0 0 0 Phase 0 0 0 0 0 Coding 0 0 0 0 Phase 1 0 0 0 0 Intron 0 0 0 0 Phase 2 0 0 0 0 Precision Recall F1 Precision Recall F1 Intergenic 1.00000 1.00000 1.00000 Non Coding 1.00000 1.00000 1.00000 UTR NaN NaN NaN Phase 0 NaN NaN NaN Coding NaN NaN NaN Phase 1 NaN NaN NaN Intron NaN NaN NaN 
Phase 2 NaN NaN NaN Subgenic NaN NaN NaN Coding NaN NaN NaN Genic NaN NaN NaN BP_Extractor for Sequence sample4 - ID 3Forward for Sequence sample4 - ID 3 ML v HP Class Intergenic UTR Coding Intron ML v HP Phase Non Coding Phase 0 Phase 1 Phase 2 Intergenic 7535 0 0 0 Non Coding 7553 0 0 0 UTR 3 0 0 0 Phase 0 2 0 0 0 Coding 8 0 0 0 Phase 1 2 0 0 0 Intron 14 0 0 0 Phase 2 3 0 0 0 Precision Recall F1 Precision Recall F1 Intergenic 0.99669 1.00000 0.99834 Non Coding 0.99907 1.00000 0.99954 UTR NaN 0.00000 0.00000 Phase 0 NaN 0.00000 0.00000 Coding NaN 0.00000 0.00000 Phase 1 NaN 0.00000 0.00000 Intron NaN 0.00000 0.00000 Phase 2 NaN 0.00000 0.00000 Subgenic NaN 0.00000 0.00000 Coding NaN 0.00000 0.00000 Genic NaN 0.00000 0.00000 Reverse for Sequence sample4 - ID 3 ML v HP Class Intergenic UTR Coding Intron ML v HP Phase Non Coding Phase 0 Phase 1 Phase 2 Intergenic 4480 19 0 0 Non Coding 5923 2 6 4 UTR 40 453 5 0 Phase 0 17 524 0 0 Coding 321 0 1573 1 Phase 1 17 0 520 0 Intron 0 0 0 668 Phase 2 25 0 0 522 Precision Recall F1 Precision Recall F1 Intergenic 0.92543 0.99578 0.95931 Non Coding 0.99014 0.99798 0.99404 UTR 0.95975 0.90964 0.93402 Phase 0 0.99620 0.96858 0.98219 Coding 0.99683 0.83008 0.90585 Phase 1 0.98859 0.96834 0.97836 Intron 0.99851 1.00000 0.99925 Phase 2 0.99240 0.95430 0.97297 Subgenic 0.99733 0.87437 0.93181 Coding 0.99240 0.96369 0.97783 Genic 0.99081 0.88010 0.93218 Forward for Species - 0 ML v HP Class Intergenic UTR Coding Intron ML v HP Phase Non Coding Phase 0 Phase 1 Phase 2 Intergenic 625189 131 479 418 Non Coding 701191 1069 1104 1091 UTR 241 5141 87 120 Phase 0 234 9046 2 6 Coding 1257 82 29647 264 Phase 1 228 8 9014 2 Intron 1473 120 162 67459 Phase 2 242 2 5 9026 Precision Recall F1 Precision Recall F1 Intergenic 0.99527 0.99836 0.99681 Non Coding 0.99900 0.99537 0.99718 UTR 0.93917 0.91984 0.92940 Phase 0 0.89343 0.97394 0.93195 Coding 0.97603 0.94870 0.96217 Phase 1 0.89027 0.97428 0.93038 Intron 0.98825 0.97464 0.98140 Phase 2 0.89146 
0.97315 0.93052
Subgenic           0.98449   0.96658   0.97545    Coding             0.89172   0.97379   0.93095
Genic              0.98211   0.96411   0.97303

Reverse for Species - 0
ML v HP Class   Intergenic       UTR    Coding    Intron    ML v HP Phase   Non Coding   Phase 0   Phase 1   Phase 2
Intergenic          564424       228       108       776    Non Coding          669521       987       977       991
UTR                    494     10102       142       214    Phase 0                687     19203         9        31
Coding                2444        49     60070       692    Phase 1                706        33     19222        12
Intron                8256       265       391     83615    Phase 2                645        14        29     19203

                 Precision    Recall        F1                     Precision    Recall        F1
Intergenic         0.98055   0.99803   0.98922    Non Coding         0.99697   0.99561   0.99629
UTR                0.94908   0.92239   0.93554    Phase 0            0.94891   0.96352   0.95616
Coding             0.98944   0.94965   0.96914    Phase 1            0.94984   0.96240   0.95608
Intron             0.98028   0.90368   0.94042    Phase 2            0.94891   0.96541   0.95709
Subgenic           0.98409   0.92235   0.95222    Coding             0.94922   0.96378   0.95644
Genic              0.98171   0.92235   0.95110

Total for Species - 0
ML v HP Class   Intergenic       UTR    Coding    Intron    ML v HP Phase   Non Coding   Phase 0   Phase 1   Phase 2
Intergenic         1189613       359       587      1194    Non Coding         1370712      2056      2081      2082
UTR                    735     15243       229       334    Phase 0                921     28249        11        37
Coding                3701       131     89717       956    Phase 1                934        41     28236        14
Intron                9729       385       553    151074    Phase 2                887        16        34     28229

                 Precision    Recall        F1                     Precision    Recall        F1
Intergenic         0.98823   0.99820   0.99319    Non Coding         0.99800   0.99548   0.99674
UTR                0.94571   0.92153   0.93346    Phase 0            0.93041   0.96684   0.94827
Coding             0.98497   0.94934   0.96682    Phase 1            0.92998   0.96616   0.94772
Intron             0.98382   0.93405   0.95829    Phase 2            0.92975   0.96787   0.94843
Subgenic           0.98425   0.93969   0.96145    Coding             0.93004   0.96696   0.94814
Genic              0.98187   0.93859   0.95974

Total: 482642bp across 25 windows
None
starting to load test data into memory..
For h5 starting with species = b'':
x shape: (20, 106920, 4)
Data loading of 20 (total so far 20) samples of data/X into memory took 0.09 secs
Compressed data size of data/X is at least 0.0008 GB
setting self.n_seqs to 20, bc that is len of data/X
0 / 8
1 / 8
2 / 8
3 / 8
4 / 8
5 / 8
6 / 8
7 / 8
Neural network prediction done.
Starting post processing.
Helixer successfully finished the annotation of /tmp/tmp9tr0sopf/files/6/8/6/dataset_68696577-4bbd-403e-9d45-66006d769a89.dat in 0.07 hours. GFF file written to /tmp/tmp9tr0sopf/job_working_directory/000/2/outputs/dataset_94351e8a-63e9-44ac-932f-86e84f76fa16.dat.
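The test data shape `(20, 106920)` reported in this run is consistent with the four sequence lengths and the `subsequence_length` value from the Helixer.py config above. The sketch below works through that arithmetic. It assumes each sequence is cut into fixed-length chunks with the last one padded, and that both strands are stored. This is an inference from the logged numbers, not taken from Helixer's source, and `expected_samples` is a name made up for this example.

```python
import math
from typing import Iterable

# 'subsequence_length' from the Helixer.py config in the log.
SUBSEQUENCE_LENGTH = 106920

# Sequence lengths from the "Numerified Fasta only Coordinate" lines in the log.
SEQUENCE_LENGTHS = {
    "sample": 720580,
    "sample2": 2030,
    "sample3": 2100,
    "sample4": 7560,
}


def expected_samples(
    lengths: Iterable[int], subsequence_length: int, strands: int = 2
) -> int:
    """Count fixed-length chunks over all sequences and strands.

    Assumption: every sequence yields ceil(length / subsequence_length) chunks
    per strand, with the final chunk padded up to the full length.
    """
    per_strand = sum(math.ceil(length / subsequence_length) for length in lengths)
    return strands * per_strand


if __name__ == "__main__":
    total = expected_samples(SEQUENCE_LENGTHS.values(), SUBSEQUENCE_LENGTH)
    print(total)
```

Under these assumptions the long sequence gives 7 chunks per strand and each short one gives 1, for 10 per strand.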
2024-10-10 17:20:58 ERROR: Warning message:
package ‘ggplot2’ was built under R version 4.4.1
Warning message:
The `size` argument of `element_line()` is deprecated as of ggplot2 3.4.0.
ℹ Please use the `linewidth` argument instead.
Standard Output:
2024-10-10 17:16:16 INFO: ***** Start a BUSCO v5.7.1 analysis, current time: 10/10/2024 17:16:16 *****
2024-10-10 17:16:16 INFO: Configuring BUSCO with local environment
2024-10-10 17:16:16 INFO: Running genome mode
2024-10-10 17:16:18 INFO: Input file is /tmp/tmp9tr0sopf/files/6/8/6/dataset_68696577-4bbd-403e-9d45-66006d769a89.dat
2024-10-10 17:16:18 INFO: No lineage specified. Running lineage auto selector.
2024-10-10 17:16:18 INFO: ***** Starting Auto Select Lineage *****
This process runs BUSCO on the generic lineage datasets for the domains archaea, bacteria and eukaryota. Once the optimal domain is selected, BUSCO automatically attempts to find the most appropriate BUSCO dataset to use based on phylogenetic placement.
--auto-lineage-euk and --auto-lineage-prok are also available if you know your input assembly is, or is not, an eukaryote. See the user guide for more information.
A reminder: Busco evaluations are valid when an appropriate dataset is used, i.e., the dataset belongs to the lineage of the species to test. Because of overlapping markers/spurious matches among domains, busco matches in another domain do not necessarily mean that your genome/proteome contains sequences from this domain.
However, a high busco score in multiple domains might help you identify possible contaminations.2024-10-10 17:16:18 WARNING: Option evalue was provided but is not used in the selected run mode, prok_genome_prod2024-10-10 17:16:18 WARNING: Option limit was provided but is not used in the selected run mode, prok_genome_prod2024-10-10 17:16:18 INFO: Running BUSCO using lineage dataset archaea_odb10 (prokaryota, 2021-02-23)2024-10-10 17:16:18 INFO: Running 1 job(s) on bbtools, starting at 10/10/2024 17:16:182024-10-10 17:16:19 INFO: [bbtools] 1 of 1 task(s) completed2024-10-10 17:16:19 INFO: ***** Run Prodigal on input to predict and extract genes *****2024-10-10 17:16:19 INFO: Running Prodigal with genetic code 11 in single mode2024-10-10 17:16:19 INFO: Running 1 job(s) on prodigal, starting at 10/10/2024 17:16:192024-10-10 17:16:21 INFO: [prodigal] 1 of 1 task(s) completed2024-10-10 17:16:22 INFO: Genetic code 11 selected as optimal2024-10-10 17:16:22 INFO: ***** Run HMMER on gene sequences *****2024-10-10 17:16:22 INFO: Running 194 job(s) on hmmsearch, starting at 10/10/2024 17:16:222024-10-10 17:16:24 INFO: [hmmsearch] 20 of 194 task(s) completed2024-10-10 17:16:25 INFO: [hmmsearch] 39 of 194 task(s) completed2024-10-10 17:16:27 INFO: [hmmsearch] 59 of 194 task(s) completed2024-10-10 17:16:28 INFO: [hmmsearch] 78 of 194 task(s) completed2024-10-10 17:16:30 INFO: [hmmsearch] 97 of 194 task(s) completed2024-10-10 17:16:32 INFO: [hmmsearch] 117 of 194 task(s) completed2024-10-10 17:16:33 INFO: [hmmsearch] 136 of 194 task(s) completed2024-10-10 17:16:35 INFO: [hmmsearch] 156 of 194 task(s) completed2024-10-10 17:16:36 INFO: [hmmsearch] 175 of 194 task(s) completed2024-10-10 17:16:38 INFO: [hmmsearch] 194 of 194 task(s) completed2024-10-10 17:16:38 INFO: Results: C:0.5%[S:0.5%,D:0.0%],F:0.0%,M:99.5%,n:194 2024-10-10 17:16:38 WARNING: Option evalue was provided but is not used in the selected run mode, prok_genome_prod2024-10-10 17:16:38 WARNING: Option limit was 
provided but is not used in the selected run mode, prok_genome_prod2024-10-10 17:16:38 INFO: Running BUSCO using lineage dataset bacteria_odb10 (prokaryota, 2020-03-06)2024-10-10 17:16:38 INFO: Skipping BBTools as already run2024-10-10 17:16:38 INFO: ***** Run Prodigal on input to predict and extract genes *****2024-10-10 17:16:38 INFO: Running Prodigal with genetic code 4 in single mode2024-10-10 17:16:38 INFO: Running 1 job(s) on prodigal, starting at 10/10/2024 17:16:382024-10-10 17:16:41 INFO: [prodigal] 1 of 1 task(s) completed2024-10-10 17:16:41 INFO: Genetic code 4 selected as optimal2024-10-10 17:16:41 INFO: ***** Run HMMER on gene sequences *****2024-10-10 17:16:41 INFO: Running 124 job(s) on hmmsearch, starting at 10/10/2024 17:16:412024-10-10 17:16:43 INFO: [hmmsearch] 13 of 124 task(s) completed2024-10-10 17:16:44 INFO: [hmmsearch] 25 of 124 task(s) completed2024-10-10 17:16:45 INFO: [hmmsearch] 38 of 124 task(s) completed2024-10-10 17:16:48 INFO: [hmmsearch] 50 of 124 task(s) completed2024-10-10 17:17:04 INFO: [hmmsearch] 63 of 124 task(s) completed2024-10-10 17:17:09 INFO: [hmmsearch] 75 of 124 task(s) completed2024-10-10 17:17:10 INFO: [hmmsearch] 87 of 124 task(s) completed2024-10-10 17:17:11 INFO: [hmmsearch] 100 of 124 task(s) completed2024-10-10 17:17:12 INFO: [hmmsearch] 112 of 124 task(s) completed2024-10-10 17:17:13 INFO: [hmmsearch] 124 of 124 task(s) completed2024-10-10 17:17:13 WARNING: BUSCO did not find any match. 
Make sure to check the log files if this is unexpected.2024-10-10 17:17:13 INFO: Results: C:0.0%[S:0.0%,D:0.0%],F:0.0%,M:100.0%,n:124 2024-10-10 17:17:13 INFO: Running BUSCO using lineage dataset eukaryota_odb10 (eukaryota, 2020-09-10)2024-10-10 17:17:13 INFO: Skipping BBTools as already run2024-10-10 17:17:13 INFO: Running 1 job(s) on makeblastdb, starting at 10/10/2024 17:17:132024-10-10 17:17:14 INFO: Creating BLAST database with input file2024-10-10 17:17:14 INFO: [makeblastdb] 1 of 1 task(s) completed2024-10-10 17:17:14 INFO: Running a BLAST search for BUSCOs against created database2024-10-10 17:17:14 INFO: Running 1 job(s) on tblastn, starting at 10/10/2024 17:17:142024-10-10 17:17:16 INFO: [tblastn] 1 of 1 task(s) completed2024-10-10 17:17:16 INFO: Running Augustus gene predictor on BLAST search results.2024-10-10 17:17:16 INFO: Running Augustus prediction using fly as species:2024-10-10 17:17:16 INFO: Running 6 job(s) on augustus, starting at 10/10/2024 17:17:162024-10-10 17:17:20 INFO: [augustus] 1 of 6 task(s) completed2024-10-10 17:17:23 INFO: [augustus] 2 of 6 task(s) completed2024-10-10 17:17:27 INFO: [augustus] 3 of 6 task(s) completed2024-10-10 17:17:30 INFO: [augustus] 4 of 6 task(s) completed2024-10-10 17:17:32 INFO: [augustus] 5 of 6 task(s) completed2024-10-10 17:17:34 INFO: [augustus] 6 of 6 task(s) completed2024-10-10 17:17:34 INFO: Extracting predicted proteins...2024-10-10 17:17:34 INFO: ***** Run HMMER on gene sequences *****2024-10-10 17:17:35 INFO: Running 6 job(s) on hmmsearch, starting at 10/10/2024 17:17:352024-10-10 17:17:35 INFO: [hmmsearch] 1 of 6 task(s) completed2024-10-10 17:17:35 INFO: [hmmsearch] 2 of 6 task(s) completed2024-10-10 17:17:35 INFO: [hmmsearch] 3 of 6 task(s) completed2024-10-10 17:17:35 INFO: [hmmsearch] 4 of 6 task(s) completed2024-10-10 17:17:36 INFO: [hmmsearch] 5 of 6 task(s) completed2024-10-10 17:17:36 INFO: [hmmsearch] 6 of 6 task(s) completed2024-10-10 17:17:36 INFO: 37 exons in total2024-10-10 17:17:36 
INFO: Results: C:1.2%[S:1.2%,D:0.0%],F:0.0%,M:98.8%,n:255 2024-10-10 17:17:36 INFO: Starting second step of analysis. The gene predictor Augustus is retrained using the results from the initial run to yield more accurate results.2024-10-10 17:17:36 INFO: Extracting missing and fragmented buscos from the file ancestral_variants...2024-10-10 17:17:36 INFO: Running a BLAST search for BUSCOs against created database2024-10-10 17:17:36 INFO: Running 1 job(s) on tblastn, starting at 10/10/2024 17:17:362024-10-10 17:17:48 INFO: [tblastn] 1 of 1 task(s) completed2024-10-10 17:17:48 INFO: Converting predicted genes to short genbank files2024-10-10 17:17:48 INFO: Running 3 job(s) on gff2gbSmallDNA.pl, starting at 10/10/2024 17:17:482024-10-10 17:17:49 INFO: [gff2gbSmallDNA.pl] 1 of 3 task(s) completed2024-10-10 17:17:49 INFO: [gff2gbSmallDNA.pl] 2 of 3 task(s) completed2024-10-10 17:17:49 INFO: [gff2gbSmallDNA.pl] 3 of 3 task(s) completed2024-10-10 17:17:49 INFO: All files converted to short genbank files, now training Augustus using Single-Copy Complete BUSCOs2024-10-10 17:17:49 INFO: Running 1 job(s) on new_species.pl, starting at 10/10/2024 17:17:492024-10-10 17:17:49 INFO: [new_species.pl] 1 of 1 task(s) completed2024-10-10 17:17:49 INFO: Running 1 job(s) on etraining, starting at 10/10/2024 17:17:492024-10-10 17:17:50 INFO: [etraining] 1 of 1 task(s) completed2024-10-10 17:17:50 INFO: Re-running Augustus with the new metaparameters, number of target BUSCOs: 2522024-10-10 17:17:50 INFO: Running Augustus gene predictor on BLAST search results.2024-10-10 17:17:50 INFO: Running Augustus prediction using BUSCO_busco_galaxy as species:2024-10-10 17:17:50 INFO: Running 6 job(s) on augustus, starting at 10/10/2024 17:17:502024-10-10 17:17:55 INFO: [augustus] 1 of 6 task(s) completed2024-10-10 17:17:57 INFO: [augustus] 2 of 6 task(s) completed2024-10-10 17:17:59 INFO: [augustus] 3 of 6 task(s) completed2024-10-10 17:18:02 INFO: [augustus] 4 of 6 task(s) completed2024-10-10 
17:18:04 INFO: [augustus] 5 of 6 task(s) completed2024-10-10 17:18:06 INFO: [augustus] 6 of 6 task(s) completed2024-10-10 17:18:06 INFO: Extracting predicted proteins...2024-10-10 17:18:06 INFO: ***** Run HMMER on gene sequences *****2024-10-10 17:18:06 INFO: Running 6 job(s) on hmmsearch, starting at 10/10/2024 17:18:062024-10-10 17:18:06 INFO: [hmmsearch] 1 of 6 task(s) completed2024-10-10 17:18:06 INFO: [hmmsearch] 2 of 6 task(s) completed2024-10-10 17:18:06 INFO: [hmmsearch] 3 of 6 task(s) completed2024-10-10 17:18:06 INFO: [hmmsearch] 4 of 6 task(s) completed2024-10-10 17:18:06 INFO: [hmmsearch] 5 of 6 task(s) completed2024-10-10 17:18:07 INFO: [hmmsearch] 6 of 6 task(s) completed2024-10-10 17:18:07 INFO: 37 exons in total2024-10-10 17:18:07 INFO: Results: C:1.6%[S:1.6%,D:0.0%],F:0.0%,M:98.4%,n:255 2024-10-10 17:18:07 INFO: eukaryota_odb10 selected2024-10-10 17:18:07 INFO: ***** Searching tree for chosen lineage to find best taxonomic match *****2024-10-10 17:18:07 INFO: Extract markers...2024-10-10 17:18:07 INFO: Place the markers on the reference tree...2024-10-10 17:18:07 INFO: Running 1 job(s) on sepp, starting at 10/10/2024 17:18:072024-10-10 17:20:53 INFO: [sepp] 1 of 1 task(s) completed2024-10-10 17:20:53 INFO: Not enough markers were placed on the tree (1). Root lineage eukaryota is kept2024-10-10 17:20:53 INFO: --------------------------------------------------- |Results from dataset eukaryota_odb10 | --------------------------------------------------- |C:1.6%[S:1.6%,D:0.0%],F:0.0%,M:98.4%,n:255 | |4 Complete BUSCOs (C) | |4 Complete and single-copy BUSCOs (S) | |0 Complete and duplicated BUSCOs (D) | |0 Fragmented BUSCOs (F) | |251 Missing BUSCOs (M) | |255 Total BUSCO groups searched | ---------------------------------------------------2024-10-10 17:20:53 INFO: BUSCO analysis done with WARNING(s). 
Total running time: 276 seconds***** Summary of warnings: *****2024-10-10 17:16:18 WARNING:busco.BuscoConfig Option evalue was provided but is not used in the selected run mode, prok_genome_prod2024-10-10 17:16:18 WARNING:busco.BuscoConfig Option limit was provided but is not used in the selected run mode, prok_genome_prod2024-10-10 17:16:38 WARNING:busco.BuscoConfig Option evalue was provided but is not used in the selected run mode, prok_genome_prod2024-10-10 17:16:38 WARNING:busco.BuscoConfig Option limit was provided but is not used in the selected run mode, prok_genome_prod2024-10-10 17:17:13 WARNING:busco.busco_tools.hmmer BUSCO did not find any match. Make sure to check the log files if this is unexpected.2024-10-10 17:20:53 INFO: Results written in /tmp/tmp9tr0sopf/job_working_directory/000/3/working/busco_galaxy2024-10-10 17:20:53 INFO: For assistance with interpreting the results, please consult the userguide: https://busco.ezlab.org/busco_userguide.html2024-10-10 17:20:53 INFO: Visit this page https://gitlab.com/ezlab/busco#how-to-cite-busco to see how to cite BUSCOtotal 40drwxr-xr-x 8 1001 118 4096 Oct 10 17:18 augustus_outputdrwxr-xr-x 3 1001 118 4096 Oct 10 17:17 blast_outputdrwxr-xr-x 5 1001 118 4096 Oct 10 17:17 busco_sequences-rw-r--r-- 1 1001 118 5831 Oct 10 17:18 full_table.tsvdrwxr-xr-x 4 1001 118 4096 Oct 10 17:17 hmmer_output-rw-r--r-- 1 1001 118 3548 Oct 10 17:18 missing_busco_list.tsvdrwxr-xr-x 2 1001 118 4096 Oct 10 17:20 placement_files-rw-r--r-- 1 1001 118 3180 Oct 10 17:18 short_summary.json-rw-r--r-- 1 1001 118 1078 Oct 10 17:18 short_summary.txt2024-10-10 17:20:55 INFO: ****************** Start plot generation at 10/10/2024 17:20:55 ******************2024-10-10 17:20:55 INFO: Load data ...2024-10-10 17:20:55 INFO: Loaded BUSCO_summaries/short_summary.specific.eukaryota_odb10.busco_galaxy.txt successfully2024-10-10 17:20:55 INFO: Generate the R code ...2024-10-10 17:20:55 INFO: Run the R code ...2024-10-10 17:20:58 INFO: [1] "Plotting 
the figure ..."[1] "Done"2024-10-10 17:20:58 INFO: Plot generation done. Total running time: 2.808326244354248 seconds2024-10-10 17:20:58 INFO: Results written in BUSCO_summaries/
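As a reading aid for the `Results:` lines in the BUSCO log above, here is a minimal Python sketch that parses the compact score notation and cross-checks it against the counts in the summary box. The helper name `parse_busco_score` is hypothetical and not part of BUSCO. Picking the domain with the highest complete percentage is only an assumption that happens to agree with the `eukaryota_odb10 selected` line in this log; it is not taken from BUSCO's source.

```python
import re

# Compact score strings copied from the final "Results:" line of each
# domain run in the log above.
SCORES = {
    "archaea_odb10": "C:0.5%[S:0.5%,D:0.0%],F:0.0%,M:99.5%,n:194",
    "bacteria_odb10": "C:0.0%[S:0.0%,D:0.0%],F:0.0%,M:100.0%,n:124",
    "eukaryota_odb10": "C:1.6%[S:1.6%,D:0.0%],F:0.0%,M:98.4%,n:255",
}

# C = complete, S = single-copy, D = duplicated, F = fragmented,
# M = missing, n = number of BUSCO groups searched.
PATTERN = re.compile(
    r"C:(?P<C>[\d.]+)%\[S:(?P<S>[\d.]+)%,D:(?P<D>[\d.]+)%\],"
    r"F:(?P<F>[\d.]+)%,M:(?P<M>[\d.]+)%,n:(?P<n>\d+)"
)


def parse_busco_score(text):
    """Parse one compact score string into a dict (hypothetical helper)."""
    match = PATTERN.search(text)
    if match is None:
        raise ValueError(f"not a BUSCO score string: {text!r}")
    fields = {k: float(v) for k, v in match.groupdict().items() if k != "n"}
    fields["n"] = int(match.group("n"))
    return fields


parsed = {name: parse_busco_score(s) for name, s in SCORES.items()}

# Assumption: the domain with the highest complete percentage is kept.
best = max(parsed, key=lambda name: parsed[name]["C"])

# Convert the percentage back to a count to compare with the summary box.
complete = round(parsed[best]["C"] / 100 * parsed[best]["n"])
print(best, complete)
```

The recovered count can be compared with the "Complete BUSCOs (C)" row of the summary box in the log.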
2024-10-15 15:32:16.720998: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2024-10-15 15:32:17.434596: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
2024-10-15 15:32:22.863717: W tensorflow/tsl/framework/cpu_allocator_impl.cc:83] Allocation of 205286400 exceeds 10% of free system memory.
2024-10-15 15:32:23.114792: W tensorflow/tsl/framework/cpu_allocator_impl.cc:83] Allocation of 205286400 exceeds 10% of free system memory.
2024-10-15 15:32:25.713627: W tensorflow/tsl/framework/cpu_allocator_impl.cc:83] Allocation of 205286400 exceeds 10% of free system memory.
2024-10-15 15:32:28.307060: W tensorflow/tsl/framework/cpu_allocator_impl.cc:83] Allocation of 205286400 exceeds 10% of free system memory.
2024-10-15 15:32:30.906081: W tensorflow/tsl/framework/cpu_allocator_impl.cc:83] Allocation of 205286400 exceeds 10% of free system memory.
/usr/local/lib/python3.8/dist-packages/tensorflow_addons/utils/tfa_eol_msg.py:23: UserWarning:
TensorFlow Addons (TFA) has ended development and introduction of new features.
TFA has entered a minimal maintenance and release mode until a planned end of life in May 2024.
Please modify downstream libraries to take dependencies from other repositories in our TensorFlow community (e.g. Keras, Keras-CV, and Keras-NLP).
For more information see: https://github.com/tensorflow/addons/issues/2807
 warnings.warn(
Ignoring the following unexpected models in /tmp/tmp7vk7_3fs/job_working_directory/000/2/home/.local/share/Helixer/models:[].
You can set --model-filepath in Helixer.py if you wish to use these.
Standard Output:
============ CUDA ============
CUDA Version 11.8.0
Container image Copyright (c) 2016-2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
This container image and its contents are governed by the NVIDIA Deep Learning Container License.
By pulling and using the container, you accept the terms and conditions of this license:
https://developer.nvidia.com/ngc/nvidia-deep-learning-container-license
A copy of this license is made available in this container at /NGC-DL-CONTAINER-LICENSE for your convenience.
WARNING: The NVIDIA Driver was not detected. GPU functionality will not be available. Use the NVIDIA Container Toolkit to start this container with GPU support; see https://docs.nvidia.com/datacenter/cloud-native/ .
retrived list of available models from https://raw.githubusercontent.com/weberlab-hhu/Helixer/main/resources/model_list.csv
saved model land_plant_v0.3_a_0080.h5 to /tmp/tmp7vk7_3fs/job_working_directory/000/2/home/.local/share/Helixer/models
saved model vertebrate_v0.3_m_0080.h5 to /tmp/tmp7vk7_3fs/job_working_directory/000/2/home/.local/share/Helixer/models
saved model fungi_v0.3_a_0100.h5 to /tmp/tmp7vk7_3fs/job_working_directory/000/2/home/.local/share/Helixer/models
saved model invertebrate_v0.3_m_0100.h5 to /tmp/tmp7vk7_3fs/job_working_directory/000/2/home/.local/share/Helixer/models
HelixerPost <genome.h5> <predictions.h5> <windowSize> <edgeThresh> <peakThresh> <minCodingLength> <gff>
No config file found
retrived list of available models from https://raw.githubusercontent.com/weberlab-hhu/Helixer/main/resources/model_list.csv
Helixer.py config: {'batch_size': 8, 'compression': 'gzip', 'config_path': 'config/helixer_config.yaml', 'debug': False, 'edge_threshold': 0.1, 'fasta_path': '/tmp/tmp7vk7_3fs/files/9/3/2/dataset_932b58cc-e381-4dc5-9612-0b6a72de2b2c.dat', 'gff_output_path': '/tmp/tmp7vk7_3fs/job_working_directory/000/2/outputs/dataset_d4560f2f-53d6-4f86-b4f4-186e41de76d3.dat', 'lineage': 'land_plant', 'min_coding_length': 100, 'model_filepath': '/tmp/tmp7vk7_3fs/job_working_directory/000/2/home/.local/share/Helixer/models/land_plant/land_plant_v0.3_a_0080.h5', 'no_multiprocess': False, 'no_overlap': False, 'overlap_core_length': 80190, 'overlap_offset': 53460, 'peak_threshold': 0.8, 'species': '', 'subsequence_length': 106920, 'temporary_dir': './', 'window_size': 100}
Testing whether helixer_post_bin is correctly installed
Helixer.py config loaded. Starting FASTA to H5 conversion.
storing temporary files under ./tmp07e936s1
1 expected num of chunks to write in 19994040 bases to hdf5
Numerification of 0-720580 of the sequence of sample took 0.24 secs
1 Numerified Fasta only Coordinate (seqid: sample, len: 720580) in 0.39 secs
1 expected num of chunks to write in 19994040 bases to hdf5
Numerification of 0-2030 of the sequence of sample2 took 0.00 secs
2 Numerified Fasta only Coordinate (seqid: sample2, len: 2030) in 0.01 secs
1 expected num of chunks to write in 19994040 bases to hdf5
Numerification of 0-2100 of the sequence of sample3 took 0.00 secs
3 Numerified Fasta only Coordinate (seqid: sample3, len: 2100) in 0.01 secs
1 expected num of chunks to write in 19994040 bases to hdf5
Numerification of 0-7560 of the sequence of sample4 took 0.00 secs
4 Numerified Fasta only Coordinate (seqid: sample4, len: 7560) in 0.02 secs
logged installed version in place of git commit for geenuff
logged installed version in place of git commit for helixer
FASTA to H5 conversion done. Starting neural network prediction with overlapping.
HelixerModel config: {'batch_size': 8, 'calculate_uncertainty': False, 'check_every_nth_batch': 1000000, 'class_weights': 'None', 'clip_norm': 3.0, 'cnn_layers': 1, 'compression': 'gzip', 'core_length': 80190, 'coverage_norm': None, 'coverage_offset': 0.0, 'coverage_weights': False, 'cpus': 8, 'data_dir': None, 'debug': False, 'dropout1': 0.0, 'dropout2': 0.0, 'epochs': 10000, 'eval': False, 'filter_depth': 32, 'fine_tune': False, 'float_precision': 'float32', 'gpu_id': -1, 'input_coverage': False, 'kernel_size': 26, 'large_eval_folder': '', 'learning_rate': 0.0003, 'load_model_path': '/tmp/tmp7vk7_3fs/job_working_directory/000/2/home/.local/share/Helixer/models/land_plant/land_plant_v0.3_a_0080.h5', 'load_predictions': False, 'loss': '', 'lstm_layers': 1, 'nni': False, 'no_utrs': False, 'optimizer': 'adamw', 'overlap': True, 'overlap_offset': 53460, 'patience': 3, 'pool_size': 9, 'post_coverage_hidden_layer': False, 'predict_phase': False, 'prediction_output_path': './tmp07e936s1/tmp_predictions_.h5', 'pretrained_model_path': None, 'resume_training': False, 'save_every_check': False, 'save_model_path': './best_model.h5', 'stretch_transition_weights': 0, 'test_data': './tmp07e936s1/tmp_species_.h5', 'transition_weights': 'None', 'units': 32, 'val_test_batch_size': 8, 'verbose': True, 'weight_decay': 3.5e-05, 'workers': 1}
No err_samples dataset found, correct samples will be set to 0
No fully_intergenic_samples dataset found, fully intergenic samples will be set to 0
Data config: [{'geenuff_commit': 'commit not found, version: 0.3.2', 'helixer_commit': 'commit not found, version: 0.3.3', 'input_path': '/tmp/tmp7vk7_3fs/files/9/3/2/dataset_932b58cc-e381-4dc5-9612-0b6a72de2b2c.dat', 'timestamp': '2024-10-15 15:32:20.144514'}]
Test data shape: (20, 106920)
Intergenic test seqs: 0.00%
Fully correct test seqs: 0.00%
Number of devices: 1
Current Helixer version: 0.3.3
Md5sum of the loaded model: f0e00efcbea83c66b69258d11119a691
/tmp/tmp7vk7_3fs/job_working_directory/000/2/home/.local/share/Helixer/models/land_plant/land_plant_v0.3_a_0080.h5
Model: "model"
__________________________________________________________________________________________________
 Layer (type)                                Output Shape                          Param #  Connected to
==================================================================================================
 main_input (InputLayer)                     [(None, None, 4)]                     0        []
 conv1d (Conv1D)                             (None, None, 96)                      4704     ['main_input[0][0]']
 batch_normalization (BatchNormalization)    (None, None, 96)                      384      ['conv1d[0][0]']
 conv1d_1 (Conv1D)                           (None, None, 96)                      110688   ['batch_normalization[0][0]']
 batch_normalization_1 (BatchNormalization)  (None, None, 96)                      384      ['conv1d_1[0][0]']
 conv1d_2 (Conv1D)                           (None, None, 96)                      110688   ['batch_normalization_1[0][0]']
 batch_normalization_2 (BatchNormalization)  (None, None, 96)                      384      ['conv1d_2[0][0]']
 conv1d_3 (Conv1D)                           (None, None, 96)                      110688   ['batch_normalization_2[0][0]']
 reshape (Reshape)                           (None, None, 864)                     0        ['conv1d_3[0][0]']
 bidirectional (Bidirectional)               (None, None, 256)                     1016832  ['reshape[0][0]']
 bidirectional_1 (Bidirectional)             (None, None, 256)                     394240   ['bidirectional[0][0]']
 bidirectional_2 (Bidirectional)             (None, None, 256)                     394240   ['bidirectional_1[0][0]']
 dense (Dense)                               (None, None, 72)                      18504    ['bidirectional_2[0][0]']
 tf.split (TFOpLambda)                       [(None, None, 36), (None, None, 36)]  0        ['dense[0][0]']
 reshape_1 (Reshape)                         (None, None, 9, 4)                    0        ['tf.split[0][0]']
 reshape_2 (Reshape)                         (None, None, 9, 4)                    0        ['tf.split[0][1]']
 genic (Activation)                          (None, None, 9, 4)                    0        ['reshape_1[0][0]']
 phase (Activation)                          (None, None, 9, 4)                    0        ['reshape_2[0][0]']
==================================================================================================
Total params: 2161736 (8.25 MB)
Trainable params: 2161160 (8.24 MB)
Non-trainable params: 576 (2.25 KB)
__________________________________________________________________________________________________
HMM Config
 Splicing Flags: U:true US:true S:true SC:true C:true CS:true S:true SU:true U:true
 Splicing - Weights: Donor 1, Acceptor 1
 Splicing - Fixed Penalties: U2-GT-AG 0, U2-GT-AC 0 U12-GT-AG 0 U12-AT-AC 0
 Coding - Weights: Start 1000, Stop 1000
 Phase Mode: Implementation 1, Dilute to Total, Retention: 0.2
Sequences for Species - 0
 BP_Extractor for Sequence sample - ID 0
Forward for Sequence sample - ID 0
ML v HP Class Intergenic  UTR         Coding      Intron      ML v HP Phase Non Coding  Phase 0     Phase 1     Phase 2
Intergenic    613530      131         479         418         Non Coding    689510      1069        1104        1091
UTR           238         5141        87          120         Phase 0       230         9046        2           6
Coding        1243        82          29647       264         Phase 1       226         8           9014        2
Intron        1459        120         162         67459       Phase 2       239         2           5           9026
              Precision   Recall      F1                                    Precision   Recall      F1
Intergenic    0.99523     0.99833     0.99678                 Non Coding    0.99899     0.99529     0.99714
UTR           0.93917     0.92034     0.92966                 Phase 0       0.89343     0.97436     0.93214
Coding        0.97603     0.94913     0.96239                 Phase 1       0.89027     0.97449     0.93048
Intron        0.98825     0.97484     0.98150                 Phase 2       0.89146     0.97347     0.93066
Subgenic      0.98449     0.96684     0.97559                 Coding        0.89172     0.97411     0.93109
Genic         0.98211     0.96439     0.97317
Reverse for Sequence sample - ID 0
ML v HP Class Intergenic  UTR         Coding      Intron      ML v HP Phase Non Coding  Phase 0     Phase 1     Phase 2
Intergenic    555814      209         108         776         Non Coding    659468      985         971         987
UTR           454         9649        137         214         Phase 0       670         18679       9           31
Coding        2123        49          58497       691         Phase 1       689         33          18702       12
Intron        8256        265         391         82947       Phase 2       620         14          29          18681
              Precision   Recall      F1                                    Precision   Recall      F1
Intergenic    0.98088     0.99804     0.98939                 Non Coding    0.99701     0.99556     0.99628
UTR           0.94858     0.92300     0.93562                 Phase 0       0.94764     0.96338     0.95545
Coding        0.98924     0.95334     0.97096                 Phase 1       0.94881     0.96224     0.95548
Intron        0.98014     0.90298     0.93998                 Phase 2       0.94774     0.96573     0.95665
Subgenic      0.98388     0.92315     0.95255                 Coding        0.94807     0.96378     0.95586
Genic         0.98155     0.92314     0.95145
 BP_Extractor for Sequence sample2 - ID 1
Forward for Sequence sample2 - ID 1
ML v HP Class Intergenic  UTR         Coding      Intron      ML v HP Phase Non Coding  Phase 0     Phase 1     Phase 2
Intergenic    2026        0           0           0           Non Coding    2029        0           0           0
UTR           0           0           0           0           Phase 0       1           0           0           0
Coding        4           0           0           0           Phase 1       0           0           0           0
Intron        0           0           0           0           Phase 2       0           0           0           0
              Precision   Recall      F1                                    Precision   Recall      F1
Intergenic    0.99803     1.00000     0.99901                 Non Coding    0.99951     1.00000     0.99975
UTR           NaN         NaN         NaN                     Phase 0       NaN         0.00000     0.00000
Coding        NaN         0.00000     0.00000                 Phase 1       NaN         NaN         NaN
Intron        NaN         NaN         NaN                     Phase 2       NaN         NaN         NaN
Subgenic      NaN         0.00000     0.00000                 Coding        NaN         0.00000     0.00000
Genic         NaN         0.00000     0.00000
Reverse for Sequence sample2 - ID 1
ML v HP Class Intergenic  UTR         Coding      Intron      ML v HP Phase Non Coding  Phase 0     Phase 1     Phase 2
Intergenic    2030        0           0           0           Non Coding    2030        0           0           0
UTR           0           0           0           0           Phase 0       0           0           0           0
Coding        0           0           0           0           Phase 1       0           0           0           0
Intron        0           0           0           0           Phase 2       0           0           0           0
              Precision   Recall      F1                                    Precision   Recall      F1
Intergenic    1.00000     1.00000     1.00000                 Non Coding    1.00000     1.00000     1.00000
UTR           NaN         NaN         NaN                     Phase 0       NaN         NaN         NaN
Coding        NaN         NaN         NaN                     Phase 1       NaN         NaN         NaN
Intron        NaN         NaN         NaN                     Phase 2       NaN         NaN         NaN
Subgenic      NaN         NaN         NaN                     Coding        NaN         NaN         NaN
Genic         NaN         NaN         NaN
 BP_Extractor for Sequence sample3 - ID 2
Forward for Sequence sample3 - ID 2
ML v HP Class Intergenic  UTR         Coding      Intron      ML v HP Phase Non Coding  Phase 0     Phase 1     Phase 2
Intergenic    2098        0           0           0           Non Coding    2099        0           0           0
UTR           0           0           0           0           Phase 0       1           0           0           0
Coding        2           0           0           0           Phase 1       0           0           0           0
Intron        0           0           0           0           Phase 2       0           0           0           0
              Precision   Recall      F1                                    Precision   Recall      F1
Intergenic    0.99905     1.00000     0.99952                 Non Coding    0.99952     1.00000     0.99976
UTR           NaN         NaN         NaN                     Phase 0       NaN         0.00000     0.00000
Coding        NaN         0.00000     0.00000                 Phase 1       NaN         NaN         NaN
Intron        NaN         NaN         NaN                     Phase 2       NaN         NaN         NaN
Subgenic      NaN         0.00000     0.00000                 Coding        NaN         0.00000     0.00000
Genic         NaN         0.00000     0.00000
Reverse for Sequence sample3 - ID 2
ML v HP Class Intergenic  UTR         Coding      Intron      ML v HP Phase Non Coding  Phase 0     Phase 1     Phase 2
Intergenic    2100        0           0           0           Non Coding    2100        0           0           0
UTR           0           0           0           0           Phase 0       0           0           0           0
Coding        0           0           0           0           Phase 1       0           0           0           0
Intron        0           0           0           0           Phase 2       0           0           0           0
              Precision   Recall      F1                                    Precision   Recall      F1
Intergenic    1.00000     1.00000     1.00000                 Non Coding    1.00000     1.00000     1.00000
UTR           NaN         NaN         NaN                     Phase 0       NaN         NaN         NaN
Coding        NaN         NaN         NaN                     Phase 1       NaN         NaN         NaN
Intron        NaN         NaN         NaN                     Phase 2       NaN         NaN         NaN
Subgenic      NaN         NaN         NaN                     Coding        NaN         NaN         NaN
Genic         NaN         NaN         NaN
 BP_Extractor for Sequence sample4 - ID 3
Forward for Sequence sample4 - ID 3
ML v HP Class Intergenic  UTR         Coding      Intron      ML v HP Phase Non Coding  Phase 0     Phase 1     Phase 2
Intergenic    7535        0           0           0           Non Coding    7553        0           0           0
UTR           3           0           0           0           Phase 0       2           0           0           0
Coding        8           0           0           0           Phase 1       2           0           0           0
Intron        14          0           0           0           Phase 2       3           0           0           0
              Precision   Recall      F1                                    Precision   Recall      F1
Intergenic    0.99669     1.00000     0.99834                 Non Coding    0.99907     1.00000     0.99954
UTR           NaN         0.00000     0.00000                 Phase 0       NaN         0.00000     0.00000
Coding        NaN         0.00000     0.00000                 Phase 1       NaN         0.00000     0.00000
Intron        NaN         0.00000     0.00000                 Phase 2       NaN         0.00000     0.00000
Subgenic      NaN         0.00000     0.00000                 Coding        NaN         0.00000     0.00000
Genic         NaN         0.00000     0.00000
Reverse for Sequence sample4 - ID 3
ML v HP Class Intergenic  UTR         Coding      Intron      ML v HP Phase Non Coding  Phase 0     Phase 1     Phase 2
Intergenic    4480        19          0           0           Non Coding    5923        2           6           4
UTR           40          453         5           0           Phase 0       17          524         0           0
Coding        321         0           1573        1           Phase 1       17          0           520         0
Intron        0           0           0           668         Phase 2       25          0           0           522
              Precision   Recall      F1                                    Precision   Recall      F1
Intergenic    0.92543     0.99578     0.95931                 Non Coding    0.99014     0.99798     0.99404
UTR           0.95975     0.90964     0.93402                 Phase 0       0.99620     0.96858     0.98219
Coding        0.99683     0.83008     0.90585                 Phase 1       0.98859     0.96834     0.97836
Intron        0.99851     1.00000     0.99925                 Phase 2       0.99240     0.95430     0.97297
Subgenic      0.99733     0.87437     0.93181                 Coding        0.99240     0.96369     0.97783
Genic         0.99081     0.88010     0.93218
Forward for Species - 0
ML v HP Class Intergenic  UTR         Coding      Intron      ML v HP Phase Non Coding  Phase 0     Phase 1     Phase 2
Intergenic    625189      131         479         418         Non Coding    701191      1069        1104        1091
UTR           241         5141        87          120         Phase 0       234         9046        2           6
Coding        1257        82          29647       264         Phase 1       228         8           9014        2
Intron        1473        120         162         67459       Phase 2       242         2           5           9026
              Precision   Recall      F1                                    Precision   Recall      F1
Intergenic    0.99527     0.99836     0.99681                 Non Coding    0.99900     0.99537     0.99718
UTR           0.93917     0.91984     0.92940                 Phase 0       0.89343     0.97394     0.93195
Coding        0.97603     0.94870     0.96217                 Phase 1       0.89027     0.97428     0.93038
Intron        0.98825     0.97464     0.98140                 Phase 2       0.89146     0.97315     0.93052
Subgenic      0.98449     0.96658     0.97545                 Coding        0.89172     0.97379     0.93095
Genic         0.98211     0.96411     0.97303
Reverse for Species - 0
ML v HP Class Intergenic  UTR         Coding      Intron      ML v HP Phase Non Coding  Phase 0     Phase 1     Phase 2
Intergenic    564424      228         108         776         Non Coding    669521      987         977         991
UTR           494         10102       142         214         Phase 0       687         19203       9           31
Coding        2444        49          60070       692         Phase 1       706         33          19222       12
Intron        8256        265         391         83615       Phase 2       645         14          29          19203
              Precision   Recall      F1                                    Precision   Recall      F1
Intergenic    0.98055     0.99803     0.98922                 Non Coding    0.99697     0.99561     0.99629
UTR           0.94908     0.92239     0.93554                 Phase 0       0.94891     0.96352     0.95616
Coding        0.98944     0.94965     0.96914                 Phase 1       0.94984     0.96240     0.95608
Intron        0.98028     0.90368     0.94042                 Phase 2       0.94891     0.96541     0.95709
Subgenic      0.98409     0.92235     0.95222                 Coding        0.94922     0.96378     0.95644
Genic         0.98171     0.92235     0.95110
Total for Species - 0
ML v HP Class Intergenic  UTR         Coding      Intron      ML v HP Phase Non Coding  Phase 0     Phase 1     Phase 2
Intergenic    1189613     359         587         1194        Non Coding    1370712     2056        2081        2082
UTR           735         15243       229         334         Phase 0       921         28249       11          37
Coding        3701        131         89717       956         Phase 1       934         41          28236       14
Intron        9729        385         553         151074      Phase 2       887         16          34          28229
              Precision   Recall      F1                                    Precision   Recall      F1
Intergenic    0.98823     0.99820     0.99319                 Non Coding    0.99800     0.99548     0.99674
UTR           0.94571     0.92153     0.93346                 Phase 0       0.93041     0.96684     0.94827
Coding        0.98497     0.94934     0.96682                 Phase 1       0.92998     0.96616     0.94772
Intron        0.98382     0.93405     0.95829                 Phase 2       0.92975     0.96787     0.94843
Subgenic      0.98425     0.93969     0.96145                 Coding        0.93004     0.96696     0.94814
Genic         0.98187     0.93859     0.95974
Total: 482642bp across 25 windows
None
starting to load test data into memory..
For h5 starting with species = b'':
x shape: (20, 106920, 4)
Data loading of 20 (total so far 20) samples of data/X into memory took 0.07 secs
Compressed data size of data/X is at least 0.0008 GB
setting self.n_seqs to 20, bc that is len of data/X
0 / 8
1 / 8
2 / 8
3 / 8
4 / 8
5 / 8
6 / 8
7 / 8
Neural network prediction done. Starting post processing.
Helixer successfully finished the annotation of /tmp/tmp7vk7_3fs/files/9/3/2/dataset_932b58cc-e381-4dc5-9612-0b6a72de2b2c.dat in 0.07 hours. GFF file written to /tmp/tmp7vk7_3fs/job_working_directory/000/2/outputs/dataset_d4560f2f-53d6-4f86-b4f4-186e41de76d3.dat.
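To make the precision, recall and F1 columns in the Helixer output above easier to read, here is a small Python sketch that recomputes them from the class confusion matrix of the "Total for Species - 0" block. It assumes that recall is the diagonal entry divided by its row sum and precision is the diagonal entry divided by its column sum. That orientation is inferred from the printed numbers, not taken from the HelixerPost source, and `per_class_metrics` is a hypothetical helper name.

```python
# Class confusion matrix copied from the "Total for Species - 0" block
# of the log above (rows and columns in the printed order).
LABELS = ["Intergenic", "UTR", "Coding", "Intron"]
MATRIX = [
    [1189613, 359, 587, 1194],
    [735, 15243, 229, 334],
    [3701, 131, 89717, 956],
    [9729, 385, 553, 151074],
]


def per_class_metrics(matrix):
    """Return a list of (precision, recall, f1) tuples, one per class.

    Assumption: recall = diagonal / row sum and
    precision = diagonal / column sum.
    """
    size = len(matrix)
    col_sums = [sum(matrix[r][c] for r in range(size)) for c in range(size)]
    metrics = []
    for i in range(size):
        hits = matrix[i][i]
        precision = hits / col_sums[i]
        recall = hits / sum(matrix[i])
        f1 = 2 * precision * recall / (precision + recall)
        metrics.append((precision, recall, f1))
    return metrics


for label, (p, r, f1) in zip(LABELS, per_class_metrics(MATRIX)):
    print(f"{label:<10} {p:.5f} {r:.5f} {f1:.5f}")
```

Under that assumption, the printed rows can be compared directly with the Precision, Recall and F1 values reported for the same block.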
Hello, I would like to propose this workflow to accompany the GTN tutorial "Genome annotation with Helixer".
Thank you. Have a nice day!