Rosella recover flight status error #30

Closed
Rridley7 opened this issue Nov 20, 2022 · 9 comments

Comments

@Rridley7

Hi, I am running into an error when running rosella recover with several metagenomes. The error is not consistent between samples, i.e. I cannot predict which samples will trigger it, but it does occur reproducibly on the samples that are affected. The error statement is:

Error when running flight process. Exitstatus was : ExitStatus(unix_wait_status(256))
thread 'main' panicked at 'Failed to grab stderr from failed flight process', /home/conda/.cargo/registry/src/github.com-1ecc6299db9ec823/bird_tool_utils-0.3.0/src/command.rs:17:14

The original command was:
rosella recover -i S04_1a9817_spa_t_mtb_cov.txt -r S04_1a9817_spa_t_contigs.fa

I can provide the original files if needed. The coverage file was generated with coverm contig in metabat mode.

@rhysnewell
Owner

Hmm, yeah, that error is not very informative. Would you please provide the reference file and coverage file? I'll see if I can get to the bottom of it.

Rhys

@Rridley7
Author

Files are attached, thanks!
S04_1a9817_spa_t_mtb_cov.txt
S04_1a9817_spa_t_contigs.fa.zip

@rhysnewell
Owner

rhysnewell commented Nov 20, 2022

Hi!

So I've looked through your files and tried running rosella on them. You are right that rosella does error out, but I believe it is not due to a problem on rosella's end.

The assembly you are trying to bin is not very good. Here are the stats from bbmap for it:

A	C	G	T	N	IUPAC	Other	GC	GC_stdev
0.2519	0.2479	0.2451	0.2551	0.0000	0.0000	0.0000	0.4930	0.0964

Main genome scaffold total:         	1828
Main genome contig total:           	1828
Main genome scaffold sequence total:	2.745 MB
Main genome contig sequence total:  	2.745 MB  	0.000% gap
Main genome scaffold N/L50:         	663/1.435 KB
Main genome contig N/L50:           	663/1.435 KB
Main genome scaffold N/L90:         	1562/1.067 KB
Main genome contig N/L90:           	1562/1.067 KB
Max scaffold length:                	15.637 KB
Max contig length:                  	15.637 KB
Number of scaffolds > 50 KB:        	0
% main genome in scaffolds > 50 KB: 	0.00%


Minimum 	Number        	Number        	Total         	Total         	Scaffold
Scaffold	of            	of            	Scaffold      	Contig        	Contig
Length  	Scaffolds     	Contigs       	Length        	Length        	Coverage
--------	--------------	--------------	--------------	--------------	--------
    All 	         1,828	         1,828	     2,745,043	     2,745,043	 100.00%
    500 	         1,828	         1,828	     2,745,043	     2,745,043	 100.00%
   1 KB 	         1,828	         1,828	     2,745,043	     2,745,043	 100.00%
 2.5 KB 	            99	            99	       375,466	       375,466	 100.00%
   5 KB 	            15	            15	       115,390	       115,390	 100.00%
  10 KB 	             3	             3	        38,314	        38,314	 100.00%

As you can see, most of the contigs fall below the default minimum contig size that rosella uses (--min-contig-size 1500). Contigs of 2.5 Kbp or longer add up to less than 500 Kbp. That's not a whole lot of information for rosella, or any binning algorithm, to work with. I doubt you will easily get anything informative out of this assembly without some level of manual inspection.
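For anyone wanting to check their own assembly before binning, here is a minimal sketch that counts how many contigs (and bases) would survive a minimum length cutoff. It is not part of rosella: the function names are made up for illustration, and treating the cutoff as inclusive (length >= cutoff) is an assumption about how --min-contig-size is applied.

```python
from io import StringIO
from typing import Dict, Iterable, Iterator, Tuple


def read_fasta_lengths(lines: Iterable[str]) -> Iterator[Tuple[str, int]]:
    """Yield (contig_name, length) for each record in FASTA-formatted lines."""
    name = None
    length = 0
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if name is not None:
                yield name, length
            parts = line[1:].split()
            name = parts[0] if parts else ""
            length = 0
        else:
            length += len(line)
    if name is not None:
        yield name, length


def summarise(lines: Iterable[str], min_contig_size: int = 1500) -> Dict[str, int]:
    """Count contigs and bases at or above min_contig_size (inclusive: an assumption)."""
    lengths = [n for _, n in read_fasta_lengths(lines)]
    kept = [n for n in lengths if n >= min_contig_size]
    return {
        "total_contigs": len(lengths),
        "kept_contigs": len(kept),
        "kept_bases": sum(kept),
    }


# Tiny made-up example, not the assembly from this issue.
example = StringIO(
    ">c1\n" + "A" * 2000 + "\n>c2\n" + "C" * 1200 + "\n>c3\n" + "G" * 900 + "\n"
)
print(summarise(example))
```

To use it on a real file, pass an open file handle instead of the StringIO, e.g. `with open("contigs.fa") as fh: print(summarise(fh))`.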

I think I will go ahead and close this issue now. Hopefully you have found my response helpful, and you can find something useful in your assembly.

Cheers,
Rhys

@janfelix

Hello Rhys,
I had the same issue, most likely due to the same problem with short contigs. My contigs are assembled from metatranscriptome data, so short contigs are to be expected. I had the impression that GroopM was able to process contigs as short as 500 bp, and then I moved on to rosella. Do you see any chance rosella could work with contigs shorter than 1500 bp? Even just to try it out, or by only using read coverage...

Thanks again for building rosella and the great support!

@rhysnewell
Owner

rhysnewell commented Nov 23, 2022

You can certainly try it out. Just set --min-contig-size to the desired value and see how you go. If it returns an error again, then let me know. Thanks for trying it out :)

You'll probably also want to drop --min-bin-size down to a much lower value if you expect your metaT bins to be small.
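As a rough illustration of the suggestion above, the sketch below assembles such a command and, if run, keeps the child's stderr so a failure is easier to report. The -i and -r options come from the original command in this thread, and --min-contig-size and --min-bin-size are the options named above. The file names and both numeric values are placeholders, not recommendations, and the helper names are hypothetical.

```python
import shlex
import subprocess
from typing import List


def build_rosella_command(
    coverage: str, contigs: str, min_contig_size: int, min_bin_size: int
) -> List[str]:
    """Build the argument list for a rosella recover run with lowered cutoffs."""
    return [
        "rosella", "recover",
        "-i", coverage,
        "-r", contigs,
        "--min-contig-size", str(min_contig_size),
        "--min-bin-size", str(min_bin_size),
    ]


def run_and_report(cmd: List[str]) -> int:
    """Run cmd, print its stderr if it fails, and return the exit code."""
    completed = subprocess.run(cmd, capture_output=True, text=True)
    if completed.returncode != 0:
        print(f"command failed with exit code {completed.returncode}")
        print(completed.stderr)
    return completed.returncode


# Placeholder inputs and values, chosen only for illustration.
cmd = build_rosella_command(
    "coverage.txt", "contigs.fa", min_contig_size=500, min_bin_size=50000
)
print(shlex.join(cmd))
```

The example only prints the command. Calling `run_and_report(cmd)` would actually launch it, and raises FileNotFoundError if rosella is not on the PATH.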

@Rridley7
Author

This was certainly helpful, thanks!

@janfelix

Hi, I have tried a few things: I lowered both the contig size and the bin size. Unfortunately, after successfully completing the "Contigs kmers analyzed" part, it crashes:

[00:07:56] ⠋ Calculating UMAP embeddings and clustering... 3/6
[2022-11-24T22:17:10Z ERROR bird_tool_utils::command] Error when running flight process. Exitstatus was : ExitStatus(unix_wait_status(256))
thread 'main' panicked at 'Failed to grab stderr from failed flight process', /home/conda/.cargo/registry/src/github.com-1ecc6299db9ec823/bird_tool_utils-0.3.0/src/command.rs:17:14

Not sure what that could mean...
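One part of the message can be decoded without knowing anything about rosella: unix_wait_status(256) is a raw Unix wait status, not an exit code. A minimal sketch using Python's standard os module (Unix only; the helper name is made up for illustration):

```python
import os


def describe_wait_status(status: int) -> str:
    """Translate a raw Unix wait status into a readable description."""
    if os.WIFEXITED(status):
        return f"exited with code {os.WEXITSTATUS(status)}"
    if os.WIFSIGNALED(status):
        return f"killed by signal {os.WTERMSIG(status)}"
    return f"unrecognised wait status {status}"


# 256 is the value reported in the log above.
print(describe_wait_status(256))
```

On Linux this prints "exited with code 1", so the flight process ended on its own with a generic failure code rather than being killed by a signal. The status alone does not say why it failed. That information would be in the stderr that could not be captured.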

@rhysnewell
Owner

Would you please be able to post the output of conda list for your rosella conda environment?

@janfelix

Hi, thanks for looking into this!

# packages in environment at /home/jan/.conda/envs/rosella:
#
# Name Version Build Channel

_libgcc_mutex 0.1 conda_forge conda-forge
_openmp_mutex 4.5 2_kmp_llvm conda-forge
asttokens 2.1.0 pyhd8ed1ab_0 conda-forge
attrs 22.1.0 pyh71513ae_1 conda-forge
backcall 0.2.0 pyh9f0ad1d_0 conda-forge
backports 1.0 pyhd8ed1ab_3 conda-forge
backports.functools_lru_cache 1.6.4 pyhd8ed1ab_0 conda-forge
biopython 1.80 pypi_0 pypi
brotli 1.0.9 h166bdaf_8 conda-forge
brotli-bin 1.0.9 h166bdaf_8 conda-forge
brotlipy 0.7.0 py39hb9d737c_1005 conda-forge
bwa 0.7.17 h7132678_9 bioconda
bzip2 1.0.8 h7f98852_4 conda-forge
c-ares 1.18.1 h7f98852_0 conda-forge
ca-certificates 2022.9.24 ha878542_0 conda-forge
cachecontrol 0.12.12 pyhd8ed1ab_1 conda-forge
cached-property 1.5.2 hd8ed1ab_1 conda-forge
cached_property 1.5.2 pyha770c72_1 conda-forge
certifi 2022.9.24 pyhd8ed1ab_0 conda-forge
cffi 1.15.1 py39h74dc2b5_0
charset-normalizer 2.1.1 pyhd8ed1ab_0 conda-forge
colorama 0.4.6 pyhd8ed1ab_0 conda-forge
contourpy 1.0.6 py39hf939315_0 conda-forge
cryptography 38.0.3 py39hd97740a_0 conda-forge
cycler 0.11.0 pyhd8ed1ab_0 conda-forge
cython 0.29.32 py39h5a03fae_1 conda-forge
dbus 1.13.6 he372182_0 conda-forge
decorator 5.1.1 pyhd8ed1ab_0 conda-forge
exceptiongroup 1.0.4 pyhd8ed1ab_0 conda-forge
executing 1.2.0 pyhd8ed1ab_0 conda-forge
expat 2.5.0 h27087fc_0 conda-forge
filelock 3.8.0 pyhd8ed1ab_0 conda-forge
flight-genome 1.5.0 pypi_0 pypi
fontconfig 2.14.1 hc2a2eb6_0 conda-forge
fonttools 4.38.0 py39hb9d737c_1 conda-forge
freetype 2.12.1 hca18f0e_0 conda-forge
glib 2.69.1 h4ff587b_1
gst-plugins-base 1.14.0 hbbd80ab_1
gstreamer 1.14.0 h28cd5cc_2
h5py 3.7.0 nompi_py39h817c9c5_102 conda-forge
hdbscan 0.8.29 pypi_0 pypi
hdf5 1.12.2 nompi_h2386368_100 conda-forge
hdmedians 0.14.2 py39h2ae25f5_3 conda-forge
htslib 1.16 h6bc39ce_0 bioconda
icu 58.2 hf484d3e_1000 conda-forge
idna 3.4 pyhd8ed1ab_0 conda-forge
imageio 2.22.4 pypi_0 pypi
iniconfig 1.1.1 pyh9f0ad1d_0 conda-forge
ipython 8.4.0 py39hf3d152e_0 conda-forge
jedi 0.18.2 pyhd8ed1ab_0 conda-forge
joblib 1.1.1 pypi_0 pypi
jpeg 9e h166bdaf_2 conda-forge
k8 0.2.5 hd03093a_2 bioconda
keyutils 1.6.1 h166bdaf_0 conda-forge
kiwisolver 1.4.4 py39hf939315_1 conda-forge
krb5 1.19.3 h3790be6_0 conda-forge
lcms2 2.14 h6ed2654_0 conda-forge
ld_impl_linux-64 2.38 h1181459_1
lerc 4.0.0 h27087fc_0 conda-forge
libblas 3.9.0 16_linux64_openblas conda-forge
libbrotlicommon 1.0.9 h166bdaf_8 conda-forge
libbrotlidec 1.0.9 h166bdaf_8 conda-forge
libbrotlienc 1.0.9 h166bdaf_8 conda-forge
libcblas 3.9.0 16_linux64_openblas conda-forge
libcurl 7.86.0 h7bff187_1 conda-forge
libdeflate 1.13 h166bdaf_0 conda-forge
libedit 3.1.20191231 he28a2e2_2 conda-forge
libev 4.33 h516909a_1 conda-forge
libffi 3.3 he6710b0_2
libgcc-ng 12.2.0 h65d4601_19 conda-forge
libgfortran-ng 12.2.0 h69a702a_19 conda-forge
libgfortran5 12.2.0 h337968e_19 conda-forge
liblapack 3.9.0 16_linux64_openblas conda-forge
libllvm11 11.1.0 he0ac6c6_5 conda-forge
libnghttp2 1.47.0 hdcd2b5c_1 conda-forge
libopenblas 0.3.21 pthreads_h78a6416_3 conda-forge
libpng 1.6.39 h753d276_0 conda-forge
libssh2 1.10.0 haa6b8db_3 conda-forge
libstdcxx-ng 12.2.0 h46fd767_19 conda-forge
libtiff 4.4.0 h0e0dad5_3 conda-forge
libuuid 2.32.1 h7f98852_1000 conda-forge
libwebp-base 1.2.4 h166bdaf_0 conda-forge
libxcb 1.13 h7f98852_1004 conda-forge
libxml2 2.9.14 h74e7548_0
libzlib 1.2.13 h166bdaf_4 conda-forge
llvm-openmp 15.0.5 he0ac6c6_0 conda-forge
llvmlite 0.39.1 py39h7d9a04d_1 conda-forge
lockfile 0.12.2 py_1 conda-forge
matplotlib 3.6.2 py39hf3d152e_0 conda-forge
matplotlib-base 3.6.2 py39hf9fd14e_0 conda-forge
matplotlib-inline 0.1.6 pyhd8ed1ab_0 conda-forge
minimap2 2.24 h7132678_1 bioconda
msgpack-python 1.0.4 py39hf939315_1 conda-forge
munkres 1.0.7 py_1 bioconda
natsort 8.2.0 pyhd8ed1ab_0 conda-forge
ncurses 6.3 h5eee18b_3
numba 0.56.4 py39h61ddf18_0 conda-forge
numpy 1.21.0 pypi_0 pypi
openjpeg 2.5.0 h7d73246_1 conda-forge
openssl 1.1.1s h166bdaf_0 conda-forge
packaging 21.3 pyhd8ed1ab_0 conda-forge
pandas 1.5.2 py39h4661b88_0 conda-forge
parallel 20170422 pl5.22.0_0 bioconda
parso 0.8.3 pyhd8ed1ab_0 conda-forge
patsy 0.5.3 pyhd8ed1ab_0 conda-forge
pcre 8.45 h9c3ff4c_0 conda-forge
pebble 5.0.3 pypi_0 pypi
perl 5.22.0.1 0 conda-forge
pexpect 4.8.0 pyh1a96a4e_2 conda-forge
pickleshare 0.7.5 py_1003 conda-forge
pillow 9.2.0 py39hf3a2cdf_3 conda-forge
pip 22.2.2 py39h06a4308_0
pluggy 1.0.0 pyhd8ed1ab_5 conda-forge
prompt-toolkit 3.0.33 pyha770c72_0 conda-forge
pthread-stubs 0.4 h36c2ea0_1001 conda-forge
ptyprocess 0.7.0 pyhd3deb0d_0 conda-forge
pure_eval 0.2.2 pyhd8ed1ab_0 conda-forge
pycparser 2.21 pyhd8ed1ab_0 conda-forge
pygments 2.13.0 pyhd8ed1ab_0 conda-forge
pynndescent 0.5.8 pyh1a96a4e_0 conda-forge
pyopenssl 22.1.0 pyhd8ed1ab_0 conda-forge
pyparsing 3.0.9 pyhd8ed1ab_0 conda-forge
pyqt 5.9.2 py39h2531618_6 anaconda
pysocks 1.7.1 py39hf3d152e_5 conda-forge
pytest 7.2.0 pyhd8ed1ab_2 conda-forge
python 3.9.15 haa1d7c7_0
python-dateutil 2.8.2 pyhd8ed1ab_0 conda-forge
python_abi 3.9 2_cp39 conda-forge
pytz 2022.6 pyhd8ed1ab_0 conda-forge
qt 5.9.7 h5867ecd_1
readline 8.2 h5eee18b_0
requests 2.28.1 pyhd8ed1ab_1 conda-forge
rosella 0.4.2 h6f8cb4c_1 bioconda
samtools 1.16.1 h6899075_1 bioconda
scikit-bio 0.5.7 py39hce5d2b2_0 conda-forge
scikit-learn 1.0.2 pypi_0 pypi
scipy 1.8.1 pypi_0 pypi
seaborn 0.12.1 hd8ed1ab_0 conda-forge
seaborn-base 0.12.1 pyhd8ed1ab_0 conda-forge
setuptools 65.5.0 py39h06a4308_0
sip 4.19.13 py39h295c915_0 anaconda
six 1.16.0 pyh6c4a22f_0 conda-forge
sqlite 3.39.3 h5082296_0
stack_data 0.6.1 pyhd8ed1ab_0 conda-forge
starcode 1.4 hec16e2b_2 bioconda
statsmodels 0.13.5 py39h2ae25f5_2 conda-forge
tbb 2021.7.1 pypi_0 pypi
threadpoolctl 3.1.0 pyh8a188c0_0 conda-forge
tk 8.6.12 h1ccaba5_0
tomli 2.0.1 pyhd8ed1ab_0 conda-forge
tornado 6.2 py39hb9d737c_1 conda-forge
tqdm 4.64.1 pyhd8ed1ab_0 conda-forge
traitlets 5.5.0 pyhd8ed1ab_0 conda-forge
typing_extensions 4.4.0 pyha770c72_0 conda-forge
tzdata 2022f h04d1e81_0
umap-learn 0.5.3 py39hf3d152e_0 conda-forge
unicodedata2 15.0.0 py39hb9d737c_0 conda-forge
urllib3 1.26.11 pyhd8ed1ab_0 conda-forge
wcwidth 0.2.5 pyh9f0ad1d_2 conda-forge
wheel 0.37.1 pyhd3eb1b0_0
xorg-libxau 1.0.9 h7f98852_0 conda-forge
xorg-libxdmcp 1.1.3 h7f98852_0 conda-forge
xz 5.2.6 h5eee18b_0
zlib 1.2.13 h166bdaf_4 conda-forge
zstd 1.5.2 h6239696_4 conda-forge
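With a listing this long it can help to pull out only the packages of interest. The sketch below is one way to do that, assuming whitespace-separated name, version, build and optional channel columns as in the listing above. The parser and its names are hypothetical, and the handful of rows in the example are copied from the listing.

```python
from typing import Dict, Iterable, NamedTuple


class Package(NamedTuple):
    version: str
    build: str
    channel: str  # empty string when the listing shows no channel


def parse_conda_list(lines: Iterable[str]) -> Dict[str, Package]:
    """Parse `conda list`-style rows into a dict keyed by package name."""
    packages = {}
    for line in lines:
        fields = line.split()
        # Skip blank lines, '#' header lines, and anything too short to be a row.
        if len(fields) < 3 or fields[0].startswith("#"):
            continue
        name, version, build = fields[:3]
        channel = fields[3] if len(fields) > 3 else ""
        packages[name] = Package(version, build, channel)
    return packages


# A few rows copied from the listing above.
listing = """\
flight-genome 1.5.0 pypi_0 pypi
numba 0.56.4 py39h61ddf18_0 conda-forge
numpy 1.21.0 pypi_0 pypi
umap-learn 0.5.3 py39hf3d152e_0 conda-forge
python 3.9.15 haa1d7c7_0
""".splitlines()

packages = parse_conda_list(listing)
for name in ("flight-genome", "numba", "numpy", "umap-learn", "python"):
    pkg = packages[name]
    print(f"{name}\t{pkg.version}\t{pkg.channel or '(no channel listed)'}")
```

Header lines are only skipped when they start with "#", as they do in raw conda list output. Header text pasted without the "#" would need to be removed by hand first.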
