pRIblast running question #6

Open
fanglongfa opened this issue Oct 17, 2023 · 4 comments

Comments

@fanglongfa

mpirun -np 6 -x OMP_NUM_THREADS=6 pRIblast ris -i LncRNA.fa -o ZMpredictions.txt -d ZM -a area

[lzu-MZ72-HB0-00:2149410] *** Process received signal ***
[lzu-MZ72-HB0-00:2149410] Signal: Aborted (6)
[lzu-MZ72-HB0-00:2149410] Signal code: (-6)
[lzu-MZ72-HB0-00:2149410] [ 0] /lib/x86_64-linux-gnu/libc.so.6(+0x42520)[0x14db7f442520]
[lzu-MZ72-HB0-00:2149410] [ 1] /lib/x86_64-linux-gnu/libc.so.6(pthread_kill+0x12c)[0x14db7f4969fc]
[lzu-MZ72-HB0-00:2149410] [ 2] /lib/x86_64-linux-gnu/libc.so.6(raise+0x16)[0x14db7f442476]
[lzu-MZ72-HB0-00:2149410] [ 3] /lib/x86_64-linux-gnu/libc.so.6(abort+0xd3)[0x14db7f4287f3]
[lzu-MZ72-HB0-00:2149410] [ 4] /mnt/sdb/fanglf/anaconda3/envs/CIRI/lib/libstdc++.so.6(+0xb135a)[0x14db7f8b135a]
[lzu-MZ72-HB0-00:2149410] [ 5] /mnt/sdb/fanglf/anaconda3/envs/CIRI/lib/libstdc++.so.6(+0xb13c5)[0x14db7f8b13c5]
[lzu-MZ72-HB0-00:2149410] [ 6] /mnt/sdb/fanglf/anaconda3/envs/CIRI/lib/libstdc++.so.6(+0xb1658)[0x14db7f8b1658]
[lzu-MZ72-HB0-00:2149410] [ 7] /mnt/sdb/fanglf/anaconda3/envs/CIRI/lib/libstdc++.so.6(_ZSt20__throw_length_errorPKc
[lzu-MZ72-HB0-00:2149410] [ 8] pRIblast(+0x35277)[0x557d90c24277]
[lzu-MZ72-HB0-00:2149410] [ 9] pRIblast(+0x1a10e)[0x557d90c0910e]
[lzu-MZ72-HB0-00:2149410] [10] pRIblast(+0x3029b)[0x557d90c1f29b]
[lzu-MZ72-HB0-00:2149410] [11] pRIblast(+0xee2a)[0x557d90bfde2a]
[lzu-MZ72-HB0-00:2149410] [12] /lib/x86_64-linux-gnu/libc.so.6(+0x29d90)[0x14db7f429d90]
[lzu-MZ72-HB0-00:2149410] [13] /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0x80)[0x14db7f429e40]
[lzu-MZ72-HB0-00:2149410] [14] pRIblast(+0x11c05)[0x557d90c00c05]
[lzu-MZ72-HB0-00:2149410] *** End of error message ***

@amatria
Collaborator

amatria commented Oct 17, 2023

Hi :)

Can you please provide me with your input files so I can investigate the issue?

@amatria amatria self-assigned this Oct 17, 2023
@amatria amatria added the bug Something isn't working label Oct 17, 2023
@fanglongfa
Author

fanglongfa commented Oct 18, 2023 via email

@amatria
Collaborator

amatria commented Oct 18, 2023

Hi,

I cannot download the attachments; you need to upload them manually to the GitHub issue. However, I can see that your target FASTA file is 62 MB and your query FASTA file is 33 MB, so this is likely an out-of-memory issue. Have you tried rebuilding the database with the -c <INT> parameter (i.e., the database page size)? The following is taken from the original pRIblast paper:

4.2. Database paging

RIblast writes results to the output file only after all target sequences have been compared against a certain lncRNA query sequence. Therefore, if a target database is too large or a query segment produces a big amount of prediction candidates (i.e., seeds), memory may be filled up before results are finally saved to disk.

pRIblast introduces a very effective optimization to overcome this limitation. It divides the target database into several smaller and independent pages, which, because of their reduced size, will not overflow the available memory space. Put differently, because results can only be written after a sequence has been compared against a target database, pRIblast makes predictions against subsets of such database. In this way, once a query sequence has been compared against all subsets of the target dataset, the output file will hold the same results as if the sequence had been compared against the whole target database in one single round.
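To make the paging idea concrete, here is a minimal Python sketch. It is not pRIblast's code: `toy_score`, `search_whole`, and `search_paged` are hypothetical names, and the scoring function is a trivial stand-in for the real interaction prediction. It only illustrates that searching page by page and appending each page's hits gives the same result as searching the whole target set at once, while holding just one page's hits in memory at a time.

```python
from typing import Iterator, List, Sequence, Tuple

Hit = Tuple[str, str, int]  # (query, target, toy score)

# Complementary base pairs used by the toy score (assumption for illustration).
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def toy_score(query: str, target: str) -> int:
    """Hypothetical stand-in for an interaction score: count complementary
    pairs between the query and the reversed target."""
    return sum((q, t) in PAIRS for q, t in zip(query, reversed(target)))


def pages(targets: Sequence[str], page_size: int) -> Iterator[Sequence[str]]:
    """Split the target set into consecutive pages of at most page_size."""
    for start in range(0, len(targets), page_size):
        yield targets[start:start + page_size]


def search_whole(query: str, targets: Sequence[str], min_score: int) -> List[Hit]:
    """Compare the query against every target in one round."""
    hits = []
    for target in targets:
        score = toy_score(query, target)
        if score >= min_score:
            hits.append((query, target, score))
    return hits


def search_paged(query: str, targets: Sequence[str], page_size: int,
                 min_score: int, sink: List[Hit]) -> None:
    """Compare the query against one page at a time and flush each page's
    hits to `sink` (a stand-in for the output file) before moving on."""
    for page in pages(targets, page_size):
        sink.extend(search_whole(query, page, min_score))


if __name__ == "__main__":
    targets = ["GGAUCC", "AAAAAA", "CCUAGG", "GGAUC"]
    paged: List[Hit] = []
    search_paged("GGAUCC", targets, page_size=2, min_score=4, sink=paged)
    print(paged == search_whole("GGAUCC", targets, min_score=4))
```

Smaller pages lower the peak number of hits held in memory, at the cost of more rounds per query.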

Because of the nature of the data structures stored in the database, it is not possible to paginate a target dataset with no previous preprocessing. Therefore, the database construction step (see Section 3) has been slightly modified to be able to read chunks of a fixed size later in the RNA interaction search step. In any case, databases built without the tweaked version of the construction step are backwards compatible with the new pRIblast workflow.
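As a rough illustration of reading fixed-size chunks, the sketch below groups FASTA records lazily into chunks of `chunk_size` sequences. This is an assumption-laden toy: the real pRIblast database stores preprocessed data structures rather than raw FASTA, and `read_fasta` and `fasta_chunks` are hypothetical helper names, not part of the tool.

```python
import io
from itertools import islice
from typing import Iterator, List, TextIO, Tuple

Record = Tuple[str, str]  # (header without '>', sequence)


def read_fasta(handle: TextIO) -> Iterator[Record]:
    """Yield (name, sequence) records one at a time from a FASTA handle."""
    name = None
    parts: List[str] = []
    for line in handle:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(parts)
            name, parts = line[1:], []
        elif name is not None:
            parts.append(line)
    if name is not None:
        yield name, "".join(parts)


def fasta_chunks(handle: TextIO, chunk_size: int) -> Iterator[List[Record]]:
    """Yield lists of at most chunk_size records; only one chunk is
    materialized in memory at a time."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    records = read_fasta(handle)
    while True:
        chunk = list(islice(records, chunk_size))
        if not chunk:
            return
        yield chunk


if __name__ == "__main__":
    demo = io.StringIO(">t1\nACGU\n>t2\nGG\n>t3\nCC\n")
    for chunk in fasta_chunks(demo, 2):
        print([name for name, _ in chunk])
```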

In simpler terms:

  1. Rebuild your database using the -c <INT> parameter. For example, try -c 250 (which processes the target sequences in chunks of 250).
$ mpirun -np 6 -x OMP_NUM_THREADS=6 ./target/pRIblast.release db -i Zhongmu.fa -o ZM-250 -c 250 -a block
  2. Execute the RNA interaction search step again and monitor memory usage. If you still run out of memory, rebuild the database again, but this time with a lower chunk size value, such as -c 100. You can even choose a chunk size of 1 if memory is limited, but this will impact the tool's performance.
$ mpirun -np 6 -x OMP_NUM_THREADS=6 ./target/pRIblast.release ris -i LncRNA.fa -d ZM-250 -o ZMpredictions.txt -c 250 -a area
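The two steps above can be sketched as a small retry loop in Python. This only assembles the same commands shown above, using the flags and file names from this thread, and tries progressively smaller chunk sizes. The helper names are hypothetical, the `ZM-<size>` database naming is an assumption extrapolated from `ZM-250`, and the loop assumes a failed run exits with a non-zero status. By default it only prints the commands (`dry_run=True`).

```python
import shlex
import subprocess
from typing import List, Optional, Sequence

BINARY = "./target/pRIblast.release"  # path taken from the commands above


def mpi_prefix(procs: int = 6, threads: int = 6) -> List[str]:
    """Common mpirun prefix used by both steps."""
    return ["mpirun", "-np", str(procs), "-x",
            f"OMP_NUM_THREADS={threads}", BINARY]


def db_command(target_fasta: str, db_name: str, chunk_size: int) -> List[str]:
    """Database construction step with an explicit chunk size."""
    return mpi_prefix() + ["db", "-i", target_fasta, "-o", db_name,
                           "-c", str(chunk_size), "-a", "block"]


def ris_command(query_fasta: str, db_name: str, output: str,
                chunk_size: int) -> List[str]:
    """RNA interaction search step against the rebuilt database."""
    return mpi_prefix() + ["ris", "-i", query_fasta, "-d", db_name,
                           "-o", output, "-c", str(chunk_size), "-a", "area"]


def try_chunk_sizes(chunk_sizes: Sequence[int] = (250, 100, 1),
                    dry_run: bool = True) -> Optional[int]:
    """Try each chunk size in order; return the first one for which both
    steps exit with status 0, or None (always None in dry-run mode)."""
    for size in chunk_sizes:
        db_name = f"ZM-{size}"  # assumed naming pattern, after "ZM-250"
        steps = [
            db_command("Zhongmu.fa", db_name, size),
            ris_command("LncRNA.fa", db_name, "ZMpredictions.txt", size),
        ]
        if dry_run:
            for step in steps:
                print(shlex.join(step))
            continue
        # all() short-circuits, so the search is skipped if the build fails.
        if all(subprocess.run(step).returncode == 0 for step in steps):
            return size
    return None


if __name__ == "__main__":
    try_chunk_sizes()  # prints the commands without running them
```

Starting with the largest chunk size and stepping down keeps performance as high as memory allows, since smaller chunks are slower.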

Furthermore, judging by the size of your inputs, it's likely that you'll require ample disk space to store the output interactions resulting from your analysis. Refer to the table below, once more obtained from the original pRIblast paper, which can provide you with an estimate of the potential growth in the size of output files (Anser is approximately 12MB in size):
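Before launching a long run, it may be worth checking free disk space on the output volume. The sketch below is a generic Python check, not something pRIblast provides: `expected_output_bytes` is a number you would estimate yourself from the paper's table, and the `safety_factor` of 2 is an arbitrary assumption.

```python
import shutil


def free_bytes(path: str) -> int:
    """Free space, in bytes, on the filesystem that contains `path`."""
    return shutil.disk_usage(path).free


def enough_space(path: str, expected_output_bytes: int,
                 safety_factor: float = 2.0) -> bool:
    """True if the filesystem has room for the expected output size
    multiplied by a safety margin (the margin is an assumption)."""
    return free_bytes(path) >= expected_output_bytes * safety_factor


if __name__ == "__main__":
    # Report free space in the current directory, in GiB.
    print(f"free: {free_bytes('.') / 2**30:.1f} GiB")
```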

I hope this helps :-).
Iñaki

@amatria amatria added documentation Improvements or additions to documentation good first issue Good for newcomers and removed bug Something isn't working labels Oct 18, 2023
@fanglongfa
Author

ok, thank you, I will try it.
