alternative for seqkit locate that can locate subsequences/motifs #106
Comments
It's a very nice package overall, and to your point would make for a good reference to build a "recipe" list from. To your question, depending on the end goal, there's a somewhat crude way you could accomplish this in biobear, but it may not fit your needs well enough. This relates back to your comments trimming, in that I'm looking at performant alignment / string functions that I can add to For example, use the seqkit example
You could then add that to a Perhaps, I'll add a function that replicates
Then more or less have the returned table be aligned with
Thanks @tshauck for your response. I think that's an interesting addition (it's interesting that For the
@abearab, cool, that makes sense to me... I just updated biobear on pypi with the new quality score function as well as an alignment_score function that does basic local alignment between two sequences (e.g. e69df6b). I'll explore more how to return alignment position(s) within a string and follow up when I have some more info.
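For readers following along, the idea of a basic local alignment score between two sequences can be sketched with the Smith-Waterman recurrence. This is a minimal pure-Python sketch and not biobear's implementation: the function name `local_alignment_score`, the linear gap model, and the scoring values (`match=2`, `mismatch=-1`, `gap=-2`) are all assumptions chosen for illustration.

```python
def local_alignment_score(a: str, b: str, match: int = 2,
                          mismatch: int = -1, gap: int = -2) -> int:
    """Return the best Smith-Waterman local alignment score of a and b.

    Hypothetical helper: scoring values and linear gap penalty are
    illustrative assumptions, not what biobear uses.
    """
    prev = [0] * (len(b) + 1)  # previous row of the DP matrix
    best = 0
    for ca in a:
        curr = [0]  # first column is always 0 for local alignment
        for j, cb in enumerate(b, start=1):
            diag = prev[j - 1] + (match if ca == cb else mismatch)
            # Clamp at 0 so an alignment can restart anywhere.
            score = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            curr.append(score)
            best = max(best, score)
        prev = curr
    return best
```

Only two rows are kept in memory because the sketch returns the score alone; reporting alignment positions, as discussed above, would additionally need the coordinates of the best cell and a traceback.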
@tshauck I have an issue installing it (this might be a separate issue, I just created a new VM and fresh OS, so ...)
I had to install this at the system level:
Ah, ok, glad you got it. Looks like I have a little work to do dependency-wise.
Hi @tshauck, I'm getting back to this discussion and I need to use this functionality to process a dataset. I would be more than happy to test your tool in case you have any new features. What do you recommend to start with? I liked your idea to have that
Hey @abearab -- apologies for taking a bit to get onto this. The CRAM scanning feature is done, and I got started on the writing (though I need some feedback from another developer). For
I have something in this branch I hope to finish up tomorrow. It's slightly different than locate as it requires a regex right now, but is relatively close... e.g.
This is similar to the seqkit example: https://bioinf.shenwei.me/seqkit/usage/#locate
I also hope to add a non-regex-based one similar to
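As an illustration of the regex-based approach described here, the following is a minimal sketch in plain Python using only the standard `re` module. It does not use or reproduce the biobear API: the `Hit` record, the `locate` function, and the choice to report 1-based inclusive coordinates on both strands are assumptions made for this example.

```python
import re
from typing import Iterable, Iterator, NamedTuple, Tuple


class Hit(NamedTuple):
    seq_id: str
    strand: str   # "+" or "-"
    start: int    # 1-based, inclusive, on the forward strand
    end: int      # 1-based, inclusive, on the forward strand
    matched: str  # matched text, as read on the searched strand


_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    return seq.translate(_COMPLEMENT)[::-1]


def locate(records: Iterable[Tuple[str, str]], pattern: str) -> Iterator[Hit]:
    """Yield every (possibly overlapping) regex match in each sequence.

    Hypothetical helper for illustration only.
    """
    # A lookahead with a capture group lets finditer report overlapping hits.
    regex = re.compile(f"(?=({pattern}))", re.IGNORECASE)
    for seq_id, seq in records:
        for m in regex.finditer(seq):
            text = m.group(1)
            if text:  # skip zero-length matches
                s = m.start(1)
                yield Hit(seq_id, "+", s + 1, s + len(text), text)
        rc = reverse_complement(seq)
        for m in regex.finditer(rc):
            text = m.group(1)
            if text:
                s, e = m.start(1), m.start(1) + len(text)
                # Map reverse-strand offsets back to forward coordinates.
                yield Hit(seq_id, "-", len(seq) - e + 1, len(seq) - s, text)
```

Calling `list(locate([("s1", "ACGTTACGT")], "ACG"))` yields hits on both strands. A non-regex variant could replace `regex.finditer` with repeated `str.find` calls.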
@abearab I updated biobear to support biobear/python/tests/test_session.py Lines 43 to 68 in 5d9a881
Please let me know if you have any feedback on how that works for your task -- thanks!
Hi @tshauck, thanks for looking into this. There is no time pressure on my end (i.e. I might also not test new features in
Cool, that all sounds good to me, no rush for me. Please just "at" me on this issue if/when you use this for your work if you have any thoughts on improvements and/or questions. Thanks!
I found seqkit locate very useful but it was surprisingly slow. Do you think there is any biobear approach for that? In general, seqkit functions are very common and useful, in case you are interested in looking deeper :)