Using a fasta file with multiple adapter sequences for trimming #86
Hi @tobjores Generally, it wouldn't be a problem to allow additional characters in the adapter sequence. So far we have not allowed the use of an entire file of adapter sequences, because we were of the opinion that this is hardly ever something you want to be doing for typical genomic use cases. Could you give an example of your intended use case? |
In my case, I had promoter variants that were linked to three different 5'-UTRs. For aligning the promoter sequences to the reference, I had to trim off the UTRs. I could have done that by running trim_galore three times (once for each adapter) or one time with a file containing the three UTR sequences.
with
should do the trick |
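As a minimal sketch of the single-file alternative described above, the three UTR sequences could be rendered as one FASTA file and referenced through cutadapt's `file:` prefix. The record names, the sequences, and the file name below are hypothetical placeholders, not the actual UTRs from this use case.

```python
def format_fasta(records):
    """Render (name, sequence) pairs as FASTA text, two lines per record."""
    lines = []
    for name, sequence in records:
        lines.append(f">{name}")
        lines.append(sequence.upper())
    return "\n".join(lines) + "\n"


# Hypothetical placeholder names and sequences, not the real UTRs.
utr_adapters = [
    ("utr_1", "ACGTACGTAC"),
    ("utr_2", "TTGACCTGAA"),
    ("utr_3", "GGCATCGATT"),
]

fasta_text = format_fasta(utr_adapters)

# cutadapt reads adapters from a FASTA file when the adapter value starts
# with "file:". The file name here is an assumption for illustration.
adapter_argument = "file:" + "utr_adapters.fasta"
```

Writing `fasta_text` to the named file and passing `adapter_argument` as the adapter value would replace three separate trimming runs with one.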
Hi @tobjores Apologies for the slow response; in addition to all the working from home we also had some tragic incidents at our institute... I have tried to understand the issue you raised, and you might indeed be right that the In the meantime, I have pushed a new version to GitHub that should allow you to specify your three different adapters on the command line. The format would have to look like this:
Where I just picked some random DNA sequences as examples. Could you clone the latest dev version of Trim Galore and let me know if it works in your hands? Once I've got a bit more time I will try to look into the Best, Felix |
Hi @FelixKrueger |
Hi Tobias, Thanks for the feedback. I have now done some more tests, and can confirm that you can now either use a command line option such as this:
or also using the
Or any combination thereof. Please get the latest dev version once more, and let me know if something does not appear to be working as expected. Stay safe! |
Hi @FelixKrueger, I would like to use this feature in a project, can I request that you tag a new release and also update the conda version? Thanks, |
I'll try to make a release very soon (Sunday, Monday?), I think the Conda event gets triggered automatically. |
Alright, managed to get it out just now. Enjoy, I hope it works in your hands! Felix |
Thank you!
… On Sep 4, 2020, at 5:30 PM, Felix Krueger ***@***.***> wrote:
Alright, managed to get it out just now. Enjoy, I hope it works in your hands! Felix
|
Correct me if I'm wrong but neither one of these strategies is guaranteed to trim all adapters as cutadapt's |
That's correct. I suppose you could 'trick' it into using the
|
Hi Felix! On a similar note, if I want to trim multiple adapters from the R2 paired file, does it work to structure it the same way? Ex. -a2 " AGCTCCCG -a2 TTTCATTATAT -a2 TTTATTCGGATTTAT -n 3". In my scenario I have long polyA and polyT tails on both my R1 and R2 reads and would like to trim from both simultaneously. The following code no longer works when I try to specify multiple sequences to trim. |
I think the solution here would be to replace the OLD: The reason for this is that R1 and R2 do in fact get processed (as single-end reads) separately, which the second In essence, this command line should work:
|
@FelixKrueger That worked! Thank you so much! |
Since version 1.5, cutadapt accepts a fasta file with multiple adapters by using "file:adapters.fasta" as the adapter. This feature, however, cannot be used with trim_galore, as it requires adapters to consist only of DNA letters. Yet, this could be easily implemented by adding a test for the prefix "file:" to the adapter check routine. If this prefix is encountered, the adapter "file:..." can be forwarded directly to cutadapt (without upper-casing it first). Optionally, trim_galore could check whether the specified file exists and issue an error if not.
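The proposed check could be sketched roughly as follows. This is an illustration in Python, not Trim Galore's actual code (Trim Galore is written in Perl); the function name `normalise_adapter`, the `check_exists` parameter, and the set of allowed letters are assumptions made for the example.

```python
import os
import re

# Assumption: a plain adapter may contain only these letters.
DNA_LETTERS = re.compile(r"[ACGTN]+")


def normalise_adapter(adapter, check_exists=True):
    """Return the adapter value to hand to cutadapt (hypothetical helper).

    Values starting with "file:" are forwarded unchanged, so the path keeps
    its original case. Anything else is upper-cased and must be DNA letters.
    """
    if adapter.startswith("file:"):
        path = adapter[len("file:"):]
        if not path:
            raise ValueError("no file name given after 'file:'")
        # Optional existence check, as suggested in the proposal.
        if check_exists and not os.path.isfile(path):
            raise FileNotFoundError(f"adapter file not found: {path}")
        return adapter

    upper = adapter.upper()
    if not DNA_LETTERS.fullmatch(upper):
        raise ValueError(f"adapter contains non-DNA characters: {adapter}")
    return upper
```

For example, `normalise_adapter("acgt")` returns `"ACGT"`, while `normalise_adapter("file:Adapters.fasta", check_exists=False)` returns the string unchanged, preserving the case of the file name.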