Dear fgbio team,
I would like to use AnnotateBamWithUmis, however I am not very sure about the steps to take. I couldn't find any tutorial that refers to AnnotateBamWithUmis.
I have paired-end RNA-seq data, with three FASTQ files per sample: R1 is the forward read, R3 is the reverse read, and the UMI is in a separate R2 read file.
I already aligned the R1 and R2 to the genome using Hisat2 and created a SAM file per sample. Then, with samtools view I created a bam file per sample, so I would like to use AnnotateBamWithUmis to annotate these bam files.
My questions are the following:
Should I sort the BAM files using samtools sort before annotating them with AnnotateBamWithUmis? Or should I use AnnotateBamWithUmis first and then sort the annotated bam files?
Do you have any example code to use AnnotateBamWithUmis?
Thank you very much for your help.
With best regards,
Sofia
Please let us know how the usage (fgbio AnnotateBamWithUmis --help) could be improved.
If the FASTQ reads are in the same order as the BAM, then use the --sorted option to indicate that. If they are not, all the FASTQ reads will be read into memory (which needs a lot of memory for large FASTQs). I don't think it matters whether you sort before or after, unless you happen to sort the BAM into the same order as the FASTQ (unlikely).
"Do you have any example code to use AnnotateBamWithUmis?"
It should be as simple as: fgbio AnnotateBamWithUmis --input in.bam --fastq R2.fastq.gz --output out.bam
Thank you very much for your prompt response,
Regarding question 1: How can I make sure that the FASTQ reads are in the same order as the BAM reads?
For example: I'm checking the first few lines in the bam file (before sorting with samtools sort) and in the fastq file, and they both share the same order of sequence identifiers (at least in the first lines, attached images 1 and 2; Fastq_file and UNsorted_bam_file). Does this mean I can use AnnotateBamWithUmis with the --sorted option?
On the other hand, if I compare the order of the sequence identifiers in the sorted_bam file (obtained using samtools sort), it differs from the fastq file (image 3; samtools_sorted_bam_file). Therefore, would it be the best option to annotate the unsorted bam file with AnnotateBamWithUmis --sorted, and then use samtools sort on the already annotated bam file?
Extra question: How should I include the --sorted option? If I run fgbio AnnotateBamWithUmis --input in.bam --fastq R2.fastq.gz --output out.bam --sorted true, I get the following error: "No option found with name sorted"
Thank you very much for your time and kind help.
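Comparing only the first few lines by eye can miss a divergence further down the files, so one way to approach the ordering question is to compare the read names over the whole input. The following is a minimal sketch, not part of fgbio: the function names (fastq_names, sam_names, first_mismatch) are hypothetical, and it assumes that mates and multiple alignments of the same read sit next to each other in the unsorted BAM, so consecutive repeats of a name are collapsed. It also assumes every read in the FASTQ has at least one record in the BAM. Whether this matches exactly what --sorted expects is an assumption worth verifying against the tool's usage text.

```python
import itertools
from typing import Iterable, Iterator, Optional, Tuple


def fastq_names(lines: Iterable[str]) -> Iterator[str]:
    """Yield the read name from the header line of each 4-line FASTQ record."""
    for i, line in enumerate(lines):
        if i % 4 == 0 and line.strip():
            # Drop the leading '@', any comment after whitespace,
            # and a trailing /1 or /2 mate suffix if present.
            name = line[1:].split()[0]
            if name.endswith(("/1", "/2")):
                name = name[:-2]
            yield name


def sam_names(lines: Iterable[str]) -> Iterator[str]:
    """Yield query names from SAM text, collapsing consecutive repeats."""
    names = (
        line.rstrip("\n").split("\t", 1)[0]
        for line in lines
        if line.strip() and not line.startswith("@")  # skip SAM header lines
    )
    for name, _group in itertools.groupby(names):
        yield name


def first_mismatch(
    fastq_lines: Iterable[str], sam_lines: Iterable[str]
) -> Optional[Tuple[int, Optional[str], Optional[str]]]:
    """Return (index, fastq_name, sam_name) at the first difference, or None."""
    pairs = itertools.zip_longest(fastq_names(fastq_lines), sam_names(sam_lines))
    for index, (fq_name, sam_name) in enumerate(pairs):
        if fq_name != sam_name:
            return index, fq_name, sam_name
    return None


# Tiny in-memory demonstration with made-up read names.
demo_fastq = ["@r1 1:N:0\n", "ACGT\n", "+\n", "IIII\n",
              "@r2 1:N:0\n", "TTGA\n", "+\n", "IIII\n"]
demo_sam = ["r1\t99\tchr1\n", "r1\t147\tchr1\n",
            "r2\t77\t*\n", "r2\t141\t*\n"]
print(first_mismatch(demo_fastq, demo_sam))
```

For real files, the FASTQ lines could come from gzip.open("R2.fastq.gz", "rt") and the SAM lines from the text output of samtools view in.bam, for example read from standard input. A result of None means the two name orders agree under the assumptions above.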