Ruolin
Adding the gene_id from the Ensembl annotation is useful, and much faster than doing it afterwards with my Python script. I'll let you know how this pans out with the downstream tools I use. There are a few other minor issues that, if addressed, would make the tool better IMO:
- A further improvement would be to add the gene_name as well. Here is an example from StringTie of the field we are discussing:
gene_id "ERR188044.1"; transcript_id "ERR188044.1.1"; reference_id "NM_018390"; ref_gene_id "NM_018390"; ref_gene_name "PLCXD1"
So we would have your tool's gene_id and transcript_id, plus the ref_gene_name and ref_gene_id from the Ensembl/reference annotation.
- When I run with the --no-quant flag, would it be possible to omit the TPM and FPKM attributes from the annotation output file? They should not really be there. It is a minor issue, but we currently get:
;FPKM "NA";Frac "NA";TPM "NA";
- Usually I run StringTie in one directory and it writes its .gtf files into that same directory, each named after the original file ID. This is easier than having them all in different directories and then having to rename and move them before I run cuffmerge. I actually use TACO (https://www.nature.com/articles/nmeth.4078) instead of cuffmerge; it is supposed to be a lot better, and my results made more sense with it.
- I don't really want any log files to be written; I only need the .gtf file. At the moment I have to include extra code to clean all this up. Would it be possible to have a parameter that produces a single .gtf and nothing else?
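To illustrate the --no-quant point, here is a minimal sketch of the kind of post-processing that request would make unnecessary. It assumes the attributes appear exactly as in the example above (`;FPKM "NA";Frac "NA";TPM "NA";`) and that the file is a standard 9-column, tab-separated GTF. The function names are hypothetical and not part of strawberry.

```python
import re

# Assumed attribute format, taken from the example above:
#   ;FPKM "NA";Frac "NA";TPM "NA";
# Only attributes whose value is literally "NA" are removed.
_NA_ATTR = re.compile(r'\s*\b(?:FPKM|Frac|TPM) "NA";')


def strip_na_quant(attributes: str) -> str:
    """Remove FPKM/Frac/TPM "NA" attributes from a GTF attribute column."""
    return _NA_ATTR.sub("", attributes)


def clean_gtf_line(line: str) -> str:
    """Apply strip_na_quant to column 9 of a GTF line; pass other lines through."""
    if line.startswith("#"):
        return line  # header/comment lines are left untouched
    newline = "\n" if line.endswith("\n") else ""
    fields = line.rstrip("\n").split("\t")
    if len(fields) == 9:
        fields[8] = strip_na_quant(fields[8])
    return "\t".join(fields) + newline
```

Real quantification values (for example `TPM "5.0";`) are left in place, so the same function is safe to run on output produced without --no-quant.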
Thanks for your time. These are just some ideas for you to review; I am keen on using strawberry in our work.
Chris