Gene_Centric_Coding function fails due to character-formatted annotation data #36
Comments
Hi @kwdoyle, Thank you for letting me know. Could I ask which FAVOR database files you used to annotate your genotype data? Best,
I'm using the full database of 160 annotations hosted on Harvard Dataverse. More specifically, just the database file for chromosome 5 for now. Interestingly enough, the Annotate.R script seems to work correctly. If I read in the saved, merged annotation file "Anno_chr5_STAARpipeline.csv" with data.table::fread, all of the data indicated in my previous post as being character-formatted are read in as numeric. So presumably the issue occurs somewhere in the gds2agds.R script? It also might be of interest that I chose to annotate the GDS using all 160 annotations (or rather, I set my 'anno_colnum' variable to c(1, 8:190)). That is, I chose the same initial annotations as the default (1 and 8) and then selected all variables from 8 to the end. I'm not sure if this is relevant, however, as the file from Annotate.R seems fine.
Oh, while looking through the scripts, I think this part in gds2agds.R might be the problem:
The character values I'm seeing might be in the location of these hard-coded character columns.
Hi @kwdoyle, Thank you so much for taking a closer look at the issue. Yes indeed, the current FAVORannotator script in the STAARpipeline-Tutorial works well for the FAVOR Essential Database hosted on Harvard Dataverse. For the full database of 160 annotations, it is recommended to adapt the Best,
Yes, this was indeed the issue. If I remove the col_types specification and let read_csv auto-assign the column classes, everything seems to work fine.
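For anyone hitting the same thing, here is a minimal sketch of the difference between a hard-coded col_types specification and letting read_csv guess. The file contents, the VarInfo column name, and the values are made up for illustration; this is not the actual specification from gds2agds.R, only the same kind of mismatch on a toy file.

```r
library(readr)

# Hypothetical two-column file standing in for the merged annotation CSV.
tmp <- tempfile(fileext = ".csv")
writeLines(c("VarInfo,aPC.LocalDiversity",
             "5-100-A-G,12.5",
             "5-200-C-T,3.1"), tmp)

# Hard-coded types: the numeric annotation is declared as character,
# so it is stored as strings no matter what the file contains.
forced <- read_csv(tmp, col_types = cols(
  VarInfo = col_character(),
  aPC.LocalDiversity = col_character()
))

# An empty cols() specification lets readr guess each column's type
# from the data (and suppresses the column-type message).
guessed <- read_csv(tmp, col_types = cols())

class(forced$aPC.LocalDiversity)   # character
class(guessed$aPC.LocalDiversity)  # numeric
```

If the hard-coded specification is kept, its entries would need to line up with whichever columns anno_colnum actually selects.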
Thank you @kwdoyle! Feel free to let me know if you have any other questions, or you may close this issue.
Hello,
I've been working my way through each step in the pipeline and have encountered an error when running STAARpipeline_Gene_Centric_Coding.r
The error occurs in the internal coding function within the Gene_Centric_Coding function. When the current annotation name is "aPC.LocalDiversity", the script attempts to perform a transformation of the data:
However, the data in Annotation.PHRED is of the character class, causing this error:
aPC.LocalDiversity is not the only character annotation. Of those picked from the name catalog, the only other character values are 3 other annotation PCs:
Would anyone know why these values would be written to (or read from) the GDS file as characters?
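As a diagnostic, something like the following base R sketch can flag which annotation columns came back as character and coerce them before any arithmetic is applied. The data frame and the CADD column here are hypothetical stand-ins, not values read from a GDS file, and coercing after the fact is only a workaround for the symptom, not a fix for how the values were written.

```r
# Hypothetical annotation table: one column stored as strings, one as numbers.
anno <- data.frame(
  aPC.LocalDiversity = c("12.5", "3.1"),
  CADD = c(10.2, 4.4),
  stringsAsFactors = FALSE
)

# Flag the columns whose class is character.
is_chr <- vapply(anno, is.character, logical(1))
chr_cols <- names(anno)[is_chr]
print(chr_cols)

# Coerce the flagged columns to numeric. as.numeric() yields NA with a
# warning for strings that are not numbers, so check for NAs afterwards.
anno[is_chr] <- lapply(anno[is_chr], as.numeric)
stopifnot(!anyNA(anno[chr_cols]))
```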