Change sample sheet Sample_ID and Sample_Name to use actual sample id instead of sample "content"? #237

@AmandaBirmingham

I have split this issue off of #204 because I think it may need more discussion before implementation.

#204 (comment)
"@tanaes Also, this question is not directly related to producing the project shortname, but I just want to double-check: what you want as the Sample_ID, Sample_Name, etc, in the shotgun sample sheet are strings that are the sample ids plus the plate and well they were plated on in the original sample plate? As in, "1_SKB1_640202_21_A1" (where the actual sample_id in qiita.study_sample is "1_SKB1_640202")?"

#204 (comment)
"We don't actually need to have the plate info -- just the study + sample identifier is ok, I don't want to encode extraneous data in the filename. TBH I'd rather have a non-human-readable unique identifier but I don't think that will work in our system."

#204 (comment)
"@tanaes Just to be clear, my question above is about the contents of the "Sample_ID" and "Sample_Name" columns in the shotgun sample sheet that LabPerson generates; as far as I know, these values aren't file names (or are you saying they are used as that, somewhere downstream)? As I said, this question strays a bit from the task of generating the project shortname; sorry :)

I just wanted to double-check that, whatever you use the sample sheet for after getting back the sequencing data, you actively want these "sample id plus position" descriptors in it rather than the actual keys that would allow you to, say, look up the sample metadata in Qiita (without having to strip off extraneous plate id and well position pieces at the end of the string). If you DO want the "sample id plus position" info (or you just don't care :) ), then all is copacetic. If actual Qiita sample ids would be more useful to you, it would be very easy to put them in the shotgun sample sheet instead of the "sample id plus position" ids."
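
To illustrate the "strip off extraneous plate id and well position pieces" step mentioned above, here is a minimal sketch. The function name and the `<sample_id>_<plate>_<well>` pattern (numeric plate id, letter-plus-number well) are assumptions inferred from the single example "1_SKB1_640202_21_A1"; they are not taken from the actual code base.

```python
import re

# Hypothetical pattern: "<sample_id>_<plate id>_<well>", e.g. "1_SKB1_640202_21_A1".
# The numeric plate id and letter-plus-number well are assumptions based on
# that one example string.
_CONTENT_RE = re.compile(
    r"^(?P<sample_id>.+)_(?P<plate>\d+)_(?P<well>[A-Z]\d{1,2})$"
)


def strip_plate_and_well(content_id):
    """Return the sample id portion of a 'sample id plus position' string."""
    match = _CONTENT_RE.match(content_id)
    if match is None:
        raise ValueError(
            f"not in '<sample_id>_<plate>_<well>' form: {content_id!r}"
        )
    return match.group("sample_id")


print(strip_plate_and_well("1_SKB1_640202_21_A1"))  # 1_SKB1_640202
```

Any parser of this kind has to guess where the sample id ends, so a sample id that itself happens to end in plate-and-well-like tokens would be cut too short. Putting the actual sample id in the sample sheet would avoid that guess.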

#204 (comment)
"Sample names
Based on how the pipeline currently works, I think providing the actual Qiita Sample IDs to the format_sample_sheet machinery is going to be best. They do end up being munged into filenames (we end up encoding both the Qiita sample ID and the Illumina BCL2Fastq-compatible name in the sample sheet, and the latter is what gets prepended to the fastq filename), but we want to retain the original Qiita ID for when we rename these files later on. Currently this process is all already handled, and so I don't think [sic]"
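
The idea of carrying both the original Qiita sample ID and a filename-safe name could be sketched roughly as follows. This is not the format_sample_sheet implementation: the function names, the dictionary keys, and the example id "1.SKB1.640202" are hypothetical, and the allowed character set (letters, digits, hyphen, underscore) is an assumption about what the sample sheet accepts.

```python
import re


def bcl2fastq_safe_name(sample_id):
    """Replace each run of characters outside [A-Za-z0-9_-] with one underscore."""
    return re.sub(r"[^0-9A-Za-z_-]+", "_", sample_id)


def sample_sheet_names(qiita_sample_ids):
    """Pair each original id with its scrubbed name so files can be renamed later."""
    return [
        # Hypothetical keys; the real sample sheet column names may differ.
        {"qiita_sample_id": sid, "bcl2fastq_name": bcl2fastq_safe_name(sid)}
        for sid in qiita_sample_ids
    ]


rows = sample_sheet_names(["1.SKB1.640202"])
print(rows[0]["bcl2fastq_name"])  # 1_SKB1_640202
```

Because the scrubbing is lossy (different ids can map to the same safe name), keeping the original id next to the scrubbed one is what makes the later rename possible.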
