Started the data pre-processing. All code was written in Python, using the pandas library for data manipulation.
Result: a clean count file with 87 samples and ~4000 genes (to be filtered further to keep only protein-coding genes).
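A rough sketch of what the planned protein-coding filter could look like in pandas. The toy data, the `gene_biotype` column name, and the helper `keep_protein_coding` are assumptions for illustration only, not our actual files or code:

```python
import pandas as pd

# Toy count matrix: rows are genes, columns are samples (hypothetical IDs).
counts = pd.DataFrame(
    {"sample_1": [10, 0, 5], "sample_2": [3, 7, 0]},
    index=["geneA", "geneB", "geneC"],
)

# Hypothetical annotation table; "gene_biotype" is an assumed column name.
annotation = pd.DataFrame(
    {"gene_biotype": ["protein_coding", "lncRNA", "protein_coding"]},
    index=["geneA", "geneB", "geneC"],
)


def keep_protein_coding(counts, annotation, biotype_col="gene_biotype"):
    """Return only the rows of `counts` annotated as protein coding."""
    coding_ids = annotation.index[annotation[biotype_col] == "protein_coding"]
    # isin keeps the original row order of the count matrix.
    return counts.loc[counts.index.isin(coding_ids)]


filtered = keep_protein_coding(counts, annotation)
```

Genes missing from the annotation table are dropped by this approach, so it would be worth checking how many genes are lost before and after the filter.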
Group meeting to kick off the work. We decided that everyone will take part in quality control and data normalization, so that we all understand the data better.
Preparing libraries for the data analysis, to discuss with the group tomorrow:
- Gene-to-transcript mapping
- Gene lengths (for calculating normalized counts)
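To show why the gene lengths are needed, here is a minimal sketch of one common length-aware normalization, TPM (transcripts per million). The group has not yet decided on a method, so TPM is only an assumed example; the function name `counts_to_tpm` and the toy values are hypothetical:

```python
import pandas as pd


def counts_to_tpm(counts, gene_length_bp):
    """Convert raw counts (genes x samples) to TPM using gene lengths in bp."""
    # Align lengths to the count matrix rows and convert to kilobases.
    lengths_kb = gene_length_bp.reindex(counts.index) / 1000.0
    # Reads per kilobase: divide each gene's counts by its length.
    rate = counts.div(lengths_kb, axis=0)
    # Scale each sample so that its values sum to one million.
    return rate.div(rate.sum(axis=0), axis=1) * 1e6


# Toy data for illustration only.
counts = pd.DataFrame(
    {"sample_1": [10, 30], "sample_2": [20, 0]},
    index=["geneA", "geneB"],
)
gene_length_bp = pd.Series({"geneA": 1000, "geneB": 2000})

tpm = counts_to_tpm(counts, gene_length_bp)
```

Each sample column of `tpm` sums to one million, which makes values comparable across samples with different sequencing depth.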
First entry of the diary! We had our first group meeting at KTH Library at 14.00 today. Several things were discussed:
- Overview of the project
- Overview of the data
- Discussion about the expected timeline
- Preparation for first seminar presentation