babinyurii/cancer_gene_expression_profile_analysis-microarray_data

The data set comes from Kaggle: https://www.kaggle.com/crawford/gene-expression. It contains gene expression profiles of cancer patients, measured with microarrays. Two types of cancer are included: acute myeloid leukemia (AML) and acute lymphoblastic leukemia (ALL). The data are analyzed with principal component analysis (PCA), a common dimensionality reduction technique. The file "1_ALL_AML_train_and_test_data_set_PCA_ .ipynb" contains PCA of the training and test data sets. The file "2_ALL_AML_whole_data_PCA_.ipynb" compares PCA of the whole data set with and without normalization. "3_ALL_AML_correlation_analysis_coregulated_genes" is an attempt at a rough template for finding coregulated genes, intended as a starting point for studies that look for such genes.
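
To make the two techniques above concrete, here is a minimal sketch of PCA with and without normalization, followed by a simple correlation search for candidate coregulated genes. It is not taken from the notebooks: the function names `pca_scores` and `correlated_pairs`, the 0.9 threshold, and the synthetic matrix are all assumptions for illustration, and the notebooks may use a different library or workflow. Rows are samples and columns are genes.

```python
import numpy as np


def pca_scores(X, n_components=2, standardize=True):
    """Project samples (rows) onto the top principal components via SVD.

    Hypothetical helper, not from the repository. With standardize=True each
    gene (column) is scaled to unit variance before the decomposition.
    """
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)          # center every gene
    if standardize:
        std = Xc.std(axis=0)
        std[std == 0] = 1.0          # leave constant genes unscaled
        Xc = Xc / std
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = U[:, :n_components] * S[:n_components]
    explained = (S ** 2) / np.sum(S ** 2)   # explained variance ratio
    return scores, explained[:n_components]


def correlated_pairs(X, threshold=0.9):
    """Return (i, j, r) for gene pairs with |Pearson r| >= threshold.

    Hypothetical helper; the threshold is an arbitrary illustrative value.
    """
    r = np.corrcoef(np.asarray(X, dtype=float), rowvar=False)
    i_idx, j_idx = np.triu_indices_from(r, k=1)   # each pair once, no diagonal
    mask = np.abs(r[i_idx, j_idx]) >= threshold
    return [(int(i), int(j), float(r[i, j]))
            for i, j in zip(i_idx[mask], j_idx[mask])]


# Synthetic stand-in for the expression matrix (NOT the Kaggle data).
rng = np.random.default_rng(0)
n_per_class, n_genes = 20, 50
group_a = rng.normal(0.0, 1.0, size=(n_per_class, n_genes))
group_b = rng.normal(0.0, 1.0, size=(n_per_class, n_genes))
group_b[:, :10] += 3.0               # first ten genes shifted in group B
X = np.vstack([group_a, group_b])
# Make genes 10 and 11 move together to mimic a coregulated pair.
X[:, 10] = 2.0 * X[:, 11] + rng.normal(0.0, 0.1, size=X.shape[0])

scores_raw, var_raw = pca_scores(X, standardize=False)
scores_std, var_std = pca_scores(X, standardize=True)
pairs = correlated_pairs(X, threshold=0.9)

print("explained variance (raw):", var_raw)
print("explained variance (standardized):", var_std)
print("highly correlated gene pairs:", pairs)
```

Scaling matters for microarray data because a few genes with very large intensity ranges can otherwise dominate the first components. The correlation search is quadratic in the number of genes, so on a full expression matrix it would likely need a pre-filtering step first.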

The original study is: T.R. Golub, D.K. Slonim, P. Tamayo, C. Huard, M. Gaasenbeek, J.P. Mesirov, H. Coller, M. Loh, J.R. Downing, M.A. Caligiuri, C.D. Bloomfield, and E.S. Lander. Molecular Classification of Cancer: Class Discovery and Class Prediction by Gene Expression Monitoring. Science 286:531-537 (1999). Published: 1999.10.14. https://www.ncbi.nlm.nih.gov/pubmed/10521349
