Skip to content

shubham1637/CodeShare

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

43 Commits
 
 
 
 
 
 
 
 
 
 

Repository files navigation

CodeShare

This repository contains code files I have been generating to learn BioInformatics. Motif.py, Replication.py files contains functions to find motifs and patterns in a DNA sequence. It includes greedyMotif, RandomMotifSearch, GibbsSampler algorithms etc. The main aim of code is to find origin of replication "oriC" which in most cases can be found with pattern detection in sequence.

Network.h and Network.cpp files describe creation of weighted graph (gene expression) and find out trans associated gene based on CNV value. After getting all output data k-means clustering is done to get cancer sub types. Equations for finding trans associated genes are taken from a research paper. Here we create a network of 13038 proteins. Weight of edges is decided by gene expression values of 997 tumor pateints. Vertex weight is combination of its own copy number variation value and its neighour's weight.

R files used to do generate transpose of METABRIC data. Lab4 file contains code to process gene expression data (Rna Seq data). Data analysis includes extracting genomic feature, RPKM normalization, hierarchical clustering etc. Bioconductor package is used to do data analysis.

About

This repository contains code files I have been generating to learn BioInformatics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages