jfwu777/PredGeneExpr

Predicting gene expression using millions of random promoter sequences

Competition website: https://www.synapse.org/#!Synapse:syn28469146/wiki/

Process

  1. Generate the dataset with EDA_prep.
  2. Test a naive LSTM model on a random 4:1 train/val split of the full dataset.
    1. Model: a simple LSTM with an embedding of shape (6, 100), one row each for A, G, C, T, N (unknown), and PAD. PAD is a placeholder token used to pad every sequence to length 150 (see EDA_prep for details).
    2. batch_size = 512
  3. Performance is documented in log_naive_lstm.txt:
    • Pearson's R = 0.73
    • Spearman's R = 0.75
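
The tokenization and split described in step 2 can be sketched roughly as follows. This is a minimal illustration and not the code in EDA_prep: the index assigned to each token, the right-side padding, the truncation of sequences longer than 150, and the names `VOCAB`, `encode`, and `train_val_split` are all assumptions made for this example.

```python
import random

# Hypothetical token-to-index mapping; the indices used in EDA_prep may differ.
VOCAB = {"PAD": 0, "A": 1, "G": 2, "C": 3, "T": 4, "N": 5}
MAX_LEN = 150  # padded sequence length stated in the process list


def encode(seq, max_len=MAX_LEN):
    """Map a promoter sequence to a fixed-length list of token indices."""
    # Assumption: any character outside A/G/C/T is treated as N (unknown),
    # and sequences longer than max_len are truncated.
    ids = [VOCAB.get(base, VOCAB["N"]) for base in seq.upper()[:max_len]]
    # Assumption: padding is appended on the right.
    return ids + [VOCAB["PAD"]] * (max_len - len(ids))


def train_val_split(n_items, val_fraction=0.2, seed=0):
    """Return shuffled (train, val) index lists for a random 4:1 split."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    n_val = int(n_items * val_fraction)
    return indices[n_val:], indices[:n_val]


encoded = encode("ACGTN")
train_idx, val_idx = train_val_split(10)
```

With six tokens in the vocabulary, an embedding table of shape (6, 100) has one 100-dimensional row per index produced by `encode`.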

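For reference, the two metrics reported above can be computed as in the sketch below. It uses only the standard library and is not taken from the repository; the function names are hypothetical, and the sample values exist only to exercise the functions. Spearman's coefficient is computed here as Pearson's coefficient on average ranks, which is the standard definition when ties are present.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (std_x * std_y)


def average_ranks(values):
    """1-based ranks, with tied values sharing the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    return ranks


def spearman_r(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))


# Illustrative values only, not data from the competition.
measured = [1.0, 2.0, 3.0, 4.0]
predicted = [1.0, 4.0, 9.0, 16.0]
pearson_value = pearson_r(measured, predicted)
spearman_value = spearman_r(measured, predicted)
```

In this toy case the predictions are a monotonic but non-linear function of the measurements, so the Spearman value is 1 while the Pearson value is slightly lower.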