jhrcook/protein-language-models

Protein language models

Setup

pyenv local 3.11
python -m venv .env
source .env/bin/activate
pip install --upgrade pip
pip install -r requirements.txt
pre-commit install
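
After running the commands above, a quick sanity check can confirm that the active interpreter is the one the setup intends. The snippet below is a minimal sketch and not part of the repository: the helper names `python_matches` and `in_virtualenv` are hypothetical, and it assumes the environment was created with `python -m venv` as shown.

```python
import sys


def python_matches(major: int, minor: int) -> bool:
    """Return True if the running interpreter is exactly major.minor."""
    return sys.version_info[:2] == (major, minor)


def in_virtualenv() -> bool:
    """Return True when running inside an environment created by venv.

    Inside a venv, sys.prefix points at the environment while
    sys.base_prefix points at the interpreter it was created from.
    """
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    # 3.11 is the version pinned by `pyenv local 3.11` above.
    print("Python 3.11:", python_matches(3, 11))
    print("Inside a virtual environment:", in_virtualenv())
```

Both lines should report True when the script is run with `.env` activated.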

Data preparation

Run the following script to download and prepare the raw data:

./prepare_data.py

Data sources:

AlphaMissense predictions: https://zenodo.org/records/8360242 (downloaded file: "AlphaMissense_aa_substitutions.tsv.gz")

ESM1b paper: https://www.nature.com/articles/s41588-023-01465-0
ESM1b predictions: https://huggingface.co/spaces/ntranoslab/esm_variants/tree/main (downloaded file: "ALL_hum_isoforms_ESM1b_LLR.zip")

Both files were copied to "raw-data/".
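
Before working with the data, it can help to confirm that both raw files are where they are expected. The following is a minimal sketch, not code from this repository: the helper `missing_raw_files` is a hypothetical name, and it assumes the two files sit directly inside "raw-data/" under the names listed above. Whether `prepare_data.py` performs a similar check is not stated here.

```python
from pathlib import Path

# File names taken from the data sources listed above.
EXPECTED_FILES = (
    "AlphaMissense_aa_substitutions.tsv.gz",
    "ALL_hum_isoforms_ESM1b_LLR.zip",
)


def missing_raw_files(raw_dir="raw-data"):
    """Return the expected file names that are not present in raw_dir."""
    raw_dir = Path(raw_dir)
    return [name for name in EXPECTED_FILES if not (raw_dir / name).is_file()]


if __name__ == "__main__":
    missing = missing_raw_files()
    if missing:
        print("Missing from raw-data/:", ", ".join(missing))
    else:
        print("All raw files found.")
```

Run it from the repository root so that the relative path "raw-data" resolves correctly.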

About

Experimenting with protein language model predictions
