Changes related to data processing and fine-tuning new models #16
avantikalal commented Jul 9, 2025
- Enabled fine-tuning via the CLI
- Added new arguments to the CLI training and fine-tuning scripts
- Modified the CLI inference script so it can be run separately on train/val/test genes
- Enabled inference by giving predict_on_dataset separate logic depending on the dataset class (HDF5Dataset or VariantDataset)
- Added a new preprocessing function that searches for gene metadata on Ensembl using mygene
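
To illustrate the class-dependent logic described for predict_on_dataset, here is a minimal sketch of type-based dispatch. It is not the code from this PR: the two dataset classes are stand-ins with hypothetical fields (`sequences`, `pairs`), `toy_model` is a placeholder, and the choice to return an alt-minus-ref difference for variants is an assumption made only for illustration.

```python
from functools import singledispatch


class HDF5Dataset:
    """Stand-in for a dataset of sequences (hypothetical field names)."""

    def __init__(self, sequences):
        self.sequences = sequences


class VariantDataset:
    """Stand-in for a dataset of (ref, alt) sequence pairs (hypothetical)."""

    def __init__(self, pairs):
        self.pairs = pairs


def toy_model(seq):
    # Placeholder "model": returns the GC fraction of the sequence.
    return sum(base in "GC" for base in seq) / len(seq)


@singledispatch
def predict_on_dataset(dataset, model):
    # Fallback for dataset classes that have no registered handler.
    raise TypeError(f"Unsupported dataset class: {type(dataset).__name__}")


@predict_on_dataset.register
def _(dataset: HDF5Dataset, model):
    # One prediction per stored sequence.
    return [model(seq) for seq in dataset.sequences]


@predict_on_dataset.register
def _(dataset: VariantDataset, model):
    # Assumed behaviour: score each variant as prediction(alt) - prediction(ref).
    return [model(alt) - model(ref) for ref, alt in dataset.pairs]
```

An `isinstance` chain inside a single function would work equally well; `singledispatch` simply keeps each branch in its own function.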
@avantikalal Can you add fine-tuning test data similar to https://github.com/Genentech/decima/blob/main/tests/test_cli.py, maybe with a dummy test that just runs the new steps? It is easier to add unittest for I am preparing unittests for
Shall I do these in a separate PR?
…nd fine-tuning (#19)
* ensemble vep init
* backward compability of grelu, ensembling, testcases, custom fasta
* gene dataset
* gene expression prediction and sequence shifting
* fix testcase
* conflig
* Changes related to data processing and fine-tuning new models (#16)
  * enable finetune via cli
  * split input and output directories
  * add mygene
  * added ensembl
  * added N padding
  * add more params
  * added args to cli finetune
  * add csv logging
  * add csv logging
  * add run name to checkpoints
  * gene pearson metric
  * training 202506
  * added new params
  * added topk
  * reset unnecessary changes
  * reset unnecessary changes
  * reset unnecessary changes
  * reset unnecessary changes
  * reset unnecessary changes
  * fixed savek typo
  * more useful print
  * finetuning updates
  ---------
  Co-authored-by: Muhammed Hasan Celik <celik.muhammed_hasan@gene.com>
* fix testcases
* branch review updates
---------
Co-authored-by: Muhammed Hasan Celik <celik.muhammed_hasan@gene.com>
Co-authored-by: Avantika Lal <avantikalal1990@gmail.com>