make LR parameters customizable in training workflow #24

Merged: mayasheth merged 3 commits into main from LR_parameters on Feb 9, 2024

Conversation

mayasheth (Contributor)

tested with no regularization, L1, L2, and elastic net settings

@atancoder

@mayasheth mayasheth requested a review from atancoder January 30, 2024 03:55
model                    dataset        ABC_directory  feature_table                                              polynomial  override_params
DNase_intact_8features   DNase_intact                  resources/feature_tables/final_feature_set_DNase_hic.tsv  False
DNase_megamap_noReg      DNase_megamap                 resources/feature_tables/all_features_DNase_hic.tsv       False
DNase_megamap_L1         DNase_megamap                 resources/feature_tables/all_features_DNase_hic.tsv       False       {'solver': 'saga', 'penalty': 'l1'}
Collaborator

I realize this is important for figuring out the best hyperparameters for our logistic regression model. Do we expect users to want to choose their own hyperparameters, or are we going to find the best ones and make them the default?

mayasheth (Contributor, Author)

No idea who is going to use this besides us... the default parameters are already specified in config_training, which is more transparent and explicit than having them hardcoded. override_params can just be used if the user wants to change one of those for any particular model. The whole point of this workflow is to compare different model architectures, so I feel like it's a valid thing to include!
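(For illustration, a minimal sketch of how per-model override_params could be merged over the config_training defaults before fitting; default_params and build_model are hypothetical names, not the repo's actual code:)

# Hypothetical sketch: merge a model's override_params over the defaults
# from config_training before fitting; names here are illustrative only.
from sklearn.linear_model import LogisticRegression

default_params = {"penalty": "l2", "solver": "lbfgs", "max_iter": 1000}

def build_model(override_params=None):
    # override_params wins on key collisions; defaults fill everything else
    params = {**default_params, **(override_params or {})}
    return LogisticRegression(**params)

# e.g. the DNase_megamap_L1 row of the model config table above:
model = build_model({'solver': 'saga', 'penalty': 'l1'})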

"""

# compare CV performance on training data across all models (note: this is not the true benchmarking performance, since CRISPR elements not overlapping prediction elements aren't considered)
rule gather_model_performances:
Collaborator

For the future, you can consider splitting the actual model training components from the best model selection components into separate snakemake modules to make things cleaner.
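(For reference, a rough sketch of what that split could look like with Snakemake's module system; the snakefile paths and rule prefixes are hypothetical, not part of this PR:)

# Hypothetical layout: training rules and model-selection rules live in
# separate snakefiles and are pulled in as modules.
module model_training:
    snakefile: "rules/model_training.smk"
    config: config

module model_selection:
    snakefile: "rules/model_selection.smk"
    config: config

# import all rules from each module under a distinguishing prefix
use rule * from model_training as train_*
use rule * from model_selection as select_*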


# sort table by AUPRC so the best-performing model comes first
df = df.sort_values(by='AUPRC', ascending=False)
df.to_csv(output_file, sep='\t', index=False)
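(For context, a hedged sketch of how the gather step might assemble the table that this snippet sorts; the per-model file paths and layout are assumptions, not the actual rule:)

# Hypothetical: concatenate one row of CV metrics per model before sorting
import pandas as pd

performance_files = [
    "results/DNase_megamap_noReg/model_performance.tsv",
    "results/DNase_megamap_L1/model_performance.tsv",
]
df = pd.concat([pd.read_csv(f, sep='\t') for f in performance_files])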
Collaborator

it would be good to share a snippet of the output from your changes in the PR summary/test plan

@mayasheth mayasheth merged commit 7fa1fd0 into main Feb 9, 2024
@mayasheth mayasheth deleted the LR_parameters branch February 9, 2024 00:23