Adding AmbrosM's solution to methods (openproblems-bio#31)
* converted AmbrosM's notebook into viash script

* fixed parameters

* modified config

* method now runs successfully (in native mode with correct packages installed)

* remove ipynbcheckpoints

* add ipynb_checkpoints to gitignore

* remove ipynbcheckpoints

* Add longer description

Co-authored-by: Robrecht Cannoodt <rcannood@gmail.com>

* Clean up viash arguments

Co-authored-by: Robrecht Cannoodt <rcannood@gmail.com>

* Fix Viash start/end tags

Co-authored-by: Robrecht Cannoodt <rcannood@gmail.com>

* fixed script parameters to match task/api/comp_method.yaml

* modified config

* bug fixes

* switching to nvidia docker image

* changed nvidia docker container version

* adding colorama to list of required packages

* using older version of nvidia pytorch package that uses CUDA 11

* trying removing cupy-cuda11x==12.2.0 from package requirements to avoid installing multiple versions of cupy

* trying to fix cupy to cuda 11.8

* trying now to install just cuda11x (without pinning version)

* fix config

* fix config

* remove unused imports

* minor changes to script

* move helper functions to separate file

* fix missing variables

* remove cross validation

* rename method

* add to wf

* remove gpus from config

* move imports to individual models

* Delete .attach_pid21979

* fix path in resources dir

---------

Co-authored-by: Robrecht Cannoodt <rcannood@gmail.com>
andrew-benz and rcannood authored May 20, 2024
1 parent ca8854b commit 061da00
Showing 6 changed files with 583 additions and 2 deletions.
3 changes: 2 additions & 1 deletion .gitignore
@@ -6,4 +6,5 @@ target
 .vscode
 .DS_Store
 output
-trace-*
+trace-*
+.ipynb_checkpoints
57 changes: 57 additions & 0 deletions src/task/methods/pyboost/config.vsh.yaml
@@ -0,0 +1,57 @@
__merge__: ../../api/comp_method.yaml

functionality:
  name: pyboost
  info:
    label: Py-boost
    rank: 18
    summary: "Py-boost predicting t-scores"
    description: |
      An ensemble of four models was considered:
      * Py-boost (a ridge regression-based recommender system)
      * ExtraTrees (a decision tree ensemble with target-encoded features)
      * a k-nearest neighbors recommender system
      * a ridge regression model
      Each model offered distinct strengths and weaknesses: ExtraTrees and
      knn were unable to extrapolate beyond the training data, while ridge
      regression provided extrapolation capability. To enhance model performance,
      data augmentation techniques were used, including averaging differential
      expressions for compound mixtures and adjusting cell counts to reduce biases.
      In the end, only the py-boost model is used for generating predictions.
    documentation_url: https://www.kaggle.com/competitions/open-problems-single-cell-perturbations/discussion/458661
    repository_url: https://github.com/Ambros-M/Single-Cell-Perturbations-2023
  arguments:
    - type: file
      name: --train_obs_zip
      default: "resources/neurips-2023-kaggle/train_obs.csv.zip"
      example: "resources/neurips-2023-kaggle/train_obs.csv.zip"
    - type: string
      name: --predictor_names
      multiple: true
      choices: [py_boost, ridge_recommender, knn_recommender, predict_extratrees]
      default: [py_boost]
      description: Which predictor(s) to use.
      info:
        test_default: [knn_recommender]
  resources:
    - type: python_script
      path: script.py
    - path: helper.py
  test_resources:
    - path: /resources/neurips-2023-kaggle/train_obs.csv.zip
      dest: resources/neurips-2023-kaggle/train_obs.csv.zip
platforms:
  - type: docker
    image: ghcr.io/openproblems-bio/base_pytorch_nvidia:1.0.4
    setup:
      - type: python
        packages:
          - colorama
          - py-boost==0.4.3
  - type: native
  - type: nextflow
    directives:
      label: [midtime,midmem,midcpu,gpu]
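
For readers unfamiliar with viash, the following is a minimal sketch of how a Python script might consume the two arguments declared in the config above. It is not the contents of script.py from this commit: `select_predictors`, `CHOICES`, and `registry` are hypothetical names, and the stub predictors only stand in for the real models that the commit moves into helper.py. The `par` dictionary between the VIASH START/END tags follows the usual viash convention of placeholder values that are substituted with the real arguments when the component is built.

```python
## VIASH START
# Placeholder values for running the script directly; in a built viash
# component this block is replaced with the actual argument values.
par = {
    "train_obs_zip": "resources/neurips-2023-kaggle/train_obs.csv.zip",
    "predictor_names": ["py_boost"],
}
## VIASH END

# Allowed values, copied from the `choices` field of --predictor_names.
CHOICES = ["py_boost", "ridge_recommender", "knn_recommender", "predict_extratrees"]


def select_predictors(names, registry):
    """Return the callables for the requested predictor names, in order."""
    unknown = [n for n in names if n not in CHOICES]
    if unknown:
        raise ValueError(f"Unknown predictor(s): {unknown}")
    return [registry[n] for n in names]


# Hypothetical registry: each stub stands in for a real model function.
registry = {name: (lambda name=name: f"ran {name}") for name in CHOICES}

selected = select_predictors(par["predictor_names"], registry)
results = [predict() for predict in selected]
print(results)
```

Because `--predictor_names` is declared with `multiple: true`, the sketch treats the value as a list, which is why the default in the config is `[py_boost]` rather than a bare string.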