merge components of scgpt into main #813

dorien-er · 2024-06-14T12:48:14Z

The following components have already been reviewed on the scgpt branch:

* embedding
* tokenize_pad
* vocabulary_check

scGPT integration preproc module

* add script to download scgpt test resources
* Update resources_test_scripts/scgpt.sh
  Co-authored-by: Dries Schaumont <5946712+DriesSchaumont@users.noreply.github.com>
* add drive folders containing data and model
* chmod +x

---------

Co-authored-by: Dries Schaumont <5946712+DriesSchaumont@users.noreply.github.com>

merge scgpt-dev into scgpt

Co-authored-by: Dries Schaumont <5946712+DriesSchaumont@users.noreply.github.com>

* add module for scgpt padding and tokenization
* remove base requirement
* update changelog
* update component name
* expand unit tests, update script with loggers and todo
* fix unit tests
* remove annotation script
* run tests with subsampled data
* use specific model input files instead of directory
* remove unused binning script
* update layer names and handling
* Add script to download scgpt test resources (#750)
  * add script to download scgpt test resources
  * Update resources_test_scripts/scgpt.sh
    Co-authored-by: Dries Schaumont <5946712+DriesSchaumont@users.noreply.github.com>
  * add drive folders containing data and model
  * chmod +x
  * Co-authored-by: Dries Schaumont <5946712+DriesSchaumont@users.noreply.github.com>
* preproc script
* preproc script
* tokenize and pad script
* tokenize and pad script
* embedding script
* test resourcers and evaluation script
* cross check gene set
* Fix retag for viash-hub not using correct namespace separator (#745)
* CI - Build: Fix second occurance of namespace separator (#746)
* script to download scgpt test data
* remove test resources script
* pad_tokenize module
* updat image
* remove test resources, update inputs
* use pytorch image
* remove integration component
* remove nvidia reqs
* remove load_model option
* adjust preprocessing script
* add scgpt full preproc module
* integration submodule
* integration submodule and add normalize_total flag
* add params
* update scanpy version
* remove branch irrelevant scripts
* update output handling
* update unit tests, add output compression
* update key name input output
* fix test
* update unit tests
* Update CHANGELOG.md
  Co-authored-by: Dries Schaumont <5946712+DriesSchaumont@users.noreply.github.com>
* add pars to logging

---------

Co-authored-by: Dries Schaumont <5946712+DriesSchaumont@users.noreply.github.com>

scGPT embedding component

* base script and config added
* config extended + logger set up + tests in progress
* config working + script improved + tests in progress
* exception handling, extended tests
* extended tests + better logging
* changelog entry added
* test resource path in config fixed
* python test setup added to config
* PR comments fixed
* updated to use subset data
* remove batch id column logic
* update authors
* resources, tests and dependencies fixes
* update key name input output
* update key name input output
* update var gene names
* update config
* compression param added + minor fixes

---------

Co-authored-by: dorien-er <roosen.dorien@gmail.com>
Co-authored-by: DriesSchaumont <5946712+DriesSchaumont@users.noreply.github.com>

dorien-er and others added 30 commits March 6, 2024 11:16

preproc script

78954c0

preproc script

9068e7a

tokenize and pad script

dbe5204

tokenize and pad script

89a9c6a

embedding script

9e446f8

test resourcers and evaluation script

94dd10c

cross check gene set

3edf3c0

pad_tokenize module

085cdc4

updat image

724427e

remove test resources, update inputs

f9aadfa

use pytorch image

33c9ffe

remove integration component

0c6316d

remove nvidia reqs

47f5dda

Merge branch 'main' of github.com:openpipelines-bio/openpipeline into scgpt

9d2ffd0

remove load_model option

0f74ebd

Fix retag for viash-hub not using correct namespace separator (#745)

52fb38c

CI - Build: Fix second occurance of namespace separator (#746)

accf980

script to download scgpt test data

b1dd6ce

remove test resources script

18db6d6

adjust preprocessing script

6c3fec0

add scgpt full preproc module

acd3600

integration submodule

3e31204

integration submodule and add normalize_total flag

b5d1970

add params

ec326f8

Merge pull request #751 from openpipelines-bio/scgpt-preprocessor

2dddc1c

scGPT integration preproc module

embedding module

adcd6f0

Merge pull request #755 from openpipelines-bio/scgpt-dev

bd7a32f

merge scgpt-dev into scgpt

add unit tests

154ef26

undo subsampling test data

a7e08bc

dorien-er and others added 27 commits March 26, 2024 11:57

update test data

418687a

Remove muon as test dependency for concatenate_h5mu. (#773)

41b60be

scGPT binning component (#765)

7ec3ba4

Co-authored-by: Dries Schaumont <5946712+DriesSchaumont@users.noreply.github.com>

Merge branch 'develop' into scgpt

5f2e092

update embedding dependencies and gene name layer handling

9e1d35a

update input handling

4e9d916

include dsbn logic

c11db8c

update unit tests

e3faf4b

update config

0ba4e9c

expand unit tests, fix dsbn

350ba33

Update CHANGELOG.md

86ff4ef

Co-authored-by: Dries Schaumont <5946712+DriesSchaumont@users.noreply.github.com>

Update src/scgpt/embedding/config.vsh.yaml

8fb4a68

Co-authored-by: Dries Schaumont <5946712+DriesSchaumont@users.noreply.github.com>

update required, remove shared memory docker

e12b2e4

Merge branch 'scgpt' into embed

b6083f0

enable gpu device option

832d754

update dsbn

e0ee58c

Merge branch 'scgpt' into embed

5d6ef32

remove temporary, unused components

d224787

update error messages, remove device param

c3e159a

remove dropout param

2daa6f6

fix typo

6ddd7c1

fix typo

0ae8cdb

Merge pull request #761 from openpipelines-bio/embed

a5977ea

scGPT embedding component

Merge branch 'main' into scgpt

94b955c

undo concat changes

49285b0

DriesSchaumont marked this pull request as ready for review June 14, 2024 13:40

DriesSchaumont merged commit ef5d0cc into main Jun 14, 2024

DriesSchaumont deleted the scgpt branch June 14, 2024 13:42


4 participants
