dorien-er
Contributor

@dorien-er dorien-er commented Jun 14, 2024

The following components have already been reviewed on the scGPT branch:

  • embedding
  • tokenize_pad
  • vocabulary_check

dorien-er and others added 30 commits March 6, 2024 11:16
* add script to download scgpt test resources

* Update resources_test_scripts/scgpt.sh

Co-authored-by: Dries Schaumont <5946712+DriesSchaumont@users.noreply.github.com>

* add drive folders containing data and model

* chmod +x

---------

Co-authored-by: Dries Schaumont <5946712+DriesSchaumont@users.noreply.github.com>
dorien-er and others added 27 commits March 26, 2024 11:57
Co-authored-by: Dries Schaumont <5946712+DriesSchaumont@users.noreply.github.com>
Co-authored-by: Dries Schaumont <5946712+DriesSchaumont@users.noreply.github.com>
Co-authored-by: Dries Schaumont <5946712+DriesSchaumont@users.noreply.github.com>
* add module for scgpt padding and tokenization

* remove base requirement

* update changelog

* update component name

* expand unit tests, update script with loggers and todo

* fix unit tests

* remove annotation script

* run tests with subsampled data

* use specific model input files instead of directory

* remove unused binning script

* update layer names and handling

* Add script to download scgpt test resources (#750)

* add script to download scgpt test resources

* Update resources_test_scripts/scgpt.sh

Co-authored-by: Dries Schaumont <5946712+DriesSchaumont@users.noreply.github.com>

* add drive folders containing data and model

* chmod +x

---------

Co-authored-by: Dries Schaumont <5946712+DriesSchaumont@users.noreply.github.com>

* preproc script

* preproc script

* tokenize and pad script

* tokenize and pad script

* embedding script

* test resources and evaluation script

* cross check gene set

* Fix retag for viash-hub not using correct namespace separator (#745)

* CI - Build: Fix second occurrence of namespace separator (#746)

* script to download scgpt test data

* remove test resources script

* pad_tokenize module

* update image

* remove test resources, update inputs

* use pytorch image

* remove integration component

* remove nvidia reqs

* remove load_model option

* adjust preprocessing script

* add scgpt full preproc module

* integration submodule

* integration submodule and add normalize_total flag

* add params

* update scanpy version

* remove branch irrelevant scripts

* update output handling

* update unit tests, add output compression

* update key name input output

* fix test

* update unit tests

* Update CHANGELOG.md

Co-authored-by: Dries Schaumont <5946712+DriesSchaumont@users.noreply.github.com>

* add pars to logging

---------

Co-authored-by: Dries Schaumont <5946712+DriesSchaumont@users.noreply.github.com>
* base script and config added

* config extended + logger set up + tests in progress

* config working + script improved + tests in progress

* exception handling, extended tests

* extended tests + better logging

* changelog entry added

* test resource path in config fixed

* python test setup added to config

* PR comments fixed

* updated to use subset data

* remove batch id column logic

* update authors

* resources, tests and dependencies fixes

* update key name input output

* update key name input output

* update var gene names

* update config

* compression param added + minor fixes

---------

Co-authored-by: dorien-er <roosen.dorien@gmail.com>
Co-authored-by: DriesSchaumont <5946712+DriesSchaumont@users.noreply.github.com>
@DriesSchaumont DriesSchaumont marked this pull request as ready for review June 14, 2024 13:40
@DriesSchaumont DriesSchaumont merged commit ef5d0cc into main Jun 14, 2024
@DriesSchaumont DriesSchaumont deleted the scgpt branch June 14, 2024 13:42

4 participants