- Expand test suite to include function-level unit tests (maximize coverage)
- Add multi-gpu support (issue #6)
- Support for loading embeddings on the fly to reduce memory usage (issue #11)
- Resolve #35 to use `require_dataset`; can now add multiple .fasta files to the same h5 file
- Update pretrained API and docs to include Topsy-Turvy
- Add retry decorator to get_pretrained if download fails
- Add ability to set a random seed for training
- Update `evaluate` code to also store metrics in a file
- Add biopython to setup.py
- Integrate Topsy-Turvy to allow for top-down supervision
- Use utils.log function across all commands
- Speed up loading embeddings into memory using parallel processing
- Update FASTA parsing and writing to use BioPython SeqIO (better error checking)
- More comprehensive test suite for main commands
- Updated model loading in the new version to handle renamed parameters
- Updated cpu-only loading during prediction with map_location
- Resolve #24 by fixing training
- Can now run `dscript train --train data/pairs/human_train.tsv --test data/pairs/human_test.tsv --embedding /afs/csail/u/s/samsl/Work/databases/STRING/homo.sapiens/human_nonRed.h5 --output [output] --save-prefix [prefix] --device 0` to replicate paper results
- Updated code formatting with black and pre-commit
- Following previous update, addresses #24 by fixing model training while maintaining preferred API and command line usage
- Fixed significant bug in how training was run by reverting to older code
- Should address issue #24: unable to replicate paper results
- To do: code cleaning to bring up to formatting standards while maintaining performance
- Augmentation fix in v0.1.5 was still bugged and would throw an error; now resets index
- Change `--use-w` and `--augment` to `--no-w` and `--no-augment` with the `store_false` action
- Updated package level imports
- Updated documentation
- Fixed issue #13: improper augmentation of data
- Fixed issue #12: overwrites cmap data sets if they already exist
- Fixed issue #7: bug which would crash contact module if called directly
- Fixed issues #3, #4
- Basic logging system implemented to report skipped pairs
- Fixed wrong variable name in loading from sequence file
- Updated documentation
- Model should be put into `eval()` mode before prediction or evaluation, and when new models are downloaded; this makes the output deterministic by disabling dropout layers
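
The effect of `eval()` mode described in the last item can be illustrated with a minimal PyTorch sketch. The small `nn.Sequential` model below is a hypothetical stand-in and not the package's own architecture; it only assumes that the model contains a dropout layer.

```python
import torch
from torch import nn

# Hypothetical stand-in model (an assumption for illustration only);
# any nn.Module that contains dropout behaves the same way.
model = nn.Sequential(nn.Linear(8, 8), nn.Dropout(p=0.5), nn.Linear(8, 1))
x = torch.ones(4, 8)

# Training mode (the default): dropout zeroes a random subset of
# activations on every call, so repeated calls typically disagree.
model.train()
train_a, train_b = model(x), model(x)

# Evaluation mode: dropout acts as the identity, so the output
# depends only on the input and the weights.
model.eval()
with torch.no_grad():
    eval_a, eval_b = model(x), model(x)

print(torch.equal(eval_a, eval_b))  # True
```

A freshly loaded or downloaded model is in whatever mode the loading code leaves it in, so calling `model.eval()` explicitly before prediction avoids relying on that default.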