@a-r-j (Owner) commented Mar 29, 2023

Reference Issues/PRs

Waiting on #272

What does this implement/fix? Explain your changes

Adds a dataset class for working with sequence datasets. It provides utilities for batch folding and embedding with ESM(Fold).

  • Set a representative structure. For protein engineering tasks, we can predict a single wild-type (WT) structure, use it as the structure for all mutants, and simply modify the residue types accordingly.

  • [ ] FoldComp compression of the predicted structures. Ideally this would run in the ESMFold step, but it can also be done post hoc.
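
The representative-structure idea above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `Residue`, `apply_representative_structure`, and `mutated_positions` are hypothetical names, the structure is reduced to one C-alpha coordinate per residue, and only substitution mutants (same length as the WT) are assumed.

```python
from dataclasses import dataclass
from typing import List, Tuple

Coord = Tuple[float, float, float]


@dataclass(frozen=True)
class Residue:
    """Hypothetical minimal residue record (not a Graphein type)."""

    position: int  # 1-based index into the sequence
    residue_type: str  # one-letter amino acid code
    ca_coord: Coord  # C-alpha coordinate from the representative structure


def apply_representative_structure(
    wt_structure: List[Residue], mutant_sequence: str
) -> List[Residue]:
    """Reuse the WT coordinates for a mutant, changing only residue types."""
    if len(wt_structure) != len(mutant_sequence):
        # Assumption: insertions and deletions are out of scope here.
        raise ValueError("Mutant must have the same length as the WT.")
    return [
        Residue(res.position, aa, res.ca_coord)
        for res, aa in zip(wt_structure, mutant_sequence)
    ]


def mutated_positions(
    wt_structure: List[Residue], mutant_structure: List[Residue]
) -> List[int]:
    """Return the 1-based positions where the residue type differs."""
    return [
        wt.position
        for wt, mut in zip(wt_structure, mutant_structure)
        if wt.residue_type != mut.residue_type
    ]


# Toy WT "structure" with made-up coordinates, for illustration only.
wt_sequence = "MKV"
wt = [
    Residue(i + 1, aa, (float(i), 0.0, 0.0))
    for i, aa in enumerate(wt_sequence)
]
mutant = apply_representative_structure(wt, "MAV")
```

Here `mutant` shares every coordinate with `wt` and differs only in the residue type at position 2, so the WT only needs to be folded once for the whole mutant set.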

What testing did you do to verify the changes in this PR?

Pull Request Checklist

  • Added a note about the modification or contribution to the ./CHANGELOG.md file (if applicable)
  • Added appropriate unit test functions in the ./graphein/tests/* directories (if applicable)
  • Modified documentation in the corresponding Jupyter Notebook under ./notebooks/ (if applicable)
  • Ran python -m py.test tests/ and made sure that all unit tests pass (for small modifications, it might be sufficient to only run the specific test file, e.g., python -m py.test tests/protein/test_graphs.py)
  • Checked for style issues by running black . and isort .

@sonarqubecloud

Kudos, SonarCloud Quality Gate passed!

  • Bugs: 0 (rating A)
  • Vulnerabilities: 0 (rating A)
  • Security Hotspots: 0 (rating A)
  • Code Smells: 2 (rating A)
  • Coverage: no coverage information
  • Duplication: 0.0%
