[feat] expose bfactors in protein_to_pyg function #388

Merged 5 commits on Apr 23, 2024

kierandidi
Collaborator

@kierandidi kierandidi commented Apr 19, 2024

Reference Issues/PRs

What does this implement/fix? Explain your changes

B-factors are present in the DataFrame from biopandas, but could not be saved to the PyG object. This PR adds a store_bfactor option to protein_to_pyg that enables this.

What testing did you do to verify the changes in this PR?

Pull Request Checklist

  • Added a note about the modification or contribution to the ./CHANGELOG.md file (if applicable)
  • Added appropriate unit test functions in the ./graphein/tests/* directories (if applicable)
  • Modified documentation in the corresponding Jupyter Notebook under ./notebooks/ (if applicable)
  • Ran python -m py.test tests/ and made sure that all unit tests pass (for small modifications, it might be sufficient to only run the specific test file, e.g., python -m py.test tests/protein/test_graphs.py)
  • Checked for style issues by running black . and isort .

@kierandidi kierandidi requested a review from a-r-j April 19, 2024 17:35
@kierandidi kierandidi self-assigned this Apr 19, 2024
@@ -254,6 +261,10 @@ def protein_to_pyg(
    )
    if store_het:
        out.hetatms = [het_coords]

    if store_bfactor:
        out.bfactor = torch.tensor(df["b_factor"].values)
Owner

I think torch.from_numpy might be a little better
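
For context on this suggestion, here is a minimal sketch of how the two calls differ: torch.tensor always copies the NumPy data, while torch.from_numpy returns a tensor that shares memory with the array. The array contents are made up for illustration and are not taken from the PR.

```python
import numpy as np
import torch

# Made-up B-factor values standing in for df["b_factor"].values.
b_factors = np.array([12.5, 14.0, 13.2], dtype=np.float64)

copied = torch.tensor(b_factors)      # copies the data into a new tensor
shared = torch.from_numpy(b_factors)  # wraps the same memory, no copy

# Mutating the array afterwards only shows up in the shared tensor.
b_factors[0] = 99.0
print(copied[0].item(), shared[0].item())
```

Both calls keep the float64 dtype of the source array, so the choice mainly affects whether a copy is made.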

Collaborator Author

done

@@ -254,6 +261,10 @@ def protein_to_pyg(
    )
    if store_het:
        out.hetatms = [het_coords]

    if store_bfactor:
        out.bfactor = torch.from_numpy(df["b_factor"].values)
Owner

@a-r-j a-r-j Apr 21, 2024

This will be of shape num_atoms x 1, right? I expect this would break batching, as all other tensors are of shape n_res x X.
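
The concern can be illustrated with a small sketch that does not use Graphein or PyG at all. The residue counts, atom counts, and feature width below are hypothetical, and the batch vector is only an imitation of how PyG-style batching labels each row with its graph index. Note that df["b_factor"].values on a Series is one-dimensional, so the tensor would have length num_atoms rather than an explicit trailing dimension of 1; the mismatch with n_res is the same either way.

```python
import torch

# Hypothetical sizes for two proteins: residue counts and total atom counts.
n_res = [3, 2]
n_atoms = [20, 15]
feat_dim = 4  # stand-in for the "X" in n_res x X

# Per-residue tensors concatenate along dim 0 into sum(n_res) rows.
x = torch.cat([torch.zeros(n, feat_dim) for n in n_res])

# Batch vector assigning each residue row to its protein (imitating PyG batching).
batch = torch.repeat_interleave(torch.arange(len(n_res)), torch.tensor(n_res))

# Per-atom B-factors concatenate into sum(n_atoms) entries instead.
bfactor_per_atom = torch.cat([torch.zeros(n) for n in n_atoms])

rows_match = bfactor_per_atom.shape[0] == batch.shape[0]
print(x.shape[0], batch.shape[0], bfactor_per_atom.shape[0], rows_match)
# prints: 5 5 35 False
```

Because the per-atom tensor has 35 entries while the batch vector has 5, it cannot be indexed per residue alongside the other attributes.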

Collaborator Author

Good catch. I solved this with a groupby that averages B-factors on a per-residue basis, consistent with the pLDDT information in the predicted datasets.
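
A minimal sketch of per-residue averaging along these lines is shown below. It is not the code merged in the PR: the toy DataFrame, the choice of chain_id and residue_number as grouping keys, and the variable names are assumptions made for illustration.

```python
import pandas as pd
import torch

# Toy atom-level table with two residues; values are made up.
df = pd.DataFrame(
    {
        "chain_id": ["A", "A", "A", "A", "A"],
        "residue_number": [1, 1, 1, 2, 2],
        "atom_name": ["N", "CA", "C", "N", "CA"],
        "b_factor": [10.0, 12.0, 14.0, 20.0, 22.0],
    }
)

# Average B-factors over the atoms of each residue.
# sort=False keeps residues in order of first appearance.
# The grouping keys are an assumption, not necessarily those used in the PR.
per_residue = df.groupby(["chain_id", "residue_number"], sort=False)["b_factor"].mean()

# copy=True guarantees a writable array before handing it to torch.from_numpy.
bfactor = torch.from_numpy(per_residue.to_numpy(copy=True))
print(bfactor.shape, bfactor.tolist())
```

The result has one entry per residue, so its first dimension matches the other per-residue tensors.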

Collaborator Author

@kierandidi kierandidi Apr 22, 2024

Ran more tests now and everything seems to work in a backward-compatible way. Let me know if there is anything else that needs to happen before merging, @a-r-j.

sonarcloud bot commented Apr 21, 2024

Quality Gate failed

Failed conditions
C Maintainability Rating on New Code (required ≥ A)


@a-r-j a-r-j merged commit e861231 into master Apr 23, 2024
28 of 32 checks passed
@a-r-j a-r-j deleted the feat/expose_b_factor branch July 15, 2024 07:14