Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix bug in add_k_nn_edges and add minor extension #229

Merged
merged 13 commits into from
Nov 9, 2022

Conversation

anton-bushuiev
Copy link
Contributor

@anton-bushuiev anton-bushuiev commented Nov 2, 2022

Reference Issues/PRs

No related issues/PRs

What does this implement/fix? Explain your changes

This pull request fixes a bug in add_k_nn_edges. Currenly, kneighbors_graph(X=dist_mat, ...) leads to wrong results which may seem correct. According to scikit-learn docs, it does not support X to be a distance matrix.

Secondly, this PR adds filter_distmat to generalise self-loops filtering. It adds the functionality to filter inter- or intra- connections between nodes of a single or multiple chains. It may be useful to incorporate it into other functions to add edges. By running the following code you will get the visualized graphs.

for val in ['inter', 'intra']:
    edge_funcs = {
        'edge_construction_functions': [
            partial(add_k_nn_edges, k=2, long_interaction_threshold=0, exclude_edges=[val])
        ]
    }
    config = ProteinGraphConfig(**edge_funcs)
    g = construct_graph(config=config, pdb_code='10gs')
...

image
image

What testing did you do to verify the changes in this PR?

Pull Request Checklist

  • Added a note about the modification or contribution to the ./CHANGELOG.md file (if applicable)
  • Added appropriate unit test functions in the ./graphein/tests/* directories (if applicable)
  • Modify documentation in the corresponding Jupyter Notebook under ./notebooks/ (if applicable)
  • Ran python -m py.test tests/ and make sure that all unit tests pass (for small modifications, it might be sufficient to only run the specific test file, e.g., python -m py.test tests/protein/test_graphs.py)
  • Checked for style issues by running black . and isort .

anton-bushuiev and others added 3 commits November 2, 2022 19:28
`kneighbors_graph(X=dist_mat, ...)` is wrong since `X` may not be a distance matrix. This leads to wrong results which may be similar to correct ones.
@a-r-j
Copy link
Owner

a-r-j commented Nov 2, 2022

Thanks for this @anton-bushuiev. Good spot!

LGTM! Would you be able to add a couple tests?

@anton-bushuiev
Copy link
Contributor Author

Yeah, I'll add them tomorrow.

@a-r-j a-r-j added 0 - Priority P0 Highest priority bug Something isn't working labels Nov 2, 2022
@anton-bushuiev
Copy link
Contributor Author

Hi, @a-r-j ! I've added the tests. I had to fix a new bug introduced by the distance matrix filtering (see this) 😄.

@sonarcloud
Copy link

sonarcloud bot commented Nov 4, 2022

Kudos, SonarCloud Quality Gate passed!    Quality Gate passed

Bug A 0 Bugs
Vulnerability A 0 Vulnerabilities
Security Hotspot A 0 Security Hotspots
Code Smell A 0 Code Smells

No Coverage information No Coverage information
No Duplication information No Duplication information

@codecov-commenter
Copy link

Codecov Report

Base: 40.27% // Head: 47.86% // Increases project coverage by +7.59% 🎉

Coverage data is based on head (fd1b36b) compared to base (8123f42).
Patch coverage: 52.20% of modified lines in pull request are covered.

Additional details and impacted files
@@            Coverage Diff             @@
##           master     #229      +/-   ##
==========================================
+ Coverage   40.27%   47.86%   +7.59%     
==========================================
  Files          48       85      +37     
  Lines        2811     5398    +2587     
==========================================
+ Hits         1132     2584    +1452     
- Misses       1679     2814    +1135     
Impacted Files Coverage Δ
graphein/ml/diffusion.py 0.00% <0.00%> (ø)
graphein/ppi/graph_metadata.py 0.00% <0.00%> (ø)
graphein/ppi/visualisation.py 0.00% <0.00%> (ø)
graphein/protein/analysis.py 0.00% <0.00%> (ø)
graphein/protein/features/sequence/utils.py 28.00% <0.00%> (+3.00%) ⬆️
graphein/protein/features/utils.py 27.77% <0.00%> (ø)
graphein/rna/utils.py 38.46% <ø> (ø)
graphein/rna/visualisation.py 28.57% <ø> (+28.57%) ⬆️
graphein/utils/config.py 100.00% <ø> (+100.00%) ⬆️
graphein/utils/config_parser.py 100.00% <ø> (ø)
... and 81 more

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

☔ View full report at Codecov.
📢 Do you have feedback about the report comment? Let us know in this issue.

@a-r-j a-r-j merged commit ff0427e into a-r-j:master Nov 9, 2022
@a-r-j
Copy link
Owner

a-r-j commented Nov 9, 2022

Thanks for the contribution @anton-bushuiev!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
0 - Priority P0 Highest priority bug Something isn't working
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants