-
Notifications
You must be signed in to change notification settings - Fork 28
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Added ability to include nodes marked as obsolete #15
Conversation
I had to make this change in order to identify some legacy codes in some of our datasets. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks for the contribution!
one small suggestion, otherwise looks good.
a test would be nice, but this is so simple that I'm pretty confident your implementation will work. So happy to merge without test.
obonet/read.py
Outdated
is_obsolete = ignore_obsolete and term.get('is_obsolete', 'false') == 'true' | ||
if is_obsolete: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
perhaps keep is_obsolete = term.get('is_obsolete', 'false') == 'true'
but change the next line to:
if ignore_obsolete and is_obsolete:
assert taxrank.node['TAXRANK:0000001']['name'] == 'phylum' | ||
assert 'NCBITaxon:kingdom' in taxrank.node['TAXRANK:0000017']['xref'] | ||
# It looks like networkx has changed the name of the node variable to nodes -- EST 2020-09-25 | ||
assert taxrank.nodes['TAXRANK:0000001']['name'] == 'phylum' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks like neworkx 2.4 removed G.nodes
, which had been marked as deprecated in 2.1.
not sure why Travis CI builds are not showing up on GitHub for this pull request, but see the failures at https://travis-ci.org/github/dhimmel/obonet/builds/730420444. We support both networkx 1.x and 2.x.
You could add pytest.importorskip("networkx", minversion="2.0")
to this test to skip it on 1.x builds. Removing 1.x support might be something we do soon anyways.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The original (taxrank.node) fails with networkx 2.5:
> assert taxrank.node['TAXRANK:0000001']['name'] == 'phylum'
E AttributeError: 'MultiDiGraph' object has no attribute 'node'
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The original (taxrank.node) fails with networkx 2.5:
yes. we should change to using .node
, although this will cause failure for the networkx 1.x builds. Therefore, I was thinking we could just skip the taxrank tests if networkx < 2
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
taxrank.node fails for me when using networkx 2.5, but I left the failing test as it was originally.
tests/test_obo_reading.py
Outdated
def test_ignore_obsolete_nodes(): | ||
hpo = obonet.read_obo("http://purl.obolibrary.org/obo/hp.obo") | ||
nodes = hpo.nodes(data=True) | ||
assert "HP:0005549" not in nodes | ||
|
||
def test_presence_of_obsolete_nodes(): | ||
hpo = obonet.read_obo("http://purl.obolibrary.org/obo/hp.obo", ignore_obsolete=False) | ||
nodes = hpo.nodes(data=True) | ||
assert "HP:0005549" in nodes | ||
node = nodes['HP:0005549'] | ||
assert node['is_obsolete'] == 'true' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice tests.
hp.obo
looks to be 6.7M, so for the purpose of keeping these tests speedy, we might want to test on a simple ontology that we store locally. See examples in tests/data
. This also protects us from future test failure if something changes at that URL.
One easy way might be to add the following to tests/data/brenda-subset.obo
(source):
[Term]
id: BTO:0000311
name: culture filtrate
comment: Added as synonym to culture fluid.
is_obsolete: true
…al test for obsolete behavior using brenda as requested
@@ -8,7 +8,6 @@ | |||
|
|||
directory = os.path.dirname(os.path.abspath(__file__)) | |||
|
|||
|
|||
def test_read_taxrank_file(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test function also uses .node
and is therefore causing test failures on networkx 2.5. Would you be able to update .node to .nodes here and apply the pytest.importorskip
?
Sorry to make you fix things unrelated to the feature enhancement. I can also make these changes on master if they are becoming too burdensome.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done. Travis seems to be happy now.
Thanks @torstees for this great contribution! I'm going to enable scheduled CI builds so the next contributor isn't stuck with fixing a bunch of problems that aren't their fault. Will comment here when I've released a new version to PyPI with these changes. |
@torstees merged commit (0cf3b4a) is released in https://pypi.org/project/obonet/0.2.6/ |
Added parameter to the read_obo function to permit users to keep obsolete nodes in their graphs. Default is such that it shouldn't break any pre-existing client calls.