Update BEL namespace to SCOMP and SFAM #184

Closed
cthoyt opened this issue Aug 6, 2021 · 3 comments · Fixed by #185

Comments

@cthoyt
Collaborator

cthoyt commented Aug 6, 2021

I think that, for historical reasons, the SCOMP and SFAM namespaces were combined into BEL for the purposes of famplex. This is a bit problematic because the information about the original namespace is lost. Even more confusing, these are actually the names of the entities and not their Selventa identifiers.

I added all of the Selventa namespaces to PyOBO in biopragmatics/pyobo#114. Would it break a lot of things in INDRA if we started updating these in famplex?

One compromise would be to add the additional references, then filter out the BEL namespace information in places that don't require it.
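A minimal sketch of that filtering step, assuming equivalences are stored as CSV rows of (namespace, identifier, famplex_id). The rows and the helper names `load_equivalences` and `drop_namespaces` are hypothetical and only illustrate the idea; they are not existing famplex or INDRA code.

```python
import csv
import io

# Hypothetical rows in an assumed (namespace, identifier, famplex_id) layout.
SAMPLE_EQUIVALENCES = """\
BEL,Example Family,EXAMPLE_FAM
SFAM,Example Family,EXAMPLE_FAM
SCOMP,Example Complex,EXAMPLE_COMPLEX
BEL,Example Complex,EXAMPLE_COMPLEX
"""


def load_equivalences(text):
    """Parse CSV text into a list of (namespace, identifier, famplex_id) tuples."""
    return [tuple(row) for row in csv.reader(io.StringIO(text)) if row]


def drop_namespaces(rows, excluded):
    """Return only the rows whose namespace is not in the excluded set."""
    return [row for row in rows if row[0] not in excluded]


rows = load_equivalences(SAMPLE_EQUIVALENCES)
# A consumer that does not need the legacy BEL entries filters them out.
without_bel = drop_namespaces(rows, {"BEL"})
for namespace, identifier, famplex_id in without_bel:
    print(namespace, identifier, famplex_id)
```

With this approach the legacy BEL rows stay in the file, and each consumer decides whether to keep or drop them.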

cthoyt added a commit to cthoyt/famplex that referenced this issue Aug 6, 2021
Closes sorgerlab#184

Note: "MAPK Erk1/3 Family" exists as an equivalence in famplex, but it's not actually in the selventa namespace. Should we assign it our own F number?
@bgyori
Member

bgyori commented Aug 6, 2021

I think we're talking about different things. SFAM and SCOMP are actually relatively recent, and these mappings were done with respect to a version of BEL where they weren't used yet, hence the differences. "MAPK Erk1/3 Family" is a real BEL family that used to exist.

@cthoyt
Collaborator Author

cthoyt commented Aug 6, 2021

I saw that this list was generated by processing the Selventa large corpus. I didn't realize that there's an older version of the large corpus that doesn't use SFAM and SCOMP!

@bgyori
Member

bgyori commented Aug 6, 2021

You can see in this file: https://raw.githubusercontent.com/sorgerlab/indra/de517e376256c61a3f91356bc8c95427febd8282/data/large_corpus.bel that the relevant namespaces that I used to import many families/complexes and curate mappings to FamPlex are PFH and NCH:

DEFINE NAMESPACE PFH AS URL "http://resource.belframework.org/belframework/1.0/namespace/selventa-named-human-protein-families.belns"
DEFINE NAMESPACE NCH AS URL "http://resource.belframework.org/belframework/1.0/namespace/selventa-named-human-complexes.belns"

At the time, I didn't think these BEL namespaces were important beyond being able to map them when encountered. I then occasionally added things from SCOMP/SFAM later, again without splitting them into a new namespace, to be able to process newer versions of Selventa BEL files where these appeared. I agree it would be good to update these, since it's unlikely we will ever need to process old BEL content; we just need to be a bit careful that things stay operational downstream.
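For anyone who wants to check which namespaces a given BEL file declares, here is a rough sketch that pulls the keyword and URL out of `DEFINE NAMESPACE` lines like the two above. The regular expression and the function name `parse_namespace_definitions` are assumptions made for illustration; this only handles the `AS URL "..."` form shown here and is not how INDRA or PyBEL parse BEL.

```python
import re

# Matches lines of the form: DEFINE NAMESPACE <keyword> AS URL "<url>"
NAMESPACE_RE = re.compile(r'^DEFINE\s+NAMESPACE\s+(\w+)\s+AS\s+URL\s+"([^"]+)"\s*$')


def parse_namespace_definitions(lines):
    """Return a dict mapping each declared namespace keyword to its URL."""
    namespaces = {}
    for line in lines:
        match = NAMESPACE_RE.match(line.strip())
        if match:
            keyword, url = match.groups()
            namespaces[keyword] = url
    return namespaces


header = [
    'DEFINE NAMESPACE PFH AS URL "http://resource.belframework.org/belframework/1.0/namespace/selventa-named-human-protein-families.belns"',
    'DEFINE NAMESPACE NCH AS URL "http://resource.belframework.org/belframework/1.0/namespace/selventa-named-human-complexes.belns"',
]
declared = parse_namespace_definitions(header)
# Check whether the file uses the older PFH/NCH keywords or the newer SFAM/SCOMP ones.
uses_old_keywords = {"PFH", "NCH"} <= set(declared)
uses_new_keywords = {"SFAM", "SCOMP"} <= set(declared)
```

Running this over the header of a BEL file would show whether it comes from the older PFH/NCH era or the newer SFAM/SCOMP one.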

bgyori pushed a commit to cthoyt/famplex that referenced this issue Oct 30, 2021
Closes sorgerlab#184

Note: "MAPK Erk1/3 Family" exists as an equivalence in famplex, but it's not actually in the selventa namespace. Should we assign it our own F number?