Inconsistent Relationship Extraction Results Compared to PubTator3 Website #11

Open
dillonl opened this issue Sep 17, 2024 · 1 comment


dillonl commented Sep 17, 2024

I'm unable to reproduce the relationship extraction results shown on the PubTator3 website (example: 19394258) using BioREx. When I run the tool with the suggested model, it outputs only one relationship, whereas the PubTator3 site identifies twelve.

Additionally, when I run the code as is (run_test_pred.sh), it crashes because an intermediate file (out_processed.tsv) is empty. If I hardcode the relationships in the src_tgt_pairs variable in src/convert_pubtator_2_tsv.py, the process gets past this crash, but the output still doesn't match what the website shows.

I suspect this might be due to differences in the models used. The README lists several models, but none seem to produce output that matches what the website provides.

Could you clarify whether the model used on the website is available in the repository? Also, any guidance on how you run this tool on the PubTator3 website would be appreciated.

Thank you.

ptlai (Collaborator) commented Sep 24, 2024

Hi @dillonl ,

Sorry for the late reply.

BioREx can only predict the relation types of BioRED. Please let me know if you need any help reproducing the results.

For PubTator3, we use a mapping table to map BioRED relation types to PubTator3 relation types, as follows:

chemical-chemical: positive_correlation => positive_correlate
chemical-chemical: negative_correlation => negative_correlate
chemical-chemical: association => associate
chemical-chemical: bind => interact
chemical-chemical: comparison => compare
chemical-chemical: conversion => convert
chemical-chemical: cotreatment => cotreat
chemical-chemical: drug_interaction => drug_interact
chemical-disease: positive_correlation => cause
chemical-disease: negative_correlation => treat
chemical-disease: association => associate
chemical-gene: positive_correlation => positive_correlate
chemical-gene: negative_correlation => negative_correlate
chemical-gene: association => associate
chemical-gene: bind => interact
chemical-variant: positive_correlation => stimulate
chemical-variant: negative_correlation => inhibit
chemical-variant: association => associate
chemical-variant: bind => interact
disease-gene: positive_correlation => stimulate
disease-gene: negative_correlation => inhibit
disease-gene: association => associate
disease-variant: positive_correlation => cause
disease-variant: negative_correlation => prevent
disease-variant: association => associate
gene-gene: positive_correlation => positive_correlate
gene-gene: negative_correlation => negative_correlate
gene-gene: association => associate
gene-gene: bind => interact
variant-variant: association => associate

Each combination of named-entity (NE) pair type and BioRED relation type is assigned the corresponding PubTator3 relation type.
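
For anyone post-processing BioREx output, the mapping above could be applied with a small lookup like the one below. This is only a minimal sketch and not the code used by the PubTator3 pipeline: the names `BIORED_TO_PUBTATOR3` and `to_pubtator3` are hypothetical, and it assumes entity types are given by the short names used in the table (`chemical`, `disease`, `gene`, `variant`) rather than by whatever labels BioREx actually emits.

```python
# Hypothetical lookup built from the mapping table in this thread.
# Outer key: entity pair (types in alphabetical order); inner key: BioRED relation.
BIORED_TO_PUBTATOR3 = {
    "chemical-chemical": {
        "positive_correlation": "positive_correlate",
        "negative_correlation": "negative_correlate",
        "association": "associate",
        "bind": "interact",
        "comparison": "compare",
        "conversion": "convert",
        "cotreatment": "cotreat",
        "drug_interaction": "drug_interact",
    },
    "chemical-disease": {
        "positive_correlation": "cause",
        "negative_correlation": "treat",
        "association": "associate",
    },
    "chemical-gene": {
        "positive_correlation": "positive_correlate",
        "negative_correlation": "negative_correlate",
        "association": "associate",
        "bind": "interact",
    },
    "chemical-variant": {
        "positive_correlation": "stimulate",
        "negative_correlation": "inhibit",
        "association": "associate",
        "bind": "interact",
    },
    "disease-gene": {
        "positive_correlation": "stimulate",
        "negative_correlation": "inhibit",
        "association": "associate",
    },
    "disease-variant": {
        "positive_correlation": "cause",
        "negative_correlation": "prevent",
        "association": "associate",
    },
    "gene-gene": {
        "positive_correlation": "positive_correlate",
        "negative_correlation": "negative_correlate",
        "association": "associate",
        "bind": "interact",
    },
    "variant-variant": {
        "association": "associate",
    },
}


def to_pubtator3(type_a, type_b, biored_relation):
    """Return the PubTator3 relation for an entity pair, or None if unmapped.

    Assumption: inputs use the short type names from the table; case and
    argument order are normalized before the lookup.
    """
    pair = "-".join(sorted((type_a.lower(), type_b.lower())))
    return BIORED_TO_PUBTATOR3.get(pair, {}).get(biored_relation.lower())
```

With this sketch, `to_pubtator3("disease", "chemical", "negative_correlation")` resolves to the `chemical-disease` row regardless of argument order, and a combination that is absent from the table (for example `variant-variant` with `bind`) returns `None`.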
