Duplicate IDs for same mets #519

Closed
feiranl opened this issue Apr 1, 2023 · 8 comments

@feiranl
Collaborator

feiranl commented Apr 1, 2023

Description of the issue:

| mets | metsNoComp | metBiGGID | metKEGGID | metHMDBID | metChEBIID | metPubChemID | metHepatoNET1ID | metRecon3DID | metMetaNetXID | metHMR2ID | metRetired |
|---|---|---|---|---|---|---|---|---|---|---|---|
| MAM00270c | MAM00270 | wharachd | C14748 | empty | CHEBI:34306 | 5283157 | empty | wharachd | MNXM22451 | m00270c | m00270c |
| MAM00270r | MAM00270 | wharachd | C14748 | empty | CHEBI:34306 | 5283157 | empty | wharachd | MNXM22451 | m00270r | m00270r |
| MAM00591c | MAM00591 | empty | C14748 | empty | CHEBI:34306 | empty | HC02179 | empty | MNXM6760 | m00591c | m00591c |
| MAM00591e | MAM00591 | empty | C14748 | empty | CHEBI:34306 | empty | HC02179 | empty | MNXM6760 | m00591s | m00591s |
| MAM00270e | MAM00270 | wharachd | C14748 | empty | CHEBI:34306 | 5283157 | empty | wharachd | MNXM22451 | empty | m00270s |

@haowang-bioinfo
Member

This shows there are two sets of IDs for CHEBI:34306 (KEGG:C14748) in [c] and [e]. Which set should be removed?
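
A check for this kind of duplication could be sketched in Python as below. This is only an illustration, not the script used for Human-GEM: the rows are copied from the table above (three columns only), and the function name `find_shared_ids` is hypothetical.

```python
from collections import defaultdict

# Rows copied from the table in this issue: (mets, metsNoComp, metChEBIID).
rows = [
    ("MAM00270c", "MAM00270", "CHEBI:34306"),
    ("MAM00270r", "MAM00270", "CHEBI:34306"),
    ("MAM00591c", "MAM00591", "CHEBI:34306"),
    ("MAM00591e", "MAM00591", "CHEBI:34306"),
    ("MAM00270e", "MAM00270", "CHEBI:34306"),
]


def find_shared_ids(rows):
    """Return external IDs that are attached to more than one metsNoComp ID.

    Hypothetical helper: `rows` holds (mets, metsNoComp, external ID) tuples,
    and the literal string "empty" marks a missing annotation.
    """
    by_id = defaultdict(set)
    for _met, base, ext_id in rows:
        if ext_id and ext_id != "empty":
            by_id[ext_id].add(base)
    # Keep only IDs shared by two or more compartment-free metabolites.
    return {ext_id: sorted(bases) for ext_id, bases in by_id.items() if len(bases) > 1}


print(find_shared_ids(rows))
# prints {'CHEBI:34306': ['MAM00270', 'MAM00591']}
```

The same grouping can be run per annotation column (KEGG, MetaNetX, and so on); a hit only flags a candidate pair and still needs manual review.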

@feiranl
Collaborator Author

feiranl commented Apr 6, 2023

We will keep the metabolite MAM00270, since it is associated with 12 rxns (versus 7 for MAM00591).
Duplicate rxns caused by this merge will also be removed (the right column).

| rxn with MAM00270 | GPR | rxn with MAM00591 | GPR |
|---|---|---|---|
| MAR00934 | Many genes | MAR00941 | ENSG00000162365 or ENSG00000187048 |
| MAR02396 | ENSG00000134538 | MAR06142 | ENSG00000134538 |
| MAR02398 | ENSG00000134538 | MAR06143 | ENSG00000134538 |
| MAR02402 | ENSG00000174640 | MAR06220 | ENSG00000174640 |
| MAR02404 | ENSG00000184999 | MAR06252 | Blank |
| MAR02400 | ENSG00000134538 | MAR06144 | ENSG00000134538 |
| MAR10348 | Blank | MAR10025 | Blank |
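
How such duplicate pairs arise after a merge can be sketched in Python as below. This is a toy illustration under stated assumptions: the reaction IDs (`rxn_A`, `rxn_B`, `rxn_C`), the `toy_*` metabolites, and all stoichiometries are invented and are not the real Human-GEM reactions, and both helper functions are hypothetical.

```python
def merge_metabolite(reactions, old_base, new_base):
    """Return a copy of `reactions` with every `old_base*` met renamed to `new_base*`.

    `reactions` maps a rxn ID to {met ID: coefficient}. Coefficients that
    cancel to zero after renaming are dropped.
    """
    merged = {}
    for rxn_id, stoich in reactions.items():
        new_stoich = {}
        for met, coeff in stoich.items():
            if met.startswith(old_base):
                met = new_base + met[len(old_base):]  # keep the compartment suffix
            new_stoich[met] = new_stoich.get(met, 0) + coeff
        merged[rxn_id] = {m: c for m, c in new_stoich.items() if c != 0}
    return merged


def find_duplicate_reactions(reactions):
    """Group rxn IDs whose stoichiometry is identical (same direction only)."""
    groups = {}
    for rxn_id, stoich in reactions.items():
        groups.setdefault(frozenset(stoich.items()), []).append(rxn_id)
    return [ids for ids in groups.values() if len(ids) > 1]


# Invented toy network, for illustration only.
reactions = {
    "rxn_A": {"MAM00270c": -1, "toy_substrate_c": -1, "toy_product_c": 1},
    "rxn_B": {"MAM00591c": -1, "toy_substrate_c": -1, "toy_product_c": 1},
    "rxn_C": {"MAM00270c": -1, "MAM00270e": 1},
}

merged = merge_metabolite(reactions, "MAM00591", "MAM00270")
print(find_duplicate_reactions(merged))
# prints [['rxn_A', 'rxn_B']]
```

This sketch does not detect duplicates written in the opposite direction, and it ignores bounds and GPRs, which still have to be compared by hand.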

@haowang-bioinfo
Member

> Duplicate rxns caused by this merge will also be removed (the right column).

Then how should the GPRs be handled after removal?

@feiranl
Collaborator Author

feiranl commented Apr 6, 2023

These two metabolites are mapped to the same metabolite according to MetaNetX:

mets metsNoComp metBiGGID metKEGGID metHMDBID metChEBIID metPubChemID metLipidMapsID metEHMNID metHepatoNET1ID metRecon3DID metMetaNetXID metHMR2ID metRetired
MAM00077x MAM00077 CE2416 CE2416 MNXM165274 m00077p m00077p
MAM02766x MAM02766 prist HMDB0000795 CHEBI:51340 123929 LMPR0104010022 CE0932 prist MNXM3342;MNXM7698 m02766p m02766p

I think we should remove the metabolite MAM00077x, as it only relates to 2 rxns, while the other one is associated with 13 rxns from all compartments.

Associated rxns for MAM00077x:
MAR03488
Action: remove, because the EC number annotated for this rxn is wrong. EC 1.14.11.18 should involve AKG, while this rxn looks more like EC 1.2.1.3.

MAR03489 is a duplicate of MAR03387, which uses the duplicate met MAM02766x, so it is okay to remove. CAUTION! GPR conflict.

@feiranl
Collaborator Author

feiranl commented Apr 6, 2023

> > Duplicate rxns caused by this merge will also be removed (the right column).
>
> then how to deal with the GPRs after removal?

I am not sure about this. In this case, one rxn has a GPR conflict (I have updated the table). Do you have any suggestions?

@haowang-bioinfo
Member

haowang-bioinfo commented Apr 6, 2023

> Do you have any suggestions?

According to the table above, with its updated reactions and GPRs, all GPRs in the left column can be retained except that of MAR00934. Could its new GPR be obtained by merging in the GPR of MAR00941, so that no information is lost?

@haowang-bioinfo
Member

> I think we should remove the metabolite MAM00077x, as it only relate to 2 rxns, while the other one is associated with 13 rxns from all compartments.

Agree.

> MAR03489 is a duplicate of MAR03387 with the duplicate met MAM02766x. CAUTION! GPR conflict.

A tentative solution is to merge the genes from both if there is no further evidence, e.g. use the version from MAR03489, whose GPR includes all genes from the GPR of MAR03387.
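
The gene merge suggested here could be sketched in Python as below. This is a minimal sketch, not the procedure applied in the actual fix: `merge_gprs` and `split_or_rule` are hypothetical names, GPRs are assumed to be plain strings using `or` / `and`, an empty string stands for a blank GPR, and `geneA`, `geneB`, `geneC` are placeholder IDs.

```python
import re


def split_or_rule(rule):
    """Split a flat 'A or B or C' rule into gene IDs.

    Returns None when the rule contains 'and', because such a rule cannot
    be safely flattened into a plain gene list.
    """
    if re.search(r"\band\b", rule, flags=re.IGNORECASE):
        return None
    parts = re.split(r"\s+or\s+", rule.strip("() "), flags=re.IGNORECASE)
    return [p.strip("() ") for p in parts if p.strip("() ")]


def merge_gprs(kept, removed):
    """Combine the GPR of the kept rxn with that of the removed duplicate."""
    kept, removed = kept.strip(), removed.strip()
    if not removed or kept == removed:
        return kept  # nothing new to add
    if not kept:
        return removed  # kept rxn had a blank GPR
    kept_genes, removed_genes = split_or_rule(kept), split_or_rule(removed)
    if kept_genes is None or removed_genes is None:
        # Fall back to OR-ing the two rules as whole units.
        return f"({kept}) or ({removed})"
    # Union of genes, preserving first-seen order.
    return " or ".join(dict.fromkeys(kept_genes + removed_genes))


# Pairs taken from the table in this issue (blank GPR written as "").
print(merge_gprs("ENSG00000134538", "ENSG00000134538"))  # prints ENSG00000134538
print(merge_gprs("ENSG00000184999", ""))                 # prints ENSG00000184999
# Placeholder gene IDs, for illustration only.
print(merge_gprs("geneA or geneB", "geneB or geneC"))    # prints geneA or geneB or geneC
```

Joining with `or` treats the genes of both reactions as isozymes, which is the conservative choice when no further evidence is available; rules containing `and` are kept intact rather than flattened.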

@haowang-bioinfo
Member

fixed in #534
