Add synonym-generation experiment to improve semantic-only CIEL Bridge accuracy #2392

@filiperochalopes

Description

User story

As a developer, I want to generate additional CIEL synonyms with an LLM-based prompt approach so that we can evaluate, at low cost, whether they improve embedding-based semantic matching accuracy for CIEL Bridge (diagnosis only).

Use case

Generate synonyms for CIEL terms, vectorize them, and compare semantic matching accuracy with and without the generated synonyms.
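
A minimal sketch of the with/without comparison, for illustration only. Everything here is an assumption rather than the planned implementation: `embed` is a toy character-trigram stand-in for a real embedding model, the concept IDs `C1`/`C2` are placeholders (not real CIEL IDs), and the names, synonyms, and queries are made-up sample data. The metric shown is top-1 accuracy; the actual evaluation may use a different one.

```python
import math
from collections import Counter


def embed(text, n=3):
    """Toy stand-in for an embedding model: character n-gram counts."""
    padded = f" {text.lower()} "
    return Counter(padded[i:i + n] for i in range(len(padded) - n + 1))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(v * b[k] for k, v in a.items() if k in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_index(names, synonyms=None):
    """One (concept_id, vector) entry per concept name and per synonym."""
    index = [(cid, embed(name)) for cid, name in names.items()]
    for cid, syns in (synonyms or {}).items():
        index.extend((cid, embed(s)) for s in syns)
    return index


def top1_accuracy(index, queries):
    """Share of queries whose nearest index entry has the expected concept."""
    hits = 0
    for text, expected in queries:
        q = embed(text)
        best_id, _ = max(index, key=lambda entry: cosine(q, entry[1]))
        hits += best_id == expected
    return hits / len(queries)


# Hypothetical sample data (placeholder IDs, not real CIEL concepts).
names = {"C1": "Myocardial infarction", "C2": "Hypertension"}
generated = {
    "C1": ["Heart attack", "Cardiac infarction"],
    "C2": ["High blood pressure", "Elevated blood pressure"],
}
queries = [("heart attack", "C1"), ("high blood pressure", "C2")]

baseline = top1_accuracy(build_index(names), queries)
with_synonyms = top1_accuracy(build_index(names, generated), queries)
print(f"baseline={baseline:.2f} with_synonyms={with_synonyms:.2f}")
```

Indexing each synonym as its own vector that points back to the concept ID keeps the baseline index untouched, so both runs can share the same query set and scoring code.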

Requirements

  • Add a backlog item to prototype synonym generation for CIEL using the prompt approach described in the referenced paper (Ref 2), e.g., running Qwen3-8B locally.
  • Use the generated synonyms to build a more robust embedding dataset for semantic matching evaluation.
  • Clarify how/where these synonyms (and any vectorized representation) would be stored/used, and whether OCL Mapper/CIEL Bridge has a place to consume merged synonym data for semantic search.

Notes

  • Implement support in OCL Mapper (CIEL Bridge algorithm) to include these artificial synonyms from an external source in the same semantic search.
  • Synonyms will be published as a CSV with two fields (CIEL concept ID, synonym) and two rows per concept, as done in the referenced article (Ref 2).
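
As a rough illustration of how a consumer could read that CSV and merge it with existing concept names, the sketch below uses only the Python standard library. The header names `concept_id` and `synonym`, the placeholder IDs, and the helper names `load_synonyms` and `merge_search_terms` are hypothetical; the issue does not fix the exact header or say how OCL Mapper would ingest the file.

```python
import csv
import io
from collections import defaultdict


def load_synonyms(csv_file):
    """Group synonyms by concept ID, skipping blank and duplicate rows."""
    merged = defaultdict(list)
    for row in csv.DictReader(csv_file):
        cid = row["concept_id"].strip()
        syn = row["synonym"].strip()
        if cid and syn and syn not in merged[cid]:
            merged[cid].append(syn)
    return dict(merged)


def merge_search_terms(names, synonyms):
    """Return, per concept, its name followed by any external synonyms."""
    return {
        cid: [name] + [
            s for s in synonyms.get(cid, []) if s.lower() != name.lower()
        ]
        for cid, name in names.items()
    }


# Hypothetical file content: two rows per concept, placeholder IDs.
sample = io.StringIO(
    "concept_id,synonym\n"
    "C1,Heart attack\n"
    "C1,Cardiac infarction\n"
    "C2,High blood pressure\n"
    "C2,Elevated blood pressure\n"
)

synonyms = load_synonyms(sample)
terms = merge_search_terms(
    {"C1": "Myocardial infarction", "C2": "Hypertension"}, synonyms
)
print(terms)
```

Keeping the generated synonyms in a separate file and merging them at index time means they can be dropped or regenerated without touching the source concept data.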

Acceptance criteria

  • A prototype can generate synonyms for CIEL terms using the specified prompt approach.
  • Generated synonyms can be vectorized and included in an embedding dataset for evaluation.
  • OCL Mapper (CIEL Bridge) can include external artificial synonyms in the same semantic search (based on the published CSV).
  • There is a documented decision on storage/consumption (where the data lives and how CIEL Bridge semantic search uses it).

Ref 1: https://openmrs.slack.com/archives/C0A7S4SDXKR/p1770204859509349?thread_ts=1770148279.476269&cid=C0A7S4SDXKR
Ref 2: https://academic.oup.com/jamia/advance-article/doi/10.1093/jamia/ocag004/8445947
