Proposal:
Elevate Match Algorithms and Concepts (from the target repo) to 1st-class entities in OCL Mapper and separate $lookup from $match
Why?
- Decoupling candidates from concepts means that an algorithm only needs to return a code/concept ID and match metadata (score, matched fragments, etc.)
- Using $lookup (in addition to $match) means that a user may configure an authoritative concept lookup source, if it is not preferable or not possible to rely on a $match algorithm to provide full concept details
  - e.g. $lookup could point to a different repo in OCL, an external FHIR service
- OCL Mapper will load a single canonical representation for each concept that is available in the candidate pool, even if the same code is returned by more than one algorithm
- Algorithms that don't return full concept details (e.g. ocl-scispacy-loinc and external ICD-11 algorithms) will be linked to the canonical concept representation, so a user will still be able to view full concept details
- Formal algo definitions will allow many benefits:
  - No hard-coded algo definitions in the code
  - Configurable and expandable set of algorithms for each mapper instance
  - Treat external match algos as first-class algos
  - Automate optimization of candidate retrieval by using algo attributes to manage batch size, parallelization, etc.
Current OCL Mapper Information Architecture
flowchart TD
A[Input Dataset] --> B[Mapping Project]
B --> C[Match Candidates]
C --> B
B --> D[Target Repository]
Planned OCL Mapper Information Architecture
flowchart LR
A[Input Dataset] --> B[Mapping Project]
E[Match Algorithms] --> B
B --> C[Match Candidates]
D[Target Repository] --> F[Concepts]
C --> F
B --> D
Requirements
Decouple Candidates from Concepts
- Candidates = algo output; code and match metadata (score, matched fragments, etc.) are required; additional attributes are optional – can be minimalist, or fully enriched
- Concepts = single source of truth for the definition of a concept that is shared across algorithms
  - Retrieved via a dedicated $lookup operation, not an algorithm response
  - As an optional optimization, $match algorithms may return a full concept definition, but that is not on the critical path
- The decoupled approach means that ocl-scispacy-loinc now points to a fully specified concept
  - Mapper can show a unified view of a concept in the candidates tab (e.g. in Match Quality view), with a single row for a concept that was returned by more than one algo
- Users are mapping to Concepts, not Candidates -- meaning Candidates and Match Metadata are linked to the Mapped Concept, but are not directly part of it
- Re-ranking will be applied to the Concept Pool (both bridge and target concepts) not to the Candidate Pool
- Updated Retrieval workflow:
  - Create Candidate Pool across all algorithms -->
  - Generate Concept Pool consisting of both bridge and target codes (codes need to maintain bridge/target relationships) -->
  - Populate the Concept Pool with full details -->
  - Re-rank the Concept Pool to get Unified scores for all bridge and target codes
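
To make the Candidate/Concept split and the retrieval workflow above concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration, not the Mapper's actual data model: the class and field names, the optional `bridge_code` field used to keep the bridge/target relationship, the `lookup` callable standing in for $lookup, and the "max candidate score" rule, which is only a placeholder for the real re-ranker.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass(frozen=True)
class Candidate:
    """Algorithm output: only a code and match metadata are required."""
    algorithm: str
    code: str
    score: float
    # Hypothetical: set when the target code was reached via a bridge concept.
    bridge_code: Optional[str] = None


@dataclass
class Concept:
    """Canonical concept, shared by every candidate that references its code."""
    code: str
    details: dict = field(default_factory=dict)
    candidates: list = field(default_factory=list)
    unified_score: float = 0.0


def build_concept_pool(candidates: list, lookup: Callable[[str], dict]) -> dict:
    """Create one Concept per distinct bridge/target code and link candidates to it."""
    pool: dict = {}
    for cand in candidates:
        for code in (cand.bridge_code, cand.code):
            if code is None:
                continue
            if code not in pool:
                # Full details come from the lookup source, not from the algorithm.
                pool[code] = Concept(code=code, details=lookup(code))
            pool[code].candidates.append(cand)
    return pool


def rerank(pool: dict) -> list:
    """Placeholder re-ranking: unified score = best score among linked candidates."""
    for concept in pool.values():
        concept.unified_score = max(c.score for c in concept.candidates)
    return sorted(pool.values(), key=lambda c: c.unified_score, reverse=True)
```

Because the pool is keyed by code, a code returned by several algorithms is looked up once and shown once, with all of its candidates attached as match metadata.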
Separate $lookup from $match
- Enables a canonical concept representation without expecting that an algorithm provides this info
- Required when decoupling candidates from concepts
- Mapper should be smart enough to configure $lookup on its own when it can (e.g. user selects ocl-semantic or ocl-search algos)
- User should have the option to configure $lookup manually when they want to
- $match still able to return a full concept definition, but no longer default behavior
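
One possible shape for the "configure $lookup on its own when it can" rule is sketched below in Python. The function name, the `OCL_NATIVE_ALGOS` set, and the precedence (manual setting first, then the target repo when an OCL-native algo is selected) are assumptions for illustration only.

```python
from typing import Optional

# Assumption: these algorithms query the target repo in OCL directly,
# so that same repo can double as the $lookup source.
OCL_NATIVE_ALGOS = {"ocl-semantic", "ocl-search"}


def resolve_lookup_source(
    selected_algos: list,
    target_repo_url: str,
    manual_lookup_url: Optional[str] = None,
) -> Optional[str]:
    """Return the $lookup source to use, or None if the user must configure one."""
    if manual_lookup_url:
        # A manual setting always wins over auto-configuration.
        return manual_lookup_url
    if any(algo in OCL_NATIVE_ALGOS for algo in selected_algos):
        return target_repo_url
    return None
```

Returning `None` rather than guessing lets the UI prompt for a $lookup source when only external algorithms are selected.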
Formalize Match Algorithms to be a first-class trackable entity
- Requirements for this are in this ticket: Implement generic match algorithm definition config for OCL Mapper #2301
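
As a rough illustration of what a formal algorithm definition could enable, the Python sketch below models an algorithm as data and uses its attributes to plan batches. The field names and defaults are guesses made for this example; they are not taken from the linked ticket or from the Mapper code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MatchAlgorithm:
    """Hypothetical algorithm definition; all fields are illustrative."""
    id: str
    endpoint: str
    external: bool = False
    returns_full_concept: bool = False
    max_batch_size: int = 50
    max_parallel_requests: int = 1

    def __post_init__(self):
        if self.max_batch_size < 1 or self.max_parallel_requests < 1:
            raise ValueError("batch size and parallelism must be at least 1")


def plan_batches(algo: MatchAlgorithm, rows: list) -> list:
    """Split input rows into batches no larger than the algorithm allows."""
    size = algo.max_batch_size
    return [rows[i:i + size] for i in range(0, len(rows), size)]
```

With definitions like this stored as configuration, adding an external algorithm would mean adding a record rather than changing code, and retrieval could read `max_batch_size` and `max_parallel_requests` per algorithm.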