feat(models): add searchTier annotations to glossary and document definitions #16995

Open
loustler wants to merge 1 commit into datahub-project:master from loustler:feat/search-tier-pdl-annotations

@loustler
Contributor

Summary

The definition field on GlossaryNodeInfo and GlossaryTermInfo, and the text field on DocumentContents, are annotated with @Searchable but lack a searchTier assignment. In the V3 tiered search architecture, fields without a searchTier are not copied into the _search.tier_N consolidated fields and therefore don't participate in tiered relevance scoring.

This PR assigns searchTier: 2 (description/definition tier) to these fields, consistent with similar fields across the model:

| Tier | Purpose | Examples |
|------|---------|----------|
| 1 | Names, titles | DatasetProperties.name, ChartInfo.title, GlossaryTermInfo.name |
| 2 | Descriptions, definitions | DatasetProperties.description, ChartInfo.description |
| 3 | Secondary text | DatasetProperties.qualifiedName |
| 4 | Low-priority metadata | Tags, custom properties |
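
As a rough illustration of the tier mechanism described above, the following Python sketch shows how a mapping builder might turn `searchTier` values into Elasticsearch `copy_to` directives. The `FIELDS` dictionary and the `build_properties` function are hypothetical and heavily simplified. This is not DataHub's actual mapping code; only the `searchTier` key and the `_search.tier_N` naming come from this PR.

```python
from typing import Any, Dict

# Hypothetical, simplified view of searchable fields: field name -> annotation.
FIELDS: Dict[str, Dict[str, Any]] = {
    "name": {"fieldType": "TEXT", "searchTier": 1},
    "definition": {"fieldType": "TEXT", "searchTier": 2},
    "untiered": {"fieldType": "TEXT"},  # no searchTier assigned
}


def build_properties(fields: Dict[str, Dict[str, Any]]) -> Dict[str, Dict[str, Any]]:
    """Build a simplified index mapping from annotations (illustrative only)."""
    properties: Dict[str, Dict[str, Any]] = {}
    for name, annotation in fields.items():
        mapping: Dict[str, Any] = {"type": "text"}
        tier = annotation.get("searchTier")
        if tier is not None:
            # Only tiered fields are copied into the consolidated tier field.
            mapping["copy_to"] = [f"_search.tier_{tier}"]
        properties[name] = mapping
    return properties


PROPERTIES = build_properties(FIELDS)
```

In this sketch `definition` would feed `_search.tier_2`, while `untiered` receives no `copy_to` entry and so stays out of the consolidated tier fields.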

Changes

| File | Change |
|------|--------|
| metadata-models/.../GlossaryNodeInfo.pdl | Add `"fieldType": "TEXT", "searchTier": 2` to `definition` |
| metadata-models/.../GlossaryTermInfo.pdl | Add `"fieldType": "TEXT", "searchTier": 2` to `definition` |
| metadata-models/.../DocumentContents.pdl | Add `"searchTier": 2` to `text` (already had `fieldType: TEXT`) |
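
The annotation change for the two glossary files can be modeled with a small Python sketch. It assumes, as this PR states, that `TEXT` is the implicit default `fieldType` for an empty `@Searchable` annotation. The `effective` helper and the dictionary representation are hypothetical, chosen only to make the before/after comparison concrete.

```python
from typing import Any, Dict, Set

# Assumption taken from this PR's description: TEXT is the implicit default.
DEFAULT_FIELD_TYPE = "TEXT"


def effective(annotation: Dict[str, Any]) -> Dict[str, Any]:
    """Resolve an annotation by filling in the assumed default fieldType."""
    resolved: Dict[str, Any] = {"fieldType": DEFAULT_FIELD_TYPE}
    resolved.update(annotation)
    return resolved


# Before: @Searchable = {} on the glossary definition fields.
before = effective({})
# After: the annotation added by this PR.
after = effective({"fieldType": "TEXT", "searchTier": 2})

# Keys whose resolved value differs between the two versions.
changed: Set[str] = {
    key for key in before.keys() | after.keys() if before.get(key) != after.get(key)
}
```

Under that assumption the only resolved difference is `searchTier`, which matches the claim that the explicit `fieldType` does not alter existing behavior.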

Notes

  • GlossaryNodeInfo.definition and GlossaryTermInfo.definition previously had @Searchable = {} (empty annotation). This PR sets fieldType: TEXT explicitly, which was previously the implicit default, so there is no behavioral change for V2 search.
  • A V3 index rebuild is required for the change to take effect, because the copy_to directives are generated at index creation time from the searchTier values.
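
To make the rebuild note concrete, here is a minimal Python sketch that compares a live mapping with the mapping the updated model would produce and lists the fields whose `copy_to` differs. The function name, the mapping dictionaries, and the idea of detecting drift this way are all illustrative assumptions, not part of DataHub's tooling.

```python
from typing import Any, Dict, List

Mapping = Dict[str, Dict[str, Any]]


def fields_needing_rebuild(live: Mapping, desired: Mapping) -> List[str]:
    """Return field names whose copy_to in `desired` differs from `live`."""
    return sorted(
        name
        for name, mapping in desired.items()
        if live.get(name, {}).get("copy_to") != mapping.get("copy_to")
    )


# Hypothetical mapping of an index created before this PR.
live_mapping: Mapping = {
    "name": {"type": "text", "copy_to": ["_search.tier_1"]},
    "definition": {"type": "text"},
}

# Hypothetical mapping generated after searchTier: 2 is added to definition.
desired_mapping: Mapping = {
    "name": {"type": "text", "copy_to": ["_search.tier_1"]},
    "definition": {"type": "text", "copy_to": ["_search.tier_2"]},
}

stale = fields_needing_rebuild(live_mapping, desired_mapping)
```

In this example only `definition` is reported, since its `copy_to` target exists in the desired mapping but not in the index that was created earlier.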

Checklist

…initions

Assign searchTier 2 to GlossaryNodeInfo.definition, GlossaryTermInfo.definition,
and DocumentContents.text so these fields participate in V3 tiered search scoring.
Tier 2 is consistent with other description/definition fields across the model.
@github-actions github-actions bot added the community-contribution PR or Issue raised by member(s) of DataHub Community label Apr 13, 2026
@github-actions
Contributor

Linear: PFP-3335

Thanks for your contribution! We have created an internal ticket to track this PR. A member of the core DataHub team will be assigned to review it within the next few business days - you will get a follow-up comment once a reviewer is assigned.

@maggiehays maggiehays added the needs-review Label for PRs that need review from a maintainer. label Apr 13, 2026
@codecov

codecov bot commented Apr 13, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.

Collaborator

@david-leifker david-leifker left a comment

LGTM! Thanks!

@david-leifker david-leifker self-assigned this Apr 16, 2026
@david-leifker david-leifker added the merge-pending-ci A PR that has passed review and should be merged once CI is green. label Apr 16, 2026
@github-actions github-actions bot requested a review from david-leifker April 16, 2026 14:46
@github-actions
Contributor

Your PR has been assigned to @david-leifker (david.leifker) for review (PFP-3335).

@david-leifker david-leifker enabled auto-merge (squash) April 16, 2026 14:47