feat(document): add change_type field to TextItem#667
Open
a-huk wants to merge 1 commit into
Open
Conversation
Adds an optional change_type field to TextItem with values 'inserted' or 'deleted' (None means normal/final text). Also threads the parameter through DoclingDocument.add_text() so backends can set it at creation time. Needed by the docling DOCX backend to represent tracked changes in 'raw' mode without hijacking underline/strikethrough formatting.
Contributor
|
❌ DCO Check Failed Hi @a-huk, your pull request has failed the Developer Certificate of Origin (DCO) check. This repository supports remediation commits, so you can fix this without rewriting history — but you must follow the required message format. 🛠 Quick Fix: Add a remediation commitRun this command: git commit --allow-empty -s -m "DCO Remediation Commit for a-huk <huk.adam.g@gmail.com>
I, a-huk <huk.adam.g@gmail.com>, hereby add my Signed-off-by to this commit: 224777f61b0b6851d5552251fd0c8e9e055d0a66"
git push🔧 Advanced: Sign off each commit directlyFor the latest commit: git commit --amend --signoff
git push --force-with-leaseFor multiple commits: git rebase --signoff origin/main
git push --force-with-leaseMore info: DCO check report |
Contributor
Merge Protections🟢 Merge protection satisfied — ready to merge. Show 1 satisfied protection🟢 Enforce conventional commitMake sure that we follow https://www.conventionalcommits.org/en/v1.0.0/
|
ceberam
requested changes
Jun 30, 2026
ceberam
left a comment
Member
There was a problem hiding this comment.
Thanks @a-huk , it's a good starting draft, but there are quite a few things that need attention:
- The golden YAML file
test/data/docling_document/unit/TextItem.yamldoes not includechange_type, so the test fails with:
Left contains 1 more item: {'change_type': None}
change_typeis silently dropped for labels that are dispatched to helper methods.
Insideadd_text(), labels likeTITLE,LIST_ITEM, orCODEare all forwarded to specialized helpers (add_title,add_list_item,add_heading, etc.). None of which accept or forwardchange_type. The parameter is silently ignored. We may want to track changes on those items too.- Following up with the previous bullet: what if a contributor drops an entire table from the
.docxdocument? We should better definechange_typeas a field ofDocItem - The PR adds no test verifying that a
TextItemconstructed withchange_type="inserted"round-trips correctly (serialize → deserialize), nor thatadd_text()propagates the value onto the returned item. A single parametrized test would suffice. - It would be good to add the support of
change_typein at least one serializer, to show the intention of the new field and get an idea about how inserted or deleted text should be rendered compared to normal text. The Markdown or the HTML serializers would be excellent candidates. - I have doubts about the name of the new field.
change_typeis generic and ambiguous. Also change and type are both overloaded words in any codebase. It also encodes only the kind of change, not the fact that this is a tracked/revision-control concept. What abouttracked_changeandinsertionanddeletionfor its values?
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Adds an optional change_type field to TextItem with values 'inserted' or 'deleted' (None means normal/final text). Also threads the parameter through DoclingDocument.add_text() so backends can set it at creation time.
Needed by the docling DOCX backend to represent tracked changes in 'raw' mode without hijacking underline/strikethrough formatting.