ENH: CID font resource from font file to encode more characters #3652
Draft: PJBrs wants to merge 11 commits into py-pdf:main from PJBrs:fontwork
+314
−20
Commits:
a04e579 Extract the /FontFile and store it in the new FileDescriptor object
5ef6898 ENH: Font: Enable initialisation from TrueType font file (PJBrs)
f4bdfcc MAINT: Font: Refactor space width calculation (PJBrs)
d67f393 ENH: Font: Enable generating a CID font resource (PJBrs)
b151e8f ENH: AppearanceStream: Generate new font resource for unicode (PJBrs)
c813d10 ENH: FontDescriptor: Add method to produce PDF resource (PJBrs)
714bfa9 ENH: Font: Add our own font descriptor resource (PJBrs)
1172dc8 ENH: PdfWriter: Make font descriptors indirect when filling forms (PJBrs)
f1dbe29 ENH: PdfWriter: Test adding unicode font resource for form filling (PJBrs)
663e7b3 ENH: Test writer: Test for unavailable unicode characters (PJBrs)
cbc9ee4 ENH: Test font: Simple check for font descriptor as resource (PJBrs)
I'm pretty sure that this is not correct. It accidentally works.
This is what Claude AI says:

The Problem with /Identity CIDToGIDMap

The issue is on line 467 of the PR:

Why this causes garbled text:

When you set /CIDToGIDMap to just /Identity, the PDF reader assumes:

However, in a TrueType font file, the glyph IDs don't necessarily match Unicode codepoints. Looking at your code:

The character_map maps:

But when you later encode text using this map and then tell the PDF reader "use /Identity mapping," the reader will try to use the Unicode codepoint as the GID directly, not the glyph ID stored in your character_map. This causes mismatches where the wrong glyphs get rendered.

The Correct Solution

You need to create an explicit CIDToGIDMap stream that maps:

Here's the approach:

This ensures every character in your character_map has a corresponding, correct glyph ID lookup in the PDF.
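For illustration only, here is a minimal sketch of what an explicit CIDToGIDMap could look like. It is not the code from this PR. It assumes that character_map is a dict mapping single characters to glyph IDs, and that each CID written to the content stream equals the character's Unicode code point (BMP only); neither assumption is confirmed by the diff. The function name build_cid_to_gid_map is hypothetical. The byte layout follows the PDF specification: the glyph ID for CID c is stored big-endian in bytes 2c and 2c + 1.

```python
import struct


def build_cid_to_gid_map(character_map: dict[str, int]) -> bytes:
    """Return the raw bytes for a /CIDToGIDMap stream.

    Assumption: CID == Unicode code point of the character (BMP only),
    and character_map maps each character to its glyph ID in the font file.
    """
    # Keep only single characters whose code point fits in two bytes.
    by_cid = {
        ord(char): gid
        for char, gid in character_map.items()
        if len(char) == 1 and ord(char) <= 0xFFFF
    }
    if not by_cid:
        return b""

    # Two bytes per CID, from CID 0 up to the highest CID in use.
    # Entries left at zero point to glyph 0 (.notdef).
    table = bytearray(2 * (max(by_cid) + 1))
    for cid, gid in by_cid.items():
        struct.pack_into(">H", table, 2 * cid, gid)
    return bytes(table)


# Hypothetical glyph IDs, chosen only for the example.
example_map = {"A": 36, "é": 112}
data = build_cid_to_gid_map(example_map)
```

The resulting bytes would be the content of a stream object referenced by /CIDToGIDMap in the CIDFontType2 dictionary. If the content stream instead writes glyph IDs directly as CIDs, /Identity is already the appropriate value and no such table is needed.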
https://stackoverflow.com/questions/75576696/understanding-pdf-cidfonts-cmaps-and-gids-best-practices
https://ken-lunde.medium.com/to-cid-or-not-to-cid-e8e623dcde92
This is probably not correct either.