ENH: Add ability to add font resources for 14 Adobe Core fonts in text widget annotations #3624
stefan6419846 merged 16 commits into py-pdf:main
Codecov Report
✅ All modified and coverable lines are covered by tests.

@@           Coverage Diff           @@
##             main    #3624   +/-   ##
=======================================
  Coverage   97.39%   97.39%
=======================================
  Files          55       55
  Lines        9852     9877   +25
  Branches     1800     1805    +5
=======================================
+ Hits         9595     9620   +25
  Misses        151      151
  Partials      106      106
@stefan6419846 Thanks very much for your review! I addressed your comments. While factoring out the code that finds a font resource, I noticed that I had previously missed quite a few opportunities for simplification, so I implemented those in the new method, _find_annotation_font_resource. Furthermore, I think my check whether a text value is encodable should be in the init method, not in the _from_text_annotation method, with an eye to future font shaping steps for Arabic characters. One question: I noticed that fpdf2 contains some very nice code for parsing embedded TTF files. However, it is GPL-licensed, which is more restrictive than pypdf's license. Under what conditions can I use the steps in fpdf2 to produce a Font.from_font_file method?
It is LGPL-3.0-only, not plain GPL. Nevertheless, this is too restrictive to be a hard requirement for the time being.
This highly depends on the actual code to copy and the corresponding authorship. In general, all contributors who touched the corresponding code have to give their explicit consent to integrating this code under the terms of the BSD-3-Clause license.
We used to overwrite a text appearance stream's resource dictionary when we initialized it from an annotation. This would then overwrite a font resource if we had previously added it. Make sure that we merge our new font resource into the annotation's resources instead.
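The difference between overwriting and merging can be sketched with plain Python dicts standing in for PDF dictionary objects. The helper name `merge_font_resources` and the rule that already-defined fonts win on a name clash are assumptions made for illustration; this is not the code from the patch.

```python
from copy import deepcopy


def merge_font_resources(annotation_resources: dict, new_fonts: dict) -> dict:
    """Return a copy of the resources with new_fonts merged into /Font.

    Hypothetical helper: plain dicts stand in for PDF dictionary objects.
    """
    merged = deepcopy(annotation_resources)
    font_dict = merged.setdefault("/Font", {})
    for name, resource in new_fonts.items():
        # Assumption: fonts the annotation already defines are kept as-is.
        font_dict.setdefault(name, resource)
    return merged


annotation_resources = {"/Font": {"/F1": {"/BaseFont": "/Times-Roman"}}}
fallback = {"/Helv": {"/BaseFont": "/Helvetica"}}
merged = merge_font_resources(annotation_resources, fallback)
```

With a plain overwrite, either `/F1` or `/Helv` would be lost; after the merge, both are present and the input dictionary is left untouched.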
Add clearer comments in the code dealing with font resources.
It appears that the code removed in this patch _may_ have run previously, but I don't see how it would have added anything beyond what already was done above.
This patch makes sure that a fallback font resource is added earlier, namely to the annotation's font resources in the .from_text_annotation method. The reason is that the font resource added by TextAppearanceStream.__init__ is overwritten by the TextAppearanceStream.from_text_annotation method when the associated annotation defines its own resources. This is not really a problem now, but it will be when we need to add font resources that don't exist yet.
This patch implements a method to produce a font resource. For now, it only works for the 14 Adobe Core fonts, which are very easy to deal with. In future versions, this can be used to produce more complex font resources, for instance for embedded fonts with insufficient font resources.
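To illustrate why the core fonts are easy to deal with: a PDF font dictionary for one of the standard 14 fonts needs little more than its type, subtype, and base font name, because viewers are expected to supply the font program themselves. The sketch below uses plain dicts and a hypothetical helper, `make_core_font_resource`; it is not the method added by this patch, and it leaves out optional entries such as /Encoding.

```python
# The 14 standard Type 1 font names defined by the PDF specification.
CORE_FONTS = frozenset({
    "Courier", "Courier-Bold", "Courier-Oblique", "Courier-BoldOblique",
    "Helvetica", "Helvetica-Bold", "Helvetica-Oblique", "Helvetica-BoldOblique",
    "Times-Roman", "Times-Bold", "Times-Italic", "Times-BoldItalic",
    "Symbol", "ZapfDingbats",
})


def make_core_font_resource(font_name: str) -> dict:
    """Build a minimal font dictionary for a core font (hypothetical helper)."""
    base = font_name.lstrip("/")
    if base not in CORE_FONTS:
        raise ValueError(f"{font_name!r} is not one of the 14 core fonts")
    # No embedded font program or descriptor is included in this sketch.
    return {"/Type": "/Font", "/Subtype": "/Type1", "/BaseFont": f"/{base}"}


helvetica = make_core_font_resource("/Helvetica")
```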
The current implementation of writer.update_page_form_field_values allows setting a font name, but it has to correspond to the name of an existing font resource. In practice, this is very limiting, because we only look for font resources within the annotation's resources. This patch allows adding any of the 14 Adobe Core fonts. It will detect if such a font is added and it will add a corresponding font resource if needed.
Some text annotations receive a user value that cannot be encoded with available font resources. In these cases, warn the user and suggest using writer.update_page_form_field_values with auto_regenerate=True.
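A rough sketch of such a check, under stated assumptions: Python's `cp1252` codec is used here as an approximation of a font's encoding, and the function names `is_encodable` and `check_text_value` are hypothetical. The actual patch checks against the available font resources, which this example does not model.

```python
import logging

logger = logging.getLogger(__name__)


def is_encodable(text: str, encoding: str = "cp1252") -> bool:
    """Return True if text can be encoded (cp1252 is an assumed stand-in)."""
    try:
        text.encode(encoding)
    except UnicodeEncodeError:
        return False
    return True


def check_text_value(text: str, font_name: str) -> bool:
    """Warn when a value cannot be encoded with the given font (hypothetical)."""
    if is_encodable(text):
        return True
    logger.warning(
        "Text value cannot be encoded with font %s; consider "
        "writer.update_page_form_field_values with auto_regenerate=True.",
        font_name,
    )
    return False


latin_ok = check_text_value("Hello", "/Helv")
arabic_ok = check_text_value("مرحبا", "/Helv")
```

Here the Latin value passes, while the Arabic value triggers the warning, because `cp1252` has no Arabic code points.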
This patch adds a couple of simplifications to the new _find_annotation_font_resource method. Where possible, it keeps objects indirect. Furthermore, we now always produce our own font resource if we couldn't find one, but the font name is in CORE_FONT_METRICS. This changes a couple of test results, but only for the better.
With the refactoring of finding font resources for annotations, we no longer report that we cannot find a font resource for one of the 14 Adobe Core fonts. Instead, we just produce a font resource ourselves. So, instead of asserting that we logged a warning about not being able to find a font resource, assert that our new font resource is part of the annotation's appearance stream font resources.
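The find-or-create behaviour described in the two messages above can be sketched as follows. Plain dicts stand in for PDF objects, `find_font_resource` is a hypothetical function loosely modelled on the description of `_find_annotation_font_resource`, and the `CORE_FONT_METRICS` mapping here is a two-entry placeholder rather than pypdf's real table.

```python
from typing import Optional

# Placeholder stand-in: only the keys matter for this sketch.
CORE_FONT_METRICS = {"Helvetica": {}, "Courier": {}}


def find_font_resource(font_name: str, annotation_resources: dict) -> Optional[dict]:
    """Look up a font resource, producing one for known core fonts if missing."""
    fonts = annotation_resources.get("/Font", {})
    if font_name in fonts:
        # Reuse the existing resource object rather than copying it.
        return fonts[font_name]
    base = font_name.lstrip("/")
    if base in CORE_FONT_METRICS:
        # Not found, but it is a core font: produce a resource ourselves.
        return {"/Type": "/Font", "/Subtype": "/Type1", "/BaseFont": f"/{base}"}
    return None


resources = {"/Font": {"/F1": {"/BaseFont": "/Times-Roman"}}}
existing = find_font_resource("/F1", resources)
created = find_font_resource("/Helvetica", resources)
missing = find_font_resource("/SomeUnknownFont", resources)
```

A test in the spirit of the changed assertions would then check that `created` ends up among the appearance stream's font resources, instead of checking for a logged warning.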
@stefan6419846 Thanks for reviewing again! Should be done now.
## What's new

### Deprecations (DEP)
- Deprecate support for abbreviations in decode_stream_data (#3617) by @stefan6419846

### New Features (ENH)
- Add ability to add font resources for 14 Adobe Core fonts in text widget annotations (#3624) by @PJBrs

### Bug Fixes (BUG)
- Avoid invalid load for ICCBased FlateDecode images in mode 1 (#3619) by @stefan6419846

### Robustness (ROB)
- Fix AESV2 decryption when /Length missing in encrypt dict (#3629) by @dmitry-kostin
- Fix merging when annotations point to NullObject (#3613) by @stefan6419846
- Check for `self._info` being None in `compress_identical_objects` (#3612) by @stefan6419846

[Full Changelog](6.6.2...6.7.0)
This PR includes a couple of enhancements that, together, lay some foundations for adding and changing font resources when dealing with text annotations. It adds:
These steps together contribute to dealing with text widget annotation values that cannot be encoded using existing font resources, which we need to deal with #3514, and probably #3361.
Follow-up steps for dealing with the above bugs would be: