Releases: pymupdf/PyMuPDF

Fixes and improved font subsetting

26 Feb 23:04

Fixed #888, #895, #896, #885, #922
Implemented #897 (text output right-to-left).

  • Font subsetting now works without rewriting the respective text.
  • Added a utility function to compute the quad of a text span for "dict" and "rawdict" text extraction.

Some hot fixes

04 Feb 21:17

Interesting new features and several fixes

02 Feb 21:00

Fixes:

Implemented enhancement requests:

  • #855, which allows font subsetting using the fontTools package.

  • #870, which allows the convert_to_pdf method to be used for PDF documents as well.

  • #843, Document.tobytes() (formerly Document.write()) now also supports linearized output, plus several extensions and improvements around support for Python file objects.

  • Added new methods to quickly determine whether a PDF has annotations or links.

  • Extended the Document.scrub() method with a new parameter, which also allows page thumbnails to be removed.

  • Added methods to directly inquire and set values in PDF objects, without having to manipulate PDF object sources in an unwieldy way: see methods Document.xref_set_key() / Document.xref_get_key().

Continued the process of changing the naming convention for class methods and attributes to "snake_case". As announced before, this is a tedious, error-prone process that requires special care to keep existing scripts working.
In future versions, probably in step with MuPDF v1.19.0, we will remove the definitions of the old names, but a method for re-activating the old aliases will remain available.

Bug Fixes and some new features

07 Jan 14:09

The recent introduction of "Discussions" by GitHub has been very motivating for our users.
Based on their feedback, several enhancements have been implemented.
Here is a selection:

  • Most Python functions now have typing / annotation support.
  • For PDF table-of-contents items, colors are now supported (reading and writing).
  • PDF page label support for reading and writing.
  • Support for personalized tagging of new annotations, fields and links, for easier selection of relevant objects.

There are also a number of fixes; please consult the documentation.

Minor fixes, improved font metrics handling

17 Dec 10:46

Font metrics handling has been improved: text box writing now observes the relevant font properties when determining line heights.
As part of this change, a new option has been introduced that returns text bboxes (glyphs, spans, text search quads, etc.) which wrap the text more tightly, as opposed to always returning line-height bboxes.

Fixes:

Better Optional Content support

20 Nov 16:38
  • Improved PDF Optional Content support

  • Started overhaul of method and attribute naming

  • Introduced support of Popup annotations

  • Implemented the following fixes:

Introducing PDF Optional Content

09 Nov 12:20

As a major new feature, the PDF Optional Content concept is now widely supported.

The following fixes have been implemented:

New features for text searching and more

27 Oct 12:09

This resolves

and removes the hit_max parameter from text searching. In addition, hyphenated words around line breaks are still found.

With the clip parameter, text searches and text extractions now include only characters whose bboxes are fully contained in the clip rectangle.
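
The containment rule can be pictured with a few lines of plain Python. This is only an illustration of the rule as stated, not the library's implementation; the function fully_inside and the sample character boxes are hypothetical.

```python
from typing import Tuple

BBox = Tuple[float, float, float, float]  # (x0, y0, x1, y1)

def fully_inside(bbox: BBox, clip: BBox) -> bool:
    """True only if bbox lies completely within clip (hypothetical helper)."""
    x0, y0, x1, y1 = bbox
    cx0, cy0, cx1, cy1 = clip
    return cx0 <= x0 and cy0 <= y0 and x1 <= cx1 and y1 <= cy1

# Three made-up character boxes on one line
chars = [
    ("a", (10, 10, 16, 22)),
    ("b", (16, 10, 22, 22)),
    ("c", (22, 10, 28, 22)),   # overlaps the clip, but sticks out on the right
]
clip = (9, 9, 24, 23)

kept = "".join(c for c, bbox in chars if fully_inside(bbox, clip))
print(kept)  # ab
```

A character that merely overlaps the clip, like "c" here, is dropped, so a clip drawn too tightly around a word can cut off its first or last letter.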

Important fixes, some improvements for drawing extraction

18 Oct 16:54
  • fixed #692
  • fixed #686
  • Added transparency options for various methods in classes Shape and Page.

Support MuPDF v1.18.0

08 Oct 07:17

This version fixes the following issues:

  • #519 - method Page.cleanContents() should no longer destroy the PDF page's appearance. In earlier versions, this upstream bug occurred in rare cases.
  • #675 - unsuccessful storage allocations (e.g. for extremely large pixmaps) could occasionally lead to interpreter crashes. This should now always be prevented (fingers crossed).
  • #668 - the specification of line dashes in PDF is now correctly documented.
  • #669 - fixed a major cause of memory leakage in method Document.insertPDF.

The following new features or improvements are included:

  • Text extraction method Page.getText() now also works for annotations: Annot.getText().
  • Text from within a rectangle can now be extracted directly via Page.getTextbox(rect). This may make extra scripts unnecessary in many cases.
  • When applying redactions on PDF pages, the handling of images can now be controlled in finer detail via a new parameter.
  • The DPI (resolution) of PNG images created from pixmaps is now automatically set from the Pixmap.xres and Pixmap.yres values.